Construction and Inference Method of Semantic-Driven, Spatio-Temporal Derivation Relationship Network for Place Names

Dong, Wenjie; Mao, Xi; Lu, Wenjuan; Wang, Jizhou; Cheng, Yao

doi:10.3390/ijgi13090327

Open Access Article


by Wenjie Dong, Xi Mao *, Wenjuan Lu, Jizhou Wang and Yao Cheng

GIS Research Institute, Chinese Academy of Surveying and Mapping, Beijing 100036, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2024, 13(9), 327; https://doi.org/10.3390/ijgi13090327

Submission received: 10 July 2024 / Revised: 22 August 2024 / Accepted: 10 September 2024 / Published: 13 September 2024


Abstract

As the proper noun for geographical entities, place names provide an intuitive way to identify and access specific geographic locations, playing a key role in semantic expression and spatial retrieval. However, existing research has insufficiently explored the spatio-temporal derivation relationships of place names, failing to fully utilize these relationships to enhance the connectivity between place names and improve spatial retrieval capabilities. Therefore, this paper conducts research on the spatio-temporal derivation relationships of place names, defines them in a standardized manner, clarifies the boundary conditions and identification methods, and then constructs a spatio-temporal derivation network of place names for expression and uses this network to carry out reasoning research on spatial adjacency relationships. Experiments and results showed that using the theory and methods of this paper to identify the spatio-temporal derivation relationships of Canadian place names achieves an accuracy rate of 98.5% and a recall rate of 93.4%, and the reasoning results can effectively improve the accuracy of query results. The research enriches the theoretical framework of spatio-temporal derivation relationships of place names, solves the current problems of unclear definition and inability to automatically identify spatio-temporal derivation relationships, and provides new perspectives and tools for the application practice in the field of geographical information science.

Keywords: semantics; place names; spatio-temporal derivation relationships; place name spatio-temporal derivation network

1. Introduction

Place names are the designations of natural or human geographic entities at specific spatial locations. In simple terms, place names originate from the conceptualization and naming of geographical elements, entities, or places [1]. Place names are a representative and special category of geospatial data within geographic information systems, providing an intuitive way to identify and access specific geographic locations, thereby enhancing the retrieval, analysis, and visualization capabilities of geospatial data. The naming of places typically reflects various factors such as geography, history, culture, society, and language [2]. Among them, place names can be categorized into primary and related names based on their interrelationships [3]. Primary names are newly coined for new places, while related names utilize primary names for naming new places. If the relationship between two place names is a derivational one, the newly named place is referred to as a derived place name [4]. Derivational relationships refer to the process of naming new geographical entities by affixing, combining, and condensing existing place names to establish a relationship between the two place names. Among these derivational relationships, spatio-temporal derivational relationships stand out because they not only reflect similarities in the names of the two places but also relate to the location of the geographical entities, indicating spatial proximity between the two place names, playing an important role in semantic expression and geographic information retrieval.

In light of the current research landscape, which has largely overlooked the exploration of spatio-temporal derivation relationships among place names, there remains an untapped potential for leveraging these relationships to bolster the interconnectedness of place names and to elevate the capabilities of spatial retrieval and geographic question-answering systems. This paper introduces a comprehensive approach to identifying and articulating spatio-temporal derivation relationships, grounded in a clear understanding of these relationships. Initially, the paper delineates a precise definition of the spatio-temporal derivation relationships pertaining to place names. It then articulates the specific constraints that govern such relationships. With these constraints as a foundation, the paper proceeds to investigate methodologies adept at recognizing spatio-temporal derivation relationships. Following this, a spatio-temporal derivation network of place names is established to formally delineate the derivational connections between place names. Concluding the research, the paper delves into the reasoning of spatial adjacency and positional relationships through the lens of this network.

2. Related Work

Surveying and mapping geospatial information is an important strategic data resource and a new factor of production, and geographic name information is a representative and special category of geospatial data within surveying and mapping geospatial information. Statistical data show that place names play a pivotal role in the organization and management of geographic information, and research on the qualitative geospatial expression of geographic name information and its application services has become a hot topic in the GIS academic community at home and abroad in recent years. The application of geographic name information can be traced back to the 17th century in China and the 19th century in the United Kingdom, but the application and high regard for geographic name information did not widely occur for nearly two centuries [1]. The study and application of geographic name information only began to receive high attention in the GIS academic community in the 21st century. Currently, scholars have conducted relevant research on the derivation of place names. Reference [5] explored the role of place name derivation in constructing Yoruba riddles, finding that Yoruba riddles with derived place names not only reflect the habits, characteristics, and personalities of the people of the relevant towns and cities but also can serve as a supplementary source of information for historical construction and reconstruction. Reference [6] deeply analyzed the derivational characteristics of suffixes in English and German, expanding our understanding of language evolution. Reference [7] proposed a new method for identifying semantic relationships of proper noun derivation in geographic entities, which can effectively identify the semantic relationships of proper noun derivation in geographic entities and has potential application value in the fields of place name management and translation. 
Reference [3] studied and analyzed the concept of derived place names, derivation methods, characteristics, and the current state of transliteration of foreign derived place names, helping people to comprehensively and systematically understand derived place names and providing references for workers in the transliteration of foreign place names into Chinese characters. Reference [8] elaborated on the three sets of concepts in toponymic linguistics: old place names and new place names, primary place names and derived place names, and symmetric place names and compound place names. Reference [9] addressed the lack of annotation in global geographic name data for derived place names and their corresponding primary place names, derivation categories, and positional relationships. This gap hinders the translation, study, and tracing of derived place names and makes manual identification of completely derived place names and annotation of derivation information inefficient. To address it, reference [9] proposed an identification algorithm for common noun derivation and completely derived place names, improving the efficiency of translation. The aforementioned research examined the definition of place name derivation and the factors influencing the genesis of derived place names, which enriches the theoretical foundation of place name derivation and provides technical support for practical applications. However, there is less research on the derivational relationships between primary place names and derived place names.

Semantics refers to the meaning inherent in a linguistic symbol, which aids readers in gaining a deeper understanding and comprehension of data. Understanding semantics can enhance people’s grasp of the meanings conveyed by language. Geographical name semantics refers to the meaning expressed by geographical symbols, encompassing the textual origins of place names. While place names, as symbolic representations, convey “what it is”, the semantics of place names, with the implied meaning, discuss “why it is so”. The semantics of place names, based on their constituent parts, include aspects such as spatial location, the meaning of the place name, its etymology, and administrative affiliation. Although semantics do not exist within the geographical entities and attributes themselves, when describing these entities and attributes using symbolic language, semantics permeate, expressing their intrinsic meanings. These intrinsic meanings are the result of the continuous accumulation of people’s cognition of the objective world in their living and growing environments.

In the study of place name semantics, reference [10] revealed the one-to-one correspondence between the meanings of the constituent morphemes of Igbo place names and their arrangement (syntactic structure) and also demonstrated how place names are derived, which helps create a deeper understanding of the history, origins, and culture of the ethnic group. Reference [11] conducted a morpho-syntactic and semantic analysis of the place names of the Luhya ethnic group in Bungoma County, western Kenya, using Fillmore’s frame semantics theory to determine whether the semantic elements in the place names reflect the historical functions and meanings of the names. The study showed that Luhya place names are generated through lexical rules and word transformations involving prefixation, compounding, and borrowing. Semantically, Luhya place names are transparent and descriptive in function, usually named according to topographical features, historical events, climatic conditions, and prominent figures. Reference [12] investigated the distinctive meanings within Dholuo place names by analyzing their constituent elements and classifying them morphologically, finding that the formation of Dholuo place names is primarily through derivational morphology and also includes compounding and inflection. Reference [13] categorized types of villages and towns by mining the spatial information, naming characteristics, and spatial distribution of their place name semantics, showing that village and town place names change less than urban place names and have a strong correspondence between the origin of the place name and the entity, and used place name semantics to identify twenty-one types of characteristic forms of villages and towns in the Qinba mountain area.
Reference [14], based on the summarization of the characteristics of standardized place name word formation, started by analyzing people’s cognitive habits towards place names and, through the calculation of the semantic similarity of place names and the semantic consistency of the spatial topological relationships of geographical entities, carried out a comprehensive semantic consistency matching treatment of place names, thereby improving the accuracy and efficiency of place name semantic matching. Reference [15] established a standardized semantic knowledge base of common place names based on the relationship between the common names and types of place names in standardized Chinese place names and used the semantic meanings of place names provided by it as an important indicator for place name similarity matching. The study of place name semantics reveals the deep meanings behind geographical symbols. Research on place name semantics not only enriches the theoretical system of place name semantics but also allows for a more comprehensive excavation of the intrinsic value of place names.

Scholars have conducted research on place name derivation and place name semantics from various perspectives, yet there is a lack of discussion on the spatio-temporal derivation relationships of place names in existing studies [10,11,12,13,14,15]. There has been insufficient exploration of leveraging these relationships to enhance the connectivity between place names and to improve the performance of spatial retrieval or geographic question-answering systems. As the application scenarios of place names continue to expand, they play an essential role in geographic information systems, not only serving as the core reference for spatial data positioning but also carrying a wealth of semantic information. The spatio-temporal derivation relationship, as a fundamental attribute of place names, is an important component of place name semantics and a crucial aspect of how people recognize and use place names. However, there has not been sufficient research on spatio-temporal derivation relationships, which has constrained further exploration and application of place names and their semantics. In this pioneering work, we introduce the novel concept of spatio-temporal derivation relationships for place names and articulate these connections. Through a rigorous definition and the establishment of clear criteria and identification techniques, we construct a network that encapsulates the spatio-temporal derivation of place names. Subsequently, leveraging this network, we delve into the study of spatial adjacency relationships through a reasoned approach.

The rest of this paper is organized as follows. In Section 2, we introduce the relevant work of this article. In Section 3, we provide a standardized definition of spatio-temporal derivation relationships of place names and outline the constraints of these relationships. In Section 4, based on place name semantics, we construct a spatio-temporal derivation network of place names to formally express spatio-temporal derivation relationships. In Section 5, we explain the reasoning of spatial adjacency relationships and spatial positions through the constructed network. In Section 6, we discuss the experimentation and analysis of the proposed methods, comparing and discussing their advantages and disadvantages.

3. Concept, Definition and Judgment Methods of the Spatio-Temporal Derivative Relationship of Place Names

3.1. Concept of Spatio-Temporal Derivative Relationships of Place Names

The methods of naming place names are diverse, and according to different naming methods, place names can be categorized into the following types [16]: (1) Descriptive place names: Those that depict the geographical characteristics of geographical entities, mainly including place names that indicate geographical locations, describe natural landscapes, and explain natural resources. (2) Narrative place names: Those that reflect the characteristics of human geography, mainly including place names that narrate cultural landscapes, record ethnic identities, document historical facts and legends, and embody certain ideological concepts. (3) Primary and related place names: Place names are divided into primary and related place names based on the relationships between them, with related place names mainly including transformed place names, imitation place names, and derived place names. The derivation relationship between primary and derived place names reveals the connections between place names.

The term “derivation relationship” refers to the process of naming newly discovered geographical entities by affixing, combining, and condensing existing place names to establish a relationship between two place names. In this context, the existing place name is called the “primary place name”, and the newly formed place name through derivation is defined as the “derived place name”.

The derivation relationship can be further divided into inheritance derivation, influence derivation, and spatio-temporal derivation: (1) Inheritance derivation involves direct inheritance of the name of the original place name by adding words such as “New” to the front of the primary place name to create a new place name, for example, the relationship between New York and York in the United Kingdom. By adding the word “New” to the original place name “York”, “New York” was created to commemorate their origin from the town of York in the United Kingdom. (2) Influence derivation naming is done by borrowing the name of a well-known place for new naming; for example, “Yantai Road” in Jinan City, Shandong Province, is named after Yantai, a prefecture-level city in Shandong Province. (3) Spatio-temporal derivation occurs when, based on the latest renamed primary place name, the newly discovered place name is located around the primary place name, so the primary place name is borrowed as part of the proper noun in the naming process. For example, “Peking University Subway Station” is named because the subway station is located near “Peking University”.

In the aforementioned derivation relationships, the spatio-temporal derivation relationship can indicate the semantic relationship and spatial proximity between two place names. However, current research has not conducted a detailed discussion on the spatio-temporal derivation relationships of place names, and there is also a lack of standardized definitions. Therefore, based on summarizing existing research, this paper provides a definition of the spatio-temporal derivation relationship of place names; that is, when naming newly discovered geographical entities, people often combine the existing place names of surrounding natural or artificial geographical entities and generate new place names through the derivation of these existing place names. In this context, we define the relationship between the two place names as the “spatio-temporal derivation relationship of place names”. Its formal expression is as follows:

\left\{ \begin{matrix} \forall t_{p} \in T_{p}, \exists t_{d} \in T_{d}, (t_{p}, t_{d}) \in R \\ D_{t_{p}} < D_{t_{d}} \\ \mathrm{Dis}(t_{p}, t_{d}) \end{matrix} \right. \qquad (1)

In the formula, T_{p} represents the set of primary place names, and T_{d} represents the set of derived place names; t_{p} and t_{d} are elements of T_{p} and T_{d}, respectively. R denotes the spatio-temporal derivation relationship. For each primary place name t_{p} in the set T_{p}, there is at least one derived place name t_{d} in the set T_{d} such that the two place names satisfy the spatio-temporal derivation relationship R. That is, each primary place name has at least one associated derived place name, and they are connected through the spatio-temporal derivation relationship R. The symbol D_{t_{p}} denotes the time at which the primary place name was generated, while D_{t_{d}} denotes the time at which the derived place name emerged. Since derived place names are generated from primary place names through derivation, the primary place name originates before the derived place name; that is, D_{t_{p}} < D_{t_{d}}. The term Dis(t_{p}, t_{d}) represents the spatial distance constraint between the geographical entities denoted by the primary and derived place names, indicating a certain proximity.

Derived place names named using the spatio-temporal derivation relationship not only retain the geographical information that reflects the surrounding environment but also imply the relative positional relationship between the geographical entity and its neighboring entities, embodying the location function of place names. As a special type of place name relationship, the spatio-temporal derivation relationship can simultaneously represent the semantic and spatio-temporal connections between two place names. Therefore, identifying the derivation relationship between place names can not only enrich the expression of place name semantics but also more accurately retrieve geographical information.

In addition, other place name-related concepts involved in this paper are as follows:

Concept One: A primary geographical entity refers to the entity denoted by a primary place name.

Concept Two: A derived geographical entity refers to the entity denoted by a derived place name.

3.2. Definition of the Spatio-Temporal Derivative Relationship of Place Names

The determination of spatio-temporal derivation relationships of place names primarily involves constraints in both semantic and spatial aspects. Initially, the spatio-temporal derivation relationship and its components are formally expressed, and the semantic constraints are analyzed [17].

\left\{ \begin{matrix} t = (s, g) \\ t_{p} = (s_{p}, g_{p}) \\ t_{d} = (s_{d}, g_{d}) \\ C = \{c_{1}, c_{2}, c_{3}, \dots, c_{n}\} \\ R = (R_{c}, R_{i}) \\ \{(s_{p}, g_{p}), (s_{p}, g_{p}, g_{d})\} \in R_{c}, c_{p} \neq c_{d} \\ \{(s_{p}, g_{p}), (s_{p}, g_{p}, s_{d}, g_{d})\} \in R_{i}, c_{p} \neq c_{d} \end{matrix} \right. \qquad (2)

In the formula, place name t is composed of the proper noun s and the common noun g; that is, t = (s, g). t_{p} represents the primary place name, which is composed of the primary proper noun s_{p} and the primary common noun g_{p}; t_{d} represents the derived place name, which is composed of the derived proper noun s_{d} and the derived common noun g_{d}. C represents the set of categories, c_{p} represents the category of the primary geographical entity, and c_{d} represents the category of the derived geographical entity. The spatio-temporal derivation relationship R includes the complete derivation relationship R_{c} and the incomplete derivation relationship R_{i}.

A derived place name t_{d} with a complete derivation relationship is composed of the primary proper noun s_{p}, the primary common noun g_{p}, and the derived common noun g_{d}; a derived place name t_{d} with an incomplete derivation relationship is composed of the primary proper noun s_{p}, the primary common noun g_{p}, the derived proper noun s_{d}, and the derived common noun g_{d}. In both cases, the two geographical entities belong to different categories; that is, c_{p} \neq c_{d}. Therefore, the semantic constraint requires that, in terms of the name, the derived place name includes the primary place name and that, in terms of the category of the geographical entities, the two place names do not belong to the same category.
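The semantic constraint can be illustrated with a short Python sketch. It assumes, purely for illustration, that names are English-style strings in which the derived common noun comes last and the primary place name appears verbatim inside the derived one. The function name `derivation_type`, the category labels, and the example name "Peking University East Gate Subway Station" are hypothetical and not taken from the paper.

```python
from typing import Optional


def derivation_type(primary_name: str, derived_name: str,
                    derived_common_noun: str,
                    primary_category: str,
                    derived_category: str) -> Optional[str]:
    """Classify a candidate pair as 'complete', 'incomplete', or None.

    Assumption: the derived common noun g_d is a suffix of the derived
    name and the primary name (s_p + g_p) occurs verbatim inside it.
    """
    # Category constraint: c_p must differ from c_d.
    if primary_category == derived_category:
        return None
    # Inclusion constraint: the derived name must contain the primary name.
    if primary_name not in derived_name:
        return None
    if not derived_name.endswith(derived_common_noun):
        return None
    # Strip g_d and the primary name; anything left plays the role of s_d.
    stem = derived_name[: len(derived_name) - len(derived_common_noun)]
    leftover = stem.replace(primary_name, "", 1).strip()
    return "incomplete" if leftover else "complete"
```

A pair whose derived name adds only a common noun is reported as complete; any extra proper-noun material makes it incomplete, and pairs that fail either constraint return `None`.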

In terms of spatial distribution, the derived geographical entities should be clustered around the primary geographical entities and have a certain adjacency relationship with them. As shown in Figure 1, if the spatial topological relationship between the two geographical entities is containment or adjacency, the two geographical entities are spatially adjacent. Therefore, provided that the two place names meet the semantic constraints, it can be directly determined that they have a spatio-temporal derivation relationship. If the spatial topological relationship between the two geographical entities is separation, whether they have a spatio-temporal derivation relationship must be determined by checking whether the spatial distance between them falls within a specific spatial constraint distance.

The constraint distance between geographical entities is affected by the category of the primary geographical entity, and different categories of geographical entities have different spatial influence ranges. Therefore, it is necessary to determine the spatial distance constraint for this category in conjunction with the category of the primary geographical entity.

3.3. Methods for Determining the Derivative Relationship of Place Names

3.3.1. Semantic Constraint Judgment

(1): Semantic Similarity

Semantic similarity serves as a metric for gauging the degree of closeness in semantics or meaning between two texts. The smaller the value, the greater the semantic difference between the texts, that is, a lower level of semantic similarity; conversely, the larger the value, the higher the semantic similarity between the texts [18,19]. A distinct feature between the primary place name and the derived place name is the similarity in the names of the place names. Therefore, this paper utilizes semantic similarity to calculate the similarity between two place names and preliminarily determine the semantic relationship between them based on the similarity scores of the place names.

Given that place names are expressed in the form of strings, this paper employs a sequence comparison-based method to calculate the similarity between two place names. This method calculates the similarity by identifying the longest common continuous character sequence (longest common sub-sequence, LCS) between two strings [20]. This method is fast in computation, takes into account the length of the sequence, and standardizes the results, allowing for direct comparison of the outcomes. The formula is as follows:

\mathrm{similarity} = \frac{2 \times \mathrm{LCS}}{\mathrm{length}(s_{1}) + \mathrm{length}(s_{2})} \qquad (3)

In the formula, LCS refers to the length of the longest common sub-sequence between the two place names, s_{1} represents the primary place name, and s_{2} represents the proper noun of the derived place name. The similarity score is normalized to the range 0 to 1. A score of 0 indicates that the two strings share no common sub-sequence, meaning the two place names have only an ordinary (non-derived) place name relationship; a score of 1 indicates that the two strings are identical, meaning the two place names have a complete derivation relationship; any other score indicates that the two strings are partially identical, suggesting that the two place names have an incomplete derivation relationship.
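As a minimal sketch of Equation (3), the following Python code computes the normalized score. The text describes the common part as a continuous character sequence, so this sketch assumes a contiguous match (longest common substring); a non-contiguous sub-sequence could be substituted by changing only the helper. The function names `lcs_length`, `similarity`, and `relation_from_score` are illustrative and not from the paper.

```python
def lcs_length(s1: str, s2: str) -> int:
    """Length of the longest run of characters shared contiguously."""
    best = 0
    prev = [0] * (len(s2) + 1)
    for ch1 in s1:
        curr = [0] * (len(s2) + 1)
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                # Extend the run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def similarity(s1: str, s2: str) -> float:
    """Equation (3): 2 * LCS / (length(s1) + length(s2))."""
    total = len(s1) + len(s2)
    if total == 0:
        return 0.0
    return 2.0 * lcs_length(s1, s2) / total


def relation_from_score(score: float) -> str:
    """Map a score to the three cases described in the text."""
    if score == 0.0:
        return "no derivation"
    if score == 1.0:
        return "complete derivation"
    return "incomplete derivation"
```

For instance, comparing "Peking University" with the proper noun "Peking University" of a derived name yields a score of 1, while "York" against "New York" yields an intermediate score.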

(2): Category Ontology

The place name data studied in this paper were publicly obtained from OpenStreetMap (OSM). The classification of data in OSM is primarily based on the characteristics and uses of geographical elements. This classification is implemented through tags, where each geographical element can be associated with one or more tags to describe its attributes, features, and uses. In OSM, the classification is typically divided into two levels: major categories and sub-categories. Sub-categories are refinements of major categories. For example: tag: shopping = supermarket, tag: shopping = clothes, where supermarket and clothes are refinements of the category shopping. Different combinations of tags can be used to define different categories, thereby achieving a more nuanced classification.

By preprocessing the acquired tags, redundant ones are eliminated, and category levels are defined, that is, the types of relationships between various categories. Among these, there is an inheritance relationship between major categories and sub-categories; for example, supermarket inherits from shopping. Sub-categories have a sibling relationship, meaning that supermarket and clothes have a sibling category relationship. For two place names with spatio-temporal derivation relationships, they are of different category derivation, so the categories to which they belong cannot be the same; that is, the categories of the two place names being different satisfies the category constraint.
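The category relationships described above can be sketched in Python with a simple child-to-parent mapping. This is only an illustration: the `PARENT` table is a toy ontology, the `university` and `education` entries are hypothetical additions, and the constraint is assumed to be a plain inequality of category labels.

```python
# Child -> parent links of a toy two-level category ontology.
# "university" -> "education" is a made-up entry for illustration.
PARENT = {
    "supermarket": "shopping",
    "clothes": "shopping",
    "university": "education",
}


def inherits_from(sub_category: str, major_category: str) -> bool:
    """True when the sub-category refines the given major category."""
    return PARENT.get(sub_category) == major_category


def are_siblings(a: str, b: str) -> bool:
    """True when two distinct sub-categories share the same parent."""
    return a != b and a in PARENT and PARENT.get(a) == PARENT.get(b)


def satisfies_category_constraint(c_p: str, c_d: str) -> bool:
    """Category constraint: primary and derived categories must differ."""
    return c_p != c_d
```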

To effectively store and manage the category ontology, this paper employs the Neo4j graph database. By mapping the ontology information into the graph database, efficient storage and querying from the ontology to the graph database was achieved [21].

3.3.2. Spatial Constraint Judgment

(1): Topological Relationships

This paper adopts the dimensionally extended nine-intersection model (DE-9IM) proposed by Clementini [22]. The model constructs a relationship matrix from the dimension (DIM) of the intersections between the interior (I), boundary (B), and exterior (E) of geographical entities a and b, from which the topological relationships of the geographical entities are extracted [23].

R_{\mathrm{DE-9IM}}(a, b) = \begin{bmatrix} \mathrm{DIM}(I(a) \cap I(b)) & \mathrm{DIM}(I(a) \cap B(b)) & \mathrm{DIM}(I(a) \cap E(b)) \\ \mathrm{DIM}(B(a) \cap I(b)) & \mathrm{DIM}(B(a) \cap B(b)) & \mathrm{DIM}(B(a) \cap E(b)) \\ \mathrm{DIM}(E(a) \cap I(b)) & \mathrm{DIM}(E(a) \cap B(b)) & \mathrm{DIM}(E(a) \cap E(b)) \end{bmatrix} \qquad (4)

In the formula, a and b represent two geographical entities; I, B, and E represent the interior, boundary, and exterior of the geographical entities, respectively; and DIM indicates the dimensions [24]. Considering the characteristics of spatio-temporal derivation relationships, this study focuses on three basic topological relationships: disjoint, adjacent, and containment.
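To make the matrix concrete, the sketch below classifies a DE-9IM matrix that has been flattened row by row into a nine-character string (each cell is `F`, `0`, `1`, or `2`), which is the form GIS libraries commonly return, for example from Shapely's `relate` method. The masks are the standard OGC patterns for disjoint, touches, contains, and within; mapping touches to "adjacent" and contains/within to "containment" is an assumption about how the paper's three relationships correspond to those predicates.

```python
DISJOINT_MASKS = ("FF*FF****",)
ADJACENT_MASKS = ("FT*******", "F**T*****", "F***T****")  # OGC "touches"
CONTAINMENT_MASKS = ("T*****FF*", "T*F**F***")            # contains / within


def matches(de9im: str, mask: str) -> bool:
    """Check a 9-character DE-9IM string against a pattern mask."""
    if len(de9im) != 9 or len(mask) != 9:
        raise ValueError("DE-9IM strings and masks must have 9 characters")
    for cell, want in zip(de9im, mask):
        if want == "*":
            continue                  # any value is accepted
        if want == "T":
            if cell == "F":           # T requires a non-empty intersection
                return False
        elif cell != want:            # F or an explicit dimension digit
            return False
    return True


def classify_topology(de9im: str) -> str:
    """Return 'disjoint', 'adjacent', 'containment', or 'other'."""
    if any(matches(de9im, m) for m in DISJOINT_MASKS):
        return "disjoint"
    if any(matches(de9im, m) for m in ADJACENT_MASKS):
        return "adjacent"
    if any(matches(de9im, m) for m in CONTAINMENT_MASKS):
        return "containment"
    return "other"
```

Two unit squares sharing one edge have the matrix string `FF2F11212`, which this sketch reports as adjacent.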

(2): Spatial Measurement

To accurately quantify the spatial relationships between geographical entities, this paper utilizes numerical values to represent the quantitative distance between entities. Taking into account that the geographical entities denoted by place names include point, line, and area feature types, the distance is calculated by first extracting the centroid of area and line features and then calculating the distance between the centroid and the point features. The formula for the distance between geographical entities is as follows:

\mathrm{Dis} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}} \qquad (5)

In the formula, (x_{1}, y_{1}) and (x_{2}, y_{2}) represent the spatial coordinate values of the geographical entities denoted by the two place names, respectively, and Dis represents the distance between the two entities.
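The centroid-then-distance procedure can be sketched as follows. The sketch assumes planar (projected) coordinates, since Equation (5) is a Euclidean distance, and it handles only area features, using the shoelace formula for the polygon centroid; line features would need a length-weighted centroid instead. The function names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_centroid(ring: List[Point]) -> Point:
    """Centroid of a simple polygon given as an ordered list of vertices."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i, (x0, y0) in enumerate(ring):
        x1, y1 = ring[(i + 1) % len(ring)]   # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0.0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


def dis(p1: Point, p2: Point) -> float:
    """Equation (5): Euclidean distance between two coordinate pairs."""
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])
```

For a square with corners (0, 0), (2, 0), (2, 2), and (0, 2), the centroid is (1, 1), and its distance to a point feature at (4, 5) is 5.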

(3): Decision Tree Model

The decision tree (DT) model is a supervised learning algorithm that constructs a tree-like model through a top-down recursive process. The objective of this model is to learn decision rules from the training data to predict the label values of the target variable. The classification method based on the decision tree model is straightforward and easy to understand and interpret. Additionally, as a very fast learning and prediction algorithm, it can provide high efficiency for text classification and is suitable for classifying large-scale text data scenarios.

The CART decision tree classification method is characterized by its convenience, understandability, and high efficiency, making it one of the mainstream classification methods today [25,26]. In this study, considering the differences in the influence range of geographical features of different categories, the category of the original geographical features and the distance between two geographical features are taken as the basis for decision tree classification. A decision tree model is constructed and trained to determine the spatial constraints of place names that meet semantic relationships.
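The idea of learning a category-specific distance constraint can be illustrated with a much-reduced stand-in for CART: group the training pairs by the category of the primary entity, then choose the single distance threshold that minimizes the weighted Gini impurity. This is not the trained model of the paper, only one CART-style split per category, and the tuple layout `(category, distance, label)` is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_distance_split(samples: List[Tuple[float, int]]) -> Optional[float]:
    """Threshold on distance that minimizes the weighted Gini impurity."""
    samples = sorted(samples)
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(samples)):
        if samples[i][0] == samples[i - 1][0]:
            continue  # no split point between equal distances
        threshold = (samples[i - 1][0] + samples[i][0]) / 2.0
        left = [y for d, y in samples if d <= threshold]
        right = [y for d, y in samples if d > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


def fit_category_thresholds(
        rows: List[Tuple[str, float, int]]) -> Dict[str, Optional[float]]:
    """Learn one distance threshold per primary-entity category."""
    by_category = defaultdict(list)
    for category, distance, label in rows:
        by_category[category].append((distance, label))
    return {c: best_distance_split(s) for c, s in by_category.items()}
```

A full CART tree would apply such splits recursively over both features; the single split is enough to show why different categories end up with different constraint distances.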

In summary, the identification of spatio-temporal derivation relationships of place names requires consideration of both semantic and spatial dimensions. In this process (Figure 2), semantic constraints form the basic premise for derivation identification. If two place names do not meet the semantic constraints, there is no need to further explore their spatial relationships. This hierarchical identification method can improve the accuracy and efficiency of identification, avoiding ineffective spatial analysis between obviously unrelated pairs of place names.
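The hierarchical order of the two checks can be sketched as below. The spatial test is passed in as a callable so that it runs only when the semantic constraints hold. The function names, the topology labels, and the use of a nonzero similarity as the semantic gate are assumptions made for illustration, not the paper's exact decision rule.

```python
from typing import Callable


def make_spatial_check(topology: str, distance: float,
                       max_distance: float) -> Callable[[], bool]:
    """Build the spatial test: topology first, then the distance constraint."""
    def check() -> bool:
        if topology in ("containment", "adjacent"):
            return True                   # spatially adjacent by topology
        return distance <= max_distance   # disjoint: fall back to distance
    return check


def identify(similarity: float, c_p: str, c_d: str,
             spatial_check: Callable[[], bool]) -> bool:
    """Two-stage identification: semantic gate, then spatial analysis."""
    # Stage 1: semantic constraints (name overlap and differing categories).
    if similarity <= 0.0 or c_p == c_d:
        return False                      # spatial analysis is skipped
    # Stage 2: evaluated only for semantically plausible pairs.
    return spatial_check()
```

Because the callable is never invoked for pairs that fail the first stage, potentially expensive geometry operations are avoided for clearly unrelated names.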

4. Construction of a Semantic Based Spatio-Temporal Derivative Relationship Network for Place Names

4.1. Semantic Expression and Storage of Place Names

4.1.1. Semantic Expression of Place Names

(1): Attribute Description

Attribute description aims to elucidate the fundamental information and meaning of place names [27]. The attribute description is defined as a set comprising the following elements:

Attribute = \{ID, C, S, G\}

(6)

In the formula, ID refers to the place name ID; C denotes the category of the place name, which is the category to which the geographical entity denoted by the place name belongs within the classification system; S stands for the proper noun of the place name, which is the word used within the place name to distinguish between different geographical entities; G represents the common noun of the place name, which is the word used within the place name to distinguish between categories of geographical entities.

(2): Spatial Description

Spatial description pertains to the absolute spatial location information of the geographical entity denoted by the place name. The spatial description set is defined as follows:

Location = \{L, AD, A, ET, B\}

(7)

In the formula, L represents spatial coordinates, indicating the central point position of the geographical entity denoted by the place name in the form of coordinate points; AD refers to the administrative division, which is the administrative region where the geographical entity is located; A stands for area, i.e., the land area occupied by the geographical entity; ET denotes the element type, which classifies and organizes geographical entities based on their geometric shapes and characteristics; B signifies the boundary data of the geometric object.

(3): Temporal Description

Place names are relatively stable but not immutable; they evolve continuously due to historical development, changes in the era, and demands for development. The temporal evolution of place names is reflected in their renaming, which can be driven by various factors that change the name or the type of the geographical entity. Two scenarios of change are considered: (1) The category of the place name has not changed; only the name has been replaced. For example, “Home Inn” is renamed to “Hanting Hotel”; both before and after, it belongs to the “Hotel” category. (2) Both the category and the name of the place name have changed. For example, “Home Inn” becomes “7-Eleven Convenience Store”, with the category changing from “Hotel” to “Shopping”. The definition of the temporal description set is as follows:

Time = \{NT, NN, NC\}

(8)

In the formula, NT signifies the new renaming time point, which is the time point at which the place name information changes; NN denotes the new place name, which is the new name after the change in the place name; NC represents the new category, which is the new category after the category change.
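The three description sets can be pictured as simple record types. The sketch below is one possible in-memory representation, assuming Python dataclasses; the field names, the coordinate order, the identifier, and the dates are illustrative assumptions, and only the set members (ID, C, S, G; L, AD, A, ET, B; NT, NN, NC) come from the definitions above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Attribute:
    id: str            # ID: place name ID
    category: str      # C: category within the classification system
    proper_noun: str   # S: distinguishes individual geographical entities
    common_noun: str   # G: distinguishes categories of geographical entities


@dataclass
class Location:
    coordinates: Tuple[float, float]   # L: central point (order assumed: lon, lat)
    admin_division: str                # AD: administrative region
    area: Optional[float]              # A: occupied land area, if known
    element_type: str                  # ET: geometric element type
    boundary: List[Tuple[float, float]] = field(default_factory=list)  # B


@dataclass
class Time:
    new_time: str                       # NT: time point of the change
    new_name: str                       # NN: name after the change
    new_category: Optional[str] = None  # NC: None when only the name changes


@dataclass
class PlaceName:
    name: str
    attribute: Attribute
    location: Location
    changes: List[Time] = field(default_factory=list)

    def current(self) -> Tuple[str, str]:
        """Return (name, category) after applying all recorded changes in order."""
        name, category = self.name, self.attribute.category
        for change in self.changes:
            name = change.new_name
            if change.new_category is not None:
                category = change.new_category
        return name, category


# The two renaming scenarios from the text, applied one after the other
# purely for demonstration (identifier, coordinates and dates are made up).
inn = PlaceName(
    name="Home Inn",
    attribute=Attribute("p001", "Hotel", "Home", "Inn"),
    location=Location((116.30, 39.90), "Beijing", None, "point"),
)
inn.changes.append(Time("2021-05-01", "Hanting Hotel"))
after_rename = inn.current()
inn.changes.append(Time("2023-03-01", "7-Eleven Convenience Store", "Shopping"))
after_conversion = inn.current()
```

Leaving NC empty for a pure renaming keeps scenario (1) and scenario (2) distinguishable in the stored history.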

4.1.2. Storage of Place Name Information

Firstly, the acquired place name data are preprocessed according to the standardized expression of place names, correcting erroneous or incomplete data and removing redundant and useless data to ensure the standardization and consistency of the place name data. This paper utilizes the Neo4j graph database for the storage and expression of place name information and their spatio-temporal derivation relationships. Compared with traditional relational databases, the advantages of graph databases are manifested in intuitiveness, flexibility, and high performance. These characteristics make graph databases particularly suitable for dealing with complex network structures and rich semantic relationships [28].

The storage structure of place name information is divided into two main parts: the storage of place name semantic information and the storage of the place name spatio-temporal derivation relationship network. For the expression of place name semantics, place names are stored as first-level nodes; their attribute description sets, spatial description sets, and temporal description sets are organized as second-level nodes and connected to the first-level nodes; each specific attribute value in the description sets serves as a third-level node, associated with the corresponding second-level node. Through this hierarchical storage method, the attribute relationships of place names are clearly expressed, from the first-level node pointing to the second-level node and so on [29]. For the information related to the spatio-temporal derivation relationships of place names, the first-level nodes in the semantic expression can be directly utilized; in this network, the relationship between the primary place name and the derived place name is represented by a directed edge, pointing from the primary place name to the derived place name, intuitively reflecting the derivation relationship between place names.
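The hierarchical layout described above can be sketched as a list of directed edges before anything is written to a graph database. The following minimal example assumes hypothetical edge labels ("HAS_DESCRIPTION", "DERIVES") and a hypothetical naming scheme for second-level nodes; the attribute values for “Slave Lake Station” are those used in the example of Section 6.2.1, and the derivation pair is illustrative.

```python
def semantic_edges(place_name, descriptions):
    """Build directed edges (source, label, target) for the three-level layout.

    descriptions maps a description set name (second-level node) to a dict
    of attribute name -> attribute value (third-level nodes).
    """
    edges = []
    for set_name, values in descriptions.items():
        set_node = f"{place_name}/{set_name}"
        # first-level node -> second-level node
        edges.append((place_name, "HAS_DESCRIPTION", set_node))
        for attr, value in values.items():
            # second-level node -> third-level node, labelled by the attribute
            edges.append((set_node, attr, value))
    return edges


def derivation_edge(primary, derived):
    """Directed edge from the primary place name to the derived place name."""
    return (primary, "DERIVES", derived)


edges = semantic_edges(
    "Slave Lake Station",
    {"attribute": {"category": "bus_stop",
                   "common_noun": "Station",
                   "proper_noun": "Slave Lake"}},
)
link = derivation_edge("Mount Royal University", "Mount Royal University Lot 1")
```

Because derivation edges connect first-level nodes directly, the derivation network can be traversed without touching the description nodes.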

4.2. Construction of a Semantic Based Network for Spatio-Temporal Derivative Relationships of Place Names

This paper expresses the spatio-temporal derivation relationships between place names through a place name spatio-temporal derivation relationship network. Its construction is divided into two steps [30,31] (Algorithm 1, pseudocode for the construction process of the semantic-based spatio-temporal derivation network of place names). (1) The first step is the calculation based on the semantic constraints of place names. Using the judgment method for spatio-temporal derivation relationships, this step extracts the corresponding attribute values from the semantic expression of the place names, performs the calculations, and stores the results as relationships between the two place names. First, the place names and their proper nouns are extracted for semantic similarity calculation to determine whether a semantic inclusion relationship exists between the two place names. Next, the category information of the two place names is extracted and judged through the relationship between the two categories in the category ontology. If the place names are semantically similar and belong to different categories, the pair meets the semantic requirements of the spatio-temporal derivation relationship, and a semantic relationship is generated between them. Then, the topological relationship between the two place names is calculated and expressed. Finally, the spatial coordinates of the two geographical entities are used to calculate the spatial distance between the place names with the spatial measurement method, and this distance is expressed as a distance relationship.
(2) The second step is the identification of the spatio-temporal derivation relationship, which judges whether such a relationship exists between the two place names based on the semantic, topological, and distance relationships generated in the first step. If there is no semantic relationship between the two place names, they are directly determined to be common place names. If there is a semantic relationship and the topological relationship is containment or adjacency, a spatio-temporal derivation relationship is determined to exist. If there is a semantic relationship but the topological relationship is separation, the decision tree model is applied to judge whether a spatio-temporal derivation relationship exists based on the category of the primary place name and the distance between the two.

Algorithm 1 Construction of a semantic based spatio-temporal derivation network for place names.
Input: place name information stored in Neo4j
for each pair of place names do
    (I) Semantic constraint judgment
    (1) semantic similarity calculation
    if value != 0 then
        there is semantic similarity
        (2) class judgment
        if $c_{p} \neq c_{d}$ then
            comply with class constraints
            (II) Spatial constraint judgment
            (1) topological relationship calculation
            if topological relationship == adjacent or includes then
                spatio-temporal derivation relationship
            elif topological relationship == apart then
                (2) spatial metric calculation
                (3) decision tree model judgment
                if decision tree model value == 1 then
                    spatio-temporal derivation relationship
                else
                    ordinary place name relationship
end for
return place names with spatio-temporal derivation relationships

Through the above methods, place names with spatio-temporal derivation relationships generate spatio-temporal derivation relationship edges, pointing from the primary place name to the derived place name and completing the construction of the place name spatio-temporal derivation relationship network.
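The two-stage judgment can also be read as a short function. The sketch below follows the order of the checks described in this section; the similarity, topology, distance, and decision tree components are passed in as callables, and the stand-ins defined at the bottom (substring matching, a lookup table, a single 1000 m threshold) are hypothetical placeholders chosen only so that the example runs, not the methods used in this study.

```python
def identify_relationship(primary, derived, similarity, topology, distance, tree_predict):
    """Classify a candidate (primary, derived) pair as 'derivation' or 'ordinary'.

    primary / derived: dicts with at least 'name' and 'category'.
    The four callables stand for the semantic similarity calculation, the
    topological relationship calculation, the spatial metric, and the
    trained decision tree, respectively.
    """
    # (I) Semantic constraint: similar names and different categories.
    if similarity(primary["name"], derived["name"]) == 0:
        return "ordinary"
    if primary["category"] == derived["category"]:
        return "ordinary"
    # (II) Spatial constraint: topology first, decision tree only when apart.
    relation = topology(primary, derived)
    if relation in ("adjacent", "includes"):
        return "derivation"
    if relation == "apart":
        d = distance(primary, derived)
        if tree_predict(primary["category"], d) == 1:
            return "derivation"
    return "ordinary"


def build_network(pairs, **judges):
    """Return directed edges (primary name, derived name) for derivation pairs."""
    return [(p["name"], d["name"]) for p, d in pairs
            if identify_relationship(p, d, **judges) == "derivation"]


# --- Hypothetical stand-ins and data so that the sketch runs end to end ---
RELATIONS = {
    ("Mount Royal University", "Mount Royal University Lot 1"): ("includes", 0.0),
    ("Fish Creek", "Fish Creek School"): ("apart", 400.0),
    ("Fish Creek", "Fish Creek Diner"): ("apart", 5000.0),
}
judges = dict(
    similarity=lambda a, b: 1 if a.lower() in b.lower() else 0,
    topology=lambda p, d: RELATIONS[(p["name"], d["name"])][0],
    distance=lambda p, d: RELATIONS[(p["name"], d["name"])][1],
    tree_predict=lambda category, dist: 1 if dist <= 1000.0 else 0,
)
mru = {"name": "Mount Royal University", "category": "university"}
creek = {"name": "Fish Creek", "category": "stream"}
pairs = [
    (mru, {"name": "Mount Royal University Lot 1", "category": "parking"}),
    (creek, {"name": "Fish Creek School", "category": "school"}),
    (creek, {"name": "Fish Creek Diner", "category": "restaurant"}),
    (creek, {"name": "Maple Cafe", "category": "cafe"}),
]
network = build_network(pairs, **judges)
```

The last pair fails the semantic check and is rejected before any spatial computation is attempted, which mirrors the hierarchical design of the identification process.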

5. Spatial Relationship Inference Based on the Spatio-Temporal Derivative Relationship Network of Place Names

5.1. Spatial Proximity Reasoning Based on the Spatio-Temporal Derivative Relationship Network of Place Names

5.1.1. Reasoning Rules for Spatial Proximity Relationships

The spatial information expressed by people using natural language is also a manifestation of their spatial cognition. Employing effective methods to extract spatial information from natural language texts and applying it to spatial information retrieval and spatial intelligent reasoning has become a research hotspot in the field of geographic information science. However, the qualitative and ambiguous characteristics of natural language, such as the use of words like “around” or “nearby”, pose challenges for computers to conduct quantitative and precise analysis. In addressing the issue of spatial proximity queries, the mainstream technology in the field of geographic information retrieval tends to adopt quantitative expressions of geographic information and matching methods, which to some extent overlooks the rich geographical semantics contained in natural language, potentially leading to information loss and errors in understanding. For a specific location, the richer the content of the associated spatial relationship description, the more accurate the definition of the spatial scope. However, existing geospatial query methods often simplify the qualitative spatial proximity relationship into a quantitative measurement relationship during the intention recognition phase, and different researchers have varying standards for the quantification of qualitative descriptions, which may also introduce subjectivity, affecting the accuracy and consistency of the query results. For example, in template-based linking of geospatial data and question answering [32], specific buffer distances were set for different types of geographical entities: restaurants (500 m), cities (5 km), hotels (1 km), landmarks (1 km), and parks (500 m). 
In addition, some researchers [33] distinguished two scales, within a city and between cities, and divided the qualitative distance within a city into four levels, “nearby”, “close”, “medium”, and “far”, corresponding to 3 km, 10 km, 20 km, and more than 20 km, respectively. The default buffer range for the surrounding search function of Amap (Gaode Map) is 200 m. However, such quantitative methods are limited in retrieval and analysis when a geographical entity lies slightly beyond the preset threshold.
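The threshold problem can be made concrete with a small sketch of a buffer-based query. The buffer distances below are the ones quoted above from [32]; the candidate names and their distances are hypothetical, and the function is a simplification that assumes distances to the query location have already been computed.

```python
# Buffer distances in metres, as quoted from [32].
BUFFER_M = {"restaurant": 500, "city": 5000, "hotel": 1000, "landmark": 1000, "park": 500}


def buffer_query(candidates, category):
    """Return names of candidates of the given category inside its buffer.

    candidates: list of (name, category, distance_in_metres) tuples.
    """
    limit = BUFFER_M[category]
    return [name for name, cat, dist in candidates
            if cat == category and dist <= limit]


# Hypothetical candidates around a query location.
candidates = [
    ("Cafe A", "restaurant", 480.0),
    ("Cafe B", "restaurant", 520.0),   # only 20 m beyond the threshold
    ("Hotel C", "hotel", 700.0),
]
nearby_restaurants = buffer_query(candidates, "restaurant")
```

“Cafe B” is dropped although it is only marginally farther away than “Cafe A”, which is the kind of hard cut-off that a purely quantitative query imposes.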

To improve the accuracy of geospatial queries, this study proposes a method that combines qualitative descriptions with quantitative distances. The core of this method is to identify and utilize the derivation relationships between place names to reason about the spatial proximity of geographical entities, thereby enhancing the expressiveness and adaptability of traditional quantitative queries [34,35]. Specifically, the spatio-temporal derivation relationship of place names not only considers semantic similarity but also takes into account the spatial proximity of different geographical entities. Therefore, by identifying the derivation relationships between place names and finding place name entities that have a derivation relationship with the entity of the place name to be queried, spatial proximity can be reasoned, and more accurate query results can be returned.

Reasoning about spatial proximity based on the spatio-temporal derivation relationship network of place names can be seen as identifying derived place names that meet specific conditions from the derivation relationship network.

R(t_{p}, t_{d}) \cap C_{i} \in Result

(9)

In the formula, $R(t_{p}, t_{d})$ represents the primary place name $t_{p}$ and the derived place name $t_{d}$ that satisfy the spatio-temporal derivation relationship, and $C_{i}$ denotes the constraint category that the derived place name must meet. Therefore, the reasoning rules for spatial proximity relationships are as follows: (1) identify all spatio-temporal derivation relationships of the entities in the query and (2) determine whether the category of the identified derived place names meets the constraint category in the query, and if so, that derived place name is the result of the reasoning.

5.1.2. The Process of Inferring Spatial Proximity Relationships

Through the semantic analysis of the query sentence, when the query intent is determined to be a qualitative spatial query, inference rules are used to reason about the geographic entities and constraint information obtained from the element extraction process. First, the geographic entities are identified for spatio-temporal derivation relationships of place names, identifying all place names that have a derivation relationship with that place name and constructing a sub-network of spatio-temporal derivation relationships centered on that place name. Within this sub-network, place names with derivation relationships will serve as spatially adjacent entities that meet the qualitative query. Then, based on the type of entity queried in the sentence, the place names in the derivation relationship sub-network are filtered by category to determine the entities that match the query intent, and all results that meet the category filter are returned. Finally, since the place names that meet the category conditions also meet the spatial constraint conditions, it is inferred that the geographic entities denoted by the two place names have a spatial proximity relationship, and the geographic entities denoted by these place names are returned as the reasoning result.

Taking the query “Which schools are near Fish Creek?” as an example, through intent recognition and element extraction, “Fish Creek, near, schools” is obtained. First, Fish Creek is used as the place name for the identification of spatio-temporal derivation relationships, forming a sub-network of spatio-temporal derivation relationships regarding Fish Creek (as shown in Figure 3). Then, we search for derived place names with the category label “school” within this network. Finally, because place names with spatio-temporal derivation relationships meet the spatial constraint conditions of derivation identification, the place names in the network that meet the query conditions are returned as spatial proximity results.
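The reasoning process can be sketched as two filters over the edge list of the derivation network. In the example below, the edges and category labels are hypothetical (the actual Fish Creek sub-network is the one shown in Figure 3), and the sub-network is assumed to consist of the place names directly derived from the queried entity.

```python
def infer_nearby(derivation_edges, categories, query_place, query_category):
    """Apply the two reasoning rules to a derivation network.

    derivation_edges: iterable of (primary, derived) directed edges.
    categories: dict mapping place name -> category label.
    """
    # Rule 1: collect the place names derived from the queried entity.
    sub_network = [d for p, d in derivation_edges if p == query_place]
    # Rule 2: keep those whose category matches the query constraint.
    return [d for d in sub_network if categories.get(d) == query_category]


# Hypothetical network fragment.
EDGES = [
    ("Fish Creek", "Fish Creek School"),
    ("Fish Creek", "Fish Creek Library"),
    ("Fish Creek", "Fish Creek Station"),
    ("McKenzie Towne", "McKenzie Towne School"),
]
CATEGORIES = {
    "Fish Creek School": "school",
    "Fish Creek Library": "library",
    "Fish Creek Station": "station",
    "McKenzie Towne School": "school",
}
result = infer_nearby(EDGES, CATEGORIES, "Fish Creek", "school")
```

No distance is computed at query time: spatial proximity is implied by the spatial constraints that every derivation edge already satisfied when the network was built.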

5.2. Spatial Location Inference Based on Spatio-Temporal Derived Relationship Network of Place Names

The focus of this part of the research is to utilize the semantic and spatial relationships contained in the constructed network of place name spatio-temporal derivation to perform location inference for geographical entities missing from the place name repository and to study and discuss the influencing factors that need to be considered during the reasoning process. Location is a fundamental and important feature dimension among many spatial features; recognizing spatial location can answer the question of “where” in the six major questions of geography and provide spatial references for the answers to other geographical questions.

In urban spaces, the main forms of location information include coordinates, postal codes, telephone numbers, IP addresses, place names, and addresses. Apart from the coordinate type, other location data need to undergo spatial positioning to be transformed into urban spaces. Among these data types that require transformation, place name data are a relatively standardized form of data using natural language, with a very rich range of application scenarios, and they are widely included in the data of citizens’ lives, government management, and corporate institutions.

The basis for spatial location inference lies in the precise definition of the spatial relationships between geographical entities and different reference points, which is mainly achieved through the semantic and spatial relationships in the spatio-temporal derivation relationship network.

The method in this paper consists of the following three steps:

The first step is to obtain place name data from the place name repository that has a spatio-temporal derivation relationship with the considered geographical entity; the second step is to use semantic similarity to obtain the spatial similarity set of the entity to be reasoned from the spatio-temporal derivation relationship network; the third step is to use the spatio-temporal derivation relationship for location prediction.

For example, the place name “Mount Royal University Lot 5” does not exist in the place name repository, so the geographical location of the entity denoted by this place name cannot be obtained. Using the location inference method in this paper, first, we look for place name data in the place name repository that have a spatio-temporal derivation relationship with the place name (Figure 4); then, we obtain the spatial similarity set of the entity: Mount Royal University, Mount Royal University Lot 1, Mount Royal University Lot 2, Mount Royal University Lot 3, and Mount Royal University Lot 4; according to the definition of the derivation relationship, it can be determined that Mount Royal University Lot 5 is within the scope constrained by Mount Royal University, and then, based on the linear distribution of Lot 1, Lot 2, Lot 3, and Lot 4, it can be determined that Mount Royal University Lot 5 is located next to Mount Royal University Lot 4.
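One simple way to turn “next to Lot 4” into a coordinate estimate is to extrapolate along the sequence of known lots. The sketch below is an assumption on our part rather than the procedure used in the paper: it orders the lots by their trailing number, takes the mean step between the first and last known positions, and adds it to the last one. The coordinates are hypothetical planar values in metres.

```python
import re


def lot_number(name):
    """Extract the trailing lot number from a name such as '... Lot 4'."""
    match = re.search(r"Lot (\d+)$", name)
    if match is None:
        raise ValueError(f"no lot number in {name!r}")
    return int(match.group(1))


def extrapolate_next(points):
    """Estimate the next position in an ordered, roughly linear sequence."""
    if len(points) < 2:
        raise ValueError("at least two known positions are required")
    steps = len(points) - 1
    dx = (points[-1][0] - points[0][0]) / steps
    dy = (points[-1][1] - points[0][1]) / steps
    return (points[-1][0] + dx, points[-1][1] + dy)


# Hypothetical positions of the known lots (planar coordinates, metres).
known = {
    "Mount Royal University Lot 3": (80.0, 4.0),
    "Mount Royal University Lot 1": (0.0, 0.0),
    "Mount Royal University Lot 4": (120.0, 6.0),
    "Mount Royal University Lot 2": (40.0, 2.0),
}
ordered = [known[name] for name in sorted(known, key=lot_number)]
estimate = extrapolate_next(ordered)
```

Such an estimate would still need to be checked against the extent of the primary entity, since the derivation relationship constrains Lot 5 to lie within the scope of Mount Royal University.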

6. Experiment and Analysis

6.1. Data Sources

This section takes Canadian place name data as an example to verify the effectiveness and reliability of the main theories and technical methods presented in this paper. The Canadian place name data were publicly obtained from OpenStreetMap, including information such as the OpenStreetMap ID, place names, latitude and longitude, and category codes. After preprocessing the downloaded data and removing entries containing special characters, a total of 52,400 valid place name entries remained.

6.2. Construction of a Semantic Based Spatio-Temporal Derivative Relationship Network for Place Names

6.2.1. Construction of a Semantic Based Spatio-Temporal Derivative Relationship Network for Place Names

Firstly, the attribute, spatial, and temporal information of the place names is expressed as semantic triples (entity, attribute, attribute value). As shown in Figure 5, the “Slave Lake Station” node represents the place name, and the semantic expression of the place name is decomposed into three basic dimensions: attribute description, spatial description, and temporal description. Secondary nodes are used to represent these three description sets, namely the attribute description set, the spatial description set, and the temporal description set. Each secondary node is connected to tertiary nodes that store the specific attribute values. For example, “bus_stop”, “Station”, and “Slave Lake” are expressed as the values of the category, common noun, and proper noun attributes in the attribute description set, respectively.

Directed edges are used to represent the relationships between place name attributes, and each node and edge is assigned corresponding labels. For instance, category labels are used to clearly identify the category of all place names, which helps in quickly recognizing and retrieving information about specific types of place names. Through this multi-level, labeled expression method, not only can the semantic information of place names be fully expressed, but the efficiency and accuracy of place name information retrieval can also be improved.

Then, the Canadian place name information is stored using the Neo4j graph database. The place names in Canada are identified through the judgment method of spatio-temporal derivation relationships, forming a complex Canadian spatio-temporal derivation relationship network (Figure 6).

The network categorizes place names into two types: primary place names and derived place names. The spatio-temporal derivation relationships among place names are divided according to the number of derivations, such as first-degree derivation, second-degree derivation, etc. Notably, when a second-degree derivation relationship exists, the derived place name that serves as a first-degree derivation relationship will act as the primary place name in the second-degree derivation relationship. In the network, nodes represent place names, and the edges between nodes denote the spatio-temporal derivation relationships, with the direction of the edges pointing from the primary place name to the derived place name. Place names with consecutive derivation relationships are expressed in the form of a chain of place name entities. This method allows for the connection of place names with consecutive derivation relationships rather than existing in the form of independent triples. This not only facilitates the tracing of their most original primary place names but also provides a formal expression of the complex connections between place names.

Analysis indicates that the construction of the spatio-temporal derivation relationship network of place names is of significant importance in revealing the deep connections between primary and derived place names. On one hand, as a newly defined relationship connecting two place names, it increases the associativity between place names and enriches the expression of place name semantics. On the other hand, the spatio-temporal derivation relationship indicates the semantic similarity and spatial proximity of the two place names with such a relationship. Therefore, the identification and expression of the spatio-temporal derivation relationship can effectively enhance the application of place names in information retrieval and other areas.

6.2.2. Evaluation of the Spatio-Temporal Derivative Relationship Network of Place Names

To substantiate the efficacy of the methodologies introduced in this research, an assessment of the spatio-temporal derivation network of place names was executed from dual perspectives. The first aspect entails an evaluation of the decision tree model utilized for the determination of spatial constraints; the second aspect involves a holistic appraisal of the methodologies delineated in this paper through a comparative analysis with the spatio-temporal derivation relationships ascertained by manual identification.

(1): Decision Tree Model Evaluation

For the decision tree classification model, the three parameters of precision, recall, and the F1 score are commonly used as data indicators to assess the model’s predictive accuracy [36]. However, in cases of data imbalance or where the costs of different errors are significantly different, simple predictive accuracy may not meet all requirements. Therefore, this paper combines the generalized performance evaluation technique of the confusion matrix to comprehensively assess the model’s learning ability. We first extracted place name data from 23 categories proportionally, conducted manual annotation, and constructed a dataset, totaling 1981 data entries. The manual annotation involved experts in the field of place name research, master’s students, and technical personnel, ensuring the accuracy of the dataset.

① Data Indicators

The formulas for calculating precision, recall, and the F1 score are as follows:

\left\{\begin{matrix} Precision = \frac{TP}{TP + FP} \\ Recall = \frac{TP}{TP + FN} \\ F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \end{matrix}\right.

(10)

In the formulas, TP (true positive) represents the number of positive samples predicted as positive, FP (false positive) represents the number of negative samples predicted as positive, and FN (false negative) represents the number of positive samples predicted as negative. In terms of the meaning of the indicators, precision measures the accuracy of positive predictions; recall measures how many of all actual positive samples are correctly predicted by the model; the F1 score takes both precision and recall into account as their harmonic mean, providing a more comprehensive assessment of the model’s performance. The higher the F1 value, the better the classification effect. The scores corresponding to each evaluation indicator of the decision tree model constructed in this paper are shown in Table 1.
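For reference, the three indicators of Formula (10) can be computed directly from the counts. The sketch below uses hypothetical counts chosen for easy arithmetic; they are not the values reported for the model in this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

With these counts, precision is 0.9 and recall is 0.75; the harmonic mean pulls the F1 score (about 0.82) toward the weaker of the two.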

② Confusion Matrix

A confusion matrix is a tool used to evaluate the performance of classification models. The diagonal elements of the confusion matrix represent the number of correct predictions made by the classification model, while the off-diagonal elements represent the number of incorrect predictions; thus, comparing the predicted and actual values in the confusion matrix can reveal the distribution of incorrect predictions across different categories.
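A confusion matrix for a binary classifier can be tabulated in a few lines. The sketch below assumes that rows index the actual class and columns the predicted class, with label 1 meaning “derivation relationship” and 0 meaning “ordinary relationship”; the label vectors are hypothetical.

```python
def confusion_matrix(actual, predicted, labels=(0, 1)):
    """Return a matrix m where m[i][j] counts samples of actual class i predicted as class j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical ground truth and predictions.
actual = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1]
matrix = confusion_matrix(actual, predicted)
correct = sum(matrix[i][i] for i in range(len(matrix)))
```

The diagonal sum gives the number of correct predictions, and each off-diagonal cell shows which class is being confused with which.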

The confusion matrix for the decision tree model in this paper is shown in Table 2:

To reduce the risk of underfitting or overfitting, the generalization of the decision tree model must also be examined. K-fold cross-validation divides the data into training and test sets multiple times and trains and tests the model on each division, which helps to assess how consistently the model performs across different random divisions of the data set and gives a more reliable estimate of its accuracy. Therefore, this paper uses K-fold cross-validation for model tuning. In addition, an independent test set is also used in this paper to classify place names with derivation relationships in order to evaluate the model’s generalization ability. The accuracy rate of the model’s judgment results is 95.9%.
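The mechanics of K-fold cross-validation can be sketched without any library. The example below splits sample indices into K contiguous folds of nearly equal size and averages a user-supplied score over the folds; the value K = 5 and the `fit` and `score` callables are assumptions for illustration, since the paper does not state the value of K, and in practice the samples would be shuffled before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds of nearly equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first 'extra' folds get one more
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(samples, k, fit, score):
    """Average score(model, test_samples) over k folds; fit and score are caller-supplied."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train_idx])
        scores.append(score(model, [samples[i] for i in test_idx]))
    return sum(scores) / len(scores)


# Fold sizes for a data set of 1981 entries (the size of the annotated set) with K = 5.
fold_sizes = [len(test) for _, test in k_fold_indices(1981, 5)]
```

Every sample is used for testing exactly once, so the averaged score does not depend on one particular division into training and test data.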

This study conducted a comprehensive performance evaluation of the constructed decision tree model. The evaluation results show that the F1 score of the decision tree model can reach 0.96, a comprehensive indicator that takes into account both precision and recall, indicating that the model performs well in the classification task. The results of its confusion matrix indicate that the proportion of correctly classified instances by this classification model is relatively high. To enhance the model’s generalization ability, K-fold cross-validation was used to reduce the model’s dependence on specific training data, by training the model on multiple different training sets to reduce the risk of overfitting and improve the model’s generalization ability.

③ Evaluation of the Method for Determining the Spatio-Temporal Derivative Relationship of Geographical Names

Due to the absence of a method for recognizing the spatio-temporal derivation relationships of place names, the evaluation of the spatio-temporal derivation network of place names in this study involves comparing the results of manual identification of spatio-temporal derivation relationships with the method proposed in this paper, thereby assessing the effectiveness of the approach presented herein. From the preprocessed Canadian place name data, 10,000 data entries were randomly selected for verification and analysis. Among them, there were 216 pairs of place names identified manually as having spatio-temporal derivation relationships. The method in this paper demonstrated 202 pairs of place names within the network. The specific analysis results of the network are shown in Table 3:

By constructing a network of spatio-temporal derivation relationships of place names, this study identifies and expresses place names with such relationships. Specifically, the method presented in this paper correctly identified 199 pairs of derived place names, and there were 17 pairs of incorrectly identified derived place names. Based on this, the proposed method achieved a precision rate of 98.5% and a recall rate of 93.4%.

Further analysis of the method revealed that although it performs well in most cases, there are instances where the spatio-temporal derivation relationships are not correctly identified. These situations mainly include the following aspects: (1) Place name abbreviation issues: In some cases, derived place names adopt abbreviated forms of the primary place names during naming, which leads to difficulties in identification during the semantic similarity calculation process, for example, the derived place name “USask Bookstore”, where “USask” is an abbreviation for “University of Saskatchewan”. The existence of the abbreviation form makes it impossible to accurately identify the semantic relationship between the two place names during the semantic constraint judgment, thus incorrectly determining the relationship as a common place name relationship. (2) Non-standard writing of derived place names: When naming derived place names, not following the standard writing of the primary place name can also lead to errors in identifying semantic constraint relationships. For example, the primary place name “7-Eleven” may have derived place names written as “7-11 Car Wash” or “7 Eleven stop”, where the non-standard writing of the derived place names hinders our ability to recognize their semantic relationships. (3) Decision tree model: Although the decision tree model constructed in this paper has improved the efficiency of identifying place name derivation relationships, it also has certain limitations and cannot accurately identify the spatio-temporal derivation relationships between place names based solely on the spatial distance between them, leading to some errors in using the decision tree for spatial constraint judgment.

In summary, although our method achieves satisfactory results in most cases, the existence of the aforementioned issues indicates that we need to further optimize the method in future work to improve its accuracy and robustness.

6.3. Reasoning Based on the Spatio-Temporal Derivative Relationship Network of Place Names

6.3.1. Spatial Neighborhood Reasoning

In the field of geographic information retrieval, enhancing the accuracy of qualitative query results remains a persistent challenge. This paper proposes a spatial relationship inference method based on the network of spatio-temporal derivation relationships of place names, aimed at improving the accuracy of qualitative query results, and uses Google Maps and Baidu Maps as references to verify the effectiveness of the proposed inference method.

Google Maps’ “Nearby” search function allows users to conduct searches based on two quantitative conversion methods: One is based on walking distance, simulating the range that a person can walk within a specified time, and the other is based on driving distance, determining the area accessible by car. Baidu Maps’ “Nearby” search directly sets a quantitative distance around the central point to form a buffer zone for the search.

Taking the query “Which schools are near McKenzie Towne?” as an example, as shown in Figure 7, Google Maps first locates “McKenzie Towne”, then uses its “Nearby” search function to find schools in the vicinity, with the quantitative distance set for schools reachable by walking within 15 min; Baidu Maps conducts the search following the same steps, and its returned results are schools within an 800 m radius around “McKenzie Towne”. The query results are shown in Table 4.

To apply the method proposed in this paper to the question “Which schools are near McKenzie Towne?”, the key information “McKenzie Towne, nearby, schools” is first extracted, and the characteristic word “nearby” identifies the question as a qualitative spatial query. Place names that have a spatio-temporal derivation relationship with McKenzie Towne can therefore be inferred to be spatially close to it. The process is as follows: First, the constructed Canadian network of spatio-temporal derivation relationships is searched for place names that have a derivation relationship with “McKenzie Towne”, which yields eight pairs of derivation relationships (Figure 8); then, these place names are filtered by the category label “school”; finally, the method returns the derived place name that meets the category filter, “McKenzie Towne School|Calgary Board of Education”.
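The lookup-then-filter procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: the dictionary `network` and the function `infer_nearby` are hypothetical stand-ins for the stored derivation network, and only the school entry is taken from the text (the second entry is an invented placeholder).

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for the derivation network:
# primary place name -> list of (derived place name, category label).
DerivationNetwork = Dict[str, List[Tuple[str, str]]]

network: DerivationNetwork = {
    "McKenzie Towne": [
        # Taken from the text above.
        ("McKenzie Towne School|Calgary Board of Education", "school"),
        # Invented placeholder so the category filter has something to reject.
        ("McKenzie Towne Placeholder Shop", "shop"),
    ],
}


def infer_nearby(net: DerivationNetwork, primary: str, category: str) -> List[str]:
    """Return derived place names of the requested category.

    Step 1: look up every place name derived from `primary`.
    Step 2: keep those whose category label equals `category`.
    The results are treated as spatially near `primary`.
    """
    derived = net.get(primary, [])
    return [name for name, label in derived if label == category]


schools = infer_nearby(network, "McKenzie Towne", "school")
print(schools)
```

Because the inference relies only on derivation edges and category labels, it needs no distance threshold at query time. That is why it can return a place that falls just outside a fixed search radius.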

Similarly, as shown in Figure 9, for the query “Which clinics are around North Battleford?”, Google Maps locates “North Battleford” and searches for clinics within a 30 min walking distance, whereas Baidu Maps searches for clinics within a 2000 m radius. The query results are presented in Table 5.

Using the method proposed in this paper for inference, we searched within the Canadian network of spatio-temporal derivation relationships for place names that have a derivation relationship with “North Battleford” (Figure 10) and filtered for place names with the category label “clinic”, identifying “North Battleford Medical Clinic” as a clinic located in the vicinity of North Battleford.

In practical searches, both inferred results lie close to the outer edge of the quantitative search ranges used by Google Maps and Baidu Maps, yet neither service returns them. Combining the quantitative search with inference over the spatio-temporal derivation relationship network adds these spatially proximate place names and brings the search results closer to the actual situation. The method proposed in this paper therefore extends the traditional quantitative query by using the spatio-temporal derivation relationships between place names to infer their spatial proximity, and it offers a way to improve the “Nearby” search function on maps.

Furthermore, Google Maps and Baidu Maps often apply a uniform quantitative distance standard to “nearby” searches for different categories of geographical entities, without fully considering how those categories differ in mode of existence and range of influence. This “one-size-fits-all” approach may overlook the specificity and diversity of geographical entities. To address this issue, the network of spatio-temporal derivation relationships of place names proposed in this paper takes the characteristics of different geographical entity categories into account during its construction. By setting appropriate spatial proximity distance thresholds for different categories of geographical entities, it can infer the spatial proximity between place names more accurately, thereby making the results of geographic information retrieval more in line with the actual situation and better meeting the search needs of users.

6.3.2. Spatial Fuzzy Position Inference

This study used Google Maps to validate the feasibility of location inference. For the place name “Little Red Deer fishing access”, the process began by searching the place name database for place names that have a spatio-temporal derivation relationship with it (Figure 11); next, the spatial similarity set of the entity was obtained; finally, according to the definition of the derivation relationship, “Little Red Deer fishing access” was inferred to lie within the constrained area of “Red Deer”. A Google Maps search for “Little Red Deer fishing access” showed that it is approximately 1760 m from “Red Deer”.
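The core of this inference can be sketched as a lookup from a derived place name to a constrained area around its primary place name. The sketch is illustrative only and omits the spatial similarity set step. The names `DERIVED_FROM`, `ASSUMED_RADIUS_M`, `Constraint` and `infer_location` are hypothetical, and the 2000 m radius is a placeholder, not a threshold reported in this paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Constraint:
    anchor: str      # primary place name the inference is anchored on
    radius_m: float  # assumed extent of the constrained area, in metres


# Hypothetical derivation edges: derived place name -> primary place name.
DERIVED_FROM: Dict[str, str] = {
    "Little Red Deer fishing access": "Red Deer",
}

# Placeholder per-category radius; the paper's actual thresholds
# are not restated here.
ASSUMED_RADIUS_M: Dict[str, float] = {"default": 2000.0}


def infer_location(name: str, category: str = "default") -> Optional[Constraint]:
    """Infer a fuzzy location for a place name that has no stored position.

    If the name is derived from a primary place name, return a constrained
    area anchored on that primary name; otherwise return None.
    """
    primary = DERIVED_FROM.get(name)
    if primary is None:
        return None
    radius = ASSUMED_RADIUS_M.get(category, ASSUMED_RADIUS_M["default"])
    return Constraint(anchor=primary, radius_m=radius)


print(infer_location("Little Red Deer fishing access"))
```

Returning an anchor and an extent, instead of a point, reflects that the inferred position is fuzzy. It narrows the search area for a place name whose coordinates are not yet in the repository.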

7. Conclusions

This research first provides a standardized definition of spatio-temporal derivation relationships of place names, establishes the criteria and identification methods for these relationships, and constructs the corresponding network of spatio-temporal derivation relationships of place names. Through this network, inference of spatial adjacency relationships was conducted, thereby providing an effective approach to enhance the expression of place name semantics and the retrieval of geographic information. Using the method proposed in this paper to identify the spatio-temporal derivation relationships of Canadian place names, the precision rate reached 98.5%, and the recall rate was 93.4%. Furthermore, inference of spatial adjacency relationships through the network constructed in this paper can enhance the accuracy of existing quantitative query results, and the reasoning of spatial locations provides a solution for data not yet included in the repository.

This paper also has limitations, such as incomplete identification of spatio-temporal derivation relationships of place names and insufficient coverage of place name data sources. Therefore, in further research, the sources of place name data can be expanded, and mappings between the category systems of different place name data sources can be added, to construct a more comprehensive multi-source network of spatio-temporal derivation relationships of place names.

Author Contributions

Conceptualization, Wenjie Dong and Xi Mao; data curation, Wenjie Dong; investigation, Wenjuan Lu; methodology, Wenjie Dong and Xi Mao; resources, Jizhou Wang and Yao Cheng; supervision, Xi Mao; visualization, Wenjie Dong; writing—original draft, Wenjie Dong; writing—review and editing, Wenjie Dong. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (2023YFF0611901) and the Exploration of Global Intelligent Positioning of Multilingual Place Names (AR2412).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

No potential conflicts of interest are reported by the author(s).

Figure 1. Schematic Diagram of Topological Relationships Between Features.

Figure 2. Flowchart for Identifying the spatio-temporal derivative relationship of place names.

Figure 3. Schematic Diagram of the spatio-temporal Derivative Relationship Sub-Network of the Place Name “Fish Creek”.

Figure 4. Schematic Diagram of Spatial Position Inference.

Figure 5. Semantic Expression Diagram of Place Name “Slave Lake Station”.

Figure 6. Network of spatio-temporal Derivative Relationships of Canadian Place Names.

Figure 7. Google Maps and Baidu Maps Query Results.

Figure 8. Schematic Diagram of the spatio-temporal Derivative Relationship Sub-Network of the Place Name “McKenzie Towne”.

Figure 9. Google Maps and Baidu Maps Query Results.

Figure 10. Schematic Diagram of the spatio-temporal Derivative Relationship Sub-Network of the Place Name “North Battleford”.

Figure 11. Schematic Diagram of Reasoning for the Feature “Little Red Deer fishing access”.

Table 1. Evaluation Indicators and Results of Decision Tree Model.

Evaluating Indicator	Result
Accuracy	0.9801
Recall	0.9448
F1 score	0.9621

Table 2. Confusion Matrix of Decision Tree Model.

	True Positive	True Negative
Predicted Positive	154	3
Predicted Negative	9	231

Table 3. Evaluation Indicators and Results of the spatio-temporal Derivative Relationship Network of Place Names.

Evaluation Indicator	Result
TP	199
FP	3
FN	14
Precision	98.5%
Recall	93.4%
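
The two percentages in Table 3 follow directly from the TP, FP and FN counts. The short check below recomputes them: the 98.5% figure equals TP / (TP + FP) and the 93.4% figure equals TP / (TP + FN). Variable names are illustrative.

```python
# Counts reported in Table 3.
tp, fp, fn = 199, 3, 14

# Share of identified derivation relationships that are correct.
precision = tp / (tp + fp)

# Share of true derivation relationships that were identified.
recall = tp / (tp + fn)

print(f"TP/(TP+FP) = {precision:.1%}")  # 98.5%
print(f"TP/(TP+FN) = {recall:.1%}")     # 93.4%
```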

Table 4. Google and Baidu Map Query Results.

Google Maps	Baidu Maps
Sonshine Park Preschool	Sonshine Park Preschool
St Albert The Great Elementary and Jr High School	A Child First Preschool Inc
—	Paws in Colour by Hyuna Johnson

Table 5. Google Maps and Baidu Maps Query Results.

Google Maps	Baidu Maps
Medical Clinic	Sask Prenatal Classes
Battlefords Medical Centre	Bull Alexandra DR-Counselling Service
Dr. M.C. Khurana Family & Internal Medicine	Connect Hearing (North Battleford)
Twin City Medical Clinic	—

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
