Article

A DeBERTa-Based Semantic Conversion Model for Spatiotemporal Questions in Natural Language

1
School of Information Engineering, China University of Geosciences Beijing, Beijing 100083, China
2
Chinese Academy of Surveying and Mapping, Beijing 100036, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(3), 1073; https://doi.org/10.3390/app15031073
Submission received: 14 December 2024 / Revised: 20 January 2025 / Accepted: 20 January 2025 / Published: 22 January 2025

Abstract

To address current issues in natural language spatiotemporal queries, including insufficient question semantic understanding, incomplete semantic information extraction, and inaccurate intent recognition, this paper proposes NL2Cypher, a DeBERTa (Decoding-enhanced BERT with disentangled attention)-based natural language spatiotemporal question semantic conversion model. The model first performs semantic encoding on natural language spatiotemporal questions, extracts pre-trained features based on the DeBERTa model, inputs feature vector sequences into BiGRU (Bidirectional Gated Recurrent Unit) to learn text features, and finally obtains globally optimal label sequences through a CRF (Conditional Random Field) layer. Then, based on the encoding results, it performs classification and semantic parsing of spatiotemporal questions to achieve question intent recognition and conversion to Cypher query language. The experimental results show that the proposed DeBERTa-based conversion model NL2Cypher can accurately achieve semantic information extraction and intent understanding in both simple and compound queries when using Chinese corpus, reaching an F1 score of 92.69%, with significant accuracy improvement compared to other models. The conversion accuracy from spatiotemporal questions to query language reaches 88% on the training set and 92% on the test set. The proposed model can quickly and accurately query spatiotemporal data using natural language questions. The research results provide new tools and perspectives for subsequent knowledge graph construction and intelligent question answering, effectively promoting the development of geographic information towards intelligent services.

1. Introduction

With the rapid development of big data technology, researchers have placed higher demands on the processing of massive geographic information data [1]. The specific goal is to improve human–computer interaction efficiency and promote the comprehensive advancement of geographic spatiotemporal information from traditional data management and analysis modes towards intelligent services. Currently, most search engines rely on traditional keyword matching, which has limitations in handling natural language questions with spatiotemporal characteristics, such as “What are the five restaurants closest to the Summer Palace?”, “Which subway stations were in Beijing in 2012?”, and “Which cities are within 600 km of Beijing?” [2]. To address these issues and help users experience the convenience brought by the big data era [3], enabling users to perform spatial queries and analyses through natural language descriptions, without prior knowledge of complex Geographic Information Systems (GIS) or structured query languages, is an innovative and practical attempt [4].
Spatiotemporal queries refer to query statements involving temporal and spatial attributes, typically combining geographic and temporal information for understanding and answering. Semantic understanding is a key technology in spatiotemporal query processing [5]. Converting natural language spatiotemporal queries into structured query statements is currently a hot research topic in natural language processing. Three main approaches address this challenge: rule-based methods, machine learning, and deep learning.
Rule-based methods utilize longest matching word segmentation with spatial knowledge base support to perform word segmentation on query statements while using spatial query sentence pattern templates for syntactic analysis [6], thereby interpreting spatial query targets and corresponding spatial operations [7]. Natural language questions are converted into standard spatial query statements and executed by relational databases, with results presented in textual or graphical form [8]. Reference [9] proposed a rule-matching model based on edit distance for matching words and sentence patterns in queries. Reference [10] combined rule-based and statistical models to extract spatiotemporal and attribute information from Chinese texts. In terms of query processing, however, methods based on statistical models are not suitable for performing the necessary data extraction tasks [11]. Reference [12] conducted a statistical analysis and classification of predicates and quantifiers describing spatial relationships in natural language, establishing four syntactic patterns for natural language spatial queries. Regarding structured query language conversion, the common approach is to implement natural language to structured language mapping through template construction [13,14]. Reference [15] designed a template-based question answering system that converts natural language questions in the geospatial dimension into GeoSQL (Geo Spatial SQL) queries.
Machine learning methods construct models using corpus data and machine learning algorithms, automatically converting natural language queries into query statements by learning the correspondence between language and databases. References [16,17] employ pre-trained named entity recognition models and dictionary lookup methods to identify geographic entities and utilize constituent syntax to extract spatial relationships between different entities. Through semantic constraint syntax, they extract spatial relationships between entities, and after annotating geographic entities and spatial relationship terms, they map their semantics to predefined templates [18].
In recent years, with the rapid accumulation of big data and significant improvements in computing performance, deep learning has become a research hotspot. Ian Goodfellow et al. [19] analyzed the core principles of deep learning in depth and laid a solid theoretical foundation for its application to natural language processing. Deep learning models are now widely applied in natural language processing and have proven especially capable in the task of transforming natural language questions into query statements. This process mainly relies on the encoding–decoding framework of deep learning models; as explained by Charu C. Aggarwal [20] in Neural Networks and Deep Learning, algorithms such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) play a key role in the semantic representation and parsing of query statements and have contributed greatly to natural language processing technology.
Rule-based methods are overly dependent on predefined rules and templates for parsing and converting natural language queries, making it challenging to handle complex and variable semantics. Meanwhile, machine learning methods require manual feature extraction from data to achieve semantic understanding. However, this approach faces challenges, such as time consumption with massive datasets, and the model’s effectiveness is significantly impacted by feature quality, thus limiting the extracted features’ utility. The emergence of deep learning has, to some extent, alleviated these issues. Deep natural language learning models primarily focus on general natural language structure analysis and semantic expression [21,22], while research on transferring deep learning models to the spatiotemporal query domain remains relatively scarce. Existing pre-trained models are largely based on general corpora and lack a deep understanding of domain-specific terminology and special structures in spatiotemporal queries, resulting in suboptimal performance when processing such specialized terms. More critically, research on spatiotemporal query processing for Chinese language corpora is particularly limited.
Given the uniqueness of Chinese text, the complexity of spatiotemporal questions, and the abstractness of entity relations, traditional rule-based and syntactic pattern-based parsing methods struggle to capture the intent of natural language spatiotemporal questions accurately. Complex queries such as “What are the hotels within 3 km north of the Forbidden City in Beijing?” pose significant challenges for them. Based on the above background and challenges, this study proposes two core hypotheses:
(1) With a semantic coding module, by introducing the deep learning model DeBERTa for pre-training feature extraction, feeding the feature vector sequence into BiGRU [23] to learn text features, and combining with the CRF [24] layer to obtain the globally optimal labeling sequences, the entity recognition accuracy and generalization ability in natural language spatiotemporal interrogative sentences can be significantly improved. The model is able to capture the contextual information and semantic features in the text more effectively and thus recognize and classify entities more accurately.
(2) Semantic understanding methods complete the classification of interrogative sentences according to the encoding module and the expected query type and realize the conversion of natural language spatiotemporal interrogative sentences to the database language Cypher based on the semantic parsing of the interrogative sentences within the class.
In this paper, the semantic encoding module and the semantic understanding methods are combined into NL2Cypher, a model that converts the semantics of natural language spatiotemporal questions into database queries. We generate Cypher statements from a Chinese corpus and test their accuracy; the model outperforms the other models evaluated, with a clear accuracy improvement over GPT2, Thousand Questions, and other large models. The conversion accuracy reaches 88% on the training set and 92% on the test set. The experimental results show that the proposed model can query spatiotemporal data quickly and accurately using natural language questions, and the results are useful for advancing the interaction between artificial intelligence and databases, facilitating more efficient and convenient data querying and analysis [25].

2. NL2Cypher Conversion Model

NL2Cypher aims to address the challenge of converting complex natural language spatiotemporal queries into structured Cypher query statements. These natural language spatiotemporal queries extensively encompass multiple elements including geographic entities, location information, spatiotemporal relationships, and place types, while involving various query purposes such as location queries, distance calculations, directional guidance, quantitative statistical analysis, and extremum searches. Given the limitations of traditional keyword-based search methods in handling such complex and variable semantic requirements, and in order to ensure the reliability of the experiments and the superiority of the results, this study preprocesses the datasets, which includes steps such as data cleansing, word segmentation, data annotation, and segmentation of sentence lengths according to the data characteristics and model requirements. This paper thoroughly explores the semantic information extraction and parsing process of spatiotemporal queries, thereby achieving precise mapping between natural language and database query language.
The conversion model consists of two modules: a semantic encoding module and a semantic understanding method. The overall research framework is illustrated in Figure 1, and Table 1 provides an example of a natural language spatiotemporal query. The specific contents are as follows:
(1) Semantic encoding module: In this study, the DeBERTa pre-trained model is applied for the first time to entity recognition in natural language spatiotemporal questions, which effectively overcomes the limitations of traditional static word vectors and the BERT model in processing Chinese character features. Combined with BiGRU, the model mines the temporal and spatial features of the text, and the CRF layer then integrates these features dynamically, which greatly improves the model’s ability to understand and represent Chinese natural language spatiotemporal questions. In Section 5.2 (Semantic Encoding Model Results), the model is compared with other state-of-the-art models; the experimental results show that its F1 value reaches 92.69%, a significant improvement in accuracy over the other models.
(2) Semantic understanding method: This component classifies queries based on semantic encoding results; performs semantic parsing of intra-class queries according to temporal relationships, spatial relationships, and dependency relationships in the query structure; and subsequently generates Cypher statements to complete query intent recognition.
This research framework aims to provide comprehensive understanding and mapping of natural language spatiotemporal questions, to improve the efficiency and accuracy of the connection between natural language questions and database queries, and to offer an effective solution for natural language queries with spatiotemporal characteristics.

3. Semantic Encoding Module

3.1. Semantic Encoding Elements

The key to the NL2Cypher conversion model is to accurately extract semantic information from natural language spatiotemporal questions and deeply understand their intent so that accurate Cypher query statements can be generated automatically.
A study by Ehsan Hamzei et al. [26] provides an in-depth analysis of patterns in location-based questions using large-scale question/answer datasets and generalizes five core semantic categories related to location: location, location type, activity, spatiotemporal relationship, and quality. To further improve the comprehension of natural language spatiotemporal questions, this paper builds on the research of Hamzei et al. and on the two datasets GeoQuestions and GeoQuery mentioned in Section 4.1. It extends the semantic coding schema of locations from five categories to eight and introduces a more fine-grained coding mechanism for extracting semantic information, which is designed to cover the complex semantics of natural language questions more comprehensively and thus represent the information in spatiotemporal questions more accurately.
According to Table 2, the basic semantic encoding elements for Chinese spatiotemporal queries include eight categories:
(1) Interrogative words: locative interrogatives (where, which place, etc.), specific interrogatives (what, etc.), selective interrogatives (which one, which ones, etc.), quantitative interrogatives (how many, how much, etc.), measurement interrogatives (how long, how far, etc.), and judgment interrogatives (is it, whether, etc.);
(2) Place names: Names of geographic entities, such as Shanghai, Beijing’s Forbidden City, etc.;
(3) Location types: Classifications of geographic entities, such as hospitals, parks, etc.;
(4) Attributes: Properties of locations, such as population, area, time, etc.;
(5) Quality: Quality of locations and their attributes and location types, such as nearest, highest, etc.;
(6) Activities: Indicating actions, such as crossing, flowing through, building, etc.;
(7) Spatiotemporal relationships: Describing spatial and temporal relationships, including direction, topology, and temporal relationships, such as northeast, nearby, surrounding, before, after, etc.;
(8) Numbers and units: Including time, distance, quantity, etc., such as the year 2021, five kilometers, and three units.
Based on the above description, the query “Which cities are within 600 km around Beijing?” is encoded as “prd3t”.

3.2. Encoding Model Construction

The semantic encoding of spatiotemporal queries in Chinese corpora can be viewed as a sequence labeling process, where training on annotated data enables automatic encoding of query semantics. Chinese spatiotemporal queries contain rich semantic information, with sentence structures exhibiting long-distance dependencies and diverse semantic encoding elements [27], resulting in phenomena where the same word may be encoded as different types of information in various contexts.
To address these challenges, this paper proposes a DeBERTa-BiGRU-CRF model for spatiotemporal query semantic encoding, which is an improvement based on the dynamic pre-training model DeBERTa, and the model structure is shown in Figure 2. The model integrates a BiGRU network following the DeBERTa model to capture more contextual semantic features, followed by a CRF layer to obtain globally optimal label sequences. This architecture better understands semantic relationships between different words in sentences, captures long-distance dependencies, and resolves ambiguities.
(1) Embedding Layer
The embedding layer converts discrete text data into continuous, low-dimensional vector representations. Given an input text $X = \{X_1, X_2, X_3, \ldots, X_n\}$, the embedding layer learns a corresponding embedding vector for each character $X_i$. The output vector $Y = \{Y_1, Y_2, Y_3, \ldots, Y_n\}$, where $Y_i$ is the embedding vector of the corresponding word $X_i$, serves as the input to subsequent layers.
(2) DeBERTa Layer
BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained language model based on the Transformer architecture that enhances the performance of various NLP (natural language processing) tasks [28]. It uses a bidirectional training approach, leveraging contextual information from both directions for understanding and processing. Compared to traditional models like recurrent neural networks (RNN) and convolutional neural networks (CNN), BERT can better capture long-range dependencies through the attention mechanism of Transformers [29], thus improving language modeling [30].
The DeBERTa (Decoding-enhanced BERT with disentangled attention) model is an improvement over the BERT model. Introduced by Microsoft in 2020, the DeBERTa model primarily utilizes the Transformer’s encoder and demonstrates superior performance and robustness on spatiotemporal questions [31,32]. The task of the DeBERTa layer is to acquire deep semantic features of natural language spatiotemporal questions through word embedding training, converting text into word vectors. The model structure is shown in Figure 3.
The calculation process for the disentangled attention input representation in the DeBERTa model is shown in Equation (1).
$$A_{i,j} = H_i H_j^{T} + H_i P_{j|i}^{T} + P_{i|j} H_j^{T} + P_{i|j} P_{j|i}^{T} \tag{1}$$
Here, $H_i$ and $H_j$ represent the content embeddings of the $i$-th and $j$-th words, respectively. $P_{i|j}$ represents the relative positional embedding between words $i$ and $j$. $H_i H_j^{T}$ represents the computation from content to content, while $H_i P_{j|i}^{T}$ and $P_{i|j} H_j^{T}$ represent the computations from content to position and position to content, respectively. $P_{i|j} P_{j|i}^{T}$ represents the computation from position to position. The formula for calculating the attention scores is as follows:
$$\mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \tag{2}$$
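Here, $Q$, $K$, and $V$ are the query, key, and value matrices derived from the word-embedding representations, $d_k$ is the dimensionality of the key vectors, which scales the dot products, and $T$ indicates the transpose operation. In DeBERTa, $k$ is also used to denote the maximum relative distance covered by the relative position embeddings $P$; this is distinct from the subscript in $d_k$.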
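The four terms of Equation (1) can be made concrete with a small sketch. It is an illustration only, not the authors' implementation: it works directly on unprojected embeddings as Equation (1) is written, and the lookup `P[rel_index(i, j, k)]` for $P_{i|j}$ (a clipped relative distance with maximum `k`) is an assumption about how the relative position table is indexed.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rel_index(i, j, k):
    """Clip the relative distance i - j into the index range [0, 2k)."""
    d = i - j
    if d <= -k:
        return 0
    if d >= k:
        return 2 * k - 1
    return d + k

def disentangled_scores(H, P, k):
    """Unnormalized scores A[i][j] as the sum of the four terms in Equation (1)."""
    n = len(H)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            p_ij = P[rel_index(i, j, k)]  # P_{i|j} (assumed lookup)
            p_ji = P[rel_index(j, i, k)]  # P_{j|i} (assumed lookup)
            A[i][j] = (dot(H[i], H[j])      # content to content
                       + dot(H[i], p_ji)    # content to position
                       + dot(p_ij, H[j])    # position to content
                       + dot(p_ij, p_ji))   # position to position
    return A

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: two tokens, two dimensions, maximum relative distance k = 1.
H = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.5, 0.0], [0.0, 0.5]]
A = disentangled_scores(H, P, k=1)
weights = [softmax(row) for row in A]
```

Each row of `weights` sums to one and would then be used to weight the value vectors as in Equation (2).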
(3) BiGRU Layer
The task of the BiGRU layer is to extract more semantic features from the text. A Bidirectional Gated Recurrent Unit (BiGRU) is constructed based on two GRUs, each operating in opposite directions [33]. In the structure of a unidirectional GRU network, state information is propagated in a single direction; for example, a left-to-right GRU can only learn information from previous time steps. However, the output of a BiGRU network is determined by both GRUs, allowing it to capture information from both past and future time steps. Since the semantic information of a sentence is related to both its preceding and succeeding context, this paper employs BiGRU to capture more comprehensive contextual information.
In BiGRU, the reset gate selectively forgets parts of the information in the previous hidden state, while the update gate controls the degree to which the previous hidden state is updated and selects the candidate hidden state to be updated [34]. The computational propagation of BiGRU is described by Equations (3)–(6).
$$z_t = \sigma(U_z h_{t-1} + W_z x_t + b_z) \tag{3}$$
$$r_t = \sigma(U_r h_{t-1} + W_r x_t + b_r) \tag{4}$$
$$\tilde{h}_t = \tanh(U_h (r_t \odot h_{t-1}) + W_h x_t + b_h) \tag{5}$$
$$h_t = z_t \odot \tilde{h}_t + (1 - z_t) \odot h_{t-1} \tag{6}$$
Here, $z_t$ and $r_t$ represent the update gate and reset gate at time $t$, respectively; $U$ and $W$ are the weight matrices; and $b$ denotes the bias. $\sigma$ is the sigmoid activation function, $x_t$ represents the input information at the current time step, $\tilde{h}_t$ is the state information of the new memory unit at the current time step, and $h_t$ denotes the updated state information at the current time step, while $h_{t-1}$ represents the updated state from the previous time step. The symbol $\odot$ denotes multiplication of the corresponding elements of the matrices. The final calculation result of the BiGRU is a combination of the forward and backward propagation results, with the backward propagation process mirroring the forward propagation.
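Equations (3)–(6) translate almost line by line into code. The following is a minimal pure-Python sketch, not the trained model: the parameter dictionary layout (`U_z`, `W_z`, `b_z`, and so on) is a hypothetical naming choice, and concatenating the forward and backward states is one common way to realize the "combination" mentioned above, assumed here for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(x_t, h_prev, p):
    """One GRU step; p maps 'U_z', 'W_z', 'b_z', ... to parameters (hypothetical layout)."""
    def affine(gate, h_in):
        u = matvec(p["U_" + gate], h_in)
        w = matvec(p["W_" + gate], x_t)
        return [a + b + c for a, b, c in zip(u, w, p["b_" + gate])]

    z = [sigmoid(v) for v in affine("z", h_prev)]         # Equation (3)
    r = [sigmoid(v) for v in affine("r", h_prev)]         # Equation (4)
    gated = [ri * hi for ri, hi in zip(r, h_prev)]        # r_t ⊙ h_{t-1}
    h_tilde = [math.tanh(v) for v in affine("h", gated)]  # Equation (5)
    return [zi * ht + (1.0 - zi) * hp                     # Equation (6)
            for zi, ht, hp in zip(z, h_tilde, h_prev)]

def bigru(xs, h0, fwd, bwd):
    """Run a forward and a backward GRU and concatenate their states per time step."""
    forward, h = [], h0
    for x in xs:
        h = gru_step(x, h, fwd)
        forward.append(h)
    backward, h = [], h0
    for x in reversed(xs):
        h = gru_step(x, h, bwd)
        backward.append(h)
    backward.reverse()
    return [f + b for f, b in zip(forward, backward)]

# With all-zero parameters both gates equal 0.5 and the candidate state is 0,
# so every step simply halves the previous hidden state.
zero = {name: [[0.0]] for name in ("U_z", "W_z", "U_r", "W_r", "U_h", "W_h")}
zero.update({name: [0.0] for name in ("b_z", "b_r", "b_h")})
states = bigru([[1.0], [1.0]], [1.0], zero, zero)
```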
(4) CRF Layer
Following the BiGRU layer, a Conditional Random Field (CRF) layer is added to predict the label sequence [35]. The CRF effectively models the dependencies between labels, ensuring that the generated label sequence is coherent and reasonable. The goal of this layer is to recognize geographic entities in natural language questions and to annotate the extracted word vectors using the BIOES format.
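For a linear-chain CRF, the highest-scoring label sequence is usually found with Viterbi decoding. The paper does not spell out its decoding procedure, so the sketch below is only an assumption about how such a layer could pick labels; the emission and transition scores are made-up toy values, and missing transitions default to a score of zero.

```python
def viterbi(emissions, transitions, tags):
    """Return the best tag path.

    emissions: list of dicts, emissions[t][tag] = score of tag at position t.
    transitions: dict, transitions[(prev, cur)] = score of moving prev -> cur.
    """
    score = {tag: emissions[0][tag] for tag in tags}
    backpointers = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in tags:
            best_prev = max(
                tags, key=lambda prev: score[prev] + transitions.get((prev, cur), 0.0)
            )
            new_score[cur] = (score[best_prev]
                              + transitions.get((best_prev, cur), 0.0)
                              + em[cur])
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: score[tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

tags = ["B", "E", "O"]
emissions = [
    {"B": 0.4, "E": 0.0, "O": 0.6},
    {"B": 0.0, "E": 1.0, "O": 0.5},
]
# Penalize the invalid move O -> E (an entity cannot end without a beginning).
transitions = {("O", "E"): -10.0}
best = viterbi(emissions, transitions, tags)
```

Picking the best tag at each position independently would give `O, E`, which is not a valid sequence; the transition penalty steers the decoder to `B, E` instead, which is the kind of label dependency the CRF layer is meant to capture.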
In the annotation stage, the commonly used annotation schemes include BIO and BIOES patterns. The BIOES schema is applicable to the spatiotemporal interrogative data in this study due to its ability to clearly distinguish the boundaries of spatiotemporal interrogative entities from single-word entities. Compared to the simplicity of the BIO schema, BIOES provides clearer marking at the end of the entity and reduces the ambiguity of boundary identification. B denotes the labeling of the start of an entity, I denotes the interior of an entity, E identifies the character located at the end of the entity, S denotes a single word, and O denotes irrelevant information.
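The BIOES scheme can be illustrated with a short helper that turns entity spans into tags. This is a sketch for illustration: the function name, the span format, and the English tokens are hypothetical, and the labels `p` (place name) and `t` (location type) follow the codes used later in this paper.

```python
def to_bioes(tokens, spans):
    """Convert entity spans into BIOES tags.

    spans: list of (start, end_exclusive, label) over token indices.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = "S-" + label          # single-token entity
        else:
            tags[start] = "B-" + label          # entity start
            for i in range(start + 1, end - 1):
                tags[i] = "I-" + label          # entity interior
            tags[end - 1] = "E-" + label        # entity end
    return tags

tokens = ["parks", "near", "the", "Beijing", "Forbidden", "City"]
tags = to_bioes(tokens, [(0, 1, "t"), (3, 6, "p")])
```

Here `tags` is `["S-t", "O", "O", "B-p", "I-p", "E-p"]`: the explicit `E-` and `S-` tags mark entity ends and single-token entities, which plain BIO leaves implicit.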

4. Semantic Understanding Methods

4.1. Question Classification

The extraction of semantic information from natural language spatiotemporal questions focuses on the perspective of the question itself. Often, the semantics within a question may be abbreviated or misspelled. After extracting spatiotemporal questions using the semantic encoding module, it is necessary to perform a normalization process on the question set. This involves classifying and semantically parsing the questions to further determine the category to which a spatiotemporal question belongs.
Question classification aids in better understanding the user’s intent and requirements, thereby providing customized query results based on different question categories. This paper uses the benchmark datasets GeoQuestions [36] and GeoQuery [37] for geospatial question answering as a basis, extending categories to include direction, distance, and time question types, such as basic queries, composite queries, and fuzzy queries, and reclassifying the questions. In the geospatial context, the answers expected from spatiotemporal question queries are related to factors such as location, distance, direction, quantity, and extremum. Therefore, as shown in Table 3, questions are classified into eight types based on the expected answer type. These classification methods offer a structured processing framework for spatiotemporal questions, facilitating accurate understanding of user needs and the provision of corresponding query results.
After semantic encoding, extracted keywords can serve as critical features for distinguishing between question categories. Because interrogative sentences have distinctive features, this paper adopted the BERT (Bidirectional Encoder Representations from Transformers) [38] large model to distinguish different types of interrogative sentences. To accurately distinguish complex queries with similar textual descriptions and question types that return answers in similar formats, a fusion of question vectors and encoded key elements was used, with the final classification results obtained through model training. For example, interrogative keywords were used to differentiate between two complex queries like “Where is the hotel closest to Tiananmen?” and “Which is the hotel closest to Tiananmen?” Buffer zone queries, extremum queries, and list queries all returned answers as lists of location types, but due to different key elements in the questions, the question type that best reflected the features of the keywords was prioritized in the response.

4.2. Semantic Parsing

Building on question classification, semantic parsing was performed on intra-category questions by identifying the relationships among encoded entity elements to determine the query conditions for each type of question. In the context of geospatial and temporal questions, the focus was primarily on spatial relationships, temporal relationships, and dependency relationships [39]. The LTP (Language Technology Platform) tool [40] was used to perform dependency syntax analysis on the questions, and relationship recognition was carried out based on the needs and characteristics of different question types.
(1) Spatial Relationships
The three elements of spatial descriptions are the position object (P), the reference object (C), and the spatial relation (r) [41]. The position object is the core of the spatial description, representing the object being located or described. The reference object serves as the benchmark or reference point used to locate the position object within the spatial description, providing a reference framework for the position object. Spatial relations are the spatial connections between the position object and the reference object, including directional relationships (such as east, west, south, north, northeast, southeast, etc.), distance relationships (such as around, nearby, etc.), and topological relationships (such as bordering, containing, adjacent, etc.). The spatial relationship triplet in a question is represented in the following form:
$$\langle P_i, r_j, C_i \rangle_j, \quad i, j = 1, 2, \ldots, n$$
In the formula, $i$ represents all objects related to $r$ in the question, and $j$ represents all spatial relationships in the sentence. For example, the sentence “the park and bank north of the museum” contains $\langle \mathrm{Park}, \mathrm{North}, \mathrm{Museum} \rangle_1$, $\langle \mathrm{Bank}, \mathrm{North}, \mathrm{Museum} \rangle_1$, $\langle \mathrm{Park}, \mathrm{Nearby}, \mathrm{Museum} \rangle_2$, and $\langle \mathrm{Bank}, \mathrm{Nearby}, \mathrm{Museum} \rangle_2$.
Based on the semantic encoding results of the sentence, if there is a spatial relationship encoding r, a combination of dependency syntax and rule-based methods is used to identify the location object and reference object of the spatial relationship and determine the spatial relationship triple.
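Once the position objects, relations, and reference object have been identified, enumerating the triples is mechanical. The sketch below assumes that identification step has already been done (for example by dependency parsing) and only shows the data structure; the class and function names are hypothetical.

```python
from typing import NamedTuple

class SpatialTriple(NamedTuple):
    position: str   # position object P
    relation: str   # spatial relation r
    reference: str  # reference object C

def build_triples(position_objects, relations, reference):
    """Pair every coordinated position object with every relation to the reference."""
    return [
        SpatialTriple(p, r, reference)
        for r in relations
        for p in position_objects
    ]

triples = build_triples(["Park", "Bank"], ["North", "Nearby"], "Museum")
```

This yields the four triples listed in the example above, grouped by relation.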
(2) Temporal Relationships
When describing geographic entities, temporal factors should be considered, such as the historical evolution of geographic entities, changing trends, and the state of specific periods [42], to more comprehensively understand and describe geographic phenomena and analyze the changes in geographic data over time. Spatiotemporal questions mainly include absolute time (e.g., September 2010) and relative time (e.g., five years ago). Their temporal relationship words mainly express temporal sequence relationships (e.g., before, after, until, etc.) and temporal modification relationships (e.g., around, about, approximately, etc.) [43]. In spatiotemporal questions, the queries and descriptions of time primarily focus on the time attributes of geographic entities or the temporal relationships between geographic entities. If a time encoding ‘d’ and a temporal relationship encoding ‘r’ are identified in the question, then time is assigned as an attribute or association to the entity encodings in the question, and ‘r’ is mapped to the temporal relationship function based on the temporal relationship set.
(3) Dependency Relationships
The entities in the question usually have dependency relationships, such as subject–predicate, verb–object, coordination, attribute modified, and preposition–object [44,45]. For instance, if there is a coordination relationship among entities in the sentence and the first entity is associated with a certain relationship (such as a spatial relationship), then the coordinated entity is usually also associated with the same relationship. If the dependency relationship between entities is that of an attribute modifier, it indicates a modification relationship between the entities.
Based on the syntactic relationships, the query target of the question is determined: location queries typically take the location ‘p’ or location type ‘t’ as the subject; distance queries take location ‘p’ as the subject; direction queries must determine the reference object and query object based on the subject–predicate relationship, verb–object relationship, etc.; buffer, counting, extremum, and list queries generally have the location type ‘t’ as the query target; and judgment queries have the IF function as the query target. The extracted entity encoding classes are then listed and the relationships between entities are defined as functions: the query target of the question is treated as a variable and the remaining information as constants, and functions are defined for buffer, direction, distance, sorting, fuzzy queries, and topological queries, such as Buffer($p$, number, $t$), Direction($p_1$, East, $p_2$), Distance($p_1$, $p_2$), Order($o$, number), etc.
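As an illustration of how a function such as Direction($p_1$, East, $p_2$) might be evaluated, the sketch below derives a compass direction from the arctangent of the coordinate differences. It assumes planar projected coordinates with x pointing east and y pointing north, and an 8-point compass; both choices, and the function name, are assumptions rather than details taken from the paper.

```python
import math

COMPASS = ["North", "Northeast", "East", "Southeast",
           "South", "Southwest", "West", "Northwest"]

def direction(reference, target):
    """Compass direction of target as seen from reference.

    Points are (x, y) tuples in a planar system, x east and y north (assumption).
    """
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    # atan2(dx, dy) gives the bearing clockwise from north.
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    # Each compass sector spans 45 degrees, centred on its heading.
    return COMPASS[int((bearing + 22.5) // 45) % 8]

def is_direction(reference, wanted, target):
    """True if target lies in the wanted direction from reference."""
    return direction(reference, target) == wanted

south_ok = is_direction((0.0, 0.0), "South", (0.0, -1.0))
```

For geographic (longitude/latitude) coordinates a geodesic bearing would be needed instead of this planar approximation.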
As an example, the query “What parks within five kilometers south of the Forbidden City in Beijing were built in the twentieth century?” is processed in the following steps:
(1) Extract entities and relations: location (p), relation, location type (t), etc.
(2) Declare terms based on the encoding: declare location p (‘Forbidden City’) and location type t (‘Park’).
(3) Define the query target by taking the unknown query target as a variable $x$: $x$ = t (‘Park’).
(4) Define functions by expressing the query conditions, such as dependency, spatial, and temporal relations, as functions: InCity (‘Forbidden City’, ‘Beijing’), Buffer (‘Forbidden City’, 5000 m, ‘Park’), Direction (‘Forbidden City’, ‘South’, ‘Park’).
(5) Nominalize language as p (‘Forbidden City’) AND p (‘Beijing’) AND InCity (‘Forbidden City’, ‘Beijing’) AND t (‘Park’) AND $x$ = t AND Direction (‘Forbidden City’, ‘to the south’, $x$) AND Buffer (‘Forbidden City’, 5000 m, $x$) AND o (‘Time’ = ‘twentieth century’) AND Return (‘$x$’).
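The conjunction in step (5) can be held as a list of terms and rendered on demand, which keeps the parsed conditions available for later query generation. The `Term` class and `render_query` helper below are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    functor: str   # e.g. "p", "t", "Buffer", "Direction"
    args: tuple    # already-formatted argument strings

    def render(self):
        return f"{self.functor}({', '.join(self.args)})"

def render_query(terms, target):
    """Join the terms with AND and append the return target."""
    parts = [term.render() for term in terms]
    parts.append(f"Return({target})")
    return " AND ".join(parts)

terms = [
    Term("p", ("'Forbidden City'",)),
    Term("t", ("'Park'",)),
    Term("Direction", ("'Forbidden City'", "'South'", "x")),
    Term("Buffer", ("'Forbidden City'", "5000 m", "x")),
]
text = render_query(terms, "x")
```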

4.3. Generating Structured Query Language

Based on the classification of questions (count queries) and semantic parsing, a structured query statement in Cypher was constructed. Cypher is a query language widely used for interacting with graph databases [46]. It occupies a significant position in the field of graph data processing due to its intuitive and powerful expressive capability. Cypher provides a natural language-like way to query, update, and manage graph data, enabling developers to efficiently manipulate and explore complex graph structures. This paper took Cypher as an example to conduct an in-depth study of the algorithmic process for generating structured language, specifically Cypher query statements.
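To make the target of this step concrete, the following sketch renders a count query with a buffer condition as Cypher text. It follows the pattern of the example in Table 1 (`MATCH`, `point.distance`, `RETURN`), but the function name `build_count_query`, the node label `city`, and the property names `name`, `latitude`, and `longitude` are assumptions about the graph schema, and passing values as query parameters is a design choice of this sketch rather than something stated in the paper.

```python
def build_count_query(place_name, place_type_label, radius_m):
    """Return (cypher_text, parameters) for a 'how many <type> within <radius> of <place>' question."""
    # Labels cannot be passed as Cypher parameters, so the label is validated before interpolation.
    if not place_type_label.isidentifier():
        raise ValueError(f"unsafe node label: {place_type_label!r}")
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    cypher = (
        f"MATCH (p {{name: $name}}), (t:{place_type_label}) "
        "WHERE point.distance("
        "point({latitude: p.latitude, longitude: p.longitude}), "
        "point({latitude: t.latitude, longitude: t.longitude})) <= $radius "
        "RETURN count(t) AS x"
    )
    return cypher, {"name": place_name, "radius": float(radius_m)}

# "How many cities are there within a 500-km radius of Jinan?"
query, params = build_count_query("Jinan", "city", 500_000)
print(query)
print(params)
```

The returned pair could be handed to a graph database driver that accepts a query string plus a parameter map; that call is omitted here so the sketch stays self-contained.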
The basic structure of a Cypher query statement includes components such as MATCH, WHERE, RETURN, ORDER BY, and WITH. The algorithm flow is illustrated in Figure 4. The MATCH statement matches the geographical entity part to the nodes and relationships in the knowledge base, assigning a unique variable to each extracted geographic concept. The WHERE clause is generated from the query conditions (function modules), which are written in Cypher. For example, the direction function in the query conditions uses the arctangent function to obtain directional results such as east, south, northeast, southwest, south by southwest, and west–southwest. In the distance function, fuzzy queries (such as around, nearby, etc.) refer to quantitative distance ranges assigned according to place types, as proposed by Punjani et al. [36]. In the time function, time modifiers (such as around, approximately, etc.) are standardized and quantitatively analyzed using different scales of time units. The RETURN clause specifies the answer to the query, which is determined by the type of question and the query target. Semantic parsing results are shown in Table 4.
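The arctangent-based direction function can be sketched as follows. The paper does not give its exact formula, so this example assumes the standard initial great-circle bearing computed with `atan2` and an eight-sector compass; the names `bearing_degrees`, `direction_of`, and `COMPASS_8`, as well as the coordinates, are hypothetical. A finer partition (for instance 16 sectors) would be needed for results such as west–southwest.

```python
import math

COMPASS_8 = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]

def bearing_degrees(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, clockwise from north, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0

def direction_of(reference, target):
    """Name the 45-degree compass sector in which target lies as seen from reference."""
    bearing = bearing_degrees(*reference, *target)
    # Shift by half a sector so that 'north' covers 337.5 to 22.5 degrees.
    return COMPASS_8[int((bearing + 22.5) // 45) % 8]

reference = (39.9, 116.4)  # illustrative (lat, lon) of a reference object
print(direction_of(reference, (38.9, 116.4)))
print(direction_of(reference, (40.9, 117.7)))
```

A Direction(p₁, South, p₂) condition then reduces to checking whether `direction_of(p1, p2)` equals the requested sector.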

4.4. Evaluation Metrics

4.4.1. Model Performance Analysis

To quantitatively evaluate the performance of the model, this study uses the confusion matrix and related metrics. In the context of the semantic encoding model, “TN” (True Negative) refers to the number of samples correctly predicted as negative, “TP” (True Positive) refers to the number of samples correctly predicted as positive, “FN” (False Negative) refers to the number of samples incorrectly predicted as negative, and “FP” (False Positive) refers to the number of samples incorrectly predicted as positive [47,48]. These four counts form the fundamental components of the confusion matrix; they constrain each other and collectively reflect the classification outcome.
Precision indicates the proportion of samples in the test set predicted to be positive that were actually positive.
Recall indicates the proportion of samples in the test set that were actually positive that were predicted to be positive.
The F1 score represents the harmonic mean of precision and recall. A higher F1 score indicates superior performance of the model, showcasing its effectiveness in accurately and comprehensively identifying positive instances.
Accuracy indicates the overall proportion of correct predictions in the test set.
The calculation formulas are shown in Figure 5.
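Since the formulas appear only in a figure, the standard definitions of the four metrics are sketched below in terms of the confusion matrix counts. The function name `classification_metrics` and the example counts are illustrative assumptions, not values from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion matrix counts."""
    predicted_positive = tp + fp
    actual_positive = tp + fn
    total = tp + fp + fn + tn
    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=70)
print(metrics)
```

In sequence labeling, these counts are usually accumulated per entity type and then averaged; the sketch shows the single-class case only.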

4.4.2. Accuracy Verification

In this paper, the accuracy of Cypher statements is evaluated using three metrics [49]. Here, $N$ denotes the total number of query statements in the dataset; $N_{qm}$ represents the number of query statements that yield correct results upon execution; and $N_{lf}$ indicates the number of query statements that match perfectly with manually generated standard Cypher queries.
(1) Query accuracy ($ACC_{qm}$) is the prediction accuracy, where “the execution result of the automatically generated Cypher matches the execution result of the true Cypher”.
$$ACC_{qm} = \frac{N_{qm}}{N}$$
(2) Logical form accuracy ($ACC_{lf}$) involves a manual determination of whether the generated Cypher language matches the verification labels in terms of query intent and the query return results.
$$ACC_{lf} = \frac{N_{lf}}{N}$$
(3) Execution accuracy ($ACC_{ex}$) combines the previous two metrics to check whether the execution results of the two Cypher statements are consistent.
$$ACC_{ex} = \frac{ACC_{qm}}{ACC_{lf}}$$
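The three metrics can be computed directly from the counts, as in the sketch below. It assumes that execution accuracy is the ratio of query accuracy to logical form accuracy, a reading that is consistent with the execution accuracy of about 1.06 reported in Section 5.3; the function name `cypher_accuracy` and the counts are hypothetical.

```python
def cypher_accuracy(n_total, n_qm, n_lf):
    """Return (acc_qm, acc_lf, acc_ex) from the total, execution-correct, and exact-match counts."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    acc_qm = n_qm / n_total   # share of queries whose execution result is correct
    acc_lf = n_lf / n_total   # share of queries matching the reference logical form
    # Assumed definition: ratio of the two accuracies; undefined when acc_lf is zero.
    acc_ex = acc_qm / acc_lf if acc_lf else float("nan")
    return acc_qm, acc_lf, acc_ex

# Hypothetical counts for a set of 100 annotated questions.
acc_qm, acc_lf, acc_ex = cypher_accuracy(n_total=100, n_qm=92, n_lf=88)
print(f"ACC_qm={acc_qm:.2f}  ACC_lf={acc_lf:.2f}  ACC_ex={acc_ex:.3f}")
```

Under this reading, a value above 1 means that some generated statements return correct results even though they do not match the reference statement exactly.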

5. Experiment and Discussion

5.1. Dataset Construction

Due to the lack of Chinese datasets for natural language spatiotemporal interrogative queries, this study primarily constructs a Chinese dataset with spatiotemporal characteristics for natural language question queries by performing entity replacement, sentence structure reconstruction, and synonym substitution on the gold standard geospatial datasets GeoQuestions and GeoQuery. The specifics of the dataset are as follows:
(1) GeoQuestions contains 201 English geospatial questions, including questions about location, spatial relationships of geographic entities, spatial relationships of geographic features, quantities, and aggregation.
(2) GeoQuery (http://www.cs.utexas.edu/users/ml/geo.html, accessed on 2 July 2023) consists of 880 natural language questions divided into a training set with 600 question–answer pairs and a test set with 280 question–answer pairs.
Based on the GeoQuestions and GeoQuery datasets, this paper performs entity substitution, sentence structure reconstruction, synonym substitution, and accurate Chinese translation on the geographic queries, expanding and optimizing the original query sentences into 9416 Chinese geographic questions with 13 coding types, totaling 64,994 characters. Alongside this expansion, attention is paid to the representativeness and bias of the dataset: data expansion increases the number of samples, while data filtering removes noisy and abnormal data and reduces bias. These measures support the accuracy and reliability of the study. The number of semantic encodings for each type is shown in Table 5 (datasets: https://github.com/yinxingren1/dataset.git, accessed on 10 December 2024).
The dataset is annotated using the BIOES format (B for the beginning of an annotation sequence, I for inside, E for the end, S for a single character, and O for outside information; the meanings of the other letters are given in Table 2), with detailed annotations shown in Table 6.
(3) The classification dataset contains a total of 9416 Chinese geospatial questions with manually annotated question categories, so that each question is assigned to its corresponding category. It provides a rich dataset for training the semantic encoding model described below.
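The BIOES annotation scheme mentioned above can be illustrated with a short character-level tagger. The sentence, the span boundaries, and the hyphenated tag format (for example `B-p`) are assumptions made for illustration; the actual annotations are those of Table 6, and the codes `p`, `t`, and `4` follow Table 2.

```python
def bioes_tags(length, spans):
    """Build character-level BIOES tags from (start, end_exclusive, code) spans."""
    tags = ["O"] * length
    for start, end, code in spans:
        if not 0 <= start < end <= length:
            raise ValueError(f"span out of range: {(start, end, code)}")
        if end - start == 1:
            tags[start] = f"S-{code}"          # single-character entity
        else:
            tags[start] = f"B-{code}"          # first character
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{code}"          # interior characters
            tags[end - 1] = f"E-{code}"        # last character
    return tags

# Hypothetical question: "How many parks are there in Beijing?"
sentence = "北京有多少公园"
spans = [(0, 2, "p"),   # place name
         (3, 5, "4"),   # quantity interrogative
         (5, 7, "t")]   # place type
for char, tag in zip(sentence, bioes_tags(len(sentence), spans)):
    print(char, tag)
```

Character-level tagging of this kind avoids a separate word segmentation step for Chinese text, which is why each character receives its own label.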

5.2. Semantic Encoding Model Results

This study utilizes the PyTorch deep learning framework and is programmed in Python. The integrated development environment used is JetBrains PyCharm software (version 2020.1), running on the Windows 11 64-bit operating system. After tuning the parameters of the DeBERTa-BiGRU-CRF model, the number of layers is set to 12, and the Adam optimization algorithm is employed with a learning rate of 1 × 10⁻⁵. To prevent overfitting, adam_epsilon and dropout mechanisms are applied. The model’s batch size is set to 64, with the number of epochs being 60. The dataset is divided into training, validation, and test sets in an 8:1:1 ratio, with specific corpus statistics shown in Table 7.
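The settings described above can be collected in a configuration and combined with an 8:1:1 split, as in the following sketch. The dictionary keys, the function name `split_dataset`, the random seed, and the rounding rule are assumptions; the resulting subset sizes are therefore illustrative and need not match the corpus statistics in Table 7.

```python
import random

# Hyperparameters as stated in the text; the key names are illustrative.
CONFIG = {
    "num_layers": 12,
    "optimizer": "Adam",
    "learning_rate": 1e-5,
    "batch_size": 64,
    "epochs": 60,
    "split_ratio": (8, 1, 1),
}

def split_dataset(items, ratios=(8, 1, 1), seed=42):
    """Shuffle deterministically and split into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)   # seed is an assumption, for reproducibility
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]       # the remainder goes to the test set
    return train, val, test

# Stand-in for the 9416 annotated questions: integer identifiers only.
train, val, test = split_dataset(range(9416), CONFIG["split_ratio"])
print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets identical across runs, which matters when several baseline models are compared on the same split.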
In the comparative experiments, the model performance was evaluated using metrics commonly employed in sequence labeling tasks, including precision, recall, and F1 score. The DeBERTa-BiGRU-CRF model demonstrated improvements in precision by 12.33%, 7.32%, and 2.71% over the baseline models BiLSTM-CRF, BERT-BiLSTM-CRF, BERT-BiGRU-CRF, and T5, respectively; recall increased by 12.97%, 7.52%, 3.63%, and 0.9%; and the F1 score improved by 11.93%, 7.18%, 2.53%, and 1.17%.
To facilitate a more intuitive visualization of the experimental data, Figure 6 presents and compares the results of the different models using a bar chart.
We performed a comparative analysis using the training process curves of the validation dataset. As indicated by the curves in Figure 7, the gradual decrease in the loss function and increase in accuracy reflect the learning capabilities of the five models. The eventual stabilization of the curves indicates that each model has developed into a stable structure with decision-making capability. In terms of training outcomes, the model proposed in this paper has the highest training accuracy, with an accuracy of 0.9562.
Analysis of the experimental results shows that the BiLSTM-CRF model is limited in its feature representation ability, especially in dealing with polysemy, long-distance dependencies, and generalization, which directly affects the accuracy of its results. The BERT model successfully incorporates contextual information but ignores the dependencies between words and does not fully consider absolute positional information; its performance is therefore better than that of BiLSTM-CRF, but its F1 value remains lower than that of the model proposed in this paper. T5 shows a stronger ability in natural language generation than other large models, but its accuracy in spatiotemporal questioning scenarios is slightly lower than that of the proposed method. The reason is that the model in this paper has been specifically designed to fit the processing needs of spatiotemporal interrogative sentences, whereas T5 is geared to a wide range of natural language processing tasks, which requires extensive pre-training and fine-tuning. The DeBERTa model adopted in this paper, with its disentangled attention mechanism and introduced absolute positional encoding, achieves a significant performance improvement, not only optimizing pre-training efficiency but also substantially enhancing generalization ability.
The experimental analysis in this paper shows that the accuracy of question word coding is higher, while the accuracy of location and spatiotemporal relationship extraction is slightly lower than that of other coding types. Question words are relatively fixed and limited in number, so the model learns them well, whereas complex place names that contain multiple layers are more difficult to recognize. Because the training corpus is limited in size, these complex place names and certain spatiotemporal relationship words occur infrequently, and the model fails to fully learn their semantic information. To address this problem, future work will focus on enhancing the diversity of recognized entity types and will consider introducing a hierarchical named entity recognition method to further improve recognition.

5.3. Conversion Result

To evaluate the accuracy of transforming spatiotemporal questions into database language, 160 questions were randomly selected from the dataset according to the category, with 20 questions per category. Accurate Cypher language annotations were provided as validation tags. Using the three evaluation metrics outlined in Section 4.4, the NL2Cypher translation model’s performance was assessed across different datasets. Accuracy inspection of NL2Cypher is shown in Table 8.
The results indicate that the NL2Cypher translation model achieved over 88% $ACC_{qm}$ on the training set, over 88% $ACC_{lf}$, and over 92% $ACC_{qm}$ on the test set of the GeoQuestions dataset, with execution accuracy around 1.06. On the GeoQuery dataset, it achieved over 88% $ACC_{qm}$ on the training set and over 93% $ACC_{qm}$ on the test set. Table 9 presents the accuracy evaluation results obtained on these two datasets by different models.
The results indicate that compared to other models, NL2Cypher achieves a significant improvement in accuracy. This improvement is attributable to the use of the DeBERTa model to generate word vectors, as DeBERTa supports training on larger-scale corpora in an unsupervised manner, thereby enhancing accuracy and generalization capability. In summary, the research on generating structured language Cypher from natural language spatiotemporal queries conducted by this model holds considerable significance.

5.4. Error Analysis

The NL2Cypher model, based on the DeBERTa semantic conversion of natural language spatiotemporal queries, adheres to the semantics of Cypher, removing non-executable parts of Cypher queries to achieve more accurate results. This involves handling scenarios where the generated Cypher statements contain syntactic errors, mismatches between functions and target field names, and instances where query results return null values. This mechanism can be used with any autoregressive model to improve the accuracy of results. Consequently, the final generated Cypher statements were compared with standard Cypher statements by randomly selecting 100 samples where logical forms did not match for analysis. These were divided into two categories:
(1) “Unanswerable” cases (20 instances) are situations where the Cypher statements cannot generate correct query results from the given data. They primarily include the following two types: Type 1, insufficient semantic information recognition, is where complex place names, abbreviations, and spatiotemporal relationships are not accurately recognized, such as “Xing’an League A’ershan City Hetu Ala Town”, “Peking University”, and “adjacent”. Type 2, limitations in question type, are described as follows: The question classification framework proposed in this paper includes the most commonly used basic and compound queries in the geographic spatiotemporal field but also has limitations, resulting in a few questions that cannot be classified, leading to incorrect results. For example, the question “What is the length of the river that runs through London?” is an attribute query related to spatial association, which is not adequately considered in this classification framework. The existing framework is limited in capturing implicit semantic relationships and complex contextual information, leading to incomplete or inaccurate conversion results.
(2) “Answerable” cases (80 instances): among the remaining 80 Cypher statements, which were answerable, 35 had logical form errors. Further analysis of these 35 statements revealed that 33 had errors in function generation, for example, overlooking “max” or “min” during comparison, and 2 contained errors in constructing the WHERE clause, such as confusing the “feature code” when generating the clause for “What cities in Chaoyang have the highest populations?”, which led to confusion between Chaoyang in Beijing and Chaoyang in Liaoning.
Manual inspection revealed that in these 35 Cypher statements with logical form errors, 31 were capable of producing correct results despite not fully matching standard Cypher statements. This indicates that the actual performance of the model may be underestimated.

6. Conclusions and Future Work

Natural language semantic understanding enables computers to better comprehend user intentions and provides interfaces for user queries, thereby improving the user-friendliness, query expressiveness, and human–computer interaction of database systems, promoting intelligent geographic knowledge services. To address the issues of incomplete semantic information extraction and inaccurate intention recognition in natural language queries with spatiotemporal characteristics, this paper proposes the NL2Cypher model for automated semantic encoding and understanding of spatiotemporal queries. Through inter-class coarse division and intra-class fine division strategies, the model effectively enhances the semantic understanding and intention recognition capabilities of spatiotemporal queries. The model achieves relatively high accuracy in query matching and logical form generation, and even some Cypher statements with partially mismatched logical forms can still provide correct results based on given data. Overall, the conversion from natural language spatiotemporal queries to structured Cypher queries proposed in this paper is effective and feasible.
However, our method still has some limitations. For example, this study mainly relies on the dictionary information and glyph features of Chinese, whereas recent studies have revealed that the phonetic information of Chinese characters has potential value in improving task performance. Therefore, we will consider integrating the phonetic information of Chinese characters into the character-level representation to achieve a more comprehensive grasp of Chinese features. Future work can mainly improve the following areas:
(1) Enhance Model Generalization
Although the NL2Cypher model performs well on the current test dataset, user queries in practical applications may be more diverse and complex. Therefore, it is necessary to collect more diverse natural language spatiotemporal query data that should cover a variety of complex spatiotemporal relationships and the use of unknown vocabulary, especially datasets from foreign language environments such as Russian, Greek, or Arabic. By incorporating these diverse data, we can more effectively train and optimize the NL2Cypher model to ensure that it can maintain high accuracy and stability in a wider range of query scenarios.
(2) Improve Query Matching and Logical Form Accuracy
While the model demonstrates good performance in query matching and logical form accuracy, there is still room for improvement. Future work should focus on more detailed model tuning and training as well as introducing additional constraints and rules to further enhance query matching accuracy and logical form consistency. Given the potential of the NL2Cypher model in enhancing database query functions and improving user friendliness, we can subsequently cooperate with big data analysis companies, artificial intelligence, and machine learning companies to promote the development and application of technology.
(3) Integrate Multimodal Data to Improve Query Results
Spatial data such as maps and satellite images can be integrated into the NL2Cypher model, and reinforcement learning techniques can be introduced to more accurately predict user intentions and provide more personalized query suggestions based on user query preferences and patterns. At the same time, we are committed to developing and improving fairness assessment algorithms to ensure that query results are not only accurate and efficient but also meet the needs of different user groups.
This research result not only has a far-reaching impact in China but also has broad promotion value in the international market. By promoting the NL2Cypher model to the international market, it is expected to provide users in more countries and regions with an efficient and intelligent query experience.

Author Contributions

Conceptualization, methodology, and validation, W.L., X.M. and D.M.; investigation and data curation, Y.C.; resources, J.W.; review and editing, all authors; supervision, W.L. and Z.Z.; project administration, X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the research and development and application demonstration of remote sensing monitoring technology for typical natural resource elements (Grant No. 2023YFE0207900).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, J.; Liu, H.; Chen, X.; Guo, X.; Zhao, Q.; Liu, J.; Kang, L. Rapid Retrieval of Geospatial Data Considering Semantic Knowledge. Geomat. Inf. Sci. Wuhan Univ. 2022, 47, 463–472.
  2. Kefalidis, S.A.; Punjani, D.; Tsalapati, E.; Plas, K.; Pollali, M.; Mitsios, M.; Tsokanaridou, M.; Koubarakis, M.; Maret, P. Benchmarking geospatial question answering engines using the dataset GeoQuestions1089. In Proceedings of the International Semantic Web Conference, Athens, Greece, 6–10 November 2023; Springer Nature: Cham, Switzerland, 2023; pp. 266–284.
  3. Li, S.; Zhu, X.; Li, Z.; Liu, W.; Cui, B. From geographic information service to geographic knowledge service: Research issues and development roadmap. Acta Geod. Cartogr. Sin. 2021, 50, 1194–1202.
  4. Wu, R. Scenario-Based Query for Power Marketing Based on Knowledge Graph. Bachelor’s Thesis, Donghua University, Shanghai, China, 2023.
  5. Janowicz, K.; Gao, S.; McKenzie, G.; Hu, Y.; Bhaduri, B. GeoAI: Spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. Int. J. Geogr. Inf. Sci. 2020, 34, 625–636.
  6. Wei, Y.; Li, H.; Hu, D.; Li, X.; Ma, L. A method of Chinese place name recognition based on composite features. Geomat. Inf. Sci. Wuhan Univ. 2018, 43, 17–23.
  7. Meng, C.; Li, Q.; Li, H.; Jia, J. Research on query method of geographic information based on ontology. Sci. Surv. Mapp. 2008, 33, 251–253.
  8. Li, B. Research on Interpreting Mechanism of Natural Spatial Query Language. Bachelor’s Thesis, Information Engineering University, Zhengzhou, China, 2009.
  9. Gai, S.; Liu, J.; Xiong, W.; Zhang, X.; Li, J. Research on Rule Matching in Natural Language Spatial Query Based on Levenshtein Distance. J. Geomat. Sci. Technol. 2015, 32, 416–421.
  10. Zhang, C. Interpretation of Event Spatio-temporal and Attribute Information in Chinese Text. Acta Geod. Cartogr. Sin. 2015, 44, 590.
  11. Sirichanya, C.; Kraisak, K. Semantic data mining in the information age: A systematic review. Int. J. Intell. Syst. 2021, 36, 3880–3916.
  12. Deng, M.; Huang, X.; Liu, H.; Liu, G. An Approach for Spatial Query Based on Natural-Language Spatial Relations. Geomat. Inf. Sci. Wuhan Univ. 2011, 36, 1089–1093.
  13. Cai, Q.; Xu, B.; Dong, X. Knowledge Graph Completion Model using Semantically Enhanced Prompts and Structural Information. Comput. Sci. 2024, 12, 7–23.
  14. Tu, W.; Li, B.; Liu, X.; Zheng, J. Application of NL2SQL with knowledge graph fusion in equipment maintenance data retrieval. Intell. Comput. Appl. 2024, 14, 118–124.
  15. Cao, J.; Huang, T.; Chen, G.; Wu, X.; Chen, K. Research on Technology of Generating Multi-table SQL Query Statement by Natural Language. J. Front. Comput. Sci. Technol. 2020, 14, 1133–1141.
  16. Chen, W.; Fosler-Lussier, E.; Xiao, N.; Raje, S.; Ramnath, R.; Sui, D. A synergistic framework for geographic question answering. In Proceedings of the 2013 IEEE Seventh International Conference on Semantic Computing, Irvine, CA, USA, 16–18 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 94–99.
  17. Chen, W. Parameterized spatial SQL translation for geographic question answering. In Proceedings of the 2014 IEEE International Conference on Semantic Computing, Newport Beach, CA, USA, 16–18 June 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 23–27.
  18. Scheider, S.; Nyamsuren, E.; Kruiger, H.; Xu, H. Geo-analytical question-answering with GIS. Int. J. Digit. Earth 2021, 14, 1–14.
  19. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  20. Aggarwal, C.C. Neural Networks and Deep Learning: A Textbook; Springer: Cham, Switzerland, 2018.
  21. Le, X.; Yang, C.; Yu, W. Spatial concept extraction based on spatial semantic role in natural language. Geomat. Inf. Sci. Wuhan Univ. 2005, 30, 1100–1103.
  22. Kang, M.; Du, Q.; Wang, M. A new method of Chinese address extraction based on address tree model. Acta Geod. Cartogr. Sin. 2015, 44, 99–107.
  23. Zhang, X.; Cai, Z. Text Sentiment Analysis Based on Bert-BiGRU-CNN. Comput. Simul. 2023, 40, 519–523.
  24. Ren, Z.; Qin, X.; Ran, W. SLNER: Chinese Few-Shot Named Entity Recognition with Enhanced Span and Label Semantics. Appl. Sci. 2023, 13, 8609.
  25. Lee, J.G.; Kang, M. Geospatial big data: Challenges and opportunities. Big Data Res. 2015, 2, 74–81.
  26. Hamzei, E.; Li, H.; Vasardani, M.; Baldwin, T.; Winter, S.; Tomko, M. Place questions and human-generated answers: A data analysis approach. In Geospatial Technologies for Local and Regional Development, Proceedings of the 22nd AGILE Conference on Geographic Information Science 22, Limassol, Cyprus, 17–20 June 2019; Springer International Publishing: Cham, Switzerland, 2020; pp. 3–19.
  27. Cheng, Y.; Xu, D.; Lv, X. Research on text reading comprehension and question answering methods based on hierarchical interactive network. Data Anal. Knowl. Discov. 2019, 2, 23–32.
  28. Koroteev, M.V. BERT: A review of applications in natural language processing and understanding. arXiv 2021, arXiv:2103.11943.
  29. Gui, T.; Xi, Z.; Zheng, R. A review of robustness research in natural language processing based on deep learning. Comput. Sci. 2023, 7, 1–26.
  30. Lewis, M.; Liu, Y.; Goyal, N. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020.
  31. Liu, Q.; Xiao, K.; Cao, S.; Zhang, H.; Jiang, D. Research on Text Classification Methods by Fusing DeBERTa Model with Graph Convolutional Networks. Artif. Intell. Robot. Res. 2024, 13, 715.
  32. He, P.; Liu, X.; Gao, J.; Chen, W. DeBERTa: Decoding-enhanced BERT with disentangled attention. arXiv 2020, arXiv:2006.03654.
  33. Zhao, Z.A.; Wang, J.; Mao, X.; Ma, W.; Lu, W.; He, Y.; Gao, X. A Multi-dimensional CNN Coupled Landslide Susceptibility Assessment Method. Geomat. Inf. Sci. Wuhan Univ. 2024, 49, 1466–1481.
  34. Gao, X.; Wang, J.; Mao, X.; Zhao, Z.; Lu, W. The susceptibility assessment of landslide based on Bi-GRU network. Sci. Surv. Mapp. 2023, 48, 221–230.
  35. Yu, B.; Fan, Z. A review of conditional random field models for natural language processing. J. Inf. Resour. Manag. 2020, 10, 96–111.
  36. Punjani, D.; Singh, K.; Both, A.; Koubarakis, M.; Angelidis, I.; Bereta, K.; Beris, T.; Bilidas, D.; Ioannidis, T.; Karalis, N.; et al. Template-based question answering over linked geospatial data. In Proceedings of the 12th Workshop on Geographic Information Retrieval, Seattle, WA, USA, 6 November 2018; ACM: New York, NY, USA, 2018; pp. 1–10.
  37. Zelle, J.M.; Mooney, R.J. Learning to parse database queries using inductive logic programming. In Proceedings of the National Conference on Artificial Intelligence, Portland, OR, USA, 4–8 August 1996; pp. 1050–1055.
  38. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
  39. Lu, F.; Zhu, Y.; Zhang, X. Spatiotemporal knowledge graph: Advances and perspectives. J. Geo-Inf. Sci. 2023, 25, 1091–1105.
  40. Che, W.; Feng, Y.; Qin, L.; Liu, T. N-LTP: An Open-source Neural Language Technology Platform for Chinese. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 42–49.
  41. Vasardani, M.; Timpf, S.; Winter, S.; Tomko, M. From descriptions to depictions: A conceptual framework. In Spatial Information Theory, Proceedings of the 11th International Conference, COSIT 2013, Scarborough, UK, 2–6 September 2013; Springer International Publishing: Cham, Switzerland, 2013; pp. 299–319.
  42. Zhang, X.; Zhang, C.; Wu, M.; Lv, G. Spatiotemporal features based geographical knowledge graph construction. Sci. Sin. Inf. 2020, 50, 1019–1032.
  43. Zhang, C.J.; Zhang, X.Y.; Wang, S.; Chen, X.D. Annotation of Spatial-Temporal Information of Event in Chinese Text. J. Chin. Inf. Process. 2016, 30, 213–222.
  44. Guo, X.; He, T.; Hu, X.; Chen, Q. Chinese named entity relation extraction based on syntactic and semantic features. J. Chin. Inf. Process. 2014, 28, 183–189.
  45. Gan, L.; Wan, C.; Liu, D.; Zhong, Q.; Jiang, T. Chinese named entity relation extraction based on syntactic and semantic features. J. Comput. Res. Dev. 2016, 53, 284–302.
  46. Francis, N.; Green, A.; Guagliardo, P.; Libkin, L.; Lindaaker, T.; Marsault, V.; Plantikow, S.; Rydberg, M.; Selmer, P.; Taylor, A. Cypher: An evolving query language for property graphs. In Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA, 10–15 June 2018; ACM: New York, NY, USA, 2018; pp. 1433–1445.
  47. Qi, Y.; Zhai, R.; Wu, F.; Yin, J.; Gong, X.; Zhu, L.; Yu, H. CSMNER: A Toponym Entity Recognition Model for Chinese Social Media. ISPRS Int. J. Geo-Inf. 2024, 13, 311.
  48. Yao, X.; Hao, X.; Liu, R.; Li, L.; Guo, X. AgCNER, the first large-scale Chinese named entity recognition dataset for agricultural diseases and pests. Sci. Data 2024, 11, 769.
  49. Yang, M. Research on Semantic Driven Data Query and Intelligent Visualization. Bachelor’s Thesis, Chongqing University, Chongqing, China, 2018.
Figure 1. Technology framework of semantic understanding of spatiotemporal questions in natural language.
Figure 2. Model structure of DeBERTa-BiGRU-CRF. (N stands for multiple Transformer layers).
Figure 3. Model structure of DeBERTa.
Figure 4. Generating structured query workflows.
Figure 5. Model performance evaluation indices and calculation formulas.
Figure 6. Precision, recall, and F1 score of different models.
Figure 7. Comparison of model training processes.
Table 1. Examples of natural language to Cypher.
Question | What are the three closest parks to Tiananmen Square?
Semantic encoding | spqdt2
Question type | Maximum/minimum value query
Cypher query | MATCH (p{name: 'Tiananmen Square'}), (t:parks) WITH p, t ORDER BY point.distance(point({latitude: p.latitude, longitude: p.longitude}), point({latitude: t.latitude, longitude: t.longitude})) LIMIT 3 RETURN t
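The query in Table 1 follows a fixed pattern once the place name, the place type, and the number of results are known. The following is a minimal Python sketch of that last templating step only, assuming the three slots have already been extracted; the function name `build_nearest_query` and its parameters are hypothetical and are not taken from the paper, whose conversion pipeline may work differently.

```python
def build_nearest_query(place_name: str, place_label: str, k: int) -> str:
    """Return a Cypher string for the k nodes labelled `place_label` nearest to `place_name`.

    Hypothetical helper for illustration; it mirrors the query shape shown in Table 1.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not place_label.isidentifier():
        raise ValueError("place_label must be a plain identifier")
    # Escape backslashes and single quotes so the name stays inside the string literal.
    safe_name = place_name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"MATCH (p{{name: '{safe_name}'}}), (t:{place_label}) "
        "WITH p, t ORDER BY point.distance("
        "point({latitude: p.latitude, longitude: p.longitude}), "
        "point({latitude: t.latitude, longitude: t.longitude})) "
        f"LIMIT {k} RETURN t"
    )


query = build_nearest_query("Tiananmen Square", "parks", 3)
print(query)
```

String interpolation is used here only to make the template visible; when running against a database, passing the place name as a query parameter is generally the safer choice.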
Table 2. Elements of spatiotemporal semantic encoding.
Semantic Types | Word Class | Code | Semantic Types | Word Class | Code
Location interrogative | Question word | 1 | Place type | Noun | t
Specific interrogative | Question word | 2 | Attribute | Noun | o
Choice interrogative | Question word | 3 | Quality | Adjective | q
Quantity interrogative | Question word | 4 | Activity | Verb | s
Measurement interrogative | Question word | 5 | Spatiotemporal relation | Preposition/Noun | r
Judgment interrogative | Question word | 6 | Numbers and units | Numerals, Units | d
Place name | Noun | p | | |
Table 3. Question sentence type.
Question Types | Question Examples
Location Query | [Tiananmen Square][nearby][parking lot] where is it?
Distance Query | What’s the distance between [Beijing] and [Shanghai]?
Direction Query | What direction is [Nanjing] from [Paris]?
Buffer Query | What [hotels] are within [3 km] [north] of [Beijing][Forbidden City]?
Count Query | How many [parks] were there in [Beijing] in [2000]?
Maximum/Minimum Query | What are the [closest][eight][hotels] to [Beijing][Forbidden City]?
List Query | What [universities] were in [Beijing] in [1980]?
Judgment Query | Was [Beijing]’s [population][one million] in the [19th century]?
Table 4. Semantic parsing results.
Question 1 | How many cities are there within a 500-km radius of Jinan, Shandong?
Semantic Encoding | ppd3t
Question Type | Count Query
Language Normalization | COUNT(x):p('Shandong') AND p('Jinan') AND t(city) AND x = COUNT(t) AND Buffer('Jinan', 500,000 m, x) AND Return(x)
Table 5. Dataset statistical information.
Semantic Coding Elements | Count | Semantic Coding Elements | Count
Location interrogative-1 | 371 | Place type-t | 2056
Specific interrogative-2 | 186 | Attribute-o | 1004
Choice interrogative-3 | 1468 | Quality-q | 493
Quantity interrogative-4 | 629 | Activity-s | 945
Measurement interrogative-5 | 240 | Temporal relation-r | 669
Judgment interrogative-6 | 405 | Numbers and units-d | 1013
Place name-p | 6094 | |
Table 6. Example of spatiotemporal semantic encoding annotation. (Which direction is Wuhan relative to Qingdao? How many restaurants are there near Yuyuantan Park in Beijing? What hospitals were in Nanjing in the 1980s?).
Corpus | Annotation Label | Corpus | Annotation Label | Corpus | Annotation Label
 | B-p | | B-p | | B-d
 | E-p | | E-p | | I-d
 | O | | B-p | | I-d
 | O | | I-p | | E-d
 | B-p | | I-p | | B-d
 | E-p | | I-p | | I-d
 | O | | E-p | | I-d
 | B-3 | | B-r | | E-d
 | E-3 | | E-r | , | O
 | B-o | | O | | B-p
 | E-o | | B-4 | | E-p
? | O | | E-4 | | O
 | | | O | | B-3
 | | | B-t | | E-3
 | | | E-t | | B-t
 | | ? | O | | E-t
 | | | | ? | O
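The labels in Table 6 mark the beginning (B), inside (I), and end (E) of each semantic element, with O for characters outside any element. As a rough illustration of how such a sequence can be read back into elements, the Python sketch below collects the labelled spans for the first question in Table 6. The function name `decode_spans` is hypothetical, and joining the span codes into a string in the style of the semantic encodings shown in Table 4 is an assumption about how the encoding is formed, not a description of the authors' implementation.

```python
from typing import List, Tuple


def decode_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collect (code, start, end) spans from B-/I-/E-/O labels; `end` is inclusive."""
    spans: List[Tuple[str, int, int]] = []
    start, code = None, None
    for i, label in enumerate(labels):
        if label == "O":
            start, code = None, None  # outside any element; drop an unfinished span
            continue
        prefix, _, tag = label.partition("-")
        if prefix == "B":
            start, code = i, tag  # open a new span
        elif prefix == "E" and start is not None and tag == code:
            spans.append((code, start, i))  # close the current span
            start, code = None, None
        elif prefix == "I" and start is not None and tag == code:
            continue  # still inside the current span
        else:
            start, code = None, None  # malformed sequence; reset
    return spans


# Label sequence of the first question in Table 6 (character positions 0..11).
labels = ["B-p", "E-p", "O", "O", "B-p", "E-p", "O", "B-3", "E-3", "B-o", "E-o", "O"]
spans = decode_spans(labels)
encoding = "".join(code for code, _, _ in spans)  # assumed encoding: codes in order
print(spans, encoding)
```

Only the B, I, E, and O labels that appear in Table 6 are handled; a scheme with a separate single-character label would need one more branch.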
Table 7. Training corpus statistics.
Corpus | Training Set | Validation Set | Test Set
Character count | 52,311 | 6234 | 6449
Question count | 7508 | 948 | 960
Code count | 13,618 | 1656 | 1799
Table 8. Accuracy inspection of NL2Cypher.
Dataset | Training Set $ACC_{qm}$ | Training Set $ACC_{lf}$ | Training Set $ACC_{ex}$ | Test Set $ACC_{qm}$ | Test Set $ACC_{lf}$ | Test Set $ACC_{ex}$
GeoQuestions | 0.8825 | 0.8371 | 1.054 | 0.9226 | 0.8863 | 1.041
GeoQuery | 0.8837 | 0.8321 | 1.062 | 0.9282 | 0.8833 | 1.051
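A quick arithmetic check on Table 8 helps when reading the third measure, whose values exceed 1. In the numbers as printed, each value of the third measure (ACC ex) agrees with the quotient of the first two (ACC qm divided by ACC lf) to three decimal places. This is an observation about the tabulated values only, not a definition taken from the paper; the short Python sketch below reproduces the check, and the dictionary layout is an illustrative choice.

```python
# Values copied from Table 8: (ACC_qm, ACC_lf, ACC_ex) per dataset and split.
table8 = {
    ("GeoQuestions", "training"): (0.8825, 0.8371, 1.054),
    ("GeoQuestions", "test"): (0.9226, 0.8863, 1.041),
    ("GeoQuery", "training"): (0.8837, 0.8321, 1.062),
    ("GeoQuery", "test"): (0.9282, 0.8833, 1.051),
}

# Compare the printed third value with the quotient of the first two.
differences = {}
for key, (acc_qm, acc_lf, acc_ex) in table8.items():
    ratio = acc_qm / acc_lf
    differences[key] = abs(ratio - acc_ex)
    print(key, f"ratio={ratio:.4f}", f"printed={acc_ex}")
```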
Table 9. The accuracy evaluation results obtained for these two datasets by different models, %.
Model | GeoQuestions Training Set $ACC_{qm}$ | GeoQuestions Training Set $ACC_{lf}$ | GeoQuestions Test Set $ACC_{qm}$ | GeoQuestions Test Set $ACC_{lf}$ | GeoQuery Training Set $ACC_{qm}$ | GeoQuery Training Set $ACC_{lf}$ | GeoQuery Test Set $ACC_{qm}$ | GeoQuery Test Set $ACC_{lf}$
NL2Cypher | 88.3 | 83.7 | 92.3 | 88.6 | 88.4 | 83.2 | 92.8 | 88.3
GPT2 (standard) | 84.0 | 76.0 | 83.8 | 75.5 | 81.2 | 79.3 | 82.5 | 81.7
BaiChuan | 65.8 | 59.5 | 59.4 | 55.9 | 76.0 | 70.4 | 72.5 | 71.6
QianWen | 57.9 | 52.4 | 55.8 | 49.5 | 52.3 | 48.2 | 61.3 | 56.3
Lu, W.; Ming, D.; Mao, X.; Wang, J.; Zhao, Z.; Cheng, Y. A DeBERTa-Based Semantic Conversion Model for Spatiotemporal Questions in Natural Language. Appl. Sci. 2025, 15, 1073. https://doi.org/10.3390/app15031073