4.1. Question Classification
The extraction of semantic information from natural language spatiotemporal questions focuses on the question itself. Terms in a question are often abbreviated or misspelled, so after the semantic encoding module has extracted the spatiotemporal questions, the question set must be normalized. Normalization consists of classifying the questions and parsing them semantically to determine the category to which each spatiotemporal question belongs.
Question classification helps in understanding the user’s intent and requirements, so that query results can be tailored to each question category. This paper takes the geospatial question answering benchmark datasets GeoQuestions [36] and GeoQuery [37] as a basis, extends the categories to include direction, distance, and time question types, covering forms such as basic, composite, and fuzzy queries, and reclassifies the questions. In the geospatial context, the answers expected from spatiotemporal questions relate to factors such as location, distance, direction, quantity, and extremum. Therefore, as shown in Table 3, questions are classified into eight types according to the expected answer type. This classification provides a structured framework for processing spatiotemporal questions, which supports accurate interpretation of user needs and the return of corresponding query results.
After semantic encoding, the extracted keywords serve as critical features for distinguishing between question categories. Because interrogative sentences have distinctive features, this paper adopts the BERT (Bidirectional Encoder Representations from Transformers) [38] large model to distinguish the different types of interrogative sentences. To separate complex queries with similar textual descriptions, as well as question types whose answers share a similar format, the question vector is fused with the encoded key elements, and the final classification result is obtained through model training. For example, interrogative keywords differentiate two complex queries such as “Where is the hotel closest to Tiananmen?” and “Which is the hotel closest to Tiananmen?” Buffer zone queries, extremum queries, and list queries all return lists of location types, but because the key elements in the questions differ, the question type that best reflects the keyword features is given priority in the response.
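The fusion of the question vector with the encoded key elements can be illustrated with a minimal sketch. It assumes that a question vector is already available (for instance a BERT sentence embedding, replaced here by a short placeholder list) and that the key elements are represented by the codes that appear in this section (p, t, r, d, o, number). The function name `fuse_features`, the code list, and the multi-hot encoding are illustrative assumptions and not the configuration used in this paper.

```python
# Hypothetical list of key-element codes; the codes themselves appear in the text,
# but their order and the multi-hot scheme are assumptions made for illustration.
ELEMENT_CODES = ["p", "t", "r", "d", "o", "number"]


def fuse_features(question_vector, element_codes):
    """Concatenate a question vector with a multi-hot vector of key-element codes."""
    present = set(element_codes)
    multi_hot = [1.0 if code in present else 0.0 for code in ELEMENT_CODES]
    return list(question_vector) + multi_hot


# Placeholder for a sentence embedding; a real embedding would be much longer.
question_vector = [0.12, -0.40, 0.88]

# A question that contains a location, a location type, and a relation.
fused = fuse_features(question_vector, ["p", "t", "r"])
print(fused)
```

The fused vector would then be passed to a classifier; the classifier itself is omitted here because its architecture is not specified at this point in the text.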
4.2. Semantic Parsing
Building on question classification, semantic parsing was performed on intra-category questions by identifying the relationships among encoded entity elements to determine the query conditions for each type of question. In the context of geospatial and temporal questions, the focus was primarily on spatial relationships, temporal relationships, and dependency relationships [39]. The LTP (Language Technology Platform) tool [40] was used to perform dependency syntax analysis on the questions, and relationship recognition was carried out based on the needs and characteristics of different question types.
(1) Spatial Relationships
The three elements of spatial descriptions are the position object (P), the reference object (C), and the spatial relation (r) [41]. The position object is the core of the spatial description, representing the object being located or described. The reference object serves as the benchmark or reference point used to locate the position object within the spatial description, providing a reference framework for the position object. Spatial relations are the spatial connections between the position object and the reference object, including directional relationships (such as east, west, south, north, northeast, southeast, etc.), distance relationships (such as around, nearby, etc.), and topological relationships (such as bordering, containing, adjacent, etc.). The spatial relationship triplet in a question is represented in the following form:
In the formula, i indexes the objects related to r in the question, and j indexes the spatial relationships in the sentence. For example, the sentence “the park and bank north of the museum” contains the position objects “park” and “bank”, the reference object “museum”, and the spatial relation “north”.
If the semantic encoding of the sentence contains a spatial relationship encoding r, a combination of dependency syntax and rule-based methods is used to identify the position object and the reference object of that relationship, which determines the spatial relationship triplet.
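As a minimal sketch of the resulting data structure, the triplets for the example sentence can be assembled once the position objects, the reference object, and the relation have been identified. The class name `SpatialTriple` and the function `build_triples` are hypothetical, and the identification step itself (dependency syntax plus rules) is assumed to have been done beforehand.

```python
from typing import NamedTuple


class SpatialTriple(NamedTuple):
    """One spatial relationship triplet (hypothetical container)."""
    position: str   # position object P
    relation: str   # spatial relation r
    reference: str  # reference object C


def build_triples(positions, relation, references):
    """Pair every position object with every reference object under one relation."""
    return [SpatialTriple(p, relation, c) for p in positions for c in references]


# "the park and bank north of the museum"
triples = build_triples(["park", "bank"], "north", ["museum"])
for triple in triples:
    print(triple)
```

Each relation in a sentence would yield its own call, so a sentence with several spatial relationships produces several groups of triplets.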
(2) Temporal Relationships
When describing geographic entities, temporal factors should be considered, such as the historical evolution of geographic entities, their changing trends, and their state in specific periods [42], in order to understand and describe geographic phenomena more comprehensively and to analyze how geographic data change over time. Time expressions in spatiotemporal questions are mainly absolute (e.g., September 2010) or relative (e.g., five years ago). Temporal relationship words mainly express temporal sequence relationships (e.g., before, after, until, etc.) and temporal modification relationships (e.g., around, about, approximately, etc.) [43]. In spatiotemporal questions, queries and descriptions of time focus primarily on the time attributes of geographic entities or on the temporal relationships between geographic entities. If a time encoding ‘d’ and a temporal relationship encoding ‘r’ are identified in the question, the time is assigned as an attribute or association to the entity encodings in the question, and ‘r’ is mapped to a temporal relationship function according to the temporal relationship set.
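The mapping from a temporal relationship word to a temporal relationship function can be sketched as follows. The sketch assumes that times are reduced to integer years, that relative times are resolved against a caller-supplied reference year, and that “around” tolerates a deviation of one year; the dictionary `TEMPORAL_RELATIONS`, both function names, and the tolerance are illustrative assumptions rather than the temporal relationship set used in this paper.

```python
import operator

# Hypothetical temporal relationship set: relation word -> comparison on years.
TEMPORAL_RELATIONS = {
    "before": operator.lt,                    # temporal sequence
    "after": operator.gt,                     # temporal sequence
    "until": operator.le,                     # temporal sequence
    "around": lambda a, b: abs(a - b) <= 1,   # temporal modification (assumed tolerance)
}


def resolve_relative_year(years_ago, reference_year):
    """Turn a relative time such as 'five years ago' into an absolute year."""
    return reference_year - years_ago


def satisfies(entity_year, relation_word, query_year):
    """Check an entity's time attribute against the query time under relation r."""
    return TEMPORAL_RELATIONS[relation_word](entity_year, query_year)


# "five years ago", resolved against an assumed reference year of 2024.
query_year = resolve_relative_year(5, reference_year=2024)
print(query_year)
print(satisfies(2010, "before", query_year))
```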
(3) Dependency Relationships
The entities in a question usually have dependency relationships, such as subject–predicate, verb–object, coordination, attribute modification, and preposition–object [44,45]. For instance, if entities in the sentence are coordinated and the first entity is associated with a certain relationship (such as a spatial relationship), then the coordinated entity is usually associated with the same relationship. If the dependency relationship between entities is that of an attribute modifier, one entity modifies the other.
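The coordination rule can be sketched as a simple propagation step. The sketch assumes that the dependency parser has already produced pairs of coordinated entities and that the relationship of the first entity is known; the function name `propagate_coordination` and the input format are hypothetical, and because the rule is only described as usually holding, the result should be read as a heuristic.

```python
def propagate_coordination(relations, coordinations):
    """Copy a known relationship to entities coordinated with its holder.

    relations:     dict mapping entity -> (relation, reference object)
    coordinations: list of (entity, coordinated_entity) pairs from the parse
    """
    result = dict(relations)
    changed = True
    while changed:  # repeat so that chains such as A and B and C are covered
        changed = False
        for first, second in coordinations:
            if first in result and second not in result:
                result[second] = result[first]
                changed = True
    return result


# "the park and bank north of the museum": only "park" is attached to "north".
relations = {"park": ("north", "museum")}
coordinations = [("park", "bank")]
resolved = propagate_coordination(relations, coordinations)
print(resolved)
```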
The query target of the question is determined from the syntactic relationships: location queries typically take the location ‘p’ or the location type ‘t’ as the subject; distance queries take the location ‘p’ as the subject; direction queries determine the reference object and the query object from the subject–predicate relationship, the verb–object relationship, and so on; buffer, counting, extremum, and list queries generally take the location type ‘t’ as the query target; and judgment queries take the IF function as the query target. The extracted entity encoding classes are then listed, and the relationships between entities are defined as functions: the query target of the question is treated as a variable and the remaining information as constants, and functions are defined for buffer, direction, distance, sorting, fuzzy, and topological queries, such as Buffer(p, number, t), Direction(p, East, t), Order(o, number), etc.
As an example, the query “What parks within five kilometers south of the Forbidden City in Beijing were built in the twentieth century?” is processed in the following steps:
(1) Extract entities and relations: the location (p), the location type (t), the relations, etc.
(2) Declare terms based on the encoding: location p (‘Forbidden City’) and location type t (‘Park’).
(3) Define the query target by taking the unknown query target as a variable x, with x = t (‘Park’).
(4) Define functions by expressing the query conditions, such as dependency, spatial, and temporal relations, as functions: InCity (‘Forbidden City’, ‘Beijing’), Buffer (‘Forbidden City’, 5000 m, ‘Park’), Direction (‘Forbidden City’, ‘South’, ‘Park’).
(5) Normalize the language as p (‘Forbidden City’) AND p (‘Beijing’) AND InCity (‘Forbidden City’, ‘Beijing’) AND t (‘Park’) AND x = t AND Direction (‘Forbidden City’, ‘South’, x) AND Buffer (‘Forbidden City’, 5000 m, x) AND o (‘Time’ = ‘twentieth century’) AND Return (x).
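The assembly of the final expression can be sketched in a few lines. The helper names `predicate` and `quote` and the variable name `x` for the unknown query target are assumptions introduced for illustration; the predicates and constants are those of the example above, and the sketch only builds the textual form without executing it against any data.

```python
def quote(value):
    """Wrap a constant in single quotes."""
    return f"'{value}'"


def predicate(name, *args):
    """Render a predicate such as InCity('Forbidden City', 'Beijing') as text."""
    return f"{name}({', '.join(args)})"


VAR = "x"  # hypothetical name for the unknown query target

conditions = [
    predicate("p", quote("Forbidden City")),
    predicate("p", quote("Beijing")),
    predicate("InCity", quote("Forbidden City"), quote("Beijing")),
    predicate("t", quote("Park")),
    f"{VAR} = t",                                                   # query target
    predicate("Direction", quote("Forbidden City"), quote("South"), VAR),
    predicate("Buffer", quote("Forbidden City"), "5000 m", VAR),
    predicate("o", "'Time' = 'twentieth century'"),
    predicate("Return", VAR),
]

# The conditions are joined conjunctively, as in step (5).
logical_form = " AND ".join(conditions)
print(logical_form)
```

Keeping the conditions in a list before joining them makes it straightforward to add or drop a condition for other question types, for example omitting the temporal condition when the question contains no time encoding.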