Article
Peer-Review Record

Geographic Knowledge Graph (GeoKG): A Formalized Geographic Knowledge Representation

ISPRS Int. J. Geo-Inf. 2019, 8(4), 184; https://doi.org/10.3390/ijgi8040184
by Shu Wang 1,2,3, Xueying Zhang 1,2,3,*, Peng Ye 1,2,3, Mi Du 1,2,3, Yanxu Lu 1,2,3 and Haonan Xue 1,2,3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 February 2019 / Revised: 15 March 2019 / Accepted: 4 April 2019 / Published: 8 April 2019
(This article belongs to the Special Issue Big Data Computing for Geospatial Applications)

Round 1

Reviewer 1 Report

The paper was improved. I have the following comments for further improving it.

1.     The paper is too long. Please remove the unrelated or unimportant introductions to background knowledge so that the paper may be more focused

2.     Please double check the formalization. For example, the definition of "state" is unclear. Doesn't it depend on attributes? If so, the six-tuple is inappropriate. Also check the spelling (Tin or Tint?)

3.     In the case study (Figure 5 and Figure 6), why not directly use the concrete properties? The current two figures do not convey any information about Nanjing and related entities.


Author Response

Response to reviewer #1
Comments to the Author 

The paper was improved. I have the following comments for further improving it.

A: Thank you for your useful comments, which helped us improve the paper. Below we list the comments in bold and discuss how we incorporated them into our paper.

 

1.     The paper is too long. Please remove the unrelated or unimportant introductions to background knowledge so that the paper may be more focused

A: Thank you for your suggestion. To highlight the central issue and our contributions, we made the following modifications.

Firstly, the whole introduction was restructured into four parts. The first paragraph briefly introduces geographic knowledge representation and its importance to the topic of the special issue, “big data computing”. The second paragraph moves directly to the issue by illustrating the weakness of the current knowledge representation form with a specific example, the “7·21 Beijing storm”. The third paragraph states the issue and our contributions in this paper. The last paragraph gives a short outline of the remainder of the paper.

Secondly, some unimportant background information was removed from the introduction, and some necessary information was added to the related work section. Following your suggestion, we think the introduction is now clearer than before and helps readers focus on our contributions.

 

2.     Please double check the formalization. For example, the definition of "state" is unclear. Doesn't it depend on attributes? If so, the six-tuple is inappropriate. Also check the spelling (Tin or Tint?)

A: Thank you for the reminder.

Firstly, we double-checked the formalization immediately and did find some spelling mistakes and redundant spaces in the formulas. We corrected these errors in Line 332-333  to , Line 338  to . The redundant spaces in Line 361 and Line 440 were also removed. After these corrections, we re-checked the other formulas in the paper.

Secondly, we re-checked the definition of “state”. Readers might find the elements of State unclear, so we discussed the definition and added statements and examples to explain the formalization.

Actually, a state depends on location and time but not on attributes. Time and space are the two dimensions that represent a stage of a geographic object in Euclidean space, whereas attributes are descriptive records that do not affect whether the state exists. For example, “Typhoon Maria, 23:00/10 July 2018, E123.40°/N25.60°, central pressure 945 hPa, max speed 30 km/h” defines a state with all features, and “Typhoon Maria, 23:00/10 July 2018, E123.40°/N25.60°” also defines a state of Typhoon Maria. Such a state exists because the geographic object occurred at a specific time and place. Thus, we added clear statements, examples and an appropriate supplement to the formula. The details can be seen in Lines 403-415.
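
As a rough illustration of the point above, a state can be modeled so that its identity comes only from object, time and location, with attributes carried as an optional payload. The Python sketch below is an assumption made for illustration, not the GeoKG formalization itself: the class name `State`, its field names and the ISO-style timestamp string are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical container for one state of a geographic object."""
    obj: str                       # geographic object, e.g. "Typhoon Maria"
    time: str                      # timestamp kept as plain text (assumed format)
    location: Tuple[float, float]  # (longitude, latitude) in degrees
    # Attributes are descriptive only, so they are excluded from equality
    # and hashing: they do not decide whether the state exists.
    attributes: Dict[str, str] = field(default_factory=dict, compare=False)


full = State("Typhoon Maria", "2018-07-10T23:00", (123.40, 25.60),
             {"central_pressure": "945 hPa", "max_speed": "30 km/h"})
bare = State("Typhoon Maria", "2018-07-10T23:00", (123.40, 25.60))

# Both records describe the same state, with or without attributes.
print(full == bare)
```

With this design choice, the two Typhoon Maria records from the example compare as equal, which mirrors the claim that attributes do not affect the existence of a state.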

 

 

3.     In the case study (Figure 5 and Figure 6), why not directly use the concrete properties? The current two figures do not convey any information about Nanjing and related entities.

A: Thank you for your comment. Indeed, we did not use the concrete properties in Figure 5 and Figure 6 because we thought they would make the figures more complex and the case study harder to understand. We also neglected to link the relations to the related entities, which made the figures even less clear to readers.

To address these issues, we redrew Figure 5 and Figure 6 using the concrete properties stored in the database, and we supplemented the related entities in Figure 5. The elements in Figure 5 and Figure 6 are now filled with the real text. We hope these adjustments help readers understand the case study more clearly.

Figure 5. The diagram of different elements of Nanjing by using GeoKG model

Figure 6. The diagram of relation elements of Nanjing in 1368

 


Author Response File: Author Response.docx

Reviewer 2 Report

The changes you made go some way to answering the questions.

The example you chose is useful, but I feel it is somewhat misleading. You assume you can model a lot more information about the historical entities and relationships with your model, whereas Yago has a limited set of this information, i.e. there was no attempt to model these historical facts in Yago!

I am afraid I still have reservations about the applicability of the method, as you did not answer the point about the complexity of full DL. I believe that if you extend the features to the full DL, reasoning with the ontologies will not be decidable.

The changes do improve the paper, but there is still some work to do and the presentation needs some cleaning.

Author Response

Response to reviewer #2
Comments to the Author 

The changes you made go some way to answering the questions.

A: Thank you for your comments, which helped us improve our paper. We hope the improvements make the manuscript better. We believe this research can improve the ability of computers to represent geographic knowledge, which will in turn benefit geospatial big data computing. All the comments in bold were fully considered and discussed.

 

1.     The example you chose is useful, but I feel it is somewhat misleading. You assume you can model a lot more information about the historical entities and relationships with your model, whereas Yago has a limited set of this information, i.e. there was no attempt to model these historical facts in Yago!

A: Thank you for your question. Our model can indeed represent more information about historical entities and relationships, but we also drew on the work of YAGO. Although our previous manuscript cited the YAGO2 paper, we did not present it well, and the tasks of YAGO2 were not discussed in our paper.

Actually, we have followed the work of the YAGO team since 2012. Their work has four main stages: the earlier tools for retrieving information from text, the first YAGO knowledge base, the spatially and temporally enhanced knowledge base YAGO2, and the multilingual knowledge base YAGO3. They built a very complex system comprising TAXONOMY, SIMPLETAX, CORE, GEONAMES, META, MULTILINGUAL, LINK, WIKIPEDIA and OTHER. They do attempt to store spatial and temporal information, but in a different way. Their widely read papers are listed below.

Therefore, we made the following additions to help readers understand their work and our contributions.

 

Firstly, we added a brief statement on their spatial and temporal enhancements to the related work part, Section 2.2. Details are shown in Lines 162-169.

Secondly, we added a description and a footnote at the beginning of the discussion part. Details are in Line 557.

Thirdly, we added Section 5.1.1, “Structures”, to state the differences between YAGO and GeoKG. We also drew Figure 8, “The examples with structures of the YAGO model and the GeoKG model”, and tried to illustrate the differences using examples from the case study.

 

Figure 8. The examples with structures of the YAGO model and the GeoKG model

 

We hope these additions help readers understand our model more deeply.

 

References:

Suchanek F, Weikum G. Searching for Knowledge Instead of Web Sites. 2008.

Melo G D, Suchanek F, Pease A. Integrating YAGO into the Suggested Upper Merged Ontology. 2008.

Suchanek F M, Kasneci G, Weikum G. Yago - A Large Ontology from Wikipedia and WordNet. Web Semantics Science Services & Agents on the World Wide Web, 2008, 6(3):203-217.

Suchanek F M, Kasneci G, Weikum G. Yago: A Core of Semantic Knowledge. International Conference on World Wide Web. OAI, 2007.

Hoffart J, Suchanek F M, Berberich K, et al. YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia. Artificial Intelligence, 2013, 194:28-61.

Hoffart J, Suchanek F, Berberich K, et al. YAGO2: Exploring and Querying World Knowledge in Time, Space, Context, and Many Languages. International Conference Companion on World Wide Web. ACM, 2011.

Biega J, Kuzey E, Suchanek F M. Inside YAGO2s: A Transparent Information Extraction Architecture. Proceedings of the 22nd International Conference on World Wide Web Companion. ACM, 2013.

Mahdisoltani F, Biega J, Suchanek F M. YAGO3: A Knowledge Base from Multilingual Wikipedias. CIDR, www.cidrdb.org, 2015.

 

2.     I am afraid I still have reservations about the applicability of the method, as you did not answer the point about the complexity of full DL. I believe that if you extend the features to the full DL, reasoning with the ontologies will not be decidable.

A: Thank you for your comments and for your rigorous attention to the logic. Your comment made us realize that our paper did not mention the evidence for the decidability of the DL. We have treated every supplementary operator with the same care.

Actually, the decidability and other computational properties of every supplementary operator have already been established. We added four groups of operators, H, I, R^+ and Q, all of which are included in SHIQ. The decidability, soundness and completeness of these DLs (ALC, S, SI, SH, SHI, SHIQ and the logics that follow) were demonstrated in the references below. The complexity of SI (transitive and inverse roles) is PSPACE-complete, and that of SHI and SHIQ is EXPTIME-complete. Two survey papers also review these proofs.
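
For readers unfamiliar with the naming scheme, the letters in a description logic name each stand for a group of constructors. The short Python sketch below spells this out using the usual convention (S for ALC with transitive roles, H for role hierarchies, I for inverse roles, Q for qualified number restrictions) and attaches only the complexity classes quoted in the response above. The names `FEATURES`, `COMPLEXITY` and `describe` are hypothetical helpers, not part of any reasoner or of the paper.

```python
# Letter -> constructor group, following the usual DL naming convention.
FEATURES = {
    "S": "ALC plus transitive roles (R+)",
    "H": "role hierarchies",
    "I": "inverse roles",
    "Q": "qualified number restrictions",
}

# Complexity classes exactly as quoted in the author response.
COMPLEXITY = {
    "SI": "PSPACE-complete",
    "SHI": "EXPTIME-complete",
    "SHIQ": "EXPTIME-complete",
}


def describe(logic: str) -> str:
    """Expand a DL name letter by letter and append its quoted complexity."""
    parts = [FEATURES[letter] for letter in logic]
    complexity = COMPLEXITY.get(logic, "not stated in the response")
    return f"{logic}: {', '.join(parts)} -> {complexity}"


for name in ("SI", "SHI", "SHIQ"):
    print(describe(name))
```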

 

Thus, we added these proofs to our paper and noted them in Table 1. The details are shown in Lines 128-131 and Line 315. We hope these additions make our paper more complete.

 

References:

Horrocks, Ian, Ulrike Sattler, and Stephan Tobies. "Pspace-Algorithm for Deciding ALCNIR+-Satisfiability." LTCS-Report 98-08, LuFg Theoretische Informatik, RWTH Aachen, Germany, 1998.

Horrocks, I., and U. Sattler. "A Description Logic with Transitive and Inverse Roles and Role Hierarchies." Journal of Logic & Computation 9, no. 3 (1999): pp. 385-410, doi: 10.1093/logcom/9.3.385.

Horrocks, I., U. Sattler, and S. Tobies. "Practical Reasoning for Very Expressive Description Logics." Logic Journal of IGPL (2000), doi: 10.1093/jigpal/8.3.239.

Baader, Franz, and Ulrike Sattler. "An Overview of Tableau Algorithms for Description Logics." Studia Logica 69, no. 1 (2001): pp. 5-40, doi: 10.1023/A:1013882326814.

Mei, Jing. "From ALC to SHOQ(D): A Survey of Tableau Algorithms for Description Logics." Computer Science 32, no. 3 (2005): pp. 1-11, doi: 10.1081/CEH-200044273.

 

3.     The changes do improve the paper, but there is still some work to do and the presentation needs some cleaning.

A: Thank you for your comment. We did a lot of work in 10 days to improve our paper. The additions are listed below.

Firstly, we removed the unrelated and unimportant information from the introduction to make the issue more focused. This should help readers understand our contribution more easily.

Secondly, we added the relevant proofs to make our paper more rigorous. Details are given in the second answer.

Thirdly, Figure 5 and Figure 6 in the case study were redrawn using the concrete properties, which should make them clearer to readers.

Fourthly, we set up an online questionnaire survey to evaluate the answers of YAGO and GeoKG. The online link is https://www.lediaocha.com/pc/s/k3x29l8n .

The questionnaire was divided into 8 parts. The first part is a basic information survey covering four items about each individual (gender, familiarity with the research area, background and education level). The statistics of this basic information are shown in Figure 9. The 2nd-7th parts correspond to questions #Q1-#Q6 and ask about the best answer, accuracy, completeness and repetition. The 8th part contains summary questions, including the overall evaluation, scores for YAGO and scores for GeoKG. Scores range from 1 to 5, corresponding to very bad, bad, normal, good and very good, and each score group includes an overall score, an accuracy score, a completeness score and a repetition score. We finally received 106 valid responses. We then analyzed the statistics and compared them with the structural differences (details in Figure 8), which helps explain why we obtained these answers. We added the user evaluation as Section 5.2.4 of our paper. We trust this addition improves our paper and also reveals the contributions of our research.
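
The scoring scheme described above (four criteria, each rated from 1 to 5) lends itself to a simple per-criterion summary. The Python sketch below shows one plausible way to aggregate such responses. The response does not say how the scores were aggregated, so the use of the arithmetic mean, the function name `summarize` and the dictionary layout are assumptions, and the sample responses are made up for illustration and are not the survey data.

```python
from statistics import mean

# The 1-5 scale and the four criteria named in the response.
LABELS = {1: "very bad", 2: "bad", 3: "normal", 4: "good", 5: "very good"}
CRITERIA = ("overall", "accuracy", "completeness", "repetition")


def summarize(responses):
    """Return the mean 1-5 score per criterion for one system."""
    for response in responses:
        for criterion in CRITERIA:
            if response[criterion] not in LABELS:
                raise ValueError(f"score out of range for {criterion}")
    return {c: mean(r[c] for r in responses) for c in CRITERIA}


# Made-up responses for illustration only; these are NOT the survey data.
toy_responses = [
    {"overall": 4, "accuracy": 5, "completeness": 4, "repetition": 3},
    {"overall": 2, "accuracy": 3, "completeness": 4, "repetition": 3},
]

result = summarize(toy_responses)
print(result)
```

Running the same function once on the YAGO scores and once on the GeoKG scores would give two comparable summaries, one per system.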

We hope these additions substantially improve our paper.

 


Author Response File: Author Response.docx

Reviewer 3 Report

This version of the article has the merit of providing insights to assess the quality of the knowledge that can be extracted from geoKG. However, it seems necessary to complete these elements:
- By giving an idea of the complexity of the design of the geoKG model compared to the YAGO model. It is necessary to compare the design complexity of the model with what it can achieve in terms of improved results. Basically, it is a question of understanding whether the design complexity is "worth" the improvements achieved.
- it would also be necessary to carry out a comparative user evaluation of the results.

Author Response

Response to reviewer #3
Comments to the Author 

This version of the article has the merit of providing insights to assess the quality of the knowledge that can be extracted from geoKG. However, it seems necessary to complete these elements:

A: Thank you for your comments, which helped us improve our paper. We hope the improvements make the manuscript better. We believe this research can improve the ability of computers to represent geographic knowledge, which will in turn benefit geospatial big data computing. All the comments were fully considered and discussed.


1.     - By giving an idea of the complexity of the design of the geoKG model compared to the YAGO model. It is necessary to compare the design complexity of the model with what it can achieve in terms of improved results. Basically, it is a question of understanding whether the design complexity is "worth" the improvements achieved.

A: Thank you for your suggestion. After discussion, we made the following additions.

Firstly, we added a brief statement on their spatial and temporal enhancements to the related work part, Section 2.2. Details are shown in Lines 162-169.

Secondly, we added a description and a footnote at the beginning of the discussion part. Details are in Line 557.

Thirdly, we added a comparison section, Section 5.1.1 “Structures”, to state the differences between YAGO and GeoKG. We also drew Figure 8, “The examples with structures of the YAGO model and the GeoKG model”, and tried to illustrate the differences using examples from the case study.

Figure 8. The examples with structures of the YAGO model and the GeoKG model

 

We hope these additions help readers understand our model more deeply.


2.     - it would also be necessary to carry out a comparative user evaluation of the results.

A: Thank you for your suggestion. It improved our paper, made the analyses more objective, and also verified our analyses.

We set up an online questionnaire survey to evaluate the answers of YAGO and GeoKG. The online link is https://www.lediaocha.com/pc/s/k3x29l8n .

The questionnaire was divided into 8 parts. The first part is a basic information survey covering four items about each individual (gender, familiarity with the research area, background and education level). The statistics of this basic information are shown in Figure 9. The 2nd-7th parts correspond to questions #Q1-#Q6 and ask about the best answer, accuracy, completeness and repetition. The 8th part contains summary questions, including the overall evaluation, scores for YAGO and scores for GeoKG. Scores range from 1 to 5, corresponding to very bad, bad, normal, good and very good, and each score group includes an overall score, an accuracy score, a completeness score and a repetition score. We finally received 106 valid responses. We then analyzed the statistics and compared them with the structural differences (details in Figure 8), which helps explain why we obtained these answers.

We added the user evaluation as Section 5.2.4 of our paper. We trust this addition improves our paper and also reveals the contributions of our research. Thank you once more for your suggestion.

 


Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

The authors have appropriately addressed the previous comments.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

This paper intends to construct a geographic knowledge graph. I do like the attempt, given that such a graph is of great importance in geographical research. However, the current form is somewhat immature and should be extensively revised before it can be published.

1.     The logic flow of this paper is unclear and the research gap was not well stated. Many concepts were mentioned in the model section but not well used in the case study. This makes it difficult for readers to understand them.

2.     What are "cognitive processes" and "cognitive method"? They should be clearly defined.

3.     There is much literature on basic concepts of geographical knowledge (e.g. the papers by W. Kuhn). Please check them. The current discussion is somewhat superficial and  useless (e.g. on the term "geography").

4.     In section 2.2.1, it seems that the authors did not well distinguish "concept" and "individual". I don't think Yangzi River is a concept. Thus, the entire section is misleading.

5.     Table 1 is unclear and the notions were not well used in the case study.

6.     I don’t think the case study is appropriate since it is not a standard description for geographical knowledge (an example including Yangzi River, Nanjing, and Zhongshan Mountain may be better). Also, I don’t think sun, ocean, water vapor are geographical objects. Sun is an object but not geographic. Ocean and water vapor are concepts instead of objects.


Reviewer 2 Report

Combining the related work with the introduction made this section hard to follow.  I suggest that you separate out the introduction to clearly indicate the contribution of your work.

You refer to geographic ontology and use it to evaluate/compare your work in section 4, however you do not explicitly state what are the elements of this ontology and how can they be used to represent the case study.  I don't think it is enough to just say that it can partially support the representation- you need to make the evaluation section more concrete.

You rightly identify that dynamic geographic concepts are of interest and may not be supported directly in a general geo-ontology, but could the concept of state, for example, be represented, given that a general ontology can represent the concept of time?

The concept of change is more complicated, but I am not convinced that an abstract representation of difference between states would be possible in DL?

Also, you did not mention much about whether your extended logic will be decidable and thus can be adopted easily within the Google knowledge graph for example?

So, given we use the full features of DL as described, would we be able to reason about this in realistic scenarios and data sets?

Figure 4 needs to be enhanced, as I was not able to read the labels even after resizing significantly.

Reviewer 3 Report

The formal representation of geographical knowledge is a critical aspect and the proposal of the authors of this article is an additional attempt to propose a representation that takes into account the particularities of the field of geography. The authors start from a set of six questions for which they construct a formal representation of a knowledge graph.

What is important in a formal representation is its ability to represent knowledge, but also to extract and question that knowledge. It seems that this second aspect is not clearly established:

Paragraph 4.1 aims at demonstrating the ability of the GeoKG to represent knowledge. In the process of building the KG, the authors start from the six questions to derive the six basic elements that define a geographic object. Each element is able to provide an answer to the corresponding question. When they address the knowledge representation ability, they simply restate that each of the six basic elements is able to answer the six questions. To summarise, the location element is first defined to answer the « where it is? » question in the GK. Then the authors represent the water cycle with the GK, using location elements. And then, to prove that the model is able to represent knowledge, they simply restate that, as the model contains a location element, it is able to answer the « where it is? » question. It seems to me that there is a weakness in this demonstration and that the demonstration does not demonstrate anything. This paragraph is either unnecessary or the demonstration of the ability to represent knowledge must be conducted in a different way. For example, it would be interesting to define standard questions and demonstrate that it is possible to obtain an answer.

Paragraph 4.3 aims at demonstrating that GeoKG models can be constructed from documents such as the « Encyclopedia of China: China Geography » and stored in a graph database. For this case study, the demonstration is limited to justifying the ability to build the representation. No indication of its ability to be queried is provided. It would be interesting to define a set of questions, translate them into queries on the database and analyze the results.

