Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models—A Case in Optimizing Intermodal Freight Transportation
Abstract
:Highlights
- We have developed an integrated and automated methodology that leverages a pre-trained Large Language Model (LLM) to generate scenario-based ontologies and knowledge graphs from research articles and technical manuals.
- Our methodology utilizes the ChatGPT API as the primary reasoning engine, supplemented by Natural Language Processing modules and carefully engineered prompts. This combination enables an automated tool capable of generating ontologies independently.
- The ontologies generated through our AI-powered method are interoperable and can significantly facilitate the design of data models and software architecture, particularly in the development of urban decision support systems.
- We compared ontologies generated by our LLM with those created by human experts through CQ-based qualitative evaluation, assessing the reliability and feasibility of our approach.
- The methodology has been successfully applied to intermodal freight data and simulations. This has allowed us to generate a scenario-based ontology and knowledge graph that enhances data discovery, integration, and management, thereby supporting network optimization and multiple criteria decision analysis.
- Our methodology is both generalizable and adaptive, enabling the automation of ontology generation to support the development of urban and environmental decision support systems across various disciplines.
Abstract
1. Introduction
2. Literature Review
2.1. Knowledge Representation for Smart Cities
2.2. Ontology Construction
- Methontology is a rigorous framework developed for the creation of domain ontologies that are independent of applications, facilitating a structured approach to ontology design [35]. This methodology encompasses several key phases, including specification, conceptualization, formalization, implementation, and maintenance, ensuring a comprehensive development process. Originally applied in the late 1990s, Methontology emphasizes systematic steps for the effective organization and representation of knowledge, making it a cornerstone in the field of semantic technologies and knowledge engineering [36].
- The On-To-Knowledge methodology (OTKM) represents a structured approach to ontology design, aimed at enhancing knowledge management within organizations. It focuses on converting implicit company knowledge into an explicit, structured form, facilitating better decision-making and process optimization [37]. This method emphasizes early stakeholder involvement, iterative development, and the use of formal ontologies to ensure that the captured knowledge is both accurate and usable in automated systems.
- The NeON methodology caters to the evolving demands of ontology engineering, supporting the reuse, re-engineering, and integration of ontological resources [38]. By offering a diverse set of strategies and tools, NeON facilitates the development of robust ontologies and ontology networks, addressing complex semantic frameworks [39]. This methodology is particularly beneficial for projects requiring collaborative and distributed development, making it a pivotal resource for advancing semantic technologies and their applications in various domains.
2.3. Traditional Methods for Ontology Implementation
2.4. Semantic Web Standards
- RDF serves as a standard model for data interchange on the web, enabling the representation of information in a triple format (subject-predicate-object), facilitating a graph-like structure [53].
- OWL provides additional vocabulary along with formal semantics designed for processing and integrating information as it enables the creation of more complex ontologies [54].
- SPARQL, on the other hand, is used to query databases stored in RDF format. It allows for powerful and expressive queries over diverse data sources, making it essential for extracting and manipulating data within and across knowledge graphs [55].
2.5. AI-Powered Autonomous System
2.6. LLM-Powered Knowledge Engineering
3. Knowledge Gaps and Contributions
3.1. Domain Application Challenges
3.2. Methodological Challenges
3.3. Knowledge Gaps
- Complex Domain Applications in Operations Science: Many previous studies have concentrated on testing the feasibility of constructing ontologies and knowledge using LLMs through proof-of-concept and pilot studies. However, there is a gap in utilizing AI-generated ontologies to support real-world decision-making and evaluating their performance in practical applications.
- Building Concepts with Complex Features: Most previous studies have utilized generic text inputs to test the ontology and knowledge graph creation capabilities of LLMs. However, these studies did not involve the handling and interpretation of complex scientific datasets with various data dimensions and features, such as topological, spatial, and temporal aspects.
- Integration of Knowledge Engineering Methods: Many previous studies have employed general queries with zero-shot prompting methods to generate knowledge representation using LLMs. However, the aspect of training or instructing LLMs using commonly accepted knowledge engineering methodologies, as discussed in Section 2.3, has been rarely addressed.
- Interoperability of LLM-Generated Content: While previous studies have focused on testing the feasibility and accuracy of LLM-generated knowledge representations, they often overlook the aspects of interoperability and usability, particularly concerning their application to practical research problems. Potential application areas include supporting the formulation of decision support strategies, guiding the design of critical software and database components, and facilitating the construction of knowledge bases for practical purposes.
3.4. Motivation and Contributions
- Contributions to LLM Applications: We explored the ability of Large Language Models (LLMs) to understand and interpret entities and relationships in complex spatial and topological datasets and simulation outputs, guided by the Methontology framework. Our approach is demonstrated through Section 5.2.
- Contributions to Software Engineering: We demonstrate a practice of utilizing AI-generated ontology to guide software design and semi-automatically generate critical components of a data provisioning system essential for developing an urban decision support system. Our approach aims to systematically bridge AI-generated knowledge representation with existing ontology management tools and database technologies (detailed in Section 5.3), supporting practical tasks in the development of decision support software.
- Contributions to Operations science: We devised an automated AI-powered workflow, driven by integrated data, simulations, and knowledge representations, to develop intelligent and advanced capabilities for supporting decision-making and optimizing complex urban systems. A detailed demonstration is presented through Section 5.5.
4. Methodology
4.1. Design Requirements
- Compatibility and Portability: The workflow should be designed as a generalized workflow, implementable in popular programming languages like Python. It should be modular and flexible, enabling deployment and execution on widely-used online platforms such as Jupyter Notebook.
- Automated Knowledge Source Acquisition: The workflow should be integrated with online databases and archives for scientific research literature via APIs. This integration will enable the agent to automatically download relevant research papers and technical manuals based on user-defined keywords related to a specific scope, dataset, or simulation model.
- Autonomous Knowledge Extraction: The workflow should be capable of comprehending human natural language, recognizing domain-specific terminologies, such as identifying concepts as entities and categories as classes, and understanding the relationships between various concepts and classes.
- Automated Ontology Design: The workflow should adhere to the practices and procedures of well-established ontology design methodologies, including Methontology, OTKM, and NeON (see Section 2.2). This adherence will enable it to create robust knowledge representations characterized by high granularity, reusability, flexibility, and semantic consistency.
- Ontology Interoperability: The workflow should be able to export ontologies as knowledge graphs or in widely accepted ontology languages, such as OWL and RDF (see Section 2.4). This functionality will enable the ontologies to be interpreted by popular ontology management software tools, facilitating validation processes and the creation of database schema for data modeling and integration.
- Self-validating: The autonomous tool should have the capability to validate the ontologies it generates by referencing existing domain ontologies or relevant literature. This process should include checking the definitions of concepts, relationships, and semantic mappings to ensure accuracy and consistency.
4.2. Methodological Foundation
4.2.1. Knowledge Source Acquisition and Pre-Processing
4.2.2. LLM-Powered Ontology Construction—Input Preparation
- Stop Word Removal: a technique used in many traditional NLP tasks where common words such as “and”, “the”, “is”, etc., which are deemed to have little meaning, are removed from the text. This is typically performed to reduce the dimensionality of the data and focus on more meaningful words for tasks like text classification, keyword extraction, and more, and is conducted outside the ChatGPT API using the NLTK library.
- An NLP procedure that cleans and standardizes text by converting all characters to lowercase, removing or replacing special characters and punctuation, standardizing dates, numbers, and other non-standard text forms, and managing whitespaces. This process ensures uniformity in the text, minimizing variability that could detract from analysis. Text normalization is integrated into the LLM’s functionality via the ChatGPT API.
- Sentence Boundary Detection: Following text normalization, this process identifies the start and end points of sentences within the text. Accurate sentence boundary detection is essential for understanding the text’s structure and facilitating subsequent per sentence processing. It segments a large text into manageable, analyzable units. This functionality is also incorporated into the LLM via the ChatGPT API.
- Tokenization: Following sentence boundary detection, the text within each sentence is tokenized. This process splits the text into tokens—typically words, phrases, or other meaningful elements. Tokenization transforms the text into discrete units suitable for further processing, such as parsing, part-of-speech tagging, or input to LLMs for reasoning, interpretation, and knowledge extraction. This functionality is also integrated into the LLM through the ChatGPT API.
- Image Context Recognition: The process uses Optical Character Recognition (OCR) technology to detect and convert text surrounding images into machine-readable characters. This includes extracting annotations, axis labels, data points, and descriptive text essential for interpreting the graphical content. This process is conducted outside the ChatGPT API using Layout Parser [70] and the Detectron2 model [71] for image separation serves as component identification to filter those that represent figures for further processing.
- GPT Image Recognition Engine: a component of a larger AI system that utilizes models trained to recognize and interpret content within images. This engine processes images to identify objects, features, and patterns, often using advanced neural networks optimized for image analysis. In the context of research articles, this involves recognizing graphical data, text within figures, or even the structure of the figures themselves. This functionality is integrated into the LLM through the ChatGPT API.
- Summary of Images as Texts: The process synthesizes key information from an image into a concise textual summary. It uses AI to interpret recognized objects, distilling these elements to highlight the essential insights or data presented. This is especially useful in academic settings for quickly conveying the significance and results depicted in complex figures or diagrams. This functionality is also integrated into the LLM through the ChatGPT API.
4.2.3. LLM-Powered Ontology Construction—Knowledge Extraction
- Extracting a Glossary of Terms using Named Entity Recognition (NER): This initial step in ontology creation employs NER, a pivotal NLP technique for identifying and categorizing key textual information. The process involves scanning text to detect and label critical domain-specific concepts as entities. These entities form the foundational elements of the ontology, enhancing semantic analysis by clearly outlining the terms and their relationships within the domain.
- Build Concept Taxonomy: A step focuses on organizing and structuring the identified concepts into a hierarchical framework. Building a concept taxonomy involves arranging concepts (or classes) identified during the conceptualization phase into a taxonomic (hierarchical) structure based on “is-a” relationships. This hierarchy represents a superclass-subclass relationship between broader and more specific concepts.
- Define Ad Hoc Relationship: This step is a crucial phase in the ontology development process. It involves identifying and specifying the relationships that exist between the concepts that are specific to the domain being modeled and do not necessarily fit into well-defined hierarchical (is-a), associative (part-of) relationships, or attribute (has property) relationships.
4.2.4. Ontology Evaluation
4.2.5. CQ-Based Qualitative Analysis
5. Case Study: Optimizing the Intermodal Freight Transportation System
5.1. Data Sources and Simulation Models
5.2. Ontology and Knowledge Graph Construction
- The first module is the ontology supporting the FAF data, providing essential data descriptions for developing a data provisioning system. This system encompasses factors such as types of transportation, cargo deliveries, geographical areas, and data sources for transportation data.
- The second module includes the FTOT network, which incorporates the spatial data (GIS data) necessary for calculating routes and inter-hub distances.
- The third module comprises the problem statement and optimization models derived from various academic resources. This module focuses on greenhouse gas emission calculations, optimization formulations, and intermodal transportation agents.
5.3. Data Provisioning System: Technologies and Implementation
5.4. Data Provisioning System: Database Schema Design
5.5. Decision Support Strategies—Formulation
- (a)
- Graph Representation is a graph data model that captures the topological properties (connectivity) of the freight transportation network. It is derived by the LLM using information from the FAF dataset manual. The model represents various modes and routes of freight transportation between key OD pairs, which are represented by US cities that are major logistic and transportation hubs. This graph is primarily used to navigate between any user-defined origin-destination pairs.
- (b)
- Ontology Representation delineates the major entities, along with their relationships and properties, within the intermodal freight transportation network as represented by the graph model. This ontology is derived from the information outlined in the manuals of both the FAF and FTOT datasets. It provides a foundational data structure that maps data variables—used to characterize the physical dynamics and performance of intermodal transportation processes—to their corresponding graph entities.
- (c)
- Data Sources comprise all the necessary multi-domain datasets required to provide data- and simulation-driven insights, complete with essential decision variables. These insights aid in making informed decisions aimed at optimizing operational costs and delivery times while also reducing fuel consumption and GHG emissions. These data sources are organized through a scenario-based ontology and are mapped to corresponding network entities (nodes and edges). Subsequently, they are ingested into a relational database, utilizing a data schema derived from the ontology to organize the data to enable rapid data retrieval for decision support purposes.
- (a)
- The “Network Traverse” refers to the systematic process of navigating through a transportation network to identify all possible combinations of routes and modes between an OD pair. This process involves assessing various transportation modes—such as road, rail, and river—and their interconnections, along with transit locations represented as intermediate nodes within the FTOT network. Each mode and route combination is characterized by a series of nodes that signify the origin, destination, and intermodal transits, connected by edges that represent the transitions between modes. This comprehensive approach ensures a thorough exploration of the network, facilitating optimal route and mode selection.
- (b)
- The “Decision Support Variable Definition” is a critical process that identifies key variables essential for characterizing the physical processes involved in intermodal transportation. These variables are vital for determining the most efficient, cost-effective, or fastest combinations of routes and modes available. During the ontology creation stage with LLM, structured prompts are developed. These prompts guide the ChatGPT API in extracting relevant variables from the technical manuals of the FAF and FTOT datasets, focusing on the efficiency and sustainability aspects of the freight transportation system. This structured approach ensures that all critical factors are considered in decision-making processes, enhancing the system’s overall performance and sustainability.
- (c)
- The “Data and Model Integration” is facilitated by encoding the Uniform Resource Identifier (URI) of specific data or simulations within a relational database, accessible via an API or hosted as a web service, into the ontology as properties of the graph entities. This method ensures that the freight transportation data and simulations are readily available during the network traversal. As the network is traversed to identify an individual combination of route and mode, decision variables at each graph entity are aggregated into decision metrics for the entire route combination, such as total greenhouse gas (GHG) emissions and total operational costs. This integrated setup enhances the efficiency of accessing and utilizing data, thereby optimizing the decision-making process in freight transportation management. The aggregated decision metrics for each route and mode combination are consolidated into a relational lookup table, which serves as a foundation for further decision analysis.
- (d)
- The “Optimal Mode and Route Identification” process utilizes optimization algorithms on a lookup table filled with decision metrics to systematically evaluate and select the most effective transportation options. This evaluation considers multiple, potentially conflicting criteria such as GHG emissions, operational costs, fuel consumption, and shipment time. By integrating both network optimization and Multi-Criteria Decision Analysis, this approach ensures a comprehensive and reliable determination of the optimal route and mode, balancing various transportation needs and environmental impacts effectively.
5.6. Decision Support Strategies—Implementation
6. Limitation and Future Work
- Without precisely engineered prompts, the ChatGPT API may exhibit sensitivity, leading to subtle variability and inconsistent results across different iterations. The robustness of the model can be enhanced by employing more advanced techniques, such as fine-tuning. These methods adapt pre-trained LLMs to generate ontologies using specific datasets or under particular conditions, thereby minimizing potential variability in the results from different model runs.
- There is a limitation on the maximum number of tokens that the ChatGPT API can process per request. This constraint can be circumvented by partitioning, optimizing, and refining the input data, combined with advancements in LLM research.
- Potential hallucinations of LLMs could jeopardize the accuracy and reliability of AI-generated ontologies and knowledge graphs. To minimize these hallucinations, previous studies have proposed matching techniques that utilize vocabularies defined in existing domain ontologies to validate AI-generated content [68]. Additionally, leveraging external knowledge bases through the recent emergence of RAG can also help reduce hallucinations. Our method demonstrates a qualitative analysis that logically addresses the competency questions proposed through SPARQL queries.
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Hoornweg, D.; Pope, K. Population predictions for the world’s largest cities in the 21st century. Environ. Urban. 2017, 29, 195–216. [Google Scholar] [CrossRef]
- Habitat, U. The Value of Sustainable Urbanization; World Cities Report; UN-Habitat: Nairobi, Kenya, 2020. [Google Scholar]
- Moran, D.; Kanemoto, K.; Jiborn, M.; Wood, R.; Többen, J.; Seto, K.C. Carbon footprints of 13 000 cities. Environ. Res. Lett. 2018, 13, 064041. [Google Scholar] [CrossRef]
- Angelidou, M.; Psaltoglou, A.; Komninos, N.; Kakderi, C.; Tsarchopoulos, P.; Panori, A. Enhancing sustainable urban development through smart city applications. J. Sci. Technol. Policy Manag. 2018, 9, 146–169. [Google Scholar] [CrossRef]
- Bibri, S.E.; Krogstie, J. The core enabling technologies of big data analytics and context-aware computing for smart sustainable cities: A review and synthesis. J. Big Data 2017, 4, 1–50. [Google Scholar] [CrossRef]
- Xu, H.; Berres, A.; Yoginath, S.B.; Sorensen, H.; Nugent, P.J.; Severino, J.; Tennille, S.A.; Moore, A.; Jones, W.; Sanyal, J. Smart mobility in the cloud: Enabling real-time situational awareness and cyber-physical control through a digital twin for traffic. IEEE Trans. Intell. Transp. Syst. 2023, 24, 3145–3156. [Google Scholar] [CrossRef]
- Xu, H.; Berres, A.; Shao, Y.; Wang, C.R.; New, J.R.; Omitaomu, O.A. Toward a Smart Metaverse City: Immersive Realism and 3D Visualization of Digital Twin Cities. In Advances in Scalable and Intelligent Geospatial Analytics; CRC Press: Boca Raton, FL, USA, 2023; pp. 245–257. [Google Scholar]
- Weil, C.; Bibri, S.E.; Longchamp, R.; Golay, F.; Alahi, A. A Systemic Review of Urban Digital Twin Challenges, and Perspectives for Sustainable Smart Cities. Sustain. Cities Soc. 2023, 99, 104862. [Google Scholar] [CrossRef]
- McPhearson, T.; Haase, D.; Kabisch, N.; Gren, Å. Advancing understanding of the complex nature of urban systems. Ecol. Indic. 2016, 70, 566–573. [Google Scholar] [CrossRef]
- Heinold, A.; Meisel, F. Emission limits and emission allocation schemes in intermodal freight transportation. Transp. Res. Part E Logist. Transp. Rev. 2020, 141, 101963. [Google Scholar] [CrossRef]
- Matei, O.; Erdei, R.; Delinschi, D. Multimodal transportation overview and optimization ontology for a greener future. In Proceedings of the Artificial Intelligence in Intelligent Systems: 10th Computer Science On-line Conference 2021; Springer: Berling/Heidelberg, Germany, 2021; Volume 2, pp. 158–172. [Google Scholar]
- Chen, Y.; Sabri, S.; Rajabifard, A.; Agunbiade, M.E. An ontology-based spatial data harmonisation for urban analytics. Comput. Environ. Urban Syst. 2018, 72, 177–190. [Google Scholar] [CrossRef]
- Kornyshova, E.; Deneckère, R. Decision-making ontology for information system engineering. In Proceedings of the Conceptual Modeling—ER 2010: 29th International Conference on Conceptual Modeling, Proceedings 29, Vancouver, BC, Canada, 1–4 November 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 104–117. [Google Scholar]
- Wang, R.; Nellippallil, A.B.; Wang, G.; Yan, Y.; Allen, J.K.; Mistree, F. Ontology-based uncertainty management approach in designing of robust decision workflows. J. Eng. Des. 2019, 30, 726–757. [Google Scholar] [CrossRef]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. Gpt-4 technical report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Team, G.; Anil, R.; Borgeaud, S.; Wu, Y.; Alayrac, J.B.; Yu, J.; Soricut, R.; Schalkwyk, J.; Dai, A.M.; Hauth, A.; et al. Gemini: A family of highly capable multimodal models. arXiv 2023, arXiv:2312.11805. [Google Scholar]
- Noy, N.F.; Hafner, C.D. The state of the art in ontology design: A survey and comparative review. AI Mag. 1997, 18, 53. [Google Scholar]
- Gruber, T.R. A translation approach to portable ontology specifications. Knowledge Acquisition 1993, 5, 199–220. [Google Scholar] [CrossRef]
- Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Philip, S.Y. A survey on knowledge graphs: Representation, acquisition, and applications. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 494–514. [Google Scholar] [CrossRef] [PubMed]
- Spoladore, D.; Pessot, E. Collaborative ontology engineering methodologies for the development of decision support systems: Case studies in the healthcare domain. Electronics 2021, 10, 1060. [Google Scholar] [CrossRef]
- Petrova-Antonova, D.; Ilieva, S. Digital twin modeling of smart cities. In Proceedings of the Human Interaction, Emerging Technologies and Future Applications III: Proceedings of the 3rd International Conference on Human Interaction and Emerging Technologies: Future Applications (IHIET 2020), Paris, France, 27–29 August 2020; Springer: Berlin/Heidelberg, Germany, 2021; pp. 384–390. [Google Scholar]
- Balasubramani, B.S.; Shivaprabhu, V.R.; Krishnamurthy, S.; Cruz, I.F.; Malik, T. Ontology-based urban data exploration. In Proceedings of the 2nd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, Burlingame, CA, USA, 31 October 2016; pp. 1–8. [Google Scholar]
- Bergman, M.K. Lagerstrom-Fife. In Knowledge Representation Practionary; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
- Shi, J.; Pan, Z.; Jiang, L.; Zhai, X. An ontology-based methodology to establish city information model of digital twin city by merging BIM, GIS and IoT. Adv. Eng. Inform. 2023, 57, 102114. [Google Scholar] [CrossRef]
- Al-Bayati, Z.J.F. Coupling Ontology with Reference Architectures to Facilitate the Instantiation Process of Software System Architectures; University of Salford: Salford, UK, 2019. [Google Scholar]
- Kuster, C.; Hippolyte, J.L.; Rezgui, Y. The UDSA ontology: An ontology to support real time urban sustainability assessment. Adv. Eng. Softw. 2020, 140, 102731. [Google Scholar] [CrossRef]
- Simsek, U.; Kärle, E.; Angele, K.; Huaman, E.; Opdenplatz, J.; Sommer, D.; Umbrich, J.; Fensel, D. A knowledge graph perspective on knowledge engineering. SN Comput. Sci. 2022, 4, 16. [Google Scholar] [CrossRef]
- Chaudhri, V.; Baru, C.; Chittar, N.; Dong, X.; Genesereth, M.; Hendler, J.; Kalyanpur, A.; Lenat, D.; Sequeda, J.; Vrandečić, D.; et al. Knowledge graphs: Introduction, history and, perspectives. AI Mag. 2022, 43, 17–29. [Google Scholar]
- Syed, M.H.; Huy, T.Q.B.; Chung, S.T. Context-aware explainable recommendation based on domain knowledge graph. Big Data Cogn. Comput. 2022, 6, 11. [Google Scholar] [CrossRef]
- Weil, C.; Bibri, S.E.; Longchamp, R.; Golay, F.; Alahi, A. Urban digital twin challenges: A systematic review and perspectives for sustainable smart cities. Sustain. Cities Soc. 2023, 99, 104862. [Google Scholar] [CrossRef]
- Cristani, M.; Cuel, R. A survey on ontology creation methodologies. Int. J. Semant. Web Inf. Syst. (IJSWIS) 2005, 1, 49–69. [Google Scholar] [CrossRef]
- Guizzardi, G. Ontological foundations for structural conceptual models. Ph.D. Thesis, Research UT, University of Twente, Enschede, The Netherlands, 2005. [Google Scholar]
- Merali, Y. Complexity and information systems: The emergent domain. J. Inf. Technol. 2006, 21, 216–228. [Google Scholar] [CrossRef]
- Gosal, G. Ontology Building: An Integrative View of Methodologies. Int. J. Recent Innov. Trends Comput. Commun. 2015, 3, 4677–4683. [Google Scholar]
- Fernández-López, M.; Gómez-Pérez, A.; Juristo, N. METHONTOLOGY: From Ontological Art Towards Ontological Engineering. In Proceedings of the Ontological Engineering AAAI-97 Spring Symposium Series, Palo Alto, CA, USA, 24–25 March 1997; American Association for Artificial Intelligence; Ontology Engineering Group—OEG: Madrid, Spain, 1997; Available online: https://oa.upm.es/5484/ (accessed on 5 July 2024).
- Sure, Y.; Staab, S.; Studer, R. On-to-knowledge methodology (OTKM). Handbook on Ontologies; Springer: Berlin/Heidelberg, Germany, 2004; pp. 117–132. [Google Scholar]
- Gómez-Pérez, A.; Suárez-Figueroa, M.C. NeOn methodology for building ontology networks: A scenario-based methodology. In Proceedings of the International Conference on Software, Services & Semantic Technologies; Demetra EOOD: Burgas, Bulgaria; 2009; p. 160. Available online: http://hdl.handle.net/10506/672 (accessed on 5 July 2024).
- Suárez-Figueroa, M.C.; Gómez-Pérez, A.; Fernandez-Lopez, M. The NeOn Methodology framework: A scenario-based methodology for ontology development. Appl. Ontol. 2015, 10, 107–145. [Google Scholar] [CrossRef]
- Keet, C.M. Aspects of Ontology Integration. Ph.D. Thesis, School of Computing, Napier University, Edinburgh, UK, 2004. [Google Scholar]
- Nguyen, V. Ontologies and Information Systems: A Literature Survey. 2011. Available online: https://apps.dtic.mil/sti/citations/tr/ADA546186 (accessed on 5 July 2024).
- Noy, N.F.; Sintek, M.; Decker, S.; Crubézy, M.; Fergerson, R.W.; Musen, M.A. Creating semantic web contents with protege-2000. IEEE Intell. Syst. 2001, 16, 60–71. [Google Scholar] [CrossRef]
- Gil, Y.; Garijo, D.; Mishra, S.; Ratnakar, V. OntoSoft: A distributed semantic registry for scientfific software. In Proceedings of the 2016 IEEE 12th International Conference on e-Science (e-Science), Baltimore, MD, USA, 23–27 October 2016; pp. 331–336. [Google Scholar]
- Mizoguchi, R.; Sunagawa, E.; Kozaki, K.; Kitamura, Y. The model of roles within an ontology development tool: Hozo. Appl. Ontol. 2007, 2, 159–179. [Google Scholar]
- Falconer, S. OntoGraf. Protégé Wiki. 2010. Available online: https://protegewiki.stanford.edu/wiki/Protege_Desktop_Old_Versions (accessed on 5 July 2024).
- Sedlmeier, M.; Gogolla, M. Model Driven ActiveRecord with yEd. In Proceedings of the 25th International Conference Information Modelling and Knowledge Bases (EJC’2015), Maribor, Slovenia, 8–12 June 2015; pp. 65–76. [Google Scholar]
- Saigal, R.; Kumar, A. Visual understanding environment. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL ’05), Denver, CO, USA, 7–11 June 2005; p. 413. [Google Scholar] [CrossRef]
- Cañas, A.J.; Hill, G.; Carff, R.; Suri, N.; Lott, J.; Gómez, G.; Eskridge, T.C.; Arroyo, M.; Carvajal, R. CmapTools: A Knowledge Modeling and Sharing Environment. In Proceedings of the 1st International Conference on Concept Mapping, Pamplona, Spain, 14–17 September 2004; pp. 125–133. [Google Scholar]
- Dudáš, M.; Lohmann, S.; Svátek, V.; Pavlov, D. Ontology visualization methods and tools: A survey of the state of the art. Knowl. Eng. Rev. 2018, 33, e10. [Google Scholar] [CrossRef]
- Botzenhardt, A.; Maedche, A.; Wiesner, J. Developing a domain ontology for software product management. In Proceedings of the 2011 Fifth International Workshop on Software Product Management (IWSPM), Trento, Italy, 30 August 2011; pp. 7–16. [Google Scholar]
- Fernández-López, M.; Gómez-Pérez, A. Overview and analysis of methodologies for building ontologies. Knowl. Eng. Rev. 2002, 17, 129–156. [Google Scholar] [CrossRef]
- Allemang, D.; Hendler, J. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL; Elsevier: Amsterdam, The Netherlands, 2011. [Google Scholar]
- Tomaszuk, D.; Hyland-Wood, D. RDF 1.1: Knowledge representation and data integration language for the Web. Symmetry 2020, 12, 84. [Google Scholar] [CrossRef]
- Cardoso, J.; Pinto, A.M. The web ontology language (owl) and its applications. In Encyclopedia of Information Science and Technology, 3rd ed.; IGI Global: Hershey, PA, USA, 2015; pp. 7662–7673. [Google Scholar]
- Curé, O.; Blin, G. RDF Database Systems: Triples Storage and SPARQL Query Processing; Morgan Kaufmann: Burlington, MA, USA, 2014. [Google Scholar]
- Brustoloni, J.C. Autonomous Agents: Characterization and Requirements; Carnegie Mellon University: Pittsburgh, PA, USA, 1991. [Google Scholar]
- Wooldridge, M.; Jennings, N.R. Intelligent agents: Theory and practice. Knowl. Eng. Rev. 1995, 10, 115–152. [Google Scholar] [CrossRef]
- Stone, P.; Brooks, R.; Brynjolfsson, E.; Calo, R.; Etzioni, O.; Hager, G.; Hirschberg, J.; Kalyanakrishnan, S.; Kamar, E.; Kraus, S.; et al. Artificial intelligence and life in 2030: The one hundred year study on artificial intelligence. arXiv 2022, arXiv:2211.06318. [Google Scholar]
- Albrecht, S.V.; Stone, P. Autonomous agents modelling other agents: A comprehensive survey and open problems. Artif. Intell. 2018, 258, 66–95. [Google Scholar] [CrossRef]
- Sun, Y.X.; Li, Z.M.; Huang, J.Z.; Yu, N.z.; Long, X. GPT-4: The future of cosmetic procedure consultation? Aesthetic Surg. J. 2023, 43, NP670–NP672. [Google Scholar] [CrossRef]
- Li, Z.; Ning, H. Autonomous GIS: The next-generation AI-powered GIS. Int. J. Digit. Earth 2023, 16, 4668–4686. [Google Scholar] [CrossRef]
- Shin, J.; Tang, C.; Mohati, T.; Nayebi, M.; Wang, S.; Hemmati, H. Prompt Engineering or Fine Tuning: An Empirical Assessment of Large Language Models in Automated Software Engineering Tasks. arXiv 2023, arXiv:2310.10508. [Google Scholar]
- Baldazzi, T.; Bellomarini, L.; Ceri, S.; Colombo, A.; Gentili, A.; Sallinger, E. Fine-tuning large enterprise language models via ontological reasoning. In Proceedings of the International Joint Conference on Rules and Reasoning; Springer: Berlin/Heidelberg, Germany, 2023; pp. 86–94. [Google Scholar]
- Meyer, L.P.; Stadler, C.; Frey, J.; Radtke, N.; Junghanns, K.; Meissner, R.; Dziwis, G.; Bulert, K.; Martin, M. Llm-assisted knowledge graph engineering: Experiments with chatgpt. In Proceedings of the Working conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow; Springer Fachmedien Wiesbaden: Wiesbaden, Germany, 2023; pp. 103–115. [Google Scholar]
- Babaei Giglou, H.; D’Souza, J.; Auer, S. LLMs4OL: Large language models for ontology learning. In Proceedings of the International Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2023; pp. 408–427. [Google Scholar]
- Zhang, B.; Soh, H. Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction. arXiv 2024, arXiv:2404.03868. [Google Scholar]
- Kommineni, V.K.; König-Ries, B.; Samuel, S. From human experts to machines: An LLM supported approach to ontology and knowledge graph construction. arXiv 2024, arXiv:2403.08345. [Google Scholar]
- Caufield, J.H.; Hegde, H.; Emonet, V.; Harris, N.L.; Joachimiak, M.P.; Matentzoglu, N.; Kim, H.; Moxon, S.; Reese, J.T.; Haendel, M.A.; et al. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning. Bioinformatics 2024, 40, btae104. [Google Scholar] [CrossRef]
- Xu, H.; Windsor, M.; Muste, M.; Demir, I. A web-based decision support system for collaborative mitigation of multiple water-related hazards using serious gaming. J. Environ. Manag. 2020, 255, 109887. [Google Scholar] [CrossRef] [PubMed]
- Shen, Z.; Zhang, R.; Dell, M.; Lee, B.C.G.; Carlson, J.; Li, W. Layoutparser: A unified toolkit for deep learning based document image analysis. In Proceedings of the Document Analysis and Recognition—ICDAR 2021: 16th International Conference, Proceedings, Part I 16, Lausanne, Switzerland, 5–10 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 131–146. [Google Scholar]
- Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.Y.; Girshick, R. Detectron2. 2019. Available online: https://github.com/facebookresearch/detectron2 (accessed on 25 April 2024).
- Dong, H.; Hussain, F.K.; Chang, E. Application of Protégé and SPARQL in the field of project knowledge management. In Proceedings of the 2007 Second International Conference on Systems and Networks Communications (ICSNC 2007), Cap Eterel, France, 25–31 August 2007; p. 74. [Google Scholar]
- Kumar, N.; Kumar, S. Querying RDF and OWL data source using SPARQL. In Proceedings of the 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Tiruchengode, India, 4–6 July 2013; pp. 1–6. [Google Scholar]
- Lubbad, M.A. Ontology based data access with relational databases. Master’s Thesis, Fen Bilimleri Enstitüsü, İstanbul, Türkiye, 2018. [Google Scholar]
- Rector, A.; Drummond, N.; Horridge, M.; Rogers, J.; Knublauch, H.; Stevens, R.; Wang, H.; Wroe, C. OWL pizzas: Practical experience of teaching OWL-DL: Common errors & common patterns. In Proceedings of the Engineering Knowledge in the Age of the Semantic Web: 14th International Conference, EKAW 2004, Proceedings 14; Whittlebury Hall, UK, 5–8 October 2004, Springer: Berlin/Heidelberg, Germany, 2004; pp. 63–81. [Google Scholar]
- Redmond, T.; Noy, N. Computing the changes between ontologies. In Proceedings of the Joint Workshop on Knowledge Evolution and Ontology Dynamics; CEUR Workshop Proceedings: Boston, MA, USA, 2011; pp. 1–14. [Google Scholar]
- Malone, J.; Holloway, E.; Adamusiak, T.; Kapushesky, M.; Zheng, J.; Kolesnikov, N.; Zhukova, A.; Brazma, A.; Parkinson, H. Modeling sample variables with an Experimental Factor Ontology. Bioinformatics 2010, 26, 1112–1118. [Google Scholar] [CrossRef]
- Kremen, P.; Smid, M.; Kouba, Z. OWLDiff: A practical tool for comparison and merge of OWL ontologies. In Proceedings of the 2011 22nd International Workshop on Database and Expert Systems Applications, Toulouse, France, 29 August–2 September 2011; pp. 229–233. [Google Scholar]
- Noy, N.F.; Musen, M.A. Evaluating Ontology-Mapping Tools: Requirements and Experience. In Proceedings of the EON (Evaluation of Ontology-Based Tools) Workshop, Siguenza, Spain, 14–15 October 2002; pp. 1–14. [Google Scholar]
- Noy, N.F.; Musen, M.A. The prompt suite: Interactive tools for ontology merging and mapping. Int. J.-Hum.-Comput. Stud. 2003, 59, 983–1024. [Google Scholar] [CrossRef]
- Bezerra, C.; Freitas, F.; Santana, F. Evaluating ontologies with competency questions. In Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Atlanta, GA, USA, 17–20 November 2013; Volume 3, pp. 284–285. [Google Scholar]
- Ren, Y.; Parvizi, A.; Mellish, C.; Pan, J.Z.; Van Deemter, K.; Stevens, R. Towards competency question-driven ontology authoring. In Proceedings of the The Semantic Web: Trends and Challenges: 11th International Conference, ESWC 2014, Proceedings 11, Anissaras, Crete, Greece, 25–29 May 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 752–767. [Google Scholar]
- Federal Highway Administration|U.S. Department of Transportation. Freight Analysis Framework (FAF). 2024. Available online: https://ops.fhwa.dot.gov/freight/freight_analysis/faf/ (accessed on 25 April 2024).
- U.S. Department of Transportation, V.C. Freight and Fuel Transportation Optimization Tool (FTOT). 2024. Available online: https://volpeusdot.github.io/FTOT-Public/ (accessed on 25 April 2024).
- Mogotlane, K.D. Semantic Knowledge Extraction from Relational Databases. Ph.D. Thesis, Vaal University of Technology, Vanderbijlpark, South Africa, 2014. [Google Scholar]
- Mogotlane, K.D.; Fonou-Dombeu, J.V. Automatic conversion of relational databases into ontologies: A comparative analysis of Protege plug-ins performances. arXiv 2016, arXiv:1611.02816. [Google Scholar]
CQ | Queries on Human-Crafted Ontology | Queries on Automated Ontology | ||
---|---|---|---|---|
What are the primary categories within the pizza ontology? | PREFIX: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT DISTINCT ?class WHERE { ?class rdf:type owl:Class. } | Pizza, PizzaBase, Food, Spiciness 105 in total | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT DISTINCT ?class WHERE { ?class rdf:type ontology:Class. } | Food, Process, Business, Culture 34 in total |
What are the main ingredients of a pizza? | Not found in ontology | Not found in ontology | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>, PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?individual ?label WHERE { ?individual rdf:type ontology:Ingredients. OPTIONAL { ?individual rdfs:label ?label }} | Cheese, Mozzarella, Tomatoes 18 in total |
What are the different variations of toppings used for a pizza? | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>, PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class WHERE { ?class rdfs:subClassOf ontology:VegetableTopping. } | TomatoTopping, HotGreenPepperTopping, JalapenoPepperTopping, ArtichokeTopping, AsparagusTopping, OnionTopping, PeperonataTopping, CaperTopping, OliveTopping 22 in total | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>, PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?subclass WHERE { ?subclass rdfs:subClassOf ontology:PizzaToppings.} | CheeseToppings, FruitToppings, HerbAndSpiceToppings, MeatToppings, OtherToppings, SauceToppings, SeafoodToppings, VegetableToppings |
Which dishes use the dough as a main ingredient? | Not found in ontology | Not found in ontology | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT DISTINCT ?dish WHERE { ?dish rdf:type ontology:Dish. ?dough rdf:type ontology:Dough.} | Calzone, Pizza |
What are the Vegetable Toppings? | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>, SELECT ?individual WHERE {?individual rdf:type ontology:VegetableToppings. } | Tomato, Hot Green Pepper, Jalapeno Pepper, Artichoke, Asparagus, Onion 22 in total | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> SELECT ?vegetableTopping WHERE { ?vegetableTopping rdf:type ontology:VegetableToppings. } | Artichokes, Mushrooms, Onion, Tomatoes |
What are the different types of Pizzas? | PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>, PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?pizzaType ?label WHERE {?pizzaType rdf:type owl:Class. ?pizzaType rdfs:subClassOf* ontology:Pizza. OPTIONAL {?pizzaType rdfs:label ?label. } | AmericanHot, Cajun, Capricciosa, Caprina, Fiorentina, FourSeasons 25 in total | Not found in ontology | Not found in ontology |
CQ | Automated Ontology Prefix, Query and Output | Query Output Description | Dataset | |
---|---|---|---|---|
What are the primary scenario parameters within the FTOT Ontology? | PREFIX: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX: rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX: owl: <http://www.w3.org/2002/07/owl#> SELECT ?parameter WHERE { ?parameter rdf:type base:ScenarioParameters. } | Scenario_Name, Scenario_Description, RMP_Commodity_Data, Destinations_Commodity_Data, Disruption_Data, 20 in total | The parameters, metadata related parameters, details on raw material producers, destinations, and information on disruption events. | FTOT |
What are the primary scenario inputs within the FTOT Ontology? | PREFIX: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX: rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX: owl: <http://www.w3.org/2002/07/owl#> SELECT ?input WHERE { ?input rdf:type base:ScenarioInputs. } | geospatialinformation, networkattributes, facilities, origins, destinations, 10 in total | Inputs listed include GIS information, attributes related the network such as traffic or disruptive information, location for facilities, origins and destinations. | FTOT |
Describe the geographical components of the FAF dataset? | PREFIX: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX: rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX: owl: <http://www.w3.org/2002/07/owl#> SELECT ?class ?individual WHERE { ?class rdfs:subClassOf <<—URL—>/ontology#Geography>. ?individual rdf:type ?class. } | DomesticOrigin, ForeignOrigin, DomesticDestination, ForeignDestination, 9 in total | Geographical components, such as origin and destination. | FAF |
List the main transportation hubs? | PREFIX: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX: rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX: owl: <http://www.w3.org/2002/07/owl#> SELECT ?region WHERE ?region rdf:type ontology:Regions. | Mobile-Daphne-Fairhope, Orlando-Deltona-Daytona-Beach, Chicago-Naperville, Tucson-Nogales, St-Louis-St-Charles-Farmington, Los-Angeles-Long-Beach, Philadelphia-Reading-Camden, 24 in total | List of individual hubs identified by the SPARQL query on the region’s class within the FAF ontology. | FAF |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tupayachi, J.; Xu, H.; Omitaomu, O.A.; Camur, M.C.; Sharmin, A.; Li, X. Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models—A Case in Optimizing Intermodal Freight Transportation. Smart Cities 2024, 7, 2392-2421. https://doi.org/10.3390/smartcities7050094
Tupayachi J, Xu H, Omitaomu OA, Camur MC, Sharmin A, Li X. Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models—A Case in Optimizing Intermodal Freight Transportation. Smart Cities. 2024; 7(5):2392-2421. https://doi.org/10.3390/smartcities7050094
Chicago/Turabian StyleTupayachi, Jose, Haowen Xu, Olufemi A. Omitaomu, Mustafa Can Camur, Aliza Sharmin, and Xueping Li. 2024. "Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models—A Case in Optimizing Intermodal Freight Transportation" Smart Cities 7, no. 5: 2392-2421. https://doi.org/10.3390/smartcities7050094
APA StyleTupayachi, J., Xu, H., Omitaomu, O. A., Camur, M. C., Sharmin, A., & Li, X. (2024). Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models—A Case in Optimizing Intermodal Freight Transportation. Smart Cities, 7(5), 2392-2421. https://doi.org/10.3390/smartcities7050094