Article

TravelRAG: A Tourist Attraction Retrieval Framework Based on Multi-Layer Knowledge Graph

1
National Engineering Research Center of Geographic Information System, China University of Geosciences, Wuhan 430074, China
2
Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences, Wuhan 430074, China
3
School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2024, 13(11), 414; https://doi.org/10.3390/ijgi13110414
Submission received: 24 September 2024 / Revised: 13 November 2024 / Accepted: 13 November 2024 / Published: 16 November 2024

Abstract
This paper introduces TravelRAG, a retrieval framework for the tourism domain that is built upon a large language model (LLM) and integrates Retrieval-Augmented Generation (RAG) with knowledge graphs. The framework addresses the difficulty LLMs have in providing precise and contextually appropriate responses to domain-specific queries in the tourism field. TravelRAG extracts information related to tourist attractions from User-Generated Content (UGC) on social media platforms and organizes it into a multi-layer knowledge graph. This travel knowledge graph serves as the core retrieval source for the LLM, improving the accuracy of information retrieval and significantly reducing the generation of erroneous or fabricated responses, often termed “hallucinations”. Comparative analyses with traditional RAG pipelines indicate that TravelRAG significantly boosts both retrieval efficiency and accuracy, while also greatly reducing the computational cost of model fine-tuning. The experimental results show that TravelRAG not only outperforms traditional methods in retrieval accuracy but also better meets user needs for content generation.

1. Introduction

The gradual recovery of the global economy has been accompanied by a robust rebound in the tourism industry [1]. With the growing number of social media users, sharing and documenting personal experiences on social platforms has become an integral part of daily life for many. Content shared on social media is not only up-to-date, but also rich in geographic information [2,3]. The rapid growth in users of short-video platforms, alongside the coming of age of a curious younger generation, has allowed many lesser-known attractions to reach wider audiences through recommendation algorithms. This trend has encouraged people to select travel destinations based on content with significant social influence [4,5]. Special-interest attractions differ from mainstream sites primarily due to the short-term influx of visitors and the lack of detailed travel planning. Moreover, travel notes shared on social platforms by the younger generation have become a major reference [6]. These travel notes typically include information on the location of attractions, activities, personal experiences, and nearby accommodations and dining options. Rapid analysis and accurate retrieval of travel notes from social platforms are essential for adapting to current tourism trends.
Accurately and comprehensively extracting geographic information from texts is essential for analyzing user behavior. Imagine a traveler writing in a review: “Walk along the cobblestone path for five minutes, and you’ll see a blue cottage on your left. The café next to it has the best matcha latte”. Humans can easily understand that this is a recommendation for a café, recognize that the café is located beside the “blue cottage”, and know that it can be reached by following the “path”. However, it remains challenging for a machine to accurately interpret the geographic information embedded in such descriptions. The system would need to parse vague location references like “blue cottage” and “path”, understand that “matcha latte” implies a popular item, and interpret “on your left” as a directional cue. These nuanced yet pivotal geographic data are intuitive to human readers but can often surpass a machine’s current comprehension capabilities. If machines could better extract and interpret these details, recommendation accuracy would improve, helping travelers more easily locate their ideal destinations. For the tourism industry, this could mean a more personalized and seamless user experience. As internet culture has evolved, a form of content known as “memes” has emerged, characterized by humor and cultural references; this viral content is disseminated online rapidly and is continuously evolving and being integrated into popular online culture [7]. Memes are primarily characterized as being “entertaining and memorable” and their format often differs significantly from conventional language used in daily conversation. For example, on Chinese social media platforms, media articles with attention-grabbing titles often begin with “Who understands, my family members?”. Such titles are used to emphasize the tone but usually do not convey the actual content being discussed in the article. Such memes introduce noise into the text, challenging a model’s ability to accurately extract information. Removing this noise while precisely extracting tourism-related geographic information from the text is crucial for constructing a comprehensive tourism knowledge base.
With the rapid advancement of large language models (LLMs) such as ChatGPT [8], these models have become valuable tools in everyday life, and numerous AI applications have emerged. LLMs require vast amounts of data during training, and because this data is collected before training begins, it cannot cover content published afterward. This causes LLMs to “hallucinate” when answering questions outside the scope of their acquired knowledge [9]. Moreover, the training process requires substantial investment and time, often consuming millions of dollars, which makes injecting up-to-date data into an LLM a major challenge [10]. Low-Rank Adaptation of Large Language Models (LoRA) reduces the cost of fine-tuning: its core idea is to apply a low-rank decomposition to the update of the model parameters, reducing the number of trainable parameters, lowering the computational complexity and relaxing the hardware requirements [11,12]. Although LoRA effectively improves training efficiency, the model’s underlying parameter count remains substantial, so this fine-tuning approach still incurs high costs in scenarios that demand frequent knowledge updates.
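To make the low-rank idea concrete, the following minimal NumPy sketch contrasts the number of trainable parameters in full fine-tuning with a LoRA-style update W' = W + (α/r)·BA; the dimensions, rank and scaling are illustrative placeholders, not values from any specific model:

```python
import numpy as np

# Sketch of the LoRA idea: instead of training a full d x d update for a
# frozen weight W, train two small factors B (d x r) and A (r x d), r << d.
d, r, alpha = 1024, 8, 16            # illustrative sizes; r is the LoRA rank

W = np.random.randn(d, d) * 0.02     # frozen pre-trained weight
A = np.random.randn(r, d) * 0.01     # trainable low-rank factor
B = np.zeros((d, r))                 # trainable; zero init keeps W' = W at start

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass with the low-rank update: h = W x + (alpha / r) * B A x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

print(f"trainable: {2 * d * r:,} parameters vs {d * d:,} for full fine-tuning")
```

Only A and B are updated during fine-tuning, which is why the trainable parameter count drops from d² to 2dr.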
Text retrieval aims to find information resources relevant to users’ queries, while knowledge-intensive retrieval tasks use deep learning methods [13] to map both queries and documents into dense vectors through neural networks, automatically learning query and document representations from data. This approach enables more precise matching of information needs. However, with increasing model complexity and growing demands for frequent knowledge updates, it continues to face challenges in balancing efficiency and cost. To address the problems of outdated information and hallucinations in LLMs, integrating external knowledge bases has become a way to enhance the performance of knowledge-intensive retrieval tasks. The integration of retrieval techniques with LLMs has given rise to Retrieval-Augmented Generation (RAG) [14], a method that leverages an LLM to retrieve relevant content, thereby improving the quality of generated text.
Recently, many researchers have focused on combining knowledge graphs with LLMs [15,16,17,18]. Knowledge graphs store information in a structured format, allowing for the better organization and management of internet data. GraphRAG uses knowledge graphs as an information source for LLMs, enhancing model performance and effectively addressing issues such as hallucinations, gaps in domain-specific knowledge, and outdated information [15].
In this paper, we propose a framework called TravelRAG to improve the query performance of large language models in the field of travel. Based on the idea of GraphRAG, this framework provides the latest social platform information and multi-level scenic spot knowledge for the large language model by constructing a tourism knowledge graph, effectively reducing the hallucination problem. We further optimized the retrieval process in RAG by constructing a knowledge graph of scenic spot classification for communities with different interests and generating relevant reports for interested communities to improve the retrieval efficiency and the response accuracy.
To assess the performance of TravelRAG, we conducted a comparative analysis with a traditional RAG pipeline. The results indicate that TravelRAG significantly improves both the retrieval efficiency and accuracy of tourism information retrieval, while also substantially reducing the costs associated with model fine-tuning. Our contributions are as follows:
  • We developed an automated pipeline for constructing knowledge graphs based on large language models.
  • Adapting to current advancements, we converted a substantial amount of unstructured text into a multi-layer knowledge graph and used this knowledge graph as the retrieval source in RAG.
  • The RAG pipeline, which uses a knowledge graph as its retrieval source, demonstrated a superior retrieval accuracy compared to traditional RAG methods.

2. Related Works

2.1. Knowledge Graph

The knowledge graph is a knowledge base that represents entities in the objective world and their relationships in the form of a graph, enabling intelligent systems to solve complex tasks [19,20]. A knowledge graph is a networked knowledge structure composed of “entity–relation–entity” triples together with entities and their associated attribute–value pairs [21]. Work on knowledge graphs involves three key parts: (1) entity and relationship extraction, using semantic analysis and other machine learning and natural language processing methods; (2) knowledge representation, using graph databases and knowledge fusion techniques to store knowledge graphs; and (3) data integration and knowledge reasoning in knowledge graph applications. Knowledge graphs can be divided into two categories according to their construction process. The first is early ontology-based construction, as used in WordNet [22], HowNet [23] and others. Most of these knowledge graphs are manually built by experts; the knowledge is of high quality, ensuring accuracy and completeness, but it is difficult to produce at scale. The second is to automatically extract entities and relationships from open internet information, as in Yet Another Great Ontology (YAGO) [19] and DBPedia [24]. Knowledge graphs of this kind are large, but the complexity of the data sources and the inaccuracy of information extraction algorithms can introduce substantial incompleteness and noise. Knowledge graphs are well suited to organizing, managing and understanding information from the internet, and can be used for semantic search [25,26], intelligent question answering [27] and personalized recommendation [28,29], with particular value in specific fields. The application of knowledge graphs is a current research hotspot in the information field, and they are also a foundational technology for advancing artificial intelligence.
Large language models show impressive performance in a variety of natural language processing tasks. These models are pre-trained on large text corpora to generate responses that meet user requirements. To enable large language models to grasp and reproduce the required structural patterns when exposed to specific context information, many researchers use in-context learning (ICL) [30]. In-context learning involves giving the language model appropriate prompts and iteratively correcting incorrect answers, for example making the model answer questions in a specific format or output information in a structured form. Constructing domain-specific knowledge graphs requires a large amount of hand-labeled corpora, which is very expensive. Because large language models can output structured information, some researchers believe they can be used to produce knowledge graphs. Hu [31] noted that ChatGPT has excellent few-shot learning ability, and used a small amount of data to fine-tune the model, classify the topics of texts to be processed, and extract entities and relationships to create knowledge graphs. Wei [32] proposed Knowledge In Context with GPT (KICGPT), a triple-based knowledge graph completion framework built on a large language model, which achieves better knowledge graph completion without additional training overhead. The Attack Knowledge Graph (AttacKG+) framework proposed by Zhang [33] can extract the defined information effectively and accurately.
Although large language models perform impressively across various tasks, they have significant limitations and risks when facing complex tasks such as knowledge reasoning. LLMs generate answers autoregressively, predicting each token from the tokens generated so far, so the output is determined probabilistically and the whole process is inherently uncertain. In addition to using large language models to construct knowledge graphs, some researchers use knowledge graphs to enhance the capabilities of large language models. To address the instability of LLMs, Agrawal [34] argued that taking a knowledge graph as an external information source for LLMs can effectively mitigate their hallucination problem. Petroni [35] used a knowledge graph as the knowledge source of a language model, Dai [36] stored knowledge-graph knowledge as neurons within the transformer [37] architecture of the language model, and Choudhrary [38] and Wang [39] leveraged knowledge graphs to enhance the reasoning ability of LLMs.
Tourism involves comprehensive knowledge spanning geography, history, cuisine, user experience and other aspects. However, current knowledge retrieval on tourism topics usually yields knowledge from a single field, lacking effective correlation across fields [40]. Researchers have therefore studied how to extract tourism-related information scattered across texts and use it to build tourism knowledge graphs. DBtravel [41] is an English-language travel knowledge graph generated from the collaborative travel website Wikitravel. Zhang [42] constructed a Chinese knowledge graph and extracted a tourism-related knowledge graph from an existing textual corpus, while Xiao [43] collected extensive data from popular travel websites and extracted travel knowledge from them. Liu [44] constructed a knowledge graph of tourist attractions and built a knowledge Q&A system that combines a traditional Q&A model with a fine-grained knowledge graph Q&A model based on BiLSTM-CRF. Tan [45] adopted a translation-based method to train a knowledge graph, vectorizing its knowledge to establish a knowledge graph for the tourism field.

2.2. Retrieval Augmentation Generation

The information retrieval (IR) model aims to find information at scale, often from unstructured data sources, to meet users’ needs; it is widely applied in search engines, question answering and recommendation tasks. IR can be divided into sparse retrieval and dense retrieval [46]. In sparse retrieval, documents and queries are usually represented with a bag-of-words model; that is, each word is treated as an independent feature, and the relevance of documents to queries is determined mainly by matching their shared terms [47]. However, the sparse retrieval method ignores the semantic relations between words, which can lead to inaccurate retrieval results and semantic mismatch.
The dense retrieval method maps queries and documents into a fixed-dimensional vector space, so that each can be represented by a fixed-length vector [48]. Semantic information is taken into account in forming these vectors, which solves the semantic mismatch problem to a certain extent. However, mapping text of varying length and complexity into a single dense vector may lose semantic information, especially for long text or text with complex sentence structures, where a single vector may struggle to capture the multi-dimensional, multi-level semantic features of the text.
To address this problem, Huang [13] proposed obtaining text semantics by using deep learning to map queries and documents into dense vectors, automatically learning query and document representations from labeled data. A matching function (such as cosine similarity or the dot product) is used to compute the relevance between queries and documents. Compared with the traditional bag-of-words retrieval model, this method achieves semantic matching to a certain extent. Guo et al. [49] proposed the Deep Relevance Matching Model (DRMM), arguing that the key to the retrieval task is evaluating the semantic relevance between queries and documents. These works are considered preliminary explorations of dense retrieval models.
With the advent of Bidirectional Encoder Representations from Transformers (BERT) [50], models’ grasp of linguistic complexity has improved thanks to its excellent text representation and deep semantic understanding, providing a solid foundation for the development of dense retrieval models. On this basis, researchers have proposed many IR methods based on pre-trained models [51]. Karpukhin [52] proposed the Dense Passage Retrieval (DPR) model for open-domain Q&A, while Khattab [53] proposed ColBERT and Santhanam [54] proposed ColBERTv2; these greatly surpass traditional retrieval methods in retrieval accuracy and can compute the semantic association between documents and queries more precisely.
Although dense retrieval has achieved great success, it still faces two major limitations [55,56,57]: (1) the Dense Retrieval (DR) model uses an indexed search pipeline with fixed search procedures (maximum inner product search, MIPS), which makes it difficult to jointly optimize all modules in an end-to-end manner, and (2) learning strategies are often inconsistent with pre-training objectives, which makes it difficult to make full use of the knowledge in pre-trained language models [58]. Such traditional retrieval models focus on “retrieval”: they can obtain accurate information from the entire internet (or other information sources), but usually do not analyze the search results in depth. When users require relatively complex information, they must browse multiple results to find what they need.
With the excellent understanding and generation abilities of large language models, the information retrieval paradigm has gradually shifted toward a generative approach [57,59]. This paradigm has been very successful but encounters the “hallucination” problem [10] when processing queries beyond its training data or when current data are required. Some researchers have therefore proposed Retrieval-Augmented Generation (RAG) to solve this problem. RAG combines retrieval-based and generative models [14]: it enhances the LLM’s capabilities by retrieving relevant documents from an external knowledge base through semantic similarity calculations. By referencing external knowledge, RAG effectively reduces the generation of factually incorrect content [60]. Early retrieval-augmented systems integrated retrieval and generation into an end-to-end system [14,52], using customized datasets to fine-tune small-scale models [50,61]. By designing different prompts, Yu [62] showed that LLMs can retrieve knowledge already stored in the model through prompting. Khattab [63] proposed Demonstrate–Search–Predict (DSP), a sophisticated pipeline that relies on a frozen language model and a retrieval model to solve knowledge-intensive tasks. These RAG methods are collectively referred to as traditional RAG frameworks.
Current RAG implementations rely on dense vector similarity search as the retrieval mechanism. However, dividing the corpus into text chunks and relying solely on dense retrieval has proved inadequate for complex queries [60]. Luo [64] mixed sparse and dense retrievers to improve retrieval effectiveness, but this kind of approach is still constrained by the limited range of source data defined in advance by the developer. In addition, achieving the granularity needed to answer complex queries within similar chunks of vector space remains a challenge [65]. This inefficiency stems from the method’s inability to selectively locate relevant information, resulting in the retrieval of large amounts of chunked data that may not directly help answer the query. The best RAG systems retrieve only the necessary content, minimizing the inclusion of irrelevant information. This is where knowledge graphs (KGs) can help: they provide a structured and unambiguous representation of entities and relationships that is more precise than retrieval based on vector similarity [66].
There has been considerable research at the intersection of graph-based techniques and LLMs. Giarelis [16] proposed a unified LLM-KG framework to assist fact-checking on public deliberation platforms, which helps address the hallucination and indecision currently displayed by LLMs. Logan [17] built the KGLM, which generates factual sentences by selecting facts from a knowledge graph based on the current context. Luo [18] proposed reasoning on graphs (RoG), which uses relational paths in knowledge graphs as a basis for interpretable reasoning by LLMs. The GraphRAG method was proposed by Hu [15]; it uses an LLM to create a knowledge graph, combines the knowledge graph with RAG, and applies different prompt enhancement methods at query time. GraphRAG has shown significant advantages in processing hierarchical data and outperforms traditional RAG methods.

3. Methods

From the perspective of administrative divisions, all geographical entities are organized hierarchically, ordered from macro to micro as “continent–country–province–city–county–township”. The main advantage of this layered approach is its ease of management and retrieval. For example, when planning a travel itinerary, an individual usually takes an interest in a specific scenic spot first, and then gradually refines that interest to the region the spot belongs to, the specific activities within the region, and the monetary and time costs those activities require, among other elements of differing granularity [67]. Different tourists require different granularities of tourism-resource retrieval, which calls for new methods that provide more efficient and effective service recommendations as user retrieval granularity increases. This has long been a core issue in the field of service computing [68].
Further, given the personalized habits of users when working with LLMs, it has become particularly important to meet the needs of different users regarding search granularity. Considering the combination of LLMs and knowledge graphs, the integration of different levels of travel information is crucial to improving user experience. When querying tourism information, users may show diversified needs on a spatial scale. For example, they may pay attention to the spatial relationships between scenic spots, between scenic spots and hotel facilities and between scenic spots and food facilities. Therefore, it is particularly crucial to conduct multi-level tourism resource retrieval according to different use intentions. This multi-level retrieval can not only meet the personalized needs of users for tourism information but also improve the accuracy and efficiency of service recommendations.
The overall structure of the proposed TravelRAG framework is shown in Figure 1. The implementation of TravelRAG involves several key steps: document segmentation, entity extraction, entity linking, and community construction. In this process, we systematically extract entities such as attractions, hotels, restaurants, and activities from documents, link these entities and then organize them into communities based on their respective types. Subsequently, community reports are generated. During the retrieval phase, TravelRAG enables the LLM to efficiently search within these constructed communities based on user queries, allowing for the generation of accurate responses to tourism-related inquiries.

3.1. Travel Graph Construction

3.1.1. Document Chunking

As shown in Figure 2, tourism-related documents often take the form of long texts, where entities mentioned across the document are typically spread over multiple paragraphs and are closely interrelated. To maintain as much contextual integrity as possible, we divide the text into segments that align with the optimal context length for efficient LLM processing. Traditional segmentation methods typically split documents into fixed-length chunks based on predefined values, with no shared context between these segments. This can lead to errors in entity recognition by the LLM, especially when the same entities appear across different segments with shared meanings. To mitigate this issue, we implemented a sliding window approach that incorporates overlapping context paragraphs between segments. This method significantly improves the LLM’s ability to consistently recognize the same entities across different sections of the document.
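As an illustration of this step, the sketch below implements sliding-window chunking with overlapping context; the chunk and overlap sizes are illustrative placeholders rather than the values used in our experiments:

```python
def chunk_with_overlap(text: str, chunk_size: int = 1200, overlap: int = 200) -> list[str]:
    """Split a long travel note into segments that share `overlap` characters
    of context, so an entity near a boundary appears in adjacent chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

In practice, the chunk size would be chosen to match the LLM’s effective context window, trading off contextual integrity against the number of chunks to process.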

3.1.2. Entity Extraction

As shown in Figure 3, after completing document segmentation, the LLM will extract entities from the source text as graph nodes. At this stage, we design various prompts to guide the LLM to extract as many different types of entities as possible. Given the variety of tourism-related entities, we need to create a list of entity types to help the LLM extract categories such as attractions, hotels, food, and recreational facilities.
Because the pre-trained model has not been specifically fine-tuned for extraction tasks, we provide several example cases to the LLM as demonstrations. Demonstrating entity extraction examples can significantly improve the LLM’s performance in this task. In addition to extracting entities, the LLM will analyze the context surrounding the entities and generate a summary description, which will serve as the basis for the subsequent entity linking step.
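A minimal sketch of such a few-shot extraction prompt is shown below; the entity-type list, the example case and the JSON schema are illustrative stand-ins, not the actual prompts used in TravelRAG:

```python
import json

ENTITY_TYPES = ["attraction", "hotel", "food", "recreation"]  # illustrative list

FEW_SHOT = (
    'Example text: "We stayed at the Riverside Inn and tried stinky tofu nearby."\n'
    'Example output: [{"name": "Riverside Inn", "type": "hotel", '
    '"description": "inn where the author stayed"}, {"name": "stinky tofu", '
    '"type": "food", "description": "local snack tried near the inn"}]'
)

def build_extraction_prompt(chunk: str) -> str:
    """Assemble a few-shot prompt asking the LLM for structured entities."""
    return (
        f"Extract every entity of the types {ENTITY_TYPES} from the text below. "
        "For each entity, add a one-sentence description based on its context. "
        f"Answer with a JSON list only.\n\n{FEW_SHOT}\n\nText:\n{chunk}"
    )

def parse_entities(llm_output: str) -> list[dict]:
    """Parse the model's reply; raises ValueError if the JSON format was broken."""
    return json.loads(llm_output)
```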

3.1.3. Entity Linking

Knowledge graphs often encounter issues of data sparsity and imbalance. Designing triples for entities with hierarchical relationships is a crucial component of knowledge graph construction, as it can enrich the semantic information of entities and relationships [69]. To address this, researchers leverage the latent semantic information among hierarchically related entities to learn their representations [70]. For example, TransC [71] models entities of different hierarchical levels within the same embedding space, while JOIE [72] assigns separate dimensions to individual instances. Li [73] proposed a three-layer DIK knowledge graph architecture, which adopts a systematic approach to expand and update knowledge at each layer of the knowledge graph.
As shown in Figure 4, when constructing a tourism knowledge graph, the diverse range of entity types makes integrating all these entities into a single, large knowledge graph a daunting and complex task. Moreover, having an excessive number of graph nodes at the same level can complicate the LLM’s retrieval process. For example, if a user query focuses specifically on attractions, searching within a single layer that aggregates all tourism-related information may increase the risk of generating hallucinations.
In the context of same-level entity linking within a knowledge graph, the primary objective is to merge entities that reside at the same hierarchical level and possess similar content or closely related meanings. This approach entails calculating similarity metrics to facilitate the merging or association of analogous entities within this level, thereby aiding in the identification of duplicate information and standardizing the representation of comparable concepts. Such a methodology not only mitigates redundant information but also enhances the overall compactness of the knowledge graph, ultimately leading to more consistent and accurate retrieval outcomes, the formula for which is shown in Equation (1):
E = \left\{ (n, t, c_t) \;\middle|\; \frac{\phi(C_e) \cdot \phi(t)}{\lVert \phi(C_e) \rVert \, \lVert \phi(t) \rVert} \ge \delta_{r1} \right\} \quad (1)
where E represents a collection of entities of the same type; C_e represents the relevant content of the entity, composed as C_e = [n, c_t]; t represents the candidate entity type; n is the name of the entity; c_t is the entity's source context; \phi denotes the function that converts content into a vector; and \delta_{r1} is the predefined cosine-similarity threshold.
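The sketch below illustrates the thresholded cosine-similarity test of Equation (1); the hash-seeded embedding function is a deterministic stand-in for a real embedding model (such as BGE), so the threshold value here is purely illustrative:

```python
import zlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Deterministic stand-in for phi; a real system would call an embedding model."""
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.standard_normal(768)
    return v / np.linalg.norm(v)            # unit norm, so dot product = cosine

def assign_type(name: str, context: str, types: list[str], delta_r1: float = 0.35):
    """Eq. (1): attach the entity to the candidate type t whose embedding is
    most similar to the embedding of C_e = [name, context], if >= delta_r1."""
    c_e = embed(name + " " + context)
    sims = {t: float(c_e @ embed(t)) for t in types}
    best = max(sims, key=sims.get)
    return best if sims[best] >= delta_r1 else None
```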

3.1.4. Relationship Linking

As shown in Figure 5, after aggregating different entities of the same type, the next step is to link these aggregates with higher-level entities. For example, once we form a “hotel” aggregate from individual hotel entities, we then establish relational links between this aggregate and attraction entities at the first level.
Cross-level associations in knowledge graphs serve to establish semantic links between entities across different levels of abstraction. For example, a tourist attraction can be connected to its broader geographic location or to specific activities offered within it by calculating cross-level semantic similarity. When the semantic similarity between entities at different levels meets a predefined threshold, they are linked to indicate a hierarchical or semantic relationship. These hierarchical associations enable the knowledge graph to construct a well-defined parent–child structure, allowing users to navigate from high-level information down to detailed, specific data. This provides a richer query context and deeper information for more comprehensive insights. In contrast to entity linking, relationship linking focuses on the connections between entities across various hierarchical levels. Consequently, semantic similarity is assessed between entities at different levels rather than between an entity and its corresponding type, the formula for which is shown in Equation (2):
R_{e_{L_j} e_{L_i}} = \left\{ (e_{L_i}, e_{L_j}) \;\middle|\; \frac{\phi(C_{L_i}) \cdot \phi(C_{L_j})}{\lVert \phi(C_{L_i}) \rVert \, \lVert \phi(C_{L_j}) \rVert} \ge \delta_{r2} \right\} \quad (2)
where R denotes a collection that aggregates entities at different levels; L denotes the level at which an entity is located; and e denotes an entity within a collection of a given entity type. To reduce entity-linking errors across levels, we use the entity name n and the entity's source context c_t, composed as C = [n, c_t]. \phi denotes the function that converts content into a vector and \delta_{r2} is a predefined threshold. By computing the similarity of entities at different levels, a relationship is linked whenever the similarity reaches the threshold.
In this process, we calculate the degree of both the original and target entities, where the degree refers to the number of edges connected to each entity in the graph. After determining these degrees, we introduce a new variable called “rank”. This rank is computed by summing the degrees of the original and target entities, allowing us to assess the total connectivity between the two entities linked by each edge. This step is crucial for understanding the overall connectivity and significance of relationships within the knowledge graph.
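As a sketch of this degree-and-rank computation, the snippet below builds a toy entity graph with NetworkX and ranks each edge by the summed degrees of its endpoints; the node names are invented examples:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Wuzhen", "West Scenic Zone"), ("West Scenic Zone", "Riverside Inn"),
    ("West Scenic Zone", "Shadow Puppet Show"), ("Wuzhen", "East Scenic Zone"),
])

# Rank of an edge = degree(source) + degree(target): the total connectivity
# of the two entities the edge links, used to gauge its significance.
edge_rank = {(u, v): G.degree(u) + G.degree(v) for u, v in G.edges()}
for (u, v), rank in sorted(edge_rank.items(), key=lambda kv: -kv[1]):
    print(f"{u} -- {v}: rank {rank}")
```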

3.1.5. Community Construction and Summary Reports

When a scenic spot entity is connected with its surrounding facility entities, this group of interconnected entities is considered the “community” of that scenic spot. These scenic spot communities are constructed using the Leiden algorithm [74], a method particularly suited for detecting and refining community structures within complex networks. The calculation formula is shown in Equation (3):
H = \frac{1}{2m} \sum_{c} \left( e_c - \gamma \frac{K_c^2}{2m} \right) \quad (3)
where H is the modularity of the community partition; e_c denotes the actual number of edges within interest community c, and the expected number of edges is given by \frac{K_c^2}{2m}, where K_c is the total degree of the nodes within community c and m is the total number of edges in the entire network. The parameter \gamma controls the resolution: a higher resolution yields more communities, while a lower resolution yields fewer.
This algorithm is employed to generate a hierarchical structure of entity communities, clustering the entities constructed in the previous step. This approach is a method of navigating knowledge at various levels of granularity. After the different types of scenic spot communities are formed, the LLM is used to summarize these communities. The LLM reviews the relevant information associated with the nodes in each community and generates a report. This report includes a brief introduction to the scenic spot and an overview of the primary supporting facilities within the community, such as hotels, restaurants, and recreational amenities.
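For illustration, the snippet below runs the Leiden algorithm on a toy entity graph; it assumes the python-igraph and leidenalg packages, whose RBConfigurationVertexPartition quality function corresponds to the γ-resolution modularity of Equation (3). The graph itself is an invented example:

```python
import igraph as ig
import leidenalg as la  # assumes: pip install python-igraph leidenalg

# Toy entity graph; in TravelRAG the nodes come from the linked knowledge graph.
g = ig.Graph()
g.add_vertices(["Wuzhen", "West Zone", "Inn A", "Cafe B", "East Zone", "Museum C"])
g.add_edges([(0, 1), (1, 2), (1, 3), (2, 3), (0, 4), (4, 5)])

# Higher resolution_parameter (gamma) yields more, smaller communities.
partition = la.find_partition(
    g, la.RBConfigurationVertexPartition, resolution_parameter=1.0
)
for cid, members in enumerate(partition):
    print(cid, [g.vs[i]["name"] for i in members])
```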

3.2. Greedy Matching Retrieval Method

The overall structure of the retrieval pipeline is shown in Figure 6. In the previous section (document chunking and entity extraction), we created the document vector database and entity vector database simultaneously. The LLM extracts the possible entities from the query conditions, and then queries the entity data with the highest semantic similarity in the vector database, as shown in Equation (4):
\mathrm{Sim}(q_i, c_j) = \frac{q_i \cdot c_j}{\lVert q_i \rVert \, \lVert c_j \rVert} \quad (4)
where \mathrm{Sim}(q_i, c_j) denotes the similarity between the query q_i and the candidate c_j; q_i represents the i-th vectorized query entity and c_j represents the j-th vectorized candidate entity.
In the matching process after querying, a greedy matching strategy selects, for each query entity, the entity vector in the database most similar to it as the optimal match. An importance factor weighting the impact of specific terms on the overall query semantics is then applied, the matching score is calculated, and the best-matching items are returned. The calculation process is shown in Equations (5) and (6):
\mathrm{Score}(q_i, c_i^*) = \omega_i \cdot \mathrm{Sim}(q_i, c_i^*) \quad (5)
\mathrm{TotalScore}(Q, C) = \sum_{i=1}^{n} \omega_i \cdot \mathrm{Sim}(q_i, c_i^*) \quad (6)
where \mathrm{Score}(q_i, c_i^*) represents the matching score of q_i and c_i^*; c_i^* is the best match of q_i; \omega_i represents the importance factor of q_i; \mathrm{TotalScore} represents the overall matching score; Q = \{q_1, q_2, \ldots, q_n\} represents the set of all vectorized query entities; and C = \{c_1, c_2, \ldots, c_n\} represents the set of entity vectors in the entity vector database.
After receiving the query result, the LLM obtains node data related to these entities from the knowledge graph, including the node itself and the node hierarchy. Then, the communities, text units and relationships most relevant to these node data are found in community reports, text databases, and knowledge graphs. Finally, the LLM presents the query information to the user as a fluent natural language answer.
This method improves the efficiency of the LLM’s search process, raises the answer hit rate by prioritizing search within the relevant communities, and effectively reduces hallucination.
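A compact sketch of this greedy matching procedure, following Equations (4)–(6), is given below; the query vectors, candidate vectors and importance weights are random placeholders:

```python
import numpy as np

def greedy_match(query_vecs, weights, candidate_vecs):
    """Eqs. (4)-(6): greedily pick the candidate c_i* most similar to each
    query entity q_i, then sum the importance-weighted similarity scores."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    matches, total = [], 0.0
    for q, w in zip(query_vecs, weights):
        sims = [cos(q, c) for c in candidate_vecs]
        best = int(np.argmax(sims))       # greedy choice of c_i*
        matches.append(best)
        total += w * sims[best]           # Score(q_i, c_i*) = w_i * Sim(q_i, c_i*)
    return matches, total                 # total = TotalScore(Q, C)

rng = np.random.default_rng(0)
Q, C = rng.standard_normal((2, 768)), rng.standard_normal((5, 768))
print(greedy_match(Q, [0.7, 0.3], C))
```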

4. Experiment and Results

4.1. Experimental Data

4.1.1. RAG Data

Tourism has become a popular leisure activity, and there are many factors that influence destination choice, including advertising activities of the destination country, promotional activities of travel agencies, airfare discounts, movies or TV programs, personal preferences and so on [75]. Due to the prevalence of network technology, the internet has become the main channel for people to find and disseminate information. Text data, as one of the main formats of internet data, are a form in which users express opinions and evaluations effectively and widely. A large number of online reviews for tourism websites, hotels, and services indicate this [76]. The travel decisions of tourists are greatly influenced by the travel experiences of others, which are presented in the form of travel reviews or blogs [77]. These texts can provide valuable insights to potential tourists and help them optimize their destination choices and explore travel routes [78]. It is worth noting that these articles often include locations and details not covered by traditional tourist routes, making them valuable resources for revealing little-known tourist attractions and information [79]. Therefore, we collected travel blogs on “Wuzhen” from domestic tourism websites such as Ctrip and MaFengWo in China, which were written by real tourists who had visited Wuzhen.
We used a crawler to retrieve text from these websites. Because internet text usually has no fixed format, we used code to clean the text, deleting unnecessary content and incorrect characters while retaining the basic structure of each article, namely “title–author–time–content”. The parameters of the final dataset are shown in Table 1.
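The cleaning step can be sketched as follows; the regular expressions are illustrative, and the site-specific parsing that recovers the “title–author–time–content” structure is omitted:

```python
import re

def clean_blog(raw: str) -> str:
    """Illustrative cleanup of one crawled travel note: drop leftover HTML
    tags, control characters and runs of blank lines."""
    text = re.sub(r"<[^>]+>", " ", raw)                   # residual HTML tags
    text = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", text)  # incorrect characters
    text = re.sub(r"[ \t]+", " ", text)                   # collapse spaces
    text = re.sub(r"\n{3,}", "\n\n", text)                # collapse blank runs
    return text.strip()
```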

4.1.2. Test Data

We employed a large language model to assist in generating the test sets. Our prepared RAG dataset was input into the model, which read and processed the data and then generated questions, answers and the corresponding source text from which each answer was derived. The Q&A pairs in the test sets are categorized into four types: Simple, MultiContent, Reasoning, and Conditional, each representing a different level of complexity and different requirements for the model’s understanding and response generation. The distribution of question counts is shown in Table 2.
  • Simple—This type of question answering indicates that the LLM can directly extract the answer from the context.
  • MultiContent—This type of question answering indicates that the LLM needs to examine multiple documents to synthesize an answer.
  • Reasoning—This type of question answering indicates that after receiving the query, the LLM must perform some reasoning based on the source text to provide an answer.
  • Conditional—This type of question answering indicates that the LLM may need to respond under specific constraints.

4.2. Environments

Our experimental environment is shown in Table 3.

4.3. Models

We used the Qwen2 [80] and GPT-4 [81] series models as benchmark models. The Qwen2 series, developed by Alibaba Cloud, consists of large-scale pretrained language models with parameter sizes of 7B, 57B and 72B. A significant portion of the training data for this series consists of Chinese corpora, resulting in strong performance on Chinese tasks. The GPT-4 series, developed by OpenAI, represents the latest generation of large-scale AI language models, featuring 200B parameters. The GPT-4 model has enhanced capabilities for understanding long texts, enabling the generation of more coherent and consistent outputs. When constructing the traditional RAG method as a baseline, we selected the BGE-embedding model for text-to-vector conversion in the text embedding stage [82]. The BGE-embedding model outperforms other community models in semantic retrieval accuracy and overall semantic representation in both Chinese and English. Furthermore, it supports processing at multiple levels of granularity, accommodating input text with varying degrees of detail.

4.4. Evaluation Metrics

We used RAGAS (Retrieval-Augmented Generation Assessment), a reference-free evaluation framework for RAG pipelines [83]. Evaluating RAG architectures is challenging because several aspects must be considered: the ability of the retrieval system to identify relevant, focused contextual passages, the ability of the LLM to use those passages faithfully, and the quality of the generation itself. We evaluated the retrieval effectiveness of the large language models using the following four metrics.
Faithfulness evaluates the factual consistency of the generated answer with the provided context, based on the annotated ground truth and the context retrieved by the LLM from the original text. The calculation formula is shown in Equation (7):
\mathrm{Faithfulness} = \frac{|I|}{|C|} \quad (7)
A higher score indicates greater alignment between the answer and the retrieved context. In this formula, |I| represents the number of claims in the generated answer that can be inferred from the given context and |C| denotes the total number of claims in the generated answer.
Answer Relevancy evaluates how closely the generated answer matches the user-provided query; a higher score indicates closer alignment between the answer and the query. This score is calculated using cosine similarity, as shown in Equation (8):
\mathrm{AnswerRelevancy} = \frac{1}{N} \sum_{i=1}^{N} \cos\left(E_{g_i}, E_o\right) \quad (8)
where E g i represents the embedding of the generated question i, E o represents the embedding of the original question and N denotes the total number of generated questions.
Contextual Precision assesses whether the factually relevant entries presented in the context are ranked higher. Ideally, all relevant document segments should appear at the top. This metric is calculated based on the question and its corresponding contexts, with values ranging from 0 to 1, where a higher score indicates greater precision. The calculation formulas are shown in Equations (9) and (10):
\mathrm{ContextualPrecision@K} = \frac{\sum_{k=1}^{K} \left(\mathrm{Precision@}k \times v_k\right)}{|\text{Relevant Items in Top } K|} \quad (9)
\mathrm{Precision@}k = \frac{\mathrm{TruePositives@}k}{\mathrm{TruePositives@}k + \mathrm{FalsePositives@}k} \quad (10)
where \mathrm{ContextualPrecision@K} assesses the ranking of factually relevant entries within the top-K context results, emphasizing that relevant document segments should be placed at higher ranks; K denotes the number of chunks considered in the context; and v_k \in \{0, 1\} indicates whether the k-th ranked chunk is relevant (1) or not (0).
Context Recall measures the alignment between the retrieved context and the ground truth. Its values range from 0 to 1, where a higher score indicates better performance. In order to estimate context recall based on the ground truth, each sentence in the ground truth is analyzed to determine whether it can be matched with the retrieved context. Ideally, all sentences in the ground truth should be matched with the retrieved context. The calculation formula is shown in Equation (11):
\mathrm{ContextRecall} = \frac{|A|}{|T|} \quad (11)
A higher score indicates better recall performance. In this formula, |A| represents the number of claims in the ground truth that can be matched with the retrieved context and |T| denotes the total number of claims in the ground truth. Ideally, every claim in the ground truth should have a corresponding match within the retrieved context.
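To make these metrics concrete, the sketch below computes them from pre-counted claim statistics and a relevance ranking; answer relevancy (Equation (8)) additionally requires an embedding model and is omitted. In the actual RAGAS-style evaluation, the claim extraction and matching are themselves performed by an LLM:

```python
def faithfulness(inferable_claims: int, total_claims: int) -> float:
    """Eq. (7): share of answer claims that the retrieved context supports."""
    return inferable_claims / total_claims

def context_recall(matched_gt_claims: int, total_gt_claims: int) -> float:
    """Eq. (11): share of ground-truth claims found in the retrieved context."""
    return matched_gt_claims / total_gt_claims

def contextual_precision_at_k(relevance: list[int]) -> float:
    """Eqs. (9)-(10): relevance[k-1] is 1 if the k-th ranked chunk is relevant."""
    hits, acc = 0, 0.0
    for k, v in enumerate(relevance, start=1):
        hits += v
        acc += (hits / k) * v     # Precision@k, counted only at relevant ranks
    return acc / max(sum(relevance), 1)

print(contextual_precision_at_k([1, 0, 1, 1]))  # relevant chunks at ranks 1, 3, 4
```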

4.5. Baseline

To evaluate the effectiveness of our proposed model, we compare it with the NaiveRAG framework. The NaiveRAG research paradigm (which we call “traditional RAG” in this article) represents the earliest approach, following the traditional indexing–retrieval–generation process, also known as the “retrieve–read” framework [84]. The traditional RAG construction process can be summarized in the following three points:
Indexing: Raw data in formats like PDF, HTML, Word and Markdown are cleaned, converted to plain text and divided into smaller chunks. These chunks are encoded into vectors using an embedding model and stored in a vector database, enabling efficient similarity searches.
Retrieval: The system encodes each user query as a vector and computes the similarity with stored vectors, retrieving the top K most similar chunks as context for the response.
Generation: The query and retrieved chunks are combined into a prompt for a large language model, which generates a response. In a conversation, previous dialogue history can be included to support multi-turn interactions.
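The three steps can be summarized in the following minimal sketch of a traditional RAG pipeline; the embedding function is a deterministic stand-in for an embedding model such as BGE, and the chunks and query are invented examples:

```python
import zlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for the text-embedding model (e.g., BGE) used in the baseline."""
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.standard_normal(768)
    return v / np.linalg.norm(v)          # unit norm, so dot product = cosine

# Indexing: cleaned chunks are encoded and stored in an in-memory "vector DB".
chunks = ["Wuzhen West Scenic Zone is lit up at night...",
          "The Riverside Inn sits beside the canal..."]
index = np.stack([embed(c) for c in chunks])

# Retrieval: encode the query and take the top-K most similar chunks.
def retrieve(query: str, k: int = 1) -> list[str]:
    sims = index @ embed(query)
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Generation: combine query and retrieved chunks into the LLM prompt.
def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Which attractions in Wuzhen can be visited at night?"))
```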

4.6. Results

4.6.1. Travel Knowledge Graph

As illustrated in Figure 7, we constructed a knowledge graph of tourist attractions using the method described above to improve the reasoning capabilities of large language models in retrieval tasks. The knowledge graph was constructed using GPT-4 and visualized with Gephi (https://gephi.org/, accessed on 12 November 2024, version 0.1.0), an open-source tool for graph and network analysis that handles large network diagrams well and is well suited to studying complex network structures. Because of its multi-layer structure, the knowledge graph was projected into a two-dimensional layout, with entities at different hierarchical levels represented by circles of varying sizes. Since the input corpus consists of travel notes about Wuzhen, the top-level entity is “Wuzhen”; the interest communities for tourist attractions within Wuzhen occupy the next level, and the specific entities within those attractions appear at lower levels.
Table 4 shows the number of entities, relationships and triples extracted. Figure 8 presents the entities extracted from the travel notes corresponding to the figures, as well as the relationships among these entities. During the extraction process, the LLM also provides detailed descriptions for the extracted entities. When performing entity linking, the LLM simultaneously analyzes and summarizes the relationships among the entities to enhance the retrieval efficiency in future tasks.

4.6.2. Metric Comparison

Our experimental results are shown in Table 5. Since the source text is in Chinese, we used the Qwen2 model, known for its strong support for Chinese, and GPT-4 as the foundation models in the comparative experiments. Compared to the traditional RAG pipeline, the proposed TravelRAG framework demonstrates significant improvements in retrieval performance, indicating that leveraging a tourism knowledge graph enables the LLM to enhance the retrieval effectiveness for tourist attractions. However, TravelRAG underperforms in the context recall metric, which warrants further investigation. During the retrieval process, the LLM first queries the descriptions of entities and their relationships stored in the tourism knowledge graph. These descriptions are summaries generated by the LLM after reading the travel notes, rather than direct excerpts from the original text. This discrepancy results in a lower context recall value compared to the traditional RAG pipeline. Detailed case analyses will be provided later in Section 4.6.4.

4.6.3. Ablation Study

We conducted ablation experiments with varying model parameter sizes to investigate the impact of parameter size on the performance of TravelRAG. The experimental results, as shown in the Table 6, indicate that as the number of model parameters increases, the accuracy of retrieval performance improves progressively across the four evaluation metrics. Analyzing the construction process of TravelRAG, we find that the need to provide lengthy prompts and sample outputs to the LLM before each stage of knowledge graph construction presents significant challenges for the model’s comprehension. For instance, during the entity extraction phase, the LLM must extract entities and their attributes from unstructured text and output them into a structured JSON format. Smaller models may struggle to extract entities comprehensively, and models not fine-tuned for generating data in JSON format may produce incorrect outputs, potentially leading to failure in subsequent stages of the construction process that depend on specific data formats.
Because the proportions of different languages vary across the pre-training corpora of large language models, their retrieval performance for Chinese queries differs. To investigate these differences between models with different language capabilities, we designed four categories of questions: Simple, MultiContent, Reasoning, and Conditional. The two methods were tested on the four question types by assessing answer accuracy, i.e., how closely the generated answers match the ground-truth answers, as detailed in Equations (12)–(14):
\mathrm{Score} = \mathrm{Similarity}(\mathrm{Answer}, \mathrm{GroundTruth}), \quad 0 \le \mathrm{Similarity} \le 1 \quad (12)
f(\mathrm{score}) = \begin{cases} \mathrm{TrueAnswer}, & \mathrm{score} \ge 0.5 \\ \mathrm{FalseAnswer}, & \mathrm{score} < 0.5 \end{cases} \quad (13)
p = \frac{\text{Number of True Answers}}{\text{Number of Questions}} \quad (14)
where \mathrm{Score} indicates the degree of similarity between the answer and the ground truth, ranging from 0 to 1. A threshold of 0.5 filters the results: a score of 0.5 or higher is considered a true answer, and a score below 0.5 a false answer. p denotes the accuracy of the answers.
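The accuracy computation of Equations (12)–(14) reduces to a thresholded count, as in this sketch (the similarity values are invented):

```python
def answer_accuracy(similarities: list[float], threshold: float = 0.5) -> float:
    """Eqs. (12)-(14): an answer is true when its similarity to the ground
    truth is at least the threshold; p is the fraction of true answers."""
    true_answers = sum(1 for s in similarities if s >= threshold)
    return true_answers / len(similarities)

print(answer_accuracy([0.9, 0.4, 0.7, 0.55]))  # -> 0.75
```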
As shown in Table 7, for simple questions, the performance gap between the RAG systems built with the two models is small, but for other categories, the results are mixed. In MultiContent retrieval, the RAG system built using Qwen2 performs better, while in Reasoning and Conditional retrieval, the GPT-4-based system performs better. Thus, we conclude that models pre-trained with a higher proportion of Chinese corpus are more effective when answers need to be retrieved and aggregated from multiple sources, whereas models with larger parameter sizes excel in tasks involving logical reasoning and multi-constraint queries.

4.6.4. Case Analysis

As illustrated in Figure 9, we conducted a case study to thoroughly analyze how the proposed multi-layer tourism knowledge graph guides the retrieval process in the LLM. We posed the same query, which included conditional constraints, to both the traditional RAG pipeline and TravelRAG framework; the search was limited to attractions “suitable for families and students”.
When a user submits a query, the traditional RAG pipeline searches its vector database for the most relevant match but lacks the ability to perform reasoning-based retrieval that aligns with the user’s specific needs. As shown in Figure 9, the response from the traditional RAG system is a complete sentence but contains factual inaccuracies, with incorrect terms highlighted in red. In contrast, TravelRAG’s response is more logically structured. After constructing the tourism knowledge graph and using it as the information source, TravelRAG retrieves data step by step from the multi-layer knowledge graph, guided by the specified conditions. It begins by searching for tourist attractions within the Xitang community and then lists relevant spots that meet the search criteria, explaining the details of each site so users can judge whether the results match their needs. The experimental results show that the TravelRAG framework can effectively respond to queries with specific constraints.

5. Conclusions

A framework for tourist attraction retrieval is presented in this paper that integrates knowledge graphs with a large language model (LLM). The framework leverages an LLM to construct a knowledge graph for the tourism domain from unstructured text gathered from social platforms. By organizing entities into hierarchical communities, where each community contains all relevant entities, the framework builds a multi-layer knowledge graph. This knowledge graph also acts as the retrieval source for the LLM, guiding the model to generate answers based on this structured information. This approach enhances the accuracy of the LLM in responding to tourism-related queries and addresses the hallucinated or fabricated information that LLMs often produce. Travelogues shared by users on social platforms often lack a standardized format, and their writing styles vary widely, posing substantial reading-comprehension challenges for smaller language models. Additionally, LLMs struggle to update their internal knowledge promptly as the information available online grows rapidly. Compared to traditional Retrieval-Augmented Generation (RAG) pipelines, the proposed TravelRAG framework demonstrates significant improvements in retrieval accuracy. The experimental results show that, with the support of the multi-layer knowledge graph, TravelRAG provides users with more accurate, detailed and contextually relevant responses to their queries. The main limitations of TravelRAG are as follows: on one hand, although LLMs have improved at understanding long contexts, they still struggle to accurately parse travelogues, which are rich in complex semantic information; on the other hand, the construction of the knowledge graph depends on manually crafted prompts, which increases the labor cost of creating prompt templates.
The focus of future work will be exploring how to further improve the reasoning ability of LLMs under complex conditions and for multi-hop problems when a knowledge graph is used as their information source.

Author Contributions

Sihan Song contributed to the study design and wrote the manuscript; Chuncheng Yang presented the original idea, revised the manuscript and provided financial support; Li Xu, Haibin Shang, Zhuo Li and Yinghui Chang were involved in drafting and checking the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education (Grant No. GLAB2022ZR01) and the Fundamental Research Funds for the Central Universities.

Data Availability Statement

All data that support the findings of the study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank the three anonymous reviewers for the positive, constructive and valuable comments and suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. He, H.; Tuo, S.; Lei, K.; Gao, A. Assessing quality tourism development in China: An analysis based on the degree of mismatch and its influencing factors. Environ. Dev. Sustain. 2024, 26, 9525–9552. [Google Scholar] [CrossRef]
  2. Gao, J.; Chou, P.; Yu, L.; Huang, C.; Lu, F. Explainable Tourist Attraction Recommendation Based on a Tourism Knowledge Graph. Sci. China Inf. Sci. 2020, 50, 1055–1068. [Google Scholar]
  3. Lai, J.; Lansley, G.; Haworth, J.; Cheng, T. A name-led approach to profile urban places based on geotagged Twitter data. Trans. GIS 2020, 24, 858–879. [Google Scholar] [CrossRef]
  4. Zhou, Q.; Sotiriadis, M.; Shen, S. Using TikTok in tourism destination choice: A young Chinese tourists’ perspective. Tour. Manag. Perspect. 2023, 46, 101101. [Google Scholar] [CrossRef]
  5. Yhee, Y.; Goo, J.; Koo, C.; Chung, N. Meme-affordance tourism: The power of imitation and self-presentation. Decis. Support Syst. 2024, 179, 114177. [Google Scholar] [CrossRef]
  6. Guo, W.; Zhou, S.; Zhang, M.; Zhang, X.; Ai, S.; Xie, S.; Li, Y.; Chen, X.; Zhang, X.; Yu, Z.; et al. The Digital Practice of Internet-Famous Sites and the Production of New Spatial Forms; Tongfang Knowledge Network (Beijing) Technology Co., Ltd.: Beijing, China, 2024; pp. 1–22. [Google Scholar] [CrossRef]
  7. Zannettou, S.; Caulfield, T.; Blackburn, J.; Cristofaro, E.D.; Sirivianos, M.; Stringhini, G.; Suarez-Tangil, G. On the Origins of Memes by Means of Fringe Web Communities. arXiv 2018, arXiv:1805.12512. [Google Scholar]
  8. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. arXiv 2020, arXiv:2005.14165. [Google Scholar]
  9. Dhingra, B.; Cole, J.R.; Eisenschlos, J.M.; Gillick, D.; Eisenstein, J.; Cohen, W.W. Time-aware language models as temporal knowledge bases. Trans. Assoc. Comput. Linguist. 2022, 10, 257–273. [Google Scholar] [CrossRef]
  10. Perković, G.; Drobnjak, A.; Botički, I. Hallucinations in llms: Understanding and addressing challenges. In Proceedings of the 2024 47th MIPRO ICT and Electronics Convention (MIPRO), Opatija, Croatia, 20–24 May 2024; pp. 2084–2088. [Google Scholar]
  11. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv 2021, arXiv:2106.09685. [Google Scholar]
  12. Chen, Y.; Qian, S.; Tang, H.; Lai, X.; Liu, Z.; Han, S.; Jia, J. LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models. arXiv 2024, arXiv:2309.12307. [Google Scholar]
  13. Huang, P.S.; He, X.; Gao, J.; Deng, L.; Acero, A.; Heck, L. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, San Francisco, CA, USA, 27 October–1 November 2013; pp. 2333–2338. [Google Scholar]
  14. Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; tau Yih, W.; Rocktäschel, T.; et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv 2021, arXiv:2005.11401. [Google Scholar]
  15. Hu, Y.; Lei, Z.; Zhang, Z.; Pan, B.; Ling, C.; Zhao, L. GRAG: Graph Retrieval-Augmented Generation. arXiv 2024, arXiv:2405.16506. [Google Scholar]
  16. Giarelis, N.; Mastrokostas, C.; Karacapilidis, N. A Unified LLM-KG Framework to Assist Fact-Checking in Public Deliberation. In Proceedings of the First Workshop on Language-Driven Deliberation Technology (DELITE)@ LREC-COLING 2024, Torino, Italy, 20–25 May 2024; pp. 13–19. [Google Scholar]
  17. Logan IV, R.L.; Liu, N.F.; Peters, M.E.; Gardner, M.; Singh, S. Barack’s wife Hillary: Using knowledge-graphs for fact-aware language modeling. arXiv 2019, arXiv:1906.07241. [Google Scholar]
  18. Luo, L.; Li, Y.F.; Haffari, G.; Pan, S. Reasoning on graphs: Faithful and interpretable large language model reasoning. arXiv 2023, arXiv:2310.01061. [Google Scholar]
  19. Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, Banff, AB, Canada, 8–12 May 2007; pp. 697–706. [Google Scholar]
  20. Carlson, A.; Betteridge, J.; Kisiel, B.; Settles, B.; Hruschka, E.R.; Mitchell, T.M. Toward an architecture for never-ending language learning. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, Atlanta, GA, USA, 11–15 July 2010; pp. 1306–1313. [Google Scholar]
  21. Hogan, A.; Blomqvist, E.; Cochez, M.; d’Amato, C.; Melo, G.D.; Gutierrez, C.; Kirrane, S.; Gayo, J.E.L.; Navigli, R.; Neumaier, S.; et al. Knowledge graphs. Acm Comput. Surv. (CSUR) 2021, 54, 1–37. [Google Scholar] [CrossRef]
  22. Fellbaum, C. WordNet: An Electronic Lexical Database; MIT Press: Cambridge, MA, USA, 1998; Volume 2, pp. 678–686. [Google Scholar]
  23. Bin, Y.; Xiao-Ran, L.; Ning, L.; Yue-Song, Y. Using information content to evaluate semantic similarity on HowNet. In Proceedings of the 2012 Eighth International Conference on Computational Intelligence and Security, Guangzhou, China, 17–18 November 2012; pp. 142–145. [Google Scholar]
  24. Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. Dbpedia–A large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web 2015, 6, 167–195. [Google Scholar] [CrossRef]
  25. Xiong, C.; Power, R.; Callan, J. Explicit semantic ranking for academic search via knowledge graph embedding. In Proceedings of the 26th International Conference on World Wide Web, Perth, WA, Australia, 3–7 April 2017; pp. 1271–1279. [Google Scholar]
  26. Zhu, G.; Iglesias, C.A. Sematch: Semantic similarity framework for knowledge graphs. Knowl.-Based Syst. 2017, 130, 30–32. [Google Scholar] [CrossRef]
  27. Zhang, Y.; Dai, H.; Kozareva, Z.; Smola, A.; Song, L. Variational reasoning for question answering with knowledge graph. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  28. Wang, X.; He, X.; Cao, Y.; Liu, M.; Chua, T.S. Kgat: Knowledge graph attention network for recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 950–958. [Google Scholar]
  29. Wang, X.; Huang, T.; Wang, D.; Yuan, Y.; Liu, Z.; He, X.; Chua, T.S. Learning intents behind interactions with knowledge graph for recommendation. In Proceedings of the WWW ’21: The Web Conference, Ljubljana, Slovenia, 19–23 April 2021; pp. 878–887. [Google Scholar]
  30. Dong, Q.; Li, L.; Dai, D.; Zheng, C.; Ma, J.; Li, R.; Xia, H.; Xu, J.; Wu, Z.; Liu, T.; et al. A survey on in-context learning. arXiv 2022, arXiv:2301.00234. [Google Scholar]
  31. Hu, Y.; Zou, F.; Han, J.; Sun, X.; Wang, Y. Llm-tikg: Threat intelligence knowledge graph construction utilizing large language model. Comput. Secur. 2024, 145, 103999. [Google Scholar] [CrossRef]
  32. Wei, Y.; Huang, Q.; Kwok, J.T.; Zhang, Y. Kicgpt: Large language model with knowledge in context for knowledge graph completion. arXiv 2024, arXiv:2402.02389. [Google Scholar]
  33. Zhang, Y.; Du, T.; Ma, Y.; Wang, X.; Xie, Y.; Yang, G.; Lu, Y.; Chang, E.C. AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models. arXiv 2024, arXiv:2405.04753. [Google Scholar]
  34. Agrawal, G.; Kumarage, T.; Alghamdi, Z.; Liu, H. Can Knowledge Graphs Reduce Hallucinations in LLMs: A Survey. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, 16–21 June 2024; Duh, K., Gomez, H., Bethard, S., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; Volume 1, pp. 3947–3960. [Google Scholar] [CrossRef]
  35. Petroni, F.; Rocktäschel, T.; Riedel, S.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A. Language Models as Knowledge Bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Inui, K., Jiang, J., Ng, V., Wan, X., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 2463–2473. [Google Scholar] [CrossRef]
  36. Dai, D.; Dong, L.; Hao, Y.; Sui, Z.; Chang, B.; Wei, F. Knowledge Neurons in Pretrained Transformers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; Volume 1, pp. 8493–8502. [Google Scholar] [CrossRef]
  37. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.u.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
  38. Choudhary, N.; Reddy, C.K. Complex Logical Reasoning over Knowledge Graphs using Large Language Models. arXiv 2024, arXiv:2305.01157. [Google Scholar] [CrossRef]
  39. Wang, S.; Wei, Z.; Xu, J.; Li, T.; Fan, Z. Unifying Structure Reasoning and Language Pre-Training for Complex Reasoning Tasks. IEEE/ACM Trans. Audio Speech Lang. Proc. 2023, 32, 1586–1595. [Google Scholar] [CrossRef]
  40. Tao, W.; Zhou, Q.; Zhao, Y.; Yu, A. A Cross-Field Construction Method of Chinese Tourism Knowledge Graph based on Expansion and Adjustment of Entities. In Proceedings of the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 12–14 June 2020; pp. 211–215. [Google Scholar]
  41. Calleja, P.; Priyatna, F.; Mihindukulasooriya, N.; Rico, M. DBtravel: A tourism-oriented semantic graph. In Proceedings of the Current Trends in Web Engineering: ICWE 2018 International Workshops, MATWEP, EnWot, KD-WEB, WEOD, TourismKG, Cáceres, Spain, 5 June 2018; pp. 206–212. [Google Scholar]
  42. Zhang, W.; Cao, H.; Hao, F.; Yang, L.; Ahmad, M.; Li, Y. The chinese knowledge graph on domain-tourism. In Advanced Multimedia and Ubiquitous Engineering: MUE/FutureTech; Springer: Singapore, 2019; pp. 20–27. [Google Scholar]
  43. Xiao, D.; Wang, N.; Yu, J.; Zhang, C.; Wu, J. A practice of tourism knowledge graph construction based on heterogeneous information. In Proceedings of the Chinese Computational Linguistics: 19th China National Conference, CCL 2020, Hainan, China, 30 October–1 November 2020; pp. 159–173. [Google Scholar]
  44. Liu, W.; Liu, J.; Wu, M.; Abbas, S.; Hu, W.; Wei, B.; Zheng, Q. Representation learning over multiple knowledge graphs for knowledge graphs alignment. Neurocomputing 2018, 320, 12–24. [Google Scholar] [CrossRef]
  45. Tan, J.; Qiu, Q.; Guo, W.; Li, T. Research on the construction of a knowledge graph and knowledge reasoning model in the field of urban traffic. Sustainability 2021, 13, 3191. [Google Scholar] [CrossRef]
  46. Luan, Y.; Eisenstein, J.; Toutanova, K.; Collins, M. Sparse, dense, and attentional representations for text retrieval. Trans. Assoc. Comput. Linguist. 2021, 9, 329–345. [Google Scholar] [CrossRef]
  47. Robertson, S.; Zaragoza, H. The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr. 2009, 3, 333–389. [Google Scholar] [CrossRef]
  48. Zhao, W.X.; Liu, J.; Ren, R.; Wen, J.R. Dense text retrieval based on pretrained language models: A survey. ACM Trans. Inf. Syst. 2024, 42, 1–60. [Google Scholar] [CrossRef]
  49. Guo, J.; Fan, Y.; Ai, Q.; Croft, W.B. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, IN, USA, 24–28 October 2016; pp. 55–64. [Google Scholar]
  50. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Burstein, J., Doran, C., Solorio, T., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; Volume 1, pp. 4171–4186. [Google Scholar] [CrossRef]
  51. Fan, Y.; Xie, X.; Cai, Y.; Chen, J.; Ma, X.; Li, X.; Zhang, R.; Guo, J. Pre-training methods in information retrieval. Found. Trends Inf. Retr. 2022, 16, 178–317. [Google Scholar] [CrossRef]
  52. Karpukhin, V.; Oguz, B.; Min, S.; Lewis, P.; Wu, L.; Edunov, S.; Chen, D.; Yih, W.t. Dense Passage Retrieval for Open-Domain Question Answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; Webber, B., Cohn, T., He, Y., Liu, Y., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 6769–6781. [Google Scholar] [CrossRef]
  53. Khattab, O.; Zaharia, M. Colbert: Efficient and effective passage search via contextualized late interaction over bert. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, 25–30 June 2020; pp. 39–48. [Google Scholar]
  54. Santhanam, K.; Khattab, O.; Saad-Falcon, J.; Potts, C.; Zaharia, M. Colbertv2: Effective and efficient retrieval via lightweight late interaction. arXiv 2021, arXiv:2112.01488. [Google Scholar]
  55. Metzler, D.; Tay, Y.; Bahri, D.; Najork, M. Rethinking search: Making domain experts out of dilettantes. In ACM SIGIR Forum; ACM: New York, NY, USA, 2021; Volume 55, pp. 1–27. [Google Scholar]
  56. De Cao, N.; Izacard, G.; Riedel, S.; Petroni, F. Autoregressive entity retrieval. arXiv 2020, arXiv:2010.00904. [Google Scholar]
  57. Sun, W.; Yan, L.; Chen, Z.; Wang, S.; Zhu, H.; Ren, P.; Chen, Z.; Yin, D.; Rijke, M.; Ren, Z. Learning to Tokenize for Generative Retrieval. In Proceedings of the 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, LA, USA, 10–16 December 2023; Volume 36, pp. 46345–46361. [Google Scholar]
  58. Bevilacqua, M.; Ottaviano, G.; Lewis, P.; Yih, S.; Riedel, S.; Petroni, F. Autoregressive search engines: Generating substrings as document identifiers. Adv. Neural Inf. Process. Syst. 2022, 35, 31668–31683. [Google Scholar]
  59. Chen, J.; Zhang, R.; Guo, J.; de Rijke, M.; Chen, W.; Fan, Y.; Cheng, X. Continual learning for generative retrieval over dynamic corpora. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 306–315. [Google Scholar]
  60. Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, M.; Wang, H. Retrieval-augmented generation for large language models: A survey. arXiv 2023, arXiv:2312.10997. [Google Scholar]
  61. Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; Jurafsky, D., Chai, J., Schluter, N., Tetreault, J., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 7871–7880. [Google Scholar] [CrossRef]
  62. Yu, W.; Iter, D.; Wang, S.; Xu, Y.; Ju, M.; Sanyal, S.; Zhu, C.; Zeng, M.; Jiang, M. Generate rather than Retrieve: Large Language Models are Strong Context Generators. arXiv 2023, arXiv:2209.10063. [Google Scholar] [CrossRef]
  63. Khattab, O.; Santhanam, K.; Li, X.L.; Hall, D.; Liang, P.; Potts, C.; Zaharia, M. Demonstrate-Search-Predict: Composing Retrieval and Language Models for Knowledge-Intensive NLP. arXiv 2023, arXiv:2212.14024. [Google Scholar] [CrossRef]
  64. Luo, M.; Jain, S.; Gupta, A.; Einolghozati, A.; Oguz, B.; Chatterjee, D.; Chen, X.; Baral, C.; Heidari, P. A study on the efficiency and generalization of light hybrid retrievers. arXiv 2022, arXiv:2210.01371. [Google Scholar]
  65. Gao, L.; Ma, X.; Lin, J.; Callan, J. Precise zero-shot dense retrieval without relevance labels. arXiv 2022, arXiv:2212.10496. [Google Scholar]
  66. Sanmartin, D. KG-RAG: Bridging the Gap Between Knowledge and Creativity. arXiv 2024, arXiv:2405.12035. [Google Scholar]
  67. Li, H.; Gao, H.; Song, H. Tourism forecasting with granular sentiment analysis. Ann. Tour. Res. 2023, 103, 103667. [Google Scholar] [CrossRef]
  68. Yao, L.; Sheng, Q.Z.; Ngu, A.H.; Yu, J.; Segev, A. Unified collaborative and content-based web service recommendation. IEEE Trans. Serv. Comput. 2014, 8, 453–466. [Google Scholar] [CrossRef]
  69. Zhang, Z.; Guan, Z.; Zhang, F.; Zhuang, F.; An, Z.; Wang, F.; Xu, Y. Weighted knowledge graph embedding. In Proceedings of the 46th international ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 867–877. [Google Scholar]
  70. Zhang, P.; Zhang, X.; Yang, F.; Liao, J.; Ma, W.; Tan, Z.; Xiao, W. Knowledge Graph Embedding for Hierarchical Entities Based on Auto-Embedding Size. Mathematics 2024, 12, 3237. [Google Scholar] [CrossRef]
  71. Lv, X.; Hou, L.; Li, J.; Liu, Z. Differentiating concepts and instances for knowledge graph embedding. arXiv 2018, arXiv:1811.04588. [Google Scholar]
  72. Hao, J.; Chen, M.; Yu, W.; Sun, Y.; Wang, W. Universal representation learning of knowledge bases by jointly embedding instances and ontological concepts. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1709–1719. [Google Scholar]
  73. Li, M.; Ni, Z.; Tian, L.; Hu, Y.; Shen, J.; Wang, Y. Research on hierarchical knowledge graphs of data, information, and knowledge based on multiple data sources. Appl. Sci. 2023, 13, 4783. [Google Scholar] [CrossRef]
  74. Traag, V.A.; Waltman, L.; van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233. [Google Scholar] [CrossRef]
  75. Lin, Y.S.; Huang, J.Y. Internet blogs as a tourism marketing medium: A case study. J. Bus. Res. 2006, 59, 1201–1205. [Google Scholar] [CrossRef]
  76. Neirotti, P.; Raguseo, E.; Paolucci, E. Are customers’ reviews creating value in the hospitality industry? Exploring the moderating effects of market positioning. Int. J. Inf. Manag. 2016, 36, 1133–1143. [Google Scholar] [CrossRef]
  77. Ye, Q.; Law, R.; Gu, B.; Chen, W. The influence of user-generated content on traveler behavior: An empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Comput. Hum. Behav. 2011, 27, 634–639. [Google Scholar] [CrossRef]
  78. Li, Q.; Li, S.; Zhang, S.; Hu, J.; Hu, J. A review of text corpus-based tourism big data mining. Appl. Sci. 2019, 9, 3300. [Google Scholar] [CrossRef]
  79. Zhang, C.; Tian, Y.X.; Hu, A.Y. Utilizing textual data from online reviews for daily tourism demand forecasting: A deep learning approach leveraging word embedding techniques. Expert Syst. Appl. 2025, 260, 125439. [Google Scholar] [CrossRef]
  80. Yang, A.; Yang, B.; Hui, B.; Zheng, B.; Yu, B.; Zhou, C.; Li, C.; Li, C.; Liu, D.; Huang, F.; et al. Qwen2 Technical Report. arXiv 2024, arXiv:2407.10671. [Google Scholar]
  81. OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2024, arXiv:2303.08774. [Google Scholar]
  82. Chen, J.; Xiao, S.; Zhang, P.; Luo, K.; Lian, D.; Liu, Z. BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation. arXiv 2024, arXiv:2402.03216. [Google Scholar]
  83. Es, S.; James, J.; Espinosa-Anke, L.; Schockaert, S. Ragas: Automated evaluation of retrieval augmented generation. arXiv 2023, arXiv:2309.15217. [Google Scholar]
  84. Ma, X.; Gong, Y.; He, P.; Zhao, H.; Duan, N. Query rewriting for retrieval-augmented large language models. arXiv 2023, arXiv:2305.14283. [Google Scholar]
  85. Mohammed, L. GPT-4 Parameters: Unlimited Guide NLP’s Game-Changer. Medium, 29 March 2022. [Google Scholar]
Figure 1. Overall framework of TravelRAG: (a) illustrates the process of extracting entities from documents to build the knowledge graph, (b) shows how the LLM retrieves results from the knowledge graph based on specific query conditions.
Figure 2. Document Chunking. (a) The red and green borders represent different blocks of content within the same text paragraph; (b) the yellow section represents the overlapping portion between the two text blocks, which serves as a window to ensure contextual continuity between the segments.
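To make the chunking scheme of Figure 2 concrete, the sketch below splits a document into fixed-size blocks whose tails overlap the heads of the following blocks; the chunk size and overlap length are illustrative assumptions, not the values used in the experiments.

```python
def chunk_text(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks; the tail of each chunk overlaps
    the head of the next, preserving contextual continuity (Figure 2b)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping an overlap window
    return chunks

chunks = chunk_text("A travelogue about West Lake. " * 100)
```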
Figure 3. The color-coded text represents different types of entities. The LLM extracts these various entity types from the chunk of text and organizes them into a standardized format for output.
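A minimal sketch of the extraction step in Figure 3: a prompt instructs the LLM to emit entities of allowed types in a standardized JSON format, which is then parsed. The prompt wording, the entity types, and the `llm_call` hook are hypothetical placeholders, not the manually crafted templates actually used by TravelRAG.

```python
import json
from typing import Callable

# Hypothetical prompt template; TravelRAG's real templates are hand-crafted.
PROMPT = """Extract entities from the travelogue chunk below.
Allowed entity types: ATTRACTION, LOCATION, ACTIVITY, FOOD, ACCOMMODATION.
Return a JSON list of objects with keys "name", "type", "description".

Chunk:
{chunk}
"""

def extract_entities(chunk: str, llm_call: Callable[[str], str]) -> list[dict]:
    """Send one chunk to the LLM and parse its standardized JSON output."""
    raw = llm_call(PROMPT.format(chunk=chunk))
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return []  # skip chunks whose output does not parse
```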
Figure 4. Entity linking. Entity types serve as source nodes, and each individual entity is linked to the corresponding source nodes as target nodes.
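Assuming the JSON records produced by the extraction sketch above, the linking step of Figure 4 can be drafted as follows: each entity type becomes a source node and every extracted entity is attached to it as a target node.

```python
import networkx as nx

def link_entities(entities: list[dict]) -> nx.DiGraph:
    """Build a directed graph with entity types as source nodes and
    individual entities as target nodes (Figure 4)."""
    G = nx.DiGraph()
    for e in entities:
        G.add_node(e["type"], kind="entity_type")
        G.add_node(e["name"], kind="entity", description=e.get("description", ""))
        G.add_edge(e["type"], e["name"], relation="has_instance")
    return G

G = link_entities([
    {"name": "West Lake", "type": "ATTRACTION", "description": "Scenic lake in Hangzhou"},
    {"name": "Hangzhou", "type": "LOCATION", "description": "Capital of Zhejiang"},
])
```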
Figure 5. Entities at different levels are matched.
Figure 6. Retrieval procedure. The LLM processes the user’s query by first converting the instructions into vectors, followed by conducting similarity calculations using the data in the vector database. Once the similarity computation is complete, the system employs a greedy matching strategy to select the most similar items. Simultaneously, it considers an importance factor, evaluating how specific terms influence the overall semantics of the query. Ultimately, a score is computed to determine and return the most relevant match.
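The scoring logic of Figure 6 can be sketched as below: the query embedding is compared against the vector database by cosine similarity, blended with a per-item importance factor, and the highest-scoring items are selected greedily. The linear blend and the weight `alpha` are illustrative assumptions; the exact combination formula is not reproduced here.

```python
import numpy as np

def retrieve(query_vec, item_vecs, importance, top_k=3, alpha=0.8):
    """Rank items for a query: cosine similarity against the vector
    database, blended with an importance factor per item (Figure 6)."""
    q = query_vec / np.linalg.norm(query_vec)
    V = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    similarity = V @ q                       # cosine similarity per item
    score = alpha * similarity + (1 - alpha) * importance
    return np.argsort(score)[::-1][:top_k]   # greedy: best scores first

rng = np.random.default_rng(0)
top = retrieve(rng.normal(size=8), rng.normal(size=(100, 8)), rng.random(100))
```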
Figure 7. Tourist knowledge graph.
Figure 8. Entity extraction and relational linking example. (a) Since the source text is in Chinese, the figure shows the translated entity and a partial description of the entity. (b) The LLM adds a description of the relationship between the original entity and the target entity.
Figure 9. Comparison of responses from two RAG frameworks for the same query. The content highlighted in red contains factual inaccuracies, whereas the correct answer is indicated by the text in blue.
Table 1. Overview of the collected travelogue corpus.

| Content Structure | Number of Articles | Total Word Count |
|---|---|---|
| Title–author–time–content | 100 | 80,983 |
Table 2. The number of questions of each type.

| Type | Simple | Multi-Context | Reasoning | Conditional |
|---|---|---|---|---|
| Number | 50 | 20 | 20 | 10 |
Table 3. Hardware and software configuration used in the experiments.

| Hardware and Software | Configuration | Detailed Information |
|---|---|---|
| Hardware | CPU | Intel Core i9-12900KF |
| | Memory | 64 GB |
| | Graphics Card | NVIDIA GeForce RTX 3090 Ti 24 GB |
| Software | System | Ubuntu 20.04.6 LTS |
| | CUDA | 11.8 |
| | Python | 3.10.13 |
| | PyTorch | 2.1.2 |
Table 4. The results of extraction.

| Type | Number |
|---|---|
| Entities | 2081 |
| Relations | 20 |
| Relational Triples | 1187 |
Table 5. Performance comparison between TravelRAG and the traditional RAG pipeline.

| Method | Size | Faithfulness | Answer Relevancy | Context Precision | Context Recall |
|---|---|---|---|---|---|
| Qwen2-RAG | 72 B | 0.55 | 0.59 | 0.70 | 0.79 |
| GPT-4-RAG | 180 B | 0.65 | 0.67 | 0.80 | 0.87 |
| TravelRAG | based on GPT-4 | 0.76 | 0.87 | 0.85 | 0.60 |
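The four metrics in Tables 5 and 6 (Faithfulness, Answer Relevancy, Context Precision, Context Recall) can be computed with the RAGAS library [83]. The sketch below shows one plausible setup; the sample record is illustrative, the API may differ across RAGAS versions, and scoring requires an LLM judge (e.g., an OpenAI API key) at run time.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy, context_precision, context_recall, faithfulness,
)

# One illustrative evaluation record; real runs use the generated answers
# and retrieved contexts from each RAG pipeline under comparison.
data = Dataset.from_dict({
    "question": ["Where is the Broken Bridge located?"],
    "answer": ["The Broken Bridge is on West Lake in Hangzhou."],
    "contexts": [["The Broken Bridge sits at the eastern end of the Bai Causeway on West Lake."]],
    "ground_truth": ["It is located on West Lake, Hangzhou."],
})

result = evaluate(
    data,
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
)
print(result)  # per-metric scores averaged over the dataset
```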
Table 6. Results of the TravelRAG framework built on models of varying parameter sizes, evaluated on four metrics.

| Method | Size | Faithfulness | Answer Relevancy | Context Precision | Context Recall |
|---|---|---|---|---|---|
| Qwen2-TravelRAG | 7 B | 0.32 | 0.21 | 0.32 | 0.15 |
| Qwen2-TravelRAG | 57 B | 0.57 | 0.43 | 0.63 | 0.38 |
| Qwen2-TravelRAG | 72 B | 0.69 | 0.78 | 0.75 | 0.71 |
| GPT-4-TravelRAG | 180 B+ ¹ | 0.76 | 0.87 | 0.85 | 0.60 |

¹ The exact parameter size of GPT-4 is not officially reported and has been inferred from estimates made by other researchers [85].
Table 7. Results on the four question types for models pre-trained with different proportions of Chinese in their corpora.

| Method | Size | Simple | Multi-Context | Reasoning | Conditional |
|---|---|---|---|---|---|
| Qwen2-TravelRAG | 72 B | 0.89 | 0.57 | 0.65 | 0.54 |
| GPT-4-TravelRAG | 180 B | 0.88 | 0.43 | 0.72 | 0.78 |