Construction of Cultural Heritage Knowledge Graph Based on Graph Attention Neural Network

Wang, Yi; Liu, Jun; Wang, Weiwei; Chen, Jian; Yang, Xiaoyan; Sang, Lijuan; Wen, Zhiqiang; Peng, Qizhao

doi:10.3390/app14188231


by Yi Wang, Jun Liu, Weiwei Wang *, Jian Chen, Xiaoyan Yang, Lijuan Sang, Zhiqiang Wen and Qizhao Peng

School of Design and Art, Shaanxi University of Science and Technology, Xi’an 710021, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(18), 8231; https://doi.org/10.3390/app14188231

Submission received: 25 July 2024 / Revised: 8 September 2024 / Accepted: 10 September 2024 / Published: 12 September 2024

(This article belongs to the Special Issue Intelligent Interaction in Cultural Heritage)


Abstract:

To address the challenges posed by the vast and complex knowledge information in cultural heritage design, such as low knowledge retrieval efficiency and limited visualization, this study proposes a method for knowledge extraction and knowledge graph construction based on graph attention neural networks (GAT). Using Tang Dynasty gold and silver artifacts as samples, we establish a joint knowledge extraction model based on GAT. The model employs the BERT pretraining model to encode collected textual knowledge data, conducts sentence dependency analysis, and utilizes GAT to allocate weights among entities, thereby enhancing the identification of target entities and their relationships. Comparative experiments on public datasets demonstrate that this model significantly outperforms baseline models in extraction effectiveness. Finally, the proposed method is applied to the construction of a knowledge graph for Tang Dynasty gold and silver artifacts. Taking the Gilded Musician Pattern Silver Cup as an example, this method provides designers with a visualized and interconnected knowledge collection structure.

Keywords: design; Tang Dynasty gold and silverware; knowledge extraction; knowledge graph construction

1. Introduction

The Tang Dynasty represents a pinnacle era in Chinese history, marked by flourishing culture and art. The excavation of Tang Dynasty gold and silver artifacts not only showcases the treasures of ancient Chinese craftsmanship but also provides crucial physical evidence for studying the social, political, economic, and cultural aspects of the Tang Dynasty. Many of these artifacts are classified as first-class national cultural relics or are prohibited from being exhibited abroad, highlighting their significance and value in world cultural heritage. The artifacts are diverse, including utensils, drinking vessels, containers, medical tools, everyday miscellaneous items, ornaments, and religious instruments. Because they are a valuable part of cultural heritage, Chinese scholars have conducted detailed research on their historical and artistic value. Yang [1] pointed out that, since its discovery in 1970, the Tang Dynasty gold and silver hoard from Hejiacun has received widespread academic attention and the functions and historical significance of the objects have been clarified. Zhang [2] explored the influence of Sogdian culture on the creation, development, and transformation of the octagonal cup and on its integration with Tang Dynasty culture. Qi [3,4,5] studied nearly a thousand pieces of Tang Dynasty gold and silverware, providing detailed identification and in-depth analysis of each object; this is the most comprehensive research on Tang Dynasty gold and silverware to date, both domestically and internationally. Such a rich cultural heritage gives designers a body of knowledge to draw on for innovation and development, enabling the brilliance of Tang Dynasty gold and silverware to be carried forward in contemporary times. The design knowledge of Tang Dynasty gold and silver artifacts can be categorized and analyzed through interdisciplinary research across various fields, such as art history, cultural studies, materials science, and craftsmanship technology, reflecting their diversity and cross-cultural characteristics.

In the digital age, utilizing cultural heritage to rapidly aggregate diverse knowledge and construct a comprehensive knowledge system is a crucial step in the entire design process. This approach provides data and information support for preliminary research and information collection, helping designers make more reasonable and efficient design decisions while saving time and resources. In less technologically advanced eras, knowledge collection mainly relied on paper documents, field investigations, and manual collection. Clearly, traditional recording methods have become increasingly difficult to use. With the development of big data technology, the use of digital tools in the field of knowledge collection has become increasingly widespread. During the knowledge collection phase, to quickly obtain information and conduct effective preliminary research, both domestic and international studies have focused on the development of design knowledge databases. Google Arts & Culture [6] collaborates with numerous world-renowned museums and institutions, providing online access to a variety of artworks and cultural artifacts. Additionally, the platform offers high-resolution images, virtual reality tours, interactive exhibitions, and detailed background information. The Palace Museum’s Digital Artifact Database is based on the Knowledge Graph of Ancient Chinese Movable Cultural Relics. Building on the catalog information of the museum’s collection, it expands to include interdisciplinary concepts and vocabulary from fields such as Forbidden City studies, art history, iconography, and biology. This allows users to access and utilize the information resources of the museum’s collection from multiple dimensions. Chinese scholar Wei Tong [7] constructed a multilingual terminology electronic dictionary using Ming and Qing dynasty porcelain vases as an example, providing a new perspective for the digital preservation of cultural heritage.

From the perspective of knowledge management, researchers have constructed various models, such as an OWL-based and ontology-based building lifecycle management model, a lightweight and efficient large-scale RDF (resource description framework) data management system, and an XML topic map and ontology-based product development knowledge representation model to support knowledge sharing during the product development process [8]. Analysis reveals that during the knowledge acquisition process in the aforementioned knowledge management models, there are issues of inefficiency and the inability to quickly and effectively obtain the necessary knowledge. Additionally, the architectural design of these platforms lacks sufficient visualization capabilities. Cultural heritage knowledge is complex and multifaceted, often encompassing multiple interdisciplinary fields. Designers need to extract effective knowledge information from vast amounts of data. Thus, establishing a visual data model based on cultural heritage knowledge has become an urgent problem to address.

The formation of modern knowledge graphs began in 2007, with landmark projects including DBpedia and Freebase [9]. Knowledge graphs demonstrate unique advantages in the field of data integration, particularly in handling large-scale datasets that span various industries and formats. They are commonly used for data integration, search engine optimization, and intelligent recommendation systems. Existing big data modeling methods can be categorized into two types: metamodeling and ontology modeling. Each provides guidance and structural support at a different level for the construction and application of knowledge graphs. The two methods have different focuses: metamodeling primarily concentrates on the integration, sharing, and exchange of data, while ontology modeling focuses on knowledge representation and logical reasoning [10]. Both have become critical steps in the construction of knowledge graphs. Therefore, this paper proposes a design-oriented unified data model for cultural heritage knowledge information, based on knowledge graph technology.

The main steps in constructing a knowledge graph include knowledge modeling, knowledge storage, knowledge extraction, knowledge fusion, knowledge computation, and knowledge application. Among these, knowledge extraction is the key step in establishing a knowledge graph, encompassing entity recognition and relationship extraction. Currently, most researchers adopt the traditional pipeline approach, dividing named entity recognition and relationship extraction into two separate subtasks processed sequentially. First, named entity recognition is performed, then the entities extracted by the entity model are paired for relationship matching, and, finally, relationship classification is achieved [11]. The pipeline approach has distinct advantages in modularization, specialization, and ease of maintenance. However, in the traditional pipeline method, separating entity recognition and relationship extraction can lead to the propagation of errors. Joint extraction, on the other hand, effectively connects the tasks of entity recognition and relationship extraction. Joint extraction typically relies on deep learning methods, using a single model to simultaneously identify entities and extract the relationships between them. Tahsin [12] used three variants of BERT (BERT, DistilBERT, RoBERTa) to train models for analyzing consumer complaints about laptops. Barroso [13] explored how federated learning (FL) can be used in natural language processing tasks to handle disagreements among annotators. He proposed the FLEAD (federated learning for exploiting annotators’ disagreements) method, which uses FL technology to independently learn from all annotators’ opinions. Islam [14] proposed a federated learning-based method that improves model performance through collaborative training among multiple clients without sharing data. 
Huang et al. [15] constructed a design knowledge graph framework and developed a design knowledge extraction model based on federated learning, enhancing efficiency and improving the structuring and visualization of design knowledge. Li Chao et al. [16] pointed out the lack of reannotated data for named entity recognition in the field of cultural relics naming and the issue of nested entities in cultural relic names. They established the “Few Relics Data” dataset to address the problem of nested entities. Traditional federated learning has pioneered a new approach to joint entity-relationship extraction models, employing a method that first extracts relationships and then progressively predicts entities.

The aforementioned studies have effectively strengthened the interaction between subtasks during the learning process, significantly improving the error accumulation issue inherent in pipeline extraction. However, these methods mostly rely on unidirectional semantic features for entity recognition, which limits their ability to comprehensively identify contextual information within text segments and address the problem of entity redundancy.

The research is based on the construction of knowledge graphs using graph attention networks (GAT) and the BERT pretrained model. GAT is a deep learning model designed to handle graph-structured data. A graph is composed of nodes and edges, commonly found in scenarios such as social networks, biological networks, and knowledge graphs. Each node can contain a feature vector and edges represent the connections between nodes.

Among studies of traditional graph convolutional networks (GCNs), Liu [17] explores a hybrid approach that combines transformers and GCNs for text classification tasks. Ullah [18] addresses the issue of over-smoothing in GCNs, where node representations become indistinguishable as the network depth increases, and suggests adding fully connected layers to mitigate this problem. Senior [19] discusses the relationship between the transformer architecture and graph neural networks (GNNs), pointing out that the transformer can be considered a special type of GNN and that GNNs may offer better inductive biases in certain tasks. In contrast to GCNs, graph attention networks (GAT) introduce an attention mechanism that allows each node to dynamically assign different weights to its neighbors, thereby capturing the dependencies between nodes more flexibly. Li [20] proposed a new framework called multirelational graph attention network (MRGAT), aimed at better modeling the complex relationships and semantic information in knowledge graphs. Peng et al. [21] employed dependency-GAT to capture long-distance dependencies between natural language questions and database schemas, improving the accuracy of SQL generation through alignment-enhanced generation.

BERT (bidirectional encoder representations from transformers) is a pretrained language model based on the transformer architecture, introduced by Google in 2018. BERT aims to enhance text comprehension by learning sentence context in a bidirectional manner. The emergence of BERT has revolutionized the field of NLP, significantly improving performance in various tasks such as question-answering systems, sentiment analysis, and text classification. Asudani [22] reviews traditional word embedding models (such as TF-IDF and bag of words) and distributed word embedding models (such as Word2Vec, GloVe, and fastText), and also introduces context-based embedding models like ELMo, GPT, and BERT. Kanakarajan [23] introduces BioELECTRA, a model based on ELECTRA, pretrained specifically for the biomedical domain using full texts from PubMed and PMC (PubMed Central). Compared to BERT, these models lack the contextual awareness and deep understanding provided by BERT, which can pose challenges in more complex fine-tuning tasks and limit performance in tasks requiring deeper contextual understanding. BERT’s innovation lies in its ability to understand contextual semantics bidirectionally, greatly improving performance in natural language processing tasks. Rouabhi [24] used the BERT and BioBERT models, both transformer based, with a primary focus on improving multilabel classification performance through data augmentation. Kim et al. [25] demonstrated significant performance improvements with the pretrained BERT model in multiple medical NLP tasks, particularly excelling in processing Korean medical texts.

The combination of GAT and BERT allows for better handling of complex relationships and contextual dependencies in graph structures, making them suitable for processing complex texts. Given the richness and diversity of knowledge surrounding Tang Dynasty gold and silver artifacts, integrating these two models can significantly enhance the effectiveness of knowledge graph construction and multilabel classification tasks.

2. Research Objectives

Based on the aforementioned issues, this paper focuses on the knowledge information of Tang Dynasty gold and silver artifacts, with the goal of establishing a design-oriented knowledge graph for Tang Dynasty gold and silver artifacts. The success of the research lies in proposing an entity relationship extraction method based on graph attention neural networks to address the problems of entity overlap and inaccurate entity weight allocation. This provides a practical and highly compatible model for the study of the knowledge graph of Tang Dynasty gold and silver artifacts, facilitating designers in quickly obtaining knowledge information and improving the efficiency of preliminary data collection for design. At the same time, based on the cultural background of Tang Dynasty gold and silver artifacts, it allows amateurs to quickly learn and acquire knowledge.

3. Unified Data Modeling in the Cultural Heritage Knowledge Collection Stage for Design

3.1. Data Feature Analysis

Design knowledge spans various stages, including preliminary data collection, conceptual design, and other phases of the design process, involving multiple participants such as designers and user groups. Each group may have different understandings of design knowledge. The usage scenarios and recording standards also vary and the characteristics of design knowledge information are as follows:

(1) Information diversity: design knowledge encompasses structured, semistructured, and unstructured data. The diverse data forms make integration and processing cumbersome.
(2) Information overload: the sheer volume of design knowledge includes a significant amount of irrelevant or low-quality information, making it challenging for designers to promptly and accurately extract the relevant information they need.
(3) Ambiguity in meaning: different teams and organizations use various standards to describe design knowledge, leading to potential discrepancies in interpretation across different contexts, which increases communication costs.
(4) Dynamic iteration: design knowledge is continuously updated and iterated upon, with its collection progressing alongside the design project’s development and evolving requirements.

3.2. Unified Data Modeling Process for Knowledge Information Based on Knowledge Graphs

Given the unique and complex nature of design knowledge information related to cultural heritage artifacts, this section analyzes information integration, innovation needs, and interdisciplinary fusion in the design process. By combining knowledge management and application requirements, we propose a unified data modeling technique for cultural heritage design knowledge information based on knowledge graphs.

As illustrated in Figure 1, the unified data modeling process for design knowledge information based on knowledge graphs consists of three modules: design knowledge data sources, design knowledge information data model construction, and design knowledge information data model application.

3.3. Key Technology Research

The construction of the design knowledge information data model encompasses several stages: knowledge modeling, knowledge storage, knowledge extraction, knowledge fusion, knowledge computation, and knowledge application. This study employs a bottom-up construction method, focusing on the design knowledge collection stage within the entire design process to build a preliminary design knowledge graph ontology framework. Based on the diverse data within design knowledge information, knowledge representation is carried out to assist in knowledge extraction, knowledge fusion, and knowledge storage. Among these stages, knowledge extraction and knowledge fusion are critical steps in constructing a knowledge graph. Using machine learning methods, we designed techniques for knowledge extraction and knowledge fusion and stored the resulting data.

Knowledge extraction technology involves the automatic identification and extraction of structured knowledge from textual data. This includes entity recognition, relationship extraction, and attribute extraction, which are used to construct knowledge graphs, databases, or other knowledge management systems. For example, in the sentence “The gilded tortoise-pattern silver plate, peach-shaped, features a gilded tortoise in the center”, entities such as “gilded tortoise-pattern silver plate” and “tortoise” can be extracted. The relationship between these entities is “decorates”, and the attribute of the “gilded tortoise-pattern silver plate” entity is “gold and silver artifact”, describing the shape attribute of the “gilded tortoise-pattern silver plate” entity.

In this extraction task, the attribute extraction task is redefined as a problem that combines entity recognition and relationship extraction. The core of this method lies in not only identifying entities in the text but also recognizing attribute values associated with these entities and linking these attribute values to the corresponding entities through specific relationships. Therefore, the focus of knowledge extraction should be on entity recognition and relationship extraction.
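The recasting described above can be illustrated with a minimal sketch in which an attribute value is stored as the tail of a triple, exactly like a related entity. The `Triple` type and the relation label `has_attribute` are illustrative assumptions rather than the schema used in this study; the entity and attribute strings come from the example sentence.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted fact: head entity, relation, tail entity or attribute value."""
    head: str
    relation: str
    tail: str


# Hand-written triples for the example sentence.
# The relation label "has_attribute" is a hypothetical name chosen for illustration.
triples = [
    Triple("tortoise", "decorates", "gilded tortoise-pattern silver plate"),
    Triple("gilded tortoise-pattern silver plate", "has_attribute", "gold and silver artifact"),
]


def head_entities(facts):
    """Return the distinct head entities of a list of triples, sorted by name."""
    return sorted({t.head for t in facts})
```

Because attributes and relationships share one representation, a single joint model can emit both kinds of fact.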

Structured data typically employs manually mapped rules. For data with an explicit structure, such as tables in databases, predefined rules can directly map entities and their relationships within the data. These rules are manually created based on the structural characteristics of the data and are used to identify and extract specific information. Semistructured data generally uses wrapper induction methods to identify and standardize the source code paths of the information to be extracted, which is used to extract information from semistructured web data. Design knowledge information, such as traditional cultural knowledge, often comprises unstructured data. Unstructured data (such as text files, text in images, etc.) is converted into an editable format through text recognition and other technologies. Specialized recognition techniques are needed to extract knowledge information from this data, identifying specific entities, relationships, and more.
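For the structured case, a manually mapped rule can be as simple as a lookup from column name to relation name. The sketch below assumes a hypothetical record layout and hypothetical relation labels; it is not the mapping used in this study.

```python
# Hypothetical column-to-relation rules, written by hand for one table layout.
RULES = {"shape": "has_shape", "material": "made_of", "dynasty": "dated_to"}


def row_to_triples(row, key="name", rules=RULES):
    """Map one table row to (subject, relation, value) triples using fixed rules."""
    subject = row[key]
    return [
        (subject, rules[column], value)
        for column, value in row.items()
        if column in rules and value  # skip unmapped columns and empty cells
    ]


# Illustrative record; the field values are made up for the example.
row = {"name": "gilded tortoise-pattern silver plate", "shape": "peach-shaped", "dynasty": ""}
mapped = row_to_triples(row)
```

Such rules are cheap to apply but have to be rewritten for every new table layout, which is why they do not carry over to semistructured or unstructured sources.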

Due to the generally low quality of most design knowledge information data, this study proposes an improved text information extraction method, namely an entity-relationship joint extraction method based on a segmental attention fusion mechanism. During the process of entity recognition and relationship extraction, the method fully considers the contextual information in the text data, enhancing the accuracy of entity recognition. The text is divided into multiple segments and entity recognition and relationship extraction are independently performed for each segment, addressing the issue of overlapping entity relationships.

After extracting knowledge from the text, it is necessary to resolve ambiguous entity names, because the same name might refer to different entities or the same entity might be referred to by different names in various contexts. To effectively manage the extracted knowledge, mentions that refer to the same entity need to be merged and same-name mentions that refer to different entities need to be kept apart, ensuring consistency in the knowledge base.

When dealing with situations where the same name refers to different entities, a pretrained language model based on contextual features is used to encode sentences containing these ambiguous entities. This approach captures the contextual information surrounding each entity. Then, the similarity between two entities with the same name is calculated based on their contextual encodings and compared to a predetermined threshold. If the similarity between the two entities is higher than this threshold, they are considered to have the same meaning. If the similarity is lower than the threshold, they are considered to refer to different entities.
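The thresholding step can be sketched as follows. Cosine similarity and the threshold value 0.8 are assumptions made for illustration, since the text does not name the similarity measure or the threshold; the context vectors stand in for the encodings produced by the pretrained language model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_entity(context_a, context_b, threshold=0.8):
    """Treat two same-name mentions as one entity when similarity exceeds the threshold.

    The default threshold is a hypothetical value, not one reported in the paper.
    """
    return cosine_similarity(context_a, context_b) > threshold
```

In practice the threshold would be tuned on labeled pairs, trading wrongly merged entities against duplicated ones.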

When different names refer to the same entity, it is necessary to determine whether two or more expressions refer to the same entity. This ensures that the entities being compared and evaluated logically belong to the same category. To achieve this, sentences are encoded using a pretrained language model based on contextual features. This encoding is combined with a graph neural network (GNN) normalization model and a biaffine attention model with a feedforward neural network layer, utilizing two scoring mechanisms.

These scoring mechanisms leverage the encoded contextual information and the structural information from the GNN to calculate the likelihood that different named entities refer to the same entity. This score is then compared to a predetermined threshold. If the score is higher than the threshold, it can be inferred that the two differently named entities indeed refer to the same entity. Conversely, if the score is lower than the threshold, they are considered to refer to different entities.

When entity recognition faces the challenge of nested entities, such as entities embedded within other entities in the text, specialized techniques are required. For instance, in the phrase “Apple’s iPhone 15”, both “Apple” and “iPhone 15” are entities, with “iPhone 15” nested within the larger entity “Apple’s iPhone 15”. As before, a pretrained language model can be used to extract contextual features. These features are then processed using a graph neural network (GNN) and a biaffine attention model to capture the nested relationships between entities. Finally, a feedforward neural network layer is used for detailed feature analysis and recognition.

4. Data Model Construction

4.1. Entity-Relationship Extraction Model Construction

To better identify design knowledge information, this study constructs an entity recognition model based on the BERT pretrained model [26]. The BERT model, built on the transformer architecture, uses bidirectional training to understand the context of language, allowing it to more comprehensively capture the complexity and nuances of language. The entity recognition model consists of an embedding layer and an entity recognition layer, while the relationship extraction model consists of an embedding layer and a relationship extraction layer. To address the issue of overlapping entity relationships in design knowledge and accurately identify entities and their relationships, we established an entity-relationship joint extraction model with a segmental attention fusion mechanism. This model comprises three parts: an embedding layer, an entity recognition layer, and a relationship classification layer.

The role of the embedding layer is to convert words in the text into vector forms that can be recognized by the computer. These vectors capture the semantic features of the words and are used to process actual textual data. The entity recognition layer utilizes contextual information to identify entities in the text and their categories. Once the entities are recognized, the task of the relationship classification layer is to determine the types of relationships between these entities. By encoding and classifying pairs of entities, this layer identifies the specific relationships between them.

In BERT’s pretraining tasks, some words in the input sequence are randomly replaced with a special [MASK] token, and the goal of the model is to predict these masked words from their context. This forces the model to learn the relationships and semantics of the words within the sentence, thereby enabling it to capture language features bidirectionally, as shown in Figure 2.
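A simplified sketch of how masked training inputs can be prepared is given below. It replaces every selected token with [MASK] and records the original token as the prediction target. The function name, the default masking probability, and the fixed seed are illustrative assumptions, and refinements of the full BERT procedure are omitted.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels).

    labels[i] holds the original token where position i was masked, else None.
    Special tokens [CLS] and [SEP] are never masked.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    masked, labels = [], []
    for token in tokens:
        if token not in ("[CLS]", "[SEP]") and rng.random() < mask_prob:
            masked.append("[MASK]")
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels
```

The model is then trained to recover each non-empty label from the surrounding unmasked tokens on both sides of the gap.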

4.1.1. Embedding Layer

The embedding layer encodes words from natural language into dense mathematical vectors, converting the input text into high-dimensional space vectors that a computer can recognize. This helps the model understand the basic meanings of words. The core structure of the BERT model consists of the stacking of multiple transformer modules, forming a bidirectional self-attention representation model. Each transformer module primarily includes two submodules: the multihead attention mechanism submodule and the feed-forward neural network submodule. These submodules incorporate residual connections and layer normalization. After processing the text through the BERT model, an output vector sequence of the same length as the input text is obtained. These output vectors contain the semantic information of the input text, providing representations for downstream tasks such as entity recognition and relationship extraction.

Given the input text, the sentence is encoded as $U = \{u_{0}, u_{1}, u_{2}, \dots, u_{n}\}$, where $u_{v}$ is the $v$-th character in the sentence $U$ and $n$ is the length of the sequence. Here, $u_{0} = [CLS]$ marks the start of the sentence and $u_{n} = [SEP]$ is the sentence separator. This step converts the text sequence into a fixed-length vector, representing the semantic information of the text. After the model input, position embedding and sentence embedding are also required. Position embedding indicates the position of a word in the sentence and sentence embedding represents the vector of the entire sentence or text segment, with markers used to separate and segment the sentence. Through BERT encoding, the sentence’s semantic feature vector $R = \{r_{0}, r_{1}, r_{2}, \dots, r_{n}\}$ is obtained. After the BERT model assigns word vector encoding, semantic dependency relations are introduced to strengthen the feature representation between word vectors. Semantic dependency relations refer to the grammatical relationships within a sentence, which are used to understand the sentence’s structure and meaning, thereby enhancing the expressiveness of word vectors. This paper employs the dependency parser of Stanford CoreNLP [27] to obtain the dependency relationship matrix based on the input sentence. Considering that different dependency relations and dependent words have varying impacts on the grammatical features of a sentence, a graph attention neural network (GAT) model [28] is introduced. In this model, the input sentence is treated as a graph where words are nodes and dependency relations are the connections between these nodes. The graph attention neural network dynamically adjusts the attention weights between different words to extract more comprehensive word and syntactic connections.
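The step from parser output to a graph can be sketched as follows, assuming the dependency arcs are already available as (head index, dependent index) pairs. Treating arcs as undirected and adding self-loops are assumptions made for illustration, and the arcs in the example are written by hand rather than produced by Stanford CoreNLP.

```python
def dependency_matrix(n, arcs, self_loops=True):
    """Build an n x n 0/1 matrix M with M[i][j] = 1 when words i and j share an arc."""
    m = [[0] * n for _ in range(n)]
    for head, dependent in arcs:
        m[head][dependent] = 1
        m[dependent][head] = 1  # assumption: arcs are treated as undirected
    if self_loops:
        for i in range(n):
            m[i][i] = 1  # assumption: each word also attends to itself
    return m


# Hypothetical parse of "plate features tortoise": the verb (index 1) heads both nouns.
M = dependency_matrix(3, [(1, 0), (1, 2)])
```

Each row of the matrix then lists the words that a given word is allowed to attend to in the graph attention layers.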

By using a graph attention neural network (GAT), the dependency tree of a sentence is encoded to extract syntactic information of the words in the sentence, thereby enhancing the word vectors. The dependency parser analyzes the sentence to identify the dependency relations between words, such as subject–verb relationships, verb–object relationships, and prepositional–object relationships. Based on the identified dependency relations, a matrix $M_{n \times n}$ is constructed. According to the input text $U$ and the relationship matrix $M_{n \times n}$, a graph with $n$ nodes is built. The graph attention neural network is defined with $L$ layers. In the $l$-th layer, $r_{i}^{l}$ represents the vector of node $i$. The updated vector of node $i$ is represented as $r_{i}^{(l + 1)}$, which is calculated based on the vectors of the connected nodes. The calculation formulas are shown in Equations (1) and (2).

$$\alpha_{ij}^{l} = \frac{\exp\left(\mathrm{LeakyReLU}\left({\vec{a}}^{(l)T}\left[W^{(l)} r_{i}^{l} \,\|\, W^{(l)} r_{j}^{l}\right]\right)\right)}{\sum_{k = 1}^{n} M_{ik} \exp\left(\mathrm{LeakyReLU}\left({\vec{a}}^{(l)T}\left[W^{(l)} r_{i}^{l} \,\|\, W^{(l)} r_{k}^{l}\right]\right)\right)} \tag{1}$$

$$r_{i}^{(l + 1)} = \theta\left(\sum_{j = 1}^{n} M_{ij}\, \alpha_{ij}^{l}\, W^{(l)} r_{j}^{l}\right) \tag{2}$$

In the formulas,

r_{i}^{l}

,

r_{j}^{l}

, and

r_{k}^{l}

are the vector representations of nodes

i

,

j

, and

k

, respectively, in the

l

-th layer of the graph neural network,

α_{i j}^{l}

is the correlation coefficient between nodes

i

and

j

,

M_{i j}

represents the dependency relationship between nodes

i

and

j

,

M_{i j} = 0

indicates that there is no dependency relationship between nodes

i

and

j

while

M_{i j} = 1

indicates a dependency relationship, i.e., a direct connection or interaction exists between the two nodes, and

M_{i j} = 1

indicates that there is no dependency relationship between nodes

i

and

j

.

W^{(l)}

represents the trainable weight matrix in the

l

-th layer, used to perform a linear transformation on the input features to adapt to the learning tasks of the current layer,

| |

represents the concatenation of two vectors,

{\overset{⇀}{a}}^{(l)}

represents the parameters of the

l

-th layer, and

{\overset{⇀}{a}}^{{(l)}^{T}}

is the transpose operation of

{\overset{⇀}{a}}^{(l)}

;

LeakyReLU

is an improved rectified linear activation function that keeps a small nonzero slope for negative inputs, addressing the zero-gradient problem of the traditional rectified linear unit (ReLU) activation function, and

θ

is the nonlinear activation function.
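The update in Equations (1) and (2) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the choice of tanh for θ, the LeakyReLU slope of 0.2, the self-loops on the diagonal of M, and all numeric values are assumptions made only for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gat_layer(R, M, W, a, slope=0.2):
    """One masked graph-attention layer in the spirit of Equations (1) and (2).

    R: list of n node vectors; M: n x n 0/1 dependency matrix;
    W: weight matrix; a: attention vector of length 2 * d_out.
    """
    n = len(R)
    H = [matvec(W, r) for r in R]  # W r_i for every node
    d_out = len(H[0])
    out = []
    for i in range(n):
        # exp(LeakyReLU(a^T [W r_i || W r_k])), masked by M_ik
        scores = []
        for k in range(n):
            e = sum(x * y for x, y in zip(a, H[i] + H[k]))
            scores.append(M[i][k] * math.exp(leaky_relu(e, slope)))
        z = sum(scores)
        alpha = [s / z if z > 0 else 0.0 for s in scores]  # Equation (1)
        # Equation (2): alpha is already zero wherever M_ij = 0
        agg = [sum(alpha[j] * H[j][d] for j in range(n)) for d in range(d_out)]
        out.append([math.tanh(x) for x in agg])  # theta assumed to be tanh
    return out

# Toy input: 3 nodes, 2-dimensional vectors (hypothetical values)
R = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
M = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]  # node 0 and node 2 are not connected
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.1, 0.2, 0.3, 0.4]
S = gat_layer(R, M, W, a)
```

Because the mask multiplies every score, a node that has no dependency edge to node i contributes nothing to the updated vector of node i, whatever its own features are.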

The semantic feature vectors

R

and the dependency relationship matrix

M_{n \times n}

are input into the graph attention neural network (GAT) for

L

layers, obtaining the syntactic feature vectors

S = \{s_{1}, s_{2}, \dots, s_{n}\}

. By concatenating the semantic feature vectors and the syntactic feature vectors of the text sequence, the final output feature vectors are obtained, as shown in Equation (3):

e_{i} = [r_{i}; s_{i}]

(3)

In the formula,

r_{i}

is the semantic feature vector of the

i

-th word and

s_{i}

is the syntactic feature vector of the

i

-th word. In the embedding layer, the text sequence feature vectors are

E = \{e_{1}, e_{2}, \dots, e_{n}\}

.

4.1.2. Entity Recognition Layer

In the entity recognition layer, the text is divided into segments of different lengths. A classifier is used to classify these segments to determine whether each segment belongs to a predefined entity category or is a nonentity. The entity categories are defined before model training. Let the entity categories be

O = \{o_{1}, o_{2}, \dots, o_{n}\}

, where

n

is the number of categories. The input text

U

is segmented to obtain segments

P = \{u_{i}, u_{i + 1}, \dots, u_{i + k}\}

, where the length of each segment is

k + 1

. To reduce misclassification during entity recognition and accurately identify entities and nonentities, a nonentity set

\{n o n e\}

is defined. Each segment

p

is mapped to

O \cup \{n o n e\}

and the input vector for the segment classifier is constructed by concatenating the segment’s semantic feature vector, length feature vector, and global feature vector.

(1) Segment semantic feature vector: This is obtained by performing global average pooling on the word embedding vectors of each word within the segment, representing the semantic content of the entire segment. The word vectors corresponding to segment

p

are

e (p)

, and the pooled vector is calculated as shown in Equation (4) [29]

span = Avgpooling (e_{i}, e_{i + 1}, \dots, e_{i + k})

(4)

(2) Segment length feature vector: A predefined embedding matrix

W

contains a feature vector

w_{k + 1}

for each possible segment length, so that every segment is assigned the vector corresponding to its length. The embedding matrix can be learned through backpropagation and its contents are updatable, thereby enhancing the model’s capability to process text.

(3) Segment global feature vector: The global feature vector of a text segment is extracted through a pooling attention mechanism that captures the contextual information of the text segment. The attention mechanism calculates the relevance weights between the text segment and other parts of the text. Then, average pooling is applied to aggregate the weighted features, resulting in a compact representation of the global context, as shown in Figure 3.

The calculation process of the pooling attention mechanism is as follows:

Step 1: Construct the query vector, key vector, and value vector. Convert the segment semantic feature vector

e (p)

into a query vector

Q_{p}

and convert the sentence semantic feature vector

E

into a key vector

K_{p}

and a value vector

V_{p}

. These converted vectors are used to calculate the attention weights, thereby extracting the most relevant contextual information for the specific text segment. The calculation formulas are shown in Equations (5)–(7) [30]

Q_{p} = e (p) W_{p}^{Q}

(5)

K_{p} = E W_{p}^{K}

(6)

V_{p} = E W_{p}^{V}

(7)

In this context, the dimensions of the vectors are as follows: the query vector

Q_{p}

has a dimension of

(k + 1) \times d_{q}

, the key vector

K_{p}

has a dimension of

n \times d_{k}

, and the value vector

V_{p}

has a dimension of

n \times d_{v}

.

Step 2: In the pooling attention mechanism, calculate the attention coefficient

att

based on the scaled dot product between the query vector

Q_{p}

and the key vector

K_{p}

. This coefficient is used to assess the relevance of each word in the sentence to the specific text segment, such as entity pairs. Its dimension is

n

and

T

denotes the transpose operation, as shown in Equation (8) [31,32]

att = softmax (maxpooling (\frac{Q_{p} K_{p}^{T}}{\sqrt{d_{k}}}))

(8)

Step 3: Use the attention coefficient

att

to generate the final weighted representation. Multiply the attention coefficient

att

by the value vector

V_{p}

to obtain the segment global vector

c_{p}

, as shown in Equation (9) [31,32]

c_{p} = att \cdot V_{p}

(9)
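Steps 1 to 3 can be sketched in pure Python as follows. This is an illustrative reading of Equations (5)–(9), not the authors' code: it assumes that max pooling runs over the segment's words, so that one score per sentence word remains (matching the stated dimension n), and that d_q equals d_k. The weight matrices and input values are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def pooling_attention(e_p, E, W_q, W_k, W_v):
    """Segment global vector c_p in the spirit of Equations (5)-(9).

    e_p: (k+1) x d word vectors of the segment; E: n x d sentence vectors.
    """
    Q = matmul(e_p, W_q)  # (k+1) x d_q, Equation (5)
    K = matmul(E, W_k)    # n x d_k,     Equation (6)
    V = matmul(E, W_v)    # n x d_v,     Equation (7)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scaled = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    # Max pooling over the segment words (assumption): one score per sentence word
    pooled = [max(col) for col in zip(*scaled)]
    att = softmax(pooled)  # length n, Equation (8)
    # Equation (9): attention-weighted sum of the value vectors
    c_p = [sum(att[j] * V[j][d] for j in range(len(V))) for d in range(len(V[0]))]
    return att, c_p

# Toy sentence of 4 words with 2-dimensional vectors; the segment is words 1..2
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
e_p = E[1:3]
I2 = [[1.0, 0.0], [0.0, 1.0]]  # identity projections, for readability only
att, c_p = pooling_attention(e_p, E, I2, I2, I2)
```

With these toy values the largest weight falls on the third word, whose vector has the largest dot product with the segment's words.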

Concatenate the three types of feature vectors to obtain the comprehensive input text feature vector

x_{p}

, as shown in Equation (10) [31,32].

x_{p} = [span; w_{k + 1}; c_{p}]

(10)

Finally, use a fully connected network and a softmax function to construct a classifier model to classify the input text segments. For the input feature vector

x_{p}

, the fully connected network first performs a series of linear transformations and nonlinear activations and then outputs a probability distribution

{\hat{y}}_{p}

through the softmax function. This distribution represents the probability that the input segment belongs to the corresponding entity category, as shown in Equation (11) [33].

{\hat{y}}_{p} = softmax (W_{p} x_{p} + b_{p})

(11)

The model thus outputs a probability distribution for each segment, from which the entity category of the segment is predicted.
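A compact sketch of the segment classifier is given below, assuming a single linear layer before the softmax and a fixed maximum segment length. The label names, the length embeddings, the placeholder global vector, and the weights are hypothetical values chosen only to make the example runnable.

```python
import math

def enumerate_spans(n, max_len):
    """All inclusive (start, end) index pairs of segments with at most max_len words."""
    return [(i, i + k) for i in range(n) for k in range(max_len) if i + k < n]

def avg_pool(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def classify_span(E, span, length_emb, c_p, W, b, labels):
    i, j = span
    span_vec = avg_pool(E[i:j + 1])  # Equation (4)
    w_len = length_emb[j - i + 1]    # length feature vector w_{k+1}
    x_p = span_vec + w_len + c_p     # Equation (10): concatenation
    logits = [sum(w * x for w, x in zip(row, x_p)) + bias
              for row, bias in zip(W, b)]
    probs = softmax(logits)          # Equation (11)
    return labels[probs.index(max(probs))], probs

# Hypothetical toy setup: 3 words, 2-dimensional embeddings
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
length_emb = {1: [0.1], 2: [0.2]}
c_p = [0.0, 0.0]                     # placeholder for the global feature vector
labels = ["artifact", "pattern", "none"]
W = [[1.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]

spans = enumerate_spans(len(E), max_len=2)
label, probs = classify_span(E, spans[0], length_emb, c_p, W, b, labels)
```

Every enumerated segment receives exactly one label from the entity categories plus "none", which is how nonentity segments are filtered out.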

4.1.3. Relationship Classification Layer

The objective of the relationship extraction task is to identify entity pairs and their relationships from the text. A segmental attention fusion mechanism is used, allowing the model to accurately extract key information by considering both the entities and the contextual information, as illustrated in Figure 4. Take the relationship categories of entities as

H = \{h_{1}, h_{2}, \dots, h_{n}\}

, where

n

is the number of relationship types, and determine the relationship between segments

p_{1}

and

p_{2}

in the input text

P

. The text is divided into five parts based on the entity positions, enabling the model to accurately learn the contextual information before, between, and after the entities. By combining the text feature vectors

E

, the feature vectors of entity 1 and entity 2,

e (g_{1})

and

e (g_{2})

, respectively, the feature vectors of the text to the left of entity 1

c_{left}

, the text between the entities

c_{middle}

, and the text to the right of entity 2

c_{right}

are obtained. In the case of overlapping entities, the text feature vector between the two entities is defined as the text feature vector of this overlapping part.

In the relationship extraction task, the sentence is divided into five parts of different lengths, so their vector representations have different dimensions. Average pooling is used to map text segments of different lengths to vectors of the same dimension. The feature vectors of the five text segments are concatenated to obtain the sentence feature vector, as shown in Equations (12) to (17). In these equations,

c_{left}

represents the text to the left of entity 1,

c_{middle}

represents the text between the entities,

c_{right}

represents the text to the right of entity 2,

e (g_{1})

represents entity 1, and

e (g_{2})

represents entity 2 [34].

t_{left} = Avgpooling (c_{left})

(12)

t_{g1} = Avgpooling (e (g_{1}))

(13)

t_{middle} = Avgpooling (c_{middle})

(14)

t_{g2} = Avgpooling (e (g_{2}))

(15)

t_{right} = Avgpooling (c_{right})

(16)

T = [t_{left}; t_{g1}; t_{middle}; t_{g2}; t_{right}]

(17)
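The five-part split and pooling of Equations (12)–(17) can be sketched as follows. The sketch assumes that entity 1 precedes entity 2 without overlap and that an empty context part is represented by a zero vector; neither choice is stated in the text, and the token vectors are made-up values.

```python
def avg_pool(vectors, dim):
    """Average a list of vectors; an empty part yields a zero vector (assumption)."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def five_part_features(E, g1, g2):
    """E: n x d token vectors; g1, g2: inclusive (start, end) spans, g1 before g2."""
    d = len(E[0])
    (s1, e1), (s2, e2) = g1, g2
    parts = [
        E[:s1],           # text left of entity 1
        E[s1:e1 + 1],     # entity 1
        E[e1 + 1:s2],     # text between the entities
        E[s2:e2 + 1],     # entity 2
        E[e2 + 1:],       # text right of entity 2
    ]
    # Rows of T: t_left, t_g1, t_middle, t_g2, t_right (Equation (17))
    return [avg_pool(p, d) for p in parts]

# Toy sentence of 6 tokens with 2-dimensional vectors
E = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0], [7.0, 1.0]]
T = five_part_features(E, g1=(1, 1), g2=(3, 4))
```

Whatever the sentence length, T always has five rows of equal width, which is what allows the later layers to treat the five parts uniformly.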

Then, a long short-term memory (LSTM) network is used to process the fused text feature vectors. LSTM is a powerful tool for handling text data as it can capture long-term dependencies in sentences or documents. Bidirectional LSTM, an improved version of RNN, includes both forward and backward propagation. At each time step, it contains an LSTM cell that selectively remembers, forgets, and outputs information. The process is described by Equations (18) to (23) [35].

i_{t} = σ (W_{x i} x_{t} + W_{h i} h_{t - 1} + W_{c i} c_{t - 1} + b_{i})

(18)

f_{t} = σ (W_{x f} x_{t} + W_{h f} h_{t - 1} + W_{c f} c_{t - 1} + b_{f})

(19)

g_{t} = \tanh (W_{x c} x_{t} + W_{h c} h_{t - 1} + W_{c c} c_{t - 1} + b_{c})

(20)

c_{t} = i_{t} g_{t} + f_{t} c_{t - 1}

(21)

o_{t} = σ (W_{x o} x_{t} + W_{h o} h_{t - 1} + W_{c o} c_{t - 1} + b_{o})

(22)

h_{t} = o_{t} \tanh (c_{t})

(23)

The model’s output includes two results from both forward and backward directions, concatenated as

h_{t} = [\overset{\leftarrow}{h_{t}}, \vec{h_{t}}]

, and output as the final bi-LSTM result.
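As an illustration of Equations (18)–(23) and of the bidirectional concatenation, the following pure-Python sketch runs one LSTM over the sequence in each direction. It is a didactic sketch rather than the trained model: the gates use the standard logistic sigmoid, the parameter layout (a dictionary keyed by gate name) is a hypothetical convenience, and the weights are arbitrary toy values.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One cell update; p maps each gate name to (W_x, W_h, W_c, b)."""
    def pre(gate):
        W_x, W_h, W_c, b = p[gate]
        return [u + v + w + t for u, v, w, t in
                zip(matvec(W_x, x), matvec(W_h, h_prev), matvec(W_c, c_prev), b)]
    i = [sigmoid(v) for v in pre("i")]    # input gate, Equation (18)
    f = [sigmoid(v) for v in pre("f")]    # forget gate, Equation (19)
    g = [math.tanh(v) for v in pre("c")]  # candidate state, Equation (20)
    c = [it * gt + ft * cp for it, gt, ft, cp in zip(i, g, f, c_prev)]  # (21)
    o = [sigmoid(v) for v in pre("o")]    # output gate, Equation (22)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                    # (23)
    return h, c

def run_lstm(xs, p, hidden):
    h, c = [0.0] * hidden, [0.0] * hidden
    outs = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outs.append(h)
    return outs

def bilstm(xs, p_fwd, p_bwd, hidden):
    fwd = run_lstm(xs, p_fwd, hidden)
    bwd = run_lstm(xs[::-1], p_bwd, hidden)[::-1]
    # Concatenate [backward, forward] for every time step
    return [b + f for f, b in zip(fwd, bwd)]

def make_params(scale):
    W = [[scale, 0.0], [0.0, scale]]
    return {gate: (W, W, W, [0.0, 0.0]) for gate in ("i", "f", "c", "o")}

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
p_fwd, p_bwd = make_params(0.5), make_params(-0.5)
H = bilstm(xs, p_fwd, p_bwd, hidden=2)
```

Each output vector has twice the hidden size, since it carries the context read from both directions.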

The self-attention mechanism is employed to calculate and process the relationships and dependencies within the text, achieving extraction and integration of the feature vectors from the five text segments. The self-attention mechanism is used to calculate the relevance of the input data, including the query vector

Q

, key vector

K

, and value vector

V

, obtained through a linear transformation of the input data, as shown in Equations (24) to (26) [32]. In these equations,

W

is the weight matrix and the dimensions of the query vector

Q

, key vector

K

, and value vector

V

are

5 \times d_{q}

,

5 \times d_{k}

, and

5 \times d_{v}

, respectively.

Q = T W^{Q}

(24)

K = T W^{K}

(25)

V = T W^{V}

(26)

Then, the attention weights between the five text segments are calculated. First, the dot product operation is used to calculate the attention coefficients between the query vector

Q

and the key vector

K

. Next, the

softmax

function is applied to normalize the attention coefficients. The normalized attention coefficients are then multiplied by the value vector

V

to obtain the fused feature vector

h_{r}

, as shown in Equation (27) [33]

h_{r} = softmax (\frac{Q K^{T}}{\sqrt{d_{k}}}) V

(27)
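Equations (24)–(27) correspond to standard scaled dot-product self-attention over the five segment vectors. The sketch below assumes the softmax is applied row by row and uses identity projection matrices and made-up segment vectors purely for readability.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(T, W_q, W_k, W_v):
    """Scaled dot-product self-attention over the rows of T."""
    Q = matmul(T, W_q)   # Equation (24)
    K = matmul(T, W_k)   # Equation (25)
    V = matmul(T, W_v)   # Equation (26)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]  # one distribution per segment
    return matmul(weights, V), weights          # Equation (27)

# Five segment vectors (left, entity 1, middle, entity 2, right), toy values
T = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [2.5, 4.5], [7.0, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
h_r, weights = self_attention(T, I2, I2, I2)
```

The 5 × 5 weight matrix shows how strongly each of the five parts attends to every other part when the fused representation is formed.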

Next, a fully connected network is used to learn the relationship features between entity pairs. The

sigmoid

function is employed to convert the network outputs into an independent probability for each relationship category, so that several relationships can hold for the same entity pair. The model is trained by minimizing the cross-entropy loss between the predicted probabilities and the true labels, as shown in Equation (28) [33].

{\hat{y}}_{r} = sigmoid (W_{r} h_{r} + b_{r})

(28)

4.2. Model Training and Optimization

The input dataset is divided into training, validation, and test sets. After preprocessing the input data, the model needs to undergo parameter tuning and validation evaluation.

The parameter tuning process consists of two stages: one involves adjusting and optimizing the model’s hyperparameters using the validation set and the other involves adjusting and optimizing the model’s internal parameters using the training set. During the model training process, loss functions are designed for the entity recognition task and the relationship extraction task to calculate the error between the model’s predictions and the actual values, reflecting the accuracy of the model’s predictions.

The error calculated from the loss function is used to optimize the model’s parameters through the backpropagation algorithm, adjusting the internal parameters of the model to minimize the loss function value. If the model’s performance does not significantly improve or if overfitting occurs, adjustments to the model structure, such as changing the number of network layers, may be necessary.

The entity recognition task can be treated as a multiclass classification problem, using the cross-entropy loss function as the loss function for the entity recognition task, as shown in Equation (29) [36]

L_{NER} = - \sum_{i = 0}^{n + 1} y_{i} \lg ({\hat{y}}_{i}^{p})

(29)

In the formula,

n

is the total number of entity labels in the entity recognition task and

{\hat{y}}_{i}^{p}

is the probability that the model predicts segment

p

belongs to the

i

-th entity category.

y_{i}

represents the actual label information, using one-hot encoding, where

y_{i} = 1

indicates that the segment

p

belongs to the

i

-th category; if

y_{i} = 0

, it indicates that the segment

p

does not belong to the

i

-th category. In the relationship extraction task, to handle a single entity potentially having multiple types of relationships, a multilabel cross-entropy loss function is used to calculate the loss. This loss function is suitable for multilabel classification problems, where each relationship category is treated as an independent binary classification problem. For each possible relationship type, the model predicts a probability indicating whether the relationship exists. The calculation formula is shown in Equation (30) [36].

L_{RE} = - \sum_{k = 1}^{r} [y_{k} \lg ({\hat{y}}_{k}^{r}) + (1 - y_{k}) \lg (1 - {\hat{y}}_{k}^{r})]

(30)

In the formula,

r

represents the total number of possible relationship categories and

{\hat{y}}_{k}^{r}

is the probability predicted by the model, indicating the model’s prediction of the existence of a relationship of the

k

-th category between two entities.

y_{k}

represents the true distribution of the relationship category labels, using one-hot encoding. If

y_{k} = 1

, it means that there is a relationship of the

k

-th category between the two entities; if

y_{k} = 0

, it means that there is no relationship of the

k

-th category between the two entities.

The overall loss function of the model consists of the loss function for entity recognition and the loss function for relationship extraction, as shown in Equation (31) [36]

L = L_{NER} + L_{RE}

(31)

By minimizing the total loss, the model’s overall extraction capability can be improved, achieving joint optimization for both entity recognition and relationship extraction tasks.
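The joint objective can be sketched as the sum of a multiclass cross-entropy and a multilabel binary cross-entropy. The sketch assumes natural logarithms (a different base only rescales the loss) and adds a small constant eps for numerical safety; the example probabilities are hypothetical.

```python
import math

def ner_loss(y_true, y_pred, eps=1e-12):
    """Cross-entropy for one segment: y_true is one-hot, y_pred are softmax outputs."""
    return -sum(y * math.log(p + eps) for y, p in zip(y_true, y_pred))

def re_loss(y_true, y_pred, eps=1e-12):
    """Multilabel cross-entropy: one independent binary term per relationship category."""
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, y_pred))

def joint_loss(ner_true, ner_pred, re_true, re_pred):
    """Sum of the two task losses, as in Equation (31)."""
    return ner_loss(ner_true, ner_pred) + re_loss(re_true, re_pred)

# One segment with three entity categories and one entity pair with two relations
total = joint_loss([0, 1, 0], [0.2, 0.7, 0.1], [1, 0], [0.9, 0.2])
```

Because both terms share the same encoder, a gradient step on the sum updates the shared parameters with respect to both tasks at once.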

5. Experiment Validation

5.1. Dataset Creation

To validate the effectiveness of the model in handling entity relations, we conduct tests on both Chinese and English datasets. For the Chinese dataset, we use the Tang Dynasty Gold and Silver Artifacts Knowledge Text Dataset.

In the collection of artifact knowledge aimed at design, the preliminary stage requires acquiring a large amount of artifact knowledge information, filtering out irrelevant texts, and retaining key descriptions of the artifact’s shape, decoration, craftsmanship, and cultural origin. To achieve this goal, we need to first gather a substantial amount of textual information. The unstructured data are based on authoritative historical materials and literature related to Tang Dynasty gold and silver artifacts, which nearly cover all significant collections of Tang Dynasty gold and silver artifacts [3,4,5,37,38,39,40,41,42,43,44,45]. These materials offer comprehensive studies from the perspectives of archaeology and history, providing detailed descriptions of various aspects, such as the artifacts’ shapes and decorations, thereby enriching the dataset’s corpus. Scholar Qi Dongfang’s book “Study of Tang Dynasty Gold and Silver Artifacts” offers detailed descriptions of the functions and series of gold and silver artifacts, laying the foundation for the dataset’s categorization. The selected artifacts in the dataset include not only everyday containers but also Buddhist artifacts unearthed from the Famen Temple underground palace, such as offering vessels and ritual implements. These gold and silver artifacts, mostly offered by Tang emperors Yizong and Xizong, exhibit high aesthetic value in their shapes and decorations due to their royal origin. Therefore, the gold and silver artifacts unearthed from the Famen Temple underground palace are also an important part of the dataset. According to the research of scholars, analyzing the fundamental properties of artifacts includes their function, technology, appearance, and symbolic meaning. These summarize the basic information and interpretation of cultural relics, which facilitates a deeper understanding of their knowledge.

During the text training process, we use the ChatGPT-4 model for data augmentation. Traditional data augmentation methods (such as synonym replacement, random deletion, etc.) have limitations in generating text with accuracy and diversity and still require manual annotation in many application scenarios [46,47,48,49,50,51,52]. Multiple studies have shown the powerful capability of ChatGPT in text data augmentation, significantly improving classification performance in few-shot learning [53]. The availability of unstructured text data enhances the model by integrating dimensions of usability, determinants, and rules. Through methods such as synonym replacement, text expansion, and adding or rewriting contextual information, the dataset is enhanced while preserving the original entity relationships. Based on modules of shape, decoration, craftsmanship, and cultural connotation of Tang Dynasty gold and silver artifacts, structured triple information is used to annotate the text, defining entity categories and relationship categories. This process ultimately constructs the Tang Dynasty Gold and Silver Artifacts Knowledge Text Dataset. The dataset is illustrated in Table 1, taking the Gilded Musician Pattern Silver Cup as an example. It includes the shape, decoration, decorative techniques, parts of the artifact, and cultural origin of the Gilded Musician Pattern Silver Cup, covering most relationship categories of Tang Dynasty gold and silver artifacts.

The English dataset used is the New York Times (NYT) dataset. The NYT dataset comes from actual news articles with annotated entities and relationships and additional information. It is used as a benchmark to validate the performance of the PRGC framework in joint relationship triplet extraction tasks [54]. The dataset covers various types of relationships and includes overlapping triplets, effectively testing the model’s ability to handle overlapping relationships [55].

Table 2 summarizes the data volumes of the New York Times dataset and the Tang Dynasty gold and silver artifact knowledge dataset, divided into training, validation, and test sets. Sentences are classified into three categories based on the overlap of triplets:

Normal: no overlapping triplets.

Single entity overlap (SEO): one entity overlaps between two triplets, meaning one entity has relationships with multiple entities.

Entity pair overlap (EPO): multiple relationships exist between a pair of entities. A sentence can belong to both EPO and SEO, meaning it can simultaneously contain single entity overlaps and entity pair overlaps.
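The three categories can be assigned automatically from a sentence's triplets. The sketch below uses one common convention, stated here as an assumption: two triplets count as EPO when they involve the same pair of entities (in either order) and as SEO when they share exactly one entity. The entity and relationship names are hypothetical.

```python
def overlap_types(triplets):
    """Return the set of overlap labels for a sentence's (head, relation, tail) triplets."""
    labels = set()
    triples = list(set(triplets))
    for a in range(len(triples)):
        for b in range(a + 1, len(triples)):
            h1, _, t1 = triples[a]
            h2, _, t2 = triples[b]
            pair1, pair2 = {h1, t1}, {h2, t2}
            if pair1 == pair2:
                labels.add("EPO")   # several relations between the same entity pair
            elif pair1 & pair2:
                labels.add("SEO")   # one shared entity, different partners
    return labels or {"Normal"}

normal = overlap_types([("cup", "has_part", "cup body")])
seo = overlap_types([("cup", "has_part", "cup body"),
                     ("cup", "has_part", "foot ring")])
epo = overlap_types([("A", "r1", "B"), ("A", "r2", "B")])
both = overlap_types([("A", "r1", "B"), ("A", "r2", "B"), ("A", "r3", "C")])
```

The last call returns both labels, which reflects the statement that a sentence can belong to EPO and SEO simultaneously.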

5.2. Experimental Environment

The model training was conducted using an NVIDIA GeForce RTX 3090 GPU with 10 GB of memory. The programming language used was Python 3.9.12 and the deep learning framework was PyTorch 2.1.2+cu121.

5.3. Parameter Settings

Table 3 presents the key hyperparameters used in the training process. The learning rate is set to 1 × 10⁻⁵, ensuring gradual updates to the model weights during training. The maximum number of epochs is set to 30, indicating the model will be trained for at most 30 full cycles over the dataset. The batch size is 8, meaning the model processes 8 samples at a time before updating its parameters. A seed value of 42 is used to ensure reproducibility of the training results by controlling the randomization process.

5.4. Standard Dataset Experimental Results

To compare the joint learning model with other outstanding joint learning models, the following models were selected:

CopyRE: the core idea combines the sequence-to-sequence (Seq2Seq) model with a copy mechanism, allowing it to directly copy parts of the input text when generating output.

GraphRel: a relationship extraction model based on graph neural networks (GNN). This model utilizes graph structures to represent entities and relationships in a sentence and captures dependencies between entities and relationships through graph convolutional networks (GCN).

CopyMTL: by training multiple related tasks (such as entity recognition and relationship extraction) simultaneously, this model uses shared representation learning to improve the model’s generalization capability.

RSAN [56]: a relationship extraction model based on the self-attention mechanism. By introducing relation-aware self-attention, it captures interactions between entities and relationships in a sentence.

Table 4 shows the comparison results of the proposed model with other benchmark models on two standard datasets. It can be seen that our model generally outperforms all benchmark models on all three evaluation metrics (precision P, recall R, and F1 score).
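For reference, the three metrics can be computed from predicted and gold triplets as sketched below. The sketch assumes that a predicted triplet counts as correct only when head, relationship, and tail all match a gold triplet exactly; the matching criterion is not stated in the text, and the triplets are placeholders.

```python
def prf1(predicted, gold):
    """Precision, recall, and F1 over exact-match (head, relation, tail) triplets."""
    pred, gold = set(predicted), set(gold)
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [("a", "r", "b"), ("a", "r", "c"), ("b", "r", "c"), ("c", "r", "d")]
pred = [("a", "r", "b"), ("a", "r", "c"), ("x", "r", "y")]
p, r, f1 = prf1(pred, gold)
```

Here two of the three predictions are correct and two of the four gold triplets are recovered, giving P = 2/3, R = 1/2, and F1 = 4/7.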

Among the benchmark models, the F1 scores on the NYT dataset were relatively low, especially for CopyRE and GraphRel. This is due to the limitations of these benchmark models in feature extraction and context understanding, which prevent them from fully capturing complex semantic information and relationships.

To further validate the model’s performance in handling overlapping triplets, experiments were conducted on different types of sentences from various perspectives and compared with other benchmark models. As shown in Table 5, our model outperforms other models in normal sentences, EPO overlapping triplets, and SEO overlapping triplets across all datasets. The comparison of our model with other benchmark models on different types of sentences demonstrates that our model is more stable and superior in handling overlapping triplet tasks, whereas other benchmark models perform poorly in dealing with SEO and EPO sentences.

Table 6 compares the performance of each model on sentences containing different numbers of triplets. The sentences are divided into five categories: those containing 1, 2, 3, 4, and 5 or more triplets. The models’ precision (Prec), recall (Rec), and F1 scores are evaluated in detail for sentences of varying complexity. As the number of triplets in a sentence increases, the complexity of the sentence also increases. Table 6 shows the performance differences of each model in handling these complex sentences, particularly highlighting the superiority and stability of our model in complex scenarios.

5.5. Experimental Results on the Tang Dynasty Gold and Silver Artifacts Dataset

To evaluate the model’s performance in handling different types of overlapping triplets in the Tang Dynasty gold and silver artifacts dataset, Table 7 compares the performance of our model. This comparison aims to verify the model’s effectiveness in dealing with overlapping relationships. Table 7 shows the precision (Prec), recall (Rec), and F1 scores of the model on sentences with different types of overlapping triplets. By comparing these metrics, we can validate the model’s performance in handling tasks with different types of overlapping relationships.

The data indicate that our model performs exceptionally well on the Tang Dynasty gold and silver artifacts dataset, achieving high F1 scores on both normal sentences and EPO overlapping sentences. This demonstrates the model’s capability to effectively address the problem of overlapping entity relationships. The model demonstrated efficient performance during data processing. After setting an automatic early stopping mechanism triggered by 10 consecutive epochs without improvement, the training was completed within 19 epochs, with each epoch taking approximately 996 s. The training accuracy reached 0.8506, while the validation accuracy was 0.8050. All the above data have been compiled into Table 8. Even when utilizing an NVIDIA GeForce RTX 3060 GPU, the model operated efficiently, highlighting its computational effectiveness. Figure 5 illustrates the trend of training accuracy over the course of the 19 epochs.
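The early stopping rule can be sketched as a patience counter over per-epoch validation scores. The function name and the score sequence below are hypothetical; only the patience of 10 and the maximum of 30 epochs come from the text.

```python
def early_stopping_run(val_scores, patience=10, max_epochs=30):
    """Return (epochs_run, best_epoch) for a sequence of per-epoch validation scores."""
    best, best_epoch, stale = float("-inf"), 0, 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        if score > best:
            best, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch  # stop: no improvement for `patience` epochs
    return min(len(val_scores), max_epochs), best_epoch

# Made-up scores: improvement for 9 epochs, then a plateau
val_scores = [0.70 + 0.01 * e for e in range(9)] + [0.75] * 15
epochs_run, best_epoch = early_stopping_run(val_scores)
```

With this made-up sequence the run stops after epoch 19, ten epochs after the last improvement at epoch 9.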

6. Knowledge Retrieval of Tang Dynasty Gold and Silver Artifacts

Considering the vast amount of information, complexity of data, and lack of uniform data sources related to Tang Dynasty gold and silver artifacts, a unified data model based on knowledge graphs has been constructed. This model aims to meet the design knowledge extraction needs for cultural and creative product design. Using manual annotation, a large amount of knowledge about the shapes of artifacts from Tang Dynasty gold and silver collections was collected. Based on the research on the shapes, functions, patterns, decorations, usage scenarios, and symbolism of Tang Dynasty gold and silver artifacts in various series of books, the knowledge information has been summarized. The construction of the knowledge graph supports designers in efficiently gathering information and facilitates smoother preliminary research work.

6.1. Overall Functionality

The unified data model for knowledge information on Tang Dynasty gold and silver artifacts undergoes data preprocessing, knowledge extraction, data integration, and storage into a graph database. Based on the knowledge graph, information retrieval is conducted to assist designers in collecting information, thereby maximizing the value of textual information. The architecture of the unified data model for Tang Dynasty gold and silver artifacts knowledge information is shown in Figure 6.

The unified data model for knowledge information on Tang Dynasty gold and silver artifacts is divided into three layers: schema layer, data layer, and knowledge management layer. The following is an explanation of each layer.

(1) Schema layer: This layer determines the names, entities, attributes, and relationships of Tang Dynasty gold and silver artifacts. It constructs concepts, classifications, hierarchical structures, and ontology models in this domain, expressing various semantic relationships among entities to ensure the completeness and consistency of the knowledge graph data. From the top-level concepts, it further refines the structure to obtain the entities and relationships of gold and silver artifacts.

(2) Data layer: Structured data, such as tables and relational databases, can be directly converted into knowledge graph representations based on the unified data model for Tang Dynasty gold and silver artifacts. Historians and archaeologists have conducted systematic research on Tang Dynasty gold and silver artifacts, mainly documented in reports or published books, which are expressed in a standardized manner and contain dense and rich knowledge. These unstructured data need to be collected and organized manually. The dataset originates from published bibliographies related to Tang Dynasty gold and silver artifacts, with entities, attributes, and semantic relationships annotated.

(3) Knowledge management layer: The knowledge graph effectively represents the complex knowledge extracted and integrated from various sources, facilitating formatted storage and retrieval. To help designers manage knowledge of Tang Dynasty gold and silver artifacts and precisely locate various information, a knowledge management prototype system based on a graph database has been designed. The system functions are composed of four modules: data query, knowledge extraction, knowledge graph visualization, and knowledge base management.

Based on the unified data model for knowledge information on Tang Dynasty gold and silver artifacts, the system function modules are designed as shown in Figure 7.

The knowledge graph construction module encompasses knowledge information data management, knowledge graph building, knowledge management, and the updating and maintenance of the knowledge graph and models. Its main tasks include collecting and entering various types of knowledge information related to Tang Dynasty gold and silver artifacts and ensuring the systematic management and dynamic updating of knowledge information by continuously updating and iterating the knowledge graph through the constructed model during actual application. The knowledge graph application module consists of two main parts: practical tools and user management. The practical tools section is responsible for the visualization and efficient retrieval of the knowledge graph, providing search functions for quickly obtaining the needed information. The user management section focuses on managing user data permissions.

6.2. System Interface

Using the unified data modeling technology for knowledge information on Tang Dynasty gold and silver artifacts based on knowledge graphs, a knowledge retrieval platform has been constructed. Taking the “Gilded Musician Pattern Silver Cup” as an example, the practical workflow of the knowledge retrieval platform includes the following steps:

(1) Data transmission: As shown in Figure 8, users can upload knowledge information on the platform. After uploading the knowledge information to the database, the internal model is called to learn and process this knowledge. This process includes knowledge extraction, knowledge fusion, and integration with the existing knowledge graph. Finally, the processed knowledge information is stored in the database. Figure 9 shows the knowledge upload status, displaying the upload progress, including the number of files uploaded, and allows further management and modification of the uploaded knowledge information on this interface.

(2) Knowledge retrieval: As shown in Figure 10, after entering “Gilded Musician Pattern Silver Cup” in the search bar, the cursor points to the “Gilded Musician Pattern Silver Cup” entity. The right side displays the total number of all related entities, which is 26, the number of relationship types, which is 7, and all associated entities. Additionally, by selecting any entity in the graph, users can navigate to the pages of other related entities.

(3) Graph comparison: As shown in Figure 11, in the artifact comparison interface, users can search for and locate various types of artifact knowledge in the comparison section. The system calculates the number of entities related to gold and silver artifacts and the number of relationships between them. To visually present the data, the system generates a bar chart showing the top five relationships by count for the compared gold and silver artifacts. The horizontal axis of the bar chart represents the relationships and the vertical axis represents the frequency of each relationship in the corresponding artifacts. Additionally, the names of the gold and silver artifacts associated with specific relationships are listed below the chart to enable quick identification and further analysis by the user.

Figure 8. Knowledge upload interface.

Figure 9. View knowledge upload status.

Figure 10. Related entities of “Gilded Musician Pattern Silver Cup”.

Figure 11. Knowledge comparison.
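The counting behind the comparison bar chart can be sketched with a frequency table over an artifact's outgoing relationships. This is only an illustration of the idea, not the platform's code: the function name, the relationship names, and the triplets are hypothetical, and only relationships whose head is the artifact itself are counted.

```python
from collections import Counter

def top_relationships(triplets, artifact, k=5):
    """Most frequent relationship types among triplets whose head is `artifact`."""
    counts = Counter(rel for head, rel, tail in triplets if head == artifact)
    return counts.most_common(k)

cup = "Gilded Musician Pattern Silver Cup"
triplets = [  # hypothetical triplets for illustration
    (cup, "has_part", "cup body"),
    (cup, "has_part", "foot ring"),
    (cup, "has_part", "handle"),
    (cup, "has_decoration", "musician pattern"),
    (cup, "has_decoration", "fish-roe ground"),
    (cup, "made_by", "hammering"),
]
top = top_relationships(triplets, cup)
```

The resulting pairs map directly onto the chart: relationship names on the horizontal axis and their counts on the vertical axis.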

7. Discussion and Outlook

7.1. Association between Knowledge Graphs and Artifact Information

This paper takes the knowledge of Tang Dynasty gold and silver artifacts, part of China’s material heritage, as an example, focusing on the knowledge collection phase during the early design stage, addressing challenges such as knowledge data diversity, data redundancy, and data ambiguity. A unified data model for Tang Dynasty gold and silver artifact knowledge was constructed, using graph structures to express complex relationships and semantics between entities. By integrating semantic web technology and inference engines and leveraging linked data technology, the model enables the expression of more diversified knowledge. It can integrate heterogeneous data sources to form a unified knowledge representation. During the construction of the knowledge graph, users can intuitively and efficiently browse information through the integrated knowledge graph visualization function.

In prior research on Tang Dynasty gold and silver artifacts, researchers have generally followed a pattern in their descriptions of these artifacts: first providing a unified description of the form, components, and connection relationships of the artifacts, and then detailing each part’s features, patterns, and cultural connotations in a vertical order, such as from top to bottom or starting from the connection points. Based on Chinese language logic, different word types in the text are deconstructed, with irrelevant adjectives and auxiliary words being removed. For example, in the description of the silver cup with pointed lotus petals and wild goose patterns: “flared mouth, slightly outward-turned rim, curved belly expanding outward, trumpet-shaped octagonal high foot ring, with a circular petal hoop at the middle of the foot ring, and a circular tray at the top connecting the cup body. Formed by hammering, flat-chased patterns, gilded decorations, fish-roe ground. The cup body consists of two layers of lotus petals, nine petals per layer, with engraved birds, flowing clouds, trees, and landscapes within each petal. The foot ring tray is engraved with ruyi cloud patterns, while the foot surface is engraved with birds or symmetrical flowers”. In this description, “flared mouth, slightly outward-turned rim, curved belly expanding outward, trumpet-shaped octagonal high foot ring” provides a general description of the overall shape of the artifact, followed by “the cup body consists of two layers of lotus petals… the foot surface is engraved with birds or symmetrical flowers”, which describes the details of the artifact’s shape and explains the connection between the cup body and the foot ring. The silver cup with pointed lotus petals and wild goose patterns has two main components: the cup body and the high foot ring. 
The cup body features a flared mouth, an outward-turned rim, and an outward-expanding belly, while the high foot ring is trumpet shaped with eight petals and a circular petal hoop, connected to the cup body by a circular tray. The decorations on the cup body and foot ring are also described separately.

It becomes evident that the textual descriptions of various gold and silver artifacts closely align with the underlying construction logic of knowledge graphs. Important components can be extracted as nodes in the knowledge graph while connection and nesting structures can be extracted as edges (i.e., relationships), with material and excavation information serving as attributes. This structured approach simplifies these texts, making it easier for users to understand the complex knowledge system.
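The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data model used in this study: the class and function names (`ArtifactGraph`, `build_lotus_cup_graph`) and the `Connects` relation are hypothetical, while the `Part`, `Shape`, and `Decoration` relation names follow Table 1 and the facts come from the silver cup description quoted earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A component or feature of an artifact, with free-form attributes."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Edge:
    """A directed, labeled relationship (head, relation, tail)."""
    head: str
    relation: str
    tail: str


class ArtifactGraph:
    """Minimal in-memory graph: nodes keyed by name plus a list of edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, name, **attributes):
        node = self.nodes.setdefault(name, Node(name))
        node.attributes.update(attributes)
        return node

    def add_edge(self, head, relation, tail):
        # Create missing endpoints so every edge refers to known nodes.
        self.add_node(head)
        self.add_node(tail)
        self.edges.append(Edge(head, relation, tail))

    def neighbors(self, head, relation=None):
        """Tails reachable from `head`, optionally filtered by relation."""
        return [e.tail for e in self.edges
                if e.head == head and (relation is None or e.relation == relation)]


CUP = "Silver cup with pointed lotus petals and wild goose patterns"


def build_lotus_cup_graph():
    g = ArtifactGraph()
    # Material and technique are stored as attributes of the artifact node.
    g.add_node(CUP, material="silver", technique="hammering")
    # Components become nodes; nesting and connections become edges.
    g.add_edge(CUP, "Part", "Cup body")
    g.add_edge(CUP, "Part", "High foot ring")
    g.add_edge("High foot ring", "Part", "Circular tray")
    g.add_edge("Circular tray", "Connects", "Cup body")  # hypothetical relation name
    g.add_edge("Cup body", "Shape", "Flared mouth")
    g.add_edge("Cup body", "Decoration", "Lotus petals")
    g.add_edge("Circular tray", "Decoration", "Ruyi cloud pattern")
    return g


graph = build_lotus_cup_graph()
print(graph.neighbors(CUP, "Part"))
```

Keeping material and technique as node attributes rather than as separate nodes mirrors the distinction drawn in the text between components, relationships, and attributes; a production system would more likely store the same triples in a graph database.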

7.2. Effectiveness of Model Construction

The research combines graph attention networks (GAT) with the BERT pretraining model for joint extraction, significantly enhancing the ability to represent and integrate complex knowledge systems. The BERT model deeply encodes the text data related to gold and silver artifacts, and its bidirectional contextual understanding captures subtle semantic information within the text, supporting the accurate extraction of each entity and its related descriptions. This process is particularly suitable for the complex and multilayered descriptive structures of gold and silver artifacts, from the overall shape of the artifact to the detailed features of its patterns and connections, all of which BERT can effectively identify.

After semantic encoding is completed, the dependency analysis module further parses the structural relationships between entities. For instance, in the description of the two key components “cup body” and “foot ring”, the dependency analysis can clearly determine their connection method and the interaction between them. Through this analysis, the model is not only able to recognize individual entities but can also systematically extract the structural dependency information between entities.
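One common way to hand dependency information to a graph network is to turn the parser's arcs into an adjacency matrix over tokens. The sketch below assumes that representation; it is not taken from the model's implementation. The token list and the arcs are written by hand for illustration (a real pipeline would obtain them from a dependency parser), and `arcs_to_adjacency` is a hypothetical helper name.

```python
def arcs_to_adjacency(num_tokens, arcs, self_loops=True):
    """Build a symmetric 0/1 adjacency matrix from (head, dependent) index pairs."""
    adj = [[0] * num_tokens for _ in range(num_tokens)]
    for head, dep in arcs:
        # Treat each dependency arc as undirected so information can flow both ways.
        adj[head][dep] = 1
        adj[dep][head] = 1
    if self_loops:
        # Self-loops let each token keep its own representation when aggregating.
        for i in range(num_tokens):
            adj[i][i] = 1
    return adj


# Hand-written example: "circular tray connects cup_body (and) foot_ring".
TOKENS = ["circular", "tray", "connects", "cup_body", "foot_ring"]
ARCS = [
    (2, 1),  # connects -> tray       (subject)
    (1, 0),  # tray     -> circular   (modifier)
    (2, 3),  # connects -> cup_body   (object)
    (3, 4),  # cup_body -> foot_ring  (conjunct)
]

adjacency = arcs_to_adjacency(len(TOKENS), ARCS)
for token, row in zip(TOKENS, adjacency):
    print(f"{token:>10}: {row}")
```

In this toy sentence, "cup_body" and "foot_ring" end up two hops apart through "connects" and the conjunct arc, which is the kind of structural link between components that the dependency analysis is meant to expose.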

The graph attention network (GAT) further processes these entity nodes and their relationships by assigning weights. When handling complex descriptions of gold and silver artifacts, GAT excels at capturing the interdependencies between elements such as shape, pattern, and material. For example, in the case of the “silver cup with pointed lotus petals and wild goose patterns”, the cup body and high foot ring are treated as key modules and set as nodes in the knowledge graph while their connection and pattern details are precisely represented through edges and attributes in the graph structure. Through this structured approach, the model not only faithfully reconstructs the dependency relationships between complex entities but also ensures that the representation of this knowledge in the graph is more intuitive and organized.
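The weight assignment can be illustrated with the standard single-head GAT attention rule, in which the score for a node pair is LeakyReLU(a · [W h_i ‖ W h_j]) and the scores are normalized by a softmax over each node's neighbors. The sketch below is a pure-Python illustration of that rule under stated assumptions: the three nodes, their feature vectors, the matrix `W`, and the vector `A` are arbitrary hand-picked values (in a trained model they are learned), and the function names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_attention(features, adjacency, W, a):
    """Return alpha[i][j], the attention node i pays to neighbor j.

    Non-neighbors receive 0; each row sums to 1 over the node's neighbors.
    """
    z = [matvec(W, h) for h in features]  # linearly transformed node features
    n = len(features)
    alpha = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in range(n) if adjacency[i][j]]
        if not nbrs:
            continue  # isolated node without a self-loop: nothing to attend to
        # Unnormalized score: LeakyReLU(a . [z_i || z_j]); z[i] + z[j] concatenates lists.
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Numerically stable softmax restricted to the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for j, e in zip(nbrs, exps):
            alpha[i][j] = e / total
    return alpha


# Toy graph: 0 = cup body, 1 = circular tray, 2 = high foot ring.
# The tray touches both other parts; self-loops are included.
ADJ = [[1, 1, 0],
       [1, 1, 1],
       [0, 1, 1]]
FEATURES = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # illustrative values only
W = [[1.0, 0.0], [0.0, 1.0]]                     # identity, for readability
A = [0.3, -0.2, 0.5, 0.1]                        # illustrative values only

for row in gat_attention(FEATURES, ADJ, W, A):
    print([round(x, 3) for x in row])
```

Because the softmax runs only over neighbors, the cup body assigns no weight to the foot ring directly; any influence between them passes through the tray node, which matches the connection structure described in the text.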

Compared with traditional knowledge extraction methods, the joint application of GAT and BERT combines semantic analysis with multidimensional graph structure processing. It not only automates the extraction of entities and relationships but also, through deep exploration of the text’s semantics and graph structures, supports the stable operation of the entire knowledge system.

7.3. Digital Innovation in Cultural Heritage Preservation

The knowledge graph platform constructed for Tang Dynasty gold and silver artifacts lets users quickly access and integrate knowledge resources within the field. Because these artifacts are numerous and their descriptions vary across periods, the knowledge information is extensive and complex. The platform unifies and organizes the descriptive data from various monographs, so that users can conveniently and efficiently access accurate heritage knowledge and explore and apply it in depth, thereby significantly improving the efficiency of knowledge integration.

For designers, the knowledge resources integrated into the platform allow them to flexibly apply elements such as the shapes and patterns of Tang Dynasty gold and silverware in modern design, accelerating design innovation. For the general public, the platform provides a quick and effective way to understand the complex and difficult knowledge of artifacts, not only preserving cultural heritage but also promoting its recreation and application in contemporary society. Through semantic analysis and reasoning functions, the platform uncovers potential knowledge hidden behind different entities and relationships. For example, by analyzing similarities between different artifacts and the inheritance of craftsmanship techniques, it can reveal more about the development patterns and cultural background of Tang Dynasty gold and silver artifacts, offering new perspectives and directions for academic research on cultural heritage.
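As a rough illustration of how similarity between artifacts might be measured on graph data, the sketch below compares two artifacts by the Jaccard overlap of their decoration and technique labels. This is an assumed, simplified measure and not the platform's reasoning function. The feature sets are taken from Table 1 and from the silver cup description in Section 7.1, and treating "flat-chased" as "flat-chiseling" and "fish-roe ground" as "fish-egg pattern" is a normalization assumption made here for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two label collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Labels collected from the two artifact descriptions (normalized by hand).
musician_cup = {
    "gilding", "flat-chiseling", "fish-egg pattern",
    "beaded pattern", "scrolling vines",
}
lotus_cup = {
    "gilding", "flat-chiseling", "fish-egg pattern",
    "hammering", "lotus petals", "ruyi cloud pattern",
}

shared = sorted(musician_cup & lotus_cup)
score = jaccard(musician_cup, lotus_cup)
print(shared, score)  # three shared labels out of eight distinct ones -> 0.375
```

The shared labels in this toy comparison are all craftsmanship techniques, which hints at how overlap in the graph could point to inherited techniques, although any such conclusion would need expert validation.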

This research introduces knowledge graph technology, providing a new method for the digital preservation of cultural heritage. This approach not only systematizes and visualizes vast, dispersed heritage knowledge but also forms a comprehensive knowledge network based on data linkage. With this technical approach, cultural heritage is no longer just static historical data but becomes a dynamic digital asset that can be displayed, explored, and studied interactively.

7.4. Limitations and Outlook

The model primarily relies on existing textual data for training and knowledge extraction, and the descriptions of Tang Dynasty gold and silver artifacts may vary across sources because of differences in time periods and literary styles. In the early stages of annotation, a certain level of manual intervention is required, especially during the construction and validation phases of the knowledge graph. Reducing the need for human involvement and increasing automation will help accelerate the construction of the knowledge graph and enhance its applicability across different fields. While the current visualization interface has improved the user experience to some extent, its interactivity and user-friendliness still need enhancement. The needs of different user groups, such as designers, researchers, and the general public, may differ. Therefore, balancing the demands of various users and providing personalized knowledge displays and interactive interfaces is a direction for future system optimization.

Future research will continue to enhance data collection efforts, integrating more heterogeneous data sources, such as archaeological literature, images, and physical records, to form a more comprehensive and enriched knowledge graph of cultural heritage. In response to practical needs, research will also focus on developing a knowledge-based Q&A system for cultural heritage design, utilizing the knowledge graph to handle more complex natural language processing tasks, making the retrieval and collection of design knowledge more efficient. Additionally, research will explore methods for integrating cross-domain data to establish connections between different cultural heritages. The knowledge management experience from Tang Dynasty gold and silver artifacts will be expanded to other cultural heritage fields, such as bronzeware, porcelain, calligraphy, and painting, to build a larger-scale knowledge graph system. By promoting this experience, a unified knowledge management platform encompassing a broader range of cultural heritage categories can be established, further advancing the digital preservation of China’s material cultural heritage.

Author Contributions

Methodology, Y.W.; Validation, Y.W. and W.W.; Investigation, J.L. and L.S.; Resources, J.L. and J.C.; Data curation, J.L.; Writing—original draft, J.L.; Writing—review & editing, Y.W. and W.W.; Visualization, X.Y.; Supervision, Z.W.; Project administration, Q.P. All authors have read and agreed to the published version of the manuscript.

Funding

Ministry of Education Humanities and Social Sciences Research Planning Fund Project (Project Number: 23XJCZH014); Ministry of Education Humanities and Social Sciences Research Planning Fund Project (Project Number: 23YJA760107); Shaanxi Province Innovation Capacity Support Plan Funded Project (Project Number: 2023-CX-PT-37); Shaanxi Province Natural Science Basic Research Plan Project (Project Number: 2023-JC-QN-0524); Shaanxi Province Social Science Fund Project (Project Number: 2022J056).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available on request due to copyright restriction.

Acknowledgments

First and foremost, I would like to sincerely thank the reviewers and editors for their valuable suggestions and assistance throughout the course of this research. Their professional scrutiny and rigorous recommendations have greatly improved the quality and completeness of this paper. Every piece of feedback helped me to deeply reflect on the issues in the research and, through continuous refinement, contributed to the progress of this work. I am especially grateful for their patient and meticulous review and guidance, which played a crucial role in the final completion of the paper. Furthermore, I would also like to thank my team members for their support and contributions to this research. Throughout the entire process, everyone worked closely together, encouraged each other, and faced numerous challenges as a team. Whether it was during experimental design, data collection, or analysis and discussion, each member demonstrated a strong sense of responsibility and professionalism. Once again, I extend my heartfelt thanks to all those who contributed to this research!

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Unified data modeling for knowledge information based on knowledge graphs.

Figure 2. Entity-relationship joint extraction model based on segmental attention fusion mechanism.

Figure 3. Pooling attention mechanism.

Figure 4. Segmental attention fusion mechanism.

Figure 5. Training accuracy trend over 19 epochs.

Figure 6. Architecture diagram of the unified data model for knowledge information on Tang Dynasty gold and silver artifacts.

Figure 7. Framework of the Tang Dynasty gold and silver artifacts knowledge retrieval system.

Table 1. Information of sample instances.

Name	Text	Head Entity	Relationship	Tail Entity
Gilded Musician Pattern Silver Cup	The gilded musician pattern silver cup has an octagonal body, with each edge slightly concave in the middle and arched at both ends, forming a curved beaded pattern that adds a touch of softness to the otherwise rigid lines of the shape. Each of the eight curved surfaces of the cup body features a musician: four are playing instruments like the pan flute, small drum, vertical flute, or pipa, while the other four are dancing with sleeves, holding a pot, or holding a cup. All eight musicians are depicted as Hu people, adorned with flat-chiseled scrolling vines, mountains, and birds, and fish-egg patterns as background. The bottom of the cup is bordered with a string of beads and turns inward to form a round base, connecting to a trumpet-shaped short ring foot, which is also decorated with bead patterns. The base and the ring foot are chiseled with floral and cloud patterns, and filled with fish-egg patterns. One side of the cup has a circular handle decorated with two relief heads of Hu people with deep-set eyes, broad noses, and long beards, facing each other at the back of their heads. This feature is also seen in Sogdian silverware: the outside of the ring handle has a three-dimensional animal head, and the hook tail of the handle is welded to the cup body. The inner and outer walls of the cup are fully gilded. The strong exotic style indicates that this gilded musician pattern silver cup is a piece of Sogdian silverware.	Gilded Musician Pattern Silver Cup	Shape	Octagonal
		Gilded Musician Pattern Silver Cup	Structure	Eight curved surfaces
		Gilded Musician Pattern Silver Cup	Decoration	Musicians
		Octagonal	Decoration	Curved beaded pattern
		Musicians	Shape	Pan flute
		Musicians	Shape	Small drum
		Musicians	Shape	Vertical flute
		Musicians	Shape	Pipa
		Musicians	Shape	Dancing with sleeves
		Musicians	Shape	Holding a pot
		Musicians	Shape	Holding a cup
		Musicians	Shape	Hu people
		Gilded Musician Pattern Silver Cup	Decorative Technique	Flat-chiseling
		Gilded Musician Pattern Silver Cup	Decoration	Scrolling vines
		Gilded Musician Pattern Silver Cup	Decoration	Mountains and birds
		Gilded Musician Pattern Silver Cup	Decoration	Fish-egg pattern
		Gilded Musician Pattern Silver Cup	Decoration	Background filling
		Gilded Musician Pattern Silver Cup	Decoration	Beaded pattern
		Gilded Musician Pattern Silver Cup	Decoration	Round base
		Gilded Musician Pattern Silver Cup	Decoration	Trumpet-shaped foot
		Gilded Musician Pattern Silver Cup	Decoration	Floral and cloud patterns
		Gilded Musician Pattern Silver Cup	Part	Circular handle
		Circular handle	Part	Finger rest
		Gilded Musician Pattern Silver Cup	Cultural Origin	Sogdian silverware
		Circular handle	Structure	Outside of the handle

Table 2. Dataset statistics.

Statistic	NYT	Tang Dynasty Gold and Silver Artifacts
Training Set	56,195	122,442
Validation Set	4999	15,922
Test Set	5000	25,109
Normal	3266	14,220
SEO	1297	6440
EPO	978	10
Relationships	14	122,442

Table 3. Model Training Hyperparameter Settings.

Learning Rate	Max Epoch	Batch Size	Seed
1 × 10⁻⁵	30	8	42

Table 4. Model performance comparison.

Dataset	Model	Prec	Rec	F1
NYT	CopyRE	59.6	54.6	57.0
	GraphRel	62.7	60.1	61.3
	CopyMTL	71.3	68.5	69.9
	RSAN	71.7	87.1	78.7
	Our-model	85.4	85.5	85.2

Table 5. Model performance comparison on different types of triplets.

Dataset	Model	Normal	EPO	SEO
NYT	CopyRE	63.0	42.8	50.5
	GraphRel	63.1	50.8	59.7
	CopyMTL	71.3	56.8	68.4
	RSAN	75.3	84.9	75.3
	Our-model	85.3	85.3	81.5

Table 6. Model performance comparison on sentences with different numbers of triplets.

Dataset	Model	N = 1	N = 2	N = 3	N = 4	N ≥ 5
NYT	CopyRE	67.0	56.2	51.2	47.2	25.8
	GraphRel	63.7	64.6	58.9	55.2	47.1
	CopyMTL	71.2	71.3	70.3	73.1	48.9
	RSAN	73.3	82.1	82.7	84.5	76.4
	Our-model	84.1	85.4	86.1	85.5	85.3

Table 7. Comparison of precision, recall, and F1 score of the model on sentences with different types of overlapping triples.

SEO Prec	SEO Rec	SEO F1	EPO Prec	EPO Rec	EPO F1	All Prec	All Rec	All F1
84.05	84.03	84.9	85.6	86.1	85.6	84	84	84.9

Table 8. Model performance summary.

Metric	Value
Training Accuracy	0.8506
Validation Accuracy	0.8050
Epochs	19
Time per Epoch (seconds)	996

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
