Article

Document-Level Relation Extraction with Local Relation and Global Inference

by Yiming Liu, Hongtao Shan, Feng Nie, Gaoyu Zhang and George Xianzhi Yuan
1 School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2 School of Finance, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
3 School of Information Management, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
4 College of Science, Chongqing University of Technology, Chongqing 400054, China
5 Business School, Chengdu University, Chengdu 610106, China
* Author to whom correspondence should be addressed.
Information 2023, 14(7), 365; https://doi.org/10.3390/info14070365
Submission received: 3 June 2023 / Revised: 23 June 2023 / Accepted: 24 June 2023 / Published: 27 June 2023

Abstract

Current popular approaches to document-level relation extraction rely either on graph-structured models, which complicate the overall architecture, or on serialization models, whose extraction accuracy drops as the text length increases. To address these problems, this paper develops a new approach to document-level relation extraction built on the idea of "Local Relation and Global Inference" (in short, LRGI). The text is first encoded with the BERT pre-training model, and local relation vectors are obtained through local context pooling and a grouped bilinear layer. A global inference mechanism based on Floyd's algorithm is then established to achieve multi-path, multi-hop inference and produce global inference vectors, which allow us to extract multi-label relations with adaptive thresholding criteria. On the DocRED dataset, the proposed approach (LRGI) achieves a precision of 0.73 and an F1 score of 62.11, corresponding to 28% and 2% improvements, respectively, over the classical document-level relation extraction model ATLOP.

1. Introduction

In recent years, document-level relation extraction, which extracts relations between entity pairs in text spanning multiple sentences and involving multiple entities, has become a new research hotspot. This is driven by the widespread use of relational knowledge in the form of knowledge graphs in Internet applications such as search engines [1], within the framework of big data analysis, as well as in financial technology (Fintech) and related areas.
Current research on relation extraction mostly focuses on sentence-level methods, which identify the relation of an entity pair from the two entities and the sentence in which they appear [2]. However, sentence-level relation extraction generally cannot meet the needs of practical applications: the same entity may appear multiple times, a text may contain multiple entity pairs, multiple relations may hold between the same entity pair, and the two entities of a pair may appear in different sentences. The document-level approach proposed in this paper is therefore expected to be more practical.
Compared to the sentence-level method, the document-level relation extraction approach discussed in this paper must handle at least the following four issues:
1. Entity information needs to be fused for all mentioned words of the same entity, as the same entity can appear multiple times within one document.
2. The calculation of relationships between entity pairs requires consideration of contextual information, since a document contains multiple entity pairs.
3. The classification of these relationships extends from a single-label classification task to a multi-label classification task, as multiple relationships may exist between a pair of entities.
4. Some entity pairs require indirect logical inference to determine their relationships. This necessitates additional inference mechanisms in the construction of the document-level relationship extraction model.
So far, there have been many attempts to achieve the document-level relationship extraction task (see [3] and related references therein); here, we briefly recall some of them.
The popular approach is based on the serialization model, which encodes the text using serialization methods [4], obtains entity information from the resulting sequence of word vectors, derives the relation between entity pairs through neural networks, and then classifies the relations. This method originates from sentence-level relation extraction, and the model has a simple structure that is easy to implement. However, all document-level relation extraction models based on serialization methods suffer decreasing accuracy as the sentence length increases. Another approach is the pre-training model method [5], which researchers have applied to document-level relation extraction. The key idea of the pre-training model is to use attention mechanisms that dynamically adjust the attention weight of each word according to context, thereby enhancing interpretability, improving the handling of long texts, and increasing the accuracy of document-level relation extraction.
Despite these improvements, attention mechanisms still struggle to capture contextual information in long texts, as the growing amount of information to process degrades model performance. For this reason, researchers have also proposed an alternative based on the graph-structured model method: the document is encoded using serialization methods or pre-trained models, and graph-structured data [6] are constructed so that graph neural networks can perform explicit inference and relational multi-classification. The key idea of the graph-structured method is to model relations explicitly and then use graph neural networks to reason over them in the text.
We note that graph-structured data can capture long-distance information by connecting individual nodes through edges. However, the models proposed by many current researchers usually contain multiple types of nodes and edges or establish complex inference procedures, making the model structure complex and introducing redundant information. For example, the edge-oriented graph neural network proposed by Christopoulou et al. [7] contains three types of nodes, mention-level, entity-level, and sentence-level nodes, giving it a complex structure. No entity pair is connected by a direct edge, so the relation vector between an entity pair must be represented by a path passing through several intermediate nodes. The information along the path is processed and aggregated through a multilayer graph neural network to obtain the relation between the entity pair, which introduces considerable extra information into the relation representation.
In this paper, in order to overcome the limitations of the two methods discussed above, we first define a "local relation" as a relation between an entity pair that can be obtained through contextual pooling alone, without an explicit inference mechanism. We then extend the document-level relation extraction model with adaptive thresholding and localized context pooling (in short, ATLOP) [8] to propose a document-level relation extraction model with local relations and global inference (in short, LRGI). The key idea of our LRGI method is to use the ATLOP model to obtain local relation information, construct an inference network with bilinear functions to achieve multi-path, multi-hop inference and obtain global inference information, and finally combine the local relation information with the global inference information to obtain and classify the relation between any two entities in the document.
Our LRGI model was tested on the DocRED dataset [9], and the numerical results show that it achieves a precision of 0.73, a 28% improvement over the ATLOP model, and an F1 score of 62.11, a 2% improvement. This verifies the effectiveness of the document-level relation extraction model that incorporates local relations and global inference.
In addition, to facilitate reproduction of this experiment, the source code and trained models have been released on GitHub, making this work more accessible to researchers.
Our LRGI approach to the document-level relation extraction model proposed in this paper, by incorporating the local relation and global inference, contains at least the following two major innovations or advantages:
(1) An explicit inference mechanism based on an improved Floyd algorithm is proposed. This mechanism first constructs graph-structured data containing only one type of node and one type of edge: nodes represent entities in a document, while edges represent relations between two connected entities. Following the idea of Floyd's algorithm [10] for computing shortest paths, the state transfer equation is modified and trainable parameters are added, so that the relation between two nodes passing through an intermediate node can be computed iteratively. As a result, the inference mechanism has a simple structure that is easy to implement. In addition, any two entities in the document are connected by a direct edge, so the relation vector between an entity pair can be computed directly without a complicated aggregation process, avoiding the impact of redundant information on extraction accuracy. Moreover, the mechanism is easy to extend: iteration turns single-hop multi-path inference into multi-hop multi-path inference.
(2) Relation extraction from two different perspectives, local relations containing contextual information and explicit inference capturing indirect relations, is proposed to construct a document-level relation extraction model that incorporates local relations and global inference. Local relations are obtained through context pooling in the ATLOP model: for any entity pair, contextual information is obtained through an attention mechanism and appended to the relation vector of the pair. Local relations depend on the entity pair and the contextual content and are affected by text length. They are used as input to the global inference mechanism based on the improved Floyd algorithm, which derives global inference containing indirect relations from the direct relations between entity pairs. Global inference thus complements the local relations, providing more valid information and improving the accuracy of the document-level relation extraction model.
This paper is organized into five sections, as follows.
In Section 2, the current work related to document-level relation extraction is introduced. A variety of algorithms are surveyed, and the research status, advantages, and disadvantages of existing algorithms are analyzed. Based on this summary of previous work, the key idea of our LRGI method is proposed and briefly discussed.
In Section 3, the LRGI model is discussed from the perspective of the implementation of modeling with algorithms. An explicit inference mechanism based on the improved Floyd algorithm is proposed to address the problems that have occurred due to the complex graph structure and redundant information in existing graph-structured document-level relationship extraction models. In addition, two different perspectives of relation extraction, local relations containing contextual information and explicit inference containing indirect relations, are proposed to construct a document-level relation extraction model that incorporates local relations and global inference. This approach addresses the degradation of model performance due to increased input text length in the serialization model based on mechanisms such as attention.
Section 4 describes the experiments and analyzes the results. The experiments were conducted on the public DocRED dataset, and the results demonstrate that the document-level relation extraction model combining local relations and global inference achieves high extraction accuracy. The experiments also demonstrate the effectiveness of the explicit inference mechanism based on the improved Floyd algorithm and of global inference as a complement to local relations in improving the accuracy of document-level relation extraction.
Finally, Section 5 summarizes the work of this paper and proposes future research directions and related issues.

2. Related Work

This section classifies document-level relationship extraction models into two types based on their research ideas: serialization-based models and graph-structured models. This section compares several algorithms and analyzes the current state of research on existing algorithms, including their advantages and disadvantages.

2.1. Document-Level Relationship Extraction Models Based on Serialization

Serialization-based document-level relation extraction models do not use graph neural networks but instead employ serialization models such as convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and bi-directional long short-term memory networks (BiLSTMs). However, these models often struggle to capture long-range contextual information in long texts.
With the advent of pre-training models such as BERT [11], encoding with a pre-trained model can alleviate this insufficient ability to capture long-range contextual information, thanks to the attention mechanisms these models employ.
For example, the BERT-TSBASE model [12] decomposes the relationship extraction task into a pipeline task, first determining whether a relationship exists between two entities and then making a relationship type judgment. HIN-BERTBASE [13] uses a multilayer network to implement the process of inference. The ATLOP model achieves implicit inference by using the attention mechanism again after encoding with additional contextual information.
Local context pooling improves entity-to-vector representation by introducing local context vectors. The method exploits the dependencies between word vectors in the pre-trained model directly without relearning the attention matrix, thereby saving time and computational resources.
In the pre-trained multi-head attention matrix $A \in \mathbb{R}^{H \times l \times l}$, $A_{ijk}$ denotes the attention coefficient in the $k$-th attention head from the $i$-th word vector to the $j$-th word vector. The special symbol "*" placed in front of a mention marks the mentioned word, and the attention of an entity $A^E_i \in \mathbb{R}^{H \times l}$ is represented by the mean of the attention coefficients of all mentions of that entity. The attention coefficients $A^E_s$ and $A^E_o$ of the two entities in the pair $(e_s, e_o)$ are multiplied to obtain the weight of the contextual content for the entity pair, and the local context vector $c^{(s,o)}$ is calculated, as shown in Equation (1):
$A^{(s,o)} = A^E_s \cdot A^E_o, \quad q^{(s,o)} = \sum_{i=1}^{H} A_i^{(s,o)}, \quad \alpha^{(s,o)} = q^{(s,o)} \big/ \mathbf{1}^\top q^{(s,o)}, \quad c^{(s,o)} = H^\top \alpha^{(s,o)}$ (1)
where $H$ is the sequence of word vectors produced by the pre-training model. The entity representations in the pair are obtained by fusing the local context vector $c^{(s,o)}$ with the entity vectors $h_{e_s}$ and $h_{e_o}$, as shown in Equation (2):
$z_s = \tanh(W_s h_{e_s} + W_{c1} c^{(s,o)}), \quad z_o = \tanh(W_o h_{e_o} + W_{c2} c^{(s,o)})$ (2)
where $W_{c1}, W_{c2} \in \mathbb{R}^{d \times d}$ are model parameters.
Local context pooling is illustrated in Figure 1. The word vectors are pooled by weighted averaging to obtain the local context vector $c^{(s,o)}$ of the entity pair $(e_s, e_o)$, and this context vector is fused with the entity vectors $h_{e_s}$ and $h_{e_o}$. In this process, the attention coefficients of the head entity and the tail entity are multiplied to obtain the word-vector weights, so only word vectors with high attention coefficients for both entities receive high weight in the context vector. Word vectors with higher weights are marked in light yellow in Figure 1.
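As an illustration of Equations (1) and (2), the following minimal PyTorch sketch computes the local context vector and the fused entity representations for one entity pair; the token positions, shapes, and random weights are placeholder assumptions, not the authors' implementation.

```python
import torch

# Illustrative sketch of local context pooling (Equations (1)-(2)).
H_heads, seq_len, d = 12, 128, 768
attn = torch.rand(H_heads, seq_len, seq_len)   # multi-head attention matrix A
hidden = torch.rand(seq_len, d)                # word vector sequence H

a_s = attn[:, 5, :]                # attention from the head entity's token (position 5)
a_o = attn[:, 20, :]               # attention from the tail entity's token (position 20)
q = (a_s * a_o).sum(dim=0)         # A^(s,o) multiplied per head, then summed over heads
alpha = q / q.sum()                # normalized context weights alpha^(s,o)
c = hidden.t() @ alpha             # local context vector c^(s,o)

W_s, W_o = torch.rand(d, d), torch.rand(d, d)
W_c1, W_c2 = torch.rand(d, d), torch.rand(d, d)
h_es, h_eo = hidden[5], hidden[20]             # entity vectors (single-mention case)
z_s = torch.tanh(W_s @ h_es + W_c1 @ c)        # head entity representation, Equation (2)
z_o = torch.tanh(W_o @ h_eo + W_c2 @ c)        # tail entity representation
```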
However, implicit inference mainly relies on the attention mechanism, which cannot solve the problem of the decreasing effectiveness of relation extraction with an increase in sentence length.

2.2. Document-Level Relationship Extraction Models Based on Graph Structures

Graph-structured document-level relation extraction models construct graph structures based on dependency structures or structured attention, containing different types of nodes and edges (such as entity nodes and sentence nodes) to capture document information. Relations between entities in the document are then obtained using graph neural networks. Because scattered entities in the document are connected by the constructed graph structure, the inability of serialized encoders (e.g., recurrent neural network encoders) to capture long-distance information is alleviated.
Among these models, the attention-guided graph convolutional network (AGGCN) [14] model converts the dependency tree into a fully connected weighted graph based on the attention mechanism, followed by the graph convolutional network (GCN) [15] for relationship extraction between entities. Latent structure refinement (LSR) [16] constructs entity nodes to automatically perform document graph construction without relying on dependencies such as syntax trees to construct static document graph structures.
Sahu et al. [17] encode all words in the document with additional position information and construct a document-based graph structure containing two types of nodes, entity nodes and document nodes, and five types of edges. The vector of each node is obtained by applying a GCN over each edge type, and the relation between any two entities is then classified by a linear activation neural network. This model creates a document node to shorten the shortest distance between any two nodes, addressing the long-distance dependency problem, as shown in Figure 2.
In addition to node-driven graph neural networks, which construct a graph structure to obtain node vectors and then derive the relation vector between two nodes through a neural network, Christopoulou et al. [7] proposed the edge-oriented neural network. This approach processes each sentence in the document with a BiLSTM layer to obtain word vectors and constructs a graph containing three types of nodes (mention nodes, entity nodes, and sentence nodes) and five types of edges. The graph is aggregated several times by graph convolution to obtain the edges between entities, which serve as the relation vectors, and softmax is used to classify them. However, because no edges directly connect entities, relevant information may be lost in the multiple GCN layers required to obtain the edge vectors.
To enhance expressive capability, the GAIN model [18] employs a two-layer graph to represent textual information. The complete document is first encoded; mention nodes and document nodes are then incorporated into the first layer of the graph structure, which contains three types of edges, and the mention node vectors are obtained with graph convolutional networks (GCNs). Afterwards, the mention vectors of the same entity are averaged to produce entity vectors, and entity-to-entity paths are represented with an attention-based approach that involves an inference process. A multi-classification task is then performed, as illustrated in Figure 3.
In the GCN network, the information of each node is obtained by weighted summation of the information of that node in the previous layer and the information of the neighboring nodes, followed by linear and nonlinear transformations, as shown in Figure 4.
For node $u$ at layer $l$ in Figure 4, the update is shown in Equation (3):
$h_u^{l+1} = \sigma\Big(\sum_{k \in \mathcal{K}} \sum_{v \in N_k(u)} W_k^l h_v^l + b_k^l\Big)$ (3)
where $\mathcal{K}$ denotes the set of edge types, $W_k^l \in \mathbb{R}^{d \times d}$ and $b_k^l \in \mathbb{R}^d$ are model parameters, $N_k(u)$ denotes the set of nodes connected to node $u$ by an edge of type $k$, and $\sigma$ is the ReLU activation function.
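As a concrete illustration of Equation (3), the sketch below applies one such layer using a dense adjacency matrix per edge type; the graph, shapes, and random weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of the per-edge-type GCN update in Equation (3).
num_nodes, d, num_edge_types = 6, 16, 5
h = torch.rand(num_nodes, d)                                            # node vectors h^l
adj = (torch.rand(num_edge_types, num_nodes, num_nodes) > 0.5).float()  # adjacency per edge type
W = torch.rand(num_edge_types, d, d)                                    # W_k^l
b = torch.rand(num_edge_types, d)                                       # b_k^l

out = torch.zeros(num_nodes, d)
for k in range(num_edge_types):
    # sum neighbor vectors over edges of type k, transform them, and add the bias
    out = out + adj[k] @ h @ W[k] + b[k]
h_next = F.relu(out)                                                    # sigma is ReLU; h^{l+1}
```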
Document-level relation extraction based on graph structures often requires constructing syntactic or semantic structures of documents, a construction process that is time-consuming and computationally expensive.

3. Document-Level Relationship Extraction Model with Local Relationships and Global Inference

The proposed document-level relationship extraction model with local relationships and global inference is an extension of the ATLOP network with the addition of a global inference layer and a fusion layer of local relationships and global inference. The model framework is shown in Figure 5.

3.1. Definition of Symbols

A document $d$ is a word sequence $d = [x_t]_{t=1}^{l}$, where $x_t$ denotes the $t$-th word and $l$ is the length of the document. An entity is denoted by $e_i$. The same entity may appear several times in a document; each occurrence of the entity name is called a mention, and the $j$-th mention of the $i$-th entity is denoted by $m_i^j$. Each mention may consist of several words.

3.2. Local Relationships

The local relationships defined in this paper are the relationships between pairs of entities obtained by context pooling only. Local relationships are obtained by this model using the ATLOP network.
The pre-processing layer inserts the symbol "*" before and after each mention in the text to mark the mentioned words.
The pre-processed text content is encoded in the BERT encoding layer using the BERT pre-training model, as shown in Equation (4):
$H = [h_1, h_2, \ldots, h_l] = \mathrm{BERT}([x_1, x_2, \ldots, x_l])$ (4)
The word vector of the symbol "*" preceding each mention in the BERT encoding layer contains the information of that mention, so the mention layer passes this word vector to the entity layer as the vector $h_{m_i^j}$ of the corresponding mention.
The vector $h_{e_i}$ of an entity is obtained at the entity layer by pooling the mention vectors of that entity with the logsumexp function, as shown in Equation (5):
$h_{e_i} = \log \sum_{j=1}^{N_{e_i}} \exp\big(h_{m_i^j}\big)$ (5)
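A minimal sketch of this pooling step, assuming an entity with three placeholder mention vectors; logsumexp acts as a smooth maximum, so the most informative mention dominates while the others still contribute:

```python
import torch

# Illustrative logsumexp pooling over mention vectors (Equation (5)).
mentions = torch.rand(3, 768)            # h_{m_i^j} for the N_{e_i} = 3 mentions of one entity
h_e = torch.logsumexp(mentions, dim=0)   # entity vector h_{e_i}, computed elementwise
```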
The multi-head attention matrix data in the BERT encoding layer are processed by local context pooling to obtain the context vector $c^{(i,j)}$ of the entity pair.
Since the relation between an entity pair is directional, for two entities $e_i$ and $e_j$, the relation between them is represented by two different relation vectors $r_{e_i,e_j}$ and $r_{e_j,e_i}$.
Taking $r_{e_i,e_j}$ as an example, $e_i$ is the head entity and $e_j$ is the tail entity. To distinguish the head and tail entities of the pair, the local relation representation layer uses the entity vectors $h_{e_i}$ and $h_{e_j}$ and the context vector $c^{(i,j)}$ of the entity pair to compute the head entity vector $z_s$ and the tail entity vector $z_o$, as shown in Equation (6):
$z_s = \tanh(W_s h_{e_i} + W_{c1} c^{(i,j)}), \quad z_o = \tanh(W_o h_{e_j} + W_{c2} c^{(i,j)})$ (6)
where $W_s$, $W_o$, $W_{c1}$, and $W_{c2}$ denote the training parameters of the neural network in the local relation representation layer.
A grouped bilinear transformation is applied to $z_s$ and $z_o$ to obtain the local relation $r_{e_i,e_j}$ with $e_i$ as the head entity and $e_j$ as the tail entity, as shown in Equation (7):
$[z_s^1; \ldots; z_s^k] = z_s, \quad [z_o^1; \ldots; z_o^k] = z_o, \quad r_{e_i,e_j} = \sigma\Big(\sum_{i=1}^{k} (z_s^i)^\top W_r^i z_o^i + b_r\Big)$ (7)
where $W_r$ and $b_r$ are the training parameters of the neural network in the local relation representation layer.
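The grouped bilinear step can be sketched as follows; the group count, dimensions (97 matches the local relation dimension in Table 3), and random parameters are illustrative assumptions, with σ taken as tanh for demonstration. Splitting the vectors into k groups reduces the parameter count of the bilinear form by a factor of k compared with a full bilinear layer.

```python
import torch

# Illustrative grouped bilinear transformation (Equation (7)).
d, k, d_rel = 768, 12, 97                   # hidden size, number of groups, relation dimension
z_s, z_o = torch.rand(d), torch.rand(d)
zs = z_s.view(k, d // k)                    # [z_s^1; ...; z_s^k]
zo = z_o.view(k, d // k)                    # [z_o^1; ...; z_o^k]
W_r = torch.rand(d_rel, k, d // k, d // k)  # one bilinear form per group and output dimension
b_r = torch.rand(d_rel)

# (z_s^i)^T W_r^i z_o^i summed over the k groups, one value per output dimension
r = torch.tanh(torch.einsum('kg,rkgh,kh->r', zs, W_r, zo) + b_r)  # local relation r_{e_i,e_j}
```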

3.3. Global Inference

Local relations between entity pairs are obtained using local context pooling, which relies on the attention mechanism of the BERT pre-training model. However, the attention mechanism is affected by text length: the longer the text, the more the accuracy of relation extraction gradually decreases, which is especially evident in document-level relation extraction.
To address this issue, the ATLOP model is extended in this paper to implement global relational inference.
On the basis of local context pooling of text contents to obtain local relations between entity pairs, Floyd’s algorithm is improved to perform global inference, i.e., to derive global relations between entity pairs from local relations.
In the process of global inference, the computation of relationships between entity pairs will no longer depend on the entity positions in the document but on the local relationships.
This model improves Floyd's algorithm and proposes an explicit relation inference mechanism (an explicit relation inference layer based on the improved Floyd's algorithm) to solve for the global relation $\mathit{inf}_{e_i,e_j}$ of any entity pair ($e_i$, $e_j$) with $e_i$ as the head entity and $e_j$ as the tail entity.

3.3.1. Classical Floyd’s Algorithm

The weight of the edge between any two nodes represents the distance between them. In the classical Floyd algorithm, for any node pair $(i, j)$, the nodes other than $i$ and $j$ serve as intermediate nodes, and the shortest distance $D_{i,j}$ between node $i$ and node $j$ is continuously updated by iterating the state transfer equation over the intermediate nodes.
For any intermediate node $k$, the state transfer equation is

$D_{i,j} = \min\big(D_{i,j},\; D_{i,k} + D_{k,j}\big)$ (8)

In the case of at most one intermediate node $k$, this yields the shortest distance $D_{i,j}$ between nodes $i$ and $j$, where $D_{i,k}$ is the distance from node $i$ to intermediate node $k$, and $D_{k,j}$ is the distance from intermediate node $k$ to node $j$.
When a path may contain multiple intermediate nodes, to compute the shortest distance between node $i$ and node $j$, the set of candidate intermediate nodes other than these two is written $\{1, \ldots, k\}$, and $D_{i,j,k}$ denotes the length of the shortest path from node $i$ to node $j$ using only nodes in $\{1, \ldots, k\}$ as intermediates.
If node k is in the shortest path:
$D_{i,j,k} = D_{i,k,k-1} + D_{k,j,k-1}$ (9)
If node k is not in the shortest path:
$D_{i,j,k} = D_{i,j,k-1}$ (10)
Therefore,
$D_{i,j,k} = \min\big(D_{i,j,k-1},\; D_{i,k,k-1} + D_{k,j,k-1}\big)$ (11)
If there are several different paths between two nodes, the distance between them is that of the shortest one. The above process is iterated until the number of iterations equals the number of intermediate nodes or the shortest distances between all node pairs no longer change, at which point the iteration ends. The final $D_{i,j,k}$ is the shortest distance between node $i$ and node $j$ when paths may contain more than one intermediate node.
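For reference, a minimal sketch of this classical iteration on a small illustrative distance matrix, where INF marks node pairs with no direct edge:

```python
# Classical Floyd-Warshall iteration over intermediate nodes (Equations (8)-(11)).
INF = float('inf')
dist = [
    [0,   3,   INF, 7  ],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1  ],
    [2,   INF, INF, 0  ],
]
n = len(dist)
for k in range(n):              # allow node k as an intermediate
    for i in range(n):
        for j in range(n):
            # state transfer: D_{i,j,k} = min(D_{i,j,k-1}, D_{i,k,k-1} + D_{k,j,k-1})
            dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
```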

3.3.2. Explicit Relational Inference Based on Improved Floyd’s Algorithm

The explicit relational inference mechanism proposed in this paper is a modification of the state transfer equation of the classical Floyd algorithm.
(1) Single-hop relational inference mechanism
The distance $D$ in the state transfer equation of the classical Floyd algorithm is replaced by a local relation vector $r$. The inference relation $\mathit{inf}_{e_i,e_j,e_k}$, which carries information about the intermediate entity $e_k$ with $e_i$ as the head entity and $e_j$ as the tail entity, cannot be obtained by directly summing local relations. Therefore, under the constraint of a single intermediate node, the distance computation $D_{i,k} + D_{k,j}$ is replaced by a neural network built on a bilinear function whose inputs are the two local relations $r_{e_i,e_k}$ and $r_{e_k,e_j}$, as Equation (12) shows:
$\mathit{inf}_{e_i,e_j,e_k} = \tanh\big((r_{e_i,e_k})^\top W_{\mathit{inf}}\, r_{e_k,e_j} + b_{\mathit{inf}}\big)$ (12)
where $W_{\mathit{inf}}$ and $b_{\mathit{inf}}$ are the training parameters of the neural network in the global inference layer.
In order to fuse the information of all paths, the set of entities $\{e_1, \ldots, e_K\}$ excluding $e_i$ and $e_j$ is denoted $\mathcal{R}$. After traversing all intermediate nodes, the minimum-value computation of the classical Floyd algorithm is replaced by an average over all inference relation vectors containing a single intermediate node, yielding the global inference relation $\mathit{inf}_{e_i,e_j}$ of the entity pair. The modified state transfer equation is shown in Equation (13):
$\mathit{inf}_{e_i,e_j} = \frac{1}{K} \sum_{e_k \in \mathcal{R}} \tanh\big((r_{e_i,e_k})^\top W_{\mathit{inf}}\, r_{e_k,e_j} + b_{\mathit{inf}}\big)$ (13)
where $K$ is the number of entities in the set $\mathcal{R}$.
When K = 2 , the single-hop relational inference process based on the improved Floyd algorithm is shown in Figure 6.
The nodes in Figure 6 represent entities, the edges represent local relations, and the weights of the edges are the local relation vectors, which are calculated by the method in Section 3.2. The direction of the edge is the direction of the relationship.
Single-hop relational inference takes the entities in the document other than the head entity $e_i$ and the tail entity $e_j$ as the intermediate entities $e_k$, and for each $e_k$, a neural network built on a bilinear function produces the inference relation $\mathit{inf}_{e_i,e_j,e_k}$ between the entity pair through that intermediate entity.
After traversing the intermediate entities, the set of single-intermediate inference relations $\{\mathit{inf}_{e_i,e_j,e_k} \mid e_k \in \mathcal{R}\}$ is mean-pooled to obtain the global single-hop inference relation $\mathit{inf}_{e_i,e_j}$.
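A minimal sketch of this single-hop step (Equations (12) and (13)); the entity count and all tensors are illustrative assumptions, with the relation and inference dimensions both set to 97 as in Table 3:

```python
import torch

# Illustrative single-hop inference over all intermediate entities (Equations (12)-(13)).
num_ent, d = 5, 97                          # entities; d_rel = d_inf = 97 (Table 3)
r = torch.rand(num_ent, num_ent, d)         # local relations r_{e_a,e_b} from Section 3.2
W_inf = torch.rand(d, d, d)                 # bilinear form W_inf (one slice per output dim)
b_inf = torch.rand(d)

def single_hop(i, j):
    mids = [k for k in range(num_ent) if k not in (i, j)]   # the set R
    terms = [torch.tanh(torch.einsum('a,dab,b->d', r[i, k], W_inf, r[k, j]) + b_inf)
             for k in mids]
    return torch.stack(terms).mean(dim=0)   # mean pooling replaces Floyd's min

inf_ij = single_hop(0, 3)                   # global single-hop relation inf_{e_0,e_3}
```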
To make full use of the rich relational information implied among multiple entities in a document, the single-hop inference algorithm is extended to a multi-hop relational inference mechanism.
(2) Multi-hop relational inference mechanism
The entities in the document other than the head entity $e_i$ and the tail entity $e_j$ are again used as intermediate entities for multi-hop inference.
For each intermediate entity $e_k$, two global single-hop inference relations are obtained by single-hop inference as in Equation (13): $\mathit{inf}_{e_i,e_k}$, with $e_i$ as the head entity and $e_k$ as the tail entity, and $\mathit{inf}_{e_k,e_j}$, with $e_k$ as the head entity and $e_j$ as the tail entity.
Each global single-hop inference relation is itself computed by traversing the intermediate entities other than its own head and tail entities.
$\mathit{inf}_{e_i,e_k}$ and $\mathit{inf}_{e_k,e_j}$ are then fed into a neural network built on a bilinear function to obtain the multi-hop inference relation $\mathit{inf}_{e_i,e_j,e_k}$, which combines the information of these two global single-hop relations. $\{\mathit{inf}_{e_i,e_j,e_k} \mid e_k \in \mathcal{R}\}$ is the set of multi-hop inference relations computed by traversing the intermediate entities. Averaging this set yields the global multi-hop inference relation $\mathit{inf}_{e_i,e_j}$, which contains the information of multiple intermediate entities.
The global inference relation $\mathit{inf}_{e_i,e_j}$ from entity $e_i$ to entity $e_j$ thus contains information about the entities in the set $\mathcal{R}$, as shown in Equation (14):
$\mathit{inf}_{e_i,e_j,e_k} = \tanh\big((\mathit{inf}_{e_i,e_k})^\top W_{\mathit{inf}}\, \mathit{inf}_{e_k,e_j} + b_{\mathit{inf}}\big), \quad \mathit{inf}_{e_i,e_j} = \frac{1}{K} \sum_{e_k \in \mathcal{R}} \mathit{inf}_{e_i,e_j,e_k}$ (14)
where $\mathit{inf}_{e_i,e_j,e_k}$ is the inference relation from entity $e_i$ to entity $e_j$ computed from $\mathit{inf}_{e_i,e_k}$ and $\mathit{inf}_{e_k,e_j}$ when the last traversed intermediate entity is $e_k$, and $\mathit{inf}_{e_i,e_j}$ is the global inference relation from the head entity $e_i$ to the tail entity $e_j$. The same holds for $\mathit{inf}_{e_i,e_k}$ and $\mathit{inf}_{e_k,e_j}$.
When K = 2 , the multi-hop relational inference mechanism based on the improved Floyd algorithm is shown in Figure 7.
In Figure 7, the single-hop inference relations $\mathit{inf}_{e_i,e_1}$, $\mathit{inf}_{e_1,e_2}$, $\mathit{inf}_{e_i,e_j}$, and $\mathit{inf}_{e_j,e_2}$ are obtained according to Equation (13), and the global inference relation $\mathit{inf}_{e_i,e_2}$ is then obtained after the first iteration of the multi-hop inference layer, as shown in Equation (15):
$\mathit{inf}_{e_i,e_2} = \frac{1}{2}\Big(\tanh\big((\mathit{inf}_{e_i,e_1})^\top W_{\mathit{inf}}\, \mathit{inf}_{e_1,e_2} + b_{\mathit{inf}}\big) + \tanh\big((\mathit{inf}_{e_i,e_j})^\top W_{\mathit{inf}}\, \mathit{inf}_{e_j,e_2} + b_{\mathit{inf}}\big)\Big)$ (15)
The global inference relation $\mathit{inf}_{e_2,e_j}$ is obtained in the same way.
The global relations $\mathit{inf}_{e_i,e_2}$ and $\mathit{inf}_{e_2,e_j}$ are fed into a neural network built on a bilinear function to obtain the inference relation $\mathit{inf}_{e_i,e_j,e_2}$ with $e_2$ as the intermediate entity, as shown in Equation (16):
$\mathit{inf}_{e_i,e_j,e_2} = \tanh\big((\mathit{inf}_{e_i,e_2})^\top W_{\mathit{inf}}\, \mathit{inf}_{e_2,e_j} + b_{\mathit{inf}}\big)$ (16)
Similarly, the inference relation $\mathit{inf}_{e_i,e_j,e_1}$ is obtained.
The global relation $\mathit{inf}_{e_i,e_j}$ is obtained by average pooling of the inference relations $\mathit{inf}_{e_i,e_j,e_1}$ and $\mathit{inf}_{e_i,e_j,e_2}$, as shown in Equation (17):
$\mathit{inf}_{e_i,e_j} = \frac{1}{2}\big(\mathit{inf}_{e_i,e_j,e_1} + \mathit{inf}_{e_i,e_j,e_2}\big)$ (17)
The second iteration process is represented by Equations (16) and (17).
Multi-path, multi-hop inference is thus implemented by the above procedure to obtain the global relation $\mathit{inf}_{e_i,e_j}$ from the head entity $e_i$ to the tail entity $e_j$.
When $K > 2$, i.e., when the number of intermediate entities is greater than two, the inference process still follows Equation (14).
Since intermediate entities between entity pairs can be traversed repeatedly during iteration, the number of intermediate entities covered by the global inference relation between any entity pair is $2^N - 1$ after $N$ iterations of multi-path, multi-hop inference.
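The iterative process can be sketched as follows; because each pass feeds its output back in, the relation and inference dimensions must match (both 97 here), and all tensors are illustrative assumptions rather than the trained model:

```python
import torch

# Illustrative multi-hop inference (Equation (14)): each pass composes the
# previous inference table with itself, so path length roughly doubles per iteration.
num_ent, d = 5, 97
r = torch.rand(num_ent, num_ent, d)                 # local relations from Section 3.2
W_inf, b_inf = torch.rand(d, d, d), torch.rand(d)

def inference_step(rel):
    out = torch.zeros_like(rel)
    for i in range(num_ent):
        for j in range(num_ent):
            if i == j:
                continue
            mids = [k for k in range(num_ent) if k not in (i, j)]
            terms = [torch.tanh(torch.einsum('a,dab,b->d', rel[i, k], W_inf, rel[k, j]) + b_inf)
                     for k in mids]
            out[i, j] = torch.stack(terms).mean(dim=0)   # mean over intermediate entities
    return out

inf_table = inference_step(r)          # N = 1: single-hop relations
inf_table = inference_step(inf_table)  # N = 2: up to 2^2 - 1 = 3 intermediate entities
```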
The flowchart of the algorithm for the global inference mechanism in this model is shown in Figure 8.
In Figure 8, $e_i$ and $e_j$ are any head and tail entities in the document. Their global multi-hop inference relation $\mathit{inf}_{e_i,e_j}$ can be continuously updated by iteration and reused as input to global multi-hop inference when computing the global multi-hop inference relations of other head-tail entity pairs.

3.4. Fusion of Local Relations and Global Inference

The vector $r_{e_i,e_j}$ of local relations containing contextual information and the vector $\mathit{inf}_{e_i,e_j}$ of global inference obtained by explicit inference are fed into a neural network for fusion to achieve document-level relation extraction, as shown in Equation (18):
$\mathit{rel}_{e_i,e_j} = \tanh\big(W_{\mathit{con}}\, \mathrm{cat}(r_{e_i,e_j}, \mathit{inf}_{e_i,e_j}) + b_{\mathit{con}}\big)$ (18)
where $\mathit{rel}_{e_i,e_j}$ is the relation vector fusing local relations and global inference, $W_{\mathit{con}}$ and $b_{\mathit{con}}$ are the training parameters of the neural network, and $\mathrm{cat}(\cdot)$ is the concatenation operation. The algorithm flowchart is shown in Figure 9.
Adaptive threshold classification is performed on the relation vector $\mathit{rel}_{e_i,e_j}$ to obtain the types of relations between the entity pair.
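A minimal sketch of this fusion step under the dimensions of Table 3; the output dimension and random parameters are illustrative assumptions:

```python
import torch

# Illustrative fusion of a local relation vector and a global inference vector (Equation (18)).
d_rel, d_inf, d_out = 97, 97, 97
r_local = torch.rand(d_rel)              # r_{e_i,e_j} from the local relation layer
inf_global = torch.rand(d_inf)           # inf_{e_i,e_j} from the global inference layer
W_con = torch.rand(d_out, d_rel + d_inf)
b_con = torch.rand(d_out)

rel = torch.tanh(W_con @ torch.cat([r_local, inf_global]) + b_con)
# `rel` is then scored against per-relation adaptive thresholds to pick relation types
```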

4. Experimental Design and Analysis

To ensure easy reproducibility, the proposed document-level relationship extraction model with local relation and global inference is available at https://github.com/Emir-Liu/Document-Level-Relation-Extraction-Fusing-Local-Relation-and-Global-Reasoning (accessed on 24 June 2023).

4.1. Dataset Acquisition and Processing

The experiments in this paper use DocRED [9], a large-scale, manually annotated dataset for document-level relation extraction built from English Wikipedia, as the testing set to verify the performance of the LRGI approach. In this dataset, more than 40.7% of the relationships must be judged across multiple sentences, 61.1% of the relationships require inference, and 7% of the related entity pairs hold multiple relationships; the statistics are summarized in Table 1.

4.2. Experimental Environment and Parameter Settings

4.2.1. Experimental Environment

The deep learning framework used for the experiments in this paper is PyTorch; all training, validation, and testing of the models were conducted with it. The specific experimental environment is shown in Table 2.

4.2.2. Experimental Parameter Settings

In this research paper, a novel approach for document-level relation extraction is presented, which integrates local relations and global inference. The model utilizes the base version of the BERT pre-training model and employs the AdamW optimization algorithm [19]. The specific parameter configurations employed for this model can be found in Table 3.

4.3. Evaluation Indicators

Since more than one relationship may hold between an entity pair in document-level relation extraction, the task is treated as multi-label classification. To properly evaluate the performance of the relation extraction models, and considering the characteristics of the task, precision and $F_1$ are used as evaluation metrics, defined in Equation (19):
$\mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (19)
where $TP$ denotes the number of correctly predicted entity pair relationships, $FP$ denotes the number of incorrectly predicted entity pair relationships, and $FN$ denotes the number of entity pair relationships that are not predicted.
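For illustration, Equation (19) computed from placeholder prediction counts:

```python
# Illustrative computation of precision, recall, and F1 (Equation (19)).
tp, fp, fn = 120, 45, 100        # placeholder prediction counts
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"Precision={precision:.2f}  Recall={recall:.2f}  F1={f1:.2f}")
```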
In the experiments, $F_1$ and precision are used as metrics. Note that overfitting may occur during training, leading to evaluation errors if the same entity pairs and inter-entity relationships appear in the training, validation, and test sets.
To address this problem, we also use the Ign $F_1$ score as a metric, which computes $F_1$ on the validation set after excluding entity pairs and relationships that already appear in the training set.

4.4. Contrast Model

The proposed document-level relation extraction model incorporating local relations and global inference was compared with serialization-based and graph-structured models, using $F_1$ and Ign $F_1$ as metrics. The models involved in the comparison are as follows:
1. Serialization-based models
Serialization-based models encode the entire document through neural networks or pre-trained models without using graph structures.
(1) Convolutional neural network (CNN)/long short-term memory neural network (LSTM)/BiLSTM:
Word encoding is performed on the text using GloVe [20], and after encoding, word vectors are obtained using CNN, LSTM, and BiLSTM networks, respectively. Entity vectors are obtained by mean pooling and input to a bilinear function to predict the relationship between entity pairs.
(2) Context-Aware:
Word vector encoding is performed on text using GloVe and input to the LSTM network to obtain the relationship between entity pairs. Attention mechanisms are used to achieve information fusion.
(3) BERTBASE/BERT-TSBASE/HIN-BERTBASE/CorefBERTBASE [21]:
All these models use pre-trained models for encoding serialized models. BERTBASE uses BERT pre-trained models for encoding. BERT-TSBASE decomposes the task into a pipeline task, a dichotomous classification task that first determines whether a relationship exists between two entities, and then a multi-label classification task that makes a relationship type determination. HIN-BERTBASE uses a multilayer network to implement the process of inference.
(4) BERT-ATLOPBASE:
Word encoding is performed on text using the BERT pre-training model. Word encoding is pooled to obtain entity vectors. Relationship vectors are obtained using contextual pooling, and then adaptive threshold classification is performed to obtain the type of relationship between entity pairs.
(5) E2GRE [22]:
The E2GRE model concatenates document text with head entities to help LMs concentrate on parts of the document that are more related to the head entity.
(6) DocuNet [23]:
DocuNet leverages an encoder module to capture the context information of entities and a U-shaped segmentation module over the image-style feature map to capture global interdependency among triples.
2. Graph-structure-based models
Information in the documents is used to construct graph structure data on which the computation of relationships between entity pairs is performed.
(7) BiLSTM-AGGCN/BiLSTM-LSR:
After encoding with a BiLSTM network, BiLSTM-AGGCN converts dependency trees into fully connected weighted graphs using an attention mechanism and applies a GCN for relation computation. BiLSTM-LSR constructs mention, entity, and sentence nodes with the LSR model to determine relations.
(8) BERT-LSRBASE:
After using BERT encoding, the LSR model is used for relationship extraction.
(9) GAIN:
The text is encoded using the BERT model to construct a two-layer heterogeneous graph structure comprising a mention-level graph and an entity-level graph. Explicit inference is implemented using GCN networks to obtain the relations between entity pairs.

4.5. Experimental Results and Analysis

4.5.1. Comparative Experiments

The different models were experimented on with the DocRED dataset, and the experimental results are shown in Table 4.
Both the GAIN and ATLOP models use the BERT pre-training model. To compare the performance of each model fairly, the base version, BERT-Base, is used in this paper.
The above models were replicated in this paper, and the same parameters were used for the original ATLOP model and for the proposed model fusing local relations and global inference, to allow a fair comparison.
The proposed model achieves better results, with improvements in both the $F_1$ and Ign $F_1$ indicators, including a 2% improvement in $F_1$ over the ATLOP model.
Comparing the proposed model with the ATLOP model, the precision is improved by about 28%, as shown in Table 5.

4.5.2. Analysis of the Maximum Number of Relationships between Entity Pairs

Since more than one type of relationship may exist between entity pairs, each relationship should have a different classification threshold to determine whether a certain relationship type exists in that entity pair.
In the DocRED dataset, there are 96 relationships, and there are at most three different relationships between each pair of entities. The original ATLOP model only judges the three most probable relationships among the entity pairs. Therefore, for this experiment, the maximum number of relationships between entity pairs was set to three, which is the same as the setting of the original ATLOP model.
In the experiments, comparisons were made for different maximum numbers of relations, as shown in Table 6.
Here, the metrics $F_1$, Ign $F_1$, precision, and recall all increase as the maximum number of relationships between entity pairs is increased, and the extraction precision is highest when the maximum is three.

4.5.3. Analysis of Model Parameters

To show in more detail the effect of the local relation dimension and the global inference dimension on the experimental results in the model fusing local relations and global inference, the model parameters were varied. The training procedure is shown in Figure 10, Figure 11 and Figure 12, and the configured parameters and evaluation results are shown in Table 7.
Here, Model_relx_infy denotes a document-level relation extraction model incorporating local relations and global inference, with local relation dimension x and global inference dimension y. When x is zero, the model contains only global inference, with dimension y; similarly, when y is zero, the model contains only local relations, with dimension x.
First, the Model_rel97_inf0 model with only local relations, Model_rel0_inf97 model with only global inference, and Model_rel97_inf97 model with fusion of local relations and global inference are compared.
The Model_rel97_inf0 model containing only local relations has higher recall, $F_1$, and Ign $F_1$, while the Model_rel0_inf97 model containing only global inference has better precision.
The Model_rel97_inf97 model that incorporates both local relations and global inference has higher precision than either single-relation model. Although the recall is slightly reduced by the additional information, both $F_1$ and Ign $F_1$ improve overall.
A change in the global inference dimension directly affects the precision and recall of the model, which in turn affects $F_1$ and Ign $F_1$. With the local relation dimension held at 97, increasing the global inference dimension improves precision while slightly decreasing recall, and both $F_1$ and Ign $F_1$ improve, indicating that a larger global inference dimension improves the overall extraction performance.
With the global inference dimension held constant, $F_1$ and Ign $F_1$ improve greatly when the local relation dimension changes from 0 to 97, but worsen when it changes from 97 to 150. This indicates that simply increasing the local relation dimension has limits, and that global inference is a good complement for improving extraction accuracy.

5. Conclusions

To address the decreasing effectiveness of the attention mechanism in document-level relation extraction tasks as text length increases, this paper first proposes an improved Floyd algorithm that extends the ATLOP model, establishing a general framework for multi-path single-hop and multi-path multi-hop global inference in document-level relation extraction. Combining this inference framework with local relation computation yields the proposed LRGI (standing for "Local Relation and Global Inference") method for document-level relation extraction with local relations and global inference.
The proposed model is compared with other document-level relation extraction models on the DocRED dataset, showing improved precision and $F_1$ values.
However, we would like to point out that when applying our LRGI model to other datasets, the dimensionality of the global inference vector in the global inference layer needs to be tuned to achieve the best relation extraction.
The multi-hop inference layer in our LRGI model contains two iterations. During each iteration, the relational inference vector between entity pairs is updated and the number of intermediate nodes used between entity pairs increases, so the global inference relation between entity pairs contains richer information about the relationships implied among multiple entities.
Finally, we would also like to point out that using more iterations causes too many intermediate entities to be used in the global inference relation, so entity pairs formed from intermediate nodes are inferred repeatedly, which may introduce duplicate content into the global relation. Future research should explore how to choose the optimal number of iterations and how to avoid duplicated inference information across iterations.

Author Contributions

Conceptualization, Y.L.; supervision, H.S., F.N., G.Z. and G.X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 71971031.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yu, M.; Yin, W.; Hasan, K.S.; Santos, C.D.; Xiang, B.; Zhou, B. Improved neural relation detection for knowledge base question answering. arXiv 2017, arXiv:1704.06194. [Google Scholar]
  2. Shi, P.; Lin, J. Simple bert models for relation extraction and semantic role labeling. arXiv 2019, arXiv:1904.05255. [Google Scholar]
  3. Soares, L.B.; FitzGerald, N.; Ling, J.; Kwiatkowski, T. Matching the blanks: Distributional similarity for relation learning. arXiv 2019, arXiv:1906.03158. [Google Scholar]
  4. Miwa, M.; Bansal, M. End-to-end relation extraction using lstms on sequences and tree structures. arXiv 2016, arXiv:1601.00770. [Google Scholar]
  5. Alt, C.; Hübner, M.; Hennig, L. Improving relation extraction by pre-trained language representations. arXiv 2019, arXiv:1906.03088. [Google Scholar]
  6. Tang, Y.; Huang, J.; Wang, G.; He, X.; Zhou, B. Orthogonal relation transforms with graph context modeling for knowledge graph embedding. arXiv 2020, arXiv:1911.04910. [Google Scholar]
  7. Christopoulou, F.; Miwa, M.; Ananiadou, S. Connecting the dots: Document-level neural relation extraction with edge-oriented graphs. arXiv 2019, arXiv:1909.00228. [Google Scholar]
  8. Zhou, W.; Huang, K.; Ma, T.; Huang, J. Document-level relation extraction with adaptive thresholding and localized context pooling. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2021; pp. 14612–14620. [Google Scholar]
  9. Yao, Y.; Ye, D.; Li, P.; Han, X.; Lin, Y.; Liu, Z.; Liu, Z.; Huang, L.; Zhou, J.; Sun, M. DocRED: A large-scale document-level relation extraction dataset. arXiv 2019, arXiv:1906.06127. [Google Scholar]
  10. Weisstein, E.W. Floyd-Warshall Algorithm. Available online: https://mathworld.wolfram.com/2008 (accessed on 21 June 2008).
  11. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  12. Wang, H.; Focke, C.; Sylvester, R.; Mishra, N.; Wang, W. Fine-tune bert for docred with two-step process. arXiv 2019, arXiv:1909.11898. [Google Scholar]
  13. Tang, H.; Cao, Y.; Zhang, Z.; Cao, J.; Fang, F.; Wang, S.; Yin, P. Hin: Hierarchical inference network for document-level relation extraction. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Singapore, 11–14 May 2020; pp. 197–209. [Google Scholar]
  14. Guo, Z.; Zhang, Y.; Lu, W. Attention guided graph convolutional networks for relation extraction. arXiv 2019, arXiv:1906.07510. [Google Scholar]
  15. Wu, F.; Souza, A.; Zhang, T.; Fifty, C.; Yu, T.; Weinberger, K. Simplifying graph convolutional networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6861–6871. [Google Scholar]
  16. Nan, G.; Guo, Z.; Sekulić, I.; Lu, W. Reasoning with latent structure refinement for document-level relation extraction. arXiv 2020, arXiv:2005.06312. [Google Scholar]
  17. Sahu, S.K.; Christopoulou, F.; Miwa, M.; Ananiadou, S. Inter-sentence relation extraction with document-level graph convolutional neural network. arXiv 2019, arXiv:1906.04684. [Google Scholar]
  18. Zeng, S.; Xu, R.; Chang, B.; Li, L. Double Graph Based Reasoning for Document-level Relation Extraction. arXiv 2020, arXiv:2009.13752. [Google Scholar]
  19. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. arXiv 2017, arXiv:1711.05101. [Google Scholar]
  20. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
  21. Ye, D.; Lin, Y.; Du, J.; Liu, Z.; Li, P.; Sun, M.; Liu, Z. Coreferential Reasoning Learning for Language Representation. arXiv 2020, arXiv:2004.06870. [Google Scholar]
  22. Huang, K.; Qi, P.; Wang, G.; Ma, T.; Huang, J. Entity and evidence guided document-level relation extraction. In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), Online, 6 August 2021; pp. 307–315. [Google Scholar]
  23. Zhang, N.; Chen, X.; Xie, X.; Deng, S.; Tan, C.; Chen, M.; Huang, F.; Si, L.; Chen, H. Document-level relation extraction as semantic segmentation. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, International Joint Conferences on Artificial Intelligence Organization, Main Track, 19–27 August 2021; pp. 3999–4006. [Google Scholar]
Figure 1. Illustration of local context pooling.
Figure 2. Framework of the node-driven graph neural network model.
Figure 3. GAIN model framework.
Figure 4. Illustration of GCN.
Figure 5. Model framework.
Figure 6. Single-hop relational inference mechanism based on the improved Floyd's algorithm.
Figure 7. Multi-hop relational inference mechanism based on the improved Floyd's algorithm.
Figure 8. Algorithm flowchart of the global multi-hop relational inference mechanism.
Figure 9. Flowchart of the fusion algorithm for local relations and global inference; in the figure, * denotes multiplication.
Figure 10. Precision of different models during training.
Figure 11. F1 of different models during training.
Figure 12. Loss of different models during training.
Table 1. Statistics of the DocRED dataset. # denotes "number of".

Statistic                    Value
Dataset                      DocRED
# Train                      3053
# Dev                        1000
# Test                       1000
# Types of relation          96
# Entities                   132,275
# Relations                  56,354
Avg. # sentences per Doc.    8
Avg. # entities per Doc.     19.5
Table 2. Experimental environment configuration.

Software and Hardware            Configuration
Development tools and languages  PyCharm + Python 3.9
Development framework            PyTorch 1.9.0
Operating system                 64-bit Ubuntu 20.04
GPU                              NVIDIA Quadro RTX 6000 24 GB
CPU                              Intel Xeon W-2225 CPU @ 4.10 GHz × 8
Table 3. Parameters of the model.

Parameter                               Value
Dimension of mention embedding          768
Dimension of entity embedding           768
Dimension of local relation vector      97
Dimension of global inference vector    97
Learning rate                           5 × 10^-5
Dropout                                 0.1
Table 4. F1 of different RE models on DocRED.

Model                          Dev F1   Dev Ign F1   Test F1   Test Ign F1
Serialization-based models:
CNN                            43.45    41.58        42.26     40.33
LSTM                           50.66    48.44        50.07     47.71
BiLSTM                         50.95    48.87        51.06     48.78
Context-Aware                  51.10    48.94        50.70     48.40
BERTBASE                       54.16    -            53.20     -
BERT-TSBASE                    54.42    -            53.92     -
HIN-BERTBASE                   56.31    54.29        55.60     53.70
CorefBERTBASE                  57.51    55.32        56.96     54.54
BERT-ATLOPBASE                 61.09    59.22        61.30     59.31
E2GRE                          58.72    55.22        -         -
DocuNet                        61.83    59.86        61.86     59.93
Graph-structure-based models:
BiLSTM-AGGCN                   52.47    46.29        51.45     48.89
BiLSTM-LSR                     55.17    48.82        54.18     52.15
BERT-LSRBASE                   59.00    52.43        59.05     56.97
GAIN                           61.22    59.14        61.24     59.00
Proposed model:
LRGI                           62.11    60.48        61.71     59.62
Table 5. Comparison of the LRGI, ATLOP, and GAIN models.

Metric      LRGI    ATLOP   GAIN
F1          62.11   60.92   59.83
Ign F1      60.48   59.03   58.10
Precision   0.73    0.57    0.58
Recall      0.54    0.65    0.58
Table 6. Comparison of different maximum numbers of relationships between entity pairs on the dev set.

Max. Relations per Entity Pair   F1      Ign F1   Precision   Recall
1                                57.53   56.30    0.68        0.49
2                                59.14   57.58    0.68        0.52
3                                61.67   60.10    0.72        0.53
Table 7. Parameters and results of different models. # indicates dimension.

Model                # rel   # inf   F1      Ign F1   Precision   Recall
Model_rel0_inf97     0       97      59.56   57.94    0.67        0.539
Model_rel97_inf0     97      0       60.57   58.75    0.66        0.554
Model_rel97_inf10    97      10      61.67   60.10    0.72        0.539
Model_rel97_inf97    97      97      62.10   60.48    0.73        0.538
Model_rel150_inf10   150     10      60.32   58.72    0.69        0.534