BTDM: A Bi-Directional Translating Decoding Model-Based Relational Triple Extraction
Abstract
1. Introduction
- We propose a new bidirectional perspective to decompose the relational triple-extraction task into three subtasks: entity extraction, subject–object alignment, and relation judgement.
- Following this perspective, we propose a novel end-to-end model, BTDM, which greatly mitigates error propagation, handles the overlapping-triple problem, and efficiently aligns subjects and objects.
- We conduct extensive experiments on the public NYT and WebNLG datasets and compare the proposed method with 12 baselines, demonstrating that it achieves state-of-the-art performance.
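The three-subtask decomposition above can be illustrated as a small pipeline. This is a hypothetical sketch of the control flow only, not the authors' neural architecture; the function names and the toy stand-in modules are invented for illustration:

```python
from itertools import product

def extract_triples(sentence, entity_tagger, aligner, relation_scorer):
    """Toy pipeline mirroring the three subtasks described above:
    (1) entity extraction, (2) subject-object alignment,
    (3) relation judgement. The three callables stand in for the
    learned modules of a real model."""
    subjects, objects = entity_tagger(sentence)            # subtask 1
    pairs = [(s, o) for s, o in product(subjects, objects)
             if aligner(sentence, s, o)]                   # subtask 2
    return [(s, r, o) for s, o in pairs
            for r in relation_scorer(sentence, s, o)]      # subtask 3

# Toy stand-ins over one sentence
sent = "Obama was born in Hawaii"
tagger = lambda t: (["Obama"], ["Hawaii"])
aligner = lambda t, s, o: True
scorer = lambda t, s, o: ["born_in"]
print(extract_triples(sent, tagger, aligner, scorer))
# → [('Obama', 'born_in', 'Hawaii')]
```

Because alignment (subtask 2) filters candidate pairs before relation judgement (subtask 3), a wrong relation prediction cannot create a triple over an unaligned pair, which is the intuition behind reduced error propagation.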
2. Problem Definition
2.1. Problem Formulation
2.2. Our View of the Problem
3. Method
3.1. BTDM Encoder
3.2. BTDM Decoder
3.2.1. Entity Extraction
3.2.2. Subject–Object Alignment
3.2.3. Relation Judgement
3.3. Joint Training Strategy
4. Experiments
4.1. Datasets and Evaluation Metrics
4.2. Implementation Details
4.3. Baselines
- NovelTagging [29] treats extraction as a sequence labeling problem with a unified tagging scheme that merges entity and relation roles;
- CopyRE [30] applies a sequence-to-sequence architecture with a copy mechanism;
- MultiHead [31] uses a multi-head selection technique to jointly identify entities and relations;
- GraphRel [32] uses graph convolutional networks to extract entities and their relations;
- OrderCopyRE [33] applies reinforcement learning to learn the extraction order of triples in a sequence-to-sequence model;
- ETL-span [24] proposes a decomposition-based tagging scheme;
- RSAN [34] proposes a relation-specific attention network to extract entities and relations;
- CasRel [17] applies a cascade binary tagging framework to extract triples;
- TPLinker [25] first enumerates all token pairs and then tags the links between them to recognize entities and the relations connecting token pairs;
- TDEER [18] proposes a translating decoding schema to extract relational triples;
- PRGC [19] proposes a framework based on potential relations and global correspondence;
- R-BPtrNet [35] proposes a unified network to extract explicit and implicit relational triples.
4.4. Experimental Results
4.4.1. Main Results
4.4.2. Detailed Results on Complex Scenarios
4.4.3. Detailed Results on Different Sub-Tasks
5. Related Work
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; Lao, N.; Murphy, K.; Strohmann, T.; Sun, S.; Zhang, W. Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 601–610.
- Nayak, T.; Majumder, N.; Goyal, P.; Poria, S. Deep neural approaches to relation triplets extraction: A comprehensive survey. Cogn. Comput. 2021, 13, 1215–1232.
- Fader, A.; Zettlemoyer, L.; Etzioni, O. Open question answering over curated and extracted knowledge bases. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 1156–1165.
- Li, X.; Yin, F.; Sun, Z.; Li, X.; Yuan, A.; Chai, D.; Zhou, M.; Li, J. Entity-Relation Extraction as Multi-Turn Question Answering. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Florence, Italy, 2019; pp. 1340–1350.
- Huang, C.C.; Lu, Z. Community challenges in biomedical text mining over 10 years: Success, failure and the future. Briefings Bioinform. 2016, 17, 132–144.
- Lai, T.; Ji, H.; Zhai, C.; Tran, Q.H. Joint Biomedical Entity and Relation Extraction with Knowledge-Enhanced Collective Inference. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; Volume 1: Long Papers; Association for Computational Linguistics: Seattle, WA, USA, 2021; pp. 6248–6260.
- Zelenko, D.; Aone, C.; Richardella, A. Kernel methods for relation extraction. J. Mach. Learn. Res. 2003, 3, 1083–1106.
- Zhou, G.; Su, J.; Zhang, J.; Zhang, M. Exploring Various Knowledge in Relation Extraction. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05); Association for Computational Linguistics: Ann Arbor, MI, USA, 2005; pp. 427–434.
- Chan, Y.S.; Roth, D. Exploiting Syntactico-Semantic Structures for Relation Extraction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Portland, OR, USA, 2011; pp. 551–560.
- Zhong, Z.; Chen, D. A Frustratingly Easy Approach for Entity and Relation Extraction. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Seattle, WA, USA, 2021; pp. 50–61.
- Tjong Kim Sang, E.F.; De Meulder, F. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL; Association for Computational Linguistics: Seattle, WA, USA, 2003; pp. 142–147.
- Ratinov, L.; Roth, D. Design Challenges and Misconceptions in Named Entity Recognition. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009); Association for Computational Linguistics: Boulder, CO, USA, 2009; pp. 147–155.
- Bunescu, R.C.; Mooney, R.J. A Shortest Path Dependency Kernel for Relation Extraction. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP-05); Association for Computational Linguistics: Vancouver, BC, Canada, 2005; pp. 724–731.
- Li, Q.; Ji, H. Incremental Joint Extraction of Entity Mentions and Relations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics; Volume 1: Long Papers; Association for Computational Linguistics: Baltimore, MD, USA, 2014; pp. 402–412.
- Ren, X.; Wu, Z.; He, W.; Qu, M.; Voss, C.R.; Ji, H.; Abdelzaher, T.F.; Han, J. Cotype: Joint extraction of typed entities and relations with knowledge bases. In Proceedings of the 26th International Conference on World Wide Web, Geneva, Switzerland, 3–7 April 2017; pp. 1015–1024.
- Xu, B.; Wang, Q.; Lyu, Y.; Shi, Y.; Zhu, Y.; Gao, J.; Mao, Z. EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Seattle, WA, USA, 2022; pp. 659–665.
- Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Seattle, WA, USA, 2020; pp. 1476–1488.
- Li, X.; Luo, X.; Dong, C.; Yang, D.; Luan, B.; He, Z. TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics: Punta Cana, Dominican Republic, 2021; pp. 8055–8064.
- Zheng, H.; Wen, R.; Chen, X.; Yang, Y.; Zhang, Y.; Zhang, Z.; Zhang, N.; Qin, B.; Ming, X.; Zheng, Y. PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; Volume 1: Long Papers; Association for Computational Linguistics: Seattle, WA, USA, 2021; pp. 6225–6235.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Volume 1: Long and Short Papers; Association for Computational Linguistics: Minneapolis, MN, USA, 2019; pp. 4171–4186.
- Pennington, J.; Socher, R.; Manning, C. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1532–1543.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692.
- Sui, D.; Chen, Y.; Liu, K.; Zhao, J.; Zeng, X.; Liu, S. Joint entity and relation extraction with set prediction networks. arXiv 2020, arXiv:2011.01675.
- Yu, B.; Zhang, Z.; Shu, X.; Wang, Y.; Liu, T.; Wang, B.; Li, S. Joint extraction of entities and relations based on a novel decomposition strategy. arXiv 2019, arXiv:1909.04273.
- Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage joint extraction of entities and relations through token pair linking. arXiv 2020, arXiv:2010.13415.
- Riedel, S.; Yao, L.; McCallum, A. Modeling relations and their mentions without labeled text. In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2010, Barcelona, Spain, 20–24 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 148–163.
- Gardent, C.; Shimorina, A.; Narayan, S.; Perez-Beltrachini, L. Creating training corpora for NLG micro-planning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL); Association for Computational Linguistics: Vancouver, BC, Canada, 2017.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Zheng, S.; Wang, F.; Bao, H.; Hao, Y.; Zhou, P.; Xu, B. Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics; Volume 1: Long Papers; Association for Computational Linguistics: Vancouver, BC, Canada, 2017; pp. 1227–1236.
- Zeng, X.; Zeng, D.; He, S.; Liu, K.; Zhao, J. Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics; Volume 1: Long Papers; Association for Computational Linguistics: Melbourne, Australia, 2018; pp. 506–514.
- Bekoulis, G.; Deleu, J.; Demeester, T.; Develder, C. Joint entity recognition and relation extraction as a multi-head selection problem. Expert Syst. Appl. 2018, 114, 34–45.
- Fu, T.J.; Li, P.H.; Ma, W.Y. GraphRel: Modeling text as relational graphs for joint entity and relation extraction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1409–1418.
- Zeng, X.; He, S.; Zeng, D.; Liu, K.; Liu, S.; Zhao, J. Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP); Association for Computational Linguistics: Hong Kong, China, 2019; pp. 367–377.
- Yuan, Y.; Zhou, X.; Pan, S.; Zhu, Q.; Song, Z.; Guo, L. A Relation-Specific Attention Network for Joint Entity and Relation Extraction. In Proceedings of the IJCAI, Yokohama, Japan, 11–17 July 2020; Volume 2020, pp. 4054–4060.
- Chen, Y.; Zhang, Y.; Hu, C.; Huang, Y. Jointly Extracting Explicit and Implicit Relational Triples with Reasoning Pattern Enhanced Binary Pointer Network. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Association for Computational Linguistics: Seattle, WA, USA, 2021; pp. 5694–5703.
- Sun, C.; Gong, Y.; Wu, Y.; Gong, M.; Jiang, D.; Lan, M.; Sun, S.; Duan, N. Joint Type Inference on Entities and Relations via Graph Convolutional Networks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Florence, Italy, 2019; pp. 1361–1370.
- Liu, J.; Chen, S.; Wang, B.; Zhang, J.; Li, N.; Xu, T. Attention as relation: Learning supervised multi-head self-attention for relation extraction. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence, Yokohama, Japan, 7–15 January 2021; pp. 3787–3793.
- Tian, X.; Jing, L.; He, L.; Liu, F. StereoRel: Relational Triple Extraction from a Stereoscopic Perspective. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; Volume 1: Long Papers; Association for Computational Linguistics: Seattle, WA, USA, 2021; pp. 4851–4861.
- Miwa, M.; Bansal, M. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics; Volume 1: Long Papers; Association for Computational Linguistics: Berlin, Germany, 2016; pp. 1105–1116.
- Gupta, P.; Schütze, H.; Andrassy, B. Table filling multi-task recurrent neural network for joint entity and relation extraction. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics; Technical Papers; Association for Computational Linguistics: Osaka, Japan, 2016; pp. 2537–2547.
- Zhang, M.; Zhang, Y.; Fu, G. End-to-End Neural Relation Extraction with Global Optimization. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics: Copenhagen, Denmark, 2017; pp. 1730–1740.
- Zeng, D.; Zhang, H.; Liu, Q. CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 9507–9514.
Statistics of the datasets. The Train/Valid/Test columns give sentence counts; the remaining columns describe the test data.

| Dataset | Train | Valid | Test | Normal | SEO | EPO | SOO | N = 1 | N > 1 | #Triples | #Relations |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NYT* | 56,195 | 4999 | 5000 | 3266 | 1297 | 978 | 45 | 3244 | 1756 | 8110 | 24 |
| WebNLG* | 5019 | 500 | 703 | 245 | 457 | 26 | 84 | 266 | 437 | 1591 | 171 |
| NYT | 56,196 | 5000 | 5000 | 3071 | 1273 | 1168 | 117 | 3089 | 1911 | 8616 | 24 |
| WebNLG | 5019 | 500 | 703 | 239 | 448 | 6 | 85 | 256 | 447 | 1607 | 216 |
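The overlap categories in the table (Normal, SEO, EPO, SOO) can be made concrete with a small classifier. This is a simplified sketch, not the exact rules of any benchmark script: real category assignment works on entity spans and allows a sentence to fall into several categories, whereas this toy version returns a single label by priority:

```python
def overlap_type(triples):
    """Classify a sentence's gold (subject, relation, object) triples
    into one overlap category, checked in priority order:
    EPO  - two triples share the same entity pair,
    SOO  - a triple's subject overlaps its own object,
    SEO  - two triples share exactly one entity,
    Normal - no entity is shared."""
    pairs = [(s, o) for s, _, o in triples]
    # EPO: at least two triples over the same (subject, object) pair
    if len(set(pairs)) < len(pairs):
        return "EPO"
    # SOO: subject and object of one triple overlap as strings
    if any(s in o or o in s for s, o in pairs):
        return "SOO"
    # SEO: some entity appears in more than one pair
    entities = [e for pair in pairs for e in pair]
    if len(set(entities)) < len(entities):
        return "SEO"
    return "Normal"
```

For example, two triples over the same pair ("A", "B") with different relations would be classified as EPO, while ("New York", located_in, "New York City") would be SOO because the subject string is contained in the object.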
| Model | NYT* Prec. | NYT* Rec. | NYT* F1 | WebNLG* Prec. | WebNLG* Rec. | WebNLG* F1 | NYT Prec. | NYT Rec. | NYT F1 | WebNLG Prec. | WebNLG Rec. | WebNLG F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NovelTagging [29] | - | - | - | - | - | - | 32.8 | 30.6 | 31.7 | 52.5 | 19.3 | 28.3 |
| CopyRE [30] | 61.0 | 56.6 | 58.7 | 37.7 | 36.4 | 37.1 | - | - | - | - | - | - |
| MultiHead [31] | - | - | - | - | - | - | 60.7 | 58.6 | 59.6 | 57.5 | 54.1 | 55.7 |
| GraphRel [32] | 63.9 | 60.0 | 61.9 | 44.7 | 41.1 | 42.9 | - | - | - | - | - | - |
| OrderCopyRE [33] | 77.9 | 67.2 | 72.1 | 63.3 | 59.9 | 61.6 | - | - | - | - | - | - |
| ETL-span [24] | 84.9 | 72.3 | 78.1 | 84.0 | 91.5 | 87.6 | 85.5 | 71.7 | 78.0 | 84.3 | 82.0 | 83.1 |
| RSAN [34] | - | - | - | - | - | - | 85.7 | 83.6 | 84.6 | 80.5 | 83.8 | 82.1 |
| CasRel [17] | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8 | - | - | - | - | - | - |
| TPLinker [25] | 91.3 | 92.5 | 91.9 | 91.8 | 92.0 | 91.9 | 91.4 | 92.6 | 92.0 | 88.9 | 84.5 | 86.7 |
| TDEER [18] | 93.0 | 92.1 | 92.5 | 93.8 | 92.4 | 93.1 | - | - | - | - | - | - |
| PRGC [19] | 93.3 | 91.9 | 92.6 | 94.0 | 92.1 | 93.0 | 93.5 | 91.9 | 92.7 | 89.9 | 87.2 | 88.5 |
| R-BPtrNet [35] | 92.7 | 92.5 | 92.6 | 93.7 | 92.8 | 93.3 | - | - | - | - | - | - |
| BTDM | 93.0 | 92.5 | 92.7 | 94.1 | 93.5 | 93.8 | 93.1 | 92.4 | 92.7 | 90.9 | 90.1 | 90.5 |
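The precision/recall/F1 scores in these tables follow the exact-match criterion standard on NYT and WebNLG: a predicted triple counts as correct only if its subject, relation, and object all match a gold triple. A minimal sketch of this metric (illustrative helper, not the authors' evaluation script):

```python
def triple_prf1(predicted, gold):
    """Micro precision/recall/F1 under exact triple matching.
    `predicted` and `gold` are collections of (subject, relation,
    object) tuples aggregated over the whole test set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Under this criterion a triple with a correct entity pair but wrong relation contributes nothing, which is why the per-subtask breakdown in the sub-task table below can exceed the end-to-end F1.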
F1 on complex scenarios. The left block of columns reports NYT*, the right block WebNLG*; N is the number of triples in a sentence.

| Model | Normal | EPO | SEO | SOO | N = 1 | N = 2 | N = 3 | N = 4 | N ≥ 5 | Normal | EPO | SEO | SOO | N = 1 | N = 2 | N = 3 | N = 4 | N ≥ 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OrderCopyRE | 71.2 | 69.4 | 72.8 | - | 71.7 | 72.6 | 72.5 | 77.9 | 45.9 | 65.4 | 60.1 | 67.4 | - | 63.4 | 62.2 | 64.4 | 57.2 | 55.7 |
| ETL-Span | 88.5 | 87.6 | 60.3 | - | 88.5 | 82.1 | 74.7 | 75.6 | 76.9 | 87.3 | 91.5 | 80.5 | - | 82.1 | 86.5 | 91.4 | 89.5 | 91.1 |
| CasRel | 87.3 | 91.4 | 92.0 | 77.0 † | 88.2 | 90.3 | 91.9 | 94.2 | 83.7 | 89.4 | 92.2 | 94.7 | 90.4 † | 89.3 | 90.8 | 94.2 | 92.4 | 90.9 |
| TPLinker | 90.1 | 93.4 | 94.0 | 90.1 † | 90.0 | 92.8 | 93.1 | 96.1 | 90.0 | 87.9 | 92.5 | 95.3 | 86.0 † | 88.0 | 90.1 | 94.6 | 93.3 | 91.6 |
| PRGC | 91.0 | 94.0 | 94.5 | 81.8 | 91.1 | 93.0 | 93.5 | 95.5 | 93.0 | 90.4 | 93.6 | 95.9 | 94.6 | 89.9 | 91.6 | 95.0 | 94.8 | 92.8 |
| BTDM | 90.8 | 94.7 | 94.9 | 87.5 | 90.7 | 93.4 | 94.2 | 96.2 | 94.0 | 91.0 | 94.3 | 93.5 | 94.6 | 90.8 | 92.5 | 96.1 | 95.4 | 92.7 |
Results on the sub-tasks: entity-pair extraction (s, o), relation judgement r, and the complete triple (s, r, o).

| Model | Element | NYT* Prec. | NYT* Rec. | NYT* F1 | WebNLG* Prec. | WebNLG* Rec. | WebNLG* F1 |
|---|---|---|---|---|---|---|---|
| PRGC | (s, o) | 94.0 | 92.3 | 93.1 | 96.0 | 93.4 | 94.7 |
| | r | 95.3 | 96.3 | 95.8 | 92.8 | 96.2 | 94.5 |
| | (s, r, o) | 93.3 | 91.9 | 92.6 | 94.0 | 92.1 | 93.0 |
| BTDM | (s, o) | 93.2 | 93.0 | 93.1 | 95.8 | 95.1 | 95.5 |
| | r | 96.9 | 95.4 | 96.1 | 96.2 | 95.1 | 95.7 |
| | (s, r, o) | 93.0 | 92.5 | 92.7 | 94.1 | 93.5 | 93.8 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, Z.; Yang, J.; Liu, H.; Hu, P. BTDM: A Bi-Directional Translating Decoding Model-Based Relational Triple Extraction. Appl. Sci. 2023, 13, 4447. https://doi.org/10.3390/app13074447