Robust Chinese Short Text Entity Disambiguation Method Based on Feature Fusion and Contrastive Learning
Abstract
1. Introduction
- (1) An end-to-end entity disambiguation training framework that combines topic and semantic features is proposed, addressing the insufficient information extraction that hampers short-text entity disambiguation;
- (2) Contrastive learning is introduced into the entity disambiguation task to further improve the quality of text representations, strengthen performance when training samples are limited, and reduce the annotation workload and training cost;
- (3) Compared with the benchmark methods, the proposed model achieves both superior performance, improving the F1-score by 0.6% on the full training set, and superior robustness, improving it by 2.8% on a small training set, offering a novel approach to short-text entity disambiguation.
2. Related Work
3. Method
3.1. Feature Extraction
3.1.1. Topic Feature Extraction
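The extraction details are not reproduced in this excerpt. As a minimal sketch, assuming texts are tokenized with jieba and the topic model is gensim's LdaModel (both assumptions; this excerpt only specifies LDA with 43 topics, per the parameter table in Section 4.3), a text's topic feature can be taken as its dense topic probability distribution:

```python
import numpy as np
import jieba                       # assumed Chinese tokenizer
from gensim import corpora
from gensim.models import LdaModel

def train_lda(texts, num_topics=43):
    """Fit LDA on tokenized training texts; 43 topics matches the
    parameter table in Section 4.3."""
    docs = [list(jieba.cut(t)) for t in texts]
    dictionary = corpora.Dictionary(docs)
    corpus = [dictionary.doc2bow(d) for d in docs]
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, passes=10)
    return lda, dictionary

def topic_vector(text, lda, dictionary):
    """Represent a text by its topic probability distribution (43-d)."""
    bow = dictionary.doc2bow(list(jieba.cut(text)))
    dist = lda.get_document_topics(bow, minimum_probability=0.0)
    return np.array([p for _, p in dist], dtype=np.float32)
```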
3.1.2. Semantic Feature Extraction
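Likewise only a sketch: we assume a Chinese BERT checkpoint (bert-base-chinese is an assumption) encodes the {short text, entity description} pair jointly and that the [CLS] hidden state serves as the semantic feature; the maximum sequence lengths follow the parameter table in Section 4.3.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese")

def semantic_vector(short_text, entity_desc):
    """Encode the pair jointly and take the [CLS] hidden state as the
    semantic feature; joint (rather than separate) encoding is an assumption."""
    enc = tokenizer(short_text, entity_desc, truncation="longest_first",
                    max_length=64 + 256,   # 64 + 256 from the parameter table
                    return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[:, 0]     # shape (1, 768)
```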
3.2. Feature Fusion and Disambiguation
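This excerpt does not specify the fusion operator. A minimal sketch, assuming plain concatenation projected to a 128-dimensional fused vector (the dimension of m_vector in the parameter table) followed by a binary match/no-match classifier:

```python
import torch
import torch.nn as nn

class FusionDisambiguator(nn.Module):
    """Sketch: fuse the topic (43-d) and semantic (768-d) features and
    classify whether the candidate entity matches the mention."""
    def __init__(self, topic_dim=43, sem_dim=768, fused_dim=128):
        super().__init__()
        self.proj = nn.Linear(topic_dim + sem_dim, fused_dim)
        self.classifier = nn.Linear(fused_dim, 2)

    def forward(self, topic_vec, sem_vec):
        fused = torch.relu(self.proj(torch.cat([topic_vec, sem_vec], dim=-1)))
        return fused, self.classifier(fused)  # fused vector reused by the contrastive loss
```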
3.3. Contrastive Learning
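The paper builds on SimCLR-style contrastive learning (Chen et al.); the exact positive-pair construction is not shown in this excerpt. Below is a sketch of an in-batch supervised variant in which fused vectors of samples sharing a label are pulled together; the temperature of 0.5 and the loss weighting are assumed values, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(features, labels, temperature=0.5):
    """In-batch supervised contrastive loss: for each anchor, samples with
    the same label are positives, all other samples are negatives."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature                          # pairwise cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))      # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)      # avoid -inf * 0 = nan
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    loss = -(log_prob * pos_mask.float()).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

# The total objective would then combine classification and contrastive terms,
# e.g. loss = ce_loss + lam * contrastive_loss(fused, labels), where the
# weighting lam is a hypothetical hyperparameter.
```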
3.4. Training Process
4. Experiment and Analysis
4.1. Dataset
- (1) For short texts containing multiple entities, we rebuilt individual samples by pairing each entity mention with its short text, so that each sample disambiguates exactly one entity mention;
- (2) We extracted the structured entity information from the knowledge base and consolidated it into a single textual entity description, which simplifies the computation of feature vectors;
- (3) We combined each short text to be disambiguated with the corresponding entity description from the knowledge base, generating samples in the format of {short text, entity description} pairs; a minimal sketch of these three steps follows this list.
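The sketch below follows the field names visible in the sample records of Section 4.1 (text, mention_data, kb_id, data); the helper names themselves are hypothetical:

```python
import json

def load_jsonl(path):
    """Read one JSON record per line (the format of the dataset excerpts)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def consolidate(entity):
    """Step (2): flatten a KB record's predicate-object pairs into one
    textual entity description."""
    return " ".join(f'{d["predicate"]}: {d["object"]}' for d in entity["data"])

def build_pairs(sample, kb):
    """Steps (1) and (3): one sample per entity mention, paired with the
    consolidated description of its knowledge-base entity; `kb` maps
    kb_id -> description text."""
    return [{"short_text": sample["text"],
             "mention": m["mention"],
             "entity_description": kb.get(m["kb_id"], "")}
            for m in sample["mention_data"]]
```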
4.2. Evaluation
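The tables in Section 4.4 report precision (P), recall (R), and F1; we assume the standard definitions:

\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}.
\]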
4.3. Experimental Environment and Parameter Settings
4.4. Results
4.4.1. Full Training Set
4.4.2. Small Training Set
4.5. Discussion
4.5.1. Effectiveness of Feature Fusion
4.5.2. The Role of Contrastive Learning
- (1) As training proceeds, the alignment value of the features produced by each extractor keeps decreasing, indicating that the text features of samples in the same category cluster together in the vector space, which helps the classification layer distinguish the categories of samples. Notably, when the LDA model, the BERT model, or the contrastive learning method is removed, feature quality degrades, which means that each of them improves the text representations.
- (2) The slope of the blue arrow is steeper than that of the yellow one, indicating that introducing contrastive learning accelerates this trend; that is, with contrastive learning the model reaches a smaller alignment value faster. Meanwhile, we observed that the uniformity value increases over the training rounds, indicating that as the model separates different types of samples, the vector distribution inevitably becomes less uniform. However, introducing contrastive learning has little effect on the uniformity value, i.e., it does not aggravate this trend. A sketch of how both metrics can be computed follows this list.
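Alignment and uniformity here are the representation-quality metrics of Wang and Isola (ICML 2020). A minimal sketch of both over L2-normalized features, using that paper's default hyperparameters (α = 2, t = 2); how positive pairs are formed in this paper's setup is not shown in the excerpt:

```python
import torch

def alignment(x, y, alpha=2):
    """Mean distance between embeddings of positive pairs (x[i], y[i]);
    lower means same-class features sit closer together."""
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    """Log of the mean Gaussian potential between all embedding pairs;
    lower means the embeddings spread more uniformly on the hypersphere."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```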
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Nemes, L.; Kiss, A. Information Extraction and Named Entity Recognition Supported Social Media Sentiment Analysis during the COVID-19 Pandemic. Appl. Sci. 2021, 11, 11017.
- Han, X.; Kim, J.; Kwoh, C. Active learning for ontological event extraction incorporating named entity recognition and unknown word handling. J. Biomed. Semant. 2016, 7, 22.
- Al-Moslmi, T.; Gallofré Ocaña, M.; Opdahl, A.L.; Veres, C. Named Entity Extraction for Knowledge Graphs: A Literature Overview. IEEE Access 2020, 8, 32862–32881.
- Bagga, A.; Baldwin, B. Entity-based cross-document coreferencing using the vector space model. In Proceedings of the COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics, Stroudsburg, PA, USA, 10–14 August 1998.
- Fleischman, M.; Hovy, E. Multi-document person name resolution. In Proceedings of the Conference on Reference Resolution and Its Applications, Barcelona, Spain, 25–26 July 2004; pp. 1–8.
- Pedersen, T.; Purandare, A.; Kulkarni, A. Name discrimination by clustering similar contexts. In Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, 13–19 February 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 226–237.
- Pilz, A.; Paaß, G. From names to entities using thematic context distance. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, Glasgow, UK, 24–28 October 2011; pp. 857–866.
- He, Z.; Liu, S.; Li, M.; Zhou, M.; Zhang, L.; Wang, H. Learning entity representation for entity disambiguation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, 4–9 August 2013; pp. 30–34.
- Sun, Y.; Lin, L.; Tang, D.; Yang, N.; Ji, Z.; Wang, X. Modeling mention, context and entity with neural networks for entity disambiguation. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
- Zhang, Y.; Liu, J.; Huang, B.; Chen, B. Entity Linking Method for Chinese Short Text Based on Siamese-Like Network. Information 2022, 13, 397.
- Shi, Y.; Yang, R.; Yin, C.; Lu, Y.; Yang, Y.; Tao, Y. Entity Linking Method for Chinese Short Texts with Multiple Embedded Representations. Electronics 2023, 12, 2692.
- Möller, C.; Lehmann, J.; Usbeck, R. Survey on English Entity Linking on Wikidata. arXiv 2021.
- De Bonis, M.; Falchi, F.; Manghi, P. Graph-based methods for Author Name Disambiguation: A survey. PeerJ Comput. Sci. 2023, 9, e1536.
- Minkov, E.; Cohen, W.W.; Ng, A. Contextual search and name disambiguation in email using graphs. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Seattle, WA, USA, 6–11 August 2006; pp. 27–34.
- Zhang, B.; Saha, T.K.; Al Hasan, M. Name disambiguation from link data in a collaboration graph. In Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, 17–20 August 2014; pp. 81–84.
- Phan, M.C.; Sun, A.; Tay, Y.; Han, J.; Li, C. Pair-linking for collective entity disambiguation: Two could be better than all. IEEE Trans. Knowl. Data Eng. 2019, 31, 1383–1396.
- Han, X.; Zhao, J. Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden, 11–16 July 2010.
- Bouarroudj, W.; Boufaïda, Z.; Bellatreche, L. WeLink: A Named Entity Disambiguation Approach for a QAS over Knowledge Bases. In Proceedings of the International Conference on Flexible Query Answering Systems, Amantea, Italy, 17–19 June 2019.
- Lommatzsch, A.; Ploch, D.; De Luca, E.W.; Albayrak, S. Named Entity Disambiguation for German News Articles. LWA 2010, 2, 209–212.
- Blei, D.M.; Ng, A.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Jelodar, H.; Wang, Y.; Yuan, C.; Feng, X. Latent Dirichlet allocation (LDA) and topic modeling: Models, applications, a survey. Multimed. Tools Appl. 2019, 78, 15169–15211.
- Chen, Q.; Yao, L.; Yang, J. Short text classification based on LDA topic model. In Proceedings of the 2016 International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, China, 11–12 July 2016; pp. 749–753.
- Jiang, S.; Xian, Y.; Wang, H.; Zhang, Z.; Li, H. Representation Learning with LDA Models for Entity Disambiguation in Specific Domains. J. Adv. Comput. Intell. Intell. Inform. 2021, 25, 326–334.
- Zhang, W.; Su, J.; Tan, C.L. A Wikipedia-LDA Model for Entity Linking with Batch Size Changing Instance Selection. In Proceedings of the 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, 8–13 November 2011.
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805.
- Vaswani, A.; Shazeer, N.M.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
- Zhang, P.; Zhao, H.; Wang, F.; Zeng, Q.; Amos, S. Fusing LDA Topic Features for BERT-based Text Classification. Res. Sq. 2022.
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G.E. A Simple Framework for Contrastive Learning of Visual Representations. arXiv 2020, arXiv:2002.05709.
- Majumder, O.; Ravichandran, A.; Maji, S.; Polito, M.; Bhotika, R.; Soatto, S. Revisiting Contrastive Learning for Few-Shot Classification. arXiv 2021, arXiv:2101.11058.
- Stevens, K.; Kegelmeyer, W.P.; Andrzejewski, D.; Buttler, D.J. Exploring Topic Coherence over Many Models and Many Topics. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Jeju Island, Republic of Korea, 12–14 July 2012.
- Wan, C.; Li, B. Financial causal sentence recognition based on BERT-CNN text classification. J. Supercomput. 2021, 78, 6503–6527.
- Abas, A.R.; Elhenawy, I.; Zidan, M.; Othman, M. BERT-CNN: A Deep Learning Model for Detecting Emotions from Text. Comput. Mater. Contin. 2022, 71, 2943.
- Dai, Z.; Wang, X.; Ni, P.; Li, Y.; Li, G.; Bai, X. Named Entity Recognition Using BERT BiLSTM CRF for Chinese Electronic Health Records. In Proceedings of the 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Suzhou, China, 19–21 October 2019; pp. 1–5.
- Xia, L.; Ye, J.; Luo, D.; Guan, M.; Liu, J.; Cao, X. Short text automatic scoring system based on BERT-BiLSTM model. J. Shenzhen Univ. Sci. Eng. 2022, 39, 349.
- Ravi, M.P.; Singh, K.; Mulang, I.O.; Shekarpour, S.; Hoffart, J.; Lehmann, J. CHOLAN: A Modular Approach for Neural Entity Linking on Wikipedia and Wikidata. arXiv 2021, arXiv:2101.09969.
- Wang, T.; Isola, P. Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020.
- Dong, Z.; Dong, Q. HowNet—A hybrid language and knowledge resource. In Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 26–29 October 2003; pp. 820–824.
Source | Language | Sample Record |
---|---|---|
Short texts | Chinese text | {"text_id": "1", "text": "小品⟪战狼故事⟫中, 吴京突破重重障碍解救爱人, 深情告白太感人", "mention_data": [{"kb_id": "159056", "mention": "吴京", "offset": "10"}]} |
Short texts | English translation | {"text_id": "1", "text": "In the skit ⟪Wolf Warrior Story⟫, Wu Jing breaks through numerous obstacles to rescue his lover; his heartfelt confession is deeply touching.", "mention_data": [{"kb_id": "159056", "mention": "Wu Jing", "offset": "10"}]} |
Knowledge base | Chinese text | {"alias": [], "subject_id": "27429", "data": [{"predicate": "摘要", "object": "⟪心魔⟫是由张明师/张超南作词, 朱兴明作曲, 张雅静演唱的歌曲, 发行于2017年11月。"}, {"predicate": "义项描述", "object": "张雅静演唱的歌曲"}], "type": "Work", "subject": "心魔"} |
Knowledge base | English translation | {"alias": [], "subject_id": "27429", "data": [{"predicate": "abstract", "object": "⟪Heart Demon⟫ is a song written by Zhang Mingshi/Zhang Chaonan, composed by Zhu Xingming, and sung by Zhang Yajing. It was released in November 2017."}, {"predicate": "meaning description", "object": "Song sung by Zhang Yajing"}], "type": "Work", "subject": "Heart Demon"} |
Experimental Environment | Environment Configuration |
---|---|
Operating system | Ubuntu 20.04.6 LTS |
CPU | Intel(R) Xeon(R) Gold 6130H |
GPU | NVIDIA GeForce RTX 3090 × 1 |
Memory | 128 GB |
Python | 3.8.11 |
Parameter | Parameter Value |
---|---|
Topic num of LDA | 43 |
Dim of m_vector | 128 |
Epoch | 3 |
Batch Size | 128 |
Learning rate of BERT | 5 × 10⁻⁵ |
Learning rate of other nets | 1 × 10⁻³ |
Max sequence length of short texts | 64 |
Max sequence length of entity descriptions | 256 |
Optimizer | Adam |
β₁ of Adam | 0.9 |
Model | P (10,000) | R (10,000) | F1 (10,000) | P (3000) | R (3000) | F1 (3000) | P (1000) | R (1000) | F1 (1000) |
---|---|---|---|---|---|---|---|---|---|
BERT-BiLSTM | 84.3 | 77.3 | 80.6 | 72.6 | 85.6 | 78.6 (↓2.5%) | 61.9 | 71.8 | 66.5 (↓17.5%) |
BERT-CNN | 83.1 | 80.6 | 81.8 | 84.4 | 63.8 | 72.7 (↓11.1%) | 73.9 | 65.8 | 69.6 (↓14.9%) |
CHOLAN | 82.2 | 84.6 | 83.4 | 77.2 | 80.9 | 79.0 (↓5.3%) | 69.9 | 73.7 | 71.7 (↓14.0%) |
COLBERT (ours) | 84.6 | 83.4 | 84.0 | 76.0 | 83.9 | 79.8 (↓5.0%) | 74.8 | 74.3 | 74.5 (↓11.3%) |
Column headers give the number of training samples; values in parentheses after F1 give the relative drop with respect to each model's F1 on the full 10,000-sample training set.
Model | P (10,000) | R (10,000) | F1 (10,000) | P (3000) | R (3000) | F1 (3000) | P (1000) | R (1000) | F1 (1000) |
---|---|---|---|---|---|---|---|---|---|
COLBERT | 84.6 | 83.4 | 84.0 | 76.0 | 83.9 | 79.8 | 74.8 | 74.3 | 74.5 |
−LDA | 82.1 | 85.8 | 83.9 | 71.5 | 88.5 | 79.1 | 69.0 | 80.2 | 74.1 |
−BERT | 60.1 | 68.3 | 64.0 | 56.2 | 65.1 | 60.3 | 50.3 | 60.8 | 55.1 |