Learning Event Representations for Zero-Shot Detection via Dual-Contrastive Prompting
Abstract
1. Introduction
- We propose a new method for zero-shot ED, namely COPE, which leverages a dual-contrastive learning framework, i.e., sample-level and instance-level, to learn better event representations for the task;
- We devise a contrastive fusion strategy that captures the complex interaction information within events from two perspectives (event triggers and background sentences), striking a balance between task-specific and task-agnostic features in embedding fusion;
- We validate our model on two benchmark datasets; the experimental results show that COPE outperforms state-of-the-art models in both seen and unseen event detection.
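The sample- and instance-level objectives in the first contribution can be illustrated with a generic InfoNCE-style contrastive loss. The sketch below is illustrative only: the function and variable names are ours, and COPE's actual losses differ in detail. Sample-level contrast pulls two views of the same event mention together; instance-level contrast does the same for mentions of the same event type while pushing other mentions away.

```python
import math
import random

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (all vectors are unit-normalised lists).

    `positive` is another view of the same mention (sample-level) or another
    mention of the same event type (instance-level); `negatives` are
    embeddings the loss pushes away from the anchor.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    pos = dot(anchor, positive) / temperature
    logits = [pos] + [dot(anchor, n) / temperature for n in negatives]
    # cross-entropy with the positive pair as the target class
    return -pos + math.log(sum(math.exp(l) for l in logits))

def normalise(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# toy check: a near-duplicate view of the anchor yields a small loss
random.seed(0)
a = normalise([random.gauss(0, 1) for _ in range(8)])
p = normalise([ai + 0.05 * random.gauss(0, 1) for ai in a])  # perturbed view
negs = [normalise([random.gauss(0, 1) for _ in range(8)]) for _ in range(4)]
loss = info_nce(a, p, negs)
```

Because the positive pair's similarity dominates the softmax, the loss stays close to zero for a well-aligned pair and grows when negatives become as similar to the anchor as the positive is.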
2. Related Work
2.1. Zero-Shot Event Detection
2.2. Event Representation Learning
2.3. Contrastive Learning
3. Problem Definition
4. Methodology
4.1. Contrastive Sample Generator
4.2. Trigger Recognition
TR-Prompt
4.3. Event Sentence Embedding
ER-Prompt
4.4. Contrastive Fusion
4.5. Event Type Prediction
4.6. Loss Analysis
5. Experiments
5.1. Implementation Details
5.2. Datasets
5.3. Evaluation
5.4. Baseline
- Supporting Clustering with Contrastive Learning (SCCL) is one of the best-performing models for unsupervised text clustering, optimizing a top-down clustering loss. SCCL is used here to detect new event types from unseen event mentions; the contextual features of trigger tokens are used in our experiments [31].
- The Semi-supervised Vector Quantized Variational Autoencoder (SS-VQ-VAE) is a semi-supervised zero-shot event detection model that also uses BERT as the encoder for event text and employs a variational autoencoder to learn discrete event features. SS-VQ-VAE is trained on seen event types and their annotations, and it can then be applied to zero-shot event detection [5].
- Zero-Shot Event Detection with Ordered Contrastive Learning (ZEOP) leverages prompt learning and an ordered contrastive loss based on both instance-level and class-level distances for zero-shot event detection. ZEOP first identifies trigger tokens and then predicts event types by clustering [15].
- APEX Prompt is a prompt-engineering approach that uses a more comprehensive event-type description as its template. Compared with other prompt-based methods, it significantly improves event detection performance, especially in the zero-shot setting [33].
6. Results and Analysis
6.1. Main Result
6.2. Ablation Analysis
6.3. Prompt Effect
6.4. Qualitative Analysis
7. Limitations
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Nguyen, T.H.; Grishman, R. Graph Convolutional Networks with Argument-Aware Pooling for Event Detection. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, LA, USA, 2–7 February 2018; McIlraith, S.A., Weinberger, K.Q., Eds.; PKP: Burnaby, BC, Canada, 2018; pp. 5900–5907. [Google Scholar] [CrossRef]
- Wadden, D.; Wennberg, U.; Luan, Y.; Hajishirzi, H. Entity, Relation, and Event Extraction with Contextualized Span Representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 5784–5789. [Google Scholar] [CrossRef]
- Lin, Y.; Ji, H.; Huang, F.; Wu, L. A Joint Neural Model for Information Extraction with Global Features. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 7999–8009. [Google Scholar] [CrossRef]
- Zhang, C.; Soderland, S.; Weld, D.S. Exploiting Parallel News Streams for Unsupervised Event Extraction. Trans. Assoc. Comput. Linguist. 2015, 3, 117–129. [Google Scholar] [CrossRef]
- Huang, L.; Ji, H. Semi-supervised New Event Type Induction and Event Detection. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 718–724. [Google Scholar] [CrossRef]
- Wang, Z.; Wang, X.; Han, X.; Lin, Y.; Hou, L.; Liu, Z.; Li, P.; Li, J.; Zhou, J. CLEVE: Contrastive Pre-training for Event Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 6283–6297. [Google Scholar] [CrossRef]
- Huang, L.; Ji, H.; Cho, K.; Dagan, I.; Riedel, S.; Voss, C.R. Zero-Shot Transfer Learning for Event Extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, VIC, Australia, 15–20 July 2018; Gurevych, I., Miyao, Y., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2018; Volume 1: Long Papers, pp. 2160–2170. [Google Scholar] [CrossRef]
- Zhang, H.; Wang, H.; Roth, D. Zero-shot Label-Aware Event Trigger and Argument Classification. In Proceedings of the Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online, 1–6 August 2021; pp. 1331–1340. [Google Scholar] [CrossRef]
- Lyu, Q.; Zhang, H.; Sulem, E.; Roth, D. Zero-shot Event Extraction via Transfer Learning: Challenges and Insights. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Online, 1–6 August 2021; pp. 322–332. [Google Scholar] [CrossRef]
- Lee, I.T.; Goldwasser, D. Multi-Relational Script Learning for Discourse Relations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; Korhonen, A., Traum, D., Màrquez, L., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2019; pp. 4214–4226. [Google Scholar] [CrossRef]
- Rezaee, M.; Ferraro, F. Event Representation with Sequential, Semi-Supervised Discrete Variables. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA, 6–11 June 2021; Toutanova, K., Rumshisky, A., Zettlemoyer, L., Hakkani-Tur, D., Beltagy, I., Bethard, S., Cotterell, R., Chakraborty, T., Zhou, Y., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2021; pp. 4701–4716. [Google Scholar] [CrossRef]
- Deng, S.; Zhang, N.; Li, L.; Hui, C.; Huaixiao, T.; Chen, M.; Huang, F.; Chen, H. OntoED: Low-resource Event Detection with Ontology Embedding. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; Zong, C., Xia, F., Li, W., Navigli, R., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2021; pp. 2828–2839. [Google Scholar] [CrossRef]
- Martin, L.J.; Ammanabrolu, P.; Wang, X.; Hancock, W.; Singh, S.; Harrison, B.; Riedl, M.O. Event Representations for Automated Story Generation with Deep Neural Nets. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, LA, USA, 2–7 February 2018; McIlraith, S.A., Weinberger, K.Q., Eds.; PKP: Burnaby, BC, Canada, 2018; pp. 868–875. [Google Scholar] [CrossRef]
- Chen, H.; Shu, R.; Takamura, H.; Nakayama, H. GraphPlan: Story Generation by Planning with Event Graph. In Proceedings of the 14th International Conference on Natural Language Generation, Aberdeen, UK, 20–24 September 2021; Belz, A., Fan, A., Reiter, E., Sripada, Y., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2021; pp. 377–386. [Google Scholar] [CrossRef]
- Zhang, S.; Ji, T.; Ji, W.; Wang, X. Zero-Shot Event Detection Based on Ordered Contrastive Learning and Prompt-Based Prediction. In Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, WA, USA, 10–15 July 2022; pp. 2572–2580. [Google Scholar] [CrossRef]
- Jiang, T.; Jiao, J.; Huang, S.; Zhang, Z.; Wang, D.; Zhuang, F.; Wei, F.; Huang, H.; Deng, D.; Zhang, Q. PromptBERT: Improving BERT Sentence Embeddings with Prompts. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 8826–8837. [Google Scholar] [CrossRef]
- Socher, R.; Chen, D.; Manning, C.D.; Ng, A. Reasoning with Neural Tensor Networks for Knowledge Base Completion. Neural Inf. Process. Syst. 2013, 26, 926–934. [Google Scholar]
- Weber, N.; Balasubramanian, N.; Chambers, N. Event Representations with Tensor-Based Compositions. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, LA, USA, 2–7 February 2018; McIlraith, S.A., Weinberger, K.Q., Eds.; PKP: Burnaby, BC, Canada, 2018; pp. 4946–4953. [Google Scholar] [CrossRef]
- Ding, X.; Liao, K.; Liu, T.; Li, Z.; Duan, J. Event Representation Learning Enhanced with External Commonsense Knowledge. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Inui, K., Jiang, J., Ng, V., Wan, X., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2019; pp. 4894–4903. [Google Scholar] [CrossRef]
- Gao, J.; Wang, W.; Yu, C.; Zhao, H.; Ng, W.; Xu, R. Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2022; pp. 3036–3049. [Google Scholar] [CrossRef]
- Gao, T.; Yao, X.; Chen, D. SimCSE: Simple Contrastive Learning of Sentence Embeddings. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 6894–6910. [Google Scholar] [CrossRef]
- Yan, Y.; Li, R.; Wang, S.; Zhang, F.; Wu, W.; Xu, W. ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 5065–5075. [Google Scholar] [CrossRef]
- Zheng, J.; Cai, F.; Chen, H. Incorporating Scenario Knowledge into A Unified Fine-tuning Architecture for Event Representation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, 25–30 July 2020; Huang, J.X., Chang, Y., Cheng, X., Kamps, J., Murdock, V., Wen, J., Liu, Y., Eds.; Association for Computing Machinery: New York, NY, USA, 2020; pp. 249–258. [Google Scholar] [CrossRef]
- Vijayaraghavan, P.; Roy, D. Lifelong Knowledge-Enriched Social Event Representation Learning. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, 19–23 April 2021; Merlo, P., Tiedemann, J., Tsarfaty, R., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2021; pp. 3624–3635. [Google Scholar] [CrossRef]
- Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3982–3992. [Google Scholar] [CrossRef]
- Li, B.; Zhou, H.; He, J.; Wang, M.; Yang, Y.; Li, L. On the Sentence Embeddings from Pre-trained Language Models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 9119–9130. [Google Scholar] [CrossRef]
- Logeswaran, L.; Lee, H. An efficient framework for learning sentence representations. arXiv 2018, arXiv:1803.02893. [Google Scholar]
- Snell, J.; Swersky, K.; Zemel, R.S. Prototypical Networks for Few-shot Learning. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; Guyon, I., von Luxburg, U., Bengio, S., Wallach, H.M., Fergus, R., Vishwanathan, S.V.N., Garnett, R., Eds.; 2017; pp. 4077–4087. Available online: https://proceedings.neurips.cc/paper/2017/hash/cb8da6767461f2812ae4290eac7cbc42-Abstract.html (accessed on 25 April 2024).
- Kolouri, S.; Nadjahi, K.; Simsekli, U.; Badeau, R.; Rohde, G.K. Generalized Sliced Wasserstein Distances. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R., Eds.; 2019; pp. 261–272. Available online: https://proceedings.neurips.cc/paper/2019/hash/f0935e4cd5920aa6c7c996a5ee53a70f-Abstract.html (accessed on 25 April 2024).
- Deng, S.; Zhang, N.; Kang, J.; Zhang, Y.; Zhang, W.; Chen, H. Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection. In Proceedings of the WSDM ’20: The Thirteenth ACM International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; Caverlee, J., Hu, X.B., Lalmas, M., Wang, W., Eds.; Association for Computing Machinery: New York, NY, USA, 2020; pp. 151–159. [Google Scholar] [CrossRef]
- Zhang, D.; Nan, F.; Wei, X.; Li, S.W.; Zhu, H.; McKeown, K.; Nallapati, R.; Arnold, A.O.; Xiang, B. Supporting Clustering with Contrastive Learning. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 5419–5430. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
- Wang, S.; Yu, M.; Huang, L. The Art of Prompting: Event Detection based on Type Specific Prompts. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 1286–1299. [Google Scholar] [CrossRef]
| Dataset | ACE-2005 |  |  | FewShotED |  |  |
|---|---|---|---|---|---|---|
| Seen | 17 | 2316 | 565 | 50 | 40,893 | 1565 |
| Unseen | 16 | 1489 | 463 | 50 | 33,439 | 1907 |
| Total | 33 | 3805 | 1028 | 100 | 74,332 | 3472 |
| Mean |  | 115.30 |  |  | 743.32 |  |
| Stdev |  | 206.32 |  |  | 2828.47 |  |
| Model (ACE-2005) | F1-Seen | F1-Unseen | NMI | FM |
|---|---|---|---|---|
| SCCL | 0.5999 | 0.3190 | 0.3259 | 0.2403 |
| SS-VQ-VAE | 0.6988 | 0.3509 | 0.2515 | 0.4269 |
| BERT-OCL | 0.6040 | 0.3751 | 0.4532 | 0.2551 |
| ZEOP | 0.7771 | 0.4591 | 0.3797 | 0.4913 |
| APEX Prompt | 0.7490 | 0.5530 | - | - |
| COPE | 0.7904 | 0.5803 | 0.4952 | 0.5097 |

| Model (FewShotED) | F1-Seen | F1-Unseen | NMI | FM |
|---|---|---|---|---|
| SCCL | 0.8717 | 0.3640 | 0.2647 | 0.3462 |
| SS-VQ-VAE | 0.9208 | 0.4364 | 0.1722 | 0.5762 |
| BERT-OCL | 0.9017 | 0.2160 | 0.4157 | 0.1894 |
| ZEOP | 0.9306 | 0.5814 | 0.4831 | 0.7139 |
| APEX Prompt | 0.9327 | 0.6371 | - | - |
| COPE | 0.9466 | 0.6771 | 0.5392 | 0.7298 |
| Model (ACE-2005) | F1-Seen | F1-Unseen | NMI | FM |
|---|---|---|---|---|
| COPE | 0.7904 | 0.5803 | 0.4952 | 0.5097 |
| w/o OCL | 0.8166 | 0.5537 | 0.3494 | 0.4316 |
| only trigger recognition | 0.7587 | 0.4305 | 0.4592 | 0.5138 |
| only sentence embedding | 0.7362 | 0.5544 | 0.3129 | 0.3351 |
| w/o contrastive fusion | 0.7803 | 0.5695 | 0.3797 | 0.4913 |

| Model (FewShotED) | F1-Seen | F1-Unseen | NMI | FM |
|---|---|---|---|---|
| COPE | 0.9466 | 0.6771 | 0.5392 | 0.7298 |
| w/o OCL | 0.9581 | 0.6351 | 0.4783 | 0.5493 |
| only trigger recognition | 0.9207 | 0.5668 | 0.5147 | 0.6816 |
| only sentence embedding | 0.9019 | 0.6122 | 0.4659 | 0.7246 |
| w/o contrastive fusion | 0.9389 | 0.6503 | 0.4831 | 0.7139 |
| Template | ACE-2005 F1-Seen | ACE-2005 F1-Unseen | FewShotED F1-Seen | FewShotED F1-Unseen |
|---|---|---|---|---|
| Searching for relationship tokens |  |  |  |  |
| <event mention>[MASK]. | 0.5641 | 0.2402 | 0.7741 | 0.3508 |
| <event mention> is [MASK]. | 0.6013 | 0.3194 | 0.7831 | 0.4296 |
| <event mention> mean [MASK]. | 0.6223 | 0.3862 | 0.7826 | 0.4996 |
| <event mention> means [MASK]. | 0.6959 | 0.4824 | 0.7988 | 0.5368 |
| Searching for prefix tokens |  |  |  |  |
| This <event mention> means [MASK]. | 0.7670 | 0.4887 | 0.9235 | 0.6138 |
| This event of <event mention> means [MASK]. | 0.7766 | 0.5365 | 0.9291 | 0.6355 |
| This event of “<event mention>” means [MASK]. | 0.7791 | 0.5487 | 0.9324 | 0.6426 |
| This event: “<event mention>” means [MASK]. | 0.7953 | 0.5812 | 0.9418 | 0.6828 |
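The best-performing template in the study above can be instantiated with simple string formatting. A minimal sketch (the helper name `build_er_prompt` and the example mention are ours, not from the paper; the masked-language-model step that predicts `[MASK]` to obtain the sentence embedding is outside this sketch):

```python
def build_er_prompt(event_mention: str) -> str:
    """Fill the best-scoring prompt template from the prompt-effect study.

    The [MASK] token is later predicted by the masked language model to
    produce the event sentence embedding.
    """
    return f'This event: "{event_mention}" means [MASK].'

# illustrative event mention
prompt = build_er_prompt("He was arrested by the police")
```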
Li, J.; Ge, B.; Xu, H.; Huang, P.; Huang, H. Learning Event Representations for Zero-Shot Detection via Dual-Contrastive Prompting. Mathematics 2024, 12, 1372. https://doi.org/10.3390/math12091372