Promoting Unified Generative Framework with Descriptive Prompts for Joint Multi-Intent Detection and Slot Filling
Abstract
1. Introduction
- We developed a prompt construction method with instructional descriptions to enrich the semantics of both intent and slot labels. The approach harnesses the power of PLMs to refine the prompt template while exploiting the relations between tasks, as showcased in [15].
- To model the correlation between specific intents and slot values, we introduced an auxiliary task called intent-driven slot filling. It encourages PLMs to capture the inherent correlations between intents and slots, thereby enhancing the overall performance of both ID and SF.
- We conducted extensive experiments on two multi-intent datasets, comparing our method with current state-of-the-art (SOTA) methods to demonstrate its effectiveness.
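The descriptive-prompt idea in the first contribution can be illustrated with a minimal sketch. The template wording, the label descriptions, and the `build_prompt` helper below are hypothetical stand-ins, not the paper's exact prompts:

```python
# Hypothetical sketch: assembling a descriptive prompt for a generative
# PLM (e.g., T5) that jointly asks for intents and slot values.
INTENT_DESCRIPTIONS = {
    "atis_meal": "meal (snacks, meals, and beverages served on flights)",
    "atis_airfare": "airfare (ticket prices and fare codes)",
}

def build_prompt(utterance: str) -> str:
    # Join all label descriptions into one option list.
    options = "; ".join(INTENT_DESCRIPTIONS.values())
    return (
        f"sentence: {utterance} "
        f"Given the intent options [{options}], "
        "list every intent expressed in the sentence, "
        "then extract the slot values for each intent."
    )

prompt = build_prompt("show me flights with snacks from denver to boston")
```

Feeding such a prompt to a text-to-text PLM lets the model emit intents and slot values as plain text, so multi-intent detection and slot filling share one generation target.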
2. Related Work
2.1. Natural Language Understanding
2.2. Joint Model in Natural Language Understanding
2.3. Prompt Learning
3. Unified Generative Framework with Descriptive Prompt
3.1. Task Formulation
3.2. Framework Overview
3.3. Label Semantic Description
3.4. Intent-Driven Slot Filling
4. Experiments
4.1. Datasets and Settings
4.2. Overall Comparison Results
4.3. Comparison in Few-Shot Scenario
4.4. Ablation Study
4.5. Case Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Gao, Q.; Dong, G.; Mou, Y.; Wang, L.; Zeng, C.; Guo, D.; Sun, M.; Xu, W. Exploiting domain-slot related keywords description for few-shot cross-domain dialogue state tracking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 2 December 2022; pp. 2460–2465.
- Wang, Y.; Shen, Y.; Jin, H. A Bi-model based RNN semantic frame parsing model for intent detection and slot filling. In Proceedings of the North American Chapter of the Association for Computational Linguistics, New Orleans, LA, USA, 1–6 June 2018; pp. 309–314.
- E, H.; Niu, P.; Chen, Z.; Song, M. A novel bi-directional interrelated model for joint intent detection and slot filling. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July 2019; pp. 5467–5471.
- Qin, L.; Che, W.; Li, Y.; Wen, H.; Liu, T. A stack-propagation framework with token-level intent detection for spoken language understanding. arXiv 2019, arXiv:1909.02188.
- Gangadharaiah, R.; Narayanaswamy, B. Joint multiple intent detection and slot labeling for goal-oriented dialog. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; pp. 564–569.
- Qin, L.; Xu, X.; Che, W.; Liu, T. AGIF: An adaptive graph-interactive framework for joint multiple intent detection and slot filling. arXiv 2020, arXiv:2004.10087.
- Qin, L.; Wei, F.; Xie, T.; Xu, X.; Che, W.; Liu, T. GL-GIN: Fast and accurate non-autoregressive model for joint multiple intent detection and slot filling. In Proceedings of the Annual Meeting of the Association for Computational Linguistics and International Joint Conference on Natural Language Processing, Online, 1–6 August 2021; pp. 178–188.
- Chen, L.; Zhou, P.; Zou, Y. Joint multiple intent detection and slot filling via self-distillation. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, 7–13 May 2022; pp. 7612–7616.
- Wu, Y.; Wang, H.Q.; Zhang, D.; Chen, G.; Zhang, H. Incorporating instructional prompts into a unified generative framework for joint multiple intent detection and slot filling. In Proceedings of the International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 7203–7208.
- Song, F.; Huang, L.; Wang, H. A unified framework for multi-intent spoken language understanding with prompting. arXiv 2022, arXiv:2210.03337.
- Zhang, Q.; Wang, S.; Li, J. A heterogeneous interaction graph network for multi-intent spoken language understanding. Neural Process. Lett. 2023, 55, 9483–9501.
- Cheng, L.; Yang, W.; Jia, W. A scope sensitive and result attentive model for multi-intent spoken language understanding. arXiv 2022, arXiv:2211.12220.
- Liu, P.; Yuan, W.; Fu, J.; Jiang, Z.; Hayashi, H.; Neubig, G. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 2023, 55, 1–35.
- Wang, L.; Li, R.; Yan, Y.; Yan, Y.; Wang, S.; Wu, W.Y.; Xu, W. InstructionNER: A multi-task instruction-based generative framework for few-shot NER. arXiv 2022, arXiv:2203.03903.
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Chi, E.; Le, Q.; Zhou, D. Chain of thought prompting elicits reasoning in large language models. arXiv 2022, arXiv:2201.11903.
- Firdaus, M.; Bhatnagar, S.; Ekbal, A.; Bhattacharyya, P. Intent detection for spoken language understanding using a deep ensemble model. In PRICAI 2018: Trends in Artificial Intelligence; Geng, X., Kang, B.H., Eds.; Springer: Cham, Switzerland, 2018; pp. 629–642.
- Xia, C.; Zhang, C.; Yan, X.; Chang, Y.; Yu, P. Zero-shot user intent detection via capsule neural networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing; Riloff, E., Chiang, D., Hockenmaier, J., Tsujii, J., Eds.; Association for Computational Linguistics: Brussels, Belgium, 2018; pp. 3090–3099.
- Shin, Y.; Yoo, K.M.; Lee, S.-g. Slot filling with delexicalized sentence generation. In Proceedings of Interspeech, Hyderabad, India, 2–6 September 2018; pp. 2082–2086.
- Wu, J.; Banchs, R.E.; D’Haro, L.F.; Krishnaswamy, P.; Chen, N. Attention-based semantic priming for slot-filling. In Proceedings of the Seventh Named Entities Workshop; Chen, N., Banchs, R.E., Duan, X., Zhang, M., Li, H., Eds.; Association for Computational Linguistics: Melbourne, Australia, 2018; pp. 22–26.
- Zhu, S.; Yu, K. Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, USA, 5–9 March 2017; pp. 5675–5679.
- Qiu, L.; Ding, Y.; He, L. Recurrent neural networks with pre-trained language model embedding for slot filling task. arXiv 2018, arXiv:1812.05199.
- Ding, Z.; Yang, Z.; Lin, H.; Wang, J. Focus on interaction: A novel dynamic graph model for joint multiple intent detection and slot filling. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 26 August 2021; pp. 3801–3807.
- Kim, B.; Ryu, S.; Lee, G.G. Two-stage multi-intent detection for spoken language understanding. Multimed. Tools Appl. 2017, 76, 11377–11390.
- Kumar, A.; Tripathi, R.K.; Vepa, J. Low resource pipeline for spoken language understanding via weak supervision. arXiv 2022, arXiv:2206.10559.
- Yang, F.; Zhou, X.; Wang, Y.; Atawulla, A.; Bi, R. Diversity features enhanced prototypical network for few-shot intent detection. In Proceedings of the International Joint Conference on Artificial Intelligence, Vienna, Austria, 23–29 July 2022; pp. 4447–4453.
- Hou, Y.; Chen, C.; Luo, X.; Li, B.; Che, W. Inverse is better! Fast and accurate prompt for few-shot slot tagging. In Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; pp. 637–647.
- Wang, Y.; Mei, J.; Zou, B.; Fan, R.; He, T.; Aw, A.T. Making pre-trained language models better learn few-shot spoken language understanding in more practical scenarios. In Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; Rogers, A., Boyd-Graber, J., Okazaki, N., Eds.; pp. 13508–13523.
- Hou, Y.; Lai, Y.; Wu, Y.; Che, W.; Liu, T. Few-shot learning for multi-label intent detection. In Proceedings of the AAAI Conference on Artificial Intelligence, 2021; Volume 35, pp. 13036–13044.
- Zhang, F.; Chen, W.; Ding, F.; Wang, T. Dual class knowledge propagation network for multi-label few-shot intent detection. In Proceedings of the Annual Meeting of the Association for Computational Linguistics; Rogers, A., Boyd-Graber, J., Okazaki, N., Eds.; Association for Computational Linguistics, 2023; pp. 8605–8618.
- Qin, L.; Xie, T.; Che, W.; Liu, T. A survey on spoken language understanding: Recent advances and new frontiers. arXiv 2021, arXiv:2103.03095.
- Zhang, X.; Wang, H. A joint model of intent determination and slot filling for spoken language understanding. In Proceedings of the International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016; pp. 2993–2999.
- Xing, B.; Tsang, I. Co-guiding net: Achieving mutual guidances between multiple intent detection and slot filling via heterogeneous semantics-label graphs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 11 December 2022; pp. 159–169.
- Zhu, Z.; Xu, W.; Cheng, X.; Song, T.; Zou, Y. A dynamic graph interactive framework with label-semantic injection for spoken language understanding. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
- Song, M.; Yu, B.; Quangang, L.; Yubin, W.; Liu, T.; Xu, H. Enhancing joint multiple intent detection and slot filling with global intent-slot co-occurrence. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 11 December 2022; pp. 7967–7977.
- Hou, Y.; Lai, Y.; Chen, C.; Che, W.; Liu, T. Learning to bridge metric spaces: Few-shot joint learning of intent detection and slot filling. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online, 1–6 August 2021; pp. 3190–3200.
- Cai, F.; Zhou, W.; Mi, F.; Faltings, B. SLIM: Explicit slot-intent mapping with BERT for joint multi-intent detection and slot filling. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, 7–13 May 2022; pp. 7607–7611.
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. In Proceedings of Advances in Neural Information Processing Systems, Online, 6–12 December 2020; Volume 33, pp. 1877–1901.
- Gao, T.; Fisch, A.; Chen, D. Making pre-trained language models better few-shot learners. arXiv 2020, arXiv:2012.15723.
- Jin, F.; Lu, J.; Zhang, J.; Zong, C. Instance-aware prompt learning for language understanding and generation. arXiv 2022, arXiv:2201.07126.
- Coucke, A.; Saade, A.; Ball, A.; Bluche, T.; Caulier, A.; Leroy, D.; Doumouro, C.; Gisselbrecht, T.; Caltagirone, F.; Lavril, T. Snips voice platform: An embedded spoken language understanding system for private-by-design voice interfaces. arXiv 2018, arXiv:1805.10190.
- Hemphill, C.T.; Godfrey, J.J.; Doddington, G.R. The ATIS spoken language systems pilot corpus. In Proceedings of a Workshop Held at Hidden Valley, PA, USA, 24–27 June 1990; pp. 24–27.
| Original Intent | Intent with Semantic Description |
|---|---|
| atis_abbreviation | abbreviation (shortened forms of words or phrases) |
| atis_airport | airport (airport) |
| atis_city | city (from somewhere to somewhere, such as cities, locations) |
| atis_capacity | capacity (such as seats) |
Q: In common understanding, when the word “snacks” appears in a sentence, there is a high probability that the intent is “atis_meal”. Consequently, the semantic information of labels can be enhanced through semantic expansion.
R: ⋯ For instance, you can gather words, phrases, or concepts related to meals, such as “airline meals”, “pilot meals”, “special diets”, “appetizers”, “beverages”, “snacks”, and so on, and add them to the semantic information of the “atis_meal” intent label. ⋯
Q: According to the description above, the intent label is “atis_meal”. Please return an optimized semantic expansion of the label (separated by a colon).
R: Sure, here is an example of the expanded “atis_meal” label based on the semantic information: “atis_meal”: snacks, meals, and beverages served on flights.
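The two-turn exchange above can be scripted. In the sketch below, `ask_llm` is a placeholder stub standing in for a real chat-model call (not an actual API), and the parsing keeps only the expansion after the colon, following the reply format shown:

```python
def ask_llm(query: str) -> str:
    # Stub for a real chat-model call; returns a reply in the
    # '"label": expansion' format seen in the dialogue above.
    return '"atis_meal": snacks, meals, and beverages served on flights.'

def expand_label(label: str) -> str:
    query = (
        f'According to the description above, the intent label is "{label}". '
        "Please return an optimized semantic expansion of the label "
        "(separated by a colon)."
    )
    reply = ask_llm(query)
    # Keep only the expansion text after the first colon.
    return reply.split(":", 1)[1].strip()

expansion = expand_label("atis_meal")
```

The extracted expansion can then be attached to the label when building prompts, enriching its semantics beyond the raw label string.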
| Original Slot | Baseline | Slot with Semantic Descriptions |
|---|---|---|
| flight_number | flight number | flight number |
| fromloc.city_name | from location.city name | from location.city name (slot value before the word ‘to’) |
| flight_mod | flight mod | flight mod (slot value excluding ‘day of the week’) |
| depart_date.today_relative | depart date.today relative | depart date.today relative |
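A slot verbalizer along the lines of this table can be sketched as follows; the lookup tables and the `verbalize_slot` helper are illustrative assumptions, not the paper's implementation:

```python
# Hand-written baseline readings for slot labels (from the table above).
BASELINE = {
    "flight_number": "flight number",
    "fromloc.city_name": "from location.city name",
    "flight_mod": "flight mod",
    "depart_date.today_relative": "depart date.today relative",
}

# Optional disambiguating hints added only where a slot is ambiguous.
SLOT_HINTS = {
    "fromloc.city_name": "slot value before the word 'to'",
    "flight_mod": "slot value excluding 'day of the week'",
}

def verbalize_slot(label: str) -> str:
    # Fall back to a naive underscore-to-space expansion for labels
    # missing from the baseline map.
    readable = BASELINE.get(label, label.replace("_", " "))
    hint = SLOT_HINTS.get(label)
    return f"{readable} ({hint})" if hint else readable
```

Keeping hints separate from baseline readings makes it easy to ablate the semantic descriptions (the LSD component) without touching the rest of the prompt.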
| | MixATIS | MixSNIPS |
|---|---|---|
| Train (#) | 13,161 | 39,776 |
| Val (#) | 759 | 2198 |
| Test (#) | 828 | 2199 |
| Intent (#) | 18 | 7 |
| Slot (#) | 78 | 39 |
Experimental Settings
| Setting | Value | Setting | Value |
|---|---|---|---|
| Backbone | T5-base | Hidden Size | 768 |
| Optimizer | Adam | Learning Rate | |
| Batch Size | 4 | No. of Epochs | 30 |
| Model | S-F1 (MixATIS) | I-Acc (MixATIS) | O-Acc (MixATIS) | S-F1 (MixSNIPS) | I-Acc (MixSNIPS) | O-Acc (MixSNIPS) |
|---|---|---|---|---|---|---|
| Bi-Model [2] | 83.9 | 70.3 | 34.4 | 90.7 | 95.6 | 63.4 |
| SF_ID [3] | 87.4 | 66.2 | 34.9 | 90.6 | 95.0 | 59.9 |
| Stack-Propagation [4] | 87.8 | 72.1 | 40.1 | 94.2 | 96.0 | 72.9 |
| Joint Learning [5] | 84.6 | 73.4 | 36.1 | 90.6 | 95.1 | 62.9 |
| AGIF [6] | 86.7 | 74.4 | 40.8 | 94.2 | 95.1 | 74.2 |
| GL-GIN [7] | 88.3 | 76.3 | 43.5 | 94.9 | 95.6 | 75.4 |
| SDJN [8] | 88.2 | 77.1 | 44.6 | 94.4 | 96.5 | 75.5 |
| DGIF [33] | 88.5 | 83.3 | 50.7 | 95.9 | 97.8 | 84.3 |
| PromptSLU [10] | 89.6 | 85.8 | 57.2 | 96.5 | 97.5 | 84.8 * |
| UGEN [9] | 89.2 | 83.0 | 55.3 | 95.0 | 96.9 | 78.8 |
| UGen-DP | 90.3 | 86.2 | 58.7 | 96.6 | 97.6 | 84.7 |
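For reference, the metrics in these tables are commonly defined as follows: S-F1 is chunk-level slot F1 (e.g., as computed by the seqeval library), I-Acc requires the full predicted intent set to match, and O-Acc (overall accuracy) requires both intents and slots to be exactly right for the whole utterance. The sketch below implements the latter two under these common definitions (an assumption, since the paper's evaluation code is not shown here):

```python
def intent_acc(pred_intents, gold_intents):
    # An utterance counts only if its full multi-intent set matches.
    hits = sum(set(p) == set(g) for p, g in zip(pred_intents, gold_intents))
    return hits / len(gold_intents)

def overall_acc(pred_intents, gold_intents, pred_slots, gold_slots):
    # Sentence-level accuracy: intents AND the slot tag sequence must
    # both be exactly right for the utterance to count.
    hits = sum(
        set(pi) == set(gi) and ps == gs
        for pi, gi, ps, gs in zip(pred_intents, gold_intents,
                                  pred_slots, gold_slots)
    )
    return hits / len(gold_intents)
```

Because O-Acc demands joint correctness, it is the strictest of the three metrics, which is why its absolute values in the tables sit well below S-F1 and I-Acc.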
| Model | S-F1 (5-Shot) | I-Acc (5-Shot) | O-Acc (5-Shot) | S-F1 (10-Shot) | I-Acc (10-Shot) | O-Acc (10-Shot) | S-F1 (10%) | I-Acc (10%) | O-Acc (10%) |
|---|---|---|---|---|---|---|---|---|---|
| * SP [4] | 58.7 | 78.2 | 11.9 | 71.5 | 88.3 | 24.8 | 90.3 | 93.5 | 58.4 |
| AGIF [6] | 60.7 | 77.8 | 14.4 | 73.0 | 86.3 | 27.5 | 91.2 | 93.0 | 62.8 |
| GL-GIN [7] | 54.3 | 86.1 | 10.1 | 69.5 | 90.2 | 23.9 | 92.1 | 95.3 | 66.6 |
| UGEN [9] | 84.2 | 92.4 | 42.5 | 87.4 | 93.3 | 50.5 | 93.6 | 96.0 | 71.7 |
| UGen-DP | 85.9 | 93.2 | 43.9 | 89.3 | 94.1 | 52.2 | 94.1 | 96.2 | 74.4 |
| Model | S-F1 (MixATIS) | I-Acc (MixATIS) | O-Acc (MixATIS) | S-F1 (MixSNIPS) | I-Acc (MixSNIPS) | O-Acc (MixSNIPS) |
|---|---|---|---|---|---|---|
| UGEN | 89.2 | 83.0 | 55.3 | 95.0 | 96.9 | 78.8 |
| w/o LSD or IDSF | 89.5 | 84.1 | 55.9 | 95.7 | 97.1 | 81.2 |
| w/o IDSF | 89.7 | 86.0 | 56.2 | 95.8 | 97.4 | 81.6 |
| w/o LSD | 89.9 | 84.3 | 56.4 | 96.4 | 97.2 | 84.2 |
| UGen-DP | 90.3 | 86.2 | 58.7 | 96.6 | 97.6 | 84.7 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ma, Z.; Qin, J.; Pan, M.; Tang, S.; Mi, J.; Liu, D. Promoting Unified Generative Framework with Descriptive Prompts for Joint Multi-Intent Detection and Slot Filling. Electronics 2024, 13, 1087. https://doi.org/10.3390/electronics13061087