Advancing Quality Assessment in Vertical Field: Scoring Calculation for Text Inputs to Large Language Models
Abstract
1. Introduction
2. VFS Algorithm Model
3. VFS Algorithm Analysis
3.1. Prompt Word Score D1
3.2. Text Structure Score D2
3.3. Content Relevance Score D3
3.3.1. Vectorized Lexicon Module
3.3.2. Unrelated Vector Screening Module
3.3.3. D3 Calculation Module
4. Experimental Results
4.1. Evaluation Index V-L
4.2. Result Analysis
5. Conclusions
- Building more structured input models on the existing foundation.
- Continuously optimizing the details of the indicators so that the score S more accurately reflects input quality.
- Applying the method in specialized large language models for specific vertical fields.
- Building knowledge bases in other fields to promote the general applicability of the algorithm.
- Conducting a more detailed analysis of input texts in the field of cybersecurity, including examination of the hardware information and related parameters required for specific issues, and integrating advanced algorithms to enhance the analysis results.
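The indicator-based scoring these points refer to builds on the three sub-scores defined in Section 3 (the prompt word score D1, the text structure score D2, and the content relevance score D3). A minimal sketch of how such sub-scores could be combined into an overall score S is shown below; the weighted-sum form, the weight values, and the function name are illustrative assumptions for this sketch, not the paper's exact formula.

```python
# Illustrative sketch only: combines three sub-scores (D1 prompt word,
# D2 text structure, D3 content relevance) into a composite score S.
# The linear weighting and the default weights are assumptions, not the
# formula from the paper.

def composite_score(d1: float, d2: float, d3: float,
                    weights: tuple[float, float, float] = (0.3, 0.3, 0.4)) -> float:
    """Blend sub-scores in [0, 1] into an overall score S in [0, 1]."""
    for d in (d1, d2, d3):
        if not 0.0 <= d <= 1.0:
            raise ValueError("sub-scores must lie in [0, 1]")
    w1, w2, w3 = weights
    # Weights are assumed to sum to 1 so S stays in [0, 1].
    return w1 * d1 + w2 * d2 + w3 * d3

print(composite_score(0.8, 0.6, 0.9))
```

Tuning the weights per vertical field (for example, weighting D3 more heavily in cybersecurity, where relevance to a known threat vocabulary matters most) would be one way to pursue the optimization goal above.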
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Question Type | Quantity | Content |
|---|---|---|
| Information protection | 72 | Data tampering, theft, destruction, information leakage, unauthorized access, etc. |
| Network defense | 103 | DDoS attacks, firewall vulnerabilities, protocol vulnerabilities, IPS configuration issues, etc. |
| Application security | 66 | SQL injection, cross-site scripting (XSS), MFA anomalies, etc. |
| User education | 32 | Phishing emails, malicious links, scam messages, etc. |
| Physical security | 53 | Server failures, data center protection, computer hardware failures, etc. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yi, J.-K.; Yao, Y.-F. Advancing Quality Assessment in Vertical Field: Scoring Calculation for Text Inputs to Large Language Models. Appl. Sci. 2024, 14, 6955. https://doi.org/10.3390/app14166955