A Quantum Language-Inspired Tree Structural Text Representation for Semantic Analysis
Abstract
1. Introduction
The main contributions of this work are as follows:
- (1) A quantum language-inspired text representation model based on relation entities and a constituency parser is established, capturing both long-range and short-range semantic associations between words.
- (2) The combination of an attention mechanism and an entanglement coefficient reduces the semantic impact of indirect modification relationships between adjacent words and enhances the semantic contribution of direct modification relationships with long-range associations (a schematic sketch follows this list).
- (3) The attention mechanism captures not only the global information of related words in the dictionary but also the local grammatical structure of individual sentences.
- (4) The semantic association between words at variable distances is established by incorporating a dependency parser.
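To make contribution (2) concrete, here is a minimal numpy sketch of one way to build an "entangled" pair embedding from two word vectors: an outer (tensor) product of the normalized vectors, scaled by a coefficient that combines their direction cosine with a part-of-speech weight. The helper names, the coefficient formula, and the flattened outer product are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def normalize(v):
    """Project a word vector onto the unit sphere (cf. Section 3.2.1)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def entanglement_coefficient(wi, wj, pos_weight):
    """Toy coefficient: direction cosine of the two word vectors scaled by
    a part-of-speech combination weight; both quantities appear in the
    paper's notation, but this exact formula is an assumption."""
    return pos_weight * float(np.dot(normalize(wi), normalize(wj)))

def entangled_embedding(wi, wj, coeff):
    """Pair embedding: coefficient-weighted outer product of the normalized
    word vectors, flattened into a single vector in the product space."""
    return coeff * np.outer(normalize(wi), normalize(wj)).ravel()

# Usage: two toy 4-d word vectors with a PoS combination weight of 1.2.
rng = np.random.default_rng(0)
wi, wj = rng.normal(size=4), rng.normal(size=4)
e_ij = entangled_embedding(wi, wj, entanglement_coefficient(wi, wj, 1.2))
print(e_ij.shape)  # (16,): the pair lives in the 4 x 4 product space
```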
2. Related Work
2.1. Attention-Based Semantic Analysis
2.2. Dependency Tree
2.3. Quantum-Based NLP
3. Approaches
3.1. Read Text and Generate Syntax Tree
3.2. Entanglement between Words with Short-Range Modified Relationship
3.2.1. Normalize the Word Vector
3.2.2. Embedding of Entangled Word
3.2.3. Attention Mechanism
3.2.4. Adjacent Words Entanglement-Based Sentence Representation
3.2.5. Sentence Similarity
3.3. Optimize the Sentence Embedding
3.3.1. Entanglement between Words with Long-Range Modified Relationship
3.3.2. Sentence Embedding Based on Constituency Parser and Relation Entity
3.3.3. Reduce Sentence Embedding Dimensions
4. Experimental Settings
4.1. Parameters Definition
4.2. Datasets
4.3. Experimental Settings
Algorithm 1: Framework of sentence embedding based on constituency parser and relation entity for semantic similarity computation.
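As a rough end-to-end illustration of Algorithm 1's shape, the sketch below accumulates depth-weighted entangled pair embeddings into a sentence vector and scores two sentences by cosine similarity. The `embed` and `tree_depth` helpers are deterministic toys standing in for a real embedding table and constituency parser, and the depth-difference weighting is a simplification; none of this is the paper's actual implementation.

```python
import numpy as np
from itertools import combinations

def embed(word):
    """Toy word vector: deterministic stand-in for a pretrained embedding."""
    rng = np.random.default_rng(sum(map(ord, word)))
    return rng.normal(size=8)

def tree_depth(word, words):
    """Fake parse-tree depth; a real system would query the parser."""
    return 1 + words.index(word) % 3

def sentence_embedding(sentence, depth_weights=(1.5, 1.2, 0.8)):
    """Sum entangled pair embeddings over all word pairs, weighting each
    pair by a weight indexed on the tree depth difference."""
    words = sentence.split()
    emb = np.zeros(8 * 8)
    for wi, wj in combinations(words, 2):
        dd = abs(tree_depth(wi, words) - tree_depth(wj, words))
        w = depth_weights[min(dd, len(depth_weights) - 1)]
        vi, vj = embed(wi), embed(wj)
        vi, vj = vi / np.linalg.norm(vi), vj / np.linalg.norm(vj)
        emb += w * np.outer(vi, vj).ravel()  # entangled pair contribution
    n = np.linalg.norm(emb)
    return emb / n if n else emb

def similarity(s1, s2):
    """Cosine similarity of the two (unit-normalized) sentence vectors."""
    return float(np.dot(sentence_embedding(s1), sentence_embedding(s2)))

print(round(similarity("a dog runs fast", "a cat runs fast"), 3))
```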
5. Experimental Results
5.1. Comparing with Some Unsupervised Methods
5.2. Influence of PoS Combination Weight
5.3. Influence of the Tree Depth Difference
5.3.1. On STS-Benchmark
5.3.2. On STS’14.deft-News
5.4. Influence of the Dimension Reduction
5.4.1. Influence on STS’15.Images
5.4.2. Influence on STS’15.Headlines
5.4.3. Summary
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Zhang, Y.; Song, D.; Li, X.; Zhang, P.; Wang, P.; Rong, L.; Yu, G.; Wang, B. A quantum-like multimodal network framework for modeling interaction dynamics in multiparty conversational sentiment analysis. Inf. Fusion 2020, 62, 14–31.
2. Zhang, P.; Niu, J.; Su, Z.; Wang, B.; Ma, L.; Song, D. End-to-end quantum-like language models with application to question answering. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 5666–5673.
3. Zhang, Y.; Song, D.; Zhang, P.; Li, X.; Wang, P. A quantum-inspired sentiment representation model for Twitter sentiment analysis. Appl. Intell. 2018, 49, 3093–3108.
4. Yu, Y.; Qiu, D.; Yan, R. A quantum entanglement-based approach for computing sentence similarity. IEEE Access 2020, 8, 174265–174278.
5. Yu, Y.; Qiu, D.; Yan, R. Quantum entanglement based sentence similarity computation. In Proceedings of the 2020 IEEE International Conference on Progress in Informatics and Computing (PIC 2020), Online, 18–20 December 2020; pp. 250–257.
6. Zhang, Y.; Wang, Y.; Yang, J. Lattice LSTM for Chinese sentence representation. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 1506–1519.
7. Liu, D.; Fu, J.; Qu, Q.; Lv, J. BFGAN: Backward and forward generative adversarial networks for lexically constrained sentence generation. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 2350–2361.
8. Wang, B.; Kuo, C. SBERT-WK: A sentence embedding method by dissecting BERT-based word models. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 2146–2157.
9. Hosseinalipour, A.; Gharehchopogh, F.; Masdari, M.; Khademi, A. Toward text psychology analysis using social spider optimization algorithm. Concurr. Comput. Pract. Exp. 2021, 33, e6325.
10. Hosseinalipour, A.; Gharehchopogh, F.; Masdari, M.; Khademi, A. A novel binary farmland fertility algorithm for feature selection in analysis of the text psychology. Appl. Intell. 2021, 51, 4824–4859.
11. Osmani, A.; Mohasefi, J.; Gharehchopogh, F. Enriched latent Dirichlet allocation for sentiment analysis. Expert Syst. 2020, 37, e12527.
12. Huang, X.; Peng, Y.; Wen, Z. Visual-textual hybrid sequence matching for joint reasoning. IEEE Trans. Cybern. 2020, 51, 5692–5705.
13. Dai, D.; Tang, J.; Yu, Z.; Wong, H.; You, J.; Cao, W.; Hu, Y.; Chen, C. An inception convolutional autoencoder model for Chinese healthcare question clustering. IEEE Trans. Cybern. 2021, 51, 2019–2031.
14. Yin, C.; Tang, J.; Xu, Z.; Wang, Y. Memory augmented deep recurrent neural network for video question answering. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 3159–3167.
15. Mohammadzadeh, H.; Gharehchopogh, F. A multi-agent system based for solving high-dimensional optimization problems: A case study on email spam detection. Int. J. Commun. Syst. 2021, 34, e4670.
16. Li, X.; Jiang, H.; Kamei, Y.; Chen, X. Bridging semantic gaps between natural languages and APIs with word embedding. IEEE Trans. Softw. Eng. 2020, 46, 1081–1097.
17. Osmani, A.; Mohasefi, J.; Gharehchopogh, F. Sentiment classification using two effective optimization methods derived from the artificial bee colony optimization and imperialist competitive algorithm. Comput. J. 2022, 65, 18–66.
18. Li, L.; Jiang, Y. Integrating language model and reading control gate in BLSTM-CRF for biomedical named entity recognition. IEEE/ACM Trans. Comput. Biol. Bioinform. 2020, 17, 841–846.
19. Maragheh, H.K.; Gharehchopogh, F.; Majidzadeh, K.; Sangar, A. A new hybrid based on long short-term memory network with spotted hyena optimization algorithm for multi-label text classification. Mathematics 2022, 10, 488.
20. Choi, H.; Lee, H. Multitask learning approach for understanding the relationship between two sentences. Inf. Sci. 2019, 485, 413–426.
21. Zhang, L.; Luo, M.; Liu, J.; Chang, X.; Yang, Y.; Hauptmann, A. Deep top-k ranking for image-sentence matching. IEEE Trans. Multimed. 2020, 22, 775–785.
22. Huang, F.; Zhang, X.; Zhao, Z.; Li, Z. Bidirectional spatial-semantic attention networks for image-text matching. IEEE Trans. Image Process. 2019, 28, 2008–2020.
23. Ma, Q.; Yu, L.; Tian, S.; Chen, E.; Ng, W. Global-local mutual attention model for text classification. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 2127–2139.
24. Xu, X.; Wang, T.; Yang, Y.; Zuo, L.; Shen, F.; Shen, H. Cross-modal attention with semantic consistence for image-text matching. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 5412–5425.
25. Vaswani, A.; Shazeer, N.; Parmar, N. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
26. Guo, Q.; Qiu, X.; Xue, X.; Zhang, Z. Low-rank and locality constrained self-attention for sequence modeling. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 2213–2222.
27. Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.; Le, Q. XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 5754–5764.
28. Dong, L.; Yang, N.; Wang, W.; Wei, F.; Liu, X.; Wang, Y.; Gao, J.; Zhou, M.; Hon, H. Unified language model pre-training for natural language understanding and generation. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 13042–13054.
29. Bao, H.; Dong, L.; Wei, F.; Wang, W.; Yang, N.; Liu, X.; Wang, Y.; Gao, J.; Piao, S.; Zhou, M.; et al. UniLMv2: Pseudo-masked language models for unified language model pre-training. In Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Online, 13–18 July 2020; pp. 642–652.
30. Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
31. Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A lite BERT for self-supervised learning of language representations. In Proceedings of the 8th International Conference on Learning Representations (ICLR 2020), Addis Ababa, Ethiopia, 26–30 April 2020; pp. 1–16.
32. Conneau, A.; Lample, G. Cross-lingual language model pretraining. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 7057–7067.
33. Peters, M.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018), New Orleans, LA, USA, 1–6 June 2018; pp. 2227–2237.
34. Gharehchopogh, F. Advances in tree seed algorithm: A comprehensive survey. Arch. Comput. Methods Eng. 2022, 1–24.
35. Wang, J.; Yu, L.; Lai, K.; Zhang, X. Tree-structured regional CNN-LSTM model for dimensional sentiment analysis. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 581–591.
36. Shen, M.; Kawahara, D.; Kurohashi, S. Dependency parser reranking with rich subtree features. IEEE/ACM Trans. Audio Speech Lang. Process. 2014, 22, 1208–1218.
37. Luo, H.; Li, T.; Liu, B.; Wang, B.; Unger, H. Improving aspect term extraction with bidirectional dependency tree representation. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 1201–1212.
38. Zhang, J.; Zhai, F.; Zong, C. Syntax-based translation with bilingually lexicalized synchronous tree substitution grammars. IEEE Trans. Audio Speech Lang. Process. 2013, 21, 1586–1597.
39. Chen, W.; Zhang, M.; Zhang, Y. Distributed feature representations for dependency parsing. IEEE/ACM Trans. Audio Speech Lang. Process. 2015, 23, 451–460.
40. Geng, Z.; Chen, G.; Han, Y.; Lu, G.; Li, F. Semantic relation extraction using sequential and tree-structured LSTM with attention. Inf. Sci. 2020, 509, 183–192.
41. Fei, H.; Ren, Y.; Ji, D. A tree-based neural network model for biomedical event trigger detection. Inf. Sci. 2020, 512, 175–185.
42. Cao, Q.; Liang, X.; Li, B.; Lin, L. Interpretable visual question answering by reasoning on dependency trees. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 887–901.
43. Wu, Y.; Zhao, S.; Li, W. Phrase2Vec: Phrase embedding based on parsing. Inf. Sci. 2020, 517, 100–127.
44. Widdows, D.; Cohen, T. Graded semantic vectors: An approach to representing graded quantities in generalized quantum models. In Proceedings of the Quantum Interaction—9th International Conference (QI 2015), Filzbach, Switzerland, 15–17 July 2015; Volume 9535, pp. 231–244.
45. Aerts, D.; Sozzo, S. Entanglement of conceptual entities in quantum model theory (QMod). In Proceedings of the Quantum Interaction—6th International Symposium (QI 2012), Paris, France, 27–29 June 2012; Volume 7620, pp. 114–125.
46. Nguyen, N.; Behrman, E.; Moustafa, M.; Steck, J. Benchmarking neural networks for quantum computations. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 2522–2531.
47. Sordoni, A.; Nie, J.; Bengio, Y. Modeling term dependencies with quantum language models for IR. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '13), Dublin, Ireland, 28 July–1 August 2013; pp. 653–662.
48. Cohen, T.; Widdows, D. Embedding probabilities in predication space with Hermitian holographic reduced representations. In Proceedings of the Quantum Interaction—9th International Conference (QI 2015), Filzbach, Switzerland, 15–17 July 2015; Volume 9535, pp. 245–257.
49. Yuan, K.; Xu, W.; Li, W.; Ding, W. An incremental learning mechanism for object classification based on progressive fuzzy three-way concept. Inf. Sci. 2022, 584, 127–147.
50. Xu, W.; Yuan, K.; Li, W. Dynamic updating approximations of local generalized multigranulation neighborhood rough set. Appl. Intell. 2022.
51. Xu, W.; Yu, J. A novel approach to information fusion in multi-source datasets: A granular computing viewpoint. Inf. Sci. 2017, 378, 410–423.
52. Xu, W.; Li, W. Granular computing approach to two-way learning based on formal concept analysis in fuzzy datasets. IEEE Trans. Cybern. 2016, 46, 366–379.
53. Hou, Y.; Zhao, X.; Song, D.; Li, W. Mining pure high-order word associations via information geometry for information retrieval. ACM Trans. Inf. Syst. 2013, 31, 1–12.
54. Xie, M.; Hou, Y.; Zhang, P.; Li, J.; Li, W.; Song, D. Modeling quantum entanglements in quantum language models. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, Argentina, 25–31 July 2015; pp. 1362–1368.
55. Aerts, D.; Beltran, L.; Bianchi, M.; Sozzo, S.; Veloz, T. Quantum cognition beyond Hilbert space: Fundamentals and applications. In Proceedings of the Quantum Interaction—10th International Conference (QI 2016), San Francisco, CA, USA, 20–22 July 2016; Volume 10106, pp. 81–98.
56. Zhang, Y.; Song, D.; Zhang, P.; Wang, P.; Li, J.; Li, X.; Wang, B. A quantum-inspired multimodal sentiment analysis framework. Theor. Comput. Sci. 2018, 752, 21–40.
57. Zhang, Y.; Li, Q.; Song, D.; Zhang, P.; Wang, P. Quantum-inspired interactive networks for conversational sentiment analysis. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), Macao, China, 10–16 August 2019; pp. 5436–5442.
58. Aerts, D.; Arguelles, J.; Beltran, L.; Distrito, I.; Bianchi, M.; Sozzo, S.; Veloz, T. Context and interference effects in the combinations of natural concepts. In Proceedings of the Modeling and Using Context—10th International and Interdisciplinary Conference (CONTEXT 2017), Paris, France, 20–23 July 2017; Volume 10257, pp. 677–690.
59. Galofaro, F.; Toffano, Z.; Doan, B. A quantum-based semiotic model for textual semantics. Kybernetes 2018, 47, 307–320.
60. Agirre, E.; Cer, D.; Diab, M.; Gonzalez-Agirre, A. SemEval-2012 task 6: A pilot on semantic textual similarity. In Proceedings of the 6th International Workshop on Semantic Evaluation, Montreal, QC, Canada, 7–8 June 2012; pp. 385–393.
61. Agirre, E.; Banea, C.; Cardie, C.; Cer, D.; Diab, M.T.; Gonzalez-Agirre, A.; Guo, W.; Mihalcea, R.; Rigau, G.; Wiebe, J. SemEval-2014 task 10: Multilingual semantic textual similarity. In Proceedings of the 8th International Workshop on Semantic Evaluation, Dublin, Ireland, 23–24 August 2014; pp. 81–91.
62. Agirre, E.; Banea, C.; Cardie, C.; Cer, D.; Diab, M.; Gonzalez-Agirre, A.; Guo, W.; Lopez-Gazpio, I.; Maritxalar, M.; Mihalcea, R.; et al. SemEval-2015 task 2: Semantic textual similarity, English, Spanish and pilot on interpretability. In Proceedings of the 9th International Workshop on Semantic Evaluation, Denver, CO, USA, 4–5 June 2015; pp. 252–263.
63. Cer, D.; Diab, M.; Agirre, E.; Lopez-Gazpio, I.; Specia, L. SemEval-2017 task 1: Semantic textual similarity multilingual and cross-lingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation, Vancouver, BC, Canada, 3–4 August 2017; pp. 1–14.
64. Gao, T.; Yao, X.; Chen, D. SimCSE: Simple contrastive learning of sentence embeddings. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Online, 10–11 November 2021; pp. 6894–6910.
65. Zhang, Y.; He, R.; Liu, Z.; Lim, K.; Bing, L. An unsupervised sentence embedding method by mutual information maximization. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), Online, 16–20 November 2020; pp. 1601–1610.
66. Li, B.; Zhou, H.; He, J.; Wang, M.; Yang, Y.; Li, L. On the sentence embeddings from pre-trained language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), Online, 16–20 November 2020; pp. 9119–9130.
67. Schick, T.; Schütze, H. Generating datasets with pretrained language models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Online, 10–11 November 2021; pp. 6943–6951.
68. Reimers, N.; Gurevych, I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, China, 2–7 November 2019; pp. 3982–3992.
69. Quan, Z.; Wang, Z.; Le, Y. An efficient framework for sentence similarity modeling. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 853–865.
70. Wang, S.; Zhang, J.; Zong, C. Learning sentence representation with guidance of human attention. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), Melbourne, Australia, 19–25 August 2017; pp. 4137–4143.
| Symbol | Description |
|---|---|
| — | word embedding of the ith word |
| — | entangled word embedding of the ith word and the jth word |
| — | sentence embedding |
| — | direction cosine between the word embeddings of the ith word and the jth word |
| — | part-of-speech combination weight of the ith word and the jth word |
| — | depth difference between two words in the parse tree |
| — | weight of the depth difference between the ith word and the jth word |
| — | entanglement coefficient between the ith word and the jth word |
| y | sentence similarity annotated by humans |
| x | sentence similarity calculated by the proposed model |
| — | threshold value of annotated sentence similarity |
| — | threshold value of relative error |
| — | relative error between the experimental result and the annotated score of sentence similarity |
| D | dimensionality of the sentence representation |
| — | sentence dimension reduced at the level of sentence embedding |
| — | sentence dimension reduced at the level of entangled word embedding |
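The evaluation quantities in this table reduce to simple computations over the model similarities x and the human annotations y. A minimal sketch, assuming both are scaled to [0, 1] and that a pair counts as correct when its relative error stays below a threshold (the threshold value and the exact accuracy rule are assumptions, not the paper's definitions):

```python
import numpy as np
from scipy.stats import pearsonr

def evaluate(x, y, err_threshold=0.1):
    """Pearson correlation (Pcc), mean squared error, and the fraction of
    sentence pairs whose relative error is below `err_threshold`."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    pcc = pearsonr(x, y)[0]
    mse = float(np.mean((x - y) ** 2))
    rel_err = np.abs(x - y) / np.maximum(y, 1e-9)  # relative error per pair
    return pcc, mse, float(np.mean(rel_err < err_threshold))

print(evaluate([0.80, 0.40, 0.90], [0.75, 0.50, 0.95]))
```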
| STS’12 | STS’14 | STS’15 | Total |
|---|---|---|---|
| MSRvid (750) | deft-forum (450) | answers-forums (375) | STSb (4225) |
| SMTeuroparl (459) | deft-news (300) | answers-students (750) | STS’12 (3108) |
| OnWN (750) | headlines (750) | belief (375) | STS’14 (3750) |
| MSRpar (750) | images (750) | images (750) | STS’15 (3000) |
| SMTnews (399) | tweet-news (750) | headlines (750) | |
| | OnWN (750) | | |
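For orientation, the STS-Benchmark files are distributed as tab-separated rows whose last three fields are the gold score (on a 0–5 scale) and the two sentences; the sketch below loads pairs under that assumption and rescales gold scores to [0, 1] to match the similarity range used elsewhere. The path in the usage comment is illustrative.

```python
import csv

def load_sts_pairs(path):
    """Read (sentence1, sentence2, gold/5) triples from an STS-style TSV."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
            if len(row) < 3:
                continue
            score, s1, s2 = row[-3], row[-2], row[-1]
            pairs.append((s1, s2, float(score) / 5.0))
    return pairs

# pairs = load_sts_pairs("stsbenchmark/sts-dev.csv")  # hypothetical path
```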
| Model | STS’12 | STS’14 | STS’15 | STSb | Avg. |
|---|---|---|---|---|---|
| SimCSE-base [64] | 0.702 | 0.732 | 0.814 | 0.802 | 0.763 |
| IS-BERT-NLI [65] | 0.568 | 0.630 | 0.752 | 0.692 | 0.661 |
| BERT-flow [66] | 0.652 | 0.694 | 0.749 | 0.723 | 0.705 |
| DINO [67] | 0.703 | 0.713 | 0.805 | 0.778 | 0.750 |
| SBERT-base [68] | 0.710 | 0.732 | 0.791 | 0.770 | 0.751 |
| Proposed model | 0.641 | 0.774 | 0.844 | 0.830 | 0.772 |
| Dataset | ACVT [69] | — [70] | — [70] | — [4] | Proposed model |
|---|---|---|---|---|---|
| 12’MSRpar | 0.58 | 0.58 | 0.50 | 0.59 | 0.71 |
| 12’MSRvid | 0.83 | 0.83 | 0.85 | 0.90 | 0.92 |
| 12’SMTeuroparl | 0.43 | 0.52 | 0.52 | 0.69 | 0.74 |
| 12’OnWN | 0.70 | 0.73 | 0.73 | 0.84 | 0.82 |
| 12’SMTnews | 0.54 | 0.66 | 0.67 | 0.78 | 0.81 |
| STS’12 | 0.62 | 0.66 | 0.65 | 0.76 | 0.80 |
| 14’deft-forum | 0.48 | 0.54 | 0.56 | 0.66 | 0.68 |
| 14’deft-news | 0.74 | 0.74 | 0.76 | 0.78 | 0.75 |
| 14’headlines | 0.72 | 0.72 | 0.72 | 0.81 | 0.82 |
| 14’images | 0.81 | 0.81 | 0.83 | 0.87 | 0.87 |
| 14’OnWN | 0.87 | 0.87 | 0.85 | 0.92 | 0.93 |
| 14’tweet-news | 0.75 | 0.82 | 0.79 | 0.82 | 0.90 |
| STS’14 | 0.73 | 0.75 | 0.75 | 0.81 | 0.82 |
| 15’answers-forums | 0.69 | 0.69 | 0.69 | 0.86 | 0.88 |
| 15’answers-students | 0.79 | 0.79 | 0.79 | 0.86 | 0.89 |
| 15’belief | 0.70 | 0.78 | 0.78 | 0.87 | 0.88 |
| 15’images | 0.82 | 0.84 | 0.85 | 0.85 | 0.86 |
| 15’headlines | 0.79 | 0.79 | 0.77 | 0.89 | 0.86 |
| STS’15 | 0.76 | 0.78 | 0.78 | 0.86 | 0.87 |
| | | MSRpar | MSRvid | SMTeu | OnWN | SMTnews | deft-f | deft-n | headlines | images | OnWN | tweet-n |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| word2vec | Pcc | 0.71 | 0.91 | 0.74 | 0.82 | 0.81 | 0.68 | 0.75 | 0.82 | 0.87 | 0.92 | 0.89 |
| word2vec | MSE | 0.022 | 0.017 | 0.047 | 0.020 | 0.022 | 0.053 | 0.033 | 0.029 | 0.022 | 0.022 | 0.015 |
| fasttext | Pcc | 0.51 | 0.84 | 0.42 | 0.74 | 0.60 | 0.51 | 0.63 | 0.76 | 0.76 | 0.79 | 0.86 |
| fasttext | MSE | 0.028 | 0.035 | 0.086 | 0.049 | 0.073 | 0.055 | 0.039 | 0.035 | 0.040 | 0.062 | 0.028 |

| | | answ-for | answ-stu | belief | images | headlines | — |
|---|---|---|---|---|---|---|---|
| word2vec | Pcc | 0.88 | 0.89 | 0.88 | 0.86 | 0.86 | 0.86 |
| word2vec | MSE | 0.016 | 0.017 | 0.023 | 0.033 | 0.030 | 0.028 |
| fasttext | Pcc | 0.78 | 0.76 | 0.78 | 0.81 | 0.81 | 0.78 |
| fasttext | MSE | 0.029 | 0.041 | 0.037 | 0.037 | 0.044 | 0.037 |
| Year | Dataset | Pcc | Pcc | Pcc | MSE | MSE | MSE |
|---|---|---|---|---|---|---|---|
| 2012 | MSRpar | 0.688 | 0.711 | 0.703 | 0.0248 | 0.0224 | 0.0235 |
| | MSRvid | 0.882 | 0.881 | 0.872 | 0.0248 | 0.0250 | 0.0270 |
| | SMTeuroparl | 0.728 | 0.732 | 0.732 | 0.0987 | 0.1013 | 0.1023 |
| | OnWN | 0.825 | 0.824 | 0.818 | 0.0179 | 0.0183 | 0.0196 |
| | SMTnews | 0.773 | 0.779 | 0.770 | 0.0195 | 0.0200 | 0.0198 |
| 2014 | deft-forum | 0.654 | 0.658 | 0.665 | 0.0515 | 0.0519 | 0.0432 |
| | deft-news | 0.750 | 0.747 | 0.752 | 0.0328 | 0.0337 | 0.0326 |
| | headlines | 0.805 | 0.806 | 0.794 | 0.0294 | 0.0294 | 0.0320 |
| | images | 0.872 | 0.876 | 0.873 | 0.0225 | 0.0215 | 0.0222 |
| | OnWN | 0.918 | 0.919 | 0.916 | 0.0257 | 0.0255 | 0.0262 |
| | tweet-news | 0.906 | 0.909 | 0.909 | 0.0161 | 0.0154 | 0.0151 |
| 2015 | answers-forums | 0.891 | 0.891 | 0.892 | 0.0214 | 0.0211 | 0.0209 |
| | answers-students | 0.842 | 0.843 | 0.843 | 0.0274 | 0.0275 | 0.0272 |
| | belief | 0.888 | 0.886 | 0.885 | 0.0233 | 0.0234 | 0.0232 |
| | images | 0.858 | 0.858 | 0.857 | 0.0326 | 0.0327 | 0.0328 |
| | headlines | 0.909 | 0.910 | 0.863 | 0.0200 | 0.0199 | 0.0301 |
| | | Pcc | MSE |
|---|---|---|---|
| | = 75,000 | 0.80803 | 0.01465 |
| | = 80,000 | 0.81801 | 0.01962 |
| 0.5 | = 75,000 | 0.81312 | 0.01404 |
| 0.5 | = 80,000 | 0.82549 | 0.01792 |
| | = 75,000 | 0.81251 | 0.01413 |
| | = 80,000 | 0.82427 | 0.01833 |
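The notation table distinguishes two reduction levels: shrinking the finished sentence embedding versus shrinking each entangled word embedding before summation. The paper's exact reduction method is not reproduced in this section, so the sketch below uses a fixed orthogonal projection as a stand-in, simply to show where the two levels differ in the pipeline.

```python
import numpy as np

def _projection(dim, d_r, seed=0):
    """Fixed orthogonal projection from `dim` down to `d_r` dimensions."""
    q, _ = np.linalg.qr(np.random.default_rng(seed).normal(size=(dim, d_r)))
    return q

def reduce_sentence_level(s_emb, d_r):
    """Reduce the finished sentence embedding from D to d_r dimensions."""
    return s_emb @ _projection(len(s_emb), d_r)

def reduce_word_level(pair_embs, d_r):
    """Reduce each entangled word embedding first, then sum into a sentence
    vector -- the alternative reduction level from the notation table."""
    return (pair_embs @ _projection(pair_embs.shape[1], d_r)).sum(axis=0)

rng = np.random.default_rng(1)
print(reduce_sentence_level(rng.normal(size=64), 16).shape)   # (16,)
print(reduce_word_level(rng.normal(size=(5, 64)), 16).shape)  # (16,)
```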
| Stays the Same | Changes | Pcc | MSE |
|---|---|---|---|
| (1.5, 1.2, 0.8) | (, , ) | 0.85837 | 0.02869 |
| | (, , ) | 0.85829 | 0.02868 |
| | (, , ) | 0.85812 | 0.02868 |

| Stays the Same | Changes | Pcc | MSE |
|---|---|---|---|
| (1.5, 1.2, 0.8) | (, , ) | 0.75023 | 0.03280 |
| | (, , ) | 0.74806 | 0.03296 |
| | (, , ) | 0.74431 | 0.03340 |
| | (, , ) | 0.74687 | 0.03313 |
| | (, , ) | 0.74554 | 0.03312 |
| | (, , ) | 0.74560 | 0.03310 |
| | (, , ) | 0.74532 | 0.03313 |
| | (, , ) | 0.74533 | 0.03313 |
| | (, , ) | 0.74540 | 0.03312 |
| | (, , ) | 0.74682 | 0.03302 |

| Stays the Same | Changes | Pcc | MSE |
|---|---|---|---|
| (, , ) | (1.5, 1.2, 0.6) | 0.75083 | 0.03276 |
| | (2.0, 1.2, 0.6) | 0.75083 | 0.03276 |
| (, , ) | (2.0, 1.5, 1.0) | 0.74690 | 0.03292 |
| | (2.0, 1.5, 0.8) | 0.75023 | 0.03280 |
| (, , ) | (1.5, 1.2, 0.6) | 0.75083 | 0.03276 |
| | (1.5, 1.5, 0.6) | 0.75008 | 0.03285 |
| (, , ) | (2.0, 1.5, 1.0) | 0.74690 | 0.03292 |
| | (1.5, 1.2, 0.6) | 0.75083 | 0.03276 |
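Operationally, the weight experiments above amount to a small grid search over candidate triples. A hypothetical harness is sketched below; `toy_score` is a placeholder for a function that would run the full model on a development set and return its (Pcc, MSE).

```python
import numpy as np
from itertools import product

def grid_search(candidates, score_fn):
    """Return the candidate weight triple with the highest Pcc."""
    return max(candidates, key=lambda triple: score_fn(triple)[0])

def toy_score(triple):
    """Placeholder scorer: pretends (1.5, 1.2, 0.6) is optimal."""
    target = np.array([1.5, 1.2, 0.6])
    return 1.0 - float(np.linalg.norm(np.array(triple) - target)), None

candidates = list(product([1.5, 2.0], [1.2, 1.5], [0.6, 0.8, 1.0]))
print(grid_search(candidates, toy_score))  # -> (1.5, 1.2, 0.6)
```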
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, Y.; Qiu, D.; Yan, R. A Quantum Language-Inspired Tree Structural Text Representation for Semantic Analysis. Mathematics 2022, 10, 914. https://doi.org/10.3390/math10060914