Incorporating Word Significance into Aspect-Level Sentiment Analysis
Featured Application
Abstract
1. Introduction
- We use a word significance factor to model attention worthiness in aspect-level sentiment analysis through the Interactive Significant Attention Network (ISAN) model.
- We introduce two novel factors into aspect-level sentiment analysis, incremental interpretation and novelty decay, as alternatives to position-based models.
- We conduct a qualitative study on three real-world datasets to demonstrate the universality of stretched exponential novelty decay in aspect-level sentiment analysis (a toy sketch of such decay weighting follows this list).
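As a rough illustration of the novelty decay idea in the contributions above, the sketch below computes stretched-exponential attention weights over word positions. This is our own toy example, not the paper's word significance layer: the function name, the parameters `tau` and `beta`, and the normalization step are all assumptions.

```python
import numpy as np

def novelty_decay_weights(seq_len: int, tau: float = 5.0, beta: float = 0.5) -> np.ndarray:
    """Stretched-exponential decay exp(-(t/tau)**beta) over word positions.

    Hypothetical illustration: a word's novelty decays with the number of
    words read since it appeared, rather than linearly with raw position.
    """
    t = np.arange(seq_len, dtype=np.float64)   # positions 0, 1, ..., seq_len-1
    weights = np.exp(-(t / tau) ** beta)       # stretched exponential decay
    return weights / weights.sum()             # normalize to a distribution

print(novelty_decay_weights(8).round(3))
```

Unlike a plain exponential, the stretched form (0 < beta < 1) decays quickly at first and then flattens into a heavy tail, which is the shape the novelty decay literature cited below repeatedly reports.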
2. Related Works
2.1. Attention Mechanism in Aspect-Level Sentiment Analysis
2.2. Novelty Decay
2.3. Incremental Interpretation
3. Proposed Model
3.1. Task Definition
3.2. Word Embedding Layer
3.3. Contextual Layer
3.4. Word Significance Layer
3.5. Interactive Attention Layer
3.6. Output Layer
3.7. Model Training
4. Results and Discussion
4.1. Datasets and Parameter Setting
4.2. Baseline Comparison
- Majority is a basic baseline that assigns every instance in the test set the most frequent sentiment polarity observed in the training set.
- Feature-SVM [29] uses an SVM classifier based on lexicon, parse, and n-gram features to achieve state-of-the-art performance among feature-based methods.
- LSTM uses an LSTM network to learn hidden states without considering aspect words and uses the averaged hidden-state vector as the sentence representation to predict polarity.
- AE-LSTM [7] models context word representations with an LSTM, then combines the hidden states with aspect embeddings to generate attention weights for classification.
- ATAE-LSTM [7] improves on AE-LSTM by appending the aspect embedding to each word embedding when representing the context.
- IAN [5] models context and aspect words separately with attention-based LSTMs and links them through interactive attention; the two representations are then concatenated to predict polarity.
- AOA-LSTM [6] builds on the individual hidden states of aspect and context words, using an attention-over-attention interaction mechanism to focus on the important context words.
- ISAN-I models word significance based on incremental interpretation; the final prediction combines an interactive attention module with word significance.
- ISAN-D models word significance based on novelty decay and includes an interactive attention module.
- ISAN is the complete interactive significant attention model (a simplified sketch of the interactive attention step follows this list).
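IAN and the ISAN variants above all rely on an interaction between pooled aspect and context representations. The following is a minimal sketch of that pattern, not the authors' implementation: it substitutes plain dot-product scoring for IAN's learned bilinear attention, and every shape and name here is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def interactive_attention(context_h: torch.Tensor, aspect_h: torch.Tensor) -> torch.Tensor:
    """Simplified IAN-style interactive attention.

    context_h: (n, d) LSTM hidden states of the context words
    aspect_h:  (m, d) LSTM hidden states of the aspect words
    Returns the concatenated context/aspect representation, shape (2d,).
    """
    ctx_avg = context_h.mean(dim=0)                    # pooled context, (d,)
    asp_avg = aspect_h.mean(dim=0)                     # pooled aspect, (d,)
    # Each side attends to the other side's pooled vector (dot-product scores
    # here; IAN uses a learned bilinear score with a tanh activation).
    ctx_attn = F.softmax(context_h @ asp_avg, dim=0)   # (n,)
    asp_attn = F.softmax(aspect_h @ ctx_avg, dim=0)    # (m,)
    ctx_rep = (ctx_attn.unsqueeze(1) * context_h).sum(dim=0)  # weighted context
    asp_rep = (asp_attn.unsqueeze(1) * aspect_h).sum(dim=0)   # weighted aspect
    return torch.cat([ctx_rep, asp_rep])               # fed to the classifier

# Toy usage: 6 context words, 2 aspect words, hidden size 4
rep = interactive_attention(torch.randn(6, 4), torch.randn(2, 4))
print(rep.shape)  # torch.Size([8])
```

The ISAN variants differ in how the attention weights are further modulated by word significance (incremental interpretation for ISAN-I, novelty decay for ISAN-D, both for the full ISAN).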
4.3. Overall Performance Comparison
4.4. Analysis of the ISAN Model
4.5. The Effects of Novelty Decay
4.6. Case Study
4.7. Error Analysis
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
NLP | Natural Language Processing
RNN | Recurrent Neural Network
LSTM | Long Short-Term Memory
References
- Deng, L.; Liu, Y. Deep Learning in Natural Language Processing; Springer: Berlin/Heidelberg, Germany, 2018.
- Tay, Y.; Tuan, L.A.; Hui, S.C. Learning to attend via word-aspect associative fusion for aspect-based sentiment analysis. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Socher, R.; Pennington, J.; Huang, E.H.; Ng, A.Y.; Manning, C.D. Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, 27–31 July 2011; Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; pp. 151–161.
- Dong, L.; Wei, F.; Tan, C.; Tang, D.; Zhou, M.; Xu, K. Adaptive recursive neural network for target-dependent twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Baltimore, MD, USA, 22–27 June 2014; pp. 49–54.
- Ma, D.; Li, S.; Zhang, X.; Wang, H. Interactive Attention Networks for Aspect-Level Sentiment Classification. arXiv 2017, arXiv:1709.00893.
- Huang, B.; Ou, Y.; Carley, K.M. Aspect level sentiment classification with attention-over-attention neural networks. In Proceedings of the International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington, DC, USA, 10–13 July 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 197–206.
- Wang, Y.; Huang, M.; Zhao, L. Attention-based LSTM for aspect-level sentiment classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 606–615.
- Liu, J.; Zhang, Y. Attention modeling for targeted sentiment. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, 3–7 April 2017; pp. 572–577.
- Chen, P.; Sun, Z.; Bing, L.; Yang, W. Recurrent attention network on memory for aspect sentiment analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 452–461.
- Tang, D.; Qin, B.; Feng, X.; Liu, T. Effective LSTMs for target-dependent sentiment classification. arXiv 2015, arXiv:1512.01100.
- Li, X.; Bing, L.; Lam, W.; Shi, B. Transformation networks for target-oriented sentiment classification. arXiv 2018, arXiv:1805.01086.
- Zeng, J.; Ma, X.; Zhou, K. Enhancing Attention-Based LSTM with Position Context for Aspect-Level Sentiment Classification. IEEE Access 2019, 7, 20462–20471.
- Kahneman, D. Attention and Effort; Prentice-Hall: Englewood Cliffs, NJ, USA, 1973.
- Styles, E. The Psychology of Attention; Psychology Press: Hove, UK, 2006.
- Weng, L.; Flammini, A.; Vespignani, A.; Menczer, F. Competition among memes in a world with limited attention. Sci. Rep. 2012, 2, 335.
- Anderson, R.C. Allocation of attention during reading. In Advances in Psychology; Elsevier: Amsterdam, The Netherlands, 1982; Volume 8, pp. 292–305.
- Shirey, L.L.; Reynolds, R.E. Effect of interest on attention and learning. J. Educ. Psychol. 1988, 80, 159.
- Schlesewsky, M.; Bornkessel, I. On incremental interpretation: Degrees of meaning accessed during sentence comprehension. Lingua 2004, 114, 1213–1234.
- Sedivy, J.C.; Tanenhaus, M.K.; Chambers, C.G.; Carlson, G.N. Achieving incremental semantic interpretation through contextual representation. Cognition 1999, 71, 109–147.
- Wu, F.; Huberman, B.A. Popularity, novelty and attention. In Proceedings of the 9th ACM Conference on Electronic Commerce, Chicago, IL, USA, 8–12 July 2008; ACM: New York, NY, USA, 2008; pp. 240–245.
- Wu, F.; Huberman, B.A. Novelty and collective attention. Proc. Natl. Acad. Sci. USA 2007, 104, 17599–17601.
- Atchley, P.; Lane, S. Cognition in the attention economy. In Psychology of Learning and Motivation; Elsevier: Amsterdam, The Netherlands, 2014; Volume 61, pp. 133–177.
- Falkinger, J. Limited attention as a scarce resource in information-rich economies. Econ. J. 2008, 118, 1596–1620.
- Liu, Q.; Zhang, H.; Zeng, Y.; Huang, Z.; Wu, Z. Content attention model for aspect based sentiment analysis. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; International World Wide Web Conferences Steering Committee; pp. 1023–1032.
- Fan, F.; Feng, Y.; Zhao, D. Multi-grained attention network for aspect-level sentiment classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 3433–3442.
- Wang, D.; Song, C.; Barabási, A.L. Quantifying long-term scientific impact. Science 2013, 342, 127–132.
- Kaji, N.; Kitsuregawa, M. Building lexicon for sentiment analysis from massive collection of HTML documents. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, 28–30 June 2007; pp. 1075–1083.
- Kiritchenko, S.; Zhu, X.; Mohammad, S.M. Sentiment analysis of short informal texts. J. Artif. Intell. Res. 2014, 50, 723–762.
- Kiritchenko, S.; Zhu, X.; Cherry, C.; Mohammad, S. NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland, 23–24 August 2014; pp. 437–442.
- Qu, L.; Ifrim, G.; Weikum, G. The Bag-of-Opinions Method for Review Rating Prediction from Sparse Text Patterns. In Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, China, 23–27 August 2010; pp. 913–921.
- Maas, A.L.; Daly, R.E.; Pham, P.T.; Huang, D.; Ng, A.Y.; Potts, C. Learning Word Vectors for Sentiment Analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, Portland, OR, USA, 19–24 June 2011; pp. 142–150.
- Vo, D.T.; Zhang, Y. Target-dependent twitter sentiment classification with rich automatic features. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
- Zhang, M.; Zhang, Y.; Vo, D.T. Neural networks for open domain targeted sentiment. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 612–621.
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
- Kumar, A.; Irsoy, O.; Ondruska, P.; Iyyer, M.; Bradbury, J.; Gulrajani, I.; Zhong, V.; Paulus, R.; Socher, R. Ask me anything: Dynamic memory networks for natural language processing. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; pp. 1378–1387.
- Sukhbaatar, S.; Weston, J.; Fergus, R. End-to-end memory networks. In Proceedings of the Annual Conference on Neural Information Processing Systems 2015, Montreal, QC, Canada, 7–12 December 2015; pp. 2440–2448.
- Seo, M.; Kembhavi, A.; Farhadi, A.; Hajishirzi, H. Bidirectional attention flow for machine comprehension. arXiv 2016, arXiv:1611.01603.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
- Radicchi, F.; Fortunato, S.; Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proc. Natl. Acad. Sci. USA 2008, 105, 17268–17272.
- Candia, C.; Jara-Figueroa, C.; Rodriguez-Sickert, C.; Barabási, A.L.; Hidalgo, C.A. The universal decay of collective memory and attention. Nat. Hum. Behav. 2019, 3, 82.
- Lehmann, J.; Gonçalves, B.; Ramasco, J.J.; Cattuto, C. Dynamical classes of collective attention in twitter. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; ACM: New York, NY, USA, 2012; pp. 251–260.
- Higham, K.W.; Governale, M.; Jaffe, A.; Zülicke, U. Fame and obsolescence: Disentangling growth and aging dynamics of patent citations. Phys. Rev. E 2017, 95, 042309.
- Stringer, M.J.; Sales-Pardo, M.; Amaral, L.A.N. Effectiveness of journal ranking schemes as a tool for locating information. PLoS ONE 2008, 3, e1683.
- Higham, K.W.; Governale, M.; Jaffe, A.; Zülicke, U. Unraveling the dynamics of growth, aging and inflation for citations to scientific articles from specific research fields. J. Informetr. 2017, 11, 1190–1200.
- Krapivsky, P.L.; Redner, S.; Leyvraz, F. Connectivity of growing random networks. Phys. Rev. Lett. 2000, 85, 4629.
- Krapivsky, P.L.; Redner, S. Organization of growing random networks. Phys. Rev. E 2001, 63, 066123.
- Laherrere, J.; Sornette, D. Stretched exponential distributions in nature and economy: "Fat tails" with characteristic scales. Eur. Phys. J. B Condens. Matter Complex Syst. 1998, 2, 525–539.
- Elton, D.C. Stretched exponential relaxation. arXiv 2018, arXiv:1808.00881.
- Asur, S.; Huberman, B.A.; Szabo, G.; Wang, C. Trends in social media: Persistence and decay. In Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, Spain, 17–21 July 2011.
- Feng, S.; Chen, X.; Cong, G.; Zeng, Y.; Chee, Y.M.; Xiang, Y. Influence maximization with novelty decay in social networks. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Quebec City, QC, Canada, 27–31 July 2014.
- Pereira, F.C.; Pollack, M.E. Incremental interpretation. Artif. Intell. 1991, 50, 37–82.
- Altmann, G.T.; Kamide, Y. Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition 1999, 73, 247–264.
- DeVault, D.; Sagae, K.; Traum, D. Incremental interpretation and prediction of utterance meaning for interactive dialogue. Dialogue Discourse 2011, 2, 143–170.
- Pennington, J.; Socher, R.; Manning, C. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Pontiki, M.; Galanis, D.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S.; Mohammad, A.S.; Al-Ayyoub, M.; Zhao, Y.; Qin, B.; De Clercq, O.; et al. SemEval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), San Diego, CA, USA, 16–17 June 2016; pp. 19–30.
- Jiang, L.; Yu, M.; Zhou, M.; Liu, X.; Zhao, T. Target-dependent twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, Portland, OR, USA, 19–24 June 2011; Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; pp. 151–160.
- Lorenz-Spreen, P.; Mønsted, B.M.; Hövel, P.; Lehmann, S. Accelerating dynamics of collective attention. Nat. Commun. 2019, 10, 1759.
Dataset | Positive Train | Positive Test | Neutral Train | Neutral Test | Negative Train | Negative Test
---|---|---|---|---|---|---
Restaurant | 2164 | 728 | 637 | 196 | 807 | 196
Laptop | 994 | 341 | 464 | 169 | 870 | 128
Twitter | 1561 | 173 | 3127 | 346 | 1560 | 173
Model | Laptop Acc | Laptop Macro-F1 | Restaurant Acc | Restaurant Macro-F1 | Twitter Acc | Twitter Macro-F1
---|---|---|---|---|---|---
Majority | 0.535 | 0.333 | 0.650 | 0.333 | 0.500 | 0.333
Feature-SVM | 0.705 | - | 0.802 | - | 0.634 | 0.633
AE-LSTM | 0.689 | - | 0.762 | - | - | -
ATAE-LSTM | 0.687 | - | 0.772 | - | - | -
IAN | 0.721 | - | 0.786 | - | 0.716 * | 0.705 *
AOA-LSTM | 0.745 | - | 0.812 | - | 0.725 * | 0.706 *
ISAN | 0.749 | 0.721 | 0.824 | 0.744 | 0.734 | 0.716
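For readers reproducing these tables, accuracy and macro-F1 can be computed from raw predictions with standard tooling; the label arrays below are invented purely for illustration.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical gold labels and predictions over the three polarity classes.
y_true = [2, 0, 1, 2, 2, 0, 1, 0]   # 0 = negative, 1 = neutral, 2 = positive
y_pred = [2, 0, 1, 1, 2, 0, 2, 0]

acc = accuracy_score(y_true, y_pred)
# Macro-F1 averages the per-class F1 scores, so the minority neutral class
# counts as much as the majority positive class.
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(f"Acc = {acc:.3f}, Macro-F1 = {macro_f1:.3f}")
```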
Model | Laptop Acc | Laptop Macro-F1 | Restaurant Acc | Restaurant Macro-F1 | Twitter Acc | Twitter Macro-F1
---|---|---|---|---|---|---
ISAN-I | 0.719 | 0.665 | 0.801 | 0.705 | 0.711 | 0.674
ISAN-D | 0.723 | 0.673 | 0.806 | 0.713 | 0.727 | 0.710
ISAN | 0.749 | 0.721 | 0.824 | 0.744 | 0.734 | 0.716
Dataset | Beginning (%) | Middle (%) | End (%)
---|---|---|---
Laptop | 20.30 | 22.56 | 55.02
Restaurant | 22.84 | 22.56 | 54.59
Twitter | 27.98 | 21.87 | 50.13
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mokhosi, R.; Qin, Z.; Liu, Q.; Shikali, C. Incorporating Word Significance into Aspect-Level Sentiment Analysis. Appl. Sci. 2019, 9, 3522. https://doi.org/10.3390/app9173522