A Comparative Sentiment Analysis of Airline Customer Reviews Using Bidirectional Encoder Representations from Transformers (BERT) and Its Variants
Abstract
1. Introduction
2. Related Works
2.1. Customer Feedback Analysis
2.2. Natural Language Processing (NLP) and Natural Language Understanding (NLU)
2.3. BERT
2.4. Fine-Tuning
3. Methods
3.1. Model Performance Evaluation
3.2. Models
3.2.1. Bidirectional Encoder Representations from Transformers
3.2.2. DistilBERT
3.2.3. RoBERTa
3.2.4. ALBERT
3.3. Data
3.4. Pre-processing
3.4.1. Tokenization
3.4.2. Lower-casing
3.4.3. Others
3.5. Model Training
4. Results
4.1. Binary Classification Task
4.2. Multi-Class Classification Task (k = 3)
4.3. Benchmark Comparison Task
5. Discussion
5.1. Limitations
5.2. Future Directions
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
References
- Sandada, M.; Matibiri, B. An investigation into the impact of service quality, frequent flier programs and safety perception on satisfaction and customer loyalty in the airline industry in Southern Africa. South East Eur. J. Econ. Bus. 2016, 11, 41. [Google Scholar] [CrossRef]
- Kalemba, N.; Campa-Planas, F. The quality effect on the profitability of US airline companies. Tour. Econ. 2018, 24, 251–269. [Google Scholar] [CrossRef]
- Ban, H.-J.; Kim, H.-S. Understanding customer experience and satisfaction through airline passengers’ online review. Sustainability 2019, 11, 4066. [Google Scholar] [CrossRef]
- Giachanou, A.; Crestani, F. Like it or not: A survey of twitter sentiment analysis methods. ACM Comput. Surv. CSUR 2016, 49, 1–41. [Google Scholar] [CrossRef]
- Ravi Kumar, G.; Venkata Sheshanna, K.; Anjan Babu, G. Sentiment analysis for airline tweets utilizing machine learning techniques. In International Conference on Mobile Computing and Sustainable Informatics: ICMCSI 2020; Springer: Cham, Switzerland, 2021; pp. 791–799. [Google Scholar]
- Mahurkar, S.; Patil, R. LRG at SemEval-2020 task 7: Assessing the ability of BERT and derivative models to perform short-edits based humor grading. arXiv 2020, arXiv:2006.00607. [Google Scholar]
- Tusar, M.T.H.K.; Islam, M.T. A comparative study of sentiment analysis using NLP and different machine learning techniques on US airline Twitter data. In Proceedings of the 2021 International Conference on Electronics, Communications and Information Technology (ICECIT), Khulna, Bangladesh, 14–16 September 2021; pp. 1–4. [Google Scholar]
- Patel, A.; Oza, P.; Agrawal, S. Sentiment Analysis of Customer Feedback and Reviews for Airline Services using Language Representation Model. Procedia Comput. Sci. 2023, 218, 2459–2467. [Google Scholar] [CrossRef]
- Yang, C.; Huang, C. Natural Language Processing (NLP) in Aviation Safety: Systematic Review of Research and Outlook into the Future. Aerospace 2023, 10, 600. [Google Scholar] [CrossRef]
- Park, E. The role of satisfaction on customer reuse to airline services: An application of Big Data approaches. J. Retail. Consum. Serv. 2019, 47, 370–374. [Google Scholar] [CrossRef]
- Park, E.; Jang, Y.; Kim, J.; Jeong, N.J.; Bae, K.; Del Pobil, A.P. Determinants of customer satisfaction with airline services: An analysis of customer feedback big data. J. Retail. Consum. Serv. 2019, 51, 186–190. [Google Scholar] [CrossRef]
- Punel, A.; Hassan, L.A.H.; Ermagun, A. Variations in airline passenger expectation of service quality across the globe. Tour. Manag. 2019, 75, 491–508. [Google Scholar] [CrossRef]
- Wu, S.; Gao, Y. Machine Learning Approach to Analyze the Sentiment of Airline Passengers’ Tweets. Transp. Res. Rec. 2023. [Google Scholar] [CrossRef]
- Sezgen, E.; Mason, K.J.; Mayer, R. Voice of airline passenger: A text mining approach to understand customer satisfaction. J. Air Transp. Manag. 2019, 77, 65–74. [Google Scholar] [CrossRef]
- Siering, M.; Deokar, A.V.; Janze, C. Disentangling consumer recommendations: Explaining and predicting airline recommendations based on online reviews. Decis. Support Syst. 2018, 107, 52–63. [Google Scholar] [CrossRef]
- Kumar, S.; Zymbler, M. A machine learning approach to analyze customer satisfaction from airline tweets. J. Big Data 2019, 6, 62. [Google Scholar] [CrossRef]
- Lucini, F.R.; Tonetto, L.M.; Fogliatto, F.S.; Anzanello, M.J. Text mining approach to explore dimensions of airline customer satisfaction using online customer reviews. J. Air Transp. Manag. 2020, 83, 101760. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Allen, R.B. Several studies on natural language and back-propagation. In Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, USA, 21 June 1987; p. 341. [Google Scholar]
- Cambria, E.; White, B. Jumping NLP curves: A review of natural language processing research. IEEE Comput. Intell. Mag. 2014, 9, 48–57. [Google Scholar] [CrossRef]
- Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
- Wang, H.; Raj, B. On the origin of deep learning. arXiv 2017, arXiv:1702.07800. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Sun, C.; Qiu, X.; Xu, Y.; Huang, X. How to fine-tune BERT for text classification? In Proceedings of the Chinese Computational Linguistics: 18th China National Conference, CCL 2019, Kunming, China, 18–20 October 2019; pp. 194–206. [Google Scholar]
- Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv 2019, arXiv:1910.01108. [Google Scholar]
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692. [Google Scholar]
- Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A lite BERT for self-supervised learning of language representations. arXiv 2019, arXiv:1909.11942. [Google Scholar]
- Twitter US Airline Sentiment. Kaggle. Available online: https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment (accessed on 17 April 2023).
- Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461. [Google Scholar]
- Brown, P.F.; Della Pietra, V.J.; Desouza, P.V.; Lai, J.C.; Mercer, R.L. Class-based n-gram models of natural language. Comput. Linguist. 1992, 18, 467–480. [Google Scholar]
- Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. XLNet: Generalized autoregressive pretraining for language understanding. arXiv 2019, arXiv:1906.08237. [Google Scholar]
Studies | Training Dataset | Algorithm | Performance |
---|---|---|---|
Wu and Gao [13] | Kaggle | SVM | Their model achieved 91% accuracy in classifying a given tweet as positive or negative. |
Patel et al. [8] | Kaggle | Logistic regression, KNN, decision tree, random forest, AdaBoost, and BERT | BERT produced the highest accuracy (83%), precision, recall, and F-1 score. |
Sezgen et al. [14] | TripAdvisor | Singular value decomposition (SVD) | Latent semantic analysis (LSA) identified the key factors influencing passengers' choice of airline. |
Siering et al. [15] | airlinequality.com | Naïve Bayes (NB), neural network, and SVM | Neural network yields the highest accuracy (75%) in the random sampling, while NB yields the highest accuracy (71%) in the LCC airline sample with overall sentiment configuration. |
Kumar and Zymbler [16] | Tweet data using Twitter API | SVM, artificial neural networks (ANNs), and convolutional neural networks (CNNs) | The CNN (92% accuracy) outperformed the traditional ANN (69%). |
Lucini et al. [17] | Airline Travel Review | Latent Dirichlet allocation (LDA) and logistic regression analysis | The model predicted customers' airline recommendations with 79% accuracy. |
Model | Accuracy | Negative Precision | Negative Recall | Negative F-1 | Positive Precision | Positive Recall | Positive F-1 | Parameters |
---|---|---|---|---|---|---|---|---|
BERT | 0.9532 | 0.9720 | 0.9699 | 0.9710 | 0.8758 | 0.8837 | 0.8797 | 109,483,778 |
RoBERTa | 0.9697 | 0.9878 | 0.9737 | 0.9807 | 0.9055 | 0.9544 | 0.9293 | 124,647,170 |
DistilBERT | 0.9515 | 0.9704 | 0.9683 | 0.9693 | 0.8807 | 0.8880 | 0.8843 | 66,955,010 |
ALBERT | 0.9589 | 0.9649 | 0.9721 | 0.9685 | 0.8796 | 0.8520 | 0.8656 | 11,685,122 |
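The per-class precision, recall, and F-1 values reported in these tables follow the standard confusion-matrix definitions. A minimal sketch of that arithmetic (with made-up labels for illustration, not data from the study):

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and F-1 from true and predicted labels."""
    # True positives, false positives, false negatives for the chosen class.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Example: 5 tweets, 3 truly positive, one false positive and one miss.
p, r, f = precision_recall_f1(
    ["pos", "pos", "neg", "pos", "neg"],
    ["pos", "neg", "neg", "pos", "pos"],
    "pos",
)
```

Computing the scores separately for each class, as above, is what yields the distinct Negative/Positive (and Neutral) columns for a single model.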
Model | Accuracy | Negative Precision | Negative Recall | Negative F-1 | Positive Precision | Positive Recall | Positive F-1 | Neutral Precision | Neutral Recall | Neutral F-1 |
---|---|---|---|---|---|---|---|---|---|---|
BERT | 0.8528 | 0.9070 | 0.9195 | 0.9132 | 0.8125 | 0.7930 | 0.8026 | 0.7009 | 0.6828 | 0.6917 |
RoBERTa | 0.8689 | 0.9098 | 0.9424 | 0.9258 | 0.8115 | 0.8719 | 0.8406 | 0.7729 | 0.6424 | 0.7016 |
DistilBERT | 0.8443 | 0.9204 | 0.8924 | 0.9062 | 0.7010 | 0.6987 | 0.6998 | 0.7528 | 0.8430 | 0.7953 |
ALBERT | 0.8408 | 0.9224 | 0.8874 | 0.9046 | 0.8502 | 0.7395 | 0.7910 | 0.6386 | 0.7781 | 0.7015 |
Model | Accuracy | Negative Precision | Negative Recall | Negative F-1 | Positive Precision | Positive Recall | Positive F-1 | Neutral Precision | Neutral Recall | Neutral F-1 |
---|---|---|---|---|---|---|---|---|---|---|
Binary | | | | | | | | | | |
Benchmark (SVM [13]) | 0.9186 | N/A 1 | N/A | N/A | N/A | N/A | N/A | N/A 2 | N/A | N/A |
BERT | 0.9532 | 0.9720 | 0.9699 | 0.9710 | 0.8758 | 0.8837 | 0.8797 | N/A | N/A | N/A |
RoBERTa | 0.9697 | 0.9878 | 0.9737 | 0.9807 | 0.9055 | 0.9544 | 0.9293 | N/A | N/A | N/A |
DistilBERT | 0.9515 | 0.9704 | 0.9683 | 0.9693 | 0.8807 | 0.8880 | 0.8843 | N/A | N/A | N/A |
ALBERT | 0.9589 | 0.9649 | 0.9721 | 0.9685 | 0.8796 | 0.8520 | 0.8656 | N/A | N/A | N/A |
Multi-class (k = 3) | | | | | | | | | | |
Benchmark (BERT k = 3 [8]) | 0.83 | 0.85 | 0.96 | 0.90 | 0.78 | 0.77 | 0.78 | 0.79 | 0.46 | N/A 3 |
BERT | 0.8528 | 0.9070 | 0.9195 | 0.9132 | 0.8125 | 0.7930 | 0.8026 | 0.7009 | 0.6828 | 0.6917 |
RoBERTa | 0.8689 | 0.9098 | 0.9424 | 0.9258 | 0.8115 | 0.8719 | 0.8406 | 0.7729 | 0.6424 | 0.7016 |
DistilBERT | 0.8443 | 0.9204 | 0.8924 | 0.9062 | 0.7010 | 0.6987 | 0.6998 | 0.7528 | 0.8430 | 0.7953 |
ALBERT | 0.8408 | 0.9224 | 0.8874 | 0.9046 | 0.8502 | 0.7395 | 0.7910 | 0.6386 | 0.7781 | 0.7015 |
Model | Accuracy | Negative Precision | Negative Recall | Negative F-1 | Positive Precision | Positive Recall | Positive F-1 | Neutral Precision | Neutral Recall | Neutral F-1 |
---|---|---|---|---|---|---|---|---|---|---|
BERT | 0.7784 (−7.44%) * | 0.7731 (−13.39%) | 0.8328 (−8.67%) | 0.8019 (−11.13%) | 0.7844 (−2.81%) | 0.8590 (+6.60%) | 0.8200 (+1.74%) | 0.7774 (+7.65%) | 0.6548 (−2.80%) | 0.7108 (+1.91%) |
RoBERTa | 0.8237 (−4.52%) | 0.8597 (−5.01%) | 0.8520 (−9.04%) | 0.8559 (−6.9%) | 0.8115 (0%) | 0.8828 (+1.09%) | 0.8457 (+0.49%) | 0.8026 (+2.97%) | 0.7408 (+9.84%) | 0.7705 (+6.89%) |
DistilBERT | 0.7955 (−4.88%) | 0.7698 (−12.36%) | 0.8700 (−2.24%) | 0.8168 (−8.94%) | 0.8326 (+13.16%) | 0.8326 (+13.39%) | 0.8326 (+13.28%) | 0.7844 (+3.16%) | 0.6923 (−15.07%) | 0.7355 (−5.98%) |
ALBERT | 0.8138 (−2.70%) | 0.8889 (−3.35%) | 0.7619 (−12.55%) | 0.8205 (−8.41%) | 0.8492 (−0.10%) | 0.8458 (+10.63%) | 0.8475 (+5.65%) | 0.7329 (+9.43%) | 0.8252 (+4.71%) | 0.7763 (+7.48%) |
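The parenthesized values in the table above are consistent with percentage-point differences from the corresponding multi-class (k = 3) scores (e.g., BERT accuracy 0.7784 vs. 0.8528 gives −7.44%); this reading is inferred by checking the numbers, not stated in this excerpt. A sketch of that arithmetic:

```python
def pp_delta(score, reference):
    """Difference between two proportions, in percentage points."""
    return round((score - reference) * 100, 2)


# BERT accuracy here vs. its multi-class (k = 3) accuracy.
delta = pp_delta(0.7784, 0.8528)  # -7.44
```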
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Z.; Yang, C.; Huang, C. A Comparative Sentiment Analysis of Airline Customer Reviews Using Bidirectional Encoder Representations from Transformers (BERT) and Its Variants. Mathematics 2024, 12, 53. https://doi.org/10.3390/math12010053