TChecker: A Content Enrichment Approach for Fake News Detection on Social Media
Abstract
1. Introduction
- RQ1: How would the introduction of contextual information enhance the veracity detection of social posts?
- RQ2: How would the incorporation of news articles alongside the post content enhance the veracity detection of social posts?
2. Related Work
2.1. Content-Based Approach
Fake News Detection from News Articles
2.2. Fake News Detection from Social Media
2.3. Social-Based Approach
3. Methodology
3.1. Baseline Model
3.2. BERTweet BiLSTM Model
3.3. TChecker Model
4. Experiments
4.1. Dataset
4.2. Baseline Model
4.3. Effect of BiLSTM
4.4. TChecker Model
- Text-CNN [50]: Text-CNN utilizes convolutional neural networks to model news articles, which can capture different granularities of text features with multiple convolution filters.
- TCNN-URG [43]: TCNN-URG consists of two major components: a two-level convolutional neural network to learn representations from news articles, and a conditional variational auto-encoder to capture features from user comments.
- CSI [44]: CSI is a hybrid deep learning model that utilizes information from the text, response, and source. The news representation is modeled via an LSTM neural network with the Doc2Vec embedding on the news articles and user comments as input.
- dEFEND [17]: dEFEND uses the news articles and ranks the user posts to select the ones that explain why the post is real or fake.
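The Text-CNN baseline listed above relies on convolution filters of several widths, each followed by max-over-time pooling, so that every filter contributes one feature at its own n-gram granularity. The following is a minimal pure-Python sketch of that idea only; the function names, the toy embeddings, and the hand-picked filter weights are illustrative assumptions and are not taken from the compared implementations.

```python
from typing import List, Sequence

Vector = Sequence[float]


def conv_max_pool(embeddings: Sequence[Vector],
                  kernel: Sequence[Vector],
                  bias: float = 0.0) -> float:
    """Slide one filter over the token sequence and keep the strongest
    ReLU response (max-over-time pooling)."""
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the filter width")
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        activation = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        best = max(best, activation)
    return best


def text_cnn_features(embeddings: Sequence[Vector],
                      kernels: Sequence[Sequence[Vector]]) -> List[float]:
    """One pooled feature per filter; filters may have different widths."""
    return [conv_max_pool(embeddings, kernel) for kernel in kernels]


# Toy input: 4 tokens with 2-dimensional embeddings (hypothetical values).
tokens = [[1, 0], [0, 1], [1, 1], [0, 0]]
# Two hypothetical filters: one spanning 2 tokens, one spanning 3 tokens.
filters = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1], [1, 1]],
]
features = text_cnn_features(tokens, filters)
```

In a trained model the filter weights are learned and the pooled features feed a classification layer; here they only show how different filter widths summarize the same token sequence.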
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Boehm, L.E. The validity effect: A search for mediating variables. Personal. Soc. Psychol. Bull. 1994, 20, 285–293.
- Nickerson, R.S. Confirmation bias: A ubiquitous phenomenon in many guises. Rev. Gen. Psychol. 1998, 2, 175–220.
- Yuan, L.; Jiang, H.; Shen, H.; Shi, L.; Cheng, N. Sustainable Development of Information Dissemination: A Review of Current Fake News Detection Research and Practice. Systems 2023, 11, 458.
- Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151.
- Pogue, D. How to Stamp Out Fake News. Sci. Am. 2017, 316, 24.
- Allcott, H.; Gentzkow, M. Social Media and Fake News in the 2016 Election. J. Econ. Perspect. 2017, 31, 211–236.
- Rapoza, K. Can ‘Fake News’ Impact the Stock Market? Section: Investing. Available online: https://www.forbes.com/sites/kenrapoza/2017/02/26/can-fake-news-impact-the-stock-market/?sh=129496f92fac (accessed on 10 September 2023).
- Cinelli, M.; Quattrociocchi, W.; Galeazzi, A.; Valensise, C.M.; Brugnoli, E.; Schmidt, A.L.; Zola, P.; Zollo, F.; Scala, A. The COVID-19 social media infodemic. Sci. Rep. 2020, 10, 16598.
- Ma, J.; Gao, W.; Wong, K.F. Rumor Detection on Twitter with Tree-structured Recursive Neural Networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 1980–1989.
- Ajao, O.; Bhowmik, D.; Zargari, S. Fake News Identification on Twitter with Hybrid CNN and RNN Models. In Proceedings of the 9th International Conference on Social Media and Society, Melbourne, Australia, 15–20 July 2018; ACM: Copenhagen, Denmark, 2018; pp. 226–230.
- Yang, S.; Shu, K.; Wang, S.; Gu, R.; Wu, F.; Liu, H. Unsupervised Fake News Detection on Social Media: A Generative Approach. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 5644–5651.
- Hlaing, M.M.M.; Kham, N.S.M. Defining News Authenticity on Social Media Using Machine Learning Approach. In Proceedings of the 2020 IEEE Conference on Computer Applications(ICCA), Yangon, Myanmar, 27–28 February 2020; pp. 1–6.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805.
- Lin, H.Y.; Moh, T.S. Sentiment analysis on COVID tweets using COVID-Twitter-BERT with auxiliary sentence approach. In Proceedings of the 2021 ACM Southeast Conference, ACM SE ’21, Online, 15–17 April 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 234–238.
- Jeyasudha, J.; Seth, P.; Usha, G.; Tanna, P. Fake Information Analysis and Detection on Pandemic in Twitter. SN Comput. Sci. 2022, 3, 456.
- Nguyen, D.Q.; Vu, T.; Tuan Nguyen, A. BERTweet: A pre-trained language model for English Tweets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 9–14.
- Shu, K.; Cui, L.; Wang, S.; Lee, D.; Liu, H. dEFEND: Explainable Fake News Detection. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019; pp. 395–405.
- Pan, J.Z.; Pavlova, S.; Li, C.; Li, N.; Li, Y.; Liu, J. Content Based Fake News Detection Using Knowledge Graphs. In Proceedings of the Semantic Web—ISWC 2018, Monterey, CA, USA, 8–12 October 2018; Vrandečić, D., Bontcheva, K., Suárez-Figueroa, M.C., Presutti, V., Celino, I., Sabou, M., Kaffee, L.A., Simperl, E., Eds.; Lecture Notes in Computer Science. Springer International Publishing: Cham, Switzerland, 2018; pp. 669–683.
- Hu, L.; Yang, T.; Zhang, L.; Zhong, W.; Tang, D.; Shi, C.; Duan, N.; Zhou, M. Compare to The Knowledge: Graph Neural Fake News Detection with External Knowledge. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 754–763.
- Siering, M.; Koch, J.A.; Deokar, A.V. Detecting Fraudulent Behavior on Crowdfunding Platforms: The Role of Linguistic and Content-Based Cues in Static and Dynamic Contexts. J. Manag. Inf. Syst. 2016, 33, 421–455.
- Zhang, D.; Zhou, L.; Kehoe, J.L.; Kilic, I.Y. What Online Reviewer Behaviors Really Matter? Effects of Verbal and Nonverbal Behaviors on Detection of Fake Online Reviews. J. Manag. Inf. Syst. 2016, 33, 456–481.
- Braud, C.; Søgaard, A. Is writing style predictive of scientific fraud? arXiv 2017, arXiv:1707.04095.
- Bond, G.D.; Holman, R.D.; Eggert, J.A.L.; Speller, L.F.; Garcia, O.N.; Mejia, S.C.; Mcinnes, K.W.; Ceniceros, E.C.; Rustige, R. ‘Lyin’ Ted’, ‘Crooked Hillary’, and ‘Deceptive Donald’: Language of Lies in the 2016 US Presidential Debates. Appl. Cogn. Psychol. 2017, 31, 668–677.
- Potthast, M.; Kiesel, J.; Reinartz, K.; Bevendorff, J.; Stein, B. A Stylometric Inquiry into Hyperpartisan and Fake News. arXiv 2017, arXiv:1702.05638.
- Agarwal, V.; Sultana, H.P.; Malhotra, S.; Sarkar, A. Analysis of Classifiers for Fake News Detection. Procedia Comput. Sci. 2019, 165, 377–383.
- Rohera, D.; Shethna, H.; Patel, K.; Thakker, U.; Tanwar, S.; Gupta, R.; Hong, W.C.; Sharma, R. A Taxonomy of Fake News Classification Techniques: Survey and Implementation Aspects. IEEE Access 2022, 10, 30367–30394.
- Mohapatra, A.; Thota, N.; Prakasam, P. Fake news detection and classification using hybrid BiLSTM and self-attention model. Multimed. Tools Appl. 2022, 81, 18503–18519.
- Pennington, J.; Socher, R.; Manning, C. Glove: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Stroudsburg, PA, USA, 2014; pp. 1532–1543.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
- Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv 2020, arXiv:1910.01108.
- Anggrainingsih, R.; Hassan, G.M.; Datta, A. Evaluating BERT-Based Pre-Training Language Models for Detecting Misinformation. arXiv 2022, arXiv:2203.07731.
- Rai, N.; Kumar, D.; Kaushik, N.; Raj, C.; Ali, A. Fake News Classification using transformer based enhanced LSTM and BERT. Int. J. Cogn. Comput. Eng. 2022, 3, 98–105.
- Shu, K.; Mahudeswaran, D.; Wang, S.; Lee, D.; Liu, H. FakeNewsNet: A Data Repository with News Content, Social Context and Spatialtemporal Information for Studying Fake News on Social Media. arXiv 2019, arXiv:1809.01286.
- Lee, J.W.; Kim, J.H. Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News. Appl. Sci. 2022, 12, 6402.
- Kaliyar, R.K.; Goswami, A.; Narang, P. FakeBERT: Fake news detection in social media with a BERT-based deep learning approach. Multimed. Tools Appl. 2021, 80, 11765–11788.
- Zubiaga, A.; Liakata, M.; Procter, R. Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media. arXiv 2016, arXiv:1610.07363.
- Olaleye, T.; Abayomi-Alli, A.; Adesemowo, K.; Arogundade, O.T.; Misra, S.; Kose, U. SCLAVOEM: Hyper parameter optimization approach to predictive modelling of COVID-19 infodemic tweets using smote and classifier vote ensemble. Soft Comput. 2022, 27, 3531–3550.
- Müller, M.; Salathé, M.; Kummervold, P.E. COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter. arXiv 2020, arXiv:2005.07503.
- Dadgar, S.; Ghatee, M. Checkovid: A COVID-19 misinformation detection system on Twitter using network and content mining perspectives. arXiv 2021, arXiv:2107.09768.
- Kumar, A.; Jhunjhunwala, N.; Agarwal, R.; Chatterjee, N. NARNIA at NLP4IF-2021: Identification of Misinformation in COVID-19 Tweets Using BERTweet. In Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, Online, 6 June 2021; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 99–103.
- Kim, M.G.; Kim, M.; Kim, J.H.; Kim, K. Fine-Tuning BERT Models to Classify Misinformation on Garlic and COVID-19 on Twitter. Int. J. Environ. Res. Public Health 2022, 19, 5126.
- Alyoubi, S.; Kalkatawi, M.; Abukhodair, F. The Detection of Fake News in Arabic Tweets Using Deep Learning. Appl. Sci. 2023, 13, 8209.
- Qian, F.; Gong, C.; Sharma, K.; Liu, Y. Neural User Response Generator: Fake News Detection with Collective User Intelligence. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; International Joint Conferences on Artificial Intelligence Organization: Stroudsburg, PA, USA, 2018; pp. 3834–3840.
- Ruchansky, N.; Seo, S.; Liu, Y. CSI: A Hybrid Deep Model for Fake News Detection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; ACM: New York, NY, USA, 2017; pp. 797–806.
- Ma, J.; Gao, W.; Mitra, P.; Kwon, S.; Jansen, B.J.; Wong, K.F.; Cha, M. Detecting rumors from microblogs with recurrent neural networks. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI’16, New York, NY, USA, 9–15 July 2016; AAAI Press: New York, NY, USA, 2016; pp. 3818–3824.
- Shu, K.; Wang, S.; Liu, H. Beyond News Contents: The Role of Social Context for Fake News Detection. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, Melbourne, VIC, Australia, 11–15 February 2019; ACM: New York, NY, USA, 2019; pp. 312–320.
- Zhang, J.; Dong, B.; Yu, P.S. FakeDetector: Effective Fake News Detection with Deep Diffusive Neural Network. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1826–1829.
- Alkhalifa, R.; Yoong, T.; Kochkina, E.; Zubiaga, A.; Liakata, M. QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions. arXiv 2020, arXiv:2008.13160.
- Kumar, A.; Singh, J.P.; Singh, A.K. COVID-19 Fake News Detection Using Ensemble-Based Deep Learning Model. IT Prof. 2022, 24, 32–37.
- Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Stroudsburg, PA, USA, 2014; pp. 1746–1751.
Hyperparameter | Value |
---|---|
Loss Function | Binary Cross Entropy |
Optimizer | Adam |
Number of Epochs | 2 |
Batch Size | 16 |
Learning Rate | 0.0000006 |
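The hyperparameter table names binary cross-entropy as the loss. As a minimal sketch, the settings can be captured in a small configuration object, and the loss can be computed by hand for a batch of predicted probabilities. The class name, field names, and the convention that label 1 marks one class and 0 the other are assumptions made for illustration; the table does not specify them.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the hyperparameter table; field names are illustrative."""
    loss: str = "binary_cross_entropy"
    optimizer: str = "adam"
    epochs: int = 2
    batch_size: int = 16
    learning_rate: float = 6e-7  # written as 0.0000006 in the table


def binary_cross_entropy(labels: Sequence[int],
                         probs: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(labels) != len(probs) or len(labels) == 0:
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


config = TrainingConfig()
# Two confident, correct predictions give a small loss.
loss = binary_cross_entropy([1, 0], [0.9, 0.1])
```

A confident wrong prediction, such as probability 0.1 for a label of 1, contributes a much larger term, which is what drives the optimizer toward calibrated outputs.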
Dataset | Train Real | Train Fake | Train Total | Test Real | Test Fake | Test Total |
---|---|---|---|---|---|---|
Politifact | 60,664 | 60,664 | 121,328 | 11,370 | 11,370 | 22,740 |
GossipCop | 248,286 | 248,286 | 496,572 | 81,056 | 81,056 | 162,112 |
Data | Accuracy | Recall | Precision | F1-Score |
---|---|---|---|---|
Politifact | 0.82 | 0.81 | 0.83 | 0.82 |
GossipCop | 0.84 | 0.83 | 0.85 | 0.84 |
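The four reported metrics are related through the confusion matrix, and the F1-score is the harmonic mean of precision and recall. The sketch below computes all four from raw counts and then checks that the F1 values in the table above follow from the reported precision and recall. The function names and the confusion-matrix counts are hypothetical and are not taken from the experiments.

```python
from typing import Dict


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, for illustration only.
toy = classification_metrics(tp=8, fp=2, tn=7, fn=3)

# Consistency check using the reported baseline precision and recall.
politifact_f1 = round(f1_from(0.83, 0.81), 2)  # 0.82
gossipcop_f1 = round(f1_from(0.85, 0.83), 2)   # 0.84
```

Because the train and test splits are balanced between real and fake posts, accuracy and F1 stay close to each other in these results.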
Dataset | Model | Accuracy | Recall | Precision | F1-Score |
---|---|---|---|---|---|
Politifact | Baseline | 0.82 | 0.81 | 0.83 | 0.82 |
Politifact | BERTweet–BiLSTM | 0.85 | 0.85 | 0.85 | 0.85 |
GossipCop | Baseline | 0.84 | 0.83 | 0.85 | 0.84 |
GossipCop | BERTweet–BiLSTM | 0.88 | 0.89 | 0.88 | 0.88 |
Dataset | Model | Accuracy | Recall | Precision | F1-Score |
---|---|---|---|---|---|
Politifact | Baseline | 0.82 | 0.81 | 0.83 | 0.82 |
Politifact | BERTweet–BiLSTM | 0.85 | 0.85 | 0.85 | 0.85 |
Politifact | TChecker | 0.93 | 0.93 | 0.93 | 0.93 |
GossipCop | Baseline | 0.84 | 0.83 | 0.85 | 0.84 |
GossipCop | BERTweet–BiLSTM | 0.88 | 0.89 | 0.88 | 0.88 |
GossipCop | TChecker | 0.91 | 0.91 | 0.91 | 0.91 |
Dataset | Input | Model | Accuracy | Recall | Precision | F1-Score |
---|---|---|---|---|---|---|
Politifact | Articles | Text-CNN | 0.653 | 0.863 | 0.678 | 0.76 |
Politifact | Articles + Tweets | TCNN-URG | 0.712 | 0.712 | 0.723 | 0.722 |
Politifact | Articles + Tweets | CSI | 0.807 | 0.813 | 0.821 | 0.817 |
Politifact | Articles + Tweets | dEFEND | 0.90 | 0.90 | 0.90 | 0.90 |
Politifact | Articles + Tweets | TChecker | 0.93 | 0.93 | 0.93 | 0.93 |
GossipCop | Articles | Text-CNN | 0.739 | 0.477 | 0.707 | 0.569 |
GossipCop | Articles + Tweets | TCNN-URG | 0.736 | 0.521 | 0.715 | 0.603 |
GossipCop | Articles + Tweets | CSI | 0.762 | 0.658 | 0.722 | 0.688 |
GossipCop | Articles + Tweets | dEFEND | 0.808 | 0.808 | 0.808 | 0.808 |
GossipCop | Articles + Tweets | TChecker | 0.91 | 0.91 | 0.91 | 0.91 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
GabAllah, N.; Sharara, H.; Rafea, A. TChecker: A Content Enrichment Approach for Fake News Detection on Social Media. Appl. Sci. 2023, 13, 13070. https://doi.org/10.3390/app132413070