Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models
Abstract
:1. Introduction
- Evaluate the effectiveness of advanced AI models in fake news detection, focusing on large language models (LLMs).
- Compare the performance of these models before and after fine-tuning using few-shot learning techniques.
- Examine the costs and trade-offs associated with fine-tuning LLMs and their implications for real-world applications.
- Explore the transformative potential of LLMs to provide automated detection and actionable insights for the news industry.
- Investigate the following critical questions that have yet to be thoroughly examined in previous studies:
- Q1: are traditional NLP and CNN models or LLMs more accurate in fake news detection tasks?
- Q2: among the GPT-4 Omni family, which model performs best prior to fine-tuning?
- Q3: after fine-tuning with few-shot learning, which model in the GPT-4 Omni family demonstrates superior performance?
- Q4: what is the significance of the costs associated with fine-tuning LLMs, and how do these costs impact performance in the news sector?
- Q5: how can LLMs be effectively leveraged to assess fake news, and what transformative effects can they have on the news industry through automated detection and actionable insights?
- Section 2 presents a comprehensive review of the existing literature on fake news detection and classification, with a focus on the application of DNN models in the news sector.
- Section 3 outlines the methodologies employed in this study, detailing the fine-tuning of models using few-shot learning techniques to ensure transparency and reproducibility.
- Section 4 reports the predictive results of the analyzed models, highlighting their performance both before and after fine-tuning.
- Section 5 provides an in-depth discussion of the findings, extracting actionable insights and advancing the discourse on leveraging AI to combat misinformation effectively.
2. Literature Review
2.1. Feature-Based Detection Approaches
2.2. Deep Learning Techniques
2.3. Multi-Modal and Hybrid Approaches
2.4. NLP and Machine Learning
2.5. Network-Based Detection Approaches
2.6. Meta-Analytic and Comparative Studies
2.7. Specialized Detection Models
2.8. Emerging Trends and Novel Techniques
2.9. Augmentation and Transfer Learning
2.10. Cooperative and Feedback-Based Models
2.11. Toxic News and Multiclass Classification
3. Materials and Methods
3.1. Dataset Cleaning, Preprocessing, and Splitting
- Serial Number: the index of the article, starting at 0;
- Title: the headline of the article;
- Text: the full article content;
- Label: the classification label (0 for fake, 1 for real).
3.1.1. Dataset Preprocessing
- Column Removal: We removed the “Unnamed: 0” column, which was deemed irrelevant to the analysis and redundant.
- Empty Row Removal: We performed a thorough check for missing values across the “Title”, “Text”, and “Label” columns. Any rows containing missing values were removed to maintain the integrity of the data.
- Column Merging: The “Title” and “Text” columns were combined into a new consolidated column, named “Text”, to provide the model with a unified input that included both the article headline and content.
- Label Standardization: The “label” column was standardized and renamed as “Label” for consistency across the dataset and to align with our modeling pipeline.
- Text Length Restriction: We set a maximum length of 2560 characters for the “Text” column. This length was chosen to balance sufficient contextual information for training (particularly for CNN and BERT models) while maintaining memory and processing efficiency. After this truncation, the dataset contained 7573 entries labeled as 1 (real) and 7313 entries labeled as 0 (fake).
- Data Standardization: Following the truncation, we standardized the dataset to ensure consistency and facilitate model convergence. After this step, we re-checked for any empty rows that might have been introduced and removed them, leaving a final dataset of 7568 real (1) and 7313 fake (0).
- Balanced Sampling: To address potential class imbalances, we applied stratified sampling to select 5000 entries, with 2500 entries from each class (fake and real). This step ensured that the dataset was balanced, which is essential for training effective classification models.
- ID Addition: A unique identifier (ID) was assigned to each entry to assist with tracking and error handling during the modeling process.
3.1.2. Dataset Splitting
- Training Set (80%): 3200 samples;
- Testing Set (20%): 1000 samples;
- Validation Set (20% of the training data): 800 samples.
3.2. LLM Prompt Engineering
- Content Independent of Model Architecture: We designed the prompt to be versatile and not dependent on any single model’s framework. This flexibility ensured that it could be applied across different LLMs with minimal adjustment, focusing on clear communication of the task with relevant context and instructions interpretable by any LLM.
- Structured Output for Accessibility: Recognizing the importance of usability, we created a response format that aligned with coding and accessibility standards. The output was organized in compliance with the JSON standard, offering a logical, intuitive structure that meets both human readability and machine processing requirements.
Listing 1. Model-agnostic prompt. |
conversation.append({’role’: ’system’, ’content’: “You are an AI model tasked with predicting whether a news article is fake news. Respond with 0 for fake and 1 for not fake. Return your response in JSON format: {’fake’: integer}.”}) conversation.append({’role’: ’user’, ’content’: f”Predict if the following news article is fake news (0 for fake and 1 for not fake). Please respond in JSON format like this example: {{’fake’: integer}}. Please avoid providing additional explanations. Article text:\n{input[’Text’]}”}) |
3.3. Model Deployment, Fine-Tuning, and Predictive Evaluation
3.3.1. GPT Model Deployment and Fine-Tuning
Listing 2. Prompt and completion pairs—JSONL files. |
{“messages”: [{“role”: “system”,
{“role”: “assistant”, “content”: “{’fake’: 1}”}]} |
3.3.2. BERT Model Deployment and Fine-Tuning
3.3.3. CNN Model Deployment and Fine-Tuning
4. Results
4.1. Overview of Fine-Tuning Metrics
4.2. Model Evaluation Phase
4.2.1. Pre-Fine-Tuning Evaluation
4.2.2. Post Fine-Tuning Evaluation
5. Discussion
5.1. Evaluating Traditional NLP Models vs. LLMs in Fake News Detection
- Research Question 1: are traditional NLP and CNN models or LLMs more accurate in fake news detection tasks?
- Research Statement 1: fine-tuned LLMs outperform traditional NLP and CNN models in fake news detection, achieving near-perfect accuracy.
- The role of pretraining and architecture: Unlike CNNs, which are primarily designed for pattern recognition in structured data like images [45], LLMs are built with transformer-based architectures that allow for deep attention mechanisms and sequence-based learning. These transformers, pretrained on vast and diverse datasets, are adept at capturing language patterns, idiomatic expressions, and subtle semantic relationships. In fake news detection, this translates to a model that can understand nuanced phrasing or stylistic cues typical of misinformation, even when these cues are subtle or context-dependent.
- CNN limitations: The CNN model (ft:cnn_adam) in this study achieved only 58.6% accuracy, which is markedly lower than the transformer-based models. CNNs are effective at identifying repetitive, structured patterns but fall short when tasked with understanding the complexities of human language, especially when misleading content relies on nuanced or indirect language. Since CNNs do not inherently process sequential information as effectively as transformers, they struggle to recognize the sequential and contextual patterns often necessary for distinguishing fake news. Furthermore, CNNs require substantial labeled data tailored to the target task to perform well in NLP tasks, given their lack of extensive pretraining on varied textual data [46].
- Comparing BERT and GPT models in fake news classification: The BERT model (ft:bert-adam), while achieving a respectable 97.5% accuracy, still fell short of the fine-tuned GPT-4 Omni models. This difference, although minor, may be attributed to the GPT-4 Omni models’ extensive pretraining and perhaps larger scale compared to BERT. Additionally, while both BERT and GPTs are transformer-based, GPT models are autoregressive, which means they are trained to predict the next word in a sequence, potentially enhancing their understanding of sentence flow and structure—elements that are crucial for detecting deceptive or misleading language. BERT’s bidirectional nature gives it a slight advantage in understanding context but might limit its proficiency in tasks requiring generation or classification of nuanced language.
5.2. Pre-Fine-Tuning Performance Assessment Within the GPT-4 Omni Family
- Research Question 2: among the GPT-4 Omni family, which model performs best prior to fine-tuning?
- Research Statement 2: prior to fine-tuning, GPT-4 Omni models perform poorly in fake news detection, highlighting the necessity for task-specific training.
- Baseline performance and lack of task-specific knowledge: The low accuracy scores of 12.3% for GPT-4o and 24.3% for GPT-4o-mini underscore that both models lack the task-specific knowledge required for effective fake news detection in a zero-shot setting. These results suggest that while LLMs have extensive general language understanding, applying this to a nuanced, specialized task like misleading news categorization is challenging without specific tuning. Fake news classification often relies on recognizing subtle cues, phrasing patterns, and contextual red flags that are challenging for general-purpose models to identify without tailored training.
- The performance gap between the models—specifically, the nearly 50% difference in accuracy between GPT-4o and GPT-4o-mini—was unexpected. A plausible hypothesis is that GPT-4o, despite being larger and more powerful, may have been overfitted to its training data. This could cause the model to struggle in a zero-shot classification task if it overly relies on patterns from the training set that do not generalize well to new data. On the other hand, GPT-4o-mini, with its smaller parameter size, might have avoided overfitting, leading to better generalization in a zero-shot setting. It is a fact that sometimes smaller models can perform better in certain tasks because they learn to prioritize the most important features and avoid distractions from irrelevant data, while larger models might become bogged down by unnecessary complexity [47].
- Challenges of zero-shot fake news detection: Fake news detection is a complex task that requires not only general language understanding but also the ability to differentiate between legitimate and deceptive communication. Fraudulent content often imitates legitimate language, which makes it difficult to classify correctly without exposure to examples during training [40]. Zero-shot models, despite their general versatility, lack the fine-grained knowledge to identify these distinctions [48]. This is especially true in domains like fake news classification, where subtle stylistic or structural cues might signal fake news, and understanding these cues requires domain-specific data exposure.
- Implications of prompt complexity: Attempts to simplify prompts did not result in significant improvements in zero-shot performance, suggesting that prompt engineering alone may not be sufficient for bridging the knowledge gap in specialized tasks [49]. While prompt optimization can enhance zero-shot performance in some general tasks, its limited impact here implies that fake news detection requires more than refined prompting; it needs models that have been trained on data specific to the task. This finding emphasizes that while LLMs are powerful, there are limits in what can be achieved through zero-shot learning alone in cases where the task requires deep contextual familiarity.
5.3. Fine-Tuning Impact on GPT-4 Omni Models with Few-Shot Learning
- Research Question 3: after fine-tuning with few-shot learning, which model in the GPT-4 Omni family demonstrates superior performance?
- Research Statement 3: after fine-tuning with few-shot learning, GPT-4o and GPT-4o-mini both achieve 98.6% accuracy, with GPT-4o-mini offering a resource-efficient alternative.
- High accuracy and comparable performance: Both GPT-4o and GPT-4o-mini achieved a remarkable accuracy of 98.6% after fine-tuning, suggesting that fine-tuning with few-shot learning equipped both models with a deep understanding of fake-related cues and patterns. This high accuracy indicates that fine-tuning enabled these models to internalize task-specific patterns, transforming general-purpose models into highly competent classifiers.
- Fine-tuning efficacy across model sizes: Fine-tuning proved equally effective for both the large and smaller models, suggesting that even a model with fewer parameters, like GPT-4o-mini, can achieve high accuracy when task-specific knowledge is provided through fine-tuning. This reinforces that model scaling is not always necessary for high performance in specialized tasks if effective fine-tuning methods, like few-shot learning, are applied. It also demonstrates that a smaller model, given the right training, can leverage its pre-existing language understanding to learn task-specific requirements efficiently.
- Scalability and flexibility in model deployment: The fact that GPT-4o-mini can achieve comparable performance to GPT-4o after fine-tuning suggests that smaller models in the GPT-4 Omni family can be scaled down without sacrificing substantial accuracy. This scalability is particularly beneficial for businesses or developers looking to deploy multiple models across various tasks, as smaller models require less computational power for deployment and can be trained more quickly [50]. Organizations that need to adapt quickly to new fake news detection patterns, for instance, might find GPT-4o-mini advantageous, as it combines high performance with adaptability and cost-effectiveness.
- Strategic model selection for application needs: For organizations with stringent accuracy standards in fake news detection, both models offer strong choices. However, GPT-4o-mini’s identical performance to GPT-4o and its lower computational footprint make it particularly suitable for real-time fake news classifiers, mobile applications, or cloud deployments where resource limitations are a concern [51]. By achieving high accuracy with fewer resources, GPT-4o-mini serves as an example of how model selection can be aligned with specific operational and budgetary needs without compromising on task accuracy [52].
5.4. Cost-Performance Analysis of Fine-Tuning LLMs for Fake News Detection
- Research Question 4: what is the significance of the costs associated with fine-tuning LLMs, and how do these costs impact performance in the news sector?
- Research Statement 4: fine-tuning LLMs like GPT-4o incurs high costs, but GPT-4o-mini offers a nearly equal performance, making it a cost-effective and sustainable choice for the news sector.
- Cost-performance trade-offs: Fine-tuning costs can vary dramatically between models, particularly as the model size and parameter count increase. While larger models like GPT-4o may offer accuracy improvements, these benefits often come with exponentially higher computational costs due to the additional resources needed for training and storage. The results of this study suggest that smaller models like GPT-4o-mini can achieve exactly the same accuracy (98.6%) as larger models, meaning that news organizations can achieve high performance without committing to the costs associated with the largest models.
- Scalability and resource allocation in newsrooms: Many newsrooms, especially smaller or independent ones, operate on limited budgets, making high-cost fine-tuning of large models unfeasible. GPT-4o-mini’s near-parity in performance with GPT-4o after fine-tuning suggests that news organizations could allocate their resources more efficiently by selecting smaller models that require fewer computational resources. The cost of fine-tuning the mini model was only USD 6.52, compared to USD 54.30 for the larger model—a significant difference. This disparity was similarly large during the prediction phase. By using smaller models, organizations can implement robust AI solutions across multiple tasks—such as fake news detection, fake news analysis, and content moderation—without incurring prohibitive costs. This approach makes AI-powered solutions more scalable and accessible across diverse newsroom environments.
- Sustainability and environmental impacts: Computationally intensive fine-tuning contributes to energy consumption, which has significant environmental implications [53]. The use of a smaller model like GPT-4o-mini, which requires less power and computational time, aligns with sustainability goals by reducing the carbon footprint associated with model training. For news organizations committed to minimizing their environmental impact, smaller models represent a more sustainable alternative that still delivers a high performance. This consideration is becoming increasingly important for industries striving to balance technological advancement with environmental responsibility.
5.5. Harnessing LLMs for Fake News Detection: Impact and Industry Transformation
- Research Question 5: how can LLMs be effectively leveraged to assess fake news, and what transformative effects can they have on the news industry through automated detection and actionable insights?
- Research Statement 5: LLMs can revolutionize fake news detection in the news industry by automating fact-checking, analyzing misinformation patterns, and optimizing journalistic workflows.
- Automated fake news detection and verification: LLMs excel in detecting subtle linguistic cues, including tone, intent, and inconsistencies in phrasing that may indicate misinformation. By analyzing text with high sensitivity to such patterns, these models can flag potentially deceptive articles, posts, or statements [1]. Automating fake news detection enables near-instant identification of suspicious content, providing journalists and editors with a tool to screen and verify information before it reaches the public. This real-time verification can significantly reduce the spread of fake news by catching it early in the content distribution pipeline.
- Analyzing patterns and trends in misinformation: LLMs can analyze large datasets to identify recurring patterns in misinformation [54]. For instance, they can detect repeated themes, sources, or specific phrasing commonly associated with fake news, which helps newsrooms understand how misinformation is structured and spread. These insights allow media organizations to better understand the origins and propagation mechanisms of fake news, helping them create targeted counter-narratives and education campaigns to inform the public. Moreover, such analysis can assist journalists in investigating and debunking trends in misinformation at their root, reducing their overall impact.
- Efficient allocation of journalistic resources: Fake news detection traditionally requires extensive time and effort from journalists to verify sources, cross-check facts, and consult experts. With LLMs automating much of this initial verification process, journalists are free to focus on in-depth investigative reporting or nuanced storytelling. LLMs can serve as frontline tools, handling large volumes of content for preliminary screening and allowing human editors to prioritize the content that truly needs expert analysis [55]. This efficiency can lead to increased productivity in newsrooms, allowing them to cover more stories and provide richer, more balanced perspectives.
- Content moderation and community engagement: News outlets can deploy LLMs to moderate user-generated content, such as comments on articles or social media platforms, where misinformation often proliferates. By filtering out or flagging misleading comments in real-time, LLMs could enable news organizations to maintain respectful and informative discussions around their content. This content moderation creates a safer, more reliable environment for audience engagement, reducing misinformation on news platforms and fostering healthier community discourse [56].
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Papageorgiou, E.; Chronis, C.; Varlamis, I.; Himeur, Y. A Survey on the Use of Large Language Models (LLMs) in Fake News. Future Internet 2024, 16, 298. [Google Scholar] [CrossRef]
- Shu, K.; Sliva, A.; Wang, S.; Tang, J.; Liu, H. Fake News Detection on Social Media: A data mining perspective. ACM SIGKDD Explor. Newsl. 2017, 19, 22–36. [Google Scholar] [CrossRef]
- Sakas, D.P.; Reklitis, D.P.; Trivellas, P. Social Media Analytics for Customer Satisfaction Based on User Engagement and Interactions in the Tourism Industry. In Proceedings of the Computational and Strategic Business Modelling, IC-BIM 2021, Athens, Greece, 18–19 December 2021; Springer Proceedings in Business and Economics. Springer: Berlin/Heidelberg, Germany, 2024; pp. 103–109. [Google Scholar] [CrossRef]
- Poulopoulos, V.; Vassilakis, C.; Wallace, M.; Antoniou, A.; Lepouras, G. The Effect of Social Media Trending Topics Related to Cultural Venues’ Content. In Proceedings of the 13th International Workshop on Semantic and Social Media Adaptation and Personalization, SMAP 2018, Zaragoza, Spain, 6–7 September 2018; pp. 7–12. [Google Scholar] [CrossRef]
- Reis, J.C.S.; Correia, A.; Murai, F.; Veloso, A.; Benevenuto, F.; Cambria, E. Supervised Learning for Fake News Detection. IEEE Intell. Syst. 2019, 34, 76–81. [Google Scholar] [CrossRef]
- Pérez-Rosas, V.; Kleinberg, B.; Lefevre, A.; Mihalcea, R. Automatic Detection of Fake News. In Proceedings of the COLING 2018—27th International Conference on Computational Linguistics, Santa Fe, NM, USA, 20–26 August 2017; pp. 3391–3401. [Google Scholar]
- Al Asaad, B.; Erascu, M. A Tool for Fake News Detection. In Proceedings of the 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2018, Timisoara, Romania, 20–23 September 2018; pp. 379–386. [Google Scholar] [CrossRef]
- Thota, A.; Tilak, P.; Ahluwalia, S.; Lohia, N. Fake News Detection: A Deep Learning Approach. SMU Data Sci. Rev. 2018, 1, 10. [Google Scholar]
- Kaliyar, R.K.; Goswami, A.; Narang, P.; Sinha, S. FNDNet—A Deep Convolutional Neural Network for Fake News Detection. Cogn. Syst. Res. 2020, 61, 32–44. [Google Scholar] [CrossRef]
- Yang, Y.; Zheng, L.; Zhang, J.; Cui, Q.; Zhang, X.; Li, Z.; Yu, P.S. TI-CNN: Convolutional Neural Networks for Fake News Detection. arXiv 2018, arXiv:1806.00749. [Google Scholar]
- Singhal, S.; Shah, R.R.; Chakraborty, T.; Kumaraguru, P.; Satoh, S. SpotFake: A Multi-Modal Framework for Fake News Detection. In Proceedings of the 2019 IEEE 5th International Conference on Multimedia Big Data, BigMM 2019, Singapore, 11–13 September 2019; pp. 39–47. [Google Scholar] [CrossRef]
- Devarajan, G.G.; Nagarajan, S.M.; Amanullah, S.I.; Mary, S.A.S.A.; Bashir, A.K. AI-Assisted Deep NLP-Based Approach for Prediction of Fake News from Social Media Users. IEEE Trans. Comput. Soc. Syst. 2024, 11, 4975–4985. [Google Scholar] [CrossRef]
- Almarashy, A.H.J.; Feizi-Derakhshi, M.R.; Salehpour, P. Enhancing Fake News Detection by Multi-Feature Classification. IEEE Access 2023, 11, 139601–139613. [Google Scholar] [CrossRef]
- Oshikawa, R.; Qian, J.; Wang, W.Y. A Survey on Natural Language Processing for Fake News Detection. In Proceedings of the LREC 2020—12th International Conference on Language Resources and Evaluation, Conference Proceedings, Palais du Pharo, France, 11–16 May 2020; pp. 6086–6093. [Google Scholar]
- Mehta, D.; Patel, M.; Dangi, A.; Patwa, N.; Patel, Z.; Jain, R.; Shah, P.; Suthar, B. Exploring the Efficacy of Natural Language Processing and Supervised Learning in the Classification of Fake News Articles. Adv. Robot. Technol. 2024, 2, 1–6. [Google Scholar] [CrossRef]
- Madani, M.; Motameni, H.; Roshani, R. Fake News Detection Using Feature Extraction, Natural Language Processing, Curriculum Learning, and Deep Learning. Int. J. Inf. Technol. Decis. Mak. 2023, 23, 1063–1098. [Google Scholar] [CrossRef]
- Zhou, X.; Zafarani, R. Network-Based Fake News Detection: A pattern-driven approach. ACM SIGKDD Explor. Newsl. 2019, 21, 48–60. [Google Scholar] [CrossRef]
- Conroy, N.J.; Rubin, V.L.; Chen, Y. Automatic Deception Detection: Methods for Finding Fake News. Proc. Assoc. Inf. Sci. Technol. 2015, 52, 1–4. [Google Scholar] [CrossRef]
- Kozik, R.; Pawlicka, A.; Pawlicki, M.; Choraś, M.; Mazurczyk, W.; Cabaj, K. A Meta-Analysis of State-of-the-Art Automated Fake News Detection Methods. IEEE Trans. Comput. Soc. Syst. 2024, 11, 5219–5229. [Google Scholar] [CrossRef]
- Farhangian, F.; Cruz, R.M.O.; Cavalcanti, G.D.C. Fake News Detection: Taxonomy and Comparative Study. Inf. Fusion 2024, 103, 102140. [Google Scholar] [CrossRef]
- Alghamdi, J.; Lin, Y.; Luo, S. Towards COVID-19 Fake News Detection Using Transformer-Based Models. Knowl. Based Syst. 2023, 274, 110642. [Google Scholar] [CrossRef]
- Mahmud, M.A.I.; Talha Talukder, A.A.; Sultana, A.; Bhuiyan, K.I.A.; Rahman, M.S.; Pranto, T.H.; Rahman, R.M. Toward News Authenticity: Synthesizing Natural Language Processing and Human Expert Opinion to Evaluate News. IEEE Access 2023, 11, 11405–11421. [Google Scholar] [CrossRef]
- Yang, S.; Shu, K.; Wang, S.; Gu, R.; Wu, F.; Liu, H. Unsupervised Fake News Detection on Social Media: A Generative Approach. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5644–5651. [Google Scholar] [CrossRef]
- Liu, Y.; Wu, Y.F.B. FNED: A Deep Network for Fake News Early Detection on Social Media. ACM Trans. Inf. Syst. (TOIS) 2020, 38, 1–23. [Google Scholar] [CrossRef]
- Wani, M.A.; Elaffendi, M.; Shakil, K.A.; Abuhaimed, I.M.; Nayyar, A.; Hussain, A.; El-Latif, A.A.A. Toxic Fake News Detection and Classification for Combating COVID-19 Misinformation. IEEE Trans. Comput. Soc. Syst. 2024, 11, 5101–5118. [Google Scholar] [CrossRef]
- Kapusta, J.; Držik, D.; Šteflovič, K.; Nagy, K.S. Text Data Augmentation Techniques for Word Embeddings in Fake News Classification. IEEE Access 2024, 12, 31538–31550. [Google Scholar] [CrossRef]
- Raja, E.; Soni, B.; Borgohain, S.K. Fake News Detection in Dravidian Languages Using Transfer Learning with Adaptive Finetuning. Eng. Appl. Artif. Intell. 2023, 126, 106877. [Google Scholar] [CrossRef]
- Liu, Y.; Zhu, J.; Zhang, K.; Tang, H.; Zhang, Y.; Liu, X.; Liu, Q.; Chen, E. Detect, Investigate, Judge and Determine: A Novel LLM-Based Framework for Few-Shot Fake News Detection. arXiv 2024, arXiv:2407.08952. [Google Scholar]
- Mallick, C.; Mishra, S.; Senapati, M.R. A Cooperative Deep Learning Model for Fake News Detection in Online Social Networks. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 4451–4460. [Google Scholar] [CrossRef]
- Shushkevich, E.; Alexandrov, M.; Cardiff, J. Improving Multiclass Classification of Fake News Using BERT-Based Models and ChatGPT-Augmented Data. Inventions 2023, 8, 112. [Google Scholar] [CrossRef]
- Verma, P.K.; Agrawal, P.; Amorim, I.; Prodan, R. WELFake: Word Embedding Over Linguistic Features for Fake News Detection. IEEE Trans. Comput. Soc. Syst. 2021, 8, 881–893. [Google Scholar] [CrossRef]
- Fake News Classification. Available online: https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification (accessed on 30 October 2024).
- Zhang, K.; Zhou, F.; Wu, L.; Xie, N.; He, Z. Semantic Understanding and Prompt Engineering for Large-Scale Traffic Data Imputation. Inf. Fusion 2024, 102, 102038. [Google Scholar] [CrossRef]
- Zheng, Y.; Cai, R.; Maimaiti, M.; Abiderexiti, K. Chunk-BERT: Boosted Keyword Extraction for Long Scientific Literature via BERT with Chunking Capabilities. In Proceedings of the 2023 IEEE 4th International Conference on Pattern Recognition and Machine Learning, PRML 2023, Urumqi, China, 4–6 August 2023; pp. 385–392. [Google Scholar] [CrossRef]
- Models—OpenAI API. Available online: https://platform.openai.com/docs/models (accessed on 11 October 2024).
- What Runs ChatGPT? Inside Microsoft’s AI Supercomputer|Featuring Mark Russinovich—YouTube. Available online: https://www.youtube.com/watch?v=Rk3nTUfRZmo (accessed on 17 December 2023).
- Bert-Base-Uncased · Hugging Face. Available online: https://huggingface.co/bert-base-uncased (accessed on 17 December 2023).
- Pretrained Models—Transformers 3.3.0 Documentation. Available online: https://huggingface.co/transformers/v3.3.1/pretrained_models.html (accessed on 17 December 2023).
- BERT—Transformers 3.0.2 Documentation. Available online: https://huggingface.co/transformers/v3.0.2/model_doc/bert.html (accessed on 5 November 2024).
- Roumeliotis, K.I.; Tselikas, N.D.; Nasiopoulos, D.K.; Roumeliotis, K.I.; Tselikas, N.D.; Nasiopoulos, D.K. Next-Generation Spam Filtering: Comparative Fine-Tuning of LLMs, NLPs, and CNN Models for Email Spam Classification. Electronics 2024, 13, 2034. [Google Scholar] [CrossRef]
- Tqdm · PyPI. Available online: https://pypi.org/project/tqdm/ (accessed on 17 December 2023).
- GitHub-Applied-AI-Research-Lab/Fake-News-Detection-and-Classification-A-Comparative-Study-of-CNN-LLMs-and-NLP-Models. Available online: https://github.com/Applied-AI-Research-Lab/Fake-News-Detection-and-Classification-A-Comparative-Study-of-CNN-LLMs-and-NLP-Models (accessed on 14 December 2024).
- Garcia, C.I.; Grasso, F.; Luchetta, A.; Piccirilli, M.C.; Paolucci, L.; Talluri, G. A Comparison of Power Quality Disturbance Detection and Classification Methods Using CNN, LSTM and CNN-LSTM. Appl. Sci. 2020, 10, 6755. [Google Scholar] [CrossRef]
- Roumeliotis, K.I.; Tselikas, N.D.; Nasiopoulos, D.K. LLMs and NLP Models in Cryptocurrency Sentiment Analysis: A Comparative Classification Study. Big Data Cogn. Comput. 2024, 8, 63. [Google Scholar] [CrossRef]
- Amiri, Z.; Heidari, A.; Navimipour, N.J.; Unal, M.; Mousavi, A. Adventures in Data Analysis: A Systematic Review of Deep Learning Techniques for Pattern Recognition in Cyber-Physical-Social Systems. Multimed. Tools Appl. 2024, 83, 22909–22973. [Google Scholar] [CrossRef]
- Bhatti, U.A.; Tang, H.; Wu, G.; Marjan, S.; Hussain, A. Deep Learning with Graph Convolutional Networks: An Overview and Latest Applications in Computational Intelligence. Int. J. Intell. Syst. 2023, 2023, 8342104. [Google Scholar] [CrossRef]
- Hestness, J.; Narang, S.; Ardalani, N.; Diamos, G.; Jun, H.; Kianinejad, H.; Patwary, M.M.A.; Yang, Y.; Zhou, Y. Deep Learning Scaling Is Predictable, Empirically. arXiv 2017, arXiv:1712.00409. [Google Scholar]
- Rojas-Galeano, S. Zero-Shot Spam Email Classification Using Pre-Trained Large Language Models. In Applied Computer Sciences in Engineering, Proceedings of the 11th Workshop on Engineering Applications, WEA 2024, Barranquilla, Colombia, 23–25 October 2024; Springer: Berlin/Heidelberg, Germany, 2025; pp. 3–18. [Google Scholar] [CrossRef]
- Mu, Y.; Wu, B.P.; Thorne, W.; Robinson, A.; Aletras, N.; Scarton, C.; Bontcheva, K.; Song, X. Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024—Main Conference Proceedings, Torino, Italia, 20-25 May 2024; pp. 12074–12086. [Google Scholar]
- OpenAI Launches GPT-4o Mini, a Slimmer, Cheaper AI Model for Developers—Pure AI. Available online: https://pureai.com/Articles/2024/07/18/OpenAI-Launches-GPT-4o-Mini.aspx (accessed on 9 November 2024).
- GPT-4o vs. GPT-4o-Mini: Which AI Model to Choose? Available online: https://anthemcreation.com/en/artificial-intelligence/comparative-gpt-4o-gpt-4o-mini-open-ai/ (accessed on 9 November 2024).
- A Guide to GPT4o Mini: OpenAI’s Smaller, More Efficient Language Model. Available online: https://kili-technology.com/large-language-models-llms/a-guide-to-gpt4o-mini-openai-s-smaller-more-efficient-language-model (accessed on 9 November 2024).
- Huang, K.; Yin, H.; Huang, H.; Gao, W. Towards Green AI in Fine-Tuning Large Language Models via Adaptive Backpropagation. In Proceedings of the 12th International Conference on Learning Representations, ICLR 2024, Vienna, Austria, 7–11 May 2024. [Google Scholar]
- Teo, T.W.; Chua, H.N.; Jasser, M.B.; Wong, R.T.K. Integrating Large Language Models and Machine Learning for Fake News Detection. In Proceedings of the 2024 20th IEEE International Colloquium on Signal Processing and Its Applications, CSPA 2024—Conference Proceedings, Langkawi, Malaysia, 1–2 March 2024; pp. 102–107. [Google Scholar] [CrossRef]
- Kumar, R.; Goddu, B.; Saha, S.; Jatowt, A. Silver Lining in the Fake News Cloud: Can Large Language Models Help Detect Misinformation? IEEE Trans. Artif. Intell. 2024, 1–11. [Google Scholar] [CrossRef]
- Ma, H.; Zhang, C.; Fu, H.; Zhao, P.; Wu, B. Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-Tuning. arXiv 2023, arXiv:2310.03400. [Google Scholar]
Paper | Objective | Approach | Results | Contribution to the Field |
---|---|---|---|---|
Reis et al. (2019) [5] | Investigate supervised learning techniques for fake news detection in social media contexts | Feature extraction from news articles and social media posts, supervised learning | Identified critical features and revealed effectiveness of various feature sets for fake news detection | Introduced a novel set of features and provided insights into the challenges of detecting false information, emphasizing practical applications. |
Pérez-Rosas et al. (2018) [6] | Address the challenge of misleading information in accessible media with fake news classification | Developed two novel datasets for fake news classification, conducted linguistic analysis and comparative experiments | Automated methods outperformed manual identification in fake news classification | Demonstrated the advantages of computational tools over manual approaches in identifying fake news and highlighted linguistic differences between fake and legitimate news. |
Al Asaad et al. (2018) [7] | Examine the implications of the “post-truth” era and propose a framework for detecting fake news | Supervised learning with feature extraction using Bag-of-Words and TF-IDF | Linear classification with TF-IDF achieved highest accuracy, bigram models were less effective | Emphasized the importance of feature selection and classification strategies for effective fake news detection, providing insights into the “post-truth” era’s impact on misinformation. |
Thota et al. (2018) [8] | Propose a deep learning approach to fake news classification, addressing the binary classification limitation | Neural network architecture to predict stance between headlines and article bodies | Achieved an accuracy of 94.21% and a 2.5% improvement over previous models | Focused on the need for automated systems and emphasized the improvement over existing models by predicting nuanced relationships between headlines and bodies. |
Kaliyar et al. (2020) [9] | Introduce FNDNet, a deep CNN for fake news detection | CNN-based model that automatically learns discriminative features through multiple hidden layers | Achieved an impressive accuracy of 98.36%, outperforming existing techniques | Demonstrated the potential of CNN-based models for fake news detection and highlighted the automatic feature learning process, marking a significant improvement over traditional methods. |
Yang et al. (2018) [10] | Explore fake news classification by integrating textual and visual information using the TI-CNN model | Deep learning model combining both textual and visual information for fake news classification | Achieved effective fake news detection using both explicit and latent feature extraction | Introduced an innovative approach by incorporating both textual and visual information, improving the robustness and accuracy of fake news classification. |
Singhal et al. (2019) [11] | Introduce SpotFake, a multi-modal framework for fake news detection leveraging both textual and visual features | Multi-modal framework using BERT for text feature extraction and VGG-19 for image feature extraction | Improved performance by 3.27% (Twitter) and 6.83% (Weibo) over state-of-the-art results | Demonstrated the effectiveness of integrating both textual and visual features for fake news detection, surpassing existing techniques. |
Devarajan et al. (2023) [12] | Propose an AI-assisted deep NLP-based approach for detecting fake news across social media platforms | Incorporates social features and deep learning across four layers: publisher, social media networking, enabled edge, and cloud | Achieved 99.72% accuracy and 98.33% F1 score | Significantly outperformed existing methods, offering a comprehensive approach that integrates social media features with deep NLP for improved detection. |
Almarashy et al. (2023) [13] | Enhance accuracy in fake news classification by using a multi-feature classification model | Extracts global, spatial, and temporal features from text using TF-IDF, CNNs, and BiLSTM | Demonstrated superiority over previous methods in classification accuracy | Highlighted the benefits of combining multiple feature extraction techniques (global, spatial, and temporal) for improved fake news detection. |
Oshikawa et al. (2020) [14] | Provide a comprehensive survey on the intersection of NLP and machine learning in fake news detection | Review existing datasets, task formulations, and NLP solutions | Emphasized the need for practical detection models to improve effectiveness | Advocated for more refined detection models, highlighting the challenges of fake news classification and the importance of automatic detection methods. |
Mehta et al. (2024) [15] | Focus on the efficacy of NLP and supervised learning in classifying fake news articles | NLP-based feature extraction followed by supervised learning | Achieved high accuracy, precision, recall, and F1 score | Demonstrated robust performance with NLP techniques and supervised learning, revealing significant contributors to successful classification and providing valuable insights. |
Madani et al. (2023) [16] | Propose a two-phase model combining NLP and machine learning for fake news detection | Hybrid method with curriculum learning, k-nearest neighbor algorithm | Demonstrated superior performance compared to benchmark models | Showcased the potential of hybrid feature extraction and machine learning methods to enhance the performance of fake news detection models. |
Zhou et al. (2019) [17] | Propose a network-based pattern-driven approach for fake news detection | Analyzed patterns of fake news propagation through social networks using social psychological theories, applying network-level analysis | Outperformed existing state-of-the-art techniques | Enhanced feature engineering for fake news detection by focusing on social network patterns, improving explainability of detection models. |
Conroy et al. (2015) [18] | Explore hybrid detection approaches combining linguistic cues with network analysis | Combined content-based analysis with network-based insights to identify deception in online news | Provided a robust hybrid framework for fake news classification | Demonstrated the effectiveness of integrating multiple methodologies (linguistic- and network-based) to improve fake news detection and combat misinformation. |
Kozik et al. (2024) [19] | Survey state-of-the-art technologies for fake news detection | Categorized veracity assessment methods into linguistic cue approaches and network analysis techniques. Proposed a hybrid approach combining both methods. | Advocated for a hybrid approach of linguistic cues and network-based behavioral data to improve fake news detection | Provided operational guidelines for developing effective fake news classifier systems and emphasized the evolving challenges in the online news publication landscape. |
Farhangian et al. (2024) [20] | Address challenges posed by the proliferation of social networks in fake news detection | Introduced an updated taxonomy based on feature types, detection perspectives, feature representation methods, and classification approaches. Conducted an empirical study on feature extraction and classification techniques. | Transformer-based approaches demonstrated superior performance; optimal feature extraction methods are dataset-dependent. | Emphasized the value of combining multiple feature representation methods and classification algorithms, particularly for improved generalization and efficiency. |
Alghamdi et al. (2023) [21] | Detect COVID-19 fake news using transformer-based models | Fine-tuning pre-trained transformer models (BERT, COVID-Twitter-BERT) with downstream CNN and BiGRU layers | Achieved a state-of-the-art F1 score of 98% with CT-BERT augmented with BiGRU | Highlighted the effectiveness of fine-tuning transformer models and augmenting them with neural network layers for COVID-19 fake news detection. |
Mahmud et al. (2023) [22] | Address news authenticity issues with socio-political influences and biased news | Proposed a novel framework integrating blockchain technology, smart contracts, and incremental machine learning | Achieved initial accuracies of 84.94% for training and 84.99% for testing, improving to 93.75% and 93.80% after nine rounds of incremental training | Introduced blockchain and incremental machine learning to assess news credibility, demonstrating the potential of decentralized platforms for news verification. |
Yang et al. (2019) [23] | Introduce an unsupervised method for fake news detection | Generative model using a Bayesian network, treating news truths and user credibility as latent variables | Achieved notable improvements over existing unsupervised methods | Introduced a generative, unsupervised approach to fake news detection, utilizing user engagement data to infer authenticity without labeled data. |
Liu et al. (2020) [24] | Develop FNED for early fake news detection | Deep neural network with feature extractor, position-aware attention mechanism, and multi-region mean-pooling | Achieved over 90% accuracy within five minutes of news propagation, outperforming baselines with only 10% labeled samples | Proposed FNED, a model designed for early-stage fake news detection, achieving high accuracy with limited labeled data. |
Wani et al. (2023) [25] | Focus on toxic fake news classification for COVID-19 misinformation | Machine learning techniques (SVM, random forest) and transformer-based models (BERT) for toxicity analysis | Linear SVM achieved 92% accuracy, with high F1, F2, and F0.5 scores | Introduced a toxicity-oriented approach for distinguishing toxic fake news, suggesting its effectiveness for misinformation detection. |
Kapusta et al. (2024) [26] | Examine text data augmentation techniques for fake news classification | Synonym Replacement, Back Translation, and Reduction of Function Words (FWD) for corpus augmentation | Back Translation improved accuracy in SVM and Bernoulli Naive Bayes models, FWD improved Logistic Regression, original corpus performed best in Random Forest | Introduced data augmentation techniques that enhance the performance of word embeddings and classifiers in fake news detection. |
Raja et al. (2023) [27] | Address fake news detection in Dravidian languages using transfer learning | Fine-tuning mBERT and XLM-R pretrained models with adaptive learning strategies | Achieved 93.31% accuracy on Dravidian fake news dataset, outperforming existing methods | Proposed a transfer learning approach for fake news detection in low-resource languages, demonstrating effectiveness with adaptive fine-tuning. |
Liu et al. (2024) [28] | Develop few-shot fake news detection (FS-FND) framework using LLMs | Dual-perspective Augmented Fake News Detection (DAFND) model with multiple modules | Effective in low-resource settings, improving fake news detection through the integration of multiple modules | Introduced a few-shot detection framework using large language models, focusing on low-resource scenarios and in-context learning for fake news detection. |
Mallick et al. (2023) [29] | Develop a cooperative deep learning model for fake news detection | Incorporated user feedback to assess news trust levels and ranked news accordingly | Achieved 98% accuracy for fake news detection, outperforming many existing models | Proposed a cooperative deep learning approach with user feedback, which refines the model through continuous engagement to improve fake news detection. |
Shushkevich et al. (2023) [30] | Address multi-class fake news detection with a BERT-based approach | Used SBERT, RoBERTa, mBERT, and ChatGPT-generated synthetic data for class balancing | Superior performance to existing methods using a multi-class classification framework with true, false, partially false, and other categories | Expanded the framework for fake news detection from binary to multi-class classification, improving detection outcomes using BERT-based models and synthetic data. |
Model | Resources | Training Loss | Validation Loss | Training Time (Seconds) | Training Cost |
---|---|---|---|---|---|
ft:gpt-4o | API | 0.0021 | 0.0037 | 3353 | USD 54.30 |
ft:gpt-4o-mini | API | 0.0081 | 0.0075 | 1779 | USD 6.52 |
ft:bert-adam | Tesla V100-SXM2-16 GB | 0.0294 | 0.0386 | 877 | USD 2.54 |
ft:cnn-adam | Tesla V100-SXM2-16 GB | 0.6253 | 0.5884 | 47.90 | USD 0.14 |
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
base:gpt-gpt-4o-2024-08-06 | 0.123 | 0.123 | 0.123 | 0.123 |
base: gpt-4o-mini-2024-07-18 | 0.243 | 0.1969 | 0.243 | 0.2123 |
ft:gpt-4o | 0.986 | 0.9861 | 0.986 | 0.986 |
ft:gpt-4o-mini | 0.986 | 0.9861 | 0.986 | 0.986 |
ft:bert-adam | 0.975 | 0.9758 | 0.975 | 0.975 |
ft:cnn_adam | 0.586 | 0.6334 | 0.586 | 0.5457 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Roumeliotis, K.I.; Tselikas, N.D.; Nasiopoulos, D.K. Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models. Future Internet 2025, 17, 28. https://doi.org/10.3390/fi17010028
Roumeliotis KI, Tselikas ND, Nasiopoulos DK. Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models. Future Internet. 2025; 17(1):28. https://doi.org/10.3390/fi17010028
Chicago/Turabian StyleRoumeliotis, Konstantinos I., Nikolaos D. Tselikas, and Dimitrios K. Nasiopoulos. 2025. "Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models" Future Internet 17, no. 1: 28. https://doi.org/10.3390/fi17010028
APA StyleRoumeliotis, K. I., Tselikas, N. D., & Nasiopoulos, D. K. (2025). Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models. Future Internet, 17(1), 28. https://doi.org/10.3390/fi17010028