Fake User Detection Based on Multi-Model Joint Representation
Abstract
1. Introduction
- MAPM uses transfer learning modules built on large pretrained models such as CLIP, BERT, and HiFi-NET to aggregate explicit and implicit features across text, images, and user data, producing a fused representation of user behavior characteristics.
- We propose a Time Interval Detection Network aimed at representing potential temporal characteristics of user posting behavior, enhancing the model’s capability to detect unusual user behavior.
- By combining implicit behavioral features with explicit user features, we construct a spectral clustering-based unsupervised classification module to further classify fake users, providing a potential approach for public opinion analysis in social media management.
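To make the temporal contribution more concrete, the following is a minimal sketch of how posting timestamps could be turned into interval statistics. It is not the Time Interval Detection Network itself; it only illustrates the kind of interval signal such a network might consume. The function names (`posting_intervals`, `interval_summary`) and the sample timestamps are hypothetical and do not come from the paper.

```python
from datetime import datetime
from statistics import mean, pstdev


def posting_intervals(timestamps):
    """Return the gaps, in seconds, between consecutive posts (sorted by time)."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


def interval_summary(timestamps):
    """Summarize posting gaps with mean, population std, and coefficient of variation."""
    gaps = posting_intervals(timestamps)
    if not gaps:
        # Fewer than two posts: no interval information is available.
        return {"mean": 0.0, "std": 0.0, "cv": 0.0}
    gap_mean = mean(gaps)
    gap_std = pstdev(gaps)
    return {
        "mean": gap_mean,
        "std": gap_std,
        "cv": gap_std / gap_mean if gap_mean else 0.0,
    }


# Hypothetical accounts: one posts every 10 minutes, the other irregularly.
bot_like = [datetime(2024, 1, 1, 0, minute) for minute in (0, 10, 20, 30, 40)]
human_like = [datetime(2024, 1, 1, hour, minute)
              for hour, minute in ((8, 5), (8, 7), (12, 40), (21, 15), (23, 59))]

print(interval_summary(bot_like))
print(interval_summary(human_like))
```

A perfectly regular schedule gives a coefficient of variation of zero, while irregular posting gives a clearly positive value. A learned sequence model would presumably capture richer patterns than these three summary numbers.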
2. Literature Review
2.1. Fake News Detection
2.2. Fake User Detection
3. Methodology
3.1. The Window User Feature Space
3.2. Sequence Interval Detect Net
3.3. Classifier Module of MAPM
4. Experiments
4.1. Data Sets
4.1.1. BERT Text Categorization Dataset
4.1.2. Weibo User Dataset
4.2. Analyzing Module Performance
4.3. Data Sample Distribution and Evaluation Methods
4.3.1. The Influence of Implicit Features on Sample Distribution
4.3.2. Performance Evaluation Metrics on Weibo User Dataset
4.3.3. Dissociation Experiment
- MAPM: Includes all modules;
- w/o SIDN: Removes time series features;
- w/o BERT: Removes text composition features;
- w/o clip: Removes text-image consistency features;
- w/o DEEP: Disables all deep learning modules, retaining only explicit user features;
- Normalize Score: Uses the normalized Davies–Bouldin score plus the Calinski–Harabasz score minus the Silhouette score to obtain a single combined score.
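The combination rule in the last item can be sketched as follows. The metric values are copied from the ablation table in this article, but the normalization scheme is an assumption: the text does not say how the scores are normalized, so this sketch applies min-max scaling to each metric across the five variants. The names `METRICS`, `min_max`, and `combined_scores` are hypothetical. Because of that assumption, the outputs are not expected to match the reported Normalize Score column exactly.

```python
# Internal metrics per variant, copied from the ablation table:
# (Davies-Bouldin, Calinski-Harabasz, Silhouette)
METRICS = {
    "MAPM": (1.224, 297.014, 0.109),
    "w/o SIDN": (1.497, 296.480, 0.043),
    "w/o BERT": (1.463, 609.496, 0.227),
    "w/o clip": (1.452, 340.384, 0.137),
    "w/o DEEP": (1.367, 580.108, 0.234),
}


def min_max(values):
    """Scale values to [0, 1]; this normalization choice is an assumption."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


def combined_scores(metrics):
    """Return Davies-Bouldin + Calinski-Harabasz - Silhouette on normalized metrics."""
    names = list(metrics)
    db, ch, sil = (min_max([metrics[name][i] for name in names]) for i in range(3))
    return {name: db[k] + ch[k] - sil[k] for k, name in enumerate(names)}


scores = combined_scores(METRICS)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.3f}")
```

Under this min-max assumption the full MAPM configuration still receives the lowest combined score of the five variants, which agrees with the ordering of the reported column at that end, although the individual values differ.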
4.4. Exploration of Implicit Features and Hyperparameter Selection Effects
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Model | Precision | Recall | F1 Score
---|---|---|---|
NB | 0.819 | 0.816 | 0.814 |
KNN | 0.855 | 0.853 | 0.849 |
SVM | 0.874 | 0.876 | 0.874 |
CNN-1 | 0.925 | 0.924 | 0.924 |
CNN-3 | 0.919 | 0.918 | 0.917 |
composite-CNN | 0.937 | 0.937 | 0.937 |
THUCTC | 0.886 | 0.829 | 0.856 |
BERT | 0.933 | 0.930 | 0.930 |
User Category | Precision | Recall | F1 Score | Sample Size
---|---|---|---|---|
Normal User | 0.983 | 0.993 | 0.988 | 1756 |
Reproduce User | 0.957 | 0.852 | 0.901 | 155 |
Lottery User | 0.909 | 0.903 | 0.906 | 165 |
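The SIDN, cn Clip, and BERT columns indicate which behavioral analysis modules are enabled; the remaining columns are internal clustering metrics.
Methods | SIDN | cn Clip | BERT | Davies–Bouldin | Calinski–Harabasz | Silhouette | Normalize Score
---|---|---|---|---|---|---|---|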
MAPM | √ | √ | √ | 1.224 | 297.014 | 0.109 | −0.245 |
w/o SIDN | × | √ | √ | 1.497 | 296.480 | 0.043 | 1.002 |
w/o BERT | √ | √ | × | 1.463 | 609.496 | 0.227 | 0.924 |
w/o clip | √ | × | √ | 1.452 | 340.384 | 0.137 | 0.501 |
w/o DEEP | × | × | × | 1.367 | 580.108 | 0.234 | 0.476 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, J.; Jiang, W.; Zhang, J.; Shao, Y.; Zhu, W. Fake User Detection Based on Multi-Model Joint Representation. Information 2024, 15, 266. https://doi.org/10.3390/info15050266