Vulnerabilities to Online Social Network Identity Deception Detection Research and Recommendations for Mitigation
Abstract
1. Introduction
- We highlight representative studies in the domain of identity deception detection on online social networks.
- We present the key mechanisms that are used to generate detection models and evaluate them.
- We identify the key shortcomings that may inhibit the effectiveness of these methods when implemented in real-world online social networks.
- We provide recommendations that can help address these shortcomings and improve the quality of research in this domain.
2. Identity Deception
2.1. Societal Cost of Identity Deception
2.2. Economic Cost of Identity Deception
3. State-of-the-Art Results on the Detection of Identity Deception
3.1. Review Analysis Method
3.2. Detection Features
3.2.1. Social Context Models
3.2.2. User Behavior Models
3.3. Detection Methods
3.4. Criteria for Evaluation Used by Models
3.5. Detection Technique Objectives
4. Research Approach Vulnerabilities
4.1. Weak Datasets
4.2. Data Selection Bias
4.3. Faulty Feature Selection and Construction
4.4. Precision Bias
5. Recommendations
5.1. Dataset Quality and Sharing
5.2. Emphasizing Recall
5.3. Realistic Feature Selection
5.4. Increasing Computational Overhead
6. Conclusions
Funding
Acknowledgments
Conflicts of Interest
References
- Tsikerdekis, M.; Zeadally, S. Online deception in social media. Commun. ACM 2014, 57, 72–80.
- Yang, C.; Harkreader, R.; Zhang, J.; Shin, S.; Gu, G. Analyzing spammers’ social networks for fun and profit. In Proceedings of the 21st International Conference on World Wide Web—WWW ’12, Lyon, France, 16 April 2012; ACM Press: New York, NY, USA, 2012; pp. 71–80.
- Zheng, X.; Zeng, Z.; Chen, Z.; Yu, Y.; Rong, C. Detecting spammers on social networks. Neurocomputing 2015, 159, 27–34.
- Wang, B.; Gong, N.Z.; Fu, H. GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graphs. In Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017; pp. 465–474.
- Tsikerdekis, M.; Zeadally, S. Multiple account identity deception detection in social media using nonverbal behavior. IEEE Trans. Inf. Forensics Secur. 2014, 9, 1311–1321.
- Shi, P.; Zhang, Z.; Choo, K.K.R. Detecting Malicious Social Bots Based on Clickstream Sequences. IEEE Access 2019, 7, 28855–28862.
- Cresci, S.; Petrocchi, M.; Spognardi, A.; Tognazzi, S. On the capability of evolved spambots to evade detection via genetic engineering. Online Soc. Netw. Media 2019, 9, 1–16.
- Concone, F.; Re, G.L.; Morana, M.; Ruocco, C. Twitter Spam Account Detection by Effective Labeling. In Proceedings of the ITASEC 2019, Pisa, Italy, 13–15 February 2019.
- Tsikerdekis, M.; Zeadally, S. Detecting and Preventing Online Identity Deception in Social Networking Services. IEEE Internet Comput. 2015, 19, 41–49.
- Sanzgiri, A.; Joyce, J.; Upadhyaya, S. The Early (tweet-ing) Bird Spreads the Worm: An Assessment of Twitter for Malware Propagation. Procedia Comput. Sci. 2012, 10, 705–712.
- Huber, B.; Barnidge, M.; Gil de Zúñiga, H.; Liu, J. Fostering public trust in science: The role of social media. Public Underst. Sci. 2019, 28, 759–777.
- Lovari, A. Spreading (Dis)Trust: Covid-19 Misinformation and Government Intervention in Italy. Media Commun. 2020, 8, 458–461.
- Allcott, H.; Gentzkow, M. Social Media and Fake News in the 2016 Election; National Bureau of Economic Research Working Paper Series; NBER: Cambridge, MA, USA, 2017; No. 23089.
- Badawy, A.; Ferrara, E.; Lerman, K. Analyzing the Digital Traces of Political Manipulation: The 2016 Russian Interference Twitter Campaign. In Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 28–31 August 2018; pp. 258–265.
- Lever, R. Fake Facebook Accounts: The Never-Ending Battle Against Bots; AFP: Paris, France, 2019.
- Graham, M. Fake Followers in Influencer Marketing Will Cost Brands $1.3 Billion This Year, Report Says. 2019. Available online: https://www.cnbc.com/2019/07/24/fake-followers-in-influencer-marketing-will-cost-1point3-billion-in-2019.html (accessed on 30 August 2020).
- Yamak, Z.; Saunier, J.; Vercouter, L. Detection of Multiple Identity Manipulation in Collaborative Projects. In Proceedings of the 25th International Conference Companion on World Wide Web—WWW ’16 Companion, Montréal, QC, Canada, 11–15 April 2016; ACM Press: New York, NY, USA, 2016; pp. 955–960.
- Ferguson, L. External Validity, Generalizability, and Knowledge Utilization. J. Nurs. Scholarsh. 2004, 36, 16–22.
- Nicholson, W.K. Minimizing threats to external validity. In Intervention Research: Designing, Conducting, Analyzing and Funding; Melnyk, B.M., Morrison-Beedy, D., Eds.; Springer Publishing Company: New York, NY, USA, 2012; Chapter 7; pp. 107–120.
- Sanders, C.; Smith, J. Applied Network Security Monitoring: Collection, Detection, and Analysis; Syngress: Waltham, MA, USA, 2014.
- Cao, Q.; Yang, X.; Yu, J.; Palow, C. Uncovering Large Groups of Active Malicious Accounts in Online Social Networks. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security—CCS ’14, Scottsdale, AZ, USA, 3–7 November 2014; ACM Press: New York, NY, USA, 2014; pp. 477–488.
- Stein, T.; Chen, E.; Mangla, K. Facebook immune system. In Proceedings of the 4th Workshop on Social Network Systems—SNS ’11, Salzburg, Austria, 10 April 2011; ACM Press: New York, NY, USA, 2011; Volume 8, pp. 1–8.
- Daya, A.A.; Salahuddin, M.A.; Limam, N.; Boutaba, R. A Graph-Based Machine Learning Approach for Bot Detection. In Proceedings of the 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Arlington, VA, USA, 8–12 April 2019; pp. 144–152.
- Cao, Q.; Sirivianos, M.; Yang, X.; Pregueiro, T. Aiding the detection of fake accounts in large scale social online services. In Proceedings of the 9th NSDI’12 USENIX conference on Networked Systems Design and Implementation, San Jose, CA, USA, 25–27 April 2012; USENIX Association: Berkeley, CA, USA, 2012; p. 15.
- Viswanath, B.; Post, A.; Gummadi, K.P.; Mislove, A. An analysis of social network-based Sybil defenses. In ACM SIGCOMM Computer Communication Review; ACM: New York, NY, USA, 2010; Volume 40, p. 363.
- Tsikerdekis, M. Identity Deception Prevention Using Common Contribution Network Data. IEEE Trans. Inf. Forensics Secur. 2017, 12, 188–199.
- Ruan, X.; Wu, Z.; Wang, H.; Jajodia, S. Profiling Online Social Behaviors for Compromised Account Detection. IEEE Trans. Inf. Forensics Secur. 2016, 11, 176–187.
- Zeadally, S.; Tsikerdekis, M. Securing Internet of Things (IoT) with machine learning. Int. J. Commun. Syst. 2020, 33, e4169.
- Sommer, R.; Paxson, V. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, Berkeley/Oakland, CA, USA, 16–19 May 2010; pp. 305–316.
- Sutton, R. The Bitter Lesson. 2019. Available online: http://www.incompleteideas.net/IncIdeas/BitterLesson.html (accessed on 30 August 2020).
- Nazer, T.H.; Davis, M.; Karami, M.; Akoglu, L.; Koelle, D.; Liu, H. Bot Detection: Will Focusing on Recall Cause Overall Performance Deterioration? Springer: Berlin/Heidelberg, Germany, 2019; pp. 39–49.
- Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1–11.
- Cresci, S.; Di Pietro, R.; Petrocchi, M.; Spognardi, A.; Tesconi, M. Fame for sale: Efficient detection of fake Twitter followers. Decis. Support Syst. 2015, 80, 56–71.
- Anagnostopoulos, I.; Zeadally, S.; Exposito, E. Handling big data: Research challenges and future directions. J. Supercomput. 2016, 72, 1494–1516.
Model Type | Faulty Feature Selection | Weak Datasets | Precision Bias | Data Selection Bias |
---|---|---|---|---|
Social feature models | Select features that relate to the social context of an OSN (e.g., following relationships); the ground truth comes from a seed of pre-identified malicious users [2] | Datasets tend to contain easily separable clusters of users; users with an ambiguous social context are removed from the dataset [3] | Prioritize identifying components of the graph that are more homogeneous rather than seeking out components that are more ambiguous [4] | Limit the size of a graph due to collection or computational limitations [2] |
Atomic feature models | Select features that relate to a user’s online behavior (e.g., clickstream); the ground truth comes from a pre-existing “profile” for malicious behavior [6] | Datasets tend to contain manually selected users from real-world traffic that are easily separable and free of much “noise” [8] | Tend to allow a few “hard to classify” but malicious users in order to minimize false positives [7] | Downsize a dataset of real-world traffic in order to boost the model’s performance with regard to “hard to classify” data points [3] |
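
The precision bias summarized in the table can be illustrated with the standard definitions of precision, TP / (TP + FP), and recall, TP / (TP + FN). The following is a minimal sketch and is not taken from any of the reviewed studies: the function name `precision_recall` and all counts are hypothetical, chosen only to show how a detector that avoids false positives by skipping "hard to classify" accounts can report perfect precision while missing a large share of malicious users.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical scenario: 100 malicious accounts exist in the network.

# A conservative detector flags only the 60 unambiguous accounts,
# so it raises no false positives but misses the 40 ambiguous ones.
conservative = precision_recall(tp=60, fp=0, fn=40)

# A broader detector also pursues ambiguous accounts: it catches 90
# malicious accounts at the cost of 10 false positives.
broad = precision_recall(tp=90, fp=10, fn=10)

print("conservative (precision, recall):", conservative)
print("broad        (precision, recall):", broad)
```

Under these assumed counts, reporting precision alone would favor the conservative detector even though it leaves far more malicious accounts undetected, which is why recall is worth reporting alongside it.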
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Ismailov, M.; Tsikerdekis, M.; Zeadally, S. Vulnerabilities to Online Social Network Identity Deception Detection Research and Recommendations for Mitigation. Future Internet 2020, 12, 148. https://doi.org/10.3390/fi12090148