Artificial Intelligence in Social Media Forensics: A Comprehensive Survey and Analysis
Abstract
1. Introduction
2. Digital Forensics
2.1. Foundation and Methodology
- Admissibility: The goal of every action must be to preserve digital evidence in a way that makes it acceptable in court or other legal proceedings.
- Chain of custody: A meticulous record must be maintained to demonstrate the origin and handling of evidence throughout the investigation, ensuring its authenticity and integrity.
- Minimization of modification: Whenever possible, data should be acquired in a way that minimizes or prevents modifications to the original evidence.
- Documentation: Every step of the investigation process must be comprehensively documented, including tools used, procedures followed, and analysis performed.
- Validation: All analytical methods and tools used must be scientifically sound and validated to ensure their reliability and repeatability.
- Collection: The process of locating and recording credible sources of data pertinent to the incident, followed by the acquisition of data from these sources while ensuring their integrity is maintained.
- Examination: The assessment of data obtained during the collection phase, focusing on extracting relevant information related to the incident while maintaining the integrity of the data.
- Analysis: The study of information extracted during the examination phase to address pertinent investigative questions and to determine if a conclusive or partial conclusion can be reached.
- Reporting: The preparation and presentation of the investigation’s procedures, methodologies, and tools utilized, along with the outcomes derived from the analysis phase.
2.2. Domains in Digital Forensics
2.3. Computer Forensics
- Data acquisition: Data acquisition in computer forensics is the process of gathering and recovering sensitive data during a digital forensic investigation [36,37]. It involves capturing digital data from sources such as disks, RAM, swap files, operating systems, and other storage media [13], so investigators need the skills and tools to identify and capture digital evidence effectively. Pratama [14] demonstrated the importance of data acquisition and recovery in the computer forensics process: using PhotoRec [38], the author recovered 2781 previously deleted files of different data formats from a 32-gigabyte flash drive. PhotoRec is a popular data acquisition and forensics tool used by computer forensics investigators. Other tools include FTK Imager and TestDisk.
- File system analysis: A file system is the data structure that makes it possible to store, access, and retrieve data efficiently on a computer system; without it, files would be disorganized and tedious to access. In computer forensics, file system analysis involves examining the structure and contents of the file system, recovering deleted files, and reconstructing file activity timelines. Khan et al. [15] presented a post-event timeline reconstruction method based on artificial neural networks: by tracing earlier file system activities, they mapped the chronology of important events on the computer system. Ref. [39] assesses the suitability of neural network approaches for computer forensics investigations by examining file system data to ascertain whether it has been altered by a particular application.
- Operating system analysis: Operating system analysis involves finding and evaluating relevant data from the operating system of the concerned computer or digital system [40]. With an emphasis on forensic memory acquisition, Huebner et al. [16] discuss how operating system design and implementation affect computer forensics investigation methodology. The operating system might theoretically facilitate investigative inquiries by providing instruments for data analysis and by facilitating easy access to system data. Ref. [17] offers a thorough overview of the literature on operating system logs forensic analysis. Due to their ability to capture crucial system activity, these system event logs are among the foremost sources of digital evidence in forensic cases.
- Steganography and data hiding detection: Steganography involves concealing information within a carrier, while steganalysis is the process of identifying information concealed within a carrier [18]. In May 2011, the German Federal Criminal Police (BKA) detained an Al-Qaeda affiliate in Berlin and seized a chip holding a password-protected folder. Through forensic analysis, specialists managed to decrypt the folder, revealing a pornographic video labeled ‘KickAss’ that concealed 141 discrete files outlining future targets and activities of Al-Qaeda [41]. Detecting data concealed within files or in unused sectors of storage devices, a hiding technique frequently employed by criminals, has emerged as a critical area of research. Davidson et al. [19] developed a software prototype (version 1.0), built on an artificial neural network (ANN), that detects whether an image contains concealed or encrypted information. Ref. [20] presents a method of JPEG image steganalysis motivated by the need for rapid and precise identification of concealed data and stego-carriers within image file datasets. As steganalysis advances, malicious actors keen to stay ahead of the law are intensifying efforts to hide their data, potentially resorting to algorithms deliberately crafted to circumvent detection during forensic investigations.
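The timeline reconstruction idea mentioned under file system analysis can be illustrated with a minimal Python sketch that orders file events by their recorded timestamps. It is not the neural network method of Khan et al. [15]; it only shows the raw timestamp ordering that such methods could start from. The function name `build_timeline` and the demo files are hypothetical, and the sketch assumes the timestamps reported by the operating system have not been tampered with.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def build_timeline(root):
    """Return (timestamp, event, path) tuples sorted chronologically.

    Uses the modification and access times reported by os.stat; these are
    only as trustworthy as the file system and the acquisition process.
    """
    events = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        for label, epoch in (("modified", info.st_mtime), ("accessed", info.st_atime)):
            stamp = datetime.fromtimestamp(epoch, tz=timezone.utc)
            events.append((stamp, label, str(path)))
    return sorted(events)


if __name__ == "__main__":
    # Hypothetical demo on a throwaway directory, not real evidence.
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "a.txt"
        second = Path(tmp) / "b.txt"
        first.write_text("one")
        second.write_text("two")
        os.utime(first, (1_000_000_000, 1_000_000_000))  # (atime, mtime)
        os.utime(second, (1_000_000_500, 1_000_000_600))
        for stamp, label, path in build_timeline(tmp):
            print(stamp.isoformat(), label, Path(path).name)
```

In practice such a script would be run against a read-only forensic image rather than a live system, in line with the principle of minimizing modification of the original evidence.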
2.4. Mobile Device Forensics
2.5. Network Forensics
2.6. Database Forensics
2.7. Cloud Forensics
3. Social Media Forensics Fundamentals
- (a) Textual content: Posts, comments, messages, and other text-based interactions provide insights into user behavior, opinions, and potential criminal activities like cyberbullying or hate speech.
- (b) Multimedia evidence: Images, videos, and even audio recordings can reveal crucial details about events, locations, and individuals involved in investigations.
- (c) Network connections: Analyzing user connections, groups, and interactions can shed light on criminal networks, organized groups, or hidden associations.
- (d) Metadata: Timestamps, location data, and other embedded information within social media content can provide valuable context and forensic clues.
- (a) Reconstruct past events: By meticulously piecing together user activity and interactions, investigators can reconstruct timelines, identify key players, and understand the context surrounding specific situations. This is crucial in criminal investigations or even analyzing the spread of misinformation.
- (b) Identify criminal activity: Social media platforms, unfortunately, can be breeding grounds for illegal activities like cyberbullying, hate speech, online harassment, and even fraud. Forensic analysis can uncover evidence of these crimes, supporting legal proceedings and ensuring user accountability.
- (c) Unveil hidden networks: Analyzing social media connections can reveal patterns of communication and association, aiding investigations into organized crime, terrorist groups, or other criminal networks. This plays a vital role in disrupting illegal activities and ensuring public safety.
- (d) Analyze public opinion and sentiment: By analyzing large datasets of social media posts, we can gain valuable insights into public opinion on various topics. This information empowers researchers, organizations, and even governments to understand societal trends and make informed decisions.
3.1. Key Challenges and Complexities
- (a) Data volatility: The dynamic nature of digital information generated and shared on social media platforms means social media data can be highly transient, with content frequently changing or being deleted entirely. This volatility arises due to several factors: real-time updates, where users can post updates, comments, and messages instantly, leading to a continuous stream of new data; user control, allowing individuals to edit or delete their posts and comments, impacting the availability of data for forensic analysis; platform changes, such as updates, algorithm alterations, or shutdowns, affecting data accessibility and preservation; legal requests and policies, wherein social media companies may comply with legal requests to remove certain content or user accounts, leading to the deletion or modification of relevant data; and cultural and topical shifts, where social media conversations can rapidly evolve based on current events, trends, or public sentiment, rendering older data less relevant or accurate over time. Due to this volatility, forensic analysts face challenges in collecting, storing, and analyzing data for investigative purposes. Techniques and tools for capturing and storing social media data must account for its rapid turnover and potential for modification or deletion. Additionally, forensic analysts must act swiftly to collect relevant data before it becomes inaccessible or loses its evidentiary value.
- (b) Data volume and diversity: The vast amount and wide range of digital information generated and shared across social media platforms introduce a considerable level of complexity to the social media forensics process. Social media platforms host a multitude of content types, including text, images, videos, links, and more, leading to a diverse array of data formats and structures. This diversity presents challenges for forensic analysis, as different types of content require specialized techniques for processing and interpretation. Furthermore, the sheer volume of information generated on online social networks daily is immense, making it difficult for forensic investigators to sift through and analyze relevant information efficiently. The continuous influx of data adds to the complexity, requiring forensic analysts to develop scalable methods and tools for managing and analyzing large datasets. Additionally, the global reach of social media platforms means that data can be generated in multiple languages and cultural contexts, further increasing the complexity of analysis. Therefore, effective social media forensic investigations require strategies for handling the vast volume and diverse nature of data found on these platforms, ensuring that relevant information is identified, extracted, and interpreted accurately [5].
- (c) Attribution and anonymity: There are major difficulties related to identifying the creators or originators of content and distinguishing between genuine users and those hiding behind pseudonyms or false identities. Attribution involves tracing digital content back to its source or author [45], which can be challenging due to the ease of creating anonymous accounts and the potential for content to be shared and reposted across multiple platforms. Social media platforms often allow users to create accounts with minimal verification, enabling individuals to hide their true identities or impersonate others [46]. This anonymity complicates forensic investigations by obscuring the trail of digital evidence and making it difficult to establish the credibility and authenticity of information. Moreover, malicious actors may deliberately manipulate or distort information to mislead investigators or incite conflict, further complicating the task of attribution. Forensic analysts must employ advanced techniques, such as digital footprint analysis, linguistic analysis, and network analysis, to attribute digital content to its source and differentiate between legitimate users and impostors. Additionally, legal and ethical considerations surrounding user privacy and data protection must be carefully navigated when attempting to uncover the identities of individuals behind anonymous accounts. Therefore, addressing the challenges of attribution and anonymity in social media requires a mix of domain expertise, investigative rigor, and adherence to ethical standards to guarantee the accurate and responsible use of digital evidence in forensic contexts.
- (d) Privacy concerns: Privacy concerns in social media forensics encompass the ethical and legal dilemmas resulting from the investigation and analysis of digital evidence gathered from social media platforms. As forensic analysts extract and scrutinize data from social media accounts, they must balance the need to uncover the truth against the obligation to protect individual privacy rights. The very nature of social media forensics, which involves accessing and examining personal information shared by users, raises concerns about the invasion of privacy and potential misuse of sensitive data. Individuals may feel uneasy knowing that their online activities are subject to scrutiny and may fear the implications of their digital footprint being used in investigations [47]. Moreover, the handling of social media data by forensic experts must adhere to strict ethical guidelines and legal regulations to safeguard the privacy of individuals and ensure the integrity of the investigative process. Concerns also extend to the potential for data breaches or leaks during the forensic analysis, which could expose personal information to unauthorized parties and lead to further privacy violations. Thus, social media forensics practitioners face the challenge of navigating these privacy concerns while fulfilling their investigative duties. They must employ robust data protection measures, obtain appropriate legal permissions, and prioritize the anonymization of personal information whenever possible. Additionally, fostering transparency and accountability in social media forensic practices is essential for building trust with stakeholders and mitigating privacy-related apprehensions.
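One way the recommendation to anonymize personal information could be approached in practice is keyed pseudonymization of account identifiers before analysis. The following minimal Python sketch uses HMAC-SHA256 from the standard library; the field name `author`, the key, and both function names are hypothetical. Note that this is pseudonymization rather than full anonymization: whoever holds the key can recompute the mapping, and the text of a post may still identify its author.

```python
import hashlib
import hmac


def pseudonymize(user_id, secret_key):
    """Map a user identifier to a stable pseudonym with keyed HMAC-SHA256."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability; a longer prefix
    # lowers the (already small) chance of two users colliding.
    return "user_" + digest.hexdigest()[:16]


def pseudonymize_records(records, secret_key):
    """Return copies of the records with the 'author' field replaced."""
    return [
        {**record, "author": pseudonymize(record["author"], secret_key)}
        for record in records
    ]


if __name__ == "__main__":
    key = b"hypothetical-case-specific-key"  # assumption: stored securely elsewhere
    posts = [
        {"author": "alice", "text": "first post"},
        {"author": "bob", "text": "a reply"},
        {"author": "alice", "text": "second post"},
    ]
    for row in pseudonymize_records(posts, key):
        print(row)
```

Because the same identifier always maps to the same pseudonym under a given key, analysts can still link posts by one account without seeing the account name.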
3.2. Traditional Social Media Forensic Techniques
- (a) Data acquisition: Traditional methods like keyword searches and targeted data extraction from user profiles and posts serve as the bedrock for acquiring relevant evidence.
- (b) Metadata analysis: Forensic investigators examine embedded timestamps, location data, and other metadata associated with the acquired social media content to better understand the context and origin of the information.
- (c) Hashing and digital forensics tools: Ensuring data integrity and chain of custody is important. Forensic analysts utilize hashing algorithms and specialized software designed for traditional digital forensics.
- (d) Network analysis: Network analysis is used to identify connections, groups, and interactions between users, particularly through friend lists and communication logs. This can reveal patterns and potential criminal networks, building upon established network forensics techniques.
- (e) Content analysis: Traditional text analysis techniques, including keyword searches, sentiment analysis, and topic modeling, offer a starting point for understanding the content of social media posts, images, and videos.
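The role of hashing in preserving integrity and chain of custody can be sketched as follows. This minimal Python example records a SHA-256 digest for an acquired item and later checks that the item is unchanged. The record fields, identifiers, and function names are assumptions made for illustration and do not follow any particular forensic tool or standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_custody_entry(item_id, data, collector, source):
    """Create a custody record for one acquired item (hypothetical schema)."""
    return {
        "item_id": item_id,
        "sha256": sha256_hex(data),
        "size_bytes": len(data),
        "collector": collector,
        "source": source,
        "acquired_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def verify_item(entry, data):
    """Check that the data still matches the digest and size in the record."""
    return entry["sha256"] == sha256_hex(data) and entry["size_bytes"] == len(data)


if __name__ == "__main__":
    # Hypothetical captured post, serialized as bytes at acquisition time.
    post = b'{"id": "123", "text": "example post"}'
    entry = make_custody_entry("item-001", post, "analyst-01", "hypothetical-export")
    print(json.dumps(entry, indent=2))
    print("unchanged copy verifies:", verify_item(entry, post))
    print("altered copy verifies:", verify_item(entry, post + b" "))
```

A digest only shows that the bytes have not changed since it was computed; it says nothing about whether the content was authentic at the moment of capture, which is why the documentation of who collected what, and when, accompanies it.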
3.3. OSINT in Social Media Forensics
- (a) Profile exploration: Examining user profiles, including bios, posts, comments, and follower lists can reveal details about a person’s activities, interests, and connections.
- (b) Keyword/hashtag searching: Utilizing relevant keywords and hashtags can lead investigators to discussions, photos, and videos related to the investigation.
- (c) Geolocation analysis: Many social media posts contain embedded geolocation data, providing valuable insights into physical locations associated with an event or user.
- (d) Social network analysis: Mapping connections between accounts and analyzing interactions within online communities can reveal patterns and identify potential collaborators or associates.
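As a rough illustration of social network analysis, the sketch below builds an undirected graph from observed account interactions, groups accounts into connected components, and ranks accounts by the number of distinct contacts. It uses only the Python standard library; the account names and function names are hypothetical, and real investigations would typically weight edges by interaction type and frequency.

```python
from collections import defaultdict, deque


def build_graph(interactions):
    """Build an undirected adjacency map from (account, account) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Group accounts that are linked directly or indirectly (BFS)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def degree_ranking(graph):
    """Accounts ordered by number of distinct contacts, most connected first."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))


if __name__ == "__main__":
    # Hypothetical interaction pairs (replies, mentions, or messages).
    pairs = [("acct_a", "acct_b"), ("acct_b", "acct_c"), ("acct_d", "acct_e")]
    g = build_graph(pairs)
    print(connected_components(g))
    print(degree_ranking(g))
```

High degree alone does not imply a central role in any wrongdoing; it only flags accounts that may merit closer manual review.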
4. NLP in Social Media Forensics
4.1. Radicalization Detection
4.2. Cyberbullying Detection
4.3. Fake Profile Detection
5. GNNs in Social Media Forensics
5.1. Fauxtography
5.2. Criminal Activity Detection
6. GANs in Social Media Forensics
Deepfake Detection
7. Challenges and Future Directions
7.1. Key Challenges
- (a) Data availability and privacy: Balancing the need for comprehensive data for effective AI model training and analysis with the paramount importance of user privacy remains a significant hurdle. Collaborations between researchers, law enforcement agencies, and social media platforms are crucial to establish ethical and legal frameworks for data access while upholding user privacy rights.
- (b) Explainability and interpretability: The “black box” nature of many AI models, particularly complex algorithms like deep learning architectures, raises concerns about their decision-making processes. Developing interpretable AI techniques is vital for building trust and ensuring ethical application in forensic investigations. This requires advancements in model design and the integration of Explainable AI (XAI) methodologies to offer insights into the process by which AI models derive their conclusions.
- (c) Bias and fairness: AI models have the capacity to adopt and magnify biases inherent in their training data, possibly resulting in unjust or prejudiced results. Mitigating bias requires comprehensive approaches, including:
  - Employing diverse datasets: Utilizing data that reflects the true diversity of online communities is crucial to avoid perpetuating existing biases.
  - Developing fair evaluation metrics: Establishing evaluation metrics that not only assess accuracy but also identify and address potential biases within the model’s predictions.
  - Careful model design: Implementing techniques like fairness-aware model architectures and training procedures can help mitigate bias from the outset.
- (d) Evolving technologies and user behavior: The rapid pace of technological advancements and user behavior changes necessitate continuous adaptation and refinement of AI models. Continuously updating training data, developing generalizable models, and monitoring their performance in real-world scenarios are essential to ensure effectiveness and avoid model drift.
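To make the idea of a fairness-oriented evaluation metric concrete, the following minimal Python sketch compares false positive rates across user groups for a binary classifier, for example one that flags posts as abusive. The group labels, data, and function names are hypothetical, and the false positive rate gap is only one of several possible fairness measures; which one is appropriate depends on the investigative context.

```python
from collections import defaultdict


def false_positive_rates(labels, predictions, groups):
    """Per-group false positive rate: flagged negatives / all negatives.

    labels and predictions hold 0 (benign) or 1 (flagged); groups holds a
    group identifier for each example. Groups with no negatives are omitted.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for y, y_hat, g in zip(labels, predictions, groups):
        if y == 0:
            negatives[g] += 1
            if y_hat == 1:
                false_pos[g] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}


def fpr_gap(rates):
    """Largest difference in false positive rate between any two groups.

    Expects at least one group in rates.
    """
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Hypothetical evaluation data for two groups of accounts.
    labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
    predictions = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
    groups = ["A"] * 5 + ["B"] * 5
    rates = false_positive_rates(labels, predictions, groups)
    print(rates)
    print("gap:", fpr_gap(rates))
```

A large gap suggests that benign content from one group is wrongly flagged more often than from another, which is the kind of disparity an accuracy figure alone would hide.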
7.2. Future Directions
- (a) Interdisciplinary collaboration: Future research in AI-driven social media forensics should foster interdisciplinary collaboration between computer scientists, social scientists, legal experts, and ethicists. Collaborative efforts can facilitate a more holistic understanding of the complex socio-technical challenges involved in forensic investigations.
- (b) Explainable AI (XAI): Improving the explainability and interpretability of AI models is crucial for fostering trust and transparency in forensic decision-making processes. Future research should prioritize the development of XAI techniques capable of providing human-understandable explanations for AI-driven forensic analyses.
- (c) Continuous learning and adaptation: AI systems in social media forensics should be designed to learn continuously from new data and adapt to evolving threats and challenges. Incorporating mechanisms for online learning and real-time feedback can enhance the agility and effectiveness of forensic analyses in dynamic social media environments.
- (d) Privacy-preserving techniques: Advancing privacy-preserving AI techniques is paramount for safeguarding user privacy while enabling effective forensic analyses. Future research should explore innovative approaches for conducting forensic investigations while minimizing the disclosure of sensitive user information.
- (e) Ethical guidelines and standards: It is imperative to establish unambiguous ethical guidelines and standards to govern the conscientious application of artificial intelligence in the field of social media forensics. Future efforts should focus on developing ethical frameworks and regulatory mechanisms to ensure the ethical and responsible deployment of AI technologies in forensic investigations.
7.3. Limitations of the Scope
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Dean, B. Social Network Usage & Growth Statistics: How Many People Use Social Media in 2024? Backlinko: Cheyenne, WY, USA, 2024; Available online: https://backlinko.com/social-media-users (accessed on 19 April 2024).
- The Importance and Challenges of Social Media in Digital Investigations. Available online: https://www.controlrisks.com/our-thinking/insights/the-importance-and-challenges-of-social-media-in-digital-investigations?utm_referrer=https://www.google.com (accessed on 19 April 2024).
- Dwivedi, Y.K.; Kelly, G.; Janssen, M.; Rana, N.P.; Slade, E.L.; Clement, M. Social Media: The Good, the Bad, and the Ugly. Inf. Syst. Front. 2018, 20, 419–423. [Google Scholar] [CrossRef]
- Morgan, S. Cybercrime to Cost the World 8 Trillion Annually in 2023. Cybercrime Magazine. 17 October 2022. Available online: https://cybersecurityventures.com/cybercrime-to-cost-the-world-8-trillion-annually-in-2023/ (accessed on 19 April 2024).
- Digital Forensics and Social Media: Ethics, Challenges and Opportunities; Birkbeck, University of London: London, UK, 2019; Available online: https://www.bbk.ac.uk/news/digital-forensics-and-social-media-ethics-challenges-and-opportunities/ (accessed on 19 April 2024).
- Kent, K.; Chevalier, S.; Grance, T.; Dang, H. Special Publication 800-86 Guide to Integrating Forensic Techniques into Incident Response Recommendations of the National Institute of Standards and Technology; The National Institute of Standards and Technology: Gaithersburg, MD, USA, 2006. Available online: https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-86.pdf (accessed on 3 February 2024).
- Sharma, B.K.; Joseph, M.A.; Jacob, B.; Miranda, B. Emerging trends in digital forensics and cybersecurity—An overview. In Proceedings of the 2019 Sixth HCT Information Technology Trends (ITT), Ras Al Khaimah, United Arab Emirates, 20–21 November 2019; pp. 309–313. [Google Scholar]
- Dumchykov, M. The Processes of Digitization and Forensics: A Retrospective Analysis. Crim. Forensics 2020, 65, 100–108. [Google Scholar] [CrossRef]
- Ivanov, V.Y. On theoretical aspects of using the concept of digital footprint in forensics. Leg. Stud. 2020, 75–80. [Google Scholar] [CrossRef]
- Sachowski, J. Implementing Digital Forensic Readiness: From Reactive to Proactive Process, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
- Roux, C.; Crispino, F.; Ribaux, O. From forensics to forensic science. Curr. Issues Crim. Justice 2012, 24, 7–24. [Google Scholar] [CrossRef]
- Karabiyik, U. Building an Intelligent Assistant for Digital Forensics. Ph.D. Thesis, Florida State University, Tallahassee, FL, USA, 2015. [Google Scholar]
- Kizza, J.M. Computer crime investigations—Computer forensics. In Ethical and Social Issues in the Information Age, Texts in Computer Science; Springer: London, UK, 2010; pp. 263–276. [Google Scholar]
- Pratama, I.P.A.E. Computer Forensic Using Photorec for Secure Data Recovery Between Storage Media: A Proof of Concept. Int. J. Sci. Technol. Manag. 2021, 2, 1189–1196. [Google Scholar] [CrossRef]
- Khan, M.N.A.; Chatwin, C.R.; Young, R.C. A framework for post-event timeline reconstruction using neural networks. Digit. Investig. 2007, 4, 146–157. [Google Scholar] [CrossRef]
- Huebner, E.; Bem, D.; Henskens, F.; Wallis, M. Persistent systems techniques in forensic acquisition of memory. Digit. Investig. 2007, 4, 129–137. [Google Scholar] [CrossRef]
- Studiawan, H.; Sohel, F.; Payne, C. A survey on forensic investigation of operating system logs. Digit. Investig. 2019, 29, 1–20. [Google Scholar]
- Dalal, M.; Juneja, M. Video steganalysis to obstruct criminal activities for digital forensics: A survey. Int. J. Electron. Secur. Digit. Forensics 2018, 10, 338. [Google Scholar] [CrossRef]
- Davidson, J.; Bergman, C.; Bartlett, E. An artificial neural network for wavelet steganalysis. In Proceedings of the Optics and Photonics 2005, San Diego, CA, USA, 31 July–4 August 2005. [Google Scholar] [CrossRef]
- Zaharis, A.; Martini, A.; Tryfonas, T.; Ilioudis, C.; Pangalos, G. Lightweight Steganalysis Based on Image Reconstruction and Lead Digit Distribution Analysis. Int. J. Digit. Crime Forensics 2011, 3, 29–41. [Google Scholar] [CrossRef]
- Sharma, P.; Arora, D.; Sakthivel, T. Enhanced Forensic Process for Improving Mobile Cloud Traceability in Cloud-Based Mobile Applications. Procedia Comput. Sci. 2020, 167, 907–917. [Google Scholar] [CrossRef]
- Joseph, M.A.; Philip, S.; Miranada, B.; Deshmukh, A.; Singh, N. A theoretical workflow for the verification of embedded threats on mobile devices. In Proceedings of the 2021 2nd International Conference on Computation, Automation and Knowledge Management (ICCAKM), Dubai, United Arab Emirates, 19–21 January 2021. [Google Scholar] [CrossRef]
- Koroniotis, N.; Moustafa, N.; Sitnikova, E. A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework. Future Gener. Comput. Syst. 2020, 110, 91–106. [Google Scholar] [CrossRef]
- Sikos, L.F. Packet analysis for network forensics: A comprehensive survey. Forensic Sci. Int. Digit. Investig. 2020, 32, 200892. [Google Scholar] [CrossRef]
- Khalid, Z.; Iqbal, F.; Kamoun, F.; Hussain, M.; Khan, L.A. Forensic analysis of the cisco WebEx application. In Proceedings of the 2021 5th Cyber Security in Networking Conference (CSNet), Abu Dhabi, United Arab Emirates, 12–14 October 2021. [Google Scholar] [CrossRef]
- Lo, W.W.; Kulatilleke, G.; Sarhan, M.; Layeghy, S.; Portmann, M. XG-BoT: An Explainable Deep Graph Neural Network for Botnet Detection and Forensics. Internet Things 2022, 22, 100747. [Google Scholar] [CrossRef]
- Khanuja, H.K.; Adane, D. Monitor and detect suspicious transactions with database forensics and Dempster-Shafer theory of evidence. Int. J. Electron. Secur. Digit. Forensics 2020, 12, 154. [Google Scholar] [CrossRef]
- Al-Dhaqm, A.; Razak, S.; Ikuesan, R.A.; Kebande, V.R.; Hajar Othman, S. Face Validation of Database Forensic Investigation Metamodel. Infrastructures 2021, 6, 13. [Google Scholar] [CrossRef]
- Chopade, R.M.; Pachghare, V.K. Data Tamper Detection from NoSQL Database in Forensic Environment. J. Cyber Secur. Mobil. 2021, 10, 421–450. [Google Scholar] [CrossRef]
- Choi, H.; Lee, S.; Jeong, D. Forensic Recovery of SQL Server Database: Practical Approach. IEEE Access 2021, 9, 14564–14575. [Google Scholar] [CrossRef]
- Zhang, C.; Yin, J. Research on security mechanism and forensics of SQLite database. In Communications in Computer and Information Science; Springer: Berlin/Heidelberg, Germany, 2021; pp. 614–629. [Google Scholar] [CrossRef]
- Rani, D.R.; Geethakumari, G. Secure data transmission and detection of anti-forensic attacks in cloud environment using MECC and DLMNN. Comput. Commun. 2020, 150, 799–810. [Google Scholar] [CrossRef]
- Ahsan, M.M.; Wahab, A.W.B.A.; Idris, M.Y.I.B.; Khan, S.; Bachura, E.; Choo, K.K.R. CLASS: Cloud Log Assuring Soundness and Secrecy Scheme for Cloud Forensics. IEEE Trans. Sustain. Comput. 2021, 6, 184–196. [Google Scholar] [CrossRef]
- Awuson-David, K.; Al-Hadhrami, T.; Alazab, M.; Shah, N.; Shalaginov, A. BCFL logging: An approach to acquire and preserve admissible digital forensics evidence in cloud ecosystem. Future Gener. Comput. Syst. 2021, 122, 1–13. [Google Scholar] [CrossRef]
- U.S. Department of Homeland Security. Computer Forensics; U.S. Department of Homeland Security: Washington, DC, USA, 2008.
- EC-Council. How to Handle Data Acquisition in Digital Forensics, Cybersecurity Exchange. 11 March 2022. Available online: https://www.eccouncil.org/cybersecurity-exchange/computer-forensics/data-acquisition-digital-forensics/ (accessed on 19 April 2024).
- Pedapudi, S.M.; Vadlamani, N. Data acquisition based seizure record framework for digital forensics investigations. In Proceedings of the 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2–4 December 2021; Available online: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9676088 (accessed on 16 August 2022).
- Christophe Grenier. Photorec. Available online: http://www.cgsecurity.org/wiki/photorec (accessed on 19 April 2024).
- Mohammad, R.M. A neural network based digital forensics classification. In Proceedings of the 2018 IEEE/ACS 15th International Conference on Computer Systems and Applications (AICCSA), Aqaba, Jordan, 28 October–1 November 2018. [Google Scholar] [CrossRef]
- Garfinkel, S.L. Digital forensics research: The next 10 years. Digit. Investig. 2010, 7, S64–S73. [Google Scholar] [CrossRef]
- Gallagher, S. Steganography: How Al-Qaeda Hid Secret Documents in a Porn Video; Ars Technica: New York, NY, USA, 2012; Available online: https://arstechnica.com/business/2012/05/steganography-how-al-qaeda-hid-secret (accessed on 23 February 2024).
- Olivier, M.S. On metadata context in database forensics. Digit. Investig. 2009, 5, 115–123. [Google Scholar] [CrossRef]
- Al-Dhaqm, A.; Abd Razak, S.; Othman, S.H.; Ali, A.; Ghaleb, F.A.; Rosman, A.S.; Marni, N. Database Forensic Investigation Process Models: A Review. IEEE Access 2020, 8, 48477–48490. [Google Scholar] [CrossRef]
- Karagiannis, C.; Vergidis, K. Digital Evidence and Cloud Forensics: Contemporary Legal Challenges and the Power of Disposal. Information 2021, 12, 181. [Google Scholar] [CrossRef]
- Romanov, A.; Semenov, A.; Mazhelis, O.; Veijalainen, J. Detection of fake profiles in social media—Literature review. In Proceedings of the 13th International Conference on Web Information Systems and Technologies, Porto, Portugal, 25–27 April 2017. [Google Scholar] [CrossRef]
- Juola, P. Authorship attribution. Found. Trends Inf. Retr. 2006, 1, 233–334. [Google Scholar] [CrossRef]
- Naqvi, S.; Enderby, S.; Williams, I.; Asif, W.; Rajarajan, M.; Potlog, C.; Florea, M. Privacy-Preserving Social Media Forensic Analysis for Preventive Policing of Online Activities. In Proceedings of the 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS), Canary Islands, Spain, 24–26 June 2019. [Google Scholar] [CrossRef]
- Shahbazi, Z.; Byun, Y.C. NLP-Based Digital Forensic Analysis for Online Social Network Based on System Security. Int. J. Environ. Res. Public Health 2022, 19, 7027. [Google Scholar] [CrossRef] [PubMed]
- Sun, D.; Zhang, X.; Choo, K.K.R.; Hu, L.; Wang, F. NLP-based digital forensic investigation platform for online communications. Comput. Secur. 2021, 104, 102210. [Google Scholar] [CrossRef]
- Ketcham, M.; Ganokratanaa, T.; Bansin, S. The forensic algorithm on facebook using natural language processing. In Proceedings of the 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Naples, Italy, 28 November–1 December 2016. [Google Scholar] [CrossRef]
- Chambers, N.; Fry, B.; McMasters, J. Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; Available online: https://aclanthology.org/N18-1147/ (accessed on 7 October 2023).
- Mursi, K.T.; Alahmadi, M.D.; Alsubaei, F.S.; Alghamdi, A.S. Detecting Islamic Radicalism Arabic Tweets Using Natural Language Processing. IEEE Access 2022, 10, 72526–72534.
- Torregrosa, J.; Bello-Orgaz, G.; Martinez-Camara, E.; Del Ser, J.; Camacho, D. A survey on extremism analysis using natural language processing. arXiv 2021, arXiv:2104.04069.
- Ul Rehman, Z.; Abbas, S.; Khan, M.A.; Mustafa, G.; Fayyaz, H.; Hanif, M.; Saeed, M.A. Understanding the Language of ISIS: An Empirical Approach to Detect Radical Content on Twitter Using Machine Learning. Comput. Mater. Contin. 2021, 66, 1075–1090.
- Nouh, M.; Nurse, J.R.; Goldsmith, M. Understanding the radical mind: Identifying signals to detect extremist content on Twitter. In Proceedings of the 2019 IEEE International Conference on Intelligence and Security Informatics (ISI), Shenzhen, China, 1–3 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 98–103.
- Oussalah, M.; Faroughian, F.; Kostakos, P. On detecting online radicalization using natural language processing. In Proceedings of the Intelligent Data Engineering and Automated Learning–IDEAL 2018: 19th International Conference, Madrid, Spain, 21–23 November 2018; Springer International Publishing: Berlin/Heidelberg, Germany, 2018. Part II 19. pp. 21–27.
- Manogaran, G.; Qudrat-Ullah, H.; Xin, Q. (Eds.) Special issue on deep structured learning for natural language processing. In ACM Transactions on Asian and Low-Resource Language Information Processing; Association for Computing Machinery: New York, NY, USA, 2021; Volume 20, pp. 1–2.
- Ahmed, M.T.; Rahman, M.; Nur, S.; Islam, A.Z.M.T.; Das, D. Natural language processing and machine learning based cyberbullying detection for Bangla and Romanized Bangla texts. TELKOMNIKA Telecommun. Comput. Electron. Control 2021, 20, 89.
- Elsafoury, F.; Wilson, S.R.; Ramzan, N. A Comparative Study on Word Embeddings and Social NLP Tasks. In Proceedings of the Tenth International Workshop on Natural Language Processing for Social Media, Seattle, WA, USA, 14–15 July 2022; Available online: https://aclanthology.org/2022.socialnlp-1.5 (accessed on 26 February 2024).
- Latha, P.; Sumitra, V.; Sasikala, V.; Arunarasi, J.; Rajini, A.R.; Nithiya, N. Fake profile identification in social network using machine learning and NLP. In Proceedings of the 2022 International Conference on Communication, Computing and Internet of Things (IC3IoT), Chennai, India, 10–11 March 2022.
- Rao, P.S.; Gyani, J.; Narsimha, G. Fake profile identification in online social networks using machine learning and NLP. Int. J. Appl. Eng. Res. 2018, 13, 973–4562.
- Rohit, R. Machine learning implementation for identifying fake accounts in social network. Int. J. Pure Appl. Math. 2018, 118, 4785–4797.
- Milind, S.K.; Dhamdhere, V. Automatic Detection of Fake Profiles in Online Social Networks. In The Technical Writer’s Handbook; Young, M., Ed.; University Science: Mill Valley, CA, USA, 1989.
- Bowman-Grieve, L. Exploring ’stormfront’: A virtual community of the radical right. Stud. Confl. Terror. 2009, 32, 989–1007.
- Sageman, M. Leaderless Jihad: Terror Networks in the Twenty-First Century; University of Pennsylvania Press: Philadelphia, PA, USA, 2008.
- Mathew, B.; Dutt, R.; Goyal, P.; Mukherjee, A. Spread of hate speech in online social media. In Proceedings of the WebSci ’19: 11th ACM Conference on Web Science, Boston, MA, USA, 30 June–3 July 2019; pp. 173–182.
- Løvås, I.V. Recognizing Social Media Right-Wing Radicalization Using Text Analysis and Artificial Intelligence. Master’s Thesis, NTNU, Trondheim, Norway, 2022.
- Chen, L.; Liu, X.; Tang, H. The interactive effects of parental mediation strategies in preventing cyberbullying on social media. Psychol. Res. Behav. Manag. 2023, 1009–1022.
- Smith, P.K.; Mahdavi, J.; Carvalho, M.; Fisher, S.; Russell, S.; Tippett, N. Cyberbullying: Its nature and impact in secondary school pupils. J. Child Psychol. Psychiatry 2008, 49, 376–385.
- Bokolo, B.G.; Liu, Q. Combating Cyberbullying in Various Digital Media Using Machine Learning; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023; pp. 71–97.
- Kowalski, R.M.; Limber, S.P. Electronic bullying among middle school students. J. Adolesc. Health 2007, 41, S22–S30.
- Hinduja, S.; Patchin, J. Bullying Beyond the Schoolyard: Preventing and Responding to Cyberbullying; Corwin Press: Thousand Oaks, CA, USA, 2009.
- Kiritchenko, S.; Nejadgholi, I.; Fraser, K.C. Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective. J. Artif. Intell. Res. 2021, 71, 431–478.
- Ali, H.; Malik, I.; Mahmood, S.; Akif, F.; Amin, J. Sybil detection in online social networks. In Proceedings of the 2022 17th International Conference on Emerging Technologies (ICET), Swabi, Pakistan, 29–30 November 2022.
- Pig Butchering Scam: From Tinder and TikTok to WhatsApp and Telegram, How Scammers Are Stealing Millions in a Long Con, Tenable®. 2024. Available online: https://www.tenable.com/blog/pig-butchering-scam-tinder-tiktok-whatsapp-telegram-scammers-steal-millions#webinar-2/22 (accessed on 19 February 2024).
- Abbate, P. Federal Bureau of Investigation Internet Crime Report 2021; Internet Crime Complaint Center: Washington, DC, USA, 2021.
- Fire, M.; Goldschmidt, R.; Elovici, Y. Online Social Networks: Threats and Solutions. IEEE Commun. Surv. Tutorials 2014, 16, 2019–2036.
- Wolotko, D. How Many Fake Accounts Are on Social Media?—Hypetrain’s Blog. Available online: https://blog.hypetrain.io/fake_accounts/ (accessed on 18 March 2024).
- Liu, Z.; Zhou, J. Introduction to graph neural networks. In Synthesis Lectures on Artificial Intelligence and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2020; Volume 14, pp. 1–127.
- Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24.
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the ICML’17: Proceedings of the 34th International Conference on Machine Learning, Sydney, NSW, Australia, 6–11 August 2017; pp. 1263–1272.
- Cooper, S.D. A concise history of the fauxtography blogstorm in the 2006 Lebanon war. Am. Commun. J. 2007, 9, 2.
- Zhang, D.Y.; Shang, L.; Geng, B.; Lai, S.; Li, K.; Zhu, H.; Amin, M.T.; Wang, D. Fauxbuster: A content-free fauxtography detector using social media comments. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 891–900.
- Shang, L.; Zhang, Y.; Zhang, D.; Wang, D. Fauxward: A graph neural network approach to fauxtography detection using social media comments. Soc. Netw. Anal. Min. 2020, 10, 76.
- Qian, Y.; Zhang, Y.; Ye, Y.; Zhang, C. Distilling meta knowledge on heterogeneous graph for illicit drug trafficker detection on social media. Adv. Neural Inf. Process. Syst. 2021, 34, 26911–26923.
- Asif, M.; Al-Razgan, M.; Ali, Y.A.; Yunrong, L. Graph convolution networks for social media trolls detection use deep feature extraction. J. Cloud Comput. 2024, 13, 33.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
- What is a Generative Adversarial Network (GAN)? Definition from TechTarget. Enterprise AI. Available online: https://www.techtarget.com/searchenterpriseai/definition/generative-adversarial-network-GAN# (accessed on 8 March 2024).
- Wikipedia Contributors. “Deepfake”. 2019. Available online: https://en.wikipedia.org/wiki/Deepfake (accessed on 19 April 2024).
- Sample, I. What Are Deepfakes—And How Can You Spot Them? The Guardian. 13 January 2020. Available online: https://www.theguardian.com/technology/2020/jan/13/what-are-deepfakes-and-how-can-you-spot-them (accessed on 19 April 2024).
- File: Deepfake Metahuman.png—Wikimedia Commons. 2024. Available online: https://commons.m.wikimedia.org/wiki/File:Deepfake_Metahuman.png (accessed on 18 March 2024).
- Preeti; Kumar, M.; Sharma, H.K. A GAN-Based Model of Deepfake Detection in Social Media. Procedia Comput. Sci. 2023, 218, 2153–2162.
- CelebA Dataset. Available online: https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html (accessed on 21 January 2024).
- Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2015, arXiv:1511.06434.
- Yang, C.; Ding, L.; Chen, Y.; Li, H. Defending against GAN-based Deepfake Attacks via Transformation-aware Adversarial Faces. arXiv 2020, arXiv:2006.07421v1.
- Nadimpalli, A.V.; Rattani, A. ProActive DeepFake Detection using GAN-based Visible Watermarking. ACM Trans. Multimed. Comput. Commun. Appl. 2023.
- Giudice, O.; Guarnera, L.; Battiato, S. Fighting deepfakes by detecting GAN DCT anomalies. J. Imaging 2021, 7, 128.
Domain | Research Studies | Common Methodologies |
---|---|---|
Computer Forensics | [13,14,15,16,17,18,19,20] | Forensic data acquisition; file and operating system analysis; steganalysis |
Mobile Device Forensics | [21,22] | Time synchronization; intra- and inter-application analysis; media log analysis; file system analysis |
Network Forensics | [23,24,25,26] | Deep neural networks with PSO algorithm; network packet analysis; network log analysis |
Database Forensics | [27,28,29,30,31] | Audit logs analysis; tamper detection; forensic data recovery |
Cloud Forensics | [21,32,33,34] | Time synchronization; intra- and inter-application analysis; CLASS; forensic logging frameworks |

Application | Research Studies | Methodologies Used |
---|---|---|
Radicalization Detection | [52,53,54,55,56] | Text preprocessing; feature extraction; word embedding; ML classification algorithms |
Cyberbullying Detection | [57,58,59] | Word similarity and text detection; feature extraction; word embeddings; ML classification algorithms |
Fake Profile Detection | [60,61,62,63] | Text modeling with BoW; dimensionality reduction; feature extraction; ML classification algorithms |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bokolo, B.G.; Liu, Q. Artificial Intelligence in Social Media Forensics: A Comprehensive Survey and Analysis. Electronics 2024, 13, 1671. https://doi.org/10.3390/electronics13091671