Advanced Persistent Threat Group Correlation Analysis via Attack Behavior Patterns and Rough Sets
Abstract
1. Introduction
2. Related Work
2.1. Attack Attribution
2.2. APT Group Correlation Analysis
3. Methodology
3.1. Knowledge Representation of APT Groups
- denotes the type of the targeted entity in the attack, including individuals, organizations, devices, operating systems, and so on.
- denotes the industry to which the target entity belongs.
- denotes the geographical location of the targeted entity in the attack.
- denotes the attack methods employed by the attacker to target entities, including backdoor attacks, phishing emails, exploit attacks, and so on.
- denotes the specific tool, malware name, or family employed by the attacker to target entities.
- denotes the means attackers use to propagate their attacks, such as the vectors for malware delivery and vulnerability exploitation.
- denotes the CVE number of the vulnerability exploited by the attacker.
- denotes the disclosure time of security events. In cases where a specific time is not available, the time stated in the technical report takes precedence.
- denotes the static feature set of malware, including attributes such as the import and export functions of executable files, APIs, language types, resource items, etc. These features provide insights into the coding intentions and language preferences exhibited by malicious code.
- denotes the dynamic feature set of malware, including attributes such as commands, files, registry entries, processes, and dynamic link libraries observed during the execution of malware samples. These features aid in refining the understanding of the attack process.
- denotes the vulnerability feature set of malware.
- denotes the date on which the malware was initially detected in VirusTotal (www.virustotal.com, accessed on 15 August 2023).
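The attributes listed above can be read as two record types. The following is a minimal sketch, assuming the first eight attributes describe a security event and the last four describe a malware sample. The class names, field names, and the `AptGroup` container are hypothetical, since the original symbols did not survive in this text, and the sample values are placeholders rather than real intelligence.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class SecurityEvent:
    """One disclosed attack event (field names are illustrative assumptions)."""
    target_type: str          # individual, organization, device, operating system, ...
    target_industry: str      # industry of the targeted entity
    target_location: str      # geographical location of the targeted entity
    attack_method: str        # backdoor attack, phishing email, exploit attack, ...
    tool: str                 # tool, malware name, or family
    propagation_vector: str   # how the attack was delivered
    cve_id: Optional[str]     # CVE number of the exploited vulnerability, if any
    disclosure_time: date     # disclosure time, or the time given in the report


@dataclass
class MalwareSample:
    """One malware sample (field names are illustrative assumptions)."""
    static_features: Set[str] = field(default_factory=set)         # imports, exports, APIs, ...
    dynamic_features: Set[str] = field(default_factory=set)        # commands, files, registry, ...
    vulnerability_features: Set[str] = field(default_factory=set)
    first_seen: Optional[date] = None                              # first detection date


@dataclass
class AptGroup:
    """Hypothetical container tying events and samples to one group."""
    name: str
    events: List[SecurityEvent] = field(default_factory=list)
    samples: List[MalwareSample] = field(default_factory=list)


# Placeholder values only; none of this is taken from the paper's dataset.
event = SecurityEvent(
    target_type="organization",
    target_industry="energy",
    target_location="unknown",
    attack_method="phishing email",
    tool="example-backdoor",
    propagation_vector="email attachment",
    cve_id="CVE-0000-0000",
    disclosure_time=date(2020, 1, 1),
)
sample = MalwareSample(static_features={"CreateFileW"}, first_seen=date(2020, 1, 2))
group = AptGroup(name="ExampleGroup", events=[event], samples=[sample])
```

Keeping the feature sets as plain sets makes later set-overlap comparisons between groups straightforward, although the paper's own representation may differ.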
3.2. Behavior Pattern Construction of APT Groups Based on Rough Sets
3.3. Correlation Measurement Method Based on Link Prediction
4. Experiment
4.1. Data Gathering and Preprocessing
4.2. Feature Verification and Analysis
4.3. Correlation Analysis
4.4. Evaluation Analysis
5. Case Study
5.1. Correlation Analysis
5.2. Temporal Evolution Analysis
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Name | Equation |
---|---|
Common Neighbors (CN) | 1 |
Jaccard Index | |
Hub-Promoted Index (HPI) | 2 |
Hub-Depressed Index (HDI) | |
Katz Index | 3 |
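The equations for these indices did not survive in this text. As a rough illustration, the sketch below implements the commonly used textbook forms, where the neighbor set of a node is Γ(x) and its degree is k_x: CN is |Γ(x) ∩ Γ(y)|, Jaccard divides that by |Γ(x) ∪ Γ(y)|, HPI divides it by min(k_x, k_y), HDI divides it by max(k_x, k_y), and Katz sums the number of walks of each length l weighted by β^l. These may differ from the exact variants used in the paper. The toy graph, the function names, the damping value `beta`, and the truncation length `max_len` are all assumptions.

```python
# Toy undirected graph stored as adjacency sets (hypothetical data).
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}


def common_neighbors(g, x, y):
    """Number of neighbors shared by x and y."""
    return len(g[x] & g[y])


def jaccard(g, x, y):
    """Shared neighbors divided by the size of the neighbor union."""
    union = g[x] | g[y]
    return len(g[x] & g[y]) / len(union) if union else 0.0


def hub_promoted(g, x, y):
    """Shared neighbors divided by the smaller of the two degrees."""
    denom = min(len(g[x]), len(g[y]))
    return len(g[x] & g[y]) / denom if denom else 0.0


def hub_depressed(g, x, y):
    """Shared neighbors divided by the larger of the two degrees."""
    denom = max(len(g[x]), len(g[y]))
    return len(g[x] & g[y]) / denom if denom else 0.0


def katz(g, x, y, beta=0.1, max_len=4):
    """Truncated Katz score: sum of beta**l times the number of
    length-l walks from x to y, for l = 1..max_len."""
    walks = {v: 0 for v in g}   # walks[v] = number of walks of current length x -> v
    walks[x] = 1
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in g}
        for u, count in walks.items():
            if count:
                for v in g[u]:
                    nxt[v] += count
        walks = nxt
        score += beta ** length * walks[y]
    return score


cn_bd = common_neighbors(GRAPH, "b", "d")
jac_bd = jaccard(GRAPH, "b", "d")
hpi_bd = hub_promoted(GRAPH, "b", "d")
hdi_bd = hub_depressed(GRAPH, "b", "d")
katz_bd = katz(GRAPH, "b", "d", beta=0.1, max_len=2)
```

The exact Katz index sums over all walk lengths and is usually computed by matrix inversion; the truncation here only keeps the example dependency-free.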
Experiment Number | Classification Task |
---|---|
A | Binary classification: general malware vs. APT malware. |
B | General malware plus 76 APT group classes. |
C | 8 general malware types plus 76 APT group classes. |
D | 76 APT group classes. |
Feature Selection Method | Number of Features | Scaling Ratio | Classification Task | Attribution Accuracy | Attribution Precision | Attribution F1 | Correlation Precision | Correlation Accuracy | Number of Connections |
---|---|---|---|---|---|---|---|---|---|
TF-IDF | 410 | 103:1 | A | 0.9642 | 0.9640 | 0.9641 | - | - | - |
TF-IDF | 410 | 103:1 | B | 0.9251 | 0.9297 | 0.9244 | - | - | - |
TF-IDF | 410 | 103:1 | C | 0.7620 | 0.7581 | 0.7552 | - | - | - |
TF-IDF | 410 | 103:1 | D | 0.7488 | 0.7609 | 0.7442 | 0.5384 | 0.9810 | 7 |
High-frequency word | 332 | 127:1 | A | 0.9628 | 0.9625 | 0.9625 | - | - | - |
High-frequency word | 332 | 127:1 | B | 0.9214 | 0.9225 | 0.9193 | - | - | - |
High-frequency word | 332 | 127:1 | C | 0.7699 | 0.7720 | 0.7657 | - | - | - |
High-frequency word | 332 | 127:1 | D | 0.7251 | 0.7298 | 0.7164 | 0.6250 | 0.9819 | 10 |
Mutual information | 172 | 245:1 | A | 0.9475 | 0.9474 | 0.9474 | - | - | - |
Mutual information | 172 | 245:1 | B | 0.9166 | 0.9157 | 0.9114 | - | - | - |
Mutual information | 172 | 245:1 | C | 0.7722 | 0.7758 | 0.7688 | - | - | - |
Mutual information | 172 | 245:1 | D | 0.7509 | 0.7619 | 0.7455 | 0.8571 | 0.9838 | 12 |
Method | Precision | Accuracy | TP | FP |
---|---|---|---|---|
Average distance | 0.6923 | 0.9781 | 18 | 8 |
Center distance | 0.1860 | 0.9658 | 8 | 35 |
Set similarity | 0.8750 | 0.9781 | 14 | 2 |
Rough set | 0.9090 | 0.9801 | 20 | 2 |
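The precision column in the table above appears to follow the usual definition TP / (TP + FP), with values truncated (not rounded) to four decimal places. That reading is inferred from the numbers themselves and is not stated in the surviving text. A short sketch that reproduces the column under this assumption, with hypothetical names:

```python
# (TP, FP) pairs copied from the table above.
RESULTS = {
    "Average distance": (18, 8),
    "Center distance": (8, 35),
    "Set similarity": (14, 2),
    "Rough set": (20, 2),
}


def precision_truncated(tp, fp, digits=4):
    """Precision TP / (TP + FP), truncated to `digits` decimal places.

    Integer arithmetic is used so that truncation is exact.
    """
    scale = 10 ** digits
    return (tp * scale // (tp + fp)) / scale


precisions = {name: precision_truncated(tp, fp) for name, (tp, fp) in RESULTS.items()}

for name, value in precisions.items():
    print(f"{name}: {value:.4f}")
```

Accuracy cannot be recomputed the same way because the table does not list the true-negative and false-negative counts.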
Time | 2008–2011 | 2012–2015 | 2016–2018 | 2019–2022 | 2008–2022 |
---|---|---|---|---|---|
Security event samples | 45 | 346 | 772 | 433 | 1582 |
Malware samples | 82 | 1495 | 1363 | 279 | 3219 |
Dataset | Time | Precision | Accuracy | TP | FP |
---|---|---|---|---|---|
Security events | 2008–2011 | - | - | - | - |
Security events | 2012–2015 | 0.5000 | 0.9972 | 1 | 1 |
Security events | 2016–2018 | 0.3333 | 0.9886 | 2 | 4 |
Security events | 2019–2022 | 0.5000 | 0.9801 | 2 | 2 |
Security events | 2008–2022 | 0.6666 | 0.9829 | 4 | 2 |
Malware | 2008–2011 | - | - | - | - |
Malware | 2012–2015 | 0.7500 | 0.9987 | 3 | 1 |
Malware | 2016–2018 | 0.5000 | 0.9958 | 6 | 6 |
Malware | 2019–2022 | 0.4285 | 0.9920 | 3 | 4 |
Malware | 2008–2022 | 0.8666 | 0.9851 | 13 | 2 |
APT groups | 2008–2011 | - | - | - | - |
APT groups | 2012–2015 | 0.6666 | 0.9984 | 4 | 2 |
APT groups | 2016–2018 | 0.5000 | 0.9946 | 9 | 9 |
APT groups | 2019–2022 | 0.4545 | 0.9895 | 5 | 6 |
APT groups | 2008–2022 | 0.9090 | 0.9844 | 20 | 2 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, J.; Liu, J.; Zhang, R. Advanced Persistent Threat Group Correlation Analysis via Attack Behavior Patterns and Rough Sets. Electronics 2024, 13, 1106. https://doi.org/10.3390/electronics13061106