Efficient Large-Scale IoT Botnet Detection through GraphSAINT-Based Subgraph Sampling and Graph Isomorphism Network
Abstract
1. Introduction
- We propose an efficient detection scheme for complex botnet structures in large-scale IoT networks. The scheme relies only on aggregated connection information from network traffic, which makes it practical to deploy.
- We apply GraphSAINT to sample subgraphs from large-scale IoT botnet graphs, which keeps data processing efficient and unbiased. We also develop a detector based on the graph isomorphism network (GIN), whose stronger representational power benefits botnet detection.
- Evaluated on the C2, P2P, and Chord datasets, the prototype reaches an accuracy of 99.97%, surpassing existing graph-based models and recently proposed botnet detection schemes.
2. Related Works
2.1. Machine Learning-Based Botnet Detection
2.2. Deep Learning-Based Botnet Detection
2.3. Graph-Based Botnet Detection
3. Background and Materials
3.1. Botnet Architecture and Life Cycle
3.2. GraphSAINT
3.3. Graph Neural Network
4. Methodology
4.1. Overview
4.2. Data Preprocessing
4.3. Subgraph Sampling
Algorithm 1. Random edge subgraph sampling algorithm by GraphSAINT
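As a rough illustration of the random edge sampling named in Algorithm 1, the following minimal sketch follows the edge sampler described in the GraphSAINT paper: each edge (u, v) is drawn with probability proportional to 1/deg(u) + 1/deg(v), and the subgraph is induced on the endpoints of the drawn edges. The function name, the `budget` parameter, and the edge-list input format are assumptions made for this example, not details taken from the article. The normalization coefficients that GraphSAINT applies during training to keep the estimator unbiased are omitted here.

```python
import random
from collections import defaultdict


def sample_edge_subgraph(edges, budget, seed=0):
    """Sample a node-induced subgraph by GraphSAINT-style random edge sampling.

    edges  : list of undirected (u, v) pairs (assumed: no self-loops, no duplicates)
    budget : number of edge draws per subgraph (hypothetical parameter name)
    """
    rng = random.Random(seed)

    # Node degrees in the full graph.
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Edge (u, v) is drawn with probability proportional to 1/deg(u) + 1/deg(v).
    weights = [1.0 / degree[u] + 1.0 / degree[v] for u, v in edges]

    # Draw edges with replacement and collect their endpoints.
    drawn = rng.choices(edges, weights=weights, k=budget)
    nodes = {n for edge in drawn for n in edge}

    # Induce the subgraph: keep every original edge whose endpoints both survive.
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, sub_edges
```

Weighting edges by inverse endpoint degree favors edges between low-degree nodes, so that high-degree hubs do not dominate every sampled subgraph.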
4.4. Detection Based on GIN
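For orientation, a single GIN layer updates each node as h_v ← MLP((1 + ε) · h_v + Σ h_u over neighbors u), following the formulation in the GIN paper cited in the references. The sketch below shows that update in plain Python. The `toy_mlp` weights, the LeakyReLU slope of 0.01, and the dictionary-based graph representation are illustrative assumptions, not the article's trained model.

```python
def gin_layer(features, neighbors, mlp, eps=0.0):
    """One GIN update: h_v <- mlp((1 + eps) * h_v + sum of neighbor features).

    features  : dict mapping node id -> list of floats
    neighbors : dict mapping node id -> iterable of neighbor node ids
    mlp       : callable applied to the aggregated vector
    """
    updated = {}
    for v, h_v in features.items():
        # Scale the node's own features by (1 + eps).
        agg = [(1.0 + eps) * x for x in h_v]
        # Sum aggregation over neighbors (sum, not mean or max).
        for u in neighbors.get(v, ()):
            agg = [a + b for a, b in zip(agg, features[u])]
        updated[v] = mlp(agg)
    return updated


def toy_mlp(x):
    """Stand-in for the learned MLP: a fixed linear map followed by LeakyReLU."""
    w = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical weights, not trained values
    z = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return [zi if zi > 0 else 0.01 * zi for zi in z]


features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
out = gin_layer(features, neighbors, toy_mlp)
```

Sum aggregation preserves neighborhood multiplicities that mean or max pooling would discard, which is the source of GIN's discriminative power.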
5. Metrics
6. Experiments and Evaluation
6.1. Setup
6.2. Comparative Experiment
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Madakam, S.; Ramaswamy, R.; Tripathi, S. Internet of Things (IoT): A literature review. J. Comput. Commun. 2015, 3, 164–173.
2. Montazerolghaem, A.; Yaghmaee, M.H. Load-balanced and QoS-aware software-defined Internet of Things. IEEE Internet Things J. 2020, 7, 3323–3337.
3. Montazerolghaem, A. Software-defined Internet of Multimedia Things: Energy-efficient and Load-balanced Resource Management. IEEE Internet Things J. 2021, 9, 2432–2442.
4. Vailshery, L. Number of Internet of Things (IoT) Connected Devices Worldwide from 2019 to 2023, with Forecasts from 2022 to 2030. 2023. Available online: https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide (accessed on 25 March 2024).
5. Chinese Academy of Cyberspace Studies. The Construction of World Information Infrastructure. In World Internet Development Report 2022: Blue Book for World Internet Conference; Springer: Berlin/Heidelberg, Germany, 2023; pp. 59–81.
6. Xiang, C.; Wu, C.; Liu, Q.; Zhou, S. Review of Research on Network Security Situation Prediction Technology. Comput. Appl. Softw. 2023, 40, 19–28+36.
7. Djenna, A.; Harous, S.; Saidouni, D.E. Internet of things meet internet of threats: New concern cyber security issues of critical cyber infrastructure. Appl. Sci. 2021, 11, 4580.
8. Lohachab, A.; Karambir, B. Critical analysis of DDoS—An emerging security threat over IoT networks. J. Commun. Inf. Netw. 2018, 3, 57–78.
9. Burhan, M.; Alam, H.; Arsalan, A.; Rehman, R.A.; Anwar, M.; Faheem, M.; Ashraf, M.W. A comprehensive survey on the cooperation of fog computing paradigm-based iot applications: Layered architecture, real-time security issues, and solutions. IEEE Access 2023, 11, 73303–73329.
10. Koroniotis, N.; Moustafa, N.; Sitnikova, E. Forensics and deep learning mechanisms for botnets in internet of things: A survey of challenges and solutions. IEEE Access 2019, 7, 61764–61785.
11. Ghafir, I.; Svoboda, J.; Prenosil, V. A survey on botnet command and control traffic detection. Int. J. Adv. Comput. Netw. Its Secur. (IJCNS) 2015, 5, 75–80.
12. Admass, W.S.; Munaye, Y.Y.; Diro, A. Cyber security: State of the art, challenges and future directions. Cyber Secur. Appl. 2023, 2, 100031.
13. Karanja, E.M.; Masupe, S.; Jeffrey, M.G. Analysis of internet of things malware using image texture features and machine learning techniques. Internet Things 2020, 9, 100153.
14. NSFOCUS. 2020 BOTNET Trend Report. Available online: https://www.nsfocus.com.cn/html/2021/136_0705/155.html (accessed on 25 March 2024).
15. Xia, H.; Li, L.; Cheng, X.; Cheng, X.; Qiu, T. Modeling and analysis botnet propagation in social Internet of Things. IEEE Internet Things J. 2020, 7, 7470–7481.
16. Antonakakis, M.; April, T.; Bailey, M.; Bernhard, M.; Bursztein, E.; Cochran, J.; Durumeric, Z.; Halderman, J.A.; Invernizzi, L.; Kallitsis, M.; et al. Understanding the mirai botnet. In Proceedings of the 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, Canada, 16–18 August 2017; pp. 1093–1110.
17. Moriuchi, P.; Chohan, S. Mirai-variant IoT botnet used to target financial sector in January 2018. In Recorded Future Cyber Threat Analysis Report; Recorded Future: Boston, MA, USA, 2018; pp. 118–140.
18. Porath, R. Internet, Cyber-Und IT-Sicherheit von AZ; Springer: Berlin/Heidelberg, Germany, 2020; pp. 154–196.
19. 360Netlab. Pink, a Botnet That Competed with the Vendor to Control the Massive Infected Devices. 2021. Available online: https://blog.netlab.360.com/pink-en/ (accessed on 25 March 2024).
20. Tu, T.F.; Qin, J.W.; Zhang, H.; Chen, M.; Xu, T.; Huang, Y. A comprehensive study of Mozi botnet. Int. J. Intell. Syst. 2022, 37, 6877–6908.
21. Motylinski, M.; MacDermott, Á.; Iqbal, F.; Shah, B. A GPU-based machine learning approach for detection of botnet attacks. Comput. Secur. 2022, 123, 102918.
22. Nadeem, A.; Hammerschmidt, C.; Gañán, C.H.; Verwer, S. Beyond labeling: Using clustering to build network behavioral profiles of malware families. In Malware Analysis Using Artificial Intelligence and Deep Learning; Springer: Berlin/Heidelberg, Germany, 2021; pp. 381–409.
23. Cong, L.W.; Harvey, C.R.; Rabetti, D.; Wu, Z.Y. An Anatomy of Crypto-Enabled Cybercrimes; Technical Report; National Bureau of Economic Research: Cambridge, MA, USA, 2023.
24. Beigi, E.B.; Jazi, H.H.; Stakhanova, N.; Ghorbani, A.A. Towards effective feature selection in machine learning-based botnet detection approaches. In Proceedings of the 2014 IEEE Conference on Communications and Network Security, San Francisco, CA, USA, 29–31 October 2014; IEEE: Hoboken, NJ, USA, 2014; pp. 247–255.
25. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
26. Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Appl. Sci. 2019, 9, 4396.
27. Asharf, J.; Moustafa, N.; Khurshid, H.; Debie, E.; Haider, W.; Wahab, A. A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions. Electronics 2020, 9, 1177.
28. Zhang, B.; Li, J.; Chen, C.; Lee, K.; Lee, I. A practical botnet traffic detection system using gnn. In Proceedings of the Cyberspace Safety and Security: 13th International Symposium, CSS 2021, Virtual Event, 9–11 November 2021; Springer: Berlin/Heidelberg, Germany, 2022; pp. 66–78.
29. Zhu, X.; Zhang, Y.; Zhang, Z.; Guo, D.; Li, Q.; Li, Z. Interpretability evaluation of botnet detection model based on graph neural network. In Proceedings of the IEEE INFOCOM 2022-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), New York, NY, USA, 2–5 May 2022; IEEE: Hoboken, NJ, USA, 2022; pp. 1–6.
30. Carpenter, J.; Layne, J.; Serra, E.; Cuzzocrea, A. Detecting botnet nodes via structural node representation learning. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; IEEE: Hoboken, NJ, USA, 2021; pp. 5357–5364.
31. Bilot, T.; El Madhoun, N.; Al Agha, K.; Zouaoui, A. Graph neural networks for intrusion detection: A survey. IEEE Access 2023, 11, 49114–49139.
32. Zeng, H.; Zhou, H.; Srivastava, A.; Kannan, R.; Prasanna, V. Graphsaint: Graph sampling based inductive learning method. arXiv 2019, arXiv:1907.04931.
33. Xu, K.; Hu, W.; Leskovec, J.; Jegelka, S. How powerful are graph neural networks? arXiv 2018, arXiv:1810.00826.
34. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Society. Ser. C (Appl. Stat.) 1979, 28, 100–108.
35. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
36. Chen, R.; Niu, W.; Zhang, X.; Zhuo, Z.; Lv, F. An effective conversation-based botnet detection method. Math. Probl. Eng. 2017, 2017, 4934082.
37. Zeidanloo, H.R.; Manaf, A.B.; Vahdani, P.; Tabatabaei, F.; Zamani, M. Botnet detection based on traffic monitoring. In Proceedings of the 2010 International Conference on Networking and Information Technology, Manila, Philippines, 11–12 June 2010; IEEE: Hoboken, NJ, USA, 2010; pp. 97–101.
38. Bullard, C. Audit Record Generation and Utilization System (Argus). 2018. Available online: https://www.qosient.com/argus/index.shtml (accessed on 25 March 2024).
39. Karasaridis, A.; Rexroad, B.; Hoeflin, D.A. Wide-Scale Botnet Detection and Characterization. In Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets, Cambridge, MA, USA, 10 April 2007.
40. Gu, G.; Porras, P.A.; Yegneswaran, V.; Fong, M.W.; Lee, W. Bothunter: Detecting malware infection through ids-driven dialog correlation. In Proceedings of the USENIX Security Symposium, Boston, MA, USA, 6–10 August 2007; Volume 7, pp. 1–16.
41. Amini, P.; Azmi, R.; Araghizadeh, M. Botnet detection using NetFlow and clustering. Adv. Comput. Sci. Int. J. 2014, 3, 139–149.
42. Azab, A.; Alazab, M.; Aiash, M. Machine learning based botnet identification traffic. In Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016; IEEE: Hoboken, NJ, USA, 2016; pp. 1788–1794.
43. Liu, J.; Liu, S.; Zhang, S. Detection of IoT botnet based on deep learning. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; IEEE: Hoboken, NJ, USA, 2019; pp. 8381–8385.
44. Meidan, Y.; Bohadana, M.; Mathov, Y.; Mirsky, Y.; Shabtai, A.; Breitenbacher, D.; Elovici, Y. N-baiot—Network-based detection of iot botnet attacks using deep autoencoders. IEEE Pervasive Comput. 2018, 17, 12–22.
45. Javed, Y.; Rajabi, N. Multi-layer perceptron artificial neural network based IoT botnet traffic classification. In Proceedings of the Future Technologies Conference (FTC) 2019, San Francisco, CA, USA, 24–25 October 2019; Springer: Berlin/Heidelberg, Germany, 2020; Volume 1, pp. 973–984.
46. Ge, M.; Syed, N.F.; Fu, X.; Baig, Z.; Robles-Kelly, A. Towards a deep learning-driven intrusion detection approach for Internet of Things. Comput. Netw. 2021, 186, 107784.
47. Alharbi, A.; Alsubhi, K. Botnet detection approach using graph-based machine learning. IEEE Access 2021, 9, 99166–99180.
48. Wang, W.; Shang, Y.; He, Y.; Li, Y.; Liu, J. BotMark: Automated botnet detection with hybrid analysis of flow-based and graph-based traffic behaviors. Inf. Sci. 2020, 511, 284–296.
49. Nguyen, H.T.; Ngo, Q.D.; Le, V.H. A novel graph-based approach for IoT botnet detection. Int. J. Inf. Secur. 2020, 19, 567–577.
50. Chowdhury, S.; Khanzadeh, M.; Akula, R.; Zhang, F.; Zhang, S.; Medal, H.; Marufuzzaman, M.; Bian, L. Botnet detection using graph-based feature clustering. J. Big Data 2017, 4, 1–23.
51. Zhao, J.; Liu, X.; Yan, Q.; Li, B.; Shao, M.; Peng, H. Multi-attributed heterogeneous graph convolutional network for bot detection. Inf. Sci. 2020, 537, 380–393.
52. Lo, W.W.; Kulatilleke, G.; Sarhan, M.; Layeghy, S.; Portmann, M. XG-BoT: An explainable deep graph neural network for botnet detection and forensics. Internet Things 2023, 22, 100747.
53. Xiaoyuan, M.; Bo, L.; Liu, Y.; Yan, Y. Deep fused flow and topology features for botnet detection basing on pretrained GCN. arXiv 2023, arXiv:2307.10583.
54. Islam, R.; Refat, R.U.D.; Yerram, S.M.; Malik, H. Graph-based intrusion detection system for controller area networks. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1727–1736.
55. O’Meara, K.; Shick, D.; Spring, J.; Stoner, E. Malware Capability Development Patterns Respond to Defenses: Two Case Studies; White Paper; Software Engineering Institute, Carnegie Mellon University: Pittsburgh, PA, USA, 2016; pp. 1–11.
56. Binsalleeh, H.; Ormerod, T.; Boukhtouta, A.; Sinha, P.; Youssef, A.; Debbabi, M.; Wang, L. On the analysis of the zeus botnet crimeware toolkit. In Proceedings of the 2010 Eighth International Conference on Privacy, Security and Trust, Ottawa, ON, Canada, 17–19 August 2010; IEEE: Hoboken, NJ, USA, 2010; pp. 31–38.
57. Wang, P.; Sparks, S.; Zou, C.C. An advanced hybrid peer-to-peer botnet. IEEE Trans. Dependable Secur. Comput. 2008, 7, 113–127.
58. Xing, Y.; Shu, H.; Zhao, H.; Li, D.; Guo, L. Survey on botnet detection techniques: Classification, methods, and evaluation. Math. Probl. Eng. 2021, 2021, 6640499.
59. Xu, K.; Li, C.; Tian, Y.; Sonobe, T.; Kawarabayashi, K.I.; Jegelka, S. Representation learning on graphs with jumping knowledge networks. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 5453–5462.
60. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
61. Zhou, J.; Xu, Z.; Rush, A.M.; Yu, M. Automating botnet detection with graph neural networks. arXiv 2020, arXiv:2003.06344.
62. Garcia, S.; Grill, M.; Stiborek, J.; Zunino, A. An empirical comparison of botnet detection methods. Comput. Secur. 2014, 45, 100–123.
63. Brody, S.; Alon, U.; Yahav, E. How attentive are graph attention networks? arXiv 2021, arXiv:2105.14491.
64. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
65. Chiang, W.L.; Liu, X.; Si, S.; Li, Y.; Bengio, S.; Hsieh, C.J. Cluster-gcn: An efficient algorithm for training deep and large graph convolutional networks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 257–266.
66. Rampášek, L.; Galkin, M.; Dwivedi, V.P.; Luu, A.T.; Wolf, G.; Beaini, D. Recipe for a general, powerful, scalable graph transformer. Adv. Neural Inf. Process. Syst. 2022, 35, 14501–14515.

Metric | Description |
---|---|
TP | Number of malicious nodes predicted as malicious |
FP | Number of benign nodes predicted as malicious |
TN | Number of benign nodes predicted as benign |
FN | Number of malicious nodes predicted as benign |
Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
Precision | $\frac{TP}{TP + FP}$ |
Recall | $\frac{TP}{TP + FN}$ |
F1 Score | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ |
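
The four counts in the table above determine the remaining metrics through their standard definitions. The following minimal sketch computes them; the function name and the zero-division fallbacks are choices made for this example, and the counts in the usage lines are made-up numbers, not results from the article.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Fall back to 0.0 when a denominator is zero (a convention assumed here).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only: 100 malicious and 900 benign nodes.
metrics = classification_metrics(tp=90, fp=10, tn=890, fn=10)
```

Because botnet nodes are a small fraction of each graph, accuracy alone can look high even when many bots are missed, which is why precision, recall, and F1 are reported alongside it.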

Dataset Split | Graphs | Avg Nodes | Avg Edges | Avg Botnet Nodes |
---|---|---|---|---|
Train | 768 | 143,895 | 813,237 | 3211 |
Val | 96 | 143,763 | 812,955 | 3234 |
Test | 96 | 144,051 | 814,003 | 3175 |

Dataset Split | Graphs | Avg Nodes | Avg Edges | Avg Botnet Nodes |
---|---|---|---|---|
Train | 768 | 143,895 | 1,623,217 | 3090 |
Val | 96 | 143,763 | 1,622,620 | 3093 |
Test | 96 | 144,051 | 1,624,948 | 3095 |

Dataset Split | Graphs | Avg Nodes | Avg Edges | Avg Botnet Nodes |
---|---|---|---|---|
Train | 768 | 143,895 | 1,502,748 | 10,000 |
Val | 96 | 143,763 | 1,502,284 | 10,000 |
Test | 96 | 144,051 | 1,504,310 | 10,000 |

Hyperparameter | Value |
---|---|
Layers | 15 |
Hidden Channels | 128 |
Dropout | [0.1, 0.2] |
Activation Function | LeakyReLU |
Learning Rate | 3 × |
Weight Decay | 3 × |
Optimizer | Adam |
Scheduler | ReduceLROnPlateau |

Model | Dataset | Precision | Accuracy | F1 Score | Recall |
---|---|---|---|---|---|
GATv2 [63] | C2 | 0.9973 | 0.9996 | 0.9970 | 0.9967 |
GATv2 [63] | P2P | 0.9976 | 0.9996 | 0.9963 | 0.9950 |
GATv2 [63] | Chord | 0.9921 | 0.9915 | 0.9685 | 0.9915 |
GraphSAGE [64] | C2 | 0.9946 | 0.9996 | 0.9970 | 0.9995 |
GraphSAGE [64] | P2P | 0.9954 | 0.9996 | 0.9970 | 0.9987 |
GraphSAGE [64] | Chord | 0.9986 | 0.9986 | 0.9949 | 0.9986 |
GCN [60] | C2 | 0.7540 | 0.9786 | 0.8311 | 0.9881 |
GCN [60] | P2P | 0.7860 | 0.9839 | 0.8593 | 0.9901 |
GCN [60] | Chord | 0.9980 | 0.9970 | 0.9882 | 0.9788 |
Cluster-GCN [65] | C2 | 0.9966 | 0.9993 | 0.9974 | 0.9983 |
Cluster-GCN [65] | P2P | 0.9971 | 0.9992 | 0.9973 | 0.9983 |
Cluster-GCN [65] | Chord | 0.9970 | 0.9997 | 0.9974 | 0.9978 |
GraphGPS [66] | C2 | 0.9956 | 0.9989 | 0.9954 | 0.9989 |
GraphGPS [66] | P2P | 0.9956 | 0.9994 | 0.9963 | 0.9970 |
GraphGPS [66] | Chord | 0.9963 | 0.9993 | 0.9976 | 0.9985 |
Our Approach | C2 | 0.9965 | 0.9997 | 0.9976 | 0.9987 |
Our Approach | P2P | 0.9970 | 0.9998 | 0.9985 | 0.9991 |
Our Approach | Chord | 0.9982 | 0.9997 | 0.9988 | 0.9991 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yin, L.; Chen, W.; Luo, X.; Yang, H. Efficient Large-Scale IoT Botnet Detection through GraphSAINT-Based Subgraph Sampling and Graph Isomorphism Network. Mathematics 2024, 12, 1315. https://doi.org/10.3390/math12091315