Ensemble Machine Learning Techniques for Accurate and Efficient Detection of Botnet Attacks in Connected Computers
Abstract
1. Introduction
- (1) A novel technique based on individual machine learning models and a stacking ensemble model is presented to identify botnet attacks on connected computers.
- (2) An artificial-intelligence-powered system is presented for detecting botnet attacks on connected computers and preventing botnet activity in real time.
- (3) The study uses both qualitative and quantitative approaches to give several perspectives on the research topic of preventing and detecting botnet attacks on connected computers.
- (4) The study proposes a data-driven strategy for botnet attack prevention for governments, agencies, and organizations.
- (5) The study makes a novel contribution to the literature by proposing a new model for detecting botnet attacks.
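The stacking idea in contribution (1) can be illustrated with a minimal sketch. It assumes scikit-learn, uses synthetic regression data in place of the study's botnet dataset, and picks hypothetical hyperparameters; the base learners mirror the RF, DT, and GLM models listed in the abbreviations, but the configuration is an assumption and not the authors' actual setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, TweedieRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the study's dataset is not reproduced here.
X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Base learners: random forest, decision tree, and a Gaussian GLM
# (power=0 with an identity link). Hyperparameters are illustrative only.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeRegressor(max_depth=6, random_state=0)),
    ("glm", TweedieRegressor(power=0, alpha=0.0, link="identity")),
]

# The meta-learner is trained on out-of-fold predictions of the base learners
# (5-fold cross-validation), which is what distinguishes stacking from averaging.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

r2 = stack.score(X_test, y_test)  # coefficient of determination on held-out data
print(f"Stacking ensemble R2 on held-out data: {r2:.4f}")
```

A linear meta-learner is used here only because it is simple to inspect; any regressor could take its place.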
2. Related Works
3. Methods
3.1. Data Collection
3.2. Data Preprocessing
3.3. The Proposed Framework
3.4. Statistical Analysis and Data Splitting
3.5. The Machine Learning Classifiers
3.6. Performance Evaluation
4. Experimental Results
4.1. Statistical Analysis of the Dataset
4.2. Performance of the Stacking Ensemble and Basic Machine Learning Models
4.3. Time Complexity of the Machine Learning Models
4.4. Benchmark Practices to Prevent Botnet Attacks
5. Discussion
6. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
RF | Random forest |
DT | Decision tree |
GLM | Generalized linear model |
R2 | Coefficient of determination |
@ | Location or institution of an email recipient |
\n | Newline |
# | Pound sign used as prefix for an address |
IoT | Internet of Things |
IP | Internet Protocol |
Average predicted | |
Average observed | |
Predicted values | |
ipn | Interpenetrating network |
rpn | Region proposal network |
f | Frequency of connected botnet computers |
References
- Song, B. Reliability analysis and optimization of computer communication network based on genetic algorithm. Int. J. Commun. Syst. 2022, 35, e4601.
- Du, M. Application of information communication network security management and control based on big data technology. Int. J. Commun. Syst. 2022, 35, e4643.
- Uzunidis, D.; Apostolopoulou, F.; Pagiatakis, G.; Stavdas, A. Analysis of Available Components and Performance Estimation of Optical Multi-Band Systems. Eng 2021, 2, 531–543.
- Karimian, R.; Ardakani, M.D.; Ahmadi, S.; Zaghloul, M. Human Body Specific Absorption Rate Reduction Employing a Compact Magneto-Dielectric AMC Structure for 5G Massive-MIMO Applications. Eng 2021, 2, 501–511.
- Qiao, Y.; Yang, Y.-X.; He, J.; Tang, C.; Zeng, Y.-Z. Detecting P2P bots by mining the regional periodicity. J. Zhejiang Univ. Sci. C 2013, 14, 682–700.
- Paredes, J.N.; Simari, G.I.; Martinez, M.V.; Falappa, M.A. Detecting malicious behavior in social platforms via hybrid knowledge- and data-driven systems. Futur. Gener. Comput. Syst. 2021, 125, 232–246.
- Javed, A.R.; Jalil, Z.; Moqurrab, S.A.; Abbas, S.; Liu, X. Ensemble Adaboost classifier for accurate and fast detection of botnet attacks in connected vehicles. Trans. Emerg. Telecommun. Technol. 2020, 33, e4088.
- Yerima, S.Y.; Bashar, A. A Novel Android Botnet Detection System Using Image-Based and Manifest File Features. Electronics 2022, 11, 486.
- Al-Begain, K.; Khan, M.; Alothman, B.; Joumaa, C.; Alrashed, E. A DDoS Detection and Prevention System for IoT Devices and Its Application to Smart Home Environment. Appl. Sci. 2022, 12, 11853.
- Nguyen, G.L.; Dumba, B.; Ngo, Q.-D.; Le, H.-V.; Nguyen, T.N. A collaborative approach to early detection of IoT Botnet. Comput. Electr. Eng. 2022, 97, 107525.
- Velarde-Alvarado, P.; Gonzalez, H.; Martínez-Peláez, R.; Mena, L.J.; Ochoa-Brust, A.; Moreno-García, E.; Félix, V.G.; Ostos, R. A Novel Framework for Generating Personalized Network Datasets for NIDS Based on Traffic Aggregation. Sensors 2022, 22, 1847.
- Stevanovic, M.; Revsbech, K.; Pedersen, J.M.; Sharp, R.; Jensen, C.D. A collaborative approach to botnet protection. Lect. Notes Comput. Sci. 2012, 7465, 624–638.
- Shukla, A.K.; Dwivedi, S. Discovery of Botnet Activities in Internet-of-Things System Using Dynamic Evolutionary Mechanism. New Gener. Comput. 2022, 40, 255–283.
- Masoudi-Sobhanzadeh, Y.; Emami-Moghaddam, S. A real-time IoT-based botnet detection method using a novel two-step feature selection technique and the support vector machine classifier. Comput. Networks 2022, 217, 109365.
- Hosseini, S.; Nezhad, A.E.; Seilani, H. Botnet detection using negative selection algorithm, convolution neural network and classification methods. Evol. Syst. 2022, 13, 101–115.
- Afrifa, S.; Zhang, T.; Appiahene, P.; Varadarajan, V. Mathematical and Machine Learning Models for Groundwater Level Changes: A Systematic Review and Bibliographic Analysis. Futur. Internet 2022, 14, 259.
- Afrifa, S.; Varadarajan, V. Cyberbullying Detection on Twitter Using Natural Language Processing and Machine Learning Techniques. Int. J. Innov. Technol. Interdiscip. Sci. 2022, 5, 1069–1080.
- Shaukat, K.; Luo, S.; Varadharajan, V. A novel method for improving the robustness of deep learning-based malware detectors against adversarial attacks. Eng. Appl. Artif. Intell. 2022, 116, 105461.
- Motylinski, M.; MacDermott, F.; Iqbal, F.; Shah, B. A GPU-based machine learning approach for detection of botnet attacks. Comput. Secur. 2022, 123, 102918.
- Akash, N.S.; Rouf, S.; Jahan, S.; Chowdhury, A.; Uddin, J. Botnet Detection in IoT Devices Using Random Forest Classifier with Independent Component Analysis. J. Inf. Commun. Technol. 2022, 21, 201–232.
- Asadi, M. Detecting IoT botnets based on the combination of cooperative game theory with deep and machine learning approaches. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 5547–5561.
- Gera, S.; Sinha, A. T-Bot: AI-based social media bot detection model for trend-centric twitter network. Soc. Netw. Anal. Min. 2022, 12, 76.
- Onyema, E.M.; Dalal, S.; Romero, C.A.T.; Seth, B.; Young, P.; Wajid, M.A. Design of Intrusion Detection System based on Cyborg intelligence for security of Cloud Network Traffic of Smart Cities. J. Cloud Comput. 2022, 11, 26.
- Okey, O.D.; Maidin, S.S.; Adasme, P.; Rosa, R.L.; Saadi, M.; Melgarejo, D.C.; Rodríguez, D.Z. BoostedEnML: Efficient Technique for Detecting Cyberattacks in IoT Systems Using Boosted Ensemble Machine Learning. Sensors 2022, 22, 7409.
- Alrayes, F.S.; Maray, M.; Gaddah, A.; Yafoz, A.; Alsini, R.; Alghushairy, O.; Mohsen, H.; Motwakel, A. Modeling of Botnet Detection Using Barnacles Mating Optimizer with Machine Learning Model for Internet of Things Environment. Electronics 2022, 11, 3411.
- Prasad, A.; Chandra, S. VMFCVD: An Optimized Framework to Combat Volumetric DDoS Attacks using Machine Learning. Arab. J. Sci. Eng. 2022, 47, 9965–9983.
- Syamsuddin, I.; Barukab, O.M. SUKRY: Suricata IDS with Enhanced kNN Algorithm on. Electronics 2022, 11, 737.
- Yang, C.; Lu, T.; Yan, S.; Zhang, J.; Yu, X. N-Trans: Parallel Detection Algorithm for DGA Domain Names. Futur. Internet 2022, 14, 209.
- Gómez-Escalonilla, V.; Martínez-Santos, P.; Martín-Loeches, M. Preprocessing approaches in machine-learning-based groundwater potential mapping: An application to the Koulikoro and Bamako regions, Mali. Hydrol. Earth Syst. Sci. 2022, 26, 221–243.
- Appiahene, P.; Missah, Y.M.; Najim, U. Predicting Bank Operational Efficiency Using Machine Learning Algorithm: Comparative Study of Decision Tree, Random Forest, and Neural Networks. Adv. Fuzzy Syst. 2020, 2020, 8581202.
- Appiahene, P.; Missah, Y.M.; Najim, U. Evaluation of information technology impact on bank’s performance: The Ghanaian experience. Int. J. Eng. Bus. Manag. 2019, 11, 5337.
- Appiahene, P.; Missah, Y.A.W.M. Predicting the Operational Efficiency of Banks in the Presence of Information Technology Investment using Artificial Neural Network. In Proceedings of the International Conference on Artificial Intelligence and Soft Computing (ICAISC), Zakopane, Poland, 16–20 June 2019; pp. 6–11.
- Chen, Y.; Chen, W.; Pal, S.C.; Saha, A.; Chowdhuri, I.; Adeli, B.; Janizadeh, S.; Dineva, A.A.; Wang, X.; Mosavi, A. Evaluation efficiency of hybrid deep learning algorithms with neural network decision tree and boosting methods for predicting groundwater potential. Geocarto Int. 2022, 37, 5564–5584.
- Zhang, Y.; Cao, W.; Jin, Y.; Wu, M. An ensemble model based on weighted support vector regression and its application in annealing heating process. Sci. China Inf. Sci. 2019, 62, 49202.
- Jiang, M.; Li, F.; Liu, L. Continual meta-learning algorithm. Appl. Intell. 2022, 52, 4527–4542.
- Vimont, A.; Leleu, H.; Durand-Zaleski, I. Machine learning versus regression modelling in predicting individual healthcare costs from a representative sample of the nationwide claims database in France. Eur. J. Health Econ. 2022, 23, 211–223.
- Shahhosseini, M.; Hu, G.; Pham, H. Optimizing ensemble weights and hyperparameters of machine learning models for regression problems. Mach. Learn. Appl. 2022, 7, 100251.
- Disha, R.A.; Waheed, S. Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique. Cybersecurity 2022, 5, 1.
- Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
- Zhang, T.; Li, S.; Feng, G.; Liang, J.; He, L.; Zhao, X. Local channel transformation for efficient convolutional neural network. Signal, Image Video Process. 2022, 17, 129–137.
- Twumasi, E.; Frimpong, E.A.; Kwegyir, D.; Folitse, D. Improvement of Grey System Model using Particle Swarm Optimization. J. Electr. Syst. Inf. Technol. 2021, 8, 12.
- Khan, M.Z. Hybrid Ensemble Learning Technique for Software Defect Prediction. Int. J. Mod. Educ. Comput. Sci. 2020, 12, 1–10.
- Duan, L.; Zhou, J.; Wu, Y.; Xu, W. A novel and highly efficient botnet detection algorithm based on network traffic analysis of smart systems. Int. J. Distrib. Sens. Netw. 2022, 18, 9910.
- Dawson, W.; Degomme, A.; Stella, M.; Nakajima, T.; Ratcliff, L.E.; Genovese, L. Density functional theory calculations of large systems: Interplay between fragments, observables, and computational complexity. WIREs Comput. Mol. Sci. 2022, 12, 1574.
- Alhogail, A.; Al-Turaiki, I. Improved Detection of Malicious Domain Names Using Gradient Boosted Machines and Feature Engineering. Inf. Technol. Control. 2022, 51, 313–331.
- Xu, L.; Xiong, W.; Zhou, M.; Chen, L. A Continuous Terminal Sliding-Mode Observer-Based Anomaly Detection Approach for Industrial Communication Networks. Symmetry 2022, 14, 124.
- Akhtar, M.S.; Feng, T. Detection of Malware by Deep Learning as CNN-LSTM Machine Learning Techniques in Real Time. Symmetry 2022, 14, 2308.
Model | Evaluation | Values |
---|---|---|
GLM | R2 | 0.9522 |
GLM | MAE | 0.0852 |
GLM | RMSE | 0.0099 |
GLM | MAPE | 0.9952 |
DT | R2 | 0.9882 |
DT | MAE | 0.0752 |
DT | RMSE | 0.0085 |
DT | MAPE | 0.9853 |
RF | R2 | 0.9977 |
RF | MAE | 0.0715 |
RF | RMSE | 0.0099 |
RF | MAPE | 0.0952 |
Stacking Ensemble | R2 | 0.9997 |
Stacking Ensemble | MAE | 0.0641 |
Stacking Ensemble | RMSE | 0.0084 |
Stacking Ensemble | MAPE | 0.0899 |
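The four evaluation measures reported in this table (R2, MAE, RMSE, MAPE) can be computed as in the sketch below. It uses the standard textbook definitions and expresses MAPE as a fraction rather than a percentage; whether the study used exactly these conventions is an assumption. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE, RMSE, and MAPE (as a fraction) for paired values."""
    n = len(observed)
    mean_observed = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)                      # residual sum of squares
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)  # total sum of squares

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # MAPE is undefined when an observed value is zero; not handled here.
        "MAPE": sum(abs(e / o) for e, o in zip(errors, observed)) / n,
    }


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Higher R2 and lower MAE, RMSE, and MAPE indicate a better fit, which is how the models in the table are ranked.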
Algorithm | Parameter | Time Complexity (s) |
---|---|---|
Random forest | Best training time | 0.45 |
Decision tree | Best prediction | 0.50 |
Generalized linear model | Worst training time | 0.55 |
Stacking ensemble | Best prediction and worst training time | 2.45 |
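Training and prediction times such as those in this table are typically obtained by timing the fit and predict calls separately. The sketch below shows one way to do that with the Python standard library; `time_call` and the toy `MeanModel` are hypothetical stand-ins, not the study's models or measurement procedure.

```python
import time


def time_call(fn, *args, repeats=5):
    """Run fn(*args) several times; return (best wall-clock seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


class MeanModel:
    """Toy model (assumption): predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


# Hypothetical data, for illustration only.
X = [[float(i)] for i in range(10_000)]
y = [2.0 * i for i in range(10_000)]

model = MeanModel()
train_seconds, _ = time_call(model.fit, X, y)
predict_seconds, predictions = time_call(model.predict, X)
print(f"training: {train_seconds:.6f} s, prediction: {predict_seconds:.6f} s")
```

Taking the best of several repeats reduces the influence of background load on the measurement; reporting the mean or median instead is an equally reasonable choice.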
Published Papers | Year of Publication | Method and Results of the Published Paper | Performance of the Proposed Model |
---|---|---|---|
Disha and Waheed [38] | 2022 | Decision tree (90.15%) | Decision tree (98.82%) |
Rehman Javed et al. [7] | 2022 | Decision tree (97.8%) | Decision tree (98.82%) |
Okey et al. [24] | 2022 | Decision tree (98.7%) | Decision tree (98.82%) |
Okey et al. [24] | 2022 | Random forest (98.4%) | Random forest (99.77%) |
Onyema et al. [23] | 2022 | Random forest (99.0%) | Random forest (99.77%) |
Hosseini et al. [15] | 2022 | Random forest (97.0%) | Random forest (99.77%) |
Vimont et al. [36] | 2022 | Generalized linear model (81.9%) | Generalized linear model (95.22%) |
Alhogail and Al-Turaiki [45] | 2022 | Generalized linear model (91.66%) | Generalized linear model (95.22%) |
Xu et al. [46] | 2022 | Generalized linear model (90.5%) | Generalized linear model (95.22%) |
Akhtar and Feng [47] | 2022 | Stacking ensemble (99.0%) | Stacking ensemble (99.99%) |
Masoudi-Sobhanzadeh and Emami-Moghaddam [14] | 2022 | Stacking ensemble (90.0%) | Stacking ensemble (99.99%) |
Yerima and Bashar [8] | 2022 | Stacking ensemble (96.0%) | Stacking ensemble (99.99%) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Afrifa, S.; Varadarajan, V.; Appiahene, P.; Zhang, T.; Domfeh, E.A. Ensemble Machine Learning Techniques for Accurate and Efficient Detection of Botnet Attacks in Connected Computers. Eng 2023, 4, 650-664. https://doi.org/10.3390/eng4010039