Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning
Abstract
1. Introduction
- We design and implement a multipurpose, lightweight, and highly accurate anomaly-based IoT NIDS using various machine-learning methods.
- We characterize and evaluate the performance of three ensemble learning techniques (EBT, ESK, and ERT) for IoT NIDSs using the NSL-KDD and distilled-Kitsune-2018 datasets.
- We provide a thorough empirical analysis of six different supervised machine-learning methods using eight standard performance evaluation metrics.
- We also compare our results with state-of-the-art NIDS solutions and show that our ensemble-based NIDS outperforms the compared prior art by 1–20%.
2. Related Work
3. System Modeling
- Data Selection Subsystem: This subsystem obtains a representative dataset that the proposed NIDS can use to express the IoT network traffic routing. To build a comparative study and gain more insight into the solution approach, the distilled-Kitsune-2018 dataset [12] and the NSL-KDD dataset [28] are employed at this stage. Both datasets were selected because they are comprehensive, publicly available, well established (they are used in several reputable research studies), fairly large (~150,000 network traffic records each), and cover a wide spectrum of attack vectors for IoT networks specifically and computer networks in general. The distilled-Kitsune-2018 dataset, collected by Mirsky et al. (2018), contains ~150,000 network traffic records of normal traffic and nine attacks that target the three main security services known as the CIA triad (confidentiality, integrity, availability): two reconnaissance attacks (OS Scan and Fuzzing), three man-in-the-middle (MitM) attacks (video injection, ARP, and active wiretap), three denial-of-service (DoS) attacks (SSDP Flood, SYN DoS, and SSL renegotiation), and one botnet malware attack (Mirai) [12]. On the other hand, NSL-KDD [28], which is a newer and reduced version of the original KDD’99 dataset, was developed by the Defense Advanced Research Projects Agency (DARPA) and has been revised to include more up-to-date and nonredundant attack records with different levels of difficulty. The NSL-KDD dataset is available as a two-class traffic dataset (normal vs. anomaly) and as a multiclass traffic dataset that includes a difficulty level and attack-type labels (normal, DoS attacks, Probe attacks, Remote-to-Local (R2L) attacks, and User-to-Root (U2R) attacks).
In both forms, NSL-KDD comprises a total of ~150,000 samples, each with 43 attributes, such as duration, protocol, and service. The distribution of the distilled-Kitsune-2018 and NSL-KDD datasets is summarized in Table 4.
- Data Preprocessing Subsystem: This subsystem prepares the dataset for the machine-learning processes. It starts when the data is imported from the CSV files through MATLAB containers and stored and processed as MATLAB tables. The data is then cleaned by excluding the untrainable features, filling the unimportable and empty data cells with zero values, and unifying the number of extracted features across all traffic datasets. Next, all datasets are labeled using categorical labels, which are then encoded as integers (0–9). The data records from all attack datasets are then combined into one large, comprehensive dataset containing all types of traffic (normal, OS Scan, Fuzzing, Video Inj, ARP, Wiretap, SSDP F, SYN DoS, SSL R, and Mirai). Finally, all numerical values are standardized to ease the classifier’s task, and all records are randomly shuffled to eliminate any bias in the classification process.
- Data Distribution Subsystem: This subsystem randomly divides the preprocessed dataset (using the DivideRand algorithm [49]) into a training dataset and a validation (testing) dataset. We use a 70%:30% split between training and testing data. In addition, the data is randomly distributed into five different 70%:30% splits to accommodate the 5-fold cross-validation process [49] performed at the learning stage.
- Learning Process Subsystem: This subsystem develops and employs the different machine-learning models used to train and test/validate the considered datasets. Supervised machine-learning methods are usually employed to develop solutions for regression [50], prediction [51], and classification [52]. We employ six supervised ML schemes: Ensemble Boosted Trees (EBT) [53], Ensemble Subspace kNN (ESK) [54], Ensemble RUSBoosted Trees (ERT) [55], Shallow Neural Network (SNN) [56], Bilayered Neural Network (BNN) [57], and Logistic Regression Kernel (LRK) [58]. The employed ML models are summarized in Table 5.
- System Evaluation Subsystem: This subsystem evaluates the performance of the proposed NIDS using several quality indicators [59]: validation accuracy (CA), validation precision (PR), validation recall (RC), misclassification rate (MCR), false discovery rate (FDR), false negative rate (FNR), total validation cost (TC) in terms of the number of misclassified samples, and classification speed (CS) in terms of the number of observations per second. These indicators are used to select the most advantageous model for attack-aware IoT network traffic routing detection/classification, and to compare the results of our best NIDS with other state-of-the-art models in the same area of study. A summary of the evaluation metrics is depicted in Figure 3.
- Classification Process Subsystem: This subsystem categorizes the traffic records either through binary classification (normal vs. anomaly/attack) or through multiclass classification into a dedicated traffic category: {normal, OS Scan attack, Fuzzing attack, Video Inj attack, ARP attack, Wiretap attack, SSDP F attack, SYN DoS attack, SSL R attack, and Mirai attack} for the Kitsune dataset, and {normal, DoS attacks, Probe attacks, R2L attacks, and U2R attacks} for the NSL-KDD dataset.
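The preprocessing and distribution steps described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' MATLAB implementation: the helper names (`to_number`, `preprocess`, `split_70_30`, `five_random_splits`) and the `(features, label)` record layout are assumptions made for the example, and the split uses Python's `random` module rather than MATLAB's DivideRand.

```python
import math
import random
import statistics

# Traffic classes listed in the text, mapped to integer codes 0-9.
LABELS = ["normal", "OS Scan", "Fuzzing", "Video Inj", "ARP",
          "Wiretap", "SSDP F", "SYN DoS", "SSL R", "Mirai"]
LABEL_TO_INT = {name: code for code, name in enumerate(LABELS)}


def to_number(cell):
    """Return the cell as a float, or 0.0 if it is empty or not importable."""
    try:
        value = float(cell)
    except (TypeError, ValueError):
        return 0.0
    return 0.0 if math.isnan(value) else value


def standardize(feature_rows):
    """Z-score each column; constant columns end up as all zeros."""
    columns = list(zip(*feature_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in feature_rows]


def preprocess(records):
    """records: list of (raw_feature_list, label_name) pairs (assumed layout)."""
    features = [[to_number(cell) for cell in raw] for raw, _ in records]
    labels = [LABEL_TO_INT[name] for _, name in records]
    return list(zip(standardize(features), labels))


def split_70_30(samples, seed):
    """Shuffle a copy of the samples and cut it 70:30 into train and test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def five_random_splits(samples):
    """Five independent random 70:30 splits, as the text describes."""
    return [split_70_30(samples, seed) for seed in range(5)]
```

The sketch follows the order given in the text (clean, encode, standardize, then shuffle and split), so the standardization statistics are computed over the full combined dataset before it is divided.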
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- [1] Ashton, K. That ‘internet of things’ thing. RFID J. 2009, 22, 97–114.
- [2] Feng, X.; Yang, L.T.; Wang, L.; Vinel, A. Internet of things. Int. J. Commun. Syst. 2012, 25, 1101.
- [3] Yuehong, Y.I.N.; Zeng, Y.; Chen, X.; Fan, Y. The internet of things in healthcare: An overview. J. Ind. Inf. Integr. 2016, 1, 3–13.
- [4] Wattana, V.; Xu, L.D.; Bi, Z.; Pungpapong, V. Blockchain and internet of things for modern business process in digital economy—the state of the art. IEEE Trans. Comput. Soc. Syst. 2019, 6, 1420–1432.
- [5] John, P.; Shpantzer, G. Securing the Internet of Things Survey; SANS Institute: Rockville, MD, USA, 2014; pp. 1–22.
- [6] Zheng, D.E.; William, A.C. Leveraging the Internet of Things for a more Efficient and Effective Military; Center for Strategic & International Studies: Washington, DC, USA, 2015.
- [7] Dimitrov, D.V. Medical internet of things and big data in healthcare. Healthc. Inform. Res. 2016, 22, 156–163.
- [8] Chen, Y.; Shen, W.; Wang, X. Applications of Internet of Things in manufacturing. In Proceedings of the 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Nanchang, China, 4–6 May 2016; pp. 670–675.
- [9] Said, O.; Mehedi, M. Towards internet of things: Survey and future vision. Int. J. Comput. Netw. 2013, 5, 1–17.
- [10] Axelsson, S. Intrusion detection systems: A survey and taxonomy. Technol. Rep. 2000, 99, 1–15.
- [11] Verwoerd, T.; Ray, H. Intrusion detection techniques and approaches. Comput. Commun. 2002, 25, 1356–1365.
- [12] Mirsky, Y.; Tomer, D.; Yuval, E.; Asaf, S. Kitsune: An ensemble of autoencoders for online network intrusion detection. arXiv 2018, arXiv:1802.09089.
- [13] Jyothsna, V.V.R.P.V.; Prasad, R.; Prasad, K.M. A review of anomaly-based intrusion detection systems. Int. J. Comput. Appl. 2011, 28, 26–35.
- [14] Tavallaee, M.; Natalia, S.; Ghorbani, A.A. Toward credible evaluation of anomaly-based intrusion-detection methods. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2010, 40, 516–524.
- [15] Gustavo, N.; Correia, M. Anomaly-based intrusion detection in software as a service. In Proceedings of the 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W), Hong Kong, China, 27–30 June 2011; pp. 19–24.
- [16] McLachlan, G.J. Discriminant Analysis and Statistical Pattern Recognition; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 583.
- [17] Kumar, B.V.; Abhijit, M.; Richard, D.J. Correlation Pattern Recognition; Cambridge University Press: Cambridge, UK, 2005.
- [18] Papakostas, G.A.; Anestis, G.H.; Vassilis, G.K. Distance and similarity measures between intuitionistic fuzzy sets: A comparative analysis from a pattern recognition point of view. Pattern Recognit. Lett. 2013, 34, 1609–1622.
- [19] Bulgarevich, D.S.; Susumu, T.; Tadashi, K.; Masahiko, D.; Makoto, W. Pattern recognition with machine learning on optical microscopy images of typical metallurgical microstructures. Sci. Rep. 2018, 8, 1–8.
- [20] Zięba, M.; Sebastian, K.T.; Jakub, M.T. Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction. Expert Syst. Appl. 2016, 58, 93–101.
- [21] Verma, A.; Virender, R. ELNIDS: Ensemble learning based network intrusion detection system for RPL based Internet of Things. In Proceedings of the 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), Ghaziabad, India, 18–19 April 2019; pp. 1–6.
- [22] Yahalom, R.; Steren, A.; Nameri, Y.; Roytman, M. Small Versions of the Extracted Features Datasets for 9 Attacks on IP Camera and IoT Networks Generated by Mirskey et al., Mendeley Data. 2018. Available online: https://data.mendeley.com/datasets/zvsk3k9cf2/1 (accessed on 1 December 2021).
- [23] Kambourakis, G.; Constantinos, K.; Angelos, S. The mirai botnet and the iot zombie armies. In Proceedings of the MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM), Baltimore, MD, USA, 23–25 October 2017; pp. 267–272.
- [24] Bi, J.; Zhang, C. An empirical comparison on state-of-the-art multi-class imbalance learning algorithms and a new diversified ensemble learning scheme. Knowl.-Based Syst. 2018, 158, 81–93.
- [25] Khasawneh, K.N.; Meltem, O.; Caleb, D.; Nael, A.; Dmitry, P. Ensemble learning for low-level hardware-supported malware detection. In Proceedings of the International Symposium on Recent Advances in Intrusion Detection, Kyoto, Japan, 2–4 November 2015; Springer: Cham, Switzerland; pp. 3–25.
- [26] Wang, S.; Yin, Y.; Cao, G.; Wei, B.; Zheng, Y.; Yang, G. Hierarchical retinal blood vessel segmentation based on feature and ensemble learning. Neurocomputing 2015, 149, 708–717.
- [27] Yang, X.; David, L.; Xia, X.; Sun, J. TLEL: A two-layer ensemble learning approach for just-in-time defect prediction. Inf. Softw. Technol. 2017, 87, 206–220.
- [28] Canadian Institute for Cybersecurity (CIS). NSL-KDD Dataset. Available online: https://www.unb.ca/cic/datasets/nsl.html (accessed on 13 December 2021).
- [29] Frank, J. Artificial intelligence and intrusion detection: Current and future directions. In Proceedings of the 17th National Computer Security Conference, Baltimore, MD, USA, 10–14 October 1994; Volume 10, pp. 1–12.
- [30] Jackson, K.A.; David, H.D.; Stallings, C.A. NADIR (Network Anomaly Detection and Intrusion Reporter): A Prototype Network Intrusion Detection System; No. LA-UR-90-3726 CONF-910596-1; Los Alamos National Lab.: Los Alamos, NM, USA, 1990.
- [31] Kumar, S.; Eugene, H.S. An Application of Pattern Matching in Intrusion Detection; Department of Computer Science Technical Reports, Purdue University: West Lafayette, IN, USA, 1994.
- [32] Sultana, N.; Naveen, C.; Peng, W.; Alhadad, R. Survey on SDN based network intrusion detection system using machine learning approaches. Peer-Peer Netw. Appl. 2019, 12, 493–501.
- [33] Abdulhammed, R.; Hassan, M.; Ali, A.; Miad, F.; Abdelshakour, A. Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 2019, 8, 322.
- [34] Taher, K.A.; Jisan, B.M.Y.; Rahman, M.M. Network intrusion detection using supervised machine learning technique with feature selection. In Proceedings of the 2019 International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, 10–12 January 2019; pp. 643–646.
- [35] Sarhan, M.; Siamak, L.; Marius, P. Towards a Standard Feature Set for Network Intrusion Detection System Datasets. Mobile. Netw. Appl. 2021, 11, 1–14.
- [36] Ashraf, J.; Marwa, K.; Nour, M.; Mohamed, A.; Hasnat, K.; Asim, D.B.; Reham, R.M. IoTBoT-IDS: A Novel Statistical Learning-enabled Botnet Detection Framework for Protecting Networks of Smart Cities. Sustain. Cities Soc. 2021, 72, 103041.
- [37] Kumar, P.; Govind, P.G.; Rakesh, T. TP2SF: A Trustworthy Privacy-Preserving Secured Framework for sustainable smart cities by leveraging blockchain and machine learning. J. Syst. Archit. 2021, 115, 101954.
- [38] Khan, M.A.; Muazzam, A.K.; Shahid, L.; Awais, A.S.; Mujeeb, U.R.; Wadii, B.; Maha, D.; Jawad, A. Voting Classifier-based Intrusion Detection for IoT Networks. arXiv 2021, arXiv:2104.10015.
- [39] Abu, A.Q.; Saleh, Z. An Efficient Deep-Learning-Based Detection and Classification System for Cyber-Attacks in IoT Communication Networks. Electronics 2020, 9, 2152.
- [40] Liu, J.; Burak, K.; Carlisle, A. Machine learning-driven intrusion detection for Contiki-NG-based IoT networks exposed to NSL-KDD dataset. In Proceedings of the 2nd ACM Workshop on Wireless Security and Machine Learning, Linz, Austria, 16 July 2020; pp. 25–30.
- [41] Alsaedi, A.; Moustafa, N.; Tari, Z.; Mahmood, A.; Anwar, A. TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven Intrusion Detection Systems. IEEE Access 2020, 8, 165130–165150.
- [42] Kumar, A.; Teng, J.L. EDIMA: Early detection of IoT malware network activity using machine learning techniques. In Proceedings of the 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), Limerick, Ireland, 15–18 April 2019; pp. 289–294.
- [43] Hafeez, I.; Markku, A.; Aaron, Y.D.; Sasu, T. IoT-KEEPER: Detecting malicious IoT network activity using online traffic analysis at the edge. IEEE Trans. Netw. Serv. Manag. 2020, 17, 45–59.
- [44] Zhong, Y.; Zhu, Y.; Wang, Z.; Yin, X.; Shi, X.; Li, K. An adversarial learning model for intrusion detection in real complex network environments. In Proceedings of the International Conference on Wireless Algorithms, Systems, and Applications, Qingdao, China, 13–15 September 2020; Springer: Cham, Switzerland; pp. 794–806.
- [45] Siffer, A.; Pierre-Alain, F.; Alexandre, T.; Christine, L. Netspot: A simple Intrusion Detection System with statistical learning. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; pp. 911–918.
- [46] Al-Haija, Q.A. On the Security of Cyber-Physical Systems Against Stochastic Cyber-Attacks Models. In Proceedings of the 2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), Toronto, ON, Canada, 21–24 April 2021; pp. 1–6.
- [47] Al-Haija, Q.A.; Abdulaziz, A.A. High Performance Classification Model to Identify Ransomware Payments for Heterogeneous Bitcoin Networks. Electronics 2021, 10, 2113.
- [48] Shah, Y.; Sengupta, S. A survey on Classification of Cyber-attacks on IoT and IIoT devices. In Proceedings of the 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 28–31 October 2020.
- [49] Al-Haija, Q.A.; Smadi, M.A.; Zein-Sabatto, S. Multi-Class Weather Classification Using ResNet-18 CNN for Autonomous IoT and CPS Applications. In Proceedings of the 2020 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 16–18 December 2020; pp. 1586–1591.
- [50] Gupta, P. Cross-Validation in Machine Learning. Medium Towards Data Science. 2017. Available online: https://towardsdatascience.com/cross-validation-in-machine-learning-72924a69872f (accessed on 13 February 2020).
- [51] Al-Haija, Q.A.; al Tarayrah, M.I.; Enshasy, H.M. Time-Series Model for Forecasting Short-term Future Additions of Renewable Energy to Worldwide Capacity. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020; pp. 1–6.
- [52] Al-Haija, Q.A.; Nasr, K.A. Supervised Regression Study for Electron Microscopy Data. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, CA, USA, 18–21 November 2019; pp. 2661–2668.
- [53] Abu, A.; Qasem, A.A.S.; Mohammed, F.A. Meticulously Intelligent Identification System for Smart Grid Network Stability to Optimize Risk Management. Energies 2021, 14, 6935.
- [54] Nagpal, A. Decision Tree Ensembles-Bagging and Boosting. Medium: Towards Data Science. 2017. Available online: https://towardsdatascience.com/decision-tree-ensembles-bagging-and-boosting-266a8ba60fd9 (accessed on 6 October 2021).
- [55] Ye, T.; Feng, Y. RaSE: Random Subspace Ensemble Classification. J. Mach. Learn. Res. 2021, 22, 1–93.
- [56] Seiffert, C.; Khoshgoftaar, T.M.; van Hulse, J.; Napolitano, A. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2010, 40, 185–197.
- [57] Al-Haija, Q.A.; Ishtaiwi, A. Multiclass Classification of Firewall Log Files Using Shallow Neural Network for Network Security Applications. In Soft Computing for Security Applications; Springer: Singapore, 2022; Volume 1397.
- [58] Al-Haija, Q.A.; Jebril, N.A. Systemic framework of time-series prediction via feed-forward neural networks. In Proceedings of the 3rd Smart Cities Symposium (SCS 2020), Online, 21–23 September 2021; pp. 583–588.
- [59] Swaminathan, S. Logistic Regression—Detailed Overview, Medium: Towards Data Science. 2018. Available online: https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc (accessed on 20 November 2021).
- [60] Al-Haija, Q.A.; McCurry, C.D.; Zein-Sabatto, S. Intelligent Self-reliant Cyber-Attacks Detection and Classification System for IoT Communication Using Deep Convolutional Neural Network. In Selected Papers from the 12th International Networking Conference. INC 2020. Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2021; Volume 180.

Method | Assumptions | Explainability | Execution Time (Train) | Execution Time (Test) | Data Engineering |
---|---|---|---|---|---|
EBT, ERT | No assumptions about predictors or response variable | Intuitively explainable as rule-based knowledge system | Slow | Fast | Minimal effort |
ESK | No assumptions | Intuitively explainable via similarity measures | Slow | Very slow | Minimal effort |
SNN, BNN | No assumptions | Black box | Depends on network architecture | Fast | Medium effort |
LRK | Linearity between predictors and response variable | Relatively explainable | Fast | Slow | Essential |
Ref. | Methods | Datasets | Attacks |
---|---|---|---|
[29] | ANN, SVM, NB, RF, SOM | NSL-KDD, KDD Cup 1999, CIC DOS, ADFA-LD12, UNSW-NB15, WSN-DS | DDoS, flooding, U2R, Jamming |
[33] | Auto-Encoder, RF, NB, LDA, QDA | CICIDS2017 | DDoS, Heartbleed, SQL Injection, Botnet |
[34] | ANN, SVM | NSL-KDD | DDoS, R2L, U2R |
[35] | Ensemble Learning (Extra Trees) | UNSW-NB15, BoT-IoT, ToN-IoT, CSE-CIC-IDS2018 | DDoS, Botnet, Infiltration |
[36] | Statistical Analysis | Kitsune, ISCX, IoT network intrusion | Botnet, DDoS, MitM |
[37] | XGBoost, PCA | ToN-IoT, BoT-IoT | DDoS, Botnet, ransomware |
[38] | Ensemble-based voting classifier | ToN-IoT | DDoS, Botnet, ransomware |
[39] | Shallow CNN | NSL-KDD | Normal, DoS, Probe, R2L, U2R |
[40] | XGBoost | NSL-KDD | SynFlood, UDP Flood, Smurf, and others |
[41] | LR, LDA, RT, RF, and NB | ToN-IoT | DDoS, Password, Backdoor, Ransomware |
[42] | RF, KNN, NB | Simulated dataset | Satori, Reaper, Amnesia, Masuta, Mirai, others |
[43] | Fuzzy C-means clustering and fuzzy interpolation | Kitsune | Botnet, MitM, DoS |
[44] | Generative adversarial networks (GAN) | Kitsune, CICIDS | Artificially generated attacks |
[45] | Extreme Value Analysis | Kitsune | Botnet, MitM, DoS |

Item | Description |
---|---|
Operating System | Windows 11, Edition 21H2, 64-bit operating system, x64-based processor |
Processing Component | 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30 GHz |
Computing Component | NVIDIA GeForce RTX 3050 Ti Laptop GPU @ 4 GB |
Memory Component | 16.0 GB, DDR4 1.2 V @ Memory Speed: 2933 MHz (PC4-23400) |
Storage Component | 500 GB Kingston NV1 M.2 (2280) PCIe NVMe Gen 3.0 (×4) SSD |
Development Platform | MATLAB 2021b + Parallel Computing + Machine Learning Packages |

Samples Distribution for Distilled-Kitsune-2018

Attack | No. Training Packets | No. Normal Test Packets | No. Malicious Test Packets |
---|---|---|---|
OS Scan | 6000 | 13,500 | 1499 |
Fuzzing | 1200 | 9000 | 999 |
Video Inj. | 4000 | 9000 | 999 |
ARP | 6000 | 13,500 | 1499 |
Wiretap | 4000 | 9000 | 999 |
SSDP F. | 6000 | 13,500 | 1499 |
SYN DoS | 1200 | 9000 | 999 |
SSL R. | 6000 | 13,500 | 1499 |
Mirai | 6000 | 9000 | 999 |

Samples Distribution for NSL-KDD dataset

Subset | Normal | DoS | Probe | R2L |
---|---|---|---|---|
Training | 67,343 | 45,927 | 11,656 | 995 |
Testing | 9711 | 7458 | 2754 | 2421 |
Total | 77,054 | 53,385 | 14,410 | 3416 |

ML Model | Model Parameters |
---|---|
Ensemble Boosted Trees (EBT) | Ensemble method: AdaBoost, Learner type: Decision tree, Maximum number of splits: 20, Number of learners: 30, Learning rate: 0.1, 5-Fold Cross Validation |
Ensemble Subspace kNN (ESK) | Ensemble method: Subspace, Learner type: Nearest Neighbors, Number of learners: 30, Subspace dimension: 58, 5-Fold Cross Validation |
Ensemble RUSBoosted Trees (ERT) | Ensemble method: RUSBoost, Learner type: Decision tree, Maximum number of splits: 20, Number of learners: 30, Learning rate: 0.1, 5-Fold Cross Validation |
Shallow Neural Network (SNN) | Number of fully connected layers: 1 (one hidden layer), Layer size: 30, Activation: Sigmoid, Iteration limit: 1000, Standardize data: Yes, Regularization strength (Lambda): 0, 5-Fold Cross Validation |
Bilayered Neural Network (BNN) | Number of fully connected layers: 2, First layer size: 10, Second layer size: 10, Activation: ReLU, Iteration limit: 1000, Regularization strength (Lambda): 0, Standardize data: Yes, 5-Fold Cross Validation |
Logistic Regression Kernel (LRK) | Learner: Logistic Regression, Number of expansion dimensions: Auto, Regularization strength (Lambda): Auto, Kernel scale: Auto, Multiclass method: One-vs-One, Iteration limit: 1000, 5-Fold Cross Validation |
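
To make the boosting settings listed for EBT more concrete, the following is a minimal sketch of discrete AdaBoost with 30 learners and a learning rate of 0.1. It is an illustration under stated assumptions, not the MATLAB model used in the study: it handles only two classes with labels -1 and +1, and it uses single-split decision stumps instead of trees with up to 20 splits. The function names are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Predict `polarity` when the feature is <= threshold, else its negation."""
    return polarity if row[feature] <= threshold else -polarity


def fit_stump(X, y, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, n_learners=30, learning_rate=0.1):
    """Train an ensemble of weighted stumps; y must contain -1/+1 labels."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_learners):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = learning_rate * 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in ensemble)
    return 1 if score >= 0 else -1
```

A multiclass setting such as the ten Kitsune traffic classes would need a multiclass boosting variant, which this sketch does not cover.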

ML Model | CA% (Kitsune) | CA% (NSL-KDD) | MCR% (Kitsune) | MCR% (NSL-KDD) | TC (#) (Kitsune) | TC (#) (NSL-KDD) | CS (Obs/Sec) (Kitsune) | CS (Obs/Sec) (NSL-KDD) |
---|---|---|---|---|---|---|---|---|
EBT | 99.8 | 99.1 | 0.2 | 0.9 | 249 | 1332 | 90,000 | 84,000 |
ESK | 99.4 | 98.4 | 0.6 | 1.6 | 780 | 2346 | 41 | 84 |
ERT | 98.1 | 97.2 | 1.9 | 2.8 | 2495 | 4211 | 90,000 | 90,000 |
BNN | 97.5 | 96.1 | 2.5 | 3.9 | 3250 | 5702 | 290,000 | 420,000 |
SNN | 96.7 | 94.6 | 3.3 | 5.4 | 4290 | 8014 | 240,000 | 390,000 |
LRK | 94.5 | 93.7 | 5.5 | 6.3 | 7215 | 9198 | 440 | 1900 |
CA% | PR% | RC% | MCR% | FDR% | FNR% | TC (#) | CS (Obs/Sec) |
---|---|---|---|---|---|---|---|
99.8 | 99.7 | 98.1 | 0.2 | 0.9 | 1.7 | 249 | 90,000 |
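
As a rough illustration of how the evaluation metrics listed above relate to one another, the sketch below computes them from the four confusion-matrix counts of a binary (normal vs. attack) classifier. The formulas are the standard textbook definitions and are assumed here rather than taken from the paper's Figure 3; the function names and the example counts are hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return the evaluation metrics (in percent, except TC) for one class."""
    total = tp + fp + tn + fn
    ca = 100.0 * (tp + tn) / total   # validation accuracy
    pr = 100.0 * tp / (tp + fp)      # precision
    rc = 100.0 * tp / (tp + fn)      # recall
    return {
        "CA": ca,
        "PR": pr,
        "RC": rc,
        "MCR": 100.0 - ca,           # misclassification rate
        "FDR": 100.0 - pr,           # false discovery rate
        "FNR": 100.0 - rc,           # false negative rate
        "TC": fp + fn,               # total cost: number of misclassified samples
    }


def classification_speed(n_observations, elapsed_seconds):
    """Classification speed (CS) in observations per second."""
    return n_observations / elapsed_seconds


# Hypothetical counts, for illustration only (not results from the paper).
example = binary_metrics(tp=90, fp=10, tn=880, fn=20)
```

For a multiclass classifier, the same quantities are typically computed per class and then averaged; which averaging scheme the reported figures use is not stated in this section.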
Research | Year | ML Model | No. Classes | ACC% | PPV% | TPR% |
---|---|---|---|---|---|---|
Sarhan et al. [35] | 2021 | XRT Classifier | 2–10 | 98.05 | 84.61 | - |
Ashraf et al. [36] | 2021 | STL Classifier | 3 | 99.20 | - | - |
Kumar et al. [37] | 2021 | XGB Classifier | 10 | 97.81 | 87.55 | 85.43 |
Khan et al. [38] | 2021 | HYB Classifier | 7 | 76.00 | 75.00 | 75.00 |
Al-Haija et al. [39] | 2020 | S-CNN Classifier | 5 | 98.20 | 98.27 | 98.20 |
Liu et al. [40] | 2020 | XGB Classifier | 5 | 97.0 | - | - |
Alsaedi et al. [41] | 2020 | CART Classifier | 9 | 77.00 | 77.00 | 77.00 |
Al-Haija et al. [60] | 2020 | S-CNN Classifier | 2 | 99.30 | 99.33 | 99.18 |
Kumar et al. [42] | 2019 | kNN Classifier | 3 | 94.44 | 92.00 | 100.0 |
Proposed model | 2021 | EBT Classifier | 10 | 99.80 | 99.69 | 98.10 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abu Al-Haija, Q.; Al-Badawi, A. Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning. Sensors 2022, 22, 241. https://doi.org/10.3390/s22010241