Machine-Learning-Based Darknet Traffic Detection System for IoT Applications
1. Introduction
1.1. Summary of Our Contributions
- We developed a multi-purpose and high-performance anomaly-based IoT DIDS utilizing several supervised machine-learning approaches.
- We compare and measure the performance of six supervised learning methods (BAG-DT, ADA-DT, RUS-DT, O-DT, O-KNN, and O-DSC) for IoT DIDSs using the CIC-Darknet2020 dataset.
- We present a comprehensive experimental evaluation of the six ML techniques using ten standard evaluation metrics.
- We contrast our findings with state-of-the-art approaches and show that our BAG-DT-based DIDS outperforms existing studies in the same area by 1.9–20%.
1.2. Paper Organization
2. Related Work
2.1. First Works for Traffic Classification
2.2. VPN Traffic Classification
2.3. Tor Traffic Classification
2.4. Use of Neural Networks in Recent Works
2.5. Summary of Surveyed Research Works
3. System Modeling and Environment
3.1. The Darknet Traffic Dataset
3.2. Feature Engineering Unit
- Data Presentation: The CIC-Darknet2020 dataset is distributed in CSV format. To process it on the MATLAB platform, it is first imported from the CSV file and represented as a MATLAB table of data records with named columns and numbered rows.
- Exploratory Data Analysis (EDA): EDA performs the data curation tasks needed to gain deeper insight into the dataset. This preliminary enhancement step checks for missing values, substitutes suitable replacements (such as zeroes) for missing or null entries, and visualizes the histogram of the dataset classes to show how the classes and features are distributed.
- Feature Selection: Datasets comprise several features with diverse data types. However, not all features are suitable for machine-learning models: some cannot be learned directly (such as string features), and others may degrade classifier performance. The coefficient score approach is employed to extract the most influential features of the CIC-Darknet2020 dataset, which are later used to train and validate the learning models.
- Data Normalization: Normalization is usually applied when data points are scattered over a wide range. It re-scales the data points to a common range (usually 0–1) so that larger values do not dominate the other data points in the dataset. We therefore apply min–max normalization at the preprocessing stage so that all numerical data lie between 0 and 1. The min–max normalization of a data point $x$ within a set of points $X$ is given by the following formula:
  $$x' = \frac{x - \min(X)}{\max(X) - \min(X)}$$
- Label Encoding: Label encoding converts categorical data into numerical data that machine-learning methods can process. This research employed integer encoding to represent the categorical data numerically. For instance, the output class labels were encoded as {non-Tor: 00, non-VPN: 01, Tor: 02, VPN: 03}.
- Data Shuffling: Shuffling is a preprocessing operation that randomly rearranges the dataset samples (rows) to produce a new ordering that can be safely used for ML training and testing without biasing the classifier toward any of the underlying classes. Shuffling changes only the order of the records, so the dataset statistics remain exactly the same. Figure 5 illustrates the data shuffling process.
- Folding and Splitting: To validate the proposed predictive models rigorously, we conducted k-fold cross-validation with five folds (distributions), with the data split into 70% for training and 30% for validation (testing). Each fold is a new validation experiment with a different data distribution, which ensures that all data items participate in both training and validation. Figure 6 shows our folding and splitting process and depicts the dataset distribution across the folds for each experiment.
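The preprocessing steps listed above can be sketched in a few lines. The following is a minimal Python illustration, not the authors' MATLAB implementation: the function names, the fixed random seed, and the toy inputs are all assumptions made for the example. The 70%/30% split policy is not modeled here; the sketch shows only a plain five-fold partition, in which each fold holds out one fifth of the samples.

```python
import random

def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid division by zero (handling is an assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Integer codes for the four output classes, as listed in the text.
LABEL_CODES = {"non-Tor": 0, "non-VPN": 1, "Tor": 2, "VPN": 3}

def encode_labels(labels):
    """Map categorical class labels to their integer codes."""
    return [LABEL_CODES[label] for label in labels]

def shuffle_rows(rows, seed=42):
    """Return a randomly reordered copy of the rows (the seed is hypothetical)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size

if __name__ == "__main__":
    print(min_max_normalize([10, 20, 30]))
    print(encode_labels(["Tor", "VPN", "non-Tor"]))
    for train, validation in k_fold_indices(10, k=5):
        print(len(train), len(validation))
```

Across the five folds every sample index appears in exactly one validation set, which is the property the folding step relies on.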
3.3. Learning Models Unit
3.4. Traffic Classification Unit
4. Results and Discussion
5. Conclusions
Author Contributions
Conflicts of Interest
- Al-Garadi, M.A.; Mohamed, A.; Al-Ali, A.K.; Du, X.; Ali, I.; Guizani, M. A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security. IEEE Commun. Surv. Tutor. 2018, 22, 1646–1685.
- Dastjerdi, A.V.; Buyya, R. Fog Computing: Helping the Internet of Things Realize Its Potential. Computer 2016, 49, 112–116.
- Ray, S.; Jin, Y.; Raychowdhury, A. The Changing Computing Paradigm with Internet of Things: A Tutorial Introduction. IEEE Des. Test 2016, 33, 76–96.
- Lombardi, M.; Pascale, F.; Santaniello, D. Internet of Things: A General Overview between Architectures, Protocols and Applications. Information 2021, 12, 87.
- Čolaković, A.; Hadžialić, M. Internet of Things (IoT): A review of enabling technologies, challenges, and open research issues. Comput. Netw. 2018, 144, 17–39.
- Abu Al-Haija, Q.; Alsulami, A.A. High Performance Classification Model to Identify Ransomware Payments for Heterogeneous Bitcoin Networks. Electronics 2021, 10, 2113.
- Soro, F.; Drago, I.; Trevisan, M.; Mellia, M.; Ceron, J.; Santanna, J.J. Are Darknets All the Same? On Darknet Visibility for Security Monitoring. In Proceedings of the 2019 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), Paris, France, 1–3 July 2019; pp. 1–6.
- Bertino, E.; Islam, N. Botnets and Internet of Things Security. Computer 2017, 50, 76–79.
- Kolias, C.; Kambourakis, G.; Stavrou, A.; Voas, J. DDoS in the IoT: Mirai and Other Botnets. Computer 2017, 50, 80–84.
- Bergman, M. The Deep Web: Surfacing Hidden Value. Taking License, volume 7, issue 1. August 2001. Available online:;rgn=main (accessed on 13 November 2021).
- Tor Project. Available online: (accessed on 21 November 2021).
- Caspi, G. Introducing Deep Learning: Boosting Cybersecurity with an Artificial Brain. Informa Tech, Dark Reading, Analytics. 2016. Available online: (accessed on 21 November 2021).
- Bendiab, G.; Shiaeles, S.; Alruban, A.; Kolokotronis, N. IoT Malware Network Traffic Classification using Visual Representation and Deep Learning. In Proceedings of the 6th IEEE Conference on Network Softwarization (NetSoft), Ghent, Belgium, 29 June–3 July 2020; pp. 444–449.
- Abu Al-Haija, Q.; Zein-Sabatto, S. An Efficient Deep-Learning-Based Detection and Classification System for Cyber-Attacks in IoT Communication Networks. Electronics 2020, 9, 2152.
- Sapre, S.; Ahmadi, P.; Islam, K. A Robust Comparison of the KDDCup99 and NSL-KDD IoT Network Intrusion Detection Datasets through Various Machine Learning Algorithms. arXiv 2019, arXiv:1912.13204v1.
- Imamverdiyev, Y.; Sukhostat, L. Anomaly detection in network traffic using extreme learning machine. In Proceedings of the 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), Baku, Azerbaijan, 12–14 October 2016; pp. 1–4.
- Albulayhi, K.; Smadi, A.A.; Sheldon, F.T.; Abercrombie, R.K. IoT Intrusion Detection Taxonomy, Reference Architecture, and Analyses. Sensors 2021, 21, 6432.
- Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 1–21.
- Benkhelifa, E.; Welsh, T.; Hamouda, W. A Critical Review of Practices and Challenges in Intrusion Detection Systems for IoT: Toward Universal and Resilient Systems. IEEE Commun. Surv. Tutor. 2018, 20, 3496–3509.
- Abu Al-Haija, Q.; Al Badawi, A.; Bojja, G.R. Boost-Defence for resilient IoT networks: A head-to-toe approach. Expert Syst. 2022, e12934.
- Abu Al-Haija, Q. Top-Down Machine Learning-Based Architecture for Cyberattacks Identification and Classification in IoT Communication Networks. Front. Big Data 2022, 4, 782902.
- Hassija, V.; Chamola, V.; Saxena, V.; Jain, D.; Goyal, P.; Sikdar, B. A Survey on IoT Security: Application Areas, Security Threats, and Solution Architectures. IEEE Access 2019, 7, 82721–82743.
- Abu Al-Haija, Q.; Al-Badawi, A. Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning. Sensors 2021, 22, 241.
- Khraisat, A.; Alazab, A. A critical review of intrusion detection systems in the internet of things: Techniques, deployment strategy, validation strategy, attacks, public datasets, and challenges. Cybersecurity 2021, 4, 1–27.
- Lashkari, A.H.; Kaur, G.; Rahali, A. DIDarknet: A Contemporary Approach to Detect and Characterize the Darknet Traffic using Deep Image Learning. In Proceedings of the 10th International Conference on Communication and Network Security, Tokyo, Japan, 27–29 November 2020.
- Fachkha, C.; Debbabi, M. Darknet as a Source of Cyber Intelligence: Survey, Taxonomy, and Characterization. IEEE Commun. Surv. Tutor. 2016, 18, 1197–1227.
- Early, J.; Brodley, C.; Rosenberg, C. Behavioral authentication of server flows. In Proceedings of the 19th Annual Computer Security Applications Conference, Las Vegas, NV, USA, 8–12 December 2003.
- Turkett, W.H., Jr.; Karode, A.V.; Fulp, E.W. In-the-Dark Network Traffic Classification Using Support Vector Machines. AAAI 2008, 3, 1745–1750.
- Moore, A.W.; Zuev, D. Internet traffic classification using bayesian analysis techniques. In Proceedings of the 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Banff, AB, Canada, 6–10 June 2005; pp. 50–60.
- Wright, C.V.; Monrose, F.; Masson, G.M. On inferring application protocol behaviors in encrypted network traffic. J. Mach. Learn. Res. 2006, 7, 2745–2769.
- Erman, J.; Arlitt, M.; Mahanti, A. Traffic classification using clustering algorithms. In Proceedings of the 2006 SIGCOMM Workshop on Mining Network Data, Pisa, Italy, 11–12 September 2006; pp. 281–286.
- Easttom, W. Virtual Private Networks, Authentication, and Wireless Security. In Modern Cryptography; Springer: Cham, Switzerland, 2020; pp. 299–317.
- Aswad, S.A.; Sonuc, E. Classification of VPN Network Traffic Flow Using Time Related Features on Apache Spark. In Proceedings of the 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Istanbul, Turkey, 20–24 October 2020; pp. 1–8.
- Gupta, A.; Thakur, H.K.; Shrivastava, R.; Kumar, P.; Nag, S. A Big Data Analysis Framework Using Apache Spark and Deep Learning. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 9–16.
- Bagui, S.; Fang, X.; Kalaimannan, E.; Bagui, S.C.; Sheehan, J. Comparison of machine-learning algorithms for classification of VPN network traffic flow using time-related features. J. Cyber Secur. Technol. 2017, 1, 108–126.
- Draper-Gil, G.; Lashkari, A.H.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of encrypted and VPN traffic using time-related features. In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP), Rome, Italy, 19–21 February 2016; pp. 407–414.
- Miller, S.; Curran, K.; Lunney, T. Multilayer Perceptron Neural Network for Detection of Encrypted VPN Network Traffic. In Proceedings of the 2018 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (Cyber SA), Glasgow, UK, 11–12 June 2018.
- Varghese, J.E.; Muniyal, B. A Pilot Study in Software-Defined Networking Using Wireshark for Analyzing Network Parameters to Detect DDoS Attacks. In Information and Communication Technology for Competitive Strategies (ICTCS 2020); Springer: Singapore, 2021; pp. 475–487.
- Basyoni, L.; Fetais, N.; Erbad, A.; Mohamed, A.; Guizani, M. Traffic Analysis Attacks on Tor: A Survey. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), Doha, Qatar, 2–5 February 2020; pp. 183–188.
- AlSabah, M.; Bauer, K.; Goldberg, I. Enhancing Tor’s performance using real-time traffic classification. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, Raleigh, NC, USA, 16–18 October 2012; pp. 73–84.
- Wang, L.; Mei, H.; Sheng, V.S. Multilevel Identification and Classification Analysis of Tor on Mobile and PC Platforms. IEEE Trans. Ind. Inform. 2021, 17, 1079–1088.
- Zavrak, S.; Iskefiyeli, M. Anomaly-Based Intrusion Detection from Network Flow Features Using Variational Autoencoder. IEEE Access 2020, 8, 108346–108358.
- Lingyu, J.; Yang, L.; Bailing, W.; Hongri, L.; Guodong, X. A hierarchical classification approach for tor anonymous traffic. In Proceedings of the 2017 IEEE 9th International Conference on Communication Software and Networks (ICCSN), Guangzhou, China, 6–8 May 2017; pp. 239–243.
- Zhao, S.; Zhang, Y.; Chang, P. Network Traffic Classification Using Tri-training Based on Statistical Flow Characteristics. In Proceedings of the 2017 IEEE Trustcom/BigDataSE/ICESS, Sydney, NSW, Australia, 1–4 August 2017; pp. 323–330.
- Saleh, S.; Qadir, J.; Ilyas, M.U. Shedding Light on the Dark Corners of the Internet: A Survey of Tor Research. J. Netw. Comput. Appl. 2018, 114, 1–28.
- Sarwar, M.B.; Hanif, M.K.; Talib, R.; Younas, M. DarkDetect: Darknet Traffic Detection and Categorization Using Modified Convolution-Long Short-Term Memory. IEEE Access 2021, 9, 113705–113713.
- Iliadis, L.A.; Kaifas, T. Darknet Traffic Classification using Machine Learning Techniques. In Proceedings of the 2021 10th International Conference on Modern Circuits and Systems Technologies (MOCAST), Thessaloniki, Greece, 5–7 July 2021; pp. 1–4.
- Ul Alam, M.Z.; Azizul Hakim, A.; Toufikuzzaman, M. Application and Interpretation of Ensemble Methods for Darknet Traffic Classification. Preprint. In Proceedings of the 42nd IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 24–27 May 2021; IEEE: Piscataway, NJ, USA, 2021.
- Demertzis, K.; Tsiknas, K.; Takezis, D.; Skianis, C.; Iliadis, L. Darknet Traffic Big-Data Analysis and Network Management for Real-Time Automating of the Malicious Intent Detection Process by a Weight Agnostic Neural Networks Framework. Electronics 2021, 10, 781.
- Li, Y.; Lu, Y. ETCC: Encrypted Two-Label Classification Using CNN. Secur. Commun. Netw. 2021, 2021, 6633250.
- Albulayhi, K.; Sheldon, F.T. An Adaptive Deep-Ensemble Anomaly-Based Intrusion Detection System for the Internet of Things. In Proceedings of the 2021 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 10–13 May 2021.
- Lashkari, A.H.; Gil, G.D.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of Tor Traffic using Time based Features. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy; SCITEPRESS, Porto, Portugal, 19–21 February 2017.
- Abu Al-Haija, Q.; Al Tarayrah, M.I.; Enshasy, H.M. Time-Series Model for Forecasting Short-term Future Additions of Renewable Energy to Worldwide Capacity. In Proceedings of the 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI), Sakheer, Bahrain, 26–27 October 2020.
- Abu Al-Haija, Q.; Al Nasr, K. Supervised Regression Study for Electron Microscopy Data. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, CA, USA, 18–21 November 2019; pp. 2661–2668.
- Abu Al-Haija, Q.; Smadi, A.A.; Allehyani, M.F. Meticulously Intelligent Identification System for Smart Grid Network Stability to Optimize Risk Management. Energies 2021, 14, 6935.
- Liu, J.; Fukuda, K. An Evaluation of Darknet Traffic Taxonomy. J. Inf. Process. 2018, 26, 148–157.
- Hu, Y.; Zou, F.; Li, L.; Yi, P. Traffic Classification of User Behaviors in Tor, I2P, ZeroNet, Freenet. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; pp. 418–424.
- Li, Y.; Lu, Y.; Li, S. EZAC: Encrypted Zero-day Applications Classification using CNN and K-Means. In Proceedings of the 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Dalian, China, 5–7 May 2021; pp. 378–383.
- Han, C.; Shimamura, J.; Takahashi, T.; Inoue, D.; Takeuchi, J.I.; Nakao, K. Real-Time Detection of Global Cyberthreat Based on Darknet by Estimating Anomalous Synchronization Using Graphical Lasso. IEICE Trans. Inf. Syst. 2020, E103-D, 2113–2124.
- Zenebe, A.; Shumba, M.; Carillo, A.; Cuenca, S. Cyber Threat Discovery from Dark Web. EPiC Ser. Comput. 2019, 64, 174–183.
- Al Nabki, M.W.; Fidalgo, E.; Alegre, E.; De Paz, I. Classifying Illegal Activities on Tor Network Based on Web Textual Contents. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, 3–7 April 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 35–43.
Learning Type | Model Building | Examples |
Supervised | Algorithms or models learn from labeled data (Task-Driven Approach) | Classification, regression |
Unsupervised | Algorithms or models learn from unlabeled data (Data-Driven Approach) | Clustering, associations, dimensionality reduction |
Semi-supervised | Models built using combined data (Labeled + Unlabeled) | Classification, clustering |
Reinforcement | Models based on reward or penalty (Environment-Driven Approach) | Classification, control |
Detection Methods | Disadvantages |
Statistics-based: examines network traffic and processes the data using complex statistical techniques. |
Pattern-based: identifies the characters, forms, and patterns in the data. |
Rule-based: uses an attack “signature” to identify unusual network activity. |
State-based: examines a sequence of events in order to ascertain the possibility of an attack. |
Heuristic-based: identifies any abnormal activity that is not consistent with the norm. |
Ref | Year | Technique | Contribution |
[27] | 2003 | Decision Trees | Behavioral authentication of server flows and classification of server traffic |
[28] | 2008 | Support Vector Machines | Efficient in-the-dark traffic classification of typical application protocols for TCP sessions |
[29] | 2005 | Bayesian Analysis Techniques | Increasing the accuracy of the Bayes Classifier through a set of simple modifications |
[30] | 2006 | Profile Hidden Markov Models | Creating statistical models for the sequence of packets created by every protocol of interest and using these models to determine the protocol being used |
[31] | 2006 | Clustering Algorithms | Review of 12 clustering methodologies, current issues, and recommendations for traffic flow clustering research |
[33] | 2020 | Artificial Neural Networks and Time-Related Features | Classifying VPN network data flow using ANNs and Time-Related Features |
[35] | 2017 | Six Machine Learning Techniques | Distinguishing VPN from non-VPN traffic and proving that Gradient Boosting Tree and Random Forest are the best machine-learning techniques to use |
[36] | 2016 | K-Nearest Neighbor and C4.5 Decision Tree | Creating multi-class classifiers that accurately classify VPN traffic into seven different categories |
[37] | 2018 | Multi-Layer Perceptron Neural Network and Wireshark | Building a representative dataset of VPN and non-VPN values and classifying VPN network data |
[40] | 2012 | Classification Techniques Mixed With QoS | Real-time classification of Tor’s encrypted circuits by application and assignment of separate service classes to each |
[41] | 2020 | Network Flow Features | Multi-level Tor traffic classification and identification framework for both mobile and PC platforms |
[43] | 2017 | Decision Tree and TriTraining Algorithm | A hierarchical classification strategy for distinguishing Tor anonymous traffic from mixed traffic |
[45] | 2018 | Various Techniques | A thorough analysis of Tor traffic classification, quantification, and comparison of various strategies |
[46] | 2021 | Convolution-LSTM and Extreme Gradient Boosting | A generalized strategy for detecting and categorizing Darknet traffic using Deep Learning |
[47] | 2021 | ML and Receiver Operating Characteristics | A feature significance analysis for the best classifier of binary and multi-class data |
[48] | 2021 | ML and Game-Theoretic Method | Differentiating Darknet traffic from benign traffic using ensemble machine-learning algorithms |
[49] | 2021 | Weight-Agnostic Neural Network | Framework for Darknet traffic management for automating the suspicious intent recognition in real time |
[50] | 2021 | Convolutional Neural Network | Two-stage, two-label classification system that can recognize both protocols and applications |
ML Model | Specifications |
BAG-DT | Preset: Bagged Trees, Ensemble method: Bag, Learner type: Decision tree, Maximum number of splits: 89161, Number of learners: 30, Data Distribution Policy: 70% training and 30% testing, 5-Fold Cross-Validation. |
ADA-DT | Preset: Boosted Trees, Ensemble method: AdaBoost, Learner type: Decision tree, Maximum number of splits: 20, Number of learners: 30, Learning rate: 0.1, Data Distribution Policy: 70% training and 30% testing, 5-Fold Cross-Validation. |
RUS-DT | Preset: RUSBoosted Trees, Ensemble method: RUSBoost, Learner type: Decision tree, Maximum number of splits: 20, Number of learners: 30, Learning rate: 0.1, Data Distribution Policy: 70% training and 30% testing, 5-Fold Cross-Validation. |
O-DT | Preset: Fine Tree, Maximum number of splits: 100, Split criterion: Gini’s diversity index, Surrogate decision splits: On, using a maximum of 10 surrogates, Data Distribution Policy: 70% training and 30% testing, 5-Fold Cross-Validation. |
O-KNN | Preset: Optimizable KNN, Number of neighbors: 2, Distance metric: Euclidean, Distance weight: Squared inverse, Standardize data: false, Optimizer: Bayesian optimization, Acquisition function: Expected improvement per second plus, Iterations: 30. |
O-DSC | Preset: Optimizable Discriminant, Discriminant type: Linear, Quadratic, Diagonal Linear, Diagonal Quadratic, Optimizer: Bayesian optimization, Acquisition function: Expected improvement per second plus, Iterations: 30, 5-Fold Cross-Validation. |
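To make the BAG-DT row above more concrete, the following minimal Python sketch illustrates the general idea of bagging: each learner is trained on a bootstrap resample of the training data, and the ensemble predicts by majority vote. It is an illustration under stated assumptions, not the authors' MATLAB configuration. The learner is a one-split decision tree (a stump) on a single feature rather than a deep tree, and the function names, seed, and toy data are hypothetical. Only the number of learners (30) is taken from the table.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-split decision tree (stump) on a single numeric feature."""
    best = None
    for threshold in sorted(set(X)):
        left = [label for x, label in zip(X, y) if x <= threshold]
        right = [label for x, label in zip(X, y) if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]  # (threshold, left_label, right_label)

def predict_stump(stump, x):
    """Predict the class of one sample with a fitted stump."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label

def fit_bagged(X, y, n_learners=30, seed=0):
    """Train n_learners stumps, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        learners.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return learners

def predict_bagged(learners, x):
    """Combine the learners' predictions by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in learners)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy, already-normalized feature values with binary labels (made up).
    X = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    ensemble = fit_bagged(X, y)
    print(predict_bagged(ensemble, 0.05), predict_bagged(ensemble, 0.95))
```

Because each learner sees a different resample, individual trees can disagree; the vote averages out those differences, which is the usual motivation for bagging high-variance learners such as decision trees.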
Metric | BAG-DT | ADA-DT | RUS-DT | O-DT | O-KNN | O-DSC |
Accuracy % | 99.5 | 95.4 | 93.9 | 97.3 | 83.6 | 97.1 |
Error % | 0.5 | 4.6 | 6.1 | 2.7 | 16.7 | 3.9 |
99.50% | 99.45% | 96.93% | 98.18% | 0.5% | 0.55% | 3.07% | 100% | 668 | 110,000 |
Research | Year | Evaluation Model | Accuracy | I.F. % |
[49] | 2021 | Recurring Neural Network (RNN) | 94.51% | 5.28%↑ |
[56] | 2017 | Longitudinal Analysis of Network Traffic (LANT) | 94.00% | 5.85%↑ |
[57] | 2020 | Hierarchical Classification Method (HCM) | 96.60% | 3.00%↑ |
[48] | 2021 | AdaBoost Decision Trees (AB-DT) | 97.30% | 2.26%↑ |
[25] | 2020 | Convolutional Neural Network (CNN) | 86.00% | 15.70%↑ |
[33] | 2020 | Artificial Neural Network and Apache Spark (ANN-AS) | 94.66% | 5.11%↑ |
[58] | 2021 | Convolutional Neural Network (CNN) and K-Means (KM) | 97.40% | 2.16%↑ |
[59] | 2020 | Sparse Structure Learning with LASSO selection (SSL) | 97.10% | 2.47%↑ |
[50] | 2021 | Convolutional Neural Network (CNN) | 97.65% | 1.89%↑ |
[60] | 2019 | Random Forest Classifier (RFC) | 78.30% | 27.08%↑ |
[61] | 2017 | Logistic Regression Classifier (LRC) | 96.60% | 3.00%↑ |
Proposed | 2022 | Bagging Decision Tree Ensembles | 99.50% | - |
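The I.F. % column in the comparison above appears to match the relative improvement of the proposed model's accuracy over each baseline, that is, (proposed − baseline) / baseline × 100. This formula is an assumption inferred from the tabulated values rather than a definition stated in the text. The short Python sketch below (with a hypothetical function name) recomputes a few entries under that assumption.

```python
PROPOSED_ACCURACY = 99.50  # accuracy (%) of the proposed BAG-DT model, from the table

def improvement_factor(baseline_accuracy, proposed_accuracy=PROPOSED_ACCURACY):
    """Relative improvement (%) of the proposed accuracy over a baseline accuracy.

    Assumed formula: (proposed - baseline) / baseline * 100.
    """
    return (proposed_accuracy - baseline_accuracy) / baseline_accuracy * 100.0

if __name__ == "__main__":
    # Baseline accuracies (%) taken from rows of the comparison table.
    for ref, accuracy in [("[49]", 94.51), ("[25]", 86.00), ("[50]", 97.65), ("[60]", 78.30)]:
        print(ref, round(improvement_factor(accuracy), 2))
```

Under this assumption the recomputed values for these four rows agree with the table to two decimal places (5.28, 15.70, 1.89, and 27.08).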
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (
Abu Al-Haija, Q.; Krichen, M.; Abu Elhaija, W. Machine-Learning-Based Darknet Traffic Detection System for IoT Applications. Electronics 2022, 11, 556.