Machine Learning-Based Methodologies for Cyber-Attacks and Network Traffic Monitoring: A Review and Insights
Abstract
1. Introduction
2. Background
3. Methods
3.1. Shallow Learning Models
3.1.1. Decision Tree (DT)
3.1.2. Naïve Bayes (NB)
3.1.3. Logistic Regression (LR)
3.1.4. XGBoost (XGB)
3.1.5. Support Vector Machine (SVM)
3.2. Deep Learning Models
3.2.1. Multilayer Perceptron (MLP)
3.2.2. Convolutional Neural Network (CNN)
3.2.3. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
3.2.4. Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM)
3.3. Metrics
Accuracy
3.4. Deployment Challenges
4. Related Work
5. Datasets
- On the NSL-KDD dataset, feature selection tends to improve classification performance. The RF algorithm is very effective on this dataset, and DT also performs well.
- On the KDD-99 dataset, feature selection does not always improve classification performance. However, RF works very well on this dataset.
- On the UNSW-NB15 dataset, feature selection is effective in improving classification performance. CNNs and recurrent models are successful on this dataset; since it is a large dataset, deep learning appears to be a suitable choice.
5.1. KDD CUP 99 (KDD-99) Dataset
- Normal.
- Denial of Service Attack (DoS): DoS attacks occur when an attacker prevents authorized users from accessing a system or overloads certain computer or memory resources, making them unable to process valid requests.
- Probing Attack: These attacks involve scanning the network to identify valid IP addresses and collecting information about them. This information often gives attackers a list of vulnerabilities that can later be used to launch attacks on systems and services.
- Remote to Local Attack (R2L): A malicious user who can send packets to a machine over the network, but has no account on that machine, exploits a vulnerability to obtain local access as a user of that machine.
- User to Root Attacks (U2R): These are a type of exploit in which the attacker first gains access to a regular user account (perhaps by password sniffing) and then takes advantage of a vulnerability to gain root access to the system.
- Basic features: This category encompasses all attributes that can be extracted from a TCP/IP connection.
- Traffic features: This category includes features computed over a window and is divided into two groups: (a) “same host” features, which consider only the connections made within the last two seconds that share the destination host of the current connection, and compute statistics on protocol behavior, service, etc.; (b) “same service” features, which consider only the connections made within the last two seconds that use the same service as the current connection. These two types of “traffic” features are called time-based. However, some slow probing attacks scan hosts (or ports) at intervals much longer than 2 s, for example once per minute, and therefore produce no intrusion pattern within a 2 s time window. To solve this problem, the “same host” and “same service” features are recalculated over a window of the last 100 connections instead of a 2 s time window. These features are called connection-based traffic features.
- Content features: R2L and U2R attacks lack the regular sequential intrusion patterns that characterize typical DoS and probing attacks. DoS and probing attacks involve many connections to certain hosts in a very short time, whereas R2L and U2R attacks are embedded in the data portions of packets and often require only a single connection. Detecting them requires features that look for suspicious behavior in the data portion, such as the number of failed login attempts. These features are called content features.
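The difference between time-based and connection-based traffic features can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Conn` record, the function name, and the feature names (`same_host_2s`, `same_host_last100`, and so on) are hypothetical and are not the official KDD-99 column names; only the 2 s and 100-connection windows come from the description above.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    ts: float        # connection start time in seconds (assumed field)
    dst_host: str    # destination host
    service: str     # destination service, e.g. "http"

def traffic_features(conns, time_window=2.0, conn_window=100):
    """Compute window features for each connection.

    `conns` must be sorted by `ts`. For every connection, earlier
    connections are counted in two ways: those within the last
    `time_window` seconds (time-based) and the previous `conn_window`
    connections (connection-based).
    """
    feats = []
    for i, cur in enumerate(conns):
        recent = [c for c in conns[:i] if cur.ts - c.ts <= time_window]
        last_n = conns[max(0, i - conn_window):i]
        feats.append({
            "same_host_2s": sum(c.dst_host == cur.dst_host for c in recent),
            "same_srv_2s": sum(c.service == cur.service for c in recent),
            "same_host_last100": sum(c.dst_host == cur.dst_host for c in last_n),
            "same_srv_last100": sum(c.service == cur.service for c in last_n),
        })
    return feats

# A slow probe: one connection per minute to the same host.
slow_probe = [Conn(60.0 * k, "10.0.0.5", "http") for k in range(5)]
last = traffic_features(slow_probe)[-1]
print(last["same_host_2s"], last["same_host_last100"])
```

For the last connection of the slow probe, the 2 s window sees no earlier connection to the same host, while the connection-based window sees all four earlier ones, which is the motivation for the connection-based features.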
5.2. NSL-KDD Dataset
5.3. UNSW-NB15 Dataset
- Normal;
- Fuzzer: an attempt to crash or suspend a program or network by feeding it randomly generated data;
- Analysis: this includes several kinds of attack, such as port scans, spam, and HTML file penetration;
- Backdoor: a method for secretly bypassing a system security measure to access a computer or its contents;
- DoS: an intentional attempt to prevent people from accessing a server or network resource, often by momentarily stopping or disrupting the operations of a host that is connected to the internet;
- Exploit: the attacker takes advantage of a known vulnerability;
- Generic: a method that functions against all block ciphers (with a specific block and key size) without taking the block cipher’s structure into account;
- Reconnaissance: this includes every attack that simulates information gathering;
- Shellcode: a small piece of code used as the payload in the exploitation of a software vulnerability;
- Worm: malware that replicates itself in order to infect more systems. It frequently spreads over a computer network, taking advantage of security flaws in the target machine to gain access.
5.4. IoT-23 Dataset
- Attack: This label denotes the existence of an attack originating from the compromised device and directed against a different host. For instance, a command injection into the header of a GET request, a brute force attempt at a telnet login, etc.;
- Benign: this label indicates that no suspicious or malicious activity was detected in the connections;
- C-and-C: This label denotes that the compromised device was connected to a Command-and-Control server. This behavior was identified during the analysis of the network malware capture, based on the irregular connections to the suspicious server or the sporadic arrival and departure of certain IRC commands;
- DDoS: The volume of traffic flowing to the same IP address indicates that these flows are part of a DDoS attack;
- FileDownload: This label denotes that a file is being downloaded to the compromised device. It is identified by screening connections whose response bytes exceed 3 KB or 5 KB, often in combination with a suspicious destination port or a destination IP known to be a C-and-C server;
- HeartBeat: This label denotes that the C-and-C server tracks the infected host using packets transmitted over this connection;
- Mirai: This label reports that the connections resemble a Mirai botnet, created by exploiting IoT device vulnerabilities. It is assigned when flows exhibit patterns similar to the most common known Mirai attacks;
- Okiru: Connections with this label exhibit traits of an Okiru botnet, a Mirai-type botnet targeting IoT devices that use ARC (Argonaut RISC Core) processors. The labeling criteria were the same as those used for Mirai; the only distinction is that this botnet family is less widespread;
- PartOfAHorizontalPortScan: This label denotes the use of connections for a horizontal port scan in order to obtain data for further attacks. These labels are used for patterns in which connections have numerous distinct destination IP addresses, the same port, and a comparable amount of transferred bytes;
- Torii: This label denotes that the connections exhibit traits of a Torii botnet. The labeling criteria were the same as those used for Mirai, with the exception that this botnet family is less widespread.
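As an illustration of how a rule such as the PartOfAHorizontalPortScan description might be expressed, the sketch below flags a source that contacts many distinct destination IPs on the same port with similar byte counts. It is not the labeling procedure used by the IoT-23 authors: the function name, the flow tuple layout, and both thresholds (`min_hosts`, `max_rel_spread`) are assumptions chosen for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

def horizontal_scan_candidates(flows, min_hosts=20, max_rel_spread=0.25):
    """Return the (src_ip, dst_port) pairs that look like horizontal scans.

    `flows` is an iterable of (src_ip, dst_ip, dst_port, n_bytes) tuples.
    A pair is flagged when it reaches at least `min_hosts` distinct
    destination IPs and the transferred bytes are comparable, i.e. the
    standard deviation is at most `max_rel_spread` times the mean.
    Both thresholds are hypothetical.
    """
    groups = defaultdict(list)
    for src, dst, port, n_bytes in flows:
        groups[(src, port)].append((dst, n_bytes))

    flagged = set()
    for key, items in groups.items():
        hosts = {dst for dst, _ in items}
        sizes = [n_bytes for _, n_bytes in items]
        avg = mean(sizes)
        spread = pstdev(sizes) / avg if avg > 0 else 0.0
        if len(hosts) >= min_hosts and spread <= max_rel_spread:
            flagged.add(key)
    return flagged

# Thirty small telnet probes to different hosts, plus ordinary HTTPS traffic.
scan = [("192.168.1.10", f"10.0.0.{i}", 23, 60) for i in range(30)]
normal = [("192.168.1.11", "10.0.0.1", 443, 1500)] * 5
print(horizontal_scan_candidates(scan + normal))
```

In practice such a rule would only produce candidates for review; the thresholds would need tuning on the capture at hand.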
5.5. UNB-CIC IoT 2023 Dataset
- Normal communications between IoT devices;
- Denial-of-service (DoS) or distributed denial-of-service (DDoS) attacks, where attackers try to overload devices or networks;
- Malware injection into IoT devices, to compromise their operation or steal information;
- Man-in-the-middle attacks, in which an attacker positions themselves between two devices in order to intercept or alter communication.
- Timestamp: the exact time each packet was captured;
- IP Address: the source and destination address of each packet;
- Port Number: the port number associated with the connection;
- Protocol: the network protocol used (e.g., TCP, UDP, etc.);
- Packet size: information about the volume of data transmitted;
- TCP flags: indicators specific to TCP connections, such as SYN, ACK, FIN, that help identify the status of connections.
6. Experiments
6.1. Experimental Setup
6.2. Experimental Cases
- Experiment 1: separate models
- Experiment 2: ensemble 1—Random forest (RF)/XGBoost (XGB)/decision tree (DT)
- Experiment 3: ensemble 2—Deep neural network (DNN)/CNN–LSTM
- Experiment 4: ensemble 3—LSTM/CNN–LSTM/GRU
- Experiment 5: ensemble 4—Random forest (RF)/deep neural network (DNN)
6.2.1. Experiment 1: Separate Models
6.2.2. Experiment 2: Ensemble 1—RF/XGB/DT
6.2.3. Experiment 3: Ensemble 2—DNN/CNN–LSTM
- Subsample generation (Bootstrap): The bootstrap approach is used to choose multiple random samples from the original training dataset; as a result, some data may be excluded and some may be repeated in each sample.
- Model training: An independent model, such as a decision tree, is trained for every sample. Different models that capture distinct characteristics of the original dataset are produced because each model is trained on slightly different data.
- Aggregation of predictions: Following training, the models’ output is utilized to generate predictions. A majority vote is used in bagging for classification models, where each model casts a vote for a class, and the class with the most votes becomes the final prediction. The average of the predictions made by individual models is employed in regression models.
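The three bagging steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the setup used in the experiments: the base learner is a hypothetical one-feature threshold rule ("stump") standing in for a full decision tree, and the toy data, function names, and `n_models` value are assumptions.

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Step 1: draw len(X) rows with replacement (rows may repeat or be left out)."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def fit_stump(X, y):
    """Step 2 (base learner): pick the single feature/threshold with the fewest errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab

def fit_bagging(X, y, n_models=15, seed=0):
    """Train one independent model per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(X, y, rng)) for _ in range(n_models)]

def predict_majority(models, row):
    """Step 3: every model votes; the most frequent class wins."""
    return Counter(m(row) for m in models).most_common(1)[0][0]

# Toy data: one feature, values below 10 are "normal", the rest "attack".
X = [[v] for v in range(20)]
y = ["normal" if v < 10 else "attack" for v in range(20)]
models = fit_bagging(X, y)
print(predict_majority(models, [2]), predict_majority(models, [17]))
```

For a regression ensemble, `predict_majority` would be replaced by the average of the individual model outputs.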
6.2.4. Experiment 4: Ensemble 3—LSTM/CNN–LSTM/GRU
6.2.5. Experiment 5: Ensemble 4—RF/DNN
7. Results
7.1. Experiment 1: Separate Models
7.2. Experiments 2–5: Ensemble Models
7.3. Comparison of Separate and Ensemble Models
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Nascita, A.; Cerasuolo, F.; Di Monda, D.; Garcia, J.T.A.; Montieri, A.; Pescape, A. Machine and Deep Learning Approaches for IoT Attack Classification. In Proceedings of the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS 2022), New York, NY, USA, 2–5 May 2022.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Cortes, C.; Vapnik, V.; Saitta, L. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
- Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2323.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Cho, K.; van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
- Ravipati, R.D.; Abualkibash, M. Intrusion Detection System Classification Using Different Machine Learning Algorithms on KDD-99 and NSL-KDD Datasets - A Review Paper. SSRN Electron. J. 2019, 11.
- Farnaaz, N.; Jabbar, M.A. Random Forest Modeling for Network Intrusion Detection System. Procedia Comput. Sci. 2016, 89, 213–217.
- Bhamare, D.; Salman, T.; Samaka, M.; Erbad, A.; Jain, R. Feasibility of Supervised Machine Learning for Cloud Security. In Proceedings of the 2016 International Conference on Information Science and Security (ICISS 2016), Pattaya, Thailand, 19–22 December 2017.
- Sharmila, B.S.; Nagapadma, R. Intrusion detection system using naive bayes algorithm. In Proceedings of the 2019 5th IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE 2019), Bengaluru, India, 15–16 November 2019.
- Prachi, H.M.; Sharma, P. Intrusion detection using machine learning and feature selection. Int. J. Comput. Netw. Inf. Secur. 2019, 11, 43–52.
- Hammad, M.; El-Medany, W.; Ismail, Y. Intrusion Detection System using Feature Selection with Clustering and Classification Machine Learning Algorithms on the UNSW-NB15 dataset. In Proceedings of the 2020 International Conference on Innovation and Intelligence for Informatics, Computing and Technologies (3ICT 2020), Sakheer, Bahrain, 20–21 December 2020.
- Latif, S.; Dola, F.F.; Afsar, M.; Esha, I.J.; Nandi, D. Investigation of Machine Learning Algorithms for Network Intrusion Detection. Int. J. Inf. Eng. Electron. Bus. 2022, 14, 1–22.
- Alzahrani, A.O.; Alenazi, M.J.F. Designing a Network Intrusion Detection System Based on Machine Learning for Software Defined Networks. Future Internet 2021, 13, 111.
- Gouveia, A.; Correia, M. Network intrusion detection with XGBoost. In Recent Advances in Security, Privacy, and Trust for Internet of Things (IoT) and Cyber-Physical Systems (CPS); Chapman and Hall/CRC: Boca Raton, FL, USA, 2020; pp. 137–166.
- Ahmad, I.; Haq, Q.E.U.; Imran, M.; Alassafi, M.O.; AlGhamdi, R.A. An Efficient Network Intrusion Detection and Classification System. Mathematics 2022, 10, 530.
- Thamaraiselvi, R.; Mary, S.A.S. Attack and anomaly detection in iot networks using machine learning. Int. J. Comput. Sci. Mob. Comput. 2020, 9, 95–103.
- Kim, Y.G.; Ahmed, K.J.; Lee, M.J.; Tsukamoto, K. A Comprehensive Analysis of Machine Learning-Based Intrusion Detection System for IoT-23 Dataset. In Advances in Intelligent Networking and Collaborative Systems; Lecture Notes in Networks and Systems, LNNS; Springer: Cham, Switzerland, 2022; Volume 527, pp. 475–486.
- Faker, O.; Dogdu, E. Intrusion detection using big data and deep learning techniques. In Proceedings of the 2019 ACM Southeast Conference (ACMSE 2019), Kennesaw, GA, USA, 18–20 April 2019; pp. 86–93.
- Jia, Y.; Wang, M.; Wang, Y. Network intrusion detection algorithm based on deep neural network. IET Inf. Secur. 2019, 13, 48–53.
- Vinayakumar, R.; Alazab, M.; Soman, K.P.; Poornachandran, P.; Al-Nemrat, A.; Venkatraman, S. Deep Learning Approach for Intelligent Intrusion Detection System. IEEE Access 2019, 7, 41525–41550.
- Le, T.-T.-H.; Kim, J.; Kim, H. Analyzing Effective of Activation Functions on Recurrent Network for Intrusion Detection. J. Multimed. Inf. Syst. 2016, 3, 91–96.
- Lin, W.-H.; Lin, H.-C.; Wang, P.; Wu, B.-H.; Tsai, J.-Y. Using convolutional neural networks to network intrusion detection for cyber threats. In Proceedings of the 4th IEEE International Conference on Applied System Innovation 2018 (ICASI 2018), Chiba, Japan, 13–17 April 2018; pp. 1107–1110.
- Li, Z.; Rios, A.L.G.; Xu, G.; Trajkovic, L. Machine learning techniques for classifying network anomalies and intrusions. In Proceedings of the IEEE International Symposium on Circuits and Systems, Monterey, CA, USA, 21–22 May 2019; Volume 2019.
- Hsu, C.-M.; Hsieh, Y.; Prakosa, S.; Azhari, M.; Leu, J.-S. Using Long-Short-Term Memory Based Convolutional Neural Networks for Network Intrusion Detection. In Proceedings of the 11th EAI International Conference, WiCON 2018, Taipei, Taiwan, 15–16 October 2018; Springer: Cham, Switzerland, 2019; pp. 86–94.
- Andresini, G.; Appice, A.; Di Mauro, N.; Loglisci, C.; Malerba, D. Multi-Channel Deep Feature Learning for Intrusion Detection. IEEE Access 2020, 8, 53346–53359.
- Ravi, V.; Chaganti, R.; Alazab, M. Recurrent deep learning-based feature fusion ensemble meta-classifier approach for intelligent network intrusion detection system. Comput. Electr. Eng. 2022, 102, 108156.
- Sahu, A.K.; Sharma, S.; Tanveer, M.; Raja, R. Internet of Things attack detection using hybrid Deep Learning Model. Comput. Commun. 2021, 176, 146–154.
- Toldinas, J.; Venčkauskas, A.; Damaševičius, R.; Grigaliūnas, Š.; Morkevičius, N.; Baranauskas, E. A Novel Approach for Network Intrusion Detection Using Multistage Deep Learning Image Recognition. Electronics 2021, 10, 1854.
- Ullah, I.; Mahmoud, Q.H. Design and Development of a Deep Learning-Based Model for Anomaly Detection in IoT Networks. IEEE Access 2021, 9, 103906–103926.
- Cao, B.; Li, C.; Song, Y.; Qin, Y.; Chen, C. Network Intrusion Detection Model Based on CNN and GRU. Appl. Sci. 2022, 12, 4184.
- Alhamad, R.N.; Alserhani, F.M. Prediction Models to Effectively Detect Malware Patterns in the IoT Systems. Int. J. Adv. Comput. Sci. Appl. 2022, 13.
- Ullah, I.; Mahmoud, Q.H. Design and Development of RNN Anomaly Detection Model for IoT Networks. IEEE Access 2022, 10, 62722–62750.
- Stolfo, S.J.; Fan, W.; Lee, W.; Prodromidis, A.; Chan, P.K. Cost-based modeling for fraud and intrusion detection: Results from the JAM project. In Proceedings of the DARPA Information Survivability Conference and Exposition (DISCEX 2000), Hilton Head, SC, USA, 25–27 January 2000; Volume 2, pp. 130–144.
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA 2009), Ottawa, ON, Canada, 8–10 July 2009.
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS 2015), Canberra, Australia, 10–12 November 2015.
- Garcia, S.; Parmisano, A.; Erquiaga, M.J. IoT-23: A labeled dataset with malicious and benign IoT network traffic. Zenodo 2021.
- Neto, C.P.; Dadkhah, S.; Ferreira, R.; Zohourian, A.; Lu, R.; Ghorbani, A.A. CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment. Sensors 2023, 23, 5941.
Model | Nature (Shallow/Deep) |
---|---|
Decision Tree (DT) | Shallow learning |
Naïve Bayes (NB) | Shallow learning |
Logistic Regression (LR) | Shallow learning |
XGBoost (v.1.7.2) | Shallow learning |
Support Vector Machine (SVM) | Shallow learning |
Multilayer Perceptron (MLP) | Deep learning |
Convolutional Neural Network (CNN) | Deep learning |
Long Short-Term Memory (LSTM) | Deep learning |
Paper | Dataset | Best Model | Best Result (Detection/Classification) |
---|---|---|---|
Ravipati et al. [7] | KDD-99 | Random Forest | 99% (detection) |
Farnaaz et al. [8] | NSL-KDD | Random Forest | 99.67% (classification) |
Bhamare et al. [9] | UNSW-NB15 | Logistic Regression | 89.26% (detection) |
Sharmila et al. [10] | NSL-KDD | Naïve Bayes | 86.5% (classification) |
Prachi et al. [11] | NSL-KDD | Random Forest | 99.91% (classification) |
Hammad et al. [12] | UNSW-NB15 | Random Forest | 97.60% (detection) |
Latif et al. [13] | NSL-KDD | KNN | 85.3% (classification) |
Alzahrani et al. [14] | NSL-KDD | XGBoost | 95.5% (detection) |
Gouveia et al. [15] | NSL-KDD | XGBoost | 88.64% (detection) |
Gouveia et al. [15] | UNSW-NB15 | XGBoost | 93.34% (detection) |
Ahmad et al. [16] | UNSW-NB15 | Adaboost-based DT | 99.3% (detection) |
Thamaraiselvi et al. [17] | IoT-23 | RF | 99.5% (detection) |
Kim et al. [18] | IoT-23 | ANN | 99.99% (detection) |
Faker et al. [19] | UNSW-NB15 | DNN | 99.16% (detection); 97.01% (classification) |
Faker et al. [19] | CIC-IDS2017 | GBT; DNN | 99.99% (detection); 99.56% (classification) |
Jia et al. [20] | NSL-KDD | DNN | 99.9% (classification) |
Vinayakumar et al. [21] | KDD-99 | DT; DNN | 92.9% (detection); 92.5% (classification) |
Vinayakumar et al. [21] | NSL-KDD | DT; DNN | 93% (detection); 78.5% (classification) |
Vinayakumar et al. [21] | UNSW-NB15 | RF | 90.3% (detection); 75.5% (classification) |
Vinayakumar et al. [21] | CIC-IDS2017 | RF; DNN | 94% (detection); 95.6% (classification) |
Le et al. [22] | KDD-99 | RNN | 97.77% (classification) |
Lin et al. [23] | KDD-99 | CNN (with adaptive delta algorithm) | 99.65% (detection) |
Li et al. [24] | NSL-KDD | GRU | 82.87% (classification) |
Hsu et al. [25] | NSL-KDD | CNN+LSTM | 94.12% (detection); 88.95% (classification) |
Andresini et al. [26] | KDD-99 | AE+CNN | 92.49% (detection) |
Andresini et al. [26] | UNSW-NB15 | AE+CNN | 93.40% (detection) |
Andresini et al. [26] | CIC-IDS2017 | AE+CNN | 97.90% (detection) |
Ravi et al. [27] | KDD-99 | (RNN and LSTM and GRU) + (RF and SVM) + LR | 99% (detection); 89% (classification) |
Ravi et al. [27] | UNSW-NB15 | (RNN and LSTM and GRU) + (RF and SVM) + LR | 99% (detection); 99% (classification) |
Ravi et al. [27] | CIC-IDS2017 | (RNN and LSTM and GRU) + (RF and SVM) + LR | 99% (detection); 98% (classification) |
Sahu et al. [28] | IoT-23 | CNN+LSTM | 96% (detection) |
Toldinas et al. [29] | UNSW-NB15 | CNN | 99.8% (classification) |
Ullah et al. [30] | IoT-23 | CNN 1D | 99.96% (classification) |
Ullah et al. [30] | MQTT-IoT-IDS2020 | CNN 1D + transfer learning | 99.98% (detection) |
Cao et al. [31] | UNSW-NB15 | CNN + GRU | 99% (detection); 89% (classification) |
Cao et al. [31] | NSL-KDD | CNN + GRU | 99% (detection); 99% (classification) |
Cao et al. [31] | CIC-IDS2017 | CNN + GRU | 99% (detection); 98% (classification) |
Alhamad et al. [32] | IoT-23 | RF; CatBoost; XGBoost | 89% (detection) |
Ullah et al. [33] | NSL-KDD | CNNBiLSTM; BiLSTM | 99.88% (classification); 99.92% (detection) |
Ullah et al. [33] | IoT-23 | CNNBiLSTM; LSTM | 99.87% (classification); 99.80% (detection) |
Class | Training Set | Percentage | Test Set | Percentage |
---|---|---|---|---|
Normal | 812,814 | 75.611% | 60,593 | 19.481% |
DoS | 247,267 | 23.002% | 229,853 | 73.901% |
Probing Attack | 13,860 | 1.289% | 4166 | 1.339% |
R2L | 999 | 0.093% | 16,189 | 5.205% |
U2R | 52 | 0.005% | 228 | 0.073% |
Total | 1,074,992 | 100% | 311,029 | 100% |
Class | No. of Samples | Percentage |
---|---|---|
Normal | 2,218,761 | 87.35% |
Fuzzer | 24,246 | 0.95% |
Analysis | 2677 | 0.11% |
Backdoor | 2329 | 0.09% |
DoS | 16,353 | 0.64% |
Exploit | 44,525 | 1.75% |
Generic | 215,481 | 8.48% |
Reconnaissance | 13,987 | 0.55% |
Shellcode | 1511 | 0.06% |
Worm | 174 | 0.01% |
Total | 2,540,044 | 100% |
Accuracy (%)
Dataset | DT | RF | NB | LR | XGB | SVM | MLP | DNN | CNN | LSTM | CNN–LSTM | GRU | RNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
KDD-99 | 93.00 | 93.00 | 88.00 | 92.00 | 93.00 | 78.00 | 93.00 | 99.20 | 93.10 | 93.00 | 92.90 | 93.00 | 92.60 |
NSL-KDD | 84.00 | 87.00 | 85.30 | 85.80 | 87.20 | 50.70 | 88.70 | 89.10 | 89.50 | 88.80 | 91.00 | 88.30 | 91.80 |
UNSW-NB15 | 99.70 | 99.80 | 98.00 | 98.30 | 99.80 | 89.00 | 98.00 | 98.40 | 99.00 | 98.60 | 98.90 | 98.50 | 98.40 |
IoT-23 | 93.50 | 93.40 | 86.10 | 90.00 | 92.60 | 86.30 | 86.70 | 93.50 | 93.50 | 93.50 | 93.50 | 93.50 | 93.50 |
UNB-CIC IoT 2023 | 99.60 | 99.70 | 61.00 | 98.90 | 99.60 | 98.60 | 99.20 | 99.11 | 99.16 | 99.00 | 99.03 | 99.05 | 99.00 |
Mean acc. | 93.96 | 94.58 | 83.68 | 93.00 | 94.44 | 80.52 | 93.12 | 95.86 | 94.85 | 94.58 | 95.07 | 94.47 | 95.06 |
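The "Mean acc." row is consistent with an unweighted arithmetic mean of the five per-dataset accuracies. As a quick check, the short sketch below recomputes it for three of the models; the values are copied from the table, while the dictionary layout and variable names are only illustrative.

```python
from statistics import mean

# Per-dataset accuracy (%) in the order:
# KDD-99, NSL-KDD, UNSW-NB15, IoT-23, UNB-CIC IoT 2023.
accuracy = {
    "DT": [93.00, 84.00, 99.70, 93.50, 99.60],
    "SVM": [78.00, 50.70, 89.00, 86.30, 98.60],
    "DNN": [99.20, 89.10, 98.40, 93.50, 99.11],
}

# Unweighted mean over datasets, rounded to two decimals as in the table.
mean_acc = {model: round(mean(values), 2) for model, values in accuracy.items()}
print(mean_acc)
```

Because the mean is unweighted, each dataset counts equally regardless of its size, which is worth keeping in mind when comparing models across datasets as different as NSL-KDD and UNSW-NB15.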
Execution Time (ms)
Processor | DT | RF | NB | LR | XGB | SVM | MLP | DNN | CNN | LSTM | CNN–LSTM | GRU | RNN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CPU | 4 × 10⁻⁴ | 3 × 10⁻³ | 6 × 10⁻⁴ | 3.5 × 10⁻⁴ | 1.5 × 10⁻³ | 1 × 10⁻³ | 3 × 10⁻³ | 6 × 10⁻⁷ | 8 × 10⁻⁷ | 6 × 10⁻⁷ | 5 × 10⁻⁷ | 5 × 10⁻⁷ | 8 × 10⁻⁷ |
GPU | - | 1.5 × 10⁻⁶ | 1 × 10⁻⁷ | 1 × 10⁻⁷ | 1 × 10⁻³ | - | - | 4 × 10⁻⁷ | 4 × 10⁻⁷ | 1 × 10⁻⁷ | 1 × 10⁻⁷ | 1 × 10⁻⁷ | 4 × 10⁻⁷ |
Accuracy (%)
Dataset | Experiment 2: RF/XGB/DT | Experiment 3: DNN/CNN–LSTM | Experiment 4: LSTM/CNN–LSTM/GRU | Experiment 5: RF/DNN |
---|---|---|---|---|
KDD-99 | 99.30 | 98.70 | 98.71 | 99.09 |
NSL-KDD | 86.00 | 86.28 | 86.24 | 87.40 |
UNSW-NB15 | 99.40 | 99.63 | 99.65 | 99.79 |
IoT-23 | 93.50 | 93.33 | 93.37 | 93.53 |
UNB-CIC IoT 2023 | 99.70 | 99.27 | 99.33 | 99.73 |
Mean acc. | 95.58 | 95.44 | 95.46 | 95.91 |
Execution Time (ms)
Processor | Experiment 2: RF/XGB/DT | Experiment 3: DNN/CNN–LSTM | Experiment 4: LSTM/CNN–LSTM/GRU | Experiment 5: RF/DNN |
---|---|---|---|---|
CPU | 5 × 10⁻³ | 6 × 10⁻⁷ | 5 × 10⁻⁷ | 5 × 10⁻⁷ |
GPU | 1 × 10⁻³ | 8 × 10⁻⁷ | 2 × 10⁻⁷ | 2 × 10⁻⁷ |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Genuario, F.; Santoro, G.; Giliberti, M.; Bello, S.; Zazzera, E.; Impedovo, D. Machine Learning-Based Methodologies for Cyber-Attacks and Network Traffic Monitoring: A Review and Insights. Information 2024, 15, 741. https://doi.org/10.3390/info15110741