A Network Intrusion Detection Model Based on BiLSTM with Multi-Head Attention Mechanism
Abstract
1. Introduction
2. Related Research
3. Model Methodology
3.1. Embedding
3.2. Multi-Head Attention
3.3. BiLSTM
3.4. Dense Layer
- Improved sequence modeling: BiLSTM is a recurrent neural network (RNN) that models sequential data in both the forward and backward directions. Combining it with multi-head attention further strengthens the model’s ability to capture long-range dependencies and improves the quality of sequence modeling.
- Increased interpretability: The multi-head attention mechanism lets the model attend selectively to distinct parts of the input sequence, and the resulting attention weights make the model’s decision-making process more transparent. This is particularly useful in detection tasks such as network intrusion detection.
- Robustness to noise and variations: Because the model attends to multiple parts of the input sequence, it is more robust to noise and variation in the data.
- Scalability: Combining multi-head attention with BiLSTM allows the model to scale to larger datasets and more complex tasks without compromising accuracy, which makes it an effective approach for large-scale network intrusion detection.
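The attention step that these points rely on can be sketched in plain Python. This is a minimal illustration of scaled dot-product self-attention split across heads, in the spirit of Vaswani et al.; it is not the authors' implementation. For brevity it assumes no learned projection matrices (queries, keys, and values are slices of the input itself), and the names `softmax`, `attention`, and `multi_head_self_attention` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for one head.

    q, k, v are lists of equal-length vectors, one vector per sequence position.
    Returns (outputs, weights); weights[i][j] is how much position i attends to j.
    """
    d = len(q[0])
    outputs, weights = [], []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * v[j][c] for j in range(len(v)))
                        for c in range(len(v[0]))])
    return outputs, weights

def multi_head_self_attention(x, num_heads):
    """Split each feature vector into num_heads slices, attend per slice, concatenate.

    Assumption for brevity: no learned projections, so each head sees a raw slice.
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "feature size must divide evenly across heads"
    d_head = d_model // num_heads
    head_outputs, head_weights = [], []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        out, w = attention(part, part, part)
        head_outputs.append(out)
        head_weights.append(w)
    # Concatenate the per-head outputs back into one vector per position.
    merged = [sum((head_outputs[h][t] for h in range(num_heads)), [])
              for t in range(len(x))]
    return merged, head_weights

# Toy sequence: 3 positions, 4 features, 2 heads.
x = [[1.0, 0.0, 0.5, 0.5],
     [0.0, 1.0, 0.5, -0.5],
     [1.0, 1.0, 0.0, 0.0]]
merged, head_weights = multi_head_self_attention(x, num_heads=2)
```

Each head produces its own weight matrix, so different heads can emphasize different positions of the same sequence; inspecting `head_weights` is the kind of transparency the interpretability point refers to.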
4. Implementation Details
5. Experiment and Results
5.1. Introduction to the Data Set
5.2. Data Processing
5.2.1. Data Conversion
5.2.2. Data Normalization
5.2.3. One-Hot Encoding
5.2.4. Dataset Balancing
5.3. Evaluation Criteria
5.4. Model Review
5.5. Model Accuracy and Loss Variation
5.6. Ablation Experiments
- We use BiLSTM to build our model. On the one hand, it captures bidirectional features better; on the other hand, it mitigates problems such as vanishing and exploding gradients. Both properties make it well suited to network intrusion detection.
- Adding the multi-head attention mechanism assigns a different attention weight to each vector in the feature sequence, which strengthens the relationship between certain vectors and the type of attack detected and thereby improves detection accuracy. It also avoids the problem of each position over-focusing attention on itself.
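The bidirectional behavior described in the first point can be illustrated with a toy BiLSTM in plain Python. This is a sketch under loud assumptions: the hidden state is a single scalar per direction, the weights are fixed hypothetical constants rather than trained values, and the helper names (`make_cell`, `lstm_step`, `run`, `bilstm`) are invented for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_cell(input_size, w):
    """Toy LSTM cell with a scalar hidden state.

    Hypothetical fixed weights: every input weight and recurrent weight is w,
    every bias is 0. A trained model would learn these values.
    """
    return {gate: ([w] * input_size, w, 0.0) for gate in ("i", "f", "o", "g")}

def lstm_step(x, h, c, cell):
    """One LSTM step: gates decide what to write to, keep in, and read from c."""
    def pre(gate):
        wx, wh, b = cell[gate]
        return sum(wi * xi for wi, xi in zip(wx, x)) + wh * h + b
    i, f, o = sigmoid(pre("i")), sigmoid(pre("f")), sigmoid(pre("o"))
    g = math.tanh(pre("g"))
    c_new = f * c + i * g        # additive cell-state update
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def run(seq, cell):
    """Run one direction over the sequence and collect the hidden states."""
    h, c, states = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, cell)
        states.append(h)
    return states

def bilstm(seq, fwd_cell, bwd_cell):
    """Pair the left-to-right state with the right-to-left state at each position."""
    fwd = run(seq, fwd_cell)
    bwd = run(seq[::-1], bwd_cell)[::-1]
    return list(zip(fwd, bwd))

fwd_cell, bwd_cell = make_cell(2, 0.5), make_cell(2, 0.3)
seq = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.3]]
states = bilstm(seq, fwd_cell, bwd_cell)

# Change only the last input: the forward state at position 0 is unaffected,
# while the backward state at position 0 changes because it has seen the future.
changed = bilstm(seq[:-1] + [[-0.8, 0.7]], fwd_cell, bwd_cell)
```

The additive update of `c_new` is the part usually credited with easing vanishing gradients, since information can pass through the cell state without being squashed at every step.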
5.7. Comparison with Other Models
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Manzoor, I.; Kumar, N. A feature reduced intrusion detection system using ANN classifier. Expert Syst. Appl. 2017, 88, 249–257.
- Thapa, S.; Mailewa, A. The role of intrusion detection/prevention systems in modern computer networks: A review. In Proceedings of the Conference: Midwest Instruction and Computing Symposium (MICS), Online, 3–4 April 2020; Volume 53, pp. 1–14.
- Patgiri, R.; Varshney, U.; Akutota, T.; Kunde, R. An investigation on intrusion detection system using machine learning. In Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI), Bangalore, India, 18–21 November 2018; pp. 1684–1691.
- Liu, M.; Xue, Z.; Xu, X.; Zhong, C.; Chen, J. Host-based intrusion detection system with system calls: Review and future trends. ACM Comput. Surv. (CSUR) 2018, 51, 1–36.
- Pu, G.; Wang, L.; Shen, J.; Dong, F. A hybrid unsupervised clustering-based anomaly detection method. Tsinghua Sci. Technol. 2020, 26, 146–153.
- Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surv. Tutor. 2015, 18, 1153–1176.
- Momand, A.; Jan, S.U.; Ramzan, N. A Systematic and Comprehensive Survey of Recent Advances in Intrusion Detection Systems Using Machine Learning: Deep Learning, Datasets, and Attack Taxonomy. J. Sens. 2023, 2023, 6048087.
- Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Appl. Sci. 2019, 9, 4396.
- Sivamohan, S.; Sridhar, S.; Krishnaveni, S. An effective recurrent neural network (RNN) based intrusion detection via bi-directional long short-term memory. In Proceedings of the 2021 International Conference on Intelligent Technologies (CONIT), Hubli, India, 25–27 June 2021; pp. 1–5.
- Elsayed, N.; Zaghloul, Z.S.; Azumah, S.W.; Li, C. Intrusion detection system in smart home network using bidirectional LSTM and convolutional neural networks hybrid model. In Proceedings of the 2021 IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), Lansing, MI, USA, 9–11 August 2021; pp. 55–58.
- Chi, H.; Lin, C. Industrial Intrusion Detection System Based on CNN-Attention-BILSTM Network. In Proceedings of the 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS), Huaihua City, China, 15–17 July 2022; pp. 32–39.
- Zhang, L.; Huang, J.; Zhang, Y.; Zhang, G. Intrusion detection model of CNN-BiLSTM algorithm based on mean control. In Proceedings of the 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 16–18 October 2020; pp. 22–27.
- Wang, J.; Chen, N.; Yu, J.; Jin, Y.; Li, Y. An efficient intrusion detection model combined bidirectional gated recurrent units with attention mechanism. In Proceedings of the 2020 7th International Conference on Behavioural and Social Computing (BESC), Bournemouth, UK, 5–7 November 2020; pp. 1–6.
- Hou, H.; Di, Z.; Zhang, M.; Yuan, D. An Intrusion Detection Method for Cyber Monintoring Using Attention based Hierarchical LSTM. In Proceedings of the 2022 IEEE 8th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing (HPSC), and IEEE Intl Conference on Intelligent Data and Security (IDS), Jinan, China, 6–8 May 2022; pp. 125–130.
- Song, Y.; Zhang, D.; Li, Y.; Shi, S.; Duan, P.; Wei, J. Intrusion Detection for Internet of Things Networks using Attention Mechanism and BiGRU. In Proceedings of the 2023 5th International Conference on Electronic Engineering and Informatics (EEI), Wuhan, China, 30 June–2 July 2023; pp. 227–230.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
- Liu, C.; Liu, Y.; Yan, Y.; Wang, J. An intrusion detection model with hierarchical attention mechanism. IEEE Access 2020, 8, 67542–67554.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005, 18, 602–610.
- Medsker, L.R.; Jain, L. Recurrent neural networks. Des. Appl. 2001, 5, 2.
- Schtickzelle, M. Pierre-François Verhulst (1804–1849). La première découverte de la fonction logistique. Population 1981, 3, 541–556.
- Sudjianto, A.; Knauth, W.; Singh, R.; Yang, Z.; Zhang, A. Unwrapping the black box of deep relu networks: Interpretability, diagnostics, and simplification. arXiv 2020, arXiv:2011.04041.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6.
- Revathi, S.; Malathi, A. A detailed analysis on NSL-KDD dataset using various machine learning techniques for intrusion detection. Int. J. Eng. Res. Technol. 2013, 2, 1848–1853.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 2018, 1, 108–116.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Andresini, G.; Appice, A.; Malerba, D. Nearest cluster-based intrusion detection through convolutional neural networks. Knowl.-Based Syst. 2021, 216, 106798.
- Luo, J.; Zhang, Y.; Wu, Y.; Xu, Y.; Guo, X.; Shang, B. A Multi-Channel Contrastive Learning Network Based Intrusion Detection Method. Electronics 2023, 12, 949.
- Zhang, L.; Yan, H.; Zhu, Q. An Improved LSTM Network Intrusion Detection Method. In Proceedings of the 2020 IEEE 6th International Conference on Computer and Communications (ICCC), Chengdu, China, 11–14 December 2020; pp. 1765–1769.
- Su, T.; Sun, H.; Zhu, J.; Wang, S.; Li, Y. BAT: Deep Learning Methods on Network Intrusion Detection Using NSL-KDD Dataset. IEEE Access 2020, 8, 29575–29585.
- Yang, Y.; Zheng, K.; Wu, C.; Yang, Y. Improving the Classification Effectiveness of Intrusion Detection by Using Improved Conditional Variational AutoEncoder and Deep Neural Network. Sensors 2019, 19, 2528.
- Ieracitano, C.; Adeel, A.; Morabito, F.C.; Hussain, A. A novel statistical analysis and autoencoder driven intelligent intrusion detection approach. Neurocomputing 2020, 387, 51–62.
- Wang, Z.; Zeng, Y.; Liu, Y.; Li, D. Deep belief network integrating improved kernel-based extreme learning machine for network intrusion detection. IEEE Access 2021, 9, 16062–16091.
- Mendonça, R.V.; Teodoro, A.A.; Rosa, R.L.; Saadi, M.; Melgarejo, D.C.; Nardelli, P.H.; Rodríguez, D.Z. Intrusion detection system based on fast hierarchical deep convolutional neural network. IEEE Access 2021, 9, 61024–61034.
Category | Attack | Interpretation |
---|---|---|
Normal | Normal | Normal network activity |
DOS | back, land, neptune, pod, smurf, teardrop | A Denial-of-Service (DoS) attack is a type of cyberattack in which a perpetrator attempts to make a website or network resource unavailable to its intended users by overwhelming it with traffic or other data. |
Probing | ipsweep, nmap, portsweep, satan | Surveillance and other probing activities. |
R2L | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster | A Remote-to-Local (R2L) attack is a type of cyberattack in which an attacker tries to gain unauthorized access to a target system by exploiting vulnerabilities in remote services or applications. |
U2R | buffer_overflow, loadmodule, perl, rootkit | A User-to-Root (U2R) attack is a type of cyberattack in which an attacker with limited privileges on a system attempts to gain root-level access. |
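The category table maps naturally onto a lookup structure. The sketch below shows one way to group raw attack labels into the five categories listed; the names `ATTACK_CATEGORIES`, `LABEL_TO_CATEGORY`, and `categorize` are hypothetical, and the normalization step (trimming whitespace and a trailing period, lowercasing, replacing spaces with underscores) is an assumption about how raw labels may be written, not something the source specifies.

```python
# Category -> attack labels, copied from the table above.
ATTACK_CATEGORIES = {
    "Normal": ["normal"],
    "DOS": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "Probing": ["ipsweep", "nmap", "portsweep", "satan"],
    "R2L": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "U2R": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
}

# Invert the mapping so each attack label points at its category.
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in ATTACK_CATEGORIES.items()
    for label in labels
}

def categorize(label):
    """Map a raw attack label to its category; unlisted labels map to 'Unknown'."""
    key = label.strip().rstrip(".").lower().replace(" ", "_")
    return LABEL_TO_CATEGORY.get(key, "Unknown")

examples = [categorize(name) for name in ("smurf.", "buffer overflow", "Normal", "xyz")]
```

Collapsing the individual attack names into categories this way reduces the label space to five classes, which is a common preprocessing choice for these datasets.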
Name of Files | Day Activity | Attacks Found | Advantage | Goal |
---|---|---|---|---|
Monday-WorkingHours.pcap_ISCX.csv | Monday | Benign (Normal human activities) | This is a dataset that can further meet real-world standards, covering attack standards from 11 countries, making it more reliable and available [26]. | Using this dataset can help improve our model’s generalization ability and improve its accuracy in modern intrusion detection predictions, rather than just being applicable to the past. |
Tuesday-WorkingHours.pcap_ISCX.csv | Tuesday | Benign, FTP-Patator, SSH-Patator | | |
Wednesday-workingHours.pcap_ISCX.csv | Wednesday | Benign, DoS GoldenEye, DoS Hulk, DoS Slowhttptest, DoS slowloris, Heartbleed | | |
Thursday-WorkingHours-Morning-WebAttacks.pcap_ISCX.csv | Thursday | Benign, Web Attack-Brute Force, Web Attack-Sql Injection, Web Attack-XSS | | |
Thursday-WorkingHours-Afternoon-Infilteration.pcap_ISCX.csv | Thursday | Benign, Infiltration | | |
Friday-WorkingHours-Morning.pcap_ISCX.csv | Friday | Benign, Bot | | |
Friday-WorkingHours-Afternoon-PortScan.pcap_ISCX.csv | Friday | Benign, PortScan | | |
Friday-WorkingHours-Afternoon-DDos.pcap_ISCX.csv | Friday | Benign, DDoS | | |
Dataset | Training Set | Validation Set | Test Set | Total | Input Vector Features | Number of Labels |
---|---|---|---|---|---|---|
KDDCUP99 | 3,108,950 | 777,237 | 971,547 | 4,857,734 | 41 | 40 |
NSLKDD | 95,050 | 23,762 | 29,704 | 148,516 | 41 | 40 |
CICIDS2017 | 498,741 | 124,685 | 155,857 | 779,283 | 78 | 15 |
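The split sizes in this table are consistent with holding out 20% of each dataset for testing and then 20% of the remainder for validation, roughly a 64/16/20 split. The sketch below checks that arithmetic. The rounding directions (round up for the test set, round down for the validation set) are an assumption chosen because they reproduce the table; the source does not state how the split was produced, and `split_sizes` is a hypothetical helper.

```python
# Totals taken from the "Total" column of the table.
TOTALS = {"KDDCUP99": 4_857_734, "NSLKDD": 148_516, "CICIDS2017": 779_283}

def split_sizes(total):
    """Return (train, validation, test) sizes for a 20% / 20%-of-remainder split."""
    test = -(-total // 5)        # ceil(total / 5), done in integer arithmetic
    remaining = total - test
    val = remaining // 5         # floor(remaining / 5)
    train = remaining - val
    return train, val, test

sizes = {name: split_sizes(total) for name, total in TOTALS.items()}
for name, (train, val, test) in sizes.items():
    print(f"{name}: train={train:,} val={val:,} test={test:,}")
# KDDCUP99: train=3,108,950 val=777,237 test=971,547
# NSLKDD: train=95,050 val=23,762 test=29,704
# CICIDS2017: train=498,741 val=124,685 test=155,857
```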
Metric | Mathematical Formulae |
---|---|
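Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
Precision | $\frac{TP}{TP + FP}$ |
Recall | $\frac{TP}{TP + FN}$ |
F1-Score | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ |

Here $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.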
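The four metrics named in the table are conventionally computed from confusion-matrix counts. The sketch below uses the standard definitions for a binary (attack versus normal) setting; the function name `classification_metrics` and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives, and false negatives. Empty denominators yield 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts purely for illustration.
m = classification_metrics(tp=90, tn=80, fp=10, fn=20)
```

With these counts, accuracy is 170/200 = 0.85, precision is 90/100 = 0.9, recall is 90/110, and F1 is 180/210. Reporting precision and recall alongside accuracy matters for intrusion data because the classes are typically imbalanced.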
Dataset | Structure | Accuracy (%) | Precision | Recall | F1-Score |
---|---|---|---|---|---|
KDDCUP99 | Transformer | 85.71 | 0.88 | 0.82 | 0.85 |
KDDCUP99 | BiLSTM | 98.25 | 0.97 | 1.00 | 0.98 |
KDDCUP99 | Attention | 71.65 | 0.63 | 0.99 | 0.77 |
KDDCUP99 | Multi-Head Attention | 71.54 | 0.63 | 0.98 | 0.77 |
KDDCUP99 | Attention + BiLSTM | 97.96 | 0.97 | 1.00 | 0.98 |
KDDCUP99 | Multi-Head Attention + BiLSTM | 98.29 | 0.97 | 1.00 | 0.98 |
NSLKDD | Transformer | 73.26 | 0.75 | 0.81 | 0.78 |
NSLKDD | BiLSTM | 95.13 | 0.96 | 0.97 | 0.97 |
NSLKDD | Attention | 65.01 | 0.76 | 0.62 | 0.68 |
NSLKDD | Multi-Head Attention | 65.01 | 0.76 | 0.62 | 0.68 |
NSLKDD | Attention + BiLSTM | 94.70 | 0.95 | 0.98 | 0.96 |
NSLKDD | Multi-Head Attention + BiLSTM | 95.19 | 0.95 | 0.98 | 0.97 |
CICIDS2017 | Transformer | 97.94 | 0.98 | 0.97 | 0.97 |
CICIDS2017 | BiLSTM | 98.51 | 0.99 | 0.98 | 0.99 |
CICIDS2017 | Attention | 97.75 | 0.98 | 0.97 | 0.98 |
CICIDS2017 | Multi-Head Attention | 97.88 | 0.99 | 0.97 | 0.98 |
CICIDS2017 | Attention + BiLSTM | 97.24 | 0.97 | 0.97 | 0.97 |
CICIDS2017 | Multi-Head Attention + BiLSTM | 99.08 | 1.00 | 0.99 | 0.99 |
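To read the ablation table at a glance, it can help to compute how much accuracy the full model gains over BiLSTM alone on each dataset. The short sketch below does only that arithmetic, using accuracy values copied from the table; the variable names are hypothetical.

```python
# Accuracy (%) copied from the ablation table for two of the six structures.
ACCURACY = {
    "KDDCUP99": {"BiLSTM": 98.25, "Multi-Head Attention + BiLSTM": 98.29},
    "NSLKDD": {"BiLSTM": 95.13, "Multi-Head Attention + BiLSTM": 95.19},
    "CICIDS2017": {"BiLSTM": 98.51, "Multi-Head Attention + BiLSTM": 99.08},
}

# Gain in percentage points from adding multi-head attention to BiLSTM.
gains = {
    dataset: round(acc["Multi-Head Attention + BiLSTM"] - acc["BiLSTM"], 2)
    for dataset, acc in ACCURACY.items()
}
print(gains)
# {'KDDCUP99': 0.04, 'NSLKDD': 0.06, 'CICIDS2017': 0.57}
```

On these numbers the gain is largest on CICIDS2017 and under 0.1 percentage points on the two KDD-derived datasets.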
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Zhang, X.; Liu, Z.; Fu, F.; Jiao, Y.; Xu, F. A Network Intrusion Detection Model Based on BiLSTM with Multi-Head Attention Mechanism. Electronics 2023, 12, 4170. https://doi.org/10.3390/electronics12194170