Anomaly Detection Model of Network Dataflow Based on an Improved Grey Wolf Algorithm and CNN
Abstract
1. Introduction
2. Materials and Methods
2.1. Feature Selection Based on the Grey Wolf Optimization Algorithm
2.1.1. Principles of the Grey Wolf Optimization Algorithm
- (1) Initialize the population: Randomly generate a group of grey wolves as the initial population. Each grey wolf represents a potential solution.
- (2) Evaluate fitness: Calculate the fitness value of each grey wolf using the fitness function defined for the problem. This value measures the quality of the solution.
- (3) Determine the leader: Select the grey wolf with the best fitness value as the leader. The position of the leader represents the current optimal solution.
- (4) Update grey wolf positions: Update the position of each grey wolf based on its distance to the leader and on the fitness values. The leader influences the update, so that better solutions guide the search direction of the other solutions.
- (5) Handle boundaries: When updating positions, ensure that no grey wolf moves outside the defined boundaries of the problem.
- (6) Iterative search: Repeat steps (2) to (5), re-evaluating fitness after each position update, until the predetermined number of iterations is reached or the stopping criterion is met.
- (7) Output the result: After the iterative search completes, output the best solution found as the solution to the optimization problem.
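
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard update rule of the grey wolf optimizer (Mirjalili et al., 2014, listed in the references), which guides the pack with the three best wolves rather than the single leader described above. The function name `grey_wolf_optimize`, its parameters, and the sphere test function are hypothetical and chosen only for illustration.

```python
import random


def grey_wolf_optimize(fitness, dim, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `fitness` inside the box [lo, hi]^dim with a basic GWO loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step (1): random initial population; each wolf is one candidate solution.
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    leaders = []
    for t in range(n_iters):
        # Steps (2)-(3): evaluate fitness and keep the three best wolves seen
        # so far (assumption: standard GWO with alpha, beta, delta leaders).
        leaders = [list(w) for w in sorted(wolves + leaders, key=fitness)[:3]]
        a = 2.0 * (1.0 - t / n_iters)  # control parameter, decreases from 2 towards 0
        moved = []
        for wolf in wolves:
            position = []
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a  # step-size coefficient in [-a, a]
                    C = 2.0 * rng.random()          # random leader weight in [0, 2]
                    distance = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * distance
                # Step (4): move to the average of the leader-guided estimates.
                # Step (5): clamp the coordinate to the search boundaries.
                position.append(min(max(estimate / len(leaders), lo), hi))
            moved.append(position)
        wolves = moved  # Step (6): next iteration re-evaluates the moved wolves.
    # Step (7): return the best solution found.
    best = min(wolves + leaders, key=fitness)
    return best, fitness(best)


# Hypothetical usage: minimize the 5-dimensional sphere function.
sphere = lambda x: sum(v * v for v in x)
best, value = grey_wolf_optimize(sphere, dim=5, bounds=(-10.0, 10.0))
```

Keeping the previous leaders in the ranking is a design choice of this sketch: it ensures that the best solution found so far is never lost when the whole pack moves.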
2.1.2. Basic Grey Wolf Optimization Algorithm
2.1.3. The Improved Grey Wolf Optimization Algorithm
2.2. Anomaly Detection Model of Network Dataflow Based on VGG16
- (1) Input layer: Receives the pixel values of the input image.
- (2) Convolutional layers: The VGG16 model consists of 13 convolutional layers, each of which uses 3 × 3 convolutional kernels followed by the ReLU activation function for non-linear transformation. These layers extract features from the input image.
- (3) Pooling layers: The convolutional layers are grouped into five blocks, and after each block the VGG16 model applies a 2 × 2 max-pooling layer to downsample the feature maps, reducing their spatial dimensions while retaining the most prominent features.
- (4) Fully connected layers: The VGG16 model contains three fully connected layers. The first two comprise 4096 neurons each, and the third has one neuron per class (1000 in the original ImageNet model). These layers convert the feature maps into class scores.
- (5) Softmax layer: Following the last fully connected layer is the Softmax layer, which maps the output of the network to a probability distribution over classes.
- (6) Output layer: The output layer provides the final classification result.
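
The layer structure above can be checked with a short Python sketch that walks through the VGG16 configuration and counts layers, feature-map sizes, and parameters. It is an illustration only: the 224 × 224 × 3 input and the 1000 output classes are the defaults of the original VGG16 network (Simonyan and Zisserman), not the settings of the anomaly detection model, and the names `VGG16_CFG` and `vgg16_summary` are hypothetical.

```python
# Output channels of each 3x3 convolution; "M" marks a 2x2 max-pooling layer.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_summary(input_size=224, in_channels=3, num_classes=1000):
    """Count layers and parameters of VGG16 for a square input image."""
    size, channels, params = input_size, in_channels, 0
    n_conv = n_pool = 0
    for item in VGG16_CFG:
        if item == "M":
            size //= 2  # 2x2 max pooling with stride 2 halves height and width
            n_pool += 1
        else:
            # 3x3 kernel weights plus one bias per output channel;
            # padding of 1 keeps the spatial size unchanged.
            params += 3 * 3 * channels * item + item
            channels = item
            n_conv += 1
    flattened = size * size * channels
    n_in = flattened
    # Two hidden fully connected layers of 4096 units, then one unit per class.
    for width in (4096, 4096, num_classes):
        params += n_in * width + width
        n_in = width
    return {"conv_layers": n_conv, "pool_layers": n_pool,
            "flattened": flattened, "parameters": params}


summary = vgg16_summary()
```

For a different number of traffic classes, only `num_classes` changes; the convolutional part of the count stays the same.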
3. Results
3.1. Datasets
3.1.1. KDD99
3.1.2. UNSW-NB15
3.2. Experiment Settings
3.3. Evaluation Metrics and Methods
3.4. Experimental Results of Optimizing the Grey Wolf Algorithm
3.5. Experimental Results of Anomaly Detection on the KDD99 Dataset
3.6. Experimental Results of Anomaly Detection on the UNSW-NB15 Dataset
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Xia, X.; Bhatt, N.P.; Khajepour, A.; Hashemi, E. Integrated Inertial-LiDAR-Based Map Matching Localization for Varying Environments. IEEE Trans. Intell. Veh. 2023, 1–12.
2. Liu, W.; Quijano, K.; Crawford, M.M. YOLOv5-Tassel: Detecting Tassels in RGB UAV Imagery with Improved YOLOv5 Based on Transfer Learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 8085–8094.
3. Meng, Z.; Xia, X.; Xu, R.; Liu, W.; Ma, J. HYDRO-3D: Hybrid Object Detection and Tracking for Cooperative Perception Using 3D LiDAR. IEEE Trans. Intell. Veh. 2023, 1–13.
4. Xia, X.; Meng, Z.; Han, X.; Li, H.; Tsukiji, T.; Xu, R.; Zheng, Z.; Ma, J. An automated driving systems data acquisition and analytics platform. Transp. Res. Part C Emerg. Technol. 2023, 151, 104120.
5. Gao, C.; Wang, G.; Shi, W.; Wang, Z.; Chen, Y. Autonomous Driving Security: State of the Art and Challenges. IEEE Internet Things J. 2021, 9, 7572–7595.
6. Bogdoll, D.; Nitsche, M.; Zöllner, J.M. Anomaly detection in autonomous driving: A survey. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–22 June 2022; pp. 4488–4499.
7. Kim, J.; Kim, J.; Kim, H.; Shim, M.; Choi, E. CNN-based network intrusion detection against denial-of-service attacks. Electronics 2020, 9, 916.
8. Kanna, P.R.; Santhi, P. Unified Deep Learning approach for Efficient Intrusion Detection System using Integrated Spatial–Temporal Features. Knowl.-Based Syst. 2021, 226, 107132.
9. Thakkar, A.; Lohiya, R. Fusion of statistical importance for feature selection in Deep Neural Network-based Intrusion Detection System. Inf. Fusion 2023, 90, 353–363.
10. Shahin, M.; Chen, F.F.; Hosseinzadeh, A.; Bouzary, H.; Rashidifar, R. A deep hybrid learning model for detection of cyber attacks in industrial IoT devices. Int. J. Adv. Manuf. Technol. 2022, 123, 1973–1983.
11. Tian, Z.; Luo, C.; Qiu, J.; Du, X.; Guizani, M. A distributed deep learning system for web attack detection on edge devices. IEEE Trans. Ind. Inform. 2019, 16, 1963–1971.
12. Yu, X.; Yang, X.; Tan, Q.; Shan, C.; Lv, Z. An edge computing based anomaly detection method in IoT industrial sustainability. Appl. Soft Comput. 2022, 128, 109486.
13. Li, Z.; Qin, Z.; Huang, K.; Yang, X.; Ye, S. Intrusion detection using convolutional neural networks for representation learning. In Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, 14–18 November 2017, Proceedings, Part V; Springer International Publishing: Cham, Switzerland, 2017; pp. 858–866.
14. Wang, W.; Zhu, M.; Zeng, X.; Ye, X.; Sheng, Y. Malware traffic classification using convolutional neural network for representation learning. In Proceedings of the 2017 International Conference on Information Networking (ICOIN), IEEE, Da Nang, Vietnam, 11–13 January 2017; pp. 712–717.
15. Garg, S.; Kaur, K.; Kumar, N.; Kaddoum, G.; Zomaya, A.Y.; Ranjan, R. A hybrid deep learning-based model for anomaly detection in cloud data center networks. IEEE Trans. Netw. Serv. Manag. 2019, 16, 924–935.
16. Garg, S.; Kaur, K.; Kumar, N.; Rodrigues, J.J.P.C. Hybrid Deep-Learning-Based Anomaly Detection Scheme for Suspicious Flow Detection in SDN: A Social Multimedia Perspective. IEEE Trans. Multimed. 2019, 21, 566–578.
17. Muneer, A.; Taib, S.M.; Fati, S.M.; Balogun, A.O.; Aziz, I.A. A Hybrid Deep Learning-Based Unsupervised Anomaly Detection in High Dimensional Data. Comput. Mater. Contin. 2022, 70.
18. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
19. Faris, H.; Aljarah, I.; Al-Betar, M.A.; Mirjalili, S. Grey wolf optimizer: A review of recent variants and applications. Neural Comput. Appl. 2017, 30, 413–435.
20. Meidani, K.; Hemmasian, A.; Mirjalili, S.; Barati Farimani, A. Adaptive grey wolf optimizer. Neural Comput. Appl. 2022, 34, 7711–7731.
21. Wang, J.S.; Li, S.X. An improved grey wolf optimizer based on differential evolution and elimination mechanism. Sci. Rep. 2019, 9, 7181.
22. Yidan, L.; Yanli, C.; Runze, C.; Lan, Y.; Fangming, R. An Encryption Traffic Classification Method Based on ResNeXt. In Proceedings of the 2021 IEEE 15th International Conference on Anti-counterfeiting, Security, and Identification (ASID), IEEE, Xiamen, China, 29–31 October 2021; pp. 47–52.
23. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377.
24. Qin, Z.; Lu, X.; Nie, X.; Liu, D.; Yin, Y.; Wang, W. Coarse-to-Fine Video Instance Segmentation with Factorized Conditional Appearance Flows. IEEE/CAA J. Autom. Sin. 2023, 10, 1192–1208.
25. Lu, X.; Wang, W.; Shen, J.; Crandall, D.; Luo, J. Zero-Shot Video Object Segmentation with Co-Attention Siamese Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 2228–2242.
26. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
27. Chaudhuri, S.; Madigan, D.; Fayyad, U. KDD-99: The fifth ACM SIGKDD international conference on knowledge discovery and data mining. ACM SIGKDD Explor. Newsl. 2000, 1, 49–51.
28. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48.
29. Khamis, R.A.; Matrawy, A. Evaluation of adversarial training on different types of neural networks in deep learning-based IDSs. In Proceedings of the IEEE ISNCC 2020: 2020 IEEE International Symposium on Networks, Computers and Communications, Montreal, QC, Canada, 20–22 October 2020; pp. 1–6.
30. Gneiting, T.; Raftery, A.E. Strictly Proper Scoring Rules, Prediction, and Estimation. J. Am. Stat. Assoc. 2007, 102, 359–378.
31. Sharma, N.; Mukherjee, S. A Novel Multi-Classifier Layered Approach to Improve Minority Attack Detection in IDS. Procedia Technol. 2012, 6, 913–921.
32. Pandeeswari, N.; Kumar, G. Anomaly Detection System in Cloud Environment Using Fuzzy Clustering Based ANN. Mob. Netw. Appl. 2016, 21, 494–505.
33. Guo, C.; Ping, Y.; Liu, N.; Luo, S.-S. A two-level hybrid approach for intrusion detection. Neurocomputing 2016, 214, 391–400.
34. Kasongo, S.M.; Sun, Y. Performance Analysis of Intrusion Detection Systems Using a Feature Selection Method on the UNSW-NB15 Dataset. J. Big Data 2020, 7, 1–20.
35. Roy, A.; Singh, K.J. Multi-classification of UNSW-NB15 Dataset for Network Anomaly Detection System. In Proceedings of International Conference on Communication and Computational Technologies, Algorithms for Intelligent Systems; Purohit, S., Singh Jat, D., Poonia, R., Kumar, S., Hiranwal, S., Eds.; Springer: Singapore, 2020; pp. 429–451.
36. Kasongo, S.M.; Sun, Y. A deep learning method with wrapper based feature extraction for wireless intrusion detection system. Comput. Secur. 2020, 92, 101752.
37. Eunice, A.D.; Gao, Q.; Zhu, M.-Y.; Chen, Z.; Na, L. Network anomaly detection technology based on deep learning. In Proceedings of the 2021 IEEE 3rd International Conference on Frontiers Technology of Information and Computer (ICFTIC), IEEE, Greenville, SC, USA, 12–14 November 2021; pp. 6–9.
38. Moustafa, N.; Slay, J. The evaluation of Network Anomaly Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf. Secur. J. A Glob. Perspect. 2016, 25, 18–31.
No. | Name | Description | Selected (Traditional Algorithm) | Selected (Improved Algorithm) |
---|---|---|---|---|
1 | Duration | Duration of the connection | √ | √ |
2 | Protocol_type | Protocol type of the connection: TCP, UDP, ICMP | √ | √ |
3 | Service | Network service type: HTTP, FTP, SMTP, etc. | √ | √ |
4 | Flag | Status flag of the connection | √ | √ |
5 | Src_bytes | Number of bytes sent in the connection | √ | - |
6 | Dst_bytes | Number of bytes received in the connection | √ | - |
7 | Land | Whether the source IP and port are the same as the destination IP and port | √ | √ |
8 | Wrong_fragment | Number of wrong fragments in the connection | √ | √ |
9 | Urgent | Number of urgent packets in the connection | √ | √ |
10 | Hot | Number of "hot" indicators in the connection | √ | √ |
11 | Num_failed_logins | Number of failed login attempts in the connection | √ | √ |
12 | Logged_in | Whether the login succeeded | √ | √ |
13 | Num_compromised | Number of "compromised" conditions in the connection | √ | √ |
14 | Root_shell | Whether a root shell was obtained | √ | - |
15 | Su_attempted | Whether an attempt was made to authenticate as superuser | √ | √ |
16 | Num_root | Number of root accesses in the connection | √ | √ |
17 | Num_file_creations | Number of file creation operations in the connection | √ | √ |
18 | Num_shells | Number of shell prompts in the connection | √ | √ |
19 | Num_access_files | Number of operations on access control files | √ | √ |
20 | Num_outbound_cmds | Number of outbound commands in an FTP session | - | √ |
21 | Is_hot_login | Whether the user is logging in as root or administrator | √ | √ |
22 | Is_guest_login | Whether the login is a guest login | √ | √ |
23 | Count | Number of connections to the same destination IP | √ | √ |
24 | Srv_count | Number of connections to the same destination port | √ | √ |
25 | Serror_rate | Rate of connections with SYN errors | √ | √ |
26 | Srv_serror_rate | Rate of connections with SYN errors among connections to the current service | √ | √ |
27 | Rerror_rate | Rate of rejected connections | √ | √ |
28 | Srv_rerror_rate | Rate of rejected connections among connections to the current service | √ | √ |
29 | Same_srv_rate | Rate of connections to the same service as the current connection | √ | √ |
30 | Diff_srv_rate | Rate of connections to services different from the current one | √ | √ |
31 | Srv_diff_host_rate | Rate of connections to the current service that involve different hosts | √ | √ |
32 | Dst_host_count | Number of connections to the same destination host | √ | √ |
33 | Dst_host_srv_count | Number of connections to the same destination host and port | √ | - |
34 | Dst_host_same_srv_rate | Rate of connections to the destination host that use the same service | √ | √ |
35 | Dst_host_diff_srv_rate | Rate of connections to the destination host that use different services | √ | - |
36 | Dst_host_same_src_port_rate | Rate of connections to the destination host that use the same source port | √ | √ |
37 | Dst_host_srv_diff_host_rate | Rate of connections to the same service on the destination host that come from different hosts | - | √ |
38 | Dst_host_serror_rate | Rate of connections to the destination host with SYN errors | √ | √ |
39 | Dst_host_srv_serror_rate | Rate of connections to the same service on the destination host with SYN errors | √ | √ |
40 | Dst_host_rerror_rate | Rate of rejected connections to the destination host | √ | √ |
41 | Dst_host_srv_rerror_rate | Rate of rejected connections to the same service on the destination host | √ | √ |
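
The two selection columns in the table above can be read as boolean masks over the 41 KDD99 features. The following Python sketch encodes the rows marked "-" and applies a mask to a record. The helper names `selection_mask` and `apply_mask` and the dummy record are hypothetical; only the dropped row numbers are taken from the table.

```python
N_FEATURES = 41

# Row numbers marked "-" in the table (1-based feature indices).
TRADITIONAL_DROPPED = {20, 37}
IMPROVED_DROPPED = {5, 6, 14, 33, 35}


def selection_mask(dropped, n_features=N_FEATURES):
    """Return one boolean per feature: True if the feature is kept."""
    return [index not in dropped for index in range(1, n_features + 1)]


def apply_mask(record, mask):
    """Keep only the selected feature values of one record."""
    return [value for value, keep in zip(record, mask) if keep]


traditional_mask = selection_mask(TRADITIONAL_DROPPED)
improved_mask = selection_mask(IMPROVED_DROPPED)

# Dummy record whose values are just the feature numbers 1..41.
record = list(range(1, N_FEATURES + 1))
reduced = apply_mask(record, improved_mask)
```

Counting the `True` entries gives 39 selected features for the traditional algorithm and 36 for the improved algorithm.
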
Methods | Detection Rate | False Positive Rate | Precision | F-Score |
---|---|---|---|---|
Sharma et al. [31] | 93.41 | 0.275 | 99.05 | 93 |
Pandeeswari et al. [32] | 98 | 3.05 | - | 83.20 |
Guo et al. [33] | 91.86 | 0.78 | 93.29 | - |
Proposed Model | 98.6 | 0.278 | 94.86 | 92.24 |
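
The four metrics in the table above are commonly derived from the counts of a binary confusion matrix. The sketch below uses the usual definitions, assuming that "detection rate" means recall on the attack class; the function name `detection_metrics` and the example counts are hypothetical and are not results from the experiments.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common intrusion detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged as attack  tn: normal traffic passed
    """
    detection_rate = tp / (tp + fn)        # also called recall
    false_positive_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_score = 2 * precision * detection_rate / (precision + detection_rate)
    return {
        "detection_rate": detection_rate,
        "false_positive_rate": false_positive_rate,
        "precision": precision,
        "f_score": f_score,
    }


# Made-up counts, for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

Multiplying each value by 100 gives percentages.
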
Attack Class | Accuracy | Precision | Recall | F-Score |
---|---|---|---|---|
Exploits | 0.921 | 0.763 | 0.596 | 0.669 |
Generic | 0.990 | 0.990 | 0.962 | 0.981 |
Reconnaissance | 0.969 | 0.812 | 0.665 | 0.733 |
Analysis | 0.893 | 0.899 | 0.951 | 0.925 |
Shellcode | 0.794 | 0.817 | 0.987 | 0.890 |
DoS | 0.847 | 0.799 | 0.943 | 0.867 |
Worms | 0.851 | 0.782 | 0.891 | 0.843 |
Fuzzers | 0.789 | 0.761 | 0.994 | 0.862 |
Backdoors | 0.723 | 0.672 | 0.855 | 0.751 |
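
As a worked example of how the columns relate, and assuming the F-score is the usual harmonic mean of precision $P$ and recall $R$, the Exploits row gives:

```latex
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.763 \times 0.596}{0.763 + 0.596}
  = \frac{0.909496}{1.359}
  \approx 0.669
```

This matches the F-score reported for the Exploits class.
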
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, L.; Chen, Q.; Song, C. Anomaly Detection Model of Network Dataflow Based on an Improved Grey Wolf Algorithm and CNN. Electronics 2023, 12, 3787. https://doi.org/10.3390/electronics12183787