TTANAD: Test-Time Augmentation for Network Anomaly Detection
Abstract
1. Introduction
- We introduce TTANAD, a novel method that leverages test-time augmentation to improve the performance of network anomaly detection tasks, across various anomaly detection algorithms.
- Our work introduces the unique approach of generating synthetic augmentations based on temporal aggregation features at test time, without modifying or retraining the underlying models.
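The general test-time augmentation pattern behind these contributions can be illustrated with a short sketch: a frozen detector scores a test instance together with several augmented variants, and the scores are averaged. This is only an illustration of the pattern, not the TTANAD procedure itself. The names `tta_score`, `make_distance_scorer`, and `jitter` are hypothetical, the distance-to-mean scorer stands in for a real anomaly detector, and the Gaussian jitter stands in for the temporal-aggregation-based augmentations, which this outline does not specify.

```python
import numpy as np


def tta_score(x, score_fn, augment_fn, n_aug=4, rng=None):
    """Average the anomaly score of x and n_aug augmented variants.

    score_fn is an already-fitted detector's scoring function; it is only
    called here, never modified or retrained.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    variants = [x] + [augment_fn(x, rng) for _ in range(n_aug)]
    scores = np.array([score_fn(v) for v in variants])
    return float(scores.mean())


def make_distance_scorer(train):
    """Toy stand-in detector (assumption): distance from the training mean."""
    center = train.mean(axis=0)
    return lambda v: float(np.linalg.norm(v - center))


def jitter(x, rng, scale=0.01):
    """Hypothetical augmenter (assumption): small Gaussian noise on the features."""
    return x + rng.normal(0.0, scale, size=x.shape)


# Usage: score one test instance with and without augmentation.
train = np.zeros((10, 2))
scorer = make_distance_scorer(train)
x_test = np.array([3.0, 4.0])
plain = scorer(x_test)
augmented = tta_score(x_test, scorer, jitter, n_aug=8)
```

Because the detector is only queried, the same wrapper can sit in front of any scorer that maps an instance to an anomaly score.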
2. Related Work
2.1. Network Anomaly Detection
2.1.1. Autoencoder-Based Anomaly Detection
2.1.2. Local Outlier Factor Anomaly Detection
2.1.3. Isolation Forest Anomaly Detection
2.2. Test-Time Augmentation
3. Materials and Methods
3.1. Temporal Aggregation Formulation
3.2. Temporal Aggregation-Based TTA
3.3. Experiments
3.3.1. Data
- CIC-IDS2017: The CIC-IDS2017 [40] dataset comprises eight files containing five days of normal and malicious traffic data. Combining these files yields roughly 3 million instances with 83 features and 15 labels: 1 normal and 14 attack labels.
- CSE-CIC-IDS2018: The CSE-CIC-IDS2018 [40] dataset contains about 16 million instances collected over ten days, with roughly 17% of the instances consisting of malicious traffic.
- UNSW-NB15: The UNSW-NB15 [41] dataset was created with the IXIA PerfectStorm tool to generate a hybrid of real modern normal activities and synthetic contemporary attack behaviors. This dataset contains about 2.5 million instances and 49 features.
3.3.2. Preprocessing
- Integrate and Sort: We first loaded the data for the evaluated datasets, concatenating all of the provided files for each dataset and ordering the instances by their timestamp in ascending order.
- Data Cleaning: We removed any duplicate records and checked for inconsistencies in the dataset. If any inconsistencies were found, such as missing values, we either imputed them using appropriate statistical techniques or removed the corresponding records.
- Temporal Feature Extraction: We slide a window of size s over the sorted instances with a stride equal to s, so that consecutive windows do not overlap, and aggregate the features of the instances in each window. The aggregation extracts the minimum, maximum, and standard deviation of each feature.
- Feature Scaling: To ensure that all features were on the same scale, we standardized the numerical features using z-score normalization.
- Data Splitting: We split the data chronologically into training and test sets using a 70–30% ratio, respectively.
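The preprocessing steps above can be sketched as follows. This is a minimal NumPy illustration rather than the authors' code. The function names are hypothetical, and several choices are assumptions: rows with missing values are dropped instead of imputed, a trailing partial window is discarded, and the z-score statistics are computed on the training portion only and then applied to the test portion.

```python
import numpy as np


def aggregate_windows(X, s):
    """Non-overlapping windows (size s, stride s): min, max, std per feature."""
    n_windows = X.shape[0] // s  # assumption: a trailing partial window is dropped
    W = X[: n_windows * s].reshape(n_windows, s, X.shape[1])
    return np.concatenate([W.min(axis=1), W.max(axis=1), W.std(axis=1)], axis=1)


def preprocess(timestamps, X, s, train_frac=0.7):
    """Clean, sort, aggregate, split chronologically, and standardize."""
    data = np.column_stack([timestamps, X]).astype(float)

    # Data cleaning (assumption: drop rows with missing values, then duplicates).
    data = data[~np.isnan(data).any(axis=1)]
    _, first = np.unique(data, axis=0, return_index=True)
    data = data[np.sort(first)]

    # Integrate and sort: ascending timestamp (column 0).
    data = data[np.argsort(data[:, 0], kind="stable")]

    # Temporal feature extraction over the feature columns.
    agg = aggregate_windows(data[:, 1:], s)

    # Chronological split: the first train_frac of the windows form the training set.
    cut = int(agg.shape[0] * train_frac)
    train, test = agg[:cut], agg[cut:]

    # Z-score scaling (assumption: statistics come from the training set only).
    mu, sigma = train.mean(axis=0), train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # guard against constant features
    return (train - mu) / sigma, (test - mu) / sigma


# Usage on a tiny synthetic example with one duplicated record.
ts = np.array([3, 1, 2, 0, 3, 4, 5, 6, 7], dtype=float)
features = np.column_stack([ts, 2 * ts])
train_set, test_set = preprocess(ts, features, s=2)
```

Each aggregated row has three times as many columns as the input, since every original feature contributes a minimum, a maximum, and a standard deviation.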
3.3.3. Compared Algorithms
3.3.4. Evaluated Anomaly Detectors
3.3.5. Experimental Setup
4. Results
Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
TTANAD | Test-Time Augmentation for Network Anomaly Detection |
NIDS | Network Intrusion Detection Systems |
TTA | Test-Time Augmentation |
ID | Intrusion Detection |
References
1. Li, Y.; Ma, R.; Jiao, R. A hybrid malicious code detection method based on deep learning. Int. J. Secur. Appl. 2015, 9, 205–216.
2. Kwon, D.; Kim, H.; Kim, J.; Suh, S.C.; Kim, I.; Kim, K.J. A survey of deep learning-based network anomaly detection. Clust. Comput. 2019, 22, 949–961.
3. Fernandes, G.; Rodrigues, J.J.; Carvalho, L.F.; Al-Muhtadi, J.F.; Proença, M.L. A comprehensive survey on network anomaly detection. Telecommun. Syst. 2019, 70, 447–489.
4. Garcia-Teodoro, P.; Diaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput. Secur. 2009, 28, 18–28.
5. Zhang, J.; Zulkernine, M. Anomaly based network intrusion detection with unsupervised outlier detection. In Proceedings of the 2006 IEEE International Conference on Communications, Istanbul, Turkey, 11–15 June 2006; Volume 5, pp. 2388–2393.
6. Xin, Y.; Kong, L.; Liu, Z.; Chen, Y.; Li, Y.; Zhu, H.; Gao, M.; Hou, H.; Wang, C. Machine learning and deep learning methods for cybersecurity. IEEE Access 2018, 6, 35365–35381.
7. Su, L.; Yao, Y.; Li, N.; Liu, J.; Lu, Z.; Liu, B. Hierarchical Clustering Based Network Traffic Data Reduction for Improving Suspicious Flow Detection. In Proceedings of the 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), New York, NY, USA, 1–3 August 2018; pp. 744–753.
8. Jiang, K.; Wang, W.; Wang, A.; Wu, H. Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access 2020, 8, 32464–32476.
9. Wang, Q.; Ouyang, X.; Zhan, J. A classification algorithm based on data clustering and data reduction for intrusion detection system over big data. KSII Trans. Internet Inf. Syst. (TIIS) 2019, 13, 3714–3732.
10. Liu, L.; Wang, P.; Lin, J.; Liu, L. Intrusion detection of imbalanced network traffic based on machine learning and deep learning. IEEE Access 2020, 9, 7550–7563.
11. Brauckhoff, D.; Salamatian, K.; May, M. A signal processing view on packet sampling and anomaly detection. In Proceedings of the 2010 IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–9.
12. Shanmugam, D.; Blalock, D.; Balakrishnan, G.; Guttag, J. When and Why Test-Time Augmentation Works. arXiv 2020, arXiv:2011.11156.
13. Mikołajczyk, A.; Grochowski, M. Data augmentation for improving deep learning in image classification problem. In Proceedings of the 2018 International Interdisciplinary Ph.D. Workshop (IIPhDW), Swinoujscie, Poland, 9–12 May 2018; pp. 117–122.
14. Wang, G.; Li, W.; Aertsen, M.; Deprest, J.; Ourselin, S.; Vercauteren, T. Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks. Neurocomputing 2019, 338, 34–45.
15. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
16. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58.
17. Chalapathy, R.; Chawla, S. Deep learning for anomaly detection: A survey. arXiv 2019, arXiv:1901.03407.
18. Dau, H.A.; Ciesielski, V.; Song, A. Anomaly detection using replicator neural networks trained on examples of one class. In Proceedings of the Asia-Pacific Conference on Simulated Evolution and Learning, Dunedin, New Zealand, 15–18 December 2014; pp. 311–322.
19. Farahnakian, F.; Heikkonen, J. A deep auto-encoder based approach for intrusion detection system. In Proceedings of the 20th International Conference on Advanced Communication Technology (ICACT), Online, Republic of Korea, 11–14 February 2018; pp. 178–183.
20. Azmin, S.; Islam, A.M.A.A. Network intrusion detection system based on conditional variational laplace autoencoder. In Proceedings of the 7th International Conference on Networking, Systems and Security, Dhaka, Bangladesh, 22–24 December 2020; pp. 82–88.
21. Yang, L.; Song, Y.; Gao, S.; Hu, A.; Xiao, B. Griffin: Real-time network intrusion detection system via ensemble of autoencoder in SDN. IEEE Trans. Netw. Serv. Manag. 2022, 19, 2269–2281.
22. Li, X.; Chen, W.; Zhang, Q.; Wu, L. Building auto-encoder intrusion detection system based on random forest feature selection. Comput. Secur. 2020, 95, 101851.
23. Rao, K.N.; Rao, K.V.; PVGD, P.R. A hybrid intrusion detection system based on sparse autoencoder and deep neural network. Comput. Commun. 2021, 180, 77–88.
24. Muhammad, G.; Hossain, M.S.; Garg, S. Stacked autoencoder-based intrusion detection system to combat financial fraudulent. IEEE Internet Things J. 2020, 10, 2071–2078.
25. Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104.
26. Gulhare, A.K.; Badholia, A.; Sharma, A. Mean-Shift and Local Outlier Factor-Based Ensemble Machine Learning Approach for Anomaly Detection in IoT Devices. In Proceedings of the International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 20–22 July 2022; pp. 649–656.
27. Omar, M. Malware Anomaly Detection Using Local Outlier Factor Technique. In Machine Learning for Cybersecurity: Innovative Deep Learning Solutions; Springer: Berlin/Heidelberg, Germany, 2022; pp. 37–48.
28. Tang, J.; Ngan, H.Y. Traffic outlier detection by density-based bounded local outlier factors. Inf. Technol. Ind. 2016, 4.
29. Auskalnis, J.; Paulauskas, N.; Baskys, A. Application of local outlier factor algorithm to detect anomalies in computer network. Elektron. Elektrotechnika 2018, 24, 96–99.
30. Madhupriya, G.; Shalinie, S.M.; Rajeshwari, A.R. Detecting DDoS attack in cloud computing using local outlier factors. In Proceedings of the 2nd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 11–12 May 2018; pp. 859–863.
31. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
32. Shukla, A.K.; Srivastav, S.; Kumar, S.; Muhuri, P.K. UInDeSI4.0: An efficient Unsupervised Intrusion Detection System for network traffic flow in Industry 4.0 ecosystem. Eng. Appl. Artif. Intell. 2023, 120, 105848.
33. AbuAlghanam, O.; Alazzam, H.; Alhenawi, E.; Qatawneh, M.; Adwan, O. Fusion-based anomaly detection system using modified isolation forest for internet of things. J. Ambient. Intell. Humaniz. Comput. 2022, 14, 1–15.
34. Chiba, Z.; Abghour, N.; Moussaid, K.; Omri, A.E.; Rida, M. Newest collaborative and hybrid network intrusion detection framework based on suricata and isolation forest algorithm. In Proceedings of the 4th International Conference on Smart City Applications, Casablanca, Morocco, 2–4 October 2019; pp. 1–11.
35. Laskar, M.T.R.; Huang, J.X.; Smetana, V.; Stewart, C.; Pouw, K.; An, A.; Chan, S.; Liu, L. Extending isolation forest for anomaly detection in big data via K-means. ACM Trans. Cyber-Phys. Syst. (TCPS) 2021, 5, 1–26.
36. Ripan, R.C.; Sarker, I.H.; Anwar, M.M.; Furhad, M.H.; Rahat, F.; Hoque, M.M.; Sarfraz, M. An isolation forest learning based outlier detection approach for effectively classifying cyber anomalies. In Proceedings of the Hybrid Intelligent Systems: 20th International Conference on Hybrid Intelligent Systems (HIS 2020), Virtual, 14–16 December 2020; pp. 270–279.
37. Cohen, S.; Goldshlager, N.; Rokach, L.; Shapira, B. Boosting Anomaly Detection Using Unsupervised Diverse Test-Time Augmentation. Inf. Sci. 2023, 626, 821–836.
38. Cohen, S.; Dagan, N.; Cohen-Inger, N.; Ofer, D.; Rokach, L. ICU survival prediction incorporating test-time augmentation to improve the accuracy of ensemble-based models. IEEE Access 2021, 9, 91584–91592.
39. Lesti, G.; Spiegel, S. A Sliding Window Filter for Time Series Streams. In Proceedings of the IOTSTREAMING@ PKDD/ECML, Skopje, Macedonia, 18–22 September 2017.
40. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp 2018, 1, 108–116.
41. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6.
42. Sakurada, M.; Yairi, T. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In Proceedings of the MLSDA 2nd Workshop on Machine Learning for Sensory Data Analysis, Gold Coast, Australia, 2 December 2014; pp. 4–11.
43. Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data (TKDD) 2012, 6, 1–39.
44. Soule, A.; Salamatian, K.; Taft, N. Combining filtering and statistical methods for anomaly detection. In Proceedings of the 5th ACM SIGCOMM Conference on Internet Measurement, Berkeley, CA, USA, 19–21 October 2005; p. 31.
45. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. A detailed analysis of the cicids2017 data set. In Proceedings of the International Conference on Information Systems Security and Privacy, San Francisco, CA, USA, 24 May 2018; pp. 172–188.
T denotes the temporal aggregation window.

Anomaly Detector | Method | CIC-IDS2017 (T = 3) | CIC-IDS2017 (T = 4) | CIC-IDS2017 (T = 5) | CSE-CIC-IDS2018 (T = 3) | CSE-CIC-IDS2018 (T = 4) | CSE-CIC-IDS2018 (T = 5) | UNSW-NB15 (T = 3) | UNSW-NB15 (T = 4) | UNSW-NB15 (T = 5) |
---|---|---|---|---|---|---|---|---|---|---|
Autoencoder | w/o TTA | 0.754 | 0.769 | 0.767 | 0.925 | 0.921 | 0.922 | 0.989 | 0.987 | 0.985 |
Autoencoder | TTANAD | 0.760 | 0.782 | 0.773 | 0.930 | 0.926 | 0.929 | 0.995 | 0.994 | 0.992 |
Isolation Forest | w/o TTA | 0.813 | 0.813 | 0.870 | 0.948 | 0.948 | 0.944 | 0.889 | 0.921 | 0.909 |
Isolation Forest | TTANAD | 0.819 | 0.813 | 0.884 | 0.950 | 0.950 | 0.947 | 0.922 | 0.946 | 0.930 |
Local Outlier Factor | w/o TTA | 0.654 | 0.675 | 0.701 | 0.665 | 0.651 | 0.708 | 0.640 | 0.937 | 0.836 |
Local Outlier Factor | TTANAD | 0.6984 | 0.764 | 0.747 | 0.684 | 0.666 | 0.710 | 0.649 | 0.941 | 0.887 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Cohen, S.; Goldshlager, N.; Shapira, B.; Rokach, L. TTANAD: Test-Time Augmentation for Network Anomaly Detection. Entropy 2023, 25, 820. https://doi.org/10.3390/e25050820