Multi-Classification and Tree-Based Ensemble Network for the Intrusion Detection System in the Internet of Vehicles
Abstract
1. Introduction
1. We investigate how different data balancing approaches improve IDS performance. We employ a hybrid approach that combines the Synthetic Minority Over-sampling Technique (SMOTE) with RandomUnderSampler to achieve a balanced class distribution, which addresses the class imbalance problem. We validate the efficacy of this approach on the CICIDS2017 dataset [11]. Moreover, we use an ML-based approach for feature selection to reduce feature dimensionality, which substantially lowers the computational overhead.
2. We propose an adaptive tree-based ensemble network as the intrusion detection engine of the IDS. It performs accurate and efficient multiclass classification of network traffic data originating from in-vehicle networks (IVNs) and external networks. The intrusion detection model employs a deep-layer structure in which diverse ML models are stacked as layers and interconnected in a cascading manner. This model enables precise and efficient identification of various cyber-attacks, thus safeguarding both IoV systems and ICVs against a variety of cyber threats.
3. We evaluate the IDS on two datasets: CICIDS2017, which is widely recognized in network intrusion detection, and the Car-Hacking dataset [12], which comes from the field of IoV security. The experiments include a comprehensive comparison against state-of-the-art techniques. The proposed IDS achieves an F1-score of 0.965 on the CICIDS2017 dataset and 0.9999 on the Car-Hacking dataset. These results demonstrate the advantage of the proposed IDS in multiclass classification of cyber-attacks originating from both the external network and the IVNs in the IoV.
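The layered, cascading arrangement described in the second contribution can be illustrated with a minimal sketch. It assumes a deep-forest-style cascade in which each layer's class probabilities are appended to the original features before the next layer is trained; the source does not spell out this wiring. The names `make_layer`, `fit_cascade`, and `predict_cascade`, the choice of two tree ensembles per layer, the synthetic data, and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split


def make_layer(seed):
    """One cascade layer: two different tree ensembles (an illustrative choice)."""
    return [
        RandomForestClassifier(n_estimators=30, random_state=seed),
        ExtraTreesClassifier(n_estimators=30, random_state=seed),
    ]


def fit_cascade(X, y, n_layers=2, seed=0):
    """Fit layers in sequence; each layer sees X plus the previous layer's probabilities."""
    layers, features = [], X
    for depth in range(n_layers):
        models = make_layer(seed + depth)
        # Out-of-fold probabilities keep the next layer from training on leaked labels.
        probas = [
            cross_val_predict(m, features, y, cv=3, method="predict_proba")
            for m in models
        ]
        for m in models:
            m.fit(features, y)
        layers.append(models)
        features = np.hstack([X] + probas)
    return layers


def predict_cascade(layers, X):
    """Pass X through every layer and average the last layer's probabilities."""
    features = X
    for models in layers:
        probas = [m.predict_proba(features) for m in models]
        features = np.hstack([X] + probas)
    # Labels are 0..k-1 here, so the argmax column index equals the class label.
    return np.mean(probas, axis=0).argmax(axis=1)


# Synthetic three-class data stands in for network traffic features.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
cascade = fit_cascade(X_train, y_train)
y_pred = predict_cascade(cascade, X_test)
accuracy = float((y_pred == y_test).mean())
```

Using out-of-fold probabilities as the extra features is a design choice made for this sketch: it prevents a later layer from simply memorizing the training labels through the earlier layer's outputs.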
2. Related Work
3. Methodology
3.1. System Design
3.2. Data Processing
3.2.1. Data Collection
3.2.2. Data Cleaning
3.2.3. Feature Selection (FS)
3.2.4. Data Normalization
3.2.5. Data Balancing
3.3. The Proposed Intrusion Detection Model
3.3.1. The Inspiration
3.3.2. The Model Structure
3.3.3. ML Models
4. Experiments
4.1. Dataset Description
4.1.1. CICIDS2017 Dataset
4.1.2. Car-Hacking Dataset
4.2. Evaluation Metrics
4.3. Results and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
References
1. Cho, K.-T.; Shin, K.G. Fingerprinting Electronic Control Units for Vehicle Intrusion Detection. In Proceedings of the 25th USENIX Security Symposium (USENIX Security 16), Austin, TX, USA, 10–12 August 2016; pp. 911–927.
2. Woo, S.; Jo, H.J.; Lee, D.H. A Practical Wireless Attack on the Connected Car and Security Protocol for In-Vehicle CAN. IEEE Trans. Intell. Transp. Syst. 2015, 16, 993–1006.
3. Miller, C.; Valasek, C. A Survey of Remote Automotive Attack Surfaces. Black Hat USA 2014, 2014, 94.
4. Koscher, K.; Czeskis, A.; Roesner, F.; Patel, S.; Kohno, T.; Checkoway, S.; McCoy, D.; Kantor, B.; Anderson, D.; Shacham, H.; et al. Experimental Security Analysis of a Modern Automobile. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, Washington, DC, USA, 16–19 May 2010; pp. 447–462.
5. Miller, C.; Valasek, C. Remote exploitation of an unaltered passenger vehicle. Black Hat USA 2015, 2015, 1–91.
6. Lv, S.; Nie, S.; Liu, L.; Lu, W. Car hacking research: Remote attack Tesla motors. Keen Secur. Lab Tencent Sl 2016. Available online: https://keenlab.tencent.com/en/2016/09/19/Keen-Security-Lab-of-Tencent-Car-Hacking-Research-Remote-Attack-to-Tesla-Cars/ (accessed on 19 September 2016).
7. Kumar, M.; Hanumanthappa, M.; Kumar, T.V.S. Intrusion Detection System using decision tree algorithm. In Proceedings of the 2012 IEEE 14th International Conference on Communication Technology, Chengdu, China, 9–11 November 2012; pp. 629–634.
8. Zhang, H.; Dai, S.; Li, Y.; Zhang, W. Real-time Distributed-Random-Forest-Based Network Intrusion Detection System Using Apache Spark. In Proceedings of the 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC), Orlando, FL, USA, 17–19 November 2018; pp. 1–7.
9. Iman, A.N.; Ahmad, T. Improving Intrusion Detection System by Estimating Parameters of Random Forest in Boruta. In Proceedings of the 2020 International Conference on Smart Technology and Applications (ICoSTA), Surabaya, Indonesia, 20 February 2020; pp. 1–6.
10. Waskle, S.; Parashar, L.; Singh, U. Intrusion Detection System Using PCA with Random Forest Approach. In Proceedings of the 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2–4 July 2020; pp. 803–808.
11. Sharafaldin, I.; Habibi Lashkari, A.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy, Funchal, Madeira, Portugal, 22–24 January 2018; SCITEPRESS—Science and Technology Publications: Funchal, Madeira, Portugal, 2018; pp. 108–116.
12. Song, H.M.; Woo, J.; Kim, H.K. In-vehicle network intrusion detection using deep convolutional neural network. Veh. Commun. 2020, 21, 100198.
13. Chitrakar, R.; Huang, C. Selection of Candidate Support Vectors in incremental SVM for network intrusion detection. Comput. Secur. 2014, 45, 231–241.
14. Canbay, Y.; Sagiroglu, S. A Hybrid Method for Intrusion Detection. In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 9–11 December 2015; pp. 156–161.
15. Khalvati, L.; Keshtgary, M.; Rikhtegar, N. Intrusion Detection based on a Novel Hybrid Learning Approach. J. AI Data Min. 2018, 6, 157–162.
16. Peng, K.; Leung, V.C.M.; Zheng, L.; Wang, S.; Huang, C.; Lin, T. Intrusion Detection System Based on Decision Tree over Big Data in Fog Environment. Wirel. Commun. Mob. Comput. 2018, 2018, e4680867.
17. Diro, A.; Chilamkurti, N. Leveraging LSTM Networks for Attack Detection in Fog-to-Things Communications. IEEE Commun. Mag. 2018, 56, 124–130.
18. Samy, A.; Yu, H.; Zhang, H. Fog-Based Attack Detection Framework for Internet of Things Using Deep Learning. IEEE Access 2020, 8, 74571–74585.
19. Labiod, Y.; Amara Korba, A.; Ghoualmi, N. Fog Computing-Based Intrusion Detection Architecture to Protect IoT Networks. Wirel. Pers. Commun. 2022, 125, 231–259.
20. Li, S.; Lu, Y.; Li, J. CAD-IDS: A Cooperative Adaptive Distributed Intrusion Detection System with Fog Computing. In Proceedings of the 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Hangzhou, China, 4–6 May 2022; pp. 635–640.
21. Singh, P.; Kaur, A.; Aujla, G.S.; Batth, R.S.; Kanhere, S. Daas: Dew computing as a service for intelligent intrusion detection in edge-of-things ecosystem. IEEE Internet Things J. 2020, 8, 12569–12577.
22. Rahman, S.A.; Tout, H.; Talhi, C.; Mourad, A. Internet of things intrusion detection: Centralized, on-device, or federated learning? IEEE Netw. 2020, 34, 310–317.
23. Zhao, R.; Yin, Y.; Shi, Y.; Xue, Z. Intelligent intrusion detection based on federated learning aided long short-term memory. Phys. Commun. 2020, 42, 101157.
24. Ferrag, M.A.; Maglaras, L.; Moschoyiannis, S.; Janicke, H. Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. J. Inf. Secur. Appl. 2020, 50, 102419.
25. Latif, S.; e Huma, Z.; Jamal, S.S.; Ahmed, F.; Ahmad, J.; Zahid, A.; Dashtipour, K.; Aftab, M.U.; Ahmad, M.; Abbasi, Q.H. Intrusion Detection Framework for the Internet of Things Using a Dense Random Neural Network. IEEE Trans. Ind. Inform. 2022, 18, 6435–6444.
26. Thapa, K.N.K.; Duraipandian, N. Malicious Traffic classification Using Long Short-Term Memory (LSTM) Model. Wirel. Pers. Commun. 2021, 119, 2707–2724.
27. Wang, W.; Harrou, F.; Bouyeddou, B.; Senouci, S.-M.; Sun, Y. Cyber-attacks detection in industrial systems using artificial intelligence-driven methods. Int. J. Crit. Infrastruct. Prot. 2022, 38, 100542.
28. Catillo, M.; Pecchia, A.; Villano, U. CPS-GUARD: Intrusion detection for cyber-physical systems and IoT devices using outlier-aware deep autoencoders. Comput. Secur. 2023, 129, 103210.
29. Karim, F.; Majumdar, S.; Darabi, H.; Chen, S. LSTM Fully Convolutional Networks for Time Series Classification. IEEE Access 2018, 6, 1662–1669.
30. Binbusayyis, A.; Vaiyapuri, T. Unsupervised deep learning approach for network intrusion detection combining convolutional autoencoder and one-class SVM. Appl. Intell. 2021, 51, 7094–7108.
31. Ma, T.; Yu, Y.; Wang, F.; Zhang, Q.; Chen, X. A hybrid methodologies for intrusion detection based deep neural network with support vector machine and clustering technique. In Frontier Computing: Theory, Technologies and Applications FC 2016 5; Springer: Berlin/Heidelberg, Germany, 2018; pp. 123–134.
32. Ashraf, J.; Bakhshi, A.D.; Moustafa, N.; Khurshid, H.; Javed, A.; Beheshti, A. Novel Deep Learning-Enabled LSTM Autoencoder Architecture for Discovering Anomalous Events from Intelligent Transportation Systems. IEEE Trans. Intell. Transp. Syst. 2021, 22, 4507–4518.
33. Zaidi, K.; Milojevic, M.B.; Rakocevic, V.; Nallanathan, A.; Rajarajan, M. Host-Based Intrusion Detection for VANETs: A Statistical Approach to Rogue Node Detection. IEEE Trans. Veh. Technol. 2016, 65, 6703–6714.
34. Ali Alheeti, K.M.; McDonald-Maier, K. Intelligent intrusion detection in external communication systems for autonomous vehicles. Syst. Sci. Control Eng. 2018, 6, 48–56.
35. Zhao, R.; Gui, G.; Xue, Z.; Yin, J.; Ohtsuki, T.; Adebisi, B.; Gacanin, H. A Novel Intrusion Detection Method Based on Lightweight Neural Network for Internet of Things. IEEE Internet Things J. 2022, 9, 9960–9972.
36. Yang, L.; Moubayed, A.; Hamieh, I.; Shami, A. Tree-based Intelligent Intrusion Detection System in Internet of Vehicles. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Big Island, HI, USA, 9–13 December 2019; pp. 1–6.
37. Chen, Z.; Simsek, M.; Kantarci, B.; Djukic, P. All Predict Wisest Decides: A Novel Ensemble Method to Detect Intrusive Traffic in IoT Networks. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6.
38. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6.
39. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6.
40. Schwenker, F. Ensemble Methods: Foundations and Algorithms [Book Review]. IEEE Comput. Intell. Mag. 2013, 8, 77–79.
41. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794.
42. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30.
43. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
44. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
45. Rokach, L.; Maimon, O. Decision Trees. In Data Mining and Knowledge Discovery Handbook; Maimon, O., Rokach, L., Eds.; Springer: Boston, MA, USA, 2005; pp. 165–192. ISBN 978-0-387-25465-4.
46. Roopak, M.; Yun Tian, G.; Chambers, J. Deep Learning Models for Cyber Security in IoT Networks. In Proceedings of the 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 7–9 January 2019; pp. 452–457.
47. Belarbi, O.; Khan, A.; Carnelli, P.; Spyridopoulos, T. An Intrusion Detection System based on Deep Belief Networks. In Proceedings of the International Conference on Science of Cyber Security, Shimane, Japan, 10–12 August 2022; Volume 13580, pp. 377–392.
48. Yao, Y.; Su, L.; Lu, Z. DeepGFL: Deep Feature Learning via Graph for Attack Detection on Flow-Based Network Traffic. In Proceedings of the MILCOM 2018—2018 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 29–31 October 2018; pp. 579–584.
49. Alshammari, A.; Zohdy, M.A.; Debnath, D.; Corser, G. Classification Approach for Intrusion Detection in Vehicle Systems. Wirel. Eng. Technol. 2018, 9, 79–94.
50. Ullah, S.; Khan, M.A.; Ahmad, J.; Jamal, S.S.; e Huma, Z.; Hassan, M.T.; Pitropakis, N.; Arshad; Buchanan, W.J. HDL-IDS: A Hybrid Deep Learning Architecture for Intrusion Detection in the Internet of Vehicles. Sensors 2022, 22, 1340.
51. Injadat, M.; Moubayed, A.; Nassif, A.B.; Shami, A. Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection. IEEE Trans. Netw. Serv. Manag. 2021, 18, 1803–1816.
52. Nie, L.; Ning, Z.; Wang, X.; Hu, X.; Cheng, J.; Li, Y. Data-Driven Intrusion Detection for Intelligent Internet of Vehicles: A Deep Convolutional Neural Network-Based Method. IEEE Trans. Netw. Sci. Eng. 2020, 7, 2219–2230.
Categories | Methods | Relevant Model | Innovation/Challenge |
---|---|---|---|
IDSs for general networks | [13] | CSV-ISVM | Binary classification |
IDSs for general networks | [14] | Genetic algorithms, KNN | Multiclass classification; slow training and high memory requirements |
IDSs for general networks | [15] | K-Medoid clustering, SVM, Naïve Bayes classifier | Multiclass classification; more research needed on determining the optimal number of clusters and the initial cluster medoids |
IDSs for general networks | [24,25,26,27,28,29,30,31] | DL-based model | DL models did not perform better than ML-based models in intrusion detection |
IDSs for IoT and IoV | [16] | DT-based | Fully detects four kinds of attacks and twenty-two other kinds of attacks; applicability of the model to the IoV context |
IDSs for IoT and IoV | [17,18,19,20,21] | ML-based model | Distributed architectural approaches; applicability of the model to the IoV context |
IDSs for IoT and IoV | [22,23] | Federated learning | Distributed architectural approach; preserving the security of sensitive IoT information |
IDSs for IVNs | [12] | CNN | Datasets constructed from real vehicles; Limit to additional hardware |
IDSs for IVNs | [32] | DPFL-F2IDS | The challenge of striking a balance between utility and privacy metrics |
IDSs for IVNs | [33] | Statistical-based methods | Less accurate when multiple malicious events occur |
Categories | Before Balancing | After Balancing |
---|---|---|
Benign | 1,221,302 | 800,000 |
DoS/DDoS | 192,161 | 450,000 |
PortScan | 34,383 | 50,000 |
Brute Force | 5131 | 50,000 |
Web Attack | 1271 | 50,000 |
Botnet ARES | 1166 | 55,000 |
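The counts in the table above reflect two operations: the Benign class is reduced (1,221,302 to 800,000), while the attack classes are enlarged (for example, Web Attack from 1271 to 50,000). The following is a minimal pure-Python sketch of both steps on toy data. It is a from-scratch illustration of random under-sampling and of SMOTE's interpolation idea, not the library implementation used in the source; the function names, the neighbour count `k=3`, and the toy class sizes are hypothetical.

```python
import math
import random


def random_undersample(samples, target, rng):
    """Keep `target` samples chosen uniformly at random without replacement."""
    return rng.sample(samples, target)


def smote_oversample(samples, target, rng, k=3):
    """Grow `samples` to `target` points by interpolating toward near neighbours."""
    synthetic = []
    while len(samples) + len(synthetic) < target:
        base = rng.choice(samples)
        # The k nearest minority-class neighbours of the base point (excluding itself).
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return samples + synthetic


rng = random.Random(0)
# Toy two-feature data: a large majority class and a small minority class.
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
minority = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(20)]

balanced_majority = random_undersample(majority, 600, rng)
balanced_minority = smote_oversample(minority, 200, rng)
```

Because every synthetic point lies on a segment between two real minority samples, the enlarged class stays inside the region the original samples occupy instead of containing exact duplicates.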
Methods | F1-Score (%) | Recall (%) | Precision (%) | Accuracy (%) |
---|---|---|---|---|
Baseline | 96.46 | 98.37 | 94.94 | 99.90 |
PCA | 95.57 | 97.24 | 94.21 | 99.90 |
ML-based methods | 96.36 | 98.65 | 94.61 | 99.90 |
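The source does not state which ML-based feature selection method lies behind the last row of the table above. One common option is to rank features by a tree model's importance scores and keep the smallest set whose cumulative importance reaches a threshold. The sketch below assumes that approach purely for illustration; the feature names, importance values, and the 0.9 threshold are hypothetical and not taken from the source.

```python
def select_by_cumulative_importance(importances, threshold=0.9):
    """Return the top-ranked features whose normalized importances sum to >= threshold."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected


# Hypothetical importance scores, e.g. as produced by a fitted tree ensemble.
importances = {
    "flow_duration": 0.30,
    "packet_length_mean": 0.25,
    "fwd_packets": 0.20,
    "bwd_packets": 0.17,
    "idle_time": 0.05,
    "urg_flag_count": 0.03,
}
kept = select_by_cumulative_importance(importances, threshold=0.9)
```

With these illustrative scores, the first four features cover 92% of the total importance, so the two least informative features are dropped.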
Models | F1-Score (%) | Recall (%) | Precision (%) | Accuracy (%) | Execution Time (ms) |
---|---|---|---|---|---|
RF | 92.50 | 97.72 | 88.76 | 99.78 | 2.27 × 10⁻⁴ |
ET | 92.68 | 98.11 | 89.18 | 99.79 | 2.23 × 10⁻⁴ |
XGBoost | 88.86 | 84.95 | 99.36 | 99.55 | 4.73 × 10⁻⁴ |
LightGBM | 91.54 | 98.54 | 86.52 | 99.03 | 0.0107 |
DT | 95.74 | 97.57 | 94.22 | 99.86 | 1.56 × 10⁻⁴ |
XGBoost 2 + RF + ET | 94.77 | 92.30 | 98.66 | 99.88 | 2.66 × 10⁻³ |
ATBEN | 96.46 | 98.37 | 94.94 | 99.90 | 3.91 × 10⁻³ |
Class | Precision | Recall | F1-Score | Accuracy | Support |
---|---|---|---|---|---|
Benign | 1.00 | 1.00 | 1.00 | 1.00 | 610,652 |
Botnet | 0.73 | 0.91 | 0.81 | 0.96 | 583 |
Brute Force | 1.00 | 1.00 | 1.00 | 1.00 | 2565 |
DoS/DDoS | 1.00 | 1.00 | 1.00 | 1.00 | 96,081 |
PortScan | 0.98 | 1.00 | 0.99 | 1.00 | 17,192 |
Web Attack | 0.98 | 0.99 | 0.99 | 0.99 | 635 |
Weighted Avg | 1.00 | 1.00 | 1.00 | - | 727,708 |
Accuracy | - | - | - | 1.00 | 727,708 |
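The weighted-average row in the table above can read 1.00 even though the Botnet scores are visibly lower, because each class contributes in proportion to its support. The short check below uses the first score column and the Support column of the table; the helper names are illustrative and not from the source.

```python
def weighted_average(scores, supports):
    """Mean of per-class scores, each weighted by its number of test samples."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


def macro_average(scores):
    """Unweighted mean: every class counts equally, regardless of size."""
    return sum(scores) / len(scores)


# Rows: Benign, Botnet, Brute Force, DoS/DDoS, PortScan, Web Attack.
first_column = [1.00, 0.73, 1.00, 1.00, 0.98, 0.98]
supports = [610_652, 583, 2_565, 96_081, 17_192, 635]

weighted = weighted_average(first_column, supports)
macro = macro_average(first_column)
print(f"weighted = {weighted:.4f}, macro = {macro:.4f}")
```

The support-weighted mean comes out at roughly 0.9993, which rounds to 1.00, while the unweighted mean is roughly 0.948. The gap shows how little the small classes move a weighted average, which is why per-class rows are worth reading alongside the summary row.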
Class | F1-Score | Recall | Precision | Accuracy | Support |
---|---|---|---|---|---|
DoS | 1.00 | 1.00 | 1.00 | 1.00 | 726,995 |
Fuzzy | 1.00 | 1.00 | 1.00 | 1.00 | 750,001 |
RPM | 1.00 | 1.00 | 1.00 | 1.00 | 915,602 |
Gear | 1.00 | 1.00 | 1.00 | 1.00 | 881,312 |
Norm | 1.00 | 1.00 | 1.00 | 1.00 | 185,479 |
Weighted Avg | 1.00 | 1.00 | 1.00 | 1.00 | 3,459,389 |
Accuracy | - | - | - | 1.00 | 3,459,389 |
Methods | F1-Score | Recall | Precision | Accuracy | Number of Classes |
---|---|---|---|---|---|
MLP [46] | 0.872 | 0.862 | 0.884 | 0.872 | 2 |
LSTM [46] | 0.895 | 0.898 | 0.984 | 0.895 | 2 |
1D-CNN [46] | 0.939 | 0.901 | 0.981 | 0.939 | 2 |
DeepGFL [48] | 0.531 | 0.448 | 0.948 | 0.531 | 12 |
DBN [47] | 0.940 | 0.997 | 0.887 | 0.940 | 6 |
Ours | 0.965 | 0.984 | 0.949 | 0.965 | 6 |
Methods | F1-Score (%) | Recall (%) | Precision (%) | Accuracy (%) |
---|---|---|---|---|
ET | 99.96 | 99.96 | 99.96 | 99.97 |
RF | 99.95 | 99.96 | 99.95 | 99.99 |
XGBoost | 72.46 | 69.72 | 80.73 | 75.53 |
LightGBM | 87.78 | 85.40 | 92.24 | 90.05 |
SVM [49] | 93.3 | 98.3 | 95.7 | 96.5 |
KNN [49] | 93.4 | 98.2 | 96.3 | 97.4 |
LSTM-AE [32] | 99.0 | 99.9 | 99.0 | 99.0 |
DCNN [12] | 99.91 | 99.84 | 99.84 | 99.93 |
HDL-IDS [50] | 99.97 | 99.98 | 99.97 | 99.98 |
Ours | 99.99 | 99.99 | 99.99 | 99.99 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gou, W.; Zhang, H.; Zhang, R. Multi-Classification and Tree-Based Ensemble Network for the Intrusion Detection System in the Internet of Vehicles. Sensors 2023, 23, 8788. https://doi.org/10.3390/s23218788