The evolution of the automotive industry has brought many changes that have improved the security and safety of commuters. However, World Health Organization reports reveal that road accidents cause almost 1.3 million fatalities every year, i.e., one death every 24 s [1]. Cyber attacks on cars add to these safety concerns. Real-life examples of car hacking include the “Jeep Cherokee Hack” in 2015 [2], the “Tesla Model S Hack” in 2016 [3], and the “Nissan Leaf Vulnerability” in 2016 [4]. In the Jeep Cherokee case, two security researchers, Miller and Valasek, remotely hacked the car by exploiting its infotainment and advanced driver assistance systems (ADAS). This hijacking attempt was deliberate on the part of the company, intended to test the “zero-day exploit” hacking technique. The “Tesla Model S Hack” occurred when researchers from the Keen Security Lab remotely hijacked the car’s CAN-bus, taking control of the ADAS in both driving and parking modes. They could control the dashboard screen, door locks, brakes, and windscreen wipers, move the seats, and open the boot and sunroof. In the Nissan Leaf case, hacking experts gained remote access to the car through its mobile application using the vehicle identification number. This allowed them to operate the air conditioning system, obtain the driving history, and perform climate control functions. In 2022, a group of seven researchers hacked a large number of cars from 16 different makers and gained access to various car functions: they could honk the horn, flash the headlights, lock and unlock the cars, start and stop the engine, precisely locate the cars, and even alter car ownership remotely. The impacted companies were Hyundai, Honda, Land Rover, Kia, Ford, Ferrari, Nissan, Mercedes-Benz, BMW, Acura, Toyota, Rolls Royce, Porsche, Jaguar, Infiniti, and Genesis [5]. Together, these incidents reveal a serious efficiency and safety concern for the auto industry, which demands a more secure, robust, and efficient self-defense mechanism for cars that requires less human intervention. Renowned companies such as Google in the USA performed road tests of connected intelligent cars in 2009 [6]. The University of Michigan [7] examined the performance of similar cars designed by Tesla [8] at the Mcity test facility. In subsequent years, Mercedes Benz, Audi, and BMW started working on the controller area network (CAN), a key element for establishing communication in smart cars (SCs). These intelligent smart cars can perform inter- and intra-vehicular communication and can make quick, intelligent decisions for collision avoidance.
While retaining the legacy systems, SCs integrate advanced technology, connectivity, and digital interfaces that add intelligent features such as keyless entry, automatic parking, traffic sign recognition, lane keeping, emergency braking, obstacle detection, and many others. These advanced features are enabled by embedded electronics, namely numerous sensors and electronic control units (ECUs) that form an in-vehicle network in SCs to establish effective communication for an efficient and safe drive. In-vehicle communication is achieved through the controller area network, which is considered an efficient communication protocol compared to the Ethernet and FlexRay networks used in SCs [9]. Being a message-oriented protocol, CAN allows robust communication between sensors and ECUs, but it lacks proper data authentication and encryption mechanisms. This makes the CAN-bus highly vulnerable to cyber attacks, raising serious questions about the reliability of a smart car. An unauthorized user can gain access to the system, launch an attack, manipulate the data, and thus compromise the safe operation of a smart car for personal, financial, or other benefit. This situation demands the development of an efficient, reliable, and robust intrusion identification and detection system (IDS) for SCs. The communication between an IDS and the users normally involves (i) Attack Identification and Detection, where the IDS detects an illegal attempt by a threat actor to take control of the CAN-bus and send malicious data; (ii) Alert Generation, where the IDS generates alerts and sends them to the car dashboard and/or the driver’s mobile app; and (iii) Automated Response, where the IDS disables the OBD-II port to restrict external access. In the worst case, (iv) an alert is also sent to the manufacturer for detailed checking and debugging. This study focuses primarily on Step (i) of the IDS process, assuming that proper communication takes place between the IDS and the users.
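The four IDS steps described above can be illustrated with a minimal Python sketch. It is not the detection method of this study: the `CanFrame` and `ToyIds` classes, the allowlist rule used in `detect`, and the example CAN IDs are all hypothetical placeholders, and a trained classifier would take the place of the rule in Step (i).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class CanFrame:
    """Simplified CAN data frame: an arbitration ID and up to 8 data bytes.

    The frame carries no sender identity or authentication tag, which is
    why any node attached to the bus can inject it.
    """
    can_id: int
    data: bytes


@dataclass
class ToyIds:
    """Hypothetical IDS skeleton following steps (i)-(iv) of the text."""
    known_ids: Set[int]  # assumption: allowlist of legitimate CAN IDs
    alerts: List[str] = field(default_factory=list)
    obd_port_enabled: bool = True
    manufacturer_reports: List[str] = field(default_factory=list)

    def detect(self, frame: CanFrame) -> bool:
        # Step (i): placeholder rule standing in for a trained classifier.
        return frame.can_id not in self.known_ids or len(frame.data) > 8

    def process(self, frame: CanFrame, escalate: bool = False) -> bool:
        """Run one frame through the pipeline; return True if it was flagged."""
        if not self.detect(frame):
            return False
        message = f"Suspicious CAN frame: id=0x{frame.can_id:03X}"
        self.alerts.append(message)        # Step (ii): alert dashboard/app
        self.obd_port_enabled = False      # Step (iii): restrict external access
        if escalate:
            # Step (iv): worst case, also notify the manufacturer.
            self.manufacturer_reports.append(message)
        return True


if __name__ == "__main__":
    ids = ToyIds(known_ids={0x130, 0x2B0})  # hypothetical legitimate IDs
    print(ids.process(CanFrame(0x130, bytes(8))))                 # known ID
    print(ids.process(CanFrame(0x000, bytes(8)), escalate=True))  # unknown ID
    print(ids.alerts, ids.obd_port_enabled)
```

Keeping `detect` separate from `process` mirrors the scope of the study: only Step (i) changes when a different detection model is used, while alerting and response stay the same.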
1.1. Related Works
Machine learning (ML) presents promising solutions for intrusion and anomaly detection in cyber-physical systems [10,11,12,13]. ML techniques primarily learn from data and make predictions based on the data pattern and training. Numerous ML-based studies have been conducted to design an intrusion detection system (IDS) and analyze the behavior of a smart car under different cyber attacks [14,15,16,17,18,19,20,21].
Utilizing SVM-, kNN-, and DT-based ML approaches and real-time data from Chevrolet Spark and Kia Soul, Bari et al. [14] investigated the performance of an ML-based IDS for smart cars. The study conducted experiments using DoS, impersonation, and fuzzy attacks. Results showed a remarkably high accuracy of 99.9% with an F1-score of 1.0; however, the computational time, including model training and testing time, was significantly high for all attack types and classifiers. The machine learning-based IDS proposed in [15], which utilized RF, DT, MLP, and SVM classifiers, successfully identified and detected DoS, fuzzy, and impersonation attacks injected into real-time data from the CAN-bus of a KIA Soul. Though the study achieved an accuracy of 98.5269%, the model is very heavyweight, with training and testing times of 460.719 s and 14.935 s, respectively, which brings into question its suitability for events requiring instant decisions. In a related study [16], a generative adversarial network-based anomaly detection scheme was proposed for smart cars. The study presented fairly good results based on confusion-matrix parameters such as hit rate, miss rate, false alarm rate, and correct rejection rate. However, it gives no information about anomaly detection time, is limited to a single attack type and thus does not demonstrate its effectiveness for multiple attack designs, and does not specify the smart car used for data collection.
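The evaluation metrics that recur throughout the studies reviewed here (hit rate, miss rate, false alarm rate, correct rejection rate, accuracy, precision, and F1-score) all derive from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; the function name `confusion_metrics` and the example counts are illustrative assumptions and are not taken from any of the cited studies.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Derive common detection metrics from confusion-matrix counts.

    "Positive" means an attack frame. Assumes at least one actual attack,
    one actual normal frame, and one true positive, so that no denominator
    is zero.
    """
    hit_rate = tp / (tp + fn)                 # recall / sensitivity
    miss_rate = fn / (tp + fn)                # 1 - hit rate
    false_alarm_rate = fp / (fp + tn)         # normal frames flagged as attacks
    correct_rejection_rate = tn / (fp + tn)   # specificity
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * hit_rate / (precision + hit_rate)
    return {
        "hit_rate": hit_rate,
        "miss_rate": miss_rate,
        "false_alarm_rate": false_alarm_rate,
        "correct_rejection_rate": correct_rejection_rate,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts: 100 attack frames and 900 normal frames.
    for name, value in confusion_metrics(tp=90, fn=10, fp=5, tn=895).items():
        print(f"{name}: {value:.4f}")
```

Because CAN traffic is usually dominated by normal frames, accuracy alone can look high even when many attacks are missed, which is why the reviewed studies are compared on precision and F1-score as well.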
Realizing the consequences of cyber attacks and leveraging ML techniques, Shahriar et al. [17] proposed a single-level deep learning-based anomaly detection (AD) system, CANShield, for SCs. The CANShield model comprises data preprocessing, data analyzer, and ensemble method-based AD modules. The preprocessing module processes and manages the complex data, the analyzer module investigates the time-series data, and the detection module delivers the final outcome. Despite the comprehensible performance results, the study does not provide details about the car type used for data collection, the accuracy measure, or the computational time needed to make the classification decision. An LSTM-based IDS for intrusion detection in a CAN-bus network in smart cars is proposed in [18]. The study investigated the effect of DoS, fuzzy, and spoofing attacks on CAN messages transmitted by different ECUs for normal SC functionality. The simulation results presented considerably good sensitivity and specificity scores; however, the study lacks details of model accuracy and computational time, which are crucial parameters of an IDS designed for SCs.
Employing feature engineering to select highly significant variables for model training, the researchers in [19] devised a deep learning-based solution for the detection of flaws and anomalies in SCs. They utilized an IVS-Hackathon dataset, which included four different types of cyber attacks. Achieving 95% accuracy, the study provided a good comparison with baseline methods and recent studies; however, it neither considered the computational complexity nor verified the model on different datasets. Further, the accuracy could be improved. Kim et al. [20] determined the performance of an anomaly detection model developed using an LSTM-based autoencoder approach. While presenting comprehensive simulation results, the study is limited to a single attack type, i.e., a fuzzy attack, and considers only one dataset. In another similar study, Wang et al. [21] proposed a GAN-based IDS for anomaly identification in a CAN-FD-bus in SCs. The model employed DoS, fuzzy, and RPM/Gear spoofing attack data together with normal data for model training. The results showed an outstanding detection rate of 99.93%; however, the study omitted other performance evaluation metrics (F1-score, precision, recall, ROC) as well as the computational time.
Kishoare et al. [22] proposed an intelligent IDS for SCs to identify intrusions, anomalies, and flaws in CAN-bus data using the Bidirectional Long Short-Term Memory technique applied to the “Car Hacking: Attack and Defense Challenge, 2020” dataset. The proposed scheme achieved a 99.09% detection accuracy, with an F1-score and precision of 0.9910 and 0.9901, respectively. Aldhyani and Alkahtani [23] evaluated the performance of a deep learning-based security system developed to protect CAN-bus data from various cyber intrusions and assaults. The proposed system achieved a 97.43% accuracy in identifying and detecting CAN-bus anomalies. He et al. [24] studied the behavior of an anomaly detection model in connected and autonomous vehicles (CAVs) using decision tree and naive Bayes machine learning schemes applied to a KDD99-based CAV-KDD dataset. The study concluded that the decision tree is superior to naive Bayes in anomaly detection and requires less simulation time. To locate false messages injected into a CAN-bus, Pawar et al. [25] used k-nearest neighbour (kNN)- and decision tree (DT)-based ML models and reported detection accuracies of 81.48% and 77.99% for kNN and DT, respectively. Gupta et al. [26] presented a novel graph-based ML scheme to secure smart cars from anomalies. Employing machine learning and an event-triggered interval method, Han et al. [27] detected and identified abnormalities in CAN-bus data in SCs; the model achieved up to 99% detection accuracy with a 0.990 F1-score and 0.991 precision. Introducing different cyber attacks on an SC in a laboratory setup and collecting real-time data, Onur et al. [28] investigated the performance of an IDS using various machine learning techniques. The simulation results showed that random forest surpassed the other schemes, with a 0.923 F1-score, 0.925 precision, and 96.1% classification accuracy. Similarly, Alalwany et al. [29] analyzed the performance of an IDS in classifying different cyber assaults on CAN-bus data. Their model incorporated machine learning and ensemble methods, including random forest, decision tree, eXtreme gradient boosting, stacking, bagging, and voting, with a Kappa architecture. The study concluded that the stacking ensemble classifier outperformed the other methods and achieved the best results, i.e., a 98.5% accuracy with a 0.985 F1-score and 0.987 precision. In similar studies by Alsaade and Al-Adhaileh [1] and Anand et al. [30], a deep learning method was applied to identify erroneous information and malicious network traffic in smart cars. Reviewing the various studies carried out on anomaly detection in the CAN-bus, the authors analyzed different IDSs on the basis of detection technique, attack type, and evaluation metrics, and summarized the findings in [31].
Table 1 presents the comparison of the proposed study with some relevant studies in terms of investigating the impact of different cyber attacks on CAN-bus data in smart cars.
As evident from the literature above, researchers have made significant contributions toward anomaly detection models for smart cars. However, these IDSs are typically limited to a single dataset for training an ML-IDS that detects only a few cyber attacks injected into the CAN-bus in SCs, and they are comparatively heavyweight, with computational costs as high as 86.67% [44] compared to this work. Further, the applicability of an IDS should be extended to different makes of smart car while tackling different types of cyber intrusions. To our knowledge, the literature lacks a simple, fast, lightweight, and efficient intrusion detection system capable of instant decision-making with near-zero delay on real-time data while being concurrently applicable to different car models. The unified model proposed in this study addresses this gap: it achieves a detection accuracy as high as 99.922% with 99.9% precision and a detection time reduced to 0.00027 s, while remaining equally efficient for various smart cars such as the KIA Soul, Hyundai Avante CN7, Hyundai YF Sonata, Genesis G80, and Chevrolet Spark. To this end, the paper proposes a novel machine learning-based approach to develop a simple, lightweight, yet efficient intrusion detection classifier for smart cars.