1. Introduction
The Industrial Internet of Things (IIoT) is an industrial framework in which a large number of devices or machines are connected and synchronized using software tools and third-platform technologies to provide varied services to internet users, public and private sector organizations, smart industries, and Industry 4.0 [1,2]. In recent years, the IIoT has experienced rapid growth owing to its sensing, storage, and intelligence capabilities [3,4]. According to a recent statistical report, 70 billion IoT devices are expected to be connected to the internet in 2025 [5]. Such dependence on IoT generates a significant amount of data that must be processed and examined, and big data analysis is also valuable for business development [6]. However, the biggest threat to the growth of the IIoT is the range of cyber threats that can compromise the integrity of user data and the underlying IoT applications for further exploitation. In addition, the risk of IoT devices being physically compromised, owing to their pervasive deployment, is considered a critical threat in the IIoT environment [7]. Cyber defense is therefore a pivotal prerequisite for the growth of the IIoT [8].
Adversaries use diverse malware techniques to gain access to an IoT device and disrupt the entire IIoT network [9]. Attacks performed on a network are inherently difficult to detect and have been a proven strategy for compromising interconnected systems and devices [10]. The adversary breaches security to access users’ records, steal sensitive information, and inject malicious code for further exploitation or to hijack hardware. The heterogeneous and dynamic nature of IoT devices, together with resource constraints such as energy, memory, and processing power, amplifies potential cyber threats, which may lead to Denial of Service (DoS), Distributed Denial of Service (DDoS), information injection, advanced persistent threat (APT), and modern malware botnet attacks [11,12]. Moreover, IIoT devices are prone to complex hacking approaches and to physical security threats to the accessibility and confidentiality of data, which may even compromise the complete IoT-based network. Hence, the IIoT requires an adaptable, robust, and cost-effective technique for identifying pervasive and prevalent cyberthreats [13].
In recent years, research has addressed various IIoT security challenges such as confidentiality, privacy, policy enforcement, and key management [14]. Traditional techniques such as antivirus software and firewall protection can be easily evaded by zero-day intrusions [15]. Machine learning (ML) techniques are also considered powerful, but they mostly rely on analyzing the features of existing patterns, so extant ML schemes become less effective against zero-day attack variants. The prime challenge for a malware identification framework is to extract useful features and detect sophisticated malware efficiently [16]. Deep learning (DL) is considered a promising shift for identifying pervasive IIoT malware threats and attacks [17,18]. To address the aforementioned challenges, we present an efficient hybrid DL-driven multiclass cyberthreat and cyberattack detection scheme for identifying distributed variant malware botnet attacks in the IIoT. The key contributions are as follows:
1.1. Contributions
We propose an efficient hybrid DL-enabled technique for the detection of sophisticated distributed IIoT botnet attacks by deploying Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN).
Extensive simulations have been performed on the N_BaIoT 2018 dataset to evaluate the performance of the proposed algorithms using extended performance metrics (accuracy, precision, recall, F1-score, etc.).
For corroboration purposes, the proposed approach is compared with our constructed hybrid DL-driven architectures (i.e., DNN-DNN and CNN-CNN) and current benchmarks. Our proposed mechanism outperforms the others in terms of detection accuracy.
Extensive experimental results demonstrate that our proposed method is an effective and efficient approach for multivector botnet detection.
We also performed 10-fold cross-validation to avoid reporting biased performance results.
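The metrics named in the contributions can be illustrated with a minimal sketch. It assumes macro averaging over the classes, which the paper does not specify, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: each class counts equally (macro average). The paper
    does not state which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy three-class example (benign, Gafgyt, Mirai); the labels are made up.
y_true = ["benign", "gafgyt", "mirai", "mirai", "benign", "gafgyt"]
y_pred = ["benign", "gafgyt", "mirai", "gafgyt", "benign", "gafgyt"]
scores = macro_metrics(y_true, y_pred)
```

In this toy case one Mirai record is misclassified as Gafgyt, which lowers Mirai recall and Gafgyt precision while leaving the benign class unaffected.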
1.2. Organization
The remaining parts of the paper are organized in the following way.
Section 2 presents the literature review with background knowledge.
Section 3 describes the research approach, the dataset, its preprocessing, and the architecture of the hybrid LSTM-CNN.
Section 4 covers the software and hardware requirements and discusses the experimental results. Finally,
Section 5 concludes the paper and outlines future work.
2. Background and Related Work
The Internet of Things (IoT) transforms basic, conventional physical objects into smart objects through cutting-edge technologies such as communication technologies, applications, sensor networks, internet protocols, and pervasive computing. Given the wide range of IoT applications in the smart city ecosystem, a secure IoT network implementation is necessary. The IoT environment can be described as an enormous set of interconnected heterogeneous devices and systems with different communication protocols and patterns [13,19]. To deliver intelligence-enabled services to users, the IIoT architecture consists of computational objects linked with IoT infrastructure. Moreover, the IoT network has been developed as a four-layer architecture comprising the devices (perception) layer, network layer, infrastructure layer, and application layer. A simple taxonomic diagram of the IIoT architecture is shown in Figure 1.
The perception layer comprises the connected physical objects and their connectivity through different access points such as radio towers, satellites, wireless access points, and satellite dishes. The physical sensors sense, gather, and process information. Because IoT devices are resource-constrained, with limited processing capabilities, data delivery is the key step in designing a context-aware IoT system. The growing number of IoT devices and the ever-increasing volume of data they generate daily link the IoT to big data and to the expansion of intelligence-based ecosystems. Connecting heterogeneous devices helps provide smart services to users, and the communication used to transmit data should be low-power [20,21].
The network layer enables cooperation between diverse IoT devices so that they can interact with each other effortlessly. It also provides interoperability and scalability in the IoT realm. The main functions of this middle layer are context awareness and device discovery, which support the surrounding IoT objects. The security and privacy of IoT devices are also handled at the middle layer, where security mechanisms are deployed, because the data gathered from these devices are mostly industry- or human-related. IoT-based systems have applications in several domains, including smart healthcare, transportation, smart grids, and smart cities. At the application level, services are delivered to consumers, and the data gathered and analyzed are integrated for business objectives. The data integrated across the different levels of IoT models are used for social and economic growth [11].
Owing to the spread of varied IoT devices in industry, the security of IoT networks has become a prime focus. To provide practical solutions to the existing security vulnerabilities in IoT systems, researchers have focused on DL/ML-based attack identification frameworks [22,23,24,25]. In [26], the author proposed an ensemble technique based on recurrent network families to identify various IoT cyberattacks through analysis of network traffic. The dataset considered is a synthetic Modbus/TCP network traffic dataset based on an industrial automation context. The proposed technique obtained 99% detection accuracy. Similarly, [27] presented a scheme for detecting IoT-based phishing and botnet attacks through distributed deep learning. An LSTM classifier was applied to the N_BaIoT dataset and achieved 94.3% and 94.80% accuracy for phishing and botnet attacks, respectively. Chen et al. [28] introduced an intrusion detection method for conversation-based traffic analysis employing five machine learning classifiers: Random Forest, REP Tree, Random Tree, Bayes-Net, and Decision Tree. Using the CTU-13 dataset, the approach achieved a detection rate of 93.6%.
In another study, Bansal et al. [29] proposed an anomaly-based IDS for the IoT using three different deep learning classifiers: Clustering, a Neural Network with LSTM (NN-LSTM), and a Recurrent Neural Network (RNN). The mechanism was evaluated on the ISCX and CTU-13 datasets, achieving detection accuracies of 98.8%, 98.39%, and 83.09% for Clustering, NN-LSTM, and RNN, respectively. For malicious traffic detection, Pektaş et al. [30] proposed a DL-driven analysis of network traffic flow behavior leveraging a neural network. This approach was evaluated for binary classification on the ISOT and CTU-13 datasets and achieved detection accuracies of 99.3% and 99.1%, respectively. Moreover, Sharma et al. [31] presented a machine-learning-based approach for detecting evolving malware by analyzing network traffic features. The dataset contains 11,688 malware samples collected from the Malicia project and 4006 benign samples gathered from multiple systems connected over the network. The framework was executed using diverse machine learning classifiers, namely RF (Random Forest), LMT (Logistic Model Tree), NBT (Naïve Bayes Tree), FT (Functional Tree), and J48, of which RF achieved an accuracy of 97.95%. In [32], a K-Means clustering technique for labeling the dataset was presented. A decision tree was then used to detect cyberthreats in IoT communication on the ISCX dataset, achieving an accuracy of 88%, which could be improved by using a loss function to reduce classification error. In [33], the authors proposed a logistic-regression-enabled botnet detection technique for IoT. The proposed scheme can scale to enormous numbers of malware samples by grouping them into clusters based on their behavior. The technique achieved 97.3% detection accuracy.
The author in [33] used feature selection to reduce the feature set to those features that help detect bots in IoT. These features yield high botnet detection accuracy over the IoT. A decision tree classifier was evaluated on the N_BaIoT dataset. Deep learning provides a flexible environment for detecting malware.
The work in [34] presented efficient IoT malware detection through packet-level analysis by implementing a Bidirectional Long Short-Term Memory based Recurrent Neural Network (BLSTM-RNN). The paper also generated a labeled dataset containing attack vectors, such as botnet traffic, and benign traffic. Experimental results showed detection accuracies for Mirai, DNS, and UDP of 99%, 98%, and 98%, respectively. Similarly, in [35], the authors used LSTM to inspect statistical network flow features. The experiments used the Cresci dataset and achieved 99% detection accuracy for IoT malware detection. The datasets utilized for the proposed scheme are CTU-13 and ISOT, which are purely binary (i.e., botnet, normal) and combine botnet and normal traffic. Machine learning and deep learning classifiers (i.e., SVM, Logistic Regression, Random Forest, KNN, stand-alone LSTM, stand-alone dense, and combined-layer models) were used and showed an accuracy of 99.3%. All conventional approaches were executed on CPU and all deep learning approaches on GPU.
In [36], the author focuses on finding domain names that do not belong to data in context, statistical information, etc. For this, deep-learning-based classifiers, namely LSTM, RNN, CNN, and CNN-LSTM, are employed. The dataset is composed of one billion benign records collected from Alexa and open-DNS, together with malicious records created from 17-DGA. A deep-learning-based IDS was presented in [37] for identifying intrusions in IoT networks, leveraging a Gated Recurrent Neural Network. The proposed classifier achieved a detection accuracy of 98.91% and a false alarm rate (FAR) of 0.76%. Similarly, [38] employed a deep Auto Encoder and a Deep Feedforward Neural Network to detect malware attacks in IoT. This model achieved a detection accuracy of 99%.
The literature review shows that, despite achieving high detection accuracies, existing approaches still have several limitations, including high computational complexity, reliance on humans, extensive data modifications, and inconsistent accuracy levels. Researchers have been working on hybrids and ensembles and experimenting with diverse hyperparameters (e.g., training, optimization, activation, and classification) to arrive at the most accurate and time-saving solution for anomaly detection. Existing research also demonstrates that ensemble or hybrid techniques have considerable potential in network security anomaly detection. The goal of deep hybrid learning techniques is not to surpass existing classifiers but to exploit their ability to avoid misclassifying unseen data. In contrast with other existing intrusion detection schemes for IoT, we present a comprehensive hybrid framework based on cutting-edge deep learning from the IoT security perspective. This paper presents an efficient approach to identifying sophisticated attacks in IoT environments by utilizing the predictive power of deep learning.
3. Research Methodology
This section presents the proposed hybrid DL-enabled multivector attack detection framework for IoT systems. The presented model combines several processes. The initial step is the dataset description and observation of features. In the subsequent step, the dataset is preprocessed, which includes removing data redundancy, cleaning data, visualization, feature engineering, and data transformation. After preprocessing, the data are ready to be fed to the classifiers for IoT attack identification. Finally, the efficient and scalable malware detection framework based on hybrid Long Short-Term Memory (LSTM) [39] and Convolutional Neural Network (CNN) [40] is presented.
3.1. Dataset
The features of IoT devices can be analyzed through the internet protocols and services they utilize. Network traffic analysis is the ideal choice for the identification and classification of cyberattacks. In any exploration, to obtain precise results, authentic and accurate data must be provided as input. To design a reliable and applicable intrusion detection system, data gathered from real devices are optimal. However, most existing analysis approaches use datasets collected in a sandbox, which do not accurately reflect the real deployment of identification frameworks in IoT infrastructure. In this study, we used the N_BaIoT 2018 dataset, captured from real IoT devices. This dataset fills a gap in the public botnet databases, particularly for IoT devices. The N_BaIoT 2018 dataset contains features of real normal traffic [41] from 9 different IoT devices (e.g., doorbells, a thermostat, a baby monitor, security cameras, and a webcam). The N_BaIoT dataset considers two botnet malware families: Gafgyt and Mirai. The dataset traffic was comprehensively recorded for normal operation and the 2 distinct botnet attacks. For our experiment, we considered 6 diverse IoT devices and the two botnet families, Gafgyt and Mirai, to detect botnet attacks. The dataset distribution for the proposed scheme is given in Table 1.
3.2. Preprocessing Phase
Deep learning requires comprehensive data analysis to classify IoT traffic as malicious or benign. The first step was therefore to arrange the data in a form compatible with the input of a deep learning classifier. The dataset contains missing, infinite, and NaN values. During data denoising, these unexpected values were removed from the dataset. In the following step, the feature types were identified as numerical or categorical. Categorical data were then converted to numeric data through label encoding.
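The preprocessing steps described above can be sketched as follows. This is a minimal illustration using pandas, not the authors' pipeline: the function name `preprocess`, the column names, and the toy values are hypothetical, and dropping duplicate rows is included on the assumption that it corresponds to the redundancy removal mentioned earlier.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Remove missing/infinite rows and duplicates, then label-encode."""
    df = df.copy()
    numeric_cols = df.select_dtypes(include="number").columns
    # Treat +/- infinity like a missing value so one dropna() removes both.
    df[numeric_cols] = df[numeric_cols].replace([np.inf, -np.inf], np.nan)
    df = df.dropna().drop_duplicates().reset_index(drop=True)
    # Label-encode every non-numeric column as integer codes (sorted order).
    for col in df.columns.difference(numeric_cols):
        codes, _ = pd.factorize(df[col], sort=True)
        df[col] = codes
    return df


# Toy records; column names and values are illustrative only.
raw = pd.DataFrame(
    {
        "mean_size": [60.0, np.inf, 74.0, np.nan, 60.0],
        "device": ["doorbell", "webcam", "webcam", "doorbell", "doorbell"],
        "label": ["benign", "mirai", "gafgyt", "benign", "benign"],
    }
)
clean = preprocess(raw)
```

Here the rows containing the infinite and NaN values are dropped, the repeated doorbell record is removed, and the remaining `device` and `label` strings become integer codes.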
3.3. Detection Phase
In this research, a robust, proficient, scalable, and highly accurate hybrid IoT multivariant botnet attack detection scheme is presented that leverages Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), as portrayed in Figure 2. The proposed approach aims to identify Gafgyt and Mirai attacks. The proposed LSTM-CNN architecture comprises three main steps for recognizing intrusions in smart devices.
- Step 1.
Modeling of data dimension
First, the preprocessed network traffic data are mapped into a two-dimensional (2D) feature vector for the CNN. CNN variants can accept inputs of different dimensions, from 1D to 3D; because the experimental data consist of a number of samples (features, records), they are mapped into 2D.
- Step 2.
Initialization of CNN and LSTM network
For the experimentation, the CNN was designed with an input layer, three hidden layers, and an output layer. To facilitate feature learning by the CNN, the input layer converts the 1D network data into 2D plane data. The hidden part comprises three convolution layers and a flatten layer. The convolution layers successively map the sample data to a high-dimensional space and learn feature information from the network connection data. By lowering the dimension of the extracted features, the flatten layer decreases computation and enhances the model's detection efficiency. The LSTM network, in turn, consists of an input layer, three hidden LSTM layers, and an output layer. The data are mapped at the input layer and fed forward to the LSTM cells. The LSTM layers are credited with recognizing network anomalies efficiently.
- Step 3.
The combined output
Once both classifiers had been initialized and executed for the identification of attacks in IoT, an additive merge of their outputs was performed to obtain the final result of the proposed algorithm.
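Steps 1 and 3 can be illustrated with a small NumPy sketch. It rests on two assumptions that the text does not confirm: that the 2D mapping gives each record the shape (features, 1), and that the additive merge is an element-wise sum of the two branch outputs followed by a softmax. The branch scores below are placeholders rather than outputs of trained CNN or LSTM layers, and the helper names are hypothetical.

```python
import numpy as np


def to_cnn_input(features):
    """Step 1 (assumed form): give every record a 2D shape (features, 1)."""
    features = np.asarray(features, dtype=np.float32)
    return features[..., np.newaxis]


def additive_merge(cnn_out, lstm_out):
    """Step 3 (assumed form): sum both branch outputs, then apply softmax."""
    merged = np.asarray(cnn_out, dtype=np.float64) + np.asarray(
        lstm_out, dtype=np.float64
    )
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = merged - merged.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Three records with four features each (toy values).
x = np.arange(12).reshape(3, 4)
x2d = to_cnn_input(x)

# Placeholder per-class scores from the two branches for two records,
# with classes ordered as (benign, Gafgyt, Mirai).
cnn_scores = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
lstm_scores = np.array([[1.0, 0.2, 0.4], [0.1, 0.3, 2.0]])
probs = additive_merge(cnn_scores, lstm_scores)
predicted = probs.argmax(axis=1)
```

For the second record the two branches disagree, and the summed scores favor the class on which the LSTM branch is more confident, which shows how the merge lets one branch override the other.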
The complete design of the hybrid LSTM-CNN architecture, including the layer architecture, number of neurons in each layer, activation function, loss function, number of epochs, and batch size, is detailed in Table 2. Moreover, we constructed other contemporary hybrid architectures (i.e., CNN-CNN and DNN-DNN) for a comprehensive evaluation of our proposed technique. To address bias, we also performed 10-fold cross-validation.
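The 10-fold cross-validation split can be sketched as below. This is a plain shuffled split written for illustration; whether the authors used stratified folds is not stated, and the function name `k_fold_indices` and the seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Assumption: folds are formed by shuffling record indices once and
    dealing them round-robin, without stratifying by class.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 25 toy records split into 10 folds; each record is tested exactly once.
splits = list(k_fold_indices(25, k=10))
```

Each model would be trained on `train_idx` and scored on `test_idx` in turn, and the ten scores averaged, so that no single favorable split determines the reported result.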