1. Introduction
Currently, Internet of Things (IoT) technology is applied across sensors, devices, software, and tools to improve their functionality and accuracy [1]. With the growth of the IoT, life has become more comfortable and simple for individuals, governments, and industries. Machines, sensors, gadgets, and devices have become highly intelligent, whereas manual intervention has substantially decreased through the development of IoT technology [2]. The IoT underpins smart devices such as fire alarms, physical medical sensors, smartphones, smart energy meters, industrial devices, energy applications, and smart security systems. The term IoT security refers to the protection of internet-connected or network-based things [3]. The concept of the IoT and its main terminology have evolved over several years. The major role of the IoT is to interconnect nodes, frameworks, sensors, smart cities, and systems through the internet for communication, control, and information sharing [4].
The IoT has been developed to optimize the day-to-day activities of human life and to enable the current world to run efficiently. A wide range of things are now connected to the internet, such as smart sensors, cooking appliances, thermostats, fitness mobile applications, air conditioners, and PV systems [5]. This fast growth of IoT technology makes it very difficult to secure and protect IoT data from unauthorized users, malicious traffic, hackers, and attackers. Thus, numerous defense mechanisms and approaches have been proposed and developed in IoT systems and architectures to protect the data [6]. The application of various deep learning (DL) techniques to identify attacks with binary classification and to distinguish different kinds of attacks with multi-class classification has become an active research domain [7]. While several reviews have covered this developing area of research, the literature lacks an unbiased comparison of various DL techniques, particularly on new databases for intrusion detection in a controlled setting [8].
Cybersecurity is a serious challenge in the current world. Firewalls, for instance, have been employed for a long time to protect confidential data. An intrusion detection system (IDS) analyzes the network traffic in a given computing environment to detect indications of malicious activity [9]. The fast development of Artificial Intelligence (AI) has led to important advances in certain mechanisms, including anomaly detection and pattern identification. While the existing systems perform well in detecting cyberattacks, their performance could still be improved. As the number of network attacks grows, so does the quantity of data on networks; thus, rapid and highly effective methods are required in order to identify the attacks. A large number of approaches to enhancing network security already exist [10]. Privacy and security are significant IoT challenges that warrant further investigation. With the employment of AI, the identification of cybersecurity attacks on the IoT can be significantly enhanced.
The current manuscript describes the Honey Badger Algorithm with the Optimal Hybrid Deep Belief Network (HBA-OHDBN) technique for cyberattack detection in a blockchain (BC)-assisted IoT platform. In the proposed HBA-OHDBN system, feature selection using HBA is implemented to choose an optimal set of features. In the HBA, digging and honey-finding strategies are simulated during the exploration and exploitation of the search space. For intrusion detection, the HBA-OHDBN technique applies the Hybrid Deep Belief Network (HDBN) model. To adjust the hyperparameter values of the HDBN model, the Dung Beetle Optimization (DBO) algorithm is utilized. The DBO algorithm balances global exploration and local exploitation, which gives it a fast convergence rate and satisfactory accuracy. The HBA-OHDBN algorithm was tested using a benchmark IDS database, and the outcomes were inspected under different evaluation metrics. The key contributions of the paper are summarized as follows:
Development of the HBA-OHDBN technique comprising HBA-based feature subset selection, HDBN-based detection, and DBO-based hyperparameter tuning for intrusion detection in the IoT environment. To the best of the authors’ knowledge, the HBA-OHDBN technique does not exist in the literature.
Design of the HBA-based feature subset selection approach to identify the most relevant and informative features from the dataset, which helps in reducing the dimensionality and improving the overall performance of the intrusion detection system.
Employment of the DBO algorithm to fine-tune the hyperparameters of the HDBN model, thus leading to a fast convergence rate and high solution accuracy. This approach ensures that the HDBN model is well adjusted for an improved intrusion detection performance.
The rest of the paper is organized as follows. Section 2 reviews the related works and Section 3 discusses the proposed model. Then, Section 4 details the analytical results, while Section 5 concludes the paper.
2. Literature Review
Asiri et al. [11] introduced the Hybrid Metaheuristics FS with Stacked DL-enabled Cyber-Attack Detection (HMFS-SDLCAD) approach. In this study, data preprocessing was implemented, a Salp Swarm Optimizer based on PSO (SSOPSO) was used for feature selection (FS), and a Stacked Bidirectional-GRU (SBiGRU) algorithm was employed for identification and classification. Further, the Whale Optimization Algorithm (WOA) was utilized to optimize the hyperparameters. In reference [12], the authors suggested a DL-based intrusion detection model for the Industrial IoT (IIoT) with a hybrid rule-based FS to train and validate the data acquired from TCP/IP packets. In this study, training was carried out using a hybrid rule-based FS and a deep feed-forward NN (DFFNN) approach. In an earlier study [13], the authors developed a Deep Random-NN (DRaNN)-based rapid and dependable attack identification method for IIoT platforms. The developed DRaNN method was optimized by integrating hybrid PSO with Sequential Quadratic Programming (SQP). The SQP-enabled PSO enables the NN to select the optimum hyperparameters.
Chander and Upendra Kumar [14] presented a novel metaheuristic FS with DL-enabled anomaly detection approach for the IIoT platform, abbreviated as the MFSDL-ADIIoT algorithm. This method leveraged a novel Deer Hunting Optimization (DHO)-based FS technique to arrive at valuable feature subsets. Moreover, a Cascaded RNN (CRNN) algorithm was implemented for classification and recognition. Finally, the Sparrow Search Algorithm (SSA) was employed for optimal fine-tuning of the parameters. Khacha et al. [15] suggested a novel IDS based on DL techniques. Specifically, the proposed system combines CNN and LSTM algorithms for intrusion detection and classification. The study employed a recent dataset, termed Edge-IIoT, containing real network traffic from IIoT and IoT applications. Alabsi et al. [16] combined two CNNs into a method for identifying attacks on IoT networks. The first CNN selects the important features that characterize IoT attacks from the raw network traffic data. The second CNN uses the features detected by the first CNN to build a robust technique that correctly identifies the IoT attacks. In reference [17], the authors developed a novel DL-based IDS (DL-IDS) approach. This method involves a Spider Monkey Optimizer (SMO) and a Stacked-Deep Polynomial Network (SDPN). Here, the SMO chooses the optimum features in the databases, whereas the SDPN classifies the data as either normal or anomalous.
Alohali et al. [18] developed a novel BC-Assisted Optimal ML-based Cyber-attack Detection and Classification (BAOML-CADC) approach. The primary aim of the BAOML-CADC algorithm was to classify cyber-attacks. To this end, the approach executed the Thermal Equilibrium Algorithm-based FS (TEA-FS) algorithm for optimum selection of the features and utilized the ELM approach for cyber-attack detection. In an earlier study [19], a new Hybrid Deep Random NN (HDRaNN) was established for cyber-attack detection in the IIoT. The HDRaNN integrates the DRaNN and MLP with dropout regularization. Al-Abassi et al. [20] examined a DL approach for constructing novel balanced representations of imbalanced databases. The novel representation was fed into an ensemble DL attack detection model designed specifically for the ICS platform. The presented attack detection model combined the advantages of both DNN and DT approaches for the detection of cyber-attacks from the novel representation.
3. The Proposed Model
In the current manuscript, the HBA-OHDBN system is presented for cyberattack detection in a BC-assisted IoT platform. The major intention behind the HBA-OHDBN algorithm lies in its accurate recognition and classification of the cyberattacks that occur in the IoT platform. The proposed HBA-OHDBN approach has three operational stages: HBA-based feature selection, HDBN-based detection, and DBO-based hyperparameter tuning.
Figure 1 depicts the overall flow of the HBA-OHDBN algorithm.
3.1. BC Technology
The blockchain (BC) is a decentralized digital ledger that enables end-to-end transmission and provides communication among parties that do not trust one another [21]. It comprises a number of blocks. In the current research work, every block except the initial (genesis) block is linked to its prior block via a reverse connection, i.e., the hash code of its preceding block. For instance, each block stores the hash code of the block that precedes it. In addition, all the blocks contain further data fields, namely, an encryption opcode, an identifier, a core hash of every commitment, and a core hash of every operation. Since a single-bit change in the commitments or operations produces a different digest, the unique hash code and the fixed-size core hash values of these two fields are irreversible. Smart contracts running on top of the BC can simplify the enforcement of restrictions and contractual agreements. Once a smart contract is compiled into executable code and recorded on the BC, it becomes irreversible, owing to the irreversibility of the BC and the core hash of the smart contract. The smart contract helps in improving the efficacy of business operations, reducing potential threats, and streamlining administration.
Figure 2 illustrates the structure of the BC.
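The reverse hash linkage between blocks can be illustrated with a short Python sketch. It is a minimal illustration rather than the ledger used in this work: the field names (`index`, `data`, `prev_hash`), the all-zero predecessor hash of the genesis block, and the choice of SHA-256 over a JSON encoding are assumptions made for the example.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 digest over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(payloads: list) -> list:
    """Link every block to its predecessor through the predecessor's hash."""
    chain = []
    prev_hash = "0" * 64  # assumed placeholder for the genesis block
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def is_valid(chain: list) -> bool:
    """Recompute each predecessor hash and compare it with the stored link."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain(["reading A", "reading B", "reading C"])
chain_ok = is_valid(chain)

chain[1]["data"] = "tampered"   # change the contents of a middle block
tampered_ok = is_valid(chain)   # the link stored in block 2 no longer matches
```

Changing any field of a block alters its digest, so the link stored in the following block stops matching; this is the property that makes recorded data, including smart contract code, effectively irreversible.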
3.2. Feature Selection Using HBA
In order to select an optimal subset of features, the HBA is applied. HBA is a recently established metaheuristic approach inspired by the intelligent foraging behavior of honey badgers (HBs). In this approach, the digging and honey-finding strategies of the HB are mimicked during the exploration and exploitation stages of the search [22,23]. The population size (number of HBs) is fixed, and the location of each individual is initialized randomly, as given below:

$$x_i = lb_i + r_1 \times (ub_i - lb_i) \qquad (1)$$

In Equation (1), $r_1$ refers to a random number in the range of [0, 1]; $x_i$ denotes the location of the $i$-th individual of $N$ candidates; and $lb_i$ and $ub_i$ indicate the lower and upper boundaries of the search range, respectively. The smell intensity perceived by an HB depends on the concentration strength of the prey and on the distance between the HB and the prey. The lower the smell intensity, the slower the HB moves, and vice versa. The intensity follows the inverse square law, as given below:

$$I_i = r_2 \times \frac{S}{4\pi d_i^{2}} \qquad (2)$$

$$S = (x_i - x_{i+1})^{2} \qquad (3)$$

$$d_i = x_{prey} - x_i \qquad (4)$$

Here, $S$ shows the source (concentration) strength; $I_i$ denotes the smell intensity perceived by the $i$-th HB; $r_2$ corresponds to a random value between 0 and 1; and $d_i$ signifies the distance between the prey and the $i$-th HB.
The density factor $\alpha$ ensures a steady transition from exploration to exploitation. It decreases with the number of iterations, as updated by Equation (5), to reduce the randomization over time during foraging:

$$\alpha = C \times \exp\left(\frac{-t}{t_{\max}}\right) \qquad (5)$$

In Equation (5), $t$ is the current iteration; $t_{\max}$ corresponds to the maximal iteration count; and $C$ is a constant value fixed at 2. A flag $F$ changes the search direction and gives the HB a greater chance of finding food within the search range.
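In the digging phase, the HB moves along a path shaped like a cardioid. The cardioid movement is simulated using the following expressions:

$$x_{new} = x_{prey} + F \times \beta \times I \times x_{prey} + F \times r_3 \times \alpha \times d_i \times \left|\cos(2\pi r_4) \times \left[1 - \cos(2\pi r_5)\right]\right| \qquad (6)$$

$$F = \begin{cases} 1, & r_6 \le 0.5 \\ -1, & \text{otherwise} \end{cases} \qquad (7)$$

In Equation (6), $d_i$ shows the distance between the target and the existing individual HB; $x_{prey}$ indicates the global optimum location in the existing state; $\beta \ge 1$ corresponds to the capability of the HB to attain food; and $r_3$, $r_4$, and $r_5$ refer to three different random numbers in the range of [0, 1]. In Equation (7), $r_6$ is a further random number in [0, 1] that sets the flag $F$.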
During the digging phase, the HB heavily relies on a few parameters, such as the time-varying search factor $\alpha$, the smell intensity $I$ of the target, and the distance $d_i$ from the prey. Furthermore, HBs are subjected to different disturbances $F$ that may prevent them from finding the best location of the prey. In the honey phase, the HB follows the honeyguide bird to the hive, as shown in the following expression:

$$x_{new} = x_{prey} + F \times r_7 \times \alpha \times d_i \qquad (8)$$

Here, $x_{new}$ refers to the updated individual location of the HB; $x_{prey}$ shows the prey's location; $\alpha$ and $F$ are defined using Equations (5) and (7), respectively; and $r_7$ indicates a randomly generated value within [0, 1]. The HB searches near the prey's position $x_{prey}$ according to the distance $d_i$.
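The two movement phases described above can be sketched in Python with NumPy. This is a minimal sketch of the standard HBA update for a minimization problem, not the authors' implementation: the value `beta=6.0`, the equal-probability choice between the digging and honey phases, the use of the squared Euclidean norm for the squared distance, the greedy replacement rule, and the sphere objective are assumptions made for illustration.

```python
import numpy as np


def hba_step(pop, fitness, t, t_max, rng, beta=6.0, C=2.0):
    """One HBA iteration (minimization) over a population of shape (N, D)."""
    n = pop.shape[0]
    scores = np.array([fitness(x) for x in pop])
    prey = pop[np.argmin(scores)].copy()        # best location found so far
    alpha = C * np.exp(-t / t_max)              # density factor, decays over time
    eps = np.finfo(float).eps                   # guards against division by zero
    new_pop = pop.copy()
    for i in range(n):
        neighbour = pop[(i + 1) % n]
        S = np.sum((pop[i] - neighbour) ** 2)   # source (concentration) strength
        d = prey - pop[i]                       # distance vector to the prey
        I = rng.random() * S / (4 * np.pi * (np.sum(d ** 2) + eps))  # inverse square law
        F = 1.0 if rng.random() <= 0.5 else -1.0  # flag that flips the search direction
        if rng.random() < 0.5:
            # Digging phase: cardioid-shaped move around the prey.
            r3, r4, r5 = rng.random(3)
            cardioid = abs(np.cos(2 * np.pi * r4) * (1 - np.cos(2 * np.pi * r5)))
            cand = prey + F * beta * I * prey + F * r3 * alpha * d * cardioid
        else:
            # Honey phase: move close to the prey, guided by the distance.
            cand = prey + F * rng.random() * alpha * d
        if fitness(cand) < scores[i]:           # keep the move only if it improves
            new_pop[i] = cand
    return new_pop


def sphere(x):
    """Toy objective used only for this demonstration."""
    return float(np.sum(x ** 2))


rng = np.random.default_rng(0)
lb, ub = -5.0, 5.0
pop = lb + rng.random((20, 4)) * (ub - lb)      # random initial locations
initial_best = min(sphere(x) for x in pop)
for t in range(1, 51):
    pop = hba_step(pop, sphere, t, 50, rng)
final_best = min(sphere(x) for x in pop)
```

Because a candidate replaces an individual only when it improves that individual's score, the best score in the population never gets worse from one iteration to the next.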
The Fitness Function (FF) deployed in the presented HBA methodology is intended to strike a balance between the number of selected features (lower is better) and the classification accuracy (higher is better) acquired by utilizing these selected features. Equation (9) defines the FF:

$$Fitness = \alpha \gamma_R(D) + \beta \frac{|R|}{|C|} \qquad (9)$$

Here, $\gamma_R(D)$ implies the error rate of a given classifier; $|R|$ suggests the cardinality of the selected subset; $|C|$ demonstrates the total number of features in the database; and $\alpha$ and $\beta$ correspond to two parameters that weight the impact of classifier quality and subset length, i.e., $\alpha \in [0, 1]$ and $\beta = 1 - \alpha$. These two weights are unrelated to the density factor and the food-attaining capability of the HBA update rules.
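The trade-off between classification error and subset length can be made concrete with a small sketch. It assumes the weighted-sum form of the fitness (classifier error weighted by α, fraction of selected features weighted by β = 1 − α). The 0.5 threshold that turns a continuous HBA location into a feature mask, the weight `alpha=0.99`, and the toy `toy_error_rate` function standing in for a trained classifier are hypothetical choices for illustration.

```python
import numpy as np


def fs_fitness(position, error_rate_fn, alpha=0.99, threshold=0.5):
    """Weighted sum of classifier error and the fraction of selected features."""
    mask = np.asarray(position) > threshold      # binarize the continuous location
    if not mask.any():
        return 1.0                               # empty subset: worst possible value
    beta = 1.0 - alpha
    return alpha * error_rate_fn(mask) + beta * mask.sum() / mask.size


def toy_error_rate(mask):
    """Hypothetical stand-in for a classifier: features 0 and 2 are the useful ones."""
    return 0.10 if mask[0] and mask[2] else 0.40


compact = fs_fitness([0.9, 0.1, 0.8, 0.2], toy_error_rate)   # selects features 0 and 2
full = fs_fitness([0.9, 0.9, 0.9, 0.9], toy_error_rate)      # selects all four features
poor = fs_fitness([0.1, 0.9, 0.1, 0.9], toy_error_rate)      # misses the useful features
```

With equal error rates, the subset with fewer features receives the lower (better) fitness, while a subset that hurts the classifier is penalized far more heavily because α dominates β.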
3.3. Cyberattack Detection Using the HDBN Model
In the current research work, the HDBN approach is applied to detect cyberattacks. Since both the Deep Belief Network (DBN) and the Deep Boltzmann Machine (DBM) techniques have limitations, the current study presents the HDBN model, which employs a DBM comprising two layers of Restricted Boltzmann Machines (RBMs) at the lower levels and a DBN comprising two RBM layers at the upper levels, with both the training time and the model accuracy taken into account [24]. The HDBN technique contains four layers.
represents the visual state and it also acts as the input for the method. All the samples are defined by a set length vector.
, and
correspond to four hidden layers (HLs) with distinct counts of nodes for all the levels. The nodes with similar HL values cannot be linked together; for nodes of distinct layers, among
,
, and
, there exists directed connections. Among HL
,
, and HL
, there exists undirected FC. Next, the sample is signified as a set vector and is trained by the first two layers, i.e.,
and
of DBM training. Here,
stands for DBM outcomes. Simultaneously,
implies the input of the DBN technique that is composed of
and
. Here,
represents the last outcome of the HDBN approach. Related to the visual state, it is called higher-level semantic representation.
Figure 3 illustrates the architecture of DBN.
In order to obtain the optimum parameters for the method, Hinton established an energy model in which the optimal parameters are embedded in the energy function. The most important mechanism in detecting a statistical pattern is capturing the correlations among variables, which is what the energy model does:

$$E(v, h) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m} \sum_{j=1}^{n} v_i w_{ij} h_j \qquad (10)$$

Here, $n$ signifies the number of hidden nodes (HNs); $m$ denotes the number of visible nodes (VNs); $a_i$ and $b_j$ denote the biases of the visible layer and the HL, respectively; and $w_{ij}$ is the weight of the connection between VN $i$ and HN $j$. This equation is the energy function that accounts for every connection between the VNs and HNs. The objective function of the RBM accumulates the energies over all VN and HN configurations. However, evaluating it requires the values of every HN to be enumerated for every instance, so the computation has exponential complexity. The joint probability of the VNs and HNs is given herewith:

$$P(v, h) = \frac{e^{-E(v, h)}}{Z} \qquad (11)$$

With this probability established, the energy model becomes simpler to solve. The goal of training is to reach minimal energy. In statistical mechanics, a low-energy state has a higher probability of occurrence than a high-energy state. So, it is required to maximize the probability

$$P(v) = \frac{1}{Z} \sum_{h} e^{-E(v, h)} \qquad (12)$$

so

$$\ln P(v) = \ln \sum_{h} e^{-E(v, h)} - \ln Z \qquad (13)$$

where

$$Z = \sum_{v, h} e^{-E(v, h)} \qquad (14)$$

denotes the normalizing factor.
In Equation (13), the first term on the right denotes the negative of the free energy of the whole network; on the left, the log-probability is shown. Accordingly, maximum likelihood estimation is used in the presented model to determine the model parameters.
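The relation between the energy function, the free energy, and the probability of a visible vector can be checked numerically on a tiny RBM. The sketch below assumes binary visible and hidden units and the standard bilinear RBM energy; the layer sizes and the randomly drawn weights and biases are arbitrary illustrative values. Enumerating every configuration is only feasible at this toy scale, which is exactly the exponential cost noted above.

```python
import itertools

import numpy as np


def energy(v, h, W, a, b):
    """Energy of one joint configuration of visible units v and hidden units h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)


def free_energy(v, W, a, b):
    """Free energy of v with the binary hidden units summed out analytically."""
    return -(a @ v) - np.sum(np.logaddexp(0.0, b + v @ W))


rng = np.random.default_rng(0)
m, n = 3, 2                                   # visible and hidden node counts
W = rng.normal(scale=0.5, size=(m, n))        # connection weights
a = rng.normal(scale=0.1, size=m)             # visible biases
b = rng.normal(scale=0.1, size=n)             # hidden biases

visibles = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=m)]
hiddens = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n)]

# Normalizing factor: sum over all 2^(m+n) joint configurations.
Z = sum(np.exp(-energy(v, h, W, a, b)) for v in visibles for h in hiddens)

v0 = visibles[5]
p_marginal = sum(np.exp(-energy(v0, h, W, a, b)) for h in hiddens) / Z
p_free = np.exp(-free_energy(v0, W, a, b)) / Z
total = sum(np.exp(-free_energy(v, W, a, b)) for v in visibles) / Z
```

Here `p_marginal` sums the joint probability over all hidden states by brute force, while `p_free` obtains the same value from the free energy alone; lowering the free energy of a training vector therefore raises its probability.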
3.4. Hyperparameter Selection Using DBO Algorithm
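In this final stage, the DBO algorithm is applied to tune the hyperparameter values of the HDBN model. The DBO algorithm is inspired by the natural behaviors of dung beetles [25]. Here, the optimum solution is attained by updating the locations of the dung beetles [26]. The location of the ball-rolling dung beetles is updated using the following Equation (15):

$$x_i(t+1) = x_i(t) + \alpha \times k \times x_i(t-1) + b \times \left|x_i(t) - X^{w}\right| \qquad (15)$$

When a ball-rolling beetle meets an obstacle, it reorients itself as $x_i(t+1) = x_i(t) + \tan(\theta)\left|x_i(t) - x_i(t-1)\right|$.
The location-updating formula for female dung beetles is given herewith:

$$B_i(t+1) = X^{*} + b_1 \times \left(B_i(t) - Lb^{*}\right) + b_2 \times \left(B_i(t) - Ub^{*}\right) \qquad (16)$$

The location-updating formula for small dung beetles is given below.

$$x_i(t+1) = x_i(t) + C_1 \times \left(x_i(t) - Lb^{b}\right) + C_2 \times \left(x_i(t) - Ub^{b}\right) \qquad (17)$$

The location-updating formula for thief dung beetles is as follows:

$$x_i(t+1) = X^{b} + S \times g \times \left(\left|x_i(t) - X^{*}\right| + \left|x_i(t) - X^{b}\right|\right) \qquad (18)$$

where $x_i(t)$ is the location of the $i$-th dung beetle at iteration $t$ and $B_i(t)$ is the location of the $i$-th brood ball; $\alpha \in \{-1, 1\}$ models the natural deviation of the rolling direction; $X^{w}$ is the global worst location; the $k$ and $b$ values are fixed, i.e., 0.1 and 0.3, respectively; the flexure angle $\theta$ ranges over $[0, \pi]$ and is utilized in the dung beetles' location-updating formula; $X^{*}$ and $X^{b}$ show the present optimum location and the global optimum location, respectively; and $b_1$ and $b_2$ signify two independent random vectors of size $1 \times D$. The reproduction and ovipositional regions have upper and lower boundaries, which are denoted by $Ub^{*}$ and $Lb^{*}$, while $Ub^{b}$ and $Lb^{b}$ bound the foraging region of the small dung beetles. Moreover, a random vector of $1 \times D$ size that follows a uniform distribution is characterized by $g$, and $S$ signifies a constant value. The parameters $C_1$ and $C_2$ are uniformly distributed random numbers in [0, 1]. The locations of the dung beetles are updated continuously until the optimum solution is reached.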
Fitness selection is a key aspect of the DBO algorithm. The encoded solution is utilized to assess the quality of the candidate outcomes. At present, accuracy is the major condition exploited in the design of the FF.

$$Fitness = \max(P), \qquad P = \frac{TP}{TP + FP}$$

In the above equation, $FP$ and $TP$ define the false- and true-positive values, respectively.
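As a small illustration of how such a fitness can rank candidate hyperparameter settings, the sketch below scores each candidate by the ratio of true positives to all predicted positives. Because only the true- and false-positive counts enter the definition, this ratio is one plausible reading of the fitness and should be treated as an assumption; the candidate names and their counts are invented for the example.

```python
def precision_fitness(true_positive: int, false_positive: int) -> float:
    """Share of predicted positives that are correct (0.0 if nothing is predicted)."""
    predicted_positive = true_positive + false_positive
    if predicted_positive == 0:
        return 0.0
    return true_positive / predicted_positive


# Hypothetical (TP, FP) counts obtained with two candidate hyperparameter settings.
candidates = {"config_a": (90, 10), "config_b": (80, 5)}
scores = {name: precision_fitness(tp, fp) for name, (tp, fp) in candidates.items()}
best = max(scores, key=scores.get)   # the optimizer keeps the highest-scoring candidate
```

In a full tuning loop, each dung beetle location would be decoded into a hyperparameter setting, the HDBN would be trained and validated with it, and the resulting counts would be passed to a function of this kind.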
4. Results and Discussion
The cyberattack detection performance of the HBA-OHDBN technique is examined in this section using the standard NSL-KDD dataset [27], comprising 148,517 samples under five classes, as summarized in Table 1.
Figure 4 shows the classification outcomes of the HBA-OHDBN algorithm under the test dataset.
Figure 4a,b portray the confusion matrices generated by the HBA-OHDBN method on a 60:40 split of the data into a training (TR) set and a testing (TS) set. The outcome values demonstrate that the HBA-OHDBN approach detected and classified all five classes accurately.
Figure 4c,d depict the classifier outcomes of the HBA-OHDBN methodology on the same 60:40 TR/TS split. The proposed method accomplished effective detection of all four kinds of attacks and normal instances.
The cyberattack detection output of the HBA-OHDBN technique is shown in
Table 2 and
Figure 5. The results highlight the effectual identification of all four kinds of attacks and normal samples. With the 60% TR set, the HBA-OHDBN technique achieved average
,
,
,
, and
values of 99.17%, 96.09%, 75.24%, 75.83%, and 87.30%, respectively. Moreover, with the 40% TS set, the HBA-OHDBN approach achieved average
,
,
,
, and
values of 99.21%, 76.26%, 75.04%, 75.63%, and 87.21%, respectively.
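A very high average accuracy alongside a noticeably lower average recall, as in the values above, is typical when per-class metrics are computed one-vs-rest and then averaged over imbalanced classes. The sketch below shows the effect on a made-up three-class confusion matrix. The one-vs-rest macro-averaging scheme is an assumption, since the averaging procedure is not spelled out here, and the matrix entries are invented.

```python
import numpy as np


def macro_metrics(confusion):
    """Macro-averaged one-vs-rest metrics; rows are true classes, columns predictions."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp          # predicted as the class but belonging elsewhere
    fn = cm.sum(axis=1) - tp          # belonging to the class but predicted elsewhere
    tn = total - tp - fp - fn
    with np.errstate(divide="ignore", invalid="ignore"):
        accuracy = (tp + tn) / total
        precision = np.where(tp + fp > 0, tp / (tp + fp), 0.0)
        recall = np.where(tp + fn > 0, tp / (tp + fn), 0.0)
        f_score = np.where(precision + recall > 0,
                           2 * precision * recall / (precision + recall), 0.0)
    return {
        "accuracy": float(accuracy.mean()),
        "precision": float(precision.mean()),
        "recall": float(recall.mean()),
        "f_score": float(f_score.mean()),
    }


# Invented example: the rare third class (10 samples) is never predicted correctly.
confusion = [
    [90, 10, 0],
    [5, 95, 0],
    [8, 2, 0],
]
metrics = macro_metrics(confusion)
```

In this example the rare class contributes a recall of zero but a one-vs-rest accuracy above 95%, because its many true negatives dominate; the averaged accuracy therefore stays high while the averaged recall drops sharply.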
Figure 6 demonstrates the training and validation accuracy values achieved via the HBA-OHDBN method. The training accuracy is determined by evaluating the HBA-OHDBN method on the TR dataset, whereas the validation accuracy is computed by evaluating the performance of the proposed method on a separate testing dataset. The outcomes demonstrate that both accuracy values increased with the number of epochs. As a result, the performance of the HBA-OHDBN technique on the TR and TS datasets improved with the increase in the number of epochs.
Figure 7 shows the training and validation loss curves of the HBA-OHDBN technique. The training loss is defined as the error between the predicted outcomes and the original values of the TR data. The validation loss measures the outcome of the HBA-OHDBN system on separate validation data. The results indicate that both loss values decreased as the number of epochs rose. This trend portrays the enhanced performance of the HBA-OHDBN technique and its ability to make accurate classifications. The low training and validation loss values demonstrate the capability of the HBA-OHDBN technique to capture the patterns and relationships in the data.
A detailed precision-recall (PR) analysis was conducted for the HBA-OHDBN approach using the test database, and the results are shown in Figure 8. The simulation values depict that the HBA-OHDBN system achieved high PR values on all five classes.
Figure 9 shows the ROC curve of the HBA-OHDBN approach on the test database. The simulation values indicate that the HBA-OHDBN system achieved superior ROC values. Thus, it can be inferred that the HBA-OHDBN methodology achieved a better ROC performance in all five classes.
In Table 3 and Figure 10, the results of the comparative analysis between the proposed HBA-OHDBN technique and other existing models are provided [28]. The results indicate that the HBA-OHDBN technique achieved improved performance over other models. The NSL-KDD IoT model achieved the worst results, while the NIDS-ML, AE-MLID, DL-NIDS, and ADNT-ELM models obtained results close to one another. Although the remaining models achieved reasonable outcomes, the HBA-OHDBN technique outperformed all of them.
Finally, the Computation Time (CT) results of the HBA-OHDBN technique are compared with recent models in Table 4 and Figure 11. The results indicate that the HBA-OHDBN technique reached effectual outcomes with a minimal CT of 0.95 s. On the other hand, the NIDS-ML, AE-MLID, NSL-KDD IoT, DL-Improved ID, DL-NIDS, ADNT-ELM, and DL-DCSCA IoT CN models attained higher CT values. Therefore, the HBA-OHDBN technique can be applied to classify cyberattacks automatically.
5. Conclusions
In the current manuscript, the authors have presented the HBA-OHDBN system for cyberattack detection in an IoT platform. The major intention behind the HBA-OHDBN approach is to accurately recognize and classify cyberattacks in an IoT platform. The proposed HBA-OHDBN system has three stages of operations: HBA-based feature selection, HDBN-based detection, and DBO-based hyperparameter tuning. Moreover, BC technology is also applied to improve network security. In the presented HBA-OHDBN technique, feature selection using HBA was applied to select the optimal set of features, which, in turn, led to improved results. Finally, the DBO algorithm was utilized to adjust the hyperparameter values of the HDBN algorithm. The performance of the HBA-OHDBN system was validated using the benchmark IDS database. The experimental results confirmed the superior performance of the HBA-OHDBN technique compared to other recent approaches. Future research work can focus on auditing and securing smart contracts in the IoT environment. In addition, advanced tools and methodologies can be developed for the identification of vulnerabilities in IoT-related smart contracts in order to ensure their resilience against cyberattacks. Furthermore, scalability challenges related to BC technology can be addressed to accommodate the growing number of IoT devices and transactions.