1. Introduction
Network security has become increasingly important in recent years because of its critical role in fields such as the economy, the military, and healthcare. Establishing effective defenses against network attacks and securing network devices and information therefore requires careful consideration [1].
As globalization and multipolar development continue, great-power competition has extended into cyberspace. The scale and frequency of attacks against governments, industry chains, and other entities have increased year by year [2,3]. Additionally, the COVID-19 pandemic has made network security issues more urgent. Intrusion detection is considered one of the mandatory lines of defense for protecting critical networks from a growing number of intrusions [4].
However, as new technologies such as 5G, the industrial Internet, and cloud platforms develop rapidly, the forms and methods of network intrusion have also evolved. Traditional detection methods cannot adapt to these new intrusion scenarios [5,6,7]. New approaches are therefore needed to enhance intrusion detection and improve network security in the face of evolving threats.
This paper discusses the use of meta-learning for intrusion detection in open domains, focusing on open-domain issues, network traffic processing, and meta-learning techniques. Open-domain problems pose three main challenges: differences in label sets between domains, differences in distribution between source domains, and the possibility of entirely new categories in the target domain. The research aims to develop a model that generalizes well across the source domains and can then be applied to an unknown target domain. Meta-learning algorithms can mitigate the problem of insufficient data samples, requiring only a small number of samples to identify and classify traffic. Feature extraction from network traffic is necessary to reduce data redundancy and dimensionality and to lower computational costs. The research results are expected to improve network security, prevent network attacks and intrusions, protect network devices and information, and support national security, economic development, and social stability.
The contributions of this research are summarized as follows.
Introducing the open domain scenario, which allows the model to achieve generalization on an unknown target domain with only source domain training, addresses the limitations of traditional intrusion detection models and aligns more closely with real-world intrusion scenarios.
Designing the Open DGML framework, which integrates flow image technology, data augmentation, and meta-learning algorithms to improve the performance of intrusion detection in open domains.
Proposing the flow image technology and Mul-Mixup technique, which address the problem of insufficient sample size and improve the model’s generalization performance.
3. System Design
3.1. Overview
The open-domain generation meta-learning (Open DGML) framework is an innovative intrusion detection system that employs a three-stage process: flow image technology for preprocessing network traffic into analyzable images, data augmentation technology using Mul-Mixup to enhance sample diversity and model generalizability, and an open-domain generalization meta-learning algorithm that trains robust meta-learners capable of classifying both known and unknown traffic categories across various domains. This system innovatively addresses key challenges in intrusion detection, such as data imbalance and domain generalization, by integrating advanced feature extraction, multi-domain data enhancement, and meta-learning strategies for improved classification accuracy and adaptability to new, unseen domains. The overall architecture of the system is shown in
Figure 1.
3.2. Flow Image Technology
In network security, effective processing of network traffic data is essential for intrusion detection. Traditionally, these data arrive as a sequence of network flows, each comprising many packets. To use these data with machine learning algorithms, particularly convolutional neural networks (CNNs), they must be transformed into an appropriate format. This section describes our proposed method for encoding network traffic data and converting them into grayscale images, as shown in Figure 2 and Table 1.
Our flow image technology is underpinned by a meticulous four-step process: data stream segmentation, feature extraction, coding, and image conversion. Initially, network traffic is parsed to segment packets into coherent flows based on a quintuple schema, capturing the temporal sequence and aggregation of packets into a cohesive unit.
Feature extraction identifies and categorizes the fundamental, content, and flow similarity features of each flow. The features fall into two types: categorical and continuous. Categorical features include protocol type, service, and status. These features are converted into binary vectors using one-hot encoding. For example, the protocol type feature, which takes the values TCP, UDP, and ICMP, is converted into a three-dimensional binary vector (001, 010, 100). Similarly, binary features such as Land and Logged_In, which have only two possible values (0 and 1), are also encoded using one-hot encoding.
Continuous features, on the other hand, are numerical and include features such as duration, urgent, and count. These features are first normalized to the range [0, 1] using the min–max method. They are then discretized into 10 intervals, and the interval index is one-hot encoded. For example, a continuous feature with a range of [0, 100] would be divided into 10 intervals of width 10 (0–10, 10–20, 20–30, etc.), and each interval would be represented by a binary vector using one-hot encoding.
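The two encodings described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the category order in `PROTOCOLS`, the helper names, and the handling of the upper boundary (a value equal to the maximum falls into the last interval) are assumptions made here.

```python
def one_hot(index, size):
    """Return a list of `size` bits with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

# Hypothetical category order; the text does not fix which code maps to which protocol.
PROTOCOLS = ["tcp", "udp", "icmp"]

def encode_categorical(value, categories):
    """One-hot encode a categorical feature by its position in `categories`."""
    return one_hot(categories.index(value), len(categories))

def encode_continuous(value, lo, hi, bins=10):
    """Min-max normalize to [0, 1], then one-hot encode the interval index."""
    norm = 0.0 if hi == lo else (value - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)        # clip values outside [lo, hi]
    idx = min(int(norm * bins), bins - 1)  # value == hi goes to the last interval
    return one_hot(idx, bins)

protocol_vec = encode_categorical("udp", PROTOCOLS)
duration_vec = encode_continuous(25, lo=0, hi=100)
print(protocol_vec)
print(duration_vec)
```

Concatenating the per-feature vectors gives one binary feature vector per flow.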
Image conversion converts network flows into a visual format. Selecting the initial M packets and the first N bytes from each, we construct an $M \times N$ matrix for each packet. These matrices are then grouped in threes to represent the RGB channels of a color image, culminating in a 24-bit color image. Zero padding is used if a flow contains fewer than M packets or a packet contains fewer than N bytes. This approach not only captures the temporal and spatial structure of the data but also significantly reduces computational complexity and improves real-time processing.
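A minimal sketch of the truncation, zero padding, and channel grouping described above. It assumes one byte matrix with M rows and N columns per flow and that three such matrices supply the R, G, and B channels. The function names and the sample bytes are hypothetical, and the text does not specify which three matrices form one image.

```python
def flow_to_matrix(packets, m, n):
    """Keep the first m packets and the first n bytes of each, zero-padded to m x n."""
    rows = []
    for pkt in packets[:m]:
        row = list(pkt[:n])          # bytes -> list of ints in 0..255
        row += [0] * (n - len(row))  # pad short packets with zeros
        rows.append(row)
    while len(rows) < m:             # pad short flows with all-zero rows
        rows.append([0] * n)
    return rows

def stack_rgb(mat_r, mat_g, mat_b):
    """Combine three m x n byte matrices into an m x n grid of (R, G, B) pixels."""
    return [list(zip(r, g, b)) for r, g, b in zip(mat_r, mat_g, mat_b)]

# Illustrative packet payloads, not taken from any dataset.
flow = [b"\x45\x00\x00\x3c", b"\x45\x00", b"\x45\x00\x00\x28\xab"]
mat = flow_to_matrix(flow, m=4, n=4)
image = stack_rgb(mat, mat, mat)
print(mat)
```

Each pixel holds three 8-bit values, which matches the 24-bit color depth mentioned in the text.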
Flow image technology has the potential to improve the accuracy and efficiency of network traffic analysis for intrusion detection and network security. Compared to directly using the network flow data as input to the model, this method reduces the computational complexity and improves real-time performance. By transforming the data into a format amenable to CNNs, we pave the way for advanced machine learning techniques to be deployed in real-time network security applications [
27].
In summary, flow image technology stands out for its innovative approach to preparing network traffic data for machine learning analysis. It not only streamlines the data for efficient processing but also retains the integrity of the traffic’s temporal and spatial characteristics. The effectiveness of this method has been corroborated through empirical evidence, showcasing its potential to bolster the accuracy and efficiency of intrusion detection systems.
3.3. Data Augmentation
In intrusion detection, imbalanced samples have been a major concern [28]. To counteract this issue, we introduce a data augmentation strategy that enhances the diversity of the training samples without the need for additional data, thereby improving the model’s generalization capabilities. The vicinal risk minimization (VRM) concept can be used to describe the neighborhood of each training sample and generate additional virtual samples to expand the support of the training distribution [29].
Common data augmentation techniques comprise (1) geometric transformations [
30,
31], including horizontal and vertical flips, random rotations, and scaling, which are especially efficacious for addressing direction-insensitive challenges such as image classification; and (2) pixel-based transformations that encompass color adjustments, padding, and the introduction of noise. For example, Francisco et al. [
32] demonstrate that the incorporation of Gaussian noise into the samples leads to a measurable enhancement of 1.65% in accuracy across a spectrum of nine datasets sourced from the UCI repository.
Mixup. Our approach harnesses the power of Mixup, an established technique for data augmentation that creates new samples by interpolating between existing ones. This method introduces a parameter $\lambda$, which follows a Beta distribution, to control the interpolation strength between a pair of randomly selected samples $(x_i, y_i)$ and $(x_j, y_j)$. The resulting mixed sample $\tilde{x}$ and its corresponding label $\tilde{y}$ are given by
$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j.$$
Similar methodologies include Cutout and CutMix. Cutout mitigates overfitting by randomly zeroing out portions of the input, while CutMix enhances robustness by swapping and blending patches between images.
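The Mixup interpolation described above can be illustrated with a short sketch. The Beta shape parameter `alpha=0.2` and the toy feature and label vectors are assumptions chosen for illustration; the text does not state which value is used.

```python
import random

def mixup(x_i, y_i, x_j, y_j, alpha=0.2, rng=random):
    """Interpolate two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam

rng = random.Random(0)  # fixed seed so the sketch is repeatable
x, y, lam = mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], alpha=0.2, rng=rng)
print(lam, x, y)
```

Because the same coefficient weights both features and labels, the mixed label remains a valid probability vector.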
Mul-Mixup. We introduce Mul-Mixup, a novel schema for data augmentation that transcends traditional Mixup methods. By leveraging samples from multiple source domains, Mul-Mixup covers a larger sample space and generates more diverse and informative mixed samples. Unlike Mixup, which interpolates between two samples, Mul-Mixup combines features and labels from multiple domains through a weighted sum, where each domain’s weight adheres to a multi-dimensional Beta distribution.
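The formulation of Mul-Mixup is as follows: given data features $x_1, \dots, x_n$ and corresponding labels $y_1, \dots, y_n$ from different domains, the mixed sample and label are created by
$$\tilde{x} = \sum_{k=1}^{n} \lambda_k x_k, \qquad \tilde{y} = \sum_{k=1}^{n} \lambda_k y_k,$$
where $\lambda = (\lambda_1, \dots, \lambda_n)$ represents the mixing coefficients.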
Mul-Mixup is particularly effective in open-domain settings, where class imbalance and missing samples are prevalent. The resulting mixed samples effectively address these challenges, equipping our models with superior classification and generalization capabilities.
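A minimal sketch of the weighted-sum mixing across domains. It assumes that the "multi-dimensional Beta distribution" is a Dirichlet distribution, so the coefficients are non-negative and sum to one, and it samples them by normalizing Gamma draws. The concentration `alpha=1.0`, the function names, and the toy inputs are hypothetical.

```python
import random

def dirichlet(k, alpha=1.0, rng=random):
    """Sample k non-negative weights that sum to 1 by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def mul_mixup(xs, ys, alpha=1.0, rng=random):
    """Mix one sample per domain: weighted sums of features and of one-hot labels."""
    lams = dirichlet(len(xs), alpha, rng)
    x = [sum(l * x_k[d] for l, x_k in zip(lams, xs)) for d in range(len(xs[0]))]
    y = [sum(l * y_k[d] for l, y_k in zip(lams, ys)) for d in range(len(ys[0]))]
    return x, y, lams

rng = random.Random(0)
xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one sample per domain
ys = [[1, 0], [0, 1], [1, 0]]                              # one-hot labels
x, y, lams = mul_mixup(xs, ys, rng=rng)
print(lams, x, y)
```

With two domains this reduces to ordinary Mixup, since a two-dimensional Dirichlet draw is a Beta draw and its complement.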
In summary, our data augmentation techniques, with a spotlight on the innovative Mul-Mixup, provide a robust framework for improving the generalization and robustness of intrusion detection models. These methods ensure that our models are well prepared to tackle a diverse array of threats in real-world scenarios.
3.4. Meta-Learning Intrusion Detection
In the dynamic field of cyber security, the introduction of meta-learning for intrusion detection represents a significant advancement, offering a robust solution to the pervasive challenge of domain shift. Open DGML leverages the innovative meta-learning domain generalization (MLDG) approach to fortify models against performance degradation when transitioning from training to novel, unseen domains.
Meta-Learning Domain Generalization (MLDG). The MLDG algorithm pioneers the integration of meta-learning within domain generalization, addressing the pivotal issue of domain drift. By simulating the training/test domain shift through the synthesis of virtual test domains within each mini-batch, MLDG optimizes for performance across both training and test domains. This dual focus culminates in the development of a meta-learned model that exhibits enhanced generalization capabilities in the face of unknown domains.
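Assuming that the $S$ source domains and the target domain share the same task, with the same label space and input features but different statistics, a single model parameterized by $\Theta$ is defined to solve the specified task. The algorithm comprises meta-training and meta-testing phases. During meta-training, the model is updated on the $S - V$ meta-training domains; in the meta-testing phase, the model is virtually evaluated on the $V$ held-out meta-test domains, simulating performance on a new domain with different statistical properties.
For the meta-training phase, the model is updated on all $S - V$ meta-training domains with the loss function:
$$\mathcal{F}(\Theta) = \frac{1}{S - V} \sum_{i=1}^{S-V} \frac{1}{N_i} \sum_{j=1}^{N_i} \ell_{\Theta}\left(\hat{y}_j^{(i)}, y_j^{(i)}\right),$$
where $y_j^{(i)}$ denotes the $j$-th data point among the $N_i$ points in the $i$-th domain, $\hat{y}_j^{(i)}$ is the corresponding prediction, and $\ell_{\Theta}$ is the per-sample loss. The model parameters are subsequently refined based on the gradient $\nabla_{\Theta} = \mathcal{F}'(\Theta)$ of this loss function, giving $\Theta' = \Theta - \alpha \nabla_{\Theta}$.
For the meta-testing phase, the updated model is evaluated on the $V$ meta-test domains with the loss function:
$$\mathcal{G}(\Theta') = \frac{1}{V} \sum_{i=1}^{V} \frac{1}{N_i} \sum_{j=1}^{N_i} \ell_{\Theta'}\left(\hat{y}_j^{(i)}, y_j^{(i)}\right),$$
which guides the optimization of the model for generalization across domains.
The ultimate goal of the MLDG algorithm is encapsulated by the following objective:
$$\arg\min_{\Theta} \; \mathcal{F}(\Theta) + \beta \, \mathcal{G}\left(\Theta - \alpha \mathcal{F}'(\Theta)\right), \qquad \Theta \leftarrow \Theta - \gamma \, \frac{\partial \left( \mathcal{F}(\Theta) + \beta \, \mathcal{G}(\Theta - \alpha \nabla_{\Theta}) \right)}{\partial \Theta},$$
where $\alpha$, $\beta$, and $\gamma$ are the learning rates for the respective updates.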
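To make the two-phase update concrete, here is a toy sketch on a one-parameter linear model with squared loss, where the gradient of the combined objective can be written in closed form. It assumes one held-out meta-test domain per step (V = 1) and uses `alpha`, `beta`, and `gamma` as names for the three step sizes. The synthetic domains and all numeric values are hypothetical and unrelated to the experiments in this paper.

```python
import random

def make_domain(slope, xs=(0.5, 1.0, 1.5)):
    """Synthetic domain: targets follow y = slope * x."""
    return [(x, slope * x) for x in xs]

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    return sum(2 * x * (w * x - y) for x, y in data) / len(data)

def hess(data):
    return sum(2 * x * x for x, _ in data) / len(data)

def mldg_step(w, domains, v, alpha, beta, gamma, rng):
    """One update on F(w) + beta * G(w - alpha * F'(w)) for a random domain split."""
    order = list(domains)
    rng.shuffle(order)
    meta_test, meta_train = order[:v], order[v:]
    g_train = sum(grad(w, d) for d in meta_train) / len(meta_train)
    h_train = sum(hess(d) for d in meta_train) / len(meta_train)
    w_prime = w - alpha * g_train                      # virtual meta-train update
    g_test = sum(grad(w_prime, d) for d in meta_test) / len(meta_test)
    # Chain rule through w_prime: d w_prime / d w = 1 - alpha * F''(w)
    total = g_train + beta * g_test * (1 - alpha * h_train)
    return w - gamma * total

rng = random.Random(0)
domains = [make_domain(s) for s in (1.0, 2.0, 3.0)]
w = 0.0
for _ in range(200):
    w = mldg_step(w, domains, v=1, alpha=0.1, beta=1.0, gamma=0.05, rng=rng)
print(round(w, 3))
```

In a neural network the second-order term is normally obtained by automatic differentiation rather than by hand; the closed form is used here only because the toy loss is quadratic.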
The MLDG framework has been adapted in various ways to accommodate different types of domain generalization problems, such as a feature-wise evaluation network for heterogeneous domain generalization and a regularization function in a meta-learning framework for non-uniform domain generalization settings [33,34]. Overall, MLDG is a promising approach for domain generalization, as it can improve cross-domain generalization performance by addressing the problem of domain shift [35].
Open-Domain Generalization Meta-Learning Algorithm. The open-domain generalization meta-learning algorithm (Open DGML) is central to our intrusion detection system, designed to classify traffic into known categories within the source domains and segregate unknown categories for expert analysis. This algorithm introduces a meta-learning approach to address the domain drift phenomenon by establishing meta-learners in each source domain, capable of both domain-specific classification and generalization across all domains. The Open-DGML algorithm is shown in Algorithm 1.
The algorithm begins with the random initialization of parameters and iteratively performs meta-training and meta-testing phases. In the meta-training phase, small batches of data from the source domains are augmented using Mul-Mixup and used to train an ensemble of meta-learners. The meta-learners are updated based on the meta-training and meta-testing loss functions, which incorporate both the original data and the Mul-Mixup-augmented data.
After the meta-training phase, trained meta-learners are obtained for each source domain. The final model aggregates predictions from all meta-learners to classify unknown target-domain data, employing domain similarity and predictive uncertainty metrics to quantify the transferability of each sample. This approach prepares our models to counter a diverse array of threats across various domains.
Algorithm 1: Open-Domain Generalization Meta-Learning Algorithm
Data: Source domains, step sizes
Result: Meta-learner models
1. Randomly initialize parameters
2. while not converged do
3.   Sample a mini-batch from the source domains
4.   for each sampled source domain do
5.     Obtain augmented samples by performing Mul-Mixup on the mini-batch
6.     Evaluate loss
7.     Update parameters
8.   Sample again from the source domains
9.   for each sampled source domain do
10.    Obtain augmented samples by performing Mul-Mixup on the new mini-batch
11.    Evaluate loss
12.    Update parameters
13. end while
In summary, the integration of meta-learning into intrusion detection exemplifies a strategic evolution, offering a proactive and adaptable defense mechanism against the ever-changing landscape of cyber threats. The open-domain generalization meta-learning algorithm presents a comprehensive strategy for enhancing the robustness and generalization of intrusion detection systems.
4. Experimental Setup
4.1. Experimental Environment
Our experiments were conducted on a machine with an Intel Xeon E5-2630 CPU (Intel, Santa Clara, CA, USA), two NVIDIA GTX 1080 Ti GPUs (NVIDIA, Santa Clara, CA, USA), and 16 GB of memory. The software environment included CentOS 7.6, NVIDIA CUDA 10.1, PyTorch 1.7, and Python 3.7.
4.2. Evaluation Datasets
In our experiments, we selected four intrusion detection datasets, namely ISCX2012 [36], NDSec-1 [37], CICIDS2017 [28], and CICIDS2018 [38], which cover a variety of attack types including DoS, DDoS, brute force, penetration, web attacks, botnets, Heartbleed, and port scanning, as shown in Table 2. These datasets were chosen to meet the requirements of open-domain settings: they have different distributions and multiple attack types, and some contain unique types not found in the other datasets. ISCX2012 and NDSec-1 are real-world datasets, while CICIDS2017 and CICIDS2018 are simulation-based datasets. The different distributions, attack types, and sample sizes of these datasets provide a better evaluation of intrusion detection algorithms. It is important to consider various factors when selecting a dataset, such as its realism, reliability, sample size, attack types, and distribution. Additionally, proper preprocessing and feature extraction methods should be applied to ensure that the algorithm can effectively learn the features of the data and improve detection accuracy.
4.3. Baselines
In this study, we compare the performance of our proposed algorithm, Open DGML, with three popular intrusion detection models, namely HAST-IDS, CLAIRE, and FC-Net.
HAST-IDS [
39] is designed to address the challenge of manually designing accurate flow features in the field of intrusion detection. It utilizes the spatiotemporal characteristics of network traffic data to automatically learn the features of raw traffic data using neural networks, thereby improving the effectiveness of intrusion detection. The entire feature learning process is automatic and does not require feature engineering techniques.
CLAIRE [
40] proposes a multi-stage intrusion detection method that includes an autoencoder for constructing network data flow feature vectors and a CNN architecture for classification. The data mapping method allows for the construction of image data to express potential data patterns that occur in adjacent flows, and the nearest clustering method has been proven to be an effective means of describing intrusion information patterns that appear in the flow of the neighborhood.
FC-Net [
41] is an end-to-end neural network composed of F-Net (feature extraction network) and C-Net (comparison network), which achieves feature extraction and comparison and is particularly suitable for small-sample detection. These two networks are cascaded, and FC-Net constitutes an end-to-end implementation of input data streams and incremental score outputs. To perform small-sample detection, it only needs to compare the features of the test and training sets. It compares the data from the training set and the test set one by one and uses the average increment score generated by different types of network traffic in the training set to determine which type of traffic the test sample belongs to.
4.4. Evaluation Metrics
To evaluate the performance of our proposed intrusion detection algorithm, we use accuracy (ACC) and detection rate (DR) as evaluation metrics. ACC measures the percentage of correct predictions, while DR measures the proportion of malicious samples that are correctly identified. Additionally, we use the Fréchet distance (FD) to measure the distance between the feature spaces of the source and target domains, which reflects the degree of similarity between the two datasets. We repeat each experiment multiple times and report the average metrics to evaluate overall performance. FD is calculated as the minimum, over all ways of traversing the two feature spaces with sampling times in the range [0, 1], of the maximum distance between corresponding points.
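As a small illustration of the two classification metrics, the sketch below computes ACC and DR from label lists. It assumes that DR counts a malicious sample as detected whenever it receives any non-benign prediction; the text leaves this detail open. The `benign` label name and the sample labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def detection_rate(y_true, y_pred, benign="benign"):
    """Fraction of truly malicious samples that receive a non-benign prediction."""
    malicious = [(t, p) for t, p in zip(y_true, y_pred) if t != benign]
    if not malicious:
        return 0.0
    detected = sum(p != benign for _, p in malicious)
    return detected / len(malicious)

y_true = ["benign", "dos", "dos", "ddos", "benign"]
y_pred = ["benign", "dos", "benign", "dos", "dos"]
acc = accuracy(y_true, y_pred)
dr = detection_rate(y_true, y_pred)
print(acc, dr)
```

Under this definition a DDoS flow labelled as DoS counts toward DR but not toward ACC, which is one reason the two metrics can rank algorithms differently.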
4.5. Hyper-Parameter Settings
Our model comprises 6 convolutional layers and 2 fully connected layers (8 layers in total), followed by an output layer. Each convolutional layer has 256 filters with a size of 3 × 3. The first two convolutional layers are followed by a pooling layer, and the last convolutional layer is followed by a pooling layer and two fully connected layers with 256 and 1024 nodes, respectively. The output layer is a fully connected layer with one node per class in the dataset.
We set the batch size to 100 and the number of iterations to 120 based on the convergence analysis of the model’s accuracy and loss within a batch. The step size is set to 0.01, and beta is set to 0.01. We use the Adam optimizer, a stochastic gradient-based method that dynamically adjusts the learning rate of each parameter based on its first- and second-order moment estimates, to optimize the model’s training process.
5. Experimental Results
Our experiments aim to answer the following research questions (RQs):
RQ1: How does the Open DGML algorithm perform in comparison to existing intrusion detection systems in non-open domain scenarios?
RQ2: How does the Open DGML algorithm perform in comparison to existing intrusion detection systems in open domain scenarios?
RQ3: Is the design of each component reasonable and effective, and what positive effect does it have on the results?
5.1. Performance Comparison in Non-Open Domain
In the non-open domain setting, we only consider the algorithm’s performance on the current dataset, with no unknown target domain data involved.
Table 3 shows the detailed comparison results of the four algorithms in the non-open domain.
In the non-open domain, Open DGML does not always achieve the best accuracy and detection rate on each of the four datasets, but its gap from the best result is small, and all of its values exceed 94%. Notably, its performance on the NDSec-1 dataset is far better than that of the other algorithms.
Specifically, on the ISCX2012 dataset, HAST-IDS has the highest accuracy, and FC-Net has the highest detection rate. The accuracy of Open DGML is 99.13%, ranking second among the four algorithms, and its detection rate is 98.26%, also in second place. On the NDSec-1 dataset, Open DGML has the highest accuracy and detection rate, at 94.20% and 95.29% respectively, leading the second-best HAST-IDS in accuracy by 3.16% and the second-best CLAIRE in detection rate by 3.69%. On the CICIDS2017 dataset, CLAIRE has the highest accuracy at 98.01%, and FC-Net has the highest detection rate at 99.62%. Open DGML’s accuracy and detection rate are both second, only 0.48% and 2.32% lower than the highest values, respectively. On the CICIDS2018 dataset, Open DGML’s accuracy is 2.26% lower than the highest CLAIRE algorithm, and its detection rate is only 0.15% lower than the highest FC-Net.
However, in terms of the average performance across the four datasets, our algorithm’s accuracy and detection rate are better than the other three algorithms, with a 0.49% and 1.76% improvement over the next best algorithm, respectively. This indicates that our algorithm also performs well in the non-open domain, and achieves consistent results across multiple datasets (domains), due to its better generalization capability. It has a good initial model for domain transfer and can quickly adjust its parameters to adapt to the new data distribution after a few fine-tuning steps.
5.2. Performance Comparison in Open Domain
In the open domain setting, we evaluated the performance of our Open DGML algorithm on one unknown domain and trained it on three source domains. In case of encountering attack types that do not exist in the source domains, such as port scanning in the NDSec-1 dataset, which is absent in the other datasets, we classified them as “unknown class”. The results, shown in
Table 4, indicate that Open DGML outperforms the other algorithms in terms of accuracy and detection rate on most of the datasets.
In particular, Open DGML achieves the highest accuracy and detection rate among the evaluated methods on the ISCX2012 dataset, reaching 75.69% and 70.13% respectively, which are the best performances across the four datasets. Notably, Open DGML’s detection rate is 14.93% higher than the second-best performer, HAST-IDS. Similarly, on the NDSec-1 dataset, Open DGML exhibits the optimal accuracy and detection rate of 68.50% and 65.99%, significantly outperforming the next-best algorithm, CLAIRE, by 16.64% and 12.39%, respectively. On the CICIDS2017 dataset, Open DGML achieves the highest accuracy of 67.45%, while its detection rate is the second best, only 4.63% lower than the top-performing FC-Net. Furthermore, on the CICIDS2018 dataset, Open DGML demonstrates the highest accuracy, surpassing the second-best CLAIRE by 4.37%, and also has the best detection rate.
When considering the average performance across the four datasets, Open DGML significantly outperforms the other algorithms. Compared to HAST-IDS, Open DGML achieves 19.26% and 16.43% higher accuracy and detection rate, respectively. Similarly, it outperforms CLAIRE by 13.03% and 11.08%, and FC-Net by 11.79% and 11.57%, all improvements exceeding 10%.
These results indicate that Open DGML, which integrates Mul-Mixup techniques and meta-learning algorithms, can effectively perform intrusion detection in open-domain settings. The combination of enhanced data sample diversity and improved generalization capabilities when facing unknown data domains contributes to the superior performance.
The performance of Open DGML differed significantly from the other algorithms on the NDSec-1 dataset, which contains a previously unseen traffic type, port scanning. Open DGML’s design incorporates a confidence threshold method to identify unknown classes, which gives it an edge over the other algorithms. To quantify the model’s ability to recognize unknown classes, we evaluated the accuracy of Open DGML and the other algorithms in identifying port scanning as an unknown class in the NDSec-1 dataset. The results, presented in
Table 5, show that Open DGML achieved the highest accuracy of 41.70%, which is significantly better than the other algorithms. This is due to the fact that Open DGML was specifically designed to recognize unknown classes, and its overall performance in the target domain was high, which also affects its ability to recognize unknown classes.
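The confidence threshold idea mentioned above can be sketched as follows: a sample is assigned to the "unknown class" when the highest softmax probability falls below a threshold. The threshold value of 0.7, the class names, and the logits are hypothetical. The text does not give the threshold or say how it is combined with the domain similarity and uncertainty measures.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_open_set(logits, classes, threshold=0.7, unknown="unknown"):
    """Return the top class, or `unknown` if its probability is below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return unknown, probs[best]
    return classes[best], probs[best]

classes = ["dos", "ddos", "brute_force"]
confident = classify_open_set([5.0, 0.1, 0.2], classes)
uncertain = classify_open_set([1.0, 1.1, 0.9], classes)
print(confident, uncertain)
```

Raising the threshold sends more samples to expert review, trading missed known attacks for fewer mislabelled unknown ones.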
Furthermore, the performance gap between Open DGML and the other algorithms was much wider in the open domain setting than in the non-open domain setting, indicating that Open DGML is better suited for open domain scenarios. The lower performance gap in the non-open domain setting suggests that the current mainstream algorithms can already achieve good performance, leaving little room for improvement. Open DGML, on the other hand, focuses on enhancing the model’s generalization ability, especially in unknown domains, by incorporating meta-learning techniques and Mul-Mixup.
To further validate the model’s generalization ability in the target domain, we compared Open DGML with the FC-Net method using the Fréchet distance. The results, shown in Figure 3, indicate that Open DGML’s Fréchet distance is much smaller than that of FC-Net, indicating stronger generalization ability in the target domain. We also observed two patterns in the results: (1) the Fréchet distance depends on the domain itself, and (2) the Fréchet distance is negatively correlated with the model’s recognition accuracy, that is, a smaller distance accompanies higher accuracy. In the four experiments using different datasets as the target domain, Open DGML achieved the smallest Fréchet distance on the ISCX2012 dataset, followed by NDSec-1 and CICIDS2017, and the largest on the CICIDS2018 dataset, which is consistent with the ranking of accuracy and detection rate when the model is applied to these datasets in the open domain setting.
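For readers unfamiliar with the metric, the sketch below computes the discrete Fréchet distance between two point sequences with the standard dynamic-programming recurrence. It is an illustration only: the text does not say how feature spaces are turned into sequences or which variant of the distance is used, and the sample curves are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def discrete_frechet(p, q):
    """Discrete Frechet distance between point sequences p and q."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]  # ca[i][j]: coupling distance of p[:i+1], q[:j+1]
    for i in range(n):
        for j in range(m):
            d = euclidean(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]

curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
fd = discrete_frechet(curve_a, curve_b)
print(fd)
```

A smaller value means the two sequences can be traversed together while staying close, which matches the reading above that a smaller distance indicates more similar source and target features.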
5.3. Validation of Components in Open DGML
Ablation Study. The ablation study dissects the framework into individual modules to assess their impact on the final detection results. The modules include flow image technology, Mul-Mixup data augmentation, and the meta-learning algorithm. The performance comparison in the open-domain setting is measured using accuracy rates, as detailed in
Table 6.
It can be found that for the recognition of the eight attack types, except for the recognition of botnets where the meta-learning-only scheme performs slightly better, the performance ranking in other cases is Open DGML > DGML > Mul-Mixup > Flow Image. This indicates that the improvement in overall model generalization performance brought by meta-learning is more significant than the data augmentation approach of only using Mul-Mixup to improve sample diversity, and the coupled effect of the two is also greater than the performance improvement brought by applying the basic model alone, because the optimization of the two approaches is at different levels. Mul-Mixup is more focused on data augmentation of the original input images to obtain a more general representation of the belonging classes, while the meta-learning algorithm utilizes a large amount of prior knowledge to train the basic model and learns the distribution of samples in the classes through multi-domain knowledge, which is more focused on improving model generalization performance. The combination of flow image, Mul-Mixup, and meta-learning significantly enhances detection accuracy, verifying the effectiveness of these three components in Open DGML.
It is worth mentioning that our model has obvious differences in the recognition performance of different categories, and this situation also exists in each module stage. The recognition performance for DoS, DDoS, Brute Force, and Penetration Attacks is better than Web Attacks, Botnets, and Heartbleed, and is much better than Port Scans. The reason is that the first four types of attacks are present in each source domain and target domain, so the training samples for the model are relatively sufficient. However, Web Attacks, Botnets, and Heartbleed are not present in all datasets, so for a particular target domain, the attack category may only exist in one or two source domains, and the sample diversity is somewhat insufficient, resulting in less than ideal final recognition performance. Particularly, the Port Scan attack category only exists in the NDSec-1 dataset, so if this dataset is used as the target domain, there is no training data available in the source domains, and the model’s recognition capability for this category is generally poor, especially in the modules without meta-learning, with a recognition accuracy of only 17.1%. After going through meta-learning, the recognition accuracy for unknown categories is directly improved by 15.5%, demonstrating that the meta-learning algorithm provides significant help for the recognition of unknown categories.
Comparative Experiments on the Flow Processing Method. We validate the effectiveness of the data preprocessing module proposed in our framework by comparing it with various flow processing methods on the ISCX2012 dataset. To showcase the performance of our flow processing method, we only conduct experiments and testing on the ISCX2012 dataset and do not consider open-domain scenarios.
We first compare the effectiveness of our proposed data segmentation, feature extraction, and encoding steps with various flow processing methods on the ISCX2012 dataset. We implement K-means clustering for feature-based clustering and feature-based clustering with payload. The latter boosts the accuracy by 0.9% because the payload contains important information. However, the payload is often too long and contains redundancy, decreasing efficiency. We compare our method with the approach that directly converts raw packets into binary images and find that our Open DGML method outperforms it by 3.4%. We also find that our method outperforms methods that only truncate packets or flows, which perform data dimensionality reduction. The payload contributes to accuracy improvement by 1.1%. Our experiments showcase the advantages of our proposed data preprocessing module for intrusion detection.
In addition to improving accuracy and detection rate, our data segmentation and truncation steps significantly reduce the input size and computation cost, leading to a shorter time for intrusion detection, which better meets the needs of real-time traffic identification. To quantify the efficiency improvement in terms of time consumption, we compare the methods by the number of samples recognized per unit time, with Open DGML set as the standard of 100 samples per unit time. The results in Figure 4 show that the direct image conversion method has the lowest recognition efficiency because it contains too much redundant information that cannot be effectively utilized and increases the input dimension, leading to a lower number of recognized samples per unit time. The method that only truncates packets or flows has a relatively higher time efficiency because the number of packets is often smaller than the number of bytes in a flow, resulting in lower input dimensions. Although these two methods have similar accuracy to our method, the time consumption difference is much larger, with our method achieving a 20–30% improvement. Finally, the method without payload sacrifices a significant amount of accuracy for a minor improvement in time efficiency, which is not desirable for intrusion detection tasks where accuracy is crucial. Therefore, the flow processing method in the proposed framework balances high accuracy and low time consumption to meet the real-time requirements of intrusion detection in practical scenarios.
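The normalization used in this comparison, in which one method is fixed at 100 samples per unit time and the others are expressed relative to it, can be sketched as follows. The function name `relative_throughput` and the raw rates are hypothetical; they are not measurements from the paper.

```python
def relative_throughput(rates: dict[str, float], baseline: str,
                        scale: float = 100.0) -> dict[str, float]:
    """Rescale raw samples-per-unit-time rates so that `baseline` equals `scale`."""
    base = rates[baseline]
    return {name: scale * rate / base for name, rate in rates.items()}


# Hypothetical raw rates, purely illustrative.
raw = {"Open DGML": 500.0, "direct image conversion": 250.0}
normalized = relative_throughput(raw, baseline="Open DGML")
```

Expressing every method relative to a common baseline removes the dependence on the hardware used for timing, so only the ratios between methods matter.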
In conclusion, the ablation study and comparative experiments underscore the superiority of the Open DGML framework in intrusion detection. The integrated approach of flow image technology, Mul-Mixup, and meta-learning not only improves accuracy and detection rates but also aligns well with real-time requirements in practical scenarios.
6. Conclusions
The burgeoning threat of cyber attacks, amplified by technological advancements and globalization, has rendered traditional intrusion detection systems inadequate. These systems often struggle to adapt to the evolving nature of attacks, particularly in open domains where novel attack vectors may emerge. The significance of this research lies in addressing the limitations of existing models, which lack the generalizability required to detect unknown threats effectively.
Our Open DGML framework introduces a paradigm shift in intrusion detection by integrating meta-learning, data augmentation, and flow image technology. This innovative approach enables the model to learn from multiple source domains and apply this knowledge to unknown target domains, enhancing its robustness and generalization capabilities. The framework’s design is underpinned by the novel Mul-Mixup technique for data augmentation, which enriches the diversity of training samples, and an MLDG algorithm that simulates domain shifts to improve the model’s adaptability to new environments.
Empirical validation of Open DGML on the ISCX2012, NDSec-1, CICIDS2017, and CICIDS2018 datasets reveals its superiority over existing models, achieving higher accuracy and detection rates across various scenarios. The ablation study and component validation experiments highlight the synergistic effect of the integrated components, demonstrating that our framework not only improves detection rates but also meets the real-time requirements crucial for practical application.
Future work will build upon the current framework. We plan to develop a specialized dataset tailored for open-domain intrusion detection, providing a benchmark for further research. Additionally, we will explore advanced feature selection techniques to improve the model's efficiency and integrate newer data augmentation methods to improve performance. Furthermore, we intend to extend the application of our framework beyond classification tasks, leveraging its adaptability in other areas such as anomaly detection and threat intelligence.
In conclusion, our Open DGML framework represents a significant stride forward in intrusion detection, offering a robust solution for open domain challenges. Its innovative design, grounded in empirical evidence, positions it as a leading approach in the cyber security landscape. The framework’s practical applicability and potential for further expansion make it a promising avenue for safeguarding critical networks against the ever-changing spectrum of cyber threats.