Performance Analysis of Feature Subset Selection Techniques for Intrusion Detection
Abstract
1. Introduction
1.1. Intrusion Detection System
1.1.1. Signature-Based Detection
1.1.2. Anomaly-Based Detection
1.2. Feature Selection Methods
1.2.1. Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS)
1.2.2. Genetic Algorithm
2. Related Work
3. Methodology
3.1. Dataset Selection
3.1.1. NSL-KDD
3.1.2. CIC-IDS-2017
3.1.3. CIC-IDS-2018
3.2. Preprocessing
3.3. Feature Subset Selection
3.3.1. Sequential Forward Selection (SFS)
3.3.2. Sequential Backward Selection (SBS)
3.3.3. Genetic Algorithm (GA)
Population Initiation
Fitness Function
Selection Function
Crossover Process
Mutation Process
Stopping Criteria
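The GA components listed above (population initiation, fitness, selection, crossover, mutation, stopping criteria) can be sketched as a minimal genetic search over feature bit masks. This is an illustrative toy, not the paper's implementation: the actual fitness functions (Equations (1) and (2)) combine weighted detection, false-positive, and feature-count terms, which the placeholder below approximates with a generic score-minus-feature-penalty; the toy `score` function is made up for the demo.

```python
import random

def init_population(n_pop, n_features):
    # Population initiation: each chromosome is a bit mask over the feature set.
    return [[random.randint(0, 1) for _ in range(n_features)] for _ in range(n_pop)]

def fitness(chrom, score_fn):
    # Placeholder fitness: wrapper score minus a small penalty on the number
    # of selected features (stands in for the paper's weighted Equations (1)-(2)).
    n_selected = sum(chrom)
    if n_selected == 0:
        return 0.0
    return score_fn(chrom) - 0.01 * n_selected

def tournament_select(pop, fits, k=3):
    # Selection: pick the fittest of k random contenders.
    contenders = random.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]

def crossover(a, b, proba=0.6):
    # Single-point crossover, applied with probability `proba`.
    if random.random() > proba:
        return a[:], b[:]
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom, proba=0.1):
    # Mutation: flip each gene independently with probability `proba`.
    return [(1 - g) if random.random() < proba else g for g in chrom]

def genetic_feature_selection(score_fn, n_features, n_pop=20, n_gen=30, seed=0):
    random.seed(seed)
    pop = init_population(n_pop, n_features)
    best, best_fit = None, float("-inf")
    for _ in range(n_gen):                      # stopping criterion: generation cap
        fits = [fitness(c, score_fn) for c in pop]
        for c, f in zip(pop, fits):
            if f > best_fit:
                best, best_fit = c[:], f        # keep the best mask seen so far
        nxt = []
        while len(nxt) < n_pop:
            a = tournament_select(pop, fits)
            b = tournament_select(pop, fits)
            c1, c2 = crossover(a, b)
            nxt += [mutate(c1), mutate(c2)]
        pop = nxt[:n_pop]
    return best

# Toy wrapper score: reward masks that select exactly features {0, 2}.
target = {0, 2}
score = lambda c: sum(1 for i, g in enumerate(c) if (g == 1) == (i in target)) / len(c)
mask = genetic_feature_selection(score, n_features=6)
```

In the paper the estimator inside the wrapper is an SVM or MLP and the search runs per dataset with the population and generation counts listed in the implementation tables.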
3.4. Classification
3.4.1. Support Vector Machine (SVM)
3.4.2. Artificial Neural Network Multilayer Perceptron (ANN-MLP)
3.5. Evaluation
- True positive (TP): the number of attack samples classified correctly.
- True negative (TN): the number of normal samples classified correctly.
- False positive (FP): the number of normal samples classified wrongly.
- False negative (FN): the number of attack samples classified wrongly.
- Accuracy: the proportion of correctly classified instances to the total number of classifications, as in Equation (3): Accuracy = (TP + TN) / (TP + TN + FP + FN).
- FPR (a.k.a. false alarm rate): the proportion of normal instances identified as attack or abnormal instances, as in Equation (4): FPR = FP / (FP + TN).
- Precision: the ratio of correctly predicted positive instances to the total predicted positive instances, as in Equation (5): Precision = TP / (TP + FP).
- Recall (a.k.a. detection rate (DR)): the ratio of correctly predicted positive instances to the overall number of actual positive instances, as in Equation (6): Recall = TP / (TP + FN).
- F1 score: the harmonic mean of precision and recall, as in Equation (7): F1 = 2 × (Precision × Recall) / (Precision + Recall).
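These are the standard confusion-matrix identities, so they can be checked with a small numeric sketch (the equation-number comments follow the paper's numbering; the counts are made-up example values):

```python
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Equation (3)
    fpr = fp / (fp + tn)                                # Equation (4)
    precision = tp / (tp + fp)                          # Equation (5)
    recall = tp / (tp + fn)                             # Equation (6), detection rate
    f1 = 2 * precision * recall / (precision + recall)  # Equation (7)
    return accuracy, fpr, precision, recall, f1

# Example: 90 attacks caught, 10 missed; 95 normals passed, 5 false alarms.
acc, fpr, prec, rec, f1 = metrics(tp=90, tn=95, fp=5, fn=10)
# acc = 185/200 = 0.925, fpr = 5/100 = 0.05, rec (DR) = 90/100 = 0.9
```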
3.6. Analysis and Comparison of Results
4. Implementation
4.1. Preprocessing
- In the NSL-KDD dataset, three categorical features, which are flag, service, and protocol_type features, are mapped to numeric values ranging from 0 to N − 1, where N is the number of symbols in the feature.
- Missing values or null values are removed from the CIC-IDS-2017 and CIC-IDS-2018 datasets. A script written in Python is used for removing these records.
- A duplicate feature in CIC-IDS-2017, namely Fwd Header Length, is removed manually. The timestamp feature is removed manually in CIC-IDS-2018. In addition, ten features are removed manually in CIC-IDS-2017 and CIC-IDS-2018, as they have zero values.
- Duplicated records are removed in all datasets. A script written in Python is used for removing these records.
- The class label is mapped to 0 for the normal class and 1 for attack. As we use binary classification in this study, all sub-category attack labels are mapped to 1; the resulting label feature thus contains 0 for normal records and 1 for attack records.
- The StandardScaler method from the sklearn library in Python is applied to standardize the feature variance in all the used datasets.
- All datasets are split into 70% training and 30% testing datasets.
- The random under-sampling (RUS) technique is applied in all training datasets.
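The steps above can be illustrated with a small self-contained sketch. The paper itself uses sklearn's StandardScaler and a 70/30 split; to keep the example dependency-free, the equivalent arithmetic is written out in plain Python, and the records and labels are made up for illustration:

```python
import random
from statistics import mean, pstdev

# Toy records: (feature vector, label) pairs; attack sub-categories
# ("dos", "probe", "r2l", ...) all collapse to 1, "normal" to 0.
records = [([2.0, 10.0], "normal"), ([4.0, 20.0], "dos"),
           ([6.0, 30.0], "probe"),  ([8.0, 40.0], "normal"),
           ([1.0, 5.0],  "r2l"),    ([3.0, 15.0], "normal")]

# 1. Binary label mapping.
X = [x for x, _ in records]
y = [0 if label == "normal" else 1 for _, label in records]

# 2. Standardization (the arithmetic StandardScaler performs):
#    z = (x - mean) / std, per feature column.
cols = list(zip(*X))
mus = [mean(c) for c in cols]
sds = [pstdev(c) or 1.0 for c in cols]   # guard against zero-variance columns
Xs = [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in X]

# 3. 70/30 train/test split.
random.seed(0)
idx = list(range(len(Xs)))
random.shuffle(idx)
cut = int(0.7 * len(idx))
train, test = idx[:cut], idx[cut:]

# 4. Random under-sampling (RUS) on the training set: drop majority-class
#    records until both classes are the same size.
normal = [i for i in train if y[i] == 0]
attack = [i for i in train if y[i] == 1]
n = min(len(normal), len(attack))
balanced = random.sample(normal, n) + random.sample(attack, n)
```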
4.2. Feature Selection
Implementation of SFS, SBS and the GA
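The paper implements SFS/SBS with the mlxtend package and the GA with sklearn-genetic (both cited in the references). As a sketch of the greedy wrapper logic underneath SFS and SBS, the following toy implementation assumes a user-supplied wrapper score function; the `score` here is a made-up stand-in for cross-validated classifier accuracy, rewarding the informative features {1, 3}:

```python
def sfs(score_fn, n_features, k):
    """Sequential forward selection: start empty, repeatedly add the
    single feature that most improves the wrapper score."""
    selected = []
    while len(selected) < k:
        best_f, best_s = None, float("-inf")
        for f in range(n_features):
            if f in selected:
                continue
            s = score_fn(selected + [f])
            if s > best_s:
                best_f, best_s = f, s
        selected.append(best_f)
    return sorted(selected)

def sbs(score_fn, n_features, k):
    """Sequential backward selection: start with all features, repeatedly
    drop the feature whose removal hurts the score least."""
    selected = list(range(n_features))
    while len(selected) > k:
        best_drop, best_s = None, float("-inf")
        for f in selected:
            trial = [g for g in selected if g != f]
            s = score_fn(trial)
            if s > best_s:
                best_drop, best_s = f, s
        selected.remove(best_drop)
    return sorted(selected)

# Toy wrapper score: features {1, 3} are informative, the rest add noise.
informative = {1, 3}
score = lambda feats: len(informative & set(feats)) - 0.1 * len(set(feats) - informative)

print(sfs(score, n_features=5, k=2))  # → [1, 3]
print(sbs(score, n_features=5, k=2))  # → [1, 3]
```

In the real pipeline `score_fn` would refit the SVM or MLP on each candidate subset, which is why both searches are expensive compared with filter methods.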
4.3. Training and Testing
4.4. Results and Discussion
4.5. Performance Comparison with the Recent Methods
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Thakkar, A.; Lohiya, R. A survey on intrusion detection system: Feature selection, model, performance measures, application perspective, challenges, and future research directions. Artif. Intell. Rev. 2022, 55, 453–563.
- Alhakami, W.; Alharbi, A.; Bourouis, S.; Alroobaea, R.; Bouguila, N. Network Anomaly Intrusion Detection Using a Nonparametric Bayesian Approach and Feature Selection. IEEE Access 2019, 7, 52181–52190.
- Thakkar, A.; Lohiya, R. Attack classification using feature selection techniques: A comparative study. J. Ambient Intell. Humaniz. Comput. 2020, 12, 1249–1266.
- Tao, P.; Sun, Z.; Sun, Z. An Improved Intrusion Detection Algorithm Based on GA and SVM. IEEE Access 2018, 6, 13624–13631.
- Ates, C.; Ozdel, S.; Anarim, E. A New Network Anomaly Detection Method Based on Header Information Using Greedy Algorithm. In Proceedings of the 6th International Conference on Control, Decision and Information Technologies (CoDIT 2019), Paris, France, 23–26 April 2019; IEEE: New York, NY, USA, 2019; pp. 657–662.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the International Conference on Information Systems Security and Privacy, Funchal, Portugal, 22–24 January 2018; pp. 108–116.
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and Defence Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6.
- Saleh, A.I.; Talaat, F.M.; Labib, L.M. A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers. Artif. Intell. Rev. 2019, 51, 403–443.
- Leevy, J.L.; Khoshgoftaar, T.M. A survey and analysis of intrusion detection models based on CSE-CIC-IDS2018 Big Data. J. Big Data 2020, 7, 104.
- Wang, W.; Du, X.; Wang, N. Building a Cloud IDS Using an Efficient Feature Selection Method and SVM. IEEE Access 2019, 7, 1345–1354.
- Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
- Thangavel, N.S.G. Building an Efficient of Feature Selection Using Greedy Search Method for HNIDS in Cloud Computing. J. Adv. Res. Dyn. Control Syst. 2019, 11, 307–316.
- Khammassi, C.; Krichen, S. A GA-LR wrapper approach for feature selection in network intrusion detection. Comput. Secur. 2017, 70, 255–277.
- Kohavi, R.; John, G.H. Wrappers for feature subset selection. Artif. Intell. 1997, 97, 273–324.
- Li, Y.; Wang, J.-L.; Tian, Z.-H.; Lu, T.-B.; Young, C. Building lightweight intrusion detection system using wrapper-based feature selection mechanisms. Comput. Secur. 2009, 28, 466–475.
- Mohammadzadeh, A.; Taghavifar, H. A robust fuzzy control approach for path-following control of autonomous vehicles. Soft Comput. 2020, 24, 3223–3235.
- Varma, P.R.K.; Kumari, V.V.; Kumar, S.S. Feature Selection Using Relative Fuzzy Entropy and Ant Colony Optimization Applied to Real-time Intrusion Detection System. Procedia Comput. Sci. 2016, 85, 503–510.
- Mohammadi, S.; Mirvaziri, H.; Ghazizadeh-Ahsaee, M.; Karimipour, H. Cyber intrusion detection by combined feature selection algorithm. J. Inf. Secur. Appl. 2019, 44, 80–88.
- Sarvari, S.; Sani, N.F.M.; Hanapi, Z.M.; Abdullah, M.T. An Efficient Anomaly Intrusion Detection Method with Feature Selection and Evolutionary Neural Network. IEEE Access 2020, 8, 70651–70663.
- Asdaghi, F.; Soleimani, A. An effective feature selection method for web spam detection. Knowl.-Based Syst. 2019, 166, 198–206.
- Aslahi-Shahri, B.M.; Rahmani, R.; Chizari, M.; Maralani, A.; Eslami, M.; Golkar, M.J.; Ebrahimi, A. A hybrid method consisting of GA and SVM for intrusion detection system. Neural Comput. Appl. 2016, 27, 1669–1676.
- Lee, J.; Park, O. Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier. KSII Trans. Internet Inf. Syst. 2017, 11, 5132–5148.
- Li, Y.; Xia, J.; Zhang, S.; Yan, J.; Ai, X.; Dai, K. An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Syst. Appl. 2012, 39, 424–430.
- Raman, M.G.; Somu, N.; Kirthivasan, K.; Liscano, R.; Sriram, V.S. An efficient intrusion detection system based on hypergraph-genetic algorithm for parameter optimization and feature selection in support vector machine. Knowl.-Based Syst. 2017, 134, 1–12.
- Zhou, Y.; Cheng, G.; Jiang, S.; Dai, M. Building an efficient intrusion detection system based on feature selection and ensemble classifier. Comput. Netw. 2020, 174, 107247.
- Hua, Y. An Efficient Traffic Classification Scheme Using Embedded Feature Selection and LightGBM. In Proceedings of the Information Communication Technologies Conference (ICTC), Nanjing, China, 29–31 May 2020; pp. 125–130.
- Seth, S.; Singh, G.; Chahal, K.K. A novel time efficient learning-based approach for smart intrusion detection system. J. Big Data 2021, 8, 1–28.
- Alazzam, H.; Sharieh, A.; Sabri, K.E. A feature selection algorithm for intrusion detection system based on Pigeon Inspired Optimizer. Expert Syst. Appl. 2020, 148, 113249.
- Mazini, M.; Shirazi, B.; Mahdavi, I. Anomaly network-based intrusion detection system using a reliable hybrid artificial bee colony and AdaBoost algorithms. J. King Saud Univ.-Comput. Inf. Sci. 2019, 31, 541–553.
- Saeed, A.A.; Jameel, N.G.M. Intelligent feature selection using particle swarm optimization algorithm with a decision tree for DDoS attack detection. Int. J. Adv. Intell. Inform. 2021, 7, 37.
- Shaikh, J.M.; Kshirsagar, D. Feature Reduction-Based DoS Attack Detection System. In Next Generation Information Processing System; Springer: Berlin/Heidelberg, Germany, 2021; pp. 170–177.
- Patil, A.; Kshirsagar, D. Towards Feature Selection for Detection of DDoS Attack. Comput. Eng. Technol. 2019, 215–223.
- Ahmad, I.; Hussain, M.; Alghamdi, A.; Alelaiwi, A. Enhancing SVM performance in intrusion detection using optimal feature subset selection based on genetic principal components. Neural Comput. Appl. 2014, 24, 1671–1682.
- He, H.; Garcia, E.A. Learning from Imbalanced Data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
- Ho, Y.B.; Yap, W.S.; Khor, K.C. The effect of sampling methods on the CICIDS2017 network intrusion data set. In IT Convergence and Security; Springer: Singapore, 2021; pp. 33–41.
- Bamakan, S.M.H.; Wang, H.; Yingjie, T.; Shi, Y. An effective intrusion detection framework based on MCLP/SVM optimized by time-varying chaos particle swarm optimization. Neurocomputing 2016, 199, 90–102.
- Huang, C.-L.; Wang, C.-J. A GA-based feature selection and parameters optimization for support vector machines. Expert Syst. Appl. 2006, 31, 231–240.
- Gen, M.; Cheng, R. Genetic Algorithms and Engineering Optimization; John Wiley & Sons: Hoboken, NJ, USA, 1999; Volume 7.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Raschka, S. MLxtend: Providing machine learning and data science utilities and extensions to Python's scientific computing stack. J. Open Source Softw. 2018, 3, 638.
- Calzolari, M. manuel-calzolari/sklearn-genetic: sklearn-genetic 0.5.1. Zenodo, 2022. Available online: https://zenodo.org/record/5854662#.Y5knyH1ByUk (accessed on 18 January 2022).
Ref. | Feature Selection Algorithm | Classification Algorithm | Dataset | Number of Features | Result (%) |
---|---|---|---|---|---|
[3] | REF | SVM | NSL-KDD | - | ACC: 98.95 F-score: 99.75 |
[4] | GA | SVM | KDD Cup99 | 19 | |
[5] | Greedy Search | SVM | MIT Darpa 2000 | - | ACC: 99.99 FPR: 0.001 |
[8] | NBFS | PKNN + OSVMs | NSL-KDD | - | DR: 95.77 |
[10] | Greedy Search + CFS | RF | NSL-KDD | - | ACC: 98.32 FPR: 0.40 |
[13] | GA | DT | KDD99 | 18 | ACC: 99.90 FPR: 0.11 |
[15] | MRMHC | MLSVM | KDD Cup | 4 | TPR: 80.00 FPR: 3.65 |
[19] | CSA | MCF & MVO-ANN | NSL-KDD | 22 | ACC: 98.81 DR: 97.25 FPR: 0.03 |
[21] | GA | SVM | KDD Cup | 10 | ACC: 97.30 FPR: 1.70 |
[22] | SFFS | RF | NSL-KDD | 10 | ACC: 99.89 FPR: 0.40 |
[23] | Gradual feature removal | SVM | KDD Cup | 19 | ACC: 98.62 |
[24] | HG-GA | SVM | NSL-KDD | - | ACC: 97.14 FPR: 0.83 |
[25] | CSF + BA | RF + C4.5 + FOREST ATTRIBUTE | NSL-KDD CICIDS2017 | - | ACC: 96.76 DR: 94.04 FPR: 2.38 |
[26] | IG | LightGBM | CICIDS2017 | 10 | ACC: 98.37 |
[27] | Hybrid Feature Selection (RF + PCA) | Light GBM | CICIDS2018 | 24 | ACC: 97.73 |
[28] | Sigmoid POI | DT | NSL-KDD | 18 | ACC: 86.90 FPR: 6.40 |
[28] | Cosine POI | DT | NSL-KDD | 5 | ACC: 86.90 FPR: 8.80 |
[29] | ABC | AdaBoost | NSL-KDD | 25 | ACC: 98.90 FPR: 0.01 |
[30] | Binary-particle swarm optimization | Decision tree | CICIDS2018 | 19 | ACC: 99.52 |
[31] | IG and correlation attribute evaluation methods | PART | CICIDS2017 | 56 | ACC: 99.98 FPR: 1.35 |
[32] | Information gain and ranker algorithm | J48 | CICIDS2017 | 75 | ACC: 87.44 |
Confusion matrix | Actual Attack | Actual Normal |
---|---|---|
Predicted attack | TP | FP |
Predicted normal | FN | TN |
Objective Function | Parameter | Value |
---|---|---|
Equation (1) | WD | 0.45 |
Equation (1) | WF | 0.45 |
Equation (1) | WN | 0.1 |
Equation (2) | WA | 0.94 |
Equation (2) | WN | 0.06 |
Parameter | Value | Parameter | Value |
---|---|---|---|
Estimator | SVM, MLP | n_jobs | −1 |
Scoring | Equation (1), Equation (2) | floating | False |
k_features | NSL-KDD (1,40); CIC-IDS-2017 (1,66); CIC-IDS-2018 (1,67) | cv | 2, 5, 10 |
forward | True (SFS), False (SBS) | | |
Parameter | Value | Parameter | Value |
---|---|---|---|
Estimator | SVM, MLP | crossover_proba | 0.6 |
Scoring | Equation (1), Equation (2) | crossover_independent_proba | 0.6 |
Max_features | NSL-KDD (40); CIC-IDS-2017 (66); CIC-IDS-2018 (67) | mutation_proba | 0.1 |
n_population | NSL-KDD (60); CIC-IDS-2017 (80); CIC-IDS-2018 (80) | mutation_independent_proba | 0.1 |
n_generations | NSL-KDD (50); CIC-IDS-2017 (65); CIC-IDS-2018 (65) | caching | True |
n_gen_no_change | NSL-KDD (50); CIC-IDS-2017 (65); CIC-IDS-2018 (65) | n_jobs | −1 |
cv | 2, 5, 10 | verbose | 1 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 98.94 | 97.83 | 98.95 | 98.25 | 99.23 | 99.01 |
5 | 98.80 | 97.66 | 98.82 | 98.81 | 99.16 | 99.02 | |
10 | 98.89 | 98.78 | 98.88 | 98.16 | 99.06 | 99.06 | |
Number of selected features | 2 | 10 | 9 | 10 | 16 | 29 | 38 |
5 | 9 | 9 | 10 | 14 | 27 | 35 | |
10 | 10 | 13 | 10 | 13 | 29 | 36 | |
FPR (%) | 2 | 1.14 | 2.51 | 0.99 | 1.50 | 0.77 | 0.95 |
5 | 1.44 | 2.84 | 1.18 | 1.26 | 0.84 | 1.20 | |
10 | 1.39 | 1.59 | 1.26 | 1.84 | 0.73 | 1.16 | |
F1 (%) | 2 | 98.90 | 97.75 | 98.81 | 98.17 | 99.20 | 98.98 |
5 | 98.76 | 97.59 | 98.77 | 98.77 | 99.13 | 98.98 | |
10 | 98.85 | 98.83 | 98.84 | 98.09 | 99.03 | 99.02 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 99.75 | 99.83 | 99.65 | 94.79 | 99.94 | 99.93 |
5 | 99.78 | 89.61 | 99.81 | 99.84 | 99.94 | 99.96 | |
10 | 99.28 | 88.74 | 99.75 | 99.83 | 99.9 | 99.91 | |
Number of selected features | 2 | 5 | 7 | 5 | 6 | 42 | 44 |
5 | 5 | 5 | 6 | 5 | 39 | 40 | |
10 | 5 | 5 | 5 | 6 | 45 | 38 | |
FPR (%) | 2 | 0.27 | 0.04 | 0.5 | 1.12 | 0.02 | 0.02 |
5 | 0.22 | 0.03 | 0.13 | 0.17 | 0.02 | 0.03 | |
10 | 1.44 | 0.01 | 0.27 | 0.12 | 0.01 | 0.03 | |
F1 (%) | 2 | 99.78 | 99.86 | 99.69 | 95.24 | 99.95 | 99.96 |
5 | 99.81 | 89.93 | 99.83 | 99.86 | 99.93 | 99.94 | |
10 | 99.36 | 89.00 | 99.78 | 99.85 | 99.93 | 99.39 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 99.46 | 99.87 | 99.78 | 99.87 | 99.71 | 99.69 |
5 | 97.67 | 99.87 | 99.64 | 99.51 | 99.82 | 99.83 | |
10 | 97.72 | 99.80 | 99.69 | 99.67 | 99.8 | 99.78 | |
Number of selected features | 2 | 21 | 11 | 7 | 8 | 18 | 23 |
5 | 6 | 8 | 7 | 10 | 21 | 26 | |
10 | 6 | 8 | 8 | 11 | 24 | 25 | |
FPR (%) | 2 | 0.32 | 0.90 | 0.10 | 0.90 | 0.16 | 0.21 |
5 | 0.18 | 0.10 | 0.10 | 0.84 | 0.16 | 0.10 | |
10 | 0.30 | 0.30 | 0.20 | 0.90 | 0.19 | 0.17 | |
F1 (%) | 2 | 99.46 | 99.88 | 99.77 | 99.88 | 99.71 | 99.69 |
5 | 97.61 | 99.85 | 99.64 | 99.51 | 99.82 | 99.80 | |
10 | 97.63 | 99.80 | 99.71 | 99.74 | 99.80 | 99.75 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 98.77 | 97.83 | 97.76 | 97.7 | 99.18 | 99.11 |
5 | 98.19 | 97.66 | 98.82 | 97.99 | 99.19 | 98.98 | |
10 | 98.19 | 98.82 | 98.65 | 97.93 | 99.21 | 98.93 | |
Number of selected features | 2 | 9 | 9 | 6 | 9 | 27 | 37 |
5 | 7 | 9 | 10 | 10 | 33 | 28 | |
10 | 7 | 12 | 8 | 10 | 30 | 39 | |
FPR (%) | 2 | 1.46 | 2.51 | 1.94 | 2.76 | 0.89 | 0.85 |
5 | 1.85 | 2.84 | 1.81 | 1.76 | 0.81 | 1.17 | |
10 | 1.85 | 1.65 | 1.82 | 2.64 | 0.83 | 1.28 | |
F1 (%) | 2 | 98.72 | 97.96 | 97.66 | 97.63 | 99.15 | 99.08 |
5 | 98.12 | 97.59 | 98.78 | 97.91 | 99.16 | 98.92 | |
10 | 98.12 | 98.09 | 98.61 | 98.01 | 99.18 | 98.88 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 97.86 | 98.07 | 99.86 | 96.41 | 99.94 | 99.95 |
5 | 99.87 | 97.9 | 99.9 | 98.17 | 99.93 | 99.91 | |
10 | 99.87 | 97.62 | 99.89 | 94.72 | 99.94 | 99.93 | |
Number of selected features | 2 | 27 | 17 | 16 | 10 | 38 | 35 |
5 | 31 | 24 | 14 | 14 | 42 | 40 | |
10 | 29 | 24 | 19 | 9 | 41 | 33 | |
FPR (%) | 2 | 0.06 | 0.05 | 0.03 | 0.03 | 0.003 | 0.007 |
5 | 0.09 | 0.003 | 0.05 | 0.05 | 0.03 | 0.01 | |
10 | 0.06 | 0.05 | 0.04 | 0.04 | 0.02 | 0.005 | |
F1 (%) | 2 | 98.08 | 98.28 | 99.88 | 96.74 | 99.93 | 99.92 |
5 | 99.89 | 98.12 | 99.92 | 98.36 | 99.93 | 99.93 | |
10 | 99.89 | 97.86 | 99.91 | 95.12 | 99.92 | 99.89 |
Metric | Fold | SFS+SVM | SFS+MLP | SBS+SVM | SBS+MLP | GA+SVM | GA+MLP |
---|---|---|---|---|---|---|---|
Accuracy (%) | 2 | 97.41 | 97.57 | 98.2 | 99.56 | 99.56 | 99.92 |
5 | 97.41 | 99.78 | 99.6 | 99.09 | 99.8 | 99.89 | |
10 | 97.38 | 98.21 | 99.01 | 99.16 | 99.74 | 99.82 | |
Number of selected features | 2 | 6 | 8 | 6 | 8 | 14 | 47 |
5 | 4 | 7 | 5 | 7 | 15 | 41 | |
10 | 6 | 7 | 6 | 9 | 17 | 38 | |
FPR (%) | 2 | 0.35 | 0.19 | 0.36 | 0.15 | 0.13 | 0.07 |
5 | 0.30 | 0.16 | 0.19 | 1.31 | 0.19 | 0.11 | |
10 | 0.33 | 0.19 | 0.20 | 1.22 | 0.15 | 0.19 | |
F1 (%) | 2 | 97.35 | 97.51 | 98.91 | 99.55 | 99.56 | 99.93 |
5 | 97.34 | 99.78 | 99.60 | 99.07 | 99.80 | 99.89 | |
10 | 98.85 | 98.83 | 98.84 | 98.09 | 99.03 | 99.02 |
Ref. | FS Tech. | Classifier | Number of Selected Features | ACC (%) | FPR (%) |
---|---|---|---|---|---|
[19] | CSA | MCF+MVO-ANN | 22 | 98.81 | 0.02 |
[28] | Sigmoid POI | DT | 18 | 86.90 | 6.40 |
[28] | Cosine POI | DT | 5 | 88.30 | 8.80 |
[29] | ABC | AdaBoost | 25 | 98.90 | 0.01 |
Proposed model | GA | SVM | 29 | 99.23 | 0.77 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Almaghthawi, Y.; Ahmad, I.; Alsaadi, F.E. Performance Analysis of Feature Subset Selection Techniques for Intrusion Detection. Mathematics 2022, 10, 4745. https://doi.org/10.3390/math10244745