Machine Learning for Anomaly Detection in Industrial Environments
Abstract
1. Introduction
- RQ1: What are the latest scientific publications in the field of anomaly detection in industrial environments?
- RQ2: Which machine-learning models are most effective in detecting anomalies in industrial environments?
- RQ3: How does the application of machine learning in anomaly detection impact the safety of industrial environments?
2. Related Work
3. Methodology
- The studies were published between 2015 and 2023;
- Only studies focusing on the use of machine learning for anomaly detection in industrial environments were selected;
- Only studies published in scientific journals, conference proceedings, or books were included, to support the validity of the results;
- The studies were written in English.
- Studies that did not use machine learning and focused only on classical detection methods were excluded;
- Studies that lacked sufficient detail about the machine-learning algorithms used, the datasets used, or the reported evaluation metrics were excluded, so that only sufficiently transparent and reproducible studies were considered;
- Studies that duplicated other articles or presented substantially similar findings were excluded to avoid redundancy in this review.
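The inclusion and exclusion criteria listed above can be read as a filter applied to candidate records. The Python sketch below is illustrative only: the `Study` record, its field names, the helper functions, and the sample entries are hypothetical and do not come from the review's actual screening data. It also assumes that a study must report algorithms, datasets, and metrics together to count as sufficiently detailed, and it approximates duplicate detection with a normalized title match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical screening record; every field name is an assumption."""
    title: str
    year: int
    language: str
    venue_type: str  # "journal", "conference", "book", or anything else
    uses_ml: bool
    industrial_anomaly_focus: bool
    describes_algorithms: bool
    describes_datasets: bool
    reports_metrics: bool


PEER_REVIEWED_VENUES = {"journal", "conference", "book"}


def meets_inclusion(study: Study) -> bool:
    """Inclusion criteria: period, topic, venue type, and language."""
    return (
        2015 <= study.year <= 2023
        and study.uses_ml
        and study.industrial_anomaly_focus
        and study.venue_type in PEER_REVIEWED_VENUES
        and study.language == "English"
    )


def has_sufficient_detail(study: Study) -> bool:
    """Assumption: algorithms, datasets, and metrics must all be reported."""
    return (
        study.describes_algorithms
        and study.describes_datasets
        and study.reports_metrics
    )


def screen(candidates: list[Study]) -> list[Study]:
    """Keep studies that pass inclusion, detail, and duplicate checks."""
    selected = []
    seen_titles = set()
    for study in candidates:
        if not meets_inclusion(study) or not has_sufficient_detail(study):
            continue
        key = " ".join(study.title.lower().split())  # crude duplicate key
        if key in seen_titles:
            continue
        seen_titles.add(key)
        selected.append(study)
    return selected


# Made-up records, for illustration only.
candidates = [
    Study("RF for ICS anomalies", 2021, "English", "journal", True, True, True, True, True),
    Study("rf for ICS  anomalies", 2021, "English", "conference", True, True, True, True, True),
    Study("Threshold alarms in plants", 2019, "English", "journal", False, True, True, True, True),
    Study("Early SVM detector", 2012, "English", "journal", True, True, True, True, True),
    Study("Opaque deep detector", 2022, "English", "conference", True, True, True, False, False),
]
selected = screen(candidates)
print([s.title for s in selected])
```

In practice, judging "substantially similar findings" requires manual review; the title match here only stands in for that step.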
4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Harrou, F.; Sun, Y.; Khadraoui, S. Amalgamation of Anomaly-Detection Indices for Enhanced Process Monitoring. J. Loss Prev. Process Ind. 2016, 40, 365–377.
2. He, H.; Garcia, E.A. Learning from Imbalanced Data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
3. Krawczyk, B. Learning from Imbalanced Data: Open Challenges and Future Directions. Prog. Artif. Intell. 2016, 5, 221–232.
4. Nassif, A.B.; Talib, M.A.; Nasir, Q.; Dakalbab, F.M. Machine Learning for Anomaly Detection: A Systematic Review. IEEE Access 2021, 9, 78658–78700.
5. Musa, T.H.A.; Bouras, A. Anomaly Detection: A Survey. In Lecture Notes in Networks and Systems; ACM: New York, NY, USA, 2022; pp. 391–401.
6. Larriva-Novo, X.A.; Vega-Barbas, M.; Villagra, V.A.; Sanz Rodrigo, M. Evaluation of Cybersecurity Data Set Characteristics for Their Applicability to Neural Networks Algorithms Detecting Cybersecurity Anomalies. IEEE Access 2020, 8, 9005–9014.
7. Lee, W.; Stolfo, S.J. Data Mining Approaches for Intrusion Detection. In Proceedings of the 7th USENIX Security Symposium, San Antonio, TX, USA, 26–29 January 1998.
8. Bauer, F.C.; Muir, D.R.; Indiveri, G. Real-Time Ultra-Low Power ECG Anomaly Detection Using an Event-Driven Neuromorphic Processor. IEEE Trans. Biomed. Circuits Syst. 2019, 13, 1575–1582.
9. Omar, S.; Ngadi, A.; Jebur, H. Machine Learning Techniques for Anomaly Detection: An Overview. Int. J. Comput. Appl. 2013, 79, 33–41.
10. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly Detection. ACM Comput. Surv. 2009, 41, 1–58.
11. Sodemann, A.A.; Ross, M.P.; Borghetti, B.J. A Review of Anomaly Detection in Automated Surveillance. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2012, 42, 1257–1272.
12. Zuo, R. Machine Learning of Mineralization-Related Geochemical Anomalies: A Review of Potential Methods. Nat. Resour. Res. 2017, 26, 457–464.
13. Frank, A.G.; Dalenogare, L.S.; Ayala, N.F. Industry 4.0 Technologies: Implementation Patterns in Manufacturing Companies. Int. J. Prod. Econ. 2019, 210, 15–26.
14. Umer, M.A.; Junejo, K.N.; Jilani, M.T.; Mathur, A.P. Machine Learning for Intrusion Detection in Industrial Control Systems: Applications, Challenges, and Recommendations. Int. J. Crit. Infrastruct. Prot. 2022, 38, 100516.
15. Sokolov, A.N.; Pyatnitsky, I.A.; Alabugin, S.K. Research of Classical Machine Learning Methods and Deep Learning Models Effectiveness in Detecting Anomalies of Industrial Control System. In Proceedings of the 2018 Global Smart Industry Conference (GloSIC), Chelyabinsk, Russia, 13–15 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6.
16. Pang, G.; Shen, C.; Cao, L.; van den Hengel, A. Deep Learning for Anomaly Detection. ACM Comput. Surv. 2022, 54, 1–38.
17. Erhan, L.; Ndubuaku, M.; Di Mauro, M.; Song, W.; Chen, M.; Fortino, G.; Bagdasar, O.; Liotta, A. Smart Anomaly Detection in Sensor Systems: A Multi-Perspective Review. Inf. Fusion 2021, 67, 64–79.
18. Mokhtari, S.; Abbaspour, A.; Yen, K.K.; Sargolzaei, A. A Machine Learning Approach for Anomaly Detection in Industrial Control Systems Based on Measurement Data. Electronics 2021, 10, 407.
19. Shin, H.K.; Lee, W.; Yun, J.H.; Kim, H.C. HAI 1.0: HIL-Based Augmented ICS Security Dataset. In Proceedings of the CSET 2020, 13th USENIX Workshop on Cyber Security Experimentation and Test, Co-Located with USENIX Security 2020, Online, 10 August 2020.
20. Gamal, M.; Donkol, A.; Shaban, A.; Costantino, F.; Di Gravio, G.; Patriarca, R. Anomalies Detection in Smart Manufacturing Using Machine Learning and Deep Learning Algorithms. In Proceedings of the International Conference on Industrial Engineering and Operations Management, Dhaka, Bangladesh, 26–27 December 2021; pp. 1611–1622.
21. Steel Plates Faults Data Set, Semeion, Research Center of Sciences of Communication, Via Sersale 117, 00128, Rome, Italy. Available online: https://archive.ics.uci.edu/ml/datasets/Steel+Plates+Faults (accessed on 7 April 2024).
22. Wang, J.; Liu, J.; Pu, J.; Yang, Q.; Miao, Z.; Gao, J.; Song, Y. An Anomaly Prediction Framework for Financial IT Systems Using Hybrid Machine Learning Methods. J. Ambient Intell. Humaniz. Comput. 2023, 14, 15277–15286.
23. Shanthi, K.; Maruthi, R. Machine Learning Approach for Anomaly-Based Intrusion Detection Systems Using Isolation Forest Model and Support Vector Machine. In Proceedings of the 2023 5th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 3 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 136–139.
24. Hassan Zaib, M. NSL-KDD. Available online: https://www.kaggle.com/datasets/hassan06/nslkdd (accessed on 7 April 2024).
25. Lejon, E.; Kyösti, P.; Lindström, J. Machine Learning for Detection of Anomalies in Press-Hardening: Selection of Efficient Methods. Procedia CIRP 2018, 72, 1079–1083.
26. Quatrini, E.; Costantino, F.; Di Gravio, G.; Patriarca, R. Machine Learning for Anomaly Detection and Process Phase Classification to Improve Safety and Maintenance Activities. J. Manuf. Syst. 2020, 56, 117–132.
27. Anton, S.D.D.; Sinha, S.; Dieter Schotten, H. Anomaly-Based Intrusion Detection in Industrial Data with SVM and Random Forests. In Proceedings of the 2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 19–21 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
28. Morris, T.H.; Thornton, Z.; Turnipseed, I. Industrial Control System Simulation and Data Logging for Intrusion Detection System Research. In Proceedings of the 7th Annual Southeastern Cyber Security Summit, Huntsville, AL, USA, 3–4 June 2015.
29. Antón, S.D.; Gundall, M.; Fraunholz, D.; Schotten, H.D. Implementing SCADA Scenarios and Introducing Attacks to Obtain Training Data for Intrusion Detection Methods. In Proceedings of the ICCWS 2019 14th International Conference on Cyber Warfare and Security: ICCWS 2019, Stellenbosch, South Africa, 28 February–1 March 2019.
30. Inoue, J.; Yamagata, Y.; Chen, Y.; Poskitt, C.M.; Sun, J. Anomaly Detection for a Water Treatment System Using Unsupervised Machine Learning. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1058–1065.
31. Goh, J.; Adepu, S.; Junejo, K.N.; Mathur, A. A Dataset to Support Research in the Design of Secure Water Treatment Systems. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2017; pp. 88–99. ISBN 9783319713670.
32. Ifzarne, S.; Tabbaa, H.; Hafidi, I.; Lamghari, N. Anomaly Detection Using Machine Learning Techniques in Wireless Sensor Networks. J. Phys. Conf. Ser. 2021, 1743, 012021.
33. Almomani, I.; Al-Kasasbeh, B.; AL-Akhras, M. WSN-DS: A Dataset for Intrusion Detection Systems in Wireless Sensor Networks. J. Sens. 2016, 2016, 4731953.
34. Tai, J.; Alsmadi, I.; Zhang, Y.; Qiao, F. Machine Learning Methods for Anomaly Detection in Industrial Control Systems. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 2333–2339.
Ref. (Year) | Dataset | Machine-Learning Model | Precision (%) | Recall (%) | F1-Score (%) | Accuracy (%) | Other
---|---|---|---|---|---|---|---
[18] (2021) | HIL ICS [19] | k-NN | 97.32 | 97.29 | 97.29 | 97.29 | AUC: 97.29%; fitting time: 173 s; prediction time: 104 s
 | | DT | 99.37 | 99.37 | 99.37 | 99.37 | AUC: 99.37%; fitting time: 5.8 s; prediction time: 0.0283 s
 | | RF | 99.76 | 99.76 | 99.76 | 99.76 | AUC: 99.76%; fitting time: 2.21 s; prediction time: 0.0505 s
[20] (2021) | Steel Plates Faults, Semeion Research Center of Sciences of Communication [21] | DT | 91.29 | - | 91.29 | 91.14 | Sensitivity: 91.14%
 | | k-NN | 82.86 | - | 82.86 | 82.86 | Sensitivity: 82.86%
 | | RF | 93.86 | - | 92.43 | 93.29 | Sensitivity: 93.29%
 | | SVM | 74.57 | - | 79.43 | 86.00 | Sensitivity: 86.00%
 | | Naïve Bayes | 89.29 | - | 62.86 | 59.00 | Sensitivity: 59.00%
 | | Logistic Regression (LR) | 86.71 | - | 83.71 | 88.29 | Sensitivity: 88.29%
 | | Multilayer Perceptron (MLP) | 88.43 | - | 73.43 | 73.86 | Sensitivity: 73.86%
[22] (2023) | Biz | DT | 69.06 | 73.19 | - | - | F0.5-score: 69.85%
 | | RF | 87.44 | 70.80 | - | - | F0.5-score: 83.51%
 | | k-NN | 74.01 | 62.90 | - | - | F0.5-score: 73.00%
 | | GBDT | 83.03 | 62.06 | - | - | F0.5-score: 77.77%
 | | Stacking | 88.03 | 70.17 | - | - | F0.5-score: 83.76%
 | Mon | DT | 57.07 | 66.60 | - | - | F0.5-score: 58.75%
 | | RF | 87.92 | 64.27 | - | - | F0.5-score: 81.89%
 | | k-NN | 73.23 | 53.73 | - | - | F0.5-score: 68.27%
 | | GBDT | 85.68 | 43.39 | - | - | F0.5-score: 71.70%
 | | Stacking | 87.54 | 69.61 | - | - | F0.5-score: 83.25%
 | Ora | DT | 44.82 | 57.31 | - | - | F0.5-score: 46.86%
 | | RF | 77.99 | 49.01 | - | - | F0.5-score: 69.74%
 | | k-NN | 56.58 | 25.49 | - | - | F0.5-score: 45.48%
 | | GBDT | 61.33 | 21.94 | - | - | F0.5-score: 45.13%
 | | Stacking | 85.11 | 56.53 | - | - | F0.5-score: 77.29%
 | Trd | DT | 41.09 | 58.87 | - | - | F0.5-score: 43.73%
 | | RF | 86.21 | 53.19 | - | - | F0.5-score: 76.69%
 | | k-NN | 81.03 | 33.33 | - | - | F0.5-score: 63.00%
 | | GBDT | 72.82 | 53.19 | - | - | F0.5-score: 67.81%
 | | Stacking | 85.42 | 51.90 | - | - | F0.5-score: 75.64%
[23] (2023) | NSL-KDD [24] | IF | - | 87.00 | 78.00 | 99.00 | -
 | | SVM | - | 88.00 | 67.00 | 95.00 | -
[25] (2018) | Data from press-hardening processes | ANN | 100 | 100 | - | 100 | -
 | | SC-SVM | 98.90 | 100 | - | 99.40 | -
 | | IF | 99.00 | 100 | - | 99.50 | -
[26] (2020) | Pharmaceutical company data | RFA | 99.97 | 99.97 | 99.97 | - | -
 | | DLA | 99.96 | 99.97 | 99.96 | - | -
[27] (2019) | Modbus (D1) [28] | SVM | - | - | - | 92.53 | Execution time: 11,712 s
 | OPC UA (D2) [29] | SVM | - | - | - | 90.81 | Execution time: 0.019 s
 | Modbus (D1) [28] | RF | - | - | - | 99.84 | Execution time: 281 s
 | OPC UA (D2) [29] | RF | - | - | - | 99.98 | Execution time: 52.31 s
[30] (2017) | SWaT [31] | DNN | 98.29 | 67.84 | 80.28 | - | -
 | | SC-SVM | 92.50 | 69.90 | 79.62 | - | -
[32] (2021) | WSN-DS [33] | SVM | 88.00 | 92.00 | 90.00 | 89.00 | -
 | | Naïve Bayes | 94.00 | 85.00 | 88.00 | 94.00 | -
 | | DT | 94.00 | 94.00 | 93.00 | 94.00 | -
 | | ID-GOPA | 96.00 | 96.00 | 96.00 | 96.00 | -
 | | RF | 94.00 | 85.00 | 88.00 | 94.00 | -
[34] (2020) | HIL-based augmented ICS dataset | Naïve Bayes | - | - | - | 54.00 | Training time: 0.1 s; prediction time: 0.1 s
 | | RF | - | - | - | 82.93 | Training time: 0.9 s; prediction time: 0.1 s
 | | RF GSCV | - | - | - | 82.93 | Training time: 109.8 s; prediction time: 8.2 s
 | | GB | - | - | - | 77.58 | Training time: 583.02 s; prediction time: 0.1 s
 | | GB GSCV | - | - | - | 83.63 | Training time: 1274.2 s; prediction time: 10 s
 | | ANN | - | - | - | 82.79 | Training time: 76 s; prediction time: 10 s
 | | LSTM | - | - | - | 82.81 | Training time: 111 s; prediction time: 20 s
 | | LSTM AE | - | - | - | 82.79 | Training time: 809 s; prediction time: 10 s
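The table reports F1 and F0.5 scores alongside precision and recall. As a reading aid, the Python sketch below applies the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), to a few precision/recall pairs copied from the table and compares the result with the reported score. The function name `f_beta` and the choice of rows are ours; the formula is the conventional definition and is assumed, not confirmed, to be the one used by the cited studies.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta from precision and recall (same scale in and out, e.g. percent)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (model, dataset, precision %, recall %, beta, reported score %) from the table
reported = [
    ("DT", "Biz", 69.06, 73.19, 0.5, 69.85),
    ("RF", "Biz", 87.44, 70.80, 0.5, 83.51),
    ("Stacking", "Biz", 88.03, 70.17, 0.5, 83.76),
    ("DNN", "SWaT", 98.29, 67.84, 1.0, 80.28),
    ("SC-SVM", "SWaT", 92.50, 69.90, 1.0, 79.62),
]

for model, dataset, p, r, beta, score in reported:
    computed = f_beta(p, r, beta)
    # Inputs are rounded to two decimals, so small differences are expected.
    print(f"{model:9s} {dataset:5s} beta={beta}: "
          f"computed {computed:.2f}, reported {score:.2f}")
```

With beta = 0.5, precision is weighted more heavily than recall, which is why RF and Stacking in [22] score well on F0.5 despite recall around 70%.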
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Grunova, D.; Bakratsi, V.; Vrochidou, E.; Papakostas, G.A. Machine Learning for Anomaly Detection in Industrial Environments. Eng. Proc. 2024, 70, 25. https://doi.org/10.3390/engproc2024070025