Research on Log Anomaly Detection Based on Sentence-BERT
Abstract
1. Introduction
- We construct a log event semantic feature extraction model, T-SBERT, based on the Sentence-BERT model, which converts log events into semantic feature representations. A Bidirectional Long Short-Term Memory (Bi-LSTM) recurrent neural network with an attention mechanism is then adopted to build the anomaly detection model.
- We propose a log event semantic feature matching algorithm and an anomaly detection algorithm. The log event semantic matching dictionary is established, and the log anomaly detection method LogADSBERT, based on Sentence-BERT, is constructed. It is, to the best of our knowledge, the first to extract log event semantic features using the Sentence-BERT model.
- In the scenario of new log event injection, LogADSBERT can ensure high accuracy and strong robustness of anomaly detection. Experiment results demonstrate the effectiveness of the proposed method.
2. Related Work
3. Preliminary Knowledge
3.1. Log Parser
3.2. Self-Attention Mechanism
3.3. Sentence-BERT Model
3.4. Bi-LSTM Neural Network Model
4. Definitions of LogADSBERT
- If h ≤ j < q, then W(Si, j) = <vi,j−h, vi,j−h+1, …, vi,j−1>;
- Otherwise, W(Si, j) = ∅.
- For each window position j, if the similarity between the prediction vector pj and the corresponding actual vector vi,j is greater than the threshold ξ at every position, the log event sequence Ti is judged normal;
- Otherwise, the log event sequence Ti is judged abnormal.
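Assuming cosine similarity as the Similarity function (the concrete measure is not fixed in this excerpt), the decision rule above can be sketched in Python:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def is_normal(predicted, actual, xi):
    """A sequence is judged normal only if every predicted vector is
    similar enough (> xi) to the corresponding actual vector."""
    return all(cosine_similarity(p, v) > xi for p, v in zip(predicted, actual))
```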
5. Algorithms of LogADSBERT
5.1. Sentence-BERT Training Algorithm
Algorithm 1: TSBERTTrain(T)
Input: Log event set T
Output: Log event semantic vector generation model T-SBERT
(1) Initialize the text corpus TC = ∅;
(2) Initialize the log event semantic dictionary D = ∅;
(3) Initialize the Sentence-BERT model instance;
(4) FOR EACH ti IN T DO
(5) Split ti into the word list WL;
(6) FOR EACH word IN WL DO
(7) word = lowerCase(word);
(8) IF word is a stop word or an identifier without semantics THEN
(9) Remove word from WL;
(10) END IF
(11) END FOR
(12) Add the processed word list WL as a sentence to the corpus;
(13) Add the corpus to TC;
(14) END FOR
(15) Train the Sentence-BERT model on the text corpus TC to obtain T-SBERT;
(16) RETURN T-SBERT;
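The preprocessing loop of Algorithm 1 (steps (5)–(11)) can be sketched in Python; the stop-word subset and the `isalpha` test for identifiers without semantics (block IDs, numbers) are illustrative assumptions of this sketch:

```python
STOP_WORDS = {"the", "a", "an", "is", "to", "of"}  # illustrative subset

def preprocess(log_event):
    """Split a log event into words, lowercase them, and drop stop
    words and tokens without semantic content (e.g. IDs, numbers)."""
    words = []
    for word in log_event.split():
        word = word.lower()
        if word in STOP_WORDS or not word.isalpha():
            continue  # step (9): remove word from the word list
        words.append(word)
    return words
```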
5.2. Log Event Semantic Matching Algorithm
Algorithm 2: LESVMatch(ti, T-SBERT)
Input: Log event ti, log event semantic vector model T-SBERT
Output: Log event semantic vector vi
(1) IF ti IN D THEN
(2) RETURN D(ti);
(3) END IF
(4) Split ti into the word list WL;
(5) FOR EACH word IN WL DO
(6) word = lowerCase(word);
(7) IF word is a stop word or an identifier without semantics THEN
(8) Remove word from WL;
(9) END IF
(10) END FOR
(11) Build the corpus from the processed word list WL;
(12) vi = T-SBERT(corpus);
(13) Add the mapping {ti → vi} to the log event semantic dictionary D;
(14) RETURN vi;
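The caching behavior of Algorithm 2, where each log event is embedded once and afterwards served from the dictionary D, can be sketched as follows; `model` is a stand-in for T-SBERT inference, an assumption of this sketch:

```python
def lesv_match(event, model, cache):
    """Return the cached semantic vector for a previously seen event;
    otherwise embed the event with the model and cache the result."""
    if event in cache:          # steps (1)-(3): dictionary hit
        return cache[event]
    vector = model(event)       # step (12): stand-in for T-SBERT inference
    cache[event] = vector       # step (13): record the mapping in D
    return vector
```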
5.3. Bi-LSTM Training Algorithm
Algorithm 3: BILSTMADMTrain(S, h)
Input: Sliding window length h, log event semantic vector sequence set S = {S1, S2, …, Sf}, where Si = <vi,1, vi,2, …, vi,q>
Output: Log prediction model Bi-LSTM-ADM
(1) Initialize the training data pair list TPDL = ∅;
(2) Initialize the Bi-LSTM model;
(3) FOR EACH Si, i ∈ [1, f] DO
(4) FOR j = h, h + 1, …, q − 1 DO
(5) Generate the log event semantic vector sliding window W(Si, j) according to Definition 5;
(6) IF W(Si, j) = ∅ THEN
(7) CONTINUE;
(8) END IF
(9) Generate the training data pair TDP = (W(Si, j), vi,j) and add it to the TPDL;
(10) END FOR
(11) END FOR
(12) Train the Bi-LSTM on the TPDL as the training dataset to generate Bi-LSTM-ADM;
(13) RETURN Bi-LSTM-ADM;
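The window/label pairing performed by the inner loop of Algorithm 3 can be sketched as a plain Python function; `training_pairs` is a hypothetical helper name for this sketch:

```python
def training_pairs(seq, h):
    """For each position j in [h, len(seq)), pair the window of the h
    preceding vectors with the vector to be predicted at position j."""
    pairs = []
    for j in range(h, len(seq)):
        window = seq[j - h:j]        # W(Si, j): the h vectors before j
        pairs.append((window, seq[j]))
    return pairs
```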
5.4. Anomaly Detection Algorithm
Algorithm 4: LogADSBERTDetect(Si, h, ξ, Bi-LSTM-ADM)
Input: Sequence of log event semantic vectors Si = <vi,1, vi,2, …, vi,q>, sliding window size h, threshold ξ, Bi-LSTM-ADM model
Output: TRUE (normal) / FALSE (abnormal)
(1) FOR j = h, h + 1, …, q − 1 DO
(2) Generate the event semantic vector sliding window W(Si, j);
(3) IF W(Si, j) = ∅ THEN
(4) CONTINUE;
(5) END IF
(6) Input W(Si, j) into Bi-LSTM-ADM to obtain the prediction vector pj;
(7) Add pj to the set of prediction result vectors;
(8) IF Similarity(pj, vi,j) < ξ THEN
(9) RETURN FALSE;
(10) END IF
(11) END FOR
(12) RETURN TRUE;
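The detection loop of Algorithm 4 can be sketched as follows; `predict` and `similarity` are hypothetical stand-ins for Bi-LSTM-ADM inference and the Similarity function, passed in as callables:

```python
def detect(seq, h, xi, predict, similarity):
    """Slide a window over the sequence, predict the next vector, and
    flag the sequence as abnormal (False) as soon as one prediction is
    too dissimilar (< xi) from the observed vector."""
    for j in range(h, len(seq)):
        window = seq[j - h:j]
        pred = predict(window)            # Bi-LSTM-ADM stand-in
        if similarity(pred, seq[j]) < xi:
            return False                  # abnormal
    return True                           # normal
```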
6. Evaluation
6.1. Experimental Setting
6.1.1. Evaluation Metrics
- False positive (FP): the number of normal log sequences incorrectly marked as abnormal.
- False negative (FN): the number of abnormal log sequences incorrectly marked as normal.
- Precision: the proportion of log sequences marked as abnormal that are truly abnormal, where TP denotes the number of abnormal log sequences correctly marked as abnormal; Precision = TP/(TP + FP), as shown in Equation (1).
- Recall: the proportion of truly abnormal log sequences that are successfully detected; Recall = TP/(TP + FN), as shown in Equation (2).
- F1-Score: the harmonic mean of precision and recall; F1-Score = (2 × Precision × Recall)/(Precision + Recall), as shown in Equation (3).
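For concreteness, the three metrics can be computed from the raw counts as follows (a minimal sketch; TP again denotes abnormal sequences correctly marked as abnormal):

```python
def precision(tp, fp):
    """Fraction of sequences marked abnormal that are truly abnormal."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of truly abnormal sequences that are detected."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)
```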
6.1.2. Environment and Hyperparameters
6.1.3. Experimental Datasets
6.2. Results
1. Precision, Recall, and F1-Score
2. Statistics of FP and FN
3. Effects of different parameters on LogADSBERT
4. Performance comparison under new log event injection
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Hyperparameters | Value |
---|---|
Learning rate | 0.001 |
Batch size | 2048 |
Epoch | 300 |
l (Neural network layers) | 2 |
α (Hidden layer cell size) | 64 |
h (Sliding window size) | 10 |
Log Datasets | Training Sessions | Normal Sessions | Abnormal Sessions | Number of Log Events |
---|---|---|---|---|
HDFS | 7333 | 14,296 | 4251 | 46 |
OpenStack | 514 | 4904 | 425 | 40 |
Evaluation Metrics | DeepLog | LogAnomaly | LogADSBERT |
---|---|---|---|
False positive (FP) | 451 | 303 | 59 |
False negative (FN) | 265 | 174 | 47 |
Evaluation Metrics | DeepLog | LogAnomaly | LogADSBERT |
---|---|---|---|
False positive (FP) | 107 | 61 | 28 |
False negative (FN) | 34 | 29 | 0 |
Evaluation Metrics | DeepLog | LogAnomaly | LogADSBERT |
---|---|---|---|
Precision | 0.409 | 0.552 | 0.937 |
Recall | 0.976 | 0.932 | 0.928 |
F1-Score | 0.577 | 0.694 | 0.932 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hu, C.; Sun, X.; Dai, H.; Zhang, H.; Liu, H. Research on Log Anomaly Detection Based on Sentence-BERT. Electronics 2023, 12, 3580. https://doi.org/10.3390/electronics12173580