Online Intrusion Scenario Discovery and Prediction Based on Hierarchical Temporal Memory (HTM)
Abstract
1. Introduction
- (1) Although an IDS is efficient at detecting single-step attacks, it can hardly reveal the logical relations among them.
- (2) Real attacks are low-probability events, and the most dangerous attacks happen only rarely, which means that each dataset contains only a few examples of attacks. This problem in intrusion detection research, called the “rare data problem” [2], is even worse when multistep attacks are considered.
- (3) IDS sensors can miss some steps of a multistep attack because these steps may appear normal to them [3]. Thus, we cannot assume that the important alerts related to the real attack will always be present.
- (4) High data redundancy and volume slow down the analysis process, which is a significant challenge for systems working in real time.
- (1) Some early prediction models based on prior knowledge were created manually by security experts. Domain knowledge, such as known cyber threats or intrusion scenarios, is leveraged in the modeling process. These models closely represent real intrusion scenarios and provide high prediction accuracy with no ambiguity, which is why they are popular in industry. However, it is heavy work for experts to keep the models up to date with attack strategies that change all the time. Early studies used the approach proposed by Qin and Lee [6], which can identify attack plans based on isolated attack scenarios, but substantial adjustments must be made to predict attacks beyond the predefined attack plans.
- (2) Two types of graph models, hidden Markov models (HMMs) and Bayesian networks, are commonly used in intrusion prediction. Essentially, they are attack graphs that can be constructed automatically. Statistical methods are used to obtain the transition probabilities or probability distributions on which the prediction is based. Ramaki et al. [7] proposed an attack prediction model building on earlier work [4], a real-time IDS alert correlation framework. In their model, Bayesian networks and a Bayesian attack graph (BAG) are employed to increase prediction accuracy. Unlike models based on Bayesian networks, HMM-based methods are insensitive to missed states and transitions, which means they can work well without complete information about the intrusion scenarios [8]. Farhadi et al. [9] proposed an alert correlation and prediction framework that uses an HMM as the prediction component; it does not require information about the network topology, system vulnerabilities, or system configurations, and it is robust against over-fitting. However, the model cannot predict patterns of unknown attacks. Additionally, both types of model are sensitive to outliers and lack scalability across varied intrusion scenarios.
- (3) Data mining is another prevalent technology for discovering correlations within massive amounts of security data before constructing intrusion prediction models. Husák et al. [10] proposed an attack prediction framework based on data mining to identify attack patterns from a million alerts; sequential rule mining is employed to derive the prediction rules. Kim et al. [11] proposed an approach that creates an attack graph for intrusion prediction using data mining techniques. Although data mining can discover unknown intrusion patterns, redundant, small, frequent items affect the accuracy of the prediction. Most importantly, data mining models cannot discover critical data items that occur with low frequency.
- (4) Since ANN (artificial neural network)-based approaches were first applied to intrusion prediction, many studies have reported that neural networks predict attacks better than other methods. However, neural networks need to be trained offline and retrained at regular intervals to keep up with constantly changing data. Additionally, feature selection and dimensionality reduction are two essential and complex tasks, and the parameters must be learned from the data. In other words, neural networks are heavy models and are not optimal for online prediction in real time. Subba et al. [12] proposed an intrusion detection model that reduces the computational resources needed in the training phase and is suitable for real-time deployment. As noted above, however, it still requires feature selection and laborious parameter tuning to achieve better performance.
2. Materials and Methods
2.1. Preliminaries of HTM
2.2. The Intrusion Scenario Discovery and Prediction Framework
- Normalizing: Raw alerts are converted into a unified data structure for later analysis;
- Intrusion action extraction: A group of normalized alerts can be correlated into a hyper-alert, called an intrusion action in this paper, if they meet three requirements: first, they have the same source IP address and destination IP address; second, they have the same destination port or the same intrusion type; and last, they occur consecutively in time order. The massive number of redundant raw alerts is aggregated in this process;
- Online action clustering: Many extracted actions have similar semantics; these are merged based on a similarity calculation;
- Intrusion session reconstruction: The time-ordered action sequence is split into subsequences by calculating the variance of the time gaps between actions; each subsequence is called an intrusion session. Note that all the actions of a particular intrusion session are extracted from the same pair of physical devices. The extracted sessions are then refined to remove further noise actions;
- Intrusion session encoding: An intrusion session must be encoded before being fed to the HTM algorithm. The session is first converted into a sequence of action IDs, and each ID is encoded into a bit array, which is the HTM's internal representation of that action. The learning phase of the HTM starts with the encoding of the ID sequences;
- HTM online learning: The HTM learns the patterns of the action sequences and, for each input, outputs a prediction of the next step together with an anomaly score that indicates the discrepancy between the predicted value and the input value. The predicted values and anomaly scores are recorded in a dedicated correlation matrix;
- Intrusion scenario discovery: Based on an analysis of the historical predicted actions and anomaly scores in the correlation matrix, the correlation strengths of intrusion actions are calculated by an auxiliary method; and
- Intrusion prediction: The system searches the matrix to determine the attack paths that may occur in the future.
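
The intrusion action extraction step in the list above can be illustrated with a minimal Python sketch. The `Alert` record and its field names are hypothetical and are not the paper's data structure; the third requirement (alerts occurring one after another in time) is modeled here, as an assumption, by comparing each alert only with the last alert of the current run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alert:
    # Hypothetical normalized alert; the field names are illustrative only.
    timestamp: float
    src_ip: str
    dst_ip: str
    dst_port: int
    alert_type: str


def same_action(prev: Alert, cur: Alert) -> bool:
    """Check the first two grouping requirements for two adjacent alerts."""
    same_hosts = prev.src_ip == cur.src_ip and prev.dst_ip == cur.dst_ip
    same_target = (prev.dst_port == cur.dst_port
                   or prev.alert_type == cur.alert_type)
    return same_hosts and same_target


def extract_actions(alerts: List[Alert]) -> List[List[Alert]]:
    """Group time-ordered alerts into intrusion actions (hyper-alerts)."""
    actions: List[List[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if actions and same_action(actions[-1][-1], alert):
            actions[-1].append(alert)   # extends the current run
        else:
            actions.append([alert])     # starts a new intrusion action
    return actions
```

Because a run is broken as soon as an unrelated alert appears, the same kind of alert can yield several actions over time; merging those repeats is left to the clustering step.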
2.2.1. Intrusion Action Clustering
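
As a rough illustration of online action clustering, the sketch below assigns each incoming action to an existing class when its similarity to that class reaches the threshold s0 = 0.6 from the parameter table, and otherwise opens a new class. The similarity measure used here (Jaccard overlap of alert-type sets) and the class `OnlineActionClusterer` are assumptions made for the example; the similarity factor μ from the parameter table is not modeled.

```python
from typing import FrozenSet, List

S0 = 0.6  # similarity threshold s0, taken from the parameter table


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Assumed similarity measure: overlap of two alert-type sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class OnlineActionClusterer:
    """Hypothetical one-pass clusterer keeping one representative per class."""

    def __init__(self, threshold: float = S0) -> None:
        self.threshold = threshold
        self.representatives: List[FrozenSet[str]] = []

    def assign(self, alert_types: FrozenSet[str]) -> int:
        """Return the class ID for an action, creating a class if needed."""
        best_id, best_sim = -1, 0.0
        for class_id, rep in enumerate(self.representatives):
            sim = jaccard(rep, alert_types)
            if sim > best_sim:
                best_id, best_sim = class_id, sim
        if best_id >= 0 and best_sim >= self.threshold:
            return best_id
        self.representatives.append(alert_types)
        return len(self.representatives) - 1
```

Each action is seen once and compared only with the stored representatives, so the cost per action grows with the number of classes rather than with the number of alerts processed.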
2.2.2. Intrusion Session Reconstruction
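
A minimal sketch of session reconstruction follows. The framework description says only that the action sequence is split using the variance of the time gaps between actions; the concrete rule used here (start a new session when a gap exceeds the mean gap plus `k` standard deviations) and the parameter `k` are assumptions for illustration. The input is assumed to contain actions from a single pair of devices, already in time order.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Action = Tuple[float, int]  # (timestamp, action class ID); illustrative


def split_sessions(actions: Sequence[Action], k: float = 1.0) -> List[List[Action]]:
    """Split a time-ordered action sequence at unusually large time gaps."""
    if not actions:
        return []
    if len(actions) < 2:
        return [list(actions)]
    gaps = [b[0] - a[0] for a, b in zip(actions, actions[1:])]
    # Assumed rule: a gap is a session boundary if it is larger than
    # mean + k * standard deviation of all gaps.
    cutoff = mean(gaps) + k * pstdev(gaps)
    sessions: List[List[Action]] = [[actions[0]]]
    for gap, action in zip(gaps, actions[1:]):
        if gap > cutoff:
            sessions.append([action])
        else:
            sessions[-1].append(action)
    return sessions
```

With evenly spaced actions the standard deviation is zero and no gap exceeds the cutoff, so the whole sequence stays in one session.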
2.2.3. Online Learning and Prediction
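
Before learning, each action ID is turned into a sparse bit array. The sketch below shows one way this could be done with the sizes from the parameter table (N = 500 bits, W = 21 active bits). It is a stand-in, not the encoder of any HTM library: the function `encode_action` and its seeded pseudo-random choice of active bits are assumptions made for illustration.

```python
import random
from typing import List

N = 500  # encoder bit array size, from the parameter table
W = 21   # number of active bits per action (encoder bucket width)


def encode_action(action_id: int, n: int = N, w: int = W) -> List[int]:
    """Map an action ID to a fixed sparse bit array with w active bits.

    Assumption: active positions are drawn by a generator seeded with the
    ID, so the same ID always produces the same bit array.
    """
    rng = random.Random(action_id)
    active = rng.sample(range(n), w)
    bits = [0] * n
    for index in active:
        bits[index] = 1
    return bits


def overlap(a: List[int], b: List[int]) -> int:
    """Count the active bits shared by two encodings."""
    return sum(x & y for x, y in zip(a, b))
```

Because action IDs are categorical, unrelated IDs should share few active bits; `overlap` gives a quick way to check this for a given pair of IDs.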
3. Results
3.1. Experiment Setup
3.2. Evaluations on CICIDS2017 Dataset
- The dataset resembles real-world network traffic and contains realistic benign background traffic that simulates the everyday interaction behaviors of 25 users;
- The data were captured continuously for five days, Monday to Friday, in July 2017. About 10 GB of data was created per day and labeled for training. Note that the Monday dataset contains only benign interactions of daily work;
- The attacks performed include brute force FTP, brute force SSH, DoS, Heartbleed, web attack, infiltration, botnet, and DDoS; and
- The network traffic was captured in a completely configured network, including modem, firewall, switches, routers, and PCs, with a variety of operating systems, such as Windows, Ubuntu, and Mac OS X, and most of the traffic is based on the HTTP, HTTPS, FTP, SSH, and email protocols.
3.2.1. Data Preprocessing Results
3.2.2. Data Learning Results
4. Discussion
Author Contributions
Funding
Conflicts of Interest
References
- Navarro, J.; Deruyver, A.; Parrend, P. A Systematic Survey on Multi-step Attack Detection. Comput. Secur. 2018, 76, 214–249.
- Ourston, D.; Matzner, S.; Stump, W.; Hopkins, B. Applications of Hidden Markov Models to Detecting Multi-Stage Network Attacks. In Proceedings of the 36th Annual Hawaii International Conference on System Sciences, Big Island, HI, USA, 6–9 January 2003; p. 10.
- Bing, C.; Lee, J.; Wu, A.S. Active Event Correlation in Bro IDS to Detect Multi-Stage Attacks. In Proceedings of the Fourth IEEE International Workshop on Information Assurance (IWIA’06), London, UK, 13–14 April 2006; pp. 16–50.
- Ramaki, A.A.; Amini, M.; Atani, R.E. RTECA: Real Time Episode Correlation Algorithm for Multi-Step Attack Scenarios Detection. Comput. Secur. 2015, 49, 206–219.
- Ren, H.; Stakhanova, N.; Ghorbani, A.A. An Online Adaptive Approach to Alert Correlation. In Proceedings of the Detection of Intrusions and Malware, and Vulnerability Assessment, Bonn, Germany, 8–9 July 2010; pp. 153–172.
- Qin, X.; Lee, W. Attack Plan Recognition and Prediction Using Causal Networks. In Proceedings of the 20th Annual Computer Security Applications Conference, Tucson, AZ, USA, 6–10 December 2004; pp. 370–379.
- Ramaki, A.A.; Khosravi-Farmad, K.; Bafghi, A.G. Real Time Alert Correlation and Prediction Using Bayesian Networks. In Proceedings of the 12th International Iranian Society of Cryptology Conference on Information Security and Cryptology (ISCISC), Rasht, Iran, 8–10 September 2015; pp. 98–103.
- Husák, M.; Komárková, J.; Bou-Harb, E.; Čeleda, P. Survey of Attack Projection, Prediction, and Forecasting in Cyber Security. IEEE Commun. Surv. Tutor. 2019, 21, 640–660.
- Farhadi, H.; AmirHaeri, M.; Khansari, M. Alert Correlation and Prediction Using Data Mining and HMM. ISeCure 2011, 3, 77–101.
- Husák, M.; Kašpar, J. Towards Predicting Cyber Attacks Using Information Exchange and Data Mining. In Proceedings of the 14th International Wireless Communications & Mobile Computing Conference (IWCMC), Limassol, Cyprus, 25–29 June 2018; pp. 536–541.
- Kim, Y.; Park, W.H. A study on cyber threat prediction based on intrusion detection event for APT attack detection. Multimed. Tools Appl. 2014, 71, 685–698.
- Subba, B.; Biswas, S.; Karmakar, S. A Neural Network based system for intrusion detection and attack classification. In Proceedings of the 2016 Twenty Second National Conference on Communication (NCC), Guwahati, India, 4–6 March 2016; pp. 1–6.
- Cui, Y.; Ahmad, S.; Hawkins, J. Continuous Online Sequence Learning with an Unsupervised Neural Network Model. April 2016. Available online: https://arxiv.org/abs/1512.05463v2 (accessed on 15 July 2019).
- El-Ganainy, N.O.; Balasingham, I.; Halvorsen, P.S.; Rosseland, L.A. On the Performance of Hierarchical Temporal Memory Predictions of Medical Streams in Real Time. In Proceedings of the 2019 13th International Symposium on Medical Information and Communication Technology (ISMICT), Oslo, Norway, 8–10 May 2019; pp. 1–6.
- Li, T.; Wang, B.; Shang, F.; Tian, J.; Cao, K. Online sequential attack detection for ADS-B data based on hierarchical temporal memory. Comput. Secur. 2019, 87, 101599.
- Wang, C.; Zhao, Z.; Gong, L.; Zhu, L.; Liu, Z.; Cheng, X. A Distributed Anomaly Detection System for In-Vehicle Network Using HTM. IEEE Access 2018, 6, 9091–9098.
- Shah, D.; Ghate, P.; Paranjape, M.; Kumar, A. Application of hierarchical temporal memory theory for document categorization. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–6.
- Shen, J.; Loew, M. Hierarchical temporal and spatial memory for gait pattern recognition. In Proceedings of the IEEE Computer Society, IEEE Applied Imagery Pattern Recognition Workshop, Washington, DC, USA, 18–20 October 2016; pp. 1–9.
- Hawkins, J.; Ahmad, S.; Dubinsky, D. Hierarchical Temporal Memory Including HTM Cortical Learning Algorithms. September 2011. Available online: https://numenta.com/assets/pdf/whitepapers/hierarchical-temporal-memory-cortical-learning-algorithm-0.2.1-en.pdf (accessed on 11 July 2019).
- Wu, J.; Zeng, W.; Chen, Z.; Tang, X. Hierarchical Temporal Memory Method for Time-Series-Based Anomaly Detection. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), Barcelona, Spain, 12–15 December 2016; pp. 1167–1172.
- Ahmad, S.; Lavin, A.; Purdy, S.; Agha, Z. Unsupervised real-time anomaly detection for streaming data. Neurocomputing 2017, 262, 134–147.
- Purdy, S. Encoding Data for HTM Systems. February 2016. Available online: https://arxiv.org/abs/1602.05925 (accessed on 14 July 2019).
- Cui, Y.; Ahmad, S.; Hawkins, J. The HTM spatial pooler–a neocortical algorithm for online sparse distributed coding. Front. Comput. Neurosci. 2017, 11, 111.
- Zhang, K.; Zhao, F.; Luo, S.; Xin, Y.; Zhu, H. An intrusion action-based IDS alert correlation analysis and prediction framework. IEEE Access 2019, 7, 150540–150551.
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), Funchal, Madeira, Portugal, 22–24 January 2018; Volume 1, pp. 108–116.
Parameters | Values | Notes |
---|---|---|
Ls | 200 | Action queue length |
Lc | 2000 | Alert cache |
μ | 0.1 | Similarity factor |
s0 | 0.6 | Similarity threshold |
ω | 0.6 | Anomaly score threshold |
N | 500 | Encoder bit array size |
W | 21 | Encoder bucket width |
Cl | 2048 | Number of columns in SP |
ce | 10 | Number of cells per column |
α | 0.3 | Smooth factor |
others | - | Other parameters are default |
Names | Values |
---|---|
Total raw alerts | 64,000 |
Total alerts used to construct the sessions | 2040 (3.2% of total alerts) |
Alert groups | 769 |
Extracted actions (repeated) | 5876 |
Action classes (after clustering) | 39 |
Extracted sessions | 780 |
Sessions with only one action | 702 (90% of total sessions) |
Sessions with more than two actions | 28 (3.5% of total sessions) |
Session refining rate 1 | 82% |
Sessions refined | 38 (49% of sessions with more than one action) |
Scenarios | Actions | Alerts |
---|---|---|
Infiltration | A | protocol-snmp request tcp |
Infiltration | B | protocol-snmp agentX/tcp request |
Infiltration | C | protocol-icmp ping; protocol-icmp ping undefined code; protocol-icmp echo reply undefined code |
Infiltration | D | policy-other tcp packet with urgent flag attempt; server-other winnuke attack |
Infiltration | E | policy-other tcp packet with urgent flag attempt |
Infiltration | F | server-webapp robots.txt access |
Infiltration | G | policy-other ftp anonymous login attempt |
Infiltration | H | app-detect failed ftp login attempt |
Infiltration | I | openssh maxstartup threshold connection exhaustion denial of service attempt |
Infiltration | J | indicator-scan ssh brute force login attempt |
Infiltration | K | server-other realnetworks helix server ntlm authentication heap overflow attempt |
Web attack | L | server-webapp password sent via post parameter |
Web attack | M | server-webapp flexense diskpulse disk change monitor login buffer overflow attempt |
Web attack | N | policy-other script tag in uri - likely cross-site scripting attempt |
Web attack | O | sql 1 = 1 - possible sql injection attempt |
Intrusion Patterns | Input Action | System Prediction |
---|---|---|
1-2-3-4-5-6-7-8-9-10 | 1 | 2-3-4-5-6-7-8-9-10 |
11-12-13-14-11-12-13-14-15-16 | 12 | 13-14-11(13-14-15-16) |
17-18-19-20-24-28-29 | 17 | 18-19-20-24-28-29 |
21-22-23-24-28-29-30 | 22 | 23-24-28-29-30 |
25-26-27-28-29-30-21-24 | 26 | 27-28-29-30-21-24 |
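
The prediction behavior in the table above (an input action followed by the predicted remainder of the pattern) can be approximated by a walk over a matrix of correlation strengths. The sketch below is an assumption-laden illustration: the matrix is filled with plain transition counts as a stand-in for the strengths derived from HTM predictions and anomaly scores, and the names `build_matrix`, `predict_path`, and `min_strength` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Matrix = Dict[int, Dict[int, float]]


def build_matrix(sessions: Sequence[Sequence[int]]) -> Matrix:
    """Count observed transitions (stand-in for HTM-derived strengths)."""
    matrix: Matrix = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        for cur, nxt in zip(session, session[1:]):
            matrix[cur][nxt] += 1.0
    return matrix


def predict_path(matrix: Matrix, start: int, min_strength: float = 1.0) -> List[int]:
    """Follow the strongest unvisited successor until none qualifies."""
    path: List[int] = []
    visited = {start}
    current = start
    while True:
        successors = matrix.get(current, {})
        candidates = {action: strength
                      for action, strength in successors.items()
                      if strength >= min_strength and action not in visited}
        if not candidates:
            return path
        current = max(candidates, key=candidates.get)
        visited.add(current)
        path.append(current)


pattern = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # first pattern in the table
print(predict_path(build_matrix([pattern]), 1))
# prints [2, 3, 4, 5, 6, 7, 8, 9, 10]
```

This first-order walk uses only the current action, so it cannot disambiguate the repeated 11-12-13-14 sub-sequence in the second row of the table; that case relies on the sequence context the HTM keeps.
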
Intrusion Patterns | Errors | Accuracy 1 |
---|---|---|
1-2-3-4-5-6-7-8-9-10 | (0,0,0,0,0,0,0,1,3,4) | 92% |
11-12-13-14-11-12-13-14-15-16 | (0,0,0,0,0,1,0,0,0,2) | 92.5% |
17-18-19-20-24-28-29 | (0,0,1,0,0,0,1,0,2,3) | 89% |
21-22-23-24-28-29-30 | (0,0,0,0,0,0,0,0,1,4) | 90% |
25-26-27-28-29-30-21-24 | (0,0,0,0,0,0,0,1,0,3) | 93% |
Sessions | Sessions | Sessions |
---|---|---|
1-2-3-4-5-6-7 | 2-1-3-4-5-6-7-14 | 23-24-25 |
8-9-10-3 | 7-8-18-17-19-20-21 | 3-8-9-10-11-26 |
8-10-3-9 | 8-9-3-10-11 | 1-2-3-4-5-7-9-10 |
1-2-4-5-14 | 2-1-3-4-5-6-14-7 | 1-2-27-13-6-14 |
10-11-12-8-9 | 2-1-3-4-5-14-7-15-16-17-18-19-20 | 28-25-29 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhang, K.; Zhao, F.; Luo, S.; Xin, Y.; Zhu, H.; Chen, Y. Online Intrusion Scenario Discovery and Prediction Based on Hierarchical Temporal Memory (HTM). Appl. Sci. 2020, 10, 2596. https://doi.org/10.3390/app10072596