1. Introduction
Digital services such as banking, healthcare, education, entertainment, and national and local administration drive modern society, in which access to online services is often taken for granted. These services have become everyday routines for almost everyone; many of us check our official emails and social services first thing in the morning. This dependency of day-to-day activities on online services has attracted a large number of attacks on network services. Recent developments in software, network, and system exploits and vulnerability tools have opened new attack vectors for compromising access to an entire network or subnetwork, even though network defenders deploy up-to-date and sophisticated defence systems. Compared with conventional host- or service-based attacks, Distributed Denial of Service (DDoS) attacks are considered more disruptive in nature. These attacks make targeted services unavailable by sending a very large number of malicious access requests to a service provider. Once its resources are depleted, the service provider is unable to serve its legitimate users. Nowadays, DDoS is a commonly used attack method that inflicts heavy financial and reputational losses [1].
With the advancement of virtualization-based computing, Software Defined Networking (SDN) has been widely adopted as a security solution in various services and service provisioning models [2,3,4]. In SDN, most routing and topological decisions are carried out by a separate entity called the control-plane [5]. This decoupling has brought enormous benefits to network management and provides a feasible and effective way to improve network efficiency [6]. Furthermore, the separation of data-plane and control-plane helps to manage a flexible and scalable networking infrastructure that meets ever-changing business needs. Although the logically centralised architecture and its programmability enable the SDN controller to detect malicious activities, the controller itself becomes vulnerable to DDoS attackers [7].
Most forwarding decisions are managed by the controller. The flow table in an OpenFlow-based switch is searched for every newly arriving packet; on a successful match, the corresponding flow action is performed. If the packet does not match any entry, it is forwarded to the SDN control-plane as a packet_in message for detailed analysis. During a DDoS attack, if the arrival rate of packet_in messages is significantly higher than normal, control-plane resources start to deplete, which disrupts communication with the data-plane and may overwhelm the controller. A single point of failure, such as a controller overwhelmed with malicious traffic, could bring down the whole networking infrastructure [8].
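The table-miss behaviour described above can be illustrated with a toy model. The following is a minimal sketch, not OpenFlow or Ryu code: the `FlowSwitch` class, its exact-match key (source and destination IP), and the `packet_in_count` counter are hypothetical names chosen for illustration, and the addresses are made up. It shows how traffic in which every packet belongs to a previously unseen flow drives up the number of packet_in events that reach the controller.

```python
from collections import namedtuple

Packet = namedtuple("Packet", ["src_ip", "dst_ip"])


class FlowSwitch:
    """Toy switch with an exact-match flow table keyed on (src_ip, dst_ip)."""

    def __init__(self):
        self.flow_table = {}       # match key -> action
        self.packet_in_count = 0   # table misses handed to the controller

    def handle(self, pkt):
        key = (pkt.src_ip, pkt.dst_ip)
        if key in self.flow_table:
            return self.flow_table[key]   # match: apply the stored action
        # Table miss: the controller is consulted (packet_in) and a rule is
        # installed so that later packets of this flow match locally.
        self.packet_in_count += 1
        self.flow_table[key] = "forward"
        return "packet_in"


def count_packet_ins(packets):
    """Run the packets through a fresh switch and count table misses."""
    switch = FlowSwitch()
    for pkt in packets:
        switch.handle(pkt)
    return switch.packet_in_count


# Benign pattern (illustrative): 3 clients send 100 packets each to one server.
benign = [Packet(f"10.0.0.{c}", "10.0.0.100")
          for c in range(1, 4) for _ in range(100)]
# Flood pattern (illustrative): every packet carries a different source address.
flood = [Packet(f"172.16.{i // 256}.{i % 256}", "10.0.0.100")
         for i in range(300)]

print(count_packet_ins(benign))  # 3 table misses for 300 packets
print(count_packet_ins(flood))   # 300 table misses for 300 packets
```

With the same number of packets, the flood pattern produces one controller request per packet, which is the load amplification that makes the controller a single point of failure.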
Amongst existing security problems in SDN, the DDoS attack is considered one of the most urgent and hardest [9]. DDoS attack detection in SDN is well researched, with approaches including [10,11,12,13,14,15,16]. However, most of these conventional studies focus on attack detection and mitigation methods. The majority of the work is based on time-based periodic detection, where choosing the right time period is very hard. If a large detection period is selected, the response time for detecting an attack increases, which places an extremely large attack load on the deployed switches and the SDN controller. In contrast, if the period is set to a relatively low value, the deployed attack detection module runs almost continuously, which unnecessarily consumes controller resources such as CPU and upstream and downstream network bandwidth, and reduces controller efficiency.
Congestion at the controller is one of the major issues that can easily degrade the performance of a deployed mechanism and leave the entire infrastructure vulnerable, especially to DDoS attacks. Currently, most research does not focus on improving the accuracy of the controller in SDN, even though most detection modules run on the SDN controller. It is therefore necessary to improve SDN controller efficiency using the available characteristics of SDN. To address these issues, we propose a new anti-DDoS detection mechanism with entropy-based feature distribution in SDN. Our detection model comprises Snort alert based features, entropy calculation, feature distribution and traffic processing, and machine learning classifiers. Our contributions are summarised below:
An effective anti-DDoS detection mechanism is proposed to improve the accuracy of the main SDN controller unit, so that a deep learning model can readily distinguish benign traffic from unknown malicious traffic.
We distribute specific traffic features using generalised entropy estimation based on the Shannon and Rényi formulas.
By utilising a Snort–Ryu implementation with entropy calculation, we acquire non-redundant traffic features.
As detection classifiers, we utilise well-known deep learning classifiers, namely the Stacked Auto Encoder (SAE) and the Convolutional Neural Network (CNN), and compare their accuracy and false-positive alerts at 30% and 60% attack rates mixed with normal traffic.
The rest of the paper is organised as follows: Section 2 and Section 3 present the related work and the background in detail, respectively. Our novel entropy-based DDoS detection model is described in Section 4. The experimental results are evaluated in Section 5. The paper is concluded, along with future directions, in Section 6.
2. Related Work
In previous decades, most security research was performed on legacy networks [17,18,19]. Different approaches have been implemented to detect and mitigate DDoS traffic in traditional networks only [20,21,22,23]. To meet the needs of digital services, traditional networks have been found to be computationally expensive and time-consuming, and they require more modern innovations for security implementation. OpenFlow-enabled SDN infrastructure has proved successful against various security challenges [14]. Although SDN provides a feasible and modern platform, the control-plane layer of SDN has not been extensively researched from a security point of view. Its programmable model and logically centralised implementation bring new vulnerabilities and threats, making the SDN control-plane layer an attractive target for potential intruders. In particular, massive DDoS attacks on the control-plane of an SDN-based platform can result in the unavailability of the entire network [24].
One of the main challenges of network security and machine learning is to distribute and select optimal features [25]. Feature selection aims to pick a feature subset that performs best under a given assessment criterion [26]. According to the analysis of [25,27], feature selection approaches can be classified from a technical viewpoint. Existing researchers have also focused on collecting and classifying the ideal features for optimal performance by adaptively validating their proposed models with multiple combinations of features as inputs.
To assess regular traffic flow, Mehdi et al. used maximum entropy estimation to address security challenges in SDN [11]. Experimental investigations were carried out using OpenFlow switches with a NOX controller and low-rate data traffic. Their main goal was to classify attack traffic in a home setting. In other research, investigators used an entropy framework to predict the transmission of worms and threats through port scanning [13]. Besides this, Ref. [10] suggested an entropy-based anomaly detection technique in which the identification module operated in the edge switches to lower the overhead of the control plane. Kotani et al. suggested a packet filtering strategy to secure the controller [28]. This strategy inspected the elements of the packet headers before the packet-in event was forwarded. However, the strategy is ineffective when the attacker keeps producing new flows whose header fields differ from the filtered values. Dong et al. developed a system for detecting the vulnerable applications to which attackers were connected [29]. Typically, a threshold was set during feature extraction, and any irregular variance in the incoming traffic feature vectors was classified as a threat [30]. Mohammadi et al. suggested a prevention strategy against the TCP SYN DDoS attack targeting SDN, using SDN’s programmability [31] for identification purposes. The system, however, remained vulnerable to attacks using several other protocols. Compared with other attacks, DDoS attacks can cause significant disruption to any sort of networking infrastructure [32].
Entropy is a common way of producing valuable traffic classification features and has been used extensively in recent frameworks for DDoS attack detection [33]. Entropy is an analysis technique that measures the uncertainty of content. In network traffic analysis, entropy provides a single-valued metric that captures distributional variations in traffic [34]. It has been increasingly recognised that adequate analysis of such variations can identify network anomalies [35]. A recent study showed that entropy-based detection has better detection efficiency than many other approaches [36]. Entropy-based feature classification techniques are feasible and widely used [37], and offer significant advantages such as effective and fast calculation, fewer false alerts, and higher detection accuracy. In particular, entropy calculations are applied to input traffic attributes such as the source and destination IP addresses and the source and destination ports. For example, a high entropy value for the source address indicates that traffic is widely dispersed across sources, whereas a low entropy value indicates that traffic is concentrated on a few sources. This is valuable for the detection system, as a typical DDoS attack with many attack sources and a single target has a high variation of the source address and a low variation of the destination address relative to regular traffic.
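The effect described above can be made concrete with a small calculation. The sketch below computes the Shannon entropy (in bits) of the source and destination addresses seen in one window of packets. The function name `shannon_entropy` and the two eight-packet windows are illustrative assumptions, not data from the experiments reported here.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    if len(counts) <= 1:
        return 0.0  # empty window or a single distinct value: no uncertainty
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative benign window: 4 clients, 2 servers, evenly used.
benign_src = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 2
benign_dst = ["10.0.0.10", "10.0.0.11"] * 4

# Illustrative attack window: 8 distinct sources, a single target.
attack_src = [f"172.16.0.{i}" for i in range(8)]
attack_dst = ["10.0.0.10"] * 8

print(shannon_entropy(benign_src), shannon_entropy(benign_dst))  # 2.0 1.0
print(shannon_entropy(attack_src), shannon_entropy(attack_dst))  # 3.0 0.0
```

In the attack window the source-address entropy rises while the destination-address entropy collapses to zero, which is the many-sources, single-target signature that entropy-based detectors look for.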
Identifying malicious DDoS traffic at the data-plane layer is difficult because OpenFlow-enabled devices have no self-adaptive intelligence to segregate network traffic flows. In addition, attackers use easily available tools and hardware assets [38]. This section presents a systematic review of DDoS detection solutions that are widely deployed in the SDN control-plane, as listed in Table 1. Most existing approaches evaluate DDoS detection techniques by classifying packet traffic as either legitimate or malicious, and can be broadly categorised into entropy-based anomaly detection, signature-based, machine learning-based, and hybrid detection. These approaches are deployed in SDN infrastructure to detect DDoS traffic.
Some authors utilised entropy-based statistical techniques to analyse traffic [10,44,45]. The authors of [42,43] proposed an entropy-based technique to detect DDoS at the POX controller during the initial attack stage. This work has a limitation: once the number of hosts increases, the model generates false-positive alerts. In [44], the computational overhead on the controller is reduced by deploying a fast-entropy approach with a flow-based model. The authors in [39] proposed a scheduling-based method to detect DDoS, where a single processing queue is divided into k logical queues, each of which belongs to a network switch. During heavy traffic bursts, the SDN controller utilises the logical queues to satisfy scheduling requests. The authors of [11] utilised a maximum entropy estimation technique for classifying the normal traffic distribution to address home and office network security concerns in SDN. Most of the experiments were deployed with OpenFlow-enabled switches and the NOX SDN controller. In [13], the authors used entropy methodologies to identify port-scan attacks and worm propagation. Another entropy-based anomaly detection solution was proposed by the authors of [10] to detect DDoS attacks in SDN; this work focused mainly on reducing the control-plane workload.
Other well-known methodologies have been published to detect DDoS traffic in SDN-based architectures, such as Self-Organising Maps (SOM), a Machine Learning (ML) based approach to detect malicious traffic [12]. This work uses only six features to classify malicious attack traffic, e.g., Average Packet per flow (APf), Average Duration per flow (ADf), and Average Byte per flow (ABf). Similarly, Refs. [41,46] utilised different ML-based approaches to classify traffic patterns. The authors in [16] proposed an adaptive flow collection based DDoS detection model in SDN. This methodology utilises the OpenSketch traffic measurement tool to create a hash table for measuring traffic, and uses a three-stage pipeline to gather traffic samples for identifying malicious traffic instead of traffic flow sampling. In SDN, most DDoS detection solutions combine ML and knowledge-based techniques to identify malicious attacks. Generally, ML-based techniques classify attack flows based on specific features. ML-based anomaly detection models are mainly suitable for small networks; in larger networks with heavy traffic, collecting and analysing traffic inside the controller becomes unmanageable [13]. During an attack, response time is important for detection performance; however, ML performance also depends on the training datasets and the diversity of their features.
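Per-flow features of the kind named above (APf, ABf, ADf) are simple aggregates over the flow entries collected from a switch. The sketch below is a minimal illustration that assumes a plain arithmetic mean and a hypothetical record layout with `packets`, `bytes`, and `duration` fields; the cited work may define or compute these features differently, and the three flow records are made up.

```python
def flow_features(flows):
    """Average packets, bytes and duration per flow for a list of flow records.

    Each record is a dict with 'packets', 'bytes' and 'duration' (seconds).
    """
    n = len(flows)
    if n == 0:
        return {"APf": 0.0, "ABf": 0.0, "ADf": 0.0}
    return {
        "APf": sum(f["packets"] for f in flows) / n,
        "ABf": sum(f["bytes"] for f in flows) / n,
        "ADf": sum(f["duration"] for f in flows) / n,
    }


# Illustrative flow records (not measured data).
flows = [
    {"packets": 10, "bytes": 1000, "duration": 2.0},
    {"packets": 2, "bytes": 120, "duration": 0.5},
    {"packets": 3, "bytes": 180, "duration": 0.5},
]
features = flow_features(flows)
print(features["APf"], features["ADf"])  # 5.0 1.0
```

Flooding traffic that consists of many short, small flows tends to pull all three averages down, which is why such aggregates are useful classifier inputs.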
Recently, ML-based detection techniques have been widely applied in SDN to address the challenges of DDoS detection. The authors in [42] proposed a semi-supervised one-class Support Vector Machine (SVM) to classify anomalies, in which only a small quantity of malicious traffic is used compared with normal traffic. This model is capable of detecting outliers from the initial background traffic phase, which helps to handle the majority of the traffic characteristics. The authors used the Stacked Auto Encoder (SAE) to train on the datasets; however, it consumed a lot of time to process the model iterations. Similarly, the authors of [43] proposed a high-precision DDoS detection model for SDN based on the XGBoost classifier. The proposed approach analysed most DDoS attacks to provide a feasible and effective solution. Through the POX controller’s grab bag connection, TCP, UDP, and ICMP flooding attacks were sent to manipulate connection records, which enabled the authors to evaluate the DDoS classifiers.
According to [47], a packet_in filtering approach can protect the control-plane. This technique records the contents extracted from the packet header fields prior to sending the packet_in message. However, when intruders launch highly distinctive flows in which all packets have field values different from those recorded by the technique, it fails to capture the malicious records. The authors in [29] deployed a detection model for locating the compromised interfaces that are used by attackers during an attack. Most anomaly detection models use fixed threshold values; once the incoming statistical features deviate abnormally, the traffic is identified as an attack.
From the literature survey, it can be seen that some research has been carried out on detecting DDoS attacks by utilising traffic feature distribution with entropy-based methodologies. By applying entropy-based feature distribution on top of existing detection techniques, we can primarily reduce the overhead of processing redundant and unnecessary features and detect attacks in relatively less time.
Convolutional neural networks (CNN) [48] are known as enhancements of conventional feed-forward networks (FFNs). They were initially tested for object recognition using 2D convolution layers, 2D pooling layers, and a fully connected layer. This was followed by natural language analysis using 1D convolution layers, 1D pooling layers, and a fully connected layer [48]. Whereas conventional 2D CNNs are used mostly for image analysis, 1D CNNs can be used effectively for time series processing, since the features of a 1D time series can be effectively derived by convolutions [49]. In our proposed study, we utilise the 1D CNN as a deep learning classifier to identify security threats in complex multivariate and distributed features based on entropy estimation.
In the CNN, convolution is the primary building block, and the entropy-based input features are converted into a 1D time-series input vector. All distributed features based on entropy calculation are fed to the fully connected layer of the CNN, which comprises the soft-max function and produces a probability distribution over the input feature vectors one by one. The CNN layer with the fully connected soft-max function is given in Equation (1).
where
utilises the highest feature value connected to each input vector of
, and
is used for non linear activation function.
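To make the data flow of such a 1D CNN concrete, the following is a minimal forward-pass sketch in plain Python. It is not the trained model used in this work: the ReLU activation, global max pooling, the two hand-written filters, and the identity output weights are all assumptions chosen for illustration, and the input series is made up.

```python
import math


def conv1d_valid(x, kernel):
    """Valid (no padding) 1D cross-correlation of series x with a kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def softmax(z):
    """Numerically stable soft-max over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def cnn1d_forward(x, kernels, weights, biases):
    """Convolution -> ReLU -> global max pooling -> dense layer -> soft-max."""
    pooled = []
    for kernel in kernels:
        fmap = [max(v, 0.0) for v in conv1d_valid(x, kernel)]  # ReLU
        pooled.append(max(fmap))          # keep the highest feature value
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Illustrative entropy series and untrained, hand-picked parameters.
x = [0.9, 0.8, 0.85, 0.4, 0.35, 0.3, 0.8, 0.9]
kernels = [[1.0, -1.0], [-1.0, 1.0]]      # respond to drops and to rises
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]

probs = cnn1d_forward(x, kernels, weights, biases)
print(probs)
```

The output is a probability vector with one entry per class; in a real model the filters and dense weights would be learned from labelled traffic rather than set by hand.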
The SAE consists of several auto-encoders, each with an input (visible) layer, a hidden layer, and an output layer, also called the reconstruction layer. The input data is loaded into the visible layer, and the reconstruction layer produces the output. The SAE architecture differs in design from the CNN, DBN, and RBM deep learning models. First, the SAE has a basic and straightforward structure and is trained in a much shorter time than the other Deep Neural Network (DNN) algorithms mentioned [50]. Second, because of its unsupervised learning strategy, the SAE does not use labelled datasets. On the other hand, CNN is based on supervised learning, while DBN and RBM use supervised learning. Finally, the SAE algorithm employs outputs as inputs, and detailed feature components can be retrieved with a suitable training strategy. This paper uses the comprehensive features of an SAE-based method to increase the rate of identification of DDoS attacks in SDN. As a DNN, the SAE uses sparse auto-encoders and soft-max classifiers to extract features from, and label, unlabelled data.
In Equation (
2),
values denotes sparsity penalty for the weight coefficient via Kullback-Leibler divergence. This divergence function enables to input features vector such as
to process if there is a possibility of lower average activation function, when
then this function comprises the minimum values of 0. During training stages of input layer values,
is utilised for an average activation with
Jth values and
is used for sparsity coefficient at hidden layer.
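The behaviour of the Kullback-Leibler sparsity penalty can be checked numerically. The sketch below uses the form commonly given for sparse auto-encoders, KL(rho || rho_hat_j) = rho log(rho / rho_hat_j) + (1 - rho) log((1 - rho) / (1 - rho_hat_j)), summed over the hidden units. The names `rho` (target sparsity), `rho_hat` (average activation of a hidden unit), and `beta` (penalty weight) follow that common notation and are assumptions; they are not necessarily the symbols of Equation (2).

```python
import math


def kl_sparsity(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat).

    Both arguments must lie strictly between 0 and 1.
    """
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))


def sparsity_penalty(rho, mean_activations, beta=1.0):
    """beta times the sum of KL(rho || rho_hat_j) over all hidden units j."""
    return beta * sum(kl_sparsity(rho, r) for r in mean_activations)


print(kl_sparsity(0.05, 0.05))   # 0.0: the penalty vanishes when rho_hat == rho
print(kl_sparsity(0.05, 0.20))   # positive: the unit is more active than desired
print(sparsity_penalty(0.05, [0.05, 0.20, 0.50], beta=3.0))
```

The penalty reaches its minimum of 0 exactly when the average activation equals the target sparsity and grows as the two diverge, which is the property the paragraph above describes.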
4. Experimental Setup
Our experiment is conducted with three virtual machines on a workstation with an Intel Xeon X5560 CPU (2.88 GHz) and 16 GB RAM (DDR3 ECC-Registered, PC3-12800). We run TensorFlow 1.4 and Mininet on the Ubuntu 16.04 LTS 64-bit operating system. The proposed functionality is illustrated in Figure 2. We use VMware Player to create the virtual machines: VM1 with IP address 192.168.202.x1, VM2 with IP address 192.168.202.x2, and VM3 with IP address 192.168.202.x3.
In our proposed model, a Python script is used to generate attack and benign traffic with the Scapy tool. We also generated benign traffic through normal web searches, browsing, and video streaming to validate the benign traffic produced by Scapy. To evaluate performance, we limited the bandwidth of our model to 50 MB per 5 min interval, in which TCP accounted for 9 MB, UDP for 32 MB, and ICMP for nearly 2 MB. During attacks, the TCP, UDP, and ICMP rates ranged from a minimum of 380 Kbit/s to a maximum of 700 Kbit/s.
In VM1, Mininet is used as the network emulator to create 6 hosts with 6 OpenFlow switches. The kernel namespace properties of Mininet make it possible to prototype the overall network environment within a single workstation. In Mininet, each process has its own network interfaces and routing table, which allows all network elements to be virtualised in the kernel. We connect these switches to the Ryu controller using OpenFlow (OF) version 1.3.
In VM2, the Ryu controller is implemented with Snort-IDS as the Network Intrusion Detection System (NIDS) and the entropy algorithm presented in Section 3. This VM plays a vital role in collecting network traffic and then applying entropy probabilities for feature distribution. Snort-IDS records every incoming_packet in the Barnyard2 log file, and the Ryu-based GE and ID entropy estimation then removes redundant features from all network traces. Our detection model relies on more specific features, which are collected with GE and ID feature distribution for the deep learning classifiers. The SDN controller is deployed in VM2 and centrally handles all virtual machines of our test-bed. Network policies are installed via REST APIs. In our system, VM1 serves as the data-plane and VM2 as the control-plane. The ovs-ofctl utility is used to insert network policies into the switches' flow tables, and it is also utilised for monitoring and administration between the data-plane and the control-plane.
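The GE formula is not restated at this point. Assuming that GE denotes the generalised (Rényi) entropy named in the Introduction, the sketch below shows how it can be computed for one traffic attribute; the order parameter `alpha`, the function name, and the sample values are illustrative assumptions, and the ID metric is not covered here.

```python
import math
from collections import Counter


def renyi_entropy(values, alpha):
    """Renyi entropy of order alpha (in bits) of the empirical distribution.

    alpha must be non-negative; alpha == 1 is treated as the Shannon limit.
    """
    if not values:
        return 0.0
    counts = Counter(values)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    if alpha == 1:
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1 - alpha)


uniform = ["a", "b", "c", "d"]            # evenly spread attribute values
skewed = ["a"] * 6 + ["b", "c"]           # one dominant value

print(renyi_entropy(uniform, 2))          # 2.0 for a uniform 4-value distribution
print(renyi_entropy(skewed, 1), renyi_entropy(skewed, 2))
```

For a uniform distribution every order gives the same value, whereas for a skewed distribution higher orders weight the dominant value more heavily and yield a lower entropy, which makes the order a tunable sensitivity parameter.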
In VM2, Ryu uses two network interfaces: eth0 in promiscuous mode, to collect all OpenFlow traffic traces from VM1 with Snort, and eth1 as a port mirror for entropy calculation on Snort-Barnyard2 packets. Snort as the NIDS plays a vital role in acquiring all raw network traffic from our proposed model. A Snort switch application is implemented on top of the Ryu controller; it supports the Layer 2 (L2) switch code and also redirects relevant traffic by using OpenFlow-enabled promiscuous mode. The Ryu controller receives the Snort alerts, which helps to collect network packets and then store them in the Barnyard2 log file. This helps to handle data-plane traffic.
VM3 generates malicious traffic remotely, as illustrated in Figure 2. It utilises the Scapy, LOIC, and Metasploit DDoS penetration testing frameworks, together with tools such as Network Mapper (Nmap) and the Nessus Vulnerability Scanner. Scapy is considered powerful and practical for launching realistic flooding attacks; our work uses Scapy and LOIC to launch various TCP, UDP, and ICMP traffic. To validate our proposed work, we inject attack and normal traffic into the same VM1 data-plane area, which is directly connected to the main SDN controller with the deployed parameters. A Python script is used to generate attack and benign traffic with Scapy, where we select different hosts and source nodes for each injection. Our work also generates benign traffic through web activities such as web searches, browsing, and video streaming.
GE and ID entropy is applied to all test-bed datasets to acquire a more specific feature set with no redundant feature attributes. These features are provided in Table 3. The Tshark and Tcpreplay tools are utilised to manipulate and analyse benign and malicious traffic individually. Once the malicious traffic traces are classified with GE and ID as discussed in Section 4, we divide the attack and benign CSV files into the training and testing datasets shown in Table 4. We apply unity-based MinMax normalisation, as defined in Equation (10), so that all values in our CSV datasets lie between 0 and 1. After normalisation, we utilise the SAE and CNN deep neural network models to classify samples as attack or non-attack.
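As a small illustration of the normalisation step, the sketch below assumes the usual unity-based form x' = (x - min) / (max - min) applied per feature column. The function name and the sample column are illustrative; a constant column is mapped to zeros to avoid division by zero, which is a design choice of this sketch rather than something stated above.

```python
def min_max_normalise(column):
    """Scale a list of numbers to the range [0, 1] (unity-based MinMax)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]   # constant column: no spread to scale
    return [(v - lo) / (hi - lo) for v in column]


ttl_values = [2, 4, 6, 10]             # illustrative feature column
print(min_max_normalise(ttl_values))   # [0.0, 0.25, 0.5, 1.0]
```

Scaling every feature to the same range prevents attributes with large raw magnitudes, such as byte counts, from dominating the classifier inputs.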
Performance Evaluation
In this work, Snort IDS is also used to collect all data-plane traffic from the test-bed. Snort runs in two different modes: SM1 to acquire only malicious traffic, and SM2 to acquire benign traffic. These two modes are configured with specific signature rules of the Snort detection engine, as illustrated in Table 2. We process Snort alerts for feature distribution and generalisation by using the GE and ID metrics to analyse and calculate Src-IP, Src-Port, Dest-IP, Dest-Port, Source-Bytes, Destination-Bytes, TTL, Flags, proto, and Distinct Datagrams.
To validate the effectiveness of the detection model with the proposed GE and ID feature distribution on the SDN controller, we use two different scenarios. In the first, we use different attack intensities launched from a single host that is remotely connected to the Mininet data-plane in VM1. We randomly generate attacks using Scapy, LOIC, and Metasploit. The first scenario uses 20%, 30%, and 40% attack rates; the second scenario uses 60% and 70% malicious traffic. In both scenarios, we generate benign traffic through normal web searches, browsing, and video streaming, performed within VM1, where the test-bed data-plane is created with Mininet. We maintain the attack intensity by using the following percentage equation:
In this equation,
represents the attack packets and
represents the total number of packets flowing in our test-bed. We run our code 10 times in each case to set the threshold values. We find that the false-positive (FP) rate decreases, while the false-negative (FN) rate remains stable. Table 5 presents the threshold values for the different attack scenarios.
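The attack intensity described above is the share of attack packets in the total traffic, expressed as a percentage. The sketch below computes that share and, conversely, the number of attack packets needed to reach a target rate for a given amount of benign traffic. The function and parameter names are hypothetical, and the packet counts are illustrative rather than taken from the test-bed.

```python
def attack_rate(attack_packets, total_packets):
    """Attack intensity as a percentage of all packets in the test-bed."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    return 100.0 * attack_packets / total_packets


def attack_packets_for_rate(benign_packets, target_rate):
    """Attack packets to inject so that they form target_rate % of all traffic."""
    if not 0 <= target_rate < 100:
        raise ValueError("target_rate must be in [0, 100)")
    return round(benign_packets * target_rate / (100.0 - target_rate))


print(attack_rate(3000, 10000))             # 30.0
print(attack_packets_for_rate(7000, 30))    # 3000
print(attack_packets_for_rate(4000, 60))    # 6000
```

Note that the denominator is the total traffic, so a 60% attack rate requires one and a half times as many attack packets as benign packets.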
Our model mainly relies on the average output for the input data (estimated entropy values). It uses two thresholds, a lower and an upper one, for all input data collected at different times. Our aim is to classify and distribute specific features from the incoming network stream. The distributed features are processed over two different time slots, which capture deviations beyond the normal range. Our model detects entropy deviations, such as a sudden rise or drop in values compared with predefined threshold values between 0 and 1. For example, a DDoS-related event such as a port scan on a specific location changes the dispersion of the traffic, which is reflected in the entropy. The following steps are carried out to set the threshold values:
We calculate the maximum possible attack traffic value by combining the mean entropy of the attack traffic with its confidence interval.
After taking the difference between these values, we derive the values of the mean and standard deviation.
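The steps above can be sketched as follows. This is one plausible reading, not the exact procedure used in this work: it assumes that the thresholds are the bounds of a confidence interval around the mean entropy of repeated runs, with a normal-approximation multiplier `z = 1.96` (roughly 95%). The function name and the ten sample values are illustrative and are not measurements from the test-bed.

```python
import math
import statistics


def entropy_thresholds(samples, z=1.96):
    """Lower and upper thresholds as mean -/+ z * std / sqrt(n)."""
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)          # sample standard deviation
    half_width = z * std / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


# Illustrative mean-entropy values from 10 hypothetical runs.
runs = [0.72, 0.74, 0.73, 0.71, 0.75, 0.72, 0.74, 0.73, 0.72, 0.74]
lower, upper = entropy_thresholds(runs)
print(round(lower, 3), round(upper, 3))
```

Repeating the experiment more often narrows the interval, since the half-width shrinks with the square root of the number of runs.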
In Figure 4, we use 30% and 40% attack rates, while Figure 5 uses 60% and 70% attack rates. Each point on the horizontal axis represents the window size, and the vertical axis represents the entropy values. In Figure 4, the blue curve represents normal traffic and the orange curve represents malicious traffic. We stabilise the network by injecting manipulated malicious traffic remotely and run Algorithm 1 ten times. The benign traffic entropy values are common to all attack rates, as shown in Figure 4 and Figure 5.
We have considered attack scenarios with a single victim and with multiple victims. In the single-victim case, only one host is under attack; in the multiple-victim case, we launch attacks on 5 hosts. During simulation, deviations and sudden drops in entropy values are used for traffic feature distribution; a rapid drop in the entropy of the flows is taken to indicate malicious activity.
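The two signals mentioned above, a deviation outside the threshold band and a sudden drop between consecutive windows, can be checked with a few lines of code. The sketch below is illustrative only: the function names, the threshold band, the minimum drop of 0.2, and the entropy series are assumptions and do not reproduce Algorithm 1.

```python
def out_of_band(entropies, lower, upper):
    """Indices of windows whose entropy lies outside [lower, upper]."""
    return [i for i, h in enumerate(entropies) if h < lower or h > upper]


def sudden_drops(entropies, min_drop):
    """Indices of windows whose entropy fell by at least min_drop
    relative to the previous window."""
    return [i for i in range(1, len(entropies))
            if entropies[i - 1] - entropies[i] >= min_drop]


# Illustrative per-window entropy values.
series = [0.82, 0.81, 0.83, 0.74, 0.72, 0.80, 0.35, 0.36, 0.81]

print(out_of_band(series, lower=0.75, upper=0.90))  # [3, 4, 6, 7]
print(sudden_drops(series, min_drop=0.2))           # [6]
```

The band check flags both the mild and the severe dips, while the drop check isolates only the window where the entropy collapses abruptly; combining the two separates gradual drift from attack onset.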
From Figure 4, it can be seen that the drop in entropy values is least significant at the 30% and 40% attack rates. With 30% attack traffic, the mean entropy values are 0.77 at window 53, 0.72 at window 57, 0.76 at window 63, and 0.72 at window 71. After every 25 window intervals, the entropy values drop to average values of 0.72 to 0.77. With a 40% attack rate, the mean entropy drops from 0.82 to 0.74 at window 55, and after every 25 consecutive intervals the mean values repeatedly fall to between 0.71 and 0.73.
Moreover, there is a significant change in entropy values at the 60% and 70% attack rates. In Figure 5a, the mean entropy drops to 0.64 after every 25 window intervals. Similarly, in Figure 5b, the mean entropy drops sharply to 0.35 at window 55, and after every 25 consecutive window intervals the mean values consistently fall to 0.35. Compared with the benign traffic mean values, the attack traffic mean values drop by around 0.50, which is the largest drop across all experiments and occurs at the 70% attack rate. This entropy value is far below the threshold values, which makes it straightforward to classify the event as malicious.
Benign traffic is common to all attack experiments and serves as the fixed threshold against which entropy deviations are compared. In Figure 4 and Figure 5, the entropy deviation values are sometimes lower and sometimes higher than the fixed threshold. However, we classify the acquired traffic based on mean values, which are significantly lower than the fixed threshold, as illustrated in Figure 4 and Figure 5.