1. Introduction
Epilepsy is one of the most common neurological disorders, affecting up to one percent of the population worldwide and almost two million people in the United States alone [
1]. Up to
of epilepsy patients experience medically refractory recurrent seizures [
2] that do not respond to anti-seizure medication. In patients presenting with medically intractable seizures, complete surgical resection of the epileptic zone may be curative, offering the best long-term prognosis, with either complete absence of seizures or partial response to surgery with decreased seizure frequency and/or decreased use of anti-epileptic medication.
Presurgical evaluation entails extensive workup, including clinical workup, interictal (between seizures) scalp EEG, ictal (during seizures) video EEG monitoring, and neuropsychological testing; in addition, patients undergo morphologic (MRI, CT) and functional (interictal PET and ictal single photon emission computed tomography (ictal-SPECT)) multimodality imaging [
3]. Usually, patients are offered neurosurgical options if the clinical presentation, ictal-interictal EEG, and imaging features are concordant for localization of the seizure focus. Often, despite extensive presurgical workup and imaging, the data are either discordant or inconclusive; in this large subset of patients, ictal-SPECT, which demonstrates areas of acute ictal hyperperfusion (enhanced perfusion during seizures), is often helpful for localizing seizures and their phases [
3]. Ictal-SPECT imaging is particularly instrumental in patients with non-lesional intractable seizures and in pediatric patients.
Seizures are known to propagate rapidly to the ipsilateral and contralateral cortex, especially in extratemporal foci compared to temporal foci. This propagation is very rapid and often diffuse. Since blood flow follows electrical activity [
4], it is imperative to inject the perfusion tracers as soon as the onset of seizures on EEG and/or video monitoring is observed. Hence, to obtain an accurate ictal-SPECT scan, the elapsed time from seizure onset to tracer injection is critical and must be as short as possible [
5]. The reliability of seizure localization improves significantly as the elapsed period from seizure onset to tracer injection shortens; early radiotracer injection has been considered the most critical factor for seizure localization. Pastor et al. [
6] and Setoain et al. [
7] reported improved seizure localization using automated tracer injection (average of 33 s; range: 19–63 s;
) compared to manual injections (average of 41 s; range: 14–103 s;
) and successful localization of the seizure focus in 21 of the 27 patients (78%) with the automated technique as opposed to 19 of the 29 patients (65%) with the manual technique. Ho et al. [
8] have documented the different cerebral perfusion patterns in temporal lobe seizures during ictal and periictal phases. Delayed injections lead to diffuse/multiple-foci of hyper-perfusion on ictal-SPECT, thus invalidating the procedure.
Automated seizure detection on ictal EEG has been attempted for more than four decades. After preprocessing the EEG signal for noise and artifact removal, different techniques have been used for the detection task, including rule-based wavelet and spectral analysis, artificial neural networks (ANN), and support vector machines (SVM) [
9,
10,
11] (
Table 1). Research in neurostimulation and automated drug delivery systems has further grown this field, and ANN and SVM are emerging as the front runner classifiers in automated systems [
12,
13,
14]. Though the reported detection accuracies of various techniques have been impressive, reaching as high as 90% or more, these results are based on well-defined and cleansed samples and are often obtained off-field in the laboratory [
9]. When deployed in a real-world clinical setting, the accuracies can drop significantly. Currently, neural networks are software emulations and are computationally intensive, so a finite time elapses while the input data streams are processed; this delay is well known to increase exponentially with the volume and complexity of the incoming data streams. It has also been noted in several studies [
15,
16,
17,
18,
19] that individual-based systems perform better than a generalized system because of the significant inter-individual variance of epileptic signals and their generally random nature. When deployed in real-world settings, these systems tend to have far less patient-specific ictal/seizure EEG data than interictal/normal data.
Traditional methods for seizure detection, such as ANN, need large amounts of training data for acceptable performance. Also, it has been shown that ANN requires four-fold more computational power than SVM. Wang et al. [
20] proposed a random forest with grid search optimization. In addition, most studies reporting the classification results with these machine learning (ML) models use a large database, such as the CHB-MIT scalp EEG Database, for training and reporting the model metrics [
10,
21]. Typically, if we take a few EEG sessions for training and aim to perform the SPECT injection in the subsequent few sessions, we would have a substantial amount of normal data but very little seizure data. In our clinical recordings, a session contained 4 h of normal data and 71 s of seizure data on average.
Many adaptive pattern classifiers have been developed to provide high-performance and real-time responses with real-world data. Much recent emphasis has been placed on deep learning, but numerous other classifiers have been developed. These include decision trees, Boltzmann machines, RCE networks, feature-map, LVQ, high-order networks, radial basis function classifiers, and modified nearest neighbor approaches, to name a few. These classifiers provide trade-offs in memory and computation requirements, training complexity, and ease of implementation and adaptation. For instance, several studies have demonstrated that k-NN classifiers, which train rapidly but require large amounts of memory and computation, sometimes perform as well as back-propagation classifiers, which are more complex to train but require less memory. Decision trees, which have small memory and computation requirements, often perform as well as more complex back-propagation classifiers but are more prone to over-fitting. Radial basis function classifiers require intermediate amounts of memory and training time. RCE networks require less memory than k-nearest neighbor classifiers and adapt their structure over time using simple adaptation rules that recruit new nodes to match the complexity of the classifier to that of the training data. It was reported, in [
22], that RCE networks adapt faster and require fewer exemplar nodes than the nearest neighbor classifiers as more nodes, if needed, are recruited to generate more complex decision regions, and the size of hyper-spheres formed by existing nodes is modified during adaptation. It has been demonstrated, both theoretically and experimentally, that RCE forms complex decision regions rapidly. They can be trained to solve many problems more than an order of magnitude faster than back-propagation classifiers. RCE networks are currently being applied to many real-world problems for real-time execution, due to their fast learning and the absence of local minima.
Recent advances in machine learning science and deep learning techniques have shown their superiority for learning very robust seizure representation features. For example, artificial neural networks (ANNs) were used to detect seizures after using traditional feature extraction techniques. Some researchers have used semi-supervised deep learning strategies for epileptic EEG classification. The most widely used method involves training a neural network in an unsupervised way using unlabeled data and then training it again in a supervised way using labeled data.
Several deep learning-based systems have been proposed to address the limitation of the classification schemes mentioned above [
23,
24,
25,
26]. For instance, Abdelhameed et al. [
23] proposed a 2D supervised deep convolutional autoencoder (SDCAE) to automatically detect epileptic seizures in multichannel EEG signal recordings. They showed that deep learning could achieve 98% detection accuracy with high sensitivity. The computational training and testing times of these models were not reported. Although deep learning approaches seem attractive, they require a sizeable database, which is not always available. Furthermore, deep learning requires specific hardware for faster training, and building large comprehensive datasets is tedious and expensive. Additionally, the availability of the large volumes of continuous EEG recordings required by deep learning algorithms remains a significant limitation. Finally, substantial labor may be required to determine the optimal network structure for a deep neural network. To the best of the authors’ knowledge, few to no studies have examined the use of machine learning for automatic seizure detection with experimental implementation on hardware. Hardware implementation was chosen over software implementation because dedicated hardware provides real-time and faster processing compared with general software [
27].
We identified k-Nearest Neighbors (k-NN) and Reduced Coulomb Energy (RCE) networks for this task [
28]. Wang et al. [
11,
29] reported high accuracies using k-NN and SVM, respectively. Shoka et al. [
30] developed an automatic seizure diagnosis system based on channel selection. Shoka et al. tested several machine learning techniques such as SVM, ensemble decision trees, k-NN, LDA, logistic regression, decision trees, and Naive Bayes. These algorithms showed 80% accuracy on unfiltered data. They also showed that filtering the data improved detection by 1% to 2%. Rivero et al. [
31] also reported high accuracy using k-NN. The choice of these algorithms was also motivated by the commercial availability of specialized hardware tailored for implementing these algorithms. Based on neuromorphic architecture [
32], this hardware has been engineered to improve the accuracy of pattern recognition and, more importantly, decrease the elapsed time between signal input and the output of results, and has been used recently by many researchers [
33]. We use the NeuroStack [
34] board from General Vision (Petaluma, CA, USA) for our application; it has multiple neuromorphic chips and allows multiple boards to be daisy-chained, significantly increasing its capacity for pattern learning. The NeuroStack has an onboard FPGA for digital signal processing operations. The FPGA has a parallel architecture with multiple processing elements, which can be used to implement a high-throughput map-reduce framework to speed up the preprocessing operations on multiple EEG channels.
The major contributions of this study are summarized as follows:
We developed a clinical dataset that consists of 205 recordings with an average of 7 h and 35 min of normal brain activity and 5 min 11 s of seizure activity. The 205 EEG recordings were collected from 45 patients;
We demonstrated that traditional k-NN and RCE could achieve high seizure identification accuracy with high sensitivity (91.14%) and acceptable specificity (98.77%), achieving performance comparable to support vector machines, ANN, and deep learning. We did not directly compare the proposed technique to deep learning on the same datasets because the hardware used in this study does not support deep learning; instead, we drew on results from recent studies and surveys. The results show that machine learning can be used when data and computing resources are limited, which is often the case. Another advantage of traditional machine learning over deep learning is that it avoids lengthy labeling tasks;
We investigated several types of features, such as nonlinear features (sample entropy and correlation dimension) and first- and second-order features. We also explored several feature selection methods, such as mutual information-based feature selection, Chi-square score-based feature selection, ANOVA F-value, and Recursive Feature Elimination. We showed that well-engineered features could help machine learning achieve high accuracy while supporting real-time seizure detection. We showed that a latency as small as 3.6 s can be achieved;
In comparing the proposed method with other state-of-the-art machine learning methods, we showed that it is superior to SVM and ANN, the most widely used algorithms in seizure detection. Because of the limited training dataset, we only employed a 4-layer neural network. Increasing the depth of the network requires more training data, which we did not have;
We developed a graphical user interface that helps epileptologists apply their expertise and reduces the time they spend on labeling.
The remaining sections of this paper are outlined as follows:
Section 2 describes data collection, feature extraction, and feature selection as well as the ML methods used for classification.
Section 3 describes the experimental setup, experiments, training, and evaluation metrics. It also describes examples of results as well as their analysis.
Section 4 discusses the results in the context of related and state-of-the-art-techniques.
Section 5 summarizes the main findings of this study and concludes the paper.
3. Experimental Setup
As depicted in
Figure 2, the raw EEG data was converted into a series of feature vectors through preprocessing and feature extraction. In the preprocessing stage, the raw EEG data was windowed into 5 s long epochs with no overlap between successive epochs. In the feature extraction stage, CD and SampEn were calculated on the preprocessed data per epoch for each of the 21 channels. The remaining features were computed from the DWD coefficients. For each session, we had a set of seizure vectors and a set of normal vectors. Since there was an overwhelming amount of normal data, we randomly chose a subset of normal vectors three times the size of the seizure set for our experiments. Since we had hundreds of hours of data and 21 channels per epoch, we implemented a map-reduce framework using Python’s multiprocessing module to make use of multiple processors, which sped up the feature extraction by a factor of 10. This framework can be translated into hardware for real-time implementation using an FPGA. For our work, the feature vectors were stored in HDF5 format [
73], which optimized memory utilization and execution speed.
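The preprocessing and feature extraction steps described above can be sketched roughly as follows. This is a minimal illustration, not the implementation used in the study: the sampling rate `FS` is an assumed value, the function names are hypothetical, and per-channel mean and standard deviation are used as stand-ins for the CD, SampEn, and coefficient-based features.

```python
import numpy as np

FS = 256          # sampling rate in Hz (assumed; not stated in this section)
EPOCH_SEC = 5     # epoch length taken from the text
N_CHANNELS = 21   # channel count taken from the text


def split_epochs(eeg, fs=FS, epoch_sec=EPOCH_SEC):
    """Cut a (channels, samples) array into non-overlapping epochs.

    Returns shape (n_epochs, channels, samples_per_epoch); trailing
    samples that do not fill a whole epoch are dropped.
    """
    step = fs * epoch_sec
    n_epochs = eeg.shape[1] // step
    trimmed = eeg[:, : n_epochs * step]
    return trimmed.reshape(eeg.shape[0], n_epochs, step).transpose(1, 0, 2)


def epoch_features(epoch):
    """Map step: one (channels, samples) epoch -> one flat feature vector.

    Mean and standard deviation per channel are placeholders for the
    features used in the paper.
    """
    return np.concatenate([epoch.mean(axis=1), epoch.std(axis=1)])


def extract_features(epochs, map_fn=map):
    """Apply the map step to every epoch, then stack the results (reduce).

    Pass `pool.map` from a multiprocessing.Pool as `map_fn` to spread the
    epochs over several worker processes.
    """
    return np.vstack(list(map_fn(epoch_features, epochs)))


def subsample_normal(normal, n_seizure, ratio=3, seed=0):
    """Randomly keep at most `ratio` normal vectors per seizure vector."""
    rng = np.random.default_rng(seed)
    keep = min(len(normal), ratio * n_seizure)
    idx = rng.choice(len(normal), size=keep, replace=False)
    return normal[idx]
```

To run the map step in parallel, the call would be wrapped as `with multiprocessing.Pool() as pool: feats = extract_features(epochs, pool.map)` inside an `if __name__ == "__main__":` guard; the default `map_fn=map` runs the same code serially, which is convenient for checking results.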
We used the NeuroStack hardware for implementing the k-NN and RCE networks. The hardware has the following constraint: each board can commit a maximum of 4096 examples to memory. Hence, a k-NN network on this hardware can store only the first 4096 training vectors. The RCE network stores the first 4096 vectors that do not fall into each other’s influence field. The training order for k-NN was seizure vectors followed by normal vectors. This was done to make sure the k-NN saved sufficient seizure examples in memory. In case of RCE, since the decision space changes with the order of the training vectors, we performed iterative training until two successive iterations resulted in the same decision space.
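To make the iterative RCE training described above more concrete, the following is a toy software sketch. It is not the NeuroStack's learning rule: the L1 distance, the `max_radius` value, the class name, and the tie-breaking in `predict` are all assumptions made for illustration. Only the capacity of 4096 stored examples and the idea of repeating passes until the decision space stops changing come from the text.

```python
import numpy as np

MAX_PROTOTYPES = 4096  # per-board capacity quoted in the text


class RCESketch:
    """Toy RCE network; L1 distance and max_radius are assumptions."""

    def __init__(self, max_radius=50.0, capacity=MAX_PROTOTYPES):
        self.max_radius = max_radius
        self.capacity = capacity
        self.centers, self.radii, self.labels = [], [], []

    def _distances(self, x):
        return np.abs(np.asarray(self.centers) - x).sum(axis=1)

    def _train_one(self, x, y):
        """Present one example; return True if the network changed."""
        x = np.asarray(x, dtype=float)
        changed, covered = False, False
        nearest_other = self.max_radius
        if self.centers:
            for i, dist in enumerate(self._distances(x)):
                if self.labels[i] == y:
                    covered = covered or dist < self.radii[i]
                else:
                    nearest_other = min(nearest_other, dist)
                    if dist < self.radii[i]:
                        self.radii[i] = dist  # shrink wrong-class field
                        changed = True
        if not covered and nearest_other > 0 and len(self.centers) < self.capacity:
            # Recruit a new node whose field stops at the nearest other class.
            self.centers.append(x)
            self.radii.append(nearest_other)
            self.labels.append(y)
            changed = True
        return changed

    def fit(self, X, y, max_passes=20):
        """Repeat passes over the data until one pass changes nothing."""
        n_pass = 0
        for n_pass in range(1, max_passes + 1):
            changed = [self._train_one(x, lab) for x, lab in zip(X, y)]
            if not any(changed):
                break
        return n_pass

    def predict(self, x, unknown="Unknown"):
        """Label of the closest firing node, or `unknown` if none fires."""
        if not self.centers:
            return unknown
        d = self._distances(np.asarray(x, dtype=float))
        firing = [i for i in range(len(d)) if d[i] < self.radii[i]]
        if not firing:
            return unknown
        return self.labels[min(firing, key=lambda i: d[i])]
```

An input that falls outside every influence field is returned as "Unknown", which is the case the resolution strategies in the experiments are meant to handle. For the capacity-limited k-NN, the analogous precaution is simply to order the training set with seizure vectors first, as described above.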
As mentioned, NeuroStack uses a parallel neuromorphic architecture. The basic operation, which is computing the distance between an input example and all the saved examples, takes a constant amount of time, irrespective of the number of saved examples. This is possible because each example is saved in a separate, uniquely addressable memory location. This results in a small training time and a quick response while testing. A consequence of this memory setting is that there is a maximum limit on the size of an example. In NeuroStack, each example can be at most 256 bytes long. To conform to the 256-byte memory limit, every feature was scaled to the range 0 to 255 so that each feature takes up at most 1 byte of memory. With 21 channels, this allowed for a maximum of 12 features per channel, which is equivalent to a maximum of 12 × 21 = 252 bytes per epoch.
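The byte budget above can be checked with a few lines of code. The sketch below is illustrative only: the min-max scaling with per-feature bounds `lo` and `hi` (for example, taken from the training set) is an assumption about how the normalization could be done, and the function names are hypothetical.

```python
import numpy as np

N_CHANNELS = 21   # channels per epoch, from the text
MAX_BYTES = 256   # maximum example size on the hardware, from the text


def max_features_per_channel(n_channels=N_CHANNELS, max_bytes=MAX_BYTES):
    """Largest per-channel feature count that fits at one byte per feature."""
    return max_bytes // n_channels


def quantize(features, lo, hi):
    """Min-max scale each feature to 0..255 and store it as one byte.

    `lo` and `hi` are per-feature bounds (assumed to come from the
    training data); values outside the bounds are clipped.
    """
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = np.clip((features - lo) / span, 0.0, 1.0)
    return np.round(scaled * 255).astype(np.uint8)
```

With 21 channels, `max_features_per_channel()` gives 12, and a vector of 12 × 21 = 252 quantized features occupies 252 bytes, leaving 4 bytes of the 256-byte limit unused.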
4. Experiments
The following experiments were performed:
Feature selection: Filter and wrapper feature selection methods were used to select optimal feature sets. The performance of classifiers was compared for all feature sets;
Resolution strategy: Different resolution strategies were compared for examples classified as “Unknown” by RCE;
Number of nearest neighbors: k-NN and RCE networks with varying number of nearest neighbors were compared;
Number of EEG sessions used for training: Performance of different classifiers was compared for different training-set sizes;
Varying epoch duration: Performance of a classifier was tracked as the epoch duration was changed;
Individual vs. population-based systems: Performance of a general classifier trained with data from all individuals was compared to specific classifiers trained with data from each individual.
Training and test sets used in the experiments: For all the individual-based classifiers, $n_i - 1$ sessions were used for training and the one remaining session was used for testing, where $n_i$ is the number of EEG sessions for individual $i$. For experiments with a varying number of sessions used for training, $n_i - m$ sessions were used for training and the remaining $m$ sessions were used for testing, for each value of $m$ considered. The experiment with a given value of $m$ was repeated until every session of an individual was tested at least once.
For population-based classifiers, the training set included session data from all the individuals but one, and all the sessions belonging to the one individual were used for testing. This experiment was repeated for all individuals.
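The splitting schemes described above can be sketched as simple generators. This is one possible reading, not the authors' code: the function names are hypothetical, and the wrap-around used when the number of sessions is not a multiple of the held-out count is an assumption chosen so that every session is tested at least once.

```python
def leave_one_session_out(sessions):
    """Individual-based split: each session is the test set exactly once."""
    for i, held_out in enumerate(sessions):
        yield sessions[:i] + sessions[i + 1:], [held_out]


def hold_out_m_sessions(sessions, m):
    """Hold out m sessions at a time until every session has been tested.

    The last fold wraps around to the first sessions if needed
    (an assumption; the text only requires "at least once").
    """
    n = len(sessions)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < number of sessions")
    for start in range(0, n, m):
        test_idx = {(start + k) % n for k in range(m)}
        train = [s for j, s in enumerate(sessions) if j not in test_idx]
        test = [s for j, s in enumerate(sessions) if j in test_idx]
        yield train, test


def leave_one_individual_out(sessions_by_individual):
    """Population-based split: train on all individuals but one."""
    for held_out, test in sessions_by_individual.items():
        train = [
            s
            for individual, sessions in sessions_by_individual.items()
            if individual != held_out
            for s in sessions
        ]
        yield held_out, train, list(test)
```

For an individual with five sessions and two held out at a time, `hold_out_m_sessions` yields three folds, with the first session tested twice because of the wrap-around.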
In all the classification experiments, seizure and normal data were designated as the positive and negative classes, respectively. The True Positives (TP), False Negatives (FN), True Negatives (TN), and False Positives (FP) of a classifier were defined as:
TP: Number of seizure examples classified as seizure;
FN: Number of seizure examples classified as normal;
TN: Number of normal examples classified as normal;
FP: Number of normal examples classified as seizure.
The following are the metrics used for evaluating and comparing the performance of the classifiers:
Sensitivity: Also known as the true positive rate, it is the fraction of seizure examples classified as seizure: $\text{Sensitivity} = \frac{TP}{TP + FN}$.
Specificity: Also known as the true negative rate, it is the fraction of normal examples classified as normal: $\text{Specificity} = \frac{TN}{TN + FP}$.
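F1-score: The harmonic mean of precision and recall, which combines them into a single metric and makes it easy to compare the performance of classifiers:
$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
where the precision is given by:
$$\text{precision} = \frac{TP}{TP + FP}$$
and the recall, which is identical to the sensitivity, is given by:
$$\text{recall} = \frac{TP}{TP + FN}$$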
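As a small worked example, the evaluation metrics can be computed directly from the four counts defined above using the standard definitions. The function name and the counts in the usage note are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics with seizure as the positive class."""
    sensitivity = tp / (tp + fn)   # true positive rate, same as recall
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }
```

For hypothetical counts TP = 90, FN = 10, TN = 980, and FP = 20, this gives a sensitivity of 0.90, a specificity of 0.98, a precision of about 0.82, and an F1-score of about 0.86.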
6. Discussion
The experiments conducted in this work and their results provide relevant information for deciding the architecture of the classifier and the overall real-time seizure detection system. It can be observed that RCE performs better than SVM, ANN, and k-NN when trained with data from a single session, and this performance is comparable to that obtained using all sessions for training. In the case of SVM and ANN, the performance gradually improves with data size, and they need numerous EEG sessions for training to reach the performance obtained by RCE with a training set composed of a single EEG session. This performance disparity can be attributed to the ability of RCE networks to identify anomalies with confidence, even with a small amount of seizure data. Having a secondary population-based classifier to classify the Unknown examples (or examples with a predicted probability < 0.8 in the case of ANN and SVM) also improved the performance of the classification systems. All the examples that could not be classified by the primary model with a high probability were passed to a secondary classifier for a second opinion and classified accordingly. It can be seen from
Table 10 that individual-based seizure detection systems work better than population-based systems. This can be attributed to the high variability in the seizure patterns from person to person, which cannot be captured by a single general model. In this study, the seizure and normal data were labeled and we knew the start and end of each type of data. Hence, non-overlapping epochs were used without any performance degradation. When deployed in a clinical workflow, the system will be working with continuous streams of EEG data. Overlapping epochs can be used in such a scenario to further reduce the chance of missing a seizure since there will be more seizure epochs. The present system can be further improved by pre-processing the data to remove artifacts using combined blind source separation and independent component analysis [
74], which has proven more useful for separating linearly mixed independent sources in EEG signals, including artifacts [
75,
76]. Such pre-processing may improve the presented results and can be investigated in future studies.
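The two-stage arrangement discussed above, in which a secondary population-based classifier gives a second opinion on uncertain examples, can be sketched as follows. The interface is hypothetical: each classifier is assumed to be a callable returning a `(label, confidence)` pair, and only the 0.8 threshold and the "Unknown" label come from the text.

```python
UNKNOWN = "Unknown"


def cascade_predict(x, primary, secondary, threshold=0.8):
    """Return the primary label unless it is Unknown or low-confidence.

    `primary` is the individual-based model and `secondary` the
    population-based model; both return (label, confidence).
    """
    label, confidence = primary(x)
    if label == UNKNOWN or confidence < threshold:
        # Fall back to the secondary classifier for a second opinion.
        label, _ = secondary(x)
    return label
```

For an RCE primary model, the confidence could simply be 1.0 whenever a node fires, so that only "Unknown" outputs reach the secondary classifier; for ANN or SVM, the predicted class probability would play that role.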
This improved system can be used to trigger tracer injection for ictal-SPECT. In the future, our system can also be used to trigger deep brain stimulation (electro-stimulation) for suppressing ictal discharges and their propagation, and to inject drugs intracranially for more effective seizure control.
Table 11 summarizes the main results of this study and compares the proposed techniques with state-of-the-art techniques. We did not implement the other techniques on the hardware, as some of them are not applicable to our study. However, we cite the results reported in several recently published papers for the sake of comparison. It can be seen from
Table 11 that the proposed method achieves over 90% sensitivity, specificity, and accuracy while keeping the seizure detection delay within 12 s.