Automated Identification of Sleep Disorder Types Using Triplet Half-Band Filter and Ensemble Machine Learning Techniques with EEG Signals

Sharma, Manish; Tiwari, Jainendra; Patel, Virendra; Acharya, U. Rajendra

doi:10.3390/electronics10131531


¹ Department of Electrical and Computer Science Engineering, Institute of Infrastructure, Technology, Research and Management (IITRAM), Ahmedabad-380026, India

² School of Engineering, Ngee Ann Polytechnic, Singapore-599489, Singapore

³ Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan

⁴ School of Management and Enterprise, University of Southern Queensland, Springfield 4300, Australia

* Author to whom correspondence should be addressed.

Electronics 2021, 10(13), 1531; https://doi.org/10.3390/electronics10131531

Submission received: 13 May 2021 / Revised: 12 June 2021 / Accepted: 17 June 2021 / Published: 25 June 2021

(This article belongs to the Special Issue Electronic Solutions for Artificial Intelligence Healthcare Volume II)

Abstract

A sleep disorder is a medical condition that affects an individual’s regular sleeping pattern and routine, hence negatively affecting the individual’s health. The traditional procedures used by clinicians to identify sleep disorders involve questionnaires and polysomnography (PSG), which are subjective, time-consuming, and inconvenient. Hence, automated sleep disorder identification is required to overcome these limitations. In this study, we propose a method using electroencephalogram (EEG) signals for the automated identification of six sleep disorders, namely insomnia, nocturnal frontal lobe epilepsy (NFLE), narcolepsy, rapid eye movement behavior disorder (RBD), periodic leg movement disorder (PLM), and sleep-disordered breathing (SDB). To the best of our knowledge, this is one of the first studies to identify sleep disorders using EEG signals from the cyclic alternating pattern (CAP) sleep database. After sleep-scoring the EEG epochs, we created eight different data subsets of EEG epochs to develop the proposed model. A novel optimal triplet half-band filter bank (THFB) is used to obtain the subbands of the EEG signals. We extracted Hjorth parameters from the subbands of the EEG epochs. The selected features are fed to various supervised machine learning algorithms for the automated classification of sleep disorders. Our proposed system obtained the highest accuracy of 99.2%, 98.2%, 96.2%, 98.3%, 98.8%, and 98.8% for the insomnia, narcolepsy, NFLE, PLM, RBD, and SDB classes against normal healthy subjects, respectively, using an ensemble boosted trees classifier. For identifying the type of sleep disorder, we attained the highest accuracy of 91.3%. The proposed method is simple, fast, and efficient, and may reduce the challenges faced by medical practitioners in diagnosing various sleep disorders accurately and in less time at sleep clinics and homes.

Keywords: sleep; sleep stages; polysomnography (PSG); classification; ensemble boosted trees

1. Introduction

Sleep is a fundamental part of human life, and its deprivation results in many sleep disorders, which impose adverse effects on an individual’s health. Lack of sleep may even result in the development of dementia and Alzheimer’s disease [1]. Insufficient sleep may disrupt blood sugar levels to the extent that a person becomes diabetic. Inadequate sleep may cause blockage of the coronary arteries, leading to cardiovascular disease, stroke, and congestive heart failure [1]. Improper sleep contributes significantly to various psychiatric conditions such as anxiety, depression, and even suicidality. Inadequate sleep may result in weight gain in adults and children. Dieting without proper sleep is ineffective and results in loss of lean body mass instead of fat. Improper sleep may deteriorate the quality of life and shorten an individual’s life span [1].

Researchers believe that sleep has persisted through years of evolution and must therefore have significant benefits. Recent studies show that sleep serves a multitude of functions that help both our brains and bodies. Sleep helps our brain by improving our ability to learn, memorize, and make decisions. Sleep also helps to re-calibrate our emotional brain circuits, so that we can better plan our day-to-day activities and handle psychological challenges. When we sleep, our brain receives a neurochemical bath, which helps to soothe painful memories, and enters a virtual reality space in which the brain fuses past and present experiences and knowledge. Sleep helps our body by restrengthening the immune system, preventing infections, and helping to fight malignancy. Sleep improves the body’s metabolic state by maintaining the balance of insulin and circulating glucose. It also helps to control body weight by regulating appetite. Proper sleep also helps to lower blood pressure and maintain better heart health, leading to a healthy cardiovascular system. Recent sleep studies conclude that sleep is the single most effective thing we can do to reset our brain and body health each day [1].

The first sleep disorder considered in this study is insomnia. Insomnia is characterized by difficulty in falling and/or staying asleep. According to the International Classification of Sleep Disorders, third revision (ICSD-3), insomnia is mainly classified into two types: (i) short-term insomnia disorder and (ii) chronic insomnia disorder [2]. Short-term insomnia lasts for a few weeks; however, if the symptoms occur at least three nights per week for more than three months, the condition is treated as chronic insomnia disorder. Approximately 10–30% of the total population experiences symptoms of insomnia. If patients with co-morbid conditions are also included, this number could be around 50%, depending on the severity and type of the disorder [3]. Narcolepsy is another sleep disorder considered in this study. ICSD-3 places narcolepsy under the category of central disorders of hypersomnolence, which are characterized by excessive daytime sleepiness. ICSD-3 further subdivides narcolepsy into two categories, namely narcolepsy type 1 and narcolepsy type 2. Narcolepsy with hypocretin deficiency along with cataplexy is termed narcolepsy type 1. If the cerebrospinal fluid hypocretin-1 levels do not meet the narcolepsy type 1 criteria and cataplexy is also absent, the condition is termed narcolepsy type 2 [2].

Nocturnal frontal lobe epilepsy (NFLE) can be regarded as a sleep ailment of varied etiology [4] that occurs during nocturnal sleep in the form of epileptic seizures. The CAP database contains a total of 40 patients with NFLE (Table 1). A total of ten patients in the CAP database have periodic leg movement (PLM) disorder. ICSD-3 classifies PLM under the category of sleep-related movement disorders. PLM can be regarded as a sleep disorder involving repetitive and regular flexing or jerking of the legs for about 20–40 s during sleep. For the condition to be diagnosed as PLM, the frequency of limb movements must exceed 15 per hour for adults and five per hour for children. The prevalence of PLM is higher than the prevalence of epilepsy [5]. The CAP sleep database contains recordings of 22 patients suffering from REM behavior disorder (RBD). ICSD-3 classifies RBD under the category of parasomnia; it is characterized by loss of normal muscular tension and other abnormal behavior such as enacting dreams during REM sleep. RBD most commonly affects older males in the 40–70 age group [6,7]. The CAP sleep database contains four patients who have difficulty breathing during sleep and are categorized as having sleep-disordered breathing (SDB). Obstructive sleep apnea (OSA), snoring, central sleep apnea, and hypopnea are collectively classified as SDB [8].

In clinical practice, a conventional questionnaire-based method is used, under which the patient is kept under the surveillance of doctors for months. Conventionally, the Pittsburgh sleep quality index (PSQI) is used to evaluate the quality of sleep [9]. In the PSQI, sleep quality is assessed using a self-report questionnaire over a one-month time interval, and the questions relate to the psychometric properties of sleep quality. Generally, questionnaires include questions on sleep duration, daytime sleepiness, caffeine intake, snoring, breathing problems during sleep, body mass index, and blood pressure. Answers to these questions are subjective, memory-based, and prone to human error. The diagnostic procedure also includes keeping a sleep log or sleep diary for the past 15 or 30 days. The doctor decides the type of sleep disorder based on the patient’s answers to the sleep questionnaires and the entries registered in the sleep log. Presently, polysomnography (PSG) is the conventional method used to diagnose and treat sleep-related disorders. PSG uses several electrodes and wired sensors to record various physiological events of a patient, such as brain waves using EEG, heart rhythms using electrocardiogram (ECG) [10], muscle movements using electromyogram (EMG), eye movements using electrooculogram (EOG), blood oxygen saturation (SpO2), and nasal airflow using thermistors. A medical practitioner inspects the overnight PSG recording of a patient and diagnoses the sleep disorder. Because of its many sensors and electrodes and its complexity, the PSG procedure is costly, time-consuming, and inconvenient for patients, and may be impractical for clinicians.

The EEG signals available in the CAP sleep database are labeled according to the R & K criteria [11], which were widely used earlier for sleep stage scoring. Under the R & K criteria, sleep is divided into six sleep stages: wake (W), S1, S2, S3, S4, and rapid eye movement (REM). Although the CAP database is scored using the R & K guidelines, we grouped the stages S3 and S4 together to form the N3 sleep stage in order to comply with the updated AASM guidelines on sleep scoring. According to the AASM guidelines [12], the sleep of an individual is divided into five sleep stages, namely wake (W), N1, N2, N3, and rapid eye movement (REM) [13,14]. The sleep stages N1 and N2 are collectively called light sleep, and the N3 sleep stage is also called deep sleep. The three stages N1, N2, and N3 together are called non-rapid eye movement (NREM) sleep. In NREM sleep, the eyes remain still, while in the REM sleep stage, the eyes move very rapidly. NREM sleep constitutes around 75–80% of the total sleep duration, while REM sleep constitutes the remaining 20–25%. REM and NREM sleep jointly form one sleep cycle of approximately 90 min duration. Around 4–5 sleep cycles occur per night in the case of an adult.
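The regrouping of R & K stages into AASM stages described above amounts to a simple label mapping. The following is a minimal sketch; the label strings (`"S1"`, `"REM"`, and so on) are assumed for illustration and may not match the annotation format of the CAP files.

```python
# Hypothetical label strings; the actual CAP annotation format may differ.
RK_TO_AASM = {
    "W": "W",
    "S1": "N1",
    "S2": "N2",
    "S3": "N3",  # S3 and S4 are merged into N3
    "S4": "N3",
    "REM": "REM",
}

def rk_to_aasm(rk_labels):
    """Map a sequence of R&K stage labels to AASM stage labels."""
    return [RK_TO_AASM[label] for label in rk_labels]

hypnogram_rk = ["W", "S1", "S2", "S3", "S4", "S2", "REM"]
hypnogram_aasm = rk_to_aasm(hypnogram_rk)
print(hypnogram_aasm)
```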

Often, PSG-based techniques are employed to detect primary sleep disorders. Sometimes brief questionnaires are also used to detect insomnia [15] and RBD [16]. Few studies in the existing literature address the automated identification of sleep disorders; the majority focus on the detection of sleep stages [13,17,18]. Stephansen et al. [19] used neural networks to develop an automated sleep-scoring algorithm to diagnose narcolepsy. Espiritu et al. [20] used PSG data to automatically identify sleep-related disorder events such as arousal and leg movements. However, they used only one subject to carry out their study. They obtained the highest accuracy of 90.57%, 88.39%, and 89.79% for arousal detection, left leg movement detection, and right leg movement detection, respectively, using a decision tree classifier. David et al. [21] developed an automated system to identify SDB-related events. They used power spectral density (PSD) estimation for feature extraction and obtained an overall accuracy of 85%.

Recently, Sharma et al. [22,23] used ECG and EEG signals separately to identify insomnia and obtained the best accuracy of 97.87%. A similar study used ECG signals from the CAP sleep database and spectral features for the automatic identification of healthy subjects and three sleep disorders (insomnia, RBD, and SDB), obtaining an overall accuracy of 86.27% [24]. Shahin et al. [25] used Hjorth parameters and a deep neural network approach for the automated identification of insomnia and obtained an overall accuracy of 92%. Mehrnoosh et al. [26] developed a method for detecting Alzheimer’s disease using Hjorth parameters and EEG signals. Our study uses Hjorth parameter features extracted from the subbands of a novel triplet half-band filter bank (THFB) to identify six different sleep disorders jointly. The study not only helps in discriminating good sleepers from sleep-disordered patients but also identifies the type of sleep disorder the patient is suffering from. The main features of our study are as follows:

The proposed method is accurate and efficient as we have considered EEG signals for the automated identification of various sleep disorders. Additionally, we have tried to simplify the method using only two EEG channels. The EEG channels are combined to obtain better results.
To perform our study, we have used the publicly available CAP sleep database containing data from many sleep disorders. Hence, we have obtained the classification results of a maximum number of sleep disorders.
We have used Hjorth parameters that are considered prominent in analyzing EEG signals for feature extraction and are computationally less expensive than other non-linear features.
We have used a novel THFB for obtaining subbands of EEG signals. The THFB is computationally less expensive than the conventional biorthogonal wavelet filter banks.
We have used the individual sleep stages W, N1, N2, N3, REM as well as their combinations N1 + N2 (light sleep), N1 + N2 + N3 (NREM), and W + N1 + N2 + N3 + REM (all stages) for the classification of sleep disorders. Hence, we have implemented both sleep stage-dependent and sleep stage-independent classification schemes, and our model can identify disorders either from individual sleep stages or directly from EEG epochs without segmentation into sleep stages.

The rest of the paper is organized as follows. Section 2 describes the material used for the study. Section 3 details the methodology, including the preprocessing, filter bank, wavelet decomposition, feature extraction, and classification involved in our research. Section 4 presents the results obtained for the binary classification of healthy and sleep-disordered subjects as well as the seven-class classification of sleep disorders. Section 5 presents a discussion, and Section 6 concludes the paper.

2. Material Used

In the CAP sleep database, there is a total of 108 PSG recordings. The PSG signals have been recorded at the Sleep Disorders Centre of the Ospedale Maggiore of Parma, Italy. It contains multiple EEG montages, EOG (Electrooculogram) channels, one ECG (Electrocardiogram), submentalis muscle EMG (Electromyogram), two EMG channels, and respiration signals. The 10–20 international system has been used for capturing EEG recordings, and the following channels were recorded: C3/C4, F3/F4, O1/O2 referenced to A1/A2. Additional channels, namely F4-C4, F3-C3, Fp1-F3, P3-O1, C3-P3, C4-P4, Fp2-F4, and P4-O2, are also present. The EEG channels F4-C4, and C4-A1, which are sampled at 512 Hz frequency, have been employed in this study. The maximum number of PSG recordings possess these two channels. The CAP sleep database comprises healthy subjects and patients suffering from seven types of sleep disorders: NFLE, insomnia, narcolepsy, bruxism, PLM, RBD, and SDB. In this study, we have considered patients who contained EEG recordings sampled at 512 Hz sampling frequency. However, only one subject with a 512 Hz sampling frequency is available for bruxism, and the PSG recording for this patient seems ambiguous. Therefore, we have not considered bruxism disorder in this study. The average age of healthy, insomnia, narcolepsy, NFLE, PLM, RBD, and SDB subjects used in this study is 32, 59, 32, 30, 54, 70, and 78 years, respectively. The range in which the age of all subjects varies is 14–82 years, with an average age of 45 years. Around 58% of the subjects are male (45), and 41% are female (32) out of 77 subjects used in this study. It can be observed that the CAP sleep database mainly represents data of elderly subjects. Hence, our work can significantly help to detect sleep disorders in older subjects. Table 1 shows the epochs corresponding to different sleep stages of healthy and different disordered subjects used in this study. 
The number of epochs is lowest for SDB patients and comparatively higher for NFLE and RBD patients than for the other groups.

3. Methodology

The sequence of steps involved in sleep disorder classification using our method is depicted in the flowchart shown in Figure 1. The EEG signals are first filtered and then segmented into sleep stages according to the sleep stage annotations given in the database. The annotations label each EEG epoch (30 s of EEG signal) with a sleep stage. Epochs of EEG signals of disordered patients are grouped by sleep stage and then combined with healthy EEG epochs. After preprocessing, we used a novel triplet half-band filter bank to perform wavelet decomposition of the EEG signals. From the seven-level wavelet decomposition, eight subbands are obtained for each EEG epoch. Subsequently, we extracted Hjorth parameters (activity, mobility, and complexity) from each subband. These features are then fed to various supervised machine learning classifiers for the automated discrimination of healthy subjects and disordered patients, and for identifying the type of sleep disorder. The complete study, including experimentation and model training, was performed using MATLAB R2020a installed on a Windows Server 2019 machine equipped with an Intel Xeon E5-2690 v3 CPU @2.6 GHz (6 cores), an Nvidia K80 GPU with 12 GB graphics memory, and 56 GB RAM.

3.1. Preprocessing

We removed the initial 3 min of data from all the recordings before performing wavelet analysis. Filtering of the EEG signals was performed to eliminate noise and keep only the required information, and normalization of the EEG signals was carried out to obtain identical amplitude levels. We segmented the EEG signals into 30 s epochs. Each epoch comprises 15,360 samples at the 512 Hz sampling frequency. These epochs are then labeled with sleep stages.
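The epoch arithmetic above (30 s × 512 Hz = 15,360 samples) can be illustrated with a short sketch. It assumes the recording is available as a one-dimensional NumPy array and that any incomplete trailing epoch is dropped; the function name `segment_epochs` and the synthetic recording are illustrative only.

```python
import numpy as np

FS = 512            # sampling frequency in Hz (from the text)
EPOCH_SEC = 30      # epoch length in seconds (from the text)
TRIM_SEC = 3 * 60   # initial segment that is discarded (from the text)

def segment_epochs(signal, fs=FS, epoch_sec=EPOCH_SEC, trim_sec=TRIM_SEC):
    """Drop the first trim_sec seconds, then split into non-overlapping epochs."""
    samples_per_epoch = fs * epoch_sec
    trimmed = np.asarray(signal)[fs * trim_sec:]
    n_epochs = len(trimmed) // samples_per_epoch  # incomplete tail is dropped (assumption)
    return trimmed[: n_epochs * samples_per_epoch].reshape(n_epochs, samples_per_epoch)

# Synthetic 10-minute recording used only to show the resulting shape.
recording = np.random.default_rng(0).standard_normal(FS * 10 * 60)
epochs = segment_epochs(recording)
print(epochs.shape)  # (14, 15360): 7 remaining minutes = 14 epochs of 15,360 samples
```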

Filtering for Noise removal: Conventionally, the EEG signal is divided into various frequency bands, namely delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ), with frequency ranges of 0–4 Hz, 4–8 Hz, 8–13 Hz, 13–30 Hz, and above 30 Hz, respectively. To retain the useful information and remove noise [27,28], as per the AASM criterion [12], we band-pass filtered the EEG signal using a finite impulse response (FIR) filter with a Kaiser window. Only frequencies above 1 Hz and below 35 Hz were retained in our study. Figure 2a,b displays the power spectrum of the raw and filtered EEG signal, respectively. The filtered EEG signals are then standardized so that they are centered at zero with a standard deviation of 1. This is done to account for differences in recording equipment and to make the data easier to train on.
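As an illustration of this preprocessing step, the sketch below designs a windowed-sinc band-pass FIR filter with a Kaiser window and then standardizes the output. The filter length (`num_taps=1025`) and Kaiser `beta=5.0` are assumptions chosen for the example; the text does not state the values used in the study, and the study itself was carried out in MATLAB rather than Python.

```python
import numpy as np

def kaiser_bandpass(num_taps, low_hz, high_hz, fs, beta):
    """Windowed-sinc band-pass FIR: difference of two ideal low-pass responses,
    tapered by a Kaiser window. num_taps should be odd for a symmetric filter."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0

    def lowpass(cut_hz):
        fc = cut_hz / fs                      # cutoff in cycles per sample
        return 2.0 * fc * np.sinc(2.0 * fc * n)

    return (lowpass(high_hz) - lowpass(low_hz)) * np.kaiser(num_taps, beta)

def filter_and_standardize(x, fs=512, low_hz=1.0, high_hz=35.0,
                           num_taps=1025, beta=5.0):
    """Band-pass filter x, then rescale to zero mean and unit standard deviation."""
    h = kaiser_bandpass(num_taps, low_hz, high_hz, fs, beta)
    y = np.convolve(x, h, mode="same")        # symmetric taps, so no net delay
    return (y - y.mean()) / y.std()

# Synthetic epoch: a 10 Hz component (in band) plus a 60 Hz component (out of band).
fs = 512
t = np.arange(30 * fs) / fs
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)
clean = filter_and_standardize(raw, fs=fs)
```

With these assumed settings the 60 Hz component is strongly attenuated while the 10 Hz component passes; a longer filter or a different `beta` would trade transition width against stopband attenuation.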

Segregation of sleep stages from EEG signals: In our study, we divided the EEG signal of each subject according to sleep stages. The CAP sleep database contains annotation files along with the signal recordings. The annotation files contain labels for the 30 s epochs of the EEG signals as per the R & K rules [11], and the sleep stage scoring was done by six trained sleep experts. We removed the initial 3 min of data from all the recordings. We generated the hypnogram for each subject; Figure 3 shows one such hypnogram of a healthy subject. The hypnogram is a graph that depicts sleep stages as a function of sleep time in hours. It is an easy way to represent the brain wave activity recorded in the EEG signals. We grouped the 30 s epochs into five sleep stages with the help of the annotations in the hypnogram. Table 1 shows the number of epochs pertaining to each sleep stage for all the subjects used in the study.

3.2. Triplet Half-Band Filterbank and Wavelet Decomposition

EEG signals are non-stationary in nature and hence traditional time domain and frequency domain techniques based on Fourier transform cannot analyze them. Contrary to this, the wavelet-based techniques are considered to be an excellent choice for the analysis of non-stationary signals. This encouraged us to use a wavelet-based technique for the analysis of EEG signals [29,30,31]. Hence, in this work we have used a new class of biorthogonal filterbanks [32,33,34,35,36,37,38] named triplet half-band filterbank (THFB) with the parametric Bernstein Polynomial developed by Tay et al. [39].

Tay et al. [39] designed a novel class of filter bank that has structural perfect reconstruction (PR) and regularity properties [40,41]. This filter bank is designed using three simple half-band filters. These filters can be designed easily and have certain desirable properties [42]. The filterbank designed with the help of three half-band filters is known as the triplet half-band filterbank, and it overcomes the limitations of the half-band pair filterbank [43]. In this study, we have used a THFB in which the regularity and sharpness of the filters can be controlled using a parametric Bernstein polynomial. The filterbank has been designed using a least-squares approach. The filters are designed with the objective of minimizing the passband and stopband errors. The optimization problem is a constrained optimization problem subject to PR and regularity conditions [29]. It has been solved iteratively to obtain the optimal parameters of the Bernstein polynomial.

Earlier, Phoong et al. [43] developed a class of biorthogonal filter bank called the half-band pair filterbank (HPFB) with certain special features, namely structural PR and regularity. However, the HPFB has some limitations, such as a restricted frequency response: because the analysis low-pass filter is a half-band filter, the magnitude responses of the analysis and synthesis low-pass filters at $f = 0.25$ need to be $1/2$ and 1.0, respectively. For the THFB, in contrast, the frequency response at $f = 0.25$ can be set to any desired value. The advantage of using the THFB with the parametric Bernstein polynomial is that one can impose the conditions of PR and regularity structurally, and the design is more flexible than the corresponding HPFB. Therefore, in this study we have used the THFB to obtain the subbands of EEG signals.

In this study, we have used analysis and synthesis filters of orders 28 and 38, respectively. The order of regularity chosen for both filters is fixed. Figure 4 shows frequency responses of the filters. We have used cascade algorithm to generate scaling and wavelet functions (Figure 5).

The wavelet decomposition is performed using the above-mentioned THFB. We chose seven decomposition levels, as the maximum frequency component that can be present in an EEG signal sampled at 512 Hz is 256 Hz. We obtained a total of eight subbands, one for the approximation coefficients and seven for the detail coefficients. Then, time-domain Hjorth parameters are computed from these eight subbands.
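To make the relation between the seven levels and the eight subbands concrete, the sketch below lists the nominal frequency range of each subband for an idealized dyadic decomposition at a 512 Hz sampling rate. The names `D1` to `D7` and `A7` are conventional labels assumed here, and real filters (including the THFB) have overlapping transition bands, so these edges are approximate.

```python
def subband_ranges(fs=512.0, levels=7):
    """Nominal frequency band (Hz) of each subband in a dyadic wavelet decomposition."""
    nyquist = fs / 2.0
    bands = {}
    for level in range(1, levels + 1):
        # The detail band at each level is the upper half of the remaining spectrum.
        bands[f"D{level}"] = (nyquist / 2 ** level, nyquist / 2 ** (level - 1))
    # The final approximation band holds what is left below the last detail band.
    bands[f"A{levels}"] = (0.0, nyquist / 2 ** levels)
    return bands

bands = subband_ranges()
for name, (lo, hi) in bands.items():
    print(f"{name}: {lo:g}-{hi:g} Hz")
```

Under this idealization the seven detail bands run from 128–256 Hz down to 2–4 Hz, and the approximation band covers 0–2 Hz.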

3.3. Feature Extraction: Hjorth Parameters

In the proposed study, we have used Hjorth parameters as features; they represent time-domain characteristics of EEG signals and are widely used in EEG signal processing applications. The three parameters derived by Hjorth, namely activity, mobility, and complexity, are used for the analysis of EEG signals. These parameters jointly describe the patterns of EEG signals concerning magnitude, time-frequency scale, and complexity [44]. The Hjorth parameters are defined as follows:

Mathematically, Hjorth activity is the squared standard deviation (energy) of a time series. It represents the mean signal energy. For a signal $y(t)$, the activity can be defined as given in Equation (1).

$$\text{activity} = \operatorname{var}(y(t)) \tag{1}$$

The Hjorth mobility provides an estimate of the mean frequency of the signal. It is proportional to the standard deviation of the power spectrum of the signal. For a signal $y(t)$, it is defined as the square root of the ratio of the variance of the first derivative of the signal to the variance of the signal, as shown in Equation (2).

$$\text{mobility} = \sqrt{\frac{\operatorname{var}\left(\frac{d y(t)}{d t}\right)}{\operatorname{var}(y(t))}} \tag{2}$$

The Hjorth complexity compares the shape of the signal with that of a pure sinusoidal signal and represents the deviation in frequency. The complexity of a pure sine wave is 1, and the value increases above 1 as the shape of the signal departs from a sinusoid. It gives a good estimate of the bandwidth of the signal. For any signal $y(t)$, it is defined as the ratio of the mobility of the first derivative of the signal to the mobility of the signal and is given by Equation (3).

$$\text{complexity} = \frac{\text{mobility}\left(\frac{d y(t)}{d t}\right)}{\text{mobility}(y(t))} \tag{3}$$
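Equations (1)–(3) translate directly into a few lines of code. The sketch below is one possible discrete-time reading, assuming the derivative is approximated by the first difference between consecutive samples (so mobility is expressed per sample rather than per second); the function name `hjorth_parameters` is illustrative.

```python
import numpy as np

def hjorth_parameters(y):
    """Return (activity, mobility, complexity) of a 1-D signal.

    Derivatives are approximated by first differences (an assumption), so
    mobility is in radians per sample rather than per second.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)        # first derivative estimate
    ddy = np.diff(dy)      # second derivative estimate
    var_y, var_dy, var_ddy = np.var(y), np.var(dy), np.var(ddy)
    activity = var_y                                    # Equation (1)
    mobility = np.sqrt(var_dy / var_y)                  # Equation (2)
    complexity = np.sqrt(var_ddy / var_dy) / mobility   # Equation (3)
    return activity, mobility, complexity

# One 30 s epoch at 512 Hz: a 10 Hz sine versus white noise.
fs = 512
t = np.arange(30 * fs) / fs
sine = np.sin(2 * np.pi * 10 * t)
noise = np.random.default_rng(0).standard_normal(t.size)
act_s, mob_s, comp_s = hjorth_parameters(sine)
act_n, mob_n, comp_n = hjorth_parameters(noise)
print(act_s, mob_s, comp_s)
print(act_n, mob_n, comp_n)
```

For the unit-amplitude sine the activity is close to 0.5 and the complexity is close to 1; the white-noise epoch yields a larger complexity value.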

3.4. Classification and Validation

Three Hjorth parameters are extracted from each of the eight subbands to obtain 24 time-domain features per single-channel EEG signal. We have used the Statistics and Machine Learning Toolbox available in MATLAB R2020a to develop the model. The features are used to train multiple supervised machine learning classifiers [45,46], namely ensemble bagged trees (EBT), ensemble boosted trees (EBooT), support vector machines (SVM), and K-nearest neighbor (KNN), using ten-fold cross-validation (CV) to reduce overfitting. Because it cannot be predicted a priori which algorithm will perform well on a given database, we used a trial-and-error approach to find the classifier with the best classification performance. Having identified the best classifier among those available in the toolbox, we tuned its hyperparameters to enhance its performance further. Among all the classifiers used, we observed through extensive simulations that EBT and EBooT performed best and yielded the maximum classification performance for most of the classification tasks. This is in line with the theory that an ensemble predicts better than its individual members. In the ensemble technique, a group of weak learners is combined to form a strong learner, which increases the overall performance of the model. Many decision tree classifiers are combined to achieve a higher performance than a single decision tree classifier. The pivotal advantage of employing ensembles is improved average classification performance over any individual member of the ensemble. As we have used a large number of epochs and the data are imbalanced, ensemble methods are well suited to generating a robust model by effectively reducing the problem of overfitting.
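The ten-fold cross-validation mentioned above can be sketched as an index split over the epoch feature matrix. This is an illustration only: the study used MATLAB tooling, and the shuffled, epoch-wise split, the seed, and the placeholder feature matrix below are assumptions.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

n_epochs, n_features = 1000, 24          # 24 = 3 Hjorth parameters x 8 subbands
X = np.zeros((n_epochs, n_features))     # placeholder feature matrix
splits = list(kfold_indices(n_epochs))
for train_idx, test_idx in splits[:2]:
    print(X[train_idx].shape, X[test_idx].shape)
```

Each epoch appears in exactly one test fold, so every feature vector is used for validation once and for training nine times.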

Ensemble Bagged Trees: In the bagging algorithm, multiple subsets of training data are selected on a random basis with replacement, and then each subset is employed for training a decision tree. Overall performance is obtained from the average of all the predictions made using different decision trees. The central aim of using the bagging algorithm is to minimize the variance of the decision tree classifiers. The bagging algorithm handles higher-dimensional data very well and significantly reduces the overfitting of the model [47,48].

We noticed the variation in the misclassification error rate with respect to the number of splits and the maximum number of trees. The number of splits is changed iteratively from 1 to $m - 1$ with a step size of 100, and the number of trees is varied from 20 to 200. In this work, $m$ denotes the total number of epochs ($m$ = 18,710 for the N3 dataset in seven-class classification). We obtained the optimum number of learners equal to 30, the number of splits equal to $m - 1$ (i.e., 18,709), and the learning rate equal to 1.

Ensemble Boosted Trees: In this technique, random samples of the training data are used to train simple decision trees, and then the errors are analyzed. The weights of the misclassified inputs are increased so that previously misclassified inputs are more likely to be classified correctly in the next iteration. The boosting algorithm aims to improve accuracy by increasing the misclassification cost with every iteration, and thus it converts weak learners into a better-performing model [47,48].
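The reweighting idea can be shown with one round of an AdaBoost-style update. The text does not name the specific boosting variant used, so the formulas below (binary AdaBoost with exponential reweighting) are an assumption made for illustration.

```python
import math

def reweight(weights, misclassified):
    """One AdaBoost-style round: return (alpha, new normalized weights).

    Assumes the weighted error is strictly between 0 and 1.
    """
    err = sum(w for w, miss in zip(weights, misclassified) if miss)
    alpha = 0.5 * math.log((1.0 - err) / err)      # weight of this weak learner
    updated = [w * math.exp(alpha if miss else -alpha)
               for w, miss in zip(weights, misclassified)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Five equally weighted epochs; the weak learner misclassifies the third one.
weights = [0.2] * 5
misclassified = [False, False, True, False, False]
alpha, new_weights = reweight(weights, misclassified)
print(alpha, new_weights)
```

In this toy round the misclassified epoch's weight rises from 0.2 to 0.5, while each correctly classified epoch drops to 0.125, so the next weak learner concentrates on the hard example.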

K-Nearest Neighbor (KNN): It is one of the most straightforward supervised machine learning algorithms and can be used for classification and regression [49]. In this algorithm, we first select the value of K (generally, 5). Then we calculate the Euclidean distance from the data point under consideration to every training point and take the K nearest neighbors according to these distances. The data point is assigned to the category that contains the largest number of these K nearest neighbors.
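A minimal sketch of this voting rule is given below. The two-dimensional toy features and the class names are invented for illustration and do not come from the study.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5):
    """Classify x_query by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distances
    nearest = np.argsort(distances)[:k]                     # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy two-feature training set (hypothetical values).
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = ["healthy", "healthy", "healthy", "disorder", "disorder", "disorder"]
pred = knn_predict(X_train, y_train, np.array([0.15, 0.15]), k=3)
print(pred)  # healthy
```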

Support Vector Machines (SVM): It is a robust supervised learning algorithm, which can be employed for binary and multi-class classification or regression. These are widely used in speech and image recognition, natural language processing, and computer vision [50,51]. The SVM works on the principle of maximizing the margin between two classes. To maximize the separation gap among the two groups in the data, an optimal hyperplane as a decision surface is created that optimally separates the data into two distinct classes.

In this work, we have performed automated identification of six sleep disorders using the C4-A1 and F4-C4 EEG channels. We used EEG recordings of six healthy subjects and 74 patients with the above-mentioned sleep disorders. A total of 78,853 epochs (30 s duration each) were used in this study. A detailed summary of the individual epoch counts is shown in Table 1. In this study, six binary classification tasks are considered to discriminate between healthy subjects and sleep disorder patients. Furthermore, we have considered a seven-class classification problem to identify the type of sleep disorder.

4. Results

In this section, we describe the data subsets we formed and the classification tasks considered for the identification of various sleep disorders.

4.1. Data Subsets Preparation

As per the AASM guidelines [12], we segregated all five sleep stages from recordings of healthy and sleep disorder patients to form five different data subsets (wake(W), N1, N2, N3, REM) and performed binary classification using each data subset. Additionally, we also formulated some more data subsets as mentioned below:

LSS: It is the light sleep stage, formed by combining the N1 and N2 stages of healthy subjects as well as disorder patients.
NREM: It is a combination of the N1, N2, and N3 stages (N1 + N2 + N3) of all subjects and patients.
ALL: It is a combination of epochs belonging to all five stages (W + N1 + N2 + N3 + REM).

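For each data subset, epochs from healthy subjects form one class and epochs from sleep disorder patients form the other. For example, we labeled wake-stage epochs of healthy subjects as one class and wake-stage epochs of sleep disorder patients as another class and then performed binary classification. The classification tasks for the other data subsets are constructed in the same way.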
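
The construction of the eight data subsets and the binary labeling can be sketched as follows. The epoch counts for healthy subjects and insomnia patients are taken from Table 1; everything else (the placeholder epoch records and the helper names `make_epochs`, `build_subsets`, and `binary_labels`) is a hypothetical illustration rather than the code used in the study.

```python
# Epoch counts per sleep stage for healthy subjects and insomnia patients,
# taken from Table 1.
COUNTS = {
    "W": {"healthy": 445, "insomnia": 3801},
    "N1": {"healthy": 280, "insomnia": 223},
    "N2": {"healthy": 2172, "insomnia": 2456},
    "N3": {"healthy": 1757, "insomnia": 1085},
    "REM": {"healthy": 1409, "insomnia": 986},
}


def make_epochs(counts):
    """Expand counts into placeholder (stage, group) records, one per epoch."""
    return {
        stage: [(stage, group) for group, n in groups.items() for _ in range(n)]
        for stage, groups in counts.items()
    }


def build_subsets(epochs_by_stage):
    """Return the five stage-wise subsets plus LSS, NREM, and ALL."""
    subsets = {stage: list(epochs) for stage, epochs in epochs_by_stage.items()}
    subsets["LSS"] = subsets["N1"] + subsets["N2"]
    subsets["NREM"] = subsets["LSS"] + subsets["N3"]
    subsets["ALL"] = subsets["W"] + subsets["NREM"] + subsets["REM"]
    return subsets


def binary_labels(epochs):
    """Label each epoch 0 for healthy subjects and 1 for disorder patients."""
    return [0 if group == "healthy" else 1 for _, group in epochs]


subsets = build_subsets(make_epochs(COUNTS))
for name, epochs in subsets.items():
    labels = binary_labels(epochs)
    print(name, len(epochs), "healthy:", labels.count(0), "disorder:", labels.count(1))
```

In this sketch the ALL subset contains 14,614 epochs, which equals the sum of the 6063 healthy and 8551 insomnia epochs listed in Table 1.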

4.2. Classification Results

In this section, we present the classification performance for the various classification problems we formulated to distinguish healthy subjects from sleep-disordered patients and to identify the type of sleep disorder.

Insomnia vs. Healthy: We have taken seven patients with insomnia (8551 epochs) and six healthy subjects (6063 epochs). We obtained classification accuracy in the range of 90.85% to 98.10% using the F4-C4 channel and 85.88% to 96.08% using the C4-A1 channel. Combining the F4-C4 and C4-A1 channels yielded the best classification accuracy, ranging from 91.25% to 99.23%. Table 2 shows the complete classification results obtained for insomnia and healthy subjects.

Narcolepsy vs. Healthy: Sleep data of five narcolepsy patients (5627 epochs) and six healthy subjects are taken, and classification is performed on all eight subsets. Table 3 shows the classification results obtained for healthy and narcolepsy. The range of accuracy obtained using different data subsets for C4-A1, F4-C4, and both channels combined is 86.69% to 93.61%, 94% to 97.25%, and 94.58% to 98.21%, respectively.

NFLE vs. Healthy: Identification of nocturnal frontal lobe epilepsy (NFLE) patients is carried out using EEG signals obtained from 27 patients (27333 epochs) suffering from NFLE and six healthy subjects. Table 4 gives a detailed summary of results obtained using individual and combined EEG channels. It can be observed that N2, N3, REM, and LSS subsets yielded better performance results using both EEG channels with an accuracy of 96.16%, 96.17%, 96.45%, and 96.21%, respectively.

PLM vs. Healthy: We have used nine periodic leg movement (PLM) subjects (7811 epochs) to perform the automated identification of PLM against healthy subjects. Table 5 provides a detailed comparison and summary of the results obtained using bipolar EEG, unipolar EEG, and a combination of both EEG channels. We obtained classification accuracies ranging from 89.54% to 93.68% using the C4-A1 channel and from 92.4% to 98.07% using the F4-C4 channel. After combining both EEG channels, we obtained the best classification accuracies, in the range of 95.39% to 98.55%. It can be noted from Table 5 that the N3 and REM data subsets show identical classification performance.

RBD vs. Healthy: For the automated identification of the RBD sleep disorder, the classification of 22 RBD patients and six healthy subjects is performed for all eight data subsets. We obtained classification accuracies ranging from 91.27% to 95.95%, 95.56% to 98.33%, and 95.56% to 98.98% for the eight data subsets using the C4-A1 channel, the F4-C4 channel, and the combination of both channels, respectively. It can be noted from Table 6 that the N3 and REM data subsets exhibit similar performance for the automated identification of RBD.

SDB vs. Healthy: We also performed automated identification of sleep-disordered breathing (SDB) by classifying SDB patients against healthy subjects. It can be noted from Table 1 that we have only one SDB patient, with only 16 epochs in the REM stage, against 1409 REM epochs of healthy subjects. As a result of this imbalance in epoch count, the supervised machine learning classifiers yielded poor performance using the REM data subset. We obtained classification accuracies in the range of 90.24% to 98.88% using the C4-A1 channel and 95.47% to 99.04% using the F4-C4 channel. The combination of both channels yielded the best classification accuracies, in the range of 96.88% to 99.46%. Table 7 provides a detailed summary of the results corresponding to SDB identification.
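
The effect of such an imbalance on the reported metrics can be illustrated with a hypothetical baseline that labels every REM epoch as healthy. This baseline is an assumption introduced here for illustration and is not a classifier evaluated in the study; only the epoch counts (1409 healthy, 16 SDB) come from Table 1. The sketch computes accuracy and Cohen's kappa ($κ$) from a confusion matrix whose rows are true classes and whose columns are predicted classes.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what class frequencies alone explain."""
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(len(cm))) / total
    row_totals = [sum(row) for row in cm]        # true-class counts
    col_totals = [sum(col) for col in zip(*cm)]  # predicted-class counts
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical "always healthy" baseline on the REM subset:
# rows = true (healthy, SDB), columns = predicted (healthy, SDB).
baseline_cm = [
    [1409, 0],
    [16, 0],
]

print(round(100 * accuracy(baseline_cm), 2))
print(cohen_kappa(baseline_cm))
```

Such a baseline reaches a high accuracy while its kappa is zero, which is why accuracy alone can be misleading on the REM subset and why $κ$ and AUC are worth reading alongside it.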

Identification of type of sleep disorder: After performing binary classification of the six disorders against healthy subjects, we also performed the essential task of identifying the type of sleep disorder by formulating a seven-class classification task that combines the EEG recordings of all 77 subjects. For instance, we combined the wake-stage epochs of all seven classes of subjects into one data subset, ‘W’, with seven labels (‘Healthy,’ ‘Insomnia,’ ‘NFLE,’ ‘Narcolepsy,’ ‘RBD,’ ‘SDB’ and ‘PLM’). Similarly, we formed the five stage-wise data subsets (W, N1, N2, N3, REM), and combining them yielded the additional data subsets LSS (N1+N2), NREM (N1+N2+N3), and ALL (W+N1+N2+N3+REM). The seven-class classification of the ‘ALL’ data subset represents sleep stage-independent disorder identification, as it contains only seven labels, each denoting either healthy subjects or one of the six disorders, without any information about the five sleep stages. Thus, for the automated identification of the type of sleep disorder, we carried out eight seven-class classification tasks, one for each data subset, using various classifiers (EBT, EBooT, KNN, and SVM) with a 10-fold cross-validation technique. We first used the unipolar EEG channel (C4-A1) alone, then the bipolar EEG channel (F4-C4) alone, and finally both channels combined (C4-A1 + F4-C4), obtaining overall maximum accuracies of 82.0%, 89.8%, and 91.3%, respectively, with the N3 data subset and the EBT classifier. Table 8 summarizes the results obtained for seven-class classification using all eight data subsets. Table 9 shows the confusion matrix and performance metrics obtained for the seven-class classification using both EEG channels (C4-A1 + F4-C4) for the N3 data subset. It can be observed that the precision rates for the automated identification of the type of disorder using both EEG channels combined are more than 90% for all sleep disorders except NFLE.
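
The 10-fold cross-validation partitioning mentioned above can be sketched as follows. The text does not state whether the folds were stratified by class, so stratification is an assumption of this sketch; the function names `stratified_folds` and `train_test_splits` and the toy label list are likewise hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


# Toy label list with three of the seven classes (sizes made up).
labels = ["Healthy"] * 30 + ["Insomnia"] * 20 + ["NFLE"] * 50
folds = stratified_folds(labels, k=10)
splits = list(train_test_splits(folds))
print(len(splits), [len(test) for _, test in splits])
```

Each epoch is used for testing exactly once, and the reported accuracy is typically the average over the ten held-out folds.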

To validate our model, we also used 20% hold-out validation, with 80% of the subjects used for training and the remaining 20% used for validation. We observed that 10-fold cross-validation yielded marginally better results than the 20% hold-out validation scheme. The results obtained using hold-out validation are shown in Table 10. We also compared the performance of the EBT classifier with conventional machine learning classifiers such as KNN and SVM. Table 10 shows that the EBT classifier performed better than the KNN and SVM classifiers: for seven-class classification using EBT, EBooT, KNN, and SVM, we obtained classification accuracies of 90.7%, 67.2%, 77.7%, and 71.2%, respectively.
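
A hold-out split that keeps all epochs of a subject on the same side can be sketched as below. This assumes a subject-wise reading of the 80%/20% split described above; the function name `subject_holdout`, the random seed, and the toy subject identifiers are hypothetical.

```python
import random


def subject_holdout(epoch_subjects, test_fraction=0.2, seed=0):
    """Split epoch indices so that no subject appears on both sides."""
    subjects = sorted(set(epoch_subjects))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(test_fraction * len(subjects)))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(epoch_subjects) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(epoch_subjects) if s in test_subjects]
    return train_idx, test_idx, test_subjects


# Toy example: ten subjects with five epochs each (identifiers made up).
epoch_subjects = [f"subject_{s:02d}" for s in range(10) for _ in range(5)]
train_idx, test_idx, test_subjects = subject_holdout(epoch_subjects)
print(len(train_idx), len(test_idx), sorted(test_subjects))
```

Splitting by subject rather than by epoch prevents epochs of the same person from appearing in both the training and validation sets.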

We have also performed the above-mentioned classification tasks without using wavelet decomposition. Without THFB-based wavelet decomposition, we obtained classification accuracy ranging from 80.97% to 83.55% and AUC ranging from 0.85 to 0.89 for the binary disorder identification tasks using the N3 sleep stage, as shown in Table 11. When we employ the proposed THFB wavelet-based features, we attain better classification accuracy, ranging from 96.17% to 99.23%, and an AUC of 0.99. We attribute this to the high discriminating ability of the optimal THFB wavelet-based Hjorth features.
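
For reference, the three Hjorth parameters (activity, mobility, and complexity) can be computed from a sampled signal as sketched below. This is a minimal illustration that approximates derivatives by first differences and applies the standard definitions to a synthetic sinusoid; the function names are hypothetical, and in the study the parameters are extracted from THFB subbands of EEG epochs rather than from a raw test signal.

```python
import math


def variance(values):
    """Population variance of a sequence."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D sequence."""
    # First and second differences approximate the first and second derivatives.
    first_diff = [b - a for a, b in zip(signal, signal[1:])]
    second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]
    activity = variance(signal)
    mobility = math.sqrt(variance(first_diff) / activity)
    # Complexity is the mobility of the derivative divided by the mobility
    # of the signal itself.
    complexity = math.sqrt(variance(second_diff) / variance(first_diff)) / mobility
    return activity, mobility, complexity


# Synthetic test signal: 8 full cycles of a unit sinusoid over 512 samples.
n_samples, cycles = 512, 8
signal = [math.sin(2 * math.pi * cycles * n / n_samples) for n in range(n_samples)]
activity, mobility, complexity = hjorth_parameters(signal)
print(activity, mobility, complexity)
```

For a pure sinusoid the activity is close to 0.5 (the signal power) and the complexity is close to 1; signals with richer frequency content yield larger complexity values.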

5. Discussion

In the literature, the majority of studies focus on the automated identification of sleep stages. In this paper, we proposed a novel approach to automatically identify six sleep disorders using two EEG channels (C4-A1 and F4-C4). We obtained classification results using the individual EEG channels as well as their combination, and observed that the results improved when both EEG channels were combined. Our proposed model not only classifies healthy and sleep-disordered subjects but also identifies the type of sleep disorder. We observed that, in many cases, the N3 sleep stage yields better classification than the other stages.

Espiritu et al. [20] developed a method using EEG signals for the automatic identification of obstructive sleep apnea (OSA) [52,53] and restless leg syndrome (RLS). They used two types of disorder-related events (arousal and leg movement events) to discriminate between OSA and RLS. For classification, they used a decision tree classifier and achieved a maximum accuracy of 85.02%. Another study, by López-García et al. [21], focused on identifying only SDB patients. To achieve this task, they used EEG signals with power spectral density (PSD) estimation and principal component analysis (PCA). They performed the classification using a support vector machine (SVM) classifier and obtained an accuracy of 85%. Table 12 summarizes the comparison of our study with other existing studies on sleep disorder identification. Widasari et al. [24] performed a study on the classification of healthy subjects and three sleep disorders (insomnia, RBD, and SDB) using the CAP sleep database. Their study mainly focused on classifying these disorders using ECG signals and spectral features based on sleep quality parameters. They used an ensemble bagged trees classifier and obtained an accuracy of 86.27%. Although their method obtained significant accuracy, they performed only four-class classification between healthy subjects and the three disorders mentioned. Their study used 51 subjects, while in our study, we have used 78 subjects comprising six sleep disorders using EEG signals with a 512 Hz sampling frequency. It can be observed from the comparison table that we have achieved the highest accuracy for the seven-class sleep disorder classification.

To compare our study with the work done by Widasari et al. [24], we have also performed four-class classification for the identification of healthy, insomnia, RBD, and SDB subjects. We obtained a maximum classification accuracy of 96.5% with the EBT classifier.

The main features and advantages of our method are as follows:

To the best of our belief, the proposed study is the first study ever undertaken to detect six sleep disorders simultaneously and jointly.
The method used only three Hjorth parameters, which require less computational time, so feature extraction is fast.
The model has yielded high classification performance due to the employment of optimal THFB wavelet-based Hjorth features. The high performance indicates promising discriminating abilities of the optimal wavelet-based features. These highly discriminatory features may also be employed in the automated identification of sleep stages and of mental and neurological disorders such as depression, schizophrenia, Parkinson’s disease, Alzheimer’s disease, and dementia.
We have used EEG signals for sleep disorder identification, which are regarded as the standard for brain-related studies. Additionally, our developed method is simple and accurate, and it can be implemented in real-time applications.
It can be concluded from our results that it is possible to classify sleep disorders using EEG signals alone with high classification performance.
Considering the non-stationary nature of EEG signals, we employed a wavelet-based technique that employs a triplet half-band biorthogonal wavelet filter bank to analyze EEG signals.
We have used the publicly available CAP sleep database so that the work can be reproduced and other interested researchers may use the data and compare their results with ours. Additionally, the database contains PSG recordings of 108 subjects, including patients with six different sleep disorders as well as healthy subjects, amounting to 906 h of recordings.

It can be noted that PSG recordings can be captured only in sophisticated sleep labs and need trained clinicians for the acquisition of various physiological signals. Hence, creating a private database that includes so many subjects and such a wide variety of sleep disorders is a highly time-consuming, complex, error-prone, challenging, and costly task. Additionally, to the best of our knowledge, no research group has used this data to identify sleep disorders using EEG signals to date. Hence, we used the public CAP database. Since the proposed study has attained promising results with EEG signals, a private database can be created in the future, and the developed model can be tested using it. It is worth noting that the method is novel, as this is the first study that employs the optimal THFB wavelet-based features for the detection of sleep disorders. To the best of our knowledge, no study in the literature has employed THFB wavelets for any physiological signals. Thus, this study also indicates that THFB-based features work well in disease detection using EEG signals. In the future, the performance of THFB-based features can be explored for the detection of depression, schizophrenia, Parkinson’s disease, Alzheimer’s disease, sleep arousal, and sleep apnea.

In the future, we can explore convolutional neural networks (CNN) [54] and long short-term memory (LSTM) networks, though it is not guaranteed that they will surpass the performance of the present model, as they require a larger amount of data in each class for training. However, deep learning (DL) techniques are advantageous because they do not require separate feature extraction, feature selection, and classification stages. Hence, we suggest the use of these DL techniques as future work using large, diverse databases.

The limitations of our proposed method are given below:

Although EEG signals are considered suitable for brain-related studies, they can still cause discomfort to patients due to multiple electrodes on the scalp.
The CAP sleep database contains only one SDB patient with a sampling frequency of 512 Hz. As a result, the recall rate obtained for SDB patients in seven-class classification is low, while the recall rate for NFLE patients is high because of the large number of NFLE epochs (see Table 1).
The data are imbalanced concerning the number of subjects corresponding to each sleep disorder. Due to this, for identifying the type of sleep disorder, the model becomes biased towards NFLE and RBD patients as more subjects are involved in these disorders in our study.
In our study, the EEG epochs are imbalanced with respect to sleep stages. However, we have used ensemble bagging and boosting algorithms to minimize the inaccurate results and overfitting of the model.
The identification performance of SDB patients among all six sleep disorders considered in this study against healthy subjects (seven-class classification) is not adequate using only C4-A1 (unipolar) channel. However, using only F4-C4 (bipolar) channel, the performance is significantly improved.
The developed model needs to be trained and tested on real-time data for clinical application.

Modern advancements in technology have led to several laboratory-based tests for diagnosing and treating sleep disorders. These lab-based tests include PSG, the sleep latency test, actigraphy, the maintenance of wakefulness test, and ambulatory EEG [55]. Among these techniques, PSG is the most widely used. PSG involves a night-long simultaneous recording of multiple signals and is an accurate method for identifying sleep disorders. However, PSG-based methods rely on manual scoring by trained physicians and involve multiple wired electrodes on the patient’s body, which is uncomfortable for the patient. These lab-based techniques require patients to spend one or more nights at designated sleep laboratories. Moreover, due to the limited availability of sleep laboratories and their high cost, patients suffering from sleep disorders tend to avoid these techniques, and hence many remain undiagnosed. Furthermore, one or two nights of sleep recordings do not represent the exact sleep condition of the patient. Therefore, there is a need for a device that can record and monitor sleep over a longer period to assess longitudinal variations with minimum discomfort and cost. Thus, we proposed a possible non-invasive, in-home, and cost-effective candidate to identify multiple sleep disorders using only two EEG channels.

Our proposed method uses only two EEG channels and can be integrated into a portable device or a cloud-based health monitoring system. Patients need not stay in sleep laboratories with many uncomfortable electrodes attached.

6. Conclusions

This study aims to develop an automated method for identifying various sleep disorders using EEG signals. A novel optimal triplet half-band filter bank (THFB) is used to obtain the subbands of EEG signals. The Hjorth parameters extracted from the subbands are employed as discriminating features. Our developed model can identify disorders based on both sleep stage-dependent and sleep stage-independent classification schemes. The highest classification performance, both for discriminating between healthy and disordered subjects and for identifying the type of sleep disorder, is achieved using the N3 (deep sleep) stage. The highest accuracy of classification of healthy subjects against insomnia, narcolepsy, NFLE, PLM, RBD, and SDB subjects is 99.2%, 98.2%, 96.2%, 98.3%, 98.8%, and 98.8%, respectively. In addition, for the identification of the type of sleep disorder, we have achieved the best accuracy of 91.3% using the N3 sleep stage. The experimental results show that the ensemble bagged and boosted trees classifiers can effectively classify sleep disorders with high accuracy, precision, recall, and F1-score. The developed approach for sleep disorder identification is simple, accurate, computationally inexpensive, and easy to implement in home-based systems, cloud-based sleep monitoring devices, and hospitals. A private database can be created in the future, and the developed model can be tested on it. We intend to explore various deep learning techniques such as CNN and LSTM to automate sleep disorder identification using a large database.

Author Contributions

Conceptualization, M.S.; methodology, M.S.; software, M.S., J.T. and V.P.; validation, J.T. and V.P.; investigation, M.S.; writing—original draft preparation, M.S.; first draft preparation, J.T. and V.P.; review and editing, J.T., V.P. and U.R.A.; editing, M.S.; visualization, M.S.; supervision, U.R.A. All authors have read and agreed to the published version of the manuscript.

Funding

We have not received any funding.

Data Availability Statement

Data used in this work is open-source and publicly available on PhysioNet (https://physionet.org/content/capslpdb/1.0.0/, accessed on 16 June 2021).

Conflicts of Interest

All the authors confirm that they do not have any conflict of interest for the proposed work.

References

Walker, M. Why We Sleep: The New Science of Sleep and Dreams; Penguin: London, UK, 2017.
Sateia, M.J. International classification of sleep disorders. Chest 2014, 146, 1387–1394.
Bhattacharya, D.; Sen, M.; Suri, J. Epidemiology of insomnia: A review of the global and Indian scenario. Indian J. Sleep Med. 2013, 8, 100–110.
Nobili, L.; Proserpio, P.; Combi, R.; Provini, F.; Plazzi, G.; Bisulli, F.; Tassi, L.; Tinuper, P. Nocturnal frontal lobe epilepsy. Curr. Neurol. Neurosci. Rep. 2014, 14, 424.
Natarajan, R. Review of periodic limb movement and restless leg syndrome. J. Postgrad. Med. 2010, 56, 157.
Ohayon, M.M.; Caulet, M.; Priest, R.G. Violent behavior during sleep. J. Clin. Psychiatry 1997, 58, 369–376.
Boeve, B.F. REM sleep behavior disorder: Updated review of the core features, the REM sleep behavior disorder-neurodegenerative disease association, evolving concepts, controversies, and future directions. Ann. N. Y. Acad. Sci. 2010, 1184, 15–54.
Wells, M.A. Evolving Relationship Between Sleep-Disordered Breathing and Stroke; American College of Cardiology: Washington, DC, USA, 2015.
Buysse, D.; Reynolds, C.; Monk, T.; Berman, S.; Kupfer, D. The Pittsburgh Sleep Quality Index—A New Instrument For Psychiatric Practice And Research. Psychiatry Res. 1989, 28, 193–213.
Sharma, M.; Singh, S.; Kumar, A.; Tan, R.S.; Acharya, U.R. Automated detection of shockable and non-shockable arrhythmia using novel wavelet-based ECG features. Comput. Biol. Med. 2019, 115, 103446.
Rechtschaffen, A.; Kales, A. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects; UCLA Brain Information Service: Los Angeles, CA, USA, 1968.
Iber, C. The AASM manual for the scoring of sleep and associated events: Rules. In Terminology and Technical Specification, 1st ed.; AASM: Westchester, IL, USA, 2007.
Sharma, M.; Goyal, D.; Achuth, P.; Acharya, U.R. An accurate sleep stages classification system using a new class of optimally time-frequency localized three-band wavelet filter bank. Comput. Biol. Med. 2018, 98, 58–75.
Dhok, S.; Pimpalkhute, V.; Chandurkar, A.; Bhurane, A.A.; Sharma, M.; Acharya, U.R. Automated phase classification in cyclic alternating patterns in sleep stages using Wigner-Ville Distribution based features. Comput. Biol. Med. 2020, 119, 103691.
Kessler, R.; Coulouvrat, C.; Hajak, G.; Lakoma, M.; Roth, T.; Sampson, N.; Shahly, V.; Shillington, A.; Stephenson, J.; Walsh, J.; et al. Reliability and Validity of the Brief Insomnia Questionnaire in the America Insomnia Survey. Sleep 2010, 33, 1539–1549.
Stiasny-Kolster, K.; Mayer, G.; Schäfer, S.; Möller, J.; Heinzel-Gutenbrunner, M.; Oertel, W. The REM Sleep Behavior Disorder Screening Questionnaire—A new diagnostic instrument. Mov. Disord. Off. J. Mov. Disord. Soc. 2007, 22, 2386–2393.
Sharma, M.; Tiwari, J.; Acharya, U.R. Automatic Sleep-Stage Scoring in Healthy and Sleep Disorder Patients Using Optimal Wavelet Filter Bank Technique with EEG Signals. Int. J. Environ. Res. Public Health 2021, 18, 3087.
Sharma, M.; Patel, S.; Choudhary, S.; Acharya, U.R. Automated Detection of Sleep Stages Using Energy-Localized Orthogonal Wavelet Filter Banks. Arab. J. Sci. Eng. 2020, 45, 2531–2544.
Stephansen, J.B.; Olesen, A.N.; Olsen, M.; Ambati, A.; Leary, E.B.; Moore, H.E.; Carrillo, O.; Lin, L.; Han, F.; Yan, H.; et al. Neural network analysis of sleep stages enables efficient diagnosis of narcolepsy. Nat. Commun. 2018, 9, 1–15.
Espiritu, H.; Metsis, V. Automated Detection of Sleep Disorder-Related Events from Polysomnographic Data. In Proceedings of the 2015 International Conference on Healthcare Informatics, Dallas, TX, USA, 21–23 October 2015.
López-García, D.; Ruz, M.; Ramírez, J.; Gorriz, J. Automatic detection of sleep disorders: Multi-class automatic classification algorithms based on Support Vector Machines. In Proceedings of the International Conference on Time Series and Forecasting (ITISE2018), Granada, Spain, 19–21 September 2018.
Sharma, M.; Dhiman, H.S.; Acharya, U.R. Automatic identification of insomnia using optimal antisymmetric biorthogonal wavelet filter bank with ECG signals. Comput. Biol. Med. 2021, 131, 104246.
Sharma, M.; Patel, V.; Acharya, U.R. Automated identification of insomnia using optimal bi-orthogonal wavelet transform technique with single-channel EEG signals. Knowl.-Based Syst. 2021, 224, 107078.
Widasari, E.; Tanno, K.; Tamura, H. Automatic Sleep Disorders Classification Using Ensemble of Bagged Tree Based on Sleep Quality Features. Electronics 2020, 9, 512.
Shahin, M.; Ahmed, B.; Ben Hamida, S.; Mulaffer, L.; Glos, M.; Penzel, T. Deep Learning and Insomnia: Assisting Clinicians With Their Diagnosis. IEEE J. Biomed. Health Inform. 2017, 21, 1546–1553.
Safi, M.; Safi, S. Early detection of Alzheimer’s disease from EEG signals using Hjorth parameters. Biomed. Signal Process. Control 2021, 65, 102338.
Jiang, X.; Bian, G.B.; Tian, Z. Removal of artifacts from EEG signals: A review. Sensors 2019, 19, 987.
Lai, C.Q.; Ibrahim, H.; Abdullah, M.Z.; Abdullah, J.M.; Suandi, S.A.; Azman, A. Artifacts and noise removal for electroencephalogram (EEG): A literature review. In Proceedings of the 2018 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang Island, Malaysia, 28–29 April 2018; pp. 326–332.
Sharma, M.; Acharya, U.R. Automated detection of schizophrenia using optimal wavelet-based l₁ norm features extracted from single-channel EEG. Cogn. Neurodynamics 2021, 1–14.
Rajput, J.S.; Sharma, M.; Tan, R.S.; Acharya, U.R. Automated detection of severity of hypertension ECG signals using an optimal bi-orthogonal wavelet filter bank. Comput. Biol. Med. 2020, 123, 103924.
Sharma, M.; Patel, S.; Acharya, U.R. Automated detection of abnormal EEG signals using localized wavelet filter banks. Pattern Recognit. Lett. 2020, 133, 188–194.
Sharma, M.; Acharya, U.R. A new method to identify coronary artery disease with ECG signals and time-Frequency concentrated antisymmetric biorthogonal wavelet filter bank. Pattern Recognit. Lett. 2019, 125, 235–240.
Sharma, M.; Shah, S. A novel approach for epilepsy detection using time–frequency localized bi-orthogonal wavelet filter. J. Mech. Med. Biol. 2019, 19, 1940007.
Sharma, M.; Acharya, U.R. Analysis of knee-joint vibroarthographic signals using bandwidth-duration localized three-channel filter bank. Comput. Electr. Eng. 2018, 72, 191–202.
Sharma, M.; Achuth, P.; Deb, D.; Puthankattil, S.D.; Acharya, U.R. An Automated Diagnosis of Depression Using Three-Channel Bandwidth-Duration Localized Wavelet Filter Bank with EEG Signals. Cogn. Syst. Res. 2018, 52, 508–520.
Sharma, M.; Bhurane, A.A.; Acharya, U.R. MMSFL-OWFB: A novel class of orthogonal wavelet filters for epileptic seizure detection. Knowl. Based Syst. 2018, 160, 265–277.
Zala, J.; Sharma, M.; Bhalerao, R. Tunable Q—Wavelet transform based features for automated screening of knee-joint vibroarthrographic signals. In Proceedings of the 2018 International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 22–23 February 2018.
Sharma, M.; Sharma, P.; Pachori, R.B.; Acharya, U.R. Dual-tree complex wavelet transform-based features for automated alcoholism identification. Int. J. Fuzzy Syst. 2018, 20, 1297–1308.
Tay, D.B.; Palaniswami, M. A novel approach to the design of the class of triplet halfband filterbanks. IEEE Trans. Circuits Syst. II Express Briefs 2004, 51, 378–383.
Sharma, M.; Dhere, A.; Pachori, R.B.; Gadre, V.M. Optimal duration-bandwidth localized antisymmetric biorthogonal wavelet filters. Signal Process. 2017, 134, 87–99.
Sharma, M.; Dhere, A.; Pachori, R.B.; Acharya, U.R. An automatic detection of focal EEG signals using new class of time–frequency localized orthogonal wavelet filter banks. Knowl. Based Syst. 2017, 118, 217–227.
Sharma, M.; Achuth, P.V.; Pachori, R.B.; Gadre, V.M. A parametrization technique to design joint time–frequency optimized discrete-time biorthogonal wavelet bases. Signal Process. 2017, 135, 107–120.
Phoong, S.M.; Kim, C.; Vaidyanathan, P.; Ansari, R. New class of two-channel biorthogonal filter banks and wavelet bases. IEEE Trans. Signal Process. 1995, 43, 649–665.
Hjorth, B. EEG analysis based on time domain properties. Electroencephalogr. Clin. Neurophysiol. 1970, 29, 306–310.
Ho Thanh Lam, L.; Le, N.H.; Van Tuan, L.; Tran Ban, H.; Nguyen Khanh Hung, T.; Nguyen, N.T.K.; Huu Dang, L.; Le, N.Q.K. Machine learning model for identifying antioxidant proteins using features calculated from primary sequences. Biology 2020, 9, 325.
Le, N.Q.; Hung, T.N.; Do, D.T.; Lam, L.H.; Dang, L.H.; Huynh, T.T. Radiomics-based machine learning model for efficiently classifying transcriptome subtypes in glioblastoma patients from MRI. Comput. Biol. Med. 2021, 132, 104320.
Zhou, Z.H. Ensemble Methods: Foundations and Algorithms; CRC Press: Boca Raton, FL, USA, 2012.
Dietterich, T.G. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15.
Friedman, J.H.; Bentley, J.L.; Finkel, R.A. An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. (TOMS) 1977, 3, 209–226.
Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
Sharma, M.; Agarwal, S.; Acharya, U.R. Application of an optimal class of antisymmetric wavelet filter banks for obstructive sleep apnea diagnosis using ECG signals. Comput. Biol. Med. 2018, 100, 100–113.
Sharma, M.; Raval, M.; Acharya, U.R. A new approach to identify obstructive sleep apnea using an optimal orthogonal wavelet filter bank with ECG signals. Informatics Med. Unlocked 2019, 61, 100170.
Le, N.; Nguyen, B. Prediction of FMN Binding Sites in Electron Transport Chains based on 2-D CNN and PSSM Profiles. IEEE/ACM Trans. Comput. Biol. Bioinform. 2019.
Abad, V.C.; Guilleminault, C. Diagnosis and treatment of sleep disorders: A brief review for clinicians. Dialogues Clin. Neurosci. 2003, 5, 371.

Figure 1. Flowchart of the proposed study.

Figure 2. Power Spectrum of an EEG signal: (a) raw and (b) filtered.

Figure 3. Hypnogram of a healthy subject with sleep stages scored according to R & K [11] criterion.

Figure 4. Frequency response of THFB filter pair.

Figure 5. Scaling and wavelet functions. (a) analysis scaling function; (b) synthesis scaling function; (c) analysis wavelet function; (d) synthesis wavelet function.

Table 1. Epoch distribution for data used in this study.

Sleep Stage	Healthy	Insomnia	Narcolepsy	NFLE	PLM	RBD	SDB	Total Epochs (Number)	Total Epochs (in %)
Wake	445	3801	1316	3230	1398	5266	211	15,667	19.87%
N1	280	223	301	1123	284	1048	26	3285	4.17%
N2	2172	2456	1708	10,768	2845	7446	251	27,646	35.06%
N3	1757	1085	1044	7207	1943	5386	288	18,710	23.73%
REM	1409	986	1258	5005	1341	3530	16	13,545	17.18%
Total	6063	8551	5627	27,333	7811	22,676	792	78,853
(in %)	7.69%	10.84%	7.14%	34.66%	9.91%	28.76%	1.00%	78,853

Table 2. Performance measures obtained for the automated classification of healthy and insomnia classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	95.64	0.74	0.96	EBooT	96.49	0.79	0.96	SVM	97.48	0.86	0.96	SVM
N1	85.88	0.71	0.94	EBT	90.85	0.81	0.95	EBooT	91.25	0.82	0.98	EBooT
N2	91.12	0.82	0.97	EBT	95.48	0.91	0.99	EBooT	96.93	0.94	0.99	EBooT
N3	95.88	0.91	0.99	EBooT	98.10	0.96	1.00	EBooT	99.23	0.98	1.00	EBooT
REM	96.08	0.91	0.99	EBooT	96.41	0.93	1.00	EBooT	98.16	0.96	0.99	EBooT
LSS	90.43	0.80	0.97	EBT	95.24	0.90	0.99	EBT	96.08	0.92	0.99	EBooT
NREM	92.14	0.84	0.98	EBT	95.92	0.92	0.99	EBT	96.69	0.93	0.99	SVM
ALL	93.60	0.86	0.98	EBT	96.36	0.92	0.99	EBT	96.63	0.93	0.99	EBT

Table 3. Performance measures obtained for the automated classification of healthy and narcolepsy classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	90.40	0.73	0.95	EBooT	94.00	0.84	0.96	SVM	95.58	0.85	0.98	EBooT
N1	89.67	0.79	0.96	SVM	94.49	0.89	0.98	EBT	95.01	0.90	0.92	EBooT
N2	92.91	0.85	0.98	EBooT	96.13	0.92	0.99	EBooT	97.22	0.94	1.00	EBooT
N3	93.61	0.86	0.98	EBT	97.25	0.94	1.00	EBooT	98.21	0.96	1.00	EBooT
REM	86.69	0.79	0.97	EBooT	95.80	0.92	0.99	EBooT	97.53	0.95	1.00	EBooT
LSS	92.04	0.84	0.97	EBooT	95.56	0.91	0.99	EBooT	96.97	0.94	1.00	EBooT
NREM	92.70	0.85	0.98	EBT	95.77	0.91	0.99	EBT	97.09	0.94	1.00	EBooT
ALL	91.15	0.82	0.97	EBT	95.36	0.91	0.99	EBT	95.95	0.92	0.99	EBooT

Table 4. Performance measures obtained for the automated classification of healthy and NFLE classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	90.18	0.41	0.85	SVM	92.27	0.50	0.90	EBooT	92.91	0.66	0.92	SVM
N1	87.31	0.55	0.84	SVM	92.52	0.74	0.95	EBooT	93.73	0.80	0.95	SVM
N2	88.94	0.52	0.89	EBT	95.70	0.83	0.98	EBT	96.16	0.85	0.99	EBT
N3	92.76	0.75	0.96	EBT	95.94	0.86	0.99	EBooT	96.17	0.87	0.99	EBooT
REM	89.07	0.65	0.93	EBT	95.40	0.89	0.98	EBT	96.45	0.89	0.99	EBooT
LSS	88.34	0.50	0.88	EBT	95.76	0.83	0.98	EBT	96.21	0.86	0.99	EBT
NREM	89.64	0.59	0.91	EBT	96.04	0.86	0.98	EBT	95.93	0.85	0.99	EBT
ALL	88.86	0.55	0.90	EBT	95.71	0.84	0.98	EBT	94.33	0.79	0.98	EBT

Table 5. Performance measures obtained for the automated classification of healthy and PLM classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	90.02	0.71	0.95	EBooT	92.40	0.78	0.96	SVM	95.39	0.87	0.98	SVM
N1	89.54	0.79	0.96	EBooT	93.44	0.87	0.95	SVM	95.57	0.91	0.98	SVM
N2	92.49	0.87	0.97	EBT	97.03	0.94	0.99	EBT	96.99	0.94	1.00	EBooT
N3	93.68	0.82	0.98	EBT	97.89	0.96	1.00	EBooT	98.30	0.97	1.00	EBooT
REM	90.87	0.85	0.97	EBT	98.07	0.96	1.00	EBooT	98.55	0.97	1.00	EBooT
LSS	91.99	0.84	0.97	EBT	96.51	0.93	0.99	EBooT	96.95	0.94	1.00	EBooT
NREM	91.65	0.83	0.97	EBT	96.65	0.93	1.00	EBooT	97.20	0.94	1.00	EBooT
ALL	90.77	0.81	0.97	EBT	95.81	0.91	0.99	EBT	97.01	0.94	1.00	EBT

Table 6. Performance measures obtained for the automated classification of healthy and RBD classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	95.08	0.58	0.90	SVM	96.36	0.71	0.95	SVM	97.30	0.80	0.98	SVM
N1	91.27	0.72	0.92	SVM	95.56	0.86	0.98	EBooT	95.56	0.86	0.99	EBooT
N2	93.41	0.80	0.96	EBT	96.43	0.90	0.99	EBT	97.26	0.92	0.99	SVM
N3	95.95	0.89	0.99	EBT	98.33	0.95	1.00	EBooT	98.82	0.97	1.00	EBooT
REM	93.16	0.83	0.96	EBT	97.37	0.93	0.99	EBT	98.00	0.95	1.00	EBooT
LSS	92.99	0.79	0.96	EBT	96.29	0.89	0.99	EBT	97.07	0.91	0.99	SVM
NREM	93.17	0.82	0.97	EBT	96.29	0.91	0.99	EBT	96.99	0.91	0.99	EBT
ALL	93.41	0.79	0.96	EBT	97.26	0.92	0.99	EBT	97.09	0.91	0.99	EBT

Table 7. Performance measures obtained for the automated classification of healthy and SDB classes.

Data Subset	C4-A1				F4-C4				C4-A1 + F4-C4
Data Subset	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier	Accuracy (%)	$κ$	AUC	Classifier
W	90.24	0.77	0.95	EBooT	95.47	0.90	0.98	SVM	96.88	0.93	1.00	SVM
N1	93.79	0.68	0.97	RUSBooT	96.08	0.68	0.97	SVM	98.69	0.92	1.00	SVM
N2	97.07	0.83	0.98	EBooT	98.84	0.94	0.99	EBooT	99.46	0.97	1.00	SVM
N3	95.79	0.82	0.96	EBooT	98.73	0.95	1.00	RUSBooT	98.83	0.95	1.00	SVM
REM	98.88	0.00	0.68	EBT	98.88	0.00	0.94	EBT	99.30	0.64	0.78	KNN
LSS	96.81	0.81	0.98	EBooT	98.61	0.92	1.00	EBooT	99.30	0.96	1.00	SVM
NREM	95.48	0.76	0.97	EBT	99.04	0.95	1.00	EBooT	99.35	0.97	1.00	EBooT
ALL	94.09	0.66	0.96	EBT	98.41	0.92	0.99	EBooT	98.79	0.94	1.00	EBooT
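
Accuracy alone can mislead when the classes are highly imbalanced, which is why κ is reported next to it. The Python sketch below computes Cohen's κ from a confusion matrix of counts. The function names and the example matrix are illustrative. The example assumes a hypothetical classifier that labels every REM epoch as healthy, using the REM counts from Table 1 (1409 healthy, 16 SDB). The source does not state that this is what happened, but the assumption yields the same 98.88% accuracy and κ of 0.00 as the single-channel REM entries in Table 7.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square count matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa; confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    observed = accuracy(confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (observed - expected) / (1.0 - expected)


# Hypothetical outcome: every REM epoch is predicted as healthy.
# Rows are true classes (healthy, SDB); columns are predicted classes.
all_healthy = [[1409, 0], [16, 0]]

print(round(100 * accuracy(all_healthy), 2))  # accuracy in percent
print(cohen_kappa(all_healthy))               # chance-corrected agreement
```

Under this assumption the accuracy is high only because healthy epochs dominate the REM subset, and κ exposes that the prediction carries no information about the minority class.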

Table 8. Classification accuracy (ACC) and κ obtained for identifying the type of sleep disorder (seven-class classification).

Data Subset	Epochs	C4-A1		F4-C4		C4-A1 + F4-C4
Data Subset	Epochs	ACC (%)	$κ$	ACC (%)	$κ$	ACC (%)	$κ$
W	15667	70.3	0.60	81.5	0.75	83.4	0.78
N1	3285	67.2	0.54	80.4	0.72	82.4	0.76
N2	27646	71.3	0.60	85.5	0.80	89.3	0.85
N3	18710	82.0	0.75	89.8	0.86	91.3	0.88
REM	13545	76.6	0.68	87.4	0.83	90.6	0.87
LSS	30931	70.1	0.65	85.6	0.80	88.6	0.84
NREM	49641	73.4	0.45	88.0	0.48	88.9	0.85
ALL	78853	71.1	0.60	86.7	0.82	87.0	0.83
Classifier: Ensemble Bagged Trees

Table 9. Confusion matrix and performance metrics obtained for seven-class classification using the N3 sleep stage with a combination of the F4-C4 and C4-A1 EEG channels.

Confusion Matrix (rows: true class; columns: predicted class)								Performance Metrics
True Class	Healthy	Insomnia	NFLE	Narcolepsy	RBD	SDB	PLM	Accuracy	Precision	Recall	F1 Score
Healthy	73.6%	0.9%	19.7%	0.4%	4.8%	0.2%	0.3%	97.62%	0.94	0.74	0.83
Insomnia	0.4%	81.3%	11.8%	0.9%	5.1%	0.0%	0.5%	96.99%	0.90	0.81	0.85
NFLE	0.5%	1.0%	93.3%	0.4%	4.2%	0.0%	0.6%	91.53%	0.84	0.93	0.88
Narcolepsy	0.6%	3.1%	13.0%	73.1%	9.0%	0.1%	1.1%	97.65%	0.92	0.73	0.82
RBD	0.3%	0.9%	3.7%	0.4%	94.0%	0.0%	0.7%	93.77%	0.86	0.94	0.90
SDB	1.3%	0.4%	11.6%	1.1%	32.7%	52.4%	0.5%	99.48%	0.93	0.52	0.67
PLM	0.2%	0.7%	12.7%	0.4%	11.7%	0.0%	74.3%	96.92%	0.92	0.74	0.82
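
Two relations tie the columns of Table 9 together: the recall of each class is the diagonal entry of the row-normalized confusion matrix, and the F1 score is the harmonic mean of precision and recall. The Python sketch below checks both against the tabulated values. The names TABLE_9, f1_score, and check_table are illustrative, and the 0.01 tolerance is an assumption that allows for the two-decimal rounding of the published precision and recall.

```python
# Per-class values copied from Table 9:
# (diagonal of the confusion matrix in %, precision, recall, F1 score).
TABLE_9 = {
    "Healthy": (73.6, 0.94, 0.74, 0.83),
    "Insomnia": (81.3, 0.90, 0.81, 0.85),
    "NFLE": (93.3, 0.84, 0.93, 0.88),
    "Narcolepsy": (73.1, 0.92, 0.73, 0.82),
    "RBD": (94.0, 0.86, 0.94, 0.90),
    "SDB": (52.4, 0.93, 0.52, 0.67),
    "PLM": (74.3, 0.92, 0.74, 0.82),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def check_table(table, tolerance=0.01):
    """Return, per class, whether recall and F1 agree with the table."""
    report = {}
    for name, (diagonal, precision, recall, f1) in table.items():
        recall_ok = round(diagonal / 100, 2) == recall
        f1_ok = abs(f1_score(precision, recall) - f1) <= tolerance
        report[name] = (recall_ok, f1_ok)
    return report


for name, flags in check_table(TABLE_9).items():
    print(name, flags)
```

The SDB row shows why both metrics are needed: its precision is 0.93, but only 52.4% of SDB epochs are recognized, with 32.7% assigned to RBD, so its F1 score is the lowest of the seven classes.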

Table 10. Summary of results obtained for sleep disorder classification using various classifiers with hold-out validation strategy.

Classifier	Accuracy (%)
Classifier	Insomnia	Narcolepsy	NFLE	PLM	RBD	SDB
EBooT	98.6	97.3	95.2	98.4	99.1	98.0
KNN	96.1	93.2	90.0	93.6	96.6	95.6
SVM	94.9	88.2	84.8	93.9	96.7	96

Table 11. Classification results obtained with and without wavelet decomposition for the N3 sleep stage using both EEG channels (C4-A1 + F4-C4). Results are obtained using 10-fold CV and the EBT classifier.

Disorder	Without Wavelet Decomposition		With Wavelet Decomposition
Disorder	Accuracy (%)	AUC	Accuracy (%)	AUC
Insomnia	83.55	0.89	99.23	1.00
Narcolepsy	82.70	0.89	98.21	1.00
NFLE	80.97	0.87	96.17	0.99
PLM	82.77	0.88	98.30	1.00
RBD	83.21	0.89	98.82	1.00
SDB	83.30	0.89	98.83	1.00
seven-class	76.87	NA	91.30	NA

Table 12. Comparison with other studies using CAP sleep database.

Study	Signal	Features	Classifier	Sleep Disorders	Overall Accuracy (%)
Sharma et al. [22]	ECG	Norm features	KNN and SVM	Insomnia	97.87%
Widasari et al. [24]	ECG	Spectral features and sleep quality parameters	Ensemble of bagged trees	Healthy, Insomnia, RBD and SDB	86.27%
Proposed Method	EEG	Hjorth parameters	Ensemble Bagged trees	Healthy, Insomnia, RBD and SDB	96.5%
Proposed Method	EEG	Hjorth parameters	Ensemble Bagged and Boosted trees	Healthy vs. Insomnia vs. Narcolepsy vs. PLM vs. SDB vs. RBD vs. NFLE	91.30%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Sharma, M.; Tiwari, J.; Patel, V.; Acharya, U.R. Automated Identification of Sleep Disorder Types Using Triplet Half-Band Filter and Ensemble Machine Learning Techniques with EEG Signals. Electronics 2021, 10, 1531. https://doi.org/10.3390/electronics10131531

