1. Introduction
Sleep is an important aspect of human life, and it greatly affects our mental and physical health. Sleep consists of a periodic repetition of unconsciousness with physical inactivity, called non-rapid eye movement (NREM) sleep, followed by a period of high activity called rapid eye movement (REM) sleep. REM and NREM sleep support important brain functions, including memory consolidation, clearance of metabolites from the brain, and cellular restoration. However, the entire process is not yet completely understood. To describe the macrostructure of sleep, it is categorized into five stages: wakefulness (W), N1, N2, N3, and REM, according to the guidelines of the American Academy of Sleep Medicine (AASM) [
1]. The distinction is made on 30 s windows of the electroencephalogram (EEG) signal by trained medical practitioners. Sleep stages N1, N2, and N3 form the NREM part of the sleep cycle, which is followed by REM. Many studies in the literature address the macrostructure of sleep, and researchers have developed models for automated classification of sleep stages using machine learning techniques and polysomnographic (PSG) recordings [
2,
3,
4,
5,
6,
7,
8,
9]. Recently, deep learning-based methods have also been employed for sleep scoring [
10,
11,
12].
However, ephemeral events such as K-complexes and transient power alterations in frequency bands are neglected by these macrostructure-based sleep scoring rules. In the AASM guidelines, only the arousal definition captures short periods of change in cortical activation. Although phasic events like K-complexes and delta bursts show characteristics similar to arousals, they are not considered arousals if there is no short-term frequency increase in the EEG [
13,
14]. To overcome these shortcomings of macrostructure based sleep scoring, a new microstructure based sleep scoring technique named cyclic alternating pattern (CAP) was devised, which includes such phasic events in brain activity [
15] as an alternative scheme to describe NREM sleep. CAP is found to be useful in the detection of insomnia, sleep apnea syndrome, epileptic disorders and periodic limb movements [
16]. Ferini-Strambi et al. have observed the impact of CAP on heart rate variability during sleep in healthy young adults [
17]. They concluded that cardiac autonomic function in normal subjects is influenced by the physiological fluctuations of the EEG arousal level. Studies have found that, during coma, the CAP in the EEG signal is correlated with motor activity, cardiorespiratory rate and cerebrospinal fluid pressure. These variables were observed to increase during phase A and decrease during phase B [
18,
19].
NREM sleep is observed to have alternating patterns of cerebral activation (phase A) followed by periods of deactivation (phase B), which separate two or more consecutive phase A periods [
15]. CAP phase A typically includes events like K-alpha, K-complex sequences, delta bursts, alpha waves, vertex sharp transients and arousals. The duration of each phase may vary between 2 s and 60 s. Two successive phase A events are considered a single phase A event if the separation between them is less than 2 s [
20]. The combination of phase A and phase B is termed as a CAP cycle and this cycle begins with phase A and ends with phase B. Two successive CAP cycles are needed to form a CAP sequence [
21]. If a phase A is not followed by a phase B, it is termed an isolated phase A and is considered non-CAP (an absence of CAP for >60 s). Thus, a CAP sequence contains a minimum of three A phases (A–B–A–B–A) followed by a non-CAP period [
22]. However, there is no upper limit in terms of overall duration or number of CAP cycles, and approximate mean duration of a CAP sequence in healthy young adults is around 150 s containing six CAP cycles [
23]. In general, a CAP sequence always follows a continuous NREM sleep EEG pattern with a minimum duration of 60 s. CAP phase A can be detected using any EEG lead [
22].
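The scoring rules above can be illustrated with a minimal Python sketch. It assumes that phase A events are already available as (onset, offset) pairs in seconds; the function names and the interval representation are hypothetical and are not taken from the original study. The sketch applies only the rules stated in the text (fusing A phases less than 2 s apart, ending a sequence when the gap exceeds 60 s, and requiring at least two CAP cycles) and omits the other checks a full scorer would need.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (onset_s, offset_s) of one scored phase A event

MERGE_GAP_S = 2.0   # A phases closer than this are treated as one A phase
MAX_PHASE_S = 60.0  # a gap longer than this is non-CAP and ends the sequence
MIN_CYCLES = 2      # at least two consecutive CAP cycles form a CAP sequence


def merge_close_a_phases(events: List[Interval]) -> List[Interval]:
    """Fuse successive A phases separated by less than MERGE_GAP_S."""
    merged: List[Interval] = []
    for onset, offset in sorted(events):
        if merged and onset - merged[-1][1] < MERGE_GAP_S:
            merged[-1] = (merged[-1][0], max(merged[-1][1], offset))
        else:
            merged.append((onset, offset))
    return merged


def find_cap_sequences(events: List[Interval]) -> List[List[Interval]]:
    """Group A phases into CAP sequences; isolated A phases are dropped."""
    sequences: List[List[Interval]] = []
    current: List[Interval] = []
    for phase in merge_close_a_phases(events):
        # The gap between two A phases plays the role of phase B.
        if current and phase[0] - current[-1][1] > MAX_PHASE_S:
            if len(current) - 1 >= MIN_CYCLES:  # n A phases hold n-1 cycles
                sequences.append(current)
            current = []
        current.append(phase)
    if len(current) - 1 >= MIN_CYCLES:
        sequences.append(current)
    return sequences
```

For example, the events (0, 5), (6, 8), (30, 35), (60, 66) and (200, 204) yield one CAP sequence of three A phases, because the first two events are fused and the last one is isolated.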
Figure 1 shows the typical image of a CAP cycle for multiple EEG channels and
Figure 2 displays the typical waveforms of phase A and phase B.
High-amplitude slow EEG waves increase with increasing depth of sleep, whereas low-amplitude fast rhythms are dominant in REM sleep. CAP phase A is subdivided into three subtypes, A1, A2, and A3, based on the proportions of high-amplitude slow waves and low-amplitude fast rhythms. Subtype A1 is characterised by high-amplitude slow waves covering >80% of the entire phase A duration; low-amplitude fast rhythms, if present, cover <20% of the total phase A duration. Subtype A2 is characterised by a mixture of fast and slow EEG waves, with low-amplitude fast rhythms covering around 20–50% of the entire phase A duration. Subtype A3 is characterised by dominant low-amplitude fast rhythms covering >50% of the phase A duration. CAP sequences triggered or interrupted by body movements are also distinguished as subtype A3 [
22]. The CAP parameters include CAP rate, CAP time, and the number of A1, A2, and A3 phases per hour. CAP rate refers to the percentage ratio of CAP time to total NREM sleep time. CAP time refers to the total duration of all CAP sequences and increases with the number of CAP cycles. CAP rate has very low variability for healthy sleepers, although it is observed to vary with age. It is very low for toddlers (around 13%), gradually increases with age and peaks during the peripubertal stage (around 62%), then decreases again for adults and the middle-aged (around 37%), followed by an increase in the elderly (55%) [
22].
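As a small worked illustration of these definitions, the Python sketch below maps the fraction of a phase A occupied by low-amplitude fast rhythms to a subtype and computes the CAP rate. The function names are hypothetical. The 20% and 50% boundaries are an assumption inferred from the >80% slow-wave criterion for A1 and the 20–50% range given for A2, and the body-movement rule for A3 is not modelled.

```python
def cap_subtype(fast_fraction: float) -> str:
    """Return 'A1', 'A2' or 'A3' from the fraction (0..1) of phase A
    occupied by low-amplitude fast rhythms (assumed 20% / 50% boundaries)."""
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must lie in [0, 1]")
    if fast_fraction < 0.20:
        return "A1"
    if fast_fraction <= 0.50:
        return "A2"
    return "A3"


def cap_rate(cap_time_s: float, nrem_time_s: float) -> float:
    """CAP rate as a percentage: CAP time divided by total NREM time."""
    if nrem_time_s <= 0:
        raise ValueError("nrem_time_s must be positive")
    return 100.0 * cap_time_s / nrem_time_s
```

For instance, 5400 s of CAP time within 14,400 s of NREM sleep gives a CAP rate of 37.5%, close to the value quoted above for adults.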
Thus, the detection of CAP phases and the estimation of CAP parameters are essential for accurate sleep analysis. However, manual CAP detection by human scorers is cumbersome and prone to errors. In the literature, a few attempts have been made to identify CAP phases automatically using computer-aided systems [
24,
25,
26,
27,
28]. However, studies on the automated detection of sleep phasic events remain very few. Also, the machine learning models developed in the above studies have been tested on a very small number (5–10) of good sleepers, without considering sleep-disordered patients. Further, despite using a large number of discriminating features, the classification performance is not very high. Hence, there is a need for a high-performing model tested on a significant number of subjects, involving both healthy controls and sleep-disordered patients. It is also desirable that the model employ a small number of features for training and testing so that it can be implemented in real-time applications.
In the proposed study, we have used a large number of subjects, comprising healthy subjects as well as sleep-disordered patients suffering from six different disorders. We have implemented an ensemble learning method employing wavelet-based Hjorth and entropy features extracted from the monopolar C4-A1 and bipolar F4-C4 EEG channels to develop an automated model for the detection of CAP phases. The developed model achieved better classification performance than existing state-of-the-art studies. Our method is simple and computationally efficient, and hence it may be deployed in clinical applications.
2. Material Used
This study is accomplished using the public
CAP sleep database, which contains night-long sleep polysomnographic (PSG) recordings of 108 subjects logged at the Sleep Disorders Center of the Ospedale Maggiore of Parma, Italy. The database contains recordings of 16 healthy subjects and patients suffering from insomnia (9), narcolepsy (5), nocturnal frontal lobe epilepsy (NFLE, 40), periodic leg movement (PLM, 10), REM behaviour disorder (RBD, 22), and sleep-disordered breathing (SDB, 4). Each PSG recording contains multiple channels including EEG, electrocardiogram (ECG), electromyogram (EMG), electrooculogram (EOG), SpO2 and respiratory signals. EEG channels include traces like F4-C4, C4-A1, C4-P4, P4-O2, Fp2-F4, Fp1-F3, C3-P3, P3-O1 and other combinations of F3/F4/C3/C4/O1/O2 with reference A1/A2. The sampling frequencies of these EEG channels vary from 100 Hz to 512 Hz. The bipolar EEG channel F4-C4 and the monopolar EEG channel C4-A1 are present in the largest number of sleep recordings. Based on the availability of the C4-A1 and F4-C4 channels together with a 512 Hz sampling frequency, we have taken 77 subjects for the identification of CAP phase A in this study. Terzano et al. [
21] suggested that bipolar EEG leads are favourable for CAP phase detection. We have therefore evaluated the performance of each of the two leads (F4-C4 and C4-A1) independently and in combination. We have performed CAP phase classification using EEG signals from healthy subjects and from sleep-disordered patients, both individually and combined.
Table 1 shows gender and age details of healthy controls and sleep disordered patients used in this study.
4. Results
The entire experimentation and training related to our study was accomplished using MATLAB R2020a installed on an Ubuntu server 18.04. The specifications of the server are: an Intel Xeon E5-2690 v3 CPU @2.6 GHz (6 cores), 56 GB RAM and a 12 GB Nvidia K80 GPU.
In this work, we have used the monopolar C4-A1 and bipolar F4-C4 EEG channels for the identification of CAP phases A and B using sleep recordings of 77 subjects. The database contains recordings of healthy subjects and of patients with a variety of sleep disorders: NFLE, SDB, narcolepsy, PLM, insomnia and RBD. We have extracted and analyzed CAP phase information for each type of sleep disorder separately. The results are given below for both EEG montages, individually and combined. The classification performance was evaluated in terms of average classification accuracy (ACA), precision (Pcn), recall (Rcl), F1-score (F1), Cohen’s kappa (κ) and area under the curve (AUC).
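The threshold-free metrics listed above can be computed directly from a 2×2 confusion matrix. The Python sketch below shows the standard definitions; the function name and argument order are hypothetical, phase A is arbitrarily treated as the positive class, and AUC is omitted because it requires classifier scores rather than counts.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and Cohen's kappa from the four
    cells of a binary confusion matrix (positive class = phase A)."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Chance agreement: product of actual and predicted marginals per class.
    p_e = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - p_e) / (1 - p_e)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}
```

When the two classes are balanced, the chance agreement is 0.5 and κ reduces to 2 × accuracy − 1. This is consistent with, for example, the reported pair of 84% ACA and κ = 0.68 for NFLE patients.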
We have taken six healthy subjects and extracted 9930 epochs (2 s duration) of phase A and phase B for the two EEG channels. Although the features were applied to several classifiers, the best performance in classifying CAP phases was obtained using the ensemble boosted trees and ensemble bagged trees classifiers. The model attained an ACA of 75.7% and 76.4% for the C4-A1 and F4-C4 channels, respectively. It is evident that the bipolar channel F4-C4 performed better than its monopolar counterpart. Combining the features extracted from both channels yielded a better ACA of 83.30%. The confusion matrices corresponding to the three classification tasks for good sleepers can be seen in
Table 4. It can be observed from
Table 4 that, for healthy sleepers, both phases are detected equally well when the features of both channels are classified together.
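The 2 s epoching used here can be sketched in a few lines of Python. At the 512 Hz sampling rate of the selected recordings, a 2 s epoch holds 1024 samples. The function name is hypothetical, and the use of non-overlapping windows with any trailing partial window discarded is an assumption, since the text does not state how epoch boundaries were chosen.

```python
from typing import List, Sequence


def segment_epochs(signal: Sequence[float], fs_hz: int = 512,
                   epoch_s: float = 2.0) -> List[Sequence[float]]:
    """Cut a 1-D signal into consecutive, non-overlapping epochs.
    A trailing partial epoch is discarded."""
    n = int(round(fs_hz * epoch_s))  # samples per epoch (1024 at 512 Hz, 2 s)
    if n <= 0:
        raise ValueError("epoch length must be at least one sample")
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
```

Calling the same function with `epoch_s=1.0` gives 512-sample epochs, which corresponds to a shorter 1 s analysis window.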
For insomnia patients, the CAP sleep database contains seven EEG recordings with both montages, sampled at 512 Hz, with an average duration of 575 min. We have classified a total of 11328 epochs (2 s) into phase A and phase B. We have achieved a maximum ACA of 71.4% and 72.5% for the C4-A1 and F4-C4 channels, respectively. It can be seen that, individually, the bipolar channel F4-C4 performed better than the monopolar channel C4-A1. By combining both EEG channels, the ACA increased to 76.5%. The confusion matrix and performance parameters for insomnia subjects are shown in
Table 5.
The CAP sleep database contains EEG recordings of five narcolepsy patients, with both montages sampled at 512 Hz and an average duration of 494 min. We have taken 10086 epochs (2 s each) of phase A and phase B. The best ACA achieved is 77.20%. The confusion matrices for all three classification tasks can be seen in
Table 6.
The CAP database contains 27 EEG recordings of NFLE patients, sampled at 512 Hz, with an average duration of 505 min. We have classified 73236 epochs (2 s duration) into phase A and phase B. Among all classifiers, the ensemble bagged trees classifier gave the best performance, with a maximum ACA of 84%, an F1-score of 0.84 and a Cohen’s κ coefficient of 0.68 using both channels. The confusion matrices and performance parameters corresponding to classification using C4-A1, F4-C4 and the combination of both can be seen in
Table 7.
The CAP database contains nine PLM patients’ EEG recordings for both montages, at a sampling rate of 512 Hz and with an average duration of 431 min. The maximum ACA, F1-score and Cohen’s κ obtained are 77%, 0.76 and 0.54, respectively.
Table 8 shows the confusion matrix and performance parameters for patients suffering from PLM.
The EEG recordings of 22 RBD patients obtained from the CAP sleep database were segmented into a total of 39654 epochs of 2 s, each containing either phase A or phase B. The average duration of these recordings is 514 min. For classifying the phases of RBD patients, the proposed model attained the best ACA of 72%, 70%, and 77% for the F4-C4 channel, the C4-A1 channel and the combined channels, respectively. An F1-score and Cohen’s κ of 0.71 and 0.44, respectively, were obtained with the features from the combined channels.
Table 9 shows results obtained using RBD patients.
The CAP sleep database contains one SDB patient’s EEG recording for both channels, with a sampling rate of 512 Hz and a duration of 396 min. We segmented the EEG signals into 1668 epochs (2 s each) comprising phases A and B. Using the ensemble bagged trees (EBagT) classifier on the extracted features with the C4-A1 and F4-C4 channels separately, the model attained an ACA of 74.9% with 10-fold cross-validation, whereas the combined features from both channels produced an ACA of 81.5%. We have obtained an F1-score and Cohen’s κ equal to 0.81 and 0.63, respectively, for the combined features. The confusion matrix and performance parameters obtained for SDB patients are shown in
Table 10.
After considering the detection of phases for each group of sleep-disordered patients and healthy controls individually, we combined all 77 subjects, with a total of 165,960 epochs corresponding to phases A and B. The model attained an ACA of 71.4% and 71.0% with the C4-A1 and F4-C4 channels, respectively, using the ensemble bagged trees classifier with 10-fold cross-validation (CV). On combining the features obtained from both channels, the ACA increased to 78.0%, with an F1-score and Cohen’s kappa of 0.77 and 0.56, respectively. It is interesting to note that both channels performed almost equally well in identifying phases when used separately; however, combining the features from both montages gives a gain of around 7% in ACA. It can be noticed from the results that, for narcolepsy, NFLE and PLM patients, the C4-A1 channel performed better than F4-C4, whereas the F4-C4 channel performed better for healthy, insomnia and RBD subjects. Both channels performed almost the same for SDB patients.
Table 11 shows the classification results when all subjects are taken together. It can be noted from
Table 12 that the highest classification performance is obtained using the EBagT classifier compared with the other classifiers used in our work.
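The 10-fold cross-validation used for these results can be illustrated with a small index splitter in Python. The function name is hypothetical, and the shuffled, epoch-level split is an assumption: the text does not state whether folds were formed by epoch or by subject. Each fold serves once as the test set while the remaining nine are used for training.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k shuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]
```

The ACA would then be the mean of the accuracies obtained on the k test folds.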
5. Discussion
In the literature, studies on the detection of microstructure CAP phasic events are sparse. On the other hand, a plethora of studies are available on the identification of sleep macrostructure events, including sleep stage scoring and the identification of sleep disorders. Moreover, the handful of studies on CAP phase identification have used only healthy controls, with the exceptions of Hartmann et al. [
26] and Mendonca et al. [
28], which have included either NFLE or SDB patients. In this proposed study, we have performed CAP phase identification using all 77 subjects comprising six types of sleep disordered patients having NFLE, SDB, narcolepsy, PLM, insomnia and RBD along with good sleepers.
We have also repeated the whole experiment with a 1 s EEG window in addition to the standard 2 s window length. It can be observed from
Table 4,
Table 5,
Table 6,
Table 7,
Table 8,
Table 9,
Table 10,
Table 11 and
Table 13 that the phase classification performance is better for 2 s epochs than for 1 s epochs. Although the performance achieved using 1 s EEG epochs is inferior, the signal processing burden on the system is reduced.
Our results reveal that for healthy, insomnia, and RBD subjects, bipolar EEG analysis is better, which is in line with the observations made by Parrino et al. [
22]. However, for narcolepsy, NFLE and PLM subjects, the monopolar EEG channel is found to be better. For accurate analysis, we believe that both channels are equally important. To the best of our knowledge, no clinical reason has been reported for the superiority of either channel type (monopolar or bipolar). Nevertheless, our results reveal that when both monopolar and bipolar EEG channels are used together, the performance of the model improves for sleep-disordered patients as well as for healthy good sleepers.
We have computed the microstructure details of subjects used in this study, like CAP rate, CAP time and NREM time (
Table 14). It is evident from
Table 14 that the average CAP rate is found to be 0.56 when all subjects (healthy and sleep-disordered) are taken together. In the table, we have also reported the CAP rate for the six groups of sleep-disordered subjects and for the healthy controls when they are considered separately. It can be noticed from the table that the CAP rate of healthy subjects is the lowest (0.41) and is higher for sleep-disordered subjects. The CAP rate for SDB patients is the highest (0.78) among all. RBD patients showed a lower CAP rate (0.49) compared to the other sleep disorders. Hence, CAP rate can be used as a measure to indicate sleep quality. The CAP rate either increases or decreases sharply in sleep disorder cases. More precisely, the CAP rate increases in insomnia [
52,
53,
54,
55,
56], apnea [
57], PLM [
58], NFLE [
59], and depression [
60,
61]. The CAP rate decreases in conditions like narcolepsy [
62], continuous positive airway pressure (CPAP) treatment in apnea [
63,
64,
65,
66] and neurodegenerative disorders, like Alzheimer’s disease [
67].
Table 15 shows the comparison of our proposed method with previously reported state-of-the-art studies. Researchers have employed various techniques for CAP phase detection. All the studies mentioned in the table have used the CAP sleep database, with the majority of the work being performed on EEG signals of healthy subjects.
Mendez et al. [
25] used unbalanced data with 3963 Phase A events from only ten healthy adult subjects. They have used K-nearest neighbour (KNN) classifier and features including energy, sample entropy, standard deviation, Tsallis entropy and frequency band indices. They obtained an accuracy and sensitivity of around 80% and specificity of 70%. Navona et al. [
24] have achieved an accuracy of 77% using EEG band descriptors and thresholding. Hartmann et al. [
26] have used 16 healthy sleepers and 30 nocturnal frontal lobe epilepsy (NFLE) suffering subjects obtained from CAP sleep database. They achieved an accuracy of 82.42% for healthy subjects. The epoch length considered in their study is variable with a duration of 1–3 s. Dhok et al. [
27] have used balanced data with 4653 occurrences of phase A and phase B each, from six healthy subjects of CAP sleep database for automated CAP phase classification. They used Wigner-Ville distribution based feature extraction and support vector machine (SVM) classifier to achieve classification accuracy of 72.35%. Mendonca et al. [
28] have used time series analysis, Matrix of Lags and SVM classifier and obtained classification accuracy of 77% using ECG signals of 60 s duration. Recently, Loh et al. [
68] developed a deep neural network (1D-CNN) model for CAP phase classification and obtained an accuracy of 73.64%. Mariani et al. [
69] have observed that Hjorth activity is a better descriptor for CAP A phases and helps to achieve better classification performance between phase A and phase B, which is in line with the findings of our proposed study. It can be observed from the table that most of the studies have used only imbalanced data, in which case the developed model may be biased towards the majority class and cannot be considered an ideal fit for a clinical application. The proposed study employed balanced data to overcome bias, underfitting and overfitting problems. The proposed model attained an ACA of 83% for healthy subjects, which is better than most of the studies presented in
Table 15.
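One common way to obtain balanced data of this kind is random undersampling of the majority class, sketched below in Python. This is an illustrative assumption only: the text does not state which balancing procedure was used, and the function name and arguments are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Sequence, Tuple


def undersample(epochs: Sequence, labels: Sequence[Hashable],
                seed: int = 0) -> Tuple[List, List]:
    """Randomly keep the same number of epochs from every class,
    equal to the size of the smallest class. Original order is kept."""
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    n = min(len(v) for v in by_class.values())
    keep = sorted(i for v in by_class.values() for i in rng.sample(v, n))
    return [epochs[i] for i in keep], [labels[i] for i in keep]
```

With three phase A epochs and seven phase B epochs, the function returns three epochs of each class.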
The key attributes and benefits of our study are as follows:
We have employed openly available CAP sleep database for easy reproducibility and to make it easy for other researchers to compare their work with this study.
We have used only two EEG channels to reduce complexity and discomfort to patients. The simultaneous use of both C4-A1 and F4-C4 EEG channels improved the performance which is evident from our results.
In addition to healthy subjects, we have also used subjects from six different CAP database cohorts like insomnia, narcolepsy, NFLE, PLM, RBD, and SDB.
Unlike other studies, the proposed method uses fewer features, which leads to lower computational complexity.
Along with 2-second EEG epochs, the simulations are also done with 1-second epochs.
Taking into account the non-stationary nature of EEG signals, we have used wavelet-based Hjorth and entropy features computed with an orthogonal filter bank. It can be observed from
Table 4 and
Table 16 that, the performance parameters for phase A and phase B classification of healthy subjects are lower if wavelet analysis is not employed.
We have used balanced data in order to obtain robust classification.
Our developed model is simple without involving much computational complexity and hence can be deployed in real time applications by medical practitioners.
It can be noted that, although the classification task considered is binary, the task is demanding due to the high level of resemblance between the characteristics of the two phases, which is clear from the results obtained by the available state-of-the-art methods. Classification accuracies in the range of 72–83% have been achieved using various techniques, including a few deep learning models, as shown in
Table 15. Further, to examine the discriminating abilities of wavelet based features, we have performed the experiments without using wavelet decomposition and noticed high degradation in the performance (
Table 16). The proposed method has yielded high classification results due to the highly discriminating nature of the optimal wavelet-based Hjorth features used to train the model. Our results demonstrate that if we do not use wavelet decomposition and compute the Hjorth parameters of the EEG directly, the performance decreases. Similarly, if we use wavelet decomposition with statistical features other than the Hjorth parameters, the performance again degrades. On the other hand, when optimal wavelet decomposition and Hjorth parameters are employed together, the best performance is obtained.
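To make the idea of wavelet-based Hjorth features concrete, the following Python sketch computes the three standard Hjorth parameters (activity, mobility and complexity) on each subband of a wavelet decomposition. The Haar wavelet is used here purely as a simple stand-in for the optimal orthogonal filter bank of the study, whose coefficients are not given in this text. The function names and the number of levels are hypothetical, and the entropy features are omitted.

```python
import math
from typing import List, Sequence, Tuple


def _variance(x: Sequence[float]) -> float:
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def _diff(x: Sequence[float]) -> List[float]:
    return [b - a for a, b in zip(x, x[1:])]


def hjorth(x: Sequence[float]) -> Tuple[float, float, float]:
    """Hjorth activity, mobility and complexity of a non-constant signal."""
    dx = _diff(x)
    ddx = _diff(dx)
    var_x, var_dx, var_ddx = _variance(x), _variance(dx), _variance(ddx)
    activity = var_x                                    # signal power
    mobility = math.sqrt(var_dx / var_x)                # mean frequency proxy
    complexity = math.sqrt(var_ddx / var_dx) / mobility # shape vs. pure sine
    return activity, mobility, complexity


def haar_step(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform (stand-in filter bank)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_hjorth_features(x: Sequence[float], levels: int = 3) -> List[float]:
    """Hjorth parameters of every detail subband plus the final approximation."""
    features: List[float] = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.extend(hjorth(detail))
    features.extend(hjorth(approx))
    return features
```

With three decomposition levels, a 1024-sample epoch yields 12 features per channel in this sketch. A pure sinusoid has a complexity close to 1, which gives a quick sanity check of the implementation.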
Presently, the CAP phases are identified by trained clinicians in sleep laboratories only. This process is cumbersome, stressful and time-consuming. Hence, there is scope for a computer-based automated approach. As discussed earlier, a few previous studies have tried to develop a model for CAP characterization. Our method has achieved better performance for CAP phase identification. However, it should be tested using large independent cohorts before being implemented in a clinical application; this can be considered one of the limitations of the proposed study. There is also scope for further improvement in the results obtained using 1 s epochs. The future scope of this work includes the use of deep learning (DL) based techniques such as convolutional neural networks (CNN), long short-term memory (LSTM) networks and recurrent neural networks (RNN) for automated identification of CAP phases. Although DL-based techniques perform better in many cases, for this problem our proposed method worked better than the rest of the reported works. Hartmann et al. [
26] have already explored a DL-based LSTM technique for automated detection of CAP phase A. Our results are found to be better because we have used highly discriminating wavelet-based features and optimal ensemble classifiers for the classification task. In future, it would be interesting to evaluate the performance of the proposed method for identifying the subtypes A1, A2, and A3 of phase A.
6. Conclusions
Sleep scoring is widely used in the monitoring and analysis of sleep as well as in identifying sleep disorders. The sleep macrostructure, represented by the different sleep stages, provides information regarding neural activity and brain waves during sleep. However, the macrostructure sleep stages alone cannot provide information about the functional structure and stability of sleep. Besides, shorter phasic events such as K-complexes, delta-wave bursts, vertex waves, saw-tooth waves, sleep spindles and short-lasting arousals are also abundant in sleep. These events show certain patterns and represent the microstructure of sleep. CAP captures the microstructure of sleep and can be used to identify sleep instability. This paper presents an automated CAP characterization system using optimal wavelet-based features extracted from EEG signals. Our study aimed to reduce the time specialists need for sleep diagnosis. The main intention of the study is to identify the CAP phases of sleep-disordered patients. We have utilized the CAP sleep database, which contains both sleep-disordered patients and good sleepers. Our study presented the results obtained from healthy subjects and sleep-disordered patients individually, as well as from all subjects combined. An optimal mean squared bandwidth minimized orthogonal filter bank is employed for the decomposition. The combination of highly discriminating wavelet entropy and Hjorth parameters, coupled with an optimally tuned ensemble bagged trees classifier, yielded promising performance. The proposed model classified the A and B phases of NREM sleep. The model has yielded an average classification accuracy of 84%, 83%, 81%, 78%, 77%, 76% and 72% for NFLE, healthy, SDB, narcolepsy, PLM, insomnia and RBD subjects, respectively, in discriminating phases A and B using a balanced database. The best accuracy of 84% has been obtained for NFLE patients. However, the proposed model needs to be tested using large and diverse data before clinical implementation.
Our developed system is simple and fully computer-based, which can reduce the challenges faced by sleep specialists in scoring of CAP phases. In future, we aim to develop a model for the identification of subtypes A1, A2 and A3 of CAP phase A using DL based techniques like CNN, LSTM and RNN.