Article

Diagnosis of Obstructive Sleep Apnea Using Feature Selection, Classification Methods, and Data Grouping Based on Age, Sex, and Race

1 Computer Science Department, Southern Connecticut State University, New Haven, CT 06514, USA
2 Department of Computer Systems Engineering, Arab American University, Jenin P.O. Box 240, Palestine
3 Department of Pulmonary, Critical Care & Sleep Medicine, Texas A&M University, College Station, TX 77843, USA
4 Health Management and Informatics Department, School of Medicine, University of Missouri, Columbia, MO 65212, USA
5 Department of Computer Science, Al-Balqa Applied University, Salt 19117, Jordan
6 Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal 76100, Melaka, Malaysia
7 Center of Medical Informatics and Enterprise Analytics, University of Kansas Medical Center, Kansas City, KS 66160, USA
8 Department of Computer Science, Birzeit University, Birzeit P.O. Box 14, Palestine
9 Faculty of Information Technology, Sebha University, Sebha 18758, Libya
10 Pulmonary, Critical Care & Sleep Medicine, Sutter Health, Tracy, CA 95376, USA
* Author to whom correspondence should be addressed.
Diagnostics 2023, 13(14), 2417; https://doi.org/10.3390/diagnostics13142417
Submission received: 27 May 2023 / Revised: 13 July 2023 / Accepted: 15 July 2023 / Published: 20 July 2023
(This article belongs to the Special Issue Diagnosis of Sleep Disorders Using Machine Learning Approaches)

Abstract
Obstructive sleep apnea (OSA) is a prevalent sleep disorder that affects approximately 3–7% of males and 2–5% of females. In the United States alone, 50–70 million adults suffer from various sleep disorders. OSA is characterized by recurrent episodes of breathing cessation during sleep, thereby leading to adverse effects such as daytime sleepiness, cognitive impairment, and reduced concentration. It also contributes to an increased risk of cardiovascular conditions and adversely impacts patients' overall quality of life. As a result, numerous researchers have focused on developing automated detection models to identify OSA and address these limitations effectively and accurately. This study explored the potential benefits of utilizing machine learning methods based on demographic information for diagnosing OSA. We gathered a comprehensive dataset from the Torr Sleep Center in Corpus Christi, Texas, USA. The dataset comprises 31 features, including demographic characteristics such as race, age, sex, BMI, Epworth score, M. Friedman tongue position, snoring, and more. We devised a novel process encompassing pre-processing, data grouping, feature selection, and machine learning classification methods to achieve the research objectives. The classification methods employed in this study encompass decision tree (DT), naive Bayes (NB), k-nearest neighbor (kNN), support vector machine (SVM), linear discriminant analysis (LDA), logistic regression (LR), and subspace discriminant (Ensemble) classifiers. Through rigorous experimentation, the results indicated the superior performance of the optimized kNN and SVM classifiers for accurately classifying sleep apnea. Moreover, significant enhancements in model accuracy were observed when utilizing the selected demographic variables and employing data grouping techniques. For instance, the accuracy percentage demonstrated an approximate improvement of 4.5%, 5%, and 10% with the feature selection approach when applied to the grouped data of Caucasians, females, and individuals aged 50 or below, respectively. Furthermore, a comparison with prior studies confirmed that effective data grouping and proper feature selection yielded superior performance in OSA detection when combined with an appropriate classification method. Overall, the findings of this research highlight the importance of leveraging demographic information, employing proper feature selection techniques, and utilizing optimized classification models for accurate and efficient OSA diagnosis.

1. Introduction

Obstructive sleep apnea (OSA) is a severe respiratory disorder whose clinical picture was first described in 1837 by Charles Dickens [1]. The most common symptoms of OSA are loud snoring, dry mouth upon awakening, morning headaches, and concentration difficulties [2,3]. More than 100 million patients suffer from sleep apnea, and it can affect both adults and children [4,5,6]. Moreover, it is estimated that nearly 22 million Americans suffer from a form of apnea that ranges from moderate to severe [7]. Typically, the apnea–hypopnea index (AHI) is used to measure the severity of the apnea. For example, with nearly 326 million people living in the USA, it is reported that 10% of the US population have mild OSA with AHI scores greater than 5, 3.5% have moderate OSA with AHI scores greater than 15, and 4% have severe OSA syndrome (i.e., apnea/hypopnea) [7].
The publication titled “Hidden health crisis costing America billions” by the American Academy of Sleep Medicine (AASM) presents a new analysis that sheds light on the considerable economic consequences of undiagnosed OSA [8]. Neglecting sleep apnea significantly raises the likelihood of expensive health complications such as hypertension, heart disease, diabetes, and depression [9]. By examining 506 patients diagnosed with OSA, the study showcases the potential improvements in their quality of life following treatment, including enhanced sleep quality, increased productivity, and a notable 40% reduction in workplace absences. A substantial 78% of patients regarded their treatment as a significant investment. Frost & Sullivan, a leading market research firm, has estimated the annual economic burden of undiagnosed sleep apnea among adults in the United States to be approximately $149.6 billion. This staggering amount encompasses $86.9 billion in lost productivity, $26.2 billion in motor vehicle accidents, and $6.5 billion in workplace accidents. Sleep apnea can be categorized into three distinct types:
  • Obstructive sleep apnea (OSA): The most common type of apnea is known as obstructive sleep apnea (OSA), which is identified by two primary characteristics. The first is a continuous reduction in airflow of at least 30% for a duration of 10 seconds, which is accompanied by a minimum oxygen desaturation of 4%. The second is a decrease in airflow of at least 50% for 10 seconds, coupled with a 3% reduction in oxygen saturation [10].
  • Central sleep apnea (CSA): CSA occurs when the brain fails to send appropriate signals to the muscles responsible for breathing. Unlike OSA, which stems from mechanical issues, CSA arises due to impaired communication between the brain and muscles [11,12].
  • Mixed sleep apnea (MSA): MSA, also known as complex sleep apnea, represents a combination of obstructive and central sleep apnea disorders, thus presenting a more complex pattern of symptoms and characteristics.
Detecting OSA using an electrocardiogram (ECG) is an expensive process that is inaccessible to a large portion of the world's population. The attributes of the ECG signal differ between awake and sleep intervals [13]. Hence, using a combined signal of awake and sleep stages reduces the overall reliability of the detection process. Several researchers recommend examining the ECG signal in minute-long segments [14]. In general, to detect OSA, the signal should be at least 10 seconds long. The diagnosis of OSA from ECG signals using various machine learning methods is a commonly used approach in the literature. For example, artificial neural networks (ANN) and convolutional neural networks (CNN) have been introduced to detect and classify OSA. Wang et al. [15] used a CNN model to detect OSA based on ECG signals. The authors extracted a set of features from each signal and then trained a three-layered CNN model. The obtained results showed an acceptable performance and the ability to apply the proposed method on wearable devices.
Erdenebayar et al. [16] provided an automated detection method for OSA using a single-lead ECG and a CNN. The CNN model proposed in their study was meticulously constructed, featuring six carefully optimized convolution layers that incorporated activation functions, pooling operations, and dropout layers. The research findings demonstrated that the proposed CNN model exhibited remarkable accuracy in detecting OSA solely by analyzing a single-lead ECG signal. Faust et al. [17] introduced the use of a long short-term memory (LSTM) neural network to detect sleep apnea based on RR interval signals. Their results showed the ability of the LSTM network to detect sleep apnea with an accuracy of 99.80%. Schwartz et al. [18] employed several machine learning methods based on four types of abbreviated digital sleep questionnaires (DSQs). The authors showed the ability of machine learning to detect sleep disturbances with high accuracy. Lakhan et al. [19] proposed a deep learning approach to detect sleep apnea–hypopnea syndrome (SAHS) at multiple severity levels. Two types of classification were employed in their paper: binary classification with three cutoff indices (i.e., AHI = 5, 15, and 30 events/hour) and multiclass classification (i.e., no SAHS, mild SAHS, moderate SAHS, and severe SAHS). The obtained results for the binary classification showed that an AHI cutoff of 30 events/hour outperformed the other cutoffs with an accuracy of 92.69%. For the multiclass classification problem, the obtained accuracy was 63.70%. Banluesombatkul et al. [20] employed a novel deep learning method to detect OSA (i.e., normal and severe patients). The proposed method used three different deep learning components: (i) a one-dimensional CNN (1-D CNN) for feature extraction; (ii) deep recurrent neural networks (DRNNs) with an LSTM network for temporal information extraction; and (iii) fully connected neural networks (DNNs) for feature encoding. The proposed method showed acceptable results compared to the literature.
There have been several efforts to identify the relationship between the snoring sound and OSA in the literature. In general, loud snoring is one of the indicators of OSA, and it is commonly thought that the frequency and amplitude of snoring are associated with the severity of the OSA [21]. Alshaer et al. [22] employed an acoustic analysis of breath sounds to detect OSA. This research suggests that OSA can be detected using snoring attributes; however, clinicians should pay attention to the possibility of missing an OSA diagnosis in patients with minimal snoring. Kang et al. [23] applied linear predictive coding (LPC) and Mel-frequency cepstral coefficient (MFCC) features to detect OSA based on the amplitude of the snoring signal. The proposed method was able to classify three different events, namely, snoring, apnea, and silence, from sleep recordings with accuracies of 90.65%, 90.99%, and 90.30%, respectively.
Feature extraction and feature selection are the most commonly used techniques for data dimensionality reduction. Several papers have been published that highlight the importance of feature selection in OSA detection. Various features are extracted from the ECG signals; then, feature selection is used to reduce the number of extracted features and to determine the most valuable features related to OSA. In the stage of feature extraction, a set of features is extracted from the time series data, which aims to reveal the hidden information within the ECG signal. However, a feature set may contain redundant and irrelevant information, and feature selection is adopted to resolve this issue. A feature selection algorithm can help find the nearly optimal combination of features. Although feature selection is an expensive method, it can produce better classification performance, and high accuracy is significantly important in OSA detection. There are different classification methods that are used to select important features, such as support vector machine (SVM) networks, k-nearest neighbor (kNN) algorithms, artificial neural networks (ANN), linear discriminant analysis (LDA), and logistic regression (LR).
Many researchers have used demographic data to identify OSA. Sheta et al. [24,25] applied LR and ANN models to detect OSA based on demographic data. A real dataset was used that consists of several demographic features (i.e., weight, height, hip, waist, BMI, neck size, age, snoring, the modified Friedman (MF) score, the Epworth sleepiness scale, sex, and daytime sleepiness). The obtained results suggested that the proposed method could detect OSA with an acceptable accuracy. Surani et al. [26] applied the AdaBoost method as a machine learning classifier to detect OSA based on demographic data, and the obtained results were promising. Surani et al. [27] applied a wrapper feature selection method based on binary particle swarm optimization (BPSO) with an ANN to detect OSA. The obtained results illustrated that the use of BPSO with an ANN can detect OSA with high accuracy. Haberfeld et al. [28] proposed a mobile application called Sleep Apnea Screener (SAS) to detect OSA based on demographic data. The authors used twelve demographic features (i.e., height, weight, waist, hip, BMI, age, neck, M. Friedman, Epworth, snoring, gender, and daytime sleepiness). The application had two machine learning methods: LR and SVM. Moreover, the authors studied the performance of each classifier based on gender. The reported results showed that the proposed application can help patients detect OSA easily compared to an overnight test for OSA diagnosis.
There are many screening approaches for OSA, including tools such as the Berlin Questionnaire, the STOP-BANG Questionnaire, Epworth Sleepiness Scale (ESS), clinical assessment, and population-specific screening tools [29]. These approaches aim to identify individuals at a higher risk of OSA based on symptoms, risk factors, and questionnaire responses. Positive screening results prompt further evaluation using diagnostic tests such as polysomnography (PSG) or home sleep apnea testing (HSAT). Screening helps prioritize resources and directs individuals toward comprehensive sleep assessments. Subramanian et al. [30] introduced a novel screening approach known as the NAMES, which employs statistical methods to identify OSA. The NAMES assessment combines various factors, including neck circumference, airway classification, comorbidities, the Epworth scale, and snoring, to create a comprehensive evaluation that incorporates medical records, current symptoms, and physical examination findings. Experimental findings demonstrated the efficacy of the NAMES assessment in detecting OSA. Furthermore, the inclusion of BMI and gender in the assessment improved its screening capabilities.
This work proposes an efficient classification framework for the early detection of OSA. Specifically, it extends the NAMES screening work with machine learning classification methods and a metaheuristic-based feature selection scheme. The main contributions are summarized as follows:
  • The OSA data was grouped based on age, sex, and race variables for performance improvement. This type of grouping is novel and has never been presented in this area of research before.
  • Various types of the most well-known machine learning algorithms were assessed to determine the best-performing one for the OSA problem. These methods included twelve predefined (fixed) parameter classifiers and two optimized classifiers (using hyperparameter optimization).
  • A wrapper feature selection approach using particle swarm optimization (PSO) was employed to determine the most valuable features related to the OSA.
  • Experimental results from the actual data (collected from Torr Sleep Center, Texas, USA) confirmed that the proposed method improved the overall performance of the OSA prediction.
The rest of this paper is organized as follows: Section 2 presents the proposed diagnosis process, Section 3 describes the dataset used in the experiments, and Section 4 covers data preprocessing. Section 5 details the experimental setup, Sections 6 and 7 discuss the experimental results and the feature selection analysis, and Section 8 presents the conclusion and future work.

2. Proposed Diagnosis Process

The proposed OSA diagnosis process is illustrated in Figure 1. We suggest collecting data from patients who have undergone demographic, anthropometric, and polysomnographic studies at a community-based sleep laboratory. An expert from the Torr Sleep Center (Corpus Christi, TX, USA) supervised the collection process for the polysomnography (PSG) evaluation of suspected OSA between 5 February 2007 and 21 April 2008. We processed the data to make it more suitable for analysis: all missing data were handled, and a normalization technique was employed to transform the data onto a standard scale. The next step was classification of the grouped data, where the classification models were trained using two types of learning methods—fixed parameter settings and adaptive parameter settings. The benefit of using two kinds of learning methods is to learn more about the dataset and find the optimal parameter settings. After that, we applied wrapper feature selection using the best-performing classifier to identify the most valuable features related to OSA. This step can reveal useful information that helps physicians understand the demographic characteristics of OSA patients. Finally, we used a set of evaluation criteria (i.e., accuracy, TPR, TNR, AUC, precision, F-score, and G-mean) to evaluate the performance of each classifier.
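As a rough illustration (not the authors' implementation), the sketch below chains the main steps of this process—imputation, min–max normalization, and a 10-fold cross-validated kNN classifier—using scikit-learn; the file name and label column are hypothetical placeholders.

```python
# Rough sketch of the diagnosis workflow: impute missing values, min-max normalize,
# and evaluate a kNN classifier with 10-fold cross-validation.
# "osa_demographics.csv" and the "apnea" label column are hypothetical placeholders,
# and all features are assumed to be numerically encoded.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("osa_demographics.csv")
X, y = df.drop(columns=["apnea"]), df["apnea"]

model = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # mean imputation for numeric features
    ("scale", MinMaxScaler()),                   # rescale every feature into [0, 1]
    ("clf", KNeighborsClassifier()),
])
print("10-fold CV accuracy:", cross_val_score(model, X, y, cv=10).mean())
```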

3. Sleep Apnea Dataset

The initial dataset employed in this study encompasses 620 patients, comprising 366 males and 254 females. The age range for males spans from 19 to 88 years, while for females, it ranges from 20 to 96 years. Notably, the prevalence of snoring was 92.6% among males and 91.7% among females. Each patient underwent comprehensive full-night monitoring as part of the study. The dataset comprises 31 input features and a binary output, represented by either 0 or 1, thus indicating the presence or absence of obstructive sleep apnea (OSA) (see Table 1 for a detailed presentation of these features). Additionally, the study recorded each individual's Friedman tongue position (FTP), which encompasses four distinct positions, as depicted in Figure 2. The Epworth scale, which is used to assess sleepiness, was also collected; the scale details are presented in Table 2. Notably, the dataset is imbalanced, with 357 patients identified as positive cases with OSA and 263 individuals identified without OSA. Table 3 provides a comprehensive overview of the dataset's characteristics.

4. Data Preprocessing

In a machine-learning-based classification process, data preprocessing encodes the raw data into a form that the algorithm can analyze efficiently, so that the features of the data are interpreted correctly. Data preprocessing is a vital step in any machine learning process [32]. This process aims to reduce unexpected behavior during learning, thereby enhancing the machine learning algorithm's performance [33,34]. A set of operations such as data cleaning, data transformation, and data reduction is usually involved in data preprocessing. Specifically, the main preprocessing steps used in this research are the following: missing-value imputation, data grouping, normalization, and feature selection.

4.1. Missing Data

Missing values in rows or columns are ubiquitous in real-world datasets. They can arise from failures during the data collection process or from a particular adopted data validation rule. There are several methods to handle missing data, including the following:
  • If more than 50% of a row's or column's values are missing, the entire row or column is removed, unless it is feasible to fill in the missing values.
  • If only a reasonable proportion of the values are missing, simple interpolation methods can be adopted to fill them in. Interpolation methods include filling missing values with the mean, median, or mode of the respective feature.
In this work, we applied a statistical imputation approach [35,36]. All missing values for each attribute were replaced with a statistical measure that was calculated from the remaining values for that attribute. The statistics used were the mean and mode for the numeric and nominal features, respectively. These methods were chosen because they are fast, easy to implement, prevent information loss, and work well with a small dataset. Figure 3 depicts an example of missing value imputation.
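A minimal sketch of this statistical imputation in pandas is shown below, assuming the dataset is loaded as a DataFrame; the two example columns are illustrative, not the study's exact field names.

```python
# Statistical imputation sketch: mean for numeric columns, mode for nominal columns.
import pandas as pd

def impute(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].mean())          # mean for numeric features
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])  # mode for nominal features
    return out

example = pd.DataFrame({"BMI": [31.2, None, 28.4], "Race": ["Caucasian", None, "Hispanic"]})
print(impute(example))
```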

4.2. Data Normalization

Data normalization is the process of standardizing numerical attributes onto a common scale [37,38]. This operation is strongly recommended in the machine learning process to avoid any bias towards dominant features. Min–max normalization was applied in this research to rescale every numerical feature value into a number within [0,1]. For every feature, each value x is transformed into x_n using the formula given in Equation (1):
x_n = \frac{x - min}{max - min},    (1)
where x_n is the normalized value of x, and min and max represent the minimum and maximum values of the feature, respectively.
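A short sketch of Equation (1) applied column-wise with NumPy follows; the sample values are illustrative.

```python
# Min-max normalization (Equation (1)) applied column-wise; sample values are illustrative.
import numpy as np

def min_max_normalize(X: np.ndarray) -> np.ndarray:
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)  # each feature rescaled into [0, 1]; assumes max != min per column

X = np.array([[19.0, 22.5], [96.0, 48.1], [50.0, 31.0]])  # e.g., two numeric features
print(min_max_normalize(X))
```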

4.3. Role of Grouping in OSA Diagnosis

Recently, there have been many research efforts toward understanding the relationship between sex, age, and ethnicity in the diagnosis of sleep apnea.
Several research articles have explored the concept of data grouping. For example, in a study conducted by Mohsenin et al. [39], the authors examined the relationship between gender and the prevalence of hypertension in individuals with obstructive sleep apnea (OSA). The study, based on a large cohort of patients assessed at the Yale Center for Sleep Medicine, investigated how gender influences the likelihood of hypertension in OSA patients. The results revealed that hypertension rates increased with age and the severity of OSA, with obese men in the clinic-based population being at approximately twice the risk of hypertension compared to women. Similarly, another study by Freitas et al. [40] investigated the impact of gender on the diagnosis and treatment of OSA.
The study conducted by Ralls et al. [41] delved into the roles of gender, age, race/ethnicity, and residential socioeconomics in OSA syndromes. The research reviewed the existing literature and shed light on several intriguing findings. OSA was found to predominantly affect males, while women exhibited lower apnea–hypopnea index (AHI) values than men during specific sleep stages. Interestingly, women required lower levels of continuous positive airway pressure (CPAP) for treating OSAs of similar severities. The study also highlighted the impact of environmental factors, such as obesity, craniofacial structure, lower socioeconomic status, and residing in disadvantaged neighborhoods, on the prevalence and severity of OSA among different ethnic and racial groups.
In a research paper by Slaats et al. [42], an investigation was conducted to explore the relationship between ethnicity and OSA, which specifically focused on upper airway (UA) morphology, including Down syndrome. The findings of the study revealed that black African (bA) children exhibited a distinct upper airway morphology and were more prone to experiencing severe and persistent OSA compared to Caucasian children. This suggests that ethnicity plays a role in the susceptibility to OSA and highlights the importance of considering ethnic differences in diagnosing and managing the condition.
The Victoria Sleep Cohort study, as discussed in Irene et al. [43], investigated the gender-related impact of OSA on cardiovascular diseases. The study found consistent evidence linking OSA with cardiovascular risk, with a particular emphasis on men with OSA. The authors highlighted that the relationship between OSA and cardiovascular risk is influenced by gender, thereby indicating the need for tailored OSA treatment approaches for men and women. Additionally, Mohsenin et al. [44] conducted a study examining the effect of obesity on pharyngeal size separately for men and women, thus providing insights into the influence of obesity on the upper airway in OSA patients of different genders.
One of the main objectives of this study is to investigate the detection of OSA before and after grouping the data based on demographic variables such as age, gender, and race. Accordingly, the original data was grouped by ethnicity (Caucasian and Hispanic), gender (males and females), and age (age ≤ 50 or age > 50). Consequently, six datasets were investigated: Caucasian, Hispanic, females, males, age ≤ 50, and age > 50. Table 3 shows the data distribution after grouping, and Figure 4 shows the distribution of apnea and no apnea with respect to the age, gender, and race attributes for all datasets.
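The grouping itself amounts to simple subsetting of the table, as in the following sketch; the column names ("Race", "Sex", "Age") are assumptions about the data export, not the study's actual field names.

```python
# Sketch of splitting the data into the six demographic subsets described above.
import pandas as pd

def group_datasets(df: pd.DataFrame) -> dict[str, pd.DataFrame]:
    return {
        "Caucasian": df[df["Race"] == "Caucasian"],
        "Hispanic":  df[df["Race"] == "Hispanic"],
        "Females":   df[df["Sex"] == "Female"],
        "Males":     df[df["Sex"] == "Male"],
        "Age<=50":   df[df["Age"] <= 50],
        "Age>50":    df[df["Age"] > 50],
    }

# Each subset is then classified and evaluated separately, e.g.:
# for name, subset in group_datasets(df).items():
#     run_classifiers(subset)   # hypothetical evaluation routine
```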

4.4. Wrapper Feature Selection

Feature selection (FS) plays a crucial role in data mining, wherein it serves as a preprocessing phase to identify and retain informative patterns/features while excluding irrelevant ones. This NP-hard optimization problem has significant implications in data classification, as selecting valuable features can enhance the classification accuracy and reduce computational costs [45,46].
FS methods can be categorized into two families based on the criteria used to evaluate the selected feature subset: these include filters and wrappers [46,47]. Filter FS techniques employ scoring matrices to assign weights to features, such as mutual information or chi-square tests. Features with weights below a threshold are then eliminated from the feature set. On the other hand, wrapper FS methods utilize classification algorithms such as SVM or linear discriminant analysis to assess the quality of the feature subsets generated by a search method [48,49].
Generally, wrapper FS approaches tend to yield higher classification accuracy by leveraging dependencies among features within a subset. In contrast, filter FS methods may overlook such dependencies. However, wrapper FS comes with a higher computational cost compared to filter FS [50].
Feature subset generation involves the search for a highly informative subset of features from a set of patterns. Various search strategies, such as heuristic, complete, and random, are employed for this purpose [51,52,53]. The complete search involves generating and examining all possible feature subsets in the search space to identify the most informative one. However, this approach becomes computationally infeasible for large datasets due to the exponential growth of subsets. For instance, if a dataset has 31 features, the complete search would generate 2^{31} subsets for evaluation. Random search, as the name implies, randomly explores the feature space to find subsequent feature subsets [54]. Although random search can, in some cases, generate all possible feature subsets similar to a complete search [45,55], it lacks a systematic search pattern.
In contrast, heuristic search is a different approach used for feature subset generation. It iteratively improves the quality of the solution (i.e., a feature subset) based on a given heuristic function, thereby aiming to optimize a specific problem [56]. While heuristic search does not guarantee finding the best solution, it can often find good solutions within reasonable memory and time constraints. Several metaheuristic algorithms, such as particle swarm optimization (PSO) [57], ant colony optimization (ACO) [58], the firefly algorithm (FA) [59], ant lion optimization (ALO) [60], the whale optimization algorithm (WOA) [61], and the grey wolf optimizer (GWO) [62], have demonstrated their effectiveness in addressing feature subset selection problems. Examples of FS approaches can be found in [63,64,65,66,67,68,69].
This paper presents a wrapper feature selection approach based on particle swarm optimization (PSO) [70]. The main concept behind PSO is to simulate the collective behavior of bird flocking. The algorithm initializes a group of particles (solutions) that explore the search space in order to find the optimal solution for a given optimization problem. Each particle in the population adjusts its velocity and position based on the best solution found so far within the swarm. By considering the best particle, each individual particle updates its velocity and position according to specific rules, as outlined in Equations (2) and (3).
The PSO-based wrapper feature selection approach described in this paper utilizes the algorithm to search for an effective feature subset that improves the performance of the underlying classification task. For further details, please refer to [70,71].
v_{ij}(m+1) = \omega_1 v_{ij}(m) + c_1 r_1 (pbest_{ij} - x_{ij}(m)) + c_2 r_2 (gbest_{ij} - x_{ij}(m)),    (2)
x_{ij}(m+1) = x_{ij}(m) + v_{ij}(m+1),    (3)
where m denotes the current generation, and \omega_1 is a parameter, named the inertia weight, that controls the balance between global and local search. v_{ij}(m) denotes the current velocity at generation m for the j-th dimension of the i-th particle, and x_{ij}(m) denotes the current position of the i-th particle in the j-th dimension. r_1 and r_2 are two uniformly distributed random numbers in (0,1), and c_1 and c_2 are known as acceleration coefficients. pbest_{ij} is the best solution that particle i has found so far, and gbest refers to the best solution found within the population so far.
To adapt the original PSO algorithm for discrete or binary search space, a modified binary version was introduced by [57]. The primary step in this transformation is the utilization of a sigmoid (transfer) function, as shown in Equation (4), to convert the real-valued velocities into probability values ranging from 0 to 1. The objective is to adjust the particle’s position based on the probability defined by its velocity. This allows for the representation of binary or discrete variables within the PSO framework.
S(v_{ij}(m)) = \frac{1}{1 + \exp(-v_{ij}(m))},    (4)
where v_{ij}(m) refers to the velocity of particle i at iteration m in the j-th dimension. The updating process for the S-shaped group at the next iteration m+1 is presented in Equation (5). The position vectors are then updated based on the probability values of their velocities as follows:
x_{ij}(m+1) = \begin{cases} 0, & \text{if } rand < S(v_{ij}(m+1)) \\ 1, & \text{if } rand \geq S(v_{ij}(m+1)), \end{cases}    (5)
The basic version of the BPSO suffers from some drawbacks, such as becoming trapped in local minima. Mirjalili and Lewis [71] proposed a modified version of the BPSO that employs transfer functions for mapping the continuous search space into a binary one. The aim of introducing these functions is to avoid the problem of local optima and to improve the convergence speed. In this work, we employed the S-shaped transfer functions proposed in [71] for converting the PSO into its binary form. We examined these functions with the PSO algorithm to choose the most appropriate one. Table 4 presents the utilized transfer functions, and Figure 5 shows their shapes.
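A compact sketch of one binary PSO iteration using the sigmoid transfer function of Equation (4) and the position rule of Equation (5) as stated above is shown below; the swarm size and coefficient values are illustrative and not those tuned in this work.

```python
# One binary PSO iteration with the sigmoid transfer function (Eq. (4)) and the
# position rule of Eq. (5) as stated in the text. Swarm size and coefficients are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_features = 20, 31
w, c1, c2 = 0.9, 2.0, 2.0                                # inertia weight and acceleration coefficients

X = rng.integers(0, 2, size=(n_particles, n_features))  # binary positions (candidate feature subsets)
V = rng.uniform(-1, 1, size=(n_particles, n_features))  # real-valued velocities
pbest, gbest = X.copy(), X[0].copy()                     # placeholders; updated from the fitness (Eq. (6))

def step(X, V, pbest, gbest):
    r1, r2 = rng.random(X.shape), rng.random(X.shape)
    V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)  # velocity update, Eq. (2)
    S = 1.0 / (1.0 + np.exp(-V))                               # S-shaped transfer, Eq. (4)
    X = np.where(rng.random(X.shape) < S, 0, 1)                # binary position update, Eq. (5)
    return X, V

X, V = step(X, V, pbest, gbest)
```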

4.5. Formulation of Feature Selection Problem

FS is typically treated as a binary optimization problem, where candidate solutions are represented as binary vectors. To address this, a binary optimizer such as binary particle swarm optimization (BPSO) can be utilized. This work proposes a wrapper FS method that combines the BPSO as the search strategy with a classifier (e.g., kNN) to evaluate the quality of the feature subsets generated by the BPSO. In the FS problem, a solution is encoded as a binary vector with a length equal to the total number of features in the dataset. Each element in the vector represents a feature, where a value of zero indicates the exclusion of the corresponding feature, and a value of one indicates its inclusion or selection.
This paper introduces four FS approaches based on different binary variants of PSO, each utilizing a specific S-shaped transfer function to convert continuous values into binary ones. FS is considered a multi-objective optimization problem that aims to achieve both high classification accuracy and a low number of selected features. These two conflicting objectives are combined in Equation (6) [46,64].
Fitness = \alpha \times er + \beta \times \frac{F}{N},    (6)
where er indicates the error rate of the utilized classification algorithm (e.g., kNN) over a subset of features produced by the BPSO optimizer, F is the number of selected features, and N denotes the total number of features. \alpha = 0.99 and \beta = 0.01 [72,73] are two controlling parameters that balance the importance of both objectives.
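A minimal sketch of the fitness in Equation (6), assuming a kNN wrapper evaluated with cross-validation, follows; α and β use the values stated above, while everything else is illustrative.

```python
# Wrapper fitness of Equation (6): weighted sum of the kNN cross-validation error
# and the fraction of selected features. The kNN settings are illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

ALPHA, BETA = 0.99, 0.01  # alpha and beta from Equation (6)

def fitness(mask: np.ndarray, X: np.ndarray, y: np.ndarray) -> float:
    if mask.sum() == 0:                       # an empty subset is the worst possible solution
        return 1.0
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=10).mean()
    er = 1.0 - acc                            # error rate over the selected features
    return ALPHA * er + BETA * mask.sum() / mask.size  # alpha * er + beta * (F / N)

# The BPSO minimizes this fitness; lower values mean higher accuracy with fewer features.
```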

5. Experimental Setup

It is well known that no universal machine learning algorithm can be the best-performing one for all problems (as suggested by the No Free Lunch (NFL) theorem [74]). This motivated our attempts to examine various fixed and adaptive classification algorithms to identify the most applicable one for OSA. In the experiment, various classification methods were tested; however, only those classifiers with better performances are reported. Correspondingly, we adopted the decision tree (DT), naive Bayes (NB), k-nearest neighbor (kNN), support vector machine (SVM), fine decision tree (FDT), coarse decision tree (CDT), linear discriminant analysis (LDA), logistic regression (LR), Gaussian naive Bayes (GNB), kernel naive Bayes (KNB), linear support vector machine (LSVM), medium Gaussian support vector machine (MGSVM), coarse Gaussian support vector machine (CGSVM), cosine k-nearest neighbor (CKNN), weighted k-nearest neighbor (WKNN), and subspace discriminant (Ensemble) classifiers for performance validation. The detailed parameter settings for these classification methods are presented in Table 5. Moreover, the kNN and SVM with hyperparameter optimization settings (see Table 6 and Table 7) were also employed in this work.
In this study, a K-fold cross-validation with K = 10 was employed for performance evaluation instead of a hold-out validation. K-fold cross-validation offers the advantage of estimating the generalization error by using different combinations of training and testing sets. This approach allows for comprehensive testing of the data. For assessing the performance of the machine learning models, multiple metrics were utilized, including the accuracy, true positive rate (TPR), true negative rate (TNR), area under the curve (AUC), precision, F-score, and G-mean. These metrics were measured to ensure the effectiveness of the model.
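The sketch below shows one way to derive these metrics from cross-validated predictions; it is an illustrative reimplementation, not the authors' evaluation code.

```python
# Deriving TPR, TNR, precision, F-score, and G-mean from 10-fold cross-validated
# predictions of a binary classifier (labels assumed to be 0 = no OSA, 1 = OSA).
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

def evaluate(clf, X, y):
    y_pred = cross_val_predict(clf, X, y, cv=10)
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    tpr = tp / (tp + fn)                       # sensitivity (recall)
    tnr = tn / (tn + fp)                       # specificity
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "TPR": tpr,
        "TNR": tnr,
        "precision": precision,
        "F-score": 2 * precision * tpr / (precision + tpr),
        "G-mean": np.sqrt(tpr * tnr),
    }

# Example: evaluate(KNeighborsClassifier(), X, y) on any of the grouped datasets.
```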

6. Experimental Results

The following sections present the evaluation results obtained using the complete dataset and the datasets grouped by race, gender, and age.

6.1. Results with All Data

The experiment was conducted in eight phases. In the first phase, we analyzed the performance results of different classification algorithms for the complete dataset. Accordingly, the DT, LDA, LR, NB, SVM, kNN, Ensemble, optimized kNN, and optimized SVM algorithms provided the best results in this analysis. Thus, only the results of these classifiers are reported, as shown in Table 8. As illustrated, each classification algorithm produced a different result. Compared with the other classifiers, the optimized classifiers (SVM* and kNN*) retained the highest accuracies of 0.7226 and 0.7409, respectively. Our findings suggest that the optimized classifiers achieved the best performance in the sleep apnea classification.
The kNN* offered the best result with an accuracy of 0.7409, a TPR of 0.8322, an AUC of 0.7321, a precision of 0.7294, an F-score of 0.7774, and a G-mean of 0.7252. Moreover, the kNN* achieved the highest mean rank of 1.14, thus suggesting that the kNN* was the best classifier when the complete dataset was used.

6.2. Data Grouping with Race

In the second phase, we inspected the performance of the different classification algorithms on the data grouped by race. Table 9 and Table 10 demonstrate the performance of the different classification algorithms on the data of the Caucasian and Hispanic races, respectively. From Table 9, the highest accuracy of 0.7483 was obtained by the CKNN and kNN*. In terms of the AUC value, the kNN* retained the best AUC of 0.7114, which indicates better performance in discriminating between the classes. Moreover, the kNN* yielded the optimal mean rank of 2.29. When observing the results in Table 10, it is clear that the kNN* scored the highest accuracy, TPR, TNR, AUC, precision, F-score, and G-mean. The kNN* proved to be the best algorithm in this analysis, and the mean ranks in both Table 9 and Table 10 support this argument.

6.3. Data Grouping with Gender

The behavior of the different classification methods on the data grouped by gender is studied in this subsection. Table 11 and Table 12 outline the evaluation results of the different classification algorithms. According to the findings in Table 11, the best accuracy of 0.7458 was achieved by the WKNN and Ensemble classifiers. However, the Ensemble classifier offered the optimal mean rank of 2.43, which showed excellent results for the grouped data of females. By inspecting the results in Table 12, we can observe that the performance of the kNN* was the best. The kNN* ranked first (mean rank = 1.14) and offered the highest accuracy of 0.6987, the highest TNR of 0.7500, the best AUC of 0.6875, a precision of 0.6349, an F-score of 0.6299, and a G-mean of 0.6847.

6.4. Data Grouping with Age

In the fourth phase, we investigated the performance of the different classification algorithms on the data grouped by age (age ≤ 50 or age > 50). Note that the age was normally distributed around 50. Table 13 shows the evaluation results for age ≤ 50. As can be seen, the SVM* obtained the highest accuracy of 0.7523, followed by the kNN* (0.7431). Correspondingly, the SVM* contributed the optimal TNR, AUC, precision, and G-mean. On the other hand, the evaluation results for age > 50 are tabulated in Table 14. As shown, the kNN* achieved the best accuracy of 0.7333. In addition, the kNN* ranked first with the highest AUC, precision, F-score, and G-mean. Our findings indicate that the algorithms with hyperparameter optimization (SVM* and kNN*) achieved the best performance in the sleep apnea classification.
Table 15 summarizes the overall ranking results for all classifiers, and the bar chart of the overall ranking is shown in Figure 6. As illustrated in the results, the SVM* and kNN* offered the best ranking in most cases. Among the classifiers, the SVM* and kNN* achieved the best average ranks of 3.09 and 1.80, respectively. The experimental results reveal the supremacy of the optimized algorithms for the classification of sleep apnea. The observed improvement in the kNN* and SVM* is attributed to the hyperparameter optimization performed during training, which enabled the models to capture the target concepts better.

6.5. Summary Performance with Data Grouping

In the fifth part of the experiment, we studied the impact of grouping (race, gender, and age) on the performance of different classifiers. Table 16 depicts the performance evaluation before and after grouping. One can see that the performances of the classifiers were substantially improved when the grouping was implemented, especially for the data grouped by races (Caucasian and Hispanic).
From the analysis, it can be inferred that the grouping step is beneficial for performance improvement. As observed in the results, the data grouped by Caucasian race yielded the best model for accurate sleep apnea classification, with an optimal mean rank of 1.33. Based on the findings, we can conclude that grouping the data by race (Caucasian) maximizes the features' separability between classes. Furthermore, Table 17 reports the running times (in seconds). Across all the datasets, the fastest algorithm was the DT (rank of 1.29), followed by the LDA (rank of 2.14).

7. Feature Selection

In the sixth phase, we investigated the impact of feature selection for all cases. Generally speaking, data dimensionality has a large impact on the machine learning development process. Data with high dimensionality not only contain irrelevant and redundant features that can negatively affect the accuracy, but also require massive time and computational resources [75]. Hence, feature selection can be an effective way to resolve the above issue while improving the performance of the learning model. In this research, we adopted one of the most popular feature selection methods, binary particle swarm optimization (BPSO), to select the significant features from the high-dimensional feature space. It is worth noting that the kNN* was employed as the learning algorithm, since it obtained the best performance in the previous analysis.

7.1. Evaluation of BPSO Using Different TFs

Initially, the BPSO with different S-shaped transfer functions (TFs) was studied. Generally, TFs play an essential role in converting a solution into a binary form; in other words, they enable the particles to search within the binary feature space. However, different TFs may yield different results [71]. Thus, we evaluated the BPSO with four different S-shaped TFs to find the optimal one.
Table 18 shows the average accuracy of the BPSO variants. Based on the results obtained, the BPSO1 achieved the highest accuracy for all five cases except the complete dataset. By observing the results in Table 19, the BPSO1 also yielded the smallest number of selected features in most cases. Our results imply that the BPSO1 was highly capable of finding the optimal feature subset, thereby enhancing the learning model's performance for sleep apnea classification. The mean ranks reported in Table 18 and Table 19 support this observation.
Table 20 tabulates the running time (in seconds) of the BPSO variants. As can be observed, the BPSO1 often ran faster in finding the near-optimal solution, while the BPSO2 was the slowest. Overall, the BPSO1 was shown to be the best variant, and it was employed in the rest of the experiment.

7.2. Comparison of BPSO with Well-Known Algorithms

In this subsection, the performance of the BPSO1 was further compared with seven other state-of-the-art methods: the binary Harris hawk optimization (BHHO) [73], the binary gravitational search algorithm (BGSA) [76], the binary whale optimization algorithm (BWOA) [77], the binary grey wolf optimization (BGWO) [78], the binary bat algorithm (BBA) [79], the binary ant lion optimizer (BALO) [78], and the binary moth–flame optimization (BMFO) [48]. Table 21 presents the average accuracy results obtained by the eight different algorithms. From Table 21, it is seen that the BPSO1 outperformed the other methods in tackling the feature selection problem. The results show that the BPSO1 retained the optimal mean rank of 1.57, followed by the BHHO (2.29). Among the groupings (race, gender, and age), age had the most positive impact on accuracy when applying the BPSO1; that is, feature selection substantially improved the performance on the data grouped by age. Moreover, Table 22 presents the result of the Wilcoxon signed rank test, which confirms that the BPSO1 outperformed the other methods in this work.
Table 23 tabulates the evaluation of the average feature size, and the result of the Wilcoxon test is shown in Table 24. Our result indicates that the best algorithm in feature reduction was the BBA, while the BPSO1 ranked second. In terms of computational complexity, one can see from Table 25 that the BPSO1 again scored the optimal mean rank of 1.86 across all datasets. The BPSO1 thus offers not only the highest accuracy and a competitively small number of features, but also the fastest computational speed.
Figure 7 illustrates the convergence behavior of the compared algorithms. We can observe that the BPSO1 converged faster and deeper toward the global optimum in all seven cases, showing an excellent convergence rate against its competitors. This can be attributed to the strong searching ability of the BPSO1 algorithm. On the other hand, the BBA and BGSA were found to have the lowest performance; they suffered from early stagnation and premature convergence, thereby reducing the classification performance.

7.3. Relevant Features Selected by BPSO

In the seventh phase, we inspected the relevant features selected by the BPSO1 algorithm. Table 26 outlines the best accuracy results of the classifiers with and without the BPSO algorithm. As shown in Table 26, the classification accuracy increased when the BPSO was deployed. This result affirms the importance and effectiveness of the feature selection method in sleep apnea classification. Taking the Caucasian dataset as an example, an increment of roughly 6% in accuracy was achieved by the BPSO1 algorithm, with a feature reduction of 56.67%. In the dataset with age ≤ 50, the proposed approach improved the accuracy by at least 11% while eliminating more than half of the irrelevant and redundant features in the dataset. Moreover, the reduction in feature size contributed to an overall decrease in classifier complexity.
Table 27 presents the details of the selected features yielded by the BPSO algorithm. Instead of using all 31 features, the results show that the number of features chosen was 18 for the complete dataset, 13 for the Caucasian dataset, 14 for the Hispanic dataset, 13 for the females dataset, 11 for the males dataset, 15 for the age ≤ 50 dataset, and 17 for the age > 50 dataset. The findings suggest that fewer than 20 features are sufficient for accurate sleep apnea classification. In addition, Figure 8 exhibits the importance of the features in terms of the number of times each feature was chosen by the BPSO. Across all the datasets, the most frequently selected features were f22 and f11, followed by f14 and f8. Correspondingly, these features had high discriminative power and could best describe OSA compared to the others.

7.4. Comparison of the BPSO-kNN with CNN, MLP, and kNN*

In the final part of the experiments, we compared the performance of the BPSO-kNN with the kNN* and other well-known models, including the convolutional neural network (CNN) and the multilayer perceptron (MLP) neural network. Note that the maximum number of epochs for both the CNN and MLP was set to 150. Table 28 presents the accuracy and computational time of the BPSO-kNN, CNN, MLP, and kNN* methods. Upon inspecting the results, the BPSO-kNN contributed the highest accuracy for all the datasets. Although the computational cost of the BPSO-kNN was much higher than that of the CNN, MLP, and kNN*, it usually ensures an accurate classification process. All in all, our findings affirm the superiority of the BPSO-kNN for the sleep apnea classification.
The previous analysis showed that the performance of the OSA diagnosis can be enhanced by applying the feature selection method. According to Figure 9, the accuracy showed an increment of at least 3% in most datasets, and an increment of roughly 10% could be achieved with the feature selection approach for the age ≤ 50 dataset. As noted above, irrelevant and redundant features are uninformative; they degrade the performance of the model and increase the dimensionality of the dataset. By utilizing the BPSO-kNN, most of the unwanted features can be removed while keeping the most informative ones, which supports a better diagnosis of the OSA. As a bonus, the BPSO-kNN selects the useful features from the dataset automatically, which means it can be applied without prior knowledge and experience. In short, feature selection is an essential and efficient tool for sleep apnea classification.

7.5. Comparison Study

To verify the performance of the proposed approach, we compared the obtained results with those reported in preceding work on the same dataset. For this purpose, the proposed BPSO-kNN was compared with the screening tool (the NAMES assessment) offered by Subramanian et al. [30]. Table 29 presents the AUC scores of the NAMES assessment using different combinations of features versus the BPSO-kNN. According to the findings, the developed BPSO-kNN outperformed the other methods, with an optimal AUC of 0.8320. Comparing our proposed model to [27,28], it is clear that the proposed model outperformed the SVM, LR, and ANN models. These results again validate the benefit of the feature selection process and confirm that data grouping and the proper selection of features, combined with an effective classification method, can yield better performance for OSA detection.

8. Conclusions and Future Works

This study proposed an alternative approach to detect obstructive sleep apnea (OSA), which utilized demographic data instead of traditional ECG analysis. Expert physicians and sleep specialists collected a dataset of 31 features from 620 patients at the Torr Sleep Center in Texas, USA. The research focused on evaluating the performance of various machine learning classifiers using fixed and adaptive learning methods, thereby aiming to identify the most suitable classifier for the collected data. The results demonstrated that the kNN classifier achieved the highest accuracy among the tested classifiers. Additionally, a wrapper feature selection method based on the BPSO (binary particle swarm optimization) was employed with the kNN classifier to determine the most relevant features associated with OSA. The experimental outcomes indicate that the proposed method enhanced the overall prediction performance for OSA. As part of future work, the investigation will expand to include several wrapper feature selection methods, such as binary genetic algorithms (BGA) and binary ant colony optimization (BACO), thus aiming to assess the performance of the kNN classifier with different feature selection techniques.

Author Contributions

Conceptualization, A.S. and S.R.S.; Methodology, A.S., T.T., S.R.S., H.T. and M.M.; Software, T.T.; Validation, T.T. and S.S.; Investigation, T.T. and H.T.; Data curation, A.S.; Writing—original draft, A.S., H.T., M.B., J.T., N.A.-E.-R., M.M. and H.C.; Writing—review & editing, A.S., T.T., S.R.S., H.T., M.B., J.T., N.A.-E.-R., M.M., H.C. and S.S.; Visualization, T.T.; Supervision, A.S. and S.R.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

References

  1. Dickens, C. The Posthumous Papers of the Pickwick Club, 1st ed.; Chapman and Hall: London, UK, 1837. [Google Scholar]
  2. Alruwaili, H.; Ahmed, A.; Fatani, A.; Al-Otaibi, K.; Al-Jahdali, S.; Ali, Y.; Al-Harbi, A.; Baharoon, S.; Khan, M.; Al-Jahdali, H. Symptoms and risk for obstructive sleep apnea among sample of Saudi Arabian adults. Sleep Biol. Rhythm. 2015, 13, 332–341. [Google Scholar] [CrossRef]
  3. Piriyajitakonkij, M.; Warin, P.; Lakhan, P.; Leelaarporn, P.; Kumchaiseemak, N.; Suwajanakorn, S.; Pianpanit, T.; Niparnan, N.; Mukhopadhyay, S.C.; Wilaiprasitporn, T. SleepPoseNet: Multi-View Learning for Sleep Postural Transition Recognition Using UWB. IEEE J. Biomed. Health Inform. 2020, 25, 1305–1314. [Google Scholar] [CrossRef] [PubMed]
  4. Kaditis, A.G.; Alonso Alvarez, M.L.; Boudewyns, A.; Alexopoulos, E.I.; Ersu, R.; Joosten, K.; Larramona, H.; Miano, S.; Narang, I.; Trang, H.; et al. Obstructive sleep disordered breathing in 2- to 18-year-old children: Diagnosis and management. Eur. Respir. J. 2016, 47, 69–94. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Verhulst, S.; Kaditis, A. Obstructive sleep apnoea in children. Breathe 2011, 7, 240–247. [Google Scholar] [CrossRef] [Green Version]
  6. Banluesombatkul, N.; Ouppaphan, P.; Leelaarporn, P.; Lakhan, P.; Chaitusaney, B.; Jaimchariya, N.; Chuangsuwanich, E.; Chen, W.; Phan, H.; Dilokthanakul, N.; et al. MetaSleepLearner: A Pilot Study on Fast Adaptation of Bio-signals-Based Sleep Stage Classifier to New Individual Subject Using Meta-Learning. IEEE J. Biomed. Health Inform. 2020, 25, 1949–1963. [Google Scholar] [CrossRef]
  7. American Sleep Apnea Association. Available online: https://www.sleepapnea.org/learn/sleep-apnea-information-clinicians/ (accessed on 22 April 2019).
  8. AASM. American Academy of Sleep Medicine: Economic Burden of Undiagnosed Sleep Apnea in U.S. Is Nearly $150B per Year, 2023. Available online: https://aasm.org/economic-burden-of-undiagnosed-sleep-apnea-in-u-s-is-nearly-150b-per-year/ (accessed on 8 August 2016).
  9. Lang, C.C.; Mancini, D.M. Non-cardiac comorbidities in chronic heart failure. Heart 2007, 93, 665–671. [Google Scholar] [CrossRef] [Green Version]
  10. Quan, S.; Gillin, J.C.; Littner, M.; Shepard, J. Sleep-related breathing disorders in adults: Recommendations for syndrome definition and measurement techniques in clinical research. editorials. Sleep 1999, 22, 662–689. [Google Scholar] [CrossRef] [Green Version]
  11. Young, T.; Finn, L.; Kim, H. Nasal obstruction as a risk factor for sleep-disordered breathing. J. Allergy Clin. Immunol. 1997, 99, S757–S762. [Google Scholar] [CrossRef]
  12. Hung, P.D. Detection of Central Sleep Apnea Based on a Single-Lead ECG. In Proceedings of the 2018 5th International Conference on Bioinformatics Research and Applications, Hong Kong, China, 27–29 December 2018; ACM: New York, NY, USA, 2018; pp. 78–83. [Google Scholar] [CrossRef]
  13. Daldal, N.; Cömert, Z.; Polat, K. Automatic determination of digital modulation types with different noises using Convolutional Neural Network based on time–frequency information. Appl. Soft Comput. 2020, 86, 105834. [Google Scholar] [CrossRef]
  14. Bozkurt, F.; Uçar, M.K.; Bozkurt, M.R.; Bilgin, C. Detection of abnormal respiratory events with single channel ECG and hybrid machine learning model in patients with obstructive sleep apnea. IRBM 2020, 41, 241–251. [Google Scholar] [CrossRef]
  15. Wang, X.; Cheng, M.; Wang, Y.; Liu, S.; Tian, Z.; Jiang, F.; Zhang, H. Obstructive sleep apnea detection using ecg-sensor with convolutional neural networks. Multimed. Tools Appl. 2020, 79, 15813–15827. [Google Scholar] [CrossRef]
  16. Urtnasan, E.; Park, J.U.; Joo, E.Y.; Lee, K.J. Automated Detection of Obstructive Sleep Apnea Events from a Single-Lead Electrocardiogram Using a Convolutional Neural Network. J. Med. Syst. 2018, 42, 104. [Google Scholar] [CrossRef]
  17. Faust, O.; Barika, R.; Shenfield, A.; Ciaccio, E.J.; Acharya, U.R. Accurate detection of sleep apnea with long short-term memory network based on RR interval signals. Knowl.-Based Syst. 2021, 212, 106591. [Google Scholar] [CrossRef]
  18. Schwartz, A.R.; Cohen-Zion, M.; Pham, L.V.; Gal, A.; Sowho, M.; Sgambati, F.P.; Klopfer, T.; Guzman, M.A.; Hawks, E.M.; Etzioni, T.; et al. Brief digital sleep questionnaire powered by machine learning prediction models identifies common sleep disorders. Sleep Med. 2020, 71, 66–76. [Google Scholar] [CrossRef] [PubMed]
  19. Lakhan, P.; Ditthapron, A.; Banluesombatkul, N.; Wilaiprasitporn, T. Deep Neural Networks with Weighted Averaged Overnight Airflow Features for Sleep Apnea-Hypopnea Severity Classification. In Proceedings of the TENCON 2018—2018 IEEE Region 10 Conference, Jeju, Republic of Korea, 28–31 October 2018; pp. 0441–0445. [Google Scholar] [CrossRef] [Green Version]
  20. Banluesombatkul, N.; Rakthanmanon, T.; Wilaiprasitporn, T. Single Channel ECG for Obstructive Sleep Apnea Severity Detection Using a Deep Learning Approach. In Proceedings of the TENCON 2018—2018 IEEE Region 10 Conference, Jeju, Republic of Korea, 28–31 October 2018; pp. 2011–2016. [Google Scholar] [CrossRef] [Green Version]
  21. Herzog, M.; Plößl, S.; Glien, A.; Herzog, B.; Rohrmeier, C.; Kühnel, T.; Plontke, S.; Kellner, P. Evaluation of acoustic characteristics of snoring sounds obtained during drug-induced sleep endoscopy. Sleep Breath. 2015, 19, 1011–1019. [Google Scholar] [CrossRef]
  22. Alshaer, H.; Hummel, R.; Mendelson, M.; Marshal, T.; Bradley, T.D. Objective Relationship Between Sleep Apnea and Frequency of Snoring Assessed by Machine Learning. J. Clin. Sleep Med. 2019, 15, 463–470. [Google Scholar] [CrossRef]
  23. Kang, B.; Dang, X.; Wei, R. Snoring and apnea detection based on hybrid neural networks. In Proceedings of the 2017 International Conference on Orange Technologies (ICOT), Singapore, 8–10 December 2017; pp. 57–60. [Google Scholar] [CrossRef]
  24. Sheta, A.; Turabieh, H.; Braik, M.; Surani, S.R. Diagnosis of Obstructive Sleep Apnea Using Logistic Regression and Artificial Neural Networks Models. In Proceedings of the Future Technologies Conference (FTC) 2019, San Francisco, CA, USA, 24–25 October 2019; Arai, K., Bhatia, R., Kapoor, S., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 766–784. [Google Scholar]
  25. Aiyer, I.; Shaik, L.; Sheta, A.; Surani, S. Review of Application of Machine Learning as a Screening Tool for Diagnosis of Obstructive Sleep Apnea. Medicina 2022, 58, 1574. [Google Scholar] [CrossRef]
  26. Surani, S.; Sheta, A.; Turabieh, H.; Subramanian, S. Adaboosting Model for Detecting OSA. Chest 2019, 156, A134–A135. [Google Scholar] [CrossRef]
  27. Surani, S.; Sheta, A.; Turabieh, H.; Park, J.; Mathur, S.; Katangur, A. Diagnosis of Sleep Apnea Using artificial Neural Network and binary Particle Swarm Optimization for Feature Selection. Chest 2019, 156, A136. [Google Scholar] [CrossRef]
  28. Haberfeld, C.; Sheta, A.; Hossain, M.S.; Turabieh, H.; Surani, S. SAS Mobile Application for Diagnosis of Obstructive Sleep Apnea Utilizing Machine Learning Models. In Proceedings of the 2020 11th IEEE Annual Ubiquitous Computing, Electronics Mobile Communication Conference (UEMCON), New York, NY, USA, 28–31 October 2020; pp. 0522–0529. [Google Scholar] [CrossRef]
  29. Rossi, C.; Templier, L.; Miguez, M.; Cruz, J.D.L.; Curto, A.; Albaladejo, A.; Vich, M.L. Comparison of screening methods for obstructive sleep apnea in the context of dental clinics: A systematic review. CRANIO® 2023, 41, 245–263. [Google Scholar] [CrossRef]
  30. Subramanian, S.; Hesselbacher, S.; Aguilar, R.; Surani, S. The NAMES assessment: A novel combined-modality screening tool for obstructive sleep apnea. Sleep Breath. 2011, 15, 819–826. [Google Scholar] [CrossRef]
  31. Friedman, M.; Hwang, M.S. Evaluation of the patient with obstructive sleep apnea: Friedman tongue position and staging. Oper. Tech.-Otolaryngol.-Head Neck Surg. 2015, 26, 85–89, Sleep Disordered Breathing: Part 1. [Google Scholar] [CrossRef]
  32. Kotsiantis, S.; Kanellopoulos, D.; Pintelas, P. Data Preprocessing for Supervised Learning. Int. J. Comput. Sci. 2006, 1, 111–117. [Google Scholar]
  33. Malley, B.; Ramazzotti, D.; Wu, J.T.y. Data Pre-processing. In Secondary Analysis of Electronic Health Records; Springer International Publishing: Cham, Switzerland, 2016; pp. 115–141. [Google Scholar]
  34. Fan, C.; Chen, M.; Wang, X.; Wang, J.; Huang, B. A Review on Data Preprocessing Techniques Toward Efficient and Reliable Knowledge Discovery From Building Operational Data. Front. Energy Res. 2021, 9, 77. [Google Scholar] [CrossRef]
  35. Little, R.J.A.; Rubin, D.B. Statistical Analysis with Missing Data; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1986. [Google Scholar]
  36. Sterne, J.A.C.; White, I.R.; Carlin, J.B.; Spratt, M.; Royston, P.; Kenward, M.G.; Wood, A.M.; Carpenter, J.R. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ 2009, 338, b2393. [Google Scholar] [CrossRef] [PubMed]
  37. Sola, J.; Sevilla, J. Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Trans. Nucl. Sci. 1997, 44, 1464–1468. [Google Scholar] [CrossRef]
  38. Sathya Durga, V.; Jeyaprakash, T. An Effective Data Normalization Strategy for Academic Datasets using Log Values. In Proceedings of the 2019 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 17–19 July 2019; pp. 610–612. [Google Scholar] [CrossRef]
  39. Mohsenin, V.; Yaggi, H.K.; Shah, N.; Dziura, J. The effect of gender on the prevalence of hypertension in obstructive sleep apnea. Sleep Med. 2009, 10, 759–762. [Google Scholar] [CrossRef]
  40. Freitas, L.S.; Drager, L.F. Gender and cardiovascular impact of obstructive sleep apnea: Work in progress! J. Thorac. Dis. 2017, 9, 3579–3582. [Google Scholar] [CrossRef] [Green Version]
  41. Ralls, F.M.; Grigg-Damberger, M. Roles of gender, age, race/ethnicity, and residential socioeconomics in obstructive sleep apnea syndromes. Curr. Opin. Pulm. Med. 2012, 18, 568–573. [Google Scholar] [CrossRef]
  42. Slaats, M.; Vos, W.; Van Holsbeke, C.; De Backer, J.; Loterman, D.; De Backer, W.; Boudewyns, A.; Verhulst, S. The role of ethnicity in the upper airway in a Belgian paediatric population with obstructive sleep apnoea. Eur. Respir. J. 2017, 50. [Google Scholar] [CrossRef]
  43. Cano-Pumarega, I.; Barbé, F.; Esteban, A.; Martínez-Alonso, M.; Egea, C.; Durán-Cantolla, J.; Montserrat, J.M.; Muria, B.; Sánchez de la Torre, M.; Abad Fernández, A. Sleep Apnea and Hypertension: Are There Sex Differences? The Vitoria Sleep Cohort. Chest 2017, 152, 742–750. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  44. Mohsenin, V. Gender Differences in the Expression of Sleep-Disordered Breathing: Role of Upper Airway Dimensions. Chest 2001, 120, 1442–1447. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  45. Liu, H.; Motoda, H. Feature Selection for Knowledge Discovery and Data Mining; Springer Science & Business Media: Berlin, Germany, 2012; Volume 454. [Google Scholar]
  46. Chantar, H.K.; Corne, D.W. Feature subset selection for Arabic document categorization using BPSO-KNN. In Proceedings of the 2011 Third World Congress on Nature and Biologically Inspired Computing, Salamanca, Spain, 19–21 October 2011; pp. 546–551. [Google Scholar]
  47. Chantar, H.; Thaher, T.; Turabieh, H.; Mafarja, M.; Sheta, A. BHHO-TVS: A Binary Harris Hawks Optimizer with Time-Varying Scheme for Solving Data Classification Problems. Appl. Sci. 2021, 11, 6516. [Google Scholar] [CrossRef]
  48. Tumar, I.; Hassouneh, Y.; Turabieh, H.; Thaher, T. Enhanced Binary Moth Flame Optimization as a Feature Selection Algorithm to Predict Software Fault Prediction. IEEE Access 2020, 8, 8041–8055. [Google Scholar] [CrossRef]
  49. Wang, A.; An, N.; Chen, G.; Li, L.; Alterovitz, G. Accelerating wrapper-based feature selection with K-nearest-neighbor. Knowl.-Based Syst. 2015, 83, 81–91. [Google Scholar] [CrossRef]
  50. Saeys, Y.; Inza, I.; Larrañaga, P. A review of feature selection techniques in bioinformatics. Bioinformatics 2007, 23, 2507–2517. [Google Scholar] [CrossRef] [Green Version]
  51. Dash, M.; Liu, H. Feature selection for classification. Intell. Data Anal. 1997, 1, 131–156. [Google Scholar] [CrossRef]
  52. Siedlecki, W.; Sklansky, J. On automatic feature selection. Int. J. Pattern Recognit. Artif. Intell. 1988, 2, 197–220. [Google Scholar] [CrossRef]
  53. Langley, P. Selection of relevant features in machine learning. In Proceedings of the AAAI Fall Symposium on Relevance, New Orleans, LA, USA, 4–6 November 1994. [Google Scholar]
  54. Lai, C.; Reinders, M.J.; Wessels, L. Random subspace method for multivariate feature selection. Pattern Recognit. Lett. 2006, 27, 1067–1076. [Google Scholar] [CrossRef]
  55. Talbi, E. Metaheuristics: From Design to Implementation; Wiley Online Library: Hoboken, NJ, USA, 2009. [Google Scholar]
  56. Lu, J.J.; Zhang, M. Heuristic Search. In Encyclopedia of Systems Biology; Springer: New York, NY, USA, 2013; pp. 885–886. [Google Scholar] [CrossRef]
  57. Kennedy, J.; Eberhart, R.C. A discrete binary version of the particle swarm algorithm. In Proceedings of the 1997 IEEE International Conference on Systems, Man, and Cybernetics, Computational Cybernetics and Simulation, Orlando, FL, USA, 12–15 October 1997; Volume 5, pp. 4104–4108. [Google Scholar]
  58. Dorigo, M.; Birattari, M.; Stutzle, T. Ant colony optimization. IEEE Comput. Intell. Mag. 2006, 1, 28–39. [Google Scholar] [CrossRef]
  59. Yang, X.S. Firefly Algorithms for Multimodal Optimization. In Stochastic Algorithms: Foundations and Applications; Watanabe, O., Zeugmann, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 169–178. [Google Scholar]
  60. Mirjalili, S. The ant lion optimizer. Adv. Eng. Softw. 2015, 83, 80–98. [Google Scholar] [CrossRef]
  61. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
  62. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef] [Green Version]
  63. Mafarja, M.; Mirjalili, S. Hybrid Whale Optimization Algorithm with Simulated Annealing for Feature Selection. Neurocomputing 2017, 260, 302–312. [Google Scholar] [CrossRef]
  64. Chantar, H.; Mafarja, M.; Alsawalqah, H.; Heidari, A.A.; Aljarah, I.; Faris, H. Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification. Neural Comput. Appl. 2020, 32, 12201–12220. [Google Scholar] [CrossRef]
  65. Zhang, L.; Mistry, K.; Lim, C.; Neoh, S. Feature selection using firefly optimization for classification and regression models. Decis. Support Syst. 2018, 106, 64–85. [Google Scholar] [CrossRef] [Green Version]
  66. Deriche, M. Feature Selection using Ant Colony Optimization. In Proceedings of the 2009 6th International Multi-Conference on Systems, Signals and Devices, Djerba, Tunisia, 23–26 March 2009; pp. 1–4. [Google Scholar] [CrossRef]
  67. Zawbaa, H.M.; Emary, E.; Parv, B. Feature selection based on antlion optimization algorithm. In Proceedings of the 2015 Third World Conference on Complex Systems (WCCS), Marrakech, Morocco, 23–25 November 2015; pp. 1–7. [Google Scholar] [CrossRef]
  68. Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210. [Google Scholar] [CrossRef]
  69. Thaher, T.; Mafarja, M.; Turabieh, H.; Castillo, P.A.; Faris, H.; Aljarah, I. Teaching Learning-Based Optimization With Evolutionary Binarization Schemes for Tackling Feature Selection Problems. IEEE Access 2021, 9, 41082–41103. [Google Scholar] [CrossRef]
  70. Eberhart, R.; Kennedy, J. Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
  71. Mirjalili, S.; Lewis, A. S-shaped versus V-shaped transfer functions for binary particle swarm optimization. Swarm Evol. Comput. 2013, 9, 1–14. [Google Scholar] [CrossRef]
  72. Mafarja, M.; Aljarah, I.; Faris, H.; Hammouri, A.I.; Ala’M, A.Z.; Mirjalili, S. Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Syst. Appl. 2019, 117, 267–286. [Google Scholar] [CrossRef]
  73. Thaher, T.; Heidari, A.A.; Mafarja, M.; Dong, J.S.; Mirjalili, S. Binary Harris Hawks optimizer for high-dimensional, low sample size feature selection. In Evolutionary Machine Learning Techniques; Springer: Singapore, 2020; pp. 251–272. [Google Scholar]
  74. Wolpert, D.H. The Lack of A Priori Distinctions Between Learning Algorithms. Neural Comput. 1996, 8, 1341–1390. [Google Scholar] [CrossRef]
  75. Faris, H.; Heidari, A.A.; Ala’M, A.Z.; Mafarja, M.; Aljarah, I.; Eshtay, M.; Mirjalili, S. Time-varying hierarchical chains of salps with random weight networks for feature selection. Expert Syst. Appl. 2020, 140, 112898. [Google Scholar] [CrossRef]
  76. Rashedi, E.; Nezamabadi-Pour, H.; Saryazdi, S. BGSA: Binary gravitational search algorithm. Nat. Comput. 2010, 9, 727–745. [Google Scholar] [CrossRef]
  77. Kumar, V.; Kumar, D. Binary whale optimization algorithm and its application to unit commitment problem. Neural Comput. Appl. 2020, 32, 2095–2123. [Google Scholar] [CrossRef]
  78. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary grey wolf optimization approaches for feature selection. Neurocomputing 2016, 172, 371–381. [Google Scholar] [CrossRef]
  79. Mirjalili, S.; Mirjalili, S.M.; Yang, X.S. Binary bat algorithm. Neural Comput. Appl. 2014, 25, 663–681. [Google Scholar] [CrossRef]
Figure 1. The proposed methodology. The figure illustrates the step-by-step process of the proposed methodology: data collection, data preprocessing, data grouping, classification, feature selection, and evaluation.
Figure 2. Friedman tongue position (FTP): (a) Class 1 visualizes the uvula and tonsils/pillar. (b) Class 2a visualizes most of the uvula but not the tonsils/pillar. (c) Class 2b visualizes the entire soft palate to the uvular base. (d) Class 3 shows part of the soft palate with the distal end absent. (e) Class 4 visualizes only the hard palate [31].
Figure 3. Imputation of Missing Values. NaN refers to a missing value. Mean is a statistical measure used for imputing missing numerical values by replacing them with the mean value of the available data. Mode is a statistical measure used for imputing missing categorical values by replacing them with the mode (most frequently occurring value) of the available data.
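The mean/mode imputation illustrated in Figure 3 can be reproduced in a few lines of pandas. The following is a minimal sketch, not the authors' preprocessing code; the column names (Age, BMI, Race) are illustrative only.

```python
import pandas as pd

# Toy frame with missing entries (NaN); column names are illustrative only.
df = pd.DataFrame({
    "Age":  [52, None, 61, 47],         # numeric -> impute with column mean
    "BMI":  [31.2, 28.4, None, 35.0],   # numeric -> impute with column mean
    "Race": ["Caucasian", None, "Hispanic", "Caucasian"],  # categorical -> impute with mode
})

numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.columns.difference(numeric_cols)

# Replace missing numeric values with the mean of the available values.
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# Replace missing categorical values with the most frequent value (mode).
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

print(df)
```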
Figure 4. Bar charts comparing data samples with and without obstructive sleep apnea (OSA) grouped by age, gender, and race. The charts provide insights into the prevalence of OSA within different demographics.
Figure 5. S-shaped transfer functions (TFs). The figure illustrates the curves of four distinct S-shaped transfer functions.
Figure 6. Overall ranking of classifiers. The figure displays the overall ranking of tested classifiers (DT, LDA, LR, NB, SVM, KNN, Ensemble, optimized SVM, and optimized KNN) based on the Friedman test. The rankings provide insights into the comparative performance of these classifiers, thus aiding in the identification of the most effective ones for the task at hand.
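The Friedman-test ranking summarized in Figure 6 can be reproduced with SciPy: each classifier's accuracy is ranked within every dataset and the per-dataset ranks are averaged. The sketch below uses made-up accuracy values purely for illustration; it is not the authors' evaluation script.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Rows = datasets, columns = classifiers; the numbers are illustrative only.
classifiers = ["DT", "kNN", "SVM*", "kNN*"]
accuracy = np.array([
    [0.59, 0.69, 0.72, 0.74],
    [0.73, 0.75, 0.74, 0.75],
    [0.63, 0.66, 0.67, 0.77],
    [0.71, 0.75, 0.72, 0.74],
    [0.53, 0.63, 0.67, 0.70],
])

# Friedman test: do the classifiers differ across datasets?
stat, p_value = friedmanchisquare(*accuracy.T)
print(f"Friedman statistic = {stat:.3f}, p = {p_value:.4f}")

# Mean rank per classifier (rank 1 = best, i.e., highest accuracy).
ranks = np.vstack([rankdata(-row) for row in accuracy])
for name, mean_rank in zip(classifiers, ranks.mean(axis=0)):
    print(f"{name}: mean rank {mean_rank:.2f}")
```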
Figure 7. Convergence behavior of compared algorithms. The figure depicts the convergence behavior of feature selection algorithms, thereby showcasing their fitness progress over iterations and assisting in the identification of an effective approach.
Figure 8. Importance of features in terms of the number of times each feature was selected by BPSO for all datasets [over 10 runs for each dataset].
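Figure 8 counts how often each feature is retained by the BPSO wrapper across runs. Below is a minimal sketch of such a wrapper, assuming the S1 transfer function from Table 4, 5-fold cross-validated kNN accuracy as the fitness signal, and a fitness that trades classification error against the fraction of selected features (the weighting alpha = 0.99 and the synthetic data are assumptions, not values or data taken from the paper).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
# Synthetic stand-in for the OSA data: 274 samples, 31 features.
X, y = make_classification(n_samples=274, n_features=31, n_informative=10, random_state=42)

def s1(v):
    """S1 transfer function: maps a velocity to a selection probability."""
    return 1.0 / (1.0 + np.exp(-2.0 * v))

def fitness(mask, alpha=0.99):
    """Weighted sum of CV error and subset size (assumed weighting)."""
    if mask.sum() == 0:
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=5)
    acc = cross_val_score(knn, X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * (1.0 - acc) + (1.0 - alpha) * mask.sum() / mask.size

n_particles, n_features, n_iter = 10, X.shape[1], 20
w, c1, c2 = 0.9, 2.0, 2.0
pos = rng.integers(0, 2, (n_particles, n_features))
vel = rng.uniform(-1, 1, (n_particles, n_features))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmin()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, n_features))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    # Stochastic binarization: bit set to 1 with probability S1(velocity).
    pos = (rng.random((n_particles, n_features)) < s1(vel)).astype(int)
    fit = np.array([fitness(p) for p in pos])
    improved = fit < pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmin()].copy()

print("Selected features:", np.flatnonzero(gbest))
print("Best fitness:", pbest_fit.min())
```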
Figure 9. Bar chart of accuracy percentage change after grouping based on age, gender, and race. The figure compares the accuracy of the model when using the entire dataset (all) to the accuracy after grouping based on age, gender, and race. Positive values denote an increase in accuracy, thus showcasing the effectiveness of grouping based on these demographic factors.
Table 1. List of dataset features.
ID | Attribute | Data Type
f1 | Race | Categorical
f2 | Age | Numeric
f3 | Sex | Categorical
f4 | BMI | Categorical
f5 | Epworth | Numeric
f6 | Waist | Numeric
f7 | Hip | Numeric
f8 | RDI | Numeric
f9 | Neck | Numeric
f10 | M.Friedman | Numeric
f11 | Co-morbid | Categorical
f12 | Snoring | Categorical
f13 | Daytime sleepiness | Categorical
f14 | DM | Categorical
f15 | HTN | Categorical
f16 | CAD | Categorical
f17 | CVA | Categorical
f18 | TST | Numeric
f19 | Sleep Effic | Numeric
f20 | REM AHI | Numeric
f21 | NREM AHI | Numeric
f22 | Supine AHI | Numeric
f23 | Apnea Index | Numeric
f24 | Hypopnea Index | Numeric
f25 | Berlin Q | Categorical
f26 | Arousal index | Numeric
f27 | Awakening Index | Numeric
f28 | PLM Index | Numeric
f29 | Mins. SaO2 | Numeric
f30 | Mins. SaO2 Desats | Numeric
f31 | Lowest SaO2 | Numeric
class | Witnessed apnea | Categorical
Table 2. Epworth scale range.
Range | Description
0–5 | Lower normal daytime sleepiness
6–10 | Higher normal daytime sleepiness
11–12 | Mild level of sleepiness experienced during the daytime
13–15 | Moderate level of sleepiness experienced during the daytime
16–24 | Significant level of sleepiness experienced during the daytime
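The Epworth ranges in Table 2 map directly onto a small helper function. The sketch below is illustrative only; the category labels are shortened paraphrases of the table entries.

```python
def epworth_category(score: int) -> str:
    """Map an Epworth Sleepiness Scale score (0-24) to the band defined in Table 2."""
    if not 0 <= score <= 24:
        raise ValueError("Epworth score must be between 0 and 24")
    if score <= 5:
        return "lower normal daytime sleepiness"
    if score <= 10:
        return "higher normal daytime sleepiness"
    if score <= 12:
        return "mild daytime sleepiness"
    if score <= 15:
        return "moderate daytime sleepiness"
    return "significant daytime sleepiness"

print(epworth_category(9))   # -> higher normal daytime sleepiness
print(epworth_category(14))  # -> moderate daytime sleepiness
```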
Table 3. Description of sleep apnea dataset. The table provides essential information about the sleep apnea dataset, including the total number of samples (No. samples), the number of features (No. features), the count of positive samples (No. positive samples), and the count of negative samples (No. negative samples).
Group | Dataset | No. Features | No. Samples | Negative | Positive
– | Original Dataset | 31 | 274 | 149 | 125
Race | Caucasian | 30 | 151 | 92 | 59
Race | Hispanic | 30 | 123 | 57 | 66
Gender | Females | 30 | 118 | 85 | 33
Gender | Males | 30 | 156 | 64 | 92
Age | Age ≤ 50 | 31 | 109 | 55 | 54
Age | Age > 50 | 31 | 165 | 94 | 71
Table 4. S-shaped transfer functions. The table provides the names and formulas of four S-shaped functions (S1, S2, S3, and S4). These functions exhibit the characteristic sigmoidal shape, which is commonly observed in S-shaped curves.
Name | Transfer Function Formula
S1 | $S(x) = \frac{1}{1 + e^{-2x}}$
S2 | $S(x) = \frac{1}{1 + e^{-x}}$
S3 | $S(x) = \frac{1}{1 + e^{-x/2}}$
S4 | $S(x) = \frac{1}{1 + e^{-x/3}}$
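In a binary PSO, the S-shaped functions of Table 4 squash a continuous velocity into a probability that the corresponding feature bit is set to 1. The small sketch below assumes the commonly used stochastic binarization rule from the binary PSO literature [57,71]: a bit is set to 1 when a uniform random number falls below S(velocity).

```python
import numpy as np

# The four S-shaped transfer functions of Table 4.
S = {
    "S1": lambda x: 1.0 / (1.0 + np.exp(-2.0 * x)),
    "S2": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "S3": lambda x: 1.0 / (1.0 + np.exp(-x / 2.0)),
    "S4": lambda x: 1.0 / (1.0 + np.exp(-x / 3.0)),
}

def binarize(velocity: np.ndarray, tf: str = "S1", rng=None) -> np.ndarray:
    """Turn a real-valued velocity vector into a 0/1 feature-selection mask."""
    rng = rng or np.random.default_rng()
    prob = S[tf](velocity)
    return (rng.random(velocity.shape) < prob).astype(int)

v = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(S["S1"](v))         # selection probabilities
print(binarize(v, "S1"))  # one stochastic realization of the bit mask
```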
Table 5. The detailed parameter settings of the preset classifiers.
Preset Classifier | Description | Parameter Settings
FDT | Fine Decision Tree | Maximum number of splits = 100; Split criterion = Gini's diversity index
CDT | Coarse Decision Tree | Maximum number of splits = 100; Split criterion = Gini's diversity index
LDA | Linear Discriminant Analysis | Discriminant type = Linear
LR | Logistic Regression | –
GNB | Gaussian Naïve Bayes | Distribution name = Gaussian
KNB | Kernel Naïve Bayes | Distribution name = Kernel; Kernel type = Gaussian
LSVM | Linear Support Vector Machine | Kernel function = Linear; Kernel scale = Automatic; Box constraint level = 1; Standardize data = TRUE
MGSVM | Medium Gaussian SVM | Kernel function = Gaussian; Kernel scale = 5.6; Box constraint level = 1; Standardize data = TRUE
CGSVM | Coarse Gaussian SVM | Kernel function = Gaussian; Kernel scale = 22; Box constraint level = 1; Standardize data = TRUE
CKNN | Cosine k-Nearest Neighbor | Number of neighbors = 10; Distance metric = Cosine; Distance weight = Equal; Standardize data = TRUE
WKNN | Weighted kNN | Number of neighbors = 10; Distance metric = Euclidean; Distance weight = Squared inverse; Standardize data = TRUE
Ensemble | Subspace Discriminant | Ensemble method = Subspace; Learner type = Discriminant; Number of learners = 30; Subspace dimension = 16
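Table 5 lists MATLAB Classification Learner presets. Roughly comparable configurations can be written in scikit-learn, as sketched below. This is only an approximate mapping, not the toolbox used in the study: scikit-learn has no "kernel scale" parameter, so gamma = 1/kernel_scale² on standardized data is used as one common equivalence, "distance" weighting stands in for squared-inverse weighting, and the subspace-discriminant ensemble is omitted.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Approximate scikit-learn counterparts of the presets in Table 5.
presets = {
    "FDT":   DecisionTreeClassifier(max_leaf_nodes=101, criterion="gini"),  # at most 100 splits
    "LDA":   LinearDiscriminantAnalysis(),
    "LR":    LogisticRegression(max_iter=1000),
    "GNB":   GaussianNB(),
    "LSVM":  make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0)),
    "MGSVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=1 / 5.6**2)),
    "CGSVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=1 / 22**2)),
    "CKNN":  make_pipeline(StandardScaler(),
                           KNeighborsClassifier(n_neighbors=10, metric="cosine")),
    "WKNN":  make_pipeline(StandardScaler(),
                           KNeighborsClassifier(n_neighbors=10, weights="distance")),
}
```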
Table 6. Parameter settings of the optimized kNN for each dataset.
Dataset | Number of Neighbors | Distance Metric | Distance Weight | Standardize Data
All | 16 | Spearman | Inverse | TRUE
Caucasian | 6 | Correlation | Squared inverse | TRUE
Hispanic | 32 | Cityblock | Squared inverse | TRUE
Females | 4 | Hamming | Squared inverse | FALSE
Males | 22 | Cityblock | Equal | FALSE
Age ≤ 50 | 14 | Cosine | Squared inverse | TRUE
Age > 50 | 3116 | Squared inverse | TRUE
Table 7. Parameter settings of the optimized SVM for each dataset.
Dataset | Kernel Function | Kernel Scale | Box Constraint Level | Standardize Data
All | Polynomial (degree = 2) | 1 | 0.002351927 | TRUE
Caucasian | Linear | 1 | 0.18078 | TRUE
Hispanic | Linear | 1 | 0.01115743 | TRUE
Females | Gaussian | 2.990548535 | 122.3491994 | FALSE
Males | Linear | 1 | 0.001000015 | FALSE
Age ≤ 50 | Gaussian | 415.5625146 | 341.7329909 | FALSE
Age > 50 | Gaussian | 26.42211158 | 7.503025335 | TRUE
Table 8. Performance of different classification algorithms on the data, where X* denotes the optimized classifier X.
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
DT | 0.5876 | 0.6309 | 0.5360 | 0.5834 | 0.6184 | 0.6246 | 0.5815 | 8.97
LDA | 0.6861 | 0.7584 | 0.6000 | 0.6792 | 0.6933 | 0.7244 | 0.6746 | 5.79
LR | 0.6861 | 0.7383 | 0.6240 | 0.6811 | 0.7006 | 0.7190 | 0.6787 | 5.57
NB | 0.6642 | 0.7785 | 0.5280 | 0.6533 | 0.6629 | 0.7160 | 0.6411 | 7.71
SVM | 0.6898 | 0.8188 | 0.5360 | 0.6774 | 0.6778 | 0.7416 | 0.6625 | 5.71
kNN | 0.6934 | 0.7517 | 0.6240 | 0.6878 | 0.7044 | 0.7273 | 0.6849 | 4.21
Ensemble | 0.7044 | 0.8188 | 0.5680 | 0.6934 | 0.6932 | 0.7508 | 0.6820 | 3.93
SVM* | 0.7226 | 0.7919 | 0.6400 | 0.7160 | 0.7239 | 0.7564 | 0.7119 | 2.14
kNN* | 0.7409 | 0.8322 | 0.6320 | 0.7321 | 0.7294 | 0.7774 | 0.7252 | 1.14
Table 9. Performance of different classification algorithms for the grouped data of Caucasian race.
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
FDT | 0.7285 | 0.7935 | 0.6271 | 0.7103 | 0.7684 | 0.7807 | 0.7054 | 3.29
LDA | 0.6908 | 0.7957 | 0.5254 | 0.6606 | 0.7255 | 0.7590 | 0.6466 | 6.71
LR | 0.6887 | 0.7609 | 0.5763 | 0.6686 | 0.7368 | 0.7487 | 0.6622 | 6.21
GNB | 0.6887 | 0.8261 | 0.4746 | 0.6503 | 0.7103 | 0.7638 | 0.6261 | 7.93
MGSVM | 0.7219 | 0.8913 | 0.4576 | 0.6745 | 0.7193 | 0.7961 | 0.6387 | 5.79
CKNN | 0.7483 | 0.9457 | 0.4407 | 0.6932 | 0.7250 | 0.8208 | 0.6455 | 4.36
Ensemble | 0.7219 | 0.8696 | 0.4915 | 0.6805 | 0.7273 | 0.7921 | 0.6538 | 5.07
SVM* | 0.7351 | 0.8478 | 0.5593 | 0.7036 | 0.7500 | 0.7959 | 0.6886 | 3.36
kNN* | 0.7483 | 0.8804 | 0.5424 | 0.7114 | 0.7500 | 0.8100 | 0.6910 | 2.29
Table 10. Performance of different classification algorithms for the grouped data of Hispanic race.
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
CDT | 0.6260 | 0.5263 | 0.7121 | 0.6192 | 0.6122 | 0.5660 | 0.6122 | 8.29
LDA | 0.6423 | 0.5614 | 0.7121 | 0.6368 | 0.6275 | 0.5926 | 0.6323 | 6.50
LR | 0.6260 | 0.5614 | 0.6818 | 0.6216 | 0.6038 | 0.5818 | 0.6187 | 8.00
GNB | 0.6504 | 0.6140 | 0.6818 | 0.6479 | 0.6250 | 0.6195 | 0.6470 | 5.64
MGSVM | 0.6829 | 0.5614 | 0.7879 | 0.6746 | 0.6957 | 0.6214 | 0.6651 | 2.86
WKNN | 0.6585 | 0.5439 | 0.7576 | 0.6507 | 0.6596 | 0.5962 | 0.6419 | 5.07
Ensemble | 0.6585 | 0.5965 | 0.7121 | 0.6543 | 0.6415 | 0.6182 | 0.6517 | 4.57
SVM* | 0.6748 | 0.6316 | 0.7121 | 0.6719 | 0.6545 | 0.6429 | 0.6706 | 3.00
kNN* | 0.7724 | 0.6316 | 0.8939 | 0.7628 | 0.8372 | 0.7200 | 0.7514 | 1.07
Table 11. Performance of different classification algorithms for the grouped data of females.
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
CDT | 0.7119 | 0.8824 | 0.2727 | 0.5775 | 0.7576 | 0.8152 | 0.4906 | 4.86
LDA | 0.6695 | 0.8118 | 0.3030 | 0.5574 | 0.7500 | 0.7797 | 0.4960 | 5.57
LR | 0.6102 | 0.7529 | 0.2424 | 0.4977 | 0.7191 | 0.7356 | 0.4272 | 8.00
KNB | 0.6356 | 0.7176 | 0.4242 | 0.5709 | 0.7625 | 0.7394 | 0.5518 | 5.00
MGSVM | 0.7203 | 1.0000 | 0.0000 | 0.5000 | 0.7203 | 0.8374 | 0.0000 | 5.79
WKNN | 0.7458 | 0.9765 | 0.1515 | 0.5640 | 0.7477 | 0.8469 | 0.3846 | 4.36
Ensemble | 0.7458 | 0.8824 | 0.3939 | 0.6381 | 0.7895 | 0.8333 | 0.5896 | 2.43
SVM* | 0.7203 | 1.0000 | 0.0000 | 0.5000 | 0.7203 | 0.8374 | 0.0000 | 5.79
kNN* | 0.7373 | 0.9176 | 0.2727 | 0.5952 | 0.7647 | 0.8342 | 0.5003 | 3.21
Table 12. Performance of different classification algorithms for the grouped data of males.
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
CDT | 0.5256 | 0.2969 | 0.6848 | 0.4908 | 0.3958 | 0.3393 | 0.4509 | 8.64
LDA | 0.6538 | 0.5938 | 0.6957 | 0.6447 | 0.5758 | 0.5846 | 0.6427 | 4.07
LR | 0.6474 | 0.5938 | 0.6848 | 0.6393 | 0.5672 | 0.5802 | 0.6376 | 5.14
KNB | 0.6234 | 0.6406 | 0.6111 | 0.6259 | 0.5395 | 0.5857 | 0.6257 | 5.86
LSVM | 0.6346 | 0.5313 | 0.7065 | 0.6189 | 0.5574 | 0.5440 | 0.6126 | 6.79
CKNN | 0.6282 | 0.6094 | 0.6413 | 0.6253 | 0.5417 | 0.5735 | 0.6251 | 6.50
Ensemble | 0.6603 | 0.5469 | 0.7391 | 0.6430 | 0.5932 | 0.5691 | 0.6358 | 4.29
SVM* | 0.6667 | 0.6094 | 0.7065 | 0.6579 | 0.5909 | 0.6000 | 0.6562 | 2.57
kNN* | 0.6987 | 0.6250 | 0.7500 | 0.6875 | 0.6349 | 0.6299 | 0.6847 | 1.14
Table 13. Performance of different classification algorithms for the grouped data by age (Age ≤ 50).
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
FDT | 0.7064 | 0.7091 | 0.7037 | 0.7064 | 0.7091 | 0.7091 | 0.7064 | 4.00
LDA | 0.6514 | 0.6727 | 0.6296 | 0.6512 | 0.6491 | 0.6607 | 0.6508 | 7.29
LR | 0.6697 | 0.7091 | 0.6296 | 0.6694 | 0.6610 | 0.6842 | 0.6682 | 6.29
KNB | 0.6239 | 0.5636 | 0.6852 | 0.6244 | 0.6458 | 0.6019 | 0.6214 | 8.00
LSVM | 0.7064 | 0.8000 | 0.6111 | 0.7056 | 0.6769 | 0.7333 | 0.6992 | 4.93
CKNN | 0.6514 | 0.8182 | 0.4815 | 0.6498 | 0.6164 | 0.7031 | 0.6276 | 7.07
Ensemble | 0.7156 | 0.7818 | 0.6481 | 0.7150 | 0.6935 | 0.7350 | 0.7119 | 3.57
SVM* | 0.7523 | 0.7818 | 0.7222 | 0.7520 | 0.7414 | 0.7611 | 0.7514 | 1.64
kNN* | 0.7431 | 0.8364 | 0.6481 | 0.7423 | 0.7077 | 0.7667 | 0.7363 | 2.21
Table 14. Performance of different classification algorithms for the grouped data by age (Age > 50).
Classifier | Accuracy | TPR | TNR | AUC | Precision | F-Score | G-Mean | Mean Rank
CDT | 0.6061 | 0.6702 | 0.5211 | 0.5957 | 0.6495 | 0.6597 | 0.5910 | 8.57
LDA | 0.6667 | 0.7234 | 0.5915 | 0.6575 | 0.7010 | 0.7120 | 0.6542 | 5.36
LR | 0.6606 | 0.7234 | 0.5775 | 0.6504 | 0.6939 | 0.7083 | 0.6463 | 6.43
KNB | 0.6788 | 0.7979 | 0.5211 | 0.6595 | 0.6881 | 0.7389 | 0.6448 | 5.93
CGSVM | 0.6727 | 0.9149 | 0.3521 | 0.6335 | 0.6515 | 0.7611 | 0.5676 | 6.29
CKNN | 0.6909 | 0.7766 | 0.5775 | 0.6770 | 0.7087 | 0.7411 | 0.6697 | 3.43
Ensemble | 0.6909 | 0.7979 | 0.5493 | 0.6736 | 0.7009 | 0.7463 | 0.6620 | 4.29
SVM* | 0.7091 | 0.8511 | 0.5211 | 0.6861 | 0.7018 | 0.7692 | 0.6660 | 3.14
kNN* | 0.7333 | 0.8617 | 0.5634 | 0.7125 | 0.7232 | 0.7864 | 0.6968 | 1.57
Table 15. Overall ranking results for all classifiers in dealing with different datasets based on the classification evaluation metrics reported in Tables 8–14.
Classifier | All | Caucasian | Hispanic | Females | Males | Age ≤ 50 | Age > 50 | Average Rank
DT | 8.97 | 3.29 | 8.29 | 4.86 | 8.64 | 4.00 | 8.57 | 6.66
LDA | 5.79 | 6.71 | 6.50 | 5.57 | 4.07 | 7.29 | 5.36 | 5.90
LR | 5.57 | 6.21 | 8.00 | 8.00 | 5.14 | 6.29 | 6.43 | 6.52
NB | 7.71 | 7.93 | 5.64 | 5.00 | 5.86 | 8.00 | 5.93 | 6.58
SVM | 5.71 | 5.79 | 2.86 | 5.79 | 6.79 | 4.93 | 6.29 | 5.45
kNN | 4.21 | 4.36 | 5.07 | 4.36 | 6.50 | 7.07 | 3.43 | 5.00
Ensemble | 3.93 | 5.07 | 4.57 | 2.43 | 4.29 | 3.57 | 4.29 | 4.02
SVM* | 2.14 | 3.36 | 3.00 | 5.79 | 2.57 | 1.64 | 3.14 | 3.09
kNN* | 1.14 | 2.29 | 1.07 | 3.21 | 1.14 | 2.21 | 1.57 | 1.80
Table 16. Performance evaluation before and after grouping in terms of accuracy measure.
Classifier | All | Caucasian | Hispanic | Females | Males | Age ≤ 50 | Age > 50
DT | 0.5876 | 0.7285 | 0.6260 | 0.7119 | 0.5256 | 0.7064 | 0.6061
LDA | 0.6861 | 0.6908 | 0.6423 | 0.6695 | 0.6538 | 0.6514 | 0.6667
LR | 0.6861 | 0.6887 | 0.6260 | 0.6102 | 0.6474 | 0.6697 | 0.6606
NB | 0.6642 | 0.6887 | 0.6504 | 0.6356 | 0.6234 | 0.6239 | 0.6788
SVM | 0.6898 | 0.7219 | 0.6829 | 0.7203 | 0.6346 | 0.7064 | 0.6727
kNN | 0.6934 | 0.7483 | 0.6585 | 0.7458 | 0.6282 | 0.6514 | 0.6909
Ensemble | 0.7044 | 0.7219 | 0.6585 | 0.7458 | 0.6603 | 0.7156 | 0.6909
SVM* | 0.7226 | 0.7351 | 0.6748 | 0.7203 | 0.6667 | 0.7523 | 0.7091
kNN* | 0.7409 | 0.7483 | 0.7724 | 0.7373 | 0.6987 | 0.7431 | 0.7333
Mean Rank | 3.44 | 1.33 | 5.00 | 3.44 | 6.44 | 3.78 | 4.56
Table 17. Comparison between classification algorithms across all datasets in terms of running time in seconds.
Classifier | All | Caucasian | Hispanic | Females | Males | Age ≤ 50 | Age > 50 | Average Rank
DT | 1.7018 | 0.1440 | 0.1093 | 0.1185 | 0.1123 | 0.0911 | 0.1280 | 1.29
LDA | 1.6479 | 0.1742 | 0.1670 | 0.1720 | 0.1729 | 0.1400 | 0.1654 | 2.14
LR | 0.3802 | 0.3067 | 0.2393 | 0.2837 | 0.2654 | 0.2357 | 0.2433 | 4.00
NB | 3.6635 | 2.1350 | 3.0485 | 2.7358 | 3.8645 | 4.1244 | 3.2748 | 6.86
SVM | 2.9095 | 0.2939 | 0.2926 | 0.2804 | 0.2520 | 0.2577 | 0.2710 | 4.57
kNN | 1.7064 | 0.1768 | 0.1940 | 0.1648 | 0.2247 | 0.1761 | 0.1751 | 3.00
Ensemble | 4.2913 | 1.9599 | 2.1277 | 1.8031 | 1.9468 | 2.0706 | 2.0714 | 6.14
Table 18. Comparison between different BPSO variants using four S-shaped TFs in terms of average accuracy based on the kNN* classifier.
Dataset | Measure | BPSO1 | BPSO2 | BPSO3 | BPSO4
All | AVG | 0.7511 | 0.7518 | 0.7464 | 0.7449
All | STD | 0.0075 | 0.0045 | 0.0039 | 0.0040
Caucasian | AVG | 0.7967 | 0.7954 | 0.7841 | 0.7808
Caucasian | STD | 0.0136 | 0.0114 | 0.0084 | 0.0091
Females | AVG | 0.8034 | 0.7941 | 0.7949 | 0.7907
Females | STD | 0.0078 | 0.0070 | 0.0067 | 0.0098
Age > 50 | AVG | 0.7836 | 0.7782 | 0.7752 | 0.7655
Age > 50 | STD | 0.0111 | 0.0111 | 0.0092 | 0.0064
Hispanic | AVG | 0.7870 | 0.7862 | 0.7797 | 0.7740
Hispanic | STD | 0.0064 | 0.0067 | 0.0071 | 0.0107
Age ≤ 50 | AVG | 0.8321 | 0.8211 | 0.8092 | 0.7945
Age ≤ 50 | STD | 0.0168 | 0.0099 | 0.0149 | 0.0131
Males | AVG | 0.7513 | 0.7205 | 0.7186 | 0.7180
Males | STD | 0.0159 | 0.0081 | 0.0047 | 0.0043
Mean Rank | F-test | 1.14 | 2.00 | 2.86 | 4.00
Table 19. Comparison between different BPSO variants using four S-shaped TFs in terms of average number of selected features [based on kNN*].
Dataset | Measure | BPSO1 | BPSO2 | BPSO3 | BPSO4
All | AVG | 17.6 | 16.4 | 17.7 | 16.9
All | STD | 2.5473 | 3.0984 | 2.8304 | 2.6854
Caucasian | AVG | 13.0 | 13.7 | 14.9 | 14.5
Caucasian | STD | 3.2660 | 2.9458 | 3.4785 | 2.7988
Females | AVG | 14.1 | 14.4 | 14.8 | 13.2
Females | STD | 1.8529 | 2.2211 | 2.0440 | 3.1903
Age > 50 | AVG | 14.8 | 15.0 | 15.4 | 16.0
Age > 50 | STD | 2.6583 | 2.4944 | 1.8974 | 3.4641
Hispanic | AVG | 14.6 | 14.7 | 15.4 | 16.2
Hispanic | STD | 0.6992 | 2.0575 | 2.5473 | 2.8206
Age ≤ 50 | AVG | 14.8 | 15.6 | 15.4 | 15.5
Age ≤ 50 | STD | 1.3166 | 1.7764 | 2.3190 | 2.5927
Males | AVG | 12.5 | 12.8 | 14.8 | 13.6
Males | STD | 2.8771 | 3.1903 | 2.4404 | 1.3499
Mean Rank | F-test | 1.43 | 2.29 | 3.43 | 2.86
Table 20. Comparison between different BPSO variants using four S-shaped TFs in terms of average running time in seconds [based on kNN*].
Dataset | Measure | BPSO1 | BPSO2 | BPSO3 | BPSO4
All | AVG | 464.0 | 481.7 | 466.2 | 467.8
All | STD | 4.8617 | 3.6674 | 3.3655 | 3.2812
Caucasian | AVG | 368.1 | 374.5 | 371.8 | 374.0
Caucasian | STD | 4.3402 | 4.2569 | 3.6641 | 3.5733
Females | AVG | 245.8 | 247.9 | 247.9 | 248.4
Females | STD | 2.1698 | 2.1417 | 1.7136 | 2.1340
Age > 50 | AVG | 266.9 | 270.2 | 268.7 | 270.0
Age > 50 | STD | 2.5395 | 2.2341 | 2.4543 | 1.5417
Hispanic | AVG | 260.5 | 262.9 | 262.1 | 263.9
Hispanic | STD | 2.9500 | 2.1613 | 1.8231 | 2.4382
Age ≤ 50 | AVG | 261.4 | 262.1 | 264.8 | 263.7
Age ≤ 50 | STD | 2.3115 | 1.7837 | 2.1478 | 2.0247
Males | AVG | 382.7 | 251.8 | 250.3 | 250.9
Males | STD | 46.5774 | 1.8658 | 1.8342 | 1.7393
Mean Rank | F-test | 1.43 | 3.21 | 2.21 | 3.14
Table 21. Comparison between BPSO and various well-known algorithms in terms of average accuracy.
Dataset | Measure | BPSO1 | BHHO | BGSA | BWOA | BGWO | BBA | BALO | BMFO
All | AVG | 0.7511 | 0.7515 | 0.7245 | 0.7507 | 0.7372 | 0.6624 | 0.7474 | 0.7504
All | STD | 0.0075 | 0.0095 | 0.0086 | 0.0042 | 0.0064 | 0.0472 | 0.0038 | 0.0039
Caucasian | AVG | 0.7967 | 0.7821 | 0.7430 | 0.7815 | 0.7623 | 0.7060 | 0.7781 | 0.7821
Caucasian | STD | 0.0136 | 0.0049 | 0.0061 | 0.0054 | 0.0049 | 0.0356 | 0.0056 | 0.0058
Females | AVG | 0.8034 | 0.8068 | 0.7517 | 0.8093 | 0.7864 | 0.6949 | 0.8017 | 0.8059
Females | STD | 0.0078 | 0.0067 | 0.0106 | 0.0072 | 0.0088 | 0.0344 | 0.0091 | 0.0063
Age > 50 | AVG | 0.7836 | 0.7703 | 0.7236 | 0.7691 | 0.7473 | 0.6945 | 0.7649 | 0.7721
Age > 50 | STD | 0.0111 | 0.0078 | 0.0122 | 0.0097 | 0.0081 | 0.0192 | 0.0110 | 0.0077
Hispanic | AVG | 0.7870 | 0.7805 | 0.7382 | 0.7789 | 0.7683 | 0.6805 | 0.7813 | 0.7772
Hispanic | STD | 0.0064 | 0.0094 | 0.0120 | 0.0064 | 0.0103 | 0.0617 | 0.0071 | 0.0042
Age ≤ 50 | AVG | 0.8321 | 0.7991 | 0.7376 | 0.7991 | 0.7661 | 0.6661 | 0.7835 | 0.7853
Age ≤ 50 | STD | 0.0168 | 0.0110 | 0.0189 | 0.0091 | 0.0108 | 0.0604 | 0.0151 | 0.0064
Males | AVG | 0.7513 | 0.7224 | 0.6949 | 0.7154 | 0.7090 | 0.6430 | 0.7160 | 0.7160
Males | STD | 0.0159 | 0.0053 | 0.0106 | 0.0033 | 0.0033 | 0.0327 | 0.0031 | 0.0031
Mean Rank | F-test | 1.57 | 2.29 | 7.00 | 3.36 | 6.00 | 8.00 | 4.36 | 3.43
Table 22. p-values of the Wilcoxon signed rank test based on the accuracy results reported in Table 21 (p-values ≤ 0.05 indicate a statistically significant difference).
BPSO (the best performing method) vs.:
Dataset | BHHO | BGSA | BWOA | BGWO | BBA | BALO | BMFO
All | 2.79 × 10^−1 | 2.62 × 10^−4 | 5.05 × 10^−1 | 1.54 × 10^−3 | 1.69 × 10^−4 | 4.82 × 10^−2 | 5.55 × 10^−1
Caucasian | 8.58 × 10^−3 | 1.51 × 10^−4 | 1.12 × 10^−2 | 2.27 × 10^−4 | 1.66 × 10^−4 | 5.03 × 10^−3 | 1.13 × 10^−2
Females | 3.53 × 10^−1 | 1.50 × 10^−4 | 1.16 × 10^−1 | 1.15 × 10^−3 | 1.60 × 10^−4 | 4.69 × 10^−1 | 5.13 × 10^−1
Age > 50 | 8.05 × 10^−3 | 1.74 × 10^−4 | 8.19 × 10^−3 | 1.62 × 10^−4 | 1.73 × 10^−4 | 2.93 × 10^−3 | 1.83 × 10^−2
Hispanic | 1.04 × 10^−1 | 1.56 × 10^−4 | 1.89 × 10^−2 | 4.39 × 10^−4 | 1.62 × 10^−4 | 1.18 × 10^−1 | 2.24 × 10^−3
Age ≤ 50 | 3.85 × 10^−4 | 1.64 × 10^−4 | 3.28 × 10^−4 | 1.58 × 10^−4 | 1.71 × 10^−4 | 1.61 × 10^−4 | 1.45 × 10^−4
Males | 9.68 × 10^−1 | 1.51 × 10^−4 | 8.81 × 10^−3 | 2.72 × 10^−4 | 1.57 × 10^−4 | 1.26 × 10^−2 | 1.26 × 10^−2
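The p-values in Table 22 come from pairwise Wilcoxon signed-rank tests over per-run accuracies of BPSO versus each competing optimizer. A minimal SciPy sketch, with made-up accuracy vectors (ten runs each) purely for illustration:

```python
from scipy.stats import wilcoxon

# Ten illustrative per-run accuracies for two competing feature-selection methods.
bpso_acc = [0.752, 0.748, 0.761, 0.745, 0.755, 0.750, 0.747, 0.758, 0.753, 0.749]
bgsa_acc = [0.721, 0.730, 0.725, 0.718, 0.727, 0.722, 0.719, 0.731, 0.724, 0.726]

stat, p_value = wilcoxon(bpso_acc, bgsa_acc)
print(f"Wilcoxon statistic = {stat}, p = {p_value:.4f}")
if p_value <= 0.05:
    print("Difference is statistically significant at the 5% level")
```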
Table 23. Comparison between BPSO and various well-known algorithms based on the number of selected features.
Dataset | Measure | BPSO1 | BHHO | BGSA | BWOA | BGWO | BBA | BALO | BMFO
All | AVG | 17.6 | 22.8 | 18.1 | 22.2 | 27.4 | 13 | 25.1 | 24.2
All | STD | 2.55 | 2.10 | 2.23 | 2.66 | 1.17 | 2.11 | 1.37 | 1.81
Caucasian | AVG | 13.0 | 19.9 | 14.6 | 21.5 | 26.2 | 12.1 | 24.6 | 21.7
Caucasian | STD | 3.27 | 3.28 | 2.63 | 2.32 | 0.79 | 2.33 | 1.35 | 1.89
Females | AVG | 14.1 | 23 | 15 | 22.8 | 27 | 13.4 | 24 | 24.8
Females | STD | 1.85 | 1.56 | 1.89 | 2.39 | 1.56 | 2.17 | 1.33 | 1.69
Age > 50 | AVG | 14.8 | 18.6 | 16.9 | 21.3 | 27.3 | 15.1 | 25.6 | 22.5
Age > 50 | STD | 2.66 | 5.87 | 2.88 | 3.47 | 1.64 | 2.42 | 1.65 | 2.12
Hispanic | AVG | 14.6 | 19.6 | 17.3 | 20.7 | 25.2 | 13 | 23.3 | 21.1
Hispanic | STD | 0.70 | 3.78 | 3.53 | 3.59 | 1.23 | 2.45 | 1.89 | 2.38
Age ≤ 50 | AVG | 14.8 | 18.3 | 16.4 | 19 | 25.1 | 12.6 | 23.4 | 19.5
Age ≤ 50 | STD | 1.32 | 2.11 | 2.80 | 1.83 | 1.20 | 2.46 | 1.51 | 1.18
Males | AVG | 12.5 | 19.1 | 14.9 | 18.6 | 25.5 | 11.3 | 22.5 | 21.4
Males | STD | 2.88 | 5.24 | 1.60 | 3.75 | 1.84 | 4.57 | 1.96 | 2.27
Mean Rank | F-test | 1.86 | 4.43 | 3.00 | 4.57 | 8.00 | 1.14 | 6.86 | 6.14
Table 24. p-values of the Wilcoxon signed rank test based on the number of selected features reported in Table 23 (p-values ≤ 0.05 indicate a statistically significant difference).
BPSO (the best performing method) vs.:
Dataset | BHHO | BGSA | BWOA | BGWO | BBA | BALO | BMFO
All | 9.43 × 10^−4 | 7.01 × 10^−1 | 2.35 × 10^−3 | 1.70 × 10^−4 | 1.42 × 10^−3 | 1.73 × 10^−4 | 2.12 × 10^−4
Caucasian | 7.44 × 10^−4 | 2.37 × 10^−1 | 2.65 × 10^−4 | 1.57 × 10^−4 | 5.41 × 10^−1 | 1.63 × 10^−4 | 2.02 × 10^−4
Females | 1.60 × 10^−4 | 2.30 × 10^−1 | 1.65 × 10^−4 | 1.61 × 10^−4 | 4.16 × 10^−1 | 1.62 × 10^−4 | 1.51 × 10^−4
Age > 50 | 1.02 × 10^−1 | 1.37 × 10^−1 | 7.30 × 10^−4 | 1.67 × 10^−4 | 7.31 × 10^−1 | 1.56 × 10^−4 | 1.73 × 10^−4
Hispanic | 1.37 × 10^−2 | 1.20 × 10^−2 | 1.40 × 10^−3 | 1.43 × 10^−4 | 1.08 × 10^−1 | 1.51 × 10^−4 | 1.44 × 10^−4
Age ≤ 50 | 1.58 × 10^−3 | 1.62 × 10^−1 | 4.79 × 10^−4 | 1.67 × 10^−4 | 2.65 × 10^−2 | 1.70 × 10^−4 | 1.61 × 10^−4
Males | 3.42 × 10^−3 | 1.82 × 10^−2 | 1.64 × 10^−3 | 1.68 × 10^−4 | 7.61 × 10^−1 | 2.00 × 10^−4 | 2.03 × 10^−4
Table 25. Comparison between BPSO and various well-known algorithms in terms of running time (in seconds).
Dataset | Measure | BPSO1 | BHHO | BGSA | BWOA | BGWO | BBA | BALO | BMFO
All | AVG | 464.05 | 798.02 | 465.45 | 476.24 | 475.84 | 468.76 | 474.47 | 468.78
All | STD | 4.862 | 9.414 | 4.992 | 4.813 | 5.720 | 5.622 | 7.119 | 6.179
Caucasian | AVG | 368.14 | 613.92 | 376.20 | 378.73 | 377.41 | 376.77 | 375.30 | 374.28
Caucasian | STD | 4.340 | 5.295 | 3.015 | 3.928 | 4.045 | 3.191 | 5.114 | 4.507
Females | AVG | 245.75 | 401.38 | 248.90 | 248.18 | 249.38 | 250.27 | 247.90 | 247.08
Females | STD | 2.170 | 4.390 | 2.088 | 2.839 | 2.664 | 1.802 | 2.385 | 2.445
Age > 50 | AVG | 266.88 | 441.47 | 267.54 | 272.05 | 272.21 | 269.97 | 269.77 | 269.41
Age > 50 | STD | 2.540 | 4.207 | 2.024 | 2.318 | 2.357 | 2.169 | 3.116 | 2.618
Hispanic | AVG | 260.48 | 431.94 | 264.26 | 261.83 | 264.53 | 265.01 | 262.17 | 261.13
Hispanic | STD | 2.950 | 4.529 | 1.983 | 2.527 | 2.019 | 3.110 | 2.884 | 1.943
Age ≤ 50 | AVG | 261.38 | 429.46 | 266.24 | 263.29 | 266.10 | 265.21 | 262.29 | 262.48
Age ≤ 50 | STD | 2.311 | 3.038 | 2.525 | 1.785 | 2.330 | 2.237 | 3.266 | 1.700
Males | AVG | 382.68 | 409.66 | 250.95 | 249.72 | 251.26 | 255.21 | 249.02 | 249.54
Males | STD | 46.577 | 4.520 | 1.540 | 2.240 | 1.781 | 3.173 | 2.732 | 2.101
Mean Rank | F-test | 1.86 | 8.00 | 4.14 | 4.86 | 6.00 | 5.43 | 3.14 | 2.57
Table 26. Best results for classifiers without using feature selection (kNN*) and after using feature selection (BPSO-kNN) in terms of accuracy, number of features, and improvement rate.
Dataset | kNN* Accuracy | kNN* No. Features | BPSO-kNN Accuracy | BPSO-kNN No. Features | Feature Reduction | Accuracy Improvement
All | 0.7409 | 31 | 0.7628 | 18 | 41.94% | 2.19%
Caucasian | 0.7483 | 30 | 0.8080 | 13 | 56.67% | 5.96%
Hispanic | 0.7724 | 30 | 0.7968 | 14 | 53.33% | 2.44%
Females | 0.7373 | 30 | 0.8136 | 13 | 56.67% | 7.63%
Males | 0.6987 | 30 | 0.7885 | 11 | 63.33% | 8.97%
Age ≤ 50 | 0.7431 | 31 | 0.8624 | 15 | 51.61% | 11.93%
Age > 50 | 0.7333 | 31 | 0.8061 | 17 | 45.16% | 7.27%
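The last two columns of Table 26 appear to be computed as the relative reduction in the number of features and the absolute accuracy gain expressed in percentage points. For the full dataset, for instance:

$$\text{Feature reduction} = \frac{31 - 18}{31} \times 100\% \approx 41.94\%, \qquad \text{Accuracy improvement} = (0.7628 - 0.7409) \times 100 \approx 2.19.$$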
Table 27. Details of the features selected by the BPSO run that scored the best accuracy for each dataset [best result out of 10 runs].
Dataset | Accuracy | #Features | Selection mask for f1–f31 (1 = selected, 0 = not selected, – = feature not present in this subset)
All | 0.7628 | 18 | 0010110101101100011011101110011
Caucasian | 0.8080 | 13 | -010110101101100000011000001101
Hispanic | 0.7968 | 14 | -101000110110100100011010101010
Females | 0.8136 | 13 | 10-0110001101100011010010000101
Males | 0.7885 | 11 | 00-0100001101000001001110100101
Age ≤ 50 | 0.8624 | 15 | 0010010110001110100101110100011
Age > 50 | 0.8061 | 17 | 1000011111110010101001100110101
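The selection masks in Table 27 can be turned back into feature names with a few lines of Python. The sketch below decodes the row for the full dataset; the feature list follows Table 1, and "-" marks a feature that is not applicable to a given subgroup.

```python
# Feature names f1..f31 in the order of Table 1.
features = ["Race", "Age", "Sex", "BMI", "Epworth", "Waist", "Hip", "RDI", "Neck",
            "M.Friedman", "Co-morbid", "Snoring", "Daytime sleepiness", "DM", "HTN",
            "CAD", "CVA", "TST", "Sleep Effic", "REM AHI", "NREM AHI", "Supine AHI",
            "Apnea Index", "Hypopnea Index", "Berlin Q", "Arousal index",
            "Awakening Index", "PLM Index", "Mins. SaO2", "Mins. SaO2 Desats",
            "Lowest SaO2"]

mask = "0010110101101100011011101110011"  # 'All' row of Table 27 (18 selected features)

selected = [name for name, bit in zip(features, mask) if bit == "1"]
print(len(selected), "features selected:")
print(", ".join(selected))
```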
Table 28. Comparison of the BPSO-kNN with kNN*, CNN, and MLP. The table compares the performance of the proposed model, BPSO-kNN, which incorporates feature selection, with other models, including kNN*, CNN, and MLP, which do not employ feature selection.
Dataset | CNN Accuracy | CNN Time (s) | MLP Accuracy | MLP Time (s) | kNN* Accuracy | kNN* Time (s) | BPSO-kNN Accuracy | BPSO-kNN Time (s)
All | 0.6105 | 291.656 | 0.5438 | 0.789 | 0.7409 | 1.706 | 0.7628 | 464.050
Caucasian | 0.7283 | 204.591 | 0.6159 | 3.282 | 0.7483 | 0.177 | 0.8080 | 368.142
Hispanic | 0.6513 | 180.510 | 0.5285 | 2.341 | 0.7724 | 0.194 | 0.7968 | 245.754
Females | 0.7023 | 208.605 | 0.6102 | 3.398 | 0.7373 | 0.165 | 0.8136 | 266.882
Males | 0.6263 | 174.349 | 0.5769 | 2.969 | 0.6987 | 0.225 | 0.7885 | 260.481
Age ≤ 50 | 0.6427 | 177.557 | 0.5780 | 2.917 | 0.7431 | 0.176 | 0.8624 | 261.377
Age > 50 | 0.6629 | 219.354 | 0.5455 | 3.483 | 0.7333 | 0.175 | 0.8061 | 382.676
Table 29. Comparison of the proposed BPSO-kNN with other approaches from the literature in terms of AUC scores.
Results of NAMES [30]:
Combination | AUC
NC + MF + CM + ESS + S + BMI | 0.6577
NC + MF + CM + ESS + S + M | 0.6572
NC + MF + CM + ESS + S + BMI + M (NAMES2) | 0.6690
NC + MF + M + ESS + S | 0.6583
BMI + MF + CM + ESS + S + M | 0.6436
(NC + MF) × 2 + CM + ESS + S | 0.6661
(NC + BMI) × 2 + M + ESS + S | 0.6433
(NC + MF) × 2 + M + ESS + S | 0.6484
(NC + BMI) × 2 + CM + ESS + S | 0.6426
(NC + MF + BMI) × 2 + CM + ESS + S + M | 0.6478
Proposed BPSO-kNN versus Haberfeld et al. [28] and Surani et al. [27]:
Dataset | BPSO-kNN Average AUC | SVM [28] | LR [28] | LR [27] | ANN [27]
All | 0.7438 | – | – | – | –
Caucasian | 0.7690 | – | – | – | –
Hispanic | 0.7811 | – | – | – | –
Females | 0.6707 | 0.6220 | 0.6080 | 0.7030 | 0.5830
Males | 0.7318 | 0.6070 | 0.6070 | 0.7130 | 0.6360
Age ≤ 50 | 0.8320 | – | – | – | –
Age > 50 | 0.7684 | – | – | – | –
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
