1. Introduction
Obstructive sleep apnoea (OSA), a disease affecting ca. 1 billion people aged 30 to 60 worldwide, is characterized by complete or partial obstruction of the upper airways during sleep [
1]. Despite its increasing prevalence and the progressive expansion of diagnostic options, it is underdiagnosed [
2]. Left untreated, it can lead to a high incidence of co-morbid conditions (myocardial infarction [
3], stroke [
4], hypertension [
3], diabetes mellitus [
5], cardiac arrhythmias [
3], and depression [
6]) due to pathophysiological processes activated by hypoxia and hypercapnia caused by recurring episodes of upper airway obstruction during sleep [
7]. The gold standard of OSA diagnosis has been polysomnography (PSG) in sleep laboratories [
8], but its staffing and equipment requirements, cost and relatively limited accessibility prevent sleep laboratories from expanding in line with demand. This has created an acute need to expand the arsenal of OSA diagnostics with screening methods and rapid diagnostic tools for pre-screening patients, thereby reducing the burden on sleep laboratories.
In recent years, the instrumental diagnosis of OSA has developed significantly owing to the proliferation of portable device-based home sleep tests in some developed countries. Portable devices and home sleep tests have many advantages. They can be performed at home, in an environment more comfortable for the patient; they require no technical staff to be present and much less equipment. They are, however, not yet widely available in less affluent countries, so simpler methods for pre-screening patients still play a significant role.
Targeted sleep questionnaires provide further opportunities as part of routine outpatient care in everyday clinical practice. The advantage of questionnaires is that they can provide an overview of OSA likelihood in a matter of minutes. A number of questionnaires can be used to screen for OSA. The most widely used are the Epworth, Berlin, STOP, and STOP-BANG questionnaires [
9].
Artificial intelligence (AI) is the field of science that applies mathematical statistics and computer science to analyze complex databases. It is useful in diagnostics, in assessing therapeutic outcomes and in predicting the possible presence of disease. The history of AI and modern computing dates back to 1950, when the British mathematician Alan Turing began applying neural networks to solve classification problems. He proposed what is now known as the ‘Turing test’ and investigated the extent to which computers could be used to make logical judgements, i.e., decisions [
10]. The applicability of AI in medicine has been the subject of much research since Turing’s work in the middle of the last century. Over the last two decades, there have been significant advances in medicine and a growing demand for the wider use of AI [
11]. With the emergence of a large number of sleep studies and the dynamic growth of the discipline, sleep medicine has laid the foundations for advancing the application of AI. Some recent results are based on big data databases, others on cloud-based information storage and decision support processes [
12]. One of the main goals of sleep diagnostics is to start treating patients as soon as possible and prevent co-morbidities through a timely and cost-effective diagnosis. Our research aims to improve predictability by comparing alternative and cost-effective diagnostic options for OSA diagnosis using AI.
The application of AI for the prediction of OSA has a history more than a decade long [
13]. Our article is a continuation of this work. The novelty and academic value of the current article are four-fold: (1) application of AI to a complex analysis of a dataset consisting of basic demographic, anthropometric and polysomnographic data and information obtained by the simultaneous application of two frequently used questionnaires, the Berlin and Epworth scales; (2) comparative analysis of the practical applicability of various diagnostic methods, modelling the screening process in the routine of general practitioners; (3) it is well documented that the applicability of different OSA screening scales is highly influenced by socio-cultural and language context [
14], so the testing of the efficiency of application of the Berlin and Epworth questionnaires in Eastern Central Europe may be considered a contribution to the knowledge base on the applicability of these questionnaires; (4) the last decade has witnessed an unprecedented development of different AI algorithms [
15,
16]. In our research, we tested the efficiency of a number of different algorithms. These results would have been unattainable even a few years ago.
The AI-supported application of everyday demographic and anthropometric data combined with targeted sleep questionnaires greatly facilitates the screening of previously undiagnosed OSA patients, and thus contributes to their wellbeing.
The current article should be regarded as a preliminary study. Due to the low number of patients, it cannot and should not be considered a final response to the questions emerging in the everyday practice of general practitioners or occupational physicians in the screening of patients with OSA; however, the results of the study offer additional information on the applicability of some simple parameters in the screening of patients with OSA.
3. Results
The patients in the study were divided into control and OSA groups based on the results of the sleep study performed; 36 patients were confirmed not to have OSA and formed the control group. OSA patients were further divided into a mild OSA group (32 patients) and a combined moderately severe–severe OSA group (32 patients), given the small number of patients in each subgroup.
3.1. General Demographic and Anthropometric Characteristics of the Patient Population and the Results of the Berlin and Epworth Questionnaires
The distribution of age, gender, BMI, anthropometric parameters and questionnaire result values in various patient groups is presented in
Table 1.
Male dominance was observed in all three groups (control, mild OSA and moderately severe–severe OSA). OSA patients were mostly above 40 years, whereas patients in the control group were predominantly below 40 years. A significant difference in age was found between the control and mild OSA groups. Patients in the control group were in the normal BMI group, while patients in the OSA groups were predominantly in the overweight and obese groups. For BMI, significant differences emerged between the control and the mild OSA groups, and between the control and the moderately severe–severe OSA groups. A significant difference was observed between the neck, abdomen and hip circumferences of patients in the control and OSA subgroups. In the results of the Berlin questionnaire, significant differences could be observed between the control and mild OSA groups, and between the control and the moderately severe–severe OSA groups. The results of the Epworth questionnaire, however, presented no significant difference between the study groups.
The distribution of the Berlin questionnaire scores in each OSA subgroup is presented in
Table 2.
Based on the results of the Berlin questionnaire, patients in the non-OSA group were mainly in the low and uncertain OSA probability groups. The mild OSA group was dominated by total questionnaire scores below 6 points, i.e., the questionnaire results showed a low probability of OSA. In the moderately severe–severe OSA group, 44% of patients had scores indicating a low OSA probability and 34% had scores indicating a high OSA probability.
A contingency table, consisting of AHI-based OSA categorization and probabilities of OSA, based on the Epworth total score, is presented in
Table 3.
The Epworth questionnaire did not detect daytime sleepiness in 77% of patients in the non-OSA group, while mild and moderate daytime sleepiness were observed in 11% and 12% of patients, respectively. In the mild OSA group, 40% of patients had no daytime sleepiness, while mild sleepiness was present in 19% and moderate in 28%. The moderately severe–severe OSA group was characterized by moderate (44%) and severe daytime sleepiness (28%).
3.2. Results of FDA Patient Classification, Based on Demographic and Anthropometric Parameters
In our research, a two-step approach was applied. First, the possibility of a two-group categorization (non-OSA vs. OSA) was tested; then the applicability of basic demographic and anthropometric parameters was tested for a more detailed classification. Results of OSA prediction by age, gender and BMI using FDA are presented in
Table 4.
A combined prediction by age, gender and BMI, resulted in an 81% success rate for categorizing patients into OSA or non-OSA groups. Non-OSA categorization was successful 29 times out of 36 (80.55%), while OSA categorization yielded a true positive 52 times out of 64 (81.25%).
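The group-wise success rates quoted above follow directly from the counts of correctly classified patients. The arithmetic can be sketched in Python as follows. The class and function names are illustrative assumptions, and treating OSA as the positive class (so that the OSA rate reads as sensitivity and the non-OSA rate as specificity) is a conventional reading rather than a definition given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """Correctly classified patients and group sizes (hypothetical container)."""
    correct_non_osa: int
    total_non_osa: int
    correct_osa: int
    total_osa: int


def binary_rates(counts: BinaryCounts) -> dict:
    """Return sensitivity, specificity and overall accuracy as fractions."""
    # Assumption: OSA is the positive class, non-OSA the negative class.
    sensitivity = counts.correct_osa / counts.total_osa
    specificity = counts.correct_non_osa / counts.total_non_osa
    accuracy = (counts.correct_osa + counts.correct_non_osa) / (
        counts.total_osa + counts.total_non_osa
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Counts reported for the age, gender and BMI classification:
# 29 of 36 non-OSA patients and 52 of 64 OSA patients classified correctly.
demographic_rates = binary_rates(BinaryCounts(29, 36, 52, 64))
for name, value in demographic_rates.items():
    print(f"{name}: {value:.2%}")
```

The same function can be applied to the counts reported for the other two-group classifications, provided the counts (not only the percentages) are available.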
3.3. AI Supported Three Category OSA Prediction Based on Basic Anthropometric Parameters Grouping Patients into Mild, Moderately Severe–Severe OSA and Non-OSA Groups
The results of AI-supported OSA prediction for severity categories by age, gender and BMI are presented in
Table 5.
When OSA was categorized by severity according to BMI, gender and age, 64% of the results were correct. Prediction of the non-OSA category was correct in 75%, mild OSA in 56.25% and moderately severe–severe OSA in 59.37%.
3.4. OSA Prediction, Based on Responses to the Berlin Questionnaire
The applicability of the Berlin questionnaire for the prediction of OSA was tested in the second phase. As opposed to the results presented in
Table 2, in this phase, all responses to the individual items of the questionnaire were used as input for the AI performing the classification. Two-category (OSA, non-OSA) classification of the patients for OSA prediction is presented in
Table 6.
Based on the Berlin questionnaire, 62% of the predictions made using AI were correct. The non-OSA prediction yielded a correct result in only 7 out of the 36 cases, which is a 19.4% correct classification rate, while the OSA category prediction produced correct results in 55 out of 64 cases, which translates to an 85.93% correct classification rate.
3.5. Classification of OSA Categories Based on the Berlin Questionnaire Using AI
For comparison purposes, data obtained from the Berlin questionnaire were tested to classify the patients according to OSA severity. Results are presented in
Table 7.
The OSA categorizations based on the Berlin questionnaire yielded an aggregate correct rate of 61%. The categorization was correct in 72.22% of cases for non-OSA, 50% for mild OSA, and 59.37% for moderately severe–severe OSA.
3.6. OSA Prediction with the Epworth Questionnaire Using AI
In this phase, workflow was similar to the testing of applicability of the Berlin questionnaire items. A two-category (non-OSA, OSA) patient classification for OSA prediction based on the results of the Epworth scale is presented in
Table 8.
The accuracy of categorization based on the Epworth questionnaire was 75%. The correct categorization rate of patients in the non-OSA group was 58.3%, compared to 84.4% for patients in the OSA group.
3.7. Prediction of OSA by Severity Categories with the Epworth Questionnaire Using AI
The more detailed classification of patients, based on data from the Epworth questionnaire, is summarized in
Table 9.
OSA categorization based on the Epworth questionnaire was successful in 56% of cases. Categorization of patients into the non-OSA group was correct in 19 of 36 cases (52.77%), categorization into the mild OSA group was successful in 31 out of 32 patients (96.87%), while categorization into the moderately severe–severe OSA group was successful in only 6 out of 32 patients (18.75%).
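The three-group result above shows how an overall success rate can conceal very different rates in the individual groups. The following Python sketch recomputes the per-group rates from the reported counts. The function names are illustrative assumptions, and the balanced accuracy (the unweighted mean of the per-group rates) is an additional summary that is not reported in the study.

```python
def per_class_recall(correct: dict, totals: dict) -> dict:
    """Fraction of each true group that was classified correctly."""
    return {group: correct[group] / totals[group] for group in totals}


def overall_accuracy(correct: dict, totals: dict) -> float:
    """Fraction of all patients that were classified correctly."""
    return sum(correct.values()) / sum(totals.values())


def balanced_accuracy(correct: dict, totals: dict) -> float:
    """Unweighted mean of the per-group rates (not reported in the study)."""
    recalls = per_class_recall(correct, totals)
    return sum(recalls.values()) / len(recalls)


# Counts reported for the three-group Epworth classification.
totals = {"non-OSA": 36, "mild OSA": 32, "moderately severe-severe OSA": 32}
correct = {"non-OSA": 19, "mild OSA": 31, "moderately severe-severe OSA": 6}

epworth_recall = per_class_recall(correct, totals)
epworth_overall = overall_accuracy(correct, totals)
epworth_balanced = balanced_accuracy(correct, totals)

for group, value in epworth_recall.items():
    print(f"{group}: {value:.2%}")
print(f"overall: {epworth_overall:.2%}, balanced: {epworth_balanced:.2%}")
```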
3.8. Three-Group Patient Categorization with the Berlin and Epworth Questionnaires
Filling out the Berlin and Epworth questionnaires takes very little time. Hence, in clinical practice, it seems reasonable to apply both questionnaires. The applicability of data obtained by joint utilization of both questionnaires was tested for OSA prediction. Results are summarized in
Table 10.
Based on the results of the Berlin and Epworth questionnaires, 63% of the categorizations were correct using AI. For the non-OSA group, 80.55% of the predictions proved correct. This ratio was 53.12% for both the mild OSA group and the moderately severe–severe OSA group.
3.9. AI Prediction of OSA Based on Demographic and Anthropometric Parameters Combined with Data from the Berlin and Epworth Questionnaires
In the last phase of research, the joint applicability of data was tested. Data were obtained by collecting demographic and anthropometric parameters and responses to the items of the Berlin and Epworth questionnaires. The results for a two category AI classification based on the age, gender, BMI and the responses to items in the Berlin and Epworth questionnaires are presented in
Table 11.
The accuracy of prediction was 83%. Categorization of non-OSA patients was correct in 28 out of 36 cases (77.78%), while OSA categorization was correct in 55 cases out of 64. This means an 85.93% accuracy.
3.10. AI Prediction of OSA Categorization Based on Basic Anthropometric Parameters and the Berlin and Epworth Questionnaires with Patients Categorized into Mild OSA, Moderately Severe–Severe OSA and Non-OSA
Based on the favorable findings with the two-group categorization, the investigation of the applicability of AI for more sophisticated categorization was continued. The results of AI prediction of OSA categories using age, gender, BMI, and Berlin and Epworth questionnaire responses are presented in
Table 12.
Based on the above described data, the accuracy was 69%. The hit rate in the case of non-OSA categories was 80.55%, in the case of the mild OSA category it was 59.37%, and the moderately severe and severe OSA cases could be determined with a 65.62% accuracy.
3.11. Classification of Patients into Three Groups Based on Demographic, Anthropometric and Questionnaire Information
In theory, an increased amount of information reduces categorization error. Therefore, in the final phase of research, the predictability of OSA based on all relevant information was tested. The results of the categorization by age, gender, BMI, neck circumference and data from the Berlin and Epworth questionnaires are presented in
Table 13.
By using all the information available (the results of the Berlin and Epworth questionnaires, BMI, age, gender and neck circumference), an overall categorization accuracy of 71% was achieved. A relatively high hit rate (80.55%, 29 out of 36 patients) was obtained in the non-OSA category, and a lower rate (65.62%) in both the mild OSA group and the moderately severe–severe OSA group.
3.12. Efficiency of Two-Group Categorization Methods, Based on Different Sets of Information
Accuracy of categorization tests by different sets of information is presented in
Table 14.
As shown in
Table 14, the most successful OSA categorization was based on BMI, age, gender and the results of the two questionnaires (83%). Categorization by age, gender and BMI comes next (81%), lagging behind by only 2 percentage points. Categorization by the Berlin questionnaire only (62%) ranks last. Among the various methods used, no significant difference could be detected for the prediction of the presence of OSA. By contrast, the prediction of non-OSA patients proved to be the most effective (80.55%) by BMI, gender and age, followed by the combination of these parameters complemented with questionnaire data (77.78%). It can be concluded that prediction by a questionnaire only, in particular the Berlin questionnaire, was clearly the least reliable.
3.13. Efficiency of Categorizations by AI Using Different Sets of Information—A Comparative Analysis
The accuracy of categorizations based on different sets of information is summarized in
Table 15.
The best categorization by OSA severity was achieved using BMI, gender, age, neck circumference and data from the Berlin and Epworth questionnaires (71%). The second most efficient prediction was based on BMI, gender, age and the Berlin and Epworth questionnaires (69%), followed by prediction based on BMI, gender, and age (64%). The least efficient categorization by severity of OSA was based on questionnaires alone. Surprisingly, the best predictor of mild OSA was the Epworth questionnaire (96.87%), followed by BMI, gender, age, data from the Berlin and Epworth questionnaires, and neck circumference (65.62%). The Berlin questionnaire alone and the two questionnaires in combination gave an efficiency of less than 60% for predicting mild OSA. For the prediction of moderately severe–severe OSA, BMI, gender, age and data from the Berlin and Epworth questionnaires gave the same result with or without neck circumference. The least reliable result for predicting moderately severe–severe OSA was obtained with the Epworth questionnaire alone (18.75%).
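The comparison above depends on which criterion is used to rank the information sets. As a hedged illustration, the Python sketch below collects the percentages quoted in Sections 3.3 to 3.11 into one structure and ranks the sets by a chosen criterion. The dictionary layout, the key names and the helper functions are assumptions made for this example only.

```python
# Percentages as quoted in the text for the three-group classifications.
# Keys: overall accuracy and the success rate within each true group.
three_group_results = {
    "BMI, gender, age": {
        "overall": 64.0, "non_osa": 75.0, "mild": 56.25, "mod_severe": 59.37},
    "Berlin": {
        "overall": 61.0, "non_osa": 72.22, "mild": 50.0, "mod_severe": 59.37},
    "Epworth": {
        "overall": 56.0, "non_osa": 52.77, "mild": 96.87, "mod_severe": 18.75},
    "Berlin + Epworth": {
        "overall": 63.0, "non_osa": 80.55, "mild": 53.12, "mod_severe": 53.12},
    "BMI, gender, age + Berlin + Epworth": {
        "overall": 69.0, "non_osa": 80.55, "mild": 59.37, "mod_severe": 65.62},
    "BMI, gender, age, neck + Berlin + Epworth": {
        "overall": 71.0, "non_osa": 80.55, "mild": 65.62, "mod_severe": 65.62},
}


def best_by(metric: str) -> str:
    """Name of the information set with the highest value of `metric`."""
    return max(three_group_results, key=lambda name: three_group_results[name][metric])


def worst_by(metric: str) -> str:
    """Name of the information set with the lowest value of `metric`."""
    return min(three_group_results, key=lambda name: three_group_results[name][metric])


for metric in ("overall", "mild", "mod_severe"):
    print(f"{metric}: best = {best_by(metric)}, worst = {worst_by(metric)}")
```

Ranking by overall accuracy and ranking by the mild OSA rate select different information sets, which is the point made in the paragraph above.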
4. Discussion
The factors involved in the pathophysiology of OSA can be divided into two main groups. One consists of anatomical factors. An anatomically narrow, collapse-prone upper airway can be assessed by diagnostic imaging studies and is responsible for two-thirds of severe OSA [
26]. The other main group is represented by non-anatomical factors, which include unstable airway control (high loop gain) [
26,
27], low arousal threshold [
28] and reduced upper airway dilator muscle activity [
29]. These factors can be measured by PSG examination. OSA severity is determined by sleep stages, which can also be assessed by PSG examination [
30].
It is evident that PSG plays a key role in the diagnosis of OSA, in assessing both the presence of the disease and its severity. Knowing the severity of OSA is essential for choosing a suitable treatment plan. In general, surgical therapy is the primary treatment of choice in cases of mild OSA, while continuous positive airway pressure therapy is the primary treatment of choice in moderately severe and severe cases. With the timely diagnosis and treatment of OSA, comorbidities can be prevented [
31,
32].
However, the rapidly increasing number of patients and poor access to health care, as well as the urgent need to reduce the burden on resources, have led to the introduction of home sleep tests and the development of screening questionnaires for preliminary OSA diagnosis. These, while not providing an accurate diagnosis, can indicate the possible presence of the disease.
Risk factors for OSA, such as central obesity, male gender or age over 40–50 years, play a crucial role in the development of OSA. The questionnaires aim to assess the symptoms of OSA and their extent, in addition to these predisposing factors.
The questionnaires are effective in screening for OSA because of their relatively high sensitivity and high negative predictive value, but they do not provide information on the severity of the disease. From basic anthropometric parameters, demographic data and questionnaire results, a large database can be constructed, which can serve as a reliable basis for the prediction of OSA with AI.
Based on the above findings, our research investigated the predisposing factors for OSA according to the traditionally used Berlin and Epworth questionnaires, as well as the success of predicting and categorizing OSA using AI. The aim of our research was to develop a simple, practical and reliable prediction system. Our results demonstrated that a database, even one with minimal data obtained from a relatively small number of patients, can be effectively applied in the preliminary diagnosis of OSA. This reduces the number of untreated patients and contributes to a more efficient use of material and human resources.
The most important risk factor for OSA is obesity, which is measured using BMI. It is well established that a higher BMI significantly increases the risk of developing OSA compared to individuals with a normal BMI, and this risk is even more pronounced for a BMI above 35 kg/m2 [
33]. Consequently, changes in body weight affect the severity of OSA. A 10% increase in body weight leads to a 32% increase in AHI [
34], while weight loss leads to an improvement in OSA symptoms and a decrease in AHI [
35]. The disadvantage of using BMI this way is that it does not produce an accurate indication of real body fat percentage [
36], with overestimated BMI for high muscle mass and underestimated values for high body fat percentage with normal BMI [
37].
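For orientation, the BMI-based grouping used in this article can be sketched in a few lines of Python. BMI is body weight in kilograms divided by the square of height in metres. The function names are illustrative assumptions, the 25 and 30 kg/m2 boundaries follow the overweight and obese ranges used in this article, and the handling of values exactly on a boundary is an assumption of this sketch.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_group(value: float) -> str:
    """Coarse grouping: below 25, 25-30 (overweight), over 30 (obese)."""
    # Assumption: a value of exactly 30 is counted as overweight.
    if value < 25:
        return "below 25"
    if value <= 30:
        return "overweight"
    return "obese"


def above_35(value: float) -> bool:
    """Flag the range in which the OSA risk is described as most pronounced."""
    return value > 35


# Hypothetical patient, not taken from the study data.
example = bmi(weight_kg=110.0, height_m=1.75)
print(f"BMI {example:.1f} kg/m2 -> {bmi_group(example)}, above 35: {above_35(example)}")
```

As noted above, BMI does not distinguish muscle mass from body fat, so such a grouping is only a coarse proxy.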
Male gender is also an important risk factor for OSA, the effect of which may be partly explained by hormonal differences between men and women [
38], which may also affect the distribution of body fat.
The gender difference in adipose tissue is reflected in an increase in visceral adipose tissue in men and global subcutaneous adipose tissue in women [
39]. In men, the excess of adipose tissue associated with OSA results in more severe OSA [
40] than in women with a similar BMI and waist circumference [
41].
Due to hormonal changes, women experience more severe OSA in the postmenopausal period without hormone substitution [
42], whereas premenopausal women are significantly less likely to develop OSA compared to men [
43]. Men are more predisposed to developing OSA due to a higher neck circumference, average body weight and fat-free mass [
44]. Increasing life expectancy at birth results in a greater incidence of age-related diseases. Studies over the past 10 years have shown that age and ageing and OSA induce similar physiological processes [
45]. OSA is also referred to as an age-related sleep disorder based on its prevalence, as its prevalence in people over 65 years is twice that of people aged 30 to 65 years, thus increasing the number of untreated patients [
46].
The above-mentioned risk factors for OSA, such as BMI, gender and age, are parameters that can be assessed simply and easily in everyday patient care. In our patient population, male patients predominated in the mild and moderately severe–severe OSA groups. The majority of patients were over 40 years of age and their nutritional status could be characterized as overweight or obese, with a BMI of 25–30 kg/m2 or over 30 kg/m2. In other words, all three risk factors were present in the OSA group. In our study, AI-based OSA prediction using these parameters yielded an 81% accuracy of OSA/non-OSA classification. With AI, the presence of OSA was predicted with an 81.25% success rate, while the success rate for predicting the absence of OSA was 80.55%. When the OSA group was further differentiated along the same parameters into severity categories, the rate of success fell to 64%. The best prediction was observed for the non-OSA group, at 75%. The prediction for mild OSA proved correct in 56.25% of cases, while the prediction for moderately severe–severe OSA was correct in 59.37% of cases. The above finding, namely that the success rate of prediction falls when OSA severity is taken into account, can be traced back to a number of factors. One factor could be the low number of samples. Another factor may be the deviation in age and BMI between the groups. Although the mild and the moderately severe–severe OSA subgroups consisted of mostly overweight and obese patients, the control group overwhelmingly consisted of patients of normal body weight. A significant difference in BMI could be observed between the control and the mild OSA subgroups, and between the control and the moderately severe–severe OSA subgroups. A similar pattern was found for age, with a significant difference observable only between the control group and the mild OSA subgroup.
Considering that both age and BMI played an essential role in the operation of the algorithm, the lack of a significant difference between the mild and the moderately severe–severe OSA subgroups could leave the algorithm unable to differentiate these two subgroups. It can therefore be concluded that these factors could lead to the misclassification of patients in the mild and the moderately severe–severe OSA subgroups.
Self-completion questionnaires are an integral part of everyday diagnostics, with the advantage of being cost-effective and able to provide reliable results in a matter of minutes in most cases. They can be used effectively for screening in many areas of diagnostics, including sleep medicine, as well as specifically for OSA screening. The importance of questionnaires lies in the screening of previously undiagnosed OSA patients and in easing the burden on sleep laboratories. The Berlin questionnaire is one of these tools for OSA screening. According to the results of Netzer et al., the questionnaire can be used with 86% sensitivity and 77% specificity for estimation of respiratory disturbance index (RDI) above 5, and 54% sensitivity and 97% specificity for estimation of RDI above 15 [
47].
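Sensitivity and specificity alone do not determine how informative a negative or positive questionnaire result is; the predictive values also depend on how common OSA is in the screened population. The Python sketch below applies Bayes' rule to the sensitivity and specificity quoted above for RDI above 5 (86% and 77%). The prevalence values are hypothetical and serve only to illustrate the dependence, and the function name is an assumption of this example.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted for RDI above 5; prevalences are hypothetical.
for prevalence in (0.10, 0.30, 0.50):
    ppv, npv = predictive_values(0.86, 0.77, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With fixed sensitivity and specificity, the negative predictive value falls and the positive predictive value rises as prevalence increases, which is why values reported in sleep clinic populations may not carry over to general practice.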
In a study by Tan et al., the prediction of AHI above 15/h was found to have 58.8% sensitivity and 77.6% specificity, with a negative predictive value of 82.9% [
48]. These results are in line with findings of Kang et al., who estimated 89% sensitivity and 63% specificity in the case of this questionnaire [
49]. In the study by Tan et al., when the questionnaire was used to predict AHI above 30, 76.9% sensitivity, 72.7% specificity and 96.3% negative predictive value were observed [
48]. In our study, OSA/non-OSA categorization by AI, based on the Berlin questionnaire, was found to be successful in 62% of cases, similarly to the result observed for severity subgrouping (61%). The questionnaire performed worst when identifying non-OSA cases in the non-OSA/OSA binary grouping (19.4%). It can be concluded that the low specificity of the questionnaire makes it unsuitable for identifying non-OSA cases in binary groupings in our research.
At the same time, it can be concluded that subcategorization by severity was most successful for non-OSA prediction (72.22%), while it was the weakest for mild OSA classification (50%). Using AI based on just the simple sum of scores of the questionnaire, 42% of the non-OSA group patients obtained a low probability of OSA, while 22% of the mild OSA group and 34% of the moderately severe–severe OSA group proved to have a high probability of OSA. The explanation of these results could be the significant difference between the non-OSA and mild, and the non-OSA and moderately severe–severe OSA subgroups. In other words, using the Berlin questionnaire with AI significantly improved predictive power in all severity subgroups.
Another widely used form of self-completion questionnaire is the Epworth questionnaire. Several studies have been conducted to investigate the screening of OSA with the questionnaire, to study its sensitivity, specificity and positive and negative predictive value. Results of 11 studies involving 47 to 4770 subjects with AHI above 30 events/hour found the sensitivity of the questionnaire to be between 46 and 79%. Its specificity was found to be highest for mild OSA (75%), with positive and negative predictive values above 80% [
50]. In a study involving 1450 participants, the sensitivity of the questionnaire was also found to be around 50% for AHI groups above 5, 15 and 30 events/hour, while its specificity averaged 60% [
48]. In our study, AI-based OSA/non-OSA categorization by the Epworth questionnaire was successful in 75% of cases, and categorization by OSA severity in 56%. The highest success rate was found in the mild OSA group (96.87%), while in the moderately severe–severe OSA group, a success rate of only 18.75% was found. Our results, without the use of AI, showed that daytime sleepiness was absent in 77% of the patients in the non-OSA group. Patients in the mild OSA group had mild daytime sleepiness in 19% of cases, moderate in 28% and severe in 13%. The moderately severe–severe OSA group experienced mild daytime sleepiness in 19% of cases, moderate in 44% and severe in 28%. In separating the OSA and non-OSA groups, the Epworth questionnaire outperformed the Berlin questionnaire. When grouping according to severity, however, the Berlin questionnaire produced the more accurate result. The predictive power of the Epworth questionnaire using AI may be severely limited by the fact that there was no significant difference in its results between the OSA severity subgroups, which could explain its weaker predictive power compared to the Berlin questionnaire. This finding highlights the weaker predictive power of the Epworth questionnaire for moderately severe–severe OSA. When the two questionnaires (Berlin and Epworth) were used together, the success rate of prediction by severity improved to 63%, and the prediction of mild and moderately severe–severe OSA was successful in 53% of cases. In other words, when using the two questionnaires with AI, it was non-OSA prediction that could be improved significantly (80%), while the prediction of mild and moderately severe–severe OSA showed a result similar to using the Berlin questionnaire on its own. Using the two questionnaires together, Uzali et al. found the specificity to be 0.72 at an AHI cut-off of 5 [
51].
Overall, it can be observed that the effectiveness of both questionnaires was greatly improved by the use of AI.
For resource optimization, the benefits of AI should be progressively exploited in the diagnosis of OSA. Given the available clinical data, AI can be used to screen high-risk patients [
52]. Nettleton et al. found a 0.53 correlation coefficient between AI prediction of OSA, based on sleep questionnaire results and OSA based on RDI [
53]. AI prediction of OSA by questionnaires was found to have 88% sensitivity and 97% specificity for AHI above 15 [
13]. In our study, prediction of AHI above 15 was found to be most effective when age, gender and BMI were added to the Berlin and Epworth questionnaires. The information provided by neck circumference contributed only marginally to the prediction of OSA severity. The improvement seen with neck circumference in AI-augmented OSA prediction may be explained by the significant difference between the study groups. The efficiency of OSA prediction can be improved considerably if parameters that show significant differences between groups are used.
The prediction of OSA and its severity showed different results for electrocardiogram, demographic and 26-parameter PSG data using support vector machine, k-nearest neighbor and naive Bayes algorithms. Prediction of OSA based on clinical characteristics (AHI < 5 and AHI > 5) by support vector machine analysis showed 59% sensitivity and 74.5% specificity, whereas the prediction based on electrocardiogram parameters showed 43.4% sensitivity and 83.5% specificity. Using a naive Bayes algorithm, predictions based on clinical parameters showed a sensitivity of 57.5% and a specificity of 73.7%, while electrocardiogram-based predictions showed a sensitivity of 39% and a specificity of 82.7%. Further analysis confirmed a moderate correlation between the AHI and BMI, arousal index and electrocardiogram parameters [
54]. These results are in line with our estimations: no significant stochastic relationship could be found between AHI and any anthropometric parameters. Conversely, the BMI has shown significant correlation with all anthropometric parameters. Kirby et al. [
55] applied neural network modelling for the diagnosis of OSA based on the clinical characteristics of 405 patients. The cut-off criterion was AHI > 10. The sensitivity of the classification was 99%, while its specificity was 80%. Using the same cut-off criterion and a logistic model, Rowley et al. categorized the 370 patients in their sample with 76–96% sensitivity and 13–54% specificity [
56]. These studies show that our results are in line with those of other research groups.
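Since the comparisons above rest on sensitivity and specificity at a given AHI cut-off, a minimal sketch of how these two figures are derived from a set of predictions may be useful. The function names, the `ConfusionCounts` container and the sample values below are illustrative assumptions and are not taken from the present study or from the cited works.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # OSA by reference standard, predicted OSA
    fn: int  # OSA by reference standard, predicted non-OSA
    tn: int  # non-OSA by reference standard, predicted non-OSA
    fp: int  # non-OSA by reference standard, predicted OSA


def count_outcomes(ahi_values, predicted_positive, cutoff=15.0):
    """Tally predictions against the reference label (AHI > cutoff)."""
    counts = ConfusionCounts(0, 0, 0, 0)
    for ahi, pred in zip(ahi_values, predicted_positive):
        actual = ahi > cutoff
        if actual and pred:
            counts.tp += 1
        elif actual and not pred:
            counts.fn += 1
        elif not actual and not pred:
            counts.tn += 1
        else:
            counts.fp += 1
    return counts


def sensitivity(c):
    """Share of true OSA cases that the classifier flags."""
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else float("nan")


def specificity(c):
    """Share of non-OSA cases that the classifier correctly excludes."""
    return c.tn / (c.tn + c.fp) if (c.tn + c.fp) else float("nan")


# Made-up illustrative values, NOT data from the study.
ahi = [3.0, 22.5, 41.0, 8.2, 16.1, 4.4, 30.0, 12.0]
pred = [False, True, True, True, False, False, True, False]
counts = count_outcomes(ahi, pred, cutoff=15.0)
print(sensitivity(counts), specificity(counts))
```

Changing `cutoff` (for example to 5 or 10, as in the cited studies) changes the reference labels and therefore both figures, which is one reason why results reported at different cut-offs are not directly comparable.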
Overall, it can be concluded that the application of AI to OSA prediction by age, gender and BMI alone significantly improves the identification of high-risk OSA patients. The analysis of questionnaire-based results using AI offers further possibilities. Adding demographic and anthropometric parameters to the data supplied by the patients in the questionnaires aids their categorization. It is important to note, however, that although the results of categorization by OSA severity exceed 50% in many cases, this has limited clinical consequences. The relevance of the categorization lies in the exclusion of non-OSA cases, which in many instances proved as high as 80%. The advantage of this method lies in its cost-effectiveness and simplicity, which could help alleviate the burden on sleep laboratories. This method could also contribute to a faster diagnosis of OSA patients.
There are three limitations of the current study:
1. The relatively small number of patients in the sample. Our approach relied on the constant flow of patients to the Department of Otolaryngology and Head and Neck Surgery of Semmelweis University, and we sampled this flow. Given the limited material and human resources, and given that the research project required well-designed cooperation between specialists from various branches of medicine, this setting was easier to operationalise than drawing a random, representative sample from a large pool of patients reflecting the different characteristics of the patient population. The clinical usability of the present results requires careful consideration, since their efficiency does not match that of home sleep apnoea tests. At the same time, in the rapidly developing field of artificial intelligence, a great number of studies in many countries attempt to enhance diagnostic and therapeutic efficiency, in many cases with even smaller patient samples [
57,
58,
59,
60].
2. Our work is a compromise between length and content. It is well documented that many additional parameters could be applied in the screening of OSA, but we wanted to show that even the simplest parameters could be used with high efficiency in the screening process. Similarly to other studies, the present paper suggests that even these simple and easily measurable parameters can offer the opportunity to screen for OSA effectively, even with patient numbers this low [
61]. The aim of our research was to showcase a screening method for OSA that delivers relatively accurate predictions using the smallest possible number of inexpensive parameters readily available in everyday diagnostics. In other words, we wanted to know whether we could predict OSA with parameters this few and this simple. This explains why we did not use those important parameters that would have reflected the severity of OSA.
3. The application of machine learning models is developing at an exponential rate, and currently we are very far from having an algorithm that could be considered “the gold standard” [
62]. In our estimation, there are currently more than a thousand possible algorithms that could be applied to classify objects based on a relatively small sample. Our goal was not to establish an exhaustive set of these classifiers and choose the best of them, but simply to apply some that are relatively easily available, choose the most efficient ones and then demonstrate the results.
Although our results should be regarded as preliminary because of the small sample size, on the whole they demonstrate the future applicability of the method for the screening of OSA. Our further aims are to expand the sample size and to improve the algorithms for a more accurate prediction of OSA. With a refined algorithm and, as a result, a more accurate prediction of OSA, more widespread screening of OSA patients and a reduced burden on OSA diagnostics will become achievable in the future.
Because of the small sample size, it was not possible to separate more refined subcategories by severity of OSA, and the moderate and severe OSA categories had to be merged. The basic objective of our study was to further the methodology of OSA diagnosis; therefore, we did not aim to set up a sample representing the Hungarian population as a whole. We accepted the distorted characteristics of the sample, e.g., a high proportion of men and relatively young patients. The distortion of the sample could be efficiently counterbalanced by the algorithms applied. Naturally, a further increase in the number of patients will lead to even more accurate results, opening the way to more robust conclusions. Understanding the global and complex system of factors influencing the development of OSA necessitates further analyses involving other parameters, e.g., race, since differing anthropological characteristics and craniofacial parameters limit the generalizability of the present findings. It is well known that the algorithm applied in the present paper is susceptible to overfitting [
63]; therefore, further studies are needed to validate the results. One future aim is to increase the number of patients in the sample to further improve and refine the diagnostic methods of OSA by constructing and validating novel questionnaires and testing new AI algorithms.