Respiratory Diseases Diagnosis Using Audio Analysis and Artificial Intelligence: A Systematic Review
Abstract
1. Introduction
2. Methods
- ("cough detection" OR "cough classification" OR "cough sounds" OR "cough audio" OR "cough analysis") AND ("specificity" OR "accuracy" OR "sensitivity" OR "f1" OR "true positive" OR "AUC" OR "MFCC")
- ("respiratory disease classification" OR "respiratory sound classification" OR "lung sound classification" OR "respiratory sound") AND ("ML" OR "machine learning") AND ("accuracy" OR "AUC" OR "f1" OR "sensitivity")
- ("Alzheimer" OR "Parkinson’s disease" OR "bradykinesia") AND ("speech signal processing" OR "speech sounds" OR "Voice Recognition" OR "speech classification") AND ("classification" OR "accuracy") AND ("machine learning" OR "deep learning")
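The search strings above all follow one pattern: groups of quoted terms joined by OR, with the groups joined by AND. As a minimal sketch of how such a string could be assembled programmatically (the helper names `or_group` and `build_query` are hypothetical and not part of the review's method; only the terms come from the first query):

```python
def or_group(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join several OR-groups with AND to form one boolean search string."""
    return " AND ".join(or_group(group) for group in groups)


# Term groups copied from the first search string (cough-related studies).
cough_terms = ["cough detection", "cough classification", "cough sounds",
               "cough audio", "cough analysis"]
metric_terms = ["specificity", "accuracy", "sensitivity", "f1",
                "true positive", "AUC", "MFCC"]

cough_query = build_query(cough_terms, metric_terms)
print(cough_query)
```

Keeping each term group in its own list makes it easy to reuse a group (for example, the metric terms) across several queries without retyping it.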
3. Cough Sounds Analysis for Upper Respiratory Symptoms
3.1. Data Acquisition
3.2. Objectives in Cough Sound Analysis Studies
3.3. Implementation Approach
4. Lung and Breath Sounds Analysis of Lower Respiratory Symptoms
4.1. Data Acquisition
4.2. Domain Focus of Sound Analysis Studies for the Lower Respiratory Symptoms
4.3. Implementation Approach
5. Voice/Speech-Based Analysis for Respiratory Diseases Identification
5.1. Data Acquisition
5.2. Domain Focus of Voice/Speech Sound Analysis Studies
5.3. Implementation Approach
6. Publicly Available Datasets
7. Discussion
7.1. Challenges
7.2. Opportunities
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
References
- Troncoso, Á.; Ortega, J.A.; Seepold, R.; Madrid, N.M. Non-invasive devices for respiratory sound monitoring. Procedia Comput. Sci. 2021, 192, 3040–3048.
- Ijaz, A.; Nabeel, M.; Masood, U.; Mahmood, T.; Hashmi, M.S.; Posokhova, I.; Rizwan, A.; Imran, A. Towards using cough for respiratory disease diagnosis by leveraging Artificial Intelligence: A survey. Inform. Med. Unlocked 2022, 29, 100832.
- Kim, H.; Jeon, J.; Han, Y.J.; Joo, Y.; Lee, J.; Lee, S.; Im, S. Convolutional neural network classifies pathological voice change in laryngeal cancer with high accuracy. J. Clin. Med. 2020, 9, 3415.
- T., B.B.; Hee, H.I.; Teoh, O.H.; Lee, K.P.; Kapoor, S.; Herremans, D.; Chen, J.M. Asthmatic versus healthy child classification based on cough and vocalised /a:/ sounds. J. Acoust. Soc. Am. 2020, 148, EL253–EL259.
- Claxton, S.; Porter, P.; Brisbane, J.; Bear, N.; Wood, J.; Peltonen, V.; Della, P.; Abeyratne, U. Identifying acute exacerbations of chronic obstructive pulmonary disease using patient-reported symptoms and cough feature analysis. NPJ Digit. Med. 2021, 4, 107.
- Mouawad, P.; Dubnov, T.; Dubnov, S. Robust Detection of COVID-19 in Cough Sounds. Comput. Sci. 2021, 2, 34.
- Zhou, Q.; Shan, J.; Ding, W.; Chengyin, W.; Yuan, S.; Sun, F.; Li, H.; Fang, B. Cough Recognition Based on Mel-Spectrogram and Convolutional Neural Network. Front. Robot. 2021, 8, 580080.
- Rudraraju, G.; Palreddy, S.; Mamidgi, B.; Sripada, N.R.; Padma Sai, Y.; Kumar, V.; Haranath, S. Cough sound analysis and objective correlation with spirometry and clinical diagnosis. Inform. Med. Unlocked 2020, 19, 100319.
- Laguarta, J.; Hueto, F.; Subirana, B. COVID-19 Artificial Intelligence Diagnosis Using Only Cough Recordings. IEEE Open J. Eng. Med. Biol. 2020, 1, 275–281.
- Bansal, V.; Pahwa, G.; Kannan, N. Cough Classification for COVID-19 based on audio mfcc features using Convolutional Neural Networks. In Proceedings of the 2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON), Greater Noida, India, 2–4 October 2020; pp. 604–608.
- Andreu-Perez, J.; Espinosa, H.P.; Timonet, E.; Kiani, M.; Girón-Pérez, M.I.; Benitez-Trinidad, A.B.; Jarchi, D.; Rosales-Pérez, A.; Gatzoulis, N.; Reyes-Galaviz, O.F.; et al. A Generic Deep Learning Based Cough Analysis System from Clinically Validated Samples for Point-of-Need Covid-19 Test and Severity Levels. arXiv 2021, arXiv:2111.05895.
- Feng, K.; He, F.; Steinmann, J.; Demirkiran, I. Deep-learning Based Approach to Identify COVID-19. In Proceedings of the SoutheastCon 2021, Atlanta, GA, USA, 10–13 March 2021; pp. 1–4.
- Hassan, A.; Shahin, I.; Alsabek, M.B. COVID-19 Detection System using Recurrent Neural Networks. In Proceedings of the 2020 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), Sharjah, United Arab Emirates, 3–5 November 2020; pp. 1–5.
- Imran, A.; Posokhova, I.; Qureshi, H.N.; Masood, U.; Riaz, M.S.; Ali, K.; John, C.N.; Hussain, M.I.; Nabeel, M. AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app. Inform. Med. Unlocked 2020, 20, 100378.
- Pal, A.; Sankarasubbu, M. Pay Attention to the cough: Early Diagnosis of COVID-19 using Interpretable Symptoms Embeddings with Cough Sound Signal Processing. arXiv 2020, arXiv:2010.02417.
- Pahar, M.; Niesler, T. Machine Learning based COVID-19 Detection from Smartphone Recordings: Cough, Breath and Speech. arXiv 2021, arXiv:2104.02477.
- You, M.; Wang, W.; Li, Y.; Liu, J.; Xu, X.; Qiu, Z. Automatic cough detection from realistic audio recordings using C-BiLSTM with boundary regression. Biomed. Signal Process. Control. 2022, 72, 103304.
- Kruizinga, M.; Zhuparris, A.; Dessing, E.; Krol, F.; Sprij, A.; Doll, R.; Stuurman, F.; Exadaktylos, V.; Driessen, G.; Cohen, A. Development and technical validation of a smartphone-based pediatric cough detection algorithm. Pediatr. Pulmonol. 2022, 57, 761–767.
- Chowdhury, N.; Kabir, M.; Rahman, M.; Islam, S. Machine learning for detecting COVID-19 from cough sounds: An ensemble-based MCDM method. Comput. Biol. Med. 2022, 145, 105405.
- Swarnkar, V.; Abeyratne, U.; Chang, A.; Amrulloh, Y.; Setyati, A.; Triasih, R. Automatic Identification of Wet and Dry Cough in Pediatric Patients with Respiratory Diseases. Ann. Biomed. Eng. 2013, 41, 1016–1028.
- Vhaduri, S. Nocturnal Cough and Snore Detection Using Smartphones in Presence of Multiple Background-Noises. In Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies, Guayaquil, Ecuador, 15–17 June 2020; pp. 174–186.
- Teyhouee, A.; Osgood, N. Cough Detection Using Hidden Markov Models. arXiv 2019, arXiv:1904.12354.
- Barata, F.; Tinschert, P.; Rassouli, F.; Steurer-Stey, C.; Fleisch, E.; Puhan, M.A.; Brutsche, M.; Kotz, D.; Kowatsch, T. Automatic Recognition, Segmentation, and Sex Assignment of Nocturnal Asthmatic Coughs and Cough Epochs in Smartphone Audio Recordings: Observational Field Study. J. Med. Internet Res. 2020, 22, e18082.
- Sharan, R.V.; Abeyratne, U.R.; Swarnkar, V.R.; Porter, P. Automatic Croup Diagnosis Using Cough Sound Recognition. IEEE Trans. Biomed. Eng. 2019, 66, 485–495.
- Monge-Álvarez, J.; Hoyos-Barceló, C.; Lesso, P.; Casaseca-de-la Higuera, P. Robust Detection of Audio-Cough Events Using Local Hu Moments. IEEE J. Biomed. Health Inform. 2019, 23, 184–196.
- Nemati, E.; Rahman, M.M.; Nathan, V.; Vatanparvar, K.; Kuang, J. A Comprehensive Approach for Classification of the Cough Type. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 208–212.
- Kobat, M.A.; Kivrak, T.; Barua, P.D.; Tuncer, T.; Dogan, S.; Tan, R.S.; Ciaccio, E.J.; Acharya, U.R. Automated COVID-19 and Heart Failure Detection Using DNA Pattern Technique with Cough Sounds. Diagnostics 2021, 11, 1962.
- Bales, C.; Nabeel, M.; John, C.N.; Masood, U.; Qureshi, H.N.; Farooq, H.; Posokhova, I.; Imran, A. Can Machine Learning Be Used to Recognize and Diagnose Coughs? In Proceedings of the 2020 International Conference on e-Health and Bioengineering (EHB), Iasi, Romania, 29–30 October 2020; pp. 1–4.
- Khomsay, S.; Vanijjirattikhan, R.; Suwatthikul, J. Cough detection using PCA and Deep Learning. In Proceedings of the 2019 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Republic of Korea, 16–18 October 2019; pp. 101–106.
- Sharan, R.V.; Abeyratne, U.R.; Swarnkar, V.R.; Porter, P. Cough sound analysis for diagnosing croup in pediatric patients using biologically inspired features. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju, Republic of Korea, 11–15 July 2017; pp. 4578–4581.
- Pahar, M.; Klopper, M.; Warren, R.; Niesler, T. COVID-19 Cough Classification using Machine Learning and Global Smartphone Recordings. Comput. Biol. Med. 2021, 135, 104572.
- Tena, A.; Clarià, F.; Solsona, F. Automated detection of COVID-19 cough. Biomed. Signal Process. Control. 2022, 71, 103175.
- Pahar, M.; Klopper, M.; Reeve, B.; Theron, G.; Warren, R.; Niesler, T. Automatic Cough Classification for Tuberculosis Screening in a Real-World Environment. arXiv 2021, arXiv:2103.13300.
- Melek, M. Diagnosis of COVID-19 and non-COVID-19 patients by classifying only a single cough sound. Neural Comput. Appl. 2021, 33, 17621–17632.
- Xue, H.; Salim, F.D. Exploring Self-Supervised Representation Ensembles for COVID-19 Cough Classification. arXiv 2021, arXiv:2105.07566.
- Hee, H.I.; Balamurali, B.; Karunakaran, A.; Herremans, D.; Teoh, O.H.; Lee, K.P.; Teng, S.S.; Lui, S.; Chen, J.M. Development of Machine Learning for Asthmatic and Healthy Voluntary Cough Sounds: A Proof of Concept Study. Appl. Sci. 2019, 9, 2833.
- Chaudhari, G.; Jiang, X.; Fakhry, A.; Han, A.; Xiao, J.; Shen, S.; Khanzada, A. Virufy: Global Applicability of Crowdsourced and Clinical Datasets for AI Detection of COVID-19 from Cough. arXiv 2020, arXiv:2011.13320.
- Xia, T.; Spathis, D.; Brown, C.; Chauhan, J.; Grammenos, A.; Han, J.; Hasthanasombat, A.; Bondareva, E.; Dang, T.; Floto, A.; et al. COVID-19 Sounds: A Large-Scale Audio Dataset for Digital Respiratory Screening. In Proceedings of the Thirty-Fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 29 August 2021.
- Sharma, N.; Krishnan, P.; Kumar, R.; Ramoji, S.; Chetupalli, S.R.; Nirmala, R.; Ghosh, P.K.; Ganapathy, S. Coswara—A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis. arXiv 2020, arXiv:2005.10548. Available online: http://arxiv.org/abs/2005.10548 (accessed on 14 January 2021).
- Wang, W. The Corp Dataset. IEEE Dataport 2021.
- Cohen-McFarlane, M.; Goubran, R.; Knoefel, F. Novel Coronavirus (2019) Cough Database: NoCoCoDa. IEEE Access 2020, 8, 154087–154094.
- Eyben, F.; Wöllmer, M.; Schuller, B. Opensmile: The Munich Versatile and Fast Open-Source Audio Feature Extractor. In Proceedings of the 18th ACM International Conference on Multimedia, MM ’10, Firenze, Italy, 25–29 October 2010; pp. 1459–1462.
- Azam, M.A.; Shahzadi, A.; Khalid, A.; Anwar, S.M.; Naeem, U. Smartphone based human breath analysis from Respiratory Sounds. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2018, 2018, 445–448.
- Basu, V.; Rana, S. Respiratory diseases recognition through respiratory sound with the help of deep neural network. In Proceedings of the 2020 4th International Conference on Computational Intelligence and Networks (CINE), Kolkata, India, 27–29 February 2020; pp. 1–6.
- Gelman, A.; Sokolovsky, V.; Furman, E.; Kalinina, N.; Furman, G. Artificial intelligence in the respiratory sounds analysis and computer diagnostics of bronchial asthma. medRxiv 2021.
- Haider, N.S.; Singh, B.K.; Periyasamy, R.; Behera, A.K. Respiratory sound based classification of chronic obstructive pulmonary disease: A risk stratification approach in machine learning paradigm. J. Med. Syst. 2019, 43, 255.
- Meng, F.; Shi, Y.; Wang, N.; Cai, M.; Luo, Z. Detection of Respiratory Sounds Based on Wavelet Coefficients and Machine Learning. IEEE Access 2020, 8, 155710–155720.
- Alkhodari, M.; Khandoker, A.H. Detection of COVID-19 in smartphone-based breathing recordings: A pre-screening deep learning tool. PLoS ONE 2022, 17, e0262448.
- Brown, C.; Chauhan, J.; Grammenos, A.; Han, J.; Hasthanasombat, A.; Spathis, D.; Xia, T.; Cicuta, P.; Mascolo, C. Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, 6–10 July 2020; pp. 3474–3484.
- Nguyen, T.; Pernkopf, F. Lung Sound Classification Using Co-tuning and Stochastic Normalization. arXiv 2021, arXiv:2108.01991.
- Gómez, A.F.R.; Orjuela-Cañón, A.D. Respiratory Sounds Classification employing a Multi-label Approach. In Proceedings of the 2021 IEEE Colombian Conference on Applications of Computational Intelligence, (ColCACI), Cali, Colombia, 26–28 May 2021; pp. 1–5.
- Falah, A.; Jondri, J. Lung Sounds Classification Using Stacked Autoencoder and Support Vector Machine. In Proceedings of the 2019 7th International Conference on Information and Communication Technology (ICoICT), Kuala Lumpur, Malaysia, 24–26 July 2019; pp. 1–5.
- Chen, H.; Yuan, X.; Pei, Z.; Li, M.; Li, J. Triple-Classification of Respiratory Sounds Using Optimized S-Transform and Deep Residual Networks. IEEE Access 2019, 7, 32845–32852.
- Tong, F.; Liu, L.; Xie, X.; Hong, Q.; Li, L. Respiratory Sound Classification: From Fluid-Solid Coupling Analysis to Feature-Band Attention. IEEE Access 2022, 10, 22018–22031.
- Wu, L.; Li, L. Investigating into segmentation methods for diagnosis of respiratory diseases using adventitious respiratory sounds. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 768–771.
- Ingco, W.E.M.; Reyes, R.S.; Abu, P.A.R. Development of a Spectral Feature Extraction using Enhanced MFCC for Respiratory Sound Analysis. In Proceedings of the 2019 International SoC Design Conference (ISOCC), Jeju, Republic of Korea, 6–9 October 2019; pp. 263–264.
- Tariq, Z.; Shah, S.K.; Lee, Y. Lung Disease Classification using Deep Convolutional Neural Network. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, CA, USA, 18–21 November 2019; pp. 732–735.
- Shi, L.; Du, K.; Zhang, C.; Ma, H.; Yan, W. Lung Sound Recognition Algorithm Based on VGGish-BiGRU. IEEE Access 2019, 7, 139438–139449.
- Liu, Y.; Lin, Y.; Zhang, X.; Wang, Z.; Gao, Y.; Chen, G.; Xiong, H. Classifying respiratory sounds using electronic stethoscope. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–8.
- Liu, Y.X.; Yang, Y.; Chen, Y.H. Lung sound classification based on Hilbert-Huang transform features and multilayer perceptron network. In Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Kuala Lumpur, Malaysia, 12–15 December 2017; pp. 765–768.
- Kim, Y.; Hyon, Y.; Jung, S.; Lee, S.; Yoo, G.; Chung, C.; Ha, T. Respiratory sound classification for crackles, wheezes, and rhonchi in the clinical field using deep learning. Sci. Rep. 2021, 11, 17186.
- Gairola, S.; Tom, F.; Kwatra, N.; Jain, M. RespireNet: A Deep Neural Network for Accurately Detecting Abnormal Lung Sounds in Limited Data Setting. arXiv 2021, arXiv:2011.00196.
- de Mesquita Guimarães e Ferreira Cardoso, H. Pulmonary Auscultation Using Mobile Devices—Feasibility Study in Respiratory Diseases. Doctoral Dissertation, Universidade do Porto, Porto, Portugal, 2021.
- Rocha, B.M.; Pessoa, D.; Marques, A.; Carvalho, P.; Paiva, R.P. Automatic Classification of Adventitious Respiratory Sounds: A (Un)Solved Problem? Sensors 2021, 21, 57.
- Petmezas, G.; Cheimariotis, G.A.; Stefanopoulos, L.; Rocha, B.; Paiva, R.P.; Katsaggelos, A.K.; Maglaveras, N. Automated Lung Sound Classification Using a Hybrid CNN-LSTM Network and Focal Loss Function. Sensors 2022, 22, 1232.
- Islam, M.A.; Bandyopadhyaya, I.; Bhattacharyya, P.; Saha, G. Classification of Normal, Asthma and COPD Subjects Using Multichannel Lung Sound Signals. In Proceedings of the 2018 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 3–5 April 2018; pp. 0290–0294.
- Nguyen, T.; Pernkopf, F. Lung sound classification using snapshot ensemble of convolutional neural networks. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 760–763.
- Naqvi, S.Z.H.; Choudhary, M.A.; Tariq, Z.; Waseem, A. Automated Detection and Classification of Multichannel Lungs Signals using EMD. In Proceedings of the 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE), Istanbul, Turkey, 12–13 June 2020; pp. 1–6.
- Rocha, B.M.; Filos, D.; Mendes, L.; Vogiatzis, I.; Perantoni, E.; Kaimakamis, E.; Natsiavas, P.; Oliveira, A.; Jácome, C.; Marques, A.; et al. A Respiratory Sound Database for the Development of Automated Classification. In Precision Medicine Powered by pHealth and Connected Health. ICBHI 2017. IFMBE Proceedings, Thessaloniki, Greece, 18–21 November 2017; Maglaveras, N., Chouvarda, I., de Carvalho, P., Eds.; Springer: Singapore, 2018; pp. 33–37.
- Waterworth, C. Acoustics of breathing. Crit. Care 2000, 4, webreport1743.
- Suppakitjanusant, P.; Sungkanuparph, S.; Wongsinin, T.; Virapongsiri, S.; Kasemkosin, N.; Chailurkit, L.; Ongphiphadhanakul, B. Identifying individuals with recent COVID-19 through voice classification using Deep Learning. Sci. Rep. 2021, 11, 19149.
- Verde, L.; De Pietro, G.; Sannino, G. Artificial intelligence techniques for the non-invasive detection of COVID-19 through the analysis of Voice Signals. Arab. J. Sci. Eng. 2021, 48, 11143–11153.
- Verde, L.; De Pietro, G.; Ghoneim, A.; Alrashoud, M.; Al-Mutib, K.N.; Sannino, G. Exploring the Use of Artificial Intelligence Techniques to Detect the Presence of Coronavirus Covid-19 Through Speech and Voice Analysis. IEEE Access 2021, 9, 65750–65757.
- Shimon, C.; Shafat, G.; Dangoor, I.; Ben-Shitrit, A. Artificial intelligence enabled preliminary diagnosis for COVID-19 from voice cues and questionnaires. J. Acoust. Soc. Am. 2021, 149, 1120–1124.
- Nallanthighal, V.S.; Härmä, A.; Strik, H. Detection of COPD Exacerbation from Speech: Comparison of Acoustic Features and Deep Learning Based Speech Breathing Models. In Proceedings of the ICASSP 2022–2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 9097–9101.
- van Bemmel, L.; Harmsen, W.; Cucchiarini, C.; Strik, H. Automatic Selection of the Most Characterizing Features for Detecting COPD in Speech. In Proceedings of the Speech and Computer: 23rd International Conference, SPECOM, St Petersburg, Russia, 27–30 September 2021; pp. 737–748.
- Muguli, A.; Pinto, L.; R., N.; Sharma, N.; Krishnan, P.; Ghosh, P.K.; Kumar, R.; Bhat, S.; Chetupalli, S.R.; Ganapathy, S.; et al. DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics. arXiv 2021, arXiv:2103.09148.
- Pinkas, G.; Karny, Y.; Malachi, A.; Barkai, G.; Bachar, G.; Aharonson, V. SARS-COV-2 detection from voice. IEEE Open J. Eng. Med. Biol. 2020, 1, 268–274.
- Lella, K.K.; PJA, A. Automatic covid-19 disease diagnosis using 1D convolutional neural network and augmentation with human respiratory sound based on parameters: Cough, breath, and voice. AIMS Public Health 2021, 8, 240–264.
- Asim Iqbal, M.; Devarajan, K.; Ahmed, S.M. An optimal asthma disease detection technique for voice signal using hybrid machine learning technique. Concurr. Comput. Pract. Exp. 2022, 34, e6856.
- Iqbal, M.A.; Devarajan, K.; Ahmed, S.M. Real time detection and forecasting technique for asthma disease using speech signal and DENN classifier. Biomed. Signal Process. Control. 2022, 76, 103637.
- Yadav, S.; Keerthana, M.; Gope, D.; Maheswari, K.U.; Kumar Ghosh, P. Analysis of Acoustic Features for Speech Sound Based Classification of Asthmatic and Healthy Subjects. In Proceedings of the ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 6789–6793.
- Rashid, M.; Alman, K.A.; Hasan, K.; Hansen, J.H.L.; Hasan, T. Respiratory Distress Detection from Telephone Speech using Acoustic and Prosodic Features. arXiv 2020, arXiv:2011.09270.
- Iqbal, M.A.; Devarajan, D.K.; Ahmed, D.S.M. EAP-DL: Enhanced asthma prediction with voice recording using efficient feature extraction and classification technique. Int. J. Aquat. Sci. 2021, 12, 4595–4611.
- Shrivastava, P.; Tripathi, N. Comparison of different classification techniques for the detection of speech affected due to respiratory disorders. J. Phys. Conf. Ser. 2022, 2273, 012013.
- Das, S. A Machine Learning Model for Detecting Respiratory Problems using Voice Recognition. In Proceedings of the 2019 IEEE 5th International Conference for Convergence in Technology (I2CT), Bombay, India, 29–31 March 2019; pp. 1–3.
- Dash, T.K.; Mishra, S.; Panda, G.; Satapathy, S.C. Detection of COVID-19 from speech signal using bio-inspired based cepstral features. Pattern Recognit. 2021, 117, 107999.
- Liu, A.T.; Yang, S.-w.; Chi, P.-H.; Hsu, P.-c.; Lee, H.-y. Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders. In Proceedings of the ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 6419–6423.
- Boersma, P.; Weenink, D. PRAAT, a system for doing phonetics by computer. Glot Int. 2001, 5, 341–345.
- Schuller, B.W.; Batliner, A.; Amiriparian, S.; Bergler, C.; Gerczuk, M.; Holz, N.; Larrouy-Maestri, P.; Bayerl, S.P.; Riedhammer, K.; Mallol-Ragolta, A.; et al. The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes. arXiv 2022, arXiv:2205.06799.
- McFee, B.; Raffel, C.; Liang, D.; Ellis, D.P.; McVicar, M.; Battenberg, E.; Nieto, O. librosa: Audio and music signal analysis in python. In Proceedings of the 14th Python in Science Conference, Austin, TX, USA, 6–12 July 2015; Volume 8, pp. 18–25.
- Orlandic, L.; Teijeiro, T.; Atienza, D. The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms. Sci. Data 2021, 8, 156.
- Sait, U.; Kv, G.; Shivakumar, S.; Kumar, T.; Bhaumik, R.; Prakash Prajapati, S.; Bhalla, K. Spectrogram Images of Breathing Sounds for COVID-19 and other Pulmonary Abnormalities. Mendeley Data 2021.
- Pizzo, D.T.; Esteban, S. IATos: AI-powered pre-screening tool for COVID-19 from cough audio samples. arXiv 2021, arXiv:2104.13247.
- Zhang, Q.; Zhang, J.; Yuan, J.; Huang, H.; Zhang, Y.; Zhang, B.; Lv, G.; Lin, S.; Wang, N.; Liu, X.; et al. SPRSound: Open-Source SJTU Paediatric Respiratory Sound Database. IEEE Trans. Biomed. Circuits Syst. 2022, 16, 867–881.
- Hsu, F.S.; Huang, S.R.; Huang, C.W.; Cheng, Y.R.; Chen, C.C.; Hsiao, J.; Chen, C.W.; Lai, F. A Progressively Expanded Database for Automated Lung Sound Analysis: An Update. Appl. Sci. 2022, 12, 7623.
- Piczak, K.J. ESC: Dataset for Environmental Sound Classification. In Proceedings of the 23rd ACM International Conference on Multimedia, MM ’15, Brisbane, Australia, 26–30 October 2015; pp. 1015–1018.
- Gemmeke, J.F.; Ellis, D.P.W.; Freedman, D.; Jansen, A.; Lawrence, W.; Moore, R.C.; Plakal, M.; Ritter, M. Audio set: An ontology and human-labeled dataset for audio events. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 776–780.
- Zhang, Q.; Zhang, J.; Yuan, J.; Huang, H.; Zhang, Y.; Zhang, B.; Lv, G.; Lin, S.; Wang, N.; Liu, X.; et al. Grand Challenge on Respiratory Sound Classification for SPRSound Dataset. In Proceedings of the 2022 IEEE Biomedical Circuits and Systems Conference (BioCAS), Taipei, Taiwan, 13–15 October 2022; pp. 1–5.
- Kilintzis, V.; Beredimas, N.; Kaimakamis, E.; Stefanopoulos, L.; Chatzis, E.; Jahaj, E.; Bitzani, M.; Kotanidou, A.; Katsaggelos, A.K.; Maglaveras, N. CoCross: An ICT platform enabling monitoring recording and fusion of clinical information chest sounds and imaging of COVID-19 ICU patients. Proc. Healthc. 2022, 10, 276.
Reference | Year | Topic | Data and Cohort | Recording Device | ML Models Used | Data Processing Methods | KPIs |
---|---|---|---|---|---|---|---|
[8] | 2020 | Cough Detection | Private-110 subjects (various resp. diseases) | Various Microphones | XGBoost, RF, DT | Feature Extraction from cough events | Acc: 93% Sens: 97% Spec: 95% |
[9] | 2020 | Disease Identification: COVID-19 | Private-2660 subjects | Web App, Various Microphones | CNN w/1 Poisson biomarker and 3 pre-trained ResNet50 | MFCCs | AUC: 97% |
[10] | 2020 | Disease Identification: COVID-19 | ESC-50, Audioset-Cohort n.s. | Various Microphones | CNN, VGG16 | MFCCs | Accuracies: COVID/Non-COVID: 70% Cough/Non-Cough: 90% |
[11] | 2021 | Disease Identification: COVID-19 | Private-8380 positive, 6041 negative instances | N/A | 2D CNN | EMD and MFCCs | AUC: 97% |
[12] | 2021 | Disease Identification: COVID-19 | Coswara, Virufy-Cohort n.s. | Web App, Various Microphones | SVM, KVM, RF | Common Short Term Features and MFCCs | AUC: 90% |
[13] | 2021 | Disease Identification: COVID-19 | Private-240 acoustic data—60 normal, 20 COVID-19 subjects | Smartphone | LSTM (RNN) | Spec. Centroid, Spec. roll-off, ZCR, MFCC (+ΔΔ) | Acc: 97%, F1: 97.9% |
[6] | 2021 | Disease Identification: COVID-19 | Private-Cohort n.s. | Crowdsourced audio recordings, Various Microphones | XGBoost | Feature Extraction | Acc: 97% |
[14] | 2020 | Cough Detection, Disease Identification: COVID-19 | ESC-50 and Private-543 Recordings (96 bronchitis, 130 pertussis, 70 COVID-19, 247 Normal) | N/A | 1 CNN for Cough Classification and 3 CNNs for COVID-19 detection | Mel-spectrograms to images | Accuracies: Cough Detection: 95.5% COVID-19 Identification: 92.64% |
[15] | 2021 | Disease Identification: Bronchitis, Asthma, COVID-19, Healthy | N/A | N/A | Fully connected NN layer | Questionnaire and Cough Embeddings | Acc: 95.04% |
[16] | 2022 | Cough Detection, Disease Identification: COVID-19 | Virufy-Cohort n.s. | N/A | DNN | Windowing and Feature Extraction | Acc: 97.5% |
[17] | 2022 | Cough Detection | Corp Dataset-42 volunteers (18 CAP, 4 asthma, 17 COPD, 3 other resp. illness) | Digital Recorder | CNN | MFCCs | Acc: 99.64%, IoU: 0.89 |
[18] | 2022 | Cough Detection | Public/Private-3228 cough and 480,780 non-cough sounds | Various Microphones | GB Classifier | Feature Extraction and manual selection | Acc: 99.6% (Validation only on hospital data from children) |
[19] | 2022 | Cough Detection, Disease Identification: COVID-19 | Cambridge, Coswara, Virufy, NoCoCoDa - Cohort n.s. | Various Microphones | Adaboost, MLP, XGBoost, Gboost, LR, K-NN, HGBoost to MCDM | Feature Extraction | Acc: 85% |
[5] | 2022 | Disease Identification: COPD, AECOPD | Private-177 volunteers (78 COPD, 86 AECOPD, 13 not used) | Various Microphones | N/A | Feature Extraction | ROC Curve: 0.89, agreement w/ clinical study |
[20] | 2013 | Symptom Identification: Wet Cough, Dry Cough | Private-78 Patients | High-end Microphones | LR | Feature Extraction | Sens: 84%, Spec: 76% |
[21] | 2020 | Cough Detection | Private-26 healthy participants | Smartphone | K-NN, DT, RF, SVM | Feature Extraction and Selection | F1 Score: 88% |
[22] | 2019 | Cough Detection | Private-20 min of cough sounds | N/A | Hidden Markov Models | Single and Multiple Energy Band | AUC: 0.844 |
[23] | 2020 | Cough Detection | Private-94 adults | Smartphone | CNN | Mel spectrograms | Accuracies: Cough Detection: 99.7% Sex Classification: 74.8% |
[24] | 2018 | Cough Detection, Disease Identification: Croup | Private-56 croup and 424 non-croup subjects | Smartphone | SVM and LR | Feature Extraction | Sens: 92.31% Spec: 85.29% Croup classification Acc: 86.09% |
[25] | 2020 | Cough Detection | N/A | N/A | SVM and K-NN | Feature Extraction | Accuracies: K-NN: 94.51%, SVM: 81.22% |
[26] | 2020 | Symptom Identification: Wet Cough, Dry Cough | Private-5971 coughs (5242 dry and 729 wet) | Smartphone | RF | Feature Extraction (Custom and OpenSmile) | Acc: 88% |
[27] | 2021 | Disease Identification: Heart-failure, COVID-19, Healthy | Private-732 patients (241 COVID-19, 244 Heart-failure, 247 Normal) | Smartphone | K-NN | DNA Pattern Feature Generator, mRMR Feature Selector | Acc: 99.5% |
[28] | 2020 | Cough Detection, Disease Identification: Pertussis, Bronchitis, Bronchiolitis | ESC-50, Audioset-Cohort n.s. | Various Microphones | CNN | Mel-spectrograms | Accuracies: Disease Identification: 89.60%, Cough Detection: 88.05% |
[29] | 2019 | Cough Detection, Symptom Identification: Productive Cough, Non-productive Cough | Private-810 events: 229 non-productive cough, 74 productive cough, and 507 other sounds | Various Microphones | DLN | FFT and PCA | Acc: 98.45% |
[30] | 2017 | Disease Identification: Croup | Private-364 patients (43 Croup, 321 non-croup) | Smartphone | LR and SVM | MFCCs and CIFs | Acc: 98.33% |
[4] | 2020 | Disease Identification: Asthma | Private-997 asthmatic, 1032 healthy sounds | Smartphone | GMM - UBM | MFCCs and CQSSs | Acc: 95.3% |
[31] | 2021 | Disease Identification: COVID-19 | Coswara-Cohort n.s. | Smartphone | ResNet50 | Feature Extraction | Acc: 97.6% |
[32] | 2022 | Disease Identification: COVID-19 | Coswara, Virufy, Cambridge and private-Cohort n.s. | Various Microphones | Most popular supervised models | Feature selection and extraction | Best Acc: Random Forest: 83.67% |
[33] | 2022 | Disease Identification: Tuberculosis | Private-16 TB and 35 non-TB patients | Various Microphones | LR, SVM, K-NN, MLP, CNN | Feature Extraction | Best Acc: LR: 84.54% |
[34] | 2021 | Disease Identification: COVID-19 | Virufy, NoCoCoDa-Cohort n.s. | Various Microphones | SVM, LDA, K-NN | Feature Extraction and SFS feature selection | Best Acc: K-NN: 98.33% |
[7] | 2021 | Cough Detection | ESC-50-50 cough recordings | Various Microphones | CNN | Mel spectrograms and Data Augmentation | Acc: 98% |
[35] | 2021 | Disease Identification: COVID-19 | Coswara, COVID-19 Sounds-Cohort n.s. | Various Microphones | Contrastive Learning | Contrastive Pre-training: Feature Encoder w/ Random Masking | Acc: 83.74% |
[36] | 2019 | Disease Identification: Asthma, Healthy | Private-89 children for each cohort, 1992 healthy and 1140 asthmatic cough sounds | Smartphone | Gaussian mixture model | Downsampling and multidimensional Feature Extraction (MFCCs and CQCCs) | Sens: 82.81%, Spec: 84.76%, AROC: 0.91 |
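Several cohorts in the table above are strongly imbalanced, which makes a bare accuracy figure hard to read. As a minimal sketch, the snippet below computes the accuracy of a trivial classifier that always predicts the largest class, using the wet/dry cough counts listed for [26]. The function name is hypothetical, and the table does not state how that study split or balanced its data, so this is context for interpreting accuracy rather than a re-evaluation of the study.

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of a classifier that always predicts the largest class."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("class_counts must contain at least one sample")
    return max(class_counts.values()) / total


# Cohort sizes copied from the row for [26]: 5242 dry and 729 wet coughs.
wet_dry = {"dry": 5242, "wet": 729}
baseline = majority_baseline_accuracy(wet_dry)
print(f"always-'dry' baseline accuracy: {baseline:.2%}")
```

With these counts the baseline comes out just under 88%, which is why sensitivity and specificity per class are generally more informative than accuracy alone for cohorts like this one.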
Reference | Year | Topic | Data and Cohort | Recording Device | ML Models Used | Data Processing Methods | KPIs |
---|---|---|---|---|---|---|---|
[43] | 2018 | Symptom Identification: Wheeze | Private-255 breathing cycles, 50 patients | Smartphone | SVM | Bag-of-Words To Features | Acc: 75.21% |
[44] | 2020 | Disease Identification: Bronchitis, Pneumonia | Private-739 recordings | Various Microphones | K-NN | EMD, MFCCs, GTCC | Acc: 99% |
[45] | 2021 | Disease Identification: Bronchial Asthma | Private-952 recordings | High-end Microphones | NN, RF | Spectral Bandwidth, Spectral Centroid, ZCR, Spectral Roll-Off, Chromacity | Sens: 89.3%, Spec: 86%, Acc: 88%, Youden’s Index: 0.753 |
[46] | 2019 | Disease Identification: COPD | Private-55 recordings | Stethoscope | Fine Gaussian SVM | Statistical Features, MFCCs | Acc: 100% |
[47] | 2018 | Disease Identification: Asthma, COPD | Private-80 normal, 80 COPD, and 80 asthma recordings | Stethoscope | ANN | PSD Extracted Features, Feature Selection (ANOVA) | Acc: 60%, Spec: 54.2% |
[48] | 2022 | Disease Identification: COVID-19 | Coswara-120 recordings from COVID-19 patients, 120 recordings from Healthy patients | Various Microphones | Neural Network | Statistical and CNN-BiLSTM Extracted Features | Acc: 100% (shallow recordings), 88.89% (deep recordings) |
[49] | 2021 | Disease Identification: COVID-19 | COVID-19 Sounds-141 recordings | High-end Microphones | VGGish | Spectral Centroid, MFCCs, Roll-off Frequency, ZCR | ROC-AUC: 80%, Prec: 69%, Recall: 69% |
[50] | 2022 | Symptom Identification: Wheeze, Crackle | Respiratory Sounds Database (RSDB) and private-943 recordings | Various Microphones | ResNet | Padding, STFT, Spectrum Correlation, Log-Mel Spectrograms, Normalization | Sens: 76.33%, Spec: 78.86% |
[51] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | ANN, SVM, RF | Time Statistics and Frequency features | Acc: NN: 73%, RF: 73%, SVM: 78.3% |
[52] | 2019 | Symptom Identification: Wheeze, Crackle | Private-21 normal samples, 12 wheezes, and 35 crackles | Stethoscope | SVM | CWT, Gaussian Filter, Average Power, Stacked Autoencoder | Acc: 86.51% |
[53] | 2019 | Symptom Identification: Wheeze, Crackle | RSDB’s stethoscope recordings-834 recordings | Stethoscope | ResNet | Optimized S-transform | Sens: 96.27%, Spec: 100%, Acc: 98.79% |
[54] | 2022 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | VGG-16 | Fluid-Solid Modeling, Recording Simulation, Downsampling, Feature Extraction | Sens: 28%, Spec: 81% |
[55] | 2020 | Disease Identification: Bronchiectasis, Bronchiolitis, COPD, Pneumonia, URTI, Healthy | RSDB-920 recordings | Various Microphones | RF | Resampling, Windows, Filtering, EMD, Features | Acc: 88%, Prec: 91%, Recall: 87%, Spec: 97% |
[47] | 2020 | Symptom Identification: Wheeze, Crackle | Private-705 lung sounds (240 crackle, 260 rhonchi, and 205 normal) | Stethoscope | SVM, NN, K-Nearest Neighbors (K-NN) | CWT | Acc: 90.71%, Sens: 91.19%, Spec: 95.20% |
[56] | 2021 | Symptom Identification: Crackle, Normal, Stridor, Wheeze | Private-600 recordings | Stethoscope | SVM, K-NN | Filtering, Amplification, Dimensionality Reduction, MFCCs, NLM Filter | Acc: SVM: 92%, K-NN: 97% |
[57] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | 2D CNN | RMS Norm, Peak Norm, EBU Norm, Data Augmentation | Acc: 88% |
[58] | 2019 | Symptom Identification: Wheeze, Crackle | Private-384 recordings | Stethoscope | VGGish-BiGRU | Spectrograms | Acc: 87.41% |
[59] | 2017 | Symptom Identification: Wheeze, Crackle | Private-60 recordings | Stethoscope | Gaussian Mixture Model | MFCCs | Acc: 98.4% |
[60] | 2017 | Symptom Identification: Wheeze, Crackle | Private-recordings containing 11 crackles, 3 wheezes, 4 stridors, 2 squawks, 2 rhonchi, and 29 normal sounds | Digital stethoscope | MLP | EMD, IMF, Spectrum, Feature Extraction | Acc: Crackles 92.16%, Wheeze 95%, Stridor 95.77%, Squawk 99.14%, Normal 88.36%, AVG 94.82% |
[61] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | VGG-16 | Resampling, Windows, Filtering, Mel spectrogram (Mel, Harmonic, Percussive, Derivative) | Acc: Wheeze 89.00%, Rhonchi 68.00%, Crackles 90.00% |
[62] | 2020 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | ResNet | Resampling, Windows, Filtering, Data Augmentation, Mel-spectrogram, Device Specific Features | 80/40 Split 4 class (per device): Spec: 83.3%, Sens: 53.7%, Score: 68.5% |
[63] | 2021 | Symptom Identification: Crackle, Wheeze – Disease Identification: Asthma, Cystic Fibrosis | Private-Recordings from 95 patients | Various Microphones | N/A | N/A | 85% agreement (k = 0.35 (95% CI 0.26-0.44)) between conventional and smartphone auscultation |
[64] | 2021 | Symptom Identification: Wheeze, Crackle, Other | RSDB-920 recordings | Various Microphones | LDA, SVM with Radial Basis Function (SVMrbf), Random Undersampling Boosted trees (RUSBoost), CNNs. | Spectrogram, Mel-spectrogram, Scalogram, Feature Extraction | Acc: 99.6% |
[65] | 2022 | Symptom Identification: Wheeze, Crackle, Normal | RSDB-920 recordings | Various Microphones | Hybrid CNN-LSTM | Feature Extraction | Sens: 52.78%, Spec: 84.26%, F1: 68.52%, Acc: 76.39% |
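The KPI column above mixes sensitivity, specificity, accuracy, F1, Youden's index, and an unlabelled "Score". The sketch below shows how these quantities relate, using the standard confusion-matrix definitions. The `Confusion` class, the function names, and the example counts are hypothetical. Only the two checks at the end use numbers from the table: the Youden's index listed for [45], and the "Score" listed for [62], which matches the arithmetic mean of that row's sensitivity and specificity. The same mean of the sensitivity and specificity listed for [65] is 68.52%, the value shown there as F1. The table does not say how either figure was computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (positive = symptom/disease present)."""
    tp: int
    fp: int
    tn: int
    fn: int


def sensitivity(c):
    """True-positive rate, also called recall."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """True-negative rate."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def f1(c):
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def youden_index(sens, spec):
    """Youden's J = sensitivity + specificity - 1."""
    return sens + spec - 1


def mean_sens_spec(sens, spec):
    """Arithmetic mean of sensitivity and specificity."""
    return (sens + spec) / 2


# Made-up counts, for illustration only (not taken from any reviewed study).
example = Confusion(tp=45, fp=20, tn=80, fn=5)
print(f"sens={sensitivity(example):.3f} spec={specificity(example):.3f} "
      f"acc={accuracy(example):.3f} f1={f1(example):.3f}")

# Values copied from the table rows for [45] and [62].
print(f"Youden's J for [45]: {youden_index(0.893, 0.86):.3f}")
print(f"mean(sens, spec) for [62]: {mean_sens_spec(0.537, 0.833):.3f}")
```

Because F1 ignores true negatives while the sensitivity/specificity mean does not, the two can differ substantially on the same confusion matrix, so they are best not compared directly across rows.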
Reference | Year | Topic | Data and Cohort | Recording Device | ML Models Used | Data Processing Methods | KPIs |
---|---|---|---|---|---|---|---|
[71] | 2020 | Disease Identification: COVID-19 | Private—116 subjects (76 at 8 weeks post COVID-19, 40 Healthy) | Smartphone, Various Microphones | VGG19 | Log-mel spectrogram | Acc: 85%, Sens: 89%, Spec: 77% |
[72] | 2021 | Disease Identification: COVID-19 | Coswara—166 subjects (83 COVID-19 positive, 83 Healthy) | Various Microphones | NB, Bayes Net, SGD, SVM, K-NN, Adaboost algorithm (model combination), DT, OneR, J48, RF, Bagging, Decision table, LWL | Fundamental Frequency (F0), Shimmer, Jitter and Harmonic to Noise Ratio, MFCC or Spectral Centroid or Roll-Off | Best overall results for vowels a, e, o: Random Forest: Acc: 82.35%, Sens: 94.12%, Spec: 70.59% |
[73] | 2021 | Disease Identification: COVID-19 | Coswara—1027 subjects (77 COVID-19 positive (54M, 23F), 950 Healthy (721M, 229F)) | Various Microphones | SVM, SGD, K-NN, LWL, Adaboost and Bagging, OneR, Decision Table, DT, REPTree | ComParE_2016, FF, Jitter and Shimmer, Harmonic to Noise Ratio, MFCCs, MFCC Δ and ΔΔ, Spec. Centroid, Spec. Roll-off | Best overall results for vowels a, e, o: SVM: Acc: 97.07%, F1: 82.35%, Spec: 97.37% |
[74] | 2021 | Disease Identification: COVID-19 | Private—196 subjects (69 COVID-19, 130 Healthy) | Mobile App, Web App–Smartphone, Various Microphones | SVM, RBF, RF | 1024 embedding feature vector from D-CNN | Best model: RF: Acc: 73%, F1: 81% |
[75] | 2022 | Disease Identification: COPD | Corpus Gesproken Nederlands—Cohort n.s. | Various Microphones | SVM | Mean intensity (dB), Mean frequency (Hz), Pitch variability (Hz), Mean center of gravity (Hz), Formants, Speaking rate, Syllables per breath group, Jitter, Jitter ppq5, Shimmer, Shimmer apq3, Shimmer apq5, HNR, ComParE_2016 | Acc: 75.12%, Sens: 85% |
[76] | 2021 | Disease Identification: COPD | Private—49 subjects (11 COPD exacerbation, 9 Stable COPD, 29 Healthy) | Smartphone | LDA, SVM | Duration, the four formants, mean gravity center, some measures of pitch and intensity, openSMILE, eGeMAPS, # of words read out loud, duration of file | p < 0.01 |
[77] | 2021 | Disease Identification: COVID-19 | Coswara—Dataset 1: 1040 subjects (965 non-COVID), Dataset 2: 990 subjects (930 non-COVID) | Smartphone | LR, MLP, RF | 39-dimensional MFCCs + Δ and ΔΔ coeff., window size of 1024 samples, window hop size = 441 samples | Dataset 1 - RF: Average AUC: 70.69%, Dataset 2 - RF: Average AUC: 70.17% |
[78] | 2020 | Disease Identification: COVID-19 | Israeli COVID-19 collection—88 subjects (29 positive, 59 negative) | Smartphone | Transformer, SVM | Mel spectrum transformation | /z/: F1: 81%, Prec: 82%, counting: F1: 80%, Prec: 80%, /z/, /ah/: F1: 79%, Prec: 80%, /ah/: F1: 74%, Prec: 83%, cough: F1: 58%, Prec: 72% |
[79] | 2021 | Disease Identification: COVID-19, Asthma | COVID-19 sounds—1541 Respiratory Sounds | Mobile App, Web App–Smartphone, Various Microphones | light-weight CNN | MMFCC, EGFCC and Data De-noising Auto encoder | COVID-19/non-COVID-19 + breath + cough: Acc: 89%, Asthma/non-asthma + breath + voice Acc: 84% |
[80] | 2022 | Disease Identification: Asthma | Private—8 subjects (100 normal, 321 Wheezing, 98 Striding, 73 Rattling sounds) | N/A | DQNN, Hybrid machine learning | IWO, Signal Selection: EHS algorithm | Spec: 99.8%, Sens: 99.2%, Acc: 100% |
[81] | 2022 | Disease Identification: Asthma | 18 patients—300 respiratory sounds, 10 types of breathing | N/A | DENN | IWO Algorithm for Asthma Detection & Forecasting | Spec: 99.8%, Sens: 99.2%, Acc: 99.91% |
[82] | 2020 | Disease Identification: Asthma | Private—95 subjects (47 asthmatic, 48 healthy) | Various Microphones | SVM | ISCB using openSMILE, SET A: 5900 features, SET B: 6373 features, MFCC | /oU/ All feature groups: Acc: 74% |
[13] | 2020 | Disease Identification: COVID-19 | Private–240 acoustic data—60 normal, 20 COVID-19 subjects | Smartphone | LSTM (RNN) | Spec. Centroid, Spec. roll-off, ZCS, MFCC (+) | Cough: F1: 97.9% acc: 97%, breathing: F1: 98.8% acc: 98.2%, voices: F1: 92.5% acc: 88.2% |
[83] | 2020 | Disease Identification: Asthma | 88 recordings: 1957 segments (65 Severe resp. distress, 216 Asthma, 673 Mild resp. distress) | Smartphone | LIBSVM | Acoustic features: Interspeech 2010 Paralinguistic Challenge, 38 LLDs and 21 functionals | Acoustic Features: Acc: 86.3%, Sens: 85.9%, Spec: 86.9% |
[84] | 2021 | Disease Identification: Asthma | Private—30 subjects | N/A | RDNN | Discrete Ripplet-II Transform | Proposed EAP-DL: Acc: 86.3%, Sens: 85.9%, Spec: 86.9% |
[85] | 2022 | Symptom Identification: Voice Alteration | OPJHRC Fortis hospital in Raigarh—Cohort n.s. | Various Microphones | K-NN, SVM, LDA, LR, Linear SVM, etc. | Formant Frequencies, Pitch, Intensity, Jitter, Shimmer, Mean Autocorrelation, Harmonic to Noise ratio, Noise to Harmonic ratio, MFCC, LPC | Decision Tree K-fold: Acc: 90%, Sens: 90%, Spec: 90% |
[86] | 2019 | Symptom Identification: Voice Alteration | Private—Cohort n.s. | Various Microphones | Pretrained from Intel OpenVINO and TensorFlow | Not specified; however, the models are vision-based | N/A |
[87] | 2021 | Disease Identification: COVID-19 | Coswara, Cambridge DB-2—4352 Web App users, 2261 Android App users | Smartphone | SVM | MFCC | Acc: 85.7%, F2: 85.1% |
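The row for [77] above lists 39-dimensional MFCC features with a 1024-sample window and a 441-sample hop. A common way to reach 39 dimensions is to stack 13 static coefficients with their first- and second-order time derivatives (deltas). Whether [77] used exactly this layout is an assumption here, as are the function names, the regression width of 2 frames, and the 44.1 kHz sample rate used in the frame-count example. The sketch uses only the standard library and placeholder static coefficients, not a real MFCC front end.

```python
def num_frames(n_samples, win=1024, hop=441):
    """Number of full analysis windows that fit in the signal without padding."""
    if n_samples < win:
        return 0
    return 1 + (n_samples - win) // hop


def delta(frames, width=2):
    """Regression-based delta over +/- `width` frames.

    Frames beyond either end are replaced by the nearest edge frame.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, last)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(static):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Placeholder "MFCCs": 6 frames of 13 coefficients forming a linear ramp.
static = [[float(t)] * 13 for t in range(6)]
features = stack_with_deltas(static)
print(len(features), "frames x", len(features[0]), "dims")

# One second of audio at an assumed 44.1 kHz sample rate.
print(num_frames(44100), "frames per second of audio")
```

At 44.1 kHz a 441-sample hop corresponds to a 10 ms frame step, which is a conventional choice for speech features. At other sample rates the same hop would give a different step.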
Reference | Title | Description | Provider | Suitable for Respiratory Disease Classification | Suitable for Cough Detection |
---|---|---|---|---|---|
[92] | COUGHVID | Over 25,000 crowdsourced audio recordings: Cough—a wide range of participant ages, genders, geographic locations, and COVID-19 statuses | Embedded Systems Laboratory (ESL), EPFL, Lausanne, Switzerland | X | ✓ |
[49] | COVID-19 Sounds | 53,449 audio recordings, over 552 h in total: 3 Cough, 3–5 Breathing, 3 Speech of users reading a specific sentence | University of Cambridge | X | ✓ |
[39] | Coswara | 2747 audio recordings: Breathing, Coughing, Talking—Crowdsourced dataset (not clinically validated) | Indian Institute of Science (IISc), Bangalore | ✓ | ✓ |
[69] | Respiratory Sound Database (RSDB) | 920 audio recordings: Crackles or/and Wheezes - Digital stethoscopes and microphones, each recording is expertly annotated | Department of Informatics Engineering, University of Coimbra, Portugal and School of Medicine, Aristotle University of Thessaloniki, Greece | ✓ | X |
[40] | Corp | 168 h of 9969 audio recordings: Cough—42 different patients with respiratory diseases | MARI Lab, Tongji University | X | ✓ |
[37] | Virufy | Combination of Coswara & COUGHVID audio recordings: Cough—COVID-19 positive/negative | The Covid Detection Foundation (California nonprofit corporation) | X | ✓ |
[93] | COVID-19 and Pulmonary Abnormalities | 1734 COVID-19 spectrogram images of respiratory sounds: 795 Crackles, 322 Wheezes, 1143 Normal. | Indian Institute of Science, PES University, M S Ramaiah Institute of Technology, Concordia University | ✓ | X |
[94] | Tos-COVID | Audio recordings: Cough | Gov. of Buenos Aires city | X | ✓ |
[95] | SPRSound: Open-Source SJTU Paediatric Respiratory Sound Database | 2683 audio recordings and 9089 audio events: Respiratory Symptoms/Sounds—292 participants. | Shanghai Jiao Tong University and Shanghai Children’s Medical Center (SCMC) | ✓ | X |
[96] | HF_Lung | Audio recordings: Lung Sounds/Symptoms—Used for developing automated inhalation, exhalation, and adventitious sound detection algorithms | Taiwan Smart Emergency and Critical Care (TSECC) and Taiwan Society of Emergency and Critical Care Medicine (TSECCM) | ✓ | X |
[97] | ESC-50 | 2000 audio recordings: Environmental, Various, Cough—Labeled collection suitable for benchmarking methods of sound classification | Warsaw University of Technology, Warsaw, Poland | ∼ | ✓ |
[98] | AudioSet | 2,084,320 10-s audio recordings: Environmental, Various, Cough, Respiratory Symptoms—Expanding ontology, 632 human-labeled audio event classes, drawn from YouTube videos | Sound and Video Understanding teams, Google LLC | ∼ | ✓ |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kapetanidis, P.; Kalioras, F.; Tsakonas, C.; Tzamalis, P.; Kontogiannis, G.; Karamanidou, T.; Stavropoulos, T.G.; Nikoletseas, S. Respiratory Diseases Diagnosis Using Audio Analysis and Artificial Intelligence: A Systematic Review. Sensors 2024, 24, 1173. https://doi.org/10.3390/s24041173