Hybrid Machine Learning Framework for Multistage Parkinson’s Disease Classification Using Acoustic Features of Sustained Korean Vowels
Abstract
:1. Introduction
- (i)
- A speech corpus of three sustained Korean vowels has been created (InhaPD) from Parkinson’s patients who were recruited at Inha University Hospital.
- (ii)
- A feature bank consisting of 43 acoustic features has been collaboratively created, using baseline features, vocal-fold features, and time–frequency features. The analysis of variance (ANOVA) F-Value feature selection classifier from scikit-learn was employed, to identify the top 20 most relevant features, resulting in a more effective and efficient feature subset.
- (iii)
- To address the dataset imbalance, we introduced a synthetic minority over-sampling technique (SMOTE) that generates new samples in the minority class and its neighboring samples, effectively balancing the distribution of classes, and reducing bias in the classification.
- (iv)
- A 10-fold cross-validation is performed for each model, using StratifiedKFold, demonstrating the robustness and effectiveness of our system compared to other recent approaches to stage classification.
- (v)
- The evaluation of the classification included the measuring accuracy, precision, recall, F1 score, and the area under the curve (AUC). This evaluation aimed to test the hypothesis proposing that a HMLP analysis of sustained vowels could effectively differentiate individuals with different stages of the disease.
2. Methodology
2.1. Methods
2.2. Dataset
2.3. Voice Recordings
2.4. Acoustic Measures
2.5. Feature Selection
2.6. Data Balancing
2.7. Machine Learning Algorithms and Hyperparameters
2.8. Performance Metrics
3. Results
3.1. Binary Classification
3.2. Multistage Classification
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Pringsheim, T.; Jette, N.; Frolkis, A.; Steeves, T.D.L. The Prevalence of Parkinson’s Disease: A Systematic Review and Meta-analysis. Mov. Disord. 2014, 29, 1583–1590. [Google Scholar] [CrossRef] [PubMed]
- Pramanik, M.; Pradhan, R.; Nandy, P.; Qaisar, S.M.; Bhoi, A.K. Assessment of Acoustic Features and Machine Learning for Parkinson’s Detection. J. Healthc. Eng. 2021, 9957132. [Google Scholar] [CrossRef] [PubMed]
- Agnihotri, A.; Aruoma, O.I. Alzheimer’s Disease and Parkinson’s Disease: A Nutritional Toxicology Perspective of the Impact of Oxidative Stress, Mitochondrial Dysfunction, Nutrigenomics and Environmental Chemicals. Taylor. Fr. 2019, 39, 16–27. [Google Scholar] [CrossRef]
- Dorsey, E.R.; Sherer, T.; Okun, M.S.; Bloemd, B.R. The Emerging Evidence of the Parkinson Pandemic. J. Park. Dis. 2018, 8, S3–S8. [Google Scholar] [CrossRef]
- Rusz, J.; Hlavnička, J.; Novotný, M.; Tykalov, T.; Pelletier, A.; Montplaisir, J.; Gagnon, J.-F.; Dušek, P.; Galbiati, A.; Marelli, S.; et al. Speech Biomarkers in Rapid Eye Movement Sleep Behavior Disorder and Parkinson Disease. Ann. Neurol. 2021, 90, 62–75. [Google Scholar] [CrossRef] [PubMed]
- Ma, A.; Lau, K.K.; Thyagarajan, D. Voice Changes in Parkinson’s Disease: What Are They Telling Us? J. Clin. Neurosci. 2020, 72, 1–7. [Google Scholar] [CrossRef]
- Fereshtehnejad, S.M.; Yao, C.; Pelletier, A.; Montplaisir, J.Y.; Gagnon, J.F.; Postuma, R.B. Evolution of Prodromal Parkinson’s Disease and Dementia with Lewy Bodies: A Prospective Study. Brain 2019, 142, 2051–2067. [Google Scholar] [CrossRef]
- Rusz, J.; Tykalová, T.; Novotný, M.; Zogala, D.; Růžička, E.; Dušek, P. Automated Speech Analysis in Early Untreated Parkinson’s Disease: Relation to Gender and Dopaminergic Transporter Imaging. Eur. J. Neurol. 2022, 29, 81–90. [Google Scholar] [CrossRef]
- Hlavnika, J.; Cmejla, R.; Tykalová, T.; Šonka, K.; Ruzicka, E.; Rusz, J. Automated Analysis of Connected Speech Reveals Early Biomarkers of Parkinson’s Disease in Patients with Rapid Eye Movement Sleep Behaviour Disorder. Sci. Rep. 2017, 7, 12. [Google Scholar] [CrossRef]
- Ngo, Q.C.; Motin, M.A.; Pah, N.D.; Drotár, P.; Kempster, P.; Kumar, D. Computerized Analysis of Speech and Voice for Parkinson’s Disease: A Systematic Review. Comput. Methods Programs Biomed. 2022, 226, 107133. [Google Scholar] [CrossRef]
- Sakar, C.O.; Serbes, G.; Gunduz, A.; Tunc, H.C.; Nizam, H.; Sakar, B.E.; Tutuncu, M.; Aydin, T.; Isenkul, M.E.; Apaydin, H. A Comparative Analysis of Speech Signal Processing Algorithms for Parkinson’s Disease Classification and the Use of the Tunable Q-Factor Wavelet Transform. Appl. Soft. Comput. 2019, 74, 255–263. [Google Scholar] [CrossRef]
- Avuçlu, E.; Elen, A. Evaluation of Train and Test Performance of Machine Learning Algorithms and Parkinson Diagnosis with Statistical Measurements. Med. Biol. Eng. Comput. 2020, 58, 2775–2788. [Google Scholar] [CrossRef]
- Bourouhou, A.; Jilbab, A.; Nacir, C.; Hammouch, A. Comparison of Classification Methods to Detect the Parkinson Disease. In Proceedings of the 2016 International Conference on Electrical and Information Technologies, ICEIT, Tangiers, Morocco, 4–7 May 2016; pp. 421–424. [Google Scholar] [CrossRef]
- Zhang, L.; Qu, Y.; Jin, B.; Jing, L.; Gao, Z.; Liang, Z. An Intelligent Mobile-Enabled System for Diagnosing Parkinson Disease: Development and Validation of a Speech Impairment Detection System. JMIR Med. Inform. 2020, 8, e18689. [Google Scholar] [CrossRef]
- Meghraoui, D.; Boudraa, B.; Merazi-Meksen, T.; Boudraa, M. Parkinson’s Disease Recognition by Speech Acoustic Parameters Classification. Lect. Notes Netw. Syst. 2016, 1, 165–173. [Google Scholar]
- Braga, D.; Madureira, A.M.; Coelho, L.; Ajith, R. Automatic Detection of Parkinson’s Disease Based on Acoustic Analysis of Speech. Eng. Appl. Artif. Intell. 2019, 77, 148–158. [Google Scholar] [CrossRef]
- Nilashi, M.; Ibrahim, O.; Ahani, A. Accuracy Improvement for Predicting Parkinson’s Disease Progression. Sci. Rep. 2016, 6, 34181. [Google Scholar] [CrossRef]
- Dao, S.V.T.; Yu, Z.; Tran, L.V.; Phan, P.N.K.; Huynh, T.T.M.; Le, T.M. An Analysis of Vocal Features for Parkinson’s Disease Classification Using Evolutionary Algorithms. Diagnostics 2022, 12, 1980. [Google Scholar] [CrossRef]
- Kadiri, S.R.; Kethireddy, R.; Alku, P. Parkinson’s Disease Detection from Speech Using Single Frequency Filtering Cepstral Coefficients. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Shanghai, China, 25–29 October 2020; pp. 4971–4975. [Google Scholar] [CrossRef]
- Ma, J.; Zhang, Y.; Li, Y.; Zhou, L.; Qin, L.; Zeng, Y.; Wang, P.; Lei, Y. Deep Dual-Side Learning Ensemble Model for Parkinson Speech Recognition. Biomed Signal Process Control 2021, 69, 102849. [Google Scholar] [CrossRef]
- Gunduz, H. Deep Learning-Based Parkinson’s Disease Classification Using Vocal Feature Sets. IEEE Access 2019, 7, 115540–115551. [Google Scholar] [CrossRef]
- Hoehn, M.M.; Yahr, M.D. Parkinsonism. Neurology 1967, 17, 427. [Google Scholar] [CrossRef]
- Ozbolt, A.S.; Moro-Velazquez, L.; Lina, I.; Butala, A.A.; Dehak, N. Things to Consider When Automatically Detecting Parkinson’s Disease Using the Phonation of Sustained Vowels: Analysis of Methodological Issues. Appl. Sci. 2022, 12, 991. [Google Scholar] [CrossRef]
- Hsu, S.Y.; Yeh, L.R.; Chen, T.B.; Du, W.C.; Huang, Y.H.; Twan, W.H.; Lin, M.C.; Hsu, Y.H.; Wu, Y.C.; Chen, H.Y. Classification of the Multiple Stages of Parkinson’s Disease by a Deep Convolution Neural Network Based on 99mTc-TRODAT-1 SPECT Images. Molecules 2020, 25, 4792. [Google Scholar] [CrossRef] [PubMed]
- Ricciardi, C.; Amboni, M.; De Santis, C.; Ricciardelli, G.; Improta, G.; Iuppariello, L.; D’Addio, G.; Barone, P.; Cesarelli, M. Classifying Different Stages of Parkinson’s Disease Through Random Forests. IFMBE Proc. 2020, 76, 1155–1162. [Google Scholar]
- Suppa, A.; Costantini, G.; Asci, F.; Di Leo, P.; Al-Wardat, M.S.; Di Lazzaro, G.; Scalise, S.; Pisani, A.; Saggio, G. Voice in Parkinson’s Disease: A Machine Learning Study. Front Neurol. 2022, 13, 831428. [Google Scholar] [CrossRef]
- Templeton, J.M.; Poellabauer, C.; Schneider, S. Classification of Parkinson’s Disease and Its Stages Using Machine Learning. Sci. Rep. 2022, 12, 14036. [Google Scholar] [CrossRef]
- Moro-Velazquez, L.; Gomez-Garcia, J.A.; Arias-Londoño, J.D.; Dehak, N.; Godino-Llorente, J.I. Advances in Parkinson’s Disease Detection and Assessment Using Voice and Speech: A Review of the Articulatory and Phonatory Aspects. Biomed. Signal. Process. Control. 2021, 66, 102418. [Google Scholar] [CrossRef]
- Hughes, A.J.; Daniel, S.E.; Kilford, L.; Lees, A.J. Accuracy of Clinical Diagnosis of Idiopathic Parkinson’s Disease: A Clinico-Pathological Study of 100 Cases. J. Neurol. Neurosurg. Psychiatry 1992, 55, 181–184. [Google Scholar] [CrossRef] [PubMed]
- Omberg, L.; Chaibub Neto, E.; Perumal, T.M.; Pratap, A.; Tediarjo, A.; Adams, J.; Bloem, B.R.; Bot, B.M.; Elson, M.; Goldman, S.M.; et al. Remote Smartphone Monitoring of Parkinson’s Disease and Individual Response to Therapy. Nat. Biotechnol. 2022, 40, 480–487. [Google Scholar] [CrossRef]
- Faragó, P.; Ștefănigă, S.-A.; Cordoș, C.-G.; Mihăilă, L.-I.; Hintea, S.; Peștean, A.-S.; Beyer, M.; Perju-Dumbravă, L.; Ileșan, R.R. CNN-Based Identification of Parkinson’s Disease from Continuous Speech in Noisy Environments. Bioengineering 2023, 10, 531. [Google Scholar] [CrossRef]
- Asci, F.; Costantini, G.; Di Leo, P.; Zampogna, A.; Ruoppolo, G.; Berardelli, A.; Saggio, G.; Suppa, A. Machine-Learning Analysis of Voice Samples Recorded through Smartphones: The Combined Effect of Ageing and Gender. Sensors 2020, 20, 5022. [Google Scholar] [CrossRef]
- Lipsmeier, F.; Taylor, K.I.; Kilchenmann, T.; Wolf, D.; Scotland, A.; Schjodt-Eriksen, J.; Cheng, W.Y.; Fernandez-Garcia, I.; Siebourg-Polster, J.; Jin, L.; et al. Evaluation of Smartphone-Based Testing to Generate Exploratory Outcome Measures in a Phase 1 Parkinson’s Disease Clinical Trial. Mov. Disord. 2018, 33, 1287–1297. [Google Scholar] [CrossRef]
- Vaiciukynas, E.; Verikas, A.; Gelzinis, A.; Bacauskiene, M. Detecting Parkinson’s Disease from Sustained Phonation and Speech Signals. PLoS ONE 2017, 12, e0185613. [Google Scholar] [CrossRef]
- Benba, A.; Jilbab, A.; Hammouch, A. Detecting Patients with Parkinson ’ s Disease Using Mel Frequency Cepstral Coefficients and Support Vector Machines. Int. J. Electr. Eng. Inform. 2015, 7, 297–307. [Google Scholar]
- Boersma, P. Praat: Doing Phonetics by Computer. Ear Hear. 2011, 32, 266. [Google Scholar] [CrossRef]
- Jadoul, Y.; Thompson, B.; de Boer, B. Introducing Parselmouth: A Python Interface to Praat. J. Phon. 2018, 71, 1–15. [Google Scholar] [CrossRef]
- Pedregosa, F.; Michel, V.; Grisel Oliviergrisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Vanderplas, J.; Cournapeau, D.; Pedregosa, F.; Varoquaux, G.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
- Japkowicz, N.; Stephen, S. The Class Imbalance Problem: A Systematic Study. Intell. Data Anal. 2002, 6, 429–449. [Google Scholar] [CrossRef]
PD Stages | |||||
---|---|---|---|---|---|
Whole Group | H&Y 1 | H&Y 2 | H&Y 2.5 | H&Y 3 | |
Number | 101 | 23 | 49 | 12 | 17 |
Gender | |||||
Male | 60 | 13 | 28 | 11 | 8 |
Female | 41 | 10 | 21 | 1 | 9 |
Age | |||||
avg | 68.11 | 67.08 | 67.57 | 71.08 | 69 |
std | 6.83 | 7 | 7.15 | 3.89 | 7.1 |
min | 43 | 49 | 43 | 65 | 54 |
max | 81 | 81 | 80 | 80 | 79 |
Age onset | |||||
avg | 62.86 | 65.47 | 62.59 | 65.41 | 58.29 |
std | 7.79 | 7.09 | 7.75 | 5.23 | 8.61 |
min | 41 | 47 | 41 | 59 | 43 |
max | 78 | 78 | 76 | 75 | 71 |
Disease duration | |||||
avg | 5.33 | 1.86 | 5 | 5.75 | 10.7 |
std | 4.14 | 1.01 | 3.29 | 2.98 | 4.35 |
min | 1 | 1 | 1 | 1 | 1 |
max | 18 | 4 | 15 | 11 | 18 |
Feature Name | Definitions | Symbol |
---|---|---|
F0 (mean) | Mean of the fundamental frequency | meanF0 |
F0 (median) | Median of the fundamental frequency | medianF0 |
F0 (stdv.) | Standard deviation of the fundamental frequency | stdevF0 |
F 1–4 (mean) | Mean of the formant frequencies F1 to F4 | f1-4_mean |
F 1–4 (median) | Median of the formant frequencies F1 to F4 | f1-4_median |
Jitter (%) | Measure of the average absolute difference between consecutive periods, divided by the average period | localJitter |
Jitter (abs) | Measure of the cycle-to-cycle variation in the fundamental frequency, typically expressed in seconds | absJitter |
Jitter: RAP | Measure of the average absolute difference between a period and the average of that period and its two neighboring periods, divided by the average period length | rapJitter |
Jitter: PPQ5 | Measure of the average absolute difference between a period and the average of that period and its four closest neighboring periods, divided by the average period length | ppq5Jitter |
Jitter: DDP | Measure of the average absolute difference between consecutive differences of consecutive periods, divided by the average period length | ddpJitter |
Jitter: mean | Mean of the jitter | meanJt |
Jitter: median | Median of the jitter | medianJt |
Jitter: PCA | Two-factor principal components analysis (PCA) on jitter | jitterPCA |
Shimmer (%) | Measure of the average absolute difference between the amplitudes of consecutive periods, divided by the average amplitude, expressed as a percentage | localShimmer |
Shimmer (dB) | Measure of the average absolute base-10 logarithm of the difference between the amplitudes of consecutive periods, multiplied by 20 | dbShimmer |
Shimmer: APQ3 | Measure of the variability in the amplitude of a speech signal, as measured from the amplitude of a single period to the average amplitude of its three closest neighbors | apq3Sh |
Shimmer: APQ5 | Measure of the variability in the amplitude of a speech signal, as measured from the amplitude of a single period to the average amplitude of its five closest neighbors | apq5Sh |
Shimmer: APQ11 | Measure of the variability in the amplitude of a speech signal, as measured from the amplitude of a single period to the average amplitude of its eleven closest neighbors | apq11Sh |
Shimmer: DDA | Measure of the variability in the amplitude of a speech signal, as measured from the consecutive differences between the amplitudes of consecutive periods | ddaSh |
Shimmer: mean | Mean of the shimmer | meanSh |
Shimmer: median | Median of the shimmer | medianSh |
Shimmer: PCA | Two-factor PCA on shimmer | shimmerPCA |
HNR | Amplitude of the tonal relative to the noise components. | hnr |
MFCC 1–4 | Mel-frequency cepstrum band 1 to 4 | mfcc1-4 |
MFCC jitter 1–4 | Positive change in mel-frequency cepstrum band 1 to 4 over time | mfccJt1-4 |
pF | Formant position | pF |
F_Disp | Dispersion of formant frequency | fdisp |
avgF | Average formant | avgFormant |
Fitch_vlt | Fitch vocal tract length | fitch_vtl |
delta_F | Cumulated variation of formants | delta_f |
vtl_delta_F | Cumulated variation of vocal tract length | vtl_delta_f |
Gender | Gender information of each subject (male, female) | gender |
Model | Best Hyperparameters | Average Accuracy | Average Precision | Average Recall | Average F1 Score | Average AUC |
---|---|---|---|---|---|---|
RF | {‘max_depth’: None, ‘min_samples_split’: 2, ‘n_estimators’: 200} | 0.9448 | 0.9575 | 0.9322 | 0.9438 | 0.9873 |
SVM | {‘C’: 50, ‘gamma’: ‘scale’} | 0.9548 | 0.9914 | 0.9176 | 0.9528 | 0.9846 |
kNN | {‘n_neighbors’: 3, ‘weights’: ‘distance’} | 0.9162 | 0.9956 | 0.8364 | 0.908 | 0.9694 |
MLP | {‘activation’: ‘relu’, ‘hidden_layer_sizes’: (1000, 500, 200), ‘learning_rate’: ‘constant’, ‘max_iter’: 3000, ‘solver’: ‘adam’} | 0.9369 | 0.963 | 0.9097 | 0.9348 | 0.9785 |
Comparisons | Accuracy | Precision (PPV) | Recall (Sensitivity) | AUC | Algorithm | Authors |
---|---|---|---|---|---|---|
HY ≤ 2 vs. HY > 2 | 0.9548 | 0.9914 | 0.9176 | 0.9848 | SVM | Proposed HMLP |
0.880 | 0.889 | 0.872 | 0.934 | SVM and ANN | Suppa et al. [26] | |
0.8947 | 0.9286 | 0.9286 | - | CART | Templeton et al. [27] |
Model | Best Hyperparameters | Average Accuracy | Average Precision | Average Recall | Average F1 Score | Average AUC |
---|---|---|---|---|---|---|
RF | {‘max_depth’: None, ‘min_samples_split’: 2, ‘n_estimators’: 200} | 0.8503 | 0.8514 | 0.8503 | 0.8474 | 0.9694 |
SVM | {‘C’: 50, ‘gamma’: ‘scale’} | 0.8651 | 0.8671 | 0.8651 | 0.8604 | 0.9625 |
kNN | {‘n_neighbors’: 3, ‘weights’: ‘distance’} | 0.8611 | 0.8696 | 0.8611 | 0.8513 | 0.9469 |
MLP | {‘activation’: ‘relu’, ‘hidden_layer_sizes’: (1000, 500, 200), ‘learning_rate’: ‘constant’, ‘max_iter’: 3000, ‘solver’: ‘adam’} | 0.8662 | 0.8662 | 0.8662 | 0.8635 | 0.9584 |
Model | Best Hyperparameters | Average Accuracy | Average Precision | Average Recall | Average F1 Score | Average AUC |
---|---|---|---|---|---|---|
RF | {‘max_depth’: None, ‘min_samples_split’: 2, ‘n_estimators’: 200} | 0.8868 | 0.8887 | 0.8868 | 0.886 | 0.9764 |
SVM | {‘C’: 50, ‘gamma’: ‘scale’} | 0.8764 | 0.8819 | 0.8764 | 0.8739 | 0.9603 |
kNN | {‘n_neighbors’: 3, ‘weights’: ‘distance’} | 0.8752 | 0.8898 | 0.8752 | 0.8695 | 0.9533 |
MLP | {‘activation’: ‘relu’, ‘hidden_layer_sizes’: (1000, 500, 200), ‘learning_rate’: ‘constant’, ‘max_iter’: 3000, ‘solver’: ‘adam’} | 0.8948 | 0.8964 | 0.8948 | 0.8933 | 0.9602 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mondol, S.I.M.M.R.; Kim, R.; Lee, S. Hybrid Machine Learning Framework for Multistage Parkinson’s Disease Classification Using Acoustic Features of Sustained Korean Vowels. Bioengineering 2023, 10, 984. https://doi.org/10.3390/bioengineering10080984
Mondol SIMMR, Kim R, Lee S. Hybrid Machine Learning Framework for Multistage Parkinson’s Disease Classification Using Acoustic Features of Sustained Korean Vowels. Bioengineering. 2023; 10(8):984. https://doi.org/10.3390/bioengineering10080984
Chicago/Turabian StyleMondol, S. I. M. M. Raton, Ryul Kim, and Sangmin Lee. 2023. "Hybrid Machine Learning Framework for Multistage Parkinson’s Disease Classification Using Acoustic Features of Sustained Korean Vowels" Bioengineering 10, no. 8: 984. https://doi.org/10.3390/bioengineering10080984
APA StyleMondol, S. I. M. M. R., Kim, R., & Lee, S. (2023). Hybrid Machine Learning Framework for Multistage Parkinson’s Disease Classification Using Acoustic Features of Sustained Korean Vowels. Bioengineering, 10(8), 984. https://doi.org/10.3390/bioengineering10080984