Acoustic and Temporal Analysis of Speech for Schizophrenia Management
Abstract
1. Introduction
1.1. Schizophrenia and Speech
1.2. Features Derived from Acoustic–Phonetic Analysis
2. Methods
2.1. Data Collection
- Task 1: sustaining each of the five Greek vowels (/a/, /o/, /u/, /i/, /e/) for at least five seconds.
- Task 2: reading a standardized list of thirty words from a predefined script, constructed to achieve high phonetic diversity.
- Task 3: participating in a non-instructed interview in which participants were recorded while speaking spontaneously.
These recordings were used separately to extract the acoustic features that characterize the speech signal (all described in Section 2.2).
2.2. Feature Extraction
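The four features reported per vowel in the results tables (F0, jitter, shimmer, HNR) can be illustrated with a minimal Python sketch. It assumes the widely used "local" definitions of jitter and shimmer (mean absolute difference between consecutive cycles divided by the mean, as in Praat) and an energy-ratio definition of HNR. The extraction pipeline actually used in this study is not reproduced here; the function names and the inputs (per-cycle periods and peak amplitudes that have already been detected) are hypothetical.

```python
import math
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two glottal cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def jitter_percent(periods_s):
    """Local jitter: perturbation of the cycle durations (seconds)."""
    return local_perturbation(periods_s)


def shimmer_percent(peak_amplitudes):
    """Local shimmer: perturbation of the per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)


def mean_f0_hz(periods_s):
    """Mean fundamental frequency as the average of per-cycle frequencies."""
    return mean(1.0 / p for p in periods_s)


def hnr_db(harmonic_energy, noise_energy):
    """Harmonics-to-noise ratio in dB from the two energy estimates."""
    return 10.0 * math.log10(harmonic_energy / noise_energy)


if __name__ == "__main__":
    # Hypothetical cycle measurements, for illustration only (not study data).
    periods = [0.0060, 0.0062, 0.0059, 0.0061, 0.0060]
    amplitudes = [0.80, 0.74, 0.83, 0.78, 0.81]
    print(f"F0      = {mean_f0_hz(periods):.1f} Hz")
    print(f"jitter  = {jitter_percent(periods):.2f} %")
    print(f"shimmer = {shimmer_percent(amplitudes):.2f} %")
    print(f"HNR     = {hnr_db(10.0, 1.0):.1f} dB")
```

Both perturbation measures share one helper because, under the assumed definitions, they differ only in the quantity being tracked from cycle to cycle (duration for jitter, amplitude for shimmer).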
3. Results and Discussion
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Vowel (SCH) | F0 mean (STD), Hz | Jitter mean (STD), % | Shimmer mean (STD), % | HNR mean (STD), dB |
---|---|---|---|---|
a | 164.7 (36.7) | 2.3 (0.9) | 15.9 (4.2) | 7.2 (3.4) |
e | 181.5 (53.0) | 2.7 (1.4) | 16.1 (5.1) | 9.1 (2.9) |
i | 184.2 (50.0) | 2.4 (1.0) | 14.0 (5.0) | 13.0 (2.2) |
o | 180.3 (38.1) | 2.7 (1.5) | 15.1 (4.9) | 10.2 (3.6) |
u | 181.0 (48.0) | 2.8 (1.8) | 15.3 (4.3) | 11.1 (4.0) |
Vowel (HCG) | F0 mean (STD), Hz | Jitter mean (STD), % | Shimmer mean (STD), % | HNR mean (STD), dB |
---|---|---|---|---|
a | 151.6 (47.2) | 1.0 (0.4) | 12.2 (6.0) | 14.9 (4.7) |
e | 162.3 (49.9) | 1.0 (0.4) | 6.0 (2.6) | 16.1 (4.9) |
i | 157.7 (53.8) | 1.0 (0.3) | 5.3 (1.9) | 18.2 (3.8) |
o | 170.4 (55.6) | 0.9 (0.5) | 5.2 (3.0) | 18.6 (4.7) |
u | 172.2 (69.2) | 2.3 (1.6) | 6.2 (3.6) | 20.3 (6.1) |
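The two per-vowel tables can be cross-checked against the group means in the summary table that follows. The sketch below averages the five per-vowel means of each table and matches the result to the closest summary column. It assumes that the summary means should lie close to the average of the per-vowel means; the dictionary names and the matching rule (smallest summed absolute difference) are illustrative choices, not part of the original analysis.

```python
from statistics import mean

# Per-vowel means copied from the two tables above (vowel order: a, e, i, o, u).
PER_VOWEL = {
    "first_table": {
        "F0": [164.7, 181.5, 184.2, 180.3, 181.0],
        "Jitter": [2.3, 2.7, 2.4, 2.7, 2.8],
        "Shimmer": [15.9, 16.1, 14.0, 15.1, 15.3],
        "HNR": [7.2, 9.1, 13.0, 10.2, 11.1],
    },
    "second_table": {
        "F0": [151.6, 162.3, 157.7, 170.4, 172.2],
        "Jitter": [1.0, 1.0, 1.0, 0.9, 2.3],
        "Shimmer": [12.2, 6.0, 5.3, 5.2, 6.2],
        "HNR": [14.9, 16.1, 18.2, 18.6, 20.3],
    },
}

# Group means copied from the summary table.
SUMMARY = {
    "SCH": {"F0": 178.1, "Jitter": 2.5, "Shimmer": 15.3, "HNR": 10.0},
    "HCG": {"F0": 162.8, "Jitter": 1.2, "Shimmer": 6.9, "HNR": 17.6},
}


def averaged(table):
    """Average the five per-vowel means of every feature in one table."""
    return {feature: mean(values) for feature, values in PER_VOWEL[table].items()}


def closest_group(table):
    """Return the summary column with the smallest summed absolute difference."""
    avg = averaged(table)

    def distance(group):
        return sum(abs(avg[f] - SUMMARY[group][f]) for f in avg)

    return min(SUMMARY, key=distance)


if __name__ == "__main__":
    for table in PER_VOWEL:
        print(table, "->", closest_group(table), averaged(table))
```

Under this check the first table matches the SCH column and the second matches the HCG column, with every averaged feature within 0.3 units of the reported group mean. Small residual differences are expected if the summary was computed over individual recordings rather than over the per-vowel means.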
Parameter | SCH mean (STD) | HCG mean (STD) | t-Test (SCH, HCG) p-value |
---|---|---|---|
F0 (Hz) | 178.1 (44.4) | 162.8 (54.3) | 0.4410 |
Jitter (%) | 2.5 (1.3) | 1.2 (0.9) | 0.0049 |
Shimmer (%) | 15.3 (4.5) | 6.9 (4.4) | 0.0028 |
HNR (dB) | 10.0 (3.7) | 17.6 (5.1) | 0.0023 |
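The p-values above come from a t-test between the SCH and HCG groups. Neither the variant of the test (Student's or Welch's) nor the group sizes are stated here, so the reported p-values cannot be recomputed from the means and standard deviations alone. The sketch below shows Welch's unequal-variance statistic as one plausible choice; the function name and the sample values are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical per-speaker jitter values (%), for illustration only.
    sch_jitter = [2.1, 3.0, 1.8, 2.9, 2.6]
    hcg_jitter = [1.0, 1.3, 0.9, 1.5, 1.2]
    t, df = welch_t(sch_jitter, hcg_jitter)
    print(f"t = {t:.3f}, df = {df:.2f}")
```

A p-value would then be read from the t distribution with `df` degrees of freedom, for example with `scipy.stats.ttest_ind(a, b, equal_var=False)`, which performs the same test.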
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mouratai, A.; Dimopoulos, N.; Dimitriadis, A.; Koudounas, P.; Glotsos, D.; Pinto-Coelho, L. Acoustic and Temporal Analysis of Speech for Schizophrenia Management. Eng. Proc. 2023, 50, 13. https://doi.org/10.3390/engproc2023050013