Auditory Models for Formant Frequency Discrimination of Vowel Sounds
Abstract
1. Introduction
2. Materials and Methods
2.1. Stimuli
2.2. Threshold Data Sets
2.3. Auditory Models
2.4. Modeling Formant Frequency Discrimination
2.4.1. Computation of the Change in the Excitation/Loudness Patterns
2.4.2. Selection of Auditory Metrics for Formant Frequency Discrimination
- a. Peak-to-valley contrast
- b. 4-ERBN area
- c. 1-peak-1-valley area
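The metric names above refer to the ERBN scale. As a minimal sketch, the conversions below use the widely cited Glasberg and Moore (1990) formulae for the equivalent rectangular bandwidth and the ERBN-number scale. The helper `four_erb_window_hz` is hypothetical: it assumes that a 4-ERBN region is centered on the formant frequency, which may differ from the integration window actually used in the study.

```python
import math


def erb_bandwidth_hz(freq_hz):
    """ERB_N (in Hz) of the auditory filter centered at freq_hz."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def erb_number(freq_hz):
    """ERB_N-number (Cams) corresponding to freq_hz."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_number_to_hz(cams):
    """Inverse of erb_number: frequency in Hz for a given ERB_N-number."""
    return (10.0 ** (cams / 21.4) - 1.0) / 4.37 * 1000.0


def four_erb_window_hz(center_hz):
    """Hypothetical helper (an assumption, not the authors' definition):
    edges in Hz of a window spanning 2 ERB_N on either side of center_hz."""
    center_cams = erb_number(center_hz)
    return erb_number_to_hz(center_cams - 2.0), erb_number_to_hz(center_cams + 2.0)


if __name__ == "__main__":
    low, high = four_erb_window_hz(1000.0)
    print(f"ERB_N at 1 kHz: {erb_bandwidth_hz(1000.0):.1f} Hz")
    print(f"4-ERB_N window around 1 kHz: {low:.0f} to {high:.0f} Hz")
```

Working on the ERBN-number scale rather than in Hz keeps the window width perceptually comparable across low and high formant frequencies, which is presumably why the area metrics are expressed in ERBN units.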
2.4.3. Auditory Simulation Model to Predict Thresholds of Formant Frequency Discrimination
3. Results
4. Discussion
4.1. Auditory Metrics for Vowel Formant Frequency Discrimination
4.2. Modeling Vowel Formant Frequency Discrimination in Ordinary Listening Conditions
4.3. Applications of Auditory Models for Vowel Formant Discrimination
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Excitation Pattern: 4-ERB (dB) | Excitation Pattern: 1p1v (dB) | Excitation Pattern: PtoV (dB) | Loudness Pattern: 4-ERB (sones) | Loudness Pattern: 1p1v (sones) | Loudness Pattern: PtoV (sone/ERB) |
---|---|---|---|---|---|
1.899 | 3.311 | 1.412 | 0.308 | 0.483 | 0.241 |
Pattern | Metric | Speech level 70 dB: Corr. | Speech level 70 dB: 95% CI | Speech level 85 dB: Corr. | Speech level 85 dB: 95% CI | Speech level 100 dB: Corr. | Speech level 100 dB: 95% CI |
---|---|---|---|---|---|---|---|
Excitation Pattern | 4-ERB | 0.95 | 0.90–1.00 | 0.95 | 0.79–0.99 | 0.92 | 0.48–0.98 |
Excitation Pattern | 1p1v | 0.97 | 0.82–0.99 | 0.98 | 0.62–0.99 | 0.98 | 0.68–0.99 |
Excitation Pattern | PtoV | 0.96 | 0.62–0.99 | 0.94 | 0.62–0.99 | 0.97 | 0.63–0.99 |
Loudness Pattern | 4-ERB | 0.98 | 0.87–1.00 | 0.95 | 0.76–0.99 | 0.94 | 0.68–0.99 |
Loudness Pattern | 1p1v | 0.97 | 0.61–0.99 | 0.96 | 0.66–0.99 | 0.91 | 0.46–0.98 |
Loudness Pattern | PtoV | 0.96 | 0.84–0.99 | 0.94 | 0.62–0.99 | 0.96 | 0.41–0.98 |
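For readers who want to compute comparable statistics on their own threshold data, the sketch below shows one common way to obtain a Pearson correlation and an approximate 95% confidence interval via the Fisher z-transformation. This is an assumption for illustration only: the method used to derive the intervals in the table above is not stated here and may differ (for example, a bootstrap), and the function names and the sample size in the usage lines are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r using the Fisher z-transformation.

    Requires n > 3 and |r| < 1 (math.atanh is undefined at +/-1).
    z_crit = 1.96 gives a two-sided 95% interval.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    # Hypothetical inputs: r = 0.95 from n = 10 paired values.
    lower, upper = fisher_ci(0.95, 10)
    print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Because the transformation is applied on the z scale, the resulting interval is asymmetric around r, which matches the general shape of the intervals reported in the table.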
Pattern | Metric | Speech level 70 dB SPL | Speech level 85 dB SPL | Speech level 100 dB SPL |
---|---|---|---|---|
Excitation Pattern | 4-ERB | 12 | 8.3 | 11.5 |
Excitation Pattern | 1p1v | 17.8 | 34.5 | 30 |
Excitation Pattern | PtoV | 16.3 | 10.6 | 10.9 |
Loudness Pattern | 4-ERB | 20.1 | 16.5 | 20.3 |
Loudness Pattern | 1p1v | 53.1 | 22.1 | 16.7 |
Loudness Pattern | PtoV | 24.3 | 14.1 | 16.8 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xu, C.; Liu, C. Auditory Models for Formant Frequency Discrimination of Vowel Sounds. Information 2023, 14, 429. https://doi.org/10.3390/info14080429