Speech Enhancement for Hearing Impaired Based on Bandpass Filters and a Compound Deep Denoising Autoencoder
Abstract
1. Introduction
2. Speech Perception and Hearing Loss
3. Architecture of the Proposed System
3.1. Bandpass Filter
3.2. Compound DDAEs (C-DDAEs)
- DDAE-1: 128 units in each layer. The 513-dimensional magnitude spectrum serves as both the input and the target.
- DDAE-2: 512 units in each layer. Three frames of spectra form the input; the target is the spectrum of a single frame.
- DDAE-3: Three hidden layers with 1024 units each. Five frames of spectra form the input.
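The dimensions of the three branches above can be sketched as a toy forward pass. The number of hidden layers in DDAE-1 and DDAE-2, the ReLU activations, and the random weights are assumptions for illustration only, not the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # He-style random weights for one fully connected layer
    return rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)

def ddae_forward(x, sizes):
    # Pass a feature vector through layers sized per `sizes`, ReLU in between
    h = x
    for i in range(len(sizes) - 1):
        h = h @ dense(sizes[i], sizes[i + 1])
        if i < len(sizes) - 2:
            h = np.maximum(h, 0.0)
    return h

D = 513                                                         # magnitude-spectrum dimension
y1 = ddae_forward(rng.random(D),     [D, 128, 128, D])          # DDAE-1: single frame in/out
y2 = ddae_forward(rng.random(3 * D), [3 * D, 512, 512, D])      # DDAE-2: 3 frames in, 1 out
y3 = ddae_forward(rng.random(5 * D), [5 * D, 1024, 1024, 1024, D])  # DDAE-3: 5 frames in, 1 out
print(y1.shape, y2.shape, y3.shape)
```

Whatever the context width, each branch maps its stacked input frames back to one 513-dimensional enhanced spectrum.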
4. Experiments and Evaluation
4.1. Experimental Setup
4.2. Spectrogram Comparison
5. Speech Quality and Intelligibility Evaluation
5.1. Perceptual Evaluation of Speech Quality (PESQ)
5.2. Hearing Aid Speech Quality Index (HASQI)
5.3. Hearing Aid Speech Perception Index (HASPI)
6. Results and Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Channel No. | Lower Cutoff Frequency (Hz) | Higher Cutoff Frequency (Hz) |
|---|---|---|
| 1 | 20 | 308 |
| 2 | 308 | 662 |
| 3 | 662 | 1157 |
| 4 | 1157 | 1832 |
| 5 | 1832 | 3321 |
| 6 | 3321 | 4772 |
| 7 | 4772 | 6741 |
| 8 | 6741 | 8000 |
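As a rough illustration of the eight-channel split in the table, a signal can be carved into these bands with an ideal (brick-wall) FFT mask. The 16 kHz sampling rate and the ideal filters are assumptions for this sketch; the paper's actual bandpass filters are presumably not brick-wall:

```python
import numpy as np

# Band edges (Hz) from the table above
EDGES = [20, 308, 662, 1157, 1832, 3321, 4772, 6741, 8000]

def split_bands(x, fs=16000):
    # Split x into 8 band signals by zeroing FFT bins outside each band
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    bands = []
    for lo, hi in zip(EDGES[:-1], EDGES[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(X * mask, n=len(x)))
    return np.stack(bands)

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)      # 1 kHz tone: lies in channel 3 (662-1157 Hz)
bands = split_bands(x, fs)
energies = (bands ** 2).sum(axis=1)
print(int(np.argmax(energies)) + 1)   # channel with the most energy
```

Because the bands tile 20-8000 Hz without overlap, summing the eight band signals recovers the original (minus any sub-20 Hz content).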
Hearing thresholds in dB HL at each frequency (kHz):

| Audiogram | Description | 0.25 | 0.5 | 1 | 2 | 4 | 8 |
|---|---|---|---|---|---|---|---|
| 1 | Flat loss | 60 | 60 | 60 | 60 | 60 | 60 |
| 2 | Reverse-tilt loss | 70 | 70 | 70 | 50 | 10 | 10 |
| 3 | Moderately tilted high-frequency loss | 40 | 40 | 50 | 60 | 65 | 65 |
| 4 | Steeply sloping high-frequency loss with normal low-frequency thresholds | 0 | 0 | 0 | 60 | 80 | 90 |
| 5 | Steeply sloping high-frequency loss with mild low-frequency loss | 0 | 15 | 30 | 60 | 80 | 85 |
| 6 | Mild-to-moderate tilted high-frequency loss | 14 | 14 | 11 | 14 | 24 | 39 |
| 7 | | 24 | 24 | 25 | 31 | 46 | 60 |
PESQ scores by noise type, method, and SNR level:

| Noise | Method | 0 dB | 5 dB | 10 dB | 15 dB |
|---|---|---|---|---|---|
| White | Noisy | 1.49 | 1.62 | 1.91 | 2.12 |
| | DDAE-3 | 2.03 | 2.09 | 2.18 | 2.34 |
| | DDAE-5 | 2.30 | 2.16 | 2.38 | 2.50 |
| | HC-DDAEs | 2.71 | 2.78 | 2.83 | 2.88 |
| Pink | Noisy | 1.54 | 1.71 | 1.88 | 2.12 |
| | DDAE-3 | 2.03 | 2.11 | 2.19 | 2.41 |
| | DDAE-5 | 2.11 | 2.44 | 2.51 | 2.72 |
| | HC-DDAEs | 2.22 | 2.76 | 2.76 | 2.98 |
| Babble | Noisy | 1.49 | 1.72 | 1.80 | 1.91 |
| | DDAE-3 | 2.01 | 2.09 | 2.33 | 2.29 |
| | DDAE-5 | 2.04 | 2.11 | 2.47 | 2.66 |
| | HC-DDAEs | 2.24 | 2.29 | 2.53 | 2.67 |
| Train | Noisy | 1.51 | 1.54 | 1.76 | 1.91 |
| | DDAE-3 | 2.15 | 2.27 | 2.11 | 2.11 |
| | DDAE-5 | 2.18 | 2.26 | 2.26 | 2.46 |
| | HC-DDAEs | 2.27 | 2.41 | 2.48 | 2.69 |
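A quick aggregation of the table's values shows the overall trend; the per-condition scores are taken verbatim from the table, and only the averaging is added here:

```python
# Mean PESQ per method over all four noise types and all SNR levels
pesq = {
    "Noisy":    [1.49, 1.62, 1.91, 2.12, 1.54, 1.71, 1.88, 2.12,
                 1.49, 1.72, 1.80, 1.91, 1.51, 1.54, 1.76, 1.91],
    "DDAE-3":   [2.03, 2.09, 2.18, 2.34, 2.03, 2.11, 2.19, 2.41,
                 2.01, 2.09, 2.33, 2.29, 2.15, 2.27, 2.11, 2.11],
    "DDAE-5":   [2.30, 2.16, 2.38, 2.50, 2.11, 2.44, 2.51, 2.72,
                 2.04, 2.11, 2.47, 2.66, 2.18, 2.26, 2.26, 2.46],
    "HC-DDAEs": [2.71, 2.78, 2.83, 2.88, 2.22, 2.76, 2.76, 2.98,
                 2.24, 2.29, 2.53, 2.67, 2.27, 2.41, 2.48, 2.69],
}
means = {method: sum(v) / len(v) for method, v in pesq.items()}
for method, score in means.items():
    print(f"{method}: {score:.2f}")
```

Averaged this way, every enhancement method improves on the noisy baseline, and the compound model (HC-DDAEs) scores highest, with a mean gain of roughly 0.84 PESQ over unprocessed noisy speech.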
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
AL-Taai, R.Y.L.; Wu, X. Speech Enhancement for Hearing Impaired Based on Bandpass Filters and a Compound Deep Denoising Autoencoder. Symmetry 2021, 13, 1310. https://doi.org/10.3390/sym13081310