Computational Methods and Engineering Solutions to Voice II

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (31 March 2021) | Viewed by 48461

Special Issue Editor


Prof. Dr. Michael Döllinger
Guest Editor
Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Waldstrasse 1, 91054 Erlangen, Germany
Interests: vision system; voice system; voice production; artificial intelligence in diagnostics

Special Issue Information

Dear Colleagues,

Today, voice and speech research is not limited to acoustic, medical, and clinical studies and investigations. Approaches from different fields such as mathematics, computer science, artificial intelligence, fluid dynamics, mechatronics, and biology are applied to gain new insights into and a better understanding of the physiological and pathological laryngeal processes within voice and speech production. Building on fruitful interdisciplinary research groups, many new approaches have been suggested during the last decade. These include, for example, highly advanced numerical models (FEM/FVM models), as well as tissue engineering and machine learning-based data analysis approaches. The purpose of this Special Issue is to provide an overview of the newest and most innovative techniques applied in our field at the beginning of a new decade. Young colleagues are especially encouraged to submit their work. Authors are invited to submit work related to the following topics, applying mathematical, engineering, computer science, and biological methods, within the field of voice and speech production.

Prof. Dr. Michael Döllinger
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computational modeling 
  • Experimental modeling 
  • Computational fluid dynamics 
  • Fluid–structure–acoustic interactions 
  • Image processing 
  • Advanced data analysis 
  • Machine learning 
  • New technologies 
  • Tissue engineering 
  • Molecular biology

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (17 papers)


Editorial

3 pages, 177 KiB  
Editorial
Special Issue on Computational Methods and Engineering Solutions to Voice II
by Michael Döllinger
Appl. Sci. 2021, 11(20), 9459; https://doi.org/10.3390/app11209459 - 12 Oct 2021
Viewed by 1034
Abstract
Today, research into voice and speech is not only limited to acoustic, medical, and clinical studies and investigations [...]
(This article belongs to the Special Issue Computational Methods and Engineering Solutions to Voice II)

Research

16 pages, 789 KiB  
Article
Multivariate Analysis of Vocal Fold Vibrations on Various Voice Disorders Using High-Speed Digital Imaging
by Akihito Yamauchi, Hiroshi Imagawa, Hisayuki Yokonishi, Ken-Ichi Sakakibara and Niro Tayama
Appl. Sci. 2021, 11(14), 6284; https://doi.org/10.3390/app11146284 - 7 Jul 2021
Cited by 8 | Viewed by 2652
Abstract
Although many quantitative parameters have been devised to describe abnormalities in vocal fold vibration, little is known about the relative priority of these parameters. We conducted a prospective study using high-speed digital imaging to identify disease-specific key parameters (KPs) that characterize the vocal fold vibrations of individual voice disorders. High-speed digital imaging of sustained phonation at a comfortable pitch and loudness was recorded from 304 patients with various voice disorders and 46 normal speakers, and parameters from visual-perceptual rating, laryngotopography, digital kymography, and glottal area waveform were calculated. Multivariate analysis was then applied to these parameters to identify the KPs that explain each voice disorder in comparison to normal subjects. Four key parameters were statistically significant for all laryngeal diseases. However, the coefficient of determination (R²) was very low (0.29). Vocal fold paralysis (8 KPs, R² = 0.76), sulcus vocalis (4 KPs, R² = 0.74), vocal fold scarring (1 KP, R² = 0.68), vocal fold atrophy (6 KPs, R² = 0.53), and laryngeal cancer (1 KP, R² = 0.52) showed moderate-to-high R² values. The results identified different KPs for each voice disorder; thus, disease-specific analysis is a reasonable approach.
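As a reminder of what the reported coefficient of determination measures, the following minimal Python sketch computes it in its generic textbook form (one minus the ratio of residual to total sum of squares). It is not the authors' multivariate analysis; the function name `r_squared` and the toy numbers are assumptions introduced here purely for illustration.

```python
def r_squared(observed, predicted):
    """Generic coefficient of determination: 1 - SS_res / SS_tot.

    Illustrative only; the study's actual model and data are not reproduced.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: a perfect fit, and a "model" that only predicts the mean.
observed = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(observed, observed)
mean_only = r_squared(observed, [2.5, 2.5, 2.5, 2.5])
```

A value near 1 means the selected parameters account for most of the variation, while a value such as 0.29 means most of the variation is left unexplained.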

14 pages, 4932 KiB  
Article
Development of Parameters towards Voice Bifurcations
by Takeshi Ikuma, Andrew J. McWhorter, Lacey Adkins and Melda Kunduk
Appl. Sci. 2021, 11(12), 5469; https://doi.org/10.3390/app11125469 - 12 Jun 2021
Cited by 4 | Viewed by 1868
Abstract
Pathological vocal folds are known to exhibit multiple oscillation patterns, depending on tissue imbalance, subglottal pressure level, and other factors. These include mid-phonation changes due to bifurcations in the underlying voice source system. Knowing when oscillation patterns change is helpful in the assessment of voice disorders, and this knowledge could be transformed into useful objective measures. Mid-phonation bifurcations can occur in rapid succession; hence, fast classification of the oscillation pattern is critical to minimize the averaging of data across bifurcations. This paper proposes frequency-ratio-based short-term measures, named harmonic disturbance factor (HDF) and biphonic index (BI), for detecting such bifurcations. To evaluate HDF and BI, a frequency selection algorithm for glottal source signals is devised, and its efficacy is demonstrated with the glottal area waveforms of four cases representing a wide range of oscillatory behaviors. The HDF and BI exhibit clear transitions when the voice bifurcations are apparent in the spectrograms. The outcomes of this proof-of-concept experiment warrant a larger-scale study to formalize the parameters of the frequency selection algorithm.

27 pages, 16276 KiB  
Article
Vibrations of Nonlinear Elastic Structure Excited by Compressible Flow
by Monika Balázsová, Miloslav Feistauer, Jaromír Horáček and Adam Kosík
Appl. Sci. 2021, 11(11), 4748; https://doi.org/10.3390/app11114748 - 21 May 2021
Cited by 3 | Viewed by 1541
Abstract
This study deals with the development of an accurate, efficient, and robust method for the numerical solution of the interaction of compressible flow and nonlinear dynamic elasticity. This problem requires the reliable solution of flow in time-dependent domains and the solution of deformations of elastic bodies formed by several materials with complicated, time-dependent geometry. In this paper, the fluid–structure interaction (FSI) problem is solved numerically by the space-time discontinuous Galerkin method (STDGM). For the compressible flow, we use the compressible Navier–Stokes equations in the arbitrary Lagrangian–Eulerian (ALE) formulation. The elasticity problem uses the non-stationary formulation of the dynamic system with the St. Venant–Kirchhoff and neo-Hookean models. The STDGM for nonlinear elasticity is tested on the Hron–Turek benchmark. The main novelty of the study is the numerical simulation of nonlinear vocal fold vibrations excited by compressible airflow coming from the trachea into a simplified model of the vocal tract. The computations show that a nonlinear elasticity model of the vocal folds yields substantially higher accuracy of the computed vocal fold deformation than the linear elasticity model. Moreover, the numerical simulations show that the differences between the two nonlinear material models considered are very small.

25 pages, 6616 KiB  
Article
Effects of Vertical Glottal Duct Length on Intraglottal Pressures in the Convergent Glottis
by Sheng Li, Ronald C. Scherer and Mingxi Wan
Appl. Sci. 2021, 11(10), 4535; https://doi.org/10.3390/app11104535 - 16 May 2021
Cited by 7 | Viewed by 2091
Abstract
In a previous study, the vertical glottal duct length was examined for its influence on intraglottal pressures and other aerodynamic parameters in the uniform glottis [J Voice 32, 8–22 (2018)]. This study extends that work to convergent glottal angles, which describe the shape of the glottis during the glottal opening phase of vocal fold vibration. The computational fluid dynamics code ANSYS Fluent 6.3 was used to obtain the pressure distributions and other aerodynamic parameters for laminar, incompressible, two-dimensional flow in a static vocal fold model. Four typical vertical glottal duct lengths (0.108, 0.308, 0.608, 0.908 cm) were selected for three minimal diameters (0.01, 0.04, 0.16 cm), three transglottal pressures (500, 1000, 1500 Pa), and three convergent glottal angles (−5°, −10°, −20°). The results suggest that a longer vertical glottal duct length increases the intraglottal pressures, decreases the glottal entrance loss coefficient, increases the transglottal pressure coefficient, and causes a lower gradient of both the intraglottal flow velocity and the wall shear stress along the glottal wall (especially for low flows and small glottal minimal diameters), while having little effect on the exit pressure coefficient and volume flow. The vertical glottal duct length in the convergent glottis has important effects on phonation and should be well specified when building computational and physical models of the vocal folds.

17 pages, 2698 KiB  
Article
Objective Assessment of Porcine Voice Acoustics for Laryngeal Surgical Modeling
by Patrick Schlegel, Kirsten Wong, Mamdouh Aker, Yazeed Alhiyari and Jennifer Long
Appl. Sci. 2021, 11(10), 4489; https://doi.org/10.3390/app11104489 - 14 May 2021
Cited by 4 | Viewed by 2575
Abstract
Pigs have become important animal models in voice research. Several objective parameters exist to characterize the pig voice, but it is not clear which of them are sensitive to impaired voice quality after laryngeal injury or surgery. To conduct meaningful voice research in pigs, it is critical to have standard functional voice outcome measures that can distinguish between normal and impaired voices. For this reason, we investigated 17 acoustic parameters before and early after surgery in three Yucatan mini pigs. Four parameters showed consistent changes between pre- and post-surgery recordings, mostly related to decreased spectral energy in higher frequencies after surgery. We recommend two of these, the 50% spectral energy quartile (Q50) and Flux, for objective functional voice assessment of pigs undergoing laryngeal surgery. The long-term goal of this process is to enable quantitative voice outcome tracking of laryngeal surgical interventions in porcine models.
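To make the Q50 measure more concrete, the sketch below computes a spectral energy quantile from a precomputed power spectrum. It assumes that Q50 is the frequency below which 50% of the total spectral energy lies; the abstract does not give the exact definition, so the paper's may differ. The function name `spectral_energy_quantile` and the toy spectra are hypothetical.

```python
def spectral_energy_quantile(freqs_hz, power, fraction=0.5):
    """Return the lowest frequency at which the cumulative spectral
    energy reaches `fraction` of the total (assumed reading of Q50).

    `freqs_hz` must be in ascending order and aligned with `power`.
    """
    if len(freqs_hz) != len(power) or not power:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum carries no energy")
    cumulative = 0.0
    for freq, p in zip(freqs_hz, power):
        cumulative += p
        if cumulative >= fraction * total:
            return freq
    return freqs_hz[-1]


# Hypothetical toy spectra (arbitrary units), not data from the study.
freqs = [100, 200, 300, 400]
q50_flat = spectral_energy_quantile(freqs, [1, 1, 1, 1])       # energy spread evenly
q50_low_heavy = spectral_energy_quantile(freqs, [4, 1, 1, 1])  # energy concentrated at low frequencies
```

Under this reading, a recording that loses energy in the higher frequencies yields a lower Q50, which matches the direction of change described for the post-surgery recordings.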

13 pages, 1434 KiB  
Article
Classification of Vocal Fatigue Using sEMG: Data Imbalance, Normalization, and the Role of Vocal Fatigue Index Scores
by Yixiang Gao, Maria Dietrich and Guilherme N. DeSouza
Appl. Sci. 2021, 11(10), 4335; https://doi.org/10.3390/app11104335 - 11 May 2021
Cited by 11 | Viewed by 2891
Abstract
Our previous studies demonstrated that it is possible to classify both simulated pressed and actual vocally fatigued voice productions versus vocally healthy productions through pattern recognition of sEMG signals obtained from subjects’ anterior neck. In these studies, the commonly accepted Vocal Fatigue Index factor 1 (VFI-1) was used for the ground-truth labeling of normal versus vocally fatigued voice productions. In recent experiments, other factors with potential effects on classification were also studied, such as sEMG signal normalization and data imbalance, i.e., the large difference between the number of vocally healthy subjects and the number of those with vocal fatigue. In this paper, we therefore present a much-improved classification method derived from an extensive study of the effects of such extrinsic factors on the classification of vocal fatigue. The study was performed on a large number of sEMG signals from 88 vocally healthy and fatigued subjects, including student teachers and teachers, and it led to important conclusions on how to optimize a machine learning approach for the early detection of vocal fatigue.

24 pages, 2219 KiB  
Article
Acoustic Identification of the Voicing Boundary during Intervocalic Offsets and Onsets Based on Vocal Fold Vibratory Measures
by Jennifer M. Vojtech, Dante D. Cilento, Austin T. Luong, Jacob P. Noordzij, Jr., Manuel Diaz-Cadiz, Matti D. Groll, Daniel P. Buckley, Victoria S. McKenna, J. Pieter Noordzij and Cara E. Stepp
Appl. Sci. 2021, 11(9), 3816; https://doi.org/10.3390/app11093816 - 23 Apr 2021
Cited by 5 | Viewed by 3041
Abstract
Methods for automating relative fundamental frequency (RFF), an acoustic estimate of laryngeal tension, rely on manual identification of voiced/unvoiced boundaries from acoustic signals. This study determined the effect of incorporating features derived from vocal fold vibratory transitions for acoustic boundary detection. Simultaneous microphone and flexible nasendoscope recordings were collected from adults with typical voices (N = 69) and with voices characterized by excessive laryngeal tension (N = 53) producing voiced–unvoiced–voiced utterances. Acoustic features that coincided with vocal fold vibratory transitions were identified and incorporated into an automated RFF algorithm (“aRFF-APH”). Voiced/unvoiced boundary detection accuracy was compared between the aRFF-APH algorithm, a recently published version of the automated RFF algorithm (“aRFF-AP”), and gold-standard, manual RFF estimation. Chi-square tests were performed to characterize differences in boundary cycle identification accuracy among the three RFF estimation methods. Voiced/unvoiced boundary detection accuracy significantly differed by RFF estimation method for voicing offsets and onsets. Of 7721 productions, 76.0% of boundaries were accurately identified via the aRFF-APH algorithm, compared to 70.3% with the aRFF-AP algorithm and 20.4% with manual estimation. Incorporating acoustic features that corresponded with voiced/unvoiced boundaries led to improvements in boundary detection accuracy that surpassed the gold-standard method for calculating RFF.

11 pages, 552 KiB  
Article
Using a Lossy Electrical Transmission Line Model for Optimizing Straw Phonation Configurations
by Jonah Rosenthal, Nicole Haderlein, Matthew Silverman, Austin Scholp and Jack Jiang
Appl. Sci. 2021, 11(7), 3258; https://doi.org/10.3390/app11073258 - 5 Apr 2021
Cited by 1 | Viewed by 2363
Abstract
Straw phonation has a long history as a successful vocal therapy technique. However, not much is known about the mechanics of phonation with a straw or about the best combination of phoneme and straw dimensions to use. A significant limitation of research thus far is the complexity of existing models and computation techniques for determining acoustic and aerodynamic values such as impedance. In this study, a new electrical circuit-based model of the vocal tract as a transmission line is evaluated and compared to established impedance calculation methods. Results indicate that the model is not yet complete, so several adjustments are suggested and discussed. In addition, straw phonation configurations are examined using previously developed models to determine which ones maximize impedance and power.

15 pages, 24584 KiB  
Article
Human Laryngeal Mucus from the Vocal Folds: Rheological Characterization by Particle Tracking Microrheology and Oscillatory Shear Rheology
by Gregor Peters, Olaf Wendler, David Böhringer, Antoniu-Oreste Gostian, Sarina K. Müller, Herbert Canziani, Nicolas Hesse, Marion Semmler, David A. Berry, Stefan Kniesburges, Wolfgang Peukert and Michael Döllinger
Appl. Sci. 2021, 11(7), 3011; https://doi.org/10.3390/app11073011 - 27 Mar 2021
Cited by 9 | Viewed by 3634
Abstract
Mucus consistency affects voice physiology and is connected to voice disorders. Nevertheless, the rheological characteristics of human laryngeal mucus from the vocal folds remain unknown. Knowledge about mucus viscoelasticity enables the fabrication of artificial mucus with natural properties and more realistic ex vivo experiments, and promotes a better understanding and improved treatment of dysphonia with regard to mucus consistency. We studied human laryngeal mucus samples from the vocal folds with two complementary approaches: 19 samples were successfully analyzed by particle tracking microrheology (PTM) and five additional samples by oscillatory shear rheology (OSR). Mucus was collected from patients by experienced laryngologists, together with demographic data. The analysis of the viscoelasticity revealed diversity among the investigated mucus samples in their rigidity (absolute G′ and G″). Moreover, some samples showed solid-like character throughout (G′ > G″), whereas others underwent a change from solid-like to liquid-like (G′ < G″). This led to a subdivision into three groups. We assume that the reason for the differences is a variation in the hydration level of the mucus, which affects the mucin concentration and network formation factors of the mucin mesh. The demographic data could not be correlated with the differences, except for smoking behavior: mucus of predominantly liquid-like character was associated with current smokers.
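The solid-like versus liquid-like distinction can be illustrated with a small Python sketch that compares the storage modulus G′ with the loss modulus G″ at each point of a measurement sweep. The three labels returned below are hypothetical and are not claimed to match the three groups defined in the paper; the function name `classify_sweep` and the example values are likewise assumptions.

```python
def classify_sweep(storage_moduli, loss_moduli):
    """Label a sweep by comparing G' (storage) with G'' (loss) pointwise.

    Labels are illustrative and are not the paper's group definitions.
    """
    if len(storage_moduli) != len(loss_moduli) or not storage_moduli:
        raise ValueError("inputs must be non-empty and of equal length")
    # A point counts as solid-like when the elastic part dominates (G' > G'').
    solid_like = [g1 > g2 for g1, g2 in zip(storage_moduli, loss_moduli)]
    if all(solid_like):
        return "solid-like throughout"
    if not any(solid_like):
        return "liquid-like throughout"
    return "mixed (crossover between G' and G'')"


# Hypothetical moduli in Pa along a sweep, not measured data.
label_a = classify_sweep([10.0, 12.0, 15.0], [2.0, 3.0, 5.0])
label_b = classify_sweep([10.0, 6.0, 2.0], [4.0, 6.5, 8.0])
```

Here `label_a` describes a sample whose elastic response dominates at every point, while `label_b` describes one that crosses over from solid-like to liquid-like behavior within the sweep.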

20 pages, 94361 KiB  
Article
Aeroacoustic Sound Source Characterization of the Human Voice Production-Perturbed Convective Wave Equation
by Stefan Schoder, Paul Maurerlehner, Andreas Wurzinger, Alexander Hauser, Sebastian Falk, Stefan Kniesburges, Michael Döllinger and Manfred Kaltenbacher
Appl. Sci. 2021, 11(6), 2614; https://doi.org/10.3390/app11062614 - 15 Mar 2021
Cited by 26 | Viewed by 3681
Abstract
The flow-induced sound sources of human voice production are investigated based on a validated voice model. This analysis is performed using a hybrid aeroacoustic workflow based on the perturbed convective wave equation. In the first step, the validated 3D incompressible turbulent flow simulation is computed by the finite volume method using STARCCM+. In the second step, the aeroacoustic sources are evaluated and studied in detail. The full formulation of the sound sources is systematically compared to the simplification (neglecting the convective sources) using time-domain and Fourier-space analysis. Additionally, the wave equation is solved with the finite element solver openCFS to obtain the 3D sound field in the acoustic far-field. During the detailed effect analysis, the far-field sound spectra are compared quantitatively, and the flow-induced sound sources are visualized within the larynx. In this contribution, it is shown that the convective part of the sources dominates locally near the vocal folds (VFs), while the local time derivative of the incompressible pressure is distributed over the whole supraglottal area. Although the maximum amplitude of the time derivative is lower, its integral contribution dominates the sound spectrum. As a by-product of the detailed perturbed convective wave equation source study, we show that the convective source term can be neglected since it only reduces the validation error by 0.6%. Neglecting the convective part reduces the algorithmic complexity of the aeroacoustic source computation of the perturbed convective wave equation and the amount of stored flow data. From the source visualization, we learned how the VF motion transforms into specific characteristics of the aeroacoustic sources. We found that if the VFs close fully, the aeroacoustic source terms yield the highest dynamic range. If the VFs do not close fully, the VF motion does not provide as much source energy to the flow-induced sound sources as in the case of a healthy voice. As a consequence of incomplete VF closure, the cyclic pulsating velocity jet is not cut off entirely, and therefore turbulent structures are permanently present inside the supraglottal region. These turbulent structures increase the broadband component of the voice signal, which supports the results of previous studies regarding glottis closure and insufficient voice production.

13 pages, 4378 KiB  
Article
Modelling of Amplitude Modulated Vocal Fry Glottal Area Waveforms Using an Analysis-by-Synthesis Approach
by Vinod Devaraj and Philipp Aichinger
Appl. Sci. 2021, 11(5), 1990; https://doi.org/10.3390/app11051990 - 24 Feb 2021
Cited by 1 | Viewed by 2211
Abstract
The characterization of voice quality is important for the diagnosis of a voice disorder. Vocal fry is a voice quality that is traditionally characterized by a low frequency and a long closed phase of the glottis. However, we also observed amplitude modulated vocal fry glottal area waveforms (GAWs) without long closed phases (positive group), which we modelled using an analysis-by-synthesis approach. Natural and synthetic GAWs are modelled. The negative group consists of euphonic, i.e., normophonic, GAWs. The analysis-by-synthesis approach fits two modelled GAWs to each input GAW. One modelled GAW is modulated to replicate the amplitude and frequency modulations of the input GAW, and the other modelled GAW is unmodulated. The modelling errors of the two modelled GAWs are used to classify the GAWs into the positive and the negative groups with a simple support vector machine (SVM) classifier with a linear kernel. The modelling errors of all vocal fry GAWs obtained using the modulating model are smaller than the modelling errors obtained using the unmodulated model. Using the two modelling errors as predictors for classification, no false positives or false negatives are obtained. To further distinguish the subtypes of amplitude modulated vocal fry GAWs, the entropy of the modulator’s power spectral density and the modulator-to-carrier frequency ratio are obtained.

19 pages, 31780 KiB  
Article
Impact of the Sub-Grid Scale Turbulence Model in Aeroacoustic Simulation of Human Voice
by Martin Lasota, Petr Šidlof, Manfred Kaltenbacher and Stefan Schoder
Appl. Sci. 2021, 11(4), 1970; https://doi.org/10.3390/app11041970 - 23 Feb 2021
Cited by 11 | Viewed by 2977
Abstract
In an aeroacoustic simulation of human voice production, the effect of the sub-grid scale (SGS) model on the acoustic spectrum was investigated. In the first step, incompressible airflow in a 3D model of the larynx with vocal folds undergoing prescribed two-degree-of-freedom oscillation was simulated by laminar and Large-Eddy Simulations (LES), using the One-Equation and Wall-Adaptive Local-Eddy (WALE) SGS models. In the second step, the aeroacoustic sources and the sound propagation in a domain composed of the larynx and vocal tract were computed by the Perturbed Convective Wave Equation (PCWE) for the vowels [u:] and [i:]. The results show that the SGS model has a significant impact not only on the flow field, but also on the spectrum of the sound sampled 1 cm downstream of the lips. With the WALE model, which is known to handle the near-wall and high-shear regions more precisely, the simulations predict significantly higher peak volumetric flow rates of air than those of the One-Equation model, only slightly lower than those of the laminar simulation. The use of the WALE SGS model also results in higher sound pressure levels of the higher harmonic frequencies.

15 pages, 3023 KiB  
Article
Numerical and Experimental Investigations on Vocal Fold Approximation in Healthy and Simulated Unilateral Vocal Fold Paralysis
by Zheng Li, Azure Wilson, Lea Sayce, Amit Avhad, Bernard Rousseau and Haoxiang Luo
Appl. Sci. 2021, 11(4), 1817; https://doi.org/10.3390/app11041817 - 18 Feb 2021
Cited by 11 | Viewed by 6005
Abstract
We have developed a novel surgical/computational model for the investigation of unilateral vocal fold paralysis (UVFP) which will be used to inform future in silico approaches to improve surgical outcomes in type I thyroplasty. Healthy phonation (HP) was achieved using cricothyroid suture approximation on both sides of the larynx to generate symmetrical vocal fold closure. Following high-speed videoendoscopy (HSV) capture, sutures on the right side of the larynx were removed, partially releasing tension unilaterally and generating asymmetric vocal fold closure characteristic of UVFP (sUVFP condition). HSV revealed symmetric vibration in HP, while in sUVFP the sutured side demonstrated a higher frequency (10–11%). For the computational model, ex vivo magnetic resonance imaging (MRI) scans were captured at three configurations: non-approximated (NA), HP, and sUVFP. A finite-element method (FEM) model was built, in which cartilage displacements from the MRI images were used to prescribe the adduction, and the vocal fold deformation was simulated before the eigenmode calculation. The results showed that the frequency comparison between the two sides was consistent with observations from HSV. This alignment between the surgical and computational models supports the future application of these methods for the investigation of treatment for UVFP.
(This article belongs to the Special Issue Computational Methods and Engineering Solutions to Voice II)

17 pages, 4644 KiB  
Article
Effect of Subglottic Stenosis on Vocal Fold Vibration and Voice Production Using Fluid–Structure–Acoustics Interaction Simulation
by Dariush Bodaghi, Qian Xue, Xudong Zheng and Scott Thomson
Appl. Sci. 2021, 11(3), 1221; https://doi.org/10.3390/app11031221 - 29 Jan 2021
Cited by 13 | Viewed by 2594
Abstract
An in-house 3D fluid–structure–acoustic interaction numerical solver was employed to investigate the effect of subglottic stenosis (SGS) on the dynamics of glottal flow, vocal fold vibration and acoustics during voice production. The investigation focused on two SGS properties: severity, defined as the percentage of area reduction, and location. The results show that SGS affects voice production only when its severity is beyond a threshold, which is at 75% for the glottal flow rate and acoustics, and at 90% for the vocal fold vibrations. Beyond the threshold, the flow rate, vocal fold vibration amplitude and vocal efficiency decrease rapidly with SGS severity, while the skewness quotient, vibration frequency, signal-to-noise ratio and vocal intensity decrease slightly, and the open quotient increases slightly. Changing the location of SGS shows no effect on the dynamics. Further analysis reveals that the effect of SGS on the dynamics is primarily due to its effect on the flow resistance in the entire airway, which is found to be related to the area ratio of glottis to SGS. Below the SGS severity of 75%, which corresponds to an area ratio of glottis to SGS of 0.1, changing the SGS severity causes only very small changes in the area ratio; therefore, its effect on the flow resistance and dynamics is very small. Beyond the SGS severity of 75%, increasing the SGS severity leads to a rapid increase in the area ratio, resulting in rapid changes in the flow resistance and dynamics.
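As a rough illustration of why the area ratio stays nearly flat below 75% severity and climbs steeply beyond it, the sketch below assumes the stenosis area equals the unobstructed airway area scaled by (1 - severity). The function name, the normalized airway area of 1.0, and the glottal area of 0.025 (chosen only so that the ratio equals 0.1 at 75% severity, as stated in the abstract) are illustrative assumptions, not values or methods taken from the paper.

```python
def glottis_to_sgs_area_ratio(severity, glottal_area=0.025, airway_area=1.0):
    """Area ratio of glottis to stenosis for a severity given as a fraction in [0, 1).

    Assumption: stenosis area = airway_area * (1 - severity).
    The default areas are hypothetical, normalized values.
    """
    if not 0.0 <= severity < 1.0:
        raise ValueError("severity must be in [0, 1)")
    stenosis_area = airway_area * (1.0 - severity)
    return glottal_area / stenosis_area


# Tabulate the ratio at a few severities to show the 1 / (1 - severity) growth.
for s in (0.0, 0.5, 0.75, 0.9, 0.95):
    print(f"severity {s:.0%}: area ratio {glottis_to_sgs_area_ratio(s):.3f}")
```

Under this assumption the ratio rises by only 0.075 between 0% and 75% severity, but by 0.15 between 75% and 90%, which is in line with the threshold behaviour described in the abstract.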

15 pages, 2944 KiB  
Article
A Hybrid Machine-Learning-Based Method for Analytic Representation of the Vocal Fold Edges during Connected Speech
by Ahmed M. Yousef, Dimitar D. Deliyski, Stephanie R. C. Zacharias, Alessandro de Alarcon, Robert F. Orlikoff and Maryam Naghibolhosseini
Appl. Sci. 2021, 11(3), 1179; https://doi.org/10.3390/app11031179 - 27 Jan 2021
Cited by 15 | Viewed by 2854
Abstract
Investigating the phonatory processes in connected speech from high-speed videoendoscopy (HSV) demands the accurate detection of the vocal fold edges during vibration. The present paper proposes a new spatio-temporal technique to automatically segment vocal fold edges in HSV data during running speech. The HSV data were recorded from a vocally normal adult during a reading of the “Rainbow Passage.” The introduced technique was based on an unsupervised machine-learning (ML) approach combined with an active contour modeling (ACM) technique (together referred to as a hybrid approach). The hybrid method was implemented to capture the edges of vocal folds on different HSV kymograms, extracted at various cross-sections of vocal folds during vibration. The k-means clustering method, an ML approach, was first applied to cluster the kymograms to identify the clustered glottal area, which in turn provided an initialized contour for the ACM. The ACM algorithm was then used to precisely detect the glottal edges of the vibrating vocal folds. The developed algorithm was able to accurately track the vocal fold edges across frames with low computational cost and high robustness against image noise. This algorithm offers a fully automated tool for analyzing the vibratory features of vocal folds in connected speech.
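The initialization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it runs a plain two-cluster k-means on pixel intensities of a small synthetic kymogram and reads off the left and right boundary of the dark (glottal) cluster in each row as a starting contour. The function names, the toy data, and the assumption that each row is one time step are all hypothetical, and the subsequent ACM refinement is not shown.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on scalar intensities; returns the cluster centers, sorted."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly across the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


def initial_glottal_edges(kymogram):
    """Per row, return (left, right) columns of the dark cluster, or None if absent."""
    pixels = [p for row in kymogram for p in row]
    dark, bright = kmeans_1d(pixels, k=2)
    threshold = (dark + bright) / 2.0  # midpoint between the two cluster centers
    edges = []
    for row in kymogram:
        cols = [x for x, p in enumerate(row) if p < threshold]
        edges.append((cols[0], cols[-1]) if cols else None)
    return edges


# Hypothetical kymogram: bright tissue (about 200) with a dark glottal gap (about 20).
kymogram = [
    [200, 198, 205, 201, 199, 202],
    [201, 199, 22, 25, 200, 203],
    [198, 21, 18, 20, 24, 199],
    [202, 200, 23, 197, 201, 200],
]
edges = initial_glottal_edges(kymogram)
print(edges)  # [None, (2, 3), (1, 4), (2, 2)]
```

Rows without any dark pixels yield None, which corresponds to a closed glottis at that time step; an active contour would then be started from the non-empty edge pairs.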

19 pages, 5069 KiB  
Article
Method for Horizontal Calibration of Laser-Projection Transnasal Fiberoptic High-Speed Videoendoscopy
by Hamzeh Ghasemzadeh, Dimitar D. Deliyski, Robert E. Hillman and Daryush D. Mehta
Appl. Sci. 2021, 11(2), 822; https://doi.org/10.3390/app11020822 - 17 Jan 2021
Cited by 7 | Viewed by 2567
Abstract
Objective: Calibrated horizontal measurements (e.g., mm) from endoscopic procedures could be utilized for advancement of evidence-based practice and personalized medicine. However, the size of an object in endoscopic images is not readily calibrated and depends on multiple factors, including the distance between the endoscope and the target surface. Additionally, acquired images may have significant non-linear distortion that would further complicate calibrated measurements. This study used a recently developed in vivo laser-projection fiberoptic laryngoscope and proposes a method for calibrated spatial measurements. Method: A set of circular grids was recorded at multiple working distances. A statistical model was trained to map the pixel length of the target object, the working distance, and the object's spatial location in the image to its length in mm. Result: A detailed analysis of the performance of the proposed method is presented. The analyses have shown that the accuracy of the proposed method does not depend on the working distance and length of the target object. The estimated average magnitude of error was 0.27 mm, which is three times lower than the existing alternative. Conclusion: The presented method can achieve sub-millimeter accuracy in horizontal measurement. Significance: Evidence-based practice and personalized medicine could significantly benefit from the proposed method. Implications of the findings for other endoscopic procedures are also discussed.
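To make the dependence on working distance concrete, the sketch below fits a deliberately simplified stand-in for such a mapping: an ideal pinhole relation in which length in mm is proportional to pixel length times working distance. This is an assumption for illustration only; it ignores the spatial location and non-linear distortion that the paper's statistical model accounts for, and the function names and calibration samples are hypothetical.

```python
def fit_scale(samples):
    """Least-squares scale k for mm ~ k * pixels * distance.

    samples: iterable of (pixel_length, working_distance_mm, true_length_mm).
    """
    num = sum(px * d * mm for px, d, mm in samples)
    den = sum((px * d) ** 2 for px, d, _ in samples)
    return num / den


def pixels_to_mm(pixel_length, working_distance_mm, k):
    """Convert a pixel length to mm under the simplified pinhole assumption."""
    return k * pixel_length * working_distance_mm


# Hypothetical calibration recordings of known lengths at several working distances.
samples = [(100, 10, 2.0), (50, 20, 2.0), (80, 15, 2.4), (40, 30, 2.4)]
k = fit_scale(samples)
print(round(pixels_to_mm(60, 25, k), 3))
```

The same object spans fewer pixels as the working distance grows, so a pixel length cannot be converted to mm without a distance estimate, which is the role the laser projection plays in the study.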
