Models and Analysis of Vocal Emissions for Biomedical Applications

A special issue of Bioengineering (ISSN 2306-5354). This special issue belongs to the section "Biosignal Processing".

Deadline for manuscript submissions: closed (31 July 2024) | Viewed by 5262

Special Issue Editors


Guest Editor
Department of Information Engineering, Università degli Studi di Firenze, 50139 Firenze, Italy
Interests: acoustical analysis; ECG

Guest Editor
Department of Information Engineering, Università degli Studi di Firenze, Firenze, Italy
Interests: wearable systems for non-invasive physiological monitoring; statistical and nonlinear biomedical signal processing; affective computing; mood/mental/neurological disorders; human–animal–robot interaction; autonomic nervous system investigation

Guest Editor
Department of Information Engineering, Università degli Studi di Firenze, 50139 Firenze, Italy
Interests: biomedical signal processing; voice analysis; modeling of biomedical signals; parametric spectral estimation; autoregressive models

Special Issue Information

Dear Colleagues,

Following the success of the 13th MAVEBA International Workshop (held in Florence, Italy, 12–13 September 2023), we propose a Special Issue of Bioengineering that collects extended versions of the contributions presented at the Workshop.

The MAVEBA Workshop concerns the study of the human voice, both from a methodological point of view and in its biomedical applications. The series of MAVEBA international workshops started in 1999. Held every two years, it is a multidisciplinary meeting that stimulates contacts between specialists active in bioengineering, clinical applications, and industrial development.

This Special Issue welcomes contributions, ranging from fundamental research to advanced technologies, on models and the analysis of signals and images of the human vocal apparatus, and on any related field across all kinds of biomedical applications. Emphasis is placed on translational research, that is, the link with the “real”, complex world of the human being.

This Special Issue is open to the submission of papers focused on multidisciplinary approaches involving bioengineering, otolaryngology, phoniatrics, neurology, surgery, psychology, psychiatry, logopaedics, linguistics, singing, and related fields, with applications ranging from the newborn to the elderly.

Topics of interest for this Special Issue include, but are not limited to, the following:

  • Tools and methods for voice recording;
  • Wearable devices, mobile apps, and human-computer interaction;
  • Software tools for voice and image analysis;
  • Modeling and analysis of voice and speech;
  • Modeling of vocal folds and vocal tract;
  • Signal processing methods for singing and drama, classical and modern singing, and acted voice;
  • AI and deep learning in voice recognition, synthesis, and classification;
  • Computational neuroscience;
  • Acoustical and image analysis of the vocal folds;
  • Modeling and simulation of vocal physiology;
  • Biomechanics of the vocal folds;
  • Intonation, mood, stress, and related neurological disorders;
  • Newborn cry, prematurity, and neurological disorders;
  • Voice analysis and native language.

Dr. Lorenzo Frassineti
Dr. Antonio Lanata
Dr. Claudia Manfredi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Bioengineering is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

18 pages, 1528 KiB  
Article
Deep Learning-Based Detection of Glottis Segmentation Failures
by Armin A. Dadras and Philipp Aichinger
Bioengineering 2024, 11(5), 443; https://doi.org/10.3390/bioengineering11050443 - 30 Apr 2024
Viewed by 1112
Abstract
Medical image segmentation is crucial for clinical applications, but challenges persist due to noise and variability. In particular, accurate glottis segmentation from high-speed videos is vital for voice research and diagnostics. Manual searching for failed segmentations is labor-intensive, prompting interest in automated methods. This paper proposes the first deep learning approach for detecting faulty glottis segmentations. For this purpose, faulty segmentations are generated by applying both a poorly performing neural network and perturbation procedures to three public datasets. Heavy data augmentations are added to the input until the neural network’s performance decreases to the desired mean intersection over union (IoU). Likewise, the perturbation procedure involves a series of image transformations to the original ground truth segmentations in a randomized manner. These data are then used to train a ResNet18 neural network with custom loss functions to predict the IoU scores of faulty segmentations. This value is then thresholded with a fixed IoU of 0.6 for classification, thereby achieving 88.27% classification accuracy with 91.54% specificity. Experimental results demonstrate the effectiveness of the presented approach. Contributions include: (i) a knowledge-driven perturbation procedure, (ii) a deep learning framework for scoring and detecting faulty glottis segmentations, and (iii) an evaluation of custom loss functions. Full article
(This article belongs to the Special Issue Models and Analysis of Vocal Emissions for Biomedical Applications)
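The classification step described in this abstract (an IoU score compared against a fixed threshold of 0.6) can be sketched as follows. This is only an illustration, not the authors' implementation: in the paper the IoU is predicted by a ResNet18 without access to ground truth, whereas here it is computed directly from two binary masks. The function names `iou` and `is_faulty`, the list-of-lists mask format, and the convention that a score below the threshold means "faulty" are all assumptions.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks (2D lists of 0/1).

    Hypothetical helper for illustration; masks must have the same shape.
    """
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly; treat that case as IoU = 1.0.
    return inter / union if union else 1.0


def is_faulty(iou_score, threshold=0.6):
    """Assumed convention: a segmentation is faulty if its IoU is below the threshold."""
    return iou_score < threshold


truth = [[0, 1, 1],
         [0, 1, 1]]
pred = [[0, 1, 0],
        [0, 1, 1]]

score = iou(pred, truth)   # 3 overlapping pixels out of 4 in the union
print(score, is_faulty(score))
```

With a learned predictor, `score` would be replaced by the network's estimate, and the same threshold would turn that estimate into a faulty / not-faulty decision.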

19 pages, 3452 KiB  
Article
Towards a Corpus (and Language)-Independent Screening of Parkinson’s Disease from Voice and Speech through Domain Adaptation
by Emiro J. Ibarra, Julián D. Arias-Londoño, Matías Zañartu and Juan I. Godino-Llorente
Bioengineering 2023, 10(11), 1316; https://doi.org/10.3390/bioengineering10111316 - 15 Nov 2023
Cited by 5 | Viewed by 2221
Abstract
End-to-end deep learning models have shown promising results for the automatic screening of Parkinson’s disease by voice and speech. However, these models often suffer degradation in their performance when applied to scenarios involving multiple corpora. In addition, they also show corpus-dependent clusterings. These facts indicate a lack of generalisation or the presence of certain shortcuts in the decision, and also suggest the need for developing new corpus-independent models. In this respect, this work explores the use of domain adversarial training as a viable strategy to develop models that retain their discriminative capacity to detect Parkinson’s disease across diverse datasets. The paper presents three deep learning architectures and their domain adversarial counterparts. The models were evaluated with sustained vowels and diadochokinetic recordings extracted from four corpora with different demographics, dialects or languages, and recording conditions. The results showed that the space distribution of the embedding features extracted by the domain adversarial networks exhibits a higher intra-class cohesion. This behaviour is supported by a decrease in the variability and inter-domain divergence computed within each class. The findings suggest that domain adversarial networks are able to learn the common characteristics present in Parkinsonian voice and speech, which are supposed to be corpus, and consequently, language independent. Overall, this effort provides evidence that domain adaptation techniques refine the existing end-to-end deep learning approaches for Parkinson’s disease detection from voice and speech, achieving more generalizable models. Full article
(This article belongs to the Special Issue Models and Analysis of Vocal Emissions for Biomedical Applications)

Other

15 pages, 5547 KiB  
Technical Note
Pragmatic De-Noising of Electroglottographic Signals
by Sten Ternström
Bioengineering 2024, 11(5), 479; https://doi.org/10.3390/bioengineering11050479 - 11 May 2024
Viewed by 1192
Abstract
In voice analysis, the electroglottographic (EGG) signal has long been recognized as a useful complement to the acoustic signal, but only when the vocal folds are actually contacting, such that this signal has an appreciable amplitude. However, phonation can also occur without the vocal folds contacting, as in breathy voice, in which case the EGG amplitude is low, but not zero. It is of great interest to identify the transition from non-contacting to contacting, because this will substantially change the nature of the vocal fold oscillations; however, that transition is not in itself audible. The magnitude of the cycle-normalized peak derivative of the EGG signal is a convenient indicator of vocal fold contacting, but no current EGG hardware has a sufficient signal-to-noise ratio of the derivative. We show how the textbook techniques of spectral thresholding and static notch filtering are straightforward to implement, can run in real time, and can mitigate several noise problems in EGG hardware. This can be useful to researchers in vocology. Full article
(This article belongs to the Special Issue Models and Analysis of Vocal Emissions for Biomedical Applications)
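The two textbook techniques named in this abstract, spectral thresholding and static notch filtering, can be illustrated with a minimal offline sketch. It is not the method from the technical note, which targets real-time processing of EGG signals. The function name `denoise`, its parameters, the 50 Hz notch frequency, and the relative threshold are all assumptions chosen for the example, and the whole signal is processed in a single FFT rather than frame by frame.

```python
import numpy as np


def denoise(signal, fs, notch_hz=50.0, notch_width_hz=2.0, rel_threshold=0.05):
    """Illustrative frequency-domain de-noising (hypothetical helper).

    1. Static notch: zero all bins within a fixed band around notch_hz.
    2. Spectral thresholding: zero all bins whose magnitude is below
       rel_threshold times the largest remaining bin magnitude.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    # Static notch at a fixed, assumed interference frequency.
    spectrum[np.abs(freqs - notch_hz) <= notch_width_hz / 2] = 0.0

    # Spectral thresholding relative to the strongest remaining component.
    mags = np.abs(spectrum)
    spectrum[mags < rel_threshold * mags.max()] = 0.0

    return np.fft.irfft(spectrum, n=len(signal))


# Synthetic test signal: a 120 Hz tone plus 50 Hz hum and weak white noise.
fs = 1000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 120 * t)
noisy = clean + 0.5 * np.sin(2 * np.pi * 50 * t) + 0.01 * rng.standard_normal(fs)

restored = denoise(noisy, fs)
print(np.max(np.abs(noisy - clean)), np.max(np.abs(restored - clean)))
```

A real-time variant would apply the same two steps to short overlapping frames; the thresholds and notch frequencies would have to be tuned to the actual hardware noise.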