Computational Intelligence for Audio Signal Processing

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Processes".

Deadline for manuscript submissions: closed (31 October 2020) | Viewed by 12288

Special Issue Editor


Prof. Stavros Ntalampiras
Guest Editor
Department of Computer Science, University of Milan, 20133 Milan, Italy
Interests: audio analysis; AI; computer vision; robotics; deep learning

Special Issue Information

Dear Colleagues,

In recent years, we have witnessed an ever-increasing demand for generalized sound recognition technology, where the emphasis is placed on non-speech signals, i.e., environmental sounds, music, animal vocalizations, etc. Most of the community assumes that training and future/novel data come from the same distribution. However, in many real-world applications this assumption may not hold. Hence, it is of paramount importance for the scientific community to develop novel computational methods of audio analysis that can track stationarity changes and adapt the processing mechanism accordingly. At the same time, we have observed a shift from traditional hand-crafted feature design to a data-driven one which, when combined with deep models, reaches the point of questioning the relevance of traditional audio signal processing. Such solutions still face several obstacles, e.g., a systematic explanation of the operation of learned models, adversarial examples, excessive computational needs, etc. Addressing these obstacles becomes essential, especially in sensitive applications, for example medical ones, where experts have to make meaningful decisions based on the output of such a recognition algorithm.

We invite original papers, communications, and review articles covering the latest advances in generalized sound recognition technology. Novel solutions for non-stationary environments and interpretable machine learning algorithms are the main priorities. Topics include, but are not limited to, the following:

  • Computational auditory scene analysis 
  • Methodologies, algorithms and techniques for learning in evolving auditory environments
  • Sound event detection and recognition
  • Audio source separation and localization 
  • Audio-based security systems and surveillance 
  • Music information retrieval
  • Music technology and entertainment
  • Computational music composition 
  • Interpretable deep learning for audio analysis
  • Transfer and reinforcement learning for audio data
  • Adversarial machine learning
  • Privacy in smart-home assistants
  • Audio for mobile and handheld devices 
  • Acoustic data processing for the Internet of Things and emerging applications
  • Applications to medical audio data
  • Biodiversity and environmental monitoring
  • Emerging audio technologies (auditory display, interactive sound, and new audio interfaces) 
  • Wireless acoustic sensor networks and applications 
  • Distributed audio signal processing and coding for segmentation, event detection and alerting.

Prof. Stavros Ntalampiras
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • audio pattern recognition
  • nonstationary environments
  • music information retrieval
  • interpretable machine learning
  • computational intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

9 pages, 1423 KiB  
Article
Sequential Estimation of Relative Transfer Function in Application of Acoustic Beamforming
by Jounghoon Beh
Information 2020, 11(11), 505; https://doi.org/10.3390/info11110505 - 28 Oct 2020
Viewed by 1925
Abstract
In this paper, a sequential approach is proposed to estimate the relative transfer functions (RTF) used in developing a generalized sidelobe canceller (GSC). The latency in calibrating microphone arrays for GSC, often suffered by conventional approaches involving batch operations, is significantly reduced in the proposed sequential method. This is accomplished by an immediate generation of the RTF from initial input segments and subsequent updates of the RTF as the input stream continues. From the experimental results via the mean square error (MSE) criterion, it has been shown that the proposed method exhibits improved performance over the conventional batch approach as well as over recently introduced least mean squares approaches.
(This article belongs to the Special Issue Computational Intelligence for Audio Signal Processing)
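For readers unfamiliar with relative transfer functions, the sketch below illustrates the general idea of a sequential RTF estimate, not the author's exact algorithm: exponentially weighted cross- and auto-spectral estimates are updated per STFT frame, so an RTF is available from the first frames and is refined as the stream continues. The frame length, hop size, and forgetting factor alpha are illustrative assumptions.

```python
import numpy as np

def stft_frames(x, frame_len=512, hop=256):
    """Yield windowed FFT frames of a 1-D signal (illustrative parameters)."""
    win = np.hanning(frame_len)
    for start in range(0, len(x) - frame_len + 1, hop):
        yield np.fft.rfft(win * x[start:start + frame_len])

def sequential_rtf(ref_mic, aux_mic, alpha=0.95, eps=1e-12):
    """Sequentially estimate the RTF from the reference to the auxiliary mic.

    Per frequency bin, exponentially weighted cross- and auto-spectra are
    updated frame by frame; an RTF estimate is yielded after every frame.
    """
    cross = auto = None
    for X_ref, X_aux in zip(stft_frames(ref_mic), stft_frames(aux_mic)):
        if cross is None:
            cross = X_aux * np.conj(X_ref)
            auto = np.abs(X_ref) ** 2
        else:
            cross = alpha * cross + (1 - alpha) * X_aux * np.conj(X_ref)
            auto = alpha * auto + (1 - alpha) * np.abs(X_ref) ** 2
        yield cross / (auto + eps)   # current per-bin RTF estimate

# Example: track the RTF of a delayed copy of white noise.
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)
aux = np.roll(ref, 8)              # auxiliary mic: pure 8-sample delay
for rtf in sequential_rtf(ref, aux):
    pass                            # rtf holds the latest estimate
```

In a GSC, the latest estimate would typically feed the blocking matrix, so beamforming can begin without waiting for a full calibration batch.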

17 pages, 8415 KiB  
Article
Anti-Shake HDR Imaging Using RAW Image Data
by Yan Liu, Bingxue Lv, Wei Huang, Baohua Jin and Canlin Li
Information 2020, 11(4), 213; https://doi.org/10.3390/info11040213 - 16 Apr 2020
Cited by 4 | Viewed by 4125
Abstract
Camera shaking and object movement can cause the output images to suffer from blurring, noise, and other artifacts, leading to poor image quality and low dynamic range. Raw images contain minimally processed data from the image sensor compared with JPEG images. In this paper, an anti-shake high-dynamic-range imaging method is presented. This method is more robust to camera motion than previous techniques. An algorithm based on information entropy is employed to choose a reference image from the raw image sequence. To further improve the robustness of the proposed method, the Oriented FAST and Rotated BRIEF (ORB) algorithm is adopted to register the inputs, and a simple Laplacian pyramid fusion method is implemented to generate the high-dynamic-range image. Additionally, a large dataset with 435 various-exposure image sequences is collected, which includes the corresponding JPEG image sequences, to test the effectiveness of the proposed method. The experimental results illustrate that the proposed method achieves better performance in terms of anti-shake ability and preserves more details of real scene images than traditional algorithms. Furthermore, the proposed method is suitable for extreme-exposure image pairs, which can be applied to binocular vision systems to acquire high-quality real scene images, and has lower algorithmic complexity than deep learning-based fusion methods.
(This article belongs to the Special Issue Computational Intelligence for Audio Signal Processing)
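The registration-and-fusion pipeline described in the abstract can be sketched as follows. This is a simplified illustration using OpenCV's ORB and a plain Laplacian-pyramid average, not the authors' implementation; the entropy-based reference selection, feature counts, and pyramid depth are assumptions made for the example.

```python
import cv2
import numpy as np

def entropy(gray):
    """Shannon entropy of an 8-bit grayscale image (for reference selection)."""
    hist = np.bincount(gray.ravel(), minlength=256) / gray.size
    hist = hist[hist > 0]
    return -np.sum(hist * np.log2(hist))

def register_to_reference(ref_gray, img, img_gray):
    """Warp img onto the reference using ORB matches and a RANSAC homography."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(ref_gray, None)
    k2, d2 = orb.detectAndCompute(img_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return cv2.warpPerspective(img, H, (ref_gray.shape[1], ref_gray.shape[0]))

def laplacian_pyramid(img, levels=4):
    """Build a Laplacian pyramid (assumes dimensions divisible by 2**levels)."""
    g, pyr = img.astype(np.float32), []
    for _ in range(levels):
        down = cv2.pyrDown(g)
        pyr.append(g - cv2.pyrUp(down, dstsize=(g.shape[1], g.shape[0])))
        g = down
    pyr.append(g)
    return pyr

def fuse(registered_images, levels=4):
    """Average the Laplacian pyramids of the registered exposures and collapse."""
    pyrs = [laplacian_pyramid(im, levels) for im in registered_images]
    fused = [np.mean(level, axis=0) for level in zip(*pyrs)]
    out = fused[-1]
    for lap in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(lap.shape[1], lap.shape[0])) + lap
    return np.clip(out, 0, 255).astype(np.uint8)
```

In practice, exposure-aware weights (e.g., well-exposedness or contrast measures) would replace the plain average at each pyramid level.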

16 pages, 5323 KiB  
Article
An Innovative Acoustic Rain Gauge Based on Convolutional Neural Networks
by Roberta Avanzato and Francesco Beritelli
Information 2020, 11(4), 183; https://doi.org/10.3390/info11040183 - 28 Mar 2020
Cited by 27 | Viewed by 5761
Abstract
An accurate estimate of rainfall levels is fundamental in numerous application scenarios: weather forecasting, climate models, design of hydraulic structures, precision agriculture, etc. An accurate estimate becomes essential to be able to warn of the imminent occurrence of a calamitous event and reduce the risk to human beings. Unfortunately, to date, traditional techniques for estimating rainfall levels present numerous critical issues. This paper proposes a new approach to rainfall estimation based on the classification of the different acoustic timbres that rain produces at different intensities. The algorithm applies a Convolutional Neural Network (CNN) directly to the audio signal, using 3 s sliding windows with an offset of only 100 milliseconds. Therefore, by using low-cost and low-power hardware, the proposed algorithm allows implementing critical high-rainfall event alerting mechanisms with short response times and low estimation errors. The results obtained on seven classes ranging from "No rain" to "Cloudburst" indicate an average accuracy of 75%, which rises to 93% if misclassifications of adjacent classes are not considered. Some application contexts concern smart cities, for which the integration of an audio sensor inside the luminaire of a street lamp is foreseen, precision agriculture, as well as highway safety, by minimizing the risks of aquaplaning.
(This article belongs to the Special Issue Computational Intelligence for Audio Signal Processing)
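A minimal sketch of the kind of pipeline the abstract outlines: 3 s audio windows with a 100 ms offset, converted to log-mel spectrograms and classified into seven rainfall classes by a small CNN. The sample rate, mel parameters, and network layout are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torchaudio

SAMPLE_RATE = 16000            # assumed sample rate
WIN_SEC, HOP_SEC = 3.0, 0.1    # 3 s analysis window, 100 ms offset (from the abstract)
NUM_CLASSES = 7                # "No rain" ... "Cloudburst"

melspec = torchaudio.transforms.MelSpectrogram(sample_rate=SAMPLE_RATE, n_mels=64)

class RainCNN(nn.Module):
    """Small CNN over log-mel spectrograms of 3 s audio windows."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, wav):                      # wav: (batch, samples)
        spec = torch.log(melspec(wav) + 1e-6)    # (batch, n_mels, frames)
        return self.classifier(self.features(spec.unsqueeze(1)).flatten(1))

def sliding_windows(signal, sr=SAMPLE_RATE, win=WIN_SEC, hop=HOP_SEC):
    """Yield overlapping 3 s windows with a 100 ms offset."""
    w, h = int(win * sr), int(hop * sr)
    for start in range(0, signal.numel() - w + 1, h):
        yield signal[start:start + w]

model = RainCNN()
audio = torch.randn(SAMPLE_RATE * 10)            # stand-in for a 10 s recording
for window in sliding_windows(audio):
    rain_class = model(window.unsqueeze(0)).argmax(dim=1)
```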
