Applied Sciences

Research

11 pages, 1510 KiB

Open Access Article

Localisation of Vertical Auditory Phantom Image with Band-limited Reductions of Vertical Interchannel Crosstalk

by Rory Wallis and Hyunkook Lee

Appl. Sci. 2020, 10(4), 1490; https://doi.org/10.3390/app10041490 - 21 Feb 2020

Cited by 1 | Viewed by 2992

Abstract

Direct sound that is captured by the upper layer of a three-dimensional (3D) microphone array is typically regarded as vertical interchannel crosstalk (VIC), since it tends to produce an undesired effect of the sound source image being elevated from the ear-level loudspeaker layer position (0°) in reproduction. The present study examined the effectiveness of band-limited VIC attenuation methods on preventing the vertical image shift problem. In a subjective experiment, five natural sound sources were presented as vertically-oriented phantom images while using two stereophonic loudspeaker pairs elevated at 0° and 30° in front of the listener. The upper layer signal (i.e., VIC) was attenuated in various octave-band-dependent conditions that were based on vertical localisation thresholds obtained from previous studies. The results showed that it was possible to achieve the goal of panning the phantom image at the same height as the image produced by the main loudspeaker layer by attenuating only a single octave band with the centre frequency of 4 kHz or 8 kHz or multiple bands at 1 kHz and above. This has a useful practical implication in 3D sound recording and mixing where a vertically oriented phantom image is rendered. Full article
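As a rough illustration of what attenuating a single octave band involves, the sketch below computes the edge frequencies of an octave band around a given centre frequency and converts an attenuation in decibels to a linear gain. The function names and the 6 dB figure are illustrative assumptions, not values or methods taken from the study.

```python
import math


def octave_band_edges(centre_hz):
    """Return (lower, upper) edge frequencies of a one-octave band.

    The edges sit half an octave either side of the centre,
    i.e. centre / sqrt(2) and centre * sqrt(2).
    """
    half_octave = math.sqrt(2.0)
    return centre_hz / half_octave, centre_hz * half_octave


def db_to_gain(level_db):
    """Convert a level change in dB to a linear amplitude gain."""
    return 10.0 ** (level_db / 20.0)


# Hypothetical example: the 4 kHz octave band, attenuated by 6 dB.
lower, upper = octave_band_edges(4000.0)
gain = db_to_gain(-6.0)
print(f"band: {lower:.0f} Hz to {upper:.0f} Hz, gain: {gain:.3f}")
```

In practice the gain would be applied only to the upper-layer signal content falling inside the band, for example through a band-pass and band-stop filter pair; the filter design itself is outside this sketch.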

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

19 pages, 1502 KiB

Open Access Article

Effects of Background Sounds on Annoyance Reaction to Foreground Sounds in Psychoacoustic Experiments in the Laboratory: Limits and Consequences

by Armin Taghipour and Eduardo Pelizzari

Appl. Sci. 2019, 9(9), 1872; https://doi.org/10.3390/app9091872 - 7 May 2019

Cited by 5 | Viewed by 3364

Abstract

In a variety of applications, e.g., psychoacoustic experiments, virtual sound propagation demonstration, or synthesized noise production, noise samples are played back in laboratories. To simulate realistic scenes or to mask unwanted background sounds, it is sometimes preferable to add background ambient sounds to the noise. However, this can influence noise perception. It should be ensured that either background sounds do not affect, e.g., annoyance from foreground noise or that possible effects can be quantified. Two laboratory experiments are reported, in which effects of mixing background sounds to foreground helicopter samples were investigated. By means of partially balanced incomplete block designs, possible effects of three independent variables, i.e., helicopter’s sound exposure level, background type, and background sound pressure level were tested on the dependent variable annoyance, rated on the ICBEN 11-point numerical scale. The main predictor of annoyance was helicopter’s sound exposure level. Stimuli with eventful background sounds were found to be more annoying than those with less eventful background sounds. Furthermore, background type and level interacted significantly. For the major part of the background sound level range, increasing the background level was associated with increased or decreased annoyance for stimuli with eventful and less eventful background sounds, respectively. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

22 pages, 3100 KiB

Open Access Article

Automatic Spatial Audio Scene Classification in Binaural Recordings of Music

by Sławomir K. Zieliński and Hyunkook Lee

Appl. Sci. 2019, 9(9), 1724; https://doi.org/10.3390/app9091724 - 26 Apr 2019

Cited by 12 | Viewed by 3989

Abstract

The aim of the study was to develop a method for automatic classification of the three spatial audio scenes, differing in horizontal distribution of foreground and background audio content around a listener in binaurally rendered recordings of music. For the purpose of the study, audio recordings were synthesized using thirteen sets of binaural-room-impulse-responses (BRIRs), representing room acoustics of both semi-anechoic and reverberant venues. Head movements were not considered in the study. The proposed method was assumption-free with regards to the number and characteristics of the audio sources. A least absolute shrinkage and selection operator was employed as a classifier. According to the results, it is possible to automatically identify the spatial scenes using a combination of binaural and spectro-temporal features. The method exhibits a satisfactory classification accuracy when it is trained and then tested on different stimuli but synthesized using the same BRIRs (accuracy ranging from 74% to 98%), even in highly reverberant conditions. However, the generalizability of the method needs to be further improved. This study demonstrates that in addition to the binaural cues, the Mel-frequency cepstral coefficients constitute an important carrier of spatial information, imperative for the classification of spatial audio scenes. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

17 pages, 2911 KiB

Open Access Article

Timbre Preferences in the Context of Mixing Music

by Felix A. Dobrowohl, Andrew J. Milne and Roger T. Dean

Appl. Sci. 2019, 9(8), 1695; https://doi.org/10.3390/app9081695 - 24 Apr 2019

Cited by 6 | Viewed by 4050

Abstract

Mixing music is a highly complex task. This is exacerbated by the fact that timbre perception is still poorly understood. As a result, few studies have been able to pinpoint listeners’ preferences in terms of timbre. In order to investigate timbre preference in a music production context, we let participants mix multiple individual parts of musical pieces (bassline, harmony, and arpeggio parts, all sounded with a synthesizer) by adjusting four specific timbral attributes of the synthesizer (lowpass, sawtooth/square wave oscillation blend, distortion, and inharmonicity). After participants mixed all parts of a musical piece, they were asked to rate multiple mixes of the same musical piece. Listeners showed preferences for their own mixes over random, fixed sawtooth, or expert mixes. However, participants were unable to identify their own mixes. Despite not being able to accurately identify their own mixes, participants consistently preferred the mix they thought to be their own, regardless of whether or not this mix was indeed their own. Correlations and cluster analysis of the participants’ mixing settings show most participants behaving independently in their mixing approaches and one moderate sized cluster of participants who are actually rather similar. In reference to the starting-settings, participants applied the biggest changes to the sound with the inharmonicity manipulation (measured in the perceptual distance) despite often mentioning that they do not find this manipulation particularly useful. The results show that listeners have a consistent, yet individual timbre preference and are able to reliably evoke changes in timbre towards their own preferences. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

14 pages, 3878 KiB

Open Access Article

Impact of Structural Parameters on the Auditory Perception of Musical Sounds in Closed Spaces: An Experimental Study

by Lei Wang, Xiyue Ma, Rong Li and Xiangyang Zeng

Appl. Sci. 2019, 9(7), 1416; https://doi.org/10.3390/app9071416 - 4 Apr 2019

Viewed by 2664

Abstract

This study attempts to investigate the impact of structural parameters (volume, shape, and the wall absorption coefficient) in closed space on the auditory perception of three different musical sound types. With binaural audibility technology and room impulse response measurement (RIR), this paper first verifies the reliability of using ODEON software in simulating simplified closed-space auditory scenes. Then, 96 music binaural signals produced in eight simulated closed spaces with different structural parameters are synthesized. Finally, auditory perception experiment is conducted on the synthesized binaural signals by using pair comparison method, and variance analysis is also made on the experimental results. It is concluded that (1) a hemispherical cabin with a small volume and large wall sound absorption coefficient is most suitable for playing a single instrument, such as the flute or violin, and (2) a cabin with large volume is suitable for playing multiple instruments music such as symphony, but the walls should not be totally reflective. The experimental scheme and results of current study provide guidance for designing the inner structure of the concert hall to achieve preferable auditory perception in practice. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

21 pages, 12545 KiB

Open Access Article

Interaural Level Difference Optimization of Binaural Ambisonic Rendering

by Thomas McKenzie, Damian T. Murphy and Gavin Kearney

Appl. Sci. 2019, 9(6), 1226; https://doi.org/10.3390/app9061226 - 23 Mar 2019

Cited by 8 | Viewed by 4571

Abstract

Ambisonics is a spatial audio technique appropriate for dynamic binaural rendering due to its sound field rotation and transformation capabilities, which has made it popular for virtual reality applications. An issue with low-order Ambisonics is that interaural level differences (ILDs) are often reproduced with lower values when compared to head-related impulse responses (HRIRs), which reduces lateralization and spaciousness. This paper introduces a method of Ambisonic ILD Optimization (AIO), a pre-processing technique to bring the ILDs produced by virtual loudspeaker binaural Ambisonic rendering closer to those of HRIRs. AIO is evaluated objectively for Ambisonic orders up to fifth order versus a reference dataset of HRIRs for all locations on the sphere via estimated ILD and spectral difference, and perceptually through listening tests using both simple and complex scenes. Results conclude AIO produces an overall improvement for all tested orders of Ambisonics, though the benefits are greatest at first and second order. Full article
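For readers unfamiliar with the ILD, the following minimal sketch computes a broadband ILD as the energy ratio between left-ear and right-ear impulse responses, expressed in decibels. The sign convention (positive when the left ear is louder), the function names, and the toy sample values are assumptions for illustration; the ILD estimation used in the paper may differ.

```python
import math


def energy(signal):
    """Sum of squared samples of an impulse response."""
    return sum(sample * sample for sample in signal)


def ild_db(left, right):
    """Broadband interaural level difference in dB.

    Assumed convention: positive values mean the left-ear
    signal carries more energy than the right-ear signal.
    """
    return 10.0 * math.log10(energy(left) / energy(right))


# Toy impulse responses (hypothetical values, not measured HRIRs).
left_ear = [1.0, 0.5]
right_ear = [0.5, 0.25]
print(f"ILD: {ild_db(left_ear, right_ear):.2f} dB")
```

A rendering that reproduces ILDs "with lower values" would yield a smaller magnitude from `ild_db` than the reference HRIR pair for the same source direction.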

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

17 pages, 9733 KiB

Open Access Article

A Study on Affective Dimensions to Engine Acceleration Sound Quality Using Acoustic Parameters

by Soyoun Moon, Sunghwan Park, Donggun Park, Wonjoon Kim, Myung Hwan Yun and Dongchul Park

Appl. Sci. 2019, 9(3), 604; https://doi.org/10.3390/app9030604 - 12 Feb 2019

Cited by 22 | Viewed by 5627

Abstract

The technical performance of recent automobiles is highly progressed and standardized across different manufacturers. This study seeks to derive a semantic space of engine acceleration sound quality for end users and identify the relation with sound characteristics. For this study, two affective attributes: ‘refined’ and ‘powerful’, and eight acoustic parameters considering revolutions per minute were used to determine the correlation coefficient for those affective attributes. In the experiment, a total of 35 automobiles were selected. Each of the 3rd gear wide open throttle sounds was recorded and evaluated by 42 adult subjects with normal hearing ability and driving license. Their subjective evaluations were analyzed using factor analysis, independent t-test, correlation analysis, and regression analysis. The prediction models for the affective dimensions show distinct differences for the revolutions per minute. From the experiment, it was confirmed that the customers’ affective response can be predicted through the acoustic parameters. In addition, it was found that the initial revolutions per minute in the accelerated condition had the greatest influence on the affective response. This study can be a useful guideline to design engine acceleration sounds that satisfy customers’ affective experience. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

18 pages, 674 KiB

Open Access Article

Modelling Timbral Hardness

by Andy Pearce, Tim Brookes and Russell Mason

Appl. Sci. 2019, 9(3), 466; https://doi.org/10.3390/app9030466 - 30 Jan 2019

Cited by 7 | Viewed by 3648

Abstract

Hardness is the most commonly searched timbral attribute within freesound.org, a commonly used online sound effect repository. A perceptual model of hardness was developed to enable the automatic generation of metadata to facilitate hardness-based filtering or sorting of search results. A training dataset was collected of 202 stimuli with 32 sound source types, and perceived hardness was assessed by a panel of listeners. A multilinear regression model was developed on six features: maximum bandwidth, attack centroid, midband level, percussive-to-harmonic ratio, onset strength, and log attack time. This model predicted the hardness of the training data with $R^{2} = 0.76$. It predicted hardness within a new dataset with $R^{2} = 0.57$, and predicted the rank order of individual sources perfectly, after accounting for the subjective variance of the ratings. Its performance exceeded that of human listeners. Full article
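The coefficient of determination quoted above can be sketched as follows: it compares the residual error of the predictions with the variance of the observed ratings. The function name and the toy ratings are illustrative assumptions and are unrelated to the paper's dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of observations about their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hardness ratings and model predictions.
ratings = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
print(f"R^2 = {r_squared(ratings, predictions):.2f}")
```

A value of 1 means the predictions match the ratings exactly, while a value near 0 means the model does no better than always predicting the mean rating.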

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

16 pages, 3743 KiB

Open Access Article

The Role of Reverberation and Magnitude Spectra of Direct Parts in Contralateral and Ipsilateral Ear Signals on Perceived Externalization

by Song Li, Roman Schlieper and Jürgen Peissig

Appl. Sci. 2019, 9(3), 460; https://doi.org/10.3390/app9030460 - 29 Jan 2019

Cited by 8 | Viewed by 4703

Abstract

Several studies show that the reverberation and spectral details in direct sounds are two essential cues for perceived externalization of virtual sound sources in reverberant environments. The present study investigated the role of these two cues in contralateral and ipsilateral ear signals on perceived externalization of headphone-reproduced binaural sound images at different azimuth angles. For this purpose, seven pairs of non-individual binaural room impulse responses (BRIRs) were measured at azimuth angles of −90°, −60°, −30°, 0°, 30°, 60°, and 90° in a listening room. The magnitude spectra of direct parts were smoothed, and the reverberation was removed, either in left or right ear BRIRs. Such modified BRIRs were convolved with a speech signal, and the resulting binaural sounds were presented over headphones. Subjects were asked to assess the degree of perceived externalization for the presented stimuli. The result of the subjective listening experiment revealed that the magnitude spectra of direct parts in ipsilateral ear signals and the reverberation in contralateral ear signals are important for perceived externalization of virtual lateral sound sources. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

24 pages, 6018 KiB

Open Access Article

Influence of Contextual Factors on Soundscape in Urban Open Spaces

by Xiaolong Zhao, Shilun Zhang, Qi Meng and Jian Kang

Appl. Sci. 2018, 8(12), 2524; https://doi.org/10.3390/app8122524 - 6 Dec 2018

Cited by 16 | Viewed by 4231

Abstract

The acoustic environment in urban open spaces has played a key role for users. This study analyzed the different effects of contextual factors, including shop openness, season, and commercial function, on the soundscape in two typical commercial pedestrian streets. The following observations were based on a series of measurements, including crowd measurements, acoustic environment measurements, and a questionnaire survey. First, the number of talkers in Central Avenue was greater than the number of talkers in Kuan Alley in cases with the same crowd density, while there was no significant difference in the sound pressure level. When the crowd density increased, acoustic comfort trended downward in Kuan Alley, while the value of acoustic comfort in Central Avenue took a parabolic shape. Second, there was no significant difference between the number of talkers in summer and the number of talkers in winter; however, when crowd density increased by 0.1 persons/m², the level of sound pressure increased by 1.3 dBA in winter and 2.2 dBA in summer. Acoustic comfort took a parabolic shape that first increased and then decreased in both winter and summer. Regarding commercial function, as the crowd density increased, the number of talkers and the level of sound pressure both increased, while acoustic comfort decreased in three zones with different commercial functions. In addition, a cross-tab analysis was used to discuss the relationship between the number of talkers and the level of sound pressure, and it was found to be positive. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

21 pages, 33140 KiB

Open Access Article

A Perceptual Evaluation of Individual and Non-Individual HRTFs: A Case Study of the SADIE II Database

by Cal Armstrong, Lewis Thresh, Damian Murphy and Gavin Kearney

Appl. Sci. 2018, 8(11), 2029; https://doi.org/10.3390/app8112029 - 23 Oct 2018

Cited by 90 | Viewed by 12985

Abstract

As binaural audio continues to permeate immersive technologies, it is vital to develop a detailed understanding of the perceptual relevance of HRTFs. Previous research has explored the benefit of individual HRTFs with respect to localisation. However, localisation is only one metric with which it is possible to rate spatial audio. This paper evaluates the perceived timbral and spatial characteristics of both individual and non-individual HRTFs and compares the results to overall preference. To that end, the measurement and evaluation of a high-resolution multi-environment binaural Impulse Response database is presented for 20 subjects, including the KU100 and KEMAR binaural mannequins. Post-processing techniques, including low frequency compensation and diffuse field equalisation are discussed in relation to the 8802 unique HRTFs measured for each mannequin and 2818/2114 HRTFs measured for each human. Listening test results indicate that particular HRTF sets are preferred more generally by subjects over their own individual measurements. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

17 pages, 9545 KiB

Open Access Article

Diffuse-Field Equalisation of Binaural Ambisonic Rendering

by Thomas McKenzie, Damian T. Murphy and Gavin Kearney

Appl. Sci. 2018, 8(10), 1956; https://doi.org/10.3390/app8101956 - 17 Oct 2018

Cited by 18 | Viewed by 5720

Abstract

Ambisonics has enjoyed a recent resurgence in popularity due to virtual reality applications. Low order Ambisonic reproduction is inherently inaccurate at high frequencies, which causes poor timbre and height localisation. Diffuse-Field Equalisation (DFE), the theory of removing direction-independent frequency response, is applied to binaural (over headphones) Ambisonic rendering to address high-frequency reproduction. DFE of Ambisonics is evaluated by comparing binaural Ambisonic rendering to direct convolution via head-related impulse responses (HRIRs) in three ways: spectral difference, predicted sagittal plane localisation and perceptual listening tests on timbre. Results show DFE successfully improves frequency reproduction of binaural Ambisonic rendering for the majority of sound source locations, as well as the limitations of the technique, and set the basis for further research in the field. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)

Review

22 pages, 1553 KiB

Open Access Feature Paper Review

Psychoacoustic Models for Perceptual Audio Coding—A Tutorial Review

by Jürgen Herre and Sascha Dick

Appl. Sci. 2019, 9(14), 2854; https://doi.org/10.3390/app9142854 - 17 Jul 2019

Cited by 17 | Viewed by 20192

Abstract

Psychoacoustic models of human auditory perception have found an important application in the realm of perceptual audio coding, where exploiting the limitations of perception and removal of irrelevance is key to achieving a significant reduction in bitrate while preserving subjective audio quality. To this end, psychoacoustic models do not need to be perfect to satisfy their purpose, and in fact the commonly employed models only represent a small subset of the known properties and abilities of the human auditory system. This paper provides a tutorial introduction of the most commonly used psychoacoustic models for low bitrate perceptual audio coding. Full article

(This article belongs to the Special Issue Psychoacoustic Engineering and Applications)
