This section presents the RIR de-noising performance of the GBSS and compares it with the reference RIRs and the compensation method. The estimated RTs and EDT are used to evaluate the feasibility of the algorithm. The advantage of the algorithm lies in the two factors in Equation (4), which give the GBSS the flexibility to reduce noise levels and thereby significantly improve the dynamic range of the EDCs. Here, β values from 0.1 to 0.001 (step 0.01) were applied depending on the estimated posterior SNR. The factor α was set according to Equation (5), with α₀ ranging from 1 to 6 (step 0.25) [32]. The de-noising effects of the two parameters within these ranges were analyzed separately. Furthermore, the inadequacy of de-noising RIRs with α₀ smaller than 3 and β higher than 0.1 is also discussed. Finally, the RIRs measured in the meeting room, convolved with pink noise at a level of −55 dB and filtered in the octave band at 1 kHz, were used to present the de-noising results.
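As a minimal sketch of how the two factors act, the following Python snippet applies a generalized spectral subtraction rule to power spectra, bin by bin. It assumes that Equation (4) has the common form in which the noise power estimate, scaled by the over-subtraction factor, is subtracted from the noisy power and the result is floored at β times the noise power. The function name `gbss_power` and its arguments are illustrative and are not taken from the paper.

```python
def gbss_power(noisy_power, noise_power, alpha, beta):
    """Apply a generalized spectral subtraction rule per frequency bin.

    noisy_power, noise_power: sequences of power values (same length).
    alpha: over-subtraction factor; beta: spectral flooring parameter.
    Assumed rule: keep P_y - alpha * P_n when it exceeds beta * P_n,
    otherwise fall back to the floor beta * P_n.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("power spectra must have the same length")
    cleaned = []
    for p_y, p_n in zip(noisy_power, noise_power):
        subtracted = p_y - alpha * p_n
        floor = beta * p_n
        cleaned.append(subtracted if subtracted > floor else floor)
    return cleaned


# A strong bin keeps most of its power; a noise-only bin drops to the floor.
print(gbss_power([10.0, 1.0], [1.0, 1.0], alpha=4.0, beta=0.001))
```

A smaller β lowers the floor, which removes more residual noise but can leave isolated spectral peaks (musical noise); a larger over-subtraction factor removes more noise at the risk of removing part of the decay itself.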
4.1. Performance of the GBSS Algorithm Factors
With a fixed α₀, β was varied from 0.1 to 0.001 according to the SNR; it controls the remaining noise levels and the musical noise, and its effects on the noisy RIR are given in Figure 2. In the EDCs given in Figure 2a, the EDC obtained using a β value larger than 0.1 had a particular impact on the reverberation decay part of the EDC, offering a significant dynamic range improvement of approximately 7 dB and a noise floor reduction of about 5.6 dB. With β = 0.05, the obtained dynamic range of the EDC was 1.5 dB larger than that obtained using β = 0.1. Decreasing β to 0.001 yielded the best de-noising results, providing a dynamic range improvement of up to 10.15 dB. In this case, the reverberation decay range for the RT estimation was extended to −15 dB, and the noise level reduction was approximately 9 dB. On the other hand, the late part of the EDC was not as smooth as the EDC of β = 0.5, which can be seen in the decay range from −35 dB to −40 dB. Compared to β = 0.001, a larger β converted the narrowband spectral peaks of the original RIR to broadband noise, as can be seen from the FFT spectrum in Figure 2b. In this case, the musical noise was not perceptible, but the remaining noise levels were higher (Figure 2c). Moreover, when β > 0.05, there was an increase in other residual-noise artifacts that did not belong to the original RIR. With β = 0.001, musical noise led to high fluctuations in the remaining noise parts of the RIR (Figure 2d), contributing to the roughness of the EDCs; this roughness is only prominent in the noise segments and has less impact on the reverberation decay part. Furthermore, the noise attenuation effects were remarkable with a small β, and the musical noise effects could be decreased using a larger α [28]. Consequently, the GBSS algorithm with α₀ > 3 provided similar noise attenuation rules and musical noise effects in de-noising processing. Regarding the best dynamic range improvement of the EDCs, as well as the noise level reduction on RIRs with different noise types and levels filtered in the octave bands, the experimental results showed that varying α worked well with β = 0.05 for SNR < 0 dB, while β = 0.001 worked best for SNR ≥ 0 dB.
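The EDCs compared throughout this section are conventionally obtained by Schroeder backward integration of the squared RIR. The snippet below is a minimal sketch of that step, assuming the RIR is available as a plain list of samples; the function name is illustrative and not taken from the paper.

```python
import math


def energy_decay_curve_db(rir):
    """Return the Schroeder energy decay curve of an impulse response in dB.

    The squared samples are integrated backward in time and normalized
    so that the curve starts at 0 dB.
    """
    remaining = 0.0
    edc = []
    # Accumulate from the last sample toward the first (backward integration).
    for sample in reversed(rir):
        remaining += sample * sample
        edc.append(remaining)
    edc.reverse()
    if not edc or edc[0] <= 0.0:
        raise ValueError("impulse response has no energy")
    total = edc[0]
    # Guard against log10(0) when the tail samples are exactly zero.
    return [10.0 * math.log10(v / total) if v > 0.0 else float("-inf")
            for v in edc]


# Two equal samples: the second point holds half the energy, about -3 dB.
print(energy_decay_curve_db([1.0, 1.0]))
```

On a noisy RIR, the integrated noise dominates the tail and bends the curve away from the true decay, which is what limits the usable decay range before de-noising.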
It is well known that, with a fixed value of α₀, noise attenuation increases with a decreasing value of β. On the other hand, with a fixed β (suggested above), a change in the α₀ values also significantly impacts the noise reduction effects and the dynamic range improvement. For example, the over-subtraction factor α₀ = 3 with a fixed β leads to an approximately 10.15 dB dynamic range improvement. In contrast, the best de-noising results mentioned above are achieved when α₀ = 4. Therefore, determining the optimal sets of the two factors will further improve the EDC dynamic range, which is essential for de-noising RIRs, particularly when the noise levels are high.
Figure 3 shows the noise attenuation effects of different values of α₀ with a fixed β value of 0.001. The remaining noise levels decreased with larger α₀ values, causing the later parts of the EDCs to decrease. The GBSS with α₀ = 1 produced a slight improvement in the EDC dynamic range (e.g., up to 3 dB), while α₀ = 2 provided a similar result of approximately 4 dB. Changing the over-subtraction factor to an α₀ value larger than 3 led to an improvement of approximately 8 dB in the dynamic range, and the estimation decay range was extended to −12 dB. However, when the noise levels were higher than −60 dB, the estimation decay range failed to extend to −10 dB using an α₀ lower than 3. Therefore, over-subtraction factors α₀ lower than 3 are not considered further when exploring the de-noising effects on the RIRs in terms of dynamic range improvement.
The application of α₀ = 4 yielded the best dynamic range improvement of the EDCs. In this case, the decay curve was almost identical to the reference EDC above −23 dB. At the same time, the de-noised RIR showed similar energy levels to the reference RIR above 0.4 s, which can be seen in Figure 3c. With a larger α₀, the deviation in the reverberation decay increased, which can be seen from the energy–time curves given in Figure 3d. Using α₀ = 6 removed more noise than α₀ = 4. However, a larger loss of the original signal occurred close to the end of the reverberation decay part, around 0.4 s, contributing to a severe deviation in the decay curve. With α₀ = 6, the RT estimated in the range from 0 to −19 dB showed the same result as the noiseless decay curve. However, the reverberation decay curve fell below the reference decay curve in the decay range from −19 dB to −29 dB. This degraded the reverberation decay and caused a smaller estimated dynamic range than the value achieved with α₀ = 4.
It follows that the over-subtraction factor had a significant effect on the EDC dynamic range improvement when the factor α₀ increased to 3. Nevertheless, the resulting decay range was smaller than the best result achieved with α₀. The case presented here used a value of 4 for de-noising the RIR. This value is called the optimal factor because it achieved the best dynamic range improvement. However, subtracting more noise from the noisy RIRs with values higher than the optimal α₀ distorted the signal from a certain point onward, decreasing the dynamic range and causing a deviation in the reverberation decay rates. The estimated time of this point is equal to the knee of the original noisy RIR [13]. Therefore, it is crucial to find this point, or the decay level corresponding to the optimal α₀, to avoid the distortion of the EDCs generated using a higher α₀.
4.2. Performance of the Optimal Factors
Based on the above, the de-noised results showed that the largest dynamic range improvement can be achieved when the noise in the RIR is reduced up to the truncation time (the knee). The knee is the time point where the reverberation decay of the impulse response intersects the noise level, and it can be detected using the nonlinear model [13]. At the same time, the decay level of the knee located on the truncated EDCs, which were calculated from the original RIR using the subtraction–truncation–correction (S–T–C) method, also applied to the noiseless EDCs. In this method, the estimated noise level was subtracted from the RIR before backward integration, and the correction term for the truncation was calculated using the parameters obtained from the nonlinear model. To better observe the remaining noise levels around the knee and to verify the validity of the method under severe background noise, the noise levels convolved with the RIRs used in Section 4.1 were increased to −45 dB.
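The subtraction, truncation, and correction steps described above can be sketched as follows. This is only an illustration under stated assumptions: the noise power, the knee position, and the correction energy are passed in by the caller, whereas the paper derives them from the nonlinear model [13], which is not reproduced here. The function and parameter names are hypothetical.

```python
def stc_decay_energy(squared_rir, noise_power, knee_index, correction):
    """Noise-subtracted, truncated, corrected backward integration.

    squared_rir: squared RIR samples.
    noise_power: estimated mean noise power per sample.
    knee_index: first sample excluded by the truncation.
    correction: energy assumed to lie beyond the knee (caller-supplied).
    Returns the linear (not dB) decay energy for samples 0 .. knee_index - 1.
    """
    if not 0 < knee_index <= len(squared_rir):
        raise ValueError("knee_index must lie inside the response")
    remaining = correction
    decay = []
    for energy in reversed(squared_rir[:knee_index]):
        # Individual differences may be negative; they are kept so that the
        # noise contribution cancels on average instead of being clipped.
        remaining += energy - noise_power
        decay.append(remaining)
    decay.reverse()
    return decay


print(stc_decay_energy([4.0, 2.0, 1.0, 1.0], noise_power=1.0,
                       knee_index=2, correction=0.5))
```

The quality of the result depends on the correction term and the knee estimate, which is why the truncation time needs to be estimated carefully.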
The EDCs of the de-noised RIRs with different values of α₀, compared with the EDCs of the noisy RIRs and the reference RIRs in the time domain, are given in Figure 4. In the time domain, the noise was remarkably reduced by the GBSS algorithm with α₀ higher than 3. The higher the α₀, the lower the remaining noise level. However, compared with the reference RIRs, the three values performed differently in terms of the energy levels around the knee (about 0.28 s), leading to different generated EDCs. In the EDCs presented in Figure 4a, a value of 3 resulted in the worst dynamic range improvement of the three values. The amplitudes of the de-noised RIR presented in Figure 4b were higher than those of the reference RIR at the knee. In such cases, much of the noise was removed, but not all of it. Further increasing the α₀ value to 4.25 decreased the amplitudes at the knee until the energy levels were almost equal to those of the reference RIR around the knee (see Figure 4c), yielding the largest dynamic range improvement (about −13 dB). At the same time, the EDC overlapped with the reference up to a cross point at the decay level of −15 dB. This cross point is called the critical point for the largest dynamic range obtained using the optimal over-subtraction factor α₀. For a higher value of 6, although the noise levels were clearly reduced, the amplitudes around the knee decreased significantly, as shown in Figure 4d. Relative energy loss occurred in the original noisy signal around the knee, causing the corresponding EDC to diverge from the reference EDC and drop below it in the range of −15 dB to −24 dB. As a result, the estimated decay level at the corresponding cross point is smaller than the reference results. Most importantly, the reverberation decay rate was severely degraded relative to the reference.
It was observed that RIR de-noising showed a strong dependence on the over-subtraction factor α₀ of the GBSS algorithm. Changing α₀ led to different cross points with the reference EDCs, and the de-noised EDCs were strongly related to the reduced noise levels in the noisy RIRs. On the one hand, the cross point located on the de-noised EDCs did not extend to the critical point until the noise levels around the knee were removed. On the other hand, the reverberation decay part of the signal was lost when an α₀ value higher than the optimum was used, leading to a decay rate distortion at the knee and causing more significant deviations of the EDCs relative to the reference EDCs. Furthermore, the optimal factors providing the best dynamic range improvements varied with the noise level. For example, with the optimal factor of 4, the dynamic range improvement was larger than 20 dB when the noise level was lower than −60 dB, while a 15 dB improvement was achieved using 4.25 at a noise level of −45 dB.
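One way to locate the cross point between a de-noised EDC and the reference EDC is to scan both curves and stop where they first disagree by more than a tolerance. The sketch below illustrates this idea only; the tolerance of 0.5 dB and the function name are assumptions made for the example and are not values from the paper.

```python
def critical_point_index(denoised_edc_db, reference_edc_db, tolerance_db=0.5):
    """Return the index of the last sample before the de-noised EDC departs
    from the reference EDC by more than tolerance_db.

    Returns None if the curves already differ at the first sample. Both
    curves are assumed to be sampled at the same time instants.
    """
    last_matching = None
    for i, (d, r) in enumerate(zip(denoised_edc_db, reference_edc_db)):
        if abs(d - r) > tolerance_db:
            break
        last_matching = i
    return last_matching


reference = [0.0, -5.0, -10.0, -15.0, -20.0]
denoised = [0.0, -5.1, -10.2, -15.1, -23.0]
idx = critical_point_index(denoised, reference)
print(idx, denoised[idx])  # index and decay level of the critical point
```

Sweeping the over-subtraction factor and keeping the setting whose critical point lies lowest on the curve mirrors the selection described above.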
4.3. Over-Subtraction Factor for the Octave Band RIRs and Different Noises
Because the most promising factors of the GBSS change for RIRs with different noise levels, and the obtained dynamic range varies accordingly, it was essential to find a rule for choosing the factors in different situations. Hence, the three measured RIRs with low noise levels mentioned in Section 3.1, filtered in octave bands and convolved with white noise and pink noise at noise levels ranging from −65 dB to −40 dB, were used to determine the optimal factors. Based on the study above, removing the noise around the knee estimated from the RIR contributed to the largest dynamic range improvement. In this case, the factors set in the GBSS were the most promising. The acoustic parameters calculated from the EDCs of the RIRs filtered in the octave bands were used to verify the results and to ensure that the applied optimal factors did not distort the EDCs during noise subtraction with the GBSS. Since the optimal factors were determined using the knees obtained with the compensation method, the corresponding RTs were compared with the reference RTs. The RTs calculated at the critical point on the de-noised EDCs were compared with the values of the compensation method and the reference results to ensure that the decay levels achieved at the critical points were the best. At the same time, EDT, T15, and T20 were used to verify the GBSS algorithm and were compared with the reference results, as listed in Table 2.
The EDCs generated from the RIRs of the meeting room filtered in the octave bands with noise levels of −60 dB are presented in Figure 5; the noise level estimated from the RIRs in each frequency band for the convolved white noise was the same as that for the pink noise. In this case, the knees located on the EDCs obtained using the S–T–C method were similar in each frequency band. The largest difference between the knees estimated for the pink noise and the white noise was 0.02 s in the octave band at 2000 Hz, causing an approximately 1.5 dB difference in the dynamic ranges of the EDCs. However, the EDCs show a slight but visible deviation from the reference EDCs at the knees, with a maximum difference in decay level of 2 dB.
Figure 6 compares the GBSS with the reference and compensation methods for RT estimation at the critical points for white and pink noise. Compared to the reference results, the maximum deviation of the RTs for the GBSS algorithm was 0.13 s at 2000 Hz, and the dynamic range improvement for white noise differed from that for pink noise by about 2 dB. The compensation method generally produced similar results to the GBSS algorithm. This is why the knee can be used as the endpoint for the EDCs obtained using the compensation method, but is not adopted as the endpoint for EDCs obtained with the GBSS algorithm. On the other hand, the EDCs coincide with the reference EDCs above the critical points, which are positioned above the knees. At the same time, there was no apparent deviation of the three EDCs at the critical points.
The ranges from −5 dB to the decay levels estimated at the critical points were used to estimate the RTs presented in Table 2. T20 was considered for comparison at a noise level of −60 dB because the decay levels estimated at the critical points from the EDCs with the two noises were each in the range of −26 dB to −35 dB. In contrast, T15 was used when the noise level was −50 dB. The maximum differences in EDT and T20 were 0.006 s and 0.007 s, respectively. The deviation between the de-noised EDCs and the reference EDCs was approximately 0.01 dB. The critical points depended strictly on the over-subtraction factor, which determined the improved decay levels of the dynamic range and the accuracy of the resulting reverberation decay rates. The over-subtraction factor α for the segment SNR achieved the best dynamic range improvement with the best α₀ value, which is related to the estimated SNR of the RIRs filtered in the octave bands, as shown in Figure 7a,b. The estimated SNRs and the optimal factors applied to the situation presented in Table 2 gave the same results. The values were 3.75 for frequency bands lower than 500 Hz when the estimated SNRs were higher than 30, and 4 for frequency bands higher than 500 Hz when the estimated SNRs were higher than 25. Consequently, the similar results for the two noises showed that the GBSS algorithm does not depend on the noise type. With the optimal factors, noise subtraction reliably achieves the best dynamic range with no or minimal degradation of the reverberation decay.
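Decay times such as EDT, T15, and T20 are commonly obtained by fitting a straight line to the EDC over a level range and extrapolating the slope to 60 dB of decay. The following is a minimal sketch of that fit, assuming the conventional evaluation ranges (0 to −10 dB for EDT, −5 to −20 dB for T15, −5 to −25 dB for T20); the function name and defaults are illustrative and not taken from the paper.

```python
def decay_time(times, levels_db, upper_db=-5.0, lower_db=-25.0):
    """Estimate a 60 dB decay time from an EDC by least-squares line fitting.

    times: sample times in seconds; levels_db: EDC levels in dB.
    Only points with lower_db <= level <= upper_db enter the fit.
    """
    points = [(t, l) for t, l in zip(times, levels_db)
              if lower_db <= l <= upper_db]
    if len(points) < 2:
        raise ValueError("not enough EDC points inside the evaluation range")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(l for _, l in points) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in points)
    s_tl = sum((t - mean_t) * (l - mean_l) for t, l in points)
    slope = s_tl / s_tt  # dB per second, negative for a decaying curve
    if slope >= 0.0:
        raise ValueError("EDC does not decay inside the evaluation range")
    return -60.0 / slope


# Ideal EDC that falls by 60 dB per second, sampled every 10 ms.
times = [i / 100.0 for i in range(101)]
levels = [-60.0 * t for t in times]
t20 = decay_time(times, levels, -5.0, -25.0)
edt = decay_time(times, levels, 0.0, -10.0)
```

In the evaluation described above, the lower limit would be set to the decay level at the critical point when that level lies above the conventional limit.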
An extended analysis was applied to three broadband-measured RIRs with two types of added noise (pink and white) at noise levels ranging from −40 dB to −65 dB. Figure 7c shows that the optimal factors of α₀, in the range of 3.75 to 5, depended on the SNR estimated in the octave bands. The higher the SNR, the lower the over-subtraction factor α₀. The optimal α₀ was 3.75 for an SNR higher than 30, 4 for an SNR higher than 24, and 4.25 for an SNR in the range from 10 to 24. Figure 7d shows the mean dynamic range improvements achieved in the octave bands by applying the most reliable optimal factor α₀, ranging from 3.75 to 4.25 for an SNR higher than 10. The dynamic range improvement achieved for noise levels lower than −60 dB was around 15 dB to 20 dB. The improvement decreased slightly with increasing noise level, reaching about 13 dB in the mean dynamic range at −40 dB. In most cases, the reverberation decay range could be extended to −10 dB. The deviation of the improvement was less than 2 dB, and the mean value and deviation of the noise level reduction were approximately 9 dB and 2.5 dB, respectively.
When the SNR was lower than 10, the applied optimal factors of α₀ were higher than 4.5, and the noise levels estimated in the frequency bands were lower than 40 dB. In this case, the mean dynamic range improvement of the EDC ranged from 3 dB to 8 dB, leaving a decay range of less than 10 dB for calculating the RT. When the noise levels were higher than −40 dB, most of the SNRs estimated in the octave bands were lower than 10, leading to similarly poor results in the mean dynamic range improvement, approximately 5 dB. The application of the GBSS algorithm did not lead to a significant change in the dynamic range improvement when the SNRs in the octave bands were lower than 10.
The optimal factors depend on the SNRs rather than on the steady noise types and levels. The recommended spectral flooring parameter β and over-subtraction factor α with different desired α₀ of the GBSS algorithm had significant effects on both the dynamic range improvement and the noise floor reduction when the SNRs estimated in the frequency bands were higher than 10. When de-noising measured RIRs with real ambient noise, the α₀ used could be smaller than the optimal factor. A threshold of 0.25 is recommended for lowering the risk of signal over-subtraction, as the resulting difference in the dynamic range is smaller than 1 dB.
Applying the GBSS algorithm assumes that noise affects the entire spectrum of the signal equally. The over-subtraction factor α subtracts an overestimate of the noise over the whole range. Although the noise affects the RIRs uniformly across the entire spectrum, the energy distribution in the frequency bands varies, leading to significantly different estimated SNRs of the RIRs filtered in the octave bands. Thus, the factors α₀ are significantly influenced by the SNRs in each frequency band, which are estimated from the filtered noisy RIR. Hence, the optimal factor α₀ should be set according to the estimated SNR of the original RIR filtered in each frequency band, instead of applying the same constant value of α₀ to every frequency band.
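The SNR-dependent settings reported in this section can be collected into a small lookup, shown here as a hedged sketch. The thresholds are the ones stated in the text; the function names are illustrative, the SNR is assumed to be expressed in dB, and the boundary handling at exactly 24 and 30 is an assumption because the text does not specify it. For SNRs below 10 the text reports factors above 4.5 without a single recommended value, so the lookup returns `None` there.

```python
def optimal_over_subtraction(band_snr):
    """Return the over-subtraction setting reported for an octave-band SNR.

    Thresholds follow the values stated in the text; None is returned
    below 10, where no single setting is reported.
    """
    if band_snr > 30:
        return 3.75
    if band_snr > 24:
        return 4.0
    if band_snr >= 10:
        return 4.25
    return None


def spectral_floor(posterior_snr):
    """Return the spectral flooring parameter reported for the posterior SNR."""
    return 0.001 if posterior_snr >= 0 else 0.05


# One setting per octave band instead of a single constant for all bands.
band_snrs = {250: 32.0, 500: 27.0, 1000: 18.0, 2000: 6.0}  # made-up example values
settings = {band: optimal_over_subtraction(snr) for band, snr in band_snrs.items()}
print(settings)
```

Returning `None` rather than a guessed value keeps the low-SNR case explicit, since the text indicates that the GBSS brings little improvement there.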
4.4. GBSS Method in Measured RIRs with Natural Ambient Noise
In the preceding sections, with the spectral flooring parameter fixed, the optimal factor α₀ leading to the best dynamic range improvement depended on the estimated SNRs of the RIRs filtered in the frequency bands, as presented in Figure 7c. The optimal spectral flooring parameter, β, depended on the averaged posterior SNR of the input signal [21,24]. Based on extensive experiments at noise levels higher than −40 dB, β was set to 0.001 for SNR ≥ 0 and to 0.05 for SNR < 0. The best de-noising results were obtained using the optimal sets of factors in measurements conducted in two normal rooms with real ambient noise. In this part, the reference RIRs are the measured RIRs with a noise level of −60 dB, while the noisy RIRs are the RIRs with noise levels of −40 dB and −46 dB. A detailed comparison of the GBSS algorithm with the noise compensation method and the reference RIRs in terms of the EDCs, EDT, and RTs of the measured RIRs filtered in the octave bands showed that the optimal factors are valid for actual applications. The EDCs at 250 Hz and 2 kHz, given in Figure 8, were taken as examples. The optimal factors α₀ applied in the octave bands were different for the two rooms. The value used in room A was 4.25 because the estimated SNRs of the filtered RIRs in the octave bands ranged from 12 to 18, and a factor of 4 was used for room B because the estimated SNRs of the filtered RIRs in the octave bands ranged from 24.5 to 28. The de-noised EDCs were almost identical to the reference EDCs and the compensated EDCs above the critical point, which is indicated by the dash–dotted vertical line. The reverberation decay ranges were clearly extended using the optimal factors. The mean dynamic range improvements for rooms A and B were 12 dB and 14.4 dB, respectively. The mean noise level reductions for rooms A and B were approximately 7.8 dB and 8.5 dB, respectively.
Table 3 lists the parameters of the EDT and RTs estimated at critical points using the optimal factors for the two rooms, compared with the reference results. The overall differences of the RTs estimated at the critical points, and the EDT of the frequency bands estimated at the de-noised EDCs, were small compared to the reference RIRs and the compensation method. The fluctuations caused by the GBSS algorithm had slight effects on the EDCs, leading to minor deviations in the early reverberation decays. The maximum relative errors of the EDTs for rooms A and B were 0.89% at an octave band of 500 Hz and 1.3% at an octave band of 1000 Hz, respectively. The corresponding time deviations were 0.011 s and 0.007 s, respectively. The maximum relative errors of the RT were 1.09% for room A at an octave band of 1000 Hz and 0.89% for room B at an octave band of 500 Hz. The corresponding time deviations were 0.015 s and 0.007 s, respectively. A comparison with the noise compensation method showed that the GBSS algorithm produced slightly better results when the optimal factors were applied. The mean relative errors of the EDT and RTs were 0.41% and 0.42% using the GBSS algorithm, respectively, whereas the corresponding values were 0.59% and 0.48% using the noise compensation method.
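The relative errors quoted above are assumed here to be the absolute time deviation divided by the reference value. Under that assumption, each reported pair of relative error and time deviation implies an approximate reference time, which gives a quick consistency check on the figures. The sketch below is illustrative only; the implied values are derived from rounded numbers and are not reported in the paper.

```python
def relative_error_percent(estimate, reference):
    """Relative error of an estimated decay time, in percent."""
    return abs(estimate - reference) / reference * 100.0


def implied_reference(deviation_s, relative_error_pct):
    """Reference time implied by a time deviation and its relative error."""
    return deviation_s / (relative_error_pct / 100.0)


# (time deviation in s, relative error in %) as quoted in the text.
quoted = {
    "EDT, room A, 500 Hz": (0.011, 0.89),
    "EDT, room B, 1000 Hz": (0.007, 1.3),
    "RT, room A, 1000 Hz": (0.015, 1.09),
    "RT, room B, 500 Hz": (0.007, 0.89),
}
for label, (deviation, error_pct) in quoted.items():
    print(label, round(implied_reference(deviation, error_pct), 2), "s")
```

Because the quoted deviations and percentages are rounded, the implied reference times are only accurate to a few hundredths of a second.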
The relative errors of the RTs and EDT presented in Figure 9 showed that the optimal factors of the GBSS algorithm had a significant impact on the RIR de-noising. Nevertheless, the EDTs calculated from the de-noised EDCs in the same octave bands were smaller than the reference ones, which indicates that the EDCs went below the reference EDCs, leading to minimal degradation of the reverberation decay. The value of the factor was acceptable because the decay rate was not degraded.
The GBSS method with the optimal factors given in Figure 7c was valid for the measured RIRs. Although the relative errors of the RTs and EDT estimated by the noise compensation method were within the limits, with differences that were barely noticeable [33], that method was sensitive to the correction term and required sophisticated procedures for estimating the truncation time [25]. Overall, the GBSS method is simple to implement and flexible enough to adapt to random noise by adjusting the over-subtraction factors according to the SNRs in the frequency bands instead of using a constant factor. The GBSS algorithm with these promising factors provides a good compromise between noise reduction and distortion of the RIR for acoustic parameter estimation.