The optimal decomposition level for wavelet denoising is determined by evaluating the statistical properties of the denoised signal. Common evaluation indicators include the signal-to-noise ratio (SNR), root mean square error (RMSE), correlation coefficient (R), and smoothness (r), each with a distinct physical meaning. If two metrics that describe the signal from different perspectives (such as signal detail and overall characteristics) are negatively correlated, and both share the same direction of preference (larger is better for both, or smaller is better for both), then in theory a composite of the two can quantify the optimal wavelet decomposition level. This is the theoretical foundation of the composite metric evaluation method.
4.1. Selection of Assessment Indicators
The key to selecting integrated metrics lies in identifying methods that quantitatively describe denoised signal characteristics from various perspectives, such as signal details and approximation information. The common metrics used to evaluate wavelet denoising performance include SNR, RMSE, R and r, which are defined as follows [20]:
where S represents the original signal sequence, against which the denoised signal sequence is evaluated. When the true value is known, S refers to the true signal sequence without noise; when the true value is unknown, S refers to the noisy observed signal sequence.
The RMSE reflects the overall bias of the signal, with smaller values indicating superior denoising performance. The SNR describes the impact of noise on the overall signal, with a higher SNR generally associated with improved filtering efficacy. The smoothness indicates the local variation within the signal, particularly highlighting the presence of abrupt changes; a smoother signal correlates with a lower smoothness index value, signifying enhanced denoising effectiveness. The correlation coefficient quantifies the degree to which the denoised signal aligns with the theoretical reference signal, with values approaching 1 indicating a closer resemblance and a better fit.
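The four metrics described above can be sketched in plain Python. The formulas below are the forms commonly used in the wavelet denoising literature and are an assumption here, since the defining equations are not reproduced in this text; the function names and the argument name `s_hat` (the denoised sequence) are illustrative only.

```python
import math

def rmse(s, s_hat):
    """Root mean square error between reference s and denoised s_hat."""
    n = len(s)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, s_hat)) / n)

def snr_db(s, s_hat):
    """SNR in dB: reference power over residual power (assumed definition)."""
    signal_power = sum(a * a for a in s)
    residual_power = sum((a - b) ** 2 for a, b in zip(s, s_hat))
    return 10.0 * math.log10(signal_power / residual_power)

def correlation(s, s_hat):
    """Pearson correlation coefficient R between s and s_hat."""
    n = len(s)
    mean_s = sum(s) / n
    mean_h = sum(s_hat) / n
    cov = sum((a - mean_s) * (b - mean_h) for a, b in zip(s, s_hat))
    var_s = sum((a - mean_s) ** 2 for a in s)
    var_h = sum((b - mean_h) ** 2 for b in s_hat)
    return cov / math.sqrt(var_s * var_h)

def smoothness(s, s_hat):
    """Smoothness r: squared first differences of s_hat relative to
    those of s (assumed definition); smaller means smoother."""
    num = sum((s_hat[i + 1] - s_hat[i]) ** 2 for i in range(len(s_hat) - 1))
    den = sum((s[i + 1] - s[i]) ** 2 for i in range(len(s) - 1))
    return num / den
```

When the true signal is unknown, `s` would be the noisy observation, which is why these metrics then tend to change steadily with the decomposition level instead of peaking at an optimum.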
To illustrate the limitations of relying on a single metric when the original clean signal is unknown, this paper takes the determination of the optimal decomposition level from a single metric as a case study, applying the wavelet threshold denoising method to a simulated signal and evaluating the denoising effect at various decomposition levels. To ensure that the simulated signal closely resembles an actual monitored signal, sine signals at three distinct frequencies, linear signals and noise signals are superimposed, with the corresponding expression given as follows:
where t represents the time series and the noise term represents the simulated noise sequence.
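A signal of this kind can be sketched as follows. The frequencies, amplitudes and slope are hypothetical placeholders, because the actual expression is not reproduced in this text; only the 1024-point length, the 1 Hz sampling frequency and the 15 dB SNR come from this section. The noise is rescaled so that the requested SNR is met exactly.

```python
import math
import random

def simulate_signal(n=1024, fs=1.0, target_snr_db=15.0, seed=0):
    """Return (t, clean, noisy): three sines + linear trend + Gaussian noise.

    The frequencies, amplitudes and slope are hypothetical placeholders,
    not the values used in the paper.
    """
    rng = random.Random(seed)
    t = [i / fs for i in range(n)]
    freqs = (0.005, 0.02, 0.05)  # hypothetical frequencies in Hz
    amps = (2.0, 1.0, 0.5)       # hypothetical amplitudes
    slope = 0.002                # hypothetical linear trend
    clean = [
        sum(a * math.sin(2.0 * math.pi * f * ti) for a, f in zip(amps, freqs))
        + slope * ti
        for ti in t
    ]
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Scale the noise so that 10*log10(P_signal / P_noise) == target_snr_db.
    p_signal = sum(c * c for c in clean) / n
    p_raw = sum(e * e for e in raw) / n
    scale = math.sqrt(p_signal / (p_raw * 10.0 ** (target_snr_db / 10.0)))
    noisy = [c + scale * e for c, e in zip(clean, raw)]
    return t, clean, noisy
```

Keeping `clean` alongside `noisy` makes it possible to evaluate the metrics under both the known and the unknown true value conditions from the same realization.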
The SNR of the simulated data is 15 dB, with a signal length of 1024 sampling points and a sampling frequency of 1 Hz. Denoising is performed using the db4 wavelet basis function, with the decomposition levels ranging from 2 to 10; the optimal decomposition level is then determined from each of the four traditional metrics described above. The results are presented as trend lines: Figure 2 and Figure 3 illustrate the trend lines obtained under the conditions of known and unknown true values, respectively. An analysis of Figure 2 reveals that, when the true values are known, the optimal decomposition level determined by the RMSE, SNR and correlation coefficient is 4, whereas the smoothness metric does not yield a conclusive judgment. An analysis of Figure 3 indicates that, when the true values are unknown, none of the four metrics can accurately determine the optimal decomposition level: the SNR, correlation coefficient and smoothness decrease as the decomposition level increases, whereas the RMSE increases, so no metric exhibits an extremum that would mark an optimal level. Therefore, when the true values are unknown, no traditional single metric suffices to determine the optimal wavelet decomposition level.
In practical applications, the reliability of the correlation coefficient is limited due to the absence of known true values for the signals. Consequently, this paper employs the previously mentioned simulated data to develop a composite evaluation indicator that incorporates geometric significance, physical significance and the Pearson correlation coefficient among the three metrics: RMSE, SNR, and smoothness.
Because the value ranges and units of the three metrics differ, they are normalized to the common range [0, 1] for easier comparison. Furthermore, since smaller values are preferable for RMSE and smoothness whereas larger values are preferable for SNR, a trend adjustment is applied to the SNR so that all three metrics share the same direction of preference. The specific processing formulas are presented as follows [21]:
where $P_{RMSE}$, $P_{SNR}$ and $P_r$ represent the normalized and trend-adjusted RMSE, SNR and smoothness, respectively, with the subscript i indicating the wavelet decomposition level, where i = 1, 2, …, 10.
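A minimal sketch of this preprocessing, assuming min-max normalization and assuming that the trend adjustment simply reverses the scale for a larger-is-better metric such as SNR (the processing formulas themselves are not reproduced in this text). The function name and the `larger_is_better` flag are illustrative.

```python
def normalize(values, larger_is_better=False):
    """Min-max normalize to [0, 1]; reverse the scale when larger is better.

    After this step, smaller values are preferable for every metric.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("metric is constant across decomposition levels")
    if larger_is_better:
        # Trend adjustment (assumed form): the best (largest) value maps to 0.
        return [(hi - v) / span for v in values]
    return [(v - lo) / span for v in values]
```

For example, RMSE and smoothness would be passed with the default flag, while SNR would be passed with `larger_is_better=True`.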
A Pearson correlation analysis was performed on the three metrics after normalization and trend adjustment. The correlations among the metrics are presented in Table 2.
As indicated in Table 2, the SNR and RMSE are strongly correlated, with a Pearson correlation coefficient of 0.996, indicating that both metrics characterize the detailed information of the signal and are therefore largely redundant. Consequently, this study discards SNR and retains RMSE as one component of the composite evaluation indicator. To also capture the overall characteristics of the denoised signal, smoothness is chosen as the second component. RMSE and smoothness measure the detailed information and the overall approximation of the signal, respectively, and the two metrics are negatively correlated, with a correlation coefficient of −0.796.
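The screening step can be sketched as a pairwise Pearson analysis that flags strongly positively correlated metrics as redundant. The 0.95 threshold and the function names are hypothetical choices for illustration; the text reports only the resulting coefficients (0.996 and −0.796), not a cutoff.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def redundant_pairs(metrics, threshold=0.95):
    """Return metric-name pairs whose correlation is >= threshold.

    `metrics` maps a name to its normalized, trend-adjusted values per level.
    The threshold is a hypothetical cutoff, not a value from the paper.
    """
    pairs = []
    for a, b in combinations(metrics, 2):
        if pearson(metrics[a], metrics[b]) >= threshold:
            pairs.append((a, b))
    return pairs
```

A pair flagged here carries nearly the same information, so only one member needs to enter the composite indicator; a negatively correlated pair is kept because the two metrics pull in opposite directions.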
When employing a composite evaluation indicator that integrates RMSE and smoothness, as the decomposition level increases, the evaluation indicator will inevitably attain an extremum. The physical significance of this extremum is that it represents the optimal balance between preserving detailed information and ensuring the overall approximation of the signal, at which point the decomposition level is deemed optimal.
4.2. Construction of Composite Assessment Indicators
Because RMSE and smoothness describe different signal features, they contribute unequally to the composite indicator, and a weighting scheme is required. Commonly used objective weighting methods include the entropy weight method and the coefficient of variation method. The entropy weight method allocates weights based on information entropy, making it suitable for scenarios with relatively stable data and minimal subjective judgment; it is particularly advantageous for complex systems that involve large volumes of information. In contrast, the coefficient of variation method assigns weights according to the data's volatility, making it better suited to data with significant fluctuations, where it better reflects the variability and influence of the indicators. In this paper, the difference coefficient method is used to assign combination coefficients to the weights obtained from the entropy weight method and the coefficient of variation method. The resulting composite evaluation index makes the allocation of weights more scientific and reasonable.
4.2.1. Entropy Weight Method
The entropy weight method is an objective weighting technique that establishes weights according to the information content of each indicator. A smaller entropy value for an indicator signifies a greater degree of variation and richer information, which increases its significance in the comprehensive evaluation and yields a correspondingly larger weight. This study calculates the weights for the two indicators, RMSE and smoothness; the specific steps are as follows [12,22]:
First, calculate the proportion that the normalized and trend-adjusted RMSE and smoothness (r) at the i-th decomposition level contribute to their respective totals across all decomposition levels.
where i represents the decomposition level, ranging from 1 to k, and k denotes the highest decomposition level, which is determined by Equation (7).
Calculate the information entropy of the two main indicators.
Calculate the weights of the two evaluation indicators.
Calculate the composite evaluation indicator.
From the definitions of RMSE and smoothness and from the weight allocation process, it follows that a smaller composite evaluation indicator corresponds to better denoising at that decomposition level, that is, a more thorough removal of noise. Therefore, the decomposition level corresponding to the minimum value of the composite evaluation indicator is considered to be the theoretical optimal level based on the entropy weight method.
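The four steps above can be sketched as follows, assuming the standard form of the entropy weight method (proportions, entropy scaled by ln k, weights proportional to one minus the entropy); the equations themselves are not reproduced in this text, and the function names are illustrative. Zero proportions, which min-max normalization always produces, are skipped under the usual convention that 0·ln 0 = 0.

```python
import math

def entropy_weights(indicators):
    """Entropy weights for a dict {name: normalized values per level}.

    Assumes at least two levels and non-negative values with a positive sum.
    """
    k = len(next(iter(indicators.values())))
    redundancy = {}
    for name, values in indicators.items():
        total = sum(values)
        shares = [v / total for v in values]  # step 1: proportions
        # Step 2: information entropy, scaled to [0, 1] by ln(k).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(k)
        redundancy[name] = 1.0 - entropy
    norm = sum(redundancy.values())
    # Step 3: weights proportional to the information redundancy.
    return {name: d / norm for name, d in redundancy.items()}

def composite(indicators, weights):
    """Step 4: weighted sum of the indicators at each decomposition level."""
    k = len(next(iter(indicators.values())))
    return [
        sum(weights[name] * indicators[name][i] for name in indicators)
        for i in range(k)
    ]
```

The level with the smallest composite value would then be taken as optimal, for instance via `min(range(k), key=values.__getitem__)` plus the first level number.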
4.2.2. Variation Coefficient Method
The coefficient of variation method is an objective statistical weighting technique employed to assess the degree of variation for each indicator. In the evaluation system, indicators with greater variation are more effective in highlighting the differences between evaluation units, and thus should be assigned higher weights, while indicators with smaller variation should receive lower weights. This method determines the significance of each indicator based on its statistical characteristics [11,21].
First, calculate the coefficient of variation based on the mean and standard deviation.
The weight of the indicator is determined by the coefficient of variation.
Here, the standard deviation and the mean are computed separately for RMSE and for smoothness; together they give the coefficient of variation of each indicator, and the two coefficients of variation in turn give the weights of RMSE and smoothness, respectively.
Calculate the composite evaluation indicator.
Similarly, a smaller composite evaluation indicator corresponds to better denoising at that decomposition level, that is, a more thorough removal of noise. Therefore, the decomposition level corresponding to the minimum value of the composite evaluation indicator is considered to be the theoretical optimal level based on the coefficient of variation method.
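A minimal sketch of the coefficient of variation weighting, assuming the usual definition (standard deviation divided by mean, then normalized so the weights sum to 1); the function name is illustrative. The population standard deviation is used here as an assumption, but the choice does not affect the weights, because the sample and population versions differ by the same factor for every indicator.

```python
import statistics

def cv_weights(indicators):
    """Coefficient-of-variation weights for {name: values per level}.

    Assumes each indicator has a non-zero mean.
    """
    cv = {
        name: statistics.pstdev(values) / statistics.mean(values)
        for name, values in indicators.items()
    }
    total = sum(cv.values())
    return {name: c / total for name, c in cv.items()}
```

The composite indicator is then the same weighted sum as in the entropy weight method, with these weights substituted.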
4.2.3. Combined Weighting Method
After the weights have been determined by the two methods above, a combined weighting method is developed by integrating the entropy weight method with the coefficient of variation method, in order to offset their respective limitations and minimize the loss of valuable information. This makes the indicator weights more objective and justifiable and yields the final comprehensive weight. Let the weight vector determined by the coefficient of variation method be T, and the weight vector determined by the entropy weight method be F. By applying an additive combination of the two, the final comprehensive weight vector H is derived.
Here, the two undetermined coefficients of the combined weighting method, one applied to each weight vector, can be determined by applying the difference coefficient method.
where T and F represent the weights of the coefficient of variation method and the entropy weight method, respectively, and the formula also involves the coefficient of variation; $p_1, p_2, \ldots, p_m$ represent the weights obtained from the coefficient of variation method for each indicator, arranged in ascending order. m denotes the number of weights being combined; in this study, m is set to 2, allowing Equation (30) to be further derived as follows:
The equation presented above serves as the weighting formula for the combined weighting method. The computational process for determining the optimal decomposition level, as proposed in this paper, is illustrated in Figure 4.
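The combined weighting can be sketched as below. Because the paper's equations are not reproduced in this text, the code assumes the form of the difference coefficient method commonly given in the weighting literature: with the components of T sorted in ascending order as p1, …, pm, the difference coefficient is G = (2/m)(1·p1 + 2·p2 + … + m·pm) − (m + 1)/m, the coefficient of T is α = m/(m − 1)·G, and the coefficient of F is β = 1 − α. The function names and the assignment of α to T are assumptions.

```python
def difference_coefficient_alpha(t_weights):
    """Coefficient applied to T, from the assumed difference coefficient form."""
    p = sorted(t_weights)  # ascending order, as the method requires
    m = len(p)
    if m < 2:
        raise ValueError("at least two weights are required")
    g = (2.0 / m) * sum((j + 1) * pj for j, pj in enumerate(p)) - (m + 1.0) / m
    return m / (m - 1.0) * g

def combine_weights(t_weights, f_weights):
    """Additive combination H = alpha * T + beta * F with beta = 1 - alpha."""
    alpha = difference_coefficient_alpha(t_weights)
    beta = 1.0 - alpha
    return [alpha * t + beta * f for t, f in zip(t_weights, f_weights)]
```

Under this assumed form, and with weights that sum to 1, the case m = 2 reduces to α = p2 − p1: the more unevenly the coefficient of variation method splits the weight between the two indicators, the more that method contributes to H.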
The design concept of this paper is illustrated in Figure 4. The original noisy signal is separated into a noise component and a denoised component using a wavelet thresholding function. First, in the noise signal extraction phase, the Jarque–Bera test is applied to check that the extracted noise has the characteristics of Gaussian white noise, thereby guarding against inadequate or distorted denoising. Decomposition levels that pass the Jarque–Bera test are retained, while levels that do not pass are discarded.
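The Jarque–Bera check on the extracted noise can be sketched in plain Python. The statistic is computed from the sample skewness and kurtosis and compared against its asymptotic chi-square distribution with two degrees of freedom, whose survival function is exp(−JB/2). The 5% significance level and the function names are assumptions; the text does not state the level used. Note that the test addresses normality only, not whiteness.

```python
import math

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) for the sequence x."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of the chi-square distribution with 2 dof.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

def passes_jb(x, alpha=0.05):
    """True if normality is not rejected at level alpha (assumed 5%)."""
    return jarque_bera(x)[1] > alpha
```

A level would be retained when `passes_jb(noisy_minus_denoised)` is true for the residual extracted at that level. For short sequences the chi-square approximation is rough, so a library implementation may be preferable in practice.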
Next, in the denoised signal evaluation phase, the denoising performance is assessed using two metrics: RMSE and smoothness. Since these metrics describe different aspects of the signal, a weighted composite indicator is used for evaluation. To address the limitations of relying on a single weighting method, a combined weighting strategy is proposed that integrates the entropy weight method and the coefficient of variation method. The combined weighting coefficient is then calculated using the difference coefficient method.
By combining the evaluation strategies for noise signal extraction and denoised signal assessment, the optimal decomposition level for the noisy signal at specific wavelet coefficients is determined by the extremum of the composite indicator, traversing from the first decomposition level to the highest. This represents the design concept of this paper.
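The final selection step can be sketched as a traversal over the decomposition levels that keeps only those passing the noise test and returns the one with the smallest composite indicator. The function name and argument layout are hypothetical, and the composite values and pass flags are assumed to have been computed beforehand.

```python
def select_optimal_level(levels, composite_values, passed):
    """Return the retained level with the minimum composite indicator.

    levels: decomposition level numbers, from the first to the highest
    composite_values: composite indicator at each level (smaller is better)
    passed: whether the extracted noise at each level passed the noise test
    """
    candidates = [
        (value, level)
        for level, value, ok in zip(levels, composite_values, passed)
        if ok
    ]
    if not candidates:
        raise ValueError("no decomposition level passed the noise test")
    # Ties on the composite value resolve to the lower level.
    return min(candidates)[1]
```

Filtering before taking the minimum means a level with a low composite value is still rejected when its extracted noise fails the test.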