Article

Determination Method of Optimal Decomposition Level of Discrete Wavelet Based on Joint Jarque–Bera Test and Combination Weighting Method

1
Key Laboratory of Smart Earth, Beijing 100080, China
2
Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
*
Author to whom correspondence should be addressed.
Entropy 2025, 27(2), 108; https://doi.org/10.3390/e27020108
Submission received: 25 December 2024 / Revised: 9 January 2025 / Accepted: 17 January 2025 / Published: 23 January 2025
(This article belongs to the Special Issue Entropy and Time–Frequency Signal Processing)

Abstract

To overcome the limitations of traditional evaluation indicators in determining the optimal wavelet decomposition level, this paper proposes an adaptive method for selecting the best decomposition level by combining the Jarque–Bera test and a composite weighting approach. Firstly, in the noise extraction stage, the Jarque–Bera test is employed to ensure that the extracted noise follows Gaussian white noise characteristics, thereby avoiding issues of insufficient denoising or signal distortion. Secondly, in the evaluation stage of the denoised signal, a comprehensive consideration of the geometric and physical meanings of various evaluation metrics, as well as the Pearson correlation coefficients between them, is undertaken. The RMSE and smoothness are selected as evaluation indicators for the denoising performance. Since these two metrics describe signal characteristics from different dimensions, a weighted combination approach is used to generate a single composite evaluation index. Additionally, to overcome the limitations of using a single weighting method, a composite weighting strategy is proposed by combining the entropy weight method and the coefficient of variation method. The composite coefficient between these two weighting methods is calculated using the variance coefficient method, yielding a new composite evaluation metric. A smaller value of this metric indicates better denoising performance, and the corresponding optimal decomposition level is more accurately determined. The simulation results demonstrate that the proposed comprehensive evaluation method can accurately determine the optimal wavelet decomposition level in both known and unknown truth-value cases, exhibiting a high accuracy and good applicability. 
Furthermore, the experimental results show that using the optimal decomposition level determined by the proposed method for wavelet denoising leads to smoother peak regions, more stable waveforms and significantly improved denoising performance.

1. Introduction

According to the Central Limit Theorem, under consistent measurement conditions, the superposition of multiple independent noise sources tends to result in a normal distribution, making it possible to model the noise in real-world measurements as Gaussian white noise for analysis [1,2].
In recent years, wavelet analysis, with its excellent time–frequency localization, multi-resolution capabilities, and ability to handle nonlinear problems, has been extensively applied in areas such as geodetic data denoising [3,4]. By processing signals using wavelet transform, noise interference can be effectively reduced, thus improving the accuracy and reliability of the results. However, the effectiveness of wavelet denoising largely depends on the appropriate choice of decomposition levels, as an improper selection may result in insufficient denoising or signal distortion. Therefore, accurately identifying the optimal decomposition levels to achieve the best denoising performance has become a key focus in both academia and engineering practice [5,6].
The current methods for determining the optimal wavelet denoising decomposition level can be classified into two categories: (1) methods based on noise characteristics and (2) methods based on signal characteristics [7]. Methods based on noise characteristics typically assume that the noise in the signal is white noise and determine the decomposition level by setting hypothesis test conditions. If these conditions are not met, the decomposition level is rejected. Common tests include the Jarque–Bera test and the Kolmogorov–Smirnov test [8]. By assessing whether the wavelet coefficients at each level exhibit white noise characteristics, a reasonable decomposition level can be adaptively selected. However, relying solely on noise characteristics for qualitative judgment is often inaccurate because multiple decomposition levels may pass the test, making it challenging to identify a single optimal level [8,9,10].
Another category of methods relies on the inherent characteristics of the signal, utilizing indicators such as the signal-to-noise ratio (SNR), root mean square error (RMSE), smoothness (r) and correlation coefficient (R) to guide the selection of decomposition levels [5]. However, the unknown true value of the signal makes it difficult for a single evaluation indicator to assess the denoising effect accurately, resulting in significant limitations. Numerous scholars have proposed weighting methods, such as the entropy weight method and the coefficient of variation method, which extend and integrate traditional metrics, such as SNR, RMSE and smoothness, to quantitatively identify the inflection points of metric changes, thereby determining the optimal decomposition level for wavelet denoising more precisely [11,12]. Nonetheless, the methods for determining optimal decomposition levels based on signal characteristics continue to face challenges, including indicator selection, the choice of weighting methods and the evaluation of noise statistical characteristics [13].
Overall, while the two aforementioned methods have achieved notable success, they still exhibit limitations in accuracy and the interpretation of physical significance. Furthermore, studies on the modeling and analysis of Gaussian white noise remain relatively limited. Therefore, this study integrates the statistical characteristics of noise with the mathematical properties of signals by employing the Jarque–Bera test for qualitative noise analysis. By integrating this approach with a weighted combination method, the optimal wavelet decomposition level is determined, leading to more effective wavelet denoising. A theoretical analysis and case studies demonstrate that the proposed method exhibits clear geometric and physical significance. The algorithm is straightforward, highly accurate and practical for determining the optimal decomposition level, offering a broad applicability in engineering practice.

2. Wavelet De-Noising Function Model

For a square-integrable function $\psi(t) \in L^{2}(R)$, its Fourier transform is required to satisfy the following admissibility condition [14,15]:
$$\int_{R} \frac{\left|\hat{\psi}(t)\right|^{2}}{\left|t\right|}\,dt < \infty \tag{1}$$
where $\hat{\psi}(t)$ represents the Fourier transform of $\psi(t)$; a function $\psi(t)$ that satisfies this condition is designated as the wavelet function. The collection of functions generated by the wavelet function $\psi(t)$ through the scale factor $a$ and the translation factor $b$ is referred to as the continuous wavelet, specifically,
$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right) \tag{2}$$
where $a > 0$ and $b$ are real numbers.
The continuous wavelet transform of the one-dimensional noisy signal S is given as follows:
$$W_S(a,b) = \langle S, \psi_{a,b} \rangle = a^{-1/2}\int_{-\infty}^{+\infty} S(t)\,\bar{\psi}\!\left(\frac{t-b}{a}\right)dt \tag{3}$$
where $\bar{\psi}$ denotes the conjugate of $\psi$. In practical engineering applications, continuous wavelets are commonly discretized by selecting discrete values for the scale factor $a$ and the translation factor $b$, thereby facilitating analysis and processing using computers. This process is defined as follows:
$$a = a_0^{j}, \qquad b = k\,a_0^{j}\,b_0, \qquad j, k \in Z \tag{4}$$
where j and k represent the scale factor and translation factor of the discrete wavelet, respectively.
The corresponding discrete wavelet transform is defined as follows:
$$D_S(j,k) = \langle S, \psi_{j,k} \rangle = a_0^{-j/2}\int_{-\infty}^{+\infty} S(t)\,\bar{\psi}\!\left(a_0^{-j}t - k b_0\right)dt \tag{5}$$
Binary wavelets are especially well suited for computer analysis and provide a high computational efficiency. The scale factor is usually discretized using a binary approach, while the translation factor is discretized in multiples of binary integers. In the formula for the discrete wavelet basis function, $a_0$ is set to 2 and $b_0$ is set to 1, which gives the following:
$$d_{j,k} = 2^{-j/2}\int_{-\infty}^{+\infty} S(t)\,\bar{\psi}\!\left(2^{-j}t - k\right)dt \tag{6}$$
Additionally, based on the conversion formulas for exponents and logarithms,
$$N = 2^{J} \;\Rightarrow\; J = \log_2 N \tag{7}$$
where $J$ represents the maximum level of wavelet decomposition, while $N$ denotes the length of the noisy signal $S$. The maximum decomposition level depends solely on the length $N$ of the signal. In this paper, the maximum decomposition level is determined by taking the floor, $J = \lfloor \log_2 N \rfloor$.
In practical engineering applications, effective signals are generally characterized by low-frequency components or relatively stable states, while noise is predominantly concentrated in the high-frequency range. Therefore, this paper employs a multi-level wavelet decomposition method to process noise. The rules governing the discrete wavelet transform (DWT) decomposition are illustrated in Figure 1.
Figure 1 illustrates the discrete wavelet decomposition transform, where fs denotes the sampling frequency, Ai represents the low-frequency signal component in the wavelet decomposition and Di signifies the high-frequency signal component. Noise is typically incorporated into Di, with the subscript i indicating the corresponding decomposition level. Subsequently, thresholding is applied to the wavelet coefficients, and the resulting signal is reconstructed to achieve the goal of denoising.
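The decompose, threshold and reconstruct cycle described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this paper: it substitutes the Haar wavelet for the db4 basis so that only the standard library is needed, it takes the threshold as an input instead of deriving it from a threshold criterion, and all function names are hypothetical.

```python
import math

def max_level(n):
    """Maximum decomposition level J = floor(log2(N)) for a signal of length n."""
    return math.floor(math.log2(n))

def haar_step(x):
    """One DWT level: split x (even length) into approximation A and detail D."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) / s for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) / s for i in range(half)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert haar_step: rebuild the finer-scale sequence from A and D."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out

def soft_threshold(c, thr):
    """Shrink coefficient c toward zero by thr (soft thresholding)."""
    return math.copysign(max(abs(c) - thr, 0.0), c)

def haar_denoise(signal, level, thr):
    """Decompose to `level`, soft-threshold every D_i, then reconstruct.

    Returns (denoised, noise), where noise = signal - denoised.
    """
    if level < 1 or len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    approx, details = list(signal), []
    for _ in range(level):
        approx, d = haar_step(approx)
        details.append([soft_threshold(c, thr) for c in d])
    for d in reversed(details):  # coarsest detail level is reconstructed first
        approx = haar_inverse_step(approx, d)
    noise = [s - a for s, a in zip(signal, approx)]
    return approx, noise
```

With a threshold of zero the reconstruction returns the input unchanged, while a very large threshold removes every detail coefficient and leaves only the coarsest approximation.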

3. Qualitative Analysis of Gaussian White Noise

Common methods for assessing normality include the skewness–kurtosis test and the Jarque–Bera test. Relevant studies indicate that, across varying sample sizes, the skewness and kurtosis test statistics fail to control the probability of a Type I error at the significance level of 0.05; thus, their use for normality assessment is not advised.
However, when the sample size exceeds 50, the Jarque–Bera test effectively controls the probability of a Type I error at the significance level. Furthermore, while controlling the probability of a Type I error, the probability of a Type II error for the Jarque–Bera test is significantly lower than that of other test statistics. Moreover, when the sample size exceeds 100, the probability of committing a Type II error using the Jarque–Bera test approaches zero [1,16,17,18].
Consequently, this paper employs the Jarque–Bera test to conduct a qualitative analysis of Gaussian white noise, thereby ensuring its validity.
The skewness test examines the symmetry of the data distribution, and its mathematical expression is as follows [19]:
$$\mathrm{Skewness} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(S(t_i) - E(S)\right)^{3}}{\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S(t_i) - E(S)\right)^{2}}\right)^{3}} \tag{8}$$
The skewness of a normal distribution is 0. If the sample skewness is close to zero, this indicates that the sample data are symmetrically distributed around the mean. A negative skewness indicates that the distribution has a longer tail to the left of the mean, whereas a positive skewness indicates a longer tail to the right.
The kurtosis test examines the sharpness of the data distribution and the lengths of the two tails, and its mathematical expression is as follows:
$$\mathrm{Kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(S(t_i) - E(S)\right)^{4}}{\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S(t_i) - E(S)\right)^{2}}\right)^{4}} - 3 \tag{9}$$
The kurtosis is strongly influenced by a few extreme values. The larger the kurtosis, the more extreme the deviation of the values from the mean. With the subtraction of 3 in Equation (9), the kurtosis of the normal distribution is 0. If the sample kurtosis is close to 0, the peak and tails of the sample distribution are comparable to those of a normal distribution.
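When the sample size $N \to \infty$, the sample skewness and kurtosis tend to be normally distributed, with
$$E(\mathrm{Skewness}) \approx 0, \quad E(\mathrm{Kurtosis}) \approx 0, \quad D(\mathrm{Skewness}) \approx \frac{6}{N}, \quad D(\mathrm{Kurtosis}) \approx \frac{24}{N} \tag{10}$$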
where E represents the mathematical expectation and D represents the variance.
The null hypothesis $H_0$ is that the population follows a normal distribution, and the test statistic is as follows:
$$JB = \frac{N}{6}\left(\mathrm{Skewness}^{2} + \frac{1}{4}\mathrm{Kurtosis}^{2}\right) \tag{11}$$
where $JB$ is the test statistic, which under $H_0$ asymptotically follows a chi-square distribution with 2 degrees of freedom. The null hypothesis is accepted when the value of $JB$ is less than the quantile $\chi_P^2$; the quantiles for which Equation (12) is valid are presented in Table 1.
$$P\left\{JB \le \chi_P^2\right\} = \int_{0}^{\chi_P^2} K_2(x)\,dx \tag{12}$$
where $K_2(x)$ denotes the probability density function of the chi-square distribution with 2 degrees of freedom.
In Table 1, $v$ denotes the degrees of freedom, which is set to 2 in this study, and $P$ indicates the confidence level, with 95% selected as the standard for evaluating normality. The value of $\chi_P^2$ is determined by the corresponding values of $P$ and $v$.
In practical applications, the selection of the confidence level should take into account factors such as the problem’s context, data reliability, and sample size.
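The test procedure can be sketched as follows. This is a minimal standard-library illustration with hypothetical function names; it relies on the fact that the chi-square distribution with 2 degrees of freedom has the closed-form cumulative distribution function $1 - e^{-x/2}$, so the quantile can be computed directly instead of being read from a table.

```python
import math

def jarque_bera(x):
    """Return (skewness, excess kurtosis, JB statistic) of the sample x."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + kurt ** 2 / 4.0)
    return skew, kurt, jb

def chi2_quantile_2dof(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)

def passes_jb(x, p=0.95):
    """True if normality is not rejected at confidence level p."""
    return jarque_bera(x)[2] <= chi2_quantile_2dof(p)
```

At the 95% confidence level the quantile evaluates to approximately 5.99, so an extracted noise sequence is treated as Gaussian when its JB value does not exceed that figure.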

4. Quantitative Evaluation of the Optimal Decomposition Level of Wavelets

The optimal decomposition level for wavelet denoising, based on signal characteristics, is determined by evaluating the statistical properties of the denoised signal. Common evaluation indicators include the signal-to-noise ratio (SNR), root mean square error (RMSE), correlation coefficient (R) and smoothness (r), each representing distinct physical meanings and characteristics. If two metrics that describe the signal from different perspectives (such as signal detail and overall characteristics) are negatively correlated and share the same preferred direction (both better when larger, or both better when smaller), then theoretically, a composite of the two can quantify the optimal wavelet decomposition level. This serves as the theoretical foundation for the composite metric evaluation methods.

4.1. Selection of Assessment Indicators

The key to selecting integrated metrics lies in identifying methods that quantitatively describe denoised signal characteristics from various perspectives, such as signal details and approximation information. The common metrics used to evaluate wavelet denoising performance include SNR, RMSE, R and r, which are defined as follows [20]:
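$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S(t_i) - \hat{S}(t_i)\right)^{2}} \tag{13}$$
$$\mathrm{SNR} = 10\lg\frac{\sum_{i=1}^{N} S(t_i)^{2}}{\sum_{i=1}^{N}\left(S(t_i) - \hat{S}(t_i)\right)^{2}} \tag{14}$$
$$R = \frac{N\sum_{i=1}^{N} S(t_i)\hat{S}(t_i) - \sum_{i=1}^{N} S(t_i)\sum_{i=1}^{N}\hat{S}(t_i)}{\sqrt{N\sum_{i=1}^{N} S(t_i)^{2} - \left(\sum_{i=1}^{N} S(t_i)\right)^{2}} \times \sqrt{N\sum_{i=1}^{N}\hat{S}(t_i)^{2} - \left(\sum_{i=1}^{N}\hat{S}(t_i)\right)^{2}}} \tag{15}$$
$$r = \frac{\sum_{i=1}^{N-1}\left(\hat{S}(t_{i+1}) - \hat{S}(t_i)\right)^{2}}{\sum_{i=1}^{N-1}\left(S(t_{i+1}) - S(t_i)\right)^{2}} \tag{16}$$
where $S$ represents the original signal sequence and $\hat{S}$ represents the denoised signal sequence. When the true value is known, $S$ refers to the true signal sequence without noise; when the true value is unknown, $S$ refers to the noisy observed signal sequence.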
The RMSE reflects the overall bias of the signal, with smaller values indicating superior denoising performance. The SNR describes the impact of noise on the overall signal, with a higher SNR generally associated with improved filtering efficacy. The smoothness indicates the local variation within the signal, particularly highlighting the presence of abrupt changes; a smoother signal correlates with a lower smoothness index value, signifying enhanced denoising effectiveness. The correlation coefficient quantifies the degree to which the denoised signal aligns with the theoretical reference signal, with values approaching 1 indicating a closer resemblance and a better fit.
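The four indicators can be computed directly from their definitions. The following is a minimal standard-library sketch with hypothetical function names, in which `s` is the reference sequence (the true signal when it is known, otherwise the noisy observation) and `s_hat` is the denoised sequence.

```python
import math

def rmse(s, s_hat):
    """Root mean square error between reference s and denoised s_hat."""
    n = len(s)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, s_hat)) / n)

def snr_db(s, s_hat):
    """Signal-to-noise ratio in dB: signal energy over residual energy."""
    signal_energy = sum(a * a for a in s)
    residual_energy = sum((a - b) ** 2 for a, b in zip(s, s_hat))
    return 10.0 * math.log10(signal_energy / residual_energy)

def correlation(s, s_hat):
    """Pearson correlation coefficient R between s and s_hat."""
    n = len(s)
    sum_s, sum_h = sum(s), sum(s_hat)
    num = n * sum(a * b for a, b in zip(s, s_hat)) - sum_s * sum_h
    den_s = n * sum(a * a for a in s) - sum_s ** 2
    den_h = n * sum(b * b for b in s_hat) - sum_h ** 2
    return num / math.sqrt(den_s * den_h)

def smoothness(s, s_hat):
    """Ratio of squared first differences of s_hat to those of s."""
    num = sum((s_hat[i + 1] - s_hat[i]) ** 2 for i in range(len(s_hat) - 1))
    den = sum((s[i + 1] - s[i]) ** 2 for i in range(len(s) - 1))
    return num / den
```

For example, a denoised sequence that differs from the reference by a constant offset has R = 1 and r = 1 but a nonzero RMSE, which illustrates why the indicators capture different aspects of the result.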
To further illustrate the limitations of employing a single metric when the original clean signal is unknown, this paper takes the determination of the optimal decomposition level from a single metric as a case study, using the wavelet threshold denoising method to evaluate the denoising effects on a simulated signal across various decomposition levels. To ensure that the simulated signal closely resembles an actual monitored signal, three sine components of different frequencies, a linear trend and a noise sequence are superimposed, with the corresponding expression provided as follows:
$$S = 2.5\sin\left(2\pi t/500\right) + 1.5\sin\left(2\pi t/300\right) + \sin\left(2\pi t/60\right) + 0.003t + \varepsilon \tag{17}$$
where t represents the time series, while ε represents the simulated noise sequence.
The SNR of the simulated data is 15 dB, characterized by a signal length of 1024 sampling points and a sampling frequency of 1 Hz. Denoising is performed using the db4 wavelet basis function, with the decomposition levels ranging from 2 to 10; the optimal decomposition level is determined based on the four aforementioned traditional metrics. The resulting data are presented in the form of a trend. Figure 2 and Figure 3 illustrate the trend lines obtained under the conditions of known and unknown true values, respectively.
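A simulated series of this kind can be generated as in the following minimal sketch. The function names, the fixed random seed and the choice of t = 0, 1, …, N − 1 as the time index are assumptions made for illustration; the noise standard deviation is derived from the mean signal power so that the expected SNR matches the requested value.

```python
import math
import random

def clean_signal(n=1024):
    """Noise-free part of the simulated signal, sampled at t = 0, 1, ..., n-1."""
    return [
        2.5 * math.sin(2 * math.pi * t / 500)
        + 1.5 * math.sin(2 * math.pi * t / 300)
        + math.sin(2 * math.pi * t / 60)
        + 0.003 * t
        for t in range(n)
    ]

def add_gaussian_noise(signal, snr_db, seed=0):
    """Add zero-mean Gaussian white noise so that the expected SNR is snr_db."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    power = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(power / 10 ** (snr_db / 10.0))
    return [v + rng.gauss(0.0, sigma) for v in signal]

truth = clean_signal(1024)
noisy = add_gaussian_noise(truth, snr_db=15.0)
```

Because the noise is random, the SNR realized by any single sequence differs slightly from the nominal value.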
An analysis of Figure 2 reveals that, under conditions of known true values, the optimal decomposition level determined by the RMSE, SNR and correlation coefficient is 4, whereas the smoothness metric does not yield a conclusive judgment. An analysis of Figure 3 indicates that, under the conditions of unknown true values, none of the four metrics can accurately determine the optimal decomposition level. Additionally, the SNR, correlation coefficient and smoothness decrease as the decomposition level increases, whereas the RMSE increases with rising decomposition levels. Therefore, under conditions of unknown true values, no traditional single metric suffices to determine the optimal wavelet decomposition level.
In practical applications, the reliability of the correlation coefficient is limited due to the absence of known true values for the signals. Consequently, this paper employs the previously mentioned simulated data to develop a composite evaluation indicator that incorporates geometric significance, physical significance and the Pearson correlation coefficient among the three metrics: RMSE, SNR, and smoothness.
Given that the base ranges and units of the three metrics differ, these metrics are normalized to a common range of [0, 1] for easier comparison. Furthermore, considering that smaller values are preferable for RMSE and smoothness, whereas larger values are advantageous for SNR, it is essential to apply a trend adjustment to the SNR. The specific processing formulas are presented as follows [21]:
$$PRMSE_i = \frac{RMSE_i - \min(RMSE)}{\max(RMSE) - \min(RMSE)} \tag{18}$$
$$PSNR_i = \frac{\max(SNR) - SNR_i}{\max(SNR) - \min(SNR)} \tag{19}$$
$$Pr_i = \frac{r_i - \min(r)}{\max(r) - \min(r)} \tag{20}$$
where $PRMSE$, $PSNR$ and $Pr$ represent the normalized and trend-adjusted RMSE, SNR and smoothness, respectively, with the subscript $i$ indicating the wavelet decomposition level, where $i = 1, 2, \ldots, 10$.
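The normalization and trend adjustment amount to min-max scaling, with the direction reversed for indicators where larger values are better. A minimal sketch with hypothetical function names follows; each list holds one indicator evaluated at every decomposition level.

```python
def normalize_smaller_is_better(values):
    """Min-max scaling to [0, 1] for indicators such as RMSE and smoothness."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant across levels")
    return [(v - lo) / (hi - lo) for v in values]

def normalize_larger_is_better(values):
    """Min-max scaling with reversed direction for indicators such as SNR."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("indicator is constant across levels")
    return [(hi - v) / (hi - lo) for v in values]
```

After this step, 0 marks the best level and 1 the worst level for every indicator, so the normalized values can be compared and combined directly.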
A Pearson correlation analysis was performed on the three metrics following standardization and trend adjustment. The correlations among the metrics are presented in Table 2.
As indicated in Table 2, a significant correlation exists between the SNR and RMSE, with a Pearson correlation coefficient of 0.996, indicating that both metrics characterize the detailed information of the signal. Consequently, this study discards SNR and retains RMSE as one component of the composite evaluation indicator. Simultaneously, to better capture the overall characteristics of the denoised signal, smoothness is chosen as the second component. RMSE and smoothness measure the detailed information and overall approximation of the signal, respectively. Furthermore, these two metrics are negatively correlated, with a correlation coefficient of −0.796.
When employing a composite evaluation indicator that integrates RMSE and smoothness, as the decomposition level increases, the evaluation indicator will inevitably attain an extremum. The physical significance of this extremum is that it represents the optimal balance between preserving detailed information and ensuring the overall approximation of the signal, at which point the decomposition level is deemed optimal.

4.2. Construction of Composite Assessment Indicators

Because RMSE and smoothness describe signal features in different ways, their weights in the composite differ, necessitating a weighting approach. The commonly used methods for assigning weights include the entropy weight method and the coefficient of variation method. The entropy weight method allocates weights based on information entropy, making it suitable for scenarios with relatively stable data and minimal subjective judgment. It is particularly advantageous when dealing with complex systems that involve large volumes of information. In contrast, the coefficient of variation method assigns weights by evaluating the data’s volatility, making it more suitable for situations with significant fluctuations or variations in the data. This method better reflects the variability and influence of the indicators. In this paper, the difference coefficient method is used to assign combination coefficients to the entropy weight method and the coefficient of variation method. The resulting composite evaluation index makes the allocation of weights more objective and reasonable.

4.2.1. Entropy Weight Method

The entropy weight method is an objective weighting technique that establishes weights according to the information content of each indicator. A smaller entropy value for an indicator signifies a greater degree of variation and a higher richness of information provided, thereby enhancing its significance in comprehensive evaluation, resulting in a correspondingly larger weight. This study calculates the corresponding weights for the two indicators, RMSE and smoothness, and the specific steps are outlined as follows [12,22]:
First, for each of the two normalized and trend-adjusted indicators (RMSE and smoothness r), calculate the proportion contributed by the i-th decomposition level to the sum of that indicator across all decomposition levels.
$$P_i^{PRMSE} = PRMSE_i \Big/ \sum_{j=1}^{k} PRMSE_j, \qquad P_i^{Pr} = Pr_i \Big/ \sum_{j=1}^{k} Pr_j \tag{21}$$
where $i$ represents the decomposition level, ranging from 1 to $k$. Here, $k$ denotes the highest decomposition level, which is determined by Equation (7).
Calculate the information entropy of the two main indicators.
$$e_{PRMSE} = -\frac{1}{\ln k}\sum_{i=1}^{k} P_i^{PRMSE}\ln P_i^{PRMSE}, \qquad e_{Pr} = -\frac{1}{\ln k}\sum_{i=1}^{k} P_i^{Pr}\ln P_i^{Pr} \tag{22}$$
Calculate the weights of the two evaluation indicators.
$$w_{PRMSE} = \frac{1 - e_{PRMSE}}{\left(1 - e_{PRMSE}\right) + \left(1 - e_{Pr}\right)}, \qquad w_{Pr} = \frac{1 - e_{Pr}}{\left(1 - e_{PRMSE}\right) + \left(1 - e_{Pr}\right)} \tag{23}$$
Calculate the composite evaluation indicator.
$$F = w_{PRMSE} \cdot PRMSE + w_{Pr} \cdot Pr \tag{24}$$
By analyzing the definitions of RMSE and smoothness, as well as their weight allocation process, it can be concluded that a smaller composite evaluation indicator corresponds to a more effective denoising effect at the decomposition level, leading to a more thorough removal of noise. Therefore, the decomposition level corresponding to the minimum value of the composite evaluation indicator is considered to be the theoretical optimal level based on the entropy weight method.
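The entropy weighting steps can be sketched as follows. This is a minimal illustration with hypothetical function names. Because min-max normalization maps the best level of each indicator to exactly 0, the sketch adopts the usual convention that a zero proportion contributes nothing to the entropy sum; that convention is an assumption and is not stated in the text.

```python
import math

def entropy(indicator):
    """Normalized information entropy of one indicator across k levels."""
    k = len(indicator)
    total = sum(indicator)
    shares = [v / total for v in indicator]  # proportion at each level
    # Convention (assumed): a zero share contributes 0 to the sum.
    return -sum(p * math.log(p) for p in shares if p > 0) / math.log(k)

def entropy_weights(prmse, pr):
    """Weights of normalized RMSE and smoothness from their entropies."""
    d_rmse, d_r = 1.0 - entropy(prmse), 1.0 - entropy(pr)
    return d_rmse / (d_rmse + d_r), d_r / (d_rmse + d_r)

def composite(prmse, pr, w_rmse, w_r):
    """Composite indicator at every level: weighted sum of both indicators."""
    return [w_rmse * a + w_r * b for a, b in zip(prmse, pr)]
```

An indicator that is identical at every level has an entropy of 1 and therefore receives a weight of 0, which matches the idea that an indicator carrying no discriminating information should not influence the choice of level.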

4.2.2. Variation Coefficient Method

The coefficient of variation method is an objective statistical weighting technique employed to assess the degree of variation for each indicator. In the evaluation system, indicators with greater variation are more effective in highlighting the differences between evaluation units, and thus should be assigned higher weights, while indicators with smaller variation should receive lower weights. This method determines the significance of each indicator based on its statistical characteristics [11,21].
First, calculate the coefficient of variation based on the mean and standard deviation.
$$CV_{PRMSE} = \frac{\sigma_{PRMSE}}{\mu_{PRMSE}}, \qquad CV_{Pr} = \frac{\sigma_{Pr}}{\mu_{Pr}} \tag{25}$$
The weight of the indicator is determined by the coefficient of variation.
$$W_{PRMSE} = \frac{CV_{PRMSE}}{CV_{PRMSE} + CV_{Pr}}, \qquad W_{Pr} = \frac{CV_{Pr}}{CV_{PRMSE} + CV_{Pr}} \tag{26}$$
where $\sigma_{PRMSE}$ and $\mu_{PRMSE}$ represent the standard deviation and mean of RMSE, respectively; $\sigma_{Pr}$ and $\mu_{Pr}$ represent the standard deviation and mean of smoothness; $CV_{PRMSE}$ and $CV_{Pr}$ represent the coefficients of variation for RMSE and smoothness; and $W_{PRMSE}$ and $W_{Pr}$ represent the weights of RMSE and smoothness, respectively.
Calculate the composite evaluation indicator.
$$T = W_{PRMSE} \cdot PRMSE + W_{Pr} \cdot Pr \tag{27}$$
Similarly, a smaller composite evaluation indicator corresponds to a more effective denoising effect at the decomposition level, indicating a more thorough removal of noise. Therefore, the decomposition level corresponding to the minimum value of the composite evaluation indicator is considered to be the theoretical optimal level based on the coefficient of variation method.
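The coefficient of variation weighting can be sketched in the same way. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator is intended.

```python
import statistics

def coefficient_of_variation(indicator):
    """Standard deviation divided by mean (population form, assumed)."""
    return statistics.pstdev(indicator) / statistics.mean(indicator)

def cv_weights(prmse, pr):
    """Weights of normalized RMSE and smoothness from their variability."""
    cv_rmse = coefficient_of_variation(prmse)
    cv_r = coefficient_of_variation(pr)
    return cv_rmse / (cv_rmse + cv_r), cv_r / (cv_rmse + cv_r)

def composite(prmse, pr, w_rmse, w_r):
    """Composite indicator at every level: weighted sum of both indicators."""
    return [w_rmse * a + w_r * b for a, b in zip(prmse, pr)]
```

The indicator that varies more strongly across decomposition levels receives the larger weight, so it dominates the composite indicator.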

4.2.3. Combined Weighting Method

After determining the weights of the two aforementioned weighting methods, to address their respective limitations and minimize the loss of valuable information, a combined weighting method is developed by integrating the entropy weight method with the coefficient of variation method. This ensures that the indicator weights are more objective and justifiable, ultimately leading to the final comprehensive weight. Let the weight vector determined by the coefficient of variation method be represented as T, and the weight vector determined by the entropy weight method be represented as F. By applying an addition-based ensemble method, the final comprehensive weight vector H is derived.
$$H = \alpha T + \beta F \tag{28}$$
where $\alpha$ and $\beta$ represent the undetermined coefficients of the combined weighting method, which can be determined by applying the difference coefficient method.
$$\alpha = RE_m \cdot \frac{m}{m-1}, \qquad \beta = 1 - \alpha \tag{29}$$
$$RE_m = \frac{2}{m}\left(1p_1 + 2p_2 + \cdots + m p_m\right) - \frac{m+1}{m} \tag{30}$$
where $T$ and $F$ represent the weights of the coefficient of variation method and the entropy weight method, respectively; $RE_m$ denotes the difference coefficient; $p_1, p_2, \ldots, p_m$ represent the weights assigned to each indicator by the coefficient of variation method, arranged in ascending order; and $m$ denotes the number of indicators being weighted. In this study, $m$ is set to 2, allowing Equation (30) to be further derived as follows:
$$RE_m = \left(1 \cdot \min\left(W_{PRMSE}, W_{Pr}\right) + 2 \cdot \max\left(W_{PRMSE}, W_{Pr}\right)\right) - 1.5 \tag{31}$$
The equation presented above functions as the weighting formula for the combined weighting method. The computational process for determining the optimal decomposition levels, as proposed in this paper, is illustrated in Figure 4.
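The combination step can be sketched as follows, using hypothetical function names and illustrative weight values. The sketch combines the two weight vectors first and then forms the composite indicator; because the combination is linear, this gives the same values as combining the two composite indicators T and F directly.

```python
def difference_coefficient(weights):
    """Difference coefficient RE_m of a weight vector (sorted ascending)."""
    m = len(weights)
    p = sorted(weights)
    return 2.0 / m * sum((i + 1) * w for i, w in enumerate(p)) - (m + 1) / m

def combination_coefficients(cv_weights):
    """alpha (coefficient of variation method) and beta (entropy method)."""
    m = len(cv_weights)
    alpha = difference_coefficient(cv_weights) * m / (m - 1)
    return alpha, 1.0 - alpha

def combine_weights(cv_weights, entropy_weights, alpha, beta):
    """Final comprehensive weights H = alpha * T + beta * F."""
    return [alpha * t + beta * f for t, f in zip(cv_weights, entropy_weights)]

# Illustrative weights only; they are not results from this paper.
alpha, beta = combination_coefficients([5.0 / 6.0, 1.0 / 6.0])
h = combine_weights([5.0 / 6.0, 1.0 / 6.0], [0.5, 0.5], alpha, beta)
```

For two indicators the expression reduces to α = max(W) − min(W), so when the coefficient of variation method assigns equal weights, α is 0 and the final weights are those of the entropy weight method alone.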
The design concept of this paper is illustrated in Figure 4. The original noisy signal is decomposed into noise and denoised components using a wavelet thresholding function. First, in the noise signal extraction phase, the Jarque–Bera test is applied to ensure that the extracted noise adheres to the characteristics of Gaussian white noise, thereby preventing inadequate or distorted denoising. The information from decomposition levels that pass the Jarque–Bera test is retained, while levels that do not pass are discarded.
Next, in the denoised signal evaluation phase, the denoising performance is assessed using two metrics: RMSE and smoothness. Since these metrics describe different aspects of the signal, a weighted composite indicator is used for evaluation. To address the limitations of relying on a single weighting method, a combined weighting strategy is proposed, integrating the entropy method and the coefficient of variation method. The combined weighting coefficient is then calculated using the difference coefficient method.
By combining the evaluation strategies for noise signal extraction and denoised signal assessment, the decomposition levels are traversed from the first to the highest, and the optimal decomposition level of the noisy signal for a given wavelet basis is the level at which the composite indicator attains its extremum among the levels that pass the Jarque–Bera test.
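The final selection step can be sketched as a filter followed by a minimum search. The function name and the numbers in the example are hypothetical and serve only to show the logic: levels whose extracted noise fails the Jarque–Bera test are discarded, and the remaining level with the smallest composite indicator is returned.

```python
def select_optimal_level(levels, jb_passed, composite):
    """Return the level with the smallest composite indicator among the
    levels whose extracted noise passed the Jarque-Bera test."""
    candidates = [
        (value, level)
        for level, ok, value in zip(levels, jb_passed, composite)
        if ok
    ]
    if not candidates:
        return None  # no level produced Gaussian-like noise
    return min(candidates)[1]  # ties resolve to the lower level

# Hypothetical values for six decomposition levels.
levels = [1, 2, 3, 4, 5, 6]
jb_passed = [True, True, True, True, False, False]
composite = [0.90, 0.60, 0.40, 0.20, 0.10, 0.05]
best = select_optimal_level(levels, jb_passed, composite)
```

In this example levels 5 and 6 have the smallest composite values, but they are excluded because their extracted noise does not pass the normality test, so level 4 is selected.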

5. Simulation Experiments and Practical Engineering Applications

5.1. Simulation Experiment

To accurately compare and validate the effectiveness of the composite evaluation indicator introduced in this study in determining the optimal decomposition level, the simulation signals with known true values derived from Equation (17) are employed for a wavelet denoising analysis.
The control variable method is employed to process noisy signals at varying decomposition levels while maintaining the simulated noise constant. Concurrently, the simulated noise with varying SNRs is analyzed while holding all other parameters constant. Specifically, the db4 wavelet basis function, the ‘sqtwolog’ threshold criterion and the soft-thresholding processing function are utilized. To validate the applicability of this approach, the methods proposed by Wang (index F) and Zhu (index T) are employed as comparison benchmarks [11,12].
The simulated signals with noise levels of 9 dB and 15 dB are used as examples. The RMSE and SNR of the signals are initially calculated using known true values, where the original signals are free of noise. Subsequently, the RMSE and smoothness are evaluated when the true values are unknown, and the original signals include noise. The study also computes the proposed composite evaluation indicator in conjunction with the relevant indicators from Wang’s (index F) and Zhu’s (index T) methods. The detailed results are provided in Table 3 and Table 4.
Table 3 shows that when denoising is performed using the db4 wavelet basis, the RMSE achieves its minimum and the SNR reaches its maximum at a decomposition level of 4 under known true values, indicating that level 4 is optimal. When the true values are unknown, increasing the decomposition level results in a continuous increase in RMSE and a decline in smoothness r. This is consistent with the prior analysis and makes determining the optimal level challenging. Based on the composite evaluation indicator, the minimum value is also observed at level 4, confirming it as the optimal decomposition level, which aligns with the actual conditions. This conclusion is supported by Wang’s method. However, Zhu’s method identifies the minimum value at level 3, suggesting an optimal decomposition level of 3, which contradicts the actual conditions.
Table 4 shows that when using the db4 wavelet basis for denoising, the optimal decomposition level is 4 under known true values. Likewise, when the true values are unknown, the composite evaluation indicator identifies level 4 as optimal, aligning with the actual conditions. However, Wang’s method identifies level 5 as optimal, while Zhu’s method identifies level 3, both of which deviate from the actual conditions. This underscores the superior accuracy and reliability of the composite evaluation indicator compared to the existing methods.
To comprehensively assess the reliability of the method proposed in this paper and to avoid potential bias from a single data set, a control variable approach is employed. With the wavelet basis function held constant, Gaussian white noise at varying SNRs is introduced for the denoising analysis under unknown true values, and the optimal decomposition level for each SNR condition is determined from the composite evaluation indicator. The optimal decomposition level obtained under known true values is used as a reference, and the proposed method is compared with the existing methods, as detailed in Table 5. Additionally, to avoid misleading results due to a single wavelet basis function, different wavelet basis functions are selected under constant SNR conditions, and the optimal decomposition level is evaluated for each wavelet. The proposed method is then compared with the existing methods, with the data processing results presented in Table 6.
Table 5 and Table 6 show that the accuracy with which the optimal decomposition level is determined differs between algorithms. Under the same wavelet basis function, the optimal decomposition level varies with the SNR. Across the six SNR conditions in Table 5, the proposed method achieves an accuracy of 100%, while Wang’s method and Zhu’s method achieve 83% (5 of 6) and 50% (3 of 6), respectively. Under the same SNR, different wavelet basis functions also yield different optimal decomposition levels. Across the six basis functions in Table 6, the proposed method again achieves an accuracy of 100%, whereas Wang’s method and Zhu’s method achieve 67% (4 of 6) and 33% (2 of 6), respectively.
This demonstrates that the method presented in this paper outperforms the existing approaches in terms of accuracy and reliability. It can be widely applied to signals with different wavelet basis functions and varying SNR conditions, making it a robust and reliable method for determining the optimal decomposition level for wavelet denoising.
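The accuracy figures quoted above can be recomputed directly from Tables 5 and 6 by counting how often each method matches the reference level obtained with known true values. The sketch below does this. The data are copied from the two tables, and the function name `accuracy` is hypothetical.

```python
# Optimal levels from Table 5 (db4 basis; SNR = -3, 3, 9, 12, 15, 18 dB).
reference_snr = [4, 4, 4, 4, 4, 3]
table5 = {
    "F": [4, 4, 4, 4, 5, 3],
    "T": [4, 4, 3, 3, 3, 3],
    "H": [4, 4, 4, 4, 4, 3],
}

# Optimal levels from Table 6 (15 dB; db4, db5, sym5, sym6, coif4, coif5).
reference_basis = [4, 4, 4, 4, 4, 4]
table6 = {
    "F": [5, 4, 4, 4, 4, 5],
    "T": [3, 3, 3, 3, 4, 4],
    "H": [4, 4, 4, 4, 4, 4],
}


def accuracy(predicted, reference):
    """Fraction of cases in which the predicted level equals the reference."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


acc_snr = {k: round(100 * accuracy(v, reference_snr)) for k, v in table5.items()}
acc_basis = {k: round(100 * accuracy(v, reference_basis)) for k, v in table6.items()}
print(acc_snr)    # {'F': 83, 'T': 50, 'H': 100}
print(acc_basis)  # {'F': 67, 'T': 33, 'H': 100}
```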

5.2. Engineering Practical Applications

5.2.1. Function Model

According to Kepler’s third law, the square of a satellite’s orbital period is proportional to the cube of the semi-major axis of its elliptical orbit. Using the parameters $a_s$ and $dn$ from the broadcast ephemeris file, this paper derives the following formula for the mean angular velocity $n$:
$$n = \frac{\sqrt{GM}}{a_s^{3}} + dn = \frac{2\pi}{T} \tag{32}$$
where $a_s$ denotes the square root of the semi-major axis of the satellite’s orbital ellipse, $dn$ represents the correction to the mean angular velocity, $G$ is the gravitational constant, $M$ is the Earth’s mass and $T$ is the orbital period of the satellite [23].
Based on the calculation from Equation (32), the orbital period of geostationary Earth orbit (GEO) satellites is approximately 23 h and 56 min. Therefore, this study assumes 23 h and 56 min as a single period to calculate the pseudorange multipath of GEO satellites and applies wavelet denoising.
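As a quick numerical check of the period quoted above, the following sketch evaluates Equation (32) for a nominal GEO orbit. The semi-major axis of 42,164 km and a zero correction term are assumptions chosen for illustration and are not values taken from the paper's ephemeris data. The gravitational parameter is the value used by WGS 84 and CGCS2000.

```python
import math

GM = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2


def mean_motion(sqrt_a, dn=0.0):
    """Mean angular velocity n (rad/s) from sqrt(semi-major axis) and correction dn."""
    return math.sqrt(GM) / sqrt_a ** 3 + dn


def orbital_period(sqrt_a, dn=0.0):
    """Orbital period T (s) from n = 2*pi / T."""
    return 2.0 * math.pi / mean_motion(sqrt_a, dn)


# Assumed nominal GEO semi-major axis (illustrative, not from the ephemeris).
sqrt_a_geo = math.sqrt(42_164_000.0)  # m^0.5
period = orbital_period(sqrt_a_geo)

hours, remainder = divmod(period, 3600)
minutes = remainder // 60
print(f"T = {period:.0f} s = {int(hours)} h {int(minutes)} min")
```

With these assumed inputs the period comes out at roughly 23 h 56 min, in line with the value adopted in the text.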
The optimal wavelet decomposition level is determined by analyzing noise signals in both the time and frequency domains at various decomposition levels. Furthermore, multipath periodic modeling experiments use the wavelet-denoised multipath time series to correct the undenoised multipath time series of the subsequent cycle. Finally, the RMSE of the corrected results is calculated to verify the optimal wavelet decomposition level.
The pseudorange multipath (MP) combination is formed by combining the pseudorange observation $P$ at frequency $i$ with the carrier phase observations $\varphi$ at frequencies $i$ and $j$, thereby effectively reducing the impact of tropospheric and ionospheric delays. The corresponding expression is [24]:
$$MP_i = P_i - \frac{f_i^2 + f_j^2}{f_i^2 - f_j^2}\lambda_i\varphi_i + \frac{2f_j^2}{f_i^2 - f_j^2}\lambda_j\varphi_j - B_i$$
where $\lambda$ is the wavelength of the carrier phase corresponding to the frequency $f$, and the bias term $B_i$ is expressed as:
$$B_i = -\frac{f_i^2 + f_j^2}{f_i^2 - f_j^2}\lambda_i N_i + \frac{2f_j^2}{f_i^2 - f_j^2}\lambda_j N_j + \Psi$$
where $N$ represents the integer ambiguity, and $\Psi$ denotes the time-invariant component of hardware delays and multipath effects.
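The MP combination above can be sketched in a few lines. This is an illustration under stated assumptions rather than the processing chain used in the paper: carrier phases are taken in cycles, the constant bias term is removed by subtracting the mean over a cycle-slip-free arc (a common practical choice), and the B1I/B3I frequency pair is assumed because the second frequency is not named in the text. The function name `multipath_combination` is hypothetical.

```python
C = 299_792_458.0      # speed of light, m/s
F_B1I = 1561.098e6     # BDS B1I carrier frequency, Hz
F_B3I = 1268.520e6     # BDS B3I carrier frequency, Hz (assumed second frequency)


def multipath_combination(p_i, phi_i, phi_j, f_i=F_B1I, f_j=F_B3I):
    """Pseudorange multipath series (m) for one continuous arc.

    p_i   : pseudoranges at frequency i, metres
    phi_i : carrier phases at frequency i, cycles
    phi_j : carrier phases at frequency j, cycles
    """
    lam_i, lam_j = C / f_i, C / f_j
    denom = f_i ** 2 - f_j ** 2
    alpha = (f_i ** 2 + f_j ** 2) / denom
    beta = 2.0 * f_j ** 2 / denom

    raw = [p - alpha * lam_i * a + beta * lam_j * b
           for p, a, b in zip(p_i, phi_i, phi_j)]

    # Assumption: the ambiguity-dependent bias is constant over the arc,
    # so it is estimated as the arc mean and removed.
    bias = sum(raw) / len(raw)
    return [value - bias for value in raw]
```

Because the coefficients satisfy 1 − α + β = 0, the geometric range, clock and tropospheric terms cancel, and the first-order ionospheric delay cancels as well, leaving the multipath plus the constant bias.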

5.2.2. Analysis of Experiments

This study applies the proposed method to model GNSS pseudorange multipath effects. The data were collected at a fixed station in a laboratory building in Zhengzhou on DOY (day of year) 151–152 of 2024, using a TRIMBLE ALLOY receiver with a sampling interval of 30 s. Detailed information on the station and receiver is provided in Figure 5.
Before applying wavelet denoising, it is essential to keep all other parameters fixed. The ‘rigrsure’ threshold selection criterion and the soft threshold function were employed with the db5 wavelet basis function, and the decomposition level was varied from 1 to 11. Following the control variable approach, the optimal decomposition level was then determined separately with Wang’s method, Zhu’s method and the proposed method. The MP time series at the B1I frequency of satellite C60 is depicted in Figure 6, and the corresponding indicator data for the decomposition levels are presented in Table 7. Figure 7 shows the curves used to determine the optimal decomposition level of the pseudorange multipath signal.
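For readers unfamiliar with the thresholding settings named above, the sketch below illustrates the two ingredients on a single vector of detail coefficients: a threshold chosen by minimizing Stein's unbiased risk estimate (SURE), which is the principle behind the ‘rigrsure’ criterion, and the soft threshold function. It assumes the coefficients have already been scaled to unit noise variance, and it omits the wavelet transform itself. The function names are hypothetical, and the paper's actual implementation may differ in detail.

```python
import math


def sure_threshold(coeffs):
    """Threshold minimizing the SURE risk, assuming unit-variance noise."""
    n = len(coeffs)
    squared = sorted(c * c for c in coeffs)
    best_risk, best_thr = float("inf"), 0.0
    cumulative = 0.0
    for k, s in enumerate(squared):
        cumulative += s
        # Estimated risk if the threshold equals the (k+1)-th smallest magnitude.
        risk = (n - 2 * (k + 1) + cumulative + (n - 1 - k) * s) / n
        if risk < best_risk:
            best_risk, best_thr = risk, math.sqrt(s)
    return best_thr


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr; zero out the small ones."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


detail = [0.1, -0.2, 5.0, -6.0]          # toy detail coefficients
thr = sure_threshold(detail)             # about 0.2 for this toy input
shrunk = soft_threshold(detail, thr)
```

In a full denoising run this step would be applied to the detail coefficients of every decomposition level before reconstruction.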
Table 7 and Figure 7 show that the optimal decomposition level differs depending on the evaluation method used. Wang’s method and the proposed method both identify level 4 as optimal, whereas Zhu’s method selects level 3. To evaluate the accuracy of these methods visually, the time-domain and spectral curves of the denoised signals at the third and fourth decomposition levels were plotted, as shown in Figure 8 and Figure 9, respectively.
Figure 8 demonstrates that the wavelet-denoised signal preserves the overall trend of the original signal. Compared to the 3-level decomposition, the 4-level denoised signal exhibits a smoother peak curve, a more stable waveform and more comprehensive noise reduction. Figure 9 presents the single-sided amplitude spectrum, where denoising primarily eliminates the irrelevant high-frequency details while preserving the essential low-frequency components, aligned with the principles of wavelet denoising. The 4-level decomposition removes high-frequency noise more effectively than the 3-level one, yielding a higher SNR. Thus, the 4-level decomposition achieves the optimal denoising effect under these conditions, indirectly confirming the effectiveness of the composite evaluation indicators.
To better assess the wavelet denoising effect at different decomposition levels, Figure 10 presents the power spectral density estimates of the original signal and denoised signals after the third- and fourth-level decompositions. The figure indicates that the denoised signal retains valuable low-frequency information, eliminates high-frequency noise and appropriately preserves transitional mid-frequency details. Compared to the three-level decomposition, the four-level decomposition eliminates weak noise more effectively, confirming that the four-level approach is optimal.
To verify the optimal decomposition level determined in this paper, a periodic modeling analysis of the pseudorange multipath is performed for the static reference station, using the GEO orbital period derived from Kepler’s third law in Equation (32). The pseudorange multipath data for DOY 151 are denoised with the wavelet basis function at different decomposition levels, and the denoised signal is then used to correct the original pseudorange multipath combination for DOY 152. The difference between the two is calculated, and the RMSE of the DOY 152 series after removal of the periodic component is computed. The detailed process of periodic correction is shown in Figure 11, with the statistical data presented in Table 8.
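The periodic correction described above can be sketched as follows. The sketch assumes that both days are sampled every 30 s without gaps, that the denoised DOY 151 series is shifted by one period of 23 h 56 min (2872 samples), and that the residual RMSE is taken over the epochs for which a model value exists. The function names are hypothetical, and the paper's exact alignment procedure is the one shown in Figure 11.

```python
import math

SAMPLE_INTERVAL_S = 30
PERIOD_S = 23 * 3600 + 56 * 60                  # 23 h 56 min
SHIFT = PERIOD_S // SAMPLE_INTERVAL_S           # 2872 samples per period


def periodic_correction(denoised_day1, raw_day2, shift=SHIFT):
    """Subtract the day-1 model, one period earlier, from the day-2 series."""
    n1 = len(denoised_day1)
    residuals = []
    for k, value in enumerate(raw_day2):
        j = n1 + k - shift                      # day-1 epoch one period earlier
        if 0 <= j < n1:
            residuals.append(value - denoised_day1[j])
    return residuals


def rmse(values):
    """Root mean square of a residual series."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

Repeating this for each decomposition level and comparing the resulting RMSE values gives a table of the same form as Table 8.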
Table 8 indicates that the RMSE reaches its minimum at the four-level decomposition, which therefore gives the best performance in multipath periodic modeling. This result agrees with the conclusions drawn from the composite evaluation indicator, the time-domain analysis and the frequency-domain analysis, further validating the effectiveness of the proposed method on real-world data.
Figure 12 illustrates the use of the four-level wavelet decomposition to denoise the pseudorange multipath sequence of DOY 151. The denoised signal is then used to correct the original pseudorange multipath sequence of DOY 152. The black curves show the results before and after correction, while the orange curve represents the wavelet-denoised sequence for DOY 151.
Analysis of Figure 12 reveals that the optimal wavelet decomposition level determined in this study effectively removes noise from the actual data, thereby enhancing the accuracy of multipath periodic modeling.

6. Conclusions

This paper addresses the filtering problem of Gaussian white noise and proposes an optimal decomposition level determination method for wavelet denoising based on the combination of the Jarque–Bera test and a composite weighting approach. Through an analysis of both simulation signals and engineering examples, the proposed comprehensive evaluation metrics demonstrate superior performance and applicability. The main conclusions are as follows:
(1)
In situations where the true value is unknown, a single evaluation metric cannot effectively guide the determination of the optimal decomposition level in the wavelet threshold denoising process. Therefore, a more comprehensive quality evaluation index needs to be developed.
(2)
By combining the statistical characteristics of the noise and the mathematical features of the signal, an effective filtering method for Gaussian white noise is proposed to accurately determine the optimal wavelet decomposition level. First, during the noise extraction phase, the Jarque–Bera test is employed to ensure that the extracted noise conforms to the characteristics of Gaussian white noise, thus avoiding insufficient denoising or signal distortion. Next, in the signal denoising evaluation phase, RMSE and smoothness are selected as the evaluation metrics for denoising performance. Since these two metrics describe different aspects of the signal, a weighted composite approach is used to generate a single comprehensive evaluation index. To overcome the limitations of a single weighting method, a combination weighting strategy is introduced by integrating the entropy weighting method with the coefficient of variation method. The combination coefficient of these two methods is calculated using the difference coefficient approach, thereby yielding a new composite evaluation index. The smaller the index value, the better the denoising effect, and the more accurately the optimal decomposition level is determined.
(3)
The simulation results indicate that the proposed comprehensive evaluation method accurately determines the optimal wavelet decomposition level, both under the same wavelet basis with varying SNRs and under the same SNR with different wavelet bases, whether or not the true values are known. The method demonstrates high accuracy and good applicability. Its effectiveness is further validated with real-world data. After wavelet denoising at the optimal decomposition level determined by this method, the peaks of the signal become smoother, the waveform stabilizes and the denoising effect improves significantly. The modeling of multipath periodicity is also enhanced.
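The noise check summarized in conclusion (2) can be illustrated with a minimal sketch of the Jarque–Bera statistic, JB = n/6 · (S² + (K − 3)²/4), where S and K are the sample skewness and kurtosis. The critical value 5.99 is the 0.95 percentile of the chi-square distribution with two degrees of freedom listed in Table 1. The function names and the two test samples are illustrative assumptions and are not data from the paper.

```python
import math
from statistics import NormalDist

CHI2_95_DF2 = 5.99  # chi-square 0.95 percentile for v = 2 (Table 1)


def jarque_bera(x):
    """Jarque-Bera statistic built from sample skewness and kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def accepts_gaussian(x, critical=CHI2_95_DF2):
    """True if the normality hypothesis is not rejected at the 5% level."""
    return jarque_bera(x) <= critical


# Deterministic illustration: quantiles of a normal and of an exponential law.
n = 1000
probs = [(i + 0.5) / n for i in range(n)]
gauss_like = [NormalDist().inv_cdf(p) for p in probs]
skewed = [-math.log(1.0 - p) for p in probs]

print(accepts_gaussian(gauss_like))  # True
print(accepts_gaussian(skewed))      # False
```

In the proposed workflow, a decomposition level would be kept as a candidate only if the noise extracted at that level passes this check.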
In conclusion, the wavelet optimal decomposition level determination strategy proposed in this paper offers a higher accuracy and greater universality compared to other algorithms. It is important to note that the algorithm designed in this paper mainly targets the most common Gaussian white noise filtering problem in industrial production. However, industrial settings also involve other types of measurement noise, such as Laplacian noise or colored noise, which do not follow Gaussian distributions. These issues still require further research.

Author Contributions

Conceptualization, Z.Z. (Zhanpeng Zhang) and C.L.; methodology, M.W. and C.L.; software, S.S. and M.W.; validation, Z.Z. (Zhao Zhan); project administration, M.W.; funding acquisition, M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly supported by the National Science Foundation of China (No. 42304043) and the Key Laboratory of Smart Earth (Nos. KF2023YB01-11 and SYS-ZX01-2024-01).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article, and further inquiries can be directed to the corresponding author.

Conflicts of Interest

No potential conflicts of interest were reported by the authors.

References

  1. Durrett, R. Probability: Theory and Examples; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 2019; ISBN 978-1-108-47368-2.
  2. Sidebotham, D.; Barlow, C.J. The Central Limit Theorem: The Remarkable Theory That Explains All of Statistics. Anaesthesia 2024, 79, 1117–1121.
  3. Xu, M.; Han, M.; Lin, H. Wavelet-Denoising Multiple Echo State Networks for Multivariate Time Series Prediction. Inf. Sci. 2018, 465, 439–458.
  4. Li, K.; Ban, H.; Jiao, Y.; Lv, S. A Cycle Slip Detection and Repair Method Using BDS Triple-Frequency Optimization Combination with Wavelet Denoising. Int. J. Aerosp. Eng. 2022, 2022, 5110875.
  5. Rouis, M.; Ouafi, A.; Sbaa, S. Optimal Level and Order Detection in Wavelet Decomposition for PCG Signal Denoising. Biomed. Eng. Biomed. Tech. 2019, 64, 163–176.
  6. Behbahani, M.R.M.; Mazarei, M.; Bagtzoglou, A.C. Improving Deep Learning-Based Streamflow Forecasting under Trend Varying Conditions through Evaluation of New Wavelet Preprocessing Technique. Stoch. Environ. Res. Risk Assess. 2024, 38, 3963–3984.
  7. Tao, K.; Zhu, J. A Hybrid Indicator for Determining the Best Decomposition Scale of Wavelet Denoising. Acta Geod. Cartogr. Sin. 2012, 41, 749–755.
  8. Liu, Y.H.; Zeng, M.; Zhang, Y.S. Adaptive Algorithm for Determination of Optimal Wavelet Decomposition Level Based on Jarque-Bera Test. Appl. Mech. Mater. 2014, 644–650, 2220–2223.
  9. Li, X.; Liao, K.; He, G.; Zhao, J. Research on Improved Wavelet Threshold Denoising Method for Non-Contact Force and Magnetic Signals. Electronics 2023, 12, 1244.
  10. Guo, Y.; Zhou, X.; Li, J.; Ba, R.; Xu, Z.; Tu, S.; Chai, L. A Novel and Optimized Sine–Cosine Transform Wavelet Threshold Denoising Method Based on the Sym4 Basis Function and Adaptive Threshold Related to Noise Intensity. Appl. Sci. 2023, 13, 10789.
  11. Zhu, J.; Zhang, Z.; Kuang, C.; Pan, J. A reliable evaluation indicator of wavelet de-noising. Geomat. Inf. Sci. Wuhan Univ. 2015, 40, 688–694.
  12. Wang, X.; Wang, C. A Kind of Wavelet De-Noising Composite Evaluation Index Based on Entropy Method. J. Geod. Geodyn. 2018, 38, 698–702.
  13. Li, J.; Zhao, D.; Wang, D.; Cai, C.; Jia, X.; Zhang, L. A quality evaluation method for wavelet denoising based on combinatorial weighting method. J. Beijing Univ. Aeronaut. Astronaut. 2023, 49, 718–725.
  14. Mallat, S. A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, 3rd ed.; Academic Press Inc.: Cambridge, MA, USA, 2008; ISBN 978-0-12-374370-1.
  15. Li, M.; Yang, S.; He, J.; Gu, X.; Xu, Y.; Gu, F.; Ball, A.D. Full-Field Extraction of Subtle Displacement Components via Phase-Projection Wavelet Denoising for Vision-Based Vibration Measurement. Mech. Syst. Signal Process. 2025, 224, 112021.
  16. Tian, Y. Tests for Normality Based on Skewness and Kurtosis. Master’s Thesis, Shanghai JiaoTong University, Shanghai, China, 2012.
  17. Gosselin, R.-D. Testing for Normality: A User’s (Cautionary) Guide. Lab. Anim. 2024, 58, 433–437.
  18. Aslam, M. Testing Normality of Data for Uncertain Level of Significance. J. Stat. Theory Appl. 2024, 23, 480–499.
  19. Chen, B.; Smith, W.A.; Cheng, Y.; Gu, F.; Chu, F.; Zhang, W.; Ball, A.D. Probability Distributions and Typical Sparsity Measures of Hilbert Transform-Based Generalized Envelopes and Their Application to Machine Condition Monitoring. Mech. Syst. Signal Process. 2025, 224, 112026.
  20. Zhang, H.; Pan, Z. Parameters selection of stationary wavelet denoising algorithm. J. Natl. Univ. Def. Technol. 2019, 41, 165–170.
  21. Hong, J.; Liang, F.; Chen, Y.; Wang, F.; Zhang, X.; Li, K.; Zhang, H.; Yang, J.; Zhang, C.; Yang, H.; et al. A Novel Battery Abnormality Diagnosis Method Using Multi-Scale Normalized Coefficient of Variation in Real-World Vehicles. Energy 2024, 299, 131475.
  22. Ma, J.; Lu, N.; Sun, Q.; Liang, H. Energy Efficiency Evaluation of Wind Turbines Based on Entropy Weight Method and Stacked Autoencoder. J. Phys. Conf. Ser. 2024, 2846, 012004.
  23. Wang, H.; Zhang, Z.; Dong, Y.; Zhan, W.; Li, Y. Real-Time Multipath Mitigation Based on Spatiotemporal Correlations in BDS Precise Point Positioning. GPS Solut. 2024, 28, 37.
  24. Li, L.; Shen, Y.; Li, X. Mitigating Satellite-Induced Code Pseudorange Variations at GLONASS G3 Frequency Using Periodical Model. Remote Sens. 2023, 15, 431.
Figure 1. Diagram of the discrete wavelet decomposition transform.
Figure 2. Trend of a single evaluation indicator for a simulated signal with a known true value.
Figure 3. Trend of a single evaluation indicator for a simulated signal with an unknown true value.
Figure 4. The flowchart for calculating the optimal decomposition levels.
Figure 5. Basic information of the station.
Figure 6. Real observation data.
Figure 7. Determining the optimal decomposition level for pseudorange multipath signals.
Figure 8. Time-domain curves of the original (black line) and denoised (orange line) signals.
Figure 9. Spectrum curves of the original and denoised signals.
Figure 10. Power spectral density plot of the original and denoised signals.
Figure 11. Flowchart of pseudorange multipath periodic correction.
Figure 12. Multipath time series before and after model correction.
Table 1. A summary table of $\chi^2_P$ percentiles corresponding to specific P and v parameters.
| v \ P | 0.95 | 0.975 | 0.99 | 0.995 |
|---|---|---|---|---|
| 2 | 5.99 | 7.38 | 9.21 | 10.6 |
Table 2. Pearson correlation analysis statistics between indicators.
| | RMSE | SNR | r |
|---|---|---|---|
| RMSE | \ | | |
| SNR | 0.997 | \ | |
| r | −0.796 | −0.841 | \ |
Table 3. Evaluation indicators for wavelet denoising across various decomposition levels at an SNR of 9 dB.
| Decomposition Level | Jarque–Bera Test | RMSE (True Value Known) | SNR (True Value Known) | RMSE (True Value Unknown) | r (True Value Unknown) | F | T | H |
|---|---|---|---|---|---|---|---|---|
| 2 | accept | 0.4657 | 15.3004 | 0.8378 | 0.0352 | 0.1551 | 0.1683 | 0.1640 |
| 3 | accept | 0.3019 | 19.0651 | 0.9101 | 0.0069 | 0.0616 | 0.0927 | 0.0827 |
| 4 | accept | 0.2248 | 21.6263 | 0.9440 | 0.0044 | 0.0574 | 0.0937 | 0.0820 |
| 5 | accept | 0.5122 | 14.4730 | 1.0691 | 0.0019 | 0.0672 | 0.1207 | 0.1035 |
| 6 | accept | 0.6220 | 12.7870 | 1.1356 | 0.0008 | 0.0732 | 0.1357 | 0.1156 |
| 7 | accept | 0.6763 | 12.0590 | 1.1768 | 0.0007 | 0.0791 | 0.1471 | 0.1252 |
| 8 | accept | 0.7139 | 11.5898 | 1.1924 | 0.0007 | 0.0814 | 0.1514 | 0.1289 |
| 9 | accept | 0.7454 | 11.2140 | 1.2126 | 0.0006 | 0.0843 | 0.1570 | 0.1336 |
| 10 | accept | 0.7649 | 10.9897 | 1.2271 | 0.0006 | 0.0865 | 0.1611 | 0.1371 |
Table 4. Evaluation indicators for wavelet denoising across various decomposition levels at an SNR of 15 dB.
| Decomposition Level | Jarque–Bera Test | RMSE (True Value Known) | SNR (True Value Known) | RMSE (True Value Unknown) | r (True Value Unknown) | F | T | H |
|---|---|---|---|---|---|---|---|---|
| 2 | accept | 0.2360 | 21.2025 | 0.3926 | 0.0469 | 0.1944 | 0.2007 | 0.1983 |
| 3 | accept | 0.1757 | 23.7674 | 0.4231 | 0.0214 | 0.1010 | 0.1245 | 0.1157 |
| 4 | accept | 0.1531 | 24.9619 | 0.4497 | 0.0175 | 0.0950 | 0.1270 | 0.1150 |
| 5 | accept | 0.2962 | 19.2315 | 0.5223 | 0.0107 | 0.0947 | 0.1486 | 0.1285 |
| 6 | accept | 0.3699 | 17.3007 | 0.5614 | 0.0072 | 0.0956 | 0.1612 | 0.1367 |
| 7 | accept | 0.3988 | 16.6466 | 0.5793 | 0.0070 | 0.1018 | 0.1722 | 0.1459 |
| 8 | accept | 0.4044 | 16.5261 | 0.5865 | 0.0070 | 0.1042 | 0.1766 | 0.1496 |
| 9 | accept | 0.4199 | 16.2001 | 0.5959 | 0.0069 | 0.1075 | 0.1824 | 0.1544 |
| 10 | accept | 0.4257 | 16.0799 | 0.6028 | 0.0069 | 0.1102 | 0.1869 | 0.1582 |
Table 5. Optimal decomposition levels under different SNR conditions (using the db4 wavelet basis as an example).
| SNR (dB) | True Value Known | F (True Value Unknown) | T (True Value Unknown) | H (True Value Unknown) |
|---|---|---|---|---|
| −3 | 4 | 4 | 4 | 4 |
| 3 | 4 | 4 | 4 | 4 |
| 9 | 4 | 4 | 3 | 4 |
| 12 | 4 | 4 | 3 | 4 |
| 15 | 4 | 5 | 3 | 4 |
| 18 | 3 | 3 | 3 | 3 |
Table 6. Optimal decomposition levels under different wavelet basis functions (using 15 dB SNR simulation data as an example).
| Wavelet Basis Function | True Value Known | F (True Value Unknown) | T (True Value Unknown) | H (True Value Unknown) |
|---|---|---|---|---|
| db4 | 4 | 5 | 3 | 4 |
| db5 | 4 | 4 | 3 | 4 |
| sym5 | 4 | 4 | 3 | 4 |
| sym6 | 4 | 4 | 3 | 4 |
| coif4 | 4 | 4 | 4 | 4 |
| coif5 | 4 | 5 | 4 | 4 |
Table 7. Analysis of the comparison of comprehensive indicators for different decomposition levels.
| Decomposition Level | Jarque–Bera Test | F | T | H |
|---|---|---|---|---|
| 1 | accept | 0.8773 | 0.8251 | 0.8433 |
| 2 | accept | 0.1999 | 0.2020 | 0.2012 |
| 3 | accept | 0.0952 | 0.1109 | 0.1054 |
| 4 | accept | 0.0889 | 0.1113 | 0.1035 |
| 5 | accept | 0.1001 | 0.1338 | 0.1220 |
| 6 | accept | 0.1067 | 0.1495 | 0.1345 |
| 7 | accept | 0.1133 | 0.1610 | 0.1443 |
| 8 | accept | 0.1179 | 0.1679 | 0.1504 |
| 9 | accept | 0.1203 | 0.1715 | 0.1536 |
| 10 | accept | 0.1215 | 0.1732 | 0.1552 |
| 11 | accept | 0.1227 | 0.1749 | 0.1567 |
Table 8. Different decomposition levels and their corresponding corrected RMSE.
| Decomposition Level | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|
| RMSE | 0.2134 | 0.2105 | 0.2097 | 0.2139 | 0.2151 | 0.2183 | 0.2195 | 0.2213 | 0.2233 | 0.2251 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Citation: Zhang, Z.; Liu, C.; Wang, M.; Sun, S.; Zhan, Z. Determination Method of Optimal Decomposition Level of Discrete Wavelet Based on Joint Jarque–Bera Test and Combination Weighting Method. Entropy 2025, 27, 108. https://doi.org/10.3390/e27020108