Article

Properties of the SURE Estimates When Using Continuous Thresholding Functions for Wavelet Shrinkage

by Alexey Kudryavtsev 1,2,* and Oleg Shestakov 1,2,3,*
1 Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, Moscow 119991, Russia
2 Moscow Center for Fundamental and Applied Mathematics, Moscow 119991, Russia
3 Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow 119333, Russia
* Authors to whom correspondence should be addressed.
Mathematics 2024, 12(23), 3646; https://doi.org/10.3390/math12233646
Submission received: 24 October 2024 / Revised: 18 November 2024 / Accepted: 19 November 2024 / Published: 21 November 2024

Abstract:
Wavelet analysis algorithms in combination with thresholding procedures are widely used in nonparametric regression problems when estimating a signal function from noisy data. The advantages of these methods lie in their computational efficiency and the ability to adapt to the local features of the estimated function. It is usually assumed that the signal function belongs to some special class. For example, it can be piecewise continuous or piecewise differentiable and have a compact support. These assumptions, as a rule, allow the signal function to be economically represented on some specially selected basis in such a way that the useful signal is concentrated in a relatively small number of large absolute value expansion coefficients. Then, thresholding is performed to remove the noise coefficients. Typically, the noise distribution is assumed to be additive and Gaussian. This model is well studied in the literature, and various types of thresholding and parameter selection strategies adapted for specific applications have been proposed. The risk analysis of thresholding methods is an important practical task, since it makes it possible to assess the quality of both the methods themselves and the equipment used for processing. Most of the studies in this area investigate the asymptotic order of the theoretical risk. In practical situations, the theoretical risk cannot be calculated because it depends explicitly on the unobserved, noise-free signal. However, a statistical risk estimate constructed on the basis of the observed data can also be used to assess the quality of noise reduction methods. In this paper, a model of a signal contaminated with additive Gaussian noise is considered, and the general formulation of the thresholding problem with threshold functions belonging to a special class is discussed. Lower bounds are obtained for the threshold values that minimize the unbiased risk estimate. Conditions are also given under which this risk estimate is asymptotically normal and strongly consistent. The results of these studies can provide the basis for further research in the field of constructing confidence intervals and obtaining estimates of the convergence rate, which, in turn, will make it possible to obtain specific values of errors in signal processing for a wide range of thresholding methods.

1. Introduction

Wavelet methods are widely used for various signal and image processing tasks. They are applied for the frequency–time analysis of signal spectra, the identification of singular points and data compression. Moreover, during the process of receiving, processing and transmitting, signals are often contaminated with noise due to system errors, environmental interference and the internal noise of receivers. To solve this problem, various noise reduction methods have been proposed, such as linear and median filtering. Recently, wavelet decomposition has become a popular tool in solving noise reduction problems. Compared with traditional linear and median filters, wavelet analysis methods offer advantages such as computational speed and adaptive spectral analysis capabilities: a flexible time–frequency window is automatically narrowed when considering high-frequency phenomena and expanded when studying low-frequency oscillations. This makes wavelets an effective tool for the analysis of non-stationary signals with varying regularity in their domain. Telecommunication channels and computer networks may contain additive Gaussian noise caused by thermal sources, solar radiation and other natural causes. Sensor noise, electronic component noise and poor lighting are also sources of Gaussian noise in digital images. The use of standard filtering methods may lead to the excessive blurring of small-scale image details. The use of wavelets in this situation also offers certain advantages, since applying the wavelet transform retains the Gaussian noise structure, and the useful signal is economically represented as a small number of wavelet coefficients. Thus, in the space of wavelet coefficients, it is possible to effectively separate the noise from the useful signal.
Many methods based on wavelet theory have been proposed [1]. Among these methods, threshold processing is the most widely used due to its simplicity and efficiency. Donoho and Johnstone [2] suggested that only a small number of wavelet coefficients are needed to restore the signal, and they proposed the classical method of hard threshold processing. When using the hard threshold processing function, coefficients with absolute values lower than the threshold are removed, while large coefficients remain unchanged. This function is discontinuous, and additional artifacts, such as an analog of the Gibbs effect, may appear in the restored signal. To make the threshold function continuous, Donoho proposed a soft threshold processing function that reduces large coefficients by the threshold value. However, this introduces an additional bias that can lead to distortions in the reconstructed signal (for more information, see Ref. [3]). Many authors have proposed various versions of thresholding functions that improve the classical methods to some extent [4,5,6,7,8,9,10,11,12,13,14,15]. Such functions, like the soft thresholding function, are continuous and allow one to construct an unbiased estimate of the mean square risk using Stein’s method [16]. In this case, the threshold value is usually proposed to be calculated by solving the problem of minimizing this estimate. Due to the appearance of such a variety of threshold processing methods, Johnstone [3] proposed to consider a fairly general class of thresholding functions that satisfy some natural requirements and allow the construction of an unbiased risk estimate. The study of risk estimate properties is also an important practical task, since this estimate allows one to quantitatively evaluate the error of the noise suppression method using only the observed data. However, while the properties of theoretical risk are well studied, the properties of its estimates have received less attention so far. The purpose of this paper is to fill this gap and study the statistical properties of thresholding risk estimates. This paper proves that, under some additional requirements for the threshold function, the unbiased estimate of the mean square risk is strongly consistent and asymptotically normal. These properties serve as the basis for the use of the risk estimate as a quality criterion for the thresholding method.

2. Wavelet Decomposition

For a function $f \in L_2(\mathbb{R})$, the wavelet decomposition is a series of the form
$$ f = \sum_{j,k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle \, \psi_{j,k}, $$
where $\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k)$ are shifts and dilations of the chosen wavelet function $\psi$. The index j in this expansion is called the scale, and the index k is called the shift. Under certain conditions, the sequence $\{\psi_{j,k}\}$ forms an orthonormal basis. The function $\psi$ can be chosen so that it is sufficiently smooth and has other useful properties [1].
In what follows, we will consider functions f with support on some finite interval $[a, b]$ that are uniformly Lipschitz regular with the exponent $\gamma > 0$. If a wavelet function is M times continuously differentiable ($M \ge \gamma$), has M vanishing moments, i.e.,
$$ \int x^k \psi(x)\,dx = 0, \quad k = 0, \ldots, M-1, $$
and rapidly decays at infinity together with its derivatives, i.e., for all $0 \le k \le M$ and any $m \in \mathbb{N}$, there is a constant $C_m$ such that, for all $x \in \mathbb{R}$,
$$ |\psi^{(k)}(x)| \le \frac{C_m}{1 + |x|^m}, $$
then
$$ |\langle f, \psi_{j,k} \rangle| \le C_f\, 2^{-j(\gamma + 1/2)}, $$
where $C_f$ is some positive constant depending on f (a wavelet function with the given properties always exists; for example, a Daubechies wavelet of the corresponding order). This inequality shows that, in the space of wavelet coefficients, Lipschitz regular functions have a sparse representation.
In practice, functions are defined by their values on some grid of points. Suppose that this grid is uniform and the number of points is $N = 2^J$ for some $J > 0$. In the case where the number of observations differs from a power of two, some extension of the array to the nearest power of 2 is applied [17]. The discrete wavelet transform is the multiplication of the vector of function values by the orthogonal matrix W defined by the wavelet function $\psi$. This results in a set of discrete wavelet coefficients $\mu_{j,k}$, for which, for a sufficiently large N, the relation $\mu_{j,k} \approx \sqrt{N}\,\langle f, \psi_{j,k} \rangle$ is valid [1]. Thus, for $\mu_{j,k}$, the following inequality holds:
$$ |\mu_{j,k}| \le C_f\, 2^{J/2}\, 2^{-j(\gamma + 1/2)}. \qquad (1) $$
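The sparsity property (1) is easy to observe numerically. The following sketch is our own illustration and assumes the PyWavelets package (pywt); the test signal, the Daubechies wavelet and the decomposition depth are arbitrary choices, not prescriptions of the paper. It prints the largest coefficient magnitude at each detail scale, and these maxima shrink as the scale becomes finer, as (1) predicts.

```python
# A minimal numerical illustration of (1); the package (pywt), test signal and wavelet
# choice are assumptions of this sketch, not part of the paper itself.
import numpy as np
import pywt

J = 12
N = 2 ** J                                        # sample size N = 2^J
t = np.linspace(0.0, 1.0, N, endpoint=False)
f = np.abs(np.sin(2.0 * np.pi * t)) ** 0.7        # uniformly Lipschitz-regular, gamma = 0.7

level = pywt.dwt_max_level(N, 'db6')              # deepest decomposition supported by db6
coeffs = pywt.wavedec(f, 'db6', mode='periodization', level=level)
# coeffs = [coarse approximation, coarsest detail, ..., finest detail]

for scale, d in enumerate(coeffs[1:]):
    print(f"detail scale {scale:2d} ({d.size:5d} coefficients): max |mu| = {np.max(np.abs(d)):.3e}")
# The printed maxima decrease as the scale gets finer: the sparse representation of
# Lipschitz-regular functions used throughout the paper.
```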

3. Thresholding of Wavelet Coefficients

Real observations are usually contaminated by noise. In many cases, it is assumed that the noise is additive, white and Gaussian. Thus, we consider the following model:
$$ X_i = f_i + z_i, \quad i = 1, \ldots, N, $$
where $f_i$ are values of the function f, and the random variables $z_i$ have the distribution $N(0, \sigma^2)$ and are independent. Since the matrix W is orthogonal, the discrete wavelet coefficients have the form
$$ Y_{j,k} = \mu_{j,k} + \epsilon_{j,k}, \quad j = 0, \ldots, J-1, \quad k = 0, \ldots, 2^j - 1, $$
where $\epsilon_{j,k}$ have distribution $N(0, \sigma^2)$ and are independent. Classical methods of signal processing assume the additivity and Gaussianity of the noise. However, there are approaches (see, for example, [18,19]) that assume non-Gaussianity. The consideration of such a problem could be the subject of a separate study.
To suppress noise, a thresholding function is applied to the wavelet coefficients, the purpose of which is to remove sufficiently small coefficients that are considered noise. The most common are the hard thresholding function
$$ \rho_H(y, T) = \begin{cases} y, & |y| > T; \\ 0, & |y| \le T, \end{cases} $$
and the soft thresholding function
$$ \rho_S(y, T) = \begin{cases} y - T, & y > T; \\ y + T, & y < -T; \\ 0, & |y| \le T, \end{cases} $$
with some threshold T. However, each of them has its own drawbacks. The function $\rho_H$ is discontinuous, which leads to a lack of stability and the appearance of additional artifacts, and the function $\rho_S$ leads to the appearance of an additional bias in the estimate of the function f. A number of studies have proposed some alternative thresholding functions $\rho(y, T)$, which, in fact, are a compromise between hard and soft thresholding [4,5,6,7,8,9,10,11,12,13,14,15]. These functions are continuous, like the function $\rho_S(y, T)$, but $\rho(y, T) \to y$ as $|y| \to \infty$, i.e., for large absolute values of the coefficients, they resemble the hard thresholding function.
Due to the emergence of various types of thresholding functions, it makes sense to consider some general classes of such functions. Let the function $h(y, T)$ have the following properties:
1. $h(-y, T) = -h(y, T)$ (oddness);
2. $0 \le h(y, T) \le T$ for $y \ge 0$ (boundedness);
3. $h(T, T) = T$ (continuity).
Then, the thresholding function is defined by the expression
$$ \rho_h(y, T) = \begin{cases} y - h(y, T), & |y| > T; \\ 0, & |y| \le T. \end{cases} $$
This class includes, for example, the soft thresholding function, the hybrid thresholding function [5,15], the thresholding function based on the hyperbolic tangent [10], the sigmoid function [12] and some others.
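As an illustration of this class, the following sketch (with helper names of our own choosing) applies the rule $\rho_h$ for a user-supplied shrinkage term h and shows two admissible choices: the soft-thresholding term $h(y,T) = T\operatorname{sgn}(y)$ and a hyperbolic-tangent-type term in the spirit of [10]; the parameter lam of the latter is an extra shape parameter assumed here for illustration. Both terms satisfy the oddness, boundedness and continuity requirements listed above.

```python
# A sketch of the general rule rho_h(y, T) = y - h(y, T) for |y| > T and 0 otherwise.
# Function names and the parameter lam of the tanh-type term are our own notation.
import numpy as np

def rho_h(y, T, h):
    """Generic continuous thresholding with shrinkage term h(y, T)."""
    y = np.asarray(y, dtype=float)
    return np.where(np.abs(y) > T, y - h(y, T), 0.0)

def h_soft(y, T):
    """Soft thresholding: h(y, T) = T*sgn(y), so every kept coefficient is shrunk by T."""
    return T * np.sign(y)

def h_tanh(y, T, lam=2.0):
    """Tanh-type term: equals T at |y| = T and decays to 0 as |y| grows (hard-like tails)."""
    return np.sign(y) * T * (1.0 - np.tanh(lam * (np.abs(y) - T) / T))

if __name__ == "__main__":
    T = 1.0
    y = np.linspace(-4.0, 4.0, 9)
    print(rho_h(y, T, h_soft))                                 # classical soft thresholding
    print(rho_h(y, T, lambda u, v: h_tanh(u, v, lam=3.0)))     # another member of the class
```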
Thus, the signal processing scheme is as follows. First, a discrete wavelet transform is performed using the orthogonal matrix W. Then, threshold processing is applied to the resulting wavelet coefficients, which suppresses most of the noise, since the useful signal has an economical representation in the space of wavelet coefficients, while the noise is spread uniformly over all coefficients. Finally, the inverse wavelet transform is performed using the matrix $W^T$ to obtain the processed signal (here $W^{-1} = W^T$, since W is orthogonal). As a by-product, this scheme also yields an economical representation of the data for storage or transmission over communication channels.
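The whole scheme fits in a few lines. The sketch below is a minimal implementation of our own, assuming the PyWavelets package for the orthogonal transform pair; the noise level is treated as known, and soft thresholding with the threshold $\sigma\sqrt{2 \ln N}$ (the universal threshold discussed in the next paragraph) is used purely for illustration. Any rule from the class $\rho_h$ could be substituted.

```python
# A sketch of the scheme: forward DWT, thresholding of detail coefficients, inverse DWT.
# pywt supplies the orthogonal transform; the threshold and test signal are illustrative.
import numpy as np
import pywt

rng = np.random.default_rng(0)
J = 11
N = 2 ** J
t = np.linspace(0.0, 1.0, N, endpoint=False)
f = np.abs(np.sin(2.0 * np.pi * t)) ** 0.7          # noise-free signal values f_i

sigma = 0.1                                         # noise level, assumed known here
x = f + sigma * rng.normal(size=N)                  # observations X_i = f_i + z_i
T_U = sigma * np.sqrt(2.0 * np.log(N))              # universal threshold sigma*sqrt(2 ln N)

coeffs = pywt.wavedec(x, 'db6', mode='periodization')
denoised = [coeffs[0]] + [pywt.threshold(d, T_U, mode='soft') for d in coeffs[1:]]
f_hat = pywt.waverec(denoised, 'db6', mode='periodization')

print("MSE of noisy data  :", np.mean((x - f) ** 2))
print("MSE after shrinkage:", np.mean((f_hat - f) ** 2))
```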
The error (or risk) of threshold processing is defined as follows:
$$ R_J(T) = \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} \mathbb{E}\big(\rho_h(Y_{j,k}, T) - \mu_{j,k}\big)^2. \qquad (2) $$
One of the key issues is the choice of the threshold value T (note that a “reasonable” threshold value should increase with J [1]). One of the first to appear was the so-called universal threshold $T_U = \sigma\sqrt{2 \ln 2^J}$. The following statement demonstrates that this threshold is, in a certain sense, the maximum among the “reasonable” ones [1].
Lemma 1.
Let $Z_1, \ldots, Z_N$ be independent random variables with distribution $N(0, \sigma^2)$. Then,
$$ P\Big( \max_{1 \le i \le N} |Z_i| > \sigma\sqrt{2 \ln N} \Big) \to 0 $$
as $N \to \infty$.
Thus, the maximum noise amplitude with a high probability does not exceed the universal threshold. In Refs. [20,21], additional considerations are also given, showing that it is reasonable not to consider $T > T_U$.
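Lemma 1 is easy to check by simulation. The following sketch (our own experiment, with an arbitrary number of trials) estimates the probability that the maximum of N independent $N(0, \sigma^2)$ variables exceeds $\sigma\sqrt{2 \ln N}$; the estimated probability is small and slowly decreases as N grows, in line with the lemma.

```python
# Monte Carlo check of Lemma 1: the maximum noise amplitude rarely exceeds the
# universal threshold, and the exceedance frequency decays (slowly) with N.
import numpy as np

rng = np.random.default_rng(1)
sigma, trials = 1.0, 500
for J in (8, 12, 16):
    N = 2 ** J
    T_U = sigma * np.sqrt(2.0 * np.log(N))
    hits = sum(np.max(np.abs(rng.normal(0.0, sigma, N))) > T_U for _ in range(trials))
    print(f"N = 2^{J:2d}: estimated P(max|Z_i| > sigma*sqrt(2 ln N)) = {hits / trials:.3f}")
```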

4. Statistical Estimate of the Mean Square Risk

Another approach to choosing the threshold value is based on minimizing a statistical estimate of the risk (2) constructed using Stein’s method [16]. If $h(y, T)$ is differentiable with respect to the variable y, then the following relation is valid:
$$ \mathbb{E}\big(\rho_h(Y_{j,k}, T) - \mu_{j,k}\big)^2 = \sigma^2 + \mathbb{E}\, g^2(Y_{j,k}) - 2\sigma^2\, \mathbb{E}\, g'(Y_{j,k}), $$
where $g(Y_{j,k}) = Y_{j,k} - \rho_h(Y_{j,k}, T)$. That is, Stein’s method replaces the expression in (2) that contains the unknown $\mu_{j,k}$ with an expression that depends only on the observed data.
Thus, the following expression can be used as an unbiased risk estimate:
$$ \widehat{R}_J(T) = \sum_{j=0}^{J-1} \sum_{k=0}^{2^j - 1} F(Y_{j,k}, T), \qquad (3) $$
where
$$ F(y, T) = \begin{cases} y^2 - \sigma^2, & |y| \le T; \\ h^2(y, T) + \sigma^2 - 2\sigma^2\, h'_y(y, T), & |y| > T. \end{cases} \qquad (4) $$
Let $T_S$ denote the threshold that minimizes the estimate (3):
$$ \widehat{R}_J(T_S) = \min_{T \in [0, T_U]} \widehat{R}_J(T). $$
This threshold mimics the theoretical “ideal” threshold $T_{\min}$ that minimizes the theoretical risk (2):
$$ R_J(T_{\min}) = \min_{T \in [0, T_U]} R_J(T). $$
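For a concrete thresholding rule, the estimate (3) is an explicit sum over the observed coefficients and can be minimized by a simple grid search over $[0, T_U]$. The sketch below (helper names and the grid size are ours) does this for soft thresholding, for which $h(y, T) = T\operatorname{sgn}(y)$ and $h'_y = 0$, so that F reduces to $y^2 - \sigma^2$ on $|y| \le T$ and $T^2 + \sigma^2$ on $|y| > T$.

```python
# A sketch of the unbiased risk estimate (3) for soft thresholding and its minimization
# over [0, T_U] by a plain grid search; function names and the grid size are ours.
import numpy as np

def sure_soft(Y, T, sigma):
    """R_hat_J(T) for soft thresholding: sum of F(Y_{j,k}, T) over all coefficients."""
    Y = np.asarray(Y, dtype=float)
    F = np.where(np.abs(Y) <= T, Y ** 2 - sigma ** 2, T ** 2 + sigma ** 2)
    return float(np.sum(F))

def sure_threshold(Y, sigma, n_grid=1000):
    """Grid approximation of T_S, the minimizer of the estimate over [0, T_U]."""
    N = Y.size
    T_U = sigma * np.sqrt(2.0 * np.log(N))
    grid = np.linspace(0.0, T_U, n_grid)
    values = [sure_soft(Y, T, sigma) for T in grid]
    return grid[int(np.argmin(values))]

# Example with pure-noise coefficients; the minimizer is typically close to T_U,
# consistent with the lower bounds on T_S obtained below.
rng = np.random.default_rng(2)
Y = rng.normal(0.0, 1.0, 2 ** 12)
print("T_S ~", sure_threshold(Y, sigma=1.0), "  T_U =", np.sqrt(2.0 * np.log(Y.size)))
```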
Remark 1.
Since this paper studies the order of the expressions under consideration with respect to J, we will use the notations C, c and others for some positive constants that may depend on the model parameters but do not depend on J. In this case, in one chain of transformations, different constants may be designated by the same letters, and the transformations themselves are performed under the assumption that J is sufficiently large.
The following upper bound is valid for $R_J(T_{\min})$.
Lemma 2.
Let f be supported on some finite interval $[a, b]$ and be uniformly Lipschitz regular with some exponent $\gamma > 0$. Then,
$$ R_J(T_{\min}) \le C\, 2^{J/(2\gamma+1)}\, J^{(2\gamma+2)/(2\gamma+1)}. \qquad (5) $$
Proof. 
Let the index $j_*$ be such that
$$ |\mu_{j,k}| \le C\, J^{-1/2}, \quad j_* \le j \le J - 1. $$
In this case, due to (1),
$$ j_* \le \frac{J}{2\gamma+1} + \frac{1}{2\gamma+1}\log_2 J + c. $$
Let us split the expression for risk into two sums:
$$ R_J(T) = \sum_{j=0}^{J-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 = \sum_{j=0}^{j_*-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 + \sum_{j=j_*}^{J-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 \equiv S_1 + S_2. $$
Let us estimate the first sum from above. Given the restrictions on $h(y, T)$, we obtain
$$ \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 = \mathbb{E}\big((Y_{j,k} - h(Y_{j,k},T))\,I(|Y_{j,k}| > T) - \mu_{j,k}\big)^2 = \mathbb{E}\big(Y_{j,k} - h(Y_{j,k},T) - (Y_{j,k} - h(Y_{j,k},T))\,I(|Y_{j,k}| \le T) - \mu_{j,k}\big)^2 \le 2\,\mathbb{E}(Y_{j,k} - \mu_{j,k})^2 + 2\,\mathbb{E}\big[h(Y_{j,k},T) + (Y_{j,k} - h(Y_{j,k},T))\,I(|Y_{j,k}| \le T)\big]^2 \le 2\sigma^2 + C T^2. $$
Hence,
$$ S_1 = \sum_{j=0}^{j_*-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 \le C\, 2^{j_*} T^2 \le C\, T^2\, 2^{J/(2\gamma+1)}\, J^{1/(2\gamma+1)}. $$
Now, consider the second sum. Given the constraints on $h(y, T)$ and the fact that, for each term of the second sum, $|\mu_{j,k}| \le C J^{-1/2}$ (and hence $|\mu_{j,k}| \le C/T$, since $T < T_U$), we obtain
$$ \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 = \mathbb{E}\big((Y_{j,k} - h(Y_{j,k},T))\,I(|Y_{j,k}| > T) - \mu_{j,k}\big)^2 \le 2\,\mathbb{E}(Y_{j,k} - h(Y_{j,k},T))^2\, I(|Y_{j,k}| > T) + 2\mu_{j,k}^2 \le C\left[T \exp\left(-\frac{(T+\mu_{j,k})^2}{2\sigma^2}\right) + T \exp\left(-\frac{(T-\mu_{j,k})^2}{2\sigma^2}\right)\right] + 2\mu_{j,k}^2 \le C\, T \exp\left(-\frac{T^2}{2\sigma^2}\right) + 2\mu_{j,k}^2. $$
Taking into account (1), we have
$$ S_2 = \sum_{j=j_*}^{J-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(\rho_h(Y_{j,k},T) - \mu_{j,k}\big)^2 \le C\left[T \exp\left(-\frac{T^2}{2\sigma^2}\right) 2^J + 2^{J/(2\gamma+1)}\, J^{-2\gamma/(2\gamma+1)}\right]. $$
Thus,
$$ R_J(T) = S_1 + S_2 \le C\left[T^2\, 2^{J/(2\gamma+1)}\, J^{1/(2\gamma+1)} + T \exp\left(-\frac{T^2}{2\sigma^2}\right) 2^J\right]. $$
In this expression, the first term increases with growing T, and the second one decreases (while, as noted, $T < T_U$). Repeating the reasoning of Ref. [22] and choosing the value
$$ T_* = \sigma\sqrt{\frac{4\gamma}{2\gamma+1}\ln 2^J} - \sigma\,\frac{2\gamma+3}{4\gamma+2}\sqrt{\frac{2\gamma+1}{4\gamma}}\cdot\frac{\ln\ln 2^J}{\sqrt{\ln 2^J}} $$
to equalize the orders of these terms, we obtain the upper bound (5) (in this case, the value $T_*$ represents the lower bound for the “ideal” threshold $T_{\min}$). The lemma is proven. □
For the threshold value $T_S$, the following statement holds, showing that it cannot be too small.
Lemma 3.
Let f be supported on some finite interval $[a, b]$ and be uniformly Lipschitz regular with some exponent $\gamma > 0$. Let, for some positive θ,
$$ |h'_y(y, T)| \le C\, T^{\theta}, \quad |h'_T(y, T)| \le C\, T^{\theta}, \quad |h''_{yT}(y, T)| \le C\, T^{\theta}, \quad y \in \mathbb{R}. \qquad (6) $$
Then, for $0 < \gamma \le 1/2$ and arbitrary $0 < \alpha < \frac{4\gamma}{2\gamma+1}$,
$$ P\big(T_S \in [0, T_1]\big) \le C\, 2^J \exp\Big(-c\, J^{-\max\{3,\,\theta+1\}}\, 2^{(1-\alpha)J}\Big). \qquad (7) $$
If $\gamma > 1/2$, then, for an arbitrary $\frac{2}{3} < \alpha < \frac{2}{3}\cdot\frac{4\gamma}{2\gamma+1}$,
$$ P\big(T_S \in [0, T_2]\big) \le C\, 2^J \exp\Big(-c\, J^{-\max\{5/2,\,\theta+1/2\}}\, 2^{(1-\alpha)J}\Big), \qquad (8) $$
where $T_1 = \sigma\sqrt{\alpha \ln 2^J}$, $T_2 = \sigma\sqrt{\frac{3\alpha}{2} \ln 2^J}$.
Proof. 
Let $0 < \gamma \le 1/2$. Then, for $0 < \alpha < \frac{4\gamma}{2\gamma+1}$, we obtain
$$ P\big(T_S \in [0, T_1]\big) \le P\Big(\sup_{T \in [0,T_1]}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big) \ge 0\Big) \le P\Big(\sup_{T \in [0,T_1]}\big|\widehat{R}_J(T_{\min}) - \widehat{R}_J(T) - \mathbb{E}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big)\big| \ge \inf_{T \in [0,T_1]}\big(R_J(T) - R_J(T_{\min})\big)\Big). $$
Next, since $0 \le h(y, T) \le T$,
$$ \mathbb{E}\big(\rho_h(Y_{J-1,k},T) - \mu_{J-1,k}\big)^2 = \mathbb{E}\big((Y_{J-1,k} - h(Y_{J-1,k},T))\,I(|Y_{J-1,k}| > T) - \mu_{J-1,k}\big)^2 \ge \mathbb{E}\big(|Y_{J-1,k}| - T\big)^2 I(|Y_{J-1,k}| > T) - 2|\mu_{J-1,k}|\,\mathbb{E}\,|Y_{J-1,k}|\, I(|Y_{J-1,k}| > T) + \mu_{J-1,k}^2. $$
Using the inequality
$$ 1 - \Phi(y) > \varphi(y)\left(\frac{1}{y} - \frac{1}{y^3}\right), \quad y > 0, $$
where $\varphi(y)$ and $\Phi(y)$ are the density and distribution function of the standard normal law, and given that, by (1),
$$ |\mu_{J-1,k}| \le C\, 2^{-\gamma J}, $$
we conclude that there is a constant $C_T > 0$ such that, for $C_T < T \le T_1$,
$$ \mathbb{E}\big(|Y_{J-1,k}| - T\big)^2 I(|Y_{J-1,k}| > T) - 2|\mu_{J-1,k}|\,\mathbb{E}\,|Y_{J-1,k}|\, I(|Y_{J-1,k}| > T) + \mu_{J-1,k}^2 \ge \frac{c}{T}\, e^{-\frac{T^2}{2\sigma^2}} + \mu_{J-1,k}^2. $$
If $0 < T \le C_T$, then, obviously, for some $\delta > 0$ independent of J,
$$ \mathbb{E}\big(\rho_h(Y_{J-1,k},T) - \mu_{J-1,k}\big)^2 \ge \delta. $$
Thus,
$$ \inf_{T \in [0,T_1]} R_J(T) \ge \inf_{T \in [0,T_1]} \sum_{k=0}^{2^{J-1}-1} \mathbb{E}\big(\rho_h(Y_{J-1,k},T) - \mu_{J-1,k}\big)^2 \ge C\,\frac{2^J}{T_1}\, e^{-\frac{T_1^2}{2\sigma^2}} = \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{\sqrt{\ln 2^J}}. $$
Therefore, taking into account (5),
$$ \inf_{T \in [0,T_1]}\big(R_J(T) - R_J(T_{\min})\big) \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{\sqrt{\ln 2^J}}. $$
Denote
$$ Z(T) = \widehat{R}_J(T_{\min}) - \widehat{R}_J(T) - \mathbb{E}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big). $$
For $T' < T''$,
$$ |Z(T'') - Z(T')| = \big|\widehat{R}_J(T') - \mathbb{E}\widehat{R}_J(T') - \widehat{R}_J(T'') + \mathbb{E}\widehat{R}_J(T'')\big|. $$
Let $N(T', T'') = \#\{(j,k) : T' < |Y_{j,k}| \le T''\}$. Then, given the form of $F(y, T)$ and the constraints (6), we obtain the estimate
$$ |Z(T'') - Z(T')| \le C\, (T'')^{\max\{2,\theta\}}\, N(T', T'') + C\, (T'')^{\max\{2,\theta+1\}}\, (T'' - T')\, 2^J. $$
Let us divide the interval [ 0 , T 1 ] into equal parts using points
$$ T_l = l\,\delta_J, \quad l = 0, \ldots, 2^J, \quad \delta_J = T_1\, 2^{-J}. $$
Then,
$$ A_J = \Big\{\sup_{T \in [0,T_1]}\big|\widehat{R}_J(T_{\min}) - \widehat{R}_J(T) - \mathbb{E}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big)\big| \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{\sqrt{\ln 2^J}}\Big\} \subseteq D_J \cup E_J, $$
where
$$ D_J = \Big\{\sup_l |Z(T_l)| > \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}}\Big\}, \quad E_J = \Big\{\sup_l \sup_{T \in [T_l, T_{l+1})} |Z(T) - Z(T_l)| \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}}\Big\}. $$
Using Hoeffding’s inequality [23], we obtain
$$ P(D_J) \le \sum_{l=1}^{2^J} P\Big(|Z(T_l)| > \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}}\Big) \le C\, 2^J \exp\left(-\frac{c\, 2^{(1-\alpha)J}}{\ln 2^J \cdot T_{\min}^{\max\{4,2\theta\}}}\right). \qquad (9) $$
Next,
$$ E_J \subseteq \Big\{\sup_l \sup_{T \in [T_l, T_{l+1})} \Big(C\, T^{\max\{2,\theta\}} N(T_l, T) + C\, T^{\max\{2,\theta+1\}} (T - T_l)\, 2^J\Big) \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}}\Big\} \subseteq \Big\{\sup_l C\,(T_{l+1})^{\max\{2,\theta\}} N(T_l, T_{l+1}) \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}} - C\,(T_1)^{\max\{3,\theta+2\}}\Big\}. $$
Since
$$ \mathbb{E}\, N(T_l, T_{l+1}) = \mathbb{E} \sum_{j=0}^{J-1}\sum_{k=0}^{2^j-1} I\big(T_l < |Y_{j,k}| \le T_{l+1}\big) \le C\, T_1, $$
we have
$$ \Big\{\sup_l C\,(T_{l+1})^{\max\{2,\theta\}} N(T_l, T_{l+1}) \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}} - C\,(T_1)^{\max\{3,\theta+2\}}\Big\} \subseteq \Big\{\sup_l \big|C\,(T_{l+1})^{\max\{2,\theta\}} N(T_l, T_{l+1}) - \mathbb{E}\, C\,(T_{l+1})^{\max\{2,\theta\}} N(T_l, T_{l+1})\big| \ge \frac{C\, 2^{\frac{(2-\alpha)J}{2}}}{2\sqrt{\ln 2^J}} - C\,(T_1)^{\max\{3,\theta+2\}}\Big\}. $$
Applying Hoeffding’s inequality one more time, we obtain
$$ P(E_J) \le C\, 2^J \exp\left(-\frac{c\, 2^{(1-\alpha)J}}{\ln 2^J \cdot T_1^{\max\{4,2\theta\}}}\right). \qquad (10) $$
Combining (9) and (10), we obtain
$$ P\big(T_S \in [0, T_1]\big) \le P(A_J) \le C\, 2^J \exp\left(-\frac{c\, 2^{(1-\alpha)J}}{\ln 2^J \cdot T_1^{\max\{4,2\theta\}}}\right) \le C\, 2^J \exp\Big(-c\, J^{-\max\{3,\theta+1\}}\, 2^{(1-\alpha)J}\Big). $$
Now, let $\gamma > 1/2$. Then, for $0 < \alpha < 1$, the estimate of the probability $P(T_S \in [0, T_1])$ has exactly the same form. Next, let $\frac{2}{3} < \alpha < \frac{2}{3}\cdot\frac{4\gamma}{2\gamma+1}$ and $\beta = \frac{3\alpha}{2}$. Then, repeating the previous reasoning, we obtain
$$ \inf_{T \in [T_1,T_2]}\big(R_J(T) - R_J(T_{\min})\big) \ge \frac{C\, 2^{\frac{(2-\beta)J}{2}}}{\sqrt{\ln 2^J}}, $$
and
$$ P\big(T_S \in [T_1, T_2]\big) \le P(A_J), $$
where
$$ A_J = \Big\{\sup_{T \in [T_1,T_2]}\big|\widehat{R}_J(T_{\min}) - \widehat{R}_J(T) - \mathbb{E}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big)\big| \ge \frac{C\, 2^{\frac{(2-\beta)J}{2}}}{\sqrt{\ln 2^J}}\Big\} \subseteq D_J \cup E_J, $$
$$ D_J = \Big\{\sup_l |Z(T_l)| > \frac{C\, 2^{\frac{(2-\beta)J}{2}}}{2\sqrt{\ln 2^J}}\Big\}, $$
$$ E_J = \Big\{\sup_l \sup_{T \in [T_l, T_{l+1})} |Z(T) - Z(T_l)| \ge \frac{C\, 2^{\frac{(2-\beta)J}{2}}}{2\sqrt{\ln 2^J}}\Big\}, $$
$$ T_l = T_1 + l\,\delta_J, \quad l = 0, \ldots, 2^J, \quad \delta_J = (T_2 - T_1)\, 2^{-J}. $$
It can also be shown that
$$ \mathsf{D}\big(\widehat{R}_J(T_{\min}) - \widehat{R}_J(T)\big) \le C\, T^{\max\{3,\,2\theta-1\}}\, 2^{\frac{(2-\alpha)J}{2}}. $$
Now, repeating the previous reasoning, to estimate the probabilities of the events $D_J$ and $E_J$, instead of Hoeffding’s inequality, we apply Bernstein’s inequality [23] and obtain
$$ P(D_J) \le C\, 2^J \exp\Big(-c\, J^{-\max\{5/2,\,\theta+1/2\}}\, 2^{(2-2\beta+\alpha)J/2}\Big), $$
$$ P(E_J) \le C\, 2^J \exp\Big(-c\, J^{-\max\{5/2,\,\theta+1/2\}}\, 2^{(2-2\beta+\alpha)J/2}\Big). $$
Hence,
$$ P\big(T_S \in [T_1, T_2]\big) \le C\, 2^J \exp\Big(-c\, J^{-\max\{5/2,\,\theta+1/2\}}\, 2^{(2-2\beta+\alpha)J/2}\Big). $$
Combining this estimate with the estimate of the probability $P(T_S \in [0, T_1])$ and taking into account that $\beta = \frac{3\alpha}{2}$, we obtain the statement of the lemma for $\gamma > 1/2$.
The lemma is proven. □

5. Asymptotic Properties of the Mean Square Risk Estimate

In this section, we present statements describing the asymptotic properties of the risk estimate. For soft thresholding, similar results were obtained in [24,25].
Theorem 1.
Let f be supported on some finite interval $[a, b]$ and be uniformly Lipschitz regular with some exponent $\gamma > 1/2$. Let $h(y, T)$ satisfy the conditions (6). Then,
$$ P\left(\frac{\widehat{R}_J(T_S) - R_J(T_{\min})}{\sigma^2 \sqrt{2^{J+1}}} < x\right) \to \Phi(x) \quad \text{as } J \to \infty. \qquad (11) $$
Proof. 
Denote
$$ U(T) = \widehat{R}_J(T) - \widehat{R}_J(T_{\min}) = \sum_{j=0}^{J-1}\sum_{k=0}^{2^j-1} \big(F(Y_{j,k}, T) - F(Y_{j,k}, T_{\min})\big). $$
Let us write the numerator in the expression under the probability in (11) as
$$ \widehat{R}_J(T_S) - R_J(T_{\min}) = U(T_S) + \widehat{R}_J(T_{\min}) - R_J(T_{\min}) $$
and show that
$$ \frac{\widehat{R}_J(T_{\min}) - R_J(T_{\min})}{\sigma^2 \sqrt{2^{J+1}}} \Rightarrow N(0, 1), \quad J \to \infty. \qquad (12) $$
Let $V_J^2 = \mathsf{D}\big(\widehat{R}_J(T_{\min}) - R_J(T_{\min})\big)$. Given (1) and the fact that $\mathsf{D}\, Y_{j,k}^2 = 2\sigma^4 + 4\sigma^2 \mu_{j,k}^2$, it can be shown that
$$ \lim_{J \to \infty} \frac{V_J^2}{\sigma^4\, 2^{J+1}} = 1. $$
Moreover, the Lindeberg condition is satisfied: for any $\epsilon > 0$, as $J \to \infty$,
$$ \frac{1}{V_J^2} \sum_{j=0}^{J-1}\sum_{k=0}^{2^j-1} \mathbb{E}\big(F(Y_{j,k}, T_{\min}) - \mathbb{E} F(Y_{j,k}, T_{\min})\big)^2\, I\big(|F(Y_{j,k}, T_{\min}) - \mathbb{E} F(Y_{j,k}, T_{\min})| > \epsilon V_J\big) \to 0, $$
since, starting from some J, the indicators become equal to zero. Therefore, (12) holds. It remains to show that
$$ \frac{U(T_S)}{\sqrt{2^J}} \xrightarrow{P} 0. \qquad (13) $$
For $\frac{2}{3} < \alpha < \frac{2}{3}\cdot\frac{4\gamma}{2\gamma+1}$, we denote $T_2 = \sigma\sqrt{\frac{3\alpha}{2} \ln 2^J}$. Then, for an arbitrary $\epsilon > 0$,
$$ P\left(\frac{|U(T_S)|}{\sqrt{2^J}} > \epsilon\right) \le P\big(T_S \le T_2\big) + P\left(\frac{\sup_{T \in [T_2, T_U]} |U(T) - \mathbb{E} U(T)| + \sup_{T \in [T_2, T_U]} |\mathbb{E} U(T)|}{\sqrt{2^J}} > \epsilon\right). $$
By Lemma 3, the first probability tends to zero. Moreover, repeating the reasoning from Lemma 3, we can verify that
$$ P\left(\frac{\sup_{T \in [T_2, T_U]} |U(T) - \mathbb{E} U(T)|}{\sqrt{2^J}} > \epsilon\right) \le C\, 2^J \exp\Big(-c\, J^{-\max\{1,\,\theta/2\}}\, 2^{J/2}\Big). $$
Finally, given that $\gamma > 1/2$, it can be shown that
$$ \frac{\sup_{T \in [T_2, T_U]} |\mathbb{E} U(T)|}{\sqrt{2^J}} \le \sup_{T \in [T_2, T_U]} \frac{C\, T^2\, 2^{J/(2\gamma+1)}\, J^{1/(2\gamma+1)} + C\, T^{\max\{1,\theta-1\}}\, e^{-\frac{T^2}{2\sigma^2}}\, 2^J}{\sqrt{2^J}} \le C\, J^{\max\{1/2,\,(\theta-1)/2\}}\, 2^{\frac{2-3\alpha}{4}J} \to 0. $$
Thus, (13) holds and the theorem is completely proven. □
Theorem 2.
Let f be supported on some finite interval $[a, b]$ and be uniformly Lipschitz regular with some exponent $\gamma > 0$. Let $h(y, T)$ satisfy the conditions (6). Then,
$$ \frac{\widehat{R}_J(T_S) - R_J(T_{\min})}{2^J} \to 0 \quad \text{a.s. as } J \to \infty. $$
Proof. 
As in the previous theorem, we write
$$ \widehat{R}_J(T_S) - R_J(T_{\min}) = U(T_S) + \widehat{R}_J(T_{\min}) - R_J(T_{\min}), $$
where
$$ U(T) = \widehat{R}_J(T) - \widehat{R}_J(T_{\min}). $$
If $0 < \gamma \le 1/2$, then, for $0 < \alpha < \frac{4\gamma}{2\gamma+1}$, we denote $T_1 = \sigma\sqrt{\alpha \ln 2^J}$. If $\gamma > 1/2$, then, for $\frac{2}{3} < \alpha < \frac{2}{3}\cdot\frac{4\gamma}{2\gamma+1}$, we denote $T_2 = \sigma\sqrt{\frac{3\alpha}{2} \ln 2^J}$. In both situations, one of the estimates (7) or (8) is valid, the right-hand side of which we denote, in either case, by $p_J$.
Consider the case $0 < \gamma \le 1/2$ (the case $\gamma > 1/2$ is considered similarly). We have
$$ P\left(\frac{|U(T_S)|}{2^J} > \epsilon\right) \le P\big(T_S \le T_1\big) + P\left(\frac{\sup_{T \in [T_1, T_U]} |U(T) - \mathbb{E} U(T)| + \sup_{T \in [T_1, T_U]} |\mathbb{E} U(T)|}{2^J} > \epsilon\right). $$
For the second probability, repeating the reasoning from Lemma 3, we obtain
$$ q_J \equiv P\left(\frac{\sup_{T \in [T_1, T_U]} |U(T) - \mathbb{E} U(T)|}{2^J} > \epsilon\right) \le C\, 2^J \exp\Big(-c\, J^{-\max\{1,\,\theta/2\}}\, 2^J\Big), $$
$$ \frac{\sup_{T \in [T_1, T_U]} |\mathbb{E} U(T)|}{2^J} \to 0. $$
Finally, using Hoeffding’s inequality, we obtain
$$ r_J \equiv P\left(\frac{\big|\widehat{R}_J(T_{\min}) - R_J(T_{\min})\big|}{2^J} > \epsilon\right) \le C \exp\Big(-c\, J^{-\max\{2,\theta\}}\, 2^J\Big). $$
Thus,
$$ \sum_{J=1}^{\infty} (p_J + q_J + r_J) < \infty, $$
and the statement of the theorem follows from the Borel–Cantelli lemma. □

6. Examples

Here, we provide some examples of thresholding functions that satisfy the conditions of Theorems 1 and 2 and have been proposed by various authors in recent years. Some of them depend on additional parameters.
  • Thresholding function based on the hyperbolic tangent function [10]: for $\lambda > 0$,
$$ \rho_h(y, T) = \begin{cases} \operatorname{sgn}(y)(|y| - T) + T\,\dfrac{e^{\lambda \operatorname{sgn}(y)(|y|-T)/T} - e^{-\lambda \operatorname{sgn}(y)(|y|-T)/T}}{e^{\lambda \operatorname{sgn}(y)(|y|-T)/T} + e^{-\lambda \operatorname{sgn}(y)(|y|-T)/T}}, & |y| > T; \\ 0, & |y| \le T. \end{cases} $$
  • Hybrid thresholding function [11]: for $\lambda > 1$,
$$ \rho_h(y, T) = \begin{cases} \operatorname{sgn}(y)\big(|y| - |y|^{1-\lambda}\, T^{\lambda}\big), & |y| > T; \\ 0, & |y| \le T. \end{cases} $$
  • Sigmoid thresholding function [12]: for $\lambda \ge 0$,
$$ \rho_h(y, T) = \begin{cases} \operatorname{sgn}(y)(|y| - T) + \dfrac{2T}{1 + e^{-\lambda \operatorname{sgn}(y)(|y|-T)}} - T, & |y| > T; \\ 0, & |y| \le T. \end{cases} $$
  • Soft–hard thresholding function [13] (this function depends on two threshold values; the second threshold $T' > T$ is an additional parameter):
$$ \rho_h(y, T) = \begin{cases} 0, & |y| \le T; \\ \operatorname{sgn}(y)\,\dfrac{T'(|y| - T)}{T' - T}, & T < |y| \le T'; \\ y, & |y| > T'. \end{cases} $$
  • The smooth clipped absolute deviation (SCAD) penalty thresholding function [14]: for $\lambda > 2$,
$$ \rho_h(y, T) = \begin{cases} 0, & |y| \le T; \\ \operatorname{sgn}(y)(|y| - T), & T < |y| \le 2T; \\ \operatorname{sgn}(y)\,\dfrac{\lambda |y| - |y| - \lambda T}{\lambda - 2}, & 2T < |y| \le \lambda T; \\ y, & |y| > \lambda T. \end{cases} $$
Some other examples of thresholding functions can be found in the monograph [3] and the papers cited therein.
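As a concrete illustration, the sketch below (our own code) evaluates the SCAD rule from the list above in its piecewise form; the name lam stands for the parameter $\lambda > 2$, and the default value 3.7 is only a choice commonly used in the variable-selection literature, not a recommendation of this paper.

```python
# A sketch of the SCAD thresholding rule in the piecewise form listed above.
import numpy as np

def scad_threshold(y, T, lam=3.7):
    """Piecewise SCAD: dead zone up to T, soft up to 2T, linear blend up to lam*T, identity beyond."""
    y = np.asarray(y, dtype=float)
    ay, s = np.abs(y), np.sign(y)
    out = np.where(ay <= T, 0.0, s * (ay - T))                    # dead zone and soft branch
    mid = (ay > 2 * T) & (ay <= lam * T)
    out = np.where(mid, s * ((lam - 1) * ay - lam * T) / (lam - 2), out)
    return np.where(ay > lam * T, y, out)                         # hard branch for large |y|

print(scad_threshold(np.array([-6.0, -2.5, -1.2, 0.3, 1.6, 3.0, 8.0]), T=1.0))
```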

7. Conclusions

Wavelet analysis and threshold processing methods have found wide application in the signal analysis of noisy data. The advantages of these methods include high computational efficiency and the ability to adapt to the local features of the estimated function. Under some additional assumptions, it is possible to economically represent the signal function in such a way that the useful signal is concentrated in a relatively small number of coefficients with sufficiently large absolute values, after which threshold processing is performed to remove noise arising both for natural reasons (for example, due to background sources) and due to imperfections in the equipment used. Typically, the noise distribution is assumed to be additive and Gaussian. This model has been well studied in the literature. Various types of threshold processing and parameter selection strategies adapted to specific practical applications have been proposed. The analysis of errors arising from the application of threshold processing methods is an important practical task, since it allows one to evaluate the quality of both the methods themselves and the equipment used for processing. Most of the efforts in this area are devoted to studying the asymptotic order of the theoretical risk, which cannot be calculated in practice because it explicitly depends on the theoretical unobservable noise-free signal. However, a statistical risk estimate based only on the observed data can also be used to assess the quality of noise reduction methods.
In this paper, a model of a signal contaminated with additive Gaussian noise is considered, and the general formulation of the thresholding problem with threshold functions belonging to a certain special class is discussed. A number of examples of known threshold functions from the considered class are given, which are used in such problems as ECG and EEG signal processing, the processing of noisy images, electromyography, noise reduction in audio signals, seismic signal analysis, etc. The specific type of thresholding function for practical use depends on the application and the amount of available data. Tuning additional parameters of the specified thresholding methods improves their performance in some specific applications. For example, [10,11] present numerical studies demonstrating the effectiveness of hybrid thresholding and hyperbolic tangent-based thresholding in ECG signal processing.
This paper obtains lower bounds for thresholds that minimize the unbiased risk estimate. It also provides conditions under which this risk estimate is asymptotically normal and strongly consistent. The authors plan to conduct a number of further studies in this direction. In particular, we will seek to obtain results concerning the construction of confidence intervals and estimates of the convergence rate, which, in turn, will allow us to obtain specific error values in signal processing for a wide range of threshold processing methods. It is also planned to consider similar models for dependent sample components and non-Gaussian noise.

Author Contributions

Conceptualization, A.K. and O.S.; methodology, A.K. and O.S.; formal analysis, A.K. and O.S.; investigation, A.K. and O.S.; writing—original draft preparation, A.K. and O.S.; writing—review and editing, A.K. and O.S.; supervision, A.K. and O.S.; funding acquisition, O.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Ministry of Science and Higher Education of the Russian Federation, project no. 075-15-2024-544.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mallat, S. A Wavelet Tour of Signal Processing; Academic Press: New York, NY, USA, 1999; p. 857.
  2. Donoho, D.; Johnstone, I.M. Ideal Spatial Adaptation via Wavelet Shrinkage. Biometrika 1994, 81, 425–455.
  3. Iain Johnstone, Department of Statistics, Stanford University. Available online: https://imjohnstone.su.domains/GE_09_16_19.pdf (accessed on 23 October 2024).
  4. Gao, H.-Y. Wavelet Shrinkage Denoising Using the Non-Negative Garrote. J. Comput. Graph. Stat. 1998, 7, 469–488.
  5. Chmelka, L.; Kozumplik, J. Wavelet-Based Wiener Filter For Electrocardiogram Signal Denoising. Comput. Cardiol. 2005, 32, 771–774.
  6. Poornachandra, S.; Kumaravel, N.; Saravanan, T.K.; Somaskandan, R. WaveShrink Using Modified Hyper-Shrinkage Function. In Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 17–18 January 2006; pp. 30–32.
  7. Lin, Y.; Cai, J. A New Threshold Function for Signal Denoising Based on Wavelet Transform. In Proceedings of the 2010 International Conference on Measuring Technology and Mechatronics Automation, Changsha, China, 13–14 March 2010; pp. 200–203.
  8. Huang, H.-C.; Lee, T.C.M. Stabilized Thresholding with Generalized Sure for Image Denoising. In Proceedings of the 2010 IEEE International Conference on Image Processing (ICIP 2010), Hong Kong, China, 26–29 September 2010; pp. 1881–1884.
  9. Zhao, R.-M.; Cui, H.-M. Improved Threshold Denoising Method Based on Wavelet Transform. In Proceedings of the 2015 7th International Conference on Modelling, Identification and Control (ICMIC), Sousse, Tunisia, 18–20 December 2015; Art. 7409352. pp. 1–4.
  10. He, C.; Xing, J.; Li, J.; Yang, Q.; Wang, R. A New Wavelet Thresholding Function Based on Hyperbolic Tangent Function. Math. Prob. Eng. 2015, 2015, 528656.
  11. Hussain, S.A.; Gorashi, S.M. Image Denoising based on Spatial/Wavelet Filter using Hybrid Thresholding Function. Int. J. Comput. Appl. 2012, 42, 5–13.
  12. He, H.; Tan, Y. A novel adaptive wavelet thresholding with identical correlation shrinkage function for ECG noise removal. Chin. J. Electron. 2018, 27, 507–513.
  13. Gao, H.-Y.; Bruce, A.G. Waveshrink with firm shrinkage. Stat. Sin. 1997, 7, 855–874.
  14. Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360.
  15. Priya, K.D.; Rao, G.S.; Rao, P.S. Comparative analysis of wavelet thresholding techniques with wavelet-wiener filter on ecg signal. Procedia Comput. Sci. 2016, 87, 178–183.
  16. Stein, C. Estimation of the Mean of a Multivariate Normal Distribution. Ann. Stat. 1981, 9, 1135–1151.
  17. Boggess, A.; Narkowich, F. A First Course in Wavelets with Fourier Analysis; Prentice Hall: Upper Saddle River, NJ, USA, 2001; p. 283.
  18. Antoniadis, A.; Leporini, D.; Pesquet, J.-C. Wavelet thresholding for some classes of non-Gaussian noise. Stat. Neerl. 2002, 56, 434–453.
  19. Kudryavtsev, A.A.; Shestakov, O.V. Asymptotically optimal wavelet thresholding in models with non-gaussian noise distributions. Dokl. Math. 2016, 94, 615–619.
  20. Donoho, D.; Johnstone, I.M. Adapting to Unknown Smoothness via Wavelet Shrinkage. J. Am. Stat. Assoc. 1995, 90, 1200–1224.
  21. Marron, J.S.; Adak, S.; Johnstone, I.M.; Neumann, M.H.; Patil, P. Exact Risk Analysis of Wavelet Regression. J. Comput. Graph. Stat. 1998, 7, 278–309.
  22. Kudryavtsev, A.A.; Shestakov, O.V. Asymptotic Behavior of the Threshold Minimizing the Average Probability of Error in Calculation of Wavelet Coefficients. Dokl. Math. 2016, 93, 295–299.
  23. Bennett, G. Probability inequalities for the sum of independent random variables. J. Am. Stat. Assoc. 1962, 57, 33–45.
  24. Shestakov, O.V. Asymptotic normality of adaptive wavelet thresholding risk estimation. Dokl. Math. 2012, 86, 556–558.
  25. Shestakov, O.V. On the Strong Consistency of the Adaptive Risk Estimator for Wavelet Thresholding. J. Math. Sci. 2016, 214, 115–118.