1. Introduction
Global climatic and environmental problems, such as sea-level rise, polar ice sheet and glacier melting, drought, and flood disasters, are closely related to the mass redistribution at the Earth’s surface [1,2]. The GRACE project has been successfully applied to retrieve mass transport within and between different Earth systems, including the solid earth, cryosphere, ocean, terrestrial water, etc. However, unconstrained GRACE level-2 products inevitably contain high-frequency and striping errors, caused by the anisotropic mission sensitivity associated with the specific orbit and set-up of the mission. As a result, filtering unconstrained GRACE level-2 solutions is a requisite step before calculating mass transport.
DDK filters represent a class of de-striping filters that can reduce both high-frequency and striping errors effectively. These filters are designed from a regularization/inversion perspective [3,4,5,6,7]. They differ from the empirical decorrelation filters [8,9,10], which also find wide application. Many other filters exist as well [11,12,13,14,15,16,17,18], but these are out of the scope of this study since we mainly focus on DDK-like filters in this work. Usually, two types of input are necessary to design a DDK filter, namely the noise covariances and the signal covariances. Several different variants of DDK filters have been proposed in the literature, distinguished by their specific selections of these two inputs.
A full error covariance matrix is vital for a filter to be able to de-stripe a solution. This is emphasized in [3], where the DDK filter is designed from the perspective of Wiener filtering. Note that the Wiener-filter perspective was also followed earlier in [19]. However, a diagonal error covariance matrix was employed there rather than a full one, which is exactly why that filter could not de-stripe solutions efficiently [19]. At the time when the DDK filtering method was proposed, the full error covariance matrices were generally inaccessible to most users; hence, the computation of an approximate full error covariance matrix from publicly available data was proposed in [5]. Fortunately, full error covariance matrices are nowadays becoming available from more and more processing centers [6].
The signal covariance matrices often differ between variants of DDK filters, either in their structures or in the sources of input information. With regard to structure, a diagonal signal covariance matrix in the spectral domain was defined in [4,5,6], while a diagonal structure in the spatial domain was followed in [3]. The spectral-domain covariance matrix propagated from a diagonal spatial-domain covariance matrix is generally full. It is stated that neglecting the off-diagonal elements of such a full covariance matrix can result in visible differences in the filtered solutions compared to the case when the off-diagonal elements are preserved. However, we think this difference does not necessarily mean that using the full signal covariance matrix in the spatial domain results in a better performance. Concerning the sources of input information, the signal covariances can be obtained from prior geophysical models [4,5] or from the data themselves [3]. The latter approach can be called data-driven or data-adaptive; it can be viewed as a variance and covariance component estimation. One of the main potential merits of data-adaptive methods is that they cannot be misled by inadequate prior models. On the other hand, it is found in [6] that the results do not differ significantly, no matter whether the signal covariance matrix comes from a prior model or from the data themselves.
In this work, we propose a new DDK scheme for filtering GRACE level-2 geopotential Spherical Harmonic (SH) coefficients. The proposed DDK filter assumes a diagonal structure of the signal covariance matrix in the spectral domain, similar to [4,5,6]. The proposed filter is also data-adaptive, similar to [3]. The filter is obtained simply by replacing the L2 norm (of the SH coefficient vector) in Tikhonov regularization with a weighted L1 norm. A linear regression problem constrained with L1-norm regularization is nothing but the famous Lasso problem [20], widely known in the statistical community [21]. This is still a convex problem [22], and the convexity of the L1 norm is exactly the reason why it is widely used compared to other sparse regularizations; its applications cover many fields, such as statistics, machine learning, and signal processing. The proposed approach can be understood as a DDK filter with diagonal signal covariance matrices (in the spectral domain) automatically determined from the data. This interpretation follows from viewing the L1 norm as a diagonally weighted L2 norm, which will be introduced further in the next section. In addition, the proposed sparse DDK filter employs the monthly time-varying full error covariance matrices, which have been made available by more and more processing centers. This time-varying covariance information has been proved superior to a stationary one [6].
As a byproduct, the filtered solutions (namely, the geopotential SH coefficients) obtained with the proposed approach are sparse. This is why we call our method the sparse DDK filter. A sparse vector is a vector with some of its elements being exactly zero. The reason why the solution can be sparse is briefly explained below. The data to be processed (filtered), namely the SH coefficients, are residual variables, obtained by subtracting the corresponding reference values, e.g., a long-term mean or linear trend, from the SH coefficients provided by processing centers. Therefore, it is natural that some of the residual SH coefficients are zero or of small magnitude. This means that vectors composed of these SH coefficients are compressible, and it is then feasible to employ a sparse model (namely the sparse filtered SH solution) to fit such data. Furthermore, a sparse model is also beneficial in terms of model generalization, simply because it is simple, and a simpler model is often better from the viewpoint of model selection or statistical learning [21,23]. Note that besides the filtered solution mentioned above, a DDK filter itself, namely the filtering matrix F0 defined in Equation (6) in the following section, is also sparse (or can be made sparse), with some coefficients equal to zero (in the spectral domain). This sparsity is due to the (approximate) independence between different orders/parities [4]. This (approximate) independence is exactly the starting point for designing the empirical decorrelation filters [8], as well as for simplifying the DDK filters [4].
The paper is organized as follows. After this introduction, the methodology is presented in Section 2, including an introduction to the Lasso problem, the algorithm employed to iteratively compute the filtered geopotential SH coefficients, the method to select appropriate hyperparameters (regularization parameters), and a variance/covariance analysis. In Section 3, the relevant regularization strategies are discussed and compared, and the properties of the proposed sparse DDK filter are discussed, with comparisons to other filtering approaches. Data analysis is presented in Section 4, including signal and noise evaluation for both global and regional cases, followed by the conclusion and outlook in Section 5.
2. Materials and Methods
Let us denote the monthly residual geopotential SH coefficients (with a long-term mean removed) as a vector x. The corresponding time-varying full error covariance matrix, which is assumed to be available, is denoted as Q. The task is to design a DDK-type filter, denoted as F0, and to compute the filtered solution as x̂ = F0x. No other data are needed. The other input needed to design a DDK filter, namely the signal covariance matrix (in the spectral domain here), denoted as S, is implicitly obtained from the available data, namely x together with Q.
2.1. A Brief Introduction to the Lasso Problem
Assume there is a measurement model expressed as follows:

y = Bβ + e, (1)

where y, B, β, and e are the measurement vector, design matrix, parameter vector to be estimated, and noise vector, respectively. The covariance matrix of the measurement noise e is denoted as Q. For the sparse L1-norm regularized case, the estimate is defined as follows:

β̂ = arg min_β [ (y − Bβ)ᵀQ⁻¹(y − Bβ) + μ‖β‖₁ ], (2)
where μ is a regularization parameter. The estimate β̂ is nothing but the Lasso solution [20]. Prior constraints on the solution are imposed by including the regularization term ‖β‖₁, which is similar to that in conventional DDK filters [5]. However, the L1 norm is employed in this regularization, in contrast to the L2 norm of the Tikhonov regularization used in conventional DDK filters. The fitting error term (y − Bβ)ᵀQ⁻¹(y − Bβ) implies a Gaussian distribution of the measurement errors. The Lasso solution can be viewed as a Bayesian solution with a Laplace prior, rather than the Gaussian prior implied in conventional DDK filters [24]. The correlations within the GRACE SH coefficients make them non-Gaussian, so we adopt a Laplace prior to constrain the magnitudes of the parameters rather than their energy, as a Gaussian prior would imply. It should be noted that the regularization term may appear in the form ‖Wβ‖₁ (with W being a regularization matrix) in some textbooks [25]; a problem involving ‖Wβ‖₁ is usually called a weighted Lasso. As we will show in the following subsection, a weighted Lasso problem can be converted into a standard one as presented in (2). For simplicity, we only present the standard Lasso in this subsection.
The Lasso solution in Equation (2) is calculated with an efficient algorithm called FISTA [26], which is briefly introduced in the following subsection. The regularization coefficient μ provides a tradeoff between the fitting term (needed to extract the signal) and the smoothing term (needed to suppress noise in the measurement vector y). An appropriate value of μ is selected objectively according to a tailored variant of Generalized Cross-Validation (GCV), presented in the following subsection. The solution is sparse, so some elements of β̂ are exactly zero; this means that the Lasso method also plays the role of model coefficient selection, in an automatic way. By tuning the regularization parameter, the L1-norm regularization can produce solutions with different degrees of smoothing/denoising and then perform model coefficient selection automatically (according to GCV), namely balancing fitting and smoothing appropriately. This property facilitates the appropriate separation of signal and noise in DDK filtering, that is, an automatic extraction of the real mass anomaly signal from noisy monthly solutions.
2.2. A Sparse DDK Filter with Weighted Lasso
The sparse DDK filter is designed based on a weighted Lasso, a special variant of the so-called adaptive Lasso in [27]. In order to approximate the attenuation of the gravity field signal with increasing degree, we adopt a power law to construct a weighting matrix, namely W = diag(wₗ) with wₗ = lᵖ, where l is the SH degree and the power index p is taken as four empirically [5]. This weighting matrix is introduced to better approximate the mass anomaly signal (in the spectral domain), so the sparse DDK filtering solution with weighted Lasso is a tailored version of Equation (2):

x̂ = arg min_x̃ [ (x − x̃)ᵀQ⁻¹(x − x̃) + μ‖Wx̃‖₁ ], (3)
where the design matrix B in Equation (2) is an identity matrix here. The weighted Lasso solution x̂ is exactly the filtered solution of our sparse DDK filter. We can calculate this solution by solving the following standard Lasso problem:

β̂ = arg min_β [ (x − W⁻¹β)ᵀQ⁻¹(x − W⁻¹β) + μ‖β‖₁ ], (4)

The standard Lasso problem involved in the weighted Lasso solution in Equation (4) can be abstracted as Equation (2): the matrix B, the measurement vector y, and the parameter vector β in Equation (2) correspond to the inverse weighting matrix W⁻¹, the unfiltered SH coefficient vector x, and the transformed unfiltered SH coefficient vector β = Wx̃ in Equation (4), respectively. To obtain the sparse DDK filtered SH solution x̂ in Equation (4), we first solve a standard Lasso problem for the transformed variable β, and then transform the solution back to the original variable, namely x̂ = W⁻¹β̂. The FISTA algorithm for the standard Lasso solution β̂, and the tailored GCV criterion for the selection of the hyperparameter μ, are introduced in the following Section 2.3.
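The equivalence of the weighted and standard Lasso objectives under the transformation β = Wx̃ can be checked numerically. The sketch below uses the degree-wise power-law weights wₗ = l⁴ from the text; the small vectors and covariance are hypothetical numbers for illustration only.

```python
import numpy as np

# Weighted-Lasso objective J_w(xt) = (x - xt)^T Qinv (x - xt) + mu*||W xt||_1
# equals the standard-Lasso objective with design B = W^{-1} and data y = x,
# evaluated at the transformed variable beta = W xt.
degrees = np.array([2, 2, 3])                # SH degree of each coefficient
W = np.diag(degrees.astype(float) ** 4)      # power-law weights w_l = l^p, p = 4
Qinv = np.diag([1.0, 2.0, 0.5])              # inverse error covariance (hypothetical)
x = np.array([0.3, -0.1, 0.05])              # unfiltered residual coefficients
xt = np.array([0.2, 0.0, 0.01])              # a candidate filtered solution
mu = 1.5

J_weighted = (x - xt) @ Qinv @ (x - xt) + mu * np.abs(W @ xt).sum()

beta = W @ xt                                # transformed variable
Binv = np.linalg.inv(W)                      # standard-Lasso design B = W^{-1}
J_standard = (x - Binv @ beta) @ Qinv @ (x - Binv @ beta) + mu * np.abs(beta).sum()
```

The two objective values coincide, which is exactly why the weighted problem can be solved through a standard Lasso in β followed by the back-transformation x̂ = W⁻¹β̂.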
After obtaining the sparse DDK filtered solution with Equation (4), it is also meaningful to construct an equivalent DDK filtering matrix (an equivalent filter) from this solution, so as to analyze the properties of the sparse DDK filter. This is realized by viewing the L1-norm regularized solution of the sparse DDK filter as a weighted L2-norm regularized solution. To be specific, let R = diag(1/|β̂ⱼ|); then ‖β̂‖₁ = β̂ᵀRβ̂. This can be proved as follows. ‖β̂‖₁ is the sum of the absolute values of all elements of the vector β̂. In β̂ᵀ(Rβ̂), Rβ̂ is a vector with all elements being ±1, the sign of each element being the same as that of the corresponding element of β̂ (because Rβ̂ = sign(β̂)). Then, β̂ᵀ(Rβ̂) is also the sum of the absolute values of all elements of β̂. This completes the proof. The Lasso problem given by Equation (4) can therefore be viewed as a weighted L2-norm regularization, and accordingly, the equivalent DDK filtered solution of the proposed sparse DDK filter can be represented as follows:

x̂ = F0x, (5)

where the equivalent DDK filtering matrix (in the spectral domain) of the sparse DDK filter, with R evaluated at the transformed solution β̂ = Wx̂, is defined accordingly as follows:

F0 = (Q⁻¹ + μWᵀRW)⁻¹Q⁻¹, (6)

This equivalent DDK filtering matrix is nothing but one constructed based on Tikhonov regularization, such as a conventional DDK filtering matrix. Using this equivalent filtering matrix, we can also obtain the equivalent sparse DDK filtered solution x̂ = F0x. Note that this equivalent solution is, strictly speaking, non-sparse, because the elements corresponding to the zero elements of the original sparse solution are very small (close to zero). Therefore, in terms of spatial resolution, this equivalent filtered solution does not present any performance degradation.
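The identity underlying the equivalent filter, ‖v‖₁ = vᵀ diag(1/|vⱼ|) v for a vector with nonzero entries, is easy to verify numerically. The values below are hypothetical; exactly zero entries must be excluded or regularized, which is why the equivalent filtered solution is only nearly sparse.

```python
import numpy as np

v = np.array([0.5, -2.0, 0.25])      # nonzero entries only
R = np.diag(1.0 / np.abs(v))         # diagonal reweighting matrix
l1_norm = np.abs(v).sum()            # ||v||_1
weighted_l2 = v @ R @ v              # v^T R v, the weighted L2 form
```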
Next, we also develop the signal covariance matrix of the sparse DDK filtered solution x̂. Assuming σ̂² is a posterior estimate of the variance component used to scale the error covariance matrix Q (how to calculate σ̂² is detailed in Section 2.4), Equation (6) can be written as:
Therefore, the corresponding posterior signal covariance matrix after equivalent DDK filtering is readily available as follows:
Clearly, this posterior signal covariance matrix is an adjusted version of the prior one R0. Let us review the proposed sparse DDK filter: the sparse filtered solution is calculated based on Equation (4), using the numerical algorithm developed in the following subsection. In order to better compare the conventional DDK filters with the sparse DDK filter, an equivalent DDK filtering matrix F0 of the latter, given in Equation (6), is constructed from the obtained filtered solution, as well as its equivalent DDK signal covariance matrix in Equation (8).
2.3. Computation Algorithm
In general, no analytical solution is available for the standard Lasso problem in Equation (2), due to the non-differentiability introduced by the L1-norm regularization. Many algorithms have been proposed to solve this kind of problem numerically. In this work, a highly efficient method called FISTA is employed [26]; this algorithm has also been used in our previous studies [28,29,30]. It is a proximal gradient algorithm, further accelerated by the Nesterov momentum method [31]. In FISTA, we conduct the following evaluations iteratively until convergence:

β_k = S_{λμ}( z_k − 2λBᵀQ⁻¹(Bz_k − y) ),
s_{k+1} = (1 + √(1 + 4s_k²))/2, (9)
z_{k+1} = β_k + ((s_k − 1)/s_{k+1})(β_k − β_{k−1}),

The iterations are initialized as z₁ = β₀ = 0 and s₁ = 1. The iterations are terminated when ‖β_k − β_{k−1}‖/‖β_{k−1}‖ < δ, with the threshold δ = 10⁻⁵ predefined empirically. The soft-thresholding operator S_τ( ) is element-wise; namely, the j-th element reads S_τ(v)ⱼ = sign(vⱼ)·max(|vⱼ| − τ, 0). The step-size coefficient λ is chosen as λ = 1/λmax(2BᵀQ⁻¹B), where λmax( ) denotes the maximum eigenvalue. The value β_k at the last iteration is taken as the final estimate, namely β̂ = β_k. Note that in Equation (9), both BᵀQ⁻¹B and BᵀQ⁻¹y do not change across iterations, so they only need to be calculated once and stored.
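The FISTA recipe described above can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' implementation: the function name `fista_lasso` and its arguments are our own, and the step size and stopping rule follow the standard FISTA formulation referenced in the text.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft thresholding: sign(v) * max(|v| - tau, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lasso(B, y, Qinv, mu, delta=1e-5, max_iter=5000):
    """Solve min_beta (y - B beta)^T Qinv (y - B beta) + mu*||beta||_1 by FISTA."""
    BtQB = B.T @ Qinv @ B                    # constant across iterations
    BtQy = B.T @ Qinv @ y                    # constant across iterations
    lam = 1.0 / np.linalg.eigvalsh(2.0 * BtQB).max()   # step size 1/L
    beta_prev = np.zeros(B.shape[1])
    z = beta_prev.copy()
    s = 1.0
    for _ in range(max_iter):
        grad = 2.0 * (BtQB @ z - BtQy)       # gradient of the fitting term
        beta = soft_threshold(z - lam * grad, lam * mu)
        s_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * s * s))
        z = beta + (s - 1.0) / s_next * (beta - beta_prev)  # Nesterov momentum
        if np.linalg.norm(beta - beta_prev) <= delta * np.linalg.norm(beta_prev):
            beta_prev = beta
            break
        beta_prev, s = beta, s_next
    return beta_prev
```

For the special case B = I and Q = I, the Lasso solution has the closed form S_{μ/2}(y), which provides a convenient sanity check for the iteration.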
We follow a trial-and-error strategy to select the appropriate hyperparameter, namely the regularization parameter μ in Equation (2). For each trial value of μ, we calculate the following GCV value:

GCV(μ) = m·RSS(μ)/(m − p(μ))², (10)

In the above, the Residual Sum of Squares (RSS) term is calculated as RSS(μ) = (y − Bβ̂(μ))ᵀQ⁻¹(y − Bβ̂(μ)). The length of the measurement vector y is denoted as m, and the number of nonzero elements in the solution is denoted as p. We deliberately introduce μ as an argument of the corresponding variables above in order to explicitly show their dependence on μ. The chosen μ is the one that minimizes the GCV, namely μ̂ = arg min_μ GCV(μ). Equation (10) is the well-known GCV [32,33]; however, it is tailored to the Lasso problem. This tailoring is permitted by the following property of the Lasso problem revealed by [34]: the number of nonzero parameters in the Lasso solution is an unbiased and asymptotically consistent estimate of the effective number of free parameters of the Lasso estimation problem.
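The trial-and-error GCV selection can be sketched as a grid search. To keep the example self-contained we use the special case B = I, Q = I, where the Lasso solution is the closed-form soft threshold S_{μ/2}(y); this is an illustrative assumption, and in the general case each trial μ would instead call an iterative Lasso solver.

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def gcv_select_mu(y, mu_grid):
    """Pick mu minimizing GCV(mu) = m*RSS/(m - p)^2 for the identity-design,
    unit-covariance Lasso, whose solution is soft_threshold(y, mu/2)."""
    m = y.size
    best_mu, best_gcv = None, np.inf
    for mu in mu_grid:
        beta = soft_threshold(y, mu / 2.0)   # closed-form Lasso solution
        rss = np.sum((y - beta) ** 2)
        p = np.count_nonzero(beta)           # effective number of parameters
        if p >= m:                           # guard against a zero denominator
            continue
        gcv = m * rss / (m - p) ** 2
        if gcv < best_gcv:
            best_mu, best_gcv = mu, gcv
    return best_mu
```

The count of nonzero coefficients enters the denominator in place of the usual trace of the influence matrix, following the property of [34] quoted above.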
2.4. Variance and Covariance Analysis
The error covariance matrix Q is often inaccurate, typically overestimating rather than underestimating the data accuracy. We can simply scale the covariance matrix with a scalar to account for this problem. This scalar is often called a variance component, denoted as σ². Ideally, therefore, we should replace Q with σ²Q. However, introducing this variance component does not affect the solution shown above, because it is absorbed into the tunable regularization parameter. Taking the equivalent DDK filtering matrix F0 in Equation (6) as an example, the error covariance matrix Q therein can be flexibly scaled through the regularization parameter (without using a variance component). However, in some accuracy assessments, this variance component is needed. The variance component and the regularization parameter can be estimated simultaneously with variance component estimation approaches [35,36]. However, we estimate the regularization parameter and the variance component separately. This is rational because the variance component does not affect the estimation of the SH parameters. After an appropriate regularization parameter is selected, as discussed above, we treat it as known; we then provide two posterior estimates of the variance component. The variance component is estimated for the model shown in Equation (3).
The first estimate is as follows:
where Reg represents the regularization term and n is the length of the filtered solution x̂. This estimate is based on viewing the constraints implied by the regularization term as pseudo-measurements. Therefore, besides the m measurements, we also have n pseudo-measurements. We can then view the regularized solution as a least-squares solution using both measurements and pseudo-measurements, and the above general formula is nothing but the standard posterior estimator of a single variance component in a least-squares adjustment. When applying this general formula to our specific DDK filtering, m = n.
The second estimate is as follows:
where p represents the number of nonzero elements in the filtered solution x̂. This estimate is obtained by considering the findings of [34], already introduced above. We have not found a theoretical relation between the two estimates; we compare them in the numerical studies presented in the following.
In addition to the above variance component, the error covariance matrix of the filtered variables may also be needed in some situations. Two approaches can be followed to provide such a covariance matrix. The first is obtained from a Bayesian viewpoint, namely by viewing the filtered solution as the maximum a posteriori probability (MAP) estimate. We rearrange the original parameter vector so that the upper part is nonzero and the lower part is zero in the solution. The uncertainty of the solution is as follows:
where N( ) denotes a normal (Gaussian) distribution and the covariance matrix of the transformed solution is given accordingly. This is correct up to the second-order terms in the expression for the logarithm of the posterior density. The reader can find a detailed derivation in Supplementary Text S1 in the Supplementary Material file.
In the second approach, the covariance matrix is obtained simply according to the error propagation law (EPL):
This kind of covariance, which accounts for formal errors only, tends to overestimate the accuracy. One of the reasons is that only the commission error is propagated, while the omission error is neglected in this propagation. When assessing the accuracy of a functional calculated from a filtered SH model, the covariance in Equation (13) is preferable. If the filtered SH model is treated as a prior in another SH modeling problem, the covariance in Equation (14) is preferable, as it better represents the data-only statistical information.
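The error propagation law for a linear filter takes the standard form cov(x̂) = F·Q·Fᵀ when x̂ = Fx and cov(x) = Q. A minimal sketch with a hypothetical 2×2 filtering matrix and error covariance:

```python
import numpy as np

# Error propagation law (EPL): if the filtered solution is xf = F x and
# cov(x) = Q, then cov(xf) = F Q F^T.  The matrices below are hypothetical.
F = np.array([[0.8, 0.1],
              [0.1, 0.6]])          # filtering matrix
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # error covariance of the unfiltered solution
Q_filtered = F @ Q @ F.T            # covariance of the filtered solution
```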
4. Results
The 7-year time series of GRACE monthly solutions with corresponding covariance matrices [44], ranging from January 2004 to December 2010, is used to evaluate the performance of the proposed sparse DDK filter. The regularization parameter μ of each month, selected by GCV, ranges from 5 × 10⁵ to 7.5 × 10⁶. The following seven filters are compared: the Gaussian 300-km smoother (denoted as G300), the Swenson and Wahr (SW) empirical decorrelation filter (denoted as Swen), SW filtering + Gaussian 300-km smoothing (denoted as S300), the DDK2, DDK3, and DDK4 filters, and the proposed sparse DDK filter (denoted as Spar). The three variants of conventional DDK filters, namely DDK2, DDK3, and DDK4, are selected for comparison because their filtering strengths are comparable to that of the sparse DDK. In addition, the CSR RL06 v2.0 mascon solutions are included in the comparison, considering that they need no filtering and provide well-localized mass anomaly estimates [12,45]. We calculate the filtered mass anomaly using the filtered solutions with the C20 coefficient replaced by that derived from satellite laser ranging (SLR) [46], and with the degree-1 coefficients estimated by [47] added. The glacial isostatic adjustment (GIA) effect is corrected using the ICE-6G_D model [48].
In the following Section 4.1, we compare the filtered mass anomalies of the seven filters for both global and local cases and give their uncertainty estimates. In Section 4.2, we show the signal retaining rates and noise levels of each filtered solution series. Finally, we evaluate the sparsity of the sparse DDK filtered solution and its signal retention in polar areas.
4.1. Filtered Mass Anomalies Analysis
The mass anomaly, namely the surface density ξP at a location P, in units of EWH, is computed as follows [49]:
where gP is the spectral conversion matrix, which converts the spherical harmonic coefficient vector x into the surface mass anomaly.
We again select the monthly solutions of April 2004 and April 2010 for illustration. The global distribution of EWHs before and after filtering is shown in Figure 4 for the selected months. In all seven filtered solutions, significant geophysical signals can be identified. The G300 smoothing effectively reduces noise, but some residual stripes remain, especially over ocean areas. The Swen empirical decorrelation completely de-stripes the solution; however, some high-frequency noise remains at middle and low latitudes. The S300 filter handles both high-frequency noise and stripes efficiently, but attenuates the signal at high latitudes, such as over Greenland and Antarctica. In the DDK2, DDK3, DDK4, and sparse DDK filtered solutions, the suppression of high-frequency noise and stripes is visually similar. We also notice that the smoothing effect of the sparse DDK is closest to that of the DDK3 filter.
To further evaluate the filtered local mass anomalies of the sparse DDK filter, we selected six study areas: the Amazon Basin, Congo Basin, Ganges Basin, Yangtze Basin, Alaskan glaciers (defined in Figure S1 in the Supplementary Material file), and Greenland. The surface mass densities in Equation (17) are integrated over a study area to calculate the total mass anomaly ζ in this area, as follows:

ζ = Σᵢ ξᵢΔᵢ, (18)

In the above, Δi denotes the area of the i-th grid cell in the sum-approximation of the integral. Accordingly, the formal variance of the total mass anomaly is computed through error propagation from the covariance matrix of the filtered solution, as follows:
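The area-weighted sum of Equation (18) and its error propagation in Equation (19) amount to ζ = aᵀξ and var(ζ) = aᵀCa, with a the vector of cell areas. A minimal sketch with hypothetical densities, areas, and covariance:

```python
import numpy as np

# Regional total mass anomaly as a weighted sum of gridded surface densities,
# zeta = sum_i xi_i * area_i, and its formal variance by error propagation.
# All numbers below are hypothetical.
xi = np.array([1.2, -0.4, 0.7])      # EWH surface densities per grid cell
area = np.array([2.0, 1.5, 1.0])     # grid-cell areas
C = np.diag([0.01, 0.02, 0.015])     # covariance of the filtered densities

zeta = area @ xi                     # Eq. (18): total mass anomaly
var_zeta = area @ C @ area           # Eq. (19): formal variance
std_zeta = np.sqrt(var_zeta)         # formal error StD
```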
In the following, we compare the formal error StD (standard deviation) of the filtered regional mass anomalies of the sparse DDK and the conventional DDK filters (DDK2, DDK3, and DDK4). For the sparse DDK filter, there are four options for calculating the covariance in Equation (19), namely Equation (11) with Equation (13), Equation (11) with Equation (14), Equation (12) with Equation (13), or Equation (12) with Equation (14). For easy reference, we call the corresponding variance-covariance estimates “MAP with σ1”, “EPL with σ1”, “MAP with σ2”, and “EPL with σ2”, respectively. For the conventional DDK filters, the posterior estimate of the variance component is computed according to the leftmost equation of Equation (11). The formal covariance matrix of a filtered solution from an error-propagation point of view is given by Equation (14), while that from a Bayesian point of view is as follows:
where FDDK is the block-diagonal filtering matrix directly provided in conventional DDK filters. The diagonal signal covariance matrix SDDK, as a scaled variant of the matrix W introduced in Section 2.2, is usually provided in the form of a given power index and regularization parameter. The involved time-invariant error covariance matrix QDDK is recovered from FDDK and SDDK based on the following relations:
Figure 5 shows the regional total mass anomaly series (left panel) for the six selected study areas, as well as their differences with respect to the mascon solutions (right panel); the corresponding root mean square differences (RMSD) are presented in Table 2. In general, all the filtered solutions produce similar mass anomaly estimates, and they agree well with the CSR RL06 v2.0 mascon solution. The mass anomaly differences (with mascon as reference) of the sparse DDK filtered solution are comparable to those of the other DDK filters. For the Alaskan glaciers and Greenland, the S300 and Swen filters show significant differences from the other filtered solutions and the mascon solution, because of the stronger signal smoothing of the involved empirical decorrelation filter at middle and high latitudes. In Greenland, the Spar filtered solutions differ from the mascon solution (but are consistent with the other DDK solutions), which may result from differences in leakage errors. This indicates that the proposed sparse DDK filter does not suffer significant degradation, even though some high-degree terms are forced to zero, whose effect is most pronounced at high latitudes.
Figure 6 presents the formal error StDs of the regional mass anomalies for the sparse DDK, DDK2, DDK3, and DDK4 filters. The following observations are drawn. First, since the variance component σ1 in Equation (11) is almost equal to σ2 in Equation (12) (the left panel in Figure 5), either of them can be adopted to scale the error covariance matrix of the filtered solution. Second, the error StDs derived from EPL in Equation (14) are always lower than those from MAP in Equation (13). This is because EPL propagates only the measurement errors, while the modelling errors are neglected; in other words, the uncertainty introduced by the prior information (e.g., from regularization) is ignored by EPL. Third, the error StDs of the sparse DDK filter are much larger than those of the DDK2, DDK3, and DDK4 filters. This may be due to missing covariance components for DDK2, DDK3, and DDK4 when the covariance matrix QDDK in Equation (21) is recovered from the block-diagonal FDDK and the signal covariance matrix SDDK. Finally, it is difficult to assess whether the formal error StD is too optimistic or too pessimistic, because we do not know the true regional mass anomalies. However, we still made an approximate verification using the mascon mass anomalies as the truth.
Table 3 lists the mean formal error StDs of the four calculation schemes for the six areas. Compared with the RMSDs of the sparse DDK in Table 2, we find that the formal error StDs from both MAP and EPL are comparable to those derived from the differences between the sparse DDK solutions and the mascon solutions in four of the six areas (Amazon, Congo, Ganges, Yangtze); the discrepancies are small. For the Alaskan glaciers and Greenland, the formal error StDs are still over-optimistic. This may be because the signal leakage of the mascon solutions is smaller than that of the sparse DDK filtered solutions in these regions, making the mass anomaly differences between them larger, which cannot be captured by the formal errors.
4.2. Signal and Noise Level Analysis for Filtered Solutions
First, the signal and noise levels of the filtered solutions are assessed in the spectral domain. Figure 7 presents the geoid degree variances of the filtered solutions for the three selected months. According to [50,51], the degree variances of the first 30 degrees mainly reflect the signal amplitude of the time-varying gravity field, while those beyond degree 30 mainly show the noise level. In our experience, such a statement is probably a bit too optimistic; it would be fairer to say that the range between degrees 20 and 30 is a transition zone, where the contributions of noise and signal are similar. From this viewpoint, the S300 filter retains the least time-varying gravity signal, though its noise level is also the lowest. The other filters do not differ much in signal preservation (the first 30 degrees), but they differ considerably in noise suppression (beyond degree 30). It is impressive that, compared with the other six filters, the sparse DDK retains a comparable gravity signal within the first 30 degrees while significantly suppressing the noise beyond degree 30. This mainly benefits from the high-degree and high-order sparsity brought by the weighted L1-norm regularization.
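A common way to compute degree variances from fully normalized SH coefficients is σₗ² = Σₘ (Cₗₘ² + Sₗₘ²); for geoid height this would carry an additional scale factor (e.g., the Earth radius), omitted in this hedged sketch, which assumes coefficients stored in lower-triangular (L+1)×(L+1) arrays.

```python
import numpy as np

def degree_variances(C, S):
    """Degree variances sigma_l^2 = sum_m (C_lm^2 + S_lm^2) from
    (L+1)x(L+1) lower-triangular coefficient arrays C[l, m] and S[l, m]."""
    L = C.shape[0] - 1
    return np.array([np.sum(C[l, :l + 1] ** 2 + S[l, :l + 1] ** 2)
                     for l in range(L + 1)])
```

Plotting such per-degree sums against degree l reproduces the kind of spectra shown in Figure 7.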
Then, the signal and noise levels of the filtered solutions are quantitatively analyzed in the spatial domain, both at selected locations and globally. In doing so, we derive mass anomaly time series at the Earth’s surface, taking into account the Earth’s oblateness [52]. The selected locations are five lakes: the Caspian Sea, Ladoga, Nasser, Tanganyika, and Victoria. The mean mass anomalies over these lakes (in terms of EWH) are compared with water level variations extracted from altimetry data. Annual variations are eliminated in the course of the comparison, since they may contain noticeable nuisance signals (e.g., nuisance hydrological signals in the GRACE data and steric water level variations in the altimetry data). For a similar reason, long-term linear trends are eliminated in the case of the Ladoga and Nasser lakes. Technically, the comparison is performed using a least-squares estimation of the unknown coefficients C1, …, C5, E, and, optionally, C6, in the following equations:
where ti is the time in the middle of the i-th month, h(t) is the altimetry-based time series of water level variations in the given lake, and d(t) is the GRACE-based time series of mean mass anomalies in the same lake. The post-fit residuals are interpreted as noise in the GRACE mass anomaly time series and reported in terms of noise StD.
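The least-squares comparison can be sketched as follows. The exact design of Equation (22) is not reproduced here; this is a hedged illustration assuming the GRACE series d(tᵢ) is modeled as a bias, a trend, annual sine/cosine terms, and a scaled altimetry series E·h(tᵢ). The coefficient layout and all series are synthetic assumptions, not the authors' actual data.

```python
import numpy as np

# Hedged sketch: model the GRACE series as
#   d(t) ~ C1 + C2*t + C3*cos(2*pi*t) + C4*sin(2*pi*t) + E*h(t),
# fit by least squares; the post-fit residual StD serves as the noise estimate.
t = np.arange(84) / 12.0                       # 7 years of monthly epochs
h = 2.0 * np.sin(2 * np.pi * t / 3.5)          # synthetic altimetry signal
rng = np.random.default_rng(0)
d = 0.5 + 0.1 * t + 0.8 * h + 0.05 * rng.standard_normal(t.size)

A = np.column_stack([np.ones_like(t), t,
                     np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), h])
coef, *_ = np.linalg.lstsq(A, d, rcond=None)
E = coef[-1]                                   # scaling factor (signal damping)
noise_std = np.std(d - A @ coef)               # post-fit residual StD
```

With noise-free input, E would recover the true damping factor exactly; the residual StD quantifies the noise in the GRACE series, as described in the text.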
The scaling factor E reflects the signal damping in a given GRACE-based time series. It is used to compute the quantity ε, called hereafter the “relative signal retaining”:

ε = E/Eexp,

The “expected signal retaining” Eexp in this expression reflects the signal damping caused solely by the truncation of the spherical harmonic series at Lmax = 60. Thus, the relative signal retaining ε contains information about the effect of filtering in the context of a particular lake. In the case of ideal noise-free input data, ε equals 1 if no filtering is applied, and decreases as the applied (low-pass) filter becomes increasingly aggressive. Further details concerning this comparison can be found in [53].
The obtained estimates of the noise StD and relative signal retaining for each of the eight solution time-series are shown per lake in
Figure 8. Furthermore,
Table 4 (the 2nd and 3rd columns) reports the mean values of the estimated noise StD and the weighted mean values of the relative signal retaining, respectively, which are obtained by averaging over all five lakes. One can see that in most cases, the more aggressive a filter is (i.e., the lower the relative signal retaining), the lower the resulting noise level. For instance, the S300 filter, which is apparently the most aggressive one, yields the lowest noise StD and the lowest relative signal retaining, whereas the DDK4 filter, being the least aggressive one, results in a relatively high noise level (when the mean values over the five lakes are considered). The sparse DDK filter looks more aggressive than DDK3 but less aggressive than DDK2. To be specific, the mean values of both the relative signal retaining and the noise StD are in-between the corresponding values of the DDK2 and DDK3 filters.
We also estimate the accuracy of a filtered solution time-series globally at the nodes of a global equiangular 1° × 1° grid. Noise StD is estimated at each grid node separately, using the procedure proposed by [
13]. That is, a regularized time series is computed at each node by minimizing the following objective function:
Φ[x] = (1/σn²) Σi (di − xi)² + (1/σs²) Ω[x],   (24)
where di is the original mass anomaly in the i-th month, xi is the regularized mass anomaly in the same month (to be estimated), σn² is an unknown noise variance, σs² is an unknown signal variance, and Ω[x] is a regularization functional. The goal of the regularization functional in Equation (24) is to ensure that the regularized mass anomaly time-series x(t) at each grid node is sufficiently realistic, so that the deviations of the data from those time-series can be exploited for an accurate estimation of noise variances. A point of special concern is an excessive damping of the original time series, which may result in an overestimation of noise levels. To mitigate this effect, we have applied a tailored regularization, which takes into account that mass anomaly time-series are typically characterized by an annual periodicity and/or long-term trends. Originally, a regularization of this type was introduced by [
13], who proposed to minimize year-to-year differences between the values of the first time-derivative of the mass anomaly time series. It was proved that such a regularization functional does not penalize annual periodic signals and linear trends at all, so that at least these signal components are not subject to any damping. In this study, we have adopted a modified variant of that functional, which minimizes year-to-year differences between the values of the second time-derivative of the mass anomaly time series. Assuming that this time series is represented as a continuous function of time x(t) in the time interval [tbeg; tend], the exploited regularization functional can be written as:
Ω[x] = ∫ [ẍ(t + ΔT) − ẍ(t)]² dt,
with the integral taken over t ∈ [tbeg; tend − ΔT], where ΔT = 1 year and ẍ(t) stands for the second derivative of x(t). This type of regularization was called a “minimization of Month-to-month Year-to-year Triple Differences” (MYTD) in [
53], where it was originally proposed. It yields better results than the minimization of year-to-year differences between the values of the first time-derivative, which was proposed in the aforementioned publication [13]. The unknown noise variance σn² and signal variance σs² are estimated in parallel with the minimization of the objective function, using the Variance Component Estimation (VCE) method [36]. Only the former is of interest in this study, since it can directly be used to estimate the noise StD σn at the given grid node.
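For illustration, the per-node procedure can be sketched as follows, assuming monthly sampling, a discrete 12-month-lag triple-difference implementation of the MYTD functional, and a simplified two-group VCE iteration; all function names and numerical safeguards are ours, not part of the cited processing chain:

```python
import numpy as np

def mytd_operator(n, lag=12):
    """Discrete MYTD operator for a monthly series of length n: the
    12-month-lag difference of second month-to-month differences.
    Annual periodic signals and linear (even quadratic) trends lie in
    its null space, so they are not penalized."""
    rows = n - lag - 2
    R = np.zeros((rows, n))
    for i in range(rows):
        R[i, i:i + 3] -= [1.0, -2.0, 1.0]            # second difference now
        R[i, i + lag:i + lag + 3] += [1.0, -2.0, 1.0]  # same, one year later
    return R

def estimate_noise_std(d, lag=12, n_iter=20):
    """Noise StD of a mass anomaly series d via a simplified two-group
    VCE, minimizing (1/sn2)*||d - x||^2 + (1/ss2)*Omega[x] with the
    discrete MYTD regularization (a sketch of the procedure)."""
    n = len(d)
    R = mytd_operator(n, lag)
    sn2 = ss2 = np.var(d)                # crude initial variance components
    for _ in range(n_iter):
        N = np.eye(n) / sn2 + (R.T @ R) / ss2
        Ninv = np.linalg.inv(N)
        x = Ninv @ (d / sn2)             # regularized time series
        e = d - x                        # post-fit residuals
        r_n = n - np.trace(Ninv) / sn2                      # redundancy, noise group
        r_s = R.shape[0] - np.trace(Ninv @ R.T @ R) / ss2   # redundancy, signal group
        sn2 = max((e @ e) / r_n, 1e-12)
        ss2 = max(((R @ x) @ (R @ x)) / max(r_s, 1e-6), 1e-12)
    return np.sqrt(sn2)
```

The operator's null space makes the key property explicit: annual sinusoids and polynomial trends up to degree two pass through unpenalized, so only the rough, noise-like part of the series contributes to the residuals.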
The obtained estimates of noise StD for the eight solution time-series under consideration are shown as maps in
Figure 9. The global RMS values of the obtained estimates are reported in the last column of
Table 4. One can see that the global estimates of noise level rank the solutions basically in the same way as the lake tests. For instance, the lowest noise level is observed after applying a relatively aggressive S300 filter, whereas the DDK4 filter yields a higher noise level than any other filter of the “DDK” type. The proposed sparse DDK filter results in a noise level that is lower than after the DDK3 filtering but higher than after the DDK2 filtering.
5. Discussion
Figure 10 shows the sparsity ratio and the maximum SH degree of sparse DDK filtered solutions from January 2004 to December 2010. The sparsity ratios are between 14.1% and 36.5%, while the maximum degrees are between 34 and 52. This sparsity significantly suppresses the high-degree and high-order noise, as presented in
Figure 7.
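Assuming the sparsity ratio is understood as the share of non-zero SH coefficients (degrees 2 to Lmax) and the maximum degree as the highest degree still carrying at least one non-zero coefficient, these two statistics can be computed as follows (a sketch; the data layout and names are ours):

```python
def sparsity_stats(coeffs, lmax=60, lmin=2):
    """coeffs: dict mapping (degree l, order m) to a filtered SH
    coefficient value (one value per (l, m) pair, for brevity).
    Returns the sparsity ratio, understood here as the share of
    non-zero coefficients, and the maximum degree that still carries
    at least one non-zero coefficient."""
    total = sum(l + 1 for l in range(lmin, lmax + 1))
    nonzero_degrees = [l for (l, m), v in coeffs.items() if v != 0.0]
    ratio = len(nonzero_degrees) / total
    max_degree = max(nonzero_degrees) if nonzero_degrees else 0
    return ratio, max_degree
```

A maximum degree below Lmax = 60, as reported above (34 to 52), means that the filtered solution is effectively band-limited by the weighted L1-norm regularization itself.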
It is well-known that GRACE senses mass anomalies near the poles much better than in the rest of the world. This implies that a monthly solution complete to a relatively high degree is needed in order to fully exploit the information contained in the SH coefficients when mass anomalies in polar areas are estimated. However, our sparse DDK filter may lose some of the high-frequency signal when discarding high-degree noise. Therefore, it is necessary to check the signal loss of the filtered solutions in polar areas. The six study areas in
Section 4.1, located at low, middle, and high latitudes, are taken as examples here. Considering that the noise (or stripes) has been largely eliminated, as visually confirmed in
Figure 4, we can approximately assume that the filtered solution mainly consists of signal. In the following, the retained signals within these areas are reported as the annual amplitude of the filtered solutions.
The average annual amplitude for the six areas is summarized in
Table 5. We find that the DDK-type filters always provide higher annual amplitudes than the other filters. In addition, the values of the mascon solutions are similar to those of the DDK filters. In most cases (except Greenland), the average annual amplitudes of the sparse DDK filtered solutions are still between those of DDK2 and DDK3, while the largest amplitudes are shown by the DDK4 solutions. This reveals that all the DDK filters show similar behavior. In addition, we also notice that the annual amplitude of the sparse DDK filtered solutions tends to become weaker at higher latitudes. Over Greenland, for instance, they are 5~17% weaker (4.0 versus 4.3, 4.7, 4.8, and 4.2 cm) compared to the other DDK filtered solutions and G300 smoothing, but still stronger than those of the Swen and S300 filters.
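The annual amplitudes discussed above can be obtained, for example, by a least-squares fit of a bias, a linear trend, and an annual sinusoid to each regional mean time series (a sketch under these model assumptions; the estimation procedure actually used may differ in detail):

```python
import numpy as np

def annual_amplitude(t, y):
    """Annual amplitude of a (filtered) mass anomaly series, obtained by
    least-squares fitting of a bias, a linear trend, and an annual
    sinusoid; t is given in years."""
    w = 2.0 * np.pi                      # annual angular frequency, rad/yr
    A = np.column_stack([np.ones_like(t), t, np.cos(w * t), np.sin(w * t)])
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.hypot(x[2], x[3]))   # sqrt(a_cos^2 + a_sin^2)
```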
6. Conclusions
Focusing on decorrelation and denoising for GRACE time-variable gravity field solutions, the contributions of this work include: (1) An alternative filter called sparse DDK is devised by using weighted L1-norm regularization; (2) the conventional mainstream filters are examined and compared in terms of their properties, including time-variability, location-inhomogeneity, anisotropy, independence from the prior model, sparsity, etc., and the relevant regularization strategies, including L0-, L1-, and L2-norm regularization, are discussed. The sparse DDK filter ensures that the obtained solutions are sparse, i.e., that most of the high-degree and high-order SH coefficients are zero. This means that strong noise, which may originate from these coefficients, is suppressed. The performance of the proposed sparse DDK filter and other mainstream filters was compared using real GRACE level-2 data. The results show that the proposed sparse DDK can remove stripes and high-frequency noise efficiently. The annual amplitudes and noise StD, based on both point-wise estimates and on average values for selected regions, show that the sparse DDK filter yields signal and noise levels comparable to those of the traditional DDK filters. The time-series of total mass anomalies in the selected research areas also indicate that the sparse DDK filter can recover mass anomalies similarly to other DDK filters. Finally, the following two points should be noted:
First, the filtering matrix
F in sparse DDK can also be simplified into block-diagonal form, as proposed by [
4], which can be achieved by retaining only the elements of the error covariance matrix that correspond to SH coefficients of the same order. On the other hand, one can also modify the conventional DDK filter, which is based on stationary error and signal covariance matrices, into a sparse DDK filter by directly replacing the Tikhonov regularization with the weighted L1-norm regularization. In addition, the regularization parameter involved in sparse DDK is selected by GCV, which is believed to correspond to the optimal tradeoff between signal retention and noise suppression. Of course, one can also adjust the regularization parameter of sparse DDK freely if a different filtering strength is needed.
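To make the weighted L1-norm idea concrete, the core of such a filter can be sketched as a proximal-gradient (ISTA) iteration; this is a minimal illustration under our own naming and parameter conventions, not the exact implementation used here:

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of the (weighted) L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def sparse_filter_ista(y, Cinv, alpha, weights, n_iter=500):
    """Minimal sketch of a weighted-L1 'sparse DDK' core: minimize
        (y - x)^T Cinv (y - x) + alpha * sum_i weights[i] * |x[i]|
    by ISTA (proximal gradient descent), where y is a vector of SH
    coefficients, Cinv the inverse error covariance, and weights grow
    with degree and order so that noisy high-degree coefficients are
    driven exactly to zero."""
    L = 2.0 * np.linalg.norm(Cinv, 2)    # Lipschitz constant of the gradient
    x = y.copy()
    for _ in range(n_iter):
        grad = 2.0 * Cinv @ (x - y)      # gradient of the quadratic data term
        x = soft_threshold(x - grad / L, alpha * weights / L)
    return x
```

The soft-thresholding step is what produces exact zeros in the filtered coefficient vector, in contrast to the Tikhonov (L2) regularization of the conventional DDK filters, which only shrinks coefficients toward zero.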
Second, for the sparse DDK filter, high-degree and high-order sparsity means that the maximum degree of the filtered solution is reduced, implying that its spatial resolution is reduced as well. However, this reduction does not cause an explicit loss of mass anomaly signal, especially at middle and low latitudes (at high latitudes, the loss is 5~17% in terms of annual amplitude, taking Greenland as an example). Note that the 60-degree GRACE monthly solutions are adopted in this work merely as an example. Considering that many data centers nowadays release higher-degree monthly time-varying gravity field solutions, complete to degree 90 (e.g., GFZ, JPL) or 96 (e.g., CSR, Tongji), it can be expected that after filtering with sparse DDK, the maximum degree of the filtered solutions can reach approximately 60 to 80, which significantly mitigates the loss of spatial resolution compared with using 60-degree data. In this case, it is worthwhile to discard tiny higher-degree signals rather than retain higher-degree noise, and the negative effects of high-degree signal loss can be further reduced.