1. Introduction
Compressed sensing (see, for example, [1,2,3,4,5,6,7]) was used to invert incomplete Fourier transforms in the context of sparse signal/image processing, and the $\ell_1$-norm was applied as a regularization for reconstructing an object from randomly selected incomplete frequency samples. Both the sparse regularization method and the compressed sensing method use the $\ell_1$-norm as a regularization to impose sparsity on the reconstructed signal under certain transforms. Because the models based on the $\ell_1$-norm are convex, they can be solved efficiently by available algorithms. Recently, the application of non-convex metrics as alternative approaches to the $\ell_1$-norm has been favored; see, for example, [8,9,10,11]. The main goal of this paper is to suggest the use of the Moreau envelope associated with the $\ell_0$-norm as a regularization. Note that the sparsity of a vector is originally measured by the $\ell_0$-norm of the vector, i.e., the number of its nonzero components. However, the $\ell_0$-norm is discontinuous at the origin, which is not appropriate from a computational point of view. The Moreau envelope of the $\ell_0$-norm is a Lipschitz surrogate of the $\ell_0$-norm, which is nonconvex. By [7], a local minimizer of a function that is the sum of a convex function and the Moreau envelope of the $\ell_0$-norm can be identified with a global minimizer of a convex function, which permits the algorithmic machinery of convex optimization to be used. For inverting incomplete Fourier transforms, the use of the envelope of the $\ell_0$-norm allows us to formulate a sparsity regularization model that can reduce artifacts and outliers in the reconstructed signal. It also allows us to design an efficient algorithm for the resulting nonconvex and nonsmooth optimization problem by means of a fixed-point formulation. Moreover, the link between this minimization problem and a related convex minimization problem permits us to prove convergence of our proposed algorithm. Furthermore, a connection with proximal/projection gradient methods is also provided by appealing to two key formulas.
2. A Sparse Regularization Model
In order to focus on the essential information to share, we adopt the same paper outline as in [6], and we assume the reader has some basic knowledge of monotone operator theory and convex analysis, as can be found, for example, in [12,13,14,15].
In what follows, we propose an extension of a sparse regularization model based on the Moreau envelope of the $\ell_0$-norm for inverting incomplete Fourier transforms considered in [6]. Likewise, relying on properties of the Moreau envelope of the $\ell_0$-norm, we obtain an equivalent formulation favorable for algorithmic development.
Given two Euclidean spaces of dimensions $N$ and $d$, a nonempty, closed and convex subset $Q \subseteq \mathbb{R}^d$ and a matrix $A \in \mathbb{R}^{d \times N}$, we are interested in this work in the following problem:
$$\text{find } y \in \mathbb{R}^N \text{ such that } Ay \in Q. \qquad (1)$$
This formalism is also at the heart of the modeling of many inverse problems, such as those posed by phase retrieval and other real-world problems; see [16] and references therein.
Our job is to describe the sparse regularization model for Equation (1) in order to obtain a sparse vector $y$. The $\ell_0$-norm, which counts the number of nonzero components of a vector $x \in \mathbb{R}^N$, is naturally used to measure its sparsity and is defined by
$$\|x\|_0 := \sum_{i=1}^{N} |x_i|_0, \qquad (2)$$
with $|x_i|_0 = 1$ if $x_i \neq 0$ and $|x_i|_0 = 0$ if $x_i = 0$.
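As a simple illustration (our own sketch, not part of the model itself; the function name l0_norm is ours), the $\ell_0$-norm can be evaluated as follows:

```python
import numpy as np

def l0_norm(x: np.ndarray) -> int:
    """Number of nonzero components of x (the l0-'norm')."""
    return int(np.count_nonzero(x))

# A vector with three nonzero entries has l0-norm 3.
x = np.array([0.0, 1.5, 0.0, -2.0, 0.0, 0.3])
print(l0_norm(x))  # 3
```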
Now, let $P_Q$ be the projection from $\mathbb{R}^d$ onto the set $Q$. Since the constraint $Ay \in Q$ is equivalent to the fact that $Ay = P_Q(Ay)$, we derive the following equivalent Lagrangian formulation
$$\min_{y \in \mathbb{R}^N} \Big\{ \|y\|_0 + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2 \Big\},$$
with $\lambda > 0$ a Lagrangian multiplier.
Both the non-convexity and the discontinuity of the $\ell_0$-norm at the origin lead to computational difficulties. To overcome these problems, we use a Lipschitz regularization of the $\ell_0$-norm given by its Moreau envelope. According to [14,17], for a positive number $\beta$, the Moreau envelope of $\|\cdot\|_0$ with index $\beta$ at $x \in \mathbb{R}^N$ is defined by
$$\operatorname{env}_{\beta}\|\cdot\|_0(x) := \min_{u \in \mathbb{R}^N} \Big\{ \frac{1}{2\beta}\,\|u - x\|^2 + \|u\|_0 \Big\}. \qquad (3)$$
The function $\operatorname{env}_{\beta}\|\cdot\|_0$ is continuous and locally convex near the origin. Moreover, as $\beta \to 0^{+}$, $\operatorname{env}_{\beta}\|\cdot\|_0(x) \to \|x\|_0$, so $\operatorname{env}_{\beta}\|\cdot\|_0$ is a good approximation of $\|\cdot\|_0$ when $\beta$ is small enough. Therefore, with an appropriate choice of the parameter $\beta$, $\operatorname{env}_{\beta}\|\cdot\|_0$ can be used as a measure of sparsity and allows us to avoid the drawbacks of $\|\cdot\|_0$.
For a fixed $\beta > 0$ and for $y \in \mathbb{R}^N$, we let
$$F_{\beta}(y) := \operatorname{env}_{\beta}\|\cdot\|_0(y) + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2, \qquad (4)$$
where $\lambda$ is a positive parameter.
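For intuition, the envelope (3) is separable across components and admits the closed form $\operatorname{env}_{\beta}\|\cdot\|_0(x) = \sum_{i=1}^{N} \min\big(x_i^2/(2\beta),\, 1\big)$, which the following sketch (our own illustration) evaluates, showing the approximation of $\|x\|_0$ as $\beta \to 0^{+}$:

```python
import numpy as np

def env_l0(x: np.ndarray, beta: float) -> float:
    """Moreau envelope of the l0-'norm' with index beta.

    Separable closed form: sum_i min(x_i**2 / (2*beta), 1).
    """
    return float(np.minimum(x**2 / (2.0 * beta), 1.0).sum())

x = np.array([0.0, 0.05, 2.0, -0.3])  # ||x||_0 = 3
for beta in (1.0, 0.1, 0.001):
    print(beta, env_l0(x, beta))  # tends to 3.0 as beta -> 0+
```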
To recover a sparse vector $y$ from (1), we now propose the following sparse regularization model based on the Moreau envelope of the $\ell_0$-norm:
$$\min_{y \in \mathbb{R}^N} \Big\{ \operatorname{env}_{\beta}\|\cdot\|_0(y) + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2 \Big\}. \qquad (5)$$
Since $\operatorname{env}_{\beta}\|\cdot\|_0$ is an approximation of $\|\cdot\|_0$, we expect that the proposed model enjoys nice properties and can be solved by efficient iteration algorithms. As was pointed out earlier, $\operatorname{env}_{\beta}\|\cdot\|_0$ is an excellent sparsity-promoting function; therefore, we adopt it in this paper.
We reformulate problem (5) to obtain a problem that is well suited and favorable for computation. Relying on definition (3) of $\operatorname{env}_{\beta}\|\cdot\|_0$ in problem (5), with the parameters $\beta$ and $\lambda$ being fixed, we introduce the following function
$$F(x, y) := \|x\|_0 + \frac{1}{2\beta}\,\|x - y\|^2 + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2, \qquad x, y \in \mathbb{R}^N. \qquad (6)$$
The non-convex function $F$ is a special case of those considered in [7]. We then consider the problem
$$\min_{(x, y) \in \mathbb{R}^N \times \mathbb{R}^N} F(x, y). \qquad (7)$$
Next, we prove that problems (5) and (7) are essentially equivalent. A global minimizer of any of these problems will also be called a solution of the problem. We first present a relation between $F$ and $\operatorname{env}_{\beta}\|\cdot\|_0$. Remember that, for $x \in \mathbb{R}^N$, the proximity operator of $\beta\|\cdot\|_0$ at $x$ is defined by
$$\operatorname{prox}_{\beta\|\cdot\|_0}(x) := \operatorname*{argmin}_{u \in \mathbb{R}^N} \Big\{ \frac{1}{2\beta}\,\|u - x\|^2 + \|u\|_0 \Big\}. \qquad (8)$$
Clearly, if $x^* \in \operatorname{prox}_{\beta\|\cdot\|_0}(y)$, then we have that
$$\operatorname{env}_{\beta}\|\cdot\|_0(y) = \frac{1}{2\beta}\,\|x^* - y\|^2 + \|x^*\|_0. \qquad (9)$$
By relation (9), we obtain
$$\min_{(x, y)} F(x, y) = \min_{y} \Big\{ \operatorname{env}_{\beta}\|\cdot\|_0(y) + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2 \Big\}.$$
We now give a direct proof of ([6], Proposition 1).
Proposition 1. Let $\beta > 0$ and $\lambda > 0$. A pair $(x^*, y^*)$ solves problem (7) if, and only if, $y^*$ solves problem (5), with $x^*$ verifying the relation $x^* \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^*)$. Proof. This follows directly from the following successive equalities:
$$\min_{(x, y)} F(x, y) = \min_{y} \min_{x} F(x, y) = \min_{y} \Big\{ \operatorname{env}_{\beta}\|\cdot\|_0(y) + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2 \Big\}.$$
□
Based on the fact that problems (5) and (7) are essentially equivalent, it suffices to establish that a local minimizer of the nonconvex problem (7) is a minimizer of a convex problem on a subdomain. To that end, we first present a convex optimization problem on a proper subdomain of $\mathbb{R}^N \times \mathbb{R}^N$ related to problem (7), and we recall the notion of the support of a vector $x \in \mathbb{R}^N$, denoted by $\operatorname{supp}(x)$, namely the index set on which the components of $x$ are nonzero, that is, $\operatorname{supp}(x) := \{ i \in \{1, \ldots, N\} : x_i \neq 0 \}$. Note that when the support of $x$ in problem (7) is specified, the non-convex problem (7) reduces to a convex one. Based on this observation, we introduce a convex function by
$$G(x, y) := \frac{1}{2\beta}\,\|x - y\|^2 + \frac{\lambda}{2}\,\|Ay - P_Q(Ay)\|^2, \qquad x, y \in \mathbb{R}^N.$$
Clearly, $F = \|\cdot\|_0 + G$, and $G$ is convex and differentiable on $\mathbb{R}^N \times \mathbb{R}^N$.
We now define, for a given index set $\mathbb{S} \subseteq \{1, \ldots, N\}$, a subspace of $\mathbb{R}^N \times \mathbb{R}^N$ by setting
$$D_{\mathbb{S}} := \{ (x, y) \in \mathbb{R}^N \times \mathbb{R}^N : \operatorname{supp}(x) \subseteq \mathbb{S} \}.$$
$D_{\mathbb{S}}$ is convex and closed (see [6]), and we consider then the minimization problem on $D_{\mathbb{S}}$ defined by
$$\min_{(x, y) \in D_{\mathbb{S}}} G(x, y). \qquad (13)$$
Problem (13) is convex, thanks to the convexity of both the function $G$ and the set $D_{\mathbb{S}}$. Next, we will show the equivalence between the non-convex problem (7) and the convex problem (13) with an appropriate choice of the index set $\mathbb{S}$. To this end, we investigate properties of the support set of certain sequences in $\mathbb{R}^N$ and, for a given index set $\mathbb{S}$, we define an operator $P_{\mathbb{S}}$ by
$$\big(P_{\mathbb{S}}(x)\big)_i := \begin{cases} x_i, & i \in \mathbb{S}, \\ 0, & i \notin \mathbb{S}. \end{cases}$$
This operator is indeed the orthogonal projection from $\mathbb{R}^N$ onto the subspace $\{ x \in \mathbb{R}^N : \operatorname{supp}(x) \subseteq \mathbb{S} \}$ (see [6], Lemma 3). A convenient identification of the proximity operator of the $\ell_0$-norm with the projection $P_{\mathbb{S}}$, together with some properties of the sequence generated by Algorithm (18) below, regarding the existence of an integer beyond which the support remains invariant, were developed in Lemmas 4–7 together with Proposition 2 of [6], and these results are still valid in our context.
Recall also the closed-form formula for the proximity operator of $\beta\|\cdot\|_0$. For all $x \in \mathbb{R}^N$ and $i \in \{1, \ldots, N\}$,
$$\big(\operatorname{prox}_{\beta\|\cdot\|_0}(x)\big)_i = \begin{cases} x_i, & |x_i| > \sqrt{2\beta}, \\ \{0, x_i\}, & |x_i| = \sqrt{2\beta}, \\ 0, & |x_i| < \sqrt{2\beta}. \end{cases}$$
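As an illustration (our own sketch, selecting the value $0$ on the tie set $|x_i| = \sqrt{2\beta}$), this hard-thresholding formula can be implemented as follows:

```python
import numpy as np

def prox_l0(x: np.ndarray, beta: float) -> np.ndarray:
    """Hard thresholding: a selection of prox of beta*||.||_0.

    Keeps x_i when |x_i| > sqrt(2*beta) and zeroes it otherwise
    (on the tie set |x_i| = sqrt(2*beta) we pick 0).
    """
    out = x.copy()
    out[np.abs(x) <= np.sqrt(2.0 * beta)] = 0.0
    return out

# threshold sqrt(2*0.125) = 0.5: entries 0.1 and -0.5 are zeroed.
print(prox_l0(np.array([0.1, -0.5, 2.0]), beta=0.125))  # [0. 0. 2.]
```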
A connection between problems (7) and (13) is given by the following result.
Theorem 1. Let $\beta > 0$ and $\lambda > 0$ be given. The pair $(x^*, y^*)$ is a local minimizer of the non-convex problem (7) if, and only if, $(x^*, y^*)$ is a minimizer of the convex problem (13) with $\mathbb{S} = \operatorname{supp}(x^*)$.
Proof. This follows directly by using ([6], Corollary 4.9) in our setting. □
Following the same lines as ([6], Propositions 1 and 3), we can identify and connect global and local minimizers of (7) with those of (5).
3. A Fixed Point Approach
We will propose an iterative method for finding a local minimizer of (7), relying on a fixed-point formulation. For all the facts we will use, we refer to [14].
Let us begin with a characterization of the solutions of the convex problem (13).
Proposition 2. Suppose $\mathbb{S} \subseteq \{1, \ldots, N\}$. If $\beta, \lambda > 0$, then problem (13) with a given $\mathbb{S}$ has a solution, and a pair $(x^*, y^*)$ solves (13) if, and only if,
$$x^* = P_{\mathbb{S}}(y^*), \qquad y^* + \beta\lambda\, A^{T}\big(Ay^* - P_Q(Ay^*)\big) = x^*. \qquad (16)$$
Proof. The existence of solutions follows from the closedness of $D_{\mathbb{S}}$ together with the coercivity of $G$ with respect to the second variable. On the other hand, the optimality condition of the minimization problem (13) reads as
$$0 \in \partial\big(G + \iota_{D_{\mathbb{S}}}\big)(x^*, y^*),$$
or, equivalently, the relations (16).
□
Application of both Theorem 1 and Proposition 2 leads to the following characterization of a local minimizer of problem (7).
Theorem 2. Let $\beta, \lambda > 0$ be fixed. A pair $(x^*, y^*)$ is a local minimizer of (7) if, and only if, $(x^*, y^*)$ verifies (16) with $\mathbb{S} = \operatorname{supp}(x^*)$.
Let us now give a fixed-point characterization of a local minimizer of (7).
Theorem 3. Let $\beta, \lambda > 0$ be fixed. If a pair $(x^*, y^*)$ is a local minimizer of (7), then $(x^*, y^*)$ satisfies the relations
$$x^* \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^*), \qquad y^* = \big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1}(x^*). \qquad (17)$$
Conversely, if a pair $(x^*, y^*)$ verifies (17), then $(x^*, y^*)$ is a local minimizer of (7).
Proof. The optimality condition of the minimization problem defining the second relation of (17) reads as
$$\frac{1}{\beta}\,(y^* - x^*) + \lambda\, A^{T}\big(Ay^* - P_Q(Ay^*)\big) = 0,$$
or, equivalently, the second relation of (16); the conclusion then follows from Theorem 2 and the identification of $P_{\mathbb{S}}$ with $\operatorname{prox}_{\beta\|\cdot\|_0}$ recalled above.
□
In view of Theorem 3, we propose the following explicit–implicit algorithm for solving problem (7): given $y^0 \in \mathbb{R}^N$, iterate
$$x^{k+1} \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^k), \qquad y^{k+1} + \beta\lambda\, A^{T}\big(Ay^{k+1} - P_Q(Ay^{k+1})\big) = x^{k+1}. \qquad (18)$$
When the projection $P_Q$ can be computed efficiently, the updates of both variables $x$ and $y$ in Algorithm (18) can be efficiently implemented at each iteration.
Proposition 3. If $\beta, \lambda > 0$, then the operator $I + \beta\lambda\, A^{T}(I - P_Q)A$ is invertible. Hence, the second part of (18) reads as $y^{k+1} = \big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1}(x^{k+1})$. Proof. It is well known that $A^{T}(I - P_Q)A$ is a maximal monotone operator (more precisely, it is an inverse strongly monotone operator; see the beginning of the proof of Theorem 4), hence $I + \beta\lambda\, A^{T}(I - P_Q)A$ is invertible by Minty's Theorem, and its inverse, which is the so-called resolvent operator, is single-valued and firmly nonexpansive. □
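To make the scheme concrete, here is a small numerical sketch (our own illustration, with hypothetical names such as algorithm_18; the inner loop evaluating the resolvent is one possible choice, contractive when $\beta\lambda\|A\|^2 < 1$):

```python
import numpy as np

def prox_l0(x, beta):
    """Hard thresholding (a selection of prox of beta*||.||_0)."""
    out = x.copy()
    out[np.abs(x) <= np.sqrt(2.0 * beta)] = 0.0
    return out

def algorithm_18(A, proj_Q, y0, beta, lam, n_outer=200, n_inner=50):
    """Explicit-implicit iteration (18).

    x-step: hard thresholding of y^k (explicit).
    y-step: resolvent (I + beta*lam*A^T(I - P_Q)A)^{-1}(x^{k+1}),
    evaluated here by an inner fixed-point loop.
    """
    y, c = y0.copy(), beta * lam
    for _ in range(n_outer):
        x = prox_l0(y, beta)
        for _ in range(n_inner):  # solve y + c*A^T(Ay - P_Q(Ay)) = x
            y = x - c * (A.T @ (A @ y - proj_Q(A @ y)))
    return x, y

# Toy example: Q is the box [-1, 1]^d, A a small random matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20)) / 5.0
proj_Q = lambda z: np.clip(z, -1.0, 1.0)
x_star, y_star = algorithm_18(A, proj_Q, rng.standard_normal(20),
                              beta=0.05, lam=2.0)
print(np.count_nonzero(x_star))  # the recovered vector is sparse
```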
4. Convergence Analysis
In this section, we investigate the convergence behavior of Algorithm (18). As in [6], after a finite number of iterations, the support of the sparse variable $x^k$ defined by Algorithm (18) remains unchanged; hence, solving the non-convex optimization problem (7) by Algorithm (18) reduces to solving a convex optimization problem on the support.
First, we consider a function $E$, which is closely related to both functions $F$ and $G$, and we define
$$E(w) := \frac{1}{2}\,\|w - P_Q(w)\|^2, \qquad w \in \mathbb{R}^d;$$
the composition $E \circ A$ will be denoted for short by $E_A$.
Now, we prove a convergence result for Algorithm (18).
Theorem 4. Let $(x^k, y^k)$ be a sequence generated by Algorithm (18) with an initial point $y^0$ for problem (7). If $\beta, \lambda$ are positive numbers, then we have the following properties:
1. $F(x^{k+1}, y^{k+1}) \le F(x^k, y^k)$ for all $k \ge 1$, and the sequence $\big(F(x^k, y^k)\big)$ is convergent;
2. The sequence $(x^k, y^k)$ has a finite length, namely $\sum_{k} \|(x^{k+1}, y^{k+1}) - (x^k, y^k)\|^2 < +\infty$;
3. The sequence $(x^k, y^k)$ is asymptotically regular, that is, $\lim_{k \to \infty} \|(x^{k+1}, y^{k+1}) - (x^k, y^k)\| = 0$.
Proof. The function $E$ is differentiable with a 1-Lipschitz continuous gradient. Indeed, $\nabla E = I - P_Q$, which, $P_Q$ being firmly nonexpansive, is itself firmly nonexpansive (this is the inverse strong monotonicity invoked in Proposition 3); this ensures that $\nabla E$ is 1-Lipschitz continuous. Moreover, the characterization of the orthogonal projection, namely
$$\langle w - P_Q(w),\, q - P_Q(w) \rangle \le 0 \quad \text{for all } w \in \mathbb{R}^d,\ q \in Q,$$
assures that $E$ is convex, and the celebrated descent lemma (see, for example, [12]) yields, for all $y, z \in \mathbb{R}^N$,
$$E_A(z) \le E_A(y) + \langle \nabla E_A(y),\, z - y \rangle + \frac{\|A\|^2}{2}\,\|z - y\|^2,$$
where $\nabla E_A = A^{T}(I - P_Q)A$ is Lipschitz continuous with constant $\|A\|^2$.
On the other hand, since $x^{k+1} \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^k)$, the definition of the proximity operator, with $x^k$ as a competitor, gives
$$\|x^{k+1}\|_0 + \frac{1}{2\beta}\,\|x^{k+1} - y^k\|^2 \le \|x^k\|_0 + \frac{1}{2\beta}\,\|x^k - y^k\|^2,$$
that is, $F(x^{k+1}, y^k) \le F(x^k, y^k)$.
Now, by using the second equation of (18) and by taking into account the fact that the resolvent $\big(I + \beta\lambda \nabla E_A\big)^{-1}$ is single-valued and firmly nonexpansive, we see that $y^{k+1}$ is the unique minimizer of the $\tfrac{1}{\beta}$-strongly convex function $y \mapsto \frac{1}{2\beta}\,\|y - x^{k+1}\|^2 + \lambda E_A(y)$. Therefore,
$$\frac{1}{2\beta}\,\|y^{k+1} - x^{k+1}\|^2 + \lambda E_A(y^{k+1}) \le \frac{1}{2\beta}\,\|y^k - x^{k+1}\|^2 + \lambda E_A(y^k) - \frac{1}{2\beta}\,\|y^{k+1} - y^k\|^2,$$
that is, $F(x^{k+1}, y^{k+1}) \le F(x^{k+1}, y^k) - \frac{1}{2\beta}\,\|y^{k+1} - y^k\|^2$. Combining the last two inequalities yields, for all $k \ge 1$,
$$F(x^{k+1}, y^{k+1}) \le F(x^k, y^k) - \frac{1}{2\beta}\,\|y^{k+1} - y^k\|^2.$$
To prove the first property, notice that the sequence $\big(F(x^k, y^k)\big)$ is thus non-increasing and bounded from below by zero; hence it is convergent.
By summing the last inequality and by taking into account the convergence of the sequence $\big(F(x^k, y^k)\big)$, we deduce first that $\sum_k \|y^{k+1} - y^k\|^2 < +\infty$. In addition, the second part of (18) can be read as $x^{k+1} = y^{k+1} + \beta\lambda\, A^{T}\big(Ay^{k+1} - P_Q(Ay^{k+1})\big)$, so that, $\nabla E_A$ being $\|A\|^2$-Lipschitz,
$$\|x^{k+1} - x^k\| \le \big(1 + \beta\lambda\,\|A\|^2\big)\,\|y^{k+1} - y^k\|, \qquad k \ge 1,$$
and the second property follows. The latter properties clearly ensure the asymptotic regularity of the sequence $(x^k, y^k)$. □
As in ([6], Lemma 12), in our setting we also have that the invariant support set of the sequence defined by Algorithm (18) exists for the nonconvex problem (7). Now, let us prove, more directly than in [6], the convergence of the sequence $(x^k, y^k)$ generated by (18), relying on averaged operators and the Krasnoselskii–Mann Theorem. Averaged mappings are convenient in studying the convergence of sequences generated by iterative algorithms for fixed-point problems, thanks to the following celebrated theorem; see, for example, [12,13].
Theorem 5 (Krasnoselskii–Mann Theorem). Let $M$ be averaged and assume $\operatorname{Fix} M \neq \emptyset$. Then, for any starting point $x^0$, the sequence $(M^k x^0)_{k \in \mathbb{N}}$ converges weakly to a fixed point of $M$.
Recall also the definitions of nonexpansive and averaged operators, which appear naturally when using iterative algorithms for solving fixed-point problems and which are commonly encountered in the literature; see, for instance, [13]. A mapping $T$ is said to be nonexpansive if $\|Tx - Ty\| \le \|x - y\|$ for all $x, y$, and firmly nonexpansive if $2T - I$ is nonexpansive or, equivalently, $\|Tx - Ty\|^2 \le \langle Tx - Ty,\, x - y \rangle$ for all $x, y$. It is well known that $T$ is firmly nonexpansive if and only if $T$ can be written as $T = \frac{1}{2}(I + S)$, where $S$ is nonexpansive. Recall also that a mapping $M$ is said to be averaged if it can be expressed as $M = (1 - \alpha)I + \alpha S$, with $S$ a nonexpansive mapping and $\alpha \in (0, 1)$. Thus, firmly nonexpansive mappings (e.g., projections onto convex, closed and nonempty subsets and resolvents of maximal monotone operators) are averaged mappings.
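As a small numerical illustration of Theorem 5 (our own sketch, not taken from the paper), the Krasnoselskii–Mann iteration applied to the $\tfrac{1}{2}$-averaged map built from the projection onto the unit ball converges to one of its fixed points, namely a point of the ball:

```python
import numpy as np

# M is 1/2-averaged: M = (1/2) I + (1/2) proj_ball, with proj_ball
# nonexpansive; Fix(M) = Fix(proj_ball) = the closed unit ball.
proj_ball = lambda x: x / max(1.0, float(np.linalg.norm(x)))
M = lambda x: 0.5 * x + 0.5 * proj_ball(x)

x = np.array([4.0, -3.0])
for _ in range(100):
    x = M(x)
print(x, np.linalg.norm(x))  # approx [0.8, -0.6], norm approx 1
```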
Mimicking the analysis in [6], $(x^*, y^*)$ is a solution of (13) if, and only if, it satisfies (16), and thus $y^*$ verifies
$$y^* = \big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1} \circ P_{\mathbb{S}}\,(y^*). \qquad (29)$$
Similarly, using the same arguments, we derive that the sequence $(y^k)$ generated by (18), once the support has stabilized, satisfies, for all such $k$,
$$y^{k+1} = \big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1} \circ P_{\mathbb{S}}\,(y^k).$$
It is well known that firmly nonexpansive mappings (including orthogonal projections onto closed convex nonempty subsets and resolvent mappings of maximal monotone operators) are averaged operators. In view of the fact that the composite of finitely many averaged mappings is averaged (see, for instance, [12]), and by applying the Krasnoselskii–Mann Theorem, we deduce the convergence of the sequence $(y^k)$ to a solution $y^*$ of (29).
Moreover, $\operatorname{supp}(x^k) \subseteq \mathbb{S}$, which is closed, and we also have, in view of ([6], Lemma 7), $x^{k+1} = P_{\mathbb{S}}(y^k)$ for $k$ large enough. Since $P_{\mathbb{S}}$ is nonexpansive, for all such $k$ we can write
$$\|x^{k+2} - x^{k+1}\| \le \|y^{k+1} - y^k\|.$$
The sequence $(y^k)$, being convergent, is a Cauchy sequence, and thus so is the sequence $(x^k)$, which, in turn, converges to some limit $x^*$. Now, by passing to the limit in (18) and taking into account the continuity of the resolvent, we obtain $y^* = \big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1}(x^*)$, where $(x^*, y^*)$ is a solution of (16), which ensures that $(x^*, y^*)$ is a solution of (13) with $\mathbb{S} = \operatorname{supp}(x^*)$.
Based on the above considerations, we obtain the following theorem.
Theorem 6. Let $(x^k, y^k)$ be a sequence generated by (18) from an initial point $y^0$. If $\beta, \lambda > 0$ are chosen as above, then $(x^k, y^k)$ converges to a local minimizer of (7). Moreover, $\big(F(x^k, y^k)\big)$ is a convergent sequence and if, in addition, $x^* \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^*)$, then $y^*$ is a local minimizer of (5).
Finally, let us point out that, since $\big(I + \beta\lambda\, A^{T}(I - P_Q)A\big)^{-1} = \big(I + \beta\lambda \nabla E_A\big)^{-1} = \operatorname{prox}_{\beta\lambda E_A}$, the fixed point iteration in (18) turns into
$$y^{k+1} = \operatorname{prox}_{\beta\lambda E_A}\big(\operatorname{prox}_{\beta\|\cdot\|_0}(y^k)\big).$$
In particular, when $A = I$ and $\beta\lambda = 1$, this reduces to
$$y^{k+1} = \frac{1}{2}\big(x^{k+1} + P_Q(x^{k+1})\big), \qquad x^{k+1} \in \operatorname{prox}_{\beta\|\cdot\|_0}(y^k).$$
We used both the fact that, for any maximal monotone operator $B$, we have $(I + B_1)^{-1} = I - (B_1)_1$, $B_1$ being the Yosida operator of $B$ with parameter 1, and that, for all $\lambda, \mu > 0$, we have $(B_\lambda)_\mu = B_{\lambda + \mu}$. We used also the fact that the Yosida operator of the subdifferential of the indicator function of $Q$ with parameter 2 (the latter is nothing but the normal cone to $Q$) is exactly $\frac{1}{2}(I - P_Q)$.
Therefore, the proposed algorithms are nothing else than a proximal gradient and a projection gradient algorithm. Nevertheless, the crucial idea (namely, that the support of the sparse main variable generated by the algorithm remains unchanged after a finite number of iterations), which permits us to identify a local minimizer of the nonconvex optimization problem with a global minimizer of a convex one, deserves great interest.
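As a quick numerical sanity check (our own illustration) of the reduction above in the case $A = I$, $\beta\lambda = 1$ and $Q$ a box, the resolvent value $\big(I + (I - P_Q)\big)^{-1}x$, i.e., the solution $y$ of $y + (y - P_Q(y)) = x$, coincides with $\tfrac{1}{2}\big(x + P_Q(x)\big)$:

```python
import numpy as np

proj_Q = lambda z: np.clip(z, -1.0, 1.0)  # P_Q for the box [-1, 1]^d

rng = np.random.default_rng(1)
x = 3.0 * rng.standard_normal(4)
y = 0.5 * (x + proj_Q(x))                   # candidate resolvent value
print(np.allclose(y + (y - proj_Q(y)), x))  # True: y solves the equation
```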
Clearly, the analysis developed here can be extended to split feasibility problems, namely
$$\text{find } y \in C \text{ such that } Ay \in Q,$$
with $C \subseteq \mathbb{R}^N$, $Q \subseteq \mathbb{R}^d$ being two closed, convex subsets and $A$ a given matrix. Since the sum of a maximal monotone operator (the normal cone to $C$) and a monotone Lipschitz one (the operator $A^{T}(I - P_Q)A$) is still maximal monotone [14], this can be naturally extended to the following general minimization problem, by means of its regularized version:
$$\min_{y \in \mathbb{R}^N} \big\{ f(y) + g(Ay) \big\},$$
with $f, g$ being two proper, convex, lower semicontinuous functions defined on $\mathbb{R}^N$ and $\mathbb{R}^d$, respectively, and $A \in \mathbb{R}^{d \times N}$. The proximity operator of $g$ and the subdifferential of $f$ will then act as the projection onto the set $Q$ and the normal cone to $C$, respectively, since they share the same properties.