Article

Direction of Arrival Estimation in Elliptical Models via Sparse Penalized Likelihood Approach

1 College of Mathematics, Sichuan University, Chengdu 610064, China
2 Center for Information Engineering Science Research, School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
* Author to whom correspondence should be addressed.
Sensors 2019, 19(10), 2356; https://doi.org/10.3390/s19102356
Submission received: 4 May 2019 / Revised: 19 May 2019 / Accepted: 20 May 2019 / Published: 22 May 2019
(This article belongs to the Section Physical Sensors)

Abstract

In this paper, an $\ell_1$-penalized maximum likelihood (ML) approach is developed for estimating the directions of arrival (DOAs) of source signals from complex elliptically symmetric (CES) array outputs. The approach employs the $\ell_1$-norm penalty to exploit the sparsity of the gridded directions, and the CES distribution setting has the merit of robustness to uncertainty about the distribution of the array output. To solve the resulting non-convex penalized ML optimization for spatially uniform or non-uniform sensor noise, two majorization-minimization (MM) algorithms based on different majorizing functions are developed, and their computational complexities are analyzed. A modified Bayesian information criterion (BIC) is provided for selecting an appropriate penalty parameter. Numerical experiments demonstrate the effectiveness and superiority of the proposed methods in producing highly accurate DOA estimates.

1. Introduction

Estimating the directions of arrival (DOAs) of a number of far-field narrow-band source signals is an important problem in signal processing. Many DOA estimation methods were proposed early on, such as multiple signal classification (MUSIC) [1], estimation of signal parameters via rotational invariance techniques (ESPRIT) [2] and their variants [3,4]. Many of them work well when accurate estimates of the array output covariance matrix and the source number are available. In scenarios with sufficiently many array snapshots and a moderately high signal-to-noise ratio (SNR), the array output covariance matrix and the source number can be estimated accurately.
Recently, sparse DOA estimation methods based on sparse reconstructions of the array output model have become popular. They are applicable when the number of sources is unknown, and many of them remain effective with a limited number of data snapshots. In [5], sparse DOA estimation methods are categorized into three groups: on-grid, off-grid and grid-less. On-grid methods, which are widely studied and straightforward to implement, assume that the true DOAs lie on a predefined grid [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22]. Off-grid methods also use a prior grid but do not constrain the DOA estimates to lie on it, at the price of more unknown parameters to be estimated and more complicated algorithms [23,24,25,26,27,28]. Grid-less methods operate directly in the entire direction domain without a pre-specified grid but, at present, are developed mainly for linear arrays and incur a rather large computational burden [29,30,31]. Although on-grid methods may suffer from grid mismatch, they remain attractive due to their easy accessibility in a wide range of applications (see, e.g., [28,32,33]). To alleviate grid mismatch, methods for selecting an appropriate prior grid are proposed in [9,34].
During the past two decades, various on-grid sparse techniques, such as linear least-squares, covariance-based and maximum-likelihood (ML) type methods, have been studied. The linear least-squares methods [6,9,11,12], which minimize the $\ell_2$-norm of the noise residual of the deterministic model, enforce sparsity by constraining the $\ell_p$-norm of the signal vector for $p \in [0,1]$ (note that the $\ell_p$-norm for $p \in [0,1)$ is defined by the same formula as for $p \geq 1$ but is not truly a norm in mathematics). The covariance-based methods, such as the sparse iterative covariance-based estimation (SPICE) method [13,14] and its variants [15,16], and those proposed in [17,18,19,20,21,22], are derived from different covariance-fitting criteria and use the $\ell_p$-norm penalty to enforce sparsity. The ML-type methods, including the sparse Bayesian learning (SBL) methods [7,8,10] and the likelihood-based estimation of sparse parameters (LIKES) method [15,16], are derived under the assumption that the array output is multivariate Gaussian distributed. The SBL methods use different prior distributions to model sparsity. The LIKES method does not appear to utilize sparsity explicitly, yet it provides sparse estimates.
The ML principle is generally believed to be statistically sounder than the covariance-fitting principle [15]. However, the existing on-grid ML methods are developed under a Gaussian distribution assumption that is not satisfied in many practical signal processing applications. The complex elliptically symmetric (CES) distributions, which include the t-distribution, the K-distribution, the Gaussian distribution and so on, can characterize both Gaussian and non-Gaussian random variables. In particular, heavy-tailed data, which frequently arise in radar and array signal processing [35,36], can be modeled by certain CES distributions.
The $\ell_p$-norm penalty with $p \in [0,1]$ is well known for promoting sparse solutions [37]. In particular, the convex $\ell_1$-norm penalty, also known as the least absolute shrinkage and selection operator (LASSO) penalty, is easy to handle and widely used in various applications. Generally, a sparse and accurate estimate of the unknown sparse parameter can be expected when minimizing the negative log-likelihood plus an $\ell_p$-norm penalty.
For estimating the DOAs from CES distributed array outputs, we provide an $\ell_1$-norm penalized ML method based on a sparse reconstruction of the array output covariance matrix. The characteristics and advantages of our method are as follows:
  • Our method is a sparse on-grid method based on the penalized ML principle and is designed especially for CES random output signals. The Gaussian distribution and many heavy-tailed and light-tailed distributions belong to the class of CES distributions, and it is worth mentioning that heavy-tailed, non-Gaussian output signals are common in the field of signal processing [35,36].
  • The sparsity of the unknown sparse vector is enforced by penalizing its $\ell_1$-norm. When the penalty parameter is zero, the proposed method reduces to a general ML DOA estimation method applicable to scenarios with CES random output signals.
  • Two penalized ML optimization problems are formulated, for spatially uniform and non-uniform white sensor noise, respectively. Since it is difficult to solve the two non-convex penalized ML optimizations globally, two majorization-minimization (MM) algorithms with different iterative procedures are developed to seek locally optimal solutions of each of them. The computational complexities of the two algorithms are discussed, and a criterion for selecting the penalty parameter is suggested.
  • The proposed methods are evaluated numerically in scenarios with Gaussian and non-Gaussian output signals. In particular, the performance gains originating from the added $\ell_1$-norm penalty are demonstrated numerically.
The remainder of this paper is organized as follows. Section 2 introduces a sparse CES data model of the array output. In Section 3, a sparse penalized ML method is developed to estimate the DOAs. Section 4 develops algorithms in the MM framework for solving the proposed non-convex $\ell_1$-penalized ML optimizations. Section 5 numerically shows the performance of our method in Gaussian and non-Gaussian scenarios. Finally, conclusions are given in Section 6.

Notations

The notation $\mathbb{C}^m$ ($\mathbb{R}^m$) denotes the set of all $m$-dimensional complex- (real-) valued vectors, and $\mathbb{C}^{m \times n}$ ($\mathbb{R}^{m \times n}$) denotes the set of all $m \times n$ complex- (real-) valued matrices. $1_p$ and $0_p$ are the $p$-dimensional vectors with all elements equal to 1 and 0, respectively. $I_p$ is the $p \times p$ identity matrix. $\|\cdot\|_1$ and $\|\cdot\|_2$ denote the $\ell_1$-norm and $\ell_2$-norm of a vector, respectively. The superscripts $(\cdot)^T$ and $(\cdot)^H$, respectively, denote the transpose and the conjugate transpose of a vector or matrix. The imaginary unit is denoted as $\imath$, defined by $\imath^2 = -1$.
For a vector $x = [x_1, \ldots, x_p]^T \in \mathbb{C}^p$, define the element-wise square root $\mathrm{sqrt}(x) = [\sqrt{|x_1|}, \ldots, \sqrt{|x_p|}]^T$ and the element-wise division $x \oslash y = [x_1/y_1, \ldots, x_p/y_p]^T$ for a vector $y = [y_1, \ldots, y_p]^T \in \mathbb{C}^p$ with non-zero elements. $\max(x) = \max\{x_1, \ldots, x_p\}$; $x_{[a:b]} = [x_a, \ldots, x_b]^T$ for $1 \leq a \leq b \leq p$; $x > 0$ ($\geq 0$) means $x_i > 0$ ($\geq 0$) for $i = 1, \ldots, p$; $[x; z]$ denotes the stacked vector of the column vectors $x$ and $z$. $\mathrm{Diag}(x)$ denotes a square diagonal matrix with the elements of the vector $x$ on the main diagonal, and $(\mathrm{Diag}(x))^{-1}$ denotes the diagonal matrix with main diagonal elements $1/x_i$ ($i = 1, \ldots, p$), taking $1/0 = \infty$.
For a square matrix $X \in \mathbb{C}^{p \times p}$, $X_{i,j}$ denotes the $(i,j)$th entry of $X$; $X > 0$ ($\geq 0$) means that $X$ is Hermitian and positive definite (semidefinite); $\mathrm{tr}(X)$ denotes the trace of $X$; and $\mathrm{diag}(X)$ denotes the column vector of the main diagonal elements of $X$.

2. Problem Formulation

Consider the problem of estimating the DOAs of $k_0$ narrow-band signals impinging on an array of $m$ sensors.
Given a set of grid points
$$\{\theta_1, \ldots, \theta_k\}, \tag{1}$$
we assume that the true $k_0$ ($k_0 \leq k$) DOAs, denoted as $\xi_1, \ldots, \xi_{k_0}$, take values in it. The array output measurement at instant $t$, denoted as $x(t) \in \mathbb{C}^m$, can be modeled as
$$x(t) = A(\theta)\,s(t) + v(t), \tag{2}$$
where
  • $\theta = [\theta_1, \ldots, \theta_k]^T$;
  • $A(\theta) = [a(\theta_1), \ldots, a(\theta_k)] \in \mathbb{C}^{m \times k}$ is the known array manifold matrix, with $a(\theta_i)$ being the steering vector corresponding to $\theta_i$, $i = 1, \ldots, k$;
  • $s(t) = [s_1(t), \ldots, s_k(t)]^T \in \mathbb{C}^k$ is the source signal vector at time instant $t$, in which $s_i(t)$ is the unknown random signal from a possible source at $\theta_i$ and is zero if $\theta_i$ is not in the true DOA set $\{\xi_1, \ldots, \xi_{k_0}\}$, $i = 1, \ldots, k$;
  • $v(t) = [v_1(t), \ldots, v_m(t)]^T \in \mathbb{C}^m$ is the noise vector impinging on the sensor array at time instant $t$.
Some necessary statistical assumptions are made as follows:
  • The possible source signals $s_1(t), \ldots, s_k(t)$ are uncorrelated and zero-mean at any time instant $t$.
  • The noise components $v_1(t), \ldots, v_m(t)$ are uncorrelated, zero-mean, and independent of $s_1(t), \ldots, s_k(t)$ at any time instant $t$.
  • The $n$ ($n > m$) snapshots $x(1), \ldots, x(n)$ of the sensor array signals are independent and identically distributed according to a CES distribution.
Note that the zero-mean assumptions above are common in the signal processing literature [5,35]. The CES distribution setting of the array output $x(t)$ enables us to effectively process Gaussian, heavy-tailed or light-tailed array snapshots, because the class of CES distributions [38] includes the Gaussian distribution, the t-distribution, the K-distribution and so on.
For notational simplicity, we write $A(\theta)$ as $A$ and $a(\theta_i)$ as $a_i$ for $i = 1, \ldots, k$ in the following. Under the above assumptions, the array output $x(t)$ in Equation (2) at any time instant $t$ has mean zero and covariance matrix
$$R = E[x(t)x(t)^H] = A P A^H + V, \tag{3}$$
where
$$P = E[s(t)s(t)^H] = \mathrm{Diag}(p) \tag{4}$$
is the (unknown) source signal covariance matrix with signal power vector $p = [p_1, \ldots, p_k]^T$, and
$$V = E[v(t)v(t)^H] = \mathrm{Diag}(\sigma) \tag{5}$$
is the (unknown) noise covariance matrix with noise power vector $\sigma = [\sigma_1^2, \ldots, \sigma_m^2]^T$. The matrix $R$ can be rewritten as
$$R = [A, I_m]\,\mathrm{Diag}([p; \sigma])\,[A, I_m]^H. \tag{6}$$
For any $i = 1, \ldots, k$, the signal power $p_i = 0$ if $\theta_i$ is not in the set of the true DOAs. Therefore, the true DOAs can be identified by the locations of the nonzero elements of the power vector $p$. In the following, the DOA estimation problem is formulated as the problem of estimating the locations of the nonzero elements of $p$.
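To fix ideas, the covariance structure in Equations (3)-(6) can be assembled directly. The following NumPy sketch is our own illustration, not code from the paper; the array size, grid, source locations and power values are arbitrary assumptions. It builds $R$ both as $A\,\mathrm{Diag}(p)\,A^H + \mathrm{Diag}(\sigma)$ and in the stacked form of Equation (6):

```python
import numpy as np

def steering_matrix(grid_deg, m):
    """Half-wavelength ULA steering matrix A (the model used in Section 5)."""
    theta = np.deg2rad(np.asarray(grid_deg, dtype=float))
    sensors = np.arange(m)[:, None]                       # m x 1
    return np.exp(1j * np.pi * sensors * np.cos(theta))   # m x k

m = 8
grid = np.arange(0, 181)                                  # 1-degree grid
A = steering_matrix(grid, m)
p = np.zeros(len(grid)); p[[40, 80]] = 10.0               # two hypothetical sources
sigma = np.ones(m)                                        # noise powers sigma_i^2

# Eq. (6): R = [A, I_m] Diag([p; sigma]) [A, I_m]^H
B_full = np.hstack([A, np.eye(m)])
R = B_full @ np.diag(np.concatenate([p, sigma])) @ B_full.conj().T
assert np.allclose(R, A @ np.diag(p) @ A.conj().T + np.diag(sigma))
```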

3. Sparse DOA Estimation

The array output $x(t)$ in Equation (2) is CES distributed with mean zero and covariance matrix $R$, and then the normalized random vector
$$y(t) \triangleq x(t)/\|x(t)\|_2, \tag{7}$$
which represents the angular part of the array output vector $x(t)$, has a complex angular central Gaussian (ACG) distribution [35] with probability density function
$$p(y(t); R) \propto \det(R)^{-1}\big(y(t)^H R^{-1} y(t)\big)^{-m}. \tag{8}$$
Denote
$$L_0(R) = \log\det(R) + \frac{m}{n}\sum_{t=1}^{n}\log\big(x(t)^H R^{-1} x(t)\big); \tag{9}$$
then the negative log-likelihood of $y(t) = x(t)/\|x(t)\|_2$, $t = 1, \ldots, n$, becomes
$$n L_0(R) - m\sum_{t=1}^{n}\log\big(\|x(t)\|_2^2\big) + c_1, \tag{10}$$
where $c_1$ is a constant.
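As a quick reference, $L_0$ in Equation (9) can be evaluated in a few lines. This is a sketch with our own naming; `X` stacks the snapshots $x(1), \ldots, x(n)$ as columns:

```python
import numpy as np

def L0(R, X):
    """Evaluate L_0(R) of Eq. (9); X is the m x n snapshot matrix, R > 0."""
    m, n = X.shape
    _, logdet = np.linalg.slogdet(R)                 # log det(R)
    # quadratic forms x(t)^H R^{-1} x(t), one per snapshot
    quad = np.real(np.einsum('ij,ij->j', X.conj(), np.linalg.solve(R, X)))
    return logdet + (m / n) * np.sum(np.log(quad))
```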

3.1. Spatially Non-Uniform White Noise

In the case of spatially non-uniform white sensor noise, not all noise variances $\sigma_i^2$, $i = 1, \ldots, m$, are equal. Assuming $\sigma_m^2 > 0$, we denote
$$r = [p; \sigma_{[1:m-1]}]/\sigma_m^2, \tag{11}$$
$$W = B\,\mathrm{Diag}(r)\,B^H + J_m, \tag{12}$$
where $B \in \mathbb{C}^{m \times (k+m-1)}$ consists of the first $(k+m-1)$ columns of $[A, I_m]$ and $J_m$ is an $m \times m$ matrix with the $(m,m)$-entry being 1 and all other entries being 0. Since
$$W = R/\sigma_m^2, \tag{13}$$
$$L_0(aR) = L_0(R), \quad \forall a > 0, \tag{14}$$
and the locations of the nonzero elements of the sparse vector $r_{[1:k]}$ identify the true DOAs, we formulate the DOA estimation problem as solving the penalized likelihood optimization problem
$$\arg\min_{r \in \Omega}\; L_0(W) + \lambda\|r_{[1:k]}\|_1, \tag{15}$$
where
$$\Omega = \{t \in \mathbb{R}^{k+m-1} \mid t \geq 0,\; B\,\mathrm{Diag}(t)\,B^H + J_m > 0\} \tag{16}$$
and $\lambda \geq 0$ is pre-specified. The $\ell_1$-norm penalty term $\lambda\|r_{[1:k]}\|_1$ helps induce a sparse solution, and the penalty parameter $\lambda$ controls the sparsity level [37].
Remark 1.
We explain why the DOAs are not estimated by solving the seemingly natural $\ell_1$-penalized ML optimization problem
$$\arg\min_{p, \sigma}\; L_0(R) + \lambda\|p\|_1. \tag{17}$$
Recalling Equation (6), we find that, for any $\lambda_1 > 0$ and $\lambda_2 > 0$,
$$L_0(R) + \lambda_1\|p\|_1 = L_0\Big([A, I_m]\,\mathrm{Diag}\Big(\Big[\tfrac{\lambda_1}{\lambda_2}p;\; \tfrac{\lambda_1}{\lambda_2}\sigma\Big]\Big)[A, I_m]^H\Big) + \lambda_2\Big\|\tfrac{\lambda_1}{\lambda_2}p\Big\|_1. \tag{18}$$
Thus, $[p; \sigma]$ is a locally optimal solution of Equation (17) with $\lambda = \lambda_1$ if and only if $[\frac{\lambda_1}{\lambda_2}p; \frac{\lambda_1}{\lambda_2}\sigma]$ is a locally optimal solution of Equation (17) with $\lambda = \lambda_2$. This means that, if we were to estimate $p$ by Equation (17), the parameter $\lambda$ could not, even in theory, adjust the sparsity level of the estimate of $p$. Instead of Equation (17), we therefore formulate the optimization in Equation (15) for DOA estimation. In Equation (15), owing to the constant matrix $J_m$ in Equation (12), different values of $\lambda$ result in solutions with different sparsity levels.

3.2. Spatially Uniform White Noise

In the case of spatially uniform white sensor noise, $\sigma_1 = \cdots = \sigma_m = \sigma$. Assuming $\sigma^2 > 0$, we denote
$$q = p/\sigma^2, \tag{19}$$
$$Q = A\,\mathrm{Diag}(q)\,A^H + I_m. \tag{20}$$
Estimating the DOAs amounts to identifying the locations of the nonzero elements of the vector $q$. Since $Q = R/\sigma^2$, we can estimate $q$ by solving the penalized likelihood optimization problem
$$\arg\min_{q \geq 0}\; L_0(Q) + \lambda\|q\|_1 \tag{21}$$
with $\lambda \geq 0$, where the term $\lambda\|q\|_1$ plays the same role as the penalty term in Equation (15).
Note that the number of unknown parameters to be estimated is $k$ in the case of spatially uniform noise, in contrast to $k+m-1$ in the case of spatially non-uniform noise.

4. DOA Estimation Algorithms

In this section, we provide methods to solve the optimization problems in Equations (15) and (21) with $\lambda$ fixed. As Equations (15) and (21) are non-convex, it is generally hard to obtain their globally optimal solutions. Based on the MM framework [39,40], we develop algorithms that find locally optimal solutions of Equations (15) and (21).
A function $f(x)$ is said to be majorized by a function $g(x \mid x_0)$ at $x_0$ if $f(x) \leq g(x \mid x_0)$ for all $x$ and $f(x_0) = g(x_0 \mid x_0)$. In the MM framework, the problem $\arg\min_x f(x)$ is solved by iteratively computing $x_{u+1} = \arg\min_x g(x \mid x_u)$.
When solving the problems in Equations (15) and (21) in the MM framework, majorizing functions can be constructed from the following two inequalities. For any positive definite matrices $U \in \mathbb{C}^{m \times m}$ and $U_u \in \mathbb{C}^{m \times m}$ [40],
$$\log\det(U) \leq \log\det(U_u) + \mathrm{tr}(U_u^{-1}U) - m \tag{22}$$
and
$$\log\big(x(t)^H U^{-1} x(t)\big) \leq \log\big(x(t)^H U_u^{-1} x(t)\big) + \frac{x(t)^H U^{-1} x(t)}{x(t)^H U_u^{-1} x(t)} - 1, \tag{23}$$
where both equalities are achieved at $U = U_u$.
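The MM template above can be written down once and reused by all of the algorithms that follow. Below is a minimal generic loop in Python; this is an illustrative sketch of ours, where `solve_surrogate` stands for whatever surrogate minimizer (e.g., Equation (30) below) is plugged in, and the stopping rule mirrors the relative-change criterion of Algorithms 1-3:

```python
import numpy as np

def mm_minimize(solve_surrogate, x0, eps=1e-8, max_iter=500):
    """Generic MM loop: x_{u+1} = argmin_x g(x | x_u).

    solve_surrogate(x_u) returns the minimizer of a surrogate g(. | x_u)
    that majorizes the objective f at x_u, which guarantees that f is
    non-increasing along the iterates.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_next = solve_surrogate(x)
        # relative-change stopping rule, as in Algorithms 1-3 below
        if np.linalg.norm(x_next - x) < eps * max(np.linalg.norm(x), 1e-30):
            return x_next
        x = x_next
    return x
```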

4.1. Algorithms for Spatially Non-Uniform White Noise

In this subsection, using two different majorizing functions of the objective function in Equation (15), we develop two MM algorithms, named MM1 and MM2, to solve Equation (15).

4.1.1. MM1 Algorithm

Denote
$$W_u = B\,\mathrm{Diag}(r_u)\,B^H + J_m, \tag{24}$$
where $r_u \in \Omega$. Replacing $U$ and $U_u$ in Equations (22) and (23) by $W$ and $W_u$, respectively, we have, for any $W > 0$ and $W_u > 0$,
$$L_0(W) \leq \mathrm{tr}(W_u^{-1}W) + \frac{m}{n}\sum_{t=1}^{n}\frac{x(t)^H W^{-1} x(t)}{x(t)^H W_u^{-1} x(t)} + c_2 \tag{25}$$
$$= \mathrm{tr}(W_u^{-1}W) + \mathrm{tr}(M_u W^{-1}) + c_2, \tag{26}$$
where $c_2$ is a constant, the equality in Equation (25) is achieved at $W = W_u$, and
$$M_u = \frac{m}{n}\sum_{t=1}^{n}\frac{x(t)x(t)^H}{x(t)^H W_u^{-1} x(t)}. \tag{27}$$
Denote the sum of the first two terms on the right side of Equation (26) and the $\ell_1$-norm penalty term in Equation (15) as $g_1(r \mid r_u, \lambda)$; due to Equation (12), $g_1(r \mid r_u, \lambda)$ can be rewritten as
$$g_1(r \mid r_u, \lambda) = \mathrm{tr}(M_u W^{-1}) + w_u^T r \tag{28}$$
with
$$w_u = \mathrm{diag}(B^H W_u^{-1} B) + [\lambda 1_k; 0_{m-1}]. \tag{29}$$
From Equation (25), $g_1(r \mid r_u, \lambda) + c_2$ is a convex majorizing function of the objective function in Equation (15). Based on the MM framework, the problem in Equation (15) can be solved by iteratively solving the convex optimization problem
$$r_{u+1} = \arg\min_{r \in \Omega}\; g_1(r \mid r_u, \lambda). \tag{30}$$
Proposition 1 (The MM1 algorithm for Equation (15)). The sequence $\{r_u\}$ generated by
$$r_{u+1} = \arg\min_{r \geq 0}\; g_1(r \mid r_u, \lambda) \tag{31}$$
with any initial value $r_0 \in \Omega$ converges to a locally optimal solution of the optimization problem in Equation (15).
Proof. 
Since $g_1(r \mid r_u, \lambda)$ tends to $+\infty$ as $W$ goes to the boundary of the positive semidefinite cone, the optimization problems in Equations (30) and (31) are equivalent. The proposition then follows from the convergence property of the MM algorithm [40] and the fact that $L_0(W) \to +\infty$ with probability 1 as $W$ tends to the boundary of the positive semidefinite cone [41]. □
Due to Equation (11), the initial value $r_0 = [(r_1)_0, \ldots, (r_{k+m-1})_0]^T$ required in the iteration of Equation (31) can be obtained by inserting the initial estimate of $[p; \sigma]$ presented in [13]. Specifically,
$$(r_i)_0 = \frac{b_i^H \hat R b_i}{\hat R_{m,m}\,\|b_i\|_2^4} > 0, \quad i = 1, \ldots, k+m-1, \tag{32}$$
where $b_i$ is the $i$th column of the matrix $B$ and
$$\hat R = \frac{1}{n}\sum_{t=1}^{n} x(t)x(t)^H. \tag{33}$$
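For concreteness, the initializer of Equations (32) and (33) reads as follows in NumPy (a sketch with our own variable names):

```python
import numpy as np

def init_r(X, B):
    """Initial r_0 of Eq. (32): (r_i)_0 = b_i^H R_hat b_i / (R_hat_{m,m} ||b_i||_2^4)."""
    n = X.shape[1]
    R_hat = (X @ X.conj().T) / n                                 # Eq. (33)
    quad = np.real(np.einsum('ij,ij->j', B.conj(), R_hat @ B))   # b_i^H R_hat b_i
    norm4 = np.sum(np.abs(B) ** 2, axis=0) ** 2                  # ||b_i||_2^4
    return quad / (np.real(R_hat[-1, -1]) * norm4)
```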
To solve the optimization problem in Equation (31) globally, we provide two solvers: a coordinate descent (CD) solver and a SPICE-like solver. Denote $r = [r_1, \ldots, r_{k+m-1}]^T$ in the following.
Proposition 2 (The CD solver for Equation (31)). The sequence of $r$ generated by cyclically iterating
$$r_i = \frac{\sqrt{\beta_i/\alpha_i} - 1}{\gamma_i}\,\delta(\beta_i - \alpha_i), \quad i = 1, \ldots, k+m-1, \tag{34}$$
with any initial value $r \geq 0$ converges to the globally optimal solution $r_{u+1}$ of the problem in Equation (31), where $\delta(\cdot)$ is the indicator function of the interval $(0, \infty)$,
$$\alpha_i = (w_i)_u, \tag{35}$$
$$\beta_i = b_i^H N_i^{-1} M_u N_i^{-1} b_i, \tag{36}$$
$$\gamma_i = b_i^H N_i^{-1} b_i, \tag{37}$$
with $(w_i)_u$ being the $i$th element of the vector $w_u$, and
$$N_i = \sum_{l \neq i} r_l b_l b_l^H + J_m. \tag{38}$$
Proof. 
By the general analysis of the CD method in [42], the convex problem in Equation (31) can be globally solved by iterating
$$r_i = \arg\min_{r_i \geq 0}\; g_1(r \mid r_u, \lambda), \quad i = 1, \ldots, k+m-1, \tag{39}$$
until convergence.
Due to
$$W = B\,\mathrm{Diag}(r)\,B^H + J_m = r_i b_i b_i^H + N_i \tag{40}$$
and
$$(r_i b_i b_i^H + N_i)^{-1} = N_i^{-1} - \frac{r_i N_i^{-1} b_i b_i^H N_i^{-1}}{1 + r_i b_i^H N_i^{-1} b_i}, \tag{41}$$
the iteration in Equation (39) can be reformulated as
$$r_i = \arg\min_{r_i \geq 0}\; \alpha_i r_i - \frac{\beta_i r_i}{1 + \gamma_i r_i}, \quad i = 1, \ldots, k+m-1. \tag{42}$$
In addition, $\alpha_i > 0$, $\beta_i > 0$, $\gamma_i > 0$ for $i = 1, \ldots, k+m-1$. Solving the convex optimization problems in Equation (42) by the gradient method, we conclude that the updates in Equation (42) are equivalent to those in Equation (34). □
As the solver in Proposition 2 is derived via the CD method, we call it the CD solver. The MM1 algorithm in Proposition 1 equipped with the CD solver is named the MM1-CD algorithm.
For clarity, detailed steps of the MM1-CD algorithm for Equation (15) are presented in Algorithm 1.
Note that, in the CD solver, for $i = 1, \ldots, k$, we have
$$\beta_i = \frac{m}{n}\sum_{t=1}^{n} \zeta_i(t)\,\zeta_i(t)^H, \tag{43}$$
where
$$\zeta_i(t) = \frac{a_i^H N_i^{-1} x(t)}{\sqrt{x(t)^H W_u^{-1} x(t)}}, \tag{44}$$
with $a_i$ being the $i$th column of the matrix $A$. Thus, $\beta_i$ can be interpreted as the correlation between the signal from a possible source at the grid point $\theta_i$ and the array responses $x(1), \ldots, x(n)$. From the indicator function $\delta(\beta_i - \alpha_i)$ in Equation (34), we see that the powers of assumed signals that are weakly correlated with the array responses are likely to be forced to zero.
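One full CD sweep of Equation (34) can be coded compactly. The sketch below is our own NumPy rendering (variable names are ours); it updates the coordinates in place so that each $N_i$ reflects the latest values of the other coordinates:

```python
import numpy as np

def cd_sweep(r, B, M_u, w_u):
    """One coordinate-descent sweep of Eq. (34) over r_1, ..., r_{k+m-1}."""
    m = B.shape[0]
    J = np.zeros((m, m)); J[-1, -1] = 1.0
    for i in range(len(r)):
        b_i = B[:, i]
        # N_i = B Diag(r) B^H + J_m minus the i-th rank-one term (Eq. (38))
        N_i = B @ (r[:, None] * B.conj().T) + J - r[i] * np.outer(b_i, b_i.conj())
        y = np.linalg.solve(N_i, b_i)                    # N_i^{-1} b_i
        alpha = w_u[i].real                              # Eq. (35)
        beta = np.real(y.conj() @ (M_u @ y))             # Eq. (36)
        gamma = np.real(b_i.conj() @ y)                  # Eq. (37)
        # closed-form nonnegative minimizer of Eq. (42), i.e., Eq. (34)
        r[i] = (np.sqrt(beta / alpha) - 1.0) / gamma if beta > alpha else 0.0
    return r
```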
Proposition 3 (The SPICE-like solver for Equation (31)). The sequence of $r$ generated by iterating
$$r \leftarrow \mathrm{sqrt}\big(\mathrm{diag}\big(\mathrm{Diag}(r)\,B^H W^{-1} M_u W^{-1} B\,\mathrm{Diag}(r)\big) \oslash w_u\big) \tag{45}$$
with any initial value $r > 0$ converges to the globally optimal solution $r_{u+1}$ of the problem in Equation (31).
Algorithm 1 The MM1-CD algorithm for spatially non-uniform white noise.
1: Give $r_0$ and $\epsilon$,
2: $u = 0$,
3: repeat
4:   $W_u = B\,\mathrm{Diag}(r_u)\,B^H + J_m$,
5:   Calculate $M_u$ and $w_u$,
6:   $d = 0$,
7:   $r_0 = r_u$, $r = r_0$,
8:   repeat
9:     for $i = 1 : k+m-1$ do
10:      $N_i = \sum_{l \neq i} r_l b_l b_l^H + J_m$,
11:      Calculate $\alpha_i$, $\beta_i$ and $\gamma_i$,
12:      if $\beta_i \leq \alpha_i$ then
13:        $r_i = 0$,
14:      else
15:        $r_i = (\sqrt{\beta_i/\alpha_i} - 1)/\gamma_i$,
16:      end if
17:    end for
18:    $d = d + 1$,
19:    $r_d = [r_1, \ldots, r_{k+m-1}]^T$,
20:  until $\|r_d - r_{d-1}\|_2 / \|r_{d-1}\|_2 < \epsilon$,
21:  $u = u + 1$,
22:  $r_u = r_d$,
23: until $\|r_u - r_{u-1}\|_2 / \|r_{u-1}\|_2 < \epsilon$.
Proof. 
Equation (31) and the SPICE criterion in [15] have similar forms. Analyzing Equation (31) in the same way as the SPICE criterion is analyzed in [15], we find that the globally optimal solution of the problem in Equation (31) is identical to the minimizer $r$ of the optimization problem
$$\min_{r \geq 0,\; E \in \mathbb{C}^{(k+m) \times n}}\; \mathrm{tr}\big(E^H(\mathrm{Diag}([r; 1]))^{-1}E\big) + w_u^T r \quad \text{s.t.}\; [A, I_m]E = X_u, \tag{46}$$
where $X_u = [\tilde x(1), \ldots, \tilde x(n)]$ with
$$\tilde x(t) = \sqrt{\frac{m}{n}}\,\frac{x(t)}{\sqrt{x(t)^H W_u^{-1} x(t)}}, \quad t = 1, \ldots, n. \tag{47}$$
For a fixed $r$, the matrix $E$ minimizing Equation (46) can be verified to be [15]
$$E = \mathrm{Diag}([r; 1])\,[A, I_m]^H W^{-1} X_u, \tag{48}$$
and, for a fixed $E$, the vector $r$ minimizing Equation (46) is readily given by
$$r = \mathrm{sqrt}\big((\mathrm{diag}(EE^H))_{[1:k+m-1]} \oslash w_u\big). \tag{49}$$
The sequences of $E$ and $r$, generated by alternately iterating Equations (48) and (49) from any $r > 0$, converge to the globally optimal solution of the convex problem in Equation (46) [14,15].
Since $X_u X_u^H = M_u$, iterating Equation (49) is exactly iterating Equation (45). Thus, the sequence of $r$ generated by Equation (45) converges to the minimizer $r$ of Equation (46). □
The solver in Proposition 3 is named the SPICE-like solver, and the MM1 algorithm in Proposition 1 equipped with it is called the MM1-SPICE algorithm. Algorithm 2 summarizes the MM1-SPICE algorithm for Equation (15).
Algorithm 2 The MM1-SPICE algorithm for spatially non-uniform white noise.
1: Give $r_0$ and $\epsilon$,
2: $u = 0$,
3: repeat
4:   $W_u = B\,\mathrm{Diag}(r_u)\,B^H + J_m$,
5:   Calculate $M_u$ and $w_u$,
6:   $d = 0$,
7:   $r_0 = r_u$,
8:   repeat
9:     $W_d = B\,\mathrm{Diag}(r_d)\,B^H + J_m$,
10:    $r_{d+1} = \mathrm{sqrt}(\mathrm{diag}(\mathrm{Diag}(r_d)\,B^H W_d^{-1} M_u W_d^{-1} B\,\mathrm{Diag}(r_d)) \oslash w_u)$,
11:    $d = d + 1$,
12:  until $\|r_d - r_{d-1}\|_2 / \|r_{d-1}\|_2 < \epsilon$,
13:  $u = u + 1$,
14:  $r_u = r_d$,
15: until $\|r_u - r_{u-1}\|_2 / \|r_{u-1}\|_2 < \epsilon$.
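Step 10 of Algorithm 2, i.e., the fixed-point update of Equation (45), reads as follows in NumPy (a sketch under our own naming conventions):

```python
import numpy as np

def spice_like_update(r, B, M_u, w_u):
    """One inner fixed-point update of Eq. (45)."""
    m = B.shape[0]
    J = np.zeros((m, m)); J[-1, -1] = 1.0
    W = B @ (r[:, None] * B.conj().T) + J                  # Eq. (12)
    G = np.linalg.solve(W, B)                              # W^{-1} B
    # d_i = [B^H W^{-1} M_u W^{-1} B]_{ii}, so the update is r_i sqrt(d_i / w_i)
    d = np.real(np.einsum('ij,ij->j', G.conj(), M_u @ G))
    return r * np.sqrt(d / np.real(w_u))
```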
From Propositions 1–3, the proposed MM1 algorithm is seen to consist of an inner–outer iteration loop. The MM1-CD and MM1-SPICE algorithms share the same outer loop but have different nested inner loops. The relationship and difference between MM1-CD and MM1-SPICE are discussed in Section 4.3.

4.1.2. MM2 Algorithm

When $r_u > 0$, it follows from [36] that, for any $r > 0$,
$$\begin{bmatrix} C_u(\mathrm{Diag}([r; 1]))^{-1}C_u^H & I_m \\ I_m & W \end{bmatrix} = F^H F \geq 0, \tag{50}$$
where
$$C_u = W_u^{-1}[A, I_m]\,\mathrm{Diag}([r_u; 1]), \tag{51}$$
$$F = \big[(\mathrm{Diag}([r; 1]))^{-1/2}C_u^H,\; (\mathrm{Diag}([r; 1]))^{1/2}[A, I_m]^H\big]. \tag{52}$$
From Equation (50), by the Schur complement, we have
$$W^{-1} \leq C_u(\mathrm{Diag}([r; 1]))^{-1}C_u^H, \tag{53}$$
with the equality achieved at $r = r_u$. The inequality in Equation (53) is also valid for any $r \in \Omega$ due to $W > 0$.
Denote
$$g_2(r \mid r_u, \lambda) = \mathrm{tr}\big(C_u^H M_u C_u (\mathrm{Diag}([r; 1]))^{-1}\big) + w_u^T r. \tag{54}$$
It is clear from Equation (53) that, when $r_u > 0$, for any $r \in \Omega$,
$$g_1(r \mid r_u, \lambda) \leq g_2(r \mid r_u, \lambda), \tag{55}$$
where the equality is achieved at $r = r_u$. Therefore, at any $r_u > 0$, $g_2(r \mid r_u, \lambda) + c_2$ majorizes the objective function in Equation (15) for $r \in \Omega$.
Proposition 4 (The MM2 algorithm for Equation (15)). The sequence $\{r_u\}$ generated by
$$r_{u+1} = \mathrm{sqrt}\big(\mathrm{diag}\big(\mathrm{Diag}(r_u)\,B^H W_u^{-1} M_u W_u^{-1} B\,\mathrm{Diag}(r_u)\big) \oslash w_u\big) \tag{56}$$
with any initial value $r_0 > 0$ converges to a locally optimal solution of the problem in Equation (15).
Proof. 
By the convergence analysis in [36], although Equation (55) is valid only when $r_u > 0$, the sequence $\{r_u\}$ generated by
$$r_{u+1} = \arg\min_{r \in \Omega}\; g_2(r \mid r_u, \lambda) \tag{57}$$
with any initial value $r_0 > 0$ converges to a locally optimal solution of the problem in Equation (15). It is worth mentioning that the elements of the coefficient vector $\mathrm{diag}(C_u^H M_u C_u)$ in Equation (54) are positive. Solving the optimization problem in Equation (57) by the gradient method shows that the iteration in Equation (57) is exactly Equation (56). □
The $r_0$ involved in the iteration procedure of Equation (56) can be the one given in Equation (32). Algorithm 3 gives the detailed steps of the MM2 algorithm for the problem in Equation (15).
Algorithm 3 The MM2 algorithm for spatially non-uniform white noise.
1: Give $r_0 > 0$ and $\epsilon$,
2: $u = 0$,
3: repeat
4:   $W_u = B\,\mathrm{Diag}(r_u)\,B^H + J_m$,
5:   Calculate $M_u$ and $w_u$,
6:   $r_{u+1} = \mathrm{sqrt}(\mathrm{diag}(\mathrm{Diag}(r_u)\,B^H W_u^{-1} M_u W_u^{-1} B\,\mathrm{Diag}(r_u)) \oslash w_u)$,
7:   $u = u + 1$,
8: until $\|r_u - r_{u-1}\|_2 / \|r_{u-1}\|_2 < \epsilon$.
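Algorithm 3 is short enough to be transcribed whole. The sketch below is our NumPy rendering; the all-ones starting point replaces the initializer of Equation (32) purely to keep the snippet self-contained:

```python
import numpy as np

def mm2_nonuniform(X, B, lam, eps=1e-8, max_iter=1000):
    """MM2 (Algorithm 3) for problem (15); X: m x n snapshots, B: m x (k+m-1)."""
    m, n = X.shape
    k = B.shape[1] - m + 1
    J = np.zeros((m, m)); J[-1, -1] = 1.0
    pen = np.concatenate([lam * np.ones(k), np.zeros(m - 1)])
    r = np.ones(B.shape[1])                                # in place of Eq. (32)
    for _ in range(max_iter):
        W = B @ (r[:, None] * B.conj().T) + J              # Eq. (12)
        quad = np.real(np.einsum('ij,ij->j', X.conj(), np.linalg.solve(W, X)))
        M_u = (m / n) * (X / quad) @ X.conj().T            # Eq. (27)
        G = np.linalg.solve(W, B)                          # W^{-1} B
        w_u = np.real(np.einsum('ij,ij->j', B.conj(), G)) + pen   # Eq. (29)
        d = np.real(np.einsum('ij,ij->j', G.conj(), M_u @ G))
        r_new = r * np.sqrt(d / w_u)                       # Eq. (56)
        if np.linalg.norm(r_new - r) < eps * np.linalg.norm(r):
            return r_new
        r = r_new
    return r
```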

4.2. DOA Estimation for Spatially Uniform White Noise

The DOA estimation in the case of spatially uniform white noise amounts to solving the optimization problem in Equation (21). The problems in Equations (21) and (15) have similar forms, but Equation (21) involves fewer unknown parameters. Analyzing the problem in Equation (21) in the same way as Equation (15) in Section 4.1, we can directly derive the algorithms to solve it. The algorithms for Equation (21) are still named MM1 (including MM1-CD and MM1-SPICE) and MM2.

4.2.1. MM1 Algorithm

Proposition 5 (The MM1 algorithm for Equation (21)). Denote
$$h_1(q \mid q_u, \lambda) = \mathrm{tr}(Z_u Q^{-1}) + e_u^T q \tag{58}$$
with
$$Z_u = \frac{m}{n}\sum_{t=1}^{n}\frac{x(t)x(t)^H}{x(t)^H Q_u^{-1} x(t)}, \tag{59}$$
$$e_u = \mathrm{diag}(A^H Q_u^{-1} A) + \lambda 1_k, \tag{60}$$
$$Q_u = A\,\mathrm{Diag}(q_u)\,A^H + I_m; \tag{61}$$
then the sequence $\{q_u\}$ generated by
$$q_{u+1} = \arg\min_{q \geq 0}\; h_1(q \mid q_u, \lambda) \tag{62}$$
with any initial value $q_0$ converges to a locally optimal solution of the problem in Equation (21).
To solve the optimization problem in Equation (62), using the same approach as for Equation (31), we offer two different iterative solvers.
Proposition 6 (The CD solver for Equation (62)). The sequence of $q$ generated by cyclically iterating
$$q_j = \frac{\sqrt{\tilde\beta_j/\tilde\alpha_j} - 1}{\tilde\gamma_j}\,\delta(\tilde\beta_j - \tilde\alpha_j), \quad j = 1, \ldots, k, \tag{63}$$
with any initial value $q \geq 0$ converges to the globally optimal solution of the problem in Equation (62), where
$$\tilde\alpha_j = (e_j)_u, \tag{64}$$
$$\tilde\beta_j = a_j^H H_j^{-1} Z_u H_j^{-1} a_j, \tag{65}$$
$$\tilde\gamma_j = a_j^H H_j^{-1} a_j, \tag{66}$$
with $(e_j)_u$ being the $j$th element of the vector $e_u$, $a_j$ being the $j$th column of the matrix $A$, and
$$H_j = \sum_{l \neq j} q_l a_l a_l^H + I_m. \tag{67}$$
Proposition 6 gives the nested CD inner loop of the MM1 algorithm for the problem in Equation (21) and can be proven similarly to Proposition 2.
Proposition 7 (The SPICE-like solver for Equation (62)). The sequence of $q$ generated by iterating
$$q \leftarrow \mathrm{sqrt}\big(\mathrm{diag}\big(\mathrm{Diag}(q)\,A^H Q^{-1} Z_u Q^{-1} A\,\mathrm{Diag}(q)\big) \oslash e_u\big) \tag{68}$$
with any initial value $q > 0$ converges to the globally optimal solution of the problem in Equation (62).
The iterative procedure in Proposition 7 is the nested SPICE-like inner loop of the MM1 algorithm for Equation (21) and can be proven similarly to Proposition 3.
The MM procedure in Proposition 5 with the nested CD loop in Proposition 6 constitutes the MM1-CD algorithm for Equation (21); with the nested SPICE-like loop in Proposition 7, it constitutes the MM1-SPICE algorithm for Equation (21).

4.2.2. MM2 Algorithm

Proposition 8 (The MM2 algorithm for Equation (21)). The sequence $\{q_u\}$ generated by
$$q_{u+1} = \mathrm{sqrt}\big(\mathrm{diag}\big(\mathrm{Diag}(q_u)\,A^H Q_u^{-1} Z_u Q_u^{-1} A\,\mathrm{Diag}(q_u)\big) \oslash e_u\big) \tag{69}$$
with any initial value $q_0 > 0$ converges to a locally optimal solution of the problem in Equation (21).
Due to Equation (19), by inserting in $q$ the initial estimates of $p$ and $\sigma^2$ given in [13], the initial value $q_0 = [(q_1)_0, \ldots, (q_k)_0]^T$ required by the iterations in Equations (62) and (69) can be
$$(q_j)_0 = \frac{a_j^H \hat R a_j}{\hat\sigma\,\|a_j\|_2^4} > 0, \quad j = 1, \ldots, k, \tag{70}$$
where $\hat R$ is given by Equation (33) and $\hat\sigma$ is the mean of the $m$ smallest values of the set
$$\Big\{\frac{a_1^H \hat R a_1}{\|a_1\|_2^2}, \ldots, \frac{a_k^H \hat R a_k}{\|a_k\|_2^2}, \hat R_{1,1}, \ldots, \hat R_{m,m}\Big\}. \tag{71}$$

4.3. Discussions on the MM1 and MM2 Algorithms

The following discussion focuses on the case of spatially non-uniform noise; the arguments apply equally to the case of spatially uniform noise.
Both the MM1 and MM2 algorithms decrease the objective function of Equation (15) at each MM iteration and converge locally, while adopting different majorizing functions. Compared with the MM2 algorithm, the MM1 algorithm has a majorizing function closer to the objective of Equation (15), due to Equation (55). The computational burden of both algorithms is dominated by matrix inversions.
Although the MM1-CD and MM1-SPICE algorithms have different nested inner iteration procedures, they theoretically converge to the same local solution, because their outer MM iteration procedures are both Equation (31). Each nested inner iteration of the MM1-CD algorithm, detailed in Steps 9–17 of Algorithm 1, requires $k+m-1$ matrix inversions. In each nested inner iteration of the MM1-SPICE algorithm, given by Steps 9–10 of Algorithm 2, only one matrix inversion is entailed.
It is somewhat interesting that the iteration of the MM2 algorithm (see Equation (56)) and the nested inner iteration of the MM1-SPICE algorithm (see Equation (45)) have similar forms. Each iteration of the MM2 algorithm also needs only one matrix inversion.

4.4. Selection of Penalty Parameter

The penalty parameter $\lambda$ in Equations (15) and (21) affects the sparsity levels of the estimates of $r$ and $q$. A modified Bayesian information criterion (BIC) [37], which is common and statistically sound, is used here to choose an appropriate $\lambda$. Let $\hat r_\lambda$ and $\hat q_\lambda$ be the solutions of the problems in Equations (15) and (21) with a fixed $\lambda$, respectively, and denote $\hat W_\lambda = B\,\mathrm{Diag}(\hat r_\lambda)\,B^H + J_m$ and $\hat Q_\lambda = A\,\mathrm{Diag}(\hat q_\lambda)\,A^H + I_m$. The appropriate $\lambda$ for spatially non-uniform and uniform noise is chosen as
$$\arg\min_\lambda\; L_0(\hat W_\lambda) + \hat z_{\lambda,1}\,\frac{\log(n)}{2n} \tag{72}$$
and
$$\arg\min_\lambda\; L_0(\hat Q_\lambda) + \hat z_{\lambda,2}\,\frac{\log(n)}{2n}, \tag{73}$$
respectively, where $\hat z_{\lambda,1}$ and $\hat z_{\lambda,2}$ are the numbers of nonzero elements of the vectors $(\hat r_\lambda)_{[1:k]}$ and $\hat q_\lambda$, respectively.
Note that the elements of $(\hat r_\lambda)_{[1:k]}$ and $\hat q_\lambda$ can be treated as 0 if they are smaller than certain thresholds, e.g., $\varepsilon\max((\hat r_\lambda)_{[1:k]})$ and $\varepsilon\max(\hat q_\lambda)$ with a very small $\varepsilon > 0$.
Notice that $L_0(\hat W_\lambda)$ and $L_0(\hat Q_\lambda)$ in Equations (72) and (73) are substitutes for the marginal likelihood (also called the Bayesian evidence) required in the BIC criterion [43]. Incidentally, when modeling in the Bayesian framework, the marginal likelihood usually cannot be calculated analytically, but it can be approximated by several computational methods [44,45,46].
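Putting the pieces together, the selection rule of Equation (72) is a one-dimensional search over a candidate grid of $\lambda$ values. The sketch below reuses the `L0` and `mm2_nonuniform` sketches given earlier; the candidate grid and the zero-threshold $\varepsilon$ are illustrative choices of ours:

```python
import numpy as np

def select_lambda(lambdas, X, B, eps_zero=1e-3):
    """Modified BIC of Eq. (72) for the spatially non-uniform noise case."""
    m, n = X.shape
    k = B.shape[1] - m + 1
    J = np.zeros((m, m)); J[-1, -1] = 1.0
    best_lam, best_score = None, np.inf
    for lam in lambdas:
        r_hat = mm2_nonuniform(X, B, lam)                 # solve Eq. (15)
        W_hat = B @ (r_hat[:, None] * B.conj().T) + J
        head = r_hat[:k]
        z = int(np.sum(head > eps_zero * head.max()))     # nonzero count z_{lambda,1}
        score = L0(W_hat, X) + z * np.log(n) / (2 * n)    # Eq. (72)
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam
```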

5. Numerical Experiments

In this section, we numerically demonstrate the performance of the proposed methods. Consider a uniform linear array (ULA); for each $j = 1, \ldots, k_0$, the steering vector corresponding to the DOA $\xi_j$ is
$$a(\xi_j) = \big[1, \exp(\imath\pi\cos(\xi_j)), \ldots, \exp(\imath\pi(m-1)\cos(\xi_j))\big]^T. \tag{74}$$
The array output data $x(t)$ are generated from the model
$$x(t) = \sum_{j=1}^{k_0} a(\xi_j)\,\bar s_j(t) + v(t), \quad t = 1, \ldots, n, \tag{75}$$
where both the source signals $\bar s_1(t), \ldots, \bar s_{k_0}(t)$ and the observation noise $v(t)$ are temporally independent. The SNR is defined as
$$\mathrm{SNR\,(dB)} = 10\log_{10}\frac{\mathrm{tr}(\bar A \bar P \bar A^H)}{\mathrm{tr}(V)}, \tag{76}$$
where $\bar A = [a(\xi_1), \ldots, a(\xi_{k_0})]$, $\bar P$ is the covariance matrix of $\bar s(t) = [\bar s_1(t), \ldots, \bar s_{k_0}(t)]^T$, and $V$ is the covariance matrix of $v(t)$. The root mean-square error (RMSE) of the DOA estimates, approximated over $R = 500$ Monte Carlo runs as
$$\mathrm{RMSE} = \sqrt{\frac{1}{R}\sum_{i=1}^{R}\frac{1}{k_0}\sum_{j=1}^{k_0}\big(\hat\xi_{j,i} - \xi_j\big)^2}, \tag{77}$$
is employed to evaluate the estimation performance, where $\hat\xi_{j,i}$ is the estimate of $\xi_j$ in the $i$th Monte Carlo run. In the following experiments, we applied the proposed methods, as well as the SPICE [13,15] and LIKES [15] methods, to estimate the DOAs. The grid points were set as $\theta = \{0°, 1°, \ldots, 180°\}$, and the convergence tolerance was $\epsilon = 10^{-8}$. All methods were coded in MATLAB and executed on a workstation with two 2.10 GHz Intel Xeon CPUs.
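For reproducibility, Experiment-1-style data following Equations (74)-(76) can be generated as below. This is a sketch of ours: the random seed and the way $\sigma^2$ is derived from a target SNR are our own choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)                            # illustrative seed

def steering(xi_deg, m):
    """Steering vector of Eq. (74) for a DOA xi given in degrees."""
    xi = np.deg2rad(xi_deg)
    return np.exp(1j * np.pi * np.arange(m) * np.cos(xi))

def gen_gaussian_data(m=15, n=100, snr_db=0.0,
                      doas=(60, 80, 100, 120), powers=(100.0,) * 4):
    """Draw snapshots from Eq. (75) with Gaussian sources/noise (Experiment 1)."""
    A_bar = np.column_stack([steering(xi, m) for xi in doas])
    P_bar = np.diag(powers)
    # pick sigma^2 so that Eq. (76) gives exactly snr_db
    sig_pow = np.real(np.trace(A_bar @ P_bar @ A_bar.conj().T))
    sigma2 = sig_pow / (m * 10.0 ** (snr_db / 10.0))
    S = np.sqrt(np.asarray(powers) / 2.0)[:, None] * (
        rng.standard_normal((len(doas), n)) + 1j * rng.standard_normal((len(doas), n)))
    V = np.sqrt(sigma2 / 2.0) * (
        rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
    return A_bar @ S + V

X = gen_gaussian_data(snr_db=-1.0)
```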

5.1. Experimental Settings

5.1.1. Experiment 1

Consider a scenario with Gaussian source signals and noise:
  • The number of sensors in the array is m = 15 , and the number of snapshots is n = 100 .
  • There are $k_0 = 4$ sources, located at $\xi_1 = 60°$, $\xi_2 = 80°$, $\xi_3 = 100°$ and $\xi_4 = 120°$, respectively.
  • The signal $\bar s(t)$ is zero-mean circular complex Gaussian with covariance matrix $\bar P = \mathrm{Diag}([100, 100, 100, 100])$.
  • The noise $v(t)$ is zero-mean circular complex Gaussian with covariance matrix $V = \sigma^2 I_m$.

5.1.2. Experiment 2

Consider the scenario where both the source signals and the observation noise are non-Gaussian. Let $m = 15$. We used the following experimental settings (a sketch of the noise draw follows this list):
  • The $k_0 = 4$ source signals from $\xi_1 = 10°$, $\xi_2 = 40°$, $\xi_3 = 40° + \Delta\xi$ and $\xi_4 = 70°$ are, respectively, $\bar s_1(t) = 5\exp(\imath\varphi_1(t))$, $\bar s_2(t) = 10\exp(\imath\varphi_2(t))$, $\bar s_3(t) = 10\exp(\imath\varphi_3(t))$ and $\bar s_4(t) = 5\exp(\imath\varphi_4(t))$, where $\Delta\xi$ is the angle separation between $\xi_2$ and $\xi_3$, and $\varphi_1(t), \ldots, \varphi_4(t)$ are independent and, respectively, distributed as
    $$\varphi_1(t) \sim 2\pi\,\mathrm{Gamma}(1,1), \quad \varphi_2(t) \sim 2\pi\,\mathrm{Gamma}(1,1), \quad \varphi_3(t) \sim U[0, 2\pi], \quad \varphi_4(t) \sim U[0, 2\pi], \tag{78}$$
    where $\mathrm{Gamma}(a, b)$ denotes the Gamma distribution with shape parameter $a$ and scale parameter $b$, and $U[a, b]$ denotes the uniform distribution on the interval $[a, b]$. Note that the source signals $\bar s_1(t), \bar s_2(t), \bar s_3(t)$ and $\bar s_4(t)$ have constant modulus, which is common in communication applications [13].
  • The observation noise $v(t)$ is distributed as
    $$v(t) \sim \sqrt{\varrho}\, n_1, \tag{79}$$
    where $\varrho \sim \mathrm{Gamma}(1, 6)$ and $n_1$ is zero-mean Gaussian with covariance matrix $V = \sigma^2 I_m$.
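The Experiment-2 noise draw can be sketched as below, reading Equation (79) as a compound-Gaussian model in which the texture $\varrho$ scales the Gaussian vector through a square root; the per-snapshot texture draw is an assumption of this sketch, not a detail stated by the paper.

```python
import numpy as np

rng = np.random.default_rng(1)                            # illustrative seed

def compound_gaussian_noise(m, n, sigma2, shape=1.0, scale=6.0):
    """Draw v(t) = sqrt(rho) * n_1 per Eq. (79), rho ~ Gamma(shape, scale),
    with n_1 zero-mean circular Gaussian of covariance sigma2 * I_m.
    Drawing one texture per snapshot is our assumption."""
    rho = rng.gamma(shape, scale, size=n)                 # texture per snapshot
    g = np.sqrt(sigma2 / 2.0) * (
        rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
    return np.sqrt(rho) * g                               # broadcasts over sensors

V_noise = compound_gaussian_noise(m=15, n=100, sigma2=1.0)
```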

5.1.3. Experiment 3

Consider a scenario with non-Gaussian source signals and non-Gaussian spatially non-uniform white noise. Let $m = 8$ and $n = 100$. The $k_0 = 3$ source signals, located at $\xi_1 = 40°$, $\xi_2 = 50°$ and $\xi_3 = 80°$, are $\bar s_1(t) = 10\exp(\imath\varphi_2(t))$, $\bar s_2(t) = 30\exp(\imath\varphi_3(t))$ and $\bar s_3(t) = 10\exp(\imath\varphi_4(t))$, with $\varphi_2(t), \varphi_3(t), \varphi_4(t)$ given by Equation (78). The observation noise is distributed as
$$v(t) \sim \sqrt{\varrho}\, n_2, \tag{80}$$
where $\varrho \sim \mathrm{Gamma}(2, 2)$ and $n_2$ is zero-mean Gaussian with covariance matrix $V = \sigma^2\,\mathrm{Diag}([m, m-1, \ldots, 1])$.

5.2. Experimental Results

In Experiment 1, the MM1-CD, MM1-SPICE and MM2 algorithms with $\lambda = 0$ were applied first. Table 1 reports the iteration numbers $n_1$, $n_2$, $n_3$, the computational time $\tau$ (in seconds), and the DOA estimation RMSEs (in degrees). Specifically, $n_1$ is the total number of iterations, $n_2$ is the number of iterations in the outer loop, and $n_3$ is the average number of iterations in a nested inner loop. Apart from the RMSEs, each value in Table 1 is the average of the results of 500 Monte Carlo runs. Note that, even though $n_1 = n_2 \times n_3$ in a single Monte Carlo run, $n_1$ is not exactly equal to $n_2 \times n_3$ in Table 1 because the entries are averages over 500 runs.
As shown in Table 1, the MM1-CD and MM1-SPICE algorithms had similar RMSEs, and the MM1-CD algorithm took much more time than the MM1-SPICE algorithm. Considering that the MM1-CD and MM1-SPICE algorithms theoretically converge to the same local solution, we expect them to always have similar RMSEs. Moreover, the MM2 algorithm had the smallest iteration numbers, the least computational time and satisfactory RMSEs. In the following, we present the DOA estimation performance of only the MM1-SPICE and MM2 algorithms. Note that the MM1-SPICE and MM2 algorithms may converge to different local solutions, so they are referred to as two different estimation methods hereinafter. Specifically, we compared the following six methods:
  • LIKES: The ML method proposed in [15] under the assumption that the array output x ( t ) is Gaussian distributed.
  • SPICE: A sparse covariance-based estimation method proposed in [13] with no distribution assumption.
  • MM1-SPICE-0: The MM1-SPICE method with the penalty parameter λ = 0.
  • MM1-SPICE-P: The MM1-SPICE method with the penalty parameter selected by the criterion in Section 4.4.
  • MM2-0: The MM2 method with the penalty parameter λ = 0 .
  • MM2-P: The MM2 method with the penalty parameter selected by the criterion in Section 4.4.
It is worth mentioning that, in Experiments 1–3, the standard MUSIC algorithm was found to be almost ineffective; thus, its results are not shown.
The RMSE comparisons of the above six methods are illustrated in Figure 1, Figure 2 and Figure 3 for Experiments 1–3 with different SNRs, respectively. Figure 4 and Figure 5 for Experiment 2 show the DOA estimation RMSEs of the above six methods versus the number of snapshots and the angle separation, respectively.
In Figure 1, we can see that the MM1-SPICE-0 and MM2-0 methods were comparable to the LIKES and SPICE methods in the Gaussian case. As shown in Figure 2, Figure 3, Figure 4 and Figure 5, in the scenarios with non-Gaussian random source signals and heavy-tailed random noise, the MM1-SPICE-0 and MM2-0 methods performed much better than the LIKES and SPICE methods. In other words, the ML methods designed for CES outputs were effective in the simulated Gaussian and non-Gaussian scenarios.
As shown in Figure 1, Figure 2, Figure 3, Figure 4 and Figure 5, the penalized ML methods, i.e., the MM1-SPICE-P and MM2-P methods, had smaller RMSEs than the other four methods. As shown in Figure 4, as the number of snapshots increased, the performance of the MM1-SPICE-P and MM2-P methods improved, while the performance of the MM1-SPICE-0 and MM2-0 methods remained virtually unchanged. Figure 5 shows that, as $\Delta\xi$ increased from 2° to 6°, the performance of the MM1-SPICE-P and MM2-P methods became better, while, once $\Delta\xi$ exceeded 4°, the performance of the MM1-SPICE-0 and MM2-0 methods no longer improved. As shown in Figure 1, Figure 2 and Figure 3, unsurprisingly, the performance of all six methods improved as the SNR increased.
To illustrate the difference between the penalized and un-penalized ML methods, we evaluated the normalized spectrum (NS):
$$\mathrm{NS} = \hat r_{[1:k]}/\max(\hat r_{[1:k]}) \quad \text{or} \quad \hat q/\max(\hat q), \tag{81}$$
where $\hat r$ and $\hat q$ are the estimates of $r$ and $q$, respectively.
Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 report the NSs for Experiments 1–3 obtained by the MM2-0 and MM2-P methods (the NS curves of the MM1-SPICE-0 and MM1-SPICE-P methods are similar to those of the MM2-0 and MM2-P methods, respectively). Figure 6 and Figure 10 are for Experiments 1 and 3 with SNR = 0 dB, respectively. Figure 7 and Figure 8 are both for Experiment 2 with $n = 100$ and SNR = 0 dB, but they involve two scenarios with different angle separations: Figure 8 is for the very small $\Delta\xi = 3°$, as opposed to Figure 7 for $\Delta\xi = 15°$. Figure 9 is also for Experiment 2, but for a scenario with a small snapshot size $n = 20$, a low SNR = −10 dB and a moderate angle separation $\Delta\xi = 10°$. Note that the red dashed lines mark the true DOAs, and the NS curves in each figure come from one randomly selected realization. Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 clearly show that the MM2-P method, with the $\ell_1$-norm penalty added, yielded a higher angular resolution.

6. Conclusions

This paper provides a penalized ML method for DOA estimation under the assumption that the array output has a CES distribution. Two MM algorithms, named MM1 and MM2, are developed to solve the non-convex penalized ML optimization problems for spatially uniform and non-uniform noise. They converge locally with different rates. The numerical experiments showed that MM2 ran faster than MM1 and performed as well when estimating the DOAs. Their convergence rates will be further explored theoretically in future work.
It is worth mentioning that, in the numerical simulations, the proposed $\ell_1$-norm penalized likelihood method effectively estimated the DOAs, although the values of the nonzero elements of $r$ or $p$ were not estimated very accurately. By replacing the $\ell_1$-norm penalty with other proper penalties (e.g., the smoothly clipped absolute deviation (SCAD) penalty [47], the adaptive LASSO penalty [48], or the $\ell_p$-norm penalty with $0 \leq p < 1$), more accurate estimates of the unknown parameters may be derived in the future.
If the $\ell_1$-norm penalty is replaced by the adaptive LASSO penalty, the algorithms proposed in this paper can be applied almost without modification, because the adaptive LASSO penalty is a weighted $\ell_1$-norm penalty. When the non-convex SCAD or $\ell_p$-norm ($0 \leq p < 1$) penalty is employed, the convex majorizing functions given in [40,49] can be exploited within the MM framework, and the algorithms in this paper are then applicable with minor modifications.
When informative prior knowledge on the directions of the source signals is available, the DOA estimation problem can be formulated, from the Bayesian perspective, as maximizing a sparse posterior likelihood. A sparse Bayesian method for estimating DOAs from CES array outputs is interesting and worth studying, although how to formulate and solve the sparse posterior optimization and how to perform model selection remain challenging. The BIC criterion can be used for model selection, in which the Bayesian evidence, which is difficult to derive analytically, can be approximated by the methods in [44,45,46]. More research along this line will be done in the future.

Author Contributions

Conceptualization, C.C. and J.Z.; Funding acquisition, J.Z.; Methodology, C.C., J.Z. and M.T.; Software, C.C.; Supervision, J.Z.; Validation, C.C.; Writing—original draft, C.C.; and Writing—review and editing, J.Z. and M.T.

Funding

This work was supported in part by the National Natural Science Foundation of China under grant 61374027, the Application Basic Research Project of Sichuan Province under grant 2019YJ0122, and the Program for Changjiang Scholars and Innovative Research Team in University under grant IRT_16R53 from the Chinese Education Ministry.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Schmidt, R.O. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280.
2. Roy, R.; Kailath, T. ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 984–995.
3. Stoica, P.; Sharman, K.C. Maximum likelihood methods for direction-of-arrival estimation. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 1132–1143.
4. Friedlander, B. The root-MUSIC algorithm for direction finding with interpolated arrays. Signal Process. 1993, 30, 15–29.
5. Yang, Z.; Li, J.; Stoica, P.; Xie, L. Sparse methods for direction-of-arrival estimation. In Academic Press Library in Signal Processing, Volume 7: Array, Radar and Communications Engineering; Chellappa, R., Theodoridis, S., Eds.; Academic Press: London, UK, 2018; Chapter 11; pp. 509–581.
6. Gorodnitsky, I.F.; Rao, B.D. Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Trans. Signal Process. 1997, 45, 600–616.
7. Tipping, M.E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244.
8. Wipf, D.P.; Rao, B.D. Sparse Bayesian learning for basis selection. IEEE Trans. Signal Process. 2004, 52, 2153–2163.
9. Malioutov, D.; Çetin, M.; Willsky, A.S. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 2005, 53, 3010–3022.
10. Babacan, S.D.; Molina, R.; Katsaggelos, A.K. Bayesian compressive sensing using Laplace priors. IEEE Trans. Image Process. 2010, 19, 53–63.
11. Hyder, M.M.; Mahata, K. Direction-of-arrival estimation using a mixed l2,0 norm approximation. IEEE Trans. Signal Process. 2010, 58, 4646–4655.
12. Xu, X.; Wei, X.; Ye, Z. DOA estimation based on sparse signal recovery utilizing weighted l1-norm penalty. IEEE Signal Process. Lett. 2012, 19, 155–158.
13. Stoica, P.; Babu, P.; Li, J. SPICE: A sparse covariance-based estimation method for array processing. IEEE Trans. Signal Process. 2011, 59, 629–638.
14. Stoica, P.; Babu, P.; Li, J. New method of sparse parameter estimation in separable models and its use for spectral analysis of irregularly sampled data. IEEE Trans. Signal Process. 2011, 59, 35–46.
15. Stoica, P.; Babu, P. SPICE and LIKES: Two hyperparameter-free methods for sparse-parameter estimation. Signal Process. 2012, 92, 1580–1590.
16. Stoica, P.; Zachariah, D.; Li, J. Weighted SPICE: A unifying approach for hyperparameter-free sparse estimation. Digit. Signal Process. 2014, 33, 1–12.
17. Yin, J.; Chen, T. Direction-of-arrival estimation using a sparse representation of array covariance vectors. IEEE Trans. Signal Process. 2011, 59, 4489–4493.
18. Zheng, J.; Kaveh, M. Sparse spatial spectral estimation: A covariance fitting algorithm, performance and regularization. IEEE Trans. Signal Process. 2013, 61, 2767–2777.
19. Liu, Z.M.; Huang, Z.T.; Zhou, Y.Y. Array signal processing via sparsity-inducing representation of the array covariance matrix. IEEE Trans. Aerosp. Electron. Syst. 2013, 49, 1710–1724.
20. He, Z.Q.; Shi, Z.P.; Huang, L. Covariance sparsity-aware DOA estimation for nonuniform noise. Digit. Signal Process. 2014, 28, 75–81.
21. Chen, T.; Wu, H.; Zhao, Z. The real-valued sparse direction of arrival (DOA) estimation based on the Khatri-Rao product. Sensors 2016, 16, 693.
22. Wang, Y.; Hashemi-Sakhtsari, A.; Trinkle, M.; Ng, B.W.H. Sparsity-aware DOA estimation of quasi-stationary signals using nested arrays. Signal Process. 2018, 144, 87–98.
23. Shutin, D.; Fleury, B.H. Sparse variational Bayesian SAGE algorithm with application to the estimation of multipath wireless channels. IEEE Trans. Signal Process. 2011, 59, 3609–3623.
24. Hu, L.; Shi, Z.; Zhou, J.; Fu, Q. Compressed sensing of complex sinusoids: An approach based on dictionary refinement. IEEE Trans. Signal Process. 2012, 60, 3809–3819.
25. Yang, Z.; Zhang, C.; Xie, L. Robustly stable signal recovery in compressed sensing with structured matrix perturbation. IEEE Trans. Signal Process. 2012, 60, 4658–4671.
26. Yang, Z.; Xie, L.; Zhang, C. Off-grid direction of arrival estimation using sparse Bayesian inference. IEEE Trans. Signal Process. 2013, 61, 38–43.
27. Fan, Y.; Wang, J.; Du, R.; Lv, G. Sparse method for direction of arrival estimation using denoised fourth-order cumulants vector. Sensors 2018, 18, 1815.
28. Wu, X.; Zhu, W.P.; Yan, J.; Zhang, Z. Two sparse-based methods for off-grid direction-of-arrival estimation. Signal Process. 2018, 142, 87–95.
29. Bhaskar, B.N.; Tang, G.; Recht, B. Atomic norm denoising with applications to line spectral estimation. IEEE Trans. Signal Process. 2013, 61, 5987–5999.
30. Yang, Z.; Xie, L.; Zhang, C. A discretization-free sparse and parametric approach for linear array signal processing. IEEE Trans. Signal Process. 2014, 62, 4959–4973.
31. Yang, Z.; Xie, L. Enhancing sparsity and resolution via reweighted atomic norm minimization. IEEE Trans. Signal Process. 2016, 64, 995–1006.
32. Wang, X.; Wang, W.; Li, X.; Liu, Q.; Liu, J. Sparsity-aware DOA estimation scheme for noncircular source in MIMO radar. Sensors 2016, 16, 539.
33. Jing, X.; Liu, X.; Liu, H. A sparse recovery method for DOA estimation based on the sample covariance vectors. Circuits Syst. Signal Process. 2017, 36, 1066–1084.
34. Stoica, P.; Babu, P. Sparse estimation of spectral lines: Grid selection problems and their solutions. IEEE Trans. Signal Process. 2012, 60, 962–967.
35. Ollila, E.; Tyler, D.E.; Koivunen, V.; Poor, H.V. Complex elliptically symmetric distributions: Survey, new results and applications. IEEE Trans. Signal Process. 2012, 60, 5597–5625.
36. Sun, Y.; Babu, P.; Palomar, D.P. Robust estimation of structured covariance matrix for heavy-tailed elliptical distributions. IEEE Trans. Signal Process. 2016, 64, 3576–3590.
37. Pourahmadi, M. High-Dimensional Covariance Estimation; John Wiley & Sons: Hoboken, NJ, USA, 2013.
38. Sangston, K.J.; Gini, F.; Greco, M.S. Coherent radar target detection in heavy-tailed compound-Gaussian clutter. IEEE Trans. Aerosp. Electron. Syst. 2012, 48, 64–77.
39. Hunter, D.R.; Lange, K. A tutorial on MM algorithms. Am. Stat. 2004, 58, 30–37.
40. Sun, Y.; Babu, P.; Palomar, D.P. Majorization-minimization algorithms in signal processing, communications, and machine learning. IEEE Trans. Signal Process. 2017, 65, 794–816.
41. Kent, J.T.; Tyler, D.E. Maximum likelihood estimation for the wrapped Cauchy distribution. J. Appl. Stat. 1988, 15, 247–254.
42. Luo, Z.Q.; Tseng, P. On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 1992, 72, 7–35.
43. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
44. Friel, N.; Wyse, J. Estimating the evidence—A review. Stat. Neerl. 2012, 66, 288–308.
45. Martino, L.; Elvira, V.; Luengo, D.; Corander, J. An adaptive population importance sampler: Learning from uncertainty. IEEE Trans. Signal Process. 2015, 63, 4422–4437.
46. Martino, L.; Elvira, V.; Luengo, D.; Corander, J. Layered adaptive importance sampling. Stat. Comput. 2017, 27, 599–623.
47. Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360.
48. Zou, H. The adaptive Lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
49. Fan, J.; Feng, Y.; Wu, Y. Network exploration via the adaptive LASSO and SCAD penalties. Ann. Appl. Stat. 2009, 3, 521–541.
Figure 1. DOA estimation RMSE versus SNR for Experiment 1.
Figure 2. DOA estimation RMSE versus SNR for Experiment 2 with n = 100 and Δξ = 15°.
Figure 3. DOA estimation RMSE versus SNR for Experiment 3.
Figure 4. DOA estimation RMSE versus the number of snapshots n for Experiment 2 with SNR = 0 dB and Δξ = 15°.
Figure 5. DOA estimation RMSE versus the angle separation Δξ for Experiment 2 with n = 100 and SNR = 0 dB.
Figure 6. NSs of Experiment 1 with SNR = 0 dB.
Figure 7. NSs of Experiment 2 with n = 100, SNR = 0 dB and Δξ = 15°.
Figure 8. NSs of Experiment 2 with n = 100, SNR = 0 dB and Δξ = 3°.
Figure 9. NSs of Experiment 2 with n = 20, SNR = −10 dB and Δξ = 10°.
Figure 10. NSs of Experiment 3 with SNR = 0 dB.
Table 1. Comparison of computational complexities and RMSEs.

SNR      Method      $n_1$      $n_2$     $n_3$     $\tau$ (s)   RMSE (°)
−1 dB    MM1-CD      525.5600   16.0440   32.7550   27.9710      0.0877
         MM1-SPICE   317.6500   16.9280   18.7370   0.3805       0.0926
         MM2         129.5000   —         —         0.3266       0.1059
−3 dB    MM1-CD      539.2700   16.0000   33.7040   28.7700      0.2345
         MM1-SPICE   310.0300   16.6260   18.6070   0.3714       0.2434
         MM2         129.3800   —         —         0.3260       0.2568
−5 dB    MM1-CD      561.0900   16.4380   34.1500   29.3910      0.4292
         MM1-SPICE   303.1400   16.3340   18.5130   0.3590       0.4681
         MM2         129.1400   —         —         0.3195       0.4840
