Article

Nonsparse SAR Scene Imaging Network Based on Sparse Representation and Approximate Observations

1 School of Information and Navigation, Air Force Engineering University, Xi'an 710077, China
2 Key Laboratory for Information Science of Electromagnetic Waves (Ministry of Education), Fudan University, Shanghai 200433, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(17), 4126; https://doi.org/10.3390/rs15174126
Submission received: 17 July 2023 / Revised: 8 August 2023 / Accepted: 16 August 2023 / Published: 22 August 2023
(This article belongs to the Special Issue Advances in Synthetic Aperture Radar Data Processing and Application)

Abstract: Sparse-representation-based synthetic aperture radar (SAR) imaging technology has shown superior potential in the reconstruction of nonsparse scenes. However, many existing compressed sensing (CS) methods with sparse representation cannot obtain an optimal sparse basis and apply only to the sensing matrix obtained by exact observation, resulting in low image quality and high storage cost. To reduce the computational cost and improve the imaging performance for nonsparse scenes, we formulate a deep learning SAR imaging method based on sparse representation and an approximated observation deduced from the chirp-scaling algorithm (CSA). First, we incorporate the CSA-derived approximated observation model and a nonlinear transform function within a sparse reconstruction framework. Second, an iterative shrinkage threshold algorithm is adopted to solve this framework, and the solving process is unfolded as a deep SAR imaging network. Third, a dual-path convolutional neural network (CNN) block is designed in the network to achieve the nonlinear transform, dramatically improving the sparse representation capability over conventional transform-domain-based CS methods. Last, we improve the CNN block to develop an enhanced version of the deep SAR imaging network, in which all the parameters are layer-varied and trained by supervised learning. The experiments demonstrate that our two proposed imaging networks outperform conventional CS-driven and deep-learning-based methods in terms of computing efficiency and reconstruction performance for nonsparse scenes.

1. Introduction

Synthetic aperture radar (SAR) can achieve high-resolution images by wideband signal pulse compression and virtual aperture synthesis, unaffected by daylight or weather conditions [1]. As an active microwave imaging system, SAR transmits a series of pulses and receives the echoes, which are generally processed by matched filter (MF)-based SAR imaging methods, such as the range-Doppler algorithm (RDA) [2], chirp-scaling algorithm (CSA) [3], and back-projection algorithm (BPA) [4]. As these traditional MF-based focusing algorithms have developed, more measurements of the echo data are required to satisfy the demand for increased resolution, which makes the storage and processing of SAR data difficult for the system hardware. Over the past decade, with the emergence of compressed sensing (CS) and related deep learning (DL) methods, SAR imaging technology has progressed to high-dimensional data and plays an increasingly significant role in military and civilian fields [5].

1.1. Previous Work

Different from the Nyquist sampling theorem, CS provides the possibility of accurately reconstructing sparse signals using a small quantity of observation data [6,7]. CS-driven SAR imaging, also known as sparse SAR imaging, has recently become an active research topic and has received extensive attention [8]. These studies primarily focus on the potential application of SAR imaging technology innovated by CS theory. They have indicated that if the SAR imaging result is a sparse signal in the spatial domain or a transform domain, it can be reconstructed with high precision from an incomplete SAR echo by constructing an optimization problem [9,10]. Sparse SAR imaging can be considered an inverse optimization problem, which can be directly solved using iterative algorithms when the true scattered field is sparse in the spatial domain, e.g., the iterative shrinkage threshold algorithm (ISTA) [11], the approximate message passing (AMP) algorithm [12], and the alternating direction method of multipliers (ADMM) [13]. However, the scattered field of a SAR scene is often nonsparse or weakly sparse, and thus it needs to be sparsely represented. This limitation restricts the application of sparse SAR imaging.
To solve the above-mentioned limitation, reconstruction methods that combine iterative algorithms with sparse representation, which formulate the reconstruction framework with sparsity constraints, have been developed. Rilling et al. [14] investigated a transform-domain-based reconstruction technique in which only a specific transform basis, i.e., a wavelet basis, was employed to transform the nonsparse areas. In practice, the features of SAR images are complicated, so it is difficult to obtain a common transform basis or sparse representation. As a result, joint sparse transform-based approaches, including combined dictionaries, structured sparse representation, and mixed sparse representation [15,16,17,18], have been successively proposed to enhance the sparsity exploration ability of the SAR reconstruction framework. In [15], an image formation technique with sparse representation, which relied on combined dictionaries to obtain multiple feature-enhanced reconstructed images, was developed. By incorporating an adaptive sparse representation space into the SAR reconstruction framework, Shen et al. [16] proposed a structural-sparse-representation-assisted SAR imaging algorithm in which SAR image reconstruction and adaptive sparse space updating were constructed as a joint optimization problem. However, the above studies adopted the exact observation model to build an optimization problem for reconstructing nonsparse SAR images. Because of the vectorization of 2D scattering coefficient matrices, such exact observation models require much more computational resources, thus rendering them inefficient for large-scale matrices. Fortunately, a sparse reconstruction framework based on approximated observations was proposed in [19] to solve this problem. On this basis, Li et al. [18] proposed the mixed sparse representation for approximated observation SAR imaging, dramatically improving the computational efficiency of large-scale matrices while guaranteeing the sparsity of related signals. Although this kind of imaging algorithm can eliminate the difficulties in storing and processing data, it remains limited by two general shortcomings of CS-driven SAR imaging: the difficulty of determining the optimal parameter values and the complexity of the reconstruction mechanism.
The recent development of DL technology has triggered extensive research on SAR image restoration, providing new research methods to overcome the shortcomings of conventional CS-driven SAR imaging. Convolutional neural networks (CNNs), as powerful networks in DL technology, have been employed in the field of SAR imaging to advance state-of-the-art technology for image reconstruction and target focusing. A representative study on embedding CNN into reconstruction was published in [20], in which a complex-valued CNN structure was introduced to enhance the imaging functionality of the inverse SAR (ISAR) signal processing framework. In [21], Mu et al. proposed a deep CNN imaging method named DeepImaging to refocus SAR moving targets and eliminate residual clutter. Similarly, U-Net, a CNN-based network, was improved in [22] to achieve a high-resolution focused image by addressing the defocused data generated by the RDA. In [23], a deep convolutional encoder, also based on a CNN, was combined with a residual network to approximate the SAR imaging process of the traditional RDA. In this case, the encoder mapped the echo data to the SAR image, and the residual network subsequently enhanced the image resolution. For all the above methods, the CNN is utilized as a “black box” to learn the mapping from sampled data or defocused images to high-quality images. Consequently, these works lack interpretable characteristics since they do not explicitly investigate imaging models or explore image sparsity.
In contrast to traditional neural networks, deep unfolding methods combine model-driven iterative algorithms and data-driven deep networks. This approach solves the inverse imaging problem by converting the traditional CS-driven SAR reconstruction algorithm to a deep network, which improves imaging performance and interpretability. Thus, deep unfolding is more appropriate for SAR image formation than CNN. Inspired by this concept, several deep unfolding imaging methods based on iterative algorithms have emerged in the field of SAR imaging, but they remain in the preliminary stage. In [24], Yonel et al. unfolded the ISTA on a recurrent neural network architecture to establish an inverse problem solver for passive SAR reconstruction. Then, the solver was further transformed into an autoencoder (named ISTA-Net) by adding a forward model as the decoder so that it could be trained by unsupervised learning. However, due to its vectorized SAR imaging model, ISTA-Net wastes storage space, which limits the scope of imaging spaces. An extension network named range-migration kernel-based ISTA-Net (RMIST-Net), which was presented in [25], was designed to directly process 2D complex-valued echo data instead of vectorization. Similarly, a target-oriented SAR imaging network, obtained by unfolding an MF-based ADMM iterative solution, was proposed in [26] to enhance the signal-to-clutter ratio of the desired target. For SAR imaging of moving targets, deep-unfolding-based focusing methods have been proposed to reconstruct sparse ship targets [27,28]. In addition, deep unfolding technology can also be applied to the sparse imaging of ISAR, such as ADMM-Net [29] and AF-AMPNet [30]. Sufficient experiments verified that deep-unfolding-based imaging methods outperformed conventional CS-driven reconstruction algorithms. Note that the above microwave imaging networks for stationary and moving targets can only be applied to sparse scenes. To overcome this limitation, an RDA-based deep unfolding imaging network named RDA-Net is proposed in [31]. Although RDA-Net can reconstruct nonsparse SAR scenes by learning the corresponding compensation matrices of RDA, setting these matrices as learnable parameters still requires a substantial amount of storage space. Furthermore, due to the lack of sparse representation modules, the robustness and accuracy of RDA-Net with unsupervised learning are low in nonsparse scenes. In our previous work [5], we proposed a 1D SAR imaging network with sparse representation capability, but it required a much higher computational cost due to the exact observation model. As a result, it is difficult for current deep unfolding technology to effectively solve the SAR image reconstruction problem for complex scenes while saving storage space.

1.2. Motivation

After analyzing the previous works on SAR imaging methods, we draw the following conclusions. (1) Traditional CS methods require the integration of multiple algorithms to solve the nonsparse SAR scene imaging problem, and the lack of an optimal sparse basis results in unsatisfactory imaging efficiency. (2) Current DL-based methods cannot achieve satisfactory reconstruction results for complex and large-scale scenes.
In this paper, we investigate a DL-based SAR imaging method, motivated by the desire to enhance the working efficiency and image quality of nonsparse SAR scene imaging. To this end, we propose a novel deep unfolding strategy for SAR imaging of complex scenes that reduces the computational burden and improves reconstruction performance.

1.3. Main Work and Contributions

The proposed strategy utilizes the significant representation performance of a CNN to accurately reconstruct nonsparse scenes, rather than learning the large-scale MF compensation matrices in SAR imaging networks such as RDA-Net. First, we establish a sparse reconstruction framework for complex scenes by integrating a CSA-derived approximated observation model and a nonlinear transform function. Second, the overall architecture of an end-to-end deep SAR imaging network is determined by unfolding the iterative process of solving this framework with the ISTA. To improve the imaging accuracy of nonsparse SAR scenes, we design a learnable dual-path CNN block that outperforms conventional nonlinear transform functions in terms of sparse transform. In addition, we develop an enhanced version of the deep SAR imaging network by strengthening the sparse representation capability of the CNN block. Experimental results demonstrate that our proposed imaging networks (abbreviated as nets) achieve impressive reconstruction results of nonsparse SAR scenes while reducing the computational cost. The main contributions of this article are summarized as follows:
(1)
Compared to the 1D exact observation model, the advantages and feasibility of the 2D approximated observation model are analyzed. Then, by decoupling the CSA operation process, we develop a specific 2D SAR sparse reconstruction framework and introduce its imaging mechanism with the ISTA as a solution example.
(2)
Combining the advantages of deep unfolding and a CNN, a deep SAR imaging net and its enhanced version (SR-CSA-Net and SR-CSA-Net-plus, respectively) are developed for nonsparse SAR reconstruction, with internal modules designed as dual-path structures so that complex-valued SAR echoes can serve as the input to the imaging nets.
(3)
A CNN-based nonlinearity module is designed in SR-CSA-Net to automatically learn the optimal sparse transform. Furthermore, we redesign the nonlinearity module with skip connections in SR-CSA-Net-plus to fully exploit the high-frequency component of the scattering coefficient and improve the reconstruction performance.
(4)
The proposed imaging nets are trained end-to-end via supervised learning, and the parameters are learned using cleverly designed loss functions. Specifically, these well-designed loss functions simultaneously take into account both structure symmetry constraints and reconstruction discrepancies.
(5)
Extensive experiments verify that our two proposed imaging nets outperform conventional CS-driven sparse reconstruction algorithms and several existing deep unfolding imaging methods by large margins.
This paper is organized as follows: Section 2 discusses the CSA-derived SAR approximated observation imaging model. Section 3 illustrates the details of sparse-representation-based deep SAR imaging nets. In Section 4, experimental results and performance analyses are presented. Section 5 concludes this paper.
Notations: A vector is denoted by a bold lowercase letter $\mathbf{x}$, a matrix by a bold capital letter $\mathbf{X}$, and a variable or scalar by a lowercase letter $x$. The real and complex field domains are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively. In addition, $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$ denote the transpose, conjugate, and Hermitian transpose, respectively, of matrices or operators.

2. SAR Imaging Model Based on Approximated Observation

In Section 2.1, the elementary theory of sparse SAR imaging based on the exact observation model is summarized. To reduce the computational cost, we investigate the general formalization of the 2D approximated observation model and verify its feasibility in Section 2.2. In Section 2.3, we propose a concrete example derived from the CSA procedure to explicitly construct the 2D approximated observation model. The CSA-derived model is generalized to a sparse-representation-based SAR imaging model in Section 2.4.

2.1. One-Dimensional Exact Observation Model for SAR Imaging

Vertical side-looking, strip-mode SAR was adopted for sparse imaging in this paper, and the geometric structure of the imaging system is shown in Figure 1. The observation area can be regarded as a series of scattering centers located on $P \times Q$ grids, in which the grid coordinates along the range (x-axis) and azimuth (y-axis) directions are $p = 1, 2, \ldots, P$ and $q = 1, 2, \ldots, Q$, respectively. Therefore, the scattering coefficient of the imaging scene is expressed as the following scattering coefficient matrix:
$$\boldsymbol{\Xi} = \begin{bmatrix} \sigma(x_1, y_1) & \cdots & \sigma(x_1, y_Q) \\ \vdots & \sigma(x_p, y_q) & \vdots \\ \sigma(x_P, y_1) & \cdots & \sigma(x_P, y_Q) \end{bmatrix} \in \mathbb{C}^{P \times Q} \tag{1}$$
where $\sigma(x_p, y_q)$ denotes the scattering coefficient at coordinate $(x_p, y_q)$. As shown in Figure 1b, it is assumed that the range and azimuth sampling points on each grid are $m = 1, 2, \ldots, M$ and $n = 1, 2, \ldots, N$, respectively.
The 1D exact observation model of sparse SAR imaging represents the relationship between SAR echo signals and the scattering distribution of imaging scenes. The downsampling observation model of the echo signal is expressed as
$$\mathbf{s}_d = \boldsymbol{\Psi} \cdot \mathbf{s} + \mathbf{n}_0 = \boldsymbol{\Psi} \boldsymbol{\Phi} \boldsymbol{\sigma} + \mathbf{n}_0 \tag{2}$$
where $\mathbf{s} \in \mathbb{C}^{MN \times 1}$ is the vectorized original echo signal; $\mathbf{s}_d \in \mathbb{C}^{M'N' \times 1}$ is the complex-valued echo vector after downsampling; $\mathbf{n}_0$ is additive white Gaussian noise (AWGN); $\boldsymbol{\Psi} \in \mathbb{R}^{M'N' \times MN}$ is the downsampling matrix, which generally adopts a partial identity matrix; $\boldsymbol{\Phi} \in \mathbb{C}^{MN \times PQ}$ is the measurement matrix composed of phase terms; and $\boldsymbol{\sigma} = \mathrm{vec}(\boldsymbol{\Xi}) \in \mathbb{C}^{PQ \times 1}$ is the scattering coefficient vector, where $\mathrm{vec}(\cdot)$ is the vectorization operator. If the imaging scene is sparse enough and the measurement matrix $\boldsymbol{\Phi}$ meets the conditions of the restricted isometry property (RIP) [32], the unknown SAR image $\boldsymbol{\sigma}$ can be reconstructed from the known echo data by solving the following optimization problem based on $\ell_1$ regularization [24]:
$$\hat{\boldsymbol{\sigma}} = \arg\min_{\boldsymbol{\sigma}} \big\|\mathbf{s}_d - \boldsymbol{\Psi}\boldsymbol{\Phi}\boldsymbol{\sigma}\big\|_2^2 + \lambda \|\boldsymbol{\sigma}\|_1 \tag{3}$$
where $\hat{\boldsymbol{\sigma}}$ approximates the true scattered field $\boldsymbol{\sigma}$, $\lambda\|\cdot\|_1$ denotes the $\ell_1$ norm-based regularization constraint term, and $\lambda$ is the predefined regularization parameter.
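To make Equation (3) concrete, the sketch below solves it with the ISTA in NumPy on a toy problem. The random Gaussian measurement matrix, sparsity level, and noise level are illustrative stand-ins, not an actual SAR observation geometry, where $\boldsymbol{\Psi}\boldsymbol{\Phi}$ would be far larger.

```python
import numpy as np

def soft_threshold(x, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep the phase."""
    mag = np.abs(x)
    return np.where(mag > t, (1 - t / np.maximum(mag, 1e-12)) * x, 0)

def ista(A, s_d, lam=0.05, n_iter=200):
    """Minimize ||s_d - A @ sigma||_2^2 + lam * ||sigma||_1, as in Eq. (3)."""
    mu = 1.0 / np.linalg.norm(A, 2) ** 2      # step size below 1/||A||^2
    sigma = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        r = sigma + mu * A.conj().T @ (s_d - A @ sigma)   # operator update
        sigma = soft_threshold(r, lam * mu)               # nonlinear transform
    return sigma

# Toy example: a 10-sparse scene observed through a random Gaussian matrix.
rng = np.random.default_rng(0)
A = (rng.standard_normal((128, 256)) + 1j * rng.standard_normal((128, 256))) / 16
sigma_true = np.zeros(256, dtype=complex)
idx = rng.choice(256, 10, replace=False)
sigma_true[idx] = rng.standard_normal(10) + 1j * rng.standard_normal(10)
s_d = A @ sigma_true + 0.01 * rng.standard_normal(128)
sigma_hat = ista(A, s_d)
print(np.linalg.norm(sigma_hat - sigma_true) / np.linalg.norm(sigma_true))
```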

2.2. Two-Dimensional Approximated Observation Model for SAR Imaging

The 1D exact observation model vectorizes the 2D echo signal and the scattering coefficient matrix. In this case, the size of the measurement matrix $\boldsymbol{\Phi}$ increases rapidly with the scale of the echo signal and imaging scene, which imposes a substantial computational and storage burden on sparse imaging processing and limits its application. Therefore, a 2D approximated observation model is designed in this subsection to replace the conventional 1D exact observation model, reducing computational and storage costs by a large margin.
It is known that the MF algorithm focuses the echo signal $\mathbf{s}$ in the 2D frequency domain. Let $\mathbf{M}$ represent the MF operation procedure, with which the scattering coefficient vector $\boldsymbol{\sigma}$ is approximately reconstructed by $\hat{\boldsymbol{\sigma}} = \mathbf{M}\mathbf{s}$, where $\mathbf{M}$ satisfies $\mathbf{M}\boldsymbol{\Phi} \approx \mathbf{I}$. It is difficult to establish a 2D approximated observation model by exactly decoupling the measurement matrix $\boldsymbol{\Phi}$ [31]. Conversely, the MF operation procedure $\mathbf{M}$ can be decoupled into a series of operators processed in the range or azimuth dimension, and its inverse operation $\mathbf{M}^{-1}$ can likewise be decoupled. Hence, the inverse operation $\mathbf{M}^{-1}$, as an acceptable alternative to $\boldsymbol{\Phi}$, is integrated with the CS method to design the 2D approximated observation model. We further establish a general conclusion as
$$\mathbf{G} = \mathbf{M}^{-1} \approx \boldsymbol{\Phi} \tag{4}$$
where $\mathbf{M}$ represents any high-resolution MF algorithm, such as the RDA or CSA, and $\mathbf{G}$ is the generalized inverse operation of $\mathbf{M}$, which is referred to as the approximated observation matrix and can be used to implicitly approximate $\boldsymbol{\Phi}$. By substituting $\mathbf{G}$ for $\boldsymbol{\Phi}$, the observation model in Equation (2) is rewritten as an approximated observation model
$$\mathbf{s}_d = \boldsymbol{\Psi}\boldsymbol{\Phi}\boldsymbol{\sigma} + \mathbf{n}_0 = \boldsymbol{\Psi}\mathbf{G}\hat{\boldsymbol{\sigma}} + \mathbf{n}_0. \tag{5}$$
To further establish the 2D approximated observation model, the Kronecker product decomposition, a typical method to reduce the size of a matrix, is introduced in this paper. Assume that the approximated observation matrix $\mathbf{G} \in \mathbb{C}^{MN \times PQ}$ is expressed as $\mathbf{G} = \mathbf{G}_a^T \otimes \mathbf{G}_r$ via the Kronecker product decomposition, where $\otimes$ represents the Kronecker product, and $\mathbf{G}_r \in \mathbb{C}^{M \times P}$ and $\mathbf{G}_a \in \mathbb{C}^{Q \times N}$ denote the observation matrices of the range dimension and azimuth dimension, respectively. Note that the coupling characteristic of the measurement matrix $\boldsymbol{\Phi}$ means that its range measurement matrix $\boldsymbol{\Phi}_r$ and azimuth measurement matrix $\boldsymbol{\Phi}_a$ cannot be wholly separated. Thus, we utilize $\mathbf{G}$ to approximate the observation matrix $\boldsymbol{\Phi}$. On this basis, the 2D approximated observation model is derived as follows:
$$\mathbf{s}_d = \boldsymbol{\Psi}\mathbf{G}\hat{\boldsymbol{\sigma}} + \mathbf{n}_0 = \boldsymbol{\Psi}(\mathbf{G}_a^T \otimes \mathbf{G}_r)\hat{\boldsymbol{\sigma}} + \mathbf{n}_0 = \boldsymbol{\Psi}(\mathbf{G}_a^T \otimes \mathbf{G}_r)\,\mathrm{vec}(\hat{\boldsymbol{\Xi}}) + \mathrm{vec}(\mathbf{N}_0) = \mathrm{vec}\big(\boldsymbol{\Psi}_r \mathbf{G}_r \hat{\boldsymbol{\Xi}} \mathbf{G}_a \boldsymbol{\Psi}_a + \mathbf{N}_0\big) = \mathrm{vec}\big(\boldsymbol{\Psi}_r \mathbf{S} \boldsymbol{\Psi}_a + \mathbf{N}_0\big) = \mathrm{vec}(\mathbf{S}_d) \tag{6}$$
where $\mathbf{S} \in \mathbb{C}^{M \times N}$ is the original 2D echo matrix; $\mathbf{S}_d \in \mathbb{C}^{M' \times N'}$ denotes the 2D echo matrix after downsampling; $\boldsymbol{\Psi}_r \in \mathbb{R}^{M' \times M}$, formed by $M'$ rows of an $M \times M$ identity matrix, and $\boldsymbol{\Psi}_a \in \mathbb{R}^{N \times N'}$, formed by $N'$ columns of an $N \times N$ identity matrix, are the downsampling matrices in the range dimension and azimuth dimension, respectively; and $\hat{\boldsymbol{\Xi}} \in \mathbb{C}^{P \times Q}$ is the reconstructed 2D scattering coefficient matrix. The 2D approximated observation model and its downsampling form with AWGN then become
$$\mathbf{S} = \mathbf{G}_r \hat{\boldsymbol{\Xi}} \mathbf{G}_a, \qquad \mathbf{S}_d = \boldsymbol{\Psi}_r \mathbf{S} \boldsymbol{\Psi}_a + \mathbf{N}_0. \tag{7}$$
In summary, due to the challenge of decomposing $\boldsymbol{\Phi}$ under the Kronecker product decomposition rule, we adopt the approximated observation matrix $\mathbf{G}$ derived from the MF algorithm to construct the 2D approximated observation model, rather than searching for a direct decomposition of the observation matrix $\boldsymbol{\Phi}$.
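The equivalence between the vectorized form in Equation (6) and the matrix form in Equation (7) rests on the identity $\mathrm{vec}(\mathbf{A}\mathbf{X}\mathbf{B}) = (\mathbf{B}^T \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{X})$. A small NumPy check of this identity, with arbitrary toy dimensions, also makes the storage savings explicit:

```python
import numpy as np

rng = np.random.default_rng(1)
M, P, Q, N = 8, 6, 5, 7
G_r = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))  # range factor
G_a = rng.standard_normal((Q, N)) + 1j * rng.standard_normal((Q, N))  # azimuth factor
Xi = rng.standard_normal((P, Q)) + 1j * rng.standard_normal((P, Q))   # scattering matrix

# 2D form, Equation (7) without downsampling and noise: S = G_r @ Xi @ G_a.
S = G_r @ Xi @ G_a

# Equivalent 1D form: vec(S) = (G_a^T kron G_r) vec(Xi), with column-major vec.
vec = lambda X: X.flatten(order="F")
G_big = np.kron(G_a.T, G_r)          # the MN-by-PQ matrix the 2D model never forms
assert np.allclose(vec(S), G_big @ vec(Xi))

# Storage: MN*PQ entries for the 1D model vs. MP + QN for the two 2D factors.
print("1D entries:", G_big.size, " 2D entries:", G_r.size + G_a.size)
```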

2.3. CSA-Derived Approximated Observation Model

Since the MF algorithm is a linear process including a phase multiplication and fast Fourier transform (FFT) in the range and azimuth dimensions, the approximated observation matrices G r and G a are implicitly represented by a series of suboperations obtained from the inversion of the classical MF focusing methods (e.g., RDA, CSA, and BPA). In this subsection, an admissible 2D approximated observation model derived from CSA is examined to show how to achieve the analytic representation of Equation (7).
CSA is a dual-domain hybrid processing imaging algorithm that includes three steps of phase multiplication: (1) differential range cell migration correction (RCMC) (range-Doppler domain), (2) range compression and bulk RCMC (2D frequency domain), and (3) azimuth compression (range-Doppler domain). Then, the imaging operator, denoted as $\mathcal{M}(\cdot)$, and the approximated observation operator, denoted as $\mathcal{G}(\cdot)$, are described as
$$\hat{\boldsymbol{\Xi}} = \mathcal{M}(\mathbf{S}) = \Big\{\mathbf{F}_r^H\big[\mathbf{F}_r\big((\mathbf{S}\mathbf{F}_a) \circ \mathbf{H}_1\big) \circ \mathbf{H}_2\big] \circ \mathbf{H}_3\Big\}\mathbf{F}_a^H \tag{8}$$
$$\mathbf{S} = \mathcal{G}(\hat{\boldsymbol{\Xi}}) = \Big\{\mathbf{F}_r^H\big[\mathbf{F}_r\big((\hat{\boldsymbol{\Xi}}\mathbf{F}_a) \circ \mathbf{H}_3^*\big) \circ \mathbf{H}_2^*\big] \circ \mathbf{H}_1^*\Big\}\mathbf{F}_a^H \tag{9}$$
where $\circ$ denotes the Hadamard product; $\mathbf{F}_r$ and $\mathbf{F}_a$ are the FFT matrices along the range direction and azimuth direction, respectively; $\mathbf{F}^H$ represents the inverse FFT (IFFT) matrix; and $\mathbf{H}_1$, $\mathbf{H}_2$, and $\mathbf{H}_3$ are the quadratic phase function for the chirp scaling operation, the phase function for range compression and bulk RCMC, and the phase function for azimuth compression and residual phase compensation, respectively. Since the phase multiplication operation is a unitary transformation, $\mathcal{G}(\cdot)$ is obtained by deriving the inverse operation of $\mathcal{M}(\cdot)$. Furthermore, $\mathcal{G}(\cdot)$ is also referred to as the CSA operator, which can be regarded as an implicit operation procedure of decomposing $\mathbf{G}$ under the Kronecker product decomposition rule, avoiding the vectorization of echoes and scattering coefficients.
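A minimal NumPy sketch of the operator pair in Equations (8) and (9) is given below. The phase functions $\mathbf{H}_1$–$\mathbf{H}_3$ are random unit-modulus placeholders rather than the physical CSA phase screens, and orthonormal FFTs are used so that the operators are exactly unitary; the final check anticipates the adjoint property established in Theorem 1 below.

```python
import numpy as np

def make_ops(H1, H2, H3):
    """Imaging operator M(.) of Eq. (8) and CSA operator G(.) of Eq. (9).
    H1..H3 are placeholder unit-modulus phase screens here; in practice
    they are determined by the radar system parameters."""
    Fr  = lambda X: np.fft.fft(X, axis=0, norm="ortho")    # range FFT
    FrH = lambda X: np.fft.ifft(X, axis=0, norm="ortho")   # range IFFT
    Fa  = lambda X: np.fft.fft(X, axis=1, norm="ortho")    # azimuth FFT
    FaH = lambda X: np.fft.ifft(X, axis=1, norm="ortho")   # azimuth IFFT

    def M(S):    # echo -> image, Eq. (8)
        return FaH(FrH(Fr(Fa(S) * H1) * H2) * H3)

    def G(Xi):   # image -> echo, Eq. (9): conjugate phases, reverse order
        return FaH(FrH(Fr(Fa(Xi) * np.conj(H3)) * np.conj(H2)) * np.conj(H1))

    return M, G

rng = np.random.default_rng(0)
m, n = 64, 64
H1, H2, H3 = (np.exp(2j * np.pi * rng.random((m, n))) for _ in range(3))
Mop, Gop = make_ops(H1, H2, H3)

S  = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
Xi = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
print(np.allclose(Mop(Gop(Xi)), Xi))                            # M inverts G
print(np.allclose(np.vdot(Mop(S), Xi), np.vdot(S, Gop(Xi))))    # adjoint test: G = M^H
```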
The above operators have a prominent property: both $\mathcal{M}(\cdot)$ and $\mathcal{G}(\cdot)$ are linear processes whose relationship, similar to Equation (4), is described as $\mathcal{G}(\cdot) = \mathcal{M}(\cdot)^{-1} = \mathcal{M}(\cdot)^H$. This conclusion can be further generalized to $\mathbf{G} = \mathbf{M}^H$.
Theorem 1.
The imaging operator $\mathcal{M}(\cdot)$ and the approximated observation operator $\mathcal{G}(\cdot)$ are linear operators, and their relationship satisfies $\mathcal{G}(\cdot) = \mathcal{M}(\cdot)^H$ and $\mathbf{G} = \mathbf{M}^H$.
Proof. 
Based on Equations (8) and (9), the vector forms of $\hat{\boldsymbol{\Xi}}$ and $\mathbf{S}$ are expressed as $\mathrm{vec}(\hat{\boldsymbol{\Xi}}) = \mathrm{vec}(\mathcal{M}(\mathbf{S}))$ and $\mathrm{vec}(\mathbf{S}) = \mathrm{vec}(\mathcal{G}(\hat{\boldsymbol{\Xi}}))$. According to the definition of the CSA, the operators $\mathcal{M}(\cdot)$ and $\mathcal{G}(\cdot)$ are rewritten as the multiplication of a series of suboperation matrices. Then, we deduce
$$\hat{\boldsymbol{\sigma}} = \mathrm{vec}(\mathcal{M}(\mathbf{S})) = \mathbf{M}\mathbf{s} = \hat{\mathbf{F}}_a^H \hat{\mathbf{H}}_3 \hat{\mathbf{F}}_r^H \hat{\mathbf{H}}_2 \hat{\mathbf{F}}_r \hat{\mathbf{H}}_1 \hat{\mathbf{F}}_a \mathbf{s} \tag{10}$$
$$\mathbf{s} = \mathrm{vec}(\mathcal{G}(\hat{\boldsymbol{\Xi}})) = \mathbf{G}\hat{\boldsymbol{\sigma}} = \hat{\mathbf{F}}_a^H \hat{\mathbf{H}}_1^* \hat{\mathbf{F}}_r^H \hat{\mathbf{H}}_2^* \hat{\mathbf{F}}_r \hat{\mathbf{H}}_3^* \hat{\mathbf{F}}_a \hat{\boldsymbol{\sigma}} \tag{11}$$
$$\hat{\mathbf{H}}_x = \mathrm{diag}\big\{\mathrm{vec}(\mathbf{H}_x)\big\} \in \mathbb{C}^{MN \times MN}, \quad x = 1, 2, 3 \tag{12}$$
$$\hat{\mathbf{F}}_r = \mathbf{I}_a \otimes \mathbf{F}_r \in \mathbb{C}^{MN \times MN}, \qquad \hat{\mathbf{F}}_a = \mathbf{F}_a^T \otimes \mathbf{I}_r \in \mathbb{C}^{MN \times MN} \tag{13}$$
where $\hat{\mathbf{F}}_r^H = \mathbf{I}_a \otimes \mathbf{F}_r^H$, $\hat{\mathbf{F}}_a^H = (\mathbf{F}_a^H)^T \otimes \mathbf{I}_r$, $\mathbf{I}_r \in \mathbb{R}^{M \times M}$ and $\mathbf{I}_a \in \mathbb{R}^{N \times N}$ are identity matrices, and $\hat{\mathbf{H}}_x$ is the diagonal matrix that satisfies $\hat{\mathbf{H}}_x^* = \hat{\mathbf{H}}_x^H$. Consequently, it can be concluded from Equations (10)–(13) that $\mathbf{G} = \mathbf{M}^H$ and $\mathcal{G}(\cdot) = \mathcal{M}(\cdot)^H$. □
Theorem 1 proves that the inverse operation of $\mathbf{M}$, as an approximated observation, can be applied to design the 2D approximated observation model by replacing the measurement matrix $\boldsymbol{\Phi}$. Restricted by the form of the operator $\mathcal{G}(\cdot)$, the downsampling observation model in Equation (7) is rewritten as $\mathbf{S}_d = \boldsymbol{\Psi}_r \mathcal{G}(\hat{\boldsymbol{\Xi}}) \boldsymbol{\Psi}_a + \mathbf{N}_0$. Similar to the 1D exact observation model, the scattering coefficient matrix $\boldsymbol{\Xi}$ is reconstructed by transforming the downsampling observation model into an optimization problem, which is expressed as
$$\hat{\boldsymbol{\Xi}} = \arg\min_{\boldsymbol{\Xi}} \big\|\mathbf{S}_d - \boldsymbol{\Psi}_r \hat{\mathbf{S}} \boldsymbol{\Psi}_a\big\|_F^2 + \lambda\|\boldsymbol{\Xi}\|_1 = \arg\min_{\boldsymbol{\Xi}} \big\|\mathbf{S}_d - \boldsymbol{\Psi}_r \mathcal{G}(\boldsymbol{\Xi}) \boldsymbol{\Psi}_a\big\|_F^2 + \lambda\|\boldsymbol{\Xi}\|_1 \tag{14}$$
where $\|\cdot\|_F$ denotes the Frobenius norm, and $\hat{\mathbf{S}} = \mathcal{G}(\boldsymbol{\Xi})$ is the approximate echo data calculated by introducing the true scattered field $\boldsymbol{\Xi}$ into the CSA operator $\mathcal{G}(\cdot)$.
The above derivation method can be extended to yield more approximate observation-based CS algorithms from the inverse operation of other MF methods, such as the RDA and BPA. Note that the decoupled nature of MF methods enables their inverse operation by definition. This paper does not present other possible extensions due to space restrictions.

2.4. Sparse Representation of the SAR Imaging Model

The matrix $\boldsymbol{\Xi}$ is considered sparse if most of its scattering coefficients are zero or very close to zero. The scattered field of complex SAR scenes is generally nonsparse, making it challenging to obtain high-quality SAR images by directly adopting the optimization model in Equation (14). Assuming that the SAR scene is sparse after adopting traditional transform-domain methods (such as the discrete wavelet transform (DWT) [33] or discrete cosine transform (DCT) [34]), the optimization model is rewritten as a sparse-representation-based reconstruction model $\hat{\boldsymbol{\Xi}} = \arg\min_{\boldsymbol{\Xi}} \|\mathbf{S}_d - \boldsymbol{\Psi}_r \hat{\mathbf{S}} \boldsymbol{\Psi}_a\|_F^2 + \lambda\|\boldsymbol{\Xi}_{SR}\|_1$, where $\boldsymbol{\Xi}_{SR} = \mathbf{Y}\boldsymbol{\Xi}$ denotes the sparse representation of $\boldsymbol{\Xi}$ with respect to the transform matrix $\mathbf{Y}$, and the constraint term $\lambda\|\cdot\|_1$ enforces the sparsity of the matrix $\boldsymbol{\Xi}_{SR}$. Then, we further generalize the $\ell_1$ norm-based regularizer $\|\cdot\|_1$ to a more general regularization function $\rho(\cdot)$, which represents the piecewise linear function (PLF) [35] or a constraint function based on the $\ell_q$ norm, where $q \in [0, 1]$. The $\ell_q$ norm-based regularizers with $q = 0$ and $q = 1$ correspond to hard threshold functions and soft threshold functions, respectively.
Assuming that the transform matrix $\mathbf{Y}$ is an orthonormal sparse basis matrix, the scattering coefficient matrix is recovered as $\boldsymbol{\Xi} = \mathbf{Y}^{+}\boldsymbol{\Xi}_{SR}$, where $\mathbf{Y}\mathbf{Y}^{+} = \mathbf{I}$ and $\mathbf{I}$ is the identity operator. In this case, the optimization problem of nonsparse scene reconstruction can be efficiently addressed owing to the orthogonality of $\mathbf{Y}$. However, it remains nontrivial and laborious to obtain a high-quality reconstructed SAR image if the sparse basis is nonorthogonal or even nonlinear in a more complex imaging scene. We therefore generalize the sparse transform matrix to a general nonlinear transform function $\mathcal{F}(\cdot)$ to relax the reversibility and orthogonality requirements, and substitute $\rho(\mathcal{F}(\boldsymbol{\Xi}))$ for $\|\mathbf{Y}\boldsymbol{\Xi}\|_1$ to develop a generalized reconstruction model based on sparse representation:
$$\hat{\boldsymbol{\Xi}} = \arg\min_{\boldsymbol{\Xi}} \big\|\mathbf{S}_d - \boldsymbol{\Psi}_r \hat{\mathbf{S}} \boldsymbol{\Psi}_a\big\|_F^2 + \lambda\,\rho\big(\mathcal{F}(\boldsymbol{\Xi})\big). \tag{15}$$
Many iterative optimization algorithms have been developed to solve this optimization problem, including the iterative threshold algorithm (ITA), ADMM, AMP, etc. Nevertheless, these algorithms universally demand many iterations and extensive calculations to achieve satisfactory reconstruction results. In this paper, the nonlinear transform function and other iterative parameters are set as undetermined variables in network learning to improve the imaging speed and reconstruction performance.
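The correspondence between the $\ell_1$/$\ell_0$ regularizers and the soft/hard threshold functions mentioned above can be stated in a few lines of NumPy; both operate elementwise and extend to complex data through the magnitude:

```python
import numpy as np

def soft_threshold(x, t):
    """q = 1: proximal operator of the l1 norm (magnitude shrinkage)."""
    mag = np.abs(x)
    return np.where(mag > t, (1 - t / np.maximum(mag, 1e-12)) * x, 0)

def hard_threshold(x, t):
    """q = 0: keep entries whose magnitude exceeds t, zero out the rest."""
    return np.where(np.abs(x) > t, x, 0)

x = np.array([0.2, -0.8, 1.5, 0.05])
print(soft_threshold(x, 0.3))   # [ 0.  -0.5  1.2  0. ]
print(hard_threshold(x, 0.3))   # [ 0.  -0.8  1.5  0. ]
```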

3. SAR Imaging Nets Based on Sparse Representation and Approximated Observation

Section 3.1 first elaborates on the design of the CSA- and ISTA-based deep SAR imaging net architecture. Sections 3.2 and 3.3 then present SR-CSA-Net and its enhanced version, which apply to SAR imaging of complex scenes. Last, the network structure and training strategy are analyzed in Section 3.4.

3.1. CSA-Derived 2D SAR Imaging Net Architecture

As one of the ITAs, the ISTA with the $\ell_1$ norm-based regularizer is well known for solving the regularization optimization problem and is widely utilized in sparse SAR imaging. In this subsection, the detailed construction process from the ISTA to the deep-unfolding-based SAR imaging net architecture is introduced step by step. Specifically, the ISTA solves the optimization problem in Equation (3) by alternating between two stages, an operator update and a nonlinear transformation, which are expressed as
$$\mathbf{r}^l = \hat{\boldsymbol{\sigma}}^{l-1} + \mu\cdot(\boldsymbol{\Psi}\boldsymbol{\Phi})^H\big(\mathbf{s}_d - \boldsymbol{\Psi}\boldsymbol{\Phi}\hat{\boldsymbol{\sigma}}^{l-1}\big) \tag{16}$$
$$\hat{\boldsymbol{\sigma}}^l = \mathrm{soft}(\mathbf{r}^l; T) = \mathrm{sign}(\mathbf{r}^l)\circ\max\big(|\mathbf{r}^l| - T,\, 0\big) \tag{17}$$
where $l$ denotes the iteration index; $\mu$ represents the step size, which affects the convergence of the ISTA; $\mathbf{r}^l$ is the operator that processes the residuals; $\mathrm{soft}(\cdot\,;\cdot)$ denotes the nonlinear reconstruction function corresponding to the $\ell_1$ norm-based regularizer ($q = 1$); $\lambda$ and $\mu$ are merged into a threshold parameter $T = \lambda\mu$; and $\mathrm{sign}(\cdot)$ denotes the sign function. Similarly, the ISTA can also effectively solve the 2D optimization problem in Equation (14) owing to the linearity of the CSA operator $\mathcal{G}(\cdot)$. The 2D scattering coefficient matrix $\hat{\boldsymbol{\Xi}}$ is reconstructed by iterating between the operator update and the nonlinear transformation, which can be rewritten as
$$\mathbf{R}^l = \hat{\boldsymbol{\Xi}}^{l-1} + \mu^l\cdot\mathcal{M}\Big(\boldsymbol{\Psi}_r^T\big(\mathbf{S}_d - \boldsymbol{\Psi}_r\mathcal{G}(\hat{\boldsymbol{\Xi}}^{l-1})\boldsymbol{\Psi}_a\big)\boldsymbol{\Psi}_a^T\Big) \tag{18}$$
$$\hat{\boldsymbol{\Xi}}^l = \mathrm{soft}(\mathbf{R}^l; T) = \mathrm{sign}(\mathbf{R}^l)\circ\max\big(|\mathbf{R}^l| - T,\, 0\big). \tag{19}$$
In CS-driven SAR imaging methods, the ISTA is frequently required to undertake numerous iterations to acquire acceptable reconstruction results, accompanied by the challenge of selecting the most appropriate hand-crafted parameters. To overcome the disadvantages of 1D exact observation models and online sparse imaging techniques, we developed a CSA-derived deep SAR imaging net by unfolding the above-mentioned 2D iterative algorithm. This net architecture for 2D SAR imaging consists of a fixed number of L layers, each corresponding to one ISTA iteration. The two stages represented by Equations (18) and (19) are mapped to the monolayer topology, which comprises two update modules: the linearity module R and the nonlinearity module N . Figure 2 depicts the topology architecture of the deep-unfolding-based 2D SAR imaging net.
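A hedged PyTorch sketch of this unfolding is shown below: each of the $L$ layers carries its own learnable $\mu^{(l)}$ and $T^{(l)}$ and applies Equations (18) and (19) once. For brevity the sketch assumes full sampling ($\boldsymbol{\Psi}_r = \boldsymbol{\Psi}_a = \mathbf{I}$) and operates on complex tensors directly, whereas the paper's nets use the dual-path real/imaginary realization described next.

```python
import torch
import torch.nn as nn

class UnfoldedSARNet(nn.Module):
    """Skeleton of the deep-unfolded imaging net: L layers, each mapping one
    ISTA iteration, Eqs. (18)-(19), to a network stage. `G_op` and `M_op`
    stand for the CSA operator and its adjoint (Theorem 1) acting on complex
    tensors; full sampling (Psi_r = Psi_a = I) is assumed for brevity."""

    def __init__(self, G_op, M_op, n_layers=9):
        super().__init__()
        self.G, self.M = G_op, M_op
        # Layer-varied learnable step sizes mu^(l) and thresholds T^(l).
        self.mu = nn.Parameter(torch.full((n_layers,), 0.5))
        self.T = nn.Parameter(torch.full((n_layers,), 0.01))
        self.n_layers = n_layers

    @staticmethod
    def soft(x, t):
        """Complex soft threshold of Eq. (19)."""
        mag = x.abs()
        scale = torch.clamp(mag - t, min=0.0) / mag.clamp_min(1e-12)
        return scale * x

    def forward(self, S_d):
        Xi = torch.zeros_like(S_d)                           # Xi^(0) = 0
        for l in range(self.n_layers):
            R = Xi + self.mu[l] * self.M(S_d - self.G(Xi))   # Eq. (18)
            Xi = self.soft(R, self.T[l])                     # Eq. (19)
        return Xi

# Trivial unitary stand-ins for G and M (an adjoint pair), to exercise the net:
G = lambda X: torch.fft.fft2(X, norm="ortho")
M = lambda X: torch.fft.ifft2(X, norm="ortho")
net = UnfoldedSARNet(G, M)
Xi_hat = net(torch.randn(64, 64, dtype=torch.complex64))
```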
Then, we describe the internal structure of network modules. In terms of Equation (18), the linearity module R is expressed as
$$\mathcal{R}:\quad \mathbf{R}^{(l)} = \hat{\boldsymbol{\Xi}}^{(l-1)} + \mu^{(l)}\cdot\mathcal{M}\Big(\boldsymbol{\Psi}_r^T\big(\mathbf{S}_d - \boldsymbol{\Psi}_r\mathcal{G}(\hat{\boldsymbol{\Xi}}^{(l-1)})\boldsymbol{\Psi}_a\big)\boldsymbol{\Psi}_a^T\Big) \tag{20}$$
where $\mathbf{R}^{(l)}$ denotes the 2D linear reconstruction result, and $\mu^{(l)}$ and $T^{(l)}$ represent learnable parameters in the $l$th layer, whereas they must be manually tuned in the ISTA. This learnability resolves the challenge of hand-tuning the parameters a priori. Since the radar signal is complex-valued, we address the matrices in Equation (20) by separating them into real and imaginary parts, i.e., $\mathbf{R}^{(l)} = \mathrm{Re}(\mathbf{R}^{(l)}) + j\,\mathrm{Im}(\mathbf{R}^{(l)})$, where $\mathrm{Re}(\cdot)$ denotes the real part of a matrix, $\mathrm{Im}(\cdot)$ denotes the imaginary part, and $j$ denotes the imaginary unit. Hence, the operators $\mathcal{G}(\cdot)$ and $\mathcal{M}(\cdot)$ in module $\mathcal{R}$ are designed as a dual-path structure in which the real and imaginary parts of the radar signal are processed separately.
We use $\mathcal{G}(\cdot)$ as an example to introduce the dual-path structure in detail. The calculation process of $\mathcal{G}(\cdot)$ is divided into three steps to simplify the expressions, and the real and imaginary parts of each step are derived according to the rules of complex-valued arithmetic. The first step, $F_r\big[(\hat{\boldsymbol{\Xi}}^{(l-1)}\mathbf{F}_a)\circ\mathbf{H}_3^*\big]$ with respect to $\hat{\boldsymbol{\Xi}}^{(l-1)}$, denoted as the suboperator $F_r[\cdot]$, is given in Equation (21). Here, $Con(\mathbf{A}, \mathbf{B}) = [\mathbf{A}^T\ \mathbf{B}^T]^T$ represents the consolidation operator, which stacks the matrices $\mathbf{A}$ and $\mathbf{B}$ into one array, while $dCon_U([\mathbf{A}^T\ \mathbf{B}^T]^T) = \mathbf{A}$ and $dCon_L([\mathbf{A}^T\ \mathbf{B}^T]^T) = \mathbf{B}$ represent the inverse operations of $Con(\cdot,\cdot)$, which extract the upper and lower parts of the array, respectively.
The second step, $F_r^H\langle\cdot\rangle = \mathbf{F}_r^H\big(F_r[\cdot]\circ\mathbf{H}_2^*\big)$, built on the suboperator $F_r[\cdot]$, is expressed in Equation (22), and the third step, $\{\cdot\}\mathbf{F}_a^H = \big\{F_r^H\langle\cdot\rangle\circ\mathbf{H}_1^*\big\}\mathbf{F}_a^H$, namely $\mathcal{G}(\cdot)$ itself, is given in Equation (23). Since $\mathcal{M}(\cdot)$ is the inverse operation of $\mathcal{G}(\cdot)$, $\mathcal{M}(\cdot)$ has the same structure as $\mathcal{G}(\cdot)$ with different phase functions; its structure can be developed by swapping the input and output of $\mathcal{G}(\cdot)$. The topology of module $\mathcal{R}$ in the $l$th layer is illustrated in Figure 3 based on the above analysis. Note that all the matrix operations related to the FFT and downsampling in Figure 3 are left multiplications.
$$\begin{bmatrix} \mathrm{Re}\big(F_r[\cdot]\big) \\ \mathrm{Im}\big(F_r[\cdot]\big) \end{bmatrix} = \begin{bmatrix} \mathrm{Re}(\mathbf{F}_r) & -\mathrm{Im}(\mathbf{F}_r) \\ \mathrm{Im}(\mathbf{F}_r) & \mathrm{Re}(\mathbf{F}_r) \end{bmatrix} \cdot Con\Big( \mathrm{Re}\big(\hat{\boldsymbol{\Xi}}^{(l-1)}\mathbf{F}_a\big)\circ\mathrm{Re}(\mathbf{H}_3^*) - \mathrm{Im}\big(\hat{\boldsymbol{\Xi}}^{(l-1)}\mathbf{F}_a\big)\circ\mathrm{Im}(\mathbf{H}_3^*),\ \ \mathrm{Im}\big(\hat{\boldsymbol{\Xi}}^{(l-1)}\mathbf{F}_a\big)\circ\mathrm{Re}(\mathbf{H}_3^*) + \mathrm{Re}\big(\hat{\boldsymbol{\Xi}}^{(l-1)}\mathbf{F}_a\big)\circ\mathrm{Im}(\mathbf{H}_3^*) \Big) \tag{21}$$
$$\begin{bmatrix} \mathrm{Re}\big(F_r^H\langle\cdot\rangle\big) \\ \mathrm{Im}\big(F_r^H\langle\cdot\rangle\big) \end{bmatrix} = \begin{bmatrix} \mathrm{Re}(\mathbf{F}_r^H) & -\mathrm{Im}(\mathbf{F}_r^H) \\ \mathrm{Im}(\mathbf{F}_r^H) & \mathrm{Re}(\mathbf{F}_r^H) \end{bmatrix} \cdot Con\Big( \mathrm{Re}\big(F_r[\cdot]\big)\circ\mathrm{Re}(\mathbf{H}_2^*) - \mathrm{Im}\big(F_r[\cdot]\big)\circ\mathrm{Im}(\mathbf{H}_2^*),\ \ \mathrm{Im}\big(F_r[\cdot]\big)\circ\mathrm{Re}(\mathbf{H}_2^*) + \mathrm{Re}\big(F_r[\cdot]\big)\circ\mathrm{Im}(\mathbf{H}_2^*) \Big) \tag{22}$$
$$\begin{bmatrix} \mathrm{Re}\big(\mathcal{G}(\cdot)\big)^T \\ \mathrm{Im}\big(\mathcal{G}(\cdot)\big)^T \end{bmatrix} = \begin{bmatrix} \mathrm{Re}(\mathbf{F}_a^H) & -\mathrm{Im}(\mathbf{F}_a^H) \\ \mathrm{Im}(\mathbf{F}_a^H) & \mathrm{Re}(\mathbf{F}_a^H) \end{bmatrix}^T \cdot Con\Big( \big(\mathrm{Re}\big(F_r^H\langle\cdot\rangle\big)\circ\mathrm{Re}(\mathbf{H}_1^*) - \mathrm{Im}\big(F_r^H\langle\cdot\rangle\big)\circ\mathrm{Im}(\mathbf{H}_1^*)\big)^T,\ \ \big(\mathrm{Im}\big(F_r^H\langle\cdot\rangle\big)\circ\mathrm{Re}(\mathbf{H}_1^*) + \mathrm{Re}\big(F_r^H\langle\cdot\rangle\big)\circ\mathrm{Im}(\mathbf{H}_1^*)\big)^T \Big) \tag{23}$$
In each equation, the real and imaginary inputs of the current step are extracted from the stacked dual-path result of the preceding step via $dCon_U(\cdot)$ and $dCon_L(\cdot)$, so the whole chain operates on consolidated arrays without ever forming complex-valued matrices; the sign pattern follows the complex product rule $(a+jb)(c+jd) = (ac-bd) + j(bc+ad)$.
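The recurring building block in Equations (21)–(23) is a complex Hadamard product executed on stacked real/imaginary channels via $Con$/$dCon$. A minimal PyTorch sketch, with $Con$/$dCon$ realized as channel stacking and splitting, verified against native complex arithmetic:

```python
import torch

def con(A, B):
    """Con(A, B): consolidate two matrices by stacking along a new axis."""
    return torch.stack((A, B), dim=0)

def dcon_u(X):   # dCon_U: extract the upper (real) part
    return X[0]

def dcon_l(X):   # dCon_L: extract the lower (imaginary) part
    return X[1]

def dual_path_hadamard(X, H_re, H_im):
    """Dual-path complex Hadamard product X o H, following the sign pattern
    (a + jb)(c + jd) = (ac - bd) + j(bc + ad) used in Eqs. (21)-(23)."""
    a, b = dcon_u(X), dcon_l(X)
    return con(a * H_re - b * H_im, b * H_re + a * H_im)

# Verify against native complex arithmetic.
g = torch.Generator().manual_seed(0)
Xc = torch.randn(4, 4, generator=g) + 1j * torch.randn(4, 4, generator=g)
Hc = torch.exp(1j * torch.randn(4, 4, generator=g))
out = dual_path_hadamard(con(Xc.real, Xc.imag), Hc.real, Hc.imag)
assert torch.allclose(out[0] + 1j * out[1], Xc * Hc)
```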
Furthermore, if the imaging scene is sparse in the spatial domain, module $\mathcal{N}$ in Equation (19) can be combined with module $\mathcal{R}$ to establish a CSA-based 2D SAR imaging net (CSA-Net) that solves Equation (14). Conversely, if the imaging scene is nonsparse or weakly sparse, module $\mathcal{N}$ must be redesigned to sparsely represent the corresponding scene. To address the difficulty of SAR imaging in complex scenes, a novel 2D SAR imaging strategy that introduces sparse representation and transform sparsity into the network architecture is presented in the following subsections.

3.2. Deep SAR Imaging Net Based on Sparse Representation: SR-CSA-Net

In this subsection, a CNN-based module $\mathcal{N}$ with sparse representation is designed to solve the complex-scene SAR imaging problem in Equation (15); it is then incorporated with the CSA-based module $\mathcal{R}$ to construct a dual-path deep SAR imaging net referred to as SR-CSA-Net.
If we use traditional transform-domain methods to achieve sparse representation, the output of layer l can be calculated by the nonlinear function s o f t ( · ; · ) with the input of the linear reconstruction result R ( l ) and threshold parameter T ( l ) , which are expressed as follows:
$$\hat{\boldsymbol{\Xi}}_{SR}^{(l)} = \mathrm{soft}\big(\mathbf{Y}\mathbf{R}^{(l)}; T^{(l)}\big), \qquad \hat{\boldsymbol{\Xi}}^{(l)} = \mathbf{Y}^{+}\hat{\boldsymbol{\Xi}}_{SR}^{(l)} \tag{24}$$
where the transform matrix $\mathbf{Y}$ and its inverse $\mathbf{Y}^{+}$ are substitutable and nonoptimal. To sparsify the scattering coefficient matrix $\hat{\boldsymbol{\Xi}}$ and optimize the reconstruction performance, we develop a CNN-based nonlinear transform function $\mathcal{F}(\cdot)$ that learns an optimal sparse basis. This well-designed transform function is embodied as a CNN block and adopted in SR-CSA-Net instead of the handcrafted transform matrix $\mathbf{Y}$.
Considering the powerful fitting ability of a CNN, the convolution operators $\mathcal{C}_i(\cdot)$, the activation function $ReLU(\cdot)$, and the batch normalization (BN) operator $B(\cdot)$ are combined to design the nonlinear transform function $\mathcal{F}(\cdot)$, in which the two convolution operators $\mathcal{C}_1(\cdot)$ and $\mathcal{C}_2(\cdot)$ are separated by $B(\cdot)$ and $ReLU(\cdot)$ in sequence. Placing the BN operator before the activation function effectively prevents problems such as vanishing gradients, overfitting, and slow convergence [36]. The BN operator is expressed as
$$B(\mathbf{X}) = \frac{\mathbf{X} - \mathrm{E}[\mathbf{X}]}{\sqrt{\mathrm{Var}[\mathbf{X}] + \epsilon}}\cdot\gamma + \beta \tag{25}$$
where $\mathrm{E}[\cdot]$ denotes the mean function, $\mathrm{Var}[\cdot]$ represents the variance function, $\epsilon$ is a small constant that prevents numerical instability, and $\gamma$ and $\beta$ are learnable affine parameters. On this basis, the linear reconstruction result $\mathbf{R}^{(l)}$ in the $l$th layer can be sparsely represented by the well-designed transform function $\mathcal{F}(\cdot)$, which is formulated as
$$\mathcal{F}(\mathbf{R}^{(l)}) = \mathcal{C}_2\Big(ReLU\big(B\big(\mathcal{C}_1(\mathbf{R}^{(l)})\big)\big)\Big) \tag{26}$$
where the first convolution operator $\mathcal{C}_1(\cdot)$ corresponds to a set of $N_f$ filters of size $\omega_f\times\omega_f\times 1$, performing a channelwise conversion from a one-channel input to an $N_f$-channel output; the second convolution operator corresponds to another set of $N_f$ filters of size $\omega_f\times\omega_f\times N_f$; and the rectified linear unit (ReLU) function is denoted as $ReLU(\mathbf{X}) = \max(\mathbf{X}, 0)$. In summary, the CNN block $\mathcal{F}(\cdot)$ plays a key role in module $\mathcal{N}$ and has a rich sparse representation capability owing to its nonlinearity and learnability.
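A PyTorch sketch of the block in Equation (26) under the experimental settings of Section 4 ($N_f = 32$, $\omega_f = 3$); the zero-padding choice is an assumption made here to keep the spatial size fixed:

```python
import torch
import torch.nn as nn

class TransformBlockF(nn.Module):
    """Sketch of F(.) in Eq. (26): C2(ReLU(BN(C1(.)))). One real channel in,
    N_f feature channels out; applied separately to the real and imaginary
    paths of R^(l)."""

    def __init__(self, n_filters=32, kernel=3):
        super().__init__()
        self.c1 = nn.Conv2d(1, n_filters, kernel, padding=kernel // 2, bias=False)
        self.bn = nn.BatchNorm2d(n_filters)    # Eq. (25): learnable gamma, beta
        self.relu = nn.ReLU()
        self.c2 = nn.Conv2d(n_filters, n_filters, kernel, padding=kernel // 2, bias=False)

    def forward(self, x):                      # x: (batch, 1, P, Q)
        return self.c2(self.relu(self.bn(self.c1(x))))

F = TransformBlockF()
print(F(torch.randn(2, 1, 64, 64)).shape)      # torch.Size([2, 32, 64, 64])
```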
It has been indicated in [37] that $\|\mathcal{F}(\hat{\boldsymbol{\Xi}}^{(l)}) - \mathcal{F}(\mathbf{R}^{(l)})\|_F^2$ and $\|\hat{\boldsymbol{\Xi}}^{(l)} - \mathbf{R}^{(l)}\|_F^2$ satisfy a linear relationship under a reasonable assumption. Thus, $\hat{\boldsymbol{\Xi}}_{SR}^{(l)}$, the sparse representation of $\hat{\boldsymbol{\Xi}}^{(l)}$, is derived as $\hat{\boldsymbol{\Xi}}_{SR}^{(l)} = \mathcal{F}(\hat{\boldsymbol{\Xi}}^{(l)}) = \mathrm{soft}\big[\mathcal{F}(\mathbf{R}^{(l)}); T^{(l)}\big]$. Furthermore, motivated by the structure symmetry constraint, a mirror-symmetric structure with different weights, denoted by $\tilde{\mathcal{F}}(\cdot)$ and satisfying $\tilde{\mathcal{F}}(\mathcal{F}(\cdot)) = \mathcal{I}$, is designed to extract the unknown scattering coefficient matrix from the sparse representation result. Therefore, the following closed-form expression efficiently computes the unknown scattering coefficient matrix $\hat{\boldsymbol{\Xi}}^{(l)}$ in layer $l$:
$$\hat{\boldsymbol{\Xi}}^{(l)} = \tilde{\mathcal{F}}\big(\hat{\boldsymbol{\Xi}}_{SR}^{(l)}\big) = \tilde{\mathcal{F}}\Big(\mathrm{soft}\big[\mathcal{F}(\mathbf{R}^{(l)}); T^{(l)}\big]\Big) \tag{27}$$
where the symmetric CNN block is modeled as $\tilde{\mathcal{F}}(\hat{\boldsymbol{\Xi}}_{SR}^{(l)}) = \tilde{\mathcal{C}}_2\big(ReLU\big(\tilde{B}\big(\tilde{\mathcal{C}}_1(\hat{\boldsymbol{\Xi}}_{SR}^{(l)})\big)\big)\big)$. The nonlinearity module $\mathcal{N}$ is likewise divided into two paths to connect the real and imaginary parts output by the linearity module $\mathcal{R}$. Moreover, owing to the learnability of $\tilde{\mathcal{F}}(\cdot)$ and $\mathcal{F}(\cdot)$, each layer of SR-CSA-Net has its own CNN block, and thus module $\mathcal{N}$ in the $l$th layer is redesigned as
$$\mathcal{N}:\quad \begin{cases} \mathrm{Re}(\hat{\boldsymbol{\Xi}}^{(l)}) = \tilde{\mathcal{F}}^{(l)}\Big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathrm{Re}(\mathbf{R}^{(l)})\big); T^{(l)}\big]\Big) \\ \mathrm{Im}(\hat{\boldsymbol{\Xi}}^{(l)}) = \tilde{\mathcal{F}}^{(l)}\Big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathrm{Im}(\mathbf{R}^{(l)})\big); T^{(l)}\big]\Big). \end{cases} \tag{28}$$
The nonlinear reconstruction function $\mathrm{soft}(\cdot\,;\cdot)$ is determined by Equation (19). For the nonzero elements of a complex $x$, the sign function is calculated as $\mathrm{sign}(x) = x\,./\,|x|$. Therefore, when processing complex-valued signals, the function $\mathrm{soft}(\cdot\,;\cdot)$ in module $\mathcal{N}$ is redefined as a nonlinear function that includes a complex normalization operation. Taking the real and imaginary parts of $\mathbf{R}^{(l)}$ as examples, the function $\mathrm{soft}(\cdot\,;\cdot)$ is expressed as
$$\mathrm{soft}\big[\mathrm{Re}(\mathbf{R}^{(l)}); T^{(l)}\big] = \mathrm{sign}\big(\mathrm{Re}(\mathbf{R}^{(l)})\big)\circ\max\big(|\mathbf{R}^{(l)}| - T^{(l)},\, 0\big) = \big(\mathrm{Re}(\mathbf{R}^{(l)})\,./\,|\mathbf{R}^{(l)}|\big)\circ\max\big(|\mathbf{R}^{(l)}| - T^{(l)},\, 0\big) \tag{29}$$
$$\mathrm{soft}\big[\mathrm{Im}(\mathbf{R}^{(l)}); T^{(l)}\big] = \mathrm{sign}\big(\mathrm{Im}(\mathbf{R}^{(l)})\big)\circ\max\big(|\mathbf{R}^{(l)}| - T^{(l)},\, 0\big) = \big(\mathrm{Im}(\mathbf{R}^{(l)})\,./\,|\mathbf{R}^{(l)}|\big)\circ\max\big(|\mathbf{R}^{(l)}| - T^{(l)},\, 0\big) \tag{30}$$
where $\mathrm{Re}(\mathbf{R}^{(l)})./|\mathbf{R}^{(l)}|$ and $\mathrm{Im}(\mathbf{R}^{(l)})./|\mathbf{R}^{(l)}|$ represent the normalization operations of the real and imaginary parts, respectively, and $|\mathbf{R}^{(l)}|$ is the elementwise modulus computed from $\mathrm{Re}(\mathbf{R}^{(l)})$ and $\mathrm{Im}(\mathbf{R}^{(l)})$. Figure 4a illustrates the dual-path topology of module $\mathcal{N}$ in the $l$th layer; the monolayer topology of the proposed SR-CSA-Net is composed of modules $\mathcal{R}$ and $\mathcal{N}$.
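The coupled soft function of Equations (29) and (30), which shrinks the joint magnitude $|\mathbf{R}^{(l)}|$ and renormalizes each path, reduces to a few tensor operations; a sketch:

```python
import torch

def complex_soft(re, im, t):
    """soft(.;T) of Eqs. (29)-(30): shrink the joint magnitude |R| by t and
    redistribute the result to the two paths via phase normalization."""
    mag = torch.sqrt(re ** 2 + im ** 2).clamp_min(1e-12)   # |R^(l)|
    shrink = torch.clamp(mag - t, min=0.0)                 # max(|R| - T, 0)
    return re / mag * shrink, im / mag * shrink

re_out, im_out = complex_soft(torch.randn(2, 1, 8, 8), torch.randn(2, 1, 8, 8), 0.1)
```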

3.3. Enhanced Version: SR-CSA-Net-Plus

To accelerate convergence and obtain a further performance improvement in sparse reconstruction, we design an enhanced network named SR-CSA-Net-plus by improving module $\mathcal{N}$ of SR-CSA-Net. Suppose that $\hat{\boldsymbol{\Xi}}^{(l)} = \mathbf{R}^{(l)} + \mathbf{W}^{(l)} + \mathbf{N}_0^{(l)}$, where $\mathbf{W}^{(l)}$ denotes the high-frequency component missing from the linear reconstruction result $\mathbf{R}^{(l)}$, which can be recovered by a linear operator $\Pi(\cdot)$ from the closed-form solution in Equation (28), i.e., $\mathbf{W}^{(l)} = \Pi(\hat{\boldsymbol{\Xi}}^{(l)}) = \Pi\Big(\tilde{\mathcal{F}}^{(l)}\big(\mathrm{soft}\big[\mathcal{F}^{(l)}(\mathbf{R}^{(l)}); T^{(l)}\big]\big)\Big)$.
Specifically, the linear operator $\Pi(\cdot)$ consists of two convolution operators, represented by $\Pi(\cdot) = \mathcal{G}(\mathcal{D}(\cdot))$, where $\mathcal{D}(\cdot)$ corresponds to a set of $N_f$ filters of size $\omega_f\times\omega_f\times 1$, and $\mathcal{G}(\cdot)$ corresponds to one filter of size $\omega_f\times\omega_f\times N_f$. In this case, we redefine the sizes of the convolution operators in both CNN blocks to address the dimensionality mismatch: both $\mathcal{C}_i(\cdot)$ and $\tilde{\mathcal{C}}_i(\cdot)$ are defined as one set of $N_f$ filters of size $\omega_f\times\omega_f\times N_f$. On this basis, we introduce skip connections to develop an improved module $\mathcal{N}$, named module $\mathcal{N}_{plus}$, which is given in Equation (31); its topology is illustrated in Figure 4b.
The nonlinear reconstruction result of SR-CSA-Net-plus is updated by module $\mathcal{N}_{plus}$. Similar to module $\mathcal{N}$ in Equation (28), $\Pi(\cdot)$, $\mathcal{F}(\cdot)$, and $\tilde{\mathcal{F}}(\cdot)$ are also learnable in module $\mathcal{N}_{plus}$, and they are not constrained to be the same in each layer, namely, they are layer-varied.
$$\mathcal{N}_{plus}:\quad \begin{cases} \mathrm{Re}(\hat{\boldsymbol{\Xi}}^{(l)}) = \mathrm{Re}(\mathbf{R}^{(l)}) + \Pi^{(l)}\Big(\tilde{\mathcal{F}}^{(l)}\big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathrm{Re}(\mathbf{R}^{(l)})\big); T^{(l)}\big]\big)\Big) = \mathrm{Re}(\mathbf{R}^{(l)}) + \mathcal{G}^{(l)}\Big(\tilde{\mathcal{F}}^{(l)}\big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathcal{D}^{(l)}\big(\mathrm{Re}(\mathbf{R}^{(l)})\big)\big); T^{(l)}\big]\big)\Big) \\ \mathrm{Im}(\hat{\boldsymbol{\Xi}}^{(l)}) = \mathrm{Im}(\mathbf{R}^{(l)}) + \Pi^{(l)}\Big(\tilde{\mathcal{F}}^{(l)}\big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathrm{Im}(\mathbf{R}^{(l)})\big); T^{(l)}\big]\big)\Big) = \mathrm{Im}(\mathbf{R}^{(l)}) + \mathcal{G}^{(l)}\Big(\tilde{\mathcal{F}}^{(l)}\big(\mathrm{soft}\big[\mathcal{F}^{(l)}\big(\mathcal{D}^{(l)}\big(\mathrm{Im}(\mathbf{R}^{(l)})\big)\big); T^{(l)}\big]\big)\Big) \end{cases} \tag{31}$$
Note that in SR-CSA-Net-plus, $\mathcal{F}(\mathcal{D}(\cdot))$ and $\mathcal{G}(\tilde{\mathcal{F}}(\cdot))$ replace $\mathcal{F}(\cdot)$ and $\tilde{\mathcal{F}}(\cdot)$, respectively, of SR-CSA-Net, which is conducive to further exploring the sparsity of SAR images. The monolayer topology of the proposed SR-CSA-Net-plus comprises modules $\mathcal{R}$ and $\mathcal{N}_{plus}$.
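A hedged PyTorch sketch of one path of module $\mathcal{N}_{plus}$ in Equation (31). The channel counts follow the dimensions stated above ($\mathcal{D}$: 1 → $N_f$, $\mathcal{G}$: $N_f$ → 1, $\mathcal{F}$ and $\tilde{\mathcal{F}}$: $N_f$ → $N_f$); for simplicity, the joint-magnitude soft function of Equations (29) and (30) is replaced here by a per-path real soft threshold:

```python
import torch
import torch.nn as nn

class ModuleNPlus(nn.Module):
    """One path of module N_plus, Eq. (31): x + G(F~(soft(F(D(x)); T)))."""

    def __init__(self, n_filters=32, kernel=3):
        super().__init__()
        p = kernel // 2

        def cnn_block():   # conv-BN-ReLU-conv, as in F(.) and F~(.)
            return nn.Sequential(
                nn.Conv2d(n_filters, n_filters, kernel, padding=p, bias=False),
                nn.BatchNorm2d(n_filters), nn.ReLU(),
                nn.Conv2d(n_filters, n_filters, kernel, padding=p, bias=False))

        self.D = nn.Conv2d(1, n_filters, kernel, padding=p, bias=False)
        self.F = cnn_block()
        self.F_tilde = cnn_block()
        self.G = nn.Conv2d(n_filters, 1, kernel, padding=p, bias=False)
        self.T = nn.Parameter(torch.tensor(0.01))   # learnable threshold T^(l)

    def forward(self, x):                       # x: one path of R^(l)
        z = self.F(self.D(x))
        z = torch.sign(z) * torch.clamp(z.abs() - self.T, min=0.0)  # soft(.;T)
        return x + self.G(self.F_tilde(z))      # skip connection

block = ModuleNPlus()
print(block(torch.randn(2, 1, 64, 64)).shape)   # torch.Size([2, 1, 64, 64])
```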

3.4. Network Analysis

Structural analysis: SR-CSA-Net and SR-CSA-Net-plus mainly consist of a linearity module $\mathcal{R}$ and a nonlinearity module $\mathcal{N}$ or $\mathcal{N}_{plus}$. The nonlinearity module of each layer is composed of convolution operators, BN operators, and nonlinear activation functions. Therefore, the framework of our proposed imaging nets is a mixture of linear reconstruction, convolution, batch normalization, and nonlinear activation. Compared with conventional CNN-based or deep-unfolding-based methods, our proposed imaging nets differ markedly in network structure and operation mechanism: they embed the approximated observation model into a linear reconstruction architecture and convert the sparse transform function into a CNN block. Moreover, the distinct superiority of SR-CSA-Net-plus lies in its more prominent sparse representation capability and the addition of skip connections. The skip connection structure facilitates network convergence and has been widely applied in ResNet [38] and DenseNet [39].
Learnable parameters: In our proposed imaging nets, the parameters of different layers are nonshared. $\Gamma_{\mathrm{net}}$ is the parameter set of SR-CSA-Net, which has three types of parameters: the step size $\mu^{(l)}$, the soft threshold $T^{(l)}$, and the CNN blocks $\mathcal{F}^{(l)}$ and $\tilde{\mathcal{F}}^{(l)}$. Hence, $\Gamma_{\mathrm{net}}$ is expressed as $\Gamma_{\mathrm{net}} = \{\mu^{(l)}, T^{(l)}, \mathcal{F}^{(l)}, \tilde{\mathcal{F}}^{(l)}\}_{l=1}^{L}$, where $L$ is the total number of network layers, and $\mathcal{F}^{(l)}$ and $\tilde{\mathcal{F}}^{(l)}$ include the parameters $\gamma$, $\beta$, $N_f$, and $\omega_f$. Similar to SR-CSA-Net, each layer of SR-CSA-Net-plus also has its own trainable network parameters, with the parameter set $\Gamma_{\mathrm{net}}^{+} = \{\mu^{(l)}, T^{(l)}, \mathcal{D}^{(l)}, \mathcal{G}^{(l)}, \mathcal{F}^{(l)}, \tilde{\mathcal{F}}^{(l)}\}_{l=1}^{L}$. In this paper, we adopt supervised training to learn all parameters and use the minibatch gradient descent (MBGD) algorithm [36] in combination with the backpropagation algorithm [40] to update the gradients.
Complexity analysis: All the above parameters in $\Gamma_{\mathrm{net}}$ and $\Gamma_{\mathrm{net}}^{+}$ are real-valued. The dimensionality of $\mu^{(l)}$ and $T^{(l)}$ is 1, and the dimensionality of $\mathcal{D}^{(l)}$, $\mathcal{G}^{(l)}$, $\mathcal{F}^{(l)}$, and $\tilde{\mathcal{F}}^{(l)}$ is determined by $N_f$ and $\omega_f$. For SR-CSA-Net, the numbers of network parameters in $\mathcal{F}^{(l)}$ and $\tilde{\mathcal{F}}^{(l)}$ are the same, equal to $1\times\omega_f\times\omega_f\times N_f + N_f\times(2+\omega_f\times\omega_f)\times N_f$. The total number of network parameters in SR-CSA-Net is therefore $O(\Gamma_{\mathrm{net}}) = L\times\big\{2 + 2\times\big[N_f\omega_f^2 + N_f^2(2+\omega_f^2)\big]\big\}$. Similarly, for SR-CSA-Net-plus, the numbers of parameters in $\mathcal{D}^{(l)}$, $\mathcal{G}^{(l)}$, $\mathcal{F}^{(l)}$, and $\tilde{\mathcal{F}}^{(l)}$ are $N_f\times\omega_f\times\omega_f\times 1$, $1\times\omega_f\times\omega_f\times N_f$, $N_f\times(2+2\omega_f^2)\times N_f$, and $N_f\times(2+2\omega_f^2)\times N_f$, respectively. Therefore, the total number of network parameters in SR-CSA-Net-plus is $O(\Gamma_{\mathrm{net}}^{+}) = L\times\big\{2 + 2\times\big[N_f\omega_f^2 + N_f^2(2+2\omega_f^2)\big]\big\}$.
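These closed-form counts are easy to tabulate; a small helper implementing the formulas above, evaluated at the experimental settings $L = 9$, $N_f = 32$, $\omega_f = 3$:

```python
def params_sr_csa_net(L=9, n_f=32, w_f=3):
    """O(Gamma_net) = L * {2 + 2*[N_f*w_f^2 + N_f^2*(2 + w_f^2)]}."""
    return L * (2 + 2 * (n_f * w_f**2 + n_f**2 * (2 + w_f**2)))

def params_sr_csa_net_plus(L=9, n_f=32, w_f=3):
    """O(Gamma_net+) = L * {2 + 2*[N_f*w_f^2 + N_f^2*(2 + 2*w_f^2)]}."""
    return L * (2 + 2 * (n_f * w_f**2 + n_f**2 * (2 + 2 * w_f**2)))

print(params_sr_csa_net(), params_sr_csa_net_plus())  # 207954 373842
```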
For recent, state-of-the-art SAR imaging nets, namely ISTA-Net [24], RMIST-Net [25], TISTA [41], C-TISTA [42], LAMP [43], AF-AMPnet [30], and RDA-Net [31], the numbers of parameters are $L\big((PQ)^2 + MN\times PQ + 1\big)$, $2L$, $L+2$, $4L$, $L(MN\times PQ + 2)$, $2L+2$, and $L(2MN + M^2 + N^2 + 2)$, respectively. Since $N_f$ and $\omega_f$ are generally much smaller than $MN$ and $PQ$, the number of parameters in our two proposed imaging nets is acceptable and moderate. According to the above analysis, SR-CSA-Net and SR-CSA-Net-plus achieve a tradeoff between reconstruction quality and computational complexity compared with existing SAR imaging nets.
Loss function: The training procedure of the proposed imaging nets can be viewed as updating the learnable parameter set $\Gamma_{\mathrm{net}}$ or $\Gamma_{\mathrm{net}}^{+}$ by minimizing the loss function. Instead of using a general loss function such as the averaged normalized mean square error (NMSE) [44], we adopt a purpose-built loss function in which both the reconstruction error and the symmetry constraint of the CNN blocks are considered. More specifically, the structure symmetry constraint $\tilde{\mathcal{F}}(\mathcal{F}(\cdot)) = \mathcal{I}$ is enforced by the loss function while the discrepancy between the label and the final reconstruction result of layer $L$ is reduced.
The total loss functions for SR-CSA-Net and SR-CSA-Net-plus are denoted as L total and L total + , respectively, which are defined as follows:
$$\mathcal{L}_{\mathrm{total}}(\Gamma_{\mathrm{net}}) = \alpha_1\mathcal{L}_1 + \alpha_2\mathcal{L}_2 \tag{32a}$$
$$\mathcal{L}_{\mathrm{total}}^{+}(\Gamma_{\mathrm{net}}^{+}) = \alpha_1\mathcal{L}_1 + \alpha_3\mathcal{L}_3 \tag{32b}$$
$$\mathcal{L}_1 = \frac{1}{2J}\sum_{j=1}^{J}\big\|\hat{\boldsymbol{\Xi}}_j^{(L)} - \mathbf{D}_j\big\|_F^2 \tag{32c}$$
$$\mathcal{L}_2 = \frac{1}{2J}\sum_{j=1}^{J}\sum_{l=1}^{L}\Big\|\tilde{\mathcal{F}}^{(l)}\Big(\mathcal{F}^{(l)}\big(\mathrm{Re}(\mathbf{R}_j^{(l)})\big)\Big) - \mathrm{Re}(\mathbf{R}_j^{(l)})\Big\|_F^2 + \frac{1}{2J}\sum_{j=1}^{J}\sum_{l=1}^{L}\Big\|\tilde{\mathcal{F}}^{(l)}\Big(\mathcal{F}^{(l)}\big(\mathrm{Im}(\mathbf{R}_j^{(l)})\big)\Big) - \mathrm{Im}(\mathbf{R}_j^{(l)})\Big\|_F^2 \tag{32d}$$
$$\mathcal{L}_3 = \frac{1}{2J}\sum_{j=1}^{J}\sum_{l=1}^{L}\Big\|\tilde{\mathcal{F}}^{(l)}\Big(\mathcal{F}^{(l)}\big(\mathcal{D}^{(l)}\big(\mathrm{Re}(\mathbf{R}_j^{(l)})\big)\big)\Big) - \mathcal{D}^{(l)}\big(\mathrm{Re}(\mathbf{R}_j^{(l)})\big)\Big\|_F^2 + \frac{1}{2J}\sum_{j=1}^{J}\sum_{l=1}^{L}\Big\|\tilde{\mathcal{F}}^{(l)}\Big(\mathcal{F}^{(l)}\big(\mathcal{D}^{(l)}\big(\mathrm{Im}(\mathbf{R}_j^{(l)})\big)\big)\Big) - \mathcal{D}^{(l)}\big(\mathrm{Im}(\mathbf{R}_j^{(l)})\big)\Big\|_F^2 \tag{32e}$$
where $\mathcal{L}_1$ is the average Euclidean distance loss function, $\mathcal{L}_2$ and $\mathcal{L}_3$ are the symmetry-constrained loss functions, $J$ is the total number of training samples, $\hat{\boldsymbol{\Xi}}_j^{(L)}$ represents the network output corresponding to the $j$th training sample, and $\mathbf{D}_j$ indicates the label corresponding to the $j$th training sample. In addition, $\alpha_1$, $\alpha_2$, and $\alpha_3$ are adjustable parameters by which the compromise between $\mathcal{L}_1$ and $\mathcal{L}_2$ or between $\mathcal{L}_1$ and $\mathcal{L}_3$ can be controlled. In this paper, they were set to 1, 0.1, and 0.1, respectively.
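A sketch of the SR-CSA-Net loss in Equations (32a), (32c), and (32d), assuming the network exposes the per-layer linear results $\mathbf{R}_j^{(l)}$ (as real/imaginary pairs) and the per-layer CNN blocks; these access points are assumptions about the implementation, not the paper's actual interfaces:

```python
import torch

def total_loss(Xi_hat, D, R_pairs, F_list, Ft_list, a1=1.0, a2=0.1):
    """L_total = a1*L1 + a2*L2, Eqs. (32a), (32c), (32d), for one minibatch.
    Xi_hat, D: (B, 1, P, Q) real tensors (network output and label).
    R_pairs: list of per-layer (Re(R^(l)), Im(R^(l))) tensors.
    F_list, Ft_list: the layer-varied CNN blocks F^(l) and F~^(l)."""
    L1 = 0.5 * (Xi_hat - D).pow(2).flatten(1).sum(1).mean()
    L2 = Xi_hat.new_zeros(())
    for (R_re, R_im), F, Ft in zip(R_pairs, F_list, Ft_list):
        for R in (R_re, R_im):   # symmetry constraint: F~(F(R)) should equal R
            L2 = L2 + 0.5 * (Ft(F(R)) - R).pow(2).flatten(1).sum(1).mean()
    return a1 * L1 + a2 * L2
```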

4. Experimental Results and Analyses

In this section, the superiority of the proposed SR-CSA-Net and SR-CSA-Net-plus is demonstrated on simulated data and on measured data from the RADARSAT-1 satellite. Table 1 lists the main parameters of the simulation and of RADARSAT-1. These system parameters determine the phase functions $\mathbf{H}_x$ and the FFT/IFFT matrices $\mathbf{F}_x$.
In the simulated experiments, we adopted the nonsparse images in [37] as labels $\{\mathbf{D}_j\}_{j=1}^{J}$ with a size of 256 × 256. A sufficient number of simulated echoes $\{\mathbf{S}_j\}_{j=1}^{J}$ was generated from the labels with known geometry; 8000 were employed for training and 1000 for testing. With this strategy, the training set contained 8000 pairs $\{\mathbf{D}_j, \mathbf{S}_j\}$, with 20 dB AWGN added to each $\mathbf{S}_j$. Some nonsparse labels in the training set are shown in Figure 5. In the measured experiments, a portion of the original RADARSAT-1 data containing nonsparse scenes was applied to further investigate the proposed imaging strategy. The training set was formulated from 1000 slices of the original echo data, where each slice was of size 512 × 512 and 20 dB AWGN was added. In particular, we adopted the traditional CSA to generate the labels of the measured data, in which the sidelobes were further suppressed by feature enhancement. The following three cases were considered to assess the effectiveness of our proposed imaging nets.
(1) Case I: Simulated experiments of nonsparse scenes.
(2) Case II: Simulated experiments of real scenes.
(3) Case III: Measured experiments.
The fixed mutual parameters were set as follows: the number of filters was $N_f = 32$, the filter size was $\omega_f \times \omega_f = 3 \times 3$, the learning rate was $1 \times 10^{-4}$, the default number of layers was $L = 9$, and the default number of epochs was 101. All our experiments were implemented in the PyTorch framework with the Adam optimizer and accelerated by an NVIDIA Tesla V100 GPU.

4.1. Simulated Experiments of Nonsparse Scenes

Here, we focused on a nonsparse imaging scene in which the discretized grids $P \times Q$ were fixed to 256 × 256. In this section, five algorithms, including the MF (CSA), traditional CS without sparse representation, DCT-based CS (DCT-CS) [17], mixed-sparse-representation-based CS (MSR-CS) [18], and CSA-Net, were adopted as comparative experiments. CSA-Net is a simplified SR-CSA-Net without CNN blocks, which can be regarded as a 2D version of ISTA-Net [24] improved by the CSA operator $\mathcal{G}(\cdot)$. In addition, four evaluation indices, namely the NMSE, peak signal-to-noise ratio (PSNR) [45], structural similarity index measure (SSIM) [45], and mean computing time, were applied to evaluate the reconstruction performance of the different algorithms. The NMSE is defined as
$$\mathrm{NMSE} = \frac{\big\|\hat{\mathbf{X}} - \mathbf{X}_{\mathrm{label}}\big\|_F^2}{\big\|\mathbf{X}_{\mathrm{label}}\big\|_F^2} \tag{33}$$
where X ^ represents the reconstruction result and X label denotes the label image. The definition of PSNR is
$$\mathrm{PSNR} = 10\cdot\log_{10}\frac{K_{\hat{X}}^2}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{PQ}\sum_{p=0}^{P-1}\sum_{q=0}^{Q-1}\big[\hat{\mathbf{X}}(p,q) - \mathbf{X}_{\mathrm{label}}(p,q)\big]^2 \tag{34}$$
where MSE is the mean square error between $\hat{\mathbf{X}}$ and $\mathbf{X}_{\mathrm{label}}$, and $K_{\hat{X}}$ denotes the pixel value range of $\hat{\mathbf{X}}$. In addition, the SSIM is defined as
$$\mathrm{SSIM} = \frac{\big(2\,\mathrm{E}[\hat{\mathbf{X}}]\,\mathrm{E}[\mathbf{X}_{\mathrm{label}}] + \alpha_1\big)\big(2\,\mathrm{Cov}[\hat{\mathbf{X}}, \mathbf{X}_{\mathrm{label}}] + \alpha_2\big)}{\big(\mathrm{E}[\hat{\mathbf{X}}]^2 + \mathrm{E}[\mathbf{X}_{\mathrm{label}}]^2 + \alpha_1\big)\big(\mathrm{Var}[\hat{\mathbf{X}}] + \mathrm{Var}[\mathbf{X}_{\mathrm{label}}] + \alpha_2\big)} \tag{35}$$
where $\mathrm{Cov}[\cdot,\cdot]$ represents the cross-covariance function, and $\alpha_1$ and $\alpha_2$ were set to $(0.01\cdot K_{\hat{X}})^2$ and $(0.03\cdot K_{\hat{X}})^2$, respectively.
According to the definitions in Equations (33)–(35), it can be concluded that the NMSE describes the accuracy of the reconstruction results, the PSNR reflects the distortion of images after reconstruction, and the SSIM indicates the similarity of the reconstruction result and ground truth, where a larger SSIM value means better performance.
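NumPy implementations of the three indices follow directly from Equations (33)–(35); the single-window SSIM here matches the global statistics used in Equation (35):

```python
import numpy as np

def nmse(x_hat, x_label):
    """Eq. (33): normalized mean square error."""
    return np.linalg.norm(x_hat - x_label) ** 2 / np.linalg.norm(x_label) ** 2

def psnr(x_hat, x_label):
    """Eq. (34): peak signal-to-noise ratio in dB."""
    k = x_hat.max() - x_hat.min()              # pixel value range K
    mse = np.mean((x_hat - x_label) ** 2)
    return 10 * np.log10(k ** 2 / mse)

def ssim(x_hat, x_label):
    """Eq. (35): global (single-window) structural similarity index."""
    k = x_hat.max() - x_hat.min()
    a1, a2 = (0.01 * k) ** 2, (0.03 * k) ** 2
    mu1, mu2 = x_hat.mean(), x_label.mean()
    var1, var2 = x_hat.var(), x_label.var()
    cov = np.mean((x_hat - mu1) * (x_label - mu2))
    return ((2 * mu1 * mu2 + a1) * (2 * cov + a2)) / \
           ((mu1**2 + mu2**2 + a1) * (var1 + var2 + a2))
```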
(1) Results with complete sampling: As illustrated in Figure 6, the reconstruction results of the proposed SR-CSA-Net and SR-CSA-Net-plus were compared with those of the above five algorithms under complete sampling. Figure 6 verifies that the reconstruction results of SR-CSA-Net and SR-CSA-Net-plus have more details and sharper edges than those of the other methods, showing the superiority of the proposed imaging nets.
Table 2 lists the three evaluation indices of the imaging results in Figure 6, obtained from the average of 100 Monte Carlo experiments, with the optimal evaluation values in bold font. Although the traditional MF, the CS-driven methods, and CSA-Net without CNN blocks obtain acceptable imaging results, the proposed SR-CSA-Net and SR-CSA-Net-plus outperform all of them by a large margin. As expected, SR-CSA-Net-plus has the best performance and is superior to SR-CSA-Net based on the evaluation values.
(2) Results with different sampling rates: We then examined the influence of the sampling rate, denoted as $\eta = M'N'/MN$, on the reconstruction performance. We adopted the same training set to ensure a fair comparison, and different sampling rates, namely $\eta = 81\%$, $64\%$, and $36\%$, were applied to the above seven SAR imaging methods to further investigate the superiority of our SR-CSA-Net and SR-CSA-Net-plus. The imaging results of the MF, CS-driven, and deep-unfolding-based methods under downsampling are shown in Figure 7 from left to right.
In Figure 7, the ambiguity phenomenon appears in the reconstruction results of CSA and CSA-Net because neither the conventional MF nor the deep unfolding method without sparse representation can address the downsampled echo of a nonsparse scene. In contrast, the ambiguity phenomenon is eliminated in the reconstruction results of the CS-driven methods and our proposed imaging nets because those methods are effective for the downsampled echo of a nonsparse scene. While CS without sparse representation may alleviate noise and sidelobe interference to some extent, its ambiguity phenomenon is also severe and grows in severity as the sampling rate decreases. The reconstruction results of the sparse-representation-based CS methods, including DCT-CS and MSR-CS, are acceptable when $\eta = 81\%$ and $\eta = 64\%$ but seriously deteriorate when $\eta = 36\%$. In contrast, the proposed SAR imaging nets achieve satisfactory reconstruction results from complete sampling down to 36% downsampling.
The PSNR, NMSE, and SSIM values for the various $\eta$ and the mean computing times of the algorithms are listed in Table 3. The table shows that the imaging quality declines for all methods as the sampling rate decreases. The evaluation values of CSA, CS, and CSA-Net decrease significantly under downsampling. For DCT-CS and MSR-CS, the evaluation indices do not decline significantly when $\eta \ge 64\%$ but deteriorate markedly when $\eta = 36\%$. Although the evaluation indices of our proposed imaging nets also decline with a reduction in the sampling rate, the proposed SR-CSA-Net-plus obtains the best NMSE, PSNR, and SSIM values in each case, followed by SR-CSA-Net.
The superior reconstruction performance of the proposed SR-CSA-Net and SR-CSA-Net-plus is demonstrated by the PSNR and NMSE values. Furthermore, the SSIM values and the mean computing time indicate that the proposed imaging nets have outstanding target enhancement performance and reconstruction efficiency. For different sampling rates, the echo matrix is padded to a fixed size, so there is almost no difference in imaging time. The three deep-unfolding-based imaging nets have comparable imaging times, with CSA-Net being the fastest of the three. Owing to module $\mathcal{N}$ and module $\mathcal{N}_{plus}$, the parameter sets of SR-CSA-Net and SR-CSA-Net-plus include learnable parameters related to sparse representation, which results in slightly longer imaging times for the proposed imaging nets than for CSA-Net.
(3) Ablation study: To further evaluate the proposed strategy, an ablation study of SR-CSA-Net-plus with different numbers of layers, epochs, and network modules was performed. The PSNR comparison results of the different imaging methods are presented in Figure 8 and Table 4, where SR-CSA-Net-plus without skip connections (SCs) and Π(·) stands for SR-CSA-Net, and SR-CSA-Net-plus without SCs, Π(·), F(·), and F̃(·) refers to CSA-Net.
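To make these ablated configurations concrete, the following PyTorch sketch outlines one nonlinearity module with the corresponding switches. The channel width, kernel sizes, and the exact form of Π(·) are illustrative assumptions rather than the trained architecture, and complex-valued echoes would additionally require real/imaginary channel handling.

```python
import torch
import torch.nn as nn

def conv_block(ch):
    # 3x3 conv + BN + ReLU (the BN operator is examined separately in Table 5)
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                         nn.BatchNorm2d(ch), nn.ReLU())

class NonlinearityModule(nn.Module):
    """Sketch of one layer's nonlinearity: x -> F~(soft(F(x), theta)).

    The switches mirror the ablation of Table 4: use_sc = use_pi = False gives
    an SR-CSA-Net-style module; both True gives an SR-CSA-Net-plus-style module.
    """
    def __init__(self, ch=32, use_sc=True, use_pi=True):
        super().__init__()
        self.head = nn.Conv2d(1, ch, 3, padding=1)     # lift image to feature space
        self.F = conv_block(ch)                        # sparse transform F(.)
        self.Ft = conv_block(ch)                       # learned inverse F~(.)
        self.tail = nn.Conv2d(ch, 1, 3, padding=1)     # project back to image space
        self.pi = conv_block(ch) if use_pi else None   # high-frequency recovery Pi(.)
        self.theta = nn.Parameter(torch.tensor(0.01))  # layer-varied soft threshold
        self.use_sc = use_sc

    def forward(self, x):
        z = self.F(self.head(x))
        z = torch.sign(z) * torch.relu(z.abs() - self.theta)  # soft thresholding
        if self.pi is not None:
            z = z + self.pi(z)                   # recover high-frequency components
        out = self.tail(self.Ft(z))
        return out + x if self.use_sc else out   # optional skip connection (SC)
```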
As shown in Figure 8, the PSNR curves of the proposed imaging nets increase with the number of layers or epochs and then converge. SR-CSA-Net-plus significantly outperforms SR-CSA-Net, CSA-Net, and the conventional methods in terms of reconstruction performance. When the SCs and Π(·) are removed, the results are still acceptable; however, removing the sparse representation blocks F(·) and F̃(·) causes a significant performance degradation. The PSNR curves converge when L ≥ 7, and Table 4 indicates that under this condition, SR-CSA-Net-plus achieves approximately 3 dB and 12 dB gains over SR-CSA-Net and CSA-Net, respectively. Furthermore, CSA-Net converges fastest over the training epochs, while SR-CSA-Net-plus is the second fastest and achieves the best reconstruction performance among the three deep-unfolding-based imaging nets.
The BN operator in our proposed imaging nets was designed to accelerate convergence and is also critical to network performance. To isolate its influence, the two proposed imaging nets were compared with variants without BN operators. Figure 9 and Table 5 show the NMSE comparison of the four resulting imaging nets for various numbers of layers and epochs, where w/o abbreviates "without".
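The w/o-BN variants in this comparison simply omit the normalization stage from every convolutional block. In the notation of the sketch above, the switch might look as follows (again an illustrative sketch, not the exact training code).

```python
import torch.nn as nn

def conv_block(ch, use_bn=True):
    # Each CNN stage is conv + (optional) BN + ReLU; Table 5 toggles use_bn
    layers = [nn.Conv2d(ch, ch, 3, padding=1)]
    if use_bn:
        layers.append(nn.BatchNorm2d(ch))
    layers.append(nn.ReLU())
    return nn.Sequential(*layers)
```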
As illustrated in Figure 9 and Table 5, the NMSE values of SR-CSA-Net and SR-CSA-Net-plus are lower than those of their BN-free counterparts at every layer and epoch. The performance of SR-CSA-Net-plus is not appreciably affected by removing the BN operators, whereas their benefit is more apparent for SR-CSA-Net. Figure 9a shows that the NMSE curves of SR-CSA-Net-plus (two red curves) converge and flatten when L ≥ 7, while those of SR-CSA-Net (two blue curves) converge when L ≥ 9. Figure 9b shows that SR-CSA-Net-plus with BN operators converges fastest and reaches a stable performance after sixty epochs. Hence, the BN operator improves both the convergence speed and the reconstruction accuracy of the imaging nets. Nevertheless, Table 5 shows that SR-CSA-Net-plus without BN operators still outperforms SR-CSA-Net with BN operators.
In conclusion, F(·) and F̃(·) are notably beneficial for the high-quality reconstruction of nonsparse SAR scenes. The superiority of SR-CSA-Net-plus over SR-CSA-Net is attributable to the high-frequency component recovery operator Π(·) and the SCs introduced by module N_plus, which improve the sparse representation ability and reconstruction performance. For both networks, the BN operator in module N and module N_plus facilitates the convergence of network training while improving the reconstruction accuracy.

4.2. Simulated Experiments of Real Scenes

The simulated experiments in Section 4.1 demonstrated the effectiveness and feasibility of SR-CSA-Net and SR-CSA-Net-plus for nonsparse SAR reconstruction in comparison with traditional MF and CS-driven methods. In this subsection, we further compare them with state-of-the-art deep unfolding methods, RMIST-Net [25] and RDA-Net [31], which also reduce the storage burden and are designed by unfolding ISTA. To demonstrate the universality of the proposed imaging nets, we chose three weakly sparse real scenes from the open SAR ship detection dataset (SSDD) [46] to generate SAR echoes, which were fed directly into the imaging nets trained in Section 4.1, avoiding unnecessary training and computation. The three real scenes, with indices 231, 1080, and 1088 in the SSDD, were cropped to 256 × 256 pixels, and their sparsity degrees satisfied scene 3 > scene 1 > scene 2. The radar system parameters were the same as in the previous section, and the fixed mutual parameters of all networks were set identically for a fair comparison. Figure 10 shows the imaging results of the three real scenes obtained by RMIST-Net, RDA-Net, SR-CSA-Net, and SR-CSA-Net-plus when downsampling with η = 81% and η = 64%; the corresponding evaluation indices are listed in Table 6, with the best values in boldface.
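The test echoes were produced by running the imaging chain backwards, i.e., by applying the CSA-derived approximated observation to each cropped scene. The sketch below shows that structure; the three phase functions phi1, phi2, and phi3 are placeholders that would be precomputed from the radar parameters in Table 1 (their closed forms follow the standard chirp-scaling derivation [3] and are omitted here), and the reversal order is our reading of the CSA chain rather than the exact implementation.

```python
import numpy as np

def csa_inverse(img, phi1, phi2, phi3):
    """Approximated observation G(X): run the chirp-scaling imaging chain
    backwards to synthesize an echo from a complex scene reflectivity.

    img  : complex scene (azimuth x range)
    phi1 : chirp-scaling phase (range-Doppler domain)
    phi2 : range compression / bulk RCMC phase (2D frequency domain)
    phi3 : azimuth compression / residual phase (range-Doppler domain)
    All three are complex matrices of the same shape as img, assumed
    precomputed from the system parameters.
    """
    a = np.fft.fft(img, axis=0)    # to range-Doppler domain
    a = a * np.conj(phi3)          # undo azimuth compression
    a = np.fft.fft(a, axis=1)      # to 2D frequency domain
    a = a * np.conj(phi2)          # undo range compression + bulk RCMC
    a = np.fft.ifft(a, axis=1)     # back to range-Doppler domain
    a = a * np.conj(phi1)          # undo chirp scaling
    return np.fft.ifft(a, axis=0)  # back to the echo domain
```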
Figure 10 and Table 6 reflect the superiority and robustness of the proposed imaging nets for real SAR scene reconstruction under downsampling. Although the reconstruction quality of all the imaging nets weakens with decreasing scene sparsity and sampling rate, SR-CSA-Net-plus performs best in all three scenes under both downsampling schemes, followed by SR-CSA-Net. For the two comparative imaging nets, the results of RDA-Net and RMIST-Net are acceptable when η = 81%, but those of RMIST-Net exhibit streak artifacts when η = 64%. Furthermore, RDA-Net performs better than RMIST-Net because the iterative parameters and compensation matrices in RDA-Net are learned, whereas RMIST-Net learns only the ISTA parameters and predefines its RM kernel. Consequently, RDA-Net has the largest number of learnable parameters and RMIST-Net the smallest, while the number of parameters in our methods is moderate. Therefore, the proposed imaging nets achieve a compromise between reconstruction performance and computing speed compared with RDA-Net and RMIST-Net.

4.3. Measured Experiments

The previous simulated experiments validated that SR-CSA-Net and SR-CSA-Net-plus are superior to conventional CS-driven and deep-unfolding-based algorithms in both reconstruction quality and efficiency. In this subsection, experimental results based on the measured data of RADARSAT-1 are given to investigate whether the proposed imaging strategy still performs well on measured data. Since SR-CSA-Net-plus performed better, we chose it to further verify the proposed strategy and retrained it in the manner introduced at the beginning of Section 4. SR-CSA-Net-plus was tested against the four above-mentioned methods: CSA, MSR-CS, RMIST-Net, and RDA-Net. In addition, an AMP-unfolded 2D SAR imaging method inspired by [30], called AMP-Net, was introduced for comparison with the ISTA-unfolded SAR imaging methods. The imaging results with η = 81% and η = 64% for a sparse scene (a harbor) and a nonsparse scene (a seashore) are shown in Figure 11 and Figure 12, respectively. The image entropy (ENT) and the previously mentioned evaluation values are listed in Table 7.
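Since the exact definition of ENT is not restated here, the sketch below uses the common intensity-normalized form of SAR image entropy, where lower values indicate a sharper, more concentrated image; this form is our assumption.

```python
import numpy as np

def image_entropy(img, eps=1e-12):
    # Normalize the image intensity to a probability map p, then take
    # -sum(p * log(p)); eps guards against log(0) for empty pixels.
    intensity = np.abs(img) ** 2
    p = intensity / (intensity.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))
```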
As shown in Figure 11 and Figure 12 and Table 7, the four networks achieve comparable reconstruction results in the sparse scene, but the proposed strategy has clear advantages in the nonsparse scene. Specifically, SR-CSA-Net-plus achieves approximately 3∼8 dB PSNR gains over RDA-Net, AMP-Net, and RMIST-Net and reconstructs more details and sharper edges. Its NMSE, PSNR, and SSIM values are optimal among the six algorithms, consistent with the simulated experiments, while its ENT ranks third. For nonsparse scenes, however, ENT alone cannot evaluate reconstruction performance. Although AMP-Net and RMIST-Net obtain smaller ENT values for both the harbor and the seashore, their lack of sparse representation ability causes them to lose much information during reconstruction, such as the structure, edges, and smooth components of the scattered fields; this loss is especially pronounced in nonsparse scenes. In addition, since AMP is superior to ISTA, AMP-Net reconstructs slightly better than RMIST-Net and performs comparably to RDA-Net in sparse scenes; combining AMP-Net with a sparse representation structure is therefore expected to surpass SR-CSA-Net-plus and is left to future work. According to the complexity analysis in Section 3.4, the numbers of parameters in RMIST-Net, AMP-Net, RDA-Net, and SR-CSA-Net-plus are 18, 20, 9.437 × 10⁶, and 3.738 × 10⁵, respectively, for an echo size of 512 × 512. Therefore, RMIST-Net requires the least runtime among the four networks under the same fixed mutual parameters and 2D imaging mechanism, followed by AMP-Net and SR-CSA-Net-plus. In conclusion, the proposed strategy combines the merits of the CS and DL methods and achieves consistently high-quality reconstruction in sparse and weakly sparse scenes while remaining computationally competitive.
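The parameter counts quoted above can be tallied for any PyTorch-style implementation with a one-line helper such as the following, where net stands for an instantiated imaging network.

```python
def count_parameters(net):
    # Total number of learnable parameters, as compared in Section 3.4
    return sum(p.numel() for p in net.parameters() if p.requires_grad)
```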

5. Conclusions

In this article, we presented a novel deep unfolding strategy for SAR imaging of nonsparse scenes. Conventional nonsparse SAR scene reconstruction suffers from two limitations: the vectorized CS optimization model requires a high computational cost to process large-scale matrices, and transform-domain-based methods cannot obtain an optimal sparse basis. We therefore introduced an approximated observation model and CNN blocks in place of the conventional exact observation model and sparse transform function. This strategy combines deep unfolding with traditional neural network architectures, unlike purely model-driven CS reconstruction algorithms or purely data-driven DL imaging methods. In practical terms, we incorporated the CSA operator and ISTA within an efficient deep SAR imaging net, SR-CSA-Net, and developed its enhanced version, SR-CSA-Net-plus. Theoretically, these two imaging nets belong to the model-driven deep hierarchical architecture: each layer is a combination of a linearity module and a nonlinearity module, corresponding to the iteration steps of ISTA. Moreover, we embedded CNN blocks in the nonlinearity module for sparse representation instead of adopting existing sparse transform methods. The experimental results indicated that the proposed imaging strategy achieves significant improvements in the reconstruction quality of nonsparse scenes and is superior in computational efficiency and complexity to conventional CS-driven and deep unfolding methods.
As a general imaging strategy, we envisage that such model-driven deep unfolding methods with CNN structures will have significant potential in various SAR imaging applications. However, the proposed nonsparse SAR scene imaging nets consider only the side-looking mode and suffer from imaging quality degradation in squint mode. In future work, we intend to integrate the processing of range cell migration and Doppler center shift (which causes geometric distortion) into the proposed strategy to improve its adaptability and reconstruction ability for squint imaging.

Author Contributions

Conceptualization and methodology, H.Z. and J.N.; software, H.Z. and K.L.; validation, H.Z., K.L. and J.N.; resources, Q.Z. and Y.L.; writing—original draft preparation, H.Z.; writing—review and editing, H.Z., J.N. and Q.Z.; funding acquisition, Q.Z. and J.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant 62131020, grant 62001508, and grant 61971434.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lan, L.; Marino, A.; Aubry, A.; De Maio, A.; Liao, G.; Xu, J.; Zhang, Y. GLRT-Based Adaptive Target Detection in FDA-MIMO Radar. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 597–613. [Google Scholar] [CrossRef]
  2. Bamler, R. A comparison of range-Doppler and wavenumber domain SAR focusing algorithms. IEEE Trans. Geosci. Remote Sens. 1992, 30, 706–713. [Google Scholar] [CrossRef]
  3. Raney, R.; Runge, H.; Bamler, R.; Cumming, I.; Wong, F. Precision SAR processing using chirp scaling. IEEE Trans. Geosci. Remote Sens. 1994, 32, 786–799. [Google Scholar] [CrossRef]
  4. Ulander, L.; Hellsten, H.; Stenstrom, G. Synthetic-aperture radar processing using fast factorized back-projection. IEEE Trans. Aerosp. Electron. Syst. 2003, 39, 760–776. [Google Scholar] [CrossRef]
  5. Zhang, H.; Ni, J.; Xiong, S.; Luo, Y.; Zhang, Q. SR-ISTA-Net: Sparse Representation-Based Deep Learning Approach for SAR Imaging. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4513205. [Google Scholar] [CrossRef]
  6. Donoho, D. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306. [Google Scholar] [CrossRef]
  7. Eldar, Y.C.; Kutyniok, G. Compressed Sensing: Theory and Applications; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
  8. Bi, H.; Zhu, D.; Bi, G.; Zhang, B.; Hong, W.; Wu, Y. FMCW SAR Sparse Imaging Based on Approximated Observation: An Overview on Current Technologies. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 4825–4835. [Google Scholar] [CrossRef]
  9. Bi, H.; Lu, X.; Yin, Y.; Yang, W.; Zhu, D. Sparse SAR Imaging Based on Periodic Block Sampling Data. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12. [Google Scholar] [CrossRef]
  10. Kelly, S.; Yaghoobi, M.; Davies, M. Sparsity-based autofocus for undersampled synthetic aperture radar. IEEE Trans. Aerosp. Electron. Syst. 2014, 50, 972–986. [Google Scholar] [CrossRef]
  11. Beck, A.; Teboulle, M. A fast Iterative Shrinkage-Thresholding Algorithm with application to wavelet-based image deblurring. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009; pp. 693–696. [Google Scholar] [CrossRef]
  12. Rangan, S. Generalized approximate message passing for estimation with random linear mixing. In Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, St. Petersburg, Russia, 31 July–5 August 2011; pp. 2168–2172. [Google Scholar] [CrossRef]
  13. Shi, W.; Ling, Q.; Yuan, K.; Wu, G.; Yin, W. On the Linear Convergence of the ADMM in Decentralized Consensus Optimization. IEEE Trans. Signal Process. 2014, 62, 1750–1761. [Google Scholar] [CrossRef]
  14. Rilling, G.; Davies, M.; Mulgrew, B. Compressed sensing based compression of SAR raw data. In Proceedings of the SPARS’09—Signal Processing with Adaptive Sparse Structured Representations, Saint Malo, France, 6–9 April 2009; Gribonval, R., Ed.; Inria Rennes-Bretagne Atlantique: Rennes, France, 2009. [Google Scholar]
  15. Samadi, S.; Çetin, M.; Masnadi-Shirazi, M.A. Multiple Feature-Enhanced SAR Imaging Using Sparsity in Combined Dictionaries. IEEE Geosci. Remote Sens. Lett. 2013, 10, 821–825. [Google Scholar] [CrossRef]
  16. Shen, F.; Zhao, G.; Liu, Z.; Shi, G.; Lin, J. SAR Imaging With Structural Sparse Representation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3902–3910. [Google Scholar] [CrossRef]
  17. Ni, J.C.; Zhang, Q.; Luo, Y.; Sun, L. Compressed Sensing SAR Imaging Based on Centralized Sparse Representation. IEEE Sens. J. 2018, 18, 4920–4932. [Google Scholar] [CrossRef]
  18. Bo, L.; Liu, F.; Zhou, C.; Zheng, W.; Hao, H. Mixed sparse representation for approximated observation-based compressed sensing radar imaging. J. Appl. Remote Sens. 2018, 12, 035015. [Google Scholar]
  19. Fang, J.; Xu, Z.; Zhang, B.; Hong, W.; Wu, Y. Fast Compressed Sensing SAR Imaging Based on Approximated Observation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 352–363. [Google Scholar] [CrossRef]
  20. Hu, C.; Wang, L.; Li, Z.; Zhu, D. Inverse Synthetic Aperture Radar Imaging Using a Fully Convolutional Neural Network. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1203–1207. [Google Scholar] [CrossRef]
  21. Mu, H.; Zhang, Y.; Ding, C.; Jiang, Y.; Er, M.H.; Kot, A.C. DeepImaging: A Ground Moving Target Imaging Based on CNN for SAR-GMTI System. IEEE Geosci. Remote Sens. Lett. 2021, 18, 117–121. [Google Scholar] [CrossRef]
  22. Lu, Z.J.; Qin, Q.; Shi, H.Y.; Huang, H. SAR moving target imaging based on convolutional neural network. Digit. Signal Process. 2020, 106, 102832. [Google Scholar] [CrossRef]
  23. Rittenbach, A.; Walters, J.P. RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation. arXiv 2020, arXiv:2001.08202. [Google Scholar]
  24. Yonel, B.; Mason, E.; Yazıcı, B. Deep Learning for Passive Synthetic Aperture Radar. IEEE J. Sel. Top. Signal Process. 2018, 12, 90–103. [Google Scholar] [CrossRef]
  25. Wang, M.; Wei, S.; Liang, J.; Zeng, X.; Wang, C.; Shi, J.; Zhang, X. RMIST-Net: Joint Range Migration and Sparse Reconstruction Network for 3-D mmW Imaging. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5205117. [Google Scholar] [CrossRef]
  26. Li, M.; Wu, J.; Huo, W.; Jiang, R.; Li, Z.; Yang, J.; Li, H. Target-Oriented SAR Imaging for SCR Improvement via Deep MF-ADMM-Net. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5223314. [Google Scholar] [CrossRef]
  27. Zhang, H.; Ni, J.; Xiong, S.; Luo, Y.; Zhang, Q. Omega-KA-Net: A SAR Ground Moving Target Imaging Network Based on Trainable Omega-K Algorithm and Sparse Optimization. Remote Sens. 2022, 14, 1664. [Google Scholar] [CrossRef]
  28. Chen, L.; Ni, J.; Luo, Y.; He, Q.; Lu, X. Sparse SAR Imaging Method for Ground Moving Target via GMTSI-Net. Remote Sens. 2022, 14, 4404. [Google Scholar] [CrossRef]
  29. Li, R.; Zhang, S.; Zhang, C.; Liu, Y.; Li, X. Deep Learning Approach for Sparse Aperture ISAR Imaging and Autofocusing Based on Complex-Valued ADMM-Net. IEEE Sens. J. 2021, 21, 3437–3451. [Google Scholar] [CrossRef]
  30. Wei, S.; Liang, J.; Wang, M.; Shi, J.; Zhang, X.; Ran, J. AF-AMPNet: A Deep Learning Approach for Sparse Aperture ISAR Imaging and Autofocusing. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5206514. [Google Scholar] [CrossRef]
  31. Kang, L.; Sun, T.; Luo, Y.; Ni, J.; Zhang, Q. SAR Imaging Based on Deep Unfolded Network With Approximated Observation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5228514. [Google Scholar] [CrossRef]
  32. Song, C.B.; Xia, S.T. Sparse Signal Recovery by ℓq Minimization Under Restricted Isometry Property. IEEE Signal Process. Lett. 2014, 21, 1154–1158. [Google Scholar] [CrossRef]
  33. Shensa, M.J. The discrete wavelet transform: Wedding the à trous and Mallat algorithms. IEEE Trans. Signal Process. 1992, 40, 2464–2482. [Google Scholar] [CrossRef]
  34. Ahmed, N.; Natarajan, T.; Rao, K. Discrete Cosine Transform. IEEE Trans. Comput. 1974, C-23, 90–93. [Google Scholar] [CrossRef]
  35. Yang, Y.; Sun, J.; Li, H.; Xu, Z. ADMM-CSNet: A deep learning approach for image compressive sensing. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 42, 521–538. [Google Scholar] [CrossRef] [PubMed]
  36. Cui, Y.; Wu, D.; Huang, J. Optimize TSK Fuzzy Systems for Classification Problems: Minibatch Gradient Descent With Uniform Regularization and Batch Normalization. IEEE Trans. Fuzzy Syst. 2020, 28, 3065–3075. [Google Scholar] [CrossRef]
  37. Zhang, J.; Ghanem, B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1828–1837. [Google Scholar]
  38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  39. Iandola, F.; Moskewicz, M.; Karayev, S.; Girshick, R.; Darrell, T.; Keutzer, K. Densenet: Implementing efficient convnet descriptor pyramids. arXiv 2014, arXiv:1404.1869. [Google Scholar]
  40. Yu, X.; Efe, M.O.; Kaynak, O. A general backpropagation algorithm for feedforward neural networks learning. IEEE Trans. Neural Netw. 2002, 13, 251–254. [Google Scholar]
  41. Ito, D.; Takabe, S.; Wadayama, T. Trainable ISTA for sparse signal recovery. IEEE Trans. Signal Process. 2019, 67, 3113–3125. [Google Scholar] [CrossRef]
  42. Takabe, S.; Wadayama, T.; Eldar, Y.C. Complex trainable ista for linear and nonlinear inverse problems. In Proceedings of the ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 5020–5024. [Google Scholar]
  43. Borgerding, M.; Schniter, P.; Rangan, S. AMP-Inspired Deep Networks for Sparse Linear Inverse Problems. IEEE Trans. Signal Process. 2017, 65, 4293–4308. [Google Scholar] [CrossRef]
  44. Xiong, T.; Xing, M.; Wang, Y.; Wang, S.; Sheng, J.; Guo, L. Minimum-entropy-based autofocus algorithm for SAR data using chebyshev approximation and method of series reversion, and its implementation in a data processor. IEEE Trans. Geosci. Remote Sens. 2013, 52, 1719–1728. [Google Scholar] [CrossRef]
  45. Horé, A.; Ziou, D. Image Quality Metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar] [CrossRef]
  46. Li, J.; Qu, C.; Shao, J. Ship detection in SAR images based on an improved faster R-CNN. In Proceedings of the 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017; pp. 1–6. [Google Scholar] [CrossRef]
Figure 1. Geometric structure of the SAR imaging system: (a) 3D model; (b) 2D model.
Figure 2. Topology architecture of the 2D SAR imaging net.
Figure 3. Topology architecture of module R.
Figure 4. Topology architecture of the nonlinearity module: (a) module N; (b) module N_plus.
Figure 5. (a) Partial nonsparse labels in the training set; (b) one example of a nonsparse label.
Figure 6. Comparison of seven SAR imaging methods with complete sampling (including the proposed SR-CSA-Net and SR-CSA-Net-plus), where further details of the imaging results are shown below the corresponding images.
Figure 7. Comparison of seven SAR imaging methods with different sampling rates (η = 81%, 64%, and 36%).
Figure 8. Performance investigation with respect to the SCs and sparse representation for different layers and epochs (η = 81%): (a) layer; (b) epoch.
Figure 9. Performance investigation with respect to the BN operator for different layers and epochs (η = 64%): (a) layer; (b) epoch.
Figure 10. Imaging results of three real scenes by RMIST-Net, RDA-Net, SR-CSA-Net, and SR-CSA-Net-plus from top to bottom. (Columns 1 and 2: scene 1; columns 3 and 4: scene 2; columns 5 and 6: scene 3. Columns 1, 3, and 5: downsampling with η = 81%; columns 2, 4, and 6: downsampling with η = 64%.)
Figure 11. Imaging results of measured data for a harbor by CSA, MSR-CS, AMP-Net, RMIST-Net, RDA-Net, and SR-CSA-Net-plus from left to right. (Row 1: downsampling with η = 81%; row 2: downsampling with η = 64%.)
Figure 12. Imaging results of measured data for a seashore by CSA, MSR-CS, AMP-Net, RMIST-Net, RDA-Net, and SR-CSA-Net-plus from left to right. (Row 1: downsampling with η = 81%; row 2: downsampling with η = 64%.)
Table 1. Main parameters of the simulation and RADARSAT-1 satellite.

Parameters | Simulation | RADARSAT-1
Range FM rate | 62.50 MHz/μs | 0.72 MHz/μs
Azimuth FM rate | 66.67 Hz/s | 1733 Hz/s
Center frequency | 10 GHz | 5.3 GHz
Pulse duration | 1.2 μs | 41.74 μs
Pulse repetition frequency | 100 Hz | 1257 Hz
Effective radar velocity | 100 m/s | 7062 m/s
Table 2. Evaluation values with complete sampling.

Method | NMSE | PSNR (dB) | SSIM
MF (CSA) | 0.0113 | 24.72 | 0.7685
CS | 0.0173 | 22.88 | 0.7587
DCT-CS [17] | 0.0108 | 24.91 | 0.7713
MSR-CS [18] | 0.0096 | 25.45 | 0.7858
CSA-Net | 0.0092 | 25.62 | 0.8106
SR-CSA-Net | 0.0024 | 31.74 | 0.9069
SR-CSA-Net-plus | 0.0018 | 32.54 | 0.9496
Table 3. Evaluation values with different sampling rates (NMSE/PSNR (dB)/SSIM).

Method | η = 81% | η = 64% | η = 36% | Runtime (s, CPU/GPU)
MF (CSA) | 0.1930/12.40/0.2981 | 0.1991/11.89/0.2110 | 0.3450/9.88/0.1409 | 1.37/-
CS | 0.0740/16.57/0.4603 | 0.1262/14.25/0.3043 | 0.2654/11.02/0.2062 | 7.86/-
DCT-CS [17] | 0.0144/23.67/0.7331 | 0.0176/23.13/0.7113 | 0.0507/18.16/0.5726 | 60.94/-
MSR-CS [18] | 0.0118/24.55/0.7499 | 0.0162/23.16/0.7421 | 0.0739/16.45/0.5526 | 123.15/-
CSA-Net | 0.0556/17.81/0.6287 | 0.0693/16.85/0.3489 | 0.1219/14.40/0.2076 | -/0.036
SR-CSA-Net | 0.0074/26.55/0.8153 | 0.0092/25.16/0.7995 | 0.0479/19.16/0.6069 | -/0.064
SR-CSA-Net-plus | 0.0044/29.27/0.9014 | 0.0066/27.09/0.8566 | 0.0207/21.77/0.7054 | -/0.068
Table 4. PSNR value of the ablation study (η = 81%).

Method | SC | Π(·) | F(·) | F̃(·) | PSNR (dB), epoch = 101 (L = 7/9/11)
SR-CSA-Net-plus | ✓ | ✓ | ✓ | ✓ | 29.15/29.27/29.18
SR-CSA-Net | × | × | ✓ | ✓ | 26.39/26.55/26.48
CSA-Net | × | × | × | × | 17.69/17.81/17.94

Method | SC | Π(·) | F(·) | F̃(·) | PSNR (dB), L = 9 (epoch = 71/101/131)
SR-CSA-Net-plus | ✓ | ✓ | ✓ | ✓ | 29.17/29.27/29.20
SR-CSA-Net | × | × | ✓ | ✓ | 26.37/26.55/26.61
CSA-Net | × | × | × | × | 17.86/17.81/17.79
Table 5. NMSE value of the ablation study (η = 64%).

Method | SC | Π(·) | BN | NMSE, epoch = 101 (L = 7/9/11)
SR-CSA-Net-plus | ✓ | ✓ | ✓ | 0.0072/0.0066/0.0069
SR-CSA-Net-plus (w/o BN) | ✓ | ✓ | × | 0.0085/0.0083/0.0086
SR-CSA-Net | × | × | ✓ | 0.0120/0.0092/0.0097
SR-CSA-Net (w/o BN) | × | × | × | 0.0229/0.0206/0.0213

Method | SC | Π(·) | BN | NMSE, L = 9 (epoch = 71/101/131)
SR-CSA-Net-plus | ✓ | ✓ | ✓ | 0.0074/0.0066/0.0068
SR-CSA-Net-plus (w/o BN) | ✓ | ✓ | × | 0.0114/0.0083/0.0081
SR-CSA-Net | × | × | ✓ | 0.0157/0.0092/0.0091
SR-CSA-Net (w/o BN) | × | × | × | 0.0276/0.0206/0.0168
Table 6. Performance comparison of different imaging nets in nonsparse SAR scenes (NMSE/PSNR (dB)/SSIM).

Method | Scene 1, η = 81% | Scene 2, η = 81% | Scene 3, η = 81% | Scene 1, η = 64% | Scene 2, η = 64% | Scene 3, η = 64% | Runtime (s, GPU)
RMIST-Net [25] | 0.0938/16.08/0.5866 | 0.1487/15.28/0.4947 | 0.1791/17.16/0.3930 | 0.1984/12.82/0.4241 | 0.1685/14.74/0.3871 | 0.2413/15.89/0.3352 | 0.042
RDA-Net [31] | 0.0449/19.27/0.7533 | 0.0811/17.92/0.6314 | 0.1042/18.81/0.6073 | 0.0677/17.49/0.6564 | 0.1482/15.30/0.5495 | 0.1609/17.41/0.4627 | 0.075
SR-CSA-Net | 0.0259/21.66/0.8575 | 0.0490/20.11/0.8603 | 0.0527/22.47/0.8279 | 0.0542/18.45/0.7117 | 0.1281/15.93/0.6167 | 0.1414/18.19/0.4521 | 0.064
SR-CSA-Net-plus | 0.0123/24.88/0.9033 | 0.0283/22.49/0.8931 | 0.0237/25.94/0.9164 | 0.0439/19.37/0.7669 | 0.1187/16.26/0.6974 | 0.1066/19.42/0.6326 | 0.068
Table 7. Performance comparison of different imaging nets on measured data (NMSE/PSNR (dB)/SSIM/ENT).

Method | Harbor, η = 81% | Seashore, η = 81% | Harbor, η = 64% | Seashore, η = 64%
CSA | 0.0509/18.07/0.4988/3.7511 | 0.0488/18.41/0.6472/4.0985 | 0.0587/17.18/0.4511/3.9849 | 0.0536/17.96/0.6206/4.1333
MSR-CS [18] | 0.0110/23.86/0.6246/3.1717 | 0.0266/19.90/0.6310/4.0857 | 0.0127/23.24/0.5823/3.5203 | 0.0302/19.35/0.5978/3.9964
AMP-Net [30] | 0.0065/27.14/0.8858/1.5613 | 0.0143/23.30/0.7062/2.6594 | 0.0066/26.10/0.8651/1.5712 | 0.0122/23.27/0.6721/2.6949
RMIST-Net [25] | 0.0063/27.08/0.8632/1.7454 | 0.0278/21.80/0.7284/2.7329 | 0.0065/25.08/0.8376/1.9739 | 0.0334/19.87/0.6349/2.7519
RDA-Net [31] | 0.0042/27.93/0.8946/2.1576 | 0.0055/26.49/0.8682/3.4257 | 0.0055/25.85/0.8632/2.4056 | 0.0071/24.61/0.7405/3.8262
SR-CSA-Net-plus | 0.0037/28.13/0.9083/2.1361 | 0.0026/29.67/0.9145/3.2951 | 0.0049/26.77/0.8878/2.2255 | 0.0043/27.86/0.8680/3.6207