In the simulated experiments, we adopted the nonsparse images in [37] as labels, each with a size of 256 × 256. A sufficient number of simulated echoes was generated from these labels with known geometry; 8000 of them were used for training and 1000 for testing. With this strategy, the training set contained 8000 echo-label pairs, with 20 dB AWGN added to each echo. Some nonsparse labels in the training set are shown in
Figure 5. In the measured experiments, a portion of the original RADARSAT-1 data containing nonsparse scenes was used to further investigate the proposed imaging strategy. The training set was formed from 1000 slices of the original echo data, where each slice had a size of 512 × 512 and 20 dB AWGN was added. In particular, we adopted the traditional CSA to generate the labels for the measured data, with the sidelobes further suppressed by feature enhancement. The following three cases were considered to assess the effectiveness of the proposed imaging nets.
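The noise-injection step described above can be sketched as follows. This is a minimal illustration, not the authors' data-generation code: it assumes that "20 dB AWGN" means complex white Gaussian noise scaled so that the signal-to-noise ratio of each echo is 20 dB, and the function name `add_awgn` is hypothetical.

```python
import numpy as np

def add_awgn(echo, snr_db, rng=None):
    """Add complex white Gaussian noise so that the echo has the requested SNR in dB.

    Assumption: the SNR is defined from the mean power of the whole echo matrix.
    """
    rng = np.random.default_rng() if rng is None else rng
    echo = np.asarray(echo, dtype=np.complex128)
    signal_power = np.mean(np.abs(echo) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Split the noise power equally between the real and imaginary parts.
    sigma = np.sqrt(noise_power / 2.0)
    noise = sigma * (rng.standard_normal(echo.shape)
                     + 1j * rng.standard_normal(echo.shape))
    return echo + noise
```

Calling `add_awgn(echo, 20.0)` on a 256 × 256 complex echo matrix returns a noisy echo of the same shape; passing a seeded `numpy.random.Generator` makes the result reproducible.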
4.1. Simulated Experiments of Nonsparse Scenes
Here, we focused on a nonsparse imaging scene in which the discretized grid was fixed to 256 × 256. In this section, five algorithms, namely MF (CSA), traditional CS without sparse representation, DCT-based CS (DCT-CS) [17], mixed sparse representation-based CS (MSR-CS) [18], and CSA-Net, were adopted for comparison. CSA-Net is a simplified SR-CSA-Net without CNN blocks, which can be regarded as a 2D version of ISTA-Net [24] improved by the CSA operator. In addition, four evaluation indices, namely the NMSE, the peak signal-to-noise ratio (PSNR) [45], the structural similarity index measure (SSIM) [45], and the mean computing time, were applied to evaluate the reconstruction performance of the different algorithms. The NMSE is defined as
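$$\mathrm{NMSE} = \frac{\|\hat{X} - X\|_F^2}{\|X\|_F^2} \tag{33}$$
where $\hat{X}$ represents the reconstruction result and $X$ denotes the label image. The definition of PSNR is
$$\mathrm{PSNR} = 10\log_{10}\frac{\mathrm{MAX}_X^2}{\mathrm{MSE}} \tag{34}$$
where MSE is the mean square error between $\hat{X}$ and $X$, and $\mathrm{MAX}_X$ denotes the pixel value range of $X$. In addition, the SSIM is defined as
$$\mathrm{SSIM} = \frac{(2\mu_{\hat{X}}\mu_{X} + C_1)(2\sigma_{\hat{X}X} + C_2)}{(\mu_{\hat{X}}^2 + \mu_{X}^2 + C_1)(\sigma_{\hat{X}}^2 + \sigma_{X}^2 + C_2)} \tag{35}$$
where $\mu$ and $\sigma^2$ denote the mean and variance, $\sigma_{\hat{X}X}$ represents the cross-covariance function, and $C_1$ and $C_2$ are preset constants.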
According to the definitions in Equations (33)–(35), the NMSE describes the accuracy of the reconstruction results, the PSNR reflects the distortion of the images after reconstruction, and the SSIM indicates the similarity between the reconstruction result and the ground truth, where a larger SSIM value means better performance.
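As a rough illustration of how the three image-quality indices can be computed, the sketch below uses their conventional forms on real-valued magnitude images. The exact normalizations used by the authors may differ. The function names, the single-window (global) form of SSIM, and the constants `k1 = 0.01` and `k2 = 0.03` (common defaults in the SSIM literature) are assumptions made for this example.

```python
import numpy as np

def nmse(recon, label):
    """Normalized mean square error: ||recon - label||^2 / ||label||^2."""
    recon, label = np.asarray(recon, float), np.asarray(label, float)
    return np.sum((recon - label) ** 2) / np.sum(label ** 2)

def psnr(recon, label, data_range=None):
    """Peak signal-to-noise ratio in dB; data_range defaults to the label's value range."""
    recon, label = np.asarray(recon, float), np.asarray(label, float)
    if data_range is None:
        data_range = label.max() - label.min()
    mse = np.mean((recon - label) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def ssim_global(recon, label, data_range=None, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole image (no sliding window)."""
    recon, label = np.asarray(recon, float), np.asarray(label, float)
    if data_range is None:
        data_range = label.max() - label.min()
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    mu_r, mu_l = recon.mean(), label.mean()
    var_r, var_l = recon.var(), label.var()
    cov = np.mean((recon - mu_r) * (label - mu_l))  # cross-covariance
    return ((2 * mu_r * mu_l + c1) * (2 * cov + c2)) / (
        (mu_r ** 2 + mu_l ** 2 + c1) * (var_r + var_l + c2))
```

A lower NMSE and a higher PSNR or SSIM indicate a reconstruction closer to the label. Note that `psnr` diverges when the two images are identical, because the MSE is then zero.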
As illustrated in Figure 6, the reconstruction results of the proposed SR-CSA-Net and SR-CSA-Net-plus were compared with those of the above five algorithms under complete sampling.
Figure 6 shows that the reconstruction results of SR-CSA-Net and SR-CSA-Net-plus have more details and sharper edges than those of the other methods, demonstrating the superiority of the proposed imaging nets.
Table 2 lists the three evaluation indices of the imaging results in Figure 6, averaged over 100 Monte Carlo experiments, with the optimal values in bold font. Although the traditional MF, the CS-driven methods, and CSA-Net without CNN blocks obtain acceptable imaging results, the proposed SR-CSA-Net and SR-CSA-Net-plus outperform all of them by a large margin. As expected, SR-CSA-Net-plus is superior to SR-CSA-Net according to the evaluation values.
Then, we examined the influence of the sampling rate on the reconstruction performance. We adopted the same training set to ensure a fair comparison, and different sampling rates were applied to the above seven SAR imaging methods to further investigate the superiority of SR-CSA-Net and SR-CSA-Net-plus. The imaging results of MF, the CS-driven methods, and the deep-unfolding-based methods under downsampling are shown in Figure 7 from left to right.
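One simple way to emulate a given sampling rate is sketched below. The sampling pattern is an assumption: the text does not state how the echo was downsampled, so this example keeps a uniformly random subset of echo samples and sets the remaining entries to zero, which keeps the echo matrix at a fixed size. The function name `downsample_echo` is hypothetical.

```python
import numpy as np

def downsample_echo(echo, rate, rng=None):
    """Keep a random fraction `rate` of the echo samples and zero the rest.

    Returns the zero-filled echo and the boolean sampling mask.
    Assumption: samples are selected uniformly at random over the 2D echo.
    """
    rng = np.random.default_rng() if rng is None else rng
    echo = np.asarray(echo)
    n_keep = int(round(rate * echo.size))
    keep = rng.choice(echo.size, size=n_keep, replace=False)
    mask = np.zeros(echo.size, dtype=bool)
    mask[keep] = True
    mask = mask.reshape(echo.shape)
    # Unsampled positions are filled with zeros so the matrix size is unchanged.
    return np.where(mask, echo, 0), mask
```

With `rate=0.36`, 36% of the entries are retained, which corresponds to the lowest sampling rate discussed in this section.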
In Figure 7, the ambiguity phenomenon appears in the reconstruction results of CSA and CSA-Net because neither the conventional MF nor the deep unfolding method without sparse representation can handle the downsampled echo of a nonsparse scene. In contrast, the ambiguity phenomenon is eliminated in the reconstruction results of the CS-driven methods and our proposed imaging nets because these methods are effective for the downsampled echo of a nonsparse scene. Although CS without sparse representation may alleviate noise and sidelobe interference to some extent, its ambiguity is still severe and worsens as the sampling rate decreases. The reconstruction results of the sparse-representation-based CS methods, including DCT-CS and MSR-CS, are acceptable at the higher sampling rates but seriously deteriorate as the sampling rate is reduced further. By contrast, the proposed SAR imaging nets achieve satisfactory reconstruction results from complete sampling down to 36% downsampling.
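To make the role of a sparsifying transform concrete, the following is a minimal sketch of ISTA with soft-thresholding in an orthonormal 2D DCT domain, in the spirit of DCT-CS. It is a toy illustration under stated assumptions, not the implementation compared in the paper: the image is real-valued, a binary sampling mask stands in for the SAR measurement operator, and the values of `lam` and `n_iter` are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def soft_threshold(c, tau):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(c) * np.maximum(np.abs(c) - tau, 0.0)

def ista_dct(y, mask, lam=0.1, n_iter=100):
    """Recover an image from masked samples y = mask * x, assuming x is sparse in the DCT domain."""
    d_r, d_c = dct_matrix(y.shape[0]), dct_matrix(y.shape[1])
    x = np.zeros_like(y, dtype=float)
    for _ in range(n_iter):
        # Gradient step on the data-fidelity term (step size 1 is valid for a binary mask).
        r = x + mask * (y - mask * x)
        # Threshold in the transform domain, then return to the image domain.
        coeffs = d_r @ r @ d_c.T
        x = d_r.T @ soft_threshold(coeffs, lam) @ d_c
    return x
```

Because the DCT matrix is orthonormal, thresholding the coefficients and transforming back is the exact proximal step for an l1 penalty on the DCT coefficients. Dropping the transform (thresholding the pixels directly) corresponds to CS without sparse representation, which is only suitable when the scene itself is sparse.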
The PSNR, NMSE, and SSIM values at various sampling rates and the mean computing time of the various algorithms are listed in Table 3. The table shows that the imaging quality of all the methods declines as the sampling rate decreases. The evaluation values of CSA, CS, and CSA-Net decrease significantly under downsampling. For DCT-CS and MSR-CS, the evaluation indices do not decline significantly at the higher sampling rates but do decline significantly at lower ones. Although the evaluation indices of our proposed imaging nets also decline with a reduction in the sampling rate, the proposed SR-CSA-Net-plus obtains the best NMSE, PSNR, and SSIM values in each case, followed by SR-CSA-Net.
The superior reconstruction performance of the proposed SR-CSA-Net and SR-CSA-Net-plus is demonstrated by the PSNR and NMSE values. Furthermore, the SSIM values and the mean computing time indicate that the proposed imaging nets have outstanding target enhancement performance and reconstruction efficiency. For different sampling rates, the echo matrix is filled to a fixed size, so there is almost no difference in imaging time. The three deep-unfolding-based imaging nets have comparable imaging times, and CSA-Net takes the shortest of the three. Because of their sparse representation modules, the parameter sets of SR-CSA-Net and SR-CSA-Net-plus include learnable parameters related to sparse representation, which makes their imaging time slightly longer than that of CSA-Net.
To further evaluate the superiority of our proposed strategy, an ablation study of SR-CSA-Net-plus with different numbers of layers, numbers of epochs, and network modules was performed and is presented in this part. The PSNR comparison results for the different imaging methods are presented in Figure 8 and Table 4, where SR-CSA-Net-plus without skip connections (SCs) and the high-frequency component recovery operator corresponds to SR-CSA-Net, and SR-CSA-Net-plus without SCs, the high-frequency component recovery operator, and the two sparse representation blocks corresponds to CSA-Net.
The PSNR curves of the proposed imaging nets increase with the number of layers or epochs and converge at a certain point, as shown in Figure 8. We also observe that SR-CSA-Net-plus significantly outperforms SR-CSA-Net, CSA-Net, and the conventional methods in terms of reconstruction performance. When we remove the SCs and the high-frequency component recovery operator, the results are still acceptable. However, after the sparse representation blocks are also removed, the system suffers a significant performance degradation. Table 4 indicates that, once the PSNR curves have converged, SR-CSA-Net-plus achieves approximately 3 dB and 12 dB gains over SR-CSA-Net and CSA-Net, respectively. Furthermore, CSA-Net achieves the fastest training convergence as the number of epochs increases, while SR-CSA-Net-plus is the second fastest and achieves the best reconstruction performance among the three deep-unfolding-based imaging nets.
The BN operator was designed into our proposed imaging nets to accelerate convergence, which is also critical to improving network performance. To directly show the influence of the BN operator, the two proposed imaging nets were compared with their counterparts without BN operators. Figure 9 and Table 5 show the NMSE comparison results of the four imaging nets with various numbers of layers and epochs, where w/o is the abbreviation of “without”.
As illustrated in Figure 9 and Table 5, the NMSE values of SR-CSA-Net and SR-CSA-Net-plus are lower than those of their counterparts without BN operators at each layer or epoch. The performance of SR-CSA-Net-plus is not appreciably affected by the removal of BN operators, whereas the advantages of BN operators are more apparent for SR-CSA-Net. Figure 9a shows that the NMSE curves of both SR-CSA-Net-plus (two red curves) and SR-CSA-Net (two blue curves) eventually converge and flatten. Figure 9b shows that SR-CSA-Net-plus with BN operators converges fastest and achieves stable performance over sixty epochs. Hence, the convergence speed and reconstruction accuracy of the imaging nets can be improved to some extent by introducing the BN operator. Nevertheless, Table 5 suggests that SR-CSA-Net-plus without BN operators still achieves better reconstruction performance than SR-CSA-Net with BN operators.
In conclusion, the sparse representation blocks are notably beneficial for the high-quality reconstruction of nonsparse SAR scenes. The superiority of SR-CSA-Net-plus over SR-CSA-Net can be attributed to the high-frequency component recovery operator and the SCs. These two designs in SR-CSA-Net-plus improve the sparse representation ability and reconstruction performance. For both SR-CSA-Net and SR-CSA-Net-plus, the BN operator facilitates the convergence of network training while improving the reconstruction accuracy.
4.2. Simulated Experiments of Real Scenes
The simulated experiments in Section 4.1 demonstrated the effectiveness and feasibility of SR-CSA-Net and SR-CSA-Net-plus for nonsparse SAR reconstruction by comparing them with the traditional MF and CS-driven methods. In this subsection, we further compare them with state-of-the-art deep unfolding methods, namely RMIST-Net [25] and RDA-Net [31], which also reduce the storage burden and are designed by unfolding ISTA. To demonstrate the universality of the proposed imaging nets, we chose three real scenes with weak sparsity from the open SAR ship detection dataset (SSDD) [46] to generate the SAR echoes, which were directly input into the imaging nets trained in Section 4.1, avoiding unnecessary training and calculation. The three real scenes, with indices 231, 1080, and 1088 in the SSDD, were cropped to images of size 256 × 256, and their sparsity degrees were ordered as scene 3 > scene 1 > scene 2. The radar system parameters were the same as those in the previous section, and the fixed mutual parameters of all networks were set to be the same for a fair comparison.
Figure 10 shows the imaging results of the three real scenes obtained by RMIST-Net, RDA-Net, SR-CSA-Net, and SR-CSA-Net-plus under two downsampling rates. In addition, the corresponding evaluation indices are listed in Table 6, where the best values are marked in boldface.
Figure 10 and Table 6 reflect the superiority and robustness of our two proposed imaging nets for real SAR scene reconstruction under downsampling. Although the reconstruction quality of all the above imaging nets degrades with decreasing scene sparsity and sampling rate, SR-CSA-Net-plus performs best in all three scenes and both downsampling schemes, followed by SR-CSA-Net. For the other two imaging nets, the imaging results of RDA-Net and RMIST-Net are acceptable at the higher sampling rate, but those of RMIST-Net are streaky at the lower one. Furthermore, RDA-Net performs better than RMIST-Net because the iterative parameters and compensation matrices in RDA-Net are learned, whereas RMIST-Net only learns the ISTA parameters and its RM kernel is predefined. Consequently, RDA-Net has the largest number of learnable parameters and RMIST-Net the fewest, while the number of parameters in our methods is moderate. Therefore, our proposed imaging nets achieve a compromise between reconstruction performance and computing speed compared with RDA-Net and RMIST-Net.
4.3. Measured Experiments
The previous simulated experiments validated that SR-CSA-Net and SR-CSA-Net-plus are superior to conventional CS-driven and deep-unfolding-based algorithms in both reconstruction quality and efficiency. In this subsection, experimental results based on the measured RADARSAT-1 data are given to investigate whether the proposed imaging strategy still performs well on measured data. Since SR-CSA-Net-plus performed better, we chose it to further verify our proposed strategy and retrained it in the way introduced at the beginning of Section 4. SR-CSA-Net-plus was tested and compared with four of the above-mentioned methods, namely CSA, MSR-CS, RMIST-Net, and RDA-Net. In addition, an AMP-unfolded 2D SAR imaging method inspired by [30], called AMP-Net, was introduced in this experiment for comparison with the SAR imaging methods unfolded by ISTA. The corresponding imaging results of a sparse scene and a nonsparse scene, namely a harbor and a seashore, are shown in Figure 11 and Figure 12, respectively. The image entropy (ENT) and the previously mentioned evaluation values are listed in Table 7.
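For reference, image entropy can be computed as in the sketch below. The definition of ENT is not given in this section, so the example assumes the form commonly used for SAR images, in which each pixel's share of the total image power is treated as a probability; the function name `image_entropy` is hypothetical.

```python
import numpy as np

def image_entropy(image):
    """Entropy of the normalized power distribution of an image (natural logarithm).

    Assumption: p = |x|^2 / sum(|x|^2), and ENT = -sum(p * ln p).
    """
    power = np.abs(np.asarray(image)) ** 2
    p = power / power.sum()
    p = p[p > 0]  # terms with p = 0 contribute nothing to the sum
    return float(-np.sum(p * np.log(p)))
```

Under this definition, an image whose energy is concentrated in a few pixels has a low entropy, while a uniform image of N pixels reaches the maximum value ln N. A low ENT therefore indicates a sharply focused or very sparse result, but it does not by itself show that the scene content has been preserved.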
Figure 11, Figure 12, and Table 7 show that the four networks achieve comparable reconstruction results in the sparse scene, but the proposed strategy has apparent advantages in the nonsparse scene. Specifically, SR-CSA-Net-plus achieves PSNR gains of approximately 3–8 dB over RDA-Net, AMP-Net, and RMIST-Net, and reconstructs more details and sharper edges. Furthermore, the NMSE, PSNR, and SSIM values of SR-CSA-Net-plus are the best among the six algorithms, which is consistent with the simulated experiments, while the ENT of SR-CSA-Net-plus ranks third. However, for nonsparse scenes, ENT cannot be used as the sole indicator of reconstruction performance. Although AMP-Net and RMIST-Net obtain smaller ENT values for both the harbor and the seashore, they lack sparse representation ability and therefore lose much information in the reconstruction process, such as the structure, edges, and smooth components of the scattered fields. This loss of information is especially pronounced in nonsparse scenes. In addition, since AMP is superior to ISTA, the reconstruction performance of AMP-Net is slightly better than that of RMIST-Net and comparable with that of RDA-Net for sparse scenes. Thus, in future work, we are interested in combining AMP-Net with a sparse representation structure, which is expected to achieve better reconstruction performance than SR-CSA-Net-plus. According to the complexity analysis in
Section 3.4, the number of parameters in RMIST-Net, AMP-Net, RDA-Net, and SR-CSA-Net-plus is 18, 20, 9.437 × 10
, and 3.738 × 10
, respectively, when the echo size is 512 × 512. Therefore, RMIST-Net has the shortest runtime among the four networks under the same fixed mutual parameters and 2D imaging mechanism, followed by AMP-Net and SR-CSA-Net-plus. In conclusion, the proposed strategy combines the merits of the CS and DL methods and achieves consistently high-quality reconstruction results in sparse and weakly sparse scenes while remaining computationally competitive.