1. Introduction
Synthetic aperture radar (SAR) is an important and powerful microwave sensor system in both military and civilian applications [1]. Owing to its superior operational capabilities [2,3], SAR plays a significant role in information acquisition for reconnaissance and detection. In addition, SAR can capture the electromagnetic scattering characteristics of the observed targets and scenes and extract unique information from imaging results at microwave frequencies [4], which gives it notable advantages over other sensor systems.
As the imaging capability of SAR systems has improved, interest has extended from SAR signal processing to the interpretation and recognition of real-world targets in SAR images. Automatic target recognition (ATR) [5,6,7,8] has become one of the most attractive yet challenging research topics in SAR applications. From the user's point of view, an ideal ATR system should locate the regions that may contain targets of interest in the SAR image and assign accurate category labels to those targets intelligently and efficiently [9].
The general scheme of an end-to-end SAR ATR system, proposed by researchers at MIT Lincoln Laboratory, consists of three hierarchical processing stages [10], i.e., detection [11], discrimination [12], and classification [13]. It aims to find the regions of interest (ROIs) in the SAR imagery, screen for the desired targets [14], remove false-alarm clutter, and finally assign class labels to the SAR targets with a well-designed classifier. To make intelligent SAR target recognition a reality, many novel SAR ATR methods have been proposed in the past several years [15,16,17,18,19,20,21,22,23,24,25,26], based on techniques such as principal component analysis (PCA) [27], linear discriminant analysis (LDA) [28], support vector machine (SVM) [15], adaptive boosting (AdaBoost) [29], conditional Gaussian model (CGM) [30], sparse representation [19], and iterative graph thickening (IGT) [20], which have generally performed well in applications.
At the system level, SAR ATR implementations generally fall into two categories: template-based [10] and model-based [31]. A template-based ATR system relies on matching the class-unknown input SAR target against labeled target templates or features. It processes data sequentially and has the advantages of simple construction and efficient execution. Nevertheless, it lacks sufficient knowledge and intelligence, and its recognition results can be degraded by variations in operating conditions and by the choice of template matching method. In contrast, a model-based system comprises two modules, i.e., offline model construction and online prediction and recognition. It offers much more intelligence and flexibility than a template-based system. However, the additional complexity of model construction and online prediction also poses significant challenges for model-based SAR ATR.
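As a minimal sketch of the template-based idea, the snippet below assigns an input chip the label of the best-matching stored template. Zero-mean normalized correlation is assumed as the similarity measure, and the function names and the tiny 2x2 chips are hypothetical; practical systems typically store many templates per class across aspect angles.

```python
import math
from typing import List, Sequence, Tuple

Chip = List[List[float]]


def normalized_correlation(a: Chip, b: Chip) -> float:
    """Zero-mean normalized correlation between two equally sized chips."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa)
                    * sum((y - mean_b) ** 2 for y in fb))
    # A constant chip has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def match_template(chip: Chip,
                   templates: Sequence[Tuple[str, Chip]]) -> Tuple[str, float]:
    """Return the (label, score) of the best-matching labeled template."""
    scored = ((label, normalized_correlation(chip, t)) for label, t in templates)
    return max(scored, key=lambda pair: pair[1])


templates = [
    ("class_a", [[1.0, 0.0], [0.0, 1.0]]),
    ("class_b", [[0.0, 1.0], [1.0, 0.0]]),
]
best_label, best_score = match_template([[0.9, 0.1], [0.2, 0.8]], templates)
```

The sketch also makes the stated weakness visible: the decision depends entirely on the stored templates and on the chosen similarity measure, so a change in operating conditions that alters the chip can change the best match.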
The ATR approaches mentioned above often need to extract specialized features from SAR images and rely on complex, predesigned algorithms for target recognition. With the development of artificial intelligence, a new ATR genre based on deep learning [32] has emerged, building on the remarkable performance that deep learning has achieved in computer vision [33], natural language processing, and image classification [34]. As a new type of ATR approach, it can automatically discover and extract useful hierarchical features from input data and provide effective solutions to complex target recognition tasks. Owing to these advantages, many significant works based on deep neural networks have also greatly improved the performance of SAR ATR [16,21,22,24,35].
Most existing SAR ATR systems and algorithms treat SAR images as independent samples and are designed for single-image input in practical ATR missions. In fact, SAR image formation is a complex electromagnetic inverse scattering process, and SAR images of the same target are often sensitive to the viewing angle. Hence, it is generally difficult to extract enough information for ATR from a single SAR image. On the other hand, SAR ATR benefits from multiview measurements [36] because multiview SAR images of the same target can contain much richer classification information than a single view. In practice, SAR sensors can image the same target from different views in spotlight or circular modes. Therefore, if the classification information in multiview SAR images can be effectively exploited or learned, SAR ATR performance may be significantly improved.
Inspired by this idea, a number of methods using multiview inputs have been proposed in recent years and have reported high recognition accuracy. For example, Ref. [36] demonstrates the benefits of aspect diversity for SAR ATR through experimental analysis, and Ref. [37] proposes a multiview SAR ATR method using a Bayesian classifier, which improves ATR performance. In Ref. [38], a machine learning based method is proposed for SAR ATR using multiple acquisitions from multiple sensors, which greatly improves SAR target recognition performance. Ref. [39] extracts features from multiview SAR images using PCA and obtains good classification results with a radial basis function neural network. In Ref. [40], two fusion strategies are applied to target recognition with multiview SAR images, and the recognition performance exceeds that of single-view methods. Ref. [41] proposes a multiview SAR ATR method based on joint sparse representation, and the experimental results show its superiority. Ref. [42] investigates robust multiview SAR target recognition and further improves ATR performance using sparse representation classification. Several multiview SAR ATR methods based on various deep neural network architectures have also been proposed [43,44,45], achieving outstanding recognition results under different operating conditions.
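To give some intuition for why several views can help, the following sketch fuses per-view class probabilities with a simple product rule. It is only an illustration under strong assumptions (views conditionally independent given the class, uniform class priors) and does not reproduce any of the cited methods; the function name `fuse_views` and the class names are hypothetical.

```python
import math
from typing import Dict, List


def fuse_views(per_view_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Product-rule fusion of per-view class probabilities.

    Sums log-probabilities over the views (assumes conditional
    independence and uniform priors), then renormalizes to sum to one.
    """
    eps = 1e-12  # guards against log(0)
    classes = list(per_view_probs[0])
    log_scores = {c: sum(math.log(p[c] + eps) for p in per_view_probs)
                  for c in classes}
    peak = max(log_scores.values())
    # Subtract the peak before exponentiating for numerical stability.
    unnormalized = {c: math.exp(s - peak) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Three views, each only weakly in favor of class_a.
single_view = {"class_a": 0.55, "class_b": 0.45}
fused = fuse_views([single_view, single_view, single_view])
```

With three views that each assign 0.55 to `class_a`, the fused probability rises to roughly 0.65, so consistent weak evidence accumulates. Real multiview images are correlated, which is one reason learned fusion strategies are studied instead of this naive rule.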
In general, multiview SAR ATR is a complex and integrated information processing procedure. Achieving outstanding multiview SAR ATR performance requires addressing two issues: a valid ATR processing framework and an appropriate ATR algorithm that learns classification features from limited raw SAR samples. A reasonable processing framework is necessary for effective multiview SAR ATR, and the ATR algorithm is one of the key components of that framework. Hence, it is desirable to first establish a standard processing framework for multiview SAR ATR architecture design and then search for an effective ATR algorithm.
In this paper, we present a general processing framework for multiview SAR ATR comprising three parts, i.e., raw multiview SAR data formation, multiview SAR data preprocessing, and multiview target recognition, which provides an effective and standard approach to multiview SAR ATR system design. Based on this framework, we then propose a novel ATR method using a multiview deep feature learning network. The proposed deep neural network has a multiple-input parallel topology and embeds specific modules such as convolutional layers, convolutional gated recurrent units (ConvGRU), weighted concatenation units (WCU), 3D convolutional layers, and 3D pooling layers. With this design, the network learns both the intra-view and inter-view features of the input multiview SAR images. Therefore, the proposed network can exploit comprehensive classification information from multiview SAR images and achieve high target recognition accuracy.
The main contributions of this work relative to existing SAR ATR studies are as follows: (1) We present a general processing framework for multiview SAR ATR, which can serve as a paradigm for ATR system design and future studies in this field. (2) We propose a multiview deep feature learning network for effective SAR ATR, which simultaneously extracts intra-view and inter-view features from multiview SAR images. (3) Compared with existing SAR ATR methods, the proposed deep neural network achieves excellent ATR performance under various operating conditions while requiring only limited raw SAR data for training sample generation.
This paper is organized as follows: Section 2 introduces a general processing framework for multiview SAR ATR. Section 3 details the proposed SAR ATR method using a multiview deep feature learning network. Section 4 presents the experiments, and Section 5 concludes the paper.
2. Multiview SAR ATR Processing Framework
The practical implementation of SAR ATR was summarized as a multistage process by researchers at MIT Lincoln Laboratory in the last century [46], and this remains a classical SAR ATR framework. However, that ATR scheme was generic and was originally designed mainly for single-view SAR image input. Multiview SAR data have higher dimensionality than single-view data and contain rich classification information, so they call for a more sophisticated and specific ATR processing scheme. Therefore, building on the MIT ATR scheme, we present a general processing framework suited to multiview SAR ATR.
The framework includes three specific parts, as shown in Figure 1, i.e., raw multiview SAR data formation, multiview SAR data preprocessing, and multiview target recognition, each of which performs an easily identifiable function. The modules in this framework are detailed as follows.
2.1. Raw Multiview SAR Data Formation
The first module in the framework acquires valid raw multiview SAR images and finds the ROIs, which locates the desired targets and reduces the computational load of the ATR system. Generally, this module contains two processing steps, i.e., multiview SAR imaging and ROI acquisition. Some SAR imaging modes, such as spotlight mode [47] and circular mode [48], can continuously observe the same scene or target and are well suited to collecting raw multiview SAR images. The ROI acquisition step then produces target chips with multiple views, and many target detection and discrimination methods can be chosen to implement it.
2.2. Multiview SAR Data Preprocessing
After the raw multiview SAR data formation, the multiview SAR target chips are obtained; however, some problems remain. For example, the orientation of the same target differs across the SAR chips, and the scattering information of the target in the multiview SAR images may be indistinct. In addition, sufficient training samples should be fed into the multiview SAR ATR algorithm to optimize its parameters during the training phase. However, the amount of available raw multiview SAR data is often limited in practice, which can lead to overfitting of the ATR algorithm.
The aims of multiview SAR data preprocessing are to eliminate the orientation inconsistency, enhance the scattering information, and augment the raw multiview SAR data for training, which correspond to orientation correction, image enhancement, and data augmentation in this module, respectively. After preprocessing, the multiview SAR data are better suited to the subsequent ATR processing, and the classification information of the multiview SAR targets becomes easier to learn.
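Two of the three preprocessing steps can be sketched in a few lines. The choices below are assumptions for illustration only: a power-law (gamma) transform stands in for image enhancement, and augmentation is shown as forming every k-view combination whose aspect angles fall within a limited span. The function names, the parameters `gamma`, `k`, and `max_span`, and the omission of angle wrap-around at 360 degrees are all hypothetical simplifications, and orientation correction is not covered.

```python
from itertools import combinations
from typing import List, Tuple

Chip = List[List[float]]


def enhance(chip: Chip, gamma: float = 0.5) -> Chip:
    """Power-law enhancement of a non-negative amplitude chip.

    Normalizes by the peak value, then raises to `gamma`; a value below
    one lifts weak scatterers relative to the strongest ones.
    """
    peak = max(max(row) for row in chip)
    if peak <= 0:
        return [row[:] for row in chip]
    return [[(v / peak) ** gamma for v in row] for row in chip]


def multiview_samples(aspects: List[float], k: int,
                      max_span: float) -> List[Tuple[int, ...]]:
    """Index tuples of all k-view combinations within an aspect span.

    Views are ordered by aspect angle (degrees), so the span of a
    combination is the last angle minus the first. Wrap-around at
    360 degrees is ignored in this sketch.
    """
    order = sorted(range(len(aspects)), key=lambda i: aspects[i])
    samples = []
    for combo in combinations(order, k):
        if aspects[combo[-1]] - aspects[combo[0]] <= max_span:
            samples.append(combo)
    return samples


enhanced = enhance([[0.0, 1.0], [4.0, 16.0]])
samples = multiview_samples([0.0, 10.0, 20.0, 30.0, 200.0], k=3, max_span=30.0)
```

In this example, five raw images yield four valid three-view samples, whereas splitting them into disjoint groups of three would yield only one. Combination-based grouping of this kind is one plausible way to enlarge a training set from limited raw data.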
2.3. Multiview Target Recognition
Multiview target recognition is the back-end module of the multiview SAR ATR processing framework. It hosts the ATR algorithm, receives the multiview SAR samples from the preceding module, and assigns the most probable class label to the target. Essentially, this module learns and extracts effective classification features from the input samples and optimally separates those features with hyperplanes in the feature space.
Two kinds of features are particularly important in multiview SAR images, i.e., the intra-view feature and the inter-view feature. The intra-view feature is the inherent scattering or structural feature of the SAR target within each view, while the inter-view feature is the mutual feature across the multiview SAR image sequence, which has no counterpart in single-view SAR ATR. The inter-view feature itself comprises two distinct features. When SAR observes the same target from different views, the correlated feature among the multiview image sequence, namely the temporal feature, contains intrinsic classification information. In addition, the variation feature of the multiview image sequence, i.e., the spatial feature, provides complementary discriminative information about the same target and benefits ATR. Therefore, the key point in the multiview target recognition module is to design an appropriate ATR algorithm that simultaneously learns both intra-view and inter-view classification features from multiview SAR images. After feature learning, the multiview target recognition module outputs an accurate class label for the target.
In summary, the multiview SAR ATR processing framework consists of three individual but related modules, each with several distinct steps, which together allow the multiview SAR ATR problem to be handled effectively. Although the framework includes specific processing steps within each module, not every step is strictly necessary, and the steps can be adjusted in ATR practice.
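The modular structure, including the point that individual steps are optional, can be expressed as a small composition sketch. The `Module` class, the `run_framework` function, and the three placeholder steps are hypothetical and stand in for real imaging, preprocessing, and recognition algorithms; each view is represented here as a flat list of pixel amplitudes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

Step = Callable[[Any], Any]


@dataclass
class Module:
    """One framework module: a named, ordered list of processing steps."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step(data)
        return data


def run_framework(modules: List[Module], data: Any) -> Any:
    """Pass the data through the modules in order."""
    for module in modules:
        data = module.run(data)
    return data


# Placeholder steps (assumptions, not the paper's algorithms).
def acquire_rois(views):
    """Keep only the views that contain a bright response."""
    return [v for v in views if max(v) > 0.5]


def enhance(views):
    """Normalize each view by its peak amplitude."""
    return [[x / max(v) for x in v] for v in views]


def recognize(views):
    """Toy decision: require at least two usable views."""
    return "target" if len(views) >= 2 else "unknown"


framework = [
    Module("raw multiview SAR data formation", [acquire_rois]),
    Module("multiview SAR data preprocessing", [enhance]),
    Module("multiview target recognition", [recognize]),
]
views = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.4], [0.1, 0.2, 0.1]]
label = run_framework(framework, views)
```

Because each module only holds a list of steps, a step can be removed or replaced without touching the other modules. For example, a module with an empty step list simply passes its input through unchanged.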