1. Introduction
High-resolution and wide-swath synthetic aperture radar (HRWS SAR) has drawn a lot of attention over the years [1,2,3,4,5]. The main reason is that it can overcome the minimum antenna area constraint and provide high-resolution images of more extensive areas compared with the traditional SAR [6,7]. For example, the German Aerospace Center (DLR) plans a mission called High-Resolution Wide-Swath to enhance the performance of the current SAR system, TerraSAR-X. The mission is designed for a 3 m resolution with 80 km swath coverage in the stripmap mode [8]. Compared with the current TerraSAR-X [9], a 30 km increment is obtained. These prominent advantages make it suitable for applications such as maritime transportation surveillance [8], military target indication [10,11,12], etc.
Compared with the traditional SAR, the system's pulse repetition frequency (PRF) needs to be lower to receive echoes from the wide swath. This may result in undersampling in the azimuth direction, causing ghost images from Doppler ambiguity in the final imaging result [13]. To solve this problem, the HRWS SAR system is designed to deploy multiple receiving channels in the azimuth direction [14]. By receiving echoes from multiple channels at the same time, spatial sampling takes the place of temporal sampling, the equivalent sampling frequency is thus increased several-fold, and undersampling is avoided. Due to these additional receiving channels, new imaging models need to be established, and corresponding imaging algorithms need to be studied. Researchers have mainly studied these two aspects over the last decade [3,4,15,16,17,18,19,20,21,22], among which imaging-related issues such as multichannel signal reconstruction [13,15,16,17,18], amplitude/phase balancing, and motion error compensation [3,4,5,19,20] have been studied the most. In the meantime, along with these developments, multiple new spaceborne and airborne SAR systems have been designed with HRWS imaging capability [2,8,21,22,23], which again shows its attractive practical value.
The previous methods follow the imaging framework in which the imaging result of the whole scene is needed before targets are detected [24,25]. For some specific application fields, such as maritime surveillance, the online processing performance is limited due to the following two gaps left by previous approaches.
First, the design of the imaging framework does not account for limited hardware resources. From the perspective of hardware receiving resources, high resolution means that a high sampling rate is required to sample echo signals with a large bandwidth, and wide coverage means that a large storage capacity is needed to store the echo signals of the whole scene. Although most areas of the scene provide little information, the system still needs to receive their echo signals, which increases the consumption of receiving resources.
Second, the design of the imaging framework does not account for stringent real-time requirements on software processing. From the perspective of computation time, all echoes need to be processed to obtain the whole-scene image, which increases computation time.
To fill the two gaps, we need to design a new imaging framework related to hardware and software, and the focus should be on the consumption and allocation of resources. Thus, conceptually, we propose a new imaging concept to guide the design of the imaging framework. Unlike the existing whole-scene-oriented imaging framework, the concept is target-oriented: the system’s resource allocation is to serve the online target-detection task. Specifically, guided by the concept, we propose an adaptive receiving–processing–decision feedback framework from the aspects of hardware, software, and their combination. To be specific:
First, to reduce the consumption of receiving hardware resources, we propose a two-dimensional adaptive receiving module, which only receives echoes from targets. Specifically, it receives necessary bandwidth and time-width echoes from targets by dechirping and subaperture decomposition in the range and azimuth directions, respectively.
Second, to reduce the consumption of computation time, we propose a target-oriented processing module, which only processes the echoes from targets. Specifically, it only obtains the images of targets with a necessary resolution by parallel-streaming inverse fast Fourier transform (IFFT) and back projection (BP) in the range and azimuth directions, respectively.
Third, to allocate resources reasonably, we propose a decision module, which decides the parameters of the receiving and processing modules. Specifically, it determines the receiving window, the receiving aperture in the range and azimuth directions, and the processing bandwidth by constant false alarm rate (CFAR) detection to extract areas of targets.
Fourth, to allocate resources adaptively, we connect the three modules in a closed loop to realize feedback. Specifically, the loop allocates resources adaptively by imaging and detecting targets progressively from coarse to fine.
The contributions of this work are summarized as follows.
A target-oriented imaging concept is proposed for HRWS SAR imaging. To the best of our knowledge, this is the first such concept in the HRWS SAR imaging community.
An adaptive receiving–processing–decision feedback framework, aiming at specific application fields such as maritime surveillance, is proposed to fulfill the concept. We propose three modules (2D adaptive receiving module, target-oriented processing module, and decision module) with a closed-loop connection. The modules and the link are designed considering characteristics related to the application field, imaging scene, and the radar system.
The proposed framework simplifies the processing flow, providing a feasible end-to-end way for HRWS SAR online maritime surveillance to obtain the target detection results directly.
The feasibility of the proposed target-oriented imaging concept and the realization of the imaging framework are verified through numerical simulations. The system's resources and computation time are significantly saved compared with the current framework. For a typical scenario, our framework saves at least 30% of sampling resources, 33% of storage resources, and 99% of computation time.
The rest of this paper is organized as follows. Section 2 reviews the current imaging framework. Section 3 introduces the proposed imaging concept and its realization framework. Section 4 presents the experiments and the results. Conclusions are shown in Section 5.
2. Review
In this section, a brief review of the current imaging framework for HRWS SAR imaging is presented. Taking the most common type of HRWS SAR system design, the azimuth multichannel SAR (AMC-SAR) [22,26], as an example, its imaging model is presented in Figure 1.
As shown in Figure 1, unlike the traditional SAR system, multiple receiving channels are deployed along the azimuth direction. Within each transmitted pulse, these receiving channels simultaneously receive echoes from the whole scene. Considering the transmitted signal to be a linear frequency-modulated signal, the received signal of each channel for one point target can be expressed as
$$ s_n(\tau, t) = \sigma\, w_r\!\big(\tau - \tau_n(t)\big)\, \exp\!\big[ j\pi K_r \big(\tau - \tau_n(t)\big)^2 \big]\, \exp\!\big[ -j 2\pi f_c\, \tau_n(t) \big], \qquad \tau_n(t) = \frac{2 R_n(t)}{c}, \quad n = 1, \ldots, N, $$
where $\tau$ is the range time, $t$ is the azimuth time, $n$ is the channel index and $N$ is the number of channels, $\sigma$ is the target intensity, $w_r(\tau) = \operatorname{rect}(\tau / T_p)$ is the range envelope function, $\tau_n(t)$ is the target instantaneous range time delay, $R_n(t)$ is the target instantaneous slant range distance, $c$ is the light speed, $T_p$ is the pulse width, $K_r$ is the frequency modulation rate of the transmitted signal, and $f_c$ is the carrier frequency. For imaging in the range direction, matched filtering is conducted, and the imaging result in the range direction is
$$ s_{rc,n}(\tau, t) = \sigma\, T_p\, \operatorname{sinc}\!\big[ B \big(\tau - \tau_n(t)\big) \big]\, \exp\!\big[ -j 2\pi f_c\, \tau_n(t) \big], $$
where $B = |K_r|\, T_p$ is the transmitted signal bandwidth.
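As a concrete illustration of this signal model, the following is a minimal NumPy sketch (with hypothetical system parameters, not those of any particular system) that generates the LFM echo of one point target for a single channel and applies range matched filtering; the variable names mirror the symbols above but are our own choices.

```python
import numpy as np

# Hypothetical system parameters (illustrative only)
c, fc = 3e8, 9.6e9          # light speed, carrier frequency
B, Tp = 50e6, 10e-6         # transmitted bandwidth and pulse width
Kr = B / Tp                 # frequency modulation rate
fs = 1.2 * B                # range sampling rate, slightly above Nyquist

R_n = 20e3                  # instantaneous slant range of the point target (m)
tau_n = 2.0 * R_n / c       # two-way range time delay (s)

# Fast-time axis of a receiving window around the target
tau = np.arange(tau_n - 2 * Tp, tau_n + 2 * Tp, 1.0 / fs)

# Received LFM echo of one channel for one point target (baseband form)
env = (np.abs(tau - tau_n) <= Tp / 2).astype(float)       # range envelope w_r
echo = env * np.exp(1j * np.pi * Kr * (tau - tau_n) ** 2) \
           * np.exp(-2j * np.pi * fc * tau_n)

# Range matched filtering implemented as FFT-based correlation with the chirp
t_ref = np.arange(-Tp / 2, Tp / 2, 1.0 / fs)
ref = np.exp(1j * np.pi * Kr * t_ref ** 2)
n_fft = len(tau) + len(ref)
rc = np.fft.ifft(np.fft.fft(echo, n_fft) * np.conj(np.fft.fft(ref, n_fft)))

# The correlation peak falls where the echo chirp begins, i.e. at tau_n - Tp/2
est = tau[0] + np.argmax(np.abs(rc)) / fs + Tp / 2
print(f"true delay {tau_n * 1e6:.3f} us, estimated {est * 1e6:.3f} us")
```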
The next step is azimuth multichannel signal reconstruction, which is the core step of AMC-SAR signal processing. It allows the range-direction imaging results of all receiving channels to be reorganized and combined into an equivalent result of the traditional single-channel SAR [27]. As shown in Figure 2, the combined imaging result in the range direction is obtained by rearranging the samples according to the azimuth time sequence and the channel arrangement sequence. The antenna phase centers (APCs) of the channels are sorted into equivalent phase centers (EPCs) [28].
The rearranged result is
$$ s_{rc}\!\left(\tau,\; k\,T_r + \frac{n}{N}\,T_r\right) = s_{rc,n}\!\left(\tau,\; k\,T_r\right), $$
where $k$ is the pulse index and $T_r$ is the pulse repetition period. After multichannel signal reconstruction, azimuth imaging methods for traditional SAR can be utilized, such as range-Doppler [29] and BP [30,31]. For wide-swath imaging, BP has some unique advantages. First, for a wide-swath scene, range-Doppler limits the size of the imaging area due to its approximated model of the range history, while this is not a problem for BP thanks to its accurate model [32]. Second, for online processing, range-Doppler cannot be updated online because it needs the whole aperture, whereas BP supports time-domain subaperture accumulation [31]. Third, for online processing, BP can be implemented in parallel on graphics processing units (GPUs). Fourth, for platforms with maneuverability, BP can be implemented with a nonlinear trajectory [19]. Due to these advantages, we prefer BP as the azimuth imaging method. Thus, the final imaging result after BP is
$$ I(p, q) = \int_{-T_a/2}^{T_a/2} s_{rc}\!\big( \tau_{p,q}(t),\, t \big)\, \exp\!\big[ j 2\pi f_c\, \tau_{p,q}(t) \big]\, \mathrm{d}t, $$
where $p$ and $q$ are the image's azimuth and range pixel indices, respectively, $\tau_{p,q}(t)$ is the instantaneous time delay from the EPC to pixel $(p,q)$, and $T_a$ is the synthetic aperture time. The current imaging framework for HRWS imaging is summarized and illustrated in Figure 3.
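To make the rearrangement step concrete, the short NumPy sketch below interleaves the range-compressed pulses of N channels into a single EPC-ordered azimuth sequence. The uniform EPC spacing and the array names are illustrative assumptions, not the exact implementation of any cited system.

```python
import numpy as np

def rearrange_epc(rc_channels: np.ndarray) -> np.ndarray:
    """Interleave the azimuth samples of N channels into one EPC-ordered sequence.

    rc_channels : complex array of shape (N, K, Nr)
        Range-compressed data: N channels, K transmitted pulses, Nr range bins.
    Returns an array of shape (N * K, Nr): pulse k of channel n is placed at the
    equivalent slow time k*PRT + n*PRT/N (uniform EPC spacing assumed).
    """
    N, K, Nr = rc_channels.shape
    combined = np.empty((N * K, Nr), dtype=rc_channels.dtype)
    for k in range(K):
        for n in range(N):
            combined[k * N + n] = rc_channels[n, k]
    return combined

# Usage with random data standing in for range-compressed echoes
rng = np.random.default_rng(0)
data = rng.standard_normal((4, 8, 16)) + 1j * rng.standard_normal((4, 8, 16))
print(rearrange_epc(data).shape)   # (32, 16): equivalent single-channel sequence
```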
3. Methodology
In the last section, we briefly reviewed the current imaging framework for HRWS imaging. In this section, the methodology of the proposed target-oriented imaging concept and its realization framework are presented in detail.
As shown in Figure 3, the current imaging framework consists of two core modules: receiving and processing. While it is a general framework for all kinds of imaging scenes, it may not be the most suitable one for applications such as maritime surveillance, especially for online surveillance on a resource-limited platform. We take the imaging result of a typical maritime scene as an example, as shown in Figure 4. It is from the LS-SSDD-v1.0 (Large-Scale SAR Ship Detection Dataset-v1.0) dataset we released before [25].
From the figure, we can clearly see that most of the scene is water surface, and the ship targets of interest occupy only a relatively small part, revealing the strong sparsity of the scene. However, as shown in Section 2, echoes of the whole scene need to be received and processed. Then, for further surveillance, ship target detection is performed on the entire imaging result. As explained in Section 1, this framework increases the burden of receiving resources and computation time, which is not favorable for online maritime surveillance.
With this observation, utilizing the strong sparsity of the scene, we propose a new target-oriented imaging concept and its realization framework, as shown in
Figure 5.
To serve the need for online maritime surveillance, one key issue is the appropriate management of the resource budget. The first principle is to minimize unnecessary costs. Therefore, compared with the general imaging concept that is whole-scene-oriented, we propose this target-oriented one that guides the resource allocation to the targets and avoids the unnecessary resource allocation to the pure-water area. The second principle is to maximize necessary allocation. Therefore, compared with the general imaging framework with just two main modules (receiving and processing) as shown in
Figure 3, we propose an adaptive feedback framework with an additional decision module and the feedback connection to dynamically allocate the resource.
As seen in
Figure 5, the proposed framework consists of three basic modules, namely the receiving, processing, and decision modules. They are connected in a closed-loop way to enable feedback. This idea is inspired by biological behavior, where animals perceive the surrounding environment and then adjust their behavior according to the perceived information. Thus, from the perspective of perception, the first two modules collect echoes and process them, respectively, to obtain information from the imaging scene. From the perspective of action, the last module and the feedback connection process the obtained information and then decide the system's following response. Their details are presented in the following three subsections.
3.1. Two-Dimensional Adaptive Receiving Module
In this subsection, we introduce the first module in our proposed imaging framework, the 2D adaptive receiving module. From the perspective of hardware resource consumption, this module is designed to reduce the consumption of receiving resources.
In general, receiving resource consumption is mainly determined by two factors: the sampling frequency and the storage occupation. For the sampling frequency, according to the well-known Nyquist sampling theorem [33], it should be at least the bandwidth of the received echo. Thus, the requirement for the sampling device is expressed as
$$ f_s \ge B, $$
where $f_s$ is the sampling rate and $B$ is the signal bandwidth. For the storage occupation, it is determined by the sampling length and the sampling frequency, which is expressed as
$$ C = f_s\, T_w, $$
where $C$ is the storage occupation and $T_w$ is the sampling time length. As seen from the above two formulas, to reduce the consumption of receiving resources, we need to reduce the sampling frequency and the sampling length. In our 2D adaptive receiving module, echoes of only the necessary bandwidth and time length are received from the targets by dechirping and subaperture decomposition in the range and azimuth directions, respectively, as illustrated in Figure 6.
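For intuition about these two quantities, the small script below evaluates $f_s \ge B$ and $C = f_s T_w$ for whole-scene receiving versus dechirped, target-only receiving; the numbers are hypothetical and only meant to show how the saving scales.

```python
# Hypothetical numbers for a rough resource comparison (illustrative only)
B_full, T_scene = 300e6, 500e-6     # full bandwidth and whole-swath window (Hz, s)
B_target, T_target = 30e6, 50e-6    # dechirped target bandwidth and target window

def storage(f_s: float, T_w: float) -> float:
    """Storage occupation C = f_s * T_w (samples per pulse), with f_s >= B."""
    return f_s * T_w

full, target = storage(B_full, T_scene), storage(B_target, T_target)
print(f"whole scene : {full:.0f} samples/pulse")
print(f"target only : {target:.0f} samples/pulse "
      f"({100 * (1 - target / full):.1f}% saving)")
```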
Unlike normal receiving in the range direction, which mixes the echoes with the local oscillator, the module adopts dechirping by mixing them with a time-delayed copy of the transmitted signal. It can be expressed as
$$ s_{d,n}(\tau,t) = s_n(\tau,t)\, \big[ s_T(\tau - \tau_{ref}) \big]^{*} = \sigma\, w_r\!\big(\tau - \tau_n(t)\big)\, \exp\!\big[ -j 2\pi K_r \big(\tau_n(t) - \tau_{ref}\big)\big(\tau - \tau_{ref}\big) \big]\, \exp\!\big[ j\pi K_r \big(\tau_n(t) - \tau_{ref}\big)^{2} \big]\, \exp\!\big[ -j 2\pi f_c\, \tau_n(t) \big], $$
where $(\cdot)^{*}$ is the conjugate operator, $s_T(\tau)$ is the transmitted signal, and $\tau_{ref}$ is the time delay of the reference (the receiving window). This formula shows that, for a point target, the dechirped echo becomes a single-frequency signal whose frequency is related to the target's slant range. Compared with the directly received echo, whose bandwidth equals the transmitted bandwidth $B$, the bandwidth of a point target is reduced to zero. In practice, a typical target such as a ship cannot be taken as a point target but as a distributed target, as shown in Figure 7.
For a distributed target, the dechirped echo can be expressed as
$$ s_{d,n}(\tau,t) = \sum_{i=1}^{I} \sigma_i\, w_r\!\big(\tau - \tau_{n,i}(t)\big)\, \exp\!\big[ -j 2\pi K_r \big(\tau_{n,i}(t) - \tau_{ref}\big)\big(\tau - \tau_{ref}\big) \big]\, \exp\!\big[ j\pi K_r \big(\tau_{n,i}(t) - \tau_{ref}\big)^{2} \big]\, \exp\!\big[ -j 2\pi f_c\, \tau_{n,i}(t) \big], $$
where $i$ is the target point index, $I$ is the number of points, $\sigma_i$ is the intensity of the $i$-th point of the target, and $\tau_{n,i}(t)$ is the instantaneous range time delay of the $i$-th point of the target. From this formula, we can see that the echo of a distributed target consists of multiple single-frequency signals, and its bandwidth is determined by the size of the target as follows:
$$ B_d = |K_r|\, T_t, $$
where $T_t$ is the time length of the target echo in the range direction. As the maximum length equals the receiving window's length $T_w$, the largest possible bandwidth is $|K_r|\, T_w$. In general, the size of a ship does not exceed the order of 100 m, so the time length of its dechirped echo is far smaller than the length of the receiving window, and the needed sampling frequency is reduced accordingly. In other words, because we only receive echoes from targets by dechirping, the receiving module only needs to bear the resource consumption of the necessary sub-band and partial-time-length echoes rather than the consumption of the full-bandwidth and full-time-length echoes from the whole scene.
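The following NumPy sketch illustrates the dechirping idea for a small distributed target: mixing the echo with the conjugate of a delayed copy of the transmitted chirp collapses each scatterer to a near-single-frequency tone, so the occupied bandwidth is set by the target extent rather than by the transmitted bandwidth. All parameters and scatterer positions are hypothetical.

```python
import numpy as np

c, Tp, B = 3e8, 10e-6, 150e6
Kr = B / Tp                           # frequency modulation rate
fs = 2 * B                            # sample finely here only to inspect the spectrum

tau_ref = 2 * 20e3 / c                # reference delay (receiving-window centre)
tau = tau_ref + np.arange(-Tp, Tp, 1 / fs)

# Three scatterers spread over ~90 m in slant range (a ship-sized target)
delays = tau_ref + 2 * np.array([0.0, 40.0, 90.0]) / c
echo = sum((np.abs(tau - d) <= Tp / 2) * np.exp(1j * np.pi * Kr * (tau - d) ** 2)
           for d in delays)

# Dechirp: multiply by the conjugate of the delayed transmitted chirp
ref = (np.abs(tau - tau_ref) <= Tp / 2) * np.exp(1j * np.pi * Kr * (tau - tau_ref) ** 2)
dechirped = echo * np.conj(ref)

# Each scatterer becomes a tone at Kr*(tau_ref - d): proportional to its range offset
print("expected beat freqs (MHz):", np.round(Kr * (tau_ref - delays) / 1e6, 2))
spec = np.abs(np.fft.fftshift(np.fft.fft(dechirped)))
freqs = np.fft.fftshift(np.fft.fftfreq(len(tau), 1 / fs))
occupied = np.ptp(freqs[spec > 0.1 * spec.max()])
print(f"occupied band ~{occupied / 1e6:.1f} MHz, versus transmitted B = {B / 1e6:.0f} MHz")
```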
The next step is to receive in the azimuth direction. In recent years, the GPU has widely shown its strong calculation performance through thousands of parallel threads, which makes it suitable for imaging algorithms such as BP [31]. To facilitate the following signal processing, the module adopts parallel streaming subaperture decomposition receiving in the azimuth direction.
Since only the targets are to be imaged, the whole aperture of the entire scene does not need to be received; instead, only a group of subapertures for the targets is received individually. For each received subaperture, to avoid massive memory costs, the receiving storage framework adopts a streaming architecture.
For CUDA device computation, the parallel computation is executed by thousands of threads running concurrently. Each thread is uniquely associated with one pixel in the image, along with that pixel's corresponding data at different azimuth times. In the streaming architecture, instead of being associated with the whole aperture of the imaging scene, each thread corresponds to its own aperture. A circular aperture buffer is used during the azimuth receiving phase, and each received target echo streams into the buffer in chronological order. The threads write into an image buffer, which follows the aperture buffer and stores and accumulates the processing results from the aperture buffer. Finally, a mass storage device follows the image buffer and stores the final imaging result. This three-level receiving structure is illustrated in
Figure 8.
The sizes of the two buffers are determined by the aperture's length, the size of the target, and the pixel interval. For the aperture buffer, its size is determined by the number of azimuth samples received while the synthetic aperture of length $L_s$ sweeps across a target of azimuth size $D_a$ at the flight speed $v$, i.e., the samples collected during the time $(L_s + D_a)/v$, together with the number of range samples covering the target enlarged by the interpolation ratio $\kappa_r$ in the range direction and rounded up with $\lceil \cdot \rceil$. For the image buffer, its size is calculated as
$$ N_{ib} = \left\lceil \frac{D_a}{\rho_a} \right\rceil \times \left\lceil \frac{D_r}{\rho_r} \right\rceil, $$
where $\rho_a$ is the image pixel size in the azimuth direction, $\rho_r$ is the image pixel size in the range direction, and $D_r$ is the target size in the range direction.
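A minimal sketch of the three-level structure (circular aperture buffer, image buffer, mass storage) is given below in plain Python; the class layout, buffer sizes, and method names are our own illustration of the idea rather than the actual implementation.

```python
import numpy as np

class StreamingTargetReceiver:
    """Toy model of the three-level receive path for one target."""

    def __init__(self, n_aperture: int, n_range: int, n_az_pix: int, n_r_pix: int):
        # Level 1: circular aperture buffer, one column per azimuth sample
        self.aperture = np.zeros((n_range, n_aperture), dtype=complex)
        self.col = 0                                  # next column to overwrite
        # Level 2: image buffer accumulating back-projected contributions
        self.image = np.zeros((n_az_pix, n_r_pix), dtype=complex)
        # Level 3: "mass storage", here simply a list of finished target chips
        self.storage = []

    def push_echo(self, range_line: np.ndarray) -> None:
        """Stream one range-imaged line of the target into the circular buffer."""
        self.aperture[:, self.col] = range_line
        self.col = (self.col + 1) % self.aperture.shape[1]

    def flush(self) -> None:
        """Once the required resolution is reached, stream the image chip out."""
        self.storage.append(self.image.copy())
        self.image[:] = 0

# Usage: buffers sized for a small target; random lines stand in for real echoes
rx = StreamingTargetReceiver(n_aperture=64, n_range=128, n_az_pix=40, n_r_pix=50)
for _ in range(64):
    rx.push_echo(np.random.randn(128) + 1j * np.random.randn(128))
rx.flush()
print(len(rx.storage), rx.storage[0].shape)   # 1 (40, 50)
```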
The above receiving process is related to the target’s position and size. Thus, the receiving window’s position and size are adaptively changed according to the environment. Moreover, the azimuth receiving aperture’s length is also related to the necessary azimuth image resolution for target detection. It is adaptively changed from short to long according to the decision module’s feedback.
At the end of this subsection, we conclude by summarizing the receiving hardware resource consumption of the proposed receiving module, as presented in
Table 1. The comparison with the current receiving module is also included.
In Table 1, $\alpha_{os}$ is the oversampling ratio; $W_r$ and $W_a$ are the range length and the azimuth length of the whole scene, respectively; $\gamma_t$ is the relative ratio of the target's length vs. the whole scene's length in the range direction; $\gamma_r$ and $\gamma_a$ are the relative ratios of the receiving window's or subaperture's length vs. the whole scene's length in the range direction and the azimuth direction, respectively.
In summary, utilizing the sparsity of the entire scene and the relatively small size of targets, this receiving module greatly reduces both the sampling frequency and the storage occupation, as summarized in Table 1. In general, the sparser the scene is, the greater the relative reduction that can be obtained.
3.2. Target-Oriented Processing Module
In this subsection, we introduce the second module in our proposed imaging framework, the target-oriented processing module. From the perspective of software resource consumption, this module is designed to reduce the consumption of computation time.
In general, the resource consumption of computation time is mainly determined by time and spatial complexities. To reduce them, we first reduce the data amount to be processed, by means of target-oriented processing that only handles the echoes from targets; secondly, we execute the computation process in a parallel streaming way through a GPU device.
First, as shown by the dechirped echo of a distributed target above, the received echoes from a target consist of multiple single-frequency signals whose frequencies are linearly related to the range distances of the target points. Thus, imaging in the range direction is conducted through the IFFT, which can be expressed as
$$ s_{r,n}(\hat{\tau},t) = \operatorname{IFFT}_{\tau}\!\left\{ s_{d,n}(\tau,t) \right\} \approx \sum_{i=1}^{I} \sigma_i\, \operatorname{sinc}\!\left[ B\!\left(\hat{\tau} - \big(\tau_{n,i}(t)-\tau_{ref}\big)\right) \right] \exp\!\left[-j2\pi f_c\,\tau_{n,i}(t)\right]. $$
We can see from the result that multiple sinc peaks exist, and their locations correspond to the positions of the target points. The mainlobe width of these functions is inversely proportional to the transmitted signal bandwidth. In other words, through dechirp receiving and IFFT processing, the resolution of the imaging result in the range direction is the same as that of the current matched-filtering processing.
The designed process contains one multiplication operation and one fast Fourier transform (FFT) operation. The matched-filtering approach, in contrast, includes one FFT to transform the echoes from the time domain into the frequency domain, one multiplication with the matched filter, and one IFFT to transform the result back into the time domain. Thus, the designed processing saves one FFT operation.
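As a quick check of the claim that dechirp-plus-IFFT reaches the same range resolution as matched filtering, the sketch below (hypothetical parameters, in the same NumPy style as above) forms the range profile of a single dechirped scatterer and measures its 3 dB mainlobe width, which comes out close to the matched-filter value of about $0.886/B$.

```python
import numpy as np

c, Tp, B = 3e8, 10e-6, 150e6
Kr, fs = B / Tp, 20e6                  # a low sampling rate suffices after dechirping
tau_ref = 2 * 20e3 / c
tau = tau_ref + np.arange(0, Tp, 1 / fs)

d = tau_ref + 2 * 30.0 / c             # one scatterer 30 m beyond the reference
dechirped = np.exp(-2j * np.pi * Kr * (d - tau_ref) * (tau - tau_ref))   # beat tone

profile = np.abs(np.fft.ifft(dechirped, 16 * len(tau)))   # zero-padded IFFT
profile /= profile.max()

# 3 dB mainlobe width in beat frequency, converted to a delay width via 1/Kr;
# a matched filter of bandwidth B has a 3 dB width of about 0.886/B
df = fs / (16 * len(tau))
width = np.count_nonzero(profile > 1 / np.sqrt(2)) * df / Kr
print(f"3 dB width ~ {width * 1e9:.1f} ns, matched filter ~ {0.886 / B * 1e9:.1f} ns")
```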
Second, as shown in the range imaging result above, a phase term that contains the range history remains. For imaging in the azimuth direction, this phase term needs to be compensated via parallel streaming BP. As presented in the last subsection, we use the GPU device to conduct streaming BP, which means that one range imaging result is obtained at each azimuth moment; it is streamed into one column of the aperture buffer after interpolation via zero-padding in the frequency domain. Following this step, every thread corresponding to every pixel in the 2D image traces the corresponding position of the response in the aperture buffer at this moment and takes it out, together with the phase compensation term, to accumulate coherently. Once the necessary resolution is obtained, the image buffer streams the current 2D imaging result out to the mass storage device, as shown in
Figure 9.
Specifically, every thread’s calculation includes steps as follows.
- 1.
Calculate the time delay in the range direction.
Given the corresponding image pixel's position, the time delay at each azimuth moment is calculated as
$$ \tau_{p,q}(t) = \frac{2 \left\| \mathbf{P}_{EPC}(t) - \mathbf{P}_{p,q} \right\|}{c}, $$
where $\mathbf{P}_{EPC}(t)$ is the EPC position of the system at azimuth time $t$ and $\mathbf{P}_{p,q}$ is the position of pixel $(p,q)$ in the imaging plane.
- 2.
Calculate the position of response in the aperture buffer.
Given the time delay calculated before, the corresponding position of the response in the aperture buffer is calculated as
$$ m_{p,q}(t) = \left[ \left( \tau_{p,q}(t) - \tau_{rw} \right) \kappa_r f_s \right], $$
where $\tau_{rw}$ is the time delay of the receiving window and $[\cdot]$ is the rounding operator.
- 3.
Compensate the phase term.
Given the time delay calculated before, the corresponding compensated response is calculated as
$$ \hat{s}_{p,q}(t) = s_r\!\left( m_{p,q}(t),\, t \right) \exp\!\left[ j 2\pi f_c\, \tau_{p,q}(t) \right], $$
where $s_r(m, t)$ denotes the interpolated range imaging result stored in the aperture buffer at azimuth time $t$.
- 4.
Coherent accumulation.
Given the phase-compensated response, the image pixel's value is coherently accumulated with the new response as
$$ I(p,q) \leftarrow I(p,q) + \hat{s}_{p,q}(t). $$
The above calculations are repeated until the necessary resolution in the azimuth direction is achieved.
The streaming BP computation complexity is determined by the number of pixels in the 2D image—the size of the target, in other words—and the length of aperture to be accumulated. For a single round of the above calculation steps, one operation of multiplication and one operation of addition are conducted.
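To make the four per-thread steps concrete, here is a serial NumPy sketch of the same computation (delay, buffer index, phase compensation, coherent accumulation); on the real system each pixel would be handled by its own CUDA thread, and the geometry, array names, and parameters below are illustrative assumptions.

```python
import numpy as np

def bp_accumulate(image, pixel_pos, epc_pos, aperture_line, tau_rw, fs_interp, fc, c=3e8):
    """One azimuth moment of streaming back projection for every pixel.

    image         : (Na, Nr) complex image buffer, updated in place
    pixel_pos     : (Na, Nr, 3) pixel coordinates in the imaging plane
    epc_pos       : (3,) equivalent phase centre position at this azimuth time
    aperture_line : (Nbuf,) interpolated range imaging result at this azimuth time
    tau_rw        : time delay of the receiving window
    fs_interp     : sample rate of the interpolated range line
    """
    # Step 1: two-way time delay from the EPC to every pixel
    tau = 2.0 * np.linalg.norm(pixel_pos - epc_pos, axis=-1) / c
    # Step 2: position of the response in the aperture buffer (rounded)
    idx = np.clip(np.rint((tau - tau_rw) * fs_interp).astype(int),
                  0, aperture_line.size - 1)
    # Steps 3-4: compensate the carrier phase and coherently accumulate
    image += aperture_line[idx] * np.exp(2j * np.pi * fc * tau)
    return image

# Usage with tiny illustrative sizes and a made-up straight trajectory
Na, Nr = 4, 5
xs, ys = np.meshgrid(np.arange(Nr) * 1.0, np.arange(Na) * 1.0)
pixels = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
img = np.zeros((Na, Nr), dtype=complex)
for t in range(8):                                      # 8 azimuth moments
    epc = np.array([2.0, -100.0 + 25.0 * t, 500.0])     # EPC position on the track
    line = np.random.randn(256) + 1j * np.random.randn(256)
    bp_accumulate(img, pixels, epc, line, tau_rw=3e-6, fs_interp=3e8, fc=9.6e9)
print(np.abs(img).round(2))
```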
At the end of this subsection, we conclude it by summarizing the computation time resource consumption of the proposed target-oriented processing module, as presented in
Table 2. The comparison with the current processing module is also included.
In Table 2, $N_a$ is the number of azimuth sampling points for the whole scene; $M_r$ and $M_a$ are the numbers of image pixels in the range and azimuth directions, respectively; $\beta$ is the target-oriented ratio of the necessary azimuth resolution vs. the full azimuth resolution, which is no more than 1; $N_r$ is the number of sampling points in the range direction; and $N_r'$ is the number of sampling points after interpolation.
Similarly, with the help of the sparsity of the scene, the small size of targets, and the GPU device, this processing module greatly reduces the computation time consumption, as summarized in Table 2. In addition, considering the needs of target detection, the required resolution may be less than the maximum available resolution; thus, azimuth accumulation time is saved.
3.3. Decision Module and Feedback Connection
In this subsection, we introduce the last module in our proposed imaging framework. Moreover, the modules’ feedback connections are also introduced, and they are designed to adjust the resource allocation dynamically.
As the system's resources are limited, the principle of adjustment is to allocate them as reasonably and adaptively as possible. To allocate them more reasonably, the module decides the receiving module's receiving window and aperture in the two directions so that resources are spent only on targets. To allocate them more adaptively, the module decides the necessary resolution of the image.
Specifically, the module uses CFAR to extract the targets' information, including their positions and sizes. CFAR is a local detector that detects targets with an adaptive threshold [34]. For a single pixel under detection, the surrounding clutter and noise pixels are used to calculate the corresponding threshold as
$$ T_{\mathrm{CFAR}} = \alpha\, \frac{1}{N_c} \sum_{i=1}^{N_c} x_i, $$
where $x_i$ is the amplitude of a clutter and noise pixel in the image, $i$ is the pixel index, $N_c$ is the number of pixels, and $\alpha$ is a constant related to the false alarm probability. To avoid miscalculation caused by the target's pixels being mixed into the estimate, guard pixels are set adjacent to the pixel under detection, as shown in
Figure 10.
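A compact cell-averaging CFAR over a 2D amplitude image is sketched below; the window sizes, the scaling constant, and the mean-based clutter estimate are illustrative choices consistent with the description above, not the exact detector used in the experiments.

```python
import numpy as np

def ca_cfar_2d(img, guard=2, train=4, alpha=5.0):
    """Boolean detection map: pixel amplitude > alpha * local clutter mean.

    guard : half-width of the guard band excluded around the cell under test
    train : half-width of the training band used to estimate clutter and noise
    alpha : constant related to the desired false alarm probability
    """
    det = np.zeros_like(img, dtype=bool)
    r = guard + train
    padded = np.pad(img, r, mode="reflect")
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = padded[i:i + 2 * r + 1, j:j + 2 * r + 1].copy()
            # exclude the guard region (and the cell under test) from the estimate
            win[train:train + 2 * guard + 1, train:train + 2 * guard + 1] = np.nan
            det[i, j] = img[i, j] > alpha * np.nanmean(win)
    return det

# Usage: a bright "ship" on speckle-like clutter
rng = np.random.default_rng(1)
scene = rng.rayleigh(1.0, (64, 64))
scene[30:34, 40:43] += 12.0
print("detected pixels:", int(ca_cfar_2d(scene).sum()))
```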
Meanwhile, the module uses image entropy [19] to decide the necessary resolution in the azimuth direction. The image entropy is calculated as follows:
$$ E = -\sum_{p,q} P_{p,q} \ln P_{p,q}, \qquad P_{p,q} = \frac{|I(p,q)|^2}{\sum_{p',q'} |I(p',q')|^2}, $$
where $|I(p,q)|$ is the amplitude of pixel $(p,q)$ in the image. The entropy measures the randomness of the image statistically. In the SAR imaging community, it is usually used to indicate the degree of image focusing [19]. Normally, the more focused the image is, the lower the entropy is. From an information-theoretic perspective, entropy indicates the amount of uncertainty: the random parts of the image, the clutter and noise, have higher entropy than the targets. When the resolution is low, the entropy difference between the target regions and the clutter-and-noise regions is not significant. As the resolution increases, the difference increases, which is beneficial for target detection.
Initially, the target entropy decreases rapidly. As the target details are gradually revealed, the decreasing rate of the entropy becomes low, as shown in Figure 11. Based on this, the module decides that the necessary resolution has been achieved when the relative rate of change of the entropy falls below a threshold.
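The entropy-based stopping rule can be sketched as follows (Python, using the intensity-normalized entropy definition above; the threshold value and the example entropy sequence are hypothetical).

```python
import numpy as np

def image_entropy(img: np.ndarray) -> float:
    """Entropy of a complex (or amplitude) SAR image chip."""
    p = np.abs(img) ** 2
    p = p / p.sum()
    p = p[p > 0]                        # avoid log(0)
    return float(-(p * np.log(p)).sum())

def resolution_sufficient(entropies, rel_threshold=0.01) -> bool:
    """Stop refining once the relative entropy change between rounds is small."""
    if len(entropies) < 2:
        return False
    prev, curr = entropies[-2], entropies[-1]
    return abs(prev - curr) / abs(prev) < rel_threshold

# Usage: entropy values recorded as the subaperture (and resolution) grows
history = []
for E in [7.9, 7.1, 6.6, 6.4, 6.35]:    # illustrative numbers only
    history.append(E)
    if resolution_sufficient(history):
        print("necessary resolution reached after", len(history), "refinements")
        break
```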
Finally, we introduce the feedback connection. The proposed receiving, processing, and decision modules are connected through this feedback connection so that the decision module can adapt the parameters of the other modules. The receiving window and the subaperture length, which are related to the positions of targets, are fed back through this connection. For a single target, the module initially sets a short receiving aperture in the azimuth direction. The aperture is then gradually increased so that the imaging result of the target is progressively obtained from coarse to fine. When the decision module determines that the necessary resolution has been achieved, the decision is fed back through this connection to stop the resource allocation for this target.
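Putting the three modules and the feedback connection together, the closed loop can be summarized by the runnable toy sketch below; all functions are simplified stand-ins for the modules described in this section, and the random data only serve to exercise the control flow.

```python
import numpy as np

# --- Simplified stand-ins for the three modules (illustrative only) ---------
def receive_subaperture(rng):           # 2D adaptive receiving of one more subaperture
    return rng.standard_normal(64) + 1j * rng.standard_normal(64)

def range_ifft(echo):                   # target-oriented range imaging
    return np.fft.ifft(echo)

def image_entropy(img):
    p = np.abs(img) ** 2
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def resolution_sufficient(entropies, rel_threshold=0.05):
    return len(entropies) > 1 and \
        abs(entropies[-2] - entropies[-1]) / abs(entropies[-2]) < rel_threshold

def cfar_detect(img, alpha=5.0):        # decision module: crude global threshold here
    amp = np.abs(img)
    return amp > alpha * amp.mean()

# --- Closed-loop, coarse-to-fine resource allocation for one target ---------
def image_one_target(max_rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    image, entropies = np.zeros(64, dtype=complex), []
    for rounds in range(1, max_rounds + 1):
        image = image + range_ifft(receive_subaperture(rng))   # grow the subaperture
        entropies.append(image_entropy(image))
        if resolution_sufficient(entropies):
            break                        # feedback: stop allocating resources
    return cfar_detect(image), rounds

detections, used_rounds = image_one_target()
print("subapertures used:", used_rounds, "| detected cells:", int(detections.sum()))
```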