Precision Detection of Infrared Small Target in Ground-to-Air Scene

1 School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
2 Institute of Space Optoelectronic Technology, Changchun University of Science and Technology, Changchun 130022, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(22), 4230; https://doi.org/10.3390/rs16224230
Submission received: 27 September 2024 / Revised: 4 November 2024 / Accepted: 8 November 2024 / Published: 13 November 2024

Abstract

Reliable infrared small target detection plays an important role in infrared search and track systems. In recent years, most target detection methods have used the statistical features of a rectangular window to represent the contrast between the target and the background. When the target is small or close to the background, the statistical features of the rectangular window reduce the significance of the target. Moreover, such methods are of limited effectiveness in suppressing interfering targets, high-brightness backgrounds, background edges, and clutter in complex backgrounds, and they are likely to misdetect the target or even miss it. This paper proposes a non-window, structured algorithm for the precision detection of infrared small targets in complex ground-to-air scenes. The non-window, structured local grayscale descent intensity and local gradient watershed (LGDI-LGW) filter can detect a 1 × 1 pixel infrared small target and effectively suppress interfering targets and background edges. By applying an adaptive threshold and a centroid algorithm to the target area, the precision of the target coordinates reaches the sub-pixel level. The results of nine simulation experiments show that the algorithm has the lowest false alarm rate and the highest detection rate compared with eight baseline algorithms. It can effectively detect targets whose grayscale values follow a Gaussian distribution and targets whose grayscale values approximate a tree stump structure. The results of two engineering experiments show that, under simulated near-sun conditions, a uniform target is precisely detected, and a UAV point target is precisely detected in complex ground-to-air scenes.

1. Introduction

Infrared search and track (IRST) systems based on infrared imaging technology offer the advantages of strong concealment, high spatial resolution, and all-weather operation, and they utilize the temperature difference between the target and the background to achieve long-distance target detection [1,2]. They have important applications in military fields such as precision guidance, warning systems, space-based monitoring, and geological analysis [3,4]. Infrared small target detection plays a vital role in these applications. In order to find the target as early as possible, it is necessary to detect and track the target at a long distance. For long-distance infrared imaging, a small target in an infrared image usually occupies only a few pixels, which makes it impossible to segment the target by detail features such as shape and texture [5,6,7]. In addition, complex factors such as trees, buildings, and rivers in realistic backgrounds appear in infrared images in the form of highlighted backgrounds, strong edges, and noise from other interfering targets [8]. Therefore, the effective detection of infrared small targets in complex backgrounds is a problem that urgently needs to be solved.
Figure 1 shows a typical infrared image with a small target in a ground-to-air scene, where TT, HB, BE, and IT represent the true target, high-brightness background, background edge, and interfering target, respectively. When the infrared camera images a target with uniform temperature that is higher than the background temperature, the grayscale values in the target area of the infrared image are similar and higher than the grayscale values in the neighborhood background, as shown in Figure 2. The ultimate goal of infrared target detection is to utilize the difference between the true target and high-brightness background, background edge, and interfering target to achieve a low false alarm rate and a high detection rate.
Infrared small target detection methods in complex scenes include sequence detection methods and single-frame detection methods [9,10]. Since real-time target detection is urgent in military applications of IRST systems, research on single-frame detection methods is necessary [11,12]. Single-frame detection methods can be classified as model-driven or data-driven [13]. In recent years, data-driven methods based on deep learning have received widespread attention and are considered the state of the art in infrared small target detection. However, deep learning-based methods rely heavily on computational resources and labeled training samples. Their performance degrades severely in real-world engineering applications because they require highly integrated computing platforms [14]. Infrared small targets usually occupy only a few pixels, which is not conducive to texture and shape feature extraction, and the limited labeled samples can hardly cover the variety of clutter, so detection performance is greatly affected [15]. On the other hand, model-driven methods have received increasing attention because they have relatively low complexity, do not require training samples, and are easy to implement on resource-limited platforms. Model-driven methods mainly include filter-based detection methods, low-rank and sparse detection methods, and human visual system methods.
Early filter-based detection methods used the Maximum Median Filter [16], the Maximum Mean Filter [17], the Bilateral Filter [18], and the Top-Hat Filter [19] to suppress the background. Their principles are simple and easy to implement; however, the detection results are inaccurate when the background is complex. Subsequently, researchers made many improvements on the traditional filter detection methods. Cao et al. [20] proposed a two-dimensional least mean square (TDLMS) filter to adapt to more complex backgrounds, and Zhang et al. [21] proposed a two-layer TDLMS filter to suppress the background and extract the target. Compared with early methods, these improved methods adapt well to sea, sky, and building backgrounds with slowly changing grayscale values. However, false alarms may occur at pixel-sized highlight noise, background edges, and other locations with large grayscale changes.
Low-rank and sparse detection methods utilize the theory of nonlocal autocorrelation of the background, considering the target and the background as a sparse matrix and a low-rank matrix, respectively, and achieving small target detection through the mathematical optimization of the sparse matrix and low-rank matrix. Gao et al. [10] proposed an IPI model to achieve more accurate image segmentation and small target extraction by optimizing the infrared image reconstruction process. When the complexity of the image background increases, the IPI model is not sufficient to observe the sparsity among the small pieces, resulting in the sparse background noise also being mistakenly decomposed into the sparse matrix, which interferes with the detection of the true target. There are many other optimization methods, such as non-convex optimization with Lp-norm constraint (NOLC) [22], Schatten 1/2 quasi-norm regularization and reweighted sparse enhancement (RS1/2NIPI) [23], and the partial sum of the tensor nuclear norm (PSTNN) [24], and so on. With the proposal of optimization methods and the improvement of models, the detection accuracy of low-rank and sparse methods has been continuously improved. However, they still have difficulty suppressing strong local interference. In addition, they are very time-consuming due to the iterative decomposition and optimization of the matrix.
The human visual system (HVS) method is based on the fact that a small target usually causes significant grayscale changes in local texture rather than global texture [25]. This mechanism converts target salience into the difference between the target and its neighborhood background. Based on this property, many local contrast methods have been proposed. Chen et al. [4] proposed a local contrast method (LCM) and a multi-scale local contrast method (MLCM), which use nested windows with eight directions to suppress background edges. Deng et al. [8] proposed a Weighted Local Difference Measure (WLDM) that uses local difference contrast to extend the luminance difference between the target and the background to enhance the target. Wei et al. [26] used a multiscale patch-based contrast measure (MPCM) algorithm, which takes the maximum value of the local contrast between different scales for each pixel. Han et al. [27] proposed the relative LCM (RLCM), calculated by combining the ratio differences, and then extended it to the sub-block level [28]. Han et al. [29] proposed a multi-scale tri-layer local contrast measure (TLLCM), which uses Gaussian filtering to enhance the target area and takes the average of the maximum pixels in the surrounding eight areas. Moradi et al. [30] proposed the Absolute Directional Mean Difference (ADMD), which uses directional methods to suppress the structural background. Han et al. [31] proposed the Weighted Strengthened LCM (WSLCM) together with an improved RIL (IRIL) that replaces the maximum value with the average of several maximum grayscale values. Zhang et al. [32] proposed a Multi-scale Strengthened Directional Difference (MSDD) algorithm that combines the local directional intensity measure and the local directional fluctuation measure to effectively suppress angular clutter. The above HVS methods calculate the local differences at each position by sliding a rectangular window (minimum size of 3 × 3 pixels). However, the local contrast of the target cannot be measured effectively using a rectangular window. For example, when the rectangular window is larger than the target or the target is close to the background, the rectangular window contains both background and target pixels; the local contrast results of the target are then affected by the background pixels, and the statistical features no longer represent the target features. In addition, rectangular windows have limited effectiveness in suppressing irregular highlight clutter [33], which mainly consists of high-brightness backgrounds and background edges. As a result, existing HVS methods using rectangular windows struggle to suppress clutter and to detect a 1 × 1 pixel target.
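The window limitation is easy to see numerically. The toy Python snippet below (our own illustration, not any specific published method) compares a 3 × 3 cell mean centered on a 1 × 1 target with one centered on a bright background edge:

import numpy as np

# Toy scene: a 1x1 target (100) on a flat background (20) next to a
# high-brightness edge (90). All values are illustrative.
img = np.full((9, 9), 20.0)
img[4, 4] = 100.0      # true 1x1 target
img[:, 7:] = 90.0      # high-brightness background edge

def cell_mean(img, y, x, r=1):
    """Mean of the (2r+1) x (2r+1) cell centered at (y, x)."""
    return img[y - r:y + r + 1, x - r:x + r + 1].mean()

print(cell_mean(img, 4, 4))  # ~28.9: 8 background pixels dilute the target
print(cell_mean(img, 4, 7))  # ~66.7: an edge cell keeps a high statistic

The target's window statistic collapses toward the background level while the edge remains salient; this is the failure mode that the non-window structure proposed below is designed to avoid.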
Inspired by the HVS method, we propose an effective non-window, structured infrared small target detection method based on the feature analysis of infrared targets, high-brightness backgrounds, background edges, and interfering targets. Local grayscale descent intensity (LGDI) and local gradient watershed (LGW) characteristics are used to suppress the interfering targets and background clutter and to enhance the true target. The main work and contributions of this paper are summarized as follows.
(1) Based on the local features of the small target, a non-window, structured LGDI-LGW filter is designed to detect a 1 × 1 pixel infrared small target.
(2) An LGDI algorithm is proposed to distinguish the grayscale features of true targets, interfering targets, and background edges. It can effectively detect targets whose grayscale values follow a Gaussian distribution and targets whose grayscale values approximate a tree stump structure, and it effectively suppresses interfering targets and background edges.
(3) An LGW algorithm is proposed to effectively enhance the target through its gradient features and to further suppress the interfering targets and background edges.
(4) The target position is obtained using the area-adaptive threshold and centroid algorithm, and the position accuracy reaches the sub-pixel level.
The remainder of this paper is organized as follows. Section 2 provides a detailed introduction to the detection algorithm based on the non-window, structured LGDI-LGW filter. Section 3 presents the results and analysis of nine simulation experiments on infrared scenes and two engineering experiments. Section 4 summarizes the conclusions.

2. The Proposed Method

Figure 3 shows the flow of the detection algorithm based on a non-window, structured LGDI-LGW filter. The precision detection method for infrared small targets proposed in this paper consists of three parts: image preprocessing, LGDI-LGW filtering, and target-coordinate calculation with an adaptive threshold over the target area, as shown in Figure 3. First, two kinds of preprocessing are performed on the input infrared image: one is the Sobel gradient calculation, and the other is Gaussian high-pass filtering, whose local maxima yield the candidate target (CT) points. Then, local grayscale descent intensity and local gradient watershed filtering are performed on each CT point to suppress clutter and enhance the target. Next, the maximum value is selected from the filtering results to obtain the target area range. Finally, the precise coordinates of the target are calculated using the adaptive threshold and centroid algorithm over the target area.

2.1. Image Preprocessing

In ground-to-air scenes, clouds, buildings, plants, roads, and mountains sometimes appear in the background of infrared images. The purpose of image preprocessing is to remove the large uniform background while retaining as much target information as possible. In infrared images, targets, background edges, and interfering targets are high-frequency information, while backgrounds with slowly varying grayscale values are low-frequency information. A high-pass filter passes high-frequency information and suppresses low-frequency information. Because the grayscale distribution of a small target is close to a Gaussian distribution, a Gaussian filter has the additional advantage of enhancing the target to some extent. Therefore, we perform Gaussian high-pass filtering on the original infrared image. The Gaussian function [34] for a two-dimensional image is defined as:
G(x, y) = \frac{1}{2\pi\sigma^{2}} e^{-\frac{(x-\mu)^{2} + (y-\mu)^{2}}{2\sigma^{2}}}    (1)
The Gaussian kernel is set to 3 × 3. With expectation μ = 0 and standard deviation σ = 0.8, the matrix is normalized and adjusted to integers to obtain the following weight matrix:
G = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}    (2)
I_{GT}(x, y) = \begin{cases} I(x, y) - I(x, y) \times G, & \text{if } I(x, y) \ge I(x, y) \times G \\ 0, & \text{otherwise} \end{cases}    (3)
where I(x, y) is the pixel value in the original infrared image, I(x, y) × G is the pixel value after Gaussian filtering, and I_GT(x, y) is the pixel value after Gaussian high-pass filtering.
The local maxima of the image after Gaussian high-pass filtering are the CT points. The grayscale value of a CT point is I_gtlm, and its coordinates are (x_gtlm, y_gtlm). The scope of the local area is selected based on the size of the target. According to the definition of a small target by the Society of Photo-Optical Instrumentation Engineers (SPIE), the area of a small target is less than 80 pixels. In this paper, the local area is 9 × 9 pixels.
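A minimal Python sketch of this preprocessing stage is given below, assuming a single-channel image array; the function names are ours, not from the paper:

import numpy as np
from scipy.ndimage import convolve, maximum_filter

# Normalized 3x3 Gaussian kernel from Equation (2).
G = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]], dtype=float) / 16.0

def gaussian_highpass(img):
    """Equation (3): subtract the Gaussian-smoothed image, clip negatives."""
    low = convolve(img.astype(float), G, mode='nearest')
    return np.clip(img - low, 0.0, None)

def candidate_points(igt, size=9):
    """CT points: local maxima of the high-pass image over 9x9 areas."""
    is_max = (igt == maximum_filter(igt, size=size)) & (igt > 0)
    ys, xs = np.nonzero(is_max)
    return list(zip(ys, xs))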
The traditional Sobel algorithm calculates the gradient of pixels in the 0° and 90° directions, which can only reflect the edge characteristics in the horizontal and vertical directions. In order to make the gradient features of the image more complete and the edge detection finer, this paper uses eight-directional Sobel convolution kernels [35] to calculate the gradient values of the image. The eight-directional Sobel convolution kernels are shown below:
S_0 = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}, \quad S_1 = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & -1 & -2 \end{bmatrix}, \quad S_2 = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}, \quad S_3 = \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix},
S_4 = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \quad S_5 = \begin{bmatrix} -2 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{bmatrix}, \quad S_6 = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad S_7 = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & 1 \\ -2 & -1 & 0 \end{bmatrix}    (4)
The gradient value S of I(x, y) is calculated as follows:
S = \left\lfloor \frac{1}{8} \sum_{j=0}^{7} \left| I(x, y) \times S_j \right| \right\rfloor    (5)
where I(x, y) is the pixel value in the original infrared image and S_j represents the eight-directional Sobel convolution kernels; the absolute value prevents the opposite-direction responses (S_4 to S_7 are the negatives of S_0 to S_3) from canceling. The symbol ⌊·⌋ represents rounding downwards, so the values of S are integers.
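A sketch of the eight-direction Sobel gradient map of Equation (5) in Python follows; the minus signs of the kernels are restored per the standard eight-direction Sobel family, and the absolute value is our reading of the formula (see the note above):

import numpy as np
from scipy.ndimage import convolve

S0 = np.array([[ 1,  2,  1], [ 0,  0,  0], [-1, -2, -1]])
S1 = np.array([[ 2,  1,  0], [ 1,  0, -1], [ 0, -1, -2]])
S2 = np.array([[ 1,  0, -1], [ 2,  0, -2], [ 1,  0, -1]])
S3 = np.array([[ 0, -1, -2], [ 1,  0, -1], [ 2,  1,  0]])
KERNELS = [S0, S1, S2, S3, -S0, -S1, -S2, -S3]   # S4..S7 = -S0..-S3

def sobel8(img):
    """Average of the absolute kernel responses, rounded down to integers."""
    img = img.astype(float)
    acc = sum(np.abs(convolve(img, k, mode='nearest')) for k in KERNELS)
    return np.floor(acc / 8).astype(int)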

2.2. Calculation of the LGDI-LGW Properties

Figure 4(a1)–(a4) shows the high-brightness background area, true target area, interfering target area, and background edge area in Figure 1. Figure 5(a1,a2) illustrates the true target area and background edge area in Figure 2.
The high-brightness background area is relatively large and has a uniform grayscale distribution, as shown in Figure 4(b1). The high-brightness background area belongs to low-frequency information, which is filtered out after Gaussian high-pass filtering, as seen in Figure 4(c1).
As shown in Figure 4(a2–a4) and Figure 5(a1,a2), the grayscale descent values in the diagonal direction of the true target are almost the same, and the grayscale descent values in the diagonal direction of the interfering target are almost the same. However, there is a significant difference in the grayscale descent values in the diagonal direction of background edges. As shown in Figure 4(b2,b3), the grayscale descent values of the interfering target are smaller compared with the true target. Based on the intensity of grayscale descent in eight directions, the true target can be distinguished from the interfering target and the background edges.
On the gradient map of the infrared image, there is a continuous closed gradient around the true target, which we call the gradient watershed of the small target [36], as shown in Figure 4(d2) and Figure 5(d1). Although the interfering target gradient is closed, the gradient values are small, as shown in Figure 4(d3). The gradient distribution of the background edges is a non-closed watershed, and the gradient values in some directions are very small, as shown in Figure 4(d4) and Figure 5(d2). The gradient characteristics can be used to enhance the saliency of the target and further eliminate interfering targets and background edges.
When using a rectangular window to extract features, the window may contain the target, the backgrounds, the background edges, and interfering sources at the same time. To avoid these situations, a non-window, structured LGDI-LGW filter is designed in this paper to adaptively extract the grayscale features and gradient features of the target.

2.2.1. Calculation of the LGDI Properties

In the 3D map of an infrared image, the area of the top of the target is proportional to the number of pixels with high grayscale values. As shown in Figure 4(a2,b2), when the number of pixels with high target grayscale values is small, the top of the target is a small area. When the number of pixels with high target grayscale values is 1, the area at the top of the target is the smallest: a single point. The grayscale distribution of infrared small targets in these two cases is close to a Gaussian distribution. As shown in Figure 5(a1,b1), the top of the target is a larger area when the number of pixels with high target grayscale values is large and their grayscale values are similar. The grayscale distribution of the infrared small target in this situation approximates a tree stump structure. From Figure 4(b2) and Figure 5(b1), it can be seen that the grayscale values of the points on the top edge of the target are almost the same. As each point on the top edge steps outward toward its adjacent background, the grayscale descent values are almost the same and the grayscale descent intensity is large. To facilitate the calculation, the local grayscale descent intensity (LGDI) algorithm uses the grayscale descent characteristics of the target in eight directions to measure the grayscale difference between the target and the background. The calculation of the LGDI mainly comprises calculating the center position of the CT, calculating the start position of the grayscale descent for eight paths, calculating the end position of the grayscale descent for eight paths, and calculating the LGDI.
1. Calculating the center position of the CT
After image preprocessing of the original infrared image, CT points are obtained. As shown in Figure 4(b2,c2) and Figure 5(b1,c1), the maximum value point of the Gaussian high-pass filtering for the true target is at the top edge of the target. When the target size is 1 × 1 pixel, the maximum value point of the Gaussian high-pass filtering for the target is at the target vertex. In this paper, the center positions of CTs are calculated based on the CT points.
Firstly, the top-edge grayscale value I_edge of the CT is calculated based on the grayscale values of the CT point and its surrounding pixels. In Figure 6, the area marked by the blue box is the CT point, with coordinates (x_gtlm, y_gtlm) and grayscale value I_gtlm. A 3 × 3 area is selected with the CT point as the center. In the 3 × 3 area, the average grayscale value of all pixels with grayscale values less than or equal to I_gtlm is the top-edge grayscale value of the CT.
Then, the path length L_i and the average grayscale value in each of the eight directions are calculated, as shown in Figure 6. The index i is the direction number of the path and takes a value from the set {0, …, 7}. The number of pixels from the start point to the end point of a path is the path length, and the average grayscale value of all pixels on the path is the average grayscale value of the path. Since the maximum diameter of an infrared small target is nine pixels, the maximum path length is 9. The CT point (x_gtlm, y_gtlm) is the initial path point. If the grayscale value I_next^i of the next pixel along the path direction is greater than I_edge, the path continues. If the path length reaches 9 or I_next^i is not greater than I_edge, the path terminates at the current pixel.
The following formulas provide a detailed explanation of the calculation process of path length in eight directions. The initial value L i of the path length is 0.
L_i = 0, \quad i = 0, 1, \ldots, 7    (6)
L_i = \begin{cases} L_i + 1, & \text{if } I_{next}^{i} > I_{edge} \text{ and } L_i < 9 \\ L_i, & \text{otherwise} \end{cases}    (7)
The path direction i in which the maximum path length occurs is the optimal path direction i_max. When multiple paths have equal length, the direction with the maximum average grayscale value is the optimal path direction.
The middle position of the optimal path is the center position (x_center, y_center) of the CT. The grayscale value of the pixel at the center position of the CT is I_center. The optimal path direction and the center position of the CT are indicated by the green line and the green box in Figure 6, respectively.
The calculation process of the center position of the CT is explained in detail using the following formula. The initial value of the horizontal coordinate x_center of the center position of the CT is x_gtlm.
x_{center} = \begin{cases} x_{gtlm} + \lfloor L_1 / 2 \rfloor, & \text{if } i_{\max} = 1 \\ x_{gtlm} + \lfloor L_2 / 2 \rfloor, & \text{if } i_{\max} = 2 \\ x_{gtlm} + \lfloor L_3 / 2 \rfloor, & \text{if } i_{\max} = 3 \\ x_{gtlm} - \lfloor L_5 / 2 \rfloor, & \text{if } i_{\max} = 5 \\ x_{gtlm} - \lfloor L_6 / 2 \rfloor, & \text{if } i_{\max} = 6 \\ x_{gtlm} - \lfloor L_7 / 2 \rfloor, & \text{if } i_{\max} = 7 \\ x_{center}, & \text{otherwise} \end{cases}    (8)
The initial value of the vertical coordinate y_center of the center position of the CT is y_gtlm.
y_{center} = \begin{cases} y_{gtlm} - \lfloor L_0 / 2 \rfloor, & \text{if } i_{\max} = 0 \\ y_{gtlm} - \lfloor L_1 / 2 \rfloor, & \text{if } i_{\max} = 1 \\ y_{gtlm} + \lfloor L_3 / 2 \rfloor, & \text{if } i_{\max} = 3 \\ y_{gtlm} + \lfloor L_4 / 2 \rfloor, & \text{if } i_{\max} = 4 \\ y_{gtlm} + \lfloor L_5 / 2 \rfloor, & \text{if } i_{\max} = 5 \\ y_{gtlm} - \lfloor L_7 / 2 \rfloor, & \text{if } i_{\max} = 7 \\ y_{center}, & \text{otherwise} \end{cases}    (9)
The symbol ⌊·⌋ represents rounding downwards; the values of x_center and y_center are integers.
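The directional walks above reduce to a small direction table and a bounded loop. The following Python sketch (our own naming, continuing the preprocessing sketch) implements Equations (6)–(9); the (dy, dx) entries follow the sign pattern of the equations, with direction 0 pointing up, i.e., toward decreasing row index:

# (dy, dx) steps for path directions i = 0..7 (0 = up, 1 = up-right, ...,
# proceeding clockwise to 7 = up-left); rows grow downward in the image.
DIRS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def path_lengths(img, y, x, i_edge, max_len=9):
    """Equations (6)-(7): walk outward while the next pixel stays above
    I_edge, up to max_len steps; also track each path's mean grayscale."""
    lengths, means = [], []
    for dy, dx in DIRS:
        L, total, yy, xx = 0, 0.0, y, x
        while L < max_len:
            ny, nx = yy + dy, xx + dx
            if not (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]):
                break
            if img[ny, nx] <= i_edge:
                break
            yy, xx, L = ny, nx, L + 1
            total += img[ny, nx]
        lengths.append(L)
        means.append(total / L if L else 0.0)
    return lengths, means

def ct_center(y, x, lengths, means):
    """Equations (8)-(9): midpoint of the optimal path; ties in length are
    broken by the higher average grayscale value."""
    i_max = max(range(8), key=lambda i: (lengths[i], means[i]))
    dy, dx = DIRS[i_max]
    half = lengths[i_max] // 2
    return y + dy * half, x + dx * half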
2. Calculating the start position of the grayscale descent for eight paths
The start position of the grayscale descent (x_g_start^i, y_g_start^i) of each path is calculated according to the eight path directions in Figure 7. The index i is the direction number of the path and takes a value from the set {0, …, 7}. The maximum path length is set to 5. The center position of the CT (x_center, y_center) is the initial start point of the path, marked in green. Along a path direction, if the grayscale value I_next^i of the next pixel is greater than I_edge, the path continues. If the path length reaches 5 or I_next^i is not greater than I_edge, the path terminates in that direction, and the position of the current pixel is the start position of the grayscale descent of the path. In Figure 7, the start position of the grayscale descent is marked with a yellow box.
The calculation process of the start position of the grayscale descent for each path is explained using the following formula. The initial value of the horizontal coordinate x_g_start^i of the start position is x_center, and the path-length count L_i is 0.
x_{g\_start}^{i} = \begin{cases} x_{g\_start}^{i} + 1, & \text{if } I_{next}^{i} > I_{edge} \text{ and } L_i < 5, \; i = 1, 2, 3 \\ x_{g\_start}^{i} - 1, & \text{if } I_{next}^{i} > I_{edge} \text{ and } L_i < 5, \; i = 5, 6, 7 \\ x_{g\_start}^{i}, & \text{otherwise} \end{cases}    (10)
The initial value of the vertical coordinate y_g_start^i of the start position is y_center, and the path-length count L_i is 0.
y_{g\_start}^{i} = \begin{cases} y_{g\_start}^{i} - 1, & \text{if } I_{next}^{i} > I_{edge} \text{ and } L_i < 5, \; i = 0, 1, 7 \\ y_{g\_start}^{i} + 1, & \text{if } I_{next}^{i} > I_{edge} \text{ and } L_i < 5, \; i = 3, 4, 5 \\ y_{g\_start}^{i}, & \text{otherwise} \end{cases}    (11)
3. Calculating the end position of the grayscale descent for eight paths
The end position of the grayscale descent (x_g_end^i, y_g_end^i) of each path is calculated according to the eight path directions in Figure 7. The index i is the direction number of the path and takes a value from the set {0, …, 7}. In order to accurately distinguish the target from the background edge, the maximum path length is set to 9. The start position of the grayscale descent (x_g_start^i, y_g_start^i) is the initial start point of the path, and its grayscale value is I_g_start^i. Along a path direction, if the grayscale value I_next^i of the next pixel is not greater than the grayscale value I_i of the current pixel, the next pixel becomes the new start point and the path continues. If the path length reaches 9 or I_next^i is greater than I_i, the path terminates in that direction, and the position of the current pixel is the end position of the grayscale descent of the path. In Figure 7, the end position of the grayscale descent is marked with a red box. The pixel grayscale value at the end position of the grayscale descent is I_g_end^i.
The calculation process of the end position of the grayscale descent for each path is explained using the following formula. The initial value of the horizontal coordinate x_g_end^i of the end position is x_g_start^i, and the path-length count L_i is 0.
x_{g\_end}^{i} = \begin{cases} x_{g\_end}^{i} + 1, & \text{if } I_{next}^{i} \le I_{i} \text{ and } L_i < 9, \; i = 1, 2, 3 \\ x_{g\_end}^{i} - 1, & \text{if } I_{next}^{i} \le I_{i} \text{ and } L_i < 9, \; i = 5, 6, 7 \\ x_{g\_end}^{i}, & \text{otherwise} \end{cases}    (12)
The initial value of the vertical coordinate y_g_end^i of the end position is y_g_start^i, and the path-length count L_i is 0.
y_{g\_end}^{i} = \begin{cases} y_{g\_end}^{i} - 1, & \text{if } I_{next}^{i} \le I_{i} \text{ and } L_i < 9, \; i = 0, 1, 7 \\ y_{g\_end}^{i} + 1, & \text{if } I_{next}^{i} \le I_{i} \text{ and } L_i < 9, \; i = 3, 4, 5 \\ y_{g\_end}^{i}, & \text{otherwise} \end{cases}    (13)
4. Calculating the LGDI
The LGDI algorithm measures the grayscale difference between the target and the background using the difference between the pixel grayscale value I_center at the center position of the CT and the pixel grayscale value I_g_end^i at the end position of the grayscale descent along each of the eight paths. The grayscale value difference is defined as:
d_i = (I_{center} - I_{g\_end}^{i}) \times (I_{center} - I_{g\_end}^{i+4}), \quad i = 0, 1, 2, 3    (14)
Then, the minimum value of d_i is used as the LGDI at the center position of the CT, which is defined as:
LGDI(x_{center}, y_{center}) = \min(d_i), \quad i = 0, 1, 2, 3    (15)
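Continuing the sketch above (same DIRS table), the start and end of the grayscale descent and the LGDI response of Equations (10)–(15) can be traced per direction as follows; again, all function names are ours:

def descent_start(img, yc, xc, i_edge, dy, dx, max_len=5):
    """Equations (10)-(11): last pixel still above I_edge, at most 5 steps."""
    y, x, L = yc, xc, 0
    while L < max_len:
        ny, nx = y + dy, x + dx
        if not (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]):
            break
        if img[ny, nx] <= i_edge:
            break
        y, x, L = ny, nx, L + 1
    return y, x

def descent_end(img, ys, xs, dy, dx, max_len=9):
    """Equations (12)-(13): follow the monotone grayscale descent, at most
    9 steps; the walk stops as soon as the grayscale rises again."""
    y, x, L = ys, xs, 0
    while L < max_len:
        ny, nx = y + dy, x + dx
        if not (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]):
            break
        if img[ny, nx] > img[y, x]:
            break
        y, x, L = ny, nx, L + 1
    return y, x

def lgdi(img, yc, xc, ends):
    """Equations (14)-(15): ends[i] is the (y, x) end position of path i;
    LGDI is the minimum product over the four opposite-direction pairs."""
    c = float(img[yc, xc])
    d = [(c - img[ends[i]]) * (c - img[ends[i + 4]]) for i in range(4)]
    return min(d)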

2.2.2. Calculation of the LGW Properties

The LGDI can enhance the true target and suppress the interfering targets and background clutter. To further enhance the true target and eliminate the interfering targets and background edges, gradient features are needed. Figure 8 shows that the gradient watershed on each path is located between the center position of the CT and the end position of the grayscale descent. The gradient watershed value on each path is extracted based on this location characteristic.
We obtain the end position coordinates of each path using Equations (12) and (13). The gradient values along the eight paths starting from the center position of the CT are taken from the Sobel gradient map S, as shown in Figure 8. The gradient values of each path T^i can be written as:
T^{i} = \left\{ S_{center}^{i}, S_{center+1}^{i}, \ldots, S_{g\_end}^{i} \right\}, \quad i = 0, 1, \ldots, 7    (16)
The maximum gradient value in each direction can be expressed as:
T_{\max}^{i} = \max(T^{i}), \quad i = 0, 1, \ldots, 7    (17)
We obtain the gradient watershed at the center position of the CT by performing a minimum pooling operation on T_max^i. The result of the LGW can be defined as:
LGW(x_{center}, y_{center}) = \min(T_{\max}^{i}), \quad i = 0, 1, \ldots, 7    (18)
Finally, we obtain the response results of the LGDI-LGW filter at the center position of the CT as follows:
F(x_{center}, y_{center}) = LGDI(x_{center}, y_{center}) \times LGW(x_{center}, y_{center})    (19)
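A sketch of the LGW stage and the final filter response (Equations (16)–(19)) follows, reusing DIRS and the descent end positions from the previous sketch; grad stands for the eight-direction Sobel map S computed during preprocessing:

def path_cells(yc, xc, ye, xe, dy, dx):
    """Pixels from the CT center to the path's descent end, inclusive."""
    cells, y, x = [(yc, xc)], yc, xc
    while (y, x) != (ye, xe):
        y, x = y + dy, x + dx
        cells.append((y, x))
    return cells

def lgw(grad, yc, xc, ends):
    """Equations (16)-(18): per-path maximum gradient, min-pooled over the
    eight directions."""
    t_max = []
    for (dy, dx), end in zip(DIRS, ends):
        cells = path_cells(yc, xc, end[0], end[1], dy, dx)
        t_max.append(max(grad[y, x] for y, x in cells))
    return min(t_max)

def lgdi_lgw_response(img, grad, yc, xc, ends):
    """Equation (19): product of the LGDI and LGW responses."""
    return lgdi(img, yc, xc, ends) * lgw(grad, yc, xc, ends)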
The execution process of the LGDI-LGW filter is shown in Algorithm 1.
Algorithm 1: LGDI-LGW filter.
  Input: I(x_gtlm, y_gtlm).
  Output: F(x_center, y_center).
  1: for i = 0:7 do
       calculate the path length L_i through Equation (7).
     end for
  2: Obtain the optimal path direction i_max by comparing the lengths and the average grayscale values of the 8 paths.
  3: Calculate the center coordinates (x_center, y_center) of the CT through Equations (8) and (9).
  4: for i = 0:7 do
       calculate the start position of the grayscale descent through Equations (10) and (11).
     end for
  5: for i = 0:7 do
       calculate the end position of the grayscale descent through Equations (12) and (13).
     end for
  6: for i = 0:3 do
       get d_i through Equation (14).
     end for
  7: Calculate LGDI(x_center, y_center) through Equation (15).
  8: for i = 0:7 do
       calculate T^i through Equation (16) and get T_max^i through Equation (17).
     end for
  9: Calculate LGW(x_center, y_center) through Equation (18).
  10: Calculate the result F(x_center, y_center) through Equation (19).
Figure 7a,b show the grayscale descent paths of the infrared small target (with grayscale distribution approximating tree stump structure) and the background edges, respectively. Figure 8a,b show the LGW of these two areas, respectively. Figure 9(a1,b1,c1) show the grayscale descent paths of the infrared small target (with a grayscale distribution similar to Gaussian distribution), the interfering targets, and the background edges. Figure 9(a2,b2,c2) show the LGW of these three areas, respectively. The center position of the CT is marked in green. On the eight paths, the start position of the grayscale descent is marked with a yellow box, and the end position of the grayscale descent is marked with a red box. The minimum path of LGDI is marked with a yellow box, while the remaining paths are marked with red boxes. The minimum value of LGW is marked in purple.
From Figure 7a and Figure 9(a1), it can be seen that there is a large grayscale difference between the true target and the adjacent background on each path; moreover, there is a large gradient watershed around the true target, which leads to LGDI-LGW responses F of 1.61556 × 10^13 and 6,591,676, respectively. From Figure 9(b1,b2), we can see that the interfering target has grayscale and gradient features similar to those of the true target; however, its grayscale difference and gradient watershed are relatively small, resulting in an LGDI-LGW response F of 2,300,512. In Figure 7b and Figure 9(c1), there is a significant grayscale difference between the background edges and the adjacent background on some paths but a smaller difference on other paths. Meanwhile, the local gradient of the background edges is non-closed, with smaller gradients in some directions. After the minimum pooling operation, the LGDI-LGW responses F of the two background edges are 1.35125 × 10^10 and 1,648,500, respectively.
Through the above comparison, it can be concluded that the F-values of the interfering targets and the F-values of the background edges are much smaller than the F-values of the true target. The proposed LGDI-LGW algorithm can effectively enhance the target and suppress the interfering targets and background edges.

2.3. Precise Calculation of the Target Position

Existing algorithms generally perform threshold calculation and image segmentation on the entire processed image. The target position obtained is a single pixel or an area composed of several pixels, with low accuracy, and the computation is time-consuming because the entire image must be traversed twice.
To avoid these drawbacks, a precise calculation method for the target position is proposed. After passing through the LGDI-LGW filter, there are significant response differences among true targets, interfering targets, and background edges. The true target is obtained by selecting the maximum value from the LGDI-LGW filter results. The range of the true target and its neighborhood background area can be obtained from the end positions of the grayscale descent of the true target. They are defined as follows:
x_{left} = \max(x_{g\_end}^{i}), \quad i = 5, 6, 7    (20)
x_{right} = \min(x_{g\_end}^{i}), \quad i = 1, 2, 3    (21)
y_{up} = \max(y_{g\_end}^{i}), \quad i = 0, 1, 7    (22)
y_{down} = \min(y_{g\_end}^{i}), \quad i = 3, 4, 5    (23)
where x l e f t , x r i g h t , y u p and y d o w n represent column minimum, column maximum, row minimum, and row maximum, respectively.
In the original infrared image, an adaptive threshold is used to obtain the true target area from the true target and its neighborhood background area. The adaptive threshold I t h is defined as follows:
I_{th} = \lambda \times (I_{\max} - I_{mean}) + I_{mean}    (24)
where I_max and I_mean are the maximum and average pixel grayscale values, respectively, of the true target and its neighborhood background area in the original infrared image, and λ is a given parameter between 0 and 1. When λ is low, the threshold approaches the mean value, and pixels with high grayscale values in the neighborhood background of the target may participate in the coordinate calculation, reducing the positional accuracy of the target. When λ is high, the threshold approaches the maximum value, and only the pixel with the highest grayscale value in the target participates in the coordinate calculation, which also reduces the positional accuracy. In our experience, the appropriate range of λ is [0.4, 0.7].
In the true target and its neighborhood background area, if the pixel grayscale value is less than the threshold, it is considered background or noise and does not participate in centroid calculation. The centroid calculation is defined as follows:
Sum_{target\_x} = \begin{cases} \sum_{x=1}^{M} \sum_{y=1}^{N} x \times I(x, y), & \text{if } I(x, y) \ge I_{th} \\ Sum_{target\_x}, & \text{otherwise} \end{cases}    (25)
Sum_{target\_y} = \begin{cases} \sum_{x=1}^{M} \sum_{y=1}^{N} y \times I(x, y), & \text{if } I(x, y) \ge I_{th} \\ Sum_{target\_y}, & \text{otherwise} \end{cases}    (26)
Sum_{target\_data} = \begin{cases} \sum_{x=1}^{M} \sum_{y=1}^{N} I(x, y), & \text{if } I(x, y) \ge I_{th} \\ Sum_{target\_data}, & \text{otherwise} \end{cases}    (27)
X_{target} = \frac{Sum_{target\_x}}{Sum_{target\_data}}    (28)
Y_{target} = \frac{Sum_{target\_y}}{Sum_{target\_data}}    (29)
where M and N are the numbers of rows and columns of the infrared image, respectively; X_target and Y_target are the horizontal and vertical coordinates of the true target, respectively; and I(x, y), x, and y represent the grayscale value, horizontal coordinate, and vertical coordinate of a pixel, respectively.
The target position calculated using the above centroid algorithm achieves sub-pixel accuracy. In Figure 10a,b, the position of the maximum value after LGDI-LGW filtering, i.e., the center position of the true target, is marked in green. The target position and coordinates calculated using the above method are marked in red, and the accuracy of the true target coordinates reaches the sub-pixel level.
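A minimal Python sketch of this position step is given below, assuming the bounding rows and columns obtained above are available; lam plays the role of λ, and the function name is ours:

import numpy as np

def subpixel_centroid(img, y0, y1, x0, x1, lam=0.5):
    """Adaptive threshold over the target region followed by an intensity-
    weighted centroid; returns sub-pixel (X_target, Y_target)."""
    roi = img[y0:y1 + 1, x0:x1 + 1].astype(float)
    i_th = lam * (roi.max() - roi.mean()) + roi.mean()   # adaptive threshold
    ys, xs = np.nonzero(roi >= i_th)      # pixels kept for the centroid
    w = roi[ys, xs]                       # grayscale values act as weights
    return x0 + (xs * w).sum() / w.sum(), y0 + (ys * w).sum() / w.sum()

Because the weighted means are real-valued, the returned coordinates are not restricted to the pixel grid, which is what yields the sub-pixel accuracy.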

3. Experiments and Analysis

In this section, we conduct two kinds of experiments to evaluate the performance of the proposed algorithm: simulation experiments on nine types of ground-to-air infrared image sequences with complex backgrounds, and actual engineering experiments conducted indoors and outdoors.

3.1. Results and Analysis of Simulation Experiments

In this section, we conducted simulation experiments on nine types of ground-to-air infrared image sequences with complex backgrounds. The proposed algorithm was qualitatively and quantitatively compared with eight baseline algorithms, namely Top Hat [19], PSTNN [24], MLCM [4], MPCM [26], RLCM [27], TLLCM [29], ADMD [30], and WSLCM [31], to comprehensively verify its ability to suppress background edges, enhance targets, and detect targets. All simulation experiments were performed using MATLAB R2019a on a computer with 16 GB of RAM and a 2.4 GHz Intel i5 CPU.
The detailed features of the targets and backgrounds in the infrared image sequences of the nine scenes are listed in Table 1 [37,38,39]. These images contain complex backgrounds such as buildings, trees, grasslands, rivers, roads, mountains, and clouds, which can effectively test the performance and robustness of the proposed algorithm.

3.1.1. Qualitative Analysis of Simulation Experiments

The parameter settings of the eight baseline algorithms, including Top Hat [19], PSTNN [24], MLCM [4], MPCM [26], RLCM [27], TLLCM [29], ADMD [30] and WSLCM [31], are shown in Table 2.
Figure 11 shows the target detection images and 3D distribution maps for the nine types of infrared images, which vividly demonstrate the detection performance of the eight baseline algorithms and the proposed algorithm. In each case, the target is marked with a green rectangle in the original infrared image, and the detection images of all algorithms are shown without threshold segmentation.
In Figure 11, the proposed algorithm achieves the expected suppression effect on interfering targets, high-brightness buildings, high-brightness land, high-brightness roads, background edges, and other clutter. The algorithm effectively enhances the target and achieves satisfactory detection results. The other baseline algorithms, however, have limited ability to suppress interfering targets, high-brightness backgrounds, background edges, and clutter, and they even misdetect or miss targets. Top Hat not only has a poor ability to suppress high-brightness backgrounds, interfering targets, and background edges but also misdetects targets in several scenes. PSTNN suppresses most backgrounds; however, its effectiveness in suppressing interfering targets is poor, and it also enhances background edges, which leads to target misdetection, such as in the lower-left corner area of the buildings in scene (4). MLCM enhances the target, making the target area larger while also enhancing noise, but cannot suppress most of the background. Although RLCM detects the target, it shares the same problem as MLCM of enlarging the target area, and its detection performance in complex backgrounds still needs improvement. MPCM and ADMD can suppress most backgrounds but cannot suppress background edges and locally highlighted areas, such as the highlighted parts of the clouds in scene (2), the edges of the buildings in scene (8), and the edges of the clouds in scene (9). TLLCM has some suppression effect on the background; however, it misdetects targets in multiple scenes and fails to detect the 1 × 1 pixel targets in scenes (8) and (9). WSLCM detects the targets and performs well in scenes (3)–(7) and scene (9); however, it misdetects targets in scenes (1) and (2) and fails to detect the target in scene (8).
Overall, in the detection of infrared small targets in complex backgrounds, factors such as interfering targets, high-brightness backgrounds, background edges, low target grayscale values, and targets close to the background can make detection difficult, and baseline algorithms are prone to misdetecting or even missing targets. The visual comparisons above show that, compared with the eight baseline algorithms, the proposed algorithm overcomes these difficulties, achieves a good background suppression effect, and can effectively detect small targets.

3.1.2. Quantitative Analysis of Simulation Experiments

To evaluate the performance of the algorithms with quantitative metrics, the background suppression factor (BSF), contrast gain (CG), and receiver operating characteristic (ROC) curve are commonly used in the field of infrared small target detection [40]. In this article, BSF [41,42], CG [43,44], and ROC [45] curves are used to measure the performance of target enhancement and background suppression.
BSF and CG are used to measure the algorithm’s ability to suppress background clutter and enhance target saliency, respectively. If the values of BSF and CG are higher, the detection performance is better. They are defined as follows:
BSF = \frac{\sigma_{in}}{\sigma_{out}}    (30)
CG = \frac{\left( T_{\max\_gray} - B_{mean\_gray} \right)_{out}}{\left( T_{\max\_gray} - B_{mean\_gray} \right)_{in}}    (31)
where σ_in and σ_out are the standard deviations of the whole background intensity in the original infrared image and in the final output image, respectively. When calculating the BSF, the target area does not participate in the calculation. T_max_gray denotes the maximum grayscale value of the target in the infrared image, and B_mean_gray denotes the average grayscale value of the target's neighborhood background, i.e., the background area within 10 pixels of the target edge. It is worth noting that, before calculating the metrics, the input and output images of all methods are normalized to [0, 1].
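For reference, a short Python sketch of the two metrics, assuming boolean masks target_mask (target pixels) and neigh_mask (background within 10 pixels of the target edge) and images normalized to [0, 1]:

import numpy as np

def bsf(img_in, img_out, target_mask):
    """BSF: ratio of background standard deviations; the target area is
    excluded from both."""
    bg = ~target_mask
    return img_in[bg].std() / max(img_out[bg].std(), 1e-12)

def cg(img_in, img_out, target_mask, neigh_mask):
    """CG: output local contrast over input local contrast."""
    c_in = img_in[target_mask].max() - img_in[neigh_mask].mean()
    c_out = img_out[target_mask].max() - img_out[neigh_mask].mean()
    return c_out / max(c_in, 1e-12)

The small epsilon guards (our addition) avoid division by zero on perfectly flat outputs.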
In the quantitative analysis experiments, all algorithms used the target detection images in Figure 11, which were processed without threshold segmentation. Table 3 shows the evaluation metrics of all the algorithms on the nine scenes. Compared with the eight baseline algorithms, the proposed algorithm has the highest average BSF on seven scenes and ranks second on scenes (6) and (9), and it has the maximum average CG on all nine scene image sequences. The results indicate that the proposed algorithm exhibits better clutter suppression and target enhancement than the other eight algorithms.
The ROC curve represents the performance of a small target detection method on image sequences: the closer the ROC curve is to the upper-left corner, the better the performance of the detection method. The vertical axis of the ROC curve is the true positive rate (TPR), and the horizontal axis is the false positive rate (FPR). TPR and FPR are defined as follows:
TPR = \frac{TP}{TP + FN}    (32)
FPR = \frac{FP}{FP + TN}    (33)
TP is the number of pixels whose detection results belong to the true target; FN refers to the number of pixels that belong to the true target but are considered to be the background in the detection results; FP is the number of pixels that belong to the true background but are considered to be the target in the detection results; and TN is the number of pixels whose detection results belong to the true background.
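One ROC operating point can be computed as below (a sketch with our own naming); sweeping the threshold over the normalized detection map traces the full curve:

import numpy as np

def roc_point(det_map, gt_mask, thresh):
    """Segment the detection map at `thresh` and compare it against the
    ground-truth target mask to obtain one (TPR, FPR) pair."""
    pred = det_map >= thresh
    tp = np.count_nonzero(pred & gt_mask)
    fn = np.count_nonzero(~pred & gt_mask)
    fp = np.count_nonzero(pred & ~gt_mask)
    tn = np.count_nonzero(~pred & ~gt_mask)
    return tp / (tp + fn), fp / (fp + tn)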
The ROC curves of the nine scene image sequences are shown in Figure 12. At the same FPR, a larger TPR indicates better performance. In scene (1) of Figure 12, although the curve of MPCM is slightly higher than that of the proposed algorithm at first, the detection rate of the proposed algorithm quickly exceeds that of MPCM and remains higher thereafter. It is worth noting that some baseline HVS algorithms perform poorly in scenes (2), (5), (8), and (9) of Figure 12, where the target is smaller than 2 × 2 pixels. As analyzed above and mentioned in the introduction, the local contrast of targets smaller than 2 × 2 pixels cannot be measured effectively through a rectangular window, which increases false alarms under low-threshold segmentation and makes it easy to miss targets under high-threshold segmentation. In scenes (2)–(9) of Figure 12, the ROC curves of the proposed algorithm are closer to the upper-left corner, which indicates that our algorithm has good detection performance in various complex scenes.

3.2. Analysis of Engineering Experiment Results

The proposed algorithm has been implemented on an image board. The main control chip of the image board is a Xilinx FPGA, which provides abundant logic resources and peripheral interfaces and features parallel execution and low latency. With these features, the image board achieves single-frame target detection. To facilitate visual observation, the image board also outputs 3G-SDI images, with the center of the image as the origin (0, 0) of the two-dimensional coordinate system.
To verify the detection ability of the image board in practical applications, target detection experiments were conducted indoors and outdoors. The indoor target detection experimental platform is shown in Figure 13. Device (1) is a mid-wave infrared camera with a cooler, with a resolution of 640 × 512 and inverted image output. Device (2) is an electric soldering iron that simulates the sun or high-brightness clutter. Device (3) is a uniform heat source. Device (4) is the Camera Link cable. Device (5) is the image board. Device (6) is the DC power supply. Device (7) is the SDI monitor. The SDI image in Figure 14 can be summarized as follows:
  • The circular target occupies 68 pixels and has approximate grayscale values.
  • The target is precisely detected under the simulated near-sun condition.
Outdoors, target detection experiments were conducted on a UAV carrying a halogen lamp in the sky, using the target detection devices shown in Figure 13. The distance between the mid-wave infrared camera and the UAV was 1.2 km. The UAV and the halogen lamp are shown in Figure 14.
The SDI image in Figure 15 can be summarized as follows:
  • The UAV is a point target occupying four pixels in the infrared image.
  • The UAV is precisely detected against complex backgrounds such as buildings, trees, and sky.

4. Conclusions

This article proposed a precision detection algorithm for infrared small targets based on a non-window structure of local grayscale descent intensity (LGDI) and local gradient watershed (LGW). The algorithm breaks the limitation of rectangular windows in HVS methods, effectively suppresses interfering targets and irregular clutter, improves the saliency of the target, and can detect a 1 × 1 pixel target. The target coordinate accuracy obtained using the proposed precision calculation method reaches the sub-pixel level. The simulation results on nine ground-to-air infrared image sequences with complex backgrounds show that, compared with eight baseline algorithms, the proposed algorithm has better stability and effectiveness in various complex backgrounds and exhibits obvious advantages in multiple evaluation metrics. The results of two engineering experiments show that a circular target with uniform temperature is accurately detected under simulated near-sun conditions and that a UAV point target is accurately detected against complex backgrounds such as buildings, trees, and sky. The algorithm can detect targets earlier and locate them more accurately. Therefore, it has significant advantages in applications such as long-range IRST systems, long-range electro-optical search and tracking systems, and precision guidance systems. In future work, we will further investigate the detection of infrared targets with this algorithm against sea-surface backgrounds.

Author Contributions

X.D. proposed the original idea, performed the experiments, and wrote the manuscript. H.J., Y.S. and K.D. contributed to the content, writing, and revising of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key Research and Development Program of the Ministry of Science and Technology (2022YFB3902505) and the Major Science and Technology Special Project of Jilin Province (20230301001GX).

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank Junpeng Zhou, who is a student from Changchun University of Science and Technology, for controlling the UAV flight; we are also grateful to the Changchun University of Science and Technology Institute of Space Optoelectronic Technology for providing equipment such as the camera, heat source, and UAV.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Scene (1), scene (2), and scene (3) in Table 1 were captured using a medium-wave infrared camera. Scene (4) in Table 1 is derived from the publicly released "Single-frame InfraRed Small Target (SIRST)" dataset of Yimian Dai et al. Scenes (5), (6), and (7) in Table 1 are sourced from the publication of Bingwei Hui et al., "A dataset for infrared image dim-small aircraft target detection and tracking under ground/air background". Scenes (8) and (9) in Table 1 are sourced from the publicly available dataset of Xiaoliang Sun et al., "A dataset of semi-synthetic detection for small infrared moving targets under complex backgrounds".

References

  1. Qu, H.M.; Cao, D.; Zheng, Q.; Li, Y.-Y.; Chen, Q. Accuracy test and analysis for infrared search and track system. Optik 2013, 124, 2313–2317. [Google Scholar] [CrossRef]
  2. Zhao, M.J.; Li, W.; Li, L.; Hu, J.; Ma, P.; Tao, R. Single-frame IR small-target detection: A survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 87–119. [Google Scholar] [CrossRef]
  3. Malanowski, M.; Kulpa, K. Detection of moving targets with continuous-wave noise radar: Theory and measurements. IEEE Trans. Geosci. Remote Sens. 2012, 50, 3502–3509. [Google Scholar] [CrossRef]
  4. Philip Chen, C.L.; Li, H.; Wei, Y.; Xia, T.; Tang, Y.Y. A local contrast method for small infrared target detection. IEEE Trans. Geosci. Remote Sens. 2013, 52, 574–581. [Google Scholar] [CrossRef]
  5. Bai, X.Z.; Bi, Y.G. Derivative entropy-based contrast measure for infrared small-target detection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2452–2466. [Google Scholar] [CrossRef]
  6. Ma, J.Y.; Yu, W.; Liang, P.W.; Li, C.; Jiang, J. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26. [Google Scholar] [CrossRef]
  7. Ma, J.Y.; Ma, Y.; Li, C. Infrared and visible image fusion methods and applications: A survey. Inf. Fusion 2019, 45, 153–178. [Google Scholar] [CrossRef]
  8. Deng, H.; Sun, X.P.; Liu, M.L.; Ye, C.; Zhou, X. Small Infrared Target Detection Based on Weighted Local Difference Measure. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4204–4214. [Google Scholar] [CrossRef]
  9. Nie, J.Y.; Qu, S.C.; Wei, Y.T.; Zhang, L.; Deng, L. An Infrared Small Target Detection Method Based on Multiscale Local Homogeneity Measure. Infrared Phys. Technol. 2018, 90, 186–194. [Google Scholar] [CrossRef]
  10. Gao, C.Q.; Meng, D.Y.; Yang, Y.; Wang, Y.; Zhou, X.; Hauptmann, A.G. Infrared patch-image model for small target detection in a single image. IEEE Trans. Image Process. 2013, 22, 4996–5009. [Google Scholar] [CrossRef]
  11. Qin, Y.; Li, B. Effective infrared small target detection utilizing a novel local contrast method. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1890–1894. [Google Scholar] [CrossRef]
  12. Shao, X.P.; Fan, H.; Lu, G.X.; Xu, J. An improved infrared dim and small target detection algorithm based on the contrast mechanism of human visual system. Infrared Phys. Technol. 2012, 55, 403–408. [Google Scholar] [CrossRef]
  13. Chen, F.J.; Bian, C.J.; Meng, X. Infrared small target detection using Homogeneity-weighted local patch saliency. Infrared Phys. Technol. 2023, 133, 104811. [Google Scholar] [CrossRef]
  14. Liang, X.J.; Liu, L.L.; Luo, M.; Yan, Z.; Xin, Y. Robust infrared small target detection using Hough line suppression and rank-hierarchy in complex backgrounds. Infrared Phys. Technol. 2022, 120, 103893. [Google Scholar] [CrossRef]
  15. Zhou, F.; Wu, Y.Q.; Dai, Y.M.; Ni, K. Robust infrared small target detection via jointly sparse constraint of l1/2-metric and dual-graph regularization. Remote Sens. 2020, 12, 1963. [Google Scholar] [CrossRef]
  16. Yang, L.; Yang, J.; Yang, K. Adaptive Detection for Infrared Small Target Under Sea-Sky Complex Background. Electron. Lett. 2004, 40, 1083–1085. [Google Scholar] [CrossRef]
  17. Deshpande, S.Y.; Er, H.M.; Ronda, V.; Chan, P. Max-Mean and Max-Median Filters for Detection of Small Targets. Signal Data Process. Small Targets 1999, 3809, 74–83. [Google Scholar]
  18. Bae, T.W. Small Target Detection Using Bilateral Filter and Temporal Cross Product in Infrared Images. Infrared Phys. Technol. 2011, 54, 403–411. [Google Scholar] [CrossRef]
  19. Zhao, J.F.; Feng, H.J.; Xu, Z.H.; Li, Q.; Peng, H. Real-time automatic small target detection using saliency extraction and morphological theory. Opt. Laser Technol. 2013, 47, 268–277. [Google Scholar] [CrossRef]
  20. Cao, Y.; Liu, R.M.; Yang, J. Small Target Detection Using Two-Dimensional Least Mean Square (TDLMS) Filter Based on Neighborhood Analysis. Int. J. Infrared Millim. Waves 2008, 29, 188–200. [Google Scholar] [CrossRef]
  21. Zhang, Y.X.; Li, L.; Xin, Y.H. Infrared Small Target Detection Based on Adaptive Double-Layer TDLMS Filter. Acta Photonica Sin. 2019, 48, 0910001. [Google Scholar] [CrossRef]
  22. Zhang, T.F.; Wu, H.; Liu, Y.H.; Peng, L.; Yang, C.; Peng, Z. Infrared small target detection based on non-convex optimization with lp-norm constraint. Remote Sens. 2019, 11, 559. [Google Scholar] [CrossRef]
  23. Zhou, F.; Wu, Y.Q.; Dai, Y.M.; Wang, P. Detection of small target using schatten 1/2 quasinorm regularization with reweighted sparse enhancement in complex infrared scenes. Remote Sens. 2019, 11, 2058. [Google Scholar] [CrossRef]
  24. Zhang, L.D.; Peng, Z.M. Infrared small target detection based on partial sum of the tensor nuclear norm. Remote Sens. 2019, 11, 382. [Google Scholar] [CrossRef]
  25. Liu, D.P.; Cao, L.; Li, Z.Z.; Liu, T.; Che, P. Infrared small target detection based on flux density and direction diversity in gradient vector field. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 2528–2554. [Google Scholar] [CrossRef]
  26. Wei, Y.T.; You, X.G.; Li, H. Multiscale patch-based contrast measure for small infrared target detection. Pattern Recognit. 2016, 58, 216–226. [Google Scholar] [CrossRef]
  27. Han, J.H.; Liang, K.; Zhou, B.; Zhu, X.; Zhao, J.; Zhao, L. Infrared small target detection utilizing the multiscale relative local contrast measure. IEEE Geosci. Remote Sens. Lett. 2018, 15, 612–616. [Google Scholar] [CrossRef]
  28. Han, J.H.; Yu, Y.; Liang, K.; Zhang, H. Infrared small-target detection under complex background based on subblock-level ratio-difference joint local contrast measure. Opt. Eng. 2018, 57, 103105. [Google Scholar] [CrossRef]
  29. Han, J.H.; Moradi, S.; Faramarzi, I.; Liu, C.; Zhang, H.; Zhao, Q. A Local Contrast Method for Infrared Small-Target Detection Utilizing a Tri-Layer Window. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1822–1826. [Google Scholar] [CrossRef]
  30. Moradi, S.; Moallem, P.; Sabahi, M.F. Fast and robust small infrared target detection using absolute directional mean difference algorithm. Signal Process. 2020, 177, 107727. [Google Scholar] [CrossRef]
  31. Han, J.H.; Moradi, S.; Faramarzi, I.; Zhang, H.; Zhao, Q.; Zhang, X.; Li, N. Infrared small target detection based on the weighted strengthened local contrast measure. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1670–1674. [Google Scholar] [CrossRef]
  32. Zhang, Y.Y.; Zheng, Y.; Li, X.H. Multi-Scale Strengthened Directional Difference Algorithm Based on the Human Vision System. Sensors 2022, 22, 10009. [Google Scholar] [CrossRef]
  33. Huang, S.Q.; Liu, Y.H.; He, Y.M.; Zhang, T.; Peng, Z. Structure-adaptive clutter suppression for infrared small target detection: Chain-growth filtering. Remote Sens. 2020, 12, 47. [Google Scholar] [CrossRef]
  34. Zhang, Z.Y.; Pan, N.H.; Zhao, T.Y. Calculation method of expected sharpness value for region of interest in ventral subregion image based on standard deviation-weighted gaussian filter function and multidirectional sobel operator. Laser Optoelectron. Prog. 2024, 61, 385–393. [Google Scholar]
  35. Xia, G.S.; He, Q.; Wei, Y.C.; Mo, D.H. Edge detection system based on ethernet and FPGA image processing. Instrum. Tech. Sens. 2023, 10, 56–59. [Google Scholar]
  36. Gauch, J.M. Image segmentation and analysis via multiscale gradient watershed hierarchies. IEEE Trans. Image Process. 1999, 8, 69–79. [Google Scholar] [CrossRef] [PubMed]
  37. Dai, Y.M.; Wu, Y.Q.; Zhou, F.; Barnard, K. Asymmetric Contextual Modulation for Infrared Small Target Detection. IEEE Winter Conf. Appl. Comput. Vis. WACV 2021, 7, 949–958. [Google Scholar]
  38. Hui, B.W.; Song, Z.Y.; Fan, H.Q.; Zhong, P.; Hu, W.D.; Zhang, X.F.; Ling, J.G.; Su, H.Y.; Jin, W.; Zhang, Y.J.; et al. A dataset for infrared image dim-small aircraft target detection and tracking under ground/air background. China Sci. Data 2020, 5, 291–302. [Google Scholar]
  39. Sun, X.L.; Guo, L.C.; Zhang, W.L.; Wang, Z.; Hou, Y.J.; Teng, X.C. A dataset of semi-synthetic detection for small infrared moving targets under complex backgrounds. China Sci. Data 2024, 9, 315–331. [Google Scholar] [CrossRef]
  40. Qiu, Z.B.; Ma, Y.; Fan, F.; Huang, J.; Wu, M. Adaptive scale patch-based contrast measure for dim and small infrared target detection. IEEE Geosci. Remote Sens. Lett. 2022, 19, 7000305. [Google Scholar] [CrossRef]
  41. Zhang, P.; Zhang, L.Y.; Wang, X.Y.; Shen, F.; Pu, T.; Fei, C. Edge and corner awareness-based spatial–temporal tensor model for infrared small-target detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 10708–10724. [Google Scholar] [CrossRef]
  42. Han, J.H.; Ma, Y.; Zhou, B.; Fan, F.; Liang, K.; Fang, Y. A robust infrared small target detection algorithm based on human visual system. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2168–2172. [Google Scholar]
  43. Nasiri, M.; Chehresa, S. Infrared small target enhancement based on variance difference. Infrared Phys. Technol. 2017, 82, 107–119. [Google Scholar] [CrossRef]
  44. Gao, C.Q.; Wang, L.; Xiao, Y.X.; Zhao, Q.; Meng, D. Infrared small-dim target detection based on Markov random field guided noise modeling. Pattern Recognit. 2018, 76, 463–475. [Google Scholar] [CrossRef]
  45. Guan, X.W.; Peng, Z.M.; Huang, S.Q.; Chen, Y. Gaussian scale-space enhanced local contrast measure for small infrared target detection. IEEE Geosci. Remote Sens. Lett. 2019, 17, 327–331. [Google Scholar] [CrossRef]
Figure 1. Original infrared image in a ground-to-air scene.
Figure 2. Original infrared image containing a target whose grayscale values are close to those of the surrounding background.
Figure 3. The flowchart of the proposed infrared small target detection algorithm.
Figure 4. Grayscale values, Gaussian high-pass filtered values, and gradient values of the HB (high-brightness background), TT (true target), IT (interfering target), and BE (background edge) regions in Figure 1: (a1) HB area of the infrared image; (a2) TT area of the infrared image; (a3) IT area of the infrared image; (a4) BE area of the infrared image; (b) 3D map of (a); (c) 3D map of the Gaussian high-pass filter values for (a); (d) gradient distribution of (a).
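For readers reproducing the feature maps shown in Figures 4 and 5, a minimal sketch is given below. It assumes the high-pass response is obtained by subtracting a Gaussian-smoothed copy from the original image and that the gradient maps are Sobel magnitudes; the smoothing parameter sigma is a placeholder, since the paper's exact filter settings are not restated in the captions.

```python
import numpy as np
from scipy import ndimage

def gaussian_highpass(img: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """High-pass response: original minus a Gaussian low-pass copy (sigma is a placeholder)."""
    img = img.astype(np.float64)
    return img - ndimage.gaussian_filter(img, sigma=sigma)

def gradient_magnitude(img: np.ndarray) -> np.ndarray:
    """Sobel gradient magnitude, one common way to obtain the gradient maps."""
    img = img.astype(np.float64)
    gx = ndimage.sobel(img, axis=1)  # horizontal derivative
    gy = ndimage.sobel(img, axis=0)  # vertical derivative
    return np.hypot(gx, gy)
```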
Figure 5. Grayscale values, Gaussian high-pass filtered values, and gradient values of TT and BE in Figure 2: (a1) TT area of the infrared image; (a2) BE area of the infrared image; (b) 3D map of (a); (c) 3D map of the Gaussian high-pass filter values for (a); (d) gradient distribution of (a).
Figure 6. The center position path of each candidate target (CT) point: (a) the grayscale values of the CT and its neighborhood background in Figure 5(a1); (b) the grayscale values of the CT and its neighborhood background in Figure 5(a2).
Figure 7. Grayscale descent path at the center position of the CT: (a) the grayscale values of TT and its neighborhood background in Figure 5(a1); (b) the grayscale values of BE and its neighborhood background in Figure 5(a2).
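Figure 7 contrasts the grayscale descent behavior of a true target with that of a background edge. As an illustrative sketch of this idea (not the authors' LGDI algorithm, whose exact construction is defined in the methods section), the code below measures how far grayscale values descend monotonically along the eight principal directions from a candidate center: a point-like target descends in all directions, whereas an edge keeps its brightness along the edge direction.

```python
import numpy as np

# 8 principal directions as (row, col) steps
DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def descent_lengths(img: np.ndarray, r: int, c: int, max_steps: int = 4) -> list[int]:
    """Monotone grayscale-descent length in each of 8 directions from (r, c).

    Illustrative only: max_steps bounds the search radius and is an assumption.
    """
    img = img.astype(np.float64)
    lengths = []
    for dr, dc in DIRS:
        steps, prev = 0, img[r, c]
        rr, cc = r + dr, c + dc
        # walk outward while grayscale keeps strictly decreasing
        while (0 <= rr < img.shape[0] and 0 <= cc < img.shape[1]
               and img[rr, cc] < prev and steps < max_steps):
            prev = img[rr, cc]
            rr, cc, steps = rr + dr, cc + dc, steps + 1
        lengths.append(steps)
    return lengths
```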
Figure 8. LGW (local gradient watershed) at the center position of the CT: (a) the gradient values of TT and its neighborhood background in Figure 5(a1); (b) the gradient values of BE and its neighborhood background in Figure 5(a2).
Figure 9. Filtering response of TT, IT, and BE in Figure 1: (a1) TT intensity; (a2) TT gradient; (b1) IT intensity; (b2) IT gradient; (c1) BE intensity; (c2) BE gradient.
Figure 10. The coordinates of the true target: (a) the coordinates of the true target in Figure 1; (b) the coordinates of the true target in Figure 2.
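The sub-pixel target coordinates shown in Figure 10 are produced by an adaptive threshold followed by a centroid computation over the detected target area. A minimal sketch of such a grayscale-weighted centroid is given below; the mean-plus-k-sigma threshold is a stand-in assumption, not the paper's exact adaptive rule.

```python
import numpy as np

def subpixel_centroid(patch: np.ndarray, k: float = 3.0) -> tuple[float, float]:
    """Grayscale-weighted centroid of above-threshold pixels in a target patch.

    The mean + k*std threshold is an assumed stand-in for the adaptive threshold.
    Returns (row, col) with sub-pixel precision.
    """
    patch = patch.astype(np.float64)
    t = patch.mean() + k * patch.std()
    mask = patch > t
    if not mask.any():  # fall back to the brightest pixel
        r, c = np.unravel_index(np.argmax(patch), patch.shape)
        return float(r), float(c)
    w = (patch - t) * mask           # background-subtracted weights
    rows, cols = np.indices(patch.shape)
    return (w * rows).sum() / w.sum(), (w * cols).sum() / w.sum()
```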
Figure 11. Target detection results of 9 algorithms on the infrared images of nine scenes.
Figure 12. ROC curves of 9 algorithms on scenes (1)–(9) in Figure 11.
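The ROC curves in Figure 12 plot detection probability against false alarm rate as a segmentation threshold sweeps over each filter's response map. The sketch below shows one common way to compute such curve points; the rule that a detection within a small radius of the ground-truth coordinate counts as a hit is an assumption and may differ from the paper's matching criterion.

```python
import numpy as np

def roc_points(response: np.ndarray, gt: list, radius: float = 2.0, n_thresh: int = 50):
    """(Fa, Pd) pairs for one frame; gt is a list of ground-truth (row, col) tuples."""
    response = response.astype(np.float64)
    thresholds = np.linspace(response.min(), response.max(), n_thresh)
    points = []
    for t in thresholds:
        det = np.argwhere(response > t)  # candidate detections at this threshold
        hits = sum(
            any(np.hypot(r - gr, c - gc) <= radius for (r, c) in det)
            for (gr, gc) in gt
        )
        false_det = [
            (r, c) for (r, c) in det
            if all(np.hypot(r - gr, c - gc) > radius for (gr, gc) in gt)
        ]
        pd = hits / max(len(gt), 1)            # detection probability
        fa = len(false_det) / response.size    # false alarms per pixel
        points.append((fa, pd))
    return points
```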
Figure 13. Indoor experiment: target detection results on the image board.
Figure 14. UAV with a mounted halogen lamp, and an image of the UAV in the sky.
Figure 15. Outdoor experiment: target detection results on the image board.
Table 1. Details of the 9 test scenes. A detailed description of each scene is given in Appendix A.

| Scene | Frame Number | Target Size | Image Size (Pixel) | Background Description |
|---|---|---|---|---|
| (1) | 5 | 2 × 3~3 × 3 | 128 × 128 | Ground background, buildings, strong edges, interfering target |
| (2) | 56 | 1 × 2~3 × 3 | 128 × 128 | Sky background, heavy clouds, light clouds, and irregular clouds |
| (3) | 12 | 2 × 2~3 × 3 | 128 × 128 | Complex background, irregular clouds, buildings |
| (4) | 1 | 4 × 4 | 239 × 256 | Complex background, buildings, irregular clouds |
| (5) | 300 | 1 × 2~2 × 3 | 256 × 256 | Complex background, grasslands, land, trees, road, interfering target |
| (6) | 300 | 2 × 2~4 × 5 | 256 × 256 | Ground background, grasslands, trees, land, interfering target |
| (7) | 600 | 1 × 2~3 × 3 | 256 × 256 | Complex background, mountains, trees, grasslands, land, interfering target |
| (8) | 40 | 1 × 1~3 × 3 | 640 × 512 | Complex background, river, trees, sky, irregular buildings |
| (9) | 300 | 1 × 1~3 × 3 | 640 × 512 | Sky background, complex clouds |
Table 2. Parameter settings for the comparison algorithms.

| No. | Algorithm | Parameter Settings |
|---|---|---|
| 1 | Top-hat [19] | Structuring element size: 5 × 5 |
| 2 | PSTNN [24] | Patch size = 40; slide step = 40; $\lambda = 0.6/\sqrt{\max(n_1, n_2) \times n_3}$; $\varepsilon = 10^{-7}$ |
| 3 | MLCM [4] | Local window size: N = 3, 5, 7, 9 |
| 4 | MPCM [26] | Local window size: N = 3, 5, 7, 9 |
| 5 | RLCM [27] | k1 = [2, 5, 9]; k2 = [4, 9, 16] |
| 6 | TLLCM [29] | G = [1/16, 1/8, 1/16; 1/8, 1/4, 1/8; 1/16, 1/8, 1/16]; local window size: N = 3, 5, 7, 9 |
| 7 | ADMD [30] | Local window size: N = 3, 5, 7, 9 |
| 8 | WSLCM [31] | Local window size: N = 7, 9, 11 |
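As a concrete illustration of the Table 2 settings, a minimal sketch of the Top-hat baseline with a 5 × 5 structuring element follows. The square element shape and the OpenCV white top-hat implementation are assumptions, since Table 2 specifies only the size.

```python
import cv2
import numpy as np

def tophat_response(img: np.ndarray) -> np.ndarray:
    """White top-hat with a 5x5 element, matching the size given in Table 2."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    return cv2.morphologyEx(img.astype(np.float32), cv2.MORPH_TOPHAT, kernel)
```

The resulting response map can then be thresholded in the same way as the other filters when generating the ROC curves of Figure 12.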
Table 3. Average BSF and CG of all the algorithms on the 9 scenes.

| Metric | Scene | Top-hat | PSTNN | MLCM | MPCM | RLCM | TLLCM | ADMD | WSLCM | Proposed |
|---|---|---|---|---|---|---|---|---|---|---|
| BSF | (1) | 0.1319 | 1.1412 | 0.4167 | 0.3970 | 0.2093 | 0.3453 | 0.3277 | 0.8058 | 1.2868 |
| BSF | (2) | 0.5033 | 1.0554 | 0.6308 | 1.0590 | 0.7648 | 1.3231 | 1.2248 | 2.9684 | 4.1692 |
| BSF | (3) | 0.2643 | 1.0817 | 0.5208 | 0.6541 | 0.3722 | 0.6262 | 0.6898 | 1.4445 | 2.2019 |
| BSF | (4) | 1.1729 | 6.6360 | 1.0357 | 2.7564 | 2.2310 | 4.4878 | 4.3126 | 7.4265 | 8.2999 |
| BSF | (5) | 0.9550 | 2.9437 | 0.8790 | 4.3504 | 1.8865 | 3.9286 | 4.7129 | 9.0498 | 11.4142 |
| BSF | (6) | 1.4025 | 9.3873 | 1.6599 | 4.2905 | 4.4090 | 5.2247 | 6.3937 | 21.9804 | 20.9017 |
| BSF | (7) | 2.4759 | 8.1856 | 1.6289 | 6.2315 | 3.9285 | 5.4106 | 7.1858 | 13.3868 | 21.9069 |
| BSF | (8) | 2.4225 | 9.9176 | 1.6819 | 11.5507 | 7.9404 | 13.3491 | 18.1715 | 32.3202 | 36.6030 |
| BSF | (9) | 3.0343 | 16.0856 | 3.8622 | 12.8567 | 12.2055 | 17.9148 | 18.2355 | 67.4762 | 66.4039 |
| CG | (1) | 5.1274 | 5.3791 | 1.2865 | 5.3788 | 4.5710 | 5.3788 | 5.3855 | 5.3920 | 5.3933 |
| CG | (2) | 3.7521 | 3.8000 | 1.4225 | 3.5365 | 3.1365 | 3.7560 | 3.5324 | 3.6117 | 3.8032 |
| CG | (3) | 4.0160 | 4.1135 | 1.3398 | 4.1017 | 3.3436 | 4.1068 | 4.1086 | 4.1153 | 4.1206 |
| CG | (4) | 3.1757 | 3.5660 | 1.4187 | 3.8880 | 3.5994 | 2.6167 | 3.8916 | 3.9022 | 3.9066 |
| CG | (5) | 1.8675 | 2.0122 | 1.0177 | 0.9044 | 0.9966 | 1.3907 | 0.9219 | 0.2507 | 2.0394 |
| CG | (6) | 1.6938 | 1.8440 | 1.0404 | 1.8419 | 1.5864 | 1.8456 | 1.8464 | 1.8480 | 1.8494 |
| CG | (7) | 1.4734 | 1.5380 | 1.0163 | 1.5374 | 1.1585 | 1.5399 | 1.5391 | 1.5051 | 1.5431 |
| CG | (8) | 1.6102 | 1.7462 | 1.2280 | 0.2541 | 1.3652 | 0.8816 | 0.2168 | 1.1641 | 1.7541 |
| CG | (9) | 1.6527 | 1.7111 | 1.0730 | 0.1555 | 1.2209 | 0.9613 | 0.2856 | 1.1630 | 1.7115 |
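For reproducibility, the two metrics in Table 3 can be computed as sketched below, assuming the common definitions: BSF as the ratio of background standard deviation before and after filtering, and CG as the ratio of output to input target-background contrast. The paper may restrict the background to a local neighborhood around the target; the whole-image background used here is a simplifying assumption.

```python
import numpy as np

def bsf(img_in: np.ndarray, img_out: np.ndarray, target_mask: np.ndarray) -> float:
    """Background suppression factor: background std before / after filtering."""
    bg = ~target_mask
    s_in = img_in[bg].astype(np.float64).std()
    s_out = img_out[bg].astype(np.float64).std()
    return s_in / max(s_out, 1e-12)

def cg(img_in: np.ndarray, img_out: np.ndarray, target_mask: np.ndarray) -> float:
    """Contrast gain: output target/background contrast over input contrast."""
    bg = ~target_mask
    c_in = abs(img_in[target_mask].astype(np.float64).mean()
               - img_in[bg].astype(np.float64).mean())
    c_out = abs(img_out[target_mask].astype(np.float64).mean()
                - img_out[bg].astype(np.float64).mean())
    return c_out / max(c_in, 1e-12)
```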