1. Introduction
Because of abundant spectral information, hyperspectral imagery (HSI) has the potential to discover the subtle differences of ground materials that cannot be visually inspected in a multispectral image [
1]. As such, it has gained widespread attention in the fields of geology, the military, the mining industry, and medical imaging, among others [
2,
3,
4]. Anomaly detection, which is one of the most popular branches in hyperspectral image processing, is capable of uncovering many masked targets of interest without a priori spectral knowledge, and as such, it generally conforms to practical conditions, and has gradually been considered very effective and useful in HSI [
5,
6,
7,
8].
The Reed–Xiaoli detector (RXD) [
9] is a widely used anomaly detection algorithm, and is regarded as a benchmark algorithm in HSI [
10,
11,
12]. Derived from the generalized likelihood ratio test (CFAR), it finds abnormal points by calculating the Mahalanobis distance between the background and the pixel under test (PUT) in a multivariate Gaussian background. However, the RXD algorithm merely utilizes the low-order linear statistics of hyperspectral data, thus leading to inferior detection accuracy in complex and changeable ground distributions. To address this issue, kernel-RXD (KRXD), a kernel version of the RXD algorithm, is presented by Kwon and Nasrabadi [
13]. KRXD exhaustively mines the high-order correlation between spectral bands via a kernel function. It can vastly improve the detection accuracy as compared to the RXD when original data are mixed in a non-linear model, which is always the case. Unfortunately, the data kernelization in KRXD produces lots of multiplications and the inversion of matrices, thereby consuming a lot of processing time.
Recently, real-time anomaly detection has gained wide attention in HSI [
14,
15,
16,
17,
18,
19,
20,
21]. It is particularly significant, as some moving objects are highly desirable to be detected on a timely basis. Furthermore, with the huge pressure of big data storage, it becomes the inevitable trend to perform anomaly detection in real time. It is worth noting that the realization of real-time processing is mainly dependent on the three data acquisition formats [
17,
18].
The first such format is the band interleaved pixel (BIP) format, which performs anomaly detection pixel by pixel. Two real-time algorithms based on the RXD, the global real-time causal RX detector (GRTC-RXD), and the local real-time causal RX detector (LRTC-RXD), are discussed in [
19,
20] in terms of BIP, respectively. Both real-time algorithms solve the problem of causality (the one requirement of real-time processing), and only the data samples already visited can be used to detect PUT; no future data samples should be involved in the data processing [
21]. They also enhance the execution efficiency by recursively updating the covariance matrices via the Woodbury matrix identity [
22]. While the GRTC-RXD utilizes all processed data before the PUT as the background, LRTC-RXD regards the data contained in sliding causal array windows as the background to perform real-time detection. As these algorithms usually have an undesirable detection output by using the low-order statistics of the HSI dataset, a real-time algorithm based on the KRXD to yield better detection accuracy with much shorter processing time has been recently proposed [
23]. The second such format is the band sequential (BSQ) format, which processes data samples band by band. According to the BSQ format, progressive band processing anomaly detection (PBP-AD) could be implemented band by band progressively while the data acquisition is still ongoing [
24]. The third and final such format is the band interleaved line (BIL) format, which collects data line by line. Since acquiring data with a push broom scanner has gradually become the mainstream for hyperspectral imaging sensors [
25,
26], the literature [
26] presents a real-time anomaly detector based on the BIL format. By using a linear algebra-based strategy, it effectively updates the inverse covariance, thus avoiding complex computation. The renewal process of correlation matrices only separately considers the data going in and out of the dual window. Yet, it does not re-derive the covariance matrix by treating these data and the duplicate data as a whole, leading to the loss of some information in recursion. Additionally, the low detection accuracy is still a challenge, as it is based on the RXD.
Therefore, to tackle the aforementioned issues, this paper presents an approach to the progressive line processing of KRXD (PLP-KRXD) via BIL. To ensure the causality of the detection system, a new type of local window (parallel causal sliding window) is defined. In this case, hyperspectral data can be collected line by line progressively during the process of line acquisition. As the proposed method is updated recursively using the Woodbury matrix identity and the matrix inversion lemma [
27], it avoids having to recalculate the previously processed data lines, thereby significantly reducing computational complexity. Experimental results demonstrate that the PLP-KRXD keeps the detection accuracy unchanged and speeds up the execution efficiency compared to the KRXD.
4. Experimental Results and Analysis
In this section, three groups of HSI datasets, including one synthetic image and two real images, are utilized for experiments to demonstrate the performance of the PLP-KRXD.
4.1. Optimum Kernel Parameter on the PLP-KRXD
This group of experiments is conducted to find the optimum kernel parameter on the PLP-KRXD. According to cross-validation, the size of the causal sliding rectangle windows on the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset are chosen as , , and , respectively.
The receiver operating characteristic (ROC) [
29] curves are employed as one of the frequently used evaluation criteria in anomaly detection. The detection probability
and false alarm rate
can be defined as
The area under the curve (AUC) is employed as another evaluation criterion to reasonably estimate the accuracy of anomaly detection with respect to the ROC analysis. In detail, a larger AUC value represents a higher accuracy of anomaly detection.
Figure 10 illustrates the different AUCs with diverse polynomial kernel parameter
for the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset, respectively. As can be seen in this figure, the AUC of all three HSI datasets reaches a maximum value when
is equal to 2, and then go into a downward trend. For the synthetic dataset and the Pavia Center dataset, the AUC of them has a modest decline, but it declines sharply after reaching the maximum for the San Diego Airport dataset. Hence, the polynomial kernel parameter
is relatively sensitive to the complex hyperspectral dataset. In other words, when the distribution of the data features is simple, the selected
has a weaker effect on the separation of the background and objects.
4.2. Effects from the Changing Window Size on the PLP-KRXD
This group of experiments is to investigate that the detection effect of the PLP-KRXD is related to the size of the causal sliding rectangle windows. In these experiments, and of the causal sliding rectangle window are separately chosen in a range from 12 to 18 with a step of 3 and 5 to 7 with a step of 1 on the three HSI datasets. By means of cross-validation, the polynomial kernel parameter of the synthetic dataset, the Pavia Center dataset and the San Diego Airport dataset are all manually set to 2.
Table 1 reveals the output AUC values of the PLP-KRXD for the three HSI datasets with a changing causal sliding rectangle window size. From
Table 1, the maximum values obtained in the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset are respectively 0.9966, 0.9979, and 0.9458 at the detection windows of
,
, and
. The residuals between the best AUC and the worst AUC of the PLP-KRXD for the three HSI datasets are given in Table 3. Compared to the San Diego Airport dataset, the residuals of the synthetic and Pavia Center datasets are smaller, which means they can weaken the influence resulting from the causal sliding rectangle window size. For the San Diego Airport dataset, due to the distribution of the complex background, its robustness is relatively weaker than the other two HSI datasets.
4.3. Detection Performance of PLP-KRXD
In comparison with the KRXD, several experiments are conducted on the three HSI datasets to demonstrate the effectiveness of PLP-KRXD. In these experiments, by cross-validation, the polynomial kernel parameter for the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset are all set to 2. At the same time, the corresponding causal sliding rectangle window size chosen for these three HSI datasets are set to ,, and , respectively. For the KRXD, the polynomial kernel parameter and the dual window size are chosen as 2 and , respectively.
Figure 11 illustrates the ROC curves of the two detectors for the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset, respectively. The ROC curves of the KRXD and PLP-KRXD correspond to their best AUC. As can be observed for the synthetic dataset in
Figure 11a, the detection precision of the PLP-KRXD is basically the same as that of the KRXD, and it has a higher probability of detection under the lower false alarm probability. For the Pavia Center dataset shown in
Figure 11b, the ROC results of the KRXD and the PLP-KRXD are comparable, even if the KRXD is slightly better than the PLP-KRXD. Both of them can achieve a probability of detection of 1 with a lower false alarm rate. For the San Diego Airport dataset shown in
Figure 11c, the PLP-KRXD and the KRXD are almost identical in detection accuracy. This result is due to the fact that the recursive process of the PLP-KRXD is derived from the KRXD by applying identity without information leaks. Some subtle distinctions occur between the PLP-KRXD and the KRXD because the KRXD needs to implement the pseudoinverse of the Gram matrix for every pixel, while the PLP-KRXD cancels this operation via the causal sliding rectangle window.
Figure 12 presents the detection maps of the KRXD and the PLP-KRXD on the three HSI datasets, respectively. Through comparison, there is no appreciable visual difference between the results of the two anomaly detectors for the synthetic dataset and the Pavia Center dataset. For the San Diego Airport dataset, the detection result obtained by the PLP-KRXD achieves better visual inspection than does the KRXD with regard to the detection maps. In order to make more visualized descriptions for the data and expose more detailed information,
Figure 13 illustrates the three-dimensional (3D) plots of the KRXD and the PLP-KRXD on the three HSI datasets. It can clearly be seen that the 3D results of the KRXD and the PLP-KRXD for each group of HSI datasets are extremely comparable; particularly, the PLP-KRXD has better background suppression and accurately detects the anomalies on the San Diego Airport dataset.
Figure 14 provides the line-varying detection maps produced by the PLP-KRXD as it was implemented in real-time on the synthetic dataset, the Pavia Center dataset and the San Diego Airport dataset, respectively. As time progresses, these progressive detection maps are beneficial to visual inspection at varying levels of background suppression. For example, in the Pavia Center dataset, the first three weak anomalies are detected in a better visual assessment. From
Figure 14j, k, l, however, as the detection process progresses, these weak anomalies are easily overwhelmed by the subsequent strong anomalies and become dimmer than before. Obviously, this phenomenon cannot be observed using the KRXD that obtains its result in the light of the final detection maps.
4.4. Computational Analysis of the PLP-KRXD
Computational analysis is an essential process in real-time processing. Theoretically speaking, the computational complexity is mainly related to the number of multiplication operations. In this paper, it is worth noting that the computational complexity of PLP-KRXD is mainly determined by
and
. Hence, in order to intuitively analyze the time efficiency of PLP-KRXD,
Table 2 presents the computational complexity of the kernel Gram matrix and its inversion in both the KRXD and the PLP-KRXD. As can be seen in
Table 2, by virtue of recursive processing in the PLP-KRXD, the computational complexity of
and
in the KRXD is reduced to
and
from
and
, respectively, where
represents the number of background data sample vectors in the processing window. Note that
must be smaller than
; hence, the computational complexity of the PLP-KRXD is cheaper than that of the KRXD. In addition, it is evident that the computational complexity is not directly linked to processing time [
8]. This is mainly because the PLP-KRXD, using BIL, detects all of the pixels of a line in a one-shot operation while the KRXD, using BIP, processes only one pixel each time.
In order to further perform the concrete comparative analysis and better show the efficiency of the PLP-KRXD, we discuss the total processing time in seconds required by running the KRXD and the PLP-KRXD on the synthetic dataset, the Pavia Center dataset, and the San Diego Airport dataset by averaging five runs, respectively. The computer environments used for experiments are all performed on a 64-bit operating system running on an Intel i7-4770k with a CPU of 3.5 Hz and 16 GB of RAM. As shown in
Table 3, compared with the KRXD, the PLP-KRXD using recursive update equations requires far less computing time on the three HSI datasets. It is approximately 40 times faster than the KRXD on the synthetic dataset, and on the San Diego Airport dataset, it is at least 58 times faster.
5. Discussion
At the theoretical analysis stage, it is clear that the computation of the current kernel Gram matrix inversion
needs previous processed information
by virtue of
Figure 2; therefore, it is necessary to set an initial kernel Gram matrix
. It should be noted that the initial kernel Gram matrix is related to the size of the parallel causal sliding windows. The larger the window size is, the more data sample vectors the initial kernel Gram matrix has. However, a large window size may exceed the time constrains of real-time processing, while small window size easily leads to a singularity in the kernel Gram matrix. It offers a reference for us to determine a proper window size by cross-validation.
In the experiment, the pseudoinverse of the low-rank kernel matrix was used to guarantee the stability of the PLP-KRXD. A common alternative approach for inverting singular and near-singular matrices is to regularize them first [
30]. Therefore, in place of the kernel Gram matrix
, we employ a regularization operation
for some small
to ensure that all eigenvalues can avoid approaching zero.
According to the analysis of the experiments and simulations, the proposed PLP-KRXD algorithm can maintain the detection accuracy unaltered and accelerate the execution efficiency compared to the KRXD. In each step of deriving the update equations, no information is leaked out, thereby retaining the same detection performance as KRXD while performing real-time processing. The PLP-KRXD recursively updates the kernel Gram matrix and its inversion via a Kalman-like recursive equation, thus avoiding the computation of complex matrices. Besides, parallel causal sliding windows, which only include data samples currently being processed without re-processing previous data samples, are adopted to ensure the causality of real-time processing. Although the PLP-KRXD achieves an over 58-fold speedup, it is sometimes still limited for practical applications. Digital Signal Processors (DSPs) and Graphics Processing Units (GPUs) can be taken into account to speed up real-time processing.
6. Conclusions
In this paper, a new real-time anomaly detection of the KRXD algorithm in a BIL fashion is presented. Compared with the KRXD, there are several major advantages that the PLP-KRXD offers. First, the concept of parallel causal sliding windows is developed to ensure the causality of the PLP-KRXD, so that it could be implemented progressively by line in real-time. Second, a new real-time KRXD using BIL via a causal line-varying kernel Gram matrix to detect the anomalies line by line is proposed. Third, weak anomalies can be easily extracted before they are overwhelmed and dominated by the subsequent strong anomaly points, thereby forming data-line varying detection maps. Finally, aiming at the complex computation of the KRXD, this paper takes advantage of the Woodbury matrix identity and the matrix inversion lemma to recursively update the kernel Gram matrix and its inversion to meet the requirement of rapid processing. Through the complexity analysis, for the PLP-KRXD, the computational complexity is significantly reduced and operation efficiency is greatly improved as compared to the KRXD.
Several experiments were conducted on the three HSI datasets to validate the detection performance of the PLP-KRXD algorithm. The experimental results indicate that the PLP-KRXD holds a comparable detection accuracy with the original KRXD and remarkably improves the algorithm’s execution efficiency at the same time.