1. Introduction
Hyperspectral imaging has attracted much attention in remote sensing. It collects the reflected light of hundreds of contiguous, narrow wavelength bands, providing rich spectral information about the materials in the observed scene, and it is a powerful tool in many application fields, such as mineralogy, agriculture, surveillance, space exploration, environmental monitoring, and disease diagnosis [1,2]. However, there is a tradeoff between the spectral and spatial resolutions, which causes the observed signature of each pixel to be a mixture of several materials [3]. Hyperspectral unmixing (HU) techniques are source separation methods that decompose each mixed pixel of a hyperspectral image (HSI) into a set of pure spectral signatures, called endmembers, and their corresponding fractions, called abundance maps. More specifically, the HU process is composed of three tasks: (1) estimating the optimal number of endmembers, (2) extracting the endmember signatures, and (3) estimating the abundance maps [4]. Estimating the optimal number of endmembers, also known as the virtual dimensionality (VD) [5], is crucial because the performance of endmember extraction and abundance estimation depends on it. Most HU algorithms focus mainly on the last two tasks of the HU process, i.e., estimating the endmembers and abundance maps sequentially or simultaneously [6,7], by assuming that the number of endmembers is available in advance.
Hyperspectral mixing and unmixing algorithms can be classified into linear and nonlinear models according to the assumed interaction of light with the observed field of view [4,8,9]. Most existing HU algorithms are based on the linear mixing model (LMM), a straightforward and simple representation of the mixing process. Two physical abundance constraints are typically imposed in linear unmixing: the abundance nonnegativity constraint (ANC) and the abundance sum-to-one constraint (ASC) [10]. The endmember extraction and abundance estimation tasks of the HU process can be solved separately. Estimating the abundance maps is an ill-posed linear inversion problem that can be solved using least-squares-based methods, such as fully constrained least squares (FCLS) [7]. To reduce the accumulated errors of two-stage HU algorithms, statistical HU methods have been proposed, in which HU is treated as a blind source separation (BSS) problem and the endmembers and abundance maps are estimated simultaneously [11].
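To make the LMM concrete, the following minimal Python sketch mixes synthetic spectra and recovers abundances. All sizes, the random endmember matrix, and the use of SciPy's nonnegative least squares as a stand-in for FCLS (with the ASC imposed by simple renormalization rather than the full constrained solver) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
B, p, N = 156, 3, 100                 # hypothetical bands, endmembers, pixels
M = rng.random((B, p))                # hypothetical endmember matrix (spectra as columns)
S_true = rng.dirichlet(np.ones(p), size=N).T          # abundances satisfying ANC and ASC
Y = M @ S_true + 1e-4 * rng.standard_normal((B, N))   # linear mixing with small noise

# Recover abundances pixel by pixel under the ANC via nonnegative least squares,
# then renormalize each column to sum to one as a crude ASC surrogate.
S_est = np.column_stack([nnls(M, Y[:, n])[0] for n in range(N)])
S_est /= S_est.sum(axis=0, keepdims=True)
print("max abundance error:", np.abs(S_est - S_true).max())
```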
Nonnegative matrix factorization (NMF) [12] is a popular method for solving BSS problems and has been widely utilized in the HU process. NMF decomposes a nonnegative data matrix into the product of two nonnegative matrices: a source matrix and a mixing matrix. Although standard and constrained NMF methods have been proposed for blind HU, they are not guaranteed to provide a unique solution, their decomposition results depend on the initialization, and, being iterative, they require high computational costs to reach a solution. In addition, the interpretation and analysis of the resulting NMF matrices are complex.
Because HSIs consist of relatively few prominent features (i.e., endmembers), we can approximate HSIs using low-rank approximation methods to explore those prominent features and their fraction at each pixel. Many low-rank decomposition methods exist, such as singular value decomposition (SVD), QR factorization, NMF, interpolative decomposition (ID), CUR matrix decomposition, and principal component analysis (PCA) [13]. To demonstrate that HSI data are almost low-rank, or can be well approximated by low-rank data, we consider a real HSI dataset known as the Samson dataset [14]. It comprises 95 × 95 pixels with 156 spectral bands and has three reference endmembers (i.e., soil, tree, and water). The Samson data cube can be reshaped into an HSI data matrix of size 156 × 9025, whose rows represent the spectral bands and whose columns represent the pixels.
Figure 1a shows the Samson data matrix, which clearly can be well approximated as low-rank data constructed from a few components.
Figure 1b shows the ground truth of the normalized endmembers and their abundance maps for the Samson dataset. Moreover, we selected 15 pixels randomly from the HSI data matrix and visualized their reflectance along the spectral bands, as shown in
Figure 1c. As can be seen, the 15 signatures fall into three groups (i.e., solid, dashed, and dash-dotted lines), where each member of a group is approximately a linear combination of the other group members. In other words, each pixel of the HSI can be represented as a linear combination of the main component of each group. Thus, the HSI is well approximated by a low-rank matrix, and the rank represents the number of endmembers.
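This low-rank behavior can be checked numerically by inspecting the singular value decay of the reshaped HSI matrix; a sketch along these lines, assuming the cube is already loaded as a NumPy array named `cube` (a hypothetical variable):

```python
import numpy as np

def singular_value_decay(cube: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the leading k singular values of the reshaped HSI, normalized."""
    rows, cols, bands = cube.shape
    Y = cube.reshape(rows * cols, bands).T       # bands x pixels data matrix
    s = np.linalg.svd(Y, compute_uv=False)
    return s[:k] / s[0]

# For a nearly low-rank HSI such as Samson, the values drop sharply after the
# first few entries, and the location of the knee suggests the endmember count.
```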
Inspired by CUR matrix factorization methods [15,16] for finding low-rank approximations, by the interpretability of the resulting matrices, and by their performance in various application areas, we propose a novel framework, called CUR-HU, for performing the three main tasks of the HU process based on CUR factorization. It is a new blind unmixing method incorporating a CUR matrix decomposition with a deterministic sampling method that selects, from the observed mixed HSI, the columns and rows with maximum independence, and it provides a unique solution. CUR-HU automatically estimates the endmembers and their abundance maps without requiring prior knowledge about the scene. Unlike NMF, CUR-HU does not depend on an initialization method, and the resulting matrices are easy to interpret because they contain the same data elements as the original HSI. To decrease the complexity of computing the SVD of an HSI, which the sampling method requires for computing the selected row and column indices, QR factorization based on an incremental approach is utilized to approximate the SVD and to estimate the number of endmembers in a given HSI.
Figure 2 shows an overview of the proposed framework (i.e., CUR-HU) for performing the whole HU process. The performance of the proposed method in each stage of the complete HU process is evaluated using synthetic and real standard HSI datasets, with comparisons against several selected state-of-the-art HU baselines. The experiments demonstrate that CUR-HU outperforms other state-of-the-art blind HU algorithms, especially in extracting the endmember spectra and in computational efficiency, achieving substantial speedups over different state-of-the-art methods when estimating the endmembers and their abundance maps of different real HSI datasets.
The contributions of this paper are summarized as follows:
We propose a new framework based on CUR matrix factorization called CUR-HU for performing the complete HU process. To the best of our knowledge, this is the first time that CUR decomposition has been applied to solve the HU problem.
CUR-HU incorporates several techniques to solve the blind HU problem with unmixing performance comparable to state-of-the-art methods but with much higher computational efficiency. It does not require initialization or prior information about the scene.
Various experiments on synthetic and real HSIs were conducted, and the results show that CUR-HU performs comparably to state-of-the-art methods in estimating the number of endmembers and extracting the abundance maps, and that it is robust to spectral variability. It outperforms other methods in estimating the endmembers and in computational efficiency.
CUR-HU can be used to initialize the existing HU algorithms, especially iterative methods that are sensitive to the initialization step. Furthermore, it can provide supervised information for methods based on deep learning techniques.
CUR-HU achieves substantial speedups over different state-of-the-art methods for estimating the endmembers and their abundance maps of different real HSI datasets.
The rest of the paper is organized as follows. Section 2 presents the related work on solving the HU problem. Section 3 briefly introduces the concepts and methods used in our proposed method, including CUR decomposition, the selection method, NMF, and the linear mixing model (LMM). Section 4 presents the proposed HU method, CUR-HU. Section 5 provides extensive experiments and performance comparisons with state-of-the-art methods using both simulated and real HSIs. Finally, the conclusion and a discussion of future work are presented in Section 6.
Notation
In this paper, lowercase and uppercase bold letters are used to represent vectors and matrices, respectively, and $X_{ij}$ is the element in the $i$-th row and $j$-th column of a matrix $\mathbf{X}$. We follow Matlab-style notation, i.e., $\mathbf{X}(I, J)$ is the submatrix containing the rows and columns of $\mathbf{X}$ indexed by $I$ and $J$. The SVD of $\mathbf{X}$ is $\mathbf{X} = \mathbf{W} \boldsymbol{\Sigma} \mathbf{V}^{T}$, in which $\mathbf{W}$ and $\mathbf{V}$ are the left and right singular matrices and $\boldsymbol{\Sigma}$ contains the singular values of $\mathbf{X}$. The Frobenius norm of a matrix is $\|\mathbf{X}\|_{F}$, and the Moore–Penrose inverse of $\mathbf{X}$ is $\mathbf{X}^{\dagger}$. The identity matrix of size $n \times n$ is represented by $\mathbf{I}_{n}$.
2. Related Work
Various methods have been proposed for estimating the number of endmembers. They can be categorized into two groups [17]: (1) eigenanalysis-based methods, e.g., HFC, NWHFC [5], and HySime [18]. The HySime method tries to find the optimal signal subspace by minimizing the mean-square error (MSE) between the data projected onto the subspace and the original data, where the number of endmembers equals the optimal dimensionality of the subspace. The NWHFC and HFC methods aim to find the most significant gap between the eigenvalues of the sample correlation and covariance matrices, which determines the appropriate number of eigenvectors that can represent the HSI data. (2) Information-theoretic criteria-based algorithms, e.g., the Akaike information criterion (AIC) and the minimum description length (MDL) [19]. Many algorithms have been proposed to solve the HU problem, and they can be divided into three classes according to their computational methods: (1) geometric algorithms, (2) sparse regression-based algorithms [4], and (3) statistical algorithms. Geometrical methods utilize the geometry of HSIs by assuming that the endmembers are the vertices of a simplex of minimum volume enclosing the data set or of maximum volume contained in the convex hull of the data set. Classical geometrical algorithms include N-FINDR [20], the pixel purity index (PPI) [21], vertex component analysis (VCA) [6], the simplex growing algorithm (SGA) [22], and simplex identification via split augmented Lagrangian [23]. VCA and N-FINDR iteratively search for the simplex with maximum volume, whose vertices are considered endmembers. VCA is an iterative approach that projects the data in the direction orthogonal to the subspace spanned by the endmembers extracted in previous steps, while N-FINDR randomly selects some pixels as the initial endmember matrix and then exchanges those pixels with other pixels to find the set of pixels with the maximum volume. Although geometrical unmixing methods work well for extracting endmembers, they may fail on highly mixed HSI data, especially if pure spectral signatures do not exist. Sparse regression-based methods use large known libraries to formulate the HU problem as a sparse linear regression problem [24,25]. Extracting realistic endmembers directly from HSI data is a big challenge without prior knowledge. Because ever larger spectral libraries have become openly available, many algorithms have been proposed to select the best subset of endmember signatures from those libraries to represent each pixel, treating the spectral library as a priori knowledge [26]. This approach aims to use only a few spectra from a given spectral library to model each mixed pixel in the HSI. It is called sparse unmixing because the number of endmembers in a given HSI is much smaller than the number of available spectra in the library, which leads to a sparse solution. The authors in [26] proposed a sparse unmixing method that uses spectral a priori information in the HSI to address the high mutual coherence of the spectral library. However, endmembers extracted from spectral libraries are usually inconsistent with the HSI pixel signatures because of differing acquisition conditions, which affects the subsequent abundance estimation step.
Statistical HU methods use statistical representations to interpret the mixed pixels. They are powerful techniques for decreasing the accumulated errors of two-stage HU methods because they treat HU as a BSS problem and estimate the endmember signatures and abundance maps simultaneously.
Independent component analysis (ICA) [27,28], Bayesian self-organizing maps [29], independent factor analysis (IFA) [30], and nonnegative matrix factorization (NMF) [12] are some of the common statistical methods utilized for HU. HU based on ICA and IFA rests on two assumptions: the observed spectrum vector should be a linear combination of the endmember signatures weighted by the corresponding sources (abundance fractions), and the sources should be statistically independent. Thus, HU itself is not a strict ICA or IFA problem. The authors in [30] argue that ICA cannot play a competitive role in HSI analysis because the statistical independence of the sources assumed by ICA and IFA is violated in HU, compromising the performance of these methods. As a result, nonparametric statistical unmixing methods based on dependent component analysis were proposed in [31,32,33]. This approach models the abundance maps as mixtures of Dirichlet densities, which enforces the nonnegativity and constant-sum constraints on the abundance fractions, and estimates the mixing matrix using a generalized expectation–maximization (GEM)-type algorithm. Although this method provides a reasonable way to avoid the independence assumption of ICA, the Dirichlet densities of the abundances need to be acquired in advance. Moreover, a matrix inversion operation is required in each iteration step, which may degrade performance. Compared with sparse regression-based methods and geometrical algorithms, NMF-based HU algorithms are powerful and widely used to extract the endmembers and their abundance maps simultaneously [34].
NMF-based algorithms work well for solving HU, but the underlying problem is ill-posed, meaning that the solution may not be unique. Various auxiliary regularizers have been added to conventional NMF to improve the uniqueness of the HU solution, e.g., minimum-volume rank-deficient NMF (Min-vol NMF) [35], $L_{1/2}$-sparsity-constrained NMF ($L_{1/2}$-NMF) [36], manifold-regularized sparse NMF [37], spatial group sparsity-regularized NMF (SGSNMF) [38], robust collaborative NMF (R-CoNMF) [39], graph-regularized NMF [40], and subspace structure-regularized NMF (SSRNMF) [41]. The SGSNMF method incorporates a spatial group sparsity regularizer into NMF-based HU to explore the sparse structure and exploit the spatial information of HSIs. It builds on the idea that pixels within a local spatial neighborhood are expected to share the same pattern in the low-rank abundance matrix. In $L_{1/2}$-NMF, the $L_{1/2}$-norm is used instead of the $L_{1}$-norm to obtain a sparser solution. Because $L_{1/2}$ regularization provides high unmixing performance, it is used in different NMF-based HU methods, such as structure-constrained sparse NMF [42] and graph-regularized $L_{1/2}$-NMF [37]. Recently, nonnegative tensor factorizations have been introduced to solve the blind HU problem, e.g., matrix-vector nonnegative tensor factorization for blind HU (MVNTF) [43], which preserves the spatial and spectral information by factorizing the hyperspectral tensor directly instead of reshaping it into a 2D matrix. Moreover, HU methods based on deep learning (DL) have been proposed to achieve more competitive unmixing performance [44,45,46]. However, they still have several drawbacks: they require many training samples and network parameters to achieve satisfactory performance, they have high computational complexity, and the results from the network are difficult to interpret [47,48]. Most existing HU methods consider only one or two tasks of the HU process, i.e., estimating the endmembers and their abundances while assuming the number of endmembers is available, which is uncommon in real scenarios. In this paper, we consider the whole HU process.
4. CUR Factorization for Hyperspectral Unmixing (CUR-HU)
This section presents the proposed framework (i.e., CUR-HU) for performing the three main tasks of the HU process. A general assumption is that a low-dimensional subspace is hidden in the HSI dataset, as demonstrated in
Figure 1. The endmember signatures of the given HSI represent the macroscopic objects of the HSI scene, e.g., water, soil, and vegetation. HU algorithms attempt to extract those macroscopic objects from the observed mixed signals. As those endmember signatures are statistically independent, extracting them from the mixed observed signals can be achieved by finding a group of pixel spectra with maximum independence between them. In other words, the extracted endmember signatures should be more mutually independent than the mixed pixel spectra. Therefore, such a methodology would be a progression toward extracting more realistic endmember spectra.
According to the LMM, the spectrum of each pixel of an HSI is modeled as a linear combination of all endmembers, so the HSI data matrix has a low-rank structure with a rank almost equal to the number of endmembers. To explore the low-rank pattern of HSIs, we can project the pixels onto a low-dimensional subspace as follows. Let the HSI matrix be $\mathbf{Y} \in \mathbb{R}^{B \times N}$ with span dimension $r$. Then the basis of the reduced dimension $r$ is a set of basis vectors $\mathbf{U}_r = [\mathbf{u}_1, \ldots, \mathbf{u}_r]$ that provides the best approximation to the original span of $\mathbf{Y}$. The set of basis vectors $\mathbf{U}_r$ is constructed to minimize the error as follows:
$$\min_{\mathbf{U}_r} \left\| \mathbf{Y} - \mathbf{U}_r \mathbf{U}_r^{T} \mathbf{Y} \right\|_F^2 .$$
The set of the left singular vectors of $\mathbf{Y}$ gives the optimal solution to this minimization problem. Let the SVD of the HSI data be $\mathbf{Y} = \mathbf{W} \boldsymbol{\Sigma} \mathbf{V}^{T}$, where $\mathbf{W}$ and $\mathbf{V}$ contain the left and right singular vectors, respectively, and $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_1, \sigma_2, \ldots)$ with $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$. The optimal basis vectors solving this problem are the first $r$ left singular vectors, $\mathbf{U}_r = \mathbf{W}(:, 1{:}r)$. Although this is an optimal solution, it is limited to approximating linear or low-order polynomial nonlinearities, it is computationally expensive, and the resulting singular vectors of a given HSI dataset are difficult to interpret physically.
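This optimality can be verified numerically in a few lines; the matrix and rank below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random((156, 9025))             # hypothetical bands x pixels HSI matrix
r = 3

W, s, Vt = np.linalg.svd(Y, full_matrices=False)
Ur = W[:, :r]                           # optimal rank-r basis of the column span
err = np.linalg.norm(Y - Ur @ (Ur.T @ Y), "fro")
# By the Eckart-Young theorem, err**2 equals the sum of the squared
# discarded singular values.
print(err**2, np.sum(s[r:] ** 2))
```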
We propose to utilize the CUR matrix decomposition to find the hidden structure of HSI datasets. A given HSI matrix $\mathbf{Y} \in \mathbb{R}^{B \times N}$ can be represented using the CUR matrix decomposition as $\mathbf{Y} \approx \mathbf{C}\mathbf{U}\mathbf{R}$, where $\mathbf{C} \in \mathbb{R}^{B \times p}$ contains the most informative $p$ columns that can represent the column space of the HSI matrix $\mathbf{Y}$ (i.e., a subset of $p$ pixels that can represent the whole $N$ pixels of the HSI), $\mathbf{R} \in \mathbb{R}^{p \times N}$ contains the most informative rows of $\mathbf{Y}$, and $\mathbf{U} \in \mathbb{R}^{p \times p}$ is chosen to make the difference between the data matrix $\mathbf{Y}$ and the low-rank approximation $\mathbf{C}\mathbf{U}\mathbf{R}$ as small as possible. Analogous to Equation (6) of NMF-based HU methods, the HU problem is formulated based on CUR decomposition as follows:
$$\min_{\mathbf{C}, \mathbf{U}, \mathbf{R}} \left\| \mathbf{Y} - \mathbf{C}\mathbf{U}\mathbf{R} \right\|_F^2 .$$
The matrix $\mathbf{C}\mathbf{U}$ contains the endmembers of the HSI, and the matrix $\mathbf{R}$ represents the abundance matrix whose columns give the fractional compositions at each of the $N$ pixels. To construct the $\mathbf{C}$ and $\mathbf{R}$ matrices, randomized or deterministic sampling approaches can be used. Randomized approaches sample columns and rows according to uniform or non-uniform distributions [15,54]. Deterministic approaches treat the matrix approximation as an optimization problem and try to find the decomposition that minimizes the approximation error [55]. They provide superior reconstruction accuracy but have higher computational costs. In this paper, we utilize a recent deterministic selection method, the discrete empirical interpolation method (DEIM), because it gives high approximation accuracy, provides the most informative columns and rows for approximating the whole data matrix, and can approximate a nonlinear part by treating the continuous domain as a finite set of discrete points [55,56]. For more details about using the DEIM method to approximate nonlinear functions, please refer to Appendix A.2.
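A minimal sketch of assembling a CUR approximation once row and column index sets are available (the index selection itself is deferred to the DEIM algorithm described below; the matrix, indices, and sizes here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random((156, 9025))          # hypothetical bands x pixels HSI matrix
rows = np.array([10, 70, 120])       # hypothetical selected band (row) indices
cols = np.array([5, 400, 8000])      # hypothetical selected pixel (column) indices

C = Y[:, cols]                       # informative pixels (columns)
R = Y[rows, :]                       # informative bands (rows)
U = np.linalg.pinv(C) @ Y @ np.linalg.pinv(R)   # middle matrix (see Section 4.2)
err = np.linalg.norm(Y - C @ U @ R, "fro") / np.linalg.norm(Y, "fro")
print(err)
```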
The DEIM method is completely deterministic and involves few parameters; it provides the selected row and column indices of the HSI matrix $\mathbf{Y}$ based on knowledge of its SVD. Suppose the truncated SVD $\mathbf{Y} \approx \mathbf{W}_r \boldsymbol{\Sigma}_r \mathbf{V}_r^{T}$ has been computed; then the DEIM algorithm selects the row and column index sets $\mathbf{s}$ and $\mathbf{q}$ by processing the basis vectors in $\mathbf{W}_r$ and $\mathbf{V}_r$, respectively, to construct the $\mathbf{R}$ and $\mathbf{C}$ matrices. Given an orthonormal matrix $\mathbf{W} \in \mathbb{R}^{B \times r}$ and a set of distinct indices $\mathbf{s}$, the interpolatory projector for the indices $\mathbf{s}$ onto the column span of $\mathbf{W}$ is
$$\mathbb{P} = \mathbf{W} \left( \mathbf{P}^{T} \mathbf{W} \right)^{-1} \mathbf{P}^{T},$$
where $\mathbf{P} = \mathbf{I}(:, \mathbf{s})$ and $\mathbf{P}^{T}\mathbf{W}$ is invertible. This projector has a critical characteristic not generally available from orthogonal projectors: for any vector $\mathbf{x}$,
$$(\mathbb{P}\mathbf{x})(\mathbf{s}) = \mathbf{x}(\mathbf{s}).$$
The original vector $\mathbf{x}$ and its projected vector $\mathbb{P}\mathbf{x}$ match in the $\mathbf{s}$ elements, so this projector is called an interpolatory projector. DEIM processes the left singular vectors $\mathbf{w}_1, \ldots, \mathbf{w}_r$ one at a time, starting from the first singular vector $\mathbf{w}_1$, which corresponds to the largest singular value. The first index $s_1$ is selected as the index of the largest-magnitude element of $\mathbf{w}_1$:
$$s_1 = \arg\max_{i} \left| \mathbf{w}_1(i) \right| .$$
Now, $\mathbf{s}$ contains only the first selected index, i.e., $\mathbf{s} = [s_1]$. The interpolatory projector of $\mathbf{s}$ onto $\mathbf{w}_1$ is given by
$$\mathbb{P}_1 = \mathbf{w}_1 \left( \mathbf{P}^{T} \mathbf{w}_1 \right)^{-1} \mathbf{P}^{T}.$$
The second index $s_2$ is selected as the index of the largest-magnitude element of $\mathbf{w}_2$ after removing its interpolatory projection in the $\mathbf{w}_1$ direction, i.e., of the residual $\mathbf{r} = \mathbf{w}_2 - \mathbb{P}_1 \mathbf{w}_2$. Because $\mathbf{w}_2$ and $\mathbb{P}_1 \mathbf{w}_2$ match in the $s_1$ position, $\mathbf{r}(s_1) = 0$, so no index is repeated. Generally, for selecting the index $s_j$, the indices $s_1, \ldots, s_{j-1}$ are available and the interpolatory projector is given by
$$\mathbb{P}_{j-1} = \mathbf{W}_{j-1} \left( \mathbf{P}^{T} \mathbf{W}_{j-1} \right)^{-1} \mathbf{P}^{T},$$
where $\mathbf{W}_{j-1}$ contains the first $j-1$ left singular vectors and $\mathbf{P} = \mathbf{I}(:, [s_1, \ldots, s_{j-1}])$. Then, the index $s_j$ is computed as
$$s_j = \arg\max_{i} \left| \left( \mathbf{w}_j - \mathbb{P}_{j-1} \mathbf{w}_j \right)(i) \right| .$$
The column indices $\mathbf{q}$ are selected by applying the same steps to the right singular vectors $\mathbf{V}_r$. Algorithm 1 shows the procedure for computing the selected row indices $\mathbf{s}$ with the DEIM algorithm.
Algorithm 1: Discrete empirical interpolation method (DEIM)
The interpolation indices used for constructing the matrices of the CUR decomposition are selected inductively from the singular vectors of $\mathbf{Y}$. After processing all $r$ left singular vectors and obtaining the row index vector $\mathbf{s}$, the matrix $\mathbf{R}$ is constructed as $\mathbf{R} = \mathbf{Y}(\mathbf{s}, :)$. Similarly, we obtain $\mathbf{q}$ from the right singular vectors and construct the matrix $\mathbf{C}$ as $\mathbf{C} = \mathbf{Y}(:, \mathbf{q})$.
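Since Algorithm 1 follows the steps derived above, a compact NumPy sketch of the DEIM index selection may be helpful; it is a direct transcription of those steps, not the authors' reference implementation:

```python
import numpy as np

def deim(W: np.ndarray) -> np.ndarray:
    """Select one interpolation index per orthonormal basis vector.

    W: (m, r) matrix whose columns are (approximate) singular vectors,
       ordered by decreasing singular value.
    Returns an array s of r distinct row indices.
    """
    m, r = W.shape
    s = np.zeros(r, dtype=int)
    s[0] = np.argmax(np.abs(W[:, 0]))
    for j in range(1, r):
        Wj, wj = W[:, :j], W[:, j]
        # Interpolatory projection of w_j onto span(W_{j-1}) at indices s[:j];
        # the residual is zero at the already-selected indices.
        c = np.linalg.solve(Wj[s[:j], :], wj[s[:j]])
        s[j] = np.argmax(np.abs(wj - Wj @ c))
    return s

# Row indices come from the left singular vectors and column indices from the
# right singular vectors: s = deim(W_r), q = deim(V_r).
```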
4.1. Approximating the Singular Vectors
The main limitation of the selection method is that it requires the top $r$ right and left singular vectors. For moderate-size HSIs, standard software libraries can be used to compute the SVD. Unfortunately, this is inefficient for large-scale HSIs. Therefore, iterative approaches can be used to compute the leading $r$ singular vectors of an HSI [53,57,58]. We utilize QR factorization based on an incremental approach (IQR) to approximate the singular vectors and to estimate the number of endmembers in the HSI. The main advantages of IQR are that (1) it requires only a single pass through the HSI matrix $\mathbf{Y}$ to approximate the SVD, enabling efficient handling of massive data sets in memory; and (2) it satisfies a deterministic error bound when approximating the SVD of low-rank data. IQR approximates a matrix $\mathbf{Y}$ as $\mathbf{Y} \approx \mathbf{Q}\widetilde{\mathbf{R}}$, where $\mathbf{Q}$ contains $k$ orthonormal column vectors and $\widetilde{\mathbf{R}}$ is an upper triangular matrix.
Based on eigenanalysis algorithms for estimating the number of endmember signatures of an HSI, and considering that the HSI can be well approximated by a low-rank matrix, we can utilize the QR factorization to estimate the rank of the HSI data matrix, which represents the number of main components that approximate the HSI [5,17,59]. We use the incremental QR factorization for two tasks: (1) approximating the singular vectors required by the sampling algorithm and (2) estimating the number of endmember signatures of the HSI data. Using IQR (Algorithm A1 in Appendix A), we approximate the HSI matrix $\mathbf{Y}$ as $\mathbf{Y} \approx \mathbf{Q}\widetilde{\mathbf{R}}$, where $\mathbf{Q} \in \mathbb{R}^{B \times p}$ contains the orthonormal column vectors, $\widetilde{\mathbf{R}} \in \mathbb{R}^{p \times N}$ is the upper triangular matrix, and $p$ is the estimated number of endmembers. IQR processes one pixel of $\mathbf{Y}$ at each step to make it orthogonal to the previously orthogonalized pixels. After the orthogonalization step, the algorithm verifies whether any row of $\widetilde{\mathbf{R}}$ has a tiny relative norm. If there is such a row, then this row of $\widetilde{\mathbf{R}}$ and the corresponding column of $\mathbf{Q}$ can be deleted, because this column of $\mathbf{Q}$ makes only a small contribution to the cumulatively computed Frobenius norm of $\mathbf{Y}$.
At the start, we select the first $k$ pixels from the HSI matrix $\mathbf{Y}$, denoted $\mathbf{Y}_k = \mathbf{Y}(:, 1{:}k)$, as an initial step, and compute $\mathbf{Y}_k = \mathbf{Q}\widetilde{\mathbf{R}}$, with $\mathbf{Q} \in \mathbb{R}^{B \times k}$ and $\widetilde{\mathbf{R}} \in \mathbb{R}^{k \times k}$. In step 2 of the algorithm, the norms of the rows of $\widetilde{\mathbf{R}}$ are computed. According to their values, we can determine which row should be deleted before incrementing $k$ to take in the next pixel (column) of $\mathbf{Y}$. To make a new pixel $\mathbf{y}$ orthogonal to the previously orthogonalized $k$ pixels, the classical Gram–Schmidt operations can be applied as
$$\mathbf{r} = \mathbf{Q}^{T}\mathbf{y}, \qquad \mathbf{q} = \mathbf{y} - \mathbf{Q}\mathbf{r}.$$
For a robust and stable re-orthogonalization step, classical Gram–Schmidt is replaced by steps (7–8) of Algorithm A1, as proposed in [60]. This modification provides a $\mathbf{Q}$ matrix that is numerically orthogonal to working precision, and it can be implemented efficiently in parallel because it still relies on the Gram–Schmidt process; see [61] for the full analysis.
The output of the above steps is the orthonormal direction $\mathbf{q}$ and the corresponding coefficient vector $\mathbf{r}$. Then, the $\mathbf{Q}$ and $\widetilde{\mathbf{R}}$ matrices are updated, as formulated in the algorithm, as
$$\mathbf{Q} \leftarrow \left[ \mathbf{Q}, \; \mathbf{q}/\|\mathbf{q}\| \right], \qquad \widetilde{\mathbf{R}} \leftarrow \begin{bmatrix} \widetilde{\mathbf{R}} & \mathbf{r} \\ \mathbf{0} & \|\mathbf{q}\| \end{bmatrix}.$$
Next, the algorithm checks whether a truncation step is needed by examining the row of minimum relative norm in $\widetilde{\mathbf{R}}$. Suppose this row has a relative norm less than the tolerance parameter $\epsilon$ (a positive value controlling the accuracy of the decomposition). In that case, the corresponding column of $\mathbf{Q}$ contributes little to the factorization. A truncation step is then performed by deleting that row from $\widetilde{\mathbf{R}}$ and moving the last row of $\widetilde{\mathbf{R}}$ into its position; similarly, the corresponding column of $\mathbf{Q}$ is deleted and the last column of $\mathbf{Q}$ is moved into its position. If the minimum relative row norm of $\widetilde{\mathbf{R}}$ does not trigger the truncation condition, then the last column of $\mathbf{Q}$, which is $\mathbf{q}/\|\mathbf{q}\|$, is accepted as an orthonormal vector, and the algorithm continues with the next pixel of $\mathbf{Y}$. According to the linear mixing model (LMM), each pixel of the HSI data is a linear combination of the endmember signatures in the HSI scene, so the number of column vectors in $\mathbf{Q}$ is the minimum number of subspace dimensions that can approximate the HSI data, which represents the number of endmembers in the HSI scene (i.e., $p$). In general, the estimated number of endmembers of an HSI $\mathbf{Y} \in \mathbb{R}^{B \times N}$ using incremental QR factorization within a tolerance error $\epsilon$ is $p = N - d$, where $d$ is the number of times the truncation condition was satisfied (the number of deletions that occurred up to the last iteration) in Algorithm A1. The tolerance parameter $\epsilon$ plays a vital role in estimating the number of endmembers. A large tolerance value will lead to a smaller estimated number of endmembers than the actual number but requires less storage and computing time. Conversely, for a smaller $\epsilon$, the estimated number of endmembers is equal to or larger than the actual number, and the complexity becomes significant. For more details about the complexity of approximating the singular vectors using the QR factorization, please refer to Appendix A.1.
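A simplified single-pass sketch of the IQR idea described above, with Gram–Schmidt plus one re-orthogonalization pass and norm-based truncation; this follows the prose rather than Algorithm A1 verbatim, and the relative-norm test is a plausible reading of the truncation condition:

```python
import numpy as np

def incremental_qr(Y: np.ndarray, eps: float = 1e-2):
    """Single-pass incremental QR with truncation; returns Q, R, and rank p."""
    B, N = Y.shape
    Q = Y[:, :1] / np.linalg.norm(Y[:, :1])      # initialize with the first pixel
    R = np.array([[np.linalg.norm(Y[:, 0])]])
    for n in range(1, N):
        y = Y[:, n]
        r = Q.T @ y
        q = y - Q @ r
        s = Q.T @ q                               # re-orthogonalization pass
        q -= Q @ s
        r += s
        rho = np.linalg.norm(q)
        Q = np.column_stack([Q, q / max(rho, 1e-300)])
        R = np.block([[R, r[:, None]], [np.zeros((1, R.shape[1])), [[rho]]]])
        # Truncate the row of R (and column of Q) with the smallest relative norm.
        row_norms = np.linalg.norm(R, axis=1)
        i = np.argmin(row_norms)
        if row_norms[i] < eps * np.linalg.norm(R, "fro"):
            Q = np.delete(Q, i, axis=1)
            R = np.delete(R, i, axis=0)
    return Q, R, Q.shape[1]                       # estimated endmember count p

# The left singular vectors of Y can then be approximated from the SVD of the
# small factor R: if R = W' S V^T, then Y ~ (Q W') S V^T.
```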
4.2. CUR-HU
In this section, we present the proposed HU framework (i.e., CUR-HU) for performing the entire unmixing process (i.e., estimating the number of endmembers, estimating the endmembers, and extracting the corresponding abundance maps). As shown in
Figure 2, CUR-HU consists of three main stages: preprocessing, estimating the number of endmembers, and CUR decomposition. Algorithm 2 shows the pseudocode of the CUR-HU framework.
In the preprocessing stage, the input HSI cube is reshaped into a matrix (i.e., the observed HSI data matrix $\mathbf{Y}_{\mathrm{obs}}$), and then the noise of the HSI is estimated to obtain the estimated HSI matrix $\mathbf{Y}$. Various noise estimation methods have been proposed in signal processing and utilized in hyperspectral imagery, such as the nearest neighbor difference (NND) method [62] and methods based on multiple regression theory [63]. The most straightforward technique is the NND method, which assumes that the signal components of adjacent pixels are almost equal. It is also known as the shift difference method because, for two adjacent observed vectors $\mathbf{y}_1$ and $\mathbf{y}_2$ in a homogeneous area of the HSI, the covariance matrix of the noise can be estimated using the difference between $\mathbf{y}_1$ and $\mathbf{y}_2$, which yields $\mathbf{y}_1 - \mathbf{y}_2 = (\mathbf{x}_1 + \mathbf{n}_1) - (\mathbf{x}_2 + \mathbf{n}_2) \approx \mathbf{n}_1 - \mathbf{n}_2$, where $\mathbf{x}_i$ and $\mathbf{n}_i$ are the signal and noise vectors, respectively, and the noise samples are independent with almost the same statistics. In this paper, we utilize the multiple regression theory-based method [5,63]. It outperforms the shift difference technique because of the high correlation between neighboring spectral bands in HSIs. The second stage performs the QR factorization based on an incremental approach, as presented in Algorithm A1 in Appendix A. This stage is responsible for two tasks: (1) approximating the singular vectors for the sampling algorithm and (2) estimating the number of endmembers, as presented in Section 4.1 and shown in lines 4–7 of Algorithm 2. If the number of endmembers is already available from any other existing method, this stage only approximates the top $r$ singular vectors of the HSI data matrix, and CUR-HU then estimates the endmembers and the corresponding abundance maps. Therefore, CUR-HU has much flexibility to cooperate with existing algorithms at each stage.
In the third stage, the endmembers and their abundance maps are estimated using the CUR decomposition. The inputs to this stage are the number of endmembers and the singular vectors of the HSI data estimated in the second stage, together with the HSI matrix $\mathbf{Y}$. First, the sampling algorithm (DEIM) processes the left and right singular vectors to generate the row and column indices for constructing the $\mathbf{R}$ and $\mathbf{C}$ matrices of the CUR decomposition, where $\mathbf{C}$ contains the most informative columns that can represent all pixels of the HSI and $\mathbf{R}$ contains the most informative rows (i.e., selected bands) over the whole set of pixels, as shown in lines 8–11 of Algorithm 2. The $\mathbf{C}$ and $\mathbf{R}$ matrices are unique, as the deterministic sampling algorithm provides unique indices of the selected rows and columns. After obtaining the $\mathbf{C}$ and $\mathbf{R}$ matrices, the $\mathbf{U}$ matrix is constructed to make the approximation error as small as possible [64].
Different methods for constructing the $\mathbf{U}$ matrix have been proposed. The simplest method uses the pseudoinverse of the intersection matrix between the $\mathbf{C}$ and $\mathbf{R}$ matrices: given $\mathbf{C} = \mathbf{Y}(:, \mathbf{q})$ and $\mathbf{R} = \mathbf{Y}(\mathbf{s}, :)$, then $\mathbf{U}$ is computed as $\mathbf{U} = \mathbf{Y}(\mathbf{s}, \mathbf{q})^{\dagger}$. Although this is the simplest choice of $\mathbf{U}$, it gives a high approximation error for high-rank or noisy data. The closed form of the best middle matrix of the CUR decomposition with respect to the Frobenius norm error is given by solving the optimization problem in Equation (19), so the optimal $\mathbf{U}$ matrix is given by
$$\mathbf{U} = \mathbf{C}^{\dagger} \mathbf{Y} \mathbf{R}^{\dagger}.$$
So, $\mathbf{C}\mathbf{U}\mathbf{R} = \mathbf{C}\mathbf{C}^{\dagger} \mathbf{Y} \mathbf{R}^{\dagger}\mathbf{R}$, where $\mathbf{C}\mathbf{C}^{\dagger}$ and $\mathbf{R}^{\dagger}\mathbf{R}$ are orthogonal projections onto the span of the columns of $\mathbf{C}$ and the rows of $\mathbf{R}$, respectively. Therefore, using $\mathbf{U} = \mathbf{C}^{\dagger} \mathbf{Y} \mathbf{R}^{\dagger}$ gives the best CUR approximation for the given $\mathbf{C}$ and $\mathbf{R}$ matrices. In other words, the $\mathbf{U}$ matrix is formed by weighting all the elements of $\mathbf{Y}$, which leads to a small approximation error [65].
Analogous to the NMF-based methods for HU, the matrix $\mathbf{C}\mathbf{U}$ represents the source matrix of NMF, i.e., the endmember matrix $\mathbf{M}$ of the HU LMM, while the $\mathbf{R}$ matrix represents the mixing matrix of NMF, i.e., the abundance matrix $\mathbf{S}$ in Equation (6). Finally, the abundance nonnegativity (ANC) and abundance sum-to-one (ASC) constraints are imposed on the $\mathbf{S}$ matrix. For the ANC, we set any negative value of $\mathbf{S}$ to zero, while for the ASC, we normalize the output of the ANC step so that each column of $\mathbf{S}$ sums to one.
Algorithm 2: The pseudocode of CUR-HU

Input: HSI cube $\mathcal{Y}$; a positive number $\epsilon$ controlling the factorization accuracy.
Output: Number of endmembers $p$, endmember matrix $\mathbf{M}$, and abundance maps $\mathbf{S}$.
Preprocessing:
1. Reshape $\mathcal{Y}$ into $\mathbf{Y}_{\mathrm{obs}} \in \mathbb{R}^{B \times N}$;  // $N$ is the total number of pixels.
2. Estimate the noise matrix $\mathbf{N}$;
3. Compute the estimated HSI, $\mathbf{Y} = \mathbf{Y}_{\mathrm{obs}} - \mathbf{N}$;
Second stage:
4. $[\mathbf{Q}, \widetilde{\mathbf{R}}, p] = \mathrm{IQR}(\mathbf{Y}, \epsilon)$;  // Algorithm A1; $\mathbf{Q} \in \mathbb{R}^{B \times p}$.
/* Compute the singular vectors */
5. $[\mathbf{W}, \boldsymbol{\Sigma}, \mathbf{V}] = \mathrm{SVD}(\widetilde{\mathbf{R}})$;  // Economy-sized SVD.
6. Set $\mathbf{W} = \mathbf{Q}\mathbf{W}$;
Third stage:
/* Get row and column indices */
7. $\mathbf{s} = \mathrm{DEIM}(\mathbf{W})$;  // Using Algorithm 1.
8. $\mathbf{q} = \mathrm{DEIM}(\mathbf{V})$;  // Using Algorithm 1.
/* Construct the C, U, and R matrices */
9. $\mathbf{C} = \mathbf{Y}(:, \mathbf{q})$;  // Where $\mathbf{C} \in \mathbb{R}^{B \times p}$.
10. $\mathbf{R} = \mathbf{Y}(\mathbf{s}, :)$;  // Where $\mathbf{R} \in \mathbb{R}^{p \times N}$.
11. Set $\mathbf{U} = \mathbf{C}^{\dagger} \mathbf{Y} \mathbf{R}^{\dagger}$;  // Where $\mathbf{U} \in \mathbb{R}^{p \times p}$.
/* Construct the M and S matrices */
12. Set $\mathbf{M} = \mathbf{C}\mathbf{U}$ and $\mathbf{S} = \mathbf{R}$;
13. Apply the abundance nonnegativity constraint to $\mathbf{S}$;
14. Apply the abundance sum-to-one constraint to $\mathbf{S}$;
6. Conclusions
In this paper, we proposed a new hyperspectral unmixing (HU) framework based on CUR matrix decomposition, called CUR-HU, that performs the entire HU process, which consists of three stages (i.e., estimating the optimal number of endmembers, estimating the endmembers, and estimating their corresponding abundance maps), whereas most existing HU algorithms focus on only one or two tasks of the HU process. CUR-HU is a fully unsupervised method that incorporates several techniques to perform the HU process with high computational efficiency. For estimating the number of endmembers, we utilized a QR factorization based on an incremental method (IQR), which is also used to approximate the singular vectors of the hyperspectral image (HSI) required by the deterministic selection method; this significantly improves the scalability of the proposed method. Then, for blind HU, which estimates the endmembers and their abundance maps simultaneously, we proposed a method based on CUR matrix decomposition with a deterministic sampling method (i.e., DEIM). A unique advantage of our method is that the CUR decomposition based on the IQR factorization provides the matrix rank information for estimating the number of endmembers while simultaneously yielding a low-rank approximation of the original data with a meaningful interpretation. Extensive experiments were conducted on various synthetic and real HSI datasets to demonstrate the effectiveness of each stage of the proposed framework. The experimental results show that CUR-HU performs comparably to state-of-the-art methods in estimating the number of endmembers and the abundance maps, while outperforming them in estimating the endmembers and in computational efficiency. Furthermore, CUR-HU is robust to spectral variability, simple to implement, and amenable to parallelization. The proposed framework can be used with existing HU algorithms to initialize the number of endmembers, the signatures, and the abundance maps, and it can provide supervised information for HU methods based on deep learning techniques. CUR-HU achieves substantial speedups over different state-of-the-art methods for estimating the endmembers and their abundance maps of different real HSI datasets.