1. Introduction
Hyperspectral images integrate the spectrum, which represents the radiant attributes of an observed scene, with the corresponding images, which capture spatial and geometric relations. Thanks to its high spectral resolution and rich spectral information, hyperspectral imaging has been widely used in applications such as forest mapping and monitoring, land cover change detection, mineral exploration, and object identification.
Hyperspectral image analysis plays a key role in revealing information from high-dimensional data. In the last decade, spectral unmixing has received considerable attention. It aims to decompose the observed pixel spectra into a collection of constituent spectral signatures (called endmembers) and to estimate the fractions associated with each component (called abundances) [1]. Spectral unmixing is particularly useful for analyzing pixels with low spatial resolution or with photon interactions among materials. Unmixing techniques have been extensively studied, and most of them perform endmember extraction and abundance estimation either independently or simultaneously. Provided that pure pixels are present in the observed scene, methods such as the pixel purity index (PPI) algorithm [2], vertex component analysis (VCA) [3], and the N-FINDR algorithm [4] have been proposed for extracting endmembers. Meanwhile, some algorithms generate virtual endmembers and abundances when pure pixels are absent, for example, the minimum volume simplex analysis (MVSA) [5] and the minimum volume constrained nonnegative matrix factorization [6]. To pursue interpretable and tractable solutions, most algorithms are based on the linear mixture model (LMM). The LMM assumes that an observed pixel is a combination of signature spectra weighted by the abundances. To be physically meaningful, the problem is usually subject to two constraints: the abundance nonnegativity constraint (ANC) and the abundance sum-to-one constraint (ASC). Linear unmixing algorithms can be classified into those based on the least squares principle [7], statistical frameworks [8,9], sparse regression [10,11], and independent component analysis (ICA) [9,12,13].
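As a compact restatement of the LMM and its two constraints, the following uses assumed notation, since no symbols are fixed at this point of the text: $\mathbf{r} \in \mathbb{R}^{L}$ for the observed pixel over $L$ bands, $\mathbf{M} \in \mathbb{R}^{L \times R}$ for the matrix of $R$ endmember signatures, $\boldsymbol{\alpha} \in \mathbb{R}^{R}$ for the abundance vector, and $\mathbf{n}$ for additive noise.

$$\mathbf{r} = \mathbf{M}\boldsymbol{\alpha} + \mathbf{n}, \qquad \underbrace{\alpha_i \ge 0,\; i = 1, \dots, R}_{\text{ANC}}, \qquad \underbrace{\textstyle\sum_{i=1}^{R} \alpha_i = 1}_{\text{ASC}}.$$

Under this model, a least-squares abundance estimate that enforces both constraints can be written as

$$\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \; \|\mathbf{r} - \mathbf{M}\boldsymbol{\alpha}\|^2 \quad \text{subject to} \quad \boldsymbol{\alpha} \succeq \mathbf{0}, \;\; \mathbf{1}^\top \boldsymbol{\alpha} = 1,$$

which is commonly referred to as fully constrained least squares.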
Since the LMM may not be appropriate in some practical situations where light is scattered by multiple reflective or interacting materials [14], the nonlinear mixture model (NLMM) provides an alternative that overcomes this limitation. For instance, a specific class of nonlinear models referred to as bilinear models has been studied in [15,16,17,18,19] for modeling second-order reflectance by adding bilinear terms to the LMM. The postnonlinear mixture model (PNMM) has also been introduced in [20,21]; it considers an appropriate nonlinear function mapping from
into
[
18]. In [22], based on nonlinear models, the authors proposed a parameter-free unmixing algorithm in which the abundance fractions and a set of nonlinear parameters are subject to specific constraints that can be satisfied by minimizing a penalty function. The above algorithms are built on specific models and thus lack flexibility. Neural-network-based methods can be regarded as model-free methods. In [23], the authors designed a multi-layer perceptron neural network combined with a Hopfield neural network to deal with nonlinear mixtures. In [24], an auto-associative neural network was introduced for nonlinear abundance estimation, consisting of a dimension reduction stage and a mapping stage from the features to the abundance percentages. Furthermore, in [25], the authors proposed an end-to-end unmixing method based on a convolutional neural network that takes spatial information into account to improve accuracy. Kernel-based methods can also be regarded as model-free methods.
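The bilinear extension of the LMM mentioned above can be illustrated with a short sketch. It assumes one common form of such models, in which each pair of endmembers contributes the element-wise product of their spectra, weighted by the product of their abundances and a coefficient `gamma[i, j]`. The function names, the random signatures, and the coefficient values are illustrative assumptions, not the specific models of [15,16,17,18,19].

```python
import numpy as np

def linear_mixture(M, alpha):
    """LMM: abundance-weighted sum of the endmember spectra (columns of M)."""
    return M @ alpha

def bilinear_mixture(M, alpha, gamma):
    """LMM plus second-order terms gamma[i, j] * alpha_i * alpha_j * (m_i * m_j), i < j.

    The product m_i * m_j is taken element-wise over the spectral bands.
    """
    r = linear_mixture(M, alpha)
    R = M.shape[1]
    for i in range(R - 1):
        for j in range(i + 1, R):
            r = r + gamma[i, j] * alpha[i] * alpha[j] * (M[:, i] * M[:, j])
    return r

# Hypothetical data: 3 random endmember signatures over 50 bands.
rng = np.random.default_rng(0)
L, R = 50, 3
M = rng.uniform(0.0, 1.0, size=(L, R))
alpha = np.array([0.5, 0.3, 0.2])          # satisfies the ANC and the ASC

r_lin = linear_mixture(M, alpha)
r_bil = bilinear_mixture(M, alpha, np.ones((R, R)))    # all interactions switched on
r_off = bilinear_mixture(M, alpha, np.zeros((R, R)))   # interactions switched off

print("largest bilinear contribution:", np.max(r_bil - r_lin))
```

Setting every interaction coefficient to zero recovers the LMM, which is why such models are usually read as the LMM plus a second-order correction.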
Kernel-based methods are among the most popular tools for addressing nonlinear learning problems. These methods map the data to a higher-dimensional feature space in which the mapped data can be represented with a linear model [26]. Importantly, finding the explicit mapping is bypassed via the kernel trick [27,28,29,30,31]. In [32,33], the authors proposed a model consisting of a linear mixture and a nonlinear fluctuation. The nonlinear fluctuation function characterizes the high-order interactions between endmembers and is restricted to a reproducing kernel Hilbert space (RKHS). The so-called K-Hype and SK-Hype algorithms were proposed therein. In contrast to several other classes of algorithms, kernel-based methods are independent of a specific observation model and generalize across multiple scenes. Thus, subsequent models have been further studied in the literature. An $\ell_1$-spatial regularization term is added to the K-Hype problem in [34] to promote the piecewise spatial continuity of the abundance maps. In [35,36], the nonlinear fluctuation term is also called the residual term, and the associated algorithm is called residual component analysis. Post-nonlinear models as well as some robust unmixing models that consider such a fluctuation are presented in [37]. In [38], the authors extended this method by accounting for band-dependent and neighboring nonlinear contributions using separable kernels. Further, kernel-based nonnegative matrix factorization (NMF) techniques have been studied to simultaneously capture nonlinear dependence features and estimate the abundances.
The above kernel-based algorithms have inherent limitations. On the one hand, these algorithms focus on abundance estimation and do not consider the extraction of the endmembers. On the other hand, the structures of the K-Hype-type algorithms and their variants impose a regularization with unclear physical interpretation on the abundance vector (as elaborated in Section 2), and the fluctuation is independent of the abundance fractions. In this work, we propose a kernel-based sparse nonlinear spectral unmixing algorithm. The algorithm is designed to run with a large number of candidate spectral signatures. A sparse regularization step and a dictionary pruning step are conducted sequentially: the former selects the endmembers with significant contributions, and the latter then performs the abundance estimation with a more appropriate optimization problem using the pruned endmember dictionary. The contributions of this work are summarized as follows:
A kernel-based sparse nonlinear unmixing problem is formulated and a two-step solving strategy is proposed. This strategy allows a spectral library to be used for selecting the endmembers and bypasses the problem of endmember extraction in nonlinear unmixing.
A more reasonable formulation of the optimization problem is proposed for solving the linear mixture/nonlinear fluctuation model. This formulation improves the K-Hype formulation in several aspects and serves as a key component of the proposed sparse unmixing scheme.
The algorithm is tested using real data with ground truth created in our laboratory. The lack of publicly available data sets with ground truth makes it difficult to compare unmixing algorithms. Most existing works rely on numerically produced synthetic data and real data without ground truth. Using labeled real data provides more meaningful comparison results.
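The select-then-prune idea behind the two-step strategy can be sketched in a purely linear setting. The code below is a simplified linear stand-in and not the kernel-based algorithm proposed in this paper: step one solves a nonnegative $\ell_1$-regularized least-squares problem over the whole library with a proximal gradient loop, and step two re-fits a nonnegative least-squares problem on the retained signatures and rescales the result to sum to one. The function names, the threshold `keep_tol`, the weight `lam`, and the synthetic library are all assumptions made for illustration.

```python
import numpy as np

def nonneg_l1_ls(D, r, lam, n_iter=5000):
    """Proximal gradient for min_a 0.5*||r - D a||^2 + lam*sum(a) subject to a >= 0."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # inverse Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - r)
        a = np.maximum(a - step * (grad + lam), 0.0)   # nonnegative soft-thresholding
    return a

def sparse_then_prune(D, r, lam=1e-3, keep_tol=0.05):
    """Step 1: sparse selection over the full library. Step 2: re-fit on the pruned one."""
    a_sparse = nonneg_l1_ls(D, r, lam)
    active = np.flatnonzero(a_sparse > keep_tol)        # indices of retained signatures
    if active.size == 0:
        raise ValueError("no signature survived pruning; lower lam or keep_tol")
    a_active = nonneg_l1_ls(D[:, active], r, lam=0.0)   # nonnegative least squares, no l1 bias
    return active, a_active / a_active.sum()            # rescale so the abundances sum to one

# Hypothetical data: a 20-signature library, of which only three are present in the pixel.
rng = np.random.default_rng(0)
L, n_lib = 100, 20
D = rng.uniform(0.0, 1.0, size=(L, n_lib))
true_idx = np.array([2, 7, 11])
true_alpha = np.array([0.6, 0.3, 0.1])
r = D[:, true_idx] @ true_alpha + 1e-3 * rng.standard_normal(L)

active, alpha_hat = sparse_then_prune(D, r)
print("retained library indices:", active)
print("estimated abundances:", np.round(alpha_hat, 3))
```

The sum-to-one constraint is left out of the first step on purpose: for nonnegative abundances that sum to one, the $\ell_1$-norm is constant and would not promote sparsity.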
The remainder of this paper is organized as follows. Section 2 introduces the related work based on kernel methods. The proposed basic kernel-based algorithm is described in Section 3. Section 4 describes experimental results with simulated hyperspectral data and real hyperspectral data sets. Finally, the conclusion is given in Section 5.
2. Kernel-Based Nonlinear Abundance Estimation
Notation. Scalars are denoted by italic letters. Vectors and matrices are denoted by boldface lowercase and uppercase letters, respectively. Specifically, if each pixel of the hyperspectral image consists of a reflectance vector in $L$ contiguous spectral bands, then $\mathbf{r} \in \mathbb{R}^{L}$ is an observed pixel, $\mathbf{M} \in \mathbb{R}^{L \times R}$ is the endmember matrix, i.e., a spectral library consisting of the spectral signatures of $R$ endmembers, $\mathbf{m}_{\lambda_\ell} \in \mathbb{R}^{R}$ is the vector of the $R$ endmember signatures at the $\ell$-th wavelength band (the $\ell$-th row of $\mathbf{M}$), and $\boldsymbol{\alpha} \in \mathbb{R}^{R}$ is the abundance vector.
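In this section, we first review the general unmixing model and the kernel-based linear mixture/nonlinear fluctuation model proposed in [33]. A general mixing mechanism can be formulated as

$$\mathbf{r} = \Psi(\mathbf{M}, \boldsymbol{\alpha}) + \mathbf{n}, \tag{1}$$

where $\Psi$ is an unknown function that defines the interactions between the endmembers in matrix $\mathbf{M}$, parameterized by their associated abundance fractions $\boldsymbol{\alpha}$, and $\mathbf{n}$ is the modeling noise. Though general enough, this strategy may fail if the function $\Psi$ cannot be adequately and finitely parameterized. A semi-parametric model proposed in [33] is described by

$$r_\ell = \boldsymbol{\alpha}^\top \mathbf{m}_{\lambda_\ell} + \psi(\mathbf{m}_{\lambda_\ell}) + n_\ell, \qquad \ell = 1, \dots, L. \tag{2}$$

This model is composed of a linear mixture term and a nonlinear fluctuation term defined by $\psi$. Several useful nonlinear mixing models can be considered as specific cases of (2). For instance, (2) reduces to a bilinear model if a second-order polynomial is used for $\psi$. Models and algorithms of the residual component analysis follow the same principle [35,36,37]. Assume that the endmember matrix $\mathbf{M}$ is known, and that $\psi$ is a real-valued function of a reproducing kernel Hilbert space $\mathcal{H}$, endowed with the reproducing kernel $\kappa$, i.e.,

$$\psi(\mathbf{m}_{\lambda_\ell}) = \langle \psi, \kappa(\cdot, \mathbf{m}_{\lambda_\ell}) \rangle_{\mathcal{H}}. \tag{3}$$

Selecting a proper kernel is essential for describing the mixture. For instance, the second-order homogeneous polynomial kernel

$$\kappa(\mathbf{m}_{\lambda_i}, \mathbf{m}_{\lambda_j}) = \left(\mathbf{m}_{\lambda_i}^\top \mathbf{m}_{\lambda_j}\right)^2 \tag{4}$$

is able to represent the second-order interactions between endmembers, and the Gaussian kernel

$$\kappa(\mathbf{m}_{\lambda_i}, \mathbf{m}_{\lambda_j}) = \exp\left(-\frac{\|\mathbf{m}_{\lambda_i} - \mathbf{m}_{\lambda_j}\|^2}{2\sigma^2}\right), \tag{5}$$

with bandwidth $\sigma > 0$, involves interactions of infinite order, since it can be expanded as a sum of polynomials of all orders. Then [33] proposes to evaluate the abundances and the nonlinear function by solving the following optimization problem:

$$\min_{\boldsymbol{\alpha},\, \psi \in \mathcal{H}} \; \frac{1}{2}\left(\|\boldsymbol{\alpha}\|^2 + \|\psi\|_{\mathcal{H}}^2\right) + \frac{1}{2\mu}\sum_{\ell=1}^{L} e_\ell^2 \quad \text{subject to} \quad e_\ell = r_\ell - \boldsymbol{\alpha}^\top \mathbf{m}_{\lambda_\ell} - \psi(\mathbf{m}_{\lambda_\ell}), \;\; \boldsymbol{\alpha} \succeq \mathbf{0}, \;\; \mathbf{1}^\top \boldsymbol{\alpha} = 1, \tag{6}$$

where $\mu$ is a positive parameter. The above problem minimizes the reconstruction error together with the regularity of $\psi$ and $\boldsymbol{\alpha}$, each characterized by a squared $\ell_2$-norm, subject to the ANC and ASC. This convex problem can be solved via duality theory, which leads to the so-called K-Hype algorithm. The $\ell$-th element of the pixel is then reconstructed by

$$r_\ell^{*} = \boldsymbol{\alpha}^{*\top} \mathbf{m}_{\lambda_\ell} + \psi^{*}(\mathbf{m}_{\lambda_\ell}), \tag{7}$$

where $\boldsymbol{\alpha}^{*}$ and $\psi^{*}$ denote the optimal estimates. K-Hype has been shown to be efficient in addressing several nonlinear mixture scenarios. However, it has the following restrictions: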
R1: While the $\ell_2$-regularization on the nonlinear function $\psi$ controls its regularity, the $\ell_2$-regularization on the abundance vector $\boldsymbol{\alpha}$ does not possess a clear interpretation, as there is no reason to expect an abundance vector to have a small $\ell_2$-norm.
R2: Problem (6) aims to estimate the abundances with a known endmember matrix. When a large spectral library is used as the set of candidates, specific variants for extracting the active endmembers are needed.
R1 and R2 motivate us to derive an improved model and the associated unmixing algorithms by modifying the $\ell_2$-regularization and considering an endmember selection strategy.
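To make the structure of Problem (6) concrete, the following is a minimal numerical sketch, not the dual-based K-Hype solver of [33]. It assumes that (6) takes the usual K-Hype form, namely half the squared norms of the abundance vector and of the fluctuation function plus the squared residuals weighted by $1/(2\mu)$, under the ANC and ASC. By the representer theorem the fluctuation is written as a kernel expansion over the rows of the endmember matrix; for fixed abundances its coefficients are a kernel ridge solution, and the remaining problem in the abundances is solved here by projected gradient on the simplex. All function names, the bandwidth, the value of `mu`, and the synthetic data are assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {a : a >= 0, sum(a) = 1} (sort-based rule)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def gaussian_gram(X, sigma):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for the rows x_i of X."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def poly2_gram(X):
    """K[i, j] = (x_i^T x_j)^2, the second-order homogeneous polynomial kernel."""
    return (X @ X.T) ** 2

def khype_primal(r, M, K, mu, n_iter=5000):
    """Projected gradient on the abundances after eliminating the kernel coefficients.

    For fixed alpha, beta = (K + mu I)^{-1} (r - M alpha); plugging this back leaves
    0.5*||alpha||^2 + 0.5*(r - M alpha)^T (K + mu I)^{-1} (r - M alpha) over the simplex.
    """
    L, R = M.shape
    A = np.linalg.inv(K + mu * np.eye(L))
    H = M.T @ A @ M
    step = 1.0 / (1.0 + np.linalg.norm(H, 2))    # inverse Lipschitz constant
    alpha = np.full(R, 1.0 / R)
    for _ in range(n_iter):
        grad = alpha - M.T @ (A @ (r - M @ alpha))
        alpha = project_simplex(alpha - step * grad)
    beta = A @ (r - M @ alpha)                   # coefficients of the kernel expansion
    r_hat = M @ alpha + K @ beta                 # linear part plus fluctuation, band by band
    return alpha, beta, r_hat

# Hypothetical pixel: linear mixture plus one bilinear interaction and small noise.
rng = np.random.default_rng(0)
L, R = 60, 3
M = rng.uniform(0.0, 1.0, size=(L, R))
alpha_true = np.array([0.5, 0.3, 0.2])
r = M @ alpha_true + 0.5 * alpha_true[0] * alpha_true[1] * M[:, 0] * M[:, 1]
r = r + 1e-3 * rng.standard_normal(L)

K = gaussian_gram(M, sigma=2.0)                  # poly2_gram(M) is the other option
mu = 1e-2
alpha_hat, beta_hat, r_hat = khype_primal(r, M, K, mu)
print("estimated abundances:", np.round(alpha_hat, 3))
print("reconstruction RMSE:", np.sqrt(np.mean((r - r_hat) ** 2)))
```

The term `alpha` in the gradient comes directly from the squared-norm penalty on the abundances, which is the regularization questioned in R1; with a fixed iteration budget the loop is a sketch rather than a tuned solver.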