
Hyperspectral Remote Sensing Image Classification Based on Partitioned Random Projection Algorithm

The Institute for Remote Sensing Science and Application, School of Geomatics, Liaoning Technical University, Fuxin 123000, China
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(9), 2194; https://doi.org/10.3390/rs14092194
Submission received: 12 March 2022 / Revised: 27 April 2022 / Accepted: 28 April 2022 / Published: 4 May 2022

Abstract

Dimensionality reduction based on random projection (RP) suffers from two problems: the achievable dimensionality is limited by the data size, and the class separability of the dimensionality reduction results is unstable because the projection matrix is generated randomly. These problems make the RP algorithm unsuitable for large-size hyperspectral image (HSI) classification. To solve them, this paper presents a new partitioned RP (PRP) algorithm and proves its rationality in theory. First, a large-size HSI is evenly divided into multiple small-size sub-HSIs. Afterwards, the projection matrix that maximizes the class separability is selected from multiple samplings, where the class dissimilarity measurement is defined by a large inter-class distance and a small intra-class variance. Using the same projection matrix, each small-size sub-HSI is projected to generate a low dimensional sub-HSI, and together these form a low dimensional HSI. Next, the minimum distance (MD) classifier is utilized to classify the low dimensional HSI obtained by the PRP algorithm. Finally, four real HSIs are used for experiments, and three of the most popular RP-based classification algorithms are selected as comparison algorithms to validate the effectiveness of the proposed algorithm. The classification performance is evaluated with the kappa coefficient, overall accuracy (OA), average accuracy (AA), average precision rate (APR), and running time. Experimental results indicate that the proposed algorithm can obtain reliable classification results in a very short time.

1. Introduction

The wavelength range of hyperspectral images (HSIs) is continuous and dense from visible light to short-wave infrared [1]. Compared with panchromatic and multispectral remote sensing images, HSIs contain richer spectral information and finer texture information of ground objects, which provides a solid data foundation for high-precision ground object classification [2,3]. Therefore, more and more researchers apply HSIs to fields such as medicine [4,5] and agriculture [6,7]. However, on commonly configured hardware, the classification of large-size HSIs suffers from the large data volume and the resulting long running time. The most straightforward solution is to perform dimensionality reduction before HSI classification, thereby reducing the computational load [8,9].
Popular HSI dimensionality reduction methods are roughly divided into two types: band selection [10,11] and feature extraction [12,13]. Band selection directly discards most of the bands and only selects a subset of bands for subsequent HSI analysis. Recently, Zhang et al. [14] proposed a marginalized graph self-representation (MGSR) method for unsupervised hyperspectral band selection, which considers the differences between homogeneous regions. First, super-pixel segmentation is used to construct the structure graph of the HSI [15]. Then, taking into account the relationship between adjacent pixels within the same segment, the marginalized self-representation is solved by an alternating optimization algorithm to search for the best band selection scheme. However, computation of the graph matrix takes a lot of time, and the result of super-pixel segmentation affects the subsequent band selection. Mou et al. [16] studied deep reinforcement learning models for HSI analysis and first proposed an automatic learning strategy for band selection under the Markov decision process framework. The method first uses a parametric method to rank the bands and then utilizes a deep reinforcement learning model to select the best subset of bands. Because it is a learning-based algorithm, it takes much time during the training phase to explore efficient band selection strategies. Feature extraction generally uses a mapping to linearly transform all bands of an HSI. Tamilarasi et al. [17] proposed a new dimensionality reduction and classification technique, which extracts and classifies road and building features from HSIs with high accuracy. This technique combines independent component analysis (ICA) [18], principal component analysis (PCA) [19], a fully convolutional network (FCN) [20], and support vector machine (SVM) models [21]. These dimensionality reduction methods are cumbersome and cannot be computed quickly.
The random projection (RP) algorithm is an emerging dimensionality reduction technique that satisfies the Johnson-Lindenstrauss (JL) lemma [22,23,24]. The algorithm does not require any prior knowledge of the original data [25]. In addition, a randomly generated projection matrix is used to achieve dimensionality reduction, where the column vector of the projection matrix is unit length. This makes the RP algorithm easier to implement and more convenient to calculate [26]. Additionally, the RP algorithm can guarantee that the distance between any pair of vectors before and after projection is kept within a small range [27,28]. Because many machine learning algorithms use the distance information between vectors to operate and execute [29,30], the dataset obtained after dimensionality reduction by the RP algorithm can be well applied to machine learning algorithms. Therefore, more and more researchers apply it to the dimensionality reduction of HSI data.
At present, many researchers in the field of the RP algorithm have proposed classification algorithms based on RP [31,32]. These methods can be grouped into three types: combined feature extraction methods [33], separability-boosting methods [34], and ensemble methods [35]. Combined feature extraction methods combine RP with other feature extraction methods. Zhao and Mao [36] proposed a semi-random projection method, which uses linear discriminant analysis (LDA) to calculate each column vector of the projection matrix and builds the projection matrix by repeating this multiple times. It strikes a good balance between computational complexity and classification accuracy and can effectively preserve the distance invariance of the data when dealing with high-dimensional data. Deegalla and Bostrom [37] compared the dimensionality reduction effects of PCA and the RP algorithm and combined PCA and RP to propose a nearest-neighbor classification algorithm. This method not only improves efficiency but also enables more effective nearest-neighbor classification. However, it cannot achieve dimensionality reduction when dealing with large-size HSIs. Separability-boosting methods preserve the class separability of the original data during projection. Zhao et al. [38] proposed a new tighter RP with minimal intra-class variance (TRP-MIV) algorithm for HSI classification, which supplies a promising way for HSI dimensionality reduction. Nevertheless, when the number of hyperspectral vectors is very large, the dimensionality reduction effectiveness of the TRP-MIV algorithm is still limited. Ensemble methods combine RP with fuzzy clustering algorithms. Rathore et al. [39] studied the cumulative agreement fuzzy c-means (CAFCM) algorithm, which uses RP to generate multiple dimensionality reduction results. In addition, Anderlucci et al. [40] proposed a model-based high dimensional data clustering method based on the idea of ensemble methods. The method first generates a set of low dimensional independent projection results and performs model-based clustering on each of them. Then, the projection with the best grouping structure is selected, and the final partition is obtained by aggregating the clusters found in the projections by consensus. However, the dimensionality reduction results produced by each projection are only guaranteed to be similar to the original data and cannot be applied directly to classification tasks. The RP algorithm has the following shortcomings in large-size HSI classification. (1) The lowest projection dimensionality of the RP algorithm is independent of the original dimensionality but positively correlated with the number of hyperspectral vectors [41]: the larger the number of hyperspectral vectors, the larger the lowest projection dimensionality. (2) Different projection matrices generate different dimensionality reduction results, so the class separability of the dimensionality reduction results is very unstable. Thus, the RP algorithm cannot reduce the dimensionality of an HSI with a large number of hyperspectral vectors, which is not conducive to large-size HSI classification.
Aiming at the above problems of the RP algorithm in large-scale dimensionality reduction and classification, an HSI classification algorithm based on partitioned RP (PRP) is proposed. Firstly, the PRP algorithm is constructed by means of image division, and the distance preservation of hyperspectral vector pairs after partitioning is shown to still hold from two perspectives: within each sub-HSI and between any two different sub-HSIs. In addition, a classifier based on the PRP algorithm is designed, in which the optimal projection matrix is determined according to the principle of large inter-class distance and small intra-class variance, and minimum distance (MD) [42,43] is used to classify the low dimensional HSI obtained by the PRP algorithm. The experimental results show that the PRP algorithm can project large-size HSIs into a subspace with a lower dimensionality than the RP algorithm, and the proposed classification algorithm can obtain good classification results. The structure of this paper is organized as follows. Section 2 and Section 3 give the materials and the proposed algorithm, respectively. The experimental results and discussion are provided in Section 4 and Section 5. Finally, this paper is concluded in Section 6.

2. Materials

This paper uses four publicly available datasets with validation data, namely, the real Pavia Centre, Salinas, Chikusei, and LongKou [44,45] images. The four images were acquired over Pavia Centre in northern Italy, the Salinas Valley in California, USA, Chikusei in Japan, and LongKou in China, respectively. Details of these images are listed in Table 1, including download address, year, sensor, spatial resolution, and number of bands. Spatial resolution represents the actual distance on the ground represented by a pixel.
It is worth noting that this paper uses HSIs of sizes 1096 × 531, 512 × 127, 1000 × 800, and 550 × 400, with 9, 13, 7, and 9 homogeneous regions, respectively, as shown in Figure 1. The classes included in the Pavia Centre image are water, trees, meadows, self-blocking bricks, bare soil, asphalt, bitumen, tiles, and shadows. The classes included in the Salinas image are broccoli green weeds 2, fallow, fallow rough plow, fallow smooth, stubble, celery, grapes untrained, soil vineyard develop, corn senesced weeds, lettuce romaine 4 week, lettuce romaine 5 week, lettuce romaine 6 week, and lettuce romaine 7 week. The classes included in the Chikusei image are water, bare soil (school), bare soil (farmland), natural plants, grass, rice field (grown), and row crops. The classes included in the LongKou image are corn, cotton, sesame, broad leaf soybean, narrow leaf soybean, rice, water, roads and houses, and mixed weed. Specifically, Figure 1 shows the false-color images, the standard images (i.e., validation data), and the legends. The band combinations used to replace the RGB bands in the false-color images of the four images are (49, 31, 15), (29, 20, 12), (54, 35, 22), and (130, 65, 18), respectively. Moreover, the specific classes are explained in the legends. In addition, because many vectors in these images do not contain any valid information, these vectors are discarded before processing. Therefore, the numbers of spectral vectors in these scenes are 109,794, 20,655, 9435, and 204,542, respectively.

3. The Proposed Algorithm

3.1. Random Projection

RP is an effective dimensionality reduction tool that keeps the distance between vector pairs approximately unchanged before and after projection. The usual version of the RP algorithm is defined below [26].
Theorem 1.
Suppose that A is any S × D matrix, where S is the data size and D is the feature dimensionality, and that its row vectors a_s lie in the D dimensional space, where s = 1, …, S is the index of row vectors. For constants ε, β > 0, an integer K_RP is chosen as follows,

K_{RP} \ge K_{RP}^{0} = \frac{4 + 2\beta}{\varepsilon^{2}/2 - \varepsilon^{3}/3} \log S   (1)

where K_RP^0 is the lowest projection dimensionality of the RP algorithm. R_RP is any D × K_RP matrix whose entries follow the standard normal (Gaussian) distribution. The S × K_RP matrix

B = \frac{1}{\sqrt{K_{RP}}} A R_{RP}   (2)

is the projection of A in the K_RP dimensional subspace. The row vector of matrix B is b_s. Then, with probability of no distortion at least

P_{RP} = 1 - S^{-\beta}   (3)

for any two row vectors a_s and a_{s'} of matrix A, the distance between the corresponding row vectors b_s and b_{s'} of matrix B is preserved as follows,

(1 - \varepsilon)\,\lVert a_{s} - a_{s'} \rVert^{2} \le \lVert b_{s} - b_{s'} \rVert^{2} \le (1 + \varepsilon)\,\lVert a_{s} - a_{s'} \rVert^{2}   (4)

where s' is the index of row vectors.
It can be seen from the above theorem that RP is a computationally simple method. The cost of forming a projection matrix R_RP and projecting the high dimensional data A into the K_RP dimensional subspace is O(D × K_RP × S). If the number of vectors S becomes larger, the lowest projection dimensionality K_RP^0 becomes larger according to Equation (1), and the probability of no distortion P_RP also becomes larger according to Equation (3). Therefore, when the number of vectors is large, although the probability of no distortion is high, the low dimensionality K_RP may be higher than the high dimensionality D, so no dimensionality reduction effect is achieved. Besides, the generation of the projection matrix is completely random, which leads to instability in the class separability of the generated dimensionality reduction result. In simpler terms, it cannot guarantee that the class separability of the dimensionality reduction result is good enough for the classification task.
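To make Theorem 1 concrete, the following minimal NumPy sketch computes the lowest projection dimensionality of Equation (1) and applies Equation (2). The function names and the synthetic data are illustrative assumptions, not the authors' implementation (the paper's experiments were run in MATLAB), and a natural logarithm is assumed in Equation (1), which reproduces the value K_RP^0 = 349 quoted for the Pavia Centre image in Section 3.2.

```python
import numpy as np

def lowest_rp_dimension(S, eps, beta):
    """Lowest projection dimensionality K0_RP of Equation (1), assuming a natural logarithm."""
    return int(np.ceil((4 + 2 * beta) / (eps ** 2 / 2 - eps ** 3 / 3) * np.log(S)))

def random_projection(A, K, rng=None):
    """Project the S x D matrix A into K dimensions with a Gaussian matrix, scaled as in Equation (2)."""
    rng = np.random.default_rng(rng)
    R = rng.standard_normal((A.shape[1], K))   # entries drawn from N(0, 1)
    return A @ R / np.sqrt(K)

# Pavia Centre setting from Section 3.2: S = 109,794 vectors, D = 102 bands.
S, D = 109_794, 102
K0 = lowest_rp_dimension(S, eps=1.0, beta=0.5)        # 349 > 102, so plain RP cannot reduce the dimensionality
B = random_projection(np.random.rand(1000, D), K0)    # small synthetic block, for illustration only
print(K0, B.shape)
```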

3.2. Partitioned Random Projection

The lowest projection dimensionality K_RP^0 of the RP algorithm is strictly limited by the parameters ε and β and by the number of hyperspectral vectors S. Take the Pavia Centre image with 102 bands as an example, in which the number of hyperspectral vectors is 109,794. With fixed parameters ε = 1 and β = 0.5, according to Equation (1), the lowest projection dimensionality of the RP algorithm is 349, which is much higher than the number of bands (102). Therefore, when processing large-size HSIs, if the number of bands is relatively low, the RP algorithm cannot achieve dimensionality reduction. Considering this problem, a new PRP algorithm is introduced in this section. PRP, a dimensionality reduction algorithm, can project a large-size HSI into a low dimensional subspace without serious distortion of pairwise distances.
The HSI can be expressed as a matrix U = [u1; …; us; …; uS], where S is the number of hyperspectral vectors, s is the index of hyperspectral vectors, and us is the sth hyperspectral vector. The implementation of the PRP algorithm involves partitioning an HSI into multiple sub-HSIs evenly according to the horizontal direction and then projecting each sub-HSI into a low dimensional subspace. Specifically, the matrix of the HSI can also be defined as U = [U1; …; Um; …; UM], where M is the number of sub-matrices, and m is the index of sub-matrices. Moreover, Um = [um1; …; umn; …; umN] is the mth N × D sub-matrix, where n is the index of hyperspectral vectors in the sub-matrix, and N is the number of hyperspectral vectors in the sub-matrix that is the same for all sub-matrices. Thus, S = M × N. Note that the dimensionality reduction method in this paper requires that an HSI must be divided equally. Otherwise, the first few hyperspectral vectors can be regarded as edges and discarded so that the HSI can be divided equally.
Next, the rationality of the distance preservation property guaranteed by this partition method is proven from two perspectives: within each sub-matrix and between any two different sub-matrices.
The distance changes before and after the projection of any two hyperspectral vectors within each sub-matrix are explained as follows. The conclusion of this theorem comes from the RP algorithm (Theorem 1).
Theorem 2.
For constants ε, β > 0, an integer K_PRP is chosen such that

K_{PRP} \ge K_{PRP}^{0} = \frac{4 + 2\beta}{\varepsilon^{2}/2 - \varepsilon^{3}/3} \log N = \frac{4 + 2\beta}{\varepsilon^{2}/2 - \varepsilon^{3}/3} \log \frac{S}{M}   (5)

where K_PRP^0 is the lowest projection dimensionality of the proposed algorithm. R_PRP is a D × K_PRP matrix whose entries follow the standard normal (Gaussian) distribution. The N × K_PRP sub-matrix

V_{m} = \frac{1}{\sqrt{K_{PRP}}} U_{m} R_{PRP}   (6)

is the projection of sub-matrix U_m in the K_PRP dimensional subspace. The low dimensional hyperspectral vectors of sub-matrix V_m are v_mn, namely, V_m = [v_m1; …; v_mn; …; v_mN]. Then, according to Equation (3), with probability of no distortion at least

P_{PRP1} = 1 - N^{-\beta} = 1 - (S/M)^{-\beta}   (7)

for any two hyperspectral vectors u_mn and u_mn' of sub-matrix U_m, the distance between the two low dimensional hyperspectral vectors v_mn and v_mn' of sub-matrix V_m is preserved as in Equation (4),

(1 - \varepsilon)\,\lVert u_{mn} - u_{mn'} \rVert^{2} \le \lVert v_{mn} - v_{mn'} \rVert^{2} \le (1 + \varepsilon)\,\lVert u_{mn} - u_{mn'} \rVert^{2}   (8)

where n' is the index of hyperspectral vectors in the sub-matrix.
According to Equation (5), the lowest projection dimensionality K0PRP decreases with the increase in the number of sub-matrices M and the decrease of parameter β. Moreover, with the increase of parameter ε, the change trend of lowest projection dimensionality K0PRP is to first decrease and then increase. Comparing Equation (1) and Equation (5), the lowest projection dimensionality of the PRP algorithm is smaller than that of the RP algorithm under the same parameters. Note that when M is equal to one, it is the projection without partitioning, that is, the RP algorithm. When M is equal to S, there is only one hyperspectral vector in each sub-matrix. Meanwhile, the lowest projection dimensionality is zero, according to Equation (5), so it can be projected into any space with a dimensionality greater than zero.
A projection matrix is randomly generated, and each sub-matrix is projected into the same low dimensional space with this projection matrix. On this basis, the distance changes before and after the projection of any two hyperspectral vectors in any two different sub-matrices are introduced. It is described in detail as Theorem 3. (Proof of Theorem 3 can be found in Appendix A).
Theorem 3.
For constants ε, β > 0, an integer K_PRP is chosen according to Equation (5). R_PRP is a D × K_PRP matrix whose entries follow the standard normal (Gaussian) distribution. The sub-matrix V_m is obtained based on Equation (6). Then, for the same ε, β, K_PRP, and R_PRP, with probability of no distortion at least

P_{PRP2} = 1 - 2N^{-\beta} = 1 - 2(S/M)^{-\beta}   (9)

for any hyperspectral vector u_mn in sub-matrix U_m and any hyperspectral vector u_m'n' in another sub-matrix U_m', the distance between the low dimensional hyperspectral vector v_mn in sub-matrix V_m and the low dimensional hyperspectral vector v_m'n' in sub-matrix V_m' is preserved,

(1 - \varepsilon)\,\lVert u_{mn} - u_{m'n'} \rVert^{2} \le \lVert v_{mn} - v_{m'n'} \rVert^{2} \le (1 + \varepsilon)\,\lVert u_{mn} - u_{m'n'} \rVert^{2}   (10)

where m' is the index of sub-matrices and n' is the index of hyperspectral vectors in sub-matrix U_m'. Significantly, m' is not equal to m (m' ≠ m).
Combining Theorem 2 and Theorem 3, the distance before and after the projection of any two hyperspectral vectors in HSIs can also remain unchanged with an acceptable probability of no distortion. Specifically, the minimum value of probabilities of no distortion within each sub-matrix and between two different sub-matrices is taken as the probability of no distortion of the PRP algorithm. The projection of the matrix U = [u1; …; us; …; uS] can be expressed as a matrix V = [v1; …; vs; …; vS], where the low dimensional hyperspectral vector vs is the projection of us. The proposed algorithm is interpreted as Theorem 4.
Theorem 4.
For constants ε and β > 0, an integer KPRP is chosen according to Equation (5). RPRP is a D × KPRP matrix whose entries belong to the standard normal Gaussian distribution. The N × KPRP sub-matrix Vm is defined based on Equation (6) to present the projection of Um in KPRP dimensional subspaces. Then, for the same ε, β, KPRP and RPRP, with the probability of no distortion at least
P_{PRP} = \min \left[ P_{PRP1}, P_{PRP2} \right] = 1 - 2(S/M)^{-\beta}   (11)
for any two hyperspectral vectors us and us′ of matrix U, the distance of the hyperspectral vectors vs and vs′ of matrix V remains as Equation (4).
The PRP algorithm can greatly reduce the dimensionality of HSIs, thereby reducing the computational load of subsequent HSI classification. Based on Equation (11), the probability of no distortion P_PRP increases as the parameter β increases and as the number of sub-matrices M decreases. Theorem 4 demonstrates that the PRP algorithm does not introduce significant distortion in HSIs. Meanwhile, the proposed algorithm can also attain a high probability of no distortion when only a small number of sub-matrices is used. In this case, the proposed algorithm approximately preserves the distance between each hyperspectral vector pair while projecting the HSI into a space with a lower dimensionality than the RP algorithm. Thus, considering Theorem 1 and Theorem 4 comprehensively, the PRP algorithm is feasible in calculation and dimensionality reduction for large-size HSIs. Moreover, when the number of sub-matrices is too small, the lowest projection dimensionality may still exceed the number of bands, so dimensionality reduction cannot be guaranteed. Therefore, this paper gives the minimum number of sub-matrices that guarantees the high dimensionality can be reduced, stated as Lemma 1. (Proof of Lemma 1 can be found in Appendix B).
Lemma 1.
Suppose that 1.5 > ε > 0 and β > 0, then
M > S\, e^{-D/24}   (12)
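As an illustrative check (using the Pavia Centre values quoted in Section 3.2, S = 109,794 hyperspectral vectors and D = 102 bands), Lemma 1 requires M > 109,794 × e^{−102/24} ≈ 1.6 × 10^3, i.e., roughly 1600 or more sub-HSIs before the lowest projection dimensionality can fall below the number of bands.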
Figure 2 presents an example of dividing an HSI with five million hyperspectral vectors into one million sub-HSIs. The number of hyperspectral vectors N in each sub-HSI is five in Figure 2, which is calculated by dividing the number of hyperspectral vectors S in the HSI by the number of sub-HSIs M, that is, N = S/M = 5,000,000/1,000,000 = 5.
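Putting the partition of Figure 2 and Theorem 2 together, the following NumPy sketch divides the hyperspectral vector matrix into M equal sub-matrices and projects every block with the same Gaussian matrix, as in Equation (6). The helper names (partition_hsi, prp) and the synthetic input are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def partition_hsi(U, M):
    """Split the S x D matrix U into M equal sub-matrices of N = S // M rows each.
    Leading rows that prevent an even split are discarded, as described above."""
    S, D = U.shape
    N = S // M
    return U[S - M * N:].reshape(M, N, D)

def prp(U, M, K, rng=None):
    """Partitioned random projection: every sub-matrix is projected with the SAME
    Gaussian matrix R_PRP (Equation (6)) and the blocks are re-assembled."""
    rng = np.random.default_rng(rng)
    blocks = partition_hsi(U, M)
    R = rng.standard_normal((blocks.shape[2], K))
    V = blocks @ R / np.sqrt(K)        # shape (M, N, K)
    return V.reshape(-1, K)

U = np.random.rand(10_000, 102)        # synthetic stand-in for an HSI matrix
print(prp(U, M=100, K=10).shape)       # (10000, 10)
```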

3.3. HSI Classification Based on the PRP Algorithm

The classification based on the PRP algorithm is introduced from two aspects. The first is the optimization strategy of the projection matrix, which increases the class separability of the dimensionality reduction result with the assistance of samples and is then applied within the PRP algorithm. The second is the HSI classification algorithm, which integrates the PRP algorithm for dimensionality reduction and the MD classifier for classification.

3.3.1. Optimization Strategy of the Projection Matrix

Because the projection matrix is generated randomly, the dimensionality reduction result may have low class separability and thus be unsuitable for subsequent HSI classification. Therefore, it is necessary to select a projection matrix that is suitable for the classification task. The optimization strategy of the projection matrix aims to increase the class separability of the low dimensional HSI. The class dissimilarity measurement is defined by a large inter-class distance and a small intra-class variance; it is calculated as the sum, over class pairs, of the inter-class distance divided by the intra-class variance. Specifically, with the aid of samples, the projection matrix that maximizes the class dissimilarity is selected among multiple samplings.
The sample matrix for all classes of an HSI is expressed as X. Specifically, X = [X_1; …; X_l; …; X_L], where l is the index of classes and L is the number of classes, which is known a priori, and X_l is the lth sample matrix,

X_{l} = \begin{bmatrix} x_{11}^{l} & x_{12}^{l} & \cdots & x_{1D}^{l} \\ x_{21}^{l} & x_{22}^{l} & \cdots & x_{2D}^{l} \\ \vdots & \vdots & \ddots & \vdots \\ x_{H1}^{l} & x_{H2}^{l} & \cdots & x_{HD}^{l} \end{bmatrix}   (13)
where H is the number of samples in a single class, which is the same for each class. The selection of the projection matrix RPRP is very important for the HSI classification. To be specific, many matrices are generated based on the standard normal Gaussian distribution. These matrices constitute a sampling matrix of the projection matrix, Q = [Q1, …, Qt, …, QT], where t is the index of sampling, T is the number of samplings, and Qt is the projection matrix of the tth sampling. Then, each projection matrix Qt is used to project the samples matrix X. The low dimensional samples matrix W = [W1, …, Wt, …, WT] is the projection of samples matrix X by the sampling matrices Q. Moreover, its calculation is derived from Equation (6), that is, Wt = XQt/√KPRP. Specifically, Wt = [Wt1, …, Wtl, …, WtL], where Wtl is the low dimensional samples matrix of class l of the tth sampling. That is,
W_{tl} = \begin{bmatrix} w_{11}^{tl} & w_{12}^{tl} & \cdots & w_{1K_{PRP}}^{tl} \\ w_{21}^{tl} & w_{22}^{tl} & \cdots & w_{2K_{PRP}}^{tl} \\ \vdots & \vdots & \ddots & \vdots \\ w_{H1}^{tl} & w_{H2}^{tl} & \cdots & w_{HK_{PRP}}^{tl} \end{bmatrix}   (14)
The feature mean hyperspectral vector of low dimensional samples matrix Wtl of class l of the tth sampling is described as
W_{\mathrm{mean}}^{tl} = \frac{1}{H} \begin{bmatrix} w_{11}^{tl} + \cdots + w_{H1}^{tl} & w_{12}^{tl} + \cdots + w_{H2}^{tl} & \cdots & w_{1K_{PRP}}^{tl} + \cdots + w_{HK_{PRP}}^{tl} \end{bmatrix}   (15)
Next, the distance between the lth class and the l'th class, divided by the variance of the lth class, is defined as follows,

I_{ll'}^{t} = \frac{\lVert W_{\mathrm{mean}}^{tl} - W_{\mathrm{mean}}^{tl'} \rVert^{2}}{\operatorname{var}\left( W_{\mathrm{mean}}^{tl} \right)}   (16)
where l′ is the index of classes. Then, the class dissimilarity is calculated as follows,
J_{t} = \sum_{l=1}^{L} \sum_{l'=1}^{L} I_{ll'}^{t}   (17)
The best projection matrix is the t*th sampled projection matrix, i.e., the one with the largest class dissimilarity. Then,

t^{*} = \underset{t = 1, \ldots, T}{\arg\max}\; J_{t}   (18)
Finally, the projection matrix R_PRP is set to the t*th sampled projection matrix Q_t*, where t* is obtained by Equation (18),

R_{PRP} = Q_{t^{*}}   (19)
The projection matrix selection strategy can increase the separability of dimensionality reduction results, thereby increasing the accuracy of HSI classification.
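The selection strategy can be sketched in a few lines of NumPy. This is an illustrative reading of Equations (13)-(19), not the authors' implementation, and it interprets the denominator of Equation (16) as the variance of class l's projected samples, in line with the verbal description of a small intra-class variance.

```python
import numpy as np

def class_dissimilarity(W_classes):
    """J_t of Equation (17): inter-class distances between class mean vectors,
    each divided by the intra-class variance of the first class (Equation (16))."""
    means = [Wl.mean(axis=0) for Wl in W_classes]
    J = 0.0
    for l, Wl in enumerate(W_classes):
        var_l = Wl.var()                              # intra-class variance of class l
        for lp in range(len(W_classes)):
            if lp != l:
                J += np.sum((means[l] - means[lp]) ** 2) / var_l
    return J

def select_projection_matrix(X_classes, K, T, rng=None):
    """Draw T Gaussian candidates Q_1..Q_T, project the samples of every class,
    and keep the candidate maximizing J_t (Equations (18)-(19))."""
    rng = np.random.default_rng(rng)
    D = X_classes[0].shape[1]
    best_J, best_Q = -np.inf, None
    for _ in range(T):
        Q = rng.standard_normal((D, K))
        W_classes = [Xl @ Q / np.sqrt(K) for Xl in X_classes]
        J = class_dissimilarity(W_classes)
        if J > best_J:
            best_J, best_Q = J, Q
    return best_Q
```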

3.3.2. HSI Classification Based on Optimization Strategy of the Projection Matrix

The HSI classification algorithm is described in this section. First, the low dimensional sample matrix Y = [Y_1; …; Y_l; …; Y_L] is the projection of the sample matrix X by the projection matrix obtained according to Equation (19), with K_PRP dimensions. Specifically,

Y_{l} = \begin{bmatrix} y_{11}^{l} & y_{12}^{l} & \cdots & y_{1K_{PRP}}^{l} \\ y_{21}^{l} & y_{22}^{l} & \cdots & y_{2K_{PRP}}^{l} \\ \vdots & \vdots & \ddots & \vdots \\ y_{H1}^{l} & y_{H2}^{l} & \cdots & y_{HK_{PRP}}^{l} \end{bmatrix}   (20)
is the lth low dimensional samples matrix. The feature mean hyperspectral vector of the lth low dimensional samples is expressed as
Y_{\mathrm{mean}}^{l} = \frac{1}{H} \begin{bmatrix} y_{11}^{l} + \cdots + y_{H1}^{l} & y_{12}^{l} + \cdots + y_{H2}^{l} & \cdots & y_{1K_{PRP}}^{l} + \cdots + y_{HK_{PRP}}^{l} \end{bmatrix}   (21)
Consequently, an HSI is denoted by U = [u1, …, us, …, uS], where s is the index of hyperspectral vectors, S is the number of hyperspectral vectors, and us is a hyperspectral vector with D dimensionalities. The low dimensional HSI V = [v1, …, vs, …, vS] is obtained by the PRP algorithm, where vs is a low dimensional hyperspectral vector with KPRP dimensionalities. Then, the final classes to which the low dimensional HSIs belong are determined by executing the MD classifier. That is to calculate the distance between the low dimensional hyperspectral vector and the samples feature mean hyperspectral vector of each class.
The distance matrix is characterized as Z = [z_1, …, z_s, …, z_S], where z_s is the sth distance vector. Specifically, z_s = [z_s1, …, z_sl, …, z_sL], where z_sl is the distance between a low dimensional hyperspectral vector v_s and the feature mean hyperspectral vector Y_mean^l of Equation (21), calculated as follows,

z_{sl} = \lVert Y_{\mathrm{mean}}^{l} - v_{s} \rVert   (22)
The determined classification result can be defined as f = [f1, …, fs, …, fS] in the HSI classification process. That is, the class with the smallest distance is the class to which the low dimensional hyperspectral vector belongs,
f_{s} = \arg\min z_{s} = \underset{l = 1, \ldots, L}{\arg\min}\; z_{sl}   (23)
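A minimal sketch of this MD step (Equations (22)-(23)) follows; the function name and inputs are illustrative assumptions.

```python
import numpy as np

def md_classify(V, class_means):
    """Minimum-distance classification: assign each low dimensional hyperspectral
    vector to the class whose sample mean vector is closest (Equations (22)-(23))."""
    # V: (S, K_PRP) low dimensional HSI; class_means: (L, K_PRP) mean vectors Y^l_mean.
    Z = np.linalg.norm(V[:, None, :] - class_means[None, :, :], axis=2)   # (S, L) distance matrix Z
    return np.argmin(Z, axis=1)                                           # class index f_s per vector
```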

3.4. The Complexity of the Proposed Algorithm

The space and time complexity of the proposed algorithm are analyzed here. This section studies the complexity in two parts: the optimization strategy of the projection matrix and the HSI classification algorithm based on the optimization strategy of the projection matrix.
The main contribution of the complexity of the optimization strategy of the projection matrix is to calculate the projection matrix Q. To update Q, it takes O(HLD) space and O(HLDKPRP) time for the calculation of the low dimensional sample matrix W. Furthermore, O(HKPRP) space and O(THL2KPRP) time are required to calculate the class dissimilarity Jt. The main contribution of the complexity of the HSI classification algorithm based on the optimization strategy of the projection matrix is to calculate the distance matrix Z. To update Z, it takes O(MND) space and O(MNDKPRP) time for the calculation of the low dimensional image V. Furthermore, O(MNKPRP) space and O(MNLKPRP) time are required to calculate the distance zsl. To sum up, the overall space complexity of the proposed algorithm is O(MND), and the overall time complexity of the proposed algorithm is O(HLDKPRP + THL2KPRP + MNDKPRP + MNLKPRP).

3.5. The Flow Chart of the Proposed Algorithm

The detailed process of the proposed Algorithm 1 is described as follows.
Algorithm 1 The detailed process of the proposed classification algorithm
Input: HSI U and samples X.
Output: Classification result f.
Step 1. Initializing parameters ε, β, L, T, H, M, S.
Step 2. Dividing an HSI evenly U = [U1; …; UM].
Step 3. Calculating the lowest projection dimensionality K0PRP by using Equation (5).
Step 4. Forming the projection matrix RPRP by using Equation (19).
Step 5. Generating a low dimensional sub-HSI Vm by using Equation (6).
Step 6. Getting the low dimensional HSI V = [V1; …; VM].
Step 7. Generating the feature mean hyperspectral vector of the low dimensional samples of lth class Ylmean by using Equation (21).
Step 8. Calculating the distance zsl by using Equation (22).
Step 9. Determining the classification result f by using Equation (23).
In addition, Figure 3 illustrates a flow chart of the proposed classification algorithm in this paper.
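To tie the pieces together, a compact driver mirroring the steps of Algorithm 1 is sketched below. It reuses the hypothetical helpers partition_hsi, select_projection_matrix, and md_classify introduced earlier, assumes an equal number of samples H per class as in the paper, and is only an illustrative reading of the algorithm, not the authors' code.

```python
import numpy as np

def prp_classify(U, X_classes, M, eps=1.0, beta=0.5, T=50, rng=None):
    """End-to-end sketch of Algorithm 1 (illustrative only)."""
    S, D = U.shape
    N = S // M                                                                   # Step 2: even partition
    K = int(np.ceil((4 + 2 * beta) / (eps ** 2 / 2 - eps ** 3 / 3) * np.log(N))) # Step 3: Equation (5)
    R = select_projection_matrix(X_classes, K, T, rng)                           # Step 4: Equation (19)
    V = (partition_hsi(U, M) @ R / np.sqrt(K)).reshape(-1, K)                    # Steps 5-6: Equation (6)
    Y_means = np.stack([Xl @ R / np.sqrt(K) for Xl in X_classes]).mean(axis=1)   # Step 7: Equation (21)
    return md_classify(V, Y_means)                                               # Steps 8-9: Equations (22)-(23)
```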

4. Experiments and Results

To validate the effectiveness of the proposed algorithm, classification experiments on real HSIs were performed on a PC with an Intel(R) Xeon(R) CPU E7-8880 v3 at 2.30 GHz and 96.0 GB of memory using MATLAB R2020a. The experimental parameters are set as shown in Table 2. The table contains the number of samplings T, the number of samples in each class H, the number of sub-HSIs M, and the low dimensionalities. It can be seen from Table 2 that K_PRP is less than three-quarters of K_TRP and one-eighth of K_RP, which greatly reduces the amount of computation. In addition, K_RP is larger than the high dimensionality D, which means that the RP algorithm cannot achieve the purpose of dimensionality reduction.
To verify the effectiveness of the projection matrix selection strategy, taking the LongKou image as an example, a randomly generated projection matrix (i.e., a projection matrix without selection) is used as a comparison. Figure 4 shows the change in the spectral curves of the mean sample vectors of the 9 classes of the LongKou image, which is used to evaluate the change in inter-class distances in the dimensionality reduction results. The horizontal coordinates represent the band index, and the vertical coordinates represent the intensity value. Figure 4a–c represent the spectral curves of all classes without dimensionality reduction, with dimensionality reduction but without matrix selection, and with dimensionality reduction based on matrix selection, with legends for all classes, respectively. Comparing the three images in Figure 4, the low dimensional spectral curves obtained by the projection matrix selection strategy have good separability for each class in each dimensionality, which fully reflects the superiority of the projection matrix optimization strategy.
For the sake of validating the proposed classification algorithm, the TRP-MIV algorithm, the algorithm for combining PRP and MGSR (PRP-MGSR), and the CAFCM algorithm are used as comparison algorithms. It is worth noting that the classification experiments of all experimental images have been tested 100 times. For all algorithms, the classification results with the highest accuracy in 100 trials are exhibited in Figure 5. Figure 5(a1–a4,b1–b4,c1–c4,d1–d4) show the classification results for four experimental images by the proposed algorithm and the TRP-MIV, PRP-MGSR, and CAFCM algorithms, respectively. Additionally, Figure 5(a5–d5) are legends for all classes of the four experimental images. Meanwhile, to better show the classification effect of the proposed algorithm, Figure 6 superimposes the final classification results of HSIs on the respective false-color images (see Figure 1(a1–d1) for details). Figure 6 shows the outlines of the superposition results of the four experimental images of the four algorithms.
From a qualitative point of view, compared with the other three comparison algorithms, the classification performance of the proposed algorithm is superior, which proves the reliability of the proposed classification algorithm for large-size HSIs. According to the qualitative experimental results, the TRP-MIV algorithm produces misclassification for complex HSIs with a large number of hyperspectral vectors. In addition, compared with the PRP-MGSR algorithm, the proposed algorithm simultaneously considers the inter-class distance and intra-class variance and can achieve better classification results. The proposed algorithm preserves the class separability so that each class is classified accurately. In theory, the more the dimensionality is reduced, the more spectral information is lost, which should result in an unsatisfactory classification effect. However, the proposed algorithm attains a classification effect better than the CAFCM algorithm for the four experimental images. Moreover, the CAFCM algorithm needs to project into a space whose dimensionality is higher than the original dimensionality, so it cannot achieve the effect of dimensionality reduction.
The kappa coefficient, overall accuracy (OA), average accuracy (AA), average precision rate (APR), and running time are calculated to effectively evaluate the classification performance. Table 3, Table 4, Table 5 and Table 6 reveal the quantitative evaluation results of four experimental images, respectively. The values in the tables are the mean and variance of the kappa coefficient, OA, AA, APR, and running time calculated by different algorithms for 100 trials, in which the values in brackets are the variance and those not in are the mean.
From a quantitative perspective, the classification results of the proposed algorithm surpass other algorithms. The running time of the proposed method for four image classification experiments is about 1.16, 0.37, 0.21, and 2.42 s, and the mean values of their kappa coefficient are larger than 0.82. All evaluation parameters intuitively state that the proposed algorithm accurately classifies the real HSIs. According to quantitative experimental results, although the mean values of AA and APR are less than 80% for the Pavia Centre image, the mean value of OA is more than 89%. For the Salinas image, the variance value of the kappa coefficient of the proposed algorithm in 100 trials is approximately equal to zero, which indicates that the classification results of the proposed algorithm are relatively stable. In addition, for the Chikusei image, the mean values of all evaluation parameters of the proposed algorithm are at least 0.05, 3.59%, 4.73%, 7.28%, and 7.57 times larger than those of the comparison algorithms, which fully reflects the superiority of the proposed classification algorithm. Although the kappa coefficient of the proposed algorithm is not very high for the LongKou image, the accuracy of the proposed algorithm is still higher than that of all comparison algorithms. Additionally, the values of OA, AA, and APR outperform the comparison algorithms by at least 16.51%, 21.35%, and 35.19%. In summary, the proposed algorithm can obtain reliable classification results in a very short time.

5. Discussion

In order to verify the superiority of the proposed classification algorithm, all classification experiments have been tested 100 times. The TRP-MIV algorithm [38] has better classification results for HSIs with a large inter-class distance. The PRP-MGSR algorithm is relatively stable in classification results in multiple tests. This is because the MGSR algorithm [14] has strong applicability to various data, but the running time is relatively long. The CAFCM algorithm [39] is more likely to be disturbed by noise, and the classification results of the algorithm are better for images with less noise. The proposed classification algorithm considers the class dissimilarity, which can reach the ideal classification result in a shorter time.
In order to use the proposed algorithm more widely, this section discusses the influence of the projection parameters on dimensionality reduction and the influence of the number of samples and the number of samplings on the classification accuracy.

First, the focus is to analyze the effects of the parameters ε and β and the number of sub-HSIs M on the lowest projection dimensionality K_PRP^0. According to Equation (5), taking the Salinas image as an example, Figure 7 presents the change curves of the lowest projection dimensionality with different parameters, where the horizontal coordinates are the different parameters and the vertical coordinates are K_PRP^0. Figure 7a is the change curve of ε and K_PRP^0 when β = 0.5 and M = 2295 (according to Table 2), where the position of the black dotted line is the inflection point. When ε is equal to 1.5, the denominator term of Equation (5) is 0, so the curve in Figure 7a is discontinuous at ε = 1.5. When ε < 1.5, K_PRP^0 first decreases and then increases with the increase of ε and takes its minimum value at ε = 1. When ε > 1.5, K_PRP^0 increases with the increase of ε and is always negative, which is meaningless for dimensionality reduction. Figure 7b is the change curve of β and K_PRP^0 when ε = 1 and M = 2295. As seen in Figure 7b, K_PRP^0 increases with the increase of β. Figure 7c is the change curve of M and K_PRP^0 when ε = 1 and β = 0.5. The position of the black dotted line is the number of bands, and the values on the upper and left sides of the boundary line cannot achieve dimensionality reduction. As seen from Figure 7c, K_PRP^0 decreases with the increase of M. Therefore, to reduce computation and memory, it is necessary to project the HSI into a space of smaller dimensionality: it is best to set ε to 1 within (0, 1.5), to set β as small as possible, and to set M as large as possible.

Next, in order to reflect the impact of the number of samples H and the number of samplings T on the classification performance, the change curves of the OA values with different T and H are shown in Figure 8. The numbers of samples and samplings are each taken from 10 to 650. Figure 8 shows that as T and H increase, the OA value varies only within a small range; the larger the number of samplings and samples, the smaller the range of variation. Among them, Figure 8a shows that with the increase of T, the error of the OA value is 1%, and Figure 8b shows that with the increase of H, the error of the OA value is 2.5%. Therefore, in practical applications, only a small number of samples and a few projection matrix samplings are needed to obtain good classification results.

In addition, in order to ensure that the PRP algorithm can still preserve distances after the HSI is divided, the number of samples in each class in this paper is set to be the same. The focus of future research will be on dividing an HSI into sub-HSIs with different numbers of hyperspectral vectors to improve utility.
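The parameter analysis around Equation (5) is easy to reproduce numerically. The short sketch below sweeps ε for the Salinas-like setting mentioned above (S = 20,655, M = 2295, β = 0.5); it is purely illustrative, the printed values being computed from Equation (5) with a natural logarithm rather than taken from Table 2 or Figure 7.

```python
import numpy as np

def k0_prp(S, M, eps, beta):
    """Lowest projection dimensionality of the PRP algorithm, Equation (5)."""
    return (4 + 2 * beta) / (eps ** 2 / 2 - eps ** 3 / 3) * np.log(S / M)

S, M = 20_655, 2295                      # Salinas setting discussed above
for eps in (0.5, 1.0, 1.4):
    print(eps, round(k0_prp(S, M, eps, beta=0.5), 1))   # minimum of the curve lies near eps = 1
```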

6. Conclusions

A new dimensionality reduction method, the PRP algorithm, is presented for large-size HSIs in this paper. It is worth mentioning that this paper also theoretically proves the distance preservation property of the PRP algorithm in detail. The PRP algorithm projects a large-size HSI into a space with a lower dimensionality than the RP algorithm; therefore, when applied to large-size HSIs, the PRP algorithm no longer fails to reduce the dimensionality. In addition, the proposed algorithm utilizes a large inter-class distance and a small intra-class variance as the class dissimilarity measurement for selecting the projection matrix, which preserves the class separability of the low dimensional HSI well. Compared with the TRP-MIV, PRP-MGSR, and CAFCM algorithms, the proposed algorithm has the shortest running time and the highest classification accuracy.

Author Contributions

Conceptualization, S.J., Q.Z. and Y.L.; methodology, S.J., Q.Z. and Y.L.; software, S.J.; validation, S.J.; formal analysis, S.J.; writing—original draft preparation, S.J.; writing—review and editing, S.J., Q.Z. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Project for Key Scientific Issues from the Education Department of Liaoning 2020, grant number LJ2020ZD003.

Data Availability Statement

Acknowledgments

The authors would like to thank Gamba P. for providing the ROSIS Pavia Centre data, Johnson L. and Gualtieri J. A. for providing the AVIRIS Salinas data, Naoto Yokoya and Akira Iwasaki for providing the Chikusei data, and the Intelligent Data Extraction, Analysis and Applications of Remote Sensing (RSIDEA) academic research group, State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing (LIESMARS), Wuhan University, for providing the LongKou data.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Theorem 3.
To facilitate understanding, some basic statistical concepts and properties that need to be used in the proof are introduced.
Markov inequality. The Markov inequality gives an upper bound on the probability that a non-negative random variable is greater than or equal to a positive number. Let X be a non-negative random variable and α > 0; then,

\Pr (X \ge \alpha) \le \frac{E(X)}{\alpha}   (A1)
Standard normal distribution. If α obeys the standard normal distribution, then
p(\alpha) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{\alpha^{2}}{2} \right)   (A2)
Function integral. When 1/2 > a > 0,
\int_{-\infty}^{+\infty} \exp\left( -\frac{x^{2}}{2a^{2}} \right) dx = \sqrt{2\pi}\, a   (A3)
Taylor expansion. A formula that describes the value around a point of an exponential function is as follows.
\exp(-\varepsilon) = 1 - \varepsilon + \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{6} + \cdots   (A4)
The specific proof is given below. Suppose that the probability that the squared distance of a hyperspectral vector pair before and after projection falls outside the range (1 ± ε) is P_distortion,

P_{\mathrm{distortion}} = \Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \;\cup\; \lVert v_{mn} - v_{mn'} \rVert^{2} \le (1-\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right)   (A5)
Either of these two cases signifies that the distortion exceeds a factor of ε. Without loss of generality, consider the case Pr(‖v_mn − v_mn'‖² ≥ (1 + ε)‖u_mn − u_mn'‖²), and suppose that

C = K_{PRP}\, \frac{\lVert v_{mn} - v_{mn'} \rVert^{2}}{\lVert u_{mn} - u_{mn'} \rVert^{2}}   (A6)
Then,
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) = \Pr\left( C \ge (1+\varepsilon) K_{PRP} \right)   (A7)
For any constant λ > 0, continue to construct the above formula, then,
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) = \Pr\left( \exp(C\lambda) \ge \exp\left( (1+\varepsilon) K_{PRP} \lambda \right) \right)   (A8)
The probability that the projected distance will remain the same can be linked to the mathematical expectation of the projected distance according to Equation (A1). Then, the first scaling can be done for Pr (||vmnvmn||2 ≥ (1 + ε) ||umnumn||2),
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \frac{E\left[ \exp(C\lambda) \right]}{\exp\left( (1+\varepsilon) K_{PRP} \lambda \right)}   (A9)
Meanwhile, according to Equation (6), the difference between the vectors v_mn and v_mn' is

v_{mn} - v_{mn'} = \frac{1}{\sqrt{K_{PRP}}} (u_{mn} - u_{mn'})\, R_{PRP}   (A10)
Then,
C = K_{PRP}\, \frac{\lVert v_{mn} - v_{mn'} \rVert^{2}}{\lVert u_{mn} - u_{mn'} \rVert^{2}} = \frac{\lVert (u_{mn} - u_{mn'})\, R_{PRP} \rVert^{2}}{\lVert u_{mn} - u_{mn'} \rVert^{2}}   (A11)
Suppose that C_k = (u_mn − u_mn') r_k / ‖u_mn − u_mn'‖, where r_k is the kth column of R_PRP. Because C_k is a weighted sum of independent standard normal variables whose weight vector has unit norm, it also follows the standard normal distribution. Hence,

C = \sum_{k=1}^{K_{PRP}} \frac{\left( (u_{mn} - u_{mn'})\, r_{k} \right)^{2}}{\lVert u_{mn} - u_{mn'} \rVert^{2}} = \sum_{k=1}^{K_{PRP}} C_{k}^{2}   (A12)
Since the C_k are independent and identically distributed, Pr(‖v_mn − v_mn'‖² ≥ (1 + ε)‖u_mn − u_mn'‖²) can be bounded as follows,

\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \frac{E\left[ \exp(C\lambda) \right]}{\exp\left( (1+\varepsilon) K_{PRP} \lambda \right)} = \frac{\prod_{k=1}^{K_{PRP}} E\left[ \exp(C_{k}^{2}\lambda) \right]}{\exp\left( (1+\varepsilon) K_{PRP} \lambda \right)} = \left( \frac{E\left[ \exp(C_{1}^{2}\lambda) \right]}{\exp\left( (1+\varepsilon) \lambda \right)} \right)^{K_{PRP}}   (A13)
Because C_1 obeys the standard normal distribution, according to Equation (A2),

E\left[ \exp(C_{1}^{2}\lambda) \right] = \int_{-\infty}^{+\infty} \exp(C_{1}^{2}\lambda)\, p(C_{1})\, dC_{1} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \exp\left( -\frac{(1 - 2\lambda)\, C_{1}^{2}}{2} \right) dC_{1}   (A14)
Based on Equation (A3), combining the above equation and Equation (A13),
E\left[ \exp(C_{1}^{2}\lambda) \right] = \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} \cdot \frac{1}{\sqrt{1 - 2\lambda}} = \frac{1}{\sqrt{1 - 2\lambda}}   (A15)
Therefore,
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \left( \frac{1}{\sqrt{1 - 2\lambda}\, \exp\left( (1+\varepsilon)\lambda \right)} \right)^{K_{PRP}} = \left( \frac{\exp\left( -2(1+\varepsilon)\lambda \right)}{1 - 2\lambda} \right)^{K_{PRP}/2}   (A16)
In order to perform the next scaling on the base term of the above equation, we provide that
g(\lambda) = \frac{\exp\left( -2(1+\varepsilon)\lambda \right)}{1 - 2\lambda}   (A17)
Its minimum value is attained at λ_0 = ε/(2(1 + ε)), where g(λ_0) = (1 + ε)exp(−ε). Then,

\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \left( (1+\varepsilon) \exp(-\varepsilon) \right)^{K_{PRP}/2}   (A18)
Therewith, according to Equation (A4), the probability that a single hyperspectral vector pair violates the upper distance bound satisfies

\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \exp\left( -\frac{K_{PRP}}{2} \left( \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \right) \right)   (A19)
Similarly,
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \le (1-\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le \exp\left( -\frac{K_{PRP}}{2} \left( \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \right) \right)   (A20)
According to the above derivation, the probability of distortion is as follows,
\Pr\left( \lVert v_{mn} - v_{mn'} \rVert^{2} \ge (1+\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \;\cup\; \lVert v_{mn} - v_{mn'} \rVert^{2} \le (1-\varepsilon) \lVert u_{mn} - u_{mn'} \rVert^{2} \right) \le 2 \exp\left( -\frac{K_{PRP}}{2} \left( \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \right) \right)   (A21)
Since the number of hyperspectral vector pairs is N2, the probability of making these hyperspectral vector pairs undistorted based on Equation (5) can be calculated as below,
1 - N^{2} \cdot 2 \exp\left( -\frac{K_{PRP}}{2} \left( \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \right) \right) \ge 1 - 2 N^{2} \exp\left( -(2+\beta) \log N \right) = 1 - 2 \exp\left( -(2+\beta) \log N + 2 \log N \right) = 1 - 2 \exp(-\beta \log N) = 1 - 2 N^{-\beta}   (A22)
 □

Appendix B

Proof of Lemma 1.
For Theorem 4 to achieve dimensionality reduction, the lowest projection dimensionality K_PRP^0 must be smaller than the high dimensionality D; based on Equation (5), this requires

\frac{4 + 2\beta}{\varepsilon^{2}/2 - \varepsilon^{3}/3} \log \frac{S}{M} \le D \qquad (\varepsilon, \beta > 0)
For β > 0, the term 4 + 2β is larger than 4. Moreover, S/M is the number of hyperspectral vectors in a sub-HSI, N, which is larger than 1, so the logarithmic term log S/M is larger than zero. When the parameter ε is larger than 1.5, the term ε²/2 − ε³/3 is smaller than 0, so the value on the left side of the above inequality is always negative; in this case the inequality always holds, so this paper does not discuss it further. In addition, if the parameter ε lies between 0 and 1.5, the term ε²/2 − ε³/3 is larger than 0, and the value on the left side is always positive. Then,
\frac{4 + 2\beta}{D} \log \frac{S}{M} \le \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \qquad (1.5 > \varepsilon > 0,\ \beta > 0)
The right side can be seen as a function g(ε) = ε²/2 − ε³/3 of ε. Within this range, the function first increases and then decreases, and g(ε) takes its maximum value g(ε_0) = 1/6 at ε_0 = 1. Thus, for the above inequality to hold with the best choice of ε in the range 0 < ε < 1.5, it is required that
\frac{4 + 2\beta}{D} \log \frac{S}{M} \le \frac{\varepsilon^{2}}{2} - \frac{\varepsilon^{3}}{3} \le \frac{1}{6} \qquad (1.5 > \varepsilon > 0,\ \beta > 0)
Moreover, when the parameter β equals zero, the left side takes the value (4/D) log(S/M). Since 4 < 4 + 2β for β > 0, if the above inequality holds, then
\frac{4}{D} \log \frac{S}{M} < \frac{4 + 2\beta}{D} \log \frac{S}{M} \le \frac{1}{6}
In this way, the minimum number of sub-matrices to ensure dimensionality reduction can be calculated as follows,
\frac{4}{D} \log \frac{S}{M} < \frac{1}{6} \;\Rightarrow\; \log \frac{S}{M} < \frac{D}{24} \;\Rightarrow\; \frac{S}{M} < e^{D/24} \;\Rightarrow\; M > S\, e^{-D/24}
 □

References

  1. Pan, H.; Chen, Z. Application of UVA hyperspectral remote sensing in winter wheat leaf area index inversion. Chin. J. Agric. Resour. Reg. Plan. 2018, 39, 32–37. [Google Scholar] [CrossRef]
  2. Xu, Q.; Ma, Y.; Jiang, Q.; Tong, C.; Zhao, Z. Estimation of rice leaf water content based on hyperspectral remote sensing. Remote Sens. Infom. 2018, 33, 136–143. [Google Scholar] [CrossRef]
  3. Chutia, D.; Bhattacharyya, D.K.; Sarma, K.K.; Kalita, R.; Sudhakar, S. Hyperspectral remote sensing classifications: A perspective survey. Trans. GIS 2016, 20, 463–490. [Google Scholar] [CrossRef]
  4. Wang, Y.; Reardon, C.P.; Read, N.; Thorpe, S.; Evans, A.; Todd, N.; Krauss Thomas, F. Attachment and antibiotic response of early-stage biofilms studied using resonant hyperspectral imaging. NPJ Biofilms Microbiomes 2020, 6, 57. [Google Scholar] [CrossRef]
  5. Koprowski, R. Hyperspectral imaging in medicine: Image pre-processing problems and solutions in Matlab. J. Biophotonics 2015, 8, 935–943. [Google Scholar] [CrossRef] [PubMed]
  6. Gu, X.; Wang, Y.; Sun, Q.; Yang, G.; Zhang, C. Hyperspectral inversion of soil organic matter content in cultivated land based on wavelet transform. Comput. Electron. Agric. 2019, 167, 105053–105059. [Google Scholar] [CrossRef]
  7. Mathieu, M.; Roy, R.; Launeau, P.; Cathelineau, M.; Quirt, D. Alteration mapping on drill cores using a hyspex swir-320m hyperspectral camera: Application to the exploration of an unconformity related uranium deposit (Saskatchewan, Canada). J. Geochem. Explor. 2017, 172, 71–88. [Google Scholar] [CrossRef]
  8. Su, H.; Sheng, Y.; Du, P.; Chen, C.; Liu, K. Hyperspectral image classification based on volumetric texture and dimensionality reduction. Front. Earth Sci. 2015, 9, 225–236. [Google Scholar] [CrossRef]
  9. Dong, Y.; Du, B.; Zhang, L.; Zhang, L. Exploring locally adaptive dimensionality reduction for hyperspectral image classification: A maximum margin metric learning aspect. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 10, 1136–1150. [Google Scholar] [CrossRef]
  10. Yao, D.; Zhao, P.; Pham, T.N.; Cong, G. High-Dimensional Similarity Learning Via Dual-Sparse Random Projection. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), Stockholm, Sweden, 13–19 July 2018; pp. 3005–3011. [Google Scholar] [CrossRef] [Green Version]
  11. Dou, Z.; Gao, K.; Zhang, X.; Wang, H.; Han, L. Band selection of hyperspectral images using attention-based autoencoders. IEEE Geosci. Remote. Sens. Lett. 2020, 18, 147–151. [Google Scholar] [CrossRef]
  12. Fan, J.; Liao, Y.; Wang, W. Projected principal component analysis in factor models. Ann. Stat. 2016, 44, 219–254. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Hua, W.; Lin, Y.; Huang, H.; Ding, C. From protein sequence to protein function via multi-label linear discriminate analysis. IEEE-ACM Trans. Comput. Biol. Bioinform. 2017, 14, 503–513. [Google Scholar] [CrossRef]
  14. Zhang, Y.; Wang, X.; Jiang, X.; Zhou, Y. Marginalized graph self-representation for unsupervised hyperspectral band selection. IEEE Trans. Geosci. Remote. Sens. 2021, 60, 1–12. [Google Scholar] [CrossRef]
  15. Saranathan, A.M.; Parente, M. Uniformity-based superpixel segmentation of hyperspectral images. IEEE Trans. Geosci. Remote. Sens. 2016, 54, 1419–1430. [Google Scholar] [CrossRef]
  16. Mou, L.; Saha, S.; Hua, Y.; Bovolo, F.; Zhu, X.X. Deep reinforcement learning for band selection in hyperspectral image classification. IEEE Trans. Geosci. Remote. Sens. 2022, 60, 1–14. [Google Scholar] [CrossRef]
  17. Tamilarasi, R.; Prabu, S. Automated building and road classifications from hyperspectral imagery through a fully convolutional network and support vector machine. J. Supercomput. 2021, 77, 13243–13261. [Google Scholar] [CrossRef]
  18. Filipović, M.; Kopriva, I. A comparison of dictionary based approaches to inpainting and denoising with an emphasis to independent component analysis learned dictionaries. Inverse Probl. Imag. 2017, 5, 815–841. [Google Scholar] [CrossRef] [Green Version]
  19. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A 2016, 374, 20150202. [Google Scholar] [CrossRef]
  20. Li, H.; Jiang, G.; Wang, R.; Zhang, J.; Wang, Z.; Zheng, W.S.; Menze, B. Fully convolutional network ensembles for white matter hyperintensities segmentation in MR images. NeuroImage 2018, 183, 650–665. [Google Scholar] [CrossRef] [Green Version]
  21. Meier, T.B.; Desphande, A.S.; Vergun, S.; Nair, V.A.; Prabhakaran, V. Support vector machine classification and characterization of age-related reorganization of functional brain networks. NeuroImage 2012, 60, 601–613. [Google Scholar] [CrossRef] [Green Version]
  22. Frankl, P.; Maehara, H. The Johnson-Lindenstrauss lemma and the sphericity of some graphs. J. Comb. Theory 1988, 44, 355–362. [Google Scholar] [CrossRef] [Green Version]
  23. Dasgupta, S. Experiments with Random Projection. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, Francisco, CA, USA, 30 June–3 July 2000; pp. 143–151. [Google Scholar] [CrossRef]
  24. Dasgupta, S.; Gupta, A. An elementary proof of the Johnson-Lindenstrauss lemma. Random Struct. Algor. 2003, 22, 60–65. [Google Scholar] [CrossRef]
  25. Zhu, Y.; Chen, S. Growing neural gas with random projection method for high-dimensional data stream clustering. Soft Comput. 2020, 24, 9789–9807. [Google Scholar] [CrossRef]
  26. Vempala, S.S. The Random Projection Method; American Mathematical Society: Providence, RI, USA, 2004. [Google Scholar] [CrossRef]
  27. Achlioptas, D. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci. 2003, 66, 671–687. [Google Scholar] [CrossRef] [Green Version]
  28. Huang, W.; Wong, P.K.; Wong, K.I.; Vong, C.M.; Zhao, J. Adaptive neural control of vehicle yaw stability with active front steering using an improved random projection neural network. Veh. Syst. Dyn. 2021, 59, 396–414. [Google Scholar] [CrossRef]
  29. Zhe, X.; Chen, S.; Yan, H. Directional statistics-based deep metric learning for image classification and retrieval. Pattern Recogn. 2018, 93, 113–123. [Google Scholar] [CrossRef] [Green Version]
  30. Yellamraju, T.; Boutin, M. Clusterability and clustering of images and other “real” high-dimensional data. IEEE Trans. Image Process. 2018, 27, 1927–1938. [Google Scholar] [CrossRef]
  31. Fan, Y.; Xuan, L.; Li, Q.; Tao, L. Exploring the diversity in cluster ensemble generation: Random sampling and random projection. Expert Syst. Appl. 2014, 41, 4844–4866. [Google Scholar] [CrossRef]
  32. Hou, B.; Na, L.; Shuang, W.; Zhang, X. SAR image segmentation based on random projection and signature frame. Geosci. Remote Sens. Symp. 2014, 1, 3726–3729. [Google Scholar] [CrossRef]
  33. Fowler, J.E.; Qian, D.; Wei, Z.; Younan, N.H. Classification performance of random-projection-based dimensionality reduction of hyperspectral imagery. Geosci. Remote Sens. Symp. 2009, 5, 76–79. [Google Scholar] [CrossRef]
  34. Alshamiri, A.K.; Singh, A.; Surampudi, B.R. Combining ELM with Random Projections for Low and High Dimensional Data Classification and Clustering. In Proceedings of the Fifth International Conference on Fuzzy and Neuro Computing (FANCCO-2015), Copenhagen, Denmark, 17–19 December 2015; Springer: Cham, Switzerland, 2015; pp. 89–107. [Google Scholar]
  35. Fern, X.Z.; Brodley, C.E. Random Projection for High Dimensional Data Clustering: A Cluster Ensemble Approach. In Proceedings of the 20th International Conference on Machine Learning, Washington, DC, USA, 21–24 August 2003; pp. 186–193. [Google Scholar]
  36. Zhao, R.; Mao, K. Semi-random projection for dimensionality reduction and extreme learning machine in high-dimensional space. IEEE Comput. Intell. Mag. 2015, 10, 30–41. [Google Scholar] [CrossRef]
  37. Deegalla, S.; Bostrom, H. Reducing High-Dimensional Data by Principal Component Analysis vs. Random Projection for Nearest Neighbor Classification. In Proceedings of the 5th International Conference on Machine Learning & Applications (ICMLA’06), Orlando, FL, USA, 14–16 December 2006; pp. 245–250. [Google Scholar] [CrossRef] [Green Version]
  38. Zhao, Q.H.; Jia, S.H.; Li, Y. Hyperspectral remote sensing image classification based on tighter random projection with minimal intra-class variance algorithm. Pattern Recogn. 2021, 111, 107635. [Google Scholar] [CrossRef]
  39. Rathore, P.; Bezdek, J.C.; Erfani, S.M.; Rajasegarar, S.; Palaniswami, M. Ensemble fuzzy clustering using cumulative aggregation on random projections. IEEE Trans. Fuzzy Syst. 2018, 26, 1510–1524. [Google Scholar] [CrossRef]
  40. Anderlucci, L.; Fortunato, F.; Montanari, A. High-dimensional clustering via Random Projections. J. Classif. 2021, 38, 191–216. [Google Scholar] [CrossRef]
  41. Menon, A.K. Random Projections and Applications to Dimensionality Reduction. Bachelor's Thesis, The University of Sydney, Darlington, Australia, March 2007. [Google Scholar]
  42. Sarvia, F.; Xausa, E.; Petris, S.D.; Cantamessa, G.; Borgogno-Mondino, E. A possible role of copernicus sentinel-2 data to support common agricultural policy controls in agriculture. Agronomy 2021, 11, 110. [Google Scholar] [CrossRef]
  43. Zhan, Y.; Muhammad, S.; Hao, P.; Niu, Z. The effect of EVI time series density on crop classification accuracy. Optik 2018, 157, 1065–1072. [Google Scholar] [CrossRef]
  44. Zhong, Y.; Hu, X.; Luo, C.; Wang, X.; Zhao, J.; Zhang, L. WHU-Hi: UAV-borne hyperspectral with high spatial resolution (H2) benchmark datasets and classifier for precise crop identification based on deep convolutional neural network with CRF. Remote Sens. Environ. 2020, 250, 112012. [Google Scholar] [CrossRef]
  45. Zhong, Y.; Wang, X.; Xu, Y.; Wang, S.; Jia, T.; Hu, X.; Zhao, J.; Wei, L.; Zhang, L. Mini-UAV-borne hyperspectral remote sensing: From observation and processing to applications. IEEE Geosci. Remote Sens. Mag. 2018, 6, 46–62. [Google Scholar] [CrossRef]
Figure 1. Real HSIs. (a1–d1) represent the false-color images of the Pavia Centre, Salinas, Chikusei, and LongKou images, respectively. (a2–d2) represent the standard classified images (i.e., validation data) of the four real HSIs, respectively. (a3–d3) represent the legends for all classes of the four real HSIs, respectively.
Figure 2. A partition of an HSI with five million hyperspectral vectors into one million sub-HSIs.
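For readers who want to experiment with the partition step sketched in Figure 2, the snippet below splits the pixel matrix of an HSI into equally sized sub-HSIs and projects every sub-HSI with the same randomly generated matrix. It is a minimal sketch rather than the authors' implementation: the sparse ±1 projection matrix follows Achlioptas [27], while the function name, the sub-HSI size, and the toy data are assumptions made only for illustration.

```python
import numpy as np

def partitioned_random_projection(pixels, k, sub_size, seed=None):
    """Project an (N, B) pixel matrix to (N, k), one sub-HSI at a time.

    pixels   : N x B array with one hyperspectral vector per row.
    k        : target (low) dimensionality.
    sub_size : number of pixels in each sub-HSI (the last block may be smaller).
    """
    rng = np.random.default_rng(seed)
    n, b = pixels.shape
    # Sparse +-1 matrix in the style of Achlioptas (2003): entries +1, 0, -1 with
    # probabilities 1/6, 2/3, 1/6, scaled so that distances are preserved in expectation.
    r = rng.choice([1.0, 0.0, -1.0], size=(b, k), p=[1 / 6, 2 / 3, 1 / 6]) * np.sqrt(3.0 / k)
    low = np.empty((n, k))
    # Every sub-HSI is projected with the same matrix r.
    for start in range(0, n, sub_size):
        block = pixels[start:start + sub_size]
        low[start:start + len(block)] = block @ r
    return low

# Toy example: 5000 pixels with 204 bands reduced to 60 dimensions in blocks of 500 pixels.
toy_hsi = np.random.rand(5000, 204)
low_dim = partitioned_random_projection(toy_hsi, k=60, sub_size=500, seed=0)
print(low_dim.shape)  # (5000, 60)
```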
Figure 3. Flow chart of the proposed classification algorithm in this paper.
Figure 4. Spectral curve analysis of sample mean vectors of all classes in the LongKou image. (a) High dimensional sample mean vectors of all classes. (b) Low dimensional sample mean vectors of all classes after dimensionality reduction without projection matrix selection. (c) Low dimensional sample mean vectors of all classes after dimensionality reduction based on projection matrix selection, with legends for all classes.
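Figure 4 contrasts dimensionality reduction with and without projection matrix selection. As a hedged sketch of how such a selection over T samplings could be coded, the snippet below scores every candidate matrix on labeled training samples, using the mean distance between projected class means divided by the mean within-class variance, and keeps the best candidate. The scoring function separability is an assumed stand-in; the exact class dissimilarity measurement is defined in the methodology of the paper.

```python
import numpy as np

def separability(low_samples, labels):
    """Mean pairwise distance between class means divided by mean within-class variance."""
    classes = np.unique(labels)
    means = np.array([low_samples[labels == c].mean(axis=0) for c in classes])
    within = np.mean([low_samples[labels == c].var(axis=0).mean() for c in classes])
    dists = [np.linalg.norm(means[i] - means[j])
             for i in range(len(classes)) for j in range(i + 1, len(classes))]
    return np.mean(dists) / (within + 1e-12)

def select_projection(samples, labels, k, t, seed=None):
    """Draw t candidate matrices and keep the one giving the best class separability."""
    rng = np.random.default_rng(seed)
    b = samples.shape[1]
    best_r, best_score = None, -np.inf
    for _ in range(t):
        r = rng.choice([1.0, 0.0, -1.0], size=(b, k), p=[1 / 6, 2 / 3, 1 / 6]) * np.sqrt(3.0 / k)
        score = separability(samples @ r, labels)
        if score > best_score:
            best_r, best_score = r, score
    return best_r

# Here, samples would be an array of labeled training vectors (B bands per row) and
# labels the corresponding class indices; the returned matrix is then used to project
# every sub-HSI, as in the previous sketch.
```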
Figure 5. Comparison of the classification results of four images. (a1–d1) The proposed algorithm. (a2–d2) TRP-MIV algorithm. (a3–d3) PRP-MGSR algorithm. (a4–d4) CAFCM algorithm. (a5–d5) The legends for all classes.
Figure 6. Comparison of outline superposition results for the four images. (a1–d1) The proposed algorithm. (a2–d2) TRP-MIV algorithm. (a3–d3) PRP-MGSR algorithm. (a4–d4) CAFCM algorithm. (The red lines are the outlines of the classification results.)
Figure 7. Analysis of the change in the lowest projection dimensionality K0PRP with different parameters. (a) Curve of K0PRP versus ε (the black dotted line marks the inflection point). (b) Curve of K0PRP versus β. (c) Curve of K0PRP versus M (the black dotted line marks the number of bands).
Figure 8. Analysis of the change in OA with the number of samplings T and the number of samples in each class H. (a) Curve of OA versus T. (b) Curve of OA versus H.
Table 1. Details of four real HSIs.

| | Pavia Centre | Salinas | Chikusei | LongKou |
| URL for data source | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (accessed on 12 July 2021) | http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (accessed on 12 July 2021) | https://naotoyokoya.com/Download.html (accessed on 11 April 2016) | http://rsidea.whu.edu.cn/resource_WHUHi_sharing.htm (accessed on 1 October 2020) |
| Year | 2003 | 1998 | 2014 | 2018 |
| Sensor | ROSIS | AVIRIS | Headwall Hyperspec-VNIR-C | Nano-Hyperspec |
| Spatial resolution | 1.3 m | 3.7 m | 2.5 m | 0.463 m |
| Number of bands | 102 | 204 | 128 | 270 |
Table 2. Projection parameter settings in the experiment.

| | T | H | M | KPRP | KTRP | KRP |
| Pavia Centre image | 10 | 10 | 36,598 | 33 | 100 | 349 |
| Salinas image | 10 | 10 | 2295 | 66 | 86 | 299 |
| Chikusei image | 10 | 10 | 3145 | 33 | 79 | 275 |
| LongKou image | 10 | 10 | 102,271 | 21 | 106 | 367 |
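Table 2 lists the number of samplings T, the number of samples in each class H, the parameter M, and the lowest projection dimensionalities KPRP, KTRP, and KRP for the four images. For orientation only, the snippet below evaluates the classical Johnson–Lindenstrauss bound of Dasgupta and Gupta [24], under which the lowest dimensionality for a conventional RP grows with the logarithm of the number of projected vectors; the bound behind the KPRP column is derived in the body of the paper from the sub-HSI size and is not reproduced here.

```python
import numpy as np

def jl_lowest_dimension(n, eps):
    """Classical JL bound (Dasgupta & Gupta): k >= 4 ln(n) / (eps^2/2 - eps^3/3)."""
    return int(np.ceil(4.0 * np.log(n) / (eps ** 2 / 2.0 - eps ** 3 / 3.0)))

# Illustrative values with eps = 0.5: the bound shrinks when fewer vectors are
# projected at once, which is the motivation for projecting small sub-HSIs
# rather than the whole image.
print(jl_lowest_dimension(500_000, 0.5))  # a full-size HSI with 500,000 pixels
print(jl_lowest_dimension(1_000, 0.5))    # a single small sub-HSI
```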
Table 3. Accuracy evaluation of classification results for the Pavia Centre image. (Values are the means over 100 trials; values in brackets are the variances.)

| | The Proposed Algorithm | TRP-MIV | PRP-MGSR | CAFCM |
| Kappa coefficient | 0.83 (0.02) | 0.81 (0.05) | 0.79 (0.02) | 0.38 (0.11) |
| OA/% | 89.65 (1.03) | 88.02 (2.94) | 87.34 (1.25) | 53.72 (11.68) |
| AA/% | 78.99 (1.94) | 75.43 (5.88) | 73.22 (3.02) | 38.03 (5.65) |
| APR/% | 79.80 (1.70) | 74.49 (5.81) | 72.72 (2.60) | 34.11 (6.01) |
| Running time/s | 1.16 (0.09) | 2.93 (0.29) | 10,209.72 (1538.24) | 5798.18 (855.03) |
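Tables 3–6 report the kappa coefficient, OA, AA, APR, and running time of the four algorithms on the four images. For reference, the sketch below computes the four accuracy measures from a confusion matrix in the usual way, with rows taken as reference classes and columns as predicted classes; it is an independent illustration of the definitions, not the evaluation code used to produce the tables.

```python
import numpy as np

def accuracy_measures(confusion):
    """Kappa coefficient, OA, AA, and APR from a square confusion matrix."""
    c = np.asarray(confusion, dtype=float)
    total = c.sum()
    diag = np.diag(c)
    oa = diag.sum() / total                # overall accuracy
    aa = np.mean(diag / c.sum(axis=1))     # average of per-class recall (accuracy)
    apr = np.mean(diag / c.sum(axis=0))    # average of per-class precision
    pe = (c.sum(axis=1) * c.sum(axis=0)).sum() / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    return kappa, oa, aa, apr

# Toy three-class example.
cm = np.array([[50, 2, 3],
               [4, 45, 1],
               [2, 3, 40]])
print(accuracy_measures(cm))
```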
Table 4. Accuracy evaluation of classification results for the Salinas image. (Values are the means over 100 trials; values in brackets are the variances.)

| | The Proposed Algorithm | TRP-MIV | PRP-MGSR | CAFCM |
| Kappa coefficient | 0.91 (0.00) | 0.87 (0.01) | 0.88 (0.00) | 0.59 (0.14) |
| OA/% | 91.98 (0.23) | 89.00 (0.71) | 90.20 (0.38) | 63.77 (11.47) |
| AA/% | 85.97 (0.92) | 84.24 (1.25) | 80.14 (1.70) | 48.95 (11.68) |
| APR/% | 79.89 (0.66) | 75.57 (1.20) | 75.28 (1.33) | 45.32 (17.93) |
| Running time/s | 0.37 (0.10) | 3.93 (0.11) | 100.08 (8.65) | 1623.16 (226.78) |
Table 5. Accuracy evaluation of classification results for the Chikusei image. (Values are the means over 100 trials; values in brackets are the variances.)

| | The Proposed Algorithm | TRP-MIV | PRP-MGSR | CAFCM |
| Kappa coefficient | 0.97 (0.01) | 0.92 (0.05) | 0.90 (0.01) | 0.35 (0.10) |
| OA/% | 97.74 (0.38) | 94.15 (3.36) | 92.93 (0.74) | 48.52 (7.77) |
| AA/% | 96.66 (1.06) | 87.89 (8.85) | 91.63 (0.81) | 36.58 (9.75) |
| APR/% | 94.57 (0.56) | 87.29 (7.67) | 85.00 (1.93) | 33.62 (6.54) |
| Running time/s | 0.21 (0.76) | 1.59 (0.16) | 52.20 (11.34) | 289.21 (34.70) |
Table 6. Accuracy evaluation of classification results for the LongKou image. (Values are the means over 100 trials; values in brackets are the variances.)

| | The Proposed Algorithm | TRP-MIV | PRP-MGSR | CAFCM |
| Kappa coefficient | 0.82 (0.01) | 0.62 (0.01) | 0.62 (0.03) | 0.47 (0.01) |
| OA/% | 86.24 (0.51) | 68.90 (0.81) | 69.73 (2.93) | 56.84 (1.03) |
| AA/% | 77.94 (1.57) | 56.59 (2.12) | 55.51 (3.51) | 33.69 (2.56) |
| APR/% | 74.23 (1.50) | 52.05 (1.65) | 52.78 (2.36) | 39.04 (3.21) |
| Running time/s | 2.42 (0.24) | 7.76 (0.40) | 7215.75 (835.60) | 16,698.02 (1512.06) |