Article

Data-Wise Spatial Regional Consistency Re-Enhancement for Hyperspectral Image Classification

1 School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266525, China
2 Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 610031, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(9), 2227; https://doi.org/10.3390/rs14092227
Submission received: 10 March 2022 / Revised: 3 May 2022 / Accepted: 3 May 2022 / Published: 6 May 2022
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

Effectively using rich spatial and spectral information is the core issue of hyperspectral image (HSI) classification. The recently proposed Diverse Region-based Convolutional Neural Network (DRCNN) achieves good results by weighted averaging the features extracted from several predefined regions, thus exploiting spatial consistency to some extent. However, such feature-wise spatial regional consistency enhancement does not effectively address wrong classifications at the edges of regions, especially when the edge is winding and rough. To improve on the feature-wise approach, Data-wise spAtial regioNal Consistency re-Enhancement ("DANCE") is proposed. Firstly, the HSI is decomposed using the Spectral Graph Wavelet (SGW) to enhance the intra-class correlation. Then, the image components in the different frequency bands obtained from the weight map are filtered using a Gaussian filter to "deburr" the non-smooth region edges. Next, the reconstructed image is obtained from all filtered frequency components using the inverse SGW transform. Finally, a DRCNN is used for further feature extraction and classification. Experimental results show that the proposed method achieves the goal of pixel-level re-enhancement of image spatial consistency, and can effectively improve not only the performance of the DRCNN, but also that of other feature-wise approaches.

1. Introduction

In recent years, hyperspectral imaging and its applications have attracted great attention in the field of Earth remote sensing. HSI classification is the basic task of hyperspectral data analysis and application [1,2,3,4]. The spatial regional consistency characteristics [5] of the image should be considered during HSI processing, since correlations exist between the ground objects in an HSI. Moreover, the problem of background interference widely exists in the existing public HSI data [6], which also makes it difficult to accurately identify and classify ground objects. In summary, it is very important to make full use of the rich spatial consistency information [7] and improve the quality of HSIs [8].
Research on spatial consistency has attracted increasing attention with the development of remote sensing classification techniques. The spatial consistency of an image can be simply defined as every small window having similarity with the other windows in the same image, especially the adjacent windows [9]. Therefore, the correlation between a pixel and its neighboring pixels should be considered during feature extraction. In addition, similar objects usually tend to be distributed in blocks; that is, pixels belonging to the same class are usually close to each other. Spatial consistency in HSI is used, firstly, to enhance the quality of the HSI. Based on the Gibbs algorithm, Rand et al. [10] regard an HSI as a set of high-dimensional vectors related to spectral information, and divide the large set into several subsets of vectors according to spatial similarity. The spatial consistency of the spectral information at each site is thereby enhanced, facilitating subsequent spectral mixture analysis (SMA) of the HSI. Yue et al. [11] combine multiple spatially adjacent, similar pixels into one block according to the spectral angle to realize pixel reduction in HSI. Secondly, spatial consistency is also used for feature extraction from HSIs. The spatial correlation features of HSIs can be obtained by using the Spectral Graph Wavelet Transform (SGWT) [12], which fully considers the relationship between each pixel and its adjacent pixels. Zikiou et al. [13] use the SGWT to extract texture information from an HSI as secondary features for HSI classification. In addition, the SGW can also be seen as a filter and used to extract the multi-scale characteristics of an image. In [14], the SGW is used as the convolution kernel to construct a Graph Wavelet Neural Network (GWNN), which is used to classify the nodes of a graph. Dong et al. [15] decompose the vibration signal using the SGW to obtain its multi-scale characteristics, and convert the results into path graphs at multiple levels. The above spatial consistency enhancement methods operate mainly on the hyperspectral raw data (data-wise).
Chen et al. [16] applied deep learning to HSI classification for the first time and achieved good results. Convolutional Neural Networks (CNNs) in hyperspectral image classification tasks [17,18,19,20] use convolutional kernels to traverse the whole image and extract valuable features; the spatial consistency of the image is thus considered during convolution. The recently proposed DRCNN [21] divides an image block into multiple regions, which are sent into different CNN models rather than a single CNN model. Since multiple feature extractions and weighted averages are performed in the DRCNN, the classification process makes more consistent use of the regional consistency assumption in the spatial domain. Overall, the advantage of the DRCNN is that it strengthens the spatial consistency of the feature-wise approach and increases the number of samples through its multi-region operation. However, it has some limitations. Firstly, the DRCNN ignores spatial consistency at the data-wise (pixel) level, since the correlation between the pixels in an HSI is given little consideration. Secondly, the operation of multiple convolutions leads to the loss of image edge information, which causes the problem of edge point misclassification. Furthermore, to remove the noise of HSIs, the denoising effects of commonly used filters such as the bilateral filter [22], trilateral filter [23], and Gaussian filter were compared. The Gaussian filter is selected in this paper since it can remove Gaussian noise and smooth the edges of images. In particular, the Gaussian filter can greatly simplify noise variance estimation and analysis [24]. Therefore, a Data-wise spAtial regioNal Consistency re-Enhancement (DANCE) method is proposed in this paper to further improve spatial consistency. Based on the above analysis, DANCE can, to some extent, overcome the shortcomings of the DRCNN in terms of data-wise spatial consistency.
The main contributions of this paper are as follows:
  • To solve the misclassification problem of HSI edge points, a novel and effective DANCE method is proposed to enhance the spatial regional consistency of data-wise approaches, which can promote the performance of some state-of-the-art methods.
  • To better integrate the feature-wise and data-wise method, the structure of the DRCNN model is optimized through experiments, which can comprehensively improve the spatial regional consistency.
The remainder of this paper is organized as follows. The related basic knowledge is introduced in Section 2. The proposed method is described in Section 3. The experiment results and analysis are discussed in Section 4. The discussion is given in Section 5. The conclusions are drawn in Section 6.

2. Preliminary

2.1. Characteristics of Weighted Graphs

A hyperspectral image can be regarded as an undirected weighted graph G = (V, E, w), where V is the vertex set, E is the edge set, and w is a non-negative weight function between vertices. Take an M × M image as an example; each pixel in the image has k neighbor nodes N_k. The element a_{i,j} of the adjacency matrix A is defined as:

a_{i,j} = \begin{cases} w(i,j), & \text{if } j \in N_k(i) \\ 0, & \text{otherwise} \end{cases}    (1)

where 1 \le i, j \le M^2. For a weighted graph G, the degree of each vertex i, denoted d(i), is equal to the sum of the weights of all edges incident to i:

d(i) = \sum_{j=1}^{M^2} a_{i,j}    (2)

The degree matrix D is defined as:

D_{ij} = \begin{cases} d(i), & \text{if } i = j \\ 0, & \text{otherwise} \end{cases}    (3)

Then the Laplace operator L of the graph G is defined as:

L = D - A    (4)

where L is a real symmetric matrix whose eigenvalues \lambda_l \; (l = 0, 1, \ldots, M^2 - 1) are all non-negative; the corresponding eigenvectors \chi_l are mutually orthogonal. The eigenvalues can be sorted as 0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_{M^2-1}, and then:

L \chi_l = \lambda_l \chi_l    (5)
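To make these definitions concrete, the following Python sketch (illustrative only, not the authors' implementation) builds A, D, and L = D − A for a single-band M × M image with 4-neighbor connectivity; the Gaussian similarity used for w(i, j) is an assumed choice, since the paper only requires w ≥ 0.

import numpy as np

def image_graph_laplacian(img, sigma=1.0):
    # img: (M, M) single-band image; vertices are pixels in row-major order.
    M = img.shape[0]
    n = M * M
    A = np.zeros((n, n))
    for r in range(M):
        for c in range(M):
            i = r * M + c
            # 4-neighbor set N_k(i); border pixels simply have fewer neighbors
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < M and 0 <= cc < M:
                    A[i, rr * M + cc] = np.exp(-(img[r, c] - img[rr, cc]) ** 2 / sigma ** 2)
    D = np.diag(A.sum(axis=1))        # degrees d(i), Equation (2)
    L = D - A                         # Laplacian, Equation (4)
    lam, chi = np.linalg.eigh(L)      # non-negative eigenvalues, orthogonal eigenvectors
    return A, D, L, lam, chi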

2.2. Graph Fourier Transform

The Fourier transform of a signal f is shown in Equation (6):

\hat{f}(\omega) = \int_{-\infty}^{\infty} f(x) \, e^{-j\omega x} \, dx    (6)

The inverse Fourier transform is given by:

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega) \, e^{j\omega x} \, d\omega    (7)

where e^{j\omega x} is the exponential eigenfunction. The Spectral Graph Fourier Transform (SGFT) is obtained by replacing the eigenfunctions e^{j\omega x} with the graph eigenvectors \chi_l; i.e., the SGFT of a function f of length L is:

\hat{f}(l) = \langle f, \chi_l \rangle = \sum_{n=0}^{L-1} f(n) \, \chi_l^{*}(n)    (8)

where 0 \le l, n \le L-1. The inverse SGFT is given by:

f(n) = \sum_{l=0}^{L-1} \hat{f}(l) \, \chi_l(n)    (9)
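As a brief sketch of the transform pair in Equations (8) and (9) (assuming lam and chi come from the eigendecomposition above), the forward SGFT is a projection onto the eigenvectors and the inverse is its adjoint:

import numpy as np

def sgft(f, chi):
    # f_hat(l) = <f, chi_l> = sum_n f(n) chi_l*(n), Equation (8)
    return chi.conj().T @ f

def inverse_sgft(f_hat, chi):
    # f(n) = sum_l f_hat(l) chi_l(n), Equation (9)
    return chi @ f_hat

# Round trip: inverse_sgft(sgft(f, chi), chi) recovers f up to floating-point error.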

2.3. Spectral Graph Wavelet Transform

The spectral wavelet kernel function g is similar to the wavelet kernel function in the Fourier domain. Generally, g can be regarded as a band-pass filter satisfying g(0) = 0 and \lim_{x \to \infty} g(x) = 0. Each Fourier mode of a given function f can be modulated by the wavelet operator T_g = g(L):

\widehat{T_g f}(l) = g(\lambda_l) \, \hat{f}(l)    (10)

Applying the inverse transform to Equation (10) gives:

(T_g f)(m) = \sum_{l=0}^{M^2-1} g(\lambda_l) \, \hat{f}(l) \, \chi_l(m)    (11)

The wavelet operator at scale s is defined as T_g^s = g(sL), where T_g^s corresponds to the wavelet operator \psi(s, \omega) in a classical wavelet transform. The spectral graph wavelet is then obtained as shown in Equation (12):

\psi_{s,n}(m) = \sum_{l=0}^{L-1} g(s\lambda_l) \, \chi_l^{*}(n) \, \chi_l(m)    (12)

Formally, the wavelet coefficients of a given function f are obtained by taking the inner product with these wavelets:

W_f(s, n) = \langle \psi_{s,n}, f \rangle    (13)

Then the SGWT of a graph function f \in \mathbb{R}^L at vertex n and scale s is shown in Equation (14):

W_f(s, n) = (T_g^s f)(n) = \sum_{l=0}^{L-1} g(s\lambda_l) \, \hat{f}(l) \, \chi_l(n)    (14)
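Equation (14) then amounts to re-weighting the SGFT by the band-pass kernel. A minimal sketch follows; the kernel g(x) = x·exp(−x) is one common choice satisfying g(0) = 0 and g(x) → 0, and is an assumption here since the section does not fix g:

import numpy as np

def g(x):
    return x * np.exp(-x)                 # assumed band-pass kernel

def sgwt_coefficients(f, lam, chi, s):
    f_hat = chi.conj().T @ f              # SGFT of f
    return chi @ (g(s * lam) * f_hat)     # W_f(s, n), Equation (14)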

3. Materials and Methods

3.1. Materials

In the later experiments, three public hyperspectral datasets were selected, namely the Indian Pines data, the Salinas-small data, and the Pavia University data.

3.1.1. Indian Pines Data

The Indian Pines data are an image of farmland in Indiana, USA, taken by the AVIRIS imager, and constitute one of the earliest datasets used for HSI classification tasks. The wavelength range of the AVIRIS spectral imager is 0.4 to 2.5 μm, with a spatial resolution of about 20 m. A total of 220 spectral bands are collected by the sensor. Since 20 bands fall in the water absorption region, the remaining 200 bands were used for this study. The image size is 145 × 145. There are 16 classes of ground objects in the image. The number of sample points per class and the numbers involved in the training and testing sets are shown in Table 1. There is a serious class imbalance in the Indian Pines data. Specifically, there are only 46, 28, and 20 samples in classes 1, 7, and 9, respectively; the ratio of the sample numbers between classes 9 and 11 is even less than 1:100. Therefore, 25% of the samples were taken as training samples for classes 1, 7, and 9 in all of the experiments described in Section 3.2 and Section 4.
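A minimal sketch of this sampling rule (illustrative, not the authors' released code) follows; labels is assumed to be a 1-D array of class indices, with 0 denoting background:

import numpy as np

def split_indian_pines(labels, rare=(1, 7, 9), rare_ratio=0.25, ratio=0.10, seed=0):
    rng = np.random.default_rng(seed)
    train = []
    for c in np.unique(labels):
        if c == 0:
            continue                                  # skip background pixels
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        r = rare_ratio if c in rare else ratio        # 25% for classes 1, 7, 9
        train.extend(idx[:max(1, round(r * idx.size))])
    train = np.asarray(train)
    test = np.setdiff1d(np.flatnonzero(labels > 0), train)
    return train, test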

3.1.2. Salinas Data

The Salinas data are an image (512 × 217) of the Salinas Valley, CA, USA, captured by the AVIRIS sensor, and contain 16 classes of ground objects with 224 bands. Since the computational complexity of using all samples is high, 9 classes of ground objects were taken to verify the effectiveness of the proposed method; this subset is referred to as the Salinas-small data. The spatial resolution of the Salinas data reaches 3.7 m. Since bands 108–112, 154–167, and 224 fall in the water absorption region, the remaining 204 bands were used for this study. The sample numbers of the classes are relatively balanced.

3.1.3. Pavia University Data

The Pavia University data are a top-view image (610 × 340) of the University of Pavia, Italy, acquired by the ROSIS-03 sensor. The wavelength range of the sensor is 0.43–0.86 μm, and the spatial resolution of the data is 1.3 m. A total of 103 spectral bands were selected in this paper; the other 12 bands were removed since they are heavily influenced by noise. The image contains 207,400 pixels, of which only 42,776 include ground object information; the remainder are background pixels.

3.2. Methods

3.2.1. Overview of the Classification Approach

The overall flowchart of the proposed method is shown in Figure 1. Two main stages are included, i.e., DANCE preprocessing and DRCNN classification. The HSI is first processed by DANCE to obtain an image with enhanced spatial consistency; the size of the HSI is not changed in this stage. Then, the preprocessed image is sent to the DRCNN to obtain the classification result. The detailed processes are introduced in Section 3.2.2 and Section 3.2.3.

3.2.2. Data-wise spAtial regioNal Consistency re-Enhancement (DANCE)

The proposed DANCE method mainly includes five stages: blocking, SGW frequency decomposition, filtering, SGW reconstruction, and splicing, as shown in Figure 1. First, the HSI is divided into sub-image blocks. Then, each sub-image block is decomposed using the SGW to obtain its spectral graph features. Thirdly, a Gaussian filter is used to remove the block's noise and smooth its edges. Fourthly, the inverse SGW is used to reconstruct the filtered sub-image block. Finally, all sub-image blocks are spliced into the preprocessed HSI. The operations of SGW frequency decomposition, filtering, and SGW reconstruction are described in detail below, followed by a condensed code sketch after step (3).
(1) SGW Decomposition.
First, a weight map is calculated according to the distances between the pixels and their neighbor nodes, and encoded as an adjacency matrix. The neighborhood nodes are selected based on the minimum-distance principle. Taking N_k = 4 as an example, the 4 nearest nodes are selected as shown in Figure 2.
The SGWT with scale s is carried out on the weight map, and the decomposed images are obtained in four frequency bands, which are a low-frequency component LF and three high-frequency components, HF1, HF2, and HF3, as shown in Figure 1.
(2) Gaussian Filtering.
Gaussian filtering is applied to the image components obtained in step (1). The parameters of the Gaussian filter were determined by the experiments described in Section 4.1. Taking an image block of the Pavia University data as an example, the Gaussian filtering results of the four components, GLF, GHF1, GHF2, and GHF3, are shown in Figure 3 for d = 5 and σ = 0.5. According to Figure 3a, the low-frequency component mainly contains the coarse information of the image, while the edge information of the image is mainly in the high-frequency components (Figure 3b–d). It is clear that the image is smoothed and the noise is removed to a certain extent, which makes the processed image more consistent with the true spatial domain distribution.
(3) SGW Reconstruction.
The filtered components in the different frequency bands are reconstructed by the inverse SGW to obtain the preprocessed hyperspectral feature map, as shown in Figure 4. The image reconstructed from the four frequency band images after Gaussian filtering is shown in Figure 4a, and Figure 4b shows the un-preprocessed image. For clear observation, two areas in the two images are marked with red rectangles and enlarged in Figure 4. It can be seen that the image preprocessed using the DANCE method has improved spatial consistency in both flat regions and edges.
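The following condensed sketch strings steps (1)–(3) together for one single-band block, reusing lam and chi from the Laplacian sketch in Section 2.1. The scaling kernel h, the band-pass kernel g, the three scales, and the least-squares inverse SGWT are all illustrative assumptions rather than the authors' exact configuration:

import numpy as np
from scipy.ndimage import gaussian_filter

def dance_block(block, lam, chi, scales=(2.0, 1.0, 0.5), sigma=0.5):
    M = block.shape[0]
    f = block.reshape(-1).astype(float)
    h = lambda x: np.exp(-x)                                   # assumed low-pass (scaling) kernel
    g = lambda x: x * np.exp(-x)                               # assumed band-pass kernel
    kernels = [h] + [lambda x, s=s: g(s * x) for s in scales]  # LF, HF1, HF2, HF3
    f_hat = chi.T @ f
    comps = [chi @ (k(lam) * f_hat) for k in kernels]          # step (1): four band images
    filt = [gaussian_filter(c.reshape(M, M), sigma).reshape(-1)
            for c in comps]                                    # step (2): Gaussian filtering
    # step (3): inverse SGWT via least-squares inversion of the stacked analysis operator
    T = np.vstack([chi @ np.diag(k(lam)) @ chi.T for k in kernels])
    rec, *_ = np.linalg.lstsq(T, np.concatenate(filt), rcond=None)
    return rec.reshape(M, M)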
To quantitatively evaluate the spatial consistency enhancement of DANCE, regions of size 5 × 5, 10 × 10, and 15 × 15 were each randomly selected 10 times from the processed HSI and the original HSI. The average Euclidean distances between all pixels in these regions were calculated, as shown in Table 2. It can be seen that the average Euclidean distance between pixels after DANCE is reduced regardless of the region size, which further verifies the enhancement of spatial consistency by DANCE. A sketch of this measure follows.
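The sketch below assumes hsi is an (H, W, B) array and that distances are taken between full spectral vectors; the windowing details are our reading of the text, not published code:

import numpy as np

def mean_pairwise_distance(hsi, w=5, trials=10, seed=0):
    rng = np.random.default_rng(seed)
    H, W, _ = hsi.shape
    means = []
    for _ in range(trials):
        r = rng.integers(0, H - w + 1)                         # random w x w window
        c = rng.integers(0, W - w + 1)
        X = hsi[r:r + w, c:c + w].reshape(w * w, -1)
        d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
        means.append(d[np.triu_indices(w * w, k=1)].mean())    # average over all pixel pairs
    return float(np.mean(means))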

3.2.3. Construction of the DRCNN

The remote sensing image classification performance of the DRCNN [21] is effectively improved by fully considering feature-wise spatial consistency. Since the proposed DANCE can provide additional spatial consistency information at the data-wise level, the preprocessed remote sensing image can be taken as the preliminary features, which are sent to the DRCNN to classify the ground objects. The structure of the DRCNN was adjusted through experiments on the preprocessed data. The preprocessed image is divided into K × K sub-block images, where K is generally odd. A sub-block image is taken as the Global Region, and its left, right, top, bottom, and central regions are extracted as new features, which are trained using different networks. For convenience, the six selected regions are named GR (Global Region), RR (Right Region), LR (Left Region), TR (Top Region), BR (Bottom Region), and CR (Central Region). The sizes of five of these regions, with the exception of CR, are determined by the window radius r. Thus, the size of GR (SGR) is (2r + 1) × (2r + 1). The size of LR and RR is set as (2r + 1) × (r + 2). The size of TR and BR is set as (r + 2) × (2r + 1). Two series of experiments were performed to determine the sizes of GR, RR, LR, TR, BR, and CR. Schematic examples of feature extraction with r = 3, 4, 5, 6 are shown in Figure 5. Each region can be taken as a GR; its TR, BR, LR, and RR are shown with rectangles of different colors. A sketch of this region extraction is given below.
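The following hypothetical helper (not the published DRCNN code) extracts the six regions under the stated sizes:

import numpy as np

def extract_regions(patch, r):
    # patch: (2r+1, 2r+1, bands) sub-block centered on the labeled pixel
    k = 2 * r + 1
    c = r                                        # center index
    return {
        "GR": patch,                             # (2r+1) x (2r+1)
        "LR": patch[:, :r + 2],                  # (2r+1) x (r+2)
        "RR": patch[:, k - (r + 2):],            # (2r+1) x (r+2)
        "TR": patch[:r + 2, :],                  # (r+2) x (2r+1)
        "BR": patch[k - (r + 2):, :],            # (r+2) x (2r+1)
        "CR": patch[c - 1:c + 2, c - 1:c + 2],   # central 3 x 3
    }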
(1) The selection of r and the proportion of the training set.
In the cross-validation experiments on the Indian Pines data, Salinas-small data, and Pavia University data, r was set to 3, 4, 5, and 6. The size of CR was set to 3 × 3. The proportion of training samples was taken as 3%, 5%, 10%, 13%, and 15%. The results of the experiments are shown in Figure 6, where the differently colored lines represent the classification Overall Accuracies (OAs) for different r values.
As can be seen from Figure 6, the best OA is achieved with r = 6 on the Indian Pines and Salinas-small data, and with r = 5 on the Pavia University data. The OA is clearly good when 15% of the samples are used for training on these data. Based on the above analysis and the time complexity, 10% of the data were set as the training set, and the radius of the sub-block image was taken as 6. The resulting sizes of the five windows are shown in Table 3.
Then, 10% of the samples of the Indian Pines data were selected as the training set, with r = 6. The size of CR was set to 1 × 1, 3 × 3, and 5 × 5, and the OAs were obtained using all features of the six regions, as shown in Figure 7. It can be seen that the classification accuracy is highest when the size of the CR is 3 × 3. Therefore, 3 × 3 was selected as the size of the CR.

4. Experiment Results and Analyses

A parameter selection experiment on the Gaussian filter was designed first to achieve the optimal performance of the proposed method. Then, multiple experiments were performed to prove the effectiveness of DANCE and DANCE-DRCNN. The experimental environment was MATLAB (R2019a) and Python running on a workstation with a GeForce RTX 2080 Ti GPU.

4.1. Parameters Selection of Gaussian Filter

The parameters of the Gaussian filter applied to the four components extracted by the SGWT are important parameters of the DANCE method. A (2k + 1) × (2k + 1) discrete Gaussian convolution kernel H is defined as:

H_{i,j} = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{(i-k-1)^2 + (j-k-1)^2}{2\sigma^2} \right)    (15)

where 1 \le i, j \le 2k + 1 and \sigma is the standard deviation. Thus, the two parameters affecting the effectiveness of the Gaussian filter are the kernel size d = 2k + 1 and \sigma. To determine the optimal values of these two parameters, d was taken as 3, 5, and 7, and \sigma was set to 0.3, 0.5, and 0.7, respectively. Taking the Indian Pines data processed by the SGWT as an example, in which 25% of the samples of classes 1, 7, and 9 and 10% of the samples of the other classes were taken as the training set, the DRCNN constructed in Section 3.2.3 was used for training and classification. The classification results with the various parameters are shown in Table 4. According to Table 4, d was set to 5 and \sigma to 0.5 in this study.
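For reference, a direct transcription of Equation (15) with the selected d = 5 (k = 2) and σ = 0.5 might look as follows; the final normalization to unit sum is a common practical addition and not part of the equation:

import numpy as np

def gaussian_kernel(k=2, sigma=0.5):
    i, j = np.mgrid[1:2 * k + 2, 1:2 * k + 2]     # 1 <= i, j <= 2k+1, as in Equation (15)
    H = np.exp(-((i - k - 1) ** 2 + (j - k - 1) ** 2) / (2 * sigma ** 2))
    H /= 2 * np.pi * sigma ** 2
    return H / H.sum()                            # assumed normalization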

4.2. Results and Comparisons

4.2.1. Comparisons with DRCNN and Baselines

To further demonstrate the effectiveness of the proposed method, comparative experiments between DANCE-DRCNN, DRCNN, and the baselines were designed. The training set proportion was set to 10%. Seven state-of-the-art approaches were selected. SVMMRF [25] represents the traditional methods; it combines an SVM classifier using a Radial Basis Function (RBF) kernel with a Markov Random Field (MRF) regularizer. As an improvement of ResNet, A2S2K-ResNet [26] was also chosen for comparison. The HSID-CNN method [27] was selected since it fuses multi-scale features to remove noise. R-PCA-CNN [19] and Gabor-CNN [28] are both based on combining classical preprocessing methods with a CNN. SSRN [29] and 3D-CNN [20] were compared with the proposed method since their classification results are among the best at present. The classification maps for the three datasets are shown in Figure 8, Figure 9 and Figure 10.
In Figure 8, Figure 9 and Figure 10, panel (a) shows the ground truth of each dataset, and panels (b–f) show the classification maps obtained using SVMMRF [25], R-PCA-CNN [19], 3D-CNN [20], DRCNN [21], and the proposed method, respectively. For clearer observation, some of the framed image edges in the classification maps are enlarged in panels (g) or (h), which correspond to panels (b–f) in order from top to bottom. Misclassified points that are hard to discern are labeled with dashed circles. Firstly, compared with the SVMMRF, R-PCA-CNN, and 3D-CNN methods, the proposed method can effectively remove the influence of Gaussian noise and enhance the continuity of the classification maps. In addition, the misclassification rate of the edge points is greatly reduced. For example, the edge misclassification of the Corn-mintill class in the Indian Pines data (the red area in Figure 8), the Grapes-untrained class in the Salinas-small data (the blackish green area in Figure 9), and the Gravel class in the Pavia University data (the red area in Figure 10) is significantly corrected. Secondly, compared with DRCNN, the misclassification rate at ground object edges is obviously reduced when using the proposed DANCE approach.
Comparisons of the classification performance between the proposed method, DRCNN, and the baselines on the three datasets are shown in Table 5, Table 6 and Table 7, which list the accuracy of each class, the OA, and the Average Accuracy (AA). It can be seen that the proposed method achieves the highest accuracy on all three datasets. Compared with the traditional SVM-based classification method, the OA of DANCE-DRCNN is improved by about 20% on the Indian Pines data, and by about 8% on the Salinas-small and Pavia University data. At the same time, compared with the other CNN methods, the proposed method improves the OA to some extent. Moreover, several classes with low accuracy under DRCNN are improved by the proposed method.

4.2.2. Ablation Experiments

According to the strategy described in Section 3, the HSI was first decomposed into four components, Gaussian filtering (GF) was then applied, and the image was finally reconstructed. To verify the effectiveness of the proposed approach, the HSI processed only by GF (GF-HSI), the four components after GF (GLF, GHF1, GHF2, and GHF3), and the HSI after DANCE were each used as the input of DRCNN. The experimental results with different inputs are shown in Table 8, which lists the accuracy of each class, the OA, and the AA.
Firstly, it can be seen that the classification accuracy of the approach using only GF is lower than that of the proposed combination. This may be because the high-frequency and low-frequency features of the HSI are filtered together when the global HSI is filtered using the GF. Secondly, the classification accuracy using GLF is higher than that using the three other components (GHF1, GHF2, and GHF3) because the low-frequency component carries most of the image information. However, it is still lower than that of the DANCE method, since some edge information is lacking in the GLF. Finally, since GHF1, GHF2, and GHF3 only represent the high-frequency components of the image (edge information), the results using GHF1, GHF2, and GHF3 are progressively lower. Notably, the edge information is not vital for the classification of some large ground objects. In summary, the proposed DANCE method fusing SGWT and GF achieves higher classification accuracy than any of the approaches using only one of these operations.

4.2.3. The Improvement with DANCE in Other Methods

To demonstrate the effectiveness of the proposed method for other feature-wise approaches, the data were first preprocessed using the DANCE method. Based on the preprocessed data, all comparison methods used in Section 4.2.1 were applied to classify the ground objects, as was done for the proposed method. The experimental results are shown in Table 9. As can be seen from Table 9, the DANCE method improves not only the classification accuracy of DRCNN but also that of the other methods. This further verifies that the proposed DANCE method is an effective solution for HSI classification.

5. Discussion

5.1. The Selection of the Input Size in DANCE

Before the use of DANCE, an HSI is divided into many blocks of the same size; then, the undirected graphs of the blocks are obtained using spectral graph theory. Therefore, the block size determines how many pixels are used simultaneously for spatial consistency, which can affect the performance of DANCE. Based on this analysis, selection experiments on the sub-block size in DANCE were designed. To generate a node graph for every HSI block, the sub-block size must divide the image size evenly. Thus, the input size of the Indian Pines data was set to 145 × 145, 29 × 29, and 5 × 5. The image after the application of DANCE is sent to the DRCNN for classification, and the results are shown in Table 10. Smaller blocks mean more iterations, which in turn affects the running time. Therefore, the running time with different image block sizes is also shown in Table 10.
It can be seen that the best classification results are obtained when the size of the HSI block is set to 29 × 29. Compared with the size of 145 × 145, the smaller image block only contains the information of spatial consistency with its neighborhoods, which works better than computing all pixel points together. In addition, when the size is set to 5 × 5, the larger number of iterations leads to a huge computational cost. Therefore, the input size in DANCE was selected as 29 × 29 on the Indian Pines data.
In summary, the selection of the image block size needs to consider both the classification performance and the running time. According to the above experimental results, good performance can be achieved when a middle block size is selected among all possible sizes. Therefore, the sub-block sizes were set to 128 × 37 and 122 × 85 for the Salinas-small data and Pavia University data, respectively. However, for different data, further study of the evaluation criteria for the optimum sub-block size is still necessary.
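As a note on the blocking itself, 145 is evenly divisible by 29 and 5, so the Indian Pines image can be tiled without padding; a minimal tiling sketch is:

import numpy as np

def split_blocks(hsi, b):
    # Tile an (H, W, B) image into non-overlapping b x b blocks;
    # H and W are assumed to be divisible by b.
    H, W, B = hsi.shape
    return (hsi.reshape(H // b, b, W // b, b, B)
               .transpose(0, 2, 1, 3, 4)
               .reshape(-1, b, b, B))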

5.2. The Computation Cost of DANCE

To evaluate the computation cost of DANCE, the Indian Pines data were taken as an example. The HSI was first divided into blocks with the size of 29 × 29, and passed into DANCE ten times. The computational costs are shown in Table 11, which includes the averages and variances of disk usage, CPU usage, and the running time.
It can be seen that the proposed DANCE does not greatly increase the burden of image processing. However, the running time is still not short, and thus needs to be optimized in future research.

6. Conclusions

Motivated by the DRCNN method using feature-wise spatial regional consistency, a method named Data-wise spAtial regioNal Consistency re-Enhancement (DANCE) is proposed, which fully considers the relationship between pixels and combines the SGWT with Gaussian filtering. Then, DRCNN is used to realize the HSI classification. Experimental results show that the proposed DANCE method can effectively enhance the spatial regional consistency of images at the data-wise level. As seen in Section 4.2.1, the proposed method performs better than the other baselines and DRCNN. Firstly, compared with the other baselines, the proposed method makes full use of the spatial consistency of both the data-wise and feature-wise approaches; for both the middle and edge areas of the ground objects, the misclassified points are evidently reduced. Then, compared with DRCNN, DANCE improves the quality of HSIs by enhancing the spatial consistency of the data-wise approach and removing Gaussian noise. In particular, the accuracy of edge points is improved in the classification maps. The disadvantage is that DANCE increases the computational cost compared with DRCNN alone.
Some issues merit further research. Firstly, the results of the six regions in the DRCNN are combined by a concatenation strategy; therefore, the central region does not fulfill its role of re-correcting the misclassified points. This issue should be given further attention in future work. In addition, the proposed method does not consider the spectral correlations between different bands, which leads to the problem of redundancy, with a longer training time and larger storage space. The above two issues were not addressed in this study and will be investigated in future research.

Author Contributions

Conceptualization, L.Z.; methodology, L.Z.; validation, E.X.; formal analysis, S.H.; investigation, L.Z.; resources, Y.Y.; data curation, E.X.; writing—original draft preparation, E.X.; writing—review and editing, L.Z.; visualization, E.X.; supervision, K.Z.; project administration, L.Z.; funding acquisition, S.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 62171247 and 41921781.

Data Availability Statement

Acknowledgments

The authors would like to thank the editors and the reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yan, J.; Chen, H.; Liu, L. Overview of hyperspectral image classification. J. Sens. 2020, 2020, 4817234.
  2. Yang, F.; Chen, X.; Chai, L. Hyperspectral image destriping and denoising using stripe and spectral low-rank matrix recovery and global spatial-spectral total variation. Remote Sens. 2021, 13, 827.
  3. Saboori, A.; Ghassemian, H. Adversarial discriminative active deep learning for domain adaptation in hyperspectral images classification. Int. J. Remote Sens. 2021, 42, 3981–4003.
  4. Zhang, S.; Sun, B.; Li, S.; Kang, X. Noise estimation of hyperspectral image in the spatial and spectral dimensions. Natl. Remote Sens. Bull. 2021, 25, 1108–1123.
  5. Mohan, A.; Sapiro, G.; Bosch, E. Spatially coherent nonlinear dimensionality reduction and segmentation of hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2007, 4, 206–210.
  6. Li, J.; Zhao, C.H.; Mei, F. Detecting hyperspectral anomaly by using background residual error data. J. Infrared Millim. Waves 2010, 29, 150–155.
  7. Imani, M.; Ghassemian, H. An overview on spectral and spatial information fusion for hyperspectral image classification: Current trends and challenges. Inf. Fusion 2020, 59, 59–83.
  8. Rasti, B.; Scheunders, P.; Ghamisi, P.; Licciardi, G.; Chanussot, J. Noise reduction in hyperspectral imagery: Overview and application. Remote Sens. 2018, 10, 482.
  9. Buades, A.; Coll, B.; Morel, J.M. On image denoising methods. A new nonlocal principle. SIAM Rev. 2010, 4, 490–530.
  10. Rand, R.S.; Keenan, D.M. A spectral mixture process conditioned by Gibbs-based partitioning. IEEE Trans. Geosci. Remote Sens. 2001, 39, 1421–1434.
  11. Yue, J.; Zhang, Y.; Xu, H.; Bai, L. An unsupervised classification of hyperspectral images based on pixels reduction with spatial coherence property. Spectrosc. Spectr. Anal. 2012, 32, 1860–1864.
  12. Hammond, D.K.; Vandergheynst, P.; Gribonval, R. Wavelets on graphs via spectral graph theory. Appl. Comput. Harmon. Anal. 2011, 30, 129–150.
  13. Zikiou, N.; Lahdir, M.; Helbert, D. Hyperspectral image classification using graph-based wavelet transform. Int. J. Remote Sens. 2020, 41, 2624–2643.
  14. Zhang, M.; Li, Q. MS-GWNN: Multi-scale graph wavelet neural network for breast cancer diagnosis. arXiv 2020, arXiv:2012.14619.
  15. Dong, X.; Li, G.; Jia, Y.; Xu, K. Multiscale feature extraction from the perspective of graph for hob fault diagnosis using spectral graph wavelet transform combined with improved random forest. Measurement 2021, 176, 109178.
  16. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2094–2107.
  17. Quan, Y.; Dong, S.; Feng, W.; Dauphin, G.; Zhao, G.; Wang, Y.; Xing, M. Spectral-spatial feature extraction based CNN for hyperspectral image classification. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Waikoloa Village, HI, USA, 26 September–2 October 2020; pp. 485–488.
  18. Feng, Y.; Zheng, J.; Qin, M.; Bai, C.; Zhang, J. 3D octave and 2D vanilla mixed convolutional neural network for hyperspectral image classification with limited samples. Remote Sens. 2021, 13, 4407.
  19. Makantasis, K.; Karantzalos, K.; Doulamis, A.; Doulamis, N. Deep supervised learning for hyperspectral data classification through convolutional neural networks. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 4959–4962.
  20. Ahmad, M.; Khan, A.M.; Mazzara, M.; Distefano, S.; Ali, M.; Sarfraz, M.S. A fast and compact 3-D CNN for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  21. Zhang, M.; Li, W.; Du, Q. Diverse region-based CNN for hyperspectral image classification. IEEE Trans. Image Process. 2018, 27, 2623–2634.
  22. Zhang, Y.; He, J. Bilateral texture filtering for spectral-spatial hyperspectral image classification. J. Eng. 2019, 2019, 9173–9177.
  23. Gupta, V.; Sastry, S.; Mitra, S.K. Hyperspectral image classification using trilateral filter and deep learning. In Proceedings of the IEEE International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC), Gunupur, Odisha, India, 16–17 December 2020.
  24. Landgrebe, D.; Malaret, E. Noise in remote-sensing systems: The effect on classification error. IEEE Trans. Geosci. Remote Sens. 1986, GE-24, 294–300.
  25. Tarabalka, Y.; Fauvel, M.; Chanussot, J.; Benediktsson, J.A. SVM- and MRF-based method for accurate classification of hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2010, 7, 736–740.
  26. Roy, S.K.; Manna, S.; Song, T.; Bruzzone, L. Attention-based adaptive spectral-spatial kernel ResNet for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7831–7843.
  27. Yuan, Q.; Zhang, Q.; Li, J.; Shen, H.; Zhang, L. Hyperspectral image denoising employing a spatial-spectral deep residual convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1205–1218.
  28. Kang, X.; Li, C.; Li, S.; Lin, H. Classification of hyperspectral images by Gabor filtering based deep network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 11, 1166–1178.
  29. Zhong, Z.; Li, J.; Luo, Z.; Chapman, M. Spectral-spatial residual network for hyperspectral image classification: A 3-D deep learning framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 847–858.
Figure 1. The overall flowchart of the proposed method.
Figure 2. The selection of neighborhood nodes.
Figure 3. Four components' images before and after application of the Gaussian filter on the Pavia University image block: (a) LF to GLF; (b) HF1 to GHF1; (c) HF2 to GHF2; (d) HF3 to GHF3.
Figure 4. Comparison of results with and without DANCE: (a) reconstructed image (with GF); (b) the un-preprocessed image (without GF).
Figure 5. Four kinds of feature extraction windows with different r values.
Figure 6. The overall accuracies for different r values and different training set ratios: (a) Indian Pines data; (b) Salinas-small data; (c) Pavia University data.
Figure 7. Overall classification accuracy for different CR sizes.
Figure 8. Classification maps from the proposed DANCE-DRCNN method and the baselines on the Indian Pines data: (a) ground truth; (b) SVMMRF; (c) R-PCA-CNN; (d) 3D-CNN; (e) DRCNN; (f) DANCE-DRCNN; (g,h) enlarged edge image.
Figure 9. Classification maps from the proposed DANCE-DRCNN method and the baselines on the Salinas-small data: (a) ground truth; (b) SVMMRF; (c) R-PCA-CNN; (d) 3D-CNN; (e) DRCNN; (f) DANCE-DRCNN; (g) enlarged edge image.
Figure 10. Classification maps from the proposed DANCE-DRCNN method and the baselines on the Pavia University data: (a) ground truth; (b) SVMMRF; (c) R-PCA-CNN; (d) 3D-CNN; (e) DRCNN; (f) DANCE-DRCNN; (g,h) enlarged edge image.
Table 1. The numbers of total, training, and testing samples for the Indian Pines data.

# | Class | Total | Training | Testing
1 | Alfalfa | 46 | 12 | 34
2 | Corn-notill | 1428 | 143 | 1285
3 | Corn-mintill | 830 | 83 | 747
4 | Corn | 237 | 24 | 213
5 | Grass-pasture | 483 | 48 | 435
6 | Grass-trees | 730 | 73 | 657
7 | Grass-pasture-mowed | 28 | 7 | 21
8 | Hay-windrowed | 478 | 48 | 430
9 | Oats | 20 | 5 | 15
10 | Soybean-notill | 972 | 97 | 875
11 | Soybean-mintill | 2455 | 246 | 2209
12 | Soybean-clean | 593 | 59 | 534
13 | Wheat | 205 | 21 | 184
14 | Woods | 1265 | 127 | 1138
15 | Building-grass-trees-drives | 386 | 39 | 347
16 | Stone-steel-towers | 93 | 9 | 84
- | Total | 10,249 | 1041 | 9208
Table 2. The average Euclidean distances with and without DANCE on the Indian Pines data.

# | 5 × 5 | 10 × 10 | 15 × 15
without DANCE | 1.0775 × 10⁻³ | 3.1700 × 10⁻³ | 8.8460 × 10⁻³
with DANCE | 1.0571 × 10⁻³ | 1.4450 × 10⁻⁴ | 8.0556 × 10⁻³
Table 3. The sizes of the windows of the DRCNN.

Region | GR | RR | LR | TR | BR
Size | 13 × 13 | 13 × 8 | 13 × 8 | 8 × 13 | 8 × 13
Table 4. Classification accuracy for different parameters of the GF (%).

σ \ d | 3 | 5 | 7
0.3 | 96.68 | 96.64 | 96.95
0.5 | 98.68 | 98.82 | 97.69
0.7 | 97.88 | 96.23 | 97.36
Table 5. Comparison of the classification accuracy (%) among the proposed method, DRCNN, and the baselines on the Indian Pines data.

Class | SVMMRF | HSID-CNN | A2S2K-ResNet | R-PCA-CNN | Gabor-CNN | SSRN | 3D-CNN | DRCNN | DANCE-DRCNN
1 | 44.62 | 100.00 | 98.37 | 100.00 | 95.62 | 52.73 | 100.00 | 71.74 | 100.00
2 | 68.51 | 88.47 | 90.51 | 92.20 | 89.20 | 98.28 | 94.27 | 96.18 | 98.90
3 | 70.36 | 88.71 | 98.55 | 97.62 | 85.65 | 97.23 | 99.03 | 90.99 | 96.93
4 | 57.07 | 92.23 | 91.12 | 100.00 | 90.31 | 96.71 | 95.51 | 87.63 | 98.04
5 | 92.20 | 99.16 | 94.67 | 100.00 | 88.92 | 97.56 | 95.94 | 100.00 | 99.27
6 | 86.70 | 88.76 | 97.71 | 98.65 | 93.17 | 98.62 | 98.64 | 100.00 | 99.20
7 | 85.71 | 100.00 | 96.41 | 100.00 | 96.88 | 98.60 | 100.00 | 96.18 | 100.00
8 | 97.05 | 91.58 | 98.72 | 100.00 | 90.67 | 95.42 | 100.00 | 96.75 | 100.00
9 | 36.36 | 33.33 | 99.46 | 96.23 | 96.35 | 98.62 | 100.00 | 100.00 | 100.00
10 | 69.76 | 90.87 | 95.38 | 95.79 | 89.87 | 96.90 | 97.66 | 95.14 | 99.38
11 | 75.48 | 84.64 | 96.46 | 95.93 | 94.78 | 98.02 | 97.77 | 97.89 | 98.29
12 | 81.38 | 92.14 | 92.28 | 96.72 | 98.26 | 96.91 | 91.53 | 93.22 | 99.21
13 | 93.16 | 99.42 | 95.66 | 100.00 | 95.46 | 99.59 | 100.00 | 100.00 | 100.00
14 | 92.18 | 91.90 | 89.76 | 93.86 | 90.18 | 98.91 | 99.80 | 97.32 | 99.35
15 | 76.11 | 88.19 | 95.67 | 97.50 | 89.63 | 97.66 | 96.69 | 89.69 | 93.84
16 | 95.95 | 96.01 | 96.69 | 100.00 | 97.30 | 98.67 | 97.37 | 88.91 | 98.73
OA | 78.54 | 88.80 | 97.31 | 96.97 | 95.75 | 97.71 | 97.27 | 96.15 | 98.61
AA | 73.37 | 89.09 | 95.46 | 97.78 | 92.82 | 95.03 | 95.88 | 93.85 | 98.82
Table 6. Comparison of the classification accuracy (%) among the proposed method, DRCNN, and the baselines on the Salinas-small data.

Class | SVMMRF | HSID-CNN | A2S2K-ResNet | R-PCA-CNN | Gabor-CNN | SSRN | 3D-CNN | DRCNN | DANCE-DRCNN
1 | 100.00 | 99.73 | 100.00 | 100.00 | 100.00 | 100.00 | 99.88 | 100.00 | 100.00
2 | 99.64 | 99.28 | 99.70 | 99.62 | 99.20 | 100.00 | 99.87 | 100.00 | 100.00
3 | 98.96 | 99.49 | 99.47 | 98.94 | 99.50 | 99.49 | 100.00 | 99.88 | 100.00
4 | 99.97 | 99.93 | 99.60 | 99.95 | 99.80 | 99.30 | 100.00 | 100.00 | 100.00
5 | 99.81 | 100.00 | 100.00 | 100.00 | 99.75 | 98.50 | 99.86 | 99.86 | 100.00
6 | 79.78 | 99.97 | 87.64 | 99.95 | 85.15 | 93.40 | 100.00 | 96.34 | 100.00
7 | 99.43 | 99.99 | 99.45 | 98.37 | 99.48 | 99.30 | 100.00 | 99.89 | 100.00
8 | 98.41 | 99.76 | 99.64 | 98.30 | 98.13 | 100.00 | 99.47 | 98.81 | 100.00
9 | 97.58 | 99.86 | 100.00 | 92.10 | 95.66 | 100.00 | 100.00 | 100.00 | 100.00
OA | 92.72 | 99.83 | 98.32 | 99.27 | 97.46 | 98.28 | 99.97 | 99.91 | 100.00
AA | 97.06 | 99.78 | 98.39 | 98.30 | 97.41 | 97.88 | 99.90 | 99.42 | 100.00
Table 7. Comparison of the classification accuracy (%) among the proposed method, DRCNN, and the baselines on the Pavia University data.

Class | SVMMRF | HSID-CNN | A2S2K-ResNet | R-PCA-CNN | Gabor-CNN | SSRN | 3D-CNN | DRCNN | DANCE-DRCNN
1 | 94.56 | 94.32 | 98.22 | 99.70 | 96.45 | 99.81 | 99.85 | 99.35 | 99.81
2 | 96.08 | 95.49 | 98.90 | 99.84 | 96.95 | 99.94 | 99.93 | 99.88 | 99.97
3 | 85.73 | 94.07 | 88.91 | 92.31 | 96.09 | 99.35 | 98.46 | 99.72 | 100.00
4 | 96.41 | 99.46 | 93.56 | 99.35 | 99.22 | 99.81 | 100.00 | 98.99 | 99.85
5 | 99.59 | 99.57 | 99.11 | 100.00 | 99.92 | 99.94 | 100.00 | 100.00 | 100.00
6 | 93.18 | 97.94 | 80.26 | 94.35 | 94.69 | 99.95 | 100.00 | 98.22 | 99.98
7 | 89.66 | 97.68 | 93.31 | 98.52 | 87.36 | 100.00 | 100.00 | 100.00 | 100.00
8 | 85.75 | 84.11 | 93.64 | 94.85 | 87.38 | 98.59 | 99.46 | 98.39 | 99.97
9 | 99.88 | 98.05 | 99.37 | 98.96 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
OA | 92.18 | 94.97 | 95.23 | 97.54 | 95.67 | 99.77 | 99.82 | 99.58 | 99.92
AA | 94.15 | 95.63 | 93.92 | 98.25 | 95.37 | 99.71 | 99.74 | 99.39 | 99.95
Table 8. Comparison of classification results (%) using DRCNN with different inputs on the Indian Pines data.

# | GF-HSI | GLF | GHF1 | GHF2 | GHF3 | DANCE
1 | 100.00 | 100.00 | 73.12 | 62.00 | 51.84 | 100.00
2 | 95.76 | 83.06 | 72.35 | 67.51 | 37.89 | 98.90
3 | 96.92 | 98.99 | 87.33 | 47.15 | 63.03 | 96.93
4 | 97.54 | 100.00 | 61.16 | 63.64 | 59.88 | 98.04
5 | 100.00 | 100.00 | 96.22 | 100.00 | 78.82 | 99.27
6 | 100.00 | 100.00 | 100.00 | 99.25 | 88.13 | 99.20
7 | 95.45 | 100.00 | 81.23 | 43.59 | 39.33 | 100.00
8 | 100.00 | 99.31 | 91.15 | 99.51 | 33.57 | 100.00
9 | 100.00 | 100.00 | 87.98 | 100.00 | 79.75 | 100.00
10 | 89.43 | 95.26 | 91.45 | 92.33 | 92.33 | 99.38
11 | 96.64 | 95.55 | 84.22 | 85.42 | 70.15 | 98.29
12 | 96.38 | 96.66 | 88.07 | 53.75 | 68.45 | 99.21
13 | 97.37 | 98.40 | 71.52 | 96.69 | 75.32 | 100.00
14 | 100.00 | 99.73 | 96.49 | 84.77 | 83.41 | 99.35
15 | 87.31 | 85.36 | 68.99 | 62.14 | 80.05 | 93.84
16 | 87.37 | 79.05 | 93.16 | 100.00 | 88.12 | 98.73
OA | 96.13 | 94.52 | 86.59 | 78.61 | 69.15 | 98.61
AA | 96.26 | 95.71 | 84.03 | 74.90 | 68.13 | 98.82
Table 9. Comparison of the classification accuracy (%) among the proposed method and the baselines on the Indian Pines data.

OA | SVMMRF | HSID-CNN | A2S2K-ResNet | R-PCA-CNN | Gabor-CNN | SSRN | 3D-CNN | DRCNN
without DANCE | 78.54 | 88.80 | 97.31 | 96.97 | 95.75 | 97.71 | 97.27 | 96.15
with DANCE | 81.67 | 90.12 | 98.29 | 97.35 | 96.82 | 97.91 | 98.13 | 98.61
Table 10. Comparison of the different sub-block sizes on the Indian Pines data.

# | 145 × 145 | 29 × 29 | 5 × 5
OA (%) | 98.59 | 98.61 | 97.57
AA (%) | 97.21 | 98.82 | 96.22
running time of DANCE | 58.44 ± 2.26 s | 368.38 ± 3.62 s | 4601.54 ± 9.21 s
Table 11. Computational cost of DANCE on the Indian Pines data.

Disk Usage | CPU Usage | Running Time
1236 ± 23.4 MB | 36.12 ± 1.4% | 368.38 ± 3.62 s
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
