1. Introduction
Recently, hyperspectral remote sensing has been broadly and successfully applied in urban planning [1], precision agriculture [2], environmental monitoring [3] and other fields, driven by the steadily increasing spectral resolution of sensors. Hyperspectral remote sensing combines spectral features with spatial images to accurately identify and detect ground objects, which provides strong technical support for ground-feature extraction [4]. However, hyperspectral image (HSI) data, acquired in hundreds of narrow and contiguous bands from the visible to the infrared regions of the electromagnetic spectrum, are characterized by high dimensionality and a large number of spectral bands [5], which makes the processing and analysis of HSI a challenging task. Therefore, dimensionality reduction becomes a crucial task for hyperspectral data analysis [6].
Feature extraction and feature selection are two typical dimension-reduction methods. Feature extraction transforms the original hyperspectral datasets into a low-dimensional and less-redundant feature space through techniques such as independent component analysis (ICA) [7], principal component analysis (PCA) [8], and local linear embedding (LLE) [9]. Although these methods can extract valuable features from HSI datasets, they often lose the physical information of the original data during data compression [10]. In contrast, feature selection picks the feature subset carrying the most information while preserving the physical meaning of the original data, which makes it an important and popular method for reducing dimensions [11]. In traditional filter methods, the feature subset is built independently of the classifier or classification algorithm and can be evaluated with different measures such as distance, correlation and information measures [12], whereas wrapper methods use the classifier model itself to evaluate feature subsets. Although filter methods are computationally simple and fast [13], they are generally less accurate than wrapper methods because they are not guided by a classifier [14]. In general, feature-selection methods can be divided into supervised and unsupervised according to the availability of class labels [15]. Unsupervised methods can select a subset of bands without class labels, but they tend to be unstable and biased due to the lack of prior information [16]. In comparison, supervised methods tend to obtain better feature-selection results with the assistance of class labels.
Supervised feature-selection methods follow three search strategies: exhaustive search, sequential search, and random search [10]. Exhaustive search requires enumerating all possible combinations of features [17], which results in unacceptable time complexity for HSI. Sequential search includes sequential forward search (SFS), sequential backward search (SBS), and sequential floating forward search (SFFS) [18]. These methods require considerable computation while tending to get stuck in local optima, and they struggle to perform well in the presence of strongly correlated bands in HSI [19]. By contrast, random search introduces randomness into the search process to escape local optima and can deliver promising results with higher efficiency. Recently, a number of nature-inspired stochastic search algorithms have been extensively utilized for feature selection owing to their strong search ability in large-scale spaces [20]. These include the genetic algorithm (GA) [21], the differential evolution (DE) algorithm [22], particle swarm optimization (PSO) [23], the gray wolf optimizer (GWO) [24], the cuckoo search (CS) algorithm [25], the artificial bee colony (ABC) algorithm [26] and the whale optimization algorithm (WOA) [27], all of which can perform well on feature-selection problems.
For HSI band selection, Nagasubramanian et al. [28] used GA to select the optimal subset of bands and a support vector machine (SVM) to classify infected and healthy samples. Additionally, the classification accuracy was replaced by the F1-score to alleviate the skewness caused by unbalanced datasets. The results showed that the bands chosen by this approach were more informative than RGB images. Xie et al. [29] proposed a band selection method based on the ABC algorithm and enhanced subspace decomposition for HSI classification. Subspace decomposition was realized by computing the correlation between adjacent bands, and the ABC algorithm was guided by enhanced subspace decomposition and maximum entropy to optimize the combination of selected bands, which provided high classification accuracy compared with six related techniques. Wang et al. [30] proposed a wrapper feature-selection approach based on an improved ant lion optimizer (ALO) and a wavelet SVM to reduce the dimension of HSI. Lévy flight was used to help ALO jump out of local optima, and the wavelet SVM was introduced to improve the stability of the classification results. The results showed that the proposed method can provide satisfactory classification accuracy with fewer bands. Subsequently, Wang et al. [31] designed a new band selection method that uses a chaos operation to set corresponding indices for the top three gray wolves in GWO, improving its optimization ability; experimental results demonstrated that this approach can obtain a suitable band subset and achieve superior classification accuracy. Kavitha and Jenifa [32] used the discrete wavelet transform with eight taps and four taps to extract the important features, applied the PSO algorithm to search for the optimal band subsets, and utilized an SVM classifier to classify HSI effectively. Medjahed et al. [33] introduced a novel band selection framework based on a binary CS algorithm. The experiment compared the optimization ability of CS under two different objective functions and proved that it could obtain better results than related approaches while using only a few instances for training. Su et al. [34] proposed a modified firefly algorithm (FA) to deal with the band selection problem by minimizing the objective function, which outperformed SFS and PSO. In essence, band selection is an NP-hard problem: as the number of bands increases, the above algorithms may suffer from premature convergence and even optimization stagnation.
Hybrid rice optimization (HRO) [35] is a recently proposed nature-inspired algorithm that has been successfully applied to image processing and the knapsack problem because of its simple structure and strong optimization ability. For example, Liu et al. [36] presented an image segmentation method that used HRO to find the fittest multi-level thresholds with Renyi’s entropy as the fitness function, and experiments proved that HRO prevailed over six other commonly used evolutionary algorithms on most metrics. Su et al. [37] designed two different hybrid models for the complex large-scale 0–1 knapsack problem by using novel combinations of an improved HRO and a binary ant colony algorithm, which achieved better performance on datasets of different sizes. In addition, Ye et al. [38] regarded band selection as a combinatorial optimization problem and employed binary HRO to select the optimal band set for HSI, which obtained good results in classification precision and execution efficiency. Although the HRO algorithm has contributed to acquiring satisfactory results, the primary HRO inadequately exploits the current best solution during each search iteration.
Recently, the DE algorithm has been successfully combined with other swarm intelligence algorithms to solve diverse optimization problems. Tubishat et al. [39] employed evolutionary operators from the DE algorithm to help each whale seek better positions and improve the local search capability of WOA for feature selection in sentiment analysis. Jadon et al. [40] proposed a hybrid of the DE and ABC algorithms to enhance convergence and the balance between exploration and exploitation. Houssein et al. [41] hybridized the adaptive guided DE algorithm with the slime mold algorithm for combinatorial optimization problems, verifying that evolutionary operators can boost the local search capability of swarm agents. Hence, a modified HRO (MHRO) based on an opposition-based learning (OBL) strategy and DE operators is proposed in this paper to overcome the disadvantages of the standard HRO. The main contributions of this paper are summarized as follows:
- (1)
OBL strategy is introduced to enhance the diversity of the initial population and accelerate the convergence of MHRO;
- (2)
DE operators are embedded into the search process of MHRO to enhance the local exploitation ability;
- (3)
The MHRO algorithm is applied to band selection, and its performance is demonstrated on standard HSI datasets.
The remainder of the paper is organized as follows: Section 2 briefly gives a fundamental overview of the related techniques and the standard HRO algorithm. The methodology and the specific workflow of the proposed band selection approach are introduced in Section 3. Section 4 presents the experimental results and comparative studies. Finally, conclusions and future work are summarized in the final section.
3. The Proposed Band Selection Method
To overcome the disadvantages of the primary HRO algorithm, two strategies are used to enhance the performance of HRO for handling the band selection problem. The main steps of the proposed technique are described in the following subsections.
3.1. The Coding Scheme
The key to handling the band selection problem is to establish an appropriate mapping between the problem solution and the algorithm’s coding. For band selection of HSI, each band has two candidate states, selected or not selected, which makes binary coding a suitable representation. In HRO, each gene bit is represented by “1” or “0”, where “1” means that the corresponding band is selected and will be utilized for training, and “0” means that the corresponding band is not chosen. Supposing that an HSI contains ten bands and the binary coding of MHRO is “1001100101”, the 1st, 4th, 5th, 8th and 10th bands will be selected to complete the subsequent classification task.
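As an illustration of this coding scheme, the following sketch (with hypothetical helper names, not taken from the paper) decodes such a binary string into band indices and extracts the corresponding bands from an HSI cube:

```python
import numpy as np

def decode_band_mask(coding: str) -> list[int]:
    """Map a binary coding string to the 1-based indices of selected bands."""
    return [i + 1 for i, bit in enumerate(coding) if bit == "1"]

# The example from the text: ten bands, coding "1001100101".
coding = "1001100101"
print(decode_band_mask(coding))  # [1, 4, 5, 8, 10]

# Selecting the corresponding bands from a (rows, cols, bands) HSI cube.
hsi = np.random.rand(64, 64, 10)  # stand-in for a real HSI cube
selected = hsi[:, :, [i - 1 for i in decode_band_mask(coding)]]
print(selected.shape)             # (64, 64, 5)
```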
3.2. The Objective Function
Further, the proposed band selection method is developed to minimize the fitness (objective) function by adopting the MHRO algorithm. The main purpose of this method is to select the most informative band subset from the original bands, so as to maximize the classification accuracy. Accordingly, SVM is adopted to conduct the classification on the HSI datasets, and the classification accuracy forms part of the objective function. In band selection, classification accuracy is an important metric, but reducing the number of redundant bands is also one of the most crucial goals. Therefore, the objective function shown in Equation (12) is utilized in this paper:

$$f = \omega \times (1 - OA) + (1 - \omega) \times \frac{n}{N} \quad (12)$$

where $f$ denotes the fitness value and $OA$ represents the overall classification accuracy, whose concept is described in Appendix A.1. Note that $N$ and $n$ are the total and the selected number of bands, respectively. $\omega$ is a weight factor that balances the classification accuracy against the number of selected bands; a fixed value of $\omega$ is adopted in the paper.
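A minimal sketch of this wrapper fitness evaluation, assuming a scikit-learn SVM and cross-validated overall accuracy as $OA$; the weight value, kernel, and fold count here are illustrative placeholders, not the paper’s exact settings:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(mask: np.ndarray, X: np.ndarray, y: np.ndarray,
            omega: float = 0.9) -> float:
    """Equation (12): omega * (1 - OA) + (1 - omega) * n / N, to be minimized.

    mask  -- binary vector over all bands (1 = band selected)
    X, y  -- pixel spectra (samples x bands) and class labels
    omega -- placeholder weight; the paper fixes its own value
    """
    n_total = X.shape[1]
    selected = np.flatnonzero(mask)        # indices of bands coded as 1
    if selected.size == 0:                 # an empty subset is infeasible
        return np.inf
    # Overall accuracy (OA) of an SVM trained on the selected bands only.
    oa = cross_val_score(SVC(kernel="rbf"), X[:, selected], y, cv=5).mean()
    return omega * (1.0 - oa) + (1.0 - omega) * selected.size / n_total
```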
3.3. The Implementation of the Proposed MHRO
The proposed band selection method is easy to implement, and its idea is to choose the optimal band subset yielding satisfactory classification results. The two improvements contained in the proposed MHRO algorithm are presented in Figure 1. The first improvement is to adopt OBL in the initialization stage, whose aim is to improve the population diversity. The second improvement is the combination of DE operators and the binary HRO algorithm, which improves the local search ability of the algorithm. The main procedure of these strategies in MHRO is described as follows:
OBL: In the population initialization stage, the position of each rice seed is randomly generated in the specified space. Then, a new population is formed by generating the corresponding opposite individual for each rice seed in the initial population using the OBL mechanism. Next, the individuals in the initial and new populations are sorted by their fitness values, and the top individuals are selected to enter the final population. The main steps of OBL initialization are as follows (a sketch in code follows these steps):
- (1)
Initialize the location of each rice seed randomly. Let $X_i = (x_{i1}, \ldots, x_{iD})$ be the $i$-th rice seed in the initial population $X$, where $1 \le i \le N$ and $1 \le j \le D$; $N$ denotes the population size and $D$ represents the dimension of the problem;
- (2)
A new population $OX$ is obtained by applying Equation (8) to each rice seed in the population $X$;
- (3)
The fittest $N$ individuals are chosen from the union $\{X \cup OX\}$ to constitute the initial population of the MHRO algorithm.
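A compact sketch of this OBL initialization for binary codings, under the common convention that the opposite of a binary gene is its complement (a binary analogue of the continuous OBL rule in Equation (8); helper names are illustrative, not from the paper):

```python
import numpy as np

def obl_init(pop_size: int, dim: int, fit, rng=np.random.default_rng()):
    """Opposition-based initialization: keep the fittest pop_size individuals
    from the union of a random binary population and its opposite population."""
    X = rng.integers(0, 2, size=(pop_size, dim))    # random binary rice seeds
    OX = 1 - X                                      # opposite individuals
    union = np.vstack([X, OX])
    scores = np.array([fit(ind) for ind in union])  # lower fitness is better
    return union[np.argsort(scores)[:pop_size]]
```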
DE operators: In HRO, only individuals in the sterile and restorer lines are updated, while the maintainer line is ignored, which reduces the search performance of the algorithm on high-dimensional band selection. Therefore, DE evolution operators are applied to the genetic sequences of each rice seed in the maintainer line to find better rice seeds by using Equations (9)–(11). In order to reduce the possibility of falling into a local optimum, the mutation factor $F$ in Equation (9) is set as a random number between 0 and 1, where $X_{r1}$, $X_{r2}$ and $X_{r3}$ are randomly selected individuals in the maintainer line. If the fitness value of the newly generated trial solution is better than that of the current individual, the current individual is replaced; otherwise, it is kept unchanged. A sketch of this update is given below.
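The following sketch illustrates the classic DE/rand/1 mutation, binomial crossover, and greedy selection that Equations (9)–(11) typically denote, applied to one maintainer-line individual; the binarization step and the crossover rate are illustrative assumptions, not the paper’s exact operators:

```python
import numpy as np

def de_update(i: int, maintainer: np.ndarray, fit, CR: float = 0.5,
              rng=np.random.default_rng()) -> np.ndarray:
    """DE/rand/1 with binomial crossover and greedy selection for individual i."""
    N, D = maintainer.shape
    # Three distinct individuals from the maintainer line, all different from i.
    r1, r2, r3 = rng.choice([k for k in range(N) if k != i], size=3, replace=False)
    F = rng.random()                              # mutation factor in (0, 1)
    # Mutation (Eq. (9) analogue) on the gene sequence.
    v = maintainer[r1] + F * (maintainer[r2] - maintainer[r3])
    # Binomial crossover (Eq. (10) analogue).
    mask = rng.random(D) < CR
    mask[rng.integers(D)] = True                  # guarantee one gene from v
    u = np.where(mask, v, maintainer[i])
    u = (u > 0.5).astype(int)                     # illustrative binarization
    # Greedy selection (Eq. (11) analogue): keep the better of trial and current.
    return u if fit(u) < fit(maintainer[i]) else maintainer[i]
```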