1. Introduction
Hyperspectral technology is a promising tool for the remote detection and monitoring of targets. A hyperspectral sensor measures electromagnetic radiation reflected from the target in a large number of narrow spectral bands. The inherent objective in target classification and assessment using hyperspectral data is to utilize its high spectral resolution [1]. However, the large dimensionality of hyperspectral data often gives rise to the Hughes phenomenon, also known as the curse of dimensionality [2]. The problem is a combined consequence of the high correlations among adjacent bands and the inability of the applied algorithm to process high-dimensional data. It is most acute in spectrally complex environments such as wetlands and swamps, where many diverse species must be monitored [1,3,4]. While a common remote sensing data processing solution involves the application of dimensionality reduction techniques or the selection of suitable narrowbands in a post-acquisition step, a hardware-based solution involves the use of programmable hyperspectral sensors in a pre-acquisition step. Programmable hyperspectral sensors typically involve a snapshot-based scanning mechanism, unlike general point or line scanning-type systems, which are non-programmable and acquire a continuous spectrum over the operable wavelength region. Several such programmable hyperspectral sensors have been developed in recent times and are increasingly being used in unmanned aerial vehicle (UAV)-based remote sensing applications [5,6,7]. A hardware-based method, such as Fabry–Pérot interferometer (FPI) technology, acquires reflected electromagnetic radiation in pre-selected optimal narrowbands, and it is programmed by changing the air gap between the internal tuneable mirrors [8]. This method has the additional benefit of efficient mapping of the environment through the selection of only the spectral features of interest, which is particularly crucial in high-resolution mapping applications using UAVs, which have limited flight times. The technology is relatively new compared to the traditional pushbroom-type hyperspectral sensors, and existing works involving the FPI have used either (1) a set of bands for generating vegetation indices (VIs), herein referred to as indices-based criteria [7,9,10], or (2) a set of bands identified through rigorous experimental testing, herein referred to as knowledge-based criteria [11,12] of narrowband selection.
Indices-based criteria for band selection have the potential to assess the condition and/or estimate the yield of the vegetation [7,9]; however, they are not principally suited for multi-target classification, since the spectral variations of the target endmembers present within the scene are subjective. Furthermore, the efficacy of the indices-based narrowband selection approach for vegetation quality or condition assessment is also subject to the characteristic reflectance of the target, and the traditional list of indices does not always ensure the best results for different vegetation communities or species. The knowledge-based approach requires a thorough understanding of the spectral variability among the targets present over the area, which is usually attained through intensive in-situ sampling and is not always realizable over difficult terrain or in scenarios requiring urgent mapping. Therefore, it is important to adopt a data-driven methodology for programmable hyperspectral sensors to estimate appropriate narrowbands for scene classification or assessment. Minet et al. [13] proposed an approach to adaptively maximize the contrast between the targets by employing a genetic algorithm (GA)-based optimization of the positions and linewidths of a limited number of filters in an FPI for military applications. However, this method is unsuitable for thematic applications of remote sensing.
Different data-driven strategies have been proposed for the selection of optimal bands for traditional remote sensing applications. A sub-optimal search strategy utilizing constrained local extremes in a discrete binary space to select hyper-dimensional features was presented in [14]. Becker et al. [3] used a second-derivative approximation to identify the spectral location of inflection. A band selection method using the correlations among bands based on mutual information (MI) and deterministic annealing optimization was also employed [15]. Becker et al. [4] proposed a classification-based assessment of three optimal spectral band selection techniques (derivative magnitude, fixed interval, and derivative histogram), using the spectral angle mapper (SAM) as a classifier. A GA-based wrapper method using a support vector machine (SVM) was proposed for the classification of hyperspectral images [16]. A double parallel feedforward neural network based on a radial basis function was used for dimensionality reduction [17]. Principal component analysis for identifying optimal bands to discriminate wetland plant species was presented in [1]. A semi-supervised band clustering approach for dimensionality reduction was developed in [18]. A particle swarm optimization (PSO)-based dimensionality reduction approach to improving SVM-based classification was suggested in [19]. Li et al. [20] and Pal et al. [21] presented a hybrid band selection strategy based on a GA-SVM wrapper to search for optimal band subsets. A method of band selection based on spectral shape similarity analysis was put forward in [22]. Nesting a traditional single PSO loop (1PSO) inside an outer PSO loop, termed 2PSO, has been found to improve the overall optimization performance in certain applications, at the expense of computational cost [23]. Su et al. [23] implemented 1PSO and 2PSO with minimum estimated abundance covariance (MEAC) [24], among other techniques, for the evaluation of optimal bands. Ghamisi et al. [25] presented a feature selection approach based on hybridization of a GA and PSO with an SVM classifier as a fitness function. The accuracies achieved by an optimized band selection method are influenced by the characteristics of the input dataset, as the search strategy depends on the classes present and their spectral profiles. Therefore, these methods need to be tested on benchmark datasets; a comprehensive evaluation of this kind is reported in [23]. However, all these existing data-driven optimal band identification studies were applied to traditional hyperspectral datasets after acquisition, and such methods are yet to be used with a hardware-based solution to pre-tune hyperspectral sensors to acquire the optimal bands.
In this study, for the first time, an in-field data-driven approach to pre-tune a snapshot-type UAV-hyperspectral sensor was devised for remote sensing applications. The method employs PSO with minimum estimated abundance covariance (MEAC) as the criterion function, a combination that [23] applied in a post-processing stage for waveband selection after hyperspectral dataset acquisition. The significant benefits are: (1) it is an efficient approach to identifying the optimal bands in-field before the survey; (2) it does not require many spectral samples per class, which is particularly an issue over difficult terrain when trying to establish a spectral library; and (3) it remains applicable when the number of observed samples is less than the total number of potential hyperspectral bands to select from, which is an important limitation of other dimensionality reduction methods, such as principal component analysis (PCA). Programmable UAV-hyperspectral sensors have increasingly been used in applications such as environmental mapping, precision agriculture, phenotyping, and forestry [12,26,27]. Identification of optimal wavelengths remains crucial for mapping vegetation communities, phenotyping functional plant traits, and identifying vegetation under biotic or abiotic stress. Our method aims to resolve these functional challenges by improving the capture of the spectral representation of an environment through a UAV-hyperspectral survey.
The rest of the paper is arranged as follows. The Materials and Methods section describes the experimental framework, including the theoretical background of the PSO-MEAC approach in relation to the elements of the proposed application. In the Results and Discussion section, we present the results of using the PSO-MEAC method for optimal band selection at the experimental site. In addition, the performance of the data-driven PSO-MEAC approach is evaluated against the traditional indices-based approach for feature selection and mapping. Finally, concluding remarks are provided in the Conclusions section.
2. Materials and Methods
This section details the study area, the ground-based hyperspectral sensing system, the processing of the hyperspectral data, the workflow for identifying optimal bands in the field, and the method for UAV-hyperspectral surveying and assessment.
2.1. The Area Used for the Experiment
The test site is an upland swamp area above an underground coal mine within the temperate highland peat swamp on sandstone (THPSS) in New South Wales, southwest of the city of Sydney, Australia (34°21′24.0″S, 150°51′51″E). The area is located in Wollongong. The focus was on spectrally diverse vegetation communities in critically endangered ecosystems distributed in the Blue Mountains, Lithgow, Southern Highlands, and Bombala regions in New South Wales, Australia [28]. The NSW National Parks and Wildlife Service (NPWS) classifies the upland swamp complexes into five major vegetation communities: Banksia Thicket, Cyperoid Heath, Fringing Eucalypt Woodland, Restioid Heath, and Sedgeland [29]. Parts of the site are inaccessible because of thick vegetation cover and steep gradients.
2.2. Hyperspectral Set-Up for Ground Based Sampling
The spectra of the target classes in the environment were measured with the visible-infrared snapshot hyperspectral (FPI) sensor (Rikola, Senop Optronics, Kangasala, Finland) connected to a separate data acquisition computer. In this mode of operation, the sensor acquires the maximum number of wavelength bands possible, i.e., 380 bands at 1 nm spectral steps between 500 and 880 nm. With a focal length of 9 mm and a field-of-view (FOV) of 36.5 × 36.5 degrees, the sensor acquires 1010 × 1010 spatial pixels in the snapshot imaging mode. In contrast, in the standalone on-board UAV-based data acquisition mode the sensor records a set of 15 programmed wavelength bands in 1010 × 1010 pixel format, i.e., up to a total of 16 megapixels of storage per hypercube. The system also records solar irradiance with an irradiance sensor for radiometric calibration, and position with a global positioning system (GPS) receiver for geometric corrections (Figure 1). All sensors were installed on a handheld mount for hyperspectral imaging. An Android mobile phone was also installed on the sensor mount and paired to the data acquisition computer with a video telemetry feed over a WiFi link to provide a real-time view of the scene, which was useful for bringing the target vegetation into focus before the collection of hyperspectral data (Figure 1a). Additionally, a real-time feed of goniometric measurements (roll and pitch) from the mobile phone’s accelerometer was relayed to the screen of the data acquisition computer to monitor the planimetric setting of the hypercubes captured with the FPI sensor (Figure 1b).
The simple design of the handheld hyperspectral imaging system was important for carrying it around in regions with dense shrub-type vegetation cover (Figure 1c). The hyperspectral data were acquired with a downward nadir orientation over the shrub-type swamp vegetation, at a distance of approximately 0.5 m from the top of the canopy (Figure 1c). In this study, the FPI sensor was used as a tool for in-field spectral acquisition to demonstrate an independent form of operation. Nevertheless, the field spectral measurements could also be obtained from other spectroradiometers, such as the ASD FieldSpec3 (Analytical Spectral Devices, Boulder, CO, USA). In that case, special care should be taken to establish proper radiometric calibration to remove any inter-sensor response mismatch; in this study, the issue was avoided by using the same FPI sensor both for in-field spectral data collection to identify the optimal bands and for the later UAV-hyperspectral data acquisition.
For identifying the optimal bands through PSO-MEAC, the hyperspectral measurements were collected for a total of three target vegetation classes, covering eight upland swamp species: Grass tree (Xanthorrhoea resinosa), Pouched coral fern (Gleichenia dicarpa), and Sedgeland complex (Empodisma minus, Gymnoschoenus sphaerocephalus, Lepidosperma limicola, Lepidosperma neesii, Leptocarpus tenax, and Schoenus brevifolius). In addition, spectral measurements were collected for background vegetation, which contained a mixture of other species that were present in small patches and not selected in this study. Finally, a background bare-earth spectrum was also collected. To obtain a proper unmixed spectrum for a single species, field sampling was performed over a region of interest with local homogeneity.
2.3. In-Field Ground-Based Hyperspectral Data Processing
Vegetation in an upland swamp environment is highly diverse, and species can exist in homogeneous and heterogeneous patches. Data collected with the portable handheld FPI system exhibited minor misalignments between bands due to unavoidable handheld movement of the sensor and slight movements of the canopy caused by wind. These arise because the FPI sensor acquires each snapshot band by band with a small delay, during which the sensor can move [26]. The hyperspectral bands were aligned using a previously developed band alignment workflow described in [26]. The data were first flat-field corrected using dark current removal and a white calibration panel; then they were converted to reflectance using calibration coefficients previously computed with an integrating sphere [7]. A band-averaged hyperspectral signal (the mean over pixels in each band) was calculated from the hypercube and used in the optimal band identification workflow. The spectrum was further treated using a Savitzky–Golay [30] smoothing filter with a polynomial order of 3 and a frame length of 17 to remove spectral noise. A PSO with MEAC as the criterion function was employed to identify the suitable bands in the field; the details of the theory of operation are in Section 2.4. The entire process of spectral signature retrieval and the PSO-MEAC workflow for suitable band identification was implemented as MATLAB (ver. 9.5) routines, and a graphical user interface (GUI) was designed for user-friendly and seamless operation in the field.
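As an illustration of the flat-field correction and band averaging described above, a minimal Python sketch is given here. It is not the authors' MATLAB routine. It assumes the common empirical form reflectance = (raw − dark) / (white − dark), scaled by a nominal panel reflectance; the function names, the `panel_reflectance` parameter, and the toy digital numbers are hypothetical.

```python
def to_reflectance(raw, dark, white, panel_reflectance=1.0):
    """Convert one band's raw digital numbers to reflectance, pixel by pixel."""
    out = []
    for r, d, w in zip(raw, dark, white):
        span = w - d
        # Guard against pixels where the panel reads no brighter than dark.
        out.append(panel_reflectance * (r - d) / span if span > 0 else 0.0)
    return out


def band_averaged_spectrum(cube, dark, white, panel_reflectance=1.0):
    """Return one mean reflectance value per band.

    cube, dark and white are lists of bands; each band is a flat list of pixels.
    """
    spectrum = []
    for raw_b, dark_b, white_b in zip(cube, dark, white):
        refl = to_reflectance(raw_b, dark_b, white_b, panel_reflectance)
        spectrum.append(sum(refl) / len(refl))
    return spectrum


# Toy example: 2 bands with 3 pixels each (made-up digital numbers).
cube = [[60, 110, 160], [30, 50, 70]]
dark = [[10, 10, 10], [10, 10, 10]]
white = [[210, 210, 210], [110, 110, 110]]
spectrum = band_averaged_spectrum(cube, dark, white)
```

The resulting per-band spectrum would then be passed to the smoothing filter before band selection.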
2.4. Optimal Band Identification Using PSO-MEAC
Particle swarm optimization (PSO) was originally used to simulate the social behaviour (movement and interaction) of the organisms (particles) in a flock of birds or a school of fish [31]. It has since been used as a robust metaheuristic computational method to improve the selection of candidate solutions for an optimization problem. The optimization operates iteratively over a swarm of candidate solutions with a criterion function as a given measure of quality. In our approach, the selected sets of bands are called particles, and a recursive update of the bands is called a velocity. The particle position $x_{id}$ denotes the selected band subset of size $k$, and velocity $v_{id}$ denotes the update for the selected band. A particle updates [31] as shown in Equation (1).
$$v_{id} = \omega v_{id} + c_1 r_1 (p_{id} - x_{id}) + c_2 r_2 (p_{gd} - x_{id}), \qquad x_{id} = x_{id} + v_{id} \tag{1}$$
where $p_{id}$ is the historically best local solution; $p_{gd}$ is the historically best global solution among all the particles; $c_1$ and $c_2$ control the contributions from local and global solutions, respectively; $r_1$ and $r_2$ are independent random variables between 0 and 1; and $\omega$ is the inertia weight to improve the convergence performance.
New velocities and positions ($v_{id}$ and $x_{id}$ on the left-hand side of Equation (1)) for the particles are updated based on the existing parameters and cost criterion upon every iteration (Figure 2). The iteration process aims to minimize the underlying criterion function.
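A single particle update of the kind in Equation (1) can be sketched in Python as follows. This is a minimal illustration of the standard PSO rule (inertia term plus local-best and global-best attraction), not the authors' implementation. The parameter values (`omega`, `c1`, `c2`), the rounding and clamping of positions to valid band indices, and the example band indices are all assumptions.

```python
import random


def pso_step(position, velocity, local_best, global_best,
             omega=0.7, c1=1.5, c2=1.5, n_bands=380, rng=random):
    """Apply one velocity and position update to a single particle.

    position holds band indices (0 .. n_bands - 1); velocity holds real values.
    """
    new_position = []
    new_velocity = []
    for x, v, p_local, p_global in zip(position, velocity, local_best, global_best):
        r1, r2 = rng.random(), rng.random()
        # Standard PSO rule: inertia + pull towards local and global bests.
        v_new = omega * v + c1 * r1 * (p_local - x) + c2 * r2 * (p_global - x)
        # Band indices are discrete, so round and clamp to the valid range
        # (an assumption of this sketch).
        x_new = min(max(int(round(x + v_new)), 0), n_bands - 1)
        new_velocity.append(v_new)
        new_position.append(x_new)
    return new_position, new_velocity


# One update of a three-band particle with made-up band indices.
rng = random.Random(0)
pos, vel = pso_step([10, 120, 300], [0.0, 0.0, 0.0],
                    [12, 118, 310], [20, 100, 350], rng=rng)
```

The sketch does not prevent two dimensions of a particle from landing on the same band; a practical band selector would need to handle such duplicates.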
In a traditional supervised classification, where representative class signatures are known through exhaustive field surveying, the band-selection process can be greatly simplified. However, in an aerial survey to determine suitable wavelength bands for a programmable UAV-hyperspectral system, such an exhaustive exercise is tedious, cumbersome, and not always possible. Therefore, MEAC was used as a criterion function in PSO, as it requires only class signatures and no training samples. The efficacy of this technique has been previously evaluated against other existing optimization methods by Su et al. [
23] for feature selection on traditional hyperspectral datasets (airborne and satellite).
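Assuming there are $p$ classes present over an area in which the samples were collected, the endmember matrix can be written as $S = [s_1, s_2, \ldots, s_p]$. According to Yang et al. [19], with linear mixing of the endmembers, the pixel $r$ can be expressed as in Equation (2):
$$r = S\alpha + n \tag{2}$$
where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_p)^T$ is the abundance vector and $n$ is the uncorrelated noise with $E(n) = 0$ and $\mathrm{Cov}(n) = \sigma^2 I$ ($I$ is an identity matrix).
Usually, the actual number of classes ($p$) is greater than the number of known class signatures ($q$); i.e., $q < p$. Hence, the noise term will have $\mathrm{Cov}(n) = \sigma^2 \Sigma$, where $\Sigma$ is the noise covariance matrix. Therefore, the abundance vector becomes the weighted least squares solution, as in Equation (3):
$$\hat{\alpha} = (S^T \Sigma^{-1} S)^{-1} S^T \Sigma^{-1} r \tag{3}$$
with first-order moment being $E(\hat{\alpha}) = \alpha$ and second-order moment being $\mathrm{Cov}(\hat{\alpha}) = \sigma^2 (S^T \Sigma^{-1} S)^{-1}$.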
The analysis demonstrates that when all the classes are known, the remaining noise can be modelled as independent Gaussian noise. For this application, where meeting such sampling criteria was difficult and unknown classes were present, noise whitening was applied first. Yang et al. [19] and Su et al. [23] performed the optimal band selection on traditional hyperspectral datasets, and used all the pixels for the background noise ($\Sigma$) estimation. In this case, the background pixels’ noise was calculated using background class spectra and bare-earth spectra collected through ground-based sampling. The background plus noise covariance is denoted as $\Sigma_{b+n}$; this estimate was used in this study. The estimate of the unknown class pixels is based on the likelihood of the unknown class (or the class of no interest) being present around the sampled class of interest. In scenes where all endmembers are of known classes (or the target classes of interest), noise estimation ($\Sigma_{b+n}$) is not required; however, this is an unlikely condition in a spectrally complex swamp environment [7].
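The identified optimal bands should allow minimal deviations of $\hat{\alpha}$ from actual $\alpha$ [23]. With the partially known classes, the criterion function is equivalent to minimizing the trace of the covariance, as in Equation (4):
$$\arg\min_{\Phi_S} \left\{ \mathrm{trace}\left[ (S^T \Sigma^{-1} S)^{-1} \right] \right\} \tag{4}$$
where $\Phi_S$ is the selected band subset, to which $S$ and $\Sigma$ are restricted; in this study, $\Sigma$ is estimated by the background plus noise covariance $\Sigma_{b+n}$. The resulting band selection algorithm is referred to as the MEAC method [23].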
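To make the criterion concrete, the following Python sketch evaluates trace[(SᵀΣ⁻¹S)⁻¹] for a candidate band subset using only the standard library. It is an illustrative sketch rather than the authors' MATLAB code: the helper names `solve` and `meac_cost`, the toy class signatures, and the identity covariance are assumptions chosen so that the result can be checked by hand.

```python
def solve(a, b):
    """Solve a @ x = b for x by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular for this band subset")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def meac_cost(signatures, cov, bands):
    """Return trace[(S^T Sigma^-1 S)^-1] restricted to the selected bands.

    signatures: one reflectance list per known class, over all bands.
    cov: background-plus-noise covariance, bands x bands.
    bands: indices of the candidate band subset.
    """
    p = len(signatures)
    # S has one row per selected band and one column per class.
    s = [[sig[b] for sig in signatures] for b in bands]
    sigma = [[cov[i][j] for j in bands] for i in bands]
    sigma_inv_s = solve(sigma, s)  # Sigma^-1 S
    gram = [[sum(s[k][i] * sigma_inv_s[k][j] for k in range(len(bands)))
             for j in range(p)] for i in range(p)]  # S^T Sigma^-1 S
    identity = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    inverse = solve(gram, identity)
    return sum(inverse[i][i] for i in range(p))


# Toy data: two classes over four bands, identity covariance (made-up values).
signatures = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5]]
cov = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
cost_separating = meac_cost(signatures, cov, [0, 1])
cost_overlapping = meac_cost(signatures, cov, [0, 2])
```

In this toy case the subset in which the two classes differ most, bands 0 and 1, yields the lower cost, which is the behaviour a PSO search over band subsets would exploit. A subset in which the signatures are linearly dependent, such as bands 2 and 3 here, makes SᵀΣ⁻¹S singular and raises an error in this sketch.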
The optimizer returns a suitably identified set of wavelength bands with the lowest cost criterion values (Equation (4)), upon successful completion of the PSO-MEAC algorithmic iterations (
Figure 2).
2.5. UAV-Hyperspectral Survey and Assessment
After the identification of a set of optimal bands through the data-driven PSO-MEAC approach, the FPI hyperspectral sensor was programmed to acquire the selected narrow wavelength bands. A UAV-hyperspectral mission was carried out in pre-planned waypoint acquisition mode with 85% forward and 75% lateral overlap from a flying altitude of 50 m. The sensor exposure time was set at 10 ms per band to provide good radiometric image quality for the existing illumination conditions. The UAV-hyperspectral survey was performed within two hours of local solar noon and in clear weather conditions with no clouds. This was done to avoid both significant illumination variations and shadows cast by clouds during the aerial image acquisition. However, because of the latitude of the experimental site in the southern hemisphere (34°21′24.0″S, 150°51′51″E), with the sun projecting a shallow incidence angle, shadows projected by trees and other tall vegetation were unavoidable. In addition to the data-driven PSO-MEAC tuned survey, another aerial survey was performed with an indices-based [7] wavelength selection approach, using the same UAV flight characteristics and sensor exposure configuration. A band stabilization workflow was adopted to co-register spatial shifts between bands in hypercubes from both aerial acquisition modes [26]. Further, the regular radiometric correction, illumination adjustment, mosaicking, and geometric correction procedures for hypercubes were carried out [7]. The UAV-hyperspectral orthomosaics achieved a high spatial resolution of 2 cm ground sampling distance.
A supervised support vector machine (SVM) classifier was used to classify the hyperspectral datasets into constituent classes. The SVM is an efficient kernel-based machine learning classifier suitable for high-dimensional feature spaces, and it is widely used in classifying hyperspectral datasets [32,33,34]. The classification was performed as an evaluation step to compare the efficacy of wavelengths identified through the data-driven PSO-MEAC and indices-based approaches. As the fundamental objective in this study was simply to evaluate the two methods, and not to achieve superior classification accuracies, the use of more complex classification algorithms was deemed unnecessary. Standard parameter settings (a radial basis function kernel with a gamma of 0.167, a penalty parameter of 100, and a pyramid level of 5) were used for the SVM classification. The overall and individual class classification accuracies were computed using the ground truth test samples.
For evaluating the efficacy of PSO-MEAC-identified bands through classification, a total of 120 ground truth measurements were collected for shrub-type swamp vegetation through a rigorous field survey, and 120 ground truth polygons were identified through visual interpretation of high-resolution hyperspectral data. The sampled ground-based (120) and image-based (120) polygons were randomly divided 1:1 into mutually exclusive sets of training and test samples, i.e., 60 ground-based and 60 image-based polygons in each of the training and test groups. The ground truth training set was used to train the SVM classifier, and the test samples were used to compute the overall accuracy (OA), kappa (κ), and confusion matrix to evaluate the classification accuracies. The spectral data from training and test sample polygons were obtained from the UAV-hyperspectral datasets in the corresponding data-driven PSO-MEAC and indices-based modes.
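The 1:1 split and the accuracy measures can be illustrated with a short Python sketch. The polygon identifiers, random seeds, and confusion-matrix counts below are made up for illustration and are not results from this study; the function names are hypothetical. Kappa is computed with the usual Cohen's formula, κ = (p_o − p_e) / (1 − p_e).

```python
import random


def split_half(polygons, seed=0):
    """Randomly divide polygons into two equal, mutually exclusive sets."""
    shuffled = list(polygons)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = test samples of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    # Undefined when expected == 1 (all samples in one class); not handled here.
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical polygon identifiers, split separately per source (120 -> 60 + 60).
ground_train, ground_test = split_half(range(120), seed=1)
image_train, image_test = split_half(range(120, 240), seed=2)

# Made-up two-class confusion matrix with 120 test samples.
confusion = [[45, 5], [10, 60]]
oa, kappa = overall_accuracy_and_kappa(confusion)
```

For the made-up matrix above, the overall accuracy is 105/120 = 0.875 and kappa is 53/71, approximately 0.746.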