1. Introduction
Wildfires are a global phenomenon that causes environmental and social transformations, including biodiversity loss, soil degradation, harm to human beings and economic losses [1]. Concern over wildfires is growing because, under current climate change conditions, the duration of the fire season and the magnitude and consequences of wildfires are expected to increase [2]. In this context, precise and detailed information about the affected area is increasingly necessary, both to assess damages and to implement prevention plans and recovery processes for the affected infrastructure and ecosystems [3].
Satellite images provided by optical sensors can be used to delimit burned areas resulting from wildfires. This is possible because vegetation reflectance decreases strongly in the near-infrared (NIR) and increases in the shortwave infrared (SWIR) as a result of the dryness caused by fire [4]. In addition, differences between the spectral bands provided by satellite data can be used to compute spectral indices, which help to describe the vegetation condition, the severity of the damage or the level of carbonization of the areas affected by fires. The most common indices are the burned-area index (BAI), the normalized burn ratio (NBR) and the normalized difference vegetation index (NDVI) [5].
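As an illustration, the three indices named above can be computed from band reflectances with a few lines of NumPy. This is a minimal sketch using the commonly cited definitions (NDVI and NBR as normalized differences, BAI as the inverse squared distance to a reference charcoal point). The function names and the reflectance values are hypothetical, and the band assignment is an assumption not stated in the text (for Sentinel-2, B4 is often used for red, B8 for NIR and B12 for SWIR).

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), leaving 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def ndvi(nir, red):
    # Normalized difference vegetation index.
    return normalized_difference(nir, red)

def nbr(nir, swir):
    # Normalized burn ratio.
    return normalized_difference(nir, swir)

def bai(red, nir):
    # Burned-area index: inverse squared distance to a reference
    # charcoal point (red = 0.1, NIR = 0.06, in reflectance units).
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return 1.0 / ((0.1 - red) ** 2 + (0.06 - nir) ** 2)

# Hypothetical reflectances: pixel 0 is healthy vegetation, pixel 1 is burned.
red = np.array([0.05, 0.08])
nir = np.array([0.45, 0.12])
swir = np.array([0.15, 0.30])

ndvi_values = ndvi(nir, red)
nbr_values = nbr(nir, swir)
bai_values = bai(red, nir)
```

With these made-up values, the burned pixel has lower NDVI and NBR and a much higher BAI than the healthy one, which is the behavior the indices are designed to capture.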
In general terms, there are two approaches to mapping burned areas with satellite data. The first is rule-based: it identifies variations in the spectral signal of burned areas with respect to the unburned surroundings and, based on these, defines fixed or dynamic thresholds for the spectral bands or indices [6]. The second approach is based on machine-learning algorithms that learn the spectral features of a sample of labeled pixels and recognize those patterns in other areas of the image [7]. Machine-learning (ML) algorithms build a non-linear transformation whose parameters are estimated in a supervised or unsupervised way, by formulating an optimization problem, in order to solve classification, regression and clustering problems [8]. Recently, these types of algorithms have been used for mapping burned areas, showing good results with random forest (RF) [7,9], support vector machine (SVM) [7,10], logistic regression (LR) [11] and multilayer perceptron (MLP) neural networks trained by supervised learning [7,12].
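To make the first, rule-based approach concrete, the following minimal sketch applies a fixed threshold to the drop in NBR between a pre-fire and a post-fire image. The use of the NBR difference, the function name and the threshold value of 0.1 are assumptions chosen for illustration; they are not taken from the cited works.

```python
import numpy as np

def burned_mask_fixed_threshold(nbr_pre, nbr_post, threshold=0.1):
    """Flag pixels whose NBR dropped by more than `threshold`.

    The threshold is a hypothetical value; in practice it would be
    calibrated (fixed) or derived from the scene itself (dynamic).
    """
    dnbr = np.asarray(nbr_pre, dtype=float) - np.asarray(nbr_post, dtype=float)
    return dnbr > threshold

# Hypothetical 2 x 2 NBR images before and after a fire.
nbr_pre = np.array([[0.50, 0.48],
                    [0.52, 0.10]])
nbr_post = np.array([[0.49, -0.30],
                     [-0.25, 0.08]])

mask = burned_mask_fixed_threshold(nbr_pre, nbr_post)
```

The machine-learning approach replaces this hand-set rule with a decision function learned from labeled pixels.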
Despite the advances in the methodologies and algorithms employed, the estimation of burned areas using satellite images depends heavily on the spatial resolution of the sensor used. This is shown by the significant underestimation obtained with low-resolution sensors when compared with higher-resolution estimates [13]. Therefore, it is necessary to explore the use of platforms and sensors with higher spatial resolution, such as Sentinel-2, in order to deal with these problems.
Moreover, the computational cost of most machine-learning algorithms frequently employed in image classification grows with the size of the data set, leading to longer training and classification times [14]. However, extreme learning machines (ELM) have recently attracted increasing interest in image classification because their accuracy is comparable to that of the traditional SVM and MLP, while their training algorithm has a reduced computational cost [15].
Considering the training efficiency of the ELM algorithm and its good generalization, we hypothesized that ELM neural networks can be used to classify burned areas on medium-spatial resolution images, showing great potential to deal with this problem at scales that require a massive volume of data. Consequently, this work aims to evaluate burned area mapping on Sentinel-2 images through supervised classification with an ELM neural network, comparing ELM performance in terms of accuracy and training time with that of several machine-learning algorithms.
The rest of the article is organized as follows: Section 2 presents the methods used to compare ELM performance with that of other algorithms. Section 3 presents the methodology of the experiments. Section 4 shows the results and their discussion. Finally, Section 5 presents the conclusions.
2. Fundamentals of the Classification Algorithms Applied
In this research, several classification algorithms were used, whose fundamentals are described below.
2.1. Extreme Learning Machine
Extreme learning machine (ELM) neural networks are known for their extremely fast learning speed and for performance comparable to that of multilayer perceptron (MLP) networks and support vector machines (SVM) [16]. ELM has shown better performance than neural networks trained with backpropagation and other classification models, in terms of computational efficiency and generalization, when applied to Landsat satellite images [17].
The ELM algorithm was originally developed for single-hidden layer feedforward neural networks. One of its main features is its simple training, since the weights and biases of the hidden layer are randomly generated and the output weights are determined by solving an overdetermined linear system by means of the Moore–Penrose generalized inverse [16]. The output expression of a single-hidden layer ELM network is as follows:
$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \dots, N \tag{1}$$
where:
$L$: the number of hidden neurons.
$N$: the number of training samples.
$\beta_i$: output layer's weight vector (for hidden neuron $i$).
$w_i$: hidden layer's weight vector (for hidden neuron $i$).
$g$: activation function.
$b_i$: bias parameter (for hidden neuron $i$).
$x_j$: input vector (training sample $j$).
$t_j$: label vector of training sample $j$.
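Equation (1) can be written in matrix form:
$$H \beta = T \tag{2}$$
where $H$ is the hidden layer output matrix and $\beta$ is the matrix of output layer weights. Specifically, the $H$ matrix has the following structure:
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}, \quad \beta = \begin{bmatrix} \beta_1^{T} \\ \vdots \\ \beta_L^{T} \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^{T} \\ \vdots \\ t_N^{T} \end{bmatrix}_{N \times m} \tag{3}$$
where:
$m$: the number of outputs.
$H$: the hidden layer output matrix of the neural network.
$T$: the training data labels matrix.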
Unlike gradient descent-based neural networks, in ELM the weights of the hidden layer need not be fitted since they are randomly generated. In order to train an ELM network, it is necessary to find a least-squares solution $\hat{\beta}$ of the linear system $H \beta = T$, whose expression is as follows:
$$\hat{\beta} = H^{\dagger} T \tag{4}$$
where $H^{\dagger}$ is the Moore–Penrose generalized inverse of matrix $H$.
Given a training set, an activation function $g$ and a number $L$ of hidden neurons, the ELM training algorithm has the following steps:
Step 1: assign arbitrary input weights $w_i$ and biases $b_i$, $i = 1, \dots, L$.
Step 2: calculate the hidden layer output matrix $H$.
Step 3: calculate the output weight $\beta$: $\beta = H^{\dagger} T$.
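The three training steps above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the experiments: the sigmoid activation, the uniform range of the random weights, the number of hidden neurons and the synthetic two-class data are all assumptions made for the example, and `elm_train` and `elm_predict` are hypothetical names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng):
    """Train a single-hidden-layer ELM and return (W, b, beta)."""
    n_features = X.shape[1]
    # Step 1: random input weights and biases; they are never updated.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 2: hidden layer output matrix H, of shape (N, L).
    H = sigmoid(X @ W + b)
    # Step 3: least-squares output weights via the Moore-Penrose
    # generalized inverse, beta = pinv(H) @ T, of shape (L, m).
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for the rows of X, of shape (n_samples, m)."""
    return sigmoid(X @ W + b) @ beta

# Synthetic two-class data (assumed for illustration only):
# two well-separated clusters in a two-band feature space.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.2, scale=0.05, size=(50, 2)),  # class 0, e.g. unburned
    rng.normal(loc=0.8, scale=0.05, size=(50, 2)),  # class 1, e.g. burned
])
labels = np.repeat([0, 1], 50)
T = np.eye(2)[labels]  # one-hot label matrix, shape (N, m)

W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
predicted = elm_predict(X, W, b, beta).argmax(axis=1)
train_accuracy = (predicted == labels).mean()
```

Training involves no iterative optimization: the only fitted quantity is `beta`, obtained from a single pseudoinverse. This is the source of the reduced training cost discussed above.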
2.2. Multilayer Perceptron
Multilayer perceptron neural networks (MLP) are widely used for satellite image classification [18] and for burned area mapping applications [19]. These networks have an input layer, one or more hidden layers and an output layer [20]. They are fitted using the back-propagation algorithm, which minimizes the mean squared error of the network output using gradient descent [19,21]. In general, MLPs perform poorly when the optimization algorithm gets trapped in a local minimum or when overfitting occurs [18]. However, it is possible to improve the performance of the optimization algorithm through heuristic search techniques [21].
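For contrast with ELM, the sketch below shows the iterative fitting that back-propagation involves: a one-hidden-layer MLP whose weights are all updated by gradient descent on the mean squared error. The architecture (four sigmoid hidden neurons, one sigmoid output), the learning rate, the number of iterations and the synthetic data are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, params):
    W1, b1, W2, b2 = params
    A1 = sigmoid(X @ W1 + b1)   # hidden layer activations
    Y = sigmoid(A1 @ W2 + b2)   # network output
    return A1, Y

def mse(Y, T):
    return np.mean((Y - T) ** 2)

def gradient_descent_step(X, T, params, lr):
    """One back-propagation update of all weights and biases."""
    W1, b1, W2, b2 = params
    A1, Y = forward(X, params)
    # Gradient of the mean squared error with respect to the output.
    dY = 2.0 * (Y - T) / Y.size
    # Back-propagate through the output sigmoid and the output layer.
    dZ2 = dY * Y * (1.0 - Y)
    dW2 = A1.T @ dZ2
    db2 = dZ2.sum(axis=0)
    # Back-propagate through the hidden sigmoid and the hidden layer.
    dZ1 = (dZ2 @ W2.T) * A1 * (1.0 - A1)
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)
    return (W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2)

# Synthetic two-class data (assumed for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.2, scale=0.05, size=(50, 2)),
    rng.normal(loc=0.8, scale=0.05, size=(50, 2)),
])
T = np.repeat([0.0, 1.0], 50).reshape(-1, 1)

params = (rng.normal(0.0, 0.5, size=(2, 4)), np.zeros(4),
          rng.normal(0.0, 0.5, size=(4, 1)), np.zeros(1))

losses = []
for _ in range(2000):
    losses.append(mse(forward(X, params)[1], T))
    params = gradient_descent_step(X, T, params, lr=0.2)
losses.append(mse(forward(X, params)[1], T))
```

Unlike the single linear solve of ELM, every iteration here revisits the whole training set, and the result depends on the starting weights and the learning rate.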
2.3. Random Forest
Random forest (RF) is an ensemble model of several decision trees that are randomly constructed and individually trained. Each tree makes a classification decision, and the class with the maximum number of votes is assigned to the input data [8]. Each tree is built independently, so successive trees do not depend on the previous ones [18]. RF has shown good performance in linear and nonlinear models and is known for balancing bias and variance [7]. This algorithm is widely used on satellite image data due to the high accuracy of its classifications [22]. One of its outstanding features is that it can successfully handle high data dimensionality and multicollinearity, since it is insensitive to overfitting [23] and its training algorithm is fast [24]. Moreover, RF has high accuracy for datasets external to those considered during training, which suggests that spatial autocorrelation has a low impact on the prediction performance [25].
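The voting step can be illustrated in isolation. The sketch below assumes the individual trees have already been trained and only shows how their class predictions are combined. The function name and the example predictions are hypothetical.

```python
import numpy as np

def majority_vote(tree_predictions, n_classes):
    """Combine per-tree class labels into one label per sample.

    tree_predictions: integer array of shape (n_trees, n_samples).
    Ties are resolved in favor of the lowest class index.
    """
    preds = np.asarray(tree_predictions)
    # votes[c, s] = number of trees that assigned class c to sample s.
    votes = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)

# Hypothetical predictions of three trees for four pixels
# (0 = unburned, 1 = burned).
tree_predictions = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
])

forest_prediction = majority_vote(tree_predictions, n_classes=2)
```

Because each tree votes independently, the trees can be trained and evaluated in parallel.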
2.4. Logistic Regression
Logistic regression (LR) is a statistical model that can be used to describe the relationship between a dichotomous dependent variable and a series of independent variables [11]. LR predicts the outcome of a categorical variable from the predictor variables [18]. It is an efficient tool for burned area mapping since it provides the probability that a pixel is burned or unburned [9,26]. LR provides a good probabilistic framework for the development of burned area algorithms since the models obtained are consistent with variations in environmental variables [21].
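The probabilistic output of LR comes from passing a linear combination of the predictors through the logistic function. The sketch below illustrates this for two predictors per pixel (NBR and NDVI). The coefficients are made-up values chosen for illustration, not fitted to any data, and `burned_probability` is a hypothetical name.

```python
import numpy as np

def burned_probability(X, weights, bias):
    """Logistic model: P(burned | x) = 1 / (1 + exp(-(x . w + b)))."""
    z = X @ weights + bias
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients: low NBR and low NDVI raise the probability.
weights = np.array([-6.0, -2.0])
bias = 0.0

# Two hypothetical pixels, columns = (NBR, NDVI).
X = np.array([
    [0.5, 0.8],    # healthy vegetation
    [-0.4, 0.2],   # burned surface
])

probabilities = burned_probability(X, weights, bias)
burned = probabilities > 0.5   # decision at the usual 0.5 cut-off
```

The 0.5 cut-off is only one option; a different probability threshold trades omission errors against commission errors.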
2.5. Support Vector Machine
Support vector machine (SVM) is a statistical learning algorithm that is robust to data noise and suited to classification and nonlinear regression problems [10]. SVM transforms a nonlinear regression model into a linear model by using kernel functions to map the original input space into a new high-dimensional feature space [7]. In this high-dimensional space, SVM finds a separating hyperplane, which gives unique solutions to classification and regression problems [18]. The advantage of SVM over traditional classifiers is that it solves learning problems better when only a small number of training samples are available [27]. SVM has shown good classification results for satellite images, presenting low omission errors [19].
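The role of the kernel function can be illustrated with a small numerical check. The sketch below uses a homogeneous degree-2 polynomial kernel, chosen here only because its feature map is easy to write out; the text does not state which kernel was used. It shows that evaluating the kernel in the original two-dimensional space gives the same value as an inner product in the three-dimensional feature space.

```python
import numpy as np

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2."""
    return float(np.dot(x, y) ** 2)

def poly2_feature_map(x):
    """Explicit feature map for 2-D inputs whose inner product equals k."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

# Two arbitrary input vectors (illustrative values).
x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

k_implicit = poly2_kernel(x, y)
k_explicit = float(poly2_feature_map(x) @ poly2_feature_map(y))
```

Both values equal 16 for these inputs. The kernel therefore lets SVM work in the feature space without ever constructing the mapped vectors.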
5. Conclusions
This study has evaluated the potential of ELM neural networks as a new burned area classifier for multispectral Sentinel-2 satellite images. ELM showed promising classification performance on the test dataset and shorter training times than the other algorithms. This characteristic is a great advantage of ELM for addressing burned area mapping with medium-high spatial resolution images at national and global scales. However, as with the other machine-learning algorithms, its practical implementation requires preprocessing stages that include atmospheric and topographic correction, as well as the collection of samples to generate training datasets that are sufficiently representative, spectrally and spatially, of the burned areas.
In the current context of climate change, where an increase in the frequency and magnitude of wildfires is forecasted, there is a growing need for precise information about burned areas. This information is essential to fire risk systems and to the burned-area records used to design prevention and firefighting strategies, and it provides valuable knowledge on the effect of fires on the landscape and the atmosphere.