Article

Spatio-Temporal Agnostic Sampling for Imbalanced Multivariate Seasonal Time Series Data: A Study on Forest Fires †

by Abdul Mutakabbir 1,*, Chung-Horng Lung 2, Kshirasagar Naik 3, Marzia Zaman 4, Samuel A. Ajila 2, Thambirajah Ravichandran 5, Richard Purcell 6 and Srinivas Sampalli 6
1 Department of Data Science, Analytics, and Artificial Intelligence, Carleton University, Ottawa, ON K1S 5B6, Canada
2 Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada
3 Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
4 Research and Development, Cistel Technology, Nepean, ON K2E 7V7, Canada
5 Research and Development, Hegyi Geomatics Inc., Nepean, ON K2E 7K3, Canada
6 Faculty of Computer Science, Dalhousie University, Halifax, NS B3H 4R2, Canada
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper "Spatio-Temporal Agnostic Deep Learning Modeling of Forest Fire Prediction Using Weather Data", published in the Proceedings of the IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), Torino, Italy, 26–30 June 2023. https://doi.org/10.1109/COMPSAC57700.2023.00054.
Sensors 2025, 25(3), 792; https://doi.org/10.3390/s25030792
Submission received: 20 December 2024 / Revised: 22 January 2025 / Accepted: 26 January 2025 / Published: 28 January 2025
(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2024)

Abstract:
Natural disasters are mostly seasonal and are caused by anthropological, climatic, and geological factors that impact human life, the economy, ecology, and natural resources. This paper focuses on forest fires, which have become increasingly widespread and destructive in recent years. Data obtained from sensors for predicting forest fires and assessing fire severity, i.e., area burned, are multivariate, seasonal, and highly imbalanced, with a ratio of 100,000+ non-fire events to 1 fire event. This paper presents Spatio-Temporal Agnostic Sampling (STAS) to overcome the challenge of highly imbalanced data. It first presents a mathematical understanding of fire and non-fire events and then a thorough complexity analysis of the proposed STAS framework and two existing methods, NearMiss and SMOTE. Further, to investigate the applicability of STAS, binary classification models (to determine the probability of a forest fire) and regression models (to assess the severity of a forest fire) were built on the data generated from STAS. A total of 432 experiments were conducted to validate the robustness of the STAS parameters. Additional experiments with a temporal data split were conducted to further validate the results. The results show that 180 of the 216 binary classification models achieved an F1 score > 0.9 and 150 of the 216 regression models achieved an R² score > 0.75. These results indicate the applicability of STAS for fire prediction with highly imbalanced multivariate seasonal time series data.

1. Introduction

The occurrence of natural phenomena, such as forest fires, tsunamis, earthquakes, and cyclones, may lead to natural disasters spanning large areas in different geographical regions. This research presents an under-sampling framework to predict natural disasters, using forest fires as an example. Predictions on the occurrence of these phenomena can be performed using time series, multivariate, multi-source, and seasonal data collected from sensors and satellites [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]. The data are also recorded over consistent periods, such as hourly, daily, or monthly. Additionally, sensor data are recorded from multiple spatial locations at the same time. When using such data, the challenge of sampling the majority class (non-event data) arises due to the imbalance in data among classes. This is because the event data are spread far apart in space and time, leading to a very large amount of non-event data. Note that event and non-event are used in a general context here; in the context of forest fires, they are referred to as fire and non-fire events (elaborated in Section 4.1).
Natural phenomena, such as forest fires, are essential for ecological integrity. Coexistence with forest fires by allowing prescribed burns can help maintain ecological integrity [16]. However, forest fires may also be caused by humans intentionally or by accident. Further, they can occur naturally by lightning, provided that suitable meteorological conditions exist for ignition [15,17,18]. This paper confines itself to naturally occurring forest fires.
Canadian Geographic pointed out that in the 2023 fire season, as of August 2023, forest fires in Canada had burned 15.2 million hectares compared with 7.1 million hectares burned in 1995 [19]. NASA reported that in early August 2023, more than 300 megatons of carbon emissions had been generated by forest fires in Canada, more than three times what had been generated in recent decades [20]. Hence, it is important to reliably predict forest fires and fire severity.
The data available for forest fire prediction, such as weather station data, are highly imbalanced, with a ratio of 100,000+ non-fire events to 1 fire event. Such imbalanced datasets hamper forest fire research and prediction. The commonly used techniques, such as NearMiss and the Synthetic Minority Over-sampling TEchnique (SMOTE), are explained in detail in Section 2.2. The primary limitation of these techniques is that they are computationally expensive, especially for real-time evolving data. To overcome the challenge of sampling highly imbalanced multivariate seasonal time series data in real time, Spatio-Temporal Agnostic Sampling (STAS) was introduced in [8]. The contributions of this paper are as follows:
1. A mathematical description of multivariate time series event and non-event data is provided along with a description of the STAS framework;
2. The computation speed gained by STAS over two other common sampling algorithms (NearMiss and SMOTE) for real-time applications in natural forest fire disasters is presented through a time complexity analysis;
3. Algorithms for K-Nearest Sensor Data Aggregation and Spatio-Temporal Agnostic Sampling in the STAS framework have been modified for better understandability and readability;
4. Validation of the robustness of the parameters proposed in the STAS framework [8] was conducted through an extensive set of possible parameter values;
5. An additional set of experiments based on a temporal split of fire and non-fire event data was conducted, demonstrating that the binary classification and regression models used in STAS are not impacted by current or future events during training.
The available tools for forest fire prediction may not be reliable due to the possibility of regional bias. Univariate data are not sufficient for predicting forest fires and their severity: lightning, wind, groundwater level, precipitation, elevation, and slope all influence forest fires and their severity [21]. In this research, a total of 432 experiments were conducted, with 216 experiments each for binary classification and regression, to demonstrate that STAS can be used with highly imbalanced multivariate seasonal time series data to identify the change in features with high prediction accuracy.
The rest of this paper is organized as follows: Section 2 presents the related work and background. Section 3 discusses the datasets and the methodology is presented in Section 4. The experiments along with the results are presented in Section 5. A discussion of STAS and concluding remarks are provided in Section 6 and Section 7, respectively.

2. Literature Review and Background

This section reviews the literature on the main topics covered by this research. The literature is categorized into themes, such as sampling techniques, the fire weather index, and forest fire prediction models, which are further subdivided. This section also presents the terminology used and the background of the study.

2.1. Terminology

The following are some of the terms used in this study:
  • Time Series: Data points recorded over a period of time, in successive order with regular time intervals.
  • Multivariate: A dataset with more than one independent variable for any given data point.
  • Seasonality: Time series data points having regular and periodic changes that occur at near-constant time intervals.

2.2. Sampling Techniques

In imbalanced data, the class with a higher number of samples is referred to as the majority class, and the class with a lower number of samples is referred to as the minority class. In natural phenomena, such as forest fires, the majority of the data points are for non-fire data. When models are trained on such imbalanced data, they tend to accurately predict only non-fire events [12,13].
Sampling is a technique used to select a subset from a population. Sampling has been frequently used for sensor data [11,12,13,22,23,24,25]. The common approaches are to either under-sample the majority class or over-sample the minority class, or a combination of both. Some sampling techniques are random sampling, NearMiss, SMOTE, and Information Based Optimal Subdata Selection (IBOSS) [23,24,25].

2.2.1. Random Sampling

Random sampling is one of the most popular under-sampling techniques, and it was used in [13], where a random sample of data points was taken from the majority class. It was found that models trained using randomly sampled data could not identify fire events but were able to correctly classify non-fire events [13].
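As an illustration, random under-sampling of the majority class can be sketched in a few lines; this is a hypothetical helper, not the implementation used in [13]:

```python
import numpy as np

def random_undersample(majority, n_keep, rng=None):
    """Illustrative random under-sampling: keep a uniform random
    subset of n_keep points from the majority class."""
    idx = np.random.default_rng(rng).choice(len(majority), size=n_keep, replace=False)
    return majority[idx]
```

Because the subset is chosen uniformly, rare but informative majority points near the decision boundary may be discarded, which is consistent with the poor fire-event recall reported in [13].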

2.2.2. NearMiss

The NearMiss approach is another popular under-sampling technique [26,27,28]. It was used in previous research [12,13] to under-sample the data. There are three variants. NearMiss-1 selects the data points from the majority class with the smallest average distance to the three closest data points from the minority class. NearMiss-2 selects the data points from the majority class with the smallest average distance to the three furthest data points from the minority class. In NearMiss-3, for every minority class data point, a given number of the closest majority class data points are selected; this number can be chosen when running the algorithm. In [12], it was observed that with under-sampling, as the ratio of fire to non-fire events approached 1:1, the models started to identify fire events but no longer identified non-fire events as accurately as with no sampling. An effective under-sampling ratio was hard to determine.
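For illustration, NearMiss-1 as described above can be sketched as follows; this is a simplified, hypothetical implementation (the cited studies use library implementations):

```python
import numpy as np

def near_miss_1(majority, minority, n_keep, n_neighbors=3):
    """Illustrative NearMiss-1: keep the n_keep majority points whose
    average distance to their n_neighbors closest minority points is smallest."""
    # Pairwise Euclidean distances, shape (n_majority, n_minority)
    dists = np.linalg.norm(majority[:, None, :] - minority[None, :, :], axis=2)
    # Average distance to the n_neighbors nearest minority points
    avg = np.sort(dists, axis=1)[:, :n_neighbors].mean(axis=1)
    keep = np.argsort(avg)[:n_keep]
    return majority[keep]
```

The pairwise distance matrix is what makes NearMiss expensive on large datasets: its cost grows with the product of the majority and minority class sizes, as quantified in the time complexity analysis of Section 4.3.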

2.2.3. Synthetic Minority Over-Sampling TEchnique

SMOTE is a statistical technique used to address class imbalance by over-sampling the minority class. It was used in [12]. It first selects a sample a from the minority class. Then, it selects the x minority class samples closest to a. Finally, it generates synthetic samples by linear interpolation between a and one of these x neighbors. Over-sampling produced better results than under-sampling [12].
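The interpolation step described above can be sketched as follows; this is an illustrative, simplified SMOTE (the function and parameter names are hypothetical, not the reference implementation):

```python
import numpy as np

def smote_sample(minority, n_synthetic, k=3, rng=None):
    """Illustrative SMOTE: for each synthetic point, pick a minority sample a,
    one of its k nearest minority neighbors b, and interpolate between them."""
    rng = np.random.default_rng(rng)
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)               # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]  # k nearest minority neighbors
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(minority))           # sample a
        j = rng.choice(neighbors[i])              # neighbor b
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)
```

Each synthetic point lies on the segment between two real minority points, so SMOTE densifies the minority region rather than duplicating samples.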

2.2.4. Spatio-Temporal Agnostic Sampling

A common problem with machine learning models trained on extremely imbalanced data is that they favor predicting the majority class. In [12], when the models were trained without sampling the data, they mostly predicted the non-fire class. STAS [8], instead of having the models discriminate between the majority and minority classes, transforms the problem into identifying the change in features over a time frame M preceding an event. It is a repeatable under-sampling technique that generates a balanced dataset for large-scale evolving data, ensures that seasonality does not impact prediction, and can be applied to both sensor and satellite data. We proposed and validated the STAS framework to predict forest fires using weather data in [8,9] and later extended it to federated learning in [10]. In [9], it was also validated that features completely independent of forest fires, i.e., hydrometric data, do not increase or decrease the performance of the models. In [8], the STAS parameters used in the framework were N ∈ {7, 14, 30}, M ∈ {3, 5, 6, 7, 9, 12}, and K ∈ {1, 3, 5}, where N represents the number of past days of information each feature has in a data point, M represents how many months back to select a non-fire event, and K determines how many stations' values are averaged to make predictions for sensor data.

2.3. Fire Weather Index

The Canadian Fire Weather Index (FWI) system plays a major role in forest fire prediction. It is a part of the Canadian Wildland Fire Information System (CWFIS), which provides the fire danger level values for the entire country. It is the principal source of fire intelligence for forest fire management services [21]. It has been successfully migrated to other parts of the world, such as Poland [29], Portugal [30], China, Italy [31], Indonesia, Spain, and the United States. The Canadian FWI system has been extensively applied in other research [21,29,31,32,33]. Stocks et al. [32] provided a good background to this research. It was highlighted in [33] that wind, temperature, humidity, and rain are useful in predicting fire spread.
The Canadian FWI system includes fire weather behavior based on temperature, relative humidity, wind, and rain. A pictorial representation of the FWI system is shown in Figure 1. The Canadian FWI system has six components divided equally into fuel moisture codes and fire behavior indices. The fuel moisture codes are Fine Fuel Moisture Code (FFMC), Duff Moisture Code (DMC), and Drought Code (DC) shown in blue in Figure 1. The three fire behavior indices are Initial Spread Index (ISI), BuildUp Index (BUI), and FWI, shown in red in Figure 1.
In the Canadian FWI system, the fuel moisture codes are calculated from overlapping features. For example, all three (FFMC, DMC, and DC) use temperature and rain to calculate their values, and FFMC and DMC also share relative humidity. The fire behavior indices are then calculated using FFMC, DMC, and DC. This leads to redundant use of readings, such as temperature, relative humidity, and rain. Further, the equations used to calculate these values have seasonal constants that need to be calibrated with intervention from human experts in the field. A detailed overview of the calculations can be found in [34]. Additionally, the FWI does not consider a region's forestry data, so it may indicate fire danger even where no forests exist. Meanwhile, in [10], the STAS models proved effective in diverse environments. Features similar to those of the Canadian FWI system were used in STAS; however, no human calibration was needed.

2.4. Forest Fire Prediction Models

As concluded in [35], the National Fire Data Base (NFDB) points and polygons give the best estimates of fire season start and end dates. The aim of [35] is to examine the trends in fire-regime changes. Research on neural network forecasting for seasonal and trend time series [36] showed that neural networks do not capture seasonal or trend variations with raw data. The model proposed in [37] was developed to predict the severity of small and frequent fires using meteorological input. The researchers proposed using a Support Vector Machine (SVM) and Random Forest (RF). Much earlier, [38] presented a forest fire risk prediction algorithm based on SVM with only meteorological data.
The research in [11,37,38,39,40,41,42,43] focuses on using meteorological data to make predictions, but these studies are limited to particular regions. The research in [44,45,46] considers lightning as the primary factor for predicting natural forest fires; these too are limited to narrow regions. This leads to regional bias in forest fire predictions. Hence, this research proposes to use data from a large geographic area.

2.5. Metrics

There are different types of performance metrics for classification tasks with machine learning, such as accuracy, recall (sensitivity), specificity, and precision. Accuracy is a common metric used in classification.
Recall indicates what proportion of the data belonging to a class is classified correctly in that class by the classifier. The formula for recall is given in Equation (1).
Recall = TruePositive / (TruePositive + FalseNegative)
Precision indicates what proportion of the points predicted as positive are actually positive. The formula for precision is shown in Equation (2).
Precision = TruePositive / (TruePositive + FalsePositive)
The F1 score (F1) is a standard machine learning metric used in classification models. It is the harmonic mean of precision and recall. The value of F1 is considered excellent if it is in the range (0.9, 1], good if in (0.8, 0.9], poor if in (0.5, 0.8], and bad if in [0, 0.5]. The formula for F1 is shown in Equation (3).
F1 = (2 × Precision × Recall) / (Precision + Recall)
The R² score (R²) is a metric used to evaluate the performance of a regression-based machine learning model by measuring the proportion of the variance in the dependent variable explained by the independent variables. It measures the goodness of fit. Significant variance is explained if R² is in the range (0.75, 1], good variance is explained if in (0.5, 0.75], and little or no variance is explained if in [0, 0.5]. R² can be calculated using Equation (4), where n is the number of samples, y_i is the i-th target value, ŷ_i is the i-th predicted value, and μ_y is the mean of the target values.
R² = 1 − Σ_{i=1}^{n} (y_i − ŷ_i)² / Σ_{i=1}^{n} (y_i − μ_y)²
This paper uses F1 as the metric for classification tasks because F1 incorporates precision and recall and therefore reflects model performance on imbalanced data. R² is used as the metric for regression models because it measures the goodness of fit for a given set of inputs.
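The metrics above can be computed directly from their formulas; a minimal sketch mirroring Equations (1)–(4), with hypothetical helper names:

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 from confusion-matrix counts, per Equations (1)-(3)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def r2_score(y_true, y_hat):
    """R-squared per Equation (4): 1 - (residual sum of squares / total sum of squares)."""
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot
```

Note that R² equals 0 when the model does no better than predicting the mean of the targets, and 1 for a perfect fit.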

3. Datasets

This section describes the datasets used in this research. Two sources of data were used, the Canadian National Fire Database (CNFDB) and the Canadian Weather Energy and Engineering Datasets (CWEEDS), which are summarized in the following subsections. The final dataset used to predict forest fires is constructed by merging the CWEEDS and CNFDB datasets (Table 1), as explained in Section 4.2. Table 2 lists the CWEEDS metadata features used in the preprocessing. The size of the input for modeling varies with the parameter N in STAS. Each data point in the final dataset has N historic days of information for all the features listed in Table 3. The final dataset has a 1:1 ratio of fire to non-fire events. It covers the entire country of Canada from 1998 to 2018 at a granularity of one day and contains roughly 30,000 data points, approximately two times the number of naturally occurring forest fires (roughly 15,000) in the CNFDB.

3.1. Canadian National Fire Database

The CNFDB dataset [47] provides historical information on fire events for all of Canada from 1917 to 2020. The database has been updated over time. Since this dataset is a ledger of forest fires across Canada, it has no temporal granularity; instead, it has a spatial extent covering all of Canada. It also includes information on the shape, time, cause, and location of the fire events. The dataset is available as a shapefile. A visual representation of the distribution of the fires in the CNFDB is given in Figure 2. The causes of fires across Canada are shown in Figure 2a: fires caused by human factors are shown in red, naturally (lightning) caused fires in yellow, and fires of unknown cause in purple. Figure 2b shows the distribution of naturally caused fires over the years in the CNFDB, and Figure 2c shows their distribution by month. Most fires occur between June and August and are seasonal.
There are many features available in CNFDB. When making predictions using this dataset, some features, such as MORE_INFO, POLY_DATE, and ACQ_DATE, are not needed. Apart from this, there are redundant features in the dataset, such as YEAR, MONTH, and DAY, which are represented in REP_DATE. Table 1 lists the features from the CNFDB dataset used in this research.

3.2. Canadian Weather Energy and Engineering Datasets

The CWEEDS dataset [48], available from 1998 to 2018, has an hourly granularity. A document detailing the data can be found in [48]. CWEEDS was chosen because it provides features similar to those used in the Canadian FWI. This dataset provides a historical record of the meteorological (weather) data of Canada for multiple weather stations. It contains one file with all the weather station metadata, and each weather station's data are contained in a separate file. The CWEEDS weather station metadata fields used in this research are presented in Table 2.
The preprocessed weather station data features are provided in Table 3 along with their units. Column three specifies the preprocessing done, while column four specifies the final preprocessed unit. Further, in the unprocessed weather station data files, each feature has a flag value associated with it. The flag provides information on how the recording was made; this information was discarded in the processed dataset. The features having digit codes as their unit, such as sky layers and weather, are a combination of multiple digit values (shown in dark purple in Table 3) and are expanded in the final preprocessed dataset.

4. Methodology

This section first describes event and non-event data and provides a table for the notation used in this research. Then, the STAS framework is discussed in detail. Following this, a time complexity analysis of the STAS framework and comparison with NearMiss and SMOTE is presented. Finally, the deep learning model architecture used is discussed.

4.1. Fire and Non-Fire Events

In this research, event(s) and non-event(s) are referred to in the context of any natural disaster. When used in the context of forest fires, they are referred to as fire and non-fire event(s). Consider a single forest fire event f that belongs to the set F of all forest fire events. Here, f_s, f_g, and f_l represent REP_DATE (the start date of f), GEOMETRY (the final two-dimensional burnt area of f), and the center of GEOMETRY of f, respectively, from Table 1.
F = { f | f is a forest fire event starting at f_s with a final burned area f_g having its center at location point f_l }
Let S be the set of sensors (weather stations in the case of forest fires) as shown in Table 2.
S = { s | s contains the metadata of the sensor located at s_l }
Finally, consider W, a set of multivariate time series sensor data as shown in Equation (5). The sensors record p features. In this research, p = 31, since weather station data (Table 3) are used. Each sensor data matrix w in the set W records time series data for p features over the time frame t = 0 to t = t_max. Here, t = 0 is the oldest time recorded among all the sensors in S, while t_max is the latest time recorded among all the sensors in S. For forest fires, t increments in days. The value of t belongs to the set of natural numbers (ℕ). If there is no reading at a certain value of t, the missing values are padded with 0 to keep a consistent input feature size.
W = { w ∈ ℝ^(p×(t_max+1)) | w is multivariate time series data }

w = ⎡ w_{1,t=0}  w_{1,t=1}  …  w_{1,t=t_max} ⎤
    ⎢ w_{2,t=0}  w_{2,t=1}  …  w_{2,t=t_max} ⎥
    ⎢     ⋮          ⋮      ⋱        ⋮      ⎥
    ⎣ w_{p,t=0}  w_{p,t=1}  …  w_{p,t=t_max} ⎦
Let g(·) be a mapping from set S to W such that it returns the sensor data w for a given sensor s, as shown in Equation (6).
g(·) : S → W
K is the number of nearest sensors to consider for a single forest fire f. Therefore, K belongs to ℕ (K ∈ ℕ) and is significantly smaller than the size of the set S (K ≪ |S|). A set of K sensors S_k for a given forest fire f can then be defined as seen in Equation (7), where ‖s_l − f_l‖ is the Euclidean distance between the location of sensor s and the center f_l of the burned area f_g of a forest fire f. It is also necessary that the sensor(s) selected be operational at time f_s.
S_k = { s ∈ S | |S_k| = K, s minimizes ‖s_l − f_l‖ }
Therefore, the list of multivariate time series sensor data W_k acquired for the sensors in S_k can be defined as seen in Equation (8), and w̄_k, the average of the values in W_k, as seen in Equation (9), where w_i is the i-th sensor data matrix in W_k.
W_k = [ g(s) | s ∈ S_k ]

w̄_k = (1/K) Σ_{i=1}^{K} w_i
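Equations (7)–(9) amount to a nearest-neighbor lookup followed by an element-wise average; a minimal sketch, assuming sensor locations and data are stored as NumPy arrays (all names are hypothetical):

```python
import numpy as np

def aggregate_k_nearest(fire_center, sensor_locs, sensor_data, k):
    """Illustrative Equations (7)-(9): pick the k sensors closest to the fire
    center by Euclidean distance and average their p x (t_max+1) data matrices.
    sensor_locs has shape (n_sensors, 2); sensor_data has shape (n_sensors, p, T)."""
    dists = np.linalg.norm(sensor_locs - fire_center, axis=1)  # ||s_l - f_l||
    nearest = np.argsort(dists)[:k]                            # indices of S_k
    return sensor_data[nearest].mean(axis=0)                   # w_bar_k
```

In the full framework, an operational-at-time-f_s filter would also be applied before selecting the k nearest sensors.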
In STAS, N specifies the number of past days incorporated in a single event or non-event. It is measured in discrete units; hence, N ∈ ℕ. In the case of forest fires, it is measured in days. M is the time difference for a non-event to be considered; similarly, M ∈ ℕ. For forest fires, it is measured in months.
In this research, a single event (data point) ê is defined as multivariate time series data extracted from w̄_k, starting at the occurrence of a natural disaster and going N historical records into the past in reverse order, i.e., a fire event ê ∈ ℝ^(p×(N+1)) is the time series extracted from w̄_k for a forest fire f over the time frame [f_s, f_s − N], as shown in Equation (10). There are N + 1 days, since f_s is included as well.
ê = ⎡ w̄_{1,t=f_s}  w̄_{1,t=f_s−1}  …  w̄_{1,t=f_s−N} ⎤
    ⎢ w̄_{2,t=f_s}  w̄_{2,t=f_s−1}  …  w̄_{2,t=f_s−N} ⎥
    ⎢      ⋮             ⋮         ⋱        ⋮       ⎥
    ⎣ w̄_{p,t=f_s}  w̄_{p,t=f_s−1}  …  w̄_{p,t=f_s−N} ⎦
Similarly, a single non-event (data point) ě is defined as multivariate time series data extracted from w̄_k, starting M months prior to the occurrence of a natural disaster and going N historical records into the past in reverse order, i.e., a non-fire event ě ∈ ℝ^(p×(N+1)) is the time series extracted from w̄_k for a forest fire f over the time frame [f_s − M, f_s − M − N], as seen in Equation (11).
ě = ⎡ w̄_{1,t=f_s−M}  w̄_{1,t=f_s−M−1}  …  w̄_{1,t=f_s−M−N} ⎤
    ⎢ w̄_{2,t=f_s−M}  w̄_{2,t=f_s−M−1}  …  w̄_{2,t=f_s−M−N} ⎥
    ⎢       ⋮               ⋮           ⋱         ⋮        ⎥
    ⎣ w̄_{p,t=f_s−M}  w̄_{p,t=f_s−M−1}  …  w̄_{p,t=f_s−M−N} ⎦
This makes the data agnostic to spatial resolution. The dataset thus produced, for a single combination of values of K, N, and M, will have a 1:1 ratio of fire to non-fire events, each equal in number to the forest fires in F. The notation used is described in Table 4.
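Under these definitions, extracting ê and ě reduces to slicing two equal-width windows from w̄_k; a minimal sketch, assuming columns are indexed by day and M has already been converted to days (all names are hypothetical):

```python
import numpy as np

def extract_windows(w_bar_k, fs, n, m_days):
    """Illustrative Equations (10)-(11): slice the averaged sensor matrix into a
    fire event (days fs-n .. fs) and a non-fire event of the same width taken
    m_days earlier (the paper counts M in months; here it is already in days)."""
    event = w_bar_k[:, fs - n : fs + 1]                        # N+1 columns ending at f_s
    non_event = w_bar_k[:, fs - m_days - n : fs - m_days + 1]  # same width, M earlier
    return event, non_event
```

Both windows have p rows and N + 1 columns, so a model sees identically shaped inputs for the two classes and learns only the change in features, not the calendar position.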

4.2. Framework

The framework for forest fire prediction using STAS is shown in Figure 3. A detailed description of the framework is presented in [8]. Table 4 describes the variables used in Algorithms 1–4. Algorithm 1 and the parameter K are only needed for sensor data processing; the rest are needed for both sensor and satellite data. This is shown with different shades of green in Figure 3. The values of K, S, W, and F are provided as input to Algorithm 1. In Algorithm 1, the K-nearest weather stations from the set S are selected for every fire f in F. The output D contains the average of K multivariate time series weather data matrices for each forest fire.
The value of N is selected and provided as input to Algorithm 2 along with D from Algorithm 1 and the set F. Algorithm 2 extracts fire events by limiting each value in D to the time frame [f_s, f_s − N], looping over all values in D and F as a combined set (similar to the zip function in Python). For Algorithm 3, the value of M is selected and provided as input along with N, F, and D from Algorithm 1. Algorithm 3 extracts non-fire events by limiting each value in D to the time frame [f_s − M, f_s − M − N], looping over all values in D and F as a combined set. For example, consider M = 3 and N = 30 for a forest fire event f starting at f_s = 14 August 2023. Then, the multivariate time series for the fire event ê is extracted from 14 August 2023 (t_start) going backward until 15 July 2023 (t_end). The multivariate time series data for the non-fire event ě are extracted from 14 May 2023 (t_start) going backward until 14 April 2023 (t_end).
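The date arithmetic in this example can be checked with a short script; the `months_back` helper is hypothetical and assumes the day of month exists in the target month, as it does here:

```python
from datetime import date, timedelta

def months_back(d, m):
    """Shift a date back by m calendar months (illustrative helper)."""
    y, mo = divmod((d.year * 12 + d.month - 1) - m, 12)
    return date(y, mo + 1, d.day)

fs = date(2023, 8, 14)   # f_s: start date of the fire event
N, M = 30, 3
fire_window = (fs, fs - timedelta(days=N))               # [f_s, f_s - N]
non_fire_start = months_back(fs, M)                      # f_s - M (months)
non_fire_window = (non_fire_start, non_fire_start - timedelta(days=N))
```

Running this reproduces the windows in the example: the fire window spans 14 August 2023 back to 15 July 2023, and the non-fire window spans 14 May 2023 back to 14 April 2023.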
Algorithm 1 K-Nearest Sensor Data Aggregation
All the symbols and notations are described in Table 4.
    Input: K, S, W, F
    Output: D
 1: procedure getKNearestSensorData(K, S, W, F)
 2:     D ← ∅
 3:     for f ∈ F do
 4:         f_l ← center location of f_g in f
 5:         S_k ← K-nearest sensors to f_l        ▹ Using Equation (7)
 6:         W_k ← ∅
 7:         for s ∈ S_k do
 8:             w ← g(s)                          ▹ As seen in Equation (6)
 9:             W_k ← add w to W_k
10:         end for
11:         w̄_k ← average of W_k                  ▹ Using Equation (9)
12:         D ← add w̄_k to D
13:     end for
14: end procedure
Algorithm 2 Event Points Extraction
All the symbols and notations are described in Table 4.
    Input: N, F, D
    Output: Ê
 1: procedure getEventPoints(N, F, D)
 2:     Ê ← ∅
 3:     for (w̄_k, f) ∈ (D, F) do                  ▹ Zip of sets (D, F)
 4:         f_s ← start date of natural fire disaster f
 5:         t_start ← f_s
 6:         t_end ← f_s − N
 7:         ê ← w̄_k between [t_start, t_end)
 8:         Ê ← add ê to Ê
 9:     end for
10: end procedure
The outputs of Algorithm 2 (Ê) and Algorithm 3 (Ě) are provided as inputs to Algorithm 4. The target values are appended based on the prediction being made: for binary classification, Ê gets 1 and Ě gets 0, while for regression, Ê gets the area burnt (severity) by the forest fire and Ě gets 0. Any spatial or temporal information, such as date, time, latitude, and longitude, is removed to make the data spatio-temporally agnostic. The data are standardized using the mean and standard deviation. Finally, the data are randomly sampled and split into D_train and D_test, used for training and testing the deep learning models, respectively. D_train gets 80% of the fire and non-fire events, while D_test gets 20%. D_train is used to train the models described in Section 4.4; once a model is trained, D_test is used to evaluate it.
Algorithm 3 Non-Event Point Extraction
All the symbols and notations are described in Table 4.
    Input: N, M, F, D
    Output: Ě
 1: procedure getNonEventPoints(N, M, F, D)
 2:     Ě ← ∅
 3:     for (w̄_k, f) ∈ (D, F) do                  ▹ Zip of sets (D, F)
 4:         f_s ← start date of natural fire disaster f
 5:         t_start ← f_s − M
 6:         t_end ← f_s − M − N
 7:         if t_start ≠ f_s ∀ f ∈ F then         ▹ window must not start at any fire's start date
 8:             ě ← w̄_k between [t_start, t_end)
 9:             Ě ← add ě to Ě
10:         end if
11:     end for
12: end procedure
Algorithm 4 Spatio-Temporal Agnostic Sampling
All the symbols and notations are described in Table 4.
    Input: Ê, Ě
    Output: D_train, D_test
 1: procedure getDatasets(Ê, Ě)
 2:     D ← ∅
 3:     Ê ← add target values to Ê
 4:     D ← add Ê to D
 5:     Ě ← add target values to Ě
 6:     D ← add Ě to D
 7:     D ← delete spatio-temporal data from D
 8:     D ← standardize D
 9:     D_train, D_test ← partition D to get train and test datasets
10: end procedure
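The labeling, standardization, and 80/20 split performed by Algorithm 4 can be sketched as follows; this is an illustrative binary classification version with hypothetical names (the actual framework also supports regression targets, where the 1s are replaced by the burnt area):

```python
import numpy as np

def make_datasets(events, non_events, rng=None, train_frac=0.8):
    """Illustrative Algorithm 4: label fire events 1 and non-fire events 0,
    standardize each feature, then randomly split into train (80%) and test (20%).
    events and non_events hold flattened, spatio-temporal-free samples."""
    X = np.concatenate([events, non_events])            # stacked samples
    y = np.concatenate([np.ones(len(events)), np.zeros(len(non_events))])
    X = (X - X.mean(axis=0)) / X.std(axis=0)            # standardize per feature
    idx = np.random.default_rng(rng).permutation(len(X))
    cut = int(train_frac * len(X))                      # D_train gets 80%
    return (X[idx[:cut]], y[idx[:cut]]), (X[idx[cut:]], y[idx[cut:]])
```

Because Ê and Ě are equal in size by construction, the resulting train and test sets are approximately balanced without any further resampling.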

4.3. Time Complexity Analysis

Consider a dataset A with points belonging either to the majority class A_maj or the minority class A_min. Each value a belonging to A is p-dimensional:

A = { a | a ∈ ℝ^p, a ∈ A_maj or a ∈ A_min }

By definition, the size of A_maj is significantly larger than that of A_min. The size of the set of natural disasters (forest fires) F can be considered approximately equal to that of the minority class, as seen in Equation (13).

|A_min| ≪ |A_maj|,    |F| ≈ |A_min| ≪ |A_maj|
Majority and minority class data are taken from the values recorded by the sensors in the set S. Each sensor records over a long time frame; therefore, the size of S is significantly lower than the size of A_maj, as seen in Equation (14).

|S| ≪ |A_maj|
It is hard to compare the sizes of the sets F and S, since they depend heavily on the time frame being considered and may vary with the natural disaster under study. It is known that the data used to make the predictions are multivariate (p ≥ 2). The size of the set F may be smaller than the size of S, but we can assume that p × |F| is larger than the size of S. If p is significantly larger than 2, then we can assume Equation (15).

p ≥ 2 ⟹ p × |F| > |S|,    p² × |F| > |S| if p ≫ 2
The Big-O notation O(·) is used for the worst-case time complexity. To compute the distance between two sets of p-dimensional points, we need the difference in each dimension for each pair of points from the two sets A_1 and A_2; this is represented by Equation (16). For sorting, the best worst-case time complexity is that of Heap Sort, as seen in Equation (17). In Equations (16) and (17), |·| denotes the size of a set.

T_dist = O(p × |A_1| × |A_2|)

T_sort = O(|A_3| × log |A_3|)
It should be noted that the distance and sorting time complexities calculated in Sections 4.3.1–4.3.3 use Equations (16) and (17). The notations T_dist and T_sort are used in all subsections, but the values derived are specific to each section.
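As an illustration of Equations (16) and (17), a brute-force pairwise distance computation costs O(p × |A_1| × |A_2|) and a heap-based sort costs O(|A_3| log |A_3|); a small sketch, using squared Euclidean distance for simplicity:

```python
import heapq

def pairwise_distances(A1, A2):
    """O(p * |A1| * |A2|), Eq. (16): p-dimensional squared distances
    between every pair of points drawn from the two sets."""
    return [sum((x - y) ** 2 for x, y in zip(a, b)) for a in A1 for b in A2]

def sort_distances(A3):
    """O(|A3| log |A3|), Eq. (17): heap sort of the computed distances."""
    heapq.heapify(A3)                                    # O(|A3|)
    return [heapq.heappop(A3) for _ in range(len(A3))]   # |A3| pops, O(log|A3|) each
```

Sorting the output of `pairwise_distances` is exactly the pattern whose cost is counted in the subsections that follow.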

4.3.1. STAS

For Algorithm 1, we calculate the distance using Equation (7) and then sort the computed distances to find the shortest. We then compute both T_dist and T_sort for STAS using Equations (16) and (17), respectively. The distance is compared between the sets F and S in two dimensions; hence, the distance time complexity for Algorithm 1 is given in Equation (18).

T_dist = O(p × |A_1| × |A_2|) = O(2 × |S| × |F|)
For sorting, ordering is applied |F| times to the values in S to obtain the top K values, as seen in Equation (19). Here, |A_3| for the sorting complexity in Equation (17) is |S| × |F|.

|A_3| = |S| × |F| ⟹ T_sort = O(|A_3| × log |A_3|) = O(|S| × |F| × log(|S| × |F|))
Algorithms 2 and 3 loop over the event data points; therefore, the complexity can be considered linear, as seen in Equation (20). The time complexity of Algorithm 4 can be ignored, as it runs in constant time.

T_extr = O(|F|)
In STAS, if sensor data are considered, then we add the time complexities of Equations (18)–(20), with Equation (20) counted twice since it applies to both Algorithms 2 and 3. If satellite data are considered, only Equation (20) applies.

O(STAS_sensor) = T_dist + T_sort + (2 × T_extr)    (21)

O(STAS_satellite) = 2 × T_extr    (22)

4.3.2. NearMiss

A description of the NearMiss algorithm is given in Section 2.2. The distance is compared in p dimensions, and the data are sorted to obtain either the farthest or nearest values. Further, the distance between the majority and minority classes is compared; hence, the time complexity for the distance comparison is given in Equation (23).

T_dist = O(p × |A_1| × |A_2|) = O(p × |A_min| × |A_maj|) = O(p × |F| × |A_maj|)    (using Equation (13))
For sorting, we order the distance comparisons between the majority and minority classes; hence, the set size |A_3| in Equation (17) is |A_min| × |A_maj|. The time complexity for sorting is therefore presented in Equation (24).

T_sort = O(|A_3| × log |A_3|) = O(|A_min| × |A_maj| × log(|A_min| × |A_maj|)) = O(|F| × |A_maj| × log(|F| × |A_maj|))
The time complexity of NearMiss is then the sum of Equations (23) and (24), as shown in Equation (25).

O(NearMiss) = T_dist + T_sort

4.3.3. SMOTE

SMOTE is described in Section 2.2. First, a distance comparison is made among the data in the minority class, whose time complexity is shown in Equation (26).

T_dist = O(p × |A_1| × |A_2|) = O(p × |F| × |F|) = O(p × |F|²)
For sorting, we order the distance comparisons within the minority class; hence, the set size |A_3| in Equation (17) is |A_min| × |A_min|. The time complexity for sorting is given in Equation (27).

T_sort = O(|A_3| × log |A_3|) = O(|A_min|² × log |A_min|²) = O(2 × |A_min|² × log |A_min|) = O(2 × |F|² × log |F|)    (using Equation (13))
Finally, interpolation is performed. The time complexity of interpolating points can be considered linear. The interpolation is performed x times for every minority class sample, where x is large to account for the difference in size between the majority and minority classes. The interpolation time complexity is given in Equation (28).

T_interp = O(x × |A_min|) = O(x × |F|)
Therefore, the time complexity of SMOTE is the sum of Equations (26)–(28).

O(SMOTE) = T_dist + T_sort + T_interp
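To make the interpolation step behind Equation (28) concrete, a toy SMOTE-style oversampler is sketched below. It is a deliberate simplification (full SMOTE samples among k nearest neighbours; this picks only the single nearest minority neighbour), but it shows the x × |A_min| synthetic-point cost:

```python
import random

def smote_like_oversample(minority, x, seed=0):
    """Toy SMOTE-style interpolation, O(x * |A_min|) synthetic points:
    for each minority point, create x points on the segment toward its
    nearest minority neighbour. Not the full SMOTE algorithm."""
    rng = random.Random(seed)
    synthetic = []
    for a in minority:
        # nearest neighbour among the other minority points
        nn = min((b for b in minority if b is not a),
                 key=lambda b: sum((u - v) ** 2 for u, v in zip(a, b)))
        for _ in range(x):
            g = rng.random()  # random position along the segment [a, nn]
            synthetic.append(tuple(u + g * (v - u) for u, v in zip(a, nn)))
    return synthetic
```

The synthetic points are convex combinations of real minority samples, which is the source of the bias discussed in Section 6.2.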

4.3.4. Comparison

When comparing the STAS framework with NearMiss, the time complexity of STAS is lower with either sensor or satellite data, since |A_maj| appears in the NearMiss Equations (23) and (24). From Equations (13) and (14), we know that |A_maj| is significantly larger than either |S| or |F|. Therefore,

O(STAS_sensor) ≪ O(NearMiss)    (by Equation (14))
O(STAS_satellite) ≪ O(NearMiss)    (by Equation (14))
When comparing the STAS framework with SMOTE, the time complexity of STAS with satellite data is lower, since it is computed in linear time.

O(STAS_satellite) ≪ O(SMOTE)
If the sensor data are very high-dimensional, then by Equation (15), the STAS framework is faster than SMOTE; note also that x in Equation (28) is a large value. Otherwise, the time complexities are similar.

O(STAS_sensor) < O(SMOTE)    using Equation (15), if p ≫ 2
O(STAS_sensor) ≈ O(SMOTE)    otherwise
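The orderings above can be checked numerically by plugging illustrative set sizes into the complexity formulas; the sizes below are hypothetical, and constants and lower-order terms are kept only loosely:

```python
from math import log2

def stas_sensor_ops(S, F):
    """Rough operation count for Equation (21)."""
    t_dist = 2 * S * F                    # Eq. (18)
    t_sort = S * F * log2(S * F)          # Eq. (19)
    t_extr = F                            # Eq. (20)
    return t_dist + t_sort + 2 * t_extr   # Eq. (21)

def nearmiss_ops(p, F, A_maj):
    """Rough operation count for Equation (25)."""
    t_dist = p * F * A_maj                # Eq. (23)
    t_sort = F * A_maj * log2(F * A_maj)  # Eq. (24)
    return t_dist + t_sort                # Eq. (25)

# |A_maj| >> |S|, |F| (Equations (13) and (14)), so NearMiss is far costlier:
assert stas_sensor_ops(S=500, F=1_000) < nearmiss_ops(p=31, F=1_000, A_maj=10**8)
```

With these example sizes the NearMiss count exceeds the STAS count by several orders of magnitude, matching the asymptotic argument.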

4.4. Modeling

In this research, a binary classification model is used to classify data into fire or non-fire events. In our previous research [8], the hypothesis was that deep learning models trained on STAS data would learn the change in features likely to cause a fire over the time frame M. The model architecture is depicted in Figure 4a. The binary classification model uses a ReLU activation for its input and hidden layers and a Sigmoid activation for the output layer. Target values are set to 1 for fire events (ê) and 0 for non-fire events (ě), as discussed in Section 4.2. Binary Cross-Entropy (BCE) is used as the loss function along with the Adam optimizer; further details are presented in [8]. Learning rates ranging from 0.01 to 0.0000001 were tested while developing the models, and a learning rate of 0.00001 showed the best results. The models converged quickly (<100 epochs). F1 is used to evaluate the models.
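A minimal sketch of the described forward pass and loss, with ReLU hidden layers, a Sigmoid output, and Binary Cross-Entropy, written in plain Python rather than the deep learning framework used in the paper; layer shapes and weights here are illustrative:

```python
from math import exp, log

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def dense(v, W, b):
    """One fully connected layer; W has one row per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def forward(v, hidden, out_w, out_b):
    """ReLU on every hidden layer, Sigmoid on the single output unit."""
    for W, b in hidden:
        v = relu(dense(v, W, b))
    return sigmoid(dense(v, out_w, out_b)[0])

def bce(y, p, eps=1e-7):
    """Binary Cross-Entropy for a single prediction, clamped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * log(p) + (1 - y) * log(1 - p))
```

In practice the Adam update would drive `bce` toward zero; only the forward computation is shown here.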
Figure 4b illustrates the regression model architecture, used to predict the severity of a fire event. The regression model uses a LeakyReLU activation with a negative slope of 0.01 for all its layers. The target value is set to the area burned for fire events (ê) and 0 for non-fire events (ě), as discussed in Section 4.2. Mean Square Error (MSE) is used as the loss function along with the Stochastic Gradient Descent (SGD) optimizer; further details are presented in [8]. Different learning rates and epoch counts were tested, and a learning rate of 0.0001 was found to be ideal for all the models. The models converge very slowly (>5000 epochs). R² is used, since it tests for goodness of fit.
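R², the goodness-of-fit metric used for the regression models, can be computed from true and predicted severities with the standard definition (this is the textbook formula, not code from the paper):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect fit gives 1.0, predicting the mean gives 0.0, and the paper's thresholds of 0.75 and 0.5 partition the 216 regression experiments.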

5. Experiments and Results

This section elaborates on the experimentation and results. Each model had 31 × (N + 1) input features, since N + 1 days were provided as input for each of the 31 features (Table 3). Table 5 compares the models' metrics for different combinations of N, K, and M, with binary classification in yellow and regression in red. The first two columns give the values of K and M, respectively, while the remaining six columns cover the various values of N; the column for each value of N is divided into two sub-columns for F1 and R², respectively. The impact of M, K, and N on F1 is shown in Figure 5, and their impact on R² in Figure 6, using boxplots; the median of each box plot is shown for easy comparison. A total of 432 experiments are recorded in Table 5: 216 for binary classification models evaluated with F1, and 216 for regression models evaluated with R².
For the binary classification experiments, a prediction greater than the threshold of 0.5 was considered a forest fire. Table 5 shows that F1 is excellent (>0.9) for 83.4% of the experiments, good (between 0.8 and 0.9) for 13%, and poor (between 0.5 and 0.8) for 3.6%. This shows that forest fire prediction depends on weather features, similar to the findings in [8,9,10]. From Figure 5a, the median F1 is excellent (>0.9) for all values of M except M = 12, where it is 0.802. In Figure 5b, increasing the value of K yields a negligible increase in the median F1. Across the values of N in Figure 5c, the median F1 is >0.90 and increases with N until N = 30, then remains constant, implying that 30 days of historical information is sufficient.
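F1 at the 0.5 decision threshold can be computed from predicted fire probabilities as follows (the standard definition, shown for clarity rather than taken from the paper's code):

```python
def f1_at_threshold(y_true, probs, threshold=0.5):
    """Threshold predicted probabilities, then compute F1 = 2PR / (P + R)."""
    preds = [1 if p > threshold else 0 for p in probs]
    tp = sum(1 for t, q in zip(y_true, preds) if t == 1 and q == 1)
    fp = sum(1 for t, q in zip(y_true, preds) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, preds) if t == 1 and q == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

The future-work section proposes tuning this threshold rather than fixing it at 0.5.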
For the regression experiments, Table 5 shows that 69.4% of the experiments explain a significant amount of variance in the data (R² > 0.75), 19% explain a good amount (between 0.5 and 0.75), and 11.6% explain little to no variance (R² < 0.5). When comparing the impact of M on predicting the area burned in Figure 6a, the models for M = 12 explain little to no variance, as expected, while the models for the remaining values of M explain a significant or good amount. As in binary classification, K has no impact on model performance, showing negligible change in Figure 6b. Figure 6c shows that the median R² gradually increases with N; unlike the binary classification models, the values of N do not show an optimal upper limit for the regression models.
Apart from the aforementioned experiments in Table 5, additional experiments were conducted with a temporal split of D_train and D_test instead of a random split. This ensures that fire event and non-fire event data from before the specified cutoff date are not present in D_test. The cutoff date was set to January 1, 2016, i.e., D_train has fire event and non-fire event data from 1998 to 2015 (18 years), while D_test has fire event and non-fire event data from 2016 to 2018 (3 years). There was no noticeable change in the performance metrics (R² and F1).
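A temporal split at the cutoff date, in contrast to the random split of Algorithm 4, can be sketched as below; the `(event_date, features)` row layout is a hypothetical convenience:

```python
from datetime import date

def temporal_split(rows, cutoff=date(2016, 1, 1)):
    """Put every row dated strictly before the cutoff in D_train and
    every row from the cutoff onward in D_test."""
    d_train = [r for r in rows if r[0] < cutoff]
    d_test = [r for r in rows if r[0] >= cutoff]
    return d_train, d_test
```

Note that the date is used only to route rows; it is still removed from the features themselves, keeping the data spatio-temporally agnostic.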

6. Discussion

The STAS framework was found to be effective in the study of forest fire prediction. If data are collected from sensors other than those in the weather stations, the sets W and S can be replaced with the respective data; this was demonstrated in [9], where weather station data were replaced by lightning and hydrometric data. When the data are gathered from a single remote-sensing source, such as a satellite, Algorithm 1 (K-Nearest Sensor Data Aggregation) is not needed, and the data can instead be passed directly to Algorithms 2 and 3 (Event Point Extraction and Non-Event Point Extraction). There is no restriction on the number of features a data point can have in the STAS framework; hence, univariate data can also be used.
The units for N and M depend on the forest fire disaster. The unit of N should reflect the time scale over which a forest fire develops, and M should be one unit coarser. For example, the time frame over which a forest burns is given in days; hence, the unit for N is days, while that for M is months. M and N are the main parameters of STAS. Multiple models trained on STAS datasets for varying values of M can be ensembled to make predictions, or single models can be used to make predictions based on the temporal change in data points.
This research agrees with the findings of previous research specifying the use of weather data to predict forest fires [37,38,39,40,41,42,43]. It goes further by providing a framework that is repeatable with less computation time over a larger area (i.e., all of Canada), over large time frames, and in real time. In addition, it presents STAS as a faster multivariate sampling technique for evolving data in the domain of natural forest fire disasters. By transforming the data to identify the change in features over a time frame leading up to an event, STAS changes the framing of the problem.
Since non-events are sampled at the same location as events, but in a past time frame, the dataset is spatially agnostic. The only variation between the fire and non-fire data points is the temporal change in the data. Models trained on such a dataset learn the variations in features distinguishing events separated by a temporal difference of M. From [9], it is known that STAS does not make predictions based on seasonal differences. When STAS is provided a data source with no predictive power, models trained on the resulting dataset perform poorly.

6.1. STAS Parameters

Sensor data are associated with the physical location of the sensors. K specifies how many spatially distant sensors around an event affect the predictions. It was found to have a negligible impact on both the classification and severity prediction of forest fires, likely because the occurrence of forest fires depends on local conditions rather than on neighboring conditions. For this reason, one should use data available as close to a natural disaster as possible.
Natural disasters, such as forest fires, do not occur overnight; they gradually build up over time based on climatic and geological factors. The parameter N therefore specifies how many historical days of information are needed in a single data point to predict a natural fire disaster. Incorporating a larger N in the model provides better results, but beyond a certain value of N it is no longer viable, and the STAS framework can be used to determine this threshold. Since the regression model predicts severity, its prediction has no upper bound; this requires more historical data to make accurate predictions.
M specifies the time difference to consider for the change in features. In seasonal data, a recurring trend occurs, so for some values of M the models will easily distinguish between the data. In the case of forest fires, where the data have associated seasonality, the models easily distinguish between data for M = 3, 4, 5, 6, 7, 8, 9, 10, as there is a sufficient difference in values over those time frames. Metrics for M = 1, 2, 11, 12 can be used to test whether the dataset is a good predictor of the natural phenomenon, because there is little or no variation between event and non-event data; this is especially true for M = 12, where fire and non-fire events are from the same month and thus have similar weather patterns. This technique was applied in [9], where hydrometric data were found to be of no use, while lightning was found to be the best predictor of a forest fire. Similarly, using this technique in [10], it was found that weather data could be used for predicting both the occurrence and severity of a forest fire, while lightning data only work for predicting the occurrence. In this research, STAS produced good results for M = 12 when predicting the occurrence of forest fires. The value of N can be increased to yield high-performing models for M = 12, as they need more historical information.

6.2. Sampling

In the study of natural forest fire disasters, random sampling is not suitable, as different researchers will obtain different datasets based on the random seed. This causes challenges when other researchers attempt to replicate the work; further, one random instance may yield a dataset with good results while another may not, as was seen in [12,13]. One drawback of the NearMiss approach is that the under-sampled points in the majority class change whenever points are added to or removed from either class, so the algorithm must be re-run on the entire dataset. This is ineffective when data keep growing, as in the study of natural disasters, causing challenges for re-computation and repeatable work. SMOTE faces challenges similar to NearMiss; further, its artificially added samples add bias to the models.
STAS is suitable for under-sampling multivariate time series evolving data for natural fire disasters. In STAS, re-computation is only required for new data, unlike traditional algorithms. This reduces the computation time and resources. Furthermore, the data have a 1:1 ratio for event and non-event data, thereby producing a balanced dataset.

6.3. Metrics and Validation

F1 is considered an unbiased estimator for randomly sampled data. The data are not randomly sampled in STAS, but because the problem is reframed as identifying the change in features over a time frame for an event to occur, F1 can still be considered an unbiased estimator. Figure 7 shows how many non-fire events (ě) are taken from any given month for a given value of the parameter M, with the legend for the months given at the bottom. As the value of M is varied, different months dominate the sample for ě. Hence, it is highly encouraged to use a Mixture of Experts (MoE) approach with the same values of K and N but with M ranging from 1 to 12, one model for each month; this trains the models on segments of temporally scattered data.
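The suggested MoE ensemble might be sketched as follows, with one expert per value of M; the choice to average the 12 probabilities is an assumption, since the paper does not fix a combination rule, and all names here are hypothetical:

```python
def moe_predict(features_by_M, experts):
    """Sketch of the suggested MoE: one expert model trained per value of
    M in 1..12, with the 12 fire probabilities averaged.
    experts: dict mapping M -> callable returning a probability.
    features_by_M: dict mapping M -> that expert's input features."""
    probs = [experts[m](features_by_M[m]) for m in range(1, 13)]
    return sum(probs) / len(probs)
```

A gating network weighting the experts, or selecting the expert matching the current month, would be natural alternatives to plain averaging.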
Possible threats to validity include the selection of different model parameters and improper data preprocessing. The hidden layers, learning rates, and epochs play a critical role in model training, so it is recommended that models be trained with the specified parameters. Acquiring data different from those specified in this paper may also affect validation of the results presented.

6.4. Limitations

This research demonstrated the applicability and procedure of STAS for multivariate seasonal time series data. It was also demonstrated in [9] that if features completely independent of forest fires are provided, i.e., hydrometric instead of lightning or weather, the models do not have good predictive power. One potential limitation of STAS is that, given a dataset D of p features, it does not identify which of the p features are important predictors; it can only demonstrate whether D has predictive power in the p features for event ê. To identify important features, traditional feature importance techniques can be applied to D first, and the important features can then be passed as a smaller dataset to STAS.

7. Conclusions

In this research, the STAS framework was successfully applied to weather datasets with 31 features. This paper elaborated on the initial work presented in [8] by providing a thorough description and evaluation of the STAS framework while describing its applicability to natural disasters, such as forest fires. The K-Nearest Sensor Data Aggregation and Spatio-Temporal Agnostic Sampling algorithms were further modified in this paper for better readability and understandability. A mathematical description was provided for the extraction of event and non-event data for natural forest fire disasters. Further, a time complexity analysis with NearMiss and SMOTE showed that the STAS framework is faster.
The trained models are expected to learn the variation in features over M months and act as classifiers and severity predictors over that time frame. STAS was tested on a total of 432 deep binary classification and regression experiments, four times the number conducted in [8]; the additional experiments align with the findings from [8]. This research also found that N = 30 is the ideal value for binary classification with CWEEDS weather sensor data.
Additionally, experimentation was conducted in which D_train and D_test were split on temporal values instead of randomly, as shown in Algorithm 4. The data between 1998 and 2015 were used to generate D_train, while the data between 2016 and 2018 were used to generate D_test. No change in performance was noted for the experiments with the temporal split.
As part of future work, it is proposed that an ablation study be conducted on the features of the CWEEDS weather sensor dataset. In this research, for binary classification experimentation, a threshold of 0.5 was used to classify between fire and non-fire events. It is proposed to identify an ideal threshold for classification in future research. Further, it is proposed that models be trained on the Environment and Climate Change Canada weather station data. This allows for integration with the Environment and Climate Change Canada system to make real-time predictions. In addition, it is proposed that knowledge distillation be performed on the models to reduce their size. Finally, it is proposed to build an MoE architecture by using 12 models (one for each month) for varying values of M for a constant value of N and K .

Author Contributions

Conceptualization, A.M., C.-H.L., K.N. and M.Z.; methodology, A.M.; software, A.M.; validation, A.M., C.-H.L., K.N., M.Z., S.A.A., R.P., S.S. and T.R.; formal analysis, A.M., C.-H.L., K.N., M.Z., S.A.A., S.S. and T.R.; investigation, A.M., C.-H.L., K.N., M.Z., S.A.A., R.P., S.S. and T.R.; resources, A.M.; data curation, A.M.; writing—original draft preparation, A.M.; writing—review and editing, A.M., C.-H.L., K.N., M.Z., S.A.A., R.P., S.S. and T.R.; visualization, A.M.; supervision, C.-H.L., K.N., M.Z. and S.A.A.; project administration, C.-H.L., K.N. and M.Z.; funding acquisition, C.-H.L., K.N., M.Z. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly funded by an Alliance Missions grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada grant number ALLRP 570503-2021.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study were derived from the following resources available in the public domain: CNFDB https://cwfis.cfs.nrcan.gc.ca/datamart/download/nfdbpoly [47] accessed on 5 May 2023 and CWEEDS https://collaboration.cmc.ec.gc.ca/cmc/climate/Engineer_Climate/CWEEDS_FMCEG/ [48] accessed on 5 May 2023 provided by Environment and Climate Change Canada and Natural Resources Canada, respectively. The authors confirm that the processed data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

This article is a revised and expanded version of a paper entitled Spatio-Temporal Agnostic Deep Learning Modeling of Forest Fire Prediction Using Weather Data, which was presented at IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), Torino, Italy on 26–30 June 2023 [8]. The research produced is part of ongoing collaborative work between Carleton University, University of Waterloo, and Dalhousie University with industry partners Cistel Technology and Hegyi Geomatics International Inc. Additional support was received from Research Computing Services at Carleton University.

Conflicts of Interest

Author Marzia Zaman was employed by the company Cistel Technology. Author Thambirajah Ravichandran was employed by the company Hegyi Geomatics Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BCE: Binary Cross-Entropy
BUI: BuildUp Index
CNFDB: Canadian National Fire Database
CWEEDS: Canadian Weather Energy and Engineering Datasets
CWFIS: Canadian Wildland Fire Information System
DC: Drought Code
DMC: Duff Moisture Code
FFMC: Fine Fuel Moisture Code
FWI: Fire Weather Index
ISI: Initial Spread Index
MoE: Mixture of Experts
MSE: Mean Square Error
NFDB: National Fire Data Base
RF: Random Forest
SGD: Stochastic Gradient Descent
SMOTE: Synthetic Minority Over-sampling TEchnique
STAS: Spatio-Temporal Agnostic Sampling
SVM: Support Vector Machine

References

  1. Hefeeda, M.; Bagheri, M. Wireless Sensor Networks for Early Detection of Forest Fires. In Proceedings of the IEEE International Conference on Mobile Adhoc and Sensor Systems, Pisa, Italy, 8–11 October 2007; pp. 1–6. [Google Scholar]
  2. Al-turjman, F.M.; Hassanein, H.S.; Oteafy, S.M.A.; Alsalih, W. Towards Augmenting Federated Wireless Sensor Networks in Forestry Applications. Pers. Ubiquitous Comput. 2013, 17, 1025–1034. [Google Scholar] [CrossRef]
  3. Wu, Y.; Geng, X.; Liu, Z.; Shi, Z. Tropical Cyclone Forecast Using Multitask Deep Learning Framework. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  4. Amezquita-Sanchez, J.; Valtierra-Rodriguez, M.; Adeli, H. Current Efforts for Prediction and Assessment of Natural Disasters: Earthquakes, Tsunamis, Volcanic Eruptions, Hurricanes, Tornados, and Floods. Sci. Iran. 2017, 24, 2645–2664. [Google Scholar] [CrossRef]
  5. Consoli, S.; Reforgiato Recupero, D.; Zavarella, V. A Survey on Tidal Analysis and Forecasting Methods for Tsunami Detection. Sci. Tsunami Hazards 2014, 33, 1–56. [Google Scholar]
  6. Pickell, P.D.; Chavardès, R.D.; Li, S.; Daniels, L.D. FuelNet: An Artificial Neural Network for Learning and Updating Fuel Types for Fire Research. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7338–7352. [Google Scholar] [CrossRef]
  7. Jaiswal, R.K.; Mukherjee, S.; Raju, K.D.; Saxena, R. Forest Fire Risk Zone Mapping from Satellite Imagery and GIS. Int’l J. Appl. Earth Obs. Geoinf. 2002, 4, 1–10. [Google Scholar] [CrossRef]
  8. Mutakabbir, A.; Lung, C.H.; Ajila, S.A.; Zaman, M.; Naik, K.; Purcell, R.; Sampalli, S. Spatio-Temporal Agnostic Deep Learning Modeling of Forest Fire Prediction Using Weather Data. In Proceedings of the IEEE 47th Computers, Software, and Applications Conference, Torino, Italy, 26–30 June 2023; pp. 346–351. [Google Scholar] [CrossRef]
  9. Mutakabbir, A.; Lung, C.H.; Ajila, S.A.; Zaman, M.; Naik, K.; Purcell, R.; Sampalli, S. Forest Fire Prediction Using Multi-Source Deep Learning. In Big Data Technologies and Applications; Springer Nature: Cham, Switzerland, 2023; pp. 135–146. [Google Scholar] [CrossRef]
  10. Mutakabbir, A.; Lung, C.H.; Ajila, S.A.; Naik, K.; Zaman, M.; Purcell, R.; Sampalli, S.; Ravichandran, T. A Federated Learning Framework based on Spatio-Temporal Agnostic Subsampling (STAS) for Forest Fire Prediction. In Proceedings of the IEEE 48th Computers, Software, and Applications Conference, Osaka, Japan, 2–4 July 2024; pp. 350–359. [Google Scholar] [CrossRef]
  11. Purcell, R.; Naik, K.; Lung, C.H.; Zaman, M.; Sampalli, S.; Mutakabbir, A. A Framework Using Federated Learning for IoT-Based Forest Fire Prediction. In Proceedings of the International Conference on Internet of Things and Intelligence Systems, Bali, Indonesia, 28–30 November 2023; pp. 133–139. [Google Scholar] [CrossRef]
  12. Tavakoli, F.; Naik, K.; Zaman, M.; Purcell, R.; Sampalli, S.; Mutakabbir, A.; Lung, C.H.; Ravichandran, T. Big Data Synthesis and Class Imbalance Rectification for Enhanced Forest Fire Classification Modeling. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence, Rome, Italy, 24–26 February 2024; Volume 2, pp. 264–275. [Google Scholar] [CrossRef]
  13. Kaur, P.; Naik, K.; Purcell, R.; Sampalli, S.; Lung, C.H.; Zaman, M.; Mutakabbir, A. A Data Integration Framework with Multi-Source Big Data for Enhanced Forest Fire Prediction. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 344–351. [Google Scholar] [CrossRef]
  14. Purcell, R.; Naik, K.; Zaman, M.; Lung, C.H.; Sampalli, S.; Mutakabbir, A.; Dhindsa, M.S.; Das, U. IoT Sensor Deployment in the Wildland-Urban Interface: Leveraging Fire Risk Analysis. In Proceedings of the IEEE 10th World Forum on Internet of Things, Ottawa, ON, Canada, 10–13 November 2024; pp. 864–869. [Google Scholar]
  15. Johnson, E.; Miyanishi, K. Forest Fires: Behavior and Ecological Effects; Academic Press, Inc.: San Diego, CA, USA, 2001. [Google Scholar]
  16. Coogan, S.C.P.; Daniels, L.D.; Boychuk, D.; Burton, P.J.; Flannigan, M.D.; Gauthier, S.; Kafka, V.; Park, J.S.; Wotton, B.M. Fifty years of Wildland Fire Science in Canada. Can. J. For. Res. 2021, 51, 283–302. [Google Scholar] [CrossRef]
  17. Castro, A.C.M.; Nunes, A.; Sousa, A.; Lourenço, L. Mapping the Causes of Forest Fires in Portugal by Clustering Analysis. Geosciences 2020, 10, 53. [Google Scholar] [CrossRef]
  18. Moris, J.V.; Álvarez Álvarez, P.; Conedera, M.; Dorph, A.; Hessilt, T.D.; Hunt, H.G.P.; Libonati, R.; Menezes, L.S.; Müller, M.M.; Pérez-Invernón, F.J.; et al. A Global Database on Holdover Time of Lightning-Ignited Wildfires. Earth Syst. Sci. Data 2023, 15, 1151–1163. [Google Scholar] [CrossRef]
  19. Canadian Geographic. Mapping 100 years of Forest Fires in Canada; 2023. Available online: https://canadiangeographic.ca/articles/mapping-100-years-of-forest-fires-in-canada/ (accessed on 1 December 2023).
  20. Earth Observatory. Relentless Wildfires in Canada. 2023. Available online: https://earthobservatory.nasa.gov/images/151696/relentless-wildfires-in-canada (accessed on 1 December 2023).
  21. Taylor, S.W.; Alexander, M.E. Science, Technology, and Human Factors in Fire Danger Rating: The Canadian Experience. Int. J. Wildland Fire 2006, 15, 121–135. [Google Scholar] [CrossRef]
  22. Lee, J.; Choi, W.; Kim, J. A Cost-Effective CNN-LSTM-Based Solution for Predicting Faulty Remote Water Meter Reading Devices in AMI Systems. Sensors 2021, 21, 6229. [Google Scholar] [CrossRef] [PubMed]
  23. Ai, M.; Yu, J.; Zhang, H.; Wang, H. Optimal Subsampling Algorithms for Big Data Regressions. Stat. Sin. 2021, 31, 749–772. [Google Scholar] [CrossRef]
  24. Wang, H.; Yang, M.; Stufken, J. Information-Based Optimal Subdata Selection for Big Data Linear Regression. J. Am. Stat. Assoc. 2019, 114, 393–405. [Google Scholar] [CrossRef]
  25. Yao, Y.; Wang, H. A Review of Optimal Subsampling Methods for Massive Datasets. J. Data Sci. 2021, 19, 151–172. [Google Scholar] [CrossRef]
  26. Bao, L.; Juan, C.; Li, J.; Zhang, Y. Boosted Near-Miss Under-Sampling on SVM Ensembles for Concept Detection in Large-Scale Imbalanced Datasets. Neurocomputing 2016, 172, 198–206. [Google Scholar] [CrossRef]
  27. Yen, S.J.; Lee, Y.S. Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset. In Proceedings of the International Conference on Intelligent Computing, Kunming, China, 16–19 August 2006; pp. 731–740. [Google Scholar]
  28. Tanimoto, A.; Yamada, S.; Takenouchi, T.; Sugiyama, M.; Kashima, H. Improving Imbalanced Classification Using Near-Miss Instances. Expert Syst. Appl. 2022, 201, 117130. [Google Scholar] [CrossRef]
  29. Mandal, A.; Nykiel, G.; Strzyżewski, T.; Kochanski, A.K.; Wrońska, W.; Gruszczyńska, M.; Figurski, M. High-Resolution Fire Danger Forecast for Poland based on the Weather Research and Forecasting Model. Int. J. Wildland Fire 2021, 31, 149–162. [Google Scholar] [CrossRef]
  30. Silva, P.; Carmo, M.; Rio, J.; Novo, I. Changes in the Seasonality of Fire Activity and Fire Weather in Portugal: Is the Wildfire Season Really Longer? Meteorology 2023, 2, 74–86. [Google Scholar] [CrossRef]
  31. Giovanni, L.; Jahjah, M.; Fabrizio, F.; Fabrizio, B. The Development of a Fire Vulnerability Index for the Mediterranean Region. In Proceedings of the 2011 IEEE International Geoscience and Remote Sensing Symposium, Vancouver, BC, Canada, 24–29 July 2011. [Google Scholar] [CrossRef]
32. Stocks, B.J.; Alexander, M.E.; Van Wagner, C.E. The Canadian Forest Fire Danger Rating System: An Overview. For. Chron. 1989, 65, 258–265. [Google Scholar] [CrossRef]
  33. Wang, X.; Oliver, J.; Swystun, T.; Hanes, C.C.; Erni, S.; Flannigan, M.D. Critical Fire Weather Conditions during Active Fire Spread Days in Canada. Sci. Total Environ. 2023, 869, 161831. [Google Scholar] [CrossRef]
  34. Lawson, B.D.; Armitage, O.B. Weather Guide for the Canadian Forest Fire Danger Rating System; Canadian Forest Service, Northern Forestry Centre: Edmonton, AB, USA, 2008. [Google Scholar]
  35. de Groot, W.J.; Hanes, C.C.; Wang, Y. Crown Fuel Consumption in Canadian Boreal Forest Fires. Int. J. Wildland Fire 2022, 31, 255–276. [Google Scholar] [CrossRef]
  36. Zhang, P.; Qi, M. Neural Network Forecasting for Seasonal and Trend Time Series. Eur. J. Oper. Res. 2005, 160, 501–514. [Google Scholar] [CrossRef]
  37. Lin, T.Z. Suppressing Forest Fires in Global Climate Change Through Artificial Intelligence: A Case Study on British Columbia. In Proceedings of the 2022 International Conference on Big Data, Information and Computer Network (BDICN), Sanya, China, 20–22 January 2022; pp. 432–438. [Google Scholar] [CrossRef]
  38. Sakr, G.E.; Elhajj, I.H.; Mitri, G.; Wejinya, U.C. Artificial Intelligence for Forest Fire Prediction. In Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Montreal, QC, Canada, 6–9 July 2010; pp. 1311–1316. [Google Scholar] [CrossRef]
  39. Latifah, A.L.; Shabrina, A.; Wahyuni, I.N.; Sadikin, R. Evaluation of Random Forest Model for Forest Fire Prediction Based on Climatology over Borneo. In Proceedings of the IEEE International Conference on Computer, Control, Informatics and Its Applications, Tangerang, Indonesia, 23–24 October 2019; pp. 4–8. [Google Scholar]
  40. Cortez, P.; Morais, A. A Data Mining Approach to Predict Forest Fires using Meteorological Data. In Proceedings of the 13th EPIA–Portuguese Conference on Artificial Intelligence, Guimarães, Portugal, 3–7 December 2007; pp. 512–523. [Google Scholar]
  41. Zhang, G.; Wang, M.; Liu, K. Forest Fire Susceptibility Modeling Using a Convolutional Neural Network for Yunnan Province of China. Int. J. Disaster Risk Sci. 2019, 10, 386–403. [Google Scholar] [CrossRef]
  42. Singh, K.R.; Neethu, K.P.; Madhurekaa, K.; Harita, A.; Mohan, P. Parallel SVM Model for Forest Fire Prediction. Soft Comput. Lett. 2021, 3, 100014. [Google Scholar] [CrossRef]
  43. Thomas, G.; Rosalie, V.; Olivier, C.; de Maria, G.A.; Antonio, L.P. Modelling Forest Fire and Firebreak Scenarios in a Mediterranean Mountainous Catchment: Impact on Sediments. J. Environ. Manag. 2021, 289, 112497. [Google Scholar] [CrossRef]
  44. Wotton, B.M.; Martell, D.L. A Lightning Fire Occurrence Model for Ontario. Can. J. For. Res. 2005, 35, 1389–1401. [Google Scholar] [CrossRef]
  45. Green, A.A.; Dean, C.B.; Martell, D.L.; Woolford, D.G. A Methodology for Investigating Trends in Changes in the Timing of the Fire Season with Applications to Lightning-Caused Forest Fires in Alberta and Ontario, Canada. Can. J. For. Res. 2012, 43, 39–45. [Google Scholar] [CrossRef]
  46. Wierzchowski, J.; Heathcott, M.; Flannigan, M.D. Lightning and Lightning Fire, Central Cordillera, Canada. Int. J. Wildland Fire 2002, 11, 41–51. [Google Scholar] [CrossRef]
  47. Natural Resources Canada. National Fire Database Fire Polygons. Available online: https://cwfis.cfs.nrcan.gc.ca/datamart/download/nfdbpoly (accessed on 5 May 2023).
  48. Morris, R. Final Report–Updating CWEEDS Weather Files. Environment and Climate Change Canada 2016. Available online: https://collaboration.cmc.ec.gc.ca/cmc/climate/Engineer_Climate/CWEEDS_FMCEG/docs/ (accessed on 5 May 2023).
Figure 1. Structure of Canadian FWI system adapted from [34].
Figure 2. Distribution of fires across Canada in Canadian National Fire Database (CNFDB).
Figure 3. STAS framework adapted from [8].
Figure 4. Deep learning models adapted from [8].
Figure 5. Comparison of the impact of M, K, and N on the binary classification models.
Figure 6. Comparison of the impact of M, K, and N on the regression models.
Figure 7. Number of non-fire events (ě) from each month for given values of parameter M.
Table 1. CNFDB Dataset (F) Fields Used with Description.

| Feature | Description |
| --- | --- |
| REP_DATE | Date associated with the start of the forest fire (f_s) |
| OUT_DATE | Reported date the fire was extinguished |
| CALC_HA | Fire size in hectares with higher precision |
| CAUSE | Specifies the cause of the fire |
| GEOMETRY | 2-dimensional final burnt polygonal region (f_g) |
Table 2. CWEEDS Weather Station Metadata (S) Fields Used with Description.

| Metadata | Description |
| --- | --- |
| Climate ID | Weather station ID |
| Location | Weather station's latitude and longitude (s_l) |
| First Year | Year the weather station started recording data |
| Last Year | Year the weather station stopped recording data |
Table 3. CWEEDS Weather Station Data (W) Features Used with Units.

| S.No. | Feature | Preprocessing | Final Units |
| --- | --- | --- | --- |
| 1 | Extraterrestrial irradiance | -NA- | kJ/m^2 |
| 2 | Global irradiance | -NA- | kJ/m^2 |
| 3 | Direct irradiance | -NA- | kJ/m^2 |
| 4 | Diffuse irradiance | -NA- | kJ/m^2 |
| 5 | Global illuminance | 100 lux → lux | lux |
| 6 | Direct illuminance | 100 lux → lux | lux |
| 7 | Diffuse illuminance | 100 lux → lux | lux |
| 8 | Zenith luminance | 100 Cd/m^2 → Cd/m^2 | Cd/m^2 |
| 9 | Minutes of sunshine | -NA- | min |
| 10 | Ceiling height | 10 m → m | m |
| 11–14 | Sky layers | Four-digit sky condition codes → digit codes | Digit code |
| 15 | Visibility | 100 m → km | km |
| 16 | Thunderstorm (Weather) | Eight-digit weather code → digit codes | Digit code |
| 17 | Rain (Weather) | Eight-digit weather code → digit codes | Digit code |
| 18 | Drizzle (Weather) | Eight-digit weather code → digit codes | Digit code |
| 19 | Snow 1 (Weather) | Eight-digit weather code → digit codes | Digit code |
| 20 | Snow 2 (Weather) | Eight-digit weather code → digit codes | Digit code |
| 21 | Ice (Weather) | Eight-digit weather code → digit codes | Digit code |
| 22 | Visibility 1 (Weather) | Eight-digit weather code → digit codes | Digit code |
| 23 | Visibility 2 (Weather) | Eight-digit weather code → digit codes | Digit code |
| 24 | Station pressure | 10 Pa → Pa | Pa |
| 25 | Dry bulb temperature | 0.1 °C → °C | °C |
| 26 | Dew point temperature | 0.1 °C → °C | °C |
| 27 | Wind direction | -NA- | degree |
| 28 | Wind speed | 0.1 m/s → m/s | m/s |
| 29 | Total sky cover | -NA- | Oktas |
| 30 | Opaque sky cover | -NA- | Oktas |
| 31 | Snow cover | -NA- | Boolean |
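The Preprocessing column in Table 3 describes fixed scale conversions from the raw CWEEDS integer encodings to physical units. The sketch below is illustrative only: the field names and the flat `record` layout are hypothetical and not the paper's actual pipeline, which parses fixed-width CWEEDS files. The conversions themselves reduce to per-field multiplication.

```python
# Illustrative per-field scale factors taken from Table 3's Preprocessing column.
# Field names are hypothetical; CWEEDS files need their own fixed-width parser.
SCALE = {
    "dry_bulb_temp": 0.1,        # stored as 0.1 °C -> °C
    "dew_point_temp": 0.1,       # stored as 0.1 °C -> °C
    "station_pressure": 10.0,    # stored as 10 Pa -> Pa
    "wind_speed": 0.1,           # stored as 0.1 m/s -> m/s
    "ceiling_height": 10.0,      # stored as 10 m -> m
    "visibility": 0.1,           # stored as 100 m -> km
    "global_illuminance": 100.0, # stored as 100 lux -> lux
}

def convert(record: dict) -> dict:
    """Apply the per-field scale factors; unlisted fields pass through unchanged."""
    return {name: value * SCALE.get(name, 1.0) for name, value in record.items()}

raw = {"dry_bulb_temp": 215, "station_pressure": 10132, "wind_direction": 270}
print(convert(raw))
```

Fields stored as digit codes (sky condition, weather) are left as categorical codes rather than scaled, matching the "Digit code" entries in the Final Units column.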
Table 4. Methodology Notation/Variable Description.

| Variable | Description |
| --- | --- |
| ℕ | Set of natural numbers |
| ℝ | Set of real numbers |
| ∅ | Empty set or empty list |
| N | Number of past days |
| M | Number of past months |
| K | Number of nearest sensors |
| F | Set of forest fire incidents |
| f | A single fire event |
| f_s | Start date of a forest fire |
| f_g | 2-dimensional polygonal geometry of a forest fire |
| f_l | Center location of the polygonal region of a forest fire |
| s | A single sensor (weather station) |
| s_l | Location of a single sensor (weather station) |
| S | Set of all sensors (weather stations) |
| S_k | Set of K selected sensors (weather stations) from S |
| w | A single sensor (weather) data matrix |
| w_i | The sensor (weather) data matrix at index i in W_k |
| W | Set of sensor (weather) data matrices |
| W_k | A list of K sensor (weather) data matrices |
| w̄_k | Average of the K sensor (weather) data matrices |
| p | The number of features |
| g(.) | Mapping from set S to set W |
| Ê | A list of all events (fire events) |
| ê | A single event (fire event) |
| Ě | A list of all non-events (non-fire events) |
| ě | A single non-event (non-fire event) |
| D | Intermediate dataset |
| D_train | Training dataset without spatio-temporal information |
| D_test | Testing dataset without spatio-temporal information |
| t_max | Latest time record in days |
| t_start | Start date to consider for a time series |
| t_end | End date to consider for a time series |
| A | A dataset |
| A_maj | A subset of A with majority-class values |
| A_min | A subset of A with minority-class values |
| A_1, A_2, A_3 | Sets of data |
| a | A value in set A |
| x | Number of times to up-sample in SMOTE |
| T_dist | Time complexity of distance calculation |
| T_sort | Time complexity of sorting |
| T_extr | Time complexity of event extraction |
| T_interp | Time complexity of interpolation |
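The terms T_dist and T_sort in Table 4 cover the selection of the K sensors in S_k nearest to a fire's center location f_l. As a minimal sketch only, assuming great-circle (haversine) distance for illustration (the paper's exact distance metric is defined in its methodology), the step amounts to ranking sensors by distance and keeping the first K:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def k_nearest_sensors(f_l, sensors, k):
    """Return the k sensor ids closest to the fire centroid f_l.

    sensors maps sensor id -> (lat, lon); ranking all of them is the
    T_dist + T_sort step in the complexity analysis.
    """
    ranked = sorted(sensors, key=lambda sid: haversine_km(*f_l, *sensors[sid]))
    return ranked[:k]

# Toy data: s1 near Ottawa, s2 near Toronto, s3 near Vancouver.
sensors = {"s1": (45.4, -75.7), "s2": (43.7, -79.4), "s3": (49.3, -123.1)}
print(k_nearest_sensors((45.0, -76.0), sensors, k=2))  # ['s1', 's2']
```

Ranking all |S| sensors costs O(|S|) distance evaluations plus an O(|S| log |S|) sort per fire event, which is where the T_dist and T_sort terms enter the framework's overall complexity.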
Table 5. Representation of Model Metrics with respect to K, N, and M.

| K | M | F1 (N=7) | R2 (N=7) | F1 (N=14) | R2 (N=14) | F1 (N=21) | R2 (N=21) | F1 (N=30) | R2 (N=30) | F1 (N=60) | R2 (N=60) | F1 (N=90) | R2 (N=90) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 0.821 | 0.406 | 0.874 | 0.546 | 0.925 | 0.646 | 0.986 | 0.828 | 0.855 | 0.535 | 0.965 | 0.879 |
| 1 | 2 | 0.903 | 0.663 | 0.922 | 0.689 | 0.928 | 0.728 | 0.948 | 0.672 | 0.944 | 0.745 | 0.962 | 0.812 |
| 1 | 3 | 0.957 | 0.832 | 0.964 | 0.853 | 0.974 | 0.877 | 0.981 | 0.907 | 0.983 | 0.913 | 0.986 | 0.924 |
| 1 | 4 | 0.986 | 0.925 | 0.988 | 0.931 | 0.991 | 0.944 | 0.993 | 0.944 | 0.995 | 0.970 | 0.996 | 0.972 |
| 1 | 5 | 0.990 | 0.952 | 0.994 | 0.955 | 0.994 | 0.959 | 0.994 | 0.964 | 0.996 | 0.976 | 0.997 | 0.933 |
| 1 | 6 | 0.993 | 0.971 | 0.997 | 0.966 | 0.995 | 0.976 | 0.998 | 0.975 | 0.995 | 0.980 | 0.997 | 0.977 |
| 1 | 7 | 0.994 | 0.958 | 0.994 | 0.967 | 0.994 | 0.968 | 0.994 | 0.964 | 0.994 | 0.967 | 0.998 | 0.985 |
| 1 | 8 | 0.990 | 0.948 | 0.989 | 0.947 | 0.991 | 0.953 | 0.987 | 0.944 | 0.996 | 0.968 | 0.997 | 0.963 |
| 1 | 9 | 0.976 | 0.876 | 0.976 | 0.889 | 0.978 | 0.908 | 0.988 | 0.936 | 0.988 | 0.930 | 0.989 | 0.941 |
| 1 | 10 | 0.905 | 0.692 | 0.920 | 0.729 | 0.934 | 0.731 | 0.937 | 0.683 | 0.947 | 0.784 | 0.959 | 0.837 |
| 1 | 11 | 0.817 | 0.455 | 0.868 | 0.558 | 0.923 | 0.664 | 0.977 | 0.824 | 0.864 | 0.853 | 0.971 | 0.875 |
| 1 | 12 | 0.716 | 0.159 | 0.754 | 0.186 | 0.766 | 0.227 | 0.787 | 0.264 | 0.798 | 0.349 | 0.843 | 0.385 |
| 3 | 1 | 0.821 | 0.436 | 0.880 | 0.547 | 0.923 | 0.688 | 0.984 | 0.768 | 0.869 | 0.549 | 0.974 | 0.900 |
| 3 | 2 | 0.912 | 0.694 | 0.934 | 0.715 | 0.941 | 0.737 | 0.946 | 0.681 | 0.953 | 0.777 | 0.962 | 0.819 |
| 3 | 3 | 0.952 | 0.838 | 0.969 | 0.865 | 0.977 | 0.888 | 0.985 | 0.902 | 0.987 | 0.905 | 0.986 | 0.936 |
| 3 | 4 | 0.987 | 0.923 | 0.987 | 0.943 | 0.994 | 0.951 | 0.995 | 0.944 | 0.994 | 0.972 | 0.998 | 0.974 |
| 3 | 5 | 0.991 | 0.952 | 0.993 | 0.951 | 0.992 | 0.955 | 0.995 | 0.960 | 0.995 | 0.965 | 0.995 | 0.982 |
| 3 | 6 | 0.994 | 0.967 | 0.996 | 0.969 | 0.997 | 0.976 | 0.996 | 0.979 | 0.994 | 0.973 | 0.999 | 0.975 |
| 3 | 7 | 0.994 | 0.961 | 0.997 | 0.965 | 0.995 | 0.967 | 0.995 | 0.967 | 0.997 | 0.969 | 0.999 | 0.979 |
| 3 | 8 | 0.990 | 0.937 | 0.994 | 0.953 | 0.994 | 0.953 | 0.997 | 0.945 | 0.997 | 0.963 | 0.997 | 0.965 |
| 3 | 9 | 0.975 | 0.883 | 0.979 | 0.900 | 0.985 | 0.951 | 0.989 | 0.931 | 0.987 | 0.935 | 0.987 | 0.940 |
| 3 | 10 | 0.915 | 0.709 | 0.933 | 0.746 | 0.940 | 0.736 | 0.944 | 0.721 | 0.950 | 0.786 | 0.958 | 0.836 |
| 3 | 11 | 0.831 | 0.472 | 0.888 | 0.581 | 0.929 | 0.700 | 0.981 | 0.791 | 0.880 | 0.569 | 0.973 | 0.906 |
| 3 | 12 | 0.723 | 0.188 | 0.789 | 0.253 | 0.805 | 0.289 | 0.803 | 0.347 | 0.832 | 0.412 | 0.829 | 0.452 |
| 5 | 1 | 0.824 | 0.451 | 0.890 | 0.590 | 0.929 | 0.703 | 0.988 | 0.701 | 0.890 | 0.589 | 0.971 | 0.922 |
| 5 | 2 | 0.907 | 0.659 | 0.945 | 0.738 | 0.943 | 0.735 | 0.949 | 0.655 | 0.956 | 0.780 | 0.961 | 0.837 |
| 5 | 3 | 0.962 | 0.841 | 0.975 | 0.856 | 0.978 | 0.900 | 0.986 | 0.919 | 0.984 | 0.913 | 0.988 | 0.928 |
| 5 | 4 | 0.992 | 0.931 | 0.992 | 0.951 | 0.992 | 0.957 | 0.996 | 0.947 | 0.996 | 0.969 | 0.997 | 0.981 |
| 5 | 5 | 0.991 | 0.951 | 0.993 | 0.963 | 0.995 | 0.963 | 0.993 | 0.964 | 0.997 | 0.969 | 0.997 | 0.976 |
| 5 | 6 | 0.995 | 0.977 | 0.996 | 0.972 | 0.997 | 0.974 | 0.998 | 0.958 | 0.997 | 0.974 | 0.998 | 0.980 |
| 5 | 7 | 0.992 | 0.967 | 0.995 | 0.965 | 0.996 | 0.971 | 0.997 | 0.973 | 0.997 | 0.973 | 0.998 | 0.982 |
| 5 | 8 | 0.991 | 0.902 | 0.994 | 0.950 | 0.996 | 0.953 | 0.997 | 0.953 | 0.995 | 0.976 | 0.997 | 0.968 |
| 5 | 9 | 0.970 | 0.899 | 0.983 | 0.914 | 0.986 | 0.951 | 0.991 | 0.931 | 0.987 | 0.925 | 0.992 | 0.945 |
| 5 | 10 | 0.941 | 0.705 | 0.945 | 0.753 | 0.943 | 0.768 | 0.948 | 0.699 | 0.949 | 0.785 | 0.955 | 0.844 |
| 5 | 11 | 0.833 | 0.467 | 0.899 | 0.582 | 0.939 | 0.708 | 0.981 | 0.813 | 0.886 | 0.603 | 0.975 | 0.895 |
| 5 | 12 | 0.760 | 0.191 | 0.802 | 0.255 | 0.808 | 0.323 | 0.813 | 0.342 | 0.839 | 0.468 | 0.836 | 0.497 |
Binary classification model metrics are reported in the F1 columns and regression model metrics in the R2 columns.
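The abstract's summary figures (180 of 216 binary classification models with F1 > 0.9; 150 of 216 regression models with R2 > 0.75) are threshold counts over the 36 (K, M) rows and 6 N columns of Table 5. A minimal sketch of that tally, populated here with only the first two rows of the table (K=1, M=1 and K=1, M=2) rather than all 36:

```python
# Each row maps (K, M) -> [(F1, R2) for N in (7, 14, 21, 30, 60, 90)].
# Only the first two rows of Table 5 are included in this sketch.
rows = {
    (1, 1): [(0.821, 0.406), (0.874, 0.546), (0.925, 0.646),
             (0.986, 0.828), (0.855, 0.535), (0.965, 0.879)],
    (1, 2): [(0.903, 0.663), (0.922, 0.689), (0.928, 0.728),
             (0.948, 0.672), (0.944, 0.745), (0.962, 0.812)],
}

# Count configurations clearing the paper's thresholds.
f1_pass = sum(f1 > 0.9 for cells in rows.values() for f1, _ in cells)
r2_pass = sum(r2 > 0.75 for cells in rows.values() for _, r2 in cells)
print(f1_pass, r2_pass)  # 9 3 for these two rows
```

Running the same tally over all 36 rows reproduces the 180/216 and 150/216 counts reported in the abstract.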