A Comparative Study of Genetic Algorithm-Based Ensemble Models and Knowledge-Based Models for Wildfire Susceptibility Mapping

Al-Shabeeb, Abdel Rahman; Hamdan, Ibraheem; Meimandi Parizi, Sedigheh; Al-Fugara, A’kif; Odat, Sana’a; Elkhrachy, Ismail; Hu, Tongxin; Sammen, Saad Sh.

doi:10.3390/su152115598


by Abdel Rahman Al-Shabeeb ¹,*, Ibraheem Hamdan ², Sedigheh Meimandi Parizi ³, A’kif Al-Fugara ⁴, Sana’a Odat ⁵, Ismail Elkhrachy ⁶, Tongxin Hu ⁷ and Saad Sh. Sammen ⁸,*

¹ Department of GIS and Remote Sensing, Faculty of Earth and Environmental Sciences, Al al-Bayt University, Mafraq 25113, Jordan

² Department of Applied Earth and Environmental Sciences, Faculty of Earth and Environmental Sciences, Al al-Bayt University, Mafraq 25113, Jordan

³ Department of Civil Engineering, Sirjan University of Technology, Sirjan 7813733385, Iran

⁴ Department of Surveying Engineering, Faculty of Engineering, Al al-Bayt University, Mafraq 25113, Jordan

⁵ Department of Earth and Environmental Sciences, Faculty of Science, Yarmouk University, Irbid 21163, Jordan

⁶ Civil Engineering Department, College of Engineering, Najran University, King Abdulaziz Road, Najran 66454, Saudi Arabia

⁷ Key Laboratory of Sustainable Forest Ecosystem Management-Ministry of Education, College of Forestry, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China

⁸ Department of Civil Engineering, College of Engineering, University of Diyala, Baqubah 10047, Iraq

* Authors to whom correspondence should be addressed.

Sustainability 2023, 15(21), 15598; https://doi.org/10.3390/su152115598

Submission received: 6 August 2023 / Revised: 22 September 2023 / Accepted: 29 September 2023 / Published: 3 November 2023

(This article belongs to the Special Issue Predictive Modeling through Earth Observational Data Analysis for Natural Hazards Risk Assessment and Disaster Management)


Abstract:

Wildfire susceptibility mapping (WSM) identifies areas with heightened vulnerability to forest fires, which allows proactive fire prevention, management, and resource allocation and ultimately leads to more effective fire control and mitigation strategies. This paper develops and compares two knowledge-based models, namely the analytic hierarchy process (AHP) and the technique for order of preference by similarity to ideal solution (TOPSIS), and two novel genetic algorithm (GA)-based ensemble data-driven models: boosting and random subspace. The objective was to map susceptibility to forest fires in the Northern Mazar District in Jordan. The ensemble models were constructed using four well-known classifiers: decision tree (DT), support vector machine (SVM), k-nearest neighbors (kNN), and naive Bayes (NB). Seventy forest fire locations and twelve influential factors were used to build and evaluate the models. To identify the optimal features for constructing the data-driven models, a GA-based wrapper method was applied with the four machine learning models. During the validation phase, the area under the receiver operating characteristic curve (AUROCC) values for the single SVM, single NB, single DT, single kNN, GA-based boosting, GA-based random subspace, FR-AHP, and AHP-TOPSIS models were 85.3%, 85.9%, 73.8%, 88.7%, 95.0%, 95.0%, 74.0%, and 65.4%, respectively. The results indicated that the GA-based ensemble models outperformed both the single machine learning models and the knowledge-based techniques. The models developed in this study can support management and decision-making processes aimed at mitigating forest fire risks and enhancing fire control strategies.

Keywords:

fire susceptibility mapping; AHP; support vector machine; random subspace; ensemble models

1. Introduction

Forest fires result from a combination of human activities, climate conditions, and ecological factors [1]. They may recur at regular intervals or occur sporadically, and they vary in severity [2]. Because forests are vital natural resources that also play a key role in maintaining ecosystems [3], identifying fire-prone areas is a crucial step in prediction and prevention. Effective management of fire risks in vulnerable regions also requires an understanding of the factors that influence forest fires [4].

While forest fires can have beneficial effects, such as enriching the soil and eliminating harmful fungi and microorganisms [2], they also lead to forest degradation and biodiversity loss [1]. Therefore, decision-makers need to assess the susceptibility of areas to forest fires accurately in order to plan preventive measures and address the underlying causes. The susceptibility of an area to forest fires has a significant influence on social factors, which ultimately determine the level of harm experienced by individuals and their properties [2]. Given Jordan’s elevated risk of forest fire occurrence, precise mapping of high-risk areas is crucial [5].

Studies on forest fire susceptibility (FFS) require the consideration of various conditioning factors, such as environmental, economic, topographic, and meteorological factors, which aid in simulating fire ignition in forests. It is crucial to carefully choose these factors for accurate FFS mapping [3]. Some factors directly impact fire occurrence, while others have indirect effects. Fuel load plays a critical role in the occurrence and severity of wildfires, as it determines the amount and type of flammable material available to sustain and spread the fire. Understanding and managing fuel load levels are essential for mitigating fire incidents and reducing their impact on ecosystems and human communities [6]. Ghorbanzadeh et al. (2019) developed an FFS index using a machine learning (ML) model and GIS-MCDM method [2]. They utilized 16 conditioning factors to create a forest fire inventory and generated a fire susceptibility map using an ANN model. They argued that their index can be applied to different regions by considering relevant input data for those specific areas. Eskandari et al. (2020) explored the influence of climatic factors on forest fire incidence through regression and correlation analyses and temporal relationships [7]. They found that temperature, humidity, and wind speed significantly affect fire outbreaks. Pourghasemi et al. (2020) studied the FFS using ten determinant factors and employed three GIS-based ML algorithms: boosted regression tree (BRT), mixture discriminant analysis (MDA), and general linear model (GLM) [8]. They also examined the spatial relationships between these factors and forest fire occurrence. Their findings indicated that land use, slope, rainfall rate, and elevation are the most important factors in predicting forest fire incidence. The literature review emphasizes the importance of selecting influential factors for forest fires appropriately. 
Therefore, the first objective of this study was to evaluate the feasibility of wrapper-based feature selection using a genetic algorithm (GA) and four ML models to identify the factors that influence the forest fire incidence. These selected factors were then utilized for the FFS mapping.

Advancements in ensemble ML models and multicriteria decision analysis (MCDA) methods, coupled with geographic information systems (GIS) analysis and remote sensing (RS) data, have led to more precise forecasting of forest fire incidents. Ghorbanzadeh et al. (2019) categorized the commonly used algorithms for susceptibility mapping into two types: knowledge-based models, such as the analytic hierarchy process (AHP), analytic network process (ANP), and fuzzy logic, and data-driven algorithms, such as random forest (RF), logistic regression (LR), and artificial neural network (ANN) [2,9]. The MCDA techniques are particularly suitable for problems that involve conflicting decision-making criteria and complex selection among alternatives [10]. Several approaches have been employed in the analysis and mapping of forest fire susceptibility, including AHP [1,10,11], the technique for order of preference by similarity to ideal solution (TOPSIS) [3], ANP [5], VIKOR [3,12], and fuzzy AHP [10]. Meanwhile, various data-driven models have commonly been used for forest fire risk assessment, such as the fuzzy inference system [13,14], LogitBoost ensemble-based decision tree (LEDT) [15], adaptive neuro fuzzy inference system (ANFIS) [16], simulated annealing (SA), genetic algorithm (GA), imperialist competitive algorithm-based ANFIS [17], and ensemble models [18,19,20,21].

Based on the review of the available literature, it is evident that the adoption of hybrid and ensemble models for FFS has increased in recent years. These models have shown promising outcomes in FFS modeling because they leverage the capabilities of multiple models simultaneously. Hybrid models, for instance, are typically created by combining meta-heuristic and machine learning algorithms, while ensemble models are formed by merging the outputs of multiple machine learning models [22,23]. However, there has been limited focus in the scientific literature on the combination of meta-heuristic algorithms and ensemble models. Therefore, the second objective of this study is to develop a wrapper feature selection method using Genetic Algorithm (GA) and ensemble models based on boosting and random subspace approaches. Additionally, few studies have compared the performance of GA-based ensemble models with those based on multicriteria decision-making (MCDM) models. Therefore, the third objective of this study is to compare the performance of two MCDM models, namely analytic hierarchy process (AHP) and AHP-TOPSIS, with the boosting and random subspace ensemble models employing wrapper feature selection. Briefly, the innovation of this study is the development of a wrapper feature selection method using Genetic Algorithm (GA) and ensemble models based on boosting and random subspace approaches. This approach combines meta-heuristic algorithms and ensemble models to improve the accuracy and performance of forest fire susceptibility mapping. Additionally, the comparison of GA-based ensemble models with MCDM models provides valuable insights into the effectiveness of different modeling approaches in predicting forest fire incidence. The FFS maps for the Northern Mazar District in Jordan were created using proposed models that took into account twelve key factors associated with forest fires. 
These factors are rainfall rate, aspect, wind speed, elevation, slope, land use, solar radiation, temperature, population density, distance to roads, topographic wetness index (TWI), and normalized difference vegetation index (NDVI).

2. Study Area

The Northern Mazar District is situated in the northern part of Jordan (Figure 1), at the geographical coordinates 32°28′21″ N and 35°47′34″ E. Administratively, it falls under the jurisdiction of Irbid Governorate, which is composed of nine districts: Irbid city, Al Hisn, Al Mazar al Shamali, Ar Ramtha, Sama al-Rousan, Der Abi Saeed, North Shuneh, Taybeh, and Kufr Asad. The Northern Mazar District lies in the south of the governorate. Its total area is 63.7 km², around 4% of the governorate’s 1571.8 km². The district comprises ten villages that differ in size and population: Al-Mazar Al-Shamali, Enaba, Deir Yusef, Arhaba, Zobia, Jahfah, Samad, Zaatara, Habka, and Hofa.

Based on the data of the Department of Statistics, Jordan, Irbid Governorate had a population of 2,050,300 individuals in 2021. During that year, the Northern Mazar District accounted for 90,840 people, comprising 18,254 families, which represents approximately 4.43% of Irbid Governorate’s total population. Considering these figures, the Northern Mazar District ranks as the seventh most populous district within the governorate of Irbid.

The Northern Mazar District is characterized by its mountainous terrain, which is an extension of the topography found in the adjacent Ajloun Governorate to the south. The primary occupation and source of income for the residents of this district is agriculture. There is widespread cultivation of various fruit-bearing trees such as almonds, olives, and grapes. Additionally, parts of the southern region of the Northern Mazar District are covered by natural forests consisting mainly of oak and pine trees, which extend from the Ajloun Governorate to the north. To safeguard the natural habitats in this district, the Jordanian government has taken action by designating a portion of it as the Barqash Nature Reserve. Livestock raising is another significant activity in this district, taking advantage of the fertile agricultural lands available. Livestock raising practices within the district vary, ranging from individual endeavors to organized, large-scale farms that employ modern methods.

The study area exhibits a pleasant and attractive climate for tourism, characterized by mild summers with an average temperature of approximately 25 °C. However, winters in this district are cold, and temperatures frequently drop below 0 °C during this season. Snowfall is also common in certain parts of the Northern Mazar District during winter. In terms of precipitation, the annual rainfall in this district ranges from 400 to 600 mm, making it one of the areas in Jordan with the highest rainfall rates. As a result, the study area boasts a remarkable richness of biodiversity. However, the presence of dry weeds and field crops during the summer and autumn months increases the risk of forest fires in this region.

2.1. Preparation of Inventory Map

To create the forest fire inventory map, information was gathered from the General Directorate of Civil Defense in the Northern Mazar District on seventy locations within the study area that experienced forest fires between 1991 and 2021. The Civil Defense Directorate generated these samples from the fire-affected polygons using the “Create Random Points” technique in the GIS environment. In addition, seventy locations within the study area with no history of forest fires were identified so that data-driven models could be developed. Of the combined 140 locations, seventy percent (98 locations) were randomly selected for constructing the data-driven models, while the remaining thirty percent (42 locations) were reserved for validating them. The validation dataset was also used to verify the accuracy of the knowledge-based models.
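The 70/30 random split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the point identifiers, the function name `split_inventory`, and the seed are hypothetical, and the study does not say which tool performed the split.

```python
import random

def split_inventory(fire_ids, nonfire_ids, train_fraction=0.7, seed=0):
    """Randomly split labelled point IDs into training and validation sets.

    Fire points get label 1 and non-fire points get label 0.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    samples = [(pid, 1) for pid in fire_ids] + [(pid, 0) for pid in nonfire_ids]
    rng.shuffle(samples)
    n_train = round(train_fraction * len(samples))
    return samples[:n_train], samples[n_train:]

# Hypothetical identifiers standing in for the 70 fire and 70 non-fire locations.
fire = [f"fire_{k}" for k in range(70)]
nonfire = [f"nonfire_{k}" for k in range(70)]
train, valid = split_inventory(fire, nonfire)
print(len(train), len(valid))  # 98 42
```

A plain shuffle does not force the fire and non-fire classes to be equally represented in each subset; a stratified split would be a reasonable alternative if that balance matters.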

2.2. Factors Contributing to Forest Fires

Following consultations with experts and a thorough review of the relevant literature, twelve factors were selected for the FFS mapping. These factors were chosen based on data availability and their significance in assessing fire spread. Subsequently, the reported values of each factor within the study area were classified into relevant subcategories. A detailed overview of each factor will be provided.

2.2.1. Elevation

The elevation of an area plays a crucial role in determining various climatic conditions, including temperature, humidity, the presence of dry organic materials, as well as the intensity and direction of winds. All of these factors contribute significantly to the risks of ignition and fire spread [3]. Figure 2a depicts an elevation map specifically for the study district, highlighting variations in elevation ranging from 326 to 1096 m above mean sea level (msl).

2.2.2. Slope

The slope of the land significantly influences the direction and speed at which a fire spreads [24]. Typically, fires tend to propagate in an up-slope direction, and their rate of spread increases in such conditions due to enhanced connectivity, preheating, and ignition along the uphill path [1]. Figure 2b illustrates the slope map specifically for the study area, indicating that slopes within the district varied from flat lands (0 degrees) to steeper slopes of up to 55.72 degrees.

2.2.3. Aspect

Aspect describes the compass direction that a slope faces, which governs how much sunlight, heat, and moisture the surface receives [24]. An aspect map therefore provides details about the orientation of an area’s terrain. These maps are valuable for assessing landscape features and the quantity of sunlight that a location receives, which in turn affects its temperature [25]. In Figure 2c, it can be observed that the aspect values in the study region varied from −1 (representing flat areas) to 359.356°.

2.2.4. Land Use

In this study, the land cover and land use attributes of the study area have been classified into eight categories (Figure 2d).

2.2.5. Distance to Roads

The closeness of forests to roads enhances their vulnerability to fires due to various human-induced disturbances in the forest ecosystem [1]. As shown in Figure 2e, the study area exhibited distances to roads ranging from 0 m to over 1200 m.

2.2.6. Population Density

An increased concentration of people living near forests indicates a greater reliance on forest resources, which can result in an elevated risk of forest fire incidents due to human activities within the forests [26]. Figure 2f reveals that population density in the study area ranged from 0.25 to 2.58 individuals per square kilometer.

2.2.7. Wind Speed

High wind speeds enhance the presence of fresh oxygen in the atmosphere, which can potentially contribute to the spread of fires [25]. Within the study area, the investigation revealed that wind speeds varied between 7 and 8 m per second, as depicted in Figure 2g.

2.2.8. Rainfall

Rainfall plays a vital role in regulating humidity and maintaining the water balance [24] and is inversely related to forest fires [25]. In the study area, the average annual rainfall depths varied from 450 mm to 550 mm, as illustrated in Figure 2h.

2.2.9. Temperature

The likelihood of a fire spread increases when there is a combination of high temperature and low humidity [24]. Forest fires have a direct relationship with this combination of weather conditions [25]. According to Figure 2i, the temperature in the study area ranges from 16.47 to 30.04 °C.

2.2.10. NDVI

Normalized difference vegetation index (NDVI) is a widely used indicator for the presence and condition of green vegetation. High NDVI values indicate healthier vegetation, while low values suggest a lack of vegetation. The NDVI value is calculated by dividing the difference between the near-infrared and red bands of radiation by their sum, resulting in a value between −1 and +1. It is calculated using Equation (1) [25]:

$NDVI = \frac{NIR - Red}{NIR + Red}$

(1)

In the study area, the NDVI values ranged from 0.05 to 0.49 (Figure 2j).
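As a small worked illustration of Equation (1), the sketch below computes NDVI for a single pixel. The reflectance values are made up, and returning `None` when both bands are zero is an assumption of this sketch rather than something the study specifies.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    Returns None when both bands are zero, where the index is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

# Hypothetical reflectances: near-infrared 0.40, red 0.20.
value = ndvi(0.40, 0.20)
print(round(value, 2))  # 0.33
```

A value of about 0.33 would fall inside the 0.05 to 0.49 range reported for the study area.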

2.2.11. Topographic Wetness Index (TWI)

The Topographic Wetness Index (TWI) quantifies the influence of land topography on hydrological processes [27]. It is determined by considering the slope and the size of the contributing area located upstream (Equation (2)):

$TWI = \ln \left( \frac{CA}{Slope} \right)$

(2)

In Equation (2), CA is the upslope contributing (catchment) area and Slope is the steepest downslope gradient of each grid cell [25]. Figure 2k illustrates that within the study area, TWI values varied from −8.12 to 9.28.
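Equation (2) can be evaluated per grid cell as in the minimal sketch below. The function name, the `eps` guard for flat cells, and the sample values are assumptions for illustration. Many GIS workflows pass the tangent of the slope angle as the slope term, which is what the usage line does, but the study does not state which form was used.

```python
import math

def twi(contributing_area, slope_term, eps=1e-6):
    """TWI = ln(CA / Slope); eps avoids division by zero on flat cells."""
    return math.log(contributing_area / max(slope_term, eps))

# Hypothetical cell: contributing area of 100 (map units) and a 10 degree slope,
# with the slope expressed as tan(angle).
value = twi(100.0, math.tan(math.radians(10.0)))
print(round(value, 2))
```

Cells with a large contributing area and a gentle slope receive high TWI values, while steep cells with little upslope area receive low or negative values.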

2.2.12. Solar Radiation

Solar radiation has a notable impact on forest fire occurrences since it serves as the predominant energy source that initiates and spreads fires within forest ecosystems [28]. According to Figure 2m, the solar radiation intensity within the study area ranges from 48.56 to 13,075.8 Watts/m². Table 1 shows the source of the factors and their minimum and maximum values.

3. Methodology

This study compares two knowledge-based models, AHP (implemented as FR-AHP) and AHP-TOPSIS, with two newly developed GA-based ensemble data-driven models, boosting and random subspace, for FFS mapping (Figure 3). The data-driven models were built using a dataset of seventy locations with a forest fire history and seventy locations without any recorded forest fires. The data were divided into a training set of 70% (98 locations) and a validation set of 30% (42 locations). The ensemble models were built on the training dataset and evaluated on the validation dataset. The averaging method was used to generate the final output of the GA-based boosting and GA-based random subspace models.

For the knowledge-based models, the factors that influence fire incidence were first classified into subclasses based on previous studies and their ranges of values. The frequency ratio (FR) technique was then used to rate each subclass, where a high FR score means that a subclass contains more fire incidents relative to the area it covers. Afterward, the AHP method was applied to calculate the weight of each factor. Experts’ opinions were used to determine the relative importance of each factor over the others: each pair of factors was assigned a relative importance score on a scale from 1 to 9. Finally, the weighted overlay approach was employed to combine the subclass ratings and factor weights into the FR-AHP model. The weights obtained through the AHP analysis were also used in the TOPSIS method.

For the construction of ensemble models in FFS modeling, a GA-based wrapper approach and four algorithms (NB, SVM, kNN, and DT) were utilized to identify the optimal features. These selected features were used in four ML models. Concurrently, ensemble models were developed using boosting and random subspace approaches. Subsequently, the models were validated, and forest fire susceptibility maps were generated (Figure 3). Detailed information on the models and the evaluation criteria for their performance will be provided in subsequent sections.

3.1. Frequency Ratio (FR)

The frequency ratio (FR) was used to determine the correlation between the forest fire locations and each of the aforementioned determinant factors. Through this method, all classes of the factors were weighted. The FR is computed by dividing the ratio of the number of fires in class $j$ of factor $i$ ($F_{ij}$) to the total number of fires ($F$) by the ratio of the area of that class ($A_{ij}$) to the overall area ($A$) of the study area, according to Equation (3) [29]:

$FR_{ij} = \frac{F_{ij} / F}{A_{ij} / A}$

(3)

The FR values obtained were standardized and normalized using Equation (4) [30]:

$NFR_{ij} = \frac{FR_{ij}}{\sum_{j} FR_{ij}}$

(4)

where $NFR_{ij}$ is the normalized FR value, $FR_{ij}$ is the FR of the $j$th subclass of factor $i$, and $\sum_{j} FR_{ij}$ is the sum of the FR values of factor $i$.

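In addition, the prediction rate (PR) is used to evaluate the overall impact of each factor within the FR model. It is calculated according to Equation (5) [30]:

$PR_{i} = \frac{\max(NFR_{i}) - \min(NFR_{i})}{\left( \max(NFR_{i}) - \min(NFR_{i}) \right)_{\min}}$

(5)

where the numerator is the range of the normalized FR values of factor $i$ and the denominator is the smallest such range among all factors, so the least influential factor has a PR of 1.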
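The chain from Equation (3) to Equation (5) can be followed with a short numerical sketch. All fire counts and class areas below are invented for illustration, and the function names are hypothetical. The sketch also assumes that no factor has a zero range of normalized FR values, since that would make the PR denominator zero.

```python
def frequency_ratio(fires_in_class, total_fires, class_area, total_area):
    """Equation (3): share of fires in a class divided by its share of the area."""
    return (fires_in_class / total_fires) / (class_area / total_area)

def normalized_fr(fr_values):
    """Equation (4): divide each class FR by the sum of FRs of the same factor."""
    total = sum(fr_values)
    return [value / total for value in fr_values]

def prediction_rates(nfr_by_factor):
    """Equation (5): range of NFR per factor divided by the smallest range."""
    ranges = {name: max(v) - min(v) for name, v in nfr_by_factor.items()}
    smallest = min(ranges.values())
    return {name: r / smallest for name, r in ranges.items()}

# Invented example: 70 fires over an area of 100 units, two factors.
classes = {
    "slope": [(10, 30), (40, 40), (20, 30)],   # (fires, area) per class
    "aspect": [(30, 50), (40, 50)],
}
nfr = {
    name: normalized_fr([frequency_ratio(f, 70, a, 100) for f, a in rows])
    for name, rows in classes.items()
}
pr = prediction_rates(nfr)
print({name: round(value, 2) for name, value in pr.items()})
```

In this invented case the slope classes differ more strongly in their fire density than the aspect classes, so slope receives the larger PR and aspect, having the smallest range, receives a PR of 1.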

3.2. Analytic Hierarchy Process (AHP)

The Analytic Hierarchy Process (AHP) is a widely recognized method in multicriteria decision-making (MCDM) [31]. In the AHP, different criteria are assigned weights based on a specific objective, aiming to identify the most optimal choice [31]. This technique involves three primary steps [32]:

Step 1: Setting the goal, criteria, and alternatives in a problem

The initial step involves defining the objective, criteria, and available alternatives for a given problem. In this particular study, the objective focused on FFS, while the criteria consisted of 12 factors that determine forest fire occurrence. These factors were identified and specified by the experts involved in this study.

Step 2: Making pairwise comparisons and weighting

During this step, the criteria are arranged in a matrix called the pairwise comparison matrix. This matrix allows for the comparison and weighting of the criteria by considering them two at a time. The pairwise comparison is performed by assigning importance weights to the elements within each cell of the matrix. Typically, a scale of 1–9 is used to assess the relative importance of each factor being evaluated. A value of 9 signifies extreme importance of one criterion over the other, while a value of 1 indicates equal importance and preference (refer to Table 2). Importantly, the pairwise comparison matrix is a reciprocal matrix, meaning that if the comparative value of the importance of a row element ‘a’ to a column element ‘b’ is 9, then the comparative value of the importance of the column element ‘b’ to the row element ‘a’ is 1/9.

Step 3: Calculating the consistency rate

In the AHP, the consistency rate (CR) is a metric that assesses the coherence of pairwise comparisons. It quantifies the accuracy and correctness of the valuations made in these comparisons. The CR is calculated using Equation (6) (Saaty, 1987):

$CR = \frac{CI}{RI}$

(6)

where RI is the random index selected from Table 3 based on the number of criteria (n), and CI is the consistency index. It is calculated using Equation (7) [32]:

$CI = \frac{\lambda_{max} - n}{n - 1}$

(7)

where $\lambda_{max}$ is the largest eigenvalue of the comparison matrix.

If the CR is equal to or less than 0.1, it indicates that the valuations and comparisons are reliable and accurate. However, if the CR exceeds 0.1, it signifies a need for modifications to the valuations and comparisons.
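The weighting and consistency check of Equations (6) and (7) can be sketched as follows. This is an illustration only: the 3 × 3 comparison matrix is invented, the function names are hypothetical, power iteration is just one convenient way to approximate the principal eigenvector, and the random index values are the commonly quoted Saaty values rather than the study's Table 3, which is not reproduced here.

```python
# Commonly quoted random index values for n = 3..10 (an assumption of this
# sketch; the study's own Table 3 should take precedence).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector and lambda_max by power iteration."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max

def consistency_ratio(lambda_max, n):
    """Equations (6) and (7): CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Invented reciprocal comparison matrix for three criteria.
comparison = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, lambda_max = ahp_weights(comparison)
cr = consistency_ratio(lambda_max, len(comparison))
print([round(w, 3) for w in weights], round(cr, 3))
```

For this invented matrix the CR comes out well below the 0.1 threshold, so the judgments would be accepted without revision.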

3.3. TOPSIS

TOPSIS is a MCDM technique that was created by Hwang and Yoon in 1981. This model operates under the assumption that the optimal solution is characterized by having the shortest Euclidean distance from the positive ideal solution and the longest distance from the negative ideal solution [33]. The positive ideal solution aims to maximize the positive features (benefits) and minimize the negative features (costs), while the negative ideal solution seeks to maximize the negative features (costs) and minimize the positive features (benefits).

MCDM problems involve comparing and evaluating different information under specific conditions in order to obtain a suitable ranking. To accomplish this objective, a matrix $A \times C$ is constructed using alternatives $A = \{a, b, c, \dots\}$ and criteria $C = \{c_{1}, c_{2}, c_{3}, \dots\}$. This matrix facilitates the ranking of alternatives and enables the evaluation and comparison of different criteria. The TOPSIS algorithm is then implemented by following the subsequent steps [33,34]:

1. Creating the decision matrix based on $m$ alternatives and $n$ criteria.
2. Calculating the normalized decision matrix $r_{ij}$ using Equation (8):

$r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{m} x_{ij}^{2}}}, \quad i = 1, 2, \dots, m; \; j = 1, 2, \dots, n$

(8)

where $x_{ij}$ is the numerical value of alternative $i$ for criterion $j$.
3. Calculating the weighted normalized decision matrix $v_{ij}$ by multiplying the normalized decision matrix by the weight $w_{j}$ allocated to each criterion, according to Equation (9):

$v_{ij} = w_{j} r_{ij}, \quad i = 1, 2, \dots, m; \; j = 1, 2, \dots, n$

(9)
4. Determining the positive ideal solution and the negative ideal solution by using Equations (10) and (11):

$A^{*} = \{v_{1}^{*}, \dots, v_{n}^{*}\}, \quad v_{j}^{*} = \begin{cases} \max_{i} v_{ij}, & j \in J \\ \min_{i} v_{ij}, & j \in J' \end{cases}$

(10)

$A^{-} = \{v_{1}^{'}, \dots, v_{n}^{'}\}, \quad v_{j}^{'} = \begin{cases} \min_{i} v_{ij}, & j \in J \\ \max_{i} v_{ij}, & j \in J' \end{cases}$

(11)

In these equations, $J$ pertains to benefit criteria, whereas $J'$ is related to cost criteria.
5. Calculating the separation values by using the n-dimensional Euclidean distance. The distance of each alternative from the positive ideal solution ($s_{i}^{+}$) is calculated by Equation (12):

$s_{i}^{+} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_{j}^{*})^{2}}, \quad i = 1, 2, \dots, m$

(12)

Similarly, the distance of each alternative from the negative ideal solution ($s_{i}^{-}$) is calculated by Equation (13):

$s_{i}^{-} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_{j}^{'})^{2}}, \quad i = 1, 2, \dots, m$

(13)
6. Calculating the relative proximity to the ideal solution, which can be obtained from Equation (14):

$C_{i}^{*} = \frac{s_{i}^{-}}{s_{i}^{-} + s_{i}^{+}}, \quad 0 \le C_{i}^{*} \le 1$

(14)
7. Ranking the alternatives based on their relative proximity values.

In this regard, the alternative that has the highest $C_{i}^{*}$ value is considered the best alternative.

The present study utilized the AHP method to assign weights to criteria within the TOPSIS model.
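The TOPSIS steps listed above can be sketched compactly, using vector normalization and the Euclidean distance to the two ideal solutions. The decision matrix, the weights, and the choice of which criterion is a cost are invented for illustration, and the function name `topsis` is hypothetical. The sketch assumes that no criterion column is all zeros and that the alternatives are not all identical, since either case would cause a division by zero.

```python
import math

def topsis(decision, weights, is_benefit):
    """Return the relative proximity C_i* of each alternative (row) to the ideal."""
    m, n = len(decision), len(decision[0])
    # Normalize each criterion column by its Euclidean norm, then apply weights.
    norms = [math.sqrt(sum(decision[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[weights[j] * decision[i][j] / norms[j] for j in range(n)] for i in range(m)]
    columns = list(zip(*v))
    # Positive ideal: best value per criterion; negative ideal: worst value.
    best = [max(col) if is_benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if is_benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(n)))
        d_neg = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(n)))
        scores.append(d_neg / (d_neg + d_pos))
    return scores

# Invented example: three map cells, two benefit criteria and one cost criterion.
decision = [
    [0.9, 0.8, 100.0],
    [0.5, 0.4, 500.0],
    [0.1, 0.2, 900.0],
]
weights = [0.5, 0.3, 0.2]           # hypothetical AHP weights summing to 1
is_benefit = [True, True, False]    # third criterion treated as a cost
scores = topsis(decision, weights, is_benefit)
print([round(s, 3) for s in scores])
```

The first cell is best on every criterion and the third is worst on every criterion, so they sit at the two ends of the proximity scale, with the second cell in between.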

3.4. Genetic Algorithm-Based Ensemble Models

Sagi and Rokach (2018) provided a definition of ensemble learning as the practice of generating and merging multiple inducers to address specific machine learning tasks. Dong et al. (2020) stated that ensemble learning strives to seamlessly integrate diverse machine learning algorithms into a unified framework. This enables effective utilization of the complementary information from each integrated algorithm, resulting in improved performance for the overall model.

Ensemble models are becoming more prevalent in different fields and research studies. In this particular study, GA-based boosting and random subspace methods were employed for the FFS mapping. Prior to constructing the ensemble models, feature selection was conducted using the wrapper approach with GA and four classifiers: SVM, DT, NB, and kNN. Once the optimal features were determined by the GA-SVM, GA-DT, GA-NB, and GA-kNN models, these features were utilized to create ensemble models using the boosting and random subspace approaches, utilizing four machine learning models.
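A GA-based wrapper of the kind described above can be sketched as follows. Everything in this block is an assumption made for illustration: the study does not report its GA settings, so the population size, mutation rate, truncation selection, and one-point crossover are placeholders. The wrapped classifier is a 1-nearest-neighbour rule scored by leave-one-out accuracy, standing in for the SVM, DT, NB, and kNN models, and the dataset is synthetic.

```python
import random

def loo_accuracy(xs, ys, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the kept features."""
    feats = [i for i, keep in enumerate(mask) if keep]
    if not feats:
        return 0.0
    hits = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        nearest = min((j for j in range(len(xs)) if j != i),
                      key=lambda j: sum((x[f] - xs[j][f]) ** 2 for f in feats))
        hits += ys[nearest] == y
    return hits / len(xs)

def ga_select(xs, ys, pop_size=12, generations=15, mutation_rate=0.1, seed=0):
    """Search for a feature mask that maximizes the wrapped classifier's accuracy."""
    rng = random.Random(seed)
    n = len(xs[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [True] * n                      # always consider the full feature set
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: loo_accuracy(xs, ys, m), reverse=True)
        parents = ranked[: pop_size // 2]    # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: loo_accuracy(xs, ys, m))

# Synthetic data: features 0 and 2 follow the label, features 1 and 3 are noise.
data_rng = random.Random(1)
ys = [k % 2 for k in range(20)]
xs = [[y + data_rng.gauss(0, 0.1), data_rng.gauss(0, 5.0),
       y + data_rng.gauss(0, 0.1), data_rng.gauss(0, 5.0)] for y in ys]
best = ga_select(xs, ys)
print(best, loo_accuracy(xs, ys, best))
```

Because the best masks are carried unchanged into the next generation, the returned mask is never worse than the full feature set on this fitness measure.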

The boosting approach involves initially constructing a model using a training dataset. Then, the samples that were incorrectly classified by the initial model are identified and presented to the next model for further learning. This process is repeated iteratively to build the final ensemble model. On the other hand, in the random subspace approach, the models are created by randomly selecting features to build the ensemble model. In this study, the four commonly used and well-known classifiers, NB, SVM, kNN, and DT, were used for this purpose.
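The idea of passing misclassified samples on to the next learner can be illustrated with an AdaBoost-style sketch. The study does not state which boosting variant was used, so the reweighting rule, the decision-stump base learner, and the tiny one-dimensional dataset are all assumptions chosen to keep the example short. Labels are coded as +1 and −1.

```python
import math

def stump_predict(x, feature, threshold, sign):
    """A decision stump: predict `sign` above the threshold and `-sign` otherwise."""
    return sign if x[feature] > threshold else -sign

def train_stump(xs, ys, weights):
    """Pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(xs[0])):
        for threshold in sorted({row[feature] for row in xs}):
            for sign in (1, -1):
                error = sum(w for w, row, y in zip(weights, xs, ys)
                            if stump_predict(row, feature, threshold, sign) != y)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, sign)
    return best

def adaboost(xs, ys, rounds=3):
    """Train stumps in sequence, raising the weight of misclassified samples."""
    weights = [1.0 / len(xs)] * len(xs)
    model = []
    for _ in range(rounds):
        error, feature, threshold, sign = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, feature, threshold, sign))
        weights = [w * math.exp(-alpha * y * stump_predict(row, feature, threshold, sign))
                   for w, row, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, f, t, s) for alpha, f, t, s in model)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate.
xs = [[1], [2], [3], [4], [5], [6]]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

On this toy set a single stump misclassifies two of the six samples, while the weighted vote of three stumps classifies all of them correctly, which is the effect the iterative procedure is meant to achieve.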

The NB algorithm is a straightforward classifier that relies on the Bayes’ theorem [35,36,37]. In the NB classification, it assumes that features are independent and aims to determine posterior probabilities. The DT algorithm utilizes a flowchart-like structure where samples are classified by traversing from the root node to the leaf nodes [38]. The kNN algorithm is a classification technique that determines the class for a new sample by analyzing a specified number of most similar neighbors or samples [39]. This method requires a distance criterion, such as the Euclidean distance, to measure similarity between samples. The SVM algorithm seeks to identify the optimal hyperplane that can separate data from two classes with the maximum margin [24].
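To make the kNN description concrete, and to show how it fits into a random subspace ensemble with averaged outputs, a minimal sketch follows. The toy data, the number of ensemble members, the subspace size, and the value of k are all hypothetical, and the function names are invented for this illustration.

```python
import random

def knn_probability(train_x, train_y, x, k=3):
    """Fraction of the k nearest training samples (Euclidean distance) labelled 1."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    nearest = [label for _, label in ranked[:k]]
    return sum(nearest) / len(nearest)

def random_subspace_probability(train_x, train_y, x,
                                n_models=10, n_features=2, k=3, seed=0):
    """Average the kNN outputs of models trained on random feature subsets."""
    rng = random.Random(seed)
    n_total = len(train_x[0])
    outputs = []
    for _ in range(n_models):
        feats = rng.sample(range(n_total), n_features)   # one random subspace
        sub_train = [[row[f] for f in feats] for row in train_x]
        sub_x = [x[f] for f in feats]
        outputs.append(knn_probability(sub_train, train_y, sub_x, k))
    return sum(outputs) / len(outputs)

# Toy data: label 0 clusters near the origin, label 1 near (1, 1, 1, 1).
train_x = [
    [0.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.1, 0.0], [0.0, 0.1, 0.0, 0.1],
    [1.0, 1.0, 1.0, 1.0], [0.9, 1.0, 0.9, 1.0], [1.0, 0.9, 1.0, 0.9],
]
train_y = [0, 0, 0, 1, 1, 1]
high = random_subspace_probability(train_x, train_y, [0.95, 0.95, 0.95, 0.95])
low = random_subspace_probability(train_x, train_y, [0.05, 0.05, 0.05, 0.05])
print(high, low)  # 1.0 0.0
```

Averaging the member outputs yields a continuous score between 0 and 1 for each location, which is the kind of value a susceptibility map needs.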

3.5. Model Accuracy Assessment

In this study, the accuracy of the models was assessed using the area under the receiver operating characteristic curve (AUROCC). The x-axis and y-axis values of this curve were obtained using Equations (15) and (16), respectively [40,41]:

$$x\text{-axis} = 1 - \text{specificity} = \text{false positive rate} = \frac{FP}{FP + TN} \tag{15}$$

$$y\text{-axis} = \text{sensitivity} = \text{true positive rate} = \frac{TP}{TP + FN} \tag{16}$$

where TP represents correctly predicted positive samples (occurrence); TN represents correctly predicted negative samples (nonoccurrence); FP represents negative samples incorrectly classified as positive; and FN represents positive samples incorrectly classified as negative by the model. In this study, the AUROCC values were used to assess the predictive capabilities of the models in distinguishing between locations with forest fire incidence and those without.
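A minimal sketch of how Equations (15) and (16) produce an ROC curve and its area, assuming binary labels (1 = fire, 0 = no fire), a susceptibility score per location, and at least one sample of each class. The function names and the toy data are illustrative and do not come from the study.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs,
    one per threshold, scanning thresholds from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        # x = FP / (FP + TN), y = TP / (TP + FN)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y1 + y0) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six locations.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(auc(roc_points(y_true, scores)), 4))  # 0.8889
```

In this toy case, eight of the nine fire/no-fire pairs are ranked correctly, which is why the area equals 8/9.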

4. Results

Table 4 summarizes the association between forest fires and each factor according to the FR model. The altitude classes of 326–608 m and 608–721 m exhibit the highest FR values. For the other factors, the highest FR values are associated with a slope of 5–10 degrees (FR = 1.3 in Table 4), a north-facing aspect, distances to roads between 300 and 600 m, a population density of 1.24–1.81 persons/km², land use for field crops, wind speeds of 7–7.5 m/s, rainfall depths of 475–525 mm, temperatures exceeding 27 °C, NDVI values greater than 0.32, TWI values between −2.7 and 0.26, and solar radiation intensities below 2000 W/m².
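The FR values for a factor can be checked with a short calculation. The sketch below assumes the usual frequency-ratio definition (the share of fires falling in a class divided by the share of pixels in that class); under that assumption it reproduces the altitude FR values listed in Table 4. The function name is illustrative.

```python
def frequency_ratio(fires, pixels):
    """FR per class = (fires in class / all fires) / (pixels in class / all pixels)."""
    total_fires = sum(fires)
    total_pixels = sum(pixels)
    return [(f / total_fires) / (p / total_pixels)
            for f, p in zip(fires, pixels)]


# Altitude classes from Table 4.
altitude_classes = ["326-608", "608-721", "721-819", "819-922", ">922"]
altitude_pixels = [6803, 17019, 27722, 26539, 17495]
altitude_fires = [6, 15, 12, 10, 6]

fr = frequency_ratio(altitude_fires, altitude_pixels)
for name, value in zip(altitude_classes, fr):
    print(f"{name:>8} m  FR = {value:.2f}")
```

An FR above 1 means the class contains a larger share of fires than its share of the area would suggest.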

Following the computation of PR values for each factor, these values were used in creating the AHP pairwise comparison matrix, as represented in Table 5 for the FR-AHP model. To illustrate, the PR value for the altitude factor is 1.93, while the PR value for the slope factor is 1.60. By dividing 1.93 by 1.60, we determine that the relative importance of altitude compared to slope is 1.20.

To determine the weights in the AHP, the pairwise comparison matrix must first be normalized (Table 6). The weights were then calculated as the arithmetic mean of each row of the normalized matrix. The AHP results indicated that land use and NDVI had the highest weights, 0.22 and 0.13, respectively. Conversely, distance to road had the lowest weight (0.03), followed by slope, population density, and solar radiation at 0.05 each (Table 6). To generate the FFS map using the FR-AHP model within the GIS environment, each subclass of the factors (Table 4) was assigned its FR value. The criteria weights obtained through the FR-AHP method (Table 6) were then used in a weighted overlay analysis, which produced the FFS map.
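The two steps described above (building the pairwise comparison matrix from PR ratios, then normalizing and averaging) can be sketched as follows. The PR values are taken from Table 4; the variable names are illustrative, and the code is a sketch of the standard column-normalization procedure rather than the authors' own script.

```python
# PR values per factor, as listed in Table 4.
pr = {
    "Altitude": 1.93, "Slope": 1.60, "Aspect": 3.05,
    "Distance to road": 1.00, "Population density": 1.61,
    "Land use": 6.47, "Wind": 2.52, "Rainfall": 1.79,
    "Temperature": 1.97, "NDVI": 3.85, "TWI": 2.58,
    "Solar radiation": 1.39,
}
names = list(pr)
n = len(names)

# Pairwise comparison matrix: entry (row, col) = PR_row / PR_col.
matrix = [[pr[r] / pr[c] for c in names] for r in names]

# Normalize each column to sum to 1, then average across each row.
col_sums = [sum(row[j] for row in matrix) for j in range(n)]
normalized = [[row[j] / col_sums[j] for j in range(n)] for row in matrix]
weights = {name: sum(row) / n for name, row in zip(names, normalized)}

for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:<20} {w:.2f}")
```

Because every entry is an exact ratio of two PR values, the matrix is perfectly consistent and each weight reduces to the factor's PR divided by the sum of all PR values.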

In the AHP-TOPSIS model, the first step was to establish the PWCM of the AHP (Table 7), in which the preferences of the decision maker were expressed on a scale ranging from 1 to 9. The values in the matrix were then normalized to obtain the final weights. The AHP model had a low inconsistency rate of 0.03, indicating that the judgments in the pairwise comparison matrix were consistent. According to Table 7, the land use factor carried the highest weight (0.257), whereas slope had the lowest weight (0.015), followed by TWI (0.02).
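The consistency check mentioned above can be sketched as follows, assuming Saaty's usual definition: CR = CI / RI, with CI = (λ_max − n) / (n − 1) and RI taken from Table 3. Here λ_max is approximated from the row-average weights rather than by an exact eigenvalue computation, and the two 3 × 3 matrices are hypothetical examples, not the matrix in Table 7.

```python
# Random consistency index values from Table 3.
RI = {1: 0.00, 2: 0.00, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32,
      8: 1.41, 9: 1.45, 10: 1.49, 11: 1.51, 12: 1.54, 13: 1.56}


def consistency_ratio(matrix):
    """Approximate consistency ratio of a reciprocal pairwise matrix (n >= 3)."""
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    weights = [sum(row[j] / col_sums[j] for j in range(n)) / n
               for row in matrix]
    # lambda_max estimated as the mean of (A w)_i / w_i.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RI[n]


consistent = [[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]
slightly_off = [[1, 2, 5], [0.5, 1, 2], [0.2, 0.5, 1]]
print(round(consistency_ratio(consistent), 4))
print(round(consistency_ratio(slightly_off), 4))
```

A ratio below 0.1 is conventionally treated as acceptable, which is the sense in which a value of 0.03 indicates consistent judgments.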

As shown in Table 7, the AHP technique was used to determine the weights of the 12 factors related to forest fires. These weights were then used to build a TOPSIS model in MATLAB. To display the values of the 12 factors and the other parameters derived from the AHP-TOPSIS technique, a random sample of 5000 points was selected from a total of 31,654 points (Table 8). In the subsequent step, the criteria values were rescaled and multiplied by the weights obtained from the AHP method. The primary values of the 12 factors were normalized using Equations (7) and (8). The values of A⁺ and A⁻ were calculated using Equations (9) and (10) (Table 9). Then, Equations (11) and (12) were employed to determine the separation of each alternative from the positive ideal solution ($s_{i}^{+}$) and the negative ideal solution ($s_{i}^{-}$), respectively (Table 10). Following this, Equation (13) was used to compute the relative proximity to the ideal solution ($C_{i}^{*}$) and rank the alternatives (Table 10). The weights generated by this technique were incorporated into the attribute table of the combined layer of factors. Consequently, FFS maps were created in the GIS environment.
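The sequence of TOPSIS steps can be sketched as below. This is the textbook form of the method (vector normalization, weighting, ideal solutions, Euclidean separations, relative closeness) and is assumed to correspond to Equations (7) to (13); the authors' MATLAB code is not available here. The three sample rows come from Table 8 and the weights from Table 7, but the choice of which criteria count as "benefit" (higher value means higher susceptibility) is a hypothetical assumption made only for illustration.

```python
import math


def topsis(rows, weights, benefit):
    """Return the relative closeness C_i* of each alternative.
    Assumes the alternatives are not all identical."""
    n_crit = len(weights)
    # Vector normalization, then multiplication by the criterion weights.
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(n_crit)]
    v = [[weights[j] * r[j] / norms[j] for j in range(n_crit)] for r in rows]
    cols = list(zip(*v))
    # Positive (A+) and negative (A-) ideal solutions.
    a_pos = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    a_neg = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    closeness = []
    for row in v:
        s_pos = math.sqrt(sum((x - a) ** 2 for x, a in zip(row, a_pos)))
        s_neg = math.sqrt(sum((x - a) ** 2 for x, a in zip(row, a_neg)))
        closeness.append(s_neg / (s_pos + s_neg))
    return closeness


# Temperature, NDVI, distance to road for points 1-3 of Table 8.
rows = [[23.01, 0.37, 30.00], [26.10, 0.28, 403.61], [24.25, 0.20, 127.28]]
weights = [0.059, 0.193, 0.13]      # AHP weights from Table 7
benefit = [True, True, False]       # hypothetical direction of each criterion

for i, c in enumerate(topsis(rows, weights, benefit), start=1):
    print(f"point {i}: C* = {c:.3f}")
```

Alternatives are ranked by descending closeness, so a point that matches the positive ideal on every criterion scores 1 and one that matches the negative ideal scores 0.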

Figure 4 and Figure 5 depict the mean and best fitness values obtained from the GA-based wrapper feature selection, which was run with four classifiers (DT, kNN, NB, and SVM). The root mean square error (RMSE) was used as the performance measure to assess the fitness of each feasible solution. The GA-based feature selection used 200 generations and 20 solutions per generation. Figure 4 shows that the GA-DT model achieved the best fitness value (RMSE = 0), indicating excellent performance. The GA-kNN model, in contrast, reached a stable fitness value after the 21st generation, with an associated RMSE of 0.2178. Feature selection with the GA-DT model identified land use, distance to roads, NDVI, wind speed, TWI, and temperature as the most influential factors for forest fire incidence. For the GA-kNN model, the most influential factors were NDVI, aspect, and temperature.

The GA-NB model for feature selection reached an RMSE value of 0.2583 after the 18th generation, as shown in Figure 6. In this model, population density, solar radiation, rainfall, NDVI, aspect, and temperature were identified as the significant factors for the modeling process.

Figure 7 demonstrates that the GA-SVM model reached a stable fitness value after the 28th generation, with the RMSE stabilizing at 0.2215. The findings indicate that population density, distance to roads, solar radiation, NDVI, wind speed, aspect, and temperature were the most significant factors influencing forest fire incidence by GA-SVM.
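A wrapper-style GA for feature selection can be sketched as follows. The population size (20) and number of generations (200) follow the settings reported above, and RMSE is used as the fitness; everything else is an assumption for illustration: the 1-nearest-neighbour classifier with leave-one-out evaluation, truncation selection, one-point crossover, the mutation rate, and the synthetic data in which only the first feature is informative.

```python
import math
import random


def loo_rmse(X, y, mask):
    """Leave-one-out RMSE of a 1-nearest-neighbour classifier
    restricted to the features where mask == 1."""
    feats = [j for j, keep in enumerate(mask) if keep]
    if not feats:
        return 1.0  # no feature selected: treat as the worst case
    sq_err = 0.0
    for i, xi in enumerate(X):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((X[k][j] - xi[j]) ** 2 for j in feats),
        )
        sq_err += (y[nearest] - y[i]) ** 2
    return math.sqrt(sq_err / len(X))


def ga_select(X, y, pop_size=20, generations=200, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:            # avoid re-evaluating a known mask
            cache[key] = loo_rmse(X, y, mask)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g
                             for g in child])
        pop = children
    best = min(pop, key=fitness)
    return best, fitness(best)


# Synthetic data: feature 0 tracks the label, features 1-4 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(30)]
X = [[label + data_rng.uniform(-0.1, 0.1)]
     + [data_rng.random() for _ in range(4)] for label in y]

mask, rmse = ga_select(X, y)
print("selected mask:", mask, "RMSE:", round(rmse, 4))
```

Each individual is a bit mask over the candidate factors, so the selected factors are simply the positions holding a 1 in the best mask.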

Once the optimal factors had been determined for each model, six data-driven models were employed: DT, NB, kNN, SVM, GA-based boosting, and GA-based random subspace. Within the GA-based boosting and GA-based random subspace models, four integrated models were created: GA-DT, GA-SVM, GA-NB, and GA-kNN.

Figure 8 and Figure 9 show the AUROCCs and the AUC values for the eight models in both the training and testing runs. In both runs, the GA-based ensemble models outperformed the MCDM models in terms of the AUC. In the training runs, the DT and boosting models had the highest AUC values, followed by the random subspace, kNN, SVM, NB, FR-AHP, and AHP-TOPSIS models (Figure 8). In the testing runs, the boosting and random subspace models exhibited the highest AUC values, while the single models and MCDM techniques displayed lower AUC values (Figure 9).

The FFS maps produced by the FR-AHP, AHP-TOPSIS, boosting, and random subspace models are depicted in Figure 10. The fire susceptibility of the entire area was classified into five classes: very low, low, moderate, high, and very high. Figure 10 shows that the area allocated to each class varies among the models. Figure 11 shows the percentage of each susceptibility class for the four models.

The FFS maps in Figure 10 were classified into five susceptibility zones using the natural breaks raster classification technique in the GIS environment. This method takes into account the natural grouping of the data and places the class breaks where the differences between groups are largest. It was chosen because it is data-specific and therefore suited to zoning the forest fire susceptibility maps.
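The idea behind natural breaks can be illustrated with a small dynamic-programming sketch. It assumes the Jenks/Fisher criterion (choose class boundaries that minimize the total within-class squared deviation); the exact algorithm used by the GIS software is not stated in the text, and the susceptibility scores below are hypothetical.

```python
def natural_breaks(values, n_classes):
    """Return the upper bound of each class for the partition of the
    sorted values that minimizes within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):  # squared deviation of data[i:j]
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting data[:j] into c classes.
    best = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = best[c - 1][i] + cost(i, j)
                if cand < best[c][j]:
                    best[c][j] = cand
                    split[c][j] = i
    # Walk back through the split table to recover the class bounds.
    bounds, j = [], n
    for c in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[c][j]
    return bounds[::-1]


# Hypothetical susceptibility scores for twelve cells.
scores = [0.05, 0.07, 0.10, 0.30, 0.32, 0.35,
          0.60, 0.62, 0.80, 0.83, 0.95, 0.97]
print(natural_breaks(scores, 5))  # [0.1, 0.35, 0.62, 0.83, 0.97]
```

The five bounds returned correspond to the very low, low, moderate, high, and very high classes; this exhaustive version is only practical for small samples, not for a full raster.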

5. Discussion

Forests play a crucial role in sustaining human life and provide various essential benefits such as increased soil infiltrability, enhanced water storage capacity, prevention of soil erosion, purification of polluted air, and serving as habitats for diverse animal and plant species. However, forests face significant destruction worldwide due to frequent fires caused by factors such as global warming, human negligence, and intentional burning for land clearance. The loss of forests has severe consequences for humans, animals, plants, and ecosystem processes and services. It is therefore imperative to prevent forest fires to safeguard these invaluable forest systems, processes, and resources. In the unfortunate event of a fire, immediate measures must be taken to prevent its spread.

Understanding the factors contributing to forest fire occurrence is crucial for the development of fire susceptibility maps. Numerous feature selection methods have been proposed, with wrapper-based feature selection being one of the most important approaches. In this study, four feature selection algorithms, namely GA-DT, GA-kNN, GA-SVM, and GA-NB, were employed to identify the influential factors in forest fire susceptibility mapping. NDVI and temperature were selected by all four algorithms. Previous studies, including those by Fiorucci et al. (2007), Gonzalez-Alonso et al. (1997), Lasaponara (2005), and Illera et al. (1996), have also used NDVI variations to identify fire-prone areas [25,42,43,44]. Similarly, in our study, NDVI was identified as a critical indicator of fire occurrence, showing a direct correlation between increased NDVI values and the number of fire incidents. Although NDVI can identify areas with potential fuel sources for burning, it is not an ideal substitute for directly measuring fuel quantity or quality. Temperature was another prominent factor, in line with previous research [26,45]. Temperature plays a significant role in fire behavior: higher temperatures increase the chances of fire ignition and make fires more difficult to control once they have started.

Identifying forested areas at high risk of fire and implementing preventive measures are essential for effective fire management. In this context, our study compared the FFS mapping capabilities of two knowledge-based models (FR-AHP and AHP-TOPSIS) with two GA-based ensemble data-driven models (boosting and random subspace, RS). The ensemble models were developed using the NB, kNN, DT, and SVM algorithms to predict areas highly susceptible to forest fires in the Northern Mazar District in Jordan. In recent years, both MCDM techniques and data-driven models have been widely employed by researchers for hazard mapping. Some studies, such as those conducted by Zhu et al. (2018) and Arabameri et al. (2018), have reported superior performance of MCDM techniques over data-driven models [46,47], while others, such as Chicas et al. (2022) and Nachappa et al. (2020), have found data-driven models to be more effective [48,49]. Therefore, to obtain a reliable forest fire susceptibility map, it is crucial to evaluate and compare both approaches. In our study, data-driven models performed better than expert-based models, consistent with the findings of Chicas et al. (2022) and Nachappa et al. (2020).

6. Conclusions

This research highlights the importance of employing advanced ML models to accurately map areas in the Northern Mazar District in Jordan that are highly prone to forest fires. To create the FFS models, a total of 140 locations were chosen: 70 locations that had experienced forest fires in the past and 70 locations with no known instances of forest fires. Twelve factors were considered in the analysis: elevation, slope, aspect, land use, distance to roads, population density, wind speed, rainfall, temperature, NDVI, TWI, and solar radiation. A comparison of the knowledge-based models (FR-AHP and AHP-TOPSIS) with the GA-based ensemble models (boosting and random subspace, RS), developed using the NB, kNN, DT, and SVM algorithms, showed that the latter performed better than the former. This suggests that ML algorithms capable of capturing complex relationships between multiple variables are important for this task. This study therefore concludes that data-driven models are more precise than knowledge-based models for FFS mapping.

To gain a deeper understanding of this study’s findings, it is important to consider the variables employed in the modeling and FFS mapping. The wrapper approach used for feature selection showed that NDVI and temperature played a significant role in determining an area’s vulnerability to forest fires. NDVI is a widely used index of the density and health of vegetation. For FFS, NDVI is a vital factor because, in this study, areas with dense vegetation cover were more prone to forest fires than those with sparse vegetation cover (the highest FR occurred for NDVI values above 0.32). Temperature is another critical variable, as higher temperatures increase the likelihood of fire ignition and make fires harder to control once they start.

The produced FFS maps have the potential to assist in the prevention and control of forest fires by recognizing areas at high risk and implementing precautionary measures before fires occur. For example, pre-establishing rescue and relief stations can considerably decrease response times, enabling firefighters to rapidly contain and extinguish fires. Additionally, the installation of water storage tanks containing ample amounts of water in high-risk regions can serve as a firefighting water source. By placing wireless sensors in wooded regions, fires can be detected, and authorities can be alerted promptly, resulting in quicker response times and more efficient firefighting operations.

Author Contributions

Conceptualization, A.R.A.-S.; data curation, I.H. and S.M.P.; formal analysis, I.H.; investigation, S.M.P., A.A.-F. and T.H.; methodology, A.R.A.-S. and S.O.; resources, S.M.P. and S.S.S.; software, A.R.A.-S., I.H., A.A.-F. and S.O.; supervision, T.H.; validation, I.H., I.E., T.H. and S.S.S.; visualization, S.O., I.E. and S.S.S.; writing—original draft preparation, A.R.A.-S.; writing—review and editing, A.A.-F., I.E., T.H. and S.S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to acknowledge the support of the Deputy for Research and Innovation, Ministry of Education, Saudi Arabia, for this research through a grant (NU/IFC/2/SERC/-/12) under the Institutional Funding Committee at Najran University, Saudi Arabia.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Suryabhagavan, K.; Alemu, M.; Balakrishnan, M. GIS-based multi-criteria decision analysis for forest fire susceptibility mapping: A case study in Harenna forest, southwestern Ethiopia. Trop. Ecol. 2016, 57, 33–43.
2. Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Aryal, J. Forest fire susceptibility and risk mapping using social/infrastructural vulnerability and environmental variables. Fire 2019, 2, 50.
3. Sari, F. Forest fire susceptibility mapping via multi-criteria decision analysis techniques for Mugla, Turkey: A comparative analysis of VIKOR and TOPSIS. For. Ecol. Manag. 2021, 480, 118644.
4. Vadrevu, K.P.; Eaturu, A.; Badarinath, K. Fire risk evaluation using multicriteria analysis—A case study. Environ. Monit. Assess. 2010, 166, 223–239.
5. Abedi Gheshlaghi, H. Using GIS to develop a model for forest fire risk mapping. J. Indian Soc. Remote Sens. 2019, 47, 1173–1185.
6. Collins, B.; Rhoades, C.; Battaglia, M.; Hubbard, R. The effects of bark beetle outbreaks on forest development, fuel loads and potential fire behavior in salvage logged and untreated lodgepole pine forests. For. Ecol. Manag. 2012, 284, 260–268.
7. Eskandari, S.; Miesel, J.R.; Pourghasemi, H.R. The temporal and spatial relationships between climatic parameters and fire occurrence in northeastern Iran. Ecol. Indic. 2020, 118, 106720.
8. Pourghasemi, H.R.; Gayen, A.; Lasaponara, R.; Tiefenbacher, J.P. Application of learning vector quantization and different machine learning techniques to assessing forest fire influence factors and spatial modelling. Environ. Res. 2020, 184, 109321.
9. Liu, X. Real-world data for the drug development in the digital era. J. Artif. Intell. Technol. 2022, 2, 42–46.
10. Tiwari, A.; Shoab, M.; Dixit, A. GIS-based forest fire susceptibility modeling in Pauri Garhwal, India: A comparative assessment of frequency ratio, analytic hierarchy process and fuzzy modeling techniques. Nat. Hazards 2021, 105, 1189–1230.
11. Coban, H.; Erdin, C. Forest fire risk assessment using GIS and AHP integration in Bucak forest enterprise, Turkey. Appl. Ecol. Environ. Res. 2020, 18, 1567–1583.
12. Mahmood, T.; Ali, Z.; Naeem, M. Aggregation operators and CRITIC-VIKOR method for confidence complex q-rung orthopair normal fuzzy information and their applications. CAAI Trans. Intell. Technol. 2023, 8, 40–63.
13. Kamran, K.V.; Omrani, K.; Khosroshahi, S.S. Forest fire risk assessment using multi-criteria analysis: A case study Kaleybar forest. In Proceedings of the International Conference on Agriculture, Environment and Biological Sciences, Antalya, Turkey, 4–5 June 2014.
14. Li, P.; Gu, H.; Yin, L.; Li, B. Research on trend prediction of component stock in fuzzy time series based on deep forest. CAAI Trans. Intell. Technol. 2022, 7, 617–626.
15. Tehrany, M.S.; Jones, S.; Shabani, F.; Martínez-Álvarez, F.; Tien Bui, D. A novel ensemble modeling approach for the spatial prediction of tropical forest fire susceptibility using LogitBoost machine learning classifier and multi-source geospatial data. Theor. Appl. Climatol. 2019, 137, 637–653.
16. Moayedi, H.; Mehrabi, M.; Kalantar, B.; Abdullahi Mu’azu, M.; Rashid, A.S.A.; Foong, L.K.; Nguyen, H. Novel hybrids of adaptive neuro-fuzzy inference system (ANFIS) with several metaheuristic algorithms for spatial susceptibility assessment of seismic-induced landslide. Geomat. Nat. Hazards Risk 2019, 10, 1879–1911.
17. Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Choi, S.-M. Ubiquitous GIS-based forest fire susceptibility mapping using artificial intelligence methods. Remote Sens. 2020, 12, 1689.
18. Gholamnia, K.; Gudiyangada Nachappa, T.; Ghorbanzadeh, O.; Blaschke, T. Comparisons of diverse machine learning approaches for wildfire susceptibility mapping. Symmetry 2020, 12, 604.
19. Xie, Y.; Peng, M. Forest fire forecasting using ensemble learning approaches. Neural Comput. Appl. 2019, 31, 4541–4550.
20. Kalantar, B.; Ueda, N.; Idrees, M.O.; Janizadeh, S.; Ahmadi, K.; Shabani, F. Forest fire susceptibility prediction based on machine learning models with resampling algorithms on remote sensing data. Remote Sens. 2020, 12, 3682.
21. Meng, J.; Li, Y.; Liang, H.; Ma, Y. Single-image dehazing based on two-stream convolutional neural network. J. Artif. Intell. Technol. 2022, 2, 100–110.
22. Zheng, M.; Zhi, K.; Zeng, J.; Tian, C.; You, L. A hybrid CNN for image denoising. J. Artif. Intell. Technol. 2022, 2, 93–99.
23. Shakeel, N.; Shakeel, S. Context-Free Word Importance Scores for Attacking Neural Networks. J. Comput. Cogn. Eng. 2022, 1, 187–192.
24. Vishwanathan, S.; Murty, M.N. SSVM: A simple SVM algorithm. In Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat. No. 02CH37290), Honolulu, HI, USA, 12–17 May 2002; pp. 2393–2398.
25. Lasaponara, R. Inter-comparison of AVHRR-based fire susceptibility indicators for the Mediterranean ecosystems of southern Italy. Int. J. Remote Sens. 2005, 26, 853–870.
26. Mabdeh, A.N.; Al-Fugara, A.k.; Khedher, K.M.; Mabdeh, M.; Al-Shabeeb, A.R.; Al-Adamat, R. Forest Fire Susceptibility Assessment and Mapping Using Support Vector Regression and Adaptive Neuro-Fuzzy Inference System-Based Evolutionary Algorithms. Sustainability 2022, 14, 9446.
27. Różycka, M.; Migoń, P.; Michniewicz, A. Topographic Wetness Index and Terrain Ruggedness Index in geomorphic characterisation of landslide terrains, on examples from the Sudetes, SW Poland. Z. Für Geomorphol. Suppl. Issues 2017, 61, 61–80.
28. Masinda, M.M.; Li, F.; Qi, L.; Sun, L.; Hu, T. Forest fire risk estimation in a typical temperate forest in Northeastern China using the Canadian forest fire weather index: Case study in autumn 2019 and 2020. Nat. Hazards 2022, 111, 1085–1101.
29. Mohajane, M.; Costache, R.; Karimi, F.; Pham, Q.B.; Essahlaoui, A.; Nguyen, H.; Laneve, G.; Oudija, F. Application of remote sensing and machine learning algorithms for forest fire mapping in a Mediterranean area. Ecol. Indic. 2021, 129, 107869.
30. Rabby, Y.W.; Ishtiaque, A.; Rahman, M.S. Evaluating the effects of digital elevation models in landslide susceptibility mapping in Rangamati district, Bangladesh. Remote Sens. 2020, 12, 2718.
31. Golden, B.L.; Wasil, E.A.; Harker, P.T. The analytic hierarchy process. Appl. Stud. Berl. Heidelb. 1989, 2, 1–273.
32. Saaty, R.W. The analytic hierarchy process—What it is and how it is used. Math. Model. 1987, 9, 161–176.
33. Hwang, C.-L.; Yoon, K. Methods for multiple attribute decision making. In Multiple Attribute Decision Making; Springer: Berlin/Heidelberg, Germany, 1981; pp. 58–191.
34. Tong, L.-I.; Wang, C.-H.; Chen, H.-C. Optimization of multiple responses using principal component analysis and technique for order preference by similarity to ideal solution. Int. J. Adv. Manuf. Technol. 2005, 27, 407–414.
35. Chen, S.; Webb, G.I.; Liu, L.; Ma, X. A novel selective naïve Bayes algorithm. Knowl.-Based Syst. 2020, 192, 105361.
36. Fang, B.; Jiang, M.; Shen, J.; Stenger, B. Deep generative inpainting with comparative sample augmentation. J. Comput. Cogn. Eng. 2022, 1, 174–180.
37. Tang, J.; Xue, Y.; Wang, Z.; Hu, S.; Gong, T.; Chen, Y.; Zhao, H.; Xiao, L. Bayesian estimation-based sentiment word embedding model for sentiment analysis. CAAI Trans. Intell. Technol. 2022, 7, 144–155.
38. Kotsiantis, S.B. Decision trees: A recent overview. Artif. Intell. Rev. 2013, 39, 261–283.
39. Cheng, D.; Zhang, S.; Deng, Z.; Zhu, Y.; Zong, M. kNN algorithm with data-driven k value. In Proceedings of the International Conference on Advanced Data Mining and Applications, Guilin, China, 19–21 December 2014; pp. 499–512.
40. Hanley, J.A.; McNeil, B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143, 29–36.
41. Wang, X.; Cheng, M.; Eaton, J.; Hsieh, C.-J.; Wu, S.F. Fake node attacks on graph convolutional networks. J. Comput. Cogn. Eng. 2022, 1, 165–173.
42. Fiorucci, P.; Gaetani, F.; Lanorte, A.; Lasaponara, R. Dynamic fire danger mapping from satellite imagery and meteorological forecast data. Earth Interact. 2007, 11, 1–17.
43. Gonzalez-Alonso, F.; Cuevas, J.; Casanova, J.; Calle, A.; Illera, P. A forest fire risk assessment using NOAA AVHRR images in the Valencia area, eastern Spain. Int. J. Remote Sens. 1997, 18, 2201–2207.
44. Illera, P.; Fernandez, A.; Delgado, J. Temporal evolution of the NDVI as an indicator of forest fire danger. Int. J. Remote Sens. 1996, 17, 1093–1105.
45. Al-Fugara, A.k.; Mabdeh, A.N.; Ahmadlou, M.; Pourghasemi, H.R.; Al-Adamat, R.; Pradhan, B.; Al-Shabeeb, A.R. Wildland fire susceptibility mapping using support vector regression and adaptive neuro-fuzzy inference system-based whale optimization algorithm and simulated annealing. ISPRS Int. J. Geo-Inf. 2021, 10, 382.
46. Zhu, A.-X.; Miao, Y.; Wang, R.; Zhu, T.; Deng, Y.; Liu, J.; Yang, L.; Qin, C.-Z.; Hong, H. A comparative study of an expert knowledge-based model and two data-driven models for landslide susceptibility mapping. Catena 2018, 166, 317–327.
47. Arabameri, A.; Rezaei, K.; Pourghasemi, H.R.; Lee, S.; Yamani, M. GIS-based gully erosion susceptibility mapping: A comparison among three data-driven models and AHP knowledge-based technique. Environ. Earth Sci. 2018, 77, 628.
48. Chicas, S.D.; Østergaard Nielsen, J.; Valdez, M.C.; Chen, C.-F. Modelling wildfire susceptibility in Belize’s ecosystems and protected areas using machine learning and knowledge-based methods. Geocarto Int. 2022, 37, 15823–15846.
49. Nachappa, T.G.; Piralilou, S.T.; Gholamnia, K.; Ghorbanzadeh, O.; Rahmati, O.; Blaschke, T. Flood susceptibility mapping with machine learning, multi-criteria decision analysis and ensemble using Dempster Shafer Theory. J. Hydrol. 2020, 590, 125275.

Figure 1. Study area (Northern Mazar, Jordan).

Figure 2. Maps of the criteria employed in the study area: (a) DEM, (b) slope, (c) aspect, (d) land use, (e) distance to roads (f) population density, (g) wind speed, (h) rainfall, (i) temperature, (j) NDVI, (k) TWI, and (l) radiation.

Figure 3. Flowchart of the forest fire susceptibility method used in this study.

Figure 4. ANN-based GA convergence plot for feature selection based on DT.

Figure 5. SVM-based GA convergence plot for feature selection based on KNN.

Figure 6. ANN-based GA convergence plot for feature selection based on NB.

Figure 7. ANN-based GA convergence plot for feature selection based on SVM.

Figure 8. The AUROCCs for the eight models under study in the training phase.

Figure 9. The AUROCCs for the eight models under study in the testing phase.

Figure 10. The forest fire susceptibility maps produced by (a) FR-AHP, (b) AHP-TOPSIS, (c) boosting and (d) random subspace models.

Figure 11. The percentage of susceptibility classes for four models.

Table 1. The source of the factors and their maximum and minimum values.

Criteria	Extraction	Min	Max
Land use	Supervised classification of Landsat 8 Operational Land Imager (OLI) images for the year 2023 using the maximum likelihood algorithm.	Bare Rocks	Vegetables
NDVI	Landsat 8 OLI images that were downloaded from the United States Geological Survey (USGS) website.	<0.17	>0.32
Rainfall	Digitized from an annual precipitation; it is a map prepared by the Jordanian Meteorological Department (JMD) and converted into raster format using the spatial analyst tools in ArcGIS 10.8.	450	550
TWI	Describes the topography of the area and other related conditions that affect the spatial patterns of soil texture and soil moisture, based on the equation TWI = ln(CA/slope)	<−5.8	>0.26
Temperatures	Data on climatic variables obtained from six meteorological stations with quarter-century records were interpolated with the inverse distance weighted (IDW) interpolation method in the ArcGIS 10.8 environment, using the spatial analyst tools.	<20	>27
Wind Speed	Data on climatic variables obtained from six meteorological stations with quarter-century records were interpolated with the inverse distance weighted (IDW) interpolation method in the ArcGIS 10.8 environment using the spatial analyst tools.	7 (m/s)	8
Population Density	The population density data obtained by dividing the population by the size of the area. Thus, population density = number of people/land area in the ArcGIS 10.8 environment	0.25	>1.81

Table 2. The ‘1–9’ scale for AHP preference (Saaty, 1987).

Intensity of Importance	Definition
1	Equal importance
3	Moderate importance
5	Strong importance
7	Very strong importance
9	Extreme importance
2, 4, 6, 8	Intermediate importance

Table 3. Scales for the random consistency index values.

No. of Criteria	1	2	3	4	5	6	7	8	9	10	11	12	13
RI	0.00	0.00	0.58	0.90	1.12	1.24	1.32	1.41	1.45	1.49	1.51	1.54	1.56

Table 4. Association of forest fires with each factor under study according to the FR model.

Class	No. of Pixels in Domain	No. of Fires	FR	PR
Altitude (m)				1.93
326–608	6803	6	1.72
608–721	17,019	15	1.72
721–819	27,722	12	0.84
819–922	26,539	10	0.73
>922	17,495	6	0.67
Slope angle (degree)				1.60
<5	10,039	5	0.97
5–10	25,419	17	1.3
10–15	28,153	17	1.18
15–20	20,036	6	0.58
>20	11,808	4	0.66
Slope aspect				3.05
Flat	31	0	0
North	17,265	17	1.92
Northeast	18,567	15	1.58
East	7650	1	0.25
Southeast	3829	1	0.51
South	8093	1	0.24
Southwest	13,987	4	0.56
West	13,001	5	0.75
Northwest	13,155	5	0.74
Distance to road (m)				1
0–300	24,583	10	0.79
300–600	19,865	13	1.27
600–900	16,986	9	1.03
900–1200	13,943	6	0.84
>1200	20,036	11	1.07
Population density				1.61
0.25–0.59	29,498	9	0.59
0.59–0.89	34,035	23	1.32
0.89–1.24	17,567	7	0.78
1.24–1.81	12,605	9	1.39
>1.81	1804	1	1.08
Land use				6.47
Pastures	17,290	11	1.24
Vegetables	1606	1	1.22
Tree crops	46,499	28	1.18
Basaltic rocks	9	0	0
Bare rocks	2093	0	0
Urban fabric	884	0	0
Open forest	11,971	5	0.82
Mixed land	1615	0	0
Closed forest	12,562	3	0.47
Field crops	1155	1	1.7
Wind				2.52
7–7.5	14,353	11	1.49
7.5–8	81,176	38	0.91
Rainfall (mm)				1.79
450–475	7568	3	0.77
475–525	57,988	35	1.18
525–550	29,973	11	0.72
Temperature				1.97
<20	1251	1	1.56
20–23	24,985	11	0.86
23–25	44,246	18	0.79
25–27	22,322	16	1.4
>27	2859	3	2.05
NDVI				3.85
<0.17	8047	0	0
0.17–0.23	20,754	10	0.94
0.23–0.27	31,449	18	1.12
0.27–0.32	28,920	15	1.01
>0.32	6493	6	1.8
TWI				2.58
<−5.8	28,855	9	0.6
−5.8–−4.5	40,468	17	0.81
−4.5–−2.7	17,966	14	1.5
−2.7–0.26	4630	6	2.5
>0.26	2558	3	2.26
Solar radiation				1.39
<2000	2393	2	1.63
2000–5000	3155	2	1.24
5000–8000	5893	3	1
8000–11,000	25,041	11	0.86
>11,000	59,342	31	1.02

Table 5. The PWCM of the FR-AHP.

	a	b	c	d	e	f	g	h	i	j	k	l
Altitude (a)	1	1.20	0.63	1.92	1.19	0.30	0.76	1.07	0.98	0.50	0.75	1.38
Slope (b)	0.83	1	0.52	1.60	0.99	0.25	0.64	0.89	0.81	0.41	0.62	1.15
Aspect (c)	1.59	1.91	1	3.05	1.89	0.47	1.21	1.70	1.55	0.79	1.18	2.19
Distance to road (d)	0.52	0.63	0.33	1	0.62	0.15	0.40	0.56	0.51	0.26	0.39	0.72
Population density (e)	0.84	1.01	0.53	1.61	1	0.25	0.64	0.90	0.82	0.42	0.63	1.16
Land use (f)	3.36	4.04	2.12	6.47	4.00	1	2.57	3.60	3.28	1.68	2.51	4.63
Wind (g)	1.31	1.57	0.82	2.52	1.56	0.39	1	1.40	1.28	0.65	0.98	1.80
Rainfall (h)	0.93	1.22	0.59	1.79	1.11	0.28	0.71	1	0.91	0.47	0.70	1.29
Temperature (i)	1.02	1.23	0.65	1.97	1.22	0.30	0.78	1.10	1	0.51	0.76	1.41
NDVI (j)	2.00	2.41	1.26	3.85	2.38	0.60	1.53	2.15	1.95	1	1.49	2.76
TWI (k)	1.34	1.61	0.85	2.58	1.60	0.40	1.03	1.44	1.31	0.67	1	1.85
Solar radiation (l)	0.72	0.87	0.46	1.39	0.86	0.22	0.55	0.78	0.71	0.36	0.54	1
Sum	15.46	18.62	9.75	29.77	18.43	4.60	11.82	16.59	15.11	7.73	11.54	21.34

Table 6. The normalized matrix and weights of the FR-AHP model.

	a	b	c	d	e	f	g	h	i	j	k	l	Sum	Weight
Altitude (a)	0.06	0.06	0.06	0.06	0.06	0.07	0.06	0.06	0.06	0.06	0.06	0.06	0.78	0.06
Slope (b)	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.64	0.05
Aspect (c)	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	0.10	1.23	0.10
Distance to road (d)	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.03	0.40	0.03
Population density (e)	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.65	0.05
Land use (f)	0.22	0.22	0.22	0.22	0.22	0.22	0.22	0.22	0.22	0.22	0.22	0.22	2.61	0.22
Wind (g)	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	0.08	1.01	0.08
Rainfall (h)	0.06	0.07	0.06	0.06	0.06	0.06	0.06	0.06	0.06	0.06	0.06	0.06	0.73	0.06
Temperature (i)	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.07	0.79	0.07
NDVI (j)	0.13	0.13	0.13	0.13	0.13	0.13	0.13	0.13	0.13	0.13	0.13	0.13	1.55	0.13
TWI (k)	0.09	0.09	0.09	0.09	0.09	0.09	0.09	0.09	0.09	0.09	0.09	0.09	1.04	0.09
Solar radiation (l)	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.05	0.56	0.05
Sum	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	12.00	1.00
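
Table 6 follows from Table 5 by dividing each entry by its column sum and then averaging across each row. The sketch below assumes that standard AHP normalization and takes the factor scores from column d of Table 5, where Distance to road acts as the reference with value 1. Variable names are illustrative.

```python
factors = [
    "Altitude", "Slope", "Aspect", "Distance to road",
    "Population density", "Land use", "Wind", "Rainfall",
    "Temperature", "NDVI", "TWI", "Solar radiation",
]
# Column d of Table 5: each factor's score relative to Distance to road.
scores = [1.92, 1.60, 3.05, 1.00, 1.61, 6.47, 2.52, 1.79, 1.97, 3.85, 2.58, 1.39]

n = len(scores)

# Pairwise comparison matrix: entry (i, j) is score_i / score_j.
pwcm = [[scores[i] / scores[j] for j in range(n)] for i in range(n)]

# Normalize each column by its sum, then average each row to get the weight.
col_sums = [sum(pwcm[i][j] for i in range(n)) for j in range(n)]
normalized = [[pwcm[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
weights = [sum(row) / n for row in normalized]

for name, weight in zip(factors, weights):
    print(f"{name:20s}{weight:.2f}")
```

Rounded to two decimals, the weights printed by this sketch match the Weight column of Table 6, with Land use (0.22) and NDVI (0.13) ranked highest.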

Table 7. The PWCM of AHP in the AHP-TOPSIS model.

	a	b	c	d	e	f	g	h	i	j	k	l	Weight
Distance to road (a)	1	2	3	3	4	4	6	0.33	0.5	2	5	6	0.13
Aspect (b)	-	1	2	2	3	3	6	0.33	0.33	2	4	5	0.096
Temperature (c)	-	-	1	2	2	3	4	0.2	0.25	0.5	2	3	0.059
Rainfall (d)	-	-	-	1	2	2	4	6	5	2	2	3	0.049
Altitude (e)	-	-	-	-	1	2	3	6	5	3	2	3	0.039
Population density (f)	-	-	-	-	-	1	3	7	6	3	2	3	0.033
Slope (g)	-	-	-	-	-	-	1	9	8	5	3	2	0.015
Land use (h)	-	-	-	-	-	-	-	1	2	4	8	8	0.257
NDVI (i)	-	-	-	-	-	-	-	-	1	3	6	7	0.193
Wind (j)	-	-	-	-	-	-	-	-	-	1	3	4	0.08
Solar radiation (k)	-	-	-	-	-	-	-	-	-	-	1	2	0.028
TWI (l)	-	-	-	-	-	-	-	-	-	-	-	1	0.02
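
Table 7 lists only the upper triangle of the matrix. The sketch below shows how such a matrix is usually completed, assuming the dashes stand for reciprocals (a_ji = 1/a_ij) and that weights are estimated by column normalization and row averaging. It uses only the leading 3×3 block (Distance to road, Aspect, Temperature), so the resulting weights are illustrative and do not reproduce the Weight column, which depends on all 12 factors. Function names are hypothetical.

```python
# Leading 3x3 block of Table 7; None marks the entries shown as "-".
upper = [
    [1, 2, 3],
    [None, 1, 2],
    [None, None, 1],
]


def complete_reciprocal(upper):
    """Fill the lower triangle with reciprocals of the upper triangle."""
    n = len(upper)
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j >= i:
                full[i][j] = float(upper[i][j])
            else:
                full[i][j] = 1.0 / upper[j][i]
    return full


def ahp_weights(matrix):
    """Approximate AHP weights: normalize columns, then average rows."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
        for i in range(n)
    ]


full = complete_reciprocal(upper)
weights = ahp_weights(full)
print([round(w, 3) for w in weights])
```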

Table 8. Values of the 12 forest fire determinant factors calculated by AHP-TOPSIS.

Number	Temperature	NDVI	Distance to Road	Rainfall	Wind	Aspect	Altitude	Population Density	Slope	TWI	Solar Radiation	Land Use
1	23.01	0.37	30.00	500.00	8	18.43	775	1.18	3.62	−3.76	11488.5	3
2	26.10	0.28	403.61	550.00	8	167.47	703	0.74	20.24	−5.29	10102	7
3	24.25	0.2	127.28	500.00	7	54.87	665	0.75	18.27	−4.03	10629.8	3
4	24.85	0.17	702.92	550.00	8	225.00	908	0.42	7.25	−5.22	9660.9	3
5	21.97	0.2	999.05	500.00	8	63.43	820	1.17	5.11	−4.66	11035.3	3
6	24.91	0.3	1260.00	550.00	8	315	838	0.29	10.42	−5.58	12912.8	3
7	22.00	0.35	60.00	500.00	8	274.4	966	0.84	7.43	−2.29	11743.2	3
8	21.99	0.31	954.83	500.00	8	45	882	0.64	1.62	−3.64	11153	9
9	23.95	0.31	1018.68	500.00	7	8.13	588	0.75	15.79	−3.46	11594.6	3
10	21.78	0.27	947.26	550.00	8	236.31	729	0.45	26.78	−6.60	49.26	9
.
.
.
4991	24.33	0.3	276.59	500	8	237.53	934	1	7.43	−6.01	10984.7	7
4992	22.05	0.14	865.85	450	8	332.1	851	1.51	10.89	−4.34	10223.8	5
4993	25.80	0.29	660	500	8	194.53	783	1.68	15.59	−5.05	11130.6	1
4994	23.47	0.24	335.41	500	8	15.95	742	1.36	16.24	−5.78	11487	3
4995	22.87	0.24	295.47	550	8	296.57	834	0.65	10.14	−6.38	11338.5	3
4996	24.38	0.2	906.97	500	8	248.2	870	0.59	12.16	−6.19	12440.7	3
4997	23.74	0.22	161.56	500	8	234.46	765	1.16	9.76	−4.22	10930.2	3
4998	24.72	0.23	1060.66	550	8	7.13	812	0.37	9.16	−3.73	10720.9
4999	23.81	0.3	1044.84	500	8	279.46	758	1.69	6.94	−2.16	11182.7	3
5000	23.38	0.25	1256.07	550	7	−1	545	0.71	0	−2.55	4584.11	9

Table 9. The positive ideal solution ($A^{+}$) and the negative ideal solution ($A^{-}$) in TOPSIS.

	Temperature	NDVI	Distance to Road	Rainfall	Wind	Aspect	Altitude	Population Density	Slope	TWI	Solar Radiation	Land Use
$A^{*}$	0.001	0.0049	0.0065	0.0007	0.0012	0.0023	0.0007	0.0013	0.0007	0.0003	0.0005	0.0073
$A^{-}$	0	0	0	0	0	0	0	0	0	−0.0004	0	0
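
A minimal sketch of how ideal solutions such as those in Table 9 are obtained in a standard TOPSIS workflow: vector-normalize each criterion column, multiply by the AHP weight, then take the column-wise best and worst values. It assumes vector normalization and that every criterion is treated as a benefit criterion (larger is better); neither assumption is confirmed by the tables alone. The data are the first three rows of Table 8 for three criteria, with weights from Table 7, so the numbers differ from Table 9, which is based on all 5000 alternatives.

```python
import math

criteria = ["Temperature", "NDVI", "Distance to road"]
weights = [0.059, 0.193, 0.13]  # AHP weights from Table 7

# Rows 1-3 of Table 8, restricted to the three criteria above.
rows = [
    [23.01, 0.37, 30.00],
    [26.10, 0.28, 403.61],
    [24.25, 0.20, 127.28],
]


def weighted_normalized(rows, weights):
    """Vector-normalize each column, then scale it by its weight."""
    n_cols = len(weights)
    norms = [math.sqrt(sum(r[j] ** 2 for r in rows)) for j in range(n_cols)]
    return [[weights[j] * r[j] / norms[j] for j in range(n_cols)] for r in rows]


v = weighted_normalized(rows, weights)

# Assumption: all criteria are benefit criteria, so the best value is the maximum.
a_pos = [max(col) for col in zip(*v)]
a_neg = [min(col) for col in zip(*v)]

for name, best, worst in zip(criteria, a_pos, a_neg):
    print(f"{name:18s} A+ = {best:.4f}  A- = {worst:.4f}")
```

For a cost criterion (smaller is better), the roles of `max` and `min` would be swapped for that column.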

Table 10. Separation of each alternative from the positive and negative ideal solutions and the associated $C_{i}^{*}$ values.

	$s_{i}^{*}$	$s_{i}^{-}$	$c_{i}^{*}$
1	0.009	0.005	0.357
2	0.007	0.006	0.486
3	0.009	0.003	0.28
4	0.008	0.004	0.327
5	0.008	0.004	0.344
6	0.007	0.005	0.435
7	0.008	0.005	0.37
8	0.005	0.008	0.586
9	0.007	0.005	0.388
10	0.005	0.008	0.592
.
.
.
4991	0.007	0.007	0.492
4992	0.007	0.005	0.423
4993	0.009	0.004	0.322
4994	0.008	0.004	0.312
4995	0.008	0.004	0.34
4996	0.008	0.004	0.356
4997	0.009	0.004	0.318
4998	0.008	0.004	0.356
4999	0.007	0.005	0.42
5000	0.005	0.008	0.586
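
The closeness values follow from the two separations through the usual TOPSIS relation $C_{i}^{*} = s_{i}^{-} / (s_{i}^{*} + s_{i}^{-})$. The sketch below checks the first row of Table 10, assuming this standard definition; the function name `closeness` is illustrative. Because the separations are printed to three decimals, values recomputed for other rows can differ from the table in the second decimal.

```python
def closeness(s_pos, s_neg):
    """Relative closeness to the ideal solution: 0 is worst, 1 is best."""
    return s_neg / (s_pos + s_neg)


# Row 1 of Table 10: s* = 0.009, s- = 0.005
c1 = closeness(0.009, 0.005)
print(round(c1, 3))  # 0.357
```

An alternative that is far from the negative ideal and near the positive ideal scores close to 1, which is why rows 8, 10, and 5000 (s* = 0.005, s- = 0.008) receive the highest values in the excerpt.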

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
