Predicting the Duration of Forest Fires Using Machine Learning Methods

Kopitsa, Constantina; Tsoulos, Ioannis G.; Charilogis, Vasileios; Stavrakoudis, Athanassios

doi:10.3390/fi16110396

by Constantina Kopitsa ¹, Ioannis G. Tsoulos ¹,*, Vasileios Charilogis ¹ and Athanassios Stavrakoudis ²

¹ Department of Informatics and Telecommunications, University of Ioannina, 45110 Ioannina, Greece
² Department of Economics, University of Ioannina, 45110 Ioannina, Greece
* Author to whom correspondence should be addressed.

Future Internet 2024, 16(11), 396; https://doi.org/10.3390/fi16110396

Submission received: 17 September 2024 / Revised: 14 October 2024 / Accepted: 22 October 2024 / Published: 28 October 2024

(This article belongs to the Special Issue Artificial Intelligence and Blockchain Technology for Smart Cities)

Abstract:

For thousands of years, forest fires acted as a regulator of the ecosystem, contributing to the ecological balance by destroying old and diseased plant material. In the modern era, however, fires are a major problem that not only tests the endurance of government agencies around the world but also affects climate change. Forest fires have become more intense, more destructive, and more deadly; such fires are known as megafires. They can cause major economic and ecological problems, especially in the summer months (dry season). However, humanity has developed a tool that can predict fire events, detect them in time, and also predict their duration. This tool is artificial intelligence and, specifically, machine learning, which is one part of AI. Consequently, this paper briefly reviews several machine learning methods used in the prediction and early detection of forest fires, providing an overall review of current models. Our main objective is to venture into a new field: predicting the duration of ongoing forest fires. Our contribution offers a new way to manage forest fires, using accessible open data available from the Hellenic Fire Service. In particular, we applied machine learning techniques to over 72,000 records covering a 10-year period (2014–2023). The experimental and validation results are more than encouraging, with Random Forest achieving the lowest value in the error range (8–13%), meaning it was 87–92% accurate in predicting forest fire duration. Finally, some future directions in which to extend this research are presented.

Keywords:

forest fires; machine learning; neural networks; decision trees

1. Introduction

Forests play an important role in the ecological balance [1] of our planet as well as in our everyday life [2]. However, these ecosystems are threatened by various risks, the most important of which are fires [3,4,5]. Forest fires destroy the forest ecosystem [6,7,8] and can have devastating effects on local economies [9,10], with a significant impact also on tourism development [11,12,13] as well as human health [14,15,16].

Since the risks of fires are great, governments must take measures and review them in the direction of fire prevention by analyzing data collected from fires that have broken out in recent history [17,18,19]. Local authorities have used techniques for forest fire monitoring, such as small UAVs [20], usage of a monitoring system based on GPRS and a ZigBee wireless network [21], the iForestFire system [22], etc. Merino et al. suggested an unmanned aircraft system (UAS) [23] for forest fire monitoring. Also, Aslan et al. proposed a system [24] of wireless sensor networks for forest fire detection and monitoring. Recently, Serna et al. suggested a distributed system for fire monitoring using wireless sensor networks [25].

During recent years, machine learning techniques have started to play an important role in the prevention and treatment of forest fires. For example, Dwiasnati and Devianto proposed the usage of various machine learning methods for the classification of forest fire areas [26]. Pang et al. suggested the usage of a series of machine learning models for forest fire occurrence prediction in China [27]. Dampage et al. suggested a system of wireless sensor networks with data handled by machine learning models for the detection of forest fires [28]. Shao et al. proposed a mapping of China’s forest fire risks using a series of machine learning models [29]. A parallel SVM model was suggested by Singh et al. [30] for forest fire prediction, using data collected from India and Portugal. A survey on machine learning models used for forest fire prediction can be found in the work of Abid [31].

In addition, image processing has been established as a fire detection method. In this direction, a multitude of techniques that also take advantage of machine learning methods have been presented, such as the work of Vicente and Guillemant, who presented a method for early smoke source detection [32]. Yan et al. proposed a method [33] that combined image processing techniques and neural networks for forest fire recognition. Mubarak et al. suggested a rule-based image processing algorithm [34] for forest fire detection. Convolutional neural networks were utilized in the work of Wang et al. [35] for forest fire image recognition, and wavelet analysis was used in the work of Jiao et al. [36] for forest fire detection. Furthermore, Jain et al. underline that the field of ML has undergone an explosion of new algorithmic advances in recent years and is deeply connected to the broader field of artificial intelligence (AI) [37]. Figure 1 outlines the incorporation of machine learning methods in wildfire management. ML models, a subset of artificial intelligence, harness the power of data and algorithms to learn from past experiences [38]. In the context of wildfire management, these models are invaluable: they analyze historical data and leverage it to create predictive models capable of forecasting the spread of future fires. Although more complex than their statistical and physical counterparts, these models stand out for their ability to incorporate a broad array of variables. Understanding the requirements of ML models is crucial, particularly in terms of the variables they depend on for accurate predictions [39].

This research work focuses on the use of machine learning techniques to predict the duration of forest fires which occurred in Greece from 2014 to 2023. The data were collected by the Hellenic Fire Service, and then, after clearing missing records, the data were digitized and one of three categories was assigned to every pattern: fires of short duration, fires of medium duration, and fires of long duration. Figure 2 shows the burned areas in Greece per year for the last 10 years, according to the Global Wildfire Information System (GWIS).

Farid et al. pointed out the vulnerability of the Greek ecosystem (pine forest) to large forest fires, along with that of other Mediterranean countries, such as Spain, Portugal, Italy, and France [40]. The fires in the Mediterranean region have become very intense and dangerous, with scientists reporting a sixth-generation megafire clearly linked to global change. This new type of fire broke out for the first time in Portugal and Spain in 2017, with over 120 deaths, and the next year in Greece, with 104 deaths. The common characteristics of this new type of fire are that it is extreme, uncontrollable, and lethal, as indicated in the WWF report of 2019, available from http://awsassets.panda.org/downloads/wwf__the_mediterranean_burns_2019_eng_final.pdf (accessed on 21 October 2024). In order to improve preparation for fire hazards, this paper aims to improve forest fire management through the use of machine learning methods. The prediction of the duration of a fire is important because, on the one hand, it allows an estimate of the expected damage that will be caused in the area and, on the other, it allows the human resources required to extinguish the fire to be calculated. Similar works in this area include the work of Liang et al. that used the duration of a wildfire and the burnt area to determine the scale of wildfires using neural networks [41]. Also, KC et al. proposed a surrogate model [42] to model the size of a wildfire over time, using data collected from wildfires in Tasmania. Furthermore, Xi et al. proposed [43] the application of joint mixture models to model the duration and the size of wildfires. In this work, a number of machine learning models, which have been successfully tested on a wide range of problems in the modern literature, were used. The purpose of these models is the satisfactory separation of the categories of the problem through stochastic techniques that adjust the parameters of the above models.

In this paper, we review the application of ML in forest fire management. Our main objective is to improve awareness of ML methods among fire researchers and managers, and to illustrate the open data that the Hellenic Fire Service provides. In addition, we aim to highlight the current state-of-the-art methods for predicting forest fire duration and their benefit to decision making regarding the firefighting resources required. The US Forest Service underlines the importance of predictive services that give information to fire managers in order to anticipate and determine resource needs, such as firefighters, engines, airplanes, etc. [44]. Also, the European State Forest Association states that addressing the challenge of forest fires requires a concerted effort that combines scientific research, practical management strategies, and strong community engagement [45]. Therefore, technology has become a valuable ally for the environmental sciences. Recent reviews demonstrate the increase, in the last ten years, in the application of ML models in the environmental sciences [46] and forest ecology [47]. Jain, a Canadian specialist in fire science, also points out that a better method of wildfire prediction is crucial for wildfire management. Consequently, there has been a growing interest in the use of machine learning (ML) methodologies in wildfire science and management in recent years [37]. Forest fires and their management form a unique scientific field, with six problem domains according to Jain et al.: fire detection, fuel characterization, and mapping; climate change and fire weather; fire susceptibility, occurrence, and risk; fire behavior prediction; fire effects; and fire management. In our paper, we focus on fire management, because there appear to be few studies in this problem domain, according to Jain et al.

Finley showed that fire management is a type of risk management that aims to maximize fire benefits while minimizing costs and losses [48]. Fire management decisions are crucial on a variety of scales, including long-term strategic decisions about resource procurement and location control in large regions; medium-term tactical decisions about resource acquisition, relocation, or release during the fire season; and short-term real-time operational decisions regarding resource deployment and usage on specific events [37]. Xiao indicates that fire management groups struggle to respond effectively in a limited amount of time, so it is wise to watch for potential large fires [49].

This paper presents a fast-decision model for predicting the duration of ongoing fires. As a result, this review will help practitioners and researchers in the wildfire community who are interested in using machine learning techniques by offering guidance and information. It will also give ML researchers the chance to find potential uses in the field of wildfire science and management.

The field of wildfire duration prediction is limited, as researchers tend to focus more on fire occurrence and early detection. The domain we are focusing on appears to have great potential in two ways: first, in wildfire management; and second, for machine learning researchers. The objective of this paper is to introduce an innovative approach by incorporating the number of firefighters, vehicles, and aerial forces used in each of the more than 72,000 fire incidents as key data points. Our research stands out by focusing not only on the occurrence of fires but also on the critical role of human and material resources in managing them. The application of machine learning (ML), and especially the Random Forest algorithm, in our project has proven highly valuable, enabling us to accurately analyze and predict critical parameters such as fire duration, while also considering human and material resources. Our findings are reliable, offering substantial support for optimizing wildfire management strategies. Our contribution can enhance the understanding of both material needs and human resources for fighting a wildfire, at both the local level and the European level, through the European Civil Protection Mechanism.

The rest of this article is organized as follows: in Section 2, the dataset used is described, as well as the incorporated machine learning methods; in Section 3, the experimental results are fully described; and finally, in Section 4, some conclusions are discussed, accompanied by some guidelines for future research.

2. Materials and Methods

This section presents the datasets that are used in the experiments, as well as the machine learning techniques that are applied to these datasets.

2.1. The Used Datasets

In this research work, open data were used, which are available from the Hellenic Fire Service at the link https://www.fireservice.gr/en_US/synola-dedomenon (accessed on 2 October 2024). The data were obtained for the years 2014–2023, and data preprocessing techniques were applied before inputting the data into machine learning models. The data are published in the context of strengthening the European transparency legislation (Directive 2013/37/EU). Therefore, the data are biased toward neither fire type nor location, and concern all fires in the Greek (Hellenic) area. The information provided by the Hellenic Fire Service is easily accessible, updated, accurate, allows analysis, and includes all participating parties.

The initial datasets contained both numerical and alphanumeric information. For example, they included data on the area where the forest fire occurred, as well as information about the fire station that participated in the suppression efforts. Therefore, the first step in data preprocessing was the digitization of the columns containing alphanumeric data. This involved replacing categorical data with discrete integer values, enabling their use by machine learning algorithms that require numerical input. The next crucial step in data preprocessing involved handling missing values. Specifically, records that contained missing values in important features, such as climatic data or other relevant variables, were removed from the dataset. Missing values typically occurred when a value was unavailable at the time of recording, and keeping such records could lead to biased or unreliable results. Additionally, records with a fire duration of zero were excluded, as they were considered unrealistic and inappropriate for analysis. To define the output category, the duration of the forest fire was converted from hours or other time units into minutes, providing greater precision in classification. Subsequently, three distinct categories were created based on the logarithmic value of the fire duration in minutes. This logarithmic transformation allowed for better management of the large variations in fire duration, ensuring that both shorter and longer fires were appropriately considered without overemphasizing extremely large values. The three resulting categories were used as target values for the execution of experiments, enabling the classification of forest fires based on their duration. In the present work, and for the Greek data on forest fires, the following fire classification was used:

A duration of up to 6 h is considered a fire of short duration;
A duration of more than 6 h and up to 2 days is considered a fire of medium duration;
A duration of 2 days or more is considered a fire of long duration.

In this way, the data were adequately prepared to be analyzed using machine learning methods, ultimately achieving greater accuracy in predicting the duration of forest fires. The preprocessing steps are graphically illustrated in Figure 3.
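
The preprocessing steps described above can be sketched as follows. This is a minimal illustration and not the authors' code: the field names (`province`, `firefighters`, `duration_min`), the sample records, and the treatment of the exact boundaries (exactly 6 h counted as short, exactly 2 days counted as long) are assumptions made for the example.

```python
import math

SHORT_LIMIT_MIN = 6 * 60        # 6 h, threshold stated in the text
LONG_LIMIT_MIN = 2 * 24 * 60    # 2 days, threshold stated in the text


def duration_class(minutes):
    """Map a fire duration in minutes to 0 (short), 1 (medium) or 2 (long)."""
    # Assumption: exactly 6 h is short and exactly 2 days is long.
    if minutes <= SHORT_LIMIT_MIN:
        return 0
    if minutes < LONG_LIMIT_MIN:
        return 1
    return 2


def encode_column(values):
    """Replace categorical strings with discrete integer codes."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def preprocess(records, categorical=("province",)):
    """Drop incomplete or zero-duration records, encode categories, add labels."""
    kept = [r for r in records
            if all(v is not None for v in r.values()) and r["duration_min"] > 0]
    out = [dict(r) for r in kept]
    for col in categorical:
        for row, code in zip(out, encode_column([r[col] for r in kept])):
            row[col] = code
    for row in out:
        # The logarithm is monotonic, so thresholds on minutes and on
        # log-minutes select the same records.
        row["log_duration"] = math.log(row["duration_min"])
        row["label"] = duration_class(row["duration_min"])
    return out


# Hypothetical records, for illustration only.
sample = [
    {"province": "Ioannina", "firefighters": 12, "duration_min": 95},
    {"province": "Attica", "firefighters": 140, "duration_min": 3100},
    {"province": "Ioannina", "firefighters": None, "duration_min": 400},
    {"province": "Attica", "firefighters": 8, "duration_min": 0},
    {"province": "Arta", "firefighters": 30, "duration_min": 600},
]
clean = preprocess(sample)
```

In this sketch, the third record is dropped because of a missing value and the fourth because of its zero duration; the remaining three records receive the labels short, long, and medium, respectively.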

Having performed the previously mentioned preprocessing steps, the final datasets contain 25 features and the following information about the forest fires:

Fire department.
Province.
Season.
Burnt area: Forest area, grove, grasslands, reeds/swamps, agricultural lands, cover crop, and garbage dumps.
Personnel: Firefighters, volunteers, army, etc.
Vehicles: Firefighting, tanks, etc.
Aerial means: Helicopters and other aircraft.

A schematic representation of the used dataset is given in Figure 4.

2.2. The Used Machine Learning Methods

A number of machine learning techniques were used to classify the records of the datasets described in the previous subsection. These techniques cover a wide range of the methods available in the field of machine learning and are presented in more detail below.

2.2.1. Bayesian Networks

Bayesian networks are probabilistic models based on directed acyclic graphs [50,51] and they have been applied with success in various cases. For example, Friedman et al. used Bayesian networks to analyze expression data [52]. Also, Cai et al. used Bayesian networks in fault diagnosis [53] and Barton et al. proposed the use of Bayesian networks for environmental problems [54]. In the case of forest fires, Bayesian networks have been used in many cases, such as to predict and analyze possible fire causes in a study conducted in Mugla, Turkey [55]. Bayesian networks were also used to model the cascading impacts of drought and forest fire in a recent study [56], and they were combined with deep learning for the detection of fires from video frames [57].

2.2.2. Naïve Bayes

Naïve Bayes is a supervised machine learning algorithm used for classification tasks. This classifier uses principles of probability in order to perform classification tasks [58]. This algorithm has been incorporated in many research areas, such as document classification [59], traffic risk management [60], network intrusion detection [61], etc. Also, naïve Bayes has been used for forest fire issues in a series of papers. For example, Nugroho et al. proposed a system for forest fire prevention using a combination of a wireless sensor network and a naïve Bayes classifier [62]. A classification of hotspots causing forest fires using the naïve Bayes algorithm is proposed in the work of Zainul et al. [63]. Karo et al. proposed a methodology to classify wildfires using feature selection and naïve Bayes among other machine learning methods [64]. Also, a variant of the naïve Bayes algorithm was suggested by Shu et al. for forest fire prediction [65].
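
For reference, the standard naïve Bayes decision rule can be written as follows. The notation is introduced here only for illustration and is not taken from the paper: x = (x_1, ..., x_d) is the feature vector of a fire record (d = 25 for the datasets described earlier) and c ranges over the three duration classes. The rule assumes that the features are conditionally independent given the class.

```latex
\hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{j=1}^{d} P(x_j \mid c)
```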

2.2.3. Logistic Regression

Like the previously mentioned algorithms, logistic regression is a supervised classification method, and it can be considered a data analysis technique used to predict probabilities [66]. Cabrera proposed using logistic regression in higher education decisions [67]. Also, Lawson et al. proposed the usage of the logistic regression method to analyze customer satisfaction data [68]. Hu and Lo used the logistic regression technique to model urban growth in their paper [69]. This method has also been used in a series of issues involving forest fires, such as human-caused wildfire risk estimation [70], prediction of wildfire vulnerability [71], probabilistic modeling of wildfire occurrence [72], analysis of wildfire danger [73], etc.

2.2.4. Artificial Neural Networks

Artificial neural networks (ANNs) are parametric models [74,75], where a set of parameters, commonly called weights, must be calculated to be adapted to classification or regression data. This machine learning model has been utilized in a variety of scientific and real-world problems, such as physics problems [76,77,78], solving differential equations [79,80], solar radiation prediction [81], agriculture problems [82,83], problems in chemistry [84,85,86], wind speed forecasting [87], economics problems [88,89,90], problems related to medicine [91,92], etc.

In the area of forest fire prediction and observation, a number of works using artificial neural networks have been published. Hossain et al. used ANNs to detect flames and smoke from static image features [93]. Lall and Mathibela utilized neural networks to predict the risk of wildfires in the city of Cape Town [94]. Also, Sayad et al. used neural networks, among other machine learning techniques, for predictive modeling of wildfires from data collected from NASA’s Land Processes Distributed Active Archive Center (LP DAAC) [95]. Also, a case study for predicting wildfires in a Chinese province using neural networks was published recently by Gao et al. [96].

2.2.5. The J48 Algorithm

The J48 algorithm [97] is one of the most used supervised machine learning algorithms, used to construct decision trees for classification data. This method was tested on a series of classification problems, such as the prediction of diabetes [98], network intrusion detection [99], classification of criminal data [100], fingerprint gender classification [101], fake news classification [102], etc. Also, the J48 algorithm was used to predict forest fires using data from Slovenia in a recent work [103]. A similar study was performed in Algeria using the J48 algorithm among other machine learning models [103].

2.2.6. Random Forest

Random Forest [104,105] is a popular supervised machine learning algorithm that constructs an ensemble of decision trees for classification problems. The method of Random Forest has proven its adaptability and effectiveness in a number of difficult problems, such as remote sensing classification [106], ecology issues [107], bioinformatics [108], text categorization [109], network intrusion detection [110], etc. Moreover, Random Forest has been incorporated in forest fire prediction, such as in the work of Latifah et al., where Random Forest was applied to predict forest fires in Borneo [111]. Also, Malik et al. proposed the usage of Random Forest for wildfire risk prediction in northern California [112]. Furthermore, Gao et al. performed a forest fire risk prediction [113] in China using a combination of Random Forest and a neural network trained with the back-propagation method [114].
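
The idea behind Random Forest, namely training many trees on bootstrap samples and combining them by majority vote, can be illustrated with the toy sketch below. It is a deliberate simplification and not the WEKA implementation used in the experiments: each "tree" is a one-split decision stump on a single randomly chosen feature, and the two features (number of firefighters, number of vehicles) and all data values are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that minimizes training errors."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, t, l_lab, r_lab)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feature = rng.randrange(d)                   # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Return the class that receives the most votes across all stumps."""
    votes = [l if row[f] <= t else r for f, t, l, r in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical data: (firefighters, vehicles) -> 0 = short fire, 1 = long fire.
rows = [(2, 1), (3, 1), (4, 2), (5, 2), (40, 10), (55, 12), (60, 15), (80, 20)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
```

A full Random Forest grows deep trees and samples a random subset of features at every split; the stump-based version above only shows how bootstrap sampling and voting fit together.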

3. Results

The experiments were conducted using the freely available WEKA software [115]. The software is written in the Java programming language for portability and can be downloaded freely from https://ml.cms.waikato.ac.nz/weka/ (accessed on 14 September 2024); it can also be found in the repositories of most Linux systems. WEKA is a collection of machine learning and data analysis tools and it also contains some visualization tools for modeling. WEKA has been used with success in many cases, such as educational problems [116,117], medical problems [118,119], etc. The validation of the conducted experiments was performed using the ten-fold cross-validation technique. The experiments were carried out on an AMD Ryzen 5950X machine installed at the University of Ioannina, with 128 GB of RAM, running the Debian Linux operating system. The experimental results using the methods mentioned in the previous section and the 10 modified datasets from the Hellenic Fire Service are listed in Table 1. The following applies to the tables of experimental results:

The numbers in cells denote the average classification error as calculated on the test set.
The column Year denotes the year of the dataset to which the machine learning methods were applied.
The column BAYESNET stands for the application of the Bayesian network method.
The column NAIVEBAYES denotes the application of the naïve Bayes algorithm.
The column LOGISTIC represents the application of the logistic regression algorithm.
The column MLP denotes the application of a neural network to the dataset.
The column J48 denotes the application of the J48 method to the forest fire data.
The column RANDOMFOREST denotes the usage of the Random Forest method on the data.
The row Average denotes the average classification error for all datasets.
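
The ten-fold cross-validation protocol and the average classification error reported in the tables can be sketched as follows. This is an illustrative sketch under stated assumptions, not the WEKA procedure itself: the folds are built by a plain shuffle (WEKA stratifies them by class), the learner is a placeholder that always predicts the most frequent class, and the label distribution is hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_error(X, y, train, predict, k=10, seed=0):
    """Average test-set classification error (in %) over k folds."""
    errors = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        wrong = sum(predict(model, X[i]) != y[i] for i in test_idx)
        errors.append(100.0 * wrong / len(test_idx))
    return sum(errors) / len(errors)


# Placeholder learner: always predicts the most frequent training class.
def train_majority(X, y):
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Hypothetical labels: 80 short (0), 15 medium (1) and 5 long (2) fires.
y = [0] * 80 + [1] * 15 + [2] * 5
X = [[i] for i in range(len(y))]
err = cross_validated_error(X, y, train_majority, predict_majority)
```

With these hypothetical labels the placeholder learner misclassifies every medium and long fire, so `err` equals 20.0; any real classifier can be substituted through the `train` and `predict` arguments.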

Judging from the experimental results, it is evident that Random Forest, along with logistic regression, outperforms the other techniques. This observation is reinforced by the box plot of Figure 5.

Also, the precision and recall measures for every dataset and for each method are presented in Table 2.
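
For a three-class problem such as this one, precision and recall are usually computed per class from the true and predicted labels. The sketch below shows the standard definitions; the label vectors are hypothetical, and how the per-class values are averaged in Table 2 is not specified here, so no averaging is assumed.

```python
def precision_recall(y_true, y_pred, label):
    """Return (precision, recall) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)   # true positives
    fp = sum(t != label and p == label for t, p in pairs)   # false positives
    fn = sum(t == label and p != label for t, p in pairs)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 0 = short, 1 = medium, 2 = long duration.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 2, 2, 1]
per_class = {c: precision_recall(y_true, y_pred, c) for c in (0, 1, 2)}
```

In this example the medium class has a precision of 0.5 (two of the four records predicted as medium are correct) and a recall of 2/3 (two of the three medium records are found).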

A statistical comparison of Random Forest with the other machine learning methods is depicted in Figure 6. In this figure, pairwise statistical comparisons of the models are made using the Kruskal–Wallis test, and the results are presented with asterisks instead of the exact p-values. The asterisks indicate the level of statistical significance for the comparisons. Each number of asterisks corresponds to a different p-value threshold, indicating how strong the evidence is for rejecting the null hypothesis (that there is no difference between the groups). These levels are usually as follows: One asterisk (*) represents a p-value less than 0.05, indicating a significant difference at the 5% level, which is often considered an acceptable threshold for statistical significance. Two asterisks (**) represent a p-value less than 0.01, indicating a significant difference at the 1% level, which is a stronger indication for rejecting the null hypothesis. Three asterisks (***) represent a p-value less than 0.001, indicating a very strong level of statistical significance, where a difference this large would very rarely arise by random chance alone if the null hypothesis were true.
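
The asterisk convention described above amounts to a simple threshold lookup, sketched below. The function name and the "ns" (not significant) label for p-values of 0.05 or more are assumptions made for this example.

```python
def significance_stars(p_value):
    """Translate a p-value into the asterisk notation described in the text."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label for a non-significant result


labels = [significance_stars(p) for p in (0.0004, 0.008, 0.03, 0.2)]
```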

The Kruskal–Wallis test was used because we had many groups to compare, specifically, six different machine learning models, making it necessary to use a test that could compare more than two groups simultaneously. In the statistical visualization of Figure 6, the BAYESNET model shows higher error rates in most years, with an exceptionally high error in 2016, reaching 25.71%. However, in other years, such as 2019 and 2023, its performance was quite close to RANDOMFOREST, although the latter remains slightly better. Although NAIVEBAYES performs relatively well in several years, it still shows higher errors compared to RANDOMFOREST, especially during the 2020–2021 period, where NAIVEBAYES had significantly increased errors. LOGISTIC records exceptionally low error rates in 2016, but remains consistently below RANDOMFOREST in most years. An exception is in 2020, where a divergence is observed, with better performance for RANDOMFOREST. MLP shows some fluctuations in errors, with significant improvement after 2016. Despite better performance in certain years, such as 2021, it remains generally inferior compared to RANDOMFOREST. J48 has error rates fairly comparable to those of RANDOMFOREST, especially after 2018, but RANDOMFOREST proves to be the most efficient model in most years. RANDOMFOREST consistently emerges as the best model based on error rates, recording lower errors in most years compared to other machine learning models. Although other models, such as LOGISTIC and MLP, perform well in certain years, RANDOMFOREST maintains a more stable and reliable performance with fewer fluctuations.
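
For reference, the Kruskal–Wallis statistic in its standard form (without the correction for ties) is given below. The notation is introduced here for illustration: all N observations from the k groups are ranked together, n_i is the number of observations in group i, and R_i is the sum of the ranks in group i. Under the null hypothesis, H approximately follows a chi-squared distribution with k - 1 degrees of freedom.

```latex
H \;=\; \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^{2}}{n_i} \;-\; 3(N+1)
```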

Furthermore, in order to evaluate the performance and the effectiveness of the Random Forest technique, an additional test was carried out where the number of trees for this method increased from 10 to 200. The experimental results are outlined in Table 3.

Also, a statistical comparison for the previously mentioned results is shown in Figure 7.

In Table 3, it is observed that as the number of trees increases, the error decreases. This is expected in Random Forest models, as more trees typically improve the model’s accuracy. For example, in 2014, the error decreases from 10.0477% for 10 trees to 9.253% for 200 trees, while in 2023, the error decreases from 8.4684% for 10 trees to 7.4303% for 200 trees. In Figure 7, the paired comparisons using the t-test show that for certain pairs, such as between 10 and 20 trees, the differences are statistically significant, with p-values less than 0.05. This indicates that increasing from 10 to 20 trees results in a significant reduction in error. On the other hand, for larger numbers of trees, such as 150 and 200, the p-values are greater than 0.05, suggesting that the differences in errors are not statistically significant. In conclusion, increasing the number of trees in the Random Forest model reduces the error mainly at smaller tree counts; beyond a certain point (e.g., from 150 to 200 trees), adding trees has no substantial impact on the error.
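
To put the two examples quoted from Table 3 in perspective, the short calculation below expresses each improvement both in percentage points and relative to the 10-tree error. Only the four error values come from the text; the function and variable names are illustrative.

```python
def error_reduction(err_few, err_many):
    """Return the absolute (points) and relative (%) reduction in error."""
    absolute = err_few - err_many
    relative = 100.0 * absolute / err_few
    return absolute, relative


# Errors (%) for 10 and 200 trees, as quoted in the text for 2014 and 2023.
reported = {2014: (10.0477, 9.253), 2023: (8.4684, 7.4303)}
reductions = {year: error_reduction(e10, e200)
              for year, (e10, e200) in reported.items()}

for year, (absolute, relative) in reductions.items():
    print(f"{year}: {absolute:.4f} points lower ({relative:.2f}% relative)")
# 2014: 0.7947 points lower (7.91% relative)
# 2023: 1.0381 points lower (12.26% relative)
```

In both years the gain from a twentyfold increase in trees is about one percentage point or less, which is consistent with the diminishing returns reported for the larger tree counts.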

In addition, to validate the experimental results, another comparison was made between the Random Forest method and machine learning models that were trained using the OPTIMUS optimization software. This software is freely available from https://github.com/itsoulos/GlobalOptimus/ (accessed on 3 October 2024). The experimental results for this comparison are outlined in Table 4.

The following notation is used:

The column RANDOM FOREST denotes the results with the method Random Forest implemented by the WEKA package.
The column SVM denotes the application of the support vector machine (SVM) method [120] using the libsvm software package [121].
The column MLP10 BFGS stands for the results obtained by an artificial neural network trained by the BFGS optimization method, as modified by Powell [122]. This neural network is equipped with 10 processing nodes.
The column MLP10 LBFGS represents the results obtained by a neural network with 10 processing nodes that was trained using the limited memory BFGS optimization method [123].
The column RBF10 denotes the results produced by the training of a radial basis function (RBF) [124] network with 10 processing nodes.

Also, a statistical comparison for the previously presented results is outlined in Figure 8.

In Table 4, we observe that the RBF10 model generally exhibits higher error rates compared to the other models, with greater variability from year to year. Its highest error occurs in 2021 at 18.17%, while its lowest is in 2016 at 7.25%. The MLP10-BFGS model, although showing some fluctuations, is more stable than RBF10 and has lower error rates in most years. Its lowest error is recorded in 2016 at 4.36%, while the highest is in 2021 at 14.12%. The MLP10-LBFGS model generally performs better than MLP10-BFGS, with slightly lower error rates in most years. Its lowest error is 3.61% in 2016, while the highest is in 2021 at 12.71%. Finally, the Random Forest model shows the best overall performance, with the lowest error recorded in 2016 (3%) and the highest in 2021 (11.92%). Overall, Random Forest consistently maintains lower error rates compared to the other models in nearly all years. Comparing the models using the t-test (Figure 8), the statistical analysis shows that the differences between the models are statistically significant, especially when comparing RBF10, MLP10-BFGS, and MLP10-LBFGS to Random Forest. The p-value is lower than the specified threshold (p < 0.05), confirming that Random Forest statistically outperforms the other models.

Furthermore, to measure the effectiveness of the artificial neural network on the proposed datasets, one more experiment was conducted, in which the BFGS method was used to train neural networks with the number of processing nodes varied in the range [2, 20]. The results of this experiment are outlined in Table 5.

The results show that the artificial neural network improves as the number of nodes increases from 2 to 5 and then to 10 (average errors of 10.66%, 9.95%, and 9.71%, respectively), but beyond this point there is no noticeable difference (9.79% for 15 nodes and 9.71% for 20 nodes).
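The plateau can be checked directly from the yearly values in Table 5. The sketch below is only an illustration: the variable names are hypothetical, and it simply averages each column and reports the change in mean error between consecutive network sizes.

```python
# Yearly classification errors (%) for 2014-2023, copied from Table 5,
# keyed by the number of processing nodes of the BFGS-trained network.
errors_by_nodes = {
    2: [12.88, 11.78, 5.08, 12.14, 10.46, 9.61, 7.49, 16.29, 10.39, 10.51],
    5: [12.40, 10.98, 4.40, 10.71, 10.28, 9.21, 6.67, 14.82, 10.06, 9.96],
    10: [11.61, 10.86, 4.36, 10.60, 10.16, 9.27, 6.69, 14.12, 9.86, 9.55],
    15: [11.58, 12.71, 4.13, 10.80, 10.04, 9.12, 6.36, 14.05, 9.67, 9.45],
    20: [11.43, 12.66, 4.14, 10.46, 9.90, 9.01, 6.40, 13.95, 9.63, 9.47],
}

# Mean error over the ten years for each network size.
mean_error = {n: sum(v) / len(v) for n, v in errors_by_nodes.items()}

# Change in mean error when moving to the next larger network
# (negative values are improvements).
sizes = sorted(mean_error)
deltas = {(a, b): mean_error[b] - mean_error[a] for a, b in zip(sizes, sizes[1:])}

for (a, b), delta in deltas.items():
    print(f"{a:>2} -> {b:>2} nodes: {delta:+.2f} percentage points")
```

The gains shrink from roughly 0.7 percentage points (2 to 5 nodes) to roughly 0.2 (5 to 10 nodes), and the changes beyond 10 nodes stay below 0.1 percentage points in either direction.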

4. Conclusions

This work studied the duration of forest fires using open data for Greece. The data include information such as the area of the fire, the time it broke out, the destruction it caused, and the human resources involved in extinguishing it. Machine learning models were then used to estimate the duration of the fires. The successful prediction of fire duration contributes to the proper management of such catastrophic events by state mechanisms. The models used included Bayesian methods, logistic regression, artificial neural networks, decision trees, and Random Forest. Most of these techniques achieved low error values for each year of experimental data. On average, the classification error values were in the range of 8–13%, with the Random Forest technique achieving the lowest value. Furthermore, additional tests were executed to measure the effectiveness of Random Forest and neural networks. In addition, the stability of the used data was assessed in the presence of random noise, with encouraging results.
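As a small worked example of the 8–13% range, the per-model averages in the last row of Table 1 can be converted to accuracies. This is only a sketch: it assumes, as the abstract does, that accuracy is 100% minus the classification error, and the variable names are hypothetical.

```python
# Average classification error (%) per model, from the last row of Table 1.
avg_error = {
    "BAYESNET": 13.04,
    "NAIVEBAYES": 10.81,
    "LOGISTIC": 8.45,
    "MLP": 9.18,
    "J48": 8.90,
    "RANDOMFOREST": 8.13,
}

# Assumption: accuracy is the complement of the classification error.
accuracy = {model: 100.0 - err for model, err in avg_error.items()}

# Model with the lowest average error.
best = min(avg_error, key=avg_error.get)

low, high = min(avg_error.values()), max(avg_error.values())
print(f"error range: {low:.2f}%-{high:.2f}%")
print(f"best model: {best} ({accuracy[best]:.2f}% accurate on average)")
```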

The present work could be extended in the future in various research directions, as follows:

Incorporating more machine learning methods from the relevant literature.
Using feature selection or construction techniques from the recent literature to identify the most important factors influencing the classification process.
Using methods that create classification rules, in order to discover hidden relationships between the data and the classes of the datasets.
Using data that also include meteorological observations, in order to identify a possible correlation of the categories with the meteorological conditions that prevailed at the time of the fire.
Incorporating parallel programming techniques, such as MPI [125] or the OpenMP library [126], to speed up the optimization process.

Author Contributions

C.K., V.C. and I.G.T. conceived of the idea and the methodology, and C.K. and V.C. implemented the corresponding software. C.K. conducted the experiments, employing objective functions as test cases, and provided the comparative experiments. A.S. performed the necessary statistical tests. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been financed by the European Union: Next Generation EU through the Program Greece 2.0 National Recovery and Resilience Plan, under the call RESEARCH–CREATE–INNOVATE, project name “iCREW: Intelligent small craft simulator for advanced crew training using Virtual Reality techniques” (project code: TAEDK-06195).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Qiang, Z.; Meka, E.Z.; Anderson, R.C.; Kakabadse, Y. Forests Nature at Your Service. UNEP Report. The Magazine of the United Nations Environment Program. Available online: https://www.unep.org/zh-hans/node/11645 (accessed on 21 October 2024).
Mori, A.S.; Lertzman, K.P.; Gustafsson, L. Biodiversity and ecosystem services in forest ecosystems: A research agenda for applied forest ecology. J. Appl. Ecol. 2017, 54, 12–27. [Google Scholar] [CrossRef]
Stocks, B.J.; Mason, J.A.; Todd, J.B.; Bosch, E.M.; Wotton, B.M.; Amiro, B.D.; Flannigan, M.D.; Hirsch, K.G.; Logan, K.A.; Martell, D.L.; et al. Large forest fires in Canada, 1959–1997. J. Geophys. Res. Atmos. 2002, 107, FFR-5. [Google Scholar] [CrossRef]
Flannigan, M.D.; Amiro, B.D.; Logan, K.A.; Stocks, B.J.; Wotton, B.M. Forest fires and climate change in the 21st century. Mitig. Adapt. Strateg. Glob. Change 2006, 11, 847–859. [Google Scholar] [CrossRef]
Sahar, O. Wildfires in Algeria: Problems and challenges. iForest 2015, 8, 818–826. [Google Scholar] [CrossRef]
Certini, G. Effects of fire on properties of forest soils: A review. Oecologia 2005, 143, 1–10. [Google Scholar] [CrossRef]
Van Der Werf, G.R.; Randerson, J.T.; Collatz, G.J.; Giglio, L. Carbon emissions from fires in tropical and subtropical ecosystems. Glob. Change Biol. 2003, 9, 547–562. [Google Scholar] [CrossRef]
Agbeshie, A.A.; Abugre, S.; Atta-Darkwa, T.; Awuah, R. A review of the effects of forest fire on soil properties. J. For. Res. 2022, 33, 1419–1441. [Google Scholar] [CrossRef]
Aleksić, P.; Krstić, M.; Jančić, G. Forest fires—Ecological and economic problem in Serbia. Bot. Serb. 2009, 32, 169–176. [Google Scholar]
Wang, D.; Guan, D.; Zhu, S.; Kinnon, M.M.; Geng, G.; Zhang, Q.; Zheng, H.; Lei, T.; Shao, S.; Gong, P.; et al. Economic footprint of California wildfires in 2018. Nat. Sustain. 2021, 4, 252–260. [Google Scholar] [CrossRef]
Hystad, P.W.; Keller, P.C. Towards a destination tourism disaster management framework: Long-term lessons from a forest fire disaster. Tour. Manag. 2008, 29, 151–162. [Google Scholar] [CrossRef]
Boustras, G.; Boukas, N. Forest fires’ impact on tourism development: A comparative study of Greece and Cyprus. Manag. Environ. Qual. 2013, 24, 498–511. [Google Scholar] [CrossRef]
Otrachshenko, V.; Nunes, L.C. Fire takes no vacation: Impact of fires on tourism. Environ. Dev. Econ. 2022, 27, 86–101. [Google Scholar] [CrossRef]
Sastry, N. Forest fires, air pollution, and mortality in Southeast Asia. Demography 2002, 39, 1–23. [Google Scholar] [CrossRef]
Frankenberg, E.; McKee, D.; Thomas, D. Health consequences of forest fires in Indonesia. Demography 2005, 42, 109–129. [Google Scholar] [CrossRef] [PubMed]
Bowman, D.M.; Williamson, G.J.; Abatzoglou, J.T.; Kolden, C.A.; Cochrane, M.A.; Smith, A.M. Human exposure and sensitivity to globally extreme wildfire events. Nat. Ecol. Evol. 2017, 1, 0058. [Google Scholar] [CrossRef]
Zhong, M.; Fan, W.; Liu, T.; Li, P. Statistical analysis on current status of China forest fire safety. Fire Saf. J. 2003, 38, 257–269. [Google Scholar] [CrossRef]
Avila-Flores, D.; Pompa-Garcia, M.; Antonio-Nemiga, X.; Rodriguez-Trejo, D.A.; Vargas-Perez, E.; Santillan-Perez, J. Driving factors for forest fire occurrence in Durango State of Mexico: A geospatial perspective. Chin. Geogr. Sci. 2010, 20, 491–497. [Google Scholar] [CrossRef]
Lovreglio, R.; Leone, V.; Giaquinto, P.; Notarnicola, A. Wildfire cause analysis: Four case-studies in southern Italy. iForest 2010, 3, 8–15. [Google Scholar] [CrossRef]
Casbeer, D.W.; Beard, R.W.; McLain, T.W.; Li, S.-M.; Mehra, R.K. Forest fire monitoring with multiple small UAVs. In Proceedings of the 2005 American Control Conference, Portland, OR, USA, 8–10 June 2005; Volume 5, pp. 3530–3535. [Google Scholar] [CrossRef]
Wang, G.; Zhang, J.; Li, W.; Cui, D. A forest fire monitoring system based on GPRS and ZigBee wireless sensor network. In Proceedings of the 2010 5th IEEE Conference on Industrial Electronics and Applications, Taichung, Taiwan, 15–17 June 2010; pp. 1859–1862. [Google Scholar] [CrossRef]
Stula, M.; Krstinic, D.; Seric, L. Intelligent forest fire monitoring system. Inf. Syst. Front. 2012, 14, 725–739. [Google Scholar] [CrossRef]
Merino, L.; Caballero, F.; Martínez-de-Dios, J.R.; Maza, I.; Ollero, A. An Unmanned Aircraft System for Automatic Forest Fire Monitoring and Measurement. J. Intell. Robot. Syst. 2012, 65, 533–548. [Google Scholar] [CrossRef]
Aslan, Y.E.; Korpeoglu, I.; Ulusoy, Ö. A framework for use of wireless sensor networks in forest fire detection and monitoring. Comput. Environ. Urban Syst. 2012, 36, 614–625. [Google Scholar] [CrossRef]
Serna, M.Á.; Casado, R.; Bermúdez, A.; Pereira, N. Distributed Forest Fire Monitoring Using Wireless Sensor Networks. Int. J. Distrib. Sens. Netw. 2015, 11, 10. [Google Scholar] [CrossRef]
Dwiasnati, S.; Devianto, Y. Classification of forest fire areas using machine learning algorithm. World J. Adv. Eng. Technol. Sci. 2021, 3, 8–15. [Google Scholar] [CrossRef]
Pang, Y.; Li, Y.; Feng, Z.; Feng, Z.; Zhao, Z.; Chen, S.; Zhang, H. Forest Fire Occurrence Prediction in China Based on Machine Learning Methods. Remote Sens. 2022, 14, 5546. [Google Scholar] [CrossRef]
Dampage, U.; Bandaranayake, L.; Wanasinghe, R.; Kottahachchi, K.; Jayasanka, B. Forest fire detection system using wireless sensor networks and machine learning. Sci. Rep. 2022, 12, 46. [Google Scholar] [CrossRef] [PubMed]
Shao, Y.; Feng, Z.; Sun, L.; Yang, X.; Li, Y.; Xu, B.; Chen, Y. Mapping China’s Forest Fire Risks with Machine Learning. Forests 2022, 13, 856. [Google Scholar] [CrossRef]
Singh, K.R.; Neethu, K.P.; Madhurekaa, K.; Harita, A.; Mohan, P. Parallel SVM model for forest fire prediction. Soft Comput. Lett. 2021, 3, 100014. [Google Scholar] [CrossRef]
Abid, F. A survey of machine learning algorithms based forest fires prediction and detection systems. Fire Technol. 2021, 57, 559–590. [Google Scholar] [CrossRef]
Vicente, J.; Guillemant, P. An image processing technique for automatically detecting forest fire. Int. J. Therm. Sci. 2002, 41, 1113–1120. [Google Scholar] [CrossRef]
Yan, Q.; Bo, P.; Juanjuan, Z. Forest Fire Image Intelligent Recognition based on the Neural Network. J. Multimed. 2014, 9, 469–475. [Google Scholar]
Mahmoud, M.A.; Ren, H. Forest Fire Detection Using a Rule-Based Image Processing Algorithm and Temporal Variation. Math. Probl. Eng. 2018, 2018, 7612487. [Google Scholar] [CrossRef]
Wang, Y.; Dang, L.; Ren, J. Forest fire image recognition based on convolutional neural network. J. Algorithms Comput. Technol. 2019, 13, 1748302619887689. [Google Scholar] [CrossRef]
Jiao, Z.; Zhang, Y.; Xin, J.; Yi, Y.; Liu, D. Forest Fire Detection with Color Features and Wavelet Analysis Based on Aerial Imagery. In Proceedings of the 2018 Chinese Automation Congress (CAC), Xi’an, China, 30 November–2 December 2018; pp. 2206–2211. [Google Scholar] [CrossRef]
Jain, P.; Coogan, S.C.; Subramanian, S.G.; Crowley, M.; Taylor, S.; Flannigan, M.D. A review of machine learning applications in wildfire science and management. Environ. Rev. 2020, 28, 478–505. [Google Scholar] [CrossRef]
Dogan, A.; Birant, D. Machine learning and data mining in manufacturing. Expert Syst. Appl. 2021, 166, 114060. [Google Scholar] [CrossRef]
Singh, H.; Ang, L.M.; Lewis, T.; Paudyal, D.; Acuna, M.; Srivastava, P.K.; Srivastava, S.K. Trending and emerging prospects of physics-based and ML-based wildfire spread models: A comprehensive review. J. For. Res. 2024, 35, 135. [Google Scholar] [CrossRef]
Farid, A.; Alam, M.K.; Goli, V.S.N.S.; Akin, I.D.; Akinleye, T.; Chen, X.; Cheng, Q.; Cleall, P.; Cuomo, S.; Foresta, V.; et al. A Review of the Occurrence and Causes for Wildfires and Their Impacts on the Geoenvironment. Fire 2024, 7, 295. [Google Scholar] [CrossRef]
Liang, H.; Zhang, M.; Wang, H. A Neural Network Model for Wildfire Scale Prediction Using Meteorological Factors. IEEE Access 2019, 7, 176746–176755. [Google Scholar] [CrossRef]
KC, U.; Aryal, J.; Hilton, J.; Garg, S. A Surrogate Model for Rapidly Assessing the Size of a Wildfire over Time. Fire 2021, 4, 20. [Google Scholar] [CrossRef]
Xi, D.D.; Dean, C.B.; Taylor, S.W. Modeling the duration and size of wildfires using joint mixture models. Environmetrics 2021, 32, e2685. [Google Scholar] [CrossRef]
US Forest Service. Department of Agriculture. Science and Technology. Managing Fire, Fire Forecasting. Available online: https://www.fs.usda.gov/science-technology/managing-fire (accessed on 21 October 2024).
eustafor. European State Forest Association. Forest Fires in Europe: A Growing Challenge. Towards a Resilient Future. Available online: https://eustafor.eu/forest-fires-in-europe-a-growing-challenge/ (accessed on 21 October 2024).
Karpatne, A.; Ebert-Uphoff, I.; Ravela, S.; Babaie, H.A.; Kumar, V. Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng. 2017, 31, 1544–1554. [Google Scholar] [CrossRef]
Liu, Z.; Peng, C.; Work, T.; Candau, J.N.; DesRochers, A.; Kneeshaw, D. Application of machine-learning methods in forest ecology: Recent progress and future challenges. Environ. Rev. 2018, 26, 339–350. [Google Scholar] [CrossRef]
Finney, M.A. The challenge of quantitative risk analysis for wildland fire. For. Ecol. Manage. 2005, 211, 97–108. [Google Scholar] [CrossRef]
Xiao, H. Estimating Wildfire Duration Using Regression—Models. 2023. Available online: https://arxiv.org/pdf/2308.08936 (accessed on 21 October 2024).
Ben-Gal, I. Bayesian Networks. In Encyclopedia of Statistics in Quality and Reliability; Ruggeri, F., Kenett, R.S., Faltin, F.W., Eds.; Wiley Online Library: Hoboken, NJ, USA, 2008. [Google Scholar]
Koski, T.; Noble, J. Bayesian Networks: An Introduction; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
Friedman, N.; Linial, M.; Nachman, I.; Pe'er, D. Using Bayesian networks to analyze expression data. In Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, Tokyo, Japan, 8–11 April 2000; pp. 127–135. [Google Scholar]
Cai, B.; Huang, L.; Xie, M. Bayesian Networks in Fault Diagnosis. IEEE Trans. Ind. Inform. 2017, 13, 2227–2240. [Google Scholar] [CrossRef]
Barton, N.D.; Kuikka, S.; Varis, O.; Uusitalo, L.; Henriksen, H.J.; Borsuk, M.; de la Hera, A.; Farmani, R.; Johnson, S.; Linnell, J.D. Bayesian networks in environmental and resource management. Integr. Environ. Assess Manag. 2012, 8, 418–429. [Google Scholar] [CrossRef]
Sevinc, V.; Kucuk, O.; Goltas, M. A Bayesian network model for prediction and analysis of possible forest fire causes. For. Ecol. Manag. 2020, 457, 117723. [Google Scholar] [CrossRef]
Chen, F.; Jia, H.; Du, E.; Chen, Y.; Wang, L. Modeling of the cascading impacts of drought and forest fire based on a Bayesian network. Int. J. Disaster Risk Reduct. 2024, 111, 104716. [Google Scholar] [CrossRef]
Kim, B.; Lee, J. A Bayesian network-based information fusion combined with DNNs for robust video fire detection. Appl. Sci. 2021, 11, 7624. [Google Scholar] [CrossRef]
Webb, G.I.; Keogh, E.; Miikkulainen, R. Naïve Bayes. Encycl. Mach. Learn. 2010, 15, 713–714. [Google Scholar]
Ting, S.L.; Ip, W.H.; Tsang, A.H. Is Naive Bayes a good classifier for document classification. Int. J. Softw. Eng. Its Appl. 2011, 5, 37–46. [Google Scholar]
Chen, H.; Hu, S.; Hua, R.; Zhao, X. Improved naive Bayes classification algorithm for traffic risk management. EURASIP J. Adv. Signal Process. 2021, 30, 2021. [Google Scholar] [CrossRef]
Panda, M.; Patra, M.R. Network intrusion detection using naive bayes. Int. J. Comput. Sci. Netw. Secur. 2007, 7, 258–263. [Google Scholar]
Nugroho, A.A.; Iwan, I.; Azizah, K.I.N.; Raswa, F.H. Peatland Forest Fire Prevention Using Wireless Sensor Network Based on Naïve Bayes Classifier. KnE Soc. Sci. 2019, 3, 20–34. [Google Scholar]
Zainul, M.; Minggu, E. Classification of Hotspots Causing Forest and Land Fires Using the Naive Bayes Algorithm. Interdiscip. Soc. Stud. 2022, 1, 555–567. [Google Scholar] [CrossRef]
Karo, I.M.K.; Amalia, S.N.; Septiana, D. Wildfires Classification Using Feature Selection with K-NN, Naïve Bayes, and ID3 Algorithms. J. Softw. Eng. Inf. Commun. Technol. (SEICT) 2022, 3, 15–24. [Google Scholar] [CrossRef]
Shu, L.; Zhang, H.; You, Y.; Cui, Y.; Chen, W. Towards fire prediction accuracy enhancements by leveraging an improved naïve bayes algorithm. Symmetry 2021, 13, 530. [Google Scholar] [CrossRef]
Sperandei, S. Understanding logistic regression Analysis. Biochem. Medica 2014, 24, 12–18. [Google Scholar] [CrossRef]
Cabrera, A.F. Logistic regression analysis in higher education: An applied perspective. High. Educ. Handb. Theory Res. 1994, 10, 225–256. [Google Scholar]
Lawson, C.; Montgomery, D.C. Logistic Regression Analysis of Customer Satisfaction Data. Qual. Reliab. Eng. Int. 2006, 22, 971–984. [Google Scholar] [CrossRef]
Hu, Z.; Lo, C.P. Modeling urban growth in Atlanta using logistic regression. Comput. Environ. Urban Syst. 2007, 31, 667–688. [Google Scholar] [CrossRef]
Vilar del Hoyo, L.; Martín Isabel, M.P.; Martínez Vega, F.J. Logistic regression models for human-caused wildfire risk estimation: Analysing the effect of the spatial accuracy in fire occurrence data. Eur. J. Forest Res. 2011, 130, 983–996. [Google Scholar] [CrossRef]
De Bem, P.P.; de Carvalho Júnior, O.A.; Matricardi, E.A.T.; Guimarães, R.F.; Gomes, R.A.T. Predicting wildfire vulnerability using logistic regression and artificial neural networks: A case study in Brazil’s Federal District. Int. J. Wildland Fire 2018, 28, 35–45. [Google Scholar] [CrossRef]
Nhongo, E.J.S.; Fontana, D.C.; Guasselli, L.A.; Bremm, C. Probabilistic modelling of wildfire occurrence based on logistic regression, Niassa Reserve, Mozambique. Geomat. Nat. Hazards Risk 2019, 10, 1772–1792. [Google Scholar] [CrossRef]
Peng, W.; Wei, Y.; Chen, G.; Lu, G.; Ye, Q.; Ding, R.; Hu, P.; Cheng, Z. Analysis of Wildfire Danger Level Using Logistic Regression Model in Sichuan Province, China. Forests 2023, 14, 2352. [Google Scholar] [CrossRef]
Bishop, C. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
Cybenko, G. Approximation by superpositions of a sigmoidal Function. Math. Control. Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
Baldi, P.; Cranmer, K.; Faucett, T.; Sadowski, P.; Whiteson, D. Parameterized neural networks for high-energy physics. Eur. Phys. J. C 2016, 76, 1–7. [Google Scholar] [CrossRef]
Valdas, J.J.; Bonham-Carter, G. Time dependent neural network models for detecting changes of state in complex processes: Applications in earth sciences and astronomy. Neural Netw. 2006, 19, 196–207. [Google Scholar] [CrossRef]
Carleo, G.; Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 2017, 355, 602–606. [Google Scholar] [CrossRef]
Shirvany, Y.; Hayati, M.; Moradian, R. Multilayer perceptron neural networks with novel unsupervised training method for numerical solution of the partial differential equations. Appl. Soft Comput. 2009, 9, 20–29. [Google Scholar] [CrossRef]
Malek, A.; Beidokhti, R.S. Numerical solution for high order differential equations using a hybrid neural network—Optimization method. Appl. Math. Comput. 2006, 183, 260–271. [Google Scholar] [CrossRef]
Yadav, A.K.; Chandel, S.S. Solar radiation prediction using Artificial Neural Network techniques: A review. Renew. Sustain. Energy Rev. 2014, 33, 772–781. [Google Scholar] [CrossRef]
Topuz, A. Predicting moisture content of agricultural products using artificial neural networks. Adv. Eng. Softw. 2010, 41, 464–470. [Google Scholar] [CrossRef]
Escamilla-García, A.; Soto-Zarazúa, G.M.; Toledano-Ayala, M.; Rivas-Araiza, E.; Gastélum-Barrios, A. Applications of Artificial Neural Networks in Greenhouse Technology and Overview for Smart Agriculture Development. Appl. Sci. 2020, 10, 3835. [Google Scholar] [CrossRef]
Shen, L.; Wu, J.; Yang, W. Multiscale Quantum Mechanics/Molecular Mechanics Simulations with Neural Networks. J. Chem. Theory Comput. 2016, 12, 4934–4946. [Google Scholar] [CrossRef] [PubMed]
Manzhos, S.; Dawes, R.; Carrington, T. Neural network-based approaches for building high dimensional and quantum dynamics-friendly potential energy surfaces. Int. J. Quantum Chem. 2015, 115, 1012–1020. [Google Scholar] [CrossRef]
Wei, J.N.; Duvenaud, D.; Aspuru-Guzik, A. Neural Networks for the Prediction of Organic Chemistry Reactions. ACS Cent. Sci. 2016, 2, 725–732. [Google Scholar] [CrossRef]
Li, G.; Shi, J. On comparing three artificial neural networks for wind speed forecasting. Appl. Energy 2010, 87, 2313–2320. [Google Scholar] [CrossRef]
Falat, L.; Pancikova, L. Quantitative Modelling in Economics with Advanced Artificial Neural Networks. Procedia Econ. Financ. 2015, 34, 194–201. [Google Scholar] [CrossRef]
Namazi, M.; Shokrolahi, A.; Maharluie, M.S. Detecting and ranking cash flow risk factors via artificial neural networks technique. J. Bus. Res. 2016, 69, 1801–1806. [Google Scholar] [CrossRef]
Tkacz, G. Neural network forecasting of Canadian GDP growth. Int. J. Forecast. 2001, 17, 57–69. [Google Scholar] [CrossRef]
Baskin, I.I.; Winkler, D.; Tetko, I.V. A renaissance of neural networks in drug discovery. Expert Opin. Drug Discov. 2016, 11, 785–795. [Google Scholar] [CrossRef]
Bartzatt, R. Prediction of Novel Anti-Ebola Virus Compounds Utilizing Artificial Neural Network (ANN). World J. Pharm. Res. 2018, 7, 16. [Google Scholar]
Hossain, F.A.; Zhang, Y.; Yuan, C.; Su, C.Y. Wildfire Flame and Smoke Detection Using Static Image Features and Artificial Neural Network. In Proceedings of the 2019 1st International Conference on Industrial Artificial Intelligence (IAI), Shenyang, China, 23–27 July 2019; pp. 1–6. [Google Scholar] [CrossRef]
Lall, S.; Mathibela, B. The application of artificial neural networks for wildfire risk prediction. In Proceedings of the 2016 International Conference on Robotics and Automation for Humanitarian Applications (RAHA), Amritapuri, India, 18–20 December 2016; pp. 1–6. [Google Scholar] [CrossRef]
Sayad, Y.O.; Mousannif, H.; Al Moatassime, H. Predictive modeling of wildfires: A new dataset and machine learning approach. Fire Saf. J. 2019, 104, 130–146. [Google Scholar] [CrossRef]
Gao, K.; Feng, Z.; Wang, S. Using multilayer perceptron to predict forest fires in Jiangxi Province, Southeast China. Discret. Dyn. Nat. Soc. 2022, 1, 6930812. [Google Scholar] [CrossRef]
Bhargava, N.; Sharma, G.; Bhargava, R.; Mathuria, M. Decision tree analysis on j48 algorithm for data mining. Proc. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2013, 3, 1114–1119. [Google Scholar]
Kaur, G.; Chhabra, A. Improved J48 classification algorithm for the prediction of diabetes. Int. J. Comput. Appl. 2014, 98, 22. [Google Scholar] [CrossRef]
Sahu, S.; Mehtre, B.M. Network intrusion detection system using J48 Decision Tree. In Proceedings of the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Kochi, India, 10–13 August 2015; pp. 2023–2026. [Google Scholar]
Sakhare, N.N.; Joshi, S.A. Classification of criminal data using J48-Decision Tree algorithm. Int. J. Data Warehous. Min. 2015, 4, 167–171. [Google Scholar]
Abdullah, S.F.; Rahman, A.F.N.A.; Abas, Z.A.; Saad, W.H.M. Fingerprint gender classification using univariate decision tree (j48). Int. J. Adv. Comput. Sci. Appl. 2016, 7, 217–221. [Google Scholar]
Jehad, R.; Yousif, S.A. Fake news classification using random forest and decision tree (j48). Al-Nahrain J. Sci. 2020, 23, 49–55. [Google Scholar] [CrossRef]
Abid, F.; Izeboudjen, N. Predicting forest fire in algeria using data mining techniques: Case study of the decision tree algorithm. In Advanced Intelligent Systems for Sustainable Development (AI2SD’2019) Volume 4-Advanced Intelligent Systems for Applied Computing Sciences; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 363–370. [Google Scholar]
Breiman, L. Random forests. In Machine Learning; Springer: Berlin/Heidelberg, Germany, 2001; Volume 45, pp. 5–32. [Google Scholar]
Rigatti, S.J. Random forest. J. Insur. Med. 2017, 47, 31–39. [Google Scholar] [CrossRef]
Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2005, 26, 217–222. [Google Scholar] [CrossRef]
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792. [Google Scholar] [CrossRef]
Qi, Y. Random Forest for Bioinformatics. In Ensemble Machine Learning; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012. [Google Scholar] [CrossRef]
Xu, B.; Guo, X.; Ye, Y.; Cheng, J. An improved random forest classifier for text categorization. J. Comput. 2012, 7, 2913–2920. [Google Scholar] [CrossRef]
Farnaaz, N.; Jabbar, M.A. Random Forest Modeling for Network Intrusion Detection System. Procedia Comput. Sci. 2016, 89, 213–217. [Google Scholar] [CrossRef]
Latifah, A.L.; Shabrina, A.; Wahyuni, I.N.; Sadikin, R. Evaluation of Random Forest model for forest fire prediction based on climatology over Borneo. In Proceedings of the 2019 International Conference on Computer, Control, Informatics and Its Applications (IC3INA), Tangerang, Indonesia, 23–24 October 2019; pp. 4–8. [Google Scholar]
Malik, A.; Rao, M.R.; Puppala, N.; Koouri, P.; Thota, V.A.K.; Liu, Q.; Chiao, S.; Gao, J. Data-Driven Wildfire Risk Prediction in Northern California. Atmosphere 2021, 12, 109. [Google Scholar] [CrossRef]
Gao, C.; Lin, H.; Hu, H. Forest fire risk prediction based on random forest and backpropagation neural network of Heihe area in Heilongjiang province, China. Forests 2023, 14, 170. [Google Scholar] [CrossRef]
Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I.H. The WEKA data mining software: An update. ACM SIGKDD Explor. Newsl. 2009, 11, 10–18. [Google Scholar] [CrossRef]
Aher, S.B.; Lobo, L.M.R.J. Data mining in educational system using weka. In International conference on emerging technology trends. Found. Comput. Sci. 2011, 3, 20–25. [Google Scholar]
Hussain, S.; Dahan, N.A.; Ba-Alwib, F.M.; Ribata, N. Educational data mining and analysis of students’ academic performance using WEKA. Indones. J. Electr. Eng. Comput. Sci. 2018, 9, 447–459. [Google Scholar] [CrossRef]
Sigurdardottir, A.K.; Jonsdottir, H.; Benediktsson, R. Outcomes of educational interventions in type 2 diabetes: WEKA data-mining analysis. Patient Educ. Couns. 2007, 67, 21–31. [Google Scholar] [CrossRef]
Amin, M.N.; Habib, A. Comparison of different classification techniques using WEKA for hematological data. Am. J. Eng. Res. 2015, 4, 55–61. [Google Scholar]
Suthaharan, S. Support vector machine. In Machine Learning Models and Algorithms for Big Data Classification: Thinking with Examples for Effective Learning; Springer: Berlin/Heidelberg, Germany, 2016; pp. 207–235. [Google Scholar]
Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 1–27. [Google Scholar] [CrossRef]
Powell, M.J.D. A Tolerant Algorithm for Linearly Constrained Optimization Calculations. Math. Program. 1989, 45, 547–566. [Google Scholar] [CrossRef]
Liu, D.C.; Nocedal, J. On the Limited Memory BFGS Method for Large Scale Optimization. Math. Program. B 1989, 45, 503–528. [Google Scholar] [CrossRef]
Park, J.; Sandberg, I.W. Universal Approximation Using Radial-Basis-Function Networks. Neural Comput. 1991, 3, 246–257. [Google Scholar] [CrossRef]
Gropp, W.; Lusk, E.; Doss, N.; Skjellum, A. A high-performance, portable implementation of the MPI message passing interface standard. Parallel Comput. 1996, 22, 789–828. [Google Scholar] [CrossRef]
Chandra, R. Parallel Programming in OpenMP; Morgan Kaufmann: Cambridge, MA, USA, 2001. [Google Scholar]

Figure 1. Machine learning techniques used in wildfire management.

Figure 2. Burned area in Greece for the last ten years.

Figure 3. The steps of the preprocessing that were applied to the original datasets.

Figure 4. The used dataset after the preprocessing steps.

Figure 5. Box plot for the used machine learning techniques.

Figure 6. Statistical comparison between the Random Forest method and the other machine learning methods.

Figure 7. Statistical test for the experiment with different numbers of trees using the Random Forest technique.

Figure 8. Statistical comparison for using the methods MLP, RBF, and Random Forest to predict the duration of fires.

Table 1. Experimental results using various machine learning models for 10 years of observations. The numbers in cells denote average classification error as measured on the test set.

Year	BAYESNET	NAIVEBAYES	LOGISTIC	MLP	J48	RANDOMFOREST
2014	11.44%	12.89%	9.81%	11.37%	10.04%	9.42%
2015	11.08%	11.26%	9.53%	10.65%	9.51%	8.95%
2016	25.71%	13.00%	3.41%	3.90%	3.65%	3.00%
2017	11.04%	11.51%	9.48%	10.08%	10.30%	9.29%
2018	11.20%	10.46%	9.09%	9.48%	9.27%	8.58%
2019	9.61%	9.25%	8.29%	8.53%	9.08%	8.01%
2020	18.00%	6.72%	5.54%	5.97%	6.09%	5.50%
2021	12.35%	14.15%	12.04%	13.59%	13.59%	11.92%
2022	10.25%	9.62%	9.01%	9.47%	9.04%	8.93%
2023	9.74%	9.19%	8.26%	8.77%	8.39%	7.66%
Average	13.04%	10.81%	8.45%	9.18%	8.90%	8.13%

Table 2. Precision and recall for every machine learning method.

YEAR	BAYESNET PRECISION	BAYESNET RECALL	NAIVEBAYES PRECISION	NAIVEBAYES RECALL	LOGISTIC PRECISION	LOGISTIC RECALL	MLP PRECISION	MLP RECALL	J48 PRECISION	J48 RECALL	RANDOMFOREST PRECISION	RANDOMFOREST RECALL
2014	0.889	0.886	0.851	0.871	0.89	0.902	0.87	0.886	0.888	0.90	0.898	0.906
2015	0.892	0.889	0.87	0.887	0.891	0.905	0.876	0.893	0.892	0.905	0.901	0.91
2016	0.959	0.743	0.959	0.87	0.96	0.966	0.955	0.961	0.959	0.963	0.968	0.97
2017	0.897	0.89	0.869	0.885	0.893	0.905	0.886	0.889	0.886	0.897	0.899	0.907
2018	0.897	0.888	0.879	0.895	0.894	0.909	0.89	0.905	0.894	0.907	0.905	0.914
2019	0.914	0.904	0.894	0.907	0.903	0.917	0.903	0.915	0.898	0.909	0.912	0.92
2020	0.929	0.82	0.923	0.933	0.937	0.945	0.933	0.94	0.931	0.939	0.94	0.945
2021	0.879	0.876	0.835	0.858	0.865	0.88	0.846	0.864	0.849	0.864	0.871	0.881
2022	0.91	0.897	0.889	0.904	0.893	0.91	0.891	0.905	0.896	0.91	0.9	0.911
2023	0.912	0.903	0.894	0.908	0.905	0.917	0.899	0.912	0.906	0.916	0.916	0.923

Table 3. Experimental results using different numbers of trees for the Random Forest technique.

Year	10 Trees	20 Trees	50 Trees	100 Trees	150 Trees	200 Trees
2014	10.05%	9.66%	9.59%	9.43%	9.24%	9.25%
2015	9.90%	9.44%	9.16%	8.95%	9.07%	9.00%
2016	3.34%	3.15%	2.97%	3.00%	3.01%	2.99%
2017	10.07%	9.54%	9.43%	9.30%	9.36%	9.25%
2018	9.27%	8.76%	8.52%	8.58%	8.64%	8.67%
2019	8.81%	8.18%	7.95%	8.02%	7.98%	7.89%
2020	6.11%	5.78%	5.53%	5.51%	5.58%	5.57%
2021	12.26%	11.67%	11.74%	11.92%	11.77%	11.77%
2022	9.43%	9.44%	9.14%	8.94%	8.86%	8.86%
2023	8.47%	7.85%	7.61%	7.67%	7.43%	7.43%

Table 4. Comparison of Random Forest against other machine learning models.

Year	Random Forest	SVM	MLP10_BFGS	MLP10_LBFGS	RBF10
2014	9.43%	14.19%	11.61%	10.28%	13.76%
2015	8.95%	13.08%	10.86%	9.71%	12.34%
2016	3.00%	5.03%	4.36%	3.61%	7.25%
2017	9.30%	13.20%	10.60%	9.77%	12.87%
2018	8.58%	11.50%	10.16%	9.15%	11.12%
2019	8.02%	10.67%	9.27%	8.26%	9.83%
2020	5.51%	8.18%	6.69%	5.62%	10.39%
2021	11.92%	17.98%	14.12%	12.71%	18.17%
2022	8.94%	11.16%	9.86%	8.98%	11.05%
2023	7.67%	11.37%	9.55%	8.63%	11.53%
Average	8.13%	11.64%	9.71%	8.67%	11.83%

Table 5. Experiments with different numbers of processing nodes for the artificial neural network case. The BFGS optimization method was used to train the neural network.

Year	MLP2_BFGS	MLP5_BFGS	MLP10_BFGS	MLP15_BFGS	MLP20_BFGS
2014	12.88%	12.40%	11.61%	11.58%	11.43%
2015	11.78%	10.98%	10.86%	12.71%	12.66%
2016	5.08%	4.40%	4.36%	4.13%	4.14%
2017	12.14%	10.71%	10.60%	10.80%	10.46%
2018	10.46%	10.28%	10.16%	10.04%	9.90%
2019	9.61%	9.21%	9.27%	9.12%	9.01%
2020	7.49%	6.67%	6.69%	6.36%	6.40%
2021	16.29%	14.82%	14.12%	14.05%	13.95%
2022	10.39%	10.06%	9.86%	9.67%	9.63%
2023	10.51%	9.96%	9.55%	9.45%	9.47%
Average	10.66%	9.95%	9.71%	9.79%	9.71%


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
