1. Introduction
Forest fires start suddenly and are difficult to extinguish, so risk assessment is particularly important for forest fire management [1]. At present, forest fire risk assessment draws on a wide range of data sources, including remote sensing data, in situ detection data, and basic geographic data [2,3,4]. While this variety of data sources improves the potential of forest fire risk assessment, it also introduces problems such as data heterogeneity and semantic gaps [5]. The growing complexity of the data has raised the level of knowledge that data users need in order to judge fire risk.
In response, ontology, a popular knowledge engineering technology widely accepted as “an explicit specification of a conceptualization” [6], has been introduced into forest fire risk assessment [7,8,9]. An ontology can express concepts and their relationships in a structured way that both humans and computers can understand. Rules are another way of expressing knowledge [10], and they use the ontology as the basis for expressing concepts. Rule reasoning can uncover relationships between concepts, and the combination of ontology and rules can achieve efficient knowledge expression and sharing [11,12].
Existing knowledge-based studies [7,8,9] have focused on using ontology and rules to assess forest fire risk, demonstrating the effectiveness of ontology and rules in forest fire management. However, in these studies, some rules rely on complex and highly manual fire index methods [7], some use heuristic algorithm-based methods [8], and some use experimental or illustrative rules [9]. The process of mining rules has been overlooked to a certain extent, and the proposed methods struggle to meet the need for interpretable and scalable knowledge mining.
Association rule mining is a type of data mining that can uncover important and reliable rules between attributes in databases [13]. Representative association rule algorithms include Apriori [14] and Eclat [15]: Apriori scans the database more times and occupies less memory, while Eclat scans the database only once but occupies more memory. Considering the volume of forest fire datasets, the Apriori algorithm [16,17,18] is used more frequently than the Eclat algorithm in forest fire rule mining. However, the Apriori algorithm is designed to mine relationships between discrete data, whereas forest fire risk assessment must process continuous data such as temperature. In that case, the Apriori algorithm may fail to mine the rules that users are interested in, a problem overlooked in previous research [16,17,18]. For continuous data such as temperature, forest fire risk can increase as the numerical value increases. Suppose the mining results show a high fire risk in a certain scenario at 20 °C; then, with all other conditions unchanged, a similar scenario at 40 °C should have a higher fire risk. However, because a temperature of 40 °C is rare, the importance of this rule (which the Apriori algorithm measures by the frequency of the data) is much lower than that of similar rules at 20 °C, which may lead to the neglect of important rules under extreme scenarios. Therefore, the Apriori algorithm needs to be improved when it is used for forest fire rule mining to avoid this kind of neglect.
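The effect described above can be illustrated with a minimal Python sketch. It is not the mining procedure used in this article: the temperature thresholds, the toy records, and the minimum support value are all invented for illustration, and only the standard definitions of support and confidence are assumed.

```python
# Hypothetical temperature thresholds in deg C (upper bound, label);
# these values are illustrative and are not taken from the article.
TEMP_BINS = [(15.0, "Temp_low"), (25.0, "Temp_medium"), (35.0, "Temp_high")]


def temp_label(celsius):
    """Map a continuous temperature to a discrete item label."""
    for upper, label in TEMP_BINS:
        if celsius < upper:
            return label
    return "Temp_very_high"


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Invented toy records: (temperature in deg C, fire risk label).
records = (
    [(20.0, "Risk_high")] * 6
    + [(20.0, "Risk_low")] * 4
    + [(10.0, "Risk_low")] * 9
    + [(40.0, "Risk_high")] * 1
)
transactions = [{temp_label(t), risk} for t, risk in records]

MIN_SUPPORT = 0.1  # hypothetical pruning threshold

for temp in ("Temp_medium", "Temp_very_high"):
    s = support(transactions, {temp, "Risk_high"})
    c = confidence(transactions, {temp}, {"Risk_high"})
    kept = s >= MIN_SUPPORT
    print(f"{temp} -> Risk_high: support={s:.2f}, confidence={c:.2f}, kept={kept}")
```

With these toy numbers, the 40 °C rule has a confidence of 1.00 but a support of only 0.05, so a frequency-based threshold of 0.1 discards it, while the 20 °C rule (support 0.30, confidence 0.60) is kept.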
Researchers have done some work to improve the Apriori algorithm’s support for continuous data, for example by using K-means [19], distribution probability [20], and membership functions [21]. However, these works did not pay sufficient attention to rules in extreme scenarios, which are extremely important in forest fire risk assessment.
In summary, combining the expressive power of ontology with the Apriori algorithm allows the mined rules to be used effectively in the forest fire assessment process. A unified and standardized expression platform is conducive to the automated dissemination and sharing of knowledge. However, the potential of this combination is limited by current association rule mining algorithms.
This article uses the ICAA (Improved Continuous Apriori Algorithm), a new improved Apriori algorithm that can handle continuous data, to mine forest fire data and achieve automated knowledge generation, thereby alleviating the neglect of extreme scenario rules by rule mining algorithms in knowledge-based fire risk assessment. Furthermore, ontology technology is introduced as a unified and standardized expression platform to standardize the semantics of heterogeneous data. The constructed ontology is combined with the forest fire rules generated by the algorithm to achieve knowledge reasoning, thereby improving the automation level of forest fire risk assessment and reducing the knowledge requirements for data users.
The architecture of this article is shown in Figure 1.
4. Discussion
In this paper, we propose the ICAA, an improved Apriori algorithm for enhancing forest fire risk assessment. Through rule mining and ontology-based reasoning on forest fire instance data from the Bejaia region of Algeria, the advantages and usability of the methodology proposed in this article have been demonstrated.
The ontology constructed in this article can provide a scalable and standardized expression platform for forest fire risk assessment. The ICAA rules can be combined with SWRL rules to achieve automated reasoning based on the constructed ontology, and the results are written back to the ontology. This improves the automation level of knowledge in forest fire risk assessment, reduces the knowledge requirements for users, and enables semantic knowledge and observation data to better support forest fire risk assessment work.
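As a rough illustration of how a mined rule might be rewritten in SWRL's human-readable syntax, consider the Python sketch below. The class name `Observation`, the variable `?o`, and the `has<Prefix>` property naming scheme are hypothetical placeholders and do not reflect the vocabulary of the ontology constructed in this article.

```python
def parse_rule(rule_text):
    """Split 'A ^ B -> C' into (antecedent items, consequent items)."""
    left, right = rule_text.split("->")
    antecedent = [item.strip() for item in left.split("^")]
    consequent = [item.strip() for item in right.split("^")]
    return antecedent, consequent


def item_to_atom(item, var="?o"):
    """Turn an item such as 'Temp_high' into a SWRL-style property atom.

    The 'has<Prefix>' property name is a hypothetical naming scheme.
    """
    prefix = item.split("_", 1)[0]
    return f"has{prefix}({var}, {item})"


def to_swrl(rule_text, cls="Observation", var="?o"):
    """Format a mined rule as a SWRL-style string (class name is assumed)."""
    antecedent, consequent = parse_rule(rule_text)
    body = [f"{cls}({var})"] + [item_to_atom(i, var) for i in antecedent]
    head = [item_to_atom(i, var) for i in consequent]
    return " ^ ".join(body) + " -> " + " ^ ".join(head)


print(to_swrl("Temp_high ^ Rain_none -> Risk_high"))
# Observation(?o) ^ hasTemp(?o, Temp_high) ^ hasRain(?o, Rain_none) -> hasRisk(?o, Risk_high)
```

A string of this shape could then be entered into a SWRL-capable editor and evaluated by a reasoner; the actual mapping from mined items to ontology properties depends on the ontology design.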
Compared with the raw Apriori algorithm, the ICAA can better handle continuous data association rule mining for forest fire risk assessment. The ICAA has the following advantages:
The mining results of the ICAA include all the rules mined with the raw Apriori algorithm, which shows that the ICAA is an incremental extension of the raw Apriori algorithm.
Due to the increased support for small probability events such as “Temp_very_high”, the number of candidate rules that meet the support threshold increased, resulting in a 191.67% increase in the number of generated rules.
The raw Apriori algorithm discovered “Temp_medium ^ Rain_none -> Risk_high”, but did not find “Temp_high ^ Rain_none -> Risk_high” and “Rain_none ^ Temp_very_high -> Risk_high”. All three rules were discovered by the ICAA, and their support increased sequentially, which is in line with prior knowledge. The same pattern holds for the combinations of “Temp_high” and “RH”, and of “Rain_none” and “RH”. Compared with the raw Apriori algorithm, the ICAA increases the number of mined rules and improves their support and confidence, as shown in Figure 7.
Subfigure (a) shows the combination of “Rain_none” and “Temp”. In the absence of rainfall, the raw Apriori algorithm only outputs two rules: “Temp_medium” and “Temp_high”. Although the confidence of “Temp_high” is higher than that of “Temp_medium”, its support is lower. This is because “Temp_high” appears fewer times, and “Temp_very_high” appears too few times to be found with the raw Apriori algorithm. The ICAA improves this situation. Due to the extension of temperature in the ICAA, a large amount of low-risk data in the low-temperature region has reduced the support of “Temp_medium” (by increasing the denominator of the support calculation), while in the high-temperature region the support and confidence of “Temp_high” and “Temp_very_high” have increased. Subfigures (b) and (c) reflect the same situation, and these data represent the possible fire risk under extreme scenarios. Considering their extremely high importance, they cannot be discarded as redundant rules. These three examples indicate that, in continuous data association rule mining for forest fire risk assessment, the raw Apriori algorithm’s neglect of extreme scenario rules is widespread, and the ICAA alleviates this neglect to a certain extent.
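The remark about the denominator of the support calculation can be made concrete with a small Python sketch. The counts below are invented, and the snippet only demonstrates the standard definitions of support and confidence; it does not reproduce the ICAA's extension step.

```python
def rule_stats(n_total, n_antecedent, n_both):
    """Return (support, confidence) of a rule from raw record counts.

    n_total:      number of records in the dataset
    n_antecedent: records matching the rule's antecedent
    n_both:       records matching antecedent and consequent together
    """
    support = n_both / n_total
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented counts for a rule such as Temp_medium ^ Rain_none -> Risk_high.
before = rule_stats(n_total=100, n_antecedent=40, n_both=30)

# Adding 50 hypothetical low-risk records that do not match the antecedent
# enlarges only the denominator of the support calculation.
after = rule_stats(n_total=150, n_antecedent=40, n_both=30)

print(f"before: support={before[0]:.2f}, confidence={before[1]:.2f}")
print(f"after:  support={after[0]:.2f}, confidence={after[1]:.2f}")
```

In this toy case, the support falls from 0.30 to 0.20 while the confidence stays at 0.75, because the added records change the dataset size but not the counts that the rule itself matches.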
The comparison of the ICAA, the raw Apriori algorithm, and three other improved Apriori algorithms in Table 9 shows that the neglect of rules in extreme scenarios also exists to varying degrees in the other improved algorithms. The ICAA pays substantially more attention to rules under extreme scenarios. Because the algorithm is applied to forest fire rule mining, where rules under extreme scenarios are very important, this supports the superiority of the ICAA.
5. Conclusions
This study provides an improved association rule mining algorithm, ICAA, which enhances the adaptability of the Apriori algorithm to continuous data by introducing prior knowledge to classify the input data. In addition, the rules obtained from mining are combined with ontology to implement semantic reasoning, reducing the knowledge requirements for users in forest fire risk assessment.
Based on the ICAA and the designed ontology, an ontology and data individuals were constructed with the ontology management software Protégé for a forest fire dataset from the Bejaia region of northeastern Algeria. The proposed ICAA was used to mine rules from the dataset, and the mined rules were combined with the constructed ontology in the form of SWRL specifications, achieving accurate automated reasoning.
The results show that the ICAA outperforms the raw Apriori algorithm: the number of generated rules increased by 191.67%, and more attention is paid to extreme scenario rules for continuous data in association rule mining. The output rules can be integrated with the constructed ontology to achieve automated semantic reasoning in support of forest fire risk assessment.
The work presented can be used to enhance forest fire risk assessment, contribute to the generation and sharing of forest-fire-related knowledge, and alleviate the problem of insufficient knowledge in forest fire risk assessment. It can also be extended to forest fire risk management in other regions or to other forest fire management services, such as forest fire spread management and forest fire point identification.