1. Introduction
Forest surface fuels are considered the most complex fuel types in fire management and their detailed physical description is required by the majority of fire behavior prediction systems, such as FARSITE, FlamMap, and BEHAVE [
1,
2,
3]. Surface fuels incorporate all live and dead biomass between the ground and ladder fuels, being responsible for surface fire initiation and spread (i.e., litter, herbaceous vegetation, shrubs and trees less than 1.8 m, branches, twigs, and logs) [
4,
5,
6,
7,
8,
9]. One of the most significant surface fuel components is surface fuel load (SFL), which is defined as the dry weight of the surface fuel layer per unit area (kg/m
2) [
6]. SFL is used as an input variable in almost all fire-related applications, since it significantly influences many aspects of fire behavior, such as fireline intensity and rate of spread [
5,
6,
10,
11]. SFL quantification can also contribute to wildland fire prevention, e.g., through treatment of hazardous fuels, maintenance of low-risk forest understory, fuel connectivity estimation, and smoke emissions assessment [
6,
12,
13]. The load of individual surface fuel types (SFTs) is also critical because each type has a generally different effect on fire behavior; therefore, SFTs are incorporated in different surface fuel models. Such models require a numerical description of the physical parameters characterizing each fuel type, including load, which is used in fire danger and behavior prediction systems [
14]. A prominent example is litter and fine woody debris (FWD) less than 0.64 cm in diameter, which dry very quickly and can be easily ignited, whereas woody fuels of larger dimensions are less susceptible to ignition and combustion [
6,
15].
Despite the pivotal role it plays in forest fire management, SFL quantification still remains a challenge due to its high spatial and temporal variability [
16]. Although the traditional field measurements constitute the most accurate SFL quantification method, the amount of labor, time, and cost involved necessitates the investigation of alternative SFL estimation methods critically important [
17,
18]. Remote sensing technology has been used as a cost-efficient means for monitoring a variety of forest fuels parameters, including SFL [
19,
20,
21,
22,
23,
24,
25,
26,
27].
During the last decades, the potential of direct and reliable SFL prediction has been examined using passive optical and active remote sensing sensors. Arellano-Perez et al. employed Sentinel-2 data and two non-parametric regression approaches, namely random forest (RF) and multivariate adaptive regression splines, to estimate SFL of pure even-aged pine stands in the northwest of Spain [
28]. The results showcased quite low goodness-of-fit statistics, with the coefficient of determination (R
2) ranging between 0.02 and 0.12. Stand characteristics (i.e., mean stand tree height and mean stand diameter at breast height) and satellite data, namely QuickBird and Landsat TM imagery, were used by Jin and Chen for SFL estimation in a larch-dominated forest of China [
29]. They performed linear regression analysis and the explained variance (R
2) of the resulting predictive models ranged from 0.05 to 0.57. In 2004, Reich et al. developed regression models for SFL estimation in South Dakota of the USA using Landsat TM, topographic, and forest class variables [
8]. Cross-validation results indicated that the predictive models yielded R
2 values up to 0.71.
The frequently encountered difficulty in retrieving SFL from optical data is attributed to their restriction to the horizontal level and the subsequent inability to provide information about the vertical profile of a forest, especially a closed-canopy one [
30,
31,
32,
33,
34]. On the contrary, the canopy penetration capabilities of Light Detection and Ranging (LiDAR) sensors provide additional three-dimensional information about all vegetation layers of the forest, including surface fuels [
35,
36,
37,
38,
39,
40,
41,
42,
43,
44,
45,
46].
Several studies have examined the potential of LiDAR remote sensing in SFL prediction, especially during the last two decades. Bright et al. explored the individual and combined use of airborne LiDAR and Landsat time series for SFL estimation in Colorado, USA, through random forest modeling [
47]. The results showed that the integration of Landsat and LiDAR data presented minor to moderate increase in the prediction accuracy compared to the individual use of LiDAR which explained 0.24–0.32 of SFL variation. Chen et al. employed both airborne and terrestrial LiDAR for SFL estimation in an eucalypt open forest of Australia [
48]. They applied multiple regression analysis and the developed models presented R
2 values up to 0.89. Price and Gordon estimated SFL in dry sclerophyll forests of Australia via linear regression analysis, resulting in a model of rather low predictive accuracy (R
2 = 0.24). Airborne LiDAR and multiple regression analysis were also employed by Hudak et al. for SFL quantification in Florida, USA, leading to the construction of a predictive model which explained 0.44 of variance [
49]. Jakubowski et al. used airborne LiDAR, optical data, and their combination in order to estimate various fuel variables, including SFL, of the Tahoe National Forest in the USA [
50]. The authors employed multiple linear regression, additive regression, and support vector machine (SVM) models, all of which yielded R
2 values lower than 0.5. A new approach was introduced by van Aardt et al. for coarse woody debris (CWD) load estimation in Daniel Boone National Forest, USA [
51]. Stepwise linear regression and LiDAR negative height residuals (i.e., LiDAR returns that are “below-ground” after subtraction of the DEM) were employed, resulting in models with adjusted R
2 of 0.69 to 0.99 (albeit without cross-validation). CWD load was also estimated by Seielstad and Queen using airborne LiDAR data covering two watersheds in the Tenderfoot Forest, USA [
52]. The authors assessed the obstacle density metric below 1.8-m height, yielding an estimation accuracy of approximately 60%.
As evidenced by the abovementioned studies, LiDAR data do not always provide accurate SFL evaluations despite their three-dimensional properties. They present limitations, such as attenuation by dense foliage and possible point positioning errors, which can hinder near-surface load assessment [
53,
54,
55,
56]. Moreover, different sensor characteristics, forest composition, and topography conditions can lead to variations in the laser penetration depth and, hence, the results in terms of the most suitable SFL prediction method and derived accuracy [
24,
34,
57]. The use of full-waveform or discrete return LiDAR data can also define the most appropriate analysis approach to be adopted and its predictive performance. For instance, although full-waveform LiDAR data provide more information compared to the discrete return systems, they are rarely available due to higher cost and/or storage capacity requirements [
58,
59]. The same applies to the ability of using pulse intensity information, which improves the SFL estimation accuracy but requires its prior normalization, especially in topographically complex terrains [
47,
60,
61]. Nevertheless, this is not always possible since the normalization of LiDAR intensity values requires plane trajectory information, which may be unavailable to the final user [
47].
The supplementary capability of multispectral LiDAR in providing intensity information over multiple wavelengths has been successfully employed in a number of forest studies (e.g., Reference [
62,
63,
64,
65,
66]). Given that multispectral LiDAR is still an emerging technology, current literature mainly deals with land cover and species mapping [
62,
64,
65,
67,
68,
69,
70,
71,
72] but—to the best of our knowledge—has not yet been used for SFL estimation. Multispectral LiDAR data were used in Reference [
66], in order to derive an active normalized burn ratio in a boreal peatland ecosystem. The results suggested that multispectral intensity information is useful in the examination of forest understory. The estimation of canopy fuel parameters was addressed in Reference [
73], where a comparative analysis of unispectral and multispectral LiDAR data was performed for the prediction of canopy fuel weight, canopy base height, vegetation height and biomass through regression analysis. The results of this research highlighted the usefulness of multispectral LiDAR compared to the unispectral LiDAR data in the examination of forest fuel parameters. Lopes Queiroz et al. [
74] focused on the volume estimation of CWD applying generalized additive models (GAMs) and multispectral LiDAR information yielding high accuracy of R
2 = 0.72.
With respect to the SFL estimation methods, most relevant studies follow a multiple regression analysis approach. Although Jakubowski et al. have stated that complex fuel components, such as fuel load, are best predicted through more sophisticated analysis (e.g., SVM and RF modeling), such machine learning methods require a substantially larger sample size, which is extremely time-consuming and labor-intensive to acquire [
75,
76,
77]. Additionally, the simplicity and higher interpretability of linear models renders multiple regression analysis a fundamental and the most widely used tool for the prediction of forest parameters—including SFL—using LiDAR-derived variables [
77,
78].
In order to address the literature gap in the use of multispectral LiDAR and SFL estimation, the aim of the present study was to investigate the potential of high-density, discrete return multispectral LiDAR data in estimating the SFL of an uneven-aged structured hybrid fir forest (Abies borisii regis), characterized by dense canopy and topographically complex terrain. We employed multiple linear regression analysis to examine the contributions of structural and intensity variables obtained from multispectral LiDAR to the reliable prediction of total surface fuel loads and individual SFTs.
4. Results
The implementation of SA variable selection and linear regression analysis resulted in the generation of different models—including three or four independent variables—for SFTL and TSFL prediction, among which the ones with the highest predictive accuracy were eventually selected. The computed goodness-of-fit statistics (described in
Section 3.3) indicated that the use of four predictors in the regression model, instead of three, led to more reliable models in terms of accuracy. Therefore, four predictors participated in all selected regression models presented below.
The selection of the appropriate number of predictor variables to be employed in the models was based on the general rule of thumb of one predictor per ten observations. In order to further support this choice, we devised an experiment where the regression models were built using an increasing number of predictors ranging from one to ten (including both structural and intensity metrics).
Figure 3 illustrates an example of the LOOCV results (i.e., R
2 and rRMSE) for the TSFL models. More specifically,
Figure 3a shows the increase in the R
2 of the models resulting from the sequential addition of one independent variable while
Figure 3b depicts the training and validation rRMSE of the respective models highlighting the change in their numerical difference according to the participating number of predictors. The inclusion of additional predictor variables increases the validation R
2 (up to 0.9). However, the difference between training and validation rRMSE also increases, which is an indication of possible overfitting. The models’ accuracy still increases for more than four predictors but at a slower pace than for less than four predictors. Similar results were obtained for the regression models constructed for the individual SFTs and, therefore, they are not presented here for the sake of brevity. Hence, our decision to select four predictors for the final models balances between the rule of thumb of approximately 10 samples per 1 predictor and the experimental observation that the highest rate of accuracy increase (relatively to the simplest model with a single input variable) is achieved with four predictors.
Table 4,
Table 5 and
Table 6 present the goodness-of-fit statistics derived from the LOOCV process for the models developed in the present study (i.e., produced using only structural, only intensity metrics and both)
Table 7 includes the predictor variables selected by SA in the most accurate models, as indicated by LOOCV. The scatterplots of the observed versus predicted load values and the residuals plots for the latter regression models (
Table 7) are illustrated in
Figure 2 and
Figure 3, respectively.
The goodness-of-fit statistics presented in
Table 4 showcase that R
2 values ranging from 0.44 to 0.58 (26.79–48.56% rRMSE) could be achieved by the models developed with the use of only structural metrics for the TSFL, litter, 1-h and 10-h fuels load. Grass load was the most accurately estimated variable in this case with the respective model yielding an R
2 of 0.71 and 36.61% rRMSE. In the case of using only intensity metrics as predictor variables, the performance of the developed models was lower with R
2 values of 0.39–0.55 (23.84–50.36% rRMSE) (
Table 5).
Table 6 indicates that the use of both structural and intensity metrics resulted in models with higher estimation performance with R
2 values of 0.59–0.71. In particular, the TSFL and grass/forbs models were the most accurate, both reaching an R
2 of 0.71 and rRMSE of 19.34% and 36.33%, respectively. Litter load was predicted less accurately with 0.69 R
2 and 37.59% rRMSE. The models with the lowest predictive accuracy were the ones of FWD (i.e., 1-h and 10-h fuels). In particular, 59% of the 1-h fuels load variance was explained with an rRMSE of 28.89% while the cross-validation of the 10-h fuels model revealed an R
2 of 0.6 and 31.26% rRMSE. All models presented negative rbiases ranging between −1.4 and −0.26 indicating an overestimation of the field-measured load values.
A comparison of the models developed by the different sets of LiDAR metrics (
Table 4,
Table 5 and
Table 6) revealed the need for synergy between the height distribution and intensity metrics for reliable SFL prediction except for the case of grass/forbs load estimation, in which, based on the results presented in
Table 4 and
Table 6, the use of only height distribution metrics is adequate.
Table 7 showcases that metrics related to both lower and upper fuel layers were used in the models, which indicates that LiDAR data can directly interact with the surface fuel layers and indirectly associate the SFL with the canopy characteristics of the study area, as well. In addition, GR and NIR channels are used—either separately or together—in the estimation of all SFTLs, except for the grass/forbs load.
A comparison of the aforementioned variations explained by the most accurately developed models (
Table 6) can be also performed by an examination of the observed versus predicted plots depicted in
Figure 4. The models’ fits to the field-measured load values are similar with the hypothetical 1:1 relationship (
Figure 4). Regarding the residual distributions presented in
Figure 5, a fair homogeneity can be observed without displaying any strong distribution patterns or obvious systematic trends.
5. Discussion
In the present study, SA and multiple linear regression analysis were implemented for the prediction of TSFL and SFTL (i.e., litter, grass/forbs, 1-h, and 10-h fuels load) using high-density multispectral LiDAR data in a dense forest over sloped terrain. LiDAR-derived height and single-return intensity distribution metrics were used individually and combined as potential independent variables for the construction of fifteen regression models (with four predictors each). Following the LOOCV approach, the goodness-of-fit statistics showcased that the incorporation of LiDAR intensity information to the regression analysis workflow leads to the construction of models with higher estimation performance compared to the ones including only structural metrics. Overall, the results indicate a significant contribution of the multispectral LiDAR data to reliably predict SFL even in dense canopy and topographically complex conditions.
With respect to the selection of the number of independent variables in the models, the use of more predictors will most likely lead to increased accuracy in terms of explained variance. Nonetheless, the risk of overfitting the regression models increases when more independent variables are included. A number of rules of thumb has been proposed in the literature for the estimation of a minimum acceptable sample size that are based on various factors, such as the nature and structure of the data, number of predictors to be used, and complexity of the predictive model to be constructed [
93,
94,
95,
96,
97]. Given that a large sample size requires extreme workload and is extremely costly, we collected 35 samples from our study area. Hence, we examined three and four predictors in the model following the general rule of thumb of 10 samples per 1 predictor [
93]. Yet, different number of predictor variables were also tested (see
Figure 3 and the associated text in
Section 4), which showcased that the highest rate of accuracy increase was exhibited up to four variables, after which the difference between training and validation error started increasing. Our suggestion is, therefore, to respect the 10 samples per predictor empirical rule. Eventually, the more ground truth samples are available, the better the control against overfitting a modeler has.
As for the results of the regression analysis, the load variation explained by the models employing structural or intensity metrics alone was 0.44–0.71 (26.79–48.56% rRMSE) and 0.39–0.55 (23.84–50.36% rRMSE), respectively. The 0.71 R
2 was actually achieved by the grass/forbs model while the models for the other SFTs revealed R
2 values lower than 0.6. The incorporation of the pooled metrics in the SA process resulted in more accurate results, based on the load variation explained by the regression models which ranged from 0.59 to 0.71 with rRMSE between 19.34% and 37.59% (
Table 6). The loads of grass/forbs and total surface fuels were mostly accurately predicted (R
2 = 0.71), whereas the FWD (i.e., 1-h and 10-h fuels) load was estimated with the lowest accuracy (R
2 = 0.59 and 0.6, respectively). The low predictive performance of the FWD models can be attributed to the fact that the amount of downed woody debris in a forested area is not only affected by stand characteristics but weather conditions (i.e., strong winds or heavy snow), natural debranching and presence of logging operations (i.e., residues), as well [
98,
99], information which is not possible to retrieve from LiDAR data. The similar estimation performances of two of the developed grass/forbs models (
Table 4 and
Table 6) is attributed to the fact that although different pools of predictor variables were used in the SA process (i.e., only structural in the first and all metrics in the second pool), intensity metrics were not selected in either case. For the sake of conciseness, the following part of the discussion is focused on the regression models incorporating both the structural and intensity metrics since these are the ones that yielded the highest estimation accuracy.
The LiDAR-derived variables (
Table 7) that were found to be mostly correlated with the reference data (i.e., the field-measured SFL values) were
for TSFL,
for litter load,
for grass/forbs load,
for 1-h fuels and
for 10-h fuels load, presenting Pearson’s correlation coefficients of 0.32, 0.28, 0.62, 0.27, and 0.30, respectively. An examination of all independent variables included in the models led to the observation that the first three regression models (i.e., the ones predicting TSFL, litter, and grass/forbs load) incorporated metrics related to the lower part of the canopy (i.e., metrics for the lm returns, low height percentiles, and s returns below 1-m height) indicating that the LiDAR data were able to interact with the surface fuel layers in a direct manner. On the contrary, variables describing the upper canopy (i.e., metrics for the fm returns, high height percentiles, and canopy s returns) were mostly used for the 1-h and 10-h fuels load estimation, revealing the ability of LiDAR to indirectly associate the FWD load with the canopy characteristics of the study area. This can be considered a rather logical association, since stand density and height may indicate the amount of the living tree branches available for falling on the ground, which in turn is linked with the downed branches load.
All regression models included GR and/or NIR channel intensity metrics, except for the grass/forbs one, where a bincentile metric, namely the b05
lm, was selected instead. In our study area, grass and forbs were mostly located in forest openings; thus, the respective load was higher in plots with lower canopy cover and density. Given that canopy cover and density had a significant impact on the amount of grass and forbs per plot area, the use of height percentile and bincentile metrics for the fm and lm returns is expected for describing the specific SFT, since they encompass the description of both the upper and lower canopy conditions. All of the remaining SFTs (i.e., litter, 1-h fuels, and 10-h fuels) are downed. Hence, intensity information—which is directly associated with the surface reflectivity of the target—was required in order to retrieve information related to the load of the different downed fuel types by correlating the LiDAR-derived intensity metrics with the respective reference data. In these cases, the selection of the single-return intensity information of the NIR channel is reasonable given that green vegetation strongly reflects energy in the NIR spectrum and can be employed for the discrimination of living and dead surface fuel materials. On the contrary, the GR channel has substantially lower reflectance from vegetation compared to NIR and its larger beam divergence can decrease the number of registered canopy echoes [
63]. Therefore, the GR channel information can be considered useful in the examination of the below-canopy forest environments.
With respect to the LiDAR multispectral information, as mentioned in the first paragraph of
Section 3.1, the contractor of the LiDAR acquisition provided us with the merged point cloud from the two channels, without maintaining the information about which channel each point was derived from. Intensity measures were derived from the raw discrete-return LiDAR data (provided separately), but we did not have access to the proprietary software employed for correcting atmospheric and noise effects; thus, we could not regenerate the single-channel point clouds. Therefore, the contribution of each channel to the final models cannot be proved experimentally at the present time in our case, but there is at least a strong evidence from the results of Reference [
73] mentioned in
Section 1. Nevertheless, the existence of a second channel effectively doubles the point cloud density (reaching to an average of 82.99 points/m
2), which is highly expected to increase the surface fuel estimation potential, even if that cannot be proven experimentally.
Upon further examination of the selected independent variables, it was observed that nearly all predictors were log, square root or square transformed indicating non-linear relationships with the reference data. Similarly, given the parametric nature of linear regression analysis, transformations of the response variables were also necessary to be involved in order to examine potential nonlinearity with the predictor variables and finally meet the assumptions of the linear regression model. Thus, the logarithmic and square root transformations of the response variables were selected in all load prediction cases to achieve variance homogeneity and normality of the residual plots.
Even though a direct comparison of regression analysis results derived from other studies requires particular caution, it is worth noting that more variation in the load of different surface fuel layers was explained in this work compared to most of the other reported prediction accuracies described in
Section 1. Possible reasons may encompass the use of higher-density data (e.g., Reference [
47,
50]), as well as the different forest composition and topography characterizing other examined areas, which affect the canopy penetration capabilities of the laser [
57]. The use of pulse intensity-related variables also had a substantial influence on the results’ accuracy since the point cloud structural information (i.e., height metrics) alone is not adequate for reliable load estimation of the different SFTs [
47,
100]. However, intensity range normalization requires specific sensor- and flight-related information (e.g., smoothed best estimate of plane trajectory) which are not always available to the final user. Plot sizes also differ among studies. The 1000 m
2 sample plot area selected in our work provided an additional advantage by avoiding inherent variability and the subsequent variation of the results due to possible spatial misplacement of the rectangular plots caused by GPS measurement inaccuracies. Moreover, the estimation accuracy is strongly influenced by the employed fuel sampling method (e.g., planar intersect, photoload, destructive) and possible data cleaning.
Contrary to the above mentioned studies, the research of Chen et al. (2017) for SFL estimation in Australia yielded significantly higher prediction accuracy (R
2 of up to 0.89) compared to our work [
48]. The authors examined the potential of year since last fire and surface fuel depth variables using both airborne and terrestrial LiDAR data. Except for the abovementioned reasons, the substantial outperformance of these results compared to ours can be also attributed to the available LiDAR data type. In particular, the type of LiDAR instrument being employed can substantially influence the results since each one provides different level of information. While the airborne LiDAR collects data mostly from the upper canopy layers, terrestrial LiDAR can provide much more detailed information from the lower stand layers [
101].
Despite the fact that this research has reached its aims, there were some unavoidable limitations. First of all, although the distribution of the sample plots across the study area was performed in the most representative manner possible, most related studies recommend using at least 40 samples in regression analysis and, therefore, it is possible that the limited sample size has caused additional uncertainty in the predictions [
102]. Furthermore, in the absence of shrubs, 100-h fuels, and regeneration from some plots, their load estimation was impossible to be performed because the remaining plot number (i.e., where these SFTs were present) was not sufficient for conducting a robust regression analysis. Sampling plots of smaller dimensions (<1000 m
2) would be able to decrease the workload probably leading to an increased sample size, which could be an effective solution to both aforementioned limitations. Nevertheless, as in the majority of related studies that employ a rather limited sample size and given the complex forest structure and steep topography, the importance that the samples capture the SFL variability across the area was considered a priority [
75].
Finally, the predictive models constructed in the framework of this research cannot be successfully transferred to other forest ecosystems. Although the models showed a reliable predictive performance in our study area, difference in vegetation composition, topographic conditions, and sensor characteristics would cause diverse relationships between the response and LiDAR-derived predictive variables, significantly influencing the results. SFL estimation in another forested area should encompass new reference data collection and development of different regression models. Yet, we argue that the employed methodology offers a robust framework for estimating SFL from LiDAR data, even in dense coniferous forests with complex topography and understory composition.