1. Introduction
Estimates of forest inventory such as basal area (BA), volume, and aboveground biomass (AGB) across large tracts of land provide foresters and ecologists the information necessary to form and implement management strategies at both small and large scales [
1,
2]. For example, BA (a measure of the cross-sectional area of a tree at breast height) has traditionally been used to manage naturally regenerated forest stands for timber production. BA is also linked to other forest metrics and can be used to estimate volume and biomass [
3,
4,
5,
6]. BA can also be a useful forest measurement for ecological studies, and a perfect example to demonstrate this point is found in longleaf pine (
Pinus palustris) forests in the U.S. Southeast. In the Southeastern U.S., it is necessary to know the wooded area of longleaf pine not only for conservation of the species but also because red-cockaded woodpeckers (
Leuconotopicus borealis) depend on a balance of longleaf pine basal area and stand density that are suitable for cavity nesting [
7]. BA is also useful for examining the amount of woody area infested by various species of invasive insects across large areas [
8]. Volume is a measure used by foresters to estimate the cubic amount of wood on an area of land [
9,
10]. An estimate of volume can give foresters an estimate of the dollar value of the standing timber they are cruising. Volume is also associated with the persistence of invasive species, and having estimates of volume can help foresters better manage both timber product and ecosystem health through stem and growth stocking projections [
11]. AGB is the dry weight of the aboveground portion of forest trees, widely used as a proxy for the carbon they store, and is measured in Mg ha⁻¹ [
12]. In ecological studies, AGB is a known driver of the species composition of an ecosystem, and maps of AGB can give ecologists insights into the distribution and composition of organisms across great extents [
13]. Because AGB serves as an estimate of the carbon stored in trees, it is critical for understanding the role forests play in carbon cycling and climate change. Accurate estimates of AGB can help nations develop strategies to meet goals set by international agreements on climate change, such as those outlined in the Reducing Emissions from Deforestation section of the Paris Climate Agreement [
14], and contributing action inventories such as the National Greenhouse Gas Inventories report [
15]. It is often difficult or impossible to accurately estimate AGB without destructively sampling trees in a forest, and allometric equations developed from field inventory to estimate BA, volume, and AGB can fail to account for variables that affect these metrics across a large landscape, such as site index and crowding, among other local factors [
2,
16]. Inconsistencies in forest inventory methods across the U.S. also cause inaccuracies in regional estimates, as inventories are frequently tailored to project-specific goals and thus differ in their measurement methods [
17,
18,
19]. Lastly, estimating forest measurements across large areas using traditional mensuration approaches can be time-consuming and costly, and in some instances, variation in landscape features can make field work hazardous and introduce inaccuracies in estimates [
16,
18,
20,
21,
22]. Remote sensing (RS), particularly light detection and ranging (lidar), may overcome these limitations: lidar-based estimates of BA, volume, and AGB can be comparable to [
23] or more accurate than [
24] field-based estimates of large-area forest inventory.
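As a concrete illustration of the BA definition given above (cross-sectional stem area at breast height), the following minimal sketch converts diameters at breast height (DBH) into a per-hectare plot value. The function names, the DBH values, and the 0.04 ha plot size are hypothetical and are not taken from the study; the sketch assumes DBH is recorded in centimeters and that stems are circular in cross-section.

```python
import math


def tree_basal_area_m2(dbh_cm: float) -> float:
    """Cross-sectional stem area (m^2) at breast height, assuming a circular stem."""
    radius_m = dbh_cm / 100.0 / 2.0  # convert cm to m, then diameter to radius
    return math.pi * radius_m ** 2


def plot_basal_area_m2_per_ha(dbh_list_cm, plot_area_ha: float) -> float:
    """Sum the basal areas of all measured trees and scale to a per-hectare value."""
    if plot_area_ha <= 0:
        raise ValueError("plot_area_ha must be positive")
    total_m2 = sum(tree_basal_area_m2(d) for d in dbh_list_cm)
    return total_m2 / plot_area_ha


# Hypothetical plot: three trees measured on a 0.04 ha plot.
example_dbh_cm = [25.0, 30.5, 41.2]
example_ba = plot_basal_area_m2_per_ha(example_dbh_cm, plot_area_ha=0.04)
```

Because only measured stems enter the sum, a cruise restricted to merchantable timber would yield a lower plot BA than a full census of the same plot.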
Lidar, an RS method that uses laser scanning to acquire three-dimensional information over the desired area [
25], provides an alternative approach to estimating forest inventory metrics [
1,
21]. By using lidar, forest characteristics such as tree heights and canopy cover can be directly estimated, and other measures such as BA, volume, and AGB can be estimated by means of modeling using a variety of metrics derived from directly measured height and canopy cover [
26,
27]. The most commonly used approach to modeling BA, volume, and AGB from lidar data is by developing a multiple linear regression from field inventory and lidar variables [
28]. Often a variable selection procedure, such as best subsets or forward or backward selection, is used to determine a model with the variables that best explain the dependent variable, and the model is then validated using cross-validation or a separate validation set [
27,
29,
30]. These models usually contain at least one variable explaining tree height, another explaining canopy cover, and another that accounts for variation in the data, such as a height standard deviation variable [
6,
31,
32]; however, the variables in these models may differ depending on the study site and the foliage type being measured [
28,
33,
34]. Estimates of BA, volume, and AGB can also be acquired from non-parametric machine learning approaches such as random forest (RF), an ensemble algorithm that fits many regression trees to bootstrap samples of the data, considers a random subset of predictors at each split, and averages the trees' predictions to produce robust predictive models [
35,
36,
37].
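The cross-validation step described above can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the studies cited: it fits a one-predictor least-squares line rather than a multiple regression, and the function names, fold count, and data are all hypothetical. The predictor could stand for a single lidar height metric and the response for a field-measured forest metric.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def k_fold_rmse(x, y, k=5, seed=0):
    """Cross-validated RMSE: every plot is predicted by a model fit without it."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    squared_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors.extend((y[i] - (a + b * x[i])) ** 2 for i in fold)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: ten plots with an exactly linear response.
heights = [float(h) for h in range(1, 11)]
response = [2.0 + 3.0 * h for h in heights]
cv_error = k_fold_rmse(heights, response, k=5)
```

Reporting the held-out error rather than the error on the fitting data guards against choosing a model that merely memorizes the sampled plots.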
Estimating forest metrics may be performed by using an area-based approach or a tree-level approach, and previous studies modeling BA, volume, or AGB have seen success in both approaches. For example, in estimating forest attributes from individual trees, researchers found that BA, volume, and tree heights of longleaf pine in Georgia, U.S., could be estimated with accuracies of R² = 0.18, 0.94, and 0.96, respectively [
38]. The results for estimating volume and tree heights of individual trees are promising, and the poor results for estimating BA were explained by the loss of height-diameter allometry in southern pines above 25 m. In another study, researchers estimated BA for a plantation of loblolly pines (
Pinus taeda) (another southern pine species) using an area-based approach and were able to achieve accuracies of R² = 0.97, and noted that the highly homogeneous environment of the pine plantations that the trees were grown in likely led to such high accuracies [
27]. Using an area-based approach across a large area can be beneficial in that it reduces the computational processing required. Variables can also be derived from multispectral imagery, such as the normalized difference vegetation index (NDVI) and texture co-occurrence measures, which have also proven useful for modeling [
34]. While predictive modeling using variables from either lidar or multispectral imagery has been successful in many cases, particularly in northern and western forests where pine species dominate [
17,
30], more difficulty is found in estimating forest inventories in heterogeneously mixed forests [
28,
34,
38]. This, in addition to the perceived cost of remote sensing data, has caused delays in adopting RS technology for use in some forest inventories. However, the combination of lidar data, multispectral imagery, and field data for modeling could improve estimates of BA, volume, and AGB in mixed forests [
39,
40,
41]. Furthermore, open-source data are becoming more available and can be used for forest inventories in place of otherwise expensive data sources. Finally, the implications of achieving accurate estimates of forest inventory metrics go beyond timber cruising and carbon estimation alone. By achieving wall-to-wall estimates of forest inventory metrics, past and future data can be compared for forest management techniques and landscape ecology analysis. Most importantly, the wall-to-wall maps produced in this study can be used as base maps for validation of satellite lidar data produced by systems such as the Global Ecosystem Dynamics Investigation (GEDI) lidar aboard the International Space Station (ISS) [
42]. The goal of this study was to produce wall-to-wall estimates of BA, volume, and AGB in a southern mixed-species forest using open-source RS data. This goal was achieved by meeting the following objectives:
- A. Determine whether the addition of variables derived from multispectral imagery to models previously including only lidar-derived variables can improve estimates of BA, volume, and AGB;
- B. Determine what variables are useful for modeling each forest metric (BA, volume, AGB) for Southern mixed forests.
If sufficiently accurate, models predicting forest inventory metrics can be reused in subsequent forest inventories [
17], and final wall-to-wall map outputs can be used for the validation of satellite data.
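For readers unfamiliar with the NDVI mentioned in this section, it is conventionally computed per pixel as (NIR − Red) / (NIR + Red). The sketch below applies that formula to small nested lists standing in for near-infrared and red reflectance bands. The function names and reflectance values are hypothetical, and returning NaN for pixels where both bands are zero is a choice made here for illustration.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for a single pixel."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally shaped 2D bands (lists of rows)."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectance values (unitless, 0 to 1).
nir_example = [[0.5, 0.3], [0.6, 0.0]]
red_example = [[0.1, 0.3], [0.2, 0.0]]
ndvi_example = ndvi_grid(nir_example, red_example)
```

Values near 1 indicate dense green vegetation, while values near 0 indicate little difference between the two bands.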
4. Discussion
Models predicting BA, volume, and AGB in this study ranged in accuracy from R² = 0.36 to 0.53. In the RF models, the R² values were higher than those produced by the linear regression modeling approach, except for when predicting BA. Both RMSE and %RMSE were lower in linear regression models, except when predicting BA. Other researchers modeling forest metrics reported findings that RF modeling tends to have the smallest RMSE and the least amount of bias [
36,
65]. Because the predictive accuracy of the two modeling types was similar, the findings of this study suggest that the results of a modeling approach could depend more heavily on the variables used in modeling, as well as the forest type being modeled (mixed forest stands, homogeneous plantations). The models produced for volume had the greatest predictive accuracy (R² = 0.53, RMSE = 50.21 Mg ha⁻¹, %RMSE = 36.68), likely because the model used to calculate volume from field inventory included a height variable, while those for BA and AGB only included DBH and associated parameters. Because volume was calculated from field inventory using an equation that includes a height metric, lidar-derived height variables can account for variation in heights during predictive modeling. The lower accuracy of the models developed for BA and AGB was also likely due to a number of other factors. For example, the difference in tree heights between when the field inventory was taken and when the lidar data were acquired could cause some variation; however, this is likely minimal, as only the subset of plots measured within two years of the lidar acquisition was used. The most likely explanation for the unexplained variation in BA and AGB models is a combination of the merchantable timber cruising process used to collect the field data and the high heterogeneity of southeastern forests [
34]. In other words, a lot of the foliage measured by lidar and multispectral imagery is not accounted for in the field inventory gathered by inventory cruisers. This is especially seen in the model building process for BA, where variables selected for each model were those whose function either distinguished layers of the canopy (i.e., density metrics trying to explain shrubs and non-inventoried trees beneath the tops of pines) or those that distinguished foliage type (texture bands). Height metrics were excluded from these models because they could not explain BA for all of the foliage within a plot, especially when the data that the models were built from were based only on merchantable timber instead of the entirety of the foliage constituting the plot BA. Furthermore, while there is a relationship between a tree’s height and its BA, in many species, this relationship is known to be somewhat weak [
28,
38], which is another reason why there are no height metrics in the model for BA. Similar findings were reported by other researchers working in heavily mixed forest stands, such as those in Canada, where BA was modeled with an R² of only 0.093 [
28]. In that particular study, the researchers noted that it was difficult to account for most of the vegetation not measured by timber cruisers and recommended using another data source in addition to lidar.
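The accuracy statistics discussed in this section can be computed as in the following minimal sketch. The definitions are assumptions, since this excerpt does not state them: R² is taken as 1 − SS_res / SS_tot, and %RMSE as RMSE divided by the mean of the observed values, times 100. The function names and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between field observations and model predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def percent_rmse(observed, predicted):
    """RMSE expressed as a percentage of the observed mean (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical plot-level values in the same units as the modeled metric.
field_values = [100.0, 150.0, 200.0]
model_values = [110.0, 140.0, 210.0]
stats = (
    r_squared(field_values, model_values),
    rmse(field_values, model_values),
    percent_rmse(field_values, model_values),
)
```

Expressing RMSE relative to the observed mean allows errors to be compared across metrics with different units, such as BA and AGB.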
In the models developed in this study, spectral, textural, and landcover variables were considered in addition to lidar variables to capture any attributes of the forest not accounted for in the field inventory. As mentioned in the results section, the model for BA included a texture variable (var_3). Furthermore, the four- and five-variable models for volume and biomass included texture variables (var_2, hom_1). This suggests that in heavily mixed forests, spectral, textural, and landcover class variables have the potential to explain variation missed by lidar variables alone; however, the difference in model accuracy after removing imagery variables was not substantial, and lidar variables alone predicted BA, volume, and AGB almost as well. This is not to say that lidar variables are sufficient for estimating these metrics in Alabama mixed-species forests, as the accuracy of lidar-derived models was still poor for BA, volume, and AGB. While many papers suggest that lidar alone is sufficient for modeling, most of these studies were developed for study sites whose forests are primarily homogeneous, such as those in the western United States [
59,
66,
67]. In estimating BA, for example, a study in western Oregon achieved very high predictive accuracy using a model with two height variables and a canopy cover variable (R² = 0.96), but the study site was homogeneous in nature, dominated by old-growth coniferous forests [
30]. Because of the variation in allometric dimensions among tree species in mixed forest stands, alternative variables are necessary to accurately predict forest metrics.
5. Conclusions
RS data, including airborne and spaceborne lidar, as well as multispectral imagery from satellite-based platforms, are becoming more available to foresters and ecologists. Programs such as the 3D Elevation Program from the USGS, the National Agriculture Imagery Program from the USDA, and Sentinel-2 from the ESA provide free data that foresters can use to estimate the amount of BA, volume, and AGB on their stands and across large tracts of forested land. In this study, the potential of estimating BA, volume, and AGB from RS data was demonstrated by means of linear regression analysis and random forest modeling. Predictive accuracies of models were low (R² = 0.36–0.53) relative to those in some studies [
27,
59], and the results presented here suggest that the addition of imagery variables to lidar-derived predictive models does not substantially help explain the variation in BA, volume, and AGB. While this study demonstrates the potential for a fast and efficient method of estimating BA, volume, and AGB using freely available data, further investigation of variables is needed to better explain the variability of forest structure in mixed forest stands. Lastly, the main limitation of this study is that the field inventory used to build the models included only merchantable timber and not the entire vegetation in a forest plot. Therefore, forest metric estimates may improve if the entirety of the vegetation in a forest plot is measured.
Open-source RS data are increasingly available, and as demonstrated in this study, freely available data can be leveraged for estimating BA, volume, and AGB in a spatially explicit manner. Lastly, improving estimates of BA, volume, and AGB is necessary to produce accurate reference maps that can be used for the validation of forest measures derived from satellite lidar and imagery.