1. Introduction
Accurate forest inventory data are essential in forest management planning. Conventional, forest inventory data are collected via ground-based survey methods [
1]. These methods, while collecting information with acceptable accuracy, have been criticized for their high cost, time consumption, and inability to update in a timely manner [
1]. Indeed, exploring methods that avoid the disadvantages of ground-based methods and achieve an acceptable accuracy is of great interest to foresters. Remote sensing techniques likely are a viable alternative.
While numerous sources of remotely sensed data are available, the Landsat sensors are still the most widely used in forestry [
2,
3]. This is because the Landsat sensor series data, in addition to their potential relationships with forest attributes, provide a large, free archive of imagery with medium spatial resolution, relatively large spatial extent, and frequent revisit time. Great efforts have been made to evaluate the suitability of using Landsat sensor data including individual spectral bands, vegetation indices, transformed images, and textural images, to estimate forest attributes. Mixed results have been reported, with some revealing a high level of accuracy (i.e., R
2 > 0.60) [
4,
5,
6], and others reporting great uncertainties [
7,
8]. Landsat sensor variables are sensitive to many factors such as stand structure, geographic region, and human disturbance, compromising their application in predicting forest attributes. Depending on stand attribute and condition, often Landsat images were found to estimate forest stand attributes with 30% or greater error [
2,
9], higher than the allowable error (15~20%) of what is often required for operational forest management. Furthermore, most of the previous studies have focused on natural forests with complicated stand structures of multiple species, age classes, various densities, and understory conditions.
Little research has been dedicated to commercial plantations despite their economic importance. Compared to natural forests, the application of remotely sensed data to plantation forests potentially faces different problems [
10]. Most plantations are intensively managed, single-species monocultures, resulting in distinct stand structural and temporal characteristics. Silvicultural activities such as implementing site preparation, competition control, and fertilization homogenize the understory of young (open crown) plantations, facilitating the relation between remote sensing data and stand attributes, but homogenize stand canopy structure and enhance stand productivity, compromising the ability of images in discerning canopy-closed plantations. Stand structure is a complex combination of several parameters such as stand average height, diameter, density, biomass, and volume. Previous studies have focused on using remotely sensed data to predict forest stand biomass and volume. For plantation forests, biomass and volume are secondary variables, often calculated from traits such as stand tree size (average height and diameter) and density. Therefore, the usefulness of Landsat sensor variables to recover plantation stand attributes should be evaluated by their adequacy in predicting stand tree size and density. Density management is paramount for the success of managing plantations. Previous studies on the use of Landsat sensor data have unanimously adopted the number of trees per unit area (stocking) as an indicator of density, which, however, has intrinsic limitations when applied to plantations since it does not consider tree size [
11]. Basal area or stand density index combines stocking and tree size and has been widely adopted in decision-making in managing plantation density by foresters [
1]. Thus, the suitability of using Landsat sensor data to predict basal area or stand density index is of great practical value, which, however, has seldom been investigated. Recent research [
11] reported promising results of using Landsat sensor data to predict the height and basal area of young spruce plantations. However, much less optimistic findings were indicated for predicting the number of trees per hectare and volume of coniferous plantations [
10].
Plantation management planning schedules activities aimed at optimizing the utilities of a plantation, in particular wood production. Often the key activities include, but are not limited to, projection of stand conditions and economic value (growth and yield modeling), mid-rotation silvicultural treatments, and harvest. Timely inventory data are needed to support these activities. These data should be sufficiently accurate, the level of which depends on the activity. Less precise data may be acceptable for projection purposes while a higher accuracy is desirable for supporting silvicultural activities [
9]. To integrate Landsat sensor data into forest management, their usefulness must be assessed by how well they can support the activities, which, however, is missing in the literature. Most previous studies have adopted a simplistic view of the information needs in the forest planning process, without relating the analysis to management decisions [
9].
Loblolly pine (
Pinus taeda L.) is the most widely planted commercial species in the West Gulf (WG) Region. For example, in east Texas, forestland occupies about 4.9 million hectares, of which, 1.2 million hectares (24%) are classified as pine plantations, with most being composed of loblolly pine [
12]. Intensive silvicultural activities such as competition control, chemical site preparation, tillage, fertilization, thinning, and other activities have been routinely applied to loblolly pine plantations [
13], resulting in rapid stand development including canopy closure over a short period post-establishment. For current operational forest management, forest inventory data are needed for young (around five years) and mid-rotation ages (around 10 years), yet traditional methods are costly. Landsat sensor data could be useful for this purpose, yet has not been fully investigated. Landsat sensor data have been demonstrated useful in managing loblolly pine plantations in ways such as mapping plantations [
14] and predicting leaf area index [
15,
16,
17,
18]. Only one study [
19] reported the application of Landsat sensor data to estimate loblolly pine plantation age and stocking (trees per hectare) and the results were encouraging (R
2 = 0.70 and 0.60, respectively). None of the above studies linked results to specific planning activities, limiting operational application. Overall, the potential of Landsat sensor data for estimating pine plantation attributes has not been fully explored, yet the interests of forest companies in using remotely sensed data for forest inventory purposes remain high (Mr. Trevor Terry, personal communication).
Initiated in 1982, the East Texas Pine Plantation Research Project (ETPPRP) has been exploring models for managing loblolly pine plantations in the region (
http://etpprp.hzgzsoft.com:92/ (accessed on 8 July, 2022)). In 2004, the ETPPRP began establishing permanent plots in intensively managed loblolly pine plantations across the region, which were repeatedly measured on a three-year scale, providing temporal reference ground data. The objective of this study was to analyze the suitability of using Landsat sensor data to predict loblolly pine plantation attributes from before and after crown closure. If satellite reflectance data can be sufficiently related to measured plantation stand attributes, information from this study can be incorporated into growth and yield modeling and forest management planning.
3. Results
Selected MLR models for predicting the stand attributes using the Landsat sensor variables are shown in
Table 4. The model’s goodness-of-fit varied with cycle and stand attributes. For the first cycle measurement, HT, DBH, BA, and SDI were predicted well. The achieved R
2 ranged from 0.61 (DBH) to 0.74 (SDI) and their respective RMSE were fairly low, 1.22 m, 2.11 cm, 3.23 m
2, and 78.16 for HT, DBH, BA, and SDI, relative to their respective minimum (
Table 1). The selected model predictors were similar for all the attributes (
Table 4); the SWIR1 was consistently the top predictor for all the attributes, followed by RED (for HT, BA, and SDI) or NIR (for DBH). The lower SWIR1 or larger RED values were associated with faster stand growth and higher density. No index variables stayed in any models other than the texture measures of ASM and IDM for predicting BA and SDI. The goodness-of-fit for the HT, DBH, BA and SDI models of the second cycle measurement were greatly reduced, but still moderate, with R
2 ranging from 0.43 (HT) to 0.48 (SDI). SWIR1 remained the top predictor for all the models, while indices SAVG (for DBH and BA) and NDVI (for BA and SDI) were also selected. The model goodness-of-fit of the third cycle models worsened further to the point that the Landsat sensor variables were practically useless for predicting HT and DBH. For BA and SDI, the models, respectively, explained 30% and 21% of the total variation and SWIR1 and RED were still the best predictors. Residual plots showed that model assumptions (normality, equal variance and independence) were acceptable in general, although some imperfections were present (data not shown). Differently, the NT was poorly predicted within the training datasets for all cycles, with R
2 being 0.19, 0.08 and 0.17 for cycles one to three, respectively. Identified predictors varied with cycle and neither SWIR1 nor RED was important predictor.
Applying the models to the test dataset confirmed the MLR model fitness (
Table 4;
Figure 2,
Figure 3 and
Figure 4). The obtained R
2 values were similar in magnitude to respective ones in the model training for each cycle. The MAE estimates were low, e.g., being 0.87 m, 1.49 cm, 1.77 m
2 ha
−1, and 52.49 for the first cycle HT, DBH, BA, and SDI, respectively. Overestimations occurred for small values and underpredictions were true for large values, and this trend was more evident for models of the latter cycles (
Figure 2,
Figure 3 and
Figure 4).
Models for predicting selected stand attributes were trained using RF to evaluate the importance of predictors, and
Table 5,
Table 6 and
Table 7 list all the selected predictors with %IncMSE ≥ 2. For the first cycle measurement, the reflectance bands SWIR1, SWIR2, and RED, and the TCT variables wetness or brightness were the important (%IncMSE ≥ 10) predictors for modeling HT, DBH, BA, and SDI, while vegetative indices were the top predictors for predicting NT. The identified top predictors for the second cycle models remained similar, although texture index SAVG was also important for the HT, DBH, and NT models. For the third cycle, while the SWIR1, SWIR2, and wetness were still the top predictors, more texture indices were identified as the top predictors as well. Evidently, contributions were dominated by a few top predictors in the first cycle, this dominance, however, decreased with increasing cycles. For example, SWIR1 had a %IncMSE value of 30.42 for predicting BA in the first cycle, which was reduced to 24.19 and 12.53, respectively, in the second and third cycles. Many texture variables were included with %IncMSE between 2~5, only a few, e.g., SAVG and IMCORR1, had a %IncMSE > 7. The relative importance of texture variables seemly increased with the cycle. None of the vegetative indices were identified in the top four predictors for all models other than predicting NT for cycles one and two.
The final RF models were re-trained using the important predictors (
Table 5,
Table 6 and
Table 7) and the training dataset, and the model goodness can be found in
Table 8. For the first cycle, the models for predicting HT, DBH, BA, and SDI explained 65% or more of the total variation, and achieved reasonably low RMSEs of 1.33 m, 2.20 cm, 3.40 m
2 ha
−1 and 80.49 for HT, DBH, BA and SDI, respectively. The corresponding models for the second cycle showed a medium fitness with R
2 being around 0.40 and those for the third cycle having no practical values. Similar to the MLR results, the NT models were poor in fitness for all the cycles, with a R
2 ≤ 0.16. Our model test using the test dataset showed that the models of the first cycle had the accuracy as expected, as demonstrated by the similar R
2 between training and test datasets and the low MAEs of 0.84 m, 1.34 cm, 2.06 m
2 ha
−1 and 60.2 for HT, DBH, BA and SDI, respectively (
Table 8;
Figure 2,
Figure 3 and
Figure 4). However, the models of the second cycle had lower test R
2 than respective ones of the model fitting for all attributes other than SDI, of which a similar R
2 was achieved. Similar to the MLR models, the attributes were overestimated for low observed values and underestimated at the high end for all the attributes; this trend was more evident for latter cycles (
Table 8;
Figure 2,
Figure 3 and
Figure 4).
4. Discussion
Intensively managed loblolly pine plantations are often subjected to a variety of silvicultural treatments at the time of planting or shortly thereafter [
38], and their canopy structures develop in a rapid, yet predictable, way. Intensive management enhances tree growth and suppresses underground vegetation, creating a favorable environment for using images for open-canopy plantations, but can lead to homogenous, dense stand canopy structures, weakening penetrability of Landsat sensor signals and compromising the estimation of stand attributes after clown closure. The repeated (cycles one to three) data used represent crown development from pre- to post-closure over a period of six years in operational loblolly pine plantations. Results reveal that HT, DBH, BA, and SDI of loblolly pine plantations may be predicted adequately when crowns were open, but such predictability weakened as stands developed, being moderate when crowns were partially closed and becoming practically useless after crowns were completely closed. To our knowledge, this study is the first to quantify changes in Landsat sensor data suitability within a critical period of stand development from open crown to complete crown closure. The general belief that strong relationships exist between Landsat sensor data and stand attributes in homogeneous forests such as in conifer plantations [
39,
40] is an oversimplification. Since the plantation canopy was at such a high level of density after crown closure that Landsat sensors lost the ability to penetrate the canopy and were unable to differentiate between individual tree canopies, compromising the estimates of DBH, and consequently the estimates of BA and SDI. In upland conifer plantations of various ages in Scotland, stand HT and BA were strongly (R
2 > 0.77) correlated with Landsat sensor bands but the strength of the relationships decreased substantially after canopy closure occurred [
39]. The strong predictability of Landsat sensor variables for young (pre-crown closure, 2–17 years old) Sitka spruce plantations has been reported [
11]. In even-aged
Eucalyptus plantations, correlations between high spectral resolution ASTER satellite data and stand trees ha
−1, DBH, HT, BA, and volume were stronger for plantations of ages 4–6 years than those of 7–9 years [
41]. These studies paired with our findings suggest that the suitability of using Landsat sensor data to estimate plantation stand attributes varies with the stand development stage, being more useful for young and open-canopy plantations.
One important finding was that Landsat images were not practically useful for predicting NT during the study period (
Table 4 and
Table 8), suggesting that the counting of trees is difficult when the satellite images are of moderate resolution, especially when tree canopies overlap. This disagrees with findings from a study also on loblolly pine plantations in east Texas [
19] where as high as 60% of the variation in NT was accounted for by Landsat sensor variables such as NDVI. This inconsistency could partly be due to sampling; Their NT ranged from 222 to 2396 trees ha
−1, whereas our study sampled a much narrower NT range (
Table 1), increasing the difficulty of discerning populations from Landsat sensor variables. Additionally, reflection of satellite on a pine plantation is more determined by crown characteristics, in particular crown width, a strong predictor for tree DBH. Thus, a stronger relationship of Landsat sensor data with BA or SDI over NT is not unexpected. Poor predictability for NT using Landsat sensor variables has been confirmed in young commercial plantations of other species [
11,
39]. Thus, there is uncertainty associated with the application of using Landsat variables to predict NT for commercial plantations, suggesting further investigation of other variables or methods to explain forest variability in NT is necessary. Further discussion focuses more on stand attributes of HT, DBH, BA and SDI.
While used widely, the MLR method is often criticized for model development to estimate stand attributes since several assumptions (linear relationship, equal variance, normality and independence) are required. RF has theoretical advantages over MLR modeling by overcoming the assumption-related limitations and also its ability to address the issues of high-dimensionality and multicollinearity. RF, relatively new in forest statistical data analysis, has recently been adopted to predict forest stand attributes using climate and biophysical data [
35,
42,
43]. For this data, even with many strongly correlated Landsat variables being involved, the models from RF and MLR were competitive with one another in predictability; neither approach produced substantially better models (
Table 4 and
Table 8). The selected predictors by both methods were similar, in particular for the first cycle models. In a study to modeling stand BA, volume and biomass using airborne Lidar and multispectral imagery in heterogeneous mixed species forests in Alabama, models developed by MLR and RF were comparable [
44]. Fewer Landsat variables were selected as predictors of the MLR models, and their predictability may be improved by including highly related predictors such as SWIR2, which, however, complicates biological explanations. Further improvements in predictability may be realized by adding extra predictors to RF models too, but the improvement is unlikely to be substantial. One drawback of using RF technique to develop models is the implementation of the models which is difficult without programming the models into source codes [
35]. Currently the MLR models proposed in this study are recommended over RF models due to their ease of use, comparable accuracy and straight-forward interpretation.
Both the MLR and RF models tended to bias for low (overestimate) and high (underestimate) observed values (
Figure 2,
Figure 3 and
Figure 4), which has been identified in previous studies [
45]. For plantations with closed crowns and high productivity, the limited sensitivity of the optical sensors for interpreting the canopy structure leads to saturated measurements while for young stands with open crowns the confounding effect of understory vegetation may overestimate stand attributes. Our results further indicated that the biased tendencies varied with stand development, being relatively minor for the first cycle measurement and becoming more evident with stand crown closing.
Landsat sensor data are commonly used for assessing stand attributes due to their potential correlations, free availability, and broad spatial and temporal coverage. While numerous Landsat variables are available, the selection of the ideal ones is challenging, varying with stand structure and environment. The dynamic changes in stand density (crown closure) and understory background as stands develop can lead to various variables being the optimal predictors at different stages. In this regard, some patterns stood out based on our results for a short period of six years of the plantation’s early growth. First, the bands, SWIR and RED, were consistently the most important predictors. Previous studies on loblolly pine plantations indirectly support our findings, where SWIR1, RED, and NDMI (an index strongly influenced by SWIR) strongly correlated with stand age and leaf area index, both growth-related stand attributes [
15,
18,
19]. The SWIR1 was also identified as the top predictor for forest stand properties on other pine stands [
39,
46,
47]. Second, wetness could be a key variable for predicting tree size of open-canopy plantations (
Table 5) and for predicting stand density of plantations with partly or completely closed crowns (
Table 6 and
Table 7), which is in parallel to the findings on other species [
19]. Our MLR models did not include wetness due to its strong correlation with SWIR, leading to an issue of multicollinearity. Third, all vegetative indices except NDVI were relatively unimportant. NDVI was selected as a predictor for predicting BA and SDI for canopy-closed plantations (
Table 4). Vegetation indices are useful in estimating stand attributes such as biomass because they enhance green vegetation signals and minimize the impacts from the surface and atmospheric effects in complex vegetation structures [
4]; their usefulness likely reduces in relatively simple and homogeneous structures such as intensively managed pine plantations. Fourth, many texture variables contributed somewhat to improving model prediction, with most having a %IncRME between 2–4. The models centralized the importance at a few band variables such as SWIR1 and RED before crown closure, this dominance declined as stands develop, and models split the importance to more texture measure variables after crowns were completely closed (
Table 5,
Table 6 and
Table 7). Texture features performed better than spectral factors for modeling stand attributes in areas with multiple layers and complex forest stand structures [
48,
49], and our results support the pattern that the importance of texture variables increased from the first cycle in a situation of simple structure to the more complex structure of the third cycle.
Basal area (BA) and SDI are numerical values that capture intensity of competition within forest stands. Operationally, BA and SDI are parameters often used to help foresters with thinning decisions, and thus accurately predicting BA or SDI in particular at the mid-rotation age has great practical value. Even models predicting BA and particularly SDI had larger R
2 than the respective ones for predicting HT and DBH (
Table 4 and
Table 8), their accuracies in predicting the mid-rotation age (third cycle) BA and SDI were low, with R
2 < 0.25, suggesting infeasible of the method to guide thinning of loblolly pine plantations unless the models can be otherwise improved in future.
Can the findings of this study be incorporated into plantation management of loblolly pine in the region? This question has to be answered within the context of operational pine plantation planning. For loblolly pine plantations in the region, often a timber cruise around 5 years is carried out to obtain stand information such as mean HT, DBH, and NT, which is then entered into regional growth and yield models [
22] to project future stand characteristics (BA, volume, and NT) and economic values under scenarios of various silvicultural activities. For this purpose, the required accuracies for predicting HT, DBH, and NT are relatively low. Our results are encouraging for predicting HT and DBH of the plantations around age five (
Table 4 and
Table 8). The issue was NT, which was poorly predicted for all cycles. In reality, however, loblolly pine plantation survival is routinely surveyed at the end of the first or second growing season, and this number is unlikely to change substantially in the subsequent years until competition-induced mortality occurs [
38]. In other words, NT of young loblolly pine plantations are typically known. Thus, our results do support the use of Landsat sensor data to facilitate the long-term management planning. Forest inventory data are indispensable for mid-rotation (around age 10 years) loblolly pine plantations to support silvicultural activities such as commercial thinning and/or fertilization. Toward this end, Landsat sensor data were found not to be reliable indicators of stand attributes and, therefore, caution should be taken for using Landsat sensor data to replace field measurements.
Are there options to improve the predictability of the models? With the inclusion of other factors, a few studies have shown positive results in predicting stand attributes. Ages are known accurately in any well-managed loblolly pine plantations and adding age as a predictor was reported to improve the performance of Landsat sensor data-based models [
45]. In this study, when stand age was added as a predictor to the MLR models for predicting HT, DBH, BA, and SDI, the model R
2 was improved to 0.82, 0.75, 0.80, and 0.80 for the first cycle, 0.72, 0.62, 0.62, and 0.61 for the second cycle, and 0.55, 0.44, 0.41, and 0.32 for the third cycle (note that multicollinearity did not become an issue), greatly reducing the uncertainties of model estimates. Geographic factors, such as stand elevation, slope, and aspect, are known to affect spectral reflectance values [
50], and thus including them may improve model performance. We are currently collecting this information so that a test on the topic can be carried out. Synergistic use of data from different sources or from different sampling dates (or seasons) of one source to estimate stand attributes can be more accurate than data from a single source or a single date [
3,
51]. For this study, Landsat sensor data were extracted for summer only. The suitability of incorporating Landsat sensor data from other seasons to improve model performance should be further investigated. Clearly, applying high-resolution remotely sensed data would improve model performance, which, however, is accompanied by higher data costs. Reference [
52] estimated stand attributes of pine plantations using a WorldView-2 multispectral imager (2m spatial resolution) and achieved acceptable estimates for HT and DBH (errors of <13%) but relatively large errors (up to 30%) for BA, volume, and NT estimates. In another study, researchers estimated BA for a plantation of loblolly pine using airborne LiDAR data, achieving an accuracy of R
2 = 0.97 [
40].
Field reference data well representing the truth are indispensable for model development. There are a few issues about field data which need to be clarified for this study. Four plots were installed in plantations with ages between 8~11 years, and therefore their first cycle data do not represent open-canopy plantations. Additionally, four plots at the third measurement cycle have been commercially thinned once. We did not exclude these plots in analyses since our preliminary analyses showed that removing them would not change the results. Our field plots were in the square of 900 m2, very similar to the pixel shape and size, nonetheless, a potential mismatch between the field coordinates and the points of spectral variable extraction might occur. Note that the plots measured in this study are considered operational for loblolly pine plantations in the WG region. Consequently, our results well represent the relationships between Landsat sensor data and observed growth and density in operational settings there, and thus, the models may be valuable tools for managing the current regional loblolly pine plantations.