1. Introduction
Effective forest monitoring systems require novel and well-organized methods and tools to provide up-to-date, reliable data on forests [1,2]. The development of remote-sensing and image-processing techniques has enabled the automatic, cost-effective, high-quality, spatially continuous, and resource-saving mapping of forest attributes [3,4,5] across various spatial scales [6].
Forest canopy cover (FCC) is one of the most important forest inventory parameters and plays a critical role in evaluating forest functions [7,8]. For example, FCC is frequently used as an explanatory variable in studies of water cycling [9,10,11], soil erosion assessment [12,13], wildlife habitat [14,15], forest regeneration and tree and forest survival [16,17], forest structure [18,19], wildfire risk assessment [20,21], and air purification [22,23]. FCC is relevant to these processes because, for instance, it reduces soil erosion by diminishing the impact of raindrops on barren surfaces [10], while decreasing FCC increases understory light availability for forest regeneration [19]. Various remote-sensing techniques and data have been used for FCC mapping ever since such technologies became available [24,25,26]. Overall, FCC mapping using earth observation datasets is complex due to the low inter-class separability and similar spectral signatures of FCC classes. Moreover, large-scale FCC mapping requires sufficient numbers of training and validation samples. In previous studies, these samples were usually collected during field inventories, which is very time-consuming and costly [27,28,29]. Therefore, enhancing FCC mapping has remained a topic of high interest in remote sensing and forestry over recent decades. Anchang et al. [30] used forest inventory plots, VHR Google Earth satellite images, and Sentinel-1 and -2 time series and reported that their methodology was able to map woody canopy cover (WCC) into 10 classes with low error (RMSE = 8%). Arumae and Lang [31] used airborne laser scanning (ALS) data and hemispherical photography to produce an ALS-based FCC model. They also examined the influence of varying height thresholds and scan angles on FCC estimation. Based on their results, the best model was produced using all echoes and a 1.3 m height break (R² = 0.81; RMSE = 11.8%). Although previous studies have attempted to develop effective methods of using remotely sensed data, many challenges remain for large-scale FCC mapping. For example, most previous studies were conducted across small sites (local scale) [32,33], and some have shown that time series and multi-sensor image-processing techniques enable FCC mapping at higher accuracy than mono-date images [34,35]. Nevertheless, most studies used mono-date or mono-source satellite images because of limitations such as low data storage capacity and computing power [36,37]. Further, previous FCC mapping studies were performed with cloud-free or near-cloud-free satellite imagery and therefore often faced low-density datasets, especially in areas with frequent cloud cover.
Thanks to the advent of cloud computing platforms specialized in remote sensing, such as Google Earth Engine (GEE), analyzing large numbers of satellite images has become more effective [38,39]. Novel time-series image-processing methods make it possible to integrate all cloud-free reflectance values observed during a period and to produce spectral–temporal metrics (STMs) applicable to mapping and modeling solutions [40]. Freely available satellite images, such as those obtained from the European Space Agency’s (ESA) Sentinel sensors, have proved to be cost-effective, timely, and easy to integrate with other data sources, and are frequently used for FCC mapping [40,41]. The Sentinel-1 (S-1) system is composed of a constellation of two satellites that provide C-band synthetic aperture radar (SAR) data in four acquisition modes with a revisit time of 6 days [42]. The Sentinel-2 (S-2) optical satellites collect data at high temporal resolution, with a revisit time of 5 days, and a rich spectral configuration of 13 spectral bands [43,44]. Earlier studies reported that integrating S-1 and S-2 images improved the classification accuracy of land use and land cover (LULC) classes compared to the results achieved with S-1 or S-2 data alone [45,46]. The reason is that the radar signal is sensitive to canopy geometry (e.g., roughness, texture, and internal structure), whereas vegetation physiology influences optical reflectance [47]. However, the effects of integrating S-1 and S-2 time series on mapping FCC classes have not been well studied, and there is a significant research gap in evaluating how this integration can improve the mapping of FCC classes with high spectral similarity. Although many studies have used Google Earth VHR images to prepare training and validation samples [48,49,50], they did not report the accuracy of the extracted samples.
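To make the STM workflow concrete, the following minimal sketch (Python Earth Engine API) shows one common way to derive S-1 and S-2 STMs of the kind described above; the area of interest, date range, band selection, and percentile choices are illustrative assumptions rather than the exact configuration used in this study.

```python
# Minimal sketch, assuming the earthengine-api Python client and an
# authenticated GEE account. AOI, dates, bands, and percentiles are
# illustrative assumptions, not the study's exact configuration.
import ee

ee.Initialize()

aoi = ee.Geometry.Rectangle([46.0, 33.0, 47.0, 34.0])  # hypothetical AOI

def mask_s2_clouds(img):
    # Bits 10 (clouds) and 11 (cirrus) of the QA60 band must be clear.
    qa = img.select('QA60')
    clear = qa.bitwiseAnd(1 << 10).eq(0).And(qa.bitwiseAnd(1 << 11).eq(0))
    return img.updateMask(clear).divide(10000)

# Three-year Sentinel-2 surface reflectance time series, cloud-masked.
s2 = (ee.ImageCollection('COPERNICUS/S2_SR')
      .filterBounds(aoi)
      .filterDate('2018-01-01', '2021-01-01')
      .map(mask_s2_clouds)
      .select(['B2', 'B3', 'B4', 'B8', 'B11', 'B12']))

# Spectral-temporal metrics: per-pixel percentiles over all clear observations.
s2_stm = s2.reduce(ee.Reducer.percentile([25, 50, 75]))

# Matching Sentinel-1 C-band SAR time series (VV/VH backscatter).
s1 = (ee.ImageCollection('COPERNICUS/S1_GRD')
      .filterBounds(aoi)
      .filterDate('2018-01-01', '2021-01-01')
      .filter(ee.Filter.eq('instrumentMode', 'IW'))
      .select(['VV', 'VH']))

s1_stm = s1.reduce(ee.Reducer.percentile([25, 50, 75]))

# Stacked multi-sensor feature image for subsequent classification.
stm_stack = s2_stm.addBands(s1_stm)
```

Percentile reducers are a common choice for STMs because they summarize the full time series per pixel without requiring any single cloud-free mono-date scene.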
Classification algorithms are another important component, as they determine how accurately maps can be produced from remotely sensed data. An accurate and robust reference dataset is the most important input for supervised classifiers, especially machine learning (ML) models [51,52]. Field collection of reference datasets is a time-consuming and costly part of all forest monitoring studies because of large study areas and inaccessible regions. The reference dataset should provide enough training and validation samples to represent the diversity of canopy cover densities (including all FCC classes) and be well distributed across the study site. When the number of training samples is insufficient, there is a risk of overfitting the training data, which can lead to poor generalization by the classifier [53,54]. Providing this kind of reference dataset is challenging in large-scale FCC mapping, since training and validation samples for large-area mapping are typically collected by field inventories [55]. This work builds on previous findings regarding the preparation of reliable reference datasets, novel image-processing techniques, and ML models to provide a more accurate FCC map. Given the importance of upscaling artificial intelligence solutions for large-scale mapping, this study had the following specific objectives:
- (1) Presenting a simple but efficient method for preparing an accurate reference dataset;
- (2) Evaluating the effects of data density (i.e., one-year datasets vs. three-year datasets) on FCC mapping accuracy;
- (3) Evaluating the effect of various spectral domains on FCC mapping accuracy; and
- (4) Comparing the efficiency of different ML models, namely Random Forest (RF), Support Vector Machine (SVM), and Classification and Regression Tree (CART), on FCC mapping accuracy (see the sketch after this list).
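As a minimal sketch of objective (4), the snippet below trains the three classifiers available in GEE on the same STM feature stack; the asset paths, class property name, and hyperparameters are hypothetical placeholders, not the study's exact settings.

```python
# Minimal sketch, assuming a precomputed STM stack and labeled reference
# points already exist as GEE assets (paths below are hypothetical).
import ee

ee.Initialize()

stm_stack = ee.Image('users/example/fcc_stm_stack')            # hypothetical
samples = ee.FeatureCollection('users/example/fcc_reference')  # hypothetical

# Extract per-point feature values; 'fcc_class' is an assumed label property.
training = stm_stack.sampleRegions(collection=samples,
                                   properties=['fcc_class'], scale=10)

classifiers = {
    'SVM': ee.Classifier.libsvm(kernelType='RBF', gamma=0.5, cost=10),
    'RF': ee.Classifier.smileRandomForest(numberOfTrees=500),
    'CART': ee.Classifier.smileCart(),
}

# Train each model on identical inputs and classify the feature stack.
fcc_maps = {
    name: stm_stack.classify(
        clf.train(features=training, classProperty='fcc_class',
                  inputProperties=stm_stack.bandNames()))
    for name, clf in classifiers.items()
}
```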
4. Discussion
In general, combining ML models with freely available data led to satisfactory results, with the best model producing highly accurate results (overall accuracy (OA) = 91.37%; Kappa = 0.861). The first component that helped to reach these results was an accurate and robust reference dataset for FCC mapping. In this regard, VHR satellite images from Google Earth and gridded points were used to prepare 520 FCC samples, allowing us to efficiently train the ML models and to perform a high-quality accuracy assessment. The accuracy of these samples was assessed against field inventory plots, confirming the precision of the reference dataset used, and a high correlation was observed between measured and estimated FCC values. Therefore, the present study showed that VHR satellite images provided by Google Earth and gridded sampling points are now a practical alternative for collecting the required FCC samples quickly and at low cost. The limitations of ground-based inventories have led researchers to use various remote sensing products and methods for FCC estimation to obtain reference datasets. For example, Korhonen et al. [28] used airborne LiDAR data with a pulse density of 1 pulse/m² to estimate FCC and used the resulting values as training and validation samples. Although the efficiency of LiDAR data for providing precise forest inventory parameters has been proven in various earlier studies [82,83,84], acquiring such data is very costly and is constrained by technological and logistical limitations, especially in applications aiming to cover large areas or requiring frequent acquisitions [85,86,87]. In another study, Huang et al. [29] utilized VHR satellite images available on Google Earth to measure FCC. They defined and used a complex image-processing workflow, which included taking screenshots of Google Earth images, black-and-white discarding, brightness and contrast adjustment, and measuring FCC using a density slice tool. They did not report the accuracy of their measurements, so a statistical comparison is not possible; moreover, such complex methods require more time and more image-processing experience. Hence, it was concluded that using georeferenced VHR images obtained from Google Earth to measure FCC does not require much image-processing experience and can provide a high level of accuracy.
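For readers wishing to reproduce the gridded-point idea, a minimal sketch is given below; the grid spacing and area of interest are assumptions, and the exported points would then be labeled by photo-interpretation of Google Earth VHR imagery.

```python
# Minimal sketch of systematic (gridded) sample point generation in GEE.
# Spacing and AOI are assumptions; FCC labels are assigned afterwards by
# photo-interpretation of Google Earth VHR imagery.
import ee

ee.Initialize()

aoi = ee.Geometry.Rectangle([46.0, 33.0, 47.0, 34.0])  # hypothetical AOI

# Sampling ee.Image.pixelLonLat() at a fixed scale yields a regular grid
# of points (here ~2 km apart) within the AOI.
grid_points = (ee.Image.pixelLonLat()
               .sample(region=aoi, scale=2000, geometries=True))

# Export the points for photo-interpretation (e.g., as KML via Google Drive).
task = ee.batch.Export.table.toDrive(collection=grid_points,
                                     description='fcc_grid_points',
                                     fileFormat='KML')
task.start()
```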
Regarding the ML classifiers, SVM produced the highest accuracy compared to RF and CART, irrespective of data density and integration. Further, based on McNemar’s test, SVM significantly outperformed RF and CART. These results are consistent with previous studies, which demonstrated that SVM was the most efficient algorithm for LULC mapping [88,89]. The highest class-level accuracies (CA, UA, and F1-score) were produced by SVM, indicating that SVM reduces both commission and omission errors. The lowest classification accuracy was obtained with CART for all forest-covered classes. For example, the lowest F1-scores were produced by CART for mapping sparse forest, which indicates that this algorithm is less suitable for classifying classes with similar spectral signatures. In the case of such similar spectral signatures, the higher ability of SVM to distinguish these classes may be explained by its design, which aims to find decision boundaries that maximize the margin between classes [90]. In contrast, all the ML models showed similar performance in classifying the other FCC classes, especially non-forest and dense forest. These results indicate that all ML models performed well when the classes were relatively pure in terms of spectral signatures, which supports previous findings. For example, according to Adugna et al. [91], RF and SVM showed similar performance in classifying pure classes with distinct spectral characteristics, such as water bodies and sparse vegetation. In this study, the higher accuracy of the SVM algorithm compared to RF and CART may be related to the training sample size. According to Shetty et al. [92], SVM is less sensitive to the number of training samples and yields higher accuracy with a high number of features and small training samples compared to other models. Similar results were observed in other studies as well. For example, Sabat-Tomala et al. [93] compared the performance of the RF and SVM algorithms in mapping three invasive species using 430 hyperspectral bands, and evaluated the influence of training sample size on classification accuracy using different training sample sizes (30, 50, 100, 200, and 300 samples per class). They found that RF was more sensitive to the number of training samples than SVM when many features were involved in the classification workflow. In other words, the features used for RF classification may not optimally distinguish the spectral differences between classes. Raczko and Zagajewski [94] reported that an incorrect selection of features might also affect RF classification results. However, in some studies, RF has performed better than SVM. For example, Li et al. [95] reported that RF produced a higher OA than SVM, especially for mapping complex classes such as surface-mined lands. In short, considering previous studies, RF and SVM typically show similar abilities and often outperform other algorithms such as CART [96], Artificial Neural Networks (ANN) [97], and Maximum Likelihood Classification (MLC) [98], placing them among the best options for classification solutions.
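The McNemar's test mentioned above can be reproduced with standard tooling; a minimal sketch, assuming per-sample predictions have been exported from GEE, is given below (the label arrays are toy placeholders, not data from this study).

```python
# Minimal sketch of McNemar's test for paired classifier comparison.
# The label arrays are toy placeholders, not results from this study.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0])    # reference labels (toy)
pred_svm = np.array([0, 1, 2, 1, 0, 2, 0, 0])  # SVM predictions (toy)
pred_rf = np.array([0, 1, 1, 1, 0, 1, 0, 0])   # RF predictions (toy)

svm_ok = pred_svm == y_true
rf_ok = pred_rf == y_true

# 2x2 contingency table of agreement/disagreement with the reference.
table = [[np.sum(svm_ok & rf_ok), np.sum(svm_ok & ~rf_ok)],
         [np.sum(~svm_ok & rf_ok), np.sum(~svm_ok & ~rf_ok)]]

# Exact binomial variant; use exact=False (chi-square) for large samples.
result = mcnemar(table, exact=True)
print(f'statistic = {result.statistic}, p-value = {result.pvalue}')
```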
Moreover, in this study, the impact of data density on FCC classification accuracy was assessed. Previous studies have relied on dense time series: for example, Adams et al. [99] used all Landsat-8 OLI satellite images from 2013 to 2017, comprising 330 clear observations (i.e., after cloud masking), to produce a dense time series for forest composition mapping, and Azzari and Lobell [100] created a series involving all satellite images from the Landsat-5 TM, Landsat-7 ETM+, and Landsat-8 OLI sensors between 2012 and 2015 to achieve high data density and produce reliable observations for land cover monitoring in Zambia. In contrast to previous studies, however, the current study used both one-year- and multi-year-based STMs to determine how higher data density influenced the classification results. Highly significant differences associated with higher data density were observed for all ML models. Based on the results, the OA increased by 1.63%, 1.46%, and 2.04% with SVM, RF, and CART, respectively. The three-year time series helped all ML models and increased their ability to classify FCC classes, particularly the sparse forest class, which was not distinguished well by the one-year dataset. These results are supported by Pflugmacher et al. [101], who found that classification accuracy increased when spectral and temporal metrics were calculated using a three-year time series instead of single-year data. Using a multi-year satellite time series is a practical and effective method for calculating STMs more reliably [102], because involving more observations provides more information on land surface characteristics.
Regarding data integration, the results show that the synergistic use of S-1 SAR and S-2 optical STMs improved the classification accuracies compared to those obtained using only S-2. The integration of S-1 STMs increased the OA by 2.3%, 3.67%, and 0.39% with the SVM, RF, and CART algorithms, respectively. The integration had the highest impact on the classification of sparse forest, which indicates that this class has spectral characteristics similar to those of other classes, mainly non-forest and medium-density forest, and could therefore not be effectively classified using optical features alone. Several studies have focused on integrating optical and radar data, concluding that using them together provided better results than using them separately [103,104]. For example, Borges et al. [105] defined various feature set configurations based on S-1 features, S-2 features, and both combined, and stated that the synergistic use of S-1 and S-2 increased the accuracy for most land cover types. Thus, these two spectral domains provide complementary information [106].
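As a hedged illustration of how such a feature-set comparison can be scored in GEE, the sketch below trains the same classifier on an S-2-only stack and a combined S-1 + S-2 stack and compares OA and Kappa on held-out validation points; all asset paths and property names are hypothetical.

```python
# Minimal sketch comparing feature-set configurations in GEE.
# All asset paths and the 'fcc_class' property are hypothetical.
import ee

ee.Initialize()

s2_only = ee.Image('users/example/fcc_stm_s2')    # hypothetical S-2 STMs
s1_s2 = ee.Image('users/example/fcc_stm_s1_s2')   # hypothetical S-1 + S-2 STMs
train_pts = ee.FeatureCollection('users/example/fcc_train')  # hypothetical
valid_pts = ee.FeatureCollection('users/example/fcc_valid')  # hypothetical

def score(stack):
    # Train on the training points, classify the held-out validation
    # points, and derive OA and Kappa from the error matrix.
    t = stack.sampleRegions(collection=train_pts,
                            properties=['fcc_class'], scale=10)
    v = stack.sampleRegions(collection=valid_pts,
                            properties=['fcc_class'], scale=10)
    clf = ee.Classifier.libsvm(kernelType='RBF').train(
        features=t, classProperty='fcc_class',
        inputProperties=stack.bandNames())
    cm = v.classify(clf).errorMatrix('fcc_class', 'classification')
    return cm.accuracy().getInfo(), cm.kappa().getInfo()

print('S-2 only  (OA, Kappa):', score(s2_only))
print('S-1 + S-2 (OA, Kappa):', score(s1_s2))
```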
As a first limitation, we could not conduct our study at a much larger scale due to the lack of training and validation samples. Second, since classification accuracy may be affected by the type of ML model used, future studies could compare the results of other popular classifiers for FCC mapping, such as ANNs and deep learning; for instance, ANNs might work better for other classes of signals [7,107]. Third, it was challenging to measure the diameter of tree crowns in some sample plots due to the mountainous environment of the Zagros forests. As an alternative, high-resolution images can be used to estimate the FCC of plots located in steep areas. Further, to assess the generalizability of our approach, we propose testing it with training and testing samples collected from different areas, which would demonstrate the transferability of the results to other regions.
At the moment, there is no official report on FCC in the Zagros vegetation zone. Therefore, the FCC map generated using the SVM algorithm in this study could be used as a baseline by decision makers in the Forest, Rangeland, and Watershed Organization of Iran to monitor FCC changes in the region, whether resulting from human activities or natural events, and to establish a forest management plan.
5. Conclusions
This research aimed to use all available capabilities, including VHR satellite images, field inventory plots, Sentinel time series, GEE cloud computing, and ML algorithms, to identify FCC classes in heterogeneous Mediterranean oak forests. Regarding the methodology used and the results obtained, the following may be concluded: (i) VHR satellite images and gridded sampling points can be combined to prepare a sufficient number of training and validation samples in a fast, precise, and cost-effective manner; (ii) Sentinel optical and radar time series provide useful information for accurate FCC mapping, and their combination was the best option for identifying all FCC classes; (iii) using a three-year time series increased the ability of all ML models to classify FCC classes, particularly sparse forest, which was not distinguished well using the one-year dataset, and highly significant differences associated with higher data density were observed for all ML models; (iv) the synergistic use of S-1 SAR and S-2 optical STMs improved the classification accuracies compared to those obtained using only S-2, emphasizing the importance of multi-sensor datasets and different kinds of predictors in FCC mapping; (v) different ML models were examined regarding their training performance and classification accuracies on remotely sensed datasets, and, based on the results, SVM was the most efficient algorithm for FCC mapping, although RF also showed reasonable results. We conclude that the most popular ML models available in GEE, including SVM and RF, are sufficient for accurate FCC mapping solutions. In addition, the methodology used provides up-to-date information for FCC mapping and can be extended as a powerful and well-organized approach, especially in Mediterranean forests. More studies are recommended to investigate the potential of other earth observation datasets and their integration, such as Landsat and ALOS time series. Further, the performance of deep learning classifiers should be evaluated.