1. Introduction
Aboveground biomass (AGB) includes all vegetation above the ground (i.e., stems, branches, bark, seeds, flowers, and foliage of live plants) and approximately 50% of its composition is carbon [
1]. AGB usually measures in metric tons of dry matter per hectare (e.g., t ha
−1 or Mg ha
−1) or in metric tons of carbon per hectare (e.g., t C ha
−1 or Mg C ha
−1). The United Nations Framework Convention on Climate Change (UNFCCC) identified it as an Essential Climate Variable (ECV). Therefore, accurate information on biomass stock in world forests is necessary to reduce uncertainties and to fill the knowledge gaps of the climate system [
2]. Further strong impetus to improve methods for measuring global biomass comes from the reduction of emissions due to deforestation and forest degradation (REDD) mechanism, which was introduced in the UNFCCC Committee of the Parties (COP-13) Bali Action Plan. REDD which is now popular with REDD+ (with additional elements of carbon stock enhancement and biodiversity conservation) is dedicated to the developing countries around the world including Malaysia. Its implementation relies fundamentally on systems to assess available carbon stock and monitor changes due to loss of biomass from deforestation and forest degradation [
3], which are amalgamated in a system called monitoring, reporting, and verification (MRV).
Remote sensing has been recognized as one of the primary spatial inputs for this process [
4,
5,
6]. Satellite remote sensing technologies are currently widely tested and suggested as a tool in REDD+ MRV. Along with scientific programs and field tests, there is also a debate as to the overall feasibility and cost–benefit ratio of remote sensing approaches, depending on the wide range of ecosystem and land use conditions as well as the range of approaches to carbon credit accounting [
7].
In many parts of the world, especially in tropical region, the frequent cloud conditions often restrain the acquisition of high-quality remotely sensed data by optical sensors. The acquisition of cloud-free, wall-to-wall optical satellite images in tropical countries is almost impossible [
8]. Thus, SAR data become the only feasible way of acquiring remotely sensed data within a given timeframe because the SAR systems are independent of cloud coverage, weather, and light conditions. Due to this unique feature compared with optical sensor data, the SAR data have been used extensively in many fields, including forest-cover identification and mapping, discrimination of forest from other land covers, and forest biomass estimation.
Previous studies demonstrated that the L-band polarimetry backscatter tends to saturate at certain levels of biomass, and hence limits the accuracy of estimates [
9,
10,
11]. However, the saturation level varies with the type and structure of the forest. It was demonstrated that the sensitivity of SAR polarimetry is depending on the structure, density, and tree elements (i.e., trunk/stem, branches, and leaves) of the forests [
12]. Other than these issues, several other inter-related issues can affect the biomass estimations using remotely sensed data. These issues can be generalized into three major groups, which are (i) the natural conditions of the forest, (ii) the forest management system being practiced, and (iii) the technical issues related to the remote sensing system being used [
13].
Short wavelength SAR sensors on board several satellites such as the Earth Resources Satellite (ERS-1), Radarsat, and Environmental Satellite (Envisat) have been used to quantify forest biomass. A number of studies have been conducted in relatively homogeneous or young forests, but the signal tends to saturate at low biomass (100–200 Mg ha
−1) [
14,
15,
16]. However, L-band SAR has shown better potential in retrieving the biomass of forests, including those in the tropics [
17,
18,
19,
20,
21]. Recently, there has been rising interest in integrating data from several SAR sensors and SAR with optical sensors to improve the accuracy of biomass estimates [
22,
23].
In Malaysia, there are limited studies on the applications of SAR for estimating biomass. Out of many studies conducted worldwide, very few have been done in Malaysia [
11,
24,
25]. This indicates that the potential, limitations, and advances of L-band SAR in estimating tropical forest in Malaysia are not extensively explored. Methods of applying this SAR system are also scarcely exploited. Therefore, the objective of this study is to explore the synergy of SAR sensors, i.e., PALSAR-2 L-band and Sentinel-1A C-band for estimation of AGB in inland dry dipterocarp forest in Peninsular Malaysia. This study highlights and discusses advantages and limitations of this technique.
2. Materials and Methods
2.1. The Study Area
The study area comprised lowland, hill, and upper hill dipterocarp forests, which are categorized based on land altitude, i.e., <300, 300–750, and 750–1200 m, respectively. These forests are major, occupying about 5,257,395 ha or about 89% out of the total forested land (i.e., about 5.9 million ha) in Peninsular Malaysia. These forests occur within the entire Peninsular Malaysia, which has an extent between 1–7° latitude and 99–105° longitude. These forests embrace all the well-drained primary forests of the plains, undulating land, and foothills and hill terrain up to about 750 m altitude. Trees from the family of Dipterocarpaceae are dominant species, which make the forests major timber production areas in Peninsular Malaysia. Almost the entire area (i.e., 4.9 million ha) is categorized as Reserve Forest which is meant for production and protection. About 1.98 million ha have been allocated for protection forests in the form of national parks, wildlife sanctuaries, and nature reserves [
26]. The most common tree species found in this forest come from the genera such as Shorea, Hopea, Dipterocarpus, Dryobalanops, Neobalacarpus, Anisoptera, and Vatica. The remaining forested land is comprised of peat swamp, mangrove, and montane forests.
Figure 1 shows the distribution of major forest types in Peninsular Malaysia.
2.2. Satellite Datasets
2.2.1. Satellite Images Acquisition
The satellite datasets that have been used in this study came from two satellites, which are; (i) Advanced Land Observing Satellite 2 (ALOS-2) that carries Phased Array type L-band Synthetic Aperture Radar-2 (PALSAR-2) on-board and (ii) Sentinel-1A that carries C-band imaging SAR sensor.
Table 1 summarizes the properties of the data and
Figure 2 shows both datasets that have been processed and used in this study.
ALOS-2 is the successor of the ALOS, but the structure of the new satellite is quite different from its predecessor. ALOS was launched in January 2006 and brought the Phased Array type L-band Synthetic Aperture Radar (PALSAR) on-board. After five years of observations, it stopped transmitting in April 2011. ALOS-2 was then launched on 24 May 2014, which carried the PALSAR-2 sensor. PALSAR-2 is currently operating and producing L-band SAR data, that has similar (with some advancements) characteristics with PALSAR. The data were acquired from Earth Observation Research Center (EORC) under Japan Aerospace Exploration Agency (JAXA). The data was acquired under the Kyoto and Carbon (K&C) Initiave, a research agreement between Forest Research Institute Malaysia (FRIM) and JAXA, whereby FRIM has special permission to access the PALSAR-2 product at all imaging modes and resolutions. JAXA also provides free access of PALSAR-2 mosaic product at 25-meter resolution for public which is available at
http://www.eorc.jaxa.jp/ALOS/en/PALSAR_fnf/data/index.htm.
The SENTINEL-1 mission is the European Radar Observatory for the Copernicus joint initiative of the European Commission (EC) and the European Space Agency (ESA). The SENTINEL-1 was launched on 3 April 2014 and its mission operating in four exclusive imaging modes with different resolution (down to 5 m) and coverage (up to 400 km). It provides dual polarization capability, very short revisit times, and rapid product delivery. The Sentine-1A data was acquired in Level-1 Ground Range Detected (GRD) format so that radar cross-section of both distributed and point targets can be easily derived from the data. The data is available at
https://scihub.copernicus.eu/dhus/#/home and free to download.
Another satellite data that was used in this study was the digital elevation model acquired from the Shuttle Radar Topography Mission (SRTM). This data was used to classify the forest into specified elevation categories, according to the type of forests. It was also used for radiometric terrain correction on both PALSAR-2 and Sentinel-1A images. These data are available at the US Geological Survey's EROS Data Center for download at
http://srtm.usgs.gov/index.html.
2.2.2. Satellite Image Pre-Processing
PALSAR-2 images that were used in this study came in Level-1.5 product, which means the range and multilook azimuth compressed data is represented by amplitude data. The range coordinates were also converted from slant range to ground range, and map projection was performed.
Level-1 GRD consist of Sentinel-1A products consist of focused SAR data that has been detected, multi-looked and projected to ground range using an Earth ellipsoid model such as WGS84. The ellipsoid projection of the GRD products is corrected using the terrain height specified in the product general annotation. The terrain height used varies in azimuth but is constant in range.
Ground range coordinates are the slant range coordinates projected onto the ellipsoid of the Earth. Pixel values represent detected magnitude. Phase information is lost. The resulting product has approximately square resolution pixels and square pixel spacing with reduced speckle at a cost of reduced geometric resolution. In addition, the GRD products have thermal noise removed to improve the quality of the detected image.
Other than these processes, they are two important pre-processing stages, namely, speckle suppression and radiometric terrain correction. Spatial domain Lee Sigma filter with a kernel size of 7 × 7 pixels was used to remove speckle effect on the images. Digital Elevation Model (DEM) acquired from SRTM was used to minimize terrain shadowing effect on the images. The presence of this effect on SAR imagery was because the signal strengths were dependent on two variables, which were incidence angle and surface roughness or topography of terrain. If slope is facing the SAR transmitter, the signal will become stronger than the other side of the slope. Semi-empirical method was used for radiometric terrain correction [
27] and the processes were performed in ENVI/IDL (Harris Corporation, Melbourne, Australia) following the same approach as found in Canty et al. [
28]. This process was necessary to normalize both sides of the slopes and it minimized errors towards the end of AGB prediction.
2.2.3. Satellite Image Calibration
The objective of SAR calibration is to provide imagery in which the pixel values can be directly related to the radar backscatter of the scene. The PALSAR-2 image that was used in this study was built on a 16-bit data type and all pixels have digital numbers (DN) ranging from 0 to 65,535. These DNs however do not represent the radar signal of features or objects on the ground. Therefore, the DNs have to be converted to backscatter (i.e., the returned radar signals) known as Normalized Radar Cross Section (NRCS) and represented as
σ0 in decibels (dB). The equation that was used for the calculation of NRCS for PALSAR are slightly different from other sensors in that the usual sine term has already been included in the DN values. Thus, for the products stored at Level 1.5 and above, the equation for NRSC of any of the polarization component can be obtained by the following formula with single calibration factor (CF), which can be expressed as follows [
29].
The Sentinel-1A product uses radiometric calibration look-up table (LUT) to do the calibration. This was performed on Sentinel Application Platform (SNAP) tool, a software that was designed specifically for Sentinel-1 products processing and it available for free at
http://step.esa.int/main/download/. The essential conversion of amplitude to
DN and from
DN to sigma nought were done automatically on SNAP and once the sigma nought values was obtained, the computation of backscatter (
) can be performed as
2.3. Forest-Non-Forest Classification
Forest-non-forest (FNF) classification was performed on PALSAR-2 polarizations images to delineate the forests from other land cover. This process is critical to define the boundary of forests and to ensure that the estimated AGB did not include other types of vegetation. The reason was that the forests are often confused with rubber, teak, and other timber tree plantations, which are common in Peninsular Malaysia and they appear almost identical on both HH and HV polarizations. Instead of using only the original backscatter HH and HV polarizations, an attempt has been also made to derive other image variables derived from PALSAR HH and HV images. Image variables, namely (i) simple band ratio (HH/HV), (HV/HH), (ii) average (HH + HV/2), and (iii) square root of products (√(HH × HV)) were produced.
The incorporation of texture measure also can improve classification of spatially distributed pixels on an image. Gray-level co-occurrence matrix (GLCM) uses a gray-tone spatial dependence matrix to calculate texture values. This is a matrix of relative frequencies with which pixel values occur in two neighboring processing windows separated by a specified distance and direction. For this purpose, texture has been defined as repeating pattern of local variations in image intensity which is too fine to be distinguished as separate class at the observed resolution. Thus, a connected set of pixels satisfying given gray-level properties which occur repeatedly in an image region constitute a textured region. A mean-type GLCM was applied to the original HH and HV polarizations to produce textured images with clearer definitions of the objects on the images [
30].
These inputs were used for the FNF classification and the Maximum Likelihood Classifier algorithms with nearest neighbor technique was applied. The forests were then further classified into several forest types by using the DEM from SRTM.
2.4. Forest Survey Data
The sampling design in this study modified from the standard operating procedure (SOP) that was developed by Winrock International [
31], which follows the Intergovernmental Panel on Climate Change (IPCC) standards [
1]. A cluster comprises of four plots and the design is shown in
Figure 3. The plot was designed in circular with smaller nests inside. The biggest nest measures 20 m in radius, followed by the smaller nests, measuring 12 and 4 m. The sizes of trees were measured according to the nest sizes, which is summarized in
Table 2. Depending on the nest size, it indicates that not all stands were measured in a single plot. In additional to these nests, there is another small nest measuring 2 m in radius, which is used to count saplings (i.e., trees measuring <10 cm in diameter at breast height (dbh) and >1.3 m in height). Clustering of plots at each sampling unit was recommended for natural forest areas and areas that have been selectively logged. The sampling system was designed in such a way to make the data collection process easier and faster, but reliable and representative for a particular forest stratum.
A forest ecosystem normally has five terrestrial carbon pools, which are; (i) aboveground living biomass, (ii) belowground living biomass, (iii) deadwood, (iv) non-tree vegetation and litters, and (v) soil. However, one of the most significant carbon pools is aboveground biomass as it the easiest and the most practical pool to assess, while being representative to an ecosystem. Aboveground biomass comprises all the living components of a tree, including stems, branches, and leaves. Allometric functions are the best way that AGB can be estimated. In this study, a published allometric function for dry inland forest in Asia region was used to estimate the AGB of living trees [
32].
AGB denotes aboveground biomass (kg/tree), E represents bioclimatic variable, ρ is wood specific gravity/wood density, and D is dbh.
A total number of 332 plots have been surveyed between years 2014 and 2016 and were used as sample plots information for this study. The forest survey was conducted in a number of field trips that cover mainly the central parts of Peninsular Malaysia. The States include Terengganu, Pahang, Johor, Negeri Sembilan, Selangor, Perak, Kelantan, and Perlis in the north. In each plot, every tree which meets the dbh size in the nest radius was inventoried. Species of every stand being inventoried was also recorded. Position (coordinate) of each plot was recorded at the center by using hand-held Global Positioning System (GPS) (Trimble Inc., Sunnyvale, CA, USA). The locations of all plots were post-processed by using base position data from the Department of Survey and Mapping Malaysia to ensure the accuracy of the position acquired. Locations of the sample plots are shown in
Figure 4 and a summary of the sample plots is given in
Table 3.
2.5. Correlation Analysis
The backscatter values from both PALSAR-2 and Sentinel-1A were extracted from the images, which represented HH, HV, VV, and VH polarizations. The AGB at the sample plots on the ground was then correlated with the corresponding backscatter of these polarizations by using linear regression method. Instead of using the single polarization as a variable, several other variables have been derived by manipulating these single polarizations. This manipulation was performed to produce variety of image variables and that to examine the roles of polarization in estimating AGB.
Table 4 lists the variables that have been derived from the individual PALSAR-2 and Sentinel-1A and combination of polarizations from both sensors. These variables act as predictors to the AGB at the sample plots.
Simple linear regression method was used to investigate the relationship between the AGB and the image variables. Multiple linear regressions were also applied to the variables to observe whether the combination of polarizations from both PALSAR-2 and Sentinel-1A able to improve strength of the correlations. Manipulation of polarization of individual PALSAR-2 and Sentinel-1A as well as combination of both sensors were tested and multiple variable equations have been produced. The relationship between backscatter and AGB is represented in a common linear function as y = ax + b, x and y denote image variables and AGB, respectively and a and b are the equation coefficients. The strength of the relationship was measured by the derived coefficient of correlation (R2). The greater R2 indicates a stronger relationship between two variables; the value R2 of 0 means no correlation and 1 is a perfect correlation. In this case, the prediction equation with the highest R2 was selected to estimate AGB within the entire study area.
Studies [
9,
11,
33] have demonstrated that PALSAR polarization data actually has a logarithmic relationship with AGB. Therefore, instead of employing linear regression only, the study also attempted to correlate the AGB with the polarizations in a non-linear form. However, this method was applied only on the individual polarizations, i.e., HH, HV, VV, and VH. Similar to simple linear regression, the estimation models used AGB as independent variable to observe the sensitivity of the backscatter to the AGB. The relationship between backscatter and AGB is commonly represented in exponential an function as
y =
a ×
e(xb), where
x and
y denote image variables and AGB, respectively and
a and
b are the equation coefficients.
2.6. Validation Approach
The study used K-fold cross validation method to evaluate the performance of the best prediction model derived from PALSAR-2, Sentinel-1A, and combination of both data. This method provided better indication on the prediction performance than the common residual method. Residual method does not provide an indication as to how well the model makes new predictions over new sample data, but this method does. In this study, 10-fold cross validation method [
34] was used where all sample plots data were randomly grouped into 10 groups. One group was used as a testing set while the other nine groups were used in developing the model. The root mean square error (RMSE) was calculated using the testing set. This process was iterated 10 times where each group was used as a testing set once. Then, the average of all RMSEs was calculated to get the overall RMSE of that model.
4. Conclusions
The study has successfully quantified the AGB over the lowland, hill, and upper hill dipterocarp forests in Peninsular Malaysia. The total AGB was estimated at about 1.82 trillion Mg over the year 2016. The extent of forested area—i.e., 5,895,810 ha—was also identified from the L-band PALSAR-2 data. The study confirmed that the synergetic of PALSAR-2 and Sentinel-1A produced better estimates than the single sensor. Although there were limitations found, the study provided an alternative for AGB retrieval that can be utilized in a practical manner to assist in the management and protection of forested areas. The study, to some extent, can also provide a significance contribution towards the MRV in the REDD+ implementation. One of greatest advantages of using the PALSAR-2 and Sentinel-1A data is the free access policy to the datasets. Free-cloud cover and rapid acquisition made them more valuable, especially for this kind of study in Malaysia.