Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models

Chinembiri, Tsikai Solomon; Mutanga, Onisimo; Dube, Timothy

doi:10.3390/rs14225676

Open AccessFeature PaperArticle

Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models

by

Tsikai Solomon Chinembiri

¹,

Onisimo Mutanga

^1,*

and

Timothy Dube

²

¹

College of Agricultural, Earth and Environmental Sciences, University of KwaZulu-Natal, Private Bag X01, Scottsville 3209, South Africa

²

Institute of Water Studies, Department of Earth Sciences, University of the Western Cape, Private Bag X17, Bellville 7535, South Africa

^*

Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(22), 5676; https://doi.org/10.3390/rs14225676

Submission received: 19 October 2022 / Revised: 3 November 2022 / Accepted: 3 November 2022 / Published: 10 November 2022

(This article belongs to the Special Issue Monitoring Forest Carbon Sequestration with Remote Sensing)

Download

Browse Figures

Versions Notes

Abstract

:

This study sought to establish the performance of Spatially Varying Coefficient (SVC) Bayesian Hierarchical models using Landsat-8, and Sentinel-2 derived auxiliary information in predicting plantation forest carbon (C) stock in the eastern highlands of Zimbabwe. The development and implementation of Zimbabwe’s land reform program undertaken in the year 2000 and the subsequent redistribution and resizing of large-scale land holdings are hypothesized to have created heterogeneity in aboveground forest biomass in plantation ecosystems. The Bayesian hierarchical framework, accommodating residual spatial dependence and non-stationarity of model predictors, was evaluated. Firstly, SVC models utilizing Normalized Difference Vegetation Index (NDVI), Soil Adjusted Vegetation Index (SAVI), and Enhanced Vegetation Index (EVI), derived from Landsat-8 and Sentinel-2 data and 191 sampled C stock observations were constructed. The SVC models built for each of the two multispectral remote sensing data sets were assessed based on the goodness of fit criterion as well as the predictive performance using a 10-fold cross-validation technique. The introduction of spatial random effects in the form of Landsat-8 and Sentinel-2 derived covariates to the model intercept improved the model fit and predictive performance where residual spatial dependence was dominant. For the Landsat-8 C stock predictive model, the RMSPE for the non-spatial, Spatially Varying Intercept (SVI) and SVC models were 8 MgCha⁻¹, 7.77 MgCha⁻¹, and 6.42 MgCha⁻¹ whilst it was 7.85 MgCha⁻¹, 7.69 MgCha⁻¹ and 6.23 MgCha⁻¹ for the Sentinel-2 C stock predictive models, respectively. Overall, the Sentinel-2-based SVC model was preferred for predicting C stock in plantation forest ecosystems as its model provided marginally tighter credible intervals, [1.17–1.60] MgCha⁻¹ when compared to the Landsat-8 based SVC model with 95% credible intervals of [1.13–1.62] Mg Cha⁻¹. The built SVC models provided an understanding of the performance of the multispectral remote sensing derived predictors for modeling C stock and thus provided an essential foundation required for updating the current carbon forest plantation databases.

Keywords:

Bayesian hierarchical modelling; geostatistics; Eucalyptus grandis; Eucalyptus camaldulensis; Pinus patula; spatial random effects; spatially varying coefficient

1. Introduction

Since the onset of the Fast Track Land Reform (FTLRP) program in Zimbabwe and the subsequent redistribution and resizing of large-scale land holdings in the year 2000, plantation forests within and around the neighborhood of resettlement areas continuously faced physical distress [1]. Zimbabwe lost close to 224,000 ha of tree cover, which is equivalent to a forest degradation of 17% and 88 Mt of CO₂ emissions from 2000 to 2021 [2]. The steady decline of land under forest over the years has therefore been mainly driven by the activities of the year 2000 taking place on resettled land, which included wildfires, illegal logging, mining, and agriculture expansion. Yet the amount of additional biomass that can be accumulated in these areas depends much on the forest condition and land management practices.

A number of studies modeling the relationship between C stock and remote sensing-derived information adopt a standard approach where the effects of independent variables, including vegetation indices on C stock, remain constant over space in the model [3,4]. Given the history of Zimbabwe’s land reform program, it is well known that management and conservation practices of plantation forests have been changing over the years, particularly since the adoption of the program in the year 2000. Greater portions of formally designated commercial plantation forests were occupied and replaced by subsistence farmers across the country. It is upon this premise that it cannot be assumed that the impact of covariates on C stock is constant in these ecosystems. The carbon sequestration potential in various regions of the occupied forest plantations, therefore, remains unknown as the management and conservation practices of the managed ecosystems have been largely modified and altered [5]. Yet government and other stakeholders in the timber industry are obliged under the 2015 Paris climate agreement to provide accurate accounts of the country’s carbon sequestration potential for managed and natural forest ecosystems.

The literature proffers methods used in the estimation of AGB, either as direct or indirect approaches from forest inventory data. These methods either use allometric equations, conversion factors such as wood density, or biomass expansion factors (BEF) [1]. In spite of the advantages of using conventional methods given in [6,7] as generally providing accurate AGB estimates, these techniques are also regarded as time-consuming and environmentally unfriendly. The inaccessibility of some areas due to complicated topography and forest conditions also makes conventional methods less attractive to AGB estimation, especially in extensive areas [8]. However, remote sensing is emerging as a promising technique free of the aforementioned limitations as it offers cost-effective methods of AGB monitoring through stratification of canopy density and forest types. Repeated application of remote sensing leads to the generation of historical data critical for change detection analysis and incorporation into a Geographic Information System (GIS) for integration with other datasets [9]. However, remote sensing methods are also not immune from limitations as they usually face limitations in areas of bad weather conditions, especially in areas with cloud cover, and require a certain level of training for effective use and application [7].

Remote sensing C stock data derived from satellite imagery has significantly grown over the years. This data has been the basis for informing international policy agreements associated with CO₂ emissions into the atmosphere, mainly from deforestation and other land use land cover changes (LULCC) [10]. Remote sensing is regarded as a powerful tool for deriving AGB and forest structure as it offers practical means of acquiring spatially-distributed forest carbon from local, continental, and global scales [5,11]. Forest biomass can generally be measured from three broad categories of remote sensing (RS) data, namely, passive optical remote sensing, radio detection and ranging, microwave (radar), and light detection and ranging (LiDAR) [8,10]. Passive optical spectral reflectance is responsive to vegetation structural attributes (tree density, leaf area index, and crown size), shadow, and texture [8,12]. Crown size, tree density, and leaf area index (LAI) are strongly correlated with AGB. Radar remote sensing measure geometrical and dielectric attributes of forests. LiDAR RS methods of biomass measurement characterise forest vertical structure and height. Remote sensing (RS) and ancillary technologies such as Geographical Information System (GIS) are practical and cost-time effective, allowing for imaging and research on extensive and inaccessible areas.

The estimation of plantation forest C stock, together with other structural parameters using new regeneration multispectral remote sensing data, is a relatively new ground in the climate change and carbon monitoring arena. Research in this field has demonstrated the potential for remote sensing data as tools for developing estimates of forest attributes, amongst them, being Above Ground Biomass (AGB), either as a standalone or coupled with other earth observation techniques [11,13]. Spatial regression models applied in mapping C or biomass using new-generation multispectral remote sensing data may fail to clearly make room for residual spatial dependence [14,15]. Modeling of natural resource variables without explicitly accommodating spatial variation can be justified if covariates can account for all the spatially structured dependence. Such assumptions do not hold in many practical applications involving georeferenced data.

Failure to account for spatial dependence in the modeling of regionalized variables can lead to inaccurate model parameters and incorrect predictions [9]. Subsequently, disregarding spatial correlation in electing model choices can result in higher prediction uncertainty for inference of the outcome variable. In addition to the drawbacks highlighted in the foregoing, non-Bayesian spatial modeling can further lead to underestimation of uncertainty [9,16] as traditional spatial regression estimation methods assume stationarity of the covariance matrix,

Σ

. The ultimate effect is the derivation of standard error estimates that are unable to take all the uncertainties in the parameters into account. Checking for spatially correlated residuals when spatial data are employed in the modeling of above-ground biomass is therefore critical.

Some attention has been given to spatially varying coefficient (SVC) models in the literature [17,18,19]. The Bayesian framework of statistical inference is the bedrock of SVC models in which analyses make use of samples derived from Markov Chain Monte Carlo (MCMC) techniques from the posterior distribution of model parameters [20]. What makes Spatially Varying Intercept (SVI) and SVC models unique in applied geostatistical and the remote sensing literature is their ability to consider the residual spatial dependence and the non-homogeneity in model parameters differently than ordinary geostatistical approaches. SVI models assume the model intercept is spatially varying, whilst the SVC models assume all the model regression coefficients to be spatially varying [21]. A number of applications and methodologies utilizing spatially varying coefficient models are documented in the literature. Amongst them are the geographically-weighted regression (GWR) by [21], who employed the technique of canopy height prediction using remote sensing data. Application of these models resulted in significant improvement in model fit. On the other hand, [22] made provisions for spatial dependence and parameter non-homogeneity by exploring geostatistical kriging variants for forest canopy height prediction. Co-kriging and regression kriging models resulted in significant improvements in model fit.

However, it has been shown in recent times that GWR might not be robust to correlation among predictors and can potentially lead to inaccurate results when complex correlation structures are involved [18,23]. Again, from an inferential viewpoint, GWR can present problems when drawing inferences regarding prediction uncertainty and model parameters. The lack of valid underlying probability models in GWR makes prediction uncertainty, and standard parameter error estimates difficult to justify. For instance, uncertainty maps produced from kriging and co-kriging techniques make no consideration of the uncertainty in the variogram-based spatial covariance parameters. This is an established and common weakness encountered when using these geostatistical approaches [24]. It is possible to estimate spatial covariance parameters within the SVC and SVI models using a Bayesian hierarchical construction. Such an approach allows the propagation of uncertainty to the prediction of the response variable [25]. In such scenarios, a better and statistically defendable map of uncertainty can be produced than would else be produced when GWR or traditional kriging methods are utilized.

Landsat-8 and Sentinel-2 are multispectral platforms categorized as new generation remote sensing sensors with enhanced spectral and spatial properties than previous missions of the Landsat series. It is this perceived improvement in earth observation properties that we expect to give a dividend to C stock predictive models for applications in carbon accounting and inventorying under the United Nations Framework Convention on Climate Change (UNFCC) [8]. We employed a Bayesian hierarchical framework with spatially varying coefficients (SVC) using predictor information derived from the aforementioned sensors for predicting C stock. The modeling framework considered the non-stationarity and residual spatial dependence of model covariates through the inclusion of spatial random effects into the SVC models. We, therefore, developed and compared the performance of C stock predictive models under spatially varying regression coefficients derived from Landsat-8 and Sentinel-2 predictors. Prediction accuracy and uncertainty quantification for the REDD collaborative program in the developing world is a critical aspect of the Carbon Measurement, Reporting, and Verification Systems (MRVs) of the United Nations (UN-REDD, 2009; CMS, 2014). Thus, in this paper, we explored how spatially varying coefficient models constructed using a Bayesian hierarchical set-up with aiding information from Landsat-8 and Sentinel-2 multispectral sensors and implemented through Markov Chain Monte Carlo (MCMC), perform in C stock prediction in plantation forest ecosystems.

2. Materials and Methods

2.1. Study Area

The study was carried out at lot 75 A of Nyanga Downs in Nyanga district in the Eastern Highlands of Zimbabwe (Figure 1). The study area is dominated by Eucalyptus grandis, Eucalyptus camaldulensis, and Pinus patula plantation forest species which have some of its patches being used for agriculture, grazing, and gold panning and is located between latitude 32°40′E and 32°54′E and 18°10′12″S and 18°25′4″S longitude as illustrated in Figure 1 [5,26]. Grazing, agriculture, and gold panning activities came after part of the commercially owned plantation forests were redistributed to small and medium-sized indigenous farmers in 2000. This development has increased the interface between settlements and timber plantations in all forests originally designated under forest plantations in Zimbabwe. The study area covers an area of approximately 2766 ha. Rainfall amounts are varied, with a mean annual precipitation range of 741 mm to 2997 mm. Annual mean temperatures range from a minimum of 9 °C to 12 °C to a maximum of 25 °C to 28 °C. The weather is very hot, and extensive wildfires occur in the high-elevation grasslands from August to November when the grasses are dry [26].

As illustrated in Figure 1c, Eucalyptus camaldulensis and Pinus patula are the most dominant species in the study area. Pockets of cultivated land within the plantations are evident and are partly responsible for the present biomass density in the sampled region. Greater portions of former plantation vegetation have been cleared by pockets of resettled small-scale and medium-scale farmers venturing into coffee and tea plantations and, in some cases, for dairy farming. Patches of unprotected forest plantations are still present but remain vulnerable to attack for agriculture by settled farmers in the area.

2.2. Sampling Design

We employed the spatial coverage sampling scheme that exploits the Mean Squared Shortest Distance (MSSD) for the optimization of data locations in the study domain. The k-means clustering scheme for equal area coverage was therefore used [27] for obtaining a representative sample from the studied region. The work of [28] demonstrated how the mapping and estimation of mean spatial problems could be resolved through the employment of a uniform coverage sampling scheme. MSSD is particularly suited for areas where sampling campaigns cannot be extended beyond a single phase. As illustrated in Figure 2, the smallest separation distance between samples was approximately 8 m whilst the largest separation distance between samples was 2500 m.

2.3. Above-Ground Biomass Measurements and C Stock Derivation

We sampled and collected measurements of Diameter at Breast Height (DBH) for Above Ground Biomass (AGB) estimation for all trees with a DBH of more than 10 cm (1.3 m) using 500 m² circular plots from the 19 September 2021 to 24 October 2021 in Manicaland Province of Zimbabwe as illustrated in Figure 1. The sampling program meant to measure DBH for subsequent AGB and C stock estimation utilized the MSSD optimization function, resulting in 191 sampling plots being measured from the study area. The 191 sample plot measurements of DBH were then transformed into per plot C stock data using allometric equations of [4] for the Pinus species, whilst the allometric equations of [10] were used for deriving C stock for the Eucalyptus species. The aforementioned allometric equations were also applied by [26].

Allometric equations shown in Equations (1) and (2) were used for the calculation of Above Ground Biomass (AGB) for the Pinus patula and the Eucalyptus grandis and Eucalyptus camaldulensis species, respectively [1]. A default conversion factor of 0.47 used by the IPCC was applied to derive AGB to C stock.

t D w = e^{(- 1.170 + 2.119 \times l n (d b h))}

(1)

t D w = 0.39 \times {(d b h)}^{2.142}

(2)

2.4. Modelling Framework

It is a common geostatistical practice to assume at location

S \in D \subseteq ℝ^{2}

where

s

is a vector of observed

x, y

coordinates within the domain

D

. A Gaussian response variable

y (s)

is therefore modeled through the regression model as in Equation (3):

y (s) = x {(s)}^{'} β + \tilde{x} {(s)}^{'} w (s) + ε (s)

(3)

x {(s)}^{'}

denotes a set of

p

covariates in the model. In this case, the linear mean structure accounting for wide-scale variation in the response is comprised of

p x 1

vectors of

x (s)

which include an intercept and spatially varying georeferenced predictors together with an associated column vector of model coefficients

β = {(β_{0}, β_{1}, \dots, β_{p - 1})}^{'}

. The

\tilde{x} (s)

in the model represents a

q x 1

vector accommodating the intercept and those predictors from

x (s)

whose impact on the response is assumed to vary spatially. The space-varying impact is obtained from the vector of spatial random effects

w (s) = {(w_{1} (s), w_{2} (s), \dots, w_{q} (s))}^{'}

. The specification of different combinations of

\tilde{x} (s)

and the associated

w (s)

leads to the formation of different sub-models. We model

ε (s)

as an independent white noise process that takes care of the micro-scale (measurement error) variation. As such, with the collection of

n

C stock locations,

S = s_{1}, s_{2}, \dots, s_{n}

, we assume the

ε (s_{i})

’s are independent and identically distributed (

i i d

) as provided by

N (0, τ^{2})

where

τ^{2}

is the nugget.

The spatial structure of this model is generally introduced by way of a multivariate Gaussian process (GP) [24,25,29] in which a cross-covariance function clearly models the covariance of

w (s)

within and among data points. The added flexibility in this model is documented in the literature [17,30]. We assume in this study that elements of

w (s)

emanate from

q

independent univariate GPs. Precisely, the process associated with the

k - th

prediction is

w_{k} (s) ~ G P (0, (., .; θ_{k}))

where

C (s, s^{*}; θ_{k}) = C o v (w_{k} (s), w_{k} (s^{*}))

is a valid spatial covariance function modelling the covariance related to a pair of observations

s

and

s^{*}

. The process outcomes are gathered into an

n x 1

vector, say

w_{k} = {(w_{k} (s_{1}), \dots, w_{k} (s_{n}))}^{'}

which permits a multivariate normal distribution

M V N (0, Σ_{k})

, in which

Σ_{k}

is an

n x n

covariance matrix of

w_{k}

with the

(i, j) - th

element provided by

C (s_{i}, s_{j}; θ_{k})

. Evidently,

C (s, s^{*}; θ_{k})

cannot just be a function, but guarantees that the resulting

Σ_{k}

matrix is positive definite and symmetric. Functions of this type are regarded as positive definite and are referred to as the characteristic function of a symmetric stochastic variable [24,25,31].

We denote

C (s, s^{*}; θ_{k}) = σ_{k}^{2} ρ (s, s^{*}; θ_{k})

with

θ_{k} = \{σ_{k}^{2}, ϕ_{k}\}, ρ (.; ϕ_{k})

to be a valid spatial correlation function in which

ϕ_{k}

computes the correlation decay rate and

v a r (w_{k}) = σ_{k}^{2}

. We assumed for all the accompanying analyses an exponential correlation function

ρ (||s - s^{*}||; ϕ_{k}) = e x p (- ϕ_{k} ||s - s^{*}||)

, where

||s - s^{*}||

represents the Euclidean distance between location

s

and location

s^{*}

. Completion of the Bayesian modeling framework requires specification and assignment of prior distributions to the parameters of the model, where inference proceeds by sampling from the posterior distribution of the modeled parameters. As standard practice, we assume

β ~ M V N (μ_{β}, Σ_{β})

prior where

μ_{β} = 0

and

Σ_{β} = 10,000 I_{p}

, whilst the spatial variance components

σ_{k}^{2}

’s and the measurement error variance

τ^{2}

are designated inverse-Gamma,

I G (a, b)

priors. The spatial decay parameters

ϕ_{k} ~ U n i f (a, b)

with the lower and upper bounds of the distribution covering the geographic domain of the sampled study region.

Applying notation similar to the ones used by [32], we can specify the model parameter posterior distribution as

p (Ω | y)

where:

Ω = \{β, w_{1}, w_{2}, \dots, w_{q}, σ_{1}^{2}, σ_{2}^{2}, \dots, σ_{q}^{2}, ϕ_{1}, ϕ_{2}, \dots, ϕ_{q}, τ^{2}\}

y = {(y (s_{1}), \dots y (s_{n}))}^{'}

and is proportional to:

\prod_{k = 1}^{q} U n i f (ϕ_{k} | a_{ϕ k}, b_{ϕ k}) x \prod_{k = 1}^{q} I G (σ_{k}^{2} | a_{σ k}, b_{σ k}) x N (β | μ_{β}) x I G (τ^{2} | a_{τ}, b_{τ}) x

\prod_{k = 1}^{q} N (w_{k} | 0, Σ_{k}) x \prod_{i = 1}^{n} N (y (s_{i}) | x {(s_{i})}^{'} β + \tilde{x} (s)^{'} w (s), τ^{2})

(4)

An effective Markov Chain Monte Carlo (MCMC) function for the estimation of Equation (4) is derived through updating of

β

from its full conditional and utilizing Metropolis procedures for the remainder of the parameters. Reparameterization of the model is an alternative way of ensuring the spatial random effects

w

do not require direct sampling [33]. The spatial random effects could represent other independent variables that are spatially structured and not taken into consideration in the current modeling approach. Nonetheless, the MCMC process produces posterior samples of the parameter space,

Ω

.

From a prediction point of view, if

S_{0} = \{s_{0, 1}, s_{0, 2}, \dots, s_{0, m}\}

is a set of

m

new sites, the spatial random effects posterior predictive distribution corresponding to the

k - t h

regression coefficient is provided by:

p (w_{k, 0} | y) \propto \int p (w_{k, 0} | w_{k}, Ω, y) p (w_{k} | Ω, y) p (Ω | y) d Ω w_{k}

(5)

where

w_{k, 0} = {(w_{k} (s_{0, 1}), w_{k} (s_{0, 2}), \dots, w_{k} (s_{0, m}))}^{'}

.

Since we are making use of MCMC sample-based inference, the integral in Equation (5) does not need to be evaluated precisely. Instead, given

L

posterior samples for the parameter space (

Ω

), that is,

{Ω^{(l)}}_{l = 1}^{L}

, composition sampling can be used to derive this distribution [33] by first drawing

w_{k}^{l}

followed by

w_{k, 0}^{(l)}

for each

l

from

p (w_{k, 0} | w_{k}^{(l)}, Ω^{(l)}, y)

. The last distribution is multivariate normal as it is a derivative of a conditional distribution from a multivariate normal distribution. More specifically, the process realizations over the new sites are conditionally independent of the measured outcomes given the values over the sampled locations and the process parameters. Expressed differently,

p (w_{k, 0} | w_{k}, Ω, y) = p (w_{k, 0} | w_{k}, Ω)

is a multivariate distribution with mean and variance given by

E [w_{k, 0} | w_{k}, Ω] = C o v (w_{k, 0}, w_{k}) v a r^{- 1} (w_{k}) w_{k} = R_{0} {(ϕ_{k})}^{'} R {(ϕ_{k})}^{- 1} w_{k}

(6)

and

v a r [w_{k, 0} | w_{k}, Ω] = σ_{k}^{2} {R (ϕ_{k}) - R_{0} {(ϕ_{k})}^{'} R {(ϕ_{k})}^{- 1} R_{0} (ϕ_{k})}

(7)

where

R_{0} (ϕ_{k}

is an

n x n

matrix with

(i, j) - th

element specified as

ρ (s_{0, i}, s_{j}; θ_{k})

and

R (ϕ_{k}

is an n x n matrix with

(i, j) - th

element provided by

ρ (s_{i}, s_{j}; ϕ_{k})

. Repetition of the same procedure results in the generation of samples for all the

w_{k, 0}

’s. Lastly, for a set of predictors at unsampled locations

s_{0}

, posterior predictive distribution samples regarding the outcome variable

y {(s_{0})}^{l}

, are derived from

N (x {(s)}_{0}^{'} β^{(l)} + \tilde{x} (s_{0})' w_{0}^{(l)}, τ^{2 (l)})

for

l = 1, 2, \dots, L

.

We evaluated 95% credible interval widths (

C I W s

) using the posterior predictive distribution of Landsat-8 and Sentinel-2 C stock models by calculating the difference between the 2.5% and the 97.5% quantile bounds. The 95%

C I W s

were therefore used as summaries of the C stock prediction uncertainty for the Landsat-8 and Sentinel-2 C stock-based spatially varying coefficient models.

2.5. Competing Models

We derived five candidate models for each of the two multispectral remote sensing-based C stock models using Equation (1) using

N D V I

and

S A V I

as predictors of C stock [34]. The models included the non-spatial where

w_{k}

’s is fixed to zero; the spatially varying intercept (

S V I

) in which we only included the spatial random effects related to the model intercept; the complete

S V C

model in which all predictors have associated spatial random effects; the

S V C - β_{1}

which included the spatial random effects for the intercept and

N D V I

predictor variables and the

S V C - β_{2}

which included the spatial random effects for the intercept.

We utilized empirical semivariograms modeled for the residuals of the independent error model as guidelines for candidate model

I G

and

U n i f

hyperprior specifications. Precisely, for the variance parameters, the Inverse Gamma hyperprior

a

was set equal to

1.76

, which would result in a mean prior distribution equal to

b

and infinite variance [35]. To add on, the models’

b

hyperpriors for the

τ^{2}

and

σ^{2}

’s were calibrated in accordance with the sample variograms of the simple linear regression model residuals derived from Landsat-8 and Sentinel-2 sensor’s nugget and partial sill. We programmed the prior for the spatial decay parameter

ϕ

’s to

U n i f (0.38, 0.0012)

which, adopting the exponential covariance function, equates to support an effective range spanning between

~ 8

and

2500 m

. We define the effective spatial range as the distance where the correlation equals 0.05 [18].

Three Markov Chain Monte Carlo (MCMC) chains were run for 20,000 iterations, each with the computationally demanding model requiring approximately 30 min to complete a single MCMC chain. We diagnosed convergence using the CODA library in the R Statistical and Computing environment by monitoring the mixing of chains and the Gelman–Rubin statistic [36]. Acceptable convergence was established within 10,000 iterations for all the models. The posterior inference was premised on a post-burn-in subsample of 15,000 iterations, that is, every third sample from the last 15,000 iterations of each chain. SVC and SVI models were fit using the spBayes R Statistical and Computing Library version 0.4.3. We, therefore, utilized the spBayes R statistical package for fitting all the predictive models for both Landsat-8 and Sentinel-2 SVC models.

2.6. Landsat OLI and Sentinel-2 MSI Imagery Derived Covariates

Landsat-8 has a revisit period of 16 days and offers nine spectral bands with a spatial resolution of 30 m for Bands 1 to 7 and 9 [6,37]. The panchromatic band, Band 8, has a spatial resolution of 15 m. On the other hand, Sentinel-2 has thirteen spectral bands where four bands are configured at 10 m spatial resolution, six bands at 20 m, and three bands at 60 m spatial resolution [37]. Common vegetation indices utilized in the estimation of biophysical variables of Absorbed photosynthetically active radiation (APAR), Leaf Area Index (LAI), and biomass are Normalised Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI) and Soil Adjusted Vegetation Index (SAVI) [5]. We utilized the same Vegetation Indices (VIs) in this study. Readily available and cost-effective new-generation sensors (Landsat-8 and Sentinel-2) with improved spectral and spatial resolution were therefore utilized in the modeling of C stock under the spatially varying coefficients assumption.

We obtained Landsat 8 imagery from the United States Geological Survey Earth Explorer (http://earthexplorer.usgs.gov) as data ready for analysis (ARDs) on the 20 of September 2020. All the datasets were riddled with cloud cover and shadow cloud cover limits set to smaller than 10%. We acquired Sentinel-2 cloud-free images on 20 September 2020 at the same time as the Landsat 8 OLI data collection covering the entire area coinciding with the dates Landsat-8 OLI was collected covering the domain of interest. The multispectral instrument is the main imaging instrument used for Sentinel-2, a push broom scanner that measures the terrestrial Top of the Atmosphere (TOA) reflectance in thirteen spectral bands, that is, 443 nm to 2190 nm. We derived Sentinel-2 spectral data as level-1C 12-bit automated TOA reflectance values. Orthorectification and pre-processing of the TOA-derived products were performed using the sen2r package of the R Statistical and Computing environment [38].

Soil Adjusted Vegetation Index (

S A V I

), Enhanced Vegetation Index (

E V I

), and Normalized Difference Vegetation Index (

N D V I

were used as Landsat-8 and Sentinel-2 derived covariates for C stock prediction in the plantation forest of the eastern highlands of Zimbabwe. The literature is replete with studies that have utilized the aforementioned vegetation indices [3,39] as ABG predictors in biomass estimation. In this study, predictor variables supplied by both Landsat-8 and Sentinel-2 are used specifically for comparing SVC Bayesian hierarchical geostatistical models predicting C stock in a disturbed environment in Zimbabwe. Given information regarding C stock distribution at each location, we fit the model in Equation (1)

S V C

with

p = 2

. Hence, we have two processes, an intercept and one slope process relating to

N D V I

.

2.7. Model Fit and Prediction Accuracy Evaluation

We assessed the performance of the models using the commonly used Deviance Information Criterion (DIC) to categorize models in terms of how well they fit the data [40]. The sum of the Bayesian Deviance and the effective number of model parameters make up the DIC criterion. The Bayesian deviance measurement, which assesses model goodness of fit, and the effective number of model parameters which penalizes model complexity, are measured by D and pD, respectively. Attractive models have lower DIC values.

Predictive performance was evaluated through a

k - f o l d

cross-validation technique. C stock was predicted from observations within each subset, given the estimated parameters from the remaining subsets. We employed the Root Mean Squared Prediction Error (RMSPE) as a metric from the R Statistical and Computing environment to calculate the sampled C stock data values and the accompanying median of the posterior predictive distribution (PPD). Models with lower RMSPE signify more accurate C stock predictions.

3. Results

3.1. Multispectral Remote Sensing C Stock Derived Predictors

Employed predictors in C stock prediction using a Bayesian hierarchical framework with spatially varying coefficients showed NDVI as a significant predictor of C stock as illustrated in Table 1. 95% credible intervals of SAVI and EVI contained zero and hence, rendering them as insignificant predictors of C stock. These predictors were therefore excluded in the final prediction and mapping of C stock distribution.

3.2. Candidate Models and Parameter Estimates

Model posterior estimates of the regression coefficients for the Landasat-8 and Sentinel-2-based C stock models are illustrated in Table 2 and Table 3, respectively, for the non-spatial, SVI, and SVC models. Spatial autocorrelation is modeled in the residuals in the SVI, SVC, and all the

S V C - β_{k}

variant models. This may entail differences in the posterior regression coefficient parameter estimates, depending on the C stock model parameter structure. Credible intervals (95% CI) for

β_{0}

and

β_{N D V I}

including zero for both non-spatial and SVI models would hint at a non-significant relationship between the C stock observations and predictor variables. However, we cannot apply the same reasoning and interpretation for the SVC models as the predictor-specific spatially varying coefficient maps of

β_{N D V I} + w_{N D V I} (s)

should be considered and ascertain whether location-specific CIs include zero.

C.I means 95% Credible Interval.

Posterior estimates for the Landsat-8 and Sentinel-2 based

β_{N D V I} + w_{N D V I} (s)

coefficients are shown in Figure 3 and Figure 4, with a significant relationship between the outcome variable and

N D V I

being evident over the entire study domain. In both cases, there is significant variability in the values of

β_{N D V I} + w_{N D V I} (s)

in the studied region. This could imply more accessibility to plantation resources by communities settling in the plantation areas. The same trend is noticeable for the Sentinel-2-based C stock-based SVC model (Figure 5 and Figure 6) for both predictor coefficient values of

β_{N D V I} + w_{N D V I} (s)

and for the same region as observed in the Landsat-8 based SVC model in Figure 3. Communities settled within certain areas of the plantation forest have more access to forest resources than in other areas, rendering low C stock density in these areas as demonstrated by the corresponding low

N D V I

values in the same region for the Sentinel-2-based SVC model. Enhanced detail in the spatial resolution of Sentinel-2 based

β_{N D V I} + w_{N D V I} (s)

vindicates variability in

β_{N D V I} + w_{N D V I} (s)

coefficient values in the same region over those derived from generalized Landsat-8 multispectral data [41].

Estimates of

β_{0} + w_{0} (s)

for the Landsat-8 and Sentinel-2-based SVI and SVC models are shown in Figure 3 and Figure 4, respectively. The

β_{0} + w_{0} (s)

pattern for the SVI is the same for both Landsat-8 and Sentinel-2 spatially varying coefficient models whilst the

β_{0} + w_{0} (s)

SVC pattern in Landsat-8 is the same for Landsat-8-based SVC (

β_{N D V I} + w_{N D V I} (s)

) process. On the other hand, the

β_{0} + w_{0} (s)

SVI pattern for Sentinel-2 based model is different for Sentinel-2

β_{N D V I} + w_{N D V I} (s)

SVC model. However, the same is not true for Sentinel-2-based SVI as the partitioning of

w_{0}

into

w_{0}

and

w_{N D V I}

in Sentinel-2 SVC is detailed with enhanced spatial resolution compared to Landsat-8 based

w_{N D V I}

SVC model.

Table 2 and Table 3 illustrate the posterior estimates of the uncorrelated residual variance,

τ^{2}

. The uncorrelated residual variance is largest in the non-spatial model, whilst it is small for the SVC-variant models for both Landsat-8 and Sentinel-2-based SVC. The SVI and the SVC variant models incorporate a spatially varying correlated random effect,

w_{0}

with variance

σ_{0}^{2}

. The SVC model variants further incorporate more spatially varying correlated random effects

w_{N D V I}

and variance

σ_{N D V I}^{2}

. For both Landsat-8 and Sentinel-2 based models, the

w_{0}

and

w_{k}

explained much of the residual variability and hence, a reduced

τ^{2}

. The implication is higher predictive accuracy for the SVC-variant models in both sensors, making the SVI and SVC more attractive over the error-independent models. This is supported by the goodness of fit diagnostics for the SVI and the SVC-variant models illustrated in Table 2 and Table 3 for the Landsat-8 and Sentinel-2 derived regression coefficients, respectively.

In comparison to the SVI model, the SVC model based on both remote sensing-derived covariates reduced the non-spatial residual spatial dependence by incorporating the space-varying impact of

β_{N D V I}

. Estimates of the spatial process parameters have a big difference. In particular, the spatial process parameters for the Landsat-8-based (Figure 3) SVC point estimates of

σ_{0}^{2} = 0.24

and the effective range of 30 m (i.e.,

\approx - l o g (0.05 / ϕ) = - l o g (0.05 / 0.1)

denote a less variable and significantly shorter effective spatial range than the spatial process of the SVI model. The same pattern recurs in the Senitnel-2-based SVI and SVC models, where the effective spatial range reduces from

\approx 1800 m

in the SVI to

\approx 100 m

in the SVC model (Table 3). The non-negligible spatial process parameter estimates of

σ_{N D V I}^{2}

and

ϕ_{N D V I}

in the SVC model denote a potentially space-varying relationship between C stock and multispectral remote sensing derived covariates.

3.3. Landsat-8 and Sentinel-2 C Stock-Based Predictions

The entire SVC model based on Landsat-8 and Sentinel-2-based C stock models generated the lowest DIC, D, and RMSPE, as illustrated in Table 2 and Table 3, respectively. Landsat-8-based SVC 95% CIW is much shorter when compared to the non-spatial fitted model values. A 10% improvement in the RMSPE is seen in the Landsat-8-based model when moving from the error-independent to the SVC model. In the same vein, the Sentinel-2-based model had a 12% improvement from the SVI to the SVC model (Table 3). Predictions produced for both Landsat-8 and Sentinel-2-based C stock models mirrored the observed C stock data in the studied region. Observed C stock data values ranged from log (0.2–6) MgCha⁻¹ (Figure 1). Variability in C stock uncertainty is fairly constant across the studied domain for both Landsat-8 and Sentinel-2 SVC-based models.

The almost similar variability in the density of C stock predicted by both sensors could be attributed to the inadequacy of covariates in the modeling framework, which when the range of modeled covariates is broadened, could accurately depict the variability of C stock in these disturbed plantation forest ecosystems [5].

3.4. C Stock Model Prediction Assessment

Scatterplots observed against predicted C stock alongside the 95% intervals are illustrated in Figure 7. Slight improvement in the performance of Landsat-8 and Sentinel-2 C stock-based predictions can be validated through a visual inspection of the results. Estimated Root Mean Square Prediction Error (RMSPE) for Landsat-8 (6.42 Mg C ha⁻¹) and Sentinel-2 (6.23 Mg C ha⁻¹) based C stock prediction further reinforces the model prediction performance for the two sensors illustrated in Figure 7.

4. Discussion

Space varying coefficient models constructed from different but related new-generation multispectral remote sensing platforms were used to predict C stock in a managed plantation forest ecosystem in Zimbabwe. The RMSPE is marginally higher in Sentinel-2-based C stock SVC model than in the Landsat-8-based SVC counterpart. Our findings with regard to the performance of Landsat-8 and Sentinel-2 sensors are also congruent with the work of [12,37]. SVC models for both new-generation remote sensing-derived predictors showed preference over the error-independent models. However, estimates from the two data sources were marginally different from each other on the prediction of C stock, as illustrated in validation diagnostics [42]. The adaptable structure of the SVC permitted the residuals spatial variability to be apportioned between the random slope and the random intercept. This provided additional benefits not available in the SVI models from the two predictor sources. The SVC permitted reconnaissance of the C stock observations and the

N D V I

predictor. Furthermore, the SVC models in both multispectral data sources had better representation of the processes thereby yielding C stock predictions with reduced variance.

Evaluation of the models utilizing predictors from both remote sensing data sources showed SVC to fit the data better than the SVI models. Kriged maps for C stock using Landsat-8 and Sentinel-2 data were not significantly different from each other, with the Landsat-8 SVC displaying a slightly wider 95% CI compared to the Sentinel-2-based SVC model. Again, this is partly because the study employed conventional bands (indices) that are calculated in a similar fashion in both Landsat-8 and Sentinel-2, and hence, the differences in prediction only emanated from the spatial resolutions. Explicit accommodation of residual spatial dependence through spatially correlated random effects gathered better predictions as the SVC models using the two data sources as regression coefficients borrowed additional information from neighboring C stock observations [43,44]. Precise estimates of covariance parameters are not easy to derive with small inventory sample sizes [45]. Consequently, we might anticipate some impact on the accuracy of the predictions when uncertainty in the covariance ensues to the posterior predictive distributions. Predictions had limited information to borrow from because of the sparsity of C stock inventory observations. This was further worsened by the reduction in the overall sample size through cross-validation.

Similar to previous studies predicting ABG and C stock, our findings establish Sentinel-2 as a better source of RS data for predicting C stock in disturbed environments [37] compared Sentinel-2 and Landsat-8 imagery for forest biomass prediction and showed Sentinel-2 outperforms Landsat-8 because of the enhanced spatial resolution in the former in comparison to Landsat-8. Most studies comparing Sentinel-2 and Landsat-8 for predicting AGB prefer Sentinel-2 over Landsat-8. This is further justified by the work of [13], who compared Worldview-3, Sentinel-2, and Landsat-8 for representing AGB in a forest environment in Thailand and demonstrated Worldview-3 and Sentinel-2 as better data sources and, therefore, predictors than Landsat-8 due to the red-edge and the improved spatial and spectral properties of Worldview-3 and Sentinel-2. Furthermore, [16] utilized LiDAR derived covariates from establishing the prediction performance of SVC models using forest inventory data in North America. The researchers established significant improvement in biomass prediction accuracy in the presence of residual spatial dependence deriving from the finer resolution LiDAR covariates. In most cases, the effectiveness of SVC models in these studies is usually strengthened by the solid non-stationary relationships between the response variable and the predictor variables influenced by unobserved ecological factors operating at broad geographical scales. Such ecological factors were also seen in the present study as NDVI was established to be a statistically significant predictors of C stock in the studied region. The biggest drivers to these factors are the presumed activities of forest disturbance due to human encroachment into plantation forests that subsequently impact the density and distribution of ABG biomass.

4.1. Limitations of the Study

Apart from the vegetation indices influencing the spatial distribution and density of AGB employed in this study, it is also known and acknowledged in the literature that climate and topographic variables play a part in the distribution of C stock. For example, [37,38] have shown elevation and aspect accounting for the bigger portion of the spatial variability of C stock in a mountainous region of Nepal. As such, the limitation of our study is the application of vegetation indices as predictors of C stock, and this may, therefore, not be representative of the general C stock dynamics in the studied region. We, therefore, recommend the integration of topo-climatic factors with new generation remote sensing-derived vegetation indices for future research in order to obtain a more accurate global overview of the C stock density and distribution in the studied region.

4.2. Conclusions

The study presented a hierarchical Bayesian geostatistical spatially varying coefficient model for determining the relationship between sampled C stock data and multispectral remote sensing derived predictors. There was a marginal improvement in model fit, and prediction accuracy in both Landsat-8 and Sentinel-2-based SVC models in comparison to the error-independent models. The SVC model permitted exploration of the observed C stock locations where the models performed well or poorly, which was missing in the SVI models. This provided an understanding of the performance of the multispectral remote sensing derived predictors for modeling C stock and hence, sets the foundation for the updating of the carbon forest plantation database for forest practitioners in the country and utilized as a monitoring tool in the long term. The Sentinel-2-based SVC model was preferred for prediction in the plantation forest ecosystems as its model provided tighter credible intervals compared to the Landsat-8-based C stock SVC model.

The small sample size of the data utilized in the present research enabled the modeling approach to be computationally feasible. When inventory plots are comprised of bigger sample sizes, the matrix operations of immense dimensions are needed for computing model parameter estimates of higher magnitude and may not be achievable through ordinary PCs. Our future work is therefore aimed at exploring algorithms for resolving the dimensionality curse when fitting spatially varying coefficient models. The problem of dimensionality can also get complicated if many predictors are involved in the SVC modeling framework. Resolving dimensionality issues is needed as forest C stock is typically modeled with predictors from many data sources, chief amongst them being topographic, bioclimatic, and anthropogenic variables.

Author Contributions

T.S.C.: research design, writing, and data processing. O.M.: supervision and funding of research activity. T.D.: supervision and technical assistance. All authors have read and agreed to the published version of the manuscript.

Funding

The study was partially supported by the South African Research Chair Initiative (SARChI) in Land Use Planning and Management (Grant no. 84157).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

All relevant data are within the paper.

Acknowledgments

The authors thank Joseph. Z.Z. Matowanyika and Nixon Kutsaranga for allowing the researchers access to forest plantations at Lot 75 A of Nyanga Downs in Nyanga District of Manicaland province. The researchers also thank Trylee Matongera for assistance with the formatting of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Matose, F. Trends in forest ownership, institutional arrangements and the impact on forest management and poverty reduction. Cbneih 2008, 13, 373. [Google Scholar]
Newsday. Forestry Commision decentralise issuance of timber movement. Newsday, 24 July 2017.
Bordoloi, R.; Das, B.; Tripathi, O.; Sahoo, U.; Nath, A.; Deb, S.; Das, D.; Gupta, A.; Devi, N.; Charturvedi, S.; et al. Satellite based integrated approaches to modelling spatial carbon stock and carbon sequestration potential of different land uses of Northeast India. Environ. Sustain. Indic. 2022, 13, 100166. [Google Scholar] [CrossRef]
Brown, S. Estimating Biomass and Biomass Change of Tropical Forests: A Primer; Food & Agriculture Organization: Rome, Italy, 1997; Volume 134. [Google Scholar]
Zvobgo, L.; Tsoka, J. Deforestation rate and causes in Upper Manyame Sub-Catchment, Zimbabwe: Implications on achieving national climate change mitigation targets. Trees For. People 2021, 5, 100090. [Google Scholar] [CrossRef]
Wang, Q.; Li, J.; Jin, T.; Chang, X.; Zhu, Y.; Li, Y.; Sun, J.; Li, D. Comparative analysis of Landsat-8, Sentinel-2, and GF-1 data for retrieving soil moisture over wheat farmlands. Remote Sens. 2020, 12, 2708. [Google Scholar] [CrossRef]
Fuller, A.; Dawson, T.; Helmuth, B.; Hetem, R.S.; Mitchell, D.; Maloney, S.K. Physiological mechanisms in coping with climate change. Physiol. Biochem. Zool. 2010, 83, 713–720. [Google Scholar] [CrossRef] [Green Version]
Mutanga, O.; Dube, T.; Ahmed, F. Progress in remote sensing: Vegetation monitoring in South Africa. S. Afr. Geogr. J. 2016, 98, 461–471. [Google Scholar] [CrossRef]
Hoeting, J.A. The importance of accounting for spatial and temporal correlation in analyses of ecological data. Ecol. Appl. 2009, 19, 574–577. [Google Scholar] [CrossRef]
Zunguze, A.X. Quantificação de Carbono Sequestrado em Povoamentos de Eucalyptus spp na Floresta de Inhamacari-Manica; Universidade Eduardo Mondlane: Maputo, Mozambique, 2012. (In Portuguese) [Google Scholar]
Gelfand, A.E.; Kim, H.-J.; Sirmans, C.; Banerjee, S. Spatial modeling with spatially varying coefficient processes. J. Am. Stat. Assoc. 2003, 98, 387–396. [Google Scholar] [CrossRef]
Chrysafis, I.; Mallinis, G.; Siachalou, S.; Patias, P. Assessing the relationships between growing stock volume and Sentinel-2 imagery in a Mediterranean forest ecosystem. Remote Sens. Lett. 2017, 8, 508–517. [Google Scholar] [CrossRef]
Green, E.J.; Finley, A.O.; Strawderman, W.E. Introduction to Bayesian Methods in Ecology and Natural Resources; Springer Nature: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
Popescu, S.C.; Zhao, K.; Neuenschwander, A.; Lin, C. Satellite lidar vs. small footprint airborne lidar: Comparing the accuracy of aboveground biomass estimates and forest structure metrics at footprint level. Remote Sens. Environ. 2011, 115, 2786–2797. [Google Scholar] [CrossRef]
Tonolli, S.; Dalponte, M.; Neteler, M.; Rodeghiero, M.; Vescovo, L.; Gianelle, D. Fusion of airborne LiDAR and satellite multispectral data for the estimation of timber volume in the Southern Alps. Remote Sens. Environ. 2011, 115, 2486–2498. [Google Scholar] [CrossRef]
Babcock, C.; Finley, A.O.; Bradford, J.B.; Kolka, R.; Birdsey, R.; Ryan, M.G. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients. Sens. Environ. 2015, 169, 113–127. [Google Scholar] [CrossRef] [Green Version]
Gelfand, A.E.; Schmidt, A.M.; Banerjee, S.; Sirmans, C. Nonstationary multivariate process modeling through spatially varying coregionalization. Test 2004, 13, 263–312. [Google Scholar] [CrossRef]
Finley, A.O.; Banerjee, S.; MacFarlane, D.W. A hierarchical model for quantifying forest variables over large heterogeneous landscapes with uncertain forest areas. J. Am. Stat. Assoc. 2011, 106, 31–48. [Google Scholar] [CrossRef] [Green Version]
Finley, A.O.; Banerjee, S.; McRoberts, R.E. Hierarchical spatial models for predicting tree species assemblages across large domains. Ann. Appl. Stat. 2009, 3, 1052. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Schabenberger, O.; Gotway, C.A. Statistical Methods for Spatial Data Analysis: Texts in Statistical Science; Chapman and Hall/CRC: London, UK, 2017. [Google Scholar]
Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: Hoboken, NJ, USA, 2003. [Google Scholar]
Hudak, A.T.; Lefsky, M.A.; Cohen, W.B.; Berterretche, M. Integration of lidar and Landsat ETM+ data for estimating and mapping forest canopy height. Remote Sens. Environ. 2002, 82, 397–416. [Google Scholar] [CrossRef] [Green Version]
Wheeler, D.C.; Waller, L.A. Comparing spatially varying coefficient models: A case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests. J. Geogr. Syst. 2009, 11, 1–22. [Google Scholar] [CrossRef]
Cressie, N. Statistics for Spatial Data; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1993. [Google Scholar]
Banerjee, S.; Carlin, B.P.; Gelfand, A.E. Hierarchical Modeling and Analysis for Spatial Data; Chapman and Hall/CRC: London, UK, 2003. [Google Scholar]
Lisboa, S.N.; Guedes, B.S.; Ribeiro, N.; Sitoe, A. Biomass allometric equation and expansion factor for a mountain moist evergreen forest in Mozambique. Carbon Balance Manag. 2018, 13, 1–16. [Google Scholar] [CrossRef] [Green Version]
Walvoort, D.J.; Brus, D.; De Gruijter, J. An R package for spatial coverage sampling and random sampling from compact geographical strata by k-means. Comput. Geosci. 2010, 36, 1261–1267. [Google Scholar] [CrossRef]
Brus, D.; De Gruijter, J.; Van Groenigen, J. Designing spatial coverage samples using the k-means clustering algorithm. Dev. Soil Sci. 2006, 31, 183–192. [Google Scholar]
Wackernagel, H. Multivariate Geostatistics: An Introduction with Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
Banerjee, S.; Finley, A.O.; Waldmann, P.; Ericsson, T. Hierarchical spatial process models for multiple traits in large genetic trials. J. Am. Stat. Assoc. 2010, 105, 506–521. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chiles, J.-P.; Delfiner, P. Geostatistics: Modeling Spatial Uncertainty; John Wiley & Sons: Hoboken, NJ, USA, 2009. [Google Scholar]
Kaplan, D.; Depaoli, S. Bayesian Structural Equation Modeling; Guilford Press: New York, NY, USA, 2012. [Google Scholar]
Banerjee, S.; Gelfand, A.E.; Finley, A.O.; Sang, H. Gaussian predictive process models for large spatial data sets. J. R. Stat. Soc. Ser. B 2008, 70, 825–848. [Google Scholar] [CrossRef] [PubMed]
Mansfield, E.R.; Helms, B.P. Detecting multicollinearity. Am. Stat. 1982, 36, 158–160. [Google Scholar]
Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis; Chapman and Hall/CRC: London, UK, 1995. [Google Scholar]
Gelman, A.; Rubin, D.B. Inference from iterative simulation using multiple sequences. Stat. Sci. 1992, 7, 457–472. [Google Scholar] [CrossRef]
Korhonen, L.; Packalen, P.; Rautiainen, M. Comparison of Sentinel-2 and Landsat 8 in the estimation of boreal forest canopy cover and leaf area index. Remote Sens. Environ. 2017, 195, 259–274. [Google Scholar] [CrossRef]
Ranghetti, L.; Boschetti, M.; Nutini, F.; Busetto, L. “sen2r”: An R toolbox for automatically downloading and preprocessing Sentinel-2 satellite data. Comput. Geosci. 2020, 139, 104473. [Google Scholar] [CrossRef]
Li, C.; Li, X. Hazard rate and reversed hazard rate orders on extremes of heterogeneous and dependent random variables. Stat. Probab. Lett. 2019, 146, 104–111. [Google Scholar] [CrossRef]
Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; Van Der Linde, A. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B 2002, 64, 583–639. [Google Scholar] [CrossRef] [Green Version]
Forkuor, G.; Dimobe, K.; Serme, I.; Tondoh, J.E. Landsat-8 vs. Sentinel-2: Examining the added value of sentinel-2’s red-edge bands to land-use and land-cover mapping in Burkina Faso. GIScience Remote Sens. 2018, 55, 331–354. [Google Scholar] [CrossRef]
Chen, G.; Zhao, K.; McDermid, G.J.; Hay, G.J. The influence of sampling density on geographically weighted regression: A case study using forest canopy height and optical data. Int. J. Remote Sens. 2012, 33, 2909–2924. [Google Scholar] [CrossRef]
Gelfand, A.E. Hierarchical modeling for spatial data problems. Spat. Stat. 2012, 1, 30–39. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Babcock, C.; Finley, A.O.; Cook, B.D.; Weiskittel, A.; Woodall, C.W. Modeling forest biomass and growth: Coupling long-term inventory and LiDAR data. Remote Sens. Environ. 2016, 182, 1–12. [Google Scholar] [CrossRef]
Baumer, B.S.; Kaplan, D.T.; Horton, N.J. Texts in Statistical Science: Modern Data Science with R; Chapman and Hall/CRC: London, UK, 2017. [Google Scholar]

Figure 1. Map of the study area indicating (a) the province where samples were derived, (b) the study area location within the particular province, and (c) the spatial distribution of major plantation tree species in the studied region. * refers to Provinces in Zimbabwe whilst the box encompasses the studied area in this research.

Figure 2. Study area sampling design.

Figure 3. Landsat-8-based C stock spatially varying coefficient maps alongside their 95% credible intervals for the SVC model.

Figure 4. Sentinel-2-based C stock spatially varying coefficient maps alongside their 95% credible intervals for the SVC model.

Figure 5. Landsat-8-based C stock posterior predictions.

Figure 6. Sentinel-2-based C stock posterior predictions.

Figure 7. Landsat-8 and Sentinel-2 C stock-based predictions vs. C stock observations alongside 95% intervals.

Table 1. Landsat-8 and Sentinel-2 derived predictors of C stock.

Parameter	Lansat-8 OLI C Stock Model				Sentinel-2 MSI C Stock Model
Parameter	Mean	s.d	2.5%	97.5%	Mean	s.d	2.5%	97.5%
$I n t e r c e p t$	−2.65	1.02	−4.76	−0.83	−2.40	0.31	−3.01	−1.80
$NDVI$	2.47	0.98	0.78	4.62	4.90	0.23	4.52	5.38
$SAVI$	−0.57	0.67	−2.04	0.63	−0.55	0.34	−1.25	0.17
$EVI$	−0.62	0.55	−1.57	0.50	−0.002	0.096	−0.17	0.24
$σ_{w}^{2}$	1.25	0.26	0.72	1.72	0.075	0.016	0.046	0.092
$σ_{ε}^{2}$	0.35	0.17	0.074	0.54	0.0043	0.0030	0.0005	0.011
$ϕ$	0.0016	0.0003	0.0012	0.0018	0.0015	0.0002	0.0014	0.0014

Table 2. Landsat-8-based SVC model median parameter estimates alongside their 95% credible intervals.

		Non-Spatial	SVI	$S V C - N D V I$	SVC
Parameter C.I 50% (2.5%, 97.5%)	$β_{0}$	−2.5 (−4.1, −0.8)	−2.7 (−4.6, −0.8)	−2.9 (−5.0, −0.9)	−3.5 (−5.3, −1.7)
	${\tilde{β}}_{N D V I} (s)$	5.4 (3.3, 7.5)	2.8 (0.8, 4.7)	3.3 (1.2, 5.1)	3.6 (1.4, 5.8)
	$τ^{2}$	1.5 (1.2, 2.0)	0.5 (0.3, 0.8)	0.12 (0.03, 0.40)	0.1 (0.02, 0.2)
	$\frac{3}{ϕ_{0}}$	-	0.0014 (0.0012, 0.0021)	0.003 (0.002, 0.003)	0.1 (0.07, 0.16)
	$\frac{3}{ϕ_{N D V I}}$	-	-	0.13 (0.06, 0.22)	0.03 (0.02, 0.05)
	$σ_{0}^{2}$	-	1.0 (0.5, 1.5)	1.1 (0.8, 1.6)	0.24 (0.12, 0.48)
	$σ_{N D V I}^{2}$	-	-	1.0 (0.2, 3.0)	0.45 (0.14, 1.43)
Fit Statistics	$D$	436	140	41.8	65.9
	$p D$	4.1	68.9	87.0	108.9
	$D I C$	207	105.6	−77.9	29.0
	$R M S P E$ (MgCha⁻¹)	8	7.77	7.54	6.42

C.I means 95% Credible Interval.

Table 3. Sentinel-2-based SVC model median parameter estimates alongside 95% credible intervals.

		Non-Spatial	SVI	$S V C - N D V I$	$S V C$
Parameter C.I 50% (2.5%, 97.5%)	$β_{0}$	−2.5 (−4.1, −0.8)	−2.8 (−4.6, −0.9)	−2.9 (−4.9, −0.7)	−3.5 (−5.4, −1.6)
	${\tilde{β}}_{N D V I} (s)$	5.4 (3.3, 7.5)	3.0 (1.2, 4.9)	2.9 (0.7, 4.9)	3.7 (1.5, 5.7)
	$τ^{2}$	1.5 (1.2, 2.0)	0.5 (0.3, 0.9)	0.32 (0.15, 0.55)	0.2 (0.11, 0.42)
	$\frac{3}{ϕ_{0}}$	-	0.0015 (0.0014, 0.0016)	0.0016 (0.0015, 0.0018)	0.06 (0.018, 0.076)
	$\frac{3}{ϕ_{N D V I}}$	-	-	0.09 (0.036, 0.13)	0.0015 (0.0014, 0.0017)
	$σ_{0}^{2}$	-	0.7 (0.4, 1.1)	1.1 (0.8, 1.9)	0.23 (0.16, 0.47)
	$σ_{N D V I}^{2}$	-	-	0.7 (0.43, 1.1)	0.65 (0.41, 1.05)
Fit Statistics	$D$	436.6	159	93.3	17.9
	$p D$	4.1	58.0	89.8	106.1
	$D I C$	207.6	112.5	65.6	−177.3
	$R M S P E$ (MgCha⁻¹)	7.85	7.69	7.46	6.23

C.I means 95% Credible Interval.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chinembiri, T.S.; Mutanga, O.; Dube, T. Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models. Remote Sens. 2022, 14, 5676. https://doi.org/10.3390/rs14225676

AMA Style

Chinembiri TS, Mutanga O, Dube T. Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models. Remote Sensing. 2022; 14(22):5676. https://doi.org/10.3390/rs14225676

Chicago/Turabian Style

Chinembiri, Tsikai Solomon, Onisimo Mutanga, and Timothy Dube. 2022. "Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models" Remote Sensing 14, no. 22: 5676. https://doi.org/10.3390/rs14225676

APA Style

Chinembiri, T. S., Mutanga, O., & Dube, T. (2022). Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models. Remote Sensing, 14(22), 5676. https://doi.org/10.3390/rs14225676

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Landsat-8 and Sentinel-2 Based Prediction of Forest Plantation C Stock Using Spatially Varying Coefficient Bayesian Hierarchical Models

Abstract

1. Introduction

2. Materials and Methods

2.1. Study Area

2.2. Sampling Design

2.3. Above-Ground Biomass Measurements and C Stock Derivation

2.4. Modelling Framework

2.5. Competing Models

2.6. Landsat OLI and Sentinel-2 MSI Imagery Derived Covariates

2.7. Model Fit and Prediction Accuracy Evaluation

3. Results

3.1. Multispectral Remote Sensing C Stock Derived Predictors

3.2. Candidate Models and Parameter Estimates

3.3. Landsat-8 and Sentinel-2 C Stock-Based Predictions

3.4. C Stock Model Prediction Assessment

4. Discussion

4.1. Limitations of the Study

4.2. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI