Appendix A. Econometric Estimation. Semiparametric Approach
A semiparametric model specifies the conditional mean of the dependent variable as two separate components, one parametric and one nonparametric. These models are very attractive from an empirical point of view because of the balance they strike between precision and robustness. On the one hand, they allow us to incorporate prior information from economic theory or past experience while maintaining flexibility in the specification of the model. On the other hand, although the nonparametric part converges at a slower rate, the estimators of the parametric part exhibit the same statistical properties as if the whole model were fully parametric. This is the so-called $\sqrt{n}$-consistency property for cross-sectional models (see Robinson (1988) and Speckman (1988), for example). Finally, semiparametric models offer a solution to the well-known "curse of dimensionality" of fully nonparametric models. Several reviews on this topic exist, and we suggest the interested reader consult Ai and Li (2008), Henderson and Parmeter (2015), Parmeter and Racine (2019), Rodriguez-Poo and Soberon (2017), and Su and Ullah (2011), among others.
A semiparametric panel data model with heterogeneous slopes, unknown functions, and cross-sectional dependence is given by

$$y_{it} = \alpha_i' d_t + \beta_i' x_{it} + m(z_{it}) + \gamma_i' f_t + \varepsilon_{it}, \qquad i = 1, \ldots, N; \; t = 1, \ldots, T, \tag{A1}$$

where $y_{it}$ denotes the dependent variable (i.e., the environmental quality measure in Equation (5)), $x_{it}$ and $z_{it}$ are $k \times 1$ and $q \times 1$ vectors of the explanatory variables of interest (i.e., the corresponding regressors of Equation (5)), respectively, $m(\cdot)$ is an unknown smooth function, and $\beta_i$ is a $k \times 1$ vector of unknown population parameters. The aim of the researcher is to obtain consistent estimators of $\beta_i$ and $m(\cdot)$, knowing that $d_t$ is an $n \times 1$ vector of observed common effects (including deterministic regressors such as intercepts or seasonal dummies), $\alpha_i$ is an $n \times 1$ vector of unknown parameters, $f_t$ is an $m \times 1$ vector of unobserved common factors, $\gamma_i$ is the corresponding vector of factor loadings, and $\varepsilon_{it}$ are the individual-specific (idiosyncratic) errors, assumed to be distributed independently of $(d_t, x_{it}, z_{it})$. In general, however, the unobserved factors $f_t$ could be correlated with $(x_{it}, z_{it})$, and to allow for such a possibility, we adopt the fairly general models for the individual-specific regressors

$$x_{it} = A_{1i}' d_t + \Gamma_{1i}' f_t + v_{it}, \tag{A2}$$

$$z_{it} = A_{2i}' d_t + \Gamma_{2i}' f_t + s_{it}, \tag{A3}$$

where $A_{1i}$, $\Gamma_{1i}$, $A_{2i}$, and $\Gamma_{2i}$ are $n \times k$, $m \times k$, $n \times q$, and $m \times q$ factor loading matrices with fixed components, and $v_{it}$ and $s_{it}$ are the specific components of $x_{it}$ and $z_{it}$, respectively, distributed independently of the common effects and across $i$, but assumed to follow general covariance stationary processes.
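To fix ideas, the data-generating process just described can be sketched in a few lines of simulation code. All dimensions, loadings, and the choice of a sine function for the unknown smooth component below are purely illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 50, 100                      # cross-section and time dimensions (illustrative)

f = rng.normal(size=T)              # unobserved common factor f_t
d = np.ones(T)                      # observed common effect d_t (an intercept here)
m = np.sin                          # stand-in for the unknown smooth function m(.)

y = np.empty((N, T))
x = np.empty((N, T))
z = np.empty((N, T))

for i in range(N):
    beta_i = 1.0 + 0.1 * rng.normal()   # heterogeneous slope for unit i
    gamma_i = rng.normal()              # factor loading on y
    Gamma_x = rng.normal()              # factor loading on x (source of correlation with f_t)
    Gamma_z = rng.normal()              # factor loading on z
    x[i] = 0.5 * d + Gamma_x * f + rng.normal(size=T)
    z[i] = 0.2 * d + Gamma_z * f + rng.normal(size=T)
    y[i] = 1.0 * d + beta_i * x[i] + m(z[i]) + gamma_i * f + rng.normal(size=T)
```

Because the same factor `f` enters every unit's regressors and errors, the simulated panel exhibits exactly the cross-sectional dependence that the CCE approach below is designed to handle.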
With the aim of obtaining consistent estimators of $\beta_i$ and $m(\cdot)$, in what follows we show how, with slight modifications, the Common Correlated Effects (CCE) approach proposed in Pesaran (2006) can be applied to a semiparametric regression model.
Let $0_{k \times q}$ and $0_{q \times k}$ be matrices of zeros of $k \times q$ and $q \times k$ dimension, respectively, and $I_k$ and $I_q$ identity matrices of $k$ and $q$ dimension. If we combine (A1)–(A3) and rearrange terms, we can write

$$w_{it} = \begin{pmatrix} y_{it} \\ x_{it} \\ z_{it} \end{pmatrix} = B_i' d_t + C_i' f_t + u_{it}, \tag{A4}$$

where $u_{it} = \left(\varepsilon_{it} + \beta_i' v_{it} + m(z_{it}), \; v_{it}', \; s_{it}'\right)'$, and $B_i$ and $C_i$ are matrices of $n \times (k+q+1)$ and $m \times (k+q+1)$ dimension, respectively, of the form

$$B_i = \left(\alpha_i \;\; A_{1i} \;\; A_{2i}\right) \begin{pmatrix} 1 & 0_{1 \times k} & 0_{1 \times q} \\ \beta_i & I_k & 0_{k \times q} \\ 0_{q \times 1} & 0_{q \times k} & I_q \end{pmatrix}, \qquad C_i = \left(\gamma_i \;\; \Gamma_{1i} \;\; \Gamma_{2i}\right) \begin{pmatrix} 1 & 0_{1 \times k} & 0_{1 \times q} \\ \beta_i & I_k & 0_{k \times q} \\ 0_{q \times 1} & 0_{q \times k} & I_q \end{pmatrix}.$$
In order to show that using suitable proxies for the unobserved factors is enough to avoid having to use initial estimates of $m(\cdot)$, we take the cross-sectional sample averages of (A4), obtaining

$$\bar{w}_t = \bar{B}' d_t + \bar{C}' f_t + \bar{u}_t, \tag{A5}$$

where $\bar{w}_t = N^{-1} \sum_{i=1}^{N} w_{it}$ and $\bar{u}_t = N^{-1} \sum_{i=1}^{N} u_{it}$, and $\bar{B}$ and $\bar{C}$ are the cross-sectional averages of $B_i$ and $C_i$, respectively. Following Pesaran (2006), we can premultiply both sides of (A5) by $(\bar{C}\bar{C}')^{-1}\bar{C}$ and solve for $f_t$,

$$f_t = (\bar{C}\bar{C}')^{-1}\bar{C}\left(\bar{w}_t - \bar{B}' d_t - \bar{u}_t\right), \tag{A6}$$

provided that

$$\operatorname{rank}(\bar{C}) = m \leq k + q + 1. \tag{A7}$$

As $N \to \infty$, $\bar{u}_t \to_p 0$, $\bar{B} \to_p B$, and $\bar{C} \to_p C$ for each $t$ under weak conditions. It follows that

$$f_t - (CC')^{-1} C \left(\bar{w}_t - \bar{B}' d_t\right) \to_p 0 \quad \text{as } N \to \infty. \tag{A8}$$

This last result suggests that we can use $\bar{h}_t = (d_t', \bar{w}_t')'$ as observable proxies of the unobservable factors $f_t$. Therefore, we can conclude that the Common Correlated Effects (CCE) approach proposed in Pesaran (2006) for fully parametric models can effectively be applied in a semiparametric setting with slight changes.
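A minimal sketch of this proxy construction in code, assuming the panel is stored as $N \times T$ arrays (all variable names are illustrative): the observed common effects are stacked together with the cross-sectional averages of the dependent variable and the regressors to form the observable proxies.

```python
import numpy as np

def cce_proxies(y, x, z, d):
    """Stack the observed common effects d_t together with the
    cross-sectional averages (y_bar_t, x_bar_t, z_bar_t), which serve
    as observable proxies for the unobserved common factors.
    y, x, z: (N, T) arrays; d: (T, p) array of observed common effects."""
    w_bar = np.column_stack([y.mean(axis=0), x.mean(axis=0), z.mean(axis=0)])
    return np.column_stack([d, w_bar])      # (T, p + 3) proxy matrix

# usage sketch with placeholder data
rng = np.random.default_rng(1)
N, T = 30, 80
y, x, z = (rng.normal(size=(N, T)) for _ in range(3))
d = np.ones((T, 1))                         # intercept as the observed common effect
H = cce_proxies(y, x, z, d)
```

In an application, each time-$t$ row of `H` plays the role of the proxy vector, and the individual regressions are augmented with these columns.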
In this situation, we can estimate $\beta_i$ and $m(\cdot)$ by augmenting the semiparametric regression of $y_{it}$ on $x_{it}$ and $z_{it}$ with $\bar{h}_t$, obtaining the following regression model

$$y_{it} = c_i' \bar{h}_t + \beta_i' x_{it} + m(z_{it}) + e_{it}, \tag{A9}$$

where $e_{it}$ captures possible approximation errors of the proxies. In addition, $\bar{h}_t = (d_t', \bar{w}_t')'$ is an $(n + k + q + 1) \times 1$ vector of proxies, where $c_i$ is the corresponding vector of coefficients.
In order to get a $\sqrt{T}$-consistent estimator of $\beta_i$, we follow Robinson (1988) to eliminate the unknown function $m(\cdot)$. Taking conditional expectations of (A9) given $z_{it}$ yields

$$E(y_{it} \mid z_{it}) = c_i' E(\bar{h}_t \mid z_{it}) + \beta_i' E(x_{it} \mid z_{it}) + m(z_{it}) + E(e_{it} \mid z_{it}), \tag{A10}$$

and subtracting (A10) from (A9) yields

$$y_{it} - E(y_{it} \mid z_{it}) = c_i' \left(\bar{h}_t - E(\bar{h}_t \mid z_{it})\right) + \beta_i' \left(x_{it} - E(x_{it} \mid z_{it})\right) + e_{it} - E(e_{it} \mid z_{it}). \tag{A11}$$
In order to obtain feasible estimators of $\beta_i$, these unknown conditional expectations need to be estimated. With this aim, Robinson (1988) proposes using (higher-order) Nadaraya–Watson kernel estimators. Later, Linton (1995) and Hamilton and Truong (1997), among others, pointed out that partial regression methods can be improved further by using local linear smoothers (see Fan and Gijbels (1996) for a deeper discussion of the desirable properties of these estimators). In light of these results, we propose using local linear smoothers to estimate these conditional expectations.
Let $g_w(z) = E(w_{it} \mid z_{it} = z)$, where $w_{it}$ stands for $y_{it}$, $x_{it}$, or $\bar{h}_t$. For a given point $z$ and for $z_{it}$ in a neighbourhood of $z$, we propose to minimize the following weighted local linear least-squares (LLLS) problem for each $w_{it}$,

$$\min_{(a_0, a_1)} \sum_{t=1}^{T} \left(w_{it} - a_0 - a_1'(z_{it} - z)\right)^2 K\!\left(\frac{z_{it} - z}{a}\right), \tag{A12}$$

where $K(u) = \prod_{l=1}^{q} k(u_l)$ is a product kernel function, $u_l$ is the $l$th component of $u$, and $a$ is a positive bandwidth term. Of course, a general diagonal or non-diagonal bandwidth matrix could be employed, but for the sake of simplicity a single scalar bandwidth is used.
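The LLLS problem above has a closed-form solution at each evaluation point. A minimal sketch for a scalar conditioning variable, using an Epanechnikov kernel (the kernel choice and all names here are illustrative):

```python
import numpy as np

def epanechnikov(u):
    """Univariate Epanechnikov kernel, compactly supported on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def local_linear(z_obs, w_obs, z0, a):
    """Local linear estimate of E(w | z = z0) for scalar z.
    Solves min over (a0, a1) of
      sum_t (w_t - a0 - a1 * (z_t - z0))^2 * K((z_t - z0) / a),
    and returns the fitted intercept a0, i.e. the value at z0."""
    k = epanechnikov((z_obs - z0) / a)
    Z = np.column_stack([np.ones_like(z_obs), z_obs - z0])   # T x 2 local design
    W = Z * k[:, None]                                       # kernel-weighted design
    coef, *_ = np.linalg.lstsq(W.T @ Z, W.T @ w_obs, rcond=None)
    return coef[0]

# usage sketch: recover a smooth conditional mean from noisy data
rng = np.random.default_rng(2)
z = rng.uniform(-2, 2, size=500)
w = np.sin(z) + 0.1 * rng.normal(size=500)
fit = local_linear(z, w, 0.5, a=0.3)     # should be close to sin(0.5)
```

For the multivariate case described in (A12), the local design simply gains one column per component of $z_{it} - z$, and the univariate kernel weights are multiplied across components.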
Using the resulting estimators of these conditional expectations in (A11) and writing the resulting expression in vector form yields

$$(I_T - S_i)\, y_i = (I_T - S_i)\, \bar{H} c_i + (I_T - S_i)\, X_i \beta_i + e_i^*, \tag{A13}$$

where $y_i$ and $e_i^*$ are $T$-dimensional vectors, and $X_i$ and $\bar{H}$ are matrices of dimension $T \times k$ and $T \times (n + k + q + 1)$, respectively. Further, assuming that $Z_i(z)' K_i(z) Z_i(z)$ is invertible, $S_i$ is a $T \times T$ smoothing matrix associated to the individual $i$ of the form

$$S_i = \begin{pmatrix} e_1' \left(Z_i(z_{i1})' K_i(z_{i1}) Z_i(z_{i1})\right)^{-1} Z_i(z_{i1})' K_i(z_{i1}) \\ \vdots \\ e_1' \left(Z_i(z_{iT})' K_i(z_{iT}) Z_i(z_{iT})\right)^{-1} Z_i(z_{iT})' K_i(z_{iT}) \end{pmatrix},$$

where $Z_i(z)$ is a $T \times (q+1)$ matrix with $t$th row $\left(1, (z_{it} - z)'\right)$, $e_1$ is a $(q+1) \times 1$ vector having 1 in the first entry and all other entries 0, and $K_i(z)$ is a $T \times T$ diagonal matrix with entries $K\left((z_{it} - z)/a\right)$. Note that $e_i^*$ is the new error term, which consists of three elements: (i) the original error term, (ii) the approximation error of the proxies, and (iii) the approximation error of the Taylor expansion.
By the formula for partitioned regression, the estimator of $\beta_i$ in (A13) is given by

$$\widehat{\beta}_i = \left(\widetilde{X}_i' M_{\widetilde{H}} \widetilde{X}_i\right)^{-1} \widetilde{X}_i' M_{\widetilde{H}}\, \widetilde{y}_i, \tag{A14}$$

where $M_{\widetilde{H}} = I_T - \widetilde{H}\left(\widetilde{H}'\widetilde{H}\right)^{-1}\widetilde{H}'$ is a projection matrix, for $\widetilde{y}_i = (I_T - S_i)\, y_i$, $\widetilde{X}_i = (I_T - S_i)\, X_i$, and $\widetilde{H} = (I_T - S_i)\, \bar{H}$.
Following a similar reasoning, the estimator of $c_i$ in (A13) is given by

$$\widehat{c}_i = \left(\widetilde{H}' M_{\widetilde{X}_i} \widetilde{H}\right)^{-1} \widetilde{H}' M_{\widetilde{X}_i}\, \widetilde{y}_i, \tag{A15}$$

where $M_{\widetilde{X}_i} = I_T - \widetilde{X}_i\left(\widetilde{X}_i'\widetilde{X}_i\right)^{-1}\widetilde{X}_i'$.
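The partitioned-regression (Frisch–Waugh–Lovell) step can be sketched in code as follows, assuming the kernel-demeaned variables have already been computed (all names illustrative). The slope block is obtained by projecting out the proxy block before regressing, which coincides with the corresponding block of the full least-squares fit:

```python
import numpy as np

def fwl_beta(y_tilde, X_tilde, H_tilde):
    """Partitioned-regression estimator: regress y_tilde on X_tilde after
    projecting both onto the orthogonal complement of H_tilde. By the
    Frisch-Waugh-Lovell theorem this equals the X-block coefficients of
    an OLS regression of y_tilde on [X_tilde, H_tilde]."""
    M = np.eye(len(y_tilde)) - H_tilde @ np.linalg.pinv(H_tilde)  # annihilator of H
    return np.linalg.lstsq(M @ X_tilde, M @ y_tilde, rcond=None)[0]

# sanity check on simulated data: FWL matches the full OLS coefficients on X
rng = np.random.default_rng(3)
T = 200
X = rng.normal(size=(T, 2))
H = np.column_stack([np.ones(T), rng.normal(size=T)])
beta = np.array([1.5, -0.7])
y = X @ beta + H @ np.array([0.3, 0.2]) + 0.1 * rng.normal(size=T)

b_fwl = fwl_beta(y, X, H)
b_full = np.linalg.lstsq(np.column_stack([X, H]), y, rcond=None)[0][:2]
```

The projection-based form is the one that appears in the closed-form expressions above; the equivalence with the stacked regression is a useful numerical check.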
Focusing now on the nonparametric estimation of the smooth unknown function $m(\cdot)$, we use the above estimators, so the corresponding weighted local linear least-squares problem to minimize is of the following form

$$\min_{(b_0, b_1)} \sum_{t=1}^{T} \left(y_{it} - \bar{h}_t' \widehat{c}_i - x_{it}' \widehat{\beta}_i - b_0 - b_1'(z_{it} - z)\right)^2 K\!\left(\frac{z_{it} - z}{h}\right), \tag{A16}$$

where $K(\cdot)$ is a product kernel defined as in (A12) and $h$ is the new bandwidth term. Then, assuming that $Z_i(z)' W_i(z) Z_i(z)$ is invertible, the resulting CCE nonparametric estimator of $m(z)$ is given by

$$\widehat{m}(z) = e_1' \left(Z_i(z)' W_i(z) Z_i(z)\right)^{-1} Z_i(z)' W_i(z) \left(y_i - \bar{H}\widehat{c}_i - X_i \widehat{\beta}_i\right), \tag{A17}$$

where $W_i(z)$ is a $T \times T$ diagonal matrix defined as $K_i(z)$, with $h$ instead of $a$.
Under the conditions in Appendix B, one can show that the semiparametric CCE estimator $\widehat{\beta}_i$ is consistent and asymptotically normal as $N$ and $T$ tend to infinity. More precisely, following a similar proof scheme as in Pesaran (2006), the following result is obtained.
Theorem A1. Consider the panel data model (A1)–(A3), and suppose that Assumptions A1–A7 hold and the rank condition (A7) is satisfied. Then, $\widehat{\beta}_i$ and $\widehat{c}_i$ are consistent estimators of $\beta_i$ and $c_i$, respectively, and, under a suitable relative rate of $N$ and $T$, they are asymptotically normal with well-defined covariance matrices. Similarly, under the conditions in Appendix B, one can show that the nonparametric CCE estimator $\widehat{m}(z)$ is consistent and asymptotically normal as $N$ and $T$ tend to infinity.
Theorem A2. Consider the panel data model (A1)–(A3), and suppose that Assumptions A1–A9 hold. Given the consistency of $\widehat{\beta}_i$ and $\widehat{c}_i$, the estimator $\widehat{m}(z)$ is asymptotically normal as $N, T \to \infty$, with an asymptotic bias involving the Hessian matrix of $m(\cdot)$. This theorem is proved following a similar proof scheme as in Musolesi et al. (2020), so it is omitted; the detailed proof can be provided upon request.
Finally, note that the estimates of the variances in the above theorems can be used to construct standard errors for $\widehat{\beta}_i$ or confidence bands for $\widehat{m}(z)$. We use a standard multivariate kernel density estimator with an Epanechnikov kernel and Silverman's rule of thumb to choose the bandwidth.
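Silverman's rule of thumb can be sketched for the multivariate case as follows; the constant shown is the common Gaussian-reference version, and the exact constant used in the paper (e.g., adjusted for the Epanechnikov kernel) may differ:

```python
import numpy as np

def silverman_bandwidth(Z):
    """Rule-of-thumb bandwidths, one per dimension:
        h_j = sigma_j * (4 / ((q + 2) * n)) ** (1 / (q + 4)),
    the Gaussian-reference version of Silverman's rule for an (n, q)
    data matrix Z, where sigma_j is the sample standard deviation
    of the j-th column."""
    n, q = Z.shape
    sigma = Z.std(axis=0, ddof=1)
    return sigma * (4.0 / ((q + 2) * n)) ** (1.0 / (q + 4))

# usage sketch: one bandwidth per column of a (500, 2) sample
rng = np.random.default_rng(4)
h = silverman_bandwidth(rng.normal(size=(500, 2)))
```

Being a plug-in rule derived under a Gaussian reference density, it is a convenient default rather than an optimal choice; cross-validation is a common alternative when the density is far from Gaussian.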
Appendix B. Assumptions
In order to derive the asymptotic distributions of the estimators obtained in Appendix A, the following conditions are required.
Assumption A1. (Common Effects). The vector of common effects $g_t = (d_t', f_t')'$ is covariance stationary with absolutely summable autocovariances, and is distributed independently of the individual-specific errors $\varepsilon_{it'}$, $v_{it'}$, and $s_{it'}$ for all $i$, $t$, and $t'$.
Assumption A2. (Individual-Specific Errors). The individual-specific errors $\varepsilon_{it}$, $v_{jt'}$, and $s_{jt'}$ are distributed independently for all $i$, $j$, $t$, and $t'$. Furthermore, for each $i$, $\varepsilon_{it}$, $v_{it}$, and $s_{it}$ follow linear stationary processes with absolutely summable autocovariances, driven by innovations that are independently and identically distributed with mean zero, unit variance matrix, and finite fourth-order cumulants. In particular, their variances are bounded by finite constants uniformly in $i$, and the corresponding covariance matrices are positive definite.

Assumption A3. (Identification of $\beta_i$). For each $i$, the second-moment matrices entering the estimators of $\beta_i$ and $c_i$ are nonsingular and have finite second-order moments for all $i$.
Assumption A4. (Density function). The density of $z_{it}$ is strictly positive and twice continuously differentiable in all its arguments, with bounded second-order derivatives at any point of its support.
Assumption A5. (Smoothness condition). Let $\mathcal{Z}$ be the support of $z_{it}$. The unknown functions $m(\cdot)$, $E(x_{it} \mid z_{it} = \cdot)$, and $E(\bar{h}_t \mid z_{it} = \cdot)$ are bounded and twice continuously differentiable at $z$ in the interior of $\mathcal{Z}$, with second-order derivatives bounded.
Assumption A6. (Kernel function). $K(\cdot)$ is a product kernel, and the univariate kernel function $k(\cdot)$ is compactly supported and bounded such that $\int K(u)\,du = 1$, $\int u u' K(u)\,du = \mu_2(K)\, I_q$, and $\int K^2(u)\,du = \kappa$, where $\mu_2(K)$ and $\kappa$ are scalars and $I_q$ is a $q \times q$ identity matrix. All odd-order moments of $k$ vanish, that is, $\int u_1^{l_1} \cdots u_q^{l_q} K(u)\,du = 0$ for all non-negative integers $l_1, \ldots, l_q$ such that their sum is odd.
Assumption A7. (Bandwidth). $a$ and $h$ are positive bandwidths such that, as $T \to \infty$, $a \to 0$ and $T a^q \to \infty$. Furthermore, as $T \to \infty$, $h \to 0$ and $T h^q \to \infty$.
Assumption A8. The relevant conditional moment map is twice continuously differentiable at $z$ in the interior of the support of $z_{it}$, with second-order derivatives bounded.
Assumption A9. (Lyapounov). For some $\delta > 0$, the moment $E\left(|\varepsilon_{it}|^{2+\delta}\right)$ exists and is bounded.