Nonparametric Estimation of Conditional Copula Using Smoothed Checkerboard Bernstein Sieves

Lu, Lu; Ghosh, Sujit

doi:10.3390/math12081135


by Lu Lu *,† and Sujit Ghosh †

Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Mathematics 2024, 12(8), 1135; https://doi.org/10.3390/math12081135

Submission received: 4 March 2024 / Revised: 31 March 2024 / Accepted: 8 April 2024 / Published: 10 April 2024

(This article belongs to the Special Issue Nonparametric Statistical Methods and Their Applications)


Abstract

Conditional copulas are useful tools for modeling the dependence between multiple response variables that may vary with a given set of predictor variables. Conditional dependence measures such as conditional Kendall’s tau and Spearman’s rho that can be expressed as functionals of the conditional copula are often used to evaluate the strength of dependence conditioning on the covariates. In general, semiparametric estimation methods of conditional copulas rely on an assumed parametric copula family where the copula parameter is assumed to be a function of the covariates. The functional relationship can be estimated nonparametrically using different techniques, but it is required to choose an appropriate copula model from various candidate families. In this paper, by employing the empirical checkerboard Bernstein copula (ECBC) estimator, we propose a fully nonparametric approach for estimating conditional copulas, which does not require any selection of parametric copula models. Closed-form estimates of the conditional dependence measures are derived directly from the proposed ECBC-based conditional copula estimator. We provide the large-sample consistency of the proposed estimator as well as the estimates of conditional dependence measures. The finite-sample performance of the proposed estimator and comparison with semiparametric methods are investigated through simulation studies. An application to real case studies is also provided.

Keywords:

empirical checkerboard Bernstein copula (ECBC); conditional dependence measures; covariates

MSC:

62H05; 62H10; 62G05; 62P05

1. Introduction

Copulas have found many applications in the fields of finance, insurance, system reliability, etc., owing to their utility in modeling the dependence among variables (see, e.g., Nelsen [1], Jaworski et al. [2] and Joe [3] for details about copulas and their applications). In some situations, the dependence structure between variables can be influenced by a set of covariates, and it is thereby of interest to understand how such dependence changes with the values of covariates. For instance, it is well known that the life expectancy at birth of males and females in a country is often highly interdependent due to shared economic or environmental factors, and it is possible that the strength of the dependence relies on these factors. When the covariate is binary or discrete-valued with few levels, one can estimate a copula for each given level of the discrete-valued covariate separately. In contrast, the influence of a continuous-valued covariate on the dependence structure should be formulated in a functional way, and this is where conditional copulas (Patton [4]; Patton [5]) along with the corresponding conditional versions of dependence measures come into play.

Suppose we are interested in the dependence among the components of a random vector $Y = (Y_{1}, Y_{2}, \dots, Y_{d})$ given covariates $X = (X_{1}, X_{2}, \dots, X_{p})$. The conditional joint and marginal distributions of $Y$ given $X = x$ can be denoted as

F_{x} (y) \equiv F_{x} (y_{1}, y_{2}, \dots, y_{d}) = P (Y_{1} \leq y_{1}, Y_{2} \leq y_{2}, \dots, Y_{d} \leq y_{d} | X = x),

(1)

and

F_{j x} (y_{j}) = P (Y_{j} \leq y_{j} | X = x), \quad j = 1, \dots, d .

(2)

If $F_{1 x}, F_{2 x}, \dots, F_{d x}$ are continuous, then by an extension of the well-known Sklar’s theorem (Sklar [6]) to conditional distributions (e.g., see Patton [5]), there exists a unique copula $C_{x}$ such that

F_{x} (y) = C_{x} (F_{1 x} (y_{1}), F_{2 x} (y_{2}), \dots, F_{d x} (y_{d})), \quad \forall y \in \mathbb{R}^{d},

(3)

and the function $C_{x}$ is called a conditional copula, which captures the conditional dependence structure of $Y$ given $X = x$. The focus of this paper is modeling continuous-valued responses and covariates. Thus, in what follows, we assume that the conditional marginal CDFs $F_{j x}, j = 1, \dots, d$, and the CDFs of each response and covariate are absolutely continuous.

The literature contains a variety of parametric families for modeling copulas. Some commonly used copula families are Archimedean copulas, elliptical copulas, etc.; see Žežula [7] and Joe [3]. Assuming that the conditional copula belongs to a parametric copula family where the copula parameter is a function of the covariate(s), there has been previous work addressing the estimation of conditional copulas in a semiparametric setting. In regard to frequentist methods based on an assumed parametric class, Acar et al. [8] propose to estimate the functional relationship between the copula parameter and the covariate nonparametrically by using the local likelihood approach, but they assume known marginals, and the maximization is conducted for a fixed value of the covariate. In other words, to identify the entire function linking the copula parameter to the covariate, it is necessary to solve the maximization problem on a sufficiently large grid of values within the range of the covariate. Abegaz et al. [9] extend the work to a more general setting of unknown marginals and apply a two-stage technique that has been widely adopted in copula estimation: in the first stage, the nonparametric estimates of conditional marginals are obtained using the kernel-based method, and by plugging in these estimates, the functional link is estimated by maximizing the pseudo log-likelihood in the second stage. As alternative estimation methods for the functional relationship, Vatter and Chavez-Demoulin [10] develop generalized additive models for the conditional dependence structures, and Fermanian and Lopez [11] introduce so-called single-index copulas. In particular, conditional copulas of Archimedean copulas are studied, e.g., in Mesfioui and Quessy [12], Kasper [13] and the references therein. In the Bayesian framework, inference for bivariate conditional copula models has been constructed in Craiu and Sabeti [14], Sabeti et al. [15] and Levi and Craiu [16], among others.

However, the misspecification of the copula family could lead to severely biased estimation even though a sophisticated and flexible parametric model is employed (e.g., see Geerdens et al. [17]), so it is required to select an appropriate copula model from a large number of candidate families. In order to do so, many copula selection techniques have been proposed in either the frequentist or Bayesian setting; e.g., Acar et al. [8] select the copula family based on cross-validated prediction errors, while the deviance information criterion (DIC) is utilized for the choice of copula in Craiu and Sabeti [14].

Acknowledging the limitations of parametric copula models as mentioned above, fully nonparametric approaches have also been proposed for conditional copula estimation. Gijbels et al. [18] suggest the empirical estimators for conditional copulas where the weights are smoothed over the covariate space through kernel-based methods. They further derive nonparametric estimates for the conditional dependence measures including conditional Kendall’s tau and conditional Spearman’s rho. Since the bandwidth selection is very crucial for any of the smoothing methods, they also develop an algorithm for selecting the bandwidths. The asymptotic properties of the estimators together with conditional dependence measure estimates are established in Veraverbeke et al. [19]. Gijbels et al. [20] further consider more complex covariates like multivariate covariates, and box-type conditioning events are studied in Derumigny and Fermanian [21]. On the other hand, there has been recent work on the Bayesian nonparametric estimation of conditional copulas. Leisen et al. [22] introduce the effect of a covariate on the Bayesian infinite mixture models proposed by [23]. However, the large-sample asymptotic properties of the Bayesian models have been almost unexplored and still remain an area of open work.

In this paper, we focus on the nonparametric estimation of conditional copulas and propose a relatively simple approach based on the empirical checkerboard Bernstein copula (ECBC) estimator proposed in Lu and Ghosh [24]. The ECBC is constructed by extending the Bernstein copula to allow for varying degrees of the polynomials, and it is a genuine smooth copula for any choice of degrees and any finite sample size. When the covariates are continuous-valued, the main idea of extending the copula models to include covariates is to first estimate the full copula of responses along with covariates and then take partial derivatives to obtain the conditional distribution of responses given the covariates. Because the approach is fully nonparametric, it does not require selecting a copula family, which is a key step in semiparametric methods for avoiding the adverse consequences of model misspecification. Compared to the kernel-based empirical estimators, the selection of bandwidths is unnecessary as well, making the method easy to implement in practice. The proposed ECBC-based conditional copula estimator immediately leads to nonparametric estimates of the conditional dependence measures, which can be expressed compactly using matrix operations. The large-sample consistency of the proposed estimator is also provided in the paper.

The rest of the paper is organized as follows: in Section 2, we present a model for conditional copula and closed-form estimates of popular multivariate conditional dependence measures based on the novel methodology of conditional copula estimation. Section 3 shows the finite-sample performance for the proposed methodology. Section 4 provides a real case study. Finally, we make some general comments in Section 5.

2. Models for Conditional Copula

In the following, we focus on the bivariate conditional copula of $(Y_{1}, Y_{2})$ with a single covariate $X$ for simplicity. Notice that the extension to more than two dimensions and multiple covariates is straightforward.

Suppose we have i.i.d. samples $(Y_{i 1}, Y_{i 2}, X_{i}), i = 1, \dots, n$, where $(Y_{i 1}, Y_{i 2}), i = 1, \dots, n$, are i.i.d. observations of the random vector $(Y_{1}, Y_{2})$ whose conditional dependence structure is of interest, and $X_{i}, i = 1, \dots, n$, are i.i.d. observations of the covariate $X$. We assume all components of $(Y_{1}, Y_{2}, X)$ are continuous-valued random variables with absolutely continuous marginal distributions, and the conditional marginal distributions of $Y_{1}$ and $Y_{2}$ given $X = x$ are also absolutely continuous. The goal is to estimate the conditional copula $C_{x}$ from this random sample.

As suggested by Gijbels et al. [18], it is often favorable to remove the effect of the covariate on the marginal distributions before estimating $C_{x}$. In order to do that, we can transform the original observations $(Y_{i 1}, Y_{i 2})$ to marginally uniformly distributed (unobserved) samples

(U_{i 1}, U_{i 2}) \equiv (F_{1 X_{i}} (Y_{i 1}), F_{2 X_{i}} (Y_{i 2})), \quad i = 1, \dots, n,

(4)

which can be estimated by pseudo-observations

(\hat{U}_{i 1}, \hat{U}_{i 2}) \equiv (\hat{F}_{1 X_{i}} (Y_{i 1}), \hat{F}_{2 X_{i}} (Y_{i 2})), \quad i = 1, \dots, n,

(5)

where $\hat{F}_{1 x}$ and $\hat{F}_{2 x}$ are the estimated conditional marginal distributions.

Motivated by Janssen et al. [25], who apply the empirical Bernstein estimator of the bivariate copula derivative to conditional distribution estimation with a single covariate, we are able to use the multivariate copula estimator ECBC as proposed in Lu and Ghosh [24] to estimate the conditional marginal distributions of $Y_{1}$ and $Y_{2}$ given $X = x$, respectively. Specifically, for $j \in \{1, 2\}$, we have i.i.d. samples $(Y_{i j}, X_{i}), i = 1, \dots, n$, and the corresponding pseudo-observations $(\hat{W}_{i j}, \hat{V}_{i}) = (F_{n Y_{j}} (Y_{i j}), F_{n X} (X_{i})), i = 1, \dots, n$, where $F_{n Y_{j}}$ and $F_{n X}$ are the modified empirical estimates of the (unconditional) marginal distributions $F_{Y_{j}}$ and $F_{X}$, respectively, e.g., $F_{n X} (x) = \frac{1}{n + 1} \sum_{i = 1}^{n} I (X_{i} \leq x)$. These pseudo-observations can then be treated as samples from a two-dimensional copula $C_{j}$, which can be estimated by the ECBC copula estimator as follows:

C_{j}^{#} (w_{j}, v) = \sum_{h = 0}^{g_{j}} \sum_{k = 0}^{m_{j}} \tilde{\theta}_{h, k} \binom{g_{j}}{h} w_{j}^{h} (1 - w_{j})^{g_{j} - h} \binom{m_{j}}{k} v^{k} (1 - v)^{m_{j} - k},

(6)

where

\tilde{\theta}_{h, k} = C_{j n}^{#} (\frac{h}{g_{j}}, \frac{k}{m_{j}}),

(7)

and $C_{j n}^{#}$ is the empirical checkerboard copula. The ECBC estimation process is detailed in Lu and Ghosh [24]. Then, the partial derivative $C_{j}^{(1)}$ of $C_{j}$ with respect to $v$ can be estimated by using

\begin{aligned} C_{j}^{# (1)} (w_{j}, v) & \equiv \frac{\partial C_{j}^{#} (w_{j}, v)}{\partial v} \\ & = \sum_{h = 0}^{g_{j}} \sum_{k = 0}^{m_{j} - 1} \tilde{\lambda}_{h, k} \binom{g_{j}}{h} w_{j}^{h} (1 - w_{j})^{g_{j} - h} m_{j} \binom{m_{j} - 1}{k} v^{k} (1 - v)^{m_{j} - k - 1}, \end{aligned}

(8)

where

\tilde{\lambda}_{h, k} = \tilde{\theta}_{h, k + 1} - \tilde{\theta}_{h, k} .

(9)

Notice that the following relationship holds between the conditional marginal distribution function of $Y_{j}$ given $X = x$ and the partial derivative $C_{j}^{(1)} (w_{j}, v)$:

F_{j x} (y_{j}) = P (Y_{j} \leq y_{j} | X = x) = C_{j}^{(1)} (F_{Y_{j}} (y_{j}), F_{X} (x)) .

(10)

Thus, we can estimate the conditional marginal distributions using

\hat{F}_{j x} (y_{j}) = C_{j}^{# (1)} (F_{n Y_{j}} (y_{j}), F_{n X} (x))

(11)

for $j = 1, 2$, and then the corresponding pseudo-observations $(\hat{U}_{i 1}, \hat{U}_{i 2}), i = 1, \dots, n$, of the conditional copula $C_{x}$ adjusted for the effect of the covariate on the marginal distributions can be estimated as given in (5).
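The Bernstein expansion in (6) and its derivative in (8) and (9) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient grid `theta` is taken as given (here a hypothetical grid obtained from the independence copula, not the empirical checkerboard copula of Lu and Ghosh [24]), and the function names are ours.

```python
from math import comb

def bernstein_basis(k, m, t):
    """Bernstein basis polynomial: C(m, k) * t^k * (1 - t)^(m - k)."""
    return comb(m, k) * t**k * (1 - t)**(m - k)

def bernstein_copula(theta, w, v):
    """Evaluate the form of Eq. (6) for a (g+1) x (m+1) coefficient grid theta[h][k]."""
    g, m = len(theta) - 1, len(theta[0]) - 1
    return sum(theta[h][k] * bernstein_basis(h, g, w) * bernstein_basis(k, m, v)
               for h in range(g + 1) for k in range(m + 1))

def bernstein_copula_dv(theta, w, v):
    """Evaluate the form of Eq. (8): partial derivative of Eq. (6) with respect to v."""
    g, m = len(theta) - 1, len(theta[0]) - 1
    total = 0.0
    for h in range(g + 1):
        for k in range(m):
            lam = theta[h][k + 1] - theta[h][k]  # first difference, Eq. (9)
            total += (lam * bernstein_basis(h, g, w)
                      * m * bernstein_basis(k, m - 1, v))
    return total

# Hypothetical coefficients: the independence copula evaluated on the grid
# (h/g, k/m). In the paper these values come from the empirical checkerboard copula.
g, m = 5, 7
theta_indep = [[(h / g) * (k / m) for k in range(m + 1)] for h in range(g + 1)]

value = bernstein_copula(theta_indep, 0.3, 0.6)
derivative = bernstein_copula_dv(theta_indep, 0.3, 0.6)
```

For the independence grid the expansion reproduces $w v$ and the derivative reduces to $w$, which makes a convenient check that the first-difference coefficients in (9) are indexed correctly.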

Now, we can use the covariate-adjusted pseudo-observations $(\hat{U}_{i 1}, \hat{U}_{i 2}), i = 1, \dots, n$, along with the pseudo-observations of the covariate $\hat{V}_{i}, i = 1, \dots, n$, to estimate a three-dimensional copula $C (u_{1}, u_{2}, v)$, again using ECBC, and denote it as $C^{#} (u_{1}, u_{2}, v)$. Similar to (8), it is easy to obtain the partial derivative $C^{# (1)}$ of $C^{#}$ with respect to $v$, which is denoted as

\begin{aligned} C^{# (1)} (u_{1}, u_{2} | v) & \equiv \frac{\partial C^{#} (u_{1}, u_{2}, v)}{\partial v} \\ & = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} \tilde{\gamma}_{h_{1}, h_{2}, k} \, m \binom{m - 1}{k} v^{k} (1 - v)^{m - k - 1} \prod_{s = 1}^{2} \binom{l_{s}}{h_{s}} u_{s}^{h_{s}} (1 - u_{s})^{l_{s} - h_{s}}, \end{aligned}

(12)

where

\tilde{\gamma}_{h_{1}, h_{2}, k} = \tilde{\theta}_{h_{1}, h_{2}, k + 1} - \tilde{\theta}_{h_{1}, h_{2}, k} .

(13)

Notice that we can use $C^{# (1)} (u_{1}, u_{2} | F_{n X} (x))$ as an estimate of the conditional copula $C_{x}$; however, $C^{# (1)} (u_{1}, u_{2} | v)$ is itself a valid bivariate copula for any value of $v \in [0, 1]$ only asymptotically. This is because the conditional marginal distributions of $C^{# (1)} (u_{1}, u_{2} | v)$ are not necessarily uniform distributions for finite samples. Aiming to obtain a more accurate estimate of the conditional copula for small samples, we consider the conditional marginal distributions of $C^{# (1)} (u_{1}, u_{2} | v)$ given as

\begin{aligned} F_{1} (u_{1} | v) & \equiv C^{# (1)} (u_{1}, 1 | v) \\ & = \sum_{h_{1} = 0}^{l_{1}} \sum_{k = 0}^{m - 1} \tilde{\gamma}_{h_{1}, l_{2}, k} \, m \binom{m - 1}{k} v^{k} (1 - v)^{m - k - 1} \binom{l_{1}}{h_{1}} u_{1}^{h_{1}} (1 - u_{1})^{l_{1} - h_{1}}, \end{aligned}

(14)

and

\begin{aligned} F_{2} (u_{2} | v) & \equiv C^{# (1)} (1, u_{2} | v) \\ & = \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} \tilde{\gamma}_{l_{1}, h_{2}, k} \, m \binom{m - 1}{k} v^{k} (1 - v)^{m - k - 1} \binom{l_{2}}{h_{2}} u_{2}^{h_{2}} (1 - u_{2})^{l_{2} - h_{2}} . \end{aligned}

(15)

By using Sklar’s theorem, we are able to obtain a conditional copula estimator that is itself a genuine copula, denoted as

C^{#} (u_{1}, u_{2} | v) = C^{# (1)} (F_{1}^{- 1} (u_{1} | v), F_{2}^{- 1} (u_{2} | v) | v),

(16)

where $F_{1}^{- 1} (u_{1} | v)$ and $F_{2}^{- 1} (u_{2} | v)$ are the inverse functions of $F_{1}$ and $F_{2}$, respectively. It is to be noted that $C^{#} (u_{1}, u_{2} | v)$ is a valid copula for any value of $v \in [0, 1]$, and as a result, the conditional copula $C_{x}$ can be estimated by

C_{x}^{#} (u_{1}, u_{2}) = P (F_{1 x} (y_{1}) \leq u_{1}, F_{2 x} (y_{2}) \leq u_{2} | X = x) = C^{#} (u_{1}, u_{2} | F_{n X} (x)) .

(17)
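The rescaling step in (16) can be illustrated with a minimal Python sketch. It assumes that, for a fixed $v$, the Bernstein coefficients of $C^{# (1)} (u_{1}, u_{2} | v)$ are already available as a grid `eta` (hypothetical values below, chosen so that the margins are deliberately non-uniform), and it inverts the two margins by bisection. The bisection routine is our stand-in for any one-dimensional root finder; all names are illustrative.

```python
from math import comb

def bern(h, l, u):
    """Bernstein basis polynomial of degree l."""
    return comb(l, h) * u**h * (1 - u)**(l - h)

def c1(eta, u1, u2):
    """Bivariate Bernstein expansion with coefficient grid eta[h1][h2]."""
    l1, l2 = len(eta) - 1, len(eta[0]) - 1
    return sum(eta[h1][h2] * bern(h1, l1, u1) * bern(h2, l2, u2)
               for h1 in range(l1 + 1) for h2 in range(l2 + 1))

def invert(f, target, tol=1e-12):
    """Solve f(t) = target on [0, 1] by bisection; f must be non-decreasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rescaled_copula(eta, u1, u2):
    """Form of Eq. (16): plug the inverted margins back into the expansion."""
    s1 = invert(lambda t: c1(eta, t, 1.0), u1)  # inverse of F_1(. | v)
    s2 = invert(lambda t: c1(eta, 1.0, t), u2)  # inverse of F_2(. | v)
    return c1(eta, s1, s2)

# Hypothetical coefficients with non-uniform margins: a product of two
# distribution functions, so the rescaled copula should be independence.
l1, l2 = 6, 4
eta = [[(h1 / l1) ** 2 * (h2 / l2) ** 0.5 for h2 in range(l2 + 1)]
       for h1 in range(l1 + 1)]

before = c1(eta, 0.4, 1.0)              # margin before rescaling, not equal to 0.4
after = rescaled_copula(eta, 0.4, 1.0)  # margin after rescaling
joint = rescaled_copula(eta, 0.3, 0.7)
```

In this toy case the unadjusted margin at $u_{1} = 0.4$ differs from 0.4, while the rescaled version returns the argument itself up to the bisection tolerance, which is the uniform-margin property the rescaling is meant to enforce.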

Let $\| g \| (v) = \sup_{(u_{1}, u_{2}) \in [0, 1]^{2}} | g (u_{1}, u_{2} | v) |$ denote the conditional supremum norm of a conditional function $g (u_{1}, u_{2} | v)$ defined on the unit square $[0, 1]^{2}$ for a fixed $v$. We denote the common supremum norm as $\| \cdot \|$. The following theorem provides the large-sample consistency of the estimator $C^{#} (u_{1}, u_{2} | v)$ for a fixed value of $0 < v < 1$ using the conditional supremum norm. The proof is in Appendix A.

Theorem 1.

Assume that the underlying trivariate copula $C (u_{1}, u_{2}, v)$ is absolutely continuous and the conditional copula $C_{v} (u_{1}, u_{2} | v) = \frac{\partial C (u_{1}, u_{2}, v)}{\partial v}$ is Lipschitz continuous on $[0, 1]^{3}$. Then, for any fixed $0 < v < 1$, we have

E (\| C^{#} - C_{v} \| (v)) \overset{a.s.}{\to} 0 \quad \text{as} \quad n \to \infty,

where the expectation is taken with respect to the empirical prior distribution of $l_{1}, l_{2}$, and $m$ as given for ECBC.

Remark 1.

Following the hierarchical shifted Poisson distributions proposed for ECBC in Lu and Ghosh [24], the empirical prior distribution of $l_{1}, l_{2}$, and $m$ is given as

l_{j} | \alpha_{j} \sim \mathrm{Poisson} (n^{\alpha_{j}}) + 1 \quad \text{and} \quad \alpha_{j} \sim \mathrm{Unif} (\frac{1}{3}, \frac{2}{3}), \quad j = 1, 2,

m | \alpha \sim \mathrm{Poisson} (n^{\alpha}) + 2 \quad \text{and} \quad \alpha \sim \mathrm{Unif} (\frac{1}{3}, \frac{2}{3}) .

The choice of the above priors is motivated by the asymptotic theory of empirical checkerboard copula methods (Janssen et al. [26]). Sample-size-dependent or, more generally, data-dependent priors have been used extensively in the literature (e.g., see Wasserman [27] and Parrado-Hernández et al. [28]), and these have been shown to produce desirable asymptotic properties of the posterior distributions.
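To make the prior in Remark 1 concrete, the following Python sketch draws the degrees $(l_{1}, l_{2}, m)$ for a given sample size. It assumes the three exponents are drawn independently, and it uses Knuth's multiplication method as a simple standard-library Poisson sampler; that sampler is our choice for illustration and is only suitable for moderate $n$, since it relies on $e^{-\lambda}$ not underflowing.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam only."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_degrees(n, rng):
    """Draw (l1, l2, m) from the shifted Poisson prior with Unif(1/3, 2/3) exponents."""
    alpha1 = rng.uniform(1 / 3, 2 / 3)
    alpha2 = rng.uniform(1 / 3, 2 / 3)
    alpha = rng.uniform(1 / 3, 2 / 3)
    l1 = sample_poisson(n ** alpha1, rng) + 1  # shift keeps l1 >= 1
    l2 = sample_poisson(n ** alpha2, rng) + 1  # shift keeps l2 >= 1
    m = sample_poisson(n ** alpha, rng) + 2    # shift keeps m >= 2
    return l1, l2, m

rng = random.Random(7)  # arbitrary seed for reproducibility
draws = [sample_degrees(200, rng) for _ in range(1000)]
```

The shifts guarantee that the derivative in (12), which involves degree $m - 1$ in $v$, is always well defined.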

Next, by extending the dependence measures given in Schweizer et al. [29] to conditional versions, we are able to estimate the conditional dependence measures (e.g., conditional Spearman’s rho, conditional Kendall’s tau, etc.) using the estimator $C^{# (1)} (u_{1}, u_{2} | v)$. For instance, the estimate of conditional Kendall’s tau takes the form

\hat{\tau} (v) = 4 \int_{0}^{1} \int_{0}^{1} C^{# (1)} (u_{1}, u_{2} | v) \, d C^{# (1)} (u_{1}, u_{2} | v) - 1,

(18)

and the estimate of the conditional Spearman’s rho is given as

\hat{\rho} (v) = 12 \int_{0}^{1} \int_{0}^{1} (C^{# (1)} (u_{1}, u_{2} | v) - F_{1} (u_{1} | v) F_{2} (u_{2} | v)) \, d F_{1} (u_{1} | v) \, d F_{2} (u_{2} | v) .

(19)

Let us denote

\eta_{h_{1}, h_{2} | v} \equiv m \sum_{k = 0}^{m - 1} \tilde{\gamma}_{h_{1}, h_{2}, k} \binom{m - 1}{k} v^{k} (1 - v)^{m - k - 1}, \quad h_{1} = 0, \dots, l_{1}, \quad h_{2} = 0, \dots, l_{2} .

(20)

Then, we can rewrite the estimator $C^{# (1)} (u_{1}, u_{2} | v)$ and its conditional marginal distributions as

C^{# (1)} (u_{1}, u_{2} | v) = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \eta_{h_{1}, h_{2} | v} \prod_{s = 1}^{2} \binom{l_{s}}{h_{s}} u_{s}^{h_{s}} (1 - u_{s})^{l_{s} - h_{s}},

(21)

F_{1} (u_{1} | v) = \sum_{h_{1} = 0}^{l_{1}} \eta_{h_{1}, l_{2} | v} \binom{l_{1}}{h_{1}} u_{1}^{h_{1}} (1 - u_{1})^{l_{1} - h_{1}},

(22)

and

F_{2} (u_{2} | v) = \sum_{h_{2} = 0}^{l_{2}} \eta_{l_{1}, h_{2} | v} \binom{l_{2}}{h_{2}} u_{2}^{h_{2}} (1 - u_{2})^{l_{2} - h_{2}},

(23)

respectively. As a result, a closed-form estimate of conditional Kendall’s tau takes the form

\begin{aligned} \hat{\tau} (v) = {} & 4 \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{g_{1} = 0}^{l_{1} - 1} \sum_{g_{2} = 0}^{l_{2} - 1} \eta_{h_{1}, h_{2} | v} (\eta_{g_{1} + 1, g_{2} + 1 | v} - \eta_{g_{1} + 1, g_{2} | v} - \eta_{g_{1}, g_{2} + 1 | v} + \eta_{g_{1}, g_{2} | v}) \\ & \times \prod_{s = 1}^{2} l_{s} \binom{l_{s}}{h_{s}} \binom{l_{s} - 1}{g_{s}} B (h_{s} + g_{s} + 1, 2 l_{s} - h_{s} - g_{s}) - 1, \end{aligned}

(24)

where $B$ is the beta function. Similarly, we are able to obtain a closed-form estimate of the conditional Spearman’s rho as

\begin{aligned} \hat{\rho} (v) = {} & 12 \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{g_{1} = 0}^{l_{1} - 1} \sum_{g_{2} = 0}^{l_{2} - 1} (\eta_{h_{1}, h_{2} | v} - \eta_{h_{1}, l_{2} | v} \eta_{l_{1}, h_{2} | v}) (\eta_{g_{1} + 1, l_{2} | v} - \eta_{g_{1}, l_{2} | v}) (\eta_{l_{1}, g_{2} + 1 | v} - \eta_{l_{1}, g_{2} | v}) \\ & \times \prod_{s = 1}^{2} l_{s} \binom{l_{s}}{h_{s}} \binom{l_{s} - 1}{g_{s}} B (h_{s} + g_{s} + 1, 2 l_{s} - h_{s} - g_{s}) . \end{aligned}

(25)

To compute the estimates of conditional dependence measures more efficiently, we apply matrix operations to the tensor products in expressions (24) and (25). For given $(h_{1}, h_{2}), h_{1} = 1, \dots, l_{1}, h_{2} = 1, \dots, l_{2}$, let us denote

a_{h_{1}, g_{1}} = l_{1} \binom{l_{1}}{h_{1}} \binom{l_{1} - 1}{g_{1}} B (h_{1} + g_{1} + 1, 2 l_{1} - h_{1} - g_{1}), \quad g_{1} = 0, \dots, l_{1} - 1,

(26)

and

b_{h_{2}, g_{2}} = l_{2} \binom{l_{2}}{h_{2}} \binom{l_{2} - 1}{g_{2}} B (h_{2} + g_{2} + 1, 2 l_{2} - h_{2} - g_{2}), \quad g_{2} = 0, \dots, l_{2} - 1 .

(27)

Then, we have $a_{h_{1}} = (a_{h_{1}, 0}, \dots, a_{h_{1}, l_{1} - 1})^{T}$ and $b_{h_{2}} = (b_{h_{2}, 0}, \dots, b_{h_{2}, l_{2} - 1})^{T}$. We also denote an $l_{1} \times l_{2}$ matrix $D_{v} = (d_{g_{1}, g_{2} | v})_{l_{1} \times l_{2}}$, where $d_{g_{1}, g_{2} | v} = \eta_{g_{1} + 1, g_{2} + 1 | v} - \eta_{g_{1} + 1, g_{2} | v} - \eta_{g_{1}, g_{2} + 1 | v} + \eta_{g_{1}, g_{2} | v}$. Thus, the estimate of conditional Kendall’s tau given in (24) can be rewritten as

\hat{\tau} (v) = 4 \sum_{h_{1} = 1}^{l_{1}} \sum_{h_{2} = 1}^{l_{2}} \eta_{h_{1}, h_{2} | v} \, a_{h_{1}}^{T} D_{v} b_{h_{2}} - 1 .

(28)

Furthermore, we can denote two $l_{1} \times l_{2}$ matrices, $H_{v} = (\eta_{h_{1}, h_{2} | v})_{l_{1} \times l_{2}}$ and $G_{v} = (a_{h_{1}}^{T} D_{v} b_{h_{2}})_{l_{1} \times l_{2}}$, and as a result, we have

\hat{\tau} (v) = 4 \, \mathrm{Tr} (H_{v}^{T} G_{v}) - 1 .

(29)

Similarly, we are able to rewrite the estimate of conditional Spearman’s rho given in (25). Let us first denote two vectors, $p_{v} = (p_{g_{1} | v})_{l_{1}}^{T}$, where $p_{g_{1} | v} = \eta_{g_{1} + 1, l_{2} | v} - \eta_{g_{1}, l_{2} | v}, g_{1} = 0, \dots, l_{1} - 1$, and $q_{v} = (q_{g_{2} | v})_{l_{2}}^{T}$, where $q_{g_{2} | v} = \eta_{l_{1}, g_{2} + 1 | v} - \eta_{l_{1}, g_{2} | v}, g_{2} = 0, \dots, l_{2} - 1$. Then, we have

\hat{\rho} (v) = 12 \sum_{h_{1} = 1}^{l_{1}} \sum_{h_{2} = 1}^{l_{2}} (\eta_{h_{1}, h_{2} | v} - \eta_{h_{1}, l_{2} | v} \eta_{l_{1}, h_{2} | v}) \, a_{h_{1}}^{T} (p_{v} \otimes q_{v}) b_{h_{2}} .

(30)

If we further denote two $l_{1} \times l_{2}$ matrices, $R_{v} = (r_{h_{1}, h_{2} | v})_{l_{1} \times l_{2}}$, where $r_{h_{1}, h_{2} | v} = \eta_{h_{1}, h_{2} | v} - \eta_{h_{1}, l_{2} | v} \eta_{l_{1}, h_{2} | v}$, and $J_{v} = (a_{h_{1}}^{T} (p_{v} \otimes q_{v}) b_{h_{2}})_{l_{1} \times l_{2}}$, then we have

\hat{\rho} (v) = 12 \, \mathrm{Tr} (R_{v}^{T} J_{v}) .

(31)

By applying the above matrix operations, we obtain compact expressions for the estimates of conditional dependence measures, and the computational efficiency can be improved significantly.
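As a numerical sanity check of the closed forms (24) and (25), the sketch below evaluates both for a given coefficient grid `eta` at a fixed $v$, using only the Python standard library. The grids are hypothetical inputs chosen for illustration (the independence grid and a small grid built from $\min (h_{1}, h_{2}) / l$), not ECBC output, and plain loops stand in for the matrix products.

```python
from math import comb, exp, lgamma

def beta_fn(a, b):
    """Beta function B(a, b) computed through log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def weights(l):
    """Weights of the form in Eqs. (26)-(27), for h = 0..l and g = 0..l-1."""
    return [[l * comb(l, h) * comb(l - 1, g) * beta_fn(h + g + 1, 2 * l - h - g)
             for g in range(l)] for h in range(l + 1)]

def kendall_tau(eta):
    """Closed form of Eq. (24) for a (l1+1) x (l2+1) coefficient grid."""
    l1, l2 = len(eta) - 1, len(eta[0]) - 1
    a, b = weights(l1), weights(l2)
    # Mixed second differences d_{g1, g2} (the entries of D_v).
    d = [[eta[g1 + 1][g2 + 1] - eta[g1 + 1][g2] - eta[g1][g2 + 1] + eta[g1][g2]
          for g2 in range(l2)] for g1 in range(l1)]
    total = 0.0
    for h1 in range(l1 + 1):
        for h2 in range(l2 + 1):
            quad = sum(a[h1][g1] * d[g1][g2] * b[h2][g2]
                       for g1 in range(l1) for g2 in range(l2))
            total += eta[h1][h2] * quad
    return 4.0 * total - 1.0

def spearman_rho(eta):
    """Closed form of Eq. (25) for the same coefficient grid."""
    l1, l2 = len(eta) - 1, len(eta[0]) - 1
    a, b = weights(l1), weights(l2)
    p = [eta[g + 1][l2] - eta[g][l2] for g in range(l1)]  # differences of margin 1
    q = [eta[l1][g + 1] - eta[l1][g] for g in range(l2)]  # differences of margin 2
    ap = [sum(a[h][g] * p[g] for g in range(l1)) for h in range(l1 + 1)]
    bq = [sum(b[h][g] * q[g] for g in range(l2)) for h in range(l2 + 1)]
    total = 0.0
    for h1 in range(l1 + 1):
        for h2 in range(l2 + 1):
            r = eta[h1][h2] - eta[h1][l2] * eta[l1][h2]
            total += r * ap[h1] * bq[h2]
    return 12.0 * total

# Hypothetical grids: independence, and min(h1, h2) / l with l = 2.
eta_indep = [[(h1 / 4) * (h2 / 3) for h2 in range(4)] for h1 in range(5)]
eta_min = [[min(h1, h2) / 2 for h2 in range(3)] for h1 in range(3)]

tau_indep, rho_indep = kendall_tau(eta_indep), spearman_rho(eta_indep)
tau_min, rho_min = kendall_tau(eta_min), spearman_rho(eta_min)
```

Both measures vanish for the independence grid, and both are positive for the second grid, as expected for a positively dependent Bernstein copula.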

3. Numerical Illustrations Using Simulated Data

We now show the finite-sample performance of the conditional copula estimator $C_{x}^{#} (u_{1}, u_{2})$. Similar to the simulation setup in Acar et al. [8], data $(U_{i 1}, U_{i 2} | X_{i}), i = 1, \dots, n$, are generated from the Clayton copula using the package copula in R under the following model: $(U_{i 1}, U_{i 2}) | X_{i} \sim C (u_{1}, u_{2} | \theta_{i})$, where $\theta_{i} = \exp (0.8 X_{i} - 2)$ and $X_{i} \sim \mathrm{Unif} (0, 3)$. The true copula parameter varies from 0.14 to 1.49 with Spearman’s rho ranging from 0.10 to 0.60. The pseudo-observations of the covariate are defined as $V_{i} \equiv F_{n X} (X_{i}), i = 1, \dots, n$, where $F_{n X} (x) = \frac{1}{n + 1} \sum_{i = 1}^{n} I (X_{i} \leq x)$. A total of $N = 100$ replicates are drawn from the true copula with sample size $n = 200$.
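One replicate of this data-generating process can be sketched as follows. The paper uses the R package copula; the Python version below is only an illustrative stand-in that samples the Clayton copula by the standard conditional-inversion formula and computes the covariate pseudo-observations with the modified empirical CDF. Function names and the seed are our assumptions.

```python
import math
import random

def theta_link(x):
    """Calibration function from the text: theta = exp(0.8 x - 2)."""
    return math.exp(0.8 * x - 2.0)

def sample_clayton_pair(theta, rng):
    """Draw (u1, u2) from a Clayton copula by conditional inversion."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], avoids raising 0 to a negative power
    w = 1.0 - rng.random()
    u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
    return u1, u2

def modified_ecdf_values(xs):
    """F_n(x_i) = #{j : x_j <= x_i} / (n + 1), so all values stay inside (0, 1)."""
    n = len(xs)
    return [sum(1 for xj in xs if xj <= xi) / (n + 1) for xi in xs]

def simulate(n, rng):
    """One replicate: covariates, conditional Clayton pairs, covariate pseudo-observations."""
    xs = [rng.uniform(0.0, 3.0) for _ in range(n)]
    pairs = [sample_clayton_pair(theta_link(x), rng) for x in xs]
    vs = modified_ecdf_values(xs)
    return xs, pairs, vs

rng = random.Random(2024)  # arbitrary seed for reproducibility
xs, pairs, vs = simulate(200, rng)
```

Dividing by $n + 1$ rather than $n$ keeps every pseudo-observation strictly below 1, which avoids evaluating the estimator exactly on the boundary of the unit cube.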

Figure 1 shows the contour plots of the Monte Carlo average of the estimated $C_{x}^{#} (u_{1}, u_{2})$ given $x = 0.5$, $x = 1$, $x = 1.5$, and $x = 2$, respectively, across 100 Monte Carlo replicates. The contour plots are drawn based on a $15 \times 15$ equally spaced grid of points in the unit square, meaning that for a given $v$, we need to find $15 + 15 = 30$ roots. Since $F_{1}$ and $F_{2}$ are both non-decreasing functions, we can calculate the inverse functions $F_{1}^{- 1} (u_{1} | v)$ and $F_{2}^{- 1} (u_{2} | v)$ by applying the function uniroot in R to Equations (22) and (23) for a given value of $v$. The true copula parameters are 0.20 (Spearman’s rho equal to 0.09), 0.30 (Spearman’s rho equal to 0.20), 0.45 (Spearman’s rho equal to 0.27), and 0.67 (Spearman’s rho equal to 0.37) for $x = 0.5$, $x = 1$, $x = 1.5$, and $x = 2$, respectively.

It can be observed from the plots that all the estimated contour lines overlap with the true lines at the boundaries, which is evidence that the conditional copula estimator $C^{#} (u_{1}, u_{2} | v)$ is a genuine copula with uniform conditional marginal distributions. Moreover, there is almost no bias between the estimated conditional copula averaged over 100 Monte Carlo samples and the true conditional copula across different values of the covariate, illustrating that the proposed ECBC-based method works well in estimating the conditional copula.

Then, we can plot the conditional Kendall’s tau and conditional Spearman’s rho as given in (29) and (31) as a function of the covariate in Figure 2. The covariate $x$ ranges from 0 to 3, so we compute the dependence measures at seven different values $(0.05, 0.5, 1, 1.5, 2, 2.5, 2.95)$. The plots show the Monte Carlo average of the estimates of the dependence measures and the 90% Monte Carlo confidence bands (5th and 95th percentiles of the dependence measure estimates) across 100 Monte Carlo replicates.

Overall, the estimates averaged over 100 Monte Carlo samples seem to be fairly close to the true conditional dependence measures across different values of the covariate. Near the boundaries of the covariate range, the variance tends to increase and the Monte Carlo average tends to slightly underestimate the true values.

Next, we would like to compare the performance of our proposed nonparametric method to the semiparametric method in Acar et al. [8] through simulation studies. They assume a conditional copula model where the copula function comes from a parametric copula family and the copula parameter is a function of the covariate. Different copula families, e.g., Clayton and Gumbel, were considered, and the functional relationship between the copula parameter and the covariate was estimated using a nonparametric local likelihood approach. The severe consequence of the misspecified copula model was investigated in Acar et al. [8], and they proposed a copula selection method based on cross-validated prediction errors. In contrast, the proposed conditional copula estimator is fully nonparametric, so there is no need to make any choice of the copula family.

The simulation setups follow Acar et al. [8]. The data $(U_{i 1}, U_{i 2} | X_{i}), i = 1, \dots, n$, are generated from the Clayton copula under the following models: $(U_{i 1}, U_{i 2}) | X_{i} \sim C (u_{1}, u_{2} | \theta_{i})$, where (i): $\theta_{i} = \exp (0.8 X_{i} - 2)$ and $X_{i} \sim \mathrm{Unif} (2, 5)$; (ii): $\theta_{i} = \exp (2 - 0.3 (X_{i} - 2)^{2})$ and $X_{i} \sim \mathrm{Unif} (2, 5)$. The sample size is $n = 200$.

The comparison can be made numerically by calculating the conditional Kendall’s tau and some performance measures, including the integrated square bias (IBIAS²), integrated variance (IVAR) and integrated mean square error (IMSE) as given in Acar et al. [8]:

\mathrm{IBIAS}^{2} = \int_{[2, 5]} [E [\hat{\tau}_{x} (x)] - \tau_{x} (x)]^{2} \, d x = 3 \int_{[0, 1]} [E [\hat{\tau} (v)] - \tau (v)]^{2} \, d v,

(32)

\mathrm{IVAR} = \int_{[2, 5]} E [[\hat{\tau}_{x} (x) - E [\hat{\tau}_{x} (x)]]^{2}] \, d x = 3 \int_{[0, 1]} E [[\hat{\tau} (v) - E [\hat{\tau} (v)]]^{2}] \, d v,

(33)

\mathrm{IMSE} = \int_{[2, 5]} E [[\hat{\tau}_{x} (x) - \tau_{x} (x)]^{2}] \, d x = 3 \int_{[0, 1]} E [[\hat{\tau} (v) - \tau (v)]^{2}] \, d v,

(34)

where the second equality holds because $\tau_{x} (X) = \tau (F_{X} (X)) = \tau (V)$ and $X \sim \mathrm{Unif} (2, 5)$. We compute Monte Carlo estimates of these performance measures by following the techniques in Segers et al. [30] and compare our proposed method (referred to as “ECBC-based”) to the local likelihood method (referred to as “Local”) in Acar et al. [8]. The results are shown in Table 1.
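The three integrated measures in (32) to (34) can be approximated from replicated estimates on a grid of $v$ values. The sketch below is a plain midpoint-rule approximation under our own assumptions, not the procedure of Segers et al. [30]: the replicated estimates are synthetic placeholders (true value plus a hypothetical bias and noise), and the true conditional Kendall's tau uses the Clayton identity $\tau = \theta / (\theta + 2)$ under model (i).

```python
import math
import random

def clayton_tau(v):
    """True conditional Kendall's tau at v under model (i), with x = 2 + 3 v."""
    theta = math.exp(0.8 * (2.0 + 3.0 * v) - 2.0)
    return theta / (theta + 2.0)

def integrated_measures(tau_hat, tau_true, width=3.0):
    """Midpoint-rule estimates of IBIAS^2, IVAR and IMSE.

    tau_hat[r][i] is the estimate from replicate r at the i-th grid point;
    the grid is assumed equally spaced in (0, 1), and `width` is the length
    of the covariate range (3 for Unif(2, 5)).
    """
    n_rep, n_grid = len(tau_hat), len(tau_true)
    ibias2 = ivar = imse = 0.0
    for i in range(n_grid):
        column = [tau_hat[r][i] for r in range(n_rep)]
        mean = sum(column) / n_rep
        ibias2 += (mean - tau_true[i]) ** 2
        ivar += sum((t - mean) ** 2 for t in column) / n_rep
        imse += sum((t - tau_true[i]) ** 2 for t in column) / n_rep
    scale = width / n_grid
    return ibias2 * scale, ivar * scale, imse * scale

# Synthetic placeholder estimates (hypothetical bias and noise levels).
grid = [(i + 0.5) / 20 for i in range(20)]
tau_true = [clayton_tau(v) for v in grid]
rng = random.Random(1)
tau_hat = [[t - 0.02 + rng.gauss(0.0, 0.05) for t in tau_true] for _ in range(100)]

ibias2, ivar, imse = integrated_measures(tau_hat, tau_true)
```

With the variance normalized by the number of replicates, the computed values satisfy IMSE = IBIAS² + IVAR up to rounding, which is a quick consistency check on any implementation.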

From the results, we can see that when data are generated from the Clayton copula (the underlying true copula), our ECBC-based method outperforms the local likelihood method for the incorrect parametric case (Gumbel) in terms of bias and MSE, although the performance is not as good as the local likelihood method for the correct parametric case (Clayton). Nonetheless, the advantage of the proposed nonparametric method is that we can avoid the adverse impact of misspecified copula and obtain a fairly good estimation of conditional copula and conditional dependence measures without having to select the ‘best’ copula model from numerous copula families.

4. Real Case Study

We now apply the proposed methodology to a data set of life expectancy at birth of males and females with GDP (in USD) per capita as a covariate for 210 countries or regions. The data are available from the CIA World Factbook 2020. Similar data sets were analyzed in Gijbels et al. [18] and Abegaz et al. [9]. Life expectancy at birth summarizes the average number of years to be lived in a country, while GDP per capita is often considered as an indicator of a country’s standard of living. We are interested in the dependence between the life expectancy at birth of males and females and would like to see if the strength of dependence is influenced by the GDP per capita. In other words, it is of interest to investigate the dependence between the life expectancy at birth of males ($Y_{1}$) and females ($Y_{2}$) conditioning on the covariate $X$, where $X =$ log₁₀(GDP) is a log₁₀ transformation of GDP per capita.

The pairwise scatterplots of the data are shown in Figure 3a, from which we can see that there is a strong positive correlation between the life expectancy of males (referred to as Male) and females (referred to as Female). Figure 3a also shows that life expectancy tends to increase with the log₁₀ transformation of GDP per capita (referred to as log10.GDP) for both males and females. Before estimating the conditional copula of

(Y_{1}, Y_{2})

given X, we first remove the effect of the covariate X on the marginal distributions of

Y_{1}

and

Y_{2}

. As a result, the covariate-adjusted pseudo-observations of

Y_{1}

and

Y_{2}

(referred to as Male.pseudo and Female.pseudo, respectively) and the pseudo-observations of X (referred to as log10.GDP.pseudo) are given in Figure 3b.
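The two kinds of pseudo-observations mentioned above can be illustrated with a short sketch. This is not the marginal adjustment used in the paper (which is described in the methodology section); it is a generic stand-in that assumes rank-based pseudo-observations for the covariate and a Gaussian-kernel-weighted conditional empirical CDF for the responses. The function names and the `bandwidth` parameter are hypothetical.

```python
import math


def pseudo_obs(x):
    """Rank-based pseudo-observations rank / (n + 1), which lie strictly in (0, 1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1.0)
    return u


def adjusted_pseudo_obs(y, x, bandwidth):
    """Kernel-weighted conditional empirical CDF evaluated at each (x_i, y_i).

    u_i = sum_j w_j(x_i) * 1{y_j <= y_i}, with Gaussian weights w_j(x_i)
    normalised to sum to one. This is an assumed, illustrative adjustment.
    """
    n = len(y)
    out = []
    for i in range(n):
        w = [math.exp(-0.5 * ((x[i] - x[j]) / bandwidth) ** 2) for j in range(n)]
        total = sum(w)
        out.append(sum(w[j] for j in range(n) if y[j] <= y[i]) / total)
    return out
```

In this sketch a small bandwidth makes the adjustment more local in the covariate, while a very large bandwidth recovers the ordinary (unconditional) empirical CDF.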

We then estimate the conditional copula and the conditional dependence of life expectancy at birth of males and females given the covariate X. Figure 4 shows the estimated conditional Kendall’s tau as a function of log₁₀(GDP). It can be observed from the plot that the estimate of Kendall’s tau decreases from around 0.8 to 0.6 as the GDP per capita increases from

10^{3} = 1000

to

10^{4.6} \approx

40,000 USD, and it picks up slightly as the GDP per capita becomes greater than 40,000 USD. Overall, the dependence between the life expectancy at birth of males and females is stronger for countries with a lower GDP per capita (less than 10,000 USD) and weaker for countries with a higher GDP per capita (greater than 10,000 USD).

5. Conclusions

This article provides a nonparametric approach for estimating conditional copulas based on the empirical checkerboard Bernstein copula (ECBC) estimator. Compared to semiparametric methods, the proposed nonparametric method avoids the issue of model misspecification because it does not rely on the selection of a copula family, and it demonstrates good finite-sample performance. The large-sample consistency of the proposed ECBC-based conditional copula estimator is also presented. In addition, we derive closed-form nonparametric estimates of the conditional dependence measures from the proposed estimator.

Due to the complexity in modeling and inference caused by the dependence of the conditional copula on the covariates, it is quite common in practice, particularly for vine copulas, to assume that the dependence structure is not influenced by the value of the covariates, which is referred to as the ‘simplifying assumption’. Under the simplifying assumption, the conditional copula

C_{I | J} (\cdot | V_{J} = v_{J})

does not depend on

v_{J}

, i.e., for every

u_{I} \in {[0, 1]}^{p}

, the function

v_{J} \in R^{q} \to C_{I | J} (u_{I} | V_{J} = v_{J})

is a constant function (that depends on

u_{I}

). See, e.g., Haff et al. [31], Acar et al. [32], Stoeber et al. [33], Nagler and Czado [34], and Schellhase and Spanhel [35].

Several tests of the simplifying assumption are available in the literature; see Acar et al. [36], Gijbels et al. [37], Gijbels et al. [38], Derumigny and Fermanian [39], and Kurz and Spanhel [40], among others. Our proposed ECBC-based conditional copula estimator can be useful for constructing new tests of the simplifying assumption. We have shown the framework for obtaining a general estimate of the conditional copula that is allowed to vary with the value of the covariates. It is also straightforward to obtain an estimate satisfying the simplifying assumption based on the covariate-adjusted pseudo-observations, again using the ECBC estimator. Therefore, it could be possible to build test statistics based on discrepancy criteria of the Kolmogorov–Smirnov type, Anderson–Darling type, etc., where the distributions of such test statistics could be approximated by bootstrap schemes.
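As a rough illustration of the Kolmogorov–Smirnov-type criterion mentioned above, the following sketch computes the largest absolute gap between a covariate-dependent copula estimate and a simplified (covariate-free) estimate on a common grid. It is only a sketch under stated assumptions: the function name `ks_discrepancy` and the nested-list layout are hypothetical, both estimates are assumed to be precomputed, and the bootstrap approximation of the null distribution is not shown.

```python
def ks_discrepancy(cond_copula, simplified_copula):
    """Kolmogorov-Smirnov-type distance on a grid.

    cond_copula[c][a][b]    : estimate of C(u1_a, u2_b | v_c)
    simplified_copula[a][b] : estimate of C(u1_a, u2_b) under the simplifying assumption
    Returns max over (a, b, c) of the absolute difference.
    """
    return max(
        abs(cond_copula[c][a][b] - simplified_copula[a][b])
        for c in range(len(cond_copula))
        for a in range(len(simplified_copula))
        for b in range(len(simplified_copula[a]))
    )
```

A value near zero would be consistent with the simplifying assumption, while a large value suggests that the copula changes with the covariate; how large is "large" would have to be calibrated, for example by a bootstrap scheme.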

Another interesting topic for future work would be extending the estimation framework to high-dimensional conditional copula. We can perhaps first use some dimension reduction methods like principal component analysis (PCA) and then develop copula models based on the lower dimensional principal components of the covariates.

Author Contributions

Conceptualization, L.L. and S.G.; Methodology, L.L. and S.G.; Software, L.L.; Validation, L.L.; Formal analysis, L.L.; Writing—original draft, L.L.; Writing—review and editing, L.L. and S.G.; Visualization, L.L.; Supervision, S.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available from the CIA World Factbook 2020.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Proof of Theorem 1.

We denote

P_{m, k} (v) = \binom{m}{k} v^{k} {(1 - v)}^{m - k} .

(A1)

Then, we can rewrite the Bernstein copula as

\begin{matrix} B (C; u_{1}, u_{2}, v) = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} C (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k} (v), \end{matrix}

(A2)

and the ECBC copula estimator as

\begin{matrix} B (C_{n}^{#}; u_{1}, u_{2}, v) = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k} (v), \end{matrix}

(A3)

where

C_{n}^{#}

is the empirical checkerboard copula. Thus, the partial derivative of the three-dimensional ECBC

B (C_{n}^{#}; u_{1}, u_{2}, v)

with respect to v takes the form of

\begin{matrix} C^{# (1)} (u_{1}, u_{2} | v) \equiv \frac{\partial B (C_{n}^{#}; u_{1}, u_{2}, v)}{\partial v} \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k}^{'} (v), \end{matrix}

(A4)

where

P_{m, k}^{'} (v)

is the derivative of

P_{m, k} (v)

with respect to v.
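Equations (A2) to (A4) share one computational pattern: a triple sum of grid values weighted by Bernstein basis polynomials, with the basis in v replaced by its derivative for the partial derivative. The sketch below evaluates both for an arbitrary array of grid values. It is a minimal illustration, not the authors' implementation; the function names are hypothetical, and the derivative uses the standard identity P'_{m,k}(v) = m [P_{m−1,k−1}(v) − P_{m−1,k}(v)].

```python
from math import comb


def basis(m, v):
    """Vector (P_{m,0}(v), ..., P_{m,m}(v)) of Bernstein basis polynomials."""
    return [comb(m, k) * v**k * (1.0 - v) ** (m - k) for k in range(m + 1)]


def basis_deriv(m, v):
    """Vector of derivatives P'_{m,k}(v) = m [P_{m-1,k-1}(v) - P_{m-1,k}(v)], m >= 1."""
    lower = basis(m - 1, v)
    return [
        m * ((lower[k - 1] if k >= 1 else 0.0) - (lower[k] if k <= m - 1 else 0.0))
        for k in range(m + 1)
    ]


def bernstein_and_dv(grid, u1, u2, v):
    """Evaluate the Bernstein polynomial of grid[a][b][k] and its partial derivative in v."""
    l1, l2, m = len(grid) - 1, len(grid[0]) - 1, len(grid[0][0]) - 1
    b1, b2 = basis(l1, u1), basis(l2, u2)
    bv, dbv = basis(m, v), basis_deriv(m, v)
    value = deriv = 0.0
    for a in range(l1 + 1):
        for b in range(l2 + 1):
            w = b1[a] * b2[b]
            for k in range(m + 1):
                value += grid[a][b][k] * w * bv[k]
                deriv += grid[a][b][k] * w * dbv[k]
    return value, deriv


# Check on the independence copula C(u1, u2, v) = u1 * u2 * v evaluated on the grid.
l1, l2, m = 4, 5, 6
grid = [
    [[(a / l1) * (b / l2) * (k / m) for k in range(m + 1)] for b in range(l2 + 1)]
    for a in range(l1 + 1)
]
value, deriv = bernstein_and_dv(grid, 0.3, 0.7, 0.4)
```

Because Bernstein polynomials reproduce linear functions exactly, the independence copula is recovered without error here: `value` equals u1·u2·v and `deriv` equals u1·u2 up to rounding.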

Let us denote the partial derivative of the Bernstein copula

B (C; u_{1}, u_{2}, v)

with respect to v as

\begin{matrix} C^{(1)} (u_{1}, u_{2} | v) \equiv \frac{\partial B (C; u_{1}, u_{2}, v)}{\partial v} \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} C (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k}^{'} (v), \end{matrix}

(A5)

and the partial derivative of the empirical Bernstein copula

B (C_{n}; u_{1}, u_{2}, v)

with respect to v as

\begin{matrix} C_{n}^{(1)} (u_{1}, u_{2} | v) \equiv \frac{\partial B (C_{n}; u_{1}, u_{2}, v)}{\partial v} \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k}^{'} (v) . \end{matrix}

(A6)

Using the triangle inequality, we have

\begin{matrix} | | C^{# (1)} - C_{v} | | (v) & \leq | | C^{# (1)} - C^{(1)} | | (v) + | | C^{(1)} - C_{v} | | (v) \\ & \leq | | C^{# (1)} - C_{n}^{(1)} | | (v) + | | C_{n}^{(1)} - C^{(1)} | | (v) + | | C^{(1)} - C_{v} | | (v) . \end{matrix}

First, we can show that

\begin{matrix} | | C^{# (1)} - C_{n}^{(1)} | | (v) \\ = | | \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} (C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m}) - C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m})) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m, k}^{'} (v) | | (v) \\ \leq max_{0 \leq h_{1} \leq l_{1}, 0 \leq h_{2} \leq l_{2}, 0 \leq k \leq m - 1} |C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m}) - C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m})| \\ \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m} | P_{l_{1}, h_{1}} (u_{1}) | | P_{l_{2}, h_{2}} (u_{2}) | | P_{m, k}^{'} (v) | \\ \leq max_{0 \leq h_{1} \leq l_{1}, 0 \leq h_{2} \leq l_{2}, 0 \leq k \leq m - 1} |C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m}) - C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m})| \sum_{k = 0}^{m} | P_{m, k}^{'} (v) | . \end{matrix}

In the above, the second inequality follows from the fact that since

\binom{l_{j}}{h_{j}} u_{j}^{h_{j}} {(1 - u_{j})}^{l_{j} - h_{j}},

h_{j} = 0, 1, \dots, l_{j}

,

j = 1, 2

are binomial probabilities,

\sum_{h_{j} = 0}^{l_{j}} (\binom{l_{j}}{h_{j}}) u_{j}^{h_{j}} {(1 - u_{j})}^{l_{j} - h_{j}} = 1

for any

u_{j} \in [0, 1]

,

j = 1, 2

. Under the assumption that the marginal CDFs are continuous, it follows from Remark 2 in Genest et al. [41] that, for a d-dimensional copula,

| | C_{n}^{#} - C_{n} | | \leq \frac{3}{n},

and from Lemma 1 in Janssen et al. [26], it follows that for any fixed

0 < v < 1

,

\sum_{k = 0}^{m} | P_{m, k}^{'} (v) | \sim \sqrt{\frac{2}{π}} \frac{m^{1 / 2}}{\sqrt{v (1 - v)}} = O (m^{1 / 2}) \text{ as } m \to \infty .
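The asymptotic rate above can be checked numerically. The sketch below is a hedged illustration rather than part of the proof: it uses the identity P'_{m,k}(v) = P_{m,k}(v)(k − mv)/(v(1 − v)), which follows by differentiating (A1), and evaluates the basis in log space via `math.lgamma` so that large degrees do not overflow. The function names are hypothetical.

```python
import math


def log_basis(m, k, v):
    """log P_{m,k}(v) for 0 < v < 1, computed with log-gamma for numerical stability."""
    return (
        math.lgamma(m + 1)
        - math.lgamma(k + 1)
        - math.lgamma(m - k + 1)
        + k * math.log(v)
        + (m - k) * math.log(1.0 - v)
    )


def abs_deriv_sum(m, v):
    """sum_k |P'_{m,k}(v)| using P'_{m,k}(v) = P_{m,k}(v) (k - m v) / (v (1 - v))."""
    total = sum(math.exp(log_basis(m, k, v)) * abs(k - m * v) for k in range(m + 1))
    return total / (v * (1.0 - v))


def leading_term(m, v):
    """Asymptotic approximation sqrt(2 / pi) * sqrt(m) / sqrt(v (1 - v))."""
    return math.sqrt(2.0 / math.pi) * math.sqrt(m) / math.sqrt(v * (1.0 - v))


# Ratio of the exact sum to the leading term for a moderately large degree.
ratio = abs_deriv_sum(2000, 0.3) / leading_term(2000, 0.3)
```

For fixed v in (0, 1) the ratio approaches one as m grows; near the endpoints the factor 1/√(v(1 − v)) becomes large, which is why the statement is restricted to fixed 0 < v < 1.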

Thus, for any fixed

0 < v < 1

, we have

\begin{matrix} | | C^{# (1)} - C_{n}^{(1)} | | (v) \\ \leq max_{0 \leq h_{1} \leq l_{1}, 0 \leq h_{2} \leq l_{2}, 0 \leq k \leq m - 1} |C_{n}^{#} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m}) - C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m})| \sum_{k = 0}^{m} | P_{m, k}^{'} (v) | \\ \leq | | C_{n}^{#} - C_{n} | | \sum_{k = 0}^{m} | P_{m, k}^{'} (v) | = O (m^{1 / 2} n^{- 1}) . \end{matrix}

Next, we can use a similar technique to show that

\begin{matrix} | | C_{n}^{(1)} - C^{(1)} | | (v) \\ \leq max_{0 \leq h_{1} \leq l_{1}, 0 \leq h_{2} \leq l_{2}, 0 \leq k \leq m - 1} |C_{n} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m}) - C (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m})| \sum_{k = 0}^{m} | P_{m, k}^{'} (v) | . \end{matrix}

By using Lemma 1 in Janssen et al. [42] and Equation (3) in Kiriliouk et al. [43], for the d-dimensional copula, we obtain

\begin{matrix} | | C_{n} - C | | & \leq \frac{3}{n} + O (n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \text{ a.s.} \\ & = O (n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \text{ a.s.} \end{matrix}

Thus, it follows that for any fixed

0 < v < 1

,

| | C_{n}^{(1)} - C^{(1)} | | (v) = O (m^{1 / 2} n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \text{ a.s.}

Hence, for any fixed

0 < v < 1

, we have

| | C^{# (1)} - C^{(1)} | | (v) \leq | | C^{# (1)} - C_{n}^{(1)} | | (v) + | | C_{n}^{(1)} - C^{(1)} | | (v) = O (m^{1 / 2} n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \text{ a.s.}

Next, by the mean value theorem, there exists

\frac{k}{m} < ξ_{k} < \frac{k + 1}{m}

s.t.

\begin{matrix} C^{(1)} (u_{1}, u_{2} | v) \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} m (C (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k + 1}{m}) - C (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}}, \frac{k}{m})) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v), \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | ξ_{k}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v), \\ = \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} (C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | ξ_{k}) - C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1}) + C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1})) \\ P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v) . \end{matrix}
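The first equality above rewrites the derivative of a Bernstein polynomial in v as a degree m − 1 Bernstein polynomial of scaled forward differences, i.e., Σ_k a_k P'_{m,k}(v) = m Σ_{k=0}^{m−1} (a_{k+1} − a_k) P_{m−1,k}(v). The following sketch checks this identity numerically in one dimension. The coefficients a_k = (k/m)² are an arbitrary illustrative choice, and the function names are hypothetical.

```python
from math import comb


def basis(m, k, v):
    """Bernstein basis polynomial P_{m,k}(v)."""
    return comb(m, k) * v**k * (1.0 - v) ** (m - k)


def basis_deriv(m, k, v):
    """Derivative of P_{m,k}(v), obtained by differentiating the product directly."""
    rising = k * v ** (k - 1) * (1.0 - v) ** (m - k) if k > 0 else 0.0
    falling = (m - k) * v**k * (1.0 - v) ** (m - k - 1) if k < m else 0.0
    return comb(m, k) * (rising - falling)


def derivative_direct(a, v):
    """sum_k a_k P'_{m,k}(v)."""
    m = len(a) - 1
    return sum(a[k] * basis_deriv(m, k, v) for k in range(m + 1))


def derivative_difference_form(a, v):
    """m * sum_{k=0}^{m-1} (a_{k+1} - a_k) P_{m-1,k}(v)."""
    m = len(a) - 1
    return m * sum((a[k + 1] - a[k]) * basis(m - 1, k, v) for k in range(m))


m, v = 10, 0.3
a = [(k / m) ** 2 for k in range(m + 1)]
direct = derivative_direct(a, v)
differenced = derivative_difference_form(a, v)
```

For these coefficients the Bernstein polynomial is v² + v(1 − v)/m, so both expressions should equal 2v + (1 − 2v)/m, which is 0.64 at m = 10 and v = 0.3.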

Notice that

\begin{matrix} \frac{k}{m} - \frac{k + 1}{m} < \frac{k}{m} - \frac{k}{m - 1} < ξ_{k} - \frac{k}{m - 1} < \frac{k + 1}{m} - \frac{k}{m - 1} < \frac{k + 1}{m} - \frac{k}{m}, \end{matrix}

which means that

\begin{matrix} |ξ_{k} - \frac{k}{m - 1}| < \frac{1}{m} . \end{matrix}

If

C_{v} (u_{1}, u_{2} | v) = \frac{\partial C (u_{1}, u_{2}, v)}{\partial v}

is Lipschitz continuous on

{[0, 1]}^{3}

, then there exists a Lipschitz constant L s.t.

\begin{matrix} |C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | ξ_{k}) - C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1})| \leq L |ξ_{k} - \frac{k}{m - 1}| \leq \frac{L}{m}, \end{matrix}

and based on Lemma 3.2 in Segers et al. [30], we also have

\begin{matrix} | | \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v) - C_{v} | | \\ \leq L (\frac{1}{2 \sqrt{l_{1}}} + \frac{1}{2 \sqrt{l_{2}}} + \frac{1}{2 \sqrt{m - 1}}) . \end{matrix}

Thus,

\begin{matrix} | | C^{(1)} - C_{v} | | (v) \\ \leq | | \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} (C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | ξ_{k}) - C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1})) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v) | | \\ + | | \sum_{h_{1} = 0}^{l_{1}} \sum_{h_{2} = 0}^{l_{2}} \sum_{k = 0}^{m - 1} C_{v} (\frac{h_{1}}{l_{1}}, \frac{h_{2}}{l_{2}} | \frac{k}{m - 1}) P_{l_{1}, h_{1}} (u_{1}) P_{l_{2}, h_{2}} (u_{2}) P_{m - 1, k} (v) - C_{v} | | \\ \leq \frac{L}{m} + L (\frac{1}{2 \sqrt{l_{1}}} + \frac{1}{2 \sqrt{l_{2}}} + \frac{1}{2 \sqrt{m - 1}}) . \end{matrix}
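Each term of the form L/(2√l) in the bound above has a classical one-dimensional counterpart: for a Lipschitz function f on [0, 1] with constant L, the degree-m Bernstein polynomial satisfies sup |B_m f − f| ≤ L/(2√m). The sketch below illustrates that one-dimensional bound only; the test function f(x) = |x − 0.5| (with L = 1), the grid size, and the function names are assumptions made for the example.

```python
from math import comb, sqrt


def bernstein_approx(f, m, x):
    """Degree-m Bernstein polynomial of f evaluated at x in [0, 1]."""
    return sum(f(k / m) * comb(m, k) * x**k * (1.0 - x) ** (m - k) for k in range(m + 1))


def sup_error(f, m, n_grid=201):
    """Maximum absolute approximation error over an equally spaced grid on [0, 1]."""
    xs = [i / (n_grid - 1) for i in range(n_grid)]
    return max(abs(bernstein_approx(f, m, x) - f(x)) for x in xs)


def f(x):
    """Test function, Lipschitz with constant L = 1."""
    return abs(x - 0.5)


errors = {m: sup_error(f, m) for m in (4, 16, 64)}
bounds = {m: 1.0 / (2.0 * sqrt(m)) for m in errors}
```

The error shrinks at the rate m^{−1/2}, which is the same rate that drives the terms involving l_1, l_2 and m − 1 in the bound for the three-dimensional case.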

Finally, for any fixed

0 < v < 1

, we obtain

\begin{matrix} | | C^{# (1)} - C_{v} | | (v) & \leq | | C^{# (1)} - C^{(1)} | | (v) + | | C^{(1)} - C_{v} | | (v) \\ & \leq \frac{L}{m} + L (\frac{1}{2 \sqrt{l_{1}}} + \frac{1}{2 \sqrt{l_{2}}} + \frac{1}{2 \sqrt{m - 1}}) + O (m^{1 / 2} n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \text{ a.s.} \end{matrix}

The empirical prior of the degrees m,

l_{1}

, and

l_{2}

is given as

m | α \sim \mathrm{Poisson} (n^{α}) + 2 \text{ and } α \sim \mathrm{Unif} (\frac{1}{3}, \frac{2}{3}),

l_{j} | α_{j} \sim \mathrm{Poisson} (n^{α_{j}}) + 1 \text{ and } α_{j} \sim \mathrm{Unif} (\frac{1}{3}, \frac{2}{3}), j = 1, 2 .
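To make the hierarchical prior on the degrees concrete, the sketch below draws (m, l_1, l_2) for a given sample size n. It only samples from the prior as stated and does not reproduce the rest of the estimation procedure. The function names are hypothetical, and since Python's standard library has no Poisson sampler, Knuth's multiplication method is used as a stand-in that is adequate for the moderate rates n^α arising here.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (suitable for moderate lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_degrees(n, rng):
    """Draw (m, l1, l2) from the prior: shifted Poisson degrees with Unif(1/3, 2/3) exponents."""
    alpha = rng.uniform(1.0 / 3.0, 2.0 / 3.0)
    m = poisson(n**alpha, rng) + 2  # the shift guarantees m >= 2
    ls = []
    for _ in range(2):
        alpha_j = rng.uniform(1.0 / 3.0, 2.0 / 3.0)
        ls.append(poisson(n**alpha_j, rng) + 1)  # the shift guarantees l_j >= 1
    return m, ls[0], ls[1]


rng = random.Random(1)
draws = [sample_degrees(210, rng) for _ in range(500)]
mean_m = sum(d[0] for d in draws) / len(draws)
```

The shifts by 2 and 1 keep m − 1 and l_j strictly positive, so the quantities 1/√(m − 1) and 1/√l_j appearing in the bound are always well defined.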

Notice that

Pr (\frac{1}{3} \leq α \leq \frac{2}{3}) = 1

and

Pr (\frac{1}{3} \leq α_{j} \leq \frac{2}{3}) = 1, j = 1, 2

; then,

E (m^{1 / 2} n^{- 1 / 2} {(\log \log n)}^{1 / 2}) \leq n^{1 / 3} n^{- 1 / 2} {(\log \log n)}^{1 / 2} \to 0

as

n \to \infty

. In the proof of Theorem 1 in Lu and Ghosh [24], it has been shown that

E (\sqrt{\frac{1}{l_{j}}} | α_{j}) \leq \sqrt{\frac{1 - e^{- n^{α_{j}}}}{n^{α_{j}}}} \to 0, j = 1, 2

,

E (\sqrt{\frac{1}{m - 1}} | α) \leq \sqrt{\frac{1 - e^{- n^{α}}}{n^{α}}} \to 0

as

n \to \infty

and

E (\frac{1}{m} | α) \leq \frac{1 - e^{- n^{α}}}{n^{α}} \to 0

as

n \to \infty

. Thus, taking expectation with respect to the prior distributions of

l_{1}

,

l_{2}

and m as given for ECBC, it follows that

\begin{matrix} E (| | C^{# (1)} - C_{v} | | (v)) \\ \leq E (\frac{L}{m}) + E (L (\frac{1}{2 \sqrt{l_{1}}} + \frac{1}{2 \sqrt{l_{2}}} + \frac{1}{2 \sqrt{m - 1}})) + E (O (m^{1 / 2} n^{- 1 / 2} {(\log \log n)}^{1 / 2})) \\ \to 0 \text{ as } n \to \infty \text{ a.s.} \end{matrix}

Since

C (u_{1}, 1 | v) = u_{1}

and

C (1, u_{2} | v) = u_{2}

and

C^{# (1)} (u_{1}, u_{2} | v)

converges to

C_{v} (u_{1}, u_{2} | v)

uniformly on

{[0, 1]}^{2}

as

n \to \infty

for any fixed

0 < v < 1

, then we have

\begin{matrix} E (| | F_{1} (u_{1} | v) - u_{1} | | (v)) \equiv E (\sup_{u_{1} \in [0, 1]} | F_{1} (u_{1} | v) - u_{1} |) \\ = E (| | C^{# (1)} (u_{1}, 1 | v) - C (u_{1}, 1 | v) | | (v)) \leq E (| | C^{# (1)} - C^{(1)} | | (v)) \overset{a.s.}{\to} 0, \end{matrix}

and

\begin{matrix} E (| | F_{2} (u_{2} | v) - u_{2} | | (v)) \equiv E (\sup_{u_{2} \in [0, 1]} | F_{2} (u_{2} | v) - u_{2} |) \\ = E (| | C^{# (1)} (1, u_{2} | v) - C (1, u_{2} | v) | | (v)) \leq E (| | C^{# (1)} - C^{(1)} | | (v)) \overset{a.s.}{\to} 0 . \end{matrix}

For any fixed

0 < v < 1

,

F_{1}

and

F_{2}

are non-decreasing functions, so

F_{1}^{- 1} (u_{1} | v) \overset{a.s.}{\to} u_{1}

and

F_{2}^{- 1} (u_{2} | v) \overset{a.s.}{\to} u_{2}

. Thus, we obtain the uniform convergence of

\begin{matrix} E (| | C^{#} - C_{v} | | (v)) = E (| | C^{# (1)} (F_{1}^{- 1} (u_{1} | v), F_{2}^{- 1} (u_{2} | v) | v) - C_{v} | | (v)) \overset{a.s.}{\to} 0 \end{matrix}

as

n \to \infty

for any fixed

0 < v < 1

. □

References

1. Nelsen, R.B. An Introduction to Copulas; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2007.
2. Jaworski, P.; Durante, F.; Härdle, W.K.; Rychlik, T. Copula Theory and Its Applications; Springer: Berlin/Heidelberg, Germany, 2010; Volume 198.
3. Joe, H. Dependence Modeling with Copulas; CRC Press: Boca Raton, FL, USA, 2014.
4. Patton, A.J. Estimation of multivariate models for time series of possibly different lengths. J. Appl. Econom. 2006, 21, 147–173.
5. Patton, A.J. Modelling asymmetric exchange rate dependence. Int. Econ. Rev. 2006, 47, 527–556.
6. Sklar, M. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 1959, 8, 229–231.
7. Žežula, I. On multivariate Gaussian copulas. J. Stat. Plan. Inference 2009, 139, 3942–3946.
8. Acar, E.F.; Craiu, R.V.; Yao, F. Dependence calibration in conditional copulas: A nonparametric approach. Biometrics 2011, 67, 445–453.
9. Abegaz, F.; Gijbels, I.; Veraverbeke, N. Semiparametric estimation of conditional copulas. J. Multivar. Anal. 2012, 110, 43–73.
10. Vatter, T.; Chavez-Demoulin, V. Generalized additive models for conditional dependence structures. J. Multivar. Anal. 2015, 141, 147–167.
11. Fermanian, J.D.; Lopez, O. Single-index copulas. J. Multivar. Anal. 2018, 165, 27–55.
12. Mesfioui, M.; Quessy, J.F. Dependence structure of conditional Archimedean copulas. J. Multivar. Anal. 2008, 99, 372–385.
13. Kasper, T.M. On convergence and singularity of conditional copulas of multivariate Archimedean copulas, and conditional dependence. J. Multivar. Anal. 2023, 201, 105275.
14. Craiu, V.R.; Sabeti, A. In mixed company: Bayesian inference for bivariate conditional copula models with discrete and continuous outcomes. J. Multivar. Anal. 2012, 110, 106–120.
15. Sabeti, A.; Wei, M.; Craiu, R.V. Additive models for conditional copulas. Stat 2014, 3, 300–312.
16. Levi, E.; Craiu, R.V. Bayesian inference for conditional copulas using Gaussian Process single index models. Comput. Stat. Data Anal. 2018, 122, 115–134.
17. Geerdens, C.; Acar, E.F.; Janssen, P. Conditional copula models for right-censored clustered event time data. Biostatistics 2018, 19, 247–262.
18. Gijbels, I.; Veraverbeke, N.; Omelka, M. Conditional copulas, association measures and their applications. Comput. Stat. Data Anal. 2011, 55, 1919–1932.
19. Veraverbeke, N.; Omelka, M.; Gijbels, I. Estimation of a conditional copula and association measures. Scand. J. Stat. 2011, 38, 766–780.
20. Gijbels, I.; Omelka, M.; Veraverbeke, N. Multivariate and functional covariates and conditional copulas. Electron. J. Stat. 2012, 6, 1273–1306.
21. Derumigny, A.; Fermanian, J.D. Conditional empirical copula processes and generalized dependence measures. arXiv 2020, arXiv:2008.09480.
22. Leisen, F.; Dalla Valle, L.; Rossini, L. Bayesian Nonparametric Conditional Copula Estimation of Twin Data. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2018, 67, 523–548.
23. Wu, J.; Wang, X.; Walker, S.G. Bayesian nonparametric inference for a multivariate copula function. Methodol. Comput. Appl. Probab. 2014, 16, 747–763.
24. Lu, L.; Ghosh, S. Nonparametric estimation of multivariate copula using empirical Bayes methods. Mathematics 2023, 11, 4383.
25. Janssen, P.; Swanepoel, J.; Veraverbeke, N. Bernstein estimation for a copula derivative with application to conditional distribution and regression functionals. Test 2016, 25, 351–374.
26. Janssen, P.; Swanepoel, J.; Veraverbeke, N. A note on the asymptotic behavior of the Bernstein estimator of the copula density. J. Multivar. Anal. 2014, 124, 480–487.
27. Wasserman, L. Asymptotic inference for mixture models by using data-dependent priors. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2000, 62, 159–180.
28. Parrado-Hernández, E.; Ambroladze, A.; Shawe-Taylor, J.; Sun, S. PAC-Bayes bounds with data dependent priors. J. Mach. Learn. Res. 2012, 13, 3507–3531.
29. Schweizer, B.; Wolff, E.F. On nonparametric measures of dependence for random variables. Ann. Stat. 1981, 9, 879–885.
30. Segers, J.; Sibuya, M.; Tsukahara, H. The empirical beta copula. J. Multivar. Anal. 2017, 155, 35–51.
31. Haff, I.H.; Aas, K.; Frigessi, A. On the simplified pair-copula construction—Simply useful or too simplistic? J. Multivar. Anal. 2010, 101, 1296–1310.
32. Acar, E.F.; Genest, C.; Nešlehová, J. Beyond simplified pair-copula constructions. J. Multivar. Anal. 2012, 110, 74–90.
33. Stoeber, J.; Joe, H.; Czado, C. Simplified pair copula constructions—Limitations and extensions. J. Multivar. Anal. 2013, 119, 101–118.
34. Nagler, T.; Czado, C. Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. J. Multivar. Anal. 2016, 151, 69–89.
35. Schellhase, C.; Spanhel, F. Estimating non-simplified vine copulas using penalized splines. Stat. Comput. 2018, 28, 387–409.
36. Acar, E.F.; Craiu, R.V.; Yao, F. Statistical testing of covariate effects in conditional copula models. Electron. J. Stat. 2013, 7, 2822–2850.
37. Gijbels, I.; Omelka, M.; Pešta, M.; Veraverbeke, N. Score tests for covariate effects in conditional copulas. J. Multivar. Anal. 2017, 159, 111–133.
38. Gijbels, I.; Omelka, M.; Veraverbeke, N. Nonparametric testing for no covariate effects in conditional copulas. Statistics 2017, 51, 475–509.
39. Derumigny, A.; Fermanian, J.D. About tests of the “simplifying” assumption for conditional copulas. Depend. Model. 2017, 5, 154–197.
40. Kurz, M.S.; Spanhel, F. Testing the simplifying assumption in high-dimensional vine copulas. arXiv 2017, arXiv:1706.02338.
41. Genest, C.; Nešlehová, J.G.; Rémillard, B. Asymptotic behavior of the empirical multilinear copula process under broad conditions. J. Multivar. Anal. 2017, 159, 82–110.
42. Janssen, P.; Swanepoel, J.; Veraverbeke, N. Large sample behavior of the Bernstein copula estimator. J. Stat. Plan. Inference 2012, 142, 1189–1197.
43. Kiriliouk, A.; Segers, J.; Tsukahara, H. On some resampling procedures with the empirical beta copula. arXiv 2019, arXiv:1905.12466.

Figure 1. The contour plots of the Monte Carlo average of the estimated

C_{x}^{#} (u_{1}, u_{2})

given

x = 0.5

,

x = 1

,

x = 1.5

, and

x = 2

, respectively.

Figure 2. The plots of the estimated conditional Kendall’s tau and conditional Spearman’s rho as a function of the covariate.

Figure 3. Life expectancy data.

Figure 4. Estimated conditional Kendall’s tau as a function of log₁₀(GDP).

Table 1. Comparison of the proposed method (referred to as “ECBC-based”) to the local likelihood method (referred to as “Local”) using Monte Carlo estimates of three performance measures, IBIAS², IVAR and IMSE. Data are generated from the Clayton copula under two different functional relationships between the copula parameter and the covariate.

Clayton copula: $θ = \exp (0.8 X - 2)$
Estimation Method	Parametric Model	IBIAS² ( $\times 10^{- 2}$ )	IVAR ( $\times 10^{- 2}$ )	IMSE ( $\times 10^{- 2}$ )
Local	Clayton	0.017	0.553	0.570
Local	Gumbel	3.704	1.716	5.389
ECBC-based	N/A	0.323	2.569	2.892
Clayton copula: $θ = \exp (2 - 0.3 {(X - 4)}^{2})$
Estimation Method	Parametric Model	IBIAS² ( $\times 10^{- 2}$ )	IVAR ( $\times 10^{- 2}$ )	IMSE ( $\times 10^{- 2}$ )
Local	Clayton	0.040	0.288	0.328
Local	Gumbel	4.808	1.301	6.109
ECBC-based	N/A	0.855	1.876	2.731

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
