Likelihood Inference for Factor Copula Models with Asymmetric Tail Dependence

Joe, Harry; Li, Xiaoting

doi:10.3390/e26070610

by Harry Joe * and Xiaoting Li

Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada

* Author to whom correspondence should be addressed.

Entropy 2024, 26(7), 610; https://doi.org/10.3390/e26070610

Submission received: 12 June 2024 / Revised: 13 July 2024 / Accepted: 16 July 2024 / Published: 19 July 2024

(This article belongs to the Special Issue Bayesianism)

Abstract

For multivariate non-Gaussian models involving copulas, likelihood inference is dominated by the data in the middle, and fitted models might not be very good for joint tail inference, such as assessing the strength of tail dependence. When preliminary data and likelihood analysis suggest asymmetric tail dependence, a method is proposed to improve extreme value inferences based on the joint lower and upper tails. A prior that uses previous information on tail dependence can be used in combination with the likelihood. Combining the prior and the likelihood (which in practice has some degree of misspecification) gives a tilted log-likelihood; inferences with suitably transformed parameters can then be based on Bayesian computing methods or on numerical optimization of the tilted log-likelihood to obtain the posterior mode and the Hessian at this mode.

Keywords:

copula; latent variable; likelihood; prior; Bayesian computing; numerical optimization

1. Introduction

Dependence models with multivariate copulas have had many applications in the past two decades to handle non-Gaussian dependence; in particular, for applications such as risk analysis where variables can have more dependence in the joint tails than with Gaussian dependence with the same strength of central dependence.

When pairwise scatterplots of variables suggest lower and upper tail dependence, possibly asymmetric in the strength in the joint lower tails (extreme of lower quadrant) versus the strength in the joint upper tails (extreme of upper quadrant), several different parametric copula families with tail dependence are among the best based on information criteria such as the Akaike information criterion (AIC). However, model-based bivariate lower and upper tail dependence measures can be quite different for these different parametric copulas, and the comparisons of lower and upper tail dependence measures might not match the visual comparisons on the pairwise scatterplots. This is because likelihood methods are strongly influenced by data in the middle (rather than the extremes), and all simple parametric models have some degree of misspecification.

For univariate distributions, it is well known that inferences involving large quantiles should not be based on a fitted parametric distribution because extrapolation is not reliable when the data values in the middle have the most influence in the parameter estimates. There are two approaches for univariate inferences involving extremes: (a) from univariate extreme value theory with the assumption of a well-behaved tail density, the peaks-over-threshold method based on generalized Pareto distribution [1] can be used, or (b) splicing models [2] can be used with different flexible densities for the body and tail, if inferences are also needed for non-extremes. For the joint tail region, there is a multivariate Pareto approach such as that in [3], but there is no convenient way to combine with a density for the body.

The goal in this article is to propose a method that incorporates “prior” information on the relations of bivariate lower/upper tail dependence pairs, thereby placing more weight on joint extreme observations when estimating the dependence parameters of the multivariate copula; the splicing of densities for the body and the joint tails is avoided. This approach should lead to parameter estimates of copula dependence parameters with more reliable inference for tail dependence and other tail-based quantities.

How different parametric copula models lead to quite different tail inferences is illustrated with some financial returns data over a few consecutive years. Consider the financial returns for different market indexes or stocks in the same sector of a market; for dependence analysis, commonly, a copula-GARCH model (see [4]) is applied to GARCH-filtered returns. Pairwise normal scores plots after rank transform to

N (0, 1)

show tail dependence with the clouds of points being sharper than the elliptical shape in the extreme lower and upper quadrants. Often, there appears to be a stronger dependence in the joint lower tail than in the joint upper tail.

When different flexible parametric multivariate copula families, such as vine and factor copula models, are fit to multivariate GARCH-filtered returns, the best-fitting models based on AIC imply lower and upper tail dependence for any pair of returns. This is based on the results of [5,6], which imply that if bivariate copulas, for pairs of variables in the first tree of the vine or with a variable linked to a latent variable, have lower and upper tail dependence, then the bivariate copulas of all pairs of variables have lower and upper tail dependence. Note that factor copulas are vine copulas that include latent variables.

However, if model-based tail dependence parameters are computed based on the best few fitted models, they can be quite different among the models and sometimes may not match what is seen in the normal score plots. For example, (a) sometimes, a model-based lower tail dependence parameter may be closer to 0 than expected based on the plot, or (b) the model-based lower tail dependence parameter may be smaller than the model-based upper tail dependence parameter, in contrast to the visual inspection of the plot.

With the non-parametric method for empirical tail dependence measures in [7], it is possible to compare empirical and model-based lower and upper tail dependence to show quantitatively that model-based measures might not be reliable for all bivariate margins. This is because the fit of parametric multivariate models based on likelihood tends to be dominated by the data in the middle of the distribution. Inference concerning the middle (e.g., medians and non-extreme marginal orthant probabilities) can be reliable but not necessarily inference concerning the extremes (e.g., extreme marginal orthant probabilities or multivariate quantiles of the form defined in [8]).

This article shows the use of a tilted likelihood to estimate parameters of the 1-factor copula so that inferences in the joint tails are improved. The 1-factor copula for d variables has a vector parameter

ϑ_{j}

for the bivariate linking copula of the jth variable and the latent variable (the latter explains the joint dependence of the observed variables). The tilting depends on the nature of the variables. For d GARCH-filtered stock returns for stocks in the same sector, the dependence parameters

{ϑ_{j} : 1 \leq j \leq d}

can be considered a sample from a super-population so that it is reasonable to assume a common prior distribution for the

ϑ_{j}

. The tilted log-likelihood is based on the sum of the 1-factor copula log-likelihood and the logarithm of this prior density that is based on tail dependence summaries from some “previous data”.

Section 2 gives a numerical data example as a preliminary, to show explicitly why likelihood inference can be inadequate for tail inference; tail dependence parameters are defined, and examples of normal score plots are given in that section. Section 3 and Section 4 contain the theory and numerical methods to develop a “prior” to help with tail inference for the 1-factor copula model with asymmetric tail-dependent copulas linking to the latent variable. Section 5 illustrates the theory for a data example with GARCH-filtered stock returns from stocks in an S&P sector to show improved tail inference. Section 6 has some simulation results to compare with the data example. Section 7 concludes with a discussion on the generality of the approach proposed for the 1-factor model; the basis is a “super-population” assumption for some bivariate margins with lower and upper tail dependence. The background results for tail dependence, copulas, and factor models are given in Appendix A.

2. Numerical Data Example to Illustrate Discrepancy for Tail Inference

In this section, a numerical low-dimensional data example is used to clarify what is meant by the possible poor joint tail inference following maximum likelihood.

Definitions of bivariate tail dependence and the copula as summaries of dependence are presented to explain concepts of dependence in joint tails.

Let

F_{1 : d}

be an absolutely continuous d-variate distribution with univariate margins

F_{1}, \dots, F_{d}

and copula

C_{1 : d}

such that

F_{1 : d} = C_{1 : d} (F_{1}, \dots, F_{d})

. Let

Y = (Y_{1}, \dots, Y_{d}) \sim F_{1 : d}

.

For the bivariate margin

F_{j k} = C_{j k} (F_{j}, F_{k})

with

j \neq k

, the probabilistic version of the lower and upper tail dependence parameters is:

\begin{aligned} λ_{j k, L} &= \lim_{u \to 0^{+}} \Pr (Y_{j} \leq F_{j}^{-1} (u) \mid Y_{k} \leq F_{k}^{-1} (u)) \\ &= \lim_{u \to 0^{+}} u^{-1} \Pr (Y_{j} \leq F_{j}^{-1} (u), Y_{k} \leq F_{k}^{-1} (u)) = \lim_{u \to 0^{+}} C_{j k} (u, u) / u, \\ λ_{j k, U} &= \lim_{u \to 0^{+}} \Pr (Y_{j} \geq F_{j}^{-1} (1 - u) \mid Y_{k} \geq F_{k}^{-1} (1 - u)) \\ &= \lim_{u \to 0^{+}} u^{-1} \Pr (Y_{j} \geq F_{j}^{-1} (1 - u), Y_{k} \geq F_{k}^{-1} (1 - u)) \\ &= \lim_{u \to 0^{+}} {\bar{C}}_{j k} (1 - u, 1 - u) / u, \end{aligned}

where

{\bar{C}}_{j k} (u_{j}, u_{k}) = 1 - u_{j} - u_{k} + C_{j k} (u_{j}, u_{k})

for

0 \leq u_{j}, u_{k} \leq 1

.
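As a numerical illustration of these limits, the following minimal sketch evaluates the ratios C(u, u)/u and \bar{C}(1 - u, 1 - u)/u at a small u for a bivariate BB1 copula and compares them with the limiting values. It assumes the standard BB1 form with θ > 0 and δ ≥ 1 and the standard closed-form tail dependence parameters 2^{-1/(δθ)} and 2 - 2^{1/δ}, which are expected (but not checked here) to agree with Appendix A.1; the parameter values and function names are illustrative only.

```python
import math

def bb1_cdf(u, v, theta, delta):
    """BB1 copula cdf in its standard form (assumes theta > 0, delta >= 1)."""
    x = (u ** -theta - 1.0) ** delta
    y = (v ** -theta - 1.0) ** delta
    return (1.0 + (x + y) ** (1.0 / delta)) ** (-1.0 / theta)

def lower_tail_ratio(cdf, u):
    """C(u, u) / u, whose limit as u -> 0+ is the lower tail dependence parameter."""
    return cdf(u, u) / u

def upper_tail_ratio(cdf, u):
    """Cbar(1-u, 1-u) / u with Cbar(a, b) = 1 - a - b + C(a, b)."""
    return (2.0 * u - 1.0 + cdf(1.0 - u, 1.0 - u)) / u

# Illustrative parameter values (not taken from the data example).
theta, delta = 0.5, 1.5
cdf = lambda a, b: bb1_cdf(a, b, theta, delta)

lam_L_limit = 2.0 ** (-1.0 / (delta * theta))   # closed-form lower tail dependence
lam_U_limit = 2.0 - 2.0 ** (1.0 / delta)        # closed-form upper tail dependence
lam_L_num = lower_tail_ratio(cdf, 1e-6)         # finite-u approximation
lam_U_num = upper_tail_ratio(cdf, 1e-6)         # finite-u approximation
```

For the reflected copula BB1r, the roles of the lower and upper tails are exchanged.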

Consider a random sample from

F_{1 : d}

with

y_{i} = (y_{i 1}, \dots, y_{i d})

for

i = 1, \dots, n

. Because

λ_{j k, L}

and

λ_{j k, U}

are limiting quantities (as

u \to 0^{+}

), there are no direct empirical (data) versions. For the numerical examples in this section and later sections, the sample version comes from a limit of tail-weighted dependence measures.

A general reference for concepts (in the above and in later sections) with copulas and dependence is [9], and the estimator of tail dependence from a limit of tail-weighted dependence measures is given in [7]. For the probabilistic version, the tail-weighted dependence measures are indexed by a parameter

α > 1

and the limit, as

α \to \infty

is the tail dependence parameter. After computing the empirical tail-weighted dependence measure for a grid of

α

values, typically in the interval

[10, 20]

, a regression model is fit for the empirical measure versus a power of

α^{- 1}

, and then the tail dependence parameter is estimated as the extrapolation with

α^{- 1} \to 0

.
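The extrapolation step can be sketched as follows. This is only an illustration of regressing on a power of α^{-1} and reading off the intercept: the tail-weighted measure of [7] is not implemented, the input values are synthetic, and the power 1/2 is an arbitrary assumption made for the example.

```python
def extrapolate_tail_dependence(alphas, zetas, power=0.5):
    """Least-squares fit of zeta = a + b * alpha**(-power); returns the intercept a,
    i.e., the fitted value as alpha**(-1) -> 0. The power is an assumed input."""
    xs = [a ** -power for a in alphas]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(zetas) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, zetas))
    slope = sxy / sxx
    return ybar - slope * xbar

# Grid of alpha values in [10, 20], as in the text.
alphas = [10 + k for k in range(11)]
# Synthetic measure values (hypothetical, not real data) with limit 0.40.
zetas = [0.40 + 0.30 * a ** -0.5 for a in alphas]
lam_hat = extrapolate_tail_dependence(alphas, zetas, power=0.5)
```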

The data example involves GARCH-filtered stock returns with all stocks in the same sector. Appendix A.4 has some background for GARCH time series and copula-GARCH models.

The S&P 500 data set of GARCH-filtered stock returns (January 2013 to December 2015, good economic conditions) used for illustration is analyzed in [10]. The sample consists of

n = 754

days. For the finance sector, the initial descriptive analyses are based on 10 stocks chosen from the 64 available; the ticker symbols of the 10 stocks are COF, RJF, SCHW, FRC, GL, FD, TROW, GS, BLK, and ICE. Normal score plots of GARCH-filtered returns for a few pairs amongst these 10 stocks are given in Figure 1 (see Appendix A.3 for the mathematical definition of the transform). They show tail dependence, with the clouds of points being sharper than the elliptical shape and having a stronger correlation in the lower quadrant than in the upper quadrant.

These few stocks are used to demonstrate (in small tables) a typical situation of differences in empirical and model-based tail quantities. To check that a 1-factor dependence structure is reasonable, the non-parametric transform to normal scores is applied to GARCH-filtered returns, and factor analysis (see [11]) is applied to the resulting correlation matrix. The loadings are, respectively, 0.741, 0.802, 0.838, 0.475, 0.688, 0.821, 0.665, 0.609, 0.690, 0.830. The average absolute difference between the empirical and model-based correlation matrix is 0.03, and the maximum absolute difference between the empirical and model-based correlation matrix is 0.21 (with two discrepancies with an absolute difference

> 0.10

), so the 1-factor structure is reasonable as a first-order approximation when considering that a

10 \times 10

matrix with 45 correlations is approximated by a simple correlation matrix with 10 parameters. With a larger dimension (more stocks in the same sector), a 1-factor model with some weak conditional dependence (see [12]) could be a better dependence model.
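For reference, the model-based correlation matrix of a 1-factor structure is obtained from the loadings as the product of the loadings of the two variables, with ones on the diagonal. A minimal sketch using the loadings reported above (the variable names are illustrative):

```python
# Loadings of the 1-factor analysis reported in the text (10 stocks).
loadings = [0.741, 0.802, 0.838, 0.475, 0.688, 0.821, 0.665, 0.609, 0.690, 0.830]
d = len(loadings)

# Model-based correlation: product of loadings off the diagonal, 1 on the diagonal.
R_model = [
    [1.0 if j == k else loadings[j] * loadings[k] for k in range(d)]
    for j in range(d)
]

# Number of distinct correlations approximated by the d loadings.
n_pairs = d * (d - 1) // 2

# Range of the implied pairwise correlations.
off_diag = [R_model[j][k] for j in range(d) for k in range(j + 1, d)]
smallest, largest = min(off_diag), max(off_diag)
```

Comparing `R_model` entry by entry with the empirical correlation matrix of the normal scores gives the average and maximum absolute differences quoted above; the empirical matrix itself is not reproduced here.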

Two parametric copula models are fitted to account for non-Gaussian dependence—1-factor with

d = 10

linking copulas that are all BB1, or all reflected BB1 (abbreviated as BB1r). These are referred to briefly as 1-factor BB1 and 1-factor BB1r. The details of these models are summarized in Appendix A.1 and Appendix A.2; in particular, Appendix A.1 has the definition of the 2-parameter bivariate BB1 copula and some of its dependence properties, and Appendix A.2 has the definition of the 1-factor copula for d variables based on conditional independence of observed variables given a latent variable.

Table 1 has empirical and model-based lower and upper tail dependence measures:

{\hat{λ}}_{j k, L}, λ_{j k, L} (\hat{ϑ})

,

{\hat{λ}}_{j k, U}, λ_{j k, U} (\hat{ϑ})

. Model-based values are based on maximum likelihood estimates (MLEs) with 1-factor BB1r and 1-factor BB1. Table 2 has an empirical Spearman rank correlation as a central measure of dependence:

{\hat{ρ}}_{j k, S}

. The values of

ρ_{j k, S} (\hat{ϑ})

for the two 1-factor copula models are quite close to the empirical values compared with some discrepancies for tail dependence measures. Table 3 has summaries, averaged over

d (d - 1) / 2 = 45

bivariate margins. Table 1 and Table 3 show that tail inferences from different models with lower and upper tail dependence can be quite different, but the models can have similar inferences for central quantities.

The tail asymmetry of financial returns, with commonly more dependence in the joint lower tail than in the joint upper tail, is explained and discussed in [13,14]. The 1-factor BB1r model has a smaller AIC value than 1-factor BB1, and it better matches the empirical property of lower tail dependence, being often larger than upper tail dependence. However, the 1-factor BB1r model tends to overestimate the difference in the lower and upper tail dependence, and the 1-factor BB1 model tends to underestimate the difference in lower and upper tail dependence. This motivates the tilted likelihood in Section 3 with an appropriate “prior” so that model-based tail dependence measures are closer to empirical counterparts.

It has been observed in many data examples (see [15] and Chapter 7 of [9]) that model-based assessment of tail dependence may not be accurate. The more recent development of tail-weighted dependence measures in [16] allows for better assessment on the reliability of a parametric copula model for tail inferences, by comparing empirical and model-based directional tail-weighted measures.

3. Tilted Likelihood for 1-Factor Copula Model with Tail Dependence

This section has a modified log-likelihood using a prior based on previous data for tail dependence parameters in a 1-factor copula model (as given in Appendix A.2). The starting point is the copula-based log-likelihood after univariate margins have been estimated.

Assume that satisfactory univariate models

{\hat{F}}_{j}

(

1 \leq j \leq d

) have been fit to the random sample

{(y_{i 1}, \dots, y_{i d}) : i = 1, \dots, n}

and that the data are then transformed to the uniform scale as

{u_{i} = (u_{i 1}, \dots, u_{i d}) : i = 1, \dots, n}

with

u_{i j} = {\hat{F}}_{j} (y_{i j})

in the interval

(0, 1)

.

We consider mainly inference on dependence parameters for the data in the transformed uniform scale, considered as a realization of a random sample

{U_{i}}

for a copula cumulative distribution function (cdf)

C_{U} (\cdot; ϑ)

, where

ϑ = (ϑ_{1}, \dots, ϑ_{d})

. The log-likelihood for a random sample of size n is:

L (ϑ) = L (ϑ_{j} : j = 1, \dots, d) = \sum_{i = 1}^{n} log c_{U} (u_{i}; ϑ_{1}, \dots, ϑ_{d}) .

(1)

For the 1-factor copula based on BB1r (and other) linking copulas, there are lower bounds on components of the 2-dimensional

ϑ_{j}

.

For likelihood inference, there is invariance to 1-1 transforms of

ϑ_{j}

to

η_{j}

, with the latter being functions of lower and upper tail dependence parameters. Specifically,

η_{j} = (η_{1 j}, η_{2 j}) = (log [λ_{j L} / (1 - λ_{j L})], log [λ_{j U} / (1 - λ_{j U})])

with

λ_{j L}, λ_{j U}

being the lower and upper tail dependence parameters for the bivariate copula linking variable j to the latent variable. Note that

η_{j}

is unbounded. The tilted log-likelihood or log “posterior” is:

\tilde{L} (η_{1}, \dots, η_{d}) = \sum_{i = 1}^{n} log c_{U} (u_{i}; η_{1}, \dots, η_{d}) + \sum_{j = 1}^{d} log f_{H} (η_{j})

(2)

where the density

f_{H}

does not depend on j. The above is called the tilted log-likelihood because the goal is to obtain parameter estimates that put less weight in the middle of the data space and more weight in the tails based on “prior” expected behavior of how the lower and upper tail dependence parameters are related.
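The structure of (2) can be sketched in a few lines. The sketch below only assembles the two sums; the copula log-likelihood (the first sum in (2)) is passed in as a callable and is replaced by a placeholder in the example, and the bivariate normal form of f_{H} is an assumption at this point of the text. The numerical values of the prior parameters are those of case (4) in Section 4; all function names are hypothetical.

```python
import math

def log_bvn_density(x, mu, Sigma):
    """Log density of a bivariate normal distribution at the point x."""
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    s00, s01, s11 = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    det = s00 * s11 - s01 * s01
    quad = (s11 * dx0 * dx0 - 2.0 * s01 * dx0 * dx1 + s00 * dx1 * dx1) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def tilted_loglik(etas, copula_loglik, mu, Sigma):
    """Copula log-likelihood plus the sum over j of log f_H(eta_j), as in (2).
    etas is a list of d pairs (eta_1j, eta_2j)."""
    return copula_loglik(etas) + sum(log_bvn_density(e, mu, Sigma) for e in etas)

# Prior parameters of case (4); the copula log-likelihood is a placeholder.
mu = (0.20, -0.38)
Sigma = ((0.163, 0.134), (0.134, 0.231))
placeholder_loglik = lambda etas: 0.0   # stands in for the first sum in (2)
etas = [mu, mu]                         # two variables, both at the prior mean
value = tilted_loglik(etas, placeholder_loglik, mu, Sigma)
```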

With the appropriate transformation, the prior can be taken as multivariate normal. For bivariate BB1r or BB1,

η_{j}

is 2-dimensional, and

f_{H}

is assumed to be bivariate normal. The latter is reasonable if the form of

η_{j}

is chosen so that (2) is closer to a quadratic in a neighborhood of its mode. Asymptotic likelihood theory (see [17]) implies that the log-likelihood is quadratic in a neighborhood of the mode, as

n \to \infty

, but the adequacy of the approximation for moderate sample size n depends on the transform.

The justification of “independent” prior densities for different variables is based on some empirical checks for the 1-factor copula construction with different bivariate linking copulas (with or without tail dependence). The inverse Hessian (roughly the covariance matrix of the sampling distribution of the MLE) of the negative log-likelihood in (1) for the 1-factor copula is close to block diagonal, with a block for each

η_{j}

. The product form of the “prior” is based on an assumption of a “super-population” for the variables linked to the latent variable (e.g., stocks in a market sector). The density

f_{H}

can be considered a frequency density of

η_{j}

values over a large “super-population”.

A method is described in Section 4 to decide on choices for

f_{H}

.

Similar ideas to the tilted log-likelihood have been used to obtain an adjusted log-likelihood that corrects some undesirable behavior of the MLE, given in [18,19]. There are also some connections with variational Bayes inference such as when the posterior density is assumed to be approximated by a multivariate Gaussian density after a suitable transform so that parameters are unconstrained. However, with copula applications [20,21], parsimonious and possibly unrealistic assumptions are made for the covariance matrix (such as diagonal or factor structure) of the Gaussian density. The optimization involves a Kullback–Leibler divergence of the Gaussian approximation and the posterior. This differs from optimizing (2) with no constraints on the form of the Hessian matrix at the mode.

Numerical Optimization for Posterior Mode and Hessian at Mode

The tilted log-likelihood has the penalized log-likelihood as an analogy so that standard numerical optimization methods can be used for estimating the mode and its Hessian.

The tilted log-likelihood in (2) and its log-likelihood counterpart in (1) are functions of

2 d

parameters for the 1-factor BB1r copula with d variables. For the log-likelihood, Ref. [22] discusses an efficient numerical procedure where the log-likelihood, gradient, and Hessian are analytically derived and coded in Fortran90, and all integrals are evaluated via Gauss–Legendre quadrature (see [23]).

The code is modified to handle the transform from BB1 parameters

(θ_{j}, δ_{j})

to

(η_{1 j}, η_{2 j})

, and this requires care in using the chain rule for partial derivatives. The code for (2) and its gradient and Hessian are inputted into an efficient modified Newton–Raphson algorithm, as summarized in Section 6.2 of [9]. This leads to much faster computations than coding the negative of (2) in R and using a quasi-Newton method for numerical minimization based on numerical gradients and Hessians because many more iterations are needed compared with the modified Newton–Raphson. With the use of Fortran90 (for loops), analytic derivatives, and modified Newton–Raphson iterations, the time to deduce the posterior mode is decreased by a factor larger than 20 for

2 d = 40

parameters. Without the increased speed, the simulation study reported in Section 6 would take too much time. Also numerical optimization with the quasi-Newton method performs much worse as the number of parameters increases beyond 40.
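The change of parametrization itself can be sketched without the derivatives. The following assumes the standard BB1 tail dependence formulas λ_L = 2^{-1/(δθ)} and λ_U = 2 - 2^{1/δ}, which are expected to correspond to those of Appendix A; for BB1r, the roles of the lower and upper tail are exchanged. The function names are hypothetical, and this is not the Fortran90 code described above.

```python
import math

def bb1_to_eta(theta, delta):
    """(theta, delta) of BB1 -> logits of the lower and upper tail dependence."""
    lam_L = 2.0 ** (-1.0 / (delta * theta))
    lam_U = 2.0 - 2.0 ** (1.0 / delta)
    return (math.log(lam_L / (1.0 - lam_L)), math.log(lam_U / (1.0 - lam_U)))

def eta_to_bb1(eta1, eta2):
    """Inverse map: logits of (lam_L, lam_U) -> (theta, delta) of BB1."""
    lam_L = 1.0 / (1.0 + math.exp(-eta1))
    lam_U = 1.0 / (1.0 + math.exp(-eta2))
    # From lam_U = 2 - 2**(1/delta):
    delta = math.log(2.0) / math.log(2.0 - lam_U)
    # From lam_L = 2**(-1/(delta*theta)) and log(2)/delta = log(2 - lam_U):
    theta = math.log(2.0 - lam_U) / (-math.log(lam_L))
    return theta, delta

# Round trip for illustrative parameter values.
eta = bb1_to_eta(0.5, 1.5)
theta_back, delta_back = eta_to_bb1(*eta)
```

Because any real pair (η_1, η_2) maps to θ > 0 and δ > 1, the optimization over η is unconstrained.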

With the negative Hessian at the mode of the tilted log-likelihood, the inverse Hessian can be used to obtain interval estimates for functions of the parameters.
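As a hedged illustration of such an interval, suppose that the posterior mode of one component η and its standard deviation (the square root of the corresponding diagonal entry of the inverse negative Hessian) are available. Since the tail dependence parameter is a monotone function of η, an approximate interval can be obtained by transforming the endpoints. The numerical inputs below are hypothetical.

```python
import math

def expit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))

def tail_dependence_interval(eta_hat, sd_eta, z=1.96):
    """Approximate interval for lambda = expit(eta) from a normal
    approximation for eta, by transforming the interval endpoints."""
    lo, hi = eta_hat - z * sd_eta, eta_hat + z * sd_eta
    return expit(lo), expit(eta_hat), expit(hi)

# Hypothetical posterior mode and standard deviation for one eta component.
lower, point, upper = tail_dependence_interval(0.20, 0.15)
```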

4. Closer Match of Empirical and Model-Based Tail Dependence

Suppose diagnostic plots suggest tail dependence for all pairs of variables. Maximum likelihood estimation with a parametric copula might not provide good model-based estimates of tail dependence parameters or reliable inferences for tail-based quantities. In this section, a least squares method is used to obtain parameter estimates for the 1-factor copula that will make the empirical and model-based tail dependence parameters closer to each other. That is, there is an objective function to find copula parameters to better match model-based and empirical tail dependence parameters.

Let

ϑ

be the vector of all parameters

(ϑ_{1}, \dots, ϑ_{d})

. The jth component is

ϑ_{j} = (θ_{j}, δ_{j})

for the 1-factor BB1r or BB1 copula; see Appendix A.1 for the parametric BB1 family. The steps below assume that the 1-factor BB1r has lower AIC than 1-factor BB1 (empirical evidence from many applications of 1-factor copulas to GARCH-filtered stock returns).

1. Minimize the negative of the log-likelihood $L (ϑ)$ in (1) to get the MLE $\hat{ϑ}$.
2. Get the empirical matrices of lower tail dependence ${\hat{λ}}_{j k, L}$, upper tail dependence ${\hat{λ}}_{j k, U}$, and central dependence (Spearman's rho) ${\hat{ρ}}_{j k, S}$.
3. Minimize

$\begin{aligned} S (ϑ) = \frac{1}{2 d (d - 1)} \Big\{ & \sum_{j < k} {[{\hat{λ}}_{j k, L} - λ_{j k, L} (ϑ_{j}, ϑ_{k})]}^{2} + \sum_{j < k} {[{\hat{λ}}_{j k, U} - λ_{j k, U} (ϑ_{j}, ϑ_{k})]}^{2} \\ & + \sum_{j \neq k} {[{\hat{ρ}}_{j k, S} - ρ_{j k, S} (ϑ_{j}, ϑ_{k})]}^{2} \Big\} \end{aligned}$

(3)

with $\hat{ϑ}$ as the starting point. Let the result be $\tilde{ϑ}$.
4. Convert ${\tilde{ϑ}}_{j}$ to ${\tilde{λ}}_{j V, L} = λ_{j V, L} ({\tilde{ϑ}}_{j})$, ${\tilde{λ}}_{j V, U} = λ_{j V, U} ({\tilde{ϑ}}_{j})$ as defined in (A6), using the BB1r linking copula for variable j and the latent variable V.
5. Transform to values in $(- \infty, \infty)$: $log [{\tilde{λ}}_{j V, L} / (1 - {\tilde{λ}}_{j V, L})]$ and $log [{\tilde{λ}}_{j V, U} / (1 - {\tilde{λ}}_{j V, U})]$ for $j = 1, \dots, d$.
6. Get the sample mean vector and covariance matrix for a sample of size d for the two transformed $\tilde{λ}$’s. The mean vector and covariance matrix are used as parameters for the bivariate normal prior $f_{H}$ in (2). For the tilted likelihood, use the parametrization

$η_{j} = (log [λ_{j V, L} / (1 - λ_{j V, L})], log [λ_{j V, U} / (1 - λ_{j V, U})])$

for $j = 1, \dots, d$.
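The last steps of the procedure (logit transform, then sample mean vector and covariance matrix) can be sketched as below. The least squares fit in (3) is not implemented; the input tail dependence values are hypothetical, and the divisor d - 1 in the sample covariance is an assumption, since the text does not state which divisor is used.

```python
import math

def logit(p):
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def prior_parameters(lam_L, lam_U):
    """Sample mean vector and covariance matrix (divisor d - 1, an assumption)
    of the logit-transformed lower and upper tail dependence parameters."""
    e1 = [logit(p) for p in lam_L]
    e2 = [logit(p) for p in lam_U]
    d = len(e1)
    m1, m2 = sum(e1) / d, sum(e2) / d
    s11 = sum((a - m1) ** 2 for a in e1) / (d - 1)
    s22 = sum((b - m2) ** 2 for b in e2) / (d - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(e1, e2)) / (d - 1)
    return (m1, m2), ((s11, s12), (s12, s22))

# Hypothetical tail dependence values for d = 4 linking copulas.
lam_L = [0.55, 0.60, 0.48, 0.52]
lam_U = [0.40, 0.45, 0.35, 0.38]
mu, Sigma = prior_parameters(lam_L, lam_U)
```

The returned `mu` and `Sigma` play the role of the parameters of the bivariate normal density f_{H}.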

The data set mentioned in Section 2 as used in [10] has 64 stocks in the finance sector, 21 stocks in the energy sector and 60 stocks in the health sector of S&P 500 (years 2013–2015). The above procedure is applied to 20 random stocks from the finance sector, 10 random stocks from the energy sector, and 20 random stocks from the health sector. Below in (4) to (6) are the mean vector and covariance matrix for

f_{H}

for three cases:

μ_{1} = \begin{pmatrix} 0.20 \\ - 0.38 \end{pmatrix}, \quad Σ_{1} = \begin{pmatrix} 0.163 & 0.134 \\ 0.134 & 0.231 \end{pmatrix}, \quad ρ_{1} = 0.691;

(4)

μ_{2} = \begin{pmatrix} 0.04 \\ - 0.74 \end{pmatrix}, \quad Σ_{2} = \begin{pmatrix} 0.277 & 0.390 \\ 0.390 & 0.830 \end{pmatrix}, \quad ρ_{2} = 0.813;

(5)

μ_{3} = \begin{pmatrix} - 0.30 \\ - 1.05 \end{pmatrix}, \quad Σ_{3} = \begin{pmatrix} 0.121 & 0.140 \\ 0.140 & 0.365 \end{pmatrix}, \quad ρ_{3} = 0.666 .

(6)

They are used as the parameters of three bivariate normal distributions. The three cases are used in subsequent sections to allow a sensitivity analysis of the parameters in

f_{H}

.

All three cases in (4) to (6) indicate stronger tail dependence in the joint lower tail compared with the joint upper tail because of the larger value in the first component of

μ

. Of the three cases, the first example has the strongest expected lower tail dependence because of largest first component of

μ

. For the first two cases, the median lower tail dependence is larger than 0.5 because of the positive value. The median upper tail dependence is less than 0.5 for all three cases.
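These statements can be checked directly from (4) to (6): since the logit transform is monotone, the median of a tail dependence parameter under the normal prior is the inverse logit of the corresponding component of the mean vector. A short sketch (the dictionary layout and names are illustrative):

```python
import math

# Mean vectors and covariance matrices from (4), (5), and (6).
priors = {
    "(4)": {"mu": (0.20, -0.38), "Sigma": ((0.163, 0.134), (0.134, 0.231))},
    "(5)": {"mu": (0.04, -0.74), "Sigma": ((0.277, 0.390), (0.390, 0.830))},
    "(6)": {"mu": (-0.30, -1.05), "Sigma": ((0.121, 0.140), (0.140, 0.365))},
}

def expit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))

summary = {}
for label, p in priors.items():
    (m1, m2), ((s11, s12), (_, s22)) = p["mu"], p["Sigma"]
    median_L = expit(m1)                 # median lower tail dependence
    median_U = expit(m2)                 # median upper tail dependence
    rho = s12 / math.sqrt(s11 * s22)     # correlation implied by Sigma
    summary[label] = (median_L, median_U, rho)
```

The implied correlations agree with the reported values of ρ up to rounding of the covariance entries.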

5. Data Example with Prior and Tilted Likelihood

This section summarizes the application of the tilted log-likelihood for GARCH-filtered stock returns. Initially, three 1-factor copula constructions, with BB1, BB1r and BB7 bivariate linking copulas to the latent variable, were fitted with maximum likelihood for different subsets of stocks. Here, as is common from many empirical applications, the 1-factor copula based on BB1r is best, based on the AIC.

The tilted log-likelihood in (2) was then used for analysis of random subsets of stocks from the finance, energy and health sectors; these were different subsets from those used to determine the prior parameters (4)–(6). The qualitative conclusions are similar for different random subsets, so below we report details for one case of 20 randomly chosen finance stocks, considered one representative application of the theory in the preceding sections.

The numerical details below are based on 20 stocks with the tickers LNC, PGR, MMC, C, KEY, CBOE, BK, BEN, AXP, ALL, BAC, RF, AFL, ZION, DFS, CMA, MCO, GL, TRV, and BRK-B, used to determine the prior

f_{H}

, and 20 other stocks with the tickers L, CME, MTB, MKTX, AIZ, MET, SCHW, FITB, STT, HBAN, PFG, BLK, SPGI, CB, COF, TFC, WRB, JPM, FRCB, and ICE for applying the procedure in Section 3 and Section 4.

Inferences for tail dependence are compared for five cases below with summaries in Table 4.

1-factor BB1r, $f_{H}$ based on finance sector stocks.
1-factor BB1r, $f_{H}$ based on energy sector stocks.
1-factor BB1r, $f_{H}$ based on health sector stocks.
1-factor BB1, $f_{H}$ based on finance sector stocks.
1-factor BB7, $f_{H}$ based on finance sector stocks.

The “parameters” of

f_{H}

are

(log [λ_{L} / (1 - λ_{L})], log [λ_{U} / (1 - λ_{U})])

and have different transformations to the parameters

(θ, δ)

of BB1r, BB1, BB7.

Table 4 shows that for BB1r, there is little sensitivity to the three priors (4)–(6). However, the worse fitting 1-factor BB1 and 1-factor BB7 models (based on last column of Table 4) do not lead to better matching with empirical dependence measures using the prior in (4). Overall, these latter two models fit worse in the middle of the data space, leading to smaller values for (2) at the mode.

For 1-factor BB1 with the tilted log-likelihood (2), we looked at the negative inverse Hessian (covariance matrix of the normal approximation) at the posterior mode for row 2 of Table 4. There is almost zero correlation between the parameters for different variable indices j (for different stocks). The inverse Hessian is too large to show in its entirety, but an extract of some entries is converted into standard deviations and correlations in Table 5 and Table 6.

Bayesian Computing with STAN

Results based on the prior in (2) were also obtained via Bayesian computing with STAN (Hamiltonian Monte Carlo). Estimation for a 1-factor copula model via the Hamiltonian Monte Carlo is shown in [24], but their inferences do not include asymmetric tail dependence.

In Bayesian inference, the parameter vector

Θ^{*}

consists of both the (transformed) copula dependence parameters

η = (η_{1}, η_{2}, \dots, η_{d})

and the latent variables

v = (v_{1}, v_{2}, \dots, v_{n})

in (A3). We assume a joint independent uniform prior distribution for the latent variables and a (product of) bivariate normal prior for the copula dependence parameters for the bivariate linking copulas. The prior density is given by

π_{Θ^{*}} (θ^{*}) = π_{V} (v) π_{H_{1}, \dots, H_{d}} (η) = \prod_{i = 1}^{n} I (0 < v_{i} < 1) \prod_{j = 1}^{d} f_{H} (η_{j}),

where the mean vector and covariance matrix of the bivariate normal prior

f_{H}

are given in (4)–(6). The “complete” likelihood function with the latent variables as parameters is

p_{U | Θ^{*}} (u_{1}, \dots, u_{n} ∣ θ^{*}) = \prod_{i = 1}^{n} p_{U_{i} | V_{i}, H_{1}, \dots, H_{d}} (u_{i} ∣ v_{i}, η_{1 : d}) = \prod_{i = 1}^{n} \prod_{j = 1}^{d} c_{j V} (u_{i j}, v_{i}; ϑ_{j} (η_{j})),

(7)

where

c_{j V}

is given in Appendix A.2. Since the Bayesian estimation treats the latent variables as additional parameters, the likelihood function consists of the conditional density function given the latent variables instead of the joint density function. The posterior density function of the parameters (up to a constant) is

π_{Θ^{*} | U} (θ^{*} ∣ u_{1}, \dots, u_{n}) \propto p_{U, Θ^{*}} (u, θ^{*}) = p_{U | Θ^{*}} (u_{1}, \dots, u_{n} ∣ θ^{*}) π_{Θ^{*}} (θ^{*}) .

To perform Bayesian inference on the (transformed) copula dependence parameters of the 1-factor model, we use the No-U-Turn sampler (NUTS) proposed by [25]. NUTS is an extension of the Hamiltonian Monte Carlo algorithm, implemented within the STAN framework developed by [26]. The 1-factor copula models with BB1 and reflected BB1 copulas are fitted to the GARCH-filtered returns in STAN. For the data example with results summarized in Table 5 and Table 6, the posterior statistics of

η

(including posterior means, standard deviations, and correlation matrix) are similar to the results obtained from maximizing the tilted likelihood function in (2). In comparison with Table 5, the median and maximum absolute differences are, respectively, (a) 0.006 and 0.033 for

μ_{η}

’s, (b) 0.002 and 0.014 for

σ_{η}

’s, and (c) 0.023 and 0.059 for

ρ

’s.

From (7), it is seen that the log posterior is (up to a constant) equal to:

\tilde{L} (η_{1}, \dots, η_{d}, v) = \sum_{i = 1}^{n} \sum_{j = 1}^{d} log c_{j V} (u_{i j}, v_{i}; ϑ_{j} (η_{j})) + \sum_{j = 1}^{d} log f_{H} (η_{j});

this is equivalent to the tilted log-likelihood in (2) after marginalizing over the latent variables

v

. Therefore, the two approaches should yield essentially the same result.
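The marginalization over a latent variable can be illustrated numerically in a case where the answer is known in closed form. The sketch below is not the BB1r model of this article: it assumes bivariate Gaussian linking copulas as a stand-in (for d = 2 the resulting 1-factor copula is the bivariate Gaussian copula with correlation equal to the product of the two linking correlations), and it replaces Gauss–Legendre quadrature with a trapezoid rule after the substitution v = Φ(z). All names and parameter values are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def link_density_normal_scale(x, z, rho):
    """Bivariate Gaussian copula density c(Phi(x), Phi(z); rho),
    written in terms of the normal scores x and z."""
    r2 = rho * rho
    expo = (2.0 * rho * x * z - r2 * (x * x + z * z)) / (2.0 * (1.0 - r2))
    return math.exp(expo) / math.sqrt(1.0 - r2)

def one_factor_density(u, rhos, n_grid=2001, z_max=8.0):
    """Integral over v in (0, 1) of prod_j c_jV(u_j, v), computed with
    v = Phi(z), dv = phi(z) dz, and a trapezoid rule on [-z_max, z_max]."""
    xs = [STD_NORMAL.inv_cdf(uj) for uj in u]
    h = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        z = -z_max + k * h
        weight = 0.5 * h if k in (0, n_grid - 1) else h
        integrand = STD_NORMAL.pdf(z)
        for x, rho in zip(xs, rhos):
            integrand *= link_density_normal_scale(x, z, rho)
        total += weight * integrand
    return total

# d = 2 check against the closed form with correlation rho_1 * rho_2.
u = (0.3, 0.7)
rhos = (0.8, 0.6)
marginal = one_factor_density(u, rhos)
x1, x2 = (STD_NORMAL.inv_cdf(uj) for uj in u)
direct = link_density_normal_scale(x1, x2, rhos[0] * rhos[1])
```

Summing the logarithm of such a marginal density over the observations gives the first term of (2), whereas the sampler works with the integrand and the latent variables directly.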

With a flat prior on the

η_{j}

, the posterior estimates should align with the maximum likelihood estimates. However, in the case of estimating BB1 or reflected BB1 copulas, identifiability issues arise when using a flat prior. The two parameters of BB1 or reflected BB1 are negatively dependent, which can result in different combinations of parameter values producing similar likelihood values. This issue might be overlooked in maximum likelihood estimation since it converges to one of the maxima, with an appropriate starting point for numerical optimization. However, it becomes evident in Bayesian estimation, where the model struggles to distinguish between different parameter values in the posterior distributions. We found that incorporating informative priors can effectively mitigate this problem. These priors leverage tail dependence measures to provide additional information about the relationship between the parameters, thereby improving the model’s ability to identify meaningful and interpretable parameter values.

6. Simulation Summary

This section has some simulation results for comparisons. Simulated data sets of size

n = 754

and

d = 20

are obtained to match the data example in Section 5; the algorithm for the simulation is in Algorithm 22 of [9]. For each simulated data set,

(η_{1 j}, η_{2 j})

for

j = 1, \dots, d

are generated at random from (4) and then a random sample

{(u_{i 1}, \dots, u_{i d}) : i = 1, \dots, n}

is generated from 1-factor BB1r based on the tail dependence parameters. For each simulated data set, as a sensitivity analysis, the log posterior in (2) for all three choices of

f_{H}

based on (4)–(6) is maximized to obtain the mode and the approximate covariance matrix of the posterior density; also the MLE based on (1) is obtained.

The MLEs of the

η_{1 j}, η_{2 j}

parameters are transformed to the estimated

θ, δ

parameters of BB1r. Similarly, three sets of posterior modes for

η_{1 j}, η_{2 j}

parameters are transformed to estimate the

θ, δ

parameters. Then, the following root mean squares (rms) are computed:

\mathrm{rms}_{m} = {[{(2 d)}^{- 1} \sum_{j = 1}^{d} ({({\hat{θ}}_{j}^{(m)} - θ_{j})}^{2} + {({\hat{δ}}_{j}^{(m)} - δ_{j})}^{2})]}^{1 / 2}

(8)

for the four sets of estimators. The superscripts

m = 1, 2, 3

indicate the three priors, and the superscript

m = 0

indicates maximum likelihood. Over 100 simulated data sets, the rms summaries are given in Table 7.
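As a minimal sketch of how the summary in (8) could be computed for one simulated data set, the following function takes the estimated and the generating parameter values of the d linking copulas. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def rms_error(theta_hat, delta_hat, theta_true, delta_true):
    """Root mean square error (8) over the 2d BB1r parameters of one data set."""
    d = len(theta_true)
    if not (len(theta_hat) == len(delta_hat) == len(delta_true) == d):
        raise ValueError("all four sequences must have length d")
    total = sum(
        (th - t) ** 2 + (dh - dl) ** 2
        for th, dh, t, dl in zip(theta_hat, delta_hat, theta_true, delta_true)
    )
    return math.sqrt(total / (2 * d))


# Hypothetical values for d = 3 linking copulas (illustration only).
theta_true = [0.5, 0.8, 0.3]
delta_true = [1.5, 1.2, 1.8]
theta_mle = [0.6, 0.7, 0.3]
delta_mle = [1.4, 1.3, 1.8]
rms_0 = rms_error(theta_mle, delta_mle, theta_true, delta_true)
```

The same function would be called once per estimator (m = 0, 1, 2, 3), and the differences between the m = 0 value and the others would then be averaged over the simulated data sets.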

As expected, all three priors lead to estimates that are closer to the

(θ, δ)

parameters used to generate the simulated data sets than the MLEs. The three sets of

{({\hat{θ}}_{j}^{(m)}, {\hat{δ}}_{j}^{(m)})}

for

m = 1, 2, 3

are relatively much closer to each other than to the MLE. For all simulated data sets, the value of the tilted log-likelihood (2) at the posterior mode is largest for prior (4) and smallest for prior (6).

Another summary in Table 8 is the closeness to the empirical

λ_{j k, L}

and

λ_{j k, U}

over the

(\binom{d}{2})

pairs:

Δ_{M, m} = {[d (d - 1) / 2]}^{- 1} \sum_{j < k} |a_{j k, M}^{(m)} - a_{j k, M}^{\mathrm{empirical}}|

(9)

where

m \in {0, 1, 2, 3}

as above, and

M \in {L, U, C}

for lower tail dependence, upper tail dependence, and central dependence, with dependence measures

a \in {{\hat{λ}}_{j k, L}, {\hat{λ}}_{j k, U}, {\hat{ρ}}_{j k, S}}

, respectively.
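The summary in (9) is an average absolute difference over the pairs j < k. A minimal sketch is given below; the function name is hypothetical, and the example uses only the upper tail entries for the first three stocks in Table 1 (1-factor BB1r versus empirical), so it is an illustration rather than a reproduction of Table 8.

```python
from itertools import combinations


def mean_abs_pair_diff(model, empirical):
    """Average of |model[j][k] - empirical[j][k]| over the d(d-1)/2 pairs j < k."""
    d = len(model)
    pairs = list(combinations(range(d), 2))
    return sum(abs(model[j][k] - empirical[j][k]) for j, k in pairs) / len(pairs)


# Upper tail dependence for the first three stocks in Table 1,
# stored as symmetric matrices; only the entries with j < k are used.
model_U = [
    [1.000, 0.258, 0.193],
    [0.258, 1.000, 0.267],
    [0.193, 0.267, 1.000],
]
empirical_U = [
    [1.000, 0.418, 0.259],
    [0.418, 1.000, 0.238],
    [0.259, 0.238, 1.000],
]
delta_U = mean_abs_pair_diff(model_U, empirical_U)
```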

From Table 8, there is better matching with the tilted log-likelihood for upper tail dependence but no improvement for lower tail dependence. The Spearman values

ρ_{j k, S}

are much closer for empirical versus model-based values.

Compared with the results for the stock return data in Section 5, the improvements from using (2) are smaller in the simulations. This can be explained as follows. For finance stock return data with stocks from one sector, the 1-factor structure with lower and upper tail dependence is reasonable and BB1r linking copulas can be considered good approximations, but there might also be weak conditional dependence of some stock returns given the latent variable. That is, for the real data the 1-factor BB1r copula model has some small degree of model misspecification, whereas the simulated data are generated from the model itself. This can explain why tilting to make model-based tail dependence parameters match empirical counterparts leads to better tail inference for the real data.

7. Discussion

A method has been proposed for improved tail inference when preliminary data and likelihood analysis suggest asymmetric tail dependence. The approach of the tilted log-likelihood introduces a prior distribution involving lower and upper tail dependence parameters. Incorporating the prior places more weight on the behavior of the joint lower and upper tails compared with the center of the probability space, thereby improving the extreme value inference. This can account for a small degree of model misspecification in the parametric model. The prior is chosen so that model-based lower and upper tail dependence parameters can be a closer match to empirical counterparts for a previous data set that has some similar features to the data set under consideration.

For simpler exposition, the theory is applied to a 1-factor copula model that can handle non-Gaussian dependence structures with asymmetric tail dependence. The tilted log-likelihood approach can be extended to other structured factor copula models (e.g., bi-factor and 1-factor with weak residual dependence) with asymmetric tail dependence, where a super-population assumption is reasonable for how observed variables are linked to latent variables.

Also, the approach can be applied to vine copula models with bivariate tail dependence for all pairs of variables by choosing bivariate copulas with lower and upper tail dependence in tree 1 of the vine. From [5], lower and upper tail dependence in the first vine tree lead to this property for all pairs of variables. By including a prior based on pairs of variables with stronger dependence and asymmetric tail dependence, there could be a better match of vine copula model-based tail dependence measures and empirical counterparts.

The skew-t copula (see [27]) can also be used for asymmetric tail dependence. However, the functional relation between copula parameters and tail dependence parameters is much more complicated than for the BB1 copula (the latter is given in Appendix A.1), so the tilted log-likelihood approach would have to be implemented with a different transform of the copula parameters.

Bayesian computing methods can be used if there are latent variables. Alternatively, a tilted log-likelihood similar to (2) can be optimized via (a) a quasi-Newton method if the total number of parameters is not large (say, fewer than 40), (b) a modified Newton–Raphson method if the analytical gradient and Hessian can be obtained, or (c) sequential estimation of parameters if possible (Section 5.5 of [9]). For methods (b) and (c), numerical optimization of the tilted log-likelihood is used to obtain the (approximate) posterior mode, and then the Hessian at this mode, in order to obtain interval estimates of the functions of parameters.

Author Contributions

Conceptualization, H.J.; methodology, H.J. and X.L.; software, H.J. and X.L.; writing—original draft preparation, H.J.; writing—review and editing, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been supported by NSERC Discovery Grant GR010293 and an NSERC Postgraduate Scholarship.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

Thanks to the referees for their valuable comments and to the editors for their encouragement.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Background Results on Copulas and Dependence Concepts

The appendix consists of subsections of known results in order to make this article more self-contained. Details of individual topics are in [9] if there are no additional references.

Appendix A.1. BB1 Copula Family and Tail Dependence Properties

If

(U, V) \sim C

for a bivariate copula C, the reflected copula

\hat{C} (u, v)

is the distribution of the reflection

(1 - U, 1 - V)

. Simple probability calculations lead to:

\hat{C} (u, v) = u + v - 1 + C (1 - u, 1 - v), 0 \leq u, v \leq 1 .

The survival function of C is related; it is defined as

\bar{C} (u, v) = Pr (U > u, V > v) = 1 - u - v + C (u, v), 0 \leq u, v \leq 1,

(A1)

so that

\hat{C} (u, v) = \bar{C} (1 - u, 1 - v)

.

The bivariate 2-parameter BB1 copula family is useful to handle asymmetric lower and upper tail dependence. Asymmetric here refers to copulas where C is different from

\hat{C}

. For

θ > 0, δ > 1

,

C_{BB 1} (u, v; θ, δ) = {\{1 + {[{(u^{- θ} - 1)}^{δ} + {(v^{- θ} - 1)}^{δ}]}^{1 / δ}\}}^{- 1 / θ}, 0 \leq u, v \leq 1 .

(A2)

The reflected copula is:

C_{BB 1 r} (u, v; θ, δ) = u + v - 1 + C_{BB 1} (1 - u, 1 - v; θ, δ)

.
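The definitions above translate directly into code. The following is a minimal sketch of the BB1 cdf in (A2), the survival function in (A1), and the reflection operation; the function names are hypothetical, and arguments are assumed to satisfy 0 < u, v ≤ 1 so that the negative powers are finite.

```python
def bb1_cdf(u, v, theta, delta):
    """BB1 copula cdf (A2) for 0 < u, v <= 1 and theta > 0, delta >= 1."""
    x = (u ** (-theta) - 1.0) ** delta
    y = (v ** (-theta) - 1.0) ** delta
    return (1.0 + (x + y) ** (1.0 / delta)) ** (-1.0 / theta)


def survival(cdf, u, v, *params):
    """Survival function (A1): Pr(U > u, V > v)."""
    return 1.0 - u - v + cdf(u, v, *params)


def reflect(cdf):
    """Return the cdf of (1 - U, 1 - V) when (U, V) has the given cdf."""
    def reflected(u, v, *params):
        return u + v - 1.0 + cdf(1.0 - u, 1.0 - v, *params)
    return reflected


bb1r_cdf = reflect(bb1_cdf)

# Illustrative parameter values (not estimates from the paper).
value = bb1r_cdf(0.2, 0.6, 0.5, 1.5)
```

Two quick checks are that the margins are uniform, for example `bb1_cdf(1.0, v, theta, delta)` returns `v`, and that the reflected cdf at (u, v) equals the survival function of the original copula at (1 - u, 1 - v). At the boundary value δ = 1, the BB1 cdf reduces to the Clayton (MTCJ) form.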

The BB1 copula and its reflection are the most useful for handling asymmetric tail dependence. Other choices for this property are BB7 and skew-t (see [27]). The BB1 family has the nice property that there is increasing concordance or positive dependence as

θ

and

δ

increase with independence as the lower bound of the parameter space is reached and comonotonicity (perfect dependence) as one of

θ

or

δ

goes to infinity. The BB7 copula does not have the concordance property over the entire parameter space.

Tail dependence parameters (as functions of copula family) are

λ_{L} (C_{BB 1}) = 2^{- 1 / (δ θ)}

and

λ_{U} (C_{BB 1}) = 2 - 2^{1 / δ}

as functions of

θ

and

δ

. Then,

λ_{U} (C_{BB 1 r}) = λ_{L} (C_{BB 1})

and

λ_{L} (C_{BB 1 r}) = λ_{U} (C_{BB 1})

because the reflection reverses the joint upper and lower tails. The range of

(λ_{L}, λ_{U})

is

{(0, 1)}^{2}

. To go from tail dependence parameters to copula parameters, the transforms are:

\begin{matrix} δ = [\log 2] / \log (2 - λ_{U}), & θ = \log (2 - λ_{U}) / (- \log λ_{L}) & \text{for BB1}, \\ δ = [\log 2] / \log (2 - λ_{L}), & θ = \log (2 - λ_{L}) / (- \log λ_{U}) & \text{for BB1r}, \\ δ = [\log 2] / (- \log λ_{L}), & θ = [\log 2] / \log (2 - λ_{U}) & \text{for BB7} . \end{matrix}

For the data example in Section 5, the BB7 1-factor copula is fitted much worse, but it is included to show that the “prior” in (2) is independent of the tail-dependent family used for linking copulas to the latent variable.
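The maps between tail dependence parameters and copula parameters for BB1 and reflected BB1 can be sketched as follows. The function names are hypothetical, and the numerical values are chosen only to illustrate the round trip; BB7 is omitted from this sketch.

```python
import math


def bb1_tail_dep(theta, delta):
    """Return (lambda_L, lambda_U) for the BB1 copula."""
    lam_L = 2.0 ** (-1.0 / (delta * theta))
    lam_U = 2.0 - 2.0 ** (1.0 / delta)
    return lam_L, lam_U


def bb1_from_tail_dep(lam_L, lam_U):
    """Return (theta, delta) of the BB1 copula with the given tail dependence."""
    delta = math.log(2.0) / math.log(2.0 - lam_U)
    theta = math.log(2.0 - lam_U) / (-math.log(lam_L))
    return theta, delta


def bb1r_from_tail_dep(lam_L, lam_U):
    """Reflected BB1: the reflection swaps the roles of the two tails."""
    return bb1_from_tail_dep(lam_U, lam_L)


# Round trip with illustrative values lambda_L = 0.4, lambda_U = 0.2.
theta, delta = bb1_from_tail_dep(0.4, 0.2)
lam_L, lam_U = bb1_tail_dep(theta, delta)
```

Because both tail dependence parameters lie in (0, 1), the resulting values always satisfy θ > 0 and δ > 1.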

Appendix A.2. 1-Factor Copula Construction

The 1-factor copula model assumes conditional independence of

U = (U_{1}, \dots, U_{d})

, given a latent variable

V \sim U (0, 1)

. Let

C_{j V} (\cdot, ϑ_{j})

(for

j = 1, \dots, d

) be the bivariate copula cdf for the jth variable with the latent variable V. The d-variate copula cdf and density for

U = (U_{1}, \dots, U_{d})

are:

\begin{matrix} C_{U} (u; ϑ_{1}, \dots, ϑ_{d}) & = & \int_{0}^{1} \prod_{j = 1}^{d} C_{j | V} (u_{j} | v; ϑ_{j}) d v, \end{matrix}

(A3)

\begin{matrix} c_{U} (u; ϑ_{1}, \dots, ϑ_{d}) & = & \int_{0}^{1} \prod_{j = 1}^{d} c_{j V} (u_{j}, v; ϑ_{j}) d v, \end{matrix}

(A4)

where

C_{j | V} (u_{j} | v; ϑ_{j}) = \partial C_{j V} (u_{j}, v; ϑ_{j}) / \partial v

is the conditional distribution of

[U_{j} | V = v]

and

c_{j V} (u_{j}, v; ϑ_{j}) = \partial^{2} C_{j V} (u_{j}, v; ϑ_{j}) / \partial v \partial u_{j}

is the corresponding copula density. The

(j, k)

bivariate margin is

C_{j k} (u_{j}, u_{k}) = \int_{0}^{1} C_{j | V} (u_{j} | v; ϑ_{j}) C_{k | V} (u_{k} | v; ϑ_{k}) d v .

(A5)

The Spearman’s rho for

(U_{j}, U_{k})

is

Cor (U_{j}, U_{k})

. It is numerically best to compute it via a 2-dimensional numerical integral

12 \int_{0}^{1} \int_{0}^{1} C_{j k} (u_{j}, u_{k}) d u_{j} d u_{k} - 3

to avoid a possible unbounded copula density

c_{j k}

. For (A5), this becomes

12 \int_{0}^{1} (\int_{0}^{1} C_{j | V} (u_{j} | v) d u_{j}) (\int_{0}^{1} C_{k | V} (u_{k} | v) d u_{k}) d v - 3 = : 12 \int_{0}^{1} a_{j} (v) a_{k} (v) d v - 3;

this can be computed via a 1-dimensional Gauss–Legendre quadrature over

n_{q}

quadrature points

v_{1}, \dots, v_{n_{q}}

for v, and

a_{j} (v)

and

a_{k} (v)

(with respect to

u_{j}

and

u_{k}

) can be separately evaluated via the Gauss–Legendre quadrature for

v_{1}, \dots, v_{n_{q}}

. See [23] for Gaussian quadrature theory.
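The nested quadrature described above can be sketched in a few lines. In the sketch below, the Gauss–Legendre nodes are computed with a standard Newton iteration on the Legendre polynomial, and the conditional cdfs are passed in as functions of (u, v). As an assumption made purely so that the result can be checked by hand, the example uses Farlie–Gumbel–Morgenstern (FGM) linking copulas, for which the Spearman rho of the 1-factor margin works out to α_j α_k / 9; for the model in this paper, the BB1 or reflected BB1 conditional cdfs would be supplied instead. All function names are hypothetical.

```python
import math


def gauss_legendre_01(n):
    """Nodes and weights of the n-point Gauss-Legendre rule rescaled to [0, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess for the i-th root
        for _ in range(100):  # Newton iterations on the Legendre polynomial P_n
            p_prev, p = 1.0, x  # P_0(x), P_1(x)
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative P_n'(x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        nodes.append(0.5 * (x + 1.0))
        weights.append(1.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def spearman_rho_1factor(cond_cdf_j, cond_cdf_k, nq=25):
    """Spearman rho of (U_j, U_k) as 12 * int a_j(v) a_k(v) dv - 3."""
    nodes, weights = gauss_legendre_01(nq)

    def a(cond_cdf, v):
        # Inner integral of C_{j|V}(u | v) over u in (0, 1).
        return sum(w * cond_cdf(u, v) for u, w in zip(nodes, weights))

    outer = sum(w * a(cond_cdf_j, v) * a(cond_cdf_k, v) for v, w in zip(nodes, weights))
    return 12.0 * outer - 3.0


def fgm_cond_cdf(alpha):
    """Conditional cdf C(u | v) of the FGM copula (illustrative assumption)."""
    return lambda u, v: u + alpha * u * (1.0 - u) * (1.0 - 2.0 * v)


rho = spearman_rho_1factor(fgm_cond_cdf(0.9), fgm_cond_cdf(0.6))
```

Since the inner integrals depend only on v and on one linking copula, the values a_j(v_1), ..., a_j(v_{n_q}) could be stored once per variable and reused for all pairs.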

Suppose

C_{j V}

has lower and upper tail dependence for all j, such as with BB1 or reflected BB1 copulas. Define the survival function

{\bar{C}}_{j V}

as in (A1). Then, from Section 2.18 of [9], there are functions

b_{j V}

and

b_{j V}^{*}

for lower and upper tails, respectively, such that for

w_{j} > 0

and

w > 0

,

\begin{matrix} C_{j V} (u w_{j}, u w) / u & \to & b_{j V} (w_{j}, w), u \to 0^{+}, \\ {\bar{C}}_{j V} (1 - u w_{j}, 1 - u w) / u & \to & b_{j V}^{*} (w_{j}, w), u \to 0^{+}, \\ C_{j | V} (u w_{j} | u w) & \to & b_{j | V} (w_{j}, w) = \partial b_{j V} (w_{j}, w) / \partial w, u \to 0^{+}, \\ {\bar{C}}_{j | V} (1 - u w_{j} | 1 - u w) & \to & b_{j | V}^{*} (w_{j}, w) = \partial b_{j V}^{*} (w_{j}, w) / \partial w, u \to 0^{+} . \end{matrix}

Lower and upper tail dependence parameters are defined in Section 2. The conditions of Theorem 8.76 of [9] are satisfied with BB1 copulas, and it implies for variables j and k (

j \neq k

) that the lower and upper tail dependence parameters for

(U_{j}, U_{k})

are

λ_{j k, L} = \int_{0}^{\infty} b_{j | V} (1 | z) b_{k | V} (1 | z) d z, λ_{j k, U} = \int_{0}^{\infty} b_{j | V}^{*} (1 | z) b_{k | V}^{*} (1 | z) d z .

(A6)

For BB1,

\begin{matrix} b_{j V} (w_{j}, w) & = & {(w_{j}^{- δ θ} + w^{- δ θ})}^{- 1 / (δ θ)}, \\ b_{j V}^{*} (w_{j}, w) & = & w_{j} + w - {(w_{j}^{δ} + w^{δ})}^{1 / δ}, \\ b_{j | V} (1 | v) & = & {(1 + v^{δ θ})}^{- 1 - 1 / (δ θ)}, \\ b_{j | V}^{*} (1 | v) & = & 1 - {(1 + v^{- δ})}^{1 / δ - 1} . \end{matrix}

For reflected BB1, the lower and upper quantities are reversed, as the reflection just changes the upper (lower) quadrant to the lower (upper) quadrant.
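As a rough numerical illustration of (A6) with BB1 linking copulas, the integrals over (0, ∞) can be mapped to (0, 1) with the substitution z = t / (1 - t) and approximated with a plain midpoint rule. This is a sketch under stated assumptions, not the computation used in the paper: the midpoint rule, the number of subintervals, the function names, and the parameter values are all choices made here for simplicity. With θ = 0.5 and δ = 2 for both variables, the two integrals have the closed forms 1/3 and 2 - π/2, which gives a check on the code.

```python
import math


def b_lower(z, theta, delta):
    """BB1 lower tail function b_{j|V}(1 | z)."""
    a = delta * theta
    return (1.0 + z ** a) ** (-1.0 - 1.0 / a)


def b_upper(z, theta, delta):
    """BB1 upper tail function b*_{j|V}(1 | z)."""
    return 1.0 - (1.0 + z ** (-delta)) ** (1.0 / delta - 1.0)


def tail_dep_pair(b, params_j, params_k, n=20000):
    """Midpoint-rule approximation of an integral in (A6) after z = t / (1 - t)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        z = t / (1.0 - t)
        # The factor 1 / (1 - t)^2 is the Jacobian dz / dt.
        total += b(z, *params_j) * b(z, *params_k) / (1.0 - t) ** 2
    return total / n


# Illustrative parameters (theta, delta) shared by variables j and k.
params = (0.5, 2.0)
lam_L = tail_dep_pair(b_lower, params, params)
lam_U = tail_dep_pair(b_upper, params, params)
```

For very large parameter values the powers of z can overflow in floating point, so a production implementation would need more careful handling of the extreme quadrature points.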

Reflected BB1 and BB1 copulas usually fit better than the BB7 copula for stock return data with factor or vine copulas. From the concordance property of the BB1 copula, there is more lower and upper tail dependence as either of the two parameters increases.

Appendix A.3. Normal Scores and Use for Diagnostics

Let

y_{1}, \dots, y_{n}

be a sample of size n for a variable y. The rank transform to the standard normal is

{\hat{z}}_{i} = Φ^{- 1} ((rank (y_{i}) - 0.5) / n),

where a larger y value attains a larger rank: rank 1 for the smallest and rank n for the largest among the sorted values of

y_{1}, \dots, y_{n}

. We refer to the

{{\hat{z}}_{i}}

as the normal scores for y.
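A minimal sketch of the normal score transform for one variable is given below. The function name is hypothetical, and ties are assumed to be absent (if present, this sketch breaks them by order of appearance, which is a choice made here and not taken from the paper).

```python
from statistics import NormalDist


def normal_scores(y):
    """Map a sample to normal scores via z_i = Phi^{-1}((rank(y_i) - 0.5) / n)."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])  # indices from smallest to largest
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r  # rank 1 for the smallest value, rank n for the largest
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf((r - 0.5) / n) for r in ranks]


# Small illustrative sample.
z = normal_scores([2.3, -1.0, 0.4])
```

For d variables, the function would be applied to each column separately, and the correlation of two columns of scores gives the normal-scores correlation for that pair.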

If there are d variables and a sample

{(y_{i 1}, \dots, y_{i d}) : i = 1, \dots, n}

, then one can obtain vectors of normal scores

{({\hat{z}}_{i 1}, \dots, {\hat{z}}_{i d}) : i = 1, \dots, n}

with ranking performed separately for each variable. Pairs of variables have strong monotone dependence if the correlation of their normal score transforms is large. Figure 1 has examples of bivariate normal scores plots for pairs of GARCH-filtered returns. The plots are used to check for departures from elliptically shaped clouds of points, which are expected if the Gaussian copula fits well.

Appendix A.4. GARCH Time Series

A summary of the copula-GARCH models is as follows. Let

P_{t}

(

t = 0, 1, \dots, n

) be the price time series of a financial asset such as a market index or stock; the time index could be day, week, or month. The (log) return

R_{t}

is defined as

log (P_{t} / P_{t - 1})

. For d assets, denote the returns as

R_{t 1}, \dots, R_{t d}

at time t. For each financial return variable, a common choice is the GARCH(1,1) time series filter, with innovation distribution being the symmetric (or asymmetric) Student t_ν distribution with variance 1 and

ν > 2

; see Section 4.3.6 of [28].

Let

F_{j} (\cdot, ν_{j})

be the distribution of the innovation for the jth univariate marginal model:

R_{t j} = μ_{j} + σ_{t j} Z_{t j}, σ_{t j}^{2} = ω_{j} + α_{j} R_{t - 1, j}^{2} + β_{j} σ_{t - 1, j}^{2}, j = 1, \dots, d, t = 1, \dots, n,

(A7)

where for each j,

ω_{j} > 0

,

α_{j} > 0

,

β_{j} > 0

, the

Z_{t j}

are assumed to be innovations over t, and the random

σ_{t j}^{2}

depends on

R_{t - 1, j}^{2}

and

σ_{t - 1, j}^{2}

. For stationarity,

α_{j} + β_{j} < 1

for all j. The vectors

(Z_{t 1}, \dots, Z_{t d})

for

t = 1, \dots, n

are assumed to be independent and identically distributed with distribution:

F_{Z} (z; ν_{1}, \dots, ν_{d}, ϑ) = C (F_{1} (z_{1}; ν_{1}), \dots, F_{d} (z_{d}; ν_{d}); ϑ),

where

ϑ

is a dependence parameter of a d-dimensional copula C. Sometimes, an autoregressive coefficient

ϕ_{j}

is also added when

μ_{j}

in (A7) is replaced by

μ_{j} + ϕ_{j} R_{t - 1, j}

. The flexible choice of the copula families includes vine and factor copula constructions.
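To make the filtering step in (A7) concrete, the following sketch computes the standardized residuals for one return series given fixed GARCH(1,1) parameter values. In practice the parameters are estimated, which is not shown here. The function name, the parameter values, and the choice to start the recursion at ω / (1 - α - β) are assumptions made for illustration.

```python
import math


def garch11_filter(returns, mu, omega, alpha, beta, sigma2_init=None):
    """Return the residuals z_t = (R_t - mu) / sigma_t implied by (A7)."""
    if not (omega > 0 and alpha > 0 and beta > 0 and alpha + beta < 1):
        raise ValueError("need omega, alpha, beta > 0 and alpha + beta < 1")
    # Assumption: the first conditional variance is set to omega / (1 - alpha - beta)
    # unless a starting value is supplied.
    sigma2 = omega / (1.0 - alpha - beta) if sigma2_init is None else sigma2_init
    residuals = []
    for t, r in enumerate(returns):
        if t > 0:
            # Recursion in (A7): sigma_t^2 = omega + alpha R_{t-1}^2 + beta sigma_{t-1}^2.
            sigma2 = omega + alpha * returns[t - 1] ** 2 + beta * sigma2
        residuals.append((r - mu) / math.sqrt(sigma2))
    return residuals


# Hypothetical daily log returns and parameter values.
z = garch11_filter([0.01, -0.02, 0.015], mu=0.0, omega=1e-5, alpha=0.1, beta=0.85)
```

The filtered residuals of each series would then be transformed to the (0, 1) scale with the fitted innovation distribution before the copula is fitted.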

References

1. McNeil, A.J. Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bull. 1997, 27, 117–137.
2. Sun, M. Modeling cyber loss severity using a spliced regression distribution with mixture components. Open J. Stat. 2023, 13, 425–452.
3. Falk, M.; Padoan, S.; Wisheckel, F. Generalized Pareto copulas: A key to multivariate extremes. J. Multivar. Anal. 2019, 174, 104538.
4. Jondeau, E.; Rockinger, M. The copula-GARCH model of conditional dependencies: An international stock market application. J. Int. Money Financ. 2006, 25, 827–853.
5. Joe, H.; Li, H.; Nikoloulopoulos, A.K. Tail dependence functions and vine copulas. J. Multivar. Anal. 2010, 101, 252–270.
6. Krupskii, P.; Joe, H. Factor copula models for multivariate data. J. Multivar. Anal. 2013, 120, 85–101.
7. Lee, D.; Joe, H.; Krupskii, P. Tail-weighted dependence measures with limit being the tail dependence coefficient. J. Nonparametr. Stat. 2018, 30, 262–290.
8. Coblenz, M.; Dyckerhoff, R.; Grothe, O. Nonparametric estimation of multivariate quantiles. Environmetrics 2018, 29, e2488.
9. Joe, H. Dependence Modeling with Copulas; Chapman & Hall/CRC: Boca Raton, FL, USA, 2014.
10. Fan, X. Dependence Modeling in High Dimensions with Latent Variables. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 2024.
11. Johnson, R.A.; Wichern, D.W. Applied Multivariate Statistical Analysis, 5th ed.; Prentice Hall: Englewood Cliffs, NJ, USA, 2002.
12. Joe, H. Parsimonious graphical dependence models constructed from vines. Can. J. Stat. 2018, 46, 532–555.
13. Ang, A.; Chen, J. Asymmetric correlations of equity portfolios. J. Financ. Econ. 2002, 63, 443–494.
14. Longin, F.; Solnik, B. Extreme correlations in international equity markets. J. Financ. 2001, 56, 649–676.
15. Nikoloulopoulos, A.K.; Joe, H.; Li, H. Vine copulas with asymmetric tail dependence and applications to financial return data. Comput. Stat. Data Anal. 2012, 56, 3659–3673.
16. Li, X.; Joe, H. Multivariate directional tail-weighted dependence measures. J. Multivar. Anal. 2024, 203, 105319.
17. Serfling, R.J. Approximation Theorems of Mathematical Statistics; Wiley: New York, NY, USA, 1980.
18. Liseo, B.; Loperfido, N. A note on reference priors for the scalar skew-normal distribution. J. Stat. Plan. Inference 2006, 136, 373–389.
19. Azzalini, A.; Arellano-Valle, R.B. Maximum penalized likelihood estimation for skew-normal and skew-t distributions. J. Stat. Plan. Inference 2013, 143, 419–433.
20. Loaiza-Maya, R.; Smith, M.S. Variational Bayes estimation of discrete-margined copula models with application to time series. J. Comput. Graph. Stat. 2019, 28, 523–539.
21. Nguyen, H.; Ausin, M.C.; Galeano, P. Variational inference for high dimensional structured factor copulas. Comput. Stat. Data Anal. 2020, 151, 107012.
22. Krupskii, P.; Joe, H. Structured factor copula models: Theory, inference and computation. J. Multivar. Anal. 2015, 138, 53–73.
23. Stroud, A.; Secrest, D. Gaussian Quadrature Formulas; Prentice-Hall: Englewood Cliffs, NJ, USA, 1966.
24. Kreuzer, A.; Czado, C. Bayesian inference for a single factor copula stochastic volatility model using Hamiltonian Monte Carlo. Econom. Stat. 2021, 19, 130–150.
25. Hoffman, M.D.; Gelman, A. The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 2014, 15, 1593–1623.
26. Stan Development Team. Stan Modeling Language Users Guide and Reference Manual, Version 2.30.1; 2023. Available online: https://mc-stan.org/users/citations/ (accessed on 10 June 2024).
27. Yoshiba, T. Maximum likelihood estimation of skew-t copulas with its applications to stock returns. J. Stat. Comput. Simul. 2018, 88, 2489–2506.
28. Jondeau, E.; Poon, S.H.; Rockinger, M. Financial Modeling under Non-Gaussian Distributions; Springer: London, UK, 2007.

Figure 1. Normal score plots for some pairs of GARCH-filtered stock returns. Lower and upper semi-correlations, as used in Section 2.4 of [9], show more dependence in the lower quadrant than in the upper quadrant and suggest asymmetric tail dependence.

Table 1. Matrices of tail dependence measures for 10 stock GARCH-filtered returns: model-based 1-factor BB1r, empirical, model-based 1-factor BB1, respectively. Lower (upper) tail dependence below (above) diagonal. Bootstrap standard errors (SEs) for estimates of lower and upper tail dependence are mostly in the range 0.04 to 0.075.

Model-Based Tail-Dependence Based on MLE with 1-Factor BB1r
1.000	0.258	0.193	0.017	0.176	0.093	0.135	0.037	0.039	0.233
0.400	1.000	0.267	0.022	0.242	0.126	0.184	0.049	0.051	0.326
0.449	0.470	1.000	0.018	0.181	0.096	0.139	0.038	0.040	0.240
0.257	0.268	0.298	1.000	0.016	0.009	0.013	0.004	0.004	0.021
0.353	0.369	0.413	0.239	1.000	0.088	0.127	0.035	0.037	0.218
0.449	0.471	0.533	0.299	0.414	1.000	0.069	0.020	0.020	0.114
0.334	0.349	0.390	0.227	0.310	0.391	1.000	0.027	0.029	0.167
0.328	0.342	0.382	0.223	0.304	0.383	0.288	1.000	0.008	0.045
0.383	0.401	0.450	0.258	0.354	0.451	0.335	0.329	1.000	0.047
0.434	0.455	0.513	0.289	0.400	0.514	0.378	0.370	0.435	1.000
Empirical Tail-Dependence Measures
1.000	0.418	0.259	0.087	0.196	0.233	0.200	0.160	0.144	0.239
0.397	1.000	0.238	0.141	0.275	0.285	0.247	0.144	0.163	0.333
0.362	0.382	1.000	0.118	0.268	0.326	0.275	0.108	0.202	0.347
0.184	0.168	0.217	1.000	0.130	0.142	0.080	0.189	0.157	0.079
0.283	0.310	0.344	0.231	1.000	0.178	0.209	0.111	0.132	0.290
0.281	0.364	0.494	0.132	0.307	1.000	0.182	0.078	0.161	0.339
0.235	0.274	0.375	0.207	0.320	0.301	1.000	0.060	0.131	0.308
0.267	0.275	0.219	0.234	0.207	0.273	0.201	1.000	0.137	0.109
0.246	0.312	0.373	0.156	0.262	0.333	0.151	0.279	1.000	0.198
0.284	0.293	0.450	0.224	0.301	0.438	0.255	0.284	0.276	1.000
Model-Based Tail-Dependence Based on MLE with 1-Factor BB1
1.000	0.385	0.363	0.172	0.317	0.335	0.279	0.223	0.273	0.385
0.308	1.000	0.416	0.195	0.363	0.383	0.317	0.253	0.311	0.442
0.394	0.397	1.000	0.184	0.342	0.361	0.300	0.240	0.294	0.416
0.169	0.170	0.211	1.000	0.163	0.172	0.144	0.117	0.142	0.194
0.263	0.265	0.336	0.147	1.000	0.316	0.264	0.211	0.258	0.362
0.396	0.399	0.525	0.212	0.337	1.000	0.278	0.223	0.272	0.383
0.260	0.262	0.332	0.146	0.225	0.333	1.000	0.187	0.228	0.317
0.263	0.266	0.336	0.147	0.228	0.338	0.225	1.000	0.183	0.253
0.317	0.320	0.410	0.174	0.273	0.412	0.270	0.273	1.000	0.311
0.358	0.362	0.469	0.194	0.307	0.471	0.303	0.307	0.373	1.000

Table 2. Empirical Spearman rank correlation matrix for 10 GARCH-filtered stock returns. Bootstrap SEs for Spearman are in the range 0.017 to 0.036. Model-based Spearman rhos based on 1-factor BB1r and 1-factor BB1 are quite close to the respective empirical values.

Empirical Spearman Rho Central Dependence Measures
1.000	0.745	0.597	0.321	0.534	0.584	0.439	0.444	0.567	0.565
0.745	1.000	0.633	0.376	0.605	0.620	0.491	0.519	0.577	0.639
0.597	0.633	1.000	0.323	0.551	0.709	0.581	0.506	0.561	0.723
0.321	0.376	0.323	1.000	0.347	0.378	0.352	0.508	0.362	0.388
0.534	0.605	0.551	0.347	1.000	0.569	0.494	0.390	0.501	0.558
0.584	0.620	0.709	0.378	0.569	1.000	0.557	0.498	0.558	0.720
0.439	0.491	0.581	0.352	0.494	0.557	1.000	0.325	0.420	0.599
0.444	0.519	0.506	0.508	0.390	0.498	0.325	1.000	0.467	0.485
0.567	0.577	0.561	0.362	0.501	0.558	0.420	0.467	1.000	0.567
0.565	0.639	0.723	0.388	0.558	0.720	0.599	0.485	0.567	1.000

Table 3. Summaries to indicate how well model-based tail dependence and central dependence approximate respective empirical values. The averages and fractions are over

(\binom{10}{2}) = 45

bivariate margins.

Summary	Value
average $| {\hat{λ}}_{j k, L} - λ_{j k, L} (\hat{ϑ}) |$ for 1-factor BB1r	0.088
average $| {\hat{λ}}_{j k, U} - λ_{j k, U} (\hat{ϑ}) |$ for 1-factor BB1r	0.101
average $| {\hat{ρ}}_{j k, S} - ρ_{j k, S} (\hat{ϑ}) |$ for 1-factor BB1r	0.035
average $| {\hat{λ}}_{j k, L} - λ_{j k, L} (\hat{ϑ}) |$ for 1-factor BB1	0.043
average $| {\hat{λ}}_{j k, U} - λ_{j k, U} (\hat{ϑ}) |$ for 1-factor BB1	0.088
average $| {\hat{ρ}}_{j k, S} - ρ_{j k, S} (\hat{ϑ}) |$ for 1-factor BB1	0.036
average $({\hat{λ}}_{j k, L} - {\hat{λ}}_{j k, U})$	0.088
average $(λ_{j k, L} (\hat{ϑ}) - λ_{j k, U} (\hat{ϑ}))$ for 1-factor BB1r	0.275
average $(λ_{j k, L} (\hat{ϑ}) - λ_{j k, U} (\hat{ϑ}))$ for 1-factor BB1	0.021
fraction ${\hat{λ}}_{j k, L} > {\hat{λ}}_{j k, U}$	40/45
fraction $λ_{j k, L} (\hat{ϑ}) > λ_{j k, U} (\hat{ϑ})$ for 1-factor BB1r	45/45
fraction $λ_{j k, L} (\hat{ϑ}) > λ_{j k, U} (\hat{ϑ})$ for 1-factor BB1	29/45

Table 4. Closeness to corresponding empirical values to model-based ML or posterior modal

η_{L} . η_{U}

values for lower tail dependence

λ_{j k, L}

, upper tail dependence

λ_{j k, U}

and central dependence parameters

ρ_{j k, S}

of

d (d - 1) / 2

pairs; d = 10 GARCH-filtered stock returns. The quantiles in columns 2 to 4 are from the average absolute difference over

d (d - 1) / 2

pairs.

Case	$λ_{L}$	$λ_{U}$	$ρ_{S}$	Objective (2)
BB1r, MLE	0.094	0.116	0.034	–
BB1r, prior (4)	0.077	0.043	0.034	5766
BB1r, prior (5)	0.081	0.054	0.034	5766
BB1r, prior (6)	0.074	0.046	0.035	5743
BB1, prior (4)	0.078	0.078	0.036	5737
BB7, prior (4)	0.140	0.134	0.062	5549

Table 5. Posterior mode and standard deviation (SD) of

η_{1 j}, η_{2 j}

parameters. Note that

μ_{η_{1 j}} > μ_{η_{2 j}}

implies stronger lower tail dependence than upper tail dependence for variable j with the latent variable.

μ_{η_{1 j}} > 0

means that the estimated lower tail dependence with latent variable exceeds 0.5. The SD

σ

values come from the square root of diagonal values of the negative inverse Hessian at the mode. The correlation values for each diagonal

2 \times 2

block come from converting a covariance matrix to a correlation matrix.

Variable j	$μ_{η_{1 j}}$	$μ_{η_{2 j}}$	$σ_{η_{1 j}}$	$σ_{η_{2 j}}$	$ρ_{η_{1 j}, η_{2 j}}$
1	0.292	−0.351	0.081	0.202	−0.398
2	−0.308	−0.797	0.101	0.217	−0.315
3	0.447	−0.010	0.081	0.186	−0.442
4	−0.771	−1.422	0.119	0.243	−0.163
5	0.250	−0.904	0.076	0.227	−0.274
6	0.758	−0.261	0.068	0.215	−0.373
7	0.435	−0.390	0.076	0.209	−0.379
8	0.560	−0.165	0.075	0.200	−0.411
9	0.553	−0.137	0.075	0.200	−0.422
10	0.657	−0.314	0.070	0.213	−0.372
11	0.803	−0.255	0.067	0.215	−0.366
12	0.460	−0.755	0.071	0.224	−0.292
13	−0.108	−1.312	0.086	0.234	−0.197
14	0.134	−0.859	0.081	0.226	−0.292
15	0.253	−0.637	0.079	0.219	−0.336
16	0.600	0.101	0.078	0.186	−0.450
17	−0.149	−1.240	0.089	0.233	−0.216
18	0.533	−0.036	0.078	0.192	−0.437
19	0.107	−0.734	0.083	0.221	−0.323
20	−0.393	−1.228	0.101	0.235	−0.215

Table 6. Part of negative inverse Hessian at posterior mode, that is, the posterior covariance for variables

μ_{η_{1 j}}, μ_{η_{2 j}}

,

j \in {1, 7, 13}

. Note the nearly block-diagonal form: the matrix is dominated by its diagonal

2 \times 2

blocks.

$j = 1$		$j = 7$		$j = 13$
0.00662	−0.00655	−0.00002	−0.00001	0.00005	−0.00003
−0.00655	0.04093	−0.00002	−0.00025	0.00001	0.00052
−0.00002	−0.00002	0.00570	−0.00600	0.00005	0.00000
−0.00001	−0.00025	−0.00600	0.04388	0.00002	0.00029
0.00005	0.00001	0.00005	0.00002	0.00737	−0.00396
−0.00003	0.00052	0.00000	0.00029	−0.00396	0.05473

Table 7. Values of average difference in values in (8) over 100 simulated data sets. Also included are the fraction of times that the posterior mode from (2) is closer to the “true” vector compared with the MLE, and lower/upper quartiles of (8).

m	Average ${rms}_{0} - {rms}_{m}$	Fraction ${rms}_{0} > {rms}_{m}$	Q1 rms_m	Q3 rms_m
0			0.085	0.101
1	0.023	0.95	0.066	0.079
2	0.019	0.92	0.071	0.082
3	0.013	0.80	0.074	0.091

Table 8. Values of average difference for values of (9) in 100 simulated data sets. Also, the fraction of times that the posterior mode from (2) improves on the MLE based on (9), and lower/upper quartiles of (9).

m	M	avg $Δ_{M, 0} - Δ_{M, m}$	Fraction $Δ_{M, m} < Δ_{M, 0}$	Q1 $Δ_{M, m}$	Q3 $Δ_{M, m}$
0	L			0.059	0.078
1	L	0.001	0.46	0.060	0.077
2	L	0.001	0.48	0.060	0.077
3	L	0.000	0.37	0.060	0.079
0	U			0.073	0.129
1	U	0.011	0.87	0.065	0.116
2	U	0.004	0.70	0.070	0.122
3	U	0.011	0.92	0.065	0.114

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
