1. Introduction
Researchers have examined diverse classes of distributions to address various facets of applied modeling problems. Most loss datasets in actuarial loss modeling share some common characteristics: they are right-skewed, unimodal (multimodal in certain situations), and have a thin left tail and a moderate to extremely thick right tail. In recent years, various classes of heavy-tailed distributions, including the subexponential distribution class, have been studied for modeling heavy-tailed (or extreme) data [
1,
2,
3]. On the other hand, a different stream of problems, such as the periodic vibrations in commercial aircraft, has motivated the introduction of the Birnbaum–Saunders (BS) distribution [
4]. The BS distribution models the total time elapsed until a critical threshold is exceeded by fatigue accumulated on a subject (material) of interest, causing the failure event (or a crack) of the material to occur. Due to its ability to fit right-skewed data, the BS distribution is highly effective for modeling numerous scenarios, e.g., situations where there is an accumulation of a certain factor that drives a quantifiable characteristic to surpass a critical threshold. See [
5] for details on the theoretical properties and applications of the BS distribution.
Various extensions of the BS distribution have been discussed in the literature. In [
6], an extended version of the Birnbaum–Saunders distribution family is introduced by using the density of an elliptical distribution in place of the standard normal density that quantifies the amount of stress per cycle of material use in the BS setting. In [
7], the extreme value version of the generalized Birnbaum–Saunders (GBS) distribution, whose tail thickness is determined by that of the auxiliary distribution (i.e., an elliptical distribution), has been discussed. Some applications of the extreme value BS models can be found in [
5,
8].
The (three-parameter) Gaussian crack (lifetime) distribution introduced in [
9] is another important extension of the BS distribution. The Gaussian crack (CR) distribution is a two-component mixture of inverse Gaussian and length-biased inverse Gaussian distributions with a weight parameter
p, and the additional mixture weight parameter gives it increased flexibility to fit various datasets. However, the Gaussian crack distribution relies on the standard normal base density, and thus it lacks heavy-tailedness. The limited applicability of the Gaussian crack distribution for modeling heavy-tailed data such as insurance losses motivates the construction of a large class of generalized crack (GCR) distributions [
10]. The GCR distribution class contains the Gaussian crack distribution as a specific member, and each member of the class is built on a specific choice of a base-density function that determines the tail characteristics of the resulting GCR distribution. In [
10,
11], the GCR distributions with the Student’s
t and the generalized Gaussian base-density functions are applied to catastrophic losses and heavy-tailed precipitation time series, respectively.
In [
12], the GCR distribution class has recently been further extended to the class of Type-II generalized crack (GCR2) distributions in which an additional shape parameter
is included to increase flexibility over the GCR class. The key distributional properties, such as the tail characteristics of GCR2 distribution, depend on the specification of the base-density function and the shape parameters involved in each model.
In the literature, several important modeling frameworks allow for the creation of new distribution families from given ones, including the Azzalini method [
13], Lehmann-type distributions [
14], and Topp-Leone families [
15]. While each framework possesses its own distributional characteristics, these general frameworks share the property that the specification of the baseline distribution function (or density function) plays a crucial role in determining the key distributional properties of the constructed model. For instance, the Lehmann-type I setting with the Stoppa (baseline) distribution function yields a model that can be effectively used for actuarial data with extreme observations [
16]. In this sense, the modeling approach used in the construction of the GCR/GCR2 distribution class is in line with these general frameworks. Typical parametric distributions built under the aforementioned general frameworks have simple (closed) forms, and thus they may not perform well when the data feature complex (e.g., multimodal) shapes. Due to its mixture structure, the GCR/GCR2 distribution with an appropriate baseline density can be advantageous over the simpler models in such cases.
Regarding applications for bivariate data with heavy-tailed marginals, a bivariate GCR distribution in the form of a four-component mixture of independent models has been constructed in [
17]. The authors demonstrate that bivariate GCR models can exhibit useful dependence structures and serve as valuable models for diverse real-world situations.
This paper aims to extend the univariate GCR2 models to bivariate cases by employing the mixture model structure used in [
17]. Three specific examples of GCR2 distributions, i.e., a newly constructed GCR2 model based on the logistic density in addition to the GCR2-
t and the GCR2-GG models introduced in [
12], are used as marginals to effectively model heavy-tailed insurance/catastrophic loss data. We investigate some theoretical properties of the proposed bivariate models, such as the conditional distribution and the dependence structure, and discuss the expectation-maximization (EM) algorithm for model estimation. The applicability of the proposed bivariate GCR2 models and the estimation method is illustrated through a model fitting to a real disaster loss dataset.
The rest of this paper is organized as follows.
Section 2 gives a brief review of the origin, definition, examples, and key properties of the univariate Type-II generalized crack distributions for the reader’s convenience. In
Section 3, the bivariate Type-II generalized crack distribution is introduced with some detailed discussions of its theoretical properties. A method of model estimation and its application on a real catastrophic loss dataset are presented in
Section 4 and
Section 5, respectively. Finally,
Section 6 provides some concluding remarks.
3. Bivariate GCR2 Distribution
In this section, we introduce a bivariate distribution with GCR2 marginals and study its key theoretical properties. Formally, the bivariate Type-II generalized crack (BVGCR2) distribution is defined as follows.
Definition 1. A pair of random variables T = (, ) has a bivariate Type-II generalized crack distribution with base density g and parameters = (, ), = (), = () and = (), denoted as BVGCR2(, , , ; g), if and only if its joint pdf is given as follows: where the mixture weight parameters satisfy , , , and, for each ,
and Clearly, BVGCR2(, , , ; g) is a mixture of four combinations of independent bivariate distributions. It is easy to see that the marginal distributions of and are GCR2 and GCR2 , respectively. Please note that for simplicity, we assume that the marginal distributions are built on the same base-density function g, but any model parameters involved in g are distinct for each marginal.
3.1. Conditional Distribution
The conditional distribution is useful in simulating a pair of random variables from the BVGCR2 distribution. From the following relationship between the two mixture components of the GCR2 distribution,
the conditional density of
given
, denoted as
, can be expressed as
Here the subscripts and parameters in the density functions have been dropped for notational convenience. From the expression, we can see that the conditional distribution of
given
is also GCR2
where
Using the conditional distribution, one can easily simulate a pair of random variates
from the BVGCR2 model; first simulate
from the GCR2
using the acceptance-rejection method as given in [
12], and then simulate
from the conditional distribution GCR2
.
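Since the BVGCR2 density is a finite mixture of four independent bivariate components, the two-step scheme above (marginal draw, then conditional draw) is equivalent to first drawing a mixture label and then sampling each coordinate independently from the selected component. A minimal label-first sketch, with placeholder lognormal samplers standing in for the actual GCR2 component distributions (which are not reproduced here):

```python
import random

# Illustrative sketch (not the authors' code): sampling a pair from a
# four-component mixture of independent bivariate distributions, the
# structure stated in Definition 1.  The component samplers below are
# placeholder lognormals, NOT the actual GCR2 mixture components.
random.seed(0)

weights = [0.4, 0.2, 0.2, 0.2]  # hypothetical mixture weights summing to 1
components = [                  # (sampler for the first coordinate, sampler for the second)
    (lambda: random.lognormvariate(0.0, 1.0), lambda: random.lognormvariate(0.0, 1.0)),
    (lambda: random.lognormvariate(0.0, 1.0), lambda: random.lognormvariate(0.5, 1.0)),
    (lambda: random.lognormvariate(0.5, 1.0), lambda: random.lognormvariate(0.0, 1.0)),
    (lambda: random.lognormvariate(0.5, 1.0), lambda: random.lognormvariate(0.5, 1.0)),
]

def draw_pair():
    k = random.choices(range(4), weights=weights)[0]  # pick a mixture component
    f1, f2 = components[k]
    return f1(), f2()  # coordinates are independent within a component

sample = [draw_pair() for _ in range(5)]
```

In practice, the placeholder samplers would be replaced by draws from the appropriate GCR2 component distributions obtained via the acceptance-rejection method.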
3.2. Dependence Measures
In this section, we derive expressions for Spearman’s rho and Kendall’s tau of BVGCR2 random variables.
3.2.1. Spearman’s Rho
Spearman’s rho is a commonly used measure of dependence between two random variables. Because it is invariant under strictly increasing transformations, Spearman’s rho provides a broadly interpretable summary of the dependence structure of any bivariate distribution. With marginal densities
and
for the random variables
and
, respectively, and the joint distribution
on
, the (population version) Spearman’s rho is defined as
where
and
are uniform random variables, i.e., Spearman’s rho is Pearson’s correlation between transformations of the original random variables into standard uniform marginals. The following provides an expression for Spearman’s rho of a pair of random variables with a BVGCR2 distribution.
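Since the displayed definition was not reproduced in this version, we record its standard form: with $F_1$ and $F_2$ denoting the marginal distribution functions of the two random variables, the population Spearman's rho reads

```latex
\rho_S = \operatorname{Corr}\bigl(F_1(T_1),\, F_2(T_2)\bigr)
       = 12\,\mathbb{E}\bigl[F_1(T_1)\, F_2(T_2)\bigr] - 3 ,
```

where the second equality follows because $F_1(T_1)$ and $F_2(T_2)$ are standard uniform, with mean $1/2$ and variance $1/12$.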
Proposition 1. Suppose BVGCR2(,,,;g). Then, Spearman’s rho between and is expressed as where
3.2.2. Kendall’s Tau
Kendall’s tau is a measure of association (concordance/discordance) between two random variables. Formally, for the random variables
and
with the joint distribution
on
, the (population version) Kendall’s tau is defined as
where the pair
has the joint distribution
F and is independent of
.
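A sample analogue of this population definition simply counts concordant and discordant pairs of observations; a small self-contained sketch (the helper name `kendall_tau` is ours, not from the paper):

```python
from itertools import combinations

def kendall_tau(pairs):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs.

    Two observations (x_i, y_i), (x_j, y_j) are concordant when
    (x_i - x_j) and (y_i - y_j) share the same sign, discordant when
    the signs differ (ties contribute zero here, for simplicity).
    """
    n = len(pairs)
    s = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        prod = (x1 - x2) * (y1 - y2)
        s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Perfectly concordant data gives tau = 1.
print(kendall_tau([(1, 2), (2, 4), (3, 5)]))  # -> 1.0
```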
The following provides an expression for Kendall’s tau of random variables with a BVGCR2 distribution.
Proposition 2. Suppose BVGCR2(,,,;g). Then, Kendall’s tau between and is expressed as where
Remark 1. It is important to note that These inequalities can be proven using the symmetry of , and integration by parts. Specifically,
Clearly and , for . Then, by the fact , we have From this and by Equations (8) and (10), we obtain the following bounds for and : Note that and attain their maxima when = and γ attains its maximum of 1/4.
We also remark that if , and thus , the joint density of the BVGCR2 model can be expressed as a product of two GCR2 marginals, i.e., the two random variables are independent.
3.3. Tail Independence
As remarked in the previous section, the dependence measures of the proposed BVGCR2 model are bounded, and thus the model may not be suitable for cases where extreme dependence is required, e.g., during market turmoil. Here, we further investigate the tail dependence of the BVGCR2 model in terms of the upper-tail dependence coefficient, which is defined as
where the last equality follows from L’Hôpital’s rule. In fact,
Since
,
, and
, we have
Then, by Equation (
12), we have
, i.e., the BVGCR2 model lacks upper-tail dependence.
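For reference, since the displayed definition was not reproduced in this version, the upper-tail dependence coefficient examined above can be written in standard copula notation (with $C$ the copula of the pair and $F_1^{-1}$, $F_2^{-1}$ the marginal quantile functions) as

```latex
\lambda_U = \lim_{u \to 1^-} \Pr\bigl(T_2 > F_2^{-1}(u) \,\big|\, T_1 > F_1^{-1}(u)\bigr)
          = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u},
```

and $\lambda_U = 0$ corresponds to upper-tail independence, consistent with the conclusion above.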
4. Model Parameter Estimation
In this section, we discuss parameter estimation for the bivariate Type-II generalized crack distribution using the expectation-maximization algorithm. We briefly review the EM algorithm in a general setting and provide a specific application to the BVGCR2 models.
4.1. Maximum Likelihood Estimation
Suppose
is an independent and identically distributed random sample drawn from a density
. The likelihood function is given as
The maximum likelihood estimation (MLE) aims to find the parameter estimate that maximizes the likelihood function, or equivalently, the log-likelihood function:
In the presence of latent (hidden/unobserved) values, however, direct maximization often does not provide reliable estimates; hence, alternative methods are usually considered, and one such method is the expectation-maximization algorithm.
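When there is no latent structure, direct maximization is often straightforward. As a toy illustration (an exponential sample, unrelated to the paper's models), a crude grid search over the log-likelihood recovers the closed-form MLE $\hat{\lambda} = n/\sum_i x_i$:

```python
import math

# Toy illustration of direct maximum likelihood (not the paper's model):
# for an i.i.d. exponential sample, the log-likelihood
#     l(lam) = n*log(lam) - lam*sum(x)
# is maximized at lam_hat = n / sum(x); we confirm this numerically.
data = [0.5, 1.2, 0.3, 2.0, 0.8]  # hypothetical observations

def loglik(lam):
    return len(data) * math.log(lam) - lam * sum(data)

grid = [i / 1000 for i in range(1, 5000)]  # candidate values of lam
lam_hat = max(grid, key=loglik)            # numerical maximizer
closed_form = len(data) / sum(data)        # analytic MLE

print(round(lam_hat, 3), round(closed_form, 3))
```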
4.2. Expectation-Maximization Algorithm
For simple mixture models with a small number of mixture components, the direct optimization of the log-likelihood function may be used to obtain the maximum likelihood estimates of the model parameters. However, the direct optimization often fails to converge when the number of mixture components is large relative to the sample size.
The expectation-maximization (EM) algorithm [
19] is an efficient iterative procedure for computing maximum likelihood estimates in the presence of missing or hidden data. When applied to finite mixture models, the expectation step separates the mixture weight parameters from the other model parameters in the optimization, and the maximization step gives an explicit solution for updating the mixture weights. For this reason, the algorithm is the most widely used maximum likelihood estimation technique for finite mixture models. For details on recent developments of EM-type algorithms for the Poisson mixture model and multivariate Gaussian mixture models in complex data settings, see [
20,
21], respectively.
Here, we briefly review the general form of the EM algorithm. Suppose we have observed data
with density
and some latent (hidden/unobserved data)
with density
. The density of the complete data is denoted by
. The goal of the EM algorithm is to find the MLE, i.e., the maximum of the observed data likelihood function,
The EM algorithm proceeds by iterating between the following two steps:
Lemma 1. The EM algorithm improves . That is, if , then .
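As a generic illustration of this alternation (not the BVGCR2 algorithm itself), consider EM for a two-component univariate Gaussian mixture with a known common standard deviation; both steps have the closed forms used below:

```python
import math, random

# Generic EM sketch for a two-component Gaussian mixture, illustrating
# the E-step/M-step alternation described above.  The common standard
# deviation is held fixed at its true value of 1 for brevity.
random.seed(1)
data = ([random.gauss(0, 1) for _ in range(200)] +
        [random.gauss(5, 1) for _ in range(200)])

def phi(x, mu, sd):
    # Normal density with mean mu and standard deviation sd
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

p, mu1, mu2, sd = 0.5, -1.0, 6.0, 1.0  # crude initial values
for _ in range(50):
    # E-step: posterior probability (responsibility) of component 1
    r = [p * phi(x, mu1, sd) / (p * phi(x, mu1, sd) + (1 - p) * phi(x, mu2, sd))
         for x in data]
    # M-step: closed-form updates given the responsibilities
    p = sum(r) / len(data)
    mu1 = sum(ri * x for ri, x in zip(r, data)) / sum(r)
    mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / (len(data) - sum(r))

print(round(p, 2), round(mu1, 1), round(mu2, 1))
```

The estimates converge near the true values (weight 0.5, means 0 and 5), and by Lemma 1, each iteration cannot decrease the observed-data log-likelihood.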
We now provide the EM algorithm for the estimation of the parameters in the BVGCR2 model. Let
be a random sample drawn from a BVGCR2(
;
g) distribution, where
is a pair of observations for each
, and the base density
g may have its own parameter(s) such as
involved in the Student’s
t density. We denote the vector of parameters involved in the base densities by
. Letting
be the vector of all model parameters, the likelihood function based on the incomplete data is
where
is the set of indexes and
On the other hand, letting
be a set of latent variables where Pr
for each
, the likelihood function based on the (augmented) complete data is
where
denotes the indicator function.
Let
denote the current estimate of
after the
m-th iteration of the EM algorithm. Then, by Bayes’ theorem, we have
The expectation step (E-step) of the EM algorithm follows.
The maximization step (M-step) finds the updated parameter estimates that maximize the objective function (
13), which separates
and the other parameters (
). The update of
can be dealt with separately by applying the method of Lagrange multipliers. The updated estimate is
The updated estimates
and
are the maximizers of the objective function
where
and
That is, the maximization for can proceed separately from that for , and thus the dimensionality of the optimization problem is reduced significantly.
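The Lagrange-multiplier solution for the mixture weights takes a simple form: each updated weight is the average of the corresponding responsibilities from the E-step. A sketch with hypothetical responsibility values:

```python
# Sketch of the mixture-weight M-step: given responsibilities r[i][k]
# (posterior component probabilities from the E-step), the Lagrange-
# multiplier solution for the updated weights is their per-component
# average.  The values below are hypothetical, for three observations
# and four mixture components.
r = [
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]

n = len(r)
weights = [sum(row[k] for row in r) / n for k in range(4)]
print(weights)  # entries lie in [0, 1] and sum to 1
```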
5. Applications
In [
12], the usefulness of the univariate GCR2 models for heavy-tailed data modeling has been demonstrated through an application with a real loss dataset. In this section, we fit several bivariate Type-II generalized crack distributions on a real catastrophic loss dataset compiled from the International Disaster Database (EM-DAT,
www.emdat.be, accessed on 22 November 2024). Specifically, each observation in the dataset is composed of two variables: ‘Meteo’ and ‘Hydro’. Marginally, ‘Meteo’ is a quarterly time series of (estimated) losses from meteorological disasters such as storms and extreme temperatures, spanning from 1950 to 2022 in Asia, and ‘Hydro’ is a series of (estimated) losses due to hydrological disasters such as floods and landslides over the same geographical area and time span. For bivariate model fitting, we remove the pairs with missing observations, resulting in a bivariate dataset with 166 observations. The losses are inflation-adjusted to be equivalent to US dollar values in 2021.
Table 1 presents summary statistics of losses due to meteorological and hydrological disasters in Asia. The descriptive statistics, along with the histogram and the normal Q-Q plots (
Figure 10 and
Figure 11), suggest that the marginal distributions of the two variables are both positively skewed and heavy-tailed. Many time series include some deterministic components such as long-term trends and seasonality. To isolate the deterministic components, each quarterly time series is decomposed under the multiplicative model assumption. The time series decompositions given in
Figure 12 show the presence of strong seasonality in both time series and only weak evidence of a long-term trend. Since the proposed GCR2 models do not assume any deterministic seasonality, we deseasonalize each time series by dividing it by its estimated seasonal component.
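Under the multiplicative model, deseasonalization amounts to dividing each observation by its quarter's seasonal index. A toy sketch on a hypothetical quarterly series (the EM-DAT data are not reproduced here), with the indices normalized to average one:

```python
# Multiplicative deseasonalization sketch on a hypothetical quarterly
# series (three years of data; NOT the EM-DAT losses): estimate a
# seasonal index per quarter as that quarter's mean level, normalized
# to average 1, then divide it out of the series.
series = [10, 20, 30, 40, 12, 22, 33, 44, 11, 21, 32, 42]

quarters = [series[q::4] for q in range(4)]        # values of quarter q across years
raw = [sum(q) / len(q) for q in quarters]          # mean level per quarter
mean_level = sum(raw) / 4
index = [r / mean_level for r in raw]              # seasonal indices, averaging to 1

deseasonalized = [x / index[i % 4] for i, x in enumerate(series)]
```

A full treatment would first detrend the series (e.g., via a centered moving average) before averaging by quarter; the sketch keeps only the seasonal step discussed in the text.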
Figure 13 gives the scatter plots of the two seasonally adjusted variables and their log-transformations. The figure shows some evidence of a dependent relationship between the two variables. The sample Spearman’s rho and Kendall’s tau of the variables are 0.425 and 0.291, respectively.
We apply the EM algorithm described in
Section 4.2 to fit the (seasonally adjusted) data using six specific bivariate models: GCR-GG, GCR-
t, GCR-LG, GCR2-GG, GCR2-
t, and GCR2-LG. For each model fitting, the EM algorithm requires initialization of the parameter values. To obtain a reliable result, we first fit the marginal models separately using the estimation method given in [
12], and the fitted values of the marginal parameters, i.e.,
for GCR models and
, for the GCR2 models, are used for the initialization of the BVGCR (or BVGCR2) model parameters. For mixture weight parameter initialization, we use a large set of possible parameter values satisfying
and
, where
and
are the mixture weight parameter estimates for the marginal models. The log-likelihood function values of the EM fits under the set of initializations are compared, and the one that gives the largest log-likelihood value is selected for the final fit. To compare the performance of BVGCR2 models with some other benchmark bivariate models commonly used in loss modeling, we further implement the maximum likelihood estimation of the following seven bivariate models: BVLNorm (bivariate lognormal), Clayton-LNorm (bivariate Clayton copula with lognormal marginals), Gumbel-LNorm (bivariate Gumbel copula with lognormal marginals), Frank-LNorm (bivariate Frank copula with lognormal marginals), Clayton-Pareto (bivariate Clayton copula with two-parameter Pareto marginals), Gumbel-Pareto (bivariate Gumbel copula with two-parameter Pareto marginals), and Frank-Pareto (bivariate Frank copula with two-parameter Pareto marginals) models.
Since the number of estimated model parameters differs by model, we compare the fits of candidate models in terms of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), defined as
respectively, where
k is the number of estimated parameters and
n is the sample size. Under either criterion, the preferred model among the candidates is the one with the smallest value.
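Since the displayed definitions were not reproduced in this version, we record the standard forms, consistent with the description of $k$ and $n$ above:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L},
```

where $\hat{L}$ denotes the maximized likelihood of the fitted model.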
From
Table 2, we see that, based on the Akaike information criterion, the fitted BVGCR2-GG model outperforms all the other alternative bivariate models. However, the BVGCR-LG model is preferred in terms of the Bayesian information criterion, which heavily penalizes complex models. Comparing the BVGCR-GG to the BVGCR2-GG, one can see that the BVGCR2-GG significantly improves model fitting due to the additional shape parameter.
Table 3 gives parameter estimates of the fitted marginal distributions and the bivariate GCR2-GG model.
Please note that the values of and based on the fitted BVGCR2-GG model are larger than the corresponding estimates of and . Spearman’s rho and Kendall’s tau under the fitted BVGCR2-GG model are 0.173 and 0.115, respectively, which are lower than the empirical counterparts. This may be because the empirical data contains some spurious dependence due to the deterministic trends.
To test the statistical significance of the parameter estimates of the BVGCR2-GG model, we construct bootstrap confidence intervals by computing the 2.5th and the 97.5th percentiles of the estimates based on 100 bootstrap samples.
Table 4 shows that all the parameter estimates on the original dataset fall in the corresponding 95% bootstrap confidence intervals.
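The percentile bootstrap described above can be sketched as follows (toy statistic: the sample mean of simulated lognormal data, not the BVGCR2-GG parameter estimates):

```python
import random

# Percentile-bootstrap sketch: resample the data with replacement,
# re-estimate the statistic on each resample, and take the 2.5th and
# 97.5th percentiles of the bootstrap estimates as the 95% interval.
random.seed(42)
data = [random.lognormvariate(0, 1) for _ in range(100)]  # toy sample

boot = []
for _ in range(100):                       # 100 bootstrap samples, as in the text
    resample = random.choices(data, k=len(data))
    boot.append(sum(resample) / len(resample))

boot.sort()
lo, hi = boot[2], boot[97]                 # ~2.5th and ~97.5th percentiles
# The original estimate typically falls inside (lo, hi), mirroring the
# significance check reported in Table 4.
print(lo, hi)
```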
Figure 14 gives the contour plots of the fitted BVGCR2-GG model and of the log-transformed random variables, and
Figure 15 presents the scatter plots of the simulated random variables from the BVGCR2-GG model and their log-transformations. Comparing these plots with
Figure 13, we can see that the fitted model explains the dependence structure of the empirical data well.
6. Concluding Remarks
In this paper, we constructed a bivariate extension of the Type-II generalized crack distribution and studied a few specific examples of bivariate GCR2 distributions based on the generalized Gaussian, Student’s t, and logistic densities to demonstrate the applicability of the constructed model. Specifically, our main theoretical finding is that the level of dependence of the constructed BVGCR2 model, in terms of Kendall’s tau and Spearman’s rho, is limited to weak-to-moderate association. The model fitting to catastrophic loss data showed that the fitted BVGCR2-GG model outperformed all the other alternative models based on the Akaike information criterion. In particular, when compared to the BVGCR-GG model, the BVGCR2-GG model showed a significant improvement due to its increased flexibility. With an appropriate choice of base-density function, the proposed BVGCR2 model can be effectively used for various applications that require a weak to moderate level of dependence.
Given its lack of upper-tail dependence, the bivariate GCR2 distribution may not be applicable in situations where variables are (or are assumed to be) asymptotically dependent in the upper tail, e.g., stress testing for market/credit portfolios. This limitation can be alleviated by using a common parameter for both marginals and randomizing that parameter. For example, we may set , the shape parameters in the GCR2 marginals, and assume follows a Gamma distribution. The use of a common random parameter is expected to widen the range of attainable dependence levels and allow for upper-tail dependence. We will pursue this approach in future research.