Article

Application of a Vine Copula for Multi-Line Insurance Reserving

1
Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
2
Department of Statistics, University of Connecticut, 215 Glenbrook Rd. U-4120, Storrs, CT 06269-4120, USA
*
Author to whom correspondence should be addressed.
Risks 2020, 8(4), 111; https://doi.org/10.3390/risks8040111
Submission received: 11 September 2020 / Revised: 13 October 2020 / Accepted: 14 October 2020 / Published: 21 October 2020
(This article belongs to the Special Issue Data Mining in Actuarial Science: Theory and Applications)

Abstract

This article introduces a novel use of the vine copula to capture dependence among multi-line claim triangles, especially when an insurance portfolio consists of more than two lines of business. First, we suggest a way to choose an optimal joint loss development model for multiple lines of business that considers the marginal distributions, the vine copula structure, and the choice of family for each pair copula. The performance of the model is then demonstrated with Bayesian model diagnostics and out-of-sample validation measures. Finally, we discuss an implication of the dependence modeling, which allows a company to analyze and establish the risk capital for the whole portfolio.

1. Introduction

Various statistical approaches have been proposed to determine the loss reserve for a single line of business. According to Mack (1993), the chain-ladder method, which has long been used as a rule of thumb for determining reserves, can be interpreted as a nonparametric stochastic model. Since then, many univariate stochastic reserving models have been developed and used both for point estimation of reserves and for risk management. For detailed discussions of univariate stochastic reserving models, see England and Verrall (2002).
However, most insurance companies run more than a single line of business, so one needs to consider possible dependence among the lines of business in reserve modeling. Consequently, stochastic reserving models need to be extended to multivariate frameworks. In this regard, some actuarial literature has focused on extending the chain-ladder method to multivariate cases, such as Braun (2004), Schmidt (2006), and Merz and Wüthrich (2008), followed by Shi et al. (2012), who used a bivariate normal distribution to model multi-line reserves. Besides the dependence among multiple lines of business, there has been some work on the dependence between paid claim triangles and incurred claim triangles, such as Zhang (2010) and Merz and Wüthrich (2010).
Note that the methods mentioned above are less flexible, since they do not allow one to disentangle the marginal distributions from the multivariate association structure. Therefore, copulas were introduced to the multivariate reserving problem, since they allow us to specify the marginal distributions and the association structure separately. Based on this idea, Shi and Frees (2011) proposed the use of a bivariate Gaussian copula to model dependence among multi-line reserves, where the marginal distributions are chosen as Gaussian and gamma, respectively. Peters et al. (2014) considered dependence between paid and incurred claim triangles via a bivariate copula. Further, Abdallah et al. (2015) used the idea of the hierarchical Archimedean copula to capture the calendar year effect and multi-line dependence simultaneously.
Although there have been some works on the multivariate reserving in actuarial literature, applications of all the aforementioned works are restricted only to possible dependence between two lines of business. Indeed, it is natural that a property and casualty insurance company runs more than two lines of business, so we need to incorporate such a high-dimensional dependence structure in our reserve modeling.
In that regard, we apply the idea of a vine copula in this article, which uses bivariate copulas as its building blocks and connects them with vine structure to describe the high-dimensional association in a flexible way. The vine copula has been widely used in the actuarial and financial literature. Loaiza Maya et al. (2015) investigated dependency among the exchange rates of Latin American countries using the vine copula. Reboredo and Ugolini (2016), Arreola Hernandez et al. (2017), and Trucíos et al. (2020) used the vine copula to assess systemic risks due to possible dependence among economic subjects. Surprisingly, usage of the vine copula in property and casualty insurance literature does exist but is scarce. For example, Shi and Yang (2018) used a vine copula to capture the serial dependence of claim amounts and derive the experience ratemaking factor.
This paper is organized as follows. In Section 2, the basic concept of the vine copula is introduced and the model selection procedure to be implemented is specified. In Section 3, we describe the data used for our empirical analysis. In Section 4, we go through the model selection procedure to determine the marginal distributions and the vine copula structure to be used in our analysis. In Section 5, we discuss the implications of the estimation results from the perspective of enterprise risk management using the predictive distribution of unpaid claims. Section 6 addresses practical issues in implementing the proposed methodology. Finally, we conclude this article in Section 7 by providing some future directions of research.

2. Proposed Methodology

Suppose an insurance company owns a portfolio which consists of $N$ lines of business. By assuming balanced observations in multiple triangles, one can write the multivariate cumulative paid claims as $Y_{ij} = (Y_{ij}^{(1)}, Y_{ij}^{(2)}, \ldots, Y_{ij}^{(N)})$, where $n = 1, \ldots, N$ indicates a claim triangle from the $n$th line of business, $i = 1, \ldots, I$ indexes the accident year, and $j = 1, \ldots, J$ denotes the development lag. In general, it is of interest to predict the cumulative paid claim for the next year given information up to the current year, which can be written as $E[Y_{i,j+1}^{(n)} \mid Y_{i,j}^{(n)}]$. Therefore, instead of working with $Y_{ij}$, one can directly model the incremental development of claims, or age-to-age factors, as follows:
$$D_{ij} = (D_{ij}^{(1)}, D_{ij}^{(2)}, \ldots, D_{ij}^{(N)}),$$
where $D_{i,j+1}^{(n)} := Y_{i,j+1}^{(n)} / Y_{i,j}^{(n)}$ so that $E[Y_{i,j+1}^{(n)} \mid Y_{i,j}^{(n)}] = Y_{i,j}^{(n)} \, E[D_{i,j+1}^{(n)}]$. Since the elements of each $D^{(n)} = (D_{ij}^{(n)})_{i = 1, \ldots, I,\; j = 1, \ldots, J}$ are observed from the same business line, it is natural to model the marginal distribution of each $D^{(n)}$ and then the dependence structure among them via copulas, in order to jointly model $D^{(n)}$, $n = 1, 2, \ldots, N$.
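The definition of the age-to-age factors can be illustrated with a small sketch. The triangle below is hypothetical (it is not taken from the dataset used later), and the function name `age_to_age_factors` is an assumption made for illustration only.

```python
def age_to_age_factors(triangle):
    """Return D[i][j] = Y[i][j+1] / Y[i][j] for each observed pair of lags.

    `triangle` is a list of rows (accident years); row i holds the
    cumulative paid claims observed so far, so later rows are shorter.
    """
    return [
        [row[j + 1] / row[j] for j in range(len(row) - 1)]
        for row in triangle
    ]


# Hypothetical cumulative paid claims for one line of business (I = J = 3).
toy_triangle = [
    [100.0, 150.0, 165.0],  # accident year 1: lags 1..3 observed
    [120.0, 168.0],         # accident year 2: lags 1..2 observed
    [110.0],                # accident year 3: only lag 1 observed
]

factors = age_to_age_factors(toy_triangle)
```

Each row of `factors` has one entry fewer than the corresponding row of the triangle, which is why the sums over $j$ in the likelihood start at $j = 2$.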
According to Sklar (1959), if all marginal distribution functions are continuous, then there is a unique function $C : [0, 1]^m \to [0, 1]$ such that
$$H(x_1, x_2, \ldots, x_m) = C(F_1(x_1), F_2(x_2), \ldots, F_m(x_m)),$$
where $F_i$ denotes the marginal distribution function of $X_i$ and $H$ denotes the joint distribution function of $(X_1, X_2, \ldots, X_m)$. Therefore, the use of copulas allows us to capture the association among jointly distributed response variables, which may follow different marginal distributions. In that sense, by letting $\theta = (\theta^{(1)}, \ldots, \theta^{(N)})$, where $\theta^{(n)}$ is the parameter vector for the marginal distribution of $D^{(n)}$ and $\phi$ is the copula parameter, the log-likelihood for the joint distribution is given as follows:
$$\ell(\theta, \phi \mid d) = \sum_{i=1}^{I} \sum_{j=2}^{J+1-i} \log h\left(d_{ij}^{(1)}, \ldots, d_{ij}^{(N)} \mid \theta, \phi\right) = \sum_{i=1}^{I} \sum_{j=2}^{J+1-i} \left[ \sum_{n=1}^{N} \log f^{(n)}\left(d_{ij}^{(n)} \mid \theta^{(n)}\right) + \log c_{\phi}\left(F^{(1)}(d_{ij}^{(1)} \mid \theta^{(1)}), \ldots, F^{(N)}(d_{ij}^{(N)} \mid \theta^{(N)})\right) \right].$$
Here $h$ denotes the joint density of $(D^{(1)}, \ldots, D^{(N)})$, and we assume that the $d_{ij}^{(n)}$ ($i = 1, \ldots, I$, $j = 2, \ldots, J$) are independent for fixed $n$.
Depending on the sign of dependence, we may use different families of copulas. For example, in order to capture positive dependence, one can suggest the following bivariate copulas:
$$\begin{aligned}
\text{Gumbel}: \quad & C_G(u, v) = \exp\left( -\left[ (-\log u)^{\phi} + (-\log v)^{\phi} \right]^{1/\phi} \right), && \phi \in [1, \infty), \\
\text{Clayton}: \quad & C_C(u, v) = \left( u^{-\phi} + v^{-\phi} - 1 \right)^{-1/\phi}, && \phi \in (0, \infty).
\end{aligned}$$
Although those two copulas can only capture positive dependence, one can easily rotate them in order to capture reversed tail behavior and negative dependence as follows:
$$\begin{aligned}
\text{90° rotated Gumbel}: \quad & C_{G90}(u, v) = v - C_G(1 - u, v), \\
\text{90° rotated Clayton}: \quad & C_{C90}(u, v) = v - C_C(1 - u, v), \\
\text{Survival Gumbel}: \quad & C_{SG}(u, v) = u + v - 1 + C_G(1 - u, 1 - v), \\
\text{Survival Clayton}: \quad & C_{SC}(u, v) = u + v - 1 + C_C(1 - u, 1 - v), \\
\text{270° rotated Gumbel}: \quad & C_{G270}(u, v) = u - C_G(u, 1 - v), \\
\text{270° rotated Clayton}: \quad & C_{C270}(u, v) = u - C_C(u, 1 - v).
\end{aligned}$$
Further, the Gaussian copula and Frank copula are also prevalent choices which can capture both positive and negative dependence:
$$\begin{aligned}
\text{Gaussian}: \quad & C_N(u, v) = \Phi_2\left( \Phi^{-1}(u), \Phi^{-1}(v); \phi \right), && \phi \in [-1, 1], \\
\text{Frank}: \quad & C_F(u, v) = -\frac{1}{\phi} \log\left( 1 + \frac{(\exp(-\phi u) - 1)(\exp(-\phi v) - 1)}{\exp(-\phi) - 1} \right), && \phi \in \mathbb{R} \setminus \{0\},
\end{aligned}$$
where $\Phi_2(\cdot, \cdot\,; \phi)$ stands for the cumulative distribution function of a bivariate standard normal random vector with correlation $\phi$ and $\Phi(\cdot)$ stands for the cumulative distribution function of a univariate standard normal random variable.
The families of copulas given above allow us to consider various facets of possible dependence among the lines of business, including the upper and lower tail dependence properties. For example, the Clayton copula captures positive association and has lower tail dependence but no upper tail dependence, whereas the Gumbel copula captures positive association and has upper tail dependence but no lower tail dependence. Further, both the Frank and Gaussian copulas are symmetric and able to capture positive and negative dependence, but they have no tail dependence. Note that upper tail dependence of an original copula corresponds to lower tail dependence of its survival copula, and vice versa. Further, if an original copula captures positive association, then the corresponding 90°- or 270°-rotated copula captures negative association; for example, the 90°-rotated Clayton and 270°-rotated Gumbel copulas can be used to capture negative associations. We refer the readers to Nelsen (1999), Embrechts et al. (2001), and Hua and Joe (2011) for a more detailed explanation. The variety of tail behaviors is potentially important considering the data used in our empirical analysis. Here we capture the dependencies among the incremental development factors of different lines of business, and these factors are naturally quite large in the initial years and become smaller in the mature years. In this regard, upper tail dependence corresponds to association among the loss development factors in the initial years, whereas lower tail dependence corresponds to association among the loss development factors in the mature years.
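As a minimal sketch of the families discussed above, the following code evaluates the Clayton and Gumbel copulas, two of their rotations, and the standard closed-form tail dependence coefficients ($2^{-1/\phi}$ for the lower tail of Clayton and $2 - 2^{1/\phi}$ for the upper tail of Gumbel). The function names are illustrative assumptions, and the inputs are assumed to lie strictly inside $(0, 1)$.

```python
import math


def clayton(u, v, phi):
    """Clayton copula C_C(u, v) for phi > 0."""
    return (u ** -phi + v ** -phi - 1.0) ** (-1.0 / phi)


def gumbel(u, v, phi):
    """Gumbel copula C_G(u, v) for phi >= 1."""
    s = (-math.log(u)) ** phi + (-math.log(v)) ** phi
    return math.exp(-s ** (1.0 / phi))


def rotate90(copula):
    """90-degree rotation: C_90(u, v) = v - C(1 - u, v)."""
    return lambda u, v, phi: v - copula(1.0 - u, v, phi)


def survival(copula):
    """Survival (180-degree) rotation: u + v - 1 + C(1 - u, 1 - v)."""
    return lambda u, v, phi: u + v - 1.0 + copula(1.0 - u, 1.0 - v, phi)


def lower_tail_clayton(phi):
    """Lower tail dependence coefficient of the Clayton copula."""
    return 2.0 ** (-1.0 / phi)


def upper_tail_gumbel(phi):
    """Upper tail dependence coefficient of the Gumbel copula."""
    return 2.0 - 2.0 ** (1.0 / phi)
```

A value of the rotated copula below $uv$, as in `rotate90(clayton)(0.3, 0.5, 2.0)`, reflects the negative association introduced by the rotation.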
If $N = 2$, then the dependence can be captured by a bivariate copula, which has been explored in previous works such as Shi and Frees (2011). However, we cannot preclude the possibility that $N > 2$, and one can suggest the use of a vine copula to capture the multivariate dependence when there are more than two lines of business. The concept of the vine copula was introduced by Aas et al. (2009), who extended the research conducted by Bedford and Cooke (2001) and Bedford and Cooke (2002) to show how multivariate data, which exhibit complex patterns of dependence in the tails, can be modeled using a cascade of pair copulas acting on two variables at a time. Later, Czado (2010) and Joe and Kurowicka (2011) developed this idea further and provided a definite road map for constructing vine copulas to represent complex multivariate data. The core concept behind the construction of vine copulas is that a $d$-dimensional density $f(x_1, x_2, \ldots, x_d)$ can be represented as a product of pair copula densities and marginal densities. For example, consider the case $d = 3$. A possible way to represent the joint density $f(x_1, x_2, x_3)$ is as follows:
$$f(x_1, x_2, x_3) = f_{3|12}(x_3 \mid x_1, x_2) \, f_{2|1}(x_2 \mid x_1) \, f_1(x_1).$$
Additionally, we have
$$\begin{aligned}
f_{2|1}(x_2 \mid x_1) &= c_{12}(F_1(x_1), F_2(x_2)) \, f_2(x_2), \\
f_{3|12}(x_3 \mid x_1, x_2) &= c_{13|2}\left( F_{1|2}(x_1 \mid x_2), F_{3|2}(x_3 \mid x_2) \mid x_2 \right) f_{3|2}(x_3 \mid x_2), \\
f_{3|2}(x_3 \mid x_2) &= c_{23}(F_2(x_2), F_3(x_3)) \, f_3(x_3),
\end{aligned}$$
where $c_{ij}(F_i(x_i), F_j(x_j))$ is the joint copula density of the random variables $X_i$ and $X_j$. Hence, one can write out the joint density of $X_1, X_2, X_3$ as follows:
$$\begin{aligned}
f(x_1, x_2, x_3) = {} & f_1(x_1) \, f_2(x_2) \, f_3(x_3) && \text{(marginals)} \\
& \times c_{12}(F_1(x_1), F_2(x_2)) \, c_{23}(F_2(x_2), F_3(x_3)) && \text{(unconditional pairs)} \\
& \times c_{13|2}\left( F_{1|2}(x_1 \mid x_2), F_{3|2}(x_3 \mid x_2) \mid x_2 \right) && \text{(conditional pair)}.
\end{aligned}$$
Note that in general, $c_{13|2}$ is affected by $x_2$ not only through $F_{1|2}(x_1 \mid x_2)$ and $F_{3|2}(x_3 \mid x_2)$, but also directly through $x_2$. However, such a general form of the conditional copula density is cumbersome to use in statistical inference. For this reason, it is tempting to use a simplified form of the conditional copula density in which $c_{13|2}$ is affected by $x_2$ only through $F_{1|2}(x_1 \mid x_2)$ and $F_{3|2}(x_3 \mid x_2)$. Haff et al. (2010) showed that each conditional pair copula density can be written in a simplified form for elliptical distributions with a positive definite scale matrix. Further, it was also shown that the simplified form can be a good approximation even when such an assumption cannot be fully validated. Hence, we use the simplified form of the conditional copula density in the remainder of this article.
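The three-dimensional decomposition under the simplifying assumption can be sketched in code. The example below uses Clayton pair copulas throughout purely for illustration (the article selects families from data), and the conditional distribution functions $F_{1|2}$ and $F_{3|2}$ are obtained on the copula scale as partial derivatives of the pair copula, often called h-functions. All function names are assumptions.

```python
def clayton_density(u, v, phi):
    """Clayton pair-copula density c(u, v) for phi > 0."""
    s = u ** -phi + v ** -phi - 1.0
    return (1.0 + phi) * (u * v) ** (-(1.0 + phi)) * s ** (-(2.0 + 1.0 / phi))


def clayton_h(u, v, phi):
    """Conditional CDF of U given V = v, i.e. dC(u, v)/dv (the h-function)."""
    s = u ** -phi + v ** -phi - 1.0
    return v ** (-(1.0 + phi)) * s ** (-(1.0 + 1.0 / phi))


def vine_copula_density_3d(u1, u2, u3, phi12, phi23, phi13_2):
    """Copula part of f(x1, x2, x3) with first-tree edges 1-2 and 2-3.

    Inputs u_k = F_k(x_k) are on the uniform scale. The conditional pair
    copula depends on x2 only through the two conditional CDFs, which is
    the simplifying assumption discussed in the text.
    """
    u1_given_2 = clayton_h(u1, u2, phi12)   # F_{1|2}(x1 | x2)
    u3_given_2 = clayton_h(u3, u2, phi23)   # F_{3|2}(x3 | x2)
    return (
        clayton_density(u1, u2, phi12)
        * clayton_density(u2, u3, phi23)
        * clayton_density(u1_given_2, u3_given_2, phi13_2)
    )
```

Multiplying the returned value by the three marginal densities gives the joint density in the last display. As the three parameters approach zero, every Clayton factor approaches the independence copula and the product approaches 1.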
The sequential dependence structure via pair copulas is described with a regular vine, or R-vine. Let $V = (T_1, \ldots, T_{d-1})$ be a set of trees where each tree $T_j = (N_j, E_j)$ is connected and $N_1 = \{1, \ldots, d\}$. Here $N_j$ and $E_j$ denote the nodes and edges of $T_j$, respectively. If $N_j = E_{j-1}$ for $j = 2, \ldots, d-1$, then $V$ is called a vine. Further, if
$$|e \cap e'| = 1 \quad \text{for } \{e, e'\} \in E_j \text{ and } j = 2, \ldots, d-1, \tag{1}$$
then $V$ is called an R-vine. There are two important subclasses of R-vines, the C-vine and the D-vine. An R-vine is called a C-vine if each tree $T_j$ has a node $n \in N_j$ of degree $d - j$. A D-vine is an R-vine wherein the degree of every node is at most 2.
The above decomposition of the joint density in terms of a vine structure is not unique. For example, when $d = 3$, there are three available R-vine structures, as shown in Figure 1. Note that for 3-dimensional R-vine structures, the C-vine and the D-vine are identical, which is no longer true for $d > 3$.
Note that the number of regular vine structures on $d$ variables grows super-exponentially: there are $\frac{d!}{2} \cdot 2^{\binom{d-2}{2}}$ regular vine structures on $d$ variables, as shown in Joe and Kurowicka (2011). Therefore, it is already impractical to display every possible vine structure if $d \geq 5$.
Therefore, for the calibration of a vine copula model, we need to carefully determine the following components based on the observed data:
  • Marginal distribution for each $D^{(n)}$;
  • Optimal $d$-dimensional vine structure;
  • Copula family for each pair copula in the selected vine structure.
For statistical inference, one may consider either the frequentist or the Bayesian approach and subsequently choose the optimal structure via a model selection procedure. However, choosing an optimal model can be computationally demanding in both approaches. For example, as mentioned in Table 1 of Gruber and Czado (2015), the size of the search space of a vine copula model for $d = 5$ is already $1.3559 \times 10^{11}$ when we consider seven copula families for each of the pair copulas, which makes an exhaustive search computationally infeasible. In this regard, Dissmann et al. (2013) proposed a top-down approach to choose the vine structure from a frequentist view. Gruber and Czado (2018) proposed sequential vine copula model selection from a Bayesian perspective. Schamberger et al. (2017) also considered a Bayesian approach to model the dependence structure with a factor copula in order to handle the dimensionality issue. Recently, Kreuzer and Czado (2019) showed that the pair copula families can be chosen together with their parameters using Hamiltonian MCMC.
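The size of the search space quoted above can be reproduced with a few lines of arithmetic. The sketch below uses the standard count of $\frac{d!}{2} \cdot 2^{\binom{d-2}{2}}$ regular vine structures and multiplies it by the number of ways to assign one of the candidate families to each of the $\binom{d}{2}$ pair copulas; the function names are illustrative assumptions.

```python
from math import comb, factorial


def count_rvine_structures(d):
    """Number of regular vine tree structures on d variables (d >= 3)."""
    return factorial(d) // 2 * 2 ** comb(d - 2, 2)


def search_space_size(d, n_families):
    """Structures times family assignments for the d(d-1)/2 pair copulas."""
    return count_rvine_structures(d) * n_families ** comb(d, 2)
```

For $d = 3$ the count is 3, matching the three structures mentioned for the three-dimensional case, and `search_space_size(5, 7)` equals 135,588,119,520, which is approximately $1.3559 \times 10^{11}$.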
The Bayesian approach can be beneficial in reserve modeling since it can incorporate the uncertainty of parameter estimation and reflect it in the predictive distribution of future unpaid claims, as proposed in Shi et al. (2012). Therefore, we propose the following two-step approach, which can be considered a compromise between Dissmann et al. (2013) and Gruber and Czado (2018):
  • Estimate the parameters with Bayesian inference and choose the marginal distribution based on Bayesian model selection criteria, such as the deviance information criterion (DIC) and the logarithm of the pseudomarginal likelihood (LPML). The DIC for the marginal distribution of the $n$th line, proposed by Spiegelhalter et al. (2002), is defined as
    $$\mathrm{DIC}^{(n)} = -4 \int \ell(\theta^{(n)} \mid d^{(n)}) \, \pi(\theta^{(n)} \mid d^{(n)}) \, d\theta^{(n)} + 2 \, \ell(\tilde{\theta}^{(n)} \mid d^{(n)}),$$
    where $\tilde{\theta}^{(n)} = E[\theta^{(n)} \mid d^{(n)}]$ and $d^{(n)} = \{ d_{i,2}^{(n)}, \ldots, d_{i,J-i+1}^{(n)} \mid i = 1, \ldots, I \}$ are the observed values of incremental loss developments from the $n$th line of business. Since this integral is rarely available in closed form, one can estimate the DIC as follows:
    $$\widehat{\mathrm{DIC}}^{(n)} = -\frac{4}{S} \sum_{s=1}^{S} \ell(\theta_{[s]}^{(n)} \mid d^{(n)}) + 2 \, \ell\left( \frac{1}{S} \sum_{s=1}^{S} \theta_{[s]}^{(n)} \,\Big|\, d^{(n)} \right),$$
    where $\{\theta_{[s]}^{(n)}\}_{s=1}^{S}$ are MCMC samples generated from the posterior density. Note that we prefer models with smaller DIC values.
    LPML is calculated based on the conditional predictive ordinate (CPO), which was proposed by Gelfand et al. (1992) and Geisser (2017). The CPO for $d_{i,j}^{(n)}$ is defined as follows:
    $$\mathrm{CPO}_{i,j}^{(n)} = \int f(d_{i,j}^{(n)} \mid \theta^{(n)}) \, \pi(\theta^{(n)} \mid d_{-(i,j)}^{(n)}) \, d\theta^{(n)},$$
    where $d_{-(i,j)}^{(n)} = d^{(n)} \setminus d_{i,j}^{(n)}$. Since the CPO is usually not readily available in closed form, one can estimate $\mathrm{CPO}_{i,j}^{(n)}$ from the full-posterior samples as follows, according to Gelfand and Dey (1994):
    $$\widehat{\mathrm{CPO}}_{i,j}^{(n)} = S \left[ \sum_{s=1}^{S} \frac{1}{f(d_{i,j}^{(n)} \mid \theta_{[s]}^{(n)})} \right]^{-1}.$$
    Finally, according to Ibrahim et al. (2014), the $\mathrm{CPO}_{i,j}^{(n)}$ values can be summarized as the LPML as follows:
    $$\mathrm{LPML}^{(n)} = \sum_{i=1}^{I} \sum_{j=2}^{J+1-i} \log\left( \widehat{\mathrm{CPO}}_{i,j}^{(n)} \right).$$
    We prefer models with larger LPML values.
  • Based on the fitted marginal models and the corresponding posterior means of the marginal parameters $\hat{\theta}^{(n)}$ for $n = 1, \ldots, N$, generate the probability integral transforms (PIT) $\hat{U}_{ij}^{(n)} = F^{(n)}(d_{ij}^{(n)} \mid \hat{\theta}^{(n)})$ and optimize the following:
    $$\ell(\phi \mid \hat{U}) = \sum_{i=1}^{I} \sum_{j=2}^{J+1-i} \log c_{\phi}\left( \hat{U}_{ij}^{(1)}, \ldots, \hat{U}_{ij}^{(N)} \right), \tag{2}$$
    which yields the optimal vine copula structure and the copula family for each pair copula using the top-down algorithm of Dissmann et al. (2013). The algorithm starts with the choice of the first tree $T_1$ and subsequently chooses $T_j$ for $j = 2, \ldots, d-1$ while keeping the R-vine constraint in (1). The optimal tree $T_j$ is chosen by solving the following optimization problem:
    $$\operatorname*{argmin}_{E_j, N_j} \sum_{e \in E_j} w_e,$$
    where $w_e$ is the pairwise BIC of an edge $e$ in our implementation.
    In our search for the best family for each pair copula, we use the aforementioned bivariate Gaussian, Frank, Clayton, and Gumbel copulas and their rotated versions. Note that this approach also utilizes the inference functions for margins (IFM) method proposed by Joe and Xu (1996), since it estimates the marginal distributions and the copula structure separately, which allows model selection with a lesser computational burden. It should be noted that such computational convenience is obtained at the expense of potential estimation bias, as shown in Louzada and Ferreira (2016).
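The Monte Carlo estimates of DIC and LPML used in the first step can be sketched as follows. This is a simplified illustration, assuming the log development factors of one line are i.i.d. normal with parameter `theta = (mu, sigma)`, which is not the cross-classified model fitted in the article; the posterior samples would come from an MCMC sampler and are simply passed in as a list here. All function names are assumptions.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_lik(theta, data):
    """Log-likelihood of i.i.d. normal data; theta = (mu, sigma)."""
    mu, sigma = theta
    return sum(normal_logpdf(x, mu, sigma) for x in data)


def dic_hat(samples, data):
    """Monte Carlo DIC: -4 * mean log-lik + 2 * log-lik at the posterior mean."""
    S = len(samples)
    mean_ll = sum(log_lik(th, data) for th in samples) / S
    theta_bar = tuple(sum(th[k] for th in samples) / S for k in range(2))
    return -4.0 * mean_ll + 2.0 * log_lik(theta_bar, data)


def lpml_hat(samples, data):
    """LPML from harmonic-mean CPO estimates, one per observation."""
    S = len(samples)
    total = 0.0
    for x in data:
        inv_mean = sum(math.exp(-normal_logpdf(x, *th)) for th in samples) / S
        total += -math.log(inv_mean)  # log CPO = -log(mean of 1/f)
    return total
```

A useful sanity check of the two formulas: if all posterior samples coincide at a single value $\theta$, then the DIC estimate reduces to $-2\ell(\theta \mid d)$ and the LPML estimate reduces to $\ell(\theta \mid d)$.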

3. Data Description

For the empirical analysis, we used a publicly available dataset from ACE Limited 2013 Global Loss Triangles, which consists of aggregated claim developments for insurance operations in North America (4 lines of business), insurance operations overseas (3 lines of business), and global reinsurance operations (2 lines of business). A description and the corresponding index for each line of business are given in Table 1. We incorporate a 9-dimensional vine copula structure to capture possible dependence among all lines of business. Indeed, the current COVID-19 situation could be a clear example of showing the inappropriateness of ignoring potential dependencies among different countries. Note that the dataset and code for data analysis are attached as Supplementary Materials.
Although we consider all possible lines of business simultaneously here, an actuary who applies vine copulas to model dependence among the lines of business should be careful, since the complexity of the vine copula structure increases super-exponentially as the dimensionality grows. Therefore, it is the role of the actuary to apply industry experience and knowledge to the dependence modeling. Further, it is also possible to observe more than 10 lines of business according to the classification in Schedule P of the National Association of Insurance Commissioners (NAIC) annual report, including but not limited to:
  • Homeowners/farmowners;
  • Private passenger auto liability/medical;
  • Commercial auto/truck liability/medical;
  • Workers' compensation;
  • Special liability;
  • Other liability;
  • Fidelity/surety.
Therefore, it is also possible to merge some lines of business into one category in order to not only enhance homogeneity and credibility in the analysis but also avoid too much complexity of the vine structure due to the large number of lines of business.
The loss development triangles, which are the training sets in terms of predictive analytics, can be expressed as follows:
$$\mathcal{D}_{1:I} = \left\{ Y_{ij}^{(n)} : 1 \leq i \leq I \text{ and } 1 \leq j \leq \min(I, I + 1 - i), \; n = 1, \ldots, 9 \right\}. \tag{3}$$
Our task is to predict the cumulative paid claims for the following years, which are described as follows:
$$\mathcal{D}_{I+k} = \left\{ Y_{ij}^{(n)} : 1 + k \leq i \leq I \text{ and } j = I + 1 + k - i, \; n = 1, \ldots, 9 \right\}. \tag{4}$$
For enterprise risk management (ERM) purposes, insurance companies are usually interested in the distribution of aggregate unpaid claims after one year of loss development, which is determined by the incremental paid losses defined as follows:
$$L^{(n)} = \sum_{i=2}^{I} \left( Y_{i,I+2-i}^{(n)} - Y_{i,I+1-i}^{(n)} \right) = \sum_{i=2}^{I} \left( D_{i,I+2-i}^{(n)} - 1 \right) \cdot Y_{i,I+1-i}^{(n)}, \qquad L = \sum_{n=1}^{N} L^{(n)}. \tag{5}$$
This formulation justifies modeling $D_{i,j}^{(n)}$ rather than $Y_{i,j}^{(n)}$, because
$$E\left[ L \mid \mathcal{D}_{1:I} \right] = \sum_{n=1}^{N} \sum_{i=2}^{I} \left( E\left[ D_{i,I+2-i}^{(n)} \mid \mathcal{D}_{1:I} \right] - 1 \right) \cdot Y_{i,I+1-i}^{(n)}.$$
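The one-year incremental paid loss can be computed directly from the latest diagonal and the next age-to-age factors. The sketch below uses hypothetical diagonals and factors for two lines; each list covers accident years $i = 2, \ldots, I$, and the function name is an assumption.

```python
def one_year_incremental_paid(latest_diagonal, next_factors):
    """Sum of Y_next - Y_current over accident years i = 2, ..., I.

    latest_diagonal[k] is the latest cumulative paid claim of an accident
    year and next_factors[k] is its next age-to-age factor.
    """
    return sum(y * d - y for y, d in zip(latest_diagonal, next_factors))


# Hypothetical latest diagonals and expected factors for two lines.
diagonals = {"line_1": [165.0, 168.0, 110.0], "line_2": [80.0, 60.0, 30.0]}
factors = {"line_1": [1.05, 1.10, 1.45], "line_2": [1.02, 1.20, 1.60]}

loss_by_line = {
    name: one_year_incremental_paid(diagonals[name], factors[name])
    for name in diagonals
}
total_loss = sum(loss_by_line.values())
```

Replacing the fixed factors with draws from their joint predictive distribution would turn `total_loss` into a draw of $L$, which is where the dependence among lines enters.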
A simple approach for the analysis of $L$ is the silo approach, in which we aggregate all lines of business so that the paid losses of the same development lag and accident year are merged and modeled together. In this case, one can write $L$ as follows:
$$L = \sum_{i=2}^{I} \left( Y_{i,I+2-i} - Y_{i,I+1-i} \right) = \sum_{i=2}^{I} \left( D_{i,I+2-i} - 1 \right) \cdot Y_{i,I+1-i},$$
where $Y_{i,j} = \sum_{n=1}^{N} Y_{i,j}^{(n)}$ and $D_{i,j+1} = Y_{i,j+1} / Y_{i,j}$. However, such an approach ignores the heterogeneity of the lines of business and could be problematic. Suppose there are two lines of business, personal and commercial auto insurance, and the reported losses of these lines are given in Table 2. Note that the incremental development is calculated as $\log(Y_{j+1} / Y_j)$. One can see that, due to the difference in volume, the volatility of loss development for the commercial auto line is wiped out if we analyze the aggregated data. Therefore, the vine copula model can be considered a flexible generalization of two extreme models, the silo and independent approaches, so that one can consider possible dependence and heterogeneity among the lines of business simultaneously.
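The masking effect of aggregation can be reproduced with a toy example. The numbers below are hypothetical and are not the values reported in Table 2; they only assume that one line is much larger and more stable than the other.

```python
import math


def log_development(cumulative):
    """Return log(Y_{j+1} / Y_j) along one accident year."""
    return [math.log(b / a) for a, b in zip(cumulative, cumulative[1:])]


# Hypothetical reported losses over four lags (not the values in Table 2).
personal_auto = [10000.0, 11000.0, 11550.0, 11781.0]   # large, stable line
commercial_auto = [100.0, 180.0, 198.0, 277.2]          # small, volatile line
aggregated = [p + c for p, c in zip(personal_auto, commercial_auto)]

dev_personal = log_development(personal_auto)
dev_commercial = log_development(commercial_auto)
dev_aggregated = log_development(aggregated)
```

In this toy case the aggregated development stays within 0.01 of the personal auto development at every lag, so the swings of the commercial line are nearly invisible in the merged triangle.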

4. Model Selection and Parameter Estimation

For the marginal distributions, we apply the idea of the cross-classified model for each line of business as in Shi and Frees (2011) and Taylor and McGuire (2016), which regresses $Y_{ij}^{(n)}$ on the linear predictor $\gamma^{(n)} + \alpha_i^{(n)} + \delta_j^{(n)}$ as follows:
$$\begin{aligned}
\text{Lognormal}: \quad & E\left[ \log Y_{ij}^{(n)} \right] = \gamma^{(n)} + \alpha_i^{(n)} + \delta_j^{(n)}, \\
\text{Gamma}: \quad & E\left[ Y_{ij}^{(n)} \right] = \exp\left( \gamma^{(n)} + \alpha_i^{(n)} + \delta_j^{(n)} \right),
\end{aligned} \tag{6}$$
where $\alpha_i^{(n)}$ corresponds to the effect of the $i$th accident year and $\delta_j^{(n)}$ corresponds to the cumulative effect up to the $j$th development lag, for the $n$th line of business.
Recall that it is in our interest to predict $E[Y_{i,j+1}^{(n)} \mid Y_{i,j}^{(n)}] = Y_{i,j}^{(n)} \, E[D_{i,j+1}^{(n)}]$. Therefore, one can rewrite (6) in the following way:
$$\begin{aligned}
\text{Lognormal}: \quad & E\left[ \log D_{i,j+1}^{(n)} \right] = \eta_{j+1}^{(n)}, \\
\text{Gamma}: \quad & E\left[ D_{i,j+1}^{(n)} \right] = \exp\left( \eta_{j+1}^{(n)} \right),
\end{aligned} \tag{7}$$
where $\eta_{j+1}^{(n)} = \delta_{j+1}^{(n)} - \delta_j^{(n)}$ and $D_{i,j+1}^{(n)} = Y_{i,j+1}^{(n)} / Y_{i,j}^{(n)}$. Therefore, $\eta_{j+1}^{(n)}$ can be interpreted as the incremental loss development factor from the $j$th lag to the $(j+1)$th lag. In this regard, we utilize the two candidate distributions in (7) and compare the model selection diagnostics to choose a suitable distribution for each of the marginal models.
Further, it is natural to expect that for a fixed accident year, paid claim amounts gradually increase until they are fully mature and developed, while the magnitude of development gets smaller as the development lag increases, which means the incremental loss development factor is expected to be greater than 1 but monotonically decreasing until it converges to 1. In terms of the cross-classified model, such a statement is equivalent to the following mathematical condition:
$$\eta_2^{(n)} \geq \eta_3^{(n)} \geq \cdots \geq \eta_{L^{(n)}}^{(n)} = 0, \tag{8}$$
for a large enough integer $L^{(n)}$, or there are $\zeta_t^{(n)} \in [0, \infty)$ for $t = 2, \ldots, L^{(n)}$ and $n = 1, \ldots, 9$ such that $\eta_j^{(n)} = \sum_{t=j}^{L^{(n)}} \zeta_t^{(n)}$. Therefore, we may suggest the following four models as the candidates for the marginal distribution of each line of business:
  • LNU: The cross-classified lognormal model in (7);
  • LNC: The cross-classified lognormal model in (7) with (8);
  • GamU: The cross-classified gamma model in (7);
  • GamC: The cross-classified gamma model in (7) with (8).
All four models are fitted via RStan because of its flexibility. For example, one can incorporate the constraint in (8) by setting the lower limit of $\zeta_t^{(n)}$ to 0 in RStan and using a diffuse uniform prior on the positive real numbers for $\zeta_t^{(n)}$, $n = 1, \ldots, 9$, and $t = 1, \ldots, J$. For the unconstrained models, a diffuse uniform prior on the positive real numbers is directly used for $\eta_{j+1}^{(n)}$, $n = 1, \ldots, 9$, and $j = 2, \ldots, J$. For each marginal model, four chains of 1000 iterations are used, with the first 500 iterations of each chain discarded as burn-in; MCMC sampling usually takes less than a second.
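The reparameterization behind the constrained models can be sketched in a few lines: nonnegative increments $\zeta_t$ are mapped to $\eta_j = \sum_{t \geq j} \zeta_t$, which is non-increasing in $j$ by construction. This is an illustration in Python rather than the Stan program used for fitting, and the function name is an assumption.

```python
def eta_from_increments(zeta):
    """Map nonnegative increments zeta_2..zeta_L to eta_j = sum_{t >= j} zeta_t."""
    if any(z < 0 for z in zeta):
        raise ValueError("all increments must be nonnegative")
    eta, running = [], 0.0
    for z in reversed(zeta):      # accumulate from the last lag backwards
        running += z
        eta.append(running)
    return eta[::-1]              # eta_2, eta_3, ..., eta_L


# Hypothetical increments for one line of business.
eta = eta_from_increments([0.30, 0.10, 0.00, 0.05])
```

Because the sampler only ever proposes nonnegative increments, the ordering in (8) never has to be checked or enforced separately.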
Table 3 provides a summary of the parameter estimates in all models for the marginal components, reported as the posterior mean and the lower and upper bounds of the 90% Bayesian credible interval for each parameter. One can see that $\hat{\eta}_9^{(n)}$ tends to be greater than or equal to $\hat{\eta}_8^{(n)}$ in the unconstrained models, whereas $\hat{\eta}_8^{(n)} > \hat{\eta}_9^{(n)}$ for all $n = 1, \ldots, 9$ in the constrained models, which is a more natural pattern in loss development analysis.
When posterior MCMC samples of the parameters are obtained, it is important to make sure that these samples converge to (proper) posterior distributions. In this regard, we use the $\hat{R}$ statistic proposed by Gelman and Rubin (1992) and traceplots, which enable us to judge the convergence of MCMC sampling from the generated samples. The $\hat{R}$ statistic measures how well the samples from distinct chains are mixed, and the chains are considered converged if $\hat{R} \approx 1$. Table A1 in Appendix A shows that $\hat{R} \approx 1$ for all parameters, which tells us that the chains are well-mixed. One can also see that the MCMC chains are well-mixed by looking at selected traceplots of the marginal models, as shown in Figure A1. They indicate that different MCMC chains end up with similar empirical distributions, which is a necessary condition for the convergence of the MCMC algorithm to the correct posterior distribution.
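For reference, the classic form of the Gelman and Rubin statistic can be computed as below. This is a minimal sketch for a single scalar parameter with equal-length chains; it is not necessarily the exact variant reported by RStan, which may apply additional refinements such as splitting chains.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])                          # draws per chain
    chain_means = [mean(c) for c in chains]
    within = mean(variance(c) for c in chains)  # W: mean within-chain variance
    between = n * variance(chain_means)         # B: between-chain variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5
```

Chains that explore the same region give a value close to 1, whereas chains stuck around different values inflate the between-chain term and push the statistic well above 1.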
After the convergence of the Bayesian models is assessed, we compare their goodness-of-fit using DIC and LPML. Based on the DIC and LPML values for the fitted marginal models in Table 4, the lognormal distribution is favored over the gamma distribution. Further, the DIC values of the constrained lognormal distribution are always smaller than those of the unconstrained lognormal distribution, while the LPML values show no definitive pattern between the constrained and unconstrained lognormal distributions. Therefore, we suggest using the constrained lognormal distribution as the marginal distribution of every line of business, based on the model selection diagnostics and the interpretability of the regression coefficients, while the unconstrained lognormal distributions are considered as a benchmark.
Once a marginal distribution for each triangle is specified, the dependence structure can be modeled with a 9-dimensional vine copula. Since marginal distributions are fixed and calibrated, it is possible to optimize (2). For the optimization routine, we use the algorithm of Dissmann et al. (2013) which is readily available in an R package VineCopula, as a function RVineStructureSelect, which explores both the optimal vine structure and the choice of family for each of the pair copulas sequentially. For a detailed explanation of such an implementation, see Chapter 8 of Czado (2019).
In our search for the vine structure, we also consider the sparsity of the vine copula. Since we need $\binom{d}{2}$ pair copulas to construct a $d$-dimensional vine structure, the number of required pair copulas increases quadratically as the dimension increases, which adds both complexity and computational burden to the optimization scheme. Therefore, in this article, we also incorporate the idea of a sparse copula so that a pair copula is considered to be an independence copula unless the estimated Kendall's tau is significantly different from 0 for the pair copula. More specifically, we use the following test statistic $T$ based on the estimated value of Kendall's tau, $\hat{\tau}$, as proposed in Genest and Favre (2007):
$$T = \sqrt{\frac{9 m (m - 1)}{2 (2m + 5)}} \times |\hat{\tau}|,$$
where $m$ denotes the number of observations; under independence, $T$ asymptotically behaves as the absolute value of a standard normal random variable if $m$ is large enough. For more approaches to incorporating sparsity in a high-dimensional vine copula, see Gruber and Czado (2018) and Nagler et al. (2019).
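The independence test can be sketched as follows. The code computes Kendall's tau by direct pair counting, which assumes no ties and is only practical for small samples, and then evaluates $T$ together with a two-sided p-value from the normal approximation. The function names are illustrative assumptions.

```python
import math
from statistics import NormalDist


def kendall_tau(x, y):
    """Kendall's tau-a by direct pair counting (assumes no ties)."""
    m = len(x)
    score = 0
    for i in range(m):
        for j in range(i + 1, m):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)   # +1 concordant, -1 discordant
    return score / (m * (m - 1) / 2)


def independence_test(x, y):
    """Return (T, two-sided p-value) under the normal approximation."""
    m = len(x)
    scale = math.sqrt(9 * m * (m - 1) / (2 * (2 * m + 5)))
    t_stat = scale * abs(kendall_tau(x, y))
    p_value = 2.0 * (1.0 - NormalDist().cdf(t_stat))
    return t_stat, p_value
```

A pair copula would be replaced by the independence copula whenever the p-value exceeds the chosen significance level, which is how sparsity enters the selected vine.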
According to the optimization with the data, the optimal vine structure is given in Table 5, which only displays non-independent pair copulas. In the table, $\phi$ and $\tau$ denote the estimated copula parameter and Kendall's tau for each pair copula, respectively. Some empirical relationships are revealed by the estimated copula parameters, and they seem intuitive considering the nature of the lines of business. For example, it is quite natural to have strongly positive dependence between the development patterns of line 4 (North American non-casualty) and line 6 (overseas general non-casualty) due to the inherent characteristics of the same claim type. Dependence between line 8 (global reinsurance property) and line 9 (global reinsurance non-property) is captured by the Gumbel copula, which indicates a positive association between claim settlements in the initial years (corresponding to the upper tail) for the global reinsurance business unit. Such positive tail dependence might have originated from an internal decision on claim processing in the same business unit. Note that a company has limited resources to deal with litigation related to claim adjustments, so claim adjustments can be delayed in one line when the company focuses on expediting claim adjustments in another line, which might explain the negative associations between lines 6 and 9, lines 2 and 7, and lines 5 and 9. One can see that the estimated vine structure exhibits a considerable degree of sparsity, because only 10 of the $\binom{9}{2} = 36$ pair copulas in the vine structure are non-independent, and most of the dependencies arise from tree 1, as shown in Figure 2. The nine lines of business correspond to the nodes of tree 1, so we have nine nodes and eight edges in the R-vine structure that captures the first layer of dependence among the lines of business. Since the dependence of the pair copulas is quite weak except for those in tree 1, we only display the R-vine structure for tree 1.
We also visualize the dependence structure via normalized contour plots in Figure 3, which display the association between two probability integral transforms on a normalized scale. For example, if a pair copula density $c(u_1, u_2)$ is given, then the corresponding normalized contour plot is a contour plot of the following density $g$:
$$g\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\right) = c(u_1, u_2)\, \phi\left(\Phi^{-1}(u_1)\right) \phi\left(\Phi^{-1}(u_2)\right).$$
Normalized contour plots for trees 1, 2, and 3 in the chosen vine structure are provided in Figure 3. One can see that some pair copulas (for example, $c_{4,3}$ and $c_{8,1}$) demonstrate strong tail dependence or skewed contours, which cannot be captured with a naive use of elliptical copulas such as Gaussian or t copulas. Therefore, Figure 3 substantiates that the chosen pair copula families are flexible enough to describe the dependence structure observed in the data. In the figure, the first row corresponds to the normalized contour plots for tree 3, where the contours are concentric circles, which indicates independence of the corresponding pair copulas. On the other hand, the third row of the figure, which corresponds to tree 1, shows many skewed contour plots, which indicates varying dependence of the corresponding pair copulas.
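To see why independence yields circular contours while an asymmetric family yields skewed ones, the normalized density $g$ can be evaluated directly. The following is a minimal sketch, assuming that $\Phi$ and $\phi$ are the standard normal distribution function and density; the Clayton density with parameter 2 is used purely as a stand-in for a pair copula with tail dependence, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def clayton_density(u: float, v: float, theta: float = 2.0) -> float:
    """Density of the bivariate Clayton copula (theta > 0)."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-(1.0 + theta)) * s ** (-(2.0 + 1.0 / theta))

def independence_density(u: float, v: float) -> float:
    """Density of the independence copula."""
    return 1.0

def normalized_density(copula_density, z1: float, z2: float) -> float:
    """g(z1, z2) = c(Phi(z1), Phi(z2)) * phi(z1) * phi(z2)."""
    u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
    return copula_density(u1, u2) * STD_NORMAL.pdf(z1) * STD_NORMAL.pdf(z2)

# Independence: g depends on (z1, z2) only through z1^2 + z2^2 (circles).
r = math.sqrt(0.5)
print(normalized_density(independence_density, 1.0, 0.0))
print(normalized_density(independence_density, r, r))

# Clayton: more mass in the lower-left corner than in the upper-right one.
print(normalized_density(clayton_density, -1.0, -1.0))
print(normalized_density(clayton_density, 1.0, 1.0))
```

Plotting level sets of `normalized_density` on a grid of $(z_1, z_2)$ values would give the kind of picture shown in Figure 3.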

5. Validation and Actuarial Implication

Once a predictive model is calibrated based on the upper loss triangles $D_{1:9}$, defined in (3), it needs to be validated based on the available $D_{9+k}$, defined in (4). For this purpose, the latest diagonals, $D_{10} = \{ Y_{ij}^{(n)} : 2 \le i \le 9 \text{ and } j = 11 - i,\ n = 1, \ldots, 9 \}$, are used as a validation set.
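The index sets can be made concrete with a short enumeration. This sketch assumes that $i$ indexes accident years and $j$ development lags, both running from 1 to 9, and that the training triangle consists of the cells with $i + j \le 10$ (inferred from the latest observed value being $Y_{i,10-i}$); the variable names are hypothetical.

```python
N = 9  # number of accident years and of development lags (assumed)

# Training cells: upper triangle with i + j <= 10.
training_cells = [(i, j) for i in range(1, N + 1)
                  for j in range(1, N + 1) if i + j <= N + 1]

# Validation cells: the next diagonal, j = 11 - i for i = 2, ..., 9.
validation_cells = [(i, 11 - i) for i in range(2, N + 1)]

print(len(training_cells), "training cells per line of business")
print(validation_cells)
```

Each of the eight validation cells sits one development lag beyond the latest training observation of the same accident year.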
Under the proposed lognormal model for marginal components, one can show that
$$E\left[ Y_{i,j+1}^{(n)} \,\middle|\, Y_{i,j}^{(n)} \right] = E\left[ D_{i,j+1}^{(n)} \right] Y_{i,j}^{(n)} = \exp\left( \eta_{i,j+1}^{(n)} + \tfrac{1}{2} \sigma^{2(n)} \right) Y_{i,j}^{(n)},$$
so that a point estimate of $Y_{i,11-i}^{(n)}$ is given as $\hat{Y}_{i,11-i}^{(n)} = Y_{i,10-i}^{(n)} \times \exp\left( \hat{\eta}_{11-i}^{(n)} + \tfrac{1}{2} \hat{\sigma}^{2(n)} \right)$, where $\hat{\eta}^{(n)}$ and $\hat{\sigma}^{(n)}$ are obtained as posterior means from the MC samples of the parameters. Since it is of interest to predict the future unpaid claim $L$ as defined in (5), we apply three models to project the unpaid claims: the independent model, which assumes independence among all lines of business; the silo model, which assumes a perfect positive association among all lines of business; and the copula model, which utilizes the vine copula structure specified and optimized in Section 4.
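The point estimate above is the latest cumulative paid loss multiplied by the mean of a lognormal development factor. A minimal sketch follows; the function names and the numerical inputs are hypothetical and are not taken from the dataset used in this article.

```python
import math

def lognormal_mean_factor(eta: float, sigma: float) -> float:
    """E[D] when log D ~ Normal(eta, sigma^2)."""
    return math.exp(eta + 0.5 * sigma ** 2)

def next_cumulative_estimate(latest_paid: float, eta_hat: float, sigma_hat: float) -> float:
    """Point estimate of the next cumulative paid loss for one cell."""
    return latest_paid * lognormal_mean_factor(eta_hat, sigma_hat)

# Hypothetical inputs for illustration only.
estimate = next_cumulative_estimate(latest_paid=1_000_000.0, eta_hat=0.20, sigma_hat=0.10)
print(round(estimate))
```

The term $\tfrac{1}{2}\sigma^2$ matters: dropping it would give the median of the development factor rather than its mean, and hence a slightly smaller estimate.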
By linearity of expectation, point estimates of reserves under the copula model are expected to be more or less the same as the estimates under the independent model. However, insurance companies are often interested not only in the predictive mean, but also in the confidence interval of the estimated mean and in risk measures for enterprise risk management. In this regard, the proposed vine copula model allows us to consider the impact of associations among the lines of business, which might lead to more accurate estimation of risk measures. For example, if there are positive associations among the lines of business, then the independent model will underestimate the Value at Risk (VaR) or conditional tail expectation (CTE), since an adverse scenario can affect all lines of business simultaneously.
Table 6 summarizes the point estimates of unpaid claims for the subsequent calendar year based on the copula model and the silo model. Note that point estimates of unpaid claims from the independent model should be the same as those from the copula model, as long as we model each line of business with the same marginal distribution. Further, although we primarily use constrained models for the marginal components, we also provide the point estimates of unpaid claims from models in which no constraints are imposed on the parameters.
The table shows that the copula model with constraints is the best model in terms of predicting the aggregate unpaid claims for the subsequent year. We observe that under the unconstrained models, the predicted values of unpaid claims can be severely exaggerated, especially for mature accident years (in our case, AY = 2005). This is because the value of $\eta_9^{(n)}$ can be overestimated in the unconstrained models, as observed in Table 3. This implies that a naive implementation of an unconstrained model for a multi-line reserving problem may result in poor prediction.
In order to evaluate the prediction performance, we use validation measures such as the root mean squared error (RMSE) and mean absolute error (MAE), defined as follows:
$$\text{RMSE} := \sqrt{ \frac{1}{8} \sum_{i=2}^{9} \left( \hat{Y}_{i,11-i} - Y_{i,11-i} \right)^2 }, \qquad \text{MAE} := \frac{1}{8} \sum_{i=2}^{9} \left| \hat{Y}_{i,11-i} - Y_{i,11-i} \right|,$$
where $\hat{Y}_{i,11-i}$ is the estimate of $Y_{i,11-i}$ under each method. We prefer the models with smaller values of RMSE and/or MAE. According to Table 7, the copula model with constraints again turns out to be the best model in terms of RMSE and MAE, which evaluate the accuracy of the point estimates.
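The two measures can be recomputed from the accident-year rows of Table 6, with the square root in RMSE taken over the mean of the squared errors. The sketch below uses the constrained copula and constrained silo columns together with the actual values; the list and function names are hypothetical.

```python
import math

# Accident years 2005-2012, transcribed from Table 6.
actual = [90_662, 109_420, 209_649, 338_359, 304_544, 450_724, 762_328, 1_837_070]
constrained_copula = [148_767, 185_653, 319_881, 461_885, 447_697, 703_166, 986_753, 2_181_708]
constrained_silo = [81_657, 131_065, 227_435, 414_982, 409_172, 615_770, 1_070_549, 2_648_604]

def rmse(predicted, observed) -> float:
    """Square root of the mean squared prediction error."""
    errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(errors) / len(errors))

def mae(predicted, observed) -> float:
    """Mean absolute prediction error."""
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors)

for name, column in [("copula", constrained_copula), ("silo", constrained_silo)]:
    print(name, round(rmse(column, actual)), round(mae(column, actual)))
```

Up to rounding, the results match the constrained columns of Table 7. The silo model is close for the mature accident years but misses the most recent one by a wide margin, which dominates its RMSE.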
However, point estimates of $L$, the aggregate unpaid claims, are of lesser concern. Actuaries (and companies) want to know a range of estimates, or the predictive distribution of the aggregate unpaid claims, in order to assess the quality of decision making on reserves and to retain appropriate amounts of risk capital for the whole company. Thus, we provide a way to simulate predictive samples of $L^{(n)}$ and describe the predictive distribution of $L^{(n)}$ for $n = 1, \ldots, 9$, and subsequently of $L = \sum_{n=1}^{9} L^{(n)}$, with the following steps, which are similar to those of Gao (2018):
  • Generate a 9-dimensional uniform random vector $(u_{ij}^{r:(1)}, \ldots, u_{ij}^{r:(9)})$ based on the specified copula structure in a sequential way. For example, one can first generate $(u_{ij}^{r:(1)}, u_{ij}^{r:(2)})$ from $C_{12}$. After that, $u_{ij}^{r:(3)}$ is generated using the following identity:
    $$U[0,1] \overset{d}{=} w_3 = C_{3|12}(u_3 \mid u_1, u_2) = h_{3|1;2}\left( C_{3|2}(u_3 \mid u_2),\, C_{1|2}(u_1 \mid u_2) \right),$$
    where $h_{3|1;2}(w, v) = \frac{\partial}{\partial w} C_{13;2}(w, v)$. One can continue in this way to obtain a random sample of a $d$-dimensional uniform vector, which is readily available with the RVineSim function in the R package VineCopula. (For details, see Chapter 6 of Czado (2019).)
  • Based on the lognormal assumption for the marginal components, $(D_{i,j}^{r:(1)}, \ldots, D_{i,j}^{r:(9)})$ are generated as follows: $D_{i,j}^{r:(n)} = \exp\left( \hat{\eta}_j^{r:(n)} + \Phi^{-1}(u_{ij}^{r:(n)})\, \hat{\sigma}^{r:(n)} \right)$, where $\hat{\eta}_j^{r:(n)}$ and $\hat{\sigma}^{r:(n)}$ are the $r$th MC samples of $\eta_j^{(n)}$ and $\sigma^{(n)}$ from the marginal model for the $n$th line of business, respectively.
  • Repeat the first two steps to get $L^{r:(n)}$, the MC samples of $L^{(n)}$, for $r = 1, \ldots, R$, where
    $$L^{r:(n)} = \sum_{i=2}^{9} \left( \hat{Y}_{i,11-i}^{r:(n)} - Y_{i,10-i}^{(n)} \right) = \sum_{i=2}^{9} \left( D_{i,11-i}^{r:(n)} - 1 \right) Y_{i,10-i}^{(n)} \quad \text{and} \quad L^{r:} = \sum_{n=1}^{9} L^{r:(n)}.$$
Since $\{ Y_{i,10-i}^{(n)} : 2 \le i \le 9,\ n = 1, \ldots, 9 \} \subset D_{1:9}$, only $\{ D_{i,11-i}^{(n)} : 2 \le i \le 9,\ n = 1, \ldots, 9 \}$ needs to be simulated to describe the distribution of $L$. In the simulation scheme, the uncertainty of the estimated parameter values is already accounted for, since the marginal parameters are estimated with the Bayesian approach.
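The simulation steps can be sketched in a deliberately reduced setting. The code below is not the procedure used for the results in this article, which relies on a nine-dimensional R-vine and RVineSim; it assumes only two lines of business joined by a single Clayton pair copula, sampled by inverting its h-function, and it uses fixed parameter values in place of MCMC draws. All names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
EPS = 1e-12  # keeps uniforms strictly inside (0, 1)

def clayton_h(u, v, theta):
    """h(u | v) = dC(u, v)/dv for the Clayton copula."""
    return v ** (-theta - 1.0) * (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 - 1.0 / theta)

def clayton_h_inverse(w, v, theta):
    """Solve h(u | v) = w for u (closed form for the Clayton copula)."""
    return ((w ** (-theta / (1.0 + theta)) - 1.0) * v ** (-theta) + 1.0) ** (-1.0 / theta)

def clamp(p):
    return min(max(p, EPS), 1.0 - EPS)

def sample_pair(rng, theta):
    """Step 1: draw (u1, u2) sequentially from the pair copula."""
    u1 = clamp(rng.random())
    w = clamp(rng.random())  # independent uniform, then conditional inversion
    return u1, clamp(clayton_h_inverse(w, u1, theta))

def simulate_unpaid(rng, theta, latest_paid, eta, sigma, n_sims):
    """Steps 2 and 3: return draws of (L1, L2, L1 + L2).

    latest_paid[n][k] and eta[n][k] are the latest cumulative paid loss and
    the log-scale mean of the next development factor for line n and
    accident year k; sigma[n] is the log-scale standard deviation of line n.
    """
    draws = []
    for _ in range(n_sims):
        totals = [0.0, 0.0]
        for k in range(len(latest_paid[0])):
            u = sample_pair(rng, theta)  # fresh dependent uniforms per cell
            for n in range(2):
                d = math.exp(eta[n][k] + STD_NORMAL.inv_cdf(u[n]) * sigma[n])
                totals[n] += (d - 1.0) * latest_paid[n][k]
        draws.append((totals[0], totals[1], totals[0] + totals[1]))
    return draws

rng = random.Random(2020)
draws = simulate_unpaid(
    rng, theta=2.0,
    latest_paid=[[500.0, 300.0], [800.0, 400.0]],
    eta=[[0.05, 0.20], [0.10, 0.30]],
    sigma=[0.10, 0.15],
    n_sims=1000,
)
print(sum(d[2] for d in draws) / len(draws))
```

Replacing the fixed `eta` and `sigma` by the $r$th posterior draw in each iteration would carry parameter uncertainty into the predictive distribution, as described in the paragraph above.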
Figure 4 shows the predictive kernel densities of $L$ based on the Monte Carlo samples generated with the aforementioned approach from the constrained models. In the silo approach, all nine triangles are aggregated into one triangle based on the development lags and accident years, so that the loss development patterns of all business lines are implicitly assumed to be identical, which may not be true in some cases. One can see that the predictive distribution of $L$ under the copula model is more jagged than that under the independent model, but the copula model is still closer to the independent model than to the silo model. This agrees with the proposed vine copula structure described in Figure 2, which shows a sufficient degree of sparsity.
The discrepancy among the predictive distributions based on the copula, independent, and silo models can also be quantified via the Hellinger distance. The Hellinger distance, originally proposed by Hellinger (1909), is used to quantify the "distance" between two distributions. For example, if $f$ and $g$ are the probability density functions of two distributions $F$ and $G$, the square of the Hellinger distance is calculated as follows:
$$H^2(F, G) = 1 - \int \sqrt{f(x)\, g(x)}\, dx.$$
One can see that $0 \le H(F, G) \le 1$ always holds, so that the Hellinger distance is well calibrated and easy to interpret: $H(F, G) = 0$ means that $F$ and $G$ are identical almost surely, and $H(F, G) = 1$ means that $F$ and $G$ are totally different. Due to this property, the Hellinger distance has been widely used in the statistics literature, such as Beran (1977), and estimation of $H(F, G)$ using generated samples from $F$ and $G$ can be easily implemented via R packages such as statip. Here, $\hat{H}^2(L \mid \text{Copula}, L \mid \text{Independent})$, the square of the Hellinger distance between the kernel density of $L$ under the copula model with constraints (illustrated with the blue solid line in Figure 4) and the density under the independent model with constraints (illustrated with the black dotted line in Figure 4), is about 4.33%, while $\hat{H}^2(L \mid \text{Copula}, L \mid \text{Silo}) = 13.54\%$. Therefore, one can see that the proposed vine copula structure can capture weak dependencies among the lines of business.
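The squared Hellinger distance can be illustrated numerically. The sketch below does not reproduce the sample-based estimate behind the figures above; it assumes two known normal densities as stand-ins, integrates $\sqrt{f g}$ with the trapezoidal rule, and compares the result with the closed form that is available for two normal distributions. Function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def hellinger_sq(f, g, lower, upper, n_grid=20_000):
    """H^2(F, G) = 1 - integral of sqrt(f * g), via the trapezoidal rule."""
    step = (upper - lower) / n_grid
    total = 0.0
    for k in range(n_grid + 1):
        x = lower + k * step
        weight = 0.5 if k in (0, n_grid) else 1.0
        total += weight * math.sqrt(f(x) * g(x))
    return 1.0 - total * step

def hellinger_sq_normal(mu1, sd1, mu2, sd2):
    """Closed form of H^2 for two normal distributions."""
    var_sum = sd1 ** 2 + sd2 ** 2
    affinity = math.sqrt(2.0 * sd1 * sd2 / var_sum) * math.exp(-((mu1 - mu2) ** 2) / (4.0 * var_sum))
    return 1.0 - affinity

f = lambda x: normal_pdf(x, 0.0, 1.0)
g = lambda x: normal_pdf(x, 1.0, 1.5)
print(hellinger_sq(f, g, -15.0, 15.0))
print(hellinger_sq_normal(0.0, 1.0, 1.0, 1.5))
```

With simulated values of $L$, the densities `f` and `g` would be replaced by kernel density estimates built from the Monte Carlo samples.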
Note that the Hellinger distance is a special case of the functional Bregman divergence proposed in Goh and Dey (2014) and Jeong (2020), which is given as follows:
$$D_\psi(F, G) = \int \psi\left( \frac{f(x)}{g(x)} \right) g(x)\, dx,$$
where $\psi(\cdot)$ is a strictly convex and differentiable function with positive support and $\psi(1) = 0$. One can see that $D_\psi(F, G) = H^2(F, G)$ if $\psi(z) = 1 - \sqrt{z}$, and $D_\psi(F, G)$ is the Kullback–Leibler divergence if $\psi(z) = -\log z$.
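Both special cases follow by direct substitution into $D_\psi(F, G) = \int \psi\left( f(x)/g(x) \right) g(x)\, dx$, using $\int g(x)\, dx = 1$ and taking $\psi(z) = -\log z$ with the sign that makes $\psi$ convex:

$$\psi(z) = 1 - \sqrt{z}: \quad D_\psi(F, G) = \int \left( 1 - \sqrt{\frac{f(x)}{g(x)}} \right) g(x)\, dx = 1 - \int \sqrt{f(x)\, g(x)}\, dx = H^2(F, G),$$

$$\psi(z) = -\log z: \quad D_\psi(F, G) = -\int \log\left( \frac{f(x)}{g(x)} \right) g(x)\, dx = \int g(x) \log \frac{g(x)}{f(x)}\, dx.$$

The second expression is the Kullback–Leibler divergence of $F$ from $G$, that is, with the expectation taken under $G$.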
Based on the predictive distributions of $L$ and $L^{(n)}$ for $n = 1, \ldots, 9$, one can calculate the risk margins for the unpaid claims as in Table 8. Both VaR and CTE are estimated empirically, that is, from the generated samples of the predictive distribution of $L$ or $L^{(n)}$. Usually, a company is expected to benefit from the diversification of risks across multiple lines of business, so that the risk capital for the aggregate reserve is less than the sum of the risk capitals for the reserves of the individual lines, which is known as subadditivity of a risk measure. Although subadditivity is not guaranteed for VaR, it is guaranteed for CTE, and we can quantify the diversification effect by subtracting the risk capital for the aggregate reserve from the sum of the risk capitals for the reserves of the individual lines. Table 9 summarizes the measured diversification effects for the given models. We observe that the measured diversification effect of the copula model is much greater than that of the silo model because of the weak dependence captured in the vine structure, although it is still less than that of the independent model, which assumes perfect independence among the lines of business.
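The empirical risk measures and the diversification effect can be sketched as follows. This is one possible set of conventions, stated as assumptions: VaR is taken as the $\lceil \alpha R \rceil$th order statistic of $R$ samples, CTE as the average of the samples strictly above that VaR, and the risk capital as the value of the risk measure itself. The function names are hypothetical.

```python
import math

def empirical_var(samples, level):
    """Empirical quantile: the ceil(level * n)-th order statistic."""
    ordered = sorted(samples)
    k = math.ceil(round(level * len(ordered), 9))  # round guards float noise
    return ordered[max(k, 1) - 1]

def empirical_cte(samples, level):
    """Average of the samples strictly above the empirical VaR."""
    var = empirical_var(samples, level)
    tail = [x for x in samples if x > var]
    return sum(tail) / len(tail) if tail else var

def diversification_effect(line_samples, risk_measure, level):
    """Sum of stand-alone risk capitals minus the capital for the aggregate.

    line_samples[n][r] is the r-th simulated unpaid claim of line n; the
    aggregate is formed draw by draw so that dependence is preserved.
    """
    aggregate = [sum(draw) for draw in zip(*line_samples)]
    standalone = sum(risk_measure(s, level) for s in line_samples)
    return standalone - risk_measure(aggregate, level)

base = [float(x) for x in range(1, 101)]
comonotonic = [base, [2.0 * x for x in base]]   # lines move together
countermonotonic = [base, base[::-1]]           # lines offset each other

print(diversification_effect(comonotonic, empirical_var, 0.90))
print(diversification_effect(countermonotonic, empirical_var, 0.90))
```

The comonotonic case mimics the silo assumption of perfect positive association and shows no diversification benefit, while any weaker dependence leaves a positive benefit.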

6. Practical Issues for Implementation

Note that a loss reserving dataset from a single company was utilized in this article because a primary insurer will be more interested in analyzing the dependence among the lines of business within the same company for effective enterprise risk management, rather than across the whole industry. However, this methodology can also be applied to analyze reserving trends in a dataset of several companies. For example, analyzing dependencies among the loss development profiles of several companies can be of interest to regulatory authorities and reinsurance companies.
One may also be concerned about the predictive ability of these models, which can be distorted by catastrophic claims. In this regard, insurance companies often decompose the loss development profiles into two layers, primary and excess, to minimize the distortion caused by catastrophic claims (Dew and Hedges 1998), and our proposed methodology can be applied to analyze dependencies among the loss developments for the primary layer. Note that such a distinction of layers is usually internal and may not be disclosed to the public, which also applies to the dataset used in this article.

7. Conclusions

In this article, we explored and introduced a novel approach which considers possible dependence among multiple lines of business via a vine copula. A vine copula is very flexible, and it allows us to model high-dimensional dependence among the business lines, not only bivariate dependence. In the case of traditional copula models, one needs to calibrate many copula models with different families and go through a model validation procedure to choose the best one among the candidates. On the other hand, our proposed vine copula structure enables us to explore the optimal copula structure for the given data in a unified manner.
For model selection, a stepwise approach can be applied to sequentially choose the marginal distributions, the copula structure, and the family of each pair copula. Our empirical analysis on a synthetic insurance portfolio consisting of nine lines of business showed weak associations among the lines.
Further, it was also shown that the naive implementation of a cross-classified model may result in a counterintuitive pattern of loss development, so one might consider a constrained cross-classified model to mitigate this issue. In our work, constraints on the development lag parameters are naturally incorporated via prior elicitation in a Bayesian framework. We expect that a more thorough discussion of the constrained cross-classified model in terms of variable selection could be one of the future directions of this work.

Supplementary Materials

The dataset and code for data analysis are available at https://github.com/ssauljin/vine_copula_reserving.

Author Contributions

Both authors worked on the development of the methodology and proofreading. Data preparation, empirical analysis, and draft preparation were done by H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Society of Actuaries’ James C. Hickman doctoral scholarship.

Acknowledgments

The authors thank anonymous referees and editors for helpful comments that improved this article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MCMC: Markov chain Monte Carlo
DIC: Deviance information criterion
LPML: Logarithm of the pseudo marginal likelihood
CPO: Conditional predictive ordinate
IFM: Inference functions for margins
ERM: Enterprise risk management
AY: Accident year
RMSE: Root mean squared error
MAE: Mean absolute error
VaR: Value at risk
CTE: Conditional tail expectation

Appendix A

Table A1. R ^ values for marginal models.
Table A1. R ^ values for marginal models.
Unconstrained Lognormal Model
T1T2T3T4T5T6T7T8T9
η 2 1.0001.0011.0001.0001.0011.0021.0021.0001.003
η 3 1.0000.9991.0001.0021.0000.9991.0001.0031.002
η 4 1.0001.0010.9991.0001.0020.9991.0011.0001.002
η 5 1.0011.0011.0021.0001.0030.9991.0011.0011.000
η 6 1.0061.0010.9991.0030.9991.0001.0021.0001.000
η 7 1.0001.0011.0021.0001.0001.0030.9991.0001.001
η 8 0.9991.0021.0001.0060.9991.0011.0001.0001.000
η 9 1.0000.9990.9991.0011.0021.0021.0021.0021.001
Unconstrained Gamma Model
T1T2T3T4T5T6T7T8T9
η 2 0.9991.0001.0001.0001.0061.0000.9991.0000.999
η 3 1.0001.0000.9991.0021.0010.9991.0041.0001.003
η 4 1.0021.0011.0041.0001.0011.0001.0040.9991.001
η 5 1.0041.0001.0021.0020.9991.0011.0001.0031.002
η 6 1.0031.0021.0021.0021.0000.9990.9991.0000.999
η 7 1.0001.0041.0011.0011.0020.9991.0011.0001.000
η 8 1.0031.0001.0031.0011.0001.0031.0021.0031.000
η 9 1.0001.0021.0001.0020.9991.0041.0111.0001.001
Constrained Lognormal Model
T1T2T3T4T5T6T7T8T9
η 2 0.9990.9990.9991.0001.0010.9991.0001.0000.999
η 3 1.0021.0021.0011.0011.0011.0021.0031.0001.002
η 4 0.9981.0011.0021.0001.0010.9991.0000.9981.002
η 5 1.0031.0000.9990.9990.9991.0011.0011.0001.003
η 6 1.0020.9990.9990.9991.0021.0000.9990.9991.003
η 7 0.9991.0011.0001.0000.9990.9990.9990.9991.000
η 8 1.0011.0021.0001.0001.0010.9991.0001.0010.998
η 9 1.0011.0001.0001.0011.0000.9991.0011.0000.999
Constrained Gamma Model
T1T2T3T4T5T6T7T8T9
η 2 0.9991.0030.9991.0021.0001.0020.9990.9991.000
η 3 0.9991.0000.9991.0041.0031.0010.9991.0000.999
η 4 1.0000.9991.0031.0011.0041.0020.9990.9991.004
η 5 0.9991.0010.9991.0011.0011.0020.9990.9991.000
η 6 0.9991.0011.0001.0001.0000.9980.9991.0001.000
η 7 1.0001.0010.9991.0000.9990.9991.0001.0011.000
η 8 1.0000.9991.0001.0000.9981.0001.0001.0010.998
η 9 1.0011.0001.0011.0001.0001.0011.0001.0021.000
Figure A1. Randomly chosen traceplots for marginal models.

References

  1. Aas, Kjersti, Claudia Czado, Arnoldo Frigessi, and Henrik Bakken. 2009. Pair-copula constructions of multiple dependence. Insurance: Mathematics and Economics 44: 182–98.
  2. Abdallah, Anas, Jean-Philippe Boucher, and Helene Cossette. 2015. Modeling dependence between loss triangles with hierarchical Archimedean copulas. ASTIN Bulletin: The Journal of the IAA 45: 577–99.
  3. Arreola Hernandez, Jose, Shawkat Hammoudeh, Duc Khuong Nguyen, Mazin A. M. Al Janabi, and Juan Carlos Reboredo. 2017. Global financial crisis and dependence risk analysis of sector portfolios: A vine copula approach. Applied Economics 49: 2409–27.
  4. Bedford, Tim, and Roger M. Cooke. 2001. Probability density decomposition for conditionally dependent random variables modeled by vines. Annals of Mathematics and Artificial Intelligence 32: 245–68.
  5. Bedford, Tim, and Roger M. Cooke. 2002. Vines - A new graphical model for dependent random variables. The Annals of Statistics 30: 1031–68.
  6. Beran, Rudolf. 1977. Minimum Hellinger distance estimates for parametric models. The Annals of Statistics 5: 445–63.
  7. Braun, Christian. 2004. The prediction error of the chain ladder method applied to correlated run-off triangles. ASTIN Bulletin: The Journal of the IAA 34: 399–423.
  8. Czado, Claudia. 2010. Pair-copula constructions of multivariate copulas. In Copula Theory and Its Applications. Berlin/Heidelberg: Springer, pp. 93–109.
  9. Czado, Claudia. 2019. Analyzing Dependent Data with Vine Copulas. Berlin/Heidelberg: Springer.
  10. Dew, Edward, and Barton Hedges. 1998. Reserving for excess layers: A guide to practical reserving applications. Insurance: Mathematics and Economics 22: 178.
  11. Dissmann, Jeffrey, Eike Christian Brechmann, Claudia Czado, and Dorota Kurowicka. 2013. Selecting and estimating regular vine copulae and application to financial returns. Computational Statistics & Data Analysis 59: 52–69.
  12. Embrechts, Paul, Filip Lindskog, and Alexander McNeil. 2001. Modelling Dependence with Copulas. Rapport technique. Zurich: Département de mathématiques, Institut Fédéral de Technologie de Zurich.
  13. England, Peter D., and Richard J. Verrall. 2002. Stochastic claims reserving in general insurance. British Actuarial Journal 8: 443–518.
  14. Gao, Guangyuan. 2018. Bayesian Claims Reserving Methods in Non-life Insurance with Stan. Berlin/Heidelberg: Springer.
  15. Geisser, Seymour. 2017. Predictive Inference. Abingdon-on-Thames: Routledge.
  16. Gelfand, Alan E., and Dipak Kumar Dey. 1994. Bayesian model choice: Asymptotics and exact calculations. Journal of the Royal Statistical Society: Series B (Methodological) 56: 501–14.
  17. Gelfand, Alan E., Dipak Kumar Dey, and Hong Chang. 1992. Model Determination Using Predictive Distributions with Implementation via Sampling-Based Methods. Technical Report. Stanford: Department of Statistics, Stanford University.
  18. Gelman, Andrew, and Donald Bruce Rubin. 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7: 457–72.
  19. Genest, Christian, and Anne-Catherine Favre. 2007. Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12: 347–68.
  20. Goh, Gyuhyeong, and Dipak Kumar Dey. 2014. Bayesian model diagnostics using functional Bregman divergence. Journal of Multivariate Analysis 124: 371–83.
  21. Gruber, Lutz Fabian, and Claudia Czado. 2015. Sequential Bayesian model selection of regular vine copulas. Bayesian Analysis 10: 937–63.
  22. Gruber, Lutz Fabian, and Claudia Czado. 2018. Bayesian model selection of regular vine copulas. Bayesian Analysis 13: 1107–31.
  23. Haff, Ingrid Hobæk, Kjersti Aas, and Arnoldo Frigessi. 2010. On the simplified pair-copula construction - Simply useful or too simplistic? Journal of Multivariate Analysis 101: 1296–310.
  24. Hellinger, Ernst. 1909. Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen. Journal für die reine und angewandte Mathematik 136: 210–71.
  25. Hua, Lei, and Harry Joe. 2011. Tail order and intermediate tail dependence of multivariate copulas. Journal of Multivariate Analysis 102: 1454–71.
  26. Ibrahim, Joseph G., Ming-Hui Chen, and Debajyoti Sinha. 2014. Bayesian Survival Analysis. Berlin/Heidelberg: Springer.
  27. Jeong, Himchan. 2020. Testing for random effects in compound risk models via Bregman divergence. ASTIN Bulletin: The Journal of the IAA 50: 777–98.
  28. Joe, Harry, and Dorota Kurowicka. 2011. Dependence Modeling: Vine Copula Handbook. Singapore: World Scientific.
  29. Joe, Harry, and James Jianmeng Xu. 1996. The Estimation Method of Inference Functions for Margins for Multivariate Models. Technical Report. Vancouver: University of British Columbia.
  30. Kreuzer, Alexander, and Claudia Czado. 2019. Bayesian inference for dynamic vine copulas in higher dimensions. arXiv:1911.00702.
  31. Loaiza Maya, Ruben Albeiro, Jose Eduardo Gomez-Gonzalez, and Luis Fernando Melo Velandia. 2015. Latin American exchange rate dependencies: A regular vine copula approach. Contemporary Economic Policy 33: 535–49.
  32. Louzada, Francisco, and Paulo Henrique Ferreira da Silva. 2016. Modified inference function for margins for the bivariate Clayton copula-based SUN Tobit model. Journal of Applied Statistics 43: 2956–76.
  33. Mack, Thomas. 1993. Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin: The Journal of the IAA 23: 213–25.
  34. Merz, Michael, and Mario V. Wüthrich. 2008. Prediction error of the multivariate chain ladder reserving method. North American Actuarial Journal 12: 175–97.
  35. Merz, Michael, and Mario V. Wüthrich. 2010. Paid–incurred chain claims reserving method. Insurance: Mathematics and Economics 46: 568–79.
  36. Nagler, Thomas, Christian Bumann, and Claudia Czado. 2019. Model selection in sparse high-dimensional vine copula models with an application to portfolio risk. Journal of Multivariate Analysis 172: 180–92.
  37. Nelsen, Roger B. 1999. An Introduction to Copulas. Berlin/Heidelberg: Springer.
  38. Peters, Gareth W., Alice X. D. Dong, and Robert Kohn. 2014. A copula based Bayesian approach for paid–incurred claims models for non-life insurance reserving. Insurance: Mathematics and Economics 59: 258–78.
  39. Reboredo, Juan C., and Andrea Ugolini. 2016. Systemic risk of Spanish listed banks: A vine copula CoVaR approach. Spanish Journal of Finance and Accounting/Revista Española de Financiación y Contabilidad 45: 1–31.
  40. Schamberger, Benedikt, Lutz Fabian Gruber, and Claudia Czado. 2017. Bayesian inference for latent factor copulas and application to financial risk forecasting. Econometrics 5: 21.
  41. Schmidt, Klaus D. 2006. Optimal and additive loss reserving for dependent lines of business. In Proceedings of the Casualty Actuarial Society Forum, pp. 319–51. Available online: https://www.casact.org/pubs/forum/06fforum/323.pdf (accessed on 10 October 2020).
  42. Shi, Peng, Sanjib Basu, and Glenn G. Meyers. 2012. A Bayesian log-normal model for multivariate loss reserving. North American Actuarial Journal 16: 29–51.
  43. Shi, Peng, and Edward W. Frees. 2011. Dependent loss reserving using copulas. ASTIN Bulletin: The Journal of the IAA 41: 449–86.
  44. Shi, Peng, and Lu Yang. 2018. Pair copula constructions for insurance experience rating. Journal of the American Statistical Association 113: 122–33.
  45. Sklar, Abe. 1959. Fonctions de repartition a n dimensions et leurs marges. Publications de l’Institut de Statistique de l’Universite de Paris 8: 229–31.
  46. Spiegelhalter, David J., Nicola G. Best, Bradley P. Carlin, and Angelika Van Der Linde. 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Methodological) 64: 583–639.
  47. Taylor, Greg, and Gráinne McGuire. 2016. Stochastic Loss Reserving Using Generalized Linear Models. Arlington County: Casualty Actuarial Society.
  48. Trucíos, Carlos, Aviral Kumar Tiwari, and Faisal Alqahtani. 2020. Value-at-risk and expected shortfall in cryptocurrencies’ portfolio: A vine copula–based approach. Applied Economics 52: 2580–93.
  49. Zhang, Yanwei. 2010. A general multivariate chain ladder model. Insurance: Mathematics and Economics 46: 588–99.
Figure 1. R-vine structures for 3 dimensions.
Figure 2. R-vine structure for tree 1.
Figure 3. Normalized contour plots for trees 1, 2, and 3.
Figure 4. Predictive density of aggregate unpaid claims for constrained models.
Table 1. Description of the lines of business.

| Index | Description |
| --- | --- |
| 1 | North American Workers’ Compensation |
| 2 | North American General Liability |
| 3 | North American Other Casualty |
| 4 | North American Non-Casualty |
| 5 | Overseas General Casualty |
| 6 | Overseas General Non-Casualty |
| 7 | Overseas General Personal Accident |
| 8 | Global Reinsurance Property |
| 9 | Global Reinsurance Non-Property |
Table 2. An illustrative example of heterogeneity among business lines.

| Year | Paid Losses: Personal | Paid Losses: Commercial | Paid Losses: Aggregate | Incremental Development: Personal | Incremental Development: Commercial | Incremental Development: Aggregate |
| --- | --- | --- | --- | --- | --- | --- |
| 2 | 5,000,000 | 800,000 | 5,800,000 | - | - | - |
| 3 | 5,200,000 | 1,200,000 | 6,400,000 | 3.92% | 40.55% | 9.84% |
| 4 | 5,300,000 | 1,500,000 | 6,800,000 | 1.90% | 22.31% | 6.06% |
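Table 3. Estimated regression coefficients for marginal models. Each cell shows the posterior mean with the 5% and 95% quantiles in parentheses.

Unconstrained Lognormal Model

| | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\eta_2$ | 0.79 (0.73, 0.85) | 1.26 (1.19, 1.33) | 0.66 (0.62, 0.70) | 0.36 (0.32, 0.40) | 0.72 (0.70, 0.74) | 0.81 (0.79, 0.83) | 0.53 (0.53, 0.54) | 1.45 (1.28, 1.61) | 1.80 (1.74, 1.86) |
| $\eta_3$ | 0.32 (0.26, 0.38) | 0.61 (0.54, 0.69) | 0.22 (0.18, 0.26) | 0.05 (0.02, 0.09) | 0.27 (0.24, 0.29) | 0.19 (0.18, 0.21) | 0.09 (0.08, 0.09) | 0.28 (0.10, 0.47) | 0.68 (0.62, 0.74) |
| $\eta_4$ | 0.19 (0.13, 0.26) | 0.43 (0.35, 0.51) | 0.13 (0.09, 0.17) | 0.03 (0.00, 0.07) | 0.16 (0.14, 0.19) | 0.07 (0.05, 0.10) | 0.03 (0.03, 0.04) | 0.14 (0.02, 0.31) | 0.38 (0.31, 0.44) |
| $\eta_5$ | 0.14 (0.07, 0.21) | 0.24 (0.16, 0.33) | 0.08 (0.03, 0.13) | 0.03 (0.00, 0.07) | 0.11 (0.08, 0.14) | 0.03 (0.01, 0.05) | 0.01 (0.01, 0.02) | 0.12 (0.01, 0.28) | 0.26 (0.19, 0.33) |
| $\eta_6$ | 0.10 (0.02, 0.18) | 0.21 (0.12, 0.31) | 0.06 (0.01, 0.11) | 0.03 (0.00, 0.07) | 0.07 (0.04, 0.10) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.02) | 0.12 (0.01, 0.30) | 0.17 (0.10, 0.25) |
| $\eta_7$ | 0.08 (0.01, 0.16) | 0.10 (0.02, 0.21) | 0.04 (0.00, 0.09) | 0.03 (0.00, 0.08) | 0.05 (0.01, 0.08) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.02) | 0.14 (0.01, 0.34) | 0.12 (0.03, 0.20) |
| $\eta_8$ | 0.08 (0.01, 0.18) | 0.10 (0.01, 0.21) | 0.04 (0.00, 0.10) | 0.04 (0.00, 0.10) | 0.04 (0.01, 0.08) | 0.02 (0.00, 0.05) | 0.01 (0.00, 0.02) | 0.17 (0.01, 0.43) | 0.08 (0.01, 0.17) |
| $\eta_9$ | 0.10 (0.01, 0.24) | 0.11 (0.01, 0.26) | 0.06 (0.01, 0.13) | 0.06 (0.00, 0.14) | 0.04 (0.00, 0.09) | 0.03 (0.00, 0.06) | 0.01 (0.00, 0.03) | 0.23 (0.02, 0.57) | 0.10 (0.01, 0.22) |

Unconstrained Gamma Model

| | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\eta_2$ | 0.81 (0.76, 0.86) | 1.27 (1.21, 1.34) | 0.67 (0.63, 0.70) | 0.37 (0.33, 0.41) | 0.72 (0.70, 0.74) | 0.81 (0.79, 0.83) | 0.53 (0.53, 0.54) | 1.60 (1.43, 1.76) | 1.81 (1.76, 1.87) |
| $\eta_3$ | 0.32 (0.27, 0.38) | 0.62 (0.55, 0.69) | 0.22 (0.18, 0.26) | 0.05 (0.01, 0.10) | 0.27 (0.24, 0.29) | 0.19 (0.17, 0.21) | 0.09 (0.08, 0.09) | 0.29 (0.11, 0.47) | 0.68 (0.63, 0.74) |
| $\eta_4$ | 0.19 (0.13, 0.25) | 0.44 (0.36, 0.51) | 0.13 (0.09, 0.17) | 0.03 (0.00, 0.07) | 0.16 (0.14, 0.19) | 0.08 (0.05, 0.10) | 0.03 (0.03, 0.04) | 0.14 (0.02, 0.31) | 0.38 (0.32, 0.44) |
| $\eta_5$ | 0.14 (0.07, 0.20) | 0.25 (0.17, 0.33) | 0.08 (0.03, 0.13) | 0.03 (0.00, 0.07) | 0.11 (0.09, 0.14) | 0.03 (0.01, 0.05) | 0.01 (0.01, 0.02) | 0.12 (0.01, 0.29) | 0.26 (0.20, 0.33) |
| $\eta_6$ | 0.09 (0.02, 0.17) | 0.21 (0.13, 0.30) | 0.06 (0.02, 0.11) | 0.03 (0.00, 0.07) | 0.07 (0.04, 0.10) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.02) | 0.13 (0.01, 0.31) | 0.17 (0.10, 0.25) |
| $\eta_7$ | 0.07 (0.01, 0.15) | 0.10 (0.02, 0.20) | 0.03 (0.00, 0.08) | 0.03 (0.00, 0.09) | 0.04 (0.01, 0.08) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.02) | 0.14 (0.01, 0.35) | 0.12 (0.04, 0.20) |
| $\eta_8$ | 0.08 (0.01, 0.17) | 0.10 (0.01, 0.20) | 0.04 (0.00, 0.10) | 0.04 (0.00, 0.11) | 0.04 (0.01, 0.08) | 0.02 (0.00, 0.05) | 0.01 (0.00, 0.02) | 0.18 (0.01, 0.45) | 0.08 (0.01, 0.17) |
| $\eta_9$ | 0.10 (0.01, 0.22) | 0.10 (0.00, 0.26) | 0.05 (0.00, 0.13) | 0.06 (0.01, 0.15) | 0.03 (0.00, 0.08) | 0.02 (0.00, 0.06) | 0.01 (0.00, 0.02) | 0.25 (0.02, 0.64) | 0.09 (0.01, 0.22) |

Constrained Lognormal Model

| | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\eta_2$ | 0.79 (0.74, 0.85) | 1.26 (1.19, 1.32) | 0.66 (0.63, 0.70) | 0.36 (0.32, 0.40) | 0.72 (0.70, 0.74) | 0.81 (0.79, 0.83) | 0.53 (0.53, 0.54) | 1.45 (1.29, 1.61) | 1.80 (1.75, 1.85) |
| $\eta_3$ | 0.33 (0.27, 0.38) | 0.61 (0.54, 0.69) | 0.21 (0.18, 0.25) | 0.07 (0.04, 0.11) | 0.27 (0.24, 0.29) | 0.19 (0.17, 0.21) | 0.09 (0.08, 0.09) | 0.33 (0.20, 0.48) | 0.68 (0.62, 0.73) |
| $\eta_4$ | 0.20 (0.15, 0.26) | 0.43 (0.36, 0.50) | 0.14 (0.10, 0.17) | 0.05 (0.03, 0.08) | 0.16 (0.14, 0.19) | 0.08 (0.05, 0.10) | 0.03 (0.03, 0.04) | 0.21 (0.12, 0.33) | 0.38 (0.32, 0.44) |
| $\eta_5$ | 0.15 (0.10, 0.20) | 0.27 (0.20, 0.34) | 0.09 (0.06, 0.12) | 0.04 (0.02, 0.06) | 0.11 (0.09, 0.14) | 0.04 (0.02, 0.06) | 0.02 (0.01, 0.02) | 0.16 (0.08, 0.25) | 0.26 (0.20, 0.32) |
| $\eta_6$ | 0.11 (0.07, 0.16) | 0.20 (0.13, 0.27) | 0.06 (0.04, 0.09) | 0.03 (0.01, 0.05) | 0.08 (0.05, 0.10) | 0.02 (0.01, 0.04) | 0.01 (0.01, 0.02) | 0.12 (0.05, 0.20) | 0.18 (0.13, 0.24) |
| $\eta_7$ | 0.08 (0.03, 0.12) | 0.13 (0.06, 0.20) | 0.04 (0.02, 0.07) | 0.02 (0.01, 0.04) | 0.05 (0.03, 0.07) | 0.02 (0.01, 0.03) | 0.01 (0.00, 0.01) | 0.09 (0.03, 0.16) | 0.12 (0.07, 0.18) |
| $\eta_8$ | 0.05 (0.01, 0.10) | 0.08 (0.02, 0.15) | 0.03 (0.01, 0.05) | 0.01 (0.00, 0.03) | 0.03 (0.01, 0.06) | 0.01 (0.00, 0.02) | 0.00 (0.00, 0.01) | 0.06 (0.01, 0.12) | 0.08 (0.03, 0.13) |
| $\eta_9$ | 0.03 (0.00, 0.07) | 0.04 (0.00, 0.10) | 0.01 (0.00, 0.03) | 0.01 (0.00, 0.02) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.01) | 0.00 (0.00, 0.01) | 0.03 (0.00, 0.08) | 0.04 (0.00, 0.09) |

Constrained Gamma Model

| | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\eta_2$ | 0.81 (0.76, 0.86) | 1.27 (1.21, 1.33) | 0.67 (0.63, 0.70) | 0.37 (0.33, 0.40) | 0.72 (0.70, 0.74) | 0.81 (0.79, 0.83) | 0.53 (0.53, 0.54) | 1.60 (1.44, 1.76) | 1.81 (1.76, 1.86) |
| $\eta_3$ | 0.33 (0.27, 0.38) | 0.62 (0.55, 0.69) | 0.22 (0.18, 0.25) | 0.07 (0.04, 0.10) | 0.27 (0.24, 0.29) | 0.19 (0.18, 0.21) | 0.09 (0.08, 0.09) | 0.33 (0.21, 0.48) | 0.68 (0.63, 0.74) |
| $\eta_4$ | 0.20 (0.15, 0.25) | 0.44 (0.37, 0.51) | 0.13 (0.10, 0.17) | 0.05 (0.03, 0.07) | 0.16 (0.14, 0.19) | 0.08 (0.06, 0.10) | 0.03 (0.03, 0.04) | 0.22 (0.12, 0.32) | 0.38 (0.32, 0.44) |
| $\eta_5$ | 0.15 (0.10, 0.19) | 0.27 (0.20, 0.34) | 0.09 (0.06, 0.12) | 0.04 (0.02, 0.06) | 0.11 (0.09, 0.14) | 0.04 (0.02, 0.06) | 0.02 (0.01, 0.02) | 0.16 (0.08, 0.25) | 0.26 (0.21, 0.32) |
| $\eta_6$ | 0.11 (0.06, 0.15) | 0.20 (0.14, 0.27) | 0.06 (0.03, 0.09) | 0.03 (0.01, 0.05) | 0.08 (0.05, 0.10) | 0.02 (0.01, 0.04) | 0.01 (0.01, 0.02) | 0.12 (0.05, 0.20) | 0.18 (0.13, 0.24) |
| $\eta_7$ | 0.08 (0.04, 0.12) | 0.13 (0.06, 0.19) | 0.04 (0.02, 0.07) | 0.02 (0.01, 0.04) | 0.05 (0.03, 0.07) | 0.02 (0.01, 0.03) | 0.01 (0.00, 0.01) | 0.09 (0.03, 0.16) | 0.12 (0.07, 0.18) |
| $\eta_8$ | 0.05 (0.02, 0.09) | 0.08 (0.02, 0.15) | 0.03 (0.01, 0.05) | 0.01 (0.00, 0.03) | 0.03 (0.01, 0.06) | 0.01 (0.00, 0.02) | 0.00 (0.00, 0.01) | 0.06 (0.01, 0.12) | 0.08 (0.03, 0.13) |
| $\eta_9$ | 0.03 (0.00, 0.06) | 0.04 (0.00, 0.10) | 0.01 (0.00, 0.04) | 0.01 (0.00, 0.02) | 0.02 (0.00, 0.04) | 0.01 (0.00, 0.01) | 0.00 (0.00, 0.01) | 0.03 (0.00, 0.08) | 0.04 (0.00, 0.09) |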
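Table 4. DIC and LPML for marginal components.

| Criterion | Model | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
|---|---|---|---|---|---|---|---|---|---|---|
| DIC | LNU | −57.1 | −46.5 | −90.3 | −84.2 | −125.1 | −140.5 | −224 | 16.8 | −61.8 |
| DIC | GamU | −38.6 | −7.7 | −74.4 | −75.3 | −105.9 | −123.6 | −214.1 | 47.6 | −14.9 |
| DIC | LNC | −62.9 | −50.5 | −95.9 | −88.4 | −129.2 | −144.1 | −228.5 | 11.9 | −66.7 |
| DIC | GamC | −44.2 | −11.8 | −79.2 | −79.6 | −110.5 | −127.3 | −218 | 42.8 | −19.2 |
| LPML | LNU | −326.2 | −129.3 | −117.2 | −291.7 | −83.2 | −127.3 | −43.2 | −241.8 | −155.2 |
| LPML | GamU | −379.2 | −168.6 | −144.3 | −364.8 | −103.1 | −162 | −62.1 | −342.7 | −215.2 |
| LPML | LNC | −368.1 | −132.7 | −123.4 | −300.6 | −82.6 | −118.1 | −43.4 | −242 | −144.8 |
| LPML | GamC | −419.6 | −176.4 | −145.5 | −392.5 | −100.1 | −161.3 | −57.5 | −350.5 | −219.5 |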
Table 5. Specification of vine structure.

| Tree | Edge | Family | ϕ | τ |
|---|---|---|---|---|
| 1 | 9,8 | Gumbel | 1.87 | 0.46 |
| 1 | 9,6 | 90° rotated Clayton | −1.20 | −0.37 |
| 1 | 8,1 | Frank | 9.40 | 0.65 |
| 1 | 1,7 | 270° rotated Clayton | −0.20 | −0.09 |
| 1 | 7,2 | 90° rotated Clayton | −0.95 | −0.32 |
| 1 | 6,4 | Frank | 5.87 | 0.50 |
| 1 | 9,5 | 270° rotated Gumbel | −1.24 | −0.20 |
| 1 | 4,3 | Survival Gumbel | 1.61 | 0.38 |
| 2 | 9,4;6 | 90° rotated Gumbel | −1.35 | −0.26 |
| 3 | 9,7;8,1 | Survival Clayton | 0.52 | 0.21 |
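
The ϕ and τ columns of Table 5 can be cross-checked with the standard closed-form relations between a copula parameter and Kendall's tau: τ = 1 − 1/θ for the Gumbel family and τ = θ/(θ + 2) for the Clayton family. The sketch below is illustrative only. It assumes that ϕ is the copula parameter, that 90° and 270° rotations are reported with a negative parameter and carry a negative τ, and that survival (180°) copulas keep the sign of the unrotated family. The function name `kendall_tau` and the row list are hypothetical. The Frank rows are left out because their τ requires a Debye function rather than a one-line formula.

```python
def kendall_tau(family: str, phi: float) -> float:
    """Closed-form Kendall's tau for Gumbel- and Clayton-type pair copulas.

    Assumption: a negative phi marks a 90/270 degree rotation, so tau is
    the negative of the value implied by |phi| for the base family.
    """
    theta = abs(phi)
    if "Gumbel" in family:
        tau = 1.0 - 1.0 / theta
    elif "Clayton" in family:
        tau = theta / (theta + 2.0)
    else:
        raise ValueError(f"no closed form implemented for family: {family}")
    return -tau if phi < 0 else tau


# (edge, family, phi, reported tau) copied from Table 5, Frank rows omitted.
TABLE5_ROWS = [
    ("9,8", "Gumbel", 1.87, 0.46),
    ("9,6", "90° rotated Clayton", -1.20, -0.37),
    ("1,7", "270° rotated Clayton", -0.20, -0.09),
    ("7,2", "90° rotated Clayton", -0.95, -0.32),
    ("9,5", "270° rotated Gumbel", -1.24, -0.20),
    ("4,3", "Survival Gumbel", 1.61, 0.38),
    ("9,4;6", "90° rotated Gumbel", -1.35, -0.26),
    ("9,7;8,1", "Survival Clayton", 0.52, 0.21),
]

for edge, family, phi, reported in TABLE5_ROWS:
    implied = kendall_tau(family, phi)
    print(f"{edge:>8}  {family:<22} implied tau = {implied:+.3f}  reported = {reported:+.2f}")
```

Under these assumptions the implied values agree with the reported τ to within about 0.01, which is the size of difference one would expect from rounding ϕ to two decimals.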
Table 6. Summary of unpaid claims prediction with the fitted models.

| Accident year | Unconstrained Copula | Unconstrained Silo | Constrained Copula | Constrained Silo | Actual |
|---|---|---|---|---|---|
| AY = 2005 | 529,426 | 219,222 | 148,767 | 81,657 | 90,662 |
| AY = 2006 | 264,689 | 167,787 | 185,653 | 131,065 | 109,420 |
| AY = 2007 | 299,024 | 196,099 | 319,881 | 227,435 | 209,649 |
| AY = 2008 | 454,503 | 404,983 | 461,885 | 414,982 | 338,359 |
| AY = 2009 | 405,808 | 386,094 | 447,697 | 409,172 | 304,544 |
| AY = 2010 | 651,228 | 610,943 | 703,166 | 615,770 | 450,724 |
| AY = 2011 | 942,324 | 1,071,867 | 986,753 | 1,070,549 | 762,328 |
| AY = 2012 | 2,182,202 | 2,649,858 | 2,181,708 | 2,648,604 | 1,837,070 |
| Total | 5,729,205 | 5,706,854 | 5,435,511 | 5,599,236 | 4,102,756 |
Table 7. Summary of validation performance measures of the fitted models.

| Measure | Unconstrained Copula | Unconstrained Silo | Constrained Copula | Constrained Silo |
|---|---|---|---|---|
| RMSE | 234,540 | 318,849 | 190,381 | 315,934 |
| MAE | 203,306 | 203,900 | 166,594 | 189,311 |
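
The validation measures in Table 7 can be recomputed from the accident-year rows of Table 6. The sketch below assumes that RMSE and MAE are taken over the eight accident-year totals (2005 to 2012), comparing each model's predicted unpaid claims with the actual amounts. The variable and function names are illustrative.

```python
from math import sqrt

# Actual unpaid claims by accident year (2005..2012), from Table 6.
ACTUAL = [90662, 109420, 209649, 338359, 304544, 450724, 762328, 1837070]

# Predicted unpaid claims by accident year for each fitted model, from Table 6.
PREDICTED = {
    "Unconstrained Copula": [529426, 264689, 299024, 454503, 405808, 651228, 942324, 2182202],
    "Unconstrained Silo": [219222, 167787, 196099, 404983, 386094, 610943, 1071867, 2649858],
    "Constrained Copula": [148767, 185653, 319881, 461885, 447697, 703166, 986753, 2181708],
    "Constrained Silo": [81657, 131065, 227435, 414982, 409172, 615770, 1070549, 2648604],
}


def rmse(pred, actual):
    """Root mean squared error across accident years."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def mae(pred, actual):
    """Mean absolute error across accident years."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


for model, pred in PREDICTED.items():
    print(f"{model:<22} RMSE = {rmse(pred, ACTUAL):>10,.0f}  MAE = {mae(pred, ACTUAL):>10,.0f}")
```

Under this assumption the computed values match Table 7 after rounding to the nearest unit, with the constrained copula model giving the smallest RMSE and MAE.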
Table 8. Estimated risk margins for the unpaid claims.

| | VaR 90% | VaR 95% | VaR 99% | CTE 90% | CTE 95% | CTE 99% |
|---|---|---|---|---|---|---|
| Marginal T1 | 384,383 | 420,016 | 494,059 | 432,782 | 464,901 | 521,869 |
| Marginal T2 | 1,403,464 | 1,520,387 | 1,720,617 | 1,560,576 | 1,657,167 | 1,893,831 |
| Marginal T3 | 655,532 | 694,812 | 766,766 | 707,338 | 740,161 | 805,153 |
| Marginal T4 | 1,657,500 | 1,802,081 | 2,126,876 | 1,858,735 | 1,993,388 | 2,313,396 |
| Marginal T5 | 777,885 | 807,011 | 869,457 | 818,302 | 844,170 | 895,520 |
| Marginal T6 | 691,142 | 714,401 | 757,403 | 721,538 | 741,557 | 791,119 |
| Marginal T7 | 405,697 | 413,424 | 428,243 | 416,023 | 423,117 | 439,641 |
| Marginal T8 | 586,402 | 696,629 | 1,009,717 | 757,769 | 878,869 | 1,199,343 |
| Marginal T9 | 556,892 | 591,841 | 658,353 | 603,369 | 633,144 | 703,385 |
| Marginal Total | 7,118,897 | 7,660,602 | 8,831,491 | 7,876,431 | 8,376,473 | 9,563,257 |
| Independent Aggregate | 6,163,307 | 6,363,592 | 6,755,125 | 6,428,484 | 6,608,753 | 6,978,484 |
| Copula Aggregate | 6,145,848 | 6,376,847 | 6,774,463 | 6,435,712 | 6,610,684 | 6,995,674 |
| Silo Aggregate | 6,434,983 | 6,692,370 | 7,309,860 | 6,815,535 | 7,076,775 | 7,586,896 |
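
Table 8 reports value at risk (VaR) and conditional tail expectation (CTE) at three confidence levels. As a rough illustration of how such measures could be estimated from simulated unpaid-claim totals, the sketch below uses a nearest-rank empirical quantile for VaR and the average of the losses strictly above it for CTE. Both estimator choices and the function names are assumptions made for this example, and the toy loss sample is not taken from the paper.

```python
import math


def value_at_risk(losses, level):
    """Nearest-rank empirical quantile of the simulated losses."""
    ordered = sorted(losses)
    rank = max(math.ceil(level * len(ordered)), 1)
    return ordered[rank - 1]


def conditional_tail_expectation(losses, level):
    """Average of the simulated losses that exceed the VaR at the same level."""
    var = value_at_risk(losses, level)
    tail = [x for x in losses if x > var]
    # If no loss exceeds the VaR (e.g. level = 1), fall back to the VaR itself.
    return sum(tail) / len(tail) if tail else var


# Toy sample standing in for simulated unpaid-claim totals.
toy_losses = list(range(1, 101))
print(value_at_risk(toy_losses, 0.90))                 # 90
print(conditional_tail_expectation(toy_losses, 0.90))  # 95.5
```

Because CTE averages over the tail beyond VaR, it is never smaller than VaR at the same level, which is the ordering seen in every row of Table 8.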
Table 9. Risk reduction from diversification.

| | VaR 90% | VaR 95% | VaR 99% | CTE 90% | CTE 95% | CTE 99% |
|---|---|---|---|---|---|---|
| Independent | 955,590 | 1,297,010 | 2,076,366 | 1,447,947 | 1,767,720 | 2,584,773 |
| Copula | 973,049 | 1,283,755 | 2,057,028 | 1,440,719 | 1,765,789 | 2,567,583 |
| Silo | 683,914 | 968,232 | 1,521,631 | 1,060,896 | 1,299,698 | 1,976,361 |
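
The entries of Table 9 are consistent with subtracting each aggregate risk measure in Table 8 from the corresponding "Marginal Total" entry, that is, the sum of the nine stand-alone measures minus the measure of the aggregated portfolio. The sketch below reproduces that arithmetic. The variable and function names are illustrative.

```python
LEVELS = ["VaR 90%", "VaR 95%", "VaR 99%", "CTE 90%", "CTE 95%", "CTE 99%"]

# "Marginal Total" row of Table 8: sum of the nine stand-alone risk measures.
MARGINAL_TOTAL = [7118897, 7660602, 8831491, 7876431, 8376473, 9563257]

# Aggregate rows of Table 8 under each dependence assumption.
AGGREGATE = {
    "Independent": [6163307, 6363592, 6755125, 6428484, 6608753, 6978484],
    "Copula": [6145848, 6376847, 6774463, 6435712, 6610684, 6995674],
    "Silo": [6434983, 6692370, 7309860, 6815535, 7076775, 7586896],
}


def diversification_benefit(marginal_total, aggregate):
    """Risk reduction: sum of stand-alone measures minus the aggregate measure."""
    return [m - a for m, a in zip(marginal_total, aggregate)]


for model, aggregate in AGGREGATE.items():
    benefit = diversification_benefit(MARGINAL_TOTAL, aggregate)
    row = ", ".join(f"{lvl}: {b:,}" for lvl, b in zip(LEVELS, benefit))
    print(f"{model:<12} {row}")
```

The differences match all eighteen entries of Table 9 exactly.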

Jeong, H.; Dey, D. Application of a Vine Copula for Multi-Line Insurance Reserving. Risks 2020, 8, 111. https://doi.org/10.3390/risks8040111
