1. Introduction
Recently, Robe-Voinea and Vernic (2016a, 2016b, 2017, 2018) and Vernic (2018) studied the recursive and Fast Fourier Transform (FFT)-based evaluation of the distribution of the following multivariate collective model:
which may arise in different contexts (see, e.g., the discussion in Section 14.1 of Sundt and Vernic (2009)), from which we mention the case where a policyholder has m types of policies, such as auto, home, business, etc., that can be simultaneously affected by some claim events, such as floods, storms or earthquakes. More precisely, in this case, denotes the aggregate claims affecting solely the policy of type j, while denotes the random variable (r.v.) number of claims simultaneously affecting all m types of policies, with denoting the size of the kth such claim corresponding to the policy of type j. The assumptions under which this model was considered are: Each set of claim sizes are non-negative, independent and identically distributed (i.i.d.) r.v.s, they are also independent of the claim numbers and of the other claim sizes, included; the random vectors are non-negative i.i.d. as the generic random vector and independent of the claim numbers, while the components of , however, are dependent; by convention,
Note that the above model assumes that a claim event affects either a single type of insurance line or all the insurance lines at once; there is no middle way, i.e., an event cannot affect only, say, lines 1 and 2, without causing claims in the other lines.
To overcome this drawback, in this paper we consider the more general multivariate collective model:
where
The m-variate claim size random vectors are i.i.d. as the generic m-variate random vector whose jth univariate component if meaning that results from those claim events simultaneously affecting solely the lines ; these events are counted by the r.v. . Moreover, the s are also independent of the other claim size random vectors (i.e., of each , where ) and of the claim numbers. We let denote the jth univariate component of the probability function (p.f.) of (in the discrete case) and, by convention, .
The components of the random vector number of claims are dependent r.v.s, in total (maximum) $2^m - 1$, one for each non-empty set of lines that a claim event can affect.
We adopt the actuarial terminology in which the distribution of is called “compound” and the distribution of is called “counting”.
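To make the indexing of the claim-number vector concrete, the following minimal sketch lists the sets of lines that a single claim event can affect when there are m lines. The function name `line_subsets` is hypothetical and serves only as an illustration; it is not part of the model's notation.

```python
from itertools import combinations

def line_subsets(m):
    """Return all non-empty subsets of the lines 1..m, ordered by size.

    Each subset is one possible set of lines hit by a single claim event,
    so each subset corresponds to one claim-number component of the model.
    """
    lines = range(1, m + 1)
    return [subset
            for size in range(1, m + 1)
            for subset in combinations(lines, size)]

subsets = line_subsets(3)
print(len(subsets))   # number of claim-number components for m = 3
print(subsets)
```

For m = 3 the list runs from the single lines up to all three lines together, which shows how quickly the number of components grows with m.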
To evaluate the distribution of this model, we shall consider that all the claim distributions are of the discrete type (e.g., they have been previously discretized; this is a usual assumption for collective models). We start the next section by presenting the exact formula of the p.f. of based on convolutions, which, unfortunately, is impractical. Therefore, we also aim at developing recursions for the evaluation of this distribution, an approach that requires supplementary assumptions under which recursive formulas can be obtained; examples of such recursions are given in Section 2.1. Apart from the restrictive assumptions, another important drawback of recursions is that they become very time consuming when the dimensionality m of the model increases (see the numerical examples in Section 2.3). To overcome these drawbacks, in Section 2.2 we propose the use of the Fast Fourier Transform (FFT) technique, which can be applied whenever we know the form of the characteristic function of and which is very efficient when we want to evaluate the distribution’s tail. However, this remarkably fast method is an approximate one, and we must pay special attention to its specific errors; this aspect is illustrated by the numerical examples discussed in Section 2.3.
For simplicity, let us introduce more notation: We denote by the p.f. of , by g and the probability generating function (pgf) and the characteristic function (cf), respectively, of a r.v., which will be indexed with the r.v.’s name. Also, are vectors whose corresponding dimension results from the context, is the zero-vector, while the difference is componentwise. By we denote the sum of the components of the vector and by the n-fold convolution of f. To shorten the formulas, we rewrite the sum as .
2. Evaluation of the Compound Distribution
We start by presenting the exact formula of the p.f. of based on convolutions. This formula is so complex that, in general, it cannot be directly applied to find the distribution of .
Proposition 1. The p.f. of the multivariate collective model (2) is given by
where .
Proof. We have
which immediately yields the result. ☐
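The convolution-based formula relies on n-fold convolutions of multivariate p.f.s, which is what makes it costly. A minimal sketch of this building block is given below, assuming p.f.s stored as dictionaries that map integer points to probabilities; the function names and the bivariate p.f. are hypothetical illustrations, not the paper's data.

```python
def convolve(f, g):
    """Convolution of two discrete p.f.s given as {point (tuple): probability}."""
    out = {}
    for x, fx in f.items():
        for y, gy in g.items():
            z = tuple(a + b for a, b in zip(x, y))
            out[z] = out.get(z, 0.0) + fx * gy
    return out

def n_fold(f, n):
    """n-fold convolution of f; the 0-fold convolution is a unit mass at zero."""
    m = len(next(iter(f)))
    result = {(0,) * m: 1.0}
    for _ in range(n):
        result = convolve(result, f)
    return result

# Hypothetical bivariate claim size p.f. on a small support.
f = {(1, 0): 0.5, (0, 1): 0.3, (1, 1): 0.2}
f3 = n_fold(f, 3)
print(len(f3), sum(f3.values()))
```

Each additional convolution multiplies the number of point pairs to be visited, and the exact formula needs such convolutions for every claim size vector and every claim count, which is why it is rarely usable directly.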
We shall also need the pgf and the cf of .
Proposition 2. The pgf and cf of the general multivariate collective model (2) are, respectively, given by
Proof. We prove only the pgf formula (the one for the cf follows along the same lines). Considering the independence assumptions of the model, we have
hence the formula (3). ☐
2.1. Recursive Evaluation
Because the exact formula from Proposition 1 is difficult to apply directly, we now present examples of alternative recursive formulas for obtaining the p.f. of under some supplementary assumptions. These assumptions are chosen such that the multivariate compound distribution of can be rewritten as a compound distribution with a univariate counting distribution, to which existing recursions can be applied.
2.1.1. Case 1 Assumptions
As in Robe-Voinea and Vernic (2017), we assume that follows the multivariate Poisson distribution with parameters and having the pgf (see, e.g., Johnson et al. (1997))
As a consequence, Proposition 2 easily yields the following pgf and cf
Also, two recursive formulas for evaluating the distribution of are obtained in the following proposition, where we denote by the p.f. of the sum r.v. .
Proposition 3. Under the assumption that it holds that
and
with starting value
where . In the above formulas, is such that is a permutation of its components.
Proof. Due to the independence of the random vectors we have that
therefore, we can rewrite the pgf (5) as
meaning that in this case, the distribution of Model (2) is also a compound distribution, with a univariate Poisson counting distribution. More precisely, can also be rewritten as
where , while the random vectors are i.i.d. as the m-variate random vector having the mixture p.f.
Regarding model (8), with satisfying Panjer’s recursion (see Panjer (1981)) with parameters $a$ and $b$, i.e., with a p.f. $p$ such that $p(n) = (a + b/n)\,p(n-1)$ for $n \ge 1$, from Sundt (1999) (see also formulas (15.4) and (15.5), respectively, in Sundt and Vernic (2009)) it holds that
Since in our case the counting distribution is Poisson, we have $a = 0$ and $b$ equal to the Poisson parameter. Based on this, we insert Equation (9) into Equation (10) and obtain for
We know that if hence, concerning the argument of we can take the components . Therefore, if clearly in the argument of which yields the first stated formula. The second formula results in a similar way by inserting Equation (9) into Equation (11), while the starting value is immediate from and from the above form of . This completes the proof. ☐
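The recursion used in this proof can be sketched in code for a compound distribution with a univariate Poisson counting variable. The sketch assumes the usual Poisson form of the multivariate Panjer-type recursion, $f(x) = \frac{\lambda}{x_\bullet} \sum_{0 < y \le x} y_\bullet\, h(y)\, f(x - y)$ with $f(0) = e^{-\lambda(1 - h(0))}$, where $x_\bullet$ is the sum of the components of $x$ and $h$ is the p.f. of the claim size vector. The function name, the dictionary representation and the numerical values are illustrative assumptions.

```python
import math
from itertools import product

def compound_poisson_pf(lam, h, upper):
    """p.f. of a compound Poisson random vector on the grid 0..upper[j].

    lam   : Poisson parameter of the counting variable
    h     : claim size p.f. as {point (tuple): probability}
    upper : largest value evaluated in each coordinate
    """
    m = len(upper)
    zero = (0,) * m
    f = {zero: math.exp(-lam * (1.0 - h.get(zero, 0.0)))}
    # Visit grid points by increasing component sum, so f(x - y) is known.
    grid = sorted(product(*(range(u + 1) for u in upper)), key=sum)
    for x in grid:
        if x == zero:
            continue
        total = 0.0
        for y, hy in h.items():
            if y == zero or any(yj > xj for yj, xj in zip(y, x)):
                continue
            diff = tuple(xj - yj for xj, yj in zip(x, y))
            total += sum(y) * hy * f[diff]
        f[x] = lam * total / sum(x)
    return f

# Illustrative bivariate claim size p.f. with a common-shock point (1, 1).
h = {(1, 0): 0.4, (0, 1): 0.4, (1, 1): 0.2}
f = compound_poisson_pf(2.0, h, (10, 10))
print(f[(0, 0)], f[(1, 1)])
```

A simple sanity check of the sketch is the case where the claim size p.f. puts mass only on (1, 0) and (0, 1): the two aggregate claims are then independent Poisson variables, and the recursion should reproduce the product of their p.f.s.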
2.1.2. Case 2 Assumptions
- A1 The p.f. of the total number of claims satisfies Panjer’s recursion for .
- A2 Given the conditional distribution of the random vector number of claims is assumed to be multinomial with parameters and where such that Therefore, with and
Under these assumptions, the pgf, the cf of and two alternative recursive formulas are presented in the following.
Proposition 4. Under the assumptions (A1 and A2), the pgf and cf of the general multivariate collective model (2) become, respectively,
Proof. To obtain the pgf formula, we recall that the pgf of the multinomial distribution is (see, e.g., Johnson et al. (1997)) , so that the pgf of becomes for
Inserting this into Equation (3) easily yields Equation (12). Equation (13) follows in a similar way, which completes the proof. ☐
Proposition 5. Under the assumptions (A1 and A2) of Model (2), with starting value
the following recursive formula holds for
while for
where and is such that is a permutation of its components.
Proof. Considering the assumptions (A1 and A2), we rewrite Model (2) as
where while the random vectors are i.i.d. as the m-variate random vector with the p.f.
We use again Equations (10) and (11). By inserting Equation (16) into Equation (10), the stated formula of the constant K is easily obtained and, for
Using reasoning similar to that used in the proof of Proposition 3, we obtain Equation (14). Similarly, Equations (11) and (16) lead to Equation (15). This completes the proof. ☐
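The rewriting step in this proof replaces the individual claim size vectors by a single m-variate claim size vector with a mixture p.f. A minimal sketch of how such a mixture could be assembled is shown below. It assumes that the mixture weights are the multinomial probabilities and that the claim size vector attached to a set of lines has zero components for all other lines; the function names, weights and p.f.s are hypothetical.

```python
def embed(f_sub, subset, m):
    """Embed a p.f. defined on the lines in `subset` into m dimensions,
    putting zeros in the coordinates of the lines that are not affected."""
    out = {}
    for x, prob in f_sub.items():
        full = [0] * m
        for line, value in zip(subset, x):
            full[line - 1] = value
        key = tuple(full)
        out[key] = out.get(key, 0.0) + prob
    return out

def mixture_pf(weights, claim_pfs, m):
    """Mixture p.f.: sum over subsets A of weights[A] * (embedded p.f. of A)."""
    h = {}
    for subset, w in weights.items():
        for x, prob in embed(claim_pfs[subset], subset, m).items():
            h[x] = h.get(x, 0.0) + w * prob
    return h

# Hypothetical bivariate example: lines 1 and 2, three kinds of events.
weights = {(1,): 0.5, (2,): 0.3, (1, 2): 0.2}
claim_pfs = {
    (1,): {(1,): 0.6, (2,): 0.4},
    (2,): {(1,): 1.0},
    (1, 2): {(1, 1): 0.7, (2, 1): 0.3},
}
h = mixture_pf(weights, claim_pfs, m=2)
print(h)
```

Once the mixture p.f. is available, the model is an ordinary compound distribution with a univariate counting variable, so a Panjer-type recursion only needs this single p.f. as input.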
Particular case: $m = 3$. Let us now have a look at a recursive formula in the trivariate case, where the general Model (2) is
with
For example, Equation (15) becomes
where .
2.1.3. Case 3 Assumptions
Another assumption under which recursive formulas already exist is the univariate mixed Poisson counting distribution. To this end, we assume that, given that a positive univariate r.v. takes the value the r.v.s are all i.i.d. Poisson distributed such that
Then, the pgf of given becomes, from Equation (3):
where . This is the pgf of a compound distribution with univariate Poisson counting distribution and multivariate claims distribution having p.f.
hence, the conditional distribution of , given can be evaluated based on Equations (10) and (11), with and . To find the unconditional distribution of we use the technique described in Chapter 20 of Sundt and Vernic (2009). Therefore, with U denoting the distribution function of , we introduce the auxiliary functions
and note that
Multiplying Equations (10) and (11) by and integrating yields the following two recursions for
with starting value . Therefore, the algorithm for evaluating for all is more complex and involves the backward evaluation of all (here backward means by decreasing i; see, e.g., the algorithm in Section 20.4.1 of Sundt and Vernic (2009)). Since this algorithm is very time consuming, we do not pursue it further. However, we note that the recursions can be refined under the assumption that the mixing distribution U is of the continuous type, with the density denoted by u satisfying the condition
This is also called Willmot’s mixing distribution; see Willmot (1993).
Remark 1. In view of the FFT, we also display the formula of the cf of given
where
Particular case: Simpler recursions are obtained when is gamma distributed, with . In this case, the univariate mixed Poisson distribution becomes a Negative Binomial distribution which satisfies Panjer’s recursion with and . Since
where hence and it follows that we can use Equations (10) and (11) to obtain direct recursions for i.e.,
with starting value
Moreover, regarding the cf, we easily obtain
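The statement that a gamma-mixed Poisson distribution is Negative Binomial and belongs to Panjer's class can be checked numerically. The sketch below assumes a gamma mixing distribution with shape `alpha` and rate `beta`, which may differ from the parameterization used in the text; under that assumption the Panjer parameters are $a = 1/(1+\beta)$ and $b = (\alpha - 1)/(1+\beta)$. The function names are hypothetical.

```python
import math

def negbin_direct(n, alpha, beta):
    """P(N = n) when N | theta ~ Poisson(theta) and theta ~ Gamma(shape alpha, rate beta)."""
    q = 1.0 / (1.0 + beta)
    log_p = (math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
             + alpha * math.log(1.0 - q) + n * math.log(q))
    return math.exp(log_p)

def negbin_panjer(n_max, alpha, beta):
    """Same p.f. on 0..n_max, computed with p(n) = (a + b/n) p(n-1)."""
    q = 1.0 / (1.0 + beta)
    a, b = q, (alpha - 1.0) * q
    p = [(1.0 - q) ** alpha]          # starting value p(0)
    for n in range(1, n_max + 1):
        p.append((a + b / n) * p[-1])
    return p

alpha, beta = 2.5, 0.5                # illustrative values only
p = negbin_panjer(20, alpha, beta)
print(max(abs(p[n] - negbin_direct(n, alpha, beta)) for n in range(21)))
```

The printed value is the largest discrepancy between the recursive and the closed-form probabilities, and it should be at the level of floating-point rounding.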
2.2. Fast Fourier Transform Evaluation
The recursive method is an exact one, but, as already mentioned in the introduction, it has some important drawbacks: It can be applied only to some particular models and it becomes quite slow as the dimensionality of increases. A much faster and less restrictive way to evaluate the p.f. of is provided by the Fast Fourier Transform method, which is an approximate technique used to strongly reduce the computing time, especially when evaluating the distribution’s tail. As an advantage, this method can be applied to any model as long as its cf (4) (on which it is based) has a closed form, even if there is no recursive formula available. Therefore, the FFT technique received special consideration in the actuarial literature (see, e.g., Bühlmann (1984), Embrechts et al. (1993), Jin and Ren (2014) or Robe-Voinea and Vernic (2018)). It consists of an algorithm that computes the discrete Fourier transform of a multivariate function, as well as its inverse, extremely fast. Let denote an m-variate function defined on the integer support ; then its discrete Fourier transform, , and, respectively, the inverse mapping, can be defined by (definition consistent with the functions fftn and ifftn in Matlab)
In general, the FFT method requires that the values are powers of two for all j. For the multivariate model (2), this algorithm becomes:
FFT Algorithm for model (2)
Step 1. After setting the truncation point for each claim size random vector at the same , the corresponding truncated claim size distribution is obtained as ; if necessary, the resulting will be padded with zeros (e.g., to constrain the s to be powers of two).
Step 2. Apply the m-dimensional FFT to each , which results in the multidimensional table
Step 3. Use Equation (4) in the general case to obtain the discrete cf .
Step 4. Apply the multidimensional IFFT to to obtain the p.f. of .
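The four steps can be sketched for a bivariate compound Poisson distribution. The sketch uses a naive discrete Fourier transform written with the standard library so that it is self-contained; in practice one would call an FFT routine such as the fftn and ifftn functions mentioned above. In Step 3, the compound Poisson transform $\exp(\lambda(\hat h - 1))$ stands in for the general Equation (4); this choice, the function names and the numerical values are assumptions made for illustration.

```python
import cmath
import math

def dft2(a, sign):
    """Naive 2-D discrete Fourier transform of an r1 x r2 table.

    sign = -1 gives the forward transform, sign = +1 the unnormalised inverse.
    """
    r1, r2 = len(a), len(a[0])
    out = [[0j] * r2 for _ in range(r1)]
    for k1 in range(r1):
        for k2 in range(r2):
            s = 0j
            for x1 in range(r1):
                for x2 in range(r2):
                    angle = 2.0 * math.pi * (x1 * k1 / r1 + x2 * k2 / r2)
                    s += a[x1][x2] * cmath.exp(sign * 1j * angle)
            out[k1][k2] = s
    return out

def compound_poisson_fft(lam, h):
    """Approximate compound Poisson p.f. from a truncated, zero-padded table h."""
    r1, r2 = len(h), len(h[0])
    h_hat = dft2(h, -1)                                   # Step 2
    s_hat = [[cmath.exp(lam * (v - 1.0)) for v in row]    # Step 3
             for row in h_hat]
    inv = dft2(s_hat, +1)                                 # Step 4
    return [[(v / (r1 * r2)).real for v in row] for row in inv]

# Step 1: claim size p.f. on a 16 x 16 grid, padded with zeros.
r, lam = 16, 2.0
h = [[0.0] * r for _ in range(r)]
h[1][0], h[0][1] = 0.5, 0.5
f = compound_poisson_fft(lam, h)
print(f[0][0], f[1][1])
```

With this light-tailed example, the mass above the truncation point is negligible, so the result should agree closely with the product of two Poisson p.f.s with mean 1.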
Usually, to find the optimal s, one gradually increases them until the differences between the current solutions and the previous ones fall below a certain threshold (e.g., we increase as 32, 64, 128, 256, etc.). However, when dealing with heavy-tailed claim size distributions, the results of this method can be strongly affected by a specific error caused by the discrete Fourier transform, which consists of placing under the truncation point the compound probability mass which is in fact above this point. This so-called “aliasing error” (AE) can be significantly reduced by applying to the claim size distributions an exponential change of measure, hence forcing the tails of these distributions to decrease at an exponential rate; this transformation is known under the name of “exponential tilting” (for more details on this transformation see, e.g., Grübel and Hermesmeier (1999)).
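The effect of exponential tilting on the aliasing error can be illustrated in the univariate case, which keeps the sketch short. The claim size p.f. is multiplied by $e^{-\theta x}$ before the transform and the result is multiplied by $e^{\theta x}$ afterwards, so that mass wrapped around from above the truncation point is damped. The compound Poisson setting, the tilting value $\theta = 10/n$, the naive transform and all names are illustrative assumptions; the reference values come from the univariate Panjer recursion.

```python
import cmath
import math

def dft(a, sign):
    """Naive 1-D discrete Fourier transform (sign=-1 forward, +1 unnormalised inverse)."""
    n = len(a)
    return [sum(a[x] * cmath.exp(sign * 2j * math.pi * x * k / n) for x in range(n))
            for k in range(n)]

def compound_poisson_dft(lam, h, theta=0.0):
    """Compound Poisson p.f. on 0..n-1 from the table h, tilted by theta."""
    n = len(h)
    tilted = [h[x] * math.exp(-theta * x) for x in range(n)]
    h_hat = dft(tilted, -1)
    s_hat = [cmath.exp(lam * (v - 1.0)) for v in h_hat]
    g = [(v / n).real for v in dft(s_hat, +1)]
    return [g[x] * math.exp(theta * x) for x in range(n)]   # undo the tilting

def compound_poisson_exact(lam, h, n):
    """Reference values from the univariate Panjer recursion (Poisson case)."""
    f = [math.exp(-lam * (1.0 - h[0]))]
    for x in range(1, n):
        f.append(lam / x * sum(y * h[y] * f[x - y] for y in range(1, x + 1)))
    return f

n, lam = 32, 3.0
h = [0.0] + [0.1] * 10 + [0.0] * (n - 11)    # uniform on 1..10, zero-padded
exact = compound_poisson_exact(lam, h, n)
plain = compound_poisson_dft(lam, h)
tilted = compound_poisson_dft(lam, h, theta=10.0 / n)
err_plain = max(abs(u - v) for u, v in zip(plain, exact))
err_tilted = max(abs(u - v) for u, v in zip(tilted, exact))
print(err_plain, err_tilted)
```

In this example a noticeable share of the compound mass lies above the truncation point, so the untilted result is visibly off, while the tilted one should be much closer to the recursion. Very large tilting values amplify rounding errors when the tilting is undone, which is one reason why the tilting parameter has to be chosen with care.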
Particular cases: Under the particular assumptions considered in the previous section to allow for a recursive evaluation, one should use the following formulas at Step 3 of the above algorithm:
- Under the Case 1 multivariate Poisson assumption, is given by Equation (6);
- Under the Case 2 assumptions (A1 and A2), is given by Equation (13);
- Under the Case 3 mixed Poisson assumption, is given by Equation (18).
2.3. Numerical Illustration
In this section, we consider a particular trivariate model (2) with for which we implemented both the recursive formulas and the FFT algorithm, under different assumptions.
As claim size distributions, we considered only type II Pareto distributions in order to emphasize the effect of the exponential tilting on the FFT technique. We recall that the decumulative distribution (or survival) function of the m-variate type II Pareto distribution is given by
The expected value of each marginal exists only if the shape parameter is greater than 1, while the variance exists only when it is greater than 2. We took (mainly from the numerical Example 4 in Robe-Voinea and Vernic (2018))
The expected value of and the variances of do not exist, hence we can see the effect of the exponential tilting in the heavy-tailed case. To discretize these distributions, we used the method of rounding with the span (good enough for illustration, but not optimal; see the discussion in Robe-Voinea and Vernic (2018)).
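The method of rounding can be sketched for a univariate type II Pareto distribution, which is shown instead of the m-variate case for brevity. The sketch assumes location zero, so that the survival function is $(1 + x/\sigma)^{-\alpha}$ for $x > 0$; the parameter values, the span and the function names are hypothetical and are not the ones used in the numerical study.

```python
def pareto2_cdf(x, alpha, sigma):
    """cdf of a univariate type II Pareto (Lomax) distribution on x >= 0."""
    return 1.0 - (1.0 + x / sigma) ** (-alpha) if x > 0 else 0.0

def discretize_rounding(cdf, span, n_points):
    """Method of rounding: the mass of [k*span - span/2, k*span + span/2)
    is assigned to the lattice point k, for k = 0..n_points-1."""
    f = [cdf(span / 2.0)]
    for k in range(1, n_points):
        f.append(cdf((k + 0.5) * span) - cdf((k - 0.5) * span))
    return f

alpha, sigma, span = 0.9, 1.0, 0.5    # alpha < 1: the mean does not exist
f = discretize_rounding(lambda x: pareto2_cdf(x, alpha, sigma), span, 64)
print(sum(f))   # mass captured below the truncation point
```

The printed total is well below 1 for such a heavy tail, which illustrates how much mass is cut off by the truncation and why the choice of the truncation point matters.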
Concerning the FFT method, as discussed in Section 2.2, we increased the truncation point (we took for simplicity) from 16 to 128 (unfortunately, generated an “out of memory” warning), and noticed that yielded sufficiently accurate results (for our data) compared to the exact method (see Tables 1, 3 and 5). Moreover, we also varied the tilting parameter and noticed that increasing improves the results up to while a larger value like does not significantly improve the results (see Table 4 in Example 2).
As expected, there is an important difference between the computing times required by the two methods. This difference grows as the truncation point increases and becomes really huge for in Example 1 and for in Examples 2 and 3. Therefore, we decided to compare the resulting p.f.s only up to a certain right endpoint denoted by even if the support of the FFT was much larger. Note that the discretization time was not taken into account in the displayed computing times since discretization is needed by both methods (the total discretization time up to was about 160 s).
To emphasize the differences between the FFT and the recursive results, we used the cumulative distribution function (cdf), the AE and the maximum absolute error evaluated between the exact p.f. and the FFT one; these last two are defined, respectively, by
We shall now present three examples based on the three particular cases considered in Section 2.1. From these examples, we also note that in cdf terms, an inequality caused by the AE that places compound mass below the truncation point.
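The mechanism behind this cdf inequality can be mimicked without any transform: aliasing folds the mass lying above the truncation point back onto the points below it, so every aliased probability is at least as large as the exact one. The sketch below reproduces this folding on a hypothetical univariate p.f. and computes the maximum absolute error; it ignores the truncation of the claim sizes and the tilting, and all names and values are illustrative.

```python
def wrap(f, r):
    """Fold a univariate p.f. (list) onto 0..r-1, as the discrete Fourier
    transform does with the mass lying above the truncation point r."""
    out = [0.0] * r
    for x, p in enumerate(f):
        out[x % r] += p
    return out

def cdf(f):
    """Cumulative sums of a p.f. given as a list."""
    total, out = 0.0, []
    for p in f:
        total += p
        out.append(total)
    return out

# Hypothetical exact p.f. on 0..11 and truncation point r = 8.
exact = [0.30, 0.20, 0.12, 0.10, 0.08, 0.06, 0.04, 0.03, 0.03, 0.02, 0.01, 0.01]
r = 8
aliased = wrap(exact, r)
max_abs_error = max(abs(a - e) for a, e in zip(aliased, exact[:r]))
dominates = all(Fa >= Fe for Fa, Fe in zip(cdf(aliased), cdf(exact[:r])))
print(max_abs_error, dominates)
```

Since the aliased probabilities are never smaller than the exact ones, their cumulative sums are never smaller either, which is the direction of the inequality noted above.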
Example 1. We assume that where ; since for this particular model, the recursive method (we implemented Equation (7)) implies the evaluation of the p.f. (i.e., multivariate convolutions), the corresponding computing time increases tremendously with Therefore, starting with we took only which needed about 30 min for the convolution part alone. However, the FFT was ready in only a few seconds even for , see Table 1, where we also display a comparison of the accuracy of the two methods. This example clearly emphasizes the speed discrepancy between the two methods and the important advantage of the FFT speed.
Example 2. We now assume that follows a Poisson distribution for which we recall that and Numerically, we took the multinomial parameters . We implemented the recursive Equation (17) and performed it up to the maximum in about 35 min. The speed difference between the two methods can be seen in Table 2, where we displayed the relative computing times Rec/FFT (for FFT took about 8 s). The accuracy comparison of the two methods is presented in Table 3 and the effect of changing the tilting parameters in Table 4, both supporting the above conclusions regarding the choices of r and
Example 3. This example is related to Case 3, i.e., follows a mixed Poisson distribution and, for simplicity, we let Therefore, we implemented recursion (19) and the FFT based on Equation (20). The values of the parameters are: The comparison between the two methods is presented in Table 5, from where we note once again that a value of is sufficient to obtain good enough results by FFT (at least for these data). Concerning the computing times, the values were similar to the ones obtained in Example 2, see Table 2.
3. Conclusions
In this paper, we proposed a general multivariate collective model that allows for dependence between the r.v.s number of claims, and, moreover, between the different r.v.s claim sizes. Since the evaluation of the resulting compound distribution is not straightforward, we discussed two types of techniques to deal with it: The recursive method presented in Section 2.1 and the FFT algorithm described in Section 2.2. Unfortunately, even if the recursive method has the advantage of being exact, it has two main drawbacks compared with the FFT method: First, recursions are available only under some restrictive assumptions and second, they become very slow as the dimensionality of the model increases. On the other hand, the main drawback of the FFT method consists in its specific errors, especially the aliasing error. However, the FFT technique is so fast compared with the exact recursions that it is quite worthwhile to use it, especially when values from the tail of the compound distribution are needed (nevertheless, it is important to pay attention when choosing optimal values for the truncation points and for the tilting parameters). Another advantage of the FFT is that specific functions are already implemented in existing software, even for higher dimensions, with the possible disadvantage of memory limitations.
To conclude, we would recommend the following approach: If recursive formulas are available for the considered model, they should be used to evaluate the compound distribution until some reasonable (in computing time terms) upper limit is reached, and then the FFT method should be applied for a more extended domain; to validate the accuracy of the FFT results, they should be compared with the ones obtained by the recursive method.