Next Article in Journal
Normalized Augmented Inverse Probability Weighting with Neural Network Predictions
Previous Article in Journal
Entropy Method of Road Safety Management: Case Study of the Russian Federation
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution

by
Jacinto Martín
1,†,
María Isabel Parra
1,†,
Mario Martínez Pizarro
2,† and
Eva López Sanjuán
2,*,†
1
Departamento de Matemáticas, Facultad de Ciencias, Universidad de Extremadura, 06006 Badajoz, Spain
2
Departamento de Matemáticas, Centro Universitario de Mérida, Universidad de Extremadura, 06800 Mérida, Spain
*
Author to whom correspondence should be addressed.
The authors contributed equally to this work.
Entropy 2022, 24(2), 178; https://doi.org/10.3390/e24020178
Submission received: 17 November 2021 / Revised: 17 January 2022 / Accepted: 21 January 2022 / Published: 25 January 2022

Abstract

:
In the parameter estimation of limit extreme value distributions, most employed methods only use some of the available data. Using the peaks-over-threshold method for Generalized Pareto Distribution (GPD), only the observations above a certain threshold are considered; therefore, a big amount of information is wasted. The aim of this work is to make the most of the information provided by the observations in order to improve the accuracy of Bayesian parameter estimation. We present two new Bayesian methods to estimate the parameters of the GPD, taking into account the whole data set from the baseline distribution and the existing relations between the baseline and the limit GPD parameters in order to define highly informative priors. We make a comparison between the Bayesian Metropolis–Hastings algorithm with data over the threshold and the new methods when the baseline distribution is a stable distribution, whose properties assure we can reduce the problem to study standard distributions and also allow us to propose new estimators for the parameters of the tail distribution. Specifically, three cases of stable distributions were considered: Normal, Lévy and Cauchy distributions, as main examples of the different behaviors of the tails of a distribution. Nevertheless, the methods would be applicable to many other baseline distributions through finding relations between baseline and GPD parameters via studies of simulations. To illustrate this situation, we study the application of the methods with real data of air pollution in Badajoz (Spain), whose baseline distribution fits a Gamma, and show that the baseline methods improve estimates compared to the Bayesian Metropolis–Hastings algorithm.

1. Introduction

Extreme value theory (EVT) is a set of statistical tools employed for modeling and predicting the occurrence of rare events outside the range of available data. It has been widely used to study events that are more extreme than any previously observed, e.g., in disciplines such as climatology: extreme events of temperature [1,2,3,4], precipitations [5,6,7,8,9,10], and solar climatology [11,12,13]; finance and insurance: applications to risk theory [14,15,16,17,18,19,20]; and engineering: design for modern buildings [21].
There are two approaches for modeling an extreme value problem. The first one is the block maxima method: dividing the sample space into blocks of equal size, the maxima values of each block follow, under a certain domain of attraction conditions, approximately a Generalized Extreme Value (GEV) distribution [22]. The second way to deal with an extreme value data set attempts to make use of the available information about the upper tail of the distribution than just the block maxima. The so-called peaks-over-threshold (POT) method is due to hydrologists trying to model floods. Loosely speaking, References [23,24] showed that when we consider the distribution of data above a certain threshold u, it can usually be approximated by a properly scaled Generalized Pareto distribution (GPD) as u tends to the endpoint of the distribution. This is the point of view considered in this article.
Due to its importance, several methods have been proposed for estimating the shape and scale parameters of the GPD. Classical methods include the method of moments, probability weighted moments, maximum likelihood and others. An exhaustive review of them can be consulted in [25,26]. Other researchers have proposed generalizations of the GPD: Reference [27] proposed a three-parameter Pareto distribution and employed POT to make estimations of Value at Risk and [28] introduced an extension of GPD and performed parametric estimation. However, classical methods might be inappropriate in certain situations, as explained in [26]. That is why Bayesian inference could be advisable. There are not many approaches to the GPD parameter estimation through Bayesian techniques. We can cite [29], who recommended the use of conjugate prior distributions; Reference [30] estimated the shape parameter when it is positive, and the computation of the posterior distribution was implemented via the Markov Chain Monte Carlo (MCMC) method with Gibbs sampling; Reference [31] employed Gamma and Pareto priors via MCMC; Reference [32] proposed a Bayesian mixture model, where the threshold was also a random parameter; Reference [33] employed Jeffrey’s prior and Metropolis–Hastings (MH) method and [34] employed the GPD distribution itself as the prior density.
In this paper, we aim at seizing all the available information coming from data in order to estimate the parameters of the GPD in a way as accurate as possible. A similar idea was also implemented in [35], for the estimation of the parameters of the Gumbel distribution when the baseline distribution is Gumbel, Exponential or Normal.
We will take into account all the data of the baseline distribution and study the relation between the baseline parameters and the parameters of the limit GPD in order to incorporate such relation into the sketch of new methods to make estimations. Concretely, we propose two methods and compare them with the classical MH method for data over the threshold. In addition, we will analyze four particular examples of underlying distribution: Exponential, Lévy, Cauchy and Normal, and make a special study of stable distributions.

2. Generalized Pareto Distribution and Its Relation with Extreme Value Theory

Let X be a random variable with distribution function F. Define u as the threshold value and let X u = X u | X > u be the random variable with distribution function
F X u ( x ) = P [ X u x ] = P [ X x + u | X > u ] = F ( x + u ) F ( u ) 1 F ( u ) ,
for 0 x x F u , being x F the right endpoint of F, that is, x F : = sup { x : F ( x ) < 1 } .
Notice that X u is the random variable that we obtain when we consider the distribution of data above the threshold, which we usually call the tail distribution.
Given a random variable X, we say that it follows a Generalized Pareto Distribution (GPD) when its distribution function is
G ( x | ξ , σ ) = 1 1 + ξ x σ 1 / ξ , ξ 0 1 exp x σ , ξ = 0
where σ > 0 and ξ R are the scale and shape parameters, respectively. Equation (1) is valid when x 0 for ξ 0 , and for 0 x σ / ξ for ξ < 0 .
The fundamental result that connects EVT and GPD distribution belongs to [23,24], and it establishes that under certain mild conditions, for a random variable X, the distribution of X u for a sufficiently high threshold u follows a properly scaled Generalized Pareto distribution (GPD).
We will call the distribution function of X, F, the baseline distribution of the GPD. The parameters ξ and σ will depend on the value of the threshold u, and on the baseline distribution. For example, ξ is determined by the shape of the upper tail of the baseline distribution F. Positive values of the shape parameter correspond to heavy tails, while negative ones come from light tails. The special case ξ = 0 will appear when the upper tail of the distribution tends to an exponential distribution of parameter 1 / σ .
In this work, we will consider several types of baseline distributions for the GPD. Traditionally, estimation for the parameters of the limit GPD in Extreme Value Theory has been made taking into account only values above the threshold, but this information might be scarce. We propose a new strategy consisting in seizing all the data from the baseline distribution. We will show how this strategy can produce accurate estimations for the parameters of the GPD.
In the case when the baseline distribution is an exponential distribution with parameter λ , with distribution function F ( x ) = 1 e λ x , x 0 , for every u 0 ,
F X u ( x ) = 1 e λ ( x + u ) ( 1 e λ u ) e λ u = 1 e λ x = F ( x ) , x 0 .
Consequently, the tail distribution X u is the same as the underlying distribution in the exponential case. Therefore, we must employ all the available data to estimate the parameter λ = 1 / σ in the definition of the GPD (1).
The case when ξ 0 is different. In this paper, we will consider some of the most employed distributions as underlying distributions: normal distribution, which has light tails ( ξ < 0 ); and the Cauchy and Lévy distributions, which lead to heavy tails ( ξ > 0 ). Those distributions are stable; therefore, they have additional properties that will be helpful to estimate the parameters of the GPD. We will study such properties below.
With respect to the threshold, it can be settled as a known value, which has a physical meaning, depending on the characteristics of the data, or it can be defined as an upper order statistic. It is generally defined as a p-quantile of the underlying distribution q p , for appropriate values of p.

3. Stable Distributions

Stable distributions are a rich class of probability distributions characterized by [36] and they have been proposed as a model for many types of physical and economic systems because of their interesting properties. They are the attractors of sums of independent, identically distributed distributions whether or not the mean or variance is finite. Good references to study them are [16] or [37].
Let Z be a random variable with parameters defined by its characteristic function:
E [ e i t Z ] = exp | t | α 1 i β tan π α 2 s i g n ( t ) , if   α 1 exp | t | 1 + i β 2 π s i g n ( t ) log | t | , if   α = 1
where the parameter α ( 0 , 2 ] is called the index of stability, and β [ 1 , 1 ] is the skewness parameter. When β = 0 , the distribution is symmetric.
A random variable X is said to follow a stable distribution with parameters a > 0 and b R if it satisfies that
X = a Z + b .
Generally, densities can be expressed only by complicated special functions, but there are three special cases of stable distributions that have probability density functions, which can be expressed analytically:
  • When α = 1 / 2 and β = 1 , we obtain Lévy distributions, Z L ( 0 , 1 ) . If X L ( γ , δ ) , then a = δ and b = γ in (3).
  • When α = 1 and β = 0 , we obtain the family of Cauchy distributions, Z C ( 0 , 1 ) . If X C ( γ , δ ) , then a = δ and b = γ in (3).
  • When α = 2 and β = 0 , we obtain the normal distribution, N ( 0 , 2 ) . If X N ( μ , σ ) , then a = σ and b = μ in (3). As usual, in this case we will denote Z N ( 0 , 1 ) .
In particular, as we can standardize the stable distributions of a family, the p-quantiles q p for a stable distribution X, can be expressed in terms of the p-quantiles z p of the standard distribution Z, as
q p = a z p + b .
Let us assume that X follows a stable distribution with parameters a and b for Equation (3), and fix u = q p as the threshold for the problem of extreme value. Then, for u large enough, z p from (4) is also large, and, consequently, denoting u Z = z p , Theorem 1 guarantees that Z u Z G P D ( ξ Z , σ Z ) . Therefore, as
X u a = Z u Z
and
F X u ( x ) = P ( X u x ) = P ( X u / a x / a ) G x a ξ Z , σ Z = 1 1 + ξ Z σ Z x a 1 / ξ Z ,
then
X u G P D ( ξ Z , a σ Z ) .
The parameter ξ Z will remain constant for all the random variables of the same stable family, whatever the parameters of the baseline variable are, while the scale parameter is obtained through the product of the standardization parameter a and the scale parameter σ Z for the GPD limiting distribution of Z u Z
ξ = ξ Z , σ = a σ Z .
In the case when the underlying distribution is a Cauchy or a Lévy, or any stable distribution X with the index 0 < α < 2 , the tail of the distribution is considered to be “heavy", therefore it leads to a GPD where ξ > 0 .
Stable distributions with index of stability α 2 , (all of them except normal distribution) also verify an interesting property. As we can see in [36], given the standard distribution Z, its survival function F ¯ can be approximated by:
F ¯ ( x ) ( 1 + β ) C α x α , x
where C α = 1 π Γ ( α ) sin α π 2 .
From this approximation, we can infer that the shape of the tail of the distribution will only depend on the index of stability α . Therefore, if we consider the GPD that models Z u Z , the shape parameter ξ Z will be fixed. We will estimate it through simulation.
Proposition 1.
When the baseline distribution is a standard stable distribution Z with α < 2 , the relation between the parameters of Z and the parameters of the GPD that models the distribution above the p-quantile of Z, u Z , is:
ξ ^ Z = 1 α , σ ^ Z = 1 α C α ( 1 + β ) 1 p 1 / α
Proof. 
From Theorem 1, for u Z that is big enough,
F ¯ u Z ( x ) 1 + ξ Z σ Z x 1 / ξ Z
and by Proposition 1 (9), also for big values of u Z
F ¯ u Z ( x ) ( 1 + β ) C α x α F ¯ Z ( u Z ) = ( 1 + β ) C α x α 1 p .
Then, making equal both expressions,
1 + ξ Z σ Z x 1 / ξ Z = ( 1 + β ) C α 1 p 1 / α x α
Therefore, we can take ξ ^ Z = 1 / α as an estimator for ξ Z (notice that the shape of the tail of the distribution depends only on the value of α ).    □
Substituting ξ Z by 1 / α , we have
1 + 1 α σ Z x = ( 1 + β ) C α 1 p 1 / α x ,
so
1 = ( 1 + β ) C α 1 p 1 / α 1 x α σ Z ( 1 + β ) C α 1 p 1 / α α σ Z
as 1 / x can be negligible. Therefore, we define
σ ^ Z = 1 α C α ( 1 + β ) 1 p 1 / α .
In Section 5, we will assure the accuracy of these estimators through an extensive simulation study.

4. Metropolis–Hastings (MH) Method

In this section, we will explain how to apply the Markov chain Monte Carlo (MCMC) method through the Metropolis–Hastings (MH) algorithm to make the estimations for stable distributions. We have to distinguish between light tails ( ξ < 0 ) and heavy tails ( ξ > 0 ). Let us assume X u G P D ( ξ , σ ) and that we dispose of m values.
Let x = ( x 1 , , x n ) be a sample of n values from X and x u = ( x u 1 , , x u m ) be a sample of m values from X u .

4.1. Light Tails ξ < 0

Take k = ξ , and δ = σ ξ , so X u G P D ( k , k δ ) , with the likelihood function
L ( k , δ | x u ) = k m δ m i = 1 m 1 x u i δ 1 + 1 / k
Considering Γ ( a 0 , b 0 ) and Γ ( a 1 , b 1 ) as prior distributions for both parameters. Then, the MH algorithm is applied:
  • Draw a starting sample ( k ( 0 ) , δ ( 0 ) )
  • For j = 0 , 1 ,
    • Sample candidates k * , δ * from the proposal distributions
      k * N ( k ( j ) , ν k ) , δ * N ( δ ( j ) , ν δ )
    • Calculate the ratios
      r k = π ( k * | δ ( j ) , x u ) π ( k ( j ) | δ ( j ) , x u ) , r δ = π ( δ * | k ( j ) , x u ) π ( δ ( j ) | k ( j ) , x u )
    • Set
      k ( j + 1 ) = k * , with probability min { 1 , r k } k ( j ) , otherwise
      δ ( j + 1 ) = δ * , with probability min { 1 , r δ } δ ( j ) , otherwise
  • Iterate the former procedure.
Notice that
r k = k * k ( j ) a 0 m 1 exp b 0 k * k ( j ) + 1 k * 1 k ( j ) i = 1 m ln 1 x u i δ ( j ) r δ = δ * δ ( j ) a 1 m 1 exp b 1 δ * δ ( j ) + 1 k ( j ) 1 i = 1 m ln 1 x u i δ * 1 k ( j ) 1 i = 1 m ln 1 x u i δ ( j )
Finally, we obtain estimations for ξ and σ from ξ = k and σ = k δ .

4.2. Heavy Tails ξ > 0

In this case, the likelihood function is
L ( ξ , σ | x u ) = σ m i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ )
Taking a type I Pareto ( a 0 , b 0 ) as prior distribution for ξ and Inv Γ ( a 1 , b 1 ) for σ ,
π ( ξ ) ξ ( a 0 + 1 ) , with ξ > b 0 π ( σ ) σ ( a 1 + 1 ) exp b 1 σ
Posterior conditional distributions are
π ( ξ | σ , x u ) ξ ( a 0 + 1 ) i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ ) π ( σ | ξ , x u ) σ ( m + a 1 + 1 ) exp b 1 σ i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ )
Then, MH algorithm is applied, as in the previous case.
Notice that
r ξ = ξ ( j ) ξ * a 0 + 1 exp 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ ( j ) 1 + 1 ξ * i = 1 m ln 1 + ξ * x u i σ ( j ) r σ = σ ( j ) σ * m + a 1 + 1 exp b 1 1 σ ( j ) 1 σ * + 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ ( j ) 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ *

5. Baseline MH Method (BMH)

In this section, we will introduce Baseline Metropolis–Hastings (BMH) method, designed according to the objectives of seizing all the available data from the baseline distribution and making use of the existing relations between the baseline parameters and the limit GPD parameters. The method consists of:
  • Applying the MH algorithm to estimate the parameters for the baseline distribution θ .
  • Making use of the relations between the baseline parameters θ and the GPD parameters ξ and σ to compute estimations for ξ and σ .
In the case of stable distributions, these relations have been explained in previous sections and are given by (8). We will detail below the application of the BMH method for the three selected stable distributions.
For the rest of the baseline distributions, the strategy to find such relations would be made thorough studies of simulation, in order to establish correspondences between the baseline parameters and the tail GPD parameters. At the moment, there are no studies in the literature about this subject; therefore, it would be interesting to perform them in future research.
Then, in the case of stable distributions, the application of BMH would be:
  • Apply the MH algorithm to estimate scale parameter a from the stable baseline distribution.
  • Make use of the relation (8) to compute estimations for ξ and σ , using estimators for ξ Z and σ Z that we detail below.

5.1. Estimations for ξ Z and σ Z

In order to provide good estimations for ξ Z and σ Z , we made a thorough simulation study for the three baseline distributions we have considered: Lévy, Cauchy, and Normal distribution. We took values for p [ 0.990 , 0.995 ] with increments of 0.001 , and set the threshold u Z = q p . For each distribution, and for each value of the threshold, m = 1000 values from Z u Z were generated. This sequence was repeated 100 times. Therefore, we obtained 100 point estimations for each p.
To guarantee the convergence of the MCMC algorithm, we must be sure that the posterior distribution has been reached. These proceedings were made using library coda [38] for R software, taking 10,000 values for the burn-in period, 25 values for the thinning, and selecting initial values for each sample. Finally, to obtain the posterior distribution for each parameter, a Markov chain of length 10,000 was obtained and we considered the mean as the estimator. The results of the simulation study are shown in Figure 1.

5.1.1. Lévy and Cauchy Baseline Distribution

By Proposition 1 (10), we had estimations for ξ Z and σ Z . Concretely, these are
ξ ^ Z = 2 , σ ^ Z = 4 π 1 p 2 ,
for the Lévy distribution and,
ξ ^ Z = 1 , σ ^ Z = 1 π 1 p 1
for the Cauchy distribution.

5.1.2. Normal Baseline Distribution

In the case of the normal distribution, property (10) is not verified. In this case, the adjustment from the simulation study is:
ξ ^ Z = 0.7 + 0.61 p , σ ^ Z = 0.34 + 3.18 ( 1 p ) 12.4 ( 1 p ) 2 .
Now, we will estimate the scale parameter a of the baseline distribution X. Notice that parameter b for the stable distribution X defined in (3) does not have any influence on the estimation of the parameters of the GPD, as we have shown before. Consequently, we will assume b = 0 from now on.

5.2. Lévy Distribution

Likelihood function for Lévy distribution is:
L ( a | x ) a n / 2 exp a 2 i = 1 n 1 x i
Taking a prior distribution Γ ( a 0 , b 0 ) for a, and making use of (8) and (11), we obtain estimations ξ ^ and σ ^ .

5.3. Cauchy Distribution

Likelihood function for Cauchy is:
L ( a | x ) a n i = 1 n 1 + x i a 2 1
Taking Γ ( a 0 , b 0 ) as prior distribution and, making use of (8) and (12), we obtain estimations ξ ^ and σ ^ .

5.4. Normal Distribution

In this case, we consider Inv Γ ( a 0 , b 0 ) as prior distribution for a 2 , and making use of (8) and (13), we obtain estimations ξ ^ and σ ^ .

6. Informative Priors Baseline MH (IPBMH)

Finally, we propose an MH method to estimate the parameters ξ and σ , where highly informative priors are employed, seizing the estimations computed before. Employing estimations of ξ Z , σ Z and a obtained in previous sections, we can settle priors for ξ and σ , which are very informative. Notice that this way of proceeding keeps the original idea of Extreme Value Theory, granting more weight to tail values because they are employed twice: to compute estimations for ξ Z and σ Z and also through the likelihood function. As we commented before, this method could also be implemented for other baseline distributions once we have found relations between baseline and GPD parameters.
For stable distributions, highly informative priors are
ξ N ( ξ Z , b 1 ) , σ N ( a · σ Z , b 2 )
Then, for the three distributions studied, a is estimated through BMH and ξ Z , σ Z are estimations computed through (11)–(13). In addition,
  • b 1 is constant, being 0.03, 0.065 and 0.1 for Normal, Cauchy and Lévy baseline distributions, respectively.
  • b 2 = exp { c 1 p 2 + c 2 p + c 3 } , where values are given in Table 1.
The Joint posterior distribution is
π ( ξ , σ | x u ) σ m exp 1 2 b 1 2 ξ ξ Z 2 1 2 b 2 2 σ a · σ Z 2 i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ )
and marginal distributions are
π ( ξ | σ , x u ) exp 1 2 b 1 2 ξ ξ Z 2 i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ ) π ( σ | ξ , x u ) σ m exp 1 2 b 2 2 σ a · σ Z 2 i = 1 m 1 + ξ x u i σ ( 1 + 1 / ξ )
Then, we apply the MH algorithm with
r ξ = exp 1 2 b 1 2 ξ ( j ) ξ Z 2 ξ * ξ Z 2 1 + 1 ξ * i = 1 m ln 1 + ξ * x u i σ ( j ) + 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ ( j )
r σ = σ ( j ) σ * m exp 1 2 b 2 2 σ ( j ) a · σ Z 2 σ * a · σ Z 2 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ * + 1 + 1 ξ ( j ) i = 1 m ln 1 + ξ ( j ) x u i σ ( j )

7. Simulation Study

Now, we will develop a thorough simulation study in order to compare the accuracy of the three MH methods of estimation: MH, BMH and IPBMH.
We fixed p = 0.9 and the threshold u = q p . Furthermore, we take b = 0 . We sample n = 2 i , with i = 5 , , 10 values from the three baseline distributions considered, with scale a = 2 j , j = 2 , 1 , 0 , 1 , 2 . We obtained an MCMC with length 10,000, taking 10,000 values for the burn-in period, 25 values for the thinning and selecting initial values for each sample. Finally, this sequence was repeated 100 times and we considered the mean as the estimator.
In Figure 2, we can see the posterior distribution of the parameters ξ Z and σ Z for the different sample sizes n, when the baseline distribution is L ( 0 , 1 ) (left), C ( 0 , 1 ) (middle) and N ( 0 , 1 ) (right), for the methods MH and BMH. For the first one, the distribution is right skewed, although skewness becomes smaller as n increases. BMH offers a point estimation, plotted as a vertical red line.
In Figure 3, we can see the posterior distribution for σ Z , for both methods. MH (upper charts) shows much more skewness, such as in the case of ξ Z , while estimations from BMH (lower charts) are less skewed and dispersed.
Now, we will compare mean absolute errors (MAE) for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 . In Figure 4, Figure 5 and Figure 6, we can see how BMH provides smaller errors than MH.
Clearly, BMH provides more accurate estimations for the parameters of the GPD when the baseline distribution is stable and known. However, in practical situations, we might not know which is the baseline distribution, or data could better fit a mixture of distributions rather than a simple one. In these situations, the use of highly informative priors, built with the information available from all the data, could be advisable. To develop this idea, we simulated from different mixtures and computed values of MAE for the three methods, finding that IPBMH is the method that shows the best behavior when data differ from the simple distributions.
In Figure 7, we can see MAE for the three methods, in the case of the mixtures employed ( α = 0.5 ): α F ( 0 , 1 / 2 ) + ( 1 α ) F ( 0 , 2 ) (left charts), α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 1 ) (middle charts) and α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 2 ) (right charts), for the three baseline distributions considered (Lévy, Cauchy, Normal). In general, IPBMH is the most advisable method, especially for the case when data are scarce. Notice that when data approaches one of the pure stable distributions, for example, in the case of Cauchy mixtures (which are still quite similar to a simple Cauchy) and the second mixture for the Normal distribution, BMH and IPBMH show very similar results. However, when the mixtures differ significantly from the simple distribution, the method IPBMH and MH are more advisable than BMH.

8. An Application: PM 2.5 in Badajoz (Spain) during the Period 2011–2020

As we mentioned before, both baseline methods can be generalized for other baseline distributions by studying the relations between the parameters of the baseline distributions and the parameters of the limit GPD. We show an example, employing real data whose baseline distribution can be fitted by a Gamma distribution.
Data from measurements of the levels for diverse air pollutants in many municipalities in Spain are publicly available on the website https://www.miteco.gob.es/es/calidad-y-evaluacion-ambiental/temas/atmosfera-y-calidad-del-aire/, accessed on 15 January 2022. Particulate matter is a mixture of solid particles and liquid droplets that can be inhaled and cause serious health problems. In particular, we have selected particulate matter less than 2.5 micrometers in diameter, known as PM 2.5, because it is considered to be especially dangerous to human health. In this context, studying the tail distribution above a certain threshold is essential because the World Health Organization recommends not to exceed an average daily value of 25 μg/m3 and not to exceed an average annual value of 10 μg/m3. We studied levels of PM 2.5 measured in μg/m3 for the last ten years available, 2011–2020, from Badajoz. There are n = 1066 observations, and, as can be seen in Figure 8, data can be fitted by a Gamma distribution. As in the previous simulations, p = 0.90 and the threshold u = q p , resulting u = 15 μg/m3.
Making a simulation study, similar to the previous ones made for Normal, Lévy and Cauchy, we obtained the following relation between the parameters of the baseline distribution Γ ( α , β ) and the parameters of limit GPD ( ξ , σ ) :
ξ ^ = 0 , σ ^ = 1 β ( 1 + 0.22 log 2 α )
Then, we randomly selected 50 data and applied the three methods MHM, BDM and IPBDM to fit the tail data. This proceeding was repeated many times, and we found three possible behaviors, as shown in Figure 9. In the first case (left chart), there is an example of the most usual situation: IPBDM offers intermediate estimations, between MHM and BDM. When there are scarce tail data (middle chart), MH differs significantly from the real density, while BDM and IPBDM provide better estimations. Finally, in the right chart, a common situation is shown in which IPBDM clearly offers the best estimations.

9. Conclusions

  • In the parameter estimation of GPD, usual EVT methods waste a lot of information. We have proposed two MH methods that make the most of all the information available from the data set. They are based on making use of the existing relations between baseline and GPD parameters, through informative priors.
  • When considering the GPD coming from stable baseline distributions, we employed singular properties of stable distributions to simplify the problem (reducing to standard cases) and to provide estimators for the parameters of the GPD.
  • We have achieved very accurate estimations for the parameters of the GPD when the baseline distribution is Cauchy, Normal or Lévy, making use of the properties of stable distributions and MH methods.
  • We have studied the goodness of the estimations for classical MH method and BMH when the baseline distribution is standard Cauchy, Normal or Lévy. Clearly, BMH provides more accurate estimations than MH.
  • In most real situations, data do not fit a simple distribution. We simulated some examples of mixtures of stable distributions and showed that IPBMH provides more accurate estimations than the other methods.
  • These proposals could be generalized for other baseline distributions by studying the relations between the parameters of the baseline distributions and the parameters of the limit GPD. We provide an application with real data to illustrate this situation.

Author Contributions

Investigation, J.M., M.I.P., M.M.P. and E.L.S.; Methodology, J.M., M.I.P., M.M.P. and E.L.S.; Writing—review & editing, J.M., M.I.P., M.M.P. and E.L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Junta de Extremadura, Consejería de Economía, Ciencia y Agenda Digital GR21057, partially supported by Fondo Europeo de Desarrollo Regional (FEDER).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Nogaj, M.; Yiou, P.; Parey, S.; Malek, F.; Naveau, P. Amplitude and frequency of temperature extremes over the North Atlantic region. Geophys. Res. Lett. 2006, 33. [Google Scholar] [CrossRef]
  2. Coelho, C.A.S.; Ferro, C.A.T.; Stephenson, D.B.; Steinskog, D.J. Methods for Exploring Spatial and Temporal Variability of Extreme Events in Climate Data. J. Clim. 2008, 21, 2072–2092. [Google Scholar] [CrossRef] [Green Version]
  3. Acero, F.J.; Fernández-Fernández, M.I.; Carrasco, V.M.S.; Parey, S.; Hoang, T.T.H.; Dacunha-Castelle, D.; García, J.A. Changes in heat wave characteristics over Extremadura (SW Spain). Theor. Appl. Climatol. 2017, 133, 605–617. [Google Scholar] [CrossRef] [Green Version]
  4. García, J.A.; Pizarro, M.M.; Acero, F.J.; Parra, M.I. A Bayesian Hierarchical Spatial Copula Model: An Application to Extreme Temperatures in Extremadura (Spain). Atmosphere 2021, 12, 897. [Google Scholar] [CrossRef]
  5. García, J.; Gallego, M.C.; Serrano, A.; Vaquero, J. Trends in Block-Seasonal Extreme Rainfall over the Iberian Peninsula in the Second Half of the Twentieth Century. J. Clim. 2007, 20, 113–130. [Google Scholar] [CrossRef]
  6. Re, M.; Barros, V.R. Extreme rainfalls in SE South America. Clim. Chang. 2009, 96, 119–136. [Google Scholar] [CrossRef]
  7. Acero, F.J.; Gallego, M.C.; García, J.A. Multi-day rainfall trends over the Iberian Peninsula. Theor. Appl. Climatol. 2011, 108, 411–423. [Google Scholar] [CrossRef]
  8. Acero, F.J.; García, J.A.; Gallego, M.C. Peaks-over-Threshold Study of Trends in Extreme Rainfall over the Iberian Peninsula. J. Clim. 2011, 24, 1089–1105. [Google Scholar] [CrossRef]
  9. Acero, F.J.; Parey, S.; Hoang, T.T.H.; Dacunha-Castelle, D.; García, J.A.; Gallego, M.C. Non-stationary future return levels for extreme rainfall over Extremadura (southwestern Iberian Peninsula). Hydrol. Sci. J. 2017, 62, 1394–1411. [Google Scholar] [CrossRef] [Green Version]
  10. Wi, S.; Valdés, J.B.; Steinschneider, S.; Kim, T.W. Non-stationary frequency analysis of extreme precipitation in South Korea using peaks-over-threshold and annual maxima. Stoch. Environ. Res. Risk Assess. 2015, 30, 583–606. [Google Scholar] [CrossRef]
  11. Ramos, A.A. Extreme value theory and the solar cycle. Astron. Astrophys. 2007, 472, 293–298. [Google Scholar] [CrossRef] [Green Version]
  12. Acero, F.J.; Carrasco, V.M.S.; Gallego, M.C.; García, J.A.; Vaquero, J.M. Extreme Value Theory and the New Sunspot Number Series. Astrophys. J. 2017, 839, 98. [Google Scholar] [CrossRef]
  13. Acero, F.J.; Gallego, M.C.; García, J.A.; Usoskin, I.G.; Vaquero, J.M. Extreme Value Theory Applied to the Millennial Sunspot Number Series. Astrophys. J. 2018, 853, 80. [Google Scholar] [CrossRef] [Green Version]
  14. Longin, F.M. Value at Risk and Extreme Values. IFAC Proc. Vol. 1998, 31, 45–49. [Google Scholar] [CrossRef]
  15. Singh, A.K.; Allen, D.E.; Powell, R.J. Value at risk estimation using extreme value theory. In Proceedings of the 19th International Congress on Modelling and Simulation, Perth, Australia, 12–16 December 2011. [Google Scholar]
  16. Embrechts, P.; Klüppelberg, C.; Mikosch, T. Modelling Extremal Events for Insurance and Finance; Springer: Berlin, Germany; New York, NY, USA, 1997. [Google Scholar]
  17. Trzpiot, G.; Majewska, J. Estimation of Value at Risk: Extreme value and robust approaches. Oper. Res. Decis. 2010, 20, 131–143. [Google Scholar]
  18. Magnou, G. An Application of Extreme Value Theory for Measuring Financial Risk in the Uruguayan Pension Fund. Compend. Cuad. Econ. Adm. 2017, 4, 1–19. [Google Scholar]
  19. Gilli, M.; Këllezi, E. An application of extreme value theory for measuring financial risk. Comput. Econ. 2006, 27, 207–228. [Google Scholar] [CrossRef] [Green Version]
  20. Bali, T.G. A generalized extreme value approach to financial risk measurement. J. Money Credit Bank. 2007, 39, 1613–1649. [Google Scholar] [CrossRef]
  21. Castillo, E.; Hadi, A.S.; Balakrishnan, N.; Sarabia, J.M. Extreme Value and Related Models with Applications in Engineering and Science; Wiley-Interscience: Hoboken, NJ, USA, 2004. [Google Scholar]
  22. Gumbel, E.J. Statistics of Extremes (Dover Books on Mathematics); Dover Publications: Mineola, NY, USA, 2012. [Google Scholar]
  23. Balkema, A.A.; De Haan, L. Residual life time at great age. Ann. Probab. 1974, 2, 792–804. [Google Scholar] [CrossRef]
  24. Pickands, J. Statistical inference using extreme order statistics. Ann. Stat. 1975, 3, 119–131. [Google Scholar]
  25. De Zea Bermudez, P.; Kotz, S. Parameter estimation of the generalized Pareto distribution—Part I. J. Stat. Plan. Inference 2010, 140, 1353–1373. [Google Scholar] [CrossRef]
  26. De Zea Bermudez, P.; Kotz, S. Parameter estimation of the generalized Pareto distribution—Part II. J. Stat. Plan. Inference 2010, 140, 1374–1388. [Google Scholar] [CrossRef]
  27. Korkmaz, M.Ç.; Altun, E.; Yousof, H.M.; Afify, A.Z.; Nadarajah, S. The Burr X Pareto Distribution: Properties, Applications and VaR Estimation. J. Risk Financ. Manag. 2018, 11, 1. [Google Scholar] [CrossRef] [Green Version]
  28. Ihtisham, S.; Khalil, A.; Manzoor, S.; Khan, S.A.; Ali, A. Alpha-Power Pareto distribution: Its properties and applications. PLoS ONE 2019, 14, e0218027. [Google Scholar] [CrossRef]
  29. Arnold, B.C.; Press, S.J. Bayesian estimation and prediction for Pareto data. J. Am. Stat. Assoc. 1989, 84, 1079–1084. [Google Scholar] [CrossRef]
  30. Diebolt, J.; El-Aroui, M.A.; Garrido, M.; Girard, S. Quasi-conjugate Bayes estimates for GPD parameters and application to heavy tails modelling. Extremes 2005, 8, 57–78. [Google Scholar] [CrossRef] [Green Version]
  31. De Zea Bermudez, P.; Turkman, M.A. Bayesian approach to parameter estimation of the generalized Pareto distribution. Test 2003, 12, 259–277. [Google Scholar] [CrossRef]
  32. Behrens, C.N.; Lopes, H.F.; Gamerman, D. Bayesian analysis of extreme events with threshold estimation. Stat. Model. 2004, 4, 227–244. [Google Scholar] [CrossRef] [Green Version]
  33. Castellanos, M.E.; Cabras, S. A default Bayesian procedure for the generalized Pareto distribution. J. Stat. Plan. Inference 2007, 137, 473–483. [Google Scholar] [CrossRef]
  34. Zhang, J.; Stephens, M.A. A new and efficient estimation method for the generalized Pareto distribution. Technometrics 2009, 51, 316–325. [Google Scholar] [CrossRef]
  35. Martín, J.; Parra, M.I.; Pizarro, M.M.; Sanjuán, E.L. Baseline Methods for Bayesian Inference in Gumbel distribution. Entropy 2020, 22, 1267. [Google Scholar] [CrossRef]
  36. Lévy, P. Calcul des Probabilités; Gauthier Villars: Paris, France, 1925. [Google Scholar]
  37. Nolan, J.P. Stable Distributions—Models for Heavy Tailed Data; Birkhauser: Boston, MA, USA, 2018; Chapter 1; Available online: http://fs2.american.edu/jpnolan/www/stable/stable.html (accessed on 15 January 2022).
  38. Plummer, M.; Best, N.; Cowles, K.; Vines, K. CODA: Convergence Diagnosis and Output Analysis for MCMC. R News 2006, 6, 7–11. [Google Scholar]
Figure 1. Estimations for ξ Z (left) and σ Z (right) for Lévy (upper charts), Cauchy (middle charts), Normal (lower charts) and estimators from Equations (11)–(13) plotted in red for p [ 0.900 , 0.995 ] .
Figure 1. Estimations for ξ Z (left) and σ Z (right) for Lévy (upper charts), Cauchy (middle charts), Normal (lower charts) and estimators from Equations (11)–(13) plotted in red for p [ 0.900 , 0.995 ] .
Entropy 24 00178 g001
Figure 2. Posterior distribution of ξ Z for the different values of n, when the baseline distribution is L ( 0 , 1 ) (left), C ( 0 , 1 ) (middle) and N ( 0 , 1 ) (right), for the MH method. The estimation of BMH is plotted as a vertical red line.
Figure 2. Posterior distribution of ξ Z for the different values of n, when the baseline distribution is L ( 0 , 1 ) (left), C ( 0 , 1 ) (middle) and N ( 0 , 1 ) (right), for the MH method. The estimation of BMH is plotted as a vertical red line.
Entropy 24 00178 g002
Figure 3. Posterior distribution of σ Z for the different values of n, when the baseline distribution is L ( 0 , 1 ) (left), C ( 0 , 1 ) (middle) and N ( 0 , 1 ) (right), for the method MH (upper charts) and BMH (lower charts).
Figure 3. Posterior distribution of σ Z for the different values of n, when the baseline distribution is L ( 0 , 1 ) (left), C ( 0 , 1 ) (middle) and N ( 0 , 1 ) (right), for the method MH (upper charts) and BMH (lower charts).
Entropy 24 00178 g003
Figure 4. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution L ( 0 , 1 ) .
Figure 4. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution L ( 0 , 1 ) .
Entropy 24 00178 g004
Figure 5. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution C ( 0 , 1 ) .
Figure 5. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution C ( 0 , 1 ) .
Entropy 24 00178 g005
Figure 6. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution N ( 0 , 1 ) .
Figure 6. MAE for MH and BMH when a = 0.25 , 1 , 4 , and for sample sizes n = 2 5 , 2 7 , 2 9 , for the baseline distribution N ( 0 , 1 ) .
Entropy 24 00178 g006
Figure 7. MAE and 2.5%, 97.5% quantiles for the three methods, for α F ( 0 , 1 / 2 ) + ( 1 α ) F ( 0 , 2 ) (left charts), α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 1 ) (middle charts) and α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 2 ) (right charts), for Lévy (upper charts), Cauchy (medium charts), Normal (lower charts).
Figure 7. MAE and 2.5%, 97.5% quantiles for the three methods, for α F ( 0 , 1 / 2 ) + ( 1 α ) F ( 0 , 2 ) (left charts), α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 1 ) (middle charts) and α F ( 0 , 1 ) + ( 1 α ) F ( 1 , 2 ) (right charts), for Lévy (upper charts), Cauchy (medium charts), Normal (lower charts).
Entropy 24 00178 g007
Figure 8. Histogram of PM 2.5 (in μg/m3) in Badajoz for 2011–2020, threshold u = q p = 15 , p = 0.90 and density (black curve).
Figure 8. Histogram of PM 2.5 (in μg/m3) in Badajoz for 2011–2020, threshold u = q p = 15 , p = 0.90 and density (black curve).
Entropy 24 00178 g008
Figure 9. Tail distribution of data, density (black curve), estimations for MHM (red colored), BDM (green colored) and IPBDM (blue colored).
Figure 9. Tail distribution of data, density (black curve), estimations for MHM (red colored), BDM (green colored) and IPBDM (blue colored).
Entropy 24 00178 g009
Table 1. Values for c 1 , c 2 , c 3 for the three baseline distributions.
Table 1. Values for c 1 , c 2 , c 3 for the three baseline distributions.
Distribution c 1 c 2 c 3
Lévy500.2−900.9408.2
Cauchy323.57−588.51266.13
Normal−46.2483.55−41.58
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Martín, J.; Parra, M.I.; Pizarro, M.M.; Sanjuán, E.L. Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution. Entropy 2022, 24, 178. https://doi.org/10.3390/e24020178

AMA Style

Martín J, Parra MI, Pizarro MM, Sanjuán EL. Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution. Entropy. 2022; 24(2):178. https://doi.org/10.3390/e24020178

Chicago/Turabian Style

Martín, Jacinto, María Isabel Parra, Mario Martínez Pizarro, and Eva López Sanjuán. 2022. "Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution" Entropy 24, no. 2: 178. https://doi.org/10.3390/e24020178

APA Style

Martín, J., Parra, M. I., Pizarro, M. M., & Sanjuán, E. L. (2022). Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution. Entropy, 24(2), 178. https://doi.org/10.3390/e24020178

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop