Article

Properties and Applications of a New Family of Skew Distributions

by Emilio Gómez-Déniz 1, Barry C. Arnold 2, José M. Sarabia 3 and Héctor W. Gómez 4,*
1 Department of Quantitative Methods in Economics and TiDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canarias, Spain
2 Statistics Department, University of California Riverside, Riverside, CA 92504, USA
3 Department of Quantitative Methods, CUNEF University, 28040 Madrid, Spain
4 Departamento de Matemática, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(1), 87; https://doi.org/10.3390/math9010087
Submission received: 16 November 2020 / Revised: 26 December 2020 / Accepted: 29 December 2020 / Published: 3 January 2021
(This article belongs to the Special Issue Probability, Statistics and Their Applications)

Abstract

We introduce two families of continuous distribution functions with not-necessarily symmetric densities, which contain a parent distribution as a special case. The two families proposed depend on two parameters and are presented as an alternative to the skew normal distribution and other proposals in the statistical literature. The density functions of these new families are given by a closed expression which allows us to easily compute probabilities, moments and related quantities. The second family can exhibit bimodality and its standardized fourth central moment (kurtosis) can be lower than that of the Azzalini skew normal distribution. Since the second proposed family can be bimodal, we fit two well-known data sets with this feature as applications. We concentrate attention on the case in which the normal distribution is the parent distribution, but some consideration is given to other parent distributions, such as the logistic distribution.

1. Introduction

There are many situations in which empirical data show slight or marked asymmetry. This is frequently the case, for example, with actuarial and financial data which, in addition to this feature, have heavy tails reflecting the existence of extreme values. These features mean that the data cannot be adequately modeled by the Gaussian (or normal) distribution. Furthermore, bimodal distributions appear naturally in many different scenarios, for example in certain disease patterns and in certain cancer incidence curves. By studying the bimodality (and multimodality) of some cancer incidence curves, clinicians can improve their understanding of cancer, of its development process, and of the characteristics that identify a particular type of cancer and separate it from all other types. This occurs, for example, in cases where there are two peaks of occurrence by age. These cancers include Kaposi’s sarcoma and Hodgkin’s lymphoma. The latter has two peaks of occurrence: in young adults and in middle-aged adults. On the other hand, the skew normal distribution appears naturally in stochastic frontier analysis when a normal distribution is assumed to represent the noise or idiosyncratic component and a half-normal distribution to represent the inefficiency term, in the event that the researcher imposes inefficient behavior on all firms in the sample of interest; see, for instance, [1]. Recently, [2] introduced (using a finite mixture model) the zero inefficiency stochastic frontier model, which can accommodate the presence of both efficient and inefficient firms in the sample and thereby gives rise to various bimodal scenarios. Therefore, it seems plausible to try to obtain families of distributions that add skewness to the normal distribution but that at the same time are more versatile, in the sense of being able to adapt to the bimodal scenarios that appear in different situations.
Although there are various mechanisms to obtain skewed distributions from an initial distribution that is not skewed (two well-known procedures that generalize an initial probability distribution, symmetric or not, are those provided in [3,4], among others), our attention here will focus on the mechanism introduced by [5], which enjoys undoubted popularity and has been the subject of numerous works. Let $g$ and $G$ be, respectively, the probability density function (pdf) and the cumulative distribution function (cdf) of a symmetric distribution. A random variate $Z$ is said to have a skew distribution if its pdf is given by
$$f_Z(z) = 2\, g(z)\, G(\lambda z), \qquad -\infty < z < \infty, \quad \lambda \in \mathbb{R}. \tag{1}$$
This family of distributions has been widely studied as an extension of the normal distribution by means of a shape parameter, $\lambda$, which accounts for the skewness. In this case $g$ and $G$ are replaced in Equation (1) by the pdf and cdf of the standard normal distribution and the resulting distribution is called the skew normal distribution. It should be pointed out that the function $g$ does not have to be precisely the derivative of the cdf $G$ to ensure that the pdf given in Equation (1) is a genuine pdf, although this case has not been studied in depth in the statistical literature. Following the notation provided in Reference [6], we denote the family of distributions given by $g(z) = 2\phi(z)\Phi(\lambda z)$, where $\phi$ and $\Phi$ are the pdf and cdf of the standard normal distribution, respectively, by $\mathcal{SN} = \{SN(\lambda) : \lambda \in \mathbb{R}\}$. Furthermore, when a random variable follows a skew normal distribution with location parameter $-\infty < \mu < \infty$ and scale parameter $\sigma > 0$ we will write $SN(\lambda, \mu, \sigma)$.
In this article, a new generalization of the family of skew distributions given in Equation (1) is proposed, which includes the skew family of distributions of Azzalini, that is, expression (1), as a particular case. The methodology used is based on the combination of Azzalini’s proposal and a result provided by [7], which led us to add a new parameter to the family (1). Later, from this new family a second family, very similar to the first, is introduced. This new family of distributions can exhibit bimodality and its standardized fourth central moment (kurtosis) can be lower than the kurtosis of the Azzalini skew normal distribution (and can be positive or negative).
In recent decades, starting from Azzalini’s proposal, several generalizations and extensions of the skew-normal distribution have been introduced (see for example [8]). For multivariate extensions, see References [9,10,11], among others. The methods applied in the present paper can be considered as extensions and alternatives to the well-known skew-normal distribution (see [5,12]), whose properties (see [12,13]) and corresponding estimation [14] have been widely discussed. Other ways of obtaining skewed normal distributions have also been introduced, such as the one proposed by Reference [15], the Balakrishnan skew-normal density in Reference [16], the proposed model of Reference [17] and the generalized normal distribution in References [18,19,20], among others. For an exhaustive and comprehensive study of the skew-normal distribution, see the recent book by Reference [21].
The class of probability models proposed in the present paper can also be considered as alternatives or as approximations to the usual collective risk models in actuarial settings (see [22,23] among others). Data sets in these settings are typically skewed and the generalized models of the present paper are expected to provide better fits than the standard models. In collective risk settings the right tail of the distribution is of considerable interest since the likelihood of large claims is of concern. In addition, the total claim distribution is of interest. Normal approximations are frequently resorted to when dealing with these variables. The use of more flexible generalized normal models can be expected to yield improvement.
The organization of this paper is as follows. The main result from which we constructed the two proposed families is shown in Section 2. Due to the importance of the normal distribution in numerous problems of applied statistics, we dedicate a complete section, Section 3, to the study of this distribution. For this purpose, the pdf, which appears in closed form, is shown for the two families. We also give expressions for the mean, variance and the third and fourth standardized cumulants, to compare with their equivalents for the classical (Azzalini) skew normal distribution. In Section 4, the parameter estimation problem is studied. In order to obtain numerical solutions to the maximum likelihood (ML) estimation problem, suitable software has to be used. Multivariate extensions are described briefly in Section 5. Some examples and applications are described in Section 6. Finally, some conclusions are drawn and promising fields for further research are proposed in the last Section.

2. Main Results

We recall (see [24]) that if $X$ and $Y$ are independent and identically distributed random variables with a finite fractional moment and if, for all real $\lambda$, $\Pr(X + \lambda Y > 0) = 1/2$, then they are symmetric. Also, the following Theorem, which appears in [7], is required for the main result which will appear later.
Theorem 1. 
(see [7]) Let $G$ be the cdf of an arbitrary distribution that is symmetric about zero. Then,
$$\int_{-a}^{a} G(z)\, dz = a, \qquad a \in \mathbb{R}^{+}. \tag{2}$$
Expression (2) in Reference [25] also establishes this assertion.
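As a short worked check of Theorem 1 (a sketch, assuming only that symmetry about zero means $G(-z) = 1 - G(z)$ for a continuous cdf), the integral can be folded onto $[0, a]$:

$$\int_{-a}^{a} G(z)\, dz = \int_{0}^{a} \big[ G(z) + G(-z) \big]\, dz = \int_{0}^{a} 1\, dz = a .$$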

2.1. The First Family of Skew Distributions

The following result presents the key contribution of this work: given a family of symmetric distributions, it proposes a more general, not necessarily symmetric family that includes the original family as a particular case.
Theorem 2. 
Let $X$ and $Y$ be two random variables with symmetric cdf’s $G_X(x)$ and $G_Y(y)$ and pdf’s $g_X(x)$ and $g_Y(y)$, respectively. Then,
$$f_Y(y; \lambda, \alpha) = \frac{g_Y(y)}{\alpha} \int_{-\alpha}^{\alpha} G_X(z + \lambda y)\, dz \tag{3}$$
represents a genuine pdf for $\alpha \in \mathbb{R} - \{0\}$ and $\lambda \in \mathbb{R}$.
Proof. 
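Without loss of generality, assume that $X$ and $Y$ are independent random variables. Taking into account the fact that $X - \lambda Y$ is symmetric and using the result provided in Theorem 1, we get
$$\begin{aligned}
\alpha &= \int_{-\alpha}^{\alpha} \Pr(X - \lambda Y < z)\, dz = \int_{-\infty}^{\infty} \int_{-\alpha}^{\alpha} \Pr(X < z + \lambda y \mid Y = y)\, dz\; g_Y(y)\, dy \\
&= \int_{-\infty}^{\infty} \int_{-\alpha}^{\alpha} \Pr(X < z + \lambda y)\, dz\; g_Y(y)\, dy = \int_{-\infty}^{\infty} g_Y(y) \int_{-\alpha}^{\alpha} G_X(z + \lambda y)\, dz\, dy .
\end{aligned}$$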
Hence the result. □
Expression (3) can instead be viewed in the following form, related to an infinite mixture construction. Let $G_X(\xi + \lambda y)$, $\lambda \in \mathbb{R}$, $\xi \in \mathbb{R}$ and $y \in \mathbb{R}$, be the cdf of a symmetric distribution with support on the real line. Suppose now that $\xi$ is random and follows a uniform distribution on the interval $[-\alpha, \alpha]$; then
$$H_X(y; \lambda, \alpha) = \frac{1}{2\alpha} \int_{-\alpha}^{\alpha} G(\xi + \lambda y)\, d\xi ,$$
is also a genuine cdf symmetric around zero. That is, $H_X(y) = 1 - H_X(-y)$. Now, Equation (3) is derived by taking into account the fact that $2 g_Y(y) H_X(y)$ is a genuine pdf. Another elegant way to see that Equation (3) defines a genuine pdf is to consider the argument given in Lemma 1 in [15]. That is, let $S(\lambda, \alpha) = \int_{-\infty}^{\infty} f_Y(y; \lambda, \alpha)\, dy$. Now, we have that $S(0, \alpha) = \int_{-\infty}^{\infty} g_Y(y)\, dy = 1$ and since
$$\frac{\partial}{\partial \lambda} S(\lambda, \alpha) = \int_{-\infty}^{\infty} y\, \frac{g_Y(y)}{\alpha} \big[ G_X(\alpha + \lambda y) - G_X(-\alpha + \lambda y) \big]\, dy = 0$$
we have that $S(\lambda, \alpha) = 1$.
The results in the following proposition are readily verified and consequently are stated without proof.
Proposition 1. 
For the density (3) the following results hold.
(i) $f_Y(0; \lambda, \alpha) = g_Y(0)$.
(ii) $f_Y(y; \lambda, \alpha) = f_Y(y; \lambda, -\alpha)$ for $\lambda \in \mathbb{R}$, $\alpha \in \mathbb{R} - \{0\}$.
(iii) $f_Y(y; 0, \alpha) = g_Y(y)$ for $\alpha \in \mathbb{R} - \{0\}$.
(iv) $\lim_{\alpha \to 0} f_Y(y; \lambda, \alpha) = 2 g_Y(y) G_X(\lambda y)$ for $\lambda \in \mathbb{R}$.
(v) $G_Y(y; \lambda, \alpha) = 1 - G_Y(-y; -\lambda, \alpha)$.
(vi) If $Y$ has the pdf (3) then $-Y$ has the same distribution but with the parameter $\lambda$ replaced by $-\lambda$.
(vii) $f_Y(y; \lambda, \alpha) + f_Y(-y; \lambda, \alpha) = 2 g_Y(y)$.
(viii) Let $\alpha = \lambda$ in (3) and consider the two random variates $Z_1$ and $Z_2$ following the pdf (3) with parameters $\lambda_1 \in \mathbb{R}$ and $\lambda_2 \in \mathbb{R}$, respectively; then, if $\lambda_1 < \lambda_2$, $Z_1 <_{st} Z_2$. That is, $Z_1$ is stochastically smaller than $Z_2$.
Because of Result (ii) in Proposition 1, an identifiability problem will arise in (3) if we allow $\alpha$ to assume both positive and negative values. To avoid this problem we can and will restrict $\alpha$ to be non-negative when discussing inference for this model. Observe that (iv) establishes that when $\alpha \to 0$ we get as a special case the well-studied skew family of distributions appearing in References [5,12].
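Some of the properties in Proposition 1 can be checked numerically. The following is a minimal sketch, assuming a standard normal pdf and cdf for both components and a plain composite Simpson rule for the integral in (3); the helper names (`phi`, `Phi`, `simpson`, `first_family_pdf`) are illustrative and not taken from the paper.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def first_family_pdf(y, lam, alpha, g=phi, G=Phi):
    """Density (3): g(y)/alpha times the integral of G(z + lam*y) over [-alpha, alpha]."""
    inner = simpson(lambda z: G(z + lam * y), -alpha, alpha)
    return g(y) * inner / alpha

if __name__ == "__main__":
    lam, alpha, y = 2.0, 1.5, 0.7
    # (i): the value at zero equals g(0)
    print(first_family_pdf(0.0, lam, alpha), phi(0.0))
    # (ii): the sign of alpha does not matter
    print(first_family_pdf(y, lam, alpha), first_family_pdf(y, lam, -alpha))
    # (iv): small alpha approaches the Azzalini form 2 g(y) G(lam*y)
    print(first_family_pdf(y, lam, 1e-3), 2.0 * phi(y) * Phi(lam * y))
```

Other symmetric parents can be tried by passing different `g` and `G` callables, provided both are symmetric about zero.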
In most cases the density function (3) does not have a simple, closed form but it can be computed numerically. However, closed-form expressions for the pdf can be obtained for some specific choices of well-known distributions. For example, if we utilize the pdf and cdf of the logistic distribution with location parameter $\mu = 0$ and scale parameter $s = 1$ in (3), i.e., take $g_Y(y) = \exp(-y) / (1 + \exp(-y))^2$ and $G_X(x) = (1 + \exp(-x))^{-1}$, respectively, then (3) becomes
$$f_Y(y; \lambda, \alpha) = \frac{\exp(-y)}{\alpha \left( 1 + \exp(-y) \right)^2} \log \frac{1 + \exp(\alpha + \lambda y)}{1 + \exp(-\alpha + \lambda y)} .$$
On the other hand, combining the pdf of the standard normal distribution with the cdf of the logistic distribution we have, after applying (3), the new pdf
$$f_Y(y) = \frac{\phi(y)}{\alpha} \log \frac{1 + \exp(\alpha + \lambda y)}{1 + \exp(-\alpha + \lambda y)} .$$
If in Equation (3) we use a normal pdf and a normal cdf, i.e., take $g_Y(y) = \phi(y)$ and $G_X(x) = \Phi(x)$, where $\phi(\cdot)$ and $\Phi(\cdot)$ represent the pdf and cdf of the standard normal distribution, respectively, then (3) becomes:
$$f_Y(y; \lambda, \alpha) = \frac{\phi(y)}{\alpha} \Big[ 2\alpha + \phi(\alpha + \lambda y) - \phi(\alpha - \lambda y) + (\lambda y - \alpha)\, \Phi(\alpha - \lambda y) - (\lambda y + \alpha)\, \Phi(-\alpha - \lambda y) \Big] ,$$
which does not appear to be a very attractive analytical expression. However, it is not intractable and does represent a flexible two-parameter family of densities. Figure 1 shows some graphs of this distribution in comparison with the skew normal distribution with parameters $\lambda$, $\mu$ and $\sigma$ and the same mean. (Since the mean and variance do not have a closed form for this distribution, we calculated the mean numerically and matched it to the mean of the SN distribution with parameters $(\lambda, \mu, \sigma)$ to obtain the value of the skew parameter $\lambda$. In all cases, except for the first graph, both distributions have the same mean and approximately the same variance.) Like the skew normal distribution, the new distribution is unimodal. Furthermore, the degree of skewness increases as $\lambda$ grows, and positive skewness corresponds to the case $\lambda > 0$. As can be seen, the new distribution can take the same shape as the skew normal distribution, with one parameter fewer, when most of the probability mass is concentrated around the ordinate axis. Otherwise, shape and scale parameters will have to be incorporated, as will be proposed later. Therefore, by Ockham’s razor, it would seem logical, if one had to choose between both models, to opt for the new model proposed in this work.
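The normal-normal closed form can be cross-checked against direct quadrature of (3). The sketch below assumes the antiderivative $\int \Phi(u)\,du = u\,\Phi(u) + \phi(u)$, which is what produces the bracketed expression; the function names are illustrative only.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def pdf_by_quadrature(y, lam, alpha):
    """Density (3) with normal g and G, integral evaluated numerically."""
    return phi(y) * simpson(lambda z: Phi(z + lam * y), -alpha, alpha) / alpha

def pdf_closed_form(y, lam, alpha):
    """Closed-form normal-normal member of the first family."""
    u = lam * y
    bracket = (2.0 * alpha
               + phi(alpha + u) - phi(alpha - u)
               + (u - alpha) * Phi(alpha - u)
               - (u + alpha) * Phi(-alpha - u))
    return phi(y) * bracket / alpha

if __name__ == "__main__":
    for y in (-2.0, -0.5, 0.0, 0.3, 1.0, 2.5):
        print(y, pdf_closed_form(y, 2.0, 1.5), pdf_by_quadrature(y, 2.0, 1.5))
```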
Other possibilities for which we can get simple expressions for the corresponding pdf by applying (3) include the hyperbolic secant distributions (see [26,27], among others) and the ArcSin distributions (see [28]).
On some occasions, it is convenient to express (3) as
$$f_Y(y; \lambda, \alpha) = \frac{g_Y(y)}{\alpha} \int_{\lambda y - \alpha}^{\lambda y + \alpha} G_X(u)\, du , \tag{5}$$
which is obtained after the change of variable $u = z + \lambda y$ in (3). In fact, the family of skew distributions given in [5] can alternatively be obtained from (5) by applying the first mean value theorem to the integral appearing in (5) on the interval $[\lambda y - \alpha, \lambda y + \alpha]$. To see this, take $c = \lambda y$ in the well-known formula $\int_a^b f(x)\, dx = f(c)(b - a)$, $c \in [a, b]$, to obtain $f_Y(y; \lambda, \alpha) = 2 g_Y(y) G_X(\lambda y)$.

2.2. The Second Family of Skew Distributions

Now we present the second family of skew distributions proposed in this work. This is derived from Equation (5) as follows.
Theorem 3. 
Let $g_Y(y)$ be a density function that is symmetric about zero and $G_X$ a cdf also symmetric about zero. Then,
$$f_Y(y; \lambda, \alpha) = g_Y(y) \big[ G_X(\lambda y + \alpha) + G_X(\lambda y - \alpha) \big] \tag{6}$$
is a valid pdf for $\alpha \in \mathbb{R}$ and $\lambda \in \mathbb{R}$.
Proof. 
From Equation (3) it follows that
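$$\alpha = \int_{-\infty}^{\infty} g_Y(y) \int_{-\alpha}^{\alpha} G_X(z + \lambda y)\, dz\, dy = \int_{-\infty}^{\infty} g_Y(y) \left[ \int_{-\alpha}^{0} G_X(z + \lambda y)\, dz + \int_{0}^{\alpha} G_X(z + \lambda y)\, dz \right] dy . \tag{7}$$
If we now take the derivative with respect to $\alpha$ on both sides in (7) and apply the Fundamental Theorem of Calculus we get
$$\int_{-\infty}^{\infty} g_Y(y) \big[ G_X(\lambda y + \alpha) + G_X(\lambda y - \alpha) \big]\, dy = 1 .$$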
Thus Equation (6) is a genuine pdf. □
It can be verified that the pdf’s given in (6) also satisfy the properties listed in Proposition 1. Furthermore, it is straightforward to see that $f_Y(y; \lambda, 0) = 2 g_Y(y) G_X(\lambda y)$, $f_Y(y; \lambda, \infty) = g_Y(y)$ and $f_Y(y; 0, \alpha) = g_Y(y)$.
Note that the two proposed families, (3) and (6), are different and the only density belonging to both families is the basic density $g_Y(y)$. A major difference between the two proposed families is that in the first family $\alpha$ is not permitted to take the value zero, while that is permitted in the second family. Both families are very similar and differ markedly only in small regions of the support of the distributions. To see this, note that applying the trapezoid rule to (3) we get
$$f_Y(y; \lambda, \alpha) \approx g_Y(y) \big[ G_X(\lambda y + \alpha) + G_X(\lambda y - \alpha) \big] ,$$
which coincides with (6).
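The trapezoid step can be written out explicitly. As a sketch, using a single trapezoid on the interval $[-\alpha, \alpha]$ (of length $2\alpha$) for the integral in (3):

$$\int_{-\alpha}^{\alpha} G_X(z + \lambda y)\, dz \approx \frac{2\alpha}{2} \big[ G_X(\lambda y - \alpha) + G_X(\lambda y + \alpha) \big] = \alpha \big[ G_X(\lambda y + \alpha) + G_X(\lambda y - \alpha) \big] ,$$

and multiplying by $g_Y(y)/\alpha$ gives the right-hand side of the approximation, that is, the density of the second family.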
Because $G_X(-\alpha) = 1 - G_X(\alpha)$, from (6) we get that when $\lambda = 0$, $f_Y(y) = g_Y(y)$, while for $\alpha = 0$ we get the skew family of distributions proposed in Reference [5]. Again, it can be verified that $f_Y(y; \lambda, \alpha) = f_Y(y; \lambda, -\alpha)$ for the family given in Equation (6). Thus the same identifiability problem arises for the model (6) as in the model (3) if we allow $\alpha$ to assume both positive and negative values. To avoid this problem, here too we can and will restrict $\alpha$ to be non-negative when discussing inference for the model. If desired, a more general class than the one proposed in (6) is the one corresponding to finite mixtures of densities of the form (6), as follows:
$$f_Y(y) = g_Y(y) \sum_{i=1}^{m} \delta_i \big[ G_X(\lambda_i y + \alpha_i) + G_X(\lambda_i y - \alpha_i) \big] ,$$
where the $\delta_i$’s are positive and $\sum_{i=1}^{m} \delta_i = 1$.
At this point, we give a more general result than the one provided in [5,29,30] for a symmetric random variable.
Proposition 2. 
Let $U$ be a random variable that is symmetric about zero with cdf $G(u)$ and pdf $g(u)$, and let $\lambda$ and $\alpha$ be two real numbers. Then,
$$E[G(\lambda U + \alpha)] = 1 - E[G(\lambda U - \alpha)] . \tag{9}$$
Proof. 
This is obtained in a straightforward way directly from the result given in Theorem 3 by integrating on both sides of the equality over the support $(-\infty, \infty)$. □
When $U$ follows a standard normal distribution, (9) reduces to
$$E[\Phi(\lambda U + \alpha)] = 1 - E[\Phi(\lambda U - \alpha)] = \Phi\!\left( \frac{\alpha}{\sqrt{1 + \lambda^2}} \right) ,$$
a result which is well known in the statistical literature (see [5,29,30]).
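The normal-case identity is easy to confirm numerically. The following sketch evaluates $E[\Phi(\lambda U + \alpha)]$ by Simpson quadrature against the standard normal density and compares it with $\Phi(\alpha/\sqrt{1+\lambda^2})$; the helper names and the truncation of the integral to $[-10, 10]$ are assumptions of this illustration.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def expected_Phi(lam, shift):
    """E[Phi(lam*U + shift)] for U ~ N(0, 1), integral truncated to [-10, 10]."""
    return simpson(lambda u: phi(u) * Phi(lam * u + shift), -10.0, 10.0)

if __name__ == "__main__":
    lam, alpha = 1.5, 0.7
    print(expected_Phi(lam, alpha))
    print(1.0 - expected_Phi(lam, -alpha))
    print(Phi(alpha / math.sqrt(1.0 + lam * lam)))
```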

3. The Normal Distribution Case

Of greater interest, because it is expressed in a simpler formulation, is the pdf obtained from Equation (6) when g and G are replaced by the pdf and cdf of the standard normal distribution, respectively. This is given by
$$f_X(x) = \phi(x) \big[ \Phi(\lambda x + \alpha) + \Phi(\lambda x - \alpha) \big] . \tag{10}$$
In the following discussion, when a random variate $X$ has its pdf given by Equation (10) it will be denoted by $X \sim GSN(\lambda, \alpha)$. Figure 2 shows some graphs of this pdf and the corresponding skew normal pdf with the same mean and variance. (In this case, the skew parameter $\lambda$ of the SN distribution has been fixed, and the values of $\mu$ and $\sigma$ were then obtained numerically so that both distributions have the same mean and the same variance.) It can be seen that the new model is very versatile and that, depending on the parameter values, the distribution can be unimodal or bimodal. Again, as with Figure 1, the distribution can take approximately the same shape as the skew normal distribution with fewer parameters.
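To make the unimodal/bimodal behaviour of (10) concrete, the sketch below evaluates the GSN density on a grid and counts strict local maxima. The grid range, the step and the two parameter pairs are choices made for this illustration, not values from the paper; the pair $(\lambda, \alpha) = (10, 10)$ is used only because the density then behaves like $\phi(x)$ on $|x| < 1$ and like $2\phi(x)$ for $x > 1$, which produces two peaks.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gsn_pdf(x, lam, alpha):
    """GSN(lam, alpha) density, Equation (10)."""
    return phi(x) * (Phi(lam * x + alpha) + Phi(lam * x - alpha))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def count_modes(lam, alpha, lo=-6.0, hi=6.0, n=2400):
    """Count strict local maxima of the density on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    fs = [gsn_pdf(x, lam, alpha) for x in xs]
    return sum(1 for i in range(1, n) if fs[i - 1] < fs[i] > fs[i + 1])

if __name__ == "__main__":
    for lam, alpha in ((2.0, 0.0), (10.0, 10.0)):
        area = simpson(lambda x: gsn_pdf(x, lam, alpha), -10.0, 10.0)
        print(lam, alpha, "area:", area, "modes:", count_modes(lam, alpha))
```

Setting `alpha = 0` recovers the Azzalini skew normal density, which serves as the unimodal reference case here.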
We next provide the moment generating function of the family given in Equation (10).
Proposition 3. 
The moment generating function of a random variable X having its pdf given by Equation (10) is of the form
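$$M_X(t) = \Big[ \Phi\big( \delta(\lambda t + \alpha) \big) + \Phi\big( \delta(\lambda t - \alpha) \big) \Big] \exp\!\left( \frac{t^2}{2} \right) , \tag{11}$$
where $\delta = 1 / \sqrt{1 + \lambda^2}$.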
Proof. 
The result follows by the same argument as the one used in Reference [5] to obtain the moment generating function of the skew-normal distribution. □
Moments can then be readily obtained by differentiation of Equation (11). In particular, the mean and variance are given by
$$E(X) = \lambda\, b\, \delta \exp\!\left( -\tfrac{1}{2} (\alpha \delta)^2 \right) , \tag{12}$$
$$var(X) = 1 - (\lambda\, b\, \delta)^2 \exp\!\left( -(\alpha \delta)^2 \right) , \tag{13}$$
where $b = \sqrt{2/\pi}$. Table 1 shows the mean and variance of the proposed distribution and the corresponding ones for the skew normal distribution for some special cases of parameters. Similar to the case of the skew normal distribution, it can be verified that $E(X^2) = 1$. Another important property that $GSN(\lambda, \alpha)$ shares with the skew normal distribution is that if $Z \sim GSN(\lambda, \alpha)$ then $Z^2 \sim \chi_1^2$ for all values of $\lambda$ and $\alpha$. Incomplete moments can also be studied following the work of Reference [31]. The third (skewness) and fourth (kurtosis) standardised cumulants are given by
$$\gamma_1 = \frac{E(X)}{[var(X)]^{3/2}}\, (\delta \lambda)^2 \Big[ 2 b^2 \exp\!\left( -(\alpha \delta)^2 \right) + (\alpha \delta)^2 - 1 \Big] , \qquad \gamma_2 = \frac{\pi b^2 \big( 3 b^2 \pi - 4 \delta^4 E^2(X)\, \kappa_1 \big) - 12 E^4(X)}{\big( 1 - var(X) \big)^2 \kappa_2} - 3 ,$$
where
$$\kappa_1 = 3 + 2 (2 + \alpha^2) \lambda^2 + \lambda^4 , \qquad \kappa_2 = \frac{\big( \pi \exp[(\alpha \delta)^2] - 2 (\lambda \delta)^2 \big)^2}{(\lambda \delta)^4} ,$$
and $E(X)$ and $var(X)$ are given by Equations (12) and (13), respectively. Comparisons of these values with the standard skew normal distribution are shown in Table 2. As can be seen, the standardized fourth central moment (kurtosis) can be lower for the GSN distribution than it is for Azzalini’s skew normal distribution.
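The moment expressions can be compared with moments obtained by direct numerical integration of the density (10). The sketch below assumes $\lambda \neq 0$ (so that $\kappa_2$ is defined), takes the square in the numerator of $\kappa_2$ so that $(1 - var(X))^2 \kappa_2 = 4\, var(X)^2$, and treats $\gamma_2$ as excess kurtosis; the function names are illustrative.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gsn_pdf(x, lam, alpha):
    """GSN(lam, alpha) density, Equation (10)."""
    return phi(x) * (Phi(lam * x + alpha) + Phi(lam * x - alpha))

def simpson(f, a, b, n=4800):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def gsn_moments_closed(lam, alpha):
    """Mean, variance, skewness and excess kurtosis from the closed forms (lam != 0)."""
    b = math.sqrt(2.0 / math.pi)
    delta = 1.0 / math.sqrt(1.0 + lam * lam)
    ad2 = (alpha * delta) ** 2
    mean = lam * b * delta * math.exp(-0.5 * ad2)
    var = 1.0 - (lam * b * delta) ** 2 * math.exp(-ad2)
    skew = mean / var ** 1.5 * (delta * lam) ** 2 * (
        2.0 * b * b * math.exp(-ad2) + ad2 - 1.0)
    k1 = 3.0 + 2.0 * (2.0 + alpha ** 2) * lam ** 2 + lam ** 4
    k2 = (math.pi * math.exp(ad2) - 2.0 * (lam * delta) ** 2) ** 2 / (lam * delta) ** 4
    num = (math.pi * b * b * (3.0 * b * b * math.pi - 4.0 * delta ** 4 * mean ** 2 * k1)
           - 12.0 * mean ** 4)
    kurt = num / ((1.0 - var) ** 2 * k2) - 3.0
    return mean, var, skew, kurt

def gsn_moments_numeric(lam, alpha):
    """Same four quantities from raw moments computed by quadrature on [-12, 12]."""
    m1, m2, m3, m4 = (simpson(lambda x, k=k: x ** k * gsn_pdf(x, lam, alpha), -12.0, 12.0)
                      for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return m1, var, mu3 / var ** 1.5, mu4 / var ** 2 - 3.0

if __name__ == "__main__":
    print(gsn_moments_closed(2.0, 1.0))
    print(gsn_moments_numeric(2.0, 1.0))
```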
Let $\Phi_{\lambda, \alpha}(z)$ denote the cdf of $Z \sim GSN(\lambda, \alpha)$.
Proposition 4. 
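If $Z \sim GSN(\lambda, \alpha)$, then its cdf is given by
$$\Phi_{\lambda, \alpha}(z) = \Phi(z) + T\!\left( z, \frac{\alpha}{z} - \lambda \right) - T\!\left( z, \frac{\alpha}{z} + \lambda \right) + T\!\left( \alpha \delta, \frac{z}{\alpha \delta^2} - \lambda \right) - T\!\left( \alpha \delta, \frac{z}{\alpha \delta^2} + \lambda \right) , \qquad z \neq 0, \ \alpha \neq 0 ,$$
where $T(x, a)$ is Owen’s function (see [32]), given by
$$T(x, a) = \frac{1}{2\pi} \int_{0}^{a} \frac{1}{1 + t^2} \exp\!\left( -\frac{x^2 (1 + t^2)}{2} \right) dt , \qquad a \in \mathbb{R} .$$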
Proof. 
The proof is direct by applying result B.21 in Reference [21]. □
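The Owen's-function representation of the cdf can be checked against direct integration of the density (10). In the sketch below Owen's $T$ is evaluated by Simpson quadrature of its defining integral (an implementation choice for this illustration, adequate only for moderate values of $a$); the formula requires $z \neq 0$ and $\alpha \neq 0$.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even. Works for b < a as well."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def owens_t(x, a):
    """Owen's T(x, a) from its integral definition (odd in a, even in x)."""
    integrand = lambda t: math.exp(-0.5 * x * x * (1.0 + t * t)) / (1.0 + t * t)
    return simpson(integrand, 0.0, a) / (2.0 * math.pi)

def gsn_cdf_owen(z, lam, alpha):
    """GSN cdf through Owen's T; valid for z != 0 and alpha != 0."""
    delta = 1.0 / math.sqrt(1.0 + lam * lam)
    r = z / (alpha * delta * delta)
    return (Phi(z)
            + owens_t(z, alpha / z - lam) - owens_t(z, alpha / z + lam)
            + owens_t(alpha * delta, r - lam) - owens_t(alpha * delta, r + lam))

def gsn_cdf_numeric(z, lam, alpha):
    """GSN cdf by integrating the density (10) from -12 up to z."""
    pdf = lambda x: phi(x) * (Phi(lam * x + alpha) + Phi(lam * x - alpha))
    return simpson(pdf, -12.0, z, 4000)

if __name__ == "__main__":
    for z, lam, alpha in ((0.5, 2.0, 1.0), (-0.8, 1.5, 0.6)):
        print(z, gsn_cdf_owen(z, lam, alpha), gsn_cdf_numeric(z, lam, alpha))
```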
Proposition 5. 
If $Z \sim GSN(\lambda, \alpha)$, then it follows that:
(i) $\Phi_{\lambda, 0}(z) = \Phi(z) + T(z, -\lambda) - T(z, \lambda)$.
(ii) $\Phi_{\lambda, \alpha}(0) = \frac{1}{2} - 2\, T(\alpha \delta, \lambda)$.
(iii) $\Phi_{\lambda, \alpha}(z) = \Phi_{\lambda, -\alpha}(z)$.
Proof. 
The proof is also direct this time by applying the result B.23 in [21]. □
To end this Section we provide a result related to probability transformations which generalises a result appearing in Reference [5] and provided also in Reference [31].
Proposition 6. 
Let $W$ and $Z$ be independent random variables with distribution $N(0, 1)$ and $GSN(\lambda, \alpha)$, respectively. Then, the random variable $Y = (hW + kZ) / \sqrt{h^2 + k^2}$, where $h, k \in \mathbb{R}$, follows a $GSN(\tilde{\lambda}, \tilde{\alpha})$, where $\tilde{\lambda} = \delta k \lambda / \sqrt{h^2 + k^2}$ and $\tilde{\alpha} = \delta \alpha$.
Proof. 
It can be proved following the same argument as that used in Lemma 1 in Reference [31]. □

4. Estimation

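The family of distributions $GSN(\lambda, \alpha)$ can be generalized by means of a linear transformation in order to introduce a location and a scale parameter, adding more flexibility to the model (10). We will thus consider the location-scale generalization of the skew-normal distribution defined as the distribution of $Y = \mu + \sigma X$, where $X \sim GSN(\lambda, \alpha)$ as given in Equation (10), $\mu \in \mathbb{R}$ and $\sigma > 0$. Its pdf is given by
$$f_Y(y) = \frac{1}{\sigma}\, \phi\!\left( \frac{y - \mu}{\sigma} \right) \left[ \Phi\!\left( \frac{\lambda (y - \mu)}{\sigma} + \alpha \right) + \Phi\!\left( \frac{\lambda (y - \mu)}{\sigma} - \alpha \right) \right] . \tag{14}$$
When $\lambda = \alpha = 0$ this pdf reduces to the $N(\mu, \sigma)$ and when $\alpha = 0$ to the $SN(\mu, \sigma, \lambda)$. For a sample $\underline{y} = \{ y_1, \ldots, y_n \}$ we can estimate the four parameters, $\Theta = (\lambda, \alpha, \mu, \sigma)$, of the model given in Equation (14) by a direct search for the maximum of the log-likelihood surface given by
$$\ell(\underline{y}; \Theta) \propto -n \log \sigma - \sum_{i=1}^{n} \frac{(y_i - \mu)^2}{2 \sigma^2} + \sum_{i=1}^{n} \log \left[ \Phi\!\left( \frac{\lambda (y_i - \mu)}{\sigma} + \alpha \right) + \Phi\!\left( \frac{\lambda (y_i - \mu)}{\sigma} - \alpha \right) \right] . \tag{15}$$
From Equation (15) we get the normal equations, which are given by:
$$\begin{aligned}
\frac{\partial \ell(\underline{y}; \Theta)}{\partial \lambda} &= \sum_{i=1}^{n} \frac{y_i - \mu}{\sigma}\, I(r_{1i}, r_{2i}) = 0 , \\
\frac{\partial \ell(\underline{y}; \Theta)}{\partial \alpha} &= \sum_{i=1}^{n} H(r_{1i}, r_{2i}) = 0 , \\
\frac{\partial \ell(\underline{y}; \Theta)}{\partial \mu} &= \frac{n (\bar{y} - \mu)}{\sigma^2} - \frac{\lambda}{\sigma} \sum_{i=1}^{n} I(r_{1i}, r_{2i}) = 0 , \\
\frac{\partial \ell(\underline{y}; \Theta)}{\partial \sigma} &= -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{n} (y_i - \mu)^2 - \frac{\lambda}{\sigma^2} \sum_{i=1}^{n} (y_i - \mu)\, I(r_{1i}, r_{2i}) = 0 ,
\end{aligned}$$
where
$$H(r_{1i}, r_{2i}) = \frac{\phi(r_{1i}) - \phi(r_{2i})}{\Phi(r_{1i}) + \Phi(r_{2i})} , \qquad I(r_{1i}, r_{2i}) = \frac{\phi(r_{1i}) + \phi(r_{2i})}{\Phi(r_{1i}) + \Phi(r_{2i})}$$
and
$$r_{1i} = \frac{\lambda (y_i - \mu)}{\sigma} + \alpha , \qquad r_{2i} = \frac{\lambda (y_i - \mu)}{\sigma} - \alpha .$$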
Since it is not possible to obtain closed expressions for the maximum likelihood estimators, algorithms based on numerical methods, such as Newton-Raphson or Broyden-Fletcher-Goldfarb-Shanno (BFGS), among others, will have to be used. It is recommended to use several different starting points as initial values to ensure that the solution obtained constitutes a global maximum of the log-likelihood function.
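As a rough sketch of how the log-likelihood (15) and its score vector might be coded before being handed to a numerical optimizer, the functions below implement both and compare the analytic score with central finite differences. The sample values and the parameter vector are made up for illustration, and the authors' own computations used Mathematica and WinRATS, not this code.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def loglik(theta, ys):
    """Log-likelihood (15), up to an additive constant; theta = (lam, alpha, mu, sigma)."""
    lam, alpha, mu, sigma = theta
    total = -len(ys) * math.log(sigma)
    for y in ys:
        z = (y - mu) / sigma
        total += -0.5 * z * z + math.log(Phi(lam * z + alpha) + Phi(lam * z - alpha))
    return total

def score(theta, ys):
    """Analytic gradient of (15) with respect to (lam, alpha, mu, sigma)."""
    lam, alpha, mu, sigma = theta
    d_lam = d_alpha = d_mu = d_sigma = 0.0
    for y in ys:
        z = (y - mu) / sigma
        r1, r2 = lam * z + alpha, lam * z - alpha
        denom = Phi(r1) + Phi(r2)
        h_val = (phi(r1) - phi(r2)) / denom   # H(r1, r2)
        i_val = (phi(r1) + phi(r2)) / denom   # I(r1, r2)
        d_lam += z * i_val
        d_alpha += h_val
        d_mu += z / sigma - lam / sigma * i_val
        d_sigma += -1.0 / sigma + z * z / sigma - lam * z / sigma * i_val
    return [d_lam, d_alpha, d_mu, d_sigma]

def numeric_gradient(theta, ys, eps=1e-6):
    """Central finite-difference gradient, used only to validate score()."""
    grad = []
    for j in range(len(theta)):
        up = list(theta); up[j] += eps
        down = list(theta); down[j] -= eps
        grad.append((loglik(up, ys) - loglik(down, ys)) / (2.0 * eps))
    return grad

if __name__ == "__main__":
    sample = [0.3, 1.2, -0.5, 2.1, 0.9, 1.7]   # hypothetical data
    theta0 = [1.2, 0.8, 0.5, 1.3]              # hypothetical starting point
    print(score(theta0, sample))
    print(numeric_gradient(theta0, sample))
```

In practice the negative of `loglik` (with `score` as its gradient, sign flipped) could be passed to a general-purpose quasi-Newton routine and restarted from several initial points, in line with the recommendation above.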
The standard errors of the estimators can be obtained by inverting the Hessian matrix. Both Mathematica and WinRATS have at least two methods for this purpose. The first is to use Cholesky factors (this package is available on the web upon request). The second, faster method uses finite differences. Furthermore, the WinRATS package also offers the possibility of maximizing the log-likelihood directly, returning the elements of the Fisher information matrix. In fact, for the examples considered later, these two packages were used to obtain the maximum likelihood estimators quickly.

5. Multivariate Versions

Multivariate extensions of the univariate distributions arise in an easy way as we can see in the next result.
Theorem 4. 
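Let $X$ and $\underline{Y}$ be two random variables where $X \sim N(0, 1)$ and $\underline{Y} \sim N^{(m)}(\underline{0}, \Sigma)$. Then,
$$f(\underline{y}; \underline{\lambda}, \alpha) = \frac{f_{\underline{Y}}(\underline{y})}{\alpha} \int_{-\alpha}^{\alpha} F_X(z + \underline{\lambda}^{\top} \underline{y})\, dz \tag{16}$$
represents a genuine pdf for $\alpha \in \mathbb{R} - \{0\}$ and $\underline{\lambda} \in \mathbb{R}^m$.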
Proof. 
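Without loss of generality, assume that $X$ and $\underline{Y}$ are independent random variables. Taking into account the fact that $X - \underline{\lambda}^{\top} \underline{Y}$ is symmetric and using the result provided in Theorem 1, we get
$$\begin{aligned}
\alpha &= \int_{-\alpha}^{\alpha} \Pr(X - \underline{\lambda}^{\top} \underline{Y} < z)\, dz = \int_{\mathbb{R}^m} \int_{-\alpha}^{\alpha} \Pr(X < z + \underline{\lambda}^{\top} \underline{y} \mid \underline{Y} = \underline{y})\, dz\; f_{\underline{Y}}(\underline{y})\, d\underline{y} \\
&= \int_{\mathbb{R}^m} \int_{-\alpha}^{\alpha} \Pr(X < z + \underline{\lambda}^{\top} \underline{y})\, dz\; f_{\underline{Y}}(\underline{y})\, d\underline{y} = \int_{\mathbb{R}^m} f_{\underline{Y}}(\underline{y}) \int_{-\alpha}^{\alpha} F_X(z + \underline{\lambda}^{\top} \underline{y})\, dz\, d\underline{y} .
\end{aligned}$$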
Hence the result. □
Remark 1. 
The only important required property of the distribution of $\underline{Y}$ is that, for any $\underline{\lambda}$, the random variable $\underline{\lambda}^{\top} \underline{Y}$ is symmetric about zero. The only required property for the distribution of $X$ is that it be symmetric about zero. This is true for the above Theorem and the next.
Theorem 5. 
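Let $X$ and $\underline{Y}$ be two random variables where $X \sim N(0, 1)$ and $\underline{Y} \sim N^{(m)}(\underline{0}, \Sigma)$. Then,
$$f(\underline{y}; \underline{\lambda}, \alpha) = f_{\underline{Y}}(\underline{y}) \big[ F_X(\underline{\lambda}^{\top} \underline{y} + \alpha) + F_X(\underline{\lambda}^{\top} \underline{y} - \alpha) \big] \tag{17}$$
is a valid pdf for $\alpha \in \mathbb{R}$ and $\underline{\lambda} \in \mathbb{R}^m$.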
Proof. 
From (16) it follows that
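$$\alpha = \int_{\mathbb{R}^m} f_{\underline{Y}}(\underline{y}) \int_{-\alpha}^{\alpha} F_X(z + \underline{\lambda}^{\top} \underline{y})\, dz\, d\underline{y} = \int_{\mathbb{R}^m} f_{\underline{Y}}(\underline{y}) \left[ \int_{-\alpha}^{0} F_X(z + \underline{\lambda}^{\top} \underline{y})\, dz + \int_{0}^{\alpha} F_X(z + \underline{\lambda}^{\top} \underline{y})\, dz \right] d\underline{y} . \tag{18}$$
If we now take the derivative with respect to $\alpha$ on both sides in Equation (18) and apply the Fundamental Theorem of Calculus we get
$$\int_{\mathbb{R}^m} f_{\underline{Y}}(\underline{y}) \big[ F_X(\underline{\lambda}^{\top} \underline{y} + \alpha) + F_X(\underline{\lambda}^{\top} \underline{y} - \alpha) \big]\, d\underline{y} = 1 .$$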
Thus (17) is a genuine pdf. □
Note that if we set $\alpha = 0$ in (17), we obtain
$$f(\underline{y}; \underline{\lambda}) = 2 f_{\underline{Y}}(\underline{y})\, F_X(\underline{\lambda}^{\top} \underline{y}) ,$$
which was one of the first multivariate skew-normal models to appear in the literature. See [10,11], for instance.
Figure 3 shows the density of the bivariate generalized skew-normal (BGSN) model for some combinations of the parameters. Perusal of Figure 3 will confirm that the density of the BGSN model exhibits a more interesting array of possible shapes than do many of its competitors. The flexibility of this model can be expected to be useful in fitting the model to a variety of different data sets.

6. Numerical Illustrations

In this section, three examples for the GSN model given in Equation (14) are presented, and the results are compared with the flexible epsilon skew-normal (FESN) model introduced by Reference [33] in the first example, with the mixture of two normals (MN) model in the second example and with the flexible skew-normal (FSN) model introduced by Reference [34] in the third example. The three densities are given, respectively, by:
  • $f(y; \mu, \sigma, \lambda, \varepsilon) = \begin{cases} \dfrac{1}{2 \sigma c_\lambda}\, \phi\!\left( \dfrac{y - \mu}{\sigma (1 + \varepsilon)} - \lambda \right) & \text{if } y < \mu , \\[2ex] \dfrac{1}{2 \sigma c_\lambda}\, \phi\!\left( \dfrac{y - \mu}{\sigma (1 - \varepsilon)} + \lambda \right) & \text{if } y \geq \mu \end{cases}$
  • $f(y; \mu, \mu_1, \sigma, \sigma_1, p) = \dfrac{p}{\sigma}\, \phi\!\left( \dfrac{y - \mu}{\sigma} \right) + \dfrac{1 - p}{\sigma_1}\, \phi\!\left( \dfrac{y - \mu_1}{\sigma_1} \right)$
  • $f(y; \mu, \sigma, \lambda, \alpha) = \dfrac{2}{\sigma}\, \phi\!\left( \dfrac{y - \mu}{\sigma} \right) \Phi\!\left( \lambda\, \dfrac{y - \mu}{\sigma} + \alpha \left( \dfrac{y - \mu}{\sigma} \right)^{3} \right)$
where $\phi(\cdot)$ and $\Phi(\cdot)$ denote the density and distribution functions of the standard normal distribution, $c_\lambda = 1 - \Phi(\lambda)$, $\mu, \mu_1, \lambda, \alpha \in \mathbb{R}$, $\sigma, \sigma_1 \in \mathbb{R}^{+}$, $-1 < \varepsilon < 1$ and $0 \leq p \leq 1$.
We use these three models since they have been used in the applied statistics literature to explain bimodal empirical data. We chose the MN model since it is a classic model for bimodal data sets, the FESN model since it is one of the first bimodal extensions of the family of epsilon-skew-symmetric distributions, and the FSN model since it is one of the first bimodal extensions of the family of skew-symmetric distributions.
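For readers who want to experiment with the three competing densities, the sketch below codes them directly from the expressions above and checks numerically that each integrates to one. The parameter values are arbitrary illustrations, not fitted values from the paper, and the integration window of 40 units on each side of $\mu$ is an assumption suited to these values only.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def fesn_pdf(y, mu, sigma, lam, eps):
    """Flexible epsilon skew-normal density, with c_lam = 1 - Phi(lam)."""
    c = 1.0 - Phi(lam)
    z = (y - mu) / sigma
    if y < mu:
        return phi(z / (1.0 + eps) - lam) / (2.0 * sigma * c)
    return phi(z / (1.0 - eps) + lam) / (2.0 * sigma * c)

def mn_pdf(y, mu, mu1, sigma, sigma1, p):
    """Two-component normal mixture density."""
    return p / sigma * phi((y - mu) / sigma) + (1.0 - p) / sigma1 * phi((y - mu1) / sigma1)

def fsn_pdf(y, mu, sigma, lam, alpha):
    """Flexible skew-normal density with an odd cubic inside Phi."""
    z = (y - mu) / sigma
    return 2.0 / sigma * phi(z) * Phi(lam * z + alpha * z ** 3)

def total_mass(pdf, center, half_width=40.0):
    """Integrate piecewise around `center` so a kink at the center sits on a node."""
    return (simpson(pdf, center - half_width, center)
            + simpson(pdf, center, center + half_width))

if __name__ == "__main__":
    print(total_mass(lambda y: fesn_pdf(y, 1.0, 2.0, 0.5, 0.3), 1.0))
    print(total_mass(lambda y: mn_pdf(y, 0.0, 4.0, 1.0, 1.5, 0.4), 0.0))
    print(total_mass(lambda y: fsn_pdf(y, 1.0, 2.0, 1.0, -0.5), 1.0))
```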

6.1. Example 1

The data in this example is a set of fiber levels for 315 patients and is available online at http://Lib.stat.cmu.edu/datasets/Plasma_Retinol and contains values for 14 variables for each patient. For our analysis we will use only the variable called Fiber (grams of fiber consumed per day). Low levels of this variable may be associated with higher risk of certain types of cancer. Descriptive statistics for the data set are displayed in Table 3. In the table, $b_1$ and $b_2$ denote sample skewness and kurtosis coefficients. Note that the data exhibits a high level of kurtosis.
The estimated values of the parameters for the two models are shown in Table 4 together with the standard errors (SE) in parentheses. The table also includes the maximum of the log-likelihood function ($\ell_{\max}$), the Akaike information criterion (AIC) and the consistent Akaike information criterion (CAIC), proposed in References [35,36] respectively. A model with a lower AIC or CAIC is preferred to a model with a higher value.
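As a small illustration of how the two criteria are computed from a fitted model, the sketch below assumes the usual definitions $\mathrm{AIC} = 2k - 2\ell_{\max}$ and $\mathrm{CAIC} = k(\log n + 1) - 2\ell_{\max}$, with $k$ the number of estimated parameters and $n$ the sample size. The log-likelihood values in the demo are invented; only $n = 315$ is taken from the data description.

```python
import math

def aic(loglik_max, k):
    """Akaike information criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * loglik_max

def caic(loglik_max, k, n):
    """Consistent AIC: the penalty per parameter is log(n) + 1."""
    return k * (math.log(n) + 1.0) - 2.0 * loglik_max

if __name__ == "__main__":
    n = 315
    # Hypothetical maximized log-likelihoods for two four-parameter models.
    for name, ll in (("model A", -880.0), ("model B", -875.0)):
        print(name, aic(ll, 4), caic(ll, 4, n))
```

With equal numbers of parameters, the model with the larger maximized log-likelihood has the smaller value of both criteria.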
The histogram of the data and the fitted densities are shown in Figure 4. As can be seen, the GSN distribution is the better of the two models with regard to reflecting the nature of the empirical data. All computations here were done using Mathematica v.11.0 and WinRATS v.7.0. These codes are available from the authors upon request.

6.2. Example 2

We consider the variable M-Sweet, available in the database on creaminess of cream cheese, which can be found at http://www.models.kvl.dk/research/data/Cream/index.asp.
Table 5 shows summary statistics for the M-Sweet dataset.
Table 6 presents parameter estimates (SE) for both the MN and GSN models. It can be seen that the log-likelihood for the GSN model is higher than that for the MN model. The AIC and CAIC criteria are used again to compare the estimated models; it can be seen that the GSN model presents the best fit (smallest AIC and CAIC values).
Finally, the histogram of the data and the fitted densities are shown in Figure 5.

6.3. Example 3

Finally, data corresponding to the age and frequency of cancer cataloged as Kaposi’s sarcoma have been taken. This is a type of cancer that can form masses in the skin, lymph nodes, or other organs; the sub-types are not distinguished here. The data were collected from the website of the Office for National Statistics (ONS, section Health statistics) and can be seen in Table A1 in Appendix A. It can be observed that there is a higher incidence in individuals with an age around 25 years as well as in those with an age around 60 years. These records were taken during the years 1995–2016 and correspond to different regions of the United Kingdom. Table 7 shows summary statistics for the Kaposi’s sarcoma dataset.
The two fitted models are represented in Figure 6 and the corresponding estimated values can be seen in Table 8. The GSN model presents a better fit to the data, since its AIC and CAIC values are smaller.

7. Conclusions

We have proposed two families of skew distributions which can be considered as alternatives to the well-known skew normal distribution for fitting skewed data.
Future research could address the following issue. We can ask whether the methodology proposed here can be applied to the generalized skew-normal distribution proposed in Reference [9] to obtain a more flexible distribution. For the case with standard normal components, one might consider the following model, which is an average of two Arnold and Beaver densities:
$$f_X(x) = \frac{1}{2}\,\phi(x)\left[\frac{\Phi(\lambda x + \alpha_1)}{\Phi\left(\alpha_1/\sqrt{1+\lambda^2}\right)} + \frac{\Phi(\lambda x - \alpha_2)}{\Phi\left(-\alpha_2/\sqrt{1+\lambda^2}\right)}\right],$$
where $\lambda, \alpha_1, \alpha_2 \in \mathbb{R}$. Note that this model is not obtainable by methods analogous to those used to develop the model (10). However, it is a simple and more flexible extension of the Arnold-Beaver model. Once we recognize it as a mixture with equal weights, it is reasonable to add more flexibility by considering unequal weights, as follows:
$$f_X(x) = \phi(x)\left[\gamma\,\frac{\Phi(\lambda x + \alpha_1)}{\Phi\left(\alpha_1/\sqrt{1+\lambda^2}\right)} + (1-\gamma)\,\frac{\Phi(\lambda x - \alpha_2)}{\Phi\left(-\alpha_2/\sqrt{1+\lambda^2}\right)}\right],$$
where $\gamma \in [0, 1]$.
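A quick numerical check of the weighted mixture can be sketched as follows. The code assumes the density has the form $\phi(x)\,[\gamma\,\Phi(\lambda x+\alpha_1)/\Phi(\alpha_1/\sqrt{1+\lambda^2}) + (1-\gamma)\,\Phi(\lambda x-\alpha_2)/\Phi(-\alpha_2/\sqrt{1+\lambda^2})]$, with the signs in the second component taken as an assumption. The function names and the parameter values are illustrative only. Each component is normalized by the identity $E[\Phi(\lambda Z + a)] = \Phi(a/\sqrt{1+\lambda^2})$ for standard normal $Z$, so the mixture should integrate to one.

```python
import math


def phi(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mixture_pdf(x, lam, alpha1, alpha2, gamma=0.5):
    """Weighted mixture of two Arnold-Beaver-type densities (assumed form)."""
    s = math.sqrt(1.0 + lam * lam)
    first = Phi(lam * x + alpha1) / Phi(alpha1 / s)
    second = Phi(lam * x - alpha2) / Phi(-alpha2 / s)
    return phi(x) * (gamma * first + (1.0 - gamma) * second)


def integrate(f, lo=-12.0, hi=12.0, steps=24000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


# Illustrative parameter values (not taken from the paper)
total_mass = integrate(
    lambda x: mixture_pdf(x, lam=3.0, alpha1=1.0, alpha2=2.0, gamma=0.3)
)
print(f"integral of the density: {total_mass:.6f}")
```

Setting `gamma=0.5` gives the equal-weight average of the two components.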

Author Contributions

Conceptualization, E.G.-D., B.C.A. and J.M.S.; Formal analysis, E.G.-D., B.C.A., J.M.S. and H.W.G.; Investigation, E.G.-D., B.C.A., J.M.S. and H.W.G.; Methodology, E.G.-D. and J.M.S.; Software, E.G.-D. and H.W.G.; Supervision, E.G.-D. and B.C.A.; Validation, E.G.-D. and H.W.G. All of the authors contributed significantly to this research article. All authors have read and agreed to the published version of the manuscript.

Funding

The authors thank the Ministerio de Economía y Competitividad, Spain (project ECO2017-85577-P, EGD) and the Ministerio de Ciencia e Innovación, Spain (PID2019-105986GB-C22, JMS) for partial support of this work. The research of H.W. Gómez was supported by the PUENTE-UA project, Chile.

Acknowledgments

The authors thank the Editor and three anonymous referees for their constructive comments and suggestions, which have greatly helped them to improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Data corresponding to Kaposi’s sarcoma.

| Age | Number |
|---|---|
| 1 | 1 |
| 5 | 89 |
| 10 | 342 |
| 15 | 718 |
| 20 | 2352 |
| 25 | 3593 |
| 30 | 3243 |
| 35 | 2533 |
| 40 | 2015 |
| 45 | 1747 |
| 50 | 1562 |
| 55 | 1662 |
| 60 | 1801 |
| 65 | 1915 |
| 70 | 1855 |
| 75 | 1611 |
| 80 | 1203 |
| 85 | 642 |
| 90 | 247 |

References

  1. Aigner, D.J.; Lovell, C.A.K.; Schmidt, P. Formulation and estimation of stochastic frontier production functions. J. Econ. 1977, 6, 21–37.
  2. Kumbhakar, S.C.; Parmeter, C.F.; Tsionas, E.G. A zero inefficiency stochastic frontier model. J. Econ. 2013, 172, 66–76.
  3. Marshall, A.W.; Olkin, I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 1997, 84, 641–652.
  4. Jones, M.C. Families of distributions arising from distributions of order statistics. Test 2007, 13, 1–43.
  5. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178.
  6. Henze, N. A probabilistic representation of the Skew-Normal distribution. Scand. J. Stat. 1986, 4, 271–275.
  7. Silver, E.A.; Costa, D. A property of symmetric distributions and a related order statistic result. Am. Stat. 1997, 51, 32–33.
  8. Gupta, R.C.; Gupta, R.D. Generalized skew normal model. Test 2004, 12, 501–524.
  9. Arnold, B.C.; Beaver, R.J. Skewed multivariate models related to hidden truncation and/or selective reporting (with discussion). Test 2002, 11, 1–54.
  10. Azzalini, A.; Capitanio, A. Statistical applications of the multivariate skew-normal distribution. J. R. Stat. Soc. Ser. B 1999, 61, 579–602.
  11. Azzalini, A.; Valle, A. The multivariate skew-normal distribution. Biometrika 1996, 83, 715–726.
  12. Azzalini, A. Further results on a class of distributions which includes the normal ones. Statistica 1986, 46, 199–208.
  13. Azzalini, A.; Chiogna, M. Some results on the stress-strength model for skew-normal variates. METRON 2004, LXII, 315–326.
  14. Gupta, R.D.; Gupta, R.C. Analyzing skewed data by power normal model. Test 2008, 17, 197–210.
  15. Gómez, H.; Venegas, O.; Bolfarine, H. Skew-symmetric distributions generated by the distribution function of the normal distribution. Environmetrics 2007, 18, 395–407.
  16. Sharafi, M.; Behboodian, J. The Balakrishnan skew-normal density. Stat. Paper. 2008, 49, 769–778.
  17. Jones, M.; Pewsey, A. Sinh-arcsinh distributions. Biometrika 2009, 96, 761–780.
  18. Gómez-Déniz, E.; Iriarte, Y.A.; Calderín-Ojeda, E.; Gómez, H.W. Modified Power-Symmetric Distribution. Symmetry 2019, 11, 1410.
  19. Alavi, S.M.R.; Tarhani, M. On a Skew Bimodal Normal-Normal distribution fitted to the Old-Faithful geyser data. Commun. Stat. Theor. Method. 2017, 46, 7301–7312.
  20. García, V.; Gómez-Déniz, E.; Vázquez-Polo, F. A new skew generalization of the normal distribution: Properties and applications. Comput. Stat. Data Anal. 2010, 54, 2021–2034.
  21. Azzalini, A. The Skew Normal and Related Families; Cambridge University Press: Cambridge, UK, 2014.
  22. Dickson, D. Insurance Risk and Ruin; Cambridge University Press: Cambridge, UK, 2005.
  23. Rolski, T.; Schmidli, H.; Schmidt, V.; Teugel, J. Stochastic Processes for Insurance and Finance; John Wiley and Sons: Hoboken, NJ, USA, 1999.
  24. Burdick, D. A note on symmetric random variables. Annal. Math. Stat. 1972, 43, 2039–2040.
  25. Jones, M. Generating distributions by transformation of scale. Stat. Sinica 2014, 24, 749–771.
  26. Ding, P. Three Occurrences of the Hyperbolic-Secant Distribution. Am. Stat. 2014, 68, 32–35.
  27. Johnson, N.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions; Wiley: New York, NY, USA, 1995.
  28. Arnold, B.C.; Groeneveld, R.A. Some Properties of the Arcsine Distribution. J. Am. Stat. Assoc. 1980, 75, 173–175.
  29. Ellison, B. Two theorems for inferences about the normal distribution with applications in acceptance sampling. J. Am. Stat. Assoc. 1964, 59, 89–95.
  30. Zacks, S. Parametric Statistical Inference; Pergamon Press: Oxford, UK, 1981.
  31. Chiogna, M. Some results on the scalar skew-normal distribution. J. Ital. Statist. Soc. 1998, 1, 1–13.
  32. Owen, D. Tables for computing bivariate normal probabilities. Ann. Math. Stat. 1956, 27, 1075–1090.
  33. Arellano-Valle, R.B.; Cortés, M.A.; Gómez, H.W. An Extension of the Epsilon-Skew-Normal Distribution. Communicat. Stat. Theor. Method. 2010, 39, 912–922.
  34. Ma, Y.; Genton, M.G. Flexible Class of Skew-Symmetric Distributions. Scand. J. Stat. 2004, 31, 459–468.
  35. Bozdogan, H. The general theory and its analytical extension. Psychometrika 1987, 52, 345–370.
  36. Akaike, H. A new look at the statistical model. IEEE Transact. Automat. Control. 1974, 19, 716–723.
Figure 1. Plot of the pdf (4) (thick line) denoted as GSN( λ , α ) and the SN( λ , μ , σ ) (thin line) for selected values of the parameters α and λ .
Figure 2. Plot of the pdf in (10) for selected values of parameters and comparison with the skew normal one.
Figure 3. Density for BGSN for: λ 1 = λ 2 = 3 and α = 1 (upper left panel); λ 1 = λ 2 = 3 and α = 1 (upper right panel); λ 1 = λ 2 = 3 and α = 5 (lower left panel); λ 1 = λ 2 = 3 and α = 5 (lower right panel).
Figure 4. FESN distribution (dashed line) and GSN distribution (solid line) for the Fiber data.
Figure 5. MN distribution (dashed line) and GSN distribution (solid line) for the M-Sweet data.
Figure 6. FSN distribution (dashed line) and GSN distribution (solid line) for the Kaposi’s sarcoma data.
Table 1. Mean and variance of the GSN and the SN variates for some parameter values.

| $\lambda$ | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Mean SN($\lambda$) | 0.000 | 0.564 | 0.713 | 0.756 | 0.774 | 0.782 |
| Mean GSN($\lambda$, 0.1) | 0.000 | 0.562 | 0.712 | 0.756 | 0.773 | 0.782 |
| Variance SN($\lambda$) | 1.000 | 0.681 | 0.490 | 0.427 | 0.400 | 0.387 |
| Variance GSN($\lambda$, 0.1) | 1.000 | 0.683 | 0.491 | 0.427 | 0.401 | 0.388 |
| Mean SN($\lambda$) | 0.000 | 0.564 | 0.713 | 0.756 | 0.774 | 0.782 |
| Mean GSN($\lambda$, 1) | 0.000 | 0.439 | 0.645 | 0.720 | 0.751 | 0.767 |
| Variance SN($\lambda$) | 1.000 | 0.681 | 0.490 | 0.427 | 0.400 | 0.387 |
| Variance GSN($\lambda$, 1) | 1.000 | 0.806 | 0.583 | 0.481 | 0.435 | 0.410 |
| Mean SN($\lambda$) | 0.000 | 0.564 | 0.713 | 0.756 | 0.774 | 0.782 |
| Mean GSN($\lambda$, 5) | 0.000 | 0.001 | 0.058 | 0.216 | 0.371 | 0.483 |
| Variance SN($\lambda$) | 1.000 | 0.681 | 0.490 | 0.427 | 0.400 | 0.387 |
| Variance GSN($\lambda$, 5) | 1.000 | 0.999 | 0.996 | 0.952 | 0.862 | 0.765 |
Table 2. Third and fourth standardized cumulants of the GSN and the SN variates for some parameter values.

| $\lambda$ | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| $\gamma_1$, SN($\lambda$) | 0.000 | 0.136 | 0.453 | 0.667 | 0.784 | 0.850 |
| $\gamma_1$, GSN($\lambda$, 0.1) | 0.000 | 0.135 | 0.451 | 0.664 | 0.782 | 0.849 |
| $\gamma_2$, SN($\lambda$) | 0.000 | 0.061 | 0.305 | 0.509 | 0.632 | 0.705 |
| $\gamma_2$, GSN($\lambda$, 0.1) | 0.000 | 0.060 | 0.302 | 0.507 | 0.630 | 0.703 |
| $\gamma_1$, SN($\lambda$) | 0.000 | 0.136 | 0.453 | 0.667 | 0.784 | 0.850 |
| $\gamma_1$, GSN($\lambda$, 1) | 0.000 | 0.082 | 0.281 | 0.488 | 0.639 | 0.738 |
| $\gamma_2$, SN($\lambda$) | 0.000 | 0.061 | 0.305 | 0.509 | 0.632 | 0.705 |
| $\gamma_2$, GSN($\lambda$, 1) | 0.000 | −0.046 | 0.071 | 0.289 | 0.458 | 0.571 |
| $\gamma_1$, SN($\lambda$) | 0.000 | 0.136 | 0.453 | 0.667 | 0.784 | 0.850 |
| $\gamma_1$, GSN($\lambda$, 5) | 0.000 | 0.006 | 0.188 | 0.336 | 0.332 | 0.311 |
| $\gamma_2$, SN($\lambda$) | 0.000 | 0.061 | 0.305 | 0.509 | 0.632 | 0.705 |
| $\gamma_2$, GSN($\lambda$, 5) | 0.000 | −0.000 | −0.044 | −0.294 | −0.480 | −0.501 |
Table 3. Fiber: Descriptive statistics.

| $n$ | $\bar{y}$ | $s^2$ | $b_1$ | $b_2$ |
|---|---|---|---|---|
| 315 | 12.789 | 28.411 | 1.147 | 5.425 |
Table 4. Parameter estimates (SE) for FESN and GSN models.

| Parameter | FESN | GSN |
|---|---|---|
| $\mu$ | 7.176 (0.405) | 6.714 (0.329) |
| $\sigma$ | 4.396 (0.446) | 8.076 (0.406) |
| $\lambda$ | −0.288 (0.235) | 9.692 (2.510) |
| $\alpha$ | - | 2.486 (0.764) |
| $\varepsilon$ | −0.695 (0.048) | - |
| $\ell_{\max}$ | −949.458 | −945.575 |
| AIC | 1906.916 | 1899.150 |
| CAIC | 1925.926 | 1918.160 |
Table 5. M-Sweet: Descriptive statistics.

| $n$ | $\bar{y}$ | $s^2$ | $b_1$ | $b_2$ |
|---|---|---|---|---|
| 240 | 3.276 | 1.964 | 0.882 | 5.049 |
Table 6. Parameter estimates (SE) for MN and GSN models.

| Parameter | MN | GSN |
|---|---|---|
| $\mu$ | 2.115 (0.115) | 2.577 (0.126) |
| $\sigma$ | 0.498 (0.104) | 1.564 (0.091) |
| $\lambda$ | - | 3.780 (0.852) |
| $\alpha$ | - | 3.685 (0.944) |
| $\mu_1$ | 3.727 (0.176) | - |
| $\sigma_1$ | 1.376 (0.084) | - |
| $p$ | 0.280 (0.077) | - |
| AIC | 828.214 | 826.585 |
| CAIC | 850.617 | 844.507 |
Table 7. Kaposi’s sarcoma: Descriptive statistics.

| $n$ | $\bar{y}$ | $s^2$ | $b_1$ | $b_2$ |
|---|---|---|---|---|
| 29131 | 45.396 | 416.487 | 0.313 | 1.936 |
Table 8. Parameter estimates (SE) for FSN and GSN models.

| Parameter | FSN | GSN |
|---|---|---|
| $\mu$ | 17.896 (0.119) | 37.039 (0.139) |
| $\sigma$ | 34.245 (0.171) | 22.052 (0.105) |
| $\lambda$ | 6.016 (0.118) | 4.898 (0.118) |
| $\alpha$ | −1.007 (0.090) | 5.525 (0.138) |
| AIC | 255,356.2 | 253,832.6 |
| CAIC | 255,393.3 | 253,869.7 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Gómez-Déniz, E.; Arnold, B.C.; Sarabia, J.M.; Gómez, H.W. Properties and Applications of a New Family of Skew Distributions. Mathematics 2021, 9, 87. https://doi.org/10.3390/math9010087
