Gumbel–Logistic Unit Distribution with Application in Telecommunications Data Modeling

Stojanović, Vladica S.; Jovanović, Mihailo; Pažun, Brankica; Langović, Zlatko; Grujčić, Željko

doi:10.3390/sym16111513


by Vladica S. Stojanović ¹,*, Mihailo Jovanović ¹, Brankica Pažun ², Zlatko Langović ³ and Željko Grujčić ²

¹ Department of Informatics & Computer Sciences, University of Criminal Investigation and Police Studies, 11000 Belgrade, Serbia

² Department of Informatics, Mathematics and Statistics, Faculty of Engineering Management, 11000 Belgrade, Serbia

³ Department of Business Economy, Faculty of Hotel Management and Tourism, University of Kragujevac, 36210 Vrnjačka Banja, Serbia

* Author to whom correspondence should be addressed.

Symmetry 2024, 16(11), 1513; https://doi.org/10.3390/sym16111513

Submission received: 28 August 2024 / Revised: 26 October 2024 / Accepted: 6 November 2024 / Published: 11 November 2024

(This article belongs to the Special Issue Skewed (Asymmetrical) Probability Distributions and Applications Across Disciplines Fourth Edition)


Abstract:

The manuscript deals with a new unit distribution that depends on two positive parameters. The distribution itself was obtained from the Gumbel distribution, i.e., by its transformation, using generalized logistic mapping, into a unit interval. In this way, the so-called Gumbel-logistic unit (abbr. GLU) distribution is obtained, and its key properties, such as cumulative distribution function, modality, hazard and quantile function, moment-based characteristics, Bayesian inferences and entropy, have been investigated in detail. Among others, it is shown that the GLU distribution, unlike the Gumbel one which is always positively asymmetric, can take both asymmetric forms. An estimation of the parameters of the GLU distribution, based on its quantiles, is also performed, together with asymptotic properties of the estimates thus obtained and their numerical simulation. Finally, the GLU distribution has been applied in modeling the empirical distributions of some real-world data related to telecommunications.

Keywords: unit distributions; Gumbel distribution; logistic mapping; asymmetry; parameter estimation; quantiles; telecommunications

MSC: 60E05; 62E10; 62F10

1. Introduction

Unit stochastic distributions are among the frequently studied areas of contemporary probability theory. They are commonly used as stochastic models that describe the so-called percentage (proportional) variables, which can explain a variety of real-world phenomena (see, as some more recent results, e.g., [1,2,3,4,5,6,7,8]). Still, it is worth pointing out that modeling with unit distributions is very specific, primarily because their values are limited to the unit interval. The procedure for forming unit distributions can be specified in a very general way (see, e.g., [9]). Nevertheless, specific transformations of continuous distributions into a unit interval are most often applied, such as the Teissier [10], Burr Type-X [11], Maxwell-Boltzmann [12], or exponential-based distributions [13,14]. On the other hand, in recent years, half-logistic mappings have often been used in this kind of transformation [15,16,17]. Motivated by this, and using a procedure similar to that in Stojanović et al. [18], a novel unit distribution, called the Gumbel–logistic unit (GLU) distribution, is presented here. This distribution is obtained by the general logistic transformation of the Gumbel distribution into a unit interval, which gives it flexibility and convenience for describing various kinds of empirical distributions.

The definition of the GLU distribution, as well as its basic stochastic properties related to modality, asymmetry, moments, etc., are described in Section 2. In addition, the hazard rate and quantile functions of the GLU distribution are also discussed in that section. After that, Section 3 considers the procedure for estimating the parameters of the GLU distribution based on the sample quantiles. Asymptotic properties of the estimators thus obtained are also examined, along with an appropriate Monte Carlo numerical study. Section 4 presents the application of the GLU distribution in modeling empirical distributions of real-world data, related to telecommunications and machine learning. Finally, Section 5 provides some concluding highlights.

2. The GLU Distribution

In the first part of this section, the concept of GLU distribution is given as well as some of its basic stochastic properties.

2.1. The Definition and Key Properties

We start with a random variable (RV) Y that has a zero-centered Gumbel distribution, i.e., its probability density function (PDF) is given as follows:

g (y; β) = β exp (- β y - exp (- β y)) .

(1)

Here, $y \in R$ and $β > 0$ is the so-called scale parameter of this distribution. As is known, the Gumbel distribution is a special case of generalized extreme value distributions (of the first order) and as such is often used in modeling the maxima of a large number of samples from various distributions. Therefore, the basic idea here is to transform this distribution into a unit interval, where a similar application of the newly created distribution can be found. For this purpose, the so-called general logistic map (of the first kind) is used, defined as

x = φ (y; α) = {(1 + exp (y))}^{- 1 / α},

where $α > 0$. Notice that, for each fixed $α > 0$, the logistic map $φ (\cdot; α) : (- \infty, + \infty) \to (0, 1)$ is a continuous, strictly decreasing bijection, with limit values:

lim_{y \to - \infty} φ (y; α) = 1^{-}, lim_{y \to + \infty} φ (y; α) = 0^{+} .

According to Equation (1) and using the inverse function $y = φ^{- 1} (x; α) = ln (x^{- α} - 1)$, we get a novel RV $X = φ (Y; α)$, defined on the unit interval $(0, 1)$. The PDF of the RV X can be expressed as follows:

f (x; α, β) = g (φ^{- 1} (x; α); β) \cdot | \frac{\partial φ^{- 1} (x; α)}{\partial x} | = \frac{α β x^{α β - 1}}{{(1 - x^{α})}^{β + 1}} exp (- \frac{x^{α β}}{{(1 - x^{α})}^{β}}),

(2)

where $x \in (0, 1)$ and $\int_{0}^{1} f (x; α, β) d x = 1$. We say that the RV X, with the PDF given by Equation (2), has a Gumbel–logistic unit (GLU) distribution with the parameters $α, β > 0$, which will be further denoted as $X : G (α, β)$. Note that the GLU distribution depends on two parameters, i.e., besides the scale parameter $β > 0$, it also has a shape parameter $α > 0$. Moreover, it is worth pointing out that it has similarities with the initial Gumbel distribution. However, as is known, the Gumbel distribution is unimodal and positively asymmetric, while the GLU distribution has some other special characteristics. We first describe some of them in the following theorem.

Theorem 1.

Let $X : G (α, β)$ be the RV with the GLU distribution, whose PDF is defined by Equation (2). Then the following statements are valid:

$(i)$: When $α β > 1$, the RV X is unimodal and both-sides vanishing, that is,

$lim_{x \to 0^{+}} f (x; α, β) = lim_{x \to 1^{-}} f (x; α, β) = 0^{+} .$

(3)

$(ii)$: When $α β \leq 1$ and the function

$P (z; α, β) : = α β z^{β} (z + 1) - α (β + 1) z + 1 - α β$

has at least one positive root $z = z_{0} > 0$, the RV X is unimodal, left-tailed and right-side vanishing, that is,

$lim_{x \to 0^{+}} f (x; α, β) > 0, lim_{x \to 1^{-}} f (x; α, β) = 0^{+} .$

(4)

$(iii)$: Otherwise, the PDF of the RV X strictly decreases on $(0, 1)$, with boundary properties as in Equation (4).

Proof.

We use a procedure similar to that in Stojanović et al. [18]. First, after some algebraic computation, for the first partial derivative of the function $f (x; α, β)$ with respect to x, one obtains:

\frac{\partial f (x; α, β)}{\partial x} = \frac{α β x^{α β - 2}}{{(1 - x^{α})}^{2 (β + 1)}} exp (- \frac{x^{α β}}{{(1 - x^{α})}^{β}}) ({(1 - x^{α})}^{β} (α β + (α + 1) x^{α} - 1) - α β x^{α β}) .

Thus, the equation $\partial f (x; α, β) / \partial x = 0$ is equivalent to $ψ (x; α, β) = 0$, where:

ψ (x; α, β) = {(1 - x^{α})}^{β} (α β + (α + 1) x^{α} - 1) - α β x^{α β} .

After the substitution $z = x^{α} / (1 - x^{α}) > 0$ (so that $x^{α} = z / (1 + z)$ and $1 - x^{α} = 1 / (1 + z)$) and some rearrangement, the previous equation is easily transformed into the equivalent one $P (z; α, β) = 0$, i.e.,

α β z^{β + 1} + α β z^{β} - α (β + 1) z + 1 - α β = 0 .

(5)

Based on it, by applying Descartes’ rule of signs, the following cases can be observed:

$(i)$: When $α β > 1$, there exists exactly one solution $z = z_{0}$ of Equation (5). Therefore, it obviously represents the unique mode of the GLU distribution.
$(ii)$: When $α β \leq 1$, Equation (5) can have 0 or 2 different solutions, and the other two cases mentioned in the theorem are then obtained.

Finally, according to:

\lim_{x \to 0^{+}} f (x; α, β) = \{\begin{matrix} 0^{+}, & α β > 1 \\ 1, & α β = 1 \\ + \infty, & α β < 1 \end{matrix}, \lim_{x \to 1^{-}} f (x; α, β) = 0^{+},

the boundary conditions given by Equations (3) and (4) follow, and the theorem is proved completely. □

Remark 1.

Note that the preceding theorem describes three basic forms of the GLU distribution, which are also shown in Figure 1a. As can be easily observed, the GLU distribution can have various shapes: apart from the typical one with a “peak”, which is close to the Gumbel distribution, there is also a decreasing one. Let us also note that the form of the GLU distribution depends primarily on the number of solutions to Equation (5). Thus, for the standard Gumbel distribution with $β = 1$, this equation becomes the quadratic one

α z^{2} - α z + 1 - α = 0,

and a mode exists if and only if its discriminant satisfies $α^{2} - 4 α (1 - α) \geq 0$, i.e., $α \geq 4 / 5$. Still, a more precise examination of the modality of the GLU distribution is presented below.
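The root count behind Theorem 1 and Remark 1 can be explored numerically. The following Python sketch is only an illustration: the helper names `P` and `count_positive_roots`, the grid bounds and the grid size are assumptions made here, and counting sign changes on a grid is a heuristic (it cannot detect a double root such as the one at $α = 4 / 5$, $β = 1$), not a proof.

```python
def P(z, alpha, beta):
    """Left-hand side of Equation (5)."""
    return (alpha * beta * z ** (beta + 1) + alpha * beta * z ** beta
            - alpha * (beta + 1) * z + 1 - alpha * beta)


def count_positive_roots(alpha, beta, z_min=1e-6, z_max=1e3, n=20000):
    """Count sign changes of P on a geometric grid in [z_min, z_max].

    Heuristic only: roots outside the grid or roots of even
    multiplicity are not detected.
    """
    ratio = (z_max / z_min) ** (1.0 / n)
    count = 0
    p_prev = P(z_min, alpha, beta)
    for i in range(1, n + 1):
        p_curr = P(z_min * ratio ** i, alpha, beta)
        if p_prev * p_curr < 0:
            count += 1
        p_prev = p_curr
    return count


# beta = 1: compare the root count with the discriminant condition of Remark 1
for alpha in (0.5, 0.9, 2.0):
    disc = alpha ** 2 - 4 * alpha * (1 - alpha)
    print(alpha, count_positive_roots(alpha, 1.0), disc >= 0)
```

For $β = 1$ the count is positive exactly for those tested values of α whose discriminant is non-negative, in line with the condition $α \geq 4 / 5$.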

Remark 2.

The bounds of the (possible) roots of Equation (5) can be additionally determined by applying well-known procedures (see, e.g., Theorem 3.2.2 in Milovanović [19]). Thus, when $β > 1$, for the root bounds of this equation one obtains:

1 + \frac{α (β + 1)}{1 - α β} < z < 1 + \sqrt{\frac{β + 1}{β}},

and consequently, Equation (5) has no real roots in the case when:

\frac{α^{2} {(β + 1)}^{2}}{{(1 - α β)}^{2}} > \frac{β + 1}{β} .

By solving this inequality with respect to α and using Theorem 1, the sufficient condition for the GLU distribution to be strictly decreasing is as follows:

\frac{1}{β + \sqrt{β (β + 1)}} < α \leq \frac{1}{β} .

Similarly, when $0 < β < 1$, for the interval of root bounds of Equation (5) we get:

1 + \sqrt{\frac{α (β + 1)}{1 - α β}} < z < 1 + \frac{β + 1}{β},

so the sufficient condition for the GLU distribution to be decreasing is then:

\frac{β + 1}{β (2 β + 1)} < α \leq \frac{1}{β} .

According to Equation (2), the cumulative distribution function (CDF) of the GLU distribution is easily calculated:

F (x; α, β) : = P {X < x} = \int_{0}^{x} f (t; α, β) d t = 1 - exp (- \frac{x^{α β}}{{(1 - x^{α})}^{β}}),

(6)

where $0 < x < 1$. Figure 1b shows the plots of this function for some parameter values $α, β > 0$. Obviously, the function $F (x; α, β)$ is well defined on the unit interval, since

lim_{x \to 0^{+}} F (x; α, β) = 0^{+}, lim_{x \to 1^{-}} F (x; α, β) = 1^{-},

for all $α, β > 0$. Using the CDF $F (x; α, β)$, one can simply prove the following statement regarding the asymmetry conditions for the GLU distribution:

Theorem 2.

Let $X : G (α, β)$ be the RV with the GLU distribution, and let

ξ (β) : = \frac{1}{c} ln (1 + c^{- 1 / β})

be a function of $β > 0$, where $c : = ln 2 \approx 0.69315$. Then, the GLU distribution is positively asymmetric when $α < ξ (β)$, and vice versa, it is negatively asymmetric when $α > ξ (β)$.

Proof.

Note that the CDF of the GLU distribution, given by Equation (6), is strictly increasing on the unit interval. Therefore, the median $m \in (0, 1)$ can be easily obtained as the solution of the equation:

F (m; α, β) = \int_{0}^{m} f (x; α, β) d x = 1 - exp (- \frac{m^{α β}}{{(1 - m^{α})}^{β}}) = \frac{1}{2} .

(7)

Consequently, the RV $X : G (α, β)$ will be positively asymmetric if and only if

m = F^{- 1} (1 / 2; α, β) < 1 / 2,

that is, $F (1 / 2; α, β) > 1 / 2$. Since $x^{α} / (1 - x^{α}) = 1 / (2^{α} - 1)$ at $x = 1 / 2$, the last inequality is equivalent to ${(2^{α} - 1)}^{- β} > ln 2$, which after some algebraic calculation indeed gives $α < ξ (β)$. The corresponding inequality for the negatively asymmetric GLU distribution is obtained in a similar way. □

Remark 3.

The previous theorem gives the parameter dependence $α = ξ (β)$ that separates the regions in which the GLU distribution has different skewness. It is shown in Figure 2a, along with the dependence $α β = 1$ mentioned in Theorem 1, which indicates different shapes of the GLU distribution. Thereby, the point $β_{0} \approx 0.70147$, which represents the solution of the equation $ξ (β) = β^{- 1}$, is clearly visible. It should be emphasized that the dependence $α = ξ (β)$ does not indicate the symmetry of the GLU distribution. As an illustration, Figure 2b shows graphs of PDFs for which this dependence holds, that is, for which the median is equal to $m = 1 / 2$. Note that the “true” symmetry of the GLU distributed RV X does not exist, which is also confirmed by its PDF given by Equation (2). In this way, asymmetry represents one of the essential characteristics of this distribution.
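The boundary $α = ξ (β)$ from Theorem 2 and the point $β_{0}$ from Remark 3 can be reproduced with a few lines of Python. This is a minimal sketch: the function names, the bisection bracket $[0.1, 2]$ and the tolerance are choices made here for illustration, not part of the original derivation.

```python
import math

C = math.log(2.0)  # c = ln 2 from Theorem 2


def xi(beta):
    """Boundary function xi(beta) = ln(1 + c^(-1/beta)) / c."""
    return math.log(1.0 + C ** (-1.0 / beta)) / C


def cdf(x, alpha, beta):
    """CDF of the GLU distribution, Equation (6)."""
    return 1.0 - math.exp(-(x ** alpha / (1.0 - x ** alpha)) ** beta)


def solve_beta0(lo=0.1, hi=2.0, tol=1e-12):
    """Bisection for xi(beta) = 1 / beta on an assumed bracket [lo, hi]."""
    def g(b):
        return xi(b) - 1.0 / b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


beta0 = solve_beta0()
print(beta0)                          # to be compared with the value in Remark 3
print(cdf(0.5, xi(1.5), 1.5))         # median equals 1/2 on the boundary
print(cdf(0.5, 0.5 * xi(1.5), 1.5))   # alpha < xi(beta): F(1/2) > 1/2
```

On the curve $α = ξ (β)$ the CDF at $x = 1 / 2$ equals $1 / 2$, while smaller values of α push $F (1 / 2; α, β)$ above $1 / 2$, i.e., into the positively asymmetric region.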

2.2. Bayesian Inference

In this section, Bayesian inference for the GLU distribution is briefly analyzed. To this end, let us assume that $α = 1$ and that the RV $X : G (1, β)$ serves as the prior distribution for the (binomial) parameter $p \in (0, 1)$, that is,

π (p) : = f (p; 1, β) = \frac{β p^{β - 1}}{{(1 - p)}^{β + 1}} exp (- \frac{p^{β}}{{(1 - p)}^{β}}),

(8)

is a prior for the sample (binomial) distribution:

s (k | p) = \binom{n}{k} p^{k} {(1 - p)}^{n - k},

(9)

where $k = 0, 1, \dots, n$.

As is known, the posterior distribution at $k = 0, 1, \dots, n$ is as follows:

s (p | k) = \frac{s (k | p) π (p)}{s (k)},

where $s (k) = \int_{0}^{1} s (k | p) π (p) d p$ is the marginal distribution. Applying Equations (8) and (9), for the posterior distribution one obtains:

s (p | k) = C^{- 1} \frac{p^{β + k - 1}}{{(1 - p)}^{β + k - n + 1}} exp (- \frac{p^{β}}{{(1 - p)}^{β}}), 0 < p < 1,

where

C = C (k, β) : = \int_{0}^{1} \frac{p^{β + k - 1}}{{(1 - p)}^{β + k - n + 1}} exp (- \frac{p^{β}}{{(1 - p)}^{β}}) d p

is the normalizing constant. Let us notice that the posterior distribution thus obtained is similar to the GLU distribution, and corresponding procedures for Bayesian inference can be performed for some other stochastic distributions.
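Since the normalizing constant has no closed form here, the posterior can be evaluated numerically. The sketch below uses a plain midpoint rule on $(0, 1)$; the function names, the grid size and the example values ($n = 10$, $β = 2$) are assumptions made for illustration, and any standard quadrature routine could be used instead.

```python
import math


def prior(p, beta):
    """Prior density pi(p) = f(p; 1, beta) from Equation (8)."""
    w = p / (1.0 - p)
    return beta * p ** (beta - 1) / (1.0 - p) ** (beta + 1) * math.exp(-w ** beta)


def posterior_grid(k, n, beta, m=20000):
    """Midpoint-rule approximation of the posterior s(p | k) on m cells."""
    h = 1.0 / m
    grid = [(i + 0.5) * h for i in range(m)]
    # The binomial coefficient cancels between s(k | p) and s(k), so it is omitted.
    unnorm = [p ** k * (1.0 - p) ** (n - k) * prior(p, beta) for p in grid]
    norm = sum(unnorm) * h
    return grid, [u / norm for u in unnorm]


def posterior_mean(k, n, beta, m=20000):
    """Posterior mean of p, approximated on the same grid."""
    grid, dens = posterior_grid(k, n, beta, m)
    return sum(p * d for p, d in zip(grid, dens)) / m


print(posterior_mean(2, 10, 2.0), posterior_mean(8, 10, 2.0))
```

As expected, observing more successes k out of n shifts the posterior mean of p upwards.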

2.3. Moment-Based Characteristics

Useful tools for measuring and analysing some properties of stochastic distributions are often based on their moments. Therefore, here we first examine the so-called r-th incomplete moments of the GLU distribution, defined by:

M_{r} (x; α, β) : = \int_{0}^{x} u^{r} f (u; α, β) d u,

where $r \in N$ and $x \in (0, 1)$. Note that then:

0 < M_{r} (x; α, β) \leq \int_{0}^{1} f (x; α, β) d x = 1,

so the $r^{th}$ incomplete moments exist for any $r \in N$, but they cannot be computed in a closed form. To this end, it is convenient to perform their series expansion, given by the following theorem.

Theorem 3.

The incomplete $r^{th}$ moment of the GLU distributed RV $X : G (α, β)$ can be expressed as follows:

M_{r} (x; α, β) = \{\begin{matrix} \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) γ (\frac{r}{α β} + \frac{j}{β} + 1, \frac{x^{α}}{1 - x^{α}}), & 0 < x \leq 2^{- 1 / α} \\ \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) [γ (\frac{r}{α β} + \frac{j}{β} + 1, 1) + Γ (- \frac{j}{β} + 1, 1) - Γ (- \frac{j}{β} + 1, \frac{x^{α}}{1 - x^{α}})], & x > 2^{- 1 / α} \end{matrix},

(10)

where $γ (a, b) : = \int_{0}^{b} t^{a - 1} exp (- t) d t$ and $Γ (a, b) : = \int_{b}^{\infty} t^{a - 1} exp (- t) d t$ are the lower and upper incomplete gamma functions, respectively.

Proof.

According to the definition of the $r^{th}$ incomplete moment, one obtains:

M_{r} (x; α, β) = α β \int_{0}^{x} \frac{u^{r + α β - 1}}{{(1 - u^{α})}^{β + 1}} exp [- {(\frac{u^{α}}{1 - u^{α}})}^{β}] d u = \int_{0}^{x^{α} / (1 - x^{α})} \frac{t^{\frac{r}{α β}}}{{(1 + t^{\frac{1}{β}})}^{\frac{r}{α}}} exp (- t) d t,

(11)

where $t = {(u^{α} / (1 - u^{α}))}^{β}$. Now, the following two cases can be considered:

$(i)$: When $0 < x^{α} / (1 - x^{α}) \leq 1$ , that is $0 < x \leq 2^{- 1 / α}$ , by applying the generalized binomial formula, integral in Equation (11) can be rewritten as follows:

$M_{r} (x; α, β) = \int_{0}^{x^{α} / (1 - x^{α})} t^{\frac{r}{α β}} {(1 + t^{\frac{1}{β}})}^{- \frac{r}{α}} exp (- t) d t = \int_{0}^{x^{α} / (1 - x^{α})} \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) t^{\frac{r}{α β} + \frac{j}{β}} exp (- t) d t .$

(12)

Thereby, for an arbitrary but fixed $x \in (0, 1)$ , the following inequality holds:

$\sum_{j = 0}^{+ \infty} \int_{0}^{x^{α} / (1 - x^{α})} | (\binom{- r / α}{j}) t^{\frac{r}{α β} + \frac{j}{β}} exp (- t) d t | \leq \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) \int_{0}^{x^{α} / (1 - x^{α})} d t = \frac{x^{α}}{2^{r / α} (1 - x^{α})} < + \infty,$

so the order of integration and summation can be interchanged (see, e.g., Theorem 1.38 in Rudin [20]). Thus, according to Equation (12) and after some computations, the first equality in Equation (10) is easily obtained.
$(ii)$: When $x^{α} / (1 - x^{α}) > 1$, i.e., $x > 2^{- 1 / α}$, by using a similar procedure as above, the integral in Equation (11) becomes:

$\begin{matrix} M_{r} (x; α, β) & = \int_{0}^{1} t^{\frac{r}{α β}} {(1 + t^{\frac{1}{β}})}^{- \frac{r}{α}} exp (- t) d t + \int_{1}^{x^{α} / (1 - x^{α})} {(1 + t^{- \frac{1}{β}})}^{- \frac{r}{α}} exp (- t) d t \\ = \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) γ (\frac{r}{α β} + \frac{j}{β} + 1, 1) + \int_{1}^{x^{α} / (1 - x^{α})} \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) t^{- \frac{j}{β}} exp (- t) d t . \end{matrix}$

(13)

Similar to the previous case, the following inequality holds:

$\sum_{j = 0}^{+ \infty} \int_{1}^{x^{α} / (1 - x^{α})} | (\binom{- r / α}{j}) t^{- \frac{j}{β}} exp (- t) d t | \leq exp (- 1) \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) \int_{1}^{x^{α} / (1 - x^{α})} d t = \frac{2 x^{α} - 1}{2^{r / α} e (1 - x^{α})} < + \infty,$

so changing the order of the integral and the sum in the last expression of Equation (13) implies the second equality in Equation (10).

□

Using the previous theorem, one can easily see that the “ordinary” $r^{th}$ moments of the GLU distribution, defined by the equality:

μ_{r} (α, β) : = E (X^{r}) = \int_{0}^{1} u^{r} f (u; α, β) d u,

can be obtained from the incomplete moments, in the limiting case when $x \to 1$. Therefore, according to the second equality in Equation (10), in which the last upper incomplete gamma term then vanishes, the following corollary immediately follows:

Corollary 1.

The $r^{th}$ moment of the RV X with the GLU distribution $G (α, β)$ can be expressed as follows:

μ_{r} (α, β) = \sum_{j = 0}^{+ \infty} (\binom{- r / α}{j}) [γ (\frac{r}{α β} + \frac{j}{β} + 1, 1) + Γ (- \frac{j}{β} + 1, 1)],

(14)

where $r \in N$.

Remark 4.

It is worth noting that Equation (14) can be suitable for numerical (approximate) calculation of the moments of the GLU distribution, first of all its mean value $E (X) = μ_{1} (α, β)$ and the variance $Var (X) = μ_{2} (α, β) - {(μ_{1} (α, β))}^{2}$. For instance, when $α = 2$ and $β = 1$, their approximate values (calculated by summing $10^{5}$ terms in Equation (14)) are, respectively, $E (X) \approx 0.60346$ and $Var (X) \approx 0.03986$. In a similar way one can calculate the other statistical measures of the GLU distribution. Thus, the skewness coefficient and the kurtosis are, respectively,

\begin{matrix} S (α, β) & = \frac{E {(X - μ_{1} (α, β))}^{3}}{{[Var (X)]}^{3 / 2}} = \frac{μ_{3} (α, β) - 3 μ_{1} (α, β) Var (X) - {(μ_{1} (α, β))}^{3}}{{[Var (X)]}^{3 / 2}}, \\ K (α, β) & = \frac{E {(X - μ_{1} (α, β))}^{4}}{{[Var (X)]}^{2}} = \frac{μ_{4} (α, β) - 4 μ_{3} (α, β) μ_{1} (α, β) + 6 μ_{2} (α, β) {(μ_{1} (α, β))}^{2} - 3 {(μ_{1} (α, β))}^{4}}{{[Var (X)]}^{2}} . \end{matrix}
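As an independent cross-check of series-based values, the moments can also be approximated by direct quadrature. The sketch below does not implement the series in Equation (14); instead it relies on the identity $E (X^{r}) = \int_{0}^{1} {(Q (p; α, β))}^{r} d p$, where Q is the inverse of the CDF in Equation (6), and on a simple midpoint rule. The function names and the grid size are assumptions made here.

```python
import math


def quantile(p, alpha, beta):
    """Inverse of the CDF in Equation (6), obtained by solving F(x) = p for x."""
    w = (-math.log(1.0 - p)) ** (1.0 / beta)   # w = x^alpha / (1 - x^alpha)
    return (w / (1.0 + w)) ** (1.0 / alpha)


def raw_moment(r, alpha, beta, m=100000):
    """E(X^r) as the integral of Q(p)^r over (0, 1), midpoint rule with m cells."""
    h = 1.0 / m
    return h * sum(quantile((i + 0.5) * h, alpha, beta) ** r for i in range(m))


def summary(alpha, beta):
    """Mean, variance, skewness and kurtosis via the formulas of Remark 4."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, beta) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m1 * var - m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt


print(summary(2.0, 1.0))
```

The quadrature approach trades the slowly converging binomial series for a one-dimensional integral, which is usually easier to control numerically.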

Remark 5.

In economics, incomplete moments are often used as tools for measuring income inequality. By using them, the Lorenz curve can be defined:

L (x; α, β) : = \frac{M_{1} (x; α, β)}{μ_{1} (α, β)} = \frac{1}{μ_{1} (α, β)} \int_{0}^{x} u f (u; α, β) d u,

where $μ_{1} (α, β) = E (X)$ and $M_{1} (x; α, β)$ is the incomplete first moment of the GLU distribution. Figure 3a shows graphs of Lorenz curves for some values of the GLU distribution parameters. Thereby, a Lorenz curve “closer” to the diagonal line indicates a more uniform distribution of income, while, conversely, a greater distance from the diagonal implies its greater unevenness. As can be seen, the GLU distribution shows significant versatility in describing the behaviour of these phenomena.
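The Lorenz curve can be evaluated numerically without the series expansion. The following sketch assumes the change of variable $u = Q (v)$, under which $M_{1} (x; α, β) = \int_{0}^{F (x)} Q (v; α, β) d v$, with F the CDF from Equation (6) and Q its inverse; the helper names and the midpoint rule are illustrative choices, not the procedure used for Figure 3a.

```python
import math


def cdf(x, alpha, beta):
    """CDF of the GLU distribution, Equation (6)."""
    return 1.0 - math.exp(-(x ** alpha / (1.0 - x ** alpha)) ** beta)


def quantile(p, alpha, beta):
    """Inverse of the CDF, obtained by solving F(x) = p for x."""
    w = (-math.log(1.0 - p)) ** (1.0 / beta)
    return (w / (1.0 + w)) ** (1.0 / alpha)


def integral_of_quantile(p_max, alpha, beta, m=20000):
    """Midpoint rule for the integral of Q(v) over (0, p_max)."""
    h = p_max / m
    return h * sum(quantile((i + 0.5) * h, alpha, beta) for i in range(m))


def lorenz(x, alpha, beta):
    """L(x) = M_1(x) / mu_1, with M_1(x) written as an integral of Q over (0, F(x))."""
    return (integral_of_quantile(cdf(x, alpha, beta), alpha, beta)
            / integral_of_quantile(1.0, alpha, beta))


for x in (0.3, 0.6, 0.9):
    print(x, lorenz(x, 2.0, 1.0))
```

The curve is increasing in x and stays below the share $F (x; α, β)$ of the population, which is the usual shape of a Lorenz curve.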

The entropy of the GLU distribution, as a measure of uncertainty variation, is briefly described below. One of the commonly used entropy measures, which also generalizes various other forms of entropy, is the so-called Rényi entropy. In the case of the GLU distributed RV X, the Rényi entropy is given by the equality:

H_{λ} (X) : = \frac{1}{1 - λ} ln \int_{0}^{1} {(f (x; α, β))}^{λ} d x,

where $λ \in (0, + \infty) ∖ \{1\}$. Using the definition of the RV $X : G (α, β)$, i.e., its PDF given by Equation (2), after certain computations similar to the previous ones, one obtains:

H_{λ} (X) : = - ln (α β) + \frac{1}{1 - λ} [ln I (α, β, λ) - ln λ]

where:

\begin{matrix} I (α, β, λ) & : = \int_{0}^{1} \frac{x^{(α β - 1) λ}}{{(1 - x^{α})}^{(β + 1) λ}} exp (- \frac{λ x^{α β}}{{(1 - x^{α})}^{β}}) d x = \sum_{j = 0}^{+ \infty} (\binom{(λ - 1) (1 + \frac{1}{α})}{j}) [λ^{(1 - λ) (1 - \frac{1}{α β}) - \frac{j}{β}} \times \\ \times γ ((λ - 1) (1 - \frac{1}{α β}) + \frac{j}{β} + 1, λ) + λ^{(1 - λ) (1 + \frac{1}{α}) + \frac{j}{β}} Γ ((λ - 1) (1 + \frac{1}{α}) - \frac{j}{β} + 1, λ)] . \end{matrix}

From here, in the limit case when $λ \to 1$, one obtains the well-known Shannon entropy:

H (X) : = E (- ln f (X)) = lim_{λ \to 1} H_{λ} (X) = - ln (α β) - \frac{1}{I (α, β, λ)} \cdot \frac{\partial I (α, β, λ)}{\partial λ} - 1,

where the above partial derivatives exist, but we have omitted them due to their complexity. Figure 3b shows a 3D plot of the Shannon entropy, depending on the parameters $(α, β)$. Obviously, high entropy values correspond to lower parameter values, and vice versa.
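Both entropies can be approximated directly from their definitions, which avoids the omitted partial derivatives. The sketch below is an assumption-laden illustration: it uses the identities $H (X) = - \int_{0}^{1} ln f (Q (p)) d p$ and $\int_{0}^{1} f^{λ} d x = \int_{0}^{1} {(f (Q (p)))}^{λ - 1} d p$, a midpoint rule, and function names introduced here; it does not implement the series for $I (α, β, λ)$.

```python
import math


def log_pdf_at_quantile(p, alpha, beta):
    """ln f(Q(p); alpha, beta), expressed through T = -ln(1 - p)."""
    t = -math.log(1.0 - p)
    w = t ** (1.0 / beta)                          # w = x^alpha / (1 - x^alpha)
    log_x = (math.log(w) - math.log1p(w)) / alpha  # ln x, since x^alpha = w / (1 + w)
    return (math.log(alpha * beta) + (alpha * beta - 1.0) * log_x
            + (beta + 1.0) * math.log1p(w) - t)


def shannon_entropy(alpha, beta, m=100000):
    """H(X) = -E[ln f(X)], midpoint rule over p in (0, 1)."""
    h = 1.0 / m
    return -h * sum(log_pdf_at_quantile((i + 0.5) * h, alpha, beta)
                    for i in range(m))


def renyi_entropy(lam, alpha, beta, m=100000):
    """H_lam(X) = ln(E[f(X)^(lam - 1)]) / (1 - lam), for lam != 1."""
    h = 1.0 / m
    integral = h * sum(
        math.exp((lam - 1.0) * log_pdf_at_quantile((i + 0.5) * h, alpha, beta))
        for i in range(m))
    return math.log(integral) / (1.0 - lam)


print(shannon_entropy(2.0, 1.0))
print(renyi_entropy(1.001, 2.0, 1.0), renyi_entropy(2.0, 2.0, 1.0))
```

For λ close to 1 the Rényi value approaches the Shannon one, and both are non-positive, as for any density supported on the unit interval.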

2.4. Hazard Rate and Quantile Functions

In this part, we consider two important functions which additionally characterize the GLU distribution. First, the hazard rate function (HRF) of the GLU distribution can be easily obtained according to Equations (2) and (6), as follows:

h (x; α, β) : = \frac{f (x; α, β)}{1 - F (x; α, β)} = \frac{α β x^{α β - 1}}{{(1 - x^{α})}^{β + 1}} .

(15)

Obviously, the function $h (x; α, β)$ strictly increases when $α β \geq 1$, which means that the probability of failure of the designed system then increases. On the other hand, when $α β < 1$, this HRF has two vertical asymptotes, at $x = 0$ and $x = 1$, i.e., it is bathtub-shaped. Then there is the so-called declining failure rate (DFR), which describes the phenomenon where the probability of failure decreases over an initial part of the unit interval. Obviously, both of these cases can be obtained from the GLU distribution for certain values of its parameters, as can also be seen in Figure 4a.
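The two regimes of the HRF can be checked on a grid. In the sketch below, `pdf`, `cdf` and `hazard` follow Equations (2), (6) and (15); the grid of 99 points and the helper `argmin_on_grid` are assumptions made for illustration.

```python
import math


def pdf(x, alpha, beta):
    """PDF of the GLU distribution, Equation (2)."""
    w = x ** alpha / (1.0 - x ** alpha)
    return (alpha * beta * x ** (alpha * beta - 1.0)
            / (1.0 - x ** alpha) ** (beta + 1.0) * math.exp(-w ** beta))


def cdf(x, alpha, beta):
    """CDF of the GLU distribution, Equation (6)."""
    return 1.0 - math.exp(-(x ** alpha / (1.0 - x ** alpha)) ** beta)


def hazard(x, alpha, beta):
    """Closed form of the HRF, Equation (15)."""
    return (alpha * beta * x ** (alpha * beta - 1.0)
            / (1.0 - x ** alpha) ** (beta + 1.0))


grid = [i / 100.0 for i in range(1, 100)]


def argmin_on_grid(alpha, beta):
    """Grid point at which the HRF is smallest."""
    return min(grid, key=lambda x: hazard(x, alpha, beta))


print(argmin_on_grid(2.0, 1.0))   # alpha * beta >= 1: minimum at the left end
print(argmin_on_grid(0.5, 1.0))   # alpha * beta < 1: interior minimum (bathtub)
```

When $α β \geq 1$ the smallest value sits at the left end of the grid, while for $α β < 1$ it moves to the interior, which is the bathtub shape described above.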

Similar to some previous results on unit distributions (see, e.g., Bakouch et al. [13] or Biçer et al. [12]), one typical HRF characterization of the GLU distribution is given as follows:

Theorem 4.

The RV X has the GLU distribution $G (α, β)$, with the HRF $h (x; α, β)$ defined by Equation (15), if and only if this HRF satisfies the differential equation:

\frac{\partial h (x; α, β)}{\partial x} \cdot \frac{1}{{(h (x; α, β))}^{2}} = \frac{{(1 - x^{α})}^{β} ((α + 1) x^{α} + α β - 1)}{α β x^{α β}} .

(16)

Proof.

We use a procedure similar to that in Bakouch et al. [13], and first assume that the RV X has the GLU distribution $G (α, β)$. Thus, the logarithm of its PDF, given by Equation (2), is as follows:

ln f (x; α, β) = ln (α β) + (α β - 1) ln x - (β + 1) ln (1 - x^{α}) - \frac{x^{α β}}{{(1 - x^{α})}^{β}},

and taking the first derivative with respect to $x \in (0, 1)$, we get:

\frac{\partial f (x; α, β)}{\partial x} \cdot \frac{1}{f (x; α, β)} = \frac{α β - 1}{x} + \frac{α (β + 1) x^{α - 1}}{1 - x^{α}} - \frac{α β x^{α β - 1}}{{(1 - x^{α})}^{β + 1}} .

Therefore, according to the well-known property of the HRF (see, e.g., Salinas et al. [21]), and using Equation (15), it follows:

\begin{matrix} \frac{\partial h (x; α, β)}{\partial x} \cdot \frac{1}{h (x; α, β)} & = \frac{\partial f (x; α, β)}{\partial x} \cdot \frac{1}{f (x; α, β)} + h (x; α, β) \\ & = \frac{α β - 1}{x} + \frac{α (β + 1) x^{α - 1}}{1 - x^{α}} . \end{matrix}

(17)

Thus, the last equality and Equation (15), after simple calculation, yield Equation (16).

Otherwise, let us assume that Equation (16) holds, which after integration becomes:

- \frac{1}{h (x; α, β)} = - \frac{{(1 - x^{α})}^{β + 1}}{α β x^{α β - 1}},

and the HRF $h (x; α, β)$ is obtained as in Equation (15). Moreover, by substituting this function into Equation (17) and after integration, one obtains:

\begin{matrix} ln f (x; α, β) & = \int \frac{\partial f (x; α, β)}{\partial x} \cdot \frac{d x}{f (x; α, β)} = \int (\frac{\partial h (x; α, β)}{\partial x} \cdot \frac{1}{h (x; α, β)} - h (x; α, β)) d x \\ = \int (\frac{α β - 1}{x} + \frac{α (β + 1) x^{α - 1}}{1 - x^{α}} - \frac{α β x^{α β - 1}}{{(1 - x^{α})}^{β + 1}}) d x + C_{1}, \end{matrix}

that is

f (x; α, β) = \frac{α β x^{α β - 1}}{{(1 - x^{α})}^{β + 1}} exp (C_{1} - \frac{x^{α β}}{{(1 - x^{α})}^{β}}) .

After another integration, it follows:

F (x; α, β) = \int f (x; α, β) d x + C_{2} = C_{2} - exp (C_{1} - \frac{x^{α β}}{{(1 - x^{α})}^{β}}),

so according to the equalities $F (0; α, β) = 0$ and $F (1; α, β) = 1$, one easily obtains $C_{1} = 0$ and $C_{2} = 1$. Therefore, $F (x; α, β)$ is the CDF of the RV $X : G (α, β)$, and the theorem is completely proved. □

Below we consider the so-called quantile function (QF) of the GLU distributed RV X. As is known, the QF represents the inverse function of the corresponding CDF:

Q (p; α, β) : = F^{- 1} (p; α, β) = {(1 - \frac{1}{1 + {(- ln (1 - p))}^{1 / β}})}^{1 / α},

(18)

where $p = F (x; α, β)$. The plot of the QF for some values of the GLU parameters is also shown in Figure 4b. Note that the QF represents a useful tool in further investigating the features of the GLU distribution, as follows.
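Because the QF is available in closed form, GLU-distributed values can be generated by inverse-transform sampling. The following is a minimal sketch under that assumption; the function `sample`, the seed and the sample size are illustrative choices and are not taken from the simulation study.

```python
import math
import random


def cdf(x, alpha, beta):
    """CDF of the GLU distribution, Equation (6)."""
    return 1.0 - math.exp(-(x ** alpha / (1.0 - x ** alpha)) ** beta)


def quantile(p, alpha, beta):
    """Quantile function of the GLU distribution, Equation (18)."""
    w = (-math.log(1.0 - p)) ** (1.0 / beta)
    return (1.0 - 1.0 / (1.0 + w)) ** (1.0 / alpha)


def sample(n, alpha, beta, seed=0):
    """Inverse-transform sampling: X = Q(U), with U uniform on [0, 1)."""
    rng = random.Random(seed)
    return [quantile(rng.random(), alpha, beta) for _ in range(n)]


xs = sample(20000, 2.0, 1.5)
median = quantile(0.5, 2.0, 1.5)
print(median, sum(x < median for x in xs) / len(xs))
```

About half of the simulated values fall below the theoretical median, and applying the CDF to $Q (p; α, β)$ returns p, which is a quick consistency check of Equations (6) and (18).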

Theorem 5.

Let $X : G (α, β)$ be the GLU distributed RV, whose QF is defined by Equation (18). Then, the RV X is unimodal if and only if its parameters $(α, β)$ satisfy the equality:

α = {[β (1 + ln (1 - p_{0})) + {(- ln (1 - p_{0}))}^{1 / β} (1 + β + β ln (1 - p_{0}))]}^{- 1},

(19)

where $\partial^{3} Q (p_{0}; α, β) / \partial p^{3} > 0$, $p_{0} = F (x_{0}; α, β)$ and $x_{0} \in (0, 1)$ is the mode of the GLU distribution. Otherwise, the PDF of X is a strictly decreasing function.

Proof.

Applying the derivative rule of the inverse function (together with the chain rule, since $\partial x / \partial p = 1 / f (x; α, β)$), the derivatives of the QF $Q (p; α, β)$ up to the third order are as follows:

\begin{matrix} \frac{\partial Q (p; α, β)}{\partial p} & = \frac{1}{\partial F (x; α, β) / \partial x} = \frac{1}{f (x; α, β)}, \\ \frac{\partial^{2} Q (p; α, β)}{\partial p^{2}} & = - \frac{\partial f (x; α, β) / \partial x}{f^{3} (x; α, β)}, \\ \frac{\partial^{3} Q (p; α, β)}{\partial p^{3}} & = \frac{3 {(\partial f (x; α, β) / \partial x)}^{2} - f (x; α, β) \partial^{2} f (x; α, β) / \partial x^{2}}{f^{5} (x; α, β)}, \end{matrix}

where $x = Q (p; α, β)$. At the same time, if the value $x = x_{0}$ is the mode of the RV X, it is a critical point of its PDF $f (x; α, β)$, that is, the value $p_{0} = F (x_{0}; α, β)$ satisfies:

\frac{\partial^{2} Q (p_{0}; α, β)}{\partial p^{2}} = 0, \frac{\partial^{3} Q (p_{0}; α, β)}{\partial p^{3}} = - \frac{\partial^{2} f (x_{0}; α, β) / \partial x^{2}}{f^{4} (x_{0}; α, β)} > 0 .

(20)

After some computation, one obtains:

\frac{\partial^{2} Q (p; α, β)}{\partial p^{2}} = \frac{{(- ln (1 - p))}^{\frac{1}{α β} - 2} (1 - α β - α β ln (1 - p) - α {(- ln (1 - p))}^{1 / β} (1 + β + β ln (1 - p)))}{α^{2} β^{2} {(1 - p)}^{2} {(1 + {(- ln (1 - p))}^{1 / β})}^{\frac{1}{α} + 2}},

(21)

so that Equations (20) and (21) obviously give the statement of the theorem. □

Remark 6.

Note that Theorems 1 and 5 provide complete insight into the modality property of the GLU distribution. It is explicitly described by the functional dependence in Equation (19), whose graphs are close to logarithmic spirals, as shown in Figure 5. Presented there, in two different “views”, are polar plots of the parameter dependences $(α, β)$ that give a unimodal GLU distribution, for different but fixed values $p_{0} \in (0, 1)$. In addition, the QF can also describe the asymmetry conditions of the GLU distribution. Namely, the conditions of positive and negative asymmetry can be easily obtained by solving the inequalities $Q (1 / 2; α, β) < 1 / 2$ and $Q (1 / 2; α, β) > 1 / 2$, respectively, and it can simply be proved that they are the same as in Theorem 2.
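The link between the mode and Equation (19) can be verified numerically. The sketch below assumes $α β > 1$, finds a root $z_{0}$ of Equation (5) by bisection, maps it to the mode $x_{0}$ and to $p_{0} = F (x_{0}; α, β)$, and then evaluates the right-hand side of Equation (19); the function names and the bracket are choices made here for illustration.

```python
import math


def P(z, alpha, beta):
    """Left-hand side of Equation (5)."""
    return (alpha * beta * z ** (beta + 1) + alpha * beta * z ** beta
            - alpha * (beta + 1) * z + 1 - alpha * beta)


def mode_root(alpha, beta, lo=1e-9, hi=1e3, iters=200):
    """Bisection for a root of Equation (5).

    Assumes alpha * beta > 1, so that P(lo) < 0 < P(hi).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if P(mid, alpha, beta) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def alpha_from_eq19(p0, beta):
    """Right-hand side of Equation (19)."""
    log_s = math.log(1.0 - p0)             # ln(1 - p0), a negative number
    t = (-log_s) ** (1.0 / beta)
    return 1.0 / (beta * (1.0 + log_s) + t * (1.0 + beta + beta * log_s))


def check(alpha, beta):
    z0 = mode_root(alpha, beta)
    x0 = (z0 / (1.0 + z0)) ** (1.0 / alpha)   # from z = x^alpha / (1 - x^alpha)
    p0 = 1.0 - math.exp(-z0 ** beta)          # p0 = F(x0), Equation (6)
    return x0, p0, alpha_from_eq19(p0, beta)


print(check(2.0, 1.0))
print(check(3.0, 2.0))
```

In both examples the last returned value reproduces the input α, so the pair $(α, β)$ indeed lies on the curve given by Equation (19) for the corresponding $p_{0}$.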

3. Parameters Estimation & Numerical Simulation Study

The procedure for estimating the parameters $α, β > 0$ of the GLU-distributed RV X, based on its observed random sample $x_{1}, \dots, x_{n}$ of length n, is described here. First, we note that, according to the above-mentioned features of the GLU distribution, some well-known estimation procedures are not appropriate here. For instance, according to Equation (10), the moments of the GLU distribution cannot be calculated in a closed form, and thus the method of moments cannot be successfully applied. Similarly, the maximum likelihood (ML) estimation method is also associated with certain difficulties, related to numerically finding solutions $(α, β)$ that maximize the likelihood function:

L (α, β | x_{1}, \dots, x_{n}) = \prod_{i = 1}^{n} f (x_{i}; α, β) = {(α β)}^{n} \prod_{i = 1}^{n} [\frac{x_{i}^{α β - 1}}{{(1 - x_{i}^{α})}^{β + 1}} exp (- \frac{x_{i}^{α β}}{{(1 - x_{i}^{α})}^{β}})] .

For these reasons, and similarly to Stojanović et al. [18], here we consider parameter estimation methods based on the GLU distribution quantiles. These estimators (we will call them Q-estimators) are given explicitly and also have some convenient asymptotic properties, which will be shown below.

To that end, for a random sample $X_{1}, X_{2}, \dots, X_{n}$ of length n, let us define the appropriate order statistics $X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}$. Then the PDF of the i-th order statistic $X_{(i)}$, as is known, can be expressed as follows:

f_{X_{(i)}} (x; α, β) = \frac{n!}{(i - 1)! (n - i)!} f (x; α, β) {[F (x; α, β)]}^{i - 1} {[1 - F (x; α, β)]}^{n - i},

(22)

where $i = 1, \dots, n$. On the other hand, by replacing $p = p_{0}$ in the QF $Q_{p} : = Q (p; α, β)$, given by Equation (18), the quantile $Q_{p_{0}}$ is obtained. Therefore, the appropriate sample quantile can be obtained according to the equality:

{\hat{Q}}_{p_{0}} : = X_{(i_{p_{0}})}, i_{p_{0}} = \{\begin{matrix} n p_{0}, & n p_{0} = 1, 2, \dots, n, \\ 1 + [n p_{0}], & otherwise, \end{matrix}

(23)

where $[n p_{0}]$ is the integer part of $n p_{0}$. Thus, sample quantiles are actually order statistics, so their distribution is determined by Equation (22).

In order to determine the Q-estimators of the parameters of the GLU-distributed RV X, notice first that for

p_{0} = 1 - e^{- 1}

is obtained the quantile

Q_{p_{0}} = 2^{- 1 / α}

. Hence, by equating this quantile with sample one

{\hat{Q}}_{p_{0}}

, the estimator of the shape parameter

α

is simply obtained as follows:

\hat{α} = - \frac{ln 2}{ln {\hat{Q}}_{p_{0}}} .

(24)

Furthermore, by substituting

p = 1 / 2

in the QF

Q (p; α, β)

, the median of the GLU distribution is obtained:

m = Q (1 / 2; α, β) = {(1 - \frac{1}{{(ln 2)}^{1 / β} + 1})}^{1 / α} .

(25)

Therefore, by equating the median with the sample one

\hat{m} = {\hat{Q}}_{1 / 2}

, as well as using the estimator

\hat{α}

, as an estimate of the scale parameter

β

we get:

\hat{β} = \frac{ln (ln 2)}{ln ({\hat{m}}^{\hat{α}} / (1 - {\hat{m}}^{\hat{α}}))} .

(26)
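Equations (24) and (26) can be checked numerically with a short Python sketch. It assumes the CDF F(x) = 1 - exp(-(x^α / (1 - x^α))^β), which is what integrating the density in the likelihood above gives; the function names are illustrative and not taken from the paper.

```python
import math

P0 = 1.0 - math.exp(-1.0)  # probability level used for alpha-hat

def glu_quantile(p, alpha, beta):
    """Quantile function obtained by inverting the assumed CDF."""
    t = (-math.log1p(-p)) ** (1.0 / beta)
    return (t / (1.0 + t)) ** (1.0 / alpha)

def alpha_hat(q_p0):
    """Equation (24): estimate of alpha from the sample quantile at p0 = 1 - 1/e."""
    return -math.log(2.0) / math.log(q_p0)

def beta_hat(median, a_hat):
    """Equation (26): second-stage estimate from the sample median and alpha-hat."""
    u = median ** a_hat
    return math.log(math.log(2.0)) / math.log(u / (1.0 - u))

# Plugging in the exact quantiles should return the parameters up to rounding error.
alpha, beta = 2.0, 1.5
a = alpha_hat(glu_quantile(P0, alpha, beta))
b = beta_hat(glu_quantile(0.5, alpha, beta), a)
print(a, b)
```

Note that Equation (26) is well defined only when the sample median lies strictly below the sample quantile at p_0: in that case the argument of the outer logarithm is smaller than one and the estimate is positive. If the two order statistics coincide (for instance in a very small sample or with ties), the denominator vanishes.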

In the following, some asymptotic properties of the proposed estimators are examined:

Theorem 6.

Statistics

(\hat{α}, \hat{β})

, given by Equations (24) and (26), are strongly consistent and asymptotically normal (AN) estimators of the true parameters

(α, β)

.

Proof.

In order to prove the strong consistency of the proposed estimators, we apply some general results of sample quantile theory. Let us first note that the CDF

p = F (x; α, β)

is a differentiable function that is strictly increasing in x, with values

p \in (0, 1)

. Therefore, the quantiles

Q_{p} : = Q (p; α, β)

are uniquely determined by Equation (18), while the sample quantiles

{\hat{Q}}_{p}

are uniquely determined by Equation (23). Now, according to Bahadur’s representation of sample quantiles (see, e.g., Theorem 1 in Dudek & Kuczmaszewska [22], or Serfling [23], pp. 91–92), it follows:

{\hat{Q}}_{p} = Q_{p} + \frac{p - F_{n} (Q_{p})}{f (Q_{p}; α, β)} + O [{(n^{- 1} ln n)}^{3 / 4}],

(27)

where

F_{n} (x) : = n^{- 1} \sum_{i = 1}^{n} 1 (X_{i} \leq x)

is the empirical CDF of the RV

X : G (α, β)

. It is well known that for arbitrary

x \in R

, the empirical CDF

F_{n} (x)

almost surely and uniformly converges to the CDF

p = F (x; α, β)

, when

n \to \infty

. Thus, by applying this convergence on Equation (27), when

x = Q_{p}

, one obtains:

{\hat{Q}}_{p} \overset{a s}{⟶} Q_{p}, n \to \infty,

i.e., the sample quantiles are strongly consistent estimators of the theoretical ones. At the same time, the estimators

(\hat{α}, \hat{β})

are continuous functions of the sample quantile

{\hat{Q}}_{p_{0}}

, where

p_{0} = 1 - e^{- 1}

, as well as the sample median

\hat{m} = {\hat{Q}}_{1 / 2}

. Applying the continuity property for almost sure convergence (see, e.g., Serfling [23], p. 24), it follows:

(\hat{α}, \hat{β}) \overset{a s}{⟶} (α, β), n \to \infty,

i.e.,

(\hat{α}, \hat{β})

are indeed strongly consistent estimators of

(α, β)

.

We now prove the AN property of these estimators. To this end, note that according to the earlier assumptions and Equation (27), the following convergence in distribution is obtained:

\sqrt{n} ({\hat{Q}}_{p} - Q_{p}) \overset{d}{⟶} N (0, \frac{p (1 - p)}{{[f (Q_{p}; α, β)]}^{2}}), n \to \infty .

(28)

By using Equation (28), for the sample quantile

{\hat{Q}}_{p_{0}}

, where

p_{0} = 1 - e^{- 1}

, one obtains:

\sqrt{n} ({\hat{Q}}_{p_{0}} - Q_{p_{0}}) \overset{d}{⟶} N (0, σ_{p_{0}}^{2}), n \to \infty,

where, according to Equation (2) and after some calculations, we get:

σ_{p_{0}}^{2} : = Var ({\hat{Q}}_{p_{0}}) = \frac{p_{0} (1 - p_{0})}{{[f (Q_{p_{0}}; α, β)]}^{2}} |_{p_{0} = 1 - e^{- 1}} = \frac{e - 1}{α^{2} β^{2} 4^{\frac{1}{α} + 1}} .

Hence, applying the continuity of convergence in distribution (see, e.g., Serfling [23], p. 118), for the estimator

\hat{α}

, defined by Equation (24), one obtains:

\sqrt{n} (\hat{α} - α) \overset{d}{⟶} N (0, σ_{α}^{2}), n \to \infty,

(29)

where:

σ_{α}^{2} : = Var (\hat{α}) = {[- \frac{\partial}{\partial Q_{p_{0}}} (\frac{ln 2}{ln Q_{p_{0}}})]}^{2} |_{Q_{p_{0}} = 2^{- 1 / α}} \cdot σ_{p_{0}}^{2} = \frac{{ln}^{2} 2}{Q_{p_{0}}^{2} {ln}^{4} Q_{p_{0}}} |_{Q_{p_{0}} = 2^{- 1 / α}} \cdot \frac{e - 1}{α^{2} β^{2} 4^{\frac{1}{α} + 1}} = \frac{α^{2} (e - 1)}{4 β^{2} {ln}^{2} 2} .
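The closed form for σ_α² can be compared with a small Monte Carlo experiment. The Python sketch below is only illustrative: it assumes inverse-transform sampling from the CDF F(x) = 1 - exp(-(x^α / (1 - x^α))^β) implied by the density used above, and the seed, sample size and number of replications are arbitrary choices.

```python
import math
import random
import statistics

def glu_sample(n, alpha, beta, rng):
    """Inverse-transform sampling under the assumed GLU CDF."""
    out = []
    for _ in range(n):
        t = (-math.log(1.0 - rng.random())) ** (1.0 / beta)
        out.append((t / (1.0 + t)) ** (1.0 / alpha))
    return out

def alpha_hat(sample):
    """Equation (24), with the order-statistic index of Equation (23)."""
    n = len(sample)
    p0 = 1.0 - math.exp(-1.0)
    i = 1 + math.floor(n * p0)  # n * p0 is never an integer for this p0
    q = sorted(sample)[i - 1]
    return -math.log(2.0) / math.log(q)

def asymptotic_var_alpha(alpha, beta):
    """Closed form for sigma_alpha^2 derived above."""
    return alpha ** 2 * (math.e - 1.0) / (4.0 * beta ** 2 * math.log(2.0) ** 2)

rng = random.Random(12345)
alpha, beta, n, reps = 0.5, 1.0, 1000, 300
estimates = [alpha_hat(glu_sample(n, alpha, beta, rng)) for _ in range(reps)]
ratio = n * statistics.variance(estimates) / asymptotic_var_alpha(alpha, beta)
print(f"n * Var(alpha_hat) / sigma_alpha^2 = {ratio:.3f}")  # expected to be near 1
```

For α = 0.5 and β = 1 the closed form gives σ_α² ≈ 0.2235, i.e., a standard deviation of about 0.0386 at n = 150, which is in line with the SD of 0.0388 reported for the estimates of α in Table 1.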

In a similar way, the AN property of the estimator

\hat{β}

, given by Equation (26), is proved. To that end, let us first notice that Equation (28), applied to the sample median

\hat{m} = {\hat{Q}}_{1 / 2}

, gives the following convergence:

\sqrt{n} (\hat{m} - m) \overset{d}{⟶} N (0, σ_{m}^{2}), n \to \infty .

Here, according to Equations (25) and (28), as well as after some computations, it follows:

σ_{m}^{2} : = Var (\hat{m}) = \frac{1}{4 {[f (m; α, β)]}^{2}} = \frac{{(ln 2)}^{\frac{2}{α β} - 2}}{α^{2} β^{2} {(1 + {(ln 2)}^{\frac{1}{β}})}^{\frac{2}{α} + 2}} .

(30)

By applying again the continuity of convergence in distribution, one obtains:

\sqrt{n} (\hat{β} - β) \overset{d}{⟶} N (0, σ_{β}^{2}), n \to \infty,

(31)

where, according to Equations (25), (26) and (30), it follows:

\begin{matrix} σ_{β}^{2} : = Var (\hat{β}) & = {[- \frac{\partial}{\partial m} (\frac{ln (ln 2)}{ln (m^{α} / (1 - m^{α}))})]}^{2} \cdot σ_{m}^{2} \\ & = \frac{{ln}^{2} (ln 2)}{m^{2 α} {(1 - m^{α})}^{2} {ln}^{4} (m^{α} / (1 - m^{α}))} \cdot \frac{{(ln 2)}^{\frac{2}{α β} - 2}}{α^{2} β^{2} {(1 + {(ln 2)}^{\frac{1}{β}})}^{\frac{2}{α} + 2}} \\ & = \frac{β^{4} {(ln 2)}^{- \frac{2}{β}} {(1 + {(ln 2)}^{\frac{1}{β}})}^{4} {(ln 2)}^{\frac{2}{α β} - 2}}{α^{2} β^{2} {ln}^{2} (ln 2) {(1 + {(ln 2)}^{\frac{1}{β}})}^{\frac{2}{α} + 2}} = \frac{β^{2} {(1 + {(ln 2)}^{1 / β})}^{2 - \frac{2}{α}}}{α^{2} {ln}^{2} (ln 2) {(ln 2)}^{2 + \frac{2}{β} - \frac{2}{α β}}} . \end{matrix}

In this way, according to the convergences proved in Equations (29) and (31), the AN properties of both estimators

\hat{α}

and

\hat{β}

follow. □

In the following, a numerical study which examines the effectiveness of the proposed Q-estimators is conducted. It is based on independent Monte Carlo simulations of samples

x_{1}, \dots, x_{n}

drawn from the GLU distribution, where various sample sizes and parameter values were considered. From these samples, the Q-estimators were computed and then analyzed statistically. To that end, three different samples from the GLU distribution are examined (also shown in Figure 6, below):

$(i)$: Sample I is taken from a decreasing GLU distribution, with parameters $α = 0.5$ and $β = 1$ , which satisfies the inequality $α β < 1$ .
$(i i)$: Sample II is taken from a unimodal, positively skewed GLU distribution, with parameters $α = 0.5$ and $β = 2$ , so the equality $α β = 1$ holds.
$(i i i)$: Sample III is taken from a unimodal, negatively skewed GLU distribution, with parameters $α = 2$ and $β = 1.5$ , which satisfies the inequality $α β > 1$ .

Note that the simulated sample values are obtained using the R-package “distr” (version 2.9.3) [24], and thereafter the Q-estimates

\hat{α}

and

\hat{β}

are calculated using the procedure described above. In order to check the efficiency of the proposed estimators, realizations of samples with different lengths

n \in \{150, 500, 1500\}

are considered, since these are close to the lengths of the real-world data sets investigated below. In addition, for each of the samples,

S = 250

independent simulations were conducted, on which an appropriate statistical analysis of the obtained estimates was then performed. The results of this analysis are presented in the following Table 1, Table 2 and Table 3.
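The layout of this simulation study can be sketched in Python for Sample I. This is not the authors' R code: sampling is done here by inverse transform under the CDF F(x) = 1 - exp(-(x^α / (1 - x^α))^β) implied by the density above, the seed is arbitrary, and only the Min., Mean, Max. and SD rows are reproduced, since the exact formulas for the other statistics are not restated here.

```python
import math
import random
import statistics

P0 = 1.0 - math.exp(-1.0)

def glu_sample(n, alpha, beta, rng):
    """Inverse-transform sampling under the assumed GLU CDF."""
    out = []
    for _ in range(n):
        t = (-math.log(1.0 - rng.random())) ** (1.0 / beta)
        out.append((t / (1.0 + t)) ** (1.0 / alpha))
    return out

def order_stat(sorted_xs, p):
    """Order statistic selected by the index rule of Equation (23)."""
    k = len(sorted_xs) * p
    nearest = round(k)
    if abs(k - nearest) < 1e-9 and nearest >= 1:
        i = nearest
    else:
        i = 1 + math.floor(k)
    return sorted_xs[i - 1]

def q_estimates(sample):
    """Two-stage Q-estimates: Equation (24) first, then Equation (26)."""
    xs = sorted(sample)
    a = -math.log(2.0) / math.log(order_stat(xs, P0))
    u = order_stat(xs, 0.5) ** a
    b = math.log(math.log(2.0)) / math.log(u / (1.0 - u))
    return a, b

def summarize(values):
    return {"Min.": min(values), "Mean": statistics.fmean(values),
            "Max.": max(values), "SD": statistics.stdev(values)}

rng = random.Random(2024)
S = 250                      # number of independent replications
results = {}
for n in (150, 500, 1500):   # Sample I: alpha = 0.5, beta = 1
    pairs = [q_estimates(glu_sample(n, 0.5, 1.0, rng)) for _ in range(S)]
    results[n] = (summarize([a for a, _ in pairs]),
                  summarize([b for _, b in pairs]))
    print(n, results[n])
```

Because a different random number generator is used, the printed values are not expected to reproduce the tables digit for digit, only their overall pattern.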

More specifically, the above tables contain summary statistics of the calculated estimates, that is, their minima (Min.), means (Mean) and maxima (Max.). In addition, some error statistics are shown, such as the standard deviations (SD), the mean square estimation errors (MSEE) and the fractional estimation errors (FEE). Finally, the Anderson–Darling and Shapiro–Wilk normality tests are also performed. Based on these results, it can be noted that the proposed estimators are efficient, because the bias, the range of the estimates (Max.–Min.), and the values of the error statistics decrease as the sample size increases. At the same time, stability and efficiency are more pronounced for the estimates

\hat{α}

, especially in the first sample. This obviously follows from the way the estimates

\hat{β}

are calculated, as a two-stage procedure, using previously obtained estimates

\hat{α}

.

Similar conclusions can be drawn from the results of AN testing of these estimates. As previously mentioned, the AN testing procedures were conducted using the Anderson–Darling and Shapiro–Wilk normality tests, and their statistics, labelled AD and W, respectively, as well as the corresponding p-values, were computed using the R-package “nortest” (version 1.0-4) [25]. The AN test results are also presented in Table 1, Table 2 and Table 3, where it can be noted that the estimates

\hat{α}

have a pronounced AN feature, which applies to all observed samples. On the other hand, estimates

\hat{β}

have a less pronounced AN feature, primarily with samples of smaller lengths. Still, AN properties are confirmed for most of the samples, which can also be seen in Figure 6, where their observations, as well as empirical and theoretical distributions, are shown.

4. Applications of the GLU Distribution

As mentioned earlier, the Gumbel distribution can be used in modeling the extremes of some sample values. In more detail, in his original work, Gumbel [26] proved that the maximum of a sample taken from a population with an exponential distribution, after a simple transformation, approaches the Gumbel distribution with increasing sample size. This procedure can also be applied in some practical cases, such as in Burke et al. [27], where the Gumbel distribution is used to analyze maximum rainfall. In a similar way, the maximum load on a telecommunications system can be modeled with the Gumbel distribution, which allows administrators to optimize network capacity and minimize the occurrence of overloads. Another possible application is risk analysis related to ICT, which enables companies to better understand and prepare for extreme scenarios that can significantly affect the business, by developing methods for the development and introduction of new technologies and concepts, as shown in Pažun & Langović [28].
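The limit result mentioned above can be illustrated without simulation. The Python sketch below assumes that the "simple transformation" is the usual shift by ln n for unit-rate exponentials; it compares the exact CDF of the shifted maximum, (1 - e^{-x}/n)^n, with the standard Gumbel CDF exp(-e^{-x}) on a grid. The function names are illustrative.

```python
import math

def cdf_shifted_max_exponential(x, n):
    """Exact CDF of max(E_1, ..., E_n) - ln(n) for i.i.d. unit exponentials."""
    if x + math.log(n) <= 0.0:
        return 0.0
    # P(max <= x + ln n) = (1 - exp(-x) / n)^n, evaluated via log1p.
    return math.exp(n * math.log1p(-math.exp(-x) / n))

def gumbel_cdf(x):
    """Standard Gumbel CDF."""
    return math.exp(-math.exp(-x))

grid = [k / 10 for k in range(-30, 61)]
gaps = {n: max(abs(cdf_shifted_max_exponential(x, n) - gumbel_cdf(x)) for x in grid)
        for n in (10, 1000, 100000)}
print(gaps)  # the largest gap on the grid shrinks as n grows
```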

For those reasons, this section considers some possible applications of the GLU distribution in modeling real-world data, primarily in the domains of telecommunications and machine learning. The datasets observed here were downloaded from the website “Kaggle.com” [29], a platform focused on analysing and sharing datasets related to machine learning and online data science. Thereby, the observed data represent parts of the training data related to network and telecommunication traffic in India, i.e., describe the satisfaction and participation of end users, as well as the adaptability and extensibility of the corresponding network transport. More specifically, the three real-world datasets analyzed below can be briefly described as follows:

$(i)$: The first data set, named Series A, consists of $n = 152$ observations representing the percentage of service usage time of end users. The data were collected by Mirza [30] and, as already mentioned, are part of the training data intended for machine learning and online coding and modeling.
$(i i)$: The second one (Series B) is part of the same training data as above and consists of $n = 542$ monthly end user fees (in Indian Rupees). These data are normalized with respect to their maximum and minimum values, which yields a data set in the unit interval.
$(i i i)$: Finally, the third data set, designated as Series C, is obtained from training data authored by Mnassri [31], intended for the development of appropriate predictive models, i.e., training, cross-validation and performance testing of machine learning models. Series C consists of $n = 1564$ observations, which represent the total daily call length of end users (expressed in minutes), where the normalized values are obtained as the ratio of the call duration to the maximum call length.
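The two normalizations described for Series B and Series C can be sketched in Python as follows. The function names and the toy numbers are illustrative and are not the Kaggle data.

```python
def minmax_normalize(values):
    """Map values to [0, 1] using their minimum and maximum (as for Series B)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def ratio_to_max(values):
    """Divide each value by the maximum (as for Series C)."""
    top = max(values)
    if top <= 0:
        raise ValueError("maximum must be positive")
    return [v / top for v in values]

fees = [20.0, 55.5, 89.9, 120.0]       # made-up monthly fees
minutes = [12.0, 180.5, 260.0, 351.1]  # made-up daily call lengths
print(minmax_normalize(fees))
print(ratio_to_max(minutes))
```

Both maps send the extreme observations exactly to 0 or 1, that is, to the boundary of the unit interval. How such boundary values were treated is not stated in the text, so anyone reproducing the fits may need to make an explicit choice for them.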

Realizations of these series are shown in Figure 7a, while Figure 7b shows the values of their corresponding autocorrelation functions (ACFs). As can be easily seen, the ACF values of all series are relatively low, so the series can be considered as independent realizations of some unit RV; that is, they can be modeled by one of the unit distributions.
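A check of this kind can be sketched in Python with the usual sample autocorrelation estimator. The estimator, the approximate 95% band of ±1.96/√n for independent data, and the simulated stand-in series are assumptions of this sketch; the paper does not state how its ACFs were computed.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (r_0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0.0:
        raise ValueError("constant series has no defined autocorrelation")
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

rng = random.Random(7)
series = [rng.random() for _ in range(500)]  # stand-in for a unit-interval series
acf = sample_acf(series, 10)
band = 1.96 / math.sqrt(len(series))         # rough band for independent data
print([round(r, 3) for r in acf], round(band, 3))
```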

To further check the effectiveness of this modeling, in addition to the proposed GLU distribution, some other existing, well-known unit distributions were used to fit the empirical distributions of observed data. In more detail, the GLU distribution is compared against two well-known distributions defined on the unit interval, namely the Beta and Kumaraswamy distributions. The PDFs of these distributions are, respectively,

g_{1} (x; α, β) = \frac{1}{B (α, β)} x^{α - 1} {(1 - x)}^{β - 1}, g_{2} (x; α, β) = α β x^{α - 1} {(1 - x^{α})}^{β - 1},

where

0 < x < 1

,

α, β > 0

are distribution parameters, and

B (α, β)

is the beta function. In order to calculate the parameter estimates of the Beta distribution, the method of moments (MM) is used here (see, e.g., Kachiashvili & Melikdzhanjan [32]), and the MM estimates are obtained as follows:

\hat{α} = \bar{x} (\frac{\bar{x} (1 - \bar{x})}{\bar{v}} - 1), \hat{β} = (1 - \bar{x}) (\frac{\bar{x} (1 - \bar{x})}{\bar{v}} - 1) .

Here,

\bar{x} = n^{- 1} \sum_{i = 1}^{n} x_{i}

and

\bar{v} = {(n - 1)}^{- 1} \sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}

are the sample mean and variance, respectively, and the inequality

\bar{v} < \bar{x} (1 - \bar{x})

holds. On the other hand, for the Kumaraswamy distribution, the maximum likelihood (ML) estimation method is used. Its application yields ML estimators as solutions of coupled equations (see, e.g., Dey et al. [33]):

\begin{matrix} \frac{n}{α} + \sum_{i = 1}^{n} ln x_{i} - (β - 1) \sum_{i = 1}^{n} \frac{x_{i}^{α} ln x_{i}}{1 - x_{i}^{α}} & = 0, \\ \frac{n}{β} + \sum_{i = 1}^{n} ln (1 - x_{i}^{α}) & = 0 . \end{matrix}
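One way to solve these coupled equations is to eliminate β: the second equation gives β = -n / Σ ln(1 - x_i^α) for any fixed α, which leaves a one-dimensional problem in α. The Python sketch below follows this route with a golden-section search on the profile log-likelihood. The search interval, the function names and the simulated test sample are assumptions of this sketch, and it relies on the profile log-likelihood having a single maximum in that interval; it is not the procedure of Dey et al. [33].

```python
import math
import random

def kum_profile_beta(sample, alpha):
    """Solve the second likelihood equation for beta at a fixed alpha."""
    s = sum(math.log1p(-(x ** alpha)) for x in sample)
    return -len(sample) / s

def kum_loglik(sample, alpha, beta):
    """Kumaraswamy log-likelihood for a sample in (0, 1)."""
    n = len(sample)
    return (n * math.log(alpha * beta)
            + (alpha - 1.0) * sum(math.log(x) for x in sample)
            + (beta - 1.0) * sum(math.log1p(-(x ** alpha)) for x in sample))

def kum_mle(sample, lo=0.05, hi=20.0, iters=80):
    """Golden-section search over log(alpha) on the profile log-likelihood."""
    def profile(log_a):
        a = math.exp(log_a)
        return kum_loglik(sample, a, kum_profile_beta(sample, a))
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    left, right = math.log(lo), math.log(hi)
    c = right - inv_phi * (right - left)
    d = left + inv_phi * (right - left)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc > fd:   # the maximum lies in [left, d]
            right, d, fd = d, c, fc
            c = right - inv_phi * (right - left)
            fc = profile(c)
        else:         # the maximum lies in [c, right]
            left, c, fc = c, d, fd
            d = left + inv_phi * (right - left)
            fd = profile(d)
    alpha = math.exp((left + right) / 2.0)
    return alpha, kum_profile_beta(sample, alpha)

# Simulated Kumaraswamy(2, 3) data via the inverse CDF (illustrative only).
rng = random.Random(7)
sample = [(1.0 - (1.0 - rng.random()) ** (1.0 / 3.0)) ** (1.0 / 2.0)
          for _ in range(3000)]
alpha_ml, beta_ml = kum_mle(sample)
print(alpha_ml, beta_ml)
```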

As is known, the estimators used for both of the above distributions have the stability and AN properties. Thus, one reason for choosing them is to compare the models not only with respect to the distributions themselves, but also with respect to different estimation procedures.
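The method-of-moments formulas for the Beta distribution given above translate directly into code. The following Python sketch uses the unbiased sample variance (divisor n - 1), as in the definition of the sample variance above; the function name and the simulated check are illustrative.

```python
import random
import statistics

def beta_mm(sample):
    """Method-of-moments estimates (alpha, beta) of the Beta distribution."""
    xbar = statistics.fmean(sample)
    v = statistics.variance(sample)  # divisor n - 1
    if not v < xbar * (1.0 - xbar):
        raise ValueError("need sample variance < mean * (1 - mean)")
    common = xbar * (1.0 - xbar) / v - 1.0
    return xbar * common, (1.0 - xbar) * common

# Simulated Beta(2, 5) data, used only to illustrate the call.
rng = random.Random(11)
data = [rng.betavariate(2.0, 5.0) for _ in range(20000)]
a_mm, b_mm = beta_mm(data)
print(a_mm, b_mm)
```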

The results of the previously described estimation procedures can be seen in Figure 7c, where the empirical distributions of the observed data are shown, along with their fitted PDFs. As can be easily seen, the empirical distribution of Series A is significantly positively skewed and is fitted with decreasing PDFs. On the other hand, Series B and Series C have negatively skewed unimodal distributions, with the distribution of Series C being “approximately symmetric” to some extent. This can also be confirmed by the estimated values of the parameters for each series, as well as for all competing distributions, presented in Table 4 below.

Based on them,

S = 1000

independent simulations of the corresponding theoretical distributions were carried out, and the agreement between the empirical and fitted distributions was verified in several different ways. Namely, the mean square estimation error (MSEE) statistics, the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) for model selection were used for this purpose. In addition, the two-sample Kolmogorov–Smirnov (KS) test of asymptotic distribution equivalence was also performed, and all these values are also shown in Table 4.
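The model-selection criteria and the two-sample KS statistic can be sketched in Python as follows. The text does not spell out the exact formulas used, so the standard definitions are assumed here: AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L with k fitted parameters, and the KS statistic as the largest gap between two empirical CDFs. Only the statistic is computed; the p-value is left out.

```python
import math

def aic(loglik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik

def ks_two_sample(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] == v:  # step past ties in a
            i += 1
        while j < len(b) and b[j] == v:  # step past ties in b
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

print(aic(-10.0, 2), bic(-10.0, 2, 100))
print(ks_two_sample([0.1, 0.2, 0.3, 0.4], [0.3, 0.4, 0.5, 0.6]))
```

Negative AIC and BIC values, as in Table 4, are possible under these definitions, because a density on the unit interval can exceed one and the log-likelihood can therefore be positive.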

According to the results thus obtained, it is noticeable, for instance, that in the case of Series A all three theoretical distributions can be adequate for fitting. Namely, here the values of the MSEE, AIC and BIC statistics are relatively low, while the KS test does not reject the hypothesis of equivalence of the empirical and (all three) theoretical distributions. By contrast, for the other two series, the values of MSEE, AIC and BIC are generally lower when the GLU and Beta distributions are applied as fitting models. Also, note that, in contrast to the Kumaraswamy distribution, the KS tests for the GLU and Beta distributions do not reject the previously mentioned hypothesis of equivalence at the 0.01 significance level, since their p-values satisfy

p > 0.01

. Nevertheless, the GLU distribution has better fitting characteristics than both other theoretical distributions, even in the case of Series C, which has an “approximately symmetric” distribution. Moreover, only for the GLU distribution do the KS tests yield p-values with

p > 0.05

for all three series (see Table 4), so that the hypothesis of asymptotic equivalence of the theoretical and empirical distributions is not rejected even at the 0.05 significance level.

5. Conclusions

A novel unit distribution, the so-called GLU distribution, is presented here, along with its basic stochastic properties, where, among others, the asymmetry of the proposed distribution was shown. In addition, a quantile-based procedure was used to estimate the parameters of the GLU distribution. The consistency and AN properties of the Q-estimators obtained in this way were established, and a numerical study of their efficiency was carried out. To further check the efficiency of the proposed distribution, it was applied in fitting three real data sets, where it was also compared with the Beta and Kumaraswamy distributions and showed better fitting characteristics. The results obtained in this way provide a motive for further investigation of new unit distributions with different features. Namely, by logistic mapping of some other continuous distributions defined on the set of real numbers, new unit distributions can arise. On the other hand, as is known, logistic regressions are applied in machine learning and data science, and this may also indicate some opportunities for further examination of unit distributions defined by logistic maps.

Author Contributions

Conceptualization, V.S.S., M.J. and Ž.G.; methodology, V.S.S. and M.J.; software, V.S.S., M.J. and B.P.; validation, V.S.S., M.J. and B.P.; formal analysis, V.S.S. and M.J.; data curation, V.S.S., M.J. and Z.L.; writing—original draft preparation, V.S.S., M.J. and Z.L.; writing—review and editing, B.P., Z.L. and Ž.G.; visualization, V.S.S., B.P. and Ž.G.; supervision, B.P. and Z.L.; project administration, B.P. and Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available on the “Kaggle” website: https://www.kaggle.com (accessed on 1 August 2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Afify, A.Z.; Nassar, M.; Kumar, D.; Cordeiro, G.M. A New Unit Distribution: Properties and Applications. Electron. J. Appl. Stat. 2022, 15, 460–484.
2. Nasiru, S.; Abubakari, A.G.; Chesneau, C. New Lifetime Distribution for Modeling Data on the Unit Interval: Properties, Applications and Quantile Regression. Math. Comput. Appl. 2022, 27, 105.
3. Martínez-Flórez, G.; Azevedo-Farias, R.B.; Tovar-Falón, R. New Class of Unit-Power-Skew-Normal Distribution and Its Associated Regression Model for Bounded Responses. Mathematics 2022, 10, 3035.
4. Korkmaz, M.Ç.; Korkmaz, Z.S. The Unit Log–log Distribution: A New Unit Distribution with Alternative Quantile Regression Modeling and Educational Measurements Applications. J. Appl. Stat. 2023, 50, 889–908.
5. Shakhatreh, M.K.; Aljarrah, M.A. Bayesian Analysis of Unit Log-Logistic Distribution Using Non-Informative Priors. Mathematics 2023, 11, 4947.
6. Nasiru, S.; Abubakari, A.G.; Chesneau, C. The Arctan Power Distribution: Properties, Quantile and Modal Regressions with Applications to Biomedical Data. Math. Comput. Appl. 2023, 28, 25.
7. Nasiru, S.; Chesneau, C.; Ocloo, S.K. The Log-Cosine-Power Unit Distribution: A New Unit Distribution for Proportion Data Analysis. Decis. Anal. J. 2024, 10, 100397.
8. Alomair, G.; Akdoğan, Y.; Bakouch, H.S.; Erbayram, T. On the Maximum Likelihood Estimators’ Uniqueness and Existence for Two Unitary Distributions: Analytically and Graphically, with Application. Symmetry 2024, 16, 610.
9. Condino, F.; Domma, F. Unit Distributions: A General Framework, Some Special Cases, and the Regression Unit-Dagum Models. Mathematics 2023, 11, 2888.
10. Krishna, A.; Maya, R.; Chesneau, C.; Irshad, M.R. The Unit Teissier Distribution and Its Applications. Math. Comput. Appl. 2022, 27, 12.
11. Fayomi, A.; Hassan, A.S.; Baaqeel, H.; Almetwally, E.M. Bayesian Inference and Data Analysis of the Unit–Power Burr X Distribution. Axioms 2023, 12, 297.
12. Biçer, C.; Bakouch, H.S.; Biçer, H.D.; Alomair, G.; Hussain, T.; Almohisen, A. Unit Maxwell-Boltzmann Distribution and Its Application to Concentrations Pollutant Data. Axioms 2024, 13, 226.
13. Bakouch, H.S.; Hussain, T.; Tošić, M.; Stojanović, V.S.; Qarmalah, N. Unit Exponential Probability Distribution: Characterization and Applications in Environmental and Engineering Data Modeling. Mathematics 2023, 11, 4207.
14. Alsadat, N.; Taniş, C.; Sapkota, L.P.; Kumar, A.; Marzouk, W.; Gemeay, A.M. Inverse Unit Exponential Probability Distribution: Classical and Bayesian Inference With Applications. AIP Adv. 2024, 14, 55108.
15. Ramadan, A.T.; Tolba, A.H.; El-Desouky, B.S. A Unit Half-Logistic Geometric Distribution and Its Application in Insurance. Axioms 2022, 11, 676.
16. Nasiru, S.; Chesneau, C.; Abubakari, A.G.; Angbing, I.D. Generalized Unit Half-Logistic Geometric Distribution: Properties and Regression with Applications to Insurance. Analytics 2023, 2, 438–462.
17. Alghamdi, S.M.; Shrahili, M.; Hassan, A.S.; Mohamed, R.E.; Elbatal, I.; Elgarhy, M. Analysis of Milk Production and Failure Data: Using Unit Exponentiated Half Logistic Power Series Class of Distributions. Symmetry 2023, 15, 714.
18. Stojanović, V.S.; Jovanović Spasojević, T.; Jovanović, M. Laplace-Logistic Unit Distribution with Application in Dynamic and Regression Analysis. Mathematics 2024, 12, 2282.
19. Milovanović, G.V. Numerical Analysis I, 2nd ed.; Naučna knjiga: Belgrade, Serbia, 1991.
20. Rudin, W. Real and Complex Analysis, 3rd ed.; McGraw-Hill International Edition: New York, NY, USA, 1987.
21. Salinas, H.S.; Bakouch, H.S.; Almuhayfith, F.E.; Caimanque, W.E.; Barrios-Blanco, L.; Albalawi, O. Statistical Advancement of a Flexible Unitary Distribution and Its Applications. Axioms 2024, 13, 397.
22. Dudek, D.; Kuczmaszewska, A. Some Practical and Theoretical Issues Related to the Quantile Estimators. Stat. Papers, 2024; in press.
23. Serfling, R.J. Approximation Theorems of Mathematical Statistics, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2002.
24. Ruckdeschel, P.; Kohl, M.; Stabla, T.; Camphausen, F. S4 Classes for Distributions. R News 2006, 6, 2–6. Available online: https://CRAN.R-project.org/doc/Rnews (accessed on 1 August 2024).
25. Gross, L. Tests for Normality. R Package Version 1.0-2. 2013. Available online: http://CRAN.R-project.org/package=nortest (accessed on 1 August 2024).
26. Gumbel, E.J. Statistical Theory of Extreme Values and Some Practical Applications; Applied Mathematics Series; U.S. Department of Commerce, National Bureau of Standards: Gaithersburg, MD, USA, 1954; Volume 33.
27. Burke, E.J.; Perry, R.H.J.; Brown, S.J. An extreme value analysis of UK drought and projections of change in the future. J. Hydrol. 2010, 388, 131–143.
28. Pažun, B.; Langović, Z. Contemporary Information System Development Methodologies in Tourism Organizations. In Proceedings of the 4th International Scientific Conference, Tourism in Function of Development of the Republic of Serbia, Spa Tourism in Serbia and Experiences of Other Countries, Vrnjačka Banja, Serbia, 30 May–1 June 2019; pp. 467–481. Available online: https://www.tisc.rs/proceedings/index.php/hitmc/article/view/267 (accessed on 3 August 2024).
29. Kaggle.com. Available online: https://www.kaggle.com (accessed on 2 August 2024).
30. Mirza, M.H. “telecom_churn_dataset” (Dataset). 2023. Available online: https://www.kaggle.com/datasets/mirzahasnine/telecom-churn-dataset (accessed on 2 August 2024).
31. Mnassri, B. “Telecom Churn Dataset” (Dataset). 2023. Available online: https://www.kaggle.com/datasets/mnassrib/telecom-churn-datasets (accessed on 2 August 2024).
32. Kachiashvili, K.J.; Melikdzhanjan, D.I. Estimators of the Parameters of Beta Distribution. Sankhya B 2019, 81, 350–373.
33. Dey, S.; Mazucheli, J.; Nadarajah, S. Kumaraswamy Distribution: Different Methods of Estimation. Comp. Appl. Math. 2018, 37, 2094–2111.

Figure 1. Graphs of the PDFs (a) and CDFs (b) of the GLU distribution for some parameter values

α, β > 0

.

Figure 2. (a) Parameter regions with various shapes and asymmetry of the GLU distribution. (b) Some PDFs of the RV

X : G (α, β)

, where the dependence

α = ξ (β)

holds.

Figure 3. (a) Plot of Lorenz curves obtained for some values of the GLU distribution parameters; (b) 3D plot of the dependence of Shannon entropy vs. parameters

(α, β)

.

Figure 4. Graphs of the HRF (a) and QF (b) of the GLU distributed RV

X : G (α, β)

, for some values of parameters

α, β > 0

.

Figure 5. Polar plots of parameter dependences yielding a unimodal GLU distribution, with some fixed values of

p_{0} \in (0, 1)

and two different angular intervals: (a)

β \in [0, 8 π]

; (b)

β \in [0, 80 π]

.

Figure 6. Graphs left: Realizations of various samples drawn from the GLU-distribution. Graphs right: Empirical and fitted PDFs of the RV

X : G (α, β)

.

Figure 7. (a) Observed sample values of three real-world data. (b) Estimated ACFs of observed samples (data series). (c) Empirical distributions and PDFs fitted using the GLU, Beta and Kumaraswamy distributions.

Table 1. Q-estimates of GLU distribution parameters for Sample I: the true parameter values are

α = 0.5

and

β = 1

.

Statistics	$n = 150$		$n = 500$		$n = 1500$
Statistics	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$
Min.	0.4015	0.5716	0.4237	0.6739	0.4660	0.8195
Mean	0.5122	1.1906	0.5111	1.1519	0.5020	1.0488
Max.	0.6101	1.8720	0.5807	1.7959	0.5515	1.2252
SD	0.0388	0.9769	0.0221	0.2341	0.0122	0.0439
MSEE	0.0403	0.3218	0.0252	0.1906	0.0168	0.0519
FEE (%)	8.0623	32.177	5.0333	19.062	3.3707	5.1930
$A D$	0.2933	0.5094	0.3035	0.6068	0.2930	0.4860
(p-value)	(0.5997)	(0.1962)	(0.5704)	(0.1137)	(0.6004)	(0.2240)
W	0.9929	0.9876 *	0.9946	0.9884 *	0.9957	0.9902
(p-value)	(0.2815)	(0.0299)	(0.5138)	(0.0414)	(0.7109)	(0.0890)

* 0.01 < p < 0.05.

Table 2. Q-estimates of GLU distribution parameters for Sample II: the true parameter values are

α = 0.5

and

β = 2

.

Statistics	$n = 150$		$n = 500$		$n = 1500$
Statistics	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$
Min.	0.4299	1.2050	0.4731	1.3330	0.4821	1.5590
Mean	0.4993	2.1085	0.4995	2.0420	0.5000	2.0206
Max.	0.5615	2.6790	0.5194	2.2570	0.5155	2.1400
SD	0.0212	1.0480	9.26 $\times 10^{- 3}$	0.3665	5.58 $\times 10^{- 3}$	0.2013
MSEE	0.0181	0.3746	0.0105	0.0862	6.22 $\times 10^{- 3}$	0.0356
FEE (%)	3.6132	18.728	2.0931	4.3068	1.2481	1.7883
$A D$	0.3802	1.0201 *	0.2678	0.4337	0.2090	0.3384
(p-value)	(0.4005)	(0.0108)	(0.6826)	(0.2998)	(0.8621)	(0.5021)
W	0.9914	0.9888 *	0.9939	0.9900	0.9956	0.99031
(p-value)	(0.1504)	(0.0489)	(0.4049)	(0.0834)	(0.7032)	(0.0949)

* 0.01 < p < 0.05.

Table 3. Q-estimates of GLU distribution parameters for Sample III: the true parameter values are

α = 2

and

β = 1.5

.

Statistics	$n = 150$		$n = 500$		$n = 1500$
Statistics	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$	$\hat{α}$	$\hat{β}$
Min.	1.6710	1.0773	1.8330	1.1057	1.8621	1.1950
Mean	1.9952	1.4901	1.9978	1.5052	2.0010	1.4970
Max.	2.2941	1.9122	2.1605	1.7935	2.1072	1.7245
SD	0.0949	0.6647	0.0606	0.2910	0.0326	0.1534
MSEE	0.0949	0.1901	0.0523	0.0796	0.0327	0.0482
FEE (%)	4.7450	12.675	2.6488	5.3096	1.6375	3.2158
$A D$	0.3238	2.0687 **	0.2153	0.5059	0.3041	0.5160
(p-value)	(0.5235)	(2.83 $\times 10^{- 5}$ )	(0.8460)	(0.2001)	(0.5686)	(0.1889)
W	0.99376	0.9806 **	0.9950	0.9903	0.9952	0.9908
(p-value)	(0.3865)	(1.74 $\times 10^{- 3}$ )	(0.5840)	(0.0949)	(0.6194)	(0.1189)

** p < 0.01.

Table 4. Parameter estimates, estimation errors, and fit statistics of the GLU, Beta, and Kumaraswamy distributions.

Parameter/	Series A			Series B			Series C
Statistic	GLU	BETA	KUM	GLU	BETA	KUM	GLU	BETA	KUM
$α$	0.6603	0.8939	0.5989	2.3400	1.9597	1.5018	1.2055	4.7587	1.3589
$β$	1.1541	1.9902	1.3840	0.8773	1.1883	1.0948	1.5639	4.6025	1.7445
MSEE	0.0118	0.0153	0.0215	5.54 $\times 10^{- 3}$	7.98 $\times 10^{- 3}$	0.0426	2.86 $\times 10^{- 3}$	3.15 $\times 10^{- 3}$	0.0573
AIC	−116.0	−69.18	−83.81	−310.9	−145.5	−65.37	−1423.5	−1419.7	−218.0
BIC	−110.0	−63.13	−77.76	−294.3	−128.9	−48.78	−1404.8	−1401.0	−199.3
$K S$	0.0921	0.0987	0.1316	0.0623	0.0886 *	0.1495 **	0.0392	0.0403	0.2398 **
(p-value)	(0.5393)	(0.4498)	(0.1439)	(0.1654)	(0.0285)	(1.11 $\times 10^{- 3}$ )	(0.1858)	(0.1580)	(0.00)

* 0.01 < p < 0.05; ** p < 0.01.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

