Article

The Continuous Bernoulli Distribution: Mathematical Characterization, Fractile Regression, Computational Simulations, and Applications

by Mustafa Ç. Korkmaz 1, Víctor Leiva 2,* and Carlos Martin-Barreiro 3,4
1 Department of Measurement and Evaluation, Artvin Çoruh University, Artvin 08100, Turkey
2 School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
3 Faculty of Natural Sciences and Mathematics, Escuela Superior Politécnica del Litoral ESPOL, Guayaquil 090902, Ecuador
4 Faculty of Engineering, Universidad Espíritu Santo, Samborondón 0901952, Ecuador
* Author to whom correspondence should be addressed.
Fractal Fract. 2023, 7(5), 386; https://doi.org/10.3390/fractalfract7050386
Submission received: 28 March 2023 / Revised: 27 April 2023 / Accepted: 29 April 2023 / Published: 6 May 2023

Abstract

The continuous Bernoulli distribution is defined on the unit interval and has a unique property related to fractiles. A fractile is a point on the support of a probability density function up to which the area under the function equals a fixed proportion. This article derives properties of the continuous Bernoulli distribution and formulates a fractile (quantile) regression model for a unit response using the exponentiated continuous Bernoulli distribution. Monte Carlo simulation studies evaluate the performance of point and interval estimators for both the continuous Bernoulli distribution and the fractile regression model. Real-world datasets from science and education are analyzed to illustrate the modeling abilities of the continuous Bernoulli distribution and the exponentiated continuous Bernoulli quantile regression model.

1. Introduction

Interest in new discrete and continuous probability distributions is increasing; see, for example, [1]. Recently, distributions defined on the interval [0, 1] have attracted great attention from researchers. These distributions have many applications in several fields of science; see, for example, [2,3,4,5,6,7]. Often, novel unit models are obtained by transforming well-known models to the unit interval [2], and studies on unit modeling are steadily increasing [2,3,4,5,6,7].
Regression is a statistical approach that is widely employed [8] to explain the structure between a dependent (response) variable and independent (explanatory) variables or covariates. Such a structure can be linear or nonlinear, associating the response mean with the independent variables. When the dependent variable is stated on the interval [0, 1], a regression structure employing the beta distribution [9] has been the most preferred method to explain the unit-dependent variable with the independent variables based on the conditional mean. Note that the beta model is formulated in relation to its mean. Then, its unit regression based on the mean was established in [9].
With similar ideas, alternative unit mean response regressions were introduced, among them the beta rectangular [10], Birnbaum–Saunders [11], log-Bilal [12], log-Lindley [1], log-weighted exponential [13], and unit Lindley [14] models.
The dependent variable can be affected by atypical data or asymmetric distributions. In these situations, the mean is also strongly affected, and inference from the model may lead to inaccurate interpretations.
A fractile is a point on the support of a probability density function (PDF) up to which the area under the function equals a fixed proportion. Because of the inaccuracy mentioned above, quantile regression (QR) models are a robust alternative to mean regression models. The QR model was introduced in [15] and relates the conditional quantiles of the response to given values of the independent variables, instead of explaining its conditional mean. One of the key benefits of QR models over standard mean regressions is the lack of imposed distributional assumptions on the error term. This allows for greater flexibility in modeling different types of data [16].
To implement a QR model utilizing a probability distribution, it is necessary to parameterize it using its quantile function (QF). Fortunately, this parameterization can be utilized for any probability distribution with a closed-form QF, regardless of whether its mean has a closed form. This provides a high level of adaptability and enables us to model quantiles based on various distributions, as demonstrated in the following examples: the Kumaraswamy (KW) [17], L-logistic [18], log-extended exponential-geometric (LEEG) [19], log-symmetric [20], power Johnson SB [21], unit-Birnbaum–Saunders [22], unit-Burr-XII [2,23], unit-Chen [24], unit-Weibull [25], and Weibull–Marshall–Olkin [26] QR models. These types of models and structures have been largely applied to describe COVID-19 mortality [23,27].
The continuous Bernoulli (CB) model, as described in [4], is a probability distribution that exists on the interval [0, 1] and has a single parameter related to the shape of the distribution. It belongs to the exponential family of distributions and offers a generalization of the uniform distribution defined on the unit interval. In their study, the authors of [4] proposed using the CB distribution to investigate the impact of Bernoulli variational autoencoders on intensity data that fall within the [0, 1] range. The results of their research indicated that the CB distribution leads to significant improvements in various metrics and datasets, including the generation of sharper image samples, and suggested a wider range of potential performance-optimized variational autoencoders. Despite the promising findings of the CB distribution, as far as we know, its associated mathematical and statistical properties as well as its associated QR model have yet to be explored. Thus, the primary objectives of this article are as follows:
(i)
To derive some mathematical and statistical properties of the CB distribution stated on the interval [0, 1] that have not yet been studied, including estimation and inference.
(ii)
To formulate a QR model for a unit response employing the exponentiated CB (ECB) distribution. In this way, the applications of the CB distribution are extended. The formulated regression model is a robust alternative to the beta regression and other models proposed in the literature. Another motivation for the proposed QR model comes from a real-world dataset.
The present article is organized into seven main sections. Section 2 is dedicated to deriving mathematical properties of the continuous Bernoulli (CB) distribution, while Section 3 focuses on the various point estimation procedures for its parameters. A novel QR model is introduced in Section 4. Monte Carlo simulations are used to evaluate the performance of point and interval estimators for both the CB distribution and the associated QR model in Section 5. The results are illustrated utilizing a real-world dataset in Section 6. Finally, Section 7 sketches some concluding remarks.

2. The Continuous Bernoulli Distribution

This section provides some known and new functions of the CB distribution.

2.1. Cumulative Distribution and Probability Density Functions

In artificial intelligence, the CB distribution is commonly encountered when modeling pixel intensities [28]. The cumulative distribution function (CDF) and PDF of the CB distribution are formulated as
$$F(x,\beta)=\begin{cases} x, & \beta = 1/2;\\[4pt] \dfrac{(1-\beta)\left[\left(\frac{\beta}{1-\beta}\right)^{x}-1\right]}{2\beta-1}, & \beta \neq 1/2; \end{cases} \tag{1}$$
and
$$f(x,\beta)=\begin{cases} 1, & \beta = 1/2;\\[4pt] \dfrac{(1-\beta)\left(\frac{\beta}{1-\beta}\right)^{x}}{2\beta-1}\,\log\left(\frac{\beta}{1-\beta}\right), & \beta \neq 1/2; \end{cases}$$
respectively, with $x \in (0,1)$, and $\beta \in (0,1)$ being a parameter related to the shape of the distribution. Alternatively, the PDF of the CB distribution can be written as
$$f(x,\beta)=\begin{cases} 1, & \beta = 1/2;\\[4pt] \dfrac{2(1-\beta)\tanh^{-1}(1-2\beta)}{1-2\beta}\left(\frac{\beta}{1-\beta}\right)^{x}, & \beta \neq 1/2; \end{cases}$$
where $\tanh^{-1}(w) = (1/2)\log((1+w)/(1-w))$, for $w \in (-1,1)$, is the inverse hyperbolic tangent function. We denote the CB distribution of parameter $\beta$ as $\mathrm{CB}(\beta)$. Note that the CB distribution corresponds to the uniform distribution on the unit interval when $\beta = 1/2$. Observe that there is an interesting recursive relation between the CB PDF and its derivatives. Since $f(x,\beta)$ is proportional to $(\beta/(1-\beta))^{x}$, each differentiation with respect to $x$ contributes one factor $\log(\beta/(1-\beta))$. For $\beta \neq 1/2$, this relation is stated as
$$\frac{\mathrm{d}^m f(x,\beta)}{\mathrm{d}x^m} = f(x,\beta)\,\log^{m}\left(\frac{\beta}{1-\beta}\right), \quad m \in \{1,2,\ldots\}.$$
Hence, it can be concluded that the PDF is decreasing, with a concave CDF, for $\beta < 1/2$, and increasing, with a convex CDF, for $\beta > 1/2$. Note that the term $\sum_{i=1}^{n} X_i$ is a sufficient statistic for estimating $\beta$ based on a sample $X_1,\ldots,X_n$ of size $n$ collected from the CB distribution.
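The CDF and PDF above can be checked against each other numerically. The following minimal Python sketch, which is not part of the original study, evaluates both functions and compares a midpoint-rule integral of the PDF with the CDF. The function names `cb_pdf`, `cb_cdf`, and `midpoint_integral` are illustrative choices, and the tolerance used to detect $\beta = 1/2$ is an assumption.

```python
import math

def cb_pdf(x, beta):
    """PDF of CB(beta) on (0, 1); equals the uniform density when beta = 1/2."""
    if abs(beta - 0.5) < 1e-12:  # assumed tolerance for the uniform case
        return 1.0
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * ratio**x * math.log(ratio) / (2.0 * beta - 1.0)

def cb_cdf(x, beta):
    """CDF of CB(beta) on (0, 1), following the closed form in (1)."""
    if abs(beta - 0.5) < 1e-12:
        return x
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Compare the integral of the PDF over (0, 0.6) with the CDF at 0.6.
for beta in (0.25, 0.75):
    area = midpoint_integral(lambda t: cb_pdf(t, beta), 0.0, 0.6)
    print(beta, round(area, 6), round(cb_cdf(0.6, beta), 6))
```

For each value of the parameter, the two printed numbers should agree up to the integration error, and the density should decrease in x for the smaller parameter value and increase for the larger one.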

2.2. Inverse Cumulative Distribution Function

From (1), we can obtain the CB inverse CDF, which is given by
$$F^{-1}(x,\beta)=\begin{cases} x, & \beta = 1/2;\\[4pt] \dfrac{\log\left(1+\frac{(2\beta-1)\,x}{1-\beta}\right)}{\log\left(\frac{\beta}{1-\beta}\right)}, & \beta \neq 1/2. \end{cases} \tag{2}$$
In Figure 1, we can see the plot of (2) for the values $\beta = 0.25$, $\beta = 0.5$, and $\beta = 0.75$. Generating data from the CB distribution is important in simulation studies related to it. To simulate a random variable (RV) with CB distribution, we propose Algorithm 1, which takes advantage of (2) to generate a sample of size $n$.
Algorithm 1: Approach that simulates the CB distribution using the inverse-transform method
[Algorithm listing provided as an image in the original article (Fractalfract 07 00386 i001); not reproduced here.]
Computational experiments were conducted utilizing Algorithm 1. Table 1 displays the results generated with $\beta = 0.25$, while Table 2 provides the results generated with $\beta = 0.75$, using $n \in \{25, 50, 100, 200, 500, 1000\}$ and performing 1000 Monte Carlo simulations for each $n$.
The simulation results demonstrate the experimental convergence of Algorithm 1. The theoretical mean and variance values for the CB distribution with $\beta = 0.25$ are 0.4102 and 0.0785354, respectively, while the theoretical mean and variance values with $\beta = 0.75$ are 0.5898 and 0.0785354, respectively.
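The inverse-transform idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Algorithm 1: it draws uniform variates, maps them through the inverse CDF in (2), and compares the sample mean and variance with the theoretical values quoted above. The names `cb_inverse_cdf` and `simulate_cb`, the seed, and the sample size are arbitrary choices.

```python
import math
import random

def cb_inverse_cdf(u, beta):
    """Inverse CDF of CB(beta) evaluated at u in [0, 1)."""
    if abs(beta - 0.5) < 1e-12:  # uniform case
        return u
    ratio = beta / (1.0 - beta)
    return math.log(1.0 + (2.0 * beta - 1.0) * u / (1.0 - beta)) / math.log(ratio)

def simulate_cb(n, beta, rng):
    """Draw n values from CB(beta) by the inverse-transform method."""
    return [cb_inverse_cdf(rng.random(), beta) for _ in range(n)]

rng = random.Random(2023)  # arbitrary seed, for reproducibility only
sample = simulate_cb(20000, 0.25, rng)
sample_mean = sum(sample) / len(sample)
sample_var = sum((v - sample_mean) ** 2 for v in sample) / (len(sample) - 1)
# These should be close to the theoretical values 0.4102 and 0.0785354.
print(round(sample_mean, 3), round(sample_var, 3))
```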

2.3. Hazard Rate Function

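The hazard rate function (HRF) of the CB distribution is stated as
$$h(x,\beta)=\frac{(1-\beta)^{1-x}\beta^{x}}{\beta-(1-\beta)^{1-x}\beta^{x}}\,\log\left(\frac{\beta}{1-\beta}\right), \quad x \in (0,1), \ \beta \neq 1/2.$$
The HRF is an increasing function in $x$ because its derivative, expressed as
$$\frac{\mathrm{d}h(x,\beta)}{\mathrm{d}x}=\frac{\beta(1-\beta)}{\left[\beta-(1-\beta)\left(\frac{\beta}{1-\beta}\right)^{x}\right]^{2}}\left(\frac{\beta}{1-\beta}\right)^{x}\log^{2}\left(\frac{\beta}{1-\beta}\right),$$
is positive for all $\beta \neq 1/2$. Furthermore, note that $\lim_{x \to 1} h(x,\beta) = \infty$ and
$$\lim_{x \to 0} h(x,\beta)=\begin{cases} 1, & \beta = 1/2;\\[4pt] \dfrac{1-\beta}{2\beta-1}\,\log\left(\frac{\beta}{1-\beta}\right), & \beta \neq 1/2. \end{cases}$$
Figure 2 displays the CB PDF, CDF, and HRF. From this figure, note the flexible shapes these functions have for the different values of the parameter $\beta$.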

2.4. Moments

Using the definition of the moment-generating function, for the case of the CB distribution, we obtain that
$$M_X(t)=E\left(e^{tX}\right)=\frac{\left[\beta\left(1+e^{t}\right)-1\right]\log\left(\frac{\beta}{1-\beta}\right)}{(2\beta-1)\left[\log\left(\frac{\beta}{1-\beta}\right)+t\right]}, \quad \beta \neq 1/2, \ t \in \mathbb{R}.$$
Note that $M_X(t) = (e^{t}-1)/t$ for $\beta = 1/2$. Thus, utilizing the relation given by $E(X^r) = \mathrm{d}^r M_X(t)/\mathrm{d}t^r|_{t=0}$, for $r \in \{1,2,\ldots\}$, the first four noncentral moments are, respectively, for $\beta \neq 1/2$, stated as
$$\begin{aligned} m_1 &= E(X) = \frac{\beta\omega+1-2\beta}{(2\beta-1)\,\omega},\\ m_2 &= E(X^2) = \frac{\beta\omega^{2}-2\beta\omega+4\beta-2}{(2\beta-1)\,\omega^{2}},\\ m_3 &= E(X^3) = \frac{\beta\omega^{3}-3\beta\omega^{2}+6\beta\omega-12\beta+6}{(2\beta-1)\,\omega^{3}},\\ m_4 &= E(X^4) = \frac{\beta\omega^{4}-4\beta\omega^{3}+12\beta\omega^{2}-24\beta\omega+48\beta-24}{(2\beta-1)\,\omega^{4}}, \end{aligned}$$
where $\omega = \log(\beta/(1-\beta)) = \log\beta - \log(1-\beta)$; expanding the powers of $\omega$ gives these moments in terms of $\log\beta$ and $\log(1-\beta)$.
Figure 3 shows some shapes of the CB moments. Note that, as $\beta$ increases, the mean increases, whereas the skewness decreases. According to the values of $\beta$, the variance increases for $\beta < 1/2$ and decreases for $\beta > 1/2$, but a reverse situation is detected for the kurtosis.
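As a numerical illustration of the moment formulas, the sketch below evaluates the first four noncentral moments in a compact form that is algebraically equivalent to the expressions above, writing them in terms of the single quantity log(β/(1−β)), called `log_ratio` in the code. The function name `cb_raw_moments` is hypothetical, and the code assumes β ≠ 1/2.

```python
import math

def cb_raw_moments(beta):
    """First four noncentral moments of CB(beta), assuming beta != 1/2."""
    log_ratio = math.log(beta / (1.0 - beta))
    d = 2.0 * beta - 1.0
    m1 = (beta * log_ratio + 1.0 - 2.0 * beta) / (d * log_ratio)
    m2 = (beta * log_ratio**2 - 2.0 * beta * log_ratio
          + 4.0 * beta - 2.0) / (d * log_ratio**2)
    m3 = (beta * log_ratio**3 - 3.0 * beta * log_ratio**2
          + 6.0 * beta * log_ratio - 12.0 * beta + 6.0) / (d * log_ratio**3)
    m4 = (beta * log_ratio**4 - 4.0 * beta * log_ratio**3
          + 12.0 * beta * log_ratio**2 - 24.0 * beta * log_ratio
          + 48.0 * beta - 24.0) / (d * log_ratio**4)
    return m1, m2, m3, m4

for beta in (0.25, 0.75):
    m1, m2, m3, m4 = cb_raw_moments(beta)
    variance = m2 - m1**2
    # The mean and variance can be compared with 0.4102 / 0.5898 and 0.0785354.
    print(beta, round(m1, 4), round(variance, 7))
```

A useful consistency check is the reflection property: if X follows CB(β), then 1 − X follows CB(1 − β), so the moments for β and 1 − β are linked through the binomial expansion of (1 − X)^r.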

2.5. Quantile Function

The QF, defined by $F^{-1}(q,\beta) = x_q(\beta)$, for $q \in (0,1)$, of the CB distribution can be obtained in a closed form. Let $X$ be a CB distributed RV with CDF as defined in (1). Then, the CB QF is given by
$$x_q(\beta)=\frac{\log\left(1+\frac{(2\beta-1)\,q}{1-\beta}\right)}{\log\left(\frac{\beta}{1-\beta}\right)}, \quad \beta \neq 1/2, \ q \in (0,1). \tag{3}$$
Note that the CB QF is equal to $q$ for $\beta = 1/2$, whereas the corresponding median is derived as
$$x_{0.5}(\beta)=\left[\log\left(\frac{\beta}{1-\beta}\right)\right]^{-1}\log\left(\frac{0.5}{1-\beta}\right),$$
for $\beta \neq 1/2$. Notice that if the RV $Q$ related to $q$ in (3) has a uniform distribution on (0, 1), which is denoted as U(0, 1), then the RV associated with $x_Q(\beta)$ has the CB distribution.

2.6. Mean Residual Life Function

The mean residual life (MRL), also known as life expectancy, is a critical parameter in the field of reliability that characterizes the assumed distribution. Unlike the hazard rate function (HRF), which only considers the risk of immediate failure, the MRL summarizes the entire residual life distribution. The MRL function of a positive random variable at time x is defined as
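$$\mathrm{MRL}(x)=\frac{1}{1-F(x)}\int_{x}^{\infty}\left[1-F(y)\right]\mathrm{d}y,$$
where $F$ is the lifetime model CDF. For the CB distribution, $F(y) = 1$ for $y \geq 1$, so the integral runs up to 1. Therefore, the MRL function of the CB distribution is stated as
$$\mathrm{MRL}(x)=\frac{\beta(1-x)\log\left(\frac{\beta}{1-\beta}\right)-\beta\left[1+\left(\frac{\beta}{1-\beta}\right)^{x}\right]+\left(\frac{\beta}{1-\beta}\right)^{x}}{\left[\beta-(1-\beta)\left(\frac{\beta}{1-\beta}\right)^{x}\right]\log\left(\frac{\beta}{1-\beta}\right)}, \quad \beta \neq 1/2,$$
while MRL(0) is obtained as
$$\mathrm{MRL}(0)=\frac{\beta\log\left(\frac{\beta}{1-\beta}\right)+1-2\beta}{(2\beta-1)\log\left(\frac{\beta}{1-\beta}\right)}, \quad \beta \neq 1/2,$$
which is equal to the corresponding expected value, whereas MRL(1) = 0. Observe that $\mathrm{MRL}(x) = (1-x)/2$ for $\beta = 1/2$.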
We can comment about the shapes of the MRL function via its HRF. The following results hold for both discrete and continuous lifetimes [29]:
  • If the HRF is (strictly) increasing, then the MRL is (strictly) decreasing.
  • If the HRF is (strictly) decreasing, then the MRL is (strictly) increasing.
  • The HRF is a constant function (exponential or geometric distribution) if and only if the MRL is a constant.
We can conclude that, since the HRF of the CB distribution is strictly increasing, its MRL function is strictly decreasing. For details about the MRL, see [30].
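The closed-form MRL can be compared with its integral definition. The following Python sketch, given only as an illustration, computes the MRL of the CB distribution both ways for β ≠ 1/2; the function names and the number of midpoint steps are assumptions.

```python
import math

def cb_survival(x, beta):
    """Survival function 1 - F(x, beta) of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (beta - (1.0 - beta) * ratio**x) / (2.0 * beta - 1.0)

def cb_mrl(x, beta):
    """Closed-form mean residual life of CB(beta) for 0 <= x < 1, beta != 1/2."""
    ratio = beta / (1.0 - beta)
    log_ratio = math.log(ratio)
    numerator = beta * (1.0 - x) * log_ratio - beta * (1.0 + ratio**x) + ratio**x
    denominator = (beta - (1.0 - beta) * ratio**x) * log_ratio
    return numerator / denominator

def cb_mrl_numeric(x, beta, steps=20000):
    """MRL from its definition, integrating the survival function over (x, 1)."""
    h = (1.0 - x) / steps
    integral = h * sum(cb_survival(x + (k + 0.5) * h, beta) for k in range(steps))
    return integral / cb_survival(x, beta)

for x in (0.0, 0.3, 0.6, 0.9):
    print(x, round(cb_mrl(x, 0.25), 6), round(cb_mrl_numeric(x, 0.25), 6))
```

The two columns should agree up to the integration error, the value at x = 0 should match the mean of the distribution, and the values should decrease in x, in line with the increasing HRF.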

2.7. Incomplete Moments and Lorenz Curve

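For $\beta \neq 1/2$ and $r \in \{1,2,\ldots\}$, the $r$th incomplete moment of the CB distribution is given by
$$I(x,r)=\int_{0}^{x}z^{r}f(z,\beta)\,\mathrm{d}z=\frac{x^{r}(1-\beta)}{2\beta-1}\left[\frac{r\,\Gamma\left(r,-x\log\left(\frac{\beta}{1-\beta}\right)\right)}{\left[-x\log\left(\frac{\beta}{1-\beta}\right)\right]^{r}}-\frac{\Gamma(r+1)}{\left[-x\log\left(\frac{\beta}{1-\beta}\right)\right]^{r}}+\left(\frac{\beta}{1-\beta}\right)^{x}\right],$$
for $x \in (0,1)$, where $\Gamma(\cdot)$ and $\Gamma(\cdot,\cdot)$ are the gamma function and upper incomplete gamma function, respectively. For $\beta = 1/2$, the CB $r$th incomplete moment is equal to $x^{r+1}/(r+1)$. Note that the equations $\lim_{x \to 1} I(x,r) = E(X^r)$ and $F(x,\beta) = I(x,0)$ hold.
As a result of the incomplete moments, the Lorenz curve of the CB distribution for $\beta \neq 1/2$ is expressed by
$$L(p)=\frac{1}{E(X)}\int_{0}^{x_p(\beta)}x f(x,\beta)\,\mathrm{d}x=\frac{I\left(x_p(\beta),1\right)}{E(X)}=\frac{\left[\beta(2p-1)+1-p\right]\log\left(\frac{1+\beta(2p-1)-p}{1-\beta}\right)-2p\beta+p}{1-2\beta+\beta\log\left(\frac{\beta}{1-\beta}\right)},$$
where $x_p(\beta)$ is the CB QF evaluated at level $p \in (0,1)$.
Then, $L(p) = p^{2}$ is obtained for $\beta = 1/2$.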

2.8. Stress Strength Reliability

A system subject to stress works properly if its strength is higher than the applied stress. The stress–strength parameter of a life distribution is an important indicator of reliability. This parameter is the working probability of the system and a reliability measure, which is defined as
$$R=P(Y<X)=\int_{0}^{1}f_X(x)\,F_Y(x)\,\mathrm{d}x,$$
where $f_X$ is the PDF of the strength ($X$) and $F_Y$ is the CDF of the stress ($Y$). Now, consider $X$ and $Y$ as two independent CB distributed RVs with parameters $\beta_1$ and $\beta_2$, respectively. Then, the CB stress–strength parameter is given by
$$R=\frac{\left[2\beta_1(\beta_2-1)-\beta_2+1\right]\log\left(\frac{\beta_2}{1-\beta_2}\right)+(2\beta_2-1)\,\beta_1\log\left(\frac{\beta_1}{1-\beta_1}\right)}{(2\beta_2-1)(2\beta_1-1)\log\left(\frac{\beta_1\beta_2}{(1-\beta_1)(1-\beta_2)}\right)}.$$
Note that, when $\beta_1 = \beta_2 \neq 1/2$, then $R = 1/2$.
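The closed-form expression for R can be checked against a direct numerical evaluation of the integral of f_X(x) F_Y(x) over (0, 1). The sketch below is illustrative only; it assumes both parameters differ from 1/2 and that β₂ ≠ 1 − β₁, since otherwise the logarithm in the denominator vanishes. All function names are hypothetical.

```python
import math

def cb_pdf(x, beta):
    """PDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * ratio**x * math.log(ratio) / (2.0 * beta - 1.0)

def cb_cdf(x, beta):
    """CDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def stress_strength_closed_form(beta1, beta2):
    """Closed-form R for X ~ CB(beta1) and Y ~ CB(beta2), independent."""
    log1 = math.log(beta1 / (1.0 - beta1))
    log2 = math.log(beta2 / (1.0 - beta2))
    numerator = ((2.0 * beta1 * (beta2 - 1.0) - beta2 + 1.0) * log2
                 + (2.0 * beta2 - 1.0) * beta1 * log1)
    denominator = (2.0 * beta2 - 1.0) * (2.0 * beta1 - 1.0) * (log1 + log2)
    return numerator / denominator

def stress_strength_numeric(beta1, beta2, steps=20000):
    """Midpoint-rule value of the integral of f_X(x) * F_Y(x) over (0, 1)."""
    h = 1.0 / steps
    return h * sum(cb_pdf((k + 0.5) * h, beta1) * cb_cdf((k + 0.5) * h, beta2)
                   for k in range(steps))

print(round(stress_strength_closed_form(0.25, 0.6), 6),
      round(stress_strength_numeric(0.25, 0.6), 6))
```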

3. Point Estimation for the Parameter of the CB Distribution

This section provides some point estimation methods, such as maximum likelihood (ML), moment, percentile (PE), least squares (LS), weighted least squares (WS), Anderson–Darling (AD), and Cramér–von Mises (CM), to determine the model parameter $\beta$. Note that, for the estimation processes of the above methods, we focus on $\beta \neq 1/2$.

3.1. Maximum Likelihood Method

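Consider $X_1,\ldots,X_n$ as a sample of size $n$ collected from the CB distribution, with $x_1,\ldots,x_n$ as its observed values. Thus, the associated log-likelihood function is established as
$$\ell(\beta)=n\log(1-\beta)-n\log(2\beta-1)+n\log\left(\log\left(\frac{\beta}{1-\beta}\right)\right)+\log\left(\frac{\beta}{1-\beta}\right)\sum_{i=1}^{n}x_i. \tag{4}$$
For $\beta < 1/2$, both $2\beta-1$ and $\log(\beta/(1-\beta))$ are negative, so the second and third terms are to be read jointly as $n\log[\log(\beta/(1-\beta))/(2\beta-1)]$. Taking the first derivative of the expression stated in (4) with respect to $\beta$, we obtain
$$\frac{\mathrm{d}\ell(\beta)}{\mathrm{d}\beta}=-\frac{n}{1-\beta}-\frac{2n}{2\beta-1}+\frac{n}{\beta(1-\beta)\log\left(\frac{\beta}{1-\beta}\right)}+\frac{1}{\beta(1-\beta)}\sum_{i=1}^{n}x_i. \tag{5}$$
Since the equation $\mathrm{d}\ell(\beta)/\mathrm{d}\beta = 0$ obtained from (5) is a nonlinear function of $\beta$, the ML estimator $\hat{\beta}$ must be computed using numerical methods. For any smooth PDF, it is widely recognized that, under regularity conditions, $\hat{\beta}$ is asymptotically normally distributed with mean $\beta$ and variance $I_{\beta}^{-1}$, with $I_{\beta} = -E(\mathrm{d}^{2}\ell(\beta)/\mathrm{d}\beta^{2})$ being the Fisher information.
After performing several computations, the inverse of the Fisher information can be obtained as
$$I_{\beta}^{-1}=\frac{\left[\beta(1-\beta)(2\beta-1)\log\left(\frac{\beta}{1-\beta}\right)\right]^{2}}{n\left\{1-\beta(1-\beta)\left[4+\log^{2}\left(\frac{\beta}{1-\beta}\right)\right]\right\}}.$$
Based on the previous results, the asymptotic $100(1-\theta)\%$ confidence interval of the parameter $\beta$ is given as
$$\hat{\beta}-z_{\theta/2}\,\frac{\beta(1-\beta)(2\beta-1)\log\left(\frac{\beta}{1-\beta}\right)}{\sqrt{n\left\{1-\beta(1-\beta)\left[4+\log^{2}\left(\frac{\beta}{1-\beta}\right)\right]\right\}}}\leq\beta\leq\hat{\beta}+z_{\theta/2}\,\frac{\beta(1-\beta)(2\beta-1)\log\left(\frac{\beta}{1-\beta}\right)}{\sqrt{n\left\{1-\beta(1-\beta)\left[4+\log^{2}\left(\frac{\beta}{1-\beta}\right)\right]\right\}}},$$
where $z_{\theta/2}$ refers to the $(\theta/2) \times 100$th upper percentile of the standard normal distribution and, in practice, the standard error is evaluated at $\beta = \hat{\beta}$.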
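To make the ML procedure concrete, the following Python sketch solves the score equation by bisection and builds the asymptotic confidence interval from the inverse Fisher information. It is a minimal illustration, not the routine used in the article, which relies on R. It assumes that the sample mean differs from 1/2, so that the root can be bracketed on one side of β = 1/2, and it creates artificial data with the inverse CDF; all names, the seed, and the bracketing constants are assumptions.

```python
import math
import random
from statistics import NormalDist

def cb_inverse_cdf(u, beta):
    """Inverse CDF of CB(beta), beta != 1/2 (used only to create test data)."""
    ratio = beta / (1.0 - beta)
    return math.log(1.0 + (2.0 * beta - 1.0) * u / (1.0 - beta)) / math.log(ratio)

def cb_score(beta, total, n):
    """Derivative of the CB log-likelihood with respect to beta (beta != 1/2)."""
    log_ratio = math.log(beta / (1.0 - beta))
    bq = beta * (1.0 - beta)
    return (-n / (1.0 - beta) - 2.0 * n / (2.0 * beta - 1.0)
            + n / (bq * log_ratio) + total / bq)

def cb_mle(sample, tol=1e-10):
    """Bisection on the score; the bracket side is chosen from the sample mean."""
    n, total = len(sample), sum(sample)
    if total / n < 0.5:
        lo, hi = 1e-6, 0.5 - 1e-6
    else:
        lo, hi = 0.5 + 1e-6, 1.0 - 1e-6
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cb_score(mid, total, n) > 0.0:  # score is positive left of the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cb_std_error(beta, n):
    """Square root of the inverse Fisher information evaluated at beta."""
    log_ratio = math.log(beta / (1.0 - beta))
    bq = beta * (1.0 - beta)
    return (bq * (2.0 * beta - 1.0) * log_ratio
            / math.sqrt(n * (1.0 - bq * (4.0 + log_ratio**2))))

def cb_wald_interval(sample, level=0.95):
    """Asymptotic interval: estimate plus/minus z times the standard error."""
    beta_hat = cb_mle(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * cb_std_error(beta_hat, len(sample))
    return beta_hat - half_width, beta_hat, beta_hat + half_width

rng = random.Random(7)  # arbitrary seed
data = [cb_inverse_cdf(rng.random(), 0.25) for _ in range(5000)]
lower, beta_hat, upper = cb_wald_interval(data)
print(round(lower, 4), round(beta_hat, 4), round(upper, 4))
```

Both the score and the standard error are numerically delicate very close to β = 1/2, where numerator and denominator tend to zero together, so a production implementation would need a dedicated treatment of that neighborhood.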

3.2. Moment Estimation

The parameter estimation for the CB distribution can be performed using the moment method, as its expected value can be expressed analytically. This method involves equating the first moment of the sample to the expected value of the CB distribution. The moment estimate of the parameter $\beta$ is then obtained as the solution of
$$\bar{x}-\frac{\beta\log\left(\frac{\beta}{1-\beta}\right)+1-2\beta}{(2\beta-1)\log\left(\frac{\beta}{1-\beta}\right)}=0, \tag{7}$$
where $\bar{x}$ is the sample mean. Note that the equation stated in (7) must be solved employing a numerical method.

3.3. Percentile Estimation

The percentile method, originally proposed in [31], can be applied straightforwardly to estimate $\beta$ because the QF of the CB distribution has a closed form. Consider $X_{(1)},\ldots,X_{(n)}$ as the order statistics of a sample of size $n$ from the CB distribution and $x_{(1)},\ldots,x_{(n)}$ as their observed values. An estimate of the CDF can be obtained as $\hat{F}(x_{(i)}) = i/(n+1)$, for $i \in \{1,\ldots,n\}$. Then, the PE estimate of $\beta$ is reached by minimizing, with respect to $\beta$, the function given by
$$\mathrm{PE}(\beta)=\sum_{i=1}^{n}\left[x_{(i)}-x_{q_i}(\beta)\right]^{2},$$
where $x_q$ is defined in (3) and $q_i = i/(n+1)$. Therefore, the PE estimate $\hat{\beta}$ is the solution in $\beta$ of the equation formulated as
$$\frac{\mathrm{d}\,\mathrm{PE}(\beta)}{\mathrm{d}\beta}=\frac{2}{\beta(1-\beta)\log^{2}\left(\frac{\beta}{1-\beta}\right)}\sum_{i=1}^{n}\left[x_{(i)}-\frac{\log\left(1+\frac{i(2\beta-1)}{(n+1)(1-\beta)}\right)}{\log\left(\frac{\beta}{1-\beta}\right)}\right]\times\frac{\left[\beta(2i-1-n)+n-i+1\right]\log\left(1+\frac{i(2\beta-1)}{(n+1)(1-\beta)}\right)-i\beta\log\left(\frac{\beta}{1-\beta}\right)}{(n+1)(1-\beta)+i(2\beta-1)}=0.$$

3.4. Other Methods of Estimation

Next, the LS, WS, AD, and CM methods of estimation for $\beta$ are provided. According to these methods, the functions to be used for estimating $\beta$ are expressed as
$$\begin{aligned} \mathrm{LS}(\beta)&=\sum_{i=1}^{n}\left[F(x_{(i)},\beta)-\frac{i}{n+1}\right]^{2},\\ \mathrm{WS}(\beta)&=\sum_{i=1}^{n}\frac{(n+2)(n+1)^{2}}{(n-i+1)\,i}\left[F(x_{(i)},\beta)-\frac{i}{n+1}\right]^{2},\\ \mathrm{AD}(\beta)&=-n-\sum_{i=1}^{n}\frac{2i-1}{n}\left[\log\left(1-F(x_{(n+1-i)},\beta)\right)+\log\left(F(x_{(i)},\beta)\right)\right],\\ \mathrm{CM}(\beta)&=\frac{1}{12n}+\sum_{i=1}^{n}\left[F(x_{(i)},\beta)-\frac{2i-1}{2n}\right]^{2}, \end{aligned}$$
where $x_{(1)},\ldots,x_{(n)}$ are the ordered observed values of a sample of size $n$ drawn from the CB distribution. Therefore, the LS, WS, AD, and CM estimates of $\beta$ are given by
$$\hat{\beta}_{\mathrm{LS}}=\underset{\beta}{\operatorname{argmin}}\ \mathrm{LS}(\beta), \tag{9}$$
$$\hat{\beta}_{\mathrm{WS}}=\underset{\beta}{\operatorname{argmin}}\ \mathrm{WS}(\beta), \tag{10}$$
$$\hat{\beta}_{\mathrm{AD}}=\underset{\beta}{\operatorname{argmin}}\ \mathrm{AD}(\beta), \tag{11}$$
$$\hat{\beta}_{\mathrm{CM}}=\underset{\beta}{\operatorname{argmin}}\ \mathrm{CM}(\beta), \tag{12}$$
respectively. It is important to emphasize that the problems presented in (4), (7), (8), (9), (10), (11) and (12) do not have explicit solutions and must be solved through numerical methods. However, the corresponding objective functions can be optimized directly using software such as R or Matlab.
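The minimum-distance estimators can be illustrated with a simple one-dimensional search. The sketch below, written in Python rather than R or Matlab, minimizes the LS and CM objective functions with a golden-section search on (0, 1). It assumes, without proof, that each objective is unimodal in β on the search interval; the function names, the seed, and the search bounds are arbitrary.

```python
import math
import random

def cb_cdf(x, beta):
    """CDF of CB(beta); the uniform case is used very close to beta = 1/2."""
    if abs(beta - 0.5) < 1e-8:
        return x
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def cb_inverse_cdf(u, beta):
    """Inverse CDF of CB(beta), beta != 1/2 (used only to create test data)."""
    ratio = beta / (1.0 - beta)
    return math.log(1.0 + (2.0 * beta - 1.0) * u / (1.0 - beta)) / math.log(ratio)

def ls_objective(beta, ordered):
    n = len(ordered)
    return sum((cb_cdf(x, beta) - i / (n + 1.0)) ** 2
               for i, x in enumerate(ordered, start=1))

def cm_objective(beta, ordered):
    n = len(ordered)
    return 1.0 / (12.0 * n) + sum(
        (cb_cdf(x, beta) - (2.0 * i - 1.0) / (2.0 * n)) ** 2
        for i, x in enumerate(ordered, start=1))

def golden_section_argmin(func, lo=1e-4, hi=1.0 - 1e-4, tol=1e-8):
    """Minimize func on [lo, hi]; unimodality is assumed, not checked."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:        # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

rng = random.Random(11)  # arbitrary seed
ordered = sorted(cb_inverse_cdf(rng.random(), 0.75) for _ in range(2000))
beta_ls = golden_section_argmin(lambda b: ls_objective(b, ordered))
beta_cm = golden_section_argmin(lambda b: cm_objective(b, ordered))
print(round(beta_ls, 3), round(beta_cm, 3))
```

The WS and AD objectives can be handled in the same way by swapping in the corresponding function.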

4. Associated QR Model

This section states a QR model employing the exponentiated CB distribution.

4.1. The Exponentiated CB Distribution

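Although the CB distribution has a closed-form mean function and QF, its PDF and CDF cannot be parameterized using these functions due to their dependence on the parameter $\beta$. Therefore, we propose an alternative QR model that is based on the ECB distribution, which is a generalized form of the CB distribution. The ECB CDF and PDF are expressed as
$$G(y,\alpha,\beta)=\begin{cases} y^{\alpha}, & \beta = 1/2;\\[4pt] \left[\dfrac{(1-\beta)\left[\left(\frac{\beta}{1-\beta}\right)^{y}-1\right]}{2\beta-1}\right]^{\alpha}, & \beta \neq 1/2; \end{cases}$$
and
$$g(y,\alpha,\beta)=\begin{cases} \alpha\,y^{\alpha-1}, & \beta = 1/2;\\[4pt] \dfrac{\alpha\left(\frac{\beta}{1-\beta}\right)^{y}\log\left(\frac{\beta}{1-\beta}\right)}{\left(\frac{\beta}{1-\beta}\right)^{y}-1}\left[\dfrac{(1-\beta)\left[\left(\frac{\beta}{1-\beta}\right)^{y}-1\right]}{2\beta-1}\right]^{\alpha}, & \beta \neq 1/2; \end{cases}$$
respectively, where $y \in (0,1)$, $\alpha > 0$ is a parameter related to the shape of the distribution, and $\beta \in (0,1)$. The CDF of the ECB distribution is $G(y,\alpha,\beta) = (F(y,\beta))^{\alpha}$, and it can be named a Lehmann-type I CB distribution. For $\beta = 1/2$, the ECB distribution is the power function distribution.
The $\tau \times 100$th quantile of the ECB distribution is given by
$$Q(\tau,\alpha,\beta)=\begin{cases} \tau^{1/\alpha}, & \beta = 1/2;\\[4pt] \left[\log\left(\frac{\beta}{1-\beta}\right)\right]^{-1}\log\left(1+\dfrac{(2\beta-1)\,\tau^{1/\alpha}}{1-\beta}\right), & \beta \neq 1/2; \end{cases}$$
where $\tau \in (0,1)$.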
By using the QF of the ECB distribution, we can now reparameterize it in terms of its quantiles. Observe that, when $\beta = 1/2$, the ECB distribution can be reparameterized as a particular case of the Kumaraswamy distribution based on the quantile form, as presented in [17]. Here, we consider $\beta \neq 1/2$ and provide the PDF form of the ECB distribution based on its QF.
Consider $\mu = Q(\tau,\alpha,\beta)$ and $\alpha = \log(\tau)\left[\log F(\mu,\beta)\right]^{-1}$, where $F(\cdot,\beta)$ is the CB CDF given in (1), so that $F(\mu,\beta) = (1-\beta)\left[(\beta/(1-\beta))^{\mu}-1\right]/(2\beta-1)$. Hence, the reparameterized ECB CDF and PDF are introduced by
$$G(y,\beta,\mu,\tau)=\tau^{\,\log F(y,\beta)\,\left[\log F(\mu,\beta)\right]^{-1}} \tag{13}$$
and
$$g(y,\beta,\mu,\tau)=\frac{\log(\tau)\left(\frac{\beta}{1-\beta}\right)^{y}\log\left(\frac{\beta}{1-\beta}\right)\,\tau^{\,\log F(y,\beta)\,\left[\log F(\mu,\beta)\right]^{-1}}}{\log F(\mu,\beta)\left[\left(\frac{\beta}{1-\beta}\right)^{y}-1\right]}. \tag{14}$$
The reparameterized PDF stated in (14) corresponds to the CDF presented in (13), where $\beta \in (0,1)$ is a parameter related to the shape of the distribution, $\mu \in (0,1)$ is the quantile parameter, and $\tau \in (0,1)$ is a known quantile level. We denote the RV associated with the CDF given in (13) as $Y \sim \mathrm{ECB}(\beta,\mu,\tau)$. Figure 4 illustrates the diverse shapes of the reparameterized PDF, which exhibit characteristics such as U-shape, unimodality, increasing, decreasing, and constancy.
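The key property of the reparameterization is that the CDF in (13) equals τ at y = μ, so μ is the τth quantile. The following Python sketch, offered as an illustration for β ≠ 1/2, codes the reparameterized CDF and PDF and checks the PDF against a finite difference of the CDF. The function names and the parameter values in the usage lines are hypothetical.

```python
import math

def cb_cdf(x, beta):
    """CDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def ecb_alpha(mu, beta, tau):
    """Exponent alpha implied by the quantile parameter mu at level tau."""
    return math.log(tau) / math.log(cb_cdf(mu, beta))

def ecb_cdf(y, beta, mu, tau):
    """Reparameterized ECB CDF, as in (13)."""
    return tau ** (math.log(cb_cdf(y, beta)) / math.log(cb_cdf(mu, beta)))

def ecb_pdf(y, beta, mu, tau):
    """Reparameterized ECB PDF, as in (14)."""
    ratio = beta / (1.0 - beta)
    log_f_mu = math.log(cb_cdf(mu, beta))
    return (math.log(tau) * ratio**y * math.log(ratio) * ecb_cdf(y, beta, mu, tau)
            / (log_f_mu * (ratio**y - 1.0)))

beta, mu, tau = 0.75, 0.6, 0.5  # illustrative values only
print(ecb_cdf(mu, beta, mu, tau))  # should equal tau up to rounding
h = 1e-5
finite_diff = (ecb_cdf(0.4 + h, beta, mu, tau) - ecb_cdf(0.4 - h, beta, mu, tau)) / (2.0 * h)
print(round(finite_diff, 5), round(ecb_pdf(0.4, beta, mu, tau), 5))
```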

4.2. ECB Quantile Regression Model

Once the reparameterized ECB distribution is defined, a QR model can be introduced. Suppose we have $n$ observations $y_1,\ldots,y_n$ from the reparameterized distribution with its PDF stated in (14). Then, the ECB QR model can be expressed as
$$g(\mu_i)=\boldsymbol{x}_i^{\top}\boldsymbol{\beta}. \tag{15}$$
Here, $\boldsymbol{x}_i = (1, x_{i1},\ldots,x_{ip})^{\top}$ is the vector of independent variable values, $\boldsymbol{\beta} = (\beta_0,\beta_1,\ldots,\beta_p)^{\top}$ is the vector of unknown regression coefficients, and $g: (0,1) \to \mathbb{R}$ is a strictly monotonic and twice differentiable link function used to relate independent variables with the conditional quantile of the dependent variable. Common link functions include the logit, probit, log-log, and Cauchy structures. Thus, $\mu_i$ is obtained by inverting the model given in (15) such that $\mu_i = g^{-1}(\boldsymbol{x}_i^{\top}\boldsymbol{\beta})$, for $i \in \{1,\ldots,n\}$. In this proposed regression model, we use the logit function for linking, stated as $g(\mu_i) = \log(\mu_i/(1-\mu_i))$, for $i \in \{1,\ldots,n\}$. If we take $\tau = 0.5$, the link function connects the median response to the independent variables in the ECB QR model.

4.3. ML Estimation of the Regression Coefficients

Assume that we have $n$ independent RVs $Y_1,\ldots,Y_n$ following the ECB distribution, with $\beta$ being a parameter related to the shape of the distribution, $\mu_i$ a quantile parameter, and $\tau$ a known quantile level. The quantile parameter is obtained using the logistic function formulated as
$$\mu_i=\frac{e^{\boldsymbol{x}_i^{\top}\boldsymbol{\beta}}}{1+e^{\boldsymbol{x}_i^{\top}\boldsymbol{\beta}}}, \tag{16}$$
where $\boldsymbol{x}_i$ is the vector of independent variable values for the $i$th observation, and $\boldsymbol{\beta}$ is the vector of unknown regression coefficients. The parameter vector $\vartheta = (\beta, \boldsymbol{\beta}^{\top})^{\top}$ has $p+2$ unknown elements. To shorten the notation, let $\rho = \beta/(1-\beta)$ and let $F(\cdot,\beta)$ be the CB CDF given in (1), so that $\log F(\mu_i,\beta) = \log\left[(1-\beta)(\rho^{\mu_i}-1)/(2\beta-1)\right]$. The log-likelihood function of the ECB regression model is expressed as
$$\ell(\vartheta)=n\log(\log(\tau))+n\log(\log(\rho))+\log(\rho)\sum_{i=1}^{n}y_i-\sum_{i=1}^{n}\log\left(\log F(\mu_i,\beta)\right)-\sum_{i=1}^{n}\log\left(\rho^{y_i}-1\right)+\log(\tau)\sum_{i=1}^{n}\log F(y_i,\beta)\left[\log F(\mu_i,\beta)\right]^{-1}. \tag{17}$$
Since $\log(\tau)$ and $\log F(\mu_i,\beta)$ are both negative, and $\log(\rho)$ and $\rho^{y_i}-1$ share the same sign, the first and fourth terms, as well as the second and fifth terms, are to be combined as logarithms of the positive ratios $\log(\tau)/\log F(\mu_i,\beta)$ and $\log(\rho)/(\rho^{y_i}-1)$. Thus, the first derivatives of the function established in (17) with respect to the model parameters are given by
$$\frac{\partial\ell(\vartheta)}{\partial\beta_r}=-\sum_{i=1}^{n}\frac{\rho^{\mu_i}\mu_i(1-\mu_i)\,x_{ir}\log(\rho)}{\left(\rho^{\mu_i}-1\right)\log F(\mu_i,\beta)}-\sum_{i=1}^{n}\frac{\rho^{\mu_i}\mu_i(1-\mu_i)\,x_{ir}\log(\tau)\log(\rho)\log F(y_i,\beta)}{\log^{2}F(\mu_i,\beta)\left(\rho^{\mu_i}-1\right)}, \quad r \in \{0,1,\ldots,p\},$$
$$\begin{aligned} \frac{\partial\ell(\vartheta)}{\partial\beta}={}&\frac{n}{\beta(1-\beta)\log(\rho)}+\frac{1}{\beta(1-\beta)}\sum_{i=1}^{n}y_i-\sum_{i=1}^{n}\frac{y_i\,\rho^{y_i}}{\beta(1-\beta)\left(\rho^{y_i}-1\right)}\\ &-\sum_{i=1}^{n}\frac{\rho^{\mu_i}\left[(2\beta-1)\mu_i-\beta\right]+\beta}{\beta(1-\beta)(2\beta-1)\left(\rho^{\mu_i}-1\right)\log F(\mu_i,\beta)}\\ &+2\log(\tau)\sum_{i=1}^{n}\frac{\left[\beta(y_i-1/2)-y_i/2\right]\rho^{y_i}+\beta/2}{\beta(2\beta-1)(1-\beta)\left(\rho^{y_i}-1\right)\log F(\mu_i,\beta)}\\ &-2\log(\tau)\sum_{i=1}^{n}\frac{\left\{\left[\beta(\mu_i-1/2)-\mu_i/2\right]\rho^{\mu_i}+\beta/2\right\}\log F(y_i,\beta)}{\beta(2\beta-1)(1-\beta)\log^{2}F(\mu_i,\beta)\left(\rho^{\mu_i}-1\right)}. \end{aligned}$$
Once the derivatives are set to zero, the resulting equations should be solved using numerical methods since they are nonlinear functions. The formula presented in (17) can be directly maximized utilizing the R software to obtain these solutions. It is important to note that the asymptotic distribution of $(\hat{\vartheta}-\vartheta)$ is a multivariate normal $\mathrm{N}_{p+2}(\boldsymbol{0}, I^{-1}(\vartheta))$, where $I(\vartheta)$ is the Fisher information matrix. However, for numerical applications, the $(p+2) \times (p+2)$ observed information matrix is often used instead of $I(\vartheta)$ to make inference about the corresponding parameters.

4.4. Residual Analysis for Model Fitting

Validating a fitted regression model involves analyzing residuals. The randomized quantile (RQ) residual is a preferred method for this validation. The RQ residual, proposed in [32], is defined as
$$\hat{r}_i=\Phi^{-1}\left(G(y_i,\hat{\beta},\hat{\mu}_i,\tau)\right), \quad i \in \{1,\ldots,n\},$$
where $\Phi^{-1}$ is the standard normal QF, $G(y,\beta,\mu,\tau)$ is the CDF of the ECB model, and $\hat{\mu}_i$ is obtained from (16) evaluated at the estimated coefficients. If the fitted regression is valid, the RQ residuals should follow a standard normal distribution.
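As a small illustration of the residual definition, the Python sketch below maps responses through the reparameterized ECB CDF and then through the standard normal quantile function. It assumes β ≠ 1/2 and uses made-up responses and fitted quantiles, not results from the article; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def cb_cdf(x, beta):
    """CDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def ecb_cdf(y, beta, mu, tau):
    """Reparameterized ECB CDF with quantile parameter mu at level tau."""
    return tau ** (math.log(cb_cdf(y, beta)) / math.log(cb_cdf(mu, beta)))

def quantile_residuals(responses, fitted_mu, beta_hat, tau):
    """Residuals obtained by applying the standard normal QF to the fitted CDF."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(ecb_cdf(y, beta_hat, mu, tau))
            for y, mu in zip(responses, fitted_mu)]

# Hypothetical responses and fitted medians (tau = 0.5), for illustration only.
residuals = quantile_residuals([0.35, 0.60, 0.82], [0.60, 0.60, 0.60], 0.75, 0.5)
print([round(r, 3) for r in residuals])
```

A response equal to its fitted median yields a residual of zero, and responses below or above the fitted median yield negative or positive residuals, respectively.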

5. Simulation Studies

In this section, we discuss simulation studies for point estimation of the CB parameter and estimation of the ECB QR parameters.

5.1. Computational Framework

The experiments were conducted on a computer with Windows 10 for 64 bits, 8 Gigabytes of RAM, and an Intel Core i7-4510U 2-2.60 GHz processor. The optimization of the log-likelihood function can be performed directly by software such as R or Matlab.
When using R, the constrOptim function of the stats package and the maxLik package can be employed to obtain the point estimates of the corresponding parameters in the simulation studies and the real data analysis. These tools numerically provide the parameter estimates and the associated estimated standard errors based on diverse algorithms. The Nelder–Mead optimization routine can be utilized to obtain these estimates. Such tools may only be employed in specific contexts, which is their limitation. Our computations were carried out with the maxLik and goftest packages of the R software.
For more details about R packages related to QR, the interested reader is referred to a package named unitquantreg, which was developed following the style of the stats::lm function [33]. The unitquantreg() function of this package is flexible and allows us, by means of the ns() function, to utilize regression splines. This function is available in the splines package. Moreover, it is possible to employ the gam() function of an R package named mgcv.
The unitquantreg package can be obtained from https://github.com/AndrMenezes/unitquantreg (accessed on 5 April 2023) and installed using
  • devtools::install_github("AndrMenezes/unitquantreg").
The unitquantreg package contains 15 distributions, and our new distribution can be added to this package.

5.2. Point Estimation

For the simulations of point estimation, we vary $n \in \{25, 50, 100, 200, 500, 1000\}$ with $M = 1000$ Monte Carlo replicates from the CB distribution, with $\beta \in \{0.25, 0.75\}$. The statistical indicators used are the empirical mean, bias, and mean square error (MSE) of the estimators, given by $\mathrm{Mean}(\hat{\beta}) = (1/M)\sum_{i=1}^{M}\hat{\beta}_i$, $\mathrm{Bias}(\hat{\beta}) = (1/M)\sum_{i=1}^{M}(\beta-\hat{\beta}_i)$, and $\mathrm{MSE}(\hat{\beta}) = (1/M)\sum_{i=1}^{M}(\beta-\hat{\beta}_i)^{2}$. The estimates are close to the true values of the parameters, and the estimators are consistent, with MSE and bias decreasing as $n$ increases. The moment estimate was used as the initial value for the ML estimates. The results are shown in Figure 5.
Simulation studies were also conducted to evaluate the behavior of 95% confidence intervals using the ML method. The coverage length (CL) and coverage probability (CP) were empirically calculated to assess this behavior. The CL and CP are defined by
$$\mathrm{CL}(\hat{\beta})=\frac{1}{M}\sum_{i=1}^{M}3.92\,s(\hat{\beta}_i), \tag{18}$$
$$\mathrm{CP}(\hat{\beta})=\frac{1}{M}\sum_{i=1}^{M}\mathbb{1}\left(\hat{\beta}_i-1.96\,s(\hat{\beta}_i)<\beta<\hat{\beta}_i+1.96\,s(\hat{\beta}_i)\right), \tag{19}$$
respectively. In the expression given in (18), $s(\hat{\beta}_i)$ is the estimated standard error of the ML estimator. In the formula stated in (19), $\mathbb{1}$ is the indicator function. The simulation results for the empirical CLs and CPs are presented in Figure 6. It is evident from the figure that the empirical CLs decrease as the sample size increases, while the empirical CPs remain close to 0.95.

5.3. Quantile Regression Model

Next, we conduct simulations to evaluate the ML estimators’ performance for the ECB QR parameters. We measure their performance by examining the bias, MSE, CL, and CP across different sample sizes ($n$), known values of $\tau$, true values of $\beta$, and various covariate values. To generate the values of the unit response, we use a structure stated as
$$y_i=\log\left(1+\frac{(2\beta-1)\,u_i^{1/\alpha_i}}{1-\beta}\right)\left[\log\left(\frac{\beta}{1-\beta}\right)\right]^{-1}, \quad i \in \{1,\ldots,n\},$$
where $u_i \sim \mathrm{U}(0,1)$ and $\alpha_i = \log(\tau)\left[\log\left((1-\beta)\left[(\beta/(1-\beta))^{\mu_i}-1\right]/(2\beta-1)\right)\right]^{-1}$.
For this simulation study, we use $M = 1000$ Monte Carlo replicates and sample sizes $n \in \{25, 50, 100, 250, 500, 1000\}$. We also vary the values of $\tau$ and $\beta$, with $\tau$ taking on the values of 0.25, 0.50, and 0.75 and $\beta$ taking on the values of 0.25, 0.5, and 0.75. The regression structure used is formulated as
$$\mathrm{logit}(\mu_i)=\beta_0+\beta_1 z_{i1}, \quad i \in \{1,\ldots,n\},$$
with $\beta_0 = 0.5$, $\beta_1 = 2$ and $z_{i1} \sim \mathrm{Bernoulli}(0.5)$.
The simulation results for the ECB QR model are presented in Tables 3–8. It is observed that the empirical CLs exhibit a decreasing trend as the sample size increases, while the empirical CPs approach 0.95. The biases are negligible, and all MSEs approach zero.
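The data-generating structure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the simulation code of the article: it takes β ≠ 1/2, uses the coefficient values quoted in the text, and checks empirically that the proportion of responses not exceeding μ is close to τ within each covariate group. Function names and the seed are arbitrary.

```python
import math
import random

def cb_cdf(x, beta):
    """CDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio**x - 1.0) / (2.0 * beta - 1.0)

def cb_inverse_cdf(u, beta):
    """Inverse CDF of CB(beta), beta != 1/2."""
    ratio = beta / (1.0 - beta)
    return math.log(1.0 + (2.0 * beta - 1.0) * u / (1.0 - beta)) / math.log(ratio)

def logistic(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_ecb_qr(n, beta, tau, coef0, coef1, rng):
    """Generate (z, mu, y) triples with a Bernoulli(0.5) covariate z."""
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        mu = logistic(coef0 + coef1 * z)
        alpha = math.log(tau) / math.log(cb_cdf(mu, beta))
        u = rng.random()
        y = cb_inverse_cdf(u ** (1.0 / alpha), beta)
        rows.append((z, mu, y))
    return rows

rng = random.Random(11)  # arbitrary seed
rows = simulate_ecb_qr(20000, 0.75, 0.25, 0.5, 2.0, rng)
for group in (0, 1):
    subset = [(mu, y) for z, mu, y in rows if z == group]
    share = sum(1 for mu, y in subset if y <= mu) / len(subset)
    print(group, round(share, 3))  # should be close to tau = 0.25
```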

6. Applications in Science and Education

This section provides two real data illustrations to show potential applications of the CB distribution and ECB QR model. All datasets were collected via: OECD.Stat with link https://stats.oecd.org/ (accessed on 1 May 2023).
The OECD.Stat provides data on various development themes, including demography, education and training, health, labor, social protection and well-being, finance, as well as agriculture and fisheries, for both OECD and nonmember economies. Among these, we focus on two datasets. The first dataset includes the long-term unemployment rate of various countries, while the second dataset comprises the educational attainment proportion and better life index (BLI) measurements of different countries. We analyze the relationship between the educational attainment proportion and BLI using our QR model. All the datasets employed in this study are publicly available and can be secured directly from: https://stats.oecd.org/index.aspx?DataSetCode=BLI2017 (accessed on 1 May 2023).

6.1. Application I

We analyze the proportion of long-term unemployment in different countries, defined as the ratio of the number of individuals who have been unemployed for one year or more to the total labor force (employed and unemployed individuals). We compare the performance of four one-parameter distributions, namely CB, log-Bilal [12], unit-Lindley [14], and Topp–Leone (TL) [34], using the estimated log-likelihood ($\hat{\ell}$), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC), as well as the Kolmogorov–Smirnov (KS), Cramér–von Mises (CM), and Anderson–Darling (AD) goodness-of-fit statistics.
We report the results of the data analysis utilizing ML estimation in Table 9, which shows that the CB distribution is the best model according to the lowest values of AIC, BIC, AD, CM, and KS statistics. Plots of fitted PDF and CDF for the unemployment data are provided in Figure 7.
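For readers who wish to reproduce this kind of comparison, the sketch below computes the KS, CM, and AD statistics from fitted CDF values $u_i = F(x_i; \hat{\beta})$ using their standard order-statistic formulas. The function name `gof_statistics` is hypothetical, and the uniform draws in the usage lines are only a stand-in for fitted CDF values; they are not the unemployment data.

```python
import numpy as np

def gof_statistics(u):
    """KS, CM and AD statistics from fitted CDF values u_i = F(x_i)."""
    u = np.sort(np.asarray(u, dtype=float))
    n = u.size
    i = np.arange(1, n + 1)
    # Kolmogorov-Smirnov: largest gap between empirical and fitted CDF.
    ks = max(np.max(i / n - u), np.max(u - (i - 1) / n))
    # Cramer-von Mises: squared distance to the plotting positions.
    cm = 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)
    # Anderson-Darling: weights both tails more heavily.
    ad = -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))
    return float(ks), float(cm), float(ad)

# Usage with synthetic values standing in for F(x_i; beta_hat).
rng = np.random.default_rng(0)
u_demo = rng.uniform(size=38)
ks, cm, ad = gof_statistics(u_demo)
```

Smaller values of all three statistics indicate closer agreement between the fitted and empirical distributions.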

6.2. Application II

Now, we focus on the modeling ability of the ECB QR model and compare its results with those of the KW [17] and LEEG [19] QR models. The CDFs of the KW and LEEG distributions are, respectively, given by
$$G_{\mathrm{KW}}(y;\beta,\mu,\tau) = 1 - \left(1 - y^{\beta}\right)^{\log(1-\tau)/\log(1-\mu^{\beta})}, \quad y \in (0,1),$$
$$G_{\mathrm{LEEG}}(y;\beta,\mu,\tau) = \frac{\tau\, y^{\beta}\left(\mu^{\beta}-1\right)}{\tau\left(\mu^{\beta}-y^{\beta}\right)+\mu^{\beta}\left(y^{\beta}-1\right)}, \quad y \in (0,1),$$
where $\mu \in (0,1)$ is the quantile parameter, $\beta > 0$ is a parameter related to the shape of the distribution, and $\tau$ is the known quantile level.
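As a check on the quantile parameterization of the two competing models, the sketch below evaluates both CDFs and confirms numerically that $G(\mu; \beta, \mu, \tau) = \tau$. The function names `kw_cdf` and `leeg_cdf` are hypothetical, and the values of $\mu$ and $\beta$ in the usage lines are illustrative only.

```python
import numpy as np

def kw_cdf(y, beta, mu, tau):
    """KW CDF: 1 - (1 - y^beta)^(log(1 - tau) / log(1 - mu^beta))."""
    expo = np.log(1.0 - tau) / np.log(1.0 - mu ** beta)
    return 1.0 - (1.0 - y ** beta) ** expo

def leeg_cdf(y, beta, mu, tau):
    """LEEG CDF: tau y^b (mu^b - 1) / (tau (mu^b - y^b) + mu^b (y^b - 1))."""
    yb, mb = y ** beta, mu ** beta
    return tau * yb * (mb - 1.0) / (tau * (mb - yb) + mb * (yb - 1.0))

# Usage: at y = mu both CDFs should return the quantile level tau.
tau_demo, mu_demo = 0.25, 0.7
kw_at_mu = kw_cdf(mu_demo, 7.0, mu_demo, tau_demo)
leeg_at_mu = leeg_cdf(mu_demo, 9.0, mu_demo, tau_demo)
```

Because $\mu$ is the $\tau$-th quantile in both models, the regression coefficients of the ECB, KW, and LEEG fits are directly comparable.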
We investigate the relationship between the conditional quantile of educational attainment and the independent variables, including homicide rate (HR), dwellings without basic facilities (DWBF), and labor market insecurity (LMI) values of countries. The unit-dependent variable employed in this study is the proportion of educational attainment. This dataset was previously analyzed in [12] using the log-Bilal distribution for unit mean response regression modeling. The QR based on $\mu_i$ is given by
$$\mu_i = \frac{\exp\left(\beta_0 + \beta_1\,\mathrm{HR}_i + \beta_2\,\mathrm{DWBF}_i + \beta_3\,\mathrm{LMI}_i\right)}{1 + \exp\left(\beta_0 + \beta_1\,\mathrm{HR}_i + \beta_2\,\mathrm{DWBF}_i + \beta_3\,\mathrm{LMI}_i\right)}, \quad i \in \{1,\ldots,38\},$$
where $\mu_i$ is the quantile for all models.
We obtain the QR results separately for the quantile levels $\tau = 0.25$, $\tau = 0.5$, and $\tau = 0.75$. Tables 10–12 report the results of the QR analysis for these quantile levels. The findings from these tables indicate that, for the ECB QR model, all independent variables are statistically significant at the usual significance levels for every quantile level.
The estimates of $\beta_0$ and $\beta_2$ are positive, whereas those of $\beta_1$ and $\beta_3$ are negative. Hence, all independent variables have effects on the educational attainment of the countries. Educational attainment is positively associated with DWBF: as the educational attainment of countries increases, the DWBF increases as well. Moreover, there are negative relationships between educational attainment and the HR and LMI index, indicating that as the educational attainment of countries increases, the HR and LMI index decrease. Furthermore, the ECB QR model has the smallest AIC and BIC and the largest log-likelihood among the fitted regression models at all quantile levels. Consequently, we conclude that the best model is the proposed regression structure.
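The model selection criteria can be recomputed from the reported log-likelihoods with the standard definitions AIC $= 2k - 2\hat{\ell}$ and BIC $= k\log(n) - 2\hat{\ell}$. The sketch below assumes $k = 5$ estimated parameters per model ($\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, and $\beta$) and $n = 38$ countries, both read off the model specification above; the function names are hypothetical.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k * log(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * loglik

# Log-likelihoods reported in Table 10 (tau = 0.25).
logliks = {"ECB": 35.5794, "KW": 33.5838, "LEEG": 32.3569}
k, n = 5, 38  # assumed parameter count and sample size

criteria = {m: (aic(ll, k), bic(ll, k, n)) for m, ll in logliks.items()}
best_by_aic = min(criteria, key=lambda m: criteria[m][0])
```

Under these assumptions, the computed values agree with the AIC and BIC entries of Table 10 up to rounding, and the ECB model attains the smallest value of both criteria.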
Figures 8–10 display the QQ (empirical quantile versus theoretical quantile) plots of the RQ residuals for all QR models at all quantile levels. These figures indicate that the independent variables explain the unit-dependent variable well and that the ECB QR model provides a good fit.
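The points of such a QQ plot can be obtained as in the following sketch. It assumes that the residual for a continuous response takes the usual Dunn–Smyth form $r_i = \Phi^{-1}(G(y_i; \hat{\mu}_i, \hat{\beta}, \tau))$ [32] and that the ECB CDF is the CB CDF raised to the power $\alpha_i$, as implied by the generator used in the simulation study. All function names are hypothetical, and the inputs in the usage lines are illustrative rather than fitted values.

```python
import math
from statistics import NormalDist

def cb_cdf(y, beta):
    """CB CDF on (0, 1); for beta = 0.5 the CB law is uniform."""
    if math.isclose(beta, 0.5):
        return y
    ratio = beta / (1.0 - beta)
    return (1.0 - beta) * (ratio ** y - 1.0) / (2.0 * beta - 1.0)

def ecb_cdf(y, mu, beta, tau):
    """Assumed ECB CDF: CB CDF to the power alpha, with G(mu) = tau."""
    alpha = math.log(tau) / math.log(cb_cdf(mu, beta))
    return cb_cdf(y, beta) ** alpha

def quantile_residual(y, mu, beta, tau):
    """Map the fitted CDF value to the standard normal scale."""
    return NormalDist().inv_cdf(ecb_cdf(y, mu, beta, tau))

def qq_points(y, mu, beta, tau):
    """Pairs (theoretical normal quantile, sorted residual)."""
    res = sorted(quantile_residual(yi, mi, beta, tau) for yi, mi in zip(y, mu))
    n = len(res)
    theo = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theo, res))

# Usage with illustrative responses and fitted quantiles.
points = qq_points([0.2, 0.6, 0.9], [0.6, 0.6, 0.6], beta=0.25, tau=0.5)
```

If the model is adequate, the residuals behave like a standard normal sample, so the points lie close to the identity line.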

7. Conclusions

This article presented new properties of the continuous Bernoulli distribution and introduced its quantile regression model with an exponentiated structure. Our quantile regression model links the independent variables to the quantile of the dependent variable through a logit link function. To assess the performance of the estimators, we employed the Monte Carlo method under various settings. Our results indicated that the estimators perform well, with negligible biases and mean squared errors that approach zero, and that the coverage probabilities approach the nominal value as the sample size increases.
The adequacy of the continuous Bernoulli distribution, the exponentiated continuous Bernoulli quantile regression model, and other regression structures was compared by analyzing two real datasets based on the better life index of the countries. The comparison was carried out based on statistical indicators. Results indicated that the continuous Bernoulli distribution and its associated quantile regression model fit the two real datasets well, providing an alternative to other models presented in the literature on the topic.
Our quantile regression model yielded similar results to the unit mean regression models presented in [12,13]. Other related works [35,36,37] proposed the unit log-log, transmuted unit Rayleigh, and unit quantile regression models, respectively. These models analyzed the educational attainment values of OECD countries with a unit-dependent variable and reported a negative relationship between all responses and the independent variable of homicide rate, which is consistent with the findings of our study.
Our analysis showed that the continuous Bernoulli distribution can be very effective for data restricted to the unit interval, whether or not independent variables are present. Both the quantile and mean regression models can be fitted using R code implemented by the authors of this article.
There are several avenues for future research that can build upon the present investigation. One possible direction is to incorporate multivariate, functional, temporal, and spatial structures, errors-in-variables, and partial least squares into the quantile regression framework, as well as studying their influence on diagnostics [38]. Other potential areas of investigation are the Tobit and Cobb–Douglas frameworks [39], which could be relevant to the topic of this study. Additionally, the analysis of censored observations and frailty models in the present context is another avenue for further research [40]. The authors are currently analyzing these issues and their findings are expected to be available in the future.

Author Contributions

Conceptualization, M.Ç.K., V.L. and C.M.-B.; data curation, M.Ç.K. and C.M.-B.; formal analysis, M.Ç.K., V.L. and C.M.-B.; investigation, M.Ç.K., V.L. and C.M.-B.; methodology, M.Ç.K., V.L. and C.M.-B.; writing—original draft, M.Ç.K. and C.M.-B.; writing—review and editing, V.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by FONDECYT grant number 1200525 (V.L.) from the National Agency for Research and Development (ANID) of the Chilean government under the Ministry of Science, Technology, Knowledge, and Innovation.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank two reviewers for their comments which helped to improve the presentation of this article.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

1. Gómez-Déniz, E.; Sordo, M.A.; Calderín-Ojeda, E. The log-Lindley distribution as an alternative to the beta regression model with applications in insurance. Insur. Math. Econ. 2014, 54, 49–57.
2. Korkmaz, M.Ç.; Chesneau, C. On the unit Burr-XII distribution with the quantile regression modeling and applications. Comput. Appl. Math. 2021, 40, 29.
3. Kumaraswamy, P. A generalized probability density function for double-bounded random processes. J. Hydrol. 1980, 46, 79–88.
4. Loaiza-Ganem, G.; Cunningham, J.P. The continuous Bernoulli: Fixing a pervasive error in variational autoencoders. arXiv 2019, arXiv:1907.06845.
5. Smithson, M.; Verkuilen, J. A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychol. Methods 2006, 11, 54.
6. Van Dorp, J.R.; Kotz, S. A novel extension of the triangular distribution and its parameter estimation. J. R. Stat. Soc. D 2002, 51, 63–79.
7. Vasicek, O.A. Probability of Loss on Loan Portfolio; KMV Corporation: San Francisco, CA, USA, 1987.
8. Li, S.; Chen, J.; Li, B. Estimation and testing of random effects semiparametric regression model with separable space-time filters. Fractal Fract. 2022, 6, 735.
9. Ferrari, S.; Cribari Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 2004, 31, 799–815.
10. Bayes, C.L.; Bazán, J.L.; García, C. A new robust regression model for proportions. Bayesian Anal. 2012, 7, 841–866.
11. Mazucheli, J.; Menezes, A.F.B.; Dey, S. The unit-Birnbaum-Saunders distribution with applications. Chil. J. Stat. 2018, 9, 47–57.
12. Altun, E.; El-Morshedy, M.; Eliwa, M. A new regression model for bounded dependent variable: An alternative to the beta and unit-Lindley regression models. PLoS ONE 2021, 16, e0245627.
13. Altun, E. The log-weighted exponential regression model: Alternative to the beta regression model. Commun. Stat. Theory Methods 2021, 50, 2306–2321.
14. Mazucheli, J.; Menezes, A.F.B.; Chakraborty, S. On the one parameter unit-Lindley distribution and its associated regression model for proportion data. J. Appl. Stat. 2019, 46, 700–714.
15. Koenker, R.; Bassett, G.J. Regression quantiles. Econometrica 1978, 46, 33–50.
16. Bayes, C.L.; Bazán, J.L.; De Castro, M. A quantile parametric mixed regression model for bounded dependent variables. Stat. Its Interface 2017, 10, 483–493.
17. Mitnik, P.A.; Baek, S. The Kumaraswamy distribution: Median-dispersion re-parameterizations for regression modeling and simulation-based estimation. Stat. Pap. 2013, 54, 177–192.
18. Paz, R.F.; Balakrishnan, N.; Bazán, J.L. L-logistic regression models: Prior sensitivity analysis, robustness to outliers and applications. Braz. J. Probab. Stat. 2019, 33, 455–479.
19. Jodrá, P.; Jiménez-Gamero, M.D. A quantile regression model for bounded responses based on the exponential-geometric distribution. Revstat Stat. J. 2020, 4, 415–436.
20. Saulo, H.; Dasilva, A.; Leiva, V.; Sanchez, L.; de la Fuente-Mella, H. Log-symmetric quantile regression models. Stat. Neerl. 2022, 76, 124–163.
21. Cancho, V.G.; Bazán, J.L.; Dey, D.K. A new class of regression model for a bounded response with application in the study of the incidence rate of colorectal cancer. Stat. Methods Med. Res. 2020, 29, 2015–2033.
22. Sanchez, L.; Leiva, V.; Galea, M.; Saulo, H. Birnbaum-Saunders quantile regression and its diagnostics with application to economic data. Appl. Stoch. Model. Bus. Ind. 2021, 37, 53–73.
23. Ribeiro, T.F.; Cordeiro, G.M.; Peña-Ramírez, F.A.; Guerra, R.R. A new quantile regression for the COVID-19 mortality rates in the United States. Comput. Appl. Math. 2021, 40, 255.
24. Korkmaz, M.Ç.; Emrah, A.; Chesneau, C.; Yousof, H.M. On the unit-Chen distribution with associated quantile regression and applications. Math. Slovaca 2022, 72, 765–786.
25. Mazucheli, J.; Menezes, A.F.B.; Fernandes, L.B.; de Oliveira, R.P.; Ghitany, M.E. The unit-Weibull distribution as an alternative to the Kumaraswamy distribution for the modeling of quantiles conditional on independent variables. J. Appl. Stat. 2020, 47, 954–974.
26. Leiva, V.; Mazucheli, M.; Alves, B. A novel regression model for fractiles: Formulation, computational aspects, and applications to medical data. Fractal Fract. 2023, 7, 169.
27. Shahin, A.I.; Almotairi, S. A deep learning BiLSTM encoding-decoding model for COVID-19 pandemic spread forecasting. Fractal Fract. 2021, 5, 175.
28. Kingma, D.P.; Welling, M. Auto-encoding variational Bayes. arXiv 2013, arXiv:1312.6114.
29. Guiraud, P.; Leiva, V.; Fierro, R. A non-central version of the Birnbaum-Saunders distribution for reliability analysis. IEEE Trans. Reliab. 2009, 58, 152–160.
30. Tang, L.; Lu, Y.; Chew, E. Mean residual life of lifetime distributions. IEEE Trans. Reliab. 1999, 48, 73–78.
31. Kao, J.H. Computer methods for estimating Weibull parameters in reliability studies. IRE Trans. Reliab. Qual. Control 1958, PGRQC-13, 15–22.
32. Dunn, P.K.; Smyth, G.K. Randomized quantile residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
33. Mazucheli, M.; Alves, B.; Menezes, A.F.B.; Leiva, V. An overview on parametric quantile regression models and their computational implementation with applications to biomedical problems including COVID-19 data. Comput. Methods Programs Biomed. 2022, 221, 106816.
34. Topp, C.W.; Leone, F.C. A family of J-shaped frequency functions. J. Am. Stat. Assoc. 1955, 50, 209–219.
35. Korkmaz, M.Ç.; Korkmaz, Z.S. The unit log-log distribution: A new unit distribution with alternative quantile regression modeling and educational measurements applications. J. Appl. Stat. 2023, 50, 889–908.
36. Korkmaz, M.Ç.; Chesneau, C.; Korkmaz, Z.S. Transmuted unit Rayleigh quantile regression model: Alternative to beta and Kumaraswamy quantile regression models. Univ. Politeh. Buchar. Sci. Bull. Ser. Appl. Math. Phys. 2021, 83, 149–158.
37. Korkmaz, M.Ç.; Chesneau, C.; Korkmaz, Z.S. The unit folded normal distribution: A new unit probability distribution with the estimation procedures, quantile regression modeling and educational attainment applications. J. Reliab. Stat. Stud. 2022, 15, 261–298.
38. Figueroa-Zuniga, J.; Niklitschek, S.; Leiva, V.; Liu, S. Modeling heavy-tailed bounded data by the trapezoidal beta distribution with applications. Revstat Stat. J. 2023, 20, 387–404.
39. de la Fuente, H.; Rojas Fuentes, J.L.; Leiva, V. Econometric modeling of productivity and technical efficiency in the Chilean manufacturing industry. Comput. Ind. Eng. 2020, 139, 105793.
40. Leao, J.; Leiva, V.; Saulo, H.; Tomazella, V. Incorporation of frailties into a cure rate regression model and its diagnostics and application to melanoma data. Stat. Med. 2018, 37, 4421–4440.
Figure 1. The CB inverse CDF for the indicated values of $\beta$.
Figure 2. Plots of the CB PDF (left), CDF (center), and HRF (right) for the listed values of $\beta$.
Figure 3. Plots of the mean, variance, skewness, and kurtosis of the CB distribution for the indicated values of $\beta$.
Figure 4. Plots of the PDF stated in (14) for the indicated values of parameters.
Figure 5. Plots of simulations for the point estimation with $\beta = 0.25$ (top) and $\beta = 0.75$ (bottom).
Figure 6. Plots of simulations for the CPs and CLs with $\beta = 0.25$ (top) and $\beta = 0.75$ (bottom).
Figure 7. Plots of fitted PDF (left) and CDF (right) for the unemployment data.
Figure 8. QQ plots of the RQ residual for KW (left), LEEG (center), and ECB (right) models based on $\tau = 0.25$ quantile level with educational data, where circles indicate the observed data.
Figure 9. QQ plots of the RQ residual for KW (left), LEEG (center), and ECB (right) models based on $\tau = 0.50$ quantile level with educational data, where circles indicate the observed data.
Figure 10. QQ plots of the RQ residual for KW (left), LEEG (center), and ECB (right) models based on $\tau = 0.75$ quantile level with educational data, where circles indicate the observed data.
Table 1. Simulation results with $\beta = 0.25$.

| $n$ | Runtime (in seconds) | $\bar{x}$ | $s^2$ |
|---|---|---|---|
| 25 | 0.0116 | 0.4106 | 0.0780877 |
| 50 | 0.0207 | 0.4110 | 0.0788680 |
| 100 | 0.0352 | 0.4101 | 0.0783473 |
| 200 | 0.0513 | 0.4098 | 0.0785829 |
| 500 | 0.1273 | 0.4104 | 0.0784173 |
| 1000 | 0.1951 | 0.4102 | 0.0784991 |
Table 2. Simulation results with $\beta = 0.75$.

| $n$ | Runtime (in seconds) | $\bar{x}$ | $s^2$ |
|---|---|---|---|
| 25 | 0.0197 | 0.5895 | 0.0784745 |
| 50 | 0.0224 | 0.5897 | 0.0780702 |
| 100 | 0.0315 | 0.5885 | 0.0788916 |
| 200 | 0.0459 | 0.5887 | 0.0786282 |
| 500 | 0.0955 | 0.5894 | 0.0788145 |
| 1000 | 0.2136 | 0.5897 | 0.0784867 |
Table 3. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.25$ and $\tau = 0.25$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | Bias($\hat{\beta}$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | MSE($\hat{\beta}$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CL($\hat{\beta}$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) | CP($\hat{\beta}$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 0.0506 | 0.0241 | 0.0461 | 0.1578 | 0.2368 | 0.0524 | 1.3731 | 1.7408 | 0.8412 | 0.8880 | 0.9110 | 0.6430 |
| 50 | 0.0333 | 0.0053 | 0.0057 | 0.0804 | 0.1169 | 0.0417 | 1.0516 | 1.3099 | 0.7733 | 0.9290 | 0.9490 | 0.7788 |
| 100 | 0.0120 | 0.0009 | 0.0064 | 0.0371 | 0.0593 | 0.0219 | 0.7197 | 0.9227 | 0.5588 | 0.9250 | 0.9430 | 0.8340 |
| 200 | 0.0078 | 0.0008 | 0.0013 | 0.0196 | 0.0298 | 0.0127 | 0.5310 | 0.6634 | 0.4251 | 0.9390 | 0.9360 | 0.8830 |
| 500 | 0.0004 | 0.0029 | 0.0001 | 0.0076 | 0.0122 | 0.0051 | 0.3351 | 0.4210 | 0.2758 | 0.9460 | 0.9510 | 0.9079 |
| 1000 | 0.0010 | 0.0001 | 0.0003 | 0.0035 | 0.0052 | 0.0027 | 0.2318 | 0.2967 | 0.1936 | 0.9500 | 0.9630 | 0.9240 |
Table 4. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.75$ and $\tau = 0.25$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | Bias($\hat{\beta}$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | MSE($\hat{\beta}$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CL($\hat{\beta}$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) | CP($\hat{\beta}$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 0.0227 | 0.0017 | 0.2301 | 0.2340 | 0.3634 | 0.1553 | 1.7324 | 2.1644 | 1.1733 | 0.9010 | 0.9190 | 0.6650 |
| 50 | 0.0296 | 0.0311 | 0.1406 | 0.1236 | 0.1746 | 0.0915 | 1.3305 | 1.5991 | 0.9841 | 0.9180 | 0.9360 | 0.7730 |
| 100 | 0.0171 | 0.0141 | 0.0833 | 0.0748 | 0.0979 | 0.0531 | 0.9991 | 1.1703 | 0.7680 | 0.9180 | 0.9320 | 0.8138 |
| 200 | 0.0027 | 0.0024 | 0.0442 | 0.0335 | 0.0463 | 0.0249 | 0.6950 | 0.8258 | 0.5717 | 0.9500 | 0.9440 | 0.8940 |
| 500 | 0.0046 | 0.0025 | 0.0206 | 0.0131 | 0.0184 | 0.0100 | 0.4437 | 0.5251 | 0.3749 | 0.9430 | 0.9500 | 0.9259 |
| 1000 | 0.0014 | 0.0020 | 0.0120 | 0.0069 | 0.0097 | 0.0050 | 0.3280 | 0.3790 | 0.2774 | 0.9520 | 0.9420 | 0.9390 |
Table 5. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.50$ and $\tau = 0.25$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) |
|---|---|---|---|---|---|---|---|---|
| 25 | 0.0496 | 0.0033 | 0.0901 | 0.1766 | 1.1570 | 1.6256 | 0.9520 | 0.9480 |
| 50 | 0.0196 | 0.0110 | 0.0432 | 0.0875 | 0.7888 | 1.1547 | 0.9450 | 0.9500 |
| 100 | 0.0055 | 0.0012 | 0.0226 | 0.0465 | 0.5630 | 0.8153 | 0.9320 | 0.9410 |
| 200 | 0.0063 | 0.0032 | 0.0117 | 0.0214 | 0.4475 | 0.5820 | 0.9620 | 0.9540 |
| 500 | 0.0012 | 0.0001 | 0.0047 | 0.0088 | 0.2651 | 0.3638 | 0.9580 | 0.9470 |
| 1000 | 0.0022 | 0.0007 | 0.0025 | 0.0042 | 0.1919 | 0.2579 | 0.9420 | 0.9470 |
Table 6. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.50$ and $\tau = 0.75$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) |
|---|---|---|---|---|---|---|---|---|
| 25 | 0.0466 | 0.0147 | 0.1133 | 0.1821 | 1.2587 | 1.6439 | 0.9250 | 0.9450 |
| 50 | 0.0220 | 0.0064 | 0.0443 | 0.0876 | 0.8032 | 1.1513 | 0.9470 | 0.9470 |
| 100 | 0.0146 | 0.0064 | 0.0204 | 0.0418 | 0.5626 | 0.8150 | 0.9520 | 0.9580 |
| 200 | 0.0085 | 0.0130 | 0.0120 | 0.0212 | 0.4259 | 0.5761 | 0.9480 | 0.9510 |
| 500 | 0.0053 | 0.0041 | 0.0045 | 0.0087 | 0.2624 | 0.3636 | 0.9460 | 0.9480 |
| 1000 | 0.0002 | 0.0004 | 0.0024 | 0.0045 | 0.1901 | 0.2576 | 0.9430 | 0.9420 |
Table 7. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.25$ and $\tau = 0.50$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | Bias($\hat{\beta}$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | MSE($\hat{\beta}$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CL($\hat{\beta}$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) | CP($\hat{\beta}$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 0.0429 | 0.0319 | 0.0166 | 0.2150 | 0.2637 | 0.0531 | 1.7293 | 1.8910 | 0.8087 | 0.9210 | 0.9260 | 0.6940 |
| 50 | 0.0056 | 0.0054 | 0.0086 | 0.0798 | 0.1155 | 0.0305 | 1.1040 | 1.3311 | 0.6289 | 0.9380 | 0.9500 | 0.8060 |
| 100 | 0.0070 | 0.0160 | 0.0053 | 0.0431 | 0.0604 | 0.0147 | 0.8228 | 0.9506 | 0.4911 | 0.9550 | 0.9460 | 0.8700 |
| 200 | 0.0007 | −0.0009 | 0.0023 | 0.0229 | 0.0310 | 0.0091 | 0.6091 | 0.6801 | 0.3655 | 0.9520 | 0.9490 | 0.8950 |
| 500 | 0.0017 | −0.0041 | 0.0003 | 0.0096 | 0.0118 | 0.0038 | 0.3913 | 0.4336 | 0.2382 | 0.9560 | 0.9540 | 0.9300 |
| 1000 | 0.0008 | −0.0013 | 0.0020 | 0.0049 | 0.0059 | 0.0019 | 0.2785 | 0.3076 | 0.1710 | 0.9590 | 0.9450 | 0.9420 |
Table 8. Empirical bias, MSE, 95% CP, and 95% CL for the ECB QR model with $\beta = 0.75$ and $\tau = 0.50$ for the indicated $n$ and parameter with simulated data.

| $n$ | Bias($\hat{\beta}_0$) | Bias($\hat{\beta}_1$) | Bias($\hat{\beta}$) | MSE($\hat{\beta}_0$) | MSE($\hat{\beta}_1$) | MSE($\hat{\beta}$) | CL($\hat{\beta}_0$) | CL($\hat{\beta}_1$) | CL($\hat{\beta}$) | CP($\hat{\beta}_0$) | CP($\hat{\beta}_1$) | CP($\hat{\beta}$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 0.0501 | 0.0264 | 0.1891 | 0.2898 | 0.3932 | 0.1172 | 1.9476 | 2.3130 | 1.0495 | 0.9120 | 0.9290 | 0.7404 |
| 50 | 0.0283 | 0.0257 | 0.1072 | 0.1507 | 0.2058 | 0.0638 | 1.4803 | 1.6950 | 0.8336 | 0.9260 | 0.9320 | 0.8029 |
| 100 | 0.0165 | 0.0129 | 0.0566 | 0.0955 | 0.1078 | 0.0322 | 1.1702 | 1.2454 | 0.6562 | 0.9300 | 0.9400 | 0.8805 |
| 200 | 0.0105 | 0.0059 | 0.0321 | 0.0540 | 0.0589 | 0.0169 | 0.8995 | 0.9262 | 0.4952 | 0.9440 | 0.9360 | 0.8958 |
| 500 | 0.0007 | 0.0005 | 0.0150 | 0.0188 | 0.0209 | 0.0063 | 0.5222 | 0.5580 | 0.3107 | 0.9370 | 0.9470 | 0.9333 |
| 1000 | 0.0013 | 0.0012 | 0.0056 | 0.0090 | 0.0101 | 0.0028 | 0.3792 | 0.4032 | 0.2310 | 0.9370 | 0.9500 | 0.9306 |
Table 9. ML estimates and standard errors, in parentheses, as well as $\hat{\ell}$ and p-value of the goodness-of-fit test, in brackets, with unemployment data.

| Distribution | $\hat{\beta}$ | $\hat{\ell}$ | AIC | BIC | AD | CM | KS |
|---|---|---|---|---|---|---|---|
| CB | $0.261 \times 10^{-13}$ ($0.13 \times 10^{-12}$) | 92.8248 | −183.6497 | −182.0122 | 0.6018 | 0.1058 | 0.1332 [0.5097] |
| Log-Bilal | 4.8700 (0.5694) | 74.2699 | −146.5399 | −144.9023 | 5.7487 | 1.1128 | 0.3077 [0.0015] |
| TL | 0.2996 (0.0486) | 68.0997 | −134.1995 | −132.5619 | 35.2670 | 7.0160 | 0.7110 [<0.001] |
| Unit-Lindley | 29.7750 (4.8431) | 92.2610 | −182.5221 | −180.8845 | 0.7448 | 0.1317 | 0.1532 [0.3344] |
Table 10. Results of fitted regressions with quantile $\tau = 0.25$ and model selection criteria for educational data.

| Parameter | ECB Estimate | ECB SE | ECB p-Value | KW Estimate | KW SE | KW p-Value | LEEG Estimate | LEEG SE | LEEG p-Value |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_0$ | 1.9160 | 0.3172 | <0.0001 | 1.3473 | 0.1730 | <0.0001 | 1.4199 | 0.1599 | <0.0001 |
| $\beta_1$ | −0.0914 | 0.0241 | <0.0001 | −0.0511 | 0.0130 | <0.0001 | −0.05588 | 0.0118 | <0.0001 |
| $\beta_2$ | 11.1800 | 2.3920 | <0.0001 | 2.0563 | 1.1999 | 0.0866 | 2.3612 | 1.3068 | 0.078 |
| $\beta_3$ | −21.2000 | 6.1390 | <0.001 | −7.1977 | 1.1596 | <0.0001 | −8.1841 | 1.7661 | <0.0001 |
| $\beta$ | 0.00058 | 0.0009 | 0.5367 | 7.0769 | 1.2104 | <0.0001 | 9.1738 | 1.9582 | <0.0001 |
| $\hat{\ell}$ | 35.5794 | | | 33.5838 | | | 32.3569 | | |
| AIC | −61.1587 | | | −57.1676 | | | −54.7139 | | |
| BIC | −52.9708 | | | −48.9797 | | | −46.5259 | | |
Table 11. Results of fitted regressions with quantile $\tau = 0.50$ and model selection criteria for educational data.

| Parameter | ECB Estimate | ECB SE | ECB p-Value | KW Estimate | KW SE | KW p-Value | LEEG Estimate | LEEG SE | LEEG p-Value |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_0$ | 2.3830 | 1.7410 | <0.0001 | 1.9595 | 0.1795 | <0.0001 | 1.9907 | 0.1903 | <0.0001 |
| $\beta_1$ | −0.0942 | 0.0207 | <0.0001 | −0.0639 | 0.0135 | <0.0001 | −0.0671 | 0.0134 | <0.0001 |
| $\beta_2$ | 10.9300 | 2.2890 | <0.0001 | 2.3783 | 1.2035 | 0.0481 | 2.8145 | 1.4914 | 0.0591 |
| $\beta_3$ | −21.0700 | 1.8860 | <0.0001 | −8.9696 | 1.3384 | <0.0001 | −9.8119 | 2.0626 | <0.0001 |
| $\beta$ | 0.00097 | 0.0015 | 0.513 | 7.2166 | 1.2191 | <0.0001 | 9.1091 | 1.9682 | <0.0001 |
| $\hat{\ell}$ | 34.6717 | | | 33.8642 | | | 32.2079 | | |
| AIC | −59.3433 | | | −57.7286 | | | −54.4158 | | |
| BIC | −51.1553 | | | −49.5406 | | | −46.2279 | | |
Table 12. Results of fitted regressions with quantile $\tau = 0.75$ and model selection criteria for educational data.

| Parameter | ECB Estimate | ECB SE | ECB p-Value | KW Estimate | KW SE | KW p-Value | LEEG Estimate | LEEG SE | LEEG p-Value |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_0$ | 3.0897 | 0.2262 | <0.0001 | 2.5204 | 0.2044 | <0.0001 | 2.7506 | 0.2707 | <0.0001 |
| $\beta_1$ | −0.0978 | 0.0229 | <0.0001 | −0.0772 | 0.0133 | <0.0001 | −0.0834 | 0.0157 | <0.0001 |
| $\beta_2$ | 9.7876 | 1.8659 | <0.0001 | 2.7295 | 1.5489 | 0.0708 | 3.5070 | 1.8068 | 0.0523 |
| $\beta_3$ | −19.9055 | 1.4959 | <0.0001 | −10.0828 | 1.6659 | <0.0001 | −12.1760 | 2.6050 | <0.0001 |
| $\beta$ | 0.0022 | 0.0029 | 0.4530 | 8.2613 | 1.3396 | <0.0001 | 9.0614 | 2.0043 | <0.0001 |
| $\hat{\ell}$ | 33.8923 | | | 33.6323 | | | 32.0757 | | |
| AIC | −57.7847 | | | −57.2647 | | | −54.1515 | | |
| BIC | −49.5967 | | | −49.0768 | | | −45.9635 | | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Korkmaz, M.Ç.; Leiva, V.; Martin-Barreiro, C. The Continuous Bernoulli Distribution: Mathematical Characterization, Fractile Regression, Computational Simulations, and Applications. Fractal Fract. 2023, 7, 386. https://doi.org/10.3390/fractalfract7050386
