Quantile-Based Multivariate Log-Normal Distribution

Morán-Vásquez, Raúl Alejandro; Roldán-Correa, Alejandro; Nagar, Daya K.

doi:10.3390/sym15081513

by Raúl Alejandro Morán-Vásquez ^{*,†}, Alejandro Roldán-Correa ^{†} and Daya K. Nagar ^{†}

Instituto de Matemáticas, Universidad de Antioquia, Calle 67 No. 53-108, Medellín 050010, Colombia

^{*} Author to whom correspondence should be addressed.

^{†} These authors contributed equally to this work.

Symmetry 2023, 15(8), 1513; https://doi.org/10.3390/sym15081513

Submission received: 28 June 2023 / Revised: 22 July 2023 / Accepted: 30 July 2023 / Published: 31 July 2023

(This article belongs to the Section Mathematics)

Abstract: We introduce a quantile-based multivariate log-normal distribution, providing a new multivariate skewed distribution with positive support. The parameters of this distribution are interpretable in terms of quantiles of marginal distributions and associations between pairs of variables, a desirable feature for statistical modeling purposes. We derive statistical properties of the quantile-based multivariate log-normal distribution, including results on transformations and closed-form expressions for the mixed moments, expected value, covariance matrix, mode, Shannon entropy, and Kullback–Leibler divergence. We also present results on marginalization, conditioning, and independence. Additionally, we discuss parameter estimation and verify its performance through simulation studies. We assess model fit using Mahalanobis-type distances. An application to data on children's weights and heights is presented.

Keywords: Kullback–Leibler divergence; mixed moments; independence; multivariate log-normal distribution; quantile-based distribution

1. Introduction

Quantile regression modeling has been widely applied in different fields such as economics, environmental science, ecology, and medicine, among many others (Cade and Noon [1], Yu et al. [2]). A number of studies on nonparametric quantile regression and its applications have been developed since the seminal work of Koenker and Bassett [3]. Recently, several parametric quantile models have been studied in the regression literature, which have motivated the study of probability distributions that are useful for this purpose.

In the univariate setting, some distributions suitable for parametric quantile modeling appear in Ferrari and Fumes [4], Gijbels et al. [5], Mazucheli et al. [6], and Smithson and Shou [7]. Multivariate quantile modeling is less frequent in the statistical literature and often uses nonparametric methods. Several studies are based on extensions of the quantile concept to a multivariate setting. Some examples can be found in Breckling and Chambers [8], Kong and Mizera [9], McKeague et al. [10], and Wei [11]. Other multivariate models are based on the univariate quantile notion. For instance, Petrella and Raponi [12], Morán-Vásquez and Ferrari [13], and Morán-Vásquez et al. [14] propose methods for jointly modeling univariate marginal quantiles, taking into account the potential correlation between marginals.

In the present article, we define a quantile-based multivariate log-normal distribution. This distribution has positive support and reduces to the quantile-based log-normal distribution (Saulo et al. [15]) in the univariate setting. Moreover, the usual multivariate log-normal distribution (Morán-Vásquez and Ferrari [13] and Morán-Vásquez et al. [14]) can be expressed as a quantile-based multivariate log-normal distribution. The parameters of the proposed distribution are interpretable in terms of marginal quantiles and associations between pairs of variables, which makes the model attractive for quantile modeling of correlated, positively skewed multivariate data.

In this article, we study some statistical properties of the quantile-based multivariate log-normal family, describe the estimation of its parameters, and show its usefulness through an application to real data. We derive distributional properties obtained through transformations, as well as results related to the mixed moments, expected value, covariance matrix, mode, Shannon entropy, Kullback–Leibler divergence, marginal and conditional distributions, and independence. Applications of some of our results derived in this article establish new properties of the multivariate log-normal distribution. We compute the maximum likelihood estimates of the parameters of the quantile-based multivariate log-normal distribution from the maximum likelihood estimates of the multivariate log-normal distribution. We evaluate the performance of the proposed estimation procedure through Monte Carlo simulations. The usefulness of the proposed distribution for modeling multivariate positive skewed data is illustrated through an analysis of real data on children’s weights and heights.

The paper is organized as follows. Section 2 presents the quantile-based multivariate log-normal distribution. Section 3 deals with the derivation of various statistical properties of the proposed distribution. Section 4 focuses on maximum likelihood estimation and simulation studies. Also, a graphical method to assess the goodness of fit is described. Section 5 presents an application to real data. Finally, Section 6 closes the paper with concluding remarks.

2. Quantile-Based Multivariate Log-Normal Distribution

We denote vectors with lowercase Greek letters in bold and matrices with capital Greek letters in bold. For vectors and matrices, the components are denoted by the respective Greek letter in normal font. For example, if

θ \in R^{p}

and

Ω (p \times q)

is a real matrix, then

θ = {(θ_{1}, \dots, θ_{p})}^{'}

and

Ω = {(ω_{j k})}_{p \times q}

. We denote by

0 = {(0, \dots, 0)}^{'}

and

1 = {(1, \dots, 1)}^{'}

the p-dimensional vectors whose components are all zero and one, respectively. We denote by

I_{p}

the

p \times p

identity matrix. Let

Ω (p \times p)

be a square matrix. We denote by

\det (Ω)

and

tr (Ω)

the determinant and trace of

Ω

, respectively. If

Ω

is a symmetric matrix, then

Ω > 0

means that

Ω

is positive definite. Additionally,

Ω^{1 / 2}

is the unique symmetric positive definite square root of

Ω > 0

. If

Ω_{1}

and

Ω_{2}

are matrices of the same dimension, then

Ω_{1} ⊙ Ω_{2}

denotes the Hadamard product of

Ω_{1}

and

Ω_{2}

. If

θ \in R^{p}

is a vector, then

D_{θ}

denotes the diagonal matrix whose diagonal elements are the components of

θ

, that is,

D_{θ} = diag (θ_{1}, \dots, θ_{p})

. We define the set

R_{+}^{p}

as

R_{+}^{p} = \{θ \in R^{p} : θ_{k} > 0, k = 1, \dots, p\} .

If

θ \in R^{p}

and f is a real function, we denote

f (θ) = {(f (θ_{1}), \dots, f (θ_{p}))}^{'}

, provided that the components of

θ

are in the domain of f. If

θ \in R_{+}^{p}

and

β \in R^{p}

, we write

θ^{β} = {(θ_{1}^{β_{1}}, \dots, θ_{p}^{β_{p}})}^{'}

. We denote random vectors and their components with capital Roman letters in bold and normal fonts, respectively.

It is well known that the PDF of a multivariate normal vector

X \sim N_{p} (μ, Σ)

is given by

ϕ_{p} (x; μ, Σ) = {(2 π)}^{- p / 2} \det {(Σ)}^{- 1 / 2} \exp (- \frac{1}{2} δ_{Σ} (x, μ)),

where

δ_{Σ} (x, μ) = {(x - μ)}^{'} Σ^{- 1} (x - μ)

is the square of the Mahalanobis distance between

x

and

μ

with respect to

Σ

. On the other hand, the random vector

Y \in R_{+}^{p}

has a multivariate log-normal distribution with median vector

μ \in R_{+}^{p}

and dispersion matrix

Σ (p \times p) > 0

, denoted by

Y \sim L N_{p} (μ, Σ)

, if

\log (Y) \sim N_{p} (\log (μ), Σ)

, where the log denotes the natural logarithm function. The PDF of

Y \sim L N_{p} (μ, Σ)

is (Morán-Vásquez et al. [14])

L N_{p} (y; μ, Σ) = ϕ_{p} (\log (y); \log (μ), Σ) \prod_{k = 1}^{p} \frac{1}{y_{k}}, y \in R_{+}^{p} .

(1)

The multivariate log-normal distribution given in (1) has a slightly different parameterization than the one used by Fang et al. ([16], Section 2.8).

Let

α_{1}, \dots, α_{p}

be fixed values in

(0, 1)

. Theorem 5 and Corollary 2 of Morán-Vásquez and Ferrari [13] permit us to establish that the

α_{k}

-quantile

Q_{k}

of

Y_{k}

satisfies

Q_{k} = μ_{k} \exp (\sqrt{σ_{k k}} q_{k})

, where

q_{k}

is the

α_{k}

-quantile of a standard normal distribution,

k = 1, \dots, p

. Note that the quantile vector

Q = {(Q_{1}, \dots, Q_{p})}^{'}

can be expressed as

Q = D_{μ} \exp (Ω q),

(2)

where

Ω = {(Σ ⊙ I_{p})}^{1 / 2}

and

q = {(q_{1}, \dots, q_{p})}^{'}

. A reparameterization of the multivariate log-normal distribution in terms of

Q

is obtained by replacing

μ = D_{Q} \exp (- Ω q)

in (1). Based on this, we present the quantile-based multivariate log-normal distribution in Definition 1.
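The map between the median vector and the quantile vector in (2) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the numerical values of the medians, the diagonal of Σ, and the quantile levels are hypothetical choices made for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def quantile_vector(mu, sigma_diag, alpha):
    """Q_k = mu_k * exp(sqrt(sigma_kk) * q_k), q_k the alpha_k-quantile of N(0, 1)."""
    q = [NormalDist().inv_cdf(a) for a in alpha]
    Q = [m * exp(sqrt(s) * qk) for m, s, qk in zip(mu, sigma_diag, q)]
    return Q, q


def median_vector(Q, sigma_diag, q):
    """Inverse map: mu_k = Q_k * exp(-sqrt(sigma_kk) * q_k)."""
    return [Qk * exp(-sqrt(s) * qk) for Qk, s, qk in zip(Q, sigma_diag, q)]


mu = [15.0, 100.0]         # hypothetical marginal medians
sigma_diag = [0.04, 0.01]  # hypothetical diagonal elements of Sigma
alpha = [0.25, 0.5]        # first quartile of Y_1, median of Y_2

Q, q = quantile_vector(mu, sigma_diag, alpha)
mu_back = median_vector(Q, sigma_diag, q)
```

Since log(Y_k) is normal with mean log(μ_k) and variance σ_kk, the value Q_k returned here is the α_k-quantile of Y_k, and applying the inverse map recovers μ.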

Definition 1.

Let

α = {(α_{1}, \dots, α_{p})}^{'} \in {(0, 1)}^{p}

and

q = {(q_{1}, \dots, q_{p})}^{'} \in R^{p}

be fixed vectors such that

q_{k}

is the

α_{k}

-quantile of a standard normal distribution,

k = 1, \dots, p

. The random vector

Y \in R_{+}^{p}

is said to have a quantile-based multivariate log-normal distribution with quantile vector

Q = {(Q_{1}, \dots, Q_{p})}^{'}

and dispersion matrix

Σ (p \times p) > 0

, denoted by

Y \sim Q L N_{p} (Q, Σ, q)

, if its PDF is

Q L N_{p} (y; Q, Σ, q) = ϕ_{p} (\log (y); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{y_{k}}, y \in R_{+}^{p},

(3)

where

Ω = {(Σ ⊙ I_{p})}^{1 / 2}

.

If we choose

α = {(1 / 2, \dots, 1 / 2)}^{'}

, then Definition 1 coincides with the definition of a multivariate log-normal distribution (Morán-Vásquez et al. [14]). In this case,

Q = μ

is the median vector. Note that

Y \sim Q L N_{p} (Q, Σ, q)

if

\log (Y) \sim N_{p} (\log (Q) - Ω q, Σ)

, which establishes the way in which the quantile-based multivariate log-normal and normal distributions are related through the logarithmic transformation.

Figure 1 displays contour plots (at levels 0.15, 0.1, 0.05, 0.02, 0.01) of the quantile-based bivariate log-normal distribution. The legend indicates the values of

α_{1}

,

α_{2}

, and all the parameters considered in the first plot and the values that are changed from a plot to the subsequent one (in alphabetical order). The parameters

Q_{1}

and

Q_{2}

of the distribution in Figure 1a are the marginal medians of

Y_{1}

and

Y_{2}

, respectively. For Figure 1b–f, these parameters are the first quartile of

Y_{1}

and the median of

Y_{2}

, respectively. The parameter

Q_{2}

impacts the scale of the marginal distribution of

Y_{2}

(Figure 1b,c). The parameter

σ_{11}

controls the dispersion of the marginal distribution of

Y_{1}

(Figure 1c,d). The parameter

σ_{12}

controls the association between the marginal distributions of

Y_{1}

and

Y_{2}

, ranging from a negative to positive association (Figure 1d–f).

The quantile-based multivariate log-normal distribution is suitable in situations where it is necessary to model quantiles of the marginals, taking into account the correlation between them. Additionally, our model can be useful for regression modeling purposes. For instance, assume that, for fixed k,

\log (Q_{k}) = \sum_{j = 1}^{r} β_{j} x_{j}

, where

β_{1}, \dots, β_{r}

are unknown regression parameters and

x_{1}, \dots, x_{r}

are fixed covariates. So,

\exp (β_{j})

is the multiplicative effect of a one unit increase in

x_{j}

on the

α_{k}

-quantile of

Y_{k}

. This is a parametric methodology that allows us to jointly analyze marginal quantiles, taking into account the association among the response variables through the dispersion matrix

Σ (p \times p) > 0

. These types of models can provide more accurate estimates than those that consider univariate models for each marginal assuming independence among them (Morán-Vásquez et al. [14]).
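The multiplicative interpretation of the regression coefficients can be illustrated with a short computation. This is only a sketch: the coefficient and covariate values are hypothetical, and the helper name is introduced here for illustration.

```python
from math import exp


def quantile_from_covariates(beta, x):
    """Q_k = exp(sum_j beta_j * x_j), i.e., log(Q_k) is linear in the covariates."""
    return exp(sum(b * xj for b, xj in zip(beta, x)))


beta = [2.0, 0.08]   # hypothetical regression parameters
x = [1.0, 3.0]       # hypothetical covariates (first one is an intercept)
x_plus = [1.0, 4.0]  # same covariates with x_2 increased by one unit

ratio = quantile_from_covariates(beta, x_plus) / quantile_from_covariates(beta, x)
```

The ratio of the two quantiles equals exp(0.08), regardless of the starting value of the covariate.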

3. Main Properties

Theorems 1–3 state distributional results involving the transformation of quantile-based multivariate log-normal random vectors.

Theorem 1.

Let

θ \in R_{+}^{p}

. If

Y \sim Q L N_{p} (Q, Σ, q)

, then

D_{θ} Y \sim Q L N_{p} (D_{θ} Q, Σ, q)

.

Proof.

From the transformation

U = D_{θ} Y

, with the Jacobian

J (y \to u) = \prod_{k = 1}^{p} θ_{k}^{- 1}

, in (3), we arrive at

\begin{matrix} f_{U} (u) & = ϕ_{p} (\log (D_{θ}^{- 1} u); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{u_{k}} \\ & = ϕ_{p} (\log (u) - \log (θ); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{u_{k}} . \end{matrix}

(4)

Since

ϕ_{p} (\log (u) - \log (θ); \log (Q) - Ω q, Σ) = ϕ_{p} (\log (u); \log (Q) + \log (θ) - Ω q, Σ)

, (4) can be expressed as

f_{U} (u) = ϕ_{p} (\log (u); \log (D_{θ} Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{u_{k}},

where the last line is obtained by using the identity

\log (Q) + \log (θ) = \log (D_{θ} Q)

. □

Corollary 1.

Let

θ \in R_{+}^{p}

. If

Y \sim L N_{p} (μ, Σ)

, then

D_{θ} Y \sim L N_{p} (D_{θ} μ, Σ)

.

Proof.

The result follows by applying Theorem 1 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

The result stated in the above corollary can also be obtained as a particular case of Theorem 3(1) of Morán-Vásquez and Ferrari [13].

Theorem 2.

Let

β \in R^{p}

have nonzero components. If

Y \sim Q L N_{p} (Q, Σ, q)

, then

Y^{β} \sim Q L N_{p} (Q^{β}, D_{β} Σ D_{β}, q^{*})

, where

q^{*} = D_{sgn (β)} q

.

Proof.

Transforming

T = Y^{β}

, with the Jacobian

J (y \to t) = \prod_{k = 1}^{p} {| β_{k} |}^{- 1} t_{k}^{1 / β_{k} - 1}

, in (3), we have

\begin{matrix} f_{T} (t) & = ϕ_{p} (\log (t^{1 / β}); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{| β_{k} | t_{k}} \\ & = ϕ_{p} (D_{β}^{- 1} \log (t); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{| β_{k} | t_{k}} . \end{matrix}

(5)

By using the identity

ϕ_{p} (D_{β}^{- 1} \log (t); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} \frac{1}{| β_{k} |} = ϕ_{p} (\log (t); D_{β} \log (Q) - Ω^{*} q^{*}, D_{β} Σ D_{β}),

with

Ω^{*} = {(D_{β} Σ D_{β} ⊙ I_{p})}^{1 / 2}

and

q^{*} = D_{sgn (β)} q

, in (5), we have

f_{T} (t) = ϕ_{p} (\log (t); \log (Q^{β}) - Ω^{*} q^{*}, D_{β} Σ D_{β}) \prod_{k = 1}^{p} \frac{1}{t_{k}},

where the last line is derived by noting that

D_{β} \log (Q) = \log (Q^{β})

. □

Corollary 2.

Let

β \in R^{p}

with nonzero components. If

Y \sim L N_{p} (μ, Σ)

, then

Y^{β} \sim L N_{p} (μ^{β}, D_{β} Σ D_{β})

.

Proof.

The result follows by applying Theorem 2 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

The above corollary can also be obtained as a particular case of Theorem 3(2) of Morán-Vásquez and Ferrari [13].
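The parameter map in Theorem 2 can be verified on a small example. The sketch below (standard library only; function names and numerical values are hypothetical) computes the parameters of the power-transformed vector and compares the location of its logarithm with the componentwise product of β and the original location, which is what the theorem implies.

```python
from math import copysign, log, sqrt


def power_transform_params(Q, Sigma, q, beta):
    """Parameters (Q^beta, D_beta Sigma D_beta, D_sgn(beta) q) of Y^beta."""
    p = len(Q)
    Q_new = [Q[k] ** beta[k] for k in range(p)]
    Sigma_new = [[beta[j] * Sigma[j][k] * beta[k] for k in range(p)] for j in range(p)]
    q_new = [copysign(1.0, beta[k]) * q[k] for k in range(p)]
    return Q_new, Sigma_new, q_new


def location(Q, Sigma, q):
    """Mean vector of log(Y): log(Q) - Omega q, with Omega = (Sigma ⊙ I_p)^(1/2)."""
    return [log(Q[k]) - sqrt(Sigma[k][k]) * q[k] for k in range(len(Q))]


Q = [2.0, 5.0]                          # hypothetical quantile vector
Sigma = [[0.09, 0.03], [0.03, 0.16]]    # hypothetical dispersion matrix
q = [-0.6745, 0.0]                      # approximate 0.25- and 0.5-quantiles of N(0, 1)
beta = [-1.0, 2.0]                      # nonzero exponents, one of them negative

Q_new, Sigma_new, q_new = power_transform_params(Q, Sigma, q, beta)
loc_old = location(Q, Sigma, q)
loc_new = location(Q_new, Sigma_new, q_new)
```

A negative exponent flips the sign of the corresponding component of q, so a first quartile of Y_1 becomes a third quartile of 1/Y_1.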

Theorem 3.

Let

λ \in R^{p} ∖ 0

. If

Y \sim Q L N_{p} (Q, Σ, q)

, then

\prod_{k = 1}^{p} Y_{k}^{λ_{k}} \sim Q L N_{1} (\prod_{k = 1}^{p} Q_{k}^{λ_{k}}, λ^{'} Σ λ, \tilde{q}),

where

\tilde{q} = {(λ^{'} Σ λ)}^{- 1 / 2} λ^{'} Ω q

.

Proof.

Since

\log (Y) \sim N_{p} (\log (Q) - Ω q, Σ)

, we have

λ^{'} \log (Y) = \log (\prod_{k = 1}^{p} Y_{k}^{λ_{k}}) \sim N_{1} (\log (\prod_{k = 1}^{p} Q_{k}^{λ_{k}}) - \tilde{ω} \tilde{q}, λ^{'} Σ λ),

where

\tilde{ω} = {(λ^{'} Σ λ)}^{1 / 2}

and

\tilde{q} = {(λ^{'} Σ λ)}^{- 1 / 2} λ^{'} Ω q

. This completes the proof. □

Corollary 3.

Let

λ \in R^{p} ∖ 0

. If

Y \sim L N_{p} (μ, Σ)

, then

\prod_{k = 1}^{p} Y_{k}^{λ_{k}} \sim L N_{1} (\prod_{k = 1}^{p} μ_{k}^{λ_{k}}, λ^{'} Σ λ) .

Proof.

Simply apply Theorem 3 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □
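As an illustration of Theorem 3, the following sketch computes the parameters of the univariate distribution of a product of powers, here the ratio Y_1 / Y_2. It uses only the standard library; the function name and the numerical values are hypothetical.

```python
from math import log, prod, sqrt


def product_params(Q, Sigma, q, lam):
    """Parameters of prod_k Y_k^lam_k as a univariate quantile-based log-normal."""
    p = len(Q)
    # lam' Sigma lam: dispersion of the product on the log scale
    var = sum(lam[j] * Sigma[j][k] * lam[k] for j in range(p) for k in range(p))
    Q_tilde = prod(Q[k] ** lam[k] for k in range(p))
    # q_tilde = (lam' Sigma lam)^(-1/2) * lam' Omega q
    q_tilde = sum(lam[k] * sqrt(Sigma[k][k]) * q[k] for k in range(p)) / sqrt(var)
    return Q_tilde, var, q_tilde


Q = [2.0, 5.0]                          # hypothetical quantile vector
Sigma = [[0.09, 0.03], [0.03, 0.16]]    # hypothetical dispersion matrix
q = [-0.6745, 0.0]                      # approximate 0.25- and 0.5-quantiles of N(0, 1)
lam = [1.0, -1.0]                       # exponents giving the ratio Y_1 / Y_2

Q_tilde, var, q_tilde = product_params(Q, Sigma, q, lam)
loc_product = log(Q_tilde) - sqrt(var) * q_tilde
loc_direct = sum(lam[k] * (log(Q[k]) - sqrt(Sigma[k][k]) * q[k]) for k in range(2))
```

The two locations agree, which reflects that the logarithm of the product is the linear combination λ' log(Y) of a normal vector.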

In Theorem 4, we give a closed-form expression for the mixed moments of quantile-based multivariate log-normal random vectors.

Theorem 4.

Let

λ \in R^{p} ∖ 0

. If

Y \sim Q L N_{p} (Q, Σ, q)

, then

E (\prod_{k = 1}^{p} Y_{k}^{λ_{k}}) = \exp (- λ^{'} Ω q + \frac{1}{2} λ^{'} Σ λ) \prod_{k = 1}^{p} Q_{k}^{λ_{k}} .

(6)

Proof.

From (3), we have

E (\prod_{k = 1}^{p} Y_{k}^{λ_{k}}) = \int_{R_{+}^{p}} ϕ_{p} (\log (y); \log (Q) - Ω q, Σ) \prod_{k = 1}^{p} y_{k}^{λ_{k} - 1} d y .

(7)

By making the change of variables

y = \exp (x)

, with the Jacobian

J (y \to x) = \exp (\sum_{k = 1}^{p} x_{k})

, in (7), we arrive at

E (\prod_{k = 1}^{p} Y_{k}^{λ_{k}}) = M_{X} (λ)

, where

M_{X}

is the moment-generating function of

X \sim N_{p} (\log (Q) - Ω q, Σ)

. This completes the proof. □

In the following corollary, we derive the expected value and the covariance matrix of a quantile-based multivariate log-normal random vector.

Corollary 4.

Let

Y \sim Q L N_{p} (Q, Σ, q)

. Then,

1. $E (Y) = D_{Q} \exp (- Ω q + σ / 2)$, where $σ = {(σ_{11}, \dots, σ_{p p})}^{'}$ is the vector of the main diagonal elements of $Σ$.
2. $Cov (Y) = {(Cov (Y_{j}, Y_{k}))}_{p \times p}$, where

$Cov (Y_{j}, Y_{k}) = Q_{j} Q_{k} \exp (- \sqrt{σ_{j j}} q_{j} - \sqrt{σ_{k k}} q_{k} + \frac{1}{2} (σ_{j j} + σ_{k k})) (\exp (σ_{j k}) - 1) .$

(8)

Proof.

For each

k = 1, \dots, p

, by choosing

λ

with all its components being 0, except the kth which is 1, in (6), we obtain

E (Y_{k}) = Q_{k} \exp (- \sqrt{σ_{k k}} q_{k} + \frac{σ_{k k}}{2}) .

From the above expression, we get the first assertion. Similarly, for each

j, k = 1, \dots, p

, by choosing

λ

with all components equal to 0, except the jth and kth, which are 1, in (6), we have

E (Y_{j} Y_{k}) = Q_{j} Q_{k} \exp (- \sqrt{σ_{j j}} q_{j} - \sqrt{σ_{k k}} q_{k} + \frac{1}{2} (σ_{j j} + σ_{k k}) + σ_{j k}) .

The second assertion is obtained from the identity

Cov (Y_{j}, Y_{k}) = E (Y_{j} Y_{k}) - E (Y_{j}) E (Y_{k})

. □
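The relation between (6) and (8) can be checked numerically. The sketch below (standard library only) implements the mixed moment, the marginal mean, and the covariance entry, and compares (8) with the difference of moments used in the proof. Function names and parameter values are hypothetical.

```python
from math import exp, sqrt


def mixed_moment(Q, Sigma, q, lam):
    """E(prod_k Y_k^lam_k) according to Equation (6)."""
    p = len(Q)
    lin = sum(lam[k] * sqrt(Sigma[k][k]) * q[k] for k in range(p))      # lam' Omega q
    quad = sum(lam[j] * Sigma[j][k] * lam[k] for j in range(p) for k in range(p))
    scale = 1.0
    for k in range(p):
        scale *= Q[k] ** lam[k]
    return exp(-lin + 0.5 * quad) * scale


def mean_entry(Q, Sigma, q, k):
    """E(Y_k) = Q_k exp(-sqrt(sigma_kk) q_k + sigma_kk / 2)."""
    return Q[k] * exp(-sqrt(Sigma[k][k]) * q[k] + 0.5 * Sigma[k][k])


def cov_entry(Q, Sigma, q, j, k):
    """Cov(Y_j, Y_k) according to Equation (8)."""
    sjj, skk = Sigma[j][j], Sigma[k][k]
    base = Q[j] * Q[k] * exp(-sqrt(sjj) * q[j] - sqrt(skk) * q[k] + 0.5 * (sjj + skk))
    return base * (exp(Sigma[j][k]) - 1.0)


Q = [14.0, 95.0]                        # hypothetical quantile vector
Sigma = [[0.04, 0.012], [0.012, 0.01]]  # hypothetical dispersion matrix
q = [-0.6745, 0.0]                      # approximate 0.25- and 0.5-quantiles of N(0, 1)

cov_12 = cov_entry(Q, Sigma, q, 0, 1)
check_12 = (mixed_moment(Q, Sigma, q, [1, 1])
            - mean_entry(Q, Sigma, q, 0) * mean_entry(Q, Sigma, q, 1))
var_1 = cov_entry(Q, Sigma, q, 0, 0)
check_var_1 = mixed_moment(Q, Sigma, q, [2, 0]) - mean_entry(Q, Sigma, q, 0) ** 2
```

The diagonal case uses λ with a single component equal to 2, which gives the second moment of Y_k.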

In Section 2, we described the behavior of the quantile-based multivariate log-normal distribution in terms of the parameters involved in the matrix

Σ

. The following corollary establishes an exact interpretation of these parameters in terms of covariance between pairs of variables according to their signs.

Corollary 5.

Let

Y \sim Q L N_{p} (Q, Σ, q)

. Then,

1. $Cov (Y_{j}, Y_{k}) > 0$ if and only if $σ_{j k} > 0$, $j \neq k$.
2. $Cov (Y_{j}, Y_{k}) = 0$ if and only if $σ_{j k} = 0$, $j \neq k$.

Proof.

The result follows from (8). □

Corollary 6.

Let

λ \in R^{p} ∖ 0

. If

Y \sim L N_{p} (μ, Σ)

, then

E (\prod_{k = 1}^{p} Y_{k}^{λ_{k}}) = \exp (\frac{1}{2} λ^{'} Σ λ) \prod_{k = 1}^{p} μ_{k}^{λ_{k}} .

Moreover,

E (Y) = D_{μ} \exp (σ / 2)

, where

σ = {(σ_{11}, \dots, σ_{p p})}^{'}

is the vector with elements being the main diagonal elements of Σ, and

Cov (Y) = {(Cov (Y_{j}, Y_{k}))}_{p \times p}

, with

Cov (Y_{j}, Y_{k}) = μ_{j} μ_{k} \exp (\frac{1}{2} (σ_{j j} + σ_{k k})) (\exp (σ_{j k}) - 1) .

Proof.

Apply Theorem 4 and Corollary 4 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

Theorem 5 gives a closed-form expression for the mode of the quantile-based multivariate log-normal distribution.

Theorem 5.

The mode of

Y \sim Q L N_{p} (Q, Σ, q)

is given by

Mode (Y) = D_{Q} \exp (- Ω q - Σ 1)

. The value of the PDF of

Y

at the mode is

Q L N_{p} (Mode (Y); Q, Σ, q) = {(2 π)}^{- p / 2} \det {(Σ)}^{- 1 / 2} \exp (\frac{1}{2} 1^{'} Σ 1 - 1^{'} (\log (Q) - Ω q)) .

Proof.

The mode of

Y \sim Q L N_{p} (Q, Σ, q)

is obtained by maximizing (3) with respect to

y

, which is equivalent to maximizing the function

f (y) = - \frac{1}{2} δ_{Σ} (\log (y), \log (Q) - Ω q) - \sum_{k = 1}^{p} \log (y_{k})

with respect to

y

. By using results on vector differentiation (Seber ([17], Chapter 17)), we find that the equation

\partial f / \partial y = 0

is equivalent to

Σ^{- 1} (\log (y) - \log (Q) + Ω q) + 1 = 0 .

The solution for

y

of the above equation is

D_{Q} \exp (- Ω q - Σ 1)

.

Now, for

y \in R_{+}^{p}

, we have

{(\log (y) - \log (Q) + Ω q + Σ 1)}^{'} Σ^{- 1} (\log (y) - \log (Q) + Ω q + Σ 1) \geq 0,

which implies that

\begin{matrix} Q L N_{p} (y; Q, Σ, q) & \leq Q L N_{p} (D_{Q} \exp (- Ω q - Σ 1); Q, Σ, q) \\ & = {(2 π)}^{- p / 2} \det {(Σ)}^{- 1 / 2} \exp (\frac{1}{2} 1^{'} Σ 1 - 1^{'} (\log (Q) - Ω q)), \end{matrix}

for all

y \in R_{+}^{p}

. Hence,

Mode (Y) = D_{Q} \exp (- Ω q - Σ 1)

. □
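The mode formula can be checked numerically in the bivariate case. The following sketch (standard library only; function names and parameter values are hypothetical) evaluates the density (3) at the mode, compares it with the closed form obtained by substituting the mode into (3), which carries the factor det(Σ)^{-1/2} of the normal density, and confirms that nearby points have a smaller density.

```python
from math import exp, log, pi, sqrt


def qln2_pdf(y, Q, Sigma, q):
    """Density (3) for p = 2, with Sigma = [[s11, s12], [s12, s22]]."""
    s11, s12, s22 = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    m = [log(Q[0]) - sqrt(s11) * q[0], log(Q[1]) - sqrt(s22) * q[1]]
    z = [log(y[0]) - m[0], log(y[1]) - m[1]]
    det = s11 * s22 - s12 ** 2
    # squared Mahalanobis distance via the explicit 2 x 2 inverse
    delta = (s22 * z[0] ** 2 - 2 * s12 * z[0] * z[1] + s11 * z[1] ** 2) / det
    return exp(-0.5 * delta) / (2 * pi * sqrt(det) * y[0] * y[1])


def qln2_mode(Q, Sigma, q):
    """Mode(Y) = D_Q exp(-Omega q - Sigma 1); sum(Sigma[k]) is the k-th entry of Sigma 1."""
    return [Q[k] * exp(-sqrt(Sigma[k][k]) * q[k] - sum(Sigma[k])) for k in range(2)]


def qln2_pdf_at_mode(Q, Sigma, q):
    """(2 pi)^(-1) det(Sigma)^(-1/2) exp(0.5 * 1' Sigma 1 - 1'(log Q - Omega q))."""
    det = Sigma[0][0] * Sigma[1][1] - Sigma[0][1] ** 2
    total = sum(sum(row) for row in Sigma)  # 1' Sigma 1
    loc_sum = sum(log(Q[k]) - sqrt(Sigma[k][k]) * q[k] for k in range(2))
    return exp(0.5 * total - loc_sum) / (2 * pi * sqrt(det))


Q = [14.0, 95.0]                        # hypothetical quantile vector
Sigma = [[0.04, 0.012], [0.012, 0.01]]  # hypothetical dispersion matrix
q = [-0.6745, 0.0]                      # approximate 0.25- and 0.5-quantiles of N(0, 1)

mode = qln2_mode(Q, Sigma, q)
peak = qln2_pdf(mode, Q, Sigma, q)
closed_form = qln2_pdf_at_mode(Q, Sigma, q)
```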

Corollary 7.

The mode of

Y \sim L N_{p} (μ, Σ)

is given by

Mode (Y) = D_{μ} \exp (- Σ 1)

. The value of the PDF of

Y

at the mode is

L N_{p} (Mode (Y); μ, Σ) = {(2 π)}^{- p / 2} \det {(Σ)}^{- 1 / 2} \exp (\frac{1}{2} 1^{'} Σ 1 - 1^{'} \log (μ)) .

Proof.

Apply Theorem 5 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

Theorem 6 provides the distribution of a Mahalanobis-type distance involving a quantile-based multivariate log-normal random vector.

Theorem 6.

If

Y \sim Q L N_{p} (Q, Σ, q)

, then

δ_{Σ} (\log (Y), \log (Q) - Ω q) \sim χ_{p}^{2}

.

Proof.

The result follows by noting that

\log (Y) \sim N_{p} (\log (Q) - Ω q, Σ)

. □

The above result allows us to evaluate the goodness of fit of the quantile-based multivariate log-normal distribution by using quantile–quantile plots to compare empirical Mahalanobis distances with theoretical quantiles obtained from a chi-squared distribution with p degrees of freedom.

The Shannon entropy (also called differential entropy) of a continuous random vector

X \in R^{p}

with PDF

f_{X}

is defined as

H (X) = - E [\log (f_{X} (X))] .

On the other hand, the Kullback–Leibler (KL) divergence between the distributions of two p-dimensional random vectors

T

and

U

is given by

D_{K L} (T, U) = E [\log (\frac{f_{T} (T)}{f_{U} (T)})],

where

f_{T}

and

f_{U}

denote the PDFs of

T

and

U

, respectively. The above expected value is defined with respect to the PDF

f_{T}

. A detailed study about Shannon entropy and KL divergence can be found in Pardo [18].

Lemmas 1 and 2 provide the Shannon entropy and the KL divergence for the multivariate normal distribution, respectively.

Lemma 1.

The Shannon entropy of

X \sim N_{p} (μ, Σ)

is given by

H (X) = \frac{1}{2} \log [\det (Σ) {(2 π e)}^{p}] .

Proof.

See Pardo ([18], p. 32). □

Note that

H (X)

in the above lemma can be expressed as

H (X) = \frac{p}{2} (1 + \log (2 π)) + \frac{1}{2} \log (\det (Σ)) .

Lemma 2.

The KL divergence between

X \sim N_{p} (μ_{a}, Σ_{a})

and

W \sim N_{p} (μ_{b}, Σ_{b})

is given by

D_{K L} (X, W) = \frac{1}{2} [\log (\frac{\det (Σ_{b})}{\det (Σ_{a})}) + tr (Σ_{b}^{- 1} Σ_{a} - I_{p}) + {(μ_{a} - μ_{b})}^{'} Σ_{b}^{- 1} (μ_{a} - μ_{b})] .

Proof.

See Pardo ([18], p. 33). □

In the following Theorem, we derive the Shannon entropy of the quantile-based multivariate log-normal distribution.

Theorem 7.

The Shannon entropy of

Y \sim Q L N_{p} (Q, Σ, q)

is given by

H (Y) = \frac{p}{2} (1 + \log (2 π)) + \frac{1}{2} \log (\det (Σ)) + 1^{'} (\log (Q) - Ω q) .

Proof.

By definition,

\begin{matrix} H (Y) & = - E [\log (Q L N_{p} (Y; Q, Σ, q))] \\ & = - \int_{R_{+}^{p}} \log (Q L N_{p} (y; Q, Σ, q)) Q L N_{p} (y; Q, Σ, q) d y . \end{matrix}

By making the change of variables

y = \exp (x)

, with Jacobian

J (y \to x) = \exp (\sum_{k = 1}^{p} x_{k})

, in the above integral, we have

\begin{matrix} H (Y) & = - \int_{R^{p}} [\log (ϕ_{p} (x; \log (Q) - Ω q, Σ)) - 1^{'} x] ϕ_{p} (x; \log (Q) - Ω q, Σ) d x \\ & = H (X) + 1^{'} E (X), \end{matrix}

where

X \sim N_{p} (\log (Q) - Ω q, Σ)

. The result follows by calculating

H (X)

by using Lemma 1 and replacing

E (X) = \log (Q) - Ω q

in the above expression. □
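The entropy formula in Theorem 7 is straightforward to evaluate. The sketch below (standard library only; the function name and the parameter values are hypothetical) takes the diagonal of Σ and its determinant as inputs, and checks two consequences of the formula: agreement with the familiar univariate log-normal entropy and the additive effect of rescaling Q.

```python
from math import e, log, pi, sqrt


def qln_entropy(Q, sigma_diag, det_Sigma, q):
    """H(Y) = p/2 (1 + log(2 pi)) + 1/2 log det(Sigma) + 1'(log(Q) - Omega q)."""
    p = len(Q)
    loc_sum = sum(log(Q[k]) - sqrt(sigma_diag[k]) * q[k] for k in range(p))
    return 0.5 * p * (1.0 + log(2.0 * pi)) + 0.5 * log(det_Sigma) + loc_sum


# Univariate check with q = 0, so that Q is the median: log(Q) + 1/2 log(2 pi e sigma^2)
h_uni = qln_entropy([3.0], [0.25], 0.25, [0.0])
h_uni_known = log(3.0) + 0.5 * log(2.0 * pi * e * 0.25)

# Bivariate example with hypothetical parameters
sigma_diag = [0.04, 0.01]
det_Sigma = 0.04 * 0.01 - 0.012 ** 2
h_biv = qln_entropy([14.0, 95.0], sigma_diag, det_Sigma, [-0.6745, 0.0])
h_biv_scaled = qln_entropy([28.0, 95.0], sigma_diag, det_Sigma, [-0.6745, 0.0])
```

Doubling Q_1 adds log(2) to the entropy, in line with Theorem 1 and the term 1' log(Q) in the formula.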

Corollary 8.

The Shannon entropy of

Y \sim L N_{p} (μ, Σ)

is given by

H (Y) = \frac{p}{2} (1 + \log (2 π)) + \frac{1}{2} \log (\det (Σ)) + 1^{'} \log (μ) .

Proof.

The result follows by applying Theorem 7 to the quantile-based multivariate log-normal distribution generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

In Theorem 8, we derive the KL divergence between two quantile-based multivariate log-normal distributions.

Theorem 8.

The KL divergence between

T \sim Q L N_{p} (Q_{a}, Σ_{a}, q_{a})

and

U \sim Q L N_{p} (Q_{b}, Σ_{b}, q_{b})

is given by

\begin{matrix} D_{K L} (T, U) = \frac{1}{2} [ & \log (\frac{\det (Σ_{b})}{\det (Σ_{a})}) + tr (Σ_{b}^{- 1} Σ_{a} - I_{p}) \\ & + {(\log (Q_{a}) - Ω_{a} q_{a} - \log (Q_{b}) + Ω_{b} q_{b})}^{'} Σ_{b}^{- 1} (\log (Q_{a}) - Ω_{a} q_{a} - \log (Q_{b}) + Ω_{b} q_{b})] . \end{matrix}

Proof.

By definition,

\begin{matrix} D_{K L} (T, U) & = E [\log (\frac{Q L N_{p} (T; Q_{a}, Σ_{a}, q_{a})}{Q L N_{p} (T; Q_{b}, Σ_{b}, q_{b})})] \\ & = \int_{R_{+}^{p}} \log (\frac{Q L N_{p} (t; Q_{a}, Σ_{a}, q_{a})}{Q L N_{p} (t; Q_{b}, Σ_{b}, q_{b})}) Q L N_{p} (t; Q_{a}, Σ_{a}, q_{a}) d t . \end{matrix}

We substitute

t = \exp (w)

, with Jacobian

J (t \to w) = \exp (\sum_{k = 1}^{p} w_{k})

, above to arrive at

\begin{matrix} D_{K L} (T, U) & = \int_{R^{p}} \log (\frac{ϕ_{p} (w; \log (Q_{a}) - Ω_{a} q_{a}, Σ_{a})}{ϕ_{p} (w; \log (Q_{b}) - Ω_{b} q_{b}, Σ_{b})}) ϕ_{p} (w; \log (Q_{a}) - Ω_{a} q_{a}, Σ_{a}) d w \\ & = D_{K L} (T^{*}, U^{*}), \end{matrix}

where

T^{*} \sim N_{p} (\log (Q_{a}) - Ω_{a} q_{a}, Σ_{a})

and

U^{*} \sim N_{p} (\log (Q_{b}) - Ω_{b} q_{b}, Σ_{b})

. By using Lemma 2 to calculate

D_{K L} (T^{*}, U^{*})

we arrive at the desired result. □
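A bivariate version of the KL divergence in Theorem 8 can be sketched as follows, using only the standard library. The function name and all parameter values are hypothetical. The example also illustrates that two parameterizations of the same distribution (one through first quartile and median, one through medians) have zero divergence.

```python
from math import exp, log, sqrt


def qln2_kl(Qa, Sa, qa, Qb, Sb, qb):
    """KL divergence between QLN_2(Qa, Sa, qa) and QLN_2(Qb, Sb, qb)."""
    ma = [log(Qa[k]) - sqrt(Sa[k][k]) * qa[k] for k in range(2)]  # log(Qa) - Omega_a qa
    mb = [log(Qb[k]) - sqrt(Sb[k][k]) * qb[k] for k in range(2)]  # log(Qb) - Omega_b qb
    det_a = Sa[0][0] * Sa[1][1] - Sa[0][1] ** 2
    det_b = Sb[0][0] * Sb[1][1] - Sb[0][1] ** 2
    inv_b = [[Sb[1][1] / det_b, -Sb[0][1] / det_b],
             [-Sb[0][1] / det_b, Sb[0][0] / det_b]]
    # tr(Sb^{-1} Sa - I_2)
    trace = sum(inv_b[j][k] * Sa[k][j] for j in range(2) for k in range(2)) - 2.0
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    quad = sum(d[j] * inv_b[j][k] * d[k] for j in range(2) for k in range(2))
    return 0.5 * (log(det_b / det_a) + trace + quad)


Qa = [14.0, 95.0]                       # hypothetical quantile vector
Sa = [[0.04, 0.012], [0.012, 0.01]]     # hypothetical dispersion matrix
qa = [-0.6745, 0.0]                     # approximate 0.25- and 0.5-quantiles of N(0, 1)

Qb = [15.0, 100.0]
Sb = [[0.05, 0.0], [0.0, 0.02]]
qb = [0.0, 0.0]

kl_ab = qln2_kl(Qa, Sa, qa, Qb, Sb, qb)
kl_aa = qln2_kl(Qa, Sa, qa, Qa, Sa, qa)

# Same distribution as (Qa, Sa, qa), written in terms of its median vector
Q_med = [Qa[k] * exp(-sqrt(Sa[k][k]) * qa[k]) for k in range(2)]
kl_same = qln2_kl(Qa, Sa, qa, Q_med, Sa, [0.0, 0.0])
```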

Corollary 9.

The KL divergence between

T \sim Q L N_{p} (Q_{a}, Σ_{a}, q_{a})

and

U \sim L N_{p} (μ_{b}, Σ_{b})

is given by

\begin{matrix} D_{K L} (T, U) = \frac{1}{2} [ & \log (\frac{\det (Σ_{b})}{\det (Σ_{a})}) + tr (Σ_{b}^{- 1} Σ_{a} - I_{p}) \\ & + {(\log (Q_{a}) - Ω_{a} q_{a} - \log (μ_{b}))}^{'} Σ_{b}^{- 1} (\log (Q_{a}) - Ω_{a} q_{a} - \log (μ_{b}))] . \end{matrix}

Proof.

Take the quantile-based multivariate log-normal random vector

U

in Theorem 8 generated by

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

Corollary 10.

The KL divergence between

T \sim L N_{p} (μ_{a}, Σ_{a})

and

U \sim L N_{p} (μ_{b}, Σ_{b})

is given by

\begin{matrix} D_{K L} (T, U) = \frac{1}{2} [ & \log (\frac{\det (Σ_{b})}{\det (Σ_{a})}) + tr (Σ_{b}^{- 1} Σ_{a} - I_{p}) \\ & + {(\log (μ_{a}) - \log (μ_{b}))}^{'} Σ_{b}^{- 1} (\log (μ_{a}) - \log (μ_{b}))] . \end{matrix}

Proof.

Generate the quantile-based multivariate log-normal random vector

T

in Corollary 9 with

α = {(1 / 2, \dots, 1 / 2)}^{'}

. □

To derive results on marginal and conditional distributions and independence for sub-vectors of a random vector having a quantile-based multivariate log-normal distribution, we introduce notations for partitions of

Y \in R_{+}^{p}

,

Q \in R_{+}^{p}

,

q \in R^{p}

, and

Σ (p \times p) > 0

as follows:

Y = {(Y_{1}^{'}, Y_{2}^{'})}^{'}, Q = {(Q_{1}^{'}, Q_{2}^{'})}^{'}, q = {(q_{1}^{'}, q_{2}^{'})}^{'}, Σ = [\begin{matrix} Σ_{11} & Σ_{12} \\ Σ_{21} & Σ_{22} \end{matrix}],

(9)

where

Y_{1} \in R_{+}^{p_{1}}

,

Y_{2} \in R_{+}^{p_{2}}

,

Q_{1} \in R_{+}^{p_{1}}

,

Q_{2} \in R_{+}^{p_{2}}

,

q_{1} \in R^{p_{1}}

,

q_{2} \in R^{p_{2}}

,

Σ_{11} (p_{1} \times p_{1}) > 0

,

Σ_{22} (p_{2} \times p_{2}) > 0

, and

Σ_{12} (p_{1} \times p_{2})

and

Σ_{21} (p_{2} \times p_{1})

are such that

Σ_{12}^{'} = Σ_{21}

. The Schur complement of the block

Σ_{11}

of

Σ

is given by

Σ_{22 \cdot 1} = Σ_{22} - Σ_{21} Σ_{11}^{- 1} Σ_{12}

. Also, we define

Ω_{1} = {(Σ_{11} ⊙ I_{p_{1}})}^{1 / 2}

,

Ω_{2} = {(Σ_{22} ⊙ I_{p_{2}})}^{1 / 2}

, and

Q_{2 \cdot 1} = D_{Q_{2}} \exp (Σ_{21} Σ_{11}^{- 1} (\log (y_{1}) - \log (Q_{1}) + Ω_{1} q_{1}))

. The dimension p is such that

p = p_{1} + p_{2}

.

In Lemma 3, we give a factorization of the PDF of the quantile-based multivariate log-normal distribution.

Lemma 3.

Consider the partitions given in (9). The PDF of

Y \sim Q L N_{p} (Q, Σ, q)

can be expressed as

Q L N_{p} (y; Q, Σ, q) = Q L N_{p_{1}} (y_{1}; Q_{1}, Σ_{11}, q_{1}) Q L N_{p_{2}} (y_{2}; Q_{2 \cdot 1}, Σ_{22 \cdot 1}, q_{2 \cdot 1}),

(10)

where

q_{2 \cdot 1} = Ω_{2 \cdot 1}^{- 1} Ω_{2} q_{2}

, with

Ω_{2 \cdot 1} = {(Σ_{22 \cdot 1} ⊙ I_{p_{2}})}^{1 / 2}

.

Proof.

It suffices to show that

\begin{matrix} δ_{Σ} (\log (y), \log (Q) - Ω q) = {} & δ_{Σ_{11}} (\log (y_{1}), \log (Q_{1}) - Ω_{1} q_{1}) \\ & + δ_{Σ_{22 \cdot 1}} (\log (y_{2}), \log (Q_{2 \cdot 1}) - Ω_{2 \cdot 1} q_{2 \cdot 1}) . \end{matrix}

A straightforward calculation shows that

\begin{matrix} & δ_{Σ} (\log (y), \log (Q) - Ω q) - δ_{Σ_{11}} (\log (y_{1}), \log (Q_{1}) - Ω_{1} q_{1}) \\ & \quad = {(\log (y_{1}) - \log (Q_{1}) + Ω_{1} q_{1})}^{'} Σ_{11}^{- 1} Σ_{12} Σ_{22 \cdot 1}^{- 1} Σ_{21} Σ_{11}^{- 1} (\log (y_{1}) - \log (Q_{1}) + Ω_{1} q_{1}) \\ & \quad - 2 {(\log (y_{1}) - \log (Q_{1}) + Ω_{1} q_{1})}^{'} Σ_{11}^{- 1} Σ_{12} Σ_{22 \cdot 1}^{- 1} (\log (y_{2}) - \log (Q_{2}) + Ω_{2} q_{2}) \\ & \quad + {(\log (y_{2}) - \log (Q_{2}) + Ω_{2} q_{2})}^{'} Σ_{22 \cdot 1}^{- 1} (\log (y_{2}) - \log (Q_{2}) + Ω_{2} q_{2}) . \end{matrix}

Now, using the result

(\log (Q_{2 \cdot 1}) - Ω_{2 \cdot 1} q_{2 \cdot 1}) - (\log (Q_{2}) - Ω_{2} q_{2}) = Σ_{21} Σ_{11}^{- 1} (\log (y_{1}) - \log (Q_{1}) + Ω_{1} q_{1}),

we have

\begin{matrix} & δ_{Σ} (\log (y), \log (Q) - Ω q) - δ_{Σ_{11}} (\log (y_{1}), \log (Q_{1}) - Ω_{1} q_{1}) \\ & \quad = {(\log (y_{2}) - \log (Q_{2 \cdot 1}) + Ω_{2 \cdot 1} q_{2 \cdot 1})}^{'} Σ_{22 \cdot 1}^{- 1} (\log (y_{2}) - \log (Q_{2 \cdot 1}) + Ω_{2 \cdot 1} q_{2 \cdot 1}) \\ & \quad = δ_{Σ_{22 \cdot 1}} (\log (y_{2}), \log (Q_{2 \cdot 1}) - Ω_{2 \cdot 1} q_{2 \cdot 1}), \end{matrix}

which is the desired result. □

In Theorem 9, we show that the quantile-based multivariate log-normal family is preserved under marginalization and conditioning. In this theorem, we also present a characterization of the independence between subvectors of this family.

Theorem 9.

Let

Y \sim Q L N_{p} (Q, Σ, q)

. Consider the partitions given in (9). Then,

1. $Y_{1} \sim Q L N_{p_{1}} (Q_{1}, Σ_{11}, q_{1})$.
2. $Y_{2} | Y_{1} = y_{1} \sim Q L N_{p_{2}} (Q_{2 \cdot 1}, Σ_{22 \cdot 1}, q_{2 \cdot 1})$.
3. $Y_{1}$ and $Y_{2}$ are independent if and only if $Σ_{12} = 0$.

Proof.

Statements 1 and 2 follow from the factorization given in (10). To prove statement 3, note that

Y_{1}

and

Y_{2}

are independent if and only if

Q L N_{p} (y; Q, Σ, q) = Q L N_{p_{1}} (y_{1}; Q_{1}, Σ_{11}, q_{1}) Q L N_{p_{2}} (y_{2}; Q_{2}, Σ_{22}, q_{2}),

which, from (10), is satisfied if and only if

Σ_{12} = 0

. □
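The factorization (10) and the conditional parameters in Theorem 9 can be checked numerically when p_1 = p_2 = 1. The sketch below uses only the standard library; the function names, the parameter values, and the evaluation point are hypothetical.

```python
from math import exp, log, pi, sqrt


def qln1_pdf(y, Q, s, q):
    """Univariate quantile-based log-normal density with dispersion s."""
    z = log(y) - (log(Q) - sqrt(s) * q)
    return exp(-0.5 * z * z / s) / (sqrt(2 * pi * s) * y)


def qln2_pdf(y, Q, Sigma, q):
    """Density (3) for p = 2, with Sigma = [[s11, s12], [s12, s22]]."""
    s11, s12, s22 = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    z0 = log(y[0]) - (log(Q[0]) - sqrt(s11) * q[0])
    z1 = log(y[1]) - (log(Q[1]) - sqrt(s22) * q[1])
    det = s11 * s22 - s12 ** 2
    delta = (s22 * z0 ** 2 - 2 * s12 * z0 * z1 + s11 * z1 ** 2) / det
    return exp(-0.5 * delta) / (2 * pi * sqrt(det) * y[0] * y[1])


def conditional_params(y1, Q, Sigma, q):
    """Parameters (Q_{2.1}, Sigma_{22.1}, q_{2.1}) of Y_2 | Y_1 = y1 for p1 = p2 = 1."""
    s11, s12, s22 = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    s22_1 = s22 - s12 ** 2 / s11                   # Schur complement
    Q2_1 = Q[1] * exp(s12 / s11 * (log(y1) - log(Q[0]) + sqrt(s11) * q[0]))
    q2_1 = sqrt(s22) * q[1] / sqrt(s22_1)          # Omega_{2.1}^{-1} Omega_2 q_2
    return Q2_1, s22_1, q2_1


Q = [14.0, 95.0]                        # hypothetical quantile vector
Sigma = [[0.04, 0.012], [0.012, 0.01]]  # hypothetical dispersion matrix
q = [-0.6745, 0.6745]                   # approximate 0.25- and 0.75-quantiles of N(0, 1)
y = (16.0, 100.0)                       # hypothetical evaluation point

Q2_1, s22_1, q2_1 = conditional_params(y[0], Q, Sigma, q)
joint = qln2_pdf(y, Q, Sigma, q)
marginal = qln1_pdf(y[0], Q[0], Sigma[0][0], q[0])
conditional = qln1_pdf(y[1], Q2_1, s22_1, q2_1)
```

Note that q_{2·1} generally differs from q_2, so the parameter Q_{2·1} corresponds to a different quantile level than the one associated with Q_2.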

4. Parameter Estimation

The reparameterization used in Definition 1 permits us to compute the maximum likelihood estimates of the parameters of the quantile-based multivariate log-normal distribution through the maximum likelihood estimates of the parameters of the multivariate log-normal distribution. Let

y_{1}, \dots, y_{n}

be the observed values of a random sample

Y_{1}, \dots, Y_{n}

of

Y \sim Q L N_{p} (Q, Σ, q)

. We denote the maximum likelihood estimators of

Q

and

Σ

by

\hat{Q}

and

\hat{Σ}

, respectively. From (2), we have

\hat{Q} = D_{\hat{μ}} \exp (\hat{Ω} q),

(11)

where

\hat{Ω} = {(\hat{Σ} ⊙ I_{p})}^{1 / 2}

, and

\hat{μ}

and

\hat{Σ}

are the maximum likelihood estimators of the multivariate log-normal distribution given by (Morán-Vásquez et al. [14])

\begin{matrix} \hat{μ} & = \exp (\frac{1}{n} \sum_{i = 1}^{n} \log (y_{i})), \\ \hat{Σ} & = \frac{1}{n} \sum_{i = 1}^{n} (\log (y_{i}) - \log (\hat{μ})) {(\log (y_{i}) - \log (\hat{μ}))}^{'} . \end{matrix}

Note that the maximum likelihood estimator of

Σ

in the quantile-based multivariate log-normal distribution is the same as in the multivariate log-normal distribution. Furthermore, this estimator is the same for any choice of

q

.
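The estimators above can be sketched in a few lines. The authors report that their computations were done in R; the following Python version (standard library only) is an illustrative re-implementation under the formulas stated above, with a hypothetical function name and a toy data set chosen so that the estimates can be verified by hand.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fit_qln(sample, alpha):
    """Maximum likelihood estimates (Q_hat, Sigma_hat) via Equation (11)."""
    n, p = len(sample), len(sample[0])
    logs = [[log(v) for v in y] for y in sample]
    log_mu = [sum(row[k] for row in logs) / n for k in range(p)]   # log(mu_hat)
    Sigma_hat = [[sum((r[j] - log_mu[j]) * (r[k] - log_mu[k]) for r in logs) / n
                  for k in range(p)] for j in range(p)]
    q = [NormalDist().inv_cdf(a) for a in alpha]
    # Q_hat = D_{mu_hat} exp(Omega_hat q)
    Q_hat = [exp(log_mu[k] + sqrt(Sigma_hat[k][k]) * q[k]) for k in range(p)]
    return Q_hat, Sigma_hat


# Toy data: the logs are (0, 0), (1, 2), (2, 1), so both log-means equal 1
sample = [(1.0, 1.0), (exp(1.0), exp(2.0)), (exp(2.0), exp(1.0))]

Q_med, Sigma_hat = fit_qln(sample, [0.5, 0.5])     # medians
Q_mix, Sigma_mix = fit_qln(sample, [0.25, 0.5])    # first quartile and median
```

Changing the quantile levels changes the estimate of Q but leaves the estimate of Σ untouched, as noted above.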

We assess the goodness of fit of the quantile-based multivariate log-normal distributions by using quantile–quantile plots, comparing the empirical Mahalanobis distances

δ_{\hat{Σ}} (\log (y_{i}), \log (\hat{Q}) - \hat{Ω} q)

,

i = 1, \dots, n

, with the theoretical quantiles

δ_{α_{i}}^{2}

, where

α_{i} = i / (n + 1)

,

i = 1, \dots, n

, obtained from a chi-squared distribution with p degrees of freedom. Additionally, we plot simulated envelopes (Atkinson [19]) for the quantile–quantile plots in order to help the comparison between quantiles and judge the adequacy of the models.
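The points of such a quantile–quantile plot can be computed as sketched below for p = 2, where the chi-squared quantile function has the closed form −2 log(1 − α). This is an illustration only: the function names are hypothetical, the data are simulated with a fixed seed, the location vector m stands for log(Q̂) − Ω̂q (true values are used in place of estimates for brevity), and simulated envelopes are not included.

```python
import random
from math import exp, log, sqrt


def mahalanobis2(y, m, Sigma):
    """Squared Mahalanobis-type distance between log(y) and m for p = 2."""
    s11, s12, s22 = Sigma[0][0], Sigma[0][1], Sigma[1][1]
    z0, z1 = log(y[0]) - m[0], log(y[1]) - m[1]
    det = s11 * s22 - s12 ** 2
    return (s22 * z0 ** 2 - 2 * s12 * z0 * z1 + s11 * z1 ** 2) / det


def qq_points(sample, m, Sigma):
    """Pairs (theoretical chi-squared quantile with 2 df, ordered empirical distance)."""
    n = len(sample)
    d = sorted(mahalanobis2(y, m, Sigma) for y in sample)
    # alpha_i = i / (n + 1); for 2 degrees of freedom the quantile is -2 log(1 - alpha_i)
    theo = [-2.0 * log(1.0 - i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(theo, d))


# Hypothetical location and dispersion, and data simulated from the model
rng = random.Random(1)
m = [2.7, 4.6]
Sigma = [[0.04, 0.012], [0.012, 0.01]]
a = sqrt(Sigma[0][0])
b = Sigma[0][1] / a
c = sqrt(Sigma[1][1] - b * b)
sample = []
for _ in range(2000):
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    sample.append((exp(m[0] + a * z0), exp(m[1] + b * z0 + c * z1)))

points = qq_points(sample, m, Sigma)
```

Under a well-fitting model, the points should lie close to the identity line.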

To evaluate the estimation procedure, we conducted simulations with the quantile-based bivariate log-normal distribution. We considered the sample sizes

n = 50, 100, 500, 1000

and 10,000 Monte Carlo replicates. The random samples of

Y \sim Q L N_{p} (Q, Σ, q)

were generated through the following steps:

Generate a random sample $x_{1}, \dots, x_{n}$ of $X \sim N_{p}(\log(Q) - Ω q, Σ)$.
Compute $y_{1} = \exp(x_{1}), \dots, y_{n} = \exp(x_{n})$. Then, $y_{1}, \dots, y_{n}$ is a random sample of $Y \sim QLN_{p}(Q, Σ, q)$.
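The two steps above can be sketched for the bivariate case as follows. This is an illustrative Python version only (the authors used R); the function name `sample_qln2` and the use of a hand-written 2×2 Cholesky factor to draw the normal vector are choices of this sketch.

```python
import math
import random


def sample_qln2(n, Q, sigma, q, rng=random):
    """Draw n observations from the bivariate case via the two steps above."""
    (s11, s12), (_, s22) = sigma
    # mean of X = log(Y): log(Q) - Omega q, with Omega = diag(sqrt(s11), sqrt(s22))
    m1 = math.log(Q[0]) - math.sqrt(s11) * q[0]
    m2 = math.log(Q[1]) - math.sqrt(s22) * q[1]
    # Cholesky factor of Sigma, written out for the 2x2 case
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    sample = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # step 1: x ~ N_2(log(Q) - Omega q, Sigma)
        x1 = m1 + l11 * z1
        x2 = m2 + l21 * z1 + l22 * z2
        # step 2: y = exp(x), componentwise
        sample.append((math.exp(x1), math.exp(x2)))
    return sample
```

With a large `n`, the proportion of first components not exceeding `Q[0]` should be close to $α_{1}$, and similarly for the second component and $α_{2}$, which gives a quick sanity check of the parameterization.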

The true parameters were obtained by fitting the quantile-based bivariate log-normal distribution to the children data set considered in Section 5. Table 1 reports the median and the interquartile range of the estimated values of the parameters of the investigated models. The medians approach the true parameters and the interquartile ranges shrink as the sample size grows, indicating a satisfactory performance of the estimators. All the computations were conducted in the R software [20].

5. Application

Anthropometric measures are useful for monitoring growth and identifying childhood developmental problems. The World Health Organization [21,22] provides quantile estimates of several children’s anthropometric characteristics, such as height, weight, and head and arm circumferences, among others. These estimates are obtained by fitting univariate models to each anthropometric measurement separately, ignoring the association between them. We use the quantile-based bivariate log-normal distribution to estimate quantiles of children’s weights (in kilograms) and heights (in centimeters), taking into account the natural correlation between them. We consider a sample of 587 children between 2 and 5 years of age collected in 2018 in the El Poblado neighborhood, located in Medellín, Colombia [23].

The bagplot in Figure 2 shows that children’s weights and heights are positively associated, with slight joint skewness, and highlights two outliers. In order to estimate the third quartile of height and the first quartile of weight, we fitted the quantile-based bivariate log-normal distribution $Y \sim QLN_{2}(Q, Σ, q)$, with height as the first component and weight as the second, $α_{1} = 0.75$ ($q_{1} = 0.67$), and $α_{2} = 0.25$ ($q_{2} = -0.67$). The maximum likelihood estimates of the parameters are $\hat{Q}_{1} = 101.62$, $\hat{Q}_{2} = 13.101$, $\hat{σ}_{11} = 0.0062$, $\hat{σ}_{22} = 0.0314$, and $\hat{σ}_{12} = 0.0121$. Therefore, the third quartile of the children’s height is estimated to be $101.62$ cm, and the first quartile of the children’s weight is estimated to be $13.101$ kg. Since $\hat{σ}_{12} = 0.0121 > 0$, the children’s weights and heights are estimated to be positively correlated, which is consistent with the descriptive analysis presented in Figure 2.
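As a rough arithmetic check of these estimates, the reported quartiles can be recovered from the fitted medians in Table 1 (the $α = 0.50$ rows) through the relation $Q_{j} = \text{median}_{j} \exp(\sqrt{σ_{jj}}\, q_{j})$, which follows from (11). The Python sketch below assumes $q_{j}$ is the standard normal quantile of order $α_{j}$; the variable names are illustrative, and the correlation of the log-variables is a derived quantity not reported in the article.

```python
import math
from statistics import NormalDist

q1 = NormalDist().inv_cdf(0.75)   # about 0.67
q2 = NormalDist().inv_cdf(0.25)   # about -0.67

# Fitted medians (alpha = 0.50 rows of Table 1) and fitted entries of Sigma
median_height, median_weight = 96.368, 14.764
s11, s22, s12 = 0.0062, 0.0314, 0.0121

# Q_j = median_j * exp(sqrt(sigma_jj) * q_j)
height_q75 = median_height * math.exp(math.sqrt(s11) * q1)
weight_q25 = median_weight * math.exp(math.sqrt(s22) * q2)

# Correlation between log-height and log-weight implied by Sigma_hat
corr_logs = s12 / math.sqrt(s11 * s22)

print(height_q75, weight_q25, corr_logs)
```

The two quantiles agree with the reported 101.62 cm and 13.101 kg up to the rounding of the tabulated inputs.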

Figure 3 shows the quantile–quantile plot with simulated envelopes for the Mahalanobis distances for the fitted quantile-based bivariate log-normal distribution. This plot suggests a suitable fit.

6. Final Remarks

In this article, we have proposed a multivariate distribution with positive support, obtained by reparameterizing the multivariate log-normal distribution in terms of its marginal quantiles. This distribution should be of interest to researchers working on quantile modeling of correlated, skewed multivariate data with positive support. We derived a number of statistical properties of this distribution, involving transformations, mixed moments, expected value, covariance matrix, mode, Shannon entropy, Kullback–Leibler divergence, marginalization, conditioning, and independence. The quantile-based multivariate log-normal distribution defined in this article is rich in theoretical properties and is mathematically tractable. Parameter estimation was carried out by maximum likelihood, and the satisfactory behavior of the estimation procedure was verified through simulation studies. A graphical diagnostic tool was also employed to assess the quality of the fitted distributions. Finally, an application to real data was presented and discussed as an alternative for estimating quantiles of children’s weights and heights while accounting for the natural association between these variables.

There are several aspects that will be addressed in future articles. Bayesian approaches for the estimation of the parameters of the quantile-based multivariate log-normal distribution will be developed. The study of regression models based on the quantile-based multivariate log-normal distributions together with inferential developments and applications to real data will also be undertaken. These models will allow us to analyze the relationship between marginal quantiles of response vectors and a set of explanatory variables, taking into account the potential association among the marginal response variables. Additionally, a comparative analysis of this methodology with the model proposed by Petrella and Raponi [12] will be included in a forthcoming article.

Author Contributions

Conceptualization, R.A.M.-V., A.R.-C. and D.K.N.; methodology, R.A.M.-V., A.R.-C. and D.K.N.; investigation, R.A.M.-V., A.R.-C. and D.K.N.; writing-original draft preparation, R.A.M.-V., A.R.-C. and D.K.N.; writing-review and editing, R.A.M.-V., A.R.-C. and D.K.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

References

1. Cade, B.S.; Noon, B.R. A gentle introduction to quantile regression for ecologists. Front. Ecol. Environ. 2003, 1, 412–420.
2. Yu, K.; Lu, Z.; Stander, J. Quantile regression: Applications and current research areas. J. R. Stat. Soc. Ser. D 2003, 52, 331–350.
3. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econometrica 1978, 46, 33–50.
4. Ferrari, S.L.P.; Fumes, G. Box–Cox symmetric distributions and applications to nutritional data. Adv. Stat. Anal. 2017, 101, 321–344.
5. Gijbels, I.; Karim, R.; Verhasselt, A. Semiparametric quantile regression using family of quantile-based asymmetric densities. Comput. Stat. Data Anal. 2021, 157, 107–129.
6. Mazucheli, J.; Alves, B.; Menezes, A.; Leiva, V. An overview on parametric quantile regression models and their computational implementation with applications to biomedical problems including COVID-19 data. Comput. Methods Programs Biomed. 2022, 221, 106816.
7. Smithson, M.; Shou, Y. CDF-quantile distributions for modelling random variables on the unit interval. Br. J. Math. Stat. Psychol. 2017, 70, 412–438.
8. Breckling, J.; Chambers, R. M-Quantiles. Biometrika 1988, 75, 761–771.
9. Kong, L.; Mizera, I. Quantile tomography: Using quantiles with multivariate data. Stat. Sin. 2012, 22, 1589–1610.
10. McKeague, I.W.; López-Pintado, S.; Hallin, M.; Šiman, M. Analyzing growth trajectories. J. Dev. Orig. Health Dis. 2011, 2, 322–329.
11. Wei, Y. An approach to multivariate covariate-dependent quantile contours with application to bivariate conditional growth charts. J. Am. Stat. Assoc. 2008, 103, 397–409.
12. Petrella, L.; Raponi, V. Joint estimation of conditional quantiles in multivariate linear regression models with an application to financial distress. J. Multivar. Anal. 2019, 173, 70–84.
13. Morán-Vásquez, R.A.; Ferrari, S.L.P. Box-Cox elliptical distributions with application. Metrika 2018, 82, 547–571.
14. Morán-Vásquez, R.A.; Mazo-Lopera, M.A.; Ferrari, S.L.P. Quantile modeling through multivariate log-normal/independent linear regression models with application to newborn data. Biom. J. 2021, 63, 1290–1308.
15. Saulo, H.; Dasilva, A.; Leiva, V.; Sánchez, L.; de la Fuente-Mella, H. Log-symmetric quantile regression models. Stat. Neerl. 2022, 76, 124–163.
16. Fang, K.T.; Kotz, S.; Ng, K.W. Symmetric Multivariate and Related Distributions; Chapman and Hall: London, UK, 1990.
17. Seber, G.A.F. A Matrix Handbook for Statisticians; John Wiley & Sons: Hoboken, NJ, USA, 2008.
18. Pardo, L. Statistical Inference Based on Divergence Measures; Chapman & Hall/CRC: Boca Raton, FL, USA, 2006.
19. Atkinson, A.C. Two graphical displays for outlying and influential observations in regression. Biometrika 1981, 68, 13–20.
20. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 21 July 2023).
21. World Health Organization. WHO Child Growth Standards: Length/Height-for-Age, Weight-for-Age, Weight-for-Length, Weight-for-Height and Body Mass Index-for-Age: Methods and Development; World Health Organization: Geneva, Switzerland, 2006. Available online: https://apps.who.int/iris/handle/10665/43413 (accessed on 25 May 2023).
22. World Health Organization. WHO Child Growth Standards: Head Circumference-for-Age, Arm Circumference-for-Age, Triceps Skinfold-for-Age and Subscapular Skinfold-for-Age: Methods and Development; World Health Organization: Geneva, Switzerland, 2007. Available online: https://apps.who.int/iris/handle/10665/43706 (accessed on 25 May 2023).
23. MEData: Portal de datos de Medellín. Estado Nutricional de Menores de 6 años Programa de Crecimiento y Desarrollo. 2022. Available online: http://medata.gov.co/dataset/estado-nutricional-de-menores-de-6-anos-programa-de-crecimiento-y-desarrollo (accessed on 21 July 2023).

Figure 1. Contour plots at levels 0.15, 0.1, 0.05, 0.02, 0.01 of the joint PDF of $Y \sim QLN_{2}(Q, Σ, q)$ given in Definition 1, where (a) $α_{1} = α_{2} = 0.5$, $Q_{1} = 0.8$, $Q_{2} = 1$, $σ_{11} = 0.8$, $σ_{22} = 1$, $σ_{12} = -0.5$; (b) $α_{1} = 0.25$; (c) $Q_{2} = 1.2$; (d) $σ_{11} = 1$; (e) $σ_{12} = 0$; (f) $σ_{12} = 0.7$.

Figure 2. Bagplot of weight vs. height; children’s data.

Figure 3. Quantile–quantile plot with simulated envelopes for the Mahalanobis distances for the fitted distribution.

Table 1. Median (M) and interquartile range (IQR) of the parameter estimates of the quantile-based bivariate log-normal distributions.

			$n = 50$		$n = 100$		$n = 500$		$n = 1000$
Probability	Parameter	True value	M	IQR	M	IQR	M	IQR	M	IQR
$α_{1} = 0.75$	$Q_{1}$	101.62	101.59	1.6480	101.59	1.1849	101.60	0.5391	101.61	0.3747
$α_{2} = 0.25$	$Q_{2}$	13.101	13.115	0.4923	13.108	0.3378	13.102	0.1557	13.103	0.1115
$α_{1} = 0.50$	$Q_{1}$	96.368	96.373	1.4397	96.365	1.0000	96.365	0.4582	96.367	0.3229
$α_{2} = 0.50$	$Q_{2}$	14.764	14.768	0.4965	14.766	0.3417	14.763	0.1625	14.764	0.1123
$α_{1} = 0.25$	$Q_{1}$	91.392	91.429	1.5166	91.406	1.0509	91.397	0.4784	91.396	0.3437
$α_{2} = 0.75$	$Q_{2}$	16.640	16.632	0.6190	16.634	0.4332	16.635	0.2016	16.636	0.1412
	$σ_{11}$	0.0062	0.0062	0.0016	0.0062	0.0012	0.0062	0.0005	0.0062	0.0004
	$σ_{22}$	0.0314	0.0313	0.0085	0.0313	0.0059	0.0313	0.0027	0.0313	0.0019
	$σ_{12}$	0.0121	0.0121	0.0035	0.0121	0.0025	0.0121	0.0011	0.0121	0.0008

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
