Article

Sharp Estimates for Proximity of Geometric and Related Sums Distributions to Limit Laws

by Alexander Bulinski 1,* and Nikolay Slepov 2
1 Faculty of Mathematics and Mechanics, Lomonosov Moscow State University, Leninskie Gory 1, 119991 Moscow, Russia
2 Department of Higher Mathematics, Moscow Institute of Physics and Technology, National Research University, 9 Institutskiy per., Dolgoprudny, 141701 Moscow, Russia
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(24), 4747; https://doi.org/10.3390/math10244747
Submission received: 29 October 2022 / Revised: 6 December 2022 / Accepted: 7 December 2022 / Published: 14 December 2022
(This article belongs to the Special Issue Limit Theorems of Probability Theory)

Abstract

The convergence rate in the famous Rényi theorem is studied by means of a refinement of the Stein method. Namely, it is demonstrated that the new estimate of the convergence rate of normalized geometric sums to the exponential law, involving the ideal probability metric of the second order, is sharp. Some recent results concerning the convergence rates in the Kolmogorov and Kantorovich metrics are extended as well. In contrast to many previous works, there are no assumptions that the summands of geometric sums are positive and have the same distribution. For the first time, an analogue of the Rényi theorem is established for the model of exchangeable random variables. Within this model, a sharp estimate of the convergence rate to a specified mixture of distributions is also provided. The convergence rate of the appropriately normalized random sums of random summands to the generalized gamma distribution is estimated; here, the number of summands follows the generalized negative binomial law. Sharp estimates of the proximity of the distributions of random sums of random summands to the limit law are established both for independent summands and for the model of exchangeable ones. The inverse of the equilibrium transformation of probability measures is introduced, and in this way a new approximation of the Pareto distributions by exponential laws is proposed. The integral probability metrics and the techniques of integration with respect to signed measures are essentially employed.

1. Introduction

The theory of sums of random variables belongs to the core of modern probability theory. The fundamental contribution to the formation of the classical core was made by A. de Moivre, J. Bernoulli, P.-S. Laplace, D. Poisson, P.L. Chebyshev, A.A. Markov, A.M. Lyapunov, E. Borel, S.N. Bernstein, P. Lévy, J. Lindeberg, H. Cramér, A.N. Kolmogorov, A.Ya. Khinchin, B.V. Gnedenko, J.L. Doob, W. Feller, Yu.V. Prokhorov, A.A. Borovkov, Yu.V. Linnik, I.A. Ibragimov, A. Rényi, P. Erdös, M. Csörgö, P. Révész, C. Stein, P. Hall, V.V. Petrov, V.M. Zolotarev, J. Jacod and A.N. Shiryaev, among others. The first steps led to limit theorems for appropriately normalized partial sums of sequences of independent random variables. Besides the laws of large numbers, special attention was paid to the emergence of Gaussian and Poisson limit laws. Note that despite many efforts to find necessary and sufficient conditions for the validity of the central limit theorem (the term was proposed by G. Pólya for a class of limit theorems describing weak convergence of distributions of normalized sums of random variables to the Gaussian law), this problem was completely resolved for independent summands only in the second half of the 20th century in the works of V.M. Zolotarev and V.I. Rotar. Also in the last century, the beautiful theory of infinitely divisible and stable laws was constructed. New developments of infinite divisibility along with classical theory can be found in [1]. For an exposition of the theory of stable distributions and their applications, we refer to [2], see also references therein.
Parallel to partial sums of a sequence of random variables (and vectors), other significant schemes have appeared, for instance, the arrays of random variables. Moreover, in physics, biology and other domains, researchers found that it was essential to study the sums of random variables when the number of summands was random. Thus, the random sums with random summands became an important object of investigation. One can mention the branching processes, which stem from the 19th-century population models of I.J. Bienaymé, F. Galton and H.W. Watson and are still being intensively developed, see, e.g., [3]. In the theory of risk, it is worth recalling the celebrated Cramér–Lundberg model for the dynamics of the capital of an insurance company, see, e.g., Ch. 6 in [4]. Various examples of models described by random sums are considered in Ch. 1 of [5], including (see Example 1.2.1) the relationship between certain random sums analysis and the famous Pollaczek–Khinchin formula in queuing theory. A vast literature deals with the so-called geometric sums. There, one studies the sum of independent identically distributed random variables, where the summation index follows the geometric distribution and is independent of the summands. Such random sums can model many real world phenomena, e.g., in queuing, insurance and reliability, see the Section “Origin of Geometric Sums” in the Introduction of [6]. Furthermore, a multitude of important stochastic models described by systems of dependent random variables have arisen to meet diverse applications, see, e.g., [7]. In particular, the general theory of stochastic processes and random fields arose in the last century (for an introduction to random fields, see, e.g., [8]).
An intriguing problem of estimating the convergence rate to a limit law was addressed by A.C. Berry and C.-G. Esseen. Their papers initiated the study of the proximity of the distribution functions of normalized partial sums of independent random variables to the distribution function of the standard Gaussian law within the framework of the classical theory of summation.
To assess the proximity of distributions, we will employ various integral probability metrics. Usually, for random variables $Y$, $Z$ and a specified class $\mathcal{H}$ of functions $h:\mathbb{R}\to\mathbb{R}$, one sets
$$d_{\mathcal{H}}(Y,Z) := \sup_{h\in\mathcal{H}}\big|\mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]\big| \in [0,\infty].$$
Clearly, $d_{\mathcal{H}}(Y,Z)$ is a functional depending on $\mathrm{law}(Y)$ and $\mathrm{law}(Z)$, i.e., the distributions of $Y$ and $Z$. A class $\mathcal{H}$ should be rich enough to guarantee that $d_{\mathcal{H}}$ possesses the properties of a metric (or semi-metric). The general theory of probability metrics is presented, e.g., in [9,10]. In terms of such metrics, one often compares the distribution of a random variable $Y$ under consideration with that of a target random variable $Z$. In Section 2, we recall the definitions of the Kolmogorov and Kantorovich (alternatively called Wasserstein) distances and the Zolotarev ideal metrics corresponding to the adequate choice of $\mathcal{H}$, denoted below as $\mathcal{K}$, $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively.
It should be emphasized that, for sums of random variables, deep results were established along with the creation and development of different methods of analysis. One can mention the method of characteristic functions due to the works of J. Fourier, P.-S. Laplace and A.M. Lyapunov, the method of moments proposed by P.L. Chebyshev and developed by A.A. Markov, the Lindeberg method of employing auxiliary Gaussian random variables and the Bernstein techniques of large and small boxes. In 1972, C. Stein in [11] (see also [12]) introduced a new method to estimate the proximity of the distribution under consideration to a normal law. Furthermore, this powerful method was developed in the framework of classical limit theorems of probability theory. We describe this method in Section 2. Applying the Stein method along with other tools, one can establish in certain cases the sharp estimates of closeness between a target distribution and other ones in specified metrics (see, e.g., [13,14]). We recommend the books [15,16] and the paper [17] for the basic ideas of the ingenious Stein method. The development of these techniques under mild moment restrictions on the summands is treated in [18,19]. We mention in passing that there are deep generalizations of the Stein techniques involving generators of certain Markov processes; a compact exposition is provided, e.g., on p. 2 of [20].
In the theory of random sums of random summands, the limit theorems with an exponential law as a target distribution play a role similar to that of the central limit theorem for (nonrandom) sums of random variables. Here, one has to underline the principal role of the Rényi classical theorem for geometric sums published in [21]. Recall this famous result. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed (i.i.d.) random variables such that $\mu := \mathbb{E}[X_1]\neq 0$. Take a geometric random variable $N_p$ with parameter $p\in(0,1)$, defined as follows:
$$\mathbb{P}(N_p=k) = p(1-p)^k, \quad k\in\mathbb{N}\cup\{0\}.$$
Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Set $S_0 := 0$, $S_n := X_1+\dots+X_n$, $n\in\mathbb{N}$. Then,
$$W_p := \frac{S_{N_p}}{\mathbb{E}[S_{N_p}]} \xrightarrow{\mathcal{D}} Z\sim\mathrm{Exp}(1) \quad \text{as}\ p\to 0+,$$
where $\xrightarrow{\mathcal{D}}$ stands for convergence in distribution, $Z$ follows the exponential law $\mathrm{Exp}(\lambda)$ with parameter $\lambda=1$, and $\mathbb{E}[S_{N_p}] = \mu(1-p)/p$. In fact, instead of $N_p$, A. Rényi considered the shifted geometric random variable $N(p)$ such that $\mathbb{P}(N(p)=k) = p(1-p)^{k-1}$, $k\in\mathbb{N}$. Clearly, $N_p$ has the same law as $N(p)-1$. He supposed that the i.i.d. random variables $X_1,X_2,\ldots$ are non-negative and that $N(p)$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Then, $S_{N(p)}/\mathbb{E}[S_{N(p)}]$ converges in distribution to $Z\sim\mathrm{Exp}(1)$ as $p\to 0+$, where $\mathbb{E}[S_{N(p)}] = \mu/p$. It was explained in [22] that both statements are equivalent and the assumption of non-negativity of the summands can be omitted.
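As a quick illustration (not part of the original argument), the following minimal Monte Carlo sketch checks the Rényi theorem numerically; the choice of Uniform(0,1) summands, the sample sizes and all parameter values are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_geometric_sum(p, n_samples, mu=0.5):
    """Sample W_p = S_{N_p} / E[S_{N_p}] for N_p ~ Geom(p) on {0, 1, ...}
    and i.i.d. Uniform(0,1) summands (so mu = E[X_1] = 1/2)."""
    # numpy's geometric law counts trials up to the first success (support
    # {1, 2, ...}); subtracting 1 gives P(N_p = k) = p(1-p)^k, k = 0, 1, ...
    n = rng.geometric(p, size=n_samples) - 1
    s = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in n])
    return s / (mu * (1.0 - p) / p)          # E[S_{N_p}] = mu (1-p)/p

for p in (0.1, 0.01, 0.001):
    w = np.sort(normalized_geometric_sum(p, 20_000))
    x = np.linspace(0.0, 10.0, 2001)
    ecdf = np.searchsorted(w, x, side="right") / w.size
    # the Kolmogorov distance to Exp(1) shrinks as p -> 0+
    print(p, np.abs(ecdf - (1.0 - np.exp(-x))).max())
```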
Building on the previous investigations discussed below in this section, we study different instances of quantifying the approximation of random sums by limit laws and also extend the employment of the Stein method. The main goals of our paper are the following: (1) to find sharp estimates (i.e., optimal ones, which cannot be diminished) of the proximity of geometric sums of independent (in general, non-identically distributed) random variables to the exponential law using the probability metric $d_{\mathcal{H}_2}$; (2) to prove a new version of the Rényi theorem when the summands are described by a model of exchangeable random variables, establishing the corresponding non-exponential limit law together with an optimal bound of the convergence rate applying $d_{\mathcal{H}_2}$; (3) to obtain the exact convergence rate of appropriately normalized random sums of random summands to the generalized gamma distribution, when the number of summands follows the generalized negative binomial distribution, employing $d_{\mathcal{H}_2}$; (4) to introduce the inverse transformation to the “equilibrium distribution transformation”, give a full description of its existence and demonstrate the advantage of applying the Stein method combined with that inverse transform; and (5) to use such an approach in deriving a new approximation in the Kolmogorov metric $d_{\mathcal{K}}$ of the Pareto distribution by an exponential one, which is important in signal processing.
The main idea is to apply the Stein method and deduce (Lemma 2) new estimates of the solution of Stein's equation (corresponding to an exponential law $\mathrm{Exp}(\lambda)$ as a target distribution) when a function $h$ appearing on its right-hand side belongs to the class $\mathcal{H}_2$. This entails the established sharp estimates. The integral probability metrics and the techniques of integration with respect to signed measures are essentially employed. It should be stressed that we consider random summands which, in general, take both positive and negative values and, in certain cases, need not have the same law.
Now, we briefly comment on the relevance of the five groups of results of the paper mentioned above. Some upper bounds for the convergence rates in Equation (3) were obtained previously by different tools (the renewal techniques and the memoryless property of the geometric distribution), and the estimates were not sharp. We refer to the results by A.D. Soloviev, V.V. Kalashnikov and S.Y. Vsekhsvyatskii, M. Brown, V.M. Kruglov and V.Yu. Korolev, where the authors either used the Kolmogorov distance or proved specified nonuniform estimates for differences of the corresponding distribution functions. For instance, in [23] the following estimate was proved:
$$\sup_{x\in\mathbb{R}}\big|\mathbb{P}(W_p\le x)-\mathbb{P}(Z\le x)\big| \le \frac{p\,\mathbb{E}[X_1^2]}{\mu^2}\,\max\Big\{1,\ \frac{1}{2(1-p)}\Big\},$$
where $Z\sim\mathrm{Exp}(1)$. Moreover, this estimate is asymptotically exact when $p\to 0+$. Some improvements are given in [24] under certain (hazard rate) assumptions. E.V. Sugakova obtained a version of the Rényi theorem for independent, in general, not identically distributed random variables. We also mention contributions by V.V. Kalashnikov, E.F. Peköz, A. Röllin, N. Ross and T.L. Hung, which gave the estimates in terms of the Zolotarev ideal metrics. We do not reproduce all these results here since they can be viewed on pages 3 and 4 of [22] together with references to where they were published.
In Corollary 3.6 of [25], for nondegenerate i.i.d. positive random variables $X_1,X_2,\ldots$ with mean $\mu$ and finite second moment, it was proved that
$$\zeta_2\big(pS(p),\,Z(1/\mu)\big) \le p\,\big(\mathbb{E}[X_1^2]+2\mu^2\big),$$
where $S(p) := \sum_{j=1}^{N(p)}X_j$, $\zeta_2$ is the Zolotarev ideal metric of order two, and $Z(\lambda)\sim\mathrm{Exp}(\lambda)$, $\lambda>0$. In [22], the estimates for the proximity of geometric sums distributions to $Z\sim\mathrm{Exp}(1)$ were provided in the Kantorovich and $\zeta_2$ metrics. A substantial contribution of the authors of [22] is the study of random summands $X_1,X_2,\ldots$ that need not be positive (see also [26]). The general estimate for the deviation of $W_p$ from $Z\sim\mathrm{Exp}(1)$ in the ideal metric of order $s$ was proved in [27]. We do not assume that $W_p$ is constructed by means of i.i.d. random variables and, moreover, demonstrate that our estimate (for summands taking real values) involving the metric $d_{\mathcal{H}_2}$ is sharp.
The exchangeable random variables form an important class having various applications in statistics and combinatorics, see, e.g., [28]. As far as we know, the model of exchangeable random variables is studied in the context of random sums for the first time here. It is interesting that, instead of the exponential limit law, we indicate an explicit expression for the new limit law. In addition, we establish the sharp estimate of the proximity of random sums distributions to this law using $d_{\mathcal{H}_2}$.
A natural generalization of the Rényi theorem is to study a summation index following a non-geometric distribution. Along this way, the upper bound of the convergence rate of random sums of random summands to the generalized gamma distribution was proved in [29]. Theorem 3.1 in [30] contains the estimates in the Kolmogorov and Kantorovich distances for approximations of a non-negative random variable law by a specified (nongeneralized) gamma distribution. The proof relies on Stein's identity for the gamma distribution established in H.M. Luk's PhD thesis (see the reference in [30]). New estimates of the solutions of the gamma Stein equation are given in [31]. We derive the sharp estimate for the approximation of random sums by the generalized gamma law using the Zolotarev metric of order two. In the quite recent paper [32], the author established deep results concerning further generalizations of the Rényi theorem. Namely, Theorem 1 of [32] demonstrates how one can provide upper bounds for the convergence rate of specified random sums to a more general law than an exponential one using the estimates in the Rényi theorem. This approach is appealing since the author employs the ideal metric of order $s>0$. However, the sharpness of these estimates was not examined.
Note that in [33] the important “equilibrium transformation of distributions” was proposed and employed along with the Stein techniques. We will consider this transformation $X^e$ for a random variable $X$ in Section 7 and also tackle other useful transformations. In the present paper, the inverse of the “equilibrium distribution transformation” is introduced. We completely describe the possibility of constructing such a transformation and provide an explicit formula for the corresponding density. The idea to apply such an inverse transformation, whenever it exists, is based on the result of [33] demonstrating that one can obtain a more precise estimate of the proximity in the Kantorovich metric between $X^e$ and $Z$ than between $X$ and $Z$, where $Z\sim\mathrm{Exp}(1)$ and $\mathbb{E}[X]=1$, $\mathbb{E}[X^2]<\infty$. We extend this result. Moreover, we prove that in this way one can obtain a new estimate of the approximation of the Pareto distribution by an exponential one. It is shown that our new estimate is advantageous for a wide range of parameters of the Pareto distribution. Let $X^e\sim\mathrm{Pareto}(\alpha,\beta)$, i.e., the distribution function of $X^e$ is
$$F^e(x) = 1-\Big(\frac{\beta}{x+\beta}\Big)^{\alpha}, \quad x\ge0,\ \alpha>0,\ \beta>0.$$
We show that the preimage $X\sim\mathrm{Pareto}(\alpha+1,\beta)$. Thus, for any $\alpha>2$, $\beta>0$, one has $d_{\mathcal{K}}(X^e,Z)\le 1/(\alpha-1)$, where $Z\sim\mathrm{Exp}(\alpha/\beta)$ and $d_{\mathcal{K}}$ stands for the Kolmogorov distance. This bound is more precise than the previous ones applied in signal processing, see, e.g., [34].
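Anticipating Example 2, this bound is easy to probe numerically. The sketch below is ours (the grid resolution and parameter values are arbitrary assumptions); it compares the Pareto distribution function above with that of $\mathrm{Exp}(\alpha/\beta)$ on a grid.

```python
import numpy as np

def d_K_pareto_exp(alpha, beta, xmax=200.0, n=200_001):
    """Grid approximation of the Kolmogorov distance between
    Pareto(alpha, beta), F(x) = 1 - (beta/(x+beta))**alpha, and Exp(alpha/beta)."""
    x = np.linspace(0.0, xmax, n)
    f_pareto = 1.0 - (beta / (x + beta)) ** alpha
    f_exp = 1.0 - np.exp(-(alpha / beta) * x)
    return np.abs(f_pareto - f_exp).max()

for alpha in (3.0, 5.0, 10.0):
    # observed distance vs. the bound 1/(alpha - 1)
    print(alpha, d_K_pareto_exp(alpha, beta=1.0), 1.0 / (alpha - 1.0))
```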
This paper is organized as follows. After the Introduction, the auxiliary results are provided in Section 2. Here we include the material important for understanding the main results. We recall the concept of probability metrics, consider the Kolmogorov and the Kantorovich distances and examine the Zolotarev ideal metrics. We describe the basic ideas of Stein's method, especially for the exponential target distribution. In this section, we formulate a simple but useful Lemma 1, concerning the essential supremum of a Lipschitz function, and an important Lemma 2, giving the solution of the Stein equation for different functional classes. We explain the essential role of the generalized equilibrium transformation proposed in [22], which permits the study of summands taking both positive and negative values. We formulate Lemma 3 to be able to solve an integral equation involving the generalized equilibrium transformation when $\mathbb{E}[X]\neq0$ and $\mathbb{E}[X^2]<\infty$. The proofs of the auxiliary lemmas are placed in Appendix A. Section 3 is devoted to the approximation of the normalized geometric sums $W_p$ by an exponential law. Here, the sharp convergence rate is found (see Theorem 1) by means of the probability metric $d_{\mathcal{H}_2}$. The proof is based on the Lebesgue–Stieltjes integration techniques, the formula of integration by parts for functions of bounded variation, Lemma 2, various limit theorems for integrals and the important result of [22] concerning the estimates involving the Kantorovich distance. In Section 4, for the first time, an analog of the Rényi theorem is proved for a model of exchangeable random variables proposed in [35]. We demonstrate (Theorem 2) that, in contrast to Rényi's theorem, the limit distribution for the random sums under consideration is a specified mixture of two explicitly indicated laws. Moreover, the sharp convergence rate to this limit law is obtained (Theorem 3) by means of $d_{\mathcal{H}_2}$. In Section 5, the distance between the generalized gamma law and the suitably normalized sum of independent random variables is estimated when the number of summands has the generalized negative binomial distribution. Theorem 4 demonstrates that this estimate is sharp. For the proof, we employ various truncation techniques, the transformations of parameters of the initial random variables, the monotone convergence theorem and the explicit formula for the moments of order $\delta>0$ of the generalized gamma distribution obtained in [27]. Section 6 provides the pioneering study of the same problem in the framework of exchangeable random variables and also gives the sharp estimate for the $d_{\mathcal{H}_2}$ metric (Theorem 5). In Section 7, we introduce the inverse of the equilibrium transformation of probability measures. Lemma 6 contains a full description of the situations when a unique preimage $X$ of a random variable $X^e$ exists and gives an explicit formula for the distribution of $X$. This approach permits us to obtain new estimates of the closeness of probability measures in the Kolmogorov and Kantorovich metrics (Theorem 6). In particular, due to Theorem 6 and Lemmas 2 and 6, it becomes possible to find a useful estimate of the proximity of the Pareto law to the exponential one (Example 2). Section 8, containing the conclusions and indications for further research work, is followed by Appendix A and the list of references.

2. Auxiliary Results

Let $\mathcal{K} := \{h: h_z(x) = I\{x\le z\},\ x,z\in\mathbb{R}\}$, where $I\{A\} := 1$ if $A$ holds and zero otherwise. The choice $\mathcal{H}=\mathcal{K}$ in Equation (1) corresponds to the Kolmogorov distance. Note that $h$ above is a function in $x$, whereas $z$ is the index parameterizing the class.
A function $h:\mathbb{R}\to\mathbb{R}$ is called Lipschitz if
$$\mathrm{Lip}(h) := \sup_{x,u\in\mathbb{R};\,x\neq u}\frac{|h(x)-h(u)|}{|x-u|} < \infty.$$
Then,
$$|h(x)-h(u)| \le C|x-u|, \quad x,u\in\mathbb{R},$$
and, in light of Equation (4), $\mathrm{Lip}(h)$ is the smallest possible constant $C$ appearing in Equation (5). We write $\mathrm{Lip}(C)$, where $C\in[0,\infty)$, for the collection of Lipschitz functions having $\mathrm{Lip}(h)\le C$. For $s>0$, set $m=m(s) := \lceil s\rceil-1\in\mathbb{N}\cup\{0\}$ (where, for $a\in\mathbb{R}$, $\lceil a\rceil$ stands for the minimal integer which is equal to or greater than $a$). Introduce a class of functions
$$\mathcal{H}_s := \big\{h:\mathbb{R}\to\mathbb{R},\ |h^{(m)}(x)-h^{(m)}(u)| \le |x-u|^{s-m},\ x,u\in\mathbb{R}\big\}, \quad s>0.$$
As usual, $h^{(0)}(x)=h(x)$, $x\in\mathbb{R}$. We write $d_{\mathcal{H}_s}$ for the metric defined according to Equation (1) with $\mathcal{H}=\mathcal{H}_s$. V.M. Zolotarev and many other researchers defined the ideal metric $\zeta_s$ of order $s>0$ involving only bounded functions from $\mathcal{H}_s$. We will use the collections $\mathcal{H}_1$ and $\mathcal{H}_2$ without the assumption that the functions $h$ are bounded on $\mathbb{R}$. This is the reason why we write $d_{\mathcal{H}_s}$ instead of $\zeta_s$. Thus, we employ
$$\mathcal{H}_1 := \mathrm{Lip}(1), \qquad \mathcal{H}_2 := \{h: h'\in\mathrm{Lip}(1)\}.$$
Note that in the definition of $\mathcal{H}_2$ we deal with $h\in C^{(1)}(\mathbb{R})$, where the space $C^{(1)}(\mathbb{R})$ consists of functions $h:\mathbb{R}\to\mathbb{R}$ such that $h'(x)$ exists for all $x\in\mathbb{R}$ and $h'$ is continuous on $\mathbb{R}$ (evidently, a Lipschitz function is continuous). One calls $d_{\mathcal{H}_1}$ the Kantorovich metric (the term Wasserstein metric appears in the literature as well). One also uses the bounded Kantorovich metric when the class $\mathcal{H}_1$ contains all the bounded functions from $\mathrm{Lip}(1)$. The metric $\zeta_s$ was introduced in [36] and called an ideal metric in light of its important properties. The properties of the $\zeta_s$ metrics, where $s>0$, are collected in Sec. 2 of [32]. We mention in passing that various functionals are ubiquitous in assessing the proximity of distributions. In this regard, we refer, e.g., to [37,38].
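For intuition, both distances are easy to approximate from samples. The following sketch is ours (the two test distributions and the sample sizes are arbitrary assumptions); it estimates $d_{\mathcal{K}}$ empirically and uses scipy's implementation of the Kantorovich (Wasserstein-1) distance for $d_{\mathcal{H}_1}$.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
y = rng.exponential(1.0, size=50_000)      # law(Y) = Exp(1)
z = rng.gamma(2.0, 0.5, size=50_000)       # law(Z) = G(2, 2); same mean 1

# Kolmogorov distance: sup over a pooled grid of |F_Y - F_Z|
grid = np.sort(np.concatenate([y, z]))
f_y = np.searchsorted(np.sort(y), grid, side="right") / y.size
f_z = np.searchsorted(np.sort(z), grid, side="right") / z.size
print("d_K  ~", np.abs(f_y - f_z).max())

# Kantorovich metric d_{H_1} (Wasserstein-1) between the two samples
print("d_H1 ~", wasserstein_distance(y, z))
```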
To apply the Stein method, we begin with fixing the target random variable $Z$ (or its distribution) and describe a class $\mathcal{H}$ to estimate $d_{\mathcal{H}}(Y,Z)$ for a random variable $Y$ under consideration. Then, the problem is to indicate an operator $T$ (with a specified domain of definition) so that the Stein equation
$$Tf(x) = h(x)-\mathbb{E}[h(Z)]$$
has a solution $f_h(x)$, $x\in\mathbb{R}$, for each function $h\in\mathcal{H}$. After that, one can substitute $Y$ instead of $x$ in Equation (6) and take the expectation of both sides, assuming that all these expectations are finite. As a result, one comes to the relation
$$\mathbb{E}[Tf_h(Y)] = \mathbb{E}[h(Y)]-\mathbb{E}[h(Z)].$$
It is not a priori clear why the estimation of the left-hand side of Equation (7) is more tractable than the direct estimation of $|\mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]|$ for $h\in\mathcal{H}$. However, in many situations justifying the method, this occurs. The choice of $T$ depends on the distribution of $Z$. Note that in certain cases (e.g., when $Z$ follows the Poisson law), one considers functions $f$ defined on a subset of $\mathbb{R}$. We emphasize that the construction of the operator $T$ is a nontrivial problem, see, e.g., [33,39,40,41].
The basic idea here is the following. For many probability distributions (Gaussian, Laplace, exponential, etc.), one can find an operator $T$ characterizing the law of a target variable $Z$. In other words, for a rather large class of functions $f$, $\mathbb{E}[Tf(Y)]=0$ if and only if $\mathrm{law}(Y)=\mathrm{law}(Z)$ (i.e., the laws of $Y$ and $Z$ coincide). Thus, if $|\mathbb{E}[Tf_h(Y)]|$ is small enough for a suitable class of functions $h$, this leads to the assertion that the law of $Y$ is close (in a sense) to the law of $Z$. One has to verify that this kind of “continuity” takes place. Clearly, if, for any $h\in\mathcal{H}$, where $\mathcal{H}$ defines the integral probability metric in Equation (1), one can find a solution $f_h$ of Equation (6), then the relation $\mathbb{E}[Tf_h(Y)]=0$ for all $f_h$, $h\in\mathcal{H}$, yields $d_{\mathcal{H}}(Y,Z)=0$ and, consequently, $\mathrm{law}(Y)=\mathrm{law}(Z)$.
Further, we assume that $Z\sim\mathrm{Exp}(\lambda)$, i.e., $Z$ has the exponential distribution with parameter $\lambda>0$. In this case (see, e.g., Sec. 5 in [17]), one uses the operator
$$Tf(x) := f'(x)-\lambda f(x)+\lambda f(0), \quad x\in\mathbb{R},\ \lambda>0,$$
and writes the Stein Equation (6) as follows:
$$f'(x)-\lambda f(x)+\lambda f(0) = h(x)-\mathbb{E}[h(Z)], \quad x\in\mathbb{R}.$$
It should be stipulated that $\mathbb{E}[h(Z)]\in\mathbb{R}$ for a test function $h\in\mathcal{H}$, and that there exists a differentiable solution $f$ of Equation (9). Therefore, if one can find such a solution $f$, then
$$\mathbb{E}[f'(Y)]-\lambda\,\mathbb{E}[f(Y)]+\lambda f(0) = \mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]$$
under the hypothesis that all these expectations are finite. If $f:\mathbb{R}\to\mathbb{R}$ is absolutely continuous, then (see, e.g., Theorem 13.18 of [42]) $f'(x)$ exists for almost all $x\in\mathbb{R}$ with respect to the Lebesgue measure. Moreover, one can find a function $g:\mathbb{R}\to\mathbb{R}$, integrable on each interval, to guarantee, for each $x,u\in\mathbb{R}$, that
$$f(x) = f(u)+\int_u^x g(v)\,dv,$$
where $g(v)=f'(v)$ for almost all $v\in\mathbb{R}$. Thus, $(Tf)(x)$ is defined for such $f$ according to Equation (8) for almost all $x\in\mathbb{R}$. In general, for an arbitrary random variable $Y$, one cannot write $\mathbb{E}[(Tf)(Y)]$, since the value of the expectation depends on the choice of a version of $(Tf)(x)$, $x\in\mathbb{R}$. Indeed, let $B\in\mathcal{B}(\mathbb{R})$ be such that $m(B)=0$, where $m$ stands for the Lebesgue measure. Assume that $Y$ takes values in $B$. Then, it is clear that $\mathbb{E}[(Tf)(Y)]$ depends on the choice of the version of the function $(Tf)(x)$ defined on $\mathbb{R}$. However, if the distribution $P_Y$ of a random variable $Y$ has a density with respect to $m$, then $\mathbb{E}[(Tf)(Y)]$ will be the same for any version of $Tf$ (with respect to the Lebesgue measure). In certain cases, the Stein operator is applied to smoothed functions (see, e.g., [33,43]). Otherwise, Equation (6) does not hold at each point of $\mathbb{R}$ (see, e.g., Lemma 2.2 in [16]), and complementary efforts are needed. For our study, it is convenient to employ in Equation (8), in the capacity of $f'(x)$, $x\in\mathbb{R}$, the right derivative. In many cases, for a real-valued function $f$ defined on a fixed set $D\subset\mathbb{R}$, one considers $\sup_{x\in D}|f(x)|$; we use instead the “essential supremum”. Recall that a function $\tilde f$ is a version of $f$ (and vice versa) if the measure (here, the Lebesgue measure) of the set of points $x$ such that $\tilde f(x)\neq f(x)$ is zero. The notation $\|f\|$ means that one takes $\inf_{\tilde f}\sup_{x\in D}|\tilde f(x)|$, where $\tilde f$ ranges over the class of all versions of $f$. Clearly, $\|f\|$ will be the same if we change $f$ on a subset of $D$ having measure zero. Thus, we write $\|f'\|$ instead of $\|g\|$ for $g$ appearing in Equation (11). The following simple observation is useful. Its proof is provided in Appendix A.
Lemma 1.
A function $h$ is Lipschitz on $\mathbb{R}$ with $\mathrm{Lip}(h)=C<\infty$ if and only if $h$ is absolutely continuous and (for its essential supremum) $\|h'\|=C<\infty$.
Remark 1.
Note that $0\le h(x)\le1$, $x\in\mathbb{R}$, for any $h\in\mathcal{K}$. If, for some positive constant $C$, $h\in\mathrm{Lip}(C)$, then Equation (5) yields that $|h(x)|\le C|x|+|h(0)|$. If $h'$ is a Lipschitz function (with $\mathrm{Lip}(h')=C$), then $h''(x)$ exists for almost all $x\in\mathbb{R}$, and an application of Lemma 1 gives
$$|h'(x)-h'(0)| = \Big|\int_0^x h''(u)\,du\Big| \le C|x|, \quad x\in\mathbb{R}.$$
Consequently, $|h'(x)|\le A|x|+B$ for some positive $A$, $B$ (one can take $A=C$, $B=|h'(0)|$) and any $x\in\mathbb{R}$. As $h'(x)$ is continuous on each interval, it follows that $|h(x)|\le ax^2+b|x|+c$ for some positive $a$, $b$, $c$ and all $x\in\mathbb{R}$ (with $a=C/2$, $b=|h'(0)|$, $c=|h(0)|$). Therefore, $|h(x)|\le A_0x^2+B_0$ for some positive $A_0$, $B_0$ and each $x\in\mathbb{R}$.
Lemma 2.
For any $\lambda>0$ and each $h\in\mathcal{K}\cup\mathcal{H}_1\cup\mathcal{H}_2$, the equation
$$f'(x)-\lambda f(x) = h(x), \quad x\in\mathbb{R},$$
has a solution
$$f_h(x) = -e^{\lambda x}\int_x^{\infty}h(u)\,e^{-\lambda u}\,du, \quad x\in\mathbb{R},$$
where $f_h(0) = -\mathbb{E}[h(Z)]/\lambda$. If $h\in\mathcal{K}$, then for all $x\in\mathbb{R}$ there exists $f_h'(x)$ and $\|f_h'\|\le1$. If $h\in\mathcal{H}_1\cup\mathcal{H}_2$, then $f_h'$ is defined on $\mathbb{R}$ and $\|f_h'\|\le\|h'\|/\lambda$. For $h\in\mathcal{H}_2$, the function $f_h''$ is defined on $\mathbb{R}$ and $\|f_h''\|\le\min\{2\|h'\|,\ \|h''\|/\lambda\}$.
The right-hand side of Equation (13) is well defined for each $x\in\mathbb{R}$ in light of Remark 1. Lemma 4.1 of [33] contains, for $\lambda=1$, some statements of Lemma 2. We will use the above estimates for any $\lambda>0$. Estimates for $h\in\mathcal{H}_2$ were not considered in [33]. The proof of Lemma 2 is given in Appendix A.
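To see Lemma 2 at work, one can check numerically that $f_h$ solves Equation (13). In the sketch below (ours), the test function $h(x)=\sin x$ and the value $\lambda=2$ are arbitrary assumptions, and the factor $e^{\lambda x}$ is folded into the integrand to avoid overflow.

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0
h = np.sin                    # a Lipschitz test function, Lip(h) = 1

def f_h(x):
    # f_h(x) = -e^{lam x} * int_x^inf h(u) e^{-lam u} du,
    # computed as -int_x^inf h(u) e^{-lam (u - x)} du for numerical stability
    val, _ = quad(lambda u: h(u) * np.exp(-lam * (u - x)), x, np.inf)
    return -val

# verify f_h'(x) - lam * f_h(x) = h(x) via a central difference
for x in (-1.0, 0.0, 2.5):
    eps = 1e-5
    deriv = (f_h(x + eps) - f_h(x - eps)) / (2.0 * eps)
    print(x, deriv - lam * f_h(x), h(x))   # last two columns nearly coincide
```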
The following concept was introduced in [33].
Definition 1
([33]). Let $X$ be a non-negative random variable with finite $\mathbb{E}[X]>0$. One says that a random variable $X^e$ has the equilibrium distribution with respect to $X$ if, for any Lipschitz function $f:\mathbb{R}\to\mathbb{R}$,
$$\mathbb{E}[f(X)]-f(0) = \mathbb{E}[X]\,\mathbb{E}[f'(X^e)].$$
Note that Definition 1 deals separately with the distributions of $X$ and $X^e$. One says that $X^e$ is the result of the equilibrium transformation applied to $X$. The same terminology is used for the transition from $\mathrm{law}(X)$ to $\mathrm{law}(X^e)$. For the sake of completeness, we explain in Appendix A (Comments to Definition 1) why one can take the law of $X^e$ having the density, with respect to the Lebesgue measure,
$$p^e(x) = \begin{cases}\dfrac{1}{\mathbb{E}[X]}\,\mathbb{P}(X>x), & x\ge0,\\[4pt] 0, & x<0,\end{cases}$$
to guarantee the validity of Equation (14).
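For instance (an illustration of ours, not from the paper), if $X\sim\mathrm{Uniform}(0,1)$, then $\mathbb{P}(X>x)=1-x$ and $\mathbb{E}[X]=1/2$, so $p^e(x)=2(1-x)$ on $[0,1]$, i.e., $X^e\sim\mathrm{Beta}(1,2)$. The sketch below checks Equation (14) by Monte Carlo for one Lipschitz test function of our choosing.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.uniform(0.0, 1.0, n)      # X ~ Uniform(0,1)
xe = rng.beta(1.0, 2.0, n)        # X^e ~ Beta(1,2): density 2(1-x) on [0,1]

f = lambda t: np.sin(3.0 * t)             # a Lipschitz test function
f_prime = lambda t: 3.0 * np.cos(3.0 * t)

lhs = f(x).mean() - f(0.0)                # E[f(X)] - f(0)
rhs = x.mean() * f_prime(xe).mean()       # E[X] E[f'(X^e)]
print(lhs, rhs)                           # agree up to Monte Carlo error
```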
Remark 2.
For a non-negative random variable $X$ with finite $\mathbb{E}[X]>0$, one can construct a random variable $X^e$ having the density (15). Accordingly, we then have a random vector $(X,X^e)$ with specified marginal distributions. However, the joint law of $X$ and $X^e$ is not fixed and can be chosen in an appropriate way. If $X_1,X_2,\ldots$ is a sequence of independent random variables, we will assume that the sequence $(X_n,X_n^e)_{n\in\mathbb{N}}$ consists of independent vectors and that these vectors are independent of all other random variables under consideration that are independent of $(X_n)_{n\in\mathbb{N}}$.
In the recent paper [22], a generalization of the equilibrium transformation of distributions was proposed without assuming that the random variable $X$ is non-negative.
Definition 2
([22]). Let $X$ be a random variable having a distribution function $F(x) := \mathbb{P}(X\le x)$, $x\in\mathbb{R}$. Assume the existence of finite $\mathbb{E}[X]\neq0$. An equilibrium distribution function corresponding to $X$ (or to $F$) is introduced by way of
$$F^e(x) := \begin{cases}-\dfrac{1}{\mathbb{E}[X]}\displaystyle\int_{-\infty}^{x}F(u)\,du, & x\le0,\\[6pt] -\dfrac{\mathbb{E}[X^-]}{\mathbb{E}[X]}+\dfrac{1}{\mathbb{E}[X]}\displaystyle\int_0^x\big(1-F(u)\big)\,du, & x>0,\end{cases}$$
where $X^- := -X\,I\{X<0\}$. This function can be written as $F^e(x) = \int_{-\infty}^{x}p^e(u)\,du$, where
$$p^e(x) = \begin{cases}-\dfrac{1}{\mathbb{E}[X]}\,F(x), & x\le0,\\[4pt] \dfrac{1}{\mathbb{E}[X]}\,\big(1-F(x)\big), & x>0;\end{cases}$$
thus, $p^e$ is a density (with respect to the Lebesgue measure) of a signed measure $Q^e$ corresponding to $F^e$. In other words, Equation (17) demonstrates the Jordan decomposition (see, e.g., Sec. 29 of [44]) of $Q^e$.
Clearly, for a non-negative random variable, the functions defined in Equations (15) and (16) coincide. For a nonpositive random variable, the function $F^e$ appearing in Equation (16) is a distribution function of a probability measure. In general, when $X$ can take both positive and negative values, the function introduced in Equation (16) is not a distribution function. We will call $F^e$ the generalized equilibrium distribution function. Note that $|p^e(x)|\le\frac{1}{|\mathbb{E}[X]|}$. Thus, $F^e$ is a Lipschitz function and consequently continuous ($F^e(x)$ is well defined for each $x\in\mathbb{R}$ since $\mathbb{E}[X]$ is finite and nonzero). Moreover, $F^e$ is absolutely continuous, being a Lipschitz function. Each absolutely continuous function has bounded variation. If $G$ is a function of bounded variation, then $G = G_1-G_2$, where $G_1$ and $G_2$ are nondecreasing functions (see, e.g., [42], Theorem 12.18). One can employ the canonical choice $G_1(x) := \mathrm{Var}_0^x(G)$, where $\mathrm{Var}_a^b(G)$ means the variation of $G$ on $[a,b]$, $-\infty<a\le b<\infty$ (if $a>b$, then $\mathrm{Var}_a^b(G) := -\mathrm{Var}_b^a(G)$). If $G$ is right-continuous (on $\mathbb{R}$), then evidently $G_1$ and $G_2$ are also right-continuous. Thus, for a right-continuous $G$ having bounded variation, a nondecreasing function $G_i$ in its representation corresponds to a $\sigma$-finite measure $Q_i$ on $\mathcal{B}(\mathbb{R})$, $i=1,2$. More precisely, there exists a unique $\sigma$-finite measure $Q_i$ on $\mathcal{B}(\mathbb{R})$ such that, for each finite interval $(a,b]$, $Q_i((a,b]) = G_i(b)-G_i(a)$, $i=1,2$. Recall that one writes, for the Lebesgue–Stieltjes integral with respect to a function $G$,
$$\int_{\mathbb{R}}f(u)\,dG(u) := \int_{\mathbb{R}}f(u)\,dG_1(u)-\int_{\mathbb{R}}f(u)\,dG_2(u),$$
whenever the integrals on the right-hand side exist (with values in $[-\infty,\infty]$), and the cases $\infty-\infty$ or $-\infty+\infty$ are excluded. The integral $\int_{\mathbb{R}}f(u)\,dG_i(u)$ means the integration with respect to the measure $Q_i$, $i=1,2$. The signed measure $Q$ corresponding to $G$ is $Q_1-Q_2$. Thus, $\int_{\mathbb{R}}f(u)\,dG(u)$ means the integration with respect to the signed measure $Q$. Note that if $G = U_1-U_2$, where $U_i$ is right-continuous and nondecreasing ($i=1,2$), then
$$\int_{\mathbb{R}}f(u)\,dG_1(u)-\int_{\mathbb{R}}f(u)\,dG_2(u) = \int_{\mathbb{R}}f(u)\,dU_1(u)-\int_{\mathbb{R}}f(u)\,dU_2(u).$$
The left-hand side and the right-hand side of Equation (19) make sense simultaneously and, if so, are equal to each other. Indeed, for any finite interval $(a,b]$ ($a\le b$), one has $G_1(b)-G_1(a)-(G_2(b)-G_2(a)) = U_1(b)-U_1(a)-(U_2(b)-U_2(a))$. Thus, the signed measures corresponding to $G_1-G_2$ and $U_1-U_2$ coincide on $\mathcal{B}(\mathbb{R})$. We mention in passing that one can also employ the Jordan decomposition of a signed measure.
For $F^e$ introduced in Equation (16), the analog of Equation (14) has the form
$$\mathbb{E}[f(X)]-f(0) = \mathbb{E}[X]\int_{\mathbb{R}}f'(x)\,dF^e(x).$$
Taking into account Equation (17), one can rewrite Equation (20) equivalently as follows:
$$\mathbb{E}[f(X)]-f(0) = \int_{(-\infty,0]}f'(x)\,\big(-F(x)\big)\,dx+\int_{(0,\infty)}f'(x)\,\big(1-F(x)\big)\,dx.$$
The right-hand side of the latter relation does not depend on the choice of a version of $f'$. Due to Theorem 1(d) of [22], Equation (20) is valid for any Lipschitz function $f$. Evidently, an arbitrary function $f\in\mathcal{H}_2$ need not be Lipschitz, and vice versa.
Lemma 3.
Let $X$ be a random variable such that $\mathbb{E}[X^2]<\infty$ and $\mathbb{E}[X]\neq0$. Then, Equation (20) is satisfied for all $f\in\mathcal{H}_2$.
The proof is provided in Appendix A.

3. Limit Theorem for Geometric Sums of Independent Random Variables

Consider $N_p\sim\mathrm{Geom}(p)$, see Equation (2). In other words, $N_p$ has the geometric distribution with parameter $p$. Let $X_1,X_2,\ldots$ be a sequence of independent random variables such that $\mathbb{E}[X_k]=\mu$, where $\mu\in\mathbb{R}$, $\mu\neq0$, $k\in\mathbb{N}$. Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Consider the normalized geometric sum
$$W_p := \frac{p}{\mu(1-p)}\sum_{k=1}^{N_p}X_k,$$
introduced in Equation (3). Since $N_p$ can take the zero value, set, as usual, $\sum_{k=1}^{0}X_k := 0$. One can see that $W_p$ is the random sum $S_p := \sum_{k=1}^{N_p}X_k$ normalized by $\mathbb{E}[X_1]\,\mathbb{E}[N_p]$.
Lemma 4.
Let $X_1,X_2,\ldots$ and $N_p$, where $p\in(0,1)$, be the random variables described above in this section. Then, the following relations hold:
$$\mathbb{E}[W_p] = 1, \qquad \mathbb{E}|W_p| \le \frac{\sup_{k\in\mathbb{N}}\mathbb{E}|X_k|}{|\mu|},$$
$$\mathbb{E}[W_p^2] = \frac{p}{\mu^2(1-p)}\,\mathbb{E}[X_{N_p+1}^2]+2.$$
Proof. 
Recall that
$$\mathbb{E}[N_p] = \sum_{k=1}^{\infty}k\,p(1-p)^{k} = \frac{1-p}{p},$$
$$\mathbb{E}[N_p^2] = \sum_{k=1}^{\infty}k^2\,p(1-p)^{k} = \frac{(1-p)(2-p)}{p^2}.$$
Thus, one has
$$\mathbb{E}[W_p] = \frac{p}{\mu(1-p)}\sum_{k=1}^{\infty}k\mu\,\mathbb{P}(N_p=k) = \frac{p}{1-p}\,\mathbb{E}[N_p] = 1.$$
Clearly, $\mathbb{E}|X_k|<\infty$ since $\mathbb{E}[X_k]$ is finite ($k\in\mathbb{N}$). Therefore,
$$\mathbb{E}|W_p| \le \frac{p}{|\mu|(1-p)}\sum_{k=1}^{\infty}\Big(\sum_{i=1}^{k}\mathbb{E}|X_i|\Big)\,\mathbb{P}(N_p=k) \le \frac{\sup_{k\in\mathbb{N}}\mathbb{E}|X_k|}{|\mu|}.$$
Set $\nu_k := \mathbb{E}[X_k^2]$, $k\in\mathbb{N}$. One has
$$\mathbb{E}[S_p^2] = \sum_{k=1}^{\infty}\mathbb{P}(N_p=k)\,\mathbb{E}\Big[\Big(\sum_{i=1}^{k}X_i\Big)^2\Big] = \sum_{k=1}^{\infty}p(1-p)^k\Big(\sum_{i=1}^{k}\nu_i+k(k-1)\mu^2\Big).$$
According to Equations (24) and (25), one derives the formula
$$\sum_{k=1}^{\infty}p(1-p)^k\,k(k-1)\,\mu^2 = \mu^2\Big(\frac{(1-p)(2-p)}{p^2}-\frac{1-p}{p}\Big) = \frac{2\mu^2(1-p)^2}{p^2}.$$
Convergence of the series $\sum_{k=1}^{\infty}p(1-p)^k\sum_{i=1}^{k}\nu_i$, having non-negative terms, holds simultaneously with the validity of the inequality $\mathbb{E}[W_p^2]<\infty$. Changing the order of summation, we obtain
$$\sum_{k=1}^{\infty}p(1-p)^k\sum_{i=1}^{k}\nu_i = \sum_{i=1}^{\infty}(1-p)^{i}\nu_i = \frac{1-p}{p}\,\mathbb{E}[X_{N_p+1}^2].$$
The latter formula and Equations (26), (27) yield
$$\mathbb{E}[W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\mathbb{E}[S_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\Big(\frac{1-p}{p}\,\mathbb{E}[X_{N_p+1}^2]+\frac{2\mu^2(1-p)^2}{p^2}\Big) = \frac{p}{\mu^2(1-p)}\,\mathbb{E}[X_{N_p+1}^2]+2.$$
Equation (23) is established. □
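The identity (23) is easy to test by simulation; in this sketch (ours, with arbitrary parameter values), the summands are i.i.d. Uniform(0,1), so $\mathbb{E}[X_{N_p+1}^2] = \mathbb{E}[X_1^2] = 1/3$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, mu, second_moment = 0.05, 0.5, 1.0 / 3.0   # Uniform(0,1) summands

n = rng.geometric(p, size=100_000) - 1        # N_p on {0, 1, ...}
s = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in n])
w = s * p / (mu * (1.0 - p))                  # normalized geometric sum W_p

print(np.mean(w ** 2))                                  # empirical E[W_p^2]
print(p / (mu ** 2 * (1.0 - p)) * second_moment + 2.0)  # Equation (23)
```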
The proof of Theorem 3.1 in [45] shows, for non-negative i.i.d. random variables $X_1,X_2,\ldots$ (when $\mu=1$, see Formula (3.15) in [45]), that the equilibrium transformation of the $W_p$ distribution has the following form:
$$W_p^e = \frac{p}{\mu(1-p)}\Big(\sum_{k=1}^{N_p}X_k+X_{N_p+1}^e\Big) = W_p+\frac{p}{\mu(1-p)}\,X_{N_p+1}^e,$$
where $X_{N_p+1}^e$ means that we construct $X_1^e,X_2^e,\ldots$ and then take the random index $N_p+1$. In other words,
$$X_{N_p+1}^e = \sum_{n=0}^{\infty}X_{n+1}^e\,I\{N_p=n\}.$$
It was explained in Section 2 that the generalized equilibrium distribution function $F_{W_p}^e(x)$ (see Definition 2) need not be a distribution function when the summands $X_1,X_2,\ldots$ can take values of different signs. However, employing this function, one can establish the following result.
Theorem 1.
Let $X_1,X_2,\ldots$ be a sequence of independent random variables having finite $\mathbb{E}[X_k]=\mu$, where $\mu\neq0$, $k\in\mathbb{N}$. Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent, where $N_p\sim\mathrm{Geom}(p)$, $0<p<1$. If $Z\sim\mathrm{Exp}(1)$, then
$$d_{\mathcal{H}_2}(W_p,Z) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p},$$
where $W_p$ was introduced in Equation (22).
Proof. 
If $\mathbb{E}[W_p^2]=\infty$, then $d_{\mathcal{H}_2}(W_p,Z)=\infty$ since, for the function $h(x)=x^2/2$, $x\in\mathbb{R}$, belonging to $\mathcal{H}_2$, one has $\mathbb{E}[h(W_p)]=\infty$, whereas $\mathbb{E}[h(Z)]<\infty$. According to Equation (23), $\mathbb{E}[W_p^2]$ and $\mathbb{E}[X_{N_p+1}^2]$ are both finite or both infinite. Consequently, Equation (29) is true when $\mathbb{E}[W_p^2]=\infty$.
Let us turn to the case $\mathbb{E}[W_p^2]<\infty$. At first, we obtain an upper bound for $d_{\mathcal{H}_2}(W_p,Z)$. Take $h\in\mathcal{H}_2$. Applying Lemmas 1 and 2 and Remark 1, one can write, due to Stein's Equation (10), that
$$\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Z)]\big| = \big|\mathbb{E}[f_h'(W_p)]-\mathbb{E}[f_h(W_p)]+f_h(0)\big|.$$
Using the generalized equilibrium distribution transformation (20), one obtains:
$$\big|\mathbb{E}[f_h'(W_p)]-\mathbb{E}[f_h(W_p)]+f_h(0)\big| = \Big|\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x)-\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x)\Big|.$$
Due to Lemma 3, this is true for $h\in\mathcal{H}_2$ because $f_h\in\mathcal{H}_2$ according to Lemma 2 (with $\lambda=1$). Next, we employ the relation
$$\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x)-\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x) = \int_{\mathbb{R}}f_h'(x)\,d\big(F_{W_p}-F_{W_p}^e\big)(x).$$
Evidently, one can write $\int_{\mathbb{R}}|f_h'(x)|\,dF_{W_p}(x)<\infty$. The notation $dF_{W_p}^e(x)$ in the integral refers to the Lebesgue–Stieltjes integral with respect to the function $F_{W_p}^e(x)$ of bounded variation. In fact, the integral with integrator $dF_{W_p}^e(x)$ means that the integration employs the signed measure $Q_p^+-Q_p^-$, where $Q_p^+$ and $Q_p^-$ have the following densities with respect to the Lebesgue measure:
$$q_p^+(x) := \big(1-F_{W_p}(x)\big)\,I_{(0,\infty)}(x), \qquad q_p^-(x) := F_{W_p}(x)\,I_{(-\infty,0]}(x), \quad x\in\mathbb{R};$$
here we took into account that $\mathbb{E}[W_p]=1$ according to Lemma 4. Then, for any $-\infty<a<b<\infty$, one ascertains that the variation of $F_{W_p}^e$ on $[a,b]$ is given by the formula $\mathrm{Var}_a^b(F_{W_p}^e) = \int_a^b|p_{W_p}^e(u)|\,du$ (see, e.g., Theorem 4.4.7 of [46]). Note that, for any $-\infty<a<b<\infty$,
$$\int_a^b|p_{W_p}^e(u)|\,du \le \mathbb{E}|W_p| < \infty$$
according to Lemma 4. Thus, $F_{W_p}^e$ is a function of bounded variation. On the right-hand side of Equation (32), we take the Lebesgue–Stieltjes integral with respect to the function of bounded variation $(F_{W_p}-F_{W_p}^e)(x)$, $x\in\mathbb{R}$. Let $F_{W_p}^e(x) = F_{p,1}^e(x)-F_{p,2}^e(x)$, $x\in\mathbb{R}$, where the $F_{p,i}^e$ are nondecreasing right-continuous functions (even continuous, since $F_{W_p}^e$ is continuous), $i=1,2$. Thus,
$$F_{W_p}(x)-F_{W_p}^e(x) = \big(F_{W_p}(x)+F_{p,2}^e(x)\big)-F_{p,1}^e(x), \quad x\in\mathbb{R}.$$
With the help of Equations (18) and (19), one makes sure that, for each $n\in\mathbb{N}$,
$$\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = \int_{(-n,n]}f_h'(x)\,d\big(F_{W_p}(x)+F_{p,2}^e(x)\big)-\int_{(-n,n]}f_h'(x)\,dF_{p,1}^e(x)$$
$$= \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)+\int_{(-n,n]}f_h'(x)\,dF_{p,2}^e(x)-\int_{(-n,n]}f_h'(x)\,dF_{p,1}^e(x)$$
$$= \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)-\int_{(-n,n]}f_h'(x)\,d\big(F_{p,1}^e(x)-F_{p,2}^e(x)\big) = \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)-\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x).$$
All the integrals in the latter formulas are finite. According to Lemma 2 and Remark 1, one can write $|f_h'(x)|\le A_0|x|+B_0$, where $A_0$, $B_0$ are positive constants. Thus, the Lebesgue dominated convergence theorem ensures that
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,dF_{W_p}(x) = \int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x),$$
where the latter integral is finite. Indeed,
$$\int_{\mathbb{R}}(A_0|x|+B_0)\,dF_{W_p}(x) = A_0\,\mathbb{E}|W_p|+B_0 < \infty$$
according to Lemma 4. By the same lemma, one has $\mathbb{E}[W_p]=1$. Therefore, on account of Equation (17), the following relation holds:
$$\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x) = \int_{(-n,0]}f_h'(x)\,\big(-F_{W_p}(x)\big)\,dx+\int_{(0,n]}f_h'(x)\,\big(1-F_{W_p}(x)\big)\,dx,$$
whereas Corollary 2, Sec. 6, Ch. II of [47] and Lemma 4 entail that
$$\int_{(-\infty,0]}(A_0|x|+B_0)\,F_{W_p}(x)\,dx+\int_{(0,\infty)}(A_0|x|+B_0)\,\big(1-F_{W_p}(x)\big)\,dx \le A_0\,\mathbb{E}[W_p^2]+B_0\,\mathbb{E}|W_p| < \infty.$$
The Lebesgue dominated convergence theorem for $\sigma$-finite measures and Equation (34) yield
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x) = \int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x),$$
where the latter integral is finite. Now, we show that
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = \int_{\mathbb{R}}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x).$$
Note that $f_h'(x)\,I_{(-n,n]}(x)\to f_h'(x)$ at each $x\in\mathbb{R}$ as $n\to\infty$. To apply the version of the Lebesgue theorem to integrals over a signed measure, it suffices (see, e.g., [48], p. 74) to verify that
$$\int_{\mathbb{R}}|f_h'(x)|\,\big|d(F_{W_p}-F_{W_p}^e)(x)\big| < \infty,$$
where $|dG|$ means that one evaluates the integral with respect to the measure corresponding to the total variation of the measure determined by a right-continuous function $G$ of bounded variation. The extension of the Lebesgue dominated convergence theorem to signed measures is an immediate corollary of the Jordan decomposition mentioned above. Using this decomposition, one obtains the inequality
$$\int_{\mathbb{R}}|f_h'(x)|\,\big|d(F_{W_p}-F_{W_p}^e)(x)\big| \le \int_{\mathbb{R}}|f_h'(x)|\,dF_{W_p}(x)+\int_{\mathbb{R}}|f_h'(x)|\,\big|dF_{W_p}^e(x)\big|.$$
Due to Remark 1, one has $|f_h'(x)|\le A_0|x|+B_0$ for all $x\in\mathbb{R}$ and some positive constants $A_0$, $B_0$. Then, Equations (33) and (34) yield (as $F_{W_p}$ generates a probability measure)
$$\int_{\mathbb{R}}(A_0|x|+B_0)\,dF_{W_p}(x)+\int_{\mathbb{R}}(A_0|x|+B_0)\,\big|dF_{W_p}^e(x)\big| < \infty.$$
The functions $f_h'$ and $F_{W_p}-F_{W_p}^e$ are right-continuous and have bounded variation. Then each of them can be represented as the difference of right-continuous nondecreasing functions, and using, for any $n\in\mathbb{N}$, the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2 of [47]), one has
$$\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = f_h'(x)\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\Big|_{-n}^{n}-\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,df_h'(x).$$
Since the integral on the right-hand side of Equation (35) is finite, it holds that
$$f_h'(x)\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\to 0, \quad x\to\infty \ \text{or} \ x\to-\infty$$
(the proof is similar to that of Corollary 2, Sec. 6, Ch. 2 in [47]). Then,
$$\int_{\mathbb{R}}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = -\lim_{n\to\infty}\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,df_h'(x).$$
The function $f_h'$ is absolutely continuous according to Lemma 2. Hence (see also Equations (36) and (A12) in Appendix A), we get
$$\Big|\int_{\mathbb{R}}f_h'(x)\,d\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\Big| = \Big|\lim_{n\to\infty}\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,f_h''(x)\,dx\Big| \le \|f_h''\|\int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx \le \int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx,$$
because $\|f_h''\|\le\|h''\|\le1$ due to Lemmas 1 and 2. Using the homogeneity of the Kantorovich metric for signed measures, which is derived from formula (20) of [22] (see Lemma 1(a) there), and applying Lemma 3 of that paper, we can write
$$\int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx = \frac{p}{|\mu|(1-p)}\int_{\mathbb{R}}\big|F_{S_{N_p}}(x)-F_{S_{N_p}}^e(x)\big|\,dx \le \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
Relations (30), (31), (32), (37), (38) and Lemmas 1 and 2 guarantee that $d_{\mathcal{H}_2}(W_p,Z)$ does not exceed the right-hand side of Equation (29).
Now, we turn to the lower bound for $d_{\mathcal{H}_2}(W_p,Z)$. Choose $h(x)=x^2/2$ as the test function. Since $h\in\mathcal{H}_2$, we can write
$$d_{\mathcal{H}_2}(W_p,Z) \ge \big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Z)]\big| = \frac{1}{2}\big|\mathbb{E}[W_p^2]-\mathbb{E}[Z^2]\big|.$$
For a random variable $Z$ following the exponential law $\mathrm{Exp}(1)$, one has $\mathbb{E}[Z^2]=2$. Formula (23) of Lemma 4 yields
$$d_{\mathcal{H}_2}(W_p,Z) \ge \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
Taking into account Formula (38), we come to the desired statement. The proof is complete. □
Remark 3.
Evidently,
$$\mathbb{E}[X_{N_p+1}^2] = \sum_{n=0}^{\infty}\mathbb{E}[X_{n+1}^2]\,p(1-p)^n.$$
Thus, one obtains
$$\mathbb{E}[X_{N_p+1}^2] \le \sup_{n\in\mathbb{N}}\mathbb{E}[X_n^2],$$
and the latter inequality becomes an equality when $\mathbb{E}[X_n^2]=\mathbb{E}[X_1^2]$ for all $n\in\mathbb{N}$. Therefore, the statement of Theorem 1 can be written as follows:
$$d_{\mathcal{H}_2}(W_p,Z) \le \frac{\sup_{n\in\mathbb{N}}\mathbb{E}[X_n^2]}{2\mu^2}\cdot\frac{p}{1-p},$$
and this becomes an equality when $\mathbb{E}[X_n^2]=\mathbb{E}[X_1^2]$ for all $n\in\mathbb{N}$.
Remark 4.
In [22], the authors proved the following inequality:
$$d_{\mathcal{H}_2}(W_p,Z) \le \frac{3\,\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
We established the sharp estimate with the factor $1/2$ instead of $3/2$, having employed Equation (20) for a class of functions comprising solutions of the Stein equation for $h\in\mathcal{H}_2$. The estimate with the factor $1/2$ was also obtained in the recent paper [49], but for i.i.d. summands. The lower bounds were not provided there. In our Theorem 1, the summands have the same expectations but need not have the same distribution.
Remark 5.
If the summands of $W_p$ are non-negative, we consider $W_p^e$ appearing in Equation (28). Applying Theorem 1(i) of [22] to relation (29), one obtains
$$d_{\mathcal{H}_1}(W_p^e,Z) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
For $i\in\mathbb{N}$, consider a random variable $X_i$ having distribution $\mathrm{Exp}(1/\mu)$. Then $X_i^e\sim\mathrm{Exp}(1/\mu)$ and, consequently, $X_{N_p+1}^e\sim\mathrm{Exp}(1/\mu)$. We can choose $X_i^e$, $i\in\mathbb{N}$, according to Remark 2. Then, the distribution of $W_p^e$ will be the same if we change $X_{N_p+1}^e$ to $X_{N_p+1}$ in Equation (28). In such a way, $W_p^e$ is a normalized sum of a random number of independent random variables. Using the homogeneity of the Kantorovich metric, one has
$$d_{\mathcal{H}_1}\Big(\frac{p}{\mu}\sum_{k=1}^{N_p+1}X_k,\ (1-p)Z\Big) = (1-p)\,d_{\mathcal{H}_1}\Big(\frac{p}{\mu(1-p)}\sum_{k=1}^{N_p+1}X_k,\ Z\Big) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\,p.$$
Therefore, for an arbitrary sequence $(X_k)_{k\in\mathbb{N}}$ satisfying the conditions of Theorem 1, the upper bound for the left-hand side of Equation (40) is not less than the right-hand side of Equation (40).

4. Limit Theorem for Geometric Sums of Exchangeable Random Variables

Now, we consider exchangeable random variables $X_1,X_2,\ldots$ satisfying the dependence condition proposed in [35]. Namely, assume that, for all $n\in\mathbb{N}$, $t_j\in\mathbb{R}$ ($j=1,\ldots,n$) and some $\rho\in[0,1]$,
$$\mathbb{E}\big[e^{i(t_1X_1+\dots+t_nX_n)}\big] = \rho\,\mathbb{E}\big[e^{iX_1(t_1+\dots+t_n)}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{it_jX_j}\big],$$
where $i^2=-1$. The cases $\rho=0$ and $\rho=1$ correspond, respectively, to independent random variables and to those possessing the property of comonotonicity. The latter means that, for $\rho=1$, the joint behavior of $X_1,\ldots,X_n$ is strongly correlated and coincides with that of the vector $(X_1,\ldots,X_1)$.
Theorem 2.
Let $X_1,X_2,\ldots$ be exchangeable random variables with $\mathbb{E}[X_1]=\mu$, $\mu\neq0$, satisfying condition (41) for some $\rho\in(0,1)$. Suppose that $(X_n)_{n\in\mathbb{N}}$ and $N_p$ are independent, where $N_p\sim\mathrm{Geom}(p)$, $p\in(0,1)$. In contrast to the Rényi theorem, one has
$$W_p \xrightarrow{\mathcal{D}} Y, \quad p\to0+,$$
where the law of $Y$ is the following mixture:
$$P_Y = \rho\,P_{VX_1/\mu}+(1-\rho)\,P_Z,$$
the random variables $X_1$, $V$ are independent, and $V\sim\mathrm{Exp}(1)$, $Z\sim\mathrm{Exp}(1)$.
Proof. 
Let $\tilde X_1,\tilde X_2,\ldots$ be independent copies of $X_1,X_2,\ldots$, respectively. Suppose that $\tilde X_1,\tilde X_2,\ldots$ and $N_p$ are independent. Set $S_0:=0$, $\tilde S_0:=0$, $\tilde S_n := \tilde X_1+\dots+\tilde X_n$, $n\in\mathbb{N}$. Denote the characteristic function of a random variable $\xi$ by $f_\xi(t)$, $t\in\mathbb{R}$. For each $t\in\mathbb{R}$, using Equation (41), one has
$$f_{S_{N_p}}(t) = \sum_{n=0}^{\infty}\mathbb{E}\big[e^{itS_n}\big]\,\mathbb{P}(N_p=n) = \mathbb{P}(N_p=0)+\sum_{n=1}^{\infty}\Big(\rho\,\mathbb{E}\big[e^{iX_1tn}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{itX_j}\big]\Big)\mathbb{P}(N_p=n)$$
$$= p+\sum_{n=0}^{\infty}\Big(\rho\,\mathbb{E}\big[e^{iX_1tn}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{it\tilde X_j}\big]\Big)\mathbb{P}(N_p=n)-\rho p-(1-\rho)p$$
$$= \rho\,f_{X_1N_p}(t)+(1-\rho)\sum_{n=0}^{\infty}f_{\tilde S_n}(t)\,\mathbb{P}(N_p=n) = \rho\,f_{X_1N_p}(t)+(1-\rho)\,f_{\tilde S_{N_p}}(t).$$
Hence, for each $t\in\mathbb{R}$, one has
$$f_{W_p}(t) = \rho\,f_{\frac{p}{\mu(1-p)}X_1N_p}(t)+(1-\rho)\,f_{\tilde W_p}(t),$$
where $\tilde W_p = \frac{p}{\mu(1-p)}\sum_{j=1}^{N_p}\tilde X_j$.
According to the classical Rényi theorem, $\tilde W_p\xrightarrow{\mathcal{D}}Z$ as $p\to0+$, where $Z\sim\mathrm{Exp}(1)$. Note that $T_p := \frac{p}{1-p}N_p\xrightarrow{\mathcal{D}}V$ as $p\to0+$, where $V\sim\mathrm{Exp}(1)$. In fact, one can apply Theorem 1 with $X_j\equiv1$, $j\in\mathbb{N}$, to check this. For each $t\in\mathbb{R}$, taking into account that $T_p$ and $X_1$ are independent and applying the Lebesgue dominated convergence theorem, we see that
$$\mathbb{E}\big[e^{itT_pX_1}\big] = \mathbb{E}\big[\mathbb{E}\big[e^{itT_pX_1}\mid X_1\big]\big] = \int_{\mathbb{R}}\mathbb{E}\big[e^{itT_px}\big]\,dF_{X_1}(x) \to \int_{\mathbb{R}}\mathbb{E}\big[e^{itVx}\big]\,dF_{X_1}(x) = \mathbb{E}\big[e^{itVX_1}\big], \quad p\to0+,$$
since $X_1$ and $V$ are independent. Hence,
$$\frac{p}{\mu(1-p)}\,X_1N_p \xrightarrow{\mathcal{D}} \frac{VX_1}{\mu}, \quad p\to0+$$
is true. In light of Equation (43),
$$W_p \xrightarrow{\mathcal{D}} Y, \quad p\to0+;$$
here the law of $Y$ is the mixture of the distributions of $VX_1/\mu$ and $Z$ provided by Equation (42). The proof is complete. □
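The mixture structure behind condition (41), namely that with probability $\rho$ all summands coincide with $X_1$ and otherwise they are independent, suggests a direct simulation of Theorem 2. The sketch below is ours; the Uniform(0,1) summands and the values of $p$ and $\rho$ are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_w(p, rho, n, mu=0.5):
    """W_p under the exchangeable model: with prob. rho the summands are
    comonotone (all equal to X_1), otherwise i.i.d. Uniform(0,1)."""
    comono = rng.uniform(size=n) < rho
    N = rng.geometric(p, size=n) - 1            # N_p on {0, 1, ...}
    x1 = rng.uniform(0.0, 1.0, size=n)
    iid = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in N])
    return np.where(comono, N * x1, iid) * p / (mu * (1.0 - p))

def sample_limit(rho, n, mu=0.5):
    """Y with law rho * law(V X_1 / mu) + (1 - rho) * Exp(1), cf. (42)."""
    comono = rng.uniform(size=n) < rho
    v, z = rng.exponential(1.0, n), rng.exponential(1.0, n)
    x1 = rng.uniform(0.0, 1.0, n)
    return np.where(comono, v * x1 / mu, z)

w = np.sort(sample_w(0.005, 0.3, 50_000))
y = np.sort(sample_limit(0.3, 50_000))
x = np.linspace(0.0, 8.0, 801)
f_w = np.searchsorted(w, x, side="right") / w.size
f_y = np.searchsorted(y, x, side="right") / y.size
print(np.abs(f_w - f_y).max())   # small for small p
```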
Theorem 3.
Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ satisfy the conditions of Theorem 2. Let $\mu_2 = \mathbb{E}[X_1^2]$. Then,
$$d_{\mathcal{H}_2}(W_p,Y) = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
Proof. 
Relation (43) for characteristic functions implies that the following equality of distributions holds:
$$W_p \stackrel{\mathcal{D}}{=} \frac{p}{\mu(1-p)}\big((1-I_\rho)\,N_pX_1+I_\rho\,\tilde S_{N_p}\big),$$
where the indicator $I_\rho$ equals 1 and 0 with probabilities $1-\rho$ and $\rho$, respectively, and is independent of all the variables under consideration. Assume at first that $\mu_2<\infty$. Then, for $h\in\mathcal{H}_2$,
$$\mathbb{E}[h(W_p)] = \rho\,\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]+(1-\rho)\,\mathbb{E}[h(\tilde W_p)].$$
In view of Equation (42), one has
$$\mathbb{E}[h(Y)] = \rho\,\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]+(1-\rho)\,\mathbb{E}[h(Z)].$$
The latter two formulas and the triangle inequality yield
$$\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \le \rho\,\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]\Big|+(1-\rho)\,\big|\mathbb{E}[h(\tilde W_p)]-\mathbb{E}[h(Z)]\big|.$$
By means of Theorem 1, we have
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(\tilde W_p)]-\mathbb{E}[h(Z)]\big| = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
For each $h\in\mathcal{H}_2$, taking into account the independence of $X_1$, $N_p$, $V$, one can write
$$\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big] = \int_{\mathbb{R}}\Big(\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_px\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{xV}{\mu}\Big)\Big]\Big)\,dF_{X_1}(x).$$
Due to the homogeneity of $d_{\mathcal{H}_2}$, we infer from Theorem 1 that
$$\sup_{h\in\mathcal{H}_2}\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_px\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{xV}{\mu}\Big)\Big]\Big| = d_{\mathcal{H}_2}\Big(\frac{px}{\mu(1-p)}N_p,\ \frac{xV}{\mu}\Big) = \Big(\frac{x}{\mu}\Big)^2 d_{\mathcal{H}_2}\Big(\frac{p}{1-p}\sum_{k=1}^{N_p}1,\ V\Big) = \frac{1}{2}\Big(\frac{x}{\mu}\Big)^2\frac{p}{1-p}.$$
Consequently, it holds that
$$\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]\Big| \le \frac{p}{2(1-p)}\int_{\mathbb{R}}\Big(\frac{x}{\mu}\Big)^2\,dF_{X_1}(x) = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
Equations (46), (47) and (48) lead to the upper bound for $d_{\mathcal{H}_2}(W_p,Y)$.
Note that the function $h(x)=x^2/2$, $x\in\mathbb{R}$, belongs to $\mathcal{H}_2$, and therefore
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \ge \frac{1}{2}\big|\mathbb{E}[W_p^2]-\mathbb{E}[Y^2]\big|.$$
Note that $\mathbb{E}[Z^2]=\mathbb{E}[V^2]=2$ because $Z\sim\mathrm{Exp}(1)$ and $V\sim\mathrm{Exp}(1)$. The random variables $X_1$, $V$, $Z$ are independent. Thus, in light of Equation (42), one has
$$\mathbb{E}[Y^2] = \frac{2\rho\,\mu_2}{\mu^2}+2(1-\rho).$$
By means of Equations (45), (23) and (25), we obtain
$$\mathbb{E}[W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\rho\,\mathbb{E}[N_p^2]\,\mathbb{E}[X_1^2]+(1-\rho)\,\mathbb{E}[\tilde W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\rho\,\frac{(1-p)(2-p)}{p^2}\,\mu_2+(1-\rho)\Big(\frac{p\,\mu_2}{\mu^2(1-p)}+2\Big)$$
$$= \frac{\mu_2}{\mu^2}\Big(\rho\Big(2+\frac{p}{1-p}\Big)+(1-\rho)\,\frac{p}{1-p}\Big)+2(1-\rho).$$
Equations (50) and (51) permit us to find $\mathbb{E}[W_p^2]-\mathbb{E}[Y^2]$. Hence, Equation (49) leads to the inequality
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \ge \frac{1}{2}\,\frac{\mu_2}{\mu^2}\Big(\rho\Big(2+\frac{p}{1-p}-2\Big)+(1-\rho)\,\frac{p}{1-p}\Big) = \frac{1}{2}\,\frac{\mu_2}{\mu^2}\cdot\frac{p}{1-p}.$$
Now, let $\mu_2=\infty$. Then, $d_{\mathcal{H}_2}(W_p,Y)=\infty$ according to Equation (52). The proof is complete. □

5. Convergence of Random Sums of Independent Summands to Generalized Gamma Distribution

Statements concerning weak convergence of geometric sums distributions to the exponential law are often just particular cases of more general results concerning the convergence of random sums of random summands to the generalized gamma law when the number of summands follows the generalized negative binomial distribution, see, e.g., [27,29,49]. The recent work [29] demonstrated how it is possible to study the mentioned general case employing the estimates of proximity of geometric sums distributions to the exponential law. We introduce some notation to apply Theorem 1 to the analysis of the distance between the distributions of random sums and the generalized gamma law.
Introduce a random variable $G_{r,\lambda}$ such that $G_{r,\lambda}\sim G(r,\lambda)$, where $G(r,\lambda)$ is the gamma law with positive parameters $r$ and $\lambda$, i.e., its density with respect to the Lebesgue measure has the form
$$g(z;r,\lambda) = \frac{\lambda^r z^{r-1}}{\Gamma(r)}\,e^{-\lambda z}\,I_{(0,\infty)}(z), \quad z\in\mathbb{R},$$
$\Gamma(r)$ being the gamma function. For $r=1$, one has $G(1,\lambda)=\mathrm{Exp}(\lambda)$. Clearly, for $a>0$, $aG_{r,\lambda}\sim G(r,\lambda/a)$. Set $G^*_{r,\alpha,\lambda} := G_{r,\lambda}^{1/\alpha}$, where $\alpha>0$. One says that the random variable $G^*_{r,\alpha,\lambda}$ has the generalized gamma distribution $G^*(r,\alpha,\lambda)$. According to Equation (5) of [29], the density of $G^*_{r,\alpha,\lambda}$ is given by the formula
$$g^*(z;r,\alpha,\lambda) = \frac{|\alpha|\,\lambda^r z^{\alpha r-1}}{\Gamma(r)}\,e^{-\lambda z^{\alpha}}\,I_{(0,\infty)}(z), \quad z\in\mathbb{R}.$$
It is also known (see Equation (6) in [29]) that, for $r\in(0,1)$, $\alpha\in(0,1]$ and $\lambda>0$, the following relation holds:
$$g^*(z;r,\alpha,\lambda) = \int_0^1\frac{u}{1-u}\,e^{-\frac{u}{1-u}z}\,q(u;r,\alpha,\lambda)\,du, \quad z>0,$$
where $q$ is the density of a specified random variable $Y_{r,\alpha,\lambda}$ whose distribution is supported in $(0,1)$ (see Remark 3 of [49]). We only note that, for $\alpha=1$, the density $q$ admits the representation
$$q\Big(u;r,1,\frac{b}{1-b}\Big) = \frac{b^r\sin(\pi r)}{\pi}\,\frac{(1-u)^{r-1}}{u\,(u-b)^{r}}\,I_{(b,1)}(u), \quad b\in(0,1).$$
Consider a random variable $N^*_{r,\alpha,p}$ having the generalized negative binomial distribution $GNB(r,\alpha,p)$, where $r>0$, $\alpha\neq0$ and $p\in(0,1)$, i.e.,
$$\mathbb{P}(N^*_{r,\alpha,p}=k) = \int_0^{\infty}\frac{z^k}{k!}\,e^{-z}\,g^*\Big(z;r,\alpha,\frac{p}{1-p}\Big)\,dz, \quad k=0,1,\ldots$$
Thus, $GNB(r,\alpha,p)$ is a mixed Poisson distribution. One can verify that $GNB(r,1,p)$ coincides with $NB(r,p)$, where $NB(r,p)$ is the negative binomial law. Recall that $N_{r,p}\sim NB(r,p)$ if
$$\mathbb{P}(N_{r,p}=k) = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,p^r(1-p)^k, \quad k=0,1,\ldots$$
Note also that $N_{1,p}\sim\mathrm{Geom}(p)$.
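The mixed Poisson representation (54) immediately yields a sampler for $GNB(r,\alpha,p)$. The sketch below is ours (the parameter values are arbitrary assumptions); it also checks the particular case $GNB(1,1,p)=\mathrm{Geom}(p)$.

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_gnb(r, alpha, p, size):
    """N ~ GNB(r, alpha, p) as a mixed Poisson: draw
    G* = (Gamma(r, rate=p/(1-p)))**(1/alpha), then N | G* ~ Poisson(G*)."""
    lam = p / (1.0 - p)
    g_star = rng.gamma(shape=r, scale=1.0 / lam, size=size) ** (1.0 / alpha)
    return rng.poisson(g_star)

# sanity check: GNB(1, 1, p) coincides with Geom(p) on {0, 1, ...}
n = sample_gnb(1.0, 1.0, 0.3, 100_000)
print(n.mean(), (1.0 - 0.3) / 0.3)   # both close to E[N_p] = (1-p)/p
```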
Introduce the random variables
$$W^*_{r,\alpha,p} := \frac{1}{\mu}\Big(\frac{p}{1-p}\Big)^{1/\alpha}\sum_{k=1}^{N^*_{r,\alpha,p}}X_k, \qquad S^*_{r,\alpha,p} := \sum_{k=1}^{N^*_{r,\alpha,p}}X_k,$$
where $N^*_{r,\alpha,p}\sim GNB(r,\alpha,p)$ with $r>0$, $\alpha\neq0$, $p\in(0,1)$, and $\mathbb{E}[X_k]=\mu$, $\mu\neq0$, $k\in\mathbb{N}$. We assume that $(X_n)_{n\in\mathbb{N}}$ and $N^*_{r,\alpha,p}$ are independent.
Theorem 4.
Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of independent random variables having $\mathbb{E}[X_n]=\mu$, $\mu\neq0$, $n\in\mathbb{N}$. Then, for $W^*_{r,\alpha,p}$ introduced in Equation (55) with parameters $r\in(0,1)$, $\alpha\in(0,1]$, $p\in(0,1)$, and $G_{r,1}$ having the gamma distribution $G(r,1)$, the following relation holds:
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\,G_{r,1}^{1/\alpha}\big) = \frac{1}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{2/\alpha}\int_0^1\mathbb{E}[X_{N_u+1}^2]\,\frac{1-u}{u}\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du,$$
whenever the right-hand side of Equation (56) is finite. Here, $N_u := N^*_{1,1,u}$, $N_u\sim\mathrm{Geom}(u)$, $u\in(0,1)$, and $q$ appeared in Equation (53).
Proof. 
Without loss of generality, we can assume that $\mu=1$; otherwise, we consider $\tilde X_n := X_n/\mu$, $n\in\mathbb{N}$. For such a sequence, $\mathbb{E}[\tilde X_{N_u+1}^2] = \frac{1}{\mu^2}\,\mathbb{E}[X_{N_u+1}^2]$. Note that $\frac{1-p}{p}\,G_{r,1}$ has the same distribution as $G_{r,p/(1-p)}$. Applying the homogeneity property of the ideal probability metric of order two, one has
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\,G_{r,1}^{1/\alpha}\big) = \Big(\frac{p}{1-p}\Big)^{2/\alpha}d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\,G_{r,p/(1-p)}^{1/\alpha}\big).$$
The proof of Theorem 1 of [29] starts with establishing, for any bounded Borel function $h$, $r\in(0,1)$, $\alpha\in(0,1]$ and $p\in(0,1)$, that
$$\mathbb{E}\Big[h\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] = \int_0^1\mathbb{E}\Big[h\Big(\frac{1-u}{u}\,Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du,$$
where $Z\sim\mathrm{Exp}(1)$, and
$$\mathbb{E}\big[h(S^*_{r,\alpha,p})\big] = \int_0^1\mathbb{E}\big[h(S^*_{1,1,u})\big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du.$$
Let us examine these relations for each $h\in\mathcal{H}_2$. Recall that, in light of Remark 1, $|h(x)|\le A_0x^2+B_0$ for some positive constants $A_0$ and $B_0$ (which depend on $h$). We write $h = h_+-h_-$, where $h_+(x) := h(x)\,I\{h(x)\ge0\}$ and $h_-(x) := -h(x)\,I\{h(x)\le0\}$. Set $h_n(x) := h_+(x)\,I_{(-n,n]}(x)$, $n\in\mathbb{N}$. Then, the $h_n$, $n\in\mathbb{N}$, are bounded Borel functions such that, for each $x\in\mathbb{R}$, $0\le h_n(x)\nearrow h_+(x)$ as $n\to\infty$. Hence, the monotone convergence theorem yields
$$\mathbb{E}\Big[h_+\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] = \lim_{n\to\infty}\mathbb{E}\Big[h_n\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big].$$
Note that, for each $u\in(0,1)$, $\mathbb{E}\big[h_n\big(\frac{1-u}{u}Z\big)\big]\nearrow\mathbb{E}\big[h_+\big(\frac{1-u}{u}Z\big)\big]$. Applying the monotone convergence theorem once again, we obtain
$$\int_0^1\mathbb{E}\Big[h_+\Big(\frac{1-u}{u}Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du = \lim_{n\to\infty}\int_0^1\mathbb{E}\Big[h_n\Big(\frac{1-u}{u}Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du.$$
So, Equation (57) is valid if, instead of $h$ belonging to $\mathcal{H}_2$, we write $h_+$. Obviously, $0\le h_+(x)\le|h(x)|\le A_0x^2+B_0$, $x\in\mathbb{R}$. Thus,
$$\mathbb{E}\Big[h_+\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] \le A_0\,\mathbb{E}\Big[G_{r,p/(1-p)}^{2/\alpha}\Big]+B_0 < \infty.$$
According to [27] (page 8), for δ > 0 , one has
E ( G r , α , λ * ) δ = Γ ( r + δ α ) λ δ / α Γ ( r ) .
This permits us to write E G r , p / ( 1 p ) 2 / α = E ( G r , 1 , p / ( 1 p ) * ) 2 / α < .
In the same manner, we demonstrate that Equation (57) is valid if instead of h H 2 we take h . Moreover, E h G r , p / ( 1 p ) 1 / α is finite. Therefore, Equation (57) holds for any h H 2 , and for such h, E h G r , p / ( 1 p ) 1 / α is finite.
By the monotone convergence theorem E [ h + ( S r , α , p * ) ] = lim n E [ h n ( S r , α , p * ) ] . In a similar way, E [ h n ( S 1 , 1 , u * ) ] E [ h + ( S 1 , 1 , u * ) ] as n , and applying this theorem once again, we obtain
0 1 E [ h + ( S 1 , 1 , u * ) ] q u ; r , α , p 1 p d u = lim n 0 1 E [ h n ( S 1 , 1 , u * ) ] q u ; r , α , p 1 p d u .
Taking into account that Equation (58) is valid for bounded Borel functions h n , one ascertains that Equation (58) holds if we replace h by h + . To show the latter integral is finite, we note that 0 h + ( x ) | h ( x ) | A 0 x 2 + B 0 , for some positive A 0 , B 0 and all x R . Formula (23) of Lemma 4 yields, for each u ( 0 , 1 ) ,
E ( S 1 , 1 , u * ) 2 1 u u E [ X N u + 1 2 ] + 2 ( 1 u ) 2 u 2 .
It was assumed above that the right-hand side of Equation (56) is finite. So,
0 1 E A 0 1 u u E [ X N u + 1 2 ] + 2 ( 1 u ) 2 u 2 + B 0 q u ; r , α , p 1 p d u < ,
since in light of Equation (57), taking h ( x ) = 1 and h ( x ) = x 2 2 (these functions belong to H 2 ), x R , we obtain, respectively,
0 1 q u ; r , α , p 1 p d u = 1 ,
E [ Z 2 ] 0 1 ( 1 u ) 2 u 2 q u ; r , α , p 1 p d u = E G r , p / ( 1 p ) 2 / α < .
We demonstrate analogously that Equation (58) holds upon replacing h H 2 with h and if the right-hand side of Equation (56) is finite, it follows that
0 1 E h ( S 1 , 1 , u * ) q u ; r , α , p 1 p d u
is finite as well. Consequently, Equation (58) is established for each h H 2 (whenever the right-hand side of Equation (56) is finite) and E h ( S r , α , p * ) is finite for such h. Therefore, for h H 2 and fixed α , r , p , one has
E h ( S r , α , p * ) E h G r , p / ( 1 p ) 1 / α = 0 1 E h ( S 1 , 1 , u * ) E h 1 u u Z q u ; r , α , p 1 p d u = : J ( h ) .
By Theorem 1, for h H 2 , it holds
E h ( S 1 , 1 , u * ) E h 1 u u Z d H 2 S 1 , 1 , u * , 1 u u Z = 1 u u 2 d H 2 u 1 u S 1 , 1 , u * , Z
1 u u 2 u 1 u 1 2 E [ X N u + 1 2 ] = 1 2 1 u u E [ X N u + 1 2 ] ,
where we take into account that N 1 , 1 , u * N B ( 1 , u ) , and N B ( 1 , u ) coincides with G e o m ( u ) . Thus, u 1 u S 1 , 1 , u * can be written as
u 1 u k = 1 N u X k ,
where N u G e o m ( u ) , N u and ( X k ) k N are independent.
Therefore, for each h H 2 , p 1 p 2 / α | J ( h ) | is bounded by the right-hand side of Equation (56), and so the desired upper bound is obtained (recall that μ = 1 ).
Now, we turn to the lower bound of d H 2 ( W r , α , p * , G r , 1 1 / α ) . Take h ( x ) = x 2 / 2 belonging to H 2 . Then, applying Equation (23) to evaluate E S 1 , 1 , u * 2 , one has
d H 2 ( W r , α , p * , G r , 1 1 / α ) 1 2 p 1 p 2 / α 0 1 E S 1 , 1 , u * 2 1 u u 2 E G 1 , 1 2 q u ; r , p 1 p d u = 1 2 p 1 p 2 / α 0 1 1 u u E [ X N u + 1 2 ] q u ; r , p 1 p d u ,
where G 1 , 1 = Z E x p ( 1 ) . Thus, Equation (61) completes the proof. □
Corollary 1.
Let the conditions of Theorem 4 be satisfied and, in addition, $\mu_2 := \sup_{n\in\mathbb{N}} \mathbb{E}[X_n^2] < \infty$. Then, the right-hand side of Equation (56) is finite and
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, G_{r,1}^{1/\alpha}\big) \le \frac{\mu_2}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
The inequality becomes an equality if $\mu_2 = \mathbb{E}[X_n^2]$ for all $n \in \mathbb{N}$. In particular, if $\alpha = 1$, then $\frac{\Gamma(r+1)}{\Gamma(r)} = r$.
Proof. 
According to Equation (57), for $h(x) = x$, $x \in \mathbb{R}$,
$$\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big] = \mathbb{E}[Z] \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du.$$
Thus, since $\mathbb{E}[Z] = 1$, the following relation is valid:
$$\int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big].$$
Due to [27] (see page 8 there), one has $\mathbb{E}\big[G^*_{r,\alpha,\lambda}\big] = \frac{\Gamma(r + \frac{1}{\alpha})}{\lambda^{1/\alpha}\,\Gamma(r)}$. Therefore,
$$\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big] = \mathbb{E}\big[G^*_{r,\alpha,p/(1-p)}\big] = \Big(\frac{1-p}{p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
For $\alpha = 1$, we obtain $\mathbb{E}\big[G_{r,p/(1-p)}\big] = \frac{1-p}{p}\cdot\frac{\Gamma(r+1)}{\Gamma(r)} = \frac{r(1-p)}{p}$. □
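The moment formula just used, together with the bound of Corollary 1, is easy to probe numerically. The following sketch (ours; all parameter values and the helper name `h2_bound` are arbitrary illustrative choices) compares a Monte Carlo estimate of $\mathbb{E}[(G^*_{r,\alpha,\lambda})^{\delta}]$ with the right-hand side of Equation (59) and evaluates the Corollary 1 bound:

```python
import numpy as np
from scipy.special import gamma as Gamma

rng = np.random.default_rng(2)

# Monte Carlo check of Equation (59): E (G*)^delta = Gamma(r + delta/alpha) / (lam^(delta/alpha) Gamma(r)).
r, alpha, lam, delta = 0.6, 0.8, 2.0, 1.0
g_star = rng.gamma(shape=r, scale=1.0 / lam, size=500_000) ** (1.0 / alpha)
print((g_star**delta).mean())
print(Gamma(r + delta / alpha) / (lam ** (delta / alpha) * Gamma(r)))

# Right-hand side of the Corollary 1 bound for illustrative moments mu, mu2.
def h2_bound(r, alpha, p, mu, mu2):
    return mu2 / (2.0 * mu**2) * (p / (1.0 - p)) ** (1.0 / alpha) * Gamma(r + 1.0 / alpha) / Gamma(r)

print(h2_bound(r=0.6, alpha=0.8, p=0.05, mu=1.0, mu2=2.0))  # tends to 0 as p -> 0+
```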

6. Convergence of Random Sums of Exchangeable Summands to Generalized Gamma Distribution

Consider the model of exchangeable random variables $X_1, X_2, \ldots$ described in Section 4. Introduce the distribution of a random variable $U^*_{r,\alpha,\lambda}$ as the following mixture
$$\mathbb{P}_{U^*_{r,\alpha,\lambda}} = \rho\, \mathbb{P}_{V^*_{r,\alpha,\lambda} X_1/\mu} + (1-\rho)\, \mathbb{P}_{Z^*_{r,\alpha,\lambda}},$$
where $\rho \in [0,1]$, $\alpha > 0$, $r > 0$, $\mu := \mathbb{E}[X_1]$, $\mu \ne 0$, the random variables $X_1$ and $V^*_{r,\alpha,\lambda}$ are independent, $V^*_{r,\alpha,\lambda} \sim G^*(r,\alpha,\lambda)$ and $Z^*_{r,\alpha,\lambda} \sim G^*(r,\alpha,\lambda)$. Since $\mathbb{E}\big[G_{r,\lambda}^{2/\alpha}\big] = \frac{\Gamma(r+2/\alpha)}{\lambda^{2/\alpha}\Gamma(r)}$ (see, e.g., page 8 of [27]), one has
$$\mathbb{E}\big(U^*_{r,\alpha,\lambda}\big)^2 = \Big( \rho\, \frac{\mathbb{E}[X_1^2]}{\mu^2} + (1-\rho) \Big) \frac{\Gamma(r+2/\alpha)}{\lambda^{2/\alpha}\,\Gamma(r)}.$$
Due to the properties of generalized gamma distributions, for any positive number $c$,
$$\frac{1}{c^{1/\alpha}}\, U^*_{r,\alpha,\lambda} = \frac{1}{c^{1/\alpha}}\Big( (1-I_\rho)\, V^*_{r,\alpha,\lambda}\, \frac{X_1}{\mu} + I_\rho\, Z^*_{r,\alpha,\lambda} \Big) = (1-I_\rho)\, V^*_{r,\alpha,c\lambda}\, \frac{X_1}{\mu} + I_\rho\, Z^*_{r,\alpha,c\lambda} = U^*_{r,\alpha,c\lambda},$$
where the indicator $I_\rho$ equals 1 and 0 with probabilities $1-\rho$ and $\rho$, respectively, and is independent of all the variables under consideration (the equalities here are understood in distribution). Note that $U^*_{1,1,1}$ has the same distribution as a random variable $Y$ having the law defined in Equation (42). Recall that the generalized negative binomial distribution $\mathrm{GNB}(r,\alpha,p)$ is the law of a random variable $N^*_{r,\alpha,p}$, see Equation (54). We will use the following result.
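As a numerical illustration of the mixture (63), the sketch below (ours; the choice $X_1 \sim \mathrm{Exp}(1)$, so that $\mu = 1$ and $\mathbb{E}[X_1^2] = 2$, and all other parameters are illustrative assumptions) samples $U^*_{r,\alpha,\lambda}$ and compares the empirical second moment with Equation (64):

```python
import numpy as np
from scipy.special import gamma as Gamma

rng = np.random.default_rng(3)

# Sample the mixture law (63) and check the second moment formula (64).
r, alpha, lam, rho, n = 0.6, 0.9, 1.5, 0.4, 400_000
x1 = rng.exponential(1.0, size=n)                           # X_1 with mu = 1, E X_1^2 = 2
v_star = rng.gamma(r, 1.0 / lam, size=n) ** (1.0 / alpha)   # V* ~ G*(r, alpha, lam)
z_star = rng.gamma(r, 1.0 / lam, size=n) ** (1.0 / alpha)   # Z* ~ G*(r, alpha, lam)
u_star = np.where(rng.random(n) < rho, v_star * x1, z_star)

target = (rho * 2.0 + 1.0 - rho) * Gamma(r + 2.0 / alpha) / (lam ** (2.0 / alpha) * Gamma(r))
print((u_star**2).mean(), target)   # the two values should be close
```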
Lemma 5.
If $r > 0$, $\alpha \ne 0$ and $p \in (0,1)$, then for $N^*_{r,\alpha,p} \sim \mathrm{GNB}(r,\alpha,p)$ one has
$$\mathbb{E}\, N^*_{r,\alpha,p} = \mathbb{E}\, G^*_{r,\alpha,p/(1-p)}, \qquad \mathbb{E}\big[N^*_{r,\alpha,p}\big(N^*_{r,\alpha,p} - 1\big)\big] = \mathbb{E}\big(G^*_{r,\alpha,p/(1-p)}\big)^2.$$
Proof. 
According to Equation (54), for each $n \in \mathbb{N}$,
$$\sum_{k=1}^{n} k\, \mathbb{P}(N^*_{r,\alpha,p} = k) = \int_0^{\infty} z \sum_{k=1}^{n} \frac{z^{k-1}}{(k-1)!}\, e^{-z}\, g^*\Big(z; r, \alpha, \frac{p}{1-p}\Big)\, dz,$$
$$\sum_{k=2}^{n} k(k-1)\, \mathbb{P}(N^*_{r,\alpha,p} = k) = \int_0^{\infty} z^2 \sum_{k=2}^{n} \frac{z^{k-2}}{(k-2)!}\, e^{-z}\, g^*\Big(z; r, \alpha, \frac{p}{1-p}\Big)\, dz.$$
The desired statement follows from the monotone convergence theorem for the Lebesgue integral by letting $n \to \infty$. □
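Lemma 5 admits a quick Monte Carlo check (ours; the parameter values are illustrative): sample $N^*_{r,\alpha,p}$ through its mixed Poisson representation (54) and compare the factorial moments with the moments of $G^*_{r,\alpha,p/(1-p)}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo check of Lemma 5: factorial moments of GNB(r, alpha, p) against
# the moments of G*(r, alpha, p/(1-p)).
r, alpha, p, n = 0.7, 0.8, 0.3, 400_000
G = rng.gamma(shape=r, scale=(1.0 - p) / p, size=n) ** (1.0 / alpha)  # G*(r, alpha, p/(1-p))
N = rng.poisson(G)                                  # N* ~ GNB(r, alpha, p), Equation (54)
print(N.mean(), G.mean())                           # E N*          vs  E G*
print((N * (N - 1.0)).mean(), (G**2).mean())        # E N*(N* - 1)  vs  E (G*)^2
```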
Theorem 5.
Let $X_1, X_2, \ldots$ be exchangeable random variables, introduced in Section 4, such that $\mathbb{E}[X_1] = \mu$ and $\mathbb{E}[X_1^2] = \mu_2 < \infty$. Assume that Equation (41) holds for some $\rho \in (0,1)$. Suppose that $(X_n)_{n\in\mathbb{N}}$ and $N^*_{r,\alpha,p}$ are independent, where $N^*_{r,\alpha,p} \sim \mathrm{GNB}(r,\alpha,p)$. Then, for $W^*_{r,\alpha,p}$ defined in Equation (55) with parameters $r \in (0,1)$, $\alpha \in (0,1]$, $p \in (0,1)$ and $U^*_{r,\alpha,1}$ given in Equation (63), one has
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) = \frac{\mu_2}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
Proof. 
Without loss of generality, we can assume that $\mu = 1$; otherwise, we consider $\widetilde{X}_n := X_n/\mu$, $n \in \mathbb{N}$. For such a sequence, $\widetilde{\mu}_2 = \mathbb{E}[\widetilde{X}_1^2] = \mu_2/\mu^2$. Note that Equation (58) remains true for dependent summands (see Theorem 1 of [29]). Furthermore, for bounded $h(t)$, $t \in \mathbb{R}$, the function $h_x(t) := h(xt)$ is also bounded for any $x \in \mathbb{R}$. Thus, an employment of Equation (63) gives
$$\mathbb{E}\, h(U^*_{r,\alpha,\lambda}) = \rho \int_{\mathbb{R}} \mathbb{E}\, h_x\big(G_{r,\lambda}^{1/\alpha}\big)\, dF_{X_1}(x) + (1-\rho)\, \mathbb{E}\, h\big(G_{r,\lambda}^{1/\alpha}\big).$$
Now we apply Equation (57) with bounded $h_x$ and, by Fubini's theorem, obtain
$$\int_{\mathbb{R}} \mathbb{E}\, h_x\big(G_{r,\lambda}^{1/\alpha}\big)\, dF_{X_1}(x) = \int_{\mathbb{R}} \int_0^1 \mathbb{E}\, h_x\Big(\frac{1-u}{u}\, V^*\Big)\, q(u; r, \alpha, \lambda)\, du\, dF_{X_1}(x) = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, X_1 V^*\Big)\, q(u; r, \alpha, \lambda)\, du,$$
where $X_1$ and $V^*$ are independent and $V^* \sim \mathrm{Exp}(1)$. Apply Equation (57) to the second summand of Equation (68). Then, Equation (69) yields
$$\mathbb{E}\, h(U^*_{r,\alpha,\lambda}) = \rho \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, X_1 V^*\Big)\, q(u; r, \alpha, \lambda)\, du + (1-\rho) \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, Z^*\Big)\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du,$$
where $Z^* \sim \mathrm{Exp}(1)$ and $U^*_{1,1,1}$ has the same distribution as $Y$, see Equation (42).
Recall that, for $h \in \mathcal{H}_2$, the inequality $|h(x)| \le A_0 x^2 + B_0$ holds for all $x \in \mathbb{R}$ and some positive constants $A_0$, $B_0$ (see Remark 1). Moreover, $\mathbb{E}\big(U^*_{r,\alpha,\lambda}\big)^2 < \infty$ according to Equation (64). So, employing the bounded functions $h_n(x) = h(x)\mathbb{I}_{(-n,n]}(x)$ tending to $h(x)$, $h \in \mathcal{H}_2$, as $n \to \infty$, one can invoke the Lebesgue dominated convergence theorem to claim that $\lim_{n\to\infty}\mathbb{E}\, h_n(U^*_{r,\alpha,\lambda}) = \mathbb{E}\, h(U^*_{r,\alpha,\lambda})$. We take into account that
$$\int_0^1 \Big| \mathbb{E}\, h_n\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big|\, q(u; r, \alpha, \lambda)\, du \le A_0\, \mathbb{E}\big(U^*_{1,1,1}\big)^2 \int_0^1 \Big(\frac{1-u}{u}\Big)^2 q(u; r, \alpha, \lambda)\, du + B_0.$$
The integral on the right-hand side of the latter formula is finite by Equation (60), and $\mathbb{E}\big(U^*_{1,1,1}\big)^2 < \infty$ in accordance with Equation (64). Thus, it is possible to apply the Lebesgue dominated convergence theorem to obtain
$$\lim_{n\to\infty} \int_0^1 \mathbb{E}\, h_n\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du$$
for any $h \in \mathcal{H}_2$. So, Equation (70) holds for all $h \in \mathcal{H}_2$.
In a similar way, one shows that $\lim_{n\to\infty}\mathbb{E}[h_n(S^*_{r,\alpha,p})] = \mathbb{E}\, h(S^*_{r,\alpha,p})$ for $h \in \mathcal{H}_2$. According to the Cauchy–Bunyakovsky–Schwarz inequality, for the identically distributed variables $X_1, X_2, \ldots$, we have $|\mathbb{E}[X_i X_j]| \le \mu_2$ for $i, j \in \mathbb{N}$, and consequently
$$\mathbb{E}\big(S^*_{r,\alpha,p}\big)^2 = \sum_{k=0}^{\infty} \mathbb{P}(N^*_{r,\alpha,p} = k)\, \mathbb{E}\Big(\sum_{j=1}^{k} X_j\Big)^2 \le \mu_2 \sum_{k=0}^{\infty} \mathbb{P}(N^*_{r,\alpha,p} = k)\, k^2 = \mu_2\, \mathbb{E}\big(N^*_{r,\alpha,p}\big)^2.$$
Equations (59) and (66) entail that $\mathbb{E}\big(N^*_{r,\alpha,p}\big)^2 < \infty$. Thus, the dominated convergence theorem guarantees that $\lim_{n\to\infty}\mathbb{E}[h_n(S^*_{r,\alpha,p})] = \mathbb{E}\, h(S^*_{r,\alpha,p})$. Furthermore, one can demonstrate that, for each $h \in \mathcal{H}_2$,
$$\lim_{n\to\infty}\int_0^1 \mathbb{E}\, h_n(S^*_{1,1,u})\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h(S^*_{1,1,u})\, q(u; r, \alpha, \lambda)\, du.$$
For this purpose, we note that Equation (71) implies
$$\int_0^1 \big| \mathbb{E}\, h_n(S^*_{1,1,u}) \big|\, q(u; r, \alpha, \lambda)\, du \le C + A\,\mu_2 \int_0^1 \mathbb{E}\big(N^*_{1,1,u}\big)^2\, q(u; r, \alpha, \lambda)\, du.$$
According to Equation (66), one has
$$\int_0^1 \mathbb{E}\big(N^*_{1,1,u}\big)^2\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \Big( \mathbb{E}\big(G^*_{1,1,u/(1-u)}\big)^2 + \mathbb{E}\, G^*_{1,1,u/(1-u)} \Big)\, q(u; r, \alpha, \lambda)\, du.$$
The latter integral is finite because one can take $h(x) = x$ and $h(x) = x^2/2$ in Equation (57) and invoke Equation (59). Then, it is possible to use the dominated convergence theorem once again to establish Equation (72).
Now, combining Equations (58) and (70) leads, for any $h \in \mathcal{H}_2$, to the relation
$$\mathbb{E}\, h(S^*_{r,\alpha,p}) - \mathbb{E}\, h\big(U^*_{r,\alpha,p/(1-p)}\big) = \int_0^1 \Big( \mathbb{E}\, h(S^*_{1,1,u}) - \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big)\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du.$$
Note that the random variable $N^*_{1,1,u}$ follows the geometric distribution $\mathrm{Geom}(u)$ with parameter $u \in (0,1)$. For each $h \in \mathcal{H}_2$ and any $u \in (0,1)$, by Theorem 3 and in view of the $d_{\mathcal{H}_2}$ homogeneity, we obtain
$$\Big| \mathbb{E}\, h(S^*_{1,1,u}) - \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big| \le d_{\mathcal{H}_2}\Big(S^*_{1,1,u},\, \frac{1-u}{u}\, U^*_{1,1,1}\Big) = \Big(\frac{1-u}{u}\Big)^2 d_{\mathcal{H}_2}(W_u, Y) \le \Big(\frac{1-u}{u}\Big)^2 \frac{u}{1-u}\cdot\frac{\mu_2}{2} = \frac{1-u}{u}\cdot\frac{\mu_2}{2}.$$
Employing Equations (73), (74) and (62), one deduces
$$d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\, U^*_{r,\alpha,p/(1-p)}\big) \le \frac{\mu_2}{2} \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\,\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big].$$
Equation (65) implies, by virtue of the $d_{\mathcal{H}_2}$ homogeneity, that
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) = \Big(\frac{p}{1-p}\Big)^{2/\alpha} d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\, U^*_{r,\alpha,p/(1-p)}\big).$$
Combining Equations (59), (75) and (76), we conclude that the right-hand side of Equation (67) is an upper bound for $d_{\mathcal{H}_2}(W^*_{r,\alpha,p}, U^*_{r,\alpha,1})$.
Choosing $h(x) = x^2/2$ in Equation (73), upon employing Equations (52) and (62), one infers
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) \ge \frac{1}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha} \int_0^1 \Big| \mathbb{E}\big(S^*_{1,1,u}\big)^2 - \Big(\frac{1-u}{u}\Big)^2 \mathbb{E}\big(U^*_{1,1,1}\big)^2 \Big|\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha} \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha}\,\mathbb{E}\big[G^*_{r,\alpha,p/(1-p)}\big].$$
Using Equation (59) once again, we see that the right-hand side of Equation (67) is a lower bound for $d_{\mathcal{H}_2}(W^*_{r,\alpha,p}, U^*_{r,\alpha,1})$. □

7. Inverse to Equilibrium Transformation

The development of Stein's method is closely connected with various transformations of distributions. Let $W$ be a non-negative random variable with $0 < \mu = \mathbb{E}[W] < \infty$. Then, one says that a random variable $W^s$ has the $W$-size biased distribution if, for all $f$ such that $\mathbb{E}[Wf(W)]$ exists,
$$\mathbb{E}[Wf(W)] = \mu\, \mathbb{E}[f(W^s)].$$
The connection of this transformation with Stein's equation was considered in [50,51]. It was pointed out in [51] that this transformation works well for combinatorial problems, such as counting the number of vertices in a random graph having prespecified degrees, see also [52]. In [53], another transformation was introduced. Namely, if a random variable $W$ has mean zero and variance $\sigma^2 \in (0,\infty)$, then the authors of [53] write (Definition 1.1) that a variable $W^*$ has the $W$-zero biased distribution whenever, for all differentiable $f$ such that $\mathbb{E}[Wf(W)]$ exists, the following relation holds
$$\mathbb{E}[Wf(W)] = \sigma^2\, \mathbb{E}[f'(W^*)].$$
This definition is inspired by the equation $\mathbb{E}[Wf(W)] = \sigma^2\, \mathbb{E}[f'(W)]$ characterizing the normal law $N(0, \sigma^2)$. The authors of [53] explain that $W^*$ always exists if $\mathbb{E}[W] = 0$ and $\mathrm{var}\, W \in (0,\infty)$. Zero-bias coupling for products of normal random variables is treated in [54]. In Sec. 2 of [30], it is demonstrated that the gamma distribution is uniquely characterised by the property that its size-biased distribution is the same as its zero-biased distribution. Two generalizations of zero biasing were proposed in [55], see p. 104 of that paper for a discussion of these transformations. We refer also to the survey [56].
Now, we turn to the equilibrium distribution transformation introduced in [33] and concentrate on the approximation of the law under consideration by an exponential law, see the corresponding Definition 1 in Section 2.
According to the second part of Theorem 2.1 of [33] (in our notation), for $Z \sim \mathrm{Exp}(1)$ and a non-negative random variable $X$ with $\mathbb{E}[X] = 1$ and $\mathbb{E}[X^2] < \infty$, the following estimate holds
$$d_{\mathcal{H}_1}(X, Z) \le 2\, \mathbb{E}|X^e - X|,$$
and at the same time
$$d_{\mathcal{H}_1}(X^e, Z) \le \mathbb{E}|X^e - X|.$$
The authors of [33] also proved that $d_K(X^e, Z) \le \mathbb{E}|X^e - X|$. Notice that the estimate for $d_{\mathcal{H}_1}(X^e, Z)$ is more precise than that for $d_{\mathcal{H}_1}(X, Z)$.
Now we turn to Equation (77) and demonstrate how to find the distribution of $X$ when we know the distribution of $X^e$. In other words, we concentrate on the inverse of the equilibrium distribution transformation.
Assume that $\mathbb{E}[X] > 0$. Recall that a random variable $X^e$ exists if $F_e(x)$ appearing in Equation (16) is a distribution function. For $\mathbb{E}[X] > 0$, the latter statement is equivalent to the nonnegativity of $X$. Indeed, for non-negative $X$, $F_e(x)$ coincides with a distribution function having the density (15). If $F_e(x)$ is a distribution function and $\mathbb{E}[X] > 0$ in Equation (16), then $F_e(x) \ge 0$ for $x < 0$ only if $F(x) = 0$ for $x < 0$.
Thus, a random variable $X^e$ has a (version of) density $p_e(x)$ introduced in Equation (15). Obviously, the function $p_e(x)$ has the following properties: it is nonincreasing on $[0,\infty)$ and $p_e(x) = 0$ for $x < 0$. This density is right-continuous on $[0,\infty)$, and consequently $p_e(0) < \infty$. Now, we are able to provide a full description of the class of densities of random variables $X^e$ corresponding to all non-negative $X$ with positive mean.
Lemma 6.
Let a non-negative random variable $X^e$ have a version of density (with respect to the Lebesgue measure) $p_e(x)$, $x \in \mathbb{R}$, such that this function is nonincreasing on $[0,\infty)$, $p_e(x) = 0$ for $x < 0$, and there exists a finite $\lim_{x\to 0+} p_e(x)$. Then, there exists a unique preimage of the $X^e$ distribution having a distribution function $F$ continuous at $x = 0$. Namely,
$$F(x) = \begin{cases} 1 - \dfrac{p_e(x)}{p_e(0)}, & x \ge 0, \\[4pt] 0, & x < 0. \end{cases}$$
Proof. 
First of all, note that $p_e(0) > 0$, as otherwise $p_e(x) = 0$ for all $x \in \mathbb{R}$ ($p_e$ is a nonincreasing function on $[0,\infty)$). We also know that there exist a left-sided limit and a right-sided limit of $p_e$ at each point $x \in (0,\infty)$, as well as the right-sided limit of $p_e$ at $x = 0$. The set of discontinuity points of $p_e$ is at most countable, and we can take a version which is right-continuous at each point of $[0,\infty)$. Then, Equation (78) introduces a distribution function. Consider a random variable $X$ with distribution function $F$ and check the validity of Equation (14).
The integration by parts formula yields, for any $b > 0$,
$$1 \ge \int_0^b p_e(x)\, dx = b\, p_e(b) + p_e(0) \int_0^b x\, dF(x).$$
The summands on the right-hand side of Equation (79) are non-negative. Therefore, for any $b > 0$, $\mathbb{E}[X\mathbb{I}(X \le b)] \le 1/p_e(0)$. Hence, the monotone convergence theorem implies that $\mathbb{E}[X]$ is finite. According to Equation (78),
$$b\, p_e(b)/p_e(0) = b\big(1 - F(b)\big) = b\, \mathbb{P}(X > b) \to 0, \quad b \to \infty,$$
since $\mathbb{E}[X] < \infty$. Taking the limit as $b \to \infty$ in Equation (79), one obtains $1 = p_e(0)\, \mathbb{E}[X]$. Now, we are ready to verify Equation (14). For any Lipschitz function $f$, $\mathbb{E}[f(X)]$ is finite and
$$\mathbb{E}[f(X)] = \int_0^{\infty} f(x)\, dF(x) = -\frac{1}{p_e(0)} \int_0^{\infty} f(x)\, dp_e(x).$$
Taking into account Equation (80), we infer that $f(b)\, p_e(b) \to 0$ as $b \to \infty$. Consequently, applying integration by parts once again ($f$ has bounded variation), we obtain
$$\mathbb{E}[X]\, \mathbb{E}[f'(X^e)] = \frac{1}{p_e(0)} \int_0^{\infty} f'(x)\, p_e(x)\, dx = \frac{1}{p_e(0)} \int_0^{\infty} p_e(x)\, df(x) = -\frac{1}{p_e(0)} \Big( f(0)\, p_e(0) + \int_0^{\infty} f(x)\, dp_e(x) \Big) = \mathbb{E}[f(X)] - f(0).$$
The uniqueness of the $X$ distribution corresponding to $X^e$ is a consequence of Equation (15) and the continuity of $F(x)$ at $x = 0$. Indeed, assume that for $X_1$ and $X_2$ one has $X_1^e \stackrel{d}{=} X_2^e$. Then, Equation (15) yields that, for almost all $x \ge 0$,
$$\frac{1}{\mathbb{E}[X_1]}\, \mathbb{P}(X_1 > x) = \frac{1}{\mathbb{E}[X_2]}\, \mathbb{P}(X_2 > x),$$
and therefore $\mathbb{P}(X_1 > x) = c\, \mathbb{P}(X_2 > x)$, where $c$ is a positive constant (the equilibrium distribution in Definition 1 is introduced for random variables with positive expectation only). Since $\mathbb{P}(X_1 = 0) = \mathbb{P}(X_2 = 0) = 0$, one has $\mathbb{P}(X_1 > 0) = \mathbb{P}(X_2 > 0)$. Let $x_n \to 0+$, $n \to \infty$, where the points $x_n$ belong to the set considered in Equation (81), to ensure that $c = 1$. Thus, the distributions of $X_1$ and $X_2$ coincide. □
Remark 6.
Let $X_p$ be the Bernoulli random variable taking values 1 and 0 with probabilities $p$ and $1-p$, respectively. Then, it is easily seen that the distribution of $X_p^e$ is uniform on $[0,1]$. Thus, in contrast to Lemma 6, without the assumption of continuity of $F$ at the point $x = 0$, one cannot guarantee, in general, the uniqueness of the preimage under the transformation inverse to the equilibrium one.
In the proof of Lemma 6, we found that $\mathbb{E}[X] = 1/p_e(0)$. Set $\lambda = p_e(0)$ and $Z \sim \mathrm{Exp}(\lambda)$. Then, $\mathbb{E}[X] = \mathbb{E}[Z]$. From now on, we suppose that this choice of $\lambda$ is made.
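Equation (78) translates directly into code. The following sketch (ours; the helper name is hypothetical) implements the inverse map $p_e \mapsto F$ and verifies that the exponential law is a fixed point of the equilibrium transformation. For the uniform density of Remark 6, Equation (78) returns the point mass at 1, while Remark 6 exhibits a second, Bernoulli preimage whose distribution function is discontinuous at zero — illustrating the role of the continuity assumption.

```python
import numpy as np

def inverse_equilibrium_cdf(p_e, x):
    # Lemma 6: F(x) = 1 - p_e(x) / p_e(0) for x >= 0, and F(x) = 0 for x < 0.
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0.0, 1.0 - p_e(np.maximum(x, 0.0)) / p_e(0.0), 0.0)

# Exp(lam) is a fixed point: p_e(x) = lam * exp(-lam x) yields F(x) = 1 - exp(-lam x).
lam = 2.0
p_e_exp = lambda x: lam * np.exp(-lam * x)
xs = np.linspace(0.0, 3.0, 7)
print(np.allclose(inverse_equilibrium_cdf(p_e_exp, xs), 1.0 - np.exp(-lam * xs)))  # True
```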
Recall that random variables $U$ and $V$ are stochastically ordered if either $\mathbb{P}(U \le x) \le \mathbb{P}(V \le x)$ for every $x \in \mathbb{R}$, or the opposite inequality holds (for all $x \in \mathbb{R}$). Now, we clarify one of the statements of Theorem 2.1 of [33] (see also Theorem 3 of [22], where a result similar to Theorem 2.1 of [33] is formulated employing generalized distributions).
Theorem 6.
Let a random variable $X^e$ satisfy the conditions of Lemma 6 with $\mathbb{E}[X^e] < \infty$, and let $X$ be the preimage under the equilibrium transformation. Then, Equation (77) holds. Moreover, the inequality becomes an equality when $X$ and $X^e$ are stochastically ordered.
Proof. 
Apply the Stein Equation (10) along with the equilibrium transformation (14). Then, in light of $\mathbb{E}[X] = \frac{1}{\lambda}$ and $\mathbb{E}[f_h(X)] - f_h(0) = \frac{1}{\lambda}\,\mathbb{E}[f_h'(X^e)]$, we can write
$$\big| \mathbb{E}[h(X^e)] - \mathbb{E}[h(Z)] \big| = \big| \mathbb{E}\big[ f_h'(X^e) - \lambda f_h(X^e) + \lambda f_h(0) \big] \big| = \lambda\, \big| \mathbb{E}\big[ f_h(X) - f_h(X^e) \big] \big| \le \lambda\, \|f_h'\|\, \mathbb{E}|X^e - X| \le \|h'\|\, \mathbb{E}|X^e - X|.$$
The last inequality in (82) is true due to Lemma 2. Now, we demonstrate that equality in (82) can be attained. Taking $h(x) = x - \frac{1}{\lambda}$, we have the solution $f_h(x) = -\frac{x}{\lambda}$ of Equation (12). Then,
$$\mathbb{E}[h(X^e)] - \mathbb{E}[h(Z)] = \lambda\, \mathbb{E}\big[ f_h(X) - f_h(X^e) \big] = \mathbb{E}(X^e - X).$$
Employing the integration by parts formula, one can show that the expression on the right-hand side of the last equality equals the Kantorovich distance between $X$ and $X^e$ when these variables are stochastically ordered. Note that $x\big(1 - F(x)\big) \to 0$ and $x\big(1 - F_e(x)\big) \to 0$ as $x \to \infty$, and $xF(x) \to 0$, $xF_e(x) \to 0$ as $x \to -\infty$, because $\mathbb{E}[X]$ and $\mathbb{E}[X^e]$ are finite. Thus,
$$\mathbb{E}[X^e] - \mathbb{E}[X] = \int_{\mathbb{R}} x\, d\big( F_{X^e}(x) - F_X(x) \big) = \int_{\mathbb{R}} \big( F_X(x) - F_{X^e}(x) \big)\, dx = \int_{\mathbb{R}} \big| F_{X^e}(x) - F_X(x) \big|\, dx,$$
since $F_{X^e}(x) \le F_X(x)$ (or $\ge$) for all $x \in \mathbb{R}$. It is well known that the Kantorovich distance is the minimal one for the metric $\tau(U,V) = \mathbb{E}|U - V|$ (see, e.g., [9], Ch. 1, §1.3). Therefore,
$$\int_{\mathbb{R}} \big| F_{X^e}(x) - F_X(x) \big|\, dx = \inf \mathbb{E}|U - V|,$$
where the infimum is taken over all joint laws of $(U,V)$ such that $\mathbb{P}_U = \mathbb{P}_{X^e}$ and $\mathbb{P}_V = \mathbb{P}_X$ (see also Remark 2 and [10], Corollary 5.3.2). Consequently, in the framework of Theorem 6, $\mathbb{E}[X^e] - \mathbb{E}[X] = \mathbb{E}|X^e - X|$. □
Remark 7.
One can show, by means of Lemma 2 and Equation (82), that it is possible to obtain the estimate
$$d_K(X^e, Z) \le \lambda\, \mathbb{E}|X^e - X|.$$
For each function $h$ belonging to $\mathcal{K}$, in a similar way to Equation (82), one can apply Equation (10) together with the equilibrium transformation. Now, it is sufficient to study the Stein equation with a right derivative. Formula (13) gives a solution of the Stein equation according to Lemma 2. Note that, for $f_h$, the right derivative coincides almost everywhere with the derivative, and the law of $X^e$ is absolutely continuous according to Equation (15). Thus, for the Lipschitz function $f_h$ (see Lemma 2), one can use the equilibrium transformation.
Example 1.
Consider the distribution functions $F_\varepsilon(x)$ of random variables $X_\varepsilon$ taking the values $\varepsilon$ and $2-\varepsilon$ with probabilities $1/2$, $0 < \varepsilon < 1$. Formula (15) yields that $X_\varepsilon^e$ has the following piecewise-linear distribution function:
$$F_\varepsilon^e(x) = \begin{cases} 0, & x < 0, \\ x, & 0 \le x < \varepsilon, \\ x/2 + \varepsilon/2, & \varepsilon \le x < 2-\varepsilon, \\ 1, & 2-\varepsilon \le x. \end{cases}$$
If $\varepsilon \ge 1/2$ then, for all $x \in \mathbb{R}$, the inequality $F_\varepsilon^e(x) \ge F_\varepsilon(x)$ holds, i.e., $X_\varepsilon$ and $X_\varepsilon^e$ are stochastically ordered. For $\varepsilon < 1/2$, the inequality is violated in the right neighborhood of the point $\varepsilon$. Thus, besides the stochastically ordered pairs $(X, X^e)$, there are also pairs of a different kind.
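The ordering claim can be verified on a dense grid. In this sketch (ours; the grid and the two test values of $\varepsilon$ are arbitrary), `F_eps` and `F_eps_e` encode $F_\varepsilon$ and $F_\varepsilon^e$:

```python
import numpy as np

def F_eps(x, eps):    # two-point law at eps and 2 - eps, each with probability 1/2
    return np.where(x < eps, 0.0, np.where(x < 2 - eps, 0.5, 1.0))

def F_eps_e(x, eps):  # its equilibrium transform from Example 1
    return np.where(x < 0, 0.0,
           np.where(x < eps, x,
           np.where(x < 2 - eps, x / 2 + eps / 2, 1.0)))

xs = np.linspace(-0.5, 2.5, 3001)
for eps in (0.6, 0.3):
    ordered = np.all(F_eps_e(xs, eps) >= F_eps(xs, eps))
    print(eps, ordered)   # True for eps >= 1/2, False for eps < 1/2
```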
Now, we turn to another example of stochastically ordered X and X e .
Example 2.
Take $X^e$ having the Pareto distribution. The notation $X^e \sim \mathrm{Pareto}(\alpha, \beta)$ means that $X^e$ has the density $f_e(x) = \alpha\beta^{\alpha}(x+\beta)^{-(\alpha+1)}$, $x \ge 0$, and the corresponding distribution function $F_e(x) = 1 - \big(\frac{\beta}{x+\beta}\big)^{\alpha}$, where $x \ge 0$, $\alpha > 0$, $\beta > 0$.
Further, we consider only $\alpha > 1$, since in this case there exists finite $\mathbb{E}[X^e] = \frac{\beta}{\alpha - 1}$. By means of Lemma 6, we obtain the distribution of the preimage under the equilibrium transformation:
$$F(x) = 1 - \frac{f_e(x)}{f_e(0)} = 1 - \frac{\alpha\beta^{\alpha}}{(x+\beta)^{\alpha+1}} \cdot \frac{\beta^{\alpha+1}}{\alpha\beta^{\alpha}} = 1 - \Big(\frac{\beta}{x+\beta}\Big)^{\alpha+1}, \quad x \ge 0.$$
Thus, one can state that $X \sim \mathrm{Pareto}(\alpha+1, \beta)$. It is not difficult to see that $F_e(x) \le F(x)$ for $x \in \mathbb{R}$, i.e., the random variables $X^e$ and $X$ are stochastically ordered. Due to Theorem 6, one has
$$d_{\mathcal{H}_1}(X^e, Z) = \mathbb{E}|X^e - X| = \mathbb{E}[X^e] - \mathbb{E}[X] = \frac{\beta}{\alpha-1} - \frac{\beta}{\alpha} = \frac{\beta}{\alpha(\alpha-1)},$$
$$d_K(X^e, Z) \le \frac{\alpha}{\beta}\, \mathbb{E}|X^e - X| = \frac{1}{\alpha - 1}.$$
In this way, we find a bound for the Kolmogorov distance between the distributions $\mathrm{Pareto}(\alpha, \beta)$ and $\mathrm{Exp}(\alpha/\beta)$. This relation demonstrates the convergence rate of $d_{\mathcal{H}_1}(X^e, Z)$ to zero as $\alpha \to \infty$. The estimate is nontrivial for $\alpha > 2$.
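The bound $d_K \le 1/(\alpha-1)$ can be confronted with a direct numerical evaluation of the Kolmogorov distance. The sketch below (ours; the grid resolution and parameter values are arbitrary) computes $\sup_x |F_{\mathrm{Pareto}(\alpha,\beta)}(x) - F_{\mathrm{Exp}(\alpha/\beta)}(x)|$ and compares it with $1/(\alpha-1)$:

```python
import numpy as np

# Direct numerical check of d_K(Pareto(a, b), Exp(a / b)) <= 1 / (a - 1).
def kolmogorov_pareto_exp(a, b, grid=200_000):
    x = np.linspace(0.0, 60.0 * b / a, grid)     # Exp(a/b) mass is concentrated near b/a
    F_par = 1.0 - (b / (x + b)) ** a
    F_exp = 1.0 - np.exp(-(a / b) * x)
    return np.max(np.abs(F_par - F_exp))

for a in (3.0, 5.0, 10.0):
    print(a, kolmogorov_pareto_exp(a, b=1.0), 1.0 / (a - 1.0))
```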
Remark 8.
It is interesting that estimation of the proximity of the Pareto law to the exponential one has become important in signal processing, see [34] and references therein. Let $X \sim \mathrm{Pareto}(\alpha, \beta)$, where $\alpha > 0$, $\beta > 0$, and $Z \sim \mathrm{Exp}(\lambda)$. In [34], the author indicates that the Pinsker–Csiszár inequality was employed to derive
$$d_K(X, Z) \le \sqrt{2\, D_{KL}(X \,\|\, Z)},$$
where $D_{KL}(X \,\|\, Z)$ is the Kullback–Leibler divergence between the laws of $X$ and $Z$. More precisely, on the left-hand side of Equation (85), one can write the total variation distance $d_{TV}(X, Z)$ between the distributions of $X$ and $Z$. Clearly, $d_K(X, Z) \le d_{TV}(X, Z)$. By evaluating $D_{KL}(X \,\|\, Z)$ and performing an optimal choice of the parameter $\lambda$, it was demonstrated (formula (19) in [34]) that, for $\alpha > 1$ and any $\beta > 0$,
$$d_K(X, Z) \le \sqrt{\frac{2}{\alpha(\alpha - 1)}}$$
if $\lambda = \frac{\alpha - 1}{\beta}$. The author of [34] writes on page 8 that, in his previous work [57], the inequality
$$d_K(X, Z) \le \frac{3}{\alpha}$$
was established with the same choice of $\lambda$. Next, he also writes that "in the most cases $\alpha > 2$" and notes that the estimate in Equation (86) involving the Kullback–Leibler divergence is more precise for $\alpha > \frac{9}{7}$ than the estimate in Equation (87) obtained by the Stein method. Moreover, on page 4 of [34] we read: "The problem with the Stein approach is that the bounds do not suggest a suitable way in which, for a given Pareto model, an appropriate approximating Exponential distribution can be specified". However, we have demonstrated that the application of the inverse equilibrium transformation together with the Stein method permits indicating, whenever $\alpha > 2$, the corresponding exponential distribution with proximity closer than the right-hand sides of Equations (86) and (87) can provide.
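For a quick comparison, one can tabulate the three estimates as functions of $\alpha$ (a sketch of ours; the right-hand sides of Equations (86) and (87) below are transcribed from the displays above as reconstructed). For $\alpha > 2$, the bound $1/(\alpha-1)$ of Example 2 is the smallest, in agreement with the closing claim of this remark:

```python
import numpy as np

# Tabulate the bound 1/(alpha - 1) from Example 2 against Equations (86) and (87).
alphas = np.array([1.5, 2.0, 2.5, 3.0, 5.0, 10.0])
bound_example2 = 1.0 / (alphas - 1.0)
bound_eq86 = np.sqrt(2.0 / (alphas * (alphas - 1.0)))
bound_eq87 = 3.0 / alphas
for a, b1, b2, b3 in zip(alphas, bound_example2, bound_eq86, bound_eq87):
    print(f"alpha={a:5.2f}  1/(alpha-1)={b1:.4f}  Eq.(86)={b2:.4f}  Eq.(87)={b3:.4f}")
# At alpha = 2 the first two coincide; for alpha > 2 the bound 1/(alpha-1) wins.
```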

8. Conclusions

Our principal goal was to find sharp estimates of the proximity of random sums distributions to exponential and more general laws. This goal is achieved when we employ the probability metric $d_{\mathcal{H}_2}$. Thus, it would be valuable to find the best possible approximations of random sums distributions by means of specified laws using the metrics $\zeta_s$ of order $s > 0$. The results of [32] provide the basis for this approach.
There are various complementary refinements of the Rényi theorem. One approach is related to the employment of Brownian motion. It is interesting that in [58] (p. 1071) the authors proposed an explanation of the Rényi theorem involving the embedding theorem. We provide a slightly different complete proof. Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean $\mu := \mathbb{E} X_1$ and $\sigma^2 := \mathrm{var}\, X_1 < \infty$, and let $S_n$, $n \in \mathbb{N}$, denote the corresponding partial sums. According to Theorem 12.6 of [59], which is due to A.V. Skorokhod and V. Strassen, there exists a standard Brownian motion $B(t)$, $t \ge 0$ (perhaps defined on an extension of the initial probability space), such that
$$\frac{1}{\sqrt{t}} \sup_{0 \le u \le t} \big| S_{[u]} - \mu u - \sigma B(u) \big| \stackrel{P}{\longrightarrow} 0, \quad t \to \infty,$$
and
$$\lim_{t\to\infty} \frac{S_{[t]} - \mu t - \sigma B(t)}{\sqrt{2t \log\log t}} = 0 \quad \text{a.s.},$$
where $\stackrel{P}{\longrightarrow}$ stands for convergence in probability, and a.s. means almost surely. Thus, in light of Equation (89), we can write, for $t \ge 0$,
$$S_{[t]} = \mu t + \sigma B(t) + R(t),$$
where $\sup_{0 \le u \le t} |R(u)|/\sqrt{t} \stackrel{P}{\longrightarrow} 0$ and $R(t)/\sqrt{2t \log\log t} \to 0$ a.s. as $t \to \infty$. Substitute $N_p$ (see Equation (2)) into Equation (90) instead of $t$. It is easily seen that $N_p \stackrel{P}{\longrightarrow} \infty$ (i.e., for each $t > 0$, one has $\mathbb{P}(N_p \le t) \to 0$ as $p \to 0+$), and by means of characteristic functions one can verify that $pN_p \stackrel{D}{\longrightarrow} Z$ as $p \to 0+$, where $Z \sim \mathrm{Exp}(1)$. Therefore, $\mu p N_p \stackrel{D}{\longrightarrow} \mu Z$, $p \to 0+$. In the proof of Lemma 4, we showed (Equation (24)) that $\mathbb{E}[N_p] = (1-p)/p$. Consequently,
$$\mathrm{var}\big[pB(N_p)\big] = p^2\, \mathbb{E}\big[B(N_p)^2\big] = p^2 \sum_{k=0}^{\infty} \mathbb{E}\big[B(k)^2\big]\, p(1-p)^k = p^2 \sum_{k=0}^{\infty} k\, p(1-p)^k = p^2\, \mathbb{E}[N_p] = p^2\, \frac{1-p}{p} = p(1-p) \to 0, \quad p \to 0+.$$
Hence, $p\sigma B(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. Now, we demonstrate that $pR(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. For any $\varepsilon > 0$ and any $t > 0$,
$$\mathbb{P}\big(p|R(N_p)| > \varepsilon\big) \le \mathbb{P}\big(p|R(N_p)| > \varepsilon,\ pN_p \le t\big) + \mathbb{P}(pN_p > t) \le \mathbb{P}\Big(p \sup_{0 \le u \le t/p} |R(u)| > \varepsilon\Big) + \mathbb{P}(pN_p > t).$$
Fix arbitrary $\gamma > 0$ and $\varepsilon > 0$. Since $\mathbb{E}[pN_p] = 1 - p \le 1$, the Markov inequality permits choosing $t_0 = t_0(\gamma)$ such that $\mathbb{P}(pN_p > t_0) < \gamma/2$ for all $p \in (0,1)$. Furthermore, in light of Equation (88), setting $s = t_0/p$ (so that $s \to \infty$ as $p \to 0+$), we obtain
$$\mathbb{P}\Big(p \sup_{0 \le u \le t_0/p} |R(u)| > \varepsilon\Big) = \mathbb{P}\Big(\sup_{0 \le u \le s} |R(u)| > \frac{\varepsilon}{t_0}\, s\Big) < \gamma/2$$
for all sufficiently small $p > 0$. Therefore, $pR(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. The Slutsky lemma yields the desired relation
$$p\, S_{N_p} \stackrel{D}{\longrightarrow} \mu Z, \quad p \to 0+,$$
which implies Equation (3). However, it seems that there is no clear intuitive reason why the law of the random sum converges to an exponential one in the Rényi theorem. Moreover, in Ch. 3, Sec. 2 "The Rényi Limit Theorem" of [20] (see Sec. 2.1 "Motivation"), one can find examples demonstrating that the intuition behind the Rényi theorem is poor.
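The Rényi theorem is also easy to illustrate by simulation. In the sketch below (ours; the uniform summands on $[0,2]$ with $\mu = 1$ and all sample sizes are arbitrary illustrative choices), the Kolmogorov–Smirnov statistic of $p\,S_{N_p}$ against $\mathrm{Exp}(1)$ should be small for small $p$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# p * S_{N_p} is approximately Exp(1) for small p (here mu = 1).
p, n = 0.01, 20_000
N = rng.geometric(p, size=n) - 1                      # P(N_p = k) = p (1-p)^k, k >= 0
sums = np.array([rng.uniform(0.0, 2.0, size=k).sum() for k in N])
print(stats.kstest(p * sums, "expon").statistic)      # small for small p
```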
Actually, relation (90) leads to refinements of Equation (3). In [58], it is proved that if $X_1$ has finite exponential moments and other specified conditions are satisfied, then there exists a more sophisticated approximation for the distribution of $W_p$, and its accuracy is estimated. The results are applied to the study of the $M/G/1$ queue for both light-tailed and heavy-tailed service time distributions. Note that in [58], Section 5, the authors study the model where the distribution of $X_1$ can depend on $p$. For future research, it would be desirable to establish analogues of our theorems for such a model.
The results concerning the accuracy of approximating a distribution under consideration by an exponential law are applicable to some queueing models. Let, for an $M/G/1$ queue, the inter-arrival times follow the $\mathrm{Exp}(\lambda)$ distribution, and let $S$ stand for the general service time. Introduce the stationary waiting time $W$ and define $\rho := \lambda\,\mathbb{E}[S]$ to be the load. Due to [60], if $\mathbb{E}[S^3] < \infty$, then $(1-\rho)W \stackrel{D}{\longrightarrow} Z$ as $\rho \to 1-$, where $Z \sim \mathrm{Exp}(1)$. Theorem 3.1 of [45] contains an upper bound for $d_{\mathcal{H}_1}(W_p, Z)$, where $Z \sim \mathrm{Exp}(1)$. This estimate is used by the authors for the analysis of queueing systems with a single server. It would be interesting to obtain sharp approximations in the framework of queueing systems.
For the model of exchangeable random variables, Theorem 2 in Section 2 ensures the weak convergence of the distributions under consideration to a specified mixture of explicitly indicated laws. Theorem 3 provides the sharp convergence rate estimate for this limit law by means of the ideal probability metric of the second order. It would be worthwhile to establish such an estimate of the distributions' proximity employing the Lévy–Prokhorov distance, because convergence in this metric is equivalent to the weak convergence of distributions of random variables. All the more so as, at present, there is no unified theory of probability metrics. In this regard, one can mention Proposition 1.2 of [17], stating that if a random variable $Z$ has a Lebesgue density bounded by $C$, then, for any random variable $Y$,
$$d_K(Y, Z) \le \sqrt{2C\, d_{\mathcal{H}_1}(Y, Z)}.$$
However, this estimate only gives sub-optimal convergence rates. We also highlight the important total variation distance $d_{TV}$. The authors of [61] study the sum $W := \sum_{j\in J} X_j$, where $\{X_j,\ j \in J\}$ is a family of locally dependent non-negative integer-valued random variables. Using perturbations of Stein's operator, they establish upper bounds for $d_{TV}(W, M)$, where the law of $M$ is a mixture of a Poisson distribution and either a binomial or a negative binomial distribution. It would be desirable to obtain sharp estimates and, moreover, to consider a more general model where the set of summation is random. In this connection, it seems helpful to employ the paper [62], where the authors proved results concerning the weak convergence of distributions of statistics constructed from samples of random size. In addition, it would be interesting to extend these results to stratified samples by invoking Lemma 1 of [63].
Special attention is paid to various generalizations of geometric sums. In Theorem 3.3 of [64], the authors consider random sums with the summation index $T_n := Y_1 + \ldots + Y_n$, where $Y_1, Y_2, \ldots$ are i.i.d. random variables following the geometric law $\mathrm{Geom}(p)$, see Equation (2). Then, they show that $S_{T_n}/\mathbb{E}[S_{T_n}]$ converges in distribution to the gamma law with certain parameters as $p \to 0+$. In [62], it is demonstrated that the Linnik and the Mittag–Leffler laws arise naturally in the framework of limit theorems for random sums. Hopefully, in the future, a complete picture of limit laws involving a general theory of distribution mixtures will appear. In addition, it is desirable to study various models of random sums of dependent random variables. On this track, it could be useful to consider decompositions of exchangeable random sequences extending the fundamental de Finetti theorem, see, e.g., [65].
One can try to generalize the results of Section 7 to the accumulative laws proposed in [66]. These laws are akin to both the Pareto distribution and the lognormal distribution. In addition, we refer to [43], where the "variance-gamma distributions" were studied. These distributions form a four-parameter family and comprise, as special and limiting cases, the normal, gamma and Laplace distributions. The employment of these distributions permits enlarging the range of applications in modeling and fitting real data.
To complete the indication of further research directions, we note that the next essential and nontrivial step is to establish limit theorems in functional spaces for processes generated by a sequence of random sums of random variables. For such stochastic processes, one can obtain analogues of the classical invariance principles.

Author Contributions

Conceptualization, A.B. and N.S.; methodology, A.B. and N.S.; formal analysis, A.B. and N.S.; investigation, A.B. and N.S.; writing—original draft preparation, A.B. and N.S.; writing—review and editing, A.B. and N.S.; supervision, A.B.; project administration, A.B.; funding acquisition, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

The work was supported by the Lomonosov Moscow State University project “Fundamental Mathematics and Mechanics”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful to Alexander Tikhomirov for the invitation to present the manuscript for this issue. In addition, they would like to thank the three anonymous Reviewers for the careful reading of the manuscript and valuable remarks.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Lemma 1.
If $\mathrm{Lip}(h) = C < \infty$, then $h$ is absolutely continuous (see, e.g., §13 in [42]), and consequently $h'(x)$ exists for almost all $x \in \mathbb{R}$. Thus, $|h'(x)| \le C$ for almost all $x \in \mathbb{R}$ in light of Equation (4). Assume that the essential supremum $\|h'\| = C_0 < C$. Then, for any $\varepsilon > 0$, one can find a version of $h'$, defined on $\mathbb{R}$, such that $\sup_{x\in\mathbb{R}} |h'(x)| \le C_0 + \varepsilon$. (It was explained in Section 2 that one can consider a measurable extension of $h'$ to $\mathbb{R}$.) Then, due to Equation (11) with $h$ instead of $f$, we obtain Equation (5) with $C_0 + \varepsilon$ instead of $C$. Consequently, $\mathrm{Lip}(h) \le C_0 < C$. We come to a contradiction.
On the other hand, let $h$ be absolutely continuous. Then, for almost all $x \in \mathbb{R}$, there exists $h'(x)$, and Equation (11) is valid for $h$ instead of $f$. Assume that the essential supremum $\|h'\| = C < \infty$. Then, for any $\varepsilon > 0$, there is a version of $h'$ such that $\sup_{x\in\mathbb{R}} |h'(x)| \le C + \varepsilon$. According to Equation (11), relation (5) holds with $C + \varepsilon$ instead of $C$. Since $\varepsilon > 0$ can be taken arbitrarily small, one can claim that $\mathrm{Lip}(h) \le C$. Suppose that $\mathrm{Lip}(h) \le C_0 < C$. Then, for almost all $x \in \mathbb{R}$, $h'(x)$ exists and $|h'(x)| \le C_0$. Thus, we have found a version with $\|h'\| \le C_0$. The contradiction shows that $\mathrm{Lip}(h) = C$. Hence, the desired statement is proved. □
Proof of Lemma 2.
Let $x_0$ be a continuity point of a function $h \in \mathcal{K} \cup \mathcal{H}_1 \cup \mathcal{H}_2$. Then, the same is true for the function $h(u)e^{-\lambda u}$, $u \in \mathbb{R}$. Hence, the function $\int_x^{\infty} h(u)e^{-\lambda u}\, du$ has the derivative $-h(x_0)e^{-\lambda x_0}$ at the point $x_0$ (in light of Remark 1, the integral $\int_x^{\infty} h(u)e^{-\lambda u}\, du$ is well defined for any $x \in \mathbb{R}$). Thus, for each point $x$ of continuity of $h$, there exists
$$f_h'(x) = -\lambda e^{\lambda x}\int_x^{\infty} h(u)e^{-\lambda u}\, du + e^{\lambda x}\big(h(x)e^{-\lambda x}\big) = \lambda f_h(x) + h(x).$$
For each fixed $z \in \mathbb{R}$ and the function $h(x) = \mathbb{I}\{x \le z\}$, $x \in \mathbb{R}$, Equation (12) is verified in a similar way for the right derivative of $f_h$ at the point $z$. Taking $x = 0$ in Equation (12), we obtain $f_h(0) = -\mathbb{E}[h(Z)]/\lambda$. Evidently, $e^{\lambda x}\int_x^{\infty} e^{-\lambda u}\, du = 1/\lambda$. Therefore, Equation (A1) yields
$$f_h'(x) = -\lambda e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du.$$
If a function $h$ belongs to $\mathcal{K}$, then, for any $u, x \in \mathbb{R}$, the inequality $|h(u) - h(x)| \le 1$ holds. Consequently, for $h \in \mathcal{K}$, one has $\|f_h'\| \le 1$ (where $f_h'$ means a right derivative of a version of $f_h$, and we operate with the essential supremum).
Taking into account Lemma 1, for a function $h \in \mathcal{H}_1$ and any $x \le u$, one can write $|h(u) - h(x)| \le \mathrm{Lip}(h)(u - x) = \|h'\|(u - x)$. For $h \in \mathcal{H}_2$ and $x \le u$, by the Lagrange finite-increments formula, $|h(u) - h(x)| \le |h'(v)|(u - x) \le \|h'\|(u - x)$, where $x < v < u$. Hence, for any $x \in \mathbb{R}$ and $h \in \mathcal{H}_1 \cup \mathcal{H}_2$,
$$|f_h'(x)| = \lambda e^{\lambda x}\Big|\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du\Big| \le \lambda e^{\lambda x}\, \|h'\| \int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = \frac{\|h'\|}{\lambda},$$
since
$$\lambda e^{\lambda x}\int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = \int_0^{\infty} \lambda v\, e^{-\lambda v}\, dv = \frac{1}{\lambda}.$$
Taking into account Equation (12), one can see that, for any $h \in \mathcal{H}_2$, $f_h' = \lambda f_h + h$, where $f_h$ and $h$ have derivatives at each point $x \in \mathbb{R}$. Using Equations (A2) and (A3), we obtain, for $x \in \mathbb{R}$,
$$f_h''(x) = \lambda f_h'(x) + h'(x) = -\lambda^2 e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du + h'(x) = -\lambda^2 e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x) - h'(x)(u - x)\big)\, e^{-\lambda u}\, du.$$
By means of Equation (A3) and the Lagrange finite-increments formula, we can write
$$|f_h''(x)| \le 2\|h'\|\, \lambda^2 e^{\lambda x}\int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = 2\|h'\|.$$
Let us apply the Taylor formula with the integral representation of the residual term:
$$h(u) = h(x) + h'(x)(u - x) + R(u, x), \qquad R(u, x) = \int_x^u (u - t)\, h''(t)\, dt, \quad u, x \in \mathbb{R}.$$
This representation, known for the Riemann integral (see, e.g., [67], §9.17), holds in the framework of the Lebesgue integral if it is possible to use the recurrent integration by parts for $R(u, x)$, i.e.,
$$\int_x^u (u - t)\, h''(t)\, dt = -h'(x)(u - x) + \int_x^u h'(t)\, dt = -h'(x)(u - x) + h(u) - h(x).$$
The integral on the left-hand side of Equation (A7) exists by virtue of Lemma 1, since $h' \in \mathrm{Lip}(1)$. Therefore, $h''(x)$ is defined for almost all $x \in \mathbb{R}$ and (essential supremum) $\|h''\| \le 1$. The latter equality in Equation (A7) is obvious, since $h'$ is a continuous function on $\mathbb{R}$. The first equality in Equation (A7) is valid due to the integration by parts formula for the Lebesgue integral. Indeed, the functions $h'(t)$ and $(u - t)$ are absolutely continuous for $t$ belonging to $[x, u]$. Thus, we can apply, e.g., Theorem 13.29 of [42] to justify the first equality in Equation (A7). Consequently, due to Equations (A4) and (A6), one can write
$$|f_h''(x)| \le \lambda^2 e^{\lambda x}\int_x^{\infty} \Big| \int_x^u (u - t)\, h''(t)\, dt \Big|\, e^{-\lambda u}\, du \le \frac{\|h''\|}{2} \int_x^{\infty} \lambda^2 (u - x)^2 e^{-\lambda(u - x)}\, du = \frac{\|h''\|\,\Gamma(3)}{2\lambda} = \frac{\|h''\|}{\lambda},$$
where $\Gamma(\alpha) := \int_0^{\infty} u^{\alpha - 1} e^{-u}\, du$, $\alpha > 0$. Relations (A5) and (A8) lead to the last statement of Lemma 2. The proof is complete. □
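The bound $\|f_h'\| \le \|h'\|/\lambda$ can be probed by quadrature via the representation (A2) (a sketch of ours; the sign convention follows the reconstruction above, and $h = \sin$ with $\|h'\| = 1$ is an arbitrary test function):

```python
import numpy as np
from scipy.integrate import quad

# Quadrature check of ||f_h'|| <= ||h'|| / lambda via representation (A2).
lam = 2.0
h = np.sin

def f_h_prime(x):
    integral, _ = quad(lambda u: (h(u) - h(x)) * np.exp(-lam * (u - x)), x, np.inf)
    return -lam * integral

xs = np.linspace(-10.0, 10.0, 201)
print(max(abs(f_h_prime(x)) for x in xs), "<=", 1.0 / lam)
```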
Comments to Definition 1.
For each Lipschitz function $f$, one can claim that $\mathbb{E}[f(X)]$ is finite, since $\mathbb{E}|X| < \infty$ and, in light of Remark 1, one has $|f(x)| \le C|x| + |f(0)|$, where $C = \mathrm{Lip}(f)$, $x \in \mathbb{R}$. Clearly, it is sufficient to verify Equation (14) for any Lipschitz function $f$ such that $f(0) = 0$ (otherwise, we take the Lipschitz function $f(x) - f(0)$, $x \in \mathbb{R}$). Evidently, $p_e(x)$, $x \in \mathbb{R}$, introduced by Equation (15), is a probability density, because for a non-negative random variable $X$, according to [47], Ch. 2, formula (69),
$$\mathbb{E}[X] = \int_{[0,\infty)} \mathbb{P}(X > u)\, du.$$
We will show that, for such $f$ and a density $p_e$ of $X^e$, one has
$$\int_{[0,\infty)} f(u)\, dF(u) = \int_{[0,\infty)} f'(u)\, \mathbb{P}(X > u)\, du,$$
where $F$ is the distribution function of $X$ and $\mathbb{E}[X] > 0$. We take integrals over $[0,\infty)$, as $X \ge 0$ and $p_e(x) = 0$ for $x < 0$.
We know that the function $f$ has a derivative at almost all points $x \in \mathbb{R}$. Therefore, the right-hand side of Equation (A10) does not depend on the choice of a version of $f'$ ($\mathbb{P}(X > u)$ is a measurable bounded function). The integral on the right-hand side of Equation (A10) is finite, because $\|f'\| \le C$ in light of Lemma 1 and since the right-hand side of Equation (A9) is finite. One can take the integrals over $(0,\infty)$ in Equation (A10), as $f(0) = 0$ and $m(\{0\}) = 0$, where $m$ stands for the Lebesgue measure.
The function $f$ is a function of finite variation (as $f$ is a Lipschitz function). Therefore, $f = f_1 - f_2$, where $f_1$ and $f_2$ are nondecreasing functions. We can take the canonical representation with $f_1(x) = \mathrm{Var}_0^x(f)$ and $f_2(x) = f_1(x) - f(x)$, $x \in \mathbb{R}$, where $\mathrm{Var}_a^b(f)$ is the variation of $f$ on $[a,b]$, $a < b$ (see, e.g., [42], Theorem 12.18). If $f \in \mathrm{Lip}(C)$, then $\mathrm{Var}_a^b(f) \le C(b - a)$. For $a < c < b$, one has (see, e.g., [42], Lemma 12.15)
$$\mathrm{Var}_a^c(f) + \mathrm{Var}_c^b(f) = \mathrm{Var}_a^b(f).$$
We see that such $f_1$ and $f_2$ are Lipschitz functions when $f$ is a Lipschitz one. Hence, for almost all $x \in \mathbb{R}$, there exist $f_1'(x)$, $f_2'(x)$ and $f'(x) = f_1'(x) - f_2'(x)$. Thus, it is enough to demonstrate that
$$\int_{(0,\infty)} f_i(u)\, dF(u) = \int_{(0,\infty)} f_i'(u)\, \mathbb{P}(X > u)\, du, \quad i = 1, 2.$$
These integrals are finite, since $f_1$ and $f_2$ are Lipschitz functions. Note that
$$\int_{(0,\infty)} f_i(u)\, dF(u) = -\int_{(0,\infty)} f_i(u)\, d\big(1 - F(u)\big) = -\int_{(0,\infty)} f_i(u)\, d\mathbb{P}(X > u).$$
By applying Theorem 11 of Sec. 6, Ch. 2 of [47] to the nondecreasing continuous function $f_i$ and the nonincreasing right-continuous function $\mathbb{P}(X > u)$, one obtains, for each $b > 0$, the following formula:
$$\int_{(0,b]} f_i(u)\, d\mathbb{P}(X > u) = f_i(b)\mathbb{P}(X > b) - f_i(0)\mathbb{P}(X > 0) - \int_{(0,b]} \mathbb{P}(X > u-)\, df_i(u) = f_i(b)\mathbb{P}(X > b) - \int_{(0,b]} \mathbb{P}(X > u)\, f_i'(u)\, du.$$
We take into account that $f_i(0) = 0$, that the $\sigma$-finite measure $Q_i$ corresponding to $f_i$ is absolutely continuous w.r.t. $m$, and that the Radon–Nikodym derivative is $\frac{dQ_i}{dm}(x) = f_i'(x)$, $x \in \mathbb{R}$, $i = 1, 2$. In addition, we can write $\mathbb{P}(X > u)$ in Equation (A11), since for almost all $u \in \mathbb{R}$ the left limit $\mathbb{P}(X > u-)$ of this function coincides with $\mathbb{P}(X > u)$ (there exists at most a countable set of jumps of $\mathbb{P}(X > u)$, $u \in \mathbb{R}$). Obviously, $f_i(b)\mathbb{P}(X > b) \to 0$ as $b \to \infty$, because $|f_i(u)| \le A_i u + B_i$ for some positive $A_i$, $B_i$ and all $u \ge 0$. Indeed, according to formula (73) of Sec. 6, Ch. 2 of [47], the condition $\mathbb{E}|X| < \infty$ yields
$$b\, \mathbb{P}(|X| > b) \to 0, \quad b \to \infty.$$
By the Lebesgue dominated convergence theorem, one infers that
$$\int_{(0,b]} f_i(u)\, d\mathbb{P}(X > u) \to \int_{(0,\infty)} f_i(u)\, d\mathbb{P}(X > u), \quad b \to \infty,$$
and
$$\lim_{b\to\infty} \int_{(0,b]} \mathbb{P}(X > u)\, f_i'(u)\, du = \int_{(0,\infty)} \mathbb{P}(X > u)\, f_i'(u)\, du.$$
This permits us to claim the validity of Equation (A10), which entails the desired Equation (15).
Proof of Lemma 3.
For $f \in \mathcal{H}_2$, in light of Remark 1, one can state that $|f(x)| \le A_0 x^2 + B_0$ for some positive numbers $A_0$ and $B_0$. Let $F$ be the distribution function of $X$. Since $\mathbb{E}[X^2] < \infty$, due to Corollary 2, Sec. 6, Ch. 2, v. 1 of [47], one has
$$x^2 F(x) \to 0, \quad x \to -\infty; \qquad x^2\big(1 - F(x)\big) \to 0, \quad x \to \infty.$$
Hence, we obtain that $f(x)F(x) \to 0$ as $x \to -\infty$ and $f(x)\big(1 - F(x)\big) \to 0$ as $x \to \infty$. The continuous function $f$ has locally bounded variation. Thus, $f = f_1 - f_2$, where $f_1$ and $f_2$ are nondecreasing continuous functions. Hence, for any $a < 0$, the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2 of [47]) and Equation (18) give
$$\int_{(a,0]} \big(f_1(x) - f_2(x)\big)\, dF(x) = f(0)F(0) - f(a)F(a) - \int_{(a,0]} F(x)\, df_1(x) + \int_{(a,0]} F(x)\, df_2(x) = f(0)F(0) - f(a)F(a) - \int_{(a,0]} F(x)\, df(x).$$
We take into account that the integrands are bounded measurable functions and that the measures corresponding to $F$, $f_1$ and $f_2$ are finite on any interval $(a,0]$. Therefore, such integrals are finite. According to the Lebesgue theorem on dominated convergence (recall that $\mathbb{E}[X^2] < \infty$), one has
$$\lim_{a\to-\infty} \int_{(a,0]} f(x)\, dF(x) = \int_{(-\infty,0]} f(x)\, dF(x),$$
and the limit is finite. The monotone convergence theorem for $\sigma$-finite measures yields
$$\lim_{a\to-\infty} \Big( \int_{(a,0]} F(x)\, df_1(x) - \int_{(a,0]} F(x)\, df_2(x) \Big) = \int_{(-\infty,0]} F(x)\, df_1(x) - \int_{(-\infty,0]} F(x)\, df_2(x).$$
We have seen that $f(a)F(a) \to 0$ as $a \to -\infty$. Hence, in light of Equation (18),
$$\int_{(-\infty,0]} F(x)\, df_1(x) - \int_{(-\infty,0]} F(x)\, df_2(x) = \int_{(-\infty,0]} F(x)\, df(x).$$
Therefore, for $i = 1, 2$, each integral $\int_{(-\infty,0]} F(x)\, df_i(x)$ is finite, as $\int_{(-\infty,0]} F(x)\, df(x)$ is finite. Thus,
$$\int_{(-\infty,0]} f(x)\, dF(x) = f(0)F(0) - \int_{(-\infty,0]} F(x)\, df(x) = f(0)F(0) - \int_{(-\infty,0]} F(x)\, f'(x)\, dx,$$
as $f$ is absolutely continuous. Indeed, for any $x \in \mathbb{R}$,
$$f(x) = f(0) + \int_{(0,x]} f'(u)\, du,$$
where the (continuous) $f'$ belongs to $L_1[a,b]$ for any finite interval $[a,b]$. Thus, $(f')_+ \in L_1[a,b]$ and $(f')_- \in L_1[a,b]$. Set
$$f_1(x) := f(0) + \int_{(0,x]} \big(f'(u)\big)_+\, du, \qquad f_2(x) := \int_{(0,x]} \big(f'(u)\big)_-\, du.$$
Then $f_1$ and $f_2$ are nondecreasing continuous functions on $\mathbb{R}$, $f = f_1 - f_2$, and
$$\int_{(a,0]} F(x)\, df(x) = \int_{(a,0]} F(x)\, df_1(x) - \int_{(a,0]} F(x)\, df_2(x),$$
where these three integrals are finite. For the (non-negative) $\sigma$-finite measures corresponding to $f_1$ and $f_2$, one can write
$$\int_{(a,0]} F(x)\, df_1(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_+\, dx, \qquad \int_{(a,0]} F(x)\, df_2(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_-\, dx.$$
Thus, one has
$$\int_{(a,0]} F(x)\, df(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_+\, dx - \int_{(a,0]} F(x)\big(f'(x)\big)_-\, dx = \int_{(a,0]} F(x)\Big(\big(f'(x)\big)_+ - \big(f'(x)\big)_-\Big)\, dx = \int_{(a,0]} F(x)\, f'(x)\, dx.$$
The bound $\|f'\| \le 1$ follows from Lemma 1. Therefore, the Lebesgue theorem on dominated convergence yields (as $\mathbb{E}|X| < \infty$)
$$\lim_{a\to-\infty} \int_{(a,0]} F(x)\, f'(x)\, dx = \int_{(-\infty,0]} F(x)\, f'(x)\, dx.$$
We have demonstrated that
$$\int_{(-\infty,0]} F(x)\, df(x) = \int_{(-\infty,0]} F(x)\, f'(x)\, dx.$$
In a similar way, we consider $\int_{(0,b]} f(x)\, d\big(1 - F(x)\big)$ and, letting $b \to \infty$, come to the relation
$$-\int_{(0,\infty)} f(x)\, d\big(1 - F(x)\big) = f(0)\big(1 - F(0)\big) + \int_{(0,\infty)} \big(1 - F(x)\big)\, df(x) = f(0)\big(1 - F(0)\big) + \int_{(0,\infty)} \big(1 - F(x)\big)\, f'(x)\, dx.$$
This establishes Equation (21). □

References

  1. Steutel, F.W.; Van Harn, K. Infinite Divisibility of Probability Distributions on the Real Line; Marcel Dekker: New York, NY, USA, 2004. [Google Scholar]
  2. Nolan, J.P. Univariate Stable Distributions. Models for Heavy Tailed Data; Springer: Cham, Switzerland, 2020. [Google Scholar]
  3. Jagers, P. Branching processes: Personal historical perspective. In Statistical Modeling for Biological Systems; Almudevar, A., Oakes, D., Hall, J., Eds.; Springer: Cham, Switzerland, 2020; pp. 1–12. [Google Scholar] [CrossRef]
  4. Schmidli, H. Risk Theory; Springer: Cham, Switzerland, 2017. [Google Scholar]
  5. Gnedenko, B.V.; Korolev, V.Y. Random Summation. Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996. [Google Scholar]
  6. Kalashnikov, V.V. Geometric Sums: Bounds for Rare Events with Applications; Kluwer Academic: Dordrecht, The Netherlands, 1997. [Google Scholar]
  7. Pinsky, M.A.; Karlin, S. An Introduction to Stochastic Modeling, 4th ed.; Academic Press: Amsterdam, The Netherlands, 2011. [Google Scholar]
  8. Bulinski, A.; Spodarev, E. Introduction to random fields. In Stochastic Geometry, Spacial Statistics and Random Fields. Asymptotic Methods; Spodarev, E., Ed.; Springer: Berlin, Germany, 2013; pp. 277–336. [Google Scholar] [CrossRef]
  9. Zolotarev, V.M. Modern Theory of Summation of Random Variables; De Gruyter: Berlin, Germany, 1997. [Google Scholar]
  10. Rachev, S.T.; Klebanov, L.B.; Stoyanov, S.V.; Fabozzi, F.J. The Methods of Distances in the Theory of Probability and Statistics; Springer: New York, NY, USA, 2013. [Google Scholar]
  11. Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory; Statistical Laboratory of the University of California: Berkeley, CA, USA, 1972; pp. 583–602. [Google Scholar]
  12. Stein, C. Approximate Computation of Expectations, Institute of Mathematical Statistics Lecture Notes—Monograph Series, 7; Institute of Mathematical Statistics: Hayward, CA, USA, 1986. [Google Scholar]
  13. Slepov, N.A. Convergence rate of random geometric sum distributions to the Laplace law. Theory Probab. Appl. 2021, 66, 121–141. [Google Scholar] [CrossRef]
  14. Tyurin, I.S. On the convergence rate in Lyapunov’s theorem. Theory Probab. Appl. 2011, 55, 253–270. [Google Scholar] [CrossRef]
  15. Barbour, A.D.; Chen, L.H.Y. (Eds.) An Introduction to Stein’s Method; World Scientific: Singapore, 2005. [Google Scholar]
  16. Chen, L.H.Y.; Goldstein, L.; Shao, Q.-M. Normal Approximation by Stein’s Method; Springer: Heidelberg, Germany, 2011. [Google Scholar]
  17. Ross, N. Fundamentals of Stein’s method. Probab. Surv. 2011, 8, 210–293. [Google Scholar] [CrossRef]
  18. Arras, B.; Breton, J.-C.; Deshayes, A.; Durieu, O.; Lachièze-Rey, R. Some recent advances for limit theorems. ESAIM Proc. Surv. 2020, 68, 73–96. [Google Scholar] [CrossRef]
  19. Arras, B.; Houdré, C. On Stein’s Method for Infinitely Divisible Laws with Finite First Moment, 1st ed.; Springer: Cham, Switzerland, 2019. [Google Scholar]
  20. Chen, P.; Nourdin, I.; Xu, L.; Yang, X.; Zhang, R. Non-integrable Stable Approximation by Stein’s Method. J. Theor. Probab. 2022, 35, 1137–1186. [Google Scholar] [CrossRef]
  21. Rényi, A. (Hungarian) A characterization of Poisson processes. Magyar Tud. Akad. Mat. Kutató. Int. Közl. 1957, 1, 519–527. [Google Scholar]
  22. Shevtsova, I.; Tselishchev, M. A generalized equilibrium transform with application to error bounds in the Rényi theorem with no support constraints. Mathematics 2020, 8, 577. [Google Scholar] [CrossRef]
  23. Brown, M. Error bounds for exponential approximations of geometric convolutions. Ann. Probab. 1990, 18, 1388–1402. [Google Scholar] [CrossRef]
  24. Brown, M. Sharp bounds for exponential approximations under a hazard rate upper bound. J. Appl. Probab. 2015, 52, 841–850. [Google Scholar] [CrossRef]
  25. Hung, T.L.; Kein, P.T. On the rates of convergence in weak limit theorems for normalized geometric sums. Bull. Korean Math. Soc. 2020, 57, 1115–1126. [Google Scholar] [CrossRef]
  26. Shevtsova, I.; Tselishchev, M. On the accuracy of the exponential approximation to random sums of alternating random variables. Mathematics 2020, 8, 1917. [Google Scholar] [CrossRef]
  27. Korolev, V.; Zeifman, A. Bounds for convergence rate in laws of large numbers for mixed Poisson random sums. Stat. Probab. 2021, 168, 108918. [Google Scholar] [CrossRef]
  28. Aldous, D.J. More Uses of Exchangeability: Representations of Complex Random Structures. In Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman; Bingham, N.H., Goldie, C.M., Eds.; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
  29. Shevtsova, I.; Tselishchev, M. On the accuracy of the generalized gamma approximation to generalized negative binomial random sums. Mathematics 2021, 9, 1571. [Google Scholar] [CrossRef]
  30. Liu, Q.; Xia, A. Geometric sums, size biasing and zero biasing. Electron. Commun. Probab. 2022, 27, 1–13. [Google Scholar] [CrossRef]
  31. Döbler, C.; Peccati, G. The Gamma Stein equation and noncentral de Jong theorems. Bernoulli 2018, 24, 3384–3421. [Google Scholar] [CrossRef] [Green Version]
  32. Korolev, V. Bounds for the rate of convergence in the generalized Rényi theorem. Mathematics 2022, 10, 4252. [Google Scholar] [CrossRef]
  33. Peköz, E.A.; Röllin, A. New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 2011, 39, 587–608. [Google Scholar] [CrossRef] [Green Version]
  34. Weinberg, G.V. Kullback-Leibler divergence and the Pareto-Exponential approximation. SpringerPlus 2016, 5, 604. [Google Scholar] [CrossRef]
  35. Daly, F. Gamma, Gaussian and Poisson approximations for random sums using size-biased and generalized zero-biased couplings. Scand. Actuar. J. 2022, 24, 471–487. [Google Scholar] [CrossRef]
  36. Zolotarev, V.M. Ideal metrics in the problem of approximating the distributions of sums of independent random variables. Theory Probab. Appl. 1977, 22, 433–449. [Google Scholar] [CrossRef]
  37. Gibbs, A.L.; Su, F.E. On choosing and bounding probability metrics. Int. Stat. Rev. 2002, 70, 419–435. [Google Scholar] [CrossRef] [Green Version]
  38. Janson, S. Probability Distances. 2020. Available online: www2.math.uu.se/∼svante (accessed on 1 September 2022).
  39. Peköz, E.A.; Röllin, A.; Ross, N. Total variation error bounds for geometric approximation. Bernoulli 2013, 19, 610–632. [Google Scholar] [CrossRef] [Green Version]
  40. Slepov, N.A. Generalized Stein equation on extended class of functions. In Proceedings of the International Conference on Analytical and Computational Methods in Probability Theory and Its Applications, Moscow, Russia, 23–27 October 2017; pp. 75–79. [Google Scholar]
  41. Ley, C.; Reinert, G.; Swan, Y. Stein’s method for comparison of univariate distributions. Probab. Surv. 2017, 14, 1–52. [Google Scholar] [CrossRef]
  42. Yeh, J. Real Analysis. Theory of Measure and Integration, 2nd ed.; World Scientific: Singapore, 2006. [Google Scholar]
  43. Gaunt, R.E. Wasserstein and Kolmogorov error bounds for variance gamma approximation via Stein’s method I. J. Theor. Probab. 2020, 33, 465–505. [Google Scholar] [CrossRef] [Green Version]
  44. Halmos, P.R. Measure Theory; Springer: New York, NY, USA, 1974. [Google Scholar]
  45. Gaunt, R.E.; Walton, N. Stein’s method for the single server queue in heavy traffic. Stat. Probab. Lett. 2020, 156, 108566. [Google Scholar] [CrossRef]
  46. Muthukumar, T. Measure Theory and Lebesgue Integration. 2018. Available online: home.iitk.ac.in/∼tmk (accessed on 1 September 2022).
  47. Shiryaev, A.N. Probability-1; Springer: New York, NY, USA, 2016. [Google Scholar]
  48. Burkill, L.C. The Lebesgue Integral; Cambridge University Press: Cambridge, UK, 1963. [Google Scholar]
  49. Korolev, V.; Zeifman, A. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388. [Google Scholar] [CrossRef] [Green Version]
  50. Baldi, P.; Rinott, Y.; Stein, C. A normal approximations for the number of local maxima of a random function on a graph. In Probability, Statistics and Mathematics, Papers in Honor of Samuel Karlin; Anderson, T.W., Athreya, K.B., Iglehart, D.L., Eds.; Academic Press: San-Diego, CA, USA, 1989; pp. 59–81. [Google Scholar] [CrossRef]
  51. Goldstein, L.; Rinott, Y. Multivariate normal approximations by Stein’s method and size bias couplings. J. Appl. Prob. 1996, 33, 1–17. [Google Scholar] [CrossRef]
  52. Goldstein, L. Berry-Esseen bounds for combinatorial central limit theorems and pattern occurrences, using zero and size biasing. J. Appl. Probab. 2005, 42, 661–683. [Google Scholar] [CrossRef]
  53. Goldstein, L.; Reinert, G. Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 1997, 7, 935–952. [Google Scholar] [CrossRef]
  54. Gaunt, R.E. On Stein’s method for products of normal random variables and zero bias couplings. Bernoulli 2017, 23, 3311–3345. [Google Scholar] [CrossRef] [Green Version]
  55. Döbler, C. Distributional transformations without orthogonality relations. J. Theor. Probab. 2017, 30, 85–116. [Google Scholar] [CrossRef] [Green Version]
  56. Arratia, R.; Goldstein, L.; Kochman, F. Size bias for one and all. Probab. Surv. 2019, 16, 1–61. [Google Scholar] [CrossRef]
  57. Weinberg, G.V. Validity of whitening-matched filter approximation to the Pareto coherent detector. IET Signal Process 2012, 6, 546–550. [Google Scholar] [CrossRef]
  58. Blanchet, J.; Glynn, P. Uniform renewal theory with applications to expansions of random geometric sums. Adv. Appl. Prob. 2007, 39, 1070–1097. [Google Scholar] [CrossRef]
  59. Kallenberg, O. Foundations of Modern Probability; Springer: New York, NY, USA, 1997. [Google Scholar]
  60. Kingman, J.F.C. On queues in heavy traffic. J. R. Stat. Soc. Ser. B Stat. Methodol. 1962, 24, 383–392. [Google Scholar] [CrossRef]
  61. Su, Z.; Wang, X. Approximation of sums of locally dependent random variables via perturbation of Stein operator. arXiv 2022, arXiv:2209.09770.v2. [Google Scholar]
  62. Korolev, V.Y.; Zeifman, A.I. Convergence of statistics constructed from samples with random sizes to the Linnik and Mittag-Leffler distributions and their generalizations. J. Korean Stat. Soc. 2017, 46, 161–181. [Google Scholar] [CrossRef]
  63. Bulinski, A.; Kozhevin, A. New version of the MDR method for stratified samples. Stat. Optim. Inf. Comput. 2017, 5, 1–18. [Google Scholar] [CrossRef] [Green Version]
  64. Giang, L.T.; Hung, T.L. An extension of random summations of independent and identically distributed random variables. Commun. Korean Math. Soc. 2018, 33, 605–618. [Google Scholar] [CrossRef]
  65. Farago, A. Decomposition of Random Sequences into Mixtures of Simpler Ones and Its Application in Network Analysis. Algorithms 2021, 14, 336. [Google Scholar] [CrossRef]
  66. Feng, M.; Deng, L.-J.; Chen, F.; Perc, M.; Kurths, J. The accumulative law and its probability model: An extension of the Pareto distribution and the log-normal distribution. Proc. R. Soc. A 2020, 476, 20200019. [Google Scholar] [CrossRef] [PubMed]
  67. Nikolsky, S.M. A Course of Mathematical Analysis, v. 1; Mir Publishers: Moscow, Russia, 1987. [Google Scholar]