Article

Sharp Estimates for Proximity of Geometric and Related Sums Distributions to Limit Laws

by Alexander Bulinski 1,* and Nikolay Slepov 2
1 Faculty of Mathematics and Mechanics, Lomonosov Moscow State University, Leninskie Gory 1, 119991 Moscow, Russia
2 Department of Higher Mathematics, Moscow Institute of Physics and Technology, National Research University, 9 Institutskiy per., Dolgoprudny, 141701 Moscow, Russia
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(24), 4747; https://doi.org/10.3390/math10244747
Submission received: 29 October 2022 / Revised: 6 December 2022 / Accepted: 7 December 2022 / Published: 14 December 2022
(This article belongs to the Special Issue Limit Theorems of Probability Theory)

Abstract

The convergence rate in the famous Rényi theorem is studied by means of a refinement of the Stein method. Namely, it is demonstrated that the new estimate of the convergence rate of normalized geometric sums to the exponential law, involving the ideal probability metric of the second order, is sharp. Some recent results concerning the convergence rates in the Kolmogorov and Kantorovich metrics are extended as well. In contrast to many previous works, there are no assumptions that the summands of geometric sums are positive and have the same distribution. For the first time, an analogue of the Rényi theorem is established for the model of exchangeable random variables. Within this model, a sharp estimate of the convergence rate to a specified mixture of distributions is also provided. The convergence rate of the appropriately normalized random sums of random summands to the generalized gamma distribution is estimated; here, the number of summands follows the generalized negative binomial law. Sharp estimates of the proximity of the distributions of random sums of random summands to the limit law are established both for independent summands and for the model of exchangeable ones. The inverse of the equilibrium transformation of probability measures is introduced, and in this way a new approximation of the Pareto distributions by exponential laws is proposed. The integral probability metrics and the techniques of integration with respect to signed measures are essentially employed.

1. Introduction

The theory of sums of random variables belongs to the core of modern probability theory. The fundamental contribution to the formation of the classical core was made by A. de Moivre, J. Bernoulli, P.-S. Laplace, D. Poisson, P.L. Chebyshev, A.A. Markov, A.M. Lyapunov, E. Borel, S.N. Bernstein, P. Lévy, J. Lindeberg, H. Cramér, A.N. Kolmogorov, A.Ya. Khinchin, B.V. Gnedenko, J.L. Doob, W. Feller, Yu.V. Prokhorov, A.A. Borovkov, Yu.V. Linnik, I.A. Ibragimov, A. Rényi, P. Erdös, M. Csörgö, P. Révész, C. Stein, P. Hall, V.V. Petrov, V.M. Zolotarev, J. Jacod and A.N. Shiryaev, among others. The first steps led to limit theorems for appropriately normalized partial sums of sequences of independent random variables. Besides the laws of large numbers, special attention was paid to the emergence of Gaussian and Poisson limit laws. Note that despite many efforts to find necessary and sufficient conditions for the validity of the central limit theorem (the term was proposed by G. Pólya for a class of limit theorems describing weak convergence of distributions of normalized sums of random variables to the Gaussian law), this problem was completely resolved for independent summands only in the second half of the 20th century in the works of V.M. Zolotarev and V.I. Rotar. Also in the last century, the beautiful theory of infinitely divisible and stable laws was constructed. New developments of infinite divisibility along with classical theory can be found in [1]. For an exposition of the theory of stable distributions and their applications, we refer to [2], see also references therein.
Parallel to partial sums of a sequence of random variables (and vectors), other significant schemes have appeared, for instance, the arrays of random variables. Moreover, in physics, biology and other domains, researchers found that it was essential to study the sums of random variables when the number of summands was random. Thus, the random sums with random summands became an important object of investigation. One can mention the branching processes, which stem from the 19th-century population models of I.J. Bienaymé, F. Galton and H.W. Watson and are still being intensively developed, see, e.g., [3]. In the theory of risk, it is worth recalling the celebrated Cramér–Lundberg model for the dynamics of the capital of an insurance company, see, e.g., Ch. 6 in [4]. Various examples of models described by random sums are considered in Ch. 1 of [5], including (see Example 1.2.1) the relationship between certain random sums analysis and the famous Pollaczek–Khinchin formula in queuing theory. A vast literature deals with the so-called geometric sums. There, one studies the sum of independent identically distributed random variables, where the summation index follows the geometric distribution and is independent of the summands. Such random sums can model many real world phenomena, e.g., in queuing, insurance and reliability, see the Section “Origin of Geometric Sums” in the Introduction of [6]. Furthermore, a multitude of important stochastic models described by systems of dependent random variables have arisen to meet diverse applications, see, e.g., [7]. In particular, the general theory of stochastic processes and random fields arose in the last century (for an introduction to random fields, see, e.g., [8]).
An intriguing problem of estimating the convergence rate to a limit law was addressed by A.C. Berry and C.-G. Esseen. Their papers initiated the study of the proximity of the distribution functions of normalized partial sums of independent random variables to the distribution function of the standard Gaussian law within the framework of the classical theory of summation.
To assess the proximity of distributions, we will employ various integral probability metrics. Usually, for random variables $Y$, $Z$ and a specified class $\mathcal{H}$ of functions $h:\mathbb{R}\to\mathbb{R}$, one sets
$$d_{\mathcal{H}}(Y,Z) := \sup_{h\in\mathcal{H}}\big|\mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]\big| \in [0,\infty].$$
Clearly, $d_{\mathcal{H}}(Y,Z)$ is a functional depending on $\mathrm{law}(Y)$ and $\mathrm{law}(Z)$, i.e., the distributions of $Y$ and $Z$. A class $\mathcal{H}$ should be rich enough to guarantee that $d_{\mathcal{H}}$ possesses the properties of a metric (or semi-metric). The general theory of probability metrics is presented, e.g., in [9,10]. In terms of such metrics, one often compares the distribution of a random variable $Y$ under consideration with that of a target random variable $Z$. In Section 2, we recall the definitions of the Kolmogorov and Kantorovich (alternatively called Wasserstein) distances and the Zolotarev ideal metrics corresponding to the adequate choice of $\mathcal{H}$, denoted below as $\mathcal{K}$, $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively.
It should be emphasized that, for sums of random variables, deep results were established along with the creation and development of different methods of analysis. One can mention the method of characteristic functions due to the works of J. Fourier, P.-S. Laplace and A.M. Lyapunov, the method of moments proposed by P.L. Chebyshev and developed by A.A. Markov, the Lindeberg method of employing auxiliary Gaussian random variables and the Bernstein techniques of large and small boxes. In 1972, C. Stein in [11] (see also [12]) introduced a new method to estimate the proximity of the distribution under consideration to a normal law. Furthermore, this powerful method was developed in the framework of classical limit theorems of probability theory. We describe this method in Section 2. Applying the Stein method along with other tools, one can establish in certain cases the sharp estimates of closeness between a target distribution and other ones in specified metrics (see, e.g., [13,14]). We recommend the books [15,16] and the paper [17] for the basic ideas of the ingenious Stein method. The development of these techniques under mild moment restrictions on the summands is treated in [18,19]. We mention in passing that there are deep generalizations of the Stein techniques involving generators of certain Markov processes; a compact exposition is provided, e.g., on p. 2 of [20].
In the theory of random sums of random summands, the limit theorems with an exponential law as a target distribution play a role similar to that of the central limit theorem for (nonrandom) sums of random variables. Here, one has to underline the principal role of the Rényi classical theorem for geometric sums published in [21]. Recall this famous result. Let $X_1,X_2,\ldots$ be a sequence of independent identically distributed (i.i.d.) random variables such that $\mu := \mathbb{E}[X_1]\neq 0$. Take a geometric random variable $N_p$ with parameter $p\in(0,1)$, defined as follows:
$$\mathbb{P}(N_p=k) = p(1-p)^k, \quad k\in\mathbb{N}\cup\{0\}.$$
Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Set $S_0 := 0$, $S_n := X_1+\dots+X_n$, $n\in\mathbb{N}$. Then,
$$W_p := \frac{S_{N_p}}{\mathbb{E}[S_{N_p}]} \xrightarrow{\mathcal{D}} Z\sim\mathrm{Exp}(1) \quad \text{as}\ p\to 0+,$$
where $\xrightarrow{\mathcal{D}}$ stands for convergence in distribution, $Z$ follows the exponential law $\mathrm{Exp}(\lambda)$ with parameter $\lambda=1$, and $\mathbb{E}[S_{N_p}] = \mu(1-p)/p$. In fact, instead of $N_p$, A. Rényi considered the shifted geometric random variable $N(p)$ such that $\mathbb{P}(N(p)=k) = p(1-p)^{k-1}$, $k\in\mathbb{N}$. Clearly, $N_p$ has the same law as $N(p)-1$. He supposed that the i.i.d. random variables $X_1,X_2,\ldots$ are non-negative and that $N(p)$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Then, $S_{N(p)}/\mathbb{E}[S_{N(p)}]$ converges in distribution to $Z\sim\mathrm{Exp}(1)$ as $p\to 0+$, where $\mathbb{E}[S_{N(p)}] = \mu/p$. It was explained in [22] that both statements are equivalent and the assumption of non-negativity of the summands can be omitted.
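As a quick illustration (not part of the original argument), the following minimal Monte Carlo sketch checks the Rényi theorem numerically; the choice of Uniform(0,1) summands, the sample sizes and all parameter values are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_geometric_sum(p, n_samples, mu=0.5):
    """Sample W_p = S_{N_p} / E[S_{N_p}] for N_p ~ Geom(p) on {0, 1, ...}
    and i.i.d. Uniform(0,1) summands (so mu = E[X_1] = 1/2)."""
    # numpy's geometric law counts trials up to the first success (support
    # {1, 2, ...}); subtracting 1 gives P(N_p = k) = p(1-p)^k, k = 0, 1, ...
    n = rng.geometric(p, size=n_samples) - 1
    s = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in n])
    return s / (mu * (1.0 - p) / p)          # E[S_{N_p}] = mu (1-p)/p

for p in (0.1, 0.01, 0.001):
    w = np.sort(normalized_geometric_sum(p, 20_000))
    x = np.linspace(0.0, 10.0, 2001)
    ecdf = np.searchsorted(w, x, side="right") / w.size
    # the Kolmogorov distance to Exp(1) shrinks as p -> 0+
    print(p, np.abs(ecdf - (1.0 - np.exp(-x))).max())
```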
Building on the previous investigations discussed below in this section, we study different instances of quantifying the approximation of random sums by limit laws and also extend the employment of the Stein method. The main goals of our paper are the following: (1) to find sharp estimates (i.e., optimal ones, which cannot be diminished) of the proximity of geometric sums of independent (in general, non-identically distributed) random variables to the exponential law using the probability metric $d_{\mathcal{H}_2}$; (2) to prove a new version of the Rényi theorem when the summands are described by a model of exchangeable random variables, establishing the corresponding non-exponential limit law together with an optimal bound of the convergence rate applying $d_{\mathcal{H}_2}$; (3) to obtain the exact convergence rate of appropriately normalized random sums of random summands to the generalized gamma distribution, when the number of summands follows the generalized negative binomial distribution, employing $d_{\mathcal{H}_2}$; (4) to introduce the inverse transformation to the “equilibrium distribution transformation”, give a full description of its existence and demonstrate the advantage of applying the Stein method combined with that inverse transform; and (5) to use such an approach in deriving a new approximation in the Kolmogorov metric $d_{\mathcal{K}}$ of the Pareto distribution by an exponential one, which is important in signal processing.
The main idea is to apply the Stein method and deduce (Lemma 2) new estimates of the solution of Stein's equation (corresponding to an exponential law $\mathrm{Exp}(\lambda)$ as a target distribution) when a function $h$ appearing on its right-hand side belongs to the class $\mathcal{H}_2$. This entails the established sharp estimates. The integral probability metrics and the techniques of integration with respect to signed measures are essentially employed. It should be stressed that we consider random summands which, in general, take both positive and negative values and, in certain cases, need not have the same law.
Now, we briefly comment on the relevance of the five groups of results of the paper mentioned above. Some upper bounds for the convergence rates in Equation (3) were obtained previously by different tools (the renewal techniques and the memoryless property of the geometric distribution), and the estimates were not sharp. We refer to the results by A.D. Soloviev, V.V. Kalashnikov and S.Y. Vsekhsvyatskii, M. Brown, V.M. Kruglov and V.Yu. Korolev, where the authors either used the Kolmogorov distance or proved specified nonuniform estimates for differences of the corresponding distribution functions. For instance, in [23] the following estimate was proved:
$$\sup_{x\in\mathbb{R}}\big|\mathbb{P}(W_p\le x)-\mathbb{P}(Z\le x)\big| \le \frac{p\,\mathbb{E}[X_1^2]}{\mu^2}\,\max\Big\{1,\ \frac{1}{2(1-p)}\Big\},$$
where $Z\sim\mathrm{Exp}(1)$. Moreover, this estimate is asymptotically exact when $p\to 0+$. Some improvements are given in [24] under certain (hazard rate) assumptions. E.V. Sugakova obtained a version of the Rényi theorem for independent, in general, not identically distributed random variables. We also mention contributions by V.V. Kalashnikov, E.F. Peköz, A. Röllin, N. Ross and T.L. Hung, which gave the estimates in terms of the Zolotarev ideal metrics. We do not reproduce all these results here since they can be viewed on pages 3 and 4 of [22] together with references to where they were published.
In Corollary 3.6 of [25], for nondegenerate i.i.d. positive random variables $X_1,X_2,\ldots$ with mean $\mu$ and finite second moment, it was proved that
$$\zeta_2\big(pS(p),\,Z(1/\mu)\big) \le p\,\big(\mathbb{E}[X_1^2]+2\mu^2\big),$$
where $S(p) := \sum_{j=1}^{N(p)}X_j$, $\zeta_2$ is the Zolotarev ideal metric of order two, and $Z(\lambda)\sim\mathrm{Exp}(\lambda)$, $\lambda>0$. In [22], the estimates for the proximity of geometric sums distributions to $Z\sim\mathrm{Exp}(1)$ were provided in the Kantorovich and $\zeta_2$ metrics. A substantial contribution of the authors of [22] is the study of random summands $X_1,X_2,\ldots$ that need not be positive (see also [26]). The general estimate for the deviation of $W_p$ from $Z\sim\mathrm{Exp}(1)$ in the ideal metric of order $s$ was proved in [27]. We do not assume that $W_p$ is constructed by means of i.i.d. random variables and, moreover, demonstrate that our estimate (for summands taking real values) involving the metric $d_{\mathcal{H}_2}$ is sharp.
The exchangeable random variables form an important class having various applications in statistics and combinatorics, see, e.g., [28]. As far as we know, the model of exchangeable random variables is studied in the context of random sums for the first time here. It is interesting that, instead of the exponential limit law, we indicate an explicit expression for the new limit law. In addition, we establish the sharp estimate of the proximity of random sums distributions to this law using $d_{\mathcal{H}_2}$.
A natural generalization of the Rényi theorem is to study a summation index following a non-geometric distribution. Along this way, the upper bound of the convergence rate of random sums of random summands to the generalized gamma distribution was proved in [29]. Theorem 3.1 in [30] contains the estimates in the Kolmogorov and Kantorovich distances for approximations of a non-negative random variable law by a specified (nongeneralized) gamma distribution. The proof relies on Stein's identity for the gamma distribution established in H.M. Luk's PhD thesis (see the reference in [30]). New estimates of the solutions of the gamma Stein equation are given in [31]. We derive the sharp estimate for the approximation of random sums by the generalized gamma law using the Zolotarev metric of order two. In the quite recent paper [32], the author established deep results concerning further generalizations of the Rényi theorem. Namely, Theorem 1 of [32] demonstrates how one can provide upper bounds for the convergence rate of specified random sums to a more general law than an exponential one using the estimates in the Rényi theorem. This approach is appealing since the author employs the ideal metric of order $s>0$. However, the sharpness of these estimates was not examined.
Note that in [33] the important “equilibrium transformation of distributions” was proposed and employed along with the Stein techniques. We will consider this transformation $X^e$ for a random variable $X$ in Section 7 and also tackle other useful transformations. In the present paper, the inverse of the “equilibrium distribution transformation” is introduced. We completely describe the possibility of constructing such a transformation and provide an explicit formula for the corresponding density. The idea to apply such an inverse transformation, whenever it exists, is based on the result of [33] demonstrating that one can obtain a more precise estimate of the proximity in the Kantorovich metric between $X^e$ and $Z$ than between $X$ and $Z$, where $Z\sim\mathrm{Exp}(1)$ and $\mathbb{E}[X]=1$, $\mathbb{E}[X^2]<\infty$. We extend this result. Moreover, we prove that in this way one can obtain a new estimate of the approximation of the Pareto distribution by an exponential one. It is shown that our new estimate is advantageous for a wide range of parameters of the Pareto distribution. Let $X^e\sim\mathrm{Pareto}(\alpha,\beta)$, i.e., the distribution function of $X^e$ is
$$F^e(x) = 1-\Big(\frac{\beta}{x+\beta}\Big)^{\alpha}, \quad x\ge0,\ \alpha>0,\ \beta>0.$$
We show that the preimage $X\sim\mathrm{Pareto}(\alpha+1,\beta)$. Thus, for any $\alpha>2$, $\beta>0$, one has $d_{\mathcal{K}}(X^e,Z)\le 1/(\alpha-1)$, where $Z\sim\mathrm{Exp}(\alpha/\beta)$ and $d_{\mathcal{K}}$ stands for the Kolmogorov distance. This bound is more precise than the previous ones applied in signal processing, see, e.g., [34].
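Anticipating Example 2, this bound is easy to probe numerically. The sketch below is ours (the grid resolution and parameter values are arbitrary assumptions); it compares the Pareto distribution function above with that of $\mathrm{Exp}(\alpha/\beta)$ on a grid.

```python
import numpy as np

def d_K_pareto_exp(alpha, beta, xmax=200.0, n=200_001):
    """Grid approximation of the Kolmogorov distance between
    Pareto(alpha, beta), F(x) = 1 - (beta/(x+beta))**alpha, and Exp(alpha/beta)."""
    x = np.linspace(0.0, xmax, n)
    f_pareto = 1.0 - (beta / (x + beta)) ** alpha
    f_exp = 1.0 - np.exp(-(alpha / beta) * x)
    return np.abs(f_pareto - f_exp).max()

for alpha in (3.0, 5.0, 10.0):
    # observed distance vs. the bound 1/(alpha - 1)
    print(alpha, d_K_pareto_exp(alpha, beta=1.0), 1.0 / (alpha - 1.0))
```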
This paper is organized as follows. After the Introduction, the auxiliary results are provided in Section 2. Here we include the material important for understanding the main results. We recall the concept of probability metrics, consider the Kolmogorov and the Kantorovich distances and examine the Zolotarev ideal metrics. We describe the basic ideas of Stein's method, especially for the exponential target distribution. In this section, we formulate a simple but useful Lemma 1, concerning the essential supremum of a Lipschitz function, and an important Lemma 2, giving the solution of the Stein equation for different functional classes. We explain the essential role of the generalized equilibrium transformation proposed in [22], which permits the study of summands taking both positive and negative values. We formulate Lemma 3 to be able to solve an integral equation involving the generalized equilibrium transformation when $\mathbb{E}[X]\neq0$ and $\mathbb{E}[X^2]<\infty$. The proofs of the auxiliary lemmas are placed in Appendix A. Section 3 is devoted to the approximation of the normalized geometric sums $W_p$ by an exponential law. Here, the sharp convergence rate is found (see Theorem 1) by means of the probability metric $d_{\mathcal{H}_2}$. The proof is based on the Lebesgue–Stieltjes integration techniques, the formula of integration by parts for functions of bounded variation, Lemma 2, various limit theorems for integrals and the important result of [22] concerning the estimates involving the Kantorovich distance. In Section 4, for the first time, an analog of the Rényi theorem is proved for a model of exchangeable random variables proposed in [35]. We demonstrate (Theorem 2) that, in contrast to Rényi's theorem, the limit distribution for the random sums under consideration is a specified mixture of two explicitly indicated laws. Moreover, the sharp convergence rate to this limit law is obtained (Theorem 3) by means of $d_{\mathcal{H}_2}$. In Section 5, the distance between the generalized gamma law and the suitably normalized sum of independent random variables is estimated when the number of summands has the generalized negative binomial distribution. Theorem 4 demonstrates that this estimate is sharp. For the proof, we employ various truncation techniques, the transformations of parameters of the initial random variables, the monotone convergence theorem and the explicit formula for the moments of order $\delta>0$ of the generalized gamma distribution obtained in [27]. Section 6 provides the pioneering study of the same problem in the framework of exchangeable random variables and also gives the sharp estimate for the $d_{\mathcal{H}_2}$ metric (Theorem 5). In Section 7, we introduce the inverse of the equilibrium transformation of probability measures. Lemma 6 contains a full description of the situations when a unique preimage $X$ of a random variable $X^e$ exists and gives an explicit formula for the distribution of $X$. This approach permits us to obtain new estimates of the closeness of probability measures in the Kolmogorov and Kantorovich metrics (Theorem 6). In particular, due to Theorem 6 and Lemmas 2 and 6, it becomes possible to find a useful estimate of the proximity of the Pareto law to the exponential one (Example 2). Section 8, containing the conclusions and indications for further research work, is followed by Appendix A and the list of references.

2. Auxiliary Results

Let $\mathcal{K} := \{h: h_z(x) = I\{x\le z\},\ x,z\in\mathbb{R}\}$, where $I\{A\} := 1$ if $A$ holds and zero otherwise. The choice $\mathcal{H}=\mathcal{K}$ in Equation (1) corresponds to the Kolmogorov distance. Note that $h$ above is a function in $x$, whereas $z$ is the index parameterizing the class.
A function $h:\mathbb{R}\to\mathbb{R}$ is called Lipschitz if
$$\mathrm{Lip}(h) := \sup_{x,u\in\mathbb{R};\,x\neq u}\frac{|h(x)-h(u)|}{|x-u|} < \infty.$$
Then,
$$|h(x)-h(u)| \le C|x-u|, \quad x,u\in\mathbb{R},$$
and, in light of Equation (4), $\mathrm{Lip}(h)$ is the smallest possible constant $C$ appearing in Equation (5). We write $\mathrm{Lip}(C)$, where $C\in[0,\infty)$, for the collection of Lipschitz functions having $\mathrm{Lip}(h)\le C$. For $s>0$, set $m=m(s) := \lceil s\rceil-1\in\mathbb{N}\cup\{0\}$ (where, for $a\in\mathbb{R}$, $\lceil a\rceil$ stands for the minimal integer which is equal to or greater than $a$). Introduce a class of functions
$$\mathcal{H}_s := \big\{h:\mathbb{R}\to\mathbb{R},\ |h^{(m)}(x)-h^{(m)}(u)| \le |x-u|^{s-m},\ x,u\in\mathbb{R}\big\}, \quad s>0.$$
As usual, $h^{(0)}(x)=h(x)$, $x\in\mathbb{R}$. We write $d_{\mathcal{H}_s}$ for the metric defined according to Equation (1) with $\mathcal{H}=\mathcal{H}_s$. V.M. Zolotarev and many other researchers defined the ideal metric $\zeta_s$ of order $s>0$ involving only bounded functions from $\mathcal{H}_s$. We will use the collections $\mathcal{H}_1$ and $\mathcal{H}_2$ without the assumption that the functions $h$ are bounded on $\mathbb{R}$. This is the reason why we write $d_{\mathcal{H}_s}$ instead of $\zeta_s$. Thus, we employ
$$\mathcal{H}_1 := \mathrm{Lip}(1), \qquad \mathcal{H}_2 := \{h: h'\in\mathrm{Lip}(1)\}.$$
Note that in the definition of $\mathcal{H}_2$ we deal with $h\in C^{(1)}(\mathbb{R})$, where the space $C^{(1)}(\mathbb{R})$ consists of functions $h:\mathbb{R}\to\mathbb{R}$ such that $h'(x)$ exists for all $x\in\mathbb{R}$ and $h'$ is continuous on $\mathbb{R}$ (evidently, a Lipschitz function is continuous). One calls $d_{\mathcal{H}_1}$ the Kantorovich metric (the term Wasserstein metric appears in the literature as well). One also uses the bounded Kantorovich metric when the class $\mathcal{H}_1$ contains all the bounded functions from $\mathrm{Lip}(1)$. The metric $\zeta_s$ was introduced in [36] and called an ideal metric in light of its important properties. The properties of the $\zeta_s$ metrics, where $s>0$, are collected in Sec. 2 of [32]. We mention in passing that various functionals are ubiquitous in assessing the proximity of distributions. In this regard, we refer, e.g., to [37,38].
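For intuition, both distances are easy to approximate from samples. The following sketch is ours (the two test distributions and the sample sizes are arbitrary assumptions); it estimates $d_{\mathcal{K}}$ empirically and uses scipy's implementation of the Kantorovich (Wasserstein-1) distance for $d_{\mathcal{H}_1}$.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
y = rng.exponential(1.0, size=50_000)      # law(Y) = Exp(1)
z = rng.gamma(2.0, 0.5, size=50_000)       # law(Z) = G(2, 2); same mean 1

# Kolmogorov distance: sup over a pooled grid of |F_Y - F_Z|
grid = np.sort(np.concatenate([y, z]))
f_y = np.searchsorted(np.sort(y), grid, side="right") / y.size
f_z = np.searchsorted(np.sort(z), grid, side="right") / z.size
print("d_K  ~", np.abs(f_y - f_z).max())

# Kantorovich metric d_{H_1} (Wasserstein-1) between the two samples
print("d_H1 ~", wasserstein_distance(y, z))
```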
To apply the Stein method, we begin with fixing the target random variable $Z$ (or its distribution) and describe a class $\mathcal{H}$ to estimate $d_{\mathcal{H}}(Y,Z)$ for a random variable $Y$ under consideration. Then, the problem is to indicate an operator $T$ (with a specified domain of definition) so that the Stein equation
$$Tf(x) = h(x)-\mathbb{E}[h(Z)]$$
has a solution $f_h(x)$, $x\in\mathbb{R}$, for each function $h\in\mathcal{H}$. After that, one can substitute $Y$ instead of $x$ in Equation (6) and take the expectation of both sides, assuming that all these expectations are finite. As a result, one comes to the relation
$$\mathbb{E}[Tf_h(Y)] = \mathbb{E}[h(Y)]-\mathbb{E}[h(Z)].$$
It is not a priori clear why the estimation of the left-hand side of Equation (7) is more tractable than the direct estimation of $|\mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]|$ for $h\in\mathcal{H}$. However, in many situations justifying the method, this occurs. The choice of $T$ depends on the distribution of $Z$. Note that in certain cases (e.g., when $Z$ follows the Poisson law), one considers functions $f$ defined on a subset of $\mathbb{R}$. We emphasize that the construction of the operator $T$ is a nontrivial problem, see, e.g., [33,39,40,41].
The basic idea here is the following. For many probability distributions (Gaussian, Laplace, exponential, etc.), one can find an operator $T$ characterizing the law of a target variable $Z$. In other words, for a rather large class of functions $f$, $\mathbb{E}[Tf(Y)]=0$ if and only if $\mathrm{law}(Y)=\mathrm{law}(Z)$ (i.e., the laws of $Y$ and $Z$ coincide). Thus, if $|\mathbb{E}[Tf_h(Y)]|$ is small enough for a suitable class of functions $h$, this leads to the assertion that the law of $Y$ is close (in a sense) to the law of $Z$. One has to verify that this kind of “continuity” takes place. Clearly, if, for any $h\in\mathcal{H}$, where $\mathcal{H}$ defines the integral probability metric in Equation (1), one can find a solution $f_h$ of Equation (6), then the relation $\mathbb{E}[Tf_h(Y)]=0$ for all $f_h$, $h\in\mathcal{H}$, yields $d_{\mathcal{H}}(Y,Z)=0$ and, consequently, $\mathrm{law}(Y)=\mathrm{law}(Z)$.
Further, we assume that $Z\sim\mathrm{Exp}(\lambda)$, i.e., $Z$ has the exponential distribution with parameter $\lambda>0$. In this case (see, e.g., Sec. 5 in [17]), one uses the operator
$$Tf(x) := f'(x)-\lambda f(x)+\lambda f(0), \quad x\in\mathbb{R},\ \lambda>0,$$
and writes the Stein Equation (6) as follows:
$$f'(x)-\lambda f(x)+\lambda f(0) = h(x)-\mathbb{E}[h(Z)], \quad x\in\mathbb{R}.$$
It should be stipulated that $\mathbb{E}[h(Z)]\in\mathbb{R}$ for a test function $h\in\mathcal{H}$, and that there exists a differentiable solution $f$ of Equation (9). Therefore, if one can find such a solution $f$, then
$$\mathbb{E}[f'(Y)]-\lambda\,\mathbb{E}[f(Y)]+\lambda f(0) = \mathbb{E}[h(Y)]-\mathbb{E}[h(Z)]$$
under the hypothesis that all these expectations are finite. If $f:\mathbb{R}\to\mathbb{R}$ is absolutely continuous, then (see, e.g., Theorem 13.18 of [42]) $f'(x)$ exists for almost all $x\in\mathbb{R}$ with respect to the Lebesgue measure. Moreover, one can find a function $g:\mathbb{R}\to\mathbb{R}$, integrable on each interval, to guarantee, for each $x,u\in\mathbb{R}$, that
$$f(x) = f(u)+\int_u^x g(v)\,dv,$$
where $g(v)=f'(v)$ for almost all $v\in\mathbb{R}$. Thus, $(Tf)(x)$ is defined for such $f$ according to Equation (8) for almost all $x\in\mathbb{R}$. In general, for an arbitrary random variable $Y$, one cannot write $\mathbb{E}[(Tf)(Y)]$, since the value of the expectation depends on the choice of a version of $(Tf)(x)$, $x\in\mathbb{R}$. Indeed, let $B\in\mathcal{B}(\mathbb{R})$ be such that $m(B)=0$, where $m$ stands for the Lebesgue measure. Assume that $Y$ takes values in $B$. Then, it is clear that $\mathbb{E}[(Tf)(Y)]$ depends on the choice of the version of the function $(Tf)(x)$ defined on $\mathbb{R}$. However, if the distribution $P_Y$ of a random variable $Y$ has a density with respect to $m$, then $\mathbb{E}[(Tf)(Y)]$ will be the same for any version of $Tf$ (with respect to the Lebesgue measure). In certain cases, the Stein operator is applied to smoothed functions (see, e.g., [33,43]). Otherwise, Equation (6) does not hold at each point of $\mathbb{R}$ (see, e.g., Lemma 2.2 in [16]), and complementary efforts are needed. For our study, it is convenient to employ in Equation (8), in the capacity of $f'(x)$, $x\in\mathbb{R}$, the right derivative. In many cases, for a real-valued function $f$ defined on a fixed set $D\subset\mathbb{R}$, one considers $\sup_{x\in D}|f(x)|$; we use instead the “essential supremum”. Recall that a function $\tilde f$ is a version of $f$ (and vice versa) if the measure (here, the Lebesgue measure) of the set of points $x$ such that $\tilde f(x)\neq f(x)$ is zero. The notation $\|f\|$ means that one takes $\inf_{\tilde f}\sup_{x\in D}|\tilde f(x)|$, where $\tilde f$ ranges over the class of all versions of $f$. Clearly, $\|f\|$ will be the same if we change $f$ on a subset of $D$ having measure zero. Thus, we write $\|f'\|$ instead of $\|g\|$ for $g$ appearing in Equation (11). The following simple observation is useful. Its proof is provided in Appendix A.
Lemma 1.
A function $h$ is Lipschitz on $\mathbb{R}$ with $\mathrm{Lip}(h)=C<\infty$ if and only if $h$ is absolutely continuous and (for its essential supremum) $\|h'\|=C<\infty$.
Remark 1.
Note that $0\le h(x)\le1$, $x\in\mathbb{R}$, for any $h\in\mathcal{K}$. If, for some positive constant $C$, $h\in\mathrm{Lip}(C)$, then Equation (5) yields that $|h(x)|\le C|x|+|h(0)|$. If $h'$ is a Lipschitz function (with $\mathrm{Lip}(h')=C$), then $h''(x)$ exists for almost all $x\in\mathbb{R}$, and an application of Lemma 1 gives
$$|h'(x)-h'(0)| = \Big|\int_0^x h''(u)\,du\Big| \le C|x|, \quad x\in\mathbb{R}.$$
Consequently, $|h'(x)|\le A|x|+B$ for some positive $A$, $B$ (one can take $A=C$, $B=|h'(0)|$) and any $x\in\mathbb{R}$. As $h'(x)$ is continuous on each interval, it follows that $|h(x)|\le ax^2+b|x|+c$ for some positive $a$, $b$, $c$ and all $x\in\mathbb{R}$ (with $a=C/2$, $b=|h'(0)|$, $c=|h(0)|$). Therefore, $|h(x)|\le A_0x^2+B_0$ for some positive $A_0$, $B_0$ and each $x\in\mathbb{R}$.
Lemma 2.
For any $\lambda>0$ and each $h\in\mathcal{K}\cup\mathcal{H}_1\cup\mathcal{H}_2$, the equation
$$f'(x)-\lambda f(x) = h(x), \quad x\in\mathbb{R},$$
has a solution
$$f_h(x) = -e^{\lambda x}\int_x^{\infty}h(u)\,e^{-\lambda u}\,du, \quad x\in\mathbb{R},$$
where $f_h(0) = -\mathbb{E}[h(Z)]/\lambda$. If $h\in\mathcal{K}$, then for all $x\in\mathbb{R}$ there exists $f_h'(x)$ and $\|f_h'\|\le1$. If $h\in\mathcal{H}_1\cup\mathcal{H}_2$, then $f_h'$ is defined on $\mathbb{R}$ and $\|f_h'\|\le\|h'\|/\lambda$. For $h\in\mathcal{H}_2$, the function $f_h''$ is defined on $\mathbb{R}$ and $\|f_h''\|\le\min\{2\|h'\|,\ \|h''\|/\lambda\}$.
The right-hand side of Equation (13) is well defined for each $x\in\mathbb{R}$ in light of Remark 1. Lemma 4.1 of [33] contains, for $\lambda=1$, some statements of Lemma 2. We will use the above estimates for any $\lambda>0$. Estimates for $h\in\mathcal{H}_2$ were not considered in [33]. The proof of Lemma 2 is given in Appendix A.
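To see Lemma 2 at work, one can check numerically that $f_h$ solves Equation (13). In the sketch below (ours), the test function $h(x)=\sin x$ and the value $\lambda=2$ are arbitrary assumptions, and the factor $e^{\lambda x}$ is folded into the integrand to avoid overflow.

```python
import numpy as np
from scipy.integrate import quad

lam = 2.0
h = np.sin                    # a Lipschitz test function, Lip(h) = 1

def f_h(x):
    # f_h(x) = -e^{lam x} * int_x^inf h(u) e^{-lam u} du,
    # computed as -int_x^inf h(u) e^{-lam (u - x)} du for numerical stability
    val, _ = quad(lambda u: h(u) * np.exp(-lam * (u - x)), x, np.inf)
    return -val

# verify f_h'(x) - lam * f_h(x) = h(x) via a central difference
for x in (-1.0, 0.0, 2.5):
    eps = 1e-5
    deriv = (f_h(x + eps) - f_h(x - eps)) / (2.0 * eps)
    print(x, deriv - lam * f_h(x), h(x))   # last two columns nearly coincide
```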
The following concept was introduced in [33].
Definition 1
([33]). Let $X$ be a non-negative random variable with finite $\mathbb{E}[X]>0$. One says that a random variable $X^e$ has the equilibrium distribution with respect to $X$ if, for any Lipschitz function $f:\mathbb{R}\to\mathbb{R}$,
$$\mathbb{E}[f(X)]-f(0) = \mathbb{E}[X]\,\mathbb{E}[f'(X^e)].$$
Note that Definition 1 deals separately with the distributions of $X$ and $X^e$. One says that $X^e$ is the result of the equilibrium transformation applied to $X$. The same terminology is used for the transition from $\mathrm{law}(X)$ to $\mathrm{law}(X^e)$. For the sake of completeness, we explain in Appendix A (Comments to Definition 1) why one can take the law of $X^e$ having the density, with respect to the Lebesgue measure,
$$p^e(x) = \begin{cases}\dfrac{1}{\mathbb{E}[X]}\,\mathbb{P}(X>x), & x\ge0,\\[4pt] 0, & x<0,\end{cases}$$
to guarantee the validity of Equation (14).
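For instance (an illustration of ours, not from the paper), if $X\sim\mathrm{Uniform}(0,1)$, then $\mathbb{P}(X>x)=1-x$ and $\mathbb{E}[X]=1/2$, so $p^e(x)=2(1-x)$ on $[0,1]$, i.e., $X^e\sim\mathrm{Beta}(1,2)$. The sketch below checks Equation (14) by Monte Carlo for one Lipschitz test function of our choosing.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.uniform(0.0, 1.0, n)      # X ~ Uniform(0,1)
xe = rng.beta(1.0, 2.0, n)        # X^e ~ Beta(1,2): density 2(1-x) on [0,1]

f = lambda t: np.sin(3.0 * t)             # a Lipschitz test function
f_prime = lambda t: 3.0 * np.cos(3.0 * t)

lhs = f(x).mean() - f(0.0)                # E[f(X)] - f(0)
rhs = x.mean() * f_prime(xe).mean()       # E[X] E[f'(X^e)]
print(lhs, rhs)                           # agree up to Monte Carlo error
```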
Remark 2.
For a non-negative random variable $X$ with finite $\mathbb{E}[X]>0$, one can construct a random variable $X^e$ having the density (15). Accordingly, we then have a random vector $(X,X^e)$ with specified marginal distributions. However, the joint law of $X$ and $X^e$ is not fixed and can be chosen in an appropriate way. If $X_1,X_2,\ldots$ is a sequence of independent random variables, we will assume that the sequence $(X_n,X_n^e)_{n\in\mathbb{N}}$ consists of independent vectors and that these vectors are independent of all other random variables under consideration that are independent of $(X_n)_{n\in\mathbb{N}}$.
In the recent paper [22], a generalization of the equilibrium transformation of distributions was proposed without assuming that the random variable $X$ is non-negative.
Definition 2
([22]). Let $X$ be a random variable having a distribution function $F(x) := \mathbb{P}(X\le x)$, $x\in\mathbb{R}$. Assume the existence of finite $\mathbb{E}[X]\neq0$. An equilibrium distribution function corresponding to $X$ (or to $F$) is introduced by way of
$$F^e(x) := \begin{cases}-\dfrac{1}{\mathbb{E}[X]}\displaystyle\int_{-\infty}^{x}F(u)\,du, & x\le0,\\[6pt] -\dfrac{\mathbb{E}[X^-]}{\mathbb{E}[X]}+\dfrac{1}{\mathbb{E}[X]}\displaystyle\int_0^x\big(1-F(u)\big)\,du, & x>0,\end{cases}$$
where $X^- := -X\,I\{X<0\}$. This function can be written as $F^e(x) = \int_{-\infty}^{x}p^e(u)\,du$, where
$$p^e(x) = \begin{cases}-\dfrac{1}{\mathbb{E}[X]}\,F(x), & x\le0,\\[4pt] \dfrac{1}{\mathbb{E}[X]}\,\big(1-F(x)\big), & x>0;\end{cases}$$
thus, $p^e$ is a density (with respect to the Lebesgue measure) of a signed measure $Q^e$ corresponding to $F^e$. In other words, Equation (17) demonstrates the Jordan decomposition (see, e.g., Sec. 29 of [44]) of $Q^e$.
Clearly, for a non-negative random variable, the functions defined in Equations (15) and (16) coincide. For a nonpositive random variable, the function $F^e$ appearing in Equation (16) is a distribution function of a probability measure. In general, when $X$ can take both positive and negative values, the function introduced in Equation (16) is not a distribution function. We will call $F^e$ the generalized equilibrium distribution function. Note that $|p^e(x)|\le\frac{1}{|\mathbb{E}[X]|}$. Thus, $F^e$ is a Lipschitz function and consequently continuous ($F^e(x)$ is well defined for each $x\in\mathbb{R}$ since $\mathbb{E}[X]$ is finite and nonzero). Moreover, $F^e$ is absolutely continuous, being a Lipschitz function. Each absolutely continuous function has bounded variation. If $G$ is a function of bounded variation, then $G = G_1-G_2$, where $G_1$ and $G_2$ are nondecreasing functions (see, e.g., [42], Theorem 12.18). One can employ the canonical choice $G_1(x) := \mathrm{Var}_0^x(G)$, where $\mathrm{Var}_a^b(G)$ means the variation of $G$ on $[a,b]$, $-\infty<a\le b<\infty$ (if $a>b$, then $\mathrm{Var}_a^b(G) := -\mathrm{Var}_b^a(G)$). If $G$ is right-continuous (on $\mathbb{R}$), then evidently $G_1$ and $G_2$ are also right-continuous. Thus, for a right-continuous $G$ having bounded variation, a nondecreasing function $G_i$ in its representation corresponds to a $\sigma$-finite measure $Q_i$ on $\mathcal{B}(\mathbb{R})$, $i=1,2$. More precisely, there exists a unique $\sigma$-finite measure $Q_i$ on $\mathcal{B}(\mathbb{R})$ such that, for each finite interval $(a,b]$, $Q_i((a,b]) = G_i(b)-G_i(a)$, $i=1,2$. Recall that one writes, for the Lebesgue–Stieltjes integral with respect to a function $G$,
$$\int_{\mathbb{R}}f(u)\,dG(u) := \int_{\mathbb{R}}f(u)\,dG_1(u)-\int_{\mathbb{R}}f(u)\,dG_2(u),$$
whenever the integrals on the right-hand side exist (with values in $[-\infty,\infty]$), and the cases $\infty-\infty$ or $-\infty+\infty$ are excluded. The integral $\int_{\mathbb{R}}f(u)\,dG_i(u)$ means the integration with respect to the measure $Q_i$, $i=1,2$. The signed measure $Q$ corresponding to $G$ is $Q_1-Q_2$. Thus, $\int_{\mathbb{R}}f(u)\,dG(u)$ means the integration with respect to the signed measure $Q$. Note that if $G = U_1-U_2$, where $U_i$ is right-continuous and nondecreasing ($i=1,2$), then
$$\int_{\mathbb{R}}f(u)\,dG_1(u)-\int_{\mathbb{R}}f(u)\,dG_2(u) = \int_{\mathbb{R}}f(u)\,dU_1(u)-\int_{\mathbb{R}}f(u)\,dU_2(u).$$
The left-hand side and the right-hand side of Equation (19) make sense simultaneously and, if so, are equal to each other. Indeed, for any finite interval $(a,b]$ ($a\le b$), one has $G_1(b)-G_1(a)-(G_2(b)-G_2(a)) = U_1(b)-U_1(a)-(U_2(b)-U_2(a))$. Thus, the signed measures corresponding to $G_1-G_2$ and $U_1-U_2$ coincide on $\mathcal{B}(\mathbb{R})$. We mention in passing that one can also employ the Jordan decomposition of a signed measure.
For $F^e$ introduced in Equation (16), the analog of Equation (14) has the form
$$\mathbb{E}[f(X)]-f(0) = \mathbb{E}[X]\int_{\mathbb{R}}f'(x)\,dF^e(x).$$
Taking into account Equation (17), one can rewrite Equation (20) equivalently as follows:
$$\mathbb{E}[f(X)]-f(0) = \int_{(-\infty,0]}f'(x)\,\big(-F(x)\big)\,dx+\int_{(0,\infty)}f'(x)\,\big(1-F(x)\big)\,dx.$$
The right-hand side of the latter relation does not depend on the choice of a version of $f'$. Due to Theorem 1(d) of [22], Equation (20) is valid for any Lipschitz function $f$. Evidently, an arbitrary function $f\in\mathcal{H}_2$ need not be Lipschitz, and vice versa.
Lemma 3.
Let $X$ be a random variable such that $\mathbb{E}[X^2]<\infty$ and $\mathbb{E}[X]\neq0$. Then, Equation (20) is satisfied for all $f\in\mathcal{H}_2$.
The proof is provided in Appendix A.

3. Limit Theorem for Geometric Sums of Independent Random Variables

Consider $N_p\sim\mathrm{Geom}(p)$, see Equation (2). In other words, $N_p$ has the geometric distribution with parameter $p$. Let $X_1,X_2,\ldots$ be a sequence of independent random variables such that $\mathbb{E}[X_k]=\mu$, where $\mu\in\mathbb{R}$, $\mu\neq0$, $k\in\mathbb{N}$. Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent. Consider the normalized geometric sum
$$W_p := \frac{p}{\mu(1-p)}\sum_{k=1}^{N_p}X_k,$$
introduced in Equation (3). Since $N_p$ can take the zero value, set, as usual, $\sum_{k=1}^{0}X_k := 0$. One can see that $W_p$ is the random sum $S_p := \sum_{k=1}^{N_p}X_k$ normalized by $\mathbb{E}[X_1]\,\mathbb{E}[N_p]$.
Lemma 4.
Let $X_1,X_2,\ldots$ and $N_p$, where $p\in(0,1)$, be the random variables described above in this section. Then, the following relations hold:
$$\mathbb{E}[W_p] = 1, \qquad \mathbb{E}|W_p| \le \frac{\sup_{k\in\mathbb{N}}\mathbb{E}|X_k|}{|\mu|},$$
$$\mathbb{E}[W_p^2] = \frac{p}{\mu^2(1-p)}\,\mathbb{E}[X_{N_p+1}^2]+2.$$
Proof. 
Recall that
$$\mathbb{E}[N_p] = \sum_{k=1}^{\infty}k\,p(1-p)^{k} = \frac{1-p}{p},$$
$$\mathbb{E}[N_p^2] = \sum_{k=1}^{\infty}k^2\,p(1-p)^{k} = \frac{(1-p)(2-p)}{p^2}.$$
Thus, one has
$$\mathbb{E}[W_p] = \frac{p}{\mu(1-p)}\sum_{k=1}^{\infty}k\mu\,\mathbb{P}(N_p=k) = \frac{p}{1-p}\,\mathbb{E}[N_p] = 1.$$
Clearly, $\mathbb{E}|X_k|<\infty$ since $\mathbb{E}[X_k]$ is finite ($k\in\mathbb{N}$). Therefore,
$$\mathbb{E}|W_p| \le \frac{p}{|\mu|(1-p)}\sum_{k=1}^{\infty}\Big(\sum_{i=1}^{k}\mathbb{E}|X_i|\Big)\,\mathbb{P}(N_p=k) \le \frac{\sup_{k\in\mathbb{N}}\mathbb{E}|X_k|}{|\mu|}.$$
Set $\nu_k := \mathbb{E}[X_k^2]$, $k\in\mathbb{N}$. One has
$$\mathbb{E}[S_p^2] = \sum_{k=1}^{\infty}\mathbb{P}(N_p=k)\,\mathbb{E}\Big[\Big(\sum_{i=1}^{k}X_i\Big)^2\Big] = \sum_{k=1}^{\infty}p(1-p)^k\Big(\sum_{i=1}^{k}\nu_i+k(k-1)\mu^2\Big).$$
According to Equations (24) and (25), one derives the formula
$$\sum_{k=1}^{\infty}p(1-p)^k\,k(k-1)\,\mu^2 = \mu^2\Big(\frac{(1-p)(2-p)}{p^2}-\frac{1-p}{p}\Big) = \frac{2\mu^2(1-p)^2}{p^2}.$$
Convergence of the series $\sum_{k=1}^{\infty}p(1-p)^k\sum_{i=1}^{k}\nu_i$, having non-negative terms, holds simultaneously with the validity of the inequality $\mathbb{E}[W_p^2]<\infty$. Changing the order of summation, we obtain
$$\sum_{k=1}^{\infty}p(1-p)^k\sum_{i=1}^{k}\nu_i = \sum_{i=1}^{\infty}(1-p)^{i}\nu_i = \frac{1-p}{p}\,\mathbb{E}[X_{N_p+1}^2].$$
The latter formula and Equations (26), (27) yield
$$\mathbb{E}[W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\mathbb{E}[S_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\Big(\frac{1-p}{p}\,\mathbb{E}[X_{N_p+1}^2]+\frac{2\mu^2(1-p)^2}{p^2}\Big) = \frac{p}{\mu^2(1-p)}\,\mathbb{E}[X_{N_p+1}^2]+2.$$
Equation (23) is established. □
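The identity (23) is easy to test by simulation; in this sketch (ours, with arbitrary parameter values), the summands are i.i.d. Uniform(0,1), so $\mathbb{E}[X_{N_p+1}^2] = \mathbb{E}[X_1^2] = 1/3$.

```python
import numpy as np

rng = np.random.default_rng(3)
p, mu, second_moment = 0.05, 0.5, 1.0 / 3.0   # Uniform(0,1) summands

n = rng.geometric(p, size=100_000) - 1        # N_p on {0, 1, ...}
s = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in n])
w = s * p / (mu * (1.0 - p))                  # normalized geometric sum W_p

print(np.mean(w ** 2))                                  # empirical E[W_p^2]
print(p / (mu ** 2 * (1.0 - p)) * second_moment + 2.0)  # Equation (23)
```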
The proof of Theorem 3.1 in [45] shows, for non-negative i.i.d. random variables $X_1,X_2,\ldots$ (when $\mu=1$, see Formula (3.15) in [45]), that the equilibrium transformation of the $W_p$ distribution has the following form:
$$W_p^e = \frac{p}{\mu(1-p)}\Big(\sum_{k=1}^{N_p}X_k+X_{N_p+1}^e\Big) = W_p+\frac{p}{\mu(1-p)}\,X_{N_p+1}^e,$$
where $X_{N_p+1}^e$ means that we construct $X_1^e,X_2^e,\ldots$ and then take the random index $N_p+1$. In other words,
$$X_{N_p+1}^e = \sum_{n=0}^{\infty}X_{n+1}^e\,I\{N_p=n\}.$$
It was explained in Section 2 that the generalized equilibrium distribution function $F_{W_p}^e(x)$ (see Definition 2) need not be a distribution function when the summands $X_1,X_2,\ldots$ can take values of different signs. However, employing this function, one can establish the following result.
Theorem 1.
Let $X_1,X_2,\ldots$ be a sequence of independent random variables having finite $\mathbb{E}[X_k]=\mu$, where $\mu\neq0$, $k\in\mathbb{N}$. Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ are independent, where $N_p\sim\mathrm{Geom}(p)$, $0<p<1$. If $Z\sim\mathrm{Exp}(1)$, then
$$d_{\mathcal{H}_2}(W_p,Z) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p},$$
where $W_p$ was introduced in Equation (22).
Proof. 
If $\mathbb{E}[W_p^2]=\infty$, then $d_{\mathcal{H}_2}(W_p,Z)=\infty$ since, for the function $h(x)=x^2/2$, $x\in\mathbb{R}$, belonging to $\mathcal{H}_2$, one has $\mathbb{E}[h(W_p)]=\infty$, whereas $\mathbb{E}[h(Z)]<\infty$. According to Equation (23), $\mathbb{E}[W_p^2]$ and $\mathbb{E}[X_{N_p+1}^2]$ are both finite or both infinite. Consequently, Equation (29) is true when $\mathbb{E}[W_p^2]=\infty$.
Let us turn to the case $\mathbb{E}[W_p^2]<\infty$. At first, we obtain an upper bound for $d_{\mathcal{H}_2}(W_p,Z)$. Take $h\in\mathcal{H}_2$. Applying Lemmas 1 and 2 and Remark 1, one can write, due to Stein's Equation (10), that
$$\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Z)]\big| = \big|\mathbb{E}[f_h'(W_p)]-\mathbb{E}[f_h(W_p)]+f_h(0)\big|.$$
Using the generalized equilibrium distribution transformation (20), one obtains:
$$\big|\mathbb{E}[f_h'(W_p)]-\mathbb{E}[f_h(W_p)]+f_h(0)\big| = \Big|\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x)-\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x)\Big|.$$
Due to Lemma 3, this is true for $h\in\mathcal{H}_2$ because $f_h\in\mathcal{H}_2$ according to Lemma 2 (with $\lambda=1$). Next, we employ the relation
$$\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x)-\int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x) = \int_{\mathbb{R}}f_h'(x)\,d\big(F_{W_p}-F_{W_p}^e\big)(x).$$
Evidently, one can write $\int_{\mathbb{R}}|f_h'(x)|\,dF_{W_p}(x)<\infty$. The notation $dF_{W_p}^e(x)$ in the integral refers to the Lebesgue–Stieltjes integral with respect to the function $F_{W_p}^e(x)$ of bounded variation. In fact, the integral with integrator $dF_{W_p}^e(x)$ means that the integration employs the signed measure $Q_p^+-Q_p^-$, where $Q_p^+$ and $Q_p^-$ have the following densities with respect to the Lebesgue measure:
$$q_p^+(x) := \big(1-F_{W_p}(x)\big)\,I_{(0,\infty)}(x), \qquad q_p^-(x) := F_{W_p}(x)\,I_{(-\infty,0]}(x), \quad x\in\mathbb{R};$$
here we took into account that $\mathbb{E}[W_p]=1$ according to Lemma 4. Then, for any $-\infty<a<b<\infty$, one ascertains that the variation of $F_{W_p}^e$ on $[a,b]$ is given by the formula $\mathrm{Var}_a^b(F_{W_p}^e) = \int_a^b|p_{W_p}^e(u)|\,du$ (see, e.g., Theorem 4.4.7 of [46]). Note that, for any $-\infty<a<b<\infty$,
$$\int_a^b|p_{W_p}^e(u)|\,du \le \mathbb{E}|W_p| < \infty$$
according to Lemma 4. Thus, $F_{W_p}^e$ is a function of bounded variation. On the right-hand side of Equation (32), we take the Lebesgue–Stieltjes integral with respect to the function of bounded variation $(F_{W_p}-F_{W_p}^e)(x)$, $x\in\mathbb{R}$. Let $F_{W_p}^e(x) = F_{p,1}^e(x)-F_{p,2}^e(x)$, $x\in\mathbb{R}$, where the $F_{p,i}^e$ are nondecreasing right-continuous functions (even continuous, since $F_{W_p}^e$ is continuous), $i=1,2$. Thus,
$$F_{W_p}(x)-F_{W_p}^e(x) = \big(F_{W_p}(x)+F_{p,2}^e(x)\big)-F_{p,1}^e(x), \quad x\in\mathbb{R}.$$
With the help of Equations (18) and (19), one makes sure that, for each $n\in\mathbb{N}$,
$$\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = \int_{(-n,n]}f_h'(x)\,d\big(F_{W_p}(x)+F_{p,2}^e(x)\big)-\int_{(-n,n]}f_h'(x)\,dF_{p,1}^e(x)$$
$$= \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)+\int_{(-n,n]}f_h'(x)\,dF_{p,2}^e(x)-\int_{(-n,n]}f_h'(x)\,dF_{p,1}^e(x)$$
$$= \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)-\int_{(-n,n]}f_h'(x)\,d\big(F_{p,1}^e(x)-F_{p,2}^e(x)\big) = \int_{(-n,n]}f_h'(x)\,dF_{W_p}(x)-\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x).$$
All the integrals in the latter formulas are finite. According to Lemma 2 and Remark 1, one can write $|f_h'(x)|\le A_0|x|+B_0$, where $A_0$, $B_0$ are positive constants. Thus, the Lebesgue dominated convergence theorem ensures that
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,dF_{W_p}(x) = \int_{\mathbb{R}}f_h'(x)\,dF_{W_p}(x),$$
where the latter integral is finite. Indeed,
$$\int_{\mathbb{R}}(A_0|x|+B_0)\,dF_{W_p}(x) = A_0\,\mathbb{E}|W_p|+B_0 < \infty$$
according to Lemma 4. By the same lemma, one has $\mathbb{E}[W_p]=1$. Therefore, on account of Equation (17), the following relation holds:
$$\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x) = \int_{(-n,0]}f_h'(x)\,\big(-F_{W_p}(x)\big)\,dx+\int_{(0,n]}f_h'(x)\,\big(1-F_{W_p}(x)\big)\,dx,$$
whereas Corollary 2, Sec. 6, Ch. II of [47] and Lemma 4 entail that
$$\int_{(-\infty,0]}(A_0|x|+B_0)\,F_{W_p}(x)\,dx+\int_{(0,\infty)}(A_0|x|+B_0)\,\big(1-F_{W_p}(x)\big)\,dx \le A_0\,\mathbb{E}[W_p^2]+B_0\,\mathbb{E}|W_p| < \infty.$$
The Lebesgue dominated convergence theorem for $\sigma$-finite measures and Equation (34) yield
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,dF_{W_p}^e(x) = \int_{\mathbb{R}}f_h'(x)\,dF_{W_p}^e(x),$$
where the latter integral is finite. Now, we show that
$$\lim_{n\to\infty}\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = \int_{\mathbb{R}}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x).$$
Note that $f_h'(x)\,I_{(-n,n]}(x)\to f_h'(x)$ at each $x\in\mathbb{R}$ as $n\to\infty$. To apply the version of the Lebesgue theorem to integrals over a signed measure, it suffices (see, e.g., [48], p. 74) to verify that
$$\int_{\mathbb{R}}|f_h'(x)|\,\big|d(F_{W_p}-F_{W_p}^e)(x)\big| < \infty,$$
where $|dG|$ means that one evaluates the integral with respect to the measure corresponding to the total variation of the measure determined by a right-continuous function $G$ of bounded variation. The extension of the Lebesgue dominated convergence theorem to signed measures is an immediate corollary of the Jordan decomposition mentioned above. Using this decomposition, one obtains the inequality
$$\int_{\mathbb{R}}|f_h'(x)|\,\big|d(F_{W_p}-F_{W_p}^e)(x)\big| \le \int_{\mathbb{R}}|f_h'(x)|\,dF_{W_p}(x)+\int_{\mathbb{R}}|f_h'(x)|\,\big|dF_{W_p}^e(x)\big|.$$
Due to Remark 1, one has $|f_h'(x)|\le A_0|x|+B_0$ for all $x\in\mathbb{R}$ and some positive constants $A_0$, $B_0$. Then, Equations (33) and (34) yield (as $F_{W_p}$ generates a probability measure)
$$\int_{\mathbb{R}}(A_0|x|+B_0)\,dF_{W_p}(x)+\int_{\mathbb{R}}(A_0|x|+B_0)\,\big|dF_{W_p}^e(x)\big| < \infty.$$
The functions $f_h'$ and $F_{W_p}-F_{W_p}^e$ are right-continuous and have bounded variation. Then each of them can be represented as the difference of right-continuous nondecreasing functions, and using, for any $n\in\mathbb{N}$, the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2 of [47]), one has
$$\int_{(-n,n]}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = f_h'(x)\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\Big|_{-n}^{n}-\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,df_h'(x).$$
Since the integral on the right-hand side of Equation (35) is finite, it holds that
$$f_h'(x)\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\to 0, \quad x\to\infty \ \text{or} \ x\to-\infty$$
(the proof is similar to that of Corollary 2, Sec. 6, Ch. 2 in [47]). Then,
$$\int_{\mathbb{R}}f_h'(x)\,d(F_{W_p}-F_{W_p}^e)(x) = -\lim_{n\to\infty}\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,df_h'(x).$$
The function $f_h'$ is absolutely continuous according to Lemma 2. Hence (see also Equations (36) and (A12) in Appendix A), we get
$$\Big|\int_{\mathbb{R}}f_h'(x)\,d\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\Big| = \Big|\lim_{n\to\infty}\int_{(-n,n]}\big(F_{W_p}(x)-F_{W_p}^e(x)\big)\,f_h''(x)\,dx\Big| \le \|f_h''\|\int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx \le \int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx,$$
because $\|f_h''\|\le\|h''\|\le1$ due to Lemmas 1 and 2. Using the homogeneity of the Kantorovich metric for signed measures, which is derived from formula (20) of [22] (see Lemma 1(a) there), and applying Lemma 3 of that paper, we can write
$$\int_{\mathbb{R}}\big|F_{W_p}(x)-F_{W_p}^e(x)\big|\,dx = \frac{p}{|\mu|(1-p)}\int_{\mathbb{R}}\big|F_{S_{N_p}}(x)-F_{S_{N_p}}^e(x)\big|\,dx \le \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
Relations (30), (31), (32), (37), (38) and Lemmas 1 and 2 guarantee that $d_{\mathcal{H}_2}(W_p,Z)$ does not exceed the right-hand side of Equation (29).
Now, we turn to the lower bound for $d_{\mathcal{H}_2}(W_p,Z)$. Choose $h(x)=x^2/2$ as the test function. Since $h\in\mathcal{H}_2$, we can write
$$d_{\mathcal{H}_2}(W_p,Z) \ge \big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Z)]\big| = \frac{1}{2}\big|\mathbb{E}[W_p^2]-\mathbb{E}[Z^2]\big|.$$
For a random variable $Z$ following the exponential law $\mathrm{Exp}(1)$, one has $\mathbb{E}[Z^2]=2$. Formula (23) of Lemma 4 yields
$$d_{\mathcal{H}_2}(W_p,Z) \ge \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
Taking into account Formula (38), we come to the desired statement. The proof is complete. □
Remark 3.
Evidently,
$$\mathbb{E}[X_{N_p+1}^2] = \sum_{n=0}^{\infty}\mathbb{E}[X_{n+1}^2]\,p(1-p)^n.$$
Thus, one obtains
$$\mathbb{E}[X_{N_p+1}^2] \le \sup_{n\in\mathbb{N}}\mathbb{E}[X_n^2],$$
and the latter inequality becomes an equality when $\mathbb{E}[X_n^2]=\mathbb{E}[X_1^2]$ for all $n\in\mathbb{N}$. Therefore, the statement of Theorem 1 can be written as follows:
$$d_{\mathcal{H}_2}(W_p,Z) \le \frac{\sup_{n\in\mathbb{N}}\mathbb{E}[X_n^2]}{2\mu^2}\cdot\frac{p}{1-p},$$
and this becomes an equality when $\mathbb{E}[X_n^2]=\mathbb{E}[X_1^2]$ for all $n\in\mathbb{N}$.
Remark 4.
In [22], the authors proved the following inequality:
$$d_{\mathcal{H}_2}(W_p,Z) \le \frac{3\,\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
We established the sharp estimate with the factor $1/2$ instead of $3/2$, having employed Equation (20) for a class of functions comprising solutions of the Stein equation for $h\in\mathcal{H}_2$. The estimate with the factor $1/2$ was also obtained in the recent paper [49], but for i.i.d. summands. The lower bounds were not provided there. In our Theorem 1, the summands have the same expectations but need not have the same distribution.
Remark 5.
If the summands of $W_p$ are non-negative, we consider $W_p^e$ appearing in Equation (28). Applying Theorem 1(i) of [22] to relation (29), one obtains
$$d_{\mathcal{H}_1}(W_p^e,Z) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\cdot\frac{p}{1-p}.$$
For $i\in\mathbb{N}$, consider a random variable $X_i$ having distribution $\mathrm{Exp}(1/\mu)$. Then $X_i^e\sim\mathrm{Exp}(1/\mu)$ and, consequently, $X_{N_p+1}^e\sim\mathrm{Exp}(1/\mu)$. We can choose $X_i^e$, $i\in\mathbb{N}$, according to Remark 2. Then, the distribution of $W_p^e$ will be the same if we change $X_{N_p+1}^e$ to $X_{N_p+1}$ in Equation (28). In such a way, $W_p^e$ is a normalized sum of a random number of independent random variables. Using the homogeneity of the Kantorovich metric, one has
$$d_{\mathcal{H}_1}\Big(\frac{p}{\mu}\sum_{k=1}^{N_p+1}X_k,\ (1-p)Z\Big) = (1-p)\,d_{\mathcal{H}_1}\Big(\frac{p}{\mu(1-p)}\sum_{k=1}^{N_p+1}X_k,\ Z\Big) = \frac{\mathbb{E}[X_{N_p+1}^2]}{2\mu^2}\,p.$$
Therefore, for an arbitrary sequence $(X_k)_{k\in\mathbb{N}}$ satisfying the conditions of Theorem 1, the upper bound for the left-hand side of Equation (40) is not less than the right-hand side of Equation (40).

4. Limit Theorem for Geometric Sums of Exchangeable Random Variables

Now, we consider exchangeable random variables $X_1,X_2,\ldots$ satisfying the dependence condition proposed in [35]. Namely, assume that, for all $n\in\mathbb{N}$, $t_j\in\mathbb{R}$ ($j=1,\ldots,n$) and some $\rho\in[0,1]$,
$$\mathbb{E}\big[e^{i(t_1X_1+\dots+t_nX_n)}\big] = \rho\,\mathbb{E}\big[e^{iX_1(t_1+\dots+t_n)}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{it_jX_j}\big],$$
where $i^2=-1$. The cases $\rho=0$ and $\rho=1$ correspond, respectively, to independent random variables and to those possessing the property of comonotonicity. The latter means that, for $\rho=1$, the joint behavior of $X_1,\ldots,X_n$ is strongly correlated and coincides with that of the vector $(X_1,\ldots,X_1)$.
Theorem 2.
Let $X_1,X_2,\ldots$ be exchangeable random variables with $\mathbb{E}[X_1]=\mu$, $\mu\neq0$, satisfying condition (41) for some $\rho\in(0,1)$. Suppose that $(X_n)_{n\in\mathbb{N}}$ and $N_p$ are independent, where $N_p\sim\mathrm{Geom}(p)$, $p\in(0,1)$. In contrast to the Rényi theorem, one has
$$W_p \xrightarrow{\mathcal{D}} Y, \quad p\to0+,$$
where the law of $Y$ is the following mixture:
$$P_Y = \rho\,P_{VX_1/\mu}+(1-\rho)\,P_Z,$$
the random variables $X_1$, $V$ are independent, and $V\sim\mathrm{Exp}(1)$, $Z\sim\mathrm{Exp}(1)$.
Proof. 
Let $\tilde X_1,\tilde X_2,\ldots$ be independent copies of $X_1,X_2,\ldots$, respectively. Suppose that $\tilde X_1,\tilde X_2,\ldots$ and $N_p$ are independent. Set $S_0:=0$, $\tilde S_0:=0$, $\tilde S_n := \tilde X_1+\dots+\tilde X_n$, $n\in\mathbb{N}$. Denote the characteristic function of a random variable $\xi$ by $f_\xi(t)$, $t\in\mathbb{R}$. For each $t\in\mathbb{R}$, using Equation (41), one has
$$f_{S_{N_p}}(t) = \sum_{n=0}^{\infty}\mathbb{E}\big[e^{itS_n}\big]\,\mathbb{P}(N_p=n) = \mathbb{P}(N_p=0)+\sum_{n=1}^{\infty}\Big(\rho\,\mathbb{E}\big[e^{iX_1tn}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{itX_j}\big]\Big)\mathbb{P}(N_p=n)$$
$$= p+\sum_{n=0}^{\infty}\Big(\rho\,\mathbb{E}\big[e^{iX_1tn}\big]+(1-\rho)\prod_{j=1}^{n}\mathbb{E}\big[e^{it\tilde X_j}\big]\Big)\mathbb{P}(N_p=n)-\rho p-(1-\rho)p$$
$$= \rho\,f_{X_1N_p}(t)+(1-\rho)\sum_{n=0}^{\infty}f_{\tilde S_n}(t)\,\mathbb{P}(N_p=n) = \rho\,f_{X_1N_p}(t)+(1-\rho)\,f_{\tilde S_{N_p}}(t).$$
Hence, for each $t\in\mathbb{R}$, one has
$$f_{W_p}(t) = \rho\,f_{\frac{p}{\mu(1-p)}X_1N_p}(t)+(1-\rho)\,f_{\tilde W_p}(t),$$
where $\tilde W_p = \frac{p}{\mu(1-p)}\sum_{j=1}^{N_p}\tilde X_j$.
According to the classical Rényi theorem, $\tilde W_p\xrightarrow{\mathcal{D}}Z$ as $p\to0+$, where $Z\sim\mathrm{Exp}(1)$. Note that $T_p := \frac{p}{1-p}N_p\xrightarrow{\mathcal{D}}V$ as $p\to0+$, where $V\sim\mathrm{Exp}(1)$. In fact, one can apply Theorem 1 with $X_j\equiv1$, $j\in\mathbb{N}$, to check this. For each $t\in\mathbb{R}$, taking into account that $T_p$ and $X_1$ are independent and applying the Lebesgue dominated convergence theorem, we see that
$$\mathbb{E}\big[e^{itT_pX_1}\big] = \mathbb{E}\big[\mathbb{E}\big[e^{itT_pX_1}\mid X_1\big]\big] = \int_{\mathbb{R}}\mathbb{E}\big[e^{itT_px}\big]\,dF_{X_1}(x) \to \int_{\mathbb{R}}\mathbb{E}\big[e^{itVx}\big]\,dF_{X_1}(x) = \mathbb{E}\big[e^{itVX_1}\big], \quad p\to0+,$$
since $X_1$ and $V$ are independent. Hence,
$$\frac{p}{\mu(1-p)}\,X_1N_p \xrightarrow{\mathcal{D}} \frac{VX_1}{\mu}, \quad p\to0+$$
is true. In light of Equation (43),
$$W_p \xrightarrow{\mathcal{D}} Y, \quad p\to0+;$$
here the law of $Y$ is the mixture of the distributions of $VX_1/\mu$ and $Z$ provided by Equation (42). The proof is complete. □
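The mixture structure behind condition (41), namely that with probability $\rho$ all summands coincide with $X_1$ and otherwise they are independent, suggests a direct simulation of Theorem 2. The sketch below is ours; the Uniform(0,1) summands and the values of $p$ and $\rho$ are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_w(p, rho, n, mu=0.5):
    """W_p under the exchangeable model: with prob. rho the summands are
    comonotone (all equal to X_1), otherwise i.i.d. Uniform(0,1)."""
    comono = rng.uniform(size=n) < rho
    N = rng.geometric(p, size=n) - 1            # N_p on {0, 1, ...}
    x1 = rng.uniform(0.0, 1.0, size=n)
    iid = np.array([rng.uniform(0.0, 1.0, size=k).sum() for k in N])
    return np.where(comono, N * x1, iid) * p / (mu * (1.0 - p))

def sample_limit(rho, n, mu=0.5):
    """Y with law rho * law(V X_1 / mu) + (1 - rho) * Exp(1), cf. (42)."""
    comono = rng.uniform(size=n) < rho
    v, z = rng.exponential(1.0, n), rng.exponential(1.0, n)
    x1 = rng.uniform(0.0, 1.0, n)
    return np.where(comono, v * x1 / mu, z)

w = np.sort(sample_w(0.005, 0.3, 50_000))
y = np.sort(sample_limit(0.3, 50_000))
x = np.linspace(0.0, 8.0, 801)
f_w = np.searchsorted(w, x, side="right") / w.size
f_y = np.searchsorted(y, x, side="right") / y.size
print(np.abs(f_w - f_y).max())   # small for small p
```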
Theorem 3.
Assume that $N_p$ and $(X_n)_{n\in\mathbb{N}}$ satisfy the conditions of Theorem 2. Let $\mu_2 = \mathbb{E}[X_1^2]$. Then,
$$d_{\mathcal{H}_2}(W_p,Y) = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
Proof. 
Relation (43) for characteristic functions implies that the following equality of distributions holds:
$$W_p \stackrel{\mathcal{D}}{=} \frac{p}{\mu(1-p)}\big((1-I_\rho)\,N_pX_1+I_\rho\,\tilde S_{N_p}\big),$$
where the indicator $I_\rho$ equals 1 and 0 with probabilities $1-\rho$ and $\rho$, respectively, and is independent of all the variables under consideration. Assume at first that $\mu_2<\infty$. Then, for $h\in\mathcal{H}_2$,
$$\mathbb{E}[h(W_p)] = \rho\,\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]+(1-\rho)\,\mathbb{E}[h(\tilde W_p)].$$
In view of Equation (42), one has
$$\mathbb{E}[h(Y)] = \rho\,\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]+(1-\rho)\,\mathbb{E}[h(Z)].$$
The latter two formulas and the triangle inequality yield
$$\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \le \rho\,\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]\Big|+(1-\rho)\,\big|\mathbb{E}[h(\tilde W_p)]-\mathbb{E}[h(Z)]\big|.$$
By means of Theorem 1, we have
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(\tilde W_p)]-\mathbb{E}[h(Z)]\big| = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
For each $h\in\mathcal{H}_2$, taking into account the independence of $X_1$, $N_p$, $V$, one can write
$$\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big] = \int_{\mathbb{R}}\Big(\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_px\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{xV}{\mu}\Big)\Big]\Big)\,dF_{X_1}(x).$$
Due to the homogeneity of $d_{\mathcal{H}_2}$, we infer from Theorem 1 that
$$\sup_{h\in\mathcal{H}_2}\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_px\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{xV}{\mu}\Big)\Big]\Big| = d_{\mathcal{H}_2}\Big(\frac{px}{\mu(1-p)}N_p,\ \frac{xV}{\mu}\Big) = \Big(\frac{x}{\mu}\Big)^2 d_{\mathcal{H}_2}\Big(\frac{p}{1-p}\sum_{k=1}^{N_p}1,\ V\Big) = \frac{1}{2}\Big(\frac{x}{\mu}\Big)^2\frac{p}{1-p}.$$
Consequently, it holds that
$$\Big|\mathbb{E}\Big[h\Big(\frac{p}{\mu(1-p)}N_pX_1\Big)\Big]-\mathbb{E}\Big[h\Big(\frac{VX_1}{\mu}\Big)\Big]\Big| \le \frac{p}{2(1-p)}\int_{\mathbb{R}}\Big(\frac{x}{\mu}\Big)^2\,dF_{X_1}(x) = \frac{\mu_2}{2\mu^2}\cdot\frac{p}{1-p}.$$
Equations (46), (47) and (48) lead to the upper bound for $d_{\mathcal{H}_2}(W_p,Y)$.
Note that the function $h(x)=x^2/2$, $x\in\mathbb{R}$, belongs to $\mathcal{H}_2$, and therefore
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \ge \frac{1}{2}\big|\mathbb{E}[W_p^2]-\mathbb{E}[Y^2]\big|.$$
Note that $\mathbb{E}[Z^2]=\mathbb{E}[V^2]=2$ because $Z\sim\mathrm{Exp}(1)$ and $V\sim\mathrm{Exp}(1)$. The random variables $X_1$, $V$, $Z$ are independent. Thus, in light of Equation (42), one has
$$\mathbb{E}[Y^2] = \frac{2\rho\,\mu_2}{\mu^2}+2(1-\rho).$$
By means of Equations (45), (23) and (25), we obtain
$$\mathbb{E}[W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\rho\,\mathbb{E}[N_p^2]\,\mathbb{E}[X_1^2]+(1-\rho)\,\mathbb{E}[\tilde W_p^2] = \Big(\frac{p}{\mu(1-p)}\Big)^2\rho\,\frac{(1-p)(2-p)}{p^2}\,\mu_2+(1-\rho)\Big(\frac{p\,\mu_2}{\mu^2(1-p)}+2\Big)$$
$$= \frac{\mu_2}{\mu^2}\Big(\rho\Big(2+\frac{p}{1-p}\Big)+(1-\rho)\,\frac{p}{1-p}\Big)+2(1-\rho).$$
Equations (50) and (51) permit us to find $\mathbb{E}[W_p^2]-\mathbb{E}[Y^2]$. Hence, Equation (49) leads to the inequality
$$\sup_{h\in\mathcal{H}_2}\big|\mathbb{E}[h(W_p)]-\mathbb{E}[h(Y)]\big| \ge \frac{1}{2}\,\frac{\mu_2}{\mu^2}\Big(\rho\Big(2+\frac{p}{1-p}-2\Big)+(1-\rho)\,\frac{p}{1-p}\Big) = \frac{1}{2}\,\frac{\mu_2}{\mu^2}\cdot\frac{p}{1-p}.$$
Now, let $\mu_2=\infty$. Then, $d_{\mathcal{H}_2}(W_p,Y)=\infty$ according to Equation (52). The proof is complete. □

5. Convergence of Random Sums of Independent Summands to Generalized Gamma Distribution

Statements concerning weak convergence of geometric sums distributions to the exponential law are often just particular cases of more general results concerning the convergence of random sums of random summands to the generalized gamma law when the number of summands follows the generalized negative binomial distribution, see, e.g., [27,29,49]. The recent work [29] demonstrated how it is possible to study the mentioned general case employing the estimates of proximity of geometric sums distributions to the exponential law. We introduce some notation to apply Theorem 1 to the analysis of the distance between the distributions of random sums and the generalized gamma law.
Introduce a random variable $G_{r,\lambda}$ such that $G_{r,\lambda}\sim G(r,\lambda)$, where $G(r,\lambda)$ is the gamma law with positive parameters $r$ and $\lambda$, i.e., its density with respect to the Lebesgue measure has the form
$$g(z;r,\lambda) = \frac{\lambda^r z^{r-1}}{\Gamma(r)}\,e^{-\lambda z}\,I_{(0,\infty)}(z), \quad z\in\mathbb{R},$$
$\Gamma(r)$ being the gamma function. For $r=1$, one has $G(1,\lambda)=\mathrm{Exp}(\lambda)$. Clearly, for $a>0$, $aG_{r,\lambda}\sim G(r,\lambda/a)$. Set $G^*_{r,\alpha,\lambda} := G_{r,\lambda}^{1/\alpha}$, where $\alpha>0$. One says that the random variable $G^*_{r,\alpha,\lambda}$ has the generalized gamma distribution $G^*(r,\alpha,\lambda)$. According to Equation (5) of [29], the density of $G^*_{r,\alpha,\lambda}$ is given by the formula
$$g^*(z;r,\alpha,\lambda) = \frac{|\alpha|\,\lambda^r z^{\alpha r-1}}{\Gamma(r)}\,e^{-\lambda z^{\alpha}}\,I_{(0,\infty)}(z), \quad z\in\mathbb{R}.$$
It is also known (see Equation (6) in [29]) that, for $r\in(0,1)$, $\alpha\in(0,1]$ and $\lambda>0$, the following relation holds:
$$g^*(z;r,\alpha,\lambda) = \int_0^1\frac{u}{1-u}\,e^{-\frac{u}{1-u}z}\,q(u;r,\alpha,\lambda)\,du, \quad z>0,$$
where $q$ is the density of a specified random variable $Y_{r,\alpha,\lambda}$ whose distribution is supported in $(0,1)$ (see Remark 3 of [49]). We only note that, for $\alpha=1$, the density $q$ admits the representation
$$q\Big(u;r,1,\frac{b}{1-b}\Big) = \frac{b^r\sin(\pi r)}{\pi}\,\frac{(1-u)^{r-1}}{u\,(u-b)^{r}}\,I_{(b,1)}(u), \quad b\in(0,1).$$
Consider a random variable $N^*_{r,\alpha,p}$ having the generalized negative binomial distribution $GNB(r,\alpha,p)$, where $r>0$, $\alpha\neq0$ and $p\in(0,1)$, i.e.,
$$\mathbb{P}(N^*_{r,\alpha,p}=k) = \int_0^{\infty}\frac{z^k}{k!}\,e^{-z}\,g^*\Big(z;r,\alpha,\frac{p}{1-p}\Big)\,dz, \quad k=0,1,\ldots$$
Thus, $GNB(r,\alpha,p)$ is a mixed Poisson distribution. One can verify that $GNB(r,1,p)$ coincides with $NB(r,p)$, where $NB(r,p)$ is the negative binomial law. Recall that $N_{r,p}\sim NB(r,p)$ if
$$\mathbb{P}(N_{r,p}=k) = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}\,p^r(1-p)^k, \quad k=0,1,\ldots$$
Note also that $N_{1,p}\sim\mathrm{Geom}(p)$.
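The mixed Poisson representation (54) immediately yields a sampler for $GNB(r,\alpha,p)$. The sketch below is ours (the parameter values are arbitrary assumptions); it also checks the particular case $GNB(1,1,p)=\mathrm{Geom}(p)$.

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_gnb(r, alpha, p, size):
    """N ~ GNB(r, alpha, p) as a mixed Poisson: draw
    G* = (Gamma(r, rate=p/(1-p)))**(1/alpha), then N | G* ~ Poisson(G*)."""
    lam = p / (1.0 - p)
    g_star = rng.gamma(shape=r, scale=1.0 / lam, size=size) ** (1.0 / alpha)
    return rng.poisson(g_star)

# sanity check: GNB(1, 1, p) coincides with Geom(p) on {0, 1, ...}
n = sample_gnb(1.0, 1.0, 0.3, 100_000)
print(n.mean(), (1.0 - 0.3) / 0.3)   # both close to E[N_p] = (1-p)/p
```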
Introduce the random variables
$$W^*_{r,\alpha,p} := \frac{1}{\mu}\Big(\frac{p}{1-p}\Big)^{1/\alpha}\sum_{k=1}^{N^*_{r,\alpha,p}}X_k, \qquad S^*_{r,\alpha,p} := \sum_{k=1}^{N^*_{r,\alpha,p}}X_k,$$
where $N^*_{r,\alpha,p}\sim GNB(r,\alpha,p)$ with $r>0$, $\alpha\neq0$, $p\in(0,1)$, and $\mathbb{E}[X_k]=\mu$, $\mu\neq0$, $k\in\mathbb{N}$. We assume that $(X_n)_{n\in\mathbb{N}}$ and $N^*_{r,\alpha,p}$ are independent.
Theorem 4.
Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of independent random variables having $\mathbb{E}[X_n]=\mu$, $\mu\neq0$, $n\in\mathbb{N}$. Then, for $W^*_{r,\alpha,p}$ introduced in Equation (55) with parameters $r\in(0,1)$, $\alpha\in(0,1]$, $p\in(0,1)$, and $G_{r,1}$ having the gamma distribution $G(r,1)$, the following relation holds:
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\,G_{r,1}^{1/\alpha}\big) = \frac{1}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{2/\alpha}\int_0^1\mathbb{E}[X_{N_u+1}^2]\,\frac{1-u}{u}\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du,$$
whenever the right-hand side of Equation (56) is finite. Here, $N_u := N^*_{1,1,u}$, $N_u\sim\mathrm{Geom}(u)$, $u\in(0,1)$, and $q$ appeared in Equation (53).
Proof. 
Without loss of generality, we can assume that $\mu=1$; otherwise, we consider $\tilde X_n := X_n/\mu$, $n\in\mathbb{N}$. For such a sequence, $\mathbb{E}[\tilde X_{N_u+1}^2] = \frac{1}{\mu^2}\,\mathbb{E}[X_{N_u+1}^2]$. Note that $\frac{1-p}{p}\,G_{r,1}$ has the same distribution as $G_{r,p/(1-p)}$. Applying the homogeneity property of the ideal probability metric of order two, one has
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\,G_{r,1}^{1/\alpha}\big) = \Big(\frac{p}{1-p}\Big)^{2/\alpha}d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\,G_{r,p/(1-p)}^{1/\alpha}\big).$$
The proof of Theorem 1 of [29] starts with establishing, for any bounded Borel function $h$, $r\in(0,1)$, $\alpha\in(0,1]$ and $p\in(0,1)$, that
$$\mathbb{E}\Big[h\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] = \int_0^1\mathbb{E}\Big[h\Big(\frac{1-u}{u}\,Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du,$$
where $Z\sim\mathrm{Exp}(1)$, and
$$\mathbb{E}\big[h(S^*_{r,\alpha,p})\big] = \int_0^1\mathbb{E}\big[h(S^*_{1,1,u})\big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du.$$
Let us examine these relations for each $h\in\mathcal{H}_2$. Recall that, in light of Remark 1, $|h(x)|\le A_0x^2+B_0$ for some positive constants $A_0$ and $B_0$ (which depend on $h$). We write $h = h_+-h_-$, where $h_+(x) := h(x)\,I\{h(x)\ge0\}$ and $h_-(x) := -h(x)\,I\{h(x)\le0\}$. Set $h_n(x) := h_+(x)\,I_{(-n,n]}(x)$, $n\in\mathbb{N}$. Then, the $h_n$, $n\in\mathbb{N}$, are bounded Borel functions such that, for each $x\in\mathbb{R}$, $0\le h_n(x)\nearrow h_+(x)$ as $n\to\infty$. Hence, the monotone convergence theorem yields
$$\mathbb{E}\Big[h_+\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] = \lim_{n\to\infty}\mathbb{E}\Big[h_n\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big].$$
Note that, for each $u\in(0,1)$, $\mathbb{E}\big[h_n\big(\frac{1-u}{u}Z\big)\big]\nearrow\mathbb{E}\big[h_+\big(\frac{1-u}{u}Z\big)\big]$. Applying the monotone convergence theorem once again, we obtain
$$\int_0^1\mathbb{E}\Big[h_+\Big(\frac{1-u}{u}Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du = \lim_{n\to\infty}\int_0^1\mathbb{E}\Big[h_n\Big(\frac{1-u}{u}Z\Big)\Big]\,q\Big(u;r,\alpha,\frac{p}{1-p}\Big)\,du.$$
So, Equation (57) is valid if, instead of $h$ belonging to $\mathcal{H}_2$, we write $h_+$. Obviously, $0\le h_+(x)\le|h(x)|\le A_0x^2+B_0$, $x\in\mathbb{R}$. Thus,
$$\mathbb{E}\Big[h_+\Big(G_{r,p/(1-p)}^{1/\alpha}\Big)\Big] \le A_0\,\mathbb{E}\Big[G_{r,p/(1-p)}^{2/\alpha}\Big]+B_0 < \infty.$$
According to [27] (page 8), for δ > 0 , one has
E ( G r , α , λ * ) δ = Γ ( r + δ α ) λ δ / α Γ ( r ) .
This permits us to write E G r , p / ( 1 p ) 2 / α = E ( G r , 1 , p / ( 1 p ) * ) 2 / α < .
In the same manner, we demonstrate that Equation (57) is valid if instead of h H 2 we take h . Moreover, E h G r , p / ( 1 p ) 1 / α is finite. Therefore, Equation (57) holds for any h H 2 , and for such h, E h G r , p / ( 1 p ) 1 / α is finite.
By the monotone convergence theorem E [ h + ( S r , α , p * ) ] = lim n E [ h n ( S r , α , p * ) ] . In a similar way, E [ h n ( S 1 , 1 , u * ) ] E [ h + ( S 1 , 1 , u * ) ] as n , and applying this theorem once again, we obtain
0 1 E [ h + ( S 1 , 1 , u * ) ] q u ; r , α , p 1 p d u = lim n 0 1 E [ h n ( S 1 , 1 , u * ) ] q u ; r , α , p 1 p d u .
Taking into account that Equation (58) is valid for bounded Borel functions h n , one ascertains that Equation (58) holds if we replace h by h + . To show the latter integral is finite, we note that 0 h + ( x ) | h ( x ) | A 0 x 2 + B 0 , for some positive A 0 , B 0 and all x R . Formula (23) of Lemma 4 yields, for each u ( 0 , 1 ) ,
E ( S 1 , 1 , u * ) 2 1 u u E [ X N u + 1 2 ] + 2 ( 1 u ) 2 u 2 .
It was assumed above that the right-hand side of Equation (56) is finite. So,
0 1 E A 0 1 u u E [ X N u + 1 2 ] + 2 ( 1 u ) 2 u 2 + B 0 q u ; r , α , p 1 p d u < ,
since in light of Equation (57), taking h ( x ) = 1 and h ( x ) = x 2 2 (these functions belong to H 2 ), x R , we obtain, respectively,
0 1 q u ; r , α , p 1 p d u = 1 ,
E [ Z 2 ] 0 1 ( 1 u ) 2 u 2 q u ; r , α , p 1 p d u = E G r , p / ( 1 p ) 2 / α < .
We demonstrate analogously that Equation (58) holds upon replacing h H 2 with h and if the right-hand side of Equation (56) is finite, it follows that
0 1 E h ( S 1 , 1 , u * ) q u ; r , α , p 1 p d u
is finite as well. Consequently, Equation (58) is established for each h H 2 (whenever the right-hand side of Equation (56) is finite) and E h ( S r , α , p * ) is finite for such h. Therefore, for h H 2 and fixed α , r , p , one has
E h ( S r , α , p * ) E h G r , p / ( 1 p ) 1 / α = 0 1 E h ( S 1 , 1 , u * ) E h 1 u u Z q u ; r , α , p 1 p d u = : J ( h ) .
By Theorem 1, for h H 2 , it holds
E h ( S 1 , 1 , u * ) E h 1 u u Z d H 2 S 1 , 1 , u * , 1 u u Z = 1 u u 2 d H 2 u 1 u S 1 , 1 , u * , Z
1 u u 2 u 1 u 1 2 E [ X N u + 1 2 ] = 1 2 1 u u E [ X N u + 1 2 ] ,
where we take into account that N 1 , 1 , u * N B ( 1 , u ) , and N B ( 1 , u ) coincides with G e o m ( u ) . Thus, u 1 u S 1 , 1 , u * can be written as
u 1 u k = 1 N u X k ,
where N u G e o m ( u ) , N u and ( X k ) k N are independent.
Therefore, for each h H 2 , p 1 p 2 / α | J ( h ) | is bounded by the right-hand side of Equation (56), and so the desired upper bound is obtained (recall that μ = 1 ).
Now, we turn to the lower bound of d H 2 ( W r , α , p * , G r , 1 1 / α ) . Take h ( x ) = x 2 / 2 belonging to H 2 . Then, applying Equation (23) to evaluate E S 1 , 1 , u * 2 , one has
d H 2 ( W r , α , p * , G r , 1 1 / α ) 1 2 p 1 p 2 / α 0 1 E S 1 , 1 , u * 2 1 u u 2 E G 1 , 1 2 q u ; r , p 1 p d u = 1 2 p 1 p 2 / α 0 1 1 u u E [ X N u + 1 2 ] q u ; r , p 1 p d u ,
where G 1 , 1 = Z E x p ( 1 ) . Thus, Equation (61) completes the proof. □
Corollary 1.
Let the conditions of Theorem 4 be satisfied and, in addition, $\mu_2 := \sup_{n\in\mathbb{N}} \mathbb{E}[X_n^2] < \infty$. Then, the right-hand side of Equation (56) is finite and
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, G_{r,1}^{1/\alpha}\big) \le \frac{\mu_2}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
The inequality becomes an equality if $\mu_2 = \mathbb{E}[X_n^2]$ for all $n \in \mathbb{N}$. In particular, if $\alpha = 1$, then $\frac{\Gamma(r+1)}{\Gamma(r)} = r$.
Proof. 
According to Equation (57), for $h(x) = x$, $x \in \mathbb{R}$,
$$\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big] = \mathbb{E}[Z] \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du.$$
Thus, since $\mathbb{E}[Z] = 1$, the following relation is valid:
$$\int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big].$$
Due to [27] (see page 8 there), one has $\mathbb{E}\big[G^*_{r,\alpha,\lambda}\big] = \frac{\Gamma(r + \frac{1}{\alpha})}{\lambda^{1/\alpha}\,\Gamma(r)}$. Therefore,
$$\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big] = \mathbb{E}\big[G^*_{r,\alpha,p/(1-p)}\big] = \Big(\frac{1-p}{p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
For $\alpha = 1$, we obtain $\mathbb{E}\big[G_{r,p/(1-p)}\big] = \frac{1-p}{p}\cdot\frac{\Gamma(r+1)}{\Gamma(r)} = \frac{r(1-p)}{p}$. □
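The moment formula just used, together with the bound of Corollary 1, is easy to probe numerically. The following sketch (ours; all parameter values and the helper name `h2_bound` are arbitrary illustrative choices) compares a Monte Carlo estimate of $\mathbb{E}[(G^*_{r,\alpha,\lambda})^{\delta}]$ with the right-hand side of Equation (59) and evaluates the Corollary 1 bound:

```python
import numpy as np
from scipy.special import gamma as Gamma

rng = np.random.default_rng(2)

# Monte Carlo check of Equation (59): E (G*)^delta = Gamma(r + delta/alpha) / (lam^(delta/alpha) Gamma(r)).
r, alpha, lam, delta = 0.6, 0.8, 2.0, 1.0
g_star = rng.gamma(shape=r, scale=1.0 / lam, size=500_000) ** (1.0 / alpha)
print((g_star**delta).mean())
print(Gamma(r + delta / alpha) / (lam ** (delta / alpha) * Gamma(r)))

# Right-hand side of the Corollary 1 bound for illustrative moments mu, mu2.
def h2_bound(r, alpha, p, mu, mu2):
    return mu2 / (2.0 * mu**2) * (p / (1.0 - p)) ** (1.0 / alpha) * Gamma(r + 1.0 / alpha) / Gamma(r)

print(h2_bound(r=0.6, alpha=0.8, p=0.05, mu=1.0, mu2=2.0))  # tends to 0 as p -> 0+
```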

6. Convergence of Random Sums of Exchangeable Summands to Generalized Gamma Distribution

Consider the model of exchangeable random variables $X_1, X_2, \ldots$ described in Section 4. Introduce the distribution of a random variable $U^*_{r,\alpha,\lambda}$ as the following mixture
$$\mathbb{P}_{U^*_{r,\alpha,\lambda}} = \rho\, \mathbb{P}_{V^*_{r,\alpha,\lambda} X_1/\mu} + (1-\rho)\, \mathbb{P}_{Z^*_{r,\alpha,\lambda}},$$
where $\rho \in [0,1]$, $\alpha > 0$, $r > 0$, $\mu := \mathbb{E}[X_1]$, $\mu \ne 0$, the random variables $X_1$ and $V^*_{r,\alpha,\lambda}$ are independent, $V^*_{r,\alpha,\lambda} \sim G^*(r,\alpha,\lambda)$ and $Z^*_{r,\alpha,\lambda} \sim G^*(r,\alpha,\lambda)$. Since $\mathbb{E}\big[G_{r,\lambda}^{2/\alpha}\big] = \frac{\Gamma(r+2/\alpha)}{\lambda^{2/\alpha}\Gamma(r)}$ (see, e.g., page 8 of [27]), one has
$$\mathbb{E}\big(U^*_{r,\alpha,\lambda}\big)^2 = \Big( \rho\, \frac{\mathbb{E}[X_1^2]}{\mu^2} + (1-\rho) \Big) \frac{\Gamma(r+2/\alpha)}{\lambda^{2/\alpha}\,\Gamma(r)}.$$
Due to the properties of generalized gamma distributions, for any positive number $c$,
$$\frac{1}{c^{1/\alpha}}\, U^*_{r,\alpha,\lambda} = \frac{1}{c^{1/\alpha}}\Big( (1-I_\rho)\, V^*_{r,\alpha,\lambda}\, \frac{X_1}{\mu} + I_\rho\, Z^*_{r,\alpha,\lambda} \Big) = (1-I_\rho)\, V^*_{r,\alpha,c\lambda}\, \frac{X_1}{\mu} + I_\rho\, Z^*_{r,\alpha,c\lambda} = U^*_{r,\alpha,c\lambda},$$
where the indicator $I_\rho$ equals 1 and 0 with probabilities $1-\rho$ and $\rho$, respectively, and is independent of all the variables under consideration (the equalities here are understood in distribution). Note that $U^*_{1,1,1}$ has the same distribution as a random variable $Y$ having the law defined in Equation (42). Recall that the generalized negative binomial distribution $\mathrm{GNB}(r,\alpha,p)$ is the law of a random variable $N^*_{r,\alpha,p}$, see Equation (54). We will use the following result.
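As a numerical illustration of the mixture (63), the sketch below (ours; the choice $X_1 \sim \mathrm{Exp}(1)$, so that $\mu = 1$ and $\mathbb{E}[X_1^2] = 2$, and all other parameters are illustrative assumptions) samples $U^*_{r,\alpha,\lambda}$ and compares the empirical second moment with Equation (64):

```python
import numpy as np
from scipy.special import gamma as Gamma

rng = np.random.default_rng(3)

# Sample the mixture law (63) and check the second moment formula (64).
r, alpha, lam, rho, n = 0.6, 0.9, 1.5, 0.4, 400_000
x1 = rng.exponential(1.0, size=n)                           # X_1 with mu = 1, E X_1^2 = 2
v_star = rng.gamma(r, 1.0 / lam, size=n) ** (1.0 / alpha)   # V* ~ G*(r, alpha, lam)
z_star = rng.gamma(r, 1.0 / lam, size=n) ** (1.0 / alpha)   # Z* ~ G*(r, alpha, lam)
u_star = np.where(rng.random(n) < rho, v_star * x1, z_star)

target = (rho * 2.0 + 1.0 - rho) * Gamma(r + 2.0 / alpha) / (lam ** (2.0 / alpha) * Gamma(r))
print((u_star**2).mean(), target)   # the two values should be close
```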
Lemma 5.
If $r > 0$, $\alpha \ne 0$ and $p \in (0,1)$, then for $N^*_{r,\alpha,p} \sim \mathrm{GNB}(r,\alpha,p)$ one has
$$\mathbb{E}\, N^*_{r,\alpha,p} = \mathbb{E}\, G^*_{r,\alpha,p/(1-p)}, \qquad \mathbb{E}\big[N^*_{r,\alpha,p}\big(N^*_{r,\alpha,p} - 1\big)\big] = \mathbb{E}\big(G^*_{r,\alpha,p/(1-p)}\big)^2.$$
Proof. 
According to Equation (54), for each $n \in \mathbb{N}$,
$$\sum_{k=1}^{n} k\, \mathbb{P}(N^*_{r,\alpha,p} = k) = \int_0^{\infty} z \sum_{k=1}^{n} \frac{z^{k-1}}{(k-1)!}\, e^{-z}\, g^*\Big(z; r, \alpha, \frac{p}{1-p}\Big)\, dz,$$
$$\sum_{k=2}^{n} k(k-1)\, \mathbb{P}(N^*_{r,\alpha,p} = k) = \int_0^{\infty} z^2 \sum_{k=2}^{n} \frac{z^{k-2}}{(k-2)!}\, e^{-z}\, g^*\Big(z; r, \alpha, \frac{p}{1-p}\Big)\, dz.$$
The desired statement follows from the monotone convergence theorem for the Lebesgue integral by letting $n \to \infty$. □
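Lemma 5 admits a quick Monte Carlo check (ours; the parameter values are illustrative): sample $N^*_{r,\alpha,p}$ through its mixed Poisson representation (54) and compare the factorial moments with the moments of $G^*_{r,\alpha,p/(1-p)}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo check of Lemma 5: factorial moments of GNB(r, alpha, p) against
# the moments of G*(r, alpha, p/(1-p)).
r, alpha, p, n = 0.7, 0.8, 0.3, 400_000
G = rng.gamma(shape=r, scale=(1.0 - p) / p, size=n) ** (1.0 / alpha)  # G*(r, alpha, p/(1-p))
N = rng.poisson(G)                                  # N* ~ GNB(r, alpha, p), Equation (54)
print(N.mean(), G.mean())                           # E N*          vs  E G*
print((N * (N - 1.0)).mean(), (G**2).mean())        # E N*(N* - 1)  vs  E (G*)^2
```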
Theorem 5.
Let $X_1, X_2, \ldots$ be exchangeable random variables, introduced in Section 4, such that $\mathbb{E}[X_1] = \mu$ and $\mathbb{E}[X_1^2] = \mu_2 < \infty$. Assume that Equation (41) holds for some $\rho \in (0,1)$. Suppose that $(X_n)_{n\in\mathbb{N}}$ and $N^*_{r,\alpha,p}$ are independent, where $N^*_{r,\alpha,p} \sim \mathrm{GNB}(r,\alpha,p)$. Then, for $W^*_{r,\alpha,p}$ defined in Equation (55) with parameters $r \in (0,1)$, $\alpha \in (0,1]$, $p \in (0,1)$ and $U^*_{r,\alpha,1}$ given in Equation (63), one has
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) = \frac{\mu_2}{2\mu^2}\Big(\frac{p}{1-p}\Big)^{1/\alpha} \frac{\Gamma\big(r + \frac{1}{\alpha}\big)}{\Gamma(r)}.$$
Proof. 
Without loss of generality, we can assume that $\mu = 1$; otherwise, we consider $\widetilde{X}_n := X_n/\mu$, $n \in \mathbb{N}$. For such a sequence, $\widetilde{\mu}_2 = \mathbb{E}[\widetilde{X}_1^2] = \mu_2/\mu^2$. Note that Equation (58) remains true for dependent summands (see Theorem 1 of [29]). Furthermore, for bounded $h(t)$, $t \in \mathbb{R}$, the function $h_x(t) := h(xt)$ is also bounded for any $x \in \mathbb{R}$. Thus, an employment of Equation (63) gives
$$\mathbb{E}\, h(U^*_{r,\alpha,\lambda}) = \rho \int_{\mathbb{R}} \mathbb{E}\, h_x\big(G_{r,\lambda}^{1/\alpha}\big)\, dF_{X_1}(x) + (1-\rho)\, \mathbb{E}\, h\big(G_{r,\lambda}^{1/\alpha}\big).$$
Now we apply Equation (57) with bounded $h_x$ and, by Fubini's theorem, obtain
$$\int_{\mathbb{R}} \mathbb{E}\, h_x\big(G_{r,\lambda}^{1/\alpha}\big)\, dF_{X_1}(x) = \int_{\mathbb{R}} \int_0^1 \mathbb{E}\, h_x\Big(\frac{1-u}{u}\, V^*\Big)\, q(u; r, \alpha, \lambda)\, du\, dF_{X_1}(x) = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, X_1 V^*\Big)\, q(u; r, \alpha, \lambda)\, du,$$
where $X_1$ and $V^*$ are independent and $V^* \sim \mathrm{Exp}(1)$. Apply Equation (57) to the second summand of Equation (68). Then, Equation (69) yields
$$\mathbb{E}\, h(U^*_{r,\alpha,\lambda}) = \rho \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, X_1 V^*\Big)\, q(u; r, \alpha, \lambda)\, du + (1-\rho) \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, Z^*\Big)\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du,$$
where $Z^* \sim \mathrm{Exp}(1)$ and $U^*_{1,1,1}$ has the same distribution as $Y$, see Equation (42).
Recall that, for $h \in \mathcal{H}_2$, the inequality $|h(x)| \le A_0 x^2 + B_0$ holds for all $x \in \mathbb{R}$ and some positive constants $A_0$, $B_0$ (see Remark 1). Moreover, $\mathbb{E}\big(U^*_{r,\alpha,\lambda}\big)^2 < \infty$ according to Equation (64). So, employing the bounded functions $h_n(x) = h(x)\mathbb{I}_{(-n,n]}(x)$ tending to $h(x)$, $h \in \mathcal{H}_2$, as $n \to \infty$, one can invoke the Lebesgue dominated convergence theorem to claim that $\lim_{n\to\infty}\mathbb{E}\, h_n(U^*_{r,\alpha,\lambda}) = \mathbb{E}\, h(U^*_{r,\alpha,\lambda})$. We take into account that
$$\int_0^1 \Big| \mathbb{E}\, h_n\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big|\, q(u; r, \alpha, \lambda)\, du \le A_0\, \mathbb{E}\big(U^*_{1,1,1}\big)^2 \int_0^1 \Big(\frac{1-u}{u}\Big)^2 q(u; r, \alpha, \lambda)\, du + B_0.$$
The integral on the right-hand side of the latter formula is finite by Equation (60), and $\mathbb{E}\big(U^*_{1,1,1}\big)^2 < \infty$ in accordance with Equation (64). Thus, it is possible to apply the Lebesgue dominated convergence theorem to obtain
$$\lim_{n\to\infty} \int_0^1 \mathbb{E}\, h_n\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big)\, q(u; r, \alpha, \lambda)\, du$$
for any $h \in \mathcal{H}_2$. So, Equation (70) holds for all $h \in \mathcal{H}_2$.
In a similar way, one shows that $\lim_{n\to\infty}\mathbb{E}[h_n(S^*_{r,\alpha,p})] = \mathbb{E}\, h(S^*_{r,\alpha,p})$ for $h \in \mathcal{H}_2$. According to the Cauchy–Bunyakovsky–Schwarz inequality, for the identically distributed variables $X_1, X_2, \ldots$, we have $|\mathbb{E}[X_i X_j]| \le \mu_2$ for $i, j \in \mathbb{N}$, and consequently
$$\mathbb{E}\big(S^*_{r,\alpha,p}\big)^2 = \sum_{k=0}^{\infty} \mathbb{P}(N^*_{r,\alpha,p} = k)\, \mathbb{E}\Big(\sum_{j=1}^{k} X_j\Big)^2 \le \mu_2 \sum_{k=0}^{\infty} \mathbb{P}(N^*_{r,\alpha,p} = k)\, k^2 = \mu_2\, \mathbb{E}\big(N^*_{r,\alpha,p}\big)^2.$$
Equations (59) and (66) entail that $\mathbb{E}\big(N^*_{r,\alpha,p}\big)^2 < \infty$. Thus, the dominated convergence theorem guarantees that $\lim_{n\to\infty}\mathbb{E}[h_n(S^*_{r,\alpha,p})] = \mathbb{E}\, h(S^*_{r,\alpha,p})$. Furthermore, one can demonstrate that, for each $h \in \mathcal{H}_2$,
$$\lim_{n\to\infty}\int_0^1 \mathbb{E}\, h_n(S^*_{1,1,u})\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \mathbb{E}\, h(S^*_{1,1,u})\, q(u; r, \alpha, \lambda)\, du.$$
For this purpose, we note that Equation (71) implies
$$\int_0^1 \big| \mathbb{E}\, h_n(S^*_{1,1,u}) \big|\, q(u; r, \alpha, \lambda)\, du \le C + A\,\mu_2 \int_0^1 \mathbb{E}\big(N^*_{1,1,u}\big)^2\, q(u; r, \alpha, \lambda)\, du.$$
According to Equation (66), one has
$$\int_0^1 \mathbb{E}\big(N^*_{1,1,u}\big)^2\, q(u; r, \alpha, \lambda)\, du = \int_0^1 \Big( \mathbb{E}\big(G^*_{1,1,u/(1-u)}\big)^2 + \mathbb{E}\, G^*_{1,1,u/(1-u)} \Big)\, q(u; r, \alpha, \lambda)\, du.$$
The latter integral is finite because one can take $h(x) = x$ and $h(x) = x^2/2$ in Equation (57) and invoke Equation (59). Then, it is possible to use the dominated convergence theorem once again to establish Equation (72).
Now, combining Equations (58) and (70) leads, for any $h \in \mathcal{H}_2$, to the relation
$$\mathbb{E}\, h(S^*_{r,\alpha,p}) - \mathbb{E}\, h\big(U^*_{r,\alpha,p/(1-p)}\big) = \int_0^1 \Big( \mathbb{E}\, h(S^*_{1,1,u}) - \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big)\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du.$$
Note that the random variable $N^*_{1,1,u}$ follows the geometric distribution $\mathrm{Geom}(u)$ with parameter $u \in (0,1)$. For each $h \in \mathcal{H}_2$ and any $u \in (0,1)$, by Theorem 3 and in view of the $d_{\mathcal{H}_2}$ homogeneity, we obtain
$$\Big| \mathbb{E}\, h(S^*_{1,1,u}) - \mathbb{E}\, h\Big(\frac{1-u}{u}\, U^*_{1,1,1}\Big) \Big| \le d_{\mathcal{H}_2}\Big(S^*_{1,1,u},\, \frac{1-u}{u}\, U^*_{1,1,1}\Big) = \Big(\frac{1-u}{u}\Big)^2 d_{\mathcal{H}_2}(W_u, Y) \le \Big(\frac{1-u}{u}\Big)^2 \frac{u}{1-u}\cdot\frac{\mu_2}{2} = \frac{1-u}{u}\cdot\frac{\mu_2}{2}.$$
Employing Equations (73), (74) and (62), one deduces
$$d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\, U^*_{r,\alpha,p/(1-p)}\big) \le \frac{\mu_2}{2} \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\,\mathbb{E}\big[G_{r,p/(1-p)}^{1/\alpha}\big].$$
Equation (65) implies, by virtue of the $d_{\mathcal{H}_2}$ homogeneity, that
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) = \Big(\frac{p}{1-p}\Big)^{2/\alpha} d_{\mathcal{H}_2}\big(S^*_{r,\alpha,p},\, U^*_{r,\alpha,p/(1-p)}\big).$$
Combining Equations (59), (75) and (76), we conclude that the right-hand side of Equation (67) is an upper bound for $d_{\mathcal{H}_2}(W^*_{r,\alpha,p}, U^*_{r,\alpha,1})$.
Choosing $h(x) = x^2/2$ in Equation (73), upon employing Equations (52) and (62), one infers
$$d_{\mathcal{H}_2}\big(W^*_{r,\alpha,p},\, U^*_{r,\alpha,1}\big) \ge \frac{1}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha} \int_0^1 \Big| \mathbb{E}\big(S^*_{1,1,u}\big)^2 - \Big(\frac{1-u}{u}\Big)^2 \mathbb{E}\big(U^*_{1,1,1}\big)^2 \Big|\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha} \int_0^1 \frac{1-u}{u}\, q\Big(u; r, \alpha, \frac{p}{1-p}\Big)\, du = \frac{\mu_2}{2}\Big(\frac{p}{1-p}\Big)^{2/\alpha}\,\mathbb{E}\big[G^*_{r,\alpha,p/(1-p)}\big].$$
Using Equation (59) once again, we see that the right-hand side of Equation (67) is a lower bound for $d_{\mathcal{H}_2}(W^*_{r,\alpha,p}, U^*_{r,\alpha,1})$. □

7. Inverse to Equilibrium Transformation

The development of Stein's method is closely connected with various transformations of distributions. Let $W$ be a non-negative random variable with $0 < \mu = \mathbb{E}[W] < \infty$. Then, one says that a random variable $W^s$ has the $W$-size biased distribution if, for all $f$ such that $\mathbb{E}[Wf(W)]$ exists,
$$\mathbb{E}[Wf(W)] = \mu\, \mathbb{E}[f(W^s)].$$
The connection of this transformation with Stein's equation was considered in [50,51]. It was pointed out in [51] that this transformation works well for combinatorial problems, such as counting the number of vertices in a random graph having prespecified degrees, see also [52]. In [53], another transformation was introduced. Namely, if a random variable $W$ has mean zero and variance $\sigma^2 \in (0,\infty)$, then the authors of [53] write (Definition 1.1) that a variable $W^*$ has the $W$-zero biased distribution whenever, for all differentiable $f$ such that $\mathbb{E}[Wf(W)]$ exists, the following relation holds
$$\mathbb{E}[Wf(W)] = \sigma^2\, \mathbb{E}[f'(W^*)].$$
This definition is inspired by the equation $\mathbb{E}[Wf(W)] = \sigma^2\, \mathbb{E}[f'(W)]$ characterizing the normal law $N(0, \sigma^2)$. The authors of [53] explain that $W^*$ always exists if $\mathbb{E}[W] = 0$ and $\mathrm{var}\, W \in (0,\infty)$. Zero-bias coupling for products of normal random variables is treated in [54]. In Sec. 2 of [30], it is demonstrated that the gamma distribution is uniquely characterised by the property that its size-biased distribution is the same as its zero-biased distribution. Two generalizations of zero biasing were proposed in [55], see p. 104 of that paper for a discussion of these transformations. We refer also to the survey [56].
Now, we turn to the equilibrium distribution transformation introduced in [33] and concentrate on the approximation of the law under consideration by an exponential law, see the corresponding Definition 1 in Section 2.
According to the second part of Theorem 2.1 of [33] (in our notation), for $Z \sim \mathrm{Exp}(1)$ and a non-negative random variable $X$ with $\mathbb{E}[X] = 1$ and $\mathbb{E}[X^2] < \infty$, the following estimate holds
$$d_{\mathcal{H}_1}(X, Z) \le 2\, \mathbb{E}|X^e - X|,$$
and at the same time
$$d_{\mathcal{H}_1}(X^e, Z) \le \mathbb{E}|X^e - X|.$$
The authors of [33] also proved that $d_K(X^e, Z) \le \mathbb{E}|X^e - X|$. Notice that the estimate for $d_{\mathcal{H}_1}(X^e, Z)$ is more precise than that for $d_{\mathcal{H}_1}(X, Z)$.
Now we turn to Equation (77) and demonstrate how to find the distribution of $X$ when we know the distribution of $X^e$. In other words, we concentrate on the inverse of the equilibrium distribution transformation.
Assume that $\mathbb{E}[X] > 0$. Recall that a random variable $X^e$ exists if $F_e(x)$ appearing in Equation (16) is a distribution function. For $\mathbb{E}[X] > 0$, the latter statement is equivalent to the nonnegativity of $X$. Indeed, for non-negative $X$, $F_e(x)$ coincides with a distribution function having the density (15). If $F_e(x)$ is a distribution function and $\mathbb{E}[X] > 0$ in Equation (16), then $F_e(x) \ge 0$ for $x < 0$ only if $F(x) = 0$ for $x < 0$.
Thus, a random variable $X^e$ has a (version of) density $p_e(x)$ introduced in Equation (15). Obviously, the function $p_e(x)$ has the following properties: it is nonincreasing on $[0,\infty)$ and $p_e(x) = 0$ for $x < 0$. This density is right-continuous on $[0,\infty)$, and consequently $p_e(0) < \infty$. Now, we are able to provide a full description of the class of densities of random variables $X^e$ corresponding to all non-negative $X$ with positive mean.
Lemma 6.
Let a non-negative random variable $X^e$ have a version of density (with respect to the Lebesgue measure) $p_e(x)$, $x \in \mathbb{R}$, such that this function is nonincreasing on $[0,\infty)$, $p_e(x) = 0$ for $x < 0$, and there exists a finite $\lim_{x\to 0+} p_e(x)$. Then, there exists a unique preimage of the $X^e$ distribution having a distribution function $F$ continuous at $x = 0$. Namely,
$$F(x) = \begin{cases} 1 - \dfrac{p_e(x)}{p_e(0)}, & x \ge 0, \\[4pt] 0, & x < 0. \end{cases}$$
Proof. 
First of all, note that $p_e(0) > 0$, as otherwise $p_e(x) = 0$ for all $x \in \mathbb{R}$ ($p_e$ is a nonincreasing function on $[0,\infty)$). We also know that there exist a left-sided limit and a right-sided limit of $p_e$ at each point $x \in (0,\infty)$, as well as the right-sided limit of $p_e$ at $x = 0$. The set of discontinuity points of $p_e$ is at most countable, and we can take a version which is right-continuous at each point of $[0,\infty)$. Then, Equation (78) introduces a distribution function. Consider a random variable $X$ with distribution function $F$ and check the validity of Equation (14).
The integration by parts formula yields, for any $b > 0$,
$$1 \ge \int_0^b p_e(x)\, dx = b\, p_e(b) + p_e(0) \int_0^b x\, dF(x).$$
The summands on the right-hand side of Equation (79) are non-negative. Therefore, for any $b > 0$, $\mathbb{E}[X\mathbb{I}(X \le b)] \le 1/p_e(0)$. Hence, the monotone convergence theorem implies that $\mathbb{E}[X]$ is finite. According to Equation (78),
$$b\, p_e(b)/p_e(0) = b\big(1 - F(b)\big) = b\, \mathbb{P}(X > b) \to 0, \quad b \to \infty,$$
since $\mathbb{E}[X] < \infty$. Taking the limit as $b \to \infty$ in Equation (79), one obtains $1 = p_e(0)\, \mathbb{E}[X]$. Now, we are ready to verify Equation (14). For any Lipschitz function $f$, $\mathbb{E}[f(X)]$ is finite and
$$\mathbb{E}[f(X)] = \int_0^{\infty} f(x)\, dF(x) = -\frac{1}{p_e(0)} \int_0^{\infty} f(x)\, dp_e(x).$$
Taking into account Equation (80), we infer that $f(b)\, p_e(b) \to 0$ as $b \to \infty$. Consequently, applying integration by parts once again ($f$ has bounded variation), we obtain
$$\mathbb{E}[X]\, \mathbb{E}[f'(X^e)] = \frac{1}{p_e(0)} \int_0^{\infty} f'(x)\, p_e(x)\, dx = \frac{1}{p_e(0)} \int_0^{\infty} p_e(x)\, df(x) = -\frac{1}{p_e(0)} \Big( f(0)\, p_e(0) + \int_0^{\infty} f(x)\, dp_e(x) \Big) = \mathbb{E}[f(X)] - f(0).$$
The uniqueness of the $X$ distribution corresponding to $X^e$ is a consequence of Equation (15) and the continuity of $F(x)$ at $x = 0$. Indeed, assume that for $X_1$ and $X_2$ one has $X_1^e \stackrel{d}{=} X_2^e$. Then, Equation (15) yields that, for almost all $x \ge 0$,
$$\frac{1}{\mathbb{E}[X_1]}\, \mathbb{P}(X_1 > x) = \frac{1}{\mathbb{E}[X_2]}\, \mathbb{P}(X_2 > x),$$
and therefore $\mathbb{P}(X_1 > x) = c\, \mathbb{P}(X_2 > x)$, where $c$ is a positive constant (the equilibrium distribution in Definition 1 is introduced for random variables with positive expectation only). Since $\mathbb{P}(X_1 = 0) = \mathbb{P}(X_2 = 0) = 0$, one has $\mathbb{P}(X_1 > 0) = \mathbb{P}(X_2 > 0)$. Let $x_n \to 0+$, $n \to \infty$, where the points $x_n$ belong to the set considered in Equation (81), to ensure that $c = 1$. Thus, the distributions of $X_1$ and $X_2$ coincide. □
Remark 6.
Let $X_p$ be the Bernoulli random variable taking values 1 and 0 with probabilities $p$ and $1-p$, respectively. Then, it is easily seen that the distribution of $X_p^e$ is uniform on $[0,1]$. Thus, in contrast to Lemma 6, without the assumption of continuity of $F$ at the point $x = 0$, one cannot guarantee, in general, the uniqueness of the preimage under the transformation inverse to the equilibrium one.
In the proof of Lemma 6, we found that $\mathbb{E}[X] = 1/p_e(0)$. Set $\lambda = p_e(0)$ and $Z \sim \mathrm{Exp}(\lambda)$. Then, $\mathbb{E}[X] = \mathbb{E}[Z]$. From now on, we suppose that this choice of $\lambda$ is made.
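Equation (78) translates directly into code. The following sketch (ours; the helper name is hypothetical) implements the inverse map $p_e \mapsto F$ and verifies that the exponential law is a fixed point of the equilibrium transformation. For the uniform density of Remark 6, Equation (78) returns the point mass at 1, while Remark 6 exhibits a second, Bernoulli preimage whose distribution function is discontinuous at zero — illustrating the role of the continuity assumption.

```python
import numpy as np

def inverse_equilibrium_cdf(p_e, x):
    # Lemma 6: F(x) = 1 - p_e(x) / p_e(0) for x >= 0, and F(x) = 0 for x < 0.
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0.0, 1.0 - p_e(np.maximum(x, 0.0)) / p_e(0.0), 0.0)

# Exp(lam) is a fixed point: p_e(x) = lam * exp(-lam x) yields F(x) = 1 - exp(-lam x).
lam = 2.0
p_e_exp = lambda x: lam * np.exp(-lam * x)
xs = np.linspace(0.0, 3.0, 7)
print(np.allclose(inverse_equilibrium_cdf(p_e_exp, xs), 1.0 - np.exp(-lam * xs)))  # True
```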
Recall that random variables $U$ and $V$ are stochastically ordered if either $\mathbb{P}(U \le x) \le \mathbb{P}(V \le x)$ for every $x \in \mathbb{R}$, or the opposite inequality holds (for all $x \in \mathbb{R}$). Now, we clarify one of the statements of Theorem 2.1 of [33] (see also Theorem 3 of [22], where a result similar to Theorem 2.1 of [33] is formulated employing generalized distributions).
Theorem 6.
Let a random variable $X^e$ satisfy the conditions of Lemma 6 with $\mathbb{E}[X^e] < \infty$, and let $X$ be the preimage under the equilibrium transformation. Then, Equation (77) holds. Moreover, the inequality becomes an equality when $X$ and $X^e$ are stochastically ordered.
Proof. 
Apply the Stein Equation (10) along with the equilibrium transformation (14). Then, in light of $\mathbb{E}[X] = \frac{1}{\lambda}$ and $\mathbb{E}[f_h(X)] - f_h(0) = \frac{1}{\lambda}\,\mathbb{E}[f_h'(X^e)]$, we can write
$$\big| \mathbb{E}[h(X^e)] - \mathbb{E}[h(Z)] \big| = \big| \mathbb{E}\big[ f_h'(X^e) - \lambda f_h(X^e) + \lambda f_h(0) \big] \big| = \lambda\, \big| \mathbb{E}\big[ f_h(X) - f_h(X^e) \big] \big| \le \lambda\, \|f_h'\|\, \mathbb{E}|X^e - X| \le \|h'\|\, \mathbb{E}|X^e - X|.$$
The last inequality in (82) is true due to Lemma 2. Now, we demonstrate that equality in (82) can be attained. Taking $h(x) = x - \frac{1}{\lambda}$, we have the solution $f_h(x) = -\frac{x}{\lambda}$ of Equation (12). Then,
$$\mathbb{E}[h(X^e)] - \mathbb{E}[h(Z)] = \lambda\, \mathbb{E}\big[ f_h(X) - f_h(X^e) \big] = \mathbb{E}(X^e - X).$$
Employing the integration by parts formula, one can show that the expression on the right-hand side of the last equality equals the Kantorovich distance between $X$ and $X^e$ when these variables are stochastically ordered. Note that $x\big(1 - F(x)\big) \to 0$ and $x\big(1 - F_e(x)\big) \to 0$ as $x \to \infty$, and $xF(x) \to 0$, $xF_e(x) \to 0$ as $x \to -\infty$, because $\mathbb{E}[X]$ and $\mathbb{E}[X^e]$ are finite. Thus,
$$\mathbb{E}[X^e] - \mathbb{E}[X] = \int_{\mathbb{R}} x\, d\big( F_{X^e}(x) - F_X(x) \big) = \int_{\mathbb{R}} \big( F_X(x) - F_{X^e}(x) \big)\, dx = \int_{\mathbb{R}} \big| F_{X^e}(x) - F_X(x) \big|\, dx,$$
since $F_{X^e}(x) \le F_X(x)$ (or $\ge$) for all $x \in \mathbb{R}$. It is well known that the Kantorovich distance is the minimal one for the metric $\tau(U,V) = \mathbb{E}|U - V|$ (see, e.g., [9], Ch. 1, §1.3). Therefore,
$$\int_{\mathbb{R}} \big| F_{X^e}(x) - F_X(x) \big|\, dx = \inf \mathbb{E}|U - V|,$$
where the infimum is taken over all joint laws of $(U,V)$ such that $\mathbb{P}_U = \mathbb{P}_{X^e}$ and $\mathbb{P}_V = \mathbb{P}_X$ (see also Remark 2 and [10], Corollary 5.3.2). Consequently, in the framework of Theorem 6, $\mathbb{E}[X^e] - \mathbb{E}[X] = \mathbb{E}|X^e - X|$. □
Remark 7.
One can show, by means of Lemma 2 and Equation (82), that it is possible to obtain the estimate
$$d_K(X^e, Z) \le \lambda\, \mathbb{E}|X^e - X|.$$
For each function $h$ belonging to $\mathcal{K}$, in a similar way to Equation (82), one can apply Equation (10) together with the equilibrium transformation. Now, it is sufficient to study the Stein equation with a right derivative. Formula (13) gives a solution of the Stein equation according to Lemma 2. Note that, for $f_h$, the right derivative coincides almost everywhere with the derivative, and the law of $X^e$ is absolutely continuous according to Equation (15). Thus, for the Lipschitz function $f_h$ (see Lemma 2), one can use the equilibrium transformation.
Example 1.
Consider the distribution functions $F_\varepsilon(x)$ of random variables $X_\varepsilon$ taking the values $\varepsilon$ and $2-\varepsilon$ with probabilities $1/2$, $0 < \varepsilon < 1$. Formula (15) yields that $X_\varepsilon^e$ has the following piecewise-linear distribution function:
$$F_\varepsilon^e(x) = \begin{cases} 0, & x < 0, \\ x, & 0 \le x < \varepsilon, \\ x/2 + \varepsilon/2, & \varepsilon \le x < 2-\varepsilon, \\ 1, & 2-\varepsilon \le x. \end{cases}$$
If $\varepsilon \ge 1/2$ then, for all $x \in \mathbb{R}$, the inequality $F_\varepsilon^e(x) \ge F_\varepsilon(x)$ holds, i.e., $X_\varepsilon$ and $X_\varepsilon^e$ are stochastically ordered. For $\varepsilon < 1/2$, the inequality is violated in the right neighborhood of the point $\varepsilon$. Thus, besides the stochastically ordered pairs $(X, X^e)$, there are also pairs of a different kind.
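The ordering claim can be verified on a dense grid. In this sketch (ours; the grid and the two test values of $\varepsilon$ are arbitrary), `F_eps` and `F_eps_e` encode $F_\varepsilon$ and $F_\varepsilon^e$:

```python
import numpy as np

def F_eps(x, eps):    # two-point law at eps and 2 - eps, each with probability 1/2
    return np.where(x < eps, 0.0, np.where(x < 2 - eps, 0.5, 1.0))

def F_eps_e(x, eps):  # its equilibrium transform from Example 1
    return np.where(x < 0, 0.0,
           np.where(x < eps, x,
           np.where(x < 2 - eps, x / 2 + eps / 2, 1.0)))

xs = np.linspace(-0.5, 2.5, 3001)
for eps in (0.6, 0.3):
    ordered = np.all(F_eps_e(xs, eps) >= F_eps(xs, eps))
    print(eps, ordered)   # True for eps >= 1/2, False for eps < 1/2
```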
Now, we turn to another example of stochastically ordered X and X e .
Example 2.
Take $X^e$ having the Pareto distribution. The notation $X^e \sim \mathrm{Pareto}(\alpha, \beta)$ means that $X^e$ has the density $f_e(x) = \alpha\beta^{\alpha}(x+\beta)^{-(\alpha+1)}$, $x \ge 0$, and the corresponding distribution function $F_e(x) = 1 - \big(\frac{\beta}{x+\beta}\big)^{\alpha}$, where $x \ge 0$, $\alpha > 0$, $\beta > 0$.
Further, we consider only $\alpha > 1$, since in this case there exists finite $\mathbb{E}[X^e] = \frac{\beta}{\alpha - 1}$. By means of Lemma 6, we obtain the distribution of the preimage under the equilibrium transformation:
$$F(x) = 1 - \frac{f_e(x)}{f_e(0)} = 1 - \frac{\alpha\beta^{\alpha}}{(x+\beta)^{\alpha+1}} \cdot \frac{\beta^{\alpha+1}}{\alpha\beta^{\alpha}} = 1 - \Big(\frac{\beta}{x+\beta}\Big)^{\alpha+1}, \quad x \ge 0.$$
Thus, one can state that $X \sim \mathrm{Pareto}(\alpha+1, \beta)$. It is not difficult to see that $F_e(x) \le F(x)$ for $x \in \mathbb{R}$, i.e., the random variables $X^e$ and $X$ are stochastically ordered. Due to Theorem 6, one has
$$d_{\mathcal{H}_1}(X^e, Z) = \mathbb{E}|X^e - X| = \mathbb{E}[X^e] - \mathbb{E}[X] = \frac{\beta}{\alpha-1} - \frac{\beta}{\alpha} = \frac{\beta}{\alpha(\alpha-1)},$$
$$d_K(X^e, Z) \le \frac{\alpha}{\beta}\, \mathbb{E}|X^e - X| = \frac{1}{\alpha - 1}.$$
In this way, we find a bound for the Kolmogorov distance between the distributions $\mathrm{Pareto}(\alpha, \beta)$ and $\mathrm{Exp}(\alpha/\beta)$. This relation demonstrates the convergence rate of $d_{\mathcal{H}_1}(X^e, Z)$ to zero as $\alpha \to \infty$. The estimate is nontrivial for $\alpha > 2$.
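The bound $d_K \le 1/(\alpha-1)$ can be confronted with a direct numerical evaluation of the Kolmogorov distance. The sketch below (ours; the grid resolution and parameter values are arbitrary) computes $\sup_x |F_{\mathrm{Pareto}(\alpha,\beta)}(x) - F_{\mathrm{Exp}(\alpha/\beta)}(x)|$ and compares it with $1/(\alpha-1)$:

```python
import numpy as np

# Direct numerical check of d_K(Pareto(a, b), Exp(a / b)) <= 1 / (a - 1).
def kolmogorov_pareto_exp(a, b, grid=200_000):
    x = np.linspace(0.0, 60.0 * b / a, grid)     # Exp(a/b) mass is concentrated near b/a
    F_par = 1.0 - (b / (x + b)) ** a
    F_exp = 1.0 - np.exp(-(a / b) * x)
    return np.max(np.abs(F_par - F_exp))

for a in (3.0, 5.0, 10.0):
    print(a, kolmogorov_pareto_exp(a, b=1.0), 1.0 / (a - 1.0))
```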
Remark 8.
It is interesting that estimation of the proximity of the Pareto law to the exponential one has become important in signal processing, see [34] and references therein. Let $X \sim \mathrm{Pareto}(\alpha, \beta)$, where $\alpha > 0$, $\beta > 0$, and $Z \sim \mathrm{Exp}(\lambda)$. In [34], the author indicates that the Pinsker–Csiszár inequality was employed to derive
$$d_K(X, Z) \le \sqrt{2\, D_{KL}(X \,\|\, Z)},$$
where $D_{KL}(X \,\|\, Z)$ is the Kullback–Leibler divergence between the laws of $X$ and $Z$. More precisely, on the left-hand side of Equation (85), one can write the total variation distance $d_{TV}(X, Z)$ between the distributions of $X$ and $Z$. Clearly, $d_K(X, Z) \le d_{TV}(X, Z)$. By evaluating $D_{KL}(X \,\|\, Z)$ and performing an optimal choice of the parameter $\lambda$, it was demonstrated (formula (19) in [34]) that, for $\alpha > 1$ and any $\beta > 0$,
$$d_K(X, Z) \le \sqrt{\frac{2}{\alpha(\alpha - 1)}}$$
if $\lambda = \frac{\alpha - 1}{\beta}$. The author of [34] writes on page 8 that, in his previous work [57], the inequality
$$d_K(X, Z) \le \frac{3}{\alpha}$$
was established with the same choice of $\lambda$. Next, he also writes that "in the most cases $\alpha > 2$" and notes that the estimate in Equation (86) involving the Kullback–Leibler divergence is more precise for $\alpha > \frac{9}{7}$ than the estimate in Equation (87) obtained by the Stein method. Moreover, on page 4 of [34] we read: "The problem with the Stein approach is that the bounds do not suggest a suitable way in which, for a given Pareto model, an appropriate approximating Exponential distribution can be specified". However, we have demonstrated that the application of the inverse equilibrium transformation together with the Stein method permits indicating, whenever $\alpha > 2$, the corresponding exponential distribution with proximity closer than the right-hand sides of Equations (86) and (87) can provide.
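For a quick comparison, one can tabulate the three estimates as functions of $\alpha$ (a sketch of ours; the right-hand sides of Equations (86) and (87) below are transcribed from the displays above as reconstructed). For $\alpha > 2$, the bound $1/(\alpha-1)$ of Example 2 is the smallest, in agreement with the closing claim of this remark:

```python
import numpy as np

# Tabulate the bound 1/(alpha - 1) from Example 2 against Equations (86) and (87).
alphas = np.array([1.5, 2.0, 2.5, 3.0, 5.0, 10.0])
bound_example2 = 1.0 / (alphas - 1.0)
bound_eq86 = np.sqrt(2.0 / (alphas * (alphas - 1.0)))
bound_eq87 = 3.0 / alphas
for a, b1, b2, b3 in zip(alphas, bound_example2, bound_eq86, bound_eq87):
    print(f"alpha={a:5.2f}  1/(alpha-1)={b1:.4f}  Eq.(86)={b2:.4f}  Eq.(87)={b3:.4f}")
# At alpha = 2 the first two coincide; for alpha > 2 the bound 1/(alpha-1) wins.
```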

8. Conclusions

Our principal goal was to find sharp estimates of the proximity of random sums distributions to exponential and more general laws. This goal is achieved when we employ the probability metric $d_{\mathcal{H}_2}$. Thus, it would be valuable to find the best possible approximations of random sums distributions by means of specified laws using the metrics $\zeta_s$ of order $s > 0$. The results of [32] provide the basis for this approach.
There are various complementary refinements of the Rényi theorem. One approach is related to the employment of Brownian motion. It is interesting that in [58] (p. 1071) the authors proposed an explanation of the Rényi theorem involving the embedding theorem. We provide a slightly different complete proof. Let $X_1, X_2, \ldots$ be i.i.d. random variables with mean $\mu := \mathbb{E} X_1$ and $\sigma^2 := \mathrm{var}\, X_1 < \infty$, and let $S_n$, $n \in \mathbb{N}$, denote the corresponding partial sums. According to Theorem 12.6 of [59], which is due to A.V. Skorokhod and V. Strassen, there exists a standard Brownian motion $B(t)$, $t \ge 0$ (perhaps defined on an extension of the initial probability space), such that
$$\frac{1}{\sqrt{t}} \sup_{0 \le u \le t} \big| S_{[u]} - \mu u - \sigma B(u) \big| \stackrel{P}{\longrightarrow} 0, \quad t \to \infty,$$
and
$$\lim_{t\to\infty} \frac{S_{[t]} - \mu t - \sigma B(t)}{\sqrt{2t \log\log t}} = 0 \quad \text{a.s.},$$
where $\stackrel{P}{\longrightarrow}$ stands for convergence in probability, and a.s. means almost surely. Thus, in light of Equation (89), we can write, for $t \ge 0$,
$$S_{[t]} = \mu t + \sigma B(t) + R(t),$$
where $\sup_{0 \le u \le t} |R(u)|/\sqrt{t} \stackrel{P}{\longrightarrow} 0$ and $R(t)/\sqrt{2t \log\log t} \to 0$ a.s. as $t \to \infty$. Substitute $N_p$ (see Equation (2)) into Equation (90) instead of $t$. It is easily seen that $N_p \stackrel{P}{\longrightarrow} \infty$ (i.e., for each $t > 0$, one has $\mathbb{P}(N_p \le t) \to 0$ as $p \to 0+$), and by means of characteristic functions one can verify that $pN_p \stackrel{D}{\longrightarrow} Z$ as $p \to 0+$, where $Z \sim \mathrm{Exp}(1)$. Therefore, $\mu p N_p \stackrel{D}{\longrightarrow} \mu Z$, $p \to 0+$. In the proof of Lemma 4, we showed (Equation (24)) that $\mathbb{E}[N_p] = (1-p)/p$. Consequently,
$$\mathrm{var}\big[pB(N_p)\big] = p^2\, \mathbb{E}\big[B(N_p)^2\big] = p^2 \sum_{k=0}^{\infty} \mathbb{E}\big[B(k)^2\big]\, p(1-p)^k = p^2 \sum_{k=0}^{\infty} k\, p(1-p)^k = p^2\, \mathbb{E}[N_p] = p^2\, \frac{1-p}{p} = p(1-p) \to 0, \quad p \to 0+.$$
Hence, $p\sigma B(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. Now, we demonstrate that $pR(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. For any $\varepsilon > 0$ and any $t > 0$,
$$\mathbb{P}\big(p|R(N_p)| > \varepsilon\big) \le \mathbb{P}\big(p|R(N_p)| > \varepsilon,\ pN_p \le t\big) + \mathbb{P}(pN_p > t) \le \mathbb{P}\Big(p \sup_{0 \le u \le t/p} |R(u)| > \varepsilon\Big) + \mathbb{P}(pN_p > t).$$
Fix arbitrary $\gamma > 0$ and $\varepsilon > 0$. Since $\mathbb{E}[pN_p] = 1 - p \le 1$, the Markov inequality permits choosing $t_0 = t_0(\gamma)$ such that $\mathbb{P}(pN_p > t_0) < \gamma/2$ for all $p \in (0,1)$. Furthermore, in light of Equation (88), setting $s = t_0/p$ (so that $s \to \infty$ as $p \to 0+$), we obtain
$$\mathbb{P}\Big(p \sup_{0 \le u \le t_0/p} |R(u)| > \varepsilon\Big) = \mathbb{P}\Big(\sup_{0 \le u \le s} |R(u)| > \frac{\varepsilon}{t_0}\, s\Big) < \gamma/2$$
for all sufficiently small $p > 0$. Therefore, $pR(N_p) \stackrel{P}{\longrightarrow} 0$ as $p \to 0+$. The Slutsky lemma yields the desired relation
$$p\, S_{N_p} \stackrel{D}{\longrightarrow} \mu Z, \quad p \to 0+,$$
which implies Equation (3). However, it seems that there is no clear intuitive reason why the law of the random sum converges to an exponential one in the Rényi theorem. Moreover, in Ch. 3, Sec. 2 "The Rényi Limit Theorem" of [20] (see Sec. 2.1 "Motivation"), one can find examples demonstrating that the intuition behind the Rényi theorem is poor.
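The Rényi theorem is also easy to illustrate by simulation. In the sketch below (ours; the uniform summands on $[0,2]$ with $\mu = 1$ and all sample sizes are arbitrary illustrative choices), the Kolmogorov–Smirnov statistic of $p\,S_{N_p}$ against $\mathrm{Exp}(1)$ should be small for small $p$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# p * S_{N_p} is approximately Exp(1) for small p (here mu = 1).
p, n = 0.01, 20_000
N = rng.geometric(p, size=n) - 1                      # P(N_p = k) = p (1-p)^k, k >= 0
sums = np.array([rng.uniform(0.0, 2.0, size=k).sum() for k in N])
print(stats.kstest(p * sums, "expon").statistic)      # small for small p
```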
Actually, relation (90) leads to refinements of Equation (3). In [58], it is proved that if $X_1$ has finite exponential moments and other specified conditions are satisfied, then there exists a more sophisticated approximation for the distribution of $W_p$, and its accuracy is estimated. The results are applied to the study of the $M/G/1$ queue for both light-tailed and heavy-tailed service time distributions. Note that in [58], Section 5, the authors study the model where the distribution of $X_1$ can depend on $p$. For future research, it would be desirable to establish analogues of our theorems for such a model.
The results concerning the accuracy of approximating a distribution under consideration by an exponential law are applicable to some queueing models. Let, for an $M/G/1$ queue, the inter-arrival times follow the $\mathrm{Exp}(\lambda)$ distribution, and let $S$ stand for the general service time. Introduce the stationary waiting time $W$ and define $\rho := \lambda\,\mathbb{E}[S]$ to be the load. Due to [60], if $\mathbb{E}[S^3] < \infty$, then $(1-\rho)W \stackrel{D}{\longrightarrow} Z$ as $\rho \to 1-$, where $Z \sim \mathrm{Exp}(1)$. Theorem 3.1 of [45] contains an upper bound for $d_{\mathcal{H}_1}(W_p, Z)$, where $Z \sim \mathrm{Exp}(1)$. This estimate is used by the authors for the analysis of queueing systems with a single server. It would be interesting to obtain sharp approximations in the framework of queueing systems.
For the model of exchangeable random variables, Theorem 2 in Section 2 ensures the weak convergence of the distributions under consideration to a specified mixture of explicitly indicated laws. Theorem 3 provides the sharp convergence rate estimate for this limit law by means of the ideal probability metric of the second order. It would be worthwhile to establish such an estimate of the distributions' proximity employing the Lévy–Prokhorov distance, because convergence in this metric is equivalent to the weak convergence of distributions of random variables. All the more so as, at present, there is no unified theory of probability metrics. In this regard, one can mention Proposition 1.2 of [17], stating that if a random variable $Z$ has a Lebesgue density bounded by $C$, then, for any random variable $Y$,
$$d_K(Y, Z) \le \sqrt{2C\, d_{\mathcal{H}_1}(Y, Z)}.$$
However, this estimate only gives sub-optimal convergence rates. We also highlight the important total variation distance $d_{TV}$. The authors of [61] study the sum $W := \sum_{j\in J} X_j$, where $\{X_j,\ j \in J\}$ is a family of locally dependent non-negative integer-valued random variables. Using perturbations of Stein's operator, they establish upper bounds for $d_{TV}(W, M)$, where the law of $M$ is a mixture of a Poisson distribution and either a binomial or a negative binomial distribution. It would be desirable to obtain sharp estimates and, moreover, to consider a more general model where the set of summation is random. In this connection, it seems helpful to employ the paper [62], where the authors proved results concerning the weak convergence of distributions of statistics constructed from samples of random size. In addition, it would be interesting to extend these results to stratified samples by invoking Lemma 1 of [63].
Special attention is paid to various generalizations of geometric sums. In Theorem 3.3 of [64], the authors consider random sums with the summation index $T_n := Y_1 + \ldots + Y_n$, where $Y_1, Y_2, \ldots$ are i.i.d. random variables following the geometric law $\mathrm{Geom}(p)$, see Equation (2). Then, they show that $S_{T_n}/\mathbb{E}[S_{T_n}]$ converges in distribution to the gamma law with certain parameters as $p \to 0+$. In [62], it is demonstrated that the Linnik and the Mittag–Leffler laws arise naturally in the framework of limit theorems for random sums. Hopefully, in the future, a complete picture of limit laws involving a general theory of distribution mixtures will appear. In addition, it is desirable to study various models of random sums of dependent random variables. On this track, it could be useful to consider decompositions of exchangeable random sequences extending the fundamental de Finetti theorem, see, e.g., [65].
One can try to generalize the results of Section 7 to the accumulative laws proposed in [66]. These laws are akin to both the Pareto distribution and the lognormal distribution. In addition, we refer to [43], where the "variance-gamma distributions" were studied. These distributions form a four-parameter family and comprise, as special and limiting cases, the normal, gamma and Laplace distributions. The employment of these distributions permits enlarging the range of applications in modeling and fitting real data.
To complete the indication of further research directions, we note that the next essential and nontrivial step is to establish limit theorems in functional spaces for processes generated by a sequence of random sums of random variables. For such stochastic processes, one can obtain analogues of the classical invariance principles.

Author Contributions

Conceptualization, A.B. and N.S.; methodology, A.B. and N.S.; formal analysis, A.B. and N.S.; investigation, A.B. and N.S.; writing—original draft preparation, A.B. and N.S.; writing—review and editing, A.B. and N.S.; supervision, A.B.; project administration, A.B.; funding acquisition, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

The work was supported by the Lomonosov Moscow State University project “Fundamental Mathematics and Mechanics”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful to Alexander Tikhomirov for the invitation to present the manuscript for this issue. In addition, they would like to thank the three anonymous Reviewers for the careful reading of the manuscript and valuable remarks.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Lemma 1.
If $\mathrm{Lip}(h) = C < \infty$, then $h$ is absolutely continuous (see, e.g., §13 in [42]), and consequently $h'(x)$ exists for almost all $x \in \mathbb{R}$. Thus, $|h'(x)| \le C$ for almost all $x \in \mathbb{R}$ in light of Equation (4). Assume that the essential supremum $\|h'\| = C_0 < C$. Then, for any $\varepsilon > 0$, one can find a version of $h'$, defined on $\mathbb{R}$, such that $\sup_{x\in\mathbb{R}} |h'(x)| \le C_0 + \varepsilon$. (It was explained in Section 2 that one can consider a measurable extension of $h'$ to $\mathbb{R}$.) Then, due to Equation (11) with $h$ instead of $f$, we obtain Equation (5) with $C_0 + \varepsilon$ instead of $C$. Consequently, $\mathrm{Lip}(h) \le C_0 < C$. We come to a contradiction.
On the other hand, let $h$ be absolutely continuous. Then, for almost all $x \in \mathbb{R}$, there exists $h'(x)$, and Equation (11) is valid for $h$ instead of $f$. Assume that the essential supremum $\|h'\| = C < \infty$. Then, for any $\varepsilon > 0$, there is a version of $h'$ such that $\sup_{x\in\mathbb{R}} |h'(x)| \le C + \varepsilon$. According to Equation (11), relation (5) holds with $C + \varepsilon$ instead of $C$. Since $\varepsilon > 0$ can be taken arbitrarily small, one can claim that $\mathrm{Lip}(h) \le C$. Suppose that $\mathrm{Lip}(h) \le C_0 < C$. Then, for almost all $x \in \mathbb{R}$, $h'(x)$ exists and $|h'(x)| \le C_0$. Thus, we have found a version with $\|h'\| \le C_0$. The contradiction shows that $\mathrm{Lip}(h) = C$. Hence, the desired statement is proved. □
Proof of Lemma 2.
Let $x_0$ be a continuity point of a function $h \in \mathcal{K} \cup \mathcal{H}_1 \cup \mathcal{H}_2$. Then, the same is true for the function $h(u)e^{-\lambda u}$, $u \in \mathbb{R}$. Hence, the function $\int_x^{\infty} h(u)e^{-\lambda u}\, du$ has the derivative $-h(x_0)e^{-\lambda x_0}$ at the point $x_0$ (in light of Remark 1, the integral $\int_x^{\infty} h(u)e^{-\lambda u}\, du$ is well defined for any $x \in \mathbb{R}$). Thus, for each point $x$ of continuity of $h$, there exists
$$f_h'(x) = -\lambda e^{\lambda x}\int_x^{\infty} h(u)e^{-\lambda u}\, du + e^{\lambda x}\big(h(x)e^{-\lambda x}\big) = \lambda f_h(x) + h(x).$$
For each fixed $z \in \mathbb{R}$ and the function $h(x) = \mathbb{I}\{x \le z\}$, $x \in \mathbb{R}$, Equation (12) is verified in a similar way for the right derivative of $f_h$ at the point $z$. Taking $x = 0$ in Equation (12), we obtain $f_h(0) = -\mathbb{E}[h(Z)]/\lambda$. Evidently, $e^{\lambda x}\int_x^{\infty} e^{-\lambda u}\, du = 1/\lambda$. Therefore, Equation (A1) yields
$$f_h'(x) = -\lambda e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du.$$
If a function $h$ belongs to $\mathcal{K}$, then, for any $u, x \in \mathbb{R}$, the inequality $|h(u) - h(x)| \le 1$ holds. Consequently, for $h \in \mathcal{K}$, one has $\|f_h'\| \le 1$ (where $f_h'$ means a right derivative of a version of $f_h$, and we operate with the essential supremum).
Taking into account Lemma 1, for a function $h \in \mathcal{H}_1$ and any $x \le u$, one can write $|h(u) - h(x)| \le \mathrm{Lip}(h)(u - x) = \|h'\|(u - x)$. For $h \in \mathcal{H}_2$ and $x \le u$, by the Lagrange finite-increments formula, $|h(u) - h(x)| \le |h'(v)|(u - x) \le \|h'\|(u - x)$, where $x < v < u$. Hence, for any $x \in \mathbb{R}$ and $h \in \mathcal{H}_1 \cup \mathcal{H}_2$,
$$|f_h'(x)| = \lambda e^{\lambda x}\Big|\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du\Big| \le \lambda e^{\lambda x}\, \|h'\| \int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = \frac{\|h'\|}{\lambda},$$
since
$$\lambda e^{\lambda x}\int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = \int_0^{\infty} \lambda v\, e^{-\lambda v}\, dv = \frac{1}{\lambda}.$$
Taking into account Equation (12), one can see that, for any $h \in \mathcal{H}_2$, $f_h' = \lambda f_h + h$, where $f_h$ and $h$ have derivatives at each point $x \in \mathbb{R}$. Using Equations (A2) and (A3), we obtain, for $x \in \mathbb{R}$,
$$f_h''(x) = \lambda f_h'(x) + h'(x) = -\lambda^2 e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x)\big)\, e^{-\lambda u}\, du + h'(x) = -\lambda^2 e^{\lambda x}\int_x^{\infty} \big(h(u) - h(x) - h'(x)(u - x)\big)\, e^{-\lambda u}\, du.$$
By means of Equation (A3) and the Lagrange finite-increments formula, we can write
$$|f_h''(x)| \le 2\|h'\|\, \lambda^2 e^{\lambda x}\int_x^{\infty} (u - x)\, e^{-\lambda u}\, du = 2\|h'\|.$$
Let us apply the Taylor formula with the integral representation of the residual term:
$$h(u) = h(x) + h'(x)(u - x) + R(u, x), \qquad R(u, x) = \int_x^u (u - t)\, h''(t)\, dt, \quad u, x \in \mathbb{R}.$$
This representation, known for the Riemann integral (see, e.g., [67], §9.17), holds in the framework of the Lebesgue integral if it is possible to use the recurrent integration by parts for $R(u, x)$, i.e.,
$$\int_x^u (u - t)\, h''(t)\, dt = -h'(x)(u - x) + \int_x^u h'(t)\, dt = -h'(x)(u - x) + h(u) - h(x).$$
The integral on the left-hand side of Equation (A7) exists by virtue of Lemma 1, since $h' \in \mathrm{Lip}(1)$. Therefore, $h''(x)$ is defined for almost all $x \in \mathbb{R}$ and (essential supremum) $\|h''\| \le 1$. The latter equality in Equation (A7) is obvious, since $h'$ is a continuous function on $\mathbb{R}$. The first equality in Equation (A7) is valid due to the integration by parts formula for the Lebesgue integral. Indeed, the functions $h'(t)$ and $(u - t)$ are absolutely continuous for $t$ belonging to $[x, u]$. Thus, we can apply, e.g., Theorem 13.29 of [42] to justify the first equality in Equation (A7). Consequently, due to Equations (A4) and (A6), one can write
$$|f_h''(x)| \le \lambda^2 e^{\lambda x}\int_x^{\infty} \Big| \int_x^u (u - t)\, h''(t)\, dt \Big|\, e^{-\lambda u}\, du \le \frac{\|h''\|}{2} \int_x^{\infty} \lambda^2 (u - x)^2 e^{-\lambda(u - x)}\, du = \frac{\|h''\|\,\Gamma(3)}{2\lambda} = \frac{\|h''\|}{\lambda},$$
where $\Gamma(\alpha) := \int_0^{\infty} u^{\alpha - 1} e^{-u}\, du$, $\alpha > 0$. Relations (A5) and (A8) lead to the last statement of Lemma 2. The proof is complete. □
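The bound $\|f_h'\| \le \|h'\|/\lambda$ can be probed by quadrature via the representation (A2) (a sketch of ours; the sign convention follows the reconstruction above, and $h = \sin$ with $\|h'\| = 1$ is an arbitrary test function):

```python
import numpy as np
from scipy.integrate import quad

# Quadrature check of ||f_h'|| <= ||h'|| / lambda via representation (A2).
lam = 2.0
h = np.sin

def f_h_prime(x):
    integral, _ = quad(lambda u: (h(u) - h(x)) * np.exp(-lam * (u - x)), x, np.inf)
    return -lam * integral

xs = np.linspace(-10.0, 10.0, 201)
print(max(abs(f_h_prime(x)) for x in xs), "<=", 1.0 / lam)
```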
Comments to Definition 1.
For each Lipschitz function $f$, one can claim that $\mathbb{E}[f(X)]$ is finite, since $\mathbb{E}|X| < \infty$ and, in light of Remark 1, one has $|f(x)| \le C|x| + |f(0)|$, where $C = \mathrm{Lip}(f)$, $x \in \mathbb{R}$. Clearly, it is sufficient to verify Equation (14) for any Lipschitz function $f$ such that $f(0) = 0$ (otherwise, we take the Lipschitz function $f(x) - f(0)$, $x \in \mathbb{R}$). Evidently, $p_e(x)$, $x \in \mathbb{R}$, introduced by Equation (15), is a probability density, because for a non-negative random variable $X$, according to [47], Ch. 2, formula (69),
$$\mathbb{E}[X] = \int_{[0,\infty)} \mathbb{P}(X > u)\, du.$$
We will show that, for such $f$ and a density $p_e$ of $X^e$, one has
$$\int_{[0,\infty)} f(u)\, dF(u) = \int_{[0,\infty)} f'(u)\, \mathbb{P}(X > u)\, du,$$
where $F$ is the distribution function of $X$ and $\mathbb{E}[X] > 0$. We take integrals over $[0,\infty)$, as $X \ge 0$ and $p_e(x) = 0$ for $x < 0$.
We know that the function $f$ has a derivative at almost all points $x \in \mathbb{R}$. Therefore, the right-hand side of Equation (A10) does not depend on the choice of a version of $f'$ ($\mathbb{P}(X > u)$ is a measurable bounded function). The integral on the right-hand side of Equation (A10) is finite, because $\|f'\| \le C$ in light of Lemma 1 and since the right-hand side of Equation (A9) is finite. One can take the integrals over $(0,\infty)$ in Equation (A10), as $f(0) = 0$ and $m(\{0\}) = 0$, where $m$ stands for the Lebesgue measure.
The function $f$ is a function of finite variation (as $f$ is a Lipschitz function). Therefore, $f = f_1 - f_2$, where $f_1$ and $f_2$ are nondecreasing functions. We can take the canonical representation with $f_1(x) = \mathrm{Var}_0^x(f)$ and $f_2(x) = f_1(x) - f(x)$, $x \in \mathbb{R}$, where $\mathrm{Var}_a^b(f)$ is the variation of $f$ on $[a,b]$, $a < b$ (see, e.g., [42], Theorem 12.18). If $f \in \mathrm{Lip}(C)$, then $\mathrm{Var}_a^b(f) \le C(b - a)$. For $a < c < b$, one has (see, e.g., [42], Lemma 12.15)
$$\mathrm{Var}_a^c(f) + \mathrm{Var}_c^b(f) = \mathrm{Var}_a^b(f).$$
We see that such $f_1$ and $f_2$ are Lipschitz functions when $f$ is a Lipschitz one. Hence, for almost all $x \in \mathbb{R}$, there exist $f_1'(x)$, $f_2'(x)$ and $f'(x) = f_1'(x) - f_2'(x)$. Thus, it is enough to demonstrate that
$$\int_{(0,\infty)} f_i(u)\, dF(u) = \int_{(0,\infty)} f_i'(u)\, \mathbb{P}(X > u)\, du, \quad i = 1, 2.$$
These integrals are finite, since $f_1$ and $f_2$ are Lipschitz functions. Note that
$$\int_{(0,\infty)} f_i(u)\, dF(u) = -\int_{(0,\infty)} f_i(u)\, d\big(1 - F(u)\big) = -\int_{(0,\infty)} f_i(u)\, d\mathbb{P}(X > u).$$
By applying Theorem 11 of Sec. 6, Ch. 2 of [47] to the nondecreasing continuous function $f_i$ and the nonincreasing right-continuous function $\mathbb{P}(X > u)$, one obtains, for each $b > 0$, the following formula:
$$\int_{(0,b]} f_i(u)\, d\mathbb{P}(X > u) = f_i(b)\mathbb{P}(X > b) - f_i(0)\mathbb{P}(X > 0) - \int_{(0,b]} \mathbb{P}(X > u-)\, df_i(u) = f_i(b)\mathbb{P}(X > b) - \int_{(0,b]} \mathbb{P}(X > u)\, f_i'(u)\, du.$$
We take into account that $f_i(0) = 0$, that the $\sigma$-finite measure $Q_i$ corresponding to $f_i$ is absolutely continuous w.r.t. $m$, and that the Radon–Nikodym derivative is $\frac{dQ_i}{dm}(x) = f_i'(x)$, $x \in \mathbb{R}$, $i = 1, 2$. In addition, we can write $\mathbb{P}(X > u)$ in Equation (A11), since for almost all $u \in \mathbb{R}$ the left limit $\mathbb{P}(X > u-)$ of this function coincides with $\mathbb{P}(X > u)$ (there exists at most a countable set of jumps of $\mathbb{P}(X > u)$, $u \in \mathbb{R}$). Obviously, $f_i(b)\mathbb{P}(X > b) \to 0$ as $b \to \infty$, because $|f_i(u)| \le A_i u + B_i$ for some positive $A_i$, $B_i$ and all $u \ge 0$. Indeed, according to formula (73) of Sec. 6, Ch. 2 of [47], the condition $\mathbb{E}|X| < \infty$ yields
$$b\, \mathbb{P}(|X| > b) \to 0, \quad b \to \infty.$$
By the Lebesgue dominated convergence theorem, one infers that
$$\int_{(0,b]} f_i(u)\, d\mathbb{P}(X > u) \to \int_{(0,\infty)} f_i(u)\, d\mathbb{P}(X > u), \quad b \to \infty,$$
and
$$\lim_{b\to\infty} \int_{(0,b]} \mathbb{P}(X > u)\, f_i'(u)\, du = \int_{(0,\infty)} \mathbb{P}(X > u)\, f_i'(u)\, du.$$
This permits us to claim the validity of Equation (A10), which entails the desired Equation (15).
Proof of Lemma 3.
For $f \in \mathcal{H}_2$, in light of Remark 1, one can state that $|f(x)| \le A_0 x^2 + B_0$ for some positive numbers $A_0$ and $B_0$. Let $F$ be the distribution function of $X$. Since $\mathbb{E}[X^2] < \infty$, due to Corollary 2, Sec. 6, Ch. 2, v. 1 of [47], one has
$$x^2 F(x) \to 0, \quad x \to -\infty; \qquad x^2\big(1 - F(x)\big) \to 0, \quad x \to \infty.$$
Hence, we obtain that $f(x)F(x) \to 0$ as $x \to -\infty$ and $f(x)\big(1 - F(x)\big) \to 0$ as $x \to \infty$. The continuous function $f$ has locally bounded variation. Thus, $f = f_1 - f_2$, where $f_1$ and $f_2$ are nondecreasing continuous functions. Hence, for any $a < 0$, the integration by parts formula (see, e.g., Theorem 11, Sec. 6, Ch. 2 of [47]) and Equation (18) give
$$\int_{(a,0]} \big(f_1(x) - f_2(x)\big)\, dF(x) = f(0)F(0) - f(a)F(a) - \int_{(a,0]} F(x)\, df_1(x) + \int_{(a,0]} F(x)\, df_2(x) = f(0)F(0) - f(a)F(a) - \int_{(a,0]} F(x)\, df(x).$$
We take into account that the integrands are bounded measurable functions and that the measures corresponding to $F$, $f_1$ and $f_2$ are finite on any interval $(a,0]$. Therefore, such integrals are finite. According to the Lebesgue theorem on dominated convergence (recall that $\mathbb{E}[X^2] < \infty$), one has
$$\lim_{a\to-\infty} \int_{(a,0]} f(x)\, dF(x) = \int_{(-\infty,0]} f(x)\, dF(x),$$
and the limit is finite. The monotone convergence theorem for $\sigma$-finite measures yields
$$\lim_{a\to-\infty} \Big( \int_{(a,0]} F(x)\, df_1(x) - \int_{(a,0]} F(x)\, df_2(x) \Big) = \int_{(-\infty,0]} F(x)\, df_1(x) - \int_{(-\infty,0]} F(x)\, df_2(x).$$
We have seen that $f(a)F(a) \to 0$ as $a \to -\infty$. Hence, in light of Equation (18),
$$\int_{(-\infty,0]} F(x)\, df_1(x) - \int_{(-\infty,0]} F(x)\, df_2(x) = \int_{(-\infty,0]} F(x)\, df(x).$$
Therefore, for $i = 1, 2$, each integral $\int_{(-\infty,0]} F(x)\, df_i(x)$ is finite, as $\int_{(-\infty,0]} F(x)\, df(x)$ is finite. Thus,
$$\int_{(-\infty,0]} f(x)\, dF(x) = f(0)F(0) - \int_{(-\infty,0]} F(x)\, df(x) = f(0)F(0) - \int_{(-\infty,0]} F(x)\, f'(x)\, dx,$$
as $f$ is absolutely continuous. Indeed, for any $x \in \mathbb{R}$,
$$f(x) = f(0) + \int_{(0,x]} f'(u)\, du,$$
where the (continuous) $f'$ belongs to $L_1[a,b]$ for any finite interval $[a,b]$. Thus, $(f')_+ \in L_1[a,b]$ and $(f')_- \in L_1[a,b]$. Set
$$f_1(x) := f(0) + \int_{(0,x]} \big(f'(u)\big)_+\, du, \qquad f_2(x) := \int_{(0,x]} \big(f'(u)\big)_-\, du.$$
Then $f_1$ and $f_2$ are nondecreasing continuous functions on $\mathbb{R}$, $f = f_1 - f_2$, and
$$\int_{(a,0]} F(x)\, df(x) = \int_{(a,0]} F(x)\, df_1(x) - \int_{(a,0]} F(x)\, df_2(x),$$
where these three integrals are finite. For the (non-negative) $\sigma$-finite measures corresponding to $f_1$ and $f_2$, one can write
$$\int_{(a,0]} F(x)\, df_1(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_+\, dx, \qquad \int_{(a,0]} F(x)\, df_2(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_-\, dx.$$
Thus, one has
$$\int_{(a,0]} F(x)\, df(x) = \int_{(a,0]} F(x)\big(f'(x)\big)_+\, dx - \int_{(a,0]} F(x)\big(f'(x)\big)_-\, dx = \int_{(a,0]} F(x)\Big(\big(f'(x)\big)_+ - \big(f'(x)\big)_-\Big)\, dx = \int_{(a,0]} F(x)\, f'(x)\, dx.$$
The bound $\|f'\| \le 1$ follows from Lemma 1. Therefore, the Lebesgue theorem on dominated convergence yields (as $\mathbb{E}|X| < \infty$)
$$\lim_{a\to-\infty} \int_{(a,0]} F(x)\, f'(x)\, dx = \int_{(-\infty,0]} F(x)\, f'(x)\, dx.$$
We have demonstrated that
$$\int_{(-\infty,0]} F(x)\, df(x) = \int_{(-\infty,0]} F(x)\, f'(x)\, dx.$$
In a similar way, we consider $\int_{(0,b]} f(x)\, d\big(1 - F(x)\big)$ and, letting $b \to \infty$, come to the relation
$$-\int_{(0,\infty)} f(x)\, d\big(1 - F(x)\big) = f(0)\big(1 - F(0)\big) + \int_{(0,\infty)} \big(1 - F(x)\big)\, df(x) = f(0)\big(1 - F(0)\big) + \int_{(0,\infty)} \big(1 - F(x)\big)\, f'(x)\, dx.$$
This establishes Equation (21). □

References

  1. Steutel, F.W.; Van Harn, K. Infinite Divisibility of Probability Distributions on the Real Line; Marcel Dekker: New York, NY, USA, 2004. [Google Scholar]
  2. Nolan, J.P. Univariate Stable Distributions. Models for Heavy Tailed Data; Springer: Cham, Switzerland, 2020. [Google Scholar]
  3. Jagers, P. Branching processes: Personal historical perspective. In Statistical Modeling for Biological Systems; Almudevar, A., Oakes, D., Hall, J., Eds.; Springer: Cham, Switzerland, 2020; pp. 1–12. [Google Scholar] [CrossRef]
  4. Schmidli, H. Risk Theory; Springer: Cham, Switzerland, 2017. [Google Scholar]
  5. Gnedenko, B.V.; Korolev, V.Y. Random Summation. Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996. [Google Scholar]
  6. Kalashnikov, V.V. Geometric Sums: Bounds for Rare Events with Applications; Kluwer Academic: Dordrecht, The Netherlands, 1997. [Google Scholar]
  7. Pinsky, M.A.; Karlin, S. An Introduction to Stochastic Modeling, 4th ed.; Academic Press: Amsterdam, The Netherlands, 2011. [Google Scholar]
  8. Bulinski, A.; Spodarev, E. Introduction to random fields. In Stochastic Geometry, Spacial Statistics and Random Fields. Asymptotic Methods; Spodarev, E., Ed.; Springer: Berlin, Germany, 2013; pp. 277–336. [Google Scholar] [CrossRef]
  9. Zolotarev, V.M. Modern Theory of Summation of Random Variables; De Gruyter: Berlin, Germany, 1997. [Google Scholar]
  10. Rachev, S.T.; Klebanov, L.B.; Stoyanov, S.V.; Fabozzi, F.J. The Methods of Distances in the Theory of Probability and Statistics; Springer: New York, NY, USA, 2013. [Google Scholar]
  11. Stein, C. A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory; Statistical Laboratory of the University of California: Berkeley, CA, USA, 1972; pp. 583–602. [Google Scholar]
  12. Stein, C. Approximate Computation of Expectations, Institute of Mathematical Statistics Lecture Notes—Monograph Series, 7; Institute of Mathematical Statistics: Hayward, CA, USA, 1986. [Google Scholar]
  13. Slepov, N.A. Convergence rate of random geometric sum distributions to the Laplace law. Theory Probab. Appl. 2021, 66, 121–141. [Google Scholar] [CrossRef]
  14. Tyurin, I.S. On the convergence rate in Lyapunov’s theorem. Theory Probab. Appl. 2011, 55, 253–270. [Google Scholar] [CrossRef]
  15. Barbour, A.D.; Chen, L.H.Y. (Eds.) An Introduction to Stein’s Method; World Scientific: Singapore, 2005. [Google Scholar]
  16. Chen, L.H.Y.; Goldstein, L.; Shao, Q.-M. Normal Approximation by Stein’s Method; Springer: Heidelberg, Germany, 2011. [Google Scholar]
  17. Ross, N. Fundamentals of Stein’s method. Probab. Surv. 2011, 8, 210–293. [Google Scholar] [CrossRef]
  18. Arras, B.; Breton, J.-C.; Deshayes, A.; Durieu, O.; Lachièze-Rey, R. Some recent advances for limit theorems. ESAIM Proc. Surv. 2020, 68, 73–96. [Google Scholar] [CrossRef]
  19. Arras, B.; Houdré, C. On Stein’s Method for Infinitely Divisible Laws with Finite First Moment, 1st ed.; Springer: Cham, Switzerland, 2019. [Google Scholar]
  20. Chen, P.; Nourdin, I.; Xu, L.; Yang, X.; Zhang, R. Non-integrable Stable Approximation by Stein’s Method. J. Theor. Probab. 2022, 35, 1137–1186. [Google Scholar] [CrossRef]
  21. Rényi, A. (Hungarian) A characterization of Poisson processes. Magyar Tud. Akad. Mat. Kutató. Int. Közl. 1957, 1, 519–527. [Google Scholar]
  22. Shevtsova, I.; Tselishchev, M. A generalized equilibrium transform with application to error bounds in the Rényi theorem with no support constraints. Mathematics 2020, 8, 577. [Google Scholar] [CrossRef]
  23. Brown, M. Error bounds for exponential approximations of geometric convolutions. Ann. Probab. 1990, 18, 1388–1402. [Google Scholar] [CrossRef]
  24. Brown, M. Sharp bounds for exponential approximations under a hazard rate upper bound. J. Appl. Probab. 2015, 52, 841–850. [Google Scholar] [CrossRef]
  25. Hung, T.L.; Kein, P.T. On the rates of convergence in weak limit theorems for normalized geometric sums. Bull. Korean Math. Soc. 2020, 57, 1115–1126. [Google Scholar] [CrossRef]
  26. Shevtsova, I.; Tselishchev, M. On the accuracy of the exponential approximation to random sums of alternating random variables. Mathematics 2020, 8, 1917. [Google Scholar] [CrossRef]
  27. Korolev, V.; Zeifman, A. Bounds for convergence rate in laws of large numbers for mixed Poisson random sums. Stat. Probab. 2021, 168, 108918. [Google Scholar] [CrossRef]
  28. Aldous, D.J. More Uses of Exchangeability: Representations of Complex Random Structures. In Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman; Bingham, N.H., Goldie, C.M., Eds.; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
  29. Shevtsova, I.; Tselishchev, M. On the accuracy of the generalized gamma approximation to generalized negative binomial random sums. Mathematics 2021, 9, 1571. [Google Scholar] [CrossRef]
  30. Liu, Q.; Xia, A. Geometric sums, size biasing and zero biasing. Electron. Commun. Probab. 2022, 27, 1–13. [Google Scholar] [CrossRef]
  31. Döbler, C.; Peccati, G. The Gamma Stein equation and noncentral de Jong theorems. Bernoulli 2018, 24, 3384–3421. [Google Scholar] [CrossRef] [Green Version]
  32. Korolev, V. Bounds for the rate of convergence in the generalized Rényi theorem. Mathematics 2022, 10, 4252. [Google Scholar] [CrossRef]
  33. Peköz, E.A.; Röllin, A. New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Probab. 2011, 39, 587–608. [Google Scholar] [CrossRef] [Green Version]
  34. Weinberg, G.V. Kullback-Leibler divergence and the Pareto-Exponential approximation. SpringerPlus 2016, 5, 604. [Google Scholar] [CrossRef]
  35. Daly, F. Gamma, Gaussian and Poisson approximations for random sums using size-biased and generalized zero-biased couplings. Scand. Actuar. J. 2022, 24, 471–487. [Google Scholar] [CrossRef]
  36. Zolotarev, V.M. Ideal metrics in the problem of approximating the distributions of sums of independent random variables. Theory Probab. Appl. 1977, 22, 433–449. [Google Scholar] [CrossRef]
  37. Gibbs, A.L.; Su, F.E. On choosing and bounding probability metrics. Int. Stat. Rev. 2002, 70, 419–435. [Google Scholar] [CrossRef] [Green Version]
  38. Janson, S. Probability Distances. 2020. Available online: www2.math.uu.se/∼svante (accessed on 1 September 2022).
  39. Peköz, E.A.; Röllin, A.; Ross, N. Total variation error bounds for geometric approximation. Bernoulli 2013, 19, 610–632. [Google Scholar] [CrossRef] [Green Version]
  40. Slepov, N.A. Generalized Stein equation on extended class of functions. In Proceedings of the International Conference on Analytical and Computational Methods in Probability Theory and Its Applications, Moscow, Russia, 23–27 October 2017; pp. 75–79. [Google Scholar]
  41. Ley, C.; Reinert, G.; Swan, Y. Stein’s method for comparison of univariate distributions. Probab. Surv. 2017, 14, 1–52. [Google Scholar] [CrossRef]
  42. Yeh, J. Real Analysis. Theory of Measure and Integration, 2nd ed.; World Scientific: Singapore, 2006. [Google Scholar]
  43. Gaunt, R.E. Wasserstein and Kolmogorov error bounds for variance gamma approximation via Stein’s method I. J. Theor. Probab. 2020, 33, 465–505. [Google Scholar] [CrossRef] [Green Version]
  44. Halmos, P.R. Measure Theory; Springer: New York, NY, USA, 1974. [Google Scholar]
  45. Gaunt, R.E.; Walton, N. Stein’s method for the single server queue in heavy traffic. Stat. Probab. Lett. 2020, 156, 108566. [Google Scholar] [CrossRef]
  46. Muthukumar, T. Measure Theory and Lebesgue Integration. 2018. Available online: home.iitk.ac.in/∼tmk (accessed on 1 September 2022).
  47. Shiryaev, A.N. Probability-1; Springer: New York, NY, USA, 2016. [Google Scholar]
  48. Burkill, L.C. The Lebesgue Integral; Cambridge University Press: Cambridge, UK, 1963. [Google Scholar]
  49. Korolev, V.; Zeifman, A. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388. [Google Scholar] [CrossRef] [Green Version]
  50. Baldi, P.; Rinott, Y.; Stein, C. A normal approximations for the number of local maxima of a random function on a graph. In Probability, Statistics and Mathematics, Papers in Honor of Samuel Karlin; Anderson, T.W., Athreya, K.B., Iglehart, D.L., Eds.; Academic Press: San-Diego, CA, USA, 1989; pp. 59–81. [Google Scholar] [CrossRef]
  51. Goldstein, L.; Rinott, Y. Multivariate normal approximations by Stein’s method and size bias couplings. J. Appl. Prob. 1996, 33, 1–17. [Google Scholar] [CrossRef]
  52. Goldstein, L. Berry-Esseen bounds for combinatorial central limit theorems and pattern occurrences, using zero and size biasing. J. Appl. Probab. 2005, 42, 661–683. [Google Scholar] [CrossRef]
  53. Goldstein, L.; Reinert, G. Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 1997, 7, 935–952. [Google Scholar] [CrossRef]
  54. Gaunt, R.E. On Stein’s method for products of normal random variables and zero bias couplings. Bernoulli 2017, 23, 3311–3345. [Google Scholar] [CrossRef] [Green Version]
  55. Döbler, C. Distributional transformations without orthogonality relations. J. Theor. Probab. 2017, 30, 85–116. [Google Scholar] [CrossRef] [Green Version]
  56. Arratia, R.; Goldstein, L.; Kochman, F. Size bias for one and all. Probab. Surv. 2019, 16, 1–61. [Google Scholar] [CrossRef]
  57. Weinberg, G.V. Validity of whitening-matched filter approximation to the Pareto coherent detector. IET Signal Process 2012, 6, 546–550. [Google Scholar] [CrossRef]
  58. Blanchet, J.; Glynn, P. Uniform renewal theory with applications to expansions of random geometric sums. Adv. Appl. Prob. 2007, 39, 1070–1097. [Google Scholar] [CrossRef]
  59. Kallenberg, O. Foundations of Modern Probability; Springer: New York, NY, USA, 1997. [Google Scholar]
  60. Kingman, J.F.C. On queues in heavy traffic. J. R. Stat. Soc. Ser. B Stat. Methodol. 1962, 24, 383–392. [Google Scholar] [CrossRef]
  61. Su, Z.; Wang, X. Approximation of sums of locally dependent random variables via perturbation of Stein operator. arXiv 2022, arXiv:2209.09770.v2. [Google Scholar]
  62. Korolev, V.Y.; Zeifman, A.I. Convergence of statistics constructed from samples with random sizes to the Linnik and Mittag-Leffler distributions and their generalizations. J. Korean Stat. Soc. 2017, 46, 161–181. [Google Scholar] [CrossRef]
  63. Bulinski, A.; Kozhevin, A. New version of the MDR method for stratified samples. Stat. Optim. Inf. Comput. 2017, 5, 1–18. [Google Scholar] [CrossRef] [Green Version]
  64. Giang, L.T.; Hung, T.L. An extension of random summations of independent and identically distributed random variables. Commun. Korean Math. Soc. 2018, 33, 605–618. [Google Scholar] [CrossRef]
  65. Farago, A. Decomposition of Random Sequences into Mixtures of Simpler Ones and Its Application in Network Analysis. Algorithms 2021, 14, 336. [Google Scholar] [CrossRef]
  66. Feng, M.; Deng, L.-J.; Chen, F.; Perc, M.; Kurths, J. The accumulative law and its probability model: An extension of the Pareto distribution and the log-normal distribution. Proc. R. Soc. A 2020, 476, 20200019. [Google Scholar] [CrossRef] [PubMed]
  67. Nikolsky, S.M. A Course of Mathematical Analysis, v. 1; Mir Publishers: Moscow, Russia, 1987. [Google Scholar]