Chebyshev–Edgeworth-Type Approximations for Statistics Based on Samples with Random Sizes

Christoph, Gerd; Ulyanov, Vladimir V.

doi:10.3390/math9070775


by Gerd Christoph ^{1,*,†} and Vladimir V. Ulyanov ^{2,†}

¹ Department of Mathematics, Otto-von-Guericke University Magdeburg, 39016 Magdeburg, Germany

² Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, 119991 Moscow, Russia

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Mathematics 2021, 9(7), 775; https://doi.org/10.3390/math9070775

Submission received: 3 March 2021 / Revised: 28 March 2021 / Accepted: 29 March 2021 / Published: 2 April 2021

(This article belongs to the Special Issue Analytical Methods and Convergence in Probability with Applications)


Abstract:

Second-order Chebyshev–Edgeworth expansions are derived for various statistics from samples with random sample sizes, where the asymptotic laws are scale mixtures of the standard normal or chi-square distributions with scale mixing gamma or inverse exponential distributions. A formal construction of asymptotic expansions is developed. Therefore, the results can be applied to a whole family of asymptotically normal or chi-square statistics. The random mean, the normalized Student t-distribution and the Student t-statistic under non-normality with the normal limit law are considered. With the chi-square limit distribution, Hotelling’s generalized

T_{0}^{2}

statistics and scale mixture of chi-square distributions are used. We present the first Chebyshev–Edgeworth expansions for asymptotically chi-square statistics based on samples with random sample sizes. The statistics allow non-random, random, and mixed normalization factors. Depending on the type of normalization, we can find three different limit distributions for each of the statistics considered. Limit laws are Student t-, standard normal, inverse Pareto, generalized gamma, Laplace and generalized Laplace as well as weighted sums of generalized gamma distributions. The paper continues the authors’ studies on the approximation of statistics for randomly sized samples.

Keywords:

second-order expansions; random sample size; asymptotically normal statistics; asymptotically chi-square statistics; Student’s t-distribution; normal distribution; inverse Pareto distribution; Laplace and generalized Laplace distribution; weighted sums of generalized gamma distributions

MSC:

62E17 (Primary) 62H10; 60E05 (Secondary)

1. Introduction

In classical statistical inference, the number of observations is usually known. If observations are collected in a fixed time span or we lack observations, the sample size may be a realization of a random variable. The number of failed devices in the warranty period, the number of new infections each week in a flu season, the number of daily customers in a supermarket or the number of traffic accidents per year are all random numbers.

Interest in studying samples with a random number of observations has grown steadily over the past few years. In medical research, the authors of [1,2,3] examine ANOVA models with unknown sample sizes for the analysis of fixed one-way effects in order to avoid false rejection. Applications of orthogonal mixed models to situations with samples of a random number of observations of a Poisson or binomial distributed random variable are presented. Based on a random number of observations, the authors of [4], Al-Mutairi and Raqab [5] and Barakat et al. [6] examined the mean with known and unknown variances and the variance in the normal model, confidence intervals for quantiles and prediction intervals for future observations for generalized order statistics. An overview of statistical inference for samples with random sample sizes and some applications is given in [4]; see also the references therein.

When the non-random sample size is replaced by a random variable, the asymptotic features of statistics can change radically, as shown by Gnedenko [7]. The monograph by Gnedenko and Korolev [8] deals with limit distributions for randomly indexed sequences and their applications.

General transfer theorems for asymptotic expansions of the distribution function of statistics based on samples with non-random sample sizes to their analogues for samples of random sizes are proven in [9,10]. In these papers, rates of convergence and first-order expansion are proved for asymptotically normal statistics. The results depend on the rates of convergence with which the distributions of the normalized random sample sizes approach the corresponding limit distribution.

The difficulty of obtaining second-order expansions for the normalized random sample sizes beyond the rates of convergence was overcome by Christoph et al. [11]. Second-order expansions were proved by the authors of [11,12] for the random mean and the median of samples with random sample sizes and by the authors of [13,14] for three geometric statistics of Gaussian vectors (the length of a vector, the distance between two vectors and the angle between two vectors associated with their correlation coefficient) when the dimension of the vectors is random.

The classical Chebyshev–Edgeworth expansions strongly influenced the development of asymptotic statistics. The fruitful interactions between Chebyshev–Edgeworth expansions and bootstrap methods are demonstrated in [15]. Detailed reviews of applications of Chebyshev–Edgeworth expansions in statistics were given by, e.g., Bickel [16] and Kolassa [17]. If the arithmetic mean of independent random variables is considered as the statistic, only the expected value and the dispersion are taken into account in the central limit theorem or in the Berry–Esseen inequalities. Two further important characteristics of random variables, skewness and kurtosis, have great influence on second-order expansions, provided that the corresponding moments exist. The Cornish–Fisher inversion of the Chebyshev–Edgeworth expansion allows the approximation of the quantiles of the test statistics used, for example, in many hypothesis tests. In [11], Theorems 3 and 6, and [12], Corollaries 6.2 and 6.3, Cornish–Fisher expansions for the random mean and median from samples with random sample sizes are obtained. In the same way, Cornish–Fisher expansions for the quantiles of the statistics considered in the present paper can be derived from the corresponding Chebyshev–Edgeworth expansions.

In the present paper, we continue our research on approximations if the sample sizes are random. To the best of our knowledge, Chebyshev–Edgeworth-type expansions with asymptotically chi-square statistics have not yet been proven in the literature when the sample sizes are random.

The article is structured as follows. Section 2 describes statistical models with random numbers of observations, the assumptions about statistics and random sample sizes and transfer propositions from samples with non-random to random sample sizes. Section 3 presents statistics with non-random sample sizes with Chebyshev–Edgeworth expansions based on standard normal or chi-square distributions. Corresponding expansions of the negative binomial or discrete Pareto distributions as random sample sizes are considered in Section 4. Section 5 describes the influence of non-random, random or mixed normalization factors on the limit distributions of the examined statistics that are based on samples with random sample sizes. Besides the common Student’s t, normal and Laplace distributions, inverse Pareto, generalized gamma and generalized Laplace as well as weighted sums of generalized gamma distributions also occur as limit laws. The main results for statistic families with different normalization factors and examples are given in Section 6. To prove statements about a family of statistics, formal constructions for the expansions are worked out in Section 7, which are used in Section 8 to prove the theorems. Conclusions are drawn in Section 9. We leave four auxiliary lemmas to Appendix A.

2. Statistical Models with a Random Number of Observations

Let

X_{1}, X_{2}, \dots \in R = (- \infty, \infty)

and

N_{1}, N_{2}, \dots \in N_{+} = {1, 2, \dots}

be random variables defined on a common probability space

(Ω, A, P)

. The random variables

X_{1}, \dots, X_{m}

denote the observations and form the random sample with a non-random sample size

m \in N_{+}

. Let

T_{m} : = T_{m} (X_{1}, \dots, X_{m}) with m \in N_{+}

be some statistic obtained from the sample

{X_{1}, X_{2}, \dots, X_{m}}

. Consider now the sample

X_{1}, \dots, X_{N_{n}}

. The random variable

N_{n} \in N_{+}

denotes the random size of the underlying sample, that is the random number of observations, depending on a parameter

n \in N_{+}

. We suppose for each

n \in N_{+}

that

N_{n} \in N_{+}

is independent of random variables

X_{1}, X_{2}, \dots

and

N_{n} \to \infty

in probability as

n \to \infty

.

Let

T_{N_{n}}

be a statistic obtained from a random sample

X_{1}, X_{2}, \dots, X_{N_{n}}

defined as

T_{N_{n}} (ω) : = T_{N_{n} (ω)} (X_{1} (ω), \dots, X_{N_{n} (ω)} (ω)) for all ω \in Ω and every n \in N_{+} .
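To make the definition concrete, the following minimal sketch draws one realization of the random-size statistic when the statistic is the sample mean. It assumes, purely for illustration, that the sample size is geometric on the positive integers with success probability 1/n and that the observations are standard normal; the function names are hypothetical and not part of the paper.

```python
import math
import random


def draw_geometric_size(n, rng):
    """Draw N_n on {1, 2, ...} with P(N_n >= k + 1) = (1 - 1/n)**k (requires n >= 2)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], which avoids log(0)
    return 1 + math.floor(math.log(u) / math.log(1.0 - 1.0 / n))


def random_size_mean(n, rng):
    """Return (N_n, mean of X_1, ..., X_{N_n}); N_n is drawn independently of the X_i."""
    size = draw_geometric_size(n, rng)
    observations = [rng.gauss(0.0, 1.0) for _ in range(size)]
    return size, sum(observations) / size


rng = random.Random(2021)
sizes = [draw_geometric_size(50, rng) for _ in range(20000)]
average_size = sum(sizes) / len(sizes)  # should be close to E N_n = 50
size, t_value = random_size_mean(50, rng)
```

The sample size is generated first and the observations afterwards, which mirrors the assumption that the sample size is independent of the observations.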

2.1. Assumptions on Statistics $T_{m}$ and Random Sample Sizes $N_{n}$

In further consideration, we restrict ourselves to only those terms in the expansions that are used below.

We assume that the following condition for the statistic

T_{m}

with

E T_{m} = 0

from a sample with non-random sample size

m \in N_{+}

is fulfilled:

Assumption 1.

There exist a distribution function

F (x)

and bounded functions

f_{1} (x)

,

f_{2} (x)

, all differentiable for all

x \neq 0

, and real numbers

γ \in {0, \pm 1 / 2, \pm 1}

,

a > 1

and

0 < C_{1} < \infty

so that for all integers

m \geq 1

{sup}_{x} | P (m^{γ} T_{m} \leq x) - F (x) - m^{- 1 / 2} f_{1} (x) - m^{- 1} f_{2} (x) | \leq C_{1} m^{- a} .

(1)

Remark 1.

In contrast to Bening et al. [10], the differentiability of

F (x)

,

f_{1} (x)

and

f_{2} (x)

is only required for

x \neq 0

. In the present article, in addition to the normal distribution, the chi-square distribution with p degrees of freedom is used as

F (x)

, which is not differentiable in

x = 0

if

p = 1

or

p = 2

.

The distribution functions of the normalized random variables

N_{n} \in N_{+}

satisfy the following condition:

Assumption 2.

A distribution function

H (y)

with

H (0 +) = 0

, a function of bounded variation

h_{2} (y)

, a sequence

0 < g_{n} ↑ \infty

and real numbers

b > 0

and

C_{2} > 0

exist so that for all integers

n \geq 1

\left. \begin{array}{ll} \sup_{y \geq 0} |P (g_{n}^{- 1} N_{n} \leq y) - H (y)| \leq C_{2} n^{- b}, & \text{for } 0 < b \leq 1, \\ \sup_{y \geq 0} |P (g_{n}^{- 1} N_{n} \leq y) - H (y) - n^{- 1} h_{2} (y)| \leq C_{2} n^{- b}, & \text{for } b > 1 . \end{array} \right\}

(2)

2.2. Transfer Proposition from Samples with Non-Random to Random Sample Sizes

Assumptions 1 and 2 allow the construction of expansions for distributions of normalized random-size statistics

T_{N_{n}}

based on approximate results for fixed-size normalized statistics

T_{m}

in (1) and for the random size

N_{n}

in (2).

Proposition 1.

Suppose

γ \in {0, \pm 1 / 2, \pm 1}

and the statistic

T_{m}

and the sample size

N_{n}

satisfy Assumptions 1 and 2. Then, for all

n \in N_{+}

, the following inequality applies:

{sup}_{x \in R} | P (g_{n}^{γ} T_{N_{n}} \leq x) - G_{n} (x, 1 / g_{n}) | \leq C_{1} E (N_{n}^{- a}) + (C_{3} D_{n} + C_{4}) n^{- b},

(3)

where

G_{n} (x, 1 / g_{n}) = \int_{1 / g_{n}}^{\infty} (F (x y^{γ}) + \frac{f_{1} (x y^{γ})}{\sqrt{g_{n} y}} + \frac{f_{2} (x y^{γ})}{g_{n} y}) d (H (y) + \frac{h_{2} (y)}{n}),

(4)

D_{n} = sup_{x} \int_{1 / g_{n}}^{\infty} |\frac{\partial}{\partial y} (F (x y^{γ}) + \frac{f_{1} (x y^{γ})}{\sqrt{g_{n} y}} + \frac{f_{2} (x y^{γ})}{y g_{n}})| d y,

(5)

a > 1, b > 0

and

f_{1} (z), f_{2} (z), h_{2} (y)

are given in (1) and (2). The constants

C_{1}, C_{3}, C_{4}

do not depend on n.

General transfer theorems with more terms are proved in [9,10] for

γ \geq 0

.

Remark 2.

The approximation function

G_{n} (x, 1 / g_{n})

is not a polynomial in

g_{n}^{- 1 / 2}

and

n^{- 1 / 2}

. The domain

[1 / g_{n}, \infty)

of integration in (4) depends on

g_{n}

. Some of the integrals in (4) could tend to infinity with

1 / g_{n} \to 0

as

n \to \infty

.

The following statement clarifies the problem.

Proposition 2.

In addition to the conditions of Proposition 1, let the following conditions be satisfied on the functions

H (.)

and

h_{2} (.)

, depending on the rate of convergence

b > 0

in (2):

H (1 / g_{n}) \leq c_{1} g_{n}^{- b}, \quad \text{for } b > 0,

(6)

\int_{0}^{1 / g_{n}} y^{- 1 / 2} d H (y) \leq c_{2} g_{n}^{- b + 1 / 2}, \quad \text{for } b > 1 / 2,

(7)

\left. \begin{array}{ll} i : & \int_{0}^{1 / g_{n}} y^{- 1} d H (y) \leq c_{3} g_{n}^{- b + 1}, \\ i i : & h_{2} (0) = 0 \text{ and } | h_{2} (1 / g_{n}) | \leq c_{4} n g_{n}^{- b}, \\ i i i : & \int_{0}^{1 / g_{n}} y^{- 1} | h_{2} (y) | d y \leq c_{5} n g_{n}^{- b}, \end{array} \right\} \quad \text{for } b > 1 .

(8)

Then, for the function

G_{n} (x, 1 / g_{n})

defined in (4), one has

{sup}_{x} | G_{n} (x, 1 / g_{n}) - G_{n, 2} (x) - I_{1} (x, n) - I_{2} (x, n) - I_{3} (x, n) - I_{4} (x, n) | \leq C g_{n}^{- b}

with

G_{n, 2} (x) = \begin{cases} \int_{0}^{\infty} F (x y^{γ}) d H (y), & \text{for } 0 < b \leq 1 / 2, \\ \int_{0}^{\infty} (F (x y^{γ}) + \frac{f_{1} (x y^{γ})}{\sqrt{g_{n} y}}) d H (y) = : G_{n, 1} (x), & \text{for } 1 / 2 < b \leq 1, \\ G_{n, 1} (x) + \int_{0}^{\infty} \frac{f_{2} (x y^{γ})}{g_{n} y} d H (y) + \int_{0}^{\infty} \frac{F (x y^{γ})}{n} d h_{2} (y), & \text{for } b > 1, \end{cases}

(9)

I_{1} (x, n) = \int_{1 / g_{n}}^{\infty} \frac{f_{1} (x y^{γ})}{\sqrt{g_{n} y}} d H (y) \text{ for } b \leq 1 / 2, \quad I_{2} (x, n) = \int_{1 / g_{n}}^{\infty} \frac{f_{2} (x y^{γ})}{g_{n} y} d H (y) \text{ for } b \leq 1,

(10)

I_{3} (x, n) = \int_{1 / g_{n}}^{\infty} \frac{f_{1} (x y^{γ})}{n \sqrt{g_{n} y}} d h_{2} (y) \text{ and } I_{4} (x, n) = \int_{1 / g_{n}}^{\infty} \frac{f_{2} (x y^{γ})}{n g_{n} y} d h_{2} (y) \text{ for } b > 1 .

(11)

Remark 3.

The lower limit of integration in

I_{1} (x, n)

to

I_{4} (x, n)

in (10) and (11) depends on

g_{n}

. If the sample size

N_{n} = N_{n} (r)

is negative binomial distributed with, e.g.,

0 < r < 1 / 2

or

1 < r < 2

and

g_{n} = r (n - 1) + 1

(see (28) below), then both

I_{1} (x, n)

and

I_{4} (x, n)

have order

n^{- r}

and not

n^{- 1 / 2}

or

n^{- 2}

, as it seems at first glance.

Remark 4.

The additional conditions (6)–(8) allow the integration range of the integrals in (9) to be extended from

[1 / g_{n}, \infty)

to

(0, \infty)

.

Proof of Propositions 1 and 2:

The proof of Proposition 1 follows along similar lines to that of the more general Transfer Theorem 3.1 in [10] for

γ \geq 0

. The proof was adapted by Christoph and Ulyanov [13] to the case

γ < 0

as well. Therefore, Proposition 1 applies to

γ \in {0, \pm 1 / 2, \pm 1}

.

The present Propositions 1 and 2 differ from Theorems 1 and 2 in [13] only by the additional term

f_{1} (x y^{γ}) {(g_{n} y)}^{- 1 / 2}

and the added condition (7) to estimate this additional term. Therefore, the details are omitted here. □

Remark 5.

In Appendix 2 of the monograph by Gnedenko and Korolev [8], asymptotic expansions for generalized Cox processes are proved (see Theorems A2.6.1–A2.6.3). As random sample size, the authors considered a Cox process

N (t)

controlled by a Poisson process

Λ (t)

(also known as a doubly stochastic Poisson process) and proved asymptotic expansions for the random sum

S (t) = \sum_{k = 1}^{N (t)} X_{k}

, where

X_{1}, X_{2}, \dots

are independent identically distributed random variables. For each

t \geq 0

, the random variables

N (t), X_{1}, X_{2}, \dots

are independent. The above-mentioned theorems are close to Proposition 1. The structure of the functions

G_{n} (., 1 / g_{n})

in (4) and the bounds on the right-hand side of inequality (3) in Proposition 1 differ from the corresponding terms in Theorems A2.6.1–A2.6.3; in particular, the bounds in those theorems contain little-o terms.

3. Chebyshev–Edgeworth Expansions Based on Standard Normal and Chi-Square Distributions

We consider two classes of statistics which are asymptotically normal or chi-square distributed.

3.1. Examples for Asymptotically Normally Distributed Statistics

Let

X, X_{1}, X_{2}, \dots

be independent identically distributed random variables with

\left. \begin{array}{l} E {|X|}^{5} < \infty, \quad E (X) = μ, \quad 0 < Var (X) = σ^{2}, \\ \text{skewness } λ_{3} = σ^{- 3} E {(X - μ)}^{3} \text{ and kurtosis } λ_{4} = σ^{- 4} E {(X - μ)}^{4} . \end{array} \right\}

(12)

The random variable X is assumed to satisfy Cramér’s condition

{lim sup}_{| t | \to \infty} |E e^{i t X}| < 1 .

(13)

Consider the asymptotically normal sample mean:

{\bar{X}}_{m} = (X_{1} + \dots + X_{m}) / m, \quad m = 1, 2, \dots .

(14)

It follows from Petrov [18], Theorem 5.18 with

k = 5

, that

{sup}_{x} |P (σ^{- 1} \sqrt{m} ({\bar{X}}_{m} - μ) \leq x) - Φ_{2; m} (x)| \leq C m^{- 3 / 2},

(15)

with C being independent of m and second order expansion

Φ_{2; m} (x) = Φ (x) - (\frac{λ_{3}}{6 \sqrt{m}} H_{2} (x) + \frac{1}{m} (\frac{λ_{4}}{24} H_{3} (x) + \frac{λ_{3}^{2}}{72} H_{5} (x))) φ (x),

(16)

where

Φ (x)

and

φ (x)

are standard normal distribution function and its density and

H_{k} (x)

are the Chebyshev–Hermite polynomials

H_{2} (x) = x^{2} - 1, \quad H_{3} (x) = x^{3} - 3 x \quad \text{and} \quad H_{5} (x) = x^{5} - 10 x^{3} + 15 x .
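As a numerical illustration of the second-order expansion (16), the sketch below compares it with the exact distribution of the standardized mean of m independent standard exponential variables, for which the sum has an Erlang distribution with a closed-form distribution function. Two points are assumptions of this sketch and not statements from the paper: the choice of the exponential distribution, and the reading of the kurtosis parameter in the 1/m term as the excess kurtosis (fourth cumulant divided by σ^4), which is the usual convention for Edgeworth expansions. All function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def edgeworth_mean_cdf(x, m, skew, excess_kurt):
    """Second-order expansion of the form (16); excess_kurt is an assumed reading of lambda_4."""
    h2 = x ** 2 - 1.0
    h3 = x ** 3 - 3.0 * x
    h5 = x ** 5 - 10.0 * x ** 3 + 15.0 * x
    correction = (skew * h2 / (6.0 * math.sqrt(m))
                  + (excess_kurt * h3 / 24.0 + skew ** 2 * h5 / 72.0) / m)
    return Phi(x) - correction * phi(x)


def erlang_cdf(t, m):
    """Distribution function of a sum of m independent Exp(1) variables."""
    term, total = 1.0, 0.0
    for k in range(m):
        total += term          # adds t**k / k!
        term *= t / (k + 1)
    return 1.0 - math.exp(-t) * total


m, x = 20, 0.5
# For Exp(1): mu = sigma = 1, skewness 2, excess kurtosis 6.
exact = erlang_cdf(m + x * math.sqrt(m), m)
err_normal = abs(exact - Phi(x))
err_edgeworth = abs(exact - edgeworth_mean_cdf(x, m, skew=2.0, excess_kurt=6.0))
```

In this setting the expansion error should be clearly smaller than the error of the plain normal approximation, in line with the rate in (15).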

Let the random variable

χ_{d}^{2}

be chi-square distributed with d degrees of freedom having distribution function

G_{d} (x)

and density function

g_{d} (x)

:

G_{d} (x) = P (χ_{d}^{2} \leq x) = \int_{0}^{x} g_{d} (y) d y \quad \text{and} \quad g_{d} (y) = \frac{1}{2^{d / 2} Γ (d / 2)} y^{(d - 2) / 2} e^{- y / 2}, \quad y > 0 .

(17)

Next, we examine the scale-mixed normalized statistic

T_{m} = \sqrt{m} Z / \sqrt{χ_{m}^{2}}

, where Z and

χ_{m}^{2}

are independent random variables with the standard normal distribution

Φ (x)

and the chi-square distribution

G_{m} (x)

, respectively. Then, the statistic

T_{m} = \sqrt{m} Z / \sqrt{χ_{m}^{2}}

follows the Student’s t-distribution with m degrees of freedom. Example 2.1 in [19] indicates

|P (\frac{\sqrt{m} Z}{\sqrt{χ_{m}^{2}}} \leq x) - Φ (x) + \frac{(x^{3} + x) φ (x)}{4 m}| \leq \frac{{sup}_{x} {| x^{5} + 2 x^{3} + 3 x | φ (x)}}{6 m^{2}} + \frac{6 (m + 4)}{m^{3}} \leq \frac{30.5}{m^{2}} .

(18)

Chebyshev–Edgeworth expansions of Student’s t-statistic under non-normality are well investigated, but only Hall [20] proved these under minimal moment conditions. Let conditions (12) and (13) be satisfied for independent identically distributed random variables

X_{1}, X_{2}, \dots

. Define

T_{m}^{*} = m^{1 / 2} ({\bar{X}}_{m} - μ) / {\hat{σ}}_{m}

with sample mean

{\bar{X}}_{m}

and biased sample variance

{\hat{σ}}_{m}^{2} = m^{- 1} \sum_{i = 1}^{m} {(X_{i} - {\bar{X}}_{m})}^{2}

. It follows from Hall [20] that for Student’s t-statistic

T_{m}^{*}

:

R_{m} (x) = |P (m^{1 / 2} \frac{{\bar{X}}_{m} - μ}{{\hat{σ}}_{m}} \leq x) - Φ (x) - φ (x) (\frac{P_{1} (x)}{\sqrt{m}} + \frac{P_{2} (x)}{m})| \leq C m^{- 3 / 2} (1 + u (m))

(19)

uniformly in x, where

u (m) \to 0

as

m \to \infty

,

P_{1} (x) = λ_{3} (2 x^{2} + 1) / 6 and P_{2} (x) = x \{\frac{λ_{4}}{12} (x^{2} - 3) - \frac{λ_{3}^{2}}{18} (x^{4} + 2 x^{2} - 3) - \frac{1}{4} (x^{2} + 3)\} .

(20)

Remark 6.

The estimate (19) does not satisfy (1) in Assumption 1 because we do not have a computable error bound U with

| u (m) | \leq U < \infty

for all

m \in N_{+}

, even if all parameters are given. The remainder in (19) meets the order condition

R_{m} (x) = O (m^{- 3 / 2})

as

m \to \infty

, but in the equivalent condition

\sup_{x} R_{m} (x) \leq C m^{- 3 / 2}

for all

m \geq M

the values

C > 0

and

M > 0

are unknown. About non-asymptotic bounds and order conditions, see the work of Fujikoshi and Ulyanov [19] (Section 1.1).

In [21], an inequality for a first order approximation is proved:

{sup}_{x} |P (m^{1 / 2} \frac{{\bar{X}}_{m} - μ}{{\hat{σ}}_{m}} \leq x) - Φ (x) - \frac{P_{1} (x) φ (x)}{\sqrt{m}}| \leq C m^{- 1},

(21)

where

{E | X |}^{4 + ε} < \infty

is required for arbitrary

ε > 0

and

P_{1} (x)

is defined in (20).

3.2. Examples for Asymptotically Chi-Square Distributed Statistics

The baseline distribution of the second order expansions is now the chi-square distribution

G_{d} (x)

occurring as limit distribution in different multivariate tests (see [22], Chapters 5 and 8–10, [19,23]).

At first, we consider statistic

T_{m} = T_{0}^{2} = m tr S_{q} S_{m}^{- 1}

, where

S_{q}

and

S_{m}

are random matrices independently distributed as Wishart distributions

W_{p} (q, I_{p})

and

W_{p} (m, I_{p})

, respectively, with identity operator

I_{p}

in

R_{p}

. Note that

W

has Wishart distribution

W_{p} (q, Σ)

if

q \geq p

and its density is

\frac{1}{2^{p q / 2} Γ_{p} (q / 2) {| Σ |}^{q / 2}} exp \{- \frac{1}{2} tr (Σ^{- 1} W)\} {| W |}^{(q - p - 1) / 2},

where

Γ_{p} (q / 2) = π^{p (p - 1) / 4} Π_{k = 1}^{p} Γ ((q - k + 1) / 2)

(see [23], Chapter 2, for some basic properties).

Hotelling’s generalized

T_{0}^{2}

distribution allows approximation

{sup}_{x} |P (m tr (S_{q} S_{m}^{- 1}) \leq x) - G_{d} (x) - \frac{d}{4 m} (a_{0} G_{d} (x) + a_{1} G_{d + 2} (x) + a_{2} G_{d + 4} (x))| \leq C m^{- 2}

(22)

(see [24], Theorem 4.1), where

d = p q, a_{0} = q - p - 1, a_{1} = - 2 q and a_{2} = q + p + 1 with a_{0} + a_{1} + a_{2} = 0 .

(23)

If

T_{m} = χ_{d}^{2} / χ_{m}^{2}

is a scale mixture, where

χ_{d}^{2}

and

χ_{m}^{2}

are independent,

T_{m}

allows asymptotic expansion

{sup}_{x} |P (m χ_{d}^{2} / χ_{m}^{2} \leq x) - G_{d} (x) - \frac{d}{4 m} (a_{0} G_{d} (x) + a_{1} G_{d + 2} (x) + a_{2} G_{d + 4} (x))| \leq C m^{- 2}

(24)

(see [25], Section 5), where now

a_{0} = 2 - d, a_{1} = 2 d and a_{2} = - (2 + d) with a_{0} + a_{1} + a_{2} = 0 .

(25)

Integration by parts gives

G_{k + 2} (x) = - 2 g_{k + 2} (x) + G_{k} (x)

. Moreover,

g_{k + 2} (x) = (x / k) g_{k} (x)

for

k = d

and

k = d + 2

. Then, it follows for both statistics

Z_{m} = tr (S_{q} S_{m}^{- 1})

in (22) and

Z_{m} = χ_{d}^{2} / χ_{m}^{2}

in (24) that

{sup}_{x} |P (m Z_{m} \leq x) - G_{d} (x) + \frac{g_{d} (x)}{m} (\frac{(a_{1} + a_{2}) x}{2} + \frac{a_{2} x^{2}}{2 (d + 2)})| \leq C m^{- 2}

(26)

where the coefficients

a_{1}

and

a_{2}

are defined in (23) and (25).
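The two chi-square identities used in this step, and the resulting equivalence between the three-term form in (22) and the density form in (26), can be checked numerically. The sketch below does so for even degrees of freedom, where both the density and the distribution function have elementary closed forms. The restriction to even d, the evaluation point and the coefficient values (taken from (23) with the illustrative choice p = q = 2) are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def chi2_pdf_even(x, d):
    """Density g_d(x) from (17) for even d, where Gamma(d/2) = (d/2 - 1)!."""
    half = d // 2
    return x ** (half - 1) * math.exp(-x / 2.0) / (2.0 ** half * math.factorial(half - 1))


def chi2_cdf_even(x, d):
    """G_d(x) for even d: 1 - exp(-x/2) * sum_{j < d/2} (x/2)**j / j!."""
    total = sum((x / 2.0) ** j / math.factorial(j) for j in range(d // 2))
    return 1.0 - math.exp(-x / 2.0) * total


def three_term_form(x, d, a1, a2):
    """(d/4) * (a0 G_d + a1 G_{d+2} + a2 G_{d+4}) with a0 = -(a1 + a2)."""
    a0 = -(a1 + a2)
    return d / 4.0 * (a0 * chi2_cdf_even(x, d)
                      + a1 * chi2_cdf_even(x, d + 2)
                      + a2 * chi2_cdf_even(x, d + 4))


def density_form(x, d, a1, a2):
    """-g_d(x) * ((a1 + a2) x / 2 + a2 x**2 / (2 (d + 2))), the term appearing in (26)."""
    return -chi2_pdf_even(x, d) * ((a1 + a2) * x / 2.0 + a2 * x ** 2 / (2.0 * (d + 2)))


x, d = 3.0, 4
# G_{k+2}(x) = G_k(x) - 2 g_{k+2}(x) with k = d
recursion_gap = chi2_cdf_even(x, d + 2) - (chi2_cdf_even(x, d) - 2.0 * chi2_pdf_even(x, d + 2))
# g_{k+2}(x) = (x / k) g_k(x) with k = d
ratio_gap = chi2_pdf_even(x, d + 2) - (x / d) * chi2_pdf_even(x, d)
# Coefficients from (23) with p = q = 2: a1 = -2q, a2 = q + p + 1
a1, a2 = -4.0, 5.0
form_gap = three_term_form(x, d, a1, a2) - density_form(x, d, a1, a2)
```

All three gaps should vanish up to floating-point rounding; the last one holds for any pair of coefficients as long as the three coefficients sum to zero.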

The scaled mixture

T_{m} = m χ_{4}^{2} / χ_{m}^{2}

is considered in the works by Fujikoshi et al. [23] (Example 13.2.2) and Fujikoshi and Ulyanov [19] (Example 2.2). The estimation given there leads to a computable error bound:

{sup}_{x} | P (m \frac{χ_{4}^{2}}{χ_{m}^{2}} \leq x) - G_{4} (x) + \frac{(2 x - x^{2}) g_{4} (x)}{2 m} | \leq \frac{x^{2} | x^{2} - 4 | e^{- x / 2}}{12 m^{2}} + \frac{12 (m + 4)}{m^{3}} \leq \frac{65.9}{m^{2}} .

(27)

Remark 7.

The statistics

T_{m}

in (15), (18) and (21) satisfy Assumption 1 with the normal limit distribution

Φ (x)

and in (26) and (27) with chi-square distributions

G_{d} (x)

and

G_{4} (x)

, respectively.

4. Chebyshev–Edgeworth Expansions for Distributions of Normalized Random Sample Sizes

As in the articles by, e.g., Bening et al. [9,10], Christoph et al. [11,12] and Christoph and Ulyanov [13,14], we consider as random sample sizes

N_{n}

the negative binomial random variable

N_{n} (r)

and the maximum of n independent discrete Pareto random variables

N_{n} (s)

where

r > 0

and

s > 0

are parameters.

“The negative binomial distribution is one of the two leading cases for count models, it accommodates the overdispersion typically observed in count data (which the Poisson model cannot)” [26]. Moreover,

E N_{n} (r) < \infty

and

P (N_{n} (r) / E N_{n} (r) \leq y)

tends to the gamma distribution

G_{r, r} (y)

with identical shape and rate parameters

r > 0

.

On the other hand, the mean for the discrete Pareto-like variable

N_{n} (s)

does not exist, yet

P (N_{n} (s) / n \leq y)

tends to the inverse exponential distribution

W_{s} (y) = e^{- s / y}

with scale parameter

s > 0

.

Remark 8.

The authors of [1,2,3,4,27], among others, considered the binomial or Poisson distributions as random number N of observations. If

N = N_{n}

is binomial (with parameters n and

0 < p < 1

) or Poisson (with rate

λ n, 0 < λ < \infty

) distributed, then

P (N_{n} \leq E N_{n} x)

tends to the distribution degenerate at 1 as

n \to \infty

. Therefore, Assumption 2 for the Transfer Proposition 1 is not fulfilled. On the other hand, since binomial or Poisson sample sizes are asymptotically normally distributed and if the statistic

T_{m}

is also asymptotically normally distributed, so is the statistic

T_{N_{n}}

, too (see [28]). Chebyshev–Edgeworth expansions for lattice distributed random variables exist so far only with bounds of small-o or large-

O

order (see [29]). For (2) in Assumption 2, computable error bounds

C_{2}

are required because the constant

C_{3}

in (3) depends on

C_{2}

(see also Remark 6 on large-

O

-bounds and computable error bounds).

4.1. The Random Sample Size $N_{n} = N_{n} (r)$ Has Negative Binomial Distribution with Success Probability $1 / n$

The sample size

N_{n} (r)

has a negative binomial distribution shifted by 1 with the parameters

1 / n

and

r > 0

, the probability mass function

P (N_{n} (r) = k) = \frac{Γ (k + r - 1)}{Γ (k) Γ (r)} {(\frac{1}{n})}^{r} {(1 - \frac{1}{n})}^{k - 1}, k = 1, 2, \dots

(28)

and

g_{n} = E (N_{n} (r)) = r (n - 1) + 1

. Bening and Korolev [30] and Schluter and Trede [26] showed

{lim}_{n \to \infty} {sup}_{y} |P (N_{n} (r) / g_{n} \leq y) - G_{r, r} (y)| = 0,

(29)

where

G_{r, r} (y)

is the gamma distribution function with its density

g_{r, r} (y) = \frac{r^{r}}{Γ (r)} y^{r - 1} e^{- r y} I_{(0, \infty)} (y), \quad y \in R .

(30)

In addition to the expansion of

N_{n} (r)

, a bound of the negative moment

E {(N_{n} (r))}^{- a}

in (3) is required, where

m^{- a}

is rate of convergence of the Chebyshev–Edgeworth expansion for

T_{m}

in (1).

Proposition 3.

Suppose that

r > 0

and the discrete random variables

N_{n} (r)

have probability mass function (28) with

g_{n} : = E N_{n} (r) = r (n - 1) + 1

. Then,

{sup}_{y \geq 0} |P (\frac{N_{n} (r)}{g_{n}} \leq y) - G_{r, r} (y) - \frac{h_{2; r} (y)}{n}| \leq C_{2} (r) n^{- min {r, 2}},

(31)

for all

n \in N_{+}

, where the constant

C_{2} (r) > 0

does not depend on n and

h_{2; r} (y) = \begin{cases} 0, & \text{for } r < 1, \\ {(2 r)}^{- 1} g_{r, r} (y) ((y - 1) (2 - r) + 2 Q_{1} (g_{n} y)), & \text{for } r \geq 1, \end{cases}

(32)

Q_{1} (y) = 1 / 2 - (y - [y]) \text{ and } [.] \text{ denotes the integer part of a number.}

(33)

Moreover, negative moments

E {(N_{n} (r))}^{- α}

fulfill the estimate for all

r > 0

,

α > 0

E {(N_{n} (r))}^{- α} \leq C (r) \begin{cases} n^{- \min \{r, α\}}, & r \neq α, \\ \ln (n) n^{- α}, & r = α, \end{cases}

(34)

and the convergence rate in case

r = α

cannot be improved.

Proof.

In [10] (Formula (21)) and in [31] (Formula (11)), the convergence rate is reported for the case

r < 1

. In [11] (Theorem 1), the Chebyshev–Edgeworth expansion for

r > 1

is proved. In the case

r = 1

, for geometric distributed random variable

N_{n} (1) \in N_{+}

with success probability

1 / n

the proof is straightforward:

P (N_{n} \leq n y) = 1 - P (N_{n} \geq [n y] + 1) = 1 - {(1 - \frac{1}{n})}^{n y - τ} = (1 - e^{- y} + \frac{e^{- y}}{n} (\frac{y}{2} - τ)) + r_{n} (y),

where

{sup}_{y} | r_{n} (y) | \leq C n^{- 2}

and

τ = n y - [n y] = 1 / 2 - Q_{1} (n y) \in [0, 1)

. Hence, (31) holds for

r = 1

.

In [12] (Corollary 4.2), leading terms for the negative moments of

E {(N_{n} (r))}^{- p}

are derived, which lead to (34). □
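The statement of Proposition 3 can be explored numerically. The following sketch evaluates the probability mass function (28) exactly, checks its mean against r(n - 1) + 1, and compares the distribution function of the normalized sample size with the gamma limit, with and without the correction term from (32). It assumes r = 2 (so that the gamma distribution function is elementary and the factor 2 - r in (32) vanishes), n = 200, a finite evaluation grid and a truncation of the support; these choices and all function names are illustrative assumptions, not part of the paper.

```python
import math


def shifted_negbin_pmf(k, n, r):
    """P(N_n(r) = k) from (28) for k = 1, 2, ..., computed on the log scale."""
    log_p = (math.lgamma(k + r - 1.0) - math.lgamma(k) - math.lgamma(r)
             - r * math.log(n) + (k - 1) * math.log(1.0 - 1.0 / n))
    return math.exp(log_p)


def gamma22_cdf(y):
    """G_{2,2}(y): gamma distribution function with shape 2 and rate 2, y > 0."""
    return 1.0 - math.exp(-2.0 * y) * (1.0 + 2.0 * y)


def h2_r2(y, g_n):
    """h_{2;r}(y) from (32) for r = 2, where the term with (2 - r) drops out."""
    density = 4.0 * y * math.exp(-2.0 * y)        # g_{2,2}(y) from (30)
    q1 = 0.5 - (g_n * y - math.floor(g_n * y))    # Q_1(g_n y) from (33)
    return density * 2.0 * q1 / (2.0 * 2.0)


n, r = 200, 2.0
g_n = r * (n - 1) + 1                              # E N_n(r)
k_max = int(40 * g_n)                              # truncation point; the tail beyond is negligible
pmf = [shifted_negbin_pmf(k, n, r) for k in range(1, k_max + 1)]
total = sum(pmf)
mean = sum(k * p for k, p in enumerate(pmf, start=1))

cdf = [0.0]                                        # cdf[k] = P(N_n(r) <= k)
for p in pmf:
    cdf.append(cdf[-1] + p)

grid = [j / 100.0 for j in range(1, 801)]
gaps = [cdf[math.floor(g_n * y)] - gamma22_cdf(y) for y in grid]
sup_gap = max(abs(g) for g in gaps)
sup_gap_corrected = max(abs(g - h2_r2(y, g_n) / n) for g, y in zip(gaps, grid))
```

On this grid, the maximum gap with the correction term should be visibly smaller than the gap to the gamma limit alone, which is what the bound (31) with b = 2 suggests; a grid maximum is of course only a lower bound for the supremum.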

Remark 9.

The negative binomial random variables

N_{n} (r)

satisfy (2) in Assumption 2 and the additional conditions (6), (7) and (8) in Proposition 2 with

H (y) = G_{r, r} (y)

,

h_{2} (y) = h_{2; r} (y)

,

g_{n} = E N_{n} (r) = r (n - 1) + 1

and

b = \min \{r, 2\}

. The jumps of the distribution function

P (N_{n} (r) \leq g_{n} y)

only affect the function

Q_{1} (.)

in the term

h_{2; r} (.)

.

4.2. The Random Sample Size $N_{n} = N_{n} (s)$ Is the Maximum of n Independent Discrete Pareto Variables

We consider the continuous Pareto Type II (Lomax) distribution function

F_{Y^{*}} (x) = 1 - {(1 + (x - 1) / s)}^{- 1} \quad \text{for } x \geq 1 .

The discrete Pareto II distribution

F_{Y (s)}

is obtained by discretizing the continuous Pareto distribution

F_{Y^{*}} (x)

:

P (Y (s) = k) = F_{Y^{*}} (k + 1) - F_{Y^{*}} (k), k \in N_{+}

. The random variable

Y (s)

is the discrete counterpart on the positive integers to the continuous random variable

Y^{*}

. Both random variables

Y^{*}

and

Y (s)

have shape parameter 1 and scale parameter

s > 0

(see [32]). The discrete Pareto distributed

Y (s)

has probability mass and distribution functions:

P (Y (s) = k) = \frac{s}{s + k - 1} - \frac{s}{s + k} \quad \text{and} \quad P (Y (s) \leq k) = \frac{k}{s + k}, \quad \text{for } k \in N_{+} .

(35)

Let

Y_{1} (s), Y_{2} (s), \dots

be a sequence of independent random variables with the common distribution function (35). Define

N_{n} (s) = \max_{1 \leq j \leq n} Y_{j} (s) \quad \text{with} \quad P (N_{n} (s) \leq k) = {(\frac{k}{s + k})}^{n}, \quad n \in N_{+}, \ k \in N_{+}, \ s > 0 .

(36)

The random variable

N_{n} (s)

is extremely spread over the positive integers.

Proposition 4.

Consider the discrete random variable

N_{n} (s)

with distribution function (36). Then,

\sup_{y > 0} |P (\frac{N_{n} (s)}{n} \leq y) - W_{s} (y) - \frac{h_{2; s} (y)}{n}| \leq \frac{C_{3} (s)}{n^{2}} \quad \text{for all } n \in N_{+} \text{ and fixed } s > 0,

(37)

W_{s} (y) = e^{- s / y} \quad \text{and} \quad h_{2; s} (y) = s e^{- s / y} (s - 1 + 2 Q_{1} (n y)) / (2 y^{2}), \quad y > 0,

(38)

where

C_{3} (s) > 0

does not depend on n and

Q_{1} (y)

is defined in (33). Moreover,

E {(N_{n} (s))}^{- p} \leq C (p) n^{- min {p, 2}},

(39)

where for

0 < p \leq 2

the order of the bound is optimal.

The Chebyshev–Edgeworth expansion (37) is proved in [11] (Theorem 4). In [12] (Corollary 5.2), leading terms for the negative moments

E {(N_{n} (s))}^{- p}

are derived that lead to (39).
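Since the distribution function (36) is explicit, the expansion (37) with the terms in (38) can be evaluated directly. The sketch below does this on a finite grid for s = 1 and n = 100; the grid, the parameter values and the function names are illustrative assumptions. It reports the maximum gap to the inverse exponential limit alone and the maximum gap after adding the correction term divided by n.

```python
import math


def q1(y):
    """Q_1(y) = 1/2 - (y - [y]) from (33)."""
    return 0.5 - (y - math.floor(y))


def pareto_max_cdf(y, n, s):
    """P(N_n(s) <= n*y) = (k / (s + k))**n with k = [n*y], from (36)."""
    k = math.floor(n * y)
    return (k / (s + k)) ** n if k >= 1 else 0.0


def w(y, s):
    """Inverse exponential distribution function W_s(y) = exp(-s/y), y > 0."""
    return math.exp(-s / y)


def h2(y, n, s):
    """Correction term h_{2;s}(y) from (38)."""
    return s * math.exp(-s / y) * (s - 1.0 + 2.0 * q1(n * y)) / (2.0 * y * y)


n, s = 100, 1.0
grid = [j / 100.0 for j in range(1, 2001)]
first_order_gap = max(abs(pareto_max_cdf(y, n, s) - w(y, s)) for y in grid)
second_order_gap = max(abs(pareto_max_cdf(y, n, s) - w(y, s) - h2(y, n, s) / n)
                       for y in grid)
```

With these values the first gap is of order 1/n and the second is much smaller, consistent with the n^{-2} rate in (37). A maximum over a grid only bounds the supremum from below, so this is a plausibility check and not a verification of the constant.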

Remark 10.

Let the random variable

V (s)

be exponentially distributed with rate parameter

s > 0

. Then,

W (s) = 1 / V (s)

is an inverse exponentially distributed random variable with the continuous distribution function

W_{s} (y) = e^{- s / y} I_{(0, \infty)} (y)

. Both

W_{s} (y)

and

P (N_{n} (s) \leq y)

are heavy tailed with shape parameter 1.

Remark 11.

Since

E (W (s)) = \infty

and

E (N_{n} (s)) = \infty

for all

n \in N_{+}

, we choose

g_{n} = n

as normalizing factor for

N_{n} (s)

in (37).

Remark 12.

The random sample sizes

N_{n} (s)

satisfy (2) in Assumption 2 and the additional conditions (6)–(8) in Proposition 2 with

W_{s} (y) = e^{- s / y}

,

h_{2} (y) = h_{2; s} (y)

,

g_{n} = n

and

b = 2

. The jumps of the distribution function

P (N_{n} (s) \leq n y)

only affect the function

Q_{1} (.)

in the term

h_{2; s} (.)

.

Remark 13.

Lyamin [33] proved a bound

| P (N_{n} (s) \leq n y) - W_{s} (y) | \leq 0.37 n^{- 1}

for integers

s \geq 1

.

5. Limit Distributions of Statistics with Random Sample Sizes Using Different Scaling Factors

The statistic

T_{m}

from a sample with non-random sample size

m \in N_{+}

fulfills condition (1) in Assumption 1. Instead of the non-random sample size m, we consider a random sample size

N_{n} \in N_{+}

satisfying condition (2) in Assumption 2. Let

g_{n}

be a sequence with

g_{n} ↑ \infty

as

n \to \infty

. Consider the scaling factor

g_{n}^{γ} N_{n}^{γ^{*} - γ}

by the statistics

T_{N_{n}}

with

γ \in {0, \pm 1 / 2}

if

F (x) = Φ (x)

and

γ^{*} = 1 / 2

or

γ \in {0, \pm 1}

if

F (x) = G_{u} (x)

and

γ^{*} = 1

. Then, conditioning on

N_{n}

and using (1) and (2), we have

\begin{matrix} P (g_{n}^{γ} N_{n}^{γ^{*} - γ} T_{N_{n}} \leq x) & = & P (N_{n}^{γ^{*}} T_{N_{n}} \leq x {(N_{n} / g_{n})}^{γ}) = \sum_{m = 1}^{\infty} P (m^{γ^{*}} T_{m} \leq x {(m / g_{n})}^{γ}) P (N_{n} = m) \\ & \overset{(1)}{\approx} & E (F (x {(N_{n} / g_{n})}^{γ})) = \int_{1 / g_{n}}^{\infty} F (x y^{γ}) d P (N_{n} / g_{n} \leq y) \overset{(2)}{\approx} \int_{1 / g_{n}}^{\infty} F (x y^{γ}) d H (y) . \end{matrix}

(40)

If there exists a limit distribution of

P (g_{n}^{γ} N_{n}^{γ^{*} - γ} T_{N_{n}} \leq x)

as

n \to \infty

, then it has to be a scale mixture of parent distribution

F (x)

and positive mixing parameter

H (y)

:

\int_{0}^{\infty} F (x y^{γ}) d H (y)

(see, e.g., [23,34], Chapter 13, and [19] and the references therein).

Remark 14.

Formula (40) shows that different normalization factors at

T_{N_{n}}

lead to different scale mixtures of the limit distribution of the normalized statistics

T_{N_{n}}

.

5.1. The Case $F (x) = Φ (x)$ and $H (y) = G_{r, r} (y)$

The statistics (15), (18) and (21) considered in Section 3.1 have normal approximations

Φ (x)

. The limit distribution for the normalized random sample size

N_{n} (r) / E N_{n} (r)

is the gamma distribution

G_{r, r} (y)

with density (30). We investigate the dependence of the limit distributions in

P (g_{n}^{γ} N_{n} {(r)}^{1 / 2 - γ} T_{N_{n} (r)} \leq x) \to \int_{0}^{\infty} Φ (x y^{γ}) d G_{r, r} (y)

as

n \to \infty

for

γ \in {1 / 2, 0, - 1 / 2}

.

(i) If

γ = 1 / 2

, then the limit distribution is Student’s t-distribution

S_{2 r} (x)

having density

s_{2 r} (x) = \frac{Γ (r + 1 / 2)}{\sqrt{2 r π} Γ (r)} {(1 + \frac{x^{2}}{2 r})}^{- (r + 1 / 2)}, r > 0, x \in R .

(41)

(ii) If

γ = 0

, the standard normal law

Φ (x)

is the limit one with density

φ (x)

.

(iii) For

γ = - 1 / 2

, the generalized Laplace distributions

L_{r} (x)

occur with density (see [13], Section 5.1.3):

l_{r} (x) = \frac{r^{r}}{Γ (r)} \int_{0}^{\infty} φ (x y^{- 1 / 2}) y^{r - 3 / 2} e^{- r y} d y = \frac{2 r^{r}}{Γ (r) \sqrt{2 π}} {(\frac{| x |}{\sqrt{2 r}})}^{r - 1 / 2} K_{r - 1 / 2} (\sqrt{2 r} | x |) .

(42)

where

K_{α} (u)

is the Macdonald function of order α, also known as the modified Bessel function of the third kind with index

α

. The function

K_{α} (u)

is also sometimes called a modified Bessel function of the second kind of order

α

. For properties of these functions, see, e.g., Chapter 51 in [35] or the Appendix on Bessel functions in [36].

If

r \in N_{+}

, the so-called Sargan densities

l_{1} (x), l_{2} (x), \dots

and their distribution functions are computable in closed forms (see Formulas (63)–(65) below in Section 7):

\begin{matrix} l_{1} (x) = \frac{1}{\sqrt{2}} e^{- \sqrt{2} | x |} & and & L_{1} (x) = 1 - \frac{1}{2} e^{- \sqrt{2} | x |}, x > 0 \\ l_{2} (x) = (\frac{1}{2} + | x |) e^{- 2 | x |} & and & L_{2} (x) = 1 - \frac{1}{2} (1 + x) e^{- 2 | x |}, x > 0 \\ l_{3} (x) = \frac{3 \sqrt{6}}{16} (1 + \sqrt{6} | x | + 2 x^{2}) e^{- \sqrt{6} | x |} & and & L_{3} (x) = 1 - (\frac{1}{2} + \frac{5 \sqrt{6} x}{16} + \frac{3 x^{2}}{8}) e^{- \sqrt{6} | x |}, x > 0, \end{matrix}

(43)

where

L_{r} (- x) = 1 - L_{r} (x)

for

x \geq 0

.

The double exponential or standard Laplace density is

l_{1} (x)

with variance 1 and distribution function

L_{1} (x)

given in (43). The Sargan distributions are therefore a generalisation of the standard Laplace distribution.
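The three limit laws of this subsection can be reproduced numerically from the scale mixture integral. The sketch below is an illustration only and rests on stated assumptions: the gamma density is taken as r^r y^{r-1} e^{-ry} / Γ(r), as it appears inside (42); the integral is evaluated by a simple midpoint rule after the substitution y = u^2; and the closed form of Student's t-distribution with two degrees of freedom, 1/2 + x / (2 sqrt(2 + x^2)), is a standard fact used here for comparison. All function names are hypothetical.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gamma_mixture_cdf(x, gamma, r, n=20000, u_max=8.0):
    """Midpoint-rule value of int_0^inf Phi(x y^gamma) dG_{r,r}(y), using y = u^2."""
    h = u_max / n
    c = r ** r / math.gamma(r)
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        y = u * u
        density = c * y ** (r - 1.0) * math.exp(-r * y)   # density of G_{r,r}
        total += Phi(x * y ** gamma) * density * 2.0 * u * h
    return total

def student_t2_cdf(x):
    """Student's t-distribution function with 2 degrees of freedom (case r = 1)."""
    return 0.5 + x / (2.0 * math.sqrt(2.0 + x * x))

def laplace_L1(x):
    """L_1(x) from (43) for x >= 0."""
    return 1.0 - 0.5 * math.exp(-math.sqrt(2.0) * x)

def sargan_L2(x):
    """L_2(x) from (43) for x >= 0."""
    return 1.0 - 0.5 * (1.0 + x) * math.exp(-2.0 * x)

x = 0.7
print(gamma_mixture_cdf(x, 0.5, 1.0), student_t2_cdf(x))   # gamma = 1/2: Student
print(gamma_mixture_cdf(x, 0.0, 2.0), Phi(x))              # gamma = 0: standard normal
print(gamma_mixture_cdf(x, -0.5, 1.0), laplace_L1(x))      # gamma = -1/2, r = 1: Laplace
print(gamma_mixture_cdf(x, -0.5, 2.0), sargan_L2(x))       # gamma = -1/2, r = 2: Sargan
```

Each printed pair should agree to several decimal places, which illustrates how the same mixing law G_{r,r} produces three different limits depending on γ.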

5.2. The Case $F (x) = G_{d} (x)$ and $H (y) = W_{s} (y) = e^{- s / y}$

The statistics considered in Section 3.2 asymptotically approach chi-square distribution

G_{d} (x)

. The limit distribution for the normalized random sample size

N_{n} (s) / n

is the inverse exponential distribution

W_{s} (y) = e^{- s / y} I_{(0, \infty)} (y)

.

(i) If

γ = 1

, then the generalized gamma distribution

W_{d} (x; 2 s)

occurs with density

w_{d} (x; 2 s)

:

w_{d} (x; 2 s) = \frac{s}{Γ (d / 2)} {(\frac{s x}{2})}^{d / 4 - 1 / 2} K_{d / 2 - 1} (\sqrt{2 s x}) I_{(0, \infty)} (x)

(44)

where the Macdonald function

K_{α} (u)

already appears in Formula (42) with different

α

and argument. For

α = m + 1 / 2

, where m is an integer, the Macdonald function

K_{m + 1 / 2} (u)

has a closed form (see Formulas (63)–(65) below in Section 7). Therefore, if

d = 1, 3, 5 \dots

is an odd number, then the density

w_{d} (x; 2 s)

may be calculated in closed form. The distribution functions

W_{d} (x; 2 s)

with density functions

w_{d} (x; 2 s)

for

d = 1, 3, 5

and

x > 0

are

\begin{matrix} w_{1} (x; 2 s) & = & s {(2 s x)}^{- 1 / 2} e^{- \sqrt{2 s x}} and W_{1} (x; 2 s) = 1 - e^{- \sqrt{2 s x}}, \end{matrix}

(45)

\begin{matrix} w_{3} (x; 2 s) & = & s e^{- \sqrt{2 s x}} and W_{3} (x; 2 s) = 1 - e^{- \sqrt{2 s x}} (\sqrt{2 s x} + 1), \end{matrix}

(46)

\begin{matrix} w_{5} (x; 2 s) & = & \frac{s}{3} (1 + \sqrt{2 s x}) e^{- \sqrt{2 s x}} and W_{5} (x; 2 s) = 1 - e^{- \sqrt{2 s x}} (1 + \sqrt{2 s x} + \frac{2 s x}{3}) . \end{matrix}

(47)

Remark 15.

Functions in (45) are Weibull density and distribution functions, in (46) there are density and distribution functions of a generalized gamma distribution, but

w_{5} (x)

and

W_{5} (x)

are even more general.

The family of generalized gamma distributions contains many absolutely continuous distributions concentrated on the non-negative half-line.

Remark 16.

The generalized gamma distribution

G^{*} (x; r, α, λ)

corresponds to the density

g^{*} (x; r, α, λ) = \frac{| α | λ^{r}}{Γ (r)} x^{α r - 1} e^{- λ x^{α}}, x \geq 0, | α | > 0, r > 0, λ > 0,

(48)

where α and r are the two shape parameters and λ the scale parameter. The density representation (48) is suggested in the work of Korolev and Zeifman [37] or Korolev and Gorshenin [38], and many special cases are listed therein. In addition to, e.g., Gamma and Weibull distributions (with α > 0), inverse Gamma, Lévy and Fréchet distributions (with α < 0) also belong to that family of generalized gamma distributions.

Remark 17.

The Weibull density in (45) is

w_{1} (x; 2 s) = g^{*} (x; 1, 1 / 2, \sqrt{2 s})

. Moreover,

w_{3} (x; 2 s)

= g^{*} (x; 2, 1 / 2, \sqrt{2 s})

. The densities

w_{5} (x)

,

w_{7} (x), w_{9} (x), \dots

are weighted sums of generalized gamma distribution with different shape parameters r, e.g.,

\begin{matrix} w_{5} (x; 2 s) & = & \frac{1}{3} g^{*} (x; 2, \frac{1}{2}, \sqrt{2 s}) + \frac{2}{3} g^{*} (x; 3, \frac{1}{2}, \sqrt{2 s}) \\ w_{9} (x; 2 s) & = & \frac{5}{35} g^{*} (x; 2, \frac{1}{2}, \sqrt{2 s}) + \frac{10}{35} g^{*} (x; 3, \frac{1}{2}, \sqrt{2 s}) + \frac{12}{35} g^{*} (x; 4, \frac{1}{2}, \sqrt{2 s}) + \frac{8}{35} g^{*} (x; 5, \frac{1}{2}, \sqrt{2 s}) . \end{matrix}
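The representations in Remark 17 can be checked against the defining scale mixture. The sketch below is illustrative and makes these assumptions explicit: the density of the limit law for γ = 1 is taken as s ∫ y^{-1} g_d(xy) e^{-s/y} dy with the chi-square density g_d, which is the x-derivative of ∫ G_d(xy) dW_s(y); the integral is evaluated by the trapezoidal rule after the substitution y = e^t. The function names and the test point are hypothetical choices.

```python
import math

def generalized_gamma_density(x, r, alpha, lam):
    """Density g*(x; r, alpha, lambda) from (48)."""
    return abs(alpha) * lam ** r / math.gamma(r) * x ** (alpha * r - 1.0) * math.exp(-lam * x ** alpha)

def mixture_density(x, d, s, n=4000, t_max=10.0):
    """Numerical value of s * int_0^inf y^{-1} g_d(x y) e^{-s/y} dy, using y = e^t."""
    h = 2.0 * t_max / n
    total = 0.0
    for i in range(n + 1):
        y = math.exp(-t_max + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        # integrand y^{d/2 - 2} e^{-x y / 2 - s / y}, times y from dy = y dt
        total += weight * y ** (d / 2.0 - 1.0) * math.exp(-x * y / 2.0 - s / y)
    return s * x ** (d / 2.0 - 1.0) / (2.0 ** (d / 2.0) * math.gamma(d / 2.0)) * total * h

def weighted_sum(x, s, weights):
    """Weighted sum of g*(x; r, 1/2, sqrt(2 s)); weights maps r to its weight."""
    lam = math.sqrt(2.0 * s)
    return sum(w * generalized_gamma_density(x, r, 0.5, lam) for r, w in weights.items())

s, x = 1.5, 0.8   # illustrative values
cases = {
    1: {1: 1.0},
    3: {2: 1.0},
    5: {2: 1 / 3, 3: 2 / 3},
    9: {2: 5 / 35, 3: 10 / 35, 4: 12 / 35, 5: 8 / 35},
}
for d, weights in cases.items():
    print(d, mixture_density(x, d, s), weighted_sum(x, s, weights))
```

For each odd d listed, the two printed numbers should coincide up to numerical error, which also confirms that the weights in each sum add up to one.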

(ii) If

γ = 0

, the chi-square distribution

G_{d} (x)

is the limit distribution with density

g_{d} (x)

.

(iii) If

γ = - 1,

the limit distribution is the inverse Pareto distribution

V_{d / 2} (x; 2 s)

with shape parameter

d / 2

, scale parameter

2 s

and density

v_{d / 2} (x; 2 s)

:

V_{d / 2} (x; 2 s) = {(\frac{x}{2 s + x})}^{d / 2} and v_{d / 2} (x; 2 s) = \frac{s d x^{d / 2 - 1}}{{(x + 2 s)}^{d / 2 + 1}} for x \geq 0

(49)

In [39], a robust and efficient estimator for the shape parameter of the inverse Pareto distribution and applications are given.
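The inverse Pareto limit in (49) can be verified numerically for even d, where the chi-square distribution function has an elementary form. This is a hedged sketch: the closed forms G_2(x) = 1 - e^{-x/2} and G_4(x) = 1 - e^{-x/2}(1 + x/2) are standard facts, the mixture ∫ G_d(x/y) dW_s(y) is evaluated with the trapezoidal rule after the substitution y = e^t, and all function names and parameter values are assumptions made for illustration.

```python
import math

def chi2_cdf(x, d):
    """Chi-square distribution function G_d(x), closed forms for d = 2 and d = 4 only."""
    if x <= 0:
        return 0.0
    if d == 2:
        return -math.expm1(-x / 2.0)
    if d == 4:
        return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)
    raise ValueError("only d = 2 and d = 4 are implemented in this sketch")

def mixture_cdf(x, d, s, n=8000, t_min=-8.0, t_max=40.0):
    """Trapezoidal value of int_0^inf G_d(x / y) s y^{-2} e^{-s/y} dy, using y = e^t."""
    h = (t_max - t_min) / n
    total = 0.0
    for i in range(n + 1):
        y = math.exp(t_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        # s y^{-2} e^{-s/y} dy becomes s y^{-1} e^{-s/y} dt
        total += weight * chi2_cdf(x / y, d) * s / y * math.exp(-s / y)
    return total * h

def inverse_pareto_cdf(x, d, s):
    """V_{d/2}(x; 2s) from (49)."""
    return (x / (2.0 * s + x)) ** (d / 2.0)

s, x = 0.8, 1.7   # illustrative values
for d in (2, 4):
    print(d, mixture_cdf(x, d, s), inverse_pareto_cdf(x, d, s))
```

The same comparison could be extended to other d given an implementation of the chi-square distribution function; that is left out here to keep the sketch free of external libraries.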

6. Main Results

We examine asymptotic approximations of

P (g_{n}^{γ} N_{n}^{γ^{*} - γ} T_{N_{n}} \leq x)

depending on the scaling factor

g_{n}^{γ} N_{n}^{γ^{*} - γ}

for

γ \in {0, \pm 1 / 2, \pm 1}

, for

γ^{*} = 1 / 2

if the statistic

T_{m}

is asymptotically normal or

γ^{*} = 1

if

T_{m}

is asymptotically chi-square distributed.

6.1. Asymptotically Normal Distributed Statistics with Negative Binomial Distributed Sample Sizes

Consider first the statistics estimated in (15), (18) and (21) with the normal limiting distribution

Φ (x)

. They have the form

|P (\sqrt{m} Z_{m} \leq x) - Φ (x) - (m^{- 1 / 2} (p_{0} + p_{2} x^{2}) + m^{- 1} (p_{1} x + p_{3} x^{3} + p_{5} x^{5}) I_{a > 1} (a)) φ (x)| \leq m^{- a} .

(50)

The sample size is negative binomial

N_{n} = N_{n} (r)

with probability mass function (28).

Theorem 1.

Let

r > 0

. If inequality (50) for the statistic

Z_{m}

and inequality (31) for random sample size

N_{n} (r)

with

g_{n} = E N_{n} (r) = r (n - 1) + 1

hold, then, for all

n \in N_{+}

, the following expansions apply:

i:: The non-random scaling factor $\sqrt{g_{n}}$ by statistic $Z_{N_{n} (r)}$ leads to Student’s t-approximation.

${sup}_{x} |P (\sqrt{g_{n}} Z_{N_{n} (r)} \leq x) - S_{2 r; n} (x)| \leq C_{r} \{\begin{matrix} n^{- r} ln (n), & r \in {1 / 2, 3 / 2, 2}, \\ n^{- min {r, 2}}, & r \notin {1 / 2, 3 / 2, 2} \end{matrix}$

where

$\begin{matrix} S_{2 r; n} (x) & = & S_{2 r} (x) + \frac{s_{2 r} (x)}{\sqrt{g_{n}}} (p_{0} \frac{x^{2} + 2 r}{2 r - 1} + p_{2} x^{2}) I_{{r > 1 / 2}} (r) \\ + \frac{s_{2 r} (x)}{g_{n}} (p_{1} \frac{x (x^{2} + 2 r)}{(2 r - 1)} + p_{3} x^{3} + p_{5} \frac{x^{5} (2 r + 1)}{x^{2} + 2 r} + \frac{(2 - r) x (x^{2} + 1)}{4 (2 r - 1)}) I_{{r > 1}} (r), \end{matrix}$

(51)

$S_{2 r} (x)$ is Student’s t-distribution having density $s_{2 r} (x)$ , defined in (41), and $p_{k}$ are the coefficients in (50).
ii:: The standard normal approximation occurs at random scaling factor $\sqrt{N_{n} (r)}$ by statistic $Z_{N_{n} (r)}$ :

${sup}_{x} |P (\sqrt{N_{n} (r)} Z_{N_{n} (r)} \leq x) - Φ_{n 2} (x)| \leq C_{r} \{\begin{matrix} n^{- min {r, 2}}, & r \neq 2, \\ ln (n) n^{- 2}, & r = 2, \end{matrix}$

(52)

where

$\begin{matrix} Φ_{n 2} (x) & = & Φ (x) + \frac{\sqrt{r} Γ (r - 1 / 2)}{Γ (r) \sqrt{g_{n}}} (p_{0} + p_{2} x^{2}) φ (x) I_{{r > 1 / 2}} (r) \\ + \frac{(p_{1} x + p_{3} x^{3} + p_{5} x^{5}) φ (x)}{g_{n}} (ln n I_{{r = 1}} (r) + \frac{r}{r - 1} I_{{r > 1}} (r)), \end{matrix}$

(53)
iii:: If $r = 2$ , the mixed scaling factor $g_{n}^{- 1 / 2} N_{n} (2)$ by statistic $Z_{N_{n} (2)}$ leads to generalized Laplace approximation:

${sup}_{x} |P (g_{n}^{- 1 / 2} N_{n} (2) Z_{N_{n} (2)} \leq x) - L_{2} (x) - l_{n; 2} (x)| \leq C_{2} ln (n) n^{- 2}$

(54)

where $L_{2} (x) = 1 - \frac{1}{2} (1 + x) e^{- 2 | x |}$ , $L_{2} (- x) = 1 - L_{2} (x)$ for $x \geq 0$ and

$l_{n; 2} (x) = \frac{e^{- 2 | x |}}{\sqrt{g_{n}}} (p_{0} (| x | + 1 / 2) + 2 p_{2} x^{2}) + \frac{e^{- 2 | x |}}{g_{n}} (2 p_{1} x + 4 p_{3} | x | x + 4 p_{5} (2 x^{3} + | x | x)) .$

(55)
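To make the structure of the approximating function in part ii concrete, the following sketch evaluates Φ_{n2}(x) from (53) for given coefficients p_k. It is an illustration under assumptions: the function and variable names are hypothetical, the coefficients are passed as a dictionary, and the example uses the sample-mean coefficients listed in Corollary 1 with λ_3 = 2 and λ_4 = 6 (the skewness and excess kurtosis of an exponential law) purely as sample input.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def g_n(n, r):
    """Normalizing sequence g_n = E N_n(r) = r (n - 1) + 1 from Theorem 1."""
    return r * (n - 1) + 1

def Phi_n2(x, n, r, p):
    """Second-order approximation (53); p maps k to the coefficient p_k in (50)."""
    g = g_n(n, r)
    poly1 = p.get(0, 0.0) + p.get(2, 0.0) * x ** 2
    poly2 = p.get(1, 0.0) * x + p.get(3, 0.0) * x ** 3 + p.get(5, 0.0) * x ** 5
    value = Phi(x)
    if r > 0.5:      # indicator I_{r > 1/2}
        value += math.sqrt(r) * math.gamma(r - 0.5) / (math.gamma(r) * math.sqrt(g)) * poly1 * phi(x)
    if r == 1:       # indicator I_{r = 1}
        value += poly2 * phi(x) / g * math.log(n)
    elif r > 1:      # indicator I_{r > 1}
        value += poly2 * phi(x) / g * r / (r - 1.0)
    return value

lam3, lam4 = 2.0, 6.0   # illustrative cumulant ratios, not taken from the paper
p_mean = {
    0: lam3 / 6.0, 2: -lam3 / 6.0,
    1: lam4 / 8.0 - 5.0 * lam3 ** 2 / 24.0,
    3: -lam4 / 24.0 + 5.0 * lam3 ** 2 / 36.0,
    5: -lam3 ** 2 / 72.0,
}
for n in (10, 100, 1000):
    print(n, Phi_n2(0.5, n, 2.0, p_mean), Phi(0.5))
```

As n grows, the printed approximation approaches Φ(0.5), with the first correction decaying like g_n^{-1/2}.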

Remark 18.

Analogous to (54) and (55), expansions for all

r > 0

can be derived from Formulas (42) and (63)–(66) below in Section 7, whereby closed forms can be presented only for

r \in {1, 2, 3, \dots}

.

The statistics from Section 3.1 are considered with different normalization factors as applications of Theorem 1:

Corollary 1.

Let the conditions of Theorem 1 be satisfied:

i:: In the case of the Student’s t-statistic $Z / \sqrt{χ_{m}^{2}}$ estimated in (18), one has (51) with $p_{0} = p_{2} = p_{5} = 0$ and $p_{1} = p_{3} = - 1 / 4$ using non-random scaling factor $\sqrt{g_{n}}$ :

${sup}_{x} |P (\frac{\sqrt{g_{n}} Z}{\sqrt{χ_{N_{n} (r)}^{2}}} \leq x) - S_{2 r} (x; n)| \leq C_{r} \{\begin{matrix} n^{- min {r, 2}}, & r \neq 2, \\ ln (n) n^{- 2}, & r = 2, \end{matrix}$

where

$S_{2 r} (x; n) = S_{2 r} (x) - s_{2 r} (x) \frac{2 r (x + x^{3}) - (2 - r) x (x^{2} + 1)}{4 (2 r - 1) g_{n}} I_{{r > 1}} (r)$
ii:: In the case of Student’s one-sample t-test statistic under non-normality $T_{m} = ({\bar{X}}_{m} - μ) / {\hat{σ}}_{m}$ estimated in (21) with $a = 1$ , the first-order approximation defined in (52) for $0 < r \leq 1$ and (53) with $p_{0} = λ_{3} / 6$ and $p_{2} = λ_{3} / 3$ using random scaling factor $\sqrt{N_{n} (r)}$ leads uniformly in x to:

$|P (\sqrt{N_{n} (r)} T_{N_{n} (r)} \leq x) - Φ (x) - \frac{\sqrt{r} Γ (r - 1 / 2) λ_{3} (2 x^{2} + 1)}{6 Γ (r) \sqrt{g_{n}}} φ (x)| \leq C_{r} \{\begin{matrix} n^{- min {r, 1}}, & r \neq 1, \\ ln (n) n^{- 1}, & r = 1 . \end{matrix}$
iii:: Considering sample mean ${\bar{X}}_{m}$ estimated in (15), one has (55) with $p_{0} = - p_{2} = λ_{3} / 6$ , $p_{1} = λ_{4} / 8 - 5 λ_{3}^{2} / 24$ , $p_{3} = - λ_{4} / 24 + 5 λ_{3}^{2} / 36$ , and $p_{5} = - λ_{3}^{2} / 72$ using mixed scaling factor $g_{n}^{- 1 / 2} N_{n} (2)$ :

${sup}_{x} |P (g_{n}^{- 1 / 2} N_{n} (2) {\bar{X}}_{N_{n} (2)} \leq x) - L_{2} (x) - l_{n; 2} (x)| \leq C_{2} ln (n) n^{- 2},$

where the generalized Laplace distribution $L_{2} (x)$ is defined in (43) and

$l_{n; 2} (x) = \frac{λ_{3} e^{- 2 | x |}}{6 \sqrt{g_{n}}} (| x | + 1 / 2 - 2 x^{2}) + \frac{e^{- 2 | x |}}{36 g_{n}} (λ_{4} (9 x - 6 x | x |) - λ_{3}^{2} (4 x^{3} - 18 x | x | + 15 x)) .$

Remark 19.

The approximating functions in the expansions for

P (g_{n}^{γ} N_{n} {(r)}^{1 / 2 - γ} T_{N_{n} (r)} \leq x)

with the statistics estimated in (15), (18) and (21) can only be given in closed form for all

r > 0

in the case of non-random (

γ = 1 / 2

) or random (

γ = 0

) normalization factors. In the case of the mixed (

γ = - 1 / 2

) normalization factor, closed forms are available only for positive integers r, while in the other cases Macdonald functions are involved.

6.2. Asymptotically Chi-Square Distributed Statistics with Pareto-Like Distributed Sample Sizes

Consider now the statistics, estimated in (26) and (27) with limit chi-square distributions. They have the form

|P (m T_{m} \leq x) - G_{d} (x) - m^{- 1} (q_{1} x + q_{2} x^{2}) g_{d} (x)| \leq C (s) m^{- 2} .

(56)

The sample size is the Pareto-like random variable

N_{n} = N_{n} (s)

with probability mass function (35).

Theorem 2.

Let

s > 0

and (36) be the distribution function of the random sample size

N_{n} = N_{n} (s)

. If for the statistic

T_{m}

the inequality (56) with limiting chi-square distribution

G_{d} (x)

and the inequality (37) with

g_{n} = n

for the random sample size

N_{n} (s)

hold, then for all

n \in N_{+}

one has the following approximation:

i:: The non-random scaling factor n by $T_{N_{n} (s)}$ leads to the limiting generalized gamma distributions.

${sup}_{x > 0} |P (n T_{N_{n} (s)} \leq x) - W_{d; n} (x; 2 s)| \leq C (s) n^{- 2} ln n for d = 1 and d = 3,$

(57)

$W_{1; n} (x; 2 s) = W_{1} (x; 2 s) + n^{- 1} w_{1} (x; 2 s) (q_{1} \frac{x (1 + \sqrt{2 s x})}{2 s} + q_{2} x^{2} - \frac{(s - 1) x (\sqrt{2 s x} + 1)}{4 s})$

(58)

and

$W_{3; n} (x; 2 s) = W_{3} (x; 2 s) + n^{- 1} w_{3} (x; 2 s) (q_{1} \frac{x \sqrt{2 s x}}{2 s} + q_{2} x^{2} - \frac{(s - 1) x^{2}}{2 \sqrt{2 s x}}) .$

(59)

where the limit law $W_{d} (x; 2 s)$ with density $w_{d} (x; 2 s)$ for $d = 1$ and $d = 3$ are given in (45) and (46).
ii:: The random scaling factor $N_{n} (s)$ by $T_{N_{n} (s)}$ induces the limiting chi-square distribution.

${sup}_{x} |P (N_{n} (s) T_{N_{n} (s)} \leq x) - G_{d} (x) - \frac{g_{d} (x)}{s n} (q_{1} x + q_{2} x^{2})| \leq C (s) \frac{ln n}{n^{2}},$

(60)
iii:: Limiting inverse Pareto distributions occur at mixed scaling factor $n^{- 1} N_{n}^{2} (s)$ by $T_{N_{n} (s)}$ .

${sup}_{x > 0} |P (\frac{N_{n}^{2} (s)}{n} T_{N_{n} (s)} \leq x) - V_{d / 2} (x; 2 s) - \frac{1}{n} v_{d / 2; n} (x; 2 s)| \leq C (s) \frac{ln n}{n^{2}},$

where

$v_{d / 2; n} (x; 2 s) = v_{d / 2} (x; 2 s) (q_{1} \frac{x (d + 2)}{x + 2 s} + q_{2} \frac{x^{2} (d + 4) (d + 2)}{{(x + 2 s)}^{2}} + \frac{(s - 1) x (2 + d)}{2 (x + 2 s)})$

(61)

with inverse Pareto distribution $V_{d / 2} (x; 2 s)$ having shape parameter $d / 2$ , scale parameter $2 s$ and density $v_{d / 2} (x; 2 s)$ defined in (49).

Remark 20.

Analogous to (57), expansions for all

d \in N_{+}

can be derived from Formulas (44), (63)–(65) and (69) below in Section 7, whereby closed forms can be given for

d \in {1, 3, 5, \dots}

.

The statistics from Section 3.2 are considered with different normalization factors as applications of Theorem 2.

Corollary 2.

Let the conditions of Theorem 2 be satisfied.

i:: Let $χ_{d}^{2} / χ_{m}^{2}$ be the scale mixture estimated in (24), where $χ_{d}^{2}$ and $χ_{m}^{2}$ are independent. Then, using the non-random scaling factor n, limiting generalized gamma distributions occur with $q_{1} = (d - 2) / 2$ and $q_{2} = - 1 / 2$ in (58) and (59):

${sup}_{x > 0} |P (n tr (χ_{d}^{2} / χ_{N_{n} (s)}^{2}) \leq x) - W_{d; n} (x; 2 s)| \leq C (s) n^{- 2} ln n, for d = 1 and d = 3,$

$W_{1; n} (x; 2 s) = W_{1} (x; 2 s) + n^{- 1} w_{1} (x; 2 s) (- \frac{x (1 + \sqrt{2 s x})}{4 s} - \frac{x^{2}}{2} - \frac{(s - 1) x (\sqrt{2 s x} + 1)}{4 s})$

$x > 0$ and

$W_{3; n} (x; 2 s) = W_{3} (x; 2 s) + n^{- 1} w_{3} (x; 2 s) (\frac{x}{4 s} \sqrt{2 s x} - \frac{x^{2}}{2} - \frac{(s - 1) x^{2}}{2 \sqrt{2 s x}}), x > 0,$

where the limit law $W_{d} (x; 2 s)$ with density $w_{d} (x; 2 s)$ for $d = 1$ and $d = 3$ are given in (45) and (46).
ii:: For the scaled mixture $χ_{4}^{2} / χ_{m}^{2}$ estimated in (27), one gets the limiting chi-square distribution with a random scaling factor $N_{n} (s)$ in (60) with $q_{1} = 1$ and $q_{2} = - 1 / 2$ :

${sup}_{x} |P (N_{n} (s) χ_{4}^{2} / χ_{N_{n} (s)}^{2} \leq x) - G_{d} (x) - \frac{g_{d} (x)}{s n} (x - x^{2} / 2)| \leq C (s) \frac{ln n}{n^{2}},$
iii:: In the case of the Hotelling’s generalized $T_{0}^{2}$ statistic $T_{0}^{2} = m tr (S_{q} S_{m}^{- 1})$ estimated in (22), one has the limiting inverse Pareto distributions with mixed scaling factor $n^{- 1} N_{n}^{2} (s)$ by $tr (S_{q} S_{N_{n} (s)}^{- 1})$ . Here, (61) holds with $q_{1} = (p + 1 - q) / 2$ and $q_{2} = (p + 1 + q) / (2 d + 4)$ .

${sup}_{x > 0} |P (\frac{N_{n}^{2} (s)}{n} tr (S_{q} S_{N_{n} (s)}^{- 1}) \leq x) - V_{d / 2} (x; 2 s) - \frac{1}{n} v_{d / 2; n} (x; 2 s)| \leq C (s) \frac{ln n}{n^{2}},$

and

$v_{d / 2; n} (x; 2 s) = v_{d / 2} (x; 2 s) (\frac{(p + 1 - q) x (d + 2)}{2 (x + 2 s)} + \frac{(p + 1 + q) x^{2} (d + 4)}{2 {(x + 2 s)}^{2}} + \frac{(s - 1) x (2 + d)}{2 (x + 2 s)})$

where the inverse Pareto distribution $V_{d / 2} (x; 2 s)$ with shape parameter $d / 2$ , scale parameter $2 s$ and density $v_{d / 2} (x; 2 s)$ is defined in (49).

Remark 21.

For the statistics estimated in (26) and (27), the approximating functions in the expansions for

P (g_{n}^{γ} N_{n} {(s)}^{1 - γ} T_{N_{n} (s)} \leq x)

can be given in closed form for all integers d in the case of random (

γ = 0

) or mixed (

γ = - 1

) normalization factors. In the case of the non-random (

γ = 1

) normalization factor, closed forms are available only for odd integers d; for even integers d, Macdonald functions are involved.

7. Formal Construction of the Expansions

Expansions of the statistics considered in (15), (18), (21), (26) and (27) have the structure:

G (x) + g (x) (m^{- 1 / 2} P_{1} (x; j_{1}^{*}) + m^{- 1} P_{2} (x; j_{2}^{*}))

with

g (x) = G^{'} (x)

and polynomials

P_{1} (x; j_{1}^{*}), P_{2} (x; j_{2}^{*})

of degrees

j_{1}^{*}

and

j_{2}^{*}

, respectively. Here,

G (x) = Φ (x)

or

G (x) = P (χ_{d}^{2} \leq x)

.

We calculate the integrals with

k = 1, 2

and

j = 0, 1, \dots, j_{k}^{*}

:

J_{1} (x; γ) = \int_{0}^{\infty} G (x y^{γ}) d H (y) and J_{2} (x; γ, k, j) = x^{j} \int_{0}^{\infty} y^{γ j - k / 2} g (x y^{γ}) d H (y) .

The limit distributions of the random sizes

N_{n}

are

H (y) = G_{r, r} (y)

and

H (y) = W_{s} (y) = e^{- s / y}

with corresponding second approximation

h_{2} (y)

.

We use the following formulas several times: Formula 2.3.3.1 in [40]

M_{α} (p) = \int_{0}^{\infty} y^{α - 1} e^{- p y} d y \overset{y = 1 / z}{=} \int_{0}^{\infty} z^{- α - 1} e^{- p / z} d z = Γ (α) p^{- α}, α > 0, p > 0,

(62)

and Formula 2.3.16.1 in [40] with real

α

and

p, q > 0

:

K_{α}^{*} (p, q) = \int_{0}^{\infty} y^{α - 1} e^{- p y - q / y} d y = 2 {(\frac{q}{p})}^{α / 2} K_{α} (2 \sqrt{p q}),

(63)

where the Macdonald function

K_{α} (u)

already appears in Formula (42) with different

α

and argument.

For

α = m + 1 / 2

, where m is an integer, the Macdonald function

K_{m + 1 / 2} (u)

has a closed form (see Formulas 2.3.16.2 and 2.3.16.3 in [40] with

p, q > 0

):

K_{m + 1 / 2}^{*} (p, q) = K_{m}^{* *} (p, q) if m is an integer,

(64)

where

K_{m}^{* *} (p, q) = \int_{0}^{\infty} y^{m - 1 / 2} e^{- p y - q / y} d y = \{\begin{matrix} {(- 1)}^{m} \sqrt{π} \frac{\partial^{m}}{\partial p^{m}} (p^{- 1 / 2} e^{- 2 \sqrt{p q}}), m = 0, 1, 2, \dots, \\ {(- 1)}^{- m} \sqrt{\frac{π}{p}} \frac{\partial^{- m}}{\partial q^{- m}} e^{- 2 \sqrt{p q}}, m = 0, - 1, - 2, \dots \end{matrix}

(65)
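Formulas (62), (64) and (65) are easy to sanity-check numerically. The following sketch is an illustration, not the authors' procedure: it evaluates the integrals with a trapezoidal rule after the substitution y = e^t and compares them with closed forms for m = 0, 1, -1 that were obtained here by carrying out the derivatives in (65) by hand. The function names and parameter values are hypothetical.

```python
import math

def integrate_log_scale(f, t_min=-40.0, t_max=8.0, n=8000):
    """Trapezoidal rule for int_0^inf f(y) dy after the substitution y = e^t."""
    h = (t_max - t_min) / n
    total = 0.0
    for i in range(n + 1):
        y = math.exp(t_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(y) * y          # dy = y dt
    return total * h

def M(alpha, p):
    """Closed form (62): Gamma(alpha) p^{-alpha}."""
    return math.gamma(alpha) * p ** (-alpha)

def K_star_numeric(alpha, p, q):
    """Numerical value of K*_alpha(p, q) = int_0^inf y^{alpha - 1} e^{-p y - q / y} dy."""
    return integrate_log_scale(lambda y: y ** (alpha - 1.0) * math.exp(-p * y - q / y))

def K_2star_closed(m, p, q):
    """K**_m(p, q) from (65), with the derivatives worked out for m = 0, 1, -1."""
    e = math.exp(-2.0 * math.sqrt(p * q))
    if m == 0:
        return math.sqrt(math.pi / p) * e
    if m == 1:
        return math.sqrt(math.pi) * e * (0.5 * p ** (-1.5) + math.sqrt(q) / p)
    if m == -1:
        return math.sqrt(math.pi / q) * e
    raise ValueError("only m = 0, 1, -1 are worked out in this sketch")

p, q = 1.3, 0.7   # illustrative values
print(integrate_log_scale(lambda y: y ** 1.5 * math.exp(-p * y)), M(2.5, p))
for m in (0, 1, -1):
    print(m, K_star_numeric(m + 0.5, p, q), K_2star_closed(m, p, q))
```

The agreement for half-integer indices reflects relation (64): K*_{m+1/2}(p, q) = K**_m(p, q).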

7.1. The Case $G (x) = Φ (x)$ and $H (y) = G_{r, r} (y)$

Consider statistics that meet the condition (50).

Let

J_{1} (x; γ) = \int_{0}^{\infty} Φ (x y^{γ}) d G_{r, r} (y)

with

γ \in {0, \pm 1 / 2}

. Then,

J_{1} (x; γ) = Φ (x)

for

γ = 0

and

\begin{matrix} \frac{\partial}{\partial x} J_{1} (x; γ) & = & \frac{r^{r}}{Γ (r) \sqrt{2 π}} \int_{0}^{\infty} y^{r + γ - 1} e^{- (x^{2} y^{2 γ} / 2 + r y)} d y \\ = \frac{r^{r}}{Γ (r) \sqrt{2 π}} \{\begin{matrix} M_{r + 1 / 2} (r (1 + x^{2} / (2 r))), & for γ = 1 / 2, \\ K_{r - 1 / 2}^{*} (r, x^{2} / 2) = 2 {(\frac{x^{2}}{2 r})}^{r / 2 - 1 / 4} K_{r - 1 / 2} (\sqrt{2 r} x), & for γ = - 1 / 2 . \end{matrix} \end{matrix}

(66)

If

r > 0

is an integer number then using (64) with

m = r - 1

, the density of

J_{1} (x; - 1 / 2)

can be calculated with (65) in a closed form.

Let

γ \in {0, \pm 1 / 2}

. Let

k = 1, 2

and

j = 0, 1, \dots, 5

be the exponents at

m^{- k / 2}

and

x^{j}

in (50), respectively.

\begin{matrix} J_{2} (x; γ, k, j) & = & \frac{r^{r} x^{j}}{Γ (r) \sqrt{2 π}} \int_{0}^{\infty} y^{j γ + r - 1 - k / 2} e^{- (x^{2} y^{2 γ} / 2 + r y)} d y \\ = & \frac{r^{r} x^{j}}{Γ (r) \sqrt{2 π}} \{\begin{matrix} M_{j / 2 + r - k / 2} (r (1 + x^{2} / (2 r))), & for γ = 1 / 2, \\ e^{- x^{2} / 2} M_{r - k / 2} (r), & for γ = 0, \\ K_{r - (j + k) / 2}^{*} (r, x^{2} / 2), & for γ = - 1 / 2 . \end{matrix} \end{matrix}

(67)

In (50)

k + j

is always an odd integer. If

r > 0

is an integer, then

K_{r - (j + k) / 2}^{*} (r, x^{2} / 2) =

K_{r - (j + k + 1) / 2}^{* *} (r, x^{2} / 2)

.

Define

p_{j}^{*} = p_{j} J_{2} (x; γ, k, j)

with coefficient

p_{j}

from (50) and calculate the terms in (67):

\begin{matrix} γ = 1 / 2, & k = 1 : p_{0}^{*} = \frac{p_{0} (2 r + x^{2})}{2 r - 1} s_{2 r} (x), p_{2}^{*} = p_{2} x^{2} s_{2 r} (x), \\ k = 2 : p_{1}^{*} = \frac{p_{1} x (x^{2} + 2 r)}{2 r - 1} s_{2 r} (x), p_{3}^{*} = p_{3} x^{3} s_{2 r} (x), p_{5}^{*} = p_{5} \frac{x^{5} (2 r + 1)}{x^{2} + 2 r} s_{2 r} (x), \\ γ = - 1 / 2, & k = 1, r = 2 : p_{0}^{*} = p_{0} (| x | + 1 / 2) e^{- 2 | x |}, p_{2}^{*} = p_{2} 2 x^{2} e^{- 2 | x |}, \\ k = 2, r = 2 : p_{1}^{*} = p_{1} 2 x e^{- 2 | x |}, p_{3}^{*} = p_{3} 4 | x | x e^{- 2 | x |}, p_{5}^{*} = p_{5} 4 (2 x^{3} + x | x |) e^{- 2 | x |}, \\ γ = 0, & k = 1, 2, j = 0, 1, 2, 3, 5 : p_{j}^{*} = p_{j} x^{j} φ (x) r^{k / 2} Γ (r - k / 2) / Γ (r), \end{matrix}\}

(68)
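The entries of (68) can be cross-checked against the integral representation (67). The sketch below is a numerical illustration under assumptions: the integral is computed by a trapezoidal rule after substituting y = e^t, only a few representative entries are compared, and the function names as well as the test points are hypothetical.

```python
import math

def integrate_log_scale(f, t_min=-40.0, t_max=8.0, n=8000):
    """Trapezoidal rule for int_0^inf f(y) dy after the substitution y = e^t."""
    h = (t_max - t_min) / n
    total = 0.0
    for i in range(n + 1):
        y = math.exp(t_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(y) * y
    return total * h

def J2(x, gamma, k, j, r):
    """Numerical J_2(x; gamma, k, j) from (67) with mixing law G_{r,r}."""
    const = r ** r * x ** j / (math.gamma(r) * math.sqrt(2.0 * math.pi))
    power = j * gamma + r - 1.0 - k / 2.0
    return const * integrate_log_scale(
        lambda y: y ** power * math.exp(-(x * x * y ** (2.0 * gamma) / 2.0 + r * y)))

def s2r(x, r):
    """Student density s_{2r}(x) with exponent -(r + 1/2)."""
    return (math.gamma(r + 0.5) / (math.sqrt(2.0 * r * math.pi) * math.gamma(r))
            * (1.0 + x * x / (2.0 * r)) ** (-(r + 0.5)))

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

r, x = 1.5, 0.9      # gamma = 1/2, arbitrary r > 1
print(J2(x, 0.5, 1, 0, r), (2 * r + x * x) / (2 * r - 1) * s2r(x, r))
print(J2(x, 0.5, 2, 5, r), x ** 5 * (2 * r + 1) / (x * x + 2 * r) * s2r(x, r))

r, x = 2.0, -0.6     # gamma = -1/2 requires r = 2 in (68)
e = math.exp(-2.0 * abs(x))
print(J2(x, -0.5, 1, 0, r), (abs(x) + 0.5) * e)
print(J2(x, -0.5, 2, 5, r), 4.0 * (2.0 * x ** 3 + x * abs(x)) * e)

# gamma = 0: x^j phi(x) r^{k/2} Gamma(r - k/2) / Gamma(r)
print(J2(x, 0.0, 1, 2, r), x * x * phi(x) * math.sqrt(r) * math.gamma(r - 0.5) / math.gamma(r))
```

A negative x is used in the second group on purpose, since the entries with odd j are odd functions of x and involve x|x| rather than x^2.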

7.2. The Case $G (x) = G_{d} (x)$ and $H (y) = W_{s} (y) = e^{- s / y}$

Consider statistics that meet the condition (56). Let

J_{1} (x; γ) = \int_{0}^{\infty} G_{d} (x y^{γ}) s y^{- 2} e^{- s / y} d y

,

γ \in {0, \pm 1}

. Then,

J_{1} (x; 0) = G_{d} (x)

and

\frac{\partial}{\partial x} J_{1} (x; γ) = s \int_{0}^{\infty} y^{γ - 2} g_{d} (x y^{γ}) e^{- s / y} d y = \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} y^{γ d / 2 - 2} e^{- x y^{γ} / 2 - s / y} d y, γ = \pm 1 .

Let

γ = 1

. Using (63) with

α = d / 2 - 1

,

p = x / 2

and

q = s

, we find

\begin{matrix} \frac{\partial}{\partial x} J_{1} (x; 1) & = & s \int_{0}^{\infty} y^{- 1} g_{d} (x y) e^{- s / y} d y = \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} y^{d / 2 - 2} e^{- x y / 2 - s / y} d y \\ & = & \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} K_{d / 2 - 1}^{*} (x / 2, s) = \frac{s}{Γ (d / 2)} {(\frac{s x}{2})}^{d / 4 - 1 / 2} K_{d / 2 - 1} (\sqrt{2 s x}) . \end{matrix}

If

d = 1, 3, 5, \dots

is an odd number, using the closed form

K_{m}^{* *} (p, q)

in (65) with

m = (d - 3) / 2

,

p = x / 2

and

q = s

, then

J_{1} (x; 1) = W_{d} (x; 2 s)

and its density

w_{d} (x; 2 s)

may be calculated in closed form:

w_{d} (x; 2 s) = \frac{\partial}{\partial x} J_{1} (x; 1) = \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} K_{(d - 3) / 2}^{* *} (x / 2, s) for d = 1, 3, 5, \dots

(69)

The distribution functions

W_{d} (x; 2 s)

and their densities

w_{d} (x; 2 s)

for

d = 1, 3, 5

are given in (45)–(47).

If

γ = - 1

, we use (62) with

α = d / 2 + 1

,

p = (x + 2 s) / 2

and the substitution

y = 1 / z

:

\begin{matrix} \frac{\partial}{\partial x} J_{1} (x; - 1) = \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} y^{- d / 2 - 2} e^{- (x / 2 + s) / y} d y \overset{y = 1 / z}{=} \frac{s x^{d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} z^{d / 2} e^{- (x + 2 s) z / 2} d z \\ = \frac{s x^{d / 2 - 1} Γ (d / 2 + 1) 2^{d / 2 + 1}}{2^{d / 2} Γ (d / 2) {(x + 2 s)}^{d / 2 + 1}} = \frac{s d x^{d / 2 - 1}}{{(x + 2 s)}^{d / 2 + 1}} = \frac{s d}{x^{2}} {(1 + \frac{2 s}{x})}^{- d / 2 - 1} = v_{d / 2} (x; 2 s) . \end{matrix}

(70)

where

v_{d / 2} (x; 2 s)

is the density of the inverse Pareto distribution defined in (49).

Suppose

γ \in {0, \pm 1}

. Let

j = 1, 2

be the exponent at

x^{j}

in (56). Then, by (65) for positive odd numbers d with

m = j + (d - 7) / 2

,

p = x / 2

,

q = s

if

γ = 1

, by (62) with

α = 2

,

p = s

for

γ = 0

and with

α = j + d / 2 + 1

,

p = (x + 2 s) / 2

for

γ = - 1

:

\begin{matrix} J_{2} (x; γ, 2, j) = \frac{s x^{j}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} y^{j γ - 3} {(x y^{γ})}^{d / 2 - 1} e^{- (x y^{γ} / 2 + s / y)} d y \\ = \frac{s x^{j + d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} \frac{y^{γ (j + d / 2 - 1) - 3}}{e^{x y^{γ} / 2 + s / y}} d y = \frac{s x^{j + d / 2 - 1}}{2^{d / 2} Γ (d / 2)} \{\begin{matrix} K_{j + (d - 7) / 2}^{* *} (x / 2, s), & γ = 1, \\ e^{- x / 2} M_{2} (s), & γ = 0, \\ M_{j + 1 + d / 2} ((x + 2 s) / 2), & γ = - 1 . \end{matrix} \end{matrix}

(71)

If d is not an odd number,

K_{j + (d - 7) / 2}^{* *} (x / 2, s)

in (71) has to be replaced by

K_{j + d / 2 - 3}^{*} (x / 2, s)

, which may be calculated with (63) where the Macdonald functions

K_{j + d / 2 - 3} (\sqrt{2 s x})

are involved.

Define

q_{j}^{*} = q_{j} J_{2} (x; γ, 2, j)

for

j = 1, 2

with the coefficient

q_{j}

from (56). Calculating the corresponding terms in (71) we find

\begin{matrix} γ = 1, d = 1 : & q_{1}^{*} = q_{1} x {(2 s)}^{- 1} (1 + \sqrt{2 s x}) w_{1} (x), & q_{2}^{*} = q_{2} x^{2} w_{1} (x), \\ γ = 1, d = 3 : & q_{1}^{*} = q_{1} {(2 s)}^{- 1} x \sqrt{2 s x} w_{3} (x), & q_{2}^{*} = q_{2} x^{2} w_{3} (x), \\ γ = - 1 : & q_{1}^{*} = q_{1} \frac{x (d + 2)}{x + 2 s} v_{d / 2} (x; 2 s), & q_{2}^{*} = q_{2} \frac{x^{2} (d + 4) (d + 2)}{{(x + 2 s)}^{2}} v_{d / 2} (x; 2 s), \\ γ = 0 : & q_{1}^{*} = q_{1} x s^{- 1} g_{d} (x), & q_{2}^{*} = q_{2} x^{2} s^{- 1} g_{d} (x) \end{matrix}\}

(72)
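The rows of (72) can be compared with a direct numerical evaluation of the integral in (71). The sketch below is illustrative only: the quadrature (trapezoidal rule after y = e^t), the function names and the test values s = 0.8, x = 1.7 are assumptions made here, and the functions return J_2(x; γ, 2, j), that is q_j^* divided by q_j.

```python
import math

def integrate_log_scale(f, t_min=-8.0, t_max=40.0, n=8000):
    """Trapezoidal rule for int_0^inf f(y) dy after the substitution y = e^t."""
    h = (t_max - t_min) / n
    total = 0.0
    for i in range(n + 1):
        y = math.exp(t_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(y) * y
    return total * h

def J2(x, gamma, j, d, s):
    """Numerical J_2(x; gamma, 2, j) from (71) with mixing law W_s."""
    const = s * x ** (j + d / 2.0 - 1.0) / (2.0 ** (d / 2.0) * math.gamma(d / 2.0))
    power = gamma * (j + d / 2.0 - 1.0) - 3.0
    return const * integrate_log_scale(
        lambda y: y ** power * math.exp(-(x * y ** gamma / 2.0 + s / y)))

def table_72(x, gamma, j, d, s):
    """Closed-form entries of (72), without the factor q_j."""
    u = math.sqrt(2.0 * s * x)
    if gamma == 1 and d == 1:
        w1 = s / u * math.exp(-u)
        return (x * (1.0 + u) / (2.0 * s) if j == 1 else x * x) * w1
    if gamma == 1 and d == 3:
        w3 = s * math.exp(-u)
        return (x * u / (2.0 * s) if j == 1 else x * x) * w3
    if gamma == -1:
        v = s * d * x ** (d / 2.0 - 1.0) / (x + 2.0 * s) ** (d / 2.0 + 1.0)
        ratio = x / (x + 2.0 * s)
        return ((d + 2.0) * ratio if j == 1 else (d + 4.0) * (d + 2.0) * ratio ** 2) * v
    if gamma == 0:
        g = x ** (d / 2.0 - 1.0) * math.exp(-x / 2.0) / (2.0 ** (d / 2.0) * math.gamma(d / 2.0))
        return x ** j * g / s
    raise ValueError("no closed form listed for this combination")

s, x = 0.8, 1.7
for gamma, d in ((1, 1), (1, 3), (-1, 2), (-1, 5), (0, 4)):
    for j in (1, 2):
        print(gamma, d, j, J2(x, gamma, j, d, s), table_72(x, gamma, j, d, s))
```

For γ = 1 only d = 1 and d = 3 are compared, because these are the cases listed in (72); the rows for γ = -1 and γ = 0 hold for general d.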

8. Proof of Theorems

We find from Lemmas A1 and A2 that

D_{n}

in (5) in Proposition 1 is bounded and the integrals in (10) and (11) in Proposition 2 have the necessary convergence rates. It remains to calculate the integrals in (9).

Proof of Theorem 1.

Let

F (x) = Φ (x)

,

H (y) = G_{r, r} (y)

and

h_{2} (y) = h_{2; r} (y)

defined in (32).

Suppose

J_{1} (x; γ) = \int_{0}^{\infty} Φ (x y^{γ}) d G_{r, r} (y)

with

γ \in {0, \pm 1 / 2}

, which are the limit distributions in (9) for

P (g_{n}^{γ} N_{n} {(r)}^{1 / 2 - γ} Z_{N_{n} (r)} \leq x)

under the condition of Theorem 1. Then,

J_{1} (x; γ) = Φ (x)

for

γ = 0

. It follows from (66), (62) for

γ = 1 / 2

and (65) for

γ = - 1 / 2

that

\frac{\partial}{\partial x} J_{1} (x; γ) = \{\begin{matrix} s_{2 r} (x) = \frac{Γ (r + 1 / 2)}{\sqrt{2 r π} Γ (r)} {(1 + \frac{x^{2}}{2 r})}^{- (r + 1 / 2)}, & γ = 1 / 2 & with J_{1} (x; 1 / 2) = S_{2 r} (x), \\ φ (x) = \frac{1}{\sqrt{2 π}} e^{- x^{2} / 2}, & γ = 0 & with J_{1} (x; 0) = Φ (x), \\ l_{2} (x) = (\frac{1}{2} + | x |) e^{- 2 | x |}, r = 2, & γ = - 1 / 2, & with J_{1} (x; - 1 / 2) = L_{2} (x), \end{matrix}

(73)

where

s_{2 r} (x)

is the density of Student’s t-distribution with

2 r

degrees of freedom and

l_{2} (x)

is the density of a generalized Laplace distribution.

Integral

J_{2} (x; γ) = \int_{0}^{\infty} y^{- 1 / 2} (p_{0} + p_{2} x^{2} y^{2 γ}) φ (x y^{γ}) d G_{r, r} (y)

is the integral by

g_{n}^{- 1 / 2}

in the expansion (9). Then, using (67) and (68) with

k = 1

, we obtain

J_{2} (x; γ) = p_{0} J_{2} (x; γ, 1, 0) + p_{2} J_{2} (x; γ, 1, 2) = p_{0}^{*} + p_{2}^{*},

(74)

Integral

J_{3} (x; γ) = \int_{0}^{\infty} y^{- 1} (p_{1} x y^{γ} + p_{3} x^{3} y^{3 γ} + p_{5} x^{5} y^{5 γ}) φ (x y^{γ}) d G_{r, r} (y)

is the integral by

g_{n}^{- 1}

in the expansion (9). Then, using again (67) and (68) with

k = 2

, we obtain

J_{3} (x; γ) = p_{1} J_{2} (x; γ, 2, 1) + p_{3} J_{2} (x; γ, 2, 3) + p_{5} J_{2} (x; γ, 2, 5) = p_{1}^{*} + p_{3}^{*} + p_{5}^{*}

(75)

Integration by parts in the last integral by

n^{- 1}

in (9) for

γ = \pm 1 / 2

and

r > 1

leads to

J_{4} (x; γ) = \int_{0}^{\infty} Φ (x y^{γ}) d h_{2, r} (y) = - \frac{γ x r^{r}}{2 r \sqrt{2 π} Γ (r)} \int_{0}^{\infty} \frac{y^{γ + r - 2}}{e^{x^{2} y^{2 γ} / 2 + r y}} ((y - 1) (2 - r) + 2 Q_{1} (g_{n} y)) d y .

Suppose

γ = 1 / 2

. We find from (62)

\begin{matrix} J_{4} (x; 1 / 2) & = & \frac{x r^{r} (2 - r)}{4 r \sqrt{2 π} Γ (r)} (M_{r - 1 / 2} (r + x^{2} / 2) - M_{r + 1 / 2} (r + x^{2} / 2)) - J_{4}^{*} (x; 1 / 2) \\ = & \frac{(2 - r) x (x^{2} + 1)}{4 r (2 r - 1)} s_{2 r} (x) - J_{4}^{*} (x; 1 / 2) . \end{matrix}

with

J_{4}^{*} (x; 1 / 2) = \frac{x r^{r - 1}}{2 \sqrt{2 π} Γ (r)} \int_{0}^{\infty} y^{r - 3 / 2} e^{- (r + x^{2} / 2) y} Q_{1} (g_{n} y) d y,

(76)

where

Q_{1} (y)

is defined in (33). It follows from Lemma A3 that for

r > 1

{sup}_{x} n^{- 1} | J_{4}^{*} (x; 1 / 2) | \leq c (r) n^{- r} .

Hence, because of

0 \leq g_{n}^{- 1} - {(r n)}^{- 1} \leq {(n g_{n})}^{- 1}

for

r \geq 1

, we obtain

|\frac{1}{n} \int_{0}^{\infty} Φ (x \sqrt{y}) d h_{2} (y) - \frac{(2 - r) x (x^{2} + 1)}{4 (2 r - 1) g_{n}} s_{2 r} (x)| \leq \frac{1}{n} | J_{4}^{*} | + \frac{C (r)}{n g_{n}} \leq c_{1} (r) n^{- min {r, 2}} .

(77)
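The elementary inequality used in the step leading to (77) follows directly from the definition g_n = r (n - 1) + 1 in Theorem 1; a short worked version, written out here for convenience, is:

```latex
g_n = r(n-1)+1 = rn-(r-1) \le rn \quad (r \ge 1),
\qquad
\frac{1}{g_n}-\frac{1}{rn}
  = \frac{rn-g_n}{rn\,g_n}
  = \frac{r-1}{r}\cdot\frac{1}{n\,g_n}
  \in \Bigl[\,0,\ \frac{1}{n\,g_n}\Bigr].
```

Replacing 1/(rn) by 1/g_n in the leading term therefore changes it by at most a bounded function of x times 1/(n g_n), which is of order n^{-2}.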

For

γ = - 1 / 2

, we only consider the case

r = 2

, which results in

J_{4} (x; - 1 / 2) = 0

and

J_{4}^{*} (x; - 1 / 2) = \frac{x}{2 \sqrt{2 π}} \int_{0}^{\infty} y^{- 1 / 2} Q_{1} (g_{n} y) e^{- (2 y + x^{2} / (2 y))} d y,

(78)

where

{sup}_{x} n^{- 1} J_{4}^{*} (x; - 1 / 2) \leq C n^{- 2}

is proved in Lemma A3.

If

γ = 0

, then

J_{4} (x; 0) = Φ (x) (h_{2; r} (\infty) - h_{2; r} (0)) = 0

since

Q_{1} (0) = 1 / 2

.

The proof of Theorem 1 follows from (73)–(75) and (77) and Lemma A3. □

Proof of Theorem 2.

Let

F (x) = G_{d} (x)

,

H (y) = W_{s} (y) = e^{- s / y}

and

h_{2} (y) = h_{2; s} (y)

defined in (38).

Suppose

J_{1} (x; γ) = \int_{0}^{\infty} G_{d} (x y^{γ}) s y^{- 2} e^{- s / y} d y

with

γ \in {0, \pm 1}

which are the limit distributions in (9) for

P (g_{n}^{γ} N_{n} {(s)}^{1 - γ} T_{N_{n} (s)} \leq x)

under the condition of Theorem 2.

Then,

J_{1} (x; γ) = G_{d} (x)

for

γ = 0

. It follows from (69) and (65) for

γ = 1

and (70) for

γ = - 1

that

\frac{\partial}{\partial x} J_{1} (x; γ) = \{\begin{matrix} w_{1} (x; 2 s) = s {(2 s x)}^{- 1 / 2} e^{- \sqrt{2 s x}}, & γ = 1, d = 1, & with J_{1} (x; 1) = W_{1} (x; 2 s), \\ w_{3} (x; 2 s) = s e^{- \sqrt{2 s x}}, & γ = 1, d = 3, & with J_{1} (x; 1) = W_{3} (x; 2 s), \\ g_{d} (x) = \frac{x^{d / 2 - 1} e^{- x / 2}}{2^{d / 2} Γ (d / 2)}, & γ = 0, & with J_{1} (x; 0) = G_{d} (x), \\ v_{d / 2} (x; 2 s) = \frac{s d x^{d / 2 - 1}}{{(x + 2 s)}^{d / 2 + 1}}, & γ = - 1, & with J_{1} (x; - 1) = V_{d / 2} (x; 2 s), \end{matrix}

where

w_{1} (x; 2 s)

is the Weibull density (see (45)),

w_{3} (x; 2 s)

is the generalized gamma density (see (46)) and

v_{d / 2} (x; 2 s)

is the density of the inverse Pareto distribution

V_{d / 2} (x; 2 s)

defined in (49).

Integral

J_{2} (x; γ) = \int_{0}^{\infty} y^{- 1} (q_{1} x y^{γ} + q_{2} x^{2} y^{2 γ}) g_{d} (x y^{γ}) s y^{- 2} e^{- s / y} d y

is the integral multiplying

g_{n}^{- 1}

in the expansion (9). Then, the use of (71) and (72) leads to

J_{2} (x; γ) = q_{1} J_{2} (x; γ, 2, 1) + q_{3} J_{2} (x; γ, 2, 3) = q_{1}^{*} + q_{3}^{*} .

Integration by parts in the last integral, which multiplies

n^{- 1}

in (9) for

γ = \pm 1

leads to

\begin{matrix} J_{3} (x; γ, d) & = & \int_{0}^{\infty} G_{d} (x y^{γ}) d h_{2; s} (y) = J_{4} (x; γ, d) + J_{4}^{*} (x; γ, d), \\ J_{4} (x; γ, d) & = & - \frac{s (s - 1) γ x^{d / 2}}{2^{d / 2 + 1} Γ (d / 2)} \int_{0}^{\infty} y^{γ d / 2 - 3} e^{- x y^{γ} / 2 - s / y} d y = - \frac{s (s - 1) γ x^{d / 2}}{2^{d / 2 + 1} Γ (d / 2)} K_{(γ d - 5) / 2}^{* *} (x / 2, s), \\ J_{4}^{*} (x; γ, d) & = & - \frac{s γ x^{d / 2}}{2^{d / 2} Γ (d / 2)} \int_{0}^{\infty} y^{γ d / 2 - 3} e^{- x y^{γ} / 2 - s / y} Q_{1} (n y) d y, \end{matrix}

(79)

where

Q_{1} (y)

is defined in (33). Suppose

γ = 1

. We get with (65)

J_{4} (x; 1, 1) = - \frac{(s - 1) x (\sqrt{2 s x} + 1)}{4 s} w_{1} (x; 2 s) and J_{4} (x; 1, 3) = - \frac{(s - 1) x^{2}}{2 \sqrt{2 s x}} w_{3} (x; 2 s)

For

γ = - 1

using (65), we see that

J_{4} (x; - 1, d) = \frac{s (s - 1) x^{d / 2} \sqrt{2 s x}}{2^{1 + d / 2} Γ (d / 2)} M_{(d + 4) / 2} ((x + 2 s) / 2) = \frac{(s - 1) x (2 + d)}{2 (x + 2 s)} v_{d / 2} (x; 2 s) .
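The closed forms of J_{4} can be checked against the integral representation in (79) by direct quadrature. The sketch below is a hedged illustration under assumptions made here: the function names, the test values of x, s and d, and the reliance on SciPy are not taken from the paper, and w_{1}, w_{3} and v_{d / 2} are coded exactly as they are written in the case distinction earlier in this proof.

```python
import numpy as np
from scipy import integrate, special


def J4_quad(x, gamma, d, s):
    """J_4(x; gamma, d) from (79), evaluated by numerical quadrature."""
    const = -s * (s - 1.0) * gamma * x ** (d / 2) / (2.0 ** (d / 2 + 1) * special.gamma(d / 2))

    def integrand(y):
        # y^(gamma d/2 - 3) exp(-x y^gamma / 2 - s / y), assembled inside one exponential
        return np.exp((gamma * d / 2 - 3.0) * np.log(y) - x * y**gamma / 2 - s / y)

    left, _ = integrate.quad(integrand, 0.0, 1.0, limit=200)
    right, _ = integrate.quad(integrand, 1.0, np.inf, limit=200)
    return const * (left + right)


def w1(x, s):
    return s * (2.0 * s * x) ** -0.5 * np.exp(-np.sqrt(2.0 * s * x))


def w3(x, s):
    return s * np.exp(-np.sqrt(2.0 * s * x))


def v_density(x, d, s):
    return s * d * x ** (d / 2 - 1) / (x + 2.0 * s) ** (d / 2 + 1)


def J4_closed(x, gamma, d, s):
    """Closed forms quoted in the text for (gamma, d) = (1, 1), (1, 3) and (-1, d)."""
    if gamma == 1 and d == 1:
        return -(s - 1.0) * x * (np.sqrt(2.0 * s * x) + 1.0) / (4.0 * s) * w1(x, s)
    if gamma == 1 and d == 3:
        return -(s - 1.0) * x**2 / (2.0 * np.sqrt(2.0 * s * x)) * w3(x, s)
    if gamma == -1:
        return (s - 1.0) * x * (2.0 + d) / (2.0 * (x + 2.0 * s)) * v_density(x, d, s)
    raise ValueError("no closed form quoted for this case")


if __name__ == "__main__":
    x, s = 0.7, 1.5  # arbitrary test values
    for gamma, d in [(1, 1), (1, 3), (-1, 4)]:
        print(gamma, d, J4_quad(x, gamma, d, s), J4_closed(x, gamma, d, s))
```

For s = 1 the prefactor s (s - 1) vanishes, so both columns are zero in that case; any other s > 0 gives a non-trivial comparison.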

In Lemma A4,

{sup}_{x} n^{- 1} J_{4}^{*} (x; γ, d) \leq c (s) n^{- 2}

for

γ = \pm 1

is proved.

If

γ = 0

, then

J_{3} (x; 0, d) = G_{d} (x) (h_{2; s} (\infty) - {lim}_{y \to 0} h_{2; s} (y)) = 0

.

Combining the above estimates proves Theorem 2. □

9. Conclusions

Chebyshev–Edgeworth expansions are derived for the distributions of various statistics from samples with random sample sizes. The construction of these asymptotic expansions is based on the given asymptotic expansions for the distributions of statistics of samples with fixed sample sizes as well as those of the distributions of the random sample sizes.

The asymptotic laws are scale mixtures of the underlying standard normal or chi-square distributions with gamma or inverse exponential mixing distributions. The results hold for a whole family of asymptotically normal or chi-square statistics since a formal construction of asymptotic expansions is developed. In addition to the random sample size, the normalization factor for the examined statistics also has a significant influence on the limit distribution. As limit laws, Student, standard normal, Laplace, inverse Pareto, generalized gamma, generalized Laplace and weighted sums of generalized gamma distributions occur. As statistics, the random mean, the scale-mixed normalized Student t-distribution and the Student’s t-statistic under non-normality with normal limit law, as well as Hotelling’s generalized

T_{0}^{2}

and scale mixture of chi-squared statistics with chi-square limit laws, are considered. The bounds for the corresponding residuals are presented in terms of inequalities.

Author Contributions

Conceptualization, G.C. and V.V.U.; methodology, V.V.U. and G.C.; writing—original draft, G.C. and V.V.U.; and writing—review and editing, V.V.U. and G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was done within the framework of the Moscow Center for Fundamental and Applied Mathematics, Lomonosov Moscow State University and University Basic Research Programs.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the managing editor for the assistance and the reviewers for their careful reading of the manuscript and the relevant comments. Their constructive feedback helped to improve the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Auxiliary Statements and Lemmas

In Section 3, we consider statistics satisfying (1) in Assumption 1 and in Section 4 random sample sizes satisfying (2) in Assumption 2. The statistics

T_{m}

in (15), (18) and (21) satisfy Assumption 1 with the normal limit distribution

Φ (x)

and in (26) and (27) with chi-square limit distributions

G_{d} (x)

and

G_{4} (x)

, defined in (17), respectively.

Further, we estimate the functions

f_{k} (x y^{γ})

and

\frac{\partial}{\partial y} (\frac{f_{k} (x y^{γ})}{y^{k / 2}})

,

k = 1, 2

that appear in

(1)

of Assumption 1 and in the term

D_{n}

in

(5)

. Since the functions

f_{k} (z)

are products of a polynomial

P_{k} (z)

and a density function

p (z)

with

p (z) = φ (z)

or

p (z) = g_{r, r} (z)

, it follows for

γ \in {\pm 1 / 2, \pm 1}

that, if

f_{k} (x y^{γ}) = P_{k} (x y^{γ}) p (x y^{γ}) then \frac{\partial}{\partial y} (\frac{P_{k} (x y^{γ}) p (x y^{γ})}{y^{k / 2}}) = \frac{Q_{k} (x y^{γ}) p (x y^{γ})}{y^{1 + k / 2}},

(A1)

with some polynomial

Q_{k} (z)

. For

γ = 0

, we have

f_{k} (x y^{γ}) = f_{k} (x)

and

Q_{k} (x) = - (k / 2) P_{k} (x)

. Hence, (A1) also holds for

γ = 0

.

For example, with

f_{1} (z) = \frac{λ_{3}}{6} (z^{2} - 1) φ (z)

occurring in (16) for the sample mean

{\bar{X}}_{m}

and

f_{2} (z) = (A z - B z^{2}) g_{d} (z)

occurring in (27) with

d = 4

for scale mixture of chi-square statistics, we obtain

\frac{\partial}{\partial y} (\frac{f_{1} (x y^{γ})}{y^{1 / 2}}) = \frac{Q_{1} (x y^{γ}) φ (x y^{γ})}{y^{3 / 2}} and \frac{\partial}{\partial y} (\frac{f_{2} (x y^{γ})}{y}) = \frac{Q_{2} (x y^{γ}) g_{d} (x y^{γ})}{y^{2}}

with

Q_{1} (z) = λ_{3} (1 / 2 + (3 γ - 1 / 2) z^{2} - γ z^{4}) / 6

and

Q_{2} (z) = (B z^{3} - (B d - 2) z^{2} + A (d - 2) z) / 4

.

Remark A1.

If

P_{k} (0) \neq 0

, i.e., the absolute term of the polynomial

P_{k} (z)

is not equal to zero, then it is also the absolute term of

Q_{k} (z)

, i.e.,

Q_{k} (0) \neq 0

.

The functions

φ (z)

and

e^{- r z}

in

g_{r, r} (z)

allow obtaining the estimates for

c_{k}^{*} = {sup}_{z} | f_{k} (z) | < \infty and c_{k}^{* *} = {sup}_{z} | Q_{k} (z) | p (z) < \infty, k = 1, 2 .

(A2)

Appendix A.1. Lemmas A1 and A2

Lemma A1.

Consider the statistics estimated in (15), (18), (21), (26) and (27). Let

g_{n}

be a sequence with

0 < g_{n} ↑ \infty

as

n \to \infty

and

γ \in {- 1 / 2, 0, 1 / 2, 1}

. Then, with some computable constant

0 < C^{*} (γ) < \infty

, we obtain

D_{n} = {sup}_{x} \int_{1 / g_{n}}^{\infty} |\frac{\partial}{\partial y} (F (x y^{γ}) + \frac{f_{1} (x y^{γ})}{\sqrt{g_{n} y}} + \frac{f_{2} (x y^{γ})}{y g_{n}})| d y \leq C^{*} (γ),

where

F (x)

,

f_{1} (x)

and

f_{2} (x)

are defined in the approximation estimates (15), (18), (21), (26) and (27).

Proof of Lemma A1.

The statistics in (15), (18) and (21) satisfy Assumption 1 with the normal limit distribution

Φ (x)

. To estimate

D_{n} = {sup}_{x} | D_{n} (x) |

, we consider the cases

x \neq 0

and

x = 0

.

Let

x \neq 0

. Since

\frac{\partial}{\partial y} Φ (x y^{γ}) = 0

for

γ = 0

and

\frac{\partial}{\partial y} Φ (x y^{γ}) = γ x y^{γ - 1} φ (x y^{γ})

has constant

sign (γ x)

for

γ = \pm 1 / 2

and

y > 0

, we find

\int_{1 / g_{n}}^{\infty} |\frac{\partial}{\partial y} Φ (x y^{γ})| d y = |\int_{1 / g_{n}}^{\infty} γ x y^{γ - 1} φ (x y^{γ}) d y| = \{\begin{matrix} 1 - Φ (x g_{n}^{- γ}) \leq 1 / 2 & for x > 0 . \\ Φ (x g_{n}^{- γ}) \leq 1 / 2 & for x < 0 . \end{matrix}

From (A1) and (A2), it follows that

|\frac{\partial}{\partial y} (\frac{f_{k} (x y^{γ})}{{(g_{n} y)}^{k / 2}})| \leq 2 c_{k}^{* *} / k

, which proves the first case. Moreover,

D_{n} (0) = | Q_{1} (0) |

since

Q_{2} (0) = 0

for the considered statistics.

Consider now the statistics estimated in (26) and (27) with limit chi-square distributions. We only need to examine

x > 0

and

γ \in {0, \pm 1}

. In the cases now under review, we have

f_{1} (x) = 0

and

f_{2} (z) = (A z + B z^{2}) g_{d} (z)

with some real constants A and B. The proof is completed with (A1) and (A2),

\int_{1 / g_{n}}^{\infty} |\frac{\partial}{\partial y} G_{d} (x y^{γ})| d y = |\int_{1 / g_{n}}^{\infty} \frac{\partial}{\partial y} G_{d} (x y^{γ}) d y| \leq 1 and \int_{1 / g_{n}}^{\infty} |\frac{\partial}{\partial y} (\frac{f_{2} (x y^{γ})}{g_{n} y})| d y \leq c_{2}^{* *}

. □

Next, the integrals in (10) and (11) in Proposition 2 for the gamma limit distributions

H (y) = G_{r, r} (y)

and the inverse exponential limit distribution

H (y) = exp {- s / y}

are estimated.

Lemma A2.

(i) The conditions (6), (7) and (8) in Proposition 2 are satisfied for

G_{r, r} (y)

and

W_{s} (y) = e^{- s / y}

.

(ii) Let

γ \in {0, \pm 1 / 2}

. Consider

f_{1} (x)

given in (16) and

f_{2} (z)

given in (16) or (18) for statistics occurring in (15), (18) and (21) with limiting distribution

Φ (x)

.

(iia) Let the mixing distribution be

H (y) = G_{r, r} (y)

with

g_{n} = r (n - 1) + 1

,

b = min {r, 2}

and

h_{2} (y) = h_{2; r} (y) = g_{r, r} (y) ((y - 1) (2 - r) + 2 Q_{1} (g_{n} y)), y > 0

. Then, we obtain with

k = 1, 2

\begin{matrix} {sup}_{x} | I_{1} (x, n) | \leq {sup}_{x} \int_{1 / g_{n}}^{\infty} |\frac{f_{1} (x y^{γ})}{{(g_{n} y)}^{1 / 2}}| d G_{r, r} (y) \leq c_{1}^{*} g_{n}^{- r} & for & 0 < r < 1 / 2, γ \in {0, \pm 1 / 2}, \end{matrix}

(A3)

\begin{matrix} {sup}_{x} | I_{1} (x, n) | \leq c_{1}^{*} g_{n}^{- 1 / 2} ln g_{n} & for & r = 1 / 2, γ = \pm 1 / 2, \end{matrix}

(A4)

\begin{matrix} {sup}_{x} | I_{2} (x, n) | \leq {sup}_{x} \int_{1 / g_{n}}^{\infty} |\frac{f_{2} (x y^{γ})}{g_{n} y}| d G_{r, r} (y) \leq c_{2}^{*} g_{n}^{- r} & for & 0 < r < 1, γ \in {0, \pm 1 / 2}, \end{matrix}

(A5)

\begin{matrix} {sup}_{x} | I_{2} (x, n) | \leq c_{3} g_{n}^{- 1} & for & r = 1, γ = \pm 1 / 2, \end{matrix}

(A6)

\begin{matrix} {sup}_{x} | I_{k} (x, n) - g_{n}^{- k / 2} f_{k} (x) ln g_{n} | \leq c_{4} g_{n}^{- k / 2} & for & r = k / 2, γ = 0, \end{matrix}

(A7)

{sup}_{x} | I_{2 + k} (x, n) | = {sup}_{x} |\int_{1 / g_{n}}^{\infty} \frac{f_{k} (x y^{γ})}{n g_{n}^{k / 2} y} d h_{2; r} (y)| \leq \{\begin{matrix} c_{5} (r) g_{n}^{- r}, & r > 1, r \neq 1 + k / 2, \\ c_{6} (r) g_{n}^{- 1 - k / 2} ln n, & r = 1 + k / 2 . \end{matrix}

(A8)

(iib) Now, consider the mixing distribution

H (y) = W_{s} (y) = e^{- s / y}

with

g_{n} = n

,

b = 2

and

h_{2} (y) = h_{2; s} (y) = s e^{- s / y} (s - 1 + 2 Q_{1} (n y)) / (2 y^{2}), y > 0

. Then, for

I_{4} (x, n)

in (11), we obtain

{sup}_{x} | I_{4} (x, n) | = {sup}_{x} |\int_{1 / n}^{\infty} \frac{f_{2} (x y^{γ})}{n^{2} y} d h_{2; s} (y)| \leq c_{6} (s) n^{- 2}, for s > 0, γ \in {0, \pm 1 / 2} .

(A9)

(iii) Put

γ \in {0, \pm 1}

. Consider

f_{2} (x) = (A x + B x^{2}) g_{d} (x)

for statistics occurring in (26) and (27) satisfying Assumption 1 with the chi-square distribution

G_{d} (x)

.

(iiia) The mixing distribution is

H (y) = G_{r, r} (y)

as in Case (iia) above. Then, for

γ \in {0, \pm 1}

,

{sup}_{x} | I_{2} (x, n) | \leq {sup}_{x} \int_{1 / g_{n}}^{\infty} |\frac{f_{2} (x y^{γ})}{g_{n} y}| d G_{r, r} (y) \leq \frac{c_{2}^{*} r^{r}}{(1 - r) Γ (r)} g_{n}^{- r} for r < 1,

(A10)

\begin{matrix} {sup}_{x} | I_{2} (x, n) | \leq c_{2}^{*} n^{- 1} & with γ = \pm 1, \\ {sup}_{x} | I_{2} (x, n) - n^{- 1} f_{2} (x) ln n | \leq c_{2}^{*} n^{- 1} & with γ = 0, \end{matrix}\} for r = 1,

(A11)

{sup}_{x} | I_{4} (x, n) | = {sup}_{x} |\int_{1 / g_{n}}^{\infty} \frac{f_{2} (x y^{γ})}{n g_{n} y} d h_{2; r} (y)| \leq \{\begin{matrix} c_{3} (r) g_{n}^{- min {r, 2}}, & for r > 1, r \neq 2, \\ c_{4} (r) g_{n}^{- 2} ln n, & for r = 2 . \end{matrix}

(A12)

(iiib) The mixing distribution is

H (y) = W_{s} (y) = e^{- s / y}

with

g_{n} = n

and

b = 2

as in Case (iib). Then,

{sup}_{x} | I_{4} (x, n) | = {sup}_{x} |\int_{1 / g_{n}}^{\infty} \frac{f_{2} (x y^{γ})}{n^{2} y} d h_{2; s} (y)| \leq C_{5} (s) n^{- 2} for s > 0, γ \in {0, \pm 1} .

(A13)

Proof of Lemma A2.

(i) Insertion of

G_{r, r} (y)

with

h_{2, r} (y)

and

W_{s} (y) = e^{- s / y}

with

h_{2, s} (y)

and simple calculation result in the necessary estimates in (6)–(8). In the case of

W_{s} (y) = e^{- s / y}

, all terms even decrease exponentially fast.

(ii) The limit distribution of the considered statistics is standard normal

Φ (x)

.

(iia) Let

H (x) = G_{r, r} (x)

. Using (A2), the estimations (A3) and (A5) for

r < k / 2

, with

k = 1, 2

, are

{sup}_{x} | I_{k} (x, n) | \leq \frac{c_{k}^{*} r^{r}}{g_{n}^{k / 2} Γ (r)} \int_{1 / g_{n}}^{\infty} y^{r - 1 - k / 2} d y \leq \frac{c_{k}^{*} r^{r}}{(k / 2 - r) Γ (r)} g_{n}^{- r} .

Taking into account

0 \leq ln g_{n} - \int_{1 / g_{n}}^{1} \frac{e^{- r y}}{y} d y = \int_{1 / g_{n}}^{1} \frac{1 - e^{- r y}}{y} d y \leq r and \int_{1}^{\infty} \frac{e^{- r y}}{y} d y \leq e^{- r} / r for r > 0,

(A14)

the bound (A4) follows from

| I_{1} (x, n) | \leq \frac{c_{1}^{*}}{{(2 g_{n})}^{1 / 2} Γ (1 / 2)} (\int_{1}^{\infty} \frac{e^{- y / 2}}{y} d y + \int_{1 / g_{n}}^{1} \frac{e^{- y / 2}}{y} d y) \leq \frac{c_{1}^{*} (2 e^{- 1 / 2} + ln g_{n})}{Γ (1 / 2) {(2 g_{n})}^{1 / 2}} .
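The elementary bounds collected in (A14) are easy to confirm numerically. The following sketch is an illustration under assumptions made here (the function name `a14_terms`, the chosen values of r and g_{n}, and the use of SciPy); it uses the fact that \int_{1}^{\infty} e^{- r y} / y d y equals the exponential integral E_{1} (r).

```python
import numpy as np
from scipy import integrate, special


def a14_terms(r, g_n):
    """Return the three quantities compared in (A14).

    diff : ln g_n - int_{1/g_n}^{1} exp(-r y)/y dy
    gap  : int_{1/g_n}^{1} (1 - exp(-r y))/y dy   (should equal diff)
    tail : int_{1}^{inf} exp(-r y)/y dy = E_1(r)
    """
    lower = 1.0 / g_n
    head, _ = integrate.quad(lambda y: np.exp(-r * y) / y, lower, 1.0, limit=200)
    # -expm1(-r y) is a numerically stable way to write 1 - exp(-r y)
    gap, _ = integrate.quad(lambda y: -np.expm1(-r * y) / y, lower, 1.0, limit=200)
    tail = special.exp1(r)
    return np.log(g_n) - head, gap, tail


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.5):
        diff, gap, tail = a14_terms(r, 1000.0)
        print(f"r={r}: diff={diff:.6f} gap={gap:.6f} (bound r={r}), "
              f"tail={tail:.6f} (bound {np.exp(-r) / r:.6f})")
```

The case r = 1 / 2 is the one used in the estimate of I_{1} (x, n) directly above.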

If

r = 1

with

d_{2}^{*} = {sup}_{z} {| z^{- 1} f_{2} (z) | / φ (z / \sqrt{2})}

, we find

| f_{2} (z) | \leq d_{2}^{*} | z | φ (z / \sqrt{2})

and the bound (A6) follows from

| I_{2} (x, n) | \leq \frac{d_{2}^{*} | x |}{\sqrt{2 π} n} \int_{1 / n}^{\infty} y^{γ - 1} e^{- (y + x^{2} y^{2 γ} / 4)} d y with γ = \pm 1 / 2,

where for

γ = 1 / 2

using

| x | {(1 + x^{2} / 4)}^{- 1 / 2} \leq 2

, we obtain

| I_{2} (x, n) | \leq \frac{d_{2}^{*} | x |}{\sqrt{2 π} n} \int_{1 / n}^{\infty} y^{1 / 2 - 1} e^{- (1 + x^{2} / 4) y} d y \leq \frac{d_{2}^{*} | x | Γ (1 / 2)}{\sqrt{2 π} {(1 + x^{2} / 4)}^{1 / 2}} n^{- 1} \leq \frac{\sqrt{2} d_{2}^{*}}{n}

and, in the case of

γ = - 1 / 2

, the substitution

z = x^{2} / (4 y)

for

x \neq 0

leads to

| I_{2} (x, n) | \leq \frac{d_{2}^{*} | x |}{\sqrt{2 π} n} \int_{1 / n}^{\infty} y^{- 1 - 1 / 2} e^{- (y + x^{2} / (4 y))} d y \leq \frac{2 d_{2}^{*}}{\sqrt{2 π} n} \int_{0}^{\infty} z^{- 1 / 2} e^{- z} d z \leq \frac{\sqrt{2} d_{2}^{*}}{n} .

Finally, if

γ = 0

, then

f_{k} (x y^{γ}) = f_{k} (x)

does not depend on y. Then, (A7) follows from (A1), (A2), and (A14) for

r = k / 2

.

Let

r > 1

. Integration by parts for Lebesgue–Stieltjes integrals

I_{k + 2} (x, n)

,

k = 1, 2

, in (11) leads to

{sup}_{x} I_{k + 2} (x, n) \leq \frac{1}{n g_{n}^{k / 2}} (c_{k}^{*} | h_{2; r} (1 / g_{n}) | + {sup}_{x} \int_{1 / g_{n}}^{\infty} \frac{| Q_{k} (x y^{γ}) |}{y^{1 + k / 2}} | h_{2; r} (y) | d y)

(A15)

with bound

c_{k}^{*}

given in (A2). Defining

C_{r}^{*} = \frac{r^{r}}{2 r Γ (r)} {sup}_{y} {e^{- r y} (| y - 1 | | 2 - r | + 1)} < \infty

, we find

\int_{1 / g_{n}}^{\infty} \frac{| h_{2; r} (y) |}{y^{1 + k / 2}} d y \leq C_{r}^{*} \int_{1 / g_{n}}^{\infty} y^{r - 2 - k / 2} d y = \frac{C_{r}^{*}}{(1 + k / 2 - r)} g_{n}^{- r + 1 + k / 2} for 1 < r < 1 + k / 2

and, with

C_{r}^{* *} = \frac{r^{r - 1}}{2 Γ (r)} {sup}_{y} {e^{- r y / 2} (| y - 1 | | 2 - r | + 1)} < \infty

, we obtain

\int_{1 / g_{n}}^{\infty} \frac{| h_{2; r} (y) |}{y^{1 + k / 2}} d y \leq C_{r}^{* *} \int_{1 / g_{n}}^{\infty} y^{r - 2 - k / 2} e^{- r y / 2} d y \leq \frac{C_{r}^{* *} Γ (r - 1 - k / 2)}{{(r / 2)}^{r - 1 - k / 2}} for r > 1 + k / 2 .

Hence, using

g_{n} \leq r n

for

r > 1

, we obtain (A8) for

r > 1

,

r \neq 1 + k / 2

.

For

r = 1 + k / 2

, the second integral in the line above is an exponential integral. Therefore, with (A14), we find (A8) for

r = 1 + k / 2

, too.

(iib) The mixing distribution is

H (y) = W_{s} (y) = e^{- s / y}

with

g_{n} = n

. Since

b = 2

, only

I_{4} (x, n)

has to be estimated. Integration by parts for

I_{4} (x, n)

in (11) leads to (A15) with

k = 2

,

g_{n} = n

and

h_{2, s} (y)

instead of

h_{2, r} (y)

. Hence, (A9) follows from

\int_{1 / n}^{\infty} \frac{| h_{2; s} (y) |}{y^{2}} d y \leq s (s + 2) \int_{1 / n}^{\infty} y^{- 4} e^{- s / y} d y \leq \frac{s + 2}{s^{2}} \int_{0}^{s n} z^{2} e^{- z} d z \leq \frac{(s + 2) Γ (3)}{s^{2}} .

(A16)
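The chain of inequalities in (A16) can be illustrated numerically. The sketch below is a hedged check, with function names, grids and parameter values chosen here for illustration: it verifies the pointwise bound on | h_{2; s} (y) | / y^{2}, the substitution z = s / y (which turns the middle integral into an incomplete gamma integral), and the final bound by Γ (3).

```python
import numpy as np
from scipy import integrate, special


def Q1(y):
    """Jump-correcting function Q_1(y) = 1/2 - (y - [y])."""
    return 0.5 - (y - np.floor(y))


def h2s(y, s, n):
    """h_{2;s}(y) = s exp(-s/y) (s - 1 + 2 Q_1(n y)) / (2 y^2), as in Lemma A2 (iib)."""
    return s * np.exp(-s / y) * (s - 1.0 + 2.0 * Q1(n * y)) / (2.0 * y**2)


def a16_middle(s, n):
    """s (s + 2) * int_{1/n}^{inf} y^-4 exp(-s/y) dy, by quadrature (assumes n >= 2)."""
    f = lambda y: np.exp(-s / y - 4.0 * np.log(y))
    near, _ = integrate.quad(f, 1.0 / n, 1.0, limit=200)
    far, _ = integrate.quad(f, 1.0, np.inf, limit=200)
    return s * (s + 2.0) * (near + far)


def a16_substituted(s, n):
    """((s + 2)/s^2) * int_0^{s n} z^2 exp(-z) dz via the regularized incomplete gamma."""
    return (s + 2.0) / s**2 * special.gammainc(3.0, s * n) * special.gamma(3.0)


if __name__ == "__main__":
    for s in (0.5, 1.5, 4.0):
        for n in (2, 50):
            print(s, n, a16_middle(s, n), a16_substituted(s, n),
                  (s + 2.0) * special.gamma(3.0) / s**2)
```

The first two printed columns should coincide, and both stay below the third, which is the n-free bound used to obtain (A9).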

(iii) The limit distribution of statistics in (26) and (27) is chi-square distribution

G_{d} (x)

defined in (17). In the considered cases

f_{1} (x) = 0

. Let

γ \in {0, \pm 1}

. Consider

f_{2} (x) = (A x + B x^{2}) g_{d} (x)

with chi-square density

g_{d} (x)

.

(iiia) Let

H (y) = G_{r, r} (y)

. We have to estimate

I_{2} (x, n)

for

r \leq 1

and

I_{4} (x, n)

for

r > 1

. The bound (A10) for

0 < r < 1

follows from (A2) and

{sup}_{x} | I_{2} (x, n) | \leq \frac{c_{2}^{*} r^{r}}{g_{n} Γ (r)} \int_{1 / g_{n}}^{\infty} y^{r - 2} e^{- r y} d y \leq \frac{c_{2}^{*} r^{r}}{(1 - r) Γ (r)} g_{n}^{- r} for γ \in {0, \pm 1} .

If

r = 1

with

C_{2}^{*} = {sup}_{z} \{| A + B z | \frac{1}{2^{d / 2} Γ (d / 2)} e^{- z / 4}\} < \infty

, we find

| f_{2} (z) | \leq C_{2}^{*} z^{d / 2} e^{- z / 4}

and

| I_{2} (x, n) | \leq \frac{C_{2}^{*} x^{d / 2}}{g_{n}} \int_{1 / g_{n}}^{\infty} y^{- 1 + d / 2} e^{- (1 + x / 4) y} d y \leq \frac{C_{2}^{*} x^{d / 2} Γ (d / 2)}{g_{n} {(1 + x / 4)}^{d / 2}} \leq \frac{4^{d / 2} C_{2}^{*} Γ (d / 2)}{g_{n}} for γ = 1 .

In the case of

γ = - 1

using variable transformation

z = x / (4 y)

for

x > 0

one has

| I_{2} (x, n) | \leq \frac{C_{2}^{*} x^{d / 2}}{g_{n}} \int_{1 / g_{n}}^{\infty} \frac{e^{- x / (4 y)}}{y^{1 + d / 2}} d y \leq \frac{C_{2}^{*} x^{d / 2}}{g_{n} {(x / 4)}^{d / 2}} \int_{0}^{\infty} z^{- 1 + d / 2} e^{- z} d z \leq \frac{C_{2}^{*} 4^{d / 2} Γ (d / 2)}{g_{n}} .

If

γ = 0

then

f_{2} (x y^{γ}) = f_{2} (x)

, noting (A14) and

g_{n} = n

for

r = 1

, we prove (A11).

Let now

r > 1

. It remains to estimate

I_{4} (x, n)

. Using (A15) with

k = 2

, remembering (A2), we obtain (A12) in the same way as for

r > 1

with

k = 2

in case (iia) above.

(iiib) The mixing distribution is

H (y) = W_{s} (y) = e^{- s / y}

with

g_{n} = n

and

b = 2

. As in Case (iib), taking into consideration (A16), we obtain (A13). □

Appendix A.2. Lemmas A3 and A4

We show that the integrals

J_{4}^{*} (x; γ)

and

J_{4}^{*} (x; γ, d)

in the proofs of Theorems 1 and 2 have the order of the remaining terms. Therefore, the involved jump correcting function

Q_{1} (y) = 1 / 2 - (y - [y])

occurring in (32) and (38) has no effect on the second approximation. The function

Q_{1} (y)

is periodic with period 1. The Fourier series expansion of

Q_{1} (y)

at all non-integer points y is

Q_{1} (y) = 1 / 2 - (y - [y]) = \sum_{k = 1}^{\infty} \frac{sin (2 π k y)}{k π}, y \neq [y]

(A17)

(see formula 5.4.2.9 in [40] with

a = 0

).
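The Fourier representation (A17) can be illustrated by comparing partial sums with the sawtooth function itself. The sketch below is only an illustration: the function names, the number of terms and the evaluation points are choices made here. Because the series converges slowly (the coefficients decay like 1 / k), the points are kept away from the integers, where Q_{1} jumps.

```python
import numpy as np


def Q1(y):
    """Q_1(y) = 1/2 - (y - [y]), periodic with period 1."""
    y = np.asarray(y, dtype=float)
    return 0.5 - (y - np.floor(y))


def Q1_fourier(y, terms=20000):
    """Partial sum sum_{k=1}^{terms} sin(2 pi k y) / (k pi) of the series in (A17)."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    k = np.arange(1, terms + 1)
    return np.sum(np.sin(2.0 * np.pi * np.outer(y, k)) / (k * np.pi), axis=1)


if __name__ == "__main__":
    ys = np.array([0.1, 0.25, 0.5, 1.3, 2.75, 3.9])  # non-integer points
    for y, exact, approx in zip(ys, Q1(ys), Q1_fourier(ys)):
        print(f"y={y:5.2f}  Q1={exact:+.5f}  partial sum={approx:+.5f}")
```

At integer y every term of the series vanishes, whereas Q_{1} ([y]) = 1 / 2, which is why (A17) is stated for non-integer points only.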

Lemma A3.

Let

J_{4}^{*} (x; \pm 1 / 2)

be defined by (76) and (78), respectively. Then,

n^{- 1} J_{4}^{*} (x; 1 / 2) \leq C n^{- r}

for

r > 1

and

n^{- 1} J_{4}^{*} (x; - 1 / 2) \leq C n^{- 2}

.

Proof of Lemma A3.

We begin by considering

Q_{1} (y)

in

J_{4}^{*} (x; 1 / 2)

defined in (A17) following the estimate of

J_{4}^{*} (x)

in the proof of Theorem 2 in [11]. Inserting Fourier series expansion of

Q_{1} (y)

into the integral

J_{4}^{*} (x; 1 / 2)

, interchanging the integral and sum and applying formula (2.5.31.4) in [40] with

α = r - 1 / 2, p = (r + x^{2} / 2)

and

b = 2 π k g_{n}

, then

\begin{matrix} J_{4}^{*} (x; 1 / 2) = \frac{x r^{r - 1}}{{(2 π)}^{3 / 2} Γ (r)} \sum_{k = 1}^{\infty} \frac{1}{k} \int_{0}^{\infty} y^{r - 3 / 2} e^{- (r + x^{2} / 2) y} sin (2 π k g_{n} y) d y \\ = \frac{x r^{r - 1} Γ (r - 1 / 2)}{{(2 π)}^{3 / 2} Γ (r)} \sum_{k = 1}^{\infty} \frac{sin ((r - 1 / 2) arctan (4 π k g_{n} / (x^{2} + 2 r)))}{k {({(2 π k g_{n})}^{2} + {(r + x^{2} / 2)}^{2})}^{(r - 1 / 2) / 2}} = \frac{r^{r - 1} Γ (r - 1 / 2)}{2 π \sqrt{2 π} Γ (r)} \sum_{k = 1}^{\infty} \frac{a_{k} (x; n)}{k} . \end{matrix}

Now, we split the exponent

(r - 1 / 2) / 2 = (r - 1) / 2 + 1 / 4

and obtain

| a_{k} (x; n) | \leq \frac{| x |}{{({(2 π k g_{n})}^{2} + {(r + x^{2} / 2)}^{2})}^{(r - 1) / 2 + 1 / 4}} \leq \frac{| x |}{{(2 π k g_{n})}^{r - 1} {(r + x^{2} / 2)}^{1 / 2}} \leq \frac{\sqrt{2}}{{(2 π k (n - 1))}^{r - 1}} .

Since

r > 1

and

n \geq 2

, the first statement in Lemma A3 follows:

{sup}_{x} n^{- 1} | J_{4}^{*} (x; 1 / 2) | \leq c (r) n^{- r} \sum_{k = 1}^{\infty} k^{- r} = c_{1} (r) n^{- r} .

To prove the second statement about

J_{4}^{*} (x; - 1 / 2)

, we insert again the Fourier series expansion of

Q_{1} (y)

given in (A17) into

J_{4}^{*} (x; - 1 / 2)

and interchange the integral and sum

J_{4}^{*} (x; - 1 / 2) = \frac{x}{2 \sqrt{2 π}} \sum_{k = 1}^{\infty} \frac{1}{π k} \int_{0}^{\infty} y^{- 1 / 2} e^{- (2 y + x^{2} / (2 y))} sin (2 π k g_{n} y) d y .

Further, we use formula 2.5.37.4 in [40]

\int_{0}^{\infty} y^{- 1 / 2} e^{- p y - q / y} sin (b y) d y = \frac{\sqrt{π}}{\sqrt{p^{2} + b^{2}}} e^{- 2 \sqrt{q} z_{+}} (z_{+} sin (2 \sqrt{q} z_{-}) + z_{-} cos (2 \sqrt{q} z_{-}))

with

2 z_{\pm}^{2} = \sqrt{p^{2} + b^{2}} \pm p

,

p = 2 > 0

,

q = x^{2} / 2 > 0

and

b = 2 π g_{n} k > 0

. Use of the estimates

0 < z_{-} \leq z_{+}, | x | z_{+} e^{- | x | z_{+}} \leq e^{- 1}, \sqrt{p^{2} + b^{2}} \geq b = 2 π g_{n} k and \sum_{k = 1}^{\infty} k^{- 2} = π^{2} / 6

leads to the inequalities

{sup}_{x} \frac{1}{n} | J_{4}^{*} (x; - 1 / 2) | \leq {sup}_{x} \frac{1}{2 \sqrt{2 π}} \sum_{k = 1}^{\infty} \frac{\sqrt{π} 2 | x | z_{+}}{2 π^{2} g_{n} n k^{2}} e^{- \sqrt{2} | x | z_{+}} \leq \frac{1}{e \sqrt{2} 12 g_{n} n} \leq C n^{- 2}

(A18)

and Lemma A3 is proven. □
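The integral formula quoted from [40] in the proof above, \int_{0}^{\infty} y^{- 1 / 2} e^{- p y - q / y} sin (b y) d y with 2 z_{\pm}^{2} = \sqrt{p^{2} + b^{2}} \pm p, can be cross-checked by quadrature. The sketch below is a numerical illustration under assumptions made here: the function names, the truncation of the integration range at 40 / p (beyond which e^{- p y} is negligible) and the sample values of x, g_{n} and k are not taken from the paper; only p = 2, q = x^{2} / 2 and b = 2 π g_{n} k follow the text.

```python
import numpy as np
from scipy import integrate


def z_plus_minus(p, b):
    """z_+ and z_- defined through 2 z_{+-}^2 = sqrt(p^2 + b^2) +- p."""
    root = np.hypot(p, b)
    return np.sqrt((root + p) / 2.0), np.sqrt((root - p) / 2.0)


def lhs_quad(p, q, b):
    """Quadrature value of int_0^inf y^(-1/2) exp(-p y - q/y) sin(b y) dy."""
    def integrand(y):
        return np.exp(-p * y - q / y - 0.5 * np.log(y)) * np.sin(b * y)

    upper = 40.0 / p  # exp(-p y) is below 1e-17 beyond this point
    near, _ = integrate.quad(integrand, 0.0, 1.0, limit=500)
    far, _ = integrate.quad(integrand, 1.0, upper, limit=500)
    return near + far


def rhs_closed(p, q, b):
    """Closed form quoted in the text for the same integral."""
    zp, zm = z_plus_minus(p, b)
    phase = 2.0 * np.sqrt(q) * zm
    return (np.sqrt(np.pi) / np.hypot(p, b) * np.exp(-2.0 * np.sqrt(q) * zp)
            * (zp * np.sin(phase) + zm * np.cos(phase)))


if __name__ == "__main__":
    x = 0.5                      # arbitrary evaluation point
    p, q = 2.0, x**2 / 2.0       # values used in the proof of Lemma A3
    for g_n, k in [(1, 1), (3, 1)]:
        b = 2.0 * np.pi * g_n * k
        print(g_n, k, lhs_quad(p, q, b), rhs_closed(p, q, b))
```

The agreement of both columns supports the closed form; the factor 1 / \sqrt{p^{2} + b^{2}} \leq 1 / (2 π g_{n} k) is what produces the extra power of g_{n} in (A18).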

Lemma A4.

Let

J_{4}^{*} (x; γ, d)

be defined by (79), then

n^{- 1} J_{4}^{*} (x; γ, d) \leq C n^{- 2}

for

γ = \pm 1

.

Proof of Lemma A4.

Using the Fourier series expansion (A17) of the periodic function

Q_{1} (y)

, given in (33), and interchanging the integral and the sum, we find

J_{4}^{*} (x; γ, d) = - \frac{s γ x^{d / 2}}{2^{d / 2} Γ (d / 2)} \sum_{k = 1}^{\infty} \frac{1}{k π} \int_{0}^{\infty} y^{(γ d - 6) / 2} e^{- x y^{γ} / 2 - s / y} sin (2 π k n y) d y .

(A19)

We begin by estimating

J_{4}^{*} (x; 1, 3)

, i.e., the exponent of y in (A19) is

- 3 / 2

. Thus, we can use formula 2.5.37.3 in [40]

\int_{0}^{\infty} y^{- 3 / 2} e^{- p y - s / y} sin (b y) d y = \frac{\sqrt{π}}{\sqrt{s}} e^{- 2 \sqrt{s} z_{+}} sin (2 \sqrt{s} z_{-})

(A20)

where

p = x / 2 > 0

,

s > 0

,

b = 2 π k n > 0

and

2 z_{\pm}^{2} = \sqrt{p^{2} + b^{2}} \pm p = \sqrt{x^{2} / 4 + {(2 π k n)}^{2}} \pm x / 2

. Since

z_{+} = \frac{1}{\sqrt{2}} {({(x^{2} / 4 + {(2 π k n)}^{2})}^{1 / 2} + x / 2)}^{1 / 2} \geq \frac{1}{4} x^{1 / 2} + \frac{1}{8} {(2 π)}^{1 / 4} (k^{1 / 4} + n^{1 / 4})

(A21)

it results in

\frac{1}{n} | J_{4}^{*} (x; 1, 3) | \leq \frac{{(s x)}^{3 / 2}}{n s 2^{1 / 2} π} \sum_{k = 1}^{\infty} exp \{- {(s x)}^{1 / 2} / 2 - {(2 π)}^{1 / 4} (k^{1 / 4} + n^{1 / 4}) / 4\} \leq \frac{C (s)}{n^{2}} .

Let now

γ = 1

and

d = 1

. The main difference compared with the previous estimate of

J_{4}^{*} (x; 1, 3)

is that we are facing more technical trouble in order to estimate

J_{4}^{*} (x; 1, 1)

. The exponent of y in (A19) is

- 5 / 2

and we cannot find a closed formula similar to (A20) for this case. To estimate

J_{4}^{*}

in the proof of Theorem 5 in [11], we show that differentiation with respect to s under the integral sign in (A20) is allowed. Hence,

\int_{0}^{\infty} y^{- 5 / 2} e^{- p y - s / y} sin (b y) d y = \frac{\sqrt{π}}{2} e^{- 2 \sqrt{s} z_{+}} (s^{- 3 / 2} sin (2 \sqrt{s} z_{-}) + 2 s^{- 1} z_{+} sin (2 \sqrt{s} z_{-}) - 2 s^{- 1} z_{-} cos (2 \sqrt{s} z_{-}))

with the same coefficients p, s, b and

z_{\pm}

as in (A20). The use of (A21) and the obvious inequalities

0 < z_{-} \leq z_{+}

and

z_{+} \geq \frac{1}{2} z_{+} + \frac{1}{8} x^{1 / 2} + \frac{1}{16} {(2 π)}^{1 / 4} (k^{1 / 4} + n^{1 / 4})

leads to

\frac{1}{n} |J_{4}^{*} (x; 1, 1)| \leq \frac{{(s x)}^{1 / 2}}{n 2^{1 / 2} π} e^{- {(s x)}^{1 / 2} / 4 - {(2 π)}^{1 / 4} n^{1 / 4} / 8} \sum_{k = 1}^{\infty} \frac{1}{k} e^{- {(2 π)}^{1 / 4} k^{1 / 4} / 8} \leq \frac{C (s)}{n^{2}} .

Finally, let now

γ = - 1

.

J_{4}^{*} (x; - 1, d) = \frac{s x^{d / 2}}{2^{d / 2} Γ (d / 2)} \sum_{k = 1}^{\infty} \frac{1}{k π} \int_{0}^{\infty} y^{- (d / 2 + 3)} e^{- (x / 2 + s) / y} sin (2 π k n y) d y .

Partial integration in the integral with

A = d / 2 + 3

,

B = x / 2 + s

and

C = 2 π k n

leads to

\int_{0}^{\infty} y^{- A} e^{- B / y} sin (C y) d y = \frac{1}{C} \int_{0}^{\infty} (B y^{- (A + 2)} - A y^{- (A + 1)}) e^{- B / y} cos (C y) d y

and, bounding the cosine by one and using (62), to

\int_{0}^{\infty} \frac{1}{C} (A y^{- (A + 1)} + B y^{- (A + 2)}) e^{- B / y} d y = \frac{1}{C} (\frac{A Γ (A)}{B^{A}} + \frac{B Γ (A + 1)}{B^{A + 1}}) = \frac{2 Γ (A + 1)}{C B^{A}} .

Therefore,

\frac{1}{n} | J_{4}^{*} (x; - 1, d) | \leq \frac{s x^{d / 2}}{n 2^{d / 2} Γ (d / 2)} \sum_{k = 1}^{\infty} \frac{Γ (d / 2 + 4)}{k^{2} π^{2} n {(x / 2 + s)}^{d / 2 + 3}} \leq \frac{C (s)}{n^{2}},

and Lemma A4 is proven. □
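The two closed-form integrals used for γ = 1 in the proof above, namely (A20) and the formula for the exponent - 5 / 2 obtained from it by differentiating with respect to s, can be checked numerically, together with the lower bound for z_{+} stated in (A21). The sketch below is an illustration under assumptions made here: function names, the truncation of the integration range at 40 / p, and the test values of x, s, k and n are not from the paper, and z_{+}, z_{-} are computed from the defining relation 2 z_{\pm}^{2} = \sqrt{p^{2} + b^{2}} \pm p.

```python
import numpy as np
from scipy import integrate


def z_plus_minus(p, b):
    """z_+ and z_- from 2 z_{+-}^2 = sqrt(p^2 + b^2) +- p."""
    root = np.hypot(p, b)
    return np.sqrt((root + p) / 2.0), np.sqrt((root - p) / 2.0)


def lhs_quad(alpha, p, s, b):
    """Quadrature value of int_0^inf y^(-alpha) exp(-p y - s/y) sin(b y) dy."""
    def integrand(y):
        return np.exp(-p * y - s / y - alpha * np.log(y)) * np.sin(b * y)

    upper = 40.0 / p  # exp(-p y) is negligible beyond this point
    near, _ = integrate.quad(integrand, 0.0, 1.0, limit=500)
    far, _ = integrate.quad(integrand, 1.0, upper, limit=500)
    return near + far


def rhs_a20(p, s, b):
    """Right-hand side of (A20), i.e. the case alpha = 3/2."""
    zp, zm = z_plus_minus(p, b)
    return np.sqrt(np.pi / s) * np.exp(-2.0 * np.sqrt(s) * zp) * np.sin(2.0 * np.sqrt(s) * zm)


def rhs_five_halves(p, s, b):
    """Closed form for alpha = 5/2 (minus the s-derivative of (A20))."""
    zp, zm = z_plus_minus(p, b)
    phase = 2.0 * np.sqrt(s) * zm
    return (np.sqrt(np.pi) / 2.0) * np.exp(-2.0 * np.sqrt(s) * zp) * (
        s**-1.5 * np.sin(phase) + 2.0 / s * zp * np.sin(phase) - 2.0 / s * zm * np.cos(phase))


def a21_lower_bound(x, k, n):
    """Right-hand side of (A21)."""
    return np.sqrt(x) / 4.0 + (2.0 * np.pi) ** 0.25 * (k**0.25 + n**0.25) / 8.0


if __name__ == "__main__":
    x, s, k, n = 4.0, 0.3, 1, 2          # arbitrary test values
    p, b = x / 2.0, 2.0 * np.pi * k * n   # as in the proof of Lemma A4
    print("alpha=3/2:", lhs_quad(1.5, p, s, b), rhs_a20(p, s, b))
    print("alpha=5/2:", lhs_quad(2.5, p, s, b), rhs_five_halves(p, s, b))
    print("z_+ and bound:", z_plus_minus(p, b)[0], a21_lower_bound(x, k, n))
```

The exponential factor e^{- 2 \sqrt{s} z_{+}} together with the bound (A21) is what makes the sums over k in the proof converge and decay in n.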

References

Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Exact critical values for one-way fixed effects models with random sample sizes. J. Comput. Appl. Math. 2019, 354, 112–122.
Nunes, C.; Capistrano, G.; Ferreira, D.; Ferreira, S.S.; Mexia, J.T. Random sample sizes in orthogonal mixed models with stability. Comp. Math. Methods 2019, 1, e1050.
Nunes, C.; Mário, A.; Ferreira, D.; Moreira, E.M.; Ferreira, S.S.; Mexia, J.T. An algorithm for simulation in mixed models with crossed factors considering the sample sizes as random. J. Comput. Appl. Math. 2021.
Esquível, M.L.; Mota, P.P.; Mexia, J.T. On some statistical models with a random number of observations. J. Stat. Theory Pract. 2016, 10, 805–823.
Al-Mutairi, J.S.; Raqab, M.Z. Confidence intervals for quantiles based on samples of random sizes. Statist. Pap. 2020, 61, 261–277.
Barakat, H.M.; Nigm, E.M.; El-Adll, M.E.; Yusuf, M. Prediction of future generalized order statistics based on exponential distribution with random sample size. Statist. Pap. 2018, 59, 605–631.
Gnedenko, B.V. Estimating the unknown parameters of a distribution with a random number of independent observations. (Probability theory and mathematical statistics (Russian)). Trudy Tbiliss. Mat. Inst. Razmadze Akad. Nauk Gruzin. SSR 1989, 92, 146–150.
Gnedenko, B.V.; Korolev, V.Y. Random Summation. Limit Theorems and Applications; CRC Press: Boca Raton, FL, USA, 1996.
Bening, V.E.; Galieva, N.K.; Korolev, V.Y. On rate of convergence in distribution of asymptotically normal statistics based on samples of random size. Ann. Math. Inform. 2012, 39, 17–28. (In Russian)
Bening, V.E.; Galieva, N.K.; Korolev, V.Y. Asymptotic expansions for the distribution functions of statistics constructed from samples with random sizes. Inform. Appl. 2013, 7, 75–83. (In Russian)
Christoph, G.; Monakhov, M.M.; Ulyanov, V.V. Second-order Chebyshev-Edgeworth and Cornish-Fisher expansions for distributions of statistics constructed with respect to samples of random size. J. Math. Sci. 2020, 244, 811–839.
Christoph, G.; Ulyanov, V.V.; Bening, V.E. Second Order Expansions for Sample Median with Random Sample Size. arXiv 2020, arXiv:1905.07765v2.
Christoph, G.; Ulyanov, V.V. Second order expansions for high-dimension low-sample-size data statistics in random setting. Mathematics 2020, 8, 1151.
Christoph, G.; Ulyanov, V.V. Short Expansions for High-Dimension Low-Sample-Size Data Statistics in a Random Setting. Recent Developments in Stochastic Methods and Applications. In Proceedings in Mathematics & Statistics; Shiryaev, A.N., Samouylov, K.E., Kozyrev, D.V., Eds.; Springer: Cham, Switzerland, 2021; (to appear).
Hall, P. The Bootstrap and Edgeworth Expansion; Springer Series in Statistics; Springer: New York, NY, USA, 1992.
Bickel, P.J. Edgeworth expansions in nonparametric statistics. Ann. Statist. 1974, 2, 1–20.
Kolassa, J.E. Series Approximation Methods in Statistics, 3rd ed.; Lecture Notes in Statistics 88; Springer: New York, NY, USA, 2006.
Petrov, V.V. Limit Theorems of Probability Theory, Sequences of Independent Random Variables; Clarendon Press: Oxford, UK, 1995.
Fujikoshi, Y.; Ulyanov, V.V. Non-Asymptotic Analysis of Approximations for Multivariate Statistics; Springer: Singapore, 2020.
Hall, P. Edgeworth Expansion for Student’s t-statistic under minimal moment conditions. Ann. Probab. 1987, 15, 920–931.
Bentkus, V.; Götze, F.; van Zwet, W.R. An Edgeworth expansion for symmetric statistics. Ann. Statist. 1997, 25, 851–896.
Anderson, T.W. An Introduction to Multivariate Statistical Analysis, 3rd ed.; Wiley Series in Probability and Mathematical Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2003.
Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. Multivariate Statistics. High-Dimensional and Large-Sample Approximations; Wiley Series in Probability and Statistics; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2010.
Fujikoshi, Y.; Ulyanov, V.V.; Shimizu, R. L₁-norm error bounds for asymptotic expansions of multivariate scale mixtures and their applications to Hotelling’s generalized T_{0}^{2}. J. Multivar. Anal. 2005, 96, 1–19.
Ulyanov, V.V.; Fujikoshi, Y. On accuracy of improved χ²-approximation. Georgian Math. J. 2001, 8, 401–414.
Schluter, C.; Trede, M. Weak convergence to the Student and Laplace distributions. J. Appl. Probab. 2016, 53, 121–129.
Döbler, C. New Berry-Esseen and Wasserstein bounds in the CLT for non-randomly centered random sums by probabilistic methods. ALEA Lat. Am. J. Probab. Math. Stat. 2015, 12, 863–902.
Robbins, H. The asymptotic distribution of the sum of a random number of random variables. Bull. Am. Math. Soc. 1948, 54, 1151–1161.
Kolassa, J.E.; McCullagh, P. Edgeworth Series for Lattice Distributions. Ann. Statist. 1990, 18, 981–985.
Bening, V.E.; Korolev, V.Y. On the use of Student’s distribution in problems of probability theory and mathematical statistics. Theory Probab. Appl. 2005, 49, 377–391.
Gavrilenko, S.V.; Zubov, V.N.; Korolev, V.Y. The rate of convergence of the distributions of regular statistics constructed from samples with negatively binomially distributed random sizes to the Student distribution. J. Math. Sci. 2017, 220, 701–713.
Buddana, A.; Kozubowski, T.J. Discrete Pareto distributions. Econ. Qual. Control 2014, 29, 143–156.
Lyamin, O.O. On the rate of convergence of the distributions of certain statistics to the Laplace distribution. Mosc. Univ. Comput. Math. Cybern. 2010, 34, 126–134.
Choy, T.B.; Chan, J.E. Scale mixtures distributions in statistical modellings. Aust. N. Z. J. Stat. 2008, 50, 135–146.
Oldham, K.B.; Myland, J.C.; Spanier, J. An Atlas of Functions, 2nd ed.; Springer Science+Business Media: New York, NY, USA, 2009.
Kotz, S.; Kozubowski, T.J.; Podgórski, K. The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance; Birkhäuser: Boston, MA, USA, 2001.
Korolev, V.Y.; Zeifman, A.I. Generalized negative binomial distributions as mixed geometric laws and related limit theorems. Lith. Math. J. 2019, 59, 366–388.
Korolev, V.Y.; Gorshenin, A. Probability models and statistical tests for extreme precipitation based on generalized negative binomial distributions. Mathematics 2020, 8, 604.
Safari, M.A.M.; Masseran, N.; Ibrahim, K.; Hussain, S.I. A robust and efficient estimator for the tail index of inverse Pareto distribution. Phys. A Stat. Mech. Its Appl. 2019, 517, 431–439.
Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series, Vol. 1: Elementary Functions, 3rd ed.; Gordon & Breach Science Publishers: New York, NY, USA, 1992.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

