Article

Goodness-of-Fit Test for the Bivariate Negative Binomial Distribution

by Francisco Novoa-Muñoz 1,* and Juan Pablo Aguirre-González 2
1 Departamento de Enfermería, Facultad de Ciencias de la Salud y de los Alimentos, Universidad del Bío-Bío, Chillán 3800708, Chile
2 Departamento de Estadística, Universidad del Bío-Bío, Concepción 4051381, Chile
* Author to whom correspondence should be addressed.
Axioms 2025, 14(1), 54; https://doi.org/10.3390/axioms14010054
Submission received: 24 November 2024 / Revised: 27 December 2024 / Accepted: 9 January 2025 / Published: 12 January 2025
(This article belongs to the Special Issue Advances in Statistical Simulation and Computing)

Abstract
When modeling real-world data, we face the challenge of determining which probability distribution best represents the data. To address this problem, we rely on goodness-of-fit tests. However, when the data come from a bivariate negative binomial distribution, the literature reveals no existing goodness-of-fit test for this distribution. For this reason, in this article, we propose and study a computationally convenient goodness-of-fit test for the bivariate negative binomial distribution, based on a bootstrap approximation and a parallelization strategy. To this end, we use a reparameterization technique based on the probability generating function and a Cramér–von Mises-type statistic. From the simulation studies, we conclude that the results converge to the established nominal levels as the sample size increases and that, in all cases considered, the parametric bootstrap method provides an accurate approximation of the null distribution of the proposed statistic. Additionally, we assess the power of the proposed test and illustrate its application to five real datasets. To accelerate the massive computational work, we employ the parallelization strategy that, according to Novoa-Muñoz (2024), was the most efficient among the techniques analyzed.

1. Introduction

In today’s world, where large volumes of data proliferate, it is common to encounter count data, with the Univariate Negative Binomial Distribution (UNBD) being a useful discrete distribution for representing such data. This distribution has been used to model insurance policy allocation in actuarial practice (see Shi and Valdez [1]), football betting (see Van Gemert and Van Ophem [2]), and, interestingly, the population dynamics of invasive species (see Krkošek et al. [3]).
Extending this modeling idea, the Bivariate Negative Binomial Distribution (BNBD) was developed; it has been used to describe phenomena involving workplace accidents (see Arbous and Kerrich [4]; Edwards and Gurland [5]), ecological data (see Holgate [6]), migration data related to traffic accidents (see Maher [7]), and bus accident data within London’s vehicle fleet (see Kopocińska [8]).
But in a real-life situation, when bivariate count data are available, how can one determine if the BNBD is appropriate for modeling such data? According to González-Albornoz and Novoa-Muñoz [9], a crucial aspect of data analysis is testing the goodness of fit (gof) of given observations with a probabilistic model. In this regard, and to the best of our knowledge, the only goodness-of-fit tests related to the Negative Binomial distribution that we have found in the literature are those developed by (a) Heller [10], which was applied to a set of microbiological data; (b) Beltrán-Beltrán and O’Reilly [11], who applied it to two datasets (deaths caused by horse kicks and laboratory mice); and (c) Meintanis [12], who applied it to accident data in the Netherlands over the period 1997–2004. However, cases (a) and (b) are tests for univariate data, while case (c) is bivariate but corresponds to a Poisson–Negative Binomial distribution model. More recently, Hudecová et al. [13] and Wang et al. [14] published goodness-of-fit tests for bivariate count time series based on a bivariate Poisson model. Moreover, as far as we know, there is no literature on gof tests for the BNBD. For this reason, the objective of this article is to propose and study a gof test for the BNBD that is consistent against any fixed alternative.
The test we propose employs the probability-generating function (pgf) of the BNBD and its counterpart, the empirical probability-generating function (epgf). The foundation of this proposal lies in the fact that the pgf characterizes the distribution of a random vector and can be consistently estimated by the epgf (see for example Novoa-Muñoz [15]; Novoa-Muñoz and Jiménez-Gamero [16]). This statistical test compares the epgf of the data with an estimator of the pgf of the BNBD.
To decide when to reject the null hypothesis, we need to know the null distribution of the statistic, or at least an approximation of it. Since the proposed statistic is of the Cramér–von Mises type, obtaining its null distribution for finite sample sizes is not feasible. The alternative we used was to approximate the null distribution with a parametric bootstrap estimator, whose properties can be found in Novoa-Muñoz and Jiménez-Gamero [17].
Since infinite sample sizes do not exist in the real world, we conducted a simulation study to evaluate the performance of the proposed test for finite sample sizes. Due to the computational intensity of the parametric bootstrap method, we used the parallel algorithm parRapply to reduce the simulation runtime. This algorithm proved to be the most efficient among the parallel versions analyzed in Novoa-Muñoz [18].
The present study is organized as follows: In Section 2, we present the pgf of the BNBD and a reparameterization that makes it computationally more efficient. In Section 3, we develop the proposed test statistic; since its finite-sample null distribution is not known, we show that it can be approximated by its asymptotic null distribution. However, as this approximation still depends on the true parameter, in Section 4, we propose a parametric bootstrap estimator. Section 5 is dedicated to presenting the three techniques we used to estimate the parameters of the BNBD. In Section 6, we present and discuss the results of a simulation study and the application of the proposed test to two real datasets. The simulation study was conducted to evaluate the performance of the proposed test and analyze its power.
The notation we use below is detailed in the abbreviations listed at the end of the manuscript.

2. Reparametrization of the Probability Generating Function of the BNBD

From now on, we will consider that $\gamma = (\gamma_0, \gamma_1, \gamma_2) \in \Theta$, where
$$\Theta = \left\{ (\gamma_0, \gamma_1, \gamma_2) \in \mathbb{R}^3 : \gamma_0 > \gamma_2,\ \gamma_1 > \gamma_2,\ \gamma_2 > 0 \right\}$$
is the respective parameter space.
On the other hand, according to Kocherlakota and Kocherlakota [19], the genesis of the BNBD depends on the underlying random mechanism. One such mechanism, composition, gives rise to the following probability generating function (pgf):
$$g(t_1, t_2) = \frac{(1 - p_1 - p_2 - p_3)^{v}}{(1 - p_1 t_1 - p_2 t_2 - p_3 t_1 t_2)^{v}}, \qquad (1)$$
where $p_1, p_2, p_3 \in [0,1]$ and $v \in \mathbb{N}$ represents the number of trials before the first success or failure.
From (1), Edwards and Gurland [5] and Subrahmaniam [20] derived the probability mass function of the BNBD given below (see Min et al. [21]):
$$P(X_1 = r, X_2 = s) = (1 - p_1 - p_2 - p_3)^{v} \sum_{i=0}^{\min(r,s)} \frac{\Gamma(v+r+s-i)}{\Gamma(v)\, i!\, (r-i)!\, (s-i)!}\, p_1^{\,r-i}\, p_2^{\,s-i}\, p_3^{\,i},$$
where Γ is the Gamma function.
For computational purposes and to facilitate the estimation of the distribution parameters, it is highly convenient to incorporate the following reparameterization:
$$p_1 = \frac{\gamma_0 - \gamma_2}{1 + \gamma_0 + \gamma_1 - \gamma_2}, \qquad p_2 = \frac{\gamma_1 - \gamma_2}{1 + \gamma_0 + \gamma_1 - \gamma_2}, \qquad p_3 = \frac{\gamma_2}{1 + \gamma_0 + \gamma_1 - \gamma_2}.$$
Thus, the random vector $(X_1, X_2)$ follows a Bivariate Negative Binomial distribution with parameter $\gamma = (\gamma_0, \gamma_1, \gamma_2)$, which we will denote as $(X_1, X_2) \sim BNB(\gamma)$, and its probability mass function (pmf), with which we will work in this article, is given by
$$P_\gamma(X_1 = r, X_2 = s) = \frac{(\gamma_0 - \gamma_2)^r\, (\gamma_1 - \gamma_2)^s\, \Gamma(v+r+s)}{r!\, s!\, \Gamma(v)\, (1 + \gamma_0 + \gamma_1 - \gamma_2)^{v+r+s}}\, S(r,s), \qquad (2)$$
where
$$S(r,s) = \sum_{i=0}^{\min(r,s)} \binom{r}{i} \binom{s}{i} \binom{v+r+s-1}{i}^{-1} \tau^{\,i}, \quad \text{with} \quad \tau = \frac{\gamma_2\,(1 + \gamma_0 + \gamma_1 - \gamma_2)}{(\gamma_0 - \gamma_2)(\gamma_1 - \gamma_2)}.$$
With this reparameterization and considering the parameter γ , the pgf (1) of the BNBD can be written as
$$g(t;\gamma) := E\!\left[ t_1^{X_1} t_2^{X_2} \right] = \left[\, 1 + (1-t_1)\gamma_0 + (1-t_2)\gamma_1 - (1-t_1)(1-t_2)\gamma_2 \,\right]^{-v},$$
where $t = (t_1, t_2) \in \mathbb{R}^2$ and $\gamma \in \Theta$.
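To make the reparameterized pmf concrete, the following Python sketch (the helper name `bnb_pmf` is ours; the paper's own implementation, referenced in Appendix A, is written in R) evaluates (2) for integer $v$ and can be checked against the pgf: the probabilities sum to one, and weighting by $t_1^r t_2^s$ reproduces $g(t;\gamma)$.

```python
import math

# Evaluates the reparameterized BNBD pmf (2) for integer v.
# Helper name is ours; the paper's code (Appendix A) is in R.
def bnb_pmf(r, s, v, g0, g1, g2):
    q = 1.0 + g0 + g1 - g2
    tau = g2 * q / ((g0 - g2) * (g1 - g2))
    S = sum(math.comb(r, i) * math.comb(s, i) / math.comb(v + r + s - 1, i)
            * tau ** i for i in range(min(r, s) + 1))
    # Gamma(v+r+s)/Gamma(v) = (v+r+s-1)!/(v-1)! for integer v
    coef = math.factorial(v + r + s - 1) / (
        math.factorial(r) * math.factorial(s) * math.factorial(v - 1))
    return (g0 - g2) ** r * (g1 - g2) ** s * coef * q ** (-(v + r + s)) * S

# sanity check: P(0,0) equals g(0,0) = (1 + g0 + g1 - g2)^(-v)
p00 = bnb_pmf(0, 0, 2, 0.5, 0.4, 0.2)   # = 1.7 ** -2
```

Summing `bnb_pmf` over a sufficiently large grid of $(r,s)$ recovers total mass one, which is a convenient unit test for any implementation of (2).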

3. Cramér-Von Mises-Type Statistic and Its Asymptotic Null Distribution

In this section, we introduce the test statistic, which is of the Cramér-von Mises type. The essence of the Cramér-von Mises statistic lies in measuring the distance between two continuous distribution functions. In particular, we use it to measure the distance between the pgf, g ( t ; γ ) , and its empirical counterpart, the epgf g n ( t ) , which is defined below. For a detailed explanation of the Cramér-von Mises distance, see Baringhaus and Henze [22].
In order to test the hypothesis
$$H_0 : (X_1, X_2) \sim BNB(\gamma), \ \text{for some}\ \gamma \in \Theta,$$
against the alternative
$$H_1 : (X_1, X_2) \nsim BNB(\gamma), \quad \forall\, \gamma \in \Theta,$$
let $X_1 = (X_{11}, X_{12}), \ldots, X_n = (X_{n1}, X_{n2})$ be independent and identically distributed (iid) random vectors defined on a probability space $(\Omega, \mathcal{A}, P)$ and taking values in $\mathbb{N}_0^2$. Moreover, let
$$g_n(t) = \frac{1}{n} \sum_{i=1}^{n} t_1^{X_{i1}} t_2^{X_{i2}}$$
be the epgf of $X_1, \ldots, X_n$, for $t$ in some appropriate set $W \subseteq \mathbb{R}^2$.
To develop our statistical test, the following result is fundamental, and its proof, for the d-dimensional case, can be reviewed in Novoa-Muñoz and Jiménez-Gamero [17]:
Proposition 1.
Let $X_1, \ldots, X_n$ be iid copies of a random vector $X = (X_1, X_2) \in \mathbb{N}_0^2$. Let $g(t) = E\big[t_1^{X_1} t_2^{X_2}\big]$ be the pgf of $X$, which is defined on $W \subseteq \mathbb{R}^2$. Let $0 \le b_j \le c_j < \infty$, $j = 1, 2$, be such that $R = [b_1, c_1] \times [b_2, c_2] \subset W$; then,
$$\sup_{t \in R} \left| g_n(t) - g(t) \right| \xrightarrow{\ a.s.\ } 0.$$
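Proposition 1 can be illustrated numerically. The following is a minimal Python sketch (helper names are ours; the paper's simulations are in R): it samples from the BNBD through the composition mechanism behind pgf (1) — for each of the $v$ units, a geometric number of "items", each incrementing $X_1$, $X_2$, or both — and compares the epgf with the pgf over a grid in $[0,1]^2$.

```python
import numpy as np

rng = np.random.default_rng(12345)

def rbnb(n, v, g0, g1, g2):
    # Sample from BNB(gamma) via the composition mechanism behind pgf (1):
    # per unit of v, a geometric number of "items"; each item raises X1
    # (type 0), X2 (type 1), or both (type 2). Names ours, not the paper's.
    q = 1.0 + g0 + g1 - g2
    p1, p2, p3 = (g0 - g2) / q, (g1 - g2) / q, g2 / q
    p = p1 + p2 + p3
    out = np.zeros((n, 2), dtype=np.int64)
    for i in range(n):
        x1 = x2 = 0
        for _ in range(v):
            k = rng.geometric(1.0 - p) - 1          # failures before success
            types = rng.choice(3, size=k, p=[p1 / p, p2 / p, p3 / p])
            x1 += int(np.sum(types != 1))
            x2 += int(np.sum(types != 0))
        out[i] = x1, x2
    return out

def pgf(t1, t2, v, g0, g1, g2):
    # closed-form pgf of the reparameterized BNBD
    return (1 + (1 - t1) * g0 + (1 - t2) * g1
            - (1 - t1) * (1 - t2) * g2) ** (-v)

v, g0, g1, g2 = 5, 0.30, 0.30, 0.105
x = rbnb(3000, v, g0, g1, g2)
grid = np.linspace(0.0, 1.0, 11)
err = max(abs(np.mean(t1 ** x[:, 0] * t2 ** x[:, 1])
              - pgf(t1, t2, v, g0, g1, g2))
          for t1 in grid for t2 in grid)
```

With $n = 3000$ the supremum gap `err` is already small (order $n^{-1/2}$), which is the almost sure convergence the proposition asserts.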
According to Proposition 1, the empirical probability-generating function (epgf) consistently estimates the pgf. Moreover, if $\hat\gamma_n$ is a consistent estimator of $\gamma$ and $H_0$ is true, then the pgf is consistently estimated by $g(t;\hat\gamma_n)$, since $g$ is a continuous function. On the other hand, because the distribution of $X$ is determined solely by its pgf, $g(t)$, $t \in [0,1]^2$, it is reasonable to reject $H_0$ for large values of $B_{n,w}(\hat\gamma_n)$, which is defined by
$$B_{n,w}(\hat\gamma_n) = \int_{[0,1]^2} B_n^2(t;\hat\gamma_n)\, w(t)\, dt, \qquad (3)$$
where
$$B_n(t;\gamma) = \sqrt{n}\, \{ g_n(t) - g(t;\gamma) \},$$
and $\hat\gamma_n = \hat\gamma_n(X_1, X_2, \ldots, X_n)$ is a consistent estimator of $\gamma$. Additionally, $w(t)$ is a measurable, non-negative weight function with finite integral, which ensures that the double integral in (3) is finite for each fixed $n$. That is,
$$0 \le \int_{[0,1]^2} w(t)\, dt < \infty. \qquad (4)$$
The next step is to determine what constitutes a large value of $B_{n,w}(\hat\gamma_n)$. To do this, the first option would be to know its null distribution. However, since $B_{n,w}(\hat\gamma_n)$ is a Cramér–von Mises-type statistic, its null distribution for finite sample sizes is not available in closed form. Therefore, we will approximate it through its asymptotic null distribution. To achieve this, we will assume that the estimator $\hat\gamma_n$ satisfies the following regularity condition:
Assumption 1.
Under $H_0$, if $\gamma = (\gamma_0, \gamma_1, \gamma_2) \in \Theta$ denotes the true parameter value, then
$$\sqrt{n}\,(\hat\gamma_n - \gamma) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \ell(X_i; \gamma) + o_P(1),$$
where $\ell : \mathbb{N}_0^2 \times \Theta \to \mathbb{R}^3$ is such that $E_\gamma[\ell(X_1;\gamma)] = 0$ and $J(\gamma) = E_\gamma\big[\ell(X_1;\gamma)\,\ell(X_1;\gamma)^\top\big] < \infty$. Here, $o_P(1)$ is a vector consisting of three $o_P(1)$ elements.
Assumption 1 is fulfilled by most commonly used estimators; see [19,23].
The regularity conditions for a parameter estimator have important practical implications, as they ensure the validity and good behavior of the estimators obtained. These conditions allow for the derivation of asymptotic properties such as consistency, asymptotic normality, and efficiency. Some key practical implications are the following:
  • Ensuring the consistency of the estimator: This guarantees that the estimator converges to the true value of the parameter as the sample size increases. This is crucial for ensuring that the model’s results are representative of the underlying phenomenon.
  • Asymptotic normality: This allows the distribution of the estimator to approximate a normal distribution as the sample size grows. This facilitates inference, such as hypothesis testing and confidence interval construction.
  • Efficiency of the estimator: This can enable the estimator to reach the Cramér–Rao bound, meaning it has the lowest possible variance among unbiased estimators.
  • Mathematical interpretability and proper modeling: They ensure that the probability or likelihood functions have appropriate properties, such as being differentiable, non-degenerate, and well-defined. This enables reliable analytical derivations and numerical computations.
  • Robustness to minor violations: Although some regularity conditions may be difficult to verify in practice, this provides a solid theoretical framework that allows models to adapt to small violations or approximations.
In summary, regularity conditions ensure that estimators are reliable, accurate, and suitable for making valid inferences, which is essential in practical statistical applications, such as scientific research, economics, biology, and other fields.
The next result gives the asymptotic null distribution of B n , w ( γ ^ n ) .
Theorem 1.
Let $X_1, \ldots, X_n$ be iid from $X = (X_1, X_2) \sim BNB(\gamma)$. Suppose that Assumption 1 holds. Then,
$$B_{n,w}(\hat\gamma_n) = \|V_n\|_H^2 + o_P(1), \qquad (5)$$
where $V_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} B_0(X_i; t; \gamma)$, with
$$B_0(X_i;t;\gamma) = t_1^{X_{i1}} t_2^{X_{i2}} - g(t;\gamma) - v\, \{g(t;\gamma)\}^{\frac{v+1}{v}}\, \big( t_1 - 1,\ t_2 - 1,\ (t_1-1)(t_2-1) \big)\, \ell(X_i;\gamma),$$
$i = 1, \ldots, n$. Moreover,
$$B_{n,w}(\hat\gamma_n) \xrightarrow{\ L\ } \sum_{j \ge 1} \lambda_j\, \chi_{1j}^2,$$
where $\chi_{11}^2, \chi_{12}^2, \ldots$ are independent $\chi^2$ variates with one degree of freedom, and $\{\lambda_j\}$ is the set of non-null eigenvalues of the operator $C(\gamma)$, defined on the function space $\{ \xi : \mathbb{N}_0^2 \to \mathbb{R} \ \text{such that}\ E_\gamma[\xi^2(X)] < \infty,\ \gamma \in \Theta \}$, as follows:
$$C(\gamma)\,\xi(x) = E_\gamma\{ h(x, Y; \gamma)\, \xi(Y) \},$$
where
$$h(x, y; \gamma) = \int_{[0,1]^2} B_0(x;t;\gamma)\, B_0(y;t;\gamma)\, w(t)\, dt. \qquad (6)$$
Proof. 
By definition, $B_{n,w}(\hat\gamma_n) = \|B_n(t;\hat\gamma_n)\|_H^2$. Note that
$$B_n(t;\hat\gamma_n) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} B(X_i;t;\hat\gamma_n), \quad \text{with} \quad B(X_i;t;\gamma) = t_1^{X_{i1}} t_2^{X_{i2}} - g(t;\gamma). \qquad (7)$$
By Taylor expansion of $B(X_i;t;\hat\gamma_n)$ around $\gamma$,
$$B_n(t;\hat\gamma_n) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} B(X_i;t;\gamma) + \left\{ \frac{1}{n} \sum_{i=1}^{n} B^{(1)}(X_i;t;\gamma)^\top \right\} \sqrt{n}\,(\hat\gamma_n - \gamma) + \frac{1}{2\sqrt{n}}\, \sqrt{n}\,(\hat\gamma_n - \gamma)^\top \left\{ \frac{1}{n} \sum_{i=1}^{n} B^{(2)}(X_i;t;\gamma) \right\} \sqrt{n}\,(\hat\gamma_n - \gamma) + r_n, \qquad (8)$$
where
$$r_n = \frac{1}{3!}\, \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sum_{k_1+k_2+k_3=3} \left. \frac{\partial^3 B(X_i;t;\gamma)}{\partial \gamma_0^{k_1}\, \partial \gamma_1^{k_2}\, \partial \gamma_2^{k_3}} \right|_{\gamma = \tilde\gamma} (\hat\gamma_{0n} - \gamma_0)^{k_1} (\hat\gamma_{1n} - \gamma_1)^{k_2} (\hat\gamma_{2n} - \gamma_2)^{k_3},$$
and $\tilde\gamma = \alpha \hat\gamma_n + (1-\alpha)\gamma$ for some $0 < \alpha < 1$; $B^{(1)}(x;t;\tau)$ is the vector of the first derivatives, and $B^{(2)}(x;t;\tau)$ is the matrix of the second derivatives of $B(x;t;\tau)$ with respect to $\tau = (\tau_0, \tau_1, \tau_2)$. That is,
$$B^{(1)}(x;t;\tau) = \big( B_1^{(1)}(x;t;\tau),\ B_2^{(1)}(x;t;\tau),\ B_3^{(1)}(x;t;\tau) \big)^\top, \qquad B_j^{(1)}(x;t;\tau) = \frac{\partial B(x;t;\tau)}{\partial \tau_{j-1}}, \quad j = 1, 2, 3.$$
As the first derivatives are
$$B_1^{(1)}(x;t;\tau) = v\,(1-t_1)\,q, \qquad B_2^{(1)}(x;t;\tau) = v\,(1-t_2)\,q, \qquad B_3^{(1)}(x;t;\tau) = -v\,(t_1-1)(t_2-1)\,q, \qquad (9)$$
where $q = \{g(t;\tau)\}^{\frac{v+1}{v}}$, it follows that $\big| B_j^{(1)}(X_1;t;\tau) \big| \le v < \infty$ for $j = 1, 2, 3$, $\forall\, t \in [0,1]^2$.
Thus, considering (4), it results that
$$E_\gamma \big\| B_j^{(1)}(X_1;t;\gamma) \big\|_H^2 < \infty, \quad j = 1, 2, 3. \qquad (10)$$
Since, for $j = 1, 2, 3$, we have that
$$E_\gamma \left\{ \frac{1}{n} \sum_{i=1}^{n} B_j^{(1)}(X_i;t;\gamma) \right\}^2 = \frac{1}{n}\, E_\gamma \big\{ B_j^{(1)}(X_1;t;\gamma) \big\}^2 + \frac{n-1}{n} \big\{ E_\gamma B_j^{(1)}(X_1;t;\gamma) \big\}^2,$$
then
$$E_\gamma \left\{ \frac{1}{n} \sum_{i=1}^{n} B_j^{(1)}(X_i;t;\gamma) - E_\gamma B_j^{(1)}(X_1;t;\gamma) \right\}^2 = \frac{1}{n}\, E_\gamma \big\{ B_j^{(1)}(X_1;t;\gamma) \big\}^2 - \frac{1}{n} \big\{ E_\gamma B_j^{(1)}(X_1;t;\gamma) \big\}^2 \le \frac{1}{n}\, E_\gamma \big\{ B_j^{(1)}(X_1;t;\gamma) \big\}^2.$$
Using the Markov inequality and (10), for $j = 1, 2, 3$, we obtain
$$P_\gamma\!\left( \left\| \frac{1}{n} \sum_{i=1}^{n} B_j^{(1)}(X_i;t;\gamma) - E_\gamma B_j^{(1)}(X_1;t;\gamma) \right\|_H > \varepsilon \right) \le \frac{1}{n \varepsilon^2}\, E_\gamma \big\| B_j^{(1)}(X_1;t;\gamma) \big\|_H^2 \to 0.$$
Thus, in the space $H$, we obtain that
$$\frac{1}{n} \sum_{i=1}^{n} B^{(1)}(X_i;t;\gamma) \xrightarrow{\ P\ } E_\gamma B^{(1)}(X_1;t;\gamma) = -v\, \{g(t;\gamma)\}^{\frac{v+1}{v}}\, \big( t_1 - 1,\ t_2 - 1,\ (t_1-1)(t_2-1) \big)^\top.$$
For the second derivatives, it follows from (9) that
$$B^{(2)}(x;t;\tau) = -v(v+1)\, \{g(t;\tau)\}^{\frac{v+2}{v}}\, A_t,$$
where
$$A_t = \begin{pmatrix} (t_1-1)^2 & (t_1-1)(t_2-1) & (t_1-1)^2(t_2-1) \\ (t_1-1)(t_2-1) & (t_2-1)^2 & (t_1-1)(t_2-1)^2 \\ (t_1-1)^2(t_2-1) & (t_1-1)(t_2-1)^2 & \{(t_1-1)(t_2-1)\}^2 \end{pmatrix}.$$
Therefore, $\big| B_{jk}^{(2)}(X_1;t;\tau) \big| \le v(v+1) < \infty$ for $j, k \in \{1,2,3\}$, $\forall\, t \in [0,1]^2$.
Thus, considering (4), it results that
$$E_\gamma \big\| B_{jk}^{(2)}(X_1;t;\gamma) \big\|_H^2 < \infty, \quad j, k \in \{1,2,3\}.$$
Following, for the matrix of second derivatives, similar steps to those used for the vector $B^{(1)}(X_i;t;\gamma)$, we obtain that, in the space $H$,
$$\frac{1}{n} \sum_{i=1}^{n} B^{(2)}(X_i;t;\gamma) \xrightarrow{\ P\ } E_\gamma B^{(2)}(X_1;t;\gamma) = -v(v+1)\, \{g(t;\gamma)\}^{\frac{v+2}{v}}\, A_t.$$
Additionally, using Assumption 1, we obtain
$$\frac{1}{2\sqrt{n}}\, \sqrt{n}\,(\hat\gamma_n - \gamma)^\top \left\{ \frac{1}{n} \sum_{i=1}^{n} B^{(2)}(X_i;t;\gamma) \right\} \sqrt{n}\,(\hat\gamma_n - \gamma) = \frac{1}{2\sqrt{n}} \left\{ \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \ell(X_k;\gamma) + o_P(1) \right\}^{\!\top} \big\{ E_\gamma B^{(2)}(X_1;t;\gamma) + o_P(1) \big\} \left\{ \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \ell(X_k;\gamma) + o_P(1) \right\} = o_P(1).$$
On the other hand, using similar arguments to the previous ones, we obtain $\|r_n\|_H = o_P(1)$; then, using Assumption 1, (8) can be written as
$$B_n(t;\hat\gamma_n) = V_n(t) + b_n,$$
where $\|b_n\|_H = o_P(1)$, and
$$V_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left\{ B(X_i;t;\gamma) + E_\gamma\big[ B^{(1)}(X_1;t;\gamma) \big]^\top \ell(X_i;\gamma) \right\} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} B_0(X_i;t;\gamma).$$
On the other hand, observe that
$$\|V_n\|_H^2 = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} h(X_i, X_j; \gamma),$$
where $h(x,y;\gamma)$ is defined in (6) and satisfies $h(x,y;\gamma) = h(y,x;\gamma)$, $E_\gamma[h^2(X_1,X_2;\gamma)] < \infty$, $E_\gamma|h(X_1,X_1;\gamma)| < \infty$ and $E_\gamma[h(X_1,X_2;\gamma)] = 0$. Thus, from Theorem 6.4.1.B in Serfling [24],
$$\|V_n\|_H^2 \xrightarrow{\ L\ } \sum_{j \ge 1} \lambda_j\, \chi_{1j}^2,$$
where $\chi_{11}^2, \chi_{12}^2, \ldots$ and the set $\{\lambda_j\}$ are as defined in the statement of the theorem. In particular, $\|V_n\|_H^2 = O_P(1)$, which implies (5).   □
As seen in Theorem 1, the null asymptotic distribution of B n , w ( γ ^ n ) still depends on the true value of the parameter γ , which, being unknown, does not provide a useful solution. This issue is resolved by replacing γ with γ ^ .
However, a greater challenge lies in determining the set $\{\lambda_j\}_{j \ge 1}$, since calculating the eigenvalues of an operator is not straightforward. Moreover, it is also necessary to obtain the expression for $h(x,y;\gamma)$, which is difficult to derive because it depends on the function $\ell$, which generally does not have a simple expression.
Thus, in the following section, we employ the parametric bootstrap method to approximate the null distribution of the test statistic.

4. Bootstrap Approximation of the Null Distribution

The parametric bootstrap method provides an alternative way to estimate the null distribution of the proposed statistic. For this, let $X_1, \ldots, X_n$ be iid random vectors taking values in $\mathbb{N}_0^2$, and suppose that $\hat\gamma_n = \hat\gamma_n(X_1, \ldots, X_n) \in \Theta$.
Furthermore, to use the bootstrap method, let $X_1^*, \ldots, X_n^*$ be iid random vectors from a population distributed according to the $BNB(\hat\gamma_n)$ law, given $X_1, \ldots, X_n$. Also, let $B_{n,w}^*(\hat\gamma_n^*)$ be the bootstrap version of $B_{n,w}(\hat\gamma_n)$, which is obtained by replacing $X_1, \ldots, X_n$ and $\hat\gamma_n = \hat\gamma_n(X_1, \ldots, X_n)$ with $X_1^*, \ldots, X_n^*$ and $\hat\gamma_n^* = \hat\gamma_n(X_1^*, \ldots, X_n^*)$, respectively, in the expression for $B_{n,w}(\hat\gamma_n)$.
Theorem 2, presented later, demonstrates that the parametric bootstrap method consistently approximates the null distribution of B n , w ( γ ^ n ) . To achieve this, we must assume the following hypotheses, which are satisfied by commonly used estimators, as mentioned after Assumption 1.
Assumption 2.
Assumption 1 holds and the functions $\ell$ and $J$ satisfy:
(1)
$\sup_{\varrho \in \Theta_0} E_\varrho\big[ \|\ell(X;\varrho)\|^2\, I\{ \|\ell(X;\varrho)\| > \tau \} \big] \to 0$ as $\tau \to \infty$, where $\Theta_0 \subseteq \Theta$ is an open neighborhood of $\gamma$.
(2)
$\ell(x;\varrho)$ and $J(\varrho)$ are continuous as functions of $\varrho$ at $\varrho = \gamma$, and $J(\varrho)$ is finite $\forall\, \varrho \in \Theta_0$.
Theorem 2.
Let $X_1, \ldots, X_n$ be iid from a random vector $X = (X_1, X_2) \in \mathbb{N}_0^2$. Suppose that Assumption 2 holds and that $\hat\gamma_n = \gamma + o(1)$ for some $\gamma \in \Theta$. Then,
$$\sup_{x \in \mathbb{R}} \left| P_*\big( B_{n,w}^*(\hat\gamma_n^*) \le x \big) - P_\gamma\big( B_{n,w}(\hat\gamma_n) \le x \big) \right| \xrightarrow{\ a.s.\ } 0,$$
where $P_*$ denotes the conditional probability law given $X_1, \ldots, X_n$.
Proof. 
By definition, $B_{n,w}^*(\hat\gamma_n^*) = \|B_n^*(t;\hat\gamma_n^*)\|_H^2$, with
$$B_n^*(t;\hat\gamma_n^*) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} B(X_i^*;t;\hat\gamma_n^*)$$
and $B(X;t;\gamma)$ defined in (7).
Following similar steps to those given in the proof of Theorem 1, it can be seen that $B_{n,w}^*(\hat\gamma_n^*) = \|V_n^*\|_H^2 + o_P(1)$, where $V_n^*(t)$ is defined as $V_n(t)$, with $X_i$ and $\gamma$ replaced by $X_i^*$ and $\hat\gamma_n$, respectively.
Now, the result follows by applying Theorem 1.1 in Kundu et al. [25] to V n and the continuous mapping theorem.    □
From Theorem 2, the test function
$$\Psi^* = \begin{cases} 1, & \text{if } B_{n,w}(\hat\gamma_n) \ge b_{n,w,\alpha}^*, \\ 0, & \text{otherwise}, \end{cases}$$
or, equivalently, the test that rejects $H_0$ when $p^* = P_*\{ B_{n,w}^*(\hat\gamma_n^*) \ge B_{obs} \} \le \alpha$, is asymptotically correct, in the sense that, when $H_0$ is true, $\lim P_\gamma(\Psi^* = 1) = \alpha$, where $b_{n,w,\alpha}^* = \inf\{ x : P_*( B_{n,w}^*(\hat\gamma_n^*) \ge x ) \le \alpha \}$ is the $\alpha$ upper percentile of the bootstrap distribution of $B_{n,w}^*(\hat\gamma_n^*)$, and $B_{obs}$ is the observed value of the test statistic.

5. Parameter Estimation

In this research, three techniques were analyzed to estimate the parameters established in parameterization (2), namely, (a) the method of moments, (b) the zero–zero cell frequency method, and (c) maximum likelihood. In what follows, we will use different notation to denote the estimator of $\gamma$ depending on the method employed: $\tilde\gamma$ for the method of moments, $\tilde{\tilde\gamma}$ for the zero–zero cell frequency method, and $\tilde{\tilde{\tilde\gamma}}$ for maximum likelihood. For this, let us consider $(X_1, X_2) \sim BNB(\gamma)$.

5.1. Method of Moments

The usual procedure for obtaining the moment estimators is to equate the sample moments to the population moments and solve the resulting equations. According to [26], we shall use the first two marginal moments and the first mixed central moment, $m_{1,1}$.
The moments of the distribution in (2) are
$$\mu_{1,0} = v\gamma_0, \qquad \mu_{0,1} = v\gamma_1, \qquad \mu_{1,1} = v(\gamma_2 + \gamma_0\gamma_1), \qquad \mu_{2,0} = v\gamma_0(1+\gamma_0), \qquad \mu_{0,2} = v\gamma_1(1+\gamma_1),$$
$$\mu_{2,1} = v(2\gamma_0+1)(\gamma_2+\gamma_0\gamma_1), \qquad \mu_{1,2} = v(2\gamma_1+1)(\gamma_2+\gamma_0\gamma_1),$$
$$\mu_{2,2} = v\big[ 2(v+1)\gamma_2^2 + 2(v+2)\gamma_0\gamma_1\gamma_2 + 3(v+2)\gamma_0^2\gamma_1^2 + \gamma_2(2\gamma_0+2\gamma_1+1) + (v+1)\gamma_0\gamma_1(1+\gamma_0+\gamma_1) + \gamma_0\gamma_1(\gamma_0+\gamma_1) \big].$$

5.1.1. Estimators of γ ˜

Using the sample marginal first moments $\bar X_1$ and $\bar X_2$, as well as the sample mixed central moment $m_{1,1}$, we have the moment estimators
$$\tilde\gamma_0 = \frac{\bar X_1}{v}, \qquad \tilde\gamma_1 = \frac{\bar X_2}{v}, \qquad \tilde\gamma_2 = \frac{m_{1,1}}{v} - \frac{\bar X_1 \bar X_2}{v^2}.$$
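As a small arithmetic illustration with toy numbers of our own (not from the paper), the moment estimators can be computed directly:

```python
import numpy as np

def moment_estimators(x1, x2, v):
    # method-of-moments estimators of (gamma0, gamma1, gamma2); sketch
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    m11 = np.mean((x1 - x1.mean()) * (x2 - x2.mean()))  # mixed central moment
    g0 = x1.mean() / v
    g1 = x2.mean() / v
    g2 = m11 / v - x1.mean() * x2.mean() / v ** 2
    return g0, g1, g2

# toy counts (ours), v = 5
g = moment_estimators([3, 5, 2, 6], [4, 5, 3, 8], v=5)
```

Here `g = (0.8, 1.0, -0.25)`; the negative $\tilde\gamma_2$ merely reflects the arbitrary toy numbers — for real BNBD data under $H_0$ with adequate sample sizes, the estimates should land in $\Theta$.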

5.1.2. Asymptotic Variance Matrix of γ ˜

The asymptotic variance–covariance matrix of the estimators obtained through the method of moments, $\Sigma_{MM}$, is given by
$$\Sigma_{MM} = \frac{1}{nv} \begin{pmatrix} \gamma_0(1+\gamma_0) & \gamma_2+\gamma_0\gamma_1 & (2\gamma_0+1)(\gamma_2+\gamma_0\gamma_1) \\ \gamma_2+\gamma_0\gamma_1 & \gamma_1(1+\gamma_1) & (2\gamma_1+1)(\gamma_2+\gamma_0\gamma_1) \\ (2\gamma_0+1)(\gamma_2+\gamma_0\gamma_1) & (2\gamma_1+1)(\gamma_2+\gamma_0\gamma_1) & \varsigma \end{pmatrix},$$
where $\varsigma = (v+2)\gamma_2^2 + \gamma_0\gamma_1\big[ 4\gamma_2 + 2(v+3)\gamma_0\gamma_1 + (v+2)(\gamma_0+\gamma_1) + v+1 \big]$.

5.1.3. Asymptotic Distribution of γ ˜

Since $\tilde\gamma$ is a function of sample moments, applying Theorem 11.2.14 (Delta Method) of Lehmann and Romano [27], it follows that
$$\sqrt{n}\,(\tilde\gamma - \gamma) \xrightarrow{\ L\ } N_3(0,\ \Sigma_{MM}).$$

5.1.4. Asymptotic Properties of γ ˜

According to the results of Section 5.1.3, γ ˜ is a consistent and asymptotically normal estimator.

5.2. Zero–Zero Cell Frequency

5.2.1. Estimators of γ ˜ ˜

From (2),
$$P_\gamma(0,0) = (1 + \gamma_0 + \gamma_1 - \gamma_2)^{-v}.$$
A combination of $\bar X_1$, $\bar X_2$ and the observed zero–zero cell relative frequency $n_{00}/n$ can be used to estimate the parameters $\gamma_0$, $\gamma_1$ and $\gamma_2$:
$$\tilde{\tilde\gamma}_0 = \frac{\bar X_1}{v}, \qquad \tilde{\tilde\gamma}_1 = \frac{\bar X_2}{v}, \qquad \tilde{\tilde\gamma}_2 = 1 + \frac{\bar X_1}{v} + \frac{\bar X_2}{v} - \left( \frac{n_{00}}{n} \right)^{-1/v}.$$
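A direct implementation of these estimators, again on toy numbers of our own (helper name ours):

```python
import numpy as np

def zero_zero_estimators(x1, x2, v):
    # zero-zero cell frequency estimators; illustrative sketch (names ours)
    x1, x2 = np.asarray(x1), np.asarray(x2)
    n = len(x1)
    n00 = int(np.sum((x1 == 0) & (x2 == 0)))   # pairs with both coordinates zero
    g0 = x1.mean() / v
    g1 = x2.mean() / v
    g2 = 1.0 + g0 + g1 - (n00 / n) ** (-1.0 / v)
    return g0, g1, g2

g = zero_zero_estimators([0, 0, 0, 2], [0, 0, 0, 3], v=1)   # toy data (ours)
```

With these four toy pairs, $n_{00}/n = 3/4$, so $\tilde{\tilde\gamma}_2 = 1 + 0.5 + 0.75 - (3/4)^{-1} \approx 0.917$.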

5.2.2. Asymptotic Variance Matrix of γ ˜ ˜

The asymptotic variance–covariance matrix of the estimators obtained through the zero–zero cell frequency method, $\Sigma_{ZZ}$, is given by
$$\Sigma_{ZZ} = \frac{1}{vn} \begin{pmatrix} \gamma_0(1+\gamma_0) & \gamma_2+\gamma_0\gamma_1 & \gamma_2(1+\gamma_0) \\ \gamma_2+\gamma_0\gamma_1 & \gamma_1(1+\gamma_1) & \gamma_2(1+\gamma_1) \\ \gamma_2(1+\gamma_0) & \gamma_2(1+\gamma_1) & 1+\gamma_0+\gamma_1-2\gamma_2-\gamma_0\gamma_1+\zeta \end{pmatrix},$$
where $\zeta = \frac{1}{v}\left[ \frac{1}{P_\gamma(0,0)} - 1 \right] (1+\gamma_0+\gamma_1-\gamma_2)^2$.

5.2.3. Asymptotic Distribution of γ ˜ ˜

Since $\tilde{\tilde\gamma}$ is a function of sample moments, applying Theorem 11.2.14 (Delta Method) of Lehmann and Romano [27], it follows that
$$\sqrt{n}\,(\tilde{\tilde\gamma} - \gamma) \xrightarrow{\ L\ } N_3(0,\ \Sigma_{ZZ}).$$

5.2.4. Asymptotic Properties of γ ˜ ˜

According to the results of Section 5.2.3, $\tilde{\tilde\gamma}$ is a consistent and asymptotically normal estimator.

5.3. Maximum Likelihood Estimators

5.3.1. Estimators of γ ˜ ˜ ˜

Differentiating the pmf given in (2) with respect to $\gamma_0$, $\gamma_1$, and $\gamma_2$ successively, we have
$$\frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_0} = \frac{r}{\gamma_0-\gamma_2} - \frac{v+r+s}{1+\gamma_0+\gamma_1-\gamma_2} - \frac{(1+\gamma_1)\, R(r,s)}{(\gamma_0-\gamma_2)(1+\gamma_0+\gamma_1-\gamma_2)}, \qquad (11)$$
where $R(r,s) = S'(r,s)/S(r,s)$, and
$$S'(r,s) = \sum_{i=0}^{\min(r,s)} \binom{r}{i} \binom{s}{i} \binom{v+r+s-1}{i}^{-1} i\, \tau^{\,i}.$$
A similar procedure yields
$$\frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_1} = \frac{s}{\gamma_1-\gamma_2} - \frac{v+r+s}{1+\gamma_0+\gamma_1-\gamma_2} - \frac{(1+\gamma_0)\, R(r,s)}{(\gamma_1-\gamma_2)(1+\gamma_0+\gamma_1-\gamma_2)}, \qquad (12)$$
and
$$\frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_2} = -\frac{r}{\gamma_0-\gamma_2} - \frac{s}{\gamma_1-\gamma_2} + \frac{v+r+s}{1+\gamma_0+\gamma_1-\gamma_2} + \left[ \frac{1}{\gamma_2} + \frac{1}{\gamma_0-\gamma_2} + \frac{1}{\gamma_1-\gamma_2} - \frac{1}{1+\gamma_0+\gamma_1-\gamma_2} \right] R(r,s). \qquad (13)$$
The estimating equations are
$$\sum_{r,s} \frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_0} = 0, \qquad \sum_{r,s} \frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_1} = 0, \qquad \sum_{r,s} \frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_2} = 0,$$
where the sums run over the observed pairs $(r,s)$, and which yield the maximum likelihood estimators $\tilde{\tilde{\tilde\gamma}}$ as
$$\frac{\bar X_1}{\tilde{\tilde{\tilde\gamma}}_0 - \tilde{\tilde{\tilde\gamma}}_2} - \frac{v + \bar X_1 + \bar X_2}{\xi} - \frac{\big(1 + \tilde{\tilde{\tilde\gamma}}_1\big)\, \bar R}{\xi\, \big(\tilde{\tilde{\tilde\gamma}}_0 - \tilde{\tilde{\tilde\gamma}}_2\big)} = 0,$$
$$-\frac{v + \bar X_1 + \bar X_2}{\xi} + \frac{\bar X_2}{\tilde{\tilde{\tilde\gamma}}_1 - \tilde{\tilde{\tilde\gamma}}_2} - \frac{\big(1 + \tilde{\tilde{\tilde\gamma}}_0\big)\, \bar R}{\xi\, \big(\tilde{\tilde{\tilde\gamma}}_1 - \tilde{\tilde{\tilde\gamma}}_2\big)} = 0,$$
$$-\frac{\bar X_1}{\tilde{\tilde{\tilde\gamma}}_0 - \tilde{\tilde{\tilde\gamma}}_2} - \frac{\bar X_2}{\tilde{\tilde{\tilde\gamma}}_1 - \tilde{\tilde{\tilde\gamma}}_2} + \frac{v + \bar X_1 + \bar X_2}{\xi} + \left[ \frac{1}{\tilde{\tilde{\tilde\gamma}}_2} + \frac{1}{\tilde{\tilde{\tilde\gamma}}_0 - \tilde{\tilde{\tilde\gamma}}_2} + \frac{1}{\tilde{\tilde{\tilde\gamma}}_1 - \tilde{\tilde{\tilde\gamma}}_2} - \frac{1}{\xi} \right] \bar R = 0,$$
where $\xi = 1 + \tilde{\tilde{\tilde\gamma}}_0 + \tilde{\tilde{\tilde\gamma}}_1 - \tilde{\tilde{\tilde\gamma}}_2$.
Here, $\bar R$ is the sample mean of $R(X_{i1}, X_{i2})$. Solving these equations leads to
$$\tilde{\tilde{\tilde\gamma}}_0 = \frac{\bar X_1}{v}, \qquad \tilde{\tilde{\tilde\gamma}}_1 = \frac{\bar X_2}{v},$$
while $\tilde{\tilde{\tilde\gamma}}_2$ is the solution to the equation
$$v\, \tilde{\tilde{\tilde\gamma}}_2 = \bar R.$$
This equation for $\tilde{\tilde{\tilde\gamma}}_2$ can be solved iteratively, using either the moment or the zero–zero cell frequency estimate as an initial value.
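A Python sketch of the building blocks needed for this iteration (helper names are ours; the right-hand side $\bar R$ depends on $\gamma_2$ through $\tau$, so each fixed-point step re-evaluates it):

```python
import math

# Building blocks for solving v * gamma2 = Rbar (names ours; sketch only).
def S(r, s, v, tau):
    return sum(math.comb(r, i) * math.comb(s, i) / math.comb(v + r + s - 1, i)
               * tau ** i for i in range(min(r, s) + 1))

def Sprime(r, s, v, tau):
    # tau-weighted companion sum appearing in R(r,s) = S'(r,s)/S(r,s)
    return sum(math.comb(r, i) * math.comb(s, i) / math.comb(v + r + s - 1, i)
               * i * tau ** i for i in range(min(r, s) + 1))

def R(r, s, v, tau):
    return Sprime(r, s, v, tau) / S(r, s, v, tau)

def ml_gamma2_update(data, v, g0, g1, g2):
    # one fixed-point step g2 <- Rbar(g2)/v; iterate from a moment or
    # zero-zero cell starting value until the change is negligible
    q = 1.0 + g0 + g1 - g2
    tau = g2 * q / ((g0 - g2) * (g1 - g2))
    rbar = sum(R(r, s, v, tau) for r, s in data) / len(data)
    return rbar / v

u = ml_gamma2_update([(1, 0), (0, 1), (2, 1)], v=2, g0=0.6, g1=0.5, g2=0.2)
```

Note that $R(r,s) = 0$ whenever $\min(r,s) = 0$, since only the $i = 0$ term survives in $S'$.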

5.3.2. Asymptotic Variance Matrix of γ ˜ ˜ ˜

By definition, the third-order information matrix, $F_3$, is given (for large $n$) by
$$F_3 = n \sum_{r,s} \frac{1}{P_\gamma(r,s)}\, \frac{\partial P_\gamma(r,s)}{\partial \gamma_i}\, \frac{\partial P_\gamma(r,s)}{\partial \gamma_j}, \quad i, j \in \{0,1,2\}.$$
Returning to Equations (11)–(13) and recalling that the asymptotic variance–covariance matrix of the estimators obtained by the maximum likelihood method, Σ M L , is the inverse of the information matrix, then
$$\Sigma_{ML} = \frac{1}{n} \left[ \sum_{r,s} Q_i(r,s)\, Q_j(r,s)\, P_\gamma(r,s) \right]^{-1}, \quad i, j \in \{0,1,2\},$$
where
$$Q_i(r,s) = \frac{1}{P_\gamma(r,s)} \frac{\partial P_\gamma(r,s)}{\partial \gamma_i}, \quad i = 0, 1, 2.$$

5.3.3. Asymptotic Distribution of γ ˜ ˜ ˜

Since $\tilde{\tilde{\tilde\gamma}}$ is a function of sample moments, applying Theorem 11.2.14 (Delta Method) of Lehmann and Romano [27], it follows that
$$\sqrt{n}\,(\tilde{\tilde{\tilde\gamma}} - \gamma) \xrightarrow{\ L\ } N_3(0,\ \Sigma_{ML}).$$

5.3.4. Asymptotic Properties of γ ˜ ˜ ˜

According to the results of Section 5.3.3, γ ˜ ˜ ˜ is a consistent and asymptotically normal estimator.

6. Numerical Results and Discussion

The properties of the statistic B n , w ( γ ^ n ) are asymptotic (see, for example, [17]). However, we need to apply the test statistic in the real world, i.e., for finite sample sizes. This section is dedicated to investigating the behavior of the bootstrap approximation of the proposed statistic in a finite scenario. To this end, we conducted a simulation study and present the results obtained.
It is important to note that, as we did not find another consistent test statistic for the BNBD, the numerical study presented has no point of comparison. This is the reason why the study was carried out using three estimation techniques.
Given the massive volume of calculations performed, we minimized computation time by using the parallelization algorithm called parRapply, which has been suggested in [18]. All computational work was done through codes written in the R programming language [28].
It is important to point out that parRapply is a function from R's parallel package. It exploits task parallelism, reducing execution time by partitioning the work to be executed across the different cores available on a computer. For more detailed information about parRapply, refer to the documentation cited in [18].
To calculate B n , w ( γ ^ n ) , we used the weight function
$$w(t; a_1, a_2) = t_1^{a_1} t_2^{a_2}, \qquad (14)$$
with $a_1 > -1$ and $a_2 > -1$, resulting in the following explicit form of the test statistic:
$$B_{n,w}(\hat\gamma_n) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{1}{(X_{i1}+X_{j1}+a_1+1)(X_{i2}+X_{j2}+a_2+1)} + q^{-v} \sum_{k=0}^{\infty} \sum_{l=0}^{k} \sum_{m=0}^{k-l} \frac{(\hat\gamma_0-\hat\gamma_2)^{k-l-m}\, (\hat\gamma_1-\hat\gamma_2)^{l}\, \hat\gamma_2^{\,m}}{l!\, m!\, (k-m-l)!\, q^{k}} \left[ \frac{n\, q^{-v}\, (2v+k-1)!}{(2v-1)!\, A_{kl}\, C_{lm}} - \frac{2\,(v+k-1)!}{(v-1)!} \sum_{i=1}^{n} \frac{1}{D_{iklm}} \right],$$
where $q = 1 + \hat\gamma_0 + \hat\gamma_1 - \hat\gamma_2$, $A_{kl} = a_1 + k - l + 1$, $C_{lm} = a_2 + l + m + 1$, and $D_{iklm} = (X_{i1} + A_{kl})(X_{i2} + C_{lm})$.
Remark 1.
For practical purposes, truncating the infinite series to the first 20 terms yielded sufficiently accurate values for the test statistic and good performance of the corresponding subroutine. 
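The first double sum in the explicit form comes from integrating the squared epgf against the weight, using $\int_0^1 t^{c}\, dt = 1/(c+1)$ coordinatewise in $n \int g_n^2(t)\, w(t)\, dt$. A quick Python check of that step on toy data (helper names ours):

```python
import numpy as np

def epgf(t1, t2, x):
    # empirical pgf g_n(t) of the sample x (n x 2 array of counts)
    return np.mean(t1 ** x[:, 0] * t2 ** x[:, 1])

x = np.array([[0, 1], [2, 0], [1, 1], [3, 2]])   # toy bivariate counts (ours)
n = len(x)
a1 = a2 = 0                                       # weight w(t) = t1**a1 * t2**a2

# closed-form first term: (1/n) sum_{i,j} 1/((Xi1+Xj1+a1+1)(Xi2+Xj2+a2+1))
closed = sum(1.0 / ((x[i, 0] + x[j, 0] + a1 + 1) * (x[i, 1] + x[j, 1] + a2 + 1))
             for i in range(n) for j in range(n)) / n

# midpoint-rule check of n * integral of g_n(t)^2 * w(t) over [0,1]^2
ts = (np.arange(200) + 0.5) / 200
quad = n * np.mean([[epgf(u, s, x) ** 2 * u ** a1 * s ** a2 for u in ts]
                    for s in ts])
```

The two quantities agree to quadrature accuracy, confirming the term-by-term integration behind the closed form.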

6.1. Simulated Data for Type I Error

In practice, the exact bootstrap estimator of the null distribution of B n , w ( γ ^ n ) cannot be calculated. As usual, we approximate it by simulation as follows:
1.
Estimate $\gamma$ through $\hat\gamma$ and compute the observed value of the test statistic, $B_{obs}$.
2.
For some large integer $K$, repeat for every $k \in \{1, \ldots, K\}$:
(a)
Generate $X^{*k} = (X_1^{*k}, \ldots, X_n^{*k})$, where $X_1^{*k}, \ldots, X_n^{*k}$ are iid from a $BNB(\hat\gamma)$ distribution.
(b)
Calculate the test statistic evaluated at $X^{*k}$, obtaining $B_n^{*k}(\hat\gamma^{*k})$.
3.
Approximate the $p$-value by $\hat p^* = \frac{1}{K} \sum_{k=1}^{K} I\{ B_n^{*k}(\hat\gamma^{*k}) > B_{obs} \}$.
It is important to note that in Appendix A, we present the computational implementation of the parametric bootstrap algorithm written in R code.
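The three steps above can be sketched as follows. This is a schematic only: for brevity it uses a univariate Poisson stand-in for $BNB(\hat\gamma)$ and a toy dispersion statistic in place of $B_{n,w}$, and it runs serially, whereas the paper draws from the fitted BNBD and parallelizes the $K$ replications in R with parRapply.

```python
import numpy as np

rng = np.random.default_rng(7)

# Schematic parametric bootstrap (steps 1-3). Stand-in model and statistic
# are ours; the paper's R implementation (Appendix A) uses the BNBD and B_{n,w}.
def estimate(sample):
    return sample.mean()                 # stand-in parameter estimator

def statistic(sample):
    m = max(sample.mean(), 1e-12)        # toy distance (index-of-dispersion type)
    return len(sample) * (sample.var() / m - 1.0) ** 2

x = rng.poisson(2.0, size=50)            # observed data
theta_hat = estimate(x)                  # step 1: estimate, then B_obs
b_obs = statistic(x)

K = 500
exceed = 0
for _ in range(K):                       # step 2
    xk = rng.poisson(theta_hat, size=len(x))   # 2(a): draw from fitted model
    exceed += statistic(xk) > b_obs            # 2(b): bootstrap statistic
p_hat = exceed / K                       # step 3: bootstrap p-value
```

Rejecting when `p_hat` falls at or below $\alpha$ reproduces the test function $\Psi$ of Section 4.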
To study the goodness of the bootstrap approximation of $B_{n,w}(\hat\gamma_n)$ for finite sample sizes, we generated samples of size $n = 30(20)70$ from $BNB(\gamma_0, \gamma_1, \gamma_2)$, with $\gamma_0 = \gamma_1$, where $\gamma_0, \gamma_1 \in \{0.29, 0.30, 0.31, 0.32\}$, and the parameter $\gamma_2$ was chosen such that the correlation coefficient,
$$\rho = \frac{\gamma_0\gamma_1 + \gamma_2}{\sqrt{\gamma_0\gamma_1(1+\gamma_0)(1+\gamma_1)}},$$
equaled 0.25, 0.50, and 0.75, in order to evaluate the quality of the approximations for data with low, moderate, and high correlations, respectively.
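Given a target $\rho$ and $(\gamma_0, \gamma_1)$, the required $\gamma_2$ follows by inverting the correlation formula; a one-line Python check (helper name ours):

```python
import math

def gamma2_for_rho(g0, g1, rho):
    # invert rho = (g0*g1 + g2) / sqrt(g0*g1*(1+g0)*(1+g1)) for g2 (sketch)
    return rho * math.sqrt(g0 * g1 * (1 + g0) * (1 + g1)) - g0 * g1

g2 = gamma2_for_rho(0.30, 0.30, 0.50)   # one of the simulation settings
```

For $\gamma_0 = \gamma_1 = 0.30$ and $\rho = 0.50$, this gives $\gamma_2 = 0.105$, which indeed lies in $\Theta$ ($0 < \gamma_2 < \gamma_0$).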
To estimate the parameter $\gamma$, we employed the three techniques described in Section 5. Then, we approximated the bootstrap $p$-values, $p^*$, for the proposed test. For this purpose, we used the weight function given in (14) for $(a_1, a_2) \in \{(0,0), (1,0), (0,1)\}$ and generated $K = 500$ bootstrap samples.
The above procedure was repeated 1000 times, and we calculated the fraction of the estimated p-values that were less than or equal to 0.05 and 0.10. These fractions correspond to the estimated probabilities of Type I error for α = 0.05 , 0.10 , respectively.
Next, we repeated the previous experiment for $\gamma_0 \ne \gamma_1$, where $\gamma_0, \gamma_1 \in \{0.29, 0.30, 0.31, 0.32\}$ and $\gamma_2$ was chosen such that the correlation coefficient, $\rho = (\gamma_0\gamma_1 + \gamma_2)\big/\sqrt{\gamma_0\gamma_1(1+\gamma_0)(1+\gamma_1)}$, was approximately equal to 0.25, 0.50, and 0.75.
In both scenarios, γ_0 = γ_1 and γ_0 ≠ γ_1, we experimented with different values of v, the number of trials before the first event. Since the results were similar across the simulated cases, here we report only those for v = 5.
Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6 summarize the obtained results. We denote the test statistic by B n , a ( γ ^ ) when the weight function w takes the form given by (14) for some a = ( a 1 , a 2 ) . Additionally, note that we used the notation detailed in Section 5 to represent the technique employed to estimate the parameter γ .
As documented in the literature, the tables show that the maximum likelihood method gave better results than the other two estimation methods analyzed, followed by the method of moments. Furthermore, the results converged to the established nominal levels as the sample size increased.
These two conclusions are clearly visible in the graphs presented in Appendix B. On the one hand, as the sample size increased, the results approached the established nominal levels: in the left area of each graph, the convergence toward α = 0.05 is evident, while in the right area, the convergence toward α = 0.10 is observable. On the other hand, the maximum likelihood method yielded results closer to the established nominal levels (α = 0.05 on the left side and α = 0.10 on the right side) than the other two methods, with the method of moments following in accuracy.
Observing the values presented in the tables, we conclude that the parametric bootstrap method provides an accurate approximation to the null distribution of B n , w ( γ ^ n ) in all the cases considered. This is also evident in the graphs in Appendix B.

6.2. Power of the Proposed Test Statistic

To analyze the power, we repeated the previous experiment for samples of size n = 50 across four alternatives (see, for example, Kocherlakota and Kocherlakota [19] for a description of these distributions):
  • Bivariate Binomial distribution BB(m; p_1, p_2, p_3), where p_1 + p_2 − p_3 ≤ 1, p_1 ≥ p_3, p_2 ≥ p_3, and p_3 > 0;
  • Bivariate Hermite distribution BH(μ, σ²; λ_1, λ_2, λ_3), where μ > σ²(λ_1 + λ_2 + λ_3);
  • Bivariate Neyman type A distribution BNTA(λ; λ_1, λ_2, λ_3), where 0 < λ_1 + λ_2 + λ_3 ≤ 1;
  • Bivariate Poisson distribution BP(λ_1, λ_2, λ_3), where λ_1 > λ_3 and λ_2 > λ_3 > 0.
The bivariate Binomial distribution and the bivariate Neyman type A distribution have been used as alternatives in other related articles (see, for example, Loukas and Kemp [29] and Rayner and Best [30]). The other alternative distributions employed are analogous to those used by Gürtler and Henze [31] in the univariate case.
To generate random samples from the bivariate distributions used as alternatives, we implemented computational algorithms following the simulation procedures provided in Kocherlakota and Kocherlakota [19].
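As an illustration of such simulation procedures, the bivariate Poisson alternative can be generated by the usual trivariate reduction. The Python sketch below assumes the common parameterization in which λ_3 is the covariance term; it is not the authors' R code, and the function name is illustrative.

```python
import numpy as np

def rbivpois(n, lam1, lam2, lam3, rng):
    """Bivariate Poisson via trivariate reduction: X = N1 + N3, Y = N2 + N3,
    with independent N1 ~ Poi(lam1 - lam3), N2 ~ Poi(lam2 - lam3), N3 ~ Poi(lam3).
    Then E[X] = lam1, E[Y] = lam2 and Cov(X, Y) = lam3 (requires lam1, lam2 > lam3 > 0)."""
    n1 = rng.poisson(lam1 - lam3, n)
    n2 = rng.poisson(lam2 - lam3, n)
    n3 = rng.poisson(lam3, n)
    return n1 + n3, n2 + n3

rng = np.random.default_rng(42)
x, y = rbivpois(100_000, 2.0, 1.5, 0.5, rng)
# Sample means and sample covariance should be close to lam1, lam2 and lam3
```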
Table 7 displays the alternatives considered and the estimated power at nominal significance level α = 0.05. To estimate the parameter γ, we used only the maximum likelihood method, as it yielded the best results in the type I error study. Analyzing this table, we conclude that all the considered tests, denoted B(a_1, a_2), detected the alternatives studied with good power, especially B(0, 0).
All calculations performed in this research were programmed in the open-source software R [28]. To leverage the potential of parallel programming (see Novoa-Muñoz [18]), the codes were executed on the supercomputing infrastructure of the NLHPC (ECM-02), which belongs to the Mathematical Modeling Center of the University of Chile [32]. This infrastructure includes, among other resources, 132 HP computing nodes (128 HP SL230 nodes and 4 HP SL250 nodes), each equipped with two 10-core Intel Xeon Ivy Bridge E5-2660V2 processors. The supercomputer was manufactured by Atos, headquartered in Bezons, France.
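The bootstrap replicates are embarrassingly parallel, which is what makes the parallelization strategy of [18] effective: each of the K iterations can run independently, provided each worker uses its own reproducible random stream. A minimal Python sketch of this pattern (threads stand in for the cluster nodes; all names are illustrative):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def one_replicate(seed, n=50):
    """One bootstrap iteration with its own seeded generator, so the result
    does not depend on which worker runs it or in what order."""
    rng = np.random.default_rng(seed)
    x = rng.geometric(0.5, size=n) - 1    # stand-in for resampling from the fitted model
    return x.mean()                        # stand-in for the bootstrap statistic

K = 200
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(one_replicate, range(K)))

serial = [one_replicate(s) for s in range(K)]
# Per-replicate seeding makes the parallel run reproduce the serial one exactly
```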

6.3. Real Datasets

The test statistic was applied to five real datasets:
First Case: Biometric Data. This case was analyzed by Dunn [33] to illustrate and represent a real case of the BNBD. The sample was obtained from Arbous and Kerrich [4] in a study on the treatment and analysis of the propensity for workplace accidents in the railway sector among a group of experienced workers. The study involved 122 workers, with the first variable representing a six-year period (1937–1942) and the second variable representing a five-year period (1943–1947). The results correspond to the number of workers who had accidents in the two non-overlapping time periods.
Second Case: Agronomic–Botanical Data. This dataset is based on the number of plants of the species Lacistema aggregatum and Protium guianense in each of 100 contiguous quadrats. The data were first presented by Holgate [6] and later analyzed by various researchers up to 1995, including Gillings [34], Crockett [35], Loukas and Kemp [29], Kocherlakota and Kocherlakota [19], and Rayner and Best [30]. They were also studied by Novoa-Muñoz and Jiménez-Gamero [17]. Crockett [35], Loukas and Kemp [29], Rayner and Best [30], and Novoa-Muñoz and Jiménez-Gamero [17] examined whether the data corresponded to a bivariate Poisson model and concluded that the bivariate Poisson distribution did not model them correctly.
The following three cases are part of the kaggle.com [36] repository, where companies from around the world can upload their databases for a community of data scientists to generate predictive models. It is a challenge of skills and competencies among developers who seek to identify patterns and trends in the data. These cases correspond to current data recorded in recent times.
kaggle.com [36] also offers modeling and evaluation tools to develop and compare models. It is an online community for learning, sharing, and competing in data science.
It is important to note that, based on the information available to date, we have no evidence that any of these three datasets have previously been analyzed to determine whether they follow a probability distribution.
Third Case: Black Friday Sales Data. This is a public dataset containing information about purchases made during Black Friday. It includes buyer variables such as age, gender, purchase amount, category, and quantity of products purchased. The dataset contains over 550,000 rows and 12 columns. Two columns containing the quantity of products purchased were randomly selected. The dataset can be used to identify patterns and trends in Black Friday purchases.
Fourth Case: Jamboree Education Data. This dataset contains information on students’ academic performance in admission exams. It includes variables such as scores in different sections of the exam, the student’s age, and gender. The objective is to predict students’ academic performance. This dataset is useful for analyzing and improving educational outcomes.
Fifth Case: Insurance Data. This database contains information about insurance customers. It includes variables such as age, gender, marital status, and income level. It also contains information about insurance policies, such as premium amount and insurance type. The dataset aims to predict the likelihood of a customer renewing their insurance policy. It is useful for analyzing and improving customer retention in the insurance industry.
Table 8 shows the p-values obtained when testing the goodness of fit to a BNBD for the five real datasets using the proposed test statistic, with (a_1, a_2) ∈ {(0, 0), (1, 0), (0, 1)}. For all five real cases, we applied a nominal significance level of α = 0.05 and took v = 5, the same number of trials before the first event as in the simulation study.
The results displayed in this table allow us to conclude the following: for the Black Friday Sales data, the null hypothesis is rejected, meaning that Black Friday sales are not well modeled by a BNBD. For the other four datasets, the null hypothesis cannot be rejected; therefore, we conclude that the biometric, agronomic, education, and insurance data can be adequately modeled by a BNBD.

6.4. Comparisons with Other Goodness-of-Fit Tests

The only goodness-of-fit tests we have found in the statistical literature for bivariate count data are for the bivariate Poisson distribution and the bivariate Hermite distribution. Like the test proposed here, these tests are:
  • Consistent: As the sample size increases, the probability of correctly rejecting a false null hypothesis approaches 1. In other words, a consistent test has the ability to detect any deviation from the null hypothesis when sufficient information is available (i.e., with a sufficiently large sample size).
  • Based on the probability generating function.
  • Of the Cramér-von Mises type.
  • With null asymptotic distribution approximated by the parametric bootstrap method.
  • Specific tests for bivariate count data.
However, their implementation complexity may be a disadvantage compared to more generic tests like Kolmogorov–Smirnov, Anderson–Darling, or chi-squared.

Author Contributions

Conceptualization, F.N.-M.; methodology, F.N.-M.; software, F.N.-M. and J.P.A.-G.; validation, F.N.-M. and J.P.A.-G.; formal analysis, F.N.-M.; investigation, F.N.-M. and J.P.A.-G.; resources, F.N.-M.; data curation, J.P.A.-G.; writing—original draft preparation, F.N.-M. and J.P.A.-G.; writing—review and editing, F.N.-M.; visualization, F.N.-M. and J.P.A.-G. All authors have read and agreed to the published version of the manuscript.

Funding

This publication was supported by Universidad del Bío-Bío, DICREA [2220529 IF/R].

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The corresponding author expresses gratitude to the research project DIUBB 2220529 IF/R, the “Vicerrectoría de Investigación y Postgrado” and the “Departamento de Enfermería” entities of the Universidad del Bío-Bío, Chile. He also thanks the anonymous reviewers and the editor of this journal for their valuable time and their careful comments and suggestions with which the quality of this paper has been improved.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following notation and abbreviations are used in this manuscript:
All vectors are row vectors, and x′ denotes the transpose of the row vector x; for any vector x, x_k denotes its kth coordinate and ‖x‖ its Euclidean norm; N_0 = {0, 1, 2, 3, …}; μ′_{r,s} = E(X^r Y^s) denotes the joint moments of the random variables X and Y; μ_{r,s} = E[(X − μ_X)^r (Y − μ_Y)^s] denotes the (r, s)th central moment, where μ_Z is the expected value of Z; I{A} denotes the indicator function of the set A; P_γ denotes the probability law of the BNBD with parameter γ; E_γ denotes expectation with respect to P_γ; P∗ and E∗ denote the conditional probability law and expectation, respectively, given the data (X_1, Y_1), …, (X_n, Y_n); all limits in this work are taken as n → ∞; →_L denotes convergence in distribution; →_P denotes convergence in probability; a.s. denotes almost sure convergence; for a sequence {C_n} of random variables or random elements and ϵ ∈ ℝ, C_n = O_P(n^ϵ) means that n^{−ϵ} C_n is bounded in probability, C_n = o_P(n^ϵ) means that n^{−ϵ} C_n →_P 0, and C_n = o(n^ϵ) means that n^{−ϵ} C_n → 0 a.s.; H = L²([0, 1]², w) denotes the separable Hilbert space of measurable functions φ : [0, 1]² → ℝ such that ‖φ‖²_H = ∫_{[0,1]²} φ²(t) w(t) dt < ∞.

Appendix A

This section presents the R code for the parametric bootstrap implementation used.
# The libraries to be used are loaded
library(methods)
library(Runuran)
# generates bivariate negative binomial samples
gbnbs = function(gamma0,gamma1,gamma2,N,v){
 if (gamma0 > gamma2 && gamma1 > gamma2 && gamma2 >= 0){
  q = 1 + gamma0 + gamma1 - gamma2
  p1 = (gamma0 - gamma2)/q
  p2 = (gamma1 - gamma2)/q
  p3 = gamma2/q
  tau = gamma2 * q/((gamma0 - gamma2)*(gamma1 - gamma2))
  x1 = numeric(N)
  x2 = numeric(N) # note: rep(N, 0) would return an empty vector
  x = numeric(N)
  y = numeric(N)
  pmf1 = function(a){
   p = abs(1 - (p2 + p3)/(1 - p1))
   return(choose(v + a - 1,v - 1) * p^v * (1 - p)^a)
  }
  dy = unuran.discr.new(pmf = pmf1, lb = 0, ub = 10, mode = 0, sum = 1)
  gen = unuran.new(distr = dy, method = "dari; squeeze = on")
  y = ur(gen,N)
  pmf2 = function(t){
   P = abs(p3/(p2 + p3))
   return(choose(v,t) * P^t * (1 - P)^(v - t))
  }
   dxg = unuran.discr.new(pmf = pmf2, lb = 0, ub = v, mode = 0, sum = 1) # Binomial(v, P) has support 0..v
  gen = unuran.new(distr = dxg, method = "dari; squeeze = on")
  x1 = ur(gen,N)
  for(j in 1:N){
   pmf3 <- function(b){
    px2<- abs(1 - p1)
    return(choose(v + y[j] + b - 1,v + y[j] - 1) * px2^(v + y[j]) * (1 - px2)^b)
   }
   dx2 = unuran.discr.new(pmf = pmf3, lb = 0, ub = 10, mode = 0, sum = 1)
   gen = unuran.new(distr = dx2, method = "dari; squeeze = on")
   x2[j] = ur(gen,1)
   x[j] = x1[j] + x2[j]
  }
  return(list(x = x,y = y))
 }
}
# Assignment of the fixed values to be used
v = 5   # Number of trials before the first success or failure
n = 70  # Sample size
K = 500 # Bootstrap iterations
#The initial population data is loaded
load("data_BNBD.rda")
data_x = data_BNBD$x
data_y = data_BNBD$y
freq_ini = table(data_x,data_y)
# Step 1 of the parametric Bootstrap algorithm
#The parameters of the initial population are estimated
gamma_0 = mean(data_x)/v
gamma_1 = mean(data_y)/v
#The third parameter is estimated using the three estimation methods
#MZ is the implementation of Zero-Zero frequency method algorithm
#MM is the implementation of the Method of Moments algorithm
#ML is the implementation of the Maximum Likelihood method algorithm
gammaZZ_2 = MZ(g_0 = gamma_0,g_1 = gamma_1,table = freq_ini,n = n,v = v)
gammaMM_2 = MM(g_0 = gamma_0,g_1 = gamma_1,table = freq_ini,n = n,v = v)
gammaML_2 = ML(g_0 = gamma_0,g_1 = gamma_1,table = freq_ini,n = n,v = v)
#The observed value is obtained for each parameter estimation method
#B_est is the implementation of the test statistic.
B_obs_ZZ = B_est(gamma_0,gamma_1,gammaZZ_2,data_x,data_y,n,v)
B_obs_MM = B_est(gamma_0,gamma_1,gammaMM_2,data_x,data_y,n,v)
B_obs_ML = B_est(gamma_0,gamma_1,gammaML_2,data_x,data_y,n,v)
# Implementation of the parametric Bootstrap method
B_boot_ZZ = B_boot_MM = B_boot_ML = rep(0,K) # names must match those used below
for(k in 1:K){
 # Step 2(a) of the parametric Bootstrap algorithm
 sampleB_ZZ = gbnbs(gamma_0, gamma_1, gammaZZ_2, n, v)
 sampleB_MM = gbnbs(gamma_0, gamma_1, gammaMM_2, n, v)
 sampleB_ML = gbnbs(gamma_0, gamma_1, gammaML_2, n, v)
 dataB_ZZ = data.frame(x = sampleB_ZZ$x, y = sampleB_ZZ$y)
 dataB_MM = data.frame(x = sampleB_MM$x, y = sampleB_MM$y)
 dataB_ML = data.frame(x = sampleB_ML$x, y = sampleB_ML$y)
 freqB_ZZ = table(dataB_ZZ$x,dataB_ZZ$y)
 freqB_MM = table(dataB_MM$x,dataB_MM$y)
 freqB_ML = table(dataB_ML$x,dataB_ML$y)
 # The Bootstrap parameters are estimated: gammaB
 gammaB_ZZ_0 = mean(dataB_ZZ$x)/v
 gammaB_ZZ_1 = mean(dataB_ZZ$y)/v
 gammaB_MM_0 = mean(dataB_MM$x)/v
 gammaB_MM_1 = mean(dataB_MM$y)/v
 gammaB_ML_0 = mean(dataB_ML$x)/v
 gammaB_ML_1 = mean(dataB_ML$y)/v
 # The third parameter is estimated using the three estimation methods
gammaB_ZZ_2 = MZ(g_0 = gammaB_ZZ_0,g_1 = gammaB_ZZ_1,table = freqB_ZZ,n = n,v = v)
gammaB_MM_2 = MM(g_0 = gammaB_MM_0,g_1 = gammaB_MM_1,table = freqB_MM,n = n,v = v)
gammaB_ML_2 = ML(g_0 = gammaB_ML_0,g_1 = gammaB_ML_1,table = freqB_ML,n = n,v = v)
 # Step 2(b) of the parametric Bootstrap algorithm
 B_boot_ZZ[k] = B_est(gammaB_ZZ_0,gammaB_ZZ_1,gammaB_ZZ_2,dataB_ZZ$x,dataB_ZZ$y,n,v)
 B_boot_MM[k] = B_est(gammaB_MM_0,gammaB_MM_1,gammaB_MM_2,dataB_MM$x,dataB_MM$y,n,v)
 B_boot_ML[k] = B_est(gammaB_ML_0,gammaB_ML_1,gammaB_ML_2,dataB_ML$x,dataB_ML$y,n,v)
}
# Step 3 of the parametric Bootstrap algorithm
# The p-value is approximated by the proportion of bootstrap statistics
# that exceed the observed value
vpZZ_B = mean(B_boot_ZZ > B_obs_ZZ)
vpMM_B = mean(B_boot_MM > B_obs_MM)
vpML_B = mean(B_boot_ML > B_obs_ML)

Appendix B

This section displays the graphs that complement the numerical tables presented in Section 6.1.
Note that in Figure A1, Figure A2, Figure A3, Figure A4, Figure A5, Figure A6, Figure A7, Figure A8, Figure A9, Figure A10, Figure A11, Figure A12, Figure A13, Figure A14, Figure A15, Figure A16, Figure A17, and Figure A18 presented below, the areas with a pink background highlight the convergence of the method at the nominal level α = 0.05 represented by the blue dashed line. On the other hand, the areas with an orange background emphasize the convergence of the method at the nominal level α = 0.10 described by the blue dashed line.
Figure A1. Simulation results for the probability of type I error for γ = (0.29, 0.29, 0.009425).
Figure A2. Simulation results for the probability of type I error for γ = (0.30, 0.30, 0.007500).
Figure A3. Simulation results for the probability of type I error for γ = (0.32, 0.32, 0.003200).
Figure A4. Simulation results for the probability of type I error for γ = (0.29, 0.30, 0.008492).
Figure A5. Simulation results for the probability of type I error for γ = (0.29, 0.31, 0.007543).
Figure A6. Simulation results for the probability of type I error for γ = (0.30, 0.31, 0.006492).
Figure A7. Simulation results for the probability of type I error for γ = (0.29, 0.29, 0.102950).
Figure A8. Simulation results for the probability of type I error for γ = (0.30, 0.30, 0.105000).
Figure A9. Simulation results for the probability of type I error for γ = (0.32, 0.32, 0.108800).
Figure A10. Simulation results for the probability of type I error for γ = (0.29, 0.31, 0.104990).
Figure A11. Simulation results for the probability of type I error for γ = (0.30, 0.31, 0.105984).
Figure A12. Simulation results for the probability of type I error for γ = (0.30, 0.32, 0.106939).
Figure A13. Simulation results for the probability of type I error for γ = (0.29, 0.29, 0.196475).
Figure A14. Simulation results for the probability of type I error for γ = (0.30, 0.30, 0.202500).
Figure A15. Simulation results for the probability of type I error for γ = (0.31, 0.31, 0.208475).
Figure A16. Simulation results for the probability of type I error for γ = (0.29, 0.30, 0.199475).
Figure A17. Simulation results for the probability of type I error for γ = (0.30, 0.31, 0.205476).
Figure A18. Simulation results for the probability of type I error for γ = (0.30, 0.32, 0.208408).

References

  1. Shi, P.; Valdez, E. Multivariate Negative Binomial Models for Insurance Claim Counts. Insur. Math. Econ. 2014, 55, 18–29. [Google Scholar] [CrossRef]
  2. Van Gemert, D.; Van Ophem, J.C.M. Modelling the Scores of Premier League Football Matches. In Economics, Management and Optimization in Sports; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  3. Krkošek, M.; Connors, B.M.; Lewis, M.A.; Poulin, R. Allee effects may slow the spread of parasites in a coastal marine ecosystem. Am. Nat. 2012, 179, 401–412. [Google Scholar] [CrossRef] [PubMed]
  4. Arbous, A.G.; Kerrich, J.E. Accident Statistics and the Concept of Accident-Proneness. Biometrics 1951, 7, 340–432. [Google Scholar] [CrossRef]
  5. Edwards, C.B.; Gurland, J. A Class of Distributions Applicable to Accidents. J. Am. Stat. Assoc. 1961, 56, 503–517. [Google Scholar] [CrossRef]
  6. Holgate, P. Bivariate generalizations of Neyman’s Type A distribution. Biometrika 1966, 53, 241–245. [Google Scholar] [CrossRef]
  7. Maher, M.J. A bivariate negative binomial model to explain traffic accident migration. Accid. Anal. Prev. 1990, 22, 487–498. [Google Scholar] [CrossRef]
  8. Kopocińska, I. Bivariate negative binomial distribution of the Marshall-Olkin type. Appl. Math. 1999, 25, 457–461. [Google Scholar] [CrossRef]
  9. González-Albornoz, P.; Novoa-Muñoz, F. Goodness-of-Fit Test for the Bivariate Hermite Distribution. Axioms 2023, 12, 7. [Google Scholar] [CrossRef]
  10. Heller, B. A Goodness-of-Fit test for the Negative Binomial Distribution applicable to Large Sets of Small Samples. Dev. Water Sci. 1986, 27, 215–220. [Google Scholar] [CrossRef]
  11. Beltrán-Beltrán, J.I.; O’Reilly, F.J. On goodness of fit tests for the Poisson, negative binomial and binomial distributions. Stat. Pap. 2019, 60, 1–18. [Google Scholar] [CrossRef]
  12. Meintanis, S.G. A new goodness-of-fit test for certain bivariate distributions applicable to traffic accidents. Stat. Methodol. 2007, 4, 22–34. [Google Scholar] [CrossRef]
  13. Hudecová, Š.; Hušková, M.; Meintanis, S.G. Goodness-of-Fit Tests for Bivariate Time Series of Counts. Econometrics 2021, 9, 10. [Google Scholar] [CrossRef]
  14. Wang, H.; Weiß, C.H.; Zhang, M. Goodness-of-fit testing in bivariate count time series based on a bivariate dispersion index. AStA Adv. Stat. Anal. 2024. [Google Scholar] [CrossRef]
  15. Novoa-Muñoz, F. Goodness-of-fit tests for the bivariate Poisson distribution. Commun. Stat. Simul. Comput. 2019, 50, 1998–2014. [Google Scholar] [CrossRef]
  16. Novoa-Muñoz, F.; Jiménez-Gamero, M.D. A goodness-of-fit test for the multivariate Poisson distribution. SORT 2016, 40, 113–138. [Google Scholar]
  17. Novoa-Muñoz, F.; Jiménez-Gamero, M.D. Testing for the bivariate Poisson distribution. Metrika 2014, 77, 771–793. [Google Scholar] [CrossRef]
  18. Novoa-Muñoz, F. Implementation of a Parallel Algorithm to Simulate the Type I Error Probability. Mathematics 2024, 12, 1686. [Google Scholar] [CrossRef]
  19. Kocherlakota, S.; Kocherlakota, K. Bivariate Discrete Distributions; John Wiley & Sons: Hoboken, NJ, USA, 1992. [Google Scholar]
  20. Subrahmaniam, K. A test for “intrinsic correlation” in the Theory of Accident Proneness. J. R. Stat. Soc. Ser. B Methodol. 1966, 28, 180–189. Available online: http://www.jstor.org/stable/2984284 (accessed on 12 February 2023). [CrossRef]
  21. Min, C.; Ong, S.; Srivastava, H.M. A class of bivariate negative binomial distributions with different index parameters in the marginals. Appl. Math. Comput. 2010, 217, 3069–3087. [Google Scholar] [CrossRef]
  22. Baringhaus, L.; Henze, N. Cramér-von Mises distance: Probabilistic interpretation, confidence intervals, and neighborhood-of-model validation. J. Nonparametr. Stat. 2017, 29, 167–188. [Google Scholar] [CrossRef]
  23. Papageorgiou, H.; Loukas, S. Conditional even point estimation for bivariate discrete distributions. Commun. Stat. Theory Methods 1988, 17, 3403–3412. [Google Scholar] [CrossRef]
  24. Serfling, R.J. Approximation Theorems of Mathematical Statistics; Wiley: New York, NY, USA, 1980. [Google Scholar]
  25. Kundu, S.; Majumdar, S.; Mukherjee, K. Central limits theorems revisited. Stat. Prob. Lett. 2000, 47, 265–275. [Google Scholar] [CrossRef]
  26. Subrahmaniam, K.; Subrahmaniam, K. On the Estimation of the Parameters in the Bivariate Negative Binomial Distribution. J. R. Stat. Soc. Ser. B Methodol. 1973, 35, 131–146. [Google Scholar] [CrossRef]
  27. Lehmann, E.L.; Romano, J.P. Testing Statistical Hypotheses; Springer: New York, NY, USA, 2005. [Google Scholar]
  28. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021; Available online: https://www.R-project.org/ (accessed on 10 January 2021).
  29. Loukas, S.; Kemp, C.D. The Index of Dispersion Test for the Bivariate Poisson Distribution. Biometrics 1986, 47, 941–948. [Google Scholar] [CrossRef]
  30. Rayner, J.C.W.; Best, D.J. Smooth Tests for the Bivariate Poisson Distribution. Aust. N. Z. J. Stat. 1995, 47, 233–245. [Google Scholar] [CrossRef]
  31. Gürtler, N.; Henze, N. Recent and classical goodness-of-fit tests for the Poisson distribution. J. Stat. Planning and Inference 2000, 90, 207–225. [Google Scholar] [CrossRef]
  32. Available online: https://www.nlhpc.cl (accessed on 15 January 2021).
  33. Dunn, J.E. Characterization of the Bivariate Negative Binomial Distribution. J. Ark. Acad. Sci. 1967, 21, 77–86. [Google Scholar]
  34. Gillings, D.B. Some further results for bivariate generalizations of the Neyman type A distribution. Biometrics 1974, 30, 619–628. [Google Scholar] [CrossRef]
  35. Crockett, N.G. A quick test of fit of a bivariate distribution. In Interactive Statistics; McNeil, D., Ed.; Elsevier: Amsterdam, The Netherlands, 1979; pp. 185–191. [Google Scholar]
  36. Available online: https://www.kaggle.com/ (accessed on 14 December 2024).
Table 1. Simulation results for the probability of type I error for ρ = 0.25 and E(X_1) = E(X_2).

                        n = 30              n = 50              n = 70
                    α = 0.05  α = 0.10  α = 0.05  α = 0.10  α = 0.05  α = 0.10
γ = (0.29, 0.29, 0.009425)
B_{n,(0,0)}(γ˜)       0.033     0.080     0.036     0.087     0.061     0.110
B_{n,(1,0)}(γ˜)       0.034     0.081     0.035     0.088     0.060     0.109
B_{n,(0,1)}(γ˜)       0.032     0.080     0.036     0.087     0.058     0.111
B_{n,(0,0)}(γ˜˜)      0.038     0.083     0.042     0.089     0.056     0.107
B_{n,(1,0)}(γ˜˜)      0.039     0.083     0.043     0.090     0.057     0.108
B_{n,(0,1)}(γ˜˜)      0.040     0.082     0.044     0.090     0.057     0.109
B_{n,(0,0)}(γ˜˜˜)     0.042     0.088     0.045     0.095     0.051     0.102
B_{n,(1,0)}(γ˜˜˜)     0.041     0.090     0.047     0.097     0.050     0.101
B_{n,(0,1)}(γ˜˜˜)     0.042     0.087     0.046     0.097     0.051     0.102
γ = (0.30, 0.30, 0.007500)
B_{n,(0,0)}(γ˜)       0.030     0.084     0.035     0.085     0.061     0.112
B_{n,(1,0)}(γ˜)       0.034     0.083     0.035     0.087     0.062     0.113
B_{n,(0,1)}(γ˜)       0.033     0.080     0.038     0.089     0.059     0.112
B_{n,(0,0)}(γ˜˜)      0.037     0.086     0.042     0.091     0.056     0.107
B_{n,(1,0)}(γ˜˜)      0.038     0.085     0.041     0.093     0.057     0.106
B_{n,(0,1)}(γ˜˜)      0.037     0.085     0.044     0.093     0.055     0.105
B_{n,(0,0)}(γ˜˜˜)     0.041     0.090     0.047     0.095     0.052     0.102
B_{n,(1,0)}(γ˜˜˜)     0.040     0.087     0.048     0.094     0.051     0.102
B_{n,(0,1)}(γ˜˜˜)     0.040     0.092     0.049     0.095     0.050     0.101
γ = (0.32, 0.32, 0.003200)
B_{n,(0,0)}(γ˜)       0.030     0.083     0.037     0.086     0.060     0.108
B_{n,(1,0)}(γ˜)       0.031     0.082     0.035     0.089     0.061     0.109
B_{n,(0,1)}(γ˜)       0.032     0.081     0.036     0.087     0.060     0.107
B_{n,(0,0)}(γ˜˜)      0.039     0.087     0.041     0.091     0.058     0.107
B_{n,(1,0)}(γ˜˜)      0.037     0.087     0.043     0.093     0.057     0.105
B_{n,(0,1)}(γ˜˜)      0.036     0.086     0.043     0.093     0.055     0.104
B_{n,(0,0)}(γ˜˜˜)     0.040     0.089     0.045     0.096     0.053     0.102
B_{n,(1,0)}(γ˜˜˜)     0.042     0.091     0.046     0.095     0.053     0.103
B_{n,(0,1)}(γ˜˜˜)     0.042     0.093     0.045     0.096     0.052     0.101
Table 2. Simulation results for the probability of type I error for ρ ≈ 0.25 and E(X_1) ≠ E(X_2).

                        n = 30              n = 50              n = 70
                    α = 0.05  α = 0.10  α = 0.05  α = 0.10  α = 0.05  α = 0.10
γ = (0.29, 0.30, 0.008492)
B_{n,(0,0)}(γ˜)       0.031     0.079     0.039     0.085     0.058     0.113
B_{n,(1,0)}(γ˜)       0.034     0.081     0.039     0.087     0.056     0.111
B_{n,(0,1)}(γ˜)       0.033     0.075     0.035     0.084     0.057     0.114
B_{n,(0,0)}(γ˜˜)      0.039     0.086     0.042     0.089     0.056     0.110
B_{n,(1,0)}(γ˜˜)      0.037     0.085     0.041     0.089     0.056     0.109
B_{n,(0,1)}(γ˜˜)      0.039     0.085     0.042     0.088     0.055     0.110
B_{n,(0,0)}(γ˜˜˜)     0.041     0.092     0.046     0.096     0.053     0.101
B_{n,(1,0)}(γ˜˜˜)     0.043     0.093     0.045     0.095     0.052     0.103
B_{n,(0,1)}(γ˜˜˜)     0.040     0.090     0.044     0.096     0.052     0.100
γ = (0.29, 0.31, 0.007543)
B_{n,(0,0)}(γ˜)       0.032     0.079     0.035     0.083     0.059     0.115
B_{n,(1,0)}(γ˜)       0.033     0.083     0.035     0.085     0.058     0.112
B_{n,(0,1)}(γ˜)       0.031     0.077     0.033     0.081     0.059     0.116
B_{n,(0,0)}(γ˜˜)      0.037     0.081     0.040     0.087     0.057     0.110
B_{n,(1,0)}(γ˜˜)      0.036     0.087     0.043     0.089     0.055     0.109
B_{n,(0,1)}(γ˜˜)      0.038     0.085     0.044     0.090     0.056     0.108
B_{n,(0,0)}(γ˜˜˜)     0.041     0.093     0.044     0.095     0.053     0.103
B_{n,(1,0)}(γ˜˜˜)     0.042     0.090     0.045     0.094     0.053     0.102
B_{n,(0,1)}(γ˜˜˜)     0.041     0.089     0.046     0.092     0.050     0.100
γ = (0.30, 0.31, 0.006492)
B_{n,(0,0)}(γ˜)       0.031     0.081     0.037     0.084     0.060     0.112
B_{n,(1,0)}(γ˜)       0.033     0.078     0.038     0.084     0.059     0.112
B_{n,(0,1)}(γ˜)       0.033     0.076     0.037     0.086     0.060     0.110
B_{n,(0,0)}(γ˜˜)      0.035     0.081     0.041     0.086     0.055     0.109
B_{n,(1,0)}(γ˜˜)      0.038     0.083     0.043     0.087     0.055     0.107
B_{n,(0,1)}(γ˜˜)      0.035     0.082     0.041     0.084     0.056     0.111
B_{n,(0,0)}(γ˜˜˜)     0.042     0.086     0.044     0.092     0.054     0.102
B_{n,(1,0)}(γ˜˜˜)     0.040     0.085     0.045     0.095     0.051     0.101
B_{n,(0,1)}(γ˜˜˜)     0.043     0.085     0.045     0.095     0.052     0.101
Table 3. Simulation results for the probability of type I error for ρ = 0.50 and E(X_1) = E(X_2).

                        n = 30              n = 50              n = 70
                    α = 0.05  α = 0.10  α = 0.05  α = 0.10  α = 0.05  α = 0.10
γ = (0.29, 0.29, 0.102950)
B_{n,(0,0)}(γ˜)       0.033     0.075     0.035     0.083     0.060     0.115
B_{n,(1,0)}(γ˜)       0.031     0.083     0.037     0.084     0.059     0.112
B_{n,(0,1)}(γ˜)       0.033     0.082     0.038     0.086     0.058     0.113
B_{n,(0,0)}(γ˜˜)      0.038     0.082     0.043     0.084     0.055     0.112
B_{n,(1,0)}(γ˜˜)      0.039     0.083     0.044     0.087     0.055     0.111
B_{n,(0,1)}(γ˜˜)      0.039     0.085     0.042     0.089     0.055     0.111
B_{n,(0,0)}(γ˜˜˜)     0.043     0.088     0.045     0.095     0.052     0.104
B_{n,(1,0)}(γ˜˜˜)     0.040     0.090     0.047     0.096     0.051     0.102
B_{n,(0,1)}(γ˜˜˜)     0.043     0.093     0.045     0.097     0.052     0.100
γ = (0.30, 0.30, 0.105000)
B_{n,(0,0)}(γ˜)       0.034     0.074     0.038     0.084     0.059     0.114
B_{n,(1,0)}(γ˜)       0.033     0.075     0.037     0.085     0.060     0.112
B_{n,(0,1)}(γ˜)       0.034     0.076     0.041     0.084     0.058     0.114
B_{n,(0,0)}(γ˜˜)      0.038     0.083     0.044     0.086     0.054     0.109
B_{n,(1,0)}(γ˜˜)      0.039     0.085     0.042     0.087     0.056     0.111
B_{n,(0,1)}(γ˜˜)      0.038     0.082     0.042     0.086     0.055     0.110
B_{n,(0,0)}(γ˜˜˜)     0.041     0.089     0.046     0.095     0.053     0.103
B_{n,(1,0)}(γ˜˜˜)     0.043     0.091     0.044     0.095     0.053     0.103
B_{n,(0,1)}(γ˜˜˜)     0.041     0.093     0.045     0.096     0.050     0.102
γ = (0.32, 0.32, 0.108800)
B_{n,(0,0)}(γ˜)       0.032     0.080     0.039     0.084     0.060     0.113
B_{n,(1,0)}(γ˜)       0.035     0.076     0.036     0.083     0.061     0.114
B_{n,(0,1)}(γ˜)       0.030     0.078     0.038     0.084     0.060     0.113
B_{n,(0,0)}(γ˜˜)      0.037     0.081     0.042     0.087     0.057     0.109
B_{n,(1,0)}(γ˜˜)      0.039     0.084     0.042     0.086     0.056     0.109
B_{n,(0,1)}(γ˜˜)      0.037     0.082     0.041     0.085     0.057     0.108
B_{n,(0,0)}(γ˜˜˜)     0.041     0.087     0.044     0.095     0.051     0.100
B_{n,(1,0)}(γ˜˜˜)     0.042     0.091     0.045     0.094     0.053     0.104
B_{n,(0,1)}(γ˜˜˜)     0.040     0.093     0.046     0.095     0.051     0.103
Table 4. Simulation results for the probability of type I error for ρ = 0.50 and E(X₁) ≠ E(X₂).

| Statistic | n = 30, α = 0.05 | n = 30, α = 0.10 | n = 50, α = 0.05 | n = 50, α = 0.10 | n = 70, α = 0.05 | n = 70, α = 0.10 |
|---|---|---|---|---|---|---|
| γ = (0.29, 0.31, 0.104990) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.032 | 0.078 | 0.038 | 0.083 | 0.060 | 0.114 |
| B_{n,(1,0)}(γ˜) | 0.032 | 0.076 | 0.037 | 0.084 | 0.060 | 0.113 |
| B_{n,(0,1)}(γ˜) | 0.034 | 0.081 | 0.037 | 0.083 | 0.059 | 0.113 |
| B_{n,(0,0)}(γ˜˜) | 0.038 | 0.080 | 0.041 | 0.084 | 0.057 | 0.111 |
| B_{n,(1,0)}(γ˜˜) | 0.039 | 0.081 | 0.043 | 0.085 | 0.055 | 0.110 |
| B_{n,(0,1)}(γ˜˜) | 0.038 | 0.082 | 0.042 | 0.087 | 0.056 | 0.109 |
| B_{n,(0,0)}(γ˜˜˜) | 0.041 | 0.086 | 0.046 | 0.096 | 0.053 | 0.102 |
| B_{n,(1,0)}(γ˜˜˜) | 0.043 | 0.090 | 0.044 | 0.095 | 0.053 | 0.103 |
| B_{n,(0,1)}(γ˜˜˜) | 0.043 | 0.091 | 0.047 | 0.094 | 0.051 | 0.103 |
| γ = (0.30, 0.31, 0.105984) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.031 | 0.072 | 0.037 | 0.083 | 0.059 | 0.115 |
| B_{n,(1,0)}(γ˜) | 0.033 | 0.079 | 0.035 | 0.083 | 0.060 | 0.115 |
| B_{n,(0,1)}(γ˜) | 0.033 | 0.074 | 0.038 | 0.085 | 0.058 | 0.114 |
| B_{n,(0,0)}(γ˜˜) | 0.037 | 0.080 | 0.042 | 0.084 | 0.056 | 0.109 |
| B_{n,(1,0)}(γ˜˜) | 0.037 | 0.081 | 0.043 | 0.085 | 0.055 | 0.111 |
| B_{n,(0,1)}(γ˜˜) | 0.039 | 0.082 | 0.041 | 0.087 | 0.056 | 0.109 |
| B_{n,(0,0)}(γ˜˜˜) | 0.043 | 0.087 | 0.045 | 0.094 | 0.052 | 0.103 |
| B_{n,(1,0)}(γ˜˜˜) | 0.040 | 0.090 | 0.044 | 0.093 | 0.053 | 0.103 |
| B_{n,(0,1)}(γ˜˜˜) | 0.040 | 0.087 | 0.047 | 0.095 | 0.051 | 0.101 |
| γ = (0.30, 0.32, 0.106939) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.031 | 0.079 | 0.034 | 0.081 | 0.061 | 0.115 |
| B_{n,(1,0)}(γ˜) | 0.032 | 0.076 | 0.035 | 0.083 | 0.060 | 0.112 |
| B_{n,(0,1)}(γ˜) | 0.032 | 0.077 | 0.035 | 0.082 | 0.061 | 0.114 |
| B_{n,(0,0)}(γ˜˜) | 0.039 | 0.082 | 0.042 | 0.087 | 0.054 | 0.111 |
| B_{n,(1,0)}(γ˜˜) | 0.038 | 0.082 | 0.041 | 0.088 | 0.053 | 0.106 |
| B_{n,(0,1)}(γ˜˜) | 0.038 | 0.081 | 0.043 | 0.088 | 0.054 | 0.107 |
| B_{n,(0,0)}(γ˜˜˜) | 0.042 | 0.088 | 0.046 | 0.095 | 0.051 | 0.104 |
| B_{n,(1,0)}(γ˜˜˜) | 0.041 | 0.086 | 0.044 | 0.096 | 0.050 | 0.102 |
| B_{n,(0,1)}(γ˜˜˜) | 0.041 | 0.088 | 0.044 | 0.095 | 0.053 | 0.103 |
Table 5. Simulation results for the probability of type I error for ρ = 0.75 and E(X₁) = E(X₂).

| Statistic | n = 30, α = 0.05 | n = 30, α = 0.10 | n = 50, α = 0.05 | n = 50, α = 0.10 | n = 70, α = 0.05 | n = 70, α = 0.10 |
|---|---|---|---|---|---|---|
| γ = (0.29, 0.29, 0.196475) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.034 | 0.085 | 0.038 | 0.089 | 0.059 | 0.110 |
| B_{n,(1,0)}(γ˜) | 0.033 | 0.084 | 0.036 | 0.088 | 0.060 | 0.111 |
| B_{n,(0,1)}(γ˜) | 0.032 | 0.084 | 0.037 | 0.087 | 0.058 | 0.111 |
| B_{n,(0,0)}(γ˜˜) | 0.037 | 0.085 | 0.042 | 0.089 | 0.057 | 0.107 |
| B_{n,(1,0)}(γ˜˜) | 0.038 | 0.086 | 0.045 | 0.090 | 0.054 | 0.104 |
| B_{n,(0,1)}(γ˜˜) | 0.037 | 0.084 | 0.043 | 0.089 | 0.056 | 0.109 |
| B_{n,(0,0)}(γ˜˜˜) | 0.044 | 0.088 | 0.047 | 0.095 | 0.051 | 0.104 |
| B_{n,(1,0)}(γ˜˜˜) | 0.041 | 0.090 | 0.044 | 0.096 | 0.052 | 0.102 |
| B_{n,(0,1)}(γ˜˜˜) | 0.042 | 0.093 | 0.045 | 0.097 | 0.052 | 0.100 |
| γ = (0.30, 0.30, 0.202500) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.035 | 0.076 | 0.039 | 0.086 | 0.058 | 0.110 |
| B_{n,(1,0)}(γ˜) | 0.034 | 0.075 | 0.039 | 0.085 | 0.059 | 0.111 |
| B_{n,(0,1)}(γ˜) | 0.034 | 0.075 | 0.040 | 0.084 | 0.057 | 0.112 |
| B_{n,(0,0)}(γ˜˜) | 0.036 | 0.083 | 0.043 | 0.087 | 0.054 | 0.108 |
| B_{n,(1,0)}(γ˜˜) | 0.038 | 0.085 | 0.044 | 0.088 | 0.055 | 0.109 |
| B_{n,(0,1)}(γ˜˜) | 0.039 | 0.086 | 0.044 | 0.089 | 0.055 | 0.107 |
| B_{n,(0,0)}(γ˜˜˜) | 0.043 | 0.090 | 0.046 | 0.095 | 0.052 | 0.101 |
| B_{n,(1,0)}(γ˜˜˜) | 0.043 | 0.091 | 0.046 | 0.095 | 0.052 | 0.102 |
| B_{n,(0,1)}(γ˜˜˜) | 0.042 | 0.092 | 0.045 | 0.096 | 0.051 | 0.101 |
| γ = (0.31, 0.31, 0.208475) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.033 | 0.078 | 0.039 | 0.085 | 0.059 | 0.112 |
| B_{n,(1,0)}(γ˜) | 0.035 | 0.079 | 0.039 | 0.086 | 0.060 | 0.113 |
| B_{n,(0,1)}(γ˜) | 0.033 | 0.078 | 0.038 | 0.084 | 0.050 | 0.112 |
| B_{n,(0,0)}(γ˜˜) | 0.038 | 0.084 | 0.043 | 0.088 | 0.054 | 0.107 |
| B_{n,(1,0)}(γ˜˜) | 0.040 | 0.085 | 0.045 | 0.089 | 0.053 | 0.108 |
| B_{n,(0,1)}(γ˜˜) | 0.040 | 0.086 | 0.046 | 0.090 | 0.052 | 0.106 |
| B_{n,(0,0)}(γ˜˜˜) | 0.042 | 0.089 | 0.046 | 0.095 | 0.051 | 0.100 |
| B_{n,(1,0)}(γ˜˜˜) | 0.042 | 0.091 | 0.046 | 0.094 | 0.052 | 0.101 |
| B_{n,(0,1)}(γ˜˜˜) | 0.041 | 0.092 | 0.044 | 0.096 | 0.050 | 0.100 |
Table 6. Simulation results for the probability of type I error for ρ = 0.75 and E(X₁) ≠ E(X₂).

| Statistic | n = 30, α = 0.05 | n = 30, α = 0.10 | n = 50, α = 0.05 | n = 50, α = 0.10 | n = 70, α = 0.05 | n = 70, α = 0.10 |
|---|---|---|---|---|---|---|
| γ = (0.29, 0.30, 0.199475) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.031 | 0.084 | 0.037 | 0.089 | 0.060 | 0.109 |
| B_{n,(1,0)}(γ˜) | 0.030 | 0.083 | 0.036 | 0.088 | 0.061 | 0.108 |
| B_{n,(0,1)}(γ˜) | 0.030 | 0.084 | 0.036 | 0.089 | 0.059 | 0.109 |
| B_{n,(0,0)}(γ˜˜) | 0.036 | 0.084 | 0.044 | 0.089 | 0.057 | 0.106 |
| B_{n,(1,0)}(γ˜˜) | 0.037 | 0.086 | 0.045 | 0.091 | 0.056 | 0.105 |
| B_{n,(0,1)}(γ˜˜) | 0.037 | 0.085 | 0.045 | 0.089 | 0.055 | 0.107 |
| B_{n,(0,0)}(γ˜˜˜) | 0.043 | 0.089 | 0.047 | 0.094 | 0.052 | 0.104 |
| B_{n,(1,0)}(γ˜˜˜) | 0.042 | 0.090 | 0.045 | 0.096 | 0.053 | 0.102 |
| B_{n,(0,1)}(γ˜˜˜) | 0.042 | 0.092 | 0.045 | 0.096 | 0.053 | 0.101 |
| γ = (0.30, 0.31, 0.205476) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.034 | 0.076 | 0.039 | 0.085 | 0.058 | 0.111 |
| B_{n,(1,0)}(γ˜) | 0.034 | 0.075 | 0.038 | 0.086 | 0.057 | 0.110 |
| B_{n,(0,1)}(γ˜) | 0.035 | 0.075 | 0.039 | 0.086 | 0.058 | 0.111 |
| B_{n,(0,0)}(γ˜˜) | 0.037 | 0.084 | 0.043 | 0.087 | 0.054 | 0.109 |
| B_{n,(1,0)}(γ˜˜) | 0.039 | 0.086 | 0.044 | 0.089 | 0.055 | 0.108 |
| B_{n,(0,1)}(γ˜˜) | 0.039 | 0.086 | 0.044 | 0.089 | 0.054 | 0.108 |
| B_{n,(0,0)}(γ˜˜˜) | 0.044 | 0.091 | 0.046 | 0.096 | 0.052 | 0.101 |
| B_{n,(1,0)}(γ˜˜˜) | 0.043 | 0.091 | 0.045 | 0.095 | 0.053 | 0.102 |
| B_{n,(0,1)}(γ˜˜˜) | 0.044 | 0.092 | 0.045 | 0.096 | 0.052 | 0.101 |
| γ = (0.30, 0.32, 0.208408) | | | | | | |
| B_{n,(0,0)}(γ˜) | 0.035 | 0.079 | 0.039 | 0.085 | 0.058 | 0.112 |
| B_{n,(1,0)}(γ˜) | 0.035 | 0.080 | 0.039 | 0.087 | 0.059 | 0.110 |
| B_{n,(0,1)}(γ˜) | 0.034 | 0.079 | 0.039 | 0.085 | 0.056 | 0.112 |
| B_{n,(0,0)}(γ˜˜) | 0.037 | 0.084 | 0.044 | 0.088 | 0.054 | 0.106 |
| B_{n,(1,0)}(γ˜˜) | 0.037 | 0.084 | 0.044 | 0.088 | 0.054 | 0.109 |
| B_{n,(0,1)}(γ˜˜) | 0.038 | 0.085 | 0.045 | 0.089 | 0.053 | 0.107 |
| B_{n,(0,0)}(γ˜˜˜) | 0.042 | 0.090 | 0.046 | 0.094 | 0.051 | 0.100 |
| B_{n,(1,0)}(γ˜˜˜) | 0.042 | 0.091 | 0.046 | 0.094 | 0.052 | 0.101 |
| B_{n,(0,1)}(γ˜˜˜) | 0.043 | 0.092 | 0.047 | 0.096 | 0.050 | 0.101 |
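Empirical rejection rates such as those reported in Tables 3–6 come from a nested Monte Carlo loop: draw a sample under the null hypothesis, calibrate the test by bootstrap, and record whether it rejects at the nominal level. The following is a generic, simplified sketch of that loop; the normal sampler, the placeholder statistic, and the direct resampling from the known null model are illustrative assumptions, not the paper's B_{n,a} statistic or its estimated-parameter bootstrap.

```python
import numpy as np

def empirical_type1_error(sampler, statistic, n, alpha,
                          n_mc=200, n_boot=100, rng=None):
    """Estimate the type I error of a bootstrap-calibrated test.

    sampler(n, rng)  -> a sample of size n drawn under the null
    statistic(x)     -> scalar test statistic (large values indicate lack of fit)

    Note: a full parametric bootstrap would estimate the parameters from
    each sample and resample from the fitted model; here we resample
    directly from the known null model to keep the sketch short.
    """
    rng = np.random.default_rng(rng)
    rejections = 0
    for _ in range(n_mc):
        x = sampler(n, rng)
        t_obs = statistic(x)
        # Bootstrap replicates of the statistic under the null
        t_boot = np.array([statistic(sampler(n, rng)) for _ in range(n_boot)])
        # Smoothed bootstrap p-value; reject when it falls below alpha
        p_value = (1 + np.sum(t_boot >= t_obs)) / (1 + n_boot)
        rejections += (p_value <= alpha)
    return rejections / n_mc
```

With a well-calibrated statistic, the returned rate should approach the nominal α as n and the simulation sizes grow, which is the pattern the tables above display.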
Table 7. Simulation results for the power. The values are percentages rounded to the nearest integer.

| Alternative | B_{(0,0)} | B_{(1,0)} | B_{(0,1)} |
|---|---|---|---|
| BB(1; 0.41, 0.02, 0.01) | 95 | 91 | 88 |
| BB(1; 0.55, 0.03, 0.02) | 97 | 92 | 89 |
| BB(2; 0.42, 0.02, 0.01) | 98 | 94 | 90 |
| BB(2; 0.51, 0.01, 0.01) | 94 | 89 | 88 |
| BB(2; 0.71, 0.04, 0.03) | 92 | 86 | 85 |
| BH(0.99; 1, 0.66, 0.10, 0.10) | 96 | 87 | 92 |
| BH(1.00; 1, 0.30, 0.30, 0.25) | 95 | 86 | 90 |
| BH(1.00; 1, 0.31, 0.33, 0.24) | 92 | 86 | 86 |
| BH(1.30; 1, 0.61, 0.03, 0.02) | 93 | 84 | 85 |
| BH(1.40; 1, 1.00, 0.26, 0.12) | 95 | 86 | 88 |
| BNTA(0.15; 0.35, 0.33, 0.32) | 94 | 87 | 93 |
| BNTA(0.42; 0.32, 0.35, 0.33) | 92 | 89 | 91 |
| BNTA(0.50; 0.31, 0.34, 0.35) | 93 | 89 | 92 |
| BNTA(0.70; 0.35, 0.30, 0.33) | 91 | 88 | 92 |
| BNTA(0.75; 0.32, 0.34, 0.34) | 94 | 89 | 93 |
| BP(1.00, 1.00, 0.25) | 97 | 87 | 90 |
| BP(1.00, 1.00, 0.50) | 94 | 83 | 92 |
| BP(1.00, 1.00, 0.75) | 96 | 82 | 90 |
| BP(1.50, 1.00, 0.31) | 95 | 86 | 89 |
| BP(1.50, 1.00, 0.92) | 96 | 86 | 91 |
Table 8. p-values for the real datasets, rounded to three decimal places.

| Test Statistic | Biometric | Agronomic | Sales | Education | Insurance |
|---|---|---|---|---|---|
| B_{n,(0,0)} | 0.434 | 0.277 | 0.003 | 0.892 | 0.493 |
| B_{n,(1,0)} | 0.443 | 0.327 | 0.037 | 0.836 | 0.665 |
| B_{n,(0,1)} | 0.440 | 0.303 | 0.023 | 0.878 | 0.258 |
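The p-values in Table 8 are of the parametric-bootstrap type: fit the null model to the observed data, resample from the fitted model, and compare the resampled statistics with the observed one. A generic sketch of this recipe follows; the `fit`, `sample_fitted`, and `statistic` callbacks are placeholders for whatever estimator and goodness-of-fit statistic one uses (they are not the paper's BNBD estimators or its B_n statistic).

```python
import numpy as np

def bootstrap_pvalue(data, fit, sample_fitted, statistic,
                     n_boot=200, rng=None):
    """Generic parametric-bootstrap p-value.

    fit(x)                       -> parameter estimate under the null model
    sample_fitted(theta, n, rng) -> bootstrap sample of size n from the fitted model
    statistic(x, theta)          -> goodness-of-fit statistic (large => lack of fit)
    """
    rng = np.random.default_rng(rng)
    n = len(data)
    theta_hat = fit(data)
    t_obs = statistic(data, theta_hat)
    exceed = 0
    for _ in range(n_boot):
        xb = sample_fitted(theta_hat, n, rng)
        # Re-estimate on each bootstrap sample, as the parametric bootstrap requires
        exceed += statistic(xb, fit(xb)) >= t_obs
    # Smoothed Monte Carlo p-value
    return (1 + exceed) / (1 + n_boot)
```

As a toy illustration, one could test a univariate Poisson null with a dispersion-based statistic: `fit` returns the sample mean, `sample_fitted` draws Poisson variates, and `statistic` measures how far the variance-to-mean ratio is from 1. Small p-values, like the one for the Sales dataset above, indicate that the null model fits poorly.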