Quasi Association Models for Square Contingency Tables with Ordinal Categories

Fujisawa, Kengo; Tahata, Kouji

doi:10.3390/sym14040805


by Kengo Fujisawa * and Kouji Tahata

Faculty of Science and Technology, Tokyo University of Science, Chiba 278-8510, Japan

* Author to whom correspondence should be addressed.

Symmetry 2022, 14(4), 805; https://doi.org/10.3390/sym14040805

Submission received: 25 February 2022 / Revised: 5 April 2022 / Accepted: 11 April 2022 / Published: 12 April 2022

(This article belongs to the Special Issue Advances in Quasi-Symmetry Models)


Abstract:

The analysis of contingency tables focuses on a statistical model instead of independence when the independence between row and column variables does not hold. Many association models have been proposed to indicate the structure of odds ratios. Additionally, symmetry and asymmetry models have been proposed to analyze the cell probabilities of square contingency tables with symmetric or asymmetric structures. This paper proposes an asymmetry plus association model for square contingency tables with ordinal categories and partitioning of the test statistic for goodness-of-fit using our proposed model.

Keywords:

association model; asymmetry model; square contingency table

1. Introduction

A categorical variable distinguishes a set of categories. Such variables are employed in diverse fields such as the social sciences, medical sciences, engineering, and education. Here, we consider a categorical variable with r categories and another with c categories. The outcome for the two variables has $rc$ possible combinations, which can be displayed in a rectangular table with r rows and c columns, where the cells represent the $rc$ possible outcomes. This is called a contingency table (for more details, see [1,2]). A contingency table shows the joint frequencies for each combination of the two categorical variables. When analyzing a contingency table, only the observed frequencies are seen; the true distribution is unknown. One aim of the analysis is to estimate this unknown probability distribution from the observed frequencies. The distribution can generally be estimated more precisely when fewer parameters are used to describe the data, so a parsimonious model is often desirable. Traditionally, a contingency table is used to evaluate whether classifications are associated. That is, the analysis determines whether two variables are statistically independent.

If two variables take the same categorical values, the table is called a “square” contingency table. When the observed frequencies are concentrated in the main diagonal cells, the two variables are dependent. Even if the observations are not concentrated on the main diagonal, a table with one large frequency and several small frequencies in each row and each column shows a strong association between the categories of one variable and those of the other, and hence a strong dependence. This situation is common in real-world data. Since independence is infrequent and unrealistic, a suitable model for representing dependent data is important. Consequently, many statisticians consider statistical models other than the independence model and study estimation and hypothesis testing based on them.

When statistical independence between two variables does not hold, association models, which indicate the structure of odds ratios, have been considered to analyze contingency tables. On the other hand, symmetry or asymmetry models, which indicate the structure of ratios for cell probabilities in symmetric positions, are often used to analyze square contingency tables.

This study proposes a model with characteristics of both an association model and asymmetry model. Our model is more parsimonious than many association or asymmetry models. Hence, our model may better estimate the distribution than conventional association models and asymmetry models.

This paper is organized as follows. Section 2 introduces previous research and proposes an asymmetry plus association model. Section 3 gives a necessary and sufficient condition for the SQU model in terms of our model. Section 4 provides methods to evaluate model fit based on goodness-of-fit statistics and their partition. Section 5 presents an example, and Section 6 concludes this paper.

2. Models

For an $r \times r$ square contingency table with ordinal categories, let $\pi_{ij}$ denote the probability that an observation falls in the ith row and jth column of the table ($i = 1, \dots, r$; $j = 1, \dots, r$). Goodman [3,4,5] considered many association models for contingency tables. For example, the quasi-uniform association (QU) model is defined as

$$\pi_{ij} = \begin{cases} \mu \alpha_{i} \beta_{j} \theta^{ij} & (i \neq j), \\ \psi_{ii} & (i = j). \end{cases} \tag{1}$$

Without loss of generality, we impose $\alpha_{r} = \beta_{r} = 1$. The odds ratio for rows i and j ($> i$) and columns s and t ($> s$) is denoted by $\phi_{(ij;st)}$. That is,

$$\phi_{(ij;st)} = \frac{\pi_{is} \pi_{jt}}{\pi_{js} \pi_{it}}. \tag{2}$$

Using the odds ratios, the QU model can be expressed as

$$\phi_{(ij;st)} = \theta^{(j - i)(t - s)} \quad (i \neq s,\ i \neq t,\ j \neq s,\ j \neq t). \tag{3}$$
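As an illustration of how Equation (3) follows from Equation (1), the short sketch below builds an unnormalised QU table and checks every admissible odds ratio numerically. The parameter values and the helper names `qu_table` and `odds_ratio` are arbitrary choices made for this example and do not come from the paper.

```python
from itertools import product

def qu_table(r, mu, alpha, beta, theta, psi):
    """Unnormalised QU cells: mu*alpha_i*beta_j*theta**(i*j) off the diagonal, psi_i on it."""
    return [[psi[i - 1] if i == j
             else mu * alpha[i - 1] * beta[j - 1] * theta ** (i * j)
             for j in range(1, r + 1)]
            for i in range(1, r + 1)]

def odds_ratio(pi, i, j, s, t):
    """phi_(ij;st) = pi_is * pi_jt / (pi_js * pi_it), using 1-based indices."""
    return (pi[i - 1][s - 1] * pi[j - 1][t - 1]) / (pi[j - 1][s - 1] * pi[i - 1][t - 1])

# Hypothetical parameter values (alpha_r = beta_r = 1 as in the text).
r, theta = 5, 1.3
alpha = [0.7, 1.1, 0.9, 1.4, 1.0]
beta = [1.2, 0.8, 1.5, 0.6, 1.0]
psi = [2.0, 3.0, 1.0, 4.0, 2.5]
pi = qu_table(r, 1.0, alpha, beta, theta, psi)

checked = 0
for i, j, s, t in product(range(1, r + 1), repeat=4):
    # Equation (3) excludes any odds ratio that touches a diagonal cell.
    if i < j and s < t and len({i, j, s, t}) == 4:
        expected = theta ** ((j - i) * (t - s))
        assert abs(odds_ratio(pi, i, j, s, t) - expected) < 1e-9
        checked += 1
print(checked, "odds ratios agree with theta**((j-i)*(t-s))")
```

Normalising the table to sum to one is not needed here, because odds ratios are unchanged by a common scale factor.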

The QU model with $\theta = 1$ is the quasi-independence (QI) model (see p. 426 in Agresti [6]). That is,

$$\pi_{ij} = \begin{cases} \mu \alpha_{i} \beta_{j} & (i \neq j), \\ \psi_{ii} & (i = j). \end{cases} \tag{4}$$

On the other hand, many statisticians have analyzed square contingency tables using a symmetric or an asymmetric structure for the cell probabilities. Bowker [7] proposed the symmetry (S) model, which is defined as

$$\pi_{ij} = \psi_{ij} \quad (i = 1, \dots, r;\ j = 1, \dots, r), \tag{5}$$

where $\psi_{ij} = \psi_{ji}$. This model indicates the symmetric structure of the cell probabilities.

Stuart [8] proposed the marginal homogeneity (MH) model, which is defined as

$$\pi_{i+} = \pi_{+i} \quad (i = 1, \dots, r), \tag{6}$$

where $\pi_{i+} = \sum_{j=1}^{r} \pi_{ij}$ and $\pi_{+i} = \sum_{j=1}^{r} \pi_{ji}$. The MH model indicates that the row marginal distribution is identical to the column marginal distribution.

Caussinus [9] proposed the quasi-symmetry (QS) model, which is defined as

$$\pi_{ij} = \mu \alpha_{i} \beta_{j} \psi_{ij} \quad (i = 1, \dots, r;\ j = 1, \dots, r), \tag{7}$$

where $\psi_{ij} = \psi_{ji}$. This model is identical to the S model when $\alpha_{i} = \beta_{i}$. The QS model can be expressed as

$$\phi_{(ij;st)} = \phi_{(st;ij)} \quad (i < j;\ s < t). \tag{8}$$

The QS model indicates the symmetric structure of the odds ratios. The QU model implies the QS model. That is, the QS model holds when the QU model holds.

When the S model does not hold, asymmetry models, which impose weaker restrictions than the S model, have been proposed. For example, Tahata and Tomizawa [10] proposed the kth linear asymmetry (LS$_{k}$) model, which is defined for a fixed k $(k = 1, \dots, r - 1)$ as

$$\pi_{ij} = \mu \prod_{l=1}^{k} \alpha_{l}^{i^{l}} \beta_{l}^{j^{l}} \psi_{ij} \quad (i = 1, \dots, r;\ j = 1, \dots, r), \tag{9}$$

where $\psi_{ij} = \psi_{ji}$. Note that when $\alpha_{l} = \beta_{l}$, this model is the S model. As k increases, the LS$_{k}$ model becomes less restrictive, and the LS$_{r-1}$ model is the QS model. Namely, the LS$_{k}$ model is an intermediate model between the S model and the QS model. The LS$_{k}$ model can be expressed as

$$\frac{\pi_{ij}}{\pi_{ji}} = \prod_{l=1}^{k} \gamma_{l}^{j^{l} - i^{l}} \quad (i \neq j). \tag{10}$$

The LS$_{k}$ model includes the linear diagonals-parameter symmetry model [11] and the extended linear diagonals-parameter symmetry model [12].

Goodman [4] introduced the symmetry plus quasi-independence (SQI) model, which is defined as

$$\pi_{ij} = \begin{cases} \mu \alpha_{i} \alpha_{j} & (i \neq j), \\ \psi_{ii} & (i = j). \end{cases} \tag{11}$$

This model is a special case of the S model and of the QI model, obtained when $\psi_{ij} = \mu \alpha_{i} \alpha_{j}$ and $\alpha_{i} = \beta_{i}$ for $i \neq j$, respectively.

Yamamoto and Tomizawa [13] proposed the symmetry plus quasi-uniform association (SQU) model, which is defined as

$$\pi_{ij} = \begin{cases} \mu \alpha_{i} \alpha_{j} \theta^{ij} & (i \neq j), \\ \psi_{ii} & (i = j). \end{cases} \tag{12}$$

The SQU model implies the S model and the QU model. Note that the SQU model is identical to the SQI model when $\theta = 1$.

Association models and asymmetry models have been proposed independently. However, an asymmetry plus association model, which considers both the structure of asymmetry for cell probabilities and odds ratios, is rarely considered.

Here, we propose a new model defined for a fixed k $(k = 1, \dots, r - 1)$ as

$$\pi_{ij} = \begin{cases} \mu \alpha_{i} \alpha_{j} \prod_{l=1}^{k} \delta_{l}^{j^{l} - i^{l}} \theta^{ij} & (i \neq j), \\ \psi_{ii} & (i = j). \end{cases} \tag{13}$$

Without loss of generality, we set $\alpha_{r} = 1$. This model is called the kth linear asymmetry plus quasi-uniform association (LSQU$_{k}$) model. When $\theta = 1$, it is called the kth linear asymmetry plus quasi-independence (LSQI$_{k}$) model.

If the LSQU$_{k}$ model holds, then

$$\frac{\pi_{ij}}{\pi_{ji}} = \prod_{l=1}^{k} \delta_{l}^{2 (j^{l} - i^{l})} \quad (i \neq j). \tag{14}$$

Setting $\gamma_{l} = \delta_{l}^{2}$ in Equation (14) gives Equation (10), so the LS$_{k}$ model holds. Additionally,

$$\phi_{(ij;st)} = \theta^{(j - i)(t - s)} \quad (i \neq s,\ i \neq t,\ j \neq s,\ j \neq t). \tag{15}$$

Therefore, the LSQU$_{k}$ model shows characteristics of both the LS$_{k}$ model and the QU model.
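Both properties can be checked numerically. The sketch below, with made-up parameter values and a hypothetical helper `lsqu_table`, builds an unnormalised LSQU$_{2}$ table from Equation (13) and verifies Equations (14) and (15) cell by cell.

```python
import math
from itertools import product

def lsqu_table(r, mu, alpha, delta, theta, psi):
    """Unnormalised LSQU_k cells from Equation (13); k is the length of `delta`."""
    k = len(delta)

    def cell(i, j):
        if i == j:
            return psi[i - 1]
        asym = 1.0
        for l in range(1, k + 1):
            asym *= delta[l - 1] ** (j ** l - i ** l)
        return mu * alpha[i - 1] * alpha[j - 1] * asym * theta ** (i * j)

    return [[cell(i, j) for j in range(1, r + 1)] for i in range(1, r + 1)]

# Hypothetical parameter values for r = 5 and k = 2 (alpha_r = 1).
r, theta = 5, 1.2
alpha = [0.7, 1.1, 0.9, 1.4, 1.0]
delta = [0.8, 1.05]
psi = [2.0, 3.0, 1.0, 4.0, 2.5]
pi = lsqu_table(r, 1.0, alpha, delta, theta, psi)

# Equation (14): ratio of cells in symmetric positions.
for i, j in product(range(1, r + 1), repeat=2):
    if i != j:
        expected = 1.0
        for l, d in enumerate(delta, start=1):
            expected *= d ** (2 * (j ** l - i ** l))
        assert math.isclose(pi[i - 1][j - 1] / pi[j - 1][i - 1], expected, rel_tol=1e-9)

# Equation (15): odds ratios that avoid the diagonal depend on theta only.
for i, j, s, t in product(range(1, r + 1), repeat=4):
    if i < j and s < t and len({i, j, s, t}) == 4:
        phi = (pi[i - 1][s - 1] * pi[j - 1][t - 1]) / (pi[j - 1][s - 1] * pi[i - 1][t - 1])
        assert math.isclose(phi, theta ** ((j - i) * (t - s)), rel_tol=1e-9)
```

The asymmetry parameters cancel in every odds ratio because each index appears once in the numerator and once in the denominator, which is why only $\theta$ survives in the second check.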

The LSQU$_{k}$ model with $\delta_{l} = 1$ for $l = 1, \dots, k$ is the SQU model. When $k = r - 1$, the LSQU$_{k}$ model implies

$$\frac{\pi_{ij}}{\pi_{ji}} = \prod_{l=1}^{r-1} \frac{\gamma_{l}^{j^{l}}}{\gamma_{l}^{i^{l}}} \quad (i \neq j), \tag{16}$$

where $\gamma_{l} = \delta_{l}^{2}$. On the other hand, the QU model implies

$$\frac{\pi_{ij}}{\pi_{ji}} = \frac{\lambda_{j}}{\lambda_{i}} \quad (i \neq j), \tag{17}$$

where $\lambda_{j} = \beta_{j} / \alpha_{j}$. Setting $\lambda_{j} = \prod_{l=1}^{r-1} \gamma_{l}^{j^{l}}$ provides a one-to-one relation between $\{\lambda_{1}, \dots, \lambda_{r-1}\}$ and $\{\gamma_{1}, \dots, \gamma_{r-1}\}$. This means that the LSQU$_{r-1}$ model is equivalent to the QU model. The LSQU$_{k}$ ($k < r - 1$) model is a special case of the QU model, since the LSQU$_{r-1}$ model with $\delta_{l} = 1$ for $l = k + 1, \dots, r - 1$ is the LSQU$_{k}$ model. Hence, the LSQU$_{k}$ model is an intermediate model between the SQU and QU models. Similarly, the LSQI$_{r-1}$ model is equivalent to the QI model. That is, the LSQI$_{k}$ model is an intermediate model between the SQI and QI models (for more details, see Figure 1).

3. Necessary and Sufficient Condition for the SQU Model

Caussinus [9] introduced the necessary and sufficient condition for the S model. This condition separates the S model into multiple models with weaker restrictions than the S model. If model M$_{1}$ holds if and only if both models M$_{2}$ and M$_{3}$ hold, then analyzing models M$_{2}$ and M$_{3}$ should elucidate a more detailed structure of the cell probabilities. Here, we are interested in deriving a necessary and sufficient condition for the SQU model using the LSQU$_{k}$ model.

Yamamoto and Tomizawa [13] provided the following necessary and sufficient condition for the SQU model.

Theorem 1.

The SQU model holds if and only if both the QU model and the MH model hold.

Let X and Y denote the row and column variables, respectively, and consider a model defined for a fixed k ($k = 1, \dots, r - 1$), which is given as

$$E(X^{l}) = E(Y^{l}) \quad (l = 1, \dots, k), \tag{18}$$

where $E(X^{l}) = \sum_{i} \sum_{j} i^{l} \pi_{ij}$ and $E(Y^{l}) = \sum_{i} \sum_{j} j^{l} \pi_{ij}$. This model can be referred to as the marginal kth moment equality (ME$_{k}$) model. This leads to the following theorem.

Theorem 2.

For any k $(k = 1, \dots, r - 1)$, the SQU model holds if and only if both the LSQU$_{k}$ model and the ME$_{k}$ model hold.

Proof.

If the SQU model holds, then the LSQU$_{k}$ model holds, because the LSQU$_{k}$ model with $\delta_{l} = 1$ $(l = 1, \dots, k)$ is the SQU model. Since the SQU model implies the S model, we see that

$$E(X^{l}) = \sum_{i} \sum_{j} i^{l} \pi_{ij} = \sum_{i} \sum_{j} i^{l} \pi_{ji} = E(Y^{l}) \quad (l = 1, \dots, k). \tag{19}$$

Hence the ME$_{k}$ model also holds, which proves necessity.

Conversely, suppose that both the LSQU$_{k}$ model and the ME$_{k}$ model hold; we show that the SQU model holds. If the LSQU$_{k}$ model holds, then from Equation (14) we obtain

$$\log \pi_{ij} - \log \pi_{ji} = 2 \sum_{l=1}^{k} (j^{l} - i^{l}) \log \delta_{l} \quad (i \neq j). \tag{20}$$

The ME$_{k}$ model can also be expressed as

$$\underset{i \neq j}{\sum \sum} (j^{l} - i^{l}) \pi_{ij} = 0 \quad (l = 1, \dots, k). \tag{21}$$

Interchanging i and j in Equation (21) shows that $\underset{i \neq j}{\sum \sum} (j^{l} - i^{l}) \pi_{ji} = 0$ as well. Combining this with Equation (20), we obtain

$$\underset{i \neq j}{\sum \sum} (\pi_{ij} - \pi_{ji}) (\log \pi_{ij} - \log \pi_{ji}) = 2 \sum_{l=1}^{k} \log \delta_{l} \underset{i \neq j}{\sum \sum} (j^{l} - i^{l}) (\pi_{ij} - \pi_{ji}) = 0. \tag{22}$$

Since the logarithmic function is strictly increasing, for any $i \neq j$,

$$(\pi_{ij} - \pi_{ji}) (\log \pi_{ij} - \log \pi_{ji}) \geq 0, \tag{23}$$

with equality if and only if $\pi_{ij} = \pi_{ji}$. Because the sum in Equation (22) is zero and every term is nonnegative, each term must vanish, so $\pi_{ij} = \pi_{ji}$ for all $i \neq j$; that is, the S model holds. When the S model holds, the MH model holds. Additionally, the LSQU$_{k}$ model is a special case of the QU model. From Theorem 1, the SQU model holds. The proof is complete. □

Theorem 2 is a generalization of Yamamoto and Tomizawa’s result because the ME$_{r-1}$ model is equivalent to the MH model (see [14]). An analogous argument with $\theta = 1$ leads to the following corollary.

Corollary 1.

For any k $(k = 1, \dots, r - 1)$, the SQI model holds if and only if both the LSQI$_{k}$ model and the ME$_{k}$ model hold.
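To make the ME$_{k}$ condition in Equation (18) concrete, the following sketch computes the sample analogues of $E(X^{l})$ and $E(Y^{l})$ for the counts in Table 1, using integer scores $1, \dots, r$. It is a descriptive check only, not a formal test, and the function name `marginal_moments` is introduced here purely for illustration.

```python
from fractions import Fraction

# Observed counts from Table 1 (rows: case dose level, columns: control dose level).
table1 = [
    [6, 2, 3, 1],
    [9, 4, 2, 1],
    [9, 2, 3, 1],
    [12, 1, 2, 1],
]

def marginal_moments(counts, l):
    """Sample versions of E(X^l) and E(Y^l) with scores 1..r, returned as exact fractions."""
    n = sum(map(sum, counts))
    ex = Fraction(sum((i + 1) ** l * c for i, row in enumerate(counts) for c in row), n)
    ey = Fraction(sum((j + 1) ** l * c for row in counts for j, c in enumerate(row)), n)
    return ex, ey

for l in (1, 2, 3):
    ex, ey = marginal_moments(table1, l)
    print(f"l={l}: E(X^l) = {float(ex):.3f}, E(Y^l) = {float(ey):.3f}")
```

For $l = 1$ the two sample means are 153/59 and 100/59, so the row variable has a visibly larger mean than the column variable in these data.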

4. Partition of Test Statistics

Here, we describe a method to evaluate the model fit. We consider a hypothesis test in which the null hypothesis is that model M holds and the alternative hypothesis is that model M does not hold. Let $n_{ij}$ denote the observed frequency in the ($i, j$)th cell of the table and $m_{ij}$ the corresponding expected frequency, with $n = \sum_{i} \sum_{j} n_{ij}$ ($i = 1, \dots, r$; $j = 1, \dots, r$). Assume that $\{n_{ij}\}$ has a multinomial distribution, and let $\hat{m}_{ij}$ denote the maximum likelihood estimate (MLE) of $m_{ij}$ under the model. The likelihood ratio chi-squared statistic for the goodness-of-fit of model M is defined as

$$G^{2}(M) = 2 \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij} \log \left( \frac{n_{ij}}{\hat{m}_{ij}} \right). \tag{24}$$
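Equation (24) is straightforward to evaluate once fitted values are available. As a minimal sketch, the code below applies it to the S model, whose MLE has the well-known closed form $\hat{m}_{ij} = (n_{ij} + n_{ji})/2$; the other models in this paper generally require iterative fitting, which is not shown. The counts are those of Table 1, and the function name `g2` is an assumption of this example.

```python
import math

def g2(observed, fitted):
    """Likelihood ratio chi-squared statistic of Equation (24); empty cells contribute zero."""
    total = 0.0
    for row_n, row_m in zip(observed, fitted):
        for n_ij, m_ij in zip(row_n, row_m):
            if n_ij > 0:
                total += n_ij * math.log(n_ij / m_ij)
    return 2.0 * total

# Observed counts from Table 1.
table1 = [
    [6, 2, 3, 1],
    [9, 4, 2, 1],
    [9, 2, 3, 1],
    [12, 1, 2, 1],
]

# Closed-form MLE under the symmetry (S) model: average each pair of symmetric cells.
r = len(table1)
fitted_s = [[(table1[i][j] + table1[j][i]) / 2 for j in range(r)] for i in range(r)]

g2_s = g2(table1, fitted_s)
print(f"G^2(S) = {g2_s:.2f}")  # Table 2 reports 19.27 for the S model
```

The diagonal cells are fitted exactly under the S model, so only the off-diagonal pairs contribute to the statistic.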

The numbers of degrees of freedom (df) for testing the goodness-of-fit of the SQU, LSQU$_{k}$, and ME$_{k}$ models are $r^{2} - 2r - 1$, $r^{2} - 2r - 1 - k$, and k, respectively. The number of df for the SQU model is equal to the sum of those for the LSQU$_{k}$ and ME$_{k}$ models.
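The df formulas above can be tabulated for a given table size. The sketch below does so for $r = 4$, the size of the table analyzed in Section 5; the function names are illustrative only.

```python
def df_squ(r):
    """Degrees of freedom for the SQU model."""
    return r * r - 2 * r - 1

def df_lsqu(r, k):
    """Degrees of freedom for the LSQU_k model."""
    return r * r - 2 * r - 1 - k

def df_me(k):
    """Degrees of freedom for the ME_k model."""
    return k

r = 4
for k in range(1, r):
    # The df of the SQU model splits exactly into the two component models.
    assert df_squ(r) == df_lsqu(r, k) + df_me(k)
    print(f"k={k}: df(SQU)={df_squ(r)}, df(LSQU_k)={df_lsqu(r, k)}, df(ME_k)={df_me(k)}")
```

For $r = 4$ this gives 7 df for the SQU model and 6, 5, and 4 df for the LSQU$_{1}$, LSQU$_{2}$, and LSQU$_{3}$ models, in agreement with the df column of Table 2.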

Previous studies have discussed the separability of a model [15,16,17,18,19]. Separability means that a test statistic for the goodness-of-fit of model M$_{1}$ is asymptotically equivalent to the sum of the test statistics for model M$_{2}$ and model M$_{3}$ when model M$_{1}$ can be separated into model M$_{2}$ and model M$_{3}$. If it holds, the incompatible situation in which both model M$_{2}$ and model M$_{3}$ are accepted but model M$_{1}$ is rejected would not arise. This leads to the following theorem.

Theorem 3.

For any k $(k = 1, \dots, r - 1)$, the test statistic $G^{2}(\mathrm{SQU})$ is asymptotically equivalent to the sum of $G^{2}(\mathrm{LSQU}_{k})$ and $G^{2}(\mathrm{ME}_{k})$.

Proof.

For a fixed k ($k = 1, \dots, r - 1$), the LSQU$_{k}$ model can be expressed as

$$\log \pi_{ij} = \begin{cases} \mu' + \alpha_{i}' + \alpha_{j}' + \sum_{l=1}^{k} (j^{l} - i^{l}) \delta_{l}' + ij\theta' & (i \neq j), \\ \mu' + \alpha_{i}' + \alpha_{i}' + \psi_{ii}' & (i = j). \end{cases} \tag{25}$$

Without loss of generality, we can impose $\alpha_{r}' = 0$. Let

$$\pi = (\pi_{11}, \dots, \pi_{1r}, \pi_{21}, \dots, \pi_{2r}, \dots, \pi_{rr})^{T}, \tag{26}$$

and

$$\beta = (\mu', \beta_{1}, \beta_{2}, \beta_{12})^{T}, \tag{27}$$

where “T” denotes the transpose,

$$\beta_{1} = (\alpha_{1}', \dots, \alpha_{r-1}'), \quad \beta_{2} = (\delta_{1}', \dots, \delta_{k}'), \tag{28}$$

and

$$\beta_{12} = (\theta', \psi_{11}', \dots, \psi_{rr}'). \tag{29}$$

The LSQU$_{k}$ model can also be expressed as

$$\log \pi = X\beta = (1_{r^{2}}, X_{1}, X_{2}, X_{12})\beta, \tag{30}$$

where $\log \pi = (\log \pi_{11}, \dots, \log \pi_{rr})^{T}$, X is the $r^{2} \times (2r + 1 + k)$ matrix, and $1_{s}$ is the $s \times 1$ vector of ones. Additionally,

$$X_{1} = \begin{pmatrix} I_{r-1} \otimes 1_{r} \\ O_{r, r-1} \end{pmatrix} + 1_{r} \otimes \begin{pmatrix} I_{r-1} \\ 0_{r-1}^{T} \end{pmatrix}, \tag{31}$$

$$X_{2} = (x_{1}, \dots, x_{k}), \tag{32}$$

where

$$x_{l} = 1_{r} \otimes J_{r}^{l} - J_{r}^{l} \otimes 1_{r} \quad (l = 1, \dots, k), \tag{33}$$

and $X_{12}$ is the $r^{2} \times (r + 1)$ matrix determined from Equation (25). Note that $O_{s,t}$ is the $s \times t$ zero matrix, $0_{s}$ is the $s \times 1$ zero vector, $J_{r}^{l} = (1^{l}, \dots, r^{l})^{T}$, and “⊗” represents the Kronecker product. The matrix X has full column rank, which is $K = 2r + 1 + k$.

We denote the linear space spanned by the columns of the matrix X by $S(X)$, which has dimension K. Let U be an $r^{2} \times d_{1}$ full column rank matrix, where $d_{1} = r^{2} - 2r - 1 - k$, such that $S(U)$ is the orthogonal complement of the space $S(X)$. Hence, $U^{T} X = O_{d_{1}, K}$.

Let $h_{1}(\pi)$ be a vector of functions defined by $h_{1}(\pi) = U^{T} \log \pi$. Moreover, let $h_{2}(\pi)$ be a vector of functions defined by $h_{2}(\pi) = X_{2}^{T} \pi$, and note that $X_{2}^{T} U = O_{d_{2}, d_{1}}$, where $d_{2} = k$, because the columns of $X_{2}$ belong to the space $S(X)$.
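The vector $x_{l}$ in Equation (33) has a simple interpretation: in the row-major ordering of $\pi$ used in Equation (26), its entry for cell $(i, j)$ is $j^{l} - i^{l}$, so $x_{l}^{T} \pi = E(Y^{l}) - E(X^{l})$. The sketch below checks this with a small hand-written Kronecker product for vectors and evaluates $h_{2}$ at the sample proportions of Table 1; the helper names `kron` and `x_column` are assumptions of this example.

```python
from fractions import Fraction

def kron(u, v):
    """Kronecker product of two column vectors given as Python lists."""
    return [a * b for a in u for b in v]

def x_column(r, l):
    """x_l = 1_r (x) J_r^l - J_r^l (x) 1_r, as in Equation (33)."""
    ones = [1] * r
    powers = [i ** l for i in range(1, r + 1)]
    return [a - b for a, b in zip(kron(ones, powers), kron(powers, ones))]

r = 4
for l in (1, 2, 3):
    # Entry for cell (i, j), in row-major order, equals j**l - i**l.
    direct = [j ** l - i ** l for i in range(1, r + 1) for j in range(1, r + 1)]
    assert x_column(r, l) == direct

# h_2 evaluated at the sample proportions of Table 1 for k = 1.
table1 = [[6, 2, 3, 1], [9, 4, 2, 1], [9, 2, 3, 1], [12, 1, 2, 1]]
n = sum(map(sum, table1))
p = [Fraction(c, n) for row in table1 for c in row]
h2 = sum(x * p_c for x, p_c in zip(x_column(r, 1), p))
print("h_2(p) for k = 1:", h2)  # sample E(Y) minus sample E(X)
```

A value of $h_{2}$ far from zero indicates a departure from the ME$_{1}$ model in the direction of unequal marginal means.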

From Equation (25), the LSQU$_{k}$ model is equivalent to the hypothesis $h_{1}(\pi) = 0_{d_{1}}$. Additionally, the ME$_{k}$ model is equivalent to the hypothesis $h_{2}(\pi) = 0_{d_{2}}$. From Theorem 2, the SQU model is equivalent to the hypothesis $h_{3}(\pi) = 0_{d_{3}}$, where $h_{3} = (h_{1}^{T}, h_{2}^{T})^{T}$ and $d_{3} = d_{1} + d_{2} = r^{2} - 2r - 1$.

We derive the Wald statistic for the SQU model in a manner analogous to Bhapkar [20]. Let $H_{s}$ $(s = 1, 2, 3)$ denote the $d_{s} \times r^{2}$ matrix of partial derivatives of $h_{s}(\pi)$ with respect to $\pi$; namely, $H_{s}(\pi) = \partial h_{s}(\pi) / \partial \pi^{T}$. Let $\Sigma(\pi) = \mathrm{diag}(\pi) - \pi \pi^{T}$, where $\mathrm{diag}(\pi)$ denotes the diagonal matrix with the ith component of $\pi$ as its ith diagonal component. Additionally, let $p_{ij}$ denote the sample proportion of the ($i, j$) cell, that is, $p_{ij} = n_{ij} / n$, and let $p = (p_{11}, \dots, p_{1r}, p_{21}, \dots, p_{2r}, \dots, p_{rr})^{T}$. The central limit theorem indicates that $\sqrt{n}(p - \pi)$ has an asymptotic normal distribution with mean $0_{r^{2}}$ and covariance matrix $\Sigma(\pi)$. Using the delta method, $\sqrt{n}(h_{3}(p) - h_{3}(\pi))$ has an asymptotic normal distribution with mean $0_{d_{3}}$ and covariance matrix

$$H_{3}(\pi) \Sigma(\pi) H_{3}^{T}(\pi) = \begin{pmatrix} H_{1}(\pi) \Sigma(\pi) H_{1}^{T}(\pi) & H_{1}(\pi) \Sigma(\pi) H_{2}^{T}(\pi) \\ H_{2}(\pi) \Sigma(\pi) H_{1}^{T}(\pi) & H_{2}(\pi) \Sigma(\pi) H_{2}^{T}(\pi) \end{pmatrix}.$$

Since $H_{1}(\pi) \pi = U^{T} 1_{r^{2}} = 0_{d_{1}}$, $H_{1}(\pi) \mathrm{diag}(\pi) = U^{T}$, and $H_{2}(\pi) = X_{2}^{T}$, we obtain

$$H_{1}(\pi) \Sigma(\pi) H_{2}^{T}(\pi) = U^{T} X_{2} = O_{d_{1}, d_{2}}. \tag{34}$$

The covariance matrix is therefore block diagonal. Under each hypothesis $h_{s}(\pi) = 0_{d_{s}}$ $(s = 1, 2, 3)$, we see

$$W_{3} = W_{1} + W_{2}, \tag{35}$$

where

$$W_{s} = n\, h_{s}(p)^{T} \left( H_{s}(p) \Sigma(p) H_{s}^{T}(p) \right)^{-1} h_{s}(p). \tag{36}$$
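When $k = 1$ the ME$_{1}$ hypothesis has $d_{2} = 1$, so the matrix inverse in Equation (36) reduces to a scalar division and $W_{2}$ can be computed by hand. The sketch below does this for the counts in Table 1, using $H_{2} = x_{1}^{T}$ and $x_{1}^{T} \Sigma(p) x_{1} = \sum x^{2} p - (\sum x p)^{2}$. It is an illustrative calculation under these assumptions, not a computation reported in the paper.

```python
# Observed counts from Table 1.
table1 = [[6, 2, 3, 1], [9, 4, 2, 1], [9, 2, 3, 1], [12, 1, 2, 1]]
r = len(table1)
n = sum(map(sum, table1))

# Sample proportions and the vector x_1 (entries j - i), both in row-major cell order.
p = [c / n for row in table1 for c in row]
x1 = [j - i for i in range(1, r + 1) for j in range(1, r + 1)]

# h_2(p) = x_1^T p, the difference of the sample marginal means.
h2 = sum(x * q for x, q in zip(x1, p))

# x_1^T Sigma(p) x_1 with Sigma(p) = diag(p) - p p^T.
variance = sum(x * x * q for x, q in zip(x1, p)) - h2 ** 2

# Wald statistic of Equation (36) in the scalar case d_2 = 1.
w2 = n * h2 ** 2 / variance
print(f"W_2 for ME_1: {w2:.2f}")
```

For these data the Wald value is close to 19.6, which is larger than the likelihood ratio value $G^{2}(\mathrm{ME}_{1}) = 16.43$ in Table 2; the two statistics are equivalent only asymptotically, and the sample size here is 59.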

The Wald statistic $W_{s}$ has an asymptotic chi-squared distribution with $d_{s}$ df. That is, (i) $W_{1}$ is the Wald statistic for the LSQU$_{k}$ model, (ii) $W_{2}$ is that for the ME$_{k}$ model, and (iii) $W_{3}$ is that for the SQU model. The proof is completed using the asymptotic equivalence of the Wald statistic and the likelihood ratio statistic as proved by Rao [21]. □

Theorem 3 is also a generalization of Yamamoto and Tomizawa’s result, since this theorem is identical to their result when $k = r - 1$. Moreover, we obtain the following corollary.

Corollary 2.

For any k $(k = 1, \dots, r - 1)$, the test statistic $G^{2}(\mathrm{SQI})$ is asymptotically equivalent to the sum of $G^{2}(\mathrm{LSQI}_{k})$ and $G^{2}(\mathrm{ME}_{k})$.

5. Example

Table 1 shows data taken from [22]. These data describe 59 matched pairs classified by 4 dose levels of conjugated estrogen. The models described herein are used to analyze these data. Table 2 shows the value of $G^{2}(M)$ for each model applied to the data in Table 1. That is, for model M, the null hypothesis is that model M holds, and the alternative hypothesis is that model M does not hold. From Table 2, the SQI, SQU, S, and ME$_{k}$ models do not fit well, whereas the LSQI$_{k}$, LSQU$_{k}$, and LS$_{k}$ models are accepted at the 0.05 significance level $(k = 1, 2, 3)$. We choose the most appropriate model among these models. If model M$_{1}$ is a special case of model M$_{2}$, a test based on the difference between the likelihood ratio chi-squared statistics can compare the fit of the two nested models. Let $d_{1}$ and $d_{2}$ denote the degrees of freedom for models M$_{1}$ and M$_{2}$, respectively. Assuming that model M$_{2}$ holds, the likelihood ratio chi-squared statistic for testing model M$_{1}$ is given as $G^{2}(M_{1} | M_{2}) = G^{2}(M_{1}) - G^{2}(M_{2})$. This statistic has an asymptotic chi-squared distribution with $d_{1} - d_{2}$ degrees of freedom. When we use it at the 0.05 significance level, the LSQI$_{1}$ model is the most appropriate model.
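The nested comparisons can be reproduced from the values in Table 2. The sketch below computes $G^{2}(M_{1} | M_{2})$ and an approximate p-value for a few pairs of models. To stay within the standard library it uses the closed-form chi-squared tail probabilities for 1 and 2 df only, and the function names `chi2_sf` and `conditional_test` are assumptions of this example.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable; closed forms for df = 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# (df, G^2) pairs copied from Table 2.
table2 = {
    "SQI": (8, 19.98),
    "LSQI1": (7, 3.62),
    "LSQI2": (6, 2.98),
    "QI": (5, 0.77),
    "LSQU1": (6, 3.61),
}

def conditional_test(m1, m2):
    """G^2(M1 | M2) = G^2(M1) - G^2(M2), where M1 is a special case of M2."""
    d1, g1 = table2[m1]
    d2, g2 = table2[m2]
    stat, df = g1 - g2, d1 - d2
    return stat, df, chi2_sf(stat, df)

for m1, m2 in [("LSQI1", "LSQI2"), ("LSQI1", "QI"), ("LSQI1", "LSQU1"), ("SQI", "LSQI1")]:
    stat, df, p_value = conditional_test(m1, m2)
    print(f"G^2({m1} | {m2}) = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

In this sketch the LSQI$_{1}$ model is not rejected against any of the larger models listed, while the SQI model is clearly rejected against the LSQI$_{1}$ model, which is consistent with the choice of the LSQI$_{1}$ model in the text.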

Table 3 shows the estimated expected frequencies from the LSQI$_{1}$ model for the data in Table 1. The maximum likelihood estimate of $\delta_{1}$ for the LSQI$_{1}$ model is 0.71. We estimate the ratio between two probabilities as $\hat{\pi}_{ij} / \hat{\pi}_{ji} = 0.71^{2(j - i)}$ for $i < j$. Therefore, the probability distribution for the average dose for a case tends to be stochastically higher than the probability distribution for the average dose for a control, because $\hat{\delta}_{1} < 1$.
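The estimated ratio can be compared with the fitted values in Table 3. In the sketch below, the fitted frequencies are copied from Table 3 and the ratio of each pair of symmetric cells is set against $0.71^{2(j - i)}$; small discrepancies are expected because both the table and the estimate are rounded to two decimals.

```python
delta1_hat = 0.71  # rounded MLE of delta_1 reported in the text

# Estimated expected frequencies under the LSQI_1 model, copied from Table 3.
table3 = [
    [6, 3.58, 2.64, 1.42],
    [7.07, 4, 1.13, 0.61],
    [10.34, 2.24, 3, 0.89],
    [10.96, 2.37, 1.76, 1],
]

r = len(table3)
max_gap = 0.0
for i in range(r):
    for j in range(i + 1, r):
        fitted_ratio = table3[i][j] / table3[j][i]
        model_ratio = delta1_hat ** (2 * (j - i))
        max_gap = max(max_gap, abs(fitted_ratio - model_ratio))
        print(f"cells ({i + 1},{j + 1}): fitted {fitted_ratio:.3f}, model {model_ratio:.3f}")
print(f"largest absolute gap: {max_gap:.4f}")
```

Every ratio is below one, so the fitted frequency below the diagonal (case dose higher than control dose) exceeds its mirror image above the diagonal.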

Finally, we are interested in inferring the reason for the poor fit of the SQI model. According to Corollary 1, the SQI model is separated into the LSQI$_{1}$ model and the ME$_{1}$ model. Since the LSQI$_{1}$ model fits very well but the ME$_{1}$ model fits very poorly, we deduce that the lack of the structure of the ME$_{1}$ model, that is, the inequality of the marginal means, is responsible for the poor fit of the SQI model.
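The partition stated in Theorem 3 and Corollary 2 can also be examined on the values in Table 2. The sketch below compares $G^{2}(\mathrm{SQU})$ and $G^{2}(\mathrm{SQI})$ with the sums of their two component statistics for each k; since the equivalence is asymptotic, the sums are expected to be close but not identical.

```python
# Likelihood ratio statistics copied from Table 2.
g2_sqi, g2_squ = 19.98, 19.86
g2_lsqi = {1: 3.62, 2: 2.98, 3: 0.77}
g2_lsqu = {1: 3.61, 2: 2.98, 3: 0.69}
g2_me = {1: 16.43, 2: 17.08, 3: 19.12}

gaps = []
for k in (1, 2, 3):
    sum_sqi = g2_lsqi[k] + g2_me[k]
    sum_squ = g2_lsqu[k] + g2_me[k]
    gaps.extend([g2_sqi - sum_sqi, g2_squ - sum_squ])
    print(f"k={k}: G2(SQI)={g2_sqi:.2f} vs {sum_sqi:.2f};  G2(SQU)={g2_squ:.2f} vs {sum_squ:.2f}")

largest_gap = max(abs(g) for g in gaps)
print(f"largest absolute gap: {largest_gap:.2f}")
```

In every case the gap is small relative to the statistics themselves, and nearly all of each total comes from the ME$_{k}$ component.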

6. Conclusions

Herein we describe an asymmetry plus association model. This model describes both the asymmetric structure of the cell probabilities in symmetric positions and the structure of the odds ratios. Our model is an intermediate model between the SQU model and the QU model. If the QU (LSQU$_{r-1}$) model holds but the SQU model does not, the LSQU$_{k}$ model for $k < r - 1$ may hold. In this case, the QU model may be overfitting. That is, our model may realize a better fit than the QU model under these conditions. For the data in Table 1, for instance, the SQU model fits poorly and the QU model fits, while the more parsimonious LSQI$_{1}$ model also fits well. Additionally, a theorem giving a necessary and sufficient condition for the SQU model is stated using our model. Using this theorem, we show the asymptotic separability of the SQU model. Namely, the likelihood ratio chi-squared statistic for the SQU model is asymptotically equivalent to the sum of those for the separated models, which helps deduce the reason that the SQU model does not hold.

Author Contributions

Conceptualization, K.T.; methodology, K.F.; formal analysis, K.F.; writing (original draft preparation), K.F.; writing (review and editing), K.F. and K.T.; project administration, K.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank the three anonymous referees for their helpful comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

QU	Quasi-uniform association
QI	Quasi-independence
S	Symmetry
MH	Marginal homogeneity
QS	Quasi-symmetry
LS $_{k}$	kth linear asymmetry
SQI	Symmetry plus quasi-independence
SQU	Symmetry plus quasi-uniform association
LSQU $_{k}$	kth linear asymmetry plus quasi-uniform association
LSQI $_{k}$	kth linear asymmetry plus quasi-independence
ME $_{k}$	Marginal kth moment equality
df	Degrees of freedom

References

1. Bishop, Y.M.M.; Fienberg, S.E.; Holland, P.W. Discrete Multivariate Analysis: Theory and Practice; The MIT Press: Cambridge, MA, USA, 1975.
2. Kateri, M. Contingency Table Analysis: Methods and Implementation Using R; Birkhäuser (Springer): New York, NY, USA, 2014.
3. Goodman, L.A. Simple models for the analysis of association in cross-classifications having ordered categories. J. Am. Stat. Assoc. 1979, 74, 537–552.
4. Goodman, L.A. The analysis of cross-classified data having ordered and/or unordered categories: Association models, correlation models, and asymmetry models for contingency tables with or without missing entries. Ann. Stat. 1985, 13, 10–69.
5. Goodman, L.A. Some useful extensions of the usual correspondence analysis approach and the usual log-linear models approach in the analysis of contingency tables. Int. Stat. Rev. 1986, 54, 243–309.
6. Agresti, A. Categorical Data Analysis, 2nd ed.; Wiley: New York, NY, USA, 2002.
7. Bowker, A.H. A test for symmetry in contingency tables. J. Am. Stat. Assoc. 1948, 43, 572–574.
8. Stuart, A. A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 1955, 42, 412–416.
9. Caussinus, H. Contribution à l’analyse statistique des tableaux de corrélation. Ann. Fac. Des. Sci. L’Univ. Toulouse 1965, 29, 77–182.
10. Tahata, K.; Tomizawa, S. Generalized linear asymmetry model and decomposition of symmetry for multiway contingency tables. J. Biom. Biostat. 2011, 2, 1–6.
11. Agresti, A. A simple diagonals-parameter symmetry and quasi-symmetry model. Stat. Probab. Lett. 1983, 1, 313–316.
12. Tomizawa, S. An extended linear diagonals-parameter symmetry model for square contingency tables with ordinal categories. Metron 1991, 49, 401–409.
13. Yamamoto, K.; Tomizawa, S. Symmetry plus quasi uniform association model and its orthogonal decomposition for square contingency tables. J. Mod. Appl. Stat. Methods 2010, 9, 255–262.
14. Tahata, K.; Tomizawa, S. Generalized marginal homogeneity model and its relation to marginal equimoments for square contingency tables with ordered categories. Adv. Data Anal. Classif. 2008, 2, 295–311.
15. Aitchison, J. Large-sample restricted parametric tests. J. R. Stat. Soc. Ser. B 1962, 24, 234–250.
16. Darroch, J.N.; Silvey, S.D. On testing more than one hypothesis. Ann. Math. Stat. 1963, 34, 555–567.
17. Lang, J.B.; Agresti, A. Simultaneously modeling joint and marginal distributions of multivariate categorical responses. J. Am. Stat. Assoc. 1994, 89, 625–632.
18. Lang, J.B. On the partitioning of goodness-of-fit statistics for multivariate categorical response models. J. Am. Stat. Assoc. 1996, 91, 1017–1023.
19. Tomizawa, S.; Tahata, K. The analysis of symmetry and asymmetry: Orthogonality of decomposition of symmetry into quasi-symmetry and marginal symmetry for multi-way tables. J. Soc. Fr. Stat. 2007, 148, 3–36.
20. Bhapkar, V.P. A note on the equivalence of two test criteria for hypotheses in categorical data. J. Am. Stat. Assoc. 1966, 61, 228–235.
21. Rao, C.R. Linear Statistical Inference and Its Applications, 2nd ed.; Wiley: New York, NY, USA, 1973.
22. Breslow, N.E.; Day, N.E. Statistical Methods in Cancer Research, Vol. I: The Analysis of Case-Control Studies; International Agency for Research on Cancer: Lyon, France, 1980.

Figure 1. Relationships among the models (A → B indicates that model A is a special case of model B).

Table 1. Average doses of conjugated estrogen used by cases and matched controls: Los Angeles endometrial cancer study [22].

	Average Dose for Control (mg/day)
Average Dose for Case (mg/day)	0 (1)	0.1–0.299 (2)	0.3–0.625 (3)	0.625+ (4)	Total
0 (1)	6	2	3	1	12
0.1–0.299 (2)	9	4	2	1	16
0.3–0.625 (3)	9	2	3	1	15
0.625+ (4)	12	1	2	1	16
Total	36	9	10	4	59

Table 2. The values of likelihood ratio chi-squared statistics for models applied to Table 1.

Model	df	$G^{2} (M)$
SQI	8	19.98 $^{*}$
SQU	7	19.86 $^{*}$
LSQI $_{1}$	7	3.62
LSQI $_{2}$	6	2.98
LSQI $_{3}$ (QI)	5	0.77
LSQU $_{1}$	6	3.61
LSQU $_{2}$	5	2.98
LSQU $_{3}$ (QU)	4	0.69
S	6	19.27 $^{*}$
LS $_{1}$	5	2.97
LS $_{2}$	4	2.33
LS $_{3}$ (QS)	3	0.46
ME $_{1}$	1	16.43 $^{*}$
ME $_{2}$	2	17.08 $^{*}$
ME $_{3}$ (MH)	3	19.12 $^{*}$

Note * Significant at the 0.05 level.

Table 3. Estimated expected frequencies from the LSQI$_{1}$ model.

	Average Dose for Control (mg/day)
Average Dose for Case (mg/day)	0 (1)	0.1–0.299 (2)	0.3–0.625 (3)	0.625+ (4)
0 (1)	6	3.58	2.64	1.42
0.1–0.299 (2)	7.07	4	1.13	0.61
0.3–0.625 (3)	10.34	2.24	3	0.89
0.625+ (4)	10.96	2.37	1.76	1

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
