Statistics and Probability Letters 101 (2015) 33–37
Orthogonal decomposition of symmetry model using the ordinal quasi-symmetry model based on f-divergence for square contingency tables

Yusuke Saigusa, Kouji Tahata ∗, Sadao Tomizawa

Department of Information Sciences, Faculty of Science and Technology, Tokyo University of Science, Noda City, Chiba, 278-8510, Japan

∗ Corresponding author.
Article info

Article history:
Received 23 June 2014
Received in revised form 9 January 2015
Accepted 26 February 2015
Available online 6 March 2015

Keywords: Decomposition; f-divergence; Orthogonality; Quasi-symmetry; Square contingency table; Symmetry

DOI: http://dx.doi.org/10.1016/j.spl.2015.02.023
Abstract

For square contingency tables, Caussinus (1965) considered the quasi-symmetry (QS) model. Kateri and Agresti (2007) considered the ordinal quasi-symmetry (OQS[f]) model based on f-divergence. The present paper gives a decomposition of the symmetry (S) model into the OQS[f] and marginal mean equality models. It also shows that the test statistic for goodness-of-fit of the S model is asymptotically equivalent to the sum of those for the decomposed models.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction

Consider an $r \times r$ square contingency table with the same row and column classifications having ordered categories. Let $p_{ij}$ denote the probability that an observation will fall in the $(i, j)$ cell $(i = 1, \ldots, r;\ j = 1, \ldots, r)$. The S model is defined by

$$p_{ij} = \psi_{ij} \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$

where $\psi_{ij} = \psi_{ji}$ (Bowker, 1948; Bishop et al., 1975, p. 282). Caussinus (1965) considered the quasi-symmetry (QS) model, defined by

$$p_{ij} = \alpha_i \beta_j \psi_{ij} \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$

where $\psi_{ij} = \psi_{ji}$. A special case of this model with $\{\alpha_i = \beta_i\}$ is the S model. Let $\{u_s\}$ $(s = 1, \ldots, r)$ denote a set of known scores $u_1 \le u_2 \le \cdots \le u_r$ with $u_1 < u_r$ for the rows and columns. Agresti (2010, p. 236) gave the ordinal quasi-symmetry (OQS) model, defined by

$$p_{ij} = \alpha^{u_i} \beta^{u_j} \psi_{ij} \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$

where $\psi_{ij} = \psi_{ji}$. This is a special case of the QS model.

Note that, for two distributions $p = (p_{ij})$ and $\theta = (\theta_{ij})$, the f-divergence between $p$ and $\theta$ is defined by
$$\sum_{i,j} \theta_{ij}\, f\!\left(\frac{p_{ij}}{\theta_{ij}}\right),$$
where $f$ is a real-valued convex function on $(0, +\infty)$ with $f(1) = 0$, $f(0) = \lim_{t \to 0} f(t)$, $0 f(0/0) = 0$ and $0 f(a/0) = a f_\infty$ with $f_\infty = \lim_{t \to \infty} [f(t)/t]$ (Csiszár and Shields, 2004).

Let $f$ be a twice-differentiable and strictly convex function, and let $F(x) = f'(x)$ for all $x$. Replacing the Kullback–Leibler distance by the more general f-divergence (although the details are omitted here), Kateri and Papaioannou (1997) introduced the quasi-symmetry (QS[f]) model based on f-divergence, defined by

$$p_{ij} = p^S_{ij} F^{-1}(\alpha_i + \gamma_{ij}) \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$
where $p^S_{ij} = (p_{ij} + p_{ji})/2$ and $\gamma_{ij} = \gamma_{ji}$. The QS[f] model with $f(x) = x \log x$, $x > 0$, is the QS model. Also, the QS[f] model with $f(x) = (1 - x)^2$ is called the Pearsonian QS model (Kateri and Papaioannou, 1997). Kateri and Agresti (2007) proposed the ordinal quasi-symmetry (OQS[f]) model based on f-divergence, defined by

$$p_{ij} = p^S_{ij} F^{-1}(\alpha u_i + \gamma_{ij}) \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$
where $p^S_{ij} = (p_{ij} + p_{ji})/2$ and $\gamma_{ij} = \gamma_{ji}$. This is a special case of the QS[f] model. For example, when $f(x) = x \log x$, $x > 0$, the OQS[f] model is expressed as

$$p_{ij} = p^S_{ij}\, \frac{2 e^{\alpha u_i}}{e^{\alpha u_i} + e^{\alpha u_j}} \quad (i = 1, \ldots, r;\ j = 1, \ldots, r).$$
This is the OQS model. Also, when $f(x) = (1 - x)^2$, the OQS[f] model is expressed as

$$p_{ij} = p^S_{ij} \bigl(1 + a (u_i - u_j)\bigr) \quad (i = 1, \ldots, r;\ j = 1, \ldots, r),$$
with $a = \alpha/4$. This model is called the Pearsonian OQS model (Kateri and Agresti, 2007). See Kateri and Papaioannou (1997) and Kateri and Agresti (2007) for the details of the QS[f] and OQS[f] models with a more general function $f$.

The marginal homogeneity (MH) model (Stuart, 1955) is given by

$$p_{i\cdot} = p_{\cdot i} \quad (i = 1, \ldots, r),$$

where $p_{i\cdot} = \sum_{t=1}^{r} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{r} p_{si}$. Several statisticians have given decompositions of the S model using the MH model. For instance, Caussinus (1965) gave the theorem that the S model holds if and only if both the QS and MH models hold. Kateri and Papaioannou (1997) showed that the S model holds if and only if both the QS[f] and MH models hold. Kateri and Agresti (2007) pointed out that the S model holds if and only if both the OQS[f] and MH models hold.

It would be natural to consider the decomposition of the S model into the OQS[f] model and a model that has a weaker restriction than the MH model, because the structure that satisfies both the OQS[f] and MH models has a stronger restriction than the S model. As discussed in Yamamoto et al. (2007), as the model with a weaker restriction than the MH model, we consider the model of marginal means equality (ME) for the scores $\{u_s\}$ as
$$\mu_1 = \mu_2,$$

where $\mu_1 = \sum_{i=1}^{r} u_i p_{i\cdot}$ and $\mu_2 = \sum_{i=1}^{r} u_i p_{\cdot i}$. Note that (i) if the MH model holds then the ME model holds, but the converse does not hold, and (ii) if $u_s = u_0 + (s - 1)d$, and $u_0$ and $d$ are specified (i.e., $\{u_s\}$ are equal-interval scores), then the ME model is given by

$$\tilde{\mu}_1 = \tilde{\mu}_2,$$

where $\tilde{\mu}_1 = \sum_{i=1}^{r} i\, p_{i\cdot}$ and $\tilde{\mu}_2 = \sum_{i=1}^{r} i\, p_{\cdot i}$, and indicates that the mean of the row variable equals the mean of the column variable (Tomizawa, 1991).

Yamamoto et al. (2007) gave the theorem that the S model holds if and only if both the OQS and ME models hold (also see Tahata et al., 2008). Therefore, we are now interested in whether the decomposition of the S model into the OQS[f] and ME models holds. Such a result would extend the decomposition of Yamamoto et al. (2007), because the OQS[f] model generalizes the OQS model through the f-divergence.

The present paper gives (i) the theorem that the S model holds if and only if the OQS[f] and ME models hold, and shows that (ii) the test statistic for goodness-of-fit of the S model is asymptotically equivalent to the sum of those for the decomposed models.

2. Decomposition of the symmetry model

We obtain the following theorem:

Theorem 1. The S model holds if and only if both the OQS[f] and ME models hold.
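As a numerical illustration of Theorem 1 (a sketch that is not part of the original paper), the following Python snippet builds tables from the two closed forms given in the Introduction, the OQS model and the Pearsonian OQS model, and evaluates $\mu_1 - \mu_2$. The symmetric table `ps`, the scores `u`, and all function names are hypothetical choices made only for this example.

```python
import math

def oqs_table(ps, u, alpha):
    """OQS model (f(x) = x log x):
    p_ij = pS_ij * 2 e^{alpha u_i} / (e^{alpha u_i} + e^{alpha u_j})."""
    r = len(u)
    w = [math.exp(alpha * s) for s in u]
    return [[ps[i][j] * 2 * w[i] / (w[i] + w[j]) for j in range(r)]
            for i in range(r)]

def pearsonian_oqs_table(ps, u, a):
    """Pearsonian OQS model (f(x) = (1 - x)^2):
    p_ij = pS_ij * (1 + a (u_i - u_j)); needs |a| < 1 / (u_r - u_1)."""
    r = len(u)
    return [[ps[i][j] * (1 + a * (u[i] - u[j])) for j in range(r)]
            for i in range(r)]

def mean_gap(p, u):
    """mu_1 - mu_2: difference between the row and column mean scores."""
    r = len(u)
    row = [sum(p[i]) for i in range(r)]
    col = [sum(p[i][j] for i in range(r)) for j in range(r)]
    return sum(u[i] * (row[i] - col[i]) for i in range(r))

# Hypothetical symmetric table pS (entries sum to 1) and integer scores.
ps = [[0.20, 0.05, 0.02],
      [0.05, 0.30, 0.04],
      [0.02, 0.04, 0.28]]
u = [1, 2, 3]

for alpha in (0.0, 0.5, -0.5):
    gap = mean_gap(oqs_table(ps, u, alpha), u)
    print("OQS, alpha =", alpha, "mu1 - mu2 =", gap)
for a in (0.0, 0.1):
    gap = mean_gap(pearsonian_oqs_table(ps, u, a), u)
    print("Pearsonian OQS, a =", a, "mu1 - mu2 =", gap)
```

With $\alpha = 0$ (or $a = 0$) the table is symmetric and the gap vanishes, while a nonzero parameter gives a nonzero gap. This is the content of the theorem seen from one side: a table in the OQS[f] family can satisfy the ME model only when it is symmetric.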
Table 1
Unaided distance vision of 4746 students aged 18 to about 25 including about 10% women in Faculty of Science and Technology, Science University of Tokyo, Japan, examined in April 1982; adapted from Tomizawa (1984).

                    Left eye grade
Right eye grade     Best (1)   Second (2)   Third (3)   Worst (4)   Total
Best (1)                1291          130          40          22    1483
Second (2)               149          221         114          23     507
Third (3)                 64          124         660         185    1033
Worst (4)                 20           25         249        1429    1723
Total                   1524          500        1063        1659    4746
Proof. If the S model holds, then both the OQS[f] and ME models hold. Conversely, assuming that both the OQS[f] and ME models hold, we shall show that the S model holds. Since the OQS[f] model holds,

$$\frac{p_{ij}}{p_{ji}} = \frac{F^{-1}(\alpha u_i + \gamma_{ij})}{F^{-1}(\alpha u_j + \gamma_{ji})} \quad (i < j).$$

From the assumption that the ME model holds, we see

$$\sum_{i=1}^{r-1} \sum_{j=i+1}^{r} (u_i - u_j)(p_{ij} - p_{ji}) = 0.$$

Thus

$$\sum_{i=1}^{r-1} \sum_{j=i+1}^{r} (u_i - u_j)\, p_{ij} \left( 1 - \frac{F^{-1}(\alpha u_j + \gamma_{ji})}{F^{-1}(\alpha u_i + \gamma_{ij})} \right) = 0.$$
Note that $F^{-1}$ is an increasing function and $\gamma_{ij} = \gamma_{ji}$. Indeed, assuming positive cell probabilities, if $\alpha > 0$ every term with $u_i < u_j$ is strictly positive (both $u_i - u_j$ and the bracketed factor are negative), and if $\alpha < 0$ every such term is strictly negative; since $u_1 < u_r$ there is at least one such term, so the sum can vanish only when $\alpha = 0$. Thus we obtain $\alpha = 0$, i.e., the S model holds. The proof is completed.

Let $n_{ij}$ denote the observed frequency in the $(i, j)$ cell $(i = 1, \ldots, r;\ j = 1, \ldots, r)$. Assume that a multinomial distribution applies to the $r \times r$ table. Let $G^2(M)$ denote the likelihood ratio chi-squared statistic for testing goodness-of-fit of model $M$, defined by

$$G^2(M) = 2 \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij} \log \left( \frac{n_{ij}}{\hat{m}_{ij}} \right),$$

where $\hat{m}_{ij}$ is the maximum likelihood estimate of the expected frequency $m_{ij}$ under the model $M$.

Aitchison (1962) discussed the asymptotic separability of models. A similar property of models is described by Darroch and Silvey (1963) and Read (1977). Decompositions in which the test statistic for goodness-of-fit of a model is asymptotically equivalent to the sum of those for the decomposed models have been given by several statisticians, e.g., Tomizawa and Tahata (2007) and Tahata et al. (2008). We now obtain the following theorem:

Theorem 2. Under the S model, $G^2(S)$ is asymptotically equivalent to the sum of $G^2(\mathrm{OQS}[f])$ and $G^2(\mathrm{ME})$.

The proof is omitted because it is obtained in a similar manner to Tahata et al. (2008). Note that $G^2(S)$ is not asymptotically equivalent (or not equal) to the sum of $G^2(\mathrm{OQS}[f])$ and $G^2(\mathrm{MH})$. In a similar manner to Theorem 2, we can obtain that, under the S model, $G^2(S)$ is asymptotically equivalent to the sum of $G^2(\mathrm{QS}[f])$ and $G^2(\mathrm{MH})$.

3. Examples

Example 1. Table 1, taken from Tomizawa (1984), is constructed from the data on the unaided distance vision of 4746 students aged 18 to about 25, including about 10% women, in the Faculty of Science and Technology, Science University of Tokyo in Japan, examined in April 1982. Table 3 gives the values of the likelihood ratio chi-square $G^2$ for testing the goodness-of-fit of models applied to the data in Table 1. We use the integer scores $\{u_i = i\}$.

The S model fits the data in Table 1 poorly, yielding $G^2 = 16.95$ with 6 degrees of freedom. The ME model also fits these data poorly, but the OQS and Pearsonian OQS models fit these data well. It is inferred from Theorem 1 and Table 3 that the poor fit of the S model is caused by the lack of the ME structure rather than of the OQS (or Pearsonian OQS) structure.
Thus we see that the inequality between the mean right eye grade and the mean left eye grade of the students gives rise to the structure in which, for some $(i, j)$, the probability that a student's right eye grade is $i$ and left eye grade is $j$ $(\neq i)$ does not equal the probability that the right eye grade is $j$ and the left eye grade is $i$. We note that $G^2(S)$ is close to (i) the sum of $G^2(\mathrm{OQS})$ (or $G^2(\text{Pearsonian OQS})$) and $G^2(\mathrm{ME})$, and (ii) the sum of $G^2(\mathrm{QS})$ (or $G^2(\text{Pearsonian QS})$) and $G^2(\mathrm{MH})$.
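The $G^2$ values for the S and ME models quoted in Example 1 can be re-derived with a short script. This is a minimal sketch and not necessarily the authors' fitting procedure. Under the S model the fitted counts are $(n_{ij} + n_{ji})/2$. For the ME model the sketch assumes the usual Lagrange-multiplier form of the constrained multinomial maximum likelihood estimate, $\hat{p}_{ij} = n_{ij} / \{N (1 + t c_{ij})\}$ with $c_{ij} = u_i - u_j$, where $t$ solves $\sum_{i,j} c_{ij} n_{ij} / (1 + t c_{ij}) = 0$; the function names and the bisection solver are illustrative choices.

```python
import math

# Table 1 counts: rows = right eye grade, columns = left eye grade.
table1 = [[1291, 130, 40, 22],
          [149, 221, 114, 23],
          [64, 124, 660, 185],
          [20, 25, 249, 1429]]
scores = [1, 2, 3, 4]  # integer scores u_i = i, as in Example 1

def g2_symmetry(n):
    """G^2(S): fitted counts under S are (n_ij + n_ji) / 2."""
    r = len(n)
    total = 0.0
    for i in range(r):
        for j in range(r):
            if i != j and n[i][j] > 0:
                total += n[i][j] * math.log(2 * n[i][j] / (n[i][j] + n[j][i]))
    return 2 * total

def g2_mean_equality(n, u, tol=1e-12):
    """G^2(ME) under the assumed Lagrange form p_ij = n_ij / (N (1 + t c_ij))."""
    r = len(n)
    cells = [(u[i] - u[j], n[i][j])
             for i in range(r) for j in range(r) if n[i][j] > 0]
    cmax = max(abs(c) for c, _ in cells)

    def score(t):
        # Decreasing in t; its root enforces the ME constraint on the fit.
        return sum(c * x / (1 + t * c) for c, x in cells)

    # Keep 1 + t * c > 0 for every cell, then bisect on the monotone score.
    lo, hi = -1 / cmax + 1e-9, 1 / cmax - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    return 2 * sum(x * math.log(1 + t * c) for c, x in cells)

g2_s = g2_symmetry(table1)
g2_me = g2_mean_equality(table1, scores)
print("G2(S)  =", round(g2_s, 2))
print("G2(ME) =", round(g2_me, 2))
```

With the Table 1 counts and integer scores, the two statistics come out close to the 16.95 and 9.94 reported in Table 3.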
Table 2
Cross-classification of Merino ewes according to number of lambs born in consecutive years; adapted from Tallis (1962).

                         Number of lambs 1952
Number of lambs 1953       0      1      2    Total
0                         58     52      1      111
1                         26     58      3       87
2                          8     12      9       29
Total                     92    122     13      227
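Table 2 offers a concrete view of the remark that the ME model is weaker than the MH model. The sketch below is illustrative only; it takes the lamb counts 0, 1, 2 themselves as scores, which is an assumption made here, and computes the observed marginal totals and the two sample means.

```python
# Table 2 counts: rows = number of lambs in 1953, columns = number in 1952.
table2 = [[58, 52, 1],
          [26, 58, 3],
          [8, 12, 9]]
lamb_scores = [0, 1, 2]  # assumed scores: the lamb counts themselves

r = len(table2)
n_total = sum(sum(row) for row in table2)
row_totals = [sum(table2[i]) for i in range(r)]
col_totals = [sum(table2[i][j] for i in range(r)) for j in range(r)]

# Sample means of the row variable (1953) and column variable (1952).
mean_1953 = sum(s * t for s, t in zip(lamb_scores, row_totals)) / n_total
mean_1952 = sum(s * t for s, t in zip(lamb_scores, col_totals)) / n_total

print("row totals   :", row_totals)
print("column totals:", col_totals)
print("mean 1953 =", round(mean_1953, 3), " mean 1952 =", round(mean_1952, 3))
```

The marginal totals differ markedly (111, 87, 29 against 92, 122, 13), whereas the sample means, 145/227 and 148/227, are nearly equal. Equal means therefore do not require equal marginal distributions.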
Table 3
The numbers of degrees of freedom (df) and likelihood ratio chi-square values $G^2$ for models applied to the data in Tables 1 and 2.

                    Table 1           Table 2
Models              df    G^2         df    G^2
S                    6    16.95*       3    20.81*
OQS                  5     6.95        2    20.74*
Pearsonian OQS       5     7.01        2    20.75*
ME                   1     9.94*       1     0.07
QS                   3     5.71        1     1.35
Pearsonian QS        3     5.78        1     2.16
MH                   3    11.18*       2    18.65*

Note: * means significant at the 0.05 level.
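The OQS entries of Table 3 can be approximated with a one-parameter fit. The sketch below rests on an assumption that is not spelled out in the paper: the OQS maximum likelihood fit keeps the diagonal counts and each pair total $n_{ij} + n_{ji}$, and splits the pair total with probability $e^{\alpha u_i} / (e^{\alpha u_i} + e^{\alpha u_j})$, so that $\alpha$ solves a single score equation. The bisection solver, the search interval, and the function name `g2_oqs` are illustrative choices.

```python
import math

def g2_oqs(n, u, tol=1e-12):
    """Return (G^2, alpha) for the OQS model with f(x) = x log x."""
    r = len(n)
    # One entry per pair i < j: (u_i - u_j, n_ij, n_ji).
    pairs = [(u[i] - u[j], n[i][j], n[j][i])
             for i in range(r) for j in range(i + 1, r)]

    def prob(alpha, d):
        # P(cell (i, j) | pair {i, j}) = e^{a u_i} / (e^{a u_i} + e^{a u_j}).
        return 1 / (1 + math.exp(-alpha * d))

    def score(alpha):
        # Derivative of the log-likelihood in alpha; decreasing in alpha.
        return sum(d * (a - (a + b) * prob(alpha, d)) for d, a, b in pairs)

    lo, hi = -10.0, 10.0  # assumed search interval for alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2

    total = 0.0
    for d, a, b in pairs:
        pi = prob(alpha, d)
        for obs, fit in ((a, (a + b) * pi), (b, (a + b) * (1 - pi))):
            if obs > 0:
                total += obs * math.log(obs / fit)
    return 2 * total, alpha

table1 = [[1291, 130, 40, 22],
          [149, 221, 114, 23],
          [64, 124, 660, 185],
          [20, 25, 249, 1429]]
table2 = [[58, 52, 1],
          [26, 58, 3],
          [8, 12, 9]]

g2_t1, alpha_t1 = g2_oqs(table1, [1, 2, 3, 4])
g2_t2, alpha_t2 = g2_oqs(table2, [1, 2, 3])
print("Table 1: G2(OQS) =", round(g2_t1, 2), " alpha =", round(alpha_t1, 4))
print("Table 2: G2(OQS) =", round(g2_t2, 2), " alpha =", round(alpha_t2, 4))
```

With integer scores, the fits come out close to the values 6.95 and 20.74 listed in Table 3. Only score differences enter the fit, so shifting all scores by a constant leaves the result unchanged.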
Example 2. The data in Table 2, taken from Tallis (1962), describe the cross-classification of 227 Merino ewes according to the numbers of lambs born to them in two consecutive years, 1952 and 1953 (also see Bishop et al., 1975, p. 288). Table 3 shows that the S, OQS, Pearsonian OQS and MH models fit these data poorly, but the ME, QS and Pearsonian QS models fit these data well. We see from Theorem 1 and Table 3 that the poor fit of the S model is caused by the lack of the OQS (or Pearsonian OQS) structure rather than of the ME structure.

4. Concluding remarks

We have given the decomposition of the S model into the OQS[f] and ME models. As seen in the Examples, Theorem 1 is useful for inferring the reason for the poor fit of the S model when it fits the data poorly. In addition, we have shown that, for Theorem 1, the orthogonality of models holds. We point out that, e.g., the likelihood ratio statistic for testing goodness-of-fit of the S model assuming that the ME model holds true is $G^2(S) - G^2(\mathrm{ME})$, and this is asymptotically equivalent to the likelihood ratio statistic for testing goodness-of-fit of the OQS[f] model, $G^2(\mathrm{OQS}[f])$. Namely, $G^2(\mathrm{OQS}[f])$ enables us to test goodness-of-fit of the S model assuming that the ME model holds true.

We note that the decomposition of the S model into the OQS[f] and ME models applies only to ordinal data, because neither of the decomposed models is invariant under arbitrary identical permutations of the row and column categories, except for the reverse order. Practical ways of choosing the scores assigned to ordered categories are discussed in, e.g., Agresti (2010, Chapter 2).

Acknowledgments

The authors would like to thank the co-editor-in-chief and referees for their helpful comments.

References

Agresti, A., 2010. Analysis of Ordinal Categorical Data, second ed. Wiley, Hoboken, New Jersey.
Aitchison, J., 1962. Large-sample restricted parametric tests. J. R. Stat. Soc. Ser. B 24, 234–250.
Bishop, Y.M.M., Fienberg, S.E., Holland, P.W., 1975. Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge.
Bowker, A.H., 1948. A test for symmetry in contingency tables. J. Amer. Statist. Assoc. 43, 572–574.
Caussinus, H., 1965. Contribution à l’analyse statistique des tableaux de corrélation. Ann. Fac. Sci. Univ. Toulouse 29, 77–182.
Csiszár, I., Shields, P., 2004. Information Theory and Statistics: A Tutorial. In: Foundations and Trends in Communications and Information Theory, vol. 1. pp. 417–528.
Darroch, J.N., Silvey, S.D., 1963. On testing more than one hypothesis. Ann. Math. Statist. 34, 555–567.
Kateri, M., Agresti, A., 2007. A class of ordinal quasi-symmetry models for square contingency tables. Statist. Probab. Lett. 77, 598–603.
Kateri, M., Papaioannou, T., 1997. Asymmetry models for contingency tables. J. Amer. Statist. Assoc. 92, 1124–1131.
Read, C.B., 1977. Partitioning chi-square in contingency tables: a teaching approach. Comm. Statist. Theory Methods 6, 553–562.
Stuart, A., 1955. A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42, 412–416.
Tahata, K., Yamamoto, H., Tomizawa, S., 2008. Orthogonality of decompositions of symmetry into extended symmetry and marginal equimoment for multi-way tables with ordered categories. Austral. J. Statist. 37, 185–194.
Tallis, G.M., 1962. The maximum likelihood estimation of correlation from contingency tables. Biometrics 18, 342–353.
Tomizawa, S., 1984. Three kinds of decompositions for the conditional symmetry model in a square contingency table. J. Japan Statist. Soc. 14, 35–42.
Tomizawa, S., 1991. Decomposing the marginal homogeneity model into two models for square contingency tables with ordered categories. Calcutta Statist. Assoc. Bull. 41, 161–164.
Tomizawa, S., Tahata, K., 2007. The analysis of symmetry and asymmetry: orthogonality of decomposition of symmetry into quasi-symmetry and marginal symmetry for multi-way tables. J. Soc. Fr. Statist. 148, 3–36.
Yamamoto, H., Iwashita, T., Tomizawa, S., 2007. Decomposition of symmetry into ordinal quasi-symmetry and marginal equimoment for multi-way tables. Austral. J. Statist. 36, 291–306.