Statistics & Probability Letters 47 (2000) 213–217
Variance stabilizing transformation and studentization for estimator of correlation coefficient

Hironori Fujisawa

Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Tokyo 152-8552, Japan

Received December 1998; received in revised form June 1999
Abstract

The variance stabilizing transformation and the studentization satisfy a simple relation for the skewness and the mean. This relation implies that the former makes a better normal approximation than the latter for estimators of the correlation coefficient in some cases, including an elliptical case and a missing case. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Mean; Missing; Normal approximation; Skewness
1. Introduction

Let $\rho$ be a correlation coefficient, $\hat\rho$ an estimator, and $n$ a sample size. Assume that the asymptotic distribution of $u = n^{1/2}(\hat\rho - \rho)$ is normal with mean zero and variance $\sigma^2(\rho)$. Let $T$ be the studentization and $Z$ the variance stabilizing transformation, that is, $T = n^{1/2}(\hat\rho - \rho)/\sigma(\hat\rho)$ and $Z = n^{1/2}\{f(\hat\rho) - f(\rho)\}$ such that $f'(\rho)\sigma(\rho) = 1$ and $f(\rho)$ is continuously differentiable. The asymptotic distributions of $T$ and $Z$ are common: normal with mean zero and variance one.

The following case is the simplest: the population is normal, $\hat\rho$ is the maximum likelihood estimator, and the data are complete. In this case, it is known that the variance stabilizing transformation makes a better normal approximation than the studentization; see David (1938), Gayen (1951), and Konishi (1981). Konishi and Shimizu (1994) investigated the case in which some data are missing and showed the same phenomenon. Missing data have been studied by many researchers: Dahiya and Korwar (1980), Kariya et al. (1983), Srivastava (1993), Fujisawa (1996), Garren (1998), Fujisawa (1999).

The variance stabilizing transformation and the studentization satisfy a simple relation for the skewness and the mean of order $n^{-1/2}$. When this relation is applied to some cases, including an elliptical case and a missing case, we see that the variance stabilizing transformation makes a better normal approximation than the studentization in the sense of skewness and/or mean. Lahiri (1997) constructed similar relations for coverage probability. It may be noted that the results obtained in the present paper cannot be derived from coverage probability.
2. Simple relation

Let $\kappa_3(W)$ and $m(W)$ be the skewness and the mean of $W$, where $W$ denotes $T$ or $Z$. Usually, the expansions can be given by
$$\kappa_3(W) = n^{-1/2}\kappa_{31}(W) + o(n^{-1/2}), \qquad m(W) = n^{-1/2} m_1(W) + o(n^{-1/2}).$$
These appear in the first-order term of the Edgeworth expansion. In general, the terms $\kappa_{31}(W)$ and $m_1(W)$ are not simple, but the differences between $T$ and $Z$ are simple (the proof is at the end of this section):
$$\kappa_3(T) - \kappa_3(Z) = n^{-1/2}(-3\sigma'(\rho)) + o(n^{-1/2}), \eqno(1)$$
$$m(T) - m(Z) = n^{-1/2}(-\sigma'(\rho)/2) + o(n^{-1/2}). \eqno(2)$$

Focus on the order $n^{-1/2}$. If $\sigma'(\rho) > 0$ and $\kappa_3(Z) < 0$, then expansion (1) provides that $\kappa_3(T) - \kappa_3(Z) < 0$; therefore, $\kappa_3(T) < \kappa_3(Z) < 0$. If $\sigma'(\rho) < 0$ and $\kappa_3(Z) > 0$, then expansion (1) shows that $\kappa_3(T) - \kappa_3(Z) > 0$; therefore, $\kappa_3(T) > \kappa_3(Z) > 0$. In either situation $|\kappa_3(T)| > |\kappa_3(Z)|$, so the variance stabilizing transformation is better than the studentization in the sense of skewness. Similar discussions are possible for the mean. Such convenient situations occur for estimators of the correlation coefficient.

Consider the proof of (2). Let $\hat\rho = \rho + n^{-1/2}u$. Using a Taylor expansion, we have
$$\{\sigma(\hat\rho)\}^{-1} = \{\sigma(\rho) + n^{-1/2}\sigma'(\rho)u\}^{-1} + o_p(n^{-1/2}) = \sigma(\rho)^{-1}\{1 + n^{-1/2}\sigma'(\rho)u/\sigma(\rho)\}^{-1} + o_p(n^{-1/2}) = \sigma(\rho)^{-1}\{1 - n^{-1/2}\sigma'(\rho)u/\sigma(\rho)\} + o_p(n^{-1/2}).$$
Let $v = u/\sigma(\rho)$. The studentization can be expanded as
$$T = \{\sigma(\hat\rho)\}^{-1}u = v - n^{-1/2}\sigma'(\rho)v^2 + o_p(n^{-1/2}).$$
Using a Taylor expansion and noting that $f'(\rho) = 1/\sigma(\rho)$, we have
$$f(\hat\rho) = f(\rho) + n^{-1/2}f'(\rho)u + n^{-1}f''(\rho)u^2/2 + o_p(n^{-1}) = f(\rho) + n^{-1/2}u/\sigma(\rho) - n^{-1}\sigma'(\rho)u^2/\{2\sigma^2(\rho)\} + o_p(n^{-1}) = f(\rho) + n^{-1/2}v - n^{-1}\sigma'(\rho)v^2/2 + o_p(n^{-1}).$$
So,
$$Z = n^{1/2}\{f(\hat\rho) - f(\rho)\} = v - n^{-1/2}\sigma'(\rho)v^2/2 + o_p(n^{-1/2}).$$
Therefore, since $E(v^2) \to 1$,
$$E(T) - E(Z) = n^{-1/2}(-\sigma'(\rho)/2)E(v^2) + o(n^{-1/2}) = n^{-1/2}(-\sigma'(\rho)/2) + o(n^{-1/2}).$$
Expansion (1) can be proved by similar discussions.
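The expansions of $T$ and $Z$ above can be checked numerically. As a minimal sketch (ours, not from the paper), take the normal-case quantities $\sigma(\rho) = 1 - \rho^2$ and $f(\rho) = \operatorname{arctanh}\rho$ from Section 3.1, fix a deviation $u$, and compare $T - Z$ with the leading term $-n^{-1/2}\sigma'(\rho)v^2/2$:

```python
import math

# Normal-case quantities: sigma(rho) = 1 - rho^2, f = arctanh (Fisher's z).
sigma = lambda r: 1.0 - r * r          # sigma(rho)
dsigma = lambda r: -2.0 * r            # sigma'(rho)
f = math.atanh                         # variance stabilizing transformation

rho, u, n = 0.5, 0.3, 10**6            # true rho, fixed deviation u, sample size n
rho_hat = rho + u / math.sqrt(n)       # rho_hat = rho + n^{-1/2} u

T = math.sqrt(n) * (rho_hat - rho) / sigma(rho_hat)   # studentization T
Z = math.sqrt(n) * (f(rho_hat) - f(rho))              # stabilized transform Z

v = u / sigma(rho)
leading = -dsigma(rho) * v**2 / (2 * math.sqrt(n))    # -n^{-1/2} sigma'(rho) v^2 / 2

# T - Z agrees with the leading term up to the next order, here O(1/n):
print(T - Z, leading)
```

With $n = 10^6$ the remainder is of order $10^{-6}$, so the two printed values agree to several digits, which is exactly the content of the expansions of $T$ and $Z$ used in the proof of (2).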
3. Examples

3.1. Normal case

Let $x$ be a bivariate normally distributed random variate and $x_1, \ldots, x_n$ a random sample. The maximum likelihood estimator of $\rho$ is
$$\hat\rho = s_{12}/\sqrt{s_{11}s_{22}}, \eqno(3)$$
where $S = (s_{ij}) = \sum_i (x_i - \bar x)(x_i - \bar x)'$ and $\bar x = \sum_i x_i/n$. The asymptotic distribution of $u = n^{1/2}(\hat\rho - \rho)$ is normal with mean zero and variance $\sigma^2(\rho) = (1 - \rho^2)^2$. The variance stabilizing transformation is
$$f(\rho) = \frac{1}{2}\log\frac{1+\rho}{1-\rho}.$$
The derivative of $\sigma(\rho)$ is $\sigma'(\rho) = -2\rho$. It holds that
$$\kappa_3(T) - \kappa_3(Z) = n^{-1/2}(6\rho) + o(n^{-1/2}), \qquad m(T) - m(Z) = n^{-1/2}\rho + o(n^{-1/2}).$$
The skewness and the mean of $Z$ are $\kappa_3(Z) = o(n^{-1/2})$ and $m(Z) = n^{-1/2}(\rho/2) + o(n^{-1/2})$ (see Anderson, 1984). Focus on the order $n^{-1/2}$. If $\rho > 0$, the above provide that $\kappa_3(T) - \kappa_3(Z) > 0$, $m(T) - m(Z) > 0$, $\kappa_3(Z) = 0$, and $m(Z) > 0$; therefore, $\kappa_3(T) > \kappa_3(Z) = 0$ and $m(T) > m(Z) > 0$. Similar discussions show that $\kappa_3(T) < \kappa_3(Z) = 0$ and $m(T) < m(Z) < 0$ if $\rho < 0$. These show that the variance stabilizing transformation is better than the studentization in the sense of the mean as well as the skewness.

3.2. Elliptical case

Let $x$ be a bivariate elliptically distributed random variate and $x_1, \ldots, x_n$ a random sample. Consider the estimator $\hat\rho$ defined by (3). The asymptotic distribution of $u = n^{1/2}(\hat\rho - \rho)$ is normal with mean zero and variance $\sigma^2(\rho) = (1 - \rho^2)^2(1 + \kappa_4)$, where $\kappa_4$ is the known fourth cumulant (see Muirhead, 1982). The variance stabilizing transformation is
$$f(\rho) = \frac{1}{2\sqrt{1+\kappa_4}}\log\frac{1+\rho}{1-\rho}.$$
The derivative of $\sigma(\rho)$ is $\sigma'(\rho) = -2\rho\sqrt{1+\kappa_4}$. It holds that
$$\kappa_3(T) - \kappa_3(Z) = n^{-1/2}(6\rho\sqrt{1+\kappa_4}) + o(n^{-1/2}).$$
The skewness of $Z$ is $\kappa_3(Z) = o(n^{-1/2})$ (see Gayen, 1951, though this result was not clearly stated there). Similar discussions are possible as in Section 3.1. Gayen (1951) calculated various approximate quantities of $T$ and $Z$, including the non-normal case. However, we could not find other convenient situations except the skewness in the elliptical case.
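The normal case of Section 3.1 can also be illustrated by simulation. The following Monte Carlo sketch (ours; the function and variable names are not from the paper) draws bivariate normal samples, forms $T$ and $Z$ from the estimator (3), and compares their skewness and mean; for $\rho > 0$ the expansions predict $\kappa_3(T) > \kappa_3(Z) \approx 0$ and $m(T) > m(Z) > 0$.

```python
import math
import random

random.seed(1)
rho, n, reps = 0.6, 30, 20000

def sample_corr(rho, n):
    # One bivariate normal sample via x = e1, y = rho*e1 + sqrt(1-rho^2)*e2.
    xs, ys = [], []
    for _ in range(n):
        e1, e2 = random.gauss(0, 1), random.gauss(0, 1)
        xs.append(e1)
        ys.append(rho * e1 + math.sqrt(1 - rho * rho) * e2)
    mx, my = sum(xs) / n, sum(ys) / n
    s11 = sum((a - mx) ** 2 for a in xs)
    s22 = sum((b - my) ** 2 for b in ys)
    s12 = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return s12 / math.sqrt(s11 * s22)            # the MLE (3)

def skew(w):
    # Sample skewness: third central moment over s.d. cubed.
    m = sum(w) / len(w)
    s2 = sum((a - m) ** 2 for a in w) / len(w)
    return sum((a - m) ** 3 for a in w) / len(w) / s2 ** 1.5

mean = lambda w: sum(w) / len(w)

Ts, Zs = [], []
for _ in range(reps):
    r = sample_corr(rho, n)
    Ts.append(math.sqrt(n) * (r - rho) / (1 - r * r))             # studentization T
    Zs.append(math.sqrt(n) * (math.atanh(r) - math.atanh(rho)))   # Fisher's z

print(skew(Ts), skew(Zs), mean(Ts), mean(Zs))
```

With $\rho = 0.6$ and $n = 30$, the expansions give $\kappa_3(T) - \kappa_3(Z) \approx 6\rho/\sqrt{n} \approx 0.66$ and $m(T) - m(Z) \approx \rho/\sqrt{n} \approx 0.11$, and the simulated skewness and mean of $T$ are visibly larger than those of $Z$.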
3.3. Interclass case

Let $x$ be a bivariate normally distributed random variate with equal variances and $x_1, \ldots, x_n$ a random sample. The correlation coefficient is then called the 'interclass correlation coefficient'. The maximum likelihood estimator of $\rho$ is
$$\hat\rho = 2s_{12}/(s_{11} + s_{22}),$$
where $S = (s_{ij})$ is defined in Section 3.1. The asymptotic distribution of $u = n^{1/2}(\hat\rho - \rho)$ is normal with mean zero and variance $\sigma^2(\rho) = (1 - \rho^2)^2$, which is the same as in Section 3.1. So all the corresponding relations are the same as in Section 3.1. However, the skewness and the mean are not the same. Using the delta method (see, e.g., Section 2.6 of Hall, 1992), we can obtain
$$\kappa_3(Z) = o(n^{-1/2}), \qquad m(Z) = o(n^{-1/2}).$$
Consequently, the variance stabilizing transformation is better than the studentization in the sense of skewness and mean.

3.4. Missing case

Let $z = (x, y)$ be a bivariate normally distributed random variate with equal variances. Suppose that there are $n_1$ paired data $(x_i, y_i)$, $n_2$ unpaired data $x_i^*$, and $n_3$ unpaired data $y_i^*$. The data can be pictured as
$$x_1, \ldots, x_{n_1}, \; x_1^*, \ldots, x_{n_2}^*$$
$$y_1, \ldots, y_{n_1}, \; y_1^*, \ldots, y_{n_3}^*.$$
Let $\hat\rho$ be the maximum likelihood estimator. Konishi and Shimizu (1994) obtained the likelihood equation, the asymptotic distribution, and the variance stabilizing transformation. Since the likelihood equation and the variance stabilizing transformation are complicated, the details are omitted in this section. The asymptotic distribution of $u = n^{1/2}(\hat\rho - \rho)$, where $n = n_1 + n_2 + n_3$, is normal with mean zero and variance
$$\sigma^2(\rho) = (1 - \rho^2)^2\,\delta^{-1}(1 + 1/c_1),$$
where $\delta = (1 + c_1) + (1 - c_1)\rho^2$ and $c_1 = n_1/n$. The derivative of $\sigma(\rho)$ is $\sigma'(\rho) = -\rho\eta$, where
$$\eta = \{(1 - c_1)\rho^2 + (3 + c_1)\}\,\delta^{-3/2}\sqrt{1 + 1/c_1} > 0.$$
It follows that
$$\kappa_3(T) - \kappa_3(Z) = n^{-1/2}(3\rho\eta) + o(n^{-1/2}).$$
Using the delta method, after simple but troublesome calculations, we can obtain
$$\kappa_3(Z) = n^{-1/2}\rho\xi + o(n^{-1/2}),$$
where $\xi = (1 - c_1)\{(3 + \rho^2) + 3c_1(1 - \rho^2)\}\,\delta^{1/2}/\sqrt{1 + 1/c_1} > 0$. By similar discussions as in the previous sections, we can see that the variance stabilizing transformation is better than the studentization in the sense of skewness. (The convenient situation could not be seen for the mean, though the author calculated the mean of order $n^{-1/2}$.)
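The derivative formula $\sigma'(\rho) = -\rho\eta$ for the missing case can be verified numerically. The sketch below (our illustration; the function names are not from the paper) implements $\sigma(\rho) = (1 - \rho^2)\delta^{-1/2}\sqrt{1 + 1/c_1}$ and compares a central finite difference with $-\rho\eta$; it also checks that the complete-data limit $c_1 = 1$ recovers $\sigma(\rho) = 1 - \rho^2$ of Section 3.1.

```python
import math

def sigma(rho, c1):
    # Asymptotic s.d. in the missing case: (1 - rho^2) delta^{-1/2} sqrt(1 + 1/c1),
    # with delta = (1 + c1) + (1 - c1) rho^2 and c1 = n1/n.
    delta = (1 + c1) + (1 - c1) * rho * rho
    return (1 - rho * rho) * delta ** -0.5 * math.sqrt(1 + 1 / c1)

def eta(rho, c1):
    # eta = {(1 - c1) rho^2 + (3 + c1)} delta^{-3/2} sqrt(1 + 1/c1) > 0.
    delta = (1 + c1) + (1 - c1) * rho * rho
    return ((1 - c1) * rho * rho + (3 + c1)) * delta ** -1.5 * math.sqrt(1 + 1 / c1)

rho, c1, h = 0.4, 0.6, 1e-6
fd = (sigma(rho + h, c1) - sigma(rho - h, c1)) / (2 * h)  # central difference
print(fd, -rho * eta(rho, c1))                            # the two values agree
```

At $c_1 = 1$ (no missing data), $\delta = 2$ and $\sqrt{1 + 1/c_1} = \sqrt{2}$ cancel, so $\sigma(\rho) = 1 - \rho^2$, consistent with Section 3.1.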
References

Anderson, T.W., 1984. An Introduction to Multivariate Statistical Analysis. Wiley, New York.
Dahiya, R.C., Korwar, R.M., 1980. Maximum likelihood estimators for a bivariate normal distribution with missing data. Ann. Statist. 8, 687–692.
David, F.N., 1938. Tables of the Correlation Coefficient. Cambridge University Press, Cambridge.
Fujisawa, H., 1996. The maximum likelihood estimators in a multivariate normal distribution with AR(1) covariance structure for monotone data. Ann. Inst. Statist. Math. 48, 423–428.
Fujisawa, H., 1999. Effects of unpaired data for estimating an interclass correlation. Commun. Statist. Theory Meth. 28, 245–254.
Gayen, A.K., 1951. The frequency distribution of the product-moment correlation coefficient in random samples of any size drawn from non-normal universes. Biometrika 38, 219–247.
Garren, S.T., 1998. Maximum likelihood estimation of the correlation coefficient in a bivariate normal model with missing data. Statist. Probab. Lett. 38, 281–288.
Hall, P., 1992. The Bootstrap and Edgeworth Expansion. Springer, New York.
Kariya, T., Krishnaiah, P.R., Rao, C.R., 1983. Inference on parameters of multivariate normal populations when some data is missing. In: Krishnaiah, P.R. (Ed.), Developments in Statistics, Vol. 4. Academic Press, New York, pp. 137–184.
Konishi, S., 1981. Normalizing transformations of some statistics in multivariate analysis. Biometrika 68, 647–651.
Konishi, S., Shimizu, K., 1994. Maximum likelihood estimation of an intraclass correlation in a bivariate normal distribution with missing observations. Commun. Statist. Theory Meth. 23, 1593–1604.
Lahiri, S.N., 1997. Variance stabilizing transformations, studentization and the bootstrap. J. Statist. Plann. Inf. 61, 105–123.
Muirhead, R.J., 1982. Aspects of Multivariate Statistical Theory. Wiley, New York.
Srivastava, M.S., 1993. Estimation of the intraclass correlation coefficient. Ann. Human Genetics 57, 159–165.