Statistics & Probability Letters 78 (2008) 488–489 www.elsevier.com/locate/stapro
A note on confidence intervals for the power of t-test

Dennis Gilliland^a, Mingfei Li^{b,*}

^a Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
^b Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA

Received 13 August 2007; received in revised form 15 August 2007; accepted 15 August 2007
Available online 25 August 2007
Abstract

Tarasińska [2005. Confidence intervals for the power of Student's t-test, Statist. Probab. Lett. 73, 125–130] proposes a minimum length method for determining confidence interval estimates for the power of the one-sided t-test at fixed alternative means. This note points out the lack of proof for the coverage probability and provides results to indicate that the nominal coverage probability is not achieved.
© 2007 Elsevier B.V. All rights reserved.

MSC: 62F30; 62F25; 62G10; 62G15

Keywords: t-test; Power; Confidence intervals
Let $Y$ be a random variable distributed $P_\sigma$ and $(a(Y), b(Y))$ be a $100(1-\gamma)\%$ confidence interval estimator for the parameter $\sigma$. Then if $\omega = f(\sigma)$ is a strictly increasing function of $\sigma$, $(f(a(Y)), f(b(Y)))$ is a $100(1-\gamma)\%$ confidence interval estimator for $\omega$, and if $\omega = f(\sigma)$ is a strictly decreasing function of $\sigma$, $(f(b(Y)), f(a(Y)))$ is a $100(1-\gamma)\%$ confidence interval estimator for $\omega$.

Here we briefly review the development in Tarasińska (2005), borrowing from the notation therein. Consider the power $\omega$ of the $\alpha$ level t-test of $H_0: \mu = \mu_0$ versus $H_a: \mu > \mu_0$ based on $X_1, X_2, \ldots, X_n$ iid $N(\mu, \sigma)$, namely,

$$\text{Power} = \omega = 1 - G_{n-1,\delta}(t_0), \tag{1}$$

where $G_{\nu,\delta}$ is the cdf of the non-central t-distribution with $\nu$ degrees of freedom and non-centrality parameter $\delta$, and $t_0$ is the $1-\alpha$ quantile of $G_{n-1,0}$. From the definition of the t-distribution it follows that

$$G_{\nu,\delta}(t) = E\,\Phi\!\left(t\sqrt{Y/\nu} - \delta\right), \tag{2}$$

where the expectation $E$ is with respect to $Y$, a chi-square random variable with $\nu$ degrees of freedom, $\delta$ is the non-centrality parameter, and $\Phi$ is the cdf of the standard normal distribution. It is obvious from (2), and is observed in Tarasińska (2005), that $G_{\nu,\delta}(t)$ is strictly decreasing in $\delta$. On the other hand, for the test under consideration, $\delta = \sqrt{n}\,\Delta/\sigma$ is strictly decreasing in the scale parameter $\sigma > 0$ for fixed $\Delta = \mu - \mu_0 > 0$. It follows that any $100(1-\gamma)\%$ confidence interval estimator for $\sigma$ based on $Y = \sum_{i=1}^{n} (X_i - \bar{X})^2$ transforms to a

* Corresponding author.
E-mail addresses: [email protected] (D. Gilliland), [email protected] (M. Li).
0167-7152/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2007.08.002
Table 1
Operating characteristics of confidence intervals for power of the t-test

| Case | Power $\omega$ | CLT coverage | CLT length | CL coverage | CL length |
|---|---|---|---|---|---|
| $n = 2$, $\Delta = 0.5$ | 0.106 | 0.896 (0.007) | 0.298 (0.005) | 0.955 (0.005) | 0.329 (0.006) |
| $n = 2$, $\Delta = 1$ | 0.180 | 0.890 (0.007) | 0.466 (0.006) | 0.938 (0.005) | 0.512 (0.006) |
| $n = 4$, $\Delta = 0.5$ | 0.196 | 0.899 (0.007) | 0.390 (0.005) | 0.941 (0.005) | 0.414 (0.005) |
| $n = 4$, $\Delta = 1$ | 0.461 | 0.914 (0.006) | 0.660 (0.003) | 0.956 (0.005) | 0.695 (0.003) |
| $n = 16$, $\Delta = 0.5$ | 0.604 | 0.919 (0.006) | 0.454 (0.001) | 0.951 (0.005) | 0.461 (0.001) |
| $n = 16$, $\Delta = 1$ | 0.985 | 0.940 (0.005) | 0.156 (0.002) | 0.946 (0.005) | 0.196 (0.003) |

Averages are reported with standard errors in parentheses based on the $N = 2000$ repetitions.
$100(1-\gamma)\%$ interval estimator for $\omega$. The usual $100(1-\gamma)\%$ confidence interval estimator for $\sigma$ is $a < \sigma < b$, where $a = \sqrt{Y/B}$ and $b = \sqrt{Y/A}$ with $A$ and $B$ such that $F(B) - F(A) = 1-\gamma$, where $F$ is the cdf of the central chi-square distribution with $n-1$ degrees of freedom. The typical choice is $A = F^{-1}(\gamma/2)$ and $B = F^{-1}(1-\gamma/2)$. Note that $A$ and $B$ do not depend upon $Y$.

The above is straightforward and very widely known. However, Tarasińska (2005, pp. 126–127, Table 1) proposes using positions $A$ and $B$ to minimize the length of the confidence interval for the power subject to the constraint $F(B) - F(A) = 1-\gamma$. In this case, the minimizing $A$ and $B$ depend upon $Y$, and the resulting intervals for power $\omega$ and their corresponding intervals for $\sigma$ are not proven to have the nominal coverage probability $1-\gamma$.

A program was written in MatLab to simulate two confidence interval estimators for power in six cases. The program calls upon MatLab for its cdfs of non-central t-distributions, inverses of cdfs of central t-distributions, cdfs and their inverses of chi-square distributions, and to generate the chi-square variates. The minimization for the Tarasińska method was done by comparing results across grids with steps 0.001. In each case, $\sigma = 1$ and $N = 2000$ chi-square variates $Y$ were generated. Also, the level of the t-test is 0.05 and the nominal confidence level is 95%, which are the choices in Tarasińska (2005, Table 1). Results are reported in Table 1 for the Tarasińska interval, denoted as CLT, and for the interval CL based on transforming a 95% CI for $\sigma$, namely $a < \sigma < b$ where $a = \sqrt{Y/F^{-1}(0.975)}$ and $b = \sqrt{Y/F^{-1}(0.025)}$. By their construction, the CLTs have smallest length for every realization $Y$.
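The simulation just described can be sketched in Python with SciPy. This is an illustrative sketch only, not the authors' MatLab program. The function names (`t_test_power`, `power_interval`, `simulate`), the random seed, and in particular the choice to grid over the lower tail probability $F(A)$ in steps of 0.001 are assumptions; the note does not say which quantity its grid ran over.

```python
import numpy as np
from scipy import stats


def t_test_power(sigma, n, delta_mean, alpha=0.05):
    """Power of the one-sided level-alpha t-test, as in Eq. (1)."""
    df = n - 1
    t0 = stats.t.ppf(1.0 - alpha, df)  # 1 - alpha quantile of the central t
    noncentrality = np.sqrt(n) * delta_mean / np.asarray(sigma, dtype=float)
    return stats.nct.sf(t0, df, noncentrality)  # sf = 1 - cdf


def power_interval(y, n, delta_mean, p_lower, gamma=0.05):
    """Map a < sigma < b, with F(A) = p_lower and F(B) = p_lower + 1 - gamma,
    to an interval for the power."""
    df = n - 1
    A = stats.chi2.ppf(p_lower, df)
    B = stats.chi2.ppf(p_lower + 1.0 - gamma, df)
    a, b = np.sqrt(y / B), np.sqrt(y / A)
    # Power is strictly decreasing in sigma, so the endpoints swap.
    return t_test_power(b, n, delta_mean), t_test_power(a, n, delta_mean)


def simulate(n=16, delta_mean=0.5, sigma=1.0, gamma=0.05, reps=2000, seed=0):
    """Estimate coverage and mean length for the CL and (assumed) CLT intervals."""
    rng = np.random.default_rng(seed)  # seed is an arbitrary choice
    y = sigma**2 * rng.chisquare(n - 1, size=reps)
    omega = t_test_power(sigma, n, delta_mean)

    # CL: equal-tail choice of A and B, which does not depend on Y.
    lo, hi = power_interval(y, n, delta_mean, gamma / 2, gamma)
    cl = (np.mean((lo < omega) & (omega < hi)), np.mean(hi - lo))

    # CLT (assumed form): for each Y, pick the tail split on a 0.001 grid
    # that gives the shortest interval for the power.
    grid = np.arange(1, int(round(gamma * 1000))) / 1000.0
    lo_g, hi_g = power_interval(y[:, None], n, delta_mean, grid, gamma)
    best = np.argmin(hi_g - lo_g, axis=1)
    rows = np.arange(reps)
    lo_t, hi_t = lo_g[rows, best], hi_g[rows, best]
    clt = (np.mean((lo_t < omega) & (omega < hi_t)), np.mean(hi_t - lo_t))

    return {"power": float(omega), "CL": cl, "CLT": clt}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, value)
```

Because the equal-tail split $\gamma/2$ lies on the grid, the grid-minimized interval in this sketch can never be longer than the CL interval for the same $Y$. Its coverage, however, is not fixed by construction, since the selected $A$ and $B$ now depend on $Y$.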
Table 1 shows that CLT has diminished coverage probability, as expected. By construction, CL has 95% coverage probability, and its results are reported in Table 1 as a check on the simulation. As an additional check on the MatLab program, simulated CLT endpoint values were compared with values found in Tarasińska (2005, Table 1) for those realizations $\Delta/\sqrt{Y/\nu} = \Delta/S$ that matched values in the table, and there was excellent agreement.

References

MatLab Version 7.2.0.232 (R2006a). The Math Works, Inc., Natick, MA.
Tarasińska, J., 2005. Confidence intervals for the power of Student's t-test. Statist. Probab. Lett. 73, 125–130.