Computational Statistics & Data Analysis 46 (2004) 511 – 529 www.elsevier.com/locate/csda
Comparison of two normal populations with restricted means

Viviana Giampaoli (a,*), Julio da Motta Singer (b)

(a) Departamento de Estatística, Universidade Federal de Pernambuco, CCEN, Recife PE 5074-540, Brazil
(b) Departamento de Estatística, Universidade de São Paulo, Caixa Postal 66281, São Paulo SP 05311-970, Brazil

Received 1 March 2002; received in revised form 1 June 2003
Abstract

We consider the problem of comparing the means of two normal populations under the assumption that they are bounded. Assuming that the common variance is known, we propose an alternative test and show, via a simulation study, that its power is greater than that of the usual Z test in most cases of practical interest. The result is extended to the case where the common variance is unknown and a numerical example is presented.
© 2003 Elsevier B.V. All rights reserved.

Keywords: Restricted parameter spaces; Two sample problem
1. Introduction

Consider the problem of comparing the mean diastolic blood pressure of individuals submitted to a stress stimulus (such as the death of a close relative or discharge from employment) with that of individuals under normal (no stress) conditions. A typical setup for this kind of investigation involves blood pressure measurements on a group of n1 individuals under stress conditions and on a control group of n2 individuals. A numerical example for the special case n1 = n2 = 11 is displayed in Table 1. Each entry corresponds to the average of a series of 30 measurements of diastolic blood pressure taken over periods of 1 h to eliminate short-term fluctuations.
* Corresponding author.
E-mail addresses: [email protected] (V. Giampaoli), [email protected] (J.M. Singer).

0167-9473/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.csda.2003.09.001
Table 1
Diastolic blood pressure (mmHg)

Stress group:   89.6  92.4  87.1  92.2  96.4  92.2  92.7  95.0  92.2  109.2  96.8
Control group:  81.7  89.9  81.5  89.4  94.6  93.5  85.5  95.5  88.9   95.4  97.0
Assuming that the (mean) diastolic blood pressures follow normal homoscedastic distributions, this is the prototypical situation for which Student's t test is an appropriate statistical tool. Here, we are interested in comparing the corresponding means under the additional assumption that the subjects in both groups are hypertensive and have mean diastolic blood pressures known to be at least equal to 90 mmHg. This occurs, for example, in studies where patients are being screened for the detection of possible causes of hypertension. This type of problem has been addressed by many authors under the general denomination of statistical inference for constrained parameters. Considering a multivariate normal population with mean vector μ = (μ1, μ2, ..., μk)ᵗ and known unstructured covariance matrix Σ, assumed to be non-singular, Perlman (1969) considers tests of

  H0: μ ∈ P0   versus   H1: μ ∈ P1,    (1)
where P0 and P1 are closed positively homogeneous sets, i.e., such that x ∈ Pi implies ax ∈ Pi for all positive real numbers a, with P0 ⊂ P1. The sets P0 = {0} and P1 = {μ: μ1 ≥ 0, μi ∈ R, i = 2, ..., k}, for example, satisfy this requirement. Perlman (1969) obtains the exact null distribution of the likelihood ratio test for the special case where P0 = {0} and P1 = {μ: μi ≥ 0, i = 1, ..., k}; he also derives sharp upper and lower bounds for the null distribution of likelihood ratio tests for the more general case of one-sided alternatives, where P0 = {0} and P1 = C, a cone. More generally, consider the linear regression model

  y = Xβ + ε,    (2)

where y is an n×1 vector of response variables, X is an n×p known matrix of explanatory variables assumed to be of rank p, β is a p×1 vector of unknown parameters and ε is an n×1 random vector such that E(ε) = 0 and Var(ε) = Ω = σ²Q, with σ² > 0 and Q a known matrix of constants. Estimation under linear models of form (2) subject to inequality restrictions of the type

  Rβ ≥ r,    (3)
where R (r×p) and r (r×1) are, respectively, a matrix and a vector of known constants, is studied by Judge and Takayama (1966), who obtain inequality-constrained least squares estimators of the regression coefficients. Liew (1976) discusses the large sample properties of such estimators. In fact, the problem under investigation here is a special case of (2) and (3). For the data presented in Table 1, we have p = 2, X = I2 ⊗ 1_11, Ω = σ²I_22, β = (μX, μY)ᵗ, R = I2 and r = (90, 90)ᵗ, where ⊗ denotes the Kronecker product, Ik denotes the identity matrix of order k and 1k denotes a k-vector with all elements equal to 1. In general, if the sample sizes n1 and n2 are equal, this problem reduces to (1) with the sets P0 = {μ: μ1 = μ2, μi ≥ c, i = 1, 2} and P1 = {μ: μ1 ≥ μ2, μi ≥ c, i = 1, 2}. This does not hold for unequal sample sizes. Furthermore, we are interested in hypothesis testing instead of estimation problems. In Section 2, we propose an alternative test for μX = μY assuming that the common variance σ² is known. The critical value depends on a nuisance parameter which is unknown. This parameter is replaced by an estimate and the resulting test becomes an approximate test. A simulation study designed to evaluate the performance of such a test in terms of level of significance and power is presented in Section 3. The extension to the case of unknown variance is outlined in Section 4. We conclude with a brief discussion of the results in Section 5.
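To make the connection with (2) and (3) concrete for the data in Table 1, the following sketch (ours, not part of the original analysis) sets up the design matrix X = I2 ⊗ 1_11 and obtains the inequality-constrained least squares estimates with scipy; because R = I2, the restrictions reduce to simple lower bounds and the restricted estimates are just the sample means truncated below at 90.

```python
# A minimal sketch (not the authors' code) of the constrained least-squares set-up (2)-(3)
# for the Table 1 design; data copied from Table 1, everything else illustrative.
import numpy as np
from scipy.optimize import lsq_linear

stress  = np.array([89.6, 92.4, 87.1, 92.2, 96.4, 92.2, 92.7, 95.0, 92.2, 109.2, 96.8])
control = np.array([81.7, 89.9, 81.5, 89.4, 94.6, 93.5, 85.5, 95.5, 88.9, 95.4, 97.0])

y = np.concatenate([stress, control])          # 22x1 response vector
X = np.kron(np.eye(2), np.ones((11, 1)))       # X = I_2 (x) 1_11, a 22x2 design matrix

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]            # unrestricted LS: the sample means
res = lsq_linear(X, y, bounds=(np.array([90.0, 90.0]), np.inf))  # R*beta >= (90, 90)'
print(beta_ols)   # approx. [94.2, 90.3]
print(res.x)      # restricted estimates; both means already exceed 90 here
```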
2. Definition of the test statistic and power comparison

Let X1, ..., Xn1 and Y1, ..., Yn2 denote independent random samples from normal distributions with means μX ≥ c and μY ≥ c, respectively, and common known variance σ² > 0. Without loss of generality, let c = 0. Consider the problem of testing the hypothesis

  H0: μX = μY   against   H1: μX > μY.    (4)
If no restrictions were imposed on the parameter space, i.e., if μX, μY ∈ R, a level α test of H0 versus H1 would correspond to the rejection of H0 when D̄ = X̄ − Ȳ ≥ Δα, where

  Δα = σ √(1/n1 + 1/n2) Φ^{-1}(1 − α),    (5)

with Φ denoting the standard normal distribution function. The power of the unrestricted parameter test (usually known as the Z test) may be computed from

  P_{μX−μY=δ}(D̄ ≥ Δα) = 1 − Φ[(Δα − δ) / (σ √(1/n1 + 1/n2))].    (6)
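As an illustration, the following sketch (ours; the function name is arbitrary) evaluates (5) and (6) with scipy; with σ = 1, α = 0.05 and n1 = n2 = 20 it returns values in agreement with the corresponding entries of Table 2 below.

```python
# A small sketch evaluating the Z test power (6); assumed helper, not from the paper.
from math import sqrt
from scipy.stats import norm

def z_power(delta, n1, n2, sigma=1.0, alpha=0.05):
    """Power of the usual Z test of H0: mu_X = mu_Y against H1: mu_X > mu_Y, eq. (6)."""
    se = sigma * sqrt(1.0 / n1 + 1.0 / n2)        # standard deviation of Xbar - Ybar
    delta_alpha = se * norm.ppf(1.0 - alpha)      # critical value (5)
    return 1.0 - norm.cdf((delta_alpha - delta) / se)

print(round(z_power(1.0, 20, 20), 3))             # ~0.935, cf. Table 2
print(round(z_power(0.5, 20, 20), 3))             # ~0.475, cf. Table 2
```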
For future comparison purposes, the corresponding powers of the usual Z test, denoted by pZ(α; δ), for α = 0.05, δ = 0.5, 1.0 and different values of the sample sizes n1 and n2, are displayed in Table 2.
Table 2
Power of the Z test for σ = 1

            δ = 0.5                          δ = 1.0
n2\n1     5      10     15     20         5      10     15     20
5       0.194  0.232  0.249  0.259      0.475  0.572  0.615  0.639
10      0.232  0.299  0.337  0.362      0.572  0.723  0.789  0.826
15      0.249  0.337  0.391  0.428      0.615  0.789  0.863  0.900
20      0.259  0.362  0.428  0.475      0.639  0.826  0.900  0.935
Under a restricted parameter space, i.e., μX ≥ 0, μY ≥ 0, we propose an alternative test according to which H0 is rejected when Δ̂ ≥ D*(μ), where Δ̂ denotes the maximum likelihood estimator of δ = μX − μY subject to the restrictions on μX and μY, i.e., Δ̂ = μ̂X − μ̂Y with μ̂X = max{0, X̄} and μ̂Y = max{0, Ȳ}, and where, μ denoting the common value of μX and μY under the null hypothesis, D*(μ) > 0 is a solution to

  P_{μX−μY=0}(Δ̂ ≥ D*(μ)) = α.    (7)

To solve (7), we first show (see the appendix) that
  P_{μX−μY=δ}(Δ̂ ≥ D*(μ))
    = 1 − Φ[(D*(μ) − (μX − μY)) / (σ √(1/n1 + 1/n2))]
      − Φ[(D*(μ) − μX) / (σ/√n1)] Φ[−μY / (σ/√n2)] + h(μY, δ)
    = 1 − Φ[(D*(μ) − δ) / (σ √(1/n1 + 1/n2))]
      − Φ[(D*(μ) − δ − μY) / (σ/√n1)] Φ[−μY / (σ/√n2)] + h(μY, δ),    (8)

where h(μY, δ) is defined in the appendix.
To compute D*(μ) for a given level of significance α, it suffices to set δ = 0 and μX = μY = μ in (8). Since

  − Φ[(D*(μ) − δ − μY) / (σ/√n1)] Φ[−μY / (σ/√n2)] + h(μY, δ)

is a negative function, we have

  D*(μ) ≤ Δα,    (9)
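The following sketch (ours, based on the reconstruction of (8) above, and not the authors' code) evaluates the restricted power function by numerical quadrature and obtains D*(μ) with Brent's method; inequality (9) supplies the bracketing interval.

```python
# A sketch of the numerical computation of D*(mu): h(mu_Y, delta) by quadrature, root by Brent.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import brentq

def restricted_power(D, mu_Y, delta, n1, n2, sigma=1.0):
    """P_{mu_X - mu_Y = delta}(Delta_hat >= D) as in (8), with mu_X = mu_Y + delta."""
    se = sigma * np.sqrt(1.0 / n1 + 1.0 / n2)
    s1, s2 = sigma / np.sqrt(n1), sigma / np.sqrt(n2)
    def integrand(v):                           # kernel of h(mu_Y, delta), cf. (A.4)
        return norm.cdf((D - delta) / s1 + np.sqrt(n1 / n2) * v) * np.exp(-v * v / 2.0)
    h, _ = quad(integrand, -np.inf, -mu_Y / s2)
    h /= np.sqrt(2.0 * np.pi)
    return (1.0 - norm.cdf((D - delta) / se)
            - norm.cdf((D - delta - mu_Y) / s1) * norm.cdf(-mu_Y / s2) + h)

def critical_value(mu, n1, n2, sigma=1.0, alpha=0.05):
    """D*(mu): solves P_{mu_X = mu_Y = mu}(Delta_hat >= D) = alpha; by (9), 0 <= D*(mu) <= Delta_alpha."""
    d_alpha = sigma * np.sqrt(1.0 / n1 + 1.0 / n2) * norm.ppf(1.0 - alpha)
    return brentq(lambda D: restricted_power(D, mu, 0.0, n1, n2, sigma) - alpha, 0.0, d_alpha)

print(critical_value(mu=0.5, n1=5, n2=5))   # a value below Delta_alpha ~ 1.04 for these settings
```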
thus indicating that, if μ were known, the alternative test would be less favourable to the null hypothesis than the usual test when Δ̂ = D̄. The test, however, still involves the nuisance parameter μ. In similar situations, a conditional test is advocated by many authors (see Cox and Hinkley, 1974, Chapter 5, for example). Such tests require a complete sufficient statistic for the unknown parameter, which is not directly available in our case. As an alternative, we consider an approximate solution obtained by substituting an estimate μ̃ for the nuisance parameter μ, as also suggested by these authors. Before evaluating the performance of the approximate alternative test, we examine its power assuming that μ is known. Since an analytical comparison seems intractable due to the number of parameters involved, we consider a numerical analysis. Initially, taking α = 0.05 without loss of generality, we computed the power function for different values of the sample sizes n1 and n2, of the common mean μ under the null hypothesis used to define the critical value D*(μ), of the difference δ between the two means under the alternative hypothesis, and of the distance between the smaller of these means, μY, and its lower bound 0. The results are presented in Tables 3 and 4.
Table 3
Power of the alternative test for σ = 1

                   δ = 0.5 (n1)                   δ = 1.0 (n1)
      n2    μY     5      10     15     20        5      10     15     20
μ = 0.0
      5     0.0   0.273  0.406  0.503  0.574     0.650  0.830  0.891  0.917
      5     0.5   0.399  0.538  0.608  0.650     0.713  0.845  0.893  0.917
      5     1.0   0.715  0.845  0.893  0.917     0.715  0.845  0.893  0.917
      10    0.0   0.282  0.426  0.534  0.614     0.678  0.874  0.937  0.961
      10    0.5   0.383  0.540  0.629  0.685     0.734  0.889  0.940  0.962
      10    1.0   0.735  0.889  0.940  0.962     0.735  0.889  0.940  0.962
      15    0.0   0.286  0.437  0.550  0.636     0.691  0.893  0.955  0.977
      15    0.5   0.373  0.539  0.639  0.704     0.741  0.907  0.958  0.977
      15    1.0   0.741  0.907  0.958  0.977     0.741  0.907  0.958  0.977
      20    0.0   0.288  0.443  0.561  0.650     0.698  0.903  0.964  0.984
      20    0.5   0.365  0.536  0.644  0.715     0.744  0.917  0.967  0.984
      20    1.0   0.744  0.917  0.967  0.984     0.744  0.917  0.967  0.984

μ = 0.5
      5     0.0   0.117  0.131  0.140  0.145     0.433  0.589  0.675  0.725
      5     0.5   0.471  0.576  0.620  0.644     0.540  0.667  0.722  0.752
      5     1.0   0.481  0.578  0.621  0.644     0.547  0.668  0.722  0.752
      10    0.0   0.151  0.186  0.208  0.225     0.508  0.695  0.789  0.839
      10    0.5   0.570  0.723  0.790  0.826     0.591  0.749  0.817  0.853
      10    1.0   0.572  0.723  0.790  0.826     0.593  0.749  0.817  0.853
      15    0.0   0.174  0.228  0.266  0.295     0.552  0.756  0.852  0.899
      15    0.5   0.614  0.789  0.863  0.900     0.621  0.798  0.871  0.907
      15    1.0   0.615  0.790  0.863  0.900     0.622  0.798  0.871  0.907
      20    0.0   0.191  0.260  0.311  0.352     0.581  0.794  0.888  0.932
      20    0.5   0.638  0.826  0.900  0.935     0.641  0.828  0.903  0.937
      20    1.0   0.639  0.826  0.900  0.935     0.641  0.828  0.903  0.937
Table 4
Power of the alternative test for σ = 1 (continued)

                   δ = 0.5 (n1)                   δ = 1.0 (n1)
      n2    μY     5      10     15     20        5      10     15     20
μ = 1.0
      5     0.0   0.079  0.068  0.056  0.046     0.352  0.459  0.521  0.563
      5     0.5   0.177  0.213  0.233  0.246     0.471  0.576  0.620  0.644
      5     1.0   0.200  0.237  0.254  0.264     0.481  0.578  0.621  0.644
      10    0.0   0.137  0.159  0.171  0.177     0.484  0.661  0.753  0.805
      10    0.5   0.226  0.295  0.334  0.360     0.570  0.723  0.790  0.826
      10    1.0   0.232  0.299  0.337  0.362     0.572  0.723  0.790  0.826
      15    0.0   0.168  0.217  0.251  0.276     0.544  0.746  0.842  0.891
      15    0.5   0.247  0.336  0.391  0.428     0.614  0.789  0.863  0.900
      15    1.0   0.249  0.337  0.391  0.428     0.615  0.790  0.863  0.900
      20    0.0   0.188  0.256  0.305  0.345     0.578  0.791  0.885  0.930
      20    0.5   0.259  0.361  0.428  0.475     0.638  0.826  0.900  0.935
      20    1.0   0.259  0.362  0.428  0.475     0.639  0.826  0.900  0.935
We may conclude that, in general, the alternative test is more powerful than the usual Z test. The exceptions occur when the constant μ used to define the critical value lies away from the lower bound but the smaller of the two means is close to the lower bound. We also determined the region where the alternative test is more powerful than the usual Z test. For simplicity, we considered only the case n1 = n2 = n. Let μY*(δ, n) denote the value of μY above which the power function of the alternative test is constant and let γ(μY, δ, n) be a constant for given μY, δ and n. We observe that the power of the alternative test is equal to that of the usual Z test for μ ≥ γ(μY*(δ, n), δ, n) and is greater than it for μ < γ(μY*(δ, n), δ, n). To obtain γ(μY, δ, n), we first fixed δ, n and μY and used (6) to compute the power pZ(α; δ) of the usual Z test. Then, from (8), considering the chosen fixed value of μY, i.e., solving

  P_{μX−μY=δ}(Δ̂ ≥ D*_{pZ}) = pZ(α; δ),

we calculated the corresponding critical value D*_{pZ} using Brent's method and Simpson's rule (see, for example, Press et al., 1994), and finally obtained γ(μY, δ, n) as a solution to

  P_{μX−μY=0}(Δ̂ ≥ D*_{pZ}) = α.

The values of γ(μY, δ, n) for different sample sizes, different values of δ and σ = 1 are displayed in Tables 5 and 6. Let, for example, n = 5 and δ = 1.00. Then for μY = 0.2 we have γ(0.2, 1.00, 5) = 0.60, i.e., for values of μ less than 0.60 the power of the alternative test is greater than the power of the usual Z test; for μ = 0.60 the powers of both tests are equal; and for μ > 0.60 the power of the alternative test is smaller than that of the usual Z test. Therefore, for fixed δ and n, it follows that γ(μY, δ, n) increases with μY; thus the size of the region where the alternative test is more powerful than the usual Z test also increases.
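Before turning to Tables 5 and 6, we outline one possible implementation of this two-step search (ours; it assumes the restricted_power function from the earlier sketch is available in the same session, and the brackets used for the root searches are illustrative).

```python
# A hypothetical sketch of the two-step computation of gamma(mu_Y, delta, n) described above.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def gamma_threshold(mu_Y, delta, n, sigma=1.0, alpha=0.05, mu_hi=10.0):
    se = sigma * np.sqrt(2.0 / n)
    p_z = 1.0 - norm.cdf(norm.ppf(1.0 - alpha) - delta / se)          # Z test power, eq. (6)
    # critical value giving the alternative test the same power p_z at (mu_Y, delta):
    d_star = brentq(lambda D: restricted_power(D, mu_Y, delta, n, n, sigma) - p_z, 0.0, 5.0 * se)
    # common mean at which d_star is exactly the level-alpha critical value:
    return brentq(lambda mu: restricted_power(d_star, mu, 0.0, n, n, sigma) - alpha, 0.0, mu_hi)

print(round(gamma_threshold(0.2, 1.0, 5), 2))   # compare with the value 0.60 read from Table 5
```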
Table 5
Values of the common mean below which the alternative test is more powerful than the Z test

        δ = 0.25                    δ = 0.50
μY      n = 5   n = 10  n = 15      n = 5   n = 10  n = 15
0.0     0.09    0.09    0.09        0.18    0.18    0.19
0.1     0.19    0.19    0.19        0.29    0.29    0.30
0.2     0.29    0.30    0.30        0.39    0.40    0.41
0.3     0.40    0.40    0.40        0.50    0.51    0.52
0.4     0.50    0.51    0.51        0.60    0.62    0.63
0.5     0.60    0.61    0.61        0.71    0.72    0.73
0.6     0.71    0.71    0.72        0.82    0.83    0.83
0.7     0.81    0.81    0.82        0.92    0.93    0.93
0.8     0.91    0.92    0.92        1.02    1.04    0.99
0.9     1.01    1.02    1.03        1.13    1.10    1.01
1.0     1.11    1.13    *           1.23    1.15    *
1.1     1.22    *       *           1.33    *       *
1.2     1.32    *       *           1.44    *       *
1.3     1.43    *       *           1.54    *       *
1.4     1.54    *       *           1.66    *       *
1.5     1.73    *       *           1.92    *       *

        δ = 0.75                    δ = 1.00
μY      n = 5   n = 10  n = 15      n = 5   n = 10  n = 15
0.0     0.28    0.28    0.29        0.38    0.39    0.41
0.1     0.39    0.40    0.41        0.49    0.51    0.53
0.2     0.49    0.51    0.52        0.60    0.63    0.65
0.3     0.60    0.62    0.64        0.72    0.74    0.76
0.4     0.71    0.73    0.75        0.83    0.86    0.87
0.5     0.82    0.84    0.85        0.94    0.96    0.97
0.6     0.93    0.95    0.97        1.05    1.07    1.05
0.7     1.04    1.05    *           1.15    1.18    1.07
0.8     1.14    1.13    *           1.26    1.27    1.08
0.9     1.25    1.17    *           1.37    1.35    *
1.0     1.35    1.18    *           1.48    *       *
1.1     1.45    1.18    *           1.60    *       *
1.2     1.55    *       *           1.83    *       *
1.3     1.65    *       *           1.85    *       *
1.4     1.74    *       *           *       *       *
1.5     1.81    *       *           *       *       *
1.6     1.85    *       *           *       *       *
1.7     1.86    *       *           *       *       *
Furthermore, with n = 5 and δ = 1.00, the last entry in the corresponding column is γ(1.3, 1.00, 5) = 1.85; thus μY*(1.00, 5) = 1.3, i.e., when μY ≥ 1.3 the power of the alternative test is equal to the power of the usual Z test for μ ≥ 1.85 and is greater than it for μ < 1.85. This situation is represented with asterisks in Table 5. Then, for fixed δ, μY*(δ, n) decreases with n; thus the size of the region where the power of the alternative test is greater than or equal to that of the usual Z test increases.
Table 6
Values of the common mean below which the alternative test is more powerful than the Z test (continued)

        δ = 1.25                    δ = 1.50
μY      n = 5   n = 10  n = 15      n = 5   n = 10  n = 15
0.0     0.48    0.51    0.53        0.60    0.64    0.67
0.1     0.60    0.64    0.66        0.72    0.76    0.79
0.2     0.72    0.75    0.77        0.84    0.88    0.92
0.3     0.83    0.87    0.89        0.95    0.99    *
0.4     0.95    0.98    1.00        1.07    1.08    *
0.5     1.06    1.11    *           1.18    1.13    *
0.6     1.17    *       *           1.29    1.15    *
0.7     1.28    *       *           1.40    *       *
0.8     1.38    *       *           1.50    *       *
0.9     1.49    *       *           1.61    *       *
1.0     1.58    *       *           1.70    *       *
1.1     1.67    *       *           1.79    *       *
1.2     1.73    *       *           1.83    *       *
1.3     1.76    *       *           1.85    *       *
1.4     1.77    *       *           *       *       *

        δ = 1.75                    δ = 2.00
μY      n = 5   n = 10  n = 15      n = 5   n = 10  n = 15
0.0     0.72    0.77    0.80        0.84    0.90    0.79
0.1     0.84    0.89    0.99        0.97    1.02    *
0.2     0.96    1.01    *           1.09    1.10    *
0.3     1.08    1.12    *           1.20    1.15    *
0.4     1.19    1.20    *           1.32    *       *
0.5     1.30    1.25    *           1.44    *       *
0.6     1.41    *       *           1.51    *       *
0.7     1.52    *       *           1.57    *       *
0.8     1.63    *       *           1.61    *       *
0.9     1.73    *       *           1.62    *       *
1.0     1.83    *       *           *       *       *
1.1     1.96    *       *           *       *       *
1.2     2.06    *       *           *       *       *
Now, for example, if μY = 0.2 and n = 5 we have γ(0.2, 0.25, 5) = 0.19, i.e., for values of μ less than 0.19 the power of the alternative test is greater than the power of the usual Z test; similarly, γ(0.2, 0.50, 5) = 0.29 for δ = 0.5, and for δ = 1.5 we have γ(0.2, 1.50, 5) = 0.84. Thus, for fixed n and μY, γ(μY, δ, n) increases with δ. The same pattern of behaviour was observed for different values of α. In summary, we may conclude that the power of the alternative test increases as (i) the common value μ of the mean under the null hypothesis, used to define the critical value D*(μ), gets closer to its lower bound (0), (ii) the means (μX, μY) move away from the common lower bound and (iii) either the distance δ between the means or the sample size n increases. Moreover, except when the constant used to define the critical
value D*(μ) lies away from the lower bound but the smaller of the two means is close to the lower bound, the alternative test is more powerful than the usual Z test. These conclusions imply that, in practical applications, we may prefer the alternative test in most cases, especially in those situations where the difference between the two means is small and both of them are close to the lower bound, as long as its performance is not affected when μ is replaced by an estimate.

3. Performance of the alternative test based on estimators of the common mean

To evaluate the effect of the substitution of the unknown value of μ in (8) by some estimator on the level of significance and power of the alternative test, we considered a simulation study. Following the ideas of Gupta and Rohatgi (1980), we based our study on the following estimators of the common value μ of the restricted means:

  μ̂  = f((X̄ + Ȳ)/2),    (10)

  μ̂1 = |(X̄ + Ȳ)/2|,    (11)

  μ̂2 = (1/(2n)) [ Σ_{i=1}^{n} |Xi| + Σ_{i=1}^{n} |Yi| ],    (12)

  μ̂3 = (1/(2n)) [ Σ_{i=1}^{n} f(Xi) + Σ_{i=1}^{n} f(Yi) ],    (13)
with f(x) = max{0, x}. These estimators are positively biased, with bias(μ̃) = E(μ̃) − μ; in particular, for n1 = n2 = n,

  bias(μ̂)  = (σ/√(2n)) φ(√(2n) μ/σ) − μ Φ(−√(2n) μ/σ),

  bias(μ̂1) = (2σ/√(2n)) φ(√(2n) μ/σ) − 2μ Φ(−√(2n) μ/σ) = 2 bias(μ̂),

  bias(μ̂2) = 2σ φ(μ/σ) − 2μ Φ(−μ/σ) = 2 bias(μ̂3),

where φ denotes the standard normal density function. To correct for bias, the estimators were modified to yield

  μ̂4 = μ̂ − bias(μ̂),

  μ̂5 = μ̂1 − bias(μ̂1),

  μ̂GR1 = μ̂ − bias(μ̂)      if (X̄ + Ȳ)/(2σ) < 0.25 or (X̄ + Ȳ)/(2σ) > 2,
          μ̂1 − bias(μ̂1)    if 0.25 ≤ (X̄ + Ȳ)/(2σ) ≤ 0.5,
          μ̂3 − bias(μ̂3)    if 0.5 < (X̄ + Ȳ)/(2σ) ≤ 2,
  μ̂GR2 = μ̂ − bias(μ̂)      if (X̄ + Ȳ)/(2σ) < 0.25 or (X̄ + Ȳ)/(2σ) > 2,
          μ̂1                if 0.25 ≤ (X̄ + Ȳ)/(2σ) ≤ 0.5,
          μ̂3 − bias(μ̂3)    if 0.5 < (X̄ + Ȳ)/(2σ) ≤ 2,

and

  μ̂V = μ̂ − bias(μ̂)    if (X̄ + Ȳ)/(2σ) < 0.25,
        μ̂               if (X̄ + Ȳ)/(2σ) ≥ 0.25,
which were considered as candidates in the simulation study. Initially, 10 000 pairs of samples of size 5 from the distributions N(μX, σ²) and N(μY, σ²) were generated with μX = μY = μ = 0.0, 0.1, ..., 0.5, 1.0, 1.5, 2.5, 5.0 and σ = 1.00. For each pair of samples, we computed D*(μ̃) with α = 0.05 from (8), replacing μ with each of its competing estimators μ̃. In each case, we computed the empirical significance level as the ratio between the number of samples for which Δ̂ ≥ D*(μ̃) and the total number of samples. The idea here is to select estimators for which the probability of a type I error is close to the nominal level, i.e., for which P_{μX=μY=μ}(Δ̂ ≥ D*(μ̃)) ≈ α.
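The sketch below (ours, not the authors' code) illustrates the estimators (10)–(13) and one bias-corrected version; the bias is evaluated at the current estimate, which is one possible reading of the corrections described above.

```python
# A sketch of the estimators of the common mean; helper names are ours.
import numpy as np
from scipy.stats import norm

def f(t):                                   # f(t) = max{0, t}
    return np.maximum(t, 0.0)

def mu_hat(x, y):                           # (10)
    return f((x.mean() + y.mean()) / 2.0)

def mu_hat1(x, y):                          # (11)
    return abs((x.mean() + y.mean()) / 2.0)

def mu_hat2(x, y):                          # (12)
    return (np.abs(x).sum() + np.abs(y).sum()) / (len(x) + len(y))

def mu_hat3(x, y):                          # (13)
    return (f(x).sum() + f(y).sum()) / (len(x) + len(y))

def bias_mu_hat(mu, n, sigma=1.0):          # bias of (10) evaluated at a given mean value mu
    z = np.sqrt(2 * n) * mu / sigma
    return sigma / np.sqrt(2 * n) * norm.pdf(z) - mu * norm.cdf(-z)

def mu_hat4(x, y, sigma=1.0):               # bias-corrected (10); bias taken at the estimate itself
    m = mu_hat(x, y)
    return m - bias_mu_hat(m, len(x), sigma)

rng = np.random.default_rng(1)
x, y = rng.normal(0.2, 1.0, 5), rng.normal(0.2, 1.0, 5)
print(mu_hat(x, y), mu_hat1(x, y), mu_hat2(x, y), mu_hat3(x, y), mu_hat4(x, y))
```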
The empirical values of α for the alternative test based on the estimators μ̂, μ̂4, μ̂5, μ̂GR1, μ̂GR2 and μ̂V are displayed in Table 7. The empirical significance level of the usual Z test is also presented for comparison purposes. Unfortunately, none of the proposed estimators seems to perform uniformly better than the others. The smallest empirical significance levels are those associated with μ̂, μ̂GR2 or μ̂V when the true value μ ≤ 0.3. The highest are those associated with μ̂4, μ̂5 and μ̂GR1 when the true value satisfies 0.3 < μ < 1.0. For μ ≥ 1.5, all the estimators have similar performance. In view of these considerations, μ̂ and μ̂V are the best candidates. Both tend to be conservative in the sense that the alternative test based on them has a rejection rate (under the null hypothesis) smaller than the nominal one (0.05) for values
Table 7
Empirical level of significance

                      Alternative test based on
True parameter (μ)    μ̂        μ̂4       μ̂5       μ̂GR1     μ̂GR2     μ̂V        Z test
0.0                   0.0310    0.0335    0.0481    0.0358    0.0110    0.0224     0.0505
0.1                   0.0371    0.0401    0.0571    0.0445    0.0202    0.0289     0.0499
0.2                   0.0446    0.0470    0.0631    0.0532    0.0281    0.0365     0.0472
0.3                   0.0514    0.0546    0.0702    0.0617    0.0381    0.0448     0.0525
0.4                   0.0572    0.0605    0.0753    0.0688    0.0460    0.0504     0.0543
0.5                   0.0578    0.0629    0.0733    0.0699    0.0534    0.0542     0.0511
1.0                   0.0545    0.0556    0.0566    0.0596    0.0575    0.0542     0.0504
1.5                   0.0526    0.0526    0.0526    0.0526    0.0528    0.0526     0.0529
2.5                   0.0463    0.0463    0.0463    0.0463    0.0483    0.0463     0.0465
5.0                   0.0484    0.0484    0.0484    0.0484    0.0502    0.0484     0.0488
of near the boundary of the parameter space. As a consequence, the alternative test based on either of them should be conservative, in the sense that it could have less power to detect departures from the null hypothesis. Given their similar performance, we proceeded with the analysis using ˆV and the maximum likelihood estimator . ˆ In a new series of simulations, 10 000 pairs of samples of size 5 from normal distributions with X = Y + , and = 1:00 were generated for each combination of Y = 0:0; 0:1; : : : ; 0:5; 1:0; 1:5; 2:5; 5:0 and = 1:00; 1:25; 1:50. For each pair of samples, ˜ using (8) with = 0 and replacing the unknown value of with we obtained D∗ () either ˆ or ˆV . The empirical power was calculated in each case as the ratio between the number of samples for which Dˆ ¿ D∗ () ˜ and the total number of samples. The results are presented in Table 8. In spite of its expected conservative nature, in both cases, the empirical power of the alternative test is always greater than that of the usual Z test and in most cases it is also greater than the associated nominal power (0.475 for = 1:00; 0:630 for = 1:25, and 0.766 for = 1:50). The diLerence between the empirical power of the alternative test and the empirical power of the usual Z-test is greater when the smaller of the two means, Y , is close to zero. Note that this increase in power occurs even though the corresponding empirical signiJcance levels are smaller than the nominal level (0.05). Furthermore, the empirical power of the alternative test based on ˆV is greater than that obtained using , ˆ especially when the values of Y are close to zero. This suggests that the former could be adopted in practice, mainly in situations where the means are expected to be close to the boundary of the parameter space. Although the gain in power is not of great magnitude, we must keep in mind that conclusions obtained with the alternative test can be diLerent than those obtained with the usual Z test. This may be important in situations where sample sizes are small. For illustration purposes, we consider an analysis of the data presented in Table 1. The required estimates of the means of both populations are XK = 94:2 and YK = 90:3. We assume that the variance is known and equal to the pooled sample variance, namely, sp2 = 31:36. Therefore, it follows that ˆV = ˆ = 92:2 and, taking = 0:05, ˆ = 3:6, in contrast with D = 4:1 for the usual Z test. Given from (8) we obtain D∗ () that Dˆ = DK = XK − YK = 3:9, we reject H0 under the alternative test but not under the Z test.
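For the record, the following sketch (ours; it assumes the critical_value helper from the Section 2 sketch is in scope) reproduces the flavour of this example after shifting the data by the lower bound c = 90 mmHg; the exact figures depend on the quadrature and on the rounding of μ̂.

```python
# A sketch of the Table 1 example with the pooled variance treated as the known sigma^2.
import numpy as np
from scipy.stats import norm

stress  = np.array([89.6, 92.4, 87.1, 92.2, 96.4, 92.2, 92.7, 95.0, 92.2, 109.2, 96.8]) - 90.0
control = np.array([81.7, 89.9, 81.5, 89.4, 94.6, 93.5, 85.5, 95.5, 88.9, 95.4, 97.0]) - 90.0

sigma = np.sqrt(31.36)                                            # pooled variance, treated as known
mu_tilde = max(0.0, (stress.mean() + control.mean()) / 2)         # estimate of the (shifted) common mean
delta_hat = max(0.0, stress.mean()) - max(0.0, control.mean())    # restricted MLE of mu_X - mu_Y

d_star = critical_value(mu_tilde, n1=11, n2=11, sigma=sigma, alpha=0.05)   # approx. 3.6
d_unr  = sigma * np.sqrt(2 / 11) * norm.ppf(0.95)                          # unrestricted critical value (5)
print(delta_hat, d_star, d_unr)   # Delta_hat (about 3.9) exceeds d_star but not the unrestricted value
```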
4. Extension to the unknown variance case

If σ² is unknown, the obvious course is to replace it by the unbiased estimator under model (2) given by

  sp² = [(n1 − 1) sX² + (n2 − 1) sY²] / (n1 + n2 − 2),    (14)

where

  sX² = Σ_{i=1}^{n1} (Xi − X̄)² / (n1 − 1)   and   sY² = Σ_{i=1}^{n2} (Yi − Ȳ)² / (n2 − 1),
Table 8
Empirical power

                       Alternative test based on
True parameter (μY)    μ̂V       μ̂        Z test

δ = 1.00
0.0                    0.487     0.466     0.473
0.1                    0.499     0.487     0.478
0.2                    0.496     0.489     0.473
0.3                    0.494     0.490     0.476
0.4                    0.493     0.491     0.478
0.5                    0.482     0.481     0.467
1.0                    0.477     0.477     0.476
1.5                    0.474     0.474     0.474
2.5                    0.474     0.474     0.474
5.0                    0.472     0.472     0.472

δ = 1.25
0.0                    0.650     0.637     0.627
0.1                    0.656     0.650     0.634
0.2                    0.641     0.638     0.622
0.3                    0.648     0.647     0.632
0.4                    0.640     0.639     0.629
0.5                    0.631     0.631     0.623
1.0                    0.632     0.632     0.632
1.5                    0.630     0.630     0.630
2.5                    0.628     0.628     0.628
5.0                    0.628     0.628     0.628

δ = 1.50
0.0                    0.775     0.771     0.759
0.1                    0.785     0.783     0.771
0.2                    0.779     0.778     0.767
0.3                    0.772     0.772     0.763
0.4                    0.773     0.773     0.767
0.5                    0.766     0.766     0.761
1.0                    0.763     0.763     0.763
1.5                    0.768     0.768     0.768
2.5                    0.771     0.771     0.771
5.0                    0.763     0.763     0.763
so that the critical value dα for a level α test may be computed from

  dα = sp √(1/n1 + 1/n2) T_m^{-1}(1 − α),    (15)

where T_m denotes the Student t distribution function with m = n1 + n2 − 2 degrees of freedom. Consequently, the test of H0 versus H1 in (4) rejects the null hypothesis
if D̄ ≥ dα, i.e., if

  D̄ / (sp √(1/n1 + 1/n2)) ≥ T_m^{-1}(1 − α).

The approximate power of this test under the alternative hypothesis μX − μY = δ > 0 is obtained from

  P_{μX−μY=δ}(D̄ ≥ dα) ≈ 1 − T_m[(dα − δ) / (sp √(1/n1 + 1/n2))].    (16)
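A short sketch (ours; names illustrative) of (15) and (16):

```python
# Critical value and approximate power of the usual t test.
from math import sqrt
from scipy.stats import t as student_t

def t_critical(sp, n1, n2, alpha=0.05):
    m = n1 + n2 - 2
    return sp * sqrt(1 / n1 + 1 / n2) * student_t.ppf(1 - alpha, m)            # d_alpha in (15)

def t_power_approx(delta, sp, n1, n2, alpha=0.05):
    m = n1 + n2 - 2
    se = sp * sqrt(1 / n1 + 1 / n2)
    return 1 - student_t.cdf((t_critical(sp, n1, n2, alpha) - delta) / se, m)  # eq. (16)

print(round(t_power_approx(1.0, 1.0, 5, 5), 3))   # ~0.394, the nominal power quoted later in the text
```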
Following arguments similar to those considered in the known variance case, the critical value d*(μ) for a level α restricted parameter space alternative test may be computed from

  P_{μX−μY=0}(Δ̂ ≥ d*(μ)) = α.    (17)
Therefore, we have

  P_{μX−μY=δ}(Δ̂ ≥ d*(μ)) = P_{μX−μY=δ}(X̄ ≥ d*(μ), X̄ ≥ 0, Ȳ ≤ 0)
                            + P_{μX−μY=δ}(X̄ − Ȳ ≥ d*(μ), X̄ ≥ 0, Ȳ ≥ 0),

so that

  P_{μX−μY=δ}(Δ̂ ≥ d*(μ))
    = 1 − T_m[(d*(μ) − δ) / (sp √(1/n1 + 1/n2))]
      − T_m[(d*(μ) − δ − μY) / (sp/√n1)] T_m[−μY / (sp/√n2)] + h̃(μY, δ),    (18)

where

  h̃(μY, δ) = [√m B(1/2, m/2)]^{-1} ∫_{−∞}^{−μY/(sp/√n2)} T_m[(d*(μ) − δ)/(sp/√n1) + √(n1/n2) t] (1 + t²/m)^{−(m+1)/2} dt,

with B(a, b) denoting the beta function with parameters a and b. For a fixed level of significance α, the value d*(μ) is obtained from (18) with δ = 0.
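In parallel with the known-variance case, the sketch below (ours, based on the reconstruction of (18) above) evaluates the power function with the Student t distribution and obtains d*(μ) by Brent's method.

```python
# A sketch of the numerical computation of d*(mu) for the unknown-variance alternative test.
import numpy as np
from scipy.stats import t as student_t
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.special import beta

def restricted_power_t(d, mu_Y, delta, n1, n2, sp):
    """Approximate P_{mu_X - mu_Y = delta}(Delta_hat >= d) as in (18)."""
    m  = n1 + n2 - 2
    se = sp * np.sqrt(1.0 / n1 + 1.0 / n2)
    s1, s2 = sp / np.sqrt(n1), sp / np.sqrt(n2)
    def integrand(u):                                   # kernel of h~(mu_Y, delta)
        return (student_t.cdf((d - delta) / s1 + np.sqrt(n1 / n2) * u, m)
                * (1.0 + u * u / m) ** (-(m + 1) / 2.0))
    h, _ = quad(integrand, -np.inf, -mu_Y / s2)
    h   /= np.sqrt(m) * beta(0.5, m / 2.0)
    return (1.0 - student_t.cdf((d - delta) / se, m)
            - student_t.cdf((d - delta - mu_Y) / s1, m) * student_t.cdf(-mu_Y / s2, m) + h)

def critical_value_t(mu, n1, n2, sp, alpha=0.05):
    """d*(mu): root of the power at delta = 0 minus alpha; d_alpha in (15) bounds it from above."""
    m = n1 + n2 - 2
    d_alpha = sp * np.sqrt(1.0 / n1 + 1.0 / n2) * student_t.ppf(1.0 - alpha, m)
    return brentq(lambda d: restricted_power_t(d, mu, 0.0, n1, n2, sp) - alpha, 0.0, d_alpha)

print(critical_value_t(mu=0.5, n1=5, n2=5, sp=1.0))     # smaller than d_alpha ~ 1.18 for these settings
```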
As in Section 2, we considered a simulation study to evaluate the effect, on the significance level and on the power of the alternative test, of replacing the unknown μ in (18) by the estimators μ̂ and μ̂V. Initially, 10 000 pairs of samples of size 5 from the distributions N(μX, 1) and N(μY, 1) were generated for μX = μY = μ = 0.0, 0.1, ..., 0.5, 1.0, 1.5.
Table 9
Empirical level of significance for tests based on unknown variances

                      Alternative test based on
True parameter (μ)    μ̂V        μ̂         t test
0.0                   0.0441     0.0327     0.0510
0.1                   0.0488     0.0361     0.0504
0.2                   0.0508     0.0405     0.0483
0.3                   0.0538     0.0465     0.0470
0.4                   0.0590     0.0531     0.0490
0.5                   0.0570     0.0524     0.0477
1.0                   0.0585     0.0583     0.0527
1.5                   0.0515     0.0515     0.0509
2.5                   0.0513     0.0513     0.0512
5.0                   0.0519     0.0519     0.0519
Taking α = 0.05, for each pair of samples we calculated d*(μ̃) from (18), replacing the common value μ with each of its competing estimators. In each case we computed the empirical significance level as the ratio between the number of samples for which Δ̂ ≥ d*(μ̃) and the total number of samples. The empirical levels of significance for both the alternative and the usual t test are displayed in Table 9. The performance of the test based on μ̂V is superior to that of the test based on μ̂ when μ ≤ 0.3, but the situation is reversed when μ ≥ 0.4. For the power evaluations, we performed a new series of simulations, generating 10 000 pairs of samples of size 5 from the distributions N(μX, 1) and N(μY, 1) with μX = μY + δ, for μY = 0.0, 0.1, ..., 0.5, 1.0, 1.5 and δ = 1.00, 1.25, 1.50. For each pair of samples, we obtained d*(μ̃) from (18) with δ = 0, using either μ̂ or μ̂V instead of μ. The empirical power results are shown in Table 10. As in the known variance case, the empirical power of the alternative test is at least equal to that of the usual t test in all instances considered in the simulation study. The empirical powers of both tests are also greater than the nominal power of the usual t test, namely 0.394 for δ = 1.00, 0.545 for δ = 1.25 and 0.689 for δ = 1.50. Furthermore, the empirical power of the alternative test based on μ̂V is greater than that of its counterpart based on μ̂ for values of μY close to the boundary of the parameter space.
Table 10
Empirical power for tests based on unknown variances

                       Alternative test based on
True parameter (μY)    μ̂V       μ̂        t test

δ = 1.00
0.0                    0.465     0.448     0.425
0.1                    0.464     0.454     0.419
0.2                    0.472     0.466     0.425
0.3                    0.459     0.456     0.418
0.4                    0.453     0.452     0.418
0.5                    0.446     0.445     0.415
1.0                    0.422     0.422     0.414
1.5                    0.424     0.424     0.423
2.5                    0.411     0.411     0.411
5.0                    0.415     0.415     0.415

δ = 1.25
0.0                    0.613     0.603     0.571
0.1                    0.604     0.597     0.561
0.2                    0.615     0.612     0.574
0.3                    0.603     0.602     0.564
0.4                    0.599     0.599     0.564
0.5                    0.593     0.593     0.565
1.0                    0.563     0.563     0.556
1.5                    0.566     0.566     0.565
2.5                    0.561     0.561     0.561
5.0                    0.562     0.562     0.561

δ = 1.50
0.0                    0.746     0.741     0.703
0.1                    0.741     0.739     0.702
0.2                    0.729     0.728     0.695
0.3                    0.721     0.721     0.696
0.4                    0.725     0.725     0.699
0.5                    0.715     0.715     0.697
1.0                    0.708     0.708     0.703
1.5                    0.690     0.690     0.689
2.5                    0.691     0.691     0.691
5.0                    0.696     0.696     0.696
5. Discussion

The test statistic under investigation is based on a special case of the estimators proposed by Judge and Takayama (1966) and Liew (1976). Essentially, it consists of the projection of the usual estimator D̄ = X̄ − Ȳ onto the first quadrant (assuming c = 0) and seems to be the natural candidate to replace it in a test of μX = μY under the restricted parameter space. The unknown parameter μ in the expression for the computation of the critical values (D*(μ) in (8) and d*(μ) in (18)) is certainly a subject for concern. The present numerical study suggests that, even if we misspecify the value of μ, the alternative test based on the different estimators considered will be more powerful than the usual test except in some pathological cases. In view of this, we considered an approximate alternative test obtained by replacing μ with an estimate and conducted simulation studies to evaluate its performance. The results do not contradict this conjecture. A further complication must be faced in the case where the variance is unknown. In fact, the pooled sample variance, sp², used in our proposal does not take the restriction on the parameter space into consideration. One might argue that, as the sample means were
replaced by appropriate estimators, so should the sample variance. Some alternatives are the maximum likelihood estimator obtained under the restriction, namely

  σ̂R² = [ Σ_{i=1}^{n1} (Xi − μ̂1)² + Σ_{i=1}^{n2} (Yi − μ̂2)² ] / (n1 + n2),    (19)

where μ̂1 = max{0, X̄} and μ̂2 = max{0, Ȳ}, or

  spr² = (n1 sXr² + n2 sYr²) / (n1 + n2 − 2),    (20)

where

  sXr² = n1^{-1} SX²                                    if X̄ ≥ 0,
         n1^{-1} Σ_{i=1}^{n1} Xi² = n1^{-1} SX² + X̄²    if X̄ < 0,

and

  sYr² = n2^{-1} SY²                                    if Ȳ ≥ 0,
         n2^{-1} Σ_{i=1}^{n2} Yi² = n2^{-1} SY² + Ȳ²    if Ȳ < 0,

with SX² = (n1 − 1) sX² and SY² = (n2 − 1) sY². The expected value of sXr² (or sYr²) was obtained by Heiny and Siddiqui (1970). Both estimators are biased and the bias depends on μ. However, if we replace sp by σ̂R or by spr in (18), the corresponding t distributions may no longer be employed. To evaluate the effect of such a substitution, we compared the empirical distribution of the test statistic based on sp with those obtained using the alternative estimators of the variance, via an additional simulation study. We generated r = 10 000 pairs of random samples of size n from a normal distribution with mean μ and variance σ² for the different combinations of n = 5, 10, μ = 0.0, 0.1, 0.2, 0.5 and σ² = 0.5, 1.0, 1.5, 2.0. For each pair of samples, we computed

  Δ̂ / (sp √(2/n)),   Δ̂ / (spr √(2/n))   and   Δ̂ / (σ̂R √(2/n)),

and obtained the respective sample quantiles of order p = 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9975, 0.999, 0.9995. In Table 11, we present only the quantiles for the case n = 5 and σ² = 0.5, since in all cases the quantiles of the test statistic based on spr are smaller than or equal to those of the test statistic based on the pooled sample standard deviation, sp, which in turn are smaller than or equal to those of the test statistic based on the maximum likelihood estimator σ̂R. In other words, Δ̂/(σ̂R √(2/n)) is stochastically larger than Δ̂/(sp √(2/n)), which in turn is stochastically larger than Δ̂/(spr √(2/n)). Furthermore, all three statistics are stochastically smaller than Student's t statistic, suggesting that the alternative test will reject the null hypothesis more frequently than the usual t test, regardless of the estimator we employ for the unknown variance. Keeping its computational advantage in mind, the use of the pooled sample variance in the expression for the test statistic offers a good compromise between controlling the probability of a type I error and the power of the test.
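The following sketch (ours; helper names are not the authors') implements the two restriction-respecting variance estimators and the three studentized statistics compared in Table 11.

```python
# A sketch of the variance estimators (19)-(20) and of the statistics tabulated in Table 11.
import numpy as np

def sigma2_R(x, y):                                   # (19): restricted ML estimator
    mu1, mu2 = max(0.0, x.mean()), max(0.0, y.mean())
    return (np.sum((x - mu1) ** 2) + np.sum((y - mu2) ** 2)) / (len(x) + len(y))

def s2_pr(x, y):                                      # (20)
    def s2_r(z):                                      # n^{-1} S_z^2 if mean >= 0, else n^{-1} sum z_i^2
        n = len(z)
        ss = np.sum((z - z.mean()) ** 2)
        return ss / n if z.mean() >= 0 else ss / n + z.mean() ** 2
    n1, n2 = len(x), len(y)
    return (n1 * s2_r(x) + n2 * s2_r(y)) / (n1 + n2 - 2)

def s2_p(x, y):                                       # usual pooled variance (14)
    n1, n2 = len(x), len(y)
    return ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)

rng = np.random.default_rng(0)
x, y = rng.normal(0.2, np.sqrt(0.5), 5), rng.normal(0.2, np.sqrt(0.5), 5)
d_hat = max(0.0, x.mean()) - max(0.0, y.mean())
for s2 in (s2_pr(x, y), s2_p(x, y), sigma2_R(x, y)):
    print(d_hat / np.sqrt(s2 * 2 / 5))                # the three studentized statistics of Table 11
```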
Table 11
Sample quantiles for n = 5 and σ² = 0.5

                  μ = 0.0                           μ = 0.1
Quantile (%)      spr      sp       σ̂R             spr      sp       σ̂R
10               -1.760   -1.858   -1.967          -2.084   -2.185   -2.330
25               -1.362   -1.446   -1.523          -1.648   -1.717   -1.843
50                0.000    0.000    0.000           0.000    0.000    0.000
75                0.282    0.292    0.315           0.475    0.489    0.531
90                0.754    0.794    0.843           1.008    1.030    1.126
95                1.061    1.127    1.187           1.383    1.437    1.547
97.5              1.349    1.405    1.508           1.756    1.813    1.964
99                1.729    1.843    1.933           2.239    2.321    2.503
99.5              2.034    2.130    2.274           2.476    2.561    2.768
99.75             2.300    2.467    2.571           2.787    2.944    3.116
99.9              2.753    2.937    3.078           3.251    3.323    3.634
99.95             3.215    3.300    3.595           3.329    3.763    3.722

                  μ = 0.2                           μ = 0.5
Quantile (%)      spr      sp       σ̂R             spr      sp       σ̂R
10               -2.462   -2.498   -2.752          -2.802   -2.809   -3.133
25               -1.989   -2.026   -2.224          -2.252   -2.252   -2.518
50                0.000    0.000    0.000           0.007    0.007    0.008
75                0.579    0.585    0.647           0.695    0.695    0.777
90                1.177    1.192    1.316           1.396    1.397    1.561
95                1.549    1.581    1.732           1.849    1.854    2.067
97.5              1.923    1.963    2.149           2.319    2.323    2.592
99                2.360    2.439    2.638           2.840    2.840    3.176
99.5              2.723    2.793    3.045           3.309    3.310    3.699
99.75             3.017    3.097    3.373           3.620    3.649    4.048
99.9              3.364    3.481    3.761           4.244    4.244    4.745
99.95             4.007    4.203    4.480           4.422    4.422    4.944
Despite the computational difficulties and the small gain over the usual t test, we believe that the proposed test may be useful in situations where sample units are expensive and the means are expected to lie close to the boundary of the parameter space. Otherwise, we feel that knowledge of the small effect of the bounded means on the usual tests allows us to be more comfortable in using them in situations where the phenomena under investigation induce such restrictions, as in the example described in Section 1. We note that extensions to the comparison of more than two populations seem mathematically intractable and must certainly rely on extensive simulation studies. Also, the restricted parameter space is especially attractive for a Bayesian analysis, which we intend to pursue in a future paper.
Acknowledgements

The authors thank Professor Francisco Cribari-Neto, Professor Pranab K. Sen and two anonymous referees for their enlightening suggestions. This work received partial financial support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Brazil.
Appendix A

  P_{μX−μY=δ}(Δ̂ ≥ D*(μ)) = P_{μX−μY=δ}(X̄ ≥ D*(μ), X̄ ≥ 0, Ȳ ≤ 0)
                            + P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ), X̄ ≥ 0, Ȳ ≥ 0)
                            + P_{μX−μY=δ}(−Ȳ ≥ D*(μ), X̄ ≤ 0, Ȳ ≥ 0)
                            + P_{μX−μY=δ}(0 ≥ D*(μ), X̄ ≤ 0, Ȳ ≤ 0).    (A.1)

The third term on the right-hand side of (A.1) is null because, if Ȳ ≥ 0, we cannot have −Ȳ ≥ D*(μ) > 0. The last term on the right-hand side of (A.1) is also null, since D*(μ) > 0. We shall now analyse the remaining terms. First, note that

  P_{μX−μY=δ}(X̄ ≥ D*(μ), X̄ ≥ 0, Ȳ ≤ 0) = P_{μX−μY=δ}(X̄ ≥ D*(μ), Ȳ ≤ 0)
    = {1 − Φ[(D*(μ) − μX)/(σ/√n1)]} Φ[−μY/(σ/√n2)].    (A.2)

Now, consider

  P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ), X̄ ≥ 0, Ȳ ≥ 0) = P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ), Ȳ ≥ 0)
    = P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ)) − P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ), Ȳ ≤ 0)

and note that

  P_{μX−μY=δ}(X̄ − Ȳ ≥ D*(μ), Ȳ ≤ 0)
    = P_{μX−μY=δ}[ (X̄ − μX)/(σ/√n1) ≥ (D*(μ) − μX + Ȳ)/(σ/√n1), Ȳ ≤ 0 ].    (A.3)

Denoting

  Z1 = (X̄ − μX)/(σ/√n1)   and   Z2 = (Ȳ − μY)/(σ/√n2),
we may write the term on the right-hand side of (A.3) as

  P_{μX−μY=δ}[ Z1 ≥ (D*(μ) − (μX − μY))/(σ/√n1) + (√n1/√n2) Z2, Z2 ≤ −μY/(σ/√n2) ]

    = (1/√(2π)) ∫_{−∞}^{−μY/(σ/√n2)} [ ∫_{(D*(μ)−δ)/(σ/√n1)+(√n1/√n2)v}^{∞} (1/√(2π)) exp(−w²/2) dw ] exp(−v²/2) dv

    = (1/√(2π)) ∫_{−∞}^{−μY/(σ/√n2)} { 1 − Φ[(D*(μ) − δ)/(σ/√n1) + (√n1/√n2) v] } exp(−v²/2) dv

    = Φ[−μY/(σ/√n2)] − (1/√(2π)) ∫_{−∞}^{−μY/(σ/√n2)} Φ[(D*(μ) − δ)/(σ/√n1) + (√n1/√n2) v] exp(−v²/2) dv

    = Φ[−μY/(σ/√n2)] − h(μY, δ),    (A.4)

where

  h(μY, δ) = (1/√(2π)) ∫_{−∞}^{−μY/(σ/√n2)} Φ[(D*(μ) − δ)/(σ/√n1) + (√n1/√n2) v] exp(−v²/2) dv.
Using (A.1)–(A.4), we obtain (8).

References

Cox, D.R., Hinkley, D.V., 1974. Theoretical Statistics. Chapman & Hall, London.
Gupta, A.K., Rohatgi, V.K., 1980. On the estimation of restricted mean. J. Statist. Plann. Inference 4, 369–379.
Heiny, R.L., Siddiqui, M.M., 1970. Estimation of the parameters of a normal distribution when the mean is restricted to an interval. Austral. J. Statist. 12 (2), 112–117.
Judge, G.G., Takayama, T., 1966. Inequality restrictions in regression analysis. J. Amer. Statist. Assoc. 61 (313), 166–181.
Liew, C.K., 1976. Inequality constrained least-squares estimation. J. Amer. Statist. Assoc. 71 (355), 746–751.
Perlman, M.D., 1969. One-sided testing problems in multivariate analysis. Ann. Math. Statist. 40 (2), 549–567.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1994. Numerical Recipes in C: The Art of Scientific Computing, 2nd Edition. Cambridge University Press, Cambridge.