ARTICLE IN PRESS Journal of Statistical Planning and Inference 140 (2010) 2719–2738
Contents lists available at ScienceDirect
Journal of Statistical Planning and Inference journal homepage: www.elsevier.com/locate/jspi
Optimal block designs comparing treatments with a control when the errors are correlated J. Kunert a,, R.J. Martin b, J. Eccleston c a b c
Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany Wirksworth DE4 4EB, UK Department of Mathematics, University of Queensland, Brisbane, Qld 4072, Australia
a r t i c l e in fo
abstract
Article history: Received 14 November 2008 Received in revised form 5 February 2010 Accepted 18 March 2010 Available online 25 March 2010
The efficient design of experiments for comparing a control with v new treatments when the data are dependent is investigated. We concentrate on generalized leastsquares estimation for a known covariance structure. We consider block sizes k equal to 3 or 4 and approximate designs. This method may lead to exact optimal designs for some v, b, k, but usually will only indicate the structure of an efficient design for any particular v, b, k, and yield an efficiency bound, usually unattainable. The bound and the structure can then be used to investigate efficient finite designs. & 2010 Elsevier B.V. All rights reserved.
Keywords: Efficient design Correlated errors Optimal design Supplemented balance
1. Introduction There is an extensive literature on optimal and efficient designs for comparing v test (or new) treatments with a control (or standard treatment)—see Majumdar (1996). However, almost all results assume the observations are uncorrelated. In many situations, it is more realistic to assume that observations in the same block are positively correlated. This can occur with three or four colour microarrays (see e.g. Woo et al., 2005), when it is plausible that the correlations between measurements of the red and blue dye will be different from the correlations between measurements of the red and green dye. The usual way to cope with this would be randomization. However, particularly in microarray experiments, there will be information on the correlation structure from previous experiments, which we might want to exploit to get more efficient treatment comparisons. There has been much interest in the construction of efficient block designs with correlated errors when all contrasts are of equal interest—see, for example, Martin (1996). However, little is known about efficient test–control designs when errors are correlated. Assuming that the estimation uses ordinary least-squares, Bhaumik (1990) found optimal withinblock orderings under a first-order nearest-neighbour model NN(1) among some designs that would have been optimal test–control designs under independence (Stufken, 1988). Cutler (1993) obtained some optimality results under a first-order autoregressive process AR(1) on the circle or the line, assuming generalized least-squares estimation for a known AR(1)-parameter. There are also some brief examples and discussion of the correlated case in Martin and Eccleston (1993, 2001).
Corresponding author.
E-mail address:
[email protected] (J. Kunert). 0378-3758/$ - see front matter & 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2010.03.037
ARTICLE IN PRESS 2720
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
Assume we want to carry out an experiment with v+ 1 treatments in b blocks with common block-size k. The set of all possible designs d for this situation is denoted Dðv þ1,b,kÞ. Here, we concentrate on generalized least-squares estimation for a known covariance. Results for independence, and Cutler’s (1993) results for the AR(1), are for specific combinations of v, b, k, and use integer minimization to obtain an optimal design. Here, we consider approximate designs and determine an optimal approximate design. This method may lead to exact optimal designs for some v, b, k, but usually will only indicate the structure of an efficient design for any particular v, b, k, and yield an efficiency bound, usually unattainable. The bound and the structure can then be used to investigate efficient finite designs. Define Ia as the identity matrix of size a and 1a is the a-dimensional vector of ones. We consider the simple block model y ¼ Td t þ Bb þe, where y= [y11,y,y1k, y21,y,ybk]T, with yij the j-th observation in the i-th block, Td is the treatment design matrix and B = Ib1k is the block-design matrix. The vector e of errors has covariance matrix S. We assume that S ¼ s2 ðIb VÞ, where V is some known matrix of full rank. That is, within-block correlations are the same for each block, and errors in different blocks are uncorrelated. Define W ¼ V 1
1 1Tk V 1 1k
V 1 1k 1Tk V 1 :
Then the generalized least-squares estimator for contrasts of the treatment effects, uT t, where uT 1v ¼ 0, equals T t ¼ uT C þ T T ðI WÞy, ud d d b
where Cdþ is the Moore–Penrose generalized inverse of the information matrix Cd ¼ TdT ðIb WÞTd . T t Þ ¼ uT C þ u. Then 1=s2 varðud d We restrict attention to reflection-symmetric (reversible) processes so that S, and therefore W, is centro-symmetric, i.e. the entries wij of W satisfy wij ¼ wk þ 1i,k þ 1j ,
1 ri, j r k:
We consider the case where there is a control treatment 0, say, and v test-treatments i, 1 rirv, and we are interested in t0 , 1 rirv. comparing the test-treatments with the control, i.e. we estimate tid A design d is A-optimal for the comparison with a control (Atc-optimal) if it minimizes the average scaled variance of a contrast between test treatments and the control, i.e. minimizes
jAtc ðdÞ ¼
v 1X varðtid t0 Þ=s2 : vi¼1
Let Cd be partitioned in the form " # T cd,00 cd,n0 , Cd ¼ cd,n0 Md where cd,00 is a scalar, cd,n0 is a v-vector, and Md is a v v matrix. Then Md is the v by v principal minor of Cd formed by deleting the row and column corresponding to the control. It is well-known, see e.g. Majumdar and Notz (1983), that trðMd1 Þ ¼
v
s2
jAtc ðdÞ:
Therefore, a design is Atc-optimal for the comparison with a control if and only if it minimizes trðMd1 Þ. A simple bound for trðMd1 Þ can be obtained, see e.g. Majumdar and Notz (1983). Define md2 ¼
1 T 1 M 1v ¼ cd,00 =v v v d
md1 ¼
1 1 ðtrðMd Þmd2 Þ ¼ ðtrðCd Þðv þ1Þmd2 Þ: v1 v1
and
Then we find that the Atc-value is at least ad ¼
v1 1 þ : md1 md2
This bound is attained if Md is completely symmetric, Md ¼ g1 Iv g2 1v 1Tv . If Md is completely symmetric, we say the design d has supplemented balance.
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2721
2. Method for finding optimal designs We are looking for a lower bound of ad = {(v 1)/md1 + 1/md2}, where the minimum is taken over all d 2 Dðv þ 1,b,kÞ. Note that instead of minimizing ad, we can, equivalently, maximize qd ¼
m2d1 v1 ¼ md1 ðv1Þmd2 þ md1 ðv1Þ=md1 þ 1=md2
and find an upper bound q* of qd. Then (v 1)/q* is a lower bound for ad. Then we can utilize a method which was introduced by Kushner (1997) to determine optimal cross-over designs (see also Kunert and Martin, 2000). Consider the convex quadratic function Qd ðxÞ ¼ md1 þ 2md1 x þ ðmd1 þ ðv1Þmd2 Þx2 ¼ ð1þ xÞ2 md1 þ x2 ðv1Þmd2 : Then Qd(x) attains its minimum at x ¼ xd ¼
md1 , md1 þ ðv1Þmd2
with min Qd(x)=qd. Each block i of the design d receives a sequence sid of treatments, and we have md1 ¼
b X
m1 ðsid Þ
and
md2 ¼
i¼1
b X
m2 ðsid Þ,
i¼1
where the mj(s), j = 1, 2, are the mfj-values of a design f 2 Dðv þ 1,1,kÞ that consists of only one block, with sequence s. We call two sequences s and t equivalent if s can be transformed into t by reversing the order and/or relabelling the testtreatments. (For instance, sequences [2 3 0] and [0 1 2] are equivalent, because [2 3 0] becomes [0 3 2] by reversing the order, and then [0 1 2] by relabelling the test-treatments.) Equivalence implies that m1(s)= m1(t) and m2(s)= m2(t). Therefore, qd only depends on which equivalence classes of sequences are used. Denote by K the number of possible classes and by pdi the proportion of blocks in the design d which use a sequence from class ‘, 1 r ‘ r K. For each sequence class ‘ we define the function H‘ ðxÞ ¼ ð1 þ xÞ2 m1 ðsÞ þ x2 ðv1Þm2 ðsÞ, where s is an arbitrary sequence from class ‘. We then observe that K X 1 pd‘ H‘ ðxÞ: Q ðxÞ ¼ b d ‘¼1
To fully utilize the method, we consider approximate designs, by allowing that the proportions pd‘ need not be multiples of 1/b but can be any nonnegative real numbers, subject to K X
pd‘ ¼ 1:
‘¼1
With this generalization, it follows from Kunert and Martin’s (2000) Proposition 4 that q ¼ min max H‘ ðxÞ: x
‘
Let x* be a point such that max H‘ ðx Þ ¼ min max H‘ ðxÞ: x
‘
Let
H‘j ðxÞ
‘
denote the derivative of H‘ ðxÞ. If a design d only uses such sequences that belong to a class ‘ fulfilling
H‘ ðx Þ ¼ max H‘ ðx Þ, ‘
and, additionally, the proportions pd‘ are such that K X
pd‘ H‘j ðx Þ ¼ 0,
‘¼1
then qd* = q. When there are two sequence classes i and j, such that Hi ðx Þ ¼ Hj ðx Þ ¼ max‘ H‘ ðx Þ and Hij ðx Þ 4 0, and Hjj ðx Þ o 0, sequences from these two classes suffice. The proportion of sequences from class i can be derived as follows. Assume d is a design where the proportion of blocks with a sequence from class i equals p and the rest is from class j. Then we get (1/b)Qd(x)= p Hi(x)+(1 p) Hj(x), and the derivative is Qdj ðxÞ=b ¼ pHij ðxÞ þ ð1pÞHjj ðxÞ. The proportion p has to be chosen such
ARTICLE IN PRESS 2722
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
that Qdj ðx Þ ¼ 0. Therefore, p should be
pijj ¼
Hjj ðx Þ : j Hi ðx ÞHjj ðx Þ
ð1Þ
This proportion is in the interval (0, 1) whenever Hij ðx Þ and Hjj ðx Þ have opposite signs. If a design d 2 Dðv þ 1,b,kÞ has qd* =q* and additionally has supplemented balance, then d* is Atc-optimal. Since an arbitrary design d 2 Dðv þ 1,b,kÞ has (v 1)/q* as a lower bound of its Atc-value, we can define its efficiency as the quotient ðv1Þ=ðq jAtc ðdÞÞ. As the main example for the distribution of the errors, we consider the AR(1) process, where 3 2 1 l l2 lk2 lk1 6 k2 7 7 6 l 1 l l2 l 7 6 k3 7 6 l2 l 1 l l 7 6 Vl ¼ 6 7, 6 ^ & & & ^ 7 7 6 7 6 k2 4l l 1 l 5
lk1
l2
l
1
and l, the lag-one correlation, is assumed to be known. In that case, the inverse of Sl equals 3 2 1 l 0 0 0 7 6 2 l 0 0 7 6 l 1 þ l 7 6 2 1þl l 0 7 l 1 6 7 6 0 1 Vl ¼ 7: 6 2 ^ & & & ^ 7 1l 6 7 6 2 6 0 l 7 l 1þ l 5 4 0 0 l 1 Then the diagonal elements of W are 8 1l > > if i ¼ 1 or i ¼ k, > < 1kð1lÞ þ 2l wii ¼ > ð1lÞ3 > > 1þ l2 otherwise: : kð1lÞ þ 2l Since W is symmetric, we only 8 1l > > > > kð1 lÞ þ 2l > > > > > > ð1lÞ2 > > > > kð1lÞ þ 2l > > > > < ð1lÞ3 wij ¼ kð1 lÞ þ 2l > > > > > > ð1lÞ2 > > l > > kð1 lÞ þ2l > > > > > > ð1 lÞ3 > > : l kð1lÞ þ2l
need to determine the off-diagonal-elements with ioj. Here we get if i ¼ 1 and j ¼ k, if i ¼ 1 and 3 r j rk1 or if 2r i rk2 and j ¼ k, if 2 r i and i þ2 r j rk1, if i ¼ 1 and j ¼ 2 or if i ¼ k1 and j ¼ k, if 2 r ir k2 and j ¼ iþ 1:
Note that Sl, and hence W, is centro-symmetric. 3. General methods to reduce the number of competing sequence classes If sequence classes a and b are such that m1 ðaÞ Z m1 ðbÞ
and
m2 ðaÞ Z m2 ðbÞ,
then Ha(x)ZHb(x) everywhere, and sequence class b can be neglected—it is obvious that Hb(x*)rHa(x*). For sequence s, consider the treatment design matrix TðsÞ ¼ ½t ð0Þ ðsÞ,t ð1Þ ðsÞ, . . . ,t ðvÞ ðsÞ: Then the j-th column t(j)(s), 0 rjrv, consists of zeros and ones, the i-th element tiðjÞ ðsÞ is 1 if and only if sequence s allots treatment j to the i-th position. We observe that both m1(s) and m2(s) can be derived from the diagonal elements of the
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2723
matrix CðsÞ ¼ T T ðsÞWTðsÞ. More precisely, m2 ðsÞ ¼
c00 ðsÞ v
m1 ðsÞ ¼
v 1 X cjj ðsÞm2 ðsÞ , v1 j ¼ 1
and
where cjj ðsÞ ¼ tðjÞ ðsÞT WtðjÞ ðsÞ is the j-th diagonal element of C(s). We then get k X k X
cjj ðsÞ ¼
trðjÞ ðsÞwru tuðjÞ ðsÞ:
ð2Þ
r ¼1u¼1
Note that the constant m2(s) depends only on the positions of the control treatment 0 in s. We can therefore divide the sequence classes into subsets, according to the positions taken by treatment 0, and observe that all sequences within a subset have the same m2(s). Proposition 3.1. Assume vZk and consider a subset of sequences s with given positions of treatment 0. If all off-diagonal elements wij, iaj, of W are less than or equal to zero, then m1(s) is maximal if s is binary in the test-treatments, i.e. none of the test-treatments i, 1rirv, appears more than once in the sequence. Proof. If all wij r0 for iaj, we get from (2) that k X
cjj ðsÞ r
ðtrðjÞ ðsÞÞ2 wrr ¼
r¼1
k X
trðjÞ ðsÞwrr
r¼1
for all j. It follows that m1 ðsÞ r
v X k 1 X trðjÞ ðsÞwrr m2 ðsÞ : v1 j ¼ 1 r ¼ 1
Equality holds if no treatment j, where 1rj rv, appears more than once in the sequence.
&
Proposition 3.2. Assume that sequences s1 and s2 are orthogonal in the sense that in each position where there is treatment 0 in s1, there is a test-treatment in s2, and in each position where there is a treatment other than 0 in s1, there is treatment 0 in s2. Then m2(s1)= m2(s2). Define by j1,y,jz the positions which in s1 are occupied by the control-treatment. If both sequences are binary in the test treatments and z X
wjr jr r
r¼1
1 tr W, 2
then, additionally, m1(s1)Zm1(s2). Proof. Note that, by construction, t ð0Þ ðs1 Þ þt ð0Þ ðs2 Þ ¼ 1k . Furthermore, W1k = 0. It follows that vm2 ðs2 Þ ¼ t ð0Þ ðs2 ÞT Wtð0Þ ðs2 Þ ¼ ð1k tð0Þ ðs1 ÞÞT Wð1k tð0Þ ðs1 ÞÞ ¼ 1Tk Wð1k t ð0Þ ðs1 ÞÞt ð0Þ ðs1 ÞT Wð1k tð0Þ ðs1 ÞÞ ¼ t ð0Þ ðs1 ÞT Wtð0Þ ðs1 Þ ¼ vm2 ðs1 Þ: Therefore, m2(s1) =m2(s2). It follows from (2) that m1(s1) Zm1(s2) if and only if v X k X
trðjÞ ðs1 Þwrr Z
j¼1r ¼1
v X k X
trðjÞ ðs2 Þwrr :
j¼1r ¼1
However, v X k X
trðjÞ ðs1 Þwrr ¼ tr W
j¼1r ¼1
z X
wjr jr
r¼1
by the definition of j1,y,jz and v X k X j¼1r ¼1
trðjÞ ðs1 Þwrr þ
v X k X
trðjÞ ðs2 Þwrr ¼ tr W
j¼1r¼1
by the construction of s1 and s2.
&
ARTICLE IN PRESS 2724
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
Proposition 3.3. In thepspecial ffiffiffiffiffiffiffiffiffiffiffiffi case of AR(1) errors, all off-diagonal elements of W are negative if either k= 3 and l is arbitrary, or kZ 4 and l Zðk2 k2 9Þ=ð2ðk3ÞÞ. Note that the bound on l is negative for all kZ4. Proof. For l Z0, it is obvious that all off-diagonal elements are less or equal to zero. If l o0, the largest off-diagonal element is w12 ¼ l
ð1lÞ3 kð1lÞ2 þ 2lð1lÞ
:
For k= 3 this is always non-positive, for kZ4, it is non-positive as long as pffiffiffiffiffiffiffiffiffiffiffiffi k2 k2 8 lZ : 2ðk3Þ
&
pffiffiffi For k= 4, the bound in Proposition 3.3 equals 1 2 0:414. In Sections 4 and 5, we consider optimal designs for k =3 and 4. The tables in these sections only list those sequence classes which we discuss, and for all optimal designs derived in these sections we require supplementary balance, in addition to the proportions of sequences given. 4. The case k= 3 If there are three plots per block, we consider only the case where vZ3. Proofs of propositions in this section are given in Appendix A1. We restrict attention here to the situation where all off-diagonal elements of W are non-positive. Some cases with positive off-diagonal elements of W are treated in Section 7. We use Proposition 3.1 to reduce the number of sequences of interest: no sequence that uses one of the test-treatments more than once needs to be considered. This reduces the set of sequence classes of interest to the six classes in Table 1. Now, sequence 1 is orthogonal to 6 in the sense of Proposition 3.2, and the same holds for 3 and 5, and for 2 and 4 (after reversing the order). Proposition 3.2 therefore implies that we do not need to consider classes 1 and 2, while 3 need not be considered provided 2w11 Zw22. However, this inequality easily follows when all off-diagonal elements of W are nonpositive. Thus, if this holds we only need to consider sequence classes 4, 5 and 6. According to Proposition 3.3 this therefore is the case for the AR(1) with any 1o l o1. We observe from Table 1 that H5 ðxÞH6 ðxÞ ¼
w22 Fv ðxÞ vðv1Þ
ð3Þ
H5 ðxÞH4 ðxÞ ¼
w22 w11 Fv ðxÞ, vðv1Þ
ð4Þ
and
where Fv ðxÞ ¼ vðv3Þx2 2ðv þ 1Þxðv þ 1Þ ¼ ðv1Þ2 x2 ðv þ1Þðx þ 1Þ2 :
ð5Þ
Hence, for all W and all v, the polynomials from the three sequence classes of interest intersect at the point x1, where pffiffiffiffiffiffiffiffiffiffiffi vþ1 pffiffiffiffiffiffiffiffiffiffiffi , x1 ¼ ð6Þ v1þ v þ1 which is the smaller root of Fv. Note that for v Z3, the point x1 lies in the interval [ 1/2, 0). The case of uncorrelated errors, V= Ik, has been considered in detail, see, e.g. Majumdar (1996). This case leads to W being I3 1=3 1v 1Tv . This implies w11 =w22 and, therefore, the polynomials corresponding to classes 4 and 5 are exactly the same. For vr4, consider designs d 1 , which consists of sequences from class 4 only, and d2 , consisting of sequences from class 5 only. Both designs (and any convex combination of the two) maximize qd. For v Z5, we need mixtures of two Table 1 Sequence classes of interest for k=3. Number
Typical sequence s
m1(s) v(v 1)
m2(s) v(v 1)
1 2 3 4 5 6
[0,0,0] [0,0,1] [0,1,0] [0,1,2] [1,0,2] [1,2,3]
0 (v 1)w11 (v 1)w22 (v 1)w11 + vw22 2vw11 w22 v(2w11 + w22)
0 (v 1)w11 (v 1)w22 (v 1)w11 (v 1)w22 0
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2725
sequences. Consider the design d 3 consisting of sequences from class 6 with a proportion
p ¼ 1
3vðx1 Þ , vþ1
where x1 o0 is defined in (6), and the other sequences coming from class 4; and the design d4 consisting of the same proportion of sequences from class 6 and the rest from class 5. Again, both designs (and convex combinations of the two) maximize qd. For details, see Stufken (1988). Now consider the case with AR(1) errors. Since the polynomials from the three classes 4, 5 and 6 intersect at x1, it follows that max‘ H‘ ðx1 Þ ¼ H6 ðx1 Þ. If we can show that H‘j ðx1 Þ, the derivative at the point x1, equals 0 for at least one of the three sequence classes ‘, or if we can show that H‘j ðx1 Þ 4 0 for one class ‘ and Hsj ðx1 Þ o 0 for another class s, then we get that H6 ðx1 Þ ¼ min max H‘ ðxÞ: x
‘
We will see that we can achieve this for some l, but not all. To proceed we define pffiffiffiffiffiffiffiffiffiffiffi ðv1Þ v þ 12v þ 1 xv ¼ v and
gv ¼
xv : xv þ2
Note that xv is increasing in v, and is 13, 0.073, + 0.160 for v= 3, 4, 5, respectively, and that gv is decreasing in v and is 0.2, 0.038, 0.074 for v = 3, 4, 5, respectively. With these definitions, we get Proposition 4.1. Assume k=3, with vZ3, and that the errors follow an AR(1) with known parameter l. Define pijj as in Eq. (1) with x* = x1. (i) If 1o l o xv , a design with a proportion p6j4 of its sequences from class 6 and the rest from class 4 maximizes qd. (ii) If gv o l o 1, a design with a proportion p6j5 of its sequences from class 6 and the rest from class 5 maximizes qd. For 1r l r xv, we calculate
p6j4 ¼
H4j ðx1 Þ j H6 ðx1 ÞH4j ðx1 Þ
¼ 1
2w11 þ w22 v ðx1 Þ: w11 v þ1
Inserting the wii for the AR(1) gives
p6j4 ¼ 1
ð3 þ lÞv ðx1 Þ: vþ1
If v = 3, it decreases from 14 for l = 1 to 0 for l = x3; if v = 4 it decreases from 0.317 for l = 1 to 0 for l = x4; if v = 5, it decreases from 0.367 for l = 1 to 0.05 for l = 0. For fixed l, the proportion p6j4 increases for larger v. For all gv r l o 1, we similarly get
p6j5 ¼ 1
ð3 þ lÞv ðx1 Þ: ð1 þ lÞðv þ 1Þ
If v =3, it increases from 0 for l = g3 to 14 for l =1; if v =4 it increases from 0 for l = g4 to 0.317 for l =1; if v = 5 it increases from 0.05 for l = 0 to 0.367 for l = 1. For fixed positive l, the proportion p6j5 increases for larger v. If vZ5 then xv 40 and gv o 0. Therefore, Proposition 4.1 when v Z5 covers all possible l. If v =3 or 4, however, then 1 o xv o0 and 0 o gv o 1. Hence, for v = 3 or 4, there are l such that xv o l o gv. Whenever xv o l o gv all three derivatives are positive at x1 and therefore the minimum of max‘ H‘ ðxÞ is not at x1. For this case we have the following result. Proposition 4.2. Assume k= 3, with v= 3 or 4, and that the errors follow an AR(1) with known parameter l. (i) If xv r l r0, a design with all its sequences from class 4 maximizes qd. (ii) If 0 r l r gv, a design with all its sequences from class 5 maximizes qd. Note that for l = 0 both designs (and any convex combination of the two) are optimal. Remarks. (1) xv 4 1 for v Z10. Hence when vZ 10, it is possible for all l to construct an Atc-optimal approximate design using sequences from classes 6 and 4 (such that class 6 has proportion p6|4).
ARTICLE IN PRESS 2726
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
(2) When H4j ðx1 Þ and H5j ðx1 Þ have opposite signs, and H6 ðx1 Þ ¼ minx maxi Hi ðxÞ, an optimal approximate design using sequences from classes 4 and 5 (such that class 4 has proportion p4j5 ) is also available. In particular, this holds for l o gv for v Z 5; l 4 xv for 5r v r 9;
l o xv and l 4 gv for 3r v r 4: (3) A design with supplemented balance using sequences from classes 4 and 5 with a proportion of from class 4 is extremely efficient for all l. In particular, the efficiency is a constant pffiffiffiffiffiffiffiffiffiffiffi ðv1 þ v þ 1Þ2 ð2v1Þ 3v3
2 3
of its sequences
for all v and all l, except for xv o l o gv when 3 rvr4. This efficiency is 0.9983 when v =5. When xv o l o gv for 3rv r4, the efficiency rises from the constant value at the borders to 1 at l = 0. The plots in Figs. 1–4 give an impression of the size of p6j4 and of p6j5 for various l and v. 5. The case k =4 Here, we only consider the case that v Z4. The proofs are in Appendix A2. If we restrict attention to the case where the largest off-diagonal element of W is non-positive, then Proposition 3.1 implies that we only have to consider the 10 equivalence classes of sequences given in Table 2, resulting from the possible numbers of appearances and possible positions of treatment 0. To apply Proposition 3.2, we give the sum of the wii corresponding to the positions taken by treatment 0 and give the class to which the sequence belongs that is orthogonal in the sense of Proposition 3.2. Note that we consider centrosymmetric W, so that w44 =w11 and w33 =w22. Further note that for classes 4 and 5, the orthogonal sequences belong to the same class. Note that all diagonal elements of W must be nonnegative, because W is nonnegative definite. Using Proposition 3.2, we therefore find from Table 2 that sequences 1, 2, 3 need not be considered. Which one of the sequences 7 or 6 we should consider depends on the relative sizes of w11 and w22. For the uncorrelated case, we again refer to Stufken (1988). If v=p4,ffiffiffiffiffiffithen a design pffiffiffiffiffiffi consisting of sequences from class 5 and 9, with the proportion of sequences from class 5 equal to ð339 13Þ=ð13 þ 3 13Þ 0:023 maximizes qd. For 5rv r9, a maximal qd is achieved by a design which consists of sequences from class 9 only. For vZ10, a design with maximal qd uses sequences from classes 9 and 10 with the proportion of class 10 equal to 4vx1 =ðv þ 1Þ, where x1 is defined in (6). Note that in the uncorrelated case many sequences from different classes have the same H‘ ðxÞ for all x. Therefore, we could replace some or all of the sequences from class 5 by sequences from class 4, 6 or 7, and we could replace sequences from class 9 by sequences from class 8 in these designs, without changing their performance. Now consider the case with AR(1) errors and known parameter l 40. In this case, 2
3
w22 3 þ l þ l l ¼ : w11 3l
ð7Þ
This implies that w22 4 w11, and we conclude from Proposition 3.2 that sequence class 6 does not have to be considered. Thus, we only need to consider classes 4, 5, and 7 to 10. For every fixed v =4, 5,y, define the function Gv : ½0,1-R as pffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffi 2 3 ð8Þ Gv ðlÞ ¼ 3ðv1Þ v þ 13ð3v1Þ þ 2vl þðv1Þð v þ 11Þðl þ l l Þ:
Fig. 1. Plot of p6j4 for 1 o l o 0 (left) and of p6j5 for 0 o l o 1 (right) if v= 3.
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2727
Fig. 2. Plot of p6j4 for 1 o l o 0 (left) and of p6j5 for 0o l o 1 (right) if v= 4.
Fig. 3. Plot of p6j4 for 1 o l o 0 (left) and of p6j5 for 0o l o 1 (right) if v= 5.
Fig. 4. Plot of p6j4 for 1 o l o 0 (left) and of p6j5 for 0 o l o 1 (right) if v= 25.
It is obvious that the derivative of Gv(l) is positive for all 0 r l r1. We therefore have at most one point dv, say, in the interval [0, 1] where Gv(l)= 0. If dv exists, then Gv(l)40 for all l 4 dv, and Gv(l)o0 for all l o dv. For 5rv r9, we find that however, there is no dv. If v= 4, then G4(0) oG4(1)o0, while there is a dv, since Gv(0)o0 and Gv(1)40. For the other v, p ffiffiffiffiffiffiffiffiffiffiffi Gv(1)4Gv(0) 40 for vZ10. The latter holds because 3ðv1Þ v þ 13ð3v1Þ 4 0, when v Z10. We can determine the size of dv numerically, and get that
d5 E0.746, for v = 5, d6 E0.508, for v =6,
ARTICLE IN PRESS 2728
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
Table 2 Sequences of interest for k= 4. Number
Typical sequence s
Sum of the wii corresponding to 0
Class containing the orthogonal sequences
1 2 3 4 5 6 7 8 9 10
[0,0,0,0] [0,0,0,1] [0,0,1,0] [0,0,1,2] [0,1,0,2] [1,0,0,2] [0,1,2,0] [0,1,2,3] [1,0,2,3] [1,2,3,4]
2(w11 +w22) w11 +2w22 2w11 + w22 w11 +w22 w11 +w22 2w22 2w11 W11 w22 0
10 8 9 4 5 7 6 2 3 1
d7 E 0.332, for v= 7, d8 E 0.187, for v= 8, and d9 E 0.058, for v = 9. We then have Proposition 5.1. Assume k= 4 and that the errors follow an AR(1) with known parameter l 40. For vZ10 an optimal approximate design can be constructed from sequence classes 9 and 10, whatever l. For 5rv r9, a combination of these two classes suffices whenever l 4 dv. In these cases, an optimal qd is achieved by a design with a proportion p10j9 of blocks receiving sequence 10 and the rest receiving sequence 9, where p10j9 is as in (1) with x* = x1. Note that Proposition 5.1 does not cover all positive l whenever 5rv r9, and it does not apply at all when v =4. The remaining cases are covered by Propositions 5.2 and 5.3. To proceed, define the function U : ½0,1-R as 6
5
4
3
2
UðlÞ ¼ 4l 16l þ 16l 26l þ81l 56l þ9:
ð9Þ
We show in the appendix that U has two roots Z1 o Z2 in the interval (0, d5). Proposition 5.2. For v= 5, a design consisting of sequence 9 only will maximize qd for 0r l r Z1 and for Z2 r l r d5, where Z1, Z2 are the roots of the function U in the interval (0, d5) and U is defined in (9). For 6rv r9, a design consisting of sequence 9 only will maximize qd for all 0 r l r dv. A numerical approximation gives Z1 E0.236 and Z2 E0.602. For the remaining cases we get Proposition 5.3. If v=4, or if v= 5 and Z1 o l o Z2, then the optimal approximate design will use sequences from classes 9 and 5. Define pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiqffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 2 2 3v þ 1ðv3Þl2l 3v þ 1ðv3Þl2l ðv1Þ 1þ 3l2l : x5 ¼ 2 v2 5v þ ð3v2 5vÞlð2v2 4vÞl The proportion of sequences needed from class 5 is p5j9 as defined in (1) with x*= x5. In summary, we have shown that for an AR(1) with known l 4 0: if vZ10, an optimal approximate design uses sequences from classes 10 and 9, with proportion p10|9; if 6rv r9, an optimal approximate design when l 4 dv uses sequences from classes 10 and 9, with proportion p10|9, and an optimal approximate design when l r dv uses sequences from class 9; if v= 5, an optimal approximate design when l 4 d5 uses sequences from classes 10 and 9, with proportion p10|9, an optimal approximate design when l r Z1 or Z2 r l r d5 uses sequences from class 9 only, and an optimal approximate design when Z1 o l o Z2 uses sequences from classes 5 and 9, with proportion p5j9 ; if v= 4, an optimal approximate design uses sequences from classes 5 and 9, with proportion p5j9 . Remarks. (1) For v Z13, a design using sequences from classes 10 and 9 (with proportion p10|9) is optimal for all
l 41 O2, i.e. for all l such that the off-diagonal elements of W are negative. (2) The fact that H8 ðx1 Þ ¼ H9 ðx1 Þ ¼ H10 ðx1 Þ ¼ max‘ H‘ ðx1 Þ allows for more possible optimal designs. When H8j ðx1 Þ becomes negative, then H8 ðx1 Þ ¼ minx max‘ H‘ ðxÞ and an optimal approximate design using sequences from classes 10 and 8 (with
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2729
proportion p10j8 ) is also available. This holds whenever w22 v1 pffiffiffiffiffiffiffiffiffiffiffi : o w11 2ð1þ v þ1Þ Note from (7) that w22/w11 is increasing in l in the interval [0, 1]. If v o10, then H8j ðx1 Þ 40 for all l Z0, and if v4 25, then H8j ðx1 Þ o 0 for all l. However, for 10 rvr25, there is a kv in [0, 1] such that w22 v1 pffiffiffiffiffiffiffiffiffiffiffi ¼ w11 2ð1 þ v þ 1Þ for l = kv. Then H8j ðx1 Þ o0 for l o kv and H8j ðx1 Þ 4 0 for all l 4 kv. Numerical evaluation gives kv = 0.061, 0.160, 0.235, 0.321, for v =10, 11, 12, 13, respectively. (3) When H9j ðx1 Þ and H8j ðx1 Þ have opposite signs, then again H8 ðx1 Þ ¼ min max H‘ ðxÞ, x
‘
and an optimal approximate design using sequences from classes 8 and 9 (with proportion of class 9 equal to p9j8 ) is also available. In particular, this holds for l 4 dv if 5 r v r9 for l 4 kv if 10 rv r 25, but is not possible for any l 4 0 if v 425. (4) A design using sequences from classes 8 and 9 with proportion efficiency is a constant
1 2
is extremely efficient for all l 40. In particular, the
pffiffiffiffiffiffiffiffiffiffiffi ð3v1Þðv1þ v þ 1Þ2 2 4v ðv þ 1Þ for all v Z5 and all l, except for l o dv when 5rv r9. This efficiency is 0.9998 when v= 10. When l o dv for 5rv r9, the efficiency rises from the constant value at dv to 1 at l = 0. Table 3 summarizes the optimal approximate designs for k= 4 and positive l for 4rv r13 and for v =25, 26. Note that, whenever for a given v and l there are two possibilities in Table 3, then any convex combination of the two designs in different columns also maximizes qd. The plots in Figs. 5–10 show the sizes of the proportions needed for a design that maximizes qd for some values of v and for 0 o l o1. Note that the vertical axis in Fig. 9 only goes to 0.06, since the proportions are very small.
6. Examples In this section, we consider some finite designs which are either optimal, or have high efficiency. In the following, the notation [0 1 2; 0 2 3;y] means that the first block has treatments 0, 1, 2; the second block has treatments 0, 2, 3; etc. Table 3 Sequences used by the optimal approximate designs for k= 4 and positive l. v
Class 5 and 9
Class 9 alone
Classes 9 and 10
Classes 8 and 9
Classes 8 and 10
4 5
For 0r l o 1 For 0.236 o l o 0.602
– For l 40.746
– For l 40.746
– –
6 7 8 9 10 11 12 13 ^ 25 26
– – – – – – – – ^ – –
– 9 8 > = < For 0r l o 0:236 > and for > > ; : 0:602 o l o 0:746 For 0r l o0.508 For 0r l o0.332 For 0r l o0.187 For 0r l o0.058 – – – – ^ – –
For For For For For For For For ^ For For
For For For For For For For For ^ For –
– – – – For For For For ^ For For
l 40.508 l 40.332 l 40.187 l 40.058 all l Z0 all l Z0 all l Z0 all l Z0 all l Z0 all l Z0
l 40.508 l 40.332 l 40.187 l 40.058 l 40.061 l 40.160 l 40.245 l 40.321 l 40.968
0r l o 0.061 0r l o 0.160 0r l o 0.245 0r l o 0.321 0r l o 0.968 all l Z 0
ARTICLE IN PRESS 2730
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
Fig. 5. Plot of p10j9 against l if v= 6.
Fig. 6. Plot of p10j9 against l if v= 9.
Fig. 7. Plot of p10j9 against l if v= 10.
6.1. Examples for k=3 Figs. 1–4 suggest that, unless v is large, an efficient design might result from using sequences from class 4 for l o0 and from class 5 for l 40, and ensuring that the design is close to supplementary balance. However, better results appear to come from combinations of sequences from classes 4 and 5. Example 1. If v = 3, then p4|5 is (5l 1)/(8l) for 1o l o 1/3 and 0.2 o l o1, and can be taken as 1 for 1/3o l o0, and 0 for 0o l o0.2. It tends to 12 as l tends to 1, and 34 as l tends to 1.
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2731
Fig. 8. Plot of p10j9 against l if v= 25.
Fig. 9. Plot of p5j9 (for 0.2o l o0.6) and of p10j9 (for 0.8o l) if v =5.
Fig. 10. Plot of p5j9 against l if v= 4.
The three-block design D1: [0 1 2; 0 2 3; 0 3 1], based on class 4, satisfies the conditions of Proposition 4.2 for 1X3 o l o0, and has good efficiency (at least 0.889) for all l. The three-block design D2: [1 0 2; 2 0 3; 3 0 1], based on class 5, satisfies the conditions of Proposition 4.2 for 0 o l o0.2, and has good efficiency (at least 0.889) for all l 4 1/3. The latter design is slightly more efficient than the former for l 40, but much less efficient for l o0—its efficiency tends to 0 as l tends to 1. The six-block design [D1; D2], based on classes 4 and 5 equally, has supplemented balance, and very high efficiencies for all l. In the limit as l tends to 1, it satisfies the conditions of the second remark at the end of Section 4. The six-block design
ARTICLE IN PRESS 2732
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
[D2; D2] satisfies the conditions of Proposition 4.2 for 0 o l o0.2. It appears to be an optimal design for 0 o l o0.32, while [D1; D2] appears to be an optimal design for 0.36o l o1, see Martin and Eccleston (2001, Section 5.2), where it is noted that an optimal design for 0.32 o l o 0.36 appears to be [D2; 1 0 2; 0 1 3; 0 3 2]. Note that Cutler (1993) Example 5.1, claimed incorrectly that [D2; D2] is optimal for l = 0.72, 0.73,y,0.99 and [D1; D2] is optimal for l = 0.01, 0.02,y,0.23. The nine-block design [D1; D1; D2] has supplemented balance, and very high efficiencies for all l—a constant 0.988 for l outside (13, 0.2), and rising to 1 at l = 0 inside (13, 0.2). Example 2. If v =4, then
p4j5 ¼
4 þ2l
l
3ð3 þ lÞ pffiffiffi 5l
for 1o l o x4 and g4 o l o1, and can be taken as 1 for x4 o l o0, and 0 for 0 o l o g4, where x4 = 0.073, g4 = 0.038. The proportion p4|5 tends to 0.633 as l tends to 1, and to 0.683 as l tends to 1. The six-block design D3: [1 0 2; 2 0 3; 3 0 4; 4 0 1; 1 0 3; 2 0 4], based on class 5, satisfies the conditions of Proposition 4.2 for 0o l o g4, and has high efficiency (at least 0.857) for all l 4 1/2. For lower l, the efficiency drops rapidly towards 0 at l = 1. The 12-block design D4: [0 1 2;0 2 3; 0 3 4; 0 4 1; 0 1 3; 0 2 4; 0 3 1; 0 4 2; 0 1 4; 0 2 1; 0 3 2; 0 4 3], based on class 4, satisfies the conditions of Proposition 4.2 for x4 o l o0, and has good efficiency (at least 0.857) for all l. The 12-block design [D3; D3], has worse efficiencies outside (0, g4), which become particularly low as l goes below 13. The 18-block design [D3; D4] has supplemented balance, and very high efficiencies for all l—a constant 0.9996 for l outside (x4, g4), and rising to 1 at l = 0 inside (x4, g4). Example 3. If v =5, then
p4j5 ¼
5 þ3l 2ð3 þ lÞ pffiffiffi 2l 6l
for 1o l o g5 and x5 o l o1, where g5 = 0.074 and x5 = 0.160. It tends to 0.734 as l tends to 1, to 1 as l tends to x5, to 0 as l tends to g5, and to 0.633 as l tends to 1. The 10-block design D5: [0 1 2; 0 2 3; 0 3 4; 0 4 5; 0 5 1; 0 1 3; 0 2 4; 0 3 5; 0 4 1; 0 5 2], based on class 4, has supplemented balance and good efficiency (at least 0.832) for all l; whereas design D6: [1 0 2; 2 0 3; 3 0 4; 4 0 5; 5 0 1; 1 0 3; 2 0 4; 3 0 5; 4 0 1; 5 0 2], based on class 5, has supplemented balance but is worse for almost all l, and very poor for l o 1/2. The 20-block design [D5;D6] with a proportion of 12 of its sequences from each of classes 4 and 5, has good efficiencies (at least 0.971) for all l. The 30-block design [D5;D5;D6] with a proportion of 23 of its sequences from class 4 and 13 from class 5, has supplemented balance, and very high efficiencies for all l—a constant 0.9983. 6.2. Examples for k=4
Example 4. If v= 4, the optimal p5|9 is very small. The 12-block design [1 0 2 3; 2 0 1 4; 3 0 4 1; 4 0 3 2; 1 0 3 4; 2 0 4 3; 3 0 1 2; 4 0 2 1; 1 0 4 2; 2 0 3 1; 3 0 2 4; 4 0 1 3], based on class 9, has supplemented balance and is extremely efficient (at least 0.994) for all l 40. Example 5. If v = 5, the 10-block design [1 0 2 3; 2 0 3 4; 3 0 4 5; 4 0 5 1; 5 0 1 2; 1 0 3 5; 2 0 4 1; 3 0 5 2; 4 0 1 3; 5 0 2 4], based on class 9, satisfies the conditions of Proposition 5.2, and hence is fully efficient for l o Z1, and Z2 o l o d5. For other l 40, it is very highly efficient (at least 0.998). 7. Extensions and discussion In this section, we briefly note some extensions, for stationary processes, to the results given so far. Proofs are omitted. Further details will be given elsewhere. Define ri as the lag i correlation coefficient, 1 rirk 1. 7.1. k= 3 If we write f = w11/w22 = (1 r1)/(1 r2), then all optimality results for k= 3 only depend on the size of f. Note that w12/w22 = 1/2, and w13/w22 =1/2 f, which is negative for f 41/2. The results given in Section 4 for k =3 cover all stationary processes with f 41/2, which is equivalent to 1 2r1 + r2 40. Results in terms of f can be obtained by replacing l by 1/f 1. The case f 41, which corresponds to l o0 for the AR(1), is in general equivalent to r1 o r2. Stationarity requires that f 41/4. When f = 1 (r1 = r2), results are as for the uncorrelated case S = Ik.
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2733
The case 1/4o f o1/2 arises for 1 2r1 + r2 o0, and implies w13 40. There is then a very simple result for all v Z2: an optimal design uses sequences from classes 3 (or equivalently class 7, say, represented by [1 0 1]) and class 8, say, represented by [1 2 1]. The optimal proportion for sequences from class 3, p3|8, is 2vx1/(v +1), which is constant over 1/4o f o1/2 for a given v.
7.2. k=4 In general, optimality results for k= 4 depend on the signs and sizes of the off-diagonal wij. The case discussed in Section 5 corresponds to w12 o0, w13 o0, w14 o0, w23 o0. Results in Section 5 for v Z10 hold for any process with w11/w22 o1 (usually consistent with r1 4 r3), and only depend on the ratio w11/w22. At some point as w11/w22 increases above 1, the optimal design changes to one using sequences from classes 8 and 10, with optimal proportion of sequences from class 8 of p8|10 = –4(w11 +w22)vx1/{(v + 1)w11}. For 4rv r9, results usually depend on more than just w11/w22. However, for some processes when 6 rvr9 the results are simple, and do only depend on w11/w22. Then, as w11/w22 increases from 1, the optimal design changes to one using just sequences from class 9, then to one using just sequences from class 8, and then to the above one using sequences from classes 8 and 10. Two other cases are very similar to that of f o1/2 when k= 3. If w13 40, with w12 o0, w14 o0, w23 o0, and v Z2: an optimal design uses sequences from classes 11, say, e.g. [0 1 0 1] and 12, say, e.g. [1 2 1 2]. The optimal proportion for sequences from class 11, p11|12, is 2vx1/(v + 1), which is constant for a given v. If w12 40, with w13 o0, w14 o0, w23 o0, and vZ2: an optimal design uses sequences from classes 13, say, e.g. [0 0 1 1] and 14, say, e.g. [1 1 2 2]. The optimal proportion for sequences from class 13, p13|14, is again 2vx1/(v +1). From Martin (1998), the condition w13 4 0 corresponds to 1 3r1 + 3r2 r3 o0 (e.g. some processes with a relatively large r1), while w12 40 corresponds to (1 r2)2 +(r1 r3)(1 2r1 + r2)o0 (e.g. some processes with a relatively large negative r1).
7.3. Circular processes Cutler’s (1993) main results were for circular processes, for which rg = rk g, g= 1, 2,y,k 1. For a given k, the circular case is much simpler since W has many fewer different elements—only (k+ 2)/2 if k is even, (k+ 1)/2 if k is odd. Thus there are fewer sequences to consider. However, results for the circular case give a good indication of some of the possible results in the linear case. If k= 3, r1 = r2, so W is proportional to Ik k1 1k 1Tk , and results are the same as in the uncorrelated case S =Ik. If k =4, r1 = r3, so w11 = w22, and w12 = w14 =w23, with w11 + 2w12 + w13 =0. It can be shown that w12 and w13 are a positive constant times (1 r2) and (1 4r1 +3r2), respectively. Thus w12 is always negative, but w13 can be positive if 1 4r1 +3r2 o0. If w13 4 0 and vZ2, then, as in Section 7.2, an optimal design uses sequences from classes 11, e.g. [0 1 0 1] and 12, e.g. [1 2 1 2], with the constant (for a given v) optimal proportion of sequences from class 11, p11|12 = 2vx1/(v +1). If w13 o 0, then the situation is similar to the linear case with all off-diagonal wij negative (see Sections 5 and 7.2). If v Z 10, an optimal design uses sequences from classes 9 and 10, with the constant (for a given v) optimal proportion of sequences from class 9, p9|10 = 4vx1/(v + 1). Note that the optimal designs for the uncorrelated case are a special case. For 3 r v r 9, the results are more complicated. As w13/w11 decreases from 0, an optimal design firstly uses sequences from classes 5 and 9, then (for v Z 5) just class 9 (the uncorrelated case lies in this region), and finally sequences from classes 4 and 9. The change points, and the optimal proportions in the first and last regions, depend on v and w13/w11. If k=5, r1 = r4 and r2 = r3, so w11 = w22 = w33, and w12 =w15 = w23, w13 = w14 =w24, with w11 + 2w12 + 2w13 = 0. It can be shown that w12 and w13 are a positive constant times (1+2r1 3r2) and (1 3r1 + 2r2), respectively. Note that w13 results from w12 by interchanging r1 and r2. Either w12 or w13 can be positive, but not both. The values of f2 = w12/w11 and f3 = w13/w11 must lie between (O5 +1)/4 E 0.809 and (O5 1)/4 E0.309, with f2 + f3 = 1/2. If w13 40 and vZ 2, an optimal design uses sequences from classes with typical sequences [0 1 0 1 2] and [0 1 2 1 2]. The optimal proportion depends on n3. An alternative for v Z3 uses sequences from classes with typical sequences [0 1 0 1 2] and [1 2 1 2 3], and the optimal proportion only varies slightly with n3. If w12 40 and vZ2, an optimal design uses sequences from classes represented by [0 0 1 1 2] and [0 1 1 2 2]. The optimal proportion depends on n2. An alternative for v Z3 uses sequences from classes with typical sequences [0 0 1 1 2] and [1 1 2 2 3], and the optimal proportion only varies slightly with n2. If both w12 and w13 are negative, the results are again more complicated. If v Z17, an optimal design uses sequences from classes represented by [0 1 2 3 4] and [1 2 3 4 5]. The optimal proportion of the former is constant (for a given v), p = –5vx1/(v +1). The optimal designs for the uncorrelated case are a special case. For 4rv r16, the results depend on the value of f3 (or f2). As f3 decreases from 0, an optimal design firstly uses sequences from classes represented by [0 1 0 2 3] and [0 1 2 3 4], then (for v Z10) just the one with typical sequence [0 1 2 3 4] (the uncorrelated case lies in this region), and finally sequences from classes represented by [0 0 1 2 3] and [0 1 2 3 4]. The change points, and the optimal proportions in the first and last regions, depend on v and f3.
ARTICLE IN PRESS 2734
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
7.4. Discussion Even for k= 4, the results can in general be very complicated. Here we have concentrated on the AR(1), and positive correlations when k= 4, but results for other processes and negative correlations could be investigated. Extending the results to higher k will clearly also depend on the particular process assumed. The results for the uncorrelated case (see Stufken, 1988) show that as k increases there must be more different solutions for different values of v.
Acknowledgements Support from the German Federal Ministry of Education and Research (Project SKAVOE) is gratefully acknowledged. RJM is grateful for assistance from the University of Queensland and the Isaac Newton Institute, Cambridge. We also thank three reviewers for their helpful comments. Appendix A A.1. Proofs for Section 4
Proof of Proposition 4.1. We have that H6j ðxÞ ¼
2ð2w11 þ w22 Þð1 þxÞ : v1
Since x1 4 1, it follows that H6j ðx1 Þ 4 0 for all nonnegative definite W and for all v. On the other hand 2 H4j ðxÞ ¼ ð ðv1Þw11 þvw22 ð1 þ xÞ þðv1Þ2 w22 xÞ: vðv1Þ Since pffiffiffiffiffiffiffiffiffiffiffi vþ1 x1 ¼ , 1 þ x1 v1 it follows that H4j ðx1 Þ is negative if and only if pffiffiffiffiffiffiffiffiffiffiffi w22 ðv1Þð v þ 11Þ o : w11 v For the AR(1) and k= 3, w22 ¼ 1 þ l: w11
ð10Þ
Therefore, H4j ðx1 Þ is negative if and only if pffiffiffiffiffiffiffiffiffiffiffi ðv1Þ v þ12v þ 1 lo ¼ xv : v We therefore have shown that for all l o xv , an optimal qd can be achieved with a design consisting of sequences 4 and 6. Similarly, H5j ðxÞ ¼
2 ðð2vw11 w22 Þð1 þ xÞ þ ðv1Þ2 w22 xÞ, vðv1Þ
and H5j ðx1 Þ is negative if and only if w22 2v pffiffiffiffiffiffiffiffiffiffiffi 4 : w11 ðv1Þ v þ 1 þ 1 This is equivalent to l 4 gv . For all l 4 gv an optimal qd can therefore be achieved with a design consisting of sequences 5 and 6.
&
Proof of Proposition 4.2. Note that x1 is the smaller of the two roots of Fv defined in (5). Since Fv is a convex quadratic function, we have that Fv(x)40 for all x ox1. Now assume v is 3 or 4 and that xv r l r0. Consider sequence class 4 and determine the point x2 such that H4j ðx2 Þ ¼ 0. Then x2 ¼
2v1 þvl : v2 þvl
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2735
Therefore H4(x) has its minimum at the point x2. Note that for l = xv the derivative of H4(x) at x1 is zero. This, however, implies that for l = xv the minimum of H4(x) is at x1 and therefore x2 = x1. For l 4 xv, we have shown in the proof of Proposition 4.1 that the derivative H4j ðx1 Þ is positive and therefore the minimum of H4(x) must be to the left of x1, i.e. x2 ox1. This implies that Fv ðx2 Þ Z0. From Eq. (3) and the fact that w22 Z0, we therefore conclude that H5 ðx2 Þ Z H6 ðx2 Þ. Further, equation (10) implies that w22 w11 r0 for all l r0. Therefore it follows from equation (4) in the same way that H4 ðx2 Þ Z H5 ðx2 Þ. In all, we have for xv r l r0 that H4 ðx2 Þ ¼ maxH‘ ðx2 Þ. This implies that an optimal approximate design in this case uses sequence class 4 only. Now consider the remaining case, where 0 r l r gv. Define x3 as the point such that H5j ðx3 Þ ¼ 0. Then x3 ¼
2v1l : vðv þ ðv2ÞlÞ
Again, we use the fact that H5j ðx1 Þ Z 0 to conclude that x3 rx1 . This directly implies that H5 ðx3 Þ Z H6 ðx3 Þ. Observing that w22 w11 Z0 for all l Z0, we also conclude that H5 ðx3 Þ ZH4 ðx3 Þ. Hence, for 0 r l r gv we have H5 ðx2 Þ ¼ maxH‘ ðx2 Þ. This implies that an optimal approximate design in this case uses sequence class 5 only. & A.2. Proofs for Section 5 In order to prove Propositions 5.1–5.3, we give in Table A1 expressions for the mi(s) for the sequence classes of interest, and begin with two technical propositions. Proposition A1. If we compare with sequence class 9, we observe that (a) H9 ðxÞ Z H10 ðxÞ for all x rx1 , (b) if w22 Zw11 then H9 ðxÞ Z H8 ðxÞ for all x r x1 . Proof. a) Making use of Table A1 we see that H9 ðxÞH10 ðxÞ ¼
w22 Fv ðxÞ, vðv1Þ
ð11Þ
where Fv was defined in (5). Since w22 Z0, H9 ðxÞH10 ðxÞ is positive for all xox1. b) Similarly, H9 ðxÞH8 ðxÞ ¼
ðw22 w11 Þ Fv ðxÞ: vðv1Þ
ð12Þ
If we assume that w11 rw22, we have H9 ðxÞ ZH8 ðxÞ for all x r x1 .
&
Proposition A2. a) If we assume that w12 rw13 and compare sequences 5 and 4, we find that H5 ðxÞ ZH4 ðxÞ
for all x r x1 :
b) If we assume that w22 w11 Z0 and w13 w14 Z0, and compare sequences 5 and 7, we find that H5 ðxÞ ZH7 ðxÞ,
for all x rx1 :
Proof. Since x1 is the smaller root of Fv ðxÞ ¼ ðv1Þ2 x2 ðv þ 1Þð1þ xÞ2 ,
Table A1 Expressions for the mi(s) for sequence classes of interest when k= 4. Number
Typical sequence s
m1(s) v(v 1)
m2(s) v(v 1)
4 5 6 7 8 9 10
[0,0,1,2] [0,1,0,2] [1,0,0,2] [0,1,2,0] [0,1,2,3] [1,0,2,3] [1,2,3,4]
ðv1Þðw11 þ w22 Þ2w12 ðv1Þðw11 þ w22 Þ2w13 2vw11 2w22 2w23 2ðv1Þw22 2w23 vðw11 þ 2w22 Þw11 vð2w11 þ w22 Þw22 2vðw11 þ w22 Þ
ðw11 þ w22 þ 2w12 Þðv1Þ ðw11 þ w22 þ 2w13 Þðv1Þ ð2w22 þ 2w23 Þðv1Þ ð2w22 þ 2w23 Þðv1Þ w11 ðv1Þ w22 ðv1Þ 0
ARTICLE IN PRESS 2736
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
and Fv(x)40 for all x ox1, we observe for all vZ4 and all xrx1 that ðv1Þ2 x2 Z ðv þ1Þð1 þxÞ2 : a) From Table A1, H5 ðxÞH4 ðxÞ ¼
2ðw13 w12 Þ ððv1Þ2 x2 ð1 þ xÞ2 Þ, vðv1Þ
and so we conclude that H5 ðxÞH4 ðxÞ Z
2ðw13 w12 Þ ð1 þxÞ2 ðv þ11Þ Z 0: vðv1Þ
b) From Table A1, we get H5 ðxÞH7 ðxÞ ¼
ð1 þxÞ2 ðv1Þ2 x2 ððv1Þðw11 w22 Þ2ðw13 w23 ÞÞ þ ðw11 w22 þ2ðw13 w23 ÞÞ: vðv1Þ vðv1Þ
Since w11 w22 þ 2ðw13 w23 Þ 4 2ðw11 w22 þw13 w23 Þ ¼ 2ðw13 w14 Þ Z 0, it follows for all x rx1 that H5 ðxÞH7 ðxÞ Z ¼
ð1þ xÞ2 ð1 þ xÞ2 ðv þ 1Þ ððv1Þðw11 w22 Þ2ðw13 w14 ÞÞ þ ðw11 w22 þ 2ðw13 w23 ÞÞ vðv1Þ vðv1Þ
ð1 þ xÞ2 ð1 þ xÞ2 2vðw11 w22 þ w13 w23 Þ ¼ 2vðw13 w14 Þ Z0: vðv1Þ vðv1Þ
&
Proof of Proposition 5.1. From (11) and (12) it follows that H8(x), H9(x), and H10(x) intersect at the point x1 defined in (6). We have that H10 ðx1 ÞH4 ðx1 Þ ¼ w12
2ðv1Þx21 , v þ1
H10 ðx1 ÞH5 ðx1 Þ ¼ w13
2ðv1Þx21 , v þ1
H10 ðx1 ÞH7 ðx1 Þ ¼ w14
2ðv1Þx21 : v þ1
Therefore, we have shown that if all the off-diagonal elements of W are negative—as for the AR(1) with l Z0—all of the sequences ‘ satisfy H10 ðx1 Þ Z H‘ ðx1 Þ. So, we have that H10 ðx1 Þ ¼ H9 ðx1 Þ ¼ H8 ðx1 Þ ¼ max H‘ ðx1 Þ: ‘
j ðx1 Þ ¼ 4ðw11 þw22 Þð1 þx1 Þ=ðv1Þ 40 for all l and all v. If we can show that H9j ðx1 Þ o 0, then x1 is the desired Note that H10 minimum. We have that
H9j ðxÞ ¼
2 ðð2vw11 þ ðv1Þw22 Þð1þ xÞ þ ðv1Þ2 w22 xÞ: vðv1Þ
ð13Þ
With the same arguments as in the proof of Proposition 4.1, we can show that H9j ðx1 Þ is negative whenever w22 2v pffiffiffiffiffiffiffiffiffiffiffi 4 : w11 ðv1Þð v þ 11Þ Using Eq. (7), we conclude that H9j ðx1 Þ is negative whenever Gv ðlÞ defined in (8) is positive. We therefore have shown that H9j ðx1 Þ
is never negative when v =4, is negative for all l 4 dv when 5rv r9, and is negative for all l 40 when vZ10. & Proof of Proposition 5.2. Consider the performance of a design consisting of sequence class 9 only. The derivative of H9(x) is zero at 2
x4 ¼
3
3ð3v1Þðv þ 1Þl þ ðv1Þðl l Þ 2
3
ð3ðv þ1Þ þ ðv3Þl þðv1Þðl l ÞÞv
:
Since we consider a case where H9j ðx1 Þ Z 0, it follows with the arguments used for x2 in the proof of Proposition 4.2 that x4 ox1.
ARTICLE IN PRESS J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
2737
From Eq. (7), we know that w22 Zw11. Therefore, Proposition A1 implies that for all x rx1 we have H9 ðxÞ Z H10 ðxÞ and H9 ðxÞ ZH8 ðxÞ. To see whether the conditions of Proposition A2 are fulfilled, we consider w13 w14 ¼
lð1lÞ Z 0 and w13 w12 ¼ l Z 0: 4l
Since we assume l Z0, Proposition A2 implies that for all x rx1 we have H5 ðxÞ Z H4 ðxÞ and H5 ðxÞ Z H7 ðxÞ. It therefore follows in particular that max H‘ ðx4 Þ ¼ maxfH5 ðx4 Þ,H9 ðx4 Þg. To check whether H9 ðx4 Þ ¼ minx ðmax‘ H‘ ðxÞÞ, it therefore suffices to compare H9(x4) and H5(x4). Since H9j ðx4 Þ ¼ 0, it follows from (13) that x4 ¼
2vw11 þ ðv1Þw22 ðv1Þ2 w22
ð1 þ x4 Þ
and, therefore, x24 ¼
ð2vw11 þðv1Þw22 Þ2 ðv1Þ4 w222
ð1 þx4 Þ2 :
This implies that H9 ðx4 ÞH5 ðx4 Þ ¼
ð1þ x4 Þ2 w11 ðv1Þ3 w222
ððv1Þ2 w222 4ððv1Þw22 þvw11 Þðw11 þ 2w13 ÞÞ:
This difference is nonnegative as long as gv ¼ ðv1Þ2 w222 4ððv1Þw22 þvw11 Þðw11 þ 2w13 Þ is nonnegative. We observe that gv þ 1 gv ¼ ð2v1Þw222 4ðw11 þ w22 Þðw11 þ2w13 Þ: Since w11 ow22 and w13 o0 in the case of an AR(1) with l 4 0, it follows that 4ðw11 þw22 Þðw11 þ2w13 Þ o 8w222 . This implies that gv þ 1 gv 40 for all v Z5 and for all l 40. Therefore, if we can show that gv Z0 for some vZ 5 and for l 40, then it follows inductively that gs Z0 for all sZv. In terms of l, we get 2
3
2
3
2
2
ð42lÞgv ¼ ðv1Þ2 ð3 þ l þ l l Þ2 4ðv1Þð3 þ l þ l l Þð1 þ 3l2l Þ4vð3lÞð1 þ3l2l Þ and (4 2l)g6 = 93 242l +431l2 148l3 + 75l4 90l5 +25l6. We can express (4 2l)g6 as 10283/268+268(l 121/268)2 + 148l2(1 l)+15l2(1 l3)+75l4(1 l)+25l6, which is clearly positive for 0o l o1. Therefore, we have shown for v = 6, 7, 8 and 9 that H9 ðx4 Þ ZH5 ðx4 Þ for all 0 r l r dv. However, (4 2l)g5/4= U(l), where U(l) was defined in (9). This becomes negative for some 0 o l o dv. The second 4 3 2 2 2 4 derivative of this function is U j ðlÞ ¼ 120l 320l þ192l 156l þ 162 ¼ ð34lÞð54 þ20l þ 80l Þ þ 32l þ60l , which is obviously positive for 0o l o3/4. Therefore, U(l) is convex in the interval [0, 3/4]*[0, d5]. The facts that U(0)= 940 and U(3/4)= 3657/102440, with U(12)= 23 16, imply that there are exactly 2 points in [0, d5] where U(l) =0, with U(l)o 0 between them and U(l)40 otherwise. These points are Z1 E0.236 and Z2 E0.602. Therefore, we have shown for v = 5 that H9 ðx4 Þ Z H5 ðx4 Þ for all 0 r l r Z1 and for all Z2 r l r d5. Hence, a design consisting of sequence 9 only indeed maximizes qd for these l. & Proof of Proposition 5.3. Note that we only have to deal with the case where H5 ðx4 Þ 4H9 ðx4 Þ. Since we have already shown that H5 ðx1 Þ oH9 ðx1 Þ, there must be a point x5, say, between x4 and x1, such that H5 ðx5 Þ ¼ H9 ðx5 Þ. Solving this, we get the expression for x5 given in Proposition 5.3. Propositions A1 and A2 imply that max H‘ ðx5 Þ ¼ maxfH5 ðx5 Þ,H9 ðx5 Þg, since x5 ox1. Because x5 4x4, it follows that H9j ðx5 Þ 4 0. Therefore, it only remains to show that H5j ðx5 Þ o0. We use the fact that H5j ðxÞ is increasing in x and show that H5j ðx1 Þ o 0 instead. Now, H5j ðx1 Þ ¼ m1 ðs5 Þ þ x1 ðm1 ðs5 Þ þðv1Þm2 ðs5 ÞÞ, where s5 is a sequence from class 5. We find that pffiffiffiffiffiffiffiffiffiffiffi H5j ðx1 Þ o0 if and only if m1 ðs5 Þ o v þ1m2 ðs5 Þ. pffiffiffiffiffiffiffiffiffiffiffi Taking m1(s5) and m2(s5) from Table A1, and using that v þ 1 42, it suffices to show that ðv1Þðw11 þ w22 Þ 4 2ð2v1Þw13 :
ARTICLE IN PRESS 2738
J. Kunert et al. / Journal of Statistical Planning and Inference 140 (2010) 2719–2738
This, however, holds for the AR(1) with l Z0. Therefore, we have shown that H9j ðx5 Þ 40 and H5j ðx5 Þ o 0. This implies that for the l and v considered here H9 ðx5 Þ ¼ H5 ðx5 Þ ¼ min max H‘ ðxÞ, x
‘
and an optimal approximate design uses sequences from classes 9 and 5.
&
References Bhaumik, D.K., 1990. Optimal incomplete block designs for comparing treatments with a control under the nearest neighbor correlation model. Utilitas Math. 38, 15–25. Cutler, D.R., 1993. Efficient block designs for comparing test treatments to a control when the errors are correlated. J. Statist. Plann. Inference 36, 107–125 (and 37, 393–412). Kunert, J., Martin, R.J., 2000. On the determination of optimal designs for an interference model. Ann. Statist. 28, 1728–1742. Kushner, H.B., 1997. Optimal repeated measurements designs: the linear optimality equations. Ann. Statist. 25, 2328–2344. Majumdar, D., 1996. Optimal and efficient treatment-control designs. In: Ghosh, S., Rao, C.R. (Eds.), Handbook of Statistics 13: Design and Analysis of Experiments. North-Holland, Amsterdam, pp. 1007–1053. Majumdar, D., Notz, W., 1983. Optimal incomplete block designs for comparing treatments with a control. Ann. Statist. 11, 258–266. Martin, R.J., 1996. Spatial experimental design. In: Ghosh, S., Rao, C.R. (Eds.), Handbook of Statistics 13: Design and Analysis of Experiments. North-Holland, Amsterdam, pp. 477–514. Martin, R.J., 1998. Optimal designs for small-sized blocks under dependence. J. Combin. Inform. Syst. Sci. 23, 95–124. Martin, R.J., Eccleston, J.A., 1993. Incomplete block designs with spatial layouts when observations are dependent. J. Statist. Plann. Inference 35, 77–91. Martin, R.J., Eccleston, J.A., 2001. Optimal and near-optimal designs for dependent observations. Statist. Appl. 3, 101–116. Stufken, J., 1988. On bounds for the efficiency of block designs for comparing test treatments with a control. J. Statist. Plann. Inference 19, 361–372. Woo, Y., Krueger, W., Kaur, A., Churchill, G., 2005. Experimental design for three-color and four-color gene expression microarrays. Bioinformatics 21 (Suppl. 1), i459–i467.