Partitioning principle and selection of good treatments


Journal of Statistical Planning and Inference 136 (2006) 2053–2069. www.elsevier.com/locate/jspi

H. Finner∗, G. Giani, K. Straßburger

Institute of Biometrics and Epidemiology, German Diabetes Center, Leibniz Center at Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany

Available online 7 September 2005

Abstract

In a recent paper, Finner and Straßburger [2002. The partitioning principle: a powerful tool in multiple decision theory. Ann. Statist. 30, 1194–1213] introduced a general, a weak and a strong partitioning principle (SPP), respectively, for the construction of multiple decision procedures, e.g., multiple testing or selection procedures. Partitioning principles can be viewed as natural extensions of the well-known closure principle and may yield more powerful decision procedures. In this paper, we are concerned with the construction of a step-down procedure for selecting a subset of k ≥ 2 treatments containing all good treatments by rigorously applying the weak partitioning principle (WPP). This results in some new least favourable parameter configuration (LFC) problems and an improved set of critical values. The new step-down procedure improves the step-down procedure based on the closure principle considerably and decreases the required sample size with respect to a pre-specified power criterion. Various procedures, including a Newman–Keuls-type procedure which is related to the SPP and probably the best possible step-down procedure, will be compared with respect to power and sample sizes. Finally, we reanalyse an experiment with 13 treatment means.
© 2005 Elsevier B.V. All rights reserved.

MSC: primary 62J15; 62F07; secondary 62F03; 62C99

Keywords: Directional errors; Familywise error rate; Formal closure principle; Least favourable parameter configuration; Multiple comparisons with the best; Multiple level; Multiple hypotheses testing; Newman–Keuls test; Power control; Probability of a correct selection; Sample size determination; Subset selection procedure; Step-down procedure; Strong partitioning principle; Weak partitioning principle

1. Introduction

Statistical inference concerning k ≥ 2 treatments is one of the most common problems in multiple decision theory. Besides procedures for all pairwise comparisons of treatment means or treatment variances, various types of selection goals and selection procedures have been developed and proposed during the last half century. In contrast to pairwise comparison procedures, the advantage of formulating a specific selection goal lies in more precise answers to more specific questions. Shanti S. Gupta's contributions in the field of ranking and selection started in the year of birth of the first author of this work with a paper entitled 'On a decision rule for a problem in ranking means' (Gupta, 1956). With about 200 research articles, the influence of Shanti's work is outstanding. One of the most famous selection procedures cited and applied in numerous papers is Gupta's subset selection rule, designed to select a subset out of k ≥ 2 treatments containing one of the best treatments with some pre-specified probability P∗ (Gupta, 1965). Admissibility of this procedure was shown in Finner and Giani (1996).

In this paper, we are concerned with the problem of selecting a subset of k ≥ 2 treatments such that the selected subset contains all good treatments with a pre-specified probability. As in Lehmann (1961), a treatment is considered good if it does not fall too much below the best one. The extended selection goal is often appropriate in a screening phase of a study concerning k treatments.

We first introduce some notation and briefly discuss previous results. Let I = {1, …, k} denote an index set whose elements are interpreted as the labels of the k ≥ 2 treatments under consideration. Furthermore, let X = (X1, …, Xk) denote a related sampling statistic with values in the sample space 𝒳 = R^k. Let the distribution of X be given in terms of a cumulative distribution function Fϑ(x) = F((x − μ)/σ), where ϑ = (μ, σ) ∈ Θ = R^k × (0, ∞) with location parameter μ ∈ R^k and scale parameter σ > 0. We assume that F has a Lebesgue density f. Our interest is focused on the selection goal of determining a minimal subset S(X) of I containing all good treatments with pre-specified confidence probability 1 − α.

∗ Corresponding author. Tel.: +49 0211 3382 352; fax: +49 0211 3382 677. E-mail address: fi[email protected] (H. Finner).
0378-3758/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2005.08.036
Often, P∗ = 1 − α is called the PCS-level (PCS: probability of a correct selection). An equivalent formulation is that I\S(X) contains only treatments which are not good with confidence probability 1 − α. In this context, good treatments are defined to be all the elements of the set G(ϑ) = {i ∈ I : μ(k) − μi ≤ δσ}, where δ ≥ 0 is fixed and μ(k) = max_{j∈I} μj. Formally, the PCS condition is given by

∀ϑ ∈ Θ : Pϑ(G(ϑ) ⊆ S(X)) ≥ 1 − α.   (1.1)
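For a given parameter configuration the set of good treatments is easy to evaluate; the following small sketch (function name and 0-based indexing are ours, purely for illustration) spells out the definition G(ϑ) = {i : μ(k) − μi ≤ δσ}:

```python
def good_set(mu, delta, sigma=1.0):
    """Return G(theta) = {i : mu_(k) - mu_i <= delta * sigma} (0-based indices)."""
    top = max(mu)
    return {i for i, m in enumerate(mu) if top - m <= delta * sigma}

# With delta = 1 and sigma = 1, treatments within one unit of the best are good.
print(good_set([0.0, 0.4, 1.2, 2.0], delta=1.0))  # -> {2, 3}
```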

Fixing the threshold δ in advance has the advantage that powerful selection rules satisfying (1.1) can be designed in order to answer the specific question at hand. For another approach, involving simultaneous control for all δ ≥ 0, we refer to the discussion in Remark 1.3 later in this section. For the sake of simplicity, we first restrict attention to the case that the scale parameter σ is known. In this case, the set of good treatments will be denoted by G(μ). Based on the duality between testing and selecting, Finner and Giani (1994) proposed a selection procedure which is based on a step-down (or closed) testing procedure for the set of hypotheses given by

HJ = {μ ∈ R^k : G(μ) ⊇ J}  versus  KJ = R^k\HJ,   (1.2)

with ∅ ≠ J ⊆ I. The proposed step-down procedure is based on the special range-type statistic (the selection range statistic)

TJ(X) = max_{i∈I} Xi − min_{i∈J} Xi,

which has already been used by Broström (1981) in the same context. Let cJ > 0 denote the critical value defining a test at level α for testing (1.2) based on the test statistic TJ, i.e., cJ is the smallest value satisfying

inf_{μ∈HJ} Pμ(TJ(X) ≤ cJ) = 1 − α.   (1.3)

Then

S(X) = ⋃_{J : TJ(X) ≤ cJ} J

defines a (step-down) subset selection procedure which contains all good treatments with probability at least 1 − α. This procedure can be viewed as a result of the so-called closure principle (CP) in multiple comparisons. The CP requires that a hypothesis HJ is rejected if and only if each hypothesis HR ⊆ HJ is rejected at level α. To obtain all the critical values cJ for the step-down subset selection procedure, one has to solve the least favourable parameter configuration (LFC) problem of finding a μ∗(J) such that

inf_{μ∈HJ} Pμ(TJ(X) ≤ c) = P_{μ∗(J)}(TJ(X) ≤ c),   (1.4)

where c > 0 is fixed. LFC results for the range statistic TI under various hypotheses are given in Giani and Finner (1991). For J ≠ I the determination of (unique) LFC's is more difficult. Under the assumption that the underlying probability density f is log-concave, application of Prekopa's theorem (Prekopa, 1973) immediately yields that one of the parameter configurations

μ(r) = (0, …, 0, δ, …, δ)  (r zeros, k − r entries δ),  r = 1, …, j,

is least favourable for HJ for J = {1, …, j}. Additional assumptions yield a further reduction of possible LFC candidates. We summarize the results for known σ in the following theorem.

Theorem 1.1 (Finner and Giani, 2001). Let h : R → [0, ∞) be a log-concave and symmetric Lebesgue density and let f be of the type f(x1, …, xk) = ∏_{i=1}^k h(xi). Moreover, let δ > 0. Then, for each J = {1, …, j}, j = 1, …, k, there exists a solution μ∗ = μ∗(J) of (1.4) with
(a) μ∗ = μ(j) if 1 ≤ j ≤ (k + 1)/2,
(b) μ∗ ∈ {μ(r) : k/2 ≤ r ≤ j} if (k + 1)/2 < j < k,
(c) μ∗ = μ(m) if j = k with m = ⌊(k + 1)/2⌋ being the largest integer less than or equal to (k + 1)/2.

As mentioned in Finner and Giani (2001), a corresponding result can be derived for a normal model with unknown scale parameter with the method described in Finner and Giani (1996). Under the assumptions of Theorem 1.1, let ck,j denote the critical values satisfying

inf_{μ∈HJ} Pμ(TJ(X) ≤ ck,j) = 1 − α,  J = {1, …, j}, j = 2, …, k.

Then ck,k ≥ · · · ≥ ck,2. Suppose for a moment that the observations are ordered, i.e., x1 ≥ · · · ≥ xk. Then S(x) = {1, …, m}, where m is determined by m = max{j : x1 − xj ≤ ck,j}. This is an alternative definition of the step-down subset selection procedure under the assumptions of Theorem 1.1.
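With ordered observations the procedure reduces to the rule m = max{j : x1 − xj ≤ ck,j} stated above. A minimal sketch of that rule (our own code; the critical values must come from tables or a separate computation, e.g. Finner and Giani, 1994):

```python
def step_down_select(x, crit):
    """Step-down subset selection: sort means decreasingly and keep the
    largest m with x_(1) - x_(m) <= c_{k,m}; crit maps j = 2..k to c_{k,j}."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    xs = [x[i] for i in order]            # xs[0] >= xs[1] >= ... >= xs[k-1]
    m = 1                                 # the best treatment is always selected
    for j in range(2, len(x) + 1):
        if xs[0] - xs[j - 1] <= crit[j]:  # m = max{j : x_(1) - x_(j) <= c_{k,j}}
            m = j
    return {order[i] for i in range(m)}

# Hypothetical critical values for k = 4 (illustration only, not from the paper):
print(step_down_select([5.0, 4.9, 3.0, 4.5], {2: 1.0, 3: 1.2, 4: 1.4}))  # -> {0, 1, 3}
```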


Remark 1.2. The step-down procedure improves the single-step (or natural) procedure proposed, for example, by Lam (1986), which selects all i ∈ I with max_{j∈I} Xj − Xi ≤ (qk + δ), where qk is determined by P0(max_{1≤i<j≤k} |Xi − Xj| ≤ qk) = 1 − α. For δ > 0, replacement of (qk + δ) by the smaller critical value ck,k also yields a single-step procedure at confidence level 1 − α which is more powerful than Lam's (1986) procedure.

Remark 1.3. The ad hoc method of construction in Lam (1986) for the natural procedure has a more general background. The natural procedure may be derived by applying the so-called confidence region test method, which has its roots in Aitchison (1964, 1965). Let 𝒳 denote a sample space and suppose we have a confidence set C = (C(x) : x ∈ 𝒳) with confidence level 1 − α for ϑ ∈ Θ. Let H = {Hm : m ∈ M} denote a family of hypotheses. Reject Hm (i.e., set φm(x) = 1) iff Hm ∩ C(x) = ∅. Then the multiple test φ = (φm : m ∈ M) for H controls the multiple level α. In order to underline the dependence on δ of the set of good treatments and the selection procedure, replace G by Gδ and S by Sδ for the moment. Choose C(x) = {μ : xi − xj − qk ≤ μi − μj ≤ xi − xj + qk, 1 ≤ i < j ≤ k} and Hδ,i = {ϑ : i ∈ Gδ(μ)} for i ∈ I and δ ≥ 0. Then Hδ,i ∩ C(x) = ∅ iff max_{1≤j≤k} xj − xi > (qk + δ). In this case, Hδ,i is rejected and we set φδ,i(x) = 1. Moreover, the duality between testing and selecting yields that Sδ defined by Sδ(x) = {i : φδ,i(x) = 0} has PCS-level 1 − α, that is, Pμ(Gδ(μ) ⊆ Sδ(X)) ≥ 1 − α, which is Lam's (1986) result. In fact, we get even more, namely

∀μ ∈ R^k : Pμ(Gδ(μ) ⊆ Sδ(X) ∀δ ≥ 0) ≥ 1 − α.

This shows that Lam's (1986) natural procedure is not tailored to a special value of δ. In contrast to this approach, in this paper we consider methods which are designed to answer the specific question at hand in terms of a fixed threshold δ and the specific selection goal.
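Lam's natural procedure from Remark 1.2 is a one-line rule once qk is available. A sketch (names ours; qk, the 1 − α quantile of the range, must be supplied, and σ = 1 is assumed):

```python
def single_step_select(x, qk, delta):
    """Natural (single-step) procedure: select all i with
    max_j X_j - X_i <= q_k + delta (scale sigma = 1 assumed)."""
    top = max(x)
    return {i for i, m in enumerate(x) if top - m <= qk + delta}

print(single_step_select([0.0, 1.0, 3.0], qk=1.5, delta=0.5))  # -> {1, 2}
```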
Extensive investigations concerning the behaviour of the critical values of step-down and single-step procedures, as well as some tables, can be found in Finner and Giani (1994). One of the important observations has been that the first half of the critical values (the large ones) of the step-down procedure tend to be nearly equal for large values of δ. This means that the step-down procedure may not be much better than the single-step procedure. Therefore, the question is whether this step-down procedure is the best possible step-down procedure, or whether it can be improved upon. Finner and Giani (1994) conjectured that a certain step-down procedure closely related to the Newman–Keuls test may be the best possible procedure in the class of step-down procedures, but this conjecture has not been proved yet.

In a recent paper, Finner and Straßburger (2002) introduced some partitioning principles for the construction of multiple decision procedures which are sometimes uniformly more powerful than procedures related to a formal application of the CP. Among others, they gave some brief hints on how to improve the step-down subset selection procedure by applying the so-called weak partitioning principle (WPP). In this paper, we use this idea and give a full description and derivation of the improved step-down procedure, whose critical values are closer to those of the selection procedure related to the Newman–Keuls test.

In Section 2, we give a brief description of the WPP and apply it to the subset selection problem. This results in a new testing problem of pairwise disjoint hypotheses. In Section 3, we derive the corresponding LFC results for this testing problem. The proof of the main result is deferred to the Appendix. In Section 4, we briefly outline the case of an unknown scale parameter. Problems concerning control of power and determination of sample sizes are discussed in Section 5. Moreover, various procedures are compared with respect to the required sample size. Finally, in Section 6, we reanalyse Keuls's (1952) example with 13 treatment means with respect to subset selection goals.

2. The weak partitioning principle

The main idea behind the partitioning principles proposed in Finner and Straßburger (2002) is to rewrite the problem of testing a family of hypotheses H = {Ht : t ∈ T} as a problem of testing a so-called natural partition generated by H. A family of hypotheses H = {Ht : t ∈ T} (it is always assumed that ∅ ≠ Ht ≠ Θ for all t ∈ T, where Θ denotes the entire parameter set) is called disjoint if Ht ∩ Hs = ∅ for all s ≠ t, s, t ∈ T. It is well known that in this situation every hypothesis Ht can be tested at level α, that is, without further adjustments we get a multiple test controlling the multiple level α (or familywise error rate α in the strong sense).

Let us first recall what we understand by a formal application of the CP, or the formal closure principle (FCP). If H = {Ht : t ∈ T} is a family of hypotheses which is closed under intersections (short: ∩-closed), all we have to do is to construct a non-randomized level-α test φt for each Ht, t ∈ T. Then setting φ̄t = min_{s : Hs ⊆ Ht} φs for each t ∈ T defines a multiple level-α test for H. Since there are no further requirements concerning the choice of the local level-α tests φt, we call this a formal application of the CP. The class of all (non-randomized) multiple level-α tests for a family of hypotheses H will be denoted by Φα(H).

The starting point for applying the WPP is a ∩-closed family of hypotheses H = {Ht : t ∈ T}. For the sake of simplicity, we assume in addition that for each Hs there exists a Ht ⊆ Hs being minimal in H. This leads to the following variant of the WPP. For the minimal hypotheses in H one may choose level-α tests without further restrictions but as powerful as possible. Now, let s ∈ T be arbitrary but fixed for the moment and suppose we already have tests φt for each t ∈ J(s) = {t : Ht ⊂ Hs} (assuming that J(s) ≠ ∅) such that (φt : t ∈ J(s)) ∈ Φα({Ht : t ∈ J(s)}). Let Πs = Hs \ ⋃_{t∈J(s)} Ht. If Πs = ∅, set φs = min_{t∈J(s)} φt.
If Πs ≠ ∅, choose a test φ̃s, in some sense as powerful as possible, for testing Hs such that α∗s = sup_{ϑ∈Πs} Pϑ(φ̃s = 1) is as large as possible but less than or equal to α. This means that a possible type I error is only controlled on Πs ⊆ Hs, and we now have to look for LFC's over Πs instead of Hs, as one would do when applying the FCP. This may lead to a test with a smaller acceptance region, hence a more powerful test for Hs versus Ks, which is now defined by φs = min{φ̃s, min_{t∈J(s)} φt}. This construction ensures that (φt : t ∈ J(s) ∪ {s}) ∈ Φα({Ht : t ∈ J(s) ∪ {s}}). If we apply this method to every non-minimal hypothesis in H, we end up with a strongly coherent multiple level-α test for H (cf. Finner and Straßburger, 2002). Note that, with Πs = Hs for all minimal hypotheses Hs, the set {Πs ≠ ∅ : s ∈ T} constitutes the above-mentioned natural partition of ⋃_{t∈T} Ht. For an illustrating example concerning the application of the WPP we refer to Example 4.1 in Finner and Straßburger (2002).

Whether the WPP indeed yields a more powerful test procedure than the FCP strongly depends on the structure of H and the structure of the acceptance regions. Often, tests are of the type {φs = 0} = {Ts ≤ cs}, where Ts denotes a suitable test statistic. Assume for a moment that Ts tends to larger values if ϑ moves away from Hs. Then, in the case of the FCP, a critical value cs = cs(FCP) (say) is determined such that αs = sup_{ϑ∈Hs} Pϑ(Ts > cs) is as large as possible, but less than or equal to α. In the case of the WPP, cs = cs(WPP) (say) is determined such that αs = sup_{ϑ∈Πs} Pϑ(Ts > cs) is as large as possible but less than or equal to α (assuming that Πs ≠ ∅). Clearly, if the LFC's are different and cs(WPP) < cs(FCP), the WPP yields a more powerful test than the FCP.

3. Least favourable parameter configurations

As mentioned in the introduction, for the subset selection problem at hand, the duality between testing and selecting can be used to generate the ∩-closed family of hypotheses HJ = {μ ∈ R^k : G(μ) ⊇ J}, ∅ ≠ J ⊆ I. This family generates a natural partition of the parameter space given by

ΠJ = {μ ∈ R^k : G(μ) = J} = HJ \ ⋃_{R : R ⊃ J} HR, ∅ ≠ J ⊂ I, and ΠI = HI.

All that remains to be done is to construct a level-α test φJ for each ΠJ, ∅ ≠ J ⊆ I. Then

S(X) = ⋃_{J : φJ(X) = 0} J

defines a subset selection procedure controlling the probability of a correct selection at level 1 − α. In other words, the subset selection rule S defines a (1 − α)-confidence set (S(x) : x ∈ 𝒳) for the set of good treatments G(μ). A hypothesis HJ : G(μ) ⊇ J will be rejected if the selection range statistic TJ(X) = max_{i∈I} Xi − min_{i∈J} Xi is large. The WPP implies that we have to determine critical values dJ as small as possible such that

inf_{μ∈ΠJ} Pμ(TJ(X) ≤ dJ) = 1 − α.   (3.1)

It is easy to show that ΠJ in (3.1) can be replaced by EJ = {μ ∈ R^k : μi = 0 for all i ∈ I\J and μi ∈ [0, δ] for all i ∈ J with max_{i∈J} μi = δ}. This results in EJ ⊂ EL for ∅ ≠ J ⊂ L ⊆ I. Since TJ is monotone in J, we finally obtain that the critical values possess the monotonicity property

dJ ≤ dL for all J, L with ∅ ≠ J ⊂ L ⊆ I.   (3.2)

Under the assumption that f is log-concave, it can be argued as in Finner and Giani (1994) that for δ > 0

inf_{μ∈ΠJ} Pμ(TJ(X) ≤ d) = min_{μ∈DJ} Pμ(TJ(X) ≤ d),

where DJ = {μ ∈ R^k : μi ∈ {0, δ} for i ∈ J, μi = 0 for i ∈ I\J}\{(0, …, 0), (δ, …, δ)}, i.e., the LFC candidates have been reduced (except translation) to a finite number. Under the assumptions of Theorem 1.1, the LFC problem becomes more tractable and a further reduction is possible. The proof of the following result, which is similar to but much easier than the proof of Theorem 1.1, is deferred to the Appendix.

Theorem 3.1. Under the assumptions of Theorem 1.1, for each J = {k − j + 1, …, k}, j = 1, …, k, the infimum in (3.1) is attained at some point μ∗ = μ∗(J) with
(a) μ∗ ∈ {μ(r) : k − ⌊j/2⌋ ≤ r ≤ k − 1} if 2 ≤ j ≤ k − 1,
(b) μ∗ = μ(m) if j = k with m = ⌊(k + 1)/2⌋, and μ∗ = μ(k − 1) if j = 1.
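Since the set of LFC candidates is finite, the infimum in (3.1) can be approximated numerically by minimizing the coverage probability over the candidate configurations μ(r). The following crude Monte Carlo sketch (our own illustration for the standard normal case with J = {k − j + 1, …, k}; it does not reproduce the authors' exact computations) enumerates the slippage-type candidates:

```python
import numpy as np

rng = np.random.default_rng(1)

def min_coverage(d, k, j, delta, sims=50_000):
    """Approximate min over mu(r) (r zeros followed by k - r entries delta,
    k - j <= r <= k - 1) of P_mu(T_J(X) <= d) for J = {k-j+1, ..., k};
    Theorem 3.1 narrows the range of r further, but we enumerate all of it."""
    J = np.arange(k - j, k)                      # 0-based indices of J
    worst = 1.0
    for r in range(k - j, k):
        mu = np.concatenate([np.zeros(r), np.full(k - r, delta)])
        X = rng.normal(mu, 1.0, size=(sims, k))
        t = X.max(axis=1) - X[:, J].min(axis=1)  # selection range statistic T_J
        worst = min(worst, float((t <= d).mean()))
    return worst
```

A critical value dk,j could then be approximated by bisecting on d until min_coverage reaches 1 − α; the values in the paper are obtained by numerical integration instead.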


Under the assumptions of the last theorem, the critical values again depend only on the size of J. Let dk,j denote the critical values satisfying

inf_{μ∈ΠJ} Pμ(TJ(X) ≤ dk,j) = 1 − α,  J = {1, …, j}, j = 2, …, k.

From (3.2) we get dk,2 ≤ · · · ≤ dk,k. Consequently, the corresponding step-down procedure is of the same type as the FCP-based procedure of Finner and Giani (1994) described in Section 1; we only need to replace the critical values ck,j by dk,j. For k ≤ 4, LFC's for the determination of the critical values are uniquely determined by Theorem 3.1. For k ≥ 5, 2 ≤ j ≤ k − 1, we are left with several LFC candidates for the determination of dk,j. Extensive numerical investigations suggest that the probability P_{μ(r)}(T_{{1,…,j}} > d) is a unimodal function in r ∈ {0, …, k − 1}. A verification of this conjecture would help to speed up the determination of LFC's and critical values. It should be mentioned that the calculation of the critical values dk,j is not much more complicated than the calculation of the critical values ck,j. Let nFCP(k, j) and nWPP(k, j) denote the number of the remaining LFC candidates for the determination of these critical values. It can easily be seen from Theorems 1.1 and 3.1 that nWPP(k, j) ≤ nFCP(k, j) for k ≤ 6 and nWPP(k, j) ≤ nFCP(k, j) + ⌊(k − 3)/4⌋ for k > 6.

A problem occurs concerning the behaviour of the critical values considered as a function of δ > 0. First, note that for a fixed δ > 0 the sets ΠJ(δ) = ΠJ (say), ∅ ≠ J ⊆ I, yield a partition of the entire parameter space Θ = R^k. While ΠI(δ) is increasing in δ > 0 and the sets Π{i}(δ), i ∈ I, are decreasing in δ > 0, ΠJ(δ) is neither decreasing nor increasing in δ > 0 if J ⊂ I, |J| ≥ 2. Therefore, it cannot be expected that the critical values dk,2, …, dk,k−1 are monotone in δ > 0. It is easy to show by applying Prekopa's theorem that

P_{μδ(r)}(max_{i∈I} Xi − min_{i=k−j+1,…,k} Xi ≤ d)

is log-concave (and hence unimodal) in δ > 0, where μδ(r) = μ(r) (say). Therefore, a critical value dk,j(δ), 2 ≤ j ≤ k − 1, may be decreasing in δ ∈ (0, δ1] and increasing in δ ∈ (δ1, ∞) for some δ1 > 0. This behaviour is confirmed by numerical calculations; an example can be found at the end of Section 6. As a consequence, if we consider two subset selection procedures Si based on different threshold values δi, i = 1, 2, with δ1 < δ2, it may happen that S2(x) ⊂ S1(x) for some x, which seems paradoxical at first glance. We note that dk,k(δ1) ≤ dk,k(δ2) for 0 < δ1 ≤ δ2. Therefore, if dk,j(δ1) > dk,j(δ2) for some j ∈ {2, …, k − 1}, the WPP-based procedures for δ1 and δ2, respectively, are not really comparable.

In order to avoid the problem of non-monotone critical values in δ, they can be adjusted in such a way that they are non-decreasing in δ > 0. This can be done by replacing dk,j(δ) by d∗k,j(δ) = max{dk,j(δ), dk,j(0)} for j = 2, …, k. This leads to a more conservative procedure if dk,j(δ) < dk,j(0) for some j ∈ {2, …, k − 1}. It should be mentioned that for each δ > 0 the adjusted procedure is at least as powerful as the corresponding FCP-procedure. Therefore, at present the adjusted procedure can be recommended for practical use. Moreover, dk,j(δ) > dk,j(0) for suitably large values of δ; hence the original procedure based on the WPP coincides with the adjusted procedure if δ exceeds a certain boundary.

The reason for the non-monotone behaviour of the critical values in δ may be explained based on a conjecture formulated in Finner and Giani (1996). They conjectured that the step-down procedure based on the critical values dk,k(δ) ≥ · · · ≥ d2,2(δ) also controls the probability of a correct selection at level 1 − α and that these critical values cannot be improved upon. A verification of this conjecture is closely related to the problem of directional errors appearing for stepwise multiple test procedures and has not been given yet. It should be mentioned that dk,j(δ) ≥ dj,j(δ) for j < k. A step-down procedure based on dk,k(δ) ≥ · · · ≥ d2,2(δ) can be viewed as a refinement of the Newman–Keuls test for pairwise comparisons. To understand this, we note that a critical value dj,j(δ) can be taken to test a 'homogeneity' hypothesis HJ : max_{r,s∈J} |μr − μs| ≤ δ at level α, i.e.,

P_{μ∗j}(max_{1≤r<s≤j} |Xr − Xs| ≤ dj,j(δ)) = 1 − α,

where μ∗j ∈ {0, δ}^j has m entries 0 and j − m entries δ for j ∈ {2m, 2m + 1}. The original Newman–Keuls test (for δ = 0) rejects a pair hypothesis Hij : μi = μj if all homogeneity hypotheses HJ ⊆ Hij are rejected at level α. It is well known that the Newman–Keuls test does not control the multiple level α for more than three treatments. Note that the step-down subset selection procedure based on the Newman–Keuls critical values for selecting a subset containing all good treatments differs from the original Newman–Keuls step-down procedure for all pairwise comparisons. If one could prove that

inf_{μ∈ΠJ} Pμ( ⋂_{R : J ⊆ R} {TR(X) ≤ d|R|,|R|(δ)} ) ≥ 1 − α for all J ⊆ I, J ≠ ∅,

the step-down selection procedure based on the critical values of the Newman–Keuls test controls the PCS-level 1 − α. If this turns out to be true, this procedure may be considered as a result of the so-called strong partitioning principle (SPP). For more details concerning the SPP we refer to Finner and Straßburger (2002).

In the following, we use the abbreviation FCP or FCP-procedure for the FCP-based procedure of Finner and Giani (1994). We refer to the new WPP-based step-down selection procedure which uses the critical values dk,j(δ) as WPP or WPP-procedure (WPP∗ for the adjusted monotone procedure). For the step-down procedure based on the critical values dj,j(δ) of the Newman–Keuls test we use the abbreviation NK or NK-procedure (instead of SPP, because it is not clear yet whether it coincides with the SPP-procedure). The next lemma gives some insight into the behaviour of the critical values dk,j(δ) of the WPP-procedure and the critical values dj,j(δ) of the NK-procedure.

Lemma 3.2. Under the assumptions of Theorem 3.1, let k ≥ 3, j ∈ {2, …, k − 1}, such that j = 2m or j = 2m + 1. Let ek,j(δ) = dk,j(δ) − δ, ej,j(δ) = dj,j(δ) − δ, and let ej be the solution of

P0(max_{1≤i≤m} Xi − min_{m+1≤i≤j} Xi ≤ ej) = 1 − α.

Then ek,j(δ) and ej,j(δ) are both decreasing in δ > 0 with

lim_{δ→∞} ek,j(δ) = lim_{δ→∞} ej,j(δ) = ej,   (3.3)

hence lim_{δ→∞} [dk,j(δ) − dj,j(δ)] = 0.

A proof of this lemma, which is left to the reader, relies on the results stated in Theorem 3.1 and the fact that for k ≥ 3, j ∈ {2, …, k − 1}, r ∈ {1, …, j},

P0(max{X1, …, Xr, Xr+1 − δ, …, Xk − δ} − min{X1 + δ, …, Xr + δ, Xr+1, …, Xj} ≤ c)

is increasing in δ > 0 with limit

P0(max_{1≤i≤r} Xi − min_{r+1≤i≤j} Xi ≤ c).

In the normal case, where F(x) = ∏_{i=1}^k Φ(xi) with Φ denoting the cumulative distribution function of the standard normal distribution, the critical values ej defined by (3.3) appear in a different context in Bechhofer (1954) and can be obtained by solving

∫_{−∞}^{∞} (j − r)[1 − Φ(x)]^{j−r−1} [Φ(x + ej)]^r dΦ(x) = 1 − α.
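In the normal case the defining equation for ej is a one-dimensional integral, so the critical values can be obtained by standard quadrature and root finding. A sketch (our own code, not Bechhofer's tables; requires scipy):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

def coverage(e, j, r):
    """P_0(max_{1<=i<=r} X_i - min_{r+1<=i<=j} X_i <= e): condition on the
    minimum of the last j - r standard normals and integrate."""
    f = lambda x: (j - r) * (1.0 - norm.cdf(x)) ** (j - r - 1) \
                  * norm.cdf(x + e) ** r * norm.pdf(x)
    return quad(f, -np.inf, np.inf)[0]

def bechhofer_e(j, alpha=0.05):
    """Solve coverage(e, j, m) = 1 - alpha with m = j // 2 (j = 2m or 2m + 1)."""
    m = j // 2
    return brentq(lambda e: coverage(e, j, m) - (1.0 - alpha), 0.0, 20.0)

# Sanity check against the closed form for j = 2: e_2 = sqrt(2) * z_{1-alpha}.
print(bechhofer_e(2))  # -> about 2.326
```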

The main statement of the lemma is that the critical values of the WPP-procedure converge to the critical values of the NK-procedure as δ tends to infinity. It will be seen in Sections 5 and 6 that the critical values are often nearly equal already for δ ≥ 2.

4. Unknown scale parameter

For convenience, we restrict attention to a k-sample normal location-scale model with common unknown variance σ² > 0, that is, a balanced one-way layout. To this end, let Xi ∼ N(μi, σ²/n), i ∈ I, νW²/σ² ∼ χ²_ν, be independently distributed. In a balanced one-way layout we have ν = k(n − 1). Then we define the studentized selection range statistic by TJ(X, W) = (max_{i∈I} Xi − min_{j∈J} Xj)/W. Firstly, suppose that good treatments are defined in terms of σ-units, i.e., G(ϑ) = {i : μ(k) − μi ≤ δσ}. Then the corresponding hypotheses are of the type HJ : μ(k) − min_{j∈J} μj ≤ δσ, ∅ ≠ J ⊆ I. These hypotheses yield the natural partition ΠJ : μ(k) − min_{j∈J} μj = δσ, ∅ ≠ J ⊆ I. In order to obtain an exact level-α test for ΠJ, one has to determine the smallest c > 0 such that

inf_{ϑ∈ΠJ} Pϑ(TJ(X, W) ≤ c) = 1 − α,

where the notation Pϑ indicates that the probability measure now depends on ϑ = (μ, σ). It has been shown in Finner and Giani (1996) that Pϑ(TJ(X, W) ≤ c), considered as a function of μ, is log-concave in μ. Therefore, Theorem 3.1 carries over to this situation, i.e., the LFC candidates for the determination of the critical values are exactly the same as described in Theorem 3.1. It can also be applied, e.g., for balanced randomized block designs and cross-over designs, where the degrees of freedom have to be adjusted in a suitable way.

Some people prefer to define good treatments in terms of absolute values instead of σ-units, that is, the set of good treatments is given by {i : μ(k) − μi ≤ δ}. In connection with our subset selection problem this approach has been considered by Lam (1986). First improvements of Lam's SS-procedure are discussed in Finner and Giani (1994). Since the determination of the critical values of a corresponding WPP-procedure is based on the case δ = 0, Theorem 3.1 yields no improvement of the step-down procedure proposed in Finner and Giani (1994, 1996). Note that Theorems 1.1 and 3.1 coincide for δ = 0. However, one may construct a step-down procedure based on the critical values qk,ν ≥ · · · ≥ q2,ν of the studentized range statistic, too. Unfortunately, it is not clear whether such a procedure controls the probability of a correct selection at the pre-specified level 1 − α. In view of power control and allocation of sample sizes, we will restrict attention to the case where good treatments are defined in σ-units in the remainder of this article.

5. Power control and allocation of sample sizes

In general, control of power for stepwise multiple decision procedures is a difficult issue. The first problem is an appropriate definition of power, the second problem is the calculation of power. In this section, we consider one possibility of power control in the balanced k-sample normal model. For convenience, suppose the Xi, i = 1, …, k, are the estimated treatment means based on a sample size n for every treatment group, i.e., the variance of Xi is now given by σ²/n. The definition of good treatments remains the same, that is, G(μ) = {i : μ(k) − μi ≤ δ}. Let B(μ) = {i : μ(k) − μi ≥ Δ} denote the set of bad treatments, assuming Δ > δ. For the sake of simplicity, we first consider the case that σ is known and set σ = 1. One possibility of power control is to consider the probability that the selected subset S(X) contains no bad treatment, that is,

β(μ) = Pμ(S(X) ∩ B(μ) = ∅),

and a natural requirement for this power function is

β(μ) ≥ P∗1 for all μ with B(μ) ≠ ∅

for some pre-specified value P∗1 ∈ (0, 1). To begin with, we first consider a single-step procedure, denoted by SS or SS-procedure in the following. Again, we are concerned with the problem of finding the LFC for β(μ) given a fixed sample size n. For an SS-procedure S = {i : max_{j∈I} Xj − Xi ≤ c} based on some critical value c (say) it is easy to verify that the so-called slippage configuration μ∗ = (0, …, 0, Δ) is least favourable for β. For step-down procedures the situation is much more complicated, as will be demonstrated in the following. Let X1:J ≤ · · · ≤ X|J|:J denote the order statistics of Xj, j ∈ J ⊆ I, and let 0 < c2 ≤ · · · ≤ ck be any set of critical values. Let μ ∈ R^k be fixed for the moment and suppose B(μ) = J ≠ ∅ with J ⊆ I\{k}. Then

β(μ) = Pμ( ⋃_{M : M ⊇ J, |M| ≤ k−1} RM )
     = ∑_{M : M ⊇ J, |M| ≤ k−1} Pμ(RM ∩ {X|M^c|:M^c − X1:M^c ≤ ck−|M|}),

where RM = {X|M^c|:M^c − X1:M > ck, …, X|M^c|:M^c − X|M|:M > ck−|M|+1}.


Since most of these events are not convex, it seems very difficult to prove any monotonicity properties for β with respect to μ; for example, β is only piecewise continuous on R^k. Moreover, analytical expressions for β(μ) are rather complicated even for moderately large k, and their numerical calculation is more than cumbersome. An aggravating circumstance is that the slippage configuration is in general not least favourable for β, as simulations have shown. A look at the sets RM immediately yields that an infimum of β with respect to a set {μ : B(μ) = J} will be attained for some μ ∈ {(μ1, …, μk) : μj = 0 for all j ∈ J and μk:I = Δ}. In the case of the FCP-, WPP- or NK-procedure it may be conjectured that the LFC with respect to this set should have |J| entries 0, one entry Δ, and k − |J| − 1 entries all equal to some value in [0, Δ]. Even if one could prove this, the calculation of β for such configurations as well as the remaining minimization problem remain difficult. To overcome these difficulties, we look for an appropriate approximation. Obviously,

β(μ) ≥ max_{M : M ⊇ J, |M| ≤ k−1} Pμ(RM) ≥ Pμ(RJ).

Now, some monotonicity arguments yield inf

:B()=J

P (RJ ) = P∗ (RJ ) = P∗ (R{1,...,|J |} )

with ∗ = (0, . . . , 0, ), hence we get inf ()

∈Rk

min

j =1,...,k−1

P∗ (R{1,...,j } ).

It remains to find the minimal sample size n and the corresponding set of critical values for the step-down procedure under consideration (FCP, WPP or NK) such that min

j =1,...,k−1

P∗ (R{1,...,j } ) P1∗ ,

a job which is assigned to a computer program. Analytic expressions for P∗ (R{1,...,j } ) can be derived by conditioning on Xk−j :{j +1,...,k} and by using a formula known as Bolshev’s recursion for the joint distribution function of order statistics of i.i.d. random variables (cf., e.g., Shorack and Wellner, 1986). In case of a balanced one-way layout with underlying normal distribution and known  = 1 we get  P∗ (R{1,...,j } ) = [(x − ) (x)k−j −1 + (k − j − 1)(x) (x − ) (x)k−j −2 ] × Fj (x − ck , . . . , x − ck−j +1 ) dx, where Fj (y1 , . . . , yj ) = P (Y1:j y1 , . . . , Yj :j yj ) denotes the joint c.d.f. of the order statistics Y1:j  · · · Yj :j of j i.i.d. standard normal variables Y1 , . . . , Yj . If we consider the SS-, FCP-, WPP∗ -, WPP- and NK-procedure at a fixed level  the corre∗ sponding power functions denoted by SS , FCP , WPP , WPP and NK , respectively, satisfy ∗

SS ()FCP () WPP ()WPP ()NK ()

for all  ∈ Rk .

Moreover, a little reflection yields that the lower-bound minj =1,...,k−1 P∗ (R{1,...,j } ) for the FCP-, WPP- or NK-procedure is always greater than or equal to the minimum power of the SSprocedure. A few simulations indicate that the lower bounds yield a good approximation for the minimum power of the step-down procedures. For  0.1 and inf ∈Rk () 0.8 and moderately


Table 1
Required sample sizes for k = 4, 8 for the SS-, FCP-, WPP- and NK-procedure, displayed in the form (nSS, nFCP, nWPP, nNK)k, based on a randomized block design for α = 0.05, P1∗ = 0.8 and known σ = 1

ε       δ − ε = 0.5              δ − ε = 1            δ − ε = 2
0       (123, 109, 109, 100)4    (31, 28, 28, 25)4    (8, 7, 7, 7)4
        (176, 153, 153, 141)8    (44, 39, 39, 36)8    (11, 10, 10, 9)8
0.05    (107, 103, 90, 87)4      (29, 27, 25, 23)4    (8, 7, 7, 6)4
        (155, 145, 128, 123)8    (41, 37, 35, 33)8    (11, 10, 10, 9)8
0.25    (99, 99, 78, 78)4        (26, 25, 21, 20)4    (7, 7, 6, 6)4
        (148, 144, 116, 116)8    (38, 36, 30, 30)8    (10, 10, 8, 8)8
0.5     (99, 99, 78, 78)4        (25, 25, 20, 20)4    (7, 7, 6, 6)4
        (148, 143, 116, 116)8    (37, 36, 29, 29)8    (10, 9, 8, 8)8
1       (99, 99, 78, 78)4        (25, 25, 20, 20)4    (7, 7, 5, 5)4
        (148, 143, 116, 116)8    (37, 36, 29, 29)8    (10, 9, 8, 8)8

Table 2
Required sample sizes for k = 4, 8 for the SS-, FCP-, WPP- and NK-procedure, displayed in the form (nSS, nFCP, nWPP, nNK)k, based on a randomized block design for α = 0.05, P1∗ = 0.8 and unknown σ > 0

ε       δ − ε = 0.5              δ − ε = 1            δ − ε = 2
0       (124, 109, 109, 101)4    (32, 28, 28, 26)4    (9, 8, 8, 8)4
        (177, 153, 153, 142)8    (45, 39, 39, 36)8    (12, 10, 10, 10)8
0.05    (108, 104, 91, 87)4      (30, 28, 26, 24)4    (9, 8, 8, 7)4
        (156, 146, 129, 124)8    (42, 38, 35, 33)8    (12, 10, 10, 9)8
0.25    (102, 102, 80, 80)4      (27, 27, 22, 21)4    (8, 8, 7, 7)4
        (150, 144, 117, 117)8    (38, 37, 30, 30)8    (11, 10, 9, 9)8
0.5     (104, 104, 81, 81)4      (27, 27, 22, 22)4    (8, 8, 7, 7)4
        (151, 146, 118, 118)8    (39, 37, 31, 31)8    (11, 10, 9, 9)8
1       (111, 111, 87, 87)4      (30, 30, 24, 24)4    (9, 9, 7, 7)4
        (155, 150, 122, 122)8    (40, 39, 32, 32)8    (11, 11, 9, 9)8
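The lower bound minj Pϑ∗(R{1,...,j}) behind the sample sizes in Tables 1 and 2 can be evaluated from the integral representation above. The following sketch is our own illustration, not the SEPARATE implementation: Fj is obtained from a first-crossing recursion of the Bolshev type, the integral is approximated by a crude trapezoidal rule, and the critical values passed in are arbitrary placeholders.

```python
import math

def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def F_orderstats(y):
    """Fj(y1, ..., yj) = P(Y1:j <= y1, ..., Yj:j <= yj) for i.i.d. N(0,1) and
    nondecreasing y, via the first-crossing recursion
    Fm = 1 - sum_{i=1}^{m} C(m, i-1) (1 - Phi(y_i))^(m-i+1) F_{i-1},  F0 = 1."""
    F = [1.0]
    for m in range(1, len(y) + 1):
        s = sum(math.comb(m, i - 1) * (1.0 - Phi(y[i - 1])) ** (m - i + 1) * F[i - 1]
                for i in range(1, m + 1))
        F.append(1.0 - s)
    return F[-1]

def lower_bound_term(k, j, delta, crit):
    """Trapezoidal approximation of
    P(R{1,...,j}) = int [phi(x - delta) Phi(x)^(k-j-1)
                        + (k-j-1) phi(x) Phi(x - delta) Phi(x)^(k-j-2)]
                    * Fj(x - c_k, ..., x - c_{k-j+1}) dx
    in the slippage configuration; crit = (c_2, ..., c_k) are placeholder values."""
    c = {m: crit[m - 2] for m in range(2, k + 1)}
    cs = [c[k - i] for i in range(j)]          # c_k >= ... >= c_{k-j+1}
    lo, hi, n = -8.0, 8.0 + delta, 4000
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        g = phi(x - delta) * Phi(x) ** (k - j - 1)
        if k - j - 1 > 0:
            g += (k - j - 1) * phi(x) * Phi(x - delta) * Phi(x) ** (k - j - 2)
        w = 0.5 if i in (0, n) else 1.0        # trapezoid end-point weights
        total += w * g * F_orderstats([x - ci for ci in cs])
    return total * h

# Illustrative evaluation for k = 3, j = 1 with made-up critical values (c2, c3):
print(lower_bound_term(k=3, j=1, delta=2.0, crit=(1.5, 2.0)))
```

For k = 2 the expression collapses to ∫ φ(x − δ)Φ(x − c2) dx = Φ((δ − c2)/√2), which gives a quick check of the numerics.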

large k we always found some ϑ such that β(ϑ) was very close to the lower bound. There are some theoretical reasons, not discussed here, for the lower bound becoming better with increasing δ. The previous considerations carry over to the case of unknown σ, where good and bad treatments are defined in terms of σ-units. Conditioning on W/σ yields appropriate formulae. The SS-, FCP-, WPP∗-, WPP- and NK-procedures are implemented in the software package SEPARATE (Selecting, Partitioning and Testing, developed by the authors and others), including the case of unknown variance σ². SEPARATE is designed for sample size determinations and the calculation of critical values in selecting, partitioning and testing problems. A demo version can be found on the Web page www.ddz.uni-duesseldorf.de/main/separate/index.htm. In the following, we give some examples of power control and allocation of sample sizes. To this end, we consider a randomized block design of the type Yis = ϑi + bs + eis, s = 1, . . . , n, i = 1, . . . , k, where the error variables eis follow a normal distribution with mean 0 and variance σ² > 0 which may be known or unknown. In case of unknown σ the usual estimate for σ² has ν = (k − 1)(n − 1) degrees of freedom. The block effects bs may be fixed or random effects, respectively. Some examples of sample sizes for α = 0.05, P1∗ = 0.8 are displayed in Table 1 (σ = 1 known) and Table 2 (σ > 0 unknown). The entries (nSS, nFCP, nWPP, nNK)k give the sample sizes nSS for the SS-procedure, nFCP for the FCP-procedure, nWPP for the WPP-procedure, and nNK for the NK-procedure for k ∈ {4, 8}. All calculations have been carried out with SEPARATE. It should be mentioned that in all cases the ε-monotone version of the WPP-procedure leads to the same sample sizes as the WPP-procedure. For larger values of ε (≥ 0.25) the sample sizes of the FCP-procedure are very close to the sample sizes of the SS-procedure. Except for a few cases, the difference between WPP and NK is


Table 3
Power of the SS-procedure and simulated power of the FCP-, WPP- and NK-procedure for ϑ = (0, . . . , 0, μ, . . . , μ, δ), |B(ϑ)| = j, based on a randomized block design for k = 6, n = 15, α = 0.05, ε = 0.25 and unknown σ > 0. Additionally, β̃ = minj=1,...,k−1 Pϑ∗(R{1,...,j}), ϑ∗ = (0, . . . , 0, δ), is displayed in the last row; the minimum power value in each column is marked with ∗

             δ = 1.25                          δ = 1.5                           δ = 1.75
μ     j      SS      FCP     WPP     NK       SS      FCP     WPP     NK        SS      FCP     WPP     NK
0+    1      0.560   0.570   0.678   0.701    0.791   0.801   0.888   0.903     0.929   0.936   0.975   0.981
      2      0.402   0.418   0.585   0.624    0.679   0.698   0.840   0.866     0.880   0.893   0.961   0.971
      3      0.318   0.340   0.539   0.587    0.606   0.632   0.811   0.846     0.842   0.860   0.951   0.965
      4      0.265   0.291   0.509   0.565    0.553   0.586   0.791   0.832     0.811   0.834   0.944   0.960
      5      0.229∗  0.258∗  0.486   0.549    0.511∗  0.549∗  0.775   0.822     0.785∗  0.812∗  0.938   0.957
δ/2   1      0.578   0.581   0.586   0.596    0.802   0.802   0.805   0.806     0.933   0.933   0.934   0.935
      2      0.410   0.416   0.471   0.479    0.686   0.690   0.737   0.742     0.884   0.885   0.910   0.913
      3      0.322   0.329   0.436∗  0.454∗   0.610   0.619   0.718∗  0.732∗    0.844   0.849   0.905∗  0.912∗
      4      0.267   0.277   0.445   0.487    0.554   0.567   0.733   0.768     0.812   0.821   0.916   0.932
δ     1      0.864   0.863   0.864   0.864    0.970   0.970   0.970   0.970     0.996   0.996   0.996   0.996
      2      0.725   0.729   0.773   0.779    0.925   0.927   0.944   0.947     0.988   0.989   0.992   0.993
      3      0.578   0.587   0.698   0.712    0.855   0.860   0.917   0.924     0.972   0.973   0.988   0.989
      4      0.416   0.428   0.620   0.663    0.736   0.747   0.879   0.902     0.929   0.933   0.978   0.984
β̃           0.229   0.258   0.418   0.433    0.511   0.549   0.706   0.719     0.785   0.812   0.900   0.906
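The ordering βSS(ϑ) ≤ βWPP(ϑ) ≤ βNK(ϑ) visible in Table 3 has a simple pathwise core when the step-down critical values are monotone with dk equal to the single-step value: the step-down selection is then always a subset of the single-step selection, so bad treatments are excluded at least as often. A minimal simulation sketch (generic rules and made-up critical values, not SEPARATE output):

```python
import random

def ss_select(means, c):
    """Single-step rule: keep treatment i iff max_j X_j - X_i <= c."""
    best = max(means)
    return {i for i, x in enumerate(means, 1) if best - x <= c}

def step_down_select(means, crit):
    """Generic step-down rule with crit[j] = d_j, d_2 <= ... <= d_k.
    Scan j = k down to 2; at the first j with best - (j-th best) <= d_j,
    select the j best treatments; otherwise select only the best one."""
    k = len(means)
    order = sorted(range(1, k + 1), key=lambda i: means[i - 1], reverse=True)
    best = means[order[0] - 1]
    for j in range(k, 1, -1):
        if best - means[order[j - 1] - 1] <= crit[j]:
            return set(order[:j])
    return {order[0]}

random.seed(1)
k, delta = 6, 2.0
crit = {2: 1.0, 3: 1.2, 4: 1.4, 5: 1.5, 6: 1.6}   # made-up monotone critical values
theta = [0.0] * (k - 1) + [delta]                  # slippage-type configuration
hits_ss = hits_sd = 0
for _ in range(2000):
    x = [random.gauss(m, 0.5) for m in theta]      # simulated sample means
    s_ss, s_sd = ss_select(x, crit[k]), step_down_select(x, crit)
    assert s_sd <= s_ss                            # step-down selects a subset
    hits_ss += s_ss == {k}                         # all bad treatments excluded
    hits_sd += s_sd == {k}
print(hits_sd >= hits_ss)  # prints True: step-down power >= single-step power
```

The subset relation holds for every sample, so the power comparison in the final line is not a Monte Carlo accident.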
quite small. Compared to the SS-procedure or FCP-procedure, the savings of WPP and NK, respectively, are up to approximately 20%, sometimes more. Keeping in mind that the sample sizes for the step-down procedures are based on a lower bound for the minimum power, and allowing for the discreteness of the problem of finding minimum sample sizes, the actual minimum power is in general greater than the required minimum power P1∗. At first glance, the difference between the sample sizes for known and unknown variance seems not to be very big. However, while in the case of known variance the sample size is decreasing in ε to some fixed value for a fixed difference δ − ε, it is first decreasing and then increasing in the case of unknown variance. This effect often appears in connection with non-central t-distributions. It can be explained by the fact that the variance of a non-central t-distribution increases with increasing non-centrality parameter. Table 3 gives some power values of the SS-procedure (based on numerical calculations) and of the FCP-, WPP- and NK-procedure (based on simulations, 10^6 replicates) for parameter configurations ϑ = (0, . . . , 0, μ, . . . , μ, δ) with μ ∈ {0+, δ/2, δ} and with j entries 0 (i.e., B(ϑ) = {1, . . . , j}), j = 1, . . . , 5 (4), for k = 6, n = 15, ε = 0.25 and α = 0.05. The last row in the table gives the lower bound β̃ = minj=1,...,k−1 Pϑ∗(R{1,...,j}), ϑ∗ = (0, . . . , 0, δ), for the power. In case of the SS-procedure, the lower bound is attained in the slippage configuration. The minimum power value in each column of Table 3 is marked with ∗. In case of the step-down procedures these values give an (estimated) upper bound for the minimum power. In any case, upper and lower bounds do not differ too much. The values in the fifth row (μ = 0+, j = 5) do not depend on μ and are obtained numerically (instead of by simulation). In case of the FCP-procedure these values equal the corresponding values in the last row of the table, so the slippage configuration, which is least favourable for the power of the SS-procedure, is also least favourable for the FCP-procedure. But this result does not hold in general. It should be mentioned that in the slippage configuration the WPP and NK step-down procedures both have considerably higher power than the SS- and FCP-procedures. Moreover,


the difference between WPP and NK becomes smaller when δ increases. Finally, we note that the power values of the WPP∗-procedure coincide with the power values of the WPP-procedure in Table 3.

6. The Keuls example

In his paper on the use of the studentized range in connection with an analysis of variance, Keuls (1952) considered the following trial on white cabbage carried out in 1950. He described the trial as follows: A trial field had been divided into 39 plots, grouped into 3 blocks of 13 plots each. In each block the 13 varieties to be investigated were planted out (a randomized block design). During this trial all plots were treated in exactly the same way. The purpose was to learn which variety would give the highest gross yield per cabbage and which the lowest, in other words to find approximately the order of the varieties according to gross yield per cabbage. The original data can be found in Keuls (1952). The (renumbered) sample means yi = x̄i· in decreasing order are 176.0, 152.7, 150.7, 141.7, 132.0, 131.0, 129.0, 128.7, 124.3, 120.7, 111.3, 100.7, 97.7. The variance has been estimated as s² = 124.29 with ν = 24 degrees of freedom. Based on these estimates we now reanalyse the trial with respect to the selection goal of selecting a subset containing all good varieties with pre-specified probability 1 − α = 0.9. The standardized differences tj = (y1 − yj)/s, j = 2, . . . , 13, are given in the second column of Table 4. A comparison of the critical values dk,j for the new WPP-procedure with the corresponding critical values ck,j of the FCP-related step-down procedure shows a considerable improvement. A comparison of the critical values dj,j for the NK-procedure with the corresponding critical values dk,j of the WPP-procedure shows that there is some difference for small values of ε, but that they are practically equal for ε = 2. The entries marked with ∗ in Table 4 indicate where the first acceptance occurs in the corresponding step-down procedure.
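The step-down decisions discussed below can be reproduced mechanically from Table 4. The following sketch applies the generic scan-down rule to the tj and to three of the critical-value columns (all numbers are taken from Table 4; when no acceptance occurs, only the best variety is selected):

```python
# Standardized differences t_j, j = 13, ..., 2, from the Keuls data (Table 4)
t = {13: 7.023, 12: 6.754, 11: 5.803, 10: 4.960, 9: 4.637, 8: 4.243,
     7: 4.216, 6: 4.036, 5: 3.947, 4: 3.077, 3: 2.269, 2: 2.090}

def select(t, d):
    """Step down from j = 13 to 2; stop at the first j with t_j <= d_j and
    select the j best varieties; if no acceptance occurs, select only the best."""
    for j in range(13, 1, -1):
        if t[j] <= d[j]:
            return set(range(1, j + 1))
    return {1}

# Critical-value columns dk,j(0), dj,j(0) and dk,j(2) from Table 4
d_wpp0 = {13: 2.718, 12: 2.696, 11: 2.671, 10: 2.644, 9: 2.614, 8: 2.579,
          7: 2.540, 6: 2.494, 5: 2.438, 4: 2.368, 3: 2.274, 2: 2.136}
d_nk0  = {13: 2.718, 12: 2.672, 11: 2.622, 10: 2.566, 9: 2.504, 8: 2.432,
          7: 2.350, 6: 2.252, 5: 2.131, 4: 1.976, 3: 1.759, 2: 1.397}
d_wpp2 = {13: 4.640, 12: 4.589, 11: 4.528, 10: 4.467, 9: 4.390, 8: 4.313,
          7: 4.210, 6: 4.107, 5: 3.955, 4: 3.803, 3: 3.521, 2: 3.238}

print(select(t, d_wpp0))  # WPP with dk,j(0): first acceptance at j = 3
print(select(t, d_nk0))   # NK with dj,j(0): no acceptance, best variety only
print(select(t, d_wpp2))  # WPP with dk,j(2): first acceptance at j = 8
```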
Table 4
Test statistics and critical values for various step-down procedures; the first acceptance in each column, if any, is marked with ∗

j     tj      dk,j(0)   dk,j(0.5)  dk,j(1)   dj,j(0)   dj,j(0.5)  dj,j(1)   ck,j(0.5)  ck,j(1)   dk,j(2)   dj,j(2)
13    7.023   2.718     2.961      3.477     2.718     2.961      3.477     2.961      3.477     4.640     4.640
12    6.754   2.696     2.920      3.430     2.672     2.915      3.429     2.957      3.477     4.589     4.589
11    5.803   2.671     2.874      3.373     2.622     2.861      3.371     2.953      3.477     4.528     4.528
10    4.960   2.644     2.824      3.316     2.566     2.804      3.313     2.949      3.476     4.467     4.467
9     4.637   2.614     2.771      3.245     2.504     2.737      3.241     2.944      3.476     4.390     4.390
8     4.243   2.579     2.709      3.173     2.432     2.666      3.168     2.940      3.476     4.313∗    4.313∗
7     4.216   2.540     2.644      3.082     2.350     2.577      3.071     2.935      3.475     4.210     4.210
6     4.036   2.494     2.569      2.986     2.252     2.480      2.974     2.924      3.474     4.107     4.107
5     3.947   2.438     2.483      2.859     2.131     2.349      2.832     2.899      3.456     3.955     3.955
4     3.077   2.368     2.389      2.718     1.976     2.200      2.686     2.856      3.417∗    3.803     3.803
3     2.269   2.274∗    2.259      2.517∗    1.759     1.962      2.424∗    2.786∗     3.349     3.521     3.519
2     2.090   2.136     2.035      2.256     1.397     1.636      2.140     2.668      3.230     3.238     3.236

For example, the selected subset with respect to the WPP-procedure with critical values dk,j(0) is S = {1, 2, 3} while the selected subset with respect to the NK-procedure with


critical values dj,j(0) (or dj,j(0.5)) is S = {1}. For ε = 2, both the NK- and the WPP-procedure end up with S = {1, . . . , 8}. The table also illustrates that the critical values of the WPP-procedure are not necessarily increasing in ε ≥ 0, which results in S = {1} for ε = 0.5 but S = {1, 2, 3} for ε = 0. The WPP∗-procedure selects S = {1, 2, 3} for ε = 0 and 0.5. While S is non-decreasing in ε for the NK-procedure, it is not proved that NK controls the desired confidence level. Finally, we note that the corresponding single-step procedure with critical value dk,k(ε) ends up with the decision S = {1, 2, 3} for ε = 0 and 0.5, S = {1, 2, 3, 4} for ε = 1 and S = {1, . . . , 9} for ε = 2.

Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft. The authors wish to thank two referees for their valuable suggestions and comments.

Appendix A. Proof of Theorem 3.1.

The first part of statement (b) of the theorem has already been proved in Giani and Finner (1991). The second part of (b) is trivial. To prove part (a), we can restrict ourselves to k ≥ 3 and |J| ≡ j ≥ 2. The other cases are already covered by the results in Finner and Giani (1994). Without loss of generality, we can assume that σ = 1. Since the LFC problem is shift invariant we can choose the location parameters ϑ̃(k − j + r) ≡ ϑ(k − j + r) − (δ/2, . . . , δ/2), r = 0, . . . , j − 1, as possible LFC candidates. Similarly as in Giani and Finner (1991), using the partition of {x ∈ ℝk : TJ(x) ≤ c} into the k + j + 1 disjoint subsets

{x ∈ ℝk : |xi| ≤ c/2 for all i ∈ J, xi ≤ c/2 for all i ∈ I\J},

{x ∈ ℝk : maxℓ∈I xℓ − xi ≤ c, minℓ∈J xℓ = xi, xi < −c/2}, i ∈ J,

{x ∈ ℝk : xi − minℓ∈J xℓ ≤ c, maxℓ∈I xℓ = xi, xi > c/2}, i ∈ I,

an integral expression for Pϑ̃(k−j+r)(TJ(X) ≤ c) can be derived. With the abbreviations h1(x) = h(x + δ/2), h2(x) = h(x − δ/2), Hi(x) = ∫_{−∞}^{x} hi(t) dt and Ai(x) = Hi(x + c) − Hi(x) for x ∈ ℝ, i = 1, 2, one obtains

Pϑ̃(k−j+r)(TJ(X) ≤ c) = [H2(c/2) − H2(−c/2)]^j H1(c/2)^{k−j}
  + ∫_{−∞}^{−c/2} { r h1(t) A1(t)^{r−1} A2(t)^{j−r} H1(t + c)^{k−j}
  + (j − r) h2(t) A1(t)^{r} A2(t)^{j−r−1} H1(t + c)^{k−j}
  + r h2(t) A1(t)^{j−r} A2(t)^{r−1} (1 − H2(t))^{k−j}
  + (j − r) h1(t) A1(t)^{j−r−1} A2(t)^{r} (1 − H2(t))^{k−j}
  + (k − j) h2(t) A1(t)^{j−r} A2(t)^{r} (1 − H2(t))^{k−j−1} } dt.

If we can show that the integrand Ir(t), say, is non-increasing in r, 0 ≤ r ≤ min{(j + 1)/2, j − 1}, uniformly in t ∈ (−∞, −c/2), the proof of the theorem is complete. So let t ∈ (−∞, −c/2) be fixed. For a start let us assume that the support of h is the open finite interval (−a, a) for some


a > 0. The three cases (i) −a + δ/2 < t < −c/2, (ii) −a + δ/2 − c < t ≤ min{−a + δ/2, −c/2} and (iii) t ≤ min{−a + δ/2 − c, −c/2} have to be distinguished.

(i) If −a + δ/2 < t < −c/2, from the symmetry and unimodality of h we easily see that h1(t) ≥ h2(t) > 0 and A1(t) ≥ A2(t) > 0. Hence, there exist d ≥ 1 and D ≥ 1 with

h1(t) = d h2(t) and A1(t) = D A2(t). (A.1)

In the sequel we make essential use of the inequality

D ≤ d, (A.2)

which is proved in Giani and Finner (1991). Applying (A.1), the integrand Ir(t) can be written as

Ir(t) = h2(t) A2(t)^{j−1} { d [ r D^{r−1} H1(t + c)^{k−j} + (j − r) D^{j−r−1} (1 − H2(t))^{k−j} ]
  + (j − r) D^{r} H1(t + c)^{k−j} + r D^{j−r} (1 − H2(t))^{k−j}
  + (k − j) D^{j−r} A2(t) (1 − H2(t))^{k−j−1} }
= h2(t) A2(t)^{j−1} E(r), say.

Obviously, the assertion follows if F(r) = E(r) − E(r + 1) ≥ 0 for all 0 ≤ r ≤ min{(j + 1)/2, j − 1} − 1. Setting

U(r) = (j − r) D^{j−r−2} (D − 1)(1 − H2(t))^{k−j} + D^{j−r−2} (1 − H2(t))^{k−j}
  − r D^{r−1} (D − 1) H1(t + c)^{k−j} − D^{r} H1(t + c)^{k−j}

and

V(r) = r D^{j−r−1} (D − 1)(1 − H2(t))^{k−j} − D^{j−r−1} (1 − H2(t))^{k−j}
  + D^{r+1} H1(t + c)^{k−j} − (j − r) D^{r} (D − 1) H1(t + c)^{k−j}
  + (k − j) D^{j−r−1} (D − 1) A2(t) (1 − H2(t))^{k−j−1},

it can easily be verified that F(r) = d U(r) + V(r). Since 1 − H2(t) ≥ H1(t + c) we get

U(r) ≥ (1 − H2(t))^{k−j} [(j − r) D^{j−r−1} − (j − r − 1) D^{j−r−2} − (r + 1) D^{r} + r D^{r−1}]
  = (1 − H2(t))^{k−j} (D − 1) [(j − r) D^{j−r−2} + Σ_{s=r}^{j−r−3} D^{s} − r D^{r−1}]
  { ≥ 0 if r < (j − 1)/2,
  = 0 if r = (j − 1)/2, }

hence U(r) ≥ 0


for r ≤ (j − 1)/2. Now, since d ≥ D by (A.2) and U(r) ≥ 0, a straightforward calculation yields

F(r) = d U(r) + V(r) ≥ D U(r) + V(r)
  = j D^{j−r−1} (D − 1)(1 − H2(t))^{k−j} − j D^{r} (D − 1) H1(t + c)^{k−j}
    + (k − j) D^{j−r−1} (D − 1) A2(t) (1 − H2(t))^{k−j−1}
  ≥ j D^{j−r−1} (D − 1)(1 − H2(t))^{k−j} − j D^{r} (D − 1) H1(t + c)^{k−j}
  ≥ (1 − H2(t))^{k−j} j (D − 1)(D^{j−r−1} − D^{r})
  ≥ 0

for all r ≤ (j − 1)/2, hence the assertion.

(ii) If −a + δ/2 − c < t ≤ min{−a + δ/2, −c/2}, then h2(t) = 0 and A2(t) = H2(t + c) > 0. In this case the difference Ir(t) − Ir+1(t) reduces to h1(t) A2(t)^{j−1} U(r), and the assertion follows from the derivations in (i).

(iii) If t ≤ min{−a + δ/2 − c, −c/2}, we have h2(t) = A2(t) = 0, implying Ir(t) = 0 for 0 ≤ r ≤ min{(j + 1)/2, j − 1}. Hence Ir(t) trivially is non-increasing in r.

The proof is concluded with the remark that, whenever the support of h is the entire real line, for each t < −c/2 the monotonicity of Ir(t) in r follows from case (i) with a tending to infinity. □

References

Aitchison, J., 1964. Confidence-region tests. J. Roy. Statist. Soc. B 26, 462–476.
Aitchison, J., 1965. Likelihood-ratio and confidence region tests. J. Roy. Statist. Soc. B 27, 345–350.
Bechhofer, R.E., 1954. A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16–39.
Broström, G., 1981. On sequentially rejective subset selection procedures. Comm. Statist. Theory Methods A 10, 203–221.
Finner, H., Giani, G., 1994. Closed subset selection procedures for selecting good populations. J. Statist. Plann. Inference 38, 179–200.
Finner, H., Giani, G., 1996. Duality between multiple testing and selecting. J. Statist. Plann. Inference 54, 201–227.
Finner, H., Giani, G., 2001. Least favourable parameter configurations for a step-down subset selection procedure. Biometrical J. 43, 543–552.
Finner, H., Straßburger, K., 2002. The partitioning principle: a powerful tool in multiple decision theory. Ann. Statist. 30, 1194–1213.
Giani, G., Finner, H., 1991.
Some general results on least favourable configurations with special reference to equivalence testing and the range statistic. J. Statist. Plann. Inference 28, 33–47.
Gupta, S.S., 1956. On a decision rule for a problem in ranking means. Ph.D. Thesis (Mimeo. Ser. No. 150). Institute of Statistics, University of North Carolina, Chapel Hill.
Gupta, S.S., 1965. On some multiple decision (selection and ranking) rules. Technometrics 7, 225–245.
Keuls, M., 1952. The use of the “Studentized range” in connection with an analysis of variance. Euphytica 1, 112–122.
Lam, K., 1986. A new procedure for selecting good populations. Biometrika 73, 201–206.
Lehmann, E.L., 1961. Some model I problems of selection. Ann. Math. Statist. 32, 990–1012.
Prekopa, A., 1973. On logarithmic concave measures and functions. Acta Sci. Math. 34, 335–343.
Shorack, G.P., Wellner, J.A., 1986. Empirical Processes with Applications to Statistics. Wiley, New York.