Journal of Statistical Planning and Inference 54 (1996) 159-174
Distributional and efficiency results for subset selection

Paul van der Laan

Department of Mathematics and Computing Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, Netherlands
Received 25 February 1994; revised 31 March 1995
Abstract

Assume k (k ≥ 2) populations are given. The associated independent random variables have continuous distribution functions with an unknown location parameter. The statistical selection goal is to select a non-empty subset which contains the best population, that is, the population with the largest value of the location parameter, with confidence level P* (k⁻¹ < P* < 1). Some distributional results for subset selection are given and proved. Explicit expressions for the expectation and variance of the subset size are presented. Some distributional and efficiency results are also given for a generalized selection goal using subset selection: to select a non-empty subset of populations that contains at least one ε-best population. An ε-best population is any population with a parameter value within ε (ε ≥ 0) of the largest parameter value. The subset selection goal of Gupta is a special case of this generalized selection goal.

AMS classification: 62F07

Keywords: Selection; Subset selection; Location parameter; Subset size; Expectation; Variance; Generalized selection goal
1. Introduction

A general goal in many statistical experiments is to indicate the best population from a set of k (k ≥ 2) populations π_1, …, π_k. The associated independent random variables, which may be sample means, have continuous distribution functions with an unknown location parameter. In the location-parameter context the best population is defined as the population with the largest value of the location parameter. The main approaches to selection of the best population are the subset selection approach of Gupta (1965) and the indifference zone approach of Bechhofer (1954). The subset selection procedure, which will be considered in this paper, has as its goal the selection of a non-empty subset, as small as possible, containing the best population with a given probability. This probability requirement has to be met for all possible parameter values. A possible practical 'objection' to subset selection procedures is of the type 'large subsets are sometimes the result'. For a strong probability requirement

0378-3758/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0378-3758(95)00166
one pays, for fixed sample sizes, with a large size of the selected subset, where the size of the subset is defined as the number of populations in the subset. These large subsets are mainly due to the fact that the probability requirement has to be met for the least favourable configuration (LFC) of the parameters. For the location model the LFC is, in many cases, the configuration consisting of equal parameter values for the k populations. The performance of selection procedures can be improved either by increasing the sample sizes or by weakening the probability requirement. From an application point of view the (expected) subset size is a characteristic and crucial quantity. After a short description of Gupta's subset selection procedure in Section 2, some general distributional results for the subset size are given in Section 3. Chen and Sobel (1987) give some distributional results for normal populations. Section 4 contains some results for the expectation and variance of the subset size. An alternative to weakening the probability requirement or increasing the sample sizes, in order to get smaller subset sizes, is to be content when the subset contains an ε-best population. Here, an ε-best population is any population with a parameter value within ε (ε ≥ 0) of the value of the largest parameter. This extension was considered by van der Laan (1992a, b). Section 5 contains some efficiency results based on the results of Section 4. Finally, some remarks concerning recent research results are made in Section 6.

2. Subset selection

Let, for i = 1, …, k, X_{i1}, …, X_{in} be a sample from π_i and suppose these k samples are independent. It is assumed that X_{i1} has a continuous distribution with an unknown location parameter θ_i ∈ Θ ⊆ ℝ, i = 1, …, k. The ordered parameters are denoted by θ_[1] ≤ ⋯ ≤ θ_[k] and the population associated with θ_[i] is denoted by π_(i), i = 1, …, k. The best population is π_(k). It is assumed that π_(k) is unique; otherwise appropriate tagging is used. A sufficient statistic X_i for θ_i, e.g. the sample mean, is used for the selection of π_(k). The random variable associated with π_(j), j = 1, …, k, is denoted by X_(j). It is assumed that X_i has a distribution function P[X_i ≤ x_i | θ_i] = F(x_i − θ_i) with continuous density f(·). The standardized statistic Z_i = X_i − θ_i has a distribution function that does not depend on θ_i. The number of populations in the selected subset, the subset size, is indicated by S. The size S is a random variable and its outcome is denoted by s, where 1 ≤ s ≤ k. The probability requirement for a correct selection (CS), i.e. the selection of a subset containing the best population, is

P[CS | θ] ≥ P*   (k⁻¹ < P* < 1)

for all θ = (θ_1, …, θ_k). P* is called the confidence level of the selection. The selection rule R is given by

R: put π_i into the subset if and only if X_i ≥ max_{1≤j≤k} X_j − d,

where the selection constant d ≥ 0.
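The rule R is easy to simulate. The sketch below is an illustration, not part of the paper: it assumes standard normal statistics, and the values k = 5 and d = 2 are arbitrary choices. It estimates P(CS) under the LFC (all location parameters equal) by Monte Carlo.

```python
import random

def gupta_subset(xs, d):
    """Indices selected by rule R: keep population i iff x_i >= max_j x_j - d."""
    cutoff = max(xs) - d
    return [i for i, x in enumerate(xs) if x >= cutoff]

def estimate_pcs_lfc(k, d, n_rep=20000, seed=1):
    """Monte Carlo estimate of P(CS) under the LFC with standard normal
    statistics; under the LFC any fixed index may play the role of the best."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        xs = [rng.gauss(0.0, 1.0) for _ in range(k)]
        if (k - 1) in gupta_subset(xs, d):
            hits += 1
    return hits / n_rep

p_cs = estimate_pcs_lfc(5, 2.0)
```

Larger d trades a higher P(CS) for a larger expected subset size, which is exactly the tension discussed in the Introduction.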
Distributional properties of the subset size S are derived in the next section.
3. Distributional results for S

The subset size S can be considered as a crucial and characteristic quantity for subset selection. A relatively large subset size means, apart from random fluctuations, that the location parameters are close together in comparison with the variation of the populations. The distribution of the subset size S is given in the next theorem.

Theorem 3.1. For s = 1, …, k and integers 1 ≤ i_j ≤ k, i_j ≠ i_l for j ≠ l (j, l = 1, …, k), where (i_1, …, i_k) runs over all arrangements of 1, …, k with i_1 < ⋯ < i_s and i_{s+1} < ⋯ < i_k, we have

P[S = s] = \sum_{i_1 < \cdots < i_s} \sum_{r=1}^{s} \int_{-\infty}^{\infty} \prod_{j=s+1}^{k} F(x - \theta_{[i_j]} + \theta_{[i_r]} - d) \prod_{j=1, j \ne r}^{s} \{F(x - \theta_{[i_j]} + \theta_{[i_r]}) - F(x - \theta_{[i_j]} + \theta_{[i_r]} - d)\} \, dF(x).
Proof. Let C_S denote the selected subset. Then

P[S = s] = \sum P[\pi_{(i_j)} \in C_S,\ j = 1, \ldots, s;\ \pi_{(i_l)} \notin C_S,\ l = s+1, \ldots, k]

= \sum P[X_{(i_1)}, \ldots, X_{(i_s)} \ge \max_{1 \le i \le k} X_{(i)} - d;\ X_{(i_j)} < \max_{1 \le i \le k} X_{(i)} - d,\ j = s+1, \ldots, k]

= \sum \sum_{r=1}^{s} P[X_{(i_1)}, \ldots, X_{(i_s)} \ge X_{(i_r)} - d;\ X_{(i_r)} = \max_{1 \le i \le k} X_{(i)};\ X_{(i_j)} < X_{(i_r)} - d,\ j = s+1, \ldots, k]

= \sum \sum_{r=1}^{s} P[X_{(i_r)} - d \le X_{(i_1)}, \ldots, X_{(i_s)} \le X_{(i_r)};\ X_{(i_j)} < X_{(i_r)} - d,\ j = s+1, \ldots, k],   (1)

where each sum extends over i_1 < ⋯ < i_s and i_{s+1} < ⋯ < i_k, and the result follows by conditioning on X_{(i_r)} = x. □
In the next theorem the distribution of S is given in a specific subspace of the parameter space Ω = {θ = (θ_1, …, θ_k): θ_i ∈ Θ, i = 1, …, k}. This subspace, indicated by Ω(δ), consists of the configurations θ_[1] = ⋯ = θ_[k−1] = θ_[k] − δ (δ ≥ 0).
Theorem 3.2. For s = 1, …, k the following holds:

P[S = s | θ ∈ Ω(δ)] = \binom{k-1}{s} s \int_{-\infty}^{\infty} F(x - \delta - d) F^{k-s-1}(x - d) \{F(x) - F(x - d)\}^{s-1} \, dF(x)
+ \binom{k-1}{s-1} (s - 1) \int_{-\infty}^{\infty} F^{k-s}(x - d) \{F(x) - F(x - d)\}^{s-2} \{F(x - \delta) - F(x - \delta - d)\} \, dF(x)
+ \binom{k-1}{s-1} \int_{-\infty}^{\infty} F^{k-s}(x + \delta - d) \{F(x + \delta) - F(x + \delta - d)\}^{s-1} \, dF(x).   (2)
Proof. By Theorem 3.1, with θ_[1] = ⋯ = θ_[k−1] = θ_[k] − δ,

P[S = s | θ ∈ Ω(δ)] = \sum_{i_1 < \cdots < i_s \le k-1} s \int_{-\infty}^{\infty} F(x - \delta - d) F^{k-s-1}(x - d) \{F(x) - F(x - d)\}^{s-1} \, dF(x)
+ \sum_{i_1 < \cdots < i_s = k} (s - 1) \int_{-\infty}^{\infty} F^{k-s}(x - d) \{F(x) - F(x - d)\}^{s-2} \{F(x - \delta) - F(x - \delta - d)\} \, dF(x)
+ \sum_{i_1 < \cdots < i_s = k} \int_{-\infty}^{\infty} F^{k-s}(x + \delta - d) \{F(x + \delta) - F(x + \delta - d)\}^{s-1} \, dF(x),

where the first sum runs over the \binom{k-1}{s} subsets not containing π_(k) (with s choices for r), and the last two sums run over the \binom{k-1}{s-1} subsets containing π_(k) (with r an inferior member of the subset and r = k, respectively), and the result follows. □
A special case of the subspace Ω(δ) is Ω(0), the so-called least favourable configuration (LFC) for subset selection in terms of location parameters. This case is considered in Corollary 3.1.
Corollary 3.1. For s = 1, …, k one has

P[S = s | θ ∈ Ω(0)] = P[S = s | θ ∈ LFC] = s \binom{k}{s} \int_{-\infty}^{\infty} \{F(x) - F(x - d)\}^{s-1} F^{k-s}(x - d) \, dF(x).   (3)

Proof. The result follows directly from Theorem 3.2 using the identity \binom{k-1}{s} + \binom{k-1}{s-1} = \binom{k}{s}. □
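Formula (3) can be checked numerically. The sketch below is an illustration, not part of the paper: it integrates (3) by the trapezoidal rule for standard normal populations, with k = 2 and d = 1 as arbitrary choices. The resulting probabilities sum to one and reproduce E{S | LFC} ≈ 1.52 from Table 1.

```python
from math import comb, erf, exp, pi, sqrt

def phi(x):  # standard normal density
    return exp(-x * x / 2) / sqrt(2 * pi)

def Phi(x):  # standard normal cdf
    return 0.5 * (1 + erf(x / sqrt(2)))

def p_subset_size(k, s, d, lo=-10.0, hi=10.0, n=4000):
    """Trapezoidal evaluation of eq. (3): P[S = s | LFC], standard normal F."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        g = (Phi(x) - Phi(x - d)) ** (s - 1) * Phi(x - d) ** (k - s) * phi(x)
        total += g * (h if 0 < i < n else h / 2)
    return s * comb(k, s) * total

k, d = 2, 1.0
probs = [p_subset_size(k, s, d) for s in range(1, k + 1)]
```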
It can be verified that \sum_{s=1}^{k} P[S = s | θ ∈ LFC] = 1 by expanding \{F(x) - F(x - d)\}^{s-1} in (3), changing the order of summation, and noticing that the coefficient of \int F^{i}(x) F^{k-1-i}(x - d) \, dF(x) for i = 0, 1, …, k − 2 is equal to

\sum_{s=i+1}^{k} (-1)^{s-1-i} s \binom{k}{s} \binom{s-1}{i} = k \binom{k-1}{i} \sum_{j=0}^{k-1-i} (-1)^{j} \binom{k-1-i}{j} = 0,

while for i = k − 1, and thus s = k, this coefficient is equal to k; since \int F^{k-1}(x) \, dF(x) = k^{-1}, the result follows immediately.

Some special distributions are considered in the next corollaries.

Corollary 3.2. For uniform populations on (0, 1/λ) with known scale parameter λ > 0, so that F(x) = λx for 0 ≤ x ≤ 1/λ, one gets, for 0 ≤ λd ≤ 1,

P[S = s | θ ∈ LFC] = \binom{k}{s-1} (λd)^{s-1} (1 - λd)^{k-s+1} + (λd)^{k} I(s = k),   (4)

where s = 1, …, k and I(A) is the indicator function of the set A.

Proof. For s < k one gets from (3)

P[S = s | θ ∈ LFC] = s \binom{k}{s} λ^{k} \int_{d}^{1/λ} d^{s-1} (x - d)^{k-s} \, dx = \binom{k}{s-1} (λd)^{s-1} (1 - λd)^{k-s+1},

and for s = k the probability in (3) equals

k λ^{k} \left\{ \int_{0}^{d} x^{k-1} \, dx + \int_{d}^{1/λ} d^{k-1} \, dx \right\} = (λd)^{k} + k (λd)^{k-1} (1 - λd),

and the result follows. □
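A quick sanity check of (4), again an illustration rather than part of the paper (the values k = 4, λ = 2, d = 0.25, i.e. λd = 0.5, are arbitrary): the closed form sums to one, and its mean agrees with a Monte Carlo estimate of E{S | LFC} and with formula (8) below (1 − λ^k d^k + kλd = 2.9375 here).

```python
import random
from math import comb

def p_size_uniform(k, s, lam, d):
    """Closed form (4): P[S = s | LFC], uniform populations on (0, 1/lam)."""
    q = lam * d
    p = comb(k, s - 1) * q ** (s - 1) * (1 - q) ** (k - s + 1)
    if s == k:
        p += q ** k
    return p

def sim_size_uniform(k, lam, d, n_rep=20000, seed=2):
    """Monte Carlo estimate of E{S | LFC} under rule R."""
    rng = random.Random(seed)
    acc = 0
    for _ in range(n_rep):
        xs = [rng.uniform(0, 1 / lam) for _ in range(k)]
        cutoff = max(xs) - d
        acc += sum(x >= cutoff for x in xs)
    return acc / n_rep

k, lam, d = 4, 2.0, 0.25
exact = [p_size_uniform(k, s, lam, d) for s in range(1, k + 1)]
```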
Corollary 3.3. For exponential populations with F(x) = 1 − e^{−λx}, x ≥ 0 (and 0 elsewhere), with known scale parameter λ > 0, one gets

P[S = s | θ ∈ LFC] = (1 - e^{-λd})^{s-1} e^{-λd\{1 - I(s = k)\}},   (5)

that is, (1 − e^{−λd})^{s−1} e^{−λd} for s < k and (1 − e^{−λd})^{k−1} for s = k.

Proof. For s < k one gets

P[S = s | θ ∈ LFC] = s \binom{k}{s} λ \int_{d}^{\infty} \{e^{-λ(x-d)} - e^{-λx}\}^{s-1} (1 - e^{-λ(x-d)})^{k-s} e^{-λx} \, dx
= e^{-λd} (1 - e^{-λd})^{s-1} s \binom{k}{s} \int_{0}^{1} y^{s-1} (1 - y)^{k-s} \, dy = e^{-λd} (1 - e^{-λd})^{s-1},

using the substitution y = e^{−λ(x−d)}, and for s = k this probability equals

k λ \int_{0}^{d} (1 - e^{-λx})^{k-1} e^{-λx} \, dx + e^{-λd} (1 - e^{-λd})^{k-1} = (1 - e^{-λd})^{k-1},

and the result follows. □
Corollary 3.4. For logistic populations with F(x) = (1 + e^{−λx})^{−1}, −∞ < x < ∞, with known scale parameter λ > 0, one gets, with a = e^{λd} and s = 1, …, k,

P[S = s | θ ∈ LFC] = s \binom{k}{s} (a - 1)^{s-1} a^{-k+1} \sum_{r=0}^{s-1} (-1)^{r} \binom{s-1}{r} J(r+2, k-1),   (6)

where J(m, n) = \int_{0}^{\infty} (z + 1)^{-m} (z + 1/a)^{-n} \, dz admits a closed form, via the recurrence given in Appendix A, in terms of

C_r(c) = \left( \frac{c}{c-1} \right)^{r+1} \left\{ \ln c - \sum_{i=1}^{r} \frac{1}{i} \left( 1 - \frac{1}{c} \right)^{i} \right\},  c > 0, c ≠ 1, integer r ≥ 1.

Proof. See Appendix A. □
4. The expectation and variance of S under the LFC

It is possible to determine the expected subset size under the LFC, denoted by E{S | LFC}, using (3) (see Appendix B). The ultimate result,

E{S | LFC} = k \int_{-\infty}^{\infty} F^{k-1}(x + d) \, dF(x),   (7)

derived in a different way by Gupta (1965), is well known. This result holds for a general value of the selection constant d ≥ 0. To fulfil the confidence requirement P(CS) ≥ P*, then, following Gupta (1965), d has to be solved from

\int_{-\infty}^{\infty} F^{k-1}(x + d) \, dF(x) = P^*.

For this value of d one has E{S | LFC} = kP* (cf. Gupta, 1965). Expressions for the expected value of S will be derived for some well-known distributions.
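Solving the confidence-level equation for d is numerically straightforward, since the left-hand side is increasing in d. The sketch below is an illustration, not from the paper: standard normal populations, trapezoidal integration, and bisection are implementation choices, and k = 5 with P* = 0.90 is an arbitrary example.

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def phi(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

def H(k, d, lo=-10.0, hi=10.0, n=4000):
    """H(d) = integral of Phi^{k-1}(x+d) phi(x) dx: P(CS) under the LFC."""
    h = (hi - lo) / n
    s = 0.0
    for i in range(n + 1):
        x = lo + i * h
        s += Phi(x + d) ** (k - 1) * phi(x) * (h if 0 < i < n else h / 2)
    return s

def selection_constant(k, p_star, tol=1e-8):
    """Solve H(d) = P* by bisection (H is increasing in d)."""
    a, b = 0.0, 20.0
    while b - a > tol:
        m = (a + b) / 2
        if H(k, m) < p_star:
            a = m
        else:
            b = m
    return (a + b) / 2

d = selection_constant(5, 0.90)
# with this d, E{S | LFC} = k * P* = 4.5 populations on average
```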
Corollary 4.1. For uniform populations with F(x) = λx, 0 ≤ x ≤ 1/λ, with known scale parameter λ > 0, one gets for 0 ≤ λd ≤ 1:

E{S | LFC} = 1 - λ^{k} d^{k} + kλd.   (8)

Proof.

E{S | LFC} = k λ \int_{0}^{1/λ - d} \{λ(x + d)\}^{k-1} \, dx + k λ \int_{1/λ - d}^{1/λ} dx = 1 - λ^{k} d^{k} + kλd.  □
Corollary 4.2. For exponential populations with F(x) = 1 − e^{−λx}, 0 ≤ x < ∞, and with known scale parameter λ > 0, one gets for 0 ≤ d:

E{S | LFC} = e^{λd} \{1 - (1 - e^{-λd})^{k}\}.   (9)

Proof.

E{S | LFC} = k λ \int_{0}^{\infty} \{1 - e^{-λ(x+d)}\}^{k-1} e^{-λx} \, dx = e^{λd} \{1 - (1 - e^{-λd})^{k}\}.  □
Corollary 4.3. For logistic populations with F(x) = (1 + e^{−λx})^{−1}, −∞ < x < ∞, with known scale parameter λ > 0, one gets for 0 ≤ d and a = e^{λd}:

E{S | LFC} = k \left\{ 1 - \frac{k-1}{a} C_{k-1}(a) \right\},   (10)

where C_{k−1}(a) has been defined in Corollary 3.4.

Proof.

E{S | LFC} = k \int_{-\infty}^{\infty} (1 + e^{-λ(x+d)})^{-k+1} λ e^{-λx} (1 + e^{-λx})^{-2} \, dx = k a^{k-1} \int_{0}^{\infty} (z + a)^{-k+1} (z + 1)^{-2} \, dz = k \left\{ 1 - \frac{k-1}{a} C_{k-1}(a) \right\},

using the substitution z = e^{−λx}; the last equality follows from Theorem 1 in van der Laan (1992c). □

For a general value of d the variance of S under the LFC, denoted by var{S | LFC}, is given in the next theorem.
Theorem 4.1.

var{S | LFC} = k \int_{-\infty}^{\infty} F^{k-1}(x + d) \, dF(x) \left\{ 2k - 1 - k \int_{-\infty}^{\infty} F^{k-1}(x + d) \, dF(x) \right\} - 2k(k-1) \int_{-\infty}^{\infty} F^{k-2}(x + d) F(x) \, dF(x).   (11)
Proof. See Appendix C. □
For a general value of d this variance has been determined for uniform, exponential and logistic distributions. The results are given in the next corollaries.

Corollary 4.4. For uniform populations with distribution function F(x) = λx, 0 ≤ x ≤ 1/λ, with known scale parameter λ > 0, one gets for 0 ≤ λd ≤ 1:

var{S | LFC} = kλd(1 - λd)(1 - 2λ^{k-1} d^{k-1}) + λ^{k} d^{k} (1 - λ^{k} d^{k}).   (12)

Proof.

var{S | LFC} = (2k - 1)(1 - λ^{k} d^{k} + kλd) - 2k(k-1) \left\{ λ^{k} \int_{0}^{1/λ - d} (x + d)^{k-2} x \, dx + λ^{2} \int_{1/λ - d}^{1/λ} x \, dx \right\} - (1 - λ^{k} d^{k} + kλd)^{2}
= kλd(1 - λd)(1 - 2λ^{k-1} d^{k-1}) + λ^{k} d^{k} (1 - λ^{k} d^{k}).  □
Corollary 4.5. For exponential populations with distribution function F(x) = 1 − e^{−λx}, 0 ≤ x < ∞, and with known scale parameter λ > 0, one has

var{S | LFC} = e^{λd}(e^{λd} - 1)\{1 - (1 - e^{-λd})^{2k-1}\} - (2k-1)(e^{λd} - 1)(1 - e^{-λd})^{k-1}.   (13)

Proof.

var{S | LFC} = (2k-1) e^{λd} \{1 - (1 - e^{-λd})^{k}\} - e^{2λd} \{1 - (1 - e^{-λd})^{k}\}^{2} - 2k(k-1) λ \int_{0}^{\infty} (1 - e^{-λ(x+d)})^{k-2} (1 - e^{-λx}) e^{-λx} \, dx
= (2k-1) e^{λd} \{1 - (1 - e^{-λd})^{k}\} - e^{2λd} \{1 - (1 - e^{-λd})^{k}\}^{2} - 2k(k-1) \int_{0}^{1} (1 - e^{-λd} t)^{k-2} (1 - t) \, dt,

using the substitution t = e^{−λx}, and the result follows. □
Corollary 4.6. For logistic populations with distribution function F(x) = (1 + e^{−λx})^{−1}, −∞ < x < ∞, with known scale parameter λ > 0, one has, with a = e^{λd},

var{S | LFC} = k(k-1) \left\{ \frac{k-2}{a} \left( 1 - \frac{k-1}{a} C_{k-1}(a) \right) + \frac{C_{k-1}(a)}{a} \left( 1 - \frac{k(k-1)}{a} C_{k-1}(a) \right) \right\},   (14)

where C_{k−1}(a) has been defined in Corollary 3.4.
Table 1
E{S | LFC} for standard normal populations for some values of k and d

k \ d    1      1.5    2      2.5    3      3.5    4      4.5    5
2        1.52   1.71   1.84   1.92   1.97   1.99   2.00   2.00   2.00
5        2.47   3.26   3.94   4.43   4.73   4.89   4.96   4.99   5.00
10       3.41   5.09   6.74   8.09   9.02   9.57   9.83   9.94   9.98
25       4.99   8.68   13.05  17.28  20.66  22.89  24.12  24.68  24.90
Table 2
Standard deviation of S under the LFC for standard normal distributions for some values of k and d

k \ d    1      1.5    2      2.5    3      3.5    4      4.5    5
2        0.500  0.453  0.364  0.267  0.181  0.115  0.068  0.038  0.020
5        1.14   1.20   1.08   0.848  0.595  0.381  0.225  0.125  0.066
10       1.78   2.14   2.12   1.80   1.32   0.863  0.511  0.281  0.145
25       2.91   4.13   4.73   4.50   3.62   2.51   1.52   0.083  0.042
Proof.

var{S | LFC} = k(2k-1) \left\{ 1 - \frac{k-1}{a} C_{k-1}(a) \right\} - k^{2} \left\{ 1 - \frac{k-1}{a} C_{k-1}(a) \right\}^{2} - 2k(k-1) \int_{-\infty}^{\infty} F^{k-2}(x) F(x - d) f(x - d) \, dx,

where the last integral equals (with a = e^{λd})

\int_{-\infty}^{\infty} λ e^{-λx} \{1 + e^{-λ(x+d)}\}^{-(k-2)} (1 + e^{-λx})^{-3} \, dx = a^{k-2} \int_{0}^{\infty} (a + y)^{-k+2} (1 + y)^{-3} \, dy = \frac{1}{2} - \frac{k-2}{2a} \left\{ 1 - \frac{k-1}{a} C_{k-1}(a) \right\},

using partial integration. After some calculations the result follows. □

In Tables 1 and 2 some numerical results concerning the expectation and standard deviation of the subset size S are given for normal populations.
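Formulas (7) and (11) suffice to reproduce the entries of Tables 1 and 2. The sketch below is an illustration, not part of the paper; standard normal populations and trapezoidal integration are implementation choices, and k = 2, d = 1 is picked to match the first table entries.

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def phi(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

def integrate(f, lo=-10.0, hi=10.0, n=4000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + i * h) * (h if 0 < i < n else h / 2) for i in range(n + 1))

def mean_sd_subset_size(k, d):
    """E{S | LFC} from (7) and its standard deviation from (11)."""
    H = integrate(lambda x: Phi(x + d) ** (k - 1) * phi(x))
    G = integrate(lambda x: Phi(x + d) ** (k - 2) * Phi(x) * phi(x))
    mean = k * H
    var = k * H * (2 * k - 1 - k * H) - 2 * k * (k - 1) * G
    return mean, sqrt(var)

m, s = mean_sd_subset_size(2, 1.0)
```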
5. Efficiency of subset selection of an ε-best population relative to selecting the best one

In this section, for fixed ε (≥ 0), the efficiency of subset selection for an ε-best population relative to subset selection for the best population is considered. A correct selection (εCS) in the context of selection of an ε-best population is the selection
of a subset which contains at least one ε-best population, where an ε-best population is defined as a population π_i for which θ_i ≥ θ_[k] − ε. Note that there exists at least one ε-best population for each ε ≥ 0. As for Gupta's procedure, the goal is to select a subset, as small as possible, such that P(εCS) ≥ P* for all θ, where P* is given and k⁻¹ < P* < 1. The selection rule is of the same form as Gupta's; the difference is in the choice of the selection constant d. Further, for normal populations with known standard deviation σ (> 0), it can easily be seen (cf. van der Laan, 1992a) that the selection constant for selecting an ε-best population is equal to σd₀ − ε, where d₀ is Gupta's selection constant for standard normal populations and the same P* is used for the two procedures. Since CS ⊆ εCS, it follows that, when using Gupta's procedure, the probability of selecting an ε-best population into the subset is at least equal to the probability of selecting the best population, and this for every θ.
Lemma 5.1.

\min_{θ} P(εCS) = H(d + ε)   (15)

with

H(t) = \int_{-\infty}^{\infty} F^{k-1}(x + t) \, dF(x).   (16)

Proof. Let r (0 ≤ r ≤ k − 1) be given by θ_[i] ≤ θ_[k] − ε, i = 1, …, r, and θ_[i] > θ_[k] − ε for i = r + 1, …, k. For r = 0 we have P(εCS | θ) = 1 ≥ P(εCS | θ ∈ Ω(ε)) = H(d + ε). For r ≥ 1 we have E := {X_(k) ≥ max_{1≤i≤r} X_(i) − d} ⊆ εCS and, with Z_(i) = X_(i) − θ_[i] and w_i = θ_[k] − θ_[i] ≥ ε,

P(εCS | θ) ≥ P(E | θ) = P(Z_(i) ≤ Z_(k) + d + w_i, i = 1, …, r | θ) ≥ P(Z_(i) ≤ Z_(k) + d + ε, i = 1, …, r | θ) = \int_{-\infty}^{\infty} F^{r}(x + d + ε) \, dF(x) ≥ H(d + ε),

and the result follows. Note that this lower bound for P(εCS | θ) is sharp, because for θ ∈ Ω(ε) we get r = k − 1 and P(εCS | θ) = H(d + ε). Moreover, it follows easily that the least favourable configuration with respect to P(εCS | θ) is given by Ω(ε), ε ≥ 0. □
Definition 5.1. For a fixed selection constant d (≥ 0) and a fixed ε (≥ 0) the efficiency, G, of the ε-best selection procedure is defined as the relative difference in the minimal probabilities of reaching the selection goals (the new one, resp. Gupta's), relative to Gupta's:

G(d, ε) = \{\min_{θ} P(εCS)\} / \{\min_{θ} P(CS)\} - 1.

Theorem 5.1. For fixed ε ≥ 0 the relative efficiency G = G(d) can be written as

G(d) = \frac{H(d + ε)}{H(d)} - 1,   (17)

where d is the selection constant for subset selection of an ε-best population. Further,

\lim_{d \to \infty} G(d) = 0.   (18)
Proof. The result follows from Lemma 5.1. □
Comparing the two procedures, we notice that if P* is the same for both procedures, then the selection constant of the ε-best selection procedure equals d_ε = H^{-1}(P*) − ε = d₀ − ε, with d₀ = H^{-1}(P*) the selection constant of Gupta's procedure. If the selection constant is the same for both procedures, then we have different values of P* for the two procedures, indicated by P*_G = H(d) and P*_ε = H(d + ε) for Gupta's and the ε-best selection procedure, respectively. If the density f(x) of F(x) is strongly unimodal (f(x) is log-concave; Dharmadhikari and Joag-dev, 1988), then the following theorem holds for the efficiency G.

Theorem 5.2. If f(x) is strongly unimodal, then G(d) is a decreasing function of P*.

Proof. H(d) := \int_{-\infty}^{\infty} F^{k-1}(x + d) f(x) \, dx can be written as

H(d) = P(\max(X_2, \ldots, X_k) - X_1 \le d),   (19)

the distribution function of max(X₂, …, X_k) − X₁ for independent X₁, …, X_k with common distribution function F. If f is log-concave, then H is a log-concave distribution function (cf. Karlin, 1968), so that H(d + ε)/H(d) is non-increasing in d. Since P* = H(d) is increasing in d, it follows that G(d) = H(d + ε)/H(d) − 1 is a decreasing function of P*. □

The two procedures can also be compared by means of the ratio of expected subset sizes under the LFC,

R = E\{S(CS) | LFC\} / E\{S(εCS) | LFC\},

where S(CS) and S(εCS) are the subset sizes for selecting the best and an ε-best population, respectively, when using Gupta's procedure and the ε-best selection procedure, respectively, with the same minimal probability of correct selection. For a fixed value of P* one has d₀ = H^{-1}(P*) and d_ε = H^{-1}(P*) − ε = d₀ − ε. One gets

G(P^*, ε) = \frac{H(d_ε + ε)}{H(d_ε)} - 1 = \frac{P^*}{H(d₀ - ε)} - 1   (20)

and

R = \frac{kP^*}{kH(d_ε)} = \frac{P^*}{H(d₀ - ε)} = G(P^*, ε) + 1.   (21)
Table 3
The relative efficiency G for different values of k and ε, with P*_ε = 0.90

ε = 0.2:
k          2      3      4      5      6      7      8      9      10     25     50     100    500    1000   2000
min P(CS)  0.868  0.865  0.864  0.863  0.863  0.862  0.862  0.862  0.862  0.860  0.859  0.859  0.858  0.858  0.857
G in %     3.7    4.0    4.2    4.3    4.3    4.4    4.4    4.4    4.4    4.7    4.8    4.8    4.9    4.9    5.0

ε = 0.5:
k          2      3      4      5      6      7      8      9      10     25     50     100    500    1000   2000
min P(CS)  0.820  0.813  0.810  0.809  0.807  0.806  0.805  0.805  0.804  0.801  0.799  0.797  0.795  0.794  0.793
G in %     9.8    10.7   11.1   11.2   11.5   11.7   11.8   11.8   11.9   12.4   12.6   12.9   13.2   13.4   13.5
6. A generalized subset selection procedure: some recent results

Instead of asking for a probability of at least P* of a correct selection, van der Laan and van Eeden (1993) use a loss function. They take the loss equal to zero when the subset contains an ε-best population and a nondecreasing function, h, of the difference θ_[k] − ε − θ_[S] if the selected subset does not contain an ε-best population, where θ_[S] = max{θ_i : π_i is in the subset}. The subset selection goals are expressed in terms of an upper bound on the risk function and/or on the expected subset size. These
Table 4
The relative efficiency G for different values of P* and ε, with k = 10

k = 10, ε = 0.2:
P*_ε     0.80   0.90   0.95   0.975  0.99   0.995   0.999
P*_G     0.739  0.862  0.927  0.962  0.983  0.9917  0.9977
G in %   8.3    4.4    2.5    1.4    0.7    0.33    0.13

k = 10, ε = 0.5:
P*_ε     0.80   0.90   0.95   0.975  0.99   0.995   0.999
P*_G     0.648  0.804  0.893  0.942  0.973  0.9868  0.9959
G in %   23.5   11.9   6.4    3.5    1.7    0.83    0.31
upper bounds are required to hold either for all or for some θ. The selection rule is of the same form as the one used by Gupta (1965) and by van der Laan (1992a); the difference is in the value of the selection constant d. Gupta's and van der Laan's subset selection approaches are obtained as special cases by taking h(x) ≡ 1 for all x ∈ ℝ⁺, with ε = 0 for Gupta and ε > 0 for van der Laan. The case of two normal populations with equal known variances and h(x) = x^p for some p > 1 has been studied in full detail. Properties of the expected subset size and the risk function are given. In order to apply this new subset selection methodology in practice, tables of the risk function and the expected subset size are needed. Such tables can be found in van der Laan and van Eeden (1993). If the introduction of a loss function of the form given in this paper is realistic for a practical problem, then the methodology based on a loss function can be considered as a flexible approach. The method is a generalization of Gupta's subset selection procedure and is designed to give a smaller expected subset size. A comparison with Gupta's subset selection procedure can be found in van der Laan and van Eeden (1993).
Acknowledgements

Many thanks to the referee Dr. Volker Guiard (Rostock) for his careful reading of the manuscript and a number of improvements. A punctual proof of Lemma 5.1 and the relations (20) and (21) are due to him. Another anonymous referee indicated that one can also prove that the probabilities (3) form a probability distribution by using generating functions.
Appendix A

Proof of Corollary 3.4.

P[S = s | θ ∈ LFC] = s \binom{k}{s} \int_{-\infty}^{\infty} \left\{ \frac{1}{1 + e^{-λx}} - \frac{1}{1 + e^{-λ(x-d)}} \right\}^{s-1} \left\{ \frac{1}{1 + e^{-λ(x-d)}} \right\}^{k-s} \frac{λ e^{-λx}}{(1 + e^{-λx})^{2}} \, dx
= s \binom{k}{s} (a - 1)^{s-1} a^{-k+1} \int_{0}^{\infty} \frac{z^{s-1}}{(z + 1)^{s+1} (z + 1/a)^{k-1}} \, dz,

with a = e^{λd}, using the transformation z = e^{−λx}. The binomial expansion z^{s-1} = \{(z + 1) - 1\}^{s-1} gives

P[S = s | θ ∈ LFC] = s \binom{k}{s} (a - 1)^{s-1} a^{-k+1} \sum_{r=0}^{s-1} (-1)^{r} \binom{s-1}{r} J(r + 2, k - 1),

with J(m, n) = \int_{0}^{\infty} (z + 1)^{-m} (z + 1/a)^{-n} \, dz and 2 ≤ m ≤ s + 1. Partial integration yields the recurrence

J(m, n) = \frac{a^{n}}{m - 1} - \frac{n}{m - 1} J(m - 1, n + 1),

and repeated application reduces J(m, n) to J(1, n + m − 1), which can be expressed in terms of the function C_{n+m−2} defined in Corollary 3.4, and the result follows. □
Appendix B

E\{S | LFC\} = \sum_{s=1}^{k} s^{2} \binom{k}{s} \int_{-\infty}^{\infty} \{F(x) - F(x - d)\}^{s-1} F^{k-s}(x - d) \, dF(x).

From s^{2} \binom{k}{s} = k(k-1) \binom{k-2}{s-2} + k \binom{k-1}{s-1} with 1 ≤ s ≤ k, \binom{a}{0} = 1 and \binom{a}{b} = 0 for all real a and integer b < 0, and, for j = 1, 2,

\sum_{s=j}^{k} \binom{k-j}{s-j} \{F(x) - F(x - d)\}^{s-j} F^{k-s}(x - d) = F^{k-j}(x),   (B.1)

it follows that

E\{S | LFC\} = k(k-1) \int_{-\infty}^{\infty} \{F(x) - F(x - d)\} F^{k-2}(x) \, dF(x) + k \int_{-\infty}^{\infty} F^{k-1}(x) \, dF(x) = k \int_{-\infty}^{\infty} F^{k-1}(y + d) \, dF(y),

using partial integration and the transformation y = x − d.
Appendix C

Proof of Theorem 4.1. We have

E\{S^{2} | LFC\} = \sum_{s=1}^{k} s^{3} \binom{k}{s} \int_{-\infty}^{\infty} F^{k-s}(x - d) \{F(x) - F(x - d)\}^{s-1} \, dF(x).

The identity

s^{3} \binom{k}{s} = k(k-1)(k-2) \binom{k-3}{s-3} + 3k(k-1) \binom{k-2}{s-2} + k \binom{k-1}{s-1},

with 1 ≤ s ≤ k, can easily be verified. Equality (B.1) is also valid for j = 3 (k ≥ 3). Then it follows that

E\{S^{2} | LFC\} = k(k-1)(k-2) \int_{-\infty}^{\infty} F^{k-3}(x) \{F(x) - F(x - d)\}^{2} \, dF(x) + 3k(k-1) \int_{-\infty}^{\infty} F^{k-2}(x) \{F(x) - F(x - d)\} \, dF(x) + k \int_{-\infty}^{\infty} F^{k-1}(x) \, dF(x)
= k(2k - 1) \int_{-\infty}^{\infty} F^{k-1}(x + d) \, dF(x) - 2k(k-1) \int_{-\infty}^{\infty} F^{k-2}(x + d) F(x) \, dF(x),

using partial integration, and the result follows immediately.
References

Bechhofer, R.E. (1954). A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25, 16-39.
Chen, P. and M. Sobel (1987). An integrated formulation for selecting the t best of k normal populations. Commun. Statist. Theory Methods 16, 121-146.
Dharmadhikari, S. and K. Joag-dev (1988). Unimodality, Convexity, and Applications. Academic Press, New York.
Gupta, S.S. (1965). On some multiple decision (selection and ranking) rules. Technometrics 7, 225-245.
Karlin, S. (1968). Total Positivity, Vol. 1. Stanford University Press, Stanford.
van der Laan, P. (1970). Simple Distribution-free Confidence Intervals for a Difference in Location. Eindhoven University of Technology, Eindhoven.
van der Laan, P. (1992a). Subset selection of an almost best treatment. Biometrical J. 34, 647-656.
van der Laan, P. (1992b). Subset selection: robustness and imprecise selection. Memorandum COSOR 92-10, Dept. of Mathematics and Computing Science, Eindhoven University of Technology.
van der Laan, P. (1992c). On subset selection from logistic populations. Statist. Neerlandica 46, 153-163.
van der Laan, P. and C. van Eeden (1993). Subset selection with a generalized selection goal based on a loss function. Technical Report 127, Dept. of Statistics, University of British Columbia, Vancouver, and Memorandum COSOR 93-15, Dept. of Mathematics and Computing Science, Eindhoven University of Technology (submitted).