Journal of Statistical Planning and Inference 55 (1996) 13-21
An identity for the joint distribution of order statistics and its applications

K. Balasubramanian^a, M.I. Beg^b,*, R.B. Bapat^a

^a Indian Statistical Institute, New Delhi 110016, India
^b School of Mathematics, University of Hyderabad, Hyderabad 500134, India

* Corresponding author.

Received 12 May 1993; revised 21 April 1994
Abstract
It is shown that the joint distribution function of several order statistics can be expressed in terms of the distribution functions of the maximum (or the minimum) order statistics of suitable subsamples. The result is used to derive explicit expressions for the expectations of functions of order statistics from certain families of distributions which include the exponential distribution and the power function distribution. The results generalize earlier work by the authors for a single order statistic.
AMS classification: 62G30

Keywords: Order statistics; Product moments; Exponential distribution; Power function distribution
1. Introduction
Let $X_1, X_2, \ldots, X_n$ be independent random variables with distribution functions $F_1, F_2, \ldots, F_n$ respectively and let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ denote the corresponding order statistics. We set $N = \{1, 2, \ldots, n\}$. If $S \subset N$ then $S^c$ will denote the complement of $S$ in $N$ and $|S|$ will denote the cardinality of $S$. The $r$th order statistic for the set $\{X_i \mid i \in S\}$ is denoted by $X_{r:S}$. The distribution function of $X_{r:S}$ is denoted by $H_{r:S}(x)$. When there is no possibility of confusion we will write $X_{r:|S|}$ instead of $X_{r:S}$ and $H_{r:|S|}$ instead of $H_{r:S}$.

The distribution of the $r$th order statistic $X_{r:n}$ can be expressed as a linear combination of the distribution functions of the maximum (or the minimum) order statistic for various subsets of $\{X_1, X_2, \ldots, X_n\}$. In the jargon of reliability theory this simply means that the reliability of an '$r$-out-of-$n$' system can be expressed as a linear combination of the reliabilities of all possible series (or parallel) systems formed out of the $n$ components. We state the result next; for a proof see Balasubramanian et al. (1991).
Theorem 1. For $1 \le r \le n$ and $n \ge 2$:

(i) $H_{r:n}(x) = \sum_{j=0}^{n-r} \sum_{S \subset N,\, |S| = n-j} (-1)^{r-|S|} \binom{|S|-1}{|S|-r} H_{|S|:S}(x)$;

(ii) $H_{r:n}(x) = \sum_{j=0}^{r-1} \sum_{S \subset N,\, |S| = n-j} (-1)^{n-r+1-|S|} \binom{|S|-1}{n-r} H_{1:S}(x)$.
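Theorem 1(i) is easy to check numerically. The following sketch (Python, with arbitrary illustrative values of $F_i(x)$; it assumes only the statement of (i) as reconstructed above) compares the right-hand side with a direct enumeration of $\Pr\{\text{at least } r \text{ of the } X_i \le x\}$:

```python
from itertools import combinations
from math import comb

F = [0.3, 0.5, 0.7, 0.2]      # illustrative values of F_i(x) at a fixed x
n = len(F)

def H_direct(r):
    """Pr{X_{r:n} <= x} = Pr{at least r of the X_i are <= x}."""
    total = 0.0
    for m in range(r, n + 1):
        for T in combinations(range(n), m):
            p = 1.0
            for i in range(n):
                p *= F[i] if i in T else 1 - F[i]
            total += p
    return total

def H_via_maxima(r):
    """Right-hand side of Theorem 1(i): signed sum over subsample maxima."""
    total = 0.0
    for m in range(r, n + 1):              # m = |S|
        for S in combinations(range(n), m):
            H_max = 1.0
            for i in S:
                H_max *= F[i]              # H_{|S|:S}(x) = prod_{i in S} F_i(x)
            total += (-1) ** (m - r) * comb(m - 1, m - r) * H_max
    return total

for r in range(1, n + 1):
    assert abs(H_direct(r) - H_via_maxima(r)) < 1e-12
```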
Theorem 1 also follows from the well-known work of Feller (1968); part (i) can be obtained from the expression for the distribution function of $X_{r:n}$ given by (5.2) of Ch. 4 there. The motivation behind obtaining the result in Balasubramanian et al. (1991) was to use it to get explicit expressions for the moments of functions of order statistics for distributions such as the exponential, for which the minimum order statistic is again exponential, or the power function distribution, for which the maximum order statistic again has a power function distribution. The purpose of this paper is to obtain a generalization of Theorem 1 to the case of the joint distribution function of several order statistics. We will show that the joint distribution function of $X_{r_1:n}, X_{r_2:n}, \ldots, X_{r_k:n}$, where $r_1 < r_2 < \cdots < r_k$, at $x_1 < x_2 < \cdots < x_k$ can be expressed as a linear combination of terms of the type
$$H_{|S_1|:S_1}(x_1)\, H_{|S_2|:S_2}(x_2) \cdots H_{|S_k|:S_k}(x_k) \qquad (1)$$
or of the type
$$H_{1:S_1}(x_k)\, H_{1:S_2}(x_{k-1}) \cdots H_{1:S_k}(x_1) \qquad (2)$$
where $S_1, S_2, \ldots, S_k$ are pairwise disjoint nonempty subsets of $N$. We indicate how the result may be used to get explicit expressions for the product moments of functions of order statistics from distributions such as the exponential or the power function distribution.

2. The main result

Let $1 \le r_1 < r_2 < \cdots < r_k \le n$ and let $H_{r_1,r_2,\ldots,r_k:n}(x_1, x_2, \ldots, x_k)$ denote the joint distribution function of $X_{r_1:n}, X_{r_2:n}, \ldots, X_{r_k:n}$.
Theorem 2. For $1 \le r_1 < r_2 < \cdots < r_k \le n$,
$$H_{r_1,r_2,\ldots,r_k:n}(x_1, x_2, \ldots, x_k) = \sum (-1)^{\sum_{i=1}^{k}\{r_i - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{\sum_{j=1}^{i}|S_j| - r_i}\, H_{|S_i|:S_i}(x_i)$$
for any $x_1 < x_2 < \cdots < x_k$, where the summation is over all collections $\{S_1, S_2, \ldots, S_k\}$ of pairwise disjoint nonempty subsets of $N$.

Note that in Theorem 2, as well as in the rest of the paper, we take the binomial coefficient $\binom{p}{q}$ to be zero whenever $q > p$ or $q < 0$.
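To make the left-hand side of Theorem 2 concrete: the event $\{X_{r_1:n} \le x_1, \ldots, X_{r_k:n} \le x_k\}$ occurs exactly when, for each $j$, at least $r_j$ of the $X_i$ fall at or below $x_j$. The following sketch (a hypothetical illustration; the values of $F_i(x_j)$ are made up) computes the joint distribution function directly from this description by enumerating which of the $k+1$ intervals determined by $x_1 < \cdots < x_k$ each observation falls into:

```python
from itertools import product

def joint_df(F_at, r):
    """Pr{X_{r_1:n} <= x_1, ..., X_{r_k:n} <= x_k} for independent X_i.

    F_at[i][j-1] = F_i(x_j) for thresholds x_1 < ... < x_k; r = (r_1, ..., r_k).
    Each X_i is placed in one of the k+1 intervals cut out by the thresholds.
    """
    n, k = len(F_at), len(r)
    total = 0.0
    for cells in product(range(k + 1), repeat=n):     # cell c: x_c < X_i <= x_{c+1}
        p = 1.0
        for i, c in enumerate(cells):
            lo = F_at[i][c - 1] if c > 0 else 0.0
            hi = F_at[i][c] if c < k else 1.0
            p *= hi - lo
        # the event holds iff, for every j, at least r_j of the X_i lie at or below x_j
        if all(sum(1 for c in cells if c < j) >= r[j - 1] for j in range(1, k + 1)):
            total += p
    return total

# Hypothetical example: n = 3, k = 2, with F_i(x_1), F_i(x_2) chosen arbitrarily.
F_at = [[0.2, 0.6], [0.3, 0.7], [0.1, 0.5]]
print(joint_df(F_at, (1, 3)))     # Pr{X_{1:3} <= x_1, X_{3:3} <= x_2}
```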
We can give a result similar to Theorem 2 using the minimum order statistics in various subsets. We give the result next.
Theorem 3. For $1 \le r_1 < r_2 < \cdots < r_k \le n$,
$$H_{r_1,r_2,\ldots,r_k:n}(x_1, x_2, \ldots, x_k) = \sum (-1)^{\sum_{i=1}^{k}\{n - r_i + 1 - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{n - \sum_{j=1}^{i-1}|S_j| - r_{k+1-i}}\, H_{1:S_i}(x_{k+1-i})$$
for any $x_1 < x_2 < \cdots < x_k$, where the summation is over all possible collections $\{S_1, \ldots, S_k\}$ of pairwise disjoint nonempty subsets of $N$.

Allowing $x_1 \to \infty, x_2 \to \infty, \ldots, x_k \to \infty$, Theorems 2 and 3 give
$$\sum (-1)^{\sum_{i=1}^{k}\{r_i - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{\sum_{j=1}^{i}|S_j| - r_i} = 1$$
and
$$\sum (-1)^{\sum_{i=1}^{k}\{n - r_i + 1 - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{n - \sum_{j=1}^{i-1}|S_j| - r_{k+1-i}} = 1,$$
respectively. These combinatorial identities may be of independent interest. The proof of Theorem 2 is deferred to Section 4. The proof of Theorem 3 is similar.
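As a small sanity check, in the case $k = 1$ the first identity reduces (after grouping the subsets of equal size) to $\sum_{j=r}^{n} (-1)^{j-r} \binom{j-1}{r-1} \binom{n}{j} = 1$; the sketch below verifies this one-dimensional case only, not the general identity:

```python
from math import comb

for n in range(1, 9):
    for r in range(1, n + 1):
        total = sum((-1) ** (j - r) * comb(j - 1, r - 1) * comb(n, j)
                    for j in range(r, n + 1))
        assert total == 1          # k = 1 specialization of the identity
```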
3. Expectation of functions of order statistics

In this section we make use of the identities of Section 2 to obtain expressions for expectations of functions of order statistics. Suppose the random variable $X$ has an arbitrary distribution function $F(x)$. Define the following two families of distribution functions with a positive parameter $\lambda$.

Family I: $F_\lambda(x) = [F(x)]^\lambda$, $\lambda > 0$.

Family II: $F_\lambda(x) = 1 - [1 - F(x)]^\lambda$, $\lambda > 0$. In the survival analysis literature this family is known as the proportional hazard rate model.

For Family I, let $X^{(\lambda)}$ have distribution function $F_\lambda(x)$ and let $X_1, X_2, \ldots, X_n$ be independently distributed as $X^{(\lambda_1)}, X^{(\lambda_2)}, \ldots, X^{(\lambda_n)}$ respectively. Then
$$H_{|S|:S}(x) = \prod_{i \in S} F_{\lambda_i}(x) = \prod_{i \in S} [F(x)]^{\lambda_i} = [F(x)]^{\lambda_S} = F_{\lambda_S}(x),$$
where $\lambda_S = \sum_{i \in S} \lambda_i$.
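The point of Family I is that maxima of subsamples stay within the family: $H_{|S|:S} = F_{\lambda_S}$. A short Monte Carlo illustration, assuming $F$ is the uniform distribution function on $[0, \theta]$ (so that $F_\lambda$ is the power function distribution used in the example below) and arbitrary illustrative parameter values:

```python
import random

# Family I with F the uniform cdf on [0, theta]: F_lambda(x) = (x/theta)**lam,
# so X^(lam) can be sampled as theta * U**(1/lam) with U ~ uniform(0, 1).
theta, lams = 1.0, [0.5, 1.3, 2.2]        # illustrative subsample parameters
lam_S = sum(lams)                          # lambda_S = sum of lambda_i over S
random.seed(0)

draws = [max(theta * random.random() ** (1.0 / lam) for lam in lams)
         for _ in range(200_000)]
for x in (0.3, 0.6, 0.9):
    empirical = sum(d <= x for d in draws) / len(draws)
    print(x, round(empirical, 4), round((x / theta) ** lam_S, 4))  # should be close
```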
Similarly, for Family II, let $X_{(\lambda)}$ have distribution function $F_\lambda(x)$. If $X_1, X_2, \ldots, X_n$ are independently distributed as $X_{(\lambda_1)}, X_{(\lambda_2)}, \ldots, X_{(\lambda_n)}$ respectively, then
$$H_{1:S}(x) = 1 - \prod_{i \in S} [1 - F_{\lambda_i}(x)] = 1 - \prod_{i \in S} [1 - F(x)]^{\lambda_i} = 1 - [1 - F(x)]^{\lambda_S} = F_{\lambda_S}(x),$$
where $\lambda_S = \sum_{i \in S} \lambda_i$.

It follows from Theorems 2 and 3 that
$$E\{g(X_{r_1:n}, \ldots, X_{r_k:n})\} = \sum (-1)^{\sum_{i=1}^{k}\{r_i - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{\sum_{j=1}^{i}|S_j| - r_i}\, g^*(\lambda_{S_1}, \ldots, \lambda_{S_k}) \qquad (3)$$
and
$$E\{g(X_{r_1:n}, \ldots, X_{r_k:n})\} = \sum (-1)^{\sum_{i=1}^{k}\{n - r_i + 1 - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{n - \sum_{j=1}^{i-1}|S_j| - r_{k+1-i}}\, g_*(\lambda_{S_1}, \ldots, \lambda_{S_k}), \qquad (4)$$
where the summation is over pairwise disjoint subsets $S_1, S_2, \ldots, S_k$ of $N$,
$$g^*(\lambda_1, \ldots, \lambda_k) = E\{g(X^{(\lambda_1)}, \ldots, X^{(\lambda_k)}) \mid X^{(\lambda_1)} < X^{(\lambda_2)} < \cdots < X^{(\lambda_k)}\} \times \Pr(X^{(\lambda_1)} < X^{(\lambda_2)} < \cdots < X^{(\lambda_k)})$$
and
$$g_*(\lambda_1, \ldots, \lambda_k) = E\{g(X_{(\lambda_1)}, \ldots, X_{(\lambda_k)}) \mid X_{(\lambda_1)} < X_{(\lambda_2)} < \cdots < X_{(\lambda_k)}\} \times \Pr(X_{(\lambda_1)} < X_{(\lambda_2)} < \cdots < X_{(\lambda_k)}).$$

Here $g^*$ and $g_*$ are assumed to exist and to be finite.
An example of Family I. Consider
$$F(x) = \frac{x}{\theta}, \qquad 0 \le x \le \theta;\ \theta > 0,$$
then
$$F_\lambda(x) = \left(\frac{x}{\theta}\right)^\lambda, \qquad 0 \le x \le \theta;\ \theta, \lambda > 0,$$
which is a power function distribution. If
$$g(x_1, \ldots, x_k) = \prod_{i=1}^{k} x_i^{\alpha_i}, \qquad \alpha_i \ge 0,\ i = 1, 2, \ldots, k,$$
then
$$g^*(\lambda_1, \ldots, \lambda_k) = \idotsint\limits_{0 \le x_1 \le \cdots \le x_k \le \theta} \prod_{i=1}^{k} \left\{ x_i^{\alpha_i}\, \frac{\lambda_i}{\theta} \left(\frac{x_i}{\theta}\right)^{\lambda_i - 1} \right\} \mathrm{d}x_i = \prod_{i=1}^{k} \frac{\lambda_i\, \theta^{\alpha_i}}{\sum_{j=1}^{i} (\alpha_j + \lambda_j)}.$$
Hence from (3), we get
$$E\left\{\prod_{i=1}^{k} X_{r_i:n}^{\alpha_i}\right\} = \sum (-1)^{\sum_{i=1}^{k}\{r_i - i|S_i|\}} \prod_{i=1}^{k} \binom{|S_i|-1}{\sum_{j=1}^{i}|S_j| - r_i} \prod_{i=1}^{k} \frac{\lambda_{S_i}\, \theta^{\alpha_i}}{\sum_{j=1}^{i} (\alpha_j + \lambda_{S_j})},$$
where the summation is over pairwise disjoint subsets $S_1, S_2, \ldots, S_k$ of $N$.
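The closed form for $g^*$ can be checked by simulation. By its definition, for $k = 2$ and $g(x_1, x_2) = x_1^{\alpha_1} x_2^{\alpha_2}$, $g^*(\lambda_1, \lambda_2) = E\{X_1^{\alpha_1} X_2^{\alpha_2}\, \mathbf{1}\{X_1 < X_2\}\}$ for independent power function variates $X_1, X_2$. A Monte Carlo sketch with arbitrary illustrative parameters:

```python
import random

theta = 1.0
a1, a2 = 2, 1            # alpha_1, alpha_2 (illustrative)
l1, l2 = 0.8, 1.7        # lambda_1, lambda_2 (illustrative)
random.seed(1)

def draw(lam):
    # power-function variate with cdf (x/theta)**lam on [0, theta]
    return theta * random.random() ** (1.0 / lam)

N = 200_000
acc = 0.0
for _ in range(N):
    x1, x2 = draw(l1), draw(l2)
    if x1 < x2:                       # contribute only on the ordered event
        acc += x1 ** a1 * x2 ** a2
monte_carlo = acc / N

closed_form = (l1 * theta ** a1 / (a1 + l1)) * (l2 * theta ** a2 / (a1 + l1 + a2 + l2))
print(monte_carlo, closed_form)       # should agree to Monte Carlo accuracy
```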
An example of Family II. Consider
$$F(x) = 1 - \exp\{-x\}, \qquad x \ge 0,$$
then
$$F_\lambda(x) = 1 - \exp\{-\lambda x\}, \qquad x \ge 0,\ \lambda > 0,$$
which is an exponential distribution. If
$$g(x_1, \ldots, x_k) = \exp\left\{\sum_{i=1}^{k} t_i x_i\right\},$$
then
$$g_*(\lambda_1, \ldots, \lambda_k) = \idotsint\limits_{0 \le x_1 \le \cdots \le x_k < \infty} \exp\left\{\sum_{i=1}^{k} t_i x_i\right\} \prod_{i=1}^{k} \{\lambda_i \exp\{-\lambda_i x_i\}\}\, \mathrm{d}x_i, \qquad (5)$$
and hence
$$E\{X_{r_1:n}^{\alpha_1} X_{r_2:n}^{\alpha_2}\} = \sum (-1)^{\sum_{i=1}^{2}\{n - r_i + 1 - i|S_i|\}} \prod_{i=1}^{2} \binom{|S_i|-1}{n - \sum_{j=1}^{i-1}|S_j| - r_{3-i}} \times \lambda_{S_1} \sum_{i=0}^{\alpha_2} \frac{\alpha_2!}{i!}\, \frac{(\alpha_1 + i)!}{\lambda_{S_2}^{\alpha_2 - i} (\lambda_{S_1} + \lambda_{S_2})^{\alpha_1 + i + 1}}.$$
For $\alpha_1 = \alpha_2 = 1$, the above reduces to
$$E\{X_{r_1:n} X_{r_2:n}\} = \sum (-1)^{\sum_{i=1}^{2}\{n - r_i + 1 - i|S_i|\}} \prod_{i=1}^{2} \binom{|S_i|-1}{n - \sum_{j=1}^{i-1}|S_j| - r_{3-i}} \times \frac{\lambda_{S_1}(\lambda_{S_1} + 3\lambda_{S_2})}{\lambda_{S_2}(\lambda_{S_1} + \lambda_{S_2})^3},$$
where the summation is over pairwise disjoint subsets $S_1, S_2$ of $N$.
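The factor $\lambda_{S_1}(\lambda_{S_1} + 3\lambda_{S_2})/\{\lambda_{S_2}(\lambda_{S_1} + \lambda_{S_2})^3\}$ is, by the definition of $g_*$, just $E\{X_1 X_2\, \mathbf{1}\{X_1 < X_2\}\}$ for independent exponential variables with rates $\lambda_{S_1}$ and $\lambda_{S_2}$. A Monte Carlo sketch with arbitrary illustrative rates:

```python
import random

l1, l2 = 1.5, 0.7        # illustrative rates lambda_{S_1}, lambda_{S_2}
random.seed(2)

N = 200_000
acc = 0.0
for _ in range(N):
    x1 = random.expovariate(l1)
    x2 = random.expovariate(l2)
    if x1 < x2:
        acc += x1 * x2
monte_carlo = acc / N
closed_form = l1 * (l1 + 3 * l2) / (l2 * (l1 + l2) ** 3)
print(monte_carlo, closed_form)   # should agree to Monte Carlo accuracy
```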
4. Proof of the main result

We will need the following result.

Lemma 1. Let $f(t) = a_0 + a_1 t + \cdots + a_n t^n$ and let $b_r = \sum_{i=r}^{n} a_i$, $r = 0, 1, \ldots, n$. Then the coefficient of $t^r$ in $t f(t)/(t-1)$, $|t| > 1$, is $b_r$.

Proof. It is easily verified that
$$t f(t) - b_0 = (t - 1)(b_0 + b_1 t + \cdots + b_n t^n)$$
and therefore, when $|t| > 1$,
$$\frac{t f(t)}{t - 1} = b_0 + b_1 t + \cdots + b_n t^n + \frac{b_0}{t - 1} = b_0 + b_1 t + \cdots + b_n t^n + b_0\left(\frac{1}{t} + \frac{1}{t^2} + \cdots\right)$$
and the result follows. □
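The proof of Lemma 1 rests on the displayed polynomial identity; a minimal coefficient-by-coefficient check with arbitrary illustrative coefficients:

```python
# a_i are the coefficients of f(t); b_r = a_r + a_{r+1} + ... + a_n.
a = [3, -1, 4, 2]                        # arbitrary illustrative coefficients
n = len(a) - 1
b = [sum(a[r:]) for r in range(n + 1)]

# Check the identity t*f(t) - b_0 = (t - 1)*(b_0 + b_1 t + ... + b_n t^n).
lhs = [-b[0]] + a                        # coefficients of t*f(t) - b_0
rhs = [0] * (n + 2)
for r, br in enumerate(b):               # multiply sum of b_r t^r by (t - 1)
    rhs[r + 1] += br
    rhs[r] -= br
assert lhs == rhs
```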
We now introduce some notation. Let $A_{ij}$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, k$, be events such that $A_{i1} \subset A_{i2} \subset \cdots \subset A_{ik}$ for every $i$. For $0 \le \alpha_1 \le \alpha_2 \le \cdots \le \alpha_k \le n$, let
$$P[\alpha_1, \ldots, \alpha_k] = \Pr\{\text{exactly } \alpha_j \text{ of the } A_{ij}\text{'s occur},\ j = 1, 2, \ldots, k\}.$$
Note that since $A_{ir} \subset A_{i,r+1}$, $r = 1, 2, \ldots, k-1$, the condition $0 \le \alpha_1 \le \alpha_2 \le \cdots \le \alpha_k$ is necessary for $P[\alpha_1, \ldots, \alpha_k]$ to be positive. Let
$$P^*[\alpha_1, \ldots, \alpha_k] = \Pr\{\text{at least } \alpha_j \text{ of the } A_{ij}\text{'s occur},\ j = 1, 2, \ldots, k\} = \sum_{i_1 \ge \alpha_1} \cdots \sum_{i_k \ge \alpha_k} P[i_1, i_2, \ldots, i_k]. \qquad (7)$$
Define
$$S_{\beta_1, \ldots, \beta_k} = \sum \Pr\Big\{\bigcap_{j=1}^{k} \bigcap_{i \in I_j} A_{ij}\Big\},$$
where the summation is over $I_1, I_2, \ldots, I_k$, disjoint subsets of $\{1, 2, \ldots, n\}$, with cardinalities $\beta_1, \beta_2, \ldots, \beta_k$ respectively. We wish to express $P^*[\alpha_1, \ldots, \alpha_k]$ in terms of $S_{\beta_1, \ldots, \beta_k}$. Consider
$$G(t_1, \ldots, t_k) = E\prod_{i=1}^{n} \big\{ t_1 \chi_{A_{i1}} + t_2(\chi_{A_{i2}} - \chi_{A_{i1}}) + \cdots + t_k(\chi_{A_{ik}} - \chi_{A_{i,k-1}}) + \chi_{A_{ik}^c} \big\}$$
$$= \sum_{j_1} \sum_{j_2} \cdots \sum_{j_k} t_1^{j_1} t_2^{j_2} \cdots t_k^{j_k}\, P[j_1,\ j_1 + j_2,\ \ldots,\ j_1 + j_2 + \cdots + j_k],$$
where $\chi$ is an indicator function. Changing $t_1 = u_1 u_2 \cdots u_k$, $t_2 = u_2 u_3 \cdots u_k$, ..., $t_k = u_k$, we get
$$G(t_1, \ldots, t_k) = H(u_1, \ldots, u_k) = \sum_{j_1} \cdots \sum_{j_k} u_1^{j_1} u_2^{j_1 + j_2} \cdots u_k^{j_1 + j_2 + \cdots + j_k}\, P[j_1,\ j_1 + j_2,\ \ldots,\ j_1 + j_2 + \cdots + j_k]$$
$$= \sum_{\alpha_1} \sum_{\alpha_2} \cdots \sum_{\alpha_k} u_1^{\alpha_1} u_2^{\alpha_2} \cdots u_k^{\alpha_k}\, P[\alpha_1, \ldots, \alpha_k],$$
where the last summation is over $\alpha_1 \le \alpha_2 \le \cdots \le \alpha_k$. Applying Lemma 1 in each of the variables $u_1, \ldots, u_k$ and using (7),
$$P^*[\alpha_1, \ldots, \alpha_k] = \text{coefficient of } u_1^{\alpha_1} \cdots u_k^{\alpha_k} \text{ in } \frac{u_1 \cdots u_k}{(u_1 - 1) \cdots (u_k - 1)}\, H(u_1, \ldots, u_k). \qquad (8)$$
Now
$$G(t_1, \ldots, t_k) = E\prod_{i=1}^{n} \big\{ (t_1 - t_2)\chi_{A_{i1}} + (t_2 - t_3)\chi_{A_{i2}} + \cdots + (t_{k-1} - t_k)\chi_{A_{i,k-1}} + (t_k - 1)\chi_{A_{ik}} + 1 \big\}$$
$$= \sum_{j_1} \cdots \sum_{j_k} (t_1 - t_2)^{j_1} \cdots (t_k - 1)^{j_k}\, S_{j_1, j_2, \ldots, j_k}$$
$$= \sum_{j_1} \cdots \sum_{j_k} (u_1 - 1)^{j_1} (u_2 - 1)^{j_2} \cdots (u_k - 1)^{j_k}\, u_2^{j_1} u_3^{j_1 + j_2} \cdots u_k^{j_1 + j_2 + \cdots + j_{k-1}}\, S_{j_1, j_2, \ldots, j_k}. \qquad (9)$$
Since $G(t_1, \ldots, t_k) = H(u_1, \ldots, u_k)$, we get from (7), (8) and (9) that
$$P^*[\alpha_1, \ldots, \alpha_k] = \text{coefficient of } u_1^{\alpha_1} \cdots u_k^{\alpha_k} \text{ in } \frac{u_1 \cdots u_k}{(u_1 - 1) \cdots (u_k - 1)} \times (9)$$
$$= \sum_{j_1} \cdots \sum_{j_k} \binom{j_1 - 1}{\alpha_1 - 1} \binom{j_2 - 1}{\alpha_2 - j_1 - 1} \binom{j_3 - 1}{\alpha_3 - j_1 - j_2 - 1} \times \cdots \times \binom{j_k - 1}{\alpha_k - j_1 - j_2 - \cdots - j_{k-1} - 1} (-1)^{\sum_{i=1}^{k}(j_1 + \cdots + j_i) - \sum_{i=1}^{k} \alpha_i}\, S_{j_1, j_2, \ldots, j_k}. \qquad (10)$$

Identity (10) is a generalization of Eq. (5.2) of Feller (1968, p. 109). In order to obtain Theorem 2 from (10) we set $A_{ij} = \{X_i < x_j\}$, where $x_1 < x_2 < \cdots < x_k$. We make a rearrangement of the terms and write $j_1, j_2, \ldots, j_k$ in terms of $|S_1|, |S_2|, \ldots, |S_k|$.
Acknowledgements

The authors would like to thank the referees for some useful comments which led to a more general form of the main result.
References

Balasubramanian, K., M.I. Beg and R.B. Bapat (1991). On families of distributions closed under extrema. Sankhya Ser. A 53, 375-388.
Bapat, R.B. and M.I. Beg (1989). Order statistics for nonidentically distributed variables and permanents. Sankhya Ser. A 51, 79-93.
David, H.A. (1981). Order Statistics, 2nd ed. Wiley, New York.
Feller, W. (1968). An Introduction to Probability Theory and its Applications, Vol. 1, 3rd ed. Wiley, New York.