P. R. Krishnaiah and P.K. Sen, eds., Handbook of Statistics, Vol. 4 © Elsevier Science Publishers (1984) 145-171
Rank Statistics and Limit Theorems*
Malay Ghosh
1. Introduction
Methods based on ranks have proved to be a very useful alternative to the classical parametric theory in many situations. A systematic development of the rank-based methods seems to have been sparked by the two pioneering papers of Wilcoxon (1945), and Mann and Whitney (1947), although rank tests for certain individual problems can be traced back even earlier. Wilcoxon (1945) proposed the famous rank sum test as a test for location in the two-sample problem. Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ denote independent random samples from continuous distribution functions $F(x)$ and $G(x) = F(x - \Delta)$, where the shift parameter $\Delta$ (real) is unknown. We want to test $H_0: \Delta = 0$ against the one-sided alternatives $H_1: \Delta > 0$ (or $\Delta < 0$), or the two-sided alternatives $H: \Delta \ne 0$. Writing $Z_i = X_i$ ($i = 1, \ldots, m$), $Z_{m+j} = Y_j$ ($j = 1, \ldots, n$), let $R_i$ denote the rank of $Z_i$ in the combined sample of size $m + n$. The Wilcoxon rank sum statistic is given by $W = \sum_{j=1}^{n} R_{m+j}$. For the one-sided alternatives $H_1: \Delta > 0$ ($\Delta < 0$) we reject for large (small) values of $W$, because that indicates that the $Y$'s have a distribution stochastically larger (smaller) than the distribution of the $X$'s. For the two-sided alternatives $H$, we reject for very large or very small values of $W$.

Linear rank statistics (sometimes also called simple linear rank statistics) are generalized versions of $W$. A linear rank statistic based on a sample of size $N$ is defined by

$$S_N = \sum_{i=1}^{N} c(i)\, a(R_i), \tag{1.1}$$
where the $c(i)$'s are referred to as regression constants, and the $a(i)$'s are referred to as scores. Such statistics arise quite naturally in obtaining locally most powerful rank tests against certain regression alternatives (see e.g. Hájek and Šidák, 1967). The special case when $c(1) = \cdots = c(m) = 0$ and $c(m+1) = \cdots = c(N) = 1$ and $a(1) \le \cdots \le a(N)$ is a generalized version of the Wilcoxon rank sum statistic in the two-sample problem. Two important special choices for the $a(i)$'s are given by (i) $a(i) = i$, where the score function $a$ is referred to as the Wilcoxon score, and (ii) $a(i) = EZ_{(i)}$, where $Z_{(1)} \le \cdots \le Z_{(N)}$ are the order statistics in a random sample of size $N$ from the $N(0, 1)$ distribution. In this case, the score function $a$ is referred to as the normal score.

*Research supported by the Army Research Office Durham Grant Number DAAG29-82-K0136.

Next consider the one-sample situation when $Z_1, \ldots, Z_N$ are iid with a common continuous distribution function $F(z - \Delta)$, where the shift parameter $\Delta$ (real) is unknown, and $F(z) + F(-z) = 1$. In this case, one wants to test $H_0: \Delta = 0$ against the one-sided alternatives $H_1: \Delta > 0$ (or $\Delta < 0$) or the two-sided alternatives $H: \Delta \ne 0$. For such a testing problem, Wilcoxon (1945) proposed the signed rank test statistic $W^+ = \sum_{i=1}^{N} \operatorname{sgn} Z_i \, R_i^+$, where $R_i^+$ is the rank of $|Z_i|$ among $|Z_1|, \ldots, |Z_N|$, while $\operatorname{sgn} u = 1$, $0$ or $-1$ according as $u >$, $=$ or $< 0$. Generalized versions of $W^+$ are given by

$$S_N^+ = \sum_{i=1}^{N} c(i) \operatorname{sgn} Z_i \, a(R_i^+), \tag{1.2}$$
and such statistics are referred to as signed rank regression statistics. When $c(1) = \cdots = c(N) = 1$, the statistics are used for one-sample location tests. In this case the important special choices for the score function $a$ are (i) $a(1) = \cdots = a(N) = 1$, which corresponds to the sign test statistic, (ii) $a(i) = i$, which gives the Wilcoxon signed rank statistic $W^+$, and (iii) $a(i) = E|Z|_{(i)}$, where $|Z|_{(1)} < \cdots < |Z|_{(N)}$ denote the order statistics corresponding to the absolute values in a random sample of size $N$ from the $N(0, 1)$ distribution. The resulting test statistic is referred to as the one-sample normal scores test statistic.

We shall review the literature on limit theorems for linear rank statistics and signed linear rank statistics. Some of these statistics are closely related to the so-called U-statistics. For example, the Wilcoxon signed rank statistic can also be expressed as

$$\sum\sum_{1 \le i \le j \le N} \operatorname{sgn}(Z_i + Z_j)$$

(see Tukey, 1949), and we shall see later that $\binom{N+1}{2}^{-1}$ times the above statistic is expressible as a weighted average of two one-sample U-statistics. The Wilcoxon rank sum statistic $W$ is expressible as $W = U + \binom{n+1}{2}$, where $U = \sum_{i=1}^{m} \sum_{j=1}^{n} I_{[Y_j \ge X_i]}$ is the celebrated Mann-Whitney U-statistic. In the above, and in what follows, $I_A = 1$ if the event $A$ occurs, and $I_A = 0$ otherwise. The Mann-Whitney U-statistic is an example of a two-sample U-statistic.

Section 2 presents the central limit theorem, rates of convergence and laws of large numbers for one-sample U-statistics. Multisample extension of these results is given in Section 3. Section 4 deals with jackknifed and bootstrapped U-statistics, while Section 5 provides functional central limit theorems for U-statistics. Some miscellaneous remarks related to U-statistics are made in Section 6.
Section 7 involves a discussion of central limit theorems for linear rank statistics under the null and alternative hypotheses. Strong laws of large numbers are given for linear rank statistics in Section 8. Finally, in Section 9, functional central limit theorems are given for linear rank statistics.
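The two identities quoted in this section (the rank sum written through the Mann-Whitney count, and Tukey's form of the signed rank statistic) can be checked numerically. The following is only a minimal sketch, assuming samples without ties; the function names and the small data sets are illustrative and do not come from the text.

```python
from itertools import combinations_with_replacement


def ranks(values):
    """Ranks (1 = smallest) of distinct values, returned in the original order."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0] * len(values)
    for position, k in enumerate(order, start=1):
        r[k] = position
    return r


def sgn(u):
    return (u > 0) - (u < 0)


def rank_sum(x, y):
    """Wilcoxon rank sum W: sum of the ranks of the Y's in the pooled sample."""
    r = ranks(list(x) + list(y))
    return sum(r[len(x):])


def mann_whitney_count(x, y):
    """U = number of pairs (i, j) with Y_j >= X_i."""
    return sum(1 for xi in x for yj in y if yj >= xi)


def signed_rank(z):
    """Wilcoxon signed rank statistic: sum of sgn(Z_i) * R_i^+."""
    r_plus = ranks([abs(v) for v in z])
    return sum(sgn(v) * r for v, r in zip(z, r_plus))


def tukey_form(z):
    """Sum of sgn(Z_i + Z_j) over all pairs i <= j."""
    return sum(sgn(a + b) for a, b in combinations_with_replacement(z, 2))


# Illustrative data (hypothetical, no ties).
x = [1.2, -0.4, 3.1]
y = [0.7, 2.5, 4.0, -1.3]
z = [0.9, -2.1, 1.4, -0.3, 3.3]
n = len(y)

print(rank_sum(x, y), mann_whitney_count(x, y) + n * (n + 1) // 2)
print(signed_rank(z), tukey_form(z))
```

Each printed pair should agree, which is the content of the two identities for these particular samples.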
2. One sample U-statistics: Central limit theorems, rates of convergence and laws of large numbers

We start with $n$ iid random variables $X_1, \ldots, X_n$ each having a distribution function (df) $F(x)$. It is assumed that $F \in \mathcal{F}$, a class of df's in $R^k$, the $k$-dimensional Euclidean space. Let $\theta(F)$ be a functional with domain space $\mathcal{F}$ and range space $R^s$. For simplicity, we confine ourselves to $s = 1$. $\theta(F)$ is called a regular functional over $\mathcal{F}$ if for all $F \in \mathcal{F}$, $\theta(F)$ admits an unbiased estimator, say $\phi(X_1, \ldots, X_n)$, that is,

$$\int \cdots \int \phi(x_1, \ldots, x_n)\, F(dx_1) \cdots F(dx_n) = \theta(F) \quad \text{for all } F \in \mathcal{F}. \tag{2.1}$$
If (2.1) holds, we say that $\theta(F)$ is estimable. If $\theta(F)$ is estimable, the smallest sample size, say $m$, for which (2.1) holds is called the degree of $\theta(F)$, and $\phi(X_1, \ldots, X_m)$ is called the kernel of $\theta(F)$. Without any loss of generality, we may assume that $\phi$ is symmetric in its arguments, as otherwise we can define $\phi_0(X_1, \ldots, X_m) = (m!)^{-1} \sum \phi(X_{\alpha_1}, \ldots, X_{\alpha_m})$, where the summation extends over all possible permutations $(\alpha_1, \ldots, \alpha_m)$ of the first $m$ positive integers. Corresponding to a symmetric kernel $\phi$ of $\theta(F)$, we define a one-sample U-statistic (see Hoeffding, 1948) by

$$U_n = \binom{n}{m}^{-1} \sum \cdots \sum_{1 \le \alpha_1 < \cdots < \alpha_m \le n} \phi(X_{\alpha_1}, \ldots, X_{\alpha_m}). \tag{2.2}$$
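As an illustration of definition (2.2), a U-statistic can be computed by averaging a symmetric kernel over all subsets of size $m$. This is a brute-force sketch for small samples only; the helper name `u_statistic` and the data are assumptions made for the example.

```python
from itertools import combinations
from math import comb
from statistics import mean, variance


def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel over all m-subsets of the sample, as in (2.2)."""
    n = len(sample)
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(n, m)


def identity_kernel(a):
    # Degree-1 kernel; the resulting U-statistic is the sample mean.
    return a


def variance_kernel(a, b):
    # Degree-2 kernel; the resulting U-statistic is the unbiased sample variance.
    return (a - b) ** 2 / 2


data = [2.0, 3.5, 1.0, 4.5, 3.0]  # hypothetical observations
u_mean = u_statistic(data, identity_kernel, 1)
u_var = u_statistic(data, variance_kernel, 2)

print(u_mean, mean(data))
print(u_var, variance(data))
```

The cost grows like $\binom{n}{m}$, so this direct enumeration is meant for checking formulas rather than for large samples.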
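Such statistics have proved to be quite useful in the theory of point and confidence estimation, and in hypothesis testing. For a very detailed account of U-statistics, the reader is referred to Puri and Sen (1971, Chapter 3), Randles and Wolfe (1979, Chapter 3), and Serfling (1980, Chapter 5). Examples of one-sample U-statistics include the sample mean

$$\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i,$$

the sample variance

$$s_n^2 = (n-1)^{-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 = \binom{n}{2}^{-1} \sum\sum_{1 \le i < j \le n} (X_i - X_j)^2 / 2,$$

and Kendall's tau statistic

$$\binom{n}{2}^{-1} \sum\sum_{1 \le i < j \le n} \operatorname{sgn}[(X_{i1} - X_{j1})(X_{i2} - X_{j2})]$$

for bivariate observations $X_i = (X_{i1}, X_{i2})$. Also, the normalized Wilcoxon signed rank statistic $\binom{n+1}{2}^{-1} \sum\sum_{1 \le i \le j \le n} \operatorname{sgn}(X_i + X_j)$ can be expressed as $(n+1)^{-1}(2U_{n1} + (n-1)U_{n2})$, where

$$U_{n1} = n^{-1} \sum_{i=1}^{n} \operatorname{sgn} X_i$$

is the sign statistic and

$$U_{n2} = \binom{n}{2}^{-1} \sum\sum_{1 \le i < j \le n} \operatorname{sgn}(X_i + X_j).$$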
Both $U_{n1}$ and $U_{n2}$ are U-statistics. This fact was mentioned in Section 1.

Hoeffding (1948) obtained a very useful central limit theorem for suitably normalized U-statistics. Later, Hoeffding (1961) introduced a decomposition for U-statistics which considerably facilitated the development of the asymptotics for U-statistics. A similar decomposition for more general statistics (not necessarily symmetric) is available in the later work of Hájek (1968), Efron and Stein (1981), and Karlin and Rinott (1982). To arrive at Hoeffding's decomposition, we first introduce the following notation. For any $c = 1, \ldots, m$, let

$$\phi_c(x_1, \ldots, x_c) = E[\phi(X_1, \ldots, X_m) \mid X_1 = x_1, \ldots, X_c = x_c]. \tag{2.3}$$
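Define

$$\psi_1(x_1) = \phi_1(x_1) - \theta(F); \tag{2.4}$$

$$\psi_2(x_1, x_2) = [\phi_2(x_1, x_2) - \theta(F)] - \psi_1(x_1) - \psi_1(x_2); \tag{2.5}$$

$$\psi_3(x_1, x_2, x_3) = [\phi_3(x_1, x_2, x_3) - \theta(F)] - \sum_{j=1}^{3} \psi_1(x_j) - \sum\sum_{1 \le j < j' \le 3} \psi_2(x_j, x_{j'}); \tag{2.6}$$

$$\cdots$$

$$\psi_m(x_1, \ldots, x_m) = [\phi(x_1, \ldots, x_m) - \theta(F)] - \sum_{j=1}^{m} \psi_1(x_j) - \sum\sum_{1 \le j < j' \le m} \psi_2(x_j, x_{j'}) - \cdots - \sum \cdots \sum_{1 \le j_1 < \cdots < j_{m-1} \le m} \psi_{m-1}(x_{j_1}, \ldots, x_{j_{m-1}}). \tag{2.7}$$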
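Then $U_n$ has the decomposition

$$U_n = \theta(F) + \frac{m}{n} \sum_{j=1}^{n} \psi_1(X_j) + \binom{m}{2} \binom{n}{2}^{-1} \sum\sum_{1 \le j < j' \le n} \psi_2(X_j, X_{j'}) + \cdots + \binom{m}{m} \binom{n}{m}^{-1} \sum \cdots \sum_{1 \le j_1 < \cdots < j_m \le n} \psi_m(X_{j_1}, \ldots, X_{j_m}). \tag{2.8}$$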
The above decomposition has recently been referred to in the literature as an ANOVA-type decomposition (see, for example, Efron and Stein, 1981; or Karlin and Rinott, 1982). If $E\phi^2(X_1, \ldots, X_m) < \infty$, then using the conditional version of Jensen's inequality, it follows that $E\phi_j^2(X_1, \ldots, X_j) < \infty$ for all $j = 1, \ldots, m-1$. Also, it can be shown that the individual terms involved in (2.8) have zero means, and are mutually orthogonal. Accordingly,

$$E(U_n) = \theta(F); \tag{2.9}$$

$$V(U_n) = \frac{m^2}{n} V[\psi_1(X_1)] + \binom{m}{2}^2 \binom{n}{2}^{-1} V[\psi_2(X_1, X_2)] + \cdots + \binom{n}{m}^{-1} V[\psi_m(X_1, \ldots, X_m)]. \tag{2.10}$$

The first main result of this section is as follows:

THEOREM 1. Assume that $E[\phi^2(X_1, \ldots, X_m)] < \infty$ and $\zeta_1 = V[\psi_1(X_1)] > 0$. Then

$$\sqrt{n}\,(U_n - \theta(F)) / (m^2 \zeta_1)^{1/2} \xrightarrow{L} N(0, 1),$$
where $\xrightarrow{L}$ means convergence in distribution. The above theorem, first obtained by Hoeffding (1948), follows easily from (2.8) by applying the classical central limit theorem to the mean-like term $m n^{-1} \sum_{j=1}^{n} \psi_1(X_j)$ and showing that $nE(R_n^2) \to 0$, where $R_n$ is the remainder term in (2.8) in the decomposition of $U_n - \theta(F)$. This proof involves the 'projection' idea of breaking up a statistic as a sample mean plus a remainder term converging to zero at a rate faster than the centered sample mean.

As an application of Theorem 1, consider the example when

$$U_n = \binom{n}{2}^{-1} \sum\sum_{1 \le i < j \le n} (X_i - X_j)^2 / 2,$$

the sample variance. In this case,

$$\phi_1(x) = E[\tfrac{1}{2}(X_1 - X_2)^2 \mid X_1 = x] = \tfrac{1}{2}[(x - \mu)^2 + \sigma^2],$$

where $\mu = E(X_1)$ and $\sigma^2 = V(X_1)$. Hence,

$$\zeta_1 = V[\phi_1(X_1)] = V[\psi_1(X_1)] = \tfrac{1}{4}(\mu_4 - \sigma^4),$$

where $\mu_4 = E(X_1 - \mu)^4$. Accordingly, if $\sigma^4 < \mu_4 < \infty$,

$$\sqrt{n}\,(U_n - \sigma^2) / (\mu_4 - \sigma^4)^{1/2} \xrightarrow{L} N(0, 1).$$
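The variance formula obtained from the Hoeffding decomposition can be verified exactly in a small case. The sketch below assumes Bernoulli observations with success probability 3/10 and the sample variance kernel (degree $m = 2$), and compares the variance of $U_n$ computed by full enumeration with $(4/n)V[\psi_1(X_1)] + \binom{n}{2}^{-1} V[\psi_2(X_1, X_2)]$. The distribution, the sample size and all function names are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

p = Fraction(3, 10)            # assumed success probability
support = {0: 1 - p, 1: p}     # Bernoulli(p) probability mass function
mu = p
sigma2 = p * (1 - p)           # theta(F) for the variance kernel


def kernel(a, b):
    return Fraction((a - b) ** 2, 2)


def psi1(a):
    # phi_1(a) - theta(F), with phi_1(a) = ((a - mu)^2 + sigma^2) / 2
    return ((a - mu) ** 2 + sigma2) / 2 - sigma2


def psi2(a, b):
    return kernel(a, b) - sigma2 - psi1(a) - psi1(b)


def expectation(f, k):
    """E f(X_1, ..., X_k) for iid draws from `support`, by exact enumeration."""
    total = Fraction(0)
    for point in product(support, repeat=k):
        weight = Fraction(1)
        for value in point:
            weight *= support[value]
        total += weight * f(*point)
    return total


def u_stat(*xs):
    pairs = list(combinations(xs, 2))
    return sum(kernel(a, b) for a, b in pairs) / len(pairs)


n = 5
var_psi1 = expectation(lambda a: psi1(a) ** 2, 1)         # psi_1 has mean zero
var_psi2 = expectation(lambda a, b: psi2(a, b) ** 2, 2)   # psi_2 has mean zero
exact_var = expectation(lambda *xs: (u_stat(*xs) - sigma2) ** 2, n)
formula = Fraction(4, n) * var_psi1 + var_psi2 / comb(n, 2)
mu4 = expectation(lambda a: (a - mu) ** 4, 1)

print(exact_var, formula)
print(var_psi1, (mu4 - sigma2 ** 2) / 4)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding error; the second line checks $\zeta_1 = \tfrac14(\mu_4 - \sigma^4)$ for the same distribution.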
One may note that (2.10) is not the usual way the variance of a U-statistic is expressed in the literature (see, for example, Puri and Sen, 1971; Randles and Wolfe, 1979; or Serfling, 1980). Writing $\zeta_c = V[\phi_c(X_1, \ldots, X_c)]$, $V(U_n)$ is usually expressed as $V(U_n) = \binom{n}{m}^{-1} \sum_{c=1}^{m} \binom{m}{c} \binom{n-m}{m-c} \zeta_c$.

Situations could arise though when $\zeta_1 = 0$. For instance, in the sample variance example, one could have $\mu_4 = \sigma^4$. In such a situation, Theorem 1 is not applicable. Hoeffding (1948) obtained the inequalities $\zeta_c / c \le \zeta_d / d$ for $1 \le c \le d \le m$. If $\zeta_1 = \cdots = \zeta_c = 0 < \zeta_{c+1}$, one can still obtain the asymptotic distribution of suitably normalized $U_n$. The case $\zeta_1 = 0 < \zeta_2$ has been addressed by Gregory (1977) (see also Serfling, 1980). If $E\phi^2(X_1, \ldots, X_m) < \infty$ and $\zeta_1 = 0 < \zeta_2$, then

$$n(U_n - \theta(F)) \xrightarrow{L} \binom{m}{2} Y,$$

where $Y$ is a weighted average of possibly infinitely many chi-squared random variables each with one degree of freedom. As an application, consider once again the sample variance example where the $X_i$'s are Bernoulli random variables with probability of success equal to $\tfrac{1}{2}$.

Recently, interest has been focussed on obtaining rates of convergence to normality for U-statistics. The problem has been addressed by Grams and Serfling (1973), Bickel (1974), Chan and Wierman (1977), and Callaert and Janssen (1978), the latter obtaining the sharpest rate as follows.

THEOREM 2. If $E_F|\phi(X_1, \ldots, X_m)|^3 < \infty$ and $\zeta_1 > 0$, then

$$\sup_{-\infty < x < \infty} \left| P_F\!\left( \sqrt{n}\,(U_n - \theta(F)) / (m^2 \zeta_1)^{1/2} \le x \right) - \Phi(x) \right| \le C \nu (m^2 \zeta_1)^{-3/2} n^{-1/2}, \tag{2.11}$$
where $C$ is an absolute constant not depending on $n$ or the moments of the distribution of the $X_i$'s, $\nu = E_F|\psi_1(X_1)|^3$, and $\Phi(x)$ is the distribution function of a $N(0, 1)$ random variable. The moment assumptions of Callaert and Janssen (1978) have been weakened further by Helmers and Van Zwet (1982). They obtained the same error rate as given in the right hand side of (2.11) under the assumptions that (i) $E_F[\phi^2(X_1, \ldots, X_m)] < \infty$ and (ii) $\nu < \infty$. Note that the condition $E_F|\phi(X_1, \ldots, X_m)|^3 < \infty$ implies both conditions (i) and (ii), but not vice versa. An example to this effect appears in a paper of Borovskikh (1979).

Central limit theorems for U-statistics with random indices were obtained by Sproule (1974), and rates of convergence to normality for such random U-statistics were obtained by Ghosh and Dasgupta (1982). Callaert, Janssen, and Veraverbeke (1980) obtained certain asymptotic Edgeworth type expansions for U-statistics.

Next, in this section, we mention briefly the laws of large numbers for one-sample U-statistics. The classical strong law of large numbers generalized to U-statistics is as follows.

THEOREM 3. If $E_F|\phi(X_1, \ldots, X_m)| < \infty$, then $U_n \to \theta(F)$ almost surely as $n \to \infty$.
The strong convergence result was first obtained by Sen (1960) under the stronger moment condition $E|\phi(X_1, \ldots, X_m)|^{2 - m^{-1}} < \infty$. Later, Hoeffding (1961) obtained the result under the assumption of finiteness of the first moment of $\phi$ using the decomposition given in (2.8). Berk (1966) showed that U-statistics are backward martingales, and proved Theorem 3 using the strong convergence results for backward martingales.

For most practical purposes, it is useful to obtain an estimate of the variance of $U_n$, or at least of $m^2 \zeta_1$, since $nV(U_n) \to m^2 \zeta_1$ as $n \to \infty$. The following jackknifed estimator of $V(U_n)$ was proposed by Sen (1960). To introduce Sen's estimator, first let

$$U_n^{(i)}(X_i) = \binom{n-1}{m-1}^{-1} \sum \cdots \sum \phi(X_i, X_{\alpha_2}, \ldots, X_{\alpha_m}),$$

where the summation extends over all possible $1 \le \alpha_2 < \cdots < \alpha_m \le n$ with each $\alpha_j \ne i$, and let

$$S_n^2 = (n-1)^{-1} \sum_{i=1}^{n} [U_n^{(i)}(X_i) - U_n]^2.$$

Sen (1960) (and later Sproule, 1969) showed that if $E\phi^2(X_1, \ldots, X_m) < \infty$, then

$$S_n^2 \xrightarrow{P} \zeta_1 \quad \text{as } n \to \infty.$$

Accordingly,

$$\sqrt{n}\,(U_n - \theta(F)) / (m S_n) \xrightarrow{L} N(0, 1).$$
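A jackknife-type variance estimate of the kind proposed by Sen (1960) can be sketched as follows. The code assumes the form $S_n^2 = (n-1)^{-1} \sum_i [U_n^{(i)}(X_i) - U_n]^2$, with $U_n^{(i)}(X_i)$ the average of the kernel over all subsets containing $X_i$; this form is inferred from the bootstrap analogue (4.4) and should be read as an assumption. Function names and data are illustrative.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import variance


def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel over all m-subsets of the sample."""
    return sum(kernel(*s) for s in combinations(sample, m)) / comb(len(sample), m)


def sen_estimate(sample, kernel, m):
    """S_n^2 = (n-1)^{-1} sum_i (U_n^{(i)}(X_i) - U_n)^2, an estimate of zeta_1."""
    n = len(sample)
    u_n = u_statistic(sample, kernel, m)
    total = 0.0
    for i, x_i in enumerate(sample):
        others = sample[:i] + sample[i + 1:]
        # U_n^{(i)}(X_i): average of the kernel over the subsets that contain X_i.
        u_i = sum(kernel(x_i, *rest) for rest in combinations(others, m - 1))
        u_i /= comb(n - 1, m - 1)
        total += (u_i - u_n) ** 2
    return total / (n - 1)


def studentized(sample, kernel, m, theta):
    """sqrt(n) * (U_n - theta) / (m * S_n)."""
    n = len(sample)
    s_n = sqrt(sen_estimate(sample, kernel, m))
    return sqrt(n) * (u_statistic(sample, kernel, m) - theta) / (m * s_n)


data = [2.0, 3.5, 1.0, 4.5, 3.0]  # hypothetical observations (a list)
s2_mean = sen_estimate(data, lambda a: a, 1)
s2_var = sen_estimate(data, lambda a, b: (a - b) ** 2 / 2, 2)

print(s2_mean, variance(data))
print(s2_var)
```

For the degree-1 kernel $\phi(x) = x$, $U_n^{(i)}(X_i) = X_i$ and the estimate reduces to the usual sample variance, which gives a quick sanity check.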
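Very recently, the rate of convergence to normality for such studentized U-statistics has been obtained by Callaert and Veraverbeke (1981). They show that if $E|\phi(X_1, \ldots, X_m)|^{4.5} < \infty$, then

$$\sup_{x} \left| P\!\left( \sqrt{n}\,(U_n - \theta(F)) / (m S_n) \le x \right) - \Phi(x) \right| = O(n^{-1/2}).$$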
For a generalization of Theorem 1 to vector-valued U-statistics, consider a vector $(U_{n1}, \ldots, U_{nk})$, where $U_{ni}$ is a one-sample U-statistic with kernel $\phi_i$ and degree $m_i$, i.e.

$$U_{ni} = \binom{n}{m_i}^{-1} \sum \cdots \sum_{1 \le j_1 < \cdots < j_{m_i} \le n} \phi_i(X_{j_1}, \ldots, X_{j_{m_i}}),$$

where $n \ge \max(m_1, \ldots, m_k)$. Also, let $\theta_i = E_F \phi_i(X_1, \ldots, X_{m_i})$, $\phi_{i1}(x) = E_F[\phi_i(X_1, \ldots, X_{m_i}) \mid X_1 = x]$, $\psi_{i1}(x) = \phi_{i1}(x) - \theta_i$, and $\sigma_{ij} = \operatorname{Cov}_F[\phi_{i1}(X_1), \phi_{j1}(X_1)]$, $1 \le i, j \le k$. It is assumed that $\sigma_{ii}$ is positive and finite for all $i = 1, \ldots, k$. Then, $\sqrt{n}\,(U_{n1} - \theta_1, \ldots, U_{nk} - \theta_k)$ converges in distribution to $N(0, \Sigma)$, where $\Sigma = ((m_i m_j \sigma_{ij}))$. To prove this result, one uses a decomposition similar to (2.3)-(2.8) for each $U_{ni}$, shows that the remainder term, multiplied by $\sqrt{n}$, converges to zero in mean square for each component, and applies the multidimensional central limit theorem to the vector of principal terms $(m_1 n^{-1} \sum_{j=1}^{n} \psi_{11}(X_j), \ldots, m_k n^{-1} \sum_{j=1}^{n} \psi_{k1}(X_j))$.

Hoeffding (1948) also obtained a central limit theorem for U-statistics in the case when the $X_i$'s are independent, but not iid. This can be proved by using a decomposition similar to (but more complicated than) (2.8). In fact, such a decomposition for more general statistics is now available in the literature (see Hájek, 1968; Efron and Stein, 1981; and Karlin and Rinott, 1982). We do not exhibit this decomposition in detail, but introduce some notation needed to obtain the variances of U-statistics in the non-iid case, and to state a corresponding central limit theorem. Let
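$$\phi_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}}(x_{\alpha_1}, \ldots, x_{\alpha_c}) = E[\phi(X_{\alpha_1}, \ldots, X_{\alpha_c}, X_{\beta_1}, \ldots, X_{\beta_{m-c}}) \mid X_{\alpha_1} = x_{\alpha_1}, \ldots, X_{\alpha_c} = x_{\alpha_c}],$$

$$\psi_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}}(x_{\alpha_1}, \ldots, x_{\alpha_c}) = \phi_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}}(x_{\alpha_1}, \ldots, x_{\alpha_c}) - E\phi(X_{\alpha_1}, \ldots, X_{\alpha_c}, X_{\beta_1}, \ldots, X_{\beta_{m-c}}),$$

$$\zeta_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}; \gamma_1, \ldots, \gamma_{m-c}} = E[\psi_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}}(X_{\alpha_1}, \ldots, X_{\alpha_c})\, \psi_{c(\alpha_1, \ldots, \alpha_c)\gamma_1, \ldots, \gamma_{m-c}}(X_{\alpha_1}, \ldots, X_{\alpha_c})],$$

$$\zeta_{c,n} = \left\{ \binom{n}{2m-c} \binom{2m-c}{c} \binom{2m-2c}{m-c} \right\}^{-1} \sum \zeta_{c(\alpha_1, \ldots, \alpha_c)\beta_1, \ldots, \beta_{m-c}; \gamma_1, \ldots, \gamma_{m-c}},$$

where the summation extends over all $1 \le \alpha_1 < \cdots < \alpha_c \le n$, $1 \le \beta_1 < \cdots < \beta_{m-c} \le n$ and $1 \le \gamma_1 < \cdots < \gamma_{m-c} \le n$ such that the $\alpha$'s, $\beta$'s and $\gamma$'s are all distinct. Then, one can write

$$V(U_n) = \binom{n}{m}^{-1} \sum_{c=1}^{m} \binom{m}{c} \binom{n-m}{m-c} \zeta_{c,n}.$$

Also, let $\bar{\phi}_{1(i)}(X_i) = \binom{n-1}{m-1}^{-1} \sum \phi_{1(i)\alpha_2, \ldots, \alpha_m}(X_i)$, where the summation extends over all $1 \le \alpha_2 < \cdots < \alpha_m \le n$ with each $\alpha_j \ne i$. Let $\bar{\psi}_{1(i)}(X_i) = \bar{\phi}_{1(i)}(X_i) - E\bar{\phi}_{1(i)}(X_i)$, $i = 1, \ldots, n$. Then we have the following theorem.

THEOREM 4. Suppose, for some $\delta > 0$,

(i) $\sup_{1 \le \alpha_1 < \cdots < \alpha_m \le n} E\phi^2(X_{\alpha_1}, \ldots, X_{\alpha_m}) < \infty$;

(ii) $E|\bar{\psi}_{1(i)}(X_i)|^{2+\delta} < \infty$ for all $i = 1, \ldots, n$, and

(iii) $\lim_{n \to \infty} \sum_{i=1}^{n} E|\bar{\psi}_{1(i)}(X_i)|^{2+\delta} / \{\sum_{i=1}^{n} E\bar{\psi}_{1(i)}^2(X_i)\}^{1 + \delta/2} = 0$.

Then,

$$(U_n - EU_n) / V^{1/2}(U_n) \xrightarrow{L} N(0, 1).$$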
Rates of convergence to normality for U-statistics in the non-iid case are available in Ghosh and Dasgupta (1982), and also in the Ph.D. thesis of Janssen (1978).
3. Multisample U-statistics

U-statistics can be usefully extended to the multisample case. Lehmann (1951) considered the two-sample extension of U-statistics. The general c-sample extension of such statistics is mentioned in Lehmann (1963).

In the two-sample case, let $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ be independent random samples from distributions with distribution functions $F(x)$ and $G(y)$ respectively. A parameter $\theta(F, G)$ is said to be estimable of degree $(m_1, m_2)$ for distributions $(F, G)$ in a family $\mathcal{F}$ if $m_1$ and $m_2$ are the smallest sample sizes for which there exists an estimator of $\theta(F, G)$ that is unbiased for every $(F, G) \in \mathcal{F}$, that is, there exists a function $\phi$ such that

$$E_{F,G}\, \phi(X_1, \ldots, X_{m_1}, Y_1, \ldots, Y_{m_2}) = \theta(F, G). \tag{3.1}$$
Once again, without any loss of generality, $\phi$ can be assumed to be symmetric in its $X_i$ components, and separately in the $Y_j$ components. For an estimable parameter $\theta$ of degree $(m_1, m_2)$, and with a symmetric kernel $\phi$, a two-sample U-statistic is defined as

$$U_{n_1, n_2} = \binom{n_1}{m_1}^{-1} \binom{n_2}{m_2}^{-1} \sum \phi(X_{\alpha_1}, \ldots, X_{\alpha_{m_1}}, Y_{\beta_1}, \ldots, Y_{\beta_{m_2}}), \tag{3.2}$$

where the summation extends over all possible $1 \le \alpha_1 < \cdots < \alpha_{m_1} \le n_1$ and $1 \le \beta_1 < \cdots < \beta_{m_2} \le n_2$. Important examples of two-sample U-statistics are $\bar{Y}_{n_2} - \bar{X}_{n_1}$ and the Mann-Whitney U-statistic

$$U_{n_1, n_2} = (n_1 n_2)^{-1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} I_{[Y_j \ge X_i]}.$$
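Definition (3.2) translates directly into a double enumeration over subsets of the two samples. The sketch below is a brute-force illustration under that reading; the function names are hypothetical, and the kernels receive the selected $X$'s and $Y$'s as two tuples.

```python
from itertools import combinations
from math import comb


def two_sample_u(x, y, kernel, m1, m2):
    """Average of kernel(xs, ys) over all m1-subsets of x and m2-subsets of y."""
    total = 0.0
    for xs in combinations(x, m1):
        for ys in combinations(y, m2):
            total += kernel(xs, ys)
    return total / (comb(len(x), m1) * comb(len(y), m2))


def mann_whitney_kernel(xs, ys):
    # Indicator of the event Y >= X; degree (1, 1).
    return 1.0 if ys[0] >= xs[0] else 0.0


def mean_difference_kernel(xs, ys):
    # Unbiased for E(Y) - E(X); degree (1, 1).
    return ys[0] - xs[0]


x = [1.2, -0.4, 3.1]          # hypothetical first sample
y = [0.7, 2.5, 4.0, -1.3]     # hypothetical second sample

u_mw = two_sample_u(x, y, mann_whitney_kernel, 1, 1)
u_diff = two_sample_u(x, y, mean_difference_kernel, 1, 1)
print(u_mw, u_diff)
```

With these data 6 of the 12 pairs satisfy $Y_j \ge X_i$, so the Mann-Whitney statistic is 0.5, and the second statistic equals the difference of the two sample means.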
Using the notation

$$\phi_{c,d}(x_1, \ldots, x_c, y_1, \ldots, y_d) = E[\phi(X_1, \ldots, X_{m_1}, Y_1, \ldots, Y_{m_2}) \mid X_1 = x_1, \ldots, X_c = x_c, Y_1 = y_1, \ldots, Y_d = y_d] \tag{3.3}$$

and $\zeta_{c,d} = V_{F,G}[\phi_{c,d}(X_1, \ldots, X_c, Y_1, \ldots, Y_d)]$, one gets an expression for $V[U_{n_1, n_2}]$ as

$$V[U_{n_1, n_2}] = \binom{n_1}{m_1}^{-1} \binom{n_2}{m_2}^{-1} \sum_{c=0}^{m_1} \sum_{d=0}^{m_2} \binom{m_1}{c} \binom{n_1 - m_1}{m_1 - c} \binom{m_2}{d} \binom{n_2 - m_2}{m_2 - d} \zeta_{c,d}, \tag{3.4}$$

where $\zeta_{0,0}$ is defined as 0. The central limit theorem for two-sample U-statistics is now stated as follows.

THEOREM 5.
If

(i) $E\phi^2(X_1, \ldots, X_{m_1}, Y_1, \ldots, Y_{m_2}) < \infty$,

(ii) $0 < \lambda = \lim_{N \to \infty} n_1 / N < 1$ (where $N = n_1 + n_2$), and

(iii) $\sigma^2 = m_1^2 \zeta_{1,0} / \lambda + m_2^2 \zeta_{0,1} / (1 - \lambda) > 0$,

then $\sqrt{N}\,[U_{n_1, n_2} - \theta(F, G)] / \sigma \xrightarrow{L} N(0, 1)$ as $N \to \infty$.

As an application of this theorem, consider the case of the Mann-Whitney U-statistic. In this case, $\phi_{1,0}(x) = 1 - G(x-)$, $\phi_{0,1}(y) = F(y)$, so that $\zeta_{1,0} = V_F[1 - G(X-)] = V_F[G(X-)]$ and $\zeta_{0,1} = V_G[F(Y)]$. Condition (i) of Theorem 5 is trivially satisfied, since $\phi$ is bounded. Hence, if (ii) and (iii) of Theorem 5 hold, then $\sqrt{N}\,(U_{n_1, n_2} - \int F(x)\, dG(x)) / \sigma \xrightarrow{L} N(0, 1)$.

A Hoeffding type decomposition for two-sample U-statistics is possible, and using such a decomposition, one can prove Theorem 5. Rates of convergence to normality for two-sample U-statistics are given in the Ph.D. thesis of Janssen (1978). A strong law of large numbers for two-sample U-statistics extending the one-sample backward martingale argument is given in Arvesen (1969). For c-sample U-statistics, Sen (1977b) obtained the strong convergence of U-statistics under the condition $E\{|\phi| (\log^+ |\phi|)^{c-1}\} < \infty$.

To extend the above ideas to c-sample U-statistics, let $X_{ij}$ ($j = 1, \ldots, n_i$, $i = 1, \ldots, c$) be $c$ independent samples, the $i$th sample having distribution function $F_i(x)$ ($i = 1, \ldots, c$). A parameter $\theta(F_1, \ldots, F_c)$ is said to be estimable of degree $(m_1, \ldots, m_c)$ for distributions $(F_1, \ldots, F_c)$ in a family $\mathcal{F}$ if $m_1, \ldots, m_c$ are the smallest sample sizes for which there exists an estimator of $\theta(F_1, \ldots, F_c)$ that is unbiased for every $(F_1, \ldots, F_c) \in \mathcal{F}$, i.e., there exists a function $\phi$ such that

$$E_{F_1, \ldots, F_c}\, \phi(X_{11}, \ldots, X_{1m_1}, \ldots, X_{c1}, \ldots, X_{cm_c}) = \theta(F_1, \ldots, F_c). \tag{3.5}$$

Also, $\phi$ is assumed to be separately symmetric in the arguments $X_{i1}, \ldots, X_{im_i}$ for each $i = 1, \ldots, c$. The c-sample U-statistic with kernel $\phi$ and degree $(m_1, \ldots, m_c)$ is defined as

$$U_{n_1, \ldots, n_c} = \binom{n_1}{m_1}^{-1} \cdots \binom{n_c}{m_c}^{-1} \sum_{\alpha_1 \in A_1} \cdots \sum_{\alpha_c \in A_c} \phi(X_{1\alpha_{11}}, \ldots, X_{1\alpha_{1m_1}}, \ldots, X_{c\alpha_{c1}}, \ldots, X_{c\alpha_{cm_c}}), \tag{3.6}$$
where $\alpha_i = (\alpha_{i1}, \ldots, \alpha_{im_i})$, and $A_i$ is the collection of all subsets of $m_i$ integers chosen without replacement from the integers $\{1, \ldots, n_i\}$, $i = 1, \ldots, c$.

To obtain the asymptotic normality of a properly standardized c-sample U-statistic, we proceed as follows. Let $i$ be a fixed integer satisfying $1 \le i \le c$. Define

$$\phi_{i1} = \phi(X_{1\alpha_{11}}, \ldots, X_{1\alpha_{1m_1}}, \ldots, X_{c\alpha_{c1}}, \ldots, X_{c\alpha_{cm_c}})$$

and

$$\phi_{i2} = \phi(X_{1\beta_{11}}, \ldots, X_{1\beta_{1m_1}}, \ldots, X_{c\beta_{c1}}, \ldots, X_{c\beta_{cm_c}}),$$

where the two sets $(\alpha_{j1}, \ldots, \alpha_{jm_j})$ and $(\beta_{j1}, \ldots, \beta_{jm_j})$ have no integers in common whenever $j \ne i$ and exactly one integer in common when $j = i$. Let

$$\zeta_{0, \ldots, 0, 1, 0, \ldots, 0} = \operatorname{Cov}_{F_1, \ldots, F_c}(\phi_{i1}, \phi_{i2}),$$

where the only 1 in the subscript of $\zeta_{0, \ldots, 0, 1, 0, \ldots, 0}$ is in the $i$th position. Also, let $N = \sum_{i=1}^{c} n_i$. The following c-sample U-statistic theorem is due to Lehmann (1963).

THEOREM 5(a). Let $U_N \equiv U(X_{11}, \ldots, X_{1n_1}, \ldots, X_{c1}, \ldots, X_{cn_c})$ be a c-sample U-statistic with kernel $\phi$ and degree $(m_1, \ldots, m_c)$ for the parameter $\theta = \theta(F_1, \ldots, F_c)$. If

$$\lim_{N \to \infty} (n_i / N) = \lambda_i \quad (0 < \lambda_i < 1) \quad \text{for } i = 1, \ldots, c$$

and

$$E_{F_1, \ldots, F_c}\, \phi^2(X_{11}, \ldots, X_{1m_1}, \ldots, X_{c1}, \ldots, X_{cm_c}) < \infty,$$

then $\sqrt{N}\,(U_N - \theta) \xrightarrow{L} N(0, \sigma^2)$, where $\sigma^2 = \sum_{i=1}^{c} m_i^2 \lambda_i^{-1} \zeta_{0, \ldots, 0, 1, 0, \ldots, 0}$ (with the 1 in the $i$th position of the $i$th summand).
4. Jackknifed and bootstrapped U-statistics

The jackknife, originally proposed by Quenouille (1956), and later by Tukey (1957), has proved to be a very useful tool in nonparametric bias and variance estimation. To introduce the idea of the jackknife, let $\eta$ be an unknown parameter of interest, and let $X_1, \ldots, X_N$ be $N$ ($= nk$) iid random variables with common distribution function $F_\eta$. The $N$ observations are divided into $n$ groups of $k$ observations each. Let $\hat{\eta}_n$ be an estimator of $\eta$ based on all the $N$ observations, and let $\hat{\eta}_{n-1}^{i}$ denote the estimator obtained after deletion of the $i$th group of observations. Let

$$\tilde{\eta}_i = n\hat{\eta}_n - (n-1)\hat{\eta}_{n-1}^{i}, \quad i = 1, \ldots, n. \tag{4.1}$$

The $\tilde{\eta}_i$'s are called pseudo-values by Tukey. Then the jackknife estimator of $\eta$ is given by

$$\tilde{\eta}_n = n^{-1} \sum_{i=1}^{n} \tilde{\eta}_i. \tag{4.2}$$
Motivated by the study of robust procedures in Model II ANOVA problems, Arvesen (1969) studied very extensively the limit laws for jackknifed U-statistics and functions of them. We consider U-statistics with kernel $\phi$ and degree $m$. For simplicity, consider the case $k = 1$. Let

$$U_{n-1}^{i} = \binom{n-1}{m}^{-1} \sum_{C_{n-1}^{i}} \phi(X_{\beta_1}, \ldots, X_{\beta_m}),$$

where $C_{n-1}^{i}$ denotes that the summation is over all combinations $(\beta_1, \ldots, \beta_m)$ of $m$ integers chosen from $(1, \ldots, i-1, i+1, \ldots, n)$. Let $g$ be a real valued function, and let

$$\hat{\eta}_n = g(U_n), \quad \eta = g(\theta(F)), \quad \hat{\eta}_{n-1}^{i} = g(U_{n-1}^{i}),$$

$$\tilde{\eta}_i = n\hat{\eta}_n - (n-1)\hat{\eta}_{n-1}^{i}, \quad \tilde{\eta}_n = n^{-1} \sum_{i=1}^{n} \tilde{\eta}_i$$

and

$$S_g^2 = (n-1)^{-1} \sum_{i=1}^{n} (\tilde{\eta}_i - \tilde{\eta}_n)^2.$$
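The pseudo-values, the jackknife estimator and $S_g^2$ for a function $g$ of a U-statistic can be computed as in the following sketch for the case $k = 1$. The function names and data are illustrative assumptions; the sample is taken to be a Python list with more than $m$ observations.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, variance


def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel over all m-subsets of the sample."""
    return sum(kernel(*s) for s in combinations(sample, m)) / comb(len(sample), m)


def jackknife(sample, kernel, m, g):
    """Return (jackknife estimate, S_g^2, pseudo-values) for g(U_n), with k = 1."""
    n = len(sample)
    eta_hat = g(u_statistic(sample, kernel, m))
    pseudo = []
    for i in range(n):
        reduced = sample[:i] + sample[i + 1:]          # delete the i-th observation
        eta_minus_i = g(u_statistic(reduced, kernel, m))
        pseudo.append(n * eta_hat - (n - 1) * eta_minus_i)
    eta_tilde = sum(pseudo) / n
    s2_g = sum((v - eta_tilde) ** 2 for v in pseudo) / (n - 1)
    return eta_tilde, s2_g, pseudo


data = [2.0, 3.5, 1.0, 4.5, 3.0]  # hypothetical observations

# Sanity check: for g(u) = u and the kernel phi(x) = x the pseudo-values are the data.
est_mean, s2_mean, pseudo_mean = jackknife(data, lambda a: a, 1, lambda u: u)

# Jackknifed standard deviation: g = sqrt applied to the sample variance U-statistic.
est_sd, s2_sd, _ = jackknife(data, lambda a, b: (a - b) ** 2 / 2, 2, sqrt)

print(est_mean, s2_mean)
print(est_sd, s2_sd)
```

In the identity case the jackknife estimator is the sample mean and $S_g^2$ is the sample variance, consistent with $m^2 \zeta_1 (g'(\theta))^2 = V(X_1)$ for that kernel.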
The following two theorems are proved by Arvesen (1969). The first one provides a central limit theorem for suitably normalized $\tilde{\eta}_n$. The second theorem provides a consistent estimator of the variance of the limiting distribution of $\tilde{\eta}_n$.

THEOREM 6. Suppose $E\phi^2(X_1, \ldots, X_m) < \infty$. Let $g$ be a real valued function defined in a neighbourhood of $\theta$ which has a bounded second derivative. Then

$$\sqrt{n}\,(\tilde{\eta}_n - \eta) \xrightarrow{L} N(0, m^2 \zeta_1 (g'(\theta))^2) \quad \text{as } n \to \infty.$$

THEOREM 7. Suppose $E\phi^2(X_1, \ldots, X_m) < \infty$. Let $g$ be a real valued function defined in a neighbourhood of $\theta$ which has a continuous first derivative. Then,

$$S_g^2 \xrightarrow{P} m^2 \zeta_1 (g'(\theta))^2.$$

A further generalization of the jackknife was introduced by Efron (1979) under the rubric bootstrap. Bootstrapping is a resampling scheme in which the statistician learns the sampling properties of a statistic by recomputing its value on the basis of a new sample realized from the original one in a certain way. This is made more precise as follows. Consider the one-sample situation in which a random sample $X = (X_1, \ldots, X_n)$ is observed from a completely unspecified distribution $F$. The observed realization of $X$ is denoted by $x = (x_1, \ldots, x_n)$. The bootstrap method of realizing a sample can now be described as follows.

A. Construct the empirical distribution function $F_n$ based on the observed realization $x$ of $X$.

B. With $F_n$ fixed, draw a random sample of size $N$ from $F_n$, say $X^* = (X_1^*, \ldots, X_N^*)$. Then, conditional on $X = x$ fixed, the $X_i^*$'s are iid with distribution function $F_n$. The bootstrapped sample is now denoted by $X^*$.

C. The above procedure can be repeated any number of times.

A U-statistic $U_N^*$ with kernel $\phi$ and degree $m$ ($N \ge m$) based on the bootstrapped sample $X^*$ is given by

$$U_N^* = \binom{N}{m}^{-1} \sum \cdots \sum_{1 \le j_1 < \cdots < j_m \le N} \phi(X_{j_1}^*, \ldots, X_{j_m}^*), \quad N \ge m. \tag{4.3}$$
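Steps A to C and definition (4.3) can be sketched in a few lines. Drawing from the empirical distribution $F_n$ is implemented here as sampling with replacement from the observed values; the function names, the seed handling and the data are assumptions made for this illustration.

```python
import random
from itertools import combinations
from math import comb


def u_statistic(sample, kernel, m):
    """Average of a symmetric kernel over all m-subsets of the sample."""
    return sum(kernel(*s) for s in combinations(sample, m)) / comb(len(sample), m)


def bootstrap_u_statistics(observed, kernel, m, size, replicates, seed=0):
    """Return `replicates` values of U*_N, each from a resample of `size` draws."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    values = []
    for _ in range(replicates):
        # Step B: an iid sample of size N from the empirical distribution F_n.
        resample = [rng.choice(observed) for _ in range(size)]
        values.append(u_statistic(resample, kernel, m))
    return values


def variance_kernel(a, b):
    return (a - b) ** 2 / 2


data = [2.0, 3.5, 1.0, 4.5, 3.0]  # hypothetical observed realization x
u_n = u_statistic(data, variance_kernel, 2)
u_star = bootstrap_u_statistics(data, variance_kernel, 2, size=6, replicates=200, seed=1)

# Spread of the bootstrap replicates around U_n (a Monte Carlo summary only).
spread = sum((v - u_n) ** 2 for v in u_star) / len(u_star)
print(u_n, spread)
```

The empirical distribution of $\sqrt{N}(U_N^* - U_n)$ over the replicates is what the bootstrap uses in place of the unknown sampling distribution of $\sqrt{n}(U_n - \theta(F))$.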
Bickel and Freedman (1981) have developed a very extensive central limit theory for many bootstrapped statistics including U-statistics. Under the assumption $E\phi^2(X_1, \ldots, X_m) < \infty$, they showed that

$$\sqrt{N}\,(U_N^* - U_n) \xrightarrow{L} N(0, m^2 \zeta_1) \quad \text{as } \min(n, N) \to \infty.$$

Following Sen (1960), Athreya, Ghosh, Low and Sen (1984) proposed the estimator

$$S_N^2 = (N-1)^{-1} \sum_{i=1}^{N} \left[ \binom{N-1}{m-1}^{-1} \sum \cdots \sum \phi(X_i^*, X_{j_2}^*, \ldots, X_{j_m}^*) - U_N^* \right]^2, \tag{4.4}$$

where the inner summation extends over all $1 \le j_2 < \cdots < j_m \le N$ with each $j_k \ne i$, for $\zeta_1$. This estimator multiplied by $m^2$ can be easily recognized as the jackknifed estimator of the variance of the asymptotic distribution of bootstrapped U-statistics. Athreya, Ghosh, Low and Sen (1983) showed that under the assumption that $E\phi^2(X_1, \ldots, X_m) < \infty$,

$$m^2 S_N^2 \xrightarrow{P} m^2 \zeta_1 \quad \text{as } \min(n, N) \to \infty.$$

Accordingly, the bootstrapped pivot

$$\sqrt{N}\,(U_N^* - U_n) / (m S_N) \xrightarrow{L} N(0, 1) \quad \text{as } \min(n, N) \to \infty.$$

Also, a strong law of large numbers for bootstrapped U-statistics was proved by Athreya et al. (1983) under the assumption that $E_F|\phi(X_1, \ldots, X_m)|^{1+\delta} < \infty$ for some $\delta > 0$.
5. Functional central limit theorems for U-statistics
Central limit theorems for U-statistics developed earlier in Sections 2 and 3 can be generalized in a stochastic process setting. The need for such a generalization is discussed in Billingsley (1968), and in Serfling (1980).
Consider once again U-statistics with kernel $\phi$ and degree $m$ with $E\phi^2(X_1, \ldots, X_m) < \infty$ and $\zeta_1 > 0$. With the sequence $\{U_n; n \ge m\}$ of U-statistics, consider two associated sequences of stochastic processes on the unit interval $[0, 1]$. The process pertaining to the past was introduced by Miller and Sen (1972). Define

$$Y_n(t) = 0, \quad 0 \le t \le \frac{m-1}{n}; \qquad Y_n(k/n) = \frac{k(U_k - \theta)}{(m^2 \zeta_1)^{1/2} n^{1/2}}, \quad k = m, m+1, \ldots, n; \tag{5.1}$$

$Y_n(t)$ ($0 \le t \le 1$) is defined elsewhere by linear interpolation. The process pertaining to the future was introduced by Loynes (1970). Define

$$Z_n(0) = 0; \qquad Z_n(t_{nk}) = (U_k - \theta) / V^{1/2}(U_n), \quad k \ge n, \quad t_{nk} = V(U_k) / V(U_n); \tag{5.2}$$

$$Z_n(t) = Z_n(t_{nk}), \quad t_{n,k+1} < t < t_{nk}.$$
For each $n$, $t_{nn}, t_{n,n+1}, \ldots$ form a sequence tending to 0, and $Z_n(\cdot)$ is a step function continuous from the left.

Before stating the limit laws for the processes $Y_n(\cdot)$ and $Z_n(\cdot)$, we need certain preliminaries. Consider a collection of random variables $T_1, T_2, \ldots$, and $T$ taking values in an arbitrary metric space $S$, and having respective probability measures $P_1, P_2, \ldots$, and $P$ defined on the Borel sets in $S$ (i.e. on the $\sigma$-field generated by the open sets with respect to the metric associated with $S$). We say that $P_n \Rightarrow P$ if $\lim_{n \to \infty} P_n(A) = P(A)$ for every Borel set $A$ in $S$ satisfying $P(\partial A) = 0$, where $\partial A$ = boundary of $A$ = closure of $A$ $-$ interior of $A$. In particular, if $S$ is a metrizable function space, then $P_n \Rightarrow P$ denotes 'convergence in distribution' of a sequence of stochastic processes to a limit stochastic process. In the Miller and Sen (1972) situation, we take $S = C[0, 1]$, the collection of all continuous functions on the unit interval $[0, 1]$. The space $C[0, 1]$ is metrized by

$$\rho(x, y) = \sup_{0 \le t \le 1} |x(t) - y(t)| \tag{5.3}$$

for $x = x(\cdot)$ and $y = y(\cdot)$ in $C[0, 1]$. Denote by $\mathcal{B}$ the class of Borel sets in $C[0, 1]$ relative to $\rho$. Denote by $Q_n$ the probability distribution of $Y_n(\cdot)$ in $C[0, 1]$, i.e. the probability measure on $(C, \mathcal{B})$ induced by the measure $P$ through the relation

$$Q_n(B) = P(\{\omega: Y_n(\cdot, \omega) \in B\}), \quad B \in \mathcal{B}. \tag{5.4}$$
In order that the sequence of processes $\{Y_n(\cdot)\}$ converges to a limit process $Y(\cdot)$ in the sense of convergence in distribution, we seek a measure $Q$ on $(C, \mathcal{B})$ such that $Q_n \Rightarrow Q$. The measure $Q$ in the Miller and Sen (1972) situation turns out to be the Wiener measure, that is, the probability measure $Q$ defined by the properties

(a) $Q(\{x(\cdot): x(0) = 0\}) = 1$;

(b) $Q(\{x(\cdot): x(t) \le a\}) = (2\pi t)^{-1/2} \int_{-\infty}^{a} \exp(-u^2 / 2t)\, du$ for all $0 < t \le 1$, $-\infty < a < \infty$;

(c) for $0 \le t_0 \le \cdots \le t_k \le 1$ and $-\infty < a_1, \ldots, a_k < \infty$,

$$Q(\{x(\cdot): x(t_i) - x(t_{i-1}) \le a_i,\ i = 1, \ldots, k\}) = \prod_{i=1}^{k} Q(\{x(\cdot): x(t_i) - x(t_{i-1}) \le a_i\}).$$
A random element of $C[0, 1]$ having the distribution $Q$ is called a Wiener process, and is denoted by $\{W(t), 0 \le t \le 1\}$ or simply by $W$. From (a), (b) and (c), it follows that (a)' $W(0) = 0$ with probability 1; (b)' $W(t)$ is $N(0, t)$ for each $t \in (0, 1]$; (c)' for $0 \le t_0 \le \cdots \le t_k \le 1$, the increments $W(t_1) - W(t_0), \ldots, W(t_k) - W(t_{k-1})$ are mutually independent. We are now in a position to state the main theorem of Miller and Sen (1972).

THEOREM 8. Assume that $E\phi^2(X_1, \ldots, X_m) < \infty$ and $\zeta_1 > 0$. Then $Y_n(\cdot) \xrightarrow{d} W(\cdot)$ in $C[0, 1]$.
For Loynes (1970), $S = D[0, 1]$, the class of all functions on $[0, 1]$ which are right continuous and for which left hand limits exist. The $D[0, 1]$ space is endowed with the following topology. Let $\Lambda$ denote the class of all strictly increasing continuous mappings of $[0, 1]$ onto itself. The Skorohod distance between $x$ and $y$ (both belonging to the $D[0, 1]$ space) is defined by

$$d(x, y) = \inf\{\varepsilon > 0: \text{there exists a } \lambda \text{ in } \Lambda \text{ such that } \sup_t |\lambda(t) - t| < \varepsilon \text{ and } \sup_t |x(t) - y(\lambda(t))| < \varepsilon\}. \tag{5.5}$$

Note that $d$ is a metric (called the Skorohod metric) which generates a topology on $D[0, 1]$. This is the so-called Skorohod topology, which we refer to as the $J_1$-topology. The main result of Loynes (1970) is as follows.

THEOREM 9. Assume the conditions of Theorem 8. Then, $Z_n(\cdot) \xrightarrow{d} W(\cdot)$ in the $J_1$-topology on $D[0, 1]$.
REMARK. Theorems 8 and 9 are useful in sequential analysis. Loynes (1970) cited Theorem 9, but did not check the validity of the needed regularity conditions. These conditions were verified later by Sen and Bhattacharyya (1977), and Loynes (1978).

For c-sample U-statistics, functional CLT's are due to Sen (1974) in the nondegenerate case, and Neuhaus (1977) in the degenerate case. Hall (1979) obtained a single result from which invariance principles in both the degenerate and nondegenerate cases followed as immediate corollaries.

6. Miscellaneous remarks
Closely related to the U-statistics are the von Mises V-statistics defined by

$$V_n = n^{-m} \sum_{\alpha_1 = 1}^{n} \cdots \sum_{\alpha_m = 1}^{n} \phi(X_{\alpha_1}, \ldots, X_{\alpha_m}). \tag{6.1}$$

It can be shown (see, for example, Ghosh and Sen, 1970) that if $E[\phi^2(X_{\alpha_1}, \ldots, X_{\alpha_m})] < \infty$ for all $1 \le \alpha_1, \ldots, \alpha_m \le n$, then $E(U_n - V_n)^2 = O(n^{-2})$. This implies that

$$n^r (U_n - V_n) \xrightarrow{P} 0 \quad \text{for any } r < 1.$$

Consequently, $n^{1/2}(V_n - \theta)$ has the same limiting distribution as $n^{1/2}(U_n - \theta)$. Miller and Sen (1972) obtained functional central limit theorems (of the type discussed in Section 5) for von Mises V-statistics. This requires a more subtle analysis than just showing $E(U_n - V_n)^2 = O(n^{-2})$.

Central limit laws for U-statistics in the m-dependent case were obtained by Sen (1963, 1965) and for sampling without replacement from a finite population by Nandi and Sen (1963). A functional central limit theorem for jackknifed U-statistics was obtained by Sen (1977a).

7. Linear rank statistics: Central limit theorems and rates of convergence
There are many linear rank statistics which are not U-statistics, and the development of limit distributions for such statistics requires a different analysis. Notable among such statistics are the so-called normal scores statistics introduced in Section 1. In this section, we present certain results concerning null and nonnull distributions of linear rank statistics. Recall from Section 1 that a linear rank statistic is expressible in the form
$$S_N = \sum_{i=1}^{N} c_N(i)\, a_N(R_{Ni}), \qquad (7.1)$$
where $R_{Ni}$ is the rank of $X_i$ among $X_1, \ldots, X_N$. Both the regression constants and the scores are indexed by N, i.e. they may change as N changes. Suppose that $X_i$ has a continuous distribution function $F_i$, $i = 1, \ldots, N$. First consider the null situation, i.e. where $F_1 \equiv \cdots \equiv F_N$. In this case the vector $(R_{N1}, \ldots, R_{NN})$ is equiprobable over the N! permutations of the first N positive integers. In this situation (see, for example, Hájek and Šidák, 1967; Randles and Wolfe, 1979; or Sen, 1981)
$$E(S_N) \equiv \mu_N = N \bar{c}_N \bar{a}_N; \qquad V(S_N) \equiv \sigma_N^2 = (N-1)^{-1} \sum_{i=1}^{N} \{c_N(i) - \bar{c}_N\}^2 \sum_{i=1}^{N} \{a_N(i) - \bar{a}_N\}^2, \qquad (7.2)$$
where $\bar{c}_N = N^{-1} \sum_{i=1}^{N} c_N(i)$, $\bar{a}_N = N^{-1} \sum_{i=1}^{N} a_N(i)$. Central limit theorems for suitably normalized $S_N$ were first obtained by Wald and Wolfowitz (1944), and subsequently by Noether (1949), Hoeffding (1951), Dwass (1956), and Hájek (1961). Hájek (1961) obtained the result under minimal regularity conditions. His theorem is as follows.
THEOREM 10. Let $S_N$ be defined as in (7.1) with regression constants $c_N(i)$ satisfying Noether's uniform asymptotic negligibility condition, namely,
$$\max_{1 \le i \le N} (c_N(i) - \bar{c}_N)^2 \Big/ \sum_{i=1}^{N} (c_N(i) - \bar{c}_N)^2 \to 0 \quad \text{as } N \to \infty. \qquad (7.3)$$
Assume also that the scores $a_N(i)$ are generated by $a_N(i) = \phi(i/(N+1))$, $i = 1, \ldots, N$, where $\phi$ is a square integrable score function on the unit interval [0, 1]. Then,
$$(S_N - \mu_N)/\sigma_N \xrightarrow{L} N(0, 1).$$
REMARK. In the above theorem, one could also take $a_N(i) = E\phi(U_{Ni})$, $i = 1, \ldots, N$, where $U_{N1} \le \cdots \le U_{NN}$ are the order statistics in a random sample of size N from the uniform [0, 1] distribution. In the case $a_N(i) = E\phi(U_{Ni})$ where $\int_0^1 |\phi(u)|\, du < \infty$, there is an alternate way of proving the conclusion of Theorem 10. Let $\mathcal{F}_N$ denote the $\sigma$-algebra generated by $R_{N1}, \ldots, R_{NN}$. Then, Sen and Ghosh (1972) have shown that $\{(S_N, \mathcal{F}_N);\ N \ge 1\}$ is a martingale sequence. The result can now be proved by appealing to some suitable martingale central limit theorem (see, for example, McLeish, 1974). The martingale idea has been fruitfully exploited in obtaining functional central limit theorems for linear rank statistics (see Section 9). In the two-sample case, Chernoff and Savage (1958) obtained in their pioneering paper the asymptotic nonnull distribution of $S_N$ under nonlocal alternatives. For local alternatives, Hájek (1962) employed the idea of 'contiguity' of probability measures, and obtained some useful limit theorems. Finally, Hájek (1968) obtained the limit distribution of $S_N$ under nonlocal alternatives. His main results are given below.
THEOREM 11. Consider the statistic $S_N$ given in (7.1) where the score function $a_N(i)$ is given either by $\phi(i/(N+1))$ or $E\phi(U_{Ni})$, where $\phi$ has a bounded second derivative. Assume that
$$\max_{1 \le i \le N} (c_N(i) - \bar{c}_N)^2 / V(S_N) \to 0 \quad \text{as } N \to \infty. \qquad (7.4)$$
Then
$$(S_N - E S_N)/V^{1/2}(S_N) \xrightarrow{L} N(0, 1). \qquad (7.5)$$
The assertion remains true if, in (7.4) and (7.5), we replace $V(S_N)$ by $\sigma_N^2 = \sum_{i=1}^{N} V[l_{Ni}(X_i)]$, where
$$l_{Ni}(x) = N^{-1} \sum_{j=1}^{N} (c_N(j) - c_N(i)) \int [I_{[y \ge x]} - F_i(y)]\, \phi'(\bar{F}_N(y))\, dF_j(y), \quad i = 1, \ldots, N,$$
where $\bar{F}_N(y) = N^{-1} \sum_{j=1}^{N} F_j(y)$.
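Returning to the null case, the moment formulas (7.2) can be checked by brute force for a small sample size. The sketch below is only an illustration under assumed inputs (two-sample constants and Wilcoxon scores with N = 6); the function names are hypothetical.

```python
from itertools import permutations
import numpy as np

def null_moments(c, a):
    """Mean and variance of S_N under the null hypothesis, following (7.2)."""
    c, a = np.asarray(c, float), np.asarray(a, float)
    n = len(c)
    mean = n * c.mean() * a.mean()
    var = ((c - c.mean()) ** 2).sum() * ((a - a.mean()) ** 2).sum() / (n - 1)
    return mean, var

def exact_null_moments(c, a):
    """The same moments by enumerating all N! equally likely rank vectors."""
    c, a = np.asarray(c, float), np.asarray(a, float)
    n = len(c)
    # ranks run over 1..N; perm[i] - 1 indexes the score attached to observation i
    values = np.array([np.dot(c, a[np.array(perm) - 1])
                       for perm in permutations(range(1, n + 1))])
    return values.mean(), values.var()   # population variance over the N! points

N = 6
c = np.array([0, 0, 0, 1, 1, 1])   # two-sample constants, so S_N is the rank sum W
a = np.arange(1, N + 1)            # Wilcoxon scores a_N(i) = i

formula = null_moments(c, a)       # (10.5, 5.25)
brute = exact_null_moments(c, a)   # agrees with the formula
```

For these inputs the values 10.5 and 5.25 coincide with the familiar Wilcoxon expressions n(N + 1)/2 and mn(N + 1)/12 with m = n = 3.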
The next theorem of Hájek (1968) does not require the assumption of boundedness of the second derivative of $\phi$.
THEOREM 12. Let $\phi(u) = \phi_1(u) - \phi_2(u)$, $0 < u < 1$, where the $\phi_i(u)$'s are nondecreasing, square integrable and absolutely continuous inside (0, 1), i = 1, 2. Assume that
$$N \max_{1 \le i \le N} (c_N(i) - \bar{c}_N)^2 / V(S_N) \to 0 \quad \text{as } N \to \infty. \qquad (7.6)$$
Then (7.5) holds. Also, in (7.5) and (7.6), $V(S_N)$ can be replaced by $\sigma_N^2$ as defined in Theorem 11. The proofs of these theorems employ the 'projection method' introduced earlier in Section 2 in connection with U-statistics. Hájek's (1968) projection lemma does not require the statistics to be symmetric in their arguments. The other major tool developed by Hájek (1968) is a powerful 'variance inequality' majorizing $V(S_N)$ when the distributions $F_1, \ldots, F_N$ are not necessarily equal. An important question left open in Hájek's (1968) paper is whether the centering constant $E(S_N)$ can be replaced by the simpler $\mu_N = \sum_{i=1}^{N} c_N(i) \int_{-\infty}^{\infty} \phi(\bar{F}_N(x))\, dF_i(x)$. This question has been answered in the affirmative by Hoeffding (1973) if the square integrability condition on $\phi_1$ and $\phi_2$ is slightly strengthened. Hoeffding's (1973) main theorem is as follows.
THEOREM 13. Let $\phi(u) = \phi_1(u) - \phi_2(u)$, where each $\phi_i$ is nondecreasing, absolutely continuous in (0, 1), and satisfies
$$\int_0^1 u^{1/2}(1 - u)^{1/2}\, d\phi_i(u) < \infty, \quad i = 1, 2. \qquad (7.7)$$
Then, if $a_N(i) = E\phi(U_{Ni})$, $1 \le i \le N$, the conclusion of Theorem 12 remains valid with $E(S_N)$ replaced by $\mu_N$. If $a_N(i) = \phi(i/(N+1))$, $1 \le i \le N$, then the conclusion of Theorem 12 remains valid with $E(S_N)$ replaced by
$$\mu_N' = \mu_N + \bar{c}_N \Big\{ \sum_{i=1}^{N} \phi(i/(N+1)) - N \int_0^1 \phi(u)\, du \Big\}.$$
If, in addition, $|\bar{c}_N| / \max_{1 \le i \le N} |c_N(i) - \bar{c}_N|$ is bounded, $E(S_N)$ can be replaced by $\mu_N$ even when $a_N(i) = \phi(i/(N+1))$.
For one sample signed rank statistics $S_N^+$, analogues of Theorems 11 and 12 have been obtained by Hušková (1970). Pyke and Shorack (1968) used an alternative approach in obtaining the limit distributions of linear rank statistics in the two sample case. Next, in this section, we consider rates of convergence to normality for the simple linear rank statistics $S_N$. Results in this direction were obtained by Jurečková and Puri (1975), and later by Hušková (1977) and Bergström and Puri (1977). The method of proof consists in approximating the simple linear rank statistic by a sum of independent random variables, and establishing, for arbitrary r, a suitable bound on the rth moment of the error of approximation. The following assumptions are made.
I. The scores are generated by a function $\phi(u)$, $0 \le u \le 1$, in either of the following two ways:
$$a_N(i) = \phi(i/(N+1)) \quad \text{or} \quad E\phi(U_{Ni}), \quad 1 \le i \le N.$$
II. $\phi$ has a bounded second derivative.
III. The regression constants satisfy
$$\max_{1 \le i \le N} (c_N(i) - \bar{c}_N)^2 \Big/ \sum_{i=1}^{N} (c_N(i) - \bar{c}_N)^2 = O(N^{-1} \log N). \qquad (7.8)$$
IV. $\liminf_{N \to \infty} V(S_N) > 0$.
Then the following theorem is true (see Serfling, 1980).
THEOREM 14.
Assume I-IV. Then, for every $\varepsilon > 0$,
$$\sup_x \left| P\left[ \frac{S_N - E S_N}{V^{1/2}(S_N)} \le x \right] - \Phi(x) \right| = O(N^{-1/2 + \varepsilon}). \qquad (7.9)$$
The assertion (7.9) remains true if $V(S_N)$ is replaced by $\sigma_N^2$, where $\sigma_N^2$ is defined in Theorem 11. Both the assertions remain true with $E(S_N)$ replaced by $\mu_N$, where $\mu_N$ is defined in Theorem 13.
REMARK. Edgeworth expansions for linear rank statistics have been obtained by Albers, Bickel and van Zwet (1976), and Bickel and van Zwet (1978).
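The left-hand side of (7.9) can be computed exactly in simple null cases. The following Python sketch does this for the two-sample Wilcoxon rank sum with equal sample sizes, using a subset-sum recursion for the exact null distribution. The function names and the sample sizes are assumptions made for illustration, and the output only shows that the distance shrinks with N; it does not by itself establish any particular rate.

```python
import math
import numpy as np

def rank_sum_pmf(m, n):
    """Exact null pmf of the rank sum W of the n Y-ranks among N = m + n.

    counts[k][s] = number of k-subsets of {1, ..., N} with sum s (subset-sum DP).
    """
    N = m + n
    max_sum = N * (N + 1) // 2
    counts = np.zeros((n + 1, max_sum + 1))
    counts[0, 0] = 1.0
    for r in range(1, N + 1):
        for k in range(min(r, n), 0, -1):          # descending k: each rank is used at most once
            counts[k, r:] += counts[k - 1, :max_sum + 1 - r]
    pmf = counts[n] / counts[n].sum()
    support = np.nonzero(pmf)[0]
    return support, pmf[support]

def kolmogorov_distance(m, n):
    """sup_x |P((W - EW)/sd(W) <= x) - Phi(x)| under the null hypothesis."""
    N = m + n
    w, p = rank_sum_pmf(m, n)
    mean = n * (N + 1) / 2.0
    sd = math.sqrt(m * n * (N + 1) / 12.0)
    phi = np.array([0.5 * (1.0 + math.erf((v - mean) / sd / math.sqrt(2.0))) for v in w])
    cdf = np.cumsum(p)
    # the supremum is attained at an atom, either just before the jump or at it
    return float(max(np.max(np.abs(cdf - phi)), np.max(np.abs(cdf - p - phi))))

distances = {N: kolmogorov_distance(N // 2, N // 2) for N in (10, 20, 40, 80)}
```

Here the constants are zeros and ones and the scores are $a_N(i) = i$, so the ratio in (7.8) equals 1/N and assumptions I-IV hold.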
8. Linear rank statistics: Strong laws
Sen (1970) obtained a strong law of large numbers for statistics of the form $S_N^+$ (introduced in (1.2)) when the $c_i$'s are all equal to 1 and $a_N(i) = \phi(i/(N+1))$, $1 \le i \le N$. The ideas of Sen can be extended to more general regression rank statistics of the type $S_N^+$, provided the $c_i$'s satisfy certain uniform asymptotic negligibility conditions. Sen and Ghosh (1972) have a result to this effect. To state the result of Sen and Ghosh (1972), let
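$$H_i(x) = F_i(x) - F_i(-x) \quad \text{and} \quad \bar{H}_N(x) = N^{-1} \sum_{i=1}^{N} H_i(x).$$
Define
$$C_N^2 = \sum_{i=1}^{N} c^2(i) \quad \text{and} \quad A_N^2 = N^{-1} \sum_{i=1}^{N} \{\phi^*(i/(N+1))\}^2,$$
where $\phi^*(u) = \phi(\tfrac{1}{2} + \tfrac{1}{2}u)$. It is also assumed that $\phi(u) + \phi(1 - u) = 0$ for all $0 < u < 1$. Define
$$\tilde{S}_N^+ = N^{-1/2} A_N^{-1} \sum_{i=1}^{N} c^*_{Ni}\, \mathrm{sgn}(X_i)\, \phi^*(R^+_{Ni}/(N+1)),$$
where $c^*_{Ni} = c_i / (\sum_{i=1}^{N} c_i^2)^{1/2}$, $1 \le i \le N$. Also let
$$\tilde{\mu}_N = N^{-1/2} A_N^{-1} \sum_{i=1}^{N} c^*_{Ni} \int \mathrm{sgn}(x)\, \phi^*(\bar{H}_N(|x|))\, dF_i(x),$$
where $\bar{H}_N(|x|) = N^{-1} \sum_{i=1}^{N} H_i(|x|)$. Then, we have the following theorem.
THEOREM 15. Suppose that $\int_0^1 |\phi(u)|^r\, du < \infty$ for some $r > 2$. Assume that $\max_{1 \le i \le N} |c^*_{Ni}| = O(N^{-1/2})$. Then $\tilde{S}_N^+ - \tilde{\mu}_N \to 0$ a.s. as $N \to \infty$.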
For the statistics $S_N$, a corresponding strong convergence result as proved in Sen and Ghosh (1972) is as follows.
THEOREM 16. Suppose that the score function $a_N(i)$ is the same as in Theorem 15 with $\int_0^1 |\phi(u)|^r\, du < \infty$ for some $r > 2$. Assume that $\max_{1 \le i \le N} |c^*_{Ni}| = O(N^{-1/2})$, where $c^*_{Ni} = (c_i - \bar{c}_N)/\{\sum_{i=1}^{N} (c_i - \bar{c}_N)^2\}^{1/2}$. Define
$$S_N^* = N^{-1/2} A_N^{-1} C_N^{-1} S_N \quad \text{and} \quad \mu_N^* = \int J(\bar{F}_N(x))\, dF_N^*(x),$$
where $F_N^*(x) = N^{-1/2} A_N^{-1} \sum_{i=1}^{N} c^*_{Ni} F_i(x)$ and $\bar{F}_N(x) = N^{-1} \sum_{i=1}^{N} F_i(x)$. Then $S_N^* - \mu_N^* \to 0$ a.s. as $N \to \infty$.
REMARK. Hájek (1974) proved a result similar to Theorem 16.
9. Linear rank statistics: Functional central limit theorems
Consider once again the statistics $S_N = \sum_{i=1}^{N} c(i) E[\phi(U_{N R_{Ni}})]$, where $U_{N1} \le \cdots \le U_{NN}$ are the order statistics in a random sample of size N from the uniform (0, 1) distribution. Assume that the score function $\phi$ is square integrable, and define $\bar{\phi} = \int_0^1 \phi(u)\, du$ and $A^2 = \int_0^1 \phi^2(u)\, du - \bar{\phi}^2$. Defining $A_N^2 = N^{-1} \sum_{i=1}^{N} [a_N(i) - \bar{a}_N]^2$, where $a_N(i) = E\phi(U_{Ni})$, $1 \le i \le N$, one gets the inequality $A_N^2 \le A^2$. In fact (see Sen, 1981, pp. 92-93), using the dominated convergence theorem one can show that $A_N^2 \to A^2$ as $N \to \infty$. Write $C_N^2 = \sum_{i=1}^{N} (c_N(i) - \bar{c}_N)^2$. For every $N \ge 1$, consider the stochastic process
$$Y_N(t) = S_{\tau_N(t)}/(C_N A_N), \quad \tau_N(t) = \max\{k : C_k^2 \le t C_N^2\}, \quad t \in (0, 1). \qquad (9.1)$$
(Conventionally, let $S_0 = S_1 = 0$ so that $Y_N(0) = 0$.) Then $Y_N$ belongs to D[0, 1] for every $N \ge 1$. The following theorem, proved in Sen (1981), is an improved version of a corresponding result of Sen and Ghosh (1972).
THEOREM 17. Suppose $\int_0^1 \phi^2(u)\, du < \infty$. Then, under $F_1 \equiv \cdots \equiv F_N$ and under (7.3), $Y_N \xrightarrow{d} W$ as $N \to \infty$ in the $J_1$-topology on D[0, 1].
The sequence $\{Y_N(\cdot),\ N \ge 1\}$ describes a process pertaining to the past. Similarly, one can define a process pertaining to the future. To this end, define $S_k^* = S_k / C_k^2$, $k \ge N \ge 2$; $S_1^* = 0$. Now, define the stochastic process
$$Z_N(t) = (C_N/A_N)\, S^*_{\tau_N(t)}, \quad \tau_N(t) = \min\{k : C_N^2/C_k^2 \le t\}, \quad t \in (0, 1). \qquad (9.2)$$
Then $Z_N$ belongs to the D[0, 1] space for every N. The following theorem is proved in Sen (1981).
THEOREM 18. Under the conditions of Theorem 17, as $N \to \infty$, $Z_N \xrightarrow{d} W$ in the $J_1$-topology on D[0, 1].
Theorems 17 and 18 relate to the null situation, namely, when $F_1 \equiv \cdots \equiv F_N$. For contiguous alternatives, functional central limit theorems of the above type were obtained by Sen and are reported very extensively in his recent book (see Sen, 1981, pp. 98-104). Functional central limit theorems for linear rank statistics under nonlocal alternatives remain an important open question. Next, consider the one sample case where $Z_1, \ldots, Z_N$ are iid with a common continuous distribution function F. Consider signed rank statistics of the form (1.2) with c(i) = 1 for all i, $a_N(i) = E[\phi(U_{Ni})]$, $1 \le i \le N$. Let $R^+_{Ni}$ = rank of $|X_i|$ among $|X_1|, \ldots, |X_N|$, $1 \le i \le N$ ...
Sen and Ghosh (1971) have shown that if $\int_0^1 |\phi(u)|\, du < \infty$, then $\{(S_N^+, \mathcal{F}_N),\ N \ge 1\}$ is a martingale sequence. Using this martingale structure, weak and strong invariance principles (functional central limit theorems) for one sample signed rank statistics can be established. The results are discussed in detail in Sen (1981). In what follows, we present a few of the important results. Once again, let $A_N^2 = N^{-1} \sum_{i=1}^{N} [a_N(i) - \bar{a}_N]^2$. Define $\tau_N(t) = \max\{k : k/N \le t\}$, $0 < t < 1$. Let $Y_N(t) = S^+_{\tau_N(t)}/(N^{1/2} A_N)$, $Y_N(0) = 0$. The following theorem is proved in Sen and Ghosh (1973).
THEOREM 19. If $\int_0^1 \phi^2(u)\, du < \infty$ and $F(x) + F(-x) = 1$ for all x, then $Y_N \xrightarrow{d} W$ as $N \to \infty$ in the $J_1$-topology on D[0, 1].
An analogous result for the tail sequence $Z_N = \{Z_N(t),\ 0 < t < 1\}$ is proved in Sen (1981). Define $\tau_N(t) = \min\{k : N/k \le t\}$, and $Z_N(t) = N^{1/2} S^+_{\tau_N(t)} / (A_N \tau_N(t))$. Then the following theorem is true.
THEOREM 20. Under the same assumptions as in Theorem 19, $Z_N \xrightarrow{d} W$ as $N \to \infty$ in the $J_1$-topology on D[0, 1].
The above theorem provides a functional central limit theorem for signed rank statistics under the null situation. Results similar to Theorems 19 and 20 are proved in Sen (1981) under contiguous alternatives.
10. Multivariate linear rank statistics: Permutational central limit theorems
Let $X_i = (X_{i1}, \ldots, X_{ip})'$, $i = 1, \ldots, n$, be n iid random vectors with a common continuous distribution function F defined on $R^p$, p being some positive integer. Let $c_i = (c_{i1}, \ldots, c_{iq})'$, $i = 1, \ldots, n$, $q \ge 1$. For each j (= 1, ..., p), let $R_{ij}$ denote the rank of $X_{ij}$ among $X_{1j}, \ldots, X_{nj}$ (i = 1, ..., n); F being assumed to be continuous, ties among the observations can be neglected in probability. Also, let $a_{nj}(1), \ldots, a_{nj}(n)$ (j = 1, ..., p) be a set of scores. Then, a set of multivariate linear rank statistics may be defined by
$$L_{njk} = \sum_{i=1}^{n} (c_{ik} - \bar{c}_{nk})\, a_{nj}(R_{ij}), \qquad (10.1)$$
where $\bar{c}_{nk} = n^{-1} \sum_{i=1}^{n} c_{ik}$. The statistics $L_{njk}$ defined in (10.1) appear in Puri and Sen (1969) in connection with hypothesis testing in the general linear model. Special cases have been considered earlier by Chatterjee and Sen (1964) for the bivariate two sample location problem when p = 2, q = 1, and the $c_{i1}$'s are ones or zeroes. Puri and Sen (1966) considered the multivariate multisample location problem, where p and q are general, but the $c_{ik}$'s are still ones or zeroes. It should be noted that unconditionally $L_n = ((L_{njk}))$ is not distribution-free.
However, it is permutationally (conditionally) distribution-free under the following rank permutation model due to Chatterjee and Sen (1964). Consider the rank-collection matrix $\mathbf{R}_n$ (of order p × n) defined by
$$\mathbf{R}_n = (R_1, \ldots, R_n) = \begin{pmatrix} R_{11} & \cdots & R_{n1} \\ \vdots & & \vdots \\ R_{1p} & \cdots & R_{np} \end{pmatrix}, \qquad (10.2)$$
where $R_i = (R_{i1}, \ldots, R_{ip})'$, $i = 1, \ldots, n$. Consider now a permutation of the columns of $\mathbf{R}_n$ such that the top row is in the natural order (i.e. 1, ..., n), and denote the resulting matrix (called the reduced rank-collection matrix) by $\mathbf{R}_n^*$. The totality of $(n!)^p$ rank-collection matrices may thus be partitioned into $(n!)^{p-1}$ subsets, where each subset corresponds to a particular reduced rank-collection matrix $\mathbf{R}_n^*$, and the subset $S(\mathbf{R}_n^*)$ has cardinality n!. The conditional distribution of $\mathbf{R}_n$ over $S(\mathbf{R}_n^*)$ is uniform irrespective of what F is. This conditional (permutational) probability measure is denoted by $\mathcal{P}_n$. Puri and Sen (1969) have shown that $E_{\mathcal{P}_n}(L_n) = 0$
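and
$$V_{\mathcal{P}_n}(L_n) = V_n \otimes C_n, \qquad (10.3)$$
where $V_n = ((v_{njj'}))$ has the elements
$$v_{njj'} = (n - 1)^{-1} \sum_{i=1}^{n} (a_{nj}(R_{ij}) - \bar{a}_{nj})(a_{nj'}(R_{ij'}) - \bar{a}_{nj'}); \qquad (10.4)$$
$$\bar{a}_{nj} = n^{-1} \sum_{i=1}^{n} a_{nj}(i), \quad j = 1, \ldots, p. \qquad (10.5)$$
Also
$$C_n = \sum_{i=1}^{n} (c_i - \bar{c}_n)(c_i - \bar{c}_n)', \quad \bar{c}_n = (\bar{c}_{n1}, \ldots, \bar{c}_{nq})'. \qquad (10.6)$$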
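The permutational moments (10.3)-(10.6) can be verified by direct enumeration when n is small. The Python sketch below does so for hypothetical data with n = 5, p = 2, q = 2; the data, the two score sets, and the row-by-row flattening of $L_n$ (which is what makes `np.kron(V_n, C_n)` the matching covariance layout) are all assumptions of this illustration.

```python
from itertools import permutations
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 5, 2, 2

X = rng.normal(size=(n, p))                       # hypothetical data, used only through its ranks
ranks = X.argsort(axis=0).argsort(axis=0) + 1     # ranks[i, j] = R_ij, ranked within coordinate j
c = rng.normal(size=(n, q))                       # hypothetical regression vectors c_i

# scores[j][r - 1] = a_nj(r); two arbitrary illustrative score sets
scores = [np.arange(1, n + 1) / (n + 1), (np.arange(1, n + 1) / (n + 1)) ** 2]
A = np.column_stack([scores[j][ranks[:, j] - 1] for j in range(p)])   # A[i, j] = a_nj(R_ij)

cc = c - c.mean(axis=0)                           # rows c_i - c_bar_n
C_n = cc.T @ cc                                   # (10.6)
Ac = A - A.mean(axis=0)
V_n = Ac.T @ Ac / (n - 1)                         # (10.4)-(10.5)

# Enumerate the n! column permutations of the rank-collection matrix.
L_all = np.array([(A[list(perm)].T @ cc).ravel() for perm in permutations(range(n))])

perm_mean = L_all.mean(axis=0)                    # vanishes, as in (10.3)
centered = L_all - perm_mean
perm_cov = centered.T @ centered / len(L_all)     # exact covariance over the n! equally likely points
kron_cov = np.kron(V_n, C_n)                      # V_n (x) C_n, with L_n flattened row by row
```

Comparing `perm_cov` with `kron_cov` reproduces the Kronecker product structure entry by entry, since the covariance of $L_{njk}$ and $L_{nj'k'}$ is $v_{njj'}$ times the (k, k') element of $C_n$.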
Recently, Sen (1983) has established the asymptotic multinormality of $L_n$ under the permutational model $\mathcal{P}_n$, assuming minimal regularity conditions. We call this result the multivariate permutational central limit theorem. Earlier, Puri and Sen (1969) established the permutational central limit theorem under more stringent regularity conditions. To state Sen's (1983) result, assume that $C_n$ has full rank (i.e. q) for large n (say $n \ge n_0$). Define
$$s_n = \max_{1 \le i \le n} (c_i - \bar{c}_n)' C_n^{-1} (c_i - \bar{c}_n), \qquad (10.7)$$
for $n \ge n_0$. Also, for each j (= 1, ..., p), let
$$a^0_{nj}(u) = a_{nj}(i), \quad (i - 1)/n < u \le i/n, \quad i = 1, \ldots, n. \qquad (10.8)$$
The following assumptions are made.
I. There exist score functions $\phi_j(u)$ ($0 < u < 1$), j = 1, ..., p, such that $\phi_j(u) = \phi_{j1}(u) - \phi_{j2}(u)$, where $\phi_{jk}(u)$ is nondecreasing, absolutely continuous and square integrable inside (0, 1), k = 1, 2.
II. For the $a^0_{nj}(u)$ defined in (10.8), max ... $\to 0$ as $n \to \infty$.
III. $s_n = \max_{1 \le i \le n} (c_i - \bar{c}_n)' C_n^{-1} (c_i - \bar{c}_n) \to 0$ as $n \to \infty$.
The main theorem of Sen (1983) is as follows.
THEOREM 21. Assume (I)-(III). Then, under $\mathcal{P}_n$, $L_n$ is asymptotically (in probability) normal with mean vector 0, and variance-covariance matrix $V_n \otimes C_n$.
REMARK. Condition (III) can be viewed as an extended Noether condition (see (7.3)).
11. Bilinear rank statistics: The independence problem
Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be a random sample from a continuous bivariate distribution with distribution function F(x, y). The problem is to construct nonparametric distribution-free tests of $H_0$: $F(x, y) = F_1(x)F_2(y)$ for every pair (x, y) against $H_1$: $F(x, y) \ne F_1(x)F_2(y)$ for at least one pair (x, y), where $F_1$ and $F_2$ denote the marginal distribution functions of the $X_i$'s and the $Y_i$'s respectively. Thus, the null hypothesis of interest is that the X and Y variables are independent. Let $R_i$ denote the rank of $X_i$ among $X_1, \ldots, X_n$, and let $Q_i$ denote the rank of $Y_i$ among $Y_1, \ldots, Y_n$. In addition, let $a(1) \le \cdots \le a(n)$ with $a(1) \ne a(n)$, and $c(1) \le \cdots \le c(n)$ with $c(1) \ne c(n)$ be two sets of scores. Let
$$T_n = \frac{\sum_{i=1}^{n} (c(R_i) - \bar{c})(a(Q_i) - \bar{a})}{\{\sum_{i=1}^{n} (c(i) - \bar{c})^2 \sum_{i=1}^{n} (a(i) - \bar{a})^2\}^{1/2}} \qquad (11.1)$$
with $\bar{c} = n^{-1} \sum_{i=1}^{n} c(i)$ and $\bar{a} = n^{-1} \sum_{i=1}^{n} a(i)$ denote the correlation coefficient for the group of scored rank pairs $(c(R_1), a(Q_1)), \ldots, (c(R_n), a(Q_n))$. The special case $c(i) = a(i) = i$, $i = 1, \ldots, n$, corresponds to the Spearman rank correlation coefficient. Note that $T_n$ has the alternate representation
$$T_n = \frac{\sum_{i=1}^{n} c(R_i) a(Q_i) - n \bar{c} \bar{a}}{\{\sum_{i=1}^{n} (c(i) - \bar{c})^2 \sum_{i=1}^{n} (a(i) - \bar{a})^2\}^{1/2}}. \qquad (11.2)$$
Write $S_n = \sum_{i=1}^{n} c(R_i) a(Q_i)$. When $H_0$ is true, the rank vectors $R = (R_1, \ldots, R_n)'$ and $Q = (Q_1, \ldots, Q_n)'$ are both uniformly distributed over the set of n! possible permutations of the first n positive integers. Moreover, under $H_0$, Q and R are independently distributed. In this case $E_{H_0}(S_n) = n \bar{c} \bar{a}$ and
$$V_{H_0}(S_n) = (n - 1)^{-1} \sum_{i=1}^{n} (c(i) - \bar{c})^2 \sum_{i=1}^{n} (a(i) - \bar{a})^2.$$
Also, Hoeffding (1952) showed that if
$$\frac{\max_{1 \le i \le n} (c(i) - \bar{c})^2}{\sum_{i=1}^{n} (c(i) - \bar{c})^2} = O(n^{-1}), \qquad \frac{\max_{1 \le i \le n} (a(i) - \bar{a})^2}{\sum_{i=1}^{n} (a(i) - \bar{a})^2} = O(n^{-1}), \qquad (11.3)$$
then, under $H_0$, $\sqrt{n}\, T_n$ is asymptotically N(0, 1). The asymptotic nonnull (when $H_0$ is not true) distribution of $T_n$ has been derived by Bhuchongkul (1964). A class of rank order tests for independence in multivariate distributions was proposed by Puri, Sen and Gokhale (1970), and the asymptotic distributions of the relevant test statistics in the null and nonnull cases were derived.
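As a small check of the statements in this section, the Python sketch below computes the correlation coefficient of scored rank pairs, compares it with the classical Spearman formula when both score sets are the identity, and obtains the exact null mean and variance by enumerating all n! relative rankings. The function name, the simulated data, and n = 6 are assumptions made for illustration.

```python
from itertools import permutations
import numpy as np

def rank_correlation(x, y, c, a):
    """Correlation coefficient T_n of the scored rank pairs (c(R_i), a(Q_i))."""
    r = np.argsort(np.argsort(x))          # 0-based ranks of the X's (no ties assumed)
    qk = np.argsort(np.argsort(y))         # 0-based ranks of the Y's
    cr, aq = c[r] - c.mean(), a[qk] - a.mean()
    return float((cr * aq).sum() / np.sqrt((cr ** 2).sum() * (aq ** 2).sum()))

n = 6
scores = np.arange(1, n + 1, dtype=float)  # c(i) = a(i) = i gives Spearman's coefficient

rng = np.random.default_rng(2)
x, y = rng.normal(size=n), rng.normal(size=n)   # hypothetical sample
t_n = rank_correlation(x, y, scores, scores)

# Classical Spearman formula 1 - 6 * sum(d_i^2) / (n (n^2 - 1)) for comparison.
d = np.argsort(np.argsort(x)) - np.argsort(np.argsort(y))
spearman = 1.0 - 6.0 * (d ** 2).sum() / (n * (n ** 2 - 1))

# Under H0 the relative ranking is a uniform random permutation; enumerate all n! of them.
base = np.arange(n, dtype=float)
null_values = np.array([rank_correlation(base, np.array(pm, float), scores, scores)
                        for pm in permutations(range(n))])
null_mean, null_var = null_values.mean(), null_values.var()   # 0 and 1 / (n - 1)
```

The exact null variance 1/(n - 1) follows from the variance of $S_n$ given above and is consistent with $\sqrt{n}\, T_n$ being asymptotically N(0, 1).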
References
Albers, W., Bickel, P. J. and van Zwet, W. R. (1976). Asymptotic expansions for the power of distribution free tests in the one-sample problem. Ann. Statist. 4, 108-156.
Arvesen, J. N. (1969). Jackknifing U-statistics. Ann. Math. Statist. 40, 2076-2100.
Athreya, K. B., Ghosh, M., Low, L. and Sen, P. K. (1984). Laws of large numbers for bootstrapped means and U-statistics. J. Stat. Planning Inference 9, 185-194.
Berk, R. H. (1966). Limiting behavior of posterior distributions when the model is incorrect. Ann. Math. Statist. 37, 51-58.
Bergström, H. and Puri, M. L. (1977). Convergence and remainder terms in linear rank statistics. Ann. Statist. 5, 671-680.
Bickel, P. J. (1974). Edgeworth expansions in nonparametric statistics. Ann. Statist. 2, 1-20.
Bickel, P. J. and Freedman, D. A. (1981). Some asymptotic theory for the bootstrap. Ann. Statist. 9, 1196-1217.
Bickel, P. J. and van Zwet, W. R. (1978). Asymptotic expansions for the power of distribution free tests in the two sample problem. Ann. Statist. 6, 937-1004.
Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
Bhuchongkul, S. (1964). A class of nonparametric tests for independence in bivariate populations. Ann. Math. Statist. 35, 138-149.
Borovskikh, Yu. V. (1979). Approximation of U-statistics distribution (in Ukrainian). Proc. Ukrainian Acad. Sci. A 9, 695-698.
Callaert, H. and Janssen, P. (1978). The Berry-Esseen theorem for U-statistics. Ann. Statist. 6, 417-421.
Callaert, H., Janssen, P. and Veraverbeke, N. (1980). An Edgeworth expansion for U-statistics. Ann. Statist. 8, 299-312.
Callaert, H. and Veraverbeke, N. (1981). The order of the normal approximation for a studentized U-statistic. Ann. Statist. 9, 194-200.
Chan, Y. K. and Wierman, J. (1977). On the Berry-Esseen Theorem for U-statistics. Ann. Probab. 5, 136-139.
Chatterjee, S. K. and Sen, P. K. (1964). Nonparametric tests for bivariate two sample location problem. Cal. Statist. Assoc. Bull. 13, 18-58.
Chernoff, H. and Savage, I. R. (1958). Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Statist. 29, 972-994.
Dwass, M. (1956). The large sample power of rank tests in two sample problems. Ann. Math. Statist. 27, 352-374.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1-26.
Efron, B. and Stein, C. (1981). The jackknife estimate of variance. Ann. Statist. 9, 586-596.
Ghosh, M. and Dasgupta, R. (1982). Berry-Esseen theorems for U-statistics in the non iid case. In: Proc. of the Conf. on Nonparametric Inference, organized by the Janos Bolyai Math. Soc., held in Budapest, Hungary, pp. 293-313.
Ghosh, M. and Sen, P. K. (1970). On the almost sure convergence of von Mises' differentiable statistical functions. Cal. Statist. Assoc. Bull. 19, 41-44.
Grams, W. F. and Serfling, R. J. (1973). Convergence rates for U-statistics and related statistics. Ann. Statist. 1, 153-160.
Gregory, G. G. (1977). Large sample theory for U-statistics and tests of fit. Ann. Statist. 5, 110-123.
Hájek, J. (1961). Some extensions of the Wald-Wolfowitz-Noether Theorem. Ann. Math. Statist. 32, 506-523.
Hájek, J. (1962). Asymptotically most powerful rank tests. Ann. Math. Statist. 33, 1124-1147.
Hájek, J. (1968). Asymptotic normality of simple linear rank statistics under alternatives. Ann. Math. Statist. 39, 325-346.
Hájek, J. (1974). Asymptotic sufficiency of the vector of ranks in the Bahadur sense. Ann. Statist. 2, 75-83.
Hájek, J. and Šidák, Z. (1967). Theory of Rank Tests. Academic Press, New York.
Hall, P. (1979). On the invariance principle for U-statistics. Stoch. Proc. and Appl. 9, 163-174.
Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19, 293-325.
Hoeffding, W. (1951). A combinatorial central limit theorem. Ann. Math. Statist. 22, 558-566.
Hoeffding, W. (1952). The large sample power of tests based on permutations of observations. Ann. Math. Statist. 23, 169-192.
Hoeffding, W. (1961). The strong law of large numbers for U-statistics. Inst. Stat. Univer. North Carolina Mimeo Series No. 302.
Hoeffding, W. (1973). On the centering of a simple linear rank statistic. Ann. Statist. 1, 54-66.
Hušková, M. (1970). Asymptotic distribution of simple linear rank statistic for testing symmetry. Z. Wahrsch. Verw. Geb. 12, 308-322.
Hušková, M. (1977). The rate of convergence of simple linear rank statistics under hypothesis and alternatives. Ann. Statist. 5, 658-670.
Janssen, P. (1978). De Berry-Esseen stellingen: Een asymptotische ontwikkeling voor U-statistieken. Unpublished Ph.D. dissertation, Belgium.
Jurečková, J. and Puri, M. L. (1975). Order of normal approximation for rank test statistic distribution. Ann. Probab. 3, 526-533.
Karlin, S. and Rinott, Y. (1982). Applications of ANOVA type decompositions for comparisons of conditional variance statistics including jackknife estimates. Ann. Statist. 10, 485-501.
Lehmann, E. L. (1951). Consistency and unbiasedness of certain nonparametric tests. Ann. Math. Statist. 22, 165-179.
Lehmann, E. L. (1963). Robust estimation in analysis of variance. Ann. Math. Statist. 34, 957-966.
Loynes, R. M. (1970). An invariance principle for reversed martingales. Proc. Amer. Math. Soc. 25, 56-64.
Loynes, R. M. (1978). On the weak convergence of U-statistics processes and of the empirical processes. Proc. Camb. Phil. Soc. (Math.) 83, 269-272.
Mann, H. B. and Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Statist. 18, 50-60.
McLeish, D. (1974). Dependent central limit theorems and invariance principles. Ann. Probab. 2, 620-628.
Miller, R. G., Jr. and Sen, P. K. (1972). Weak convergence of U-statistics and von Mises' differentiable statistical functions. Ann. Math. Statist. 43, 31-41.
Nandi, H. K. and Sen, P. K. (1963). On the properties of U-statistics when the observations are not independent. Part II: Unbiased estimation of the parameters of a finite population. Cal. Statist. Assoc. Bull. 12, 125-143.
Neuhaus, G. (1977). Functional limit theorems for U-statistics in the degenerate case. J. Mult. Anal. 7, 424-439.
Noether, G. E. (1949). On a theorem of Wald and Wolfowitz. Ann. Math. Statist. 20, 455-458.
Puri, M. L. and Sen, P. K. (1966). On a class of multivariate multisample rank order tests. Sankhyā Ser. A 28, 353-376.
Puri, M. L. and Sen, P. K. (1969). A class of rank order tests for a general linear hypothesis. Ann. Math. Statist. 40, 1325-1343.
Puri, M. L. and Sen, P. K. (1971). Nonparametric Methods in Multivariate Analysis. Wiley, New York.
Puri, M. L., Sen, P. K. and Gokhale, D. V. (1970). On a class of rank order tests for independence in multivariate distributions. Sankhyā Ser. A 32, 271-297.
Pyke, R. and Shorack, G. R. (1968). Weak convergence of a two-sample empirical process, and a new approach to Chernoff-Savage theorems. Ann. Math. Statist. 39, 755-771.
Quenouille, M. H. (1956). Note on bias in estimation. Biometrika 43, 353-360.
Randles, R. H. and Wolfe, D. A. (1979). Introduction to the Theory of Nonparametric Statistics. Wiley, New York.
Sen, P. K. (1960). On some convergence properties of U-statistics. Cal. Statist. Assoc. Bull. 10, 1-18.
Sen, P. K. (1963). On the properties of U-statistics when the observations are not independent. Part I: Estimation of the non-serial parameters of a stationary process. Cal. Statist. Assoc. Bull. 12, 69-92.
Sen, P. K. (1965). Some nonparametric tests for m-dependent time series. J. Amer. Statist. Assoc. 60, 134-147.
Sen, P. K. (1970). On some convergence properties of one-sample rank order statistics. Ann. Math. Statist. 41, 2206-2209.
Sen, P. K. (1974). Weak convergence of generalized U-statistics. Ann. Probab. 2, 90-102.
Sen, P. K. (1977a). Some invariance principles relating to jackknifing, and their role in sequential analysis. Ann. Statist. 5, 315-329.
Sen, P. K. (1977b). Almost sure convergence of generalized U-statistics. Ann. Probab. 5, 287-290.
Sen, P. K. (1981). Sequential Nonparametrics: Invariance Principles and Statistical Inference. Wiley, New York.
Sen, P. K. (1983). On permutational central limit theorems for general multivariate linear rank statistics. Sankhyā Ser. A 45, 141-149.
Sen, P. K. and Bhattacharyya, B. (1977). Weak convergence of the Rao-Blackwell estimator of a distribution function. Ann. Probab. 5, 500-510.
Sen, P. K. and Ghosh, M. (1971). On bounded length sequential confidence intervals based on one-sample rank order statistics. Ann. Math. Statist. 42, 189-203.
Sen, P. K. and Ghosh, M. (1972). On strong convergence of regression rank statistics. Sankhyā Ser. A 34, 335-348.
Sen, P. K. and Ghosh, M. (1973). A law of iterated logarithm for one-sample rank order statistics and some applications. Ann. Statist. 1, 568-576.
Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. Wiley, New York.
Sproule, R. N. (1969). A sequential fixed width confidence interval for the mean of a U-statistic. Unpublished Ph.D. dissertation. UNC, Chapel Hill.
Sproule, R. N. (1974). Asymptotic properties of U-statistics. Trans. Amer. Math. Soc. 199, 55-64.
Tukey, J. W. (1949). The simplest signed rank tests. Princeton Univ. Stat. Res. Group Memo. Report No. 17.
Tukey, J. W. (1957). Variances of variance components II: The unbalanced single classification. Ann. Math. Statist. 28, 43-56.
Wald, A. and Wolfowitz, J. (1944). Statistical tests based on the permutations of the observations. Ann. Math. Statist. 15, 358-372.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics 1, 80-83.