Two-sample goodness-of-fit tests when ties are present


Journal of Statistical Planning and Inference 39 (1994) 399-424
North-Holland

Arnold Janssen
Mathematisches Institut, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany

Received 29 February 1993; revised manuscript received 8 June 1993

Abstract

The paper contains asymptotic results for two-sample Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling and other related tests. Whenever the portion of ties cannot be neglected, two procedures are proposed to make the tests (asymptotically) distribution free. The tests can either be carried out as permutation tests or as tests with estimated critical values. All tests turn out to be asymptotically equivalent and their asymptotic power function is established under local alternatives. The same results apply to one-sided tests for stochastically larger alternatives where noncontinuous limit distributions of the test statistics may appear.

AMS Subject Classification: Primary 62G10; Secondary 62G20.

Key words: Ties; permutation tests; Kolmogorov-Smirnov tests; Cramér-von Mises tests; Anderson-Darling test; asymptotic power function.

Correspondence to: A. Janssen, Mathematisches Institut, Heinrich-Heine-Universität, Universitätsstr. 1, 40225 Düsseldorf, Germany.

0378-3758/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved. SSDI 0378-3758(93)E0049-M

1. Introduction

Goodness-of-fit tests are asymptotically distribution-free under the null hypothesis as long as the underlying distribution function $F$ is continuous. We refer to the monograph of Shorack and Wellner (1986) for one-sample tests and to Hájek and Šidák (1967, Ch. V) and Chibisov and Durbin (1973, p. 39) in connection with two-sample problems, where detailed results can be found. However, practical sets of data usually contain ties, and noncontinuous distributions have to be taken into account.

In principle, the device for the treatment of Kolmogorov-Smirnov tests under ties is contained in the literature as a special case of more abstract empirical process theory. For instance, Bickel (1969, p. 7) proposed to carry out these tests as permutation tests. He showed consistency and asymptotic equivalence of conditional and unconditional tests within a more general setting, see also Romano (1989) and Praestgaard (1991) for a recent discussion.

Throughout, we will have a closer look at permutation goodness-of-fit tests for univariate tied data. We will also treat integral tests (like the Cramér-von Mises and Anderson-Darling tests) and one-sided tests for stochastically larger alternatives under ties. It turns out that the limit distribution function under the null hypothesis is not necessarily continuous, which requires further effort. (Note that condition A of Romano (1989) is violated.) Section 4 deals with asymptotic power functions of two-sided Kolmogorov-Smirnov tests under local alternatives. Extending earlier results for continuous distributions, we obtain asymptotic admissibility and strict asymptotic unbiasedness of these tests. Whenever the portion of ties cannot be neglected, a further procedure with estimated critical values (Theorem 3.3) is proposed in order to make the tests asymptotically distribution free.

Below we will make a few comments concerning references for one- and two-sample Kolmogorov-Smirnov tests if ties are present. It is well known that Kolmogorov-Smirnov tests are conservative and remain valid if ties are ignored; see Kolmogorov (1941), Noether (1963) or Guilbaud (1986) for a recent discussion. In connection with one-sample problems we mention Gleser (1985), who calculated the power under discontinuous distributions. Guilbaud (1986) established stochastic inequalities for Kolmogorov-Smirnov statistics which are reflected in our Lemma 2.1 within the asymptotic set-up. Acceptance regions for the Kaplan-Meier estimator appear in Guilbaud (1988). Two-sample omnibus tests are studied in Behnen and Neuhaus (1989), also when ties are present. They studied rank tests with estimated scores.

In conclusion we give the following recommendation for practical purposes:

(1) In the case of small or intermediate sample size the tests should be carried out as permutation tests. For computational reasons a Monte Carlo simulation may be necessary.

(2) Alternatively, the critical values can be estimated (Theorem 3.3), whenever $n$ is large. However, these procedures also require Monte Carlo simulations via the Brownian bridge. These calculations may be easier than in (1) when a lot of ties are present.

(3) If only a few ties show up, they may be ignored for large $n$ and the use of asymptotic critical values seems to be justified.

The paper is organized as follows. Section 2 explains the results for two-sample problems. The main results are given in Section 3 and the treatment under local alternatives can be found in Section 4. All longer proofs are presented in Section 5. The results about asymptotic distributions of the Brownian bridge under semi-norms are summarized in the appendix.

We will use the following notation. Let $\mathcal{L}(X \mid P)$ denote the law of a random variable $X$ under $P$. The uniform distribution on the unit interval of $\mathbb{R}$ is denoted by $\lambda_{(0,1)}$. The support of $P$ is abbreviated by $\operatorname{supp}(P)$. By definition $X_{j:n}$ stands for the $j$-th smallest order statistic among $X_1, \dots, X_n$. Let $\xrightarrow{\mathcal{D}}$ and $\xrightarrow{P}$ indicate convergence in distribution ($P$-probability, respectively). In addition, let $f^+ = \max(f, 0)$.

2. Two-sample tests

This section explains and motivates the testing procedures. The results are first presented for the following example.

Example 2.1. Consider a two-sample testing problem of sample size $n = n_1 + n_2$ specified by i.i.d. random variables $X_1, \dots, X_{n_1}$ with common distribution function $F_1$ within group 1 and i.i.d. random variables $X_{n_1+1}, \dots, X_n$ with distribution function $F_2$ for group 2. Let us examine two-sample goodness-of-fit tests for testing the hypotheses

$$H_0: F_1 = F_2 \quad \text{against} \quad K_1: F_1 \neq F_2, \tag{2.1}$$

where $K_1$ denotes the omnibus alternatives and $F_1$ and $F_2$ are assumed to be completely unknown, or for testing $H_0: F_1 = F_2$ against the one-sided alternative

$$K_2: F_1 \ge F_2, \quad F_1 \neq F_2, \tag{2.2}$$

that $F_2$ is stochastically larger than $F_1$. Throughout, let

$$\hat G_n(t) := \frac{1}{n_1} \sum_{i=1}^{n_1} 1_{(-\infty,t]}(X_i), \tag{2.3}$$

$$\hat H_n(t) := \frac{1}{n_2} \sum_{i=n_1+1}^{n} 1_{(-\infty,t]}(X_i), \tag{2.4}$$

$$\hat F_n(t) := \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty,t]}(X_i) \tag{2.5}$$

denote the empirical distribution functions of group 1, group 2 and the pooled sample, respectively. Typically, goodness-of-fit tests are based on test statistics

$$I(\hat G_n(\cdot) - \hat H_n(\cdot)) \quad \text{or} \quad I((\hat G_n(\cdot) - \hat H_n(\cdot))^+) \tag{2.6}$$

for $K_1$ or $K_2$, where $I(\cdot)$ denotes a semi-norm on a suitable function space. As motivation, let us first consider the sup-norm $\|f\| := \sup_{t \in \mathbb{R}} |f(t)|$ leading to Kolmogorov-Smirnov tests. As an extension of classical invariance theorems, we obtain the limit distribution of (2.6) for arbitrary distributions. Let $(B_0(t))_{t \in [0,1]}$ denote a standard Brownian bridge. Lemma 2.1 is known as a special case of more abstract empirical process theory.

Lemma 2.1. Assume that $X_1, \dots, X_n$ are i.i.d. with common distribution function $F$ and suppose that $n_1 \wedge n_2 \to \infty$. Then

$$\left(\frac{n_1 n_2}{n}\right)^{1/2} \sup_{t \in \mathbb{R}} |\hat G_n(t) - \hat H_n(t)| \longrightarrow \sup_{t \in \mathbb{R}} |B_0(F(t))| \tag{2.7}$$

and

$$\left(\frac{n_1 n_2}{n}\right)^{1/2} \sup_{t \in \mathbb{R}} (\hat G_n(t) - \hat H_n(t))^+ \longrightarrow \sup_{t \in \mathbb{R}} B_0(F(t))^+, \tag{2.8}$$

both in distribution.

For completeness we will give a proof of Lemma 2.1. Let us introduce the following abbreviations:

$$c_{ni} := \left(\frac{n_1 n_2}{n}\right)^{1/2} \begin{cases} \dfrac{1}{n_1}, & i = 1, \dots, n_1, \\[1ex] -\dfrac{1}{n_2}, & i = n_1 + 1, \dots, n, \end{cases} \tag{2.9}$$

where the coefficients satisfy $\sum_{i=1}^n c_{ni}^2 = 1$ and $\max_{1 \le i \le n} |c_{ni}| \to 0$ whenever $n_1 \wedge n_2 \to \infty$. Under $H_0$ we may assume

$$X_i := F^{-1}(U_i), \tag{2.10}$$

where $F^{-1}$ denotes the left-continuous inverse (or quantile function) of $F$ and $U_1, U_2, \dots$ are i.i.d. uniformly distributed on $(0,1)$.

Proof. Taking (2.9) and (2.10) into account, the equivalence $F^{-1}(u) \le t$ iff $u \le F(t)$, $u \in (0,1)$, proves the following representation:

$$\left(\frac{n_1 n_2}{n}\right)^{1/2} (\hat G_n(t) - \hat H_n(t)) = \sum_{i=1}^n 1_{(-\infty,t]}(X_i)\, c_{ni} = \sum_{i=1}^n 1_{[0,F(t)]}(U_i)\, c_{ni}. \tag{2.11}$$

The invariance principle for weighted empirical processes implies

$$\left(\sum_{i=1}^n 1_{[0,s]}(U_i)\, c_{ni}\right)_{s \in [0,1]} \longrightarrow (B_0(s))_{s \in [0,1]} \tag{2.12}$$

in distribution on $D[0,1]$, see Shorack and Wellner (1986, pp. 88 and 93). Next consider the almost surely continuous functions

$$f \mapsto \sup_{s \in F(\mathbb{R})} |f(s)| \quad \text{and} \quad f \mapsto \sup_{s \in F(\mathbb{R})} (f(s))^+ \tag{2.13}$$

on $D[0,1]$. Then standard arguments of Billingsley (1968) together with (2.11) complete the proof. □
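The representation (2.11) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names `regression_coefficients` and `ks_statistics` are illustrative and not taken from the paper. It builds the coefficients $c_{ni}$ of (2.9) and evaluates the weighted sum $\sum_i 1_{(-\infty,t]}(X_i) c_{ni}$ at every distinct observed value, which is where the supremum in (2.7) and (2.8) is attained for tied data.

```python
import numpy as np

def regression_coefficients(n1, n2):
    """Two-sample coefficients c_ni of (2.9): +scale/n1 in group 1, -scale/n2 in group 2."""
    n = n1 + n2
    scale = np.sqrt(n1 * n2 / n)
    return scale * np.concatenate([np.full(n1, 1.0 / n1), np.full(n2, -1.0 / n2)])

def ks_statistics(x1, x2):
    """Return the scaled two-sided and one-sided sup statistics of (2.7) and (2.8)."""
    x = np.concatenate([x1, x2])
    c = regression_coefficients(len(x1), len(x2))
    # Evaluate sum_i 1{X_i <= t} c_ni at every distinct pooled value t.
    grid = np.unique(x)
    process = np.array([c[x <= t].sum() for t in grid])
    two_sided = np.abs(process).max()
    one_sided = max(process.max(), 0.0)  # positive part of the supremum
    return two_sided, one_sided

# Heavily tied toy data (hypothetical example values).
rng = np.random.default_rng(0)
x1 = rng.integers(0, 5, size=30).astype(float)
x2 = rng.integers(0, 5, size=40).astype(float)
two_sided, one_sided = ks_statistics(x1, x2)
```

The loop over distinct values is quadratic in the worst case; it is kept here because it mirrors (2.11) term by term.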

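From Lemma 2.1 we easily derive the asymptotic null distribution of the Kolmogorov-Smirnov tests:

$$\varphi_n := \begin{cases} 1, & \left(\dfrac{n_1 n_2}{n}\right)^{1/2} I(\hat G_n - \hat H_n) > c, \\ 0, & \text{otherwise}, \end{cases} \tag{2.14}$$

$$\psi_n := \begin{cases} 1, & \left(\dfrac{n_1 n_2}{n}\right)^{1/2} I((\hat G_n - \hat H_n)^+) > d, \\ 0, & \text{otherwise}, \end{cases} \tag{2.15}$$

where $I(f) = \|f\|$. Evidently, $\varphi_n$ and $\psi_n$ are asymptotic level $\alpha$ tests if our critical values $c$ and $d$ are the $(1-\alpha)$-quantiles of the limit distributions (2.7) and (2.8), as long as $F$ is not a Dirac distribution and additionally in the case $K_2$ we have $\alpha < 1/2$; cf. Remark A of the appendix. However, the test statistics are no longer distribution-free since $c = c(F(\mathbb{R}))$ and $d = d(F(\mathbb{R}))$ depend on the range of the unknown distribution function $F$. Note that $\varphi_n$ and $\psi_n$ become conservative if $c$ and $d$ are substituted by the $(1-\alpha)$-quantile of Smirnov's statistic (continuous case). Observe also that

$$c(\tilde F(\mathbb{R})) \le c(F(\mathbb{R})) \quad \text{and} \quad d(\tilde F(\mathbb{R})) \le d(F(\mathbb{R})) \tag{2.16}$$

whenever $\tilde F(\mathbb{R}) \subset F(\mathbb{R})$. A finite sample result of this kind was obtained earlier by Guilbaud (1986).

To overcome problems with unknown critical values, we will construct goodness-of-fit tests that are distribution-free. In particular, we propose permutation tests. To fix the idea, consider first the Kolmogorov-Smirnov statistic:

$$\sup_{j \le n} \left| \sum_{i=1}^n 1_{(-\infty, X_{j:n}]}(X_i)\, c_{ni} \right|. \tag{2.17}$$

Permutation tests are based on uniformly distributed permutations (on the set $\mathcal{S}_n$ of permutations of $1, \dots, n$) $\sigma_n = (\sigma_{ni})_{i \le n}$, defined by $\tilde\omega \mapsto (\sigma_{ni}(\tilde\omega))_{i \le n}$ on a further probability space $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde P)$ independent of $X_i : \Omega \to \mathbb{R}$. For fixed $\omega \in \Omega$ the permutation statistic of (2.17) is given by

$$\tilde\omega \mapsto \sup_{j \le n} \left| \sum_{i=1}^n 1_{(-\infty, X_{j:n}(\omega)]}(X_i(\omega))\, c_{n\sigma_{ni}(\tilde\omega)} \right|, \tag{2.18}$$

which is equal in distribution to

$$\tilde\omega \mapsto \sup_{j \le n} \left| \sum_{i=1}^n 1_{(-\infty, X_{j:n}(\omega)]}(X_{i:n}(\omega))\, c_{n\sigma_{ni}(\tilde\omega)} \right|. \tag{2.19}$$

For fixed $\omega$ let $x \mapsto \tilde G_n(x, \omega)$ denote the distribution function of (2.19). Next choose random variables $\tilde c_n(\omega)$ and $\gamma_n(\omega) \in [0,1]$ as solutions of

$$\int \left\{ 1_{(\tilde c_n, \infty)}(x) + \gamma_n 1_{\{\tilde c_n\}}(x) \right\} \tilde G_n(dx, \omega) = \alpha. \tag{2.20}$$

Then the permutation test for $H_0$ against $K_1$,

$$\tilde\varphi_n := \begin{cases} 1, \\ \gamma_n, \\ 0, \end{cases} \quad \left(\frac{n_1 n_2}{n}\right)^{1/2} \sup_{t \in \mathbb{R}} |\hat G_n(t) - \hat H_n(t)| \ \begin{cases} > \\ = \\ < \end{cases} \ \tilde c_n, \tag{2.21}$$

is an exact level $\alpha$ test for each $F$ and each $n$. Similarly, permutation tests $\tilde\psi_n$ are obtained for $K_2$; see (3.20). The main results of the present paper show that these procedures work well for a large class of semi-norms.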
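A Monte Carlo version of the permutation procedure around (2.17)-(2.21) can be sketched as follows. This is an illustration under stated assumptions, not the paper's own implementation: it assumes NumPy, uses the hypothetical names `ks_stat` and `permutation_ks_test`, and reports a Monte Carlo p-value instead of the randomized test with $\gamma_n$ from (2.20). Permuting the coefficients $c_{ni}$ while keeping the pooled data fixed corresponds to (2.18).

```python
import numpy as np

def ks_stat(x, c):
    """sup_j |sum_i 1{X_i <= X_(j)} c_i|, evaluated at the distinct pooled values."""
    order = np.argsort(x, kind="stable")
    xs, cs = x[order], c[order]
    partial = np.cumsum(cs)
    # With ties, only the last index of each block of equal values is a valid
    # evaluation point of the step function.
    last = np.append(xs[1:] != xs[:-1], True)
    return np.abs(partial[last]).max()

def permutation_ks_test(x1, x2, n_perm=2000, seed=0):
    """Two-sided permutation Kolmogorov-Smirnov test; returns (statistic, p-value)."""
    rng = np.random.default_rng(seed)
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    x = np.concatenate([x1, x2])
    c = np.sqrt(n1 * n2 / n) * np.concatenate(
        [np.full(n1, 1.0 / n1), np.full(n2, -1.0 / n2)]
    )
    observed = ks_stat(x, c)
    # Uniformly distributed permutations of the coefficients, data held fixed.
    perm = np.array([ks_stat(x, rng.permutation(c)) for _ in range(n_perm)])
    # The +1 terms keep the Monte Carlo p-value valid for finite n_perm;
    # the small tolerance guards against floating-point ties.
    p_value = (1 + np.sum(perm >= observed - 1e-12)) / (n_perm + 1)
    return observed, p_value
```

Because the permutation distribution is computed conditionally on the pooled sample, ties need no special correction beyond evaluating the process only at the end of each block of equal values.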

Remark 2.1. To give further motivation, consider antiranks $(D_{ni}(U))_{i \le n}$ of $U_1, \dots, U_n$ defined implicitly by $U_{i:n} = U_{D_{ni}(U)}$. Then we have equality in the distribution of

[...] (2.22)

by (2.10) and (2.11), since $(F^{-1}(U_{i:n}))_{i \le n} = (X_{i:n})_{i \le n}$. If (2.17) is written as

$$\sup_{j \le n} \left| \sum_{i=1}^n 1_{(-\infty, X_{j:n}]}(X_{i:n})\, c_{nD_{ni}(U)} \right| \tag{2.23}$$

we get further insight into the permutation statistic (2.19). The antiranks are replaced by new permutations $\sigma_n$.

For computational reasons one may be interested in additional procedures of Kolmogorov-Smirnov type for large $n$. In this connection estimated critical values can be recommended. Note that the right-hand side of (2.7) reads as

$$\sup_{s \in F(\mathbb{R})} |B_0(s)|. \tag{2.24}$$

It is reasonable to estimate the unknown range $F(\mathbb{R})$ by $\hat F_n(\mathbb{R}) = \{\hat F_n(X_i) : i \le n\}$. For fixed $\omega$ let $\hat G_n(\cdot, \omega)$ denote the distribution function of

$$\sup_{s \in \hat F_n(\mathbb{R})} |B_0(s)| \tag{2.25}$$

and let $\hat c_n := \hat G_n(\cdot, \omega)^{-1}(1 - \alpha)$ be its $(1-\alpha)$-quantile. Let $\hat\varphi_n$ be the test with critical value $\hat c_n$ similar to (2.21). It is shown that

$$\varphi_n - \hat\varphi_n \to 0 \quad \text{and} \quad \varphi_n - \tilde\varphi_n \to 0$$

in probability under nondegenerate distributions of $H_0$. In this case the nominal level of $\hat\varphi_n$ converges to $\alpha$ under $H_0$ as $n \to \infty$. The critical values $\hat c_n$ can be obtained by a Monte Carlo study. Similar results hold for one-sided tests.
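The Monte Carlo study for the estimated critical value $\hat c_n$ of (2.25) can be sketched as follows. This is a hedged illustration assuming NumPy; the names `estimated_range` and `bridge_sup_quantile` are hypothetical. Since the estimated range $\hat F_n(\mathbb{R})$ is a finite set, the Brownian bridge only needs to be simulated at those points, which can be done exactly via $B_0(s) = W(s) - sW(1)$ for a Brownian motion $W$ with independent Gaussian increments.

```python
import numpy as np

def estimated_range(x):
    """Range {F_n(X_i) : i <= n} of the pooled empirical distribution function."""
    _, counts = np.unique(x, return_counts=True)
    return np.cumsum(counts) / len(x)

def bridge_sup_quantile(points, alpha=0.05, n_sim=20000, seed=0):
    """Monte Carlo (1 - alpha)-quantile of sup_{s in points} |B0(s)|."""
    rng = np.random.default_rng(seed)
    s = np.unique(np.clip(points, 0.0, 1.0))
    grid = np.append(s, 1.0)  # the last column carries W(1)
    increments = np.diff(np.concatenate([[0.0], grid]))
    # Brownian motion at the grid points from independent Gaussian increments.
    w = np.cumsum(rng.normal(size=(n_sim, grid.size)) * np.sqrt(increments), axis=1)
    bridge = w[:, :-1] - s * w[:, -1:]  # B0(s) = W(s) - s W(1)
    return np.quantile(np.abs(bridge).max(axis=1), 1 - alpha)

# Usage on hypothetical tied data: critical value from the estimated range.
pooled = np.array([1.0, 1.0, 2.0, 3.0, 3.0, 3.0])
c_hat = bridge_sup_quantile(estimated_range(pooled), alpha=0.05)
```

With many ties the range has few points, so each simulated path is cheap; this is in line with the remark in the introduction that these calculations may be easier than the permutation approach when a lot of ties are present.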

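3. Main results

In this section the motivation above is made precise and the results are extended to a wider class of goodness-of-fit tests given by semi-norms. Typical examples are integral tests relying on

$$\left( \frac{n_1 n_2}{n} \int (\hat G_n(t) - \hat H_n(t))^2\, q(\hat F_n(t))\, d\hat F_n(t) \right)^{1/2} \tag{3.1}$$

or their one-sided versions

$$\left( \frac{n_1 n_2}{n} \int ((\hat G_n(t) - \hat H_n(t))^+)^2\, q(\hat F_n(t))\, d\hat F_n(t) \right)^{1/2}, \tag{3.2}$$

where $q : (0,1) \to [0, \infty)$ denotes a suitable weight function. The choice $q = 1$ gives the Cramér-von Mises test (cf. Hájek and Šidák (1967, p. 93)), whereas $q(t) = (t(1-t))^{-\gamma}$, $0 < \gamma \le 1$, leads to tests which give more weight to the extremes. For $\gamma = 1$ one obtains the weights proposed by Anderson and Darling (1952) for one-sample tests. As motivation of these tests, observe that $(n_1 n_2 / n)^{1/2} (\hat G_n(t) - \hat H_n(t))$ has conditional variance

$$\operatorname{Var}\left( \left(\frac{n_1 n_2}{n}\right)^{1/2} (\hat G_n(t) - \hat H_n(t)) \,\Big|\, X_{1:n}, \dots, X_{n:n} \right) = \frac{n}{n-1}\, \hat F_n(t)(1 - \hat F_n(t))$$

given $X_{1:n}, \dots, X_{n:n}$ (apply (2.23) and Hájek and Šidák (1967, p. 61)). Thus (3.1) includes a weighted renormalization via that variance.

We now give a precise formulation of our problem. Consider more general regression coefficients $c_{ni}$ such that

$$\sum_{i=1}^n c_{ni} = 0, \quad \sum_{i=1}^n c_{ni}^2 = 1, \quad \max_{1 \le i \le n} |c_{ni}| \to 0 \ \text{ as } n \to \infty. \tag{3.3}$$

All our goodness-of-fit tests are based on the general rank process

$$X_{n,s} := \sum_{i=1}^n 1_{(-\infty, X_{[ns]:n}]}(X_i)\, c_{ni}, \quad 0 \le s \le 1 \tag{3.4}$$

($[x]$ denotes the integer part of $x$), and on the pivoted empirical measure $\hat m_n$, defined (via Dirac measures $\varepsilon_{\cdot}$) by

$$\hat m_n := \frac{1}{n} \sum_{i=1}^n \varepsilon_{\hat F_n(X_i)}. \tag{3.5}$$

In particular, the Kolmogorov-Smirnov test is obtained by

$$\sup_{s \in [0,1]} |X_{n,s}| = \sup\{ |X_{n,s}| : s \in \operatorname{supp}(\hat m_n) \}. \tag{3.6}$$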
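For tied data the integral statistic (3.1) reduces to a finite sum over the distinct pooled values, because $\hat F_n$ has an atom of mass (number of ties)/$n$ at each of them. The sketch below assumes NumPy and the weight $q(t) = (t(1-t))^{-\gamma}$; the name `integral_statistic` is hypothetical. One convention is an assumption of this sketch and not stated in the text: the atom at the largest pooled value is dropped, since there $\hat F_n = 1$ (where $q$ is not defined for $\gamma > 0$) while $\hat G_n - \hat H_n = 0$.

```python
import numpy as np

def integral_statistic(x1, x2, gamma=0.0):
    """Weighted integral statistic in the spirit of (3.1).

    gamma = 0 gives a Cramer-von Mises type statistic, gamma = 1 the
    Anderson-Darling weights.
    """
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    x = np.concatenate([x1, x2])
    values, counts = np.unique(x, return_counts=True)
    g = np.searchsorted(np.sort(x1), values, side="right") / n1  # group 1 e.d.f.
    h = np.searchsorted(np.sort(x2), values, side="right") / n2  # group 2 e.d.f.
    f = np.cumsum(counts) / n                                     # pooled e.d.f.
    # Assumption of this sketch: skip the largest value, where f = 1 and g - h = 0.
    keep = slice(0, -1)
    weight = (f[keep] * (1.0 - f[keep])) ** (-gamma)
    integral = np.sum((g[keep] - h[keep]) ** 2 * weight * counts[keep] / n)
    return np.sqrt(n1 * n2 / n * integral)
```

Since $(t(1-t))^{-1} \ge 4$ on $(0,1)$, the value for `gamma=1` is always at least twice the value for `gamma=0` on the same data, which gives a quick sanity check.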

Integral tests are obtained by

$$\left( \int X_{n,s}^2\, q(s)\, d\hat m_n(s) \right)^{1/2}, \tag{3.7}$$

which obviously coincide with (3.1) in the two-sample case. For one-sided testing problems $X_{n,s}$ is substituted by $X_{n,s}^+$. The examples have a common feature which is covered by the following class of semi-norms. To explain this, let

$$\mathcal{M}_{[0,1]} := \{ \mathcal{L}(F \circ F^{-1} \mid \lambda_{(0,1)}) : F \text{ distribution function on } \mathbb{R} \}$$

denote the pivoted distributions on $[0,1]$. In addition, consider for $A \subset [0,1]$ the restriction $D(A) := \{ f|_A : f \in D[0,1] \}$ of the Skorokhod space $D[0,1]$ on $A$. Assume now that

$$I_H : D(\operatorname{supp}(H)) \to [0, \infty], \quad H \in \mathcal{M}_{[0,1]}, \tag{3.8}$$

is a given family of measurable semi-norms. Then we may define a new function

$$I : \mathcal{M}_{[0,1]} \times D[0,1] \to [0, \infty], \quad I(H, f) := I_H(f|_{\operatorname{supp}(H)}), \tag{3.9}$$

which for fixed $H$ defines a semi-norm $f \mapsto I(H, f)$ on $D[0,1]$. For instance, the Kolmogorov-Smirnov semi-norm is obtained by

$$I_{KS}(H, f) := \sup_{s \in \operatorname{supp}(H)} |f(s)| \tag{3.10}$$

and the integral tests are obtained by

$$I^{(q)}(H, f) := \left( \int f^2 q \, dH \right)^{1/2}. \tag{3.11}$$

Based on this function $I$, (3.9), we now introduce the general classes of goodness-of-fit tests

$$\varphi_n = 1_{(c_n, \infty)}\left( I(\hat m_n, (X_{n,s})_{s \in [0,1]}) \right) \tag{3.12}$$

along with the one-sided versions

$$\psi_n = 1_{(d_n, \infty)}\left( I(\hat m_n, (X_{n,s}^+)_{s \in [0,1]}) \right), \tag{3.13}$$

provided the statistics under consideration are measurable. (Note that (2.14) and (2.15) are included in these classes.) For $I_{KS}$ and $I^{(q)}$ keep (3.6) and (3.7) in mind.

If ties are present, the evaluation of exact critical values $c_n$ and $d_n$ yields serious problems. Also asymptotic distributions (if available) usually depend on the unknown distribution. As in Section 2, permutation versions of $\varphi_n$ and $\psi_n$ are proposed. Following the approach of (2.19), let us introduce for fixed $\omega$ the permutation statistic of $X_{n,s}$ (3.4) by

$$\tilde\omega \mapsto \tilde X_{n,s}(\omega, \tilde\omega) = \sum_{i=1}^n 1_{(-\infty, X_{[ns]:n}(\omega)]}(X_{i:n}(\omega))\, c_{n\sigma_{ni}(\tilde\omega)} \tag{3.14}$$

for $0 \le s \le 1$, where again the permutations $(\sigma_{ni})_i$ lie on a separate probability space $(\tilde\Omega, \tilde{\mathcal{A}}, \tilde P)$ independent of the observations. For technical and computational reasons we will now introduce the permutation process

$$Z_{n,s} : \tilde\Omega \to C[0,1], \quad Z_{n,s}(\tilde\omega) := \sum_{i=1}^{[ns]} c_{n\sigma_{ni}(\tilde\omega)} + R_n(s), \quad 0 \le s \le 1, \tag{3.15}$$

with remainders $R_n(j/n) = 0$, $j = 0, \dots, n$, such that $Z_{n,s}$ is continuous in $s$ and piecewise linear on $[j/n, (j+1)/n]$. It is most important that the permutation statistic (3.14) depends on $\omega$ only via $\hat m_n(\omega)$ and on $\tilde\omega$ only via $Z_{n,s}(\tilde\omega)$.

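Lemma 3.1. For each family of semi-norms (3.9) we have

$$I(\hat m_n, (\tilde X_{n,s})_{s \in [0,1]}) = I(\hat m_n, (Z_{n,s})_{s \in [0,1]}) \tag{3.16}$$

and a similar result holds for $\tilde X_{n,s}^+$ and $Z_{n,s}^+$.

Proof. Check that for $s = j/n \in \operatorname{supp}(\hat m_n(\omega))$ one has

$$\tilde X_{n,j/n} = \sum_{i=1}^{n \hat m_n(0, j/n]} c_{n\sigma_{ni}} = Z_{n, \hat m_n(0, j/n]} \tag{3.17}$$

and $\hat m_n(0, j/n] = j/n$. Thus $s \mapsto \tilde X_{n,s}$ and $s \mapsto Z_{n,s}$ coincide on the random set $\operatorname{supp}(\hat m_n)$. □

For fixed $\omega \in \Omega$ let now $\tilde G_n^{(+)}(\cdot, \omega)$ denote the distribution functions of

$$\tilde\omega \mapsto I(\hat m_n(\omega), (Z_{n,s}^{(+)}(\tilde\omega))_{s \in [0,1]}), \tag{3.18}$$

where the index '+' always indicates the one-sided case. If no other comments are made, both cases are treated equally. Clearly, the exact level $\alpha$ permutation tests associated to (3.12) and (3.13) are given by

$$\tilde\varphi_n := \begin{cases} 1, \\ \gamma_n, \\ 0, \end{cases} \quad I(\hat m_n, (X_{n,s})_{s \in [0,1]}) \ \begin{cases} > \\ = \\ < \end{cases} \ \tilde c_n \tag{3.19}$$

and

$$\tilde\psi_n := \begin{cases} 1, \\ \gamma_n^+, \\ 0, \end{cases} \quad I(\hat m_n, (X_{n,s}^+)_{s \in [0,1]}) \ \begin{cases} > \\ = \\ < \end{cases} \ \tilde d_n, \tag{3.20}$$

where the random variables $\gamma_n$ and $\tilde c_n$ are determined by equation (2.20) and $\tilde\psi_n$ is treated similarly via $\tilde G_n^+$.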

Our treatment of permutation tests is based on convergence of their conditional distributions $\tilde G_n^{(+)}(\cdot, \cdot)$ under $H_0$. Consider $F \in H_0$ and define

$$H_F := \mathcal{L}(F \circ F^{-1} \mid \lambda_{(0,1)}). \tag{3.21}$$

Also introduce unconditional distribution functions $G$ and $G^+$,

$$G^{(+)}(x) := Q\left( I(H_F, (B_0(s)^{(+)})_s) \le x \right), \tag{3.22}$$

given by the distribution $Q$ of $B_0$ on $C[0,1]$. Recall from the appendix that $G$ is typically absolutely continuous for $I_{KS}$ and $I^{(q)}$, but $G^+$ might have a jump at zero. Thus a different treatment of $G$ and $G^+$ is required. Let

$$d(F_1, F_2) = \inf\{ \varepsilon : F_1(x - \varepsilon) - \varepsilon \le F_2(x) \le F_1(x + \varepsilon) + \varepsilon \ \text{for all } x \} \tag{3.23}$$

denote the Lévy distance of two distribution functions $F_1$ and $F_2$. This is a metric for convergence in distribution. Call $F \in H_0$ nondegenerate if $F$ is not a Dirac measure. It should be mentioned that assertion (a) below is already contained in Bickel (1969) for the two-sample problem (2.9). Distribution-free Kolmogorov-Smirnov tests for randomly censored data were recently established by Neuhaus (1992, 1993). Their main concern is with the problems that arise from censoring.

Theorem 3.1. Consider the Kolmogorov-Smirnov semi-norm $I = I_{KS}$ (3.10) or the integral norm $I = I^{(q)}$ (3.11) with continuous weight function $q : (0,1) \to (0, \infty)$ such that

$$q(s) \le K (s(1-s))^{-\gamma}, \quad 0 < s < 1, \tag{3.24}$$

holds for some $K > 0$ and $0 \le \gamma \le 1$. Let $(X_i)_i$ be a sequence of i.i.d. random variables with nondegenerate distribution function $F$.

(a) If $I = I_{KS}$ or $I = I^{(q)}$ with bounded weight function ($\gamma = 0$), then

$$\omega \mapsto \sup_{x \in [0, \infty)} |\tilde G_n(x, \omega) - G(x)| \tag{3.25}$$

and

$$\omega \mapsto d(\tilde G_n^+(\cdot, \omega), G^+(\cdot)) \tag{3.26}$$

converge to zero almost surely. In both cases the difference of the quantile processes

$$\omega \mapsto \sup_{s \in A} |\tilde G_n^{(+)}(\cdot, \omega)^{-1}(s) - (G^{(+)})^{-1}(s)| \tag{3.27}$$

converges to zero almost surely uniformly on compact sets $A \subset (0,1)$.

(b) Next consider unbounded weight functions $q$ (3.24) and their integral norms $I = I^{(q)}$. Then the assertions (3.25)-(3.27) remain valid if almost sure convergence to zero is substituted by convergence in probability.

Proof. Section 5. □

The proof of Theorem 3.1(b) is based on an approximation procedure for $q$ by bounded weight functions. As we will see in Theorem 3.2, that type of argument works under general circumstances. Consider the following assumptions for the underlying family of semi-norms.

Condition (A): Let $I(H, f) = I_H(f)$ be as in (3.8) and (3.9) nontrivial semi-norms such that with $Q$ as in (3.22)

$$Q\left( I(H_F, (B_0(s))_s) \in (0, \infty) \right) > 0 \tag{3.28}$$

holds for nondegenerate $F$ and assume

$$Q\left( I(H_F, (|B_0(s)|)_s) \in (0, \infty) \right) > 0 \tag{3.29}$$

for the one-sided case. Assume also that $G$ and $G^+$ (3.22) are nondegenerate limit laws.

Condition (B): In the case of one-sided tests we assume $I(H_F, \cdot)$ to be positive increasing, meaning that

$$I(H_F, f) \le I(H_F, g), \quad \text{whenever } 0 \le f \le g. \tag{3.30}$$

Condition (C): (Unconditional convergence). For each nondegenerate $F \in H_0$ let

$$I(\hat m_n, (X_{n,s}^{(+)})_{s \in [0,1]}) \tag{3.31}$$

be convergent in distribution as $n \to \infty$. (Their limit distribution functions $G$ and $G^+$ were already specified in (3.22).)

These assumptions require further comments.
Remark 3.1. (a) Under condition (A) the random variable $I(H_F, (B_0(s))_s)$ has a proper distribution $G$ (3.22) on $(0, \infty)$. If in addition condition (B) holds, then $G^+$ is concentrated on $[0, \infty)$. Confer the appendix. In both cases the quantile functions $(G^{(+)})^{-1}$ are continuous on $(0,1)$, which implies the equivalence of tests (see Lemma 3.2).

(b) One easily checks that unconditional convergence (3.31) is necessary for conditional convergence (3.25) and (3.26), respectively. For $I_{KS}$ assertion (3.31) was earlier obtained in Lemma 2.1.

Theorem 3.2. Consider a family of semi-norms $I_H(\cdot)$ such that the conditions (A)-(C) hold for the two-sided and one-sided case, respectively. Assume that there exist further semi-norms $(I_{k,H})_{k \in \mathbb{N}}$ (3.8) with

[...] (3.32)

for all $H \in \mathcal{M}_{[0,1]}$. Define $I_k(H, f) := I_{k,H}(f)$ and let $\tilde G_{k,n}^{(+)}(\cdot, \omega)$ (3.18) and $G^{(k)(+)}$ (3.22) denote the corresponding distribution functions belonging to $I_k(\cdot)$. Assume that under nondegenerate $F \in H_0$

$$\omega \mapsto d(\tilde G_{k,n}^{(+)}(\cdot, \omega), G^{(k)(+)}(\cdot)) \tag{3.33}$$

converges to zero in probability as $n \to \infty$ for each $k \in \mathbb{N}$. Then we have

(a) the same assertion (3.33) holds for $d(\tilde G_n^{(+)}(\cdot, \omega), G^{(+)}(\cdot))$,

(b) the difference of the quantile processes (3.27) converges to zero in probability uniformly on compact sets $A$.

Proof. Section 5. □

These results prove asymptotic equivalence of ordinary and permutation goodness-of-fit tests under the null hypothesis. To establish a result of this kind consider conditions (A)-(C). Then there exists $x_0^{(+)} \in \mathbb{R}$ such that $G^{(+)}|_{(x_0^{(+)}, \infty)}$ is absolutely continuous satisfying $\alpha_0^{(+)} := 1 - G^{(+)}(x_0^{(+)}) > 0$, see the appendix. The choice

$$c_n \to G^{-1}(1 - \alpha), \ \alpha < \alpha_0, \quad \text{and} \quad d_n \to (G^+)^{-1}(1 - \alpha), \ \alpha < \alpha_0^+, \tag{3.34}$$

leads to asymptotically $\alpha$-similar tests $\varphi_n$ and $\psi_n$ (3.12), (3.13). By the appendix we can choose $x_0$ so that $\alpha_0 = 1$ and $\alpha_0^+ = 1/2$ for $I \in \{I_{KS}, I^{(q)}\}$ of Theorem 3.1. Note that $G^{(+)}$, $c_n$ and $d_n$ are not really available in practice under ties.

Lemma 3.2. In addition to these assumptions suppose that under nondegenerate $F \in H_0$ we have convergence of

$$\omega \mapsto d(\tilde G_n^{(+)}(\cdot, \omega), G^{(+)}(\cdot)) \tag{3.35}$$

to zero in probability. Then

(a) for each $\alpha < \alpha_0$ we have

$$\varphi_n - \tilde\varphi_n \to 0 \tag{3.36}$$

in probability under $F$;

(b) for each $\alpha < \alpha_0^+$ we have

$$\psi_n - \tilde\psi_n \to 0 \tag{3.37}$$

in probability under $F$.

Proof. The proof easily follows from the almost sure subsequence convergence principle for convergence in probability. Choose a subsequence $n_k$ such that (3.35) converges almost surely along $n_k$. We have almost sure convergence of quantile functions along $n_k$ and $\varphi_{n_k} - \tilde\varphi_{n_k} \to 0$ in probability since $G$ is strictly increasing at $G^{-1}(1 - \alpha)$ for $\alpha < \alpha_0$, cf. Witting and Nölle (1970, p. 58). Now we may choose a further subsequence $n_k'$ such that (3.36) converges almost surely along $n_k'$. Similarly, (3.37) can be proved. □

Lemma 3.2 can be used to evaluate the asymptotic power function of $\tilde\varphi_n$ and $\tilde\psi_n$ under local alternatives, cf. Section 4.

As explained in the introduction, other approximations $\hat\varphi_n$ of $\varphi_n$ are of interest for computational reasons whenever $n$ is large. Throughout, it is shown that the rank process $(Z_{n,s})_s$ used for the definition of $\tilde G_n^{(+)}(\cdot, \omega)$ can be substituted by its limit process $(B_0(s))_s$ to get asymptotic quantiles. Assume that the Brownian bridge is defined on a further probability space $(\hat\Omega, \hat{\mathcal{A}}, \hat Q)$ independent of the observations. For fixed $\omega \in \Omega$ introduce $\hat G_n(\cdot, \omega)$ and $\hat G_n^+(\cdot, \omega)$ by

$$\hat G_n^{(+)}(x, \omega) := \hat Q\left( I(\hat m_n(\omega), (B_0(s)^{(+)})_{s \in [0,1]}) \le x \right) \tag{3.38}$$

and let $\hat\varphi_n$ and $\hat\psi_n$ denote the tests (3.12) and (3.13) given by critical values

$$\hat c_n := (\hat G_n(\cdot, \omega))^{-1}(1 - \alpha) \quad \text{and} \quad \hat d_n := (\hat G_n^+(\cdot, \omega))^{-1}(1 - \alpha). \tag{3.39}$$

These tests work with estimated critical values. Again we will see that $\hat\varphi_n$ and $\hat\psi_n$ are asymptotically $\alpha$-similar tests.

Theorem 3.3. Under the assumption of Theorem 3.1 we have for $I \in \{I_{KS}, I^{(q)}\}$

$$\sup_{x \in [0, \infty)} |\hat G_n(x, \omega) - G(x)| \to 0 \tag{3.40}$$

and

$$d(\hat G_n^+(\cdot, \omega), G^+(\cdot)) \to 0 \tag{3.41}$$

in probability. In addition, we obtain

$$\varphi_n - \hat\varphi_n \to 0 \quad \text{and} \quad \psi_n - \hat\psi_n \to 0 \tag{3.42}$$

in probability under nondegenerate $F \in H_0$, whenever $\alpha < 1/2$ holds for the one-sided tests.

Proof. Section 5. □

In conclusion, the Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling tests and further integral tests $\hat\varphi_n$ and $\hat\psi_n$ with estimated critical values work well.

Remark 3.2. Consider the two-sample problem with regression coefficients (2.9) such that

$$0 < \liminf_{n \to \infty} \frac{n_1}{n} \le \limsup_{n \to \infty} \frac{n_1}{n} < 1 \tag{3.43}$$

holds. Then the results of Einmahl and Mason (1992) can be used to sharpen the assertion of Theorem 3.1(b). Actually, one can prove that (3.25)-(3.27) are almost surely convergent for those weight functions (3.24) determined by $0 < \gamma < 1/2$. Note that (3.43) yields $\sum_{i=1}^n c_{ni}^4 = O(1/n)$, which is one assumption of Einmahl and Mason.

4. Kolmogorov-Smirnov tests under local alternatives

Our treatment of the power of conditional goodness-of-fit tests $\tilde\varphi_n$ and $\hat\varphi_n$ is closely related to the continuous case already analysed in the literature. For these reasons we restrict ourselves to the Kolmogorov-Smirnov test in order to show which type of modifications are required. The continuous case was successfully studied in Milbrodt and Strasser (1990). Here the reader finds references about the asymptotic power functions of goodness-of-fit tests. Moreover, we refer to the early paper of Chibisov (1965), where the likelihood ratio (4.3) and the power function were treated within a restricted continuous case. The key for our results is the asymptotic equivalence of both $\tilde\varphi_n$ and $\hat\varphi_n$ to unconditional tests which can be treated under local alternatives. Throughout, the modern approach via tangent vectors is used, cf. Pfanzagl and Wefelmeyer (1982) and references therein. The model (4.1) was earlier proposed by Janssen and Milbrodt (1993).

Let $\vartheta \mapsto P_\vartheta$ denote an $L_2$-differentiable curve at $\vartheta = 0$, see Strasser (1985, Section 75), whose tangent $g \in L_2^{(0)}(P_0) := \{ g \in L_2(P_0) : \int g \, dP_0 = 0 \}$ is given by the $L_2(P_0)$-derivative of $\vartheta \mapsto 2 (dP_\vartheta / dP_0)^{1/2}$. Starting with regression coefficients (3.3), we introduce the model of the joint distribution of $X_1, \dots, X_n$ as

$$\mathcal{L}(X_1, \dots, X_n) = \bigotimes_{i=1}^n P_{c_{ni}} =: P_{n,F,g}, \quad g \in L_2^{(0)}(P_0), \tag{4.1}$$

where $F$ denotes the distribution function of $P_0$. The parametrization by $F$ and $g$ is allowed within the asymptotic setting since

$$\left\| \bigotimes_{i=1}^n P_{c_{ni}} - \bigotimes_{i=1}^n P'_{c_{ni}} \right\| \to 0$$

in the variational distance whenever $P_\vartheta$ and $P'_\vartheta$ admit the same tangent $g$. This is due to local asymptotic normality (LAN). Note that $L_2$-differentiability implies the LAN of the experiment

$$E_n = \left( \mathbb{R}^n, \mathcal{B}^n, \{ P_{n,F,g} : g \in L_2^{(0)}(P_0) \} \right), \tag{4.2}$$

cf. Strasser (1985, Section 79.2), given by the approximation

$$\log \frac{dP_{n,F,g}}{dP_{n,F,0}} = L_n(g) - \frac{1}{2} \int g^2 \, dF + o_{P_{n,F,0}}(1), \tag{4.3}$$

where $L_n(\cdot)$ denotes the central sequence

$$L_n(g)(x) := \sum_{i=1}^n c_{ni}\, g(x_i), \quad x = (x_1, \dots, x_n)^T. \tag{4.4}$$

Let $g_1, g \in L_2^{(0)}(P_0)$. Then Le Cam's third lemma implies

$$L_n(g_1) \longrightarrow \int_0^1 g_1 \circ F^{-1}(u) \, B_0(du) + \int_0^1 g_1 \circ F^{-1}(u)\, g \circ F^{-1}(u) \, du \tag{4.5}$$

in distribution under local alternatives $P_{n,F,g}$. Note that the right-hand side of (4.5) has mean

$$\int g_1 \circ F^{-1} \; g \circ F^{-1} \, d\lambda_{(0,1)} = \int g_1 g \, dP_0 \tag{4.6}$$

and variance

$$\int (g_1 \circ F^{-1})^2 \, d\lambda_{(0,1)} = \int g_1^2 \, dP_0. \tag{4.7}$$

These arguments motivate that the limit experiment $G_F$ of $E_n$ (4.2) is given by the Gaussian shift

$$G_F := \left( C[0,1], \mathcal{B}(C[0,1]), \{ Q_{F,g} : g \in L_2^{(0)}(P_0) \} \right), \tag{4.8}$$

with $Q_{F,g}$ defined by

$$\log \frac{dQ_{F,g}}{dQ_{F,0}} = \int_0^1 g \circ F^{-1}(u) \, B_0(du) - \frac{1}{2} \int_0^1 g^2 \circ F^{-1}(u) \, du, \tag{4.9}$$

where $Q_{F,0} = Q$ is the distribution of the Brownian bridge $B_0$. This model is a submodel of the case where the uniform distribution $\lambda_{(0,1)}$ ($= P_0$) is restricted to the tangent set $\{ g \circ F^{-1} : g \in L_2^{(0)}(P_0) \}$, cf. Strasser (1985, Section 82.23). Moreover, we have convergence of experiments $E_n \to G_F$ in the sense of Strasser (1985, Section 80.6). For these reasons one expects that the asymptotic power of the Kolmogorov-Smirnov test can be calculated within the limit model $G_F$. These arguments can be made rigorous.

Theorem 4.1. Let $\varphi_n$ (3.12), $\tilde\varphi_n$ (3.19) and $\hat\varphi_n$ (3.39) denote the two-sided asymptotic level $\alpha \in (0,1)$ Kolmogorov-Smirnov tests defined through (3.10). Then we have

$$\beta(g) := \lim_{n \to \infty} E_{P_{n,F,g}} \varphi_n = \lim_{n \to \infty} E_{P_{n,F,g}} \tilde\varphi_n = \lim_{n \to \infty} E_{P_{n,F,g}} \hat\varphi_n = Q\left( \sup_{s \in F(\mathbb{R})} \left| B_0(s) + \int_0^s g \circ F^{-1}(u) \, du \right| > c_\alpha \right), \tag{4.10}$$

where $c_\alpha$ denotes the $(1 - \alpha)$-quantile of (2.24).

Proof. The present proof follows standard arguments of Hájek and Šidák (1967, Chap. VI, Section 4.4), see also Milbrodt and Strasser (1990). Consider first $g_1 = 1_{(-\infty, t]} - F(t)$, $t \in \mathbb{R}$. Then (4.5) is equal to

$$B_0(F(t)) + \int_0^{F(t)} g \circ F^{-1}(u) \, du, \tag{4.11}$$

which is actually the limit distribution of

$$\sum_{i=1}^n c_{ni}\, 1_{(-\infty, t]}(X_i) \tag{4.12}$$

under $P_{n,F,g}$. The Cramér-Wold device establishes the convergence of their finite dimensional marginal distributions. In addition, tightness of (4.12) remains valid under contiguous alternatives if we consider $s \in F(\mathbb{R}) \cap (0,1)$ and replace $t = F^{-1}(s)$. Thus we may apply the functional

$$f \mapsto \sup_{s \in F(\mathbb{R})} |f(s)|, \tag{4.13}$$

which completes the proof. □

Remark 4.1. The limit experiment $G_F$ (4.8) can be written in an equivalent form given by

$$\tilde G_F = \left( C_0(\mathbb{R}), \mathcal{B}(C_0(\mathbb{R})), \{ \tilde Q_{F,g} : g \in L_2^{(0)}(P_0) \} \right)$$

on the set of continuous functions $C_0(\mathbb{R})$ on $\mathbb{R}$ vanishing at infinity. The distributions $\tilde Q_{F,g}$ are defined by the shift model

$$\tilde Q_{F,g} := \mathcal{L}\left( \left( B_0(F(x)) + \int_{(-\infty, x]} g \, dF \right)_{x \in \mathbb{R}} \right)$$

of the Brownian bridge. Obviously, the corresponding log-likelihood ratio is the same as in (4.9). The right-hand side of (4.10) then reads as

[...]

Next we briefly sketch some applications of Theorem 4.1 and some extensions of the continuous case. We restrict our attention to the two-sample testing problem given by regression coefficients (2.9).

Example 4.1 (Continuation of Example 2.1). (a) The conditional two-sample Kolmogorov-Smirnov tests $\tilde\varphi_n$ (3.19) and $\hat\varphi_n$ (3.39) given by $I_{KS}$ are asymptotically admissible in the following sense. Assume that $\varphi_n'$ denotes a further asymptotic level $\alpha$ test for $H_0$ (2.1) and suppose

$$\liminf_{n \to \infty} E_{P_{n,F,g}} \varphi_n' \ge \beta(g) \tag{4.14}$$

for each $g \in L_2^{(0)}(P_0)$ and each $P_0$. Then we have

[...]

in $P_{n,F,g}$ probability. The proof follows the lines of Strasser (1985, p. 436).

(b) (Strict asymptotic unbiasedness of $\tilde\varphi_n$ and $\hat\varphi_n$.) For each $g \in L_2^{(0)}(P_0) \setminus \{0\}$, we have $\beta(g) > \alpha$, where $\beta$ is as in (4.10).

(c) (Consistency.) For $g \neq 0$ and $t_n \to \infty$, we have $\beta(t_n g) \to 1$ as $n \to \infty$.

(d) As a further consequence one immediately generalizes the principal component decomposition for the power of Kolmogorov-Smirnov tests of Milbrodt and Strasser (1990) if ties are present. Note that for each $g \in L_2^{(0)}(P_0)$ we have

$$\beta(tg) = \alpha + \tfrac{1}{2} a(g) t^2 + o(t^2) \tag{4.15}$$

as $t \to 0$, where $a(g) = a(g, \alpha) \ge 0$ denotes the curvature of the power function along the ray $\{ tg : t \in \mathbb{R} \}$ at $t = 0$. It can be shown that for fixed $P_0$ the curvature $g \mapsto a(g)$ admits a principal component decomposition in the sense of Milbrodt and Strasser (1990, Theorem 2.8).
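The asymptotic power of the two-sided Kolmogorov-Smirnov test under a local alternative can be approximated by simulation in the limit model. The sketch below rests on assumptions that are not spelled out in this form in the text: it takes $P_0$ to be a discrete distribution with atom probabilities `p`, reads the limiting process from (4.11) as $B_0(s) + \int_0^s g \circ F^{-1}(u)\,du$ on $s \in F(\mathbb{R})$, and uses the hypothetical name `asymptotic_ks_power`. For discrete $P_0$ the shift at $s_k = F(x_k)$ is the partial sum $\sum_{j \le k} g(x_j) p_j$, and the bridge is simulated exactly at the finitely many points $s_k$.

```python
import numpy as np

def asymptotic_ks_power(p, g, alpha=0.05, n_sim=20000, seed=0):
    """Monte Carlo sketch of the limiting power for discrete P0 with atom probabilities p.

    g holds the values of the tangent at the atoms; it is centred so that
    its integral with respect to P0 vanishes.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    g = np.asarray(g, dtype=float)
    g = g - np.sum(g * p)            # enforce int g dP0 = 0
    s = np.cumsum(p)                 # points of F(R) in (0, 1]
    s[-1] = 1.0                      # guard against rounding in the last point
    shift = np.cumsum(g * p)         # int_0^{s_k} g(F^{-1}(u)) du
    # Brownian motion at s_k from independent increments with variances p_k.
    w = np.cumsum(rng.normal(size=(n_sim, p.size)) * np.sqrt(p), axis=1)
    bridge = w - s * w[:, -1:]       # B0(s_k) = W(s_k) - s_k W(1)
    c_alpha = np.quantile(np.abs(bridge).max(axis=1), 1 - alpha)
    power = np.mean(np.abs(bridge + shift).max(axis=1) > c_alpha)
    return c_alpha, power

# Hypothetical example: four equally likely atoms, tangent shifting mass upwards.
c_alpha, power = asymptotic_ks_power([0.25] * 4, [-2.0, -2.0, 2.0, 2.0])
```

The same simulated bridge paths are used for the critical value and for the power, so the estimated power at $g = 0$ reproduces the level $\alpha$ up to Monte Carlo resolution.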

5. Main proofs

The proof of Theorem 3.1 is based on a conditional invariance principle. Throughout let $U_1, U_2, \dots : \Omega \to (0,1)$ denote i.i.d. uniformly distributed random variables with empirical distribution function $\hat U_n(\omega, t)$. Let $M$ denote (the probability one set)

$$M = \left\{ \omega : \sup_{t \in [0,1]} |\hat U_n(\omega, t) - t| \to 0, \ U_i(\omega) \neq U_j(\omega) \text{ for all } i \neq j \right\}.$$

For $\omega \in M$ introduce the process $s \mapsto Y_{n,s}$ on $[0,1]$ into $C[0,1]$ by

$$Y_{n,s}(\omega, \tilde\omega) := \sum_{i=1}^n 1_{[0,s]}(U_{i:n}(\omega))\, c_{n\sigma_{ni}(\tilde\omega)} + \tilde R_n(s), \tag{5.1}$$

with remainders $\tilde R_n(s)$ making (5.1) continuous in $s$ such that $\tilde R_n(0) = \tilde R_n(1) = 0$, $\tilde R_n(U_{i:n}) = 0$ and $Y_{n,s}$ is piecewise linear in between.

Lemma 5.1. For fixed $\omega \in M$ we have

$$(Y_{n,s}(\omega, \cdot))_{s \in [0,1]} \longrightarrow (B_0(s))_{s \in [0,1]} \tag{5.2}$$

weakly in distribution on $C[0,1]$ under uniformly distributed permutations $(\sigma_{ni}(\tilde\omega))_{i \le n}$.

Proof. Let $\omega \in M$ be fixed. According to Hájek and Šidák (1967, p. 186) or Billingsley (1968, Section 24), we have

$$(Z_{n,s})_{s \in [0,1]} \longrightarrow (B_0(s))_{s \in [0,1]} \tag{5.3}$$

weakly in $C[0,1]$, where $Z_{n,s}$ is given in (3.15). Check that at the points $s = U_{i:n}(\omega)$, $i \le n$,

$$Y_{n,s} = Z_{n, \hat U_n(\omega, s)}. \tag{5.4}$$

Let $\phi_n : [0,1] \to [0,1]$ denote the one-to-one continuous transformation given by $\phi_n(0) = 0$, $\phi_n(1) = 1$, $\phi_n(U_{i:n}(\omega)) = i/n$, which is linear on $[U_{i-1:n}(\omega), U_{i:n}(\omega)]$. Consequently, we have

$$Y_{n,s} = Z_{n, \phi_n(s)}. \tag{5.5}$$

Since $\sup\{ |\phi_n(s) - s| : s \in [0,1] \} \to 0$, it is easy to see that

$$\| f_n \circ \phi_n - f \| \to 0, \tag{5.6}$$

whenever $\| f_n - f \| \to 0$ in $C[0,1]$. If we now combine (5.3)-(5.6), standard arguments of Billingsley (1968, p. 34) complete the proof. □

Proof of Theorem 3.1 (part (a)). The proof is given in two steps.

Step 1: $I = I_{KS}$. Consider a fixed element $\omega \in M$ and $X_i = F^{-1}(U_i)$. The basic identity (3.17) implies

$$\sup_{s \in \mathrm{supp}(\hat H_n)} |Z_{n,s}^{(+)}| = \sup_{s \in \mathrm{supp}(H_F)} |Y_{n,s}^{(+)}|$$

if we take (2.11) into account. Now we can apply the continuous function

$$f \mapsto \sup_{s \in \mathrm{supp}(H_F)} |f^{(+)}(s)|$$

and the invariance principle (5.2) proves the desired results. Note that $G$ is continuous according to the appendix.

Step 2: $I = I^{(q)}$, $q$ bounded. Lemma 5.2 below shows weak convergence of $\hat H_n(\omega) \to H_F$ on $[0,1]$ for all $\omega$ lying in a probability one set $N$. Let $\omega \in N$ be fixed and

consider $f_n, f \in \{g \in C[0,1] : g(0) = g(1) = 0\}$. Then it is easy to see that $\|f_n - f\| \to 0$ implies

$$\int f_n^2\, q\, d\hat H_n \to \int f^2\, q\, dH_F. \qquad (5.7)$$

The invariance principle (5.3) together with Theorem 5.5 of Billingsley (1968) now imply

$$\int (Z_{n,s}^{(+)})^2 q(s)\, d\hat H_n(\omega,s) \to \int (B_0^{(+)}(s))^2 q(s)\, dH_F(s),$$

which completes the proof of part (a). □

Lemma 5.2. Under $H_0$ we have weak convergence of

$$\hat H_n \to H_F \qquad (5.8)$$

on $[0,1]$ almost surely, where $H_F$ is defined by (3.21).

Proof. Let us use again $X_i = F^{-1}(U_i)$. Define $N := M \cap \tilde N$, where $M$ is as in Lemma 5.1, and set

$$\tilde N := \Big\{\omega : \sup_{x \in \mathbb{R}} |\hat F_n(x) - F(x)| \to 0\Big\}.$$

Let $S(\varphi)$ denote the set of continuity points of a monotone function $\varphi$. Throughout, (5.8) will be established for fixed $\omega \in N$.

(1) First let us verify that

$$\liminf_{n \to \infty} \hat H_n[0,y] \ge H_F[0,y) \qquad (5.9)$$

holds except for a countable subset of $(0,1)$. Note that convergence in distribution implies $\hat F_n^{-1}(y) \to F^{-1}(y)$ for all $y \in S(F^{-1})$, $y \in (0,1)$, where $\hat F_n^{-1}$ denotes the inverse of $t \mapsto \hat F_n(\omega,t)$. Moreover,

$$\hat H_n[0,y] \ge \hat F_n(\hat F_n^{-1}(y)-). \qquad (5.10)$$

For $y \in S(F^{-1})$ choose $z < F^{-1}(y)$, so that $z < \hat F_n^{-1}(y)$ for all large $n$. Since $\omega \in \tilde N$, we have by (5.10)

$$\liminf_{n \to \infty} \hat H_n[0,y] \ge F(z), \qquad (5.11)$$

where $F(z) \uparrow F(F^{-1}(y)-)$ as $z \uparrow F^{-1}(y)$. Note that in addition

$$H_F[0,y) = \lambda_{(0,1)}(u : F \circ F^{-1}(u) < y) \le \lambda_{(0,1)}(u : F^{-1}(u) < F^{-1}(y)) = F(F^{-1}(y)-).$$

This statement, together with (5.11), implies assertion (5.9).

(2) The converse inequality of (5.9) for limsup can be obtained as follows. Consider random variables $V_n$ on $(0,1)$ with distribution function $t \mapsto \hat U_n(\omega,t)$ for fixed $\omega \in N$ (cf. Lemma 5.1). Consequently,

$$V_n \to U_1 \quad \text{and} \quad F \circ F^{-1}(V_n) \to F \circ F^{-1}(U_1)$$

converge both in distribution since $F \circ F^{-1}$ is $\lambda_{(0,1)}$ almost everywhere continuous. Check that $F^{-1}(V_n)$ has distribution function $t \mapsto \hat F_n(\omega,t)$. Thus

$$\mathcal{L}(F \circ \hat F_n^{-1} \mid \lambda_{(0,1)}) = \mathcal{L}(F \circ F^{-1}(V_n)) \to H_F \qquad (5.12)$$

weakly on $[0,1]$ as $n \to \infty$. On the other hand, note that

$$\hat H_n = \mathcal{L}(\hat F_n \circ \hat F_n^{-1} \mid \lambda_{(0,1)}). \qquad (5.13)$$

Choose now continuity points $y, y+\varepsilon \in (0,1)$, $\varepsilon > 0$, of $x \mapsto H_F[0,x]$. In addition observe that $|\hat F_n(x) - F(x)| \le \varepsilon$ for all $x$ and $n$ large enough. If we take (5.12) and (5.13) into account, we have for large $n$

$$\hat H_n[0,y] = \lambda_{(0,1)}(u : \hat F_n(\hat F_n^{-1}(u)) \le y) \le \lambda_{(0,1)}(u : F(\hat F_n^{-1}(u)) \le y + \varepsilon) \to H_F[0,y+\varepsilon]. \qquad (5.14)$$

Altogether, (5.9) and (5.14) yield convergence of $\hat H_n[0,y] \to H_F[0,y]$ for a dense set of $(0,1)$. □

Proof of Theorem 3.1 (part (b)). First, it can be checked that $\int B_0(s)^2 q(s)\, dH_F(s)$ is finite almost surely. Observe that by Fubini's theorem

$$\int\!\!\int B_0(s)^2 q(s)\, dH_F(s)\, dQ = \int \mathrm{Var}(B_0(s))\, q(s)\, dH_F(s) < \infty. \qquad (5.15)$$

The present proof is based on Theorem 3.2. We may choose the approximating sequence of seminorms (3.32) as

$$I_{k,H}(f) = \Big(\int f^2 \min(k,q)\, dH\Big)^{1/2}, \qquad (5.16)$$

with bounded weight function $\min(q,k)$ for $k \in \mathbb{N}$. Note that assumption (3.33) is valid according to part (a) of the theorem. In order to apply Theorem 3.2, it is now enough to prove unconditional convergence, see condition (C). This is done in Lemma 5.3. The convergence of the quantile functions (3.27) can be obtained as follows. Note that $\mathrm{supp}(G^{(+)})$ is a closed interval and that the inverse $G^{(+)-1}$ is continuous on $(0,1)$. Thus convergence in distribution implies here pointwise convergence of their inverse functions. Monotonicity arguments yield uniform convergence on compact sets.

Lemma 5.3. For each weight function $q$ given by (3.24), we have

$$\int (X_{n,s}^{(+)})^2 q(s)\, d\hat H_n(s) \to \int (B_0^{(+)}(s))^2 q(s)\, dH_F(s) \qquad (5.17)$$

in distribution under nondegenerate distributions $F \in H_0$.

Proof. Throughout consider continuity points $\delta$ and $1-\delta$, $0 < \delta < 1$, of the distribution function of $H_F$ (3.21). Introduce $q_\delta(s) := q(s)\, 1_{[\delta,1-\delta]}(s)$. According to Billingsley (1968, Theorem 4.2), it remains to check the following three conditions.

(1) For $\delta > 0$ we have convergence in distribution of

$$\int (X_{n,s}^{(+)})^2 q_\delta(s)\, d\hat H_n(s) \to \int (B_0^{(+)}(s))^2 q_\delta(s)\, dH_F(s).$$

(2) We have

$$\int (B_0^{(+)}(s))^2 q_\delta(s)\, dH_F(s) \to \int (B_0^{(+)}(s))^2 q(s)\, dH_F(s)$$

in distribution as $\delta \downarrow 0$.

(3) $$\lim_{\delta \downarrow 0}\, \limsup_{n \to \infty}\, E\Big(\int (X_{n,s}^{(+)})^2 (q(s) - q_\delta(s))\, d\hat H_n(s)\Big) = 0.$$

Assertion (1) follows from the conditional convergence theorem for the bounded weight function $q_\delta$, see Remark 3.1(b). Note that $q_\delta$ is $H_F$ almost surely continuous and (5.7) carries over. It should be mentioned that (1) also follows directly along the lines of the proof of Theorem 3.1(a). Moreover, observe that condition (2) is verified provided

$$\lim_{\delta \downarrow 0} E \int (B_0^{(+)}(s))^2 (q(s) - q_\delta(s))\, dH_F(s) = 0 \qquad (5.18)$$

holds. Obviously, it remains to check the conditions (3) and (5.18) for two-sided tests. Note that

$$E \int B_0^2(s)(q(s) - q_\delta(s))\, dH_F(s) = \int E(B_0^2(s))(q(s) - q_\delta(s))\, dH_F(s)$$

converges to zero as $\delta \downarrow 0$, which implies (5.18). Similarly, the third condition can be verified, which is done below. As in Section 3, we get that the conditional variance of $X_{n,s}$ given the order statistics is equal to $[n/(n-1)]\, s(1-s)$ for $s = j/n \in \mathrm{supp}(\hat H_n)$.
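The conditional variance $[n/(n-1)]\, s(1-s)$ at $s = j/n$ can be checked by brute force for small $n$. The sketch below enumerates all permutations and compares the exact variance of a partial sum of permuted scores with that expression. It assumes scores $c_1, \ldots, c_n$ that are centered with unit sum of squares, and it takes the sum of the first $j$ permuted scores as the process value at $s = j/n$; this normalization and the helper names `partial_sum_variance` and `normalized_scores` are assumptions made for illustration, since the defining formulas of Section 3 are not reproduced here.

```python
from itertools import permutations
from math import sqrt

def partial_sum_variance(c, j):
    """Exact variance of c[p(1)] + ... + c[p(j)] over all permutations p."""
    values = [sum(p[:j]) for p in permutations(c)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def normalized_scores(m, n):
    """Two-sample scores (assumed form): centered, sum of squares one."""
    raw = [1.0] * m + [0.0] * (n - m)
    mean = sum(raw) / n
    centered = [r - mean for r in raw]
    norm = sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]

if __name__ == "__main__":
    n = 6
    c = normalized_scores(2, n)
    for j in range(1, n):
        s = j / n
        # exact permutation variance next to n/(n-1) * s * (1 - s)
        print(j, partial_sum_variance(c, j), n / (n - 1) * s * (1 - s))
```

The agreement is the finite population variance formula for sampling without replacement: $j$ draws from $n$ centered scores with population variance $1/n$ have variance $j(n-j)/(n(n-1))$.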

If we now condition under the order statistics, we have

$$E \int X_{n,s}^2 (q(s) - q_\delta(s))\, d\hat H_n(s) = \int\!\!\int \frac{n}{n-1}\, s(1-s)(q(s) - q_\delta(s))\, d\hat H_n(s)\, d\mathcal{L}(X_{1:n}, \ldots, X_{n:n}) \le K \int \hat H_n((0,1) \setminus (\delta, 1-\delta))\, d\mathcal{L}(X_{1:n}, \ldots, X_{n:n}),$$

where the upper bound converges to

$$K\, H_F((0,1) \setminus (\delta, 1-\delta)). \qquad (5.19)$$

This limit follows from Lemma 5.2 and the strong law of large numbers, which yields $\hat H_n(\{1\}) \to H_F(\{1\})$ almost surely for a proper treatment of the upper endpoint. Since (5.19) becomes arbitrarily small for $\delta \downarrow 0$, the conditions (1)-(3) are established and the proof is complete. □

Proof of Theorem 3.2. The proof is carried out for $G_n(\cdot,\omega)$. Analogous arguments apply to $G_n^+(\cdot,\omega)$. Notice that for each $x$

$$G_{k,n}(x,\omega) \ge G_n(x,\omega). \qquad (5.20)$$

Let $B$ denote the intersection of the sets of continuity points of $G$ and $G^{(k)}$ for each $k$. Throughout let $x \in B$ be fixed. The assumption implies

$$G^{(k)}(x) - G(x) \to 0 \qquad (5.21)$$

as $k \to \infty$. On the other hand, we have $G_{k,n}(x,\omega) - G^{(k)}(x) \to 0$ in probability as $n \to \infty$ for each $k \in \mathbb{N}$. This result combined with (5.20) and (5.21) immediately proves

$$(G_n(x,\omega) - G(x))^+ \to 0 \quad \text{in probability.} \qquad (5.22)$$

The extension of (5.22) to negative parts runs as follows. Choose $X_i = F^{-1}(U_i)$ and observe that $(c_{ni})_i$ [...] $(D_{ni}(u))_i$ [...], which turns out to be independent of $(X_{i:n})_i = (F^{-1}(U_{i:n}))_i$. Returning to our definition (3.4) and (3.14) we see, similarly to (2.22), that

$$\omega \mapsto ((X_{n,s}(\omega))_s, \hat H_n(\omega)) \quad \text{and} \quad (\omega,\sigma) \mapsto ((Z_{n,s}(\sigma))_s, \hat H_n(\omega))$$

have the same distribution since $\hat H_n$ only depends on order statistics. By Lemma 3.1, we have

$$\int G_n(x,\omega)\, dP(\omega) = P(I((X_{n,s})_s) \le x) \to G(x),$$

which converges according to our assumptions. Thus

$$\int \{G_n(x,\omega) - G(x)\}\, dP(\omega) \to 0 \qquad (5.23)$$

holds for fixed $x \in B$. On the other hand, we obtain

$$\int (G_n(x,\omega) - G(x))^+\, dP(\omega) \to 0 \qquad (5.24)$$

by (5.22). Combining (5.23) and (5.24), we have

$$\int |G_n(x,\omega) - G(x)|\, dP(\omega) \to 0.$$

Now we may choose a countable dense subset $\{x_i \in B : i \in \mathbb{N}\}$ of $[0,\infty)$. Then

$$\int \sum_{i=1}^{\infty} |G_n(x_i,\omega) - G(x_i)| / 2^i\, dP(\omega) \to 0.$$

For each subsequence there exists a further subsequence $n_k$ such that

$$\sum_{i=1}^{\infty} |G_{n_k}(x_i,\omega) - G(x_i)| / 2^i \to 0$$

holds almost surely. Along that subsequence we get $G_{n_k}(x,\omega) - G(x) \to 0$ almost surely again for all continuity points $x$ of $G$. This gives us convergence of $d(G_n(\cdot,\omega), G(\cdot))$ to zero in probability. □

Proof of Theorem 3.3. First consider

the semi-norm $I_{KS}$. Then, we have

$$\Big| \sup\{|Z_{n,s}^{(+)}| : s \in \mathrm{supp}(\hat H_n)\} - \sup\{|B_0^{(+)}(s)| : s \in \mathrm{supp}(\hat H_n)\} \Big| \le \|B_0(\cdot) - Z_{n,\cdot}\|. \qquad (5.25)$$

According to (5.3) and Skorokhod's embedding theorem (Shorack and Wellner (1986, p. 47)), one can find versions of $B_0(\cdot)$ and $Z_n$ such that the right-hand side of (5.25) converges almost surely to zero. Together with (3.25) and (3.26) one gets assertions (3.40) and (3.41). The same arguments apply to $I^{(q)}$ for bounded weight functions. The device of the proof of Theorem 3.2 can now be used to establish the result for unbounded weight functions $q$ (3.24). Consider again the approximating sequence $I_{k,H}$ (5.16) of semi-norms and let $G_{k,n}^{(+)}$ denote their conditional distribution functions (3.38). Then an inequality analogous to (5.20) holds for these distribution functions. We already proved (3.40) and (3.41) for $k$. Hence the arguments of the proof of Theorem 3.2 can be adapted. The proof of statement (3.42) is the same as the proof of Lemma 3.2. □

Appendix

Let $I : C[0,1] \to [0,\infty]$ denote a measurable semi-norm and let $Q = \mathcal{L}((B_0(s))_s)$ be the distribution of the standard Brownian bridge. It is well known that each measurable linear subspace $V \subset C[0,1]$ has either $Q$-measure zero or one, cf. Kallianpur (1970). A survey of results of that type is given in Janssen (1984). Evidently, $N_1 := \{f : I(f) = 0\}$ and $N_2 := \{f : I(f) < \infty\}$ are 0-1 sets. Throughout, we are concerned with the distribution of the semi-norm $I$. Lemma A.1(a) is due to Hoffmann-Jørgensen et al. (1979) who applied convexity arguments of Borell (1974). Let $I$ be nontrivial, i.e. $Q(N_1 \cup N_2^c) = 0$. Then, we have the following lemma.

Lemma A.1. Let $H_I^{(+)}(t) := Q(I((B_0^{(+)}(s))_s) \le t)$ denote the distribution functions of $I$ under $B_0$ and $B_0^+$. Choose $x_0^{(+)} := \sup\{t : H_I^{(+)}(t) = 0\}$. Then we have:
(a) $\log H_I(t)$ is concave on $(x_0, \infty)$ and $H_I$ is absolutely continuous on that interval,
(b) if $I$ is positive increasing (3.30) and $Q(f : I(|f|) < \infty) = 1$, the same result holds for $H_I^+$ on $(x_0^+, \infty)$.

This result implies absolute continuity of $H_I$ under mild conditions whereas $H_I^+$ may typically have a jump at zero.

Remarks A.2. (a) The distribution $H_I$ is absolutely continuous whenever the condition

$$Q(I((B_0(s))_s) \in (0,t)) > 0 \qquad (A1)$$

holds for all $t > 0$. Obviously, (A1) is valid for nontrivial $\|\cdot\|$-continuous semi-norms on $C[0,1]$.

(b) However, continuity of $I$ does not imply $H_I^+(0) = 0$ and absolute continuity of $H_I^+$. For instance choose $I(f) = |f(y)|$ for some $y \in (0,1)$. More generally consider $I \in \{I_{KS}, I^{(q)}\}$ (3.10), (3.11) and nondegenerate $F \in H_0$. Then there exists $y \in \mathrm{supp}(H_F) \cap (0,1)$. Thus [...]

(c) The support of $H_I^{(+)}$ is an (usually unbounded) interval with lower endpoint $x_0^{(+)}$. (For reasons of concavity $H_I^{(+)}$ cannot be constant on whole intervals between $x_0^{(+)}$ and the upper endpoint, since $\log H_I^{(+)}(t) \to 0$ as $t \to \infty$.) Hence the quantile function $H_I^{(+)-1}$ is continuous on $(0,1)$.
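As a numerical illustration of the concavity statement in Lemma A.1(a), consider the classical case without ties, where the semi-norm is the supremum over $[0,1]$ and the distribution function of $\sup_s |B_0(s)|$ is given by the Kolmogorov series $K(t) = 1 - 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2t^2}$. The restriction to this special case, the truncation at 100 terms, the grid, and the function names are assumptions of the sketch; it does not cover supports with atoms.

```python
from math import exp, log

def kolmogorov_cdf(t, terms=100):
    """K(t) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 t^2), t > 0."""
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * exp(-2.0 * k * k * t * t)
    return 1.0 - 2.0 * total

def second_differences_of_log(ts):
    """Discrete second differences of log K on an equally spaced grid."""
    logs = [log(kolmogorov_cdf(t)) for t in ts]
    return [logs[i - 1] - 2.0 * logs[i] + logs[i + 1]
            for i in range(1, len(logs) - 1)]

if __name__ == "__main__":
    grid = [0.3 + 0.01 * i for i in range(221)]  # 0.30, 0.31, ..., 2.50
    diffs = second_differences_of_log(grid)
    # negative second differences are consistent with concavity of log K
    print(max(diffs))
```

Negative second differences on the whole grid are what concavity of $t \mapsto \log K(t)$ predicts; the check is only a finite-grid sanity test and not a proof.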

Proof. The present proof relies on an obvious extension of Theorem 1.2 of Hoffmann-Jørgensen et al. (1979). Let $C_0 \subset C[0,1]$ be a measurable subspace with $Q((B_0(s))_s \in C_0) = 1$. Consider a convex function $q : C_0 \to [0,\infty)$, i.e. $q(\lambda f + (1-\lambda)g) \le \lambda q(f) + (1-\lambda) q(g)$, $0 \le \lambda \le 1$ and $f, g \in C_0$. Then $t \mapsto \log Q(q((B_0(s))_s) \le t)$ is concave and $\mathcal{L}(q((B_0(s))_s))$ is absolutely continuous on the interval $(t_0, \infty)$, $t_0 := \sup\{t : Q(q((B_0(s))_s) \le t) = 0\}$. To prove this, consider $s, t \in [t_0, \infty)$. Then convexity of $q$ implies

$$\lambda\{f : q(f) \le t\} + (1-\lambda)\{g : q(g) \le s\} \subset \{f : q(f) \le \lambda t + (1-\lambda)s\}.$$

If we proceed as in Hoffmann-Jørgensen et al. (1979), (1.11), the statement follows from the results of Borell (1974). Obviously, this result gives assertion (a).

To prove part (b) define $q(f) := I(f^+)$. Now it is easy to check that $q$ is convex iff $I$ is positive increasing, which is done below. Assume (3.30). Then $(\lambda f + (1-\lambda)g)^+ \le \lambda f^+ + (1-\lambda) g^+$, and convexity of $q$ follows from (3.30) and the triangle inequality. Conversely, take $0 \le f \le g$ and consider $f = \tfrac{1}{2}(f-g) + \tfrac{1}{2}(g+f)$. Convexity of $q$ yields

$$I(f) \le \tfrac{1}{2} I((f-g)^+) + \tfrac{1}{2} I(g+f) \le \tfrac{1}{2}(I(g) + I(f)),$$

since $I((f-g)^+) = 0$. Next we take care that $q$ remains finite on $C_0 = \{f : I(|f|) < \infty\}$. Note that (3.30) implies $I(f^+) \le I(|f|) < \infty$ for $f \in C_0$. □
Acknowledgement

I am grateful to D.M. Mason who showed me the $\delta$-method (1)-(3) of the proof of Lemma 5.3.

References

Anderson, T.W. and D.A. Darling (1952). Asymptotic theory of certain 'goodness of fit' criteria based on stochastic processes. Ann. Math. Statist. 23, 193-212.
Behnen, K. and G. Neuhaus (1989). Rank Tests with Estimated Scores and their Application. Teubner, Stuttgart.
Bickel, P. (1969). A distribution free version of the Smirnov two sample test in the p-variate case. Ann. Math. Statist. 40, 1-23.
Billingsley, P. (1968). Weak Convergence of Probability Measures. Wiley, New York.
Borell, C. (1974). Convex measures on locally convex spaces. Ark. Math. 12, 239-252.
Chibisov, D.M. (1965). An investigation of the asymptotic power of the tests of fit. Theory Prob. Appl. X, 421-437.
Durbin, J. (1973). Distribution theory for tests based on sample distribution function. Reg. Conf. Ser. Appl. Math., Vol. 9, SIAM, Philadelphia.
Gleser, L.J. (1985). Exact power of goodness-of-fit tests of Kolmogorov type for discontinuous distributions. J. Amer. Statist. Assoc. 80, 954-958.
Guilbaud, O. (1986). Stochastic inequalities for Kolmogorov and similar statistics, with confidence region applications. Scand. J. Statist. 13, 301-305.
Guilbaud, O. (1988). Exact Kolmogorov-type tests for left truncated and/or right-censored data. J. Amer. Statist. Assoc. 83, 213-221.
Hájek, J. and Z. Šidák (1967). Theory of Rank Tests. Academic Press, New York.
Hoffmann-Jørgensen, J., L.A. Shepp and R.M. Dudley (1979). On the lower tail of Gaussian seminorms. Ann. Prob. 7, 319-342.
Janssen, A. (1984). A survey about zero-one laws for probability measures on linear spaces and locally compact groups. In: Lecture Notes in Mathematics, Vol. 1064, Springer, Berlin, 551-563.
Janssen, A. and H. Milbrodt (1993). Rényi type goodness of fit tests with adjusted principal direction of alternatives. Scand. J. Statist. (to appear).
Kolmogorov, A. (1941). Confidence limits for an unknown distribution function. Ann. Math. Statist. 12, 461-463.
Mason, D. and U. Einmahl (1992). Approximations to permutation and exchangeable processes. J. Theoret. Prob. 5, 101-126.
Milbrodt, H. and H. Strasser (1990). On the asymptotic power of the two-sided Kolmogorov-Smirnov test. J. Statist. Planning Inference 26, 1-23.
Neuhaus, G. (1992). Conditional rank tests for the two-sample problem under random censorship: treatment of ties (to appear).
Neuhaus, G. (1993). Conditional rank tests for the two-sample problem under random censorship. Ann. Statist. (to appear).
Noether, G. (1963). Note on the Kolmogorov statistic in the discrete case. Metrika 7, 115-116.
Pfanzagl, J. and W. Wefelmeyer (1982). Contributions to a General Asymptotic Statistical Theory. Lecture Notes in Statistics, Vol. 13, Springer, New York.
Praestgaard, J.T. (1991). General-weights bootstrap of the empirical process. Ph.D. dissertation, Department of Statistics, University of Washington.
Romano, J.P. (1989). Bootstrap and randomization tests for some nonparametric hypotheses. Ann. Statist. 17, 141-159.
Shorack, G.R. and J.A. Wellner (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
Strasser, H. (1985). Mathematical Theory of Statistics. De Gruyter, Berlin.
Witting, H. and G. Nölle (1970). Angewandte Mathematische Statistik. Teubner, Stuttgart.