Locally optimal rank tests for independence


Statistics & Probability Letters 2 (1984) 83-89, North-Holland

March 1984

LOCALLY OPTIMAL RANK TESTS FOR INDEPENDENCE

Jerzy RUDNICKI

Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland

Received May 1983
Revised September 1983

Abstract: The problem of testing the hypothesis of independence against a multiparameter set of alternatives is considered. Rank tests having a certain locally maximin property are studied, and a characterization of these tests is given. Finite-sample and asymptotic test statistics in a restricted class of tests are derived.

Keywords: rank tests for independence, locally optimal rank test, maximin test.

1. Introduction

Let $X_1, \ldots, X_N$ be a random sample of size $N$, consisting of $m$-dimensional, independent and identically distributed random vectors. Every random vector $X_i = (X_{i1}, \ldots, X_{im})$, $i = 1, \ldots, N$, has a distribution function (df) $H(x; \Theta)$, dependent on a parameter $\Theta \in \{(\theta_1, \ldots, \theta_p): \theta_i \ge 0,\ i = 1, \ldots, p\}$. We assume that $H(x; \Theta)$ is a product of its marginals if and only if $\theta_i = 0$ for all $i = 1, \ldots, p$.

The problem to be dealt with in this paper is the construction of optimal rank tests for testing the hypothesis of independence of the random variables $X_{i1}, \ldots, X_{im}$, $i = 1, \ldots, N$. The case $p = 1$ has been studied by many authors, whereas the case $p \ge 2$ was considered only by Shirahata (1974). In his paper the form of the locally most powerful (LMP) rank test was given under the assumption that the ratios $\theta_i / \sum_{j=1}^{p} \theta_j$ converge to some known constants as $\Theta \to 0$. The aim of this paper is to study the problem without this assumption. It will be proved that, in spite of the nonexistence of the LMP rank test, there exists a test with a local maximin property.

2. Notations and assumptions

The problem we deal with is to test the hypothesis

$$H:\quad H(x; \Theta), \qquad \theta_i = 0 \ \text{ for } i = 1, \ldots, p,$$

against the alternative

$$K:\quad H(x; \Theta), \qquad \theta_i > 0 \ \text{ for at least one } i = 1, \ldots, p.$$

Throughout this paper the alternatives will be written in the form

$$\Theta = \theta\Lambda, \qquad \Lambda = (\lambda_1, \ldots, \lambda_p), \qquad \theta, \lambda_i \ge 0, \qquad \lambda_1^2 + \cdots + \lambda_p^2 = 1. \tag{2.1}$$

0167-7152/84/$3.00 © 1984, Elsevier Science Publishers B.V. (North-Holland)


It is assumed that the df's $H(x; \Theta)$ have density functions $h(x; \Theta)$ and that

$$H(x; 0) = \prod_{i=1}^{m} F_i(x_i), \qquad h(x; 0) = \prod_{i=1}^{m} f_i(x_i), \qquad 0 = (0, \ldots, 0).$$

Let $R_i = (R_{1i}, \ldots, R_{Ni})$ be the rank vector of $(X_{1i}, \ldots, X_{Ni})$, and $R = (R_1, \ldots, R_m)$.
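As a small illustration of the rank vectors just defined, the following sketch computes $R = (R_1, \ldots, R_m)$ coordinate by coordinate from a sample. The function names and the toy data are illustrative assumptions, not part of the paper, and ties are assumed absent (as is natural for continuous df's).

```python
def rank_vector(values):
    """Return the 1-based ranks of `values`, assuming there are no ties."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    ranks = [0] * len(values)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def rank_matrix(sample):
    """sample: list of N observations, each a tuple of m coordinates.

    Returns [R_1, ..., R_m], where R_i holds the ranks of the i-th
    coordinate across the N observations.
    """
    m = len(sample[0])
    return [rank_vector([obs[i] for obs in sample]) for i in range(m)]


# Hypothetical toy sample with N = 3 and m = 2.
sample = [(0.3, 2.1), (1.7, 0.4), (0.9, 1.5)]
R = rank_matrix(sample)
```

Here `R[0]` plays the role of $R_1$ and `R[1]` the role of $R_2$.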

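Assumption A1. There exist nonconstant functions

$$\lim_{\Theta \to 0} \varphi_i(x; \Theta) = \lim_{\Theta \to 0} \frac{\partial}{\partial \theta_i} \log h(x; \Theta) = \varphi_i(x), \qquad i = 1, \ldots, p,$$

satisfying

$$0 < \lim_{\Theta \to 0} \int \cdots \int |\varphi_i(x; \Theta)| \, dH(x; \Theta) = \int \cdots \int |\varphi_i(x)| \prod_{j=1}^{m} dF_j(x_j) < \infty.$$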

Assumption A2. In a certain open set containing 0, the functions

$$W(\Theta) = \int \cdots \int_{R} \prod_{i=1}^{N} h(x_i; \Theta) \, dx_i$$

have all partial derivatives of second order. Moreover, the difference quotients of their second derivatives are bounded by a certain constant, say

$$\frac{|W^{ij}(\Theta) - W^{ij}(\Theta')|}{\|\Theta - \Theta'\|} < \frac{L}{(N!)^m}, \tag{2.2}$$

where $W^{ij}(\Theta) = (\partial^2 / \partial\theta_i \partial\theta_j) W(\Theta)$, for all $\Theta$ and $\Theta'$ in the open set and for all rank vectors $R$. This assumption is in particular satisfied if $W$ is three times differentiable with respect to $\Theta$.

Let us introduce score constants

$$a^i(i_1, \ldots, i_m) = [(N-1)!]^{-m} \prod_{k=1}^{m} \binom{N-1}{i_k - 1} \int \cdots \int \varphi_i(x) \prod_{k=1}^{m} [F_k(x_k)]^{i_k - 1} [1 - F_k(x_k)]^{N - i_k} \, dF_k(x_k),$$

$$i_1, \ldots, i_m = 1, \ldots, N, \qquad i = 1, \ldots, p.$$

Let

$$a_R^i = \sum_{j=1}^{N} a^i(R_{j1}, \ldots, R_{jm})$$

and, for a test with a critical function $\psi(R)$,

$$A_\psi^i = \sum_{R} \psi(R) \, a_R^i.$$
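To make the score constants concrete, here is a minimal sketch for $m = 2$ under assumptions that are not in the paper: uniform marginals on $[0, 1]$ and a hypothetical product score $\varphi(x, y) = (x - \tfrac12)(y - \tfrac12)$. It also reads the normalising factor of $a^i$ as $[(N-1)!]^{-m}$ times a product of binomial coefficients, which is an interpretation of the formula. With a product score the double integral factorises, so each coordinate can be integrated exactly with rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial


def beta_int(a, b):
    """Exact integral of u**a * (1 - u)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def marginal_factor(i, N):
    """One coordinate's contribution for the assumed score (u - 1/2):
    [(N-1)!]^-1 * C(N-1, i-1) * integral of (u - 1/2) u^(i-1) (1-u)^(N-i) du.
    """
    integral = beta_int(i, N - i) - Fraction(1, 2) * beta_int(i - 1, N - i)
    return Fraction(comb(N - 1, i - 1), factorial(N - 1)) * integral


def score_constant(i1, i2, N):
    """a(i1, i2) for the assumed product score; factorises over coordinates."""
    return marginal_factor(i1, N) * marginal_factor(i2, N)


def a_R(R1, R2):
    """Sum of score constants over the N observations, for rank vectors R1, R2."""
    N = len(R1)
    return sum(score_constant(R1[j], R2[j], N) for j in range(N))
```

Under these assumptions each coordinate factor reduces to $(N!)^{-1}\,(i/(N+1) - \tfrac12)$, so `a_R` is proportional to a Spearman-type statistic; this is a property of the assumed score, not a claim about the general case.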


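Expanding the power function $\beta_\psi$ of $\psi$ into the Taylor series and using (2.1) we obtain

$$\begin{aligned}
\beta_\psi(\Theta) &= \sum_{R} \psi(R) \int \cdots \int_{R} \prod_{i=1}^{N} h(x_i; \Theta) \, dx_i \\
&= \alpha + \theta \sum_{k=1}^{p} \lambda_k \sum_{R} \psi(R) \sum_{i=1}^{N} \int \cdots \int_{R} \varphi_k(x_i) \prod_{j=1}^{N} h(x_j; 0) \, dx_j + o_\Lambda(\theta) \\
&= \alpha + \theta \sum_{k=1}^{p} \lambda_k A_\psi^k + o_\Lambda(\theta).
\end{aligned} \tag{2.3}$$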
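The leading part of expansion (2.3) is easy to evaluate numerically. The sketch below drops the remainder term and simply computes $\alpha + \theta \sum_k \lambda_k A^k$; the function name and the numbers in the usage line are hypothetical.

```python
def first_order_power(alpha, theta, lam, A):
    """alpha + theta * sum_k lam[k] * A[k]: expansion (2.3) without the remainder.

    lam is the direction Lambda (unit Euclidean norm, non-negative entries),
    A holds the constants A^1, ..., A^p of the test under consideration.
    """
    if abs(sum(l * l for l in lam) - 1.0) > 1e-9:
        raise ValueError("direction must have unit Euclidean norm")
    return alpha + theta * sum(l * a for l, a in zip(lam, A))


# Hypothetical values: p = 2, direction (0.6, 0.8), A = (0.5, 0.25).
approx = first_order_power(alpha=0.05, theta=0.1, lam=(0.6, 0.8), A=(0.5, 0.25))
```

The approximation is only meaningful for small $\theta$, where the remainder is negligible.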

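In the following discussion it will be useful to express the remainder $o_\Lambda(\theta)$ in the form

$$o_\Lambda(\theta) = \tfrac{1}{2} \sum_{R} \psi(R) \sum_{i=1}^{p} \sum_{j=1}^{p} \lambda_i \lambda_j W^{ij}(c\theta\Lambda) \, \theta^2, \qquad 0 < c < 1. \tag{2.4}$$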

3. Locally maximin rank tests

If the parameter $\Theta$ is multidimensional the LMP test usually does not exist. Such tests exist for any set of alternatives, provided the vector $\Lambda$ is fixed (see Shirahata, 1974); however, for different $\Lambda$'s different tests are obtained. In that case one can use the following definition of the optimality of tests:

Definition. A level $\alpha$ test $\psi^*$ is locally maximin (LM) if for any other level $\alpha$ test $\psi$ there exists $\theta_0 > 0$ such that for every $0 < \theta < \theta_0$

$$\inf_{\|\Theta\| = \theta} \beta_{\psi^*}(\Theta) \ge \inf_{\|\Theta\| = \theta} \beta_{\psi}(\Theta).$$

Before stating the main theorem we'll give a characterization of locally unbiased (LU) tests. By a LU test we mean every test $\psi$ for which $\beta_\psi(\Theta) \ge \alpha$ for $\Theta$ belonging to a neighbourhood of 0.

Lemma 3.1. Assume that there is a constant $K$ such that

$$|W^{ij}(\Theta)| \le K/(N!)^m \tag{3.1}$$

in a neighbourhood of 0 (this assumption follows in particular from assumption A2). Then $A_\psi^k \ge 0$ for $k = 1, \ldots, p$, if the test $\psi$ is LU.

Proof. From the definition of LU tests and from (2.4),

$$\sum_{k=1}^{p} \lambda_k A_\psi^k + o_\Lambda(\theta)/\theta \ge 0$$

should hold for every $\Lambda$ and small $\theta$'s. Taking $\Lambda = I_k = (\delta_{1k}, \ldots, \delta_{pk})$, where $\delta_{ik} = 1$ if $i = k$ and 0 otherwise, and considering limits as $\theta \to 0$ for $k = 1, \ldots, p$, we obtain the desired result. □

Under assumptions A1 and A2 LM rank tests are characterized by the following theorem.


Theorem 3.1. The LM rank test for testing $H$ against $K$ maximizes the expression

$$\min_{k = 1, \ldots, p} A_\psi^k. \tag{3.2}$$
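The criterion (3.2) can be sketched as a selection rule over a finite list of candidate tests, each summarised by its constants $(A^1, \ldots, A^p)$. The candidate names and values below are purely hypothetical, and the sketch assumes the constants have already been computed.

```python
def lm_choice(candidates):
    """Return the name of the candidate maximising min_k A^k, as in (3.2).

    candidates: dict mapping a test name to its tuple (A^1, ..., A^p).
    """
    return max(candidates, key=lambda name: min(candidates[name]))


# Hypothetical constants for three level-alpha rank tests, p = 2.
candidates = {
    "test_a": (0.9, 0.1),   # strong in direction 1, weak in direction 2
    "test_b": (0.4, 0.5),   # balanced
    "test_c": (0.2, 0.8),   # strong in direction 2, weak in direction 1
}
best = lm_choice(candidates)
```

The balanced candidate wins because its worst coordinate direction is better than the worst direction of either specialised candidate.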

Proof. Obviously, the LM test is also LU. Hence, by Lemma 3.1, the expression (3.2) is nonnegative for the LM test. In view of (2.3) the power function reaches its minimal values at the same points as the function

$$Z(\Lambda, \theta) = \sum_{k=1}^{p} \lambda_k A_\psi^k + \theta^{-1} o_\Lambda(\theta)$$

for any fixed $\theta > 0$. Let us assume that $A_\psi^k > 0$ for $k = 1, \ldots, p$. Consequently the function

$$z(\Lambda) = \sum_{k=1}^{p} \lambda_k A_\psi^k,$$

considered at the surface $\sum_{i=1}^{p} \lambda_i^2 = 1$, reaches its minimal values at the points $I_k$, $k = 1, \ldots, p$. We'll show that under the above assumptions the addition of the term $\theta^{-1} o_\Lambda(\theta)$ doesn't change this property for sufficiently small $\theta$'s.

Firstly we prove that for every $\varepsilon > 0$ there is a $\theta_0$ such that, for any $0 < \theta < \theta_0$ and $\|\Lambda - \Lambda'\| > \varepsilon$,

$$Z(\Lambda, \theta) > Z(\Lambda', \theta) \iff z(\Lambda) > z(\Lambda'). \tag{3.4}$$

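Let us notice that there exists $M > 0$ such that, for $\|\Lambda - \Lambda'\| > \varepsilon$,

$$z(\Lambda) - z(\Lambda') > \varepsilon M. \tag{3.5}$$

In view of (2.4) and (3.1)

$$0 < Z(\Lambda, \theta) - Z(\Lambda', \theta) < z(\Lambda) - z(\Lambda') + 2p^2 K\theta.$$

Hence, taking $\theta_0 = \varepsilon M (2p^2 K)^{-1}$, from (3.5) we'll obtain, for any $0 < \theta < \theta_0$, that $z(\Lambda) > z(\Lambda')$. On the other hand, in view of (3.1) and (3.5),

$$Z(\Lambda, \theta) - Z(\Lambda', \theta) > \varepsilon M - 2p^2 K\theta.$$

The equivalence (3.4) follows immediately from these inequalities.

Now let $1 \le k \le p$. We'll show that there exist $\varepsilon > 0$ and $\theta_0 > 0$ such that, for any $0 < \theta < \theta_0$ and $\|\Lambda - I_k\| < \varepsilon$,

$$Z(\Lambda, \theta) \ge Z(I_k, \theta). \tag{3.6}$$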

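It follows from (2.2), (2.4) and (3.1) that, for some $0 < c < 1$,

$$\begin{aligned}
Z(\Lambda, \theta) - Z(I_k, \theta)
&\ge z(\Lambda) - z(I_k) - \sum_{R} \psi(R) \Bigl\{ \bigl| \lambda_k^2 W^{kk}(c\theta\Lambda) - W^{kk}(c\theta I_k) \bigr| + \sum_{\{1 \le i, j \le p\} \setminus \{i = j = k\}} \lambda_i \lambda_j \bigl| W^{ij}(c\theta\Lambda) \bigr| \Bigr\} \theta \\
&\ge z(\Lambda) - z(I_k) - L\theta \, \| c\theta(\Lambda - I_k) \| - 2(1 - \lambda_k) K\theta - (p+1) K\theta \sum_{j \ne k} \lambda_j \\
&\ge z(\Lambda) - z(I_k) - L\theta^2 \sqrt{2(1 - \lambda_k)} - 2(1 - \lambda_k) K\theta - (p+1) K\theta \sum_{j \ne k} \lambda_j.
\end{aligned} \tag{3.7}$$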


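At $I_k$ this expression is equal to 0. Let $i \ne k$ and write

$$z(\Lambda) = \sum_{j \ne i} \lambda_j A_\psi^j + A_\psi^i \Bigl( 1 - \sum_{j \ne i} \lambda_j^2 \Bigr)^{1/2} \qquad \text{and} \qquad z(I_k) = A_\psi^k.$$

Hence, the directional derivative of the last expression in (3.7) at the point $\Lambda$, in the direction of $I_i$, is equal to

$$A_\psi^k - A_\psi^i \lambda_k \Bigl( 1 - \sum_{j \ne i} \lambda_j^2 \Bigr)^{-1/2} + L\theta^2 [2(1 - \lambda_k)]^{-1/2} + 2K\theta + \tfrac{1}{2}(p+1) K\theta \lambda_k \Bigl( 1 - \sum_{j \ne i} \lambda_j^2 \Bigr)^{-1/2}.$$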

$A_\psi^i > 0$, so the last expression tends to $-\infty$ as $\Lambda \to I_k$, for $\theta$'s sufficiently small. Consequently the directional derivative under consideration is negative in a neighbourhood of $I_k$ and, in view of (3.7), (3.6) is proved.

Combining (3.4) and (3.6) we conclude that the power of a test for which $A_\psi^k > 0$ for all $k$'s is minimal at the points $\theta I_k$, for sufficiently small $\theta$'s. It is easy to conclude that the LM rank test maximizes (3.2), if it is positive. This conclusion can be easily extended to the case

$$\min_{k = 1, \ldots, p} A_\psi^k = 0.$$

Indeed, if $A_\psi^{k_0} = 0$ for a certain $k_0$, then

$$\frac{\beta_\psi(\theta I_{k_0}) - \alpha}{\theta} \to 0 \qquad \text{as } \theta \to 0.$$

If there is a test $\psi'$ such that $A_{\psi'}^k > 0$ for all $k$, then

$$\lim_{\theta \to 0} \frac{\beta_{\psi'}(\theta\Lambda) - \alpha}{\theta} \ge \min_{k = 1, \ldots, p} A_{\psi'}^k > 0$$

holds for all $\Lambda$'s and $\psi$ couldn't be the LM test. □

Generally, it is difficult to derive the critical function from the above theorem, except in some special cases. It is possible to determine the optimal critical set for the test statistic $(a_R^1, \ldots, a_R^p)$ using the quadratic programming method, as was done by Gastwirth (1966). The forms of the LM rank tests derived in such a way are, in general, very complicated, and hence of little practical use. In the next section we look for LM rank tests of computationally simpler forms.

4. LM rank tests in a restricted class

We restrict our considerations to the case $p = m = 2$. The class of tests we'll deal with is the following:

$$\psi_\nu(R_1, R_2) = \begin{cases} 1 & \text{for } T^\nu_{R_1, R_2} = \nu a^1_{R_1, R_2} + (1 - \nu) a^2_{R_1, R_2} > k_\nu, \\ 0 & \text{otherwise.} \end{cases} \tag{4.1}$$

In (4.1) the statistics $a^1_{R_1, R_2}$ and $a^2_{R_1, R_2}$ are the LMP rank test statistics in the one-dimensional problems with parameters $\theta_1$ and $\theta_2$. Moreover, tests (4.1) are also LMP rank tests for the one-dimensional weighted alternatives $(\nu\theta, (1 - \nu)\theta)$, with fixed $\nu$, and hence it is worthwhile to consider them.
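For very small $N$ the class (4.1) can be explored by brute force. The sketch below enumerates all $(N!)^2$ pairs $(R_1, R_2)$, takes the critical set of $\psi_\nu$ to be the pairs with the largest $T^\nu$ at a natural level, and sums the two statistics over that set. Everything concrete here is an assumption made for illustration: the two statistics are hypothetical stand-ins (a Spearman-type and a quadrant-type score) rather than scores derived from a specific model, the normalising factor is omitted, and ties at the boundary are broken by enumeration order.

```python
from fractions import Fraction
from itertools import permutations, product


def _sign(v):
    return (v > 0) - (v < 0)


def spearman_score(R1, R2):
    """Hypothetical stand-in for a^1: centred rank cross-products."""
    c = Fraction(len(R1) + 1, 2)
    return sum((r1 - c) * (r2 - c) for r1, r2 in zip(R1, R2))


def quadrant_score(R1, R2):
    """Hypothetical stand-in for a^2: sign agreement about the median rank."""
    c = Fraction(len(R1) + 1, 2)
    return sum(_sign(r1 - c) * _sign(r2 - c) for r1, r2 in zip(R1, R2))


def critical_set(nu, pairs, size):
    """The `size` pairs with the largest T^nu = nu*a^1 + (1 - nu)*a^2."""
    def T(pair):
        return nu * spearman_score(*pair) + (1 - nu) * quadrant_score(*pair)
    return sorted(pairs, key=T, reverse=True)[:size]


def A_values(nu, N=4, size=48):
    """Unnormalised (A^1, A^2) of psi_nu: sums of a^1 and a^2 over its critical set."""
    perms = list(permutations(range(1, N + 1)))
    pairs = list(product(perms, perms))
    crit = critical_set(nu, pairs, size)
    A1 = sum(spearman_score(*pair) for pair in crit)
    A2 = sum(quadrant_score(*pair) for pair in crit)
    return A1, A2


nus = [Fraction(k, 4) for k in range(5)]
curve = [A_values(nu) for nu in nus]
```

Exact rational arithmetic is used so that the ordering of pairs by $T^\nu$ is not disturbed by rounding; `curve` then tabulates how the two constants trade off as the weight $\nu$ moves from 0 to 1.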

Lemma 4.1. $A^1_{\psi_\nu}$ is a nondecreasing function of $\nu$. $A^2_{\psi_\nu}$ is a nonincreasing function of $\nu$.

Proof. For any $0 < \nu \le 1$ let us introduce the relation $\prec_\nu$ that orders the pairs $(R_1, R_2)$ by the value of the statistic $T^\nu_{R_1, R_2}$. For simplicity we assume that $\alpha$ is a natural significance level. Hence the critical set of the test $\psi_\nu$ is the set of the $\alpha \cdot (N!)^2$ greatest pairs $(R_1, R_2)$, in the sense of the relation $\prec_\nu$. Let us consider $A^1_{\psi_\nu}$, and let $0 \le \nu' \le \nu \le 1$. The difference $A^1_{\psi_\nu} - A^1_{\psi_{\nu'}}$ can be written as a sum of a certain number of differences, say

$$A^1_{\psi_\nu} - A^1_{\psi_{\nu'}} = \sum \bigl[ a^1_{R_1, R_2} - a^1_{R'_1, R'_2} \bigr],$$

where the sum runs over pairs whose mutual order changes between $\nu'$ and $\nu$. Since $A^1_{\psi_1}$ is maximal and the relation $\prec$ between every two particular pairs could change only once, therefore $(R'_1, R'_2) \prec_1 (R_1, R_2)$, which is equivalent to $a^1_{R_1, R_2} - a^1_{R'_1, R'_2} \ge 0$. Similarly we can proceed with $A^2_{\psi_\nu}$. □

The following theorem is an obvious consequence of Theorem 3.1 and Lemma 4.1.

Theorem 4.1. Let $\nu^* \in (0, 1)$ be such that

$$\lim_{\nu \nearrow \nu^*} A^1_{\psi_\nu} < \lim_{\nu \nearrow \nu^*} A^2_{\psi_\nu} \qquad \text{and} \qquad \lim_{\nu \searrow \nu^*} A^1_{\psi_\nu} > \lim_{\nu \searrow \nu^*} A^2_{\psi_\nu},$$

or an arbitrary number for which $A^1_{\psi_{\nu^*}} = A^2_{\psi_{\nu^*}}$. Then the test $\psi_{\nu^*}$ is the LM rank test in the class (4.1). If $A^1_{\psi_\nu} > A^2_{\psi_\nu}$ for $\nu \in (0, 1)$, $\psi_1$ is the LM test among (4.1). If $A^1_{\psi_\nu} < A^2_{\psi_\nu}$ for $\nu \in (0, 1)$, $\psi_0$ is the LM test among (4.1). The above statements are true if both $A^1_{\psi_\nu}$ and $A^2_{\psi_\nu}$ are nonnegative for the determined test.
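The crossing condition of Theorem 4.1 can be checked on a grid of weights. The sketch below assumes the two constants have already been tabulated on an increasing grid of $\nu$ values (the lists passed in are hypothetical) and looks either for an exact equality or for a pair of neighbouring grid points between which $A^1 - A^2$ switches from negative to positive.

```python
def find_crossing(nus, A1, A2):
    """Locate nu* on a grid.

    Returns (nu, nu) if A1 == A2 at a grid point, (nu_left, nu_right) if
    A1 - A2 changes sign from negative to positive between two neighbours,
    and None if no crossing is found on the grid.
    """
    diffs = [a1 - a2 for a1, a2 in zip(A1, A2)]
    for k, d in enumerate(diffs):
        if d == 0:
            return (nus[k], nus[k])
    for k in range(len(diffs) - 1):
        if diffs[k] < 0 < diffs[k + 1]:
            return (nus[k], nus[k + 1])
    return None


# Hypothetical tabulated constants: A1 nondecreasing, A2 nonincreasing.
nus = [0.0, 0.25, 0.5, 0.75, 1.0]
A1 = [1.0, 2.0, 3.0, 4.0, 5.0]
A2 = [6.0, 5.0, 3.5, 2.0, 1.0]
bracket = find_crossing(nus, A1, A2)
```

A returned bracket only localises $\nu^*$ to the grid resolution; because the curves are step functions for finite $N$, a finer grid or a direct inspection of the jump points would be needed to pin it down.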

Let us notice that if $A^1_{\psi_{\nu^*}}$ or (and) $A^2_{\psi_{\nu^*}}$ is (are) negative, then no LU test exists in the class (4.1), and hence they are all 'worse' than, for instance, the test $\psi \equiv \alpha$.

Corollary 4.1. There exists $\theta_0$ such that for any $0 < \theta < \theta_0$ the following holds:

(a) If there is $\nu^* \in (0, 1)$ such that

$$\lim_{\nu \nearrow \nu^*} \beta_{\psi_\nu}(\theta, 0) < \lim_{\nu \nearrow \nu^*} \beta_{\psi_\nu}(0, \theta) \qquad \text{and} \qquad \lim_{\nu \searrow \nu^*} \beta_{\psi_\nu}(\theta, 0) > \lim_{\nu \searrow \nu^*} \beta_{\psi_\nu}(0, \theta),$$

then one of the tests $\lim_{\nu \nearrow \nu^*} \psi_\nu$, $\psi_{\nu^*}$, $\lim_{\nu \searrow \nu^*} \psi_\nu$ is the LM rank test in the class (4.1).

(b) If $\beta_{\psi_\nu}(\theta, 0) > \beta_{\psi_\nu}(0, \theta)$ for all $\nu \in (0, 1)$, $\psi_1$ is the LM rank test in the class (4.1).

(c) If $\beta_{\psi_\nu}(\theta, 0) < \beta_{\psi_\nu}(0, \theta)$ for all $\nu \in (0, 1)$, $\psi_0$ is the LM rank test in the class (4.1).

The above statements are true if the determined test is LU.

Proof. If $A^1_{\psi_\nu} < A^1_{\psi_{\nu'}}$, $0 \le \nu \le \nu' \le 1$, then there is $\theta_0^{\nu, \nu'}$ such that, for any $0 < \theta < \theta_0^{\nu, \nu'}$,

$$\beta_{\psi_\nu}(\theta, 0) < \beta_{\psi_{\nu'}}(\theta, 0)$$

holds. Since there is a finite number of tests (4.1), this neighbourhood can be chosen jointly for all these tests. Similarly we proceed with $A^2_{\psi_\nu}$ and $\beta_{\psi_\nu}(0, \theta)$. Hence, using Theorem 4.1, one can show that $\psi_{\nu^*}$ is the LM rank test in the case $A^1_{\psi_{\nu^*}} \ne A^2_{\psi_{\nu^*}}$. If $A^1_{\psi_\nu} = A^2_{\psi_\nu}$ for $\nu$ from a certain interval $(\nu_1, \nu_2)$, then $\nu^* = \nu_1$ or $\nu_2$ and by Theorem 4.1 one of the tests $\lim_{\nu \nearrow \nu^*} \psi_\nu$, $\lim_{\nu \searrow \nu^*} \psi_\nu$ is the LM rank test. □


Let us notice that if $\beta_{\psi_\nu}$ were continuous in $\nu$, the LM test would correspond to the point of intersection of the curves $\beta_{\psi_\nu}(\theta, 0)$ and $\beta_{\psi_\nu}(0, \theta)$.

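5. Asymptotic solution

Under suitable assumptions Theorem 4.2.2 of Ruymgaart (1973) states that

$$\lim_{N \to \infty} \beta_{\psi_\nu}\bigl(\theta_1/\sqrt{N},\ \theta_2/\sqrt{N}\bigr) = 1 - \Phi\bigl(\Phi^{-1}(1 - \alpha) - e^\nu_{\theta_1, \theta_2}\bigr),$$

where $\Phi$ is the standard normal distribution function and

$$e^\nu_{\theta_1, \theta_2} = \lim_{N \to \infty} \frac{\mu^\nu_N(\Theta) - \mu^\nu_0}{\sigma^\nu_0},$$

$$\mu^\nu_N(\Theta) = E_{\Theta/\sqrt{N}}\bigl[T^\nu_{R_1, R_2}\bigr], \qquad \mu^\nu_0 = E_0 T^\nu_{R_1, R_2}, \qquad (\sigma^\nu_0)^2 = \operatorname{Var}_0 T^\nu_{R_1, R_2}.$$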
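The limiting power expression quoted from Ruymgaart (1973) is straightforward to evaluate once the efficacy term is known. The sketch below uses only the Python standard library; the function name is an assumption and the efficacy value is treated as a given number rather than derived from a model.

```python
from statistics import NormalDist


def asymptotic_power(alpha, efficacy):
    """1 - Phi(Phi^{-1}(1 - alpha) - efficacy), Phi the standard normal df."""
    std = NormalDist()  # mean 0, standard deviation 1
    return 1 - std.cdf(std.inv_cdf(1 - alpha) - efficacy)


# With zero efficacy the limiting power equals the level alpha.
power_at_null = asymptotic_power(0.05, 0.0)
```

The expression is increasing in the efficacy, which is why comparing the two efficacies $e^\nu_{\theta, 0}$ and $e^\nu_{0, \theta}$ is enough to compare the limiting powers in the two coordinate directions.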

Let us consider now two sequences $\beta_{\psi_\nu}(\theta/\sqrt{N}, 0)$ and $\beta_{\psi_\nu}(0, \theta/\sqrt{N})$. Their limits are continuous functions of $\nu$. Thus, using Ruymgaart's theorem, one can conclude that the LM rank test corresponds, as $N \to \infty$, to $\nu^*$ such that

$$e^{\nu^*}_{\theta, 0} = e^{\nu^*}_{0, \theta}.$$

Dividing both sides of this equation by $\theta$ and using (3.17) of Behnen (1972) (under the additional assumption that $\varphi_1$ and $\varphi_2$ are square integrable) one can formulate the following

Corollary 5.1. Asymptotically the LM rank test in the class (4.1) corresponds to $\nu^*$, which is the solution of the equation

$$\frac{\partial}{\partial \theta_1} \mu_\nu(\Theta) \Big|_{\Theta = 0} = \frac{\partial}{\partial \theta_2} \mu_\nu(\Theta) \Big|_{\Theta = 0}, \tag{5.1}$$

where

$$\mu_\nu(\Theta) = \iint \bigl[ \nu J_1(F_\Theta(x), G_\Theta(y)) + (1 - \nu) J_2(F_\Theta(x), G_\Theta(y)) \bigr] \, dH_\Theta(x, y),$$

$$J_i(u, v) = \varphi_i\bigl(F^{-1}(u), G^{-1}(v)\bigr), \qquad i = 1, 2,$$

and $F_\Theta$, $G_\Theta$ are the marginal df's of $H_\Theta$, $F = F_0$, $G = G_0$. If the left-hand side of (5.1) is always greater (less) than the right-hand side, asymptotically the LM rank test is $\psi_1$ ($\psi_0$). This corollary is true, as previously, if both derivatives in (5.1) are nonnegative for the determined test.
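Because $\mu_\nu(\Theta)$ is linear in $\nu$, equation (5.1) is linear in $\nu$ as well and can be solved in closed form once four derivatives are available. In the sketch below the names `c11`, `c12`, `c21`, `c22` are hypothetical: `c_ik` stands for the derivative with respect to $\theta_k$, at $\Theta = 0$, of the integral of $J_i$ against $H_\Theta$. Their values must come from a concrete model and are simply assumed here.

```python
from fractions import Fraction


def nu_star(c11, c12, c21, c22):
    """Solve nu*c11 + (1 - nu)*c21 == nu*c12 + (1 - nu)*c22 for nu.

    Returns None when the equation has no unique solution or when the
    solution falls outside [0, 1].
    """
    denom = (c11 - c12) + (c22 - c21)
    if denom == 0:
        return None
    nu = Fraction(c22 - c21) / denom
    return nu if 0 <= nu <= 1 else None


# Hypothetical derivative values.
nu = nu_star(c11=3, c12=1, c21=1, c22=2)
```

When `nu_star` returns `None` because the solution leaves $[0, 1]$, one side of (5.1) dominates the other for every weight, which is the boundary case described at the end of the corollary.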

References

Behnen K. (1972), A characterization of certain rank-order tests with bounds for the asymptotic relative efficiency, Ann. Math. Statist. 43, 1839-1851.

Gastwirth J.L. (1966), On robust procedures, J. Amer. Statist. Assoc. 61, 929-948.

Ruymgaart F.H. (1973), Asymptotic theory of rank tests for independence, Mathematical Centre Tracts 43 (Mathematisch Centrum, Amsterdam).

Shirahata S. (1974), Locally most powerful rank tests for independence, Bulletin of Math. Statist. 16, 11-22.