Consistency of rank tests against some general alternatives

Mathl. Comput. Modelling Vol. 18, No. 12, pp. 49-55, 1993. Copyright © 1994 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0895-7177/93 $6.00 + 0.00

A. I. KATSEVICH
Mathematics Department, Kansas State University
Manhattan, KS 66506-2602, U.S.A.
[email protected]

A. G. RAMM
Mathematics Department, Kansas State University
Manhattan, KS 66506-2602, U.S.A.
[email protected]

(Received and accepted October 1993)

Abstract. The proof of consistency of rank tests against some general alternatives is given. The alternatives considered include an arbitrary, not necessarily monotone, trend in location and change points (change surfaces) under the infill asymptotics. One-dimensional and multi-dimensional cases are studied. A numerical experiment illustrating the use of these results for image analysis (edge detection) is presented.

Keywords: Rank test, Consistency, Change points, Trend.

1. INTRODUCTION

In this paper, we study tests of randomness based on a modification of the Wald-Wolfowitz statistic [1] (one-dimensional case) and on a Geary-type statistic [2,3] (multi-dimensional case), and prove their consistency against two general alternatives.

Consider the one-dimensional case. Let $x_k$, $k = 1, \dots, N$, be a random sequence. The null hypothesis $H_0$ is that all $x_k$ are independent, identically distributed (iid) random variables (rv's) with common distribution function (df) $F(x)$. Fix $m \ge 2$ and define the first alternative hypothesis $H_m$:

$$H_m : F_k(x) = G_\ell(x), \quad k \in K_\ell, \quad \ell = 1, \dots, m, \tag{1a}$$

where

$$G_\ell(x) \ge G_{\ell+1}(x), \ x \in \mathbb{R}^1; \quad \exists x_\ell : G_\ell(x_\ell) > G_{\ell+1}(x_\ell), \quad \ell = 1, \dots, m-1, \tag{1b}$$

$$\bigcup_{\ell=1}^{m} K_\ell = \{1, \dots, N\}, \quad K_i \cap K_j = \emptyset \ \text{ for } i \ne j. \tag{1c}$$

Here $F_k$ is the df of the $k$th term of the sequence $x_k$, and $G_\ell$ are continuously differentiable df's. The hypothesis $H_m$ means that the initial sequence is a union of samples (possibly multi-connected) of $m$ rv's $\xi_\ell$ with df's $G_\ell$ such that $\xi_\ell$ is stochastically smaller than $\xi_{\ell+1}$, $\ell = 1, \dots, m-1$. Thus, the initial sequence need not be stochastically monotone under $H_m$.

The second alternative hypothesis $H'$ which we consider consists in the presence of an arbitrary "continuous" trend in location:

$$H' : F_k(x) = F(x - \theta_k), \quad k = 1, \dots, N, \tag{2a}$$

where we assume that

$$\exists \varphi(t) \in C[0,1], \ \varphi(t) \not\equiv \text{const} : \ \theta_k = \varphi\!\left(\frac{k}{N}\right). \tag{2b}$$

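A similar multi-dimensional problem deals with the data $x_{\hat k}$, $\hat k = (k_1, \dots, k_d)$, $1 \le k_i \le \beta_i N$, $0 < \beta_i < \infty$, $1 \le i \le d$, where $\beta_i$ are arbitrary fixed integers. In this case, $x_{\hat k}$ are the values of a rv defined on a lattice in $\mathbb{R}^d$. Define

$$B_N := \{\hat k : \hat k \in \mathbb{Z}^d, \ 1 \le k_i \le \beta_i N, \ 1 \le i \le d\}, \qquad T := \{\hat t : \hat t \in \mathbb{R}^d, \ 0 \le t_i \le \beta_i, \ 1 \le i \le d\}.$$

The multi-dimensional analogs of the hypotheses $H_0$, $H_m$, and $H'$ are

$$\hat H_0 : F_{\hat k}(x) = F(x), \quad \forall \hat k \in B_N; \tag{3}$$

$$\hat H_m : F_{\hat k}(x) = G_\ell(x), \quad \hat k \in \hat K_\ell, \quad \ell = 1, \dots, m, \tag{4a}$$

where the functions $G_\ell$ satisfy (1b),

$$\bigcup_{\ell=1}^{m} \hat K_\ell = B_N, \quad \hat K_i \cap \hat K_j = \emptyset \ \text{ for } i \ne j; \tag{4b}$$

and

$$\hat H' : F_{\hat k}(x) = F(x - \theta_{\hat k}), \quad \hat k \in B_N, \tag{5a}$$

where we assume that

$$\exists \varphi(\hat t) \in C(T), \ \varphi(\hat t) \not\equiv \text{const} : \ \theta_{\hat k} = \varphi\!\left(\frac{\hat k}{N}\right). \tag{5b}$$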
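To make the two one-dimensional alternatives concrete, the following is a minimal sketch that generates one sequence under $H_m$ with $m = 2$ and one under $H'$. Everything specific in it is an assumption chosen for illustration and not taken from the paper: the function names, the normal distributions (a location shift of 0.5 gives $G_1 \ge G_2$), the layout of $K_1$ as two disjoint blocks, and the trend $\varphi(t) = \sin(2\pi t)$.

```python
import math
import random


def sample_h_m(n, seed=0):
    """Sequence under H_m with m = 2 (illustrative choice of K_1, K_2).

    K_1 is 'multi-connected': it consists of the first and the last third
    of the indices; K_2 is the middle third. G_1 = N(0, 1), G_2 = N(0.5, 1),
    so G_1(x) >= G_2(x) and the sequence is not stochastically monotone.
    """
    rng = random.Random(seed)
    labels = [2 if n // 3 <= k < 2 * n // 3 else 1 for k in range(n)]
    x = [rng.gauss(0.5 if lab == 2 else 0.0, 1.0) for lab in labels]
    return x, labels


def sample_h_prime(n, seed=0):
    """Sequence under H' with theta_k = phi(k/N), phi(t) = sin(2*pi*t).

    The trend is continuous on [0, 1], non-constant and non-monotone;
    F is taken to be the standard normal df (an assumption).
    """
    rng = random.Random(seed)
    theta = [math.sin(2 * math.pi * k / n) for k in range(1, n + 1)]
    x = [t + rng.gauss(0.0, 1.0) for t in theta]
    return x, theta


if __name__ == "__main__":
    x, labels = sample_h_m(9)
    print(labels)
```

Both generators return the hidden structure (labels or trend values) alongside the data, so that the output of a test of randomness can be compared with the known truth.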

The literature on one-dimensional tests of randomness is very large [4-14], but we could not find any proofs of consistency against the general alternatives we consider. The most frequently considered alternatives are one change-point [6], monotone trend, and serial correlation [12]. Different rank tests and different results for the case of multiple change points can be found in [13,14]. There are fewer publications on the theory of multi-dimensional tests of randomness (tests for spatial autocorrelation), most notably [3], but we did not find any results as general as ours.

In Section 2, the statistics are described, basic results are stated, and tests of randomness are formulated. In Section 3, some ideas of the proofs are sketched. In Section 4, some applications of the tests are outlined and results of a numerical experiment are reported. Detailed proofs of our results are given in [15,16].

2. TESTS OF RANDOMNESS

1. Consider first the one-dimensional case. Define the statistic [15]

$$v_N := (N-1)^{-1} \sum_{k=1}^{N-1} (R_{k+1} - R_k)^2 N^{-2}. \tag{6}$$

Here $R_k$ is the rank of $x_k$, i.e., $R_k := \#\{x_j : x_j < x_k\} + 1$. Note that $v_N$ has the form of the Durbin-Watson statistic [17] with observations replaced by their ranks. To construct the test of randomness based on $v_N$, we need to study its asymptotic properties as $N \to \infty$. Using the assumptions (1b,c), define the numbers of interior and boundary points

$$p^{(i)}_\ell := \#\{k : k, k+1 \in K_\ell\}, \ \ell = 1, \dots, m, \qquad p^{(b)} := \#\{k : k \in K_i, \ k+1 \in K_j, \ i \ne j\}.$$

In addition to (1a-c), let us make quite natural assumptions:

$$\frac{p^{(i)}_\ell}{N} \to \alpha_\ell > 0, \quad \ell = 1, \dots, m; \qquad \sum_{\ell=1}^{m} \alpha_\ell = 1; \qquad p^{(b)} \le \text{const} \quad \text{as } N \to \infty. \tag{1d}$$

Denote by "$\xrightarrow[N\to\infty]{\text{m.s.}}$" the convergence in mean square and by "$\xrightarrow[N\to\infty]{P}$" the convergence in probability. Our basic result is:

THEOREM 1. If $H_0$ holds, then $v_N \xrightarrow[N\to\infty]{\text{m.s.}} \frac{1}{6}$. If $H_m$ (the assumptions (1a-d)) holds, then $v_N \xrightarrow[N\to\infty]{\text{m.s.}} E_m$, where $E_m < \frac{1}{6}$. If $H'$ holds and the function $\varphi(t)$ in (2b) is independent of $N$, then $v_N \xrightarrow[N\to\infty]{P} E'$, where $E' < \frac{1}{6}$.

Let us formulate the test of randomness. Fix $\varepsilon$, $0 < \varepsilon < 1$, the probability of rejecting $H_0$ when $H_0$ holds (type I error or false alarm error), and find the threshold $A_0$ from the equation $P\{v_N < A_0 \mid H_0\} = \varepsilon$. This equation defines $A_0$ as a function of $N$ and $\varepsilon$. Then, for the given sequence $\{x_k\}_{k=1}^{N}$, compute the statistic $v_N$ by formula (6). If $v_N \ge A_0$, $H_0$ is accepted, i.e., the sequence is assumed to be random. If $v_N < A_0$, $H_0$ is rejected. Clearly, the probabilities of the second type errors (against $H_m$ or $H'$) are given by $P\{v_N \ge A_0 \mid H_m\}$ and $P\{v_N \ge A_0 \mid H'\}$.

THEOREM 2. Fix $\varepsilon$, $0 < \varepsilon < 1$, and find $A_0$ from the equation $P\{v_N < A_0 \mid H_0\} = \varepsilon$. Then $P\{v_N \ge A_0 \mid H_m\} = O(N^{-1})$ and $P\{v_N \ge A_0 \mid H'\} \to 0$ as $N \to \infty$.

Theorem 2 shows the consistency of the proposed test.

2. Let us discuss the multi-dimensional case. Fix a multi-index $\hat k \in B_N$ and define the set of lattice points neighboring to it by

$$L(\hat k) := \Big\{\hat\ell : \hat\ell \in B_N, \ \hat\ell \ne \hat k, \ \max_{1 \le i \le d} |\ell_i - k_i| = 1\Big\}.$$

If $\hat k$ is a strictly inner point of $B_N$, the number of points in $L(\hat k)$ is $3^d - 1$. Let $R_{\hat k}$ be the rank of $x_{\hat k}$, $\hat k \in B_N$: $R_{\hat k} := \#\{x_{\hat j} : \hat j \in B_N, \ x_{\hat j} < x_{\hat k}\} + 1$. Define [15]

$$\hat v_N := M_N^{-1} \sum_{\hat k \in B_N} \sum_{\hat\ell \in L(\hat k)} (R_{\hat k} - R_{\hat\ell})^2 N^{-2d}, \tag{7}$$

where $M_N$ is the number of terms in the double sum (7), $M_N = (3^d - 1)\beta_1 \cdots \beta_d N^d (1 + O(1/N))$ as $N \to \infty$. Theorems 1 and 2 and the test of randomness remain the same in the multi-dimensional case with $v_N$ replaced by $\hat v_N$.

Now a few words about the numerical implementation of the above test are in order. One sees that the most important part is the calculation of the threshold $A_0$ from the equation $P\{v_N < A_0 \mid H_0\} = \varepsilon$. The Wald-Wolfowitz statistic [1] $R := \sum_{k=1}^{N} R_k R_{k+1}$, $R_{N+1} := R_1$, is asymptotically normal as $N \to \infty$ if $H_0$ holds. One has

$$\sum_{k=1}^{N-1} (R_{k+1} - R_k)^2 = \frac{N(N+1)(2N+1)}{3} - 2R - (R_1 - R_N)^2.$$

Thus, $v_N = \frac{2}{3} - 2R/(N^2(N-1)) + o(1)$ and $v_N$ is also asymptotically normal. For small values of $N$ one can use the Monte-Carlo method. One models a sufficiently large number of permutations of $\{1, \dots, N\}$, corresponding to the equiprobable distributions of ranks within the initial sequence, and computes $v_N$ for each permutation. Select $A_0$ so that $v_N < A_0$ in $100\varepsilon\%$ of cases. In the multi-dimensional case, the asymptotic distribution of $\hat v_N$ is also normal [3], and for small $N$ one can also use the Monte-Carlo method.
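The one-dimensional test with a Monte-Carlo threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default number of permutations, and the fixed seed are hypothetical, and $A_0$ is taken as an empirical quantile of $v_N$ over random permutations.

```python
import random
from bisect import bisect_left


def ranks(x):
    """R_k := #{x_j : x_j < x_k} + 1, as in the definition below (6)."""
    ordered = sorted(x)
    # bisect_left counts the elements strictly smaller than v.
    return [bisect_left(ordered, v) + 1 for v in x]


def v_stat(x):
    """Statistic (6): mean squared difference of adjacent ranks over N^2."""
    n = len(x)
    r = ranks(x)
    total = sum((r[k + 1] - r[k]) ** 2 for k in range(n - 1))
    return total / ((n - 1) * n ** 2)


def mc_threshold(n, eps, n_perm=2000, seed=0):
    """Approximate A_0 such that v_N < A_0 in about 100*eps % of cases
    under H_0, by sampling random permutations of {1, ..., N}."""
    rng = random.Random(seed)
    perm = list(range(1, n + 1))
    values = []
    for _ in range(n_perm):
        rng.shuffle(perm)
        values.append(v_stat(perm))
    values.sort()
    return values[int(eps * n_perm)]


def accept_h0(x, a0):
    """H_0 is accepted if v_N >= A_0 and rejected if v_N < A_0."""
    return v_stat(x) >= a0


if __name__ == "__main__":
    a0 = mc_threshold(50, 0.05)
    print(a0, accept_h0(list(range(50)), a0))
```

For a strictly increasing sequence all adjacent rank differences equal 1, so $v_N = N^{-2}$, far below any reasonable threshold, and $H_0$ is rejected. For large $N$ the normal approximation described above would replace the permutation sampling.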

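3. IDEAS OF THE PROOFS

1. First, we prove that $\bar v_N = (1 + N^{-1})/6$ and $\mathrm{Var}(v_N) = (36N)^{-1} + O(N^{-2})$ as $N \to \infty$ if $H_0$ is true (see also [1]). Here the bar stands for the mean value. Then we assume $H_m$ and let $m = 2$. In this case, we prove that $\mathrm{Var}(v_N) = O(1/N)$ as $N \to \infty$ and that $\bar v_N \to E_2$, where $E_2$ is given by formula (8) [the display of formula (8) is illegible in the source]. In (8),

$$f(r) := \alpha_1 \frac{g_1(G^{-1}(r))}{g(G^{-1}(r))}, \qquad G(x) := \alpha_1 G_1(x) + \alpha_2 G_2(x), \qquad g := G', \quad g_1 := G_1',$$

and $\alpha_j$ are defined in (1d). Here $G^{-1}$ is the inverse function of $G(x)$. Clearly, $f(r)$ is defined on the set of regular values of $G(x)$. Let $E' := \{r : 0 \le r \le 1, \ G(x) = r, \ g(x) = 0\}$. By Sard's theorem [18], $\text{meas}\,E' = 0$. Let $E := [0,1] \setminus E'$. Consider an arbitrary open set $E'_\varepsilon$ such that $E' \subset E'_\varepsilon$, $\text{meas}\,E'_\varepsilon \le \varepsilon$. Let $E_\varepsilon := [0,1] \setminus E'_\varepsilon$, $R_\varepsilon := G^{-1}(E_\varepsilon)$. Then, by taking $x = G^{-1}(r)$, we get

$$z := \int_E r f(r)\,dr = \alpha_1 \lim_{\varepsilon \to 0} \int_{R_\varepsilon} g_1(x) G(x)\,dx = \alpha_1 I, \qquad I := \int_{\mathbb{R}} g_1 G\,dx. \tag{9}$$

The last equation follows since $\sup_{x \in \mathbb{R} \setminus R_\varepsilon} g_1(x) \to 0$ as $\varepsilon \to 0$. Furthermore,

$$I = \alpha_1 \int_{\mathbb{R}} G_1 g_1\,dx + \alpha_2 \int_{\mathbb{R}} G_2 g_1\,dx < \frac{\alpha_1}{2} + \alpha_2 \int_{\mathbb{R}} G_1 g_1\,dx = \frac{\alpha_1}{2} + \frac{\alpha_2}{2} = \frac{1}{2}. \tag{10}$$

The strict inequality in (10) holds because $G_2 \le G_1$ with strict inequality at some point, by (1b). From (8) it follows that the inequality $E_2 < \frac{1}{6}$ is equivalent to the following one:

$$z^2 - \alpha_1 z + \frac{\alpha_1}{4} > \frac{\alpha_1 (1 - \alpha_1)}{4}. \tag{11}$$

From (9) and (10) one gets $z = \alpha_1 I < \alpha_1/2$, which implies (11), since (11) is the same as $(z - \alpha_1/2)^2 > 0$.

Next, we consider the case $m > 2$ and prove that $\bar v_N \to E_m$ and $\mathrm{Var}(v_N) = O(1/N)$ as $N \to \infty$. Denote $E_m := E_m(\alpha_1, \dots, \alpha_m; G_1, \dots, G_m)$. Let us prove the inequality $E_m < \frac{1}{6}$.

LEMMA. The following inequality holds:

$$E_m(\alpha_1, \dots, \alpha_m; G_1, \dots, G_m) < E_{m-1}(\alpha_1, \dots, \alpha_{m-2}, \alpha_{m-1} + \alpha_m; G_1, \dots, G_{m-1}). \tag{12}$$

Applying (12) repeatedly, one obtains the estimate:

$$E_m(\alpha_1, \dots, \alpha_m; G_1, \dots, G_m) < E_2(\alpha_1, \alpha_2 + \dots + \alpha_m; G_1, G_2) = E_2 < \frac{1}{6}.$$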
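The mean value under $H_0$ quoted at the start of this section, $\bar v_N = (1 + N^{-1})/6 = (N+1)/(6N)$, can be checked by brute force for small $N$. The sketch below is only such a sanity check, with hypothetical function names; it averages the statistic (6) over all $N!$ equiprobable rank permutations in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations


def v_stat_exact(r):
    """Statistic (6) for a rank vector r, as an exact fraction."""
    n = len(r)
    total = sum((r[k + 1] - r[k]) ** 2 for k in range(n - 1))
    return Fraction(total, (n - 1) * n * n)


def exact_mean_h0(n):
    """Mean of v_N over all n! permutations of {1, ..., n}."""
    total = Fraction(0)
    count = 0
    for perm in permutations(range(1, n + 1)):
        total += v_stat_exact(perm)
        count += 1
    return total / count


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        # Compare the enumerated mean with (N + 1) / (6 N).
        print(n, exact_mean_h0(n), Fraction(n + 1, 6 * n))
```

The check relies on the fact that, for a uniformly random permutation, the expected squared difference of two distinct ranks equals $N(N+1)/6$; the enumeration is feasible only for small $N$.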

2. Now assume $H'$. Fix $m \ge 2$ and divide the interval $[0,1]$ into $m$ equal subintervals. Consider a piecewise-constant trend approximating $\varphi(t)$. Let $v_N^{(m)}$ be the statistic (6) calculated for such a trend. Using the above results, we prove three results [16]:

(i1) $v_N^{(m)} \xrightarrow[N\to\infty]{\text{m.s.}} E_m$;

(i2) $\forall \varepsilon > 0 \ \exists m_\varepsilon$ such that $P\{|v_N - v_N^{(m_\varepsilon)}| > \varepsilon\} \to 0$ as $N \to \infty$, where $v_N$ is the statistic calculated in the case of the trend (2b); and

(i3) $\lim_{N \to \infty} \big| P\{v_N^{(m_\varepsilon)} < a\} - P\{v_N < a\} \big| = 0$ for any $a \notin [E_{m_\varepsilon} - \varepsilon, E_{m_\varepsilon} + \varepsilon]$.

These three results imply $v_N \xrightarrow[N\to\infty]{P} E_\infty$, where $E_\infty := \lim_{m \to \infty} E_m$. The existence of this limit and the following formula are proved:

$$E_\infty = 2\left(\frac{1}{12} - I(\varphi)\right),$$

where $I(\varphi)$ is a functional which is Gateaux differentiable in $C[0,1]$. Thus, to show $E_\infty < \frac{1}{6}$ we need to prove that $I(\varphi) > 0$. The proof of this fact is based on two assertions, (i) and (ii) [the statements of both assertions are illegible in the source]. Theorem 1 is proved.

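3. Now the proof of Theorem 2 presents no difficulties:

$$P\{v_N \ge A_0 \mid H_m\} = P\{v_N - E_m \ge A_0 - E_m \mid H_m\} \le P\{|v_N - E_m| \ge A_0 - E_m \mid H_m\} \le (A_0 - E_m)^{-2}\,\mathrm{Var}(v_N) = O(N^{-1}) \quad \text{as } N \to \infty.$$

Here $A_0 \to \frac{1}{6}$ and $E_m < \frac{1}{6}$, so $A_0 - E_m \ge c > 0$ as $N \to \infty$. The second assertion of Theorem 2 is proved similarly.

In the multi-dimensional case, the argument is similar. Note that the number of boundary points

$$\hat p^{(b)} := \#\{\hat k : \hat k \in \hat K_i, \ L(\hat k) \cap \hat K_i \ne L(\hat k)\}$$

under the hypothesis $\hat H_m$ is assumed to satisfy $\hat p^{(b)} \le \text{const}\, N^{d-1}$ as $N \to \infty$ (compare with (1d)). We prove that $\hat v_N \xrightarrow[N\to\infty]{\text{m.s.}} \frac{1}{6}$, $\mathrm{Var}(\hat v_N) = O(N^{-d})$ if $\hat H_0$ holds. Moreover, $\hat v_N \xrightarrow[N\to\infty]{\text{m.s.}} \hat E_m$, $\mathrm{Var}(\hat v_N) = O(N^{-1})$ if $\hat H_m$ holds; and $\hat v_N \xrightarrow[N\to\infty]{P} \hat E_\infty$ if $\hat H'$ holds. The inequalities $\hat E_m < \frac{1}{6}$ and $\hat E_\infty < \frac{1}{6}$ are proved analogously to the one-dimensional case.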

4. APPLICATIONS

There are many possible applications of the proposed test of randomness, especially of its multi-dimensional version: image processing, edge detection, remote sensing, and spatial and temporal-spatial analysis of data arising in the earth sciences, ecology, soil science, hydrology, etc.

For example, let us describe an algorithm for edge detection based on the proposed test. In image processing, the data are intensities of the grey level specified at each pixel, i.e., at the nodes of the two-dimensional square grid (image). An edge (discontinuity of a signal) can be defined as follows: the grey level is relatively consistent in each of the two adjacent extensive regions, and changes abruptly as the border between the two regions is crossed [19,20]. Consider an $N \times N$ window $B_N$ sliding over the image. For each position of the window one wants to make a decision: whether $\Gamma \cap B_N = \emptyset$ or not, where $\Gamma$ is an edge. If $\Gamma \cap B_N \ne \emptyset$, then $\Gamma$ divides $B_N$ into two sets $K_1$ and $K_2$, such that the values of the grey level in one set are stochastically larger than those in the other set; hence the hypothesis $H_2$ (or, more generally, $H_m$) takes place. If $\Gamma \cap B_N = \emptyset$, then the grey level is approximately constant inside $B_N$ and the hypothesis $H_0$ takes place. Thus, the choice between "$\Gamma \cap B_N = \emptyset$" ($H_0$) and "$\Gamma \cap B_N \ne \emptyset$" ($H_m$) can be made using the test presented in Section 2. If the hypothesis $H_m$ is accepted, the center of the current window is marked as an edge point. Repeating this process for each position of the window, one finds all edge points.

The numerical results of application of the above algorithm are illustrated by the following example. Figure 1 represents a synthetic image of square and circle step edges with the jump magnitude $D = 1.5$ specified at a square $101 \times 101$ grid. The image is corrupted by noise with the uniform distribution, zero mean, and standard deviation $\sigma = 0.75$. The window size has been chosen $N = 7$, and the probability of false alarm has been $\varepsilon = 0.01$. Figure 2 represents the image of detected edges of Figure 1.
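The sliding-window procedure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the code used for the reported experiment: the function names are hypothetical, the window size is assumed odd, $d = 2$ with $\beta_1 = \beta_2 = 1$, $M_N$ is counted directly instead of using its asymptotic form, and the threshold is estimated by the Monte-Carlo method of Section 2. Noise-free flat regions produce tied ranks and would need separate handling, which is not addressed here.

```python
import random
from bisect import bisect_left


def v_hat(win):
    """Statistic (7) for an n x n window: squared rank differences over
    all pairs of neighbours (max-norm distance 1), divided by M_N * n^4."""
    n = len(win)
    ordered = sorted(v for row in win for v in row)
    r = [[bisect_left(ordered, v) + 1 for v in row] for row in win]
    total, terms = 0, 0
    for i in range(n):
        for j in range(n):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    a, b = i + di, j + dj
                    if (di or dj) and 0 <= a < n and 0 <= b < n:
                        total += (r[i][j] - r[a][b]) ** 2
                        terms += 1  # M_N, counted term by term
    return total / (terms * n ** 4)


def mc_threshold_2d(n, eps, n_perm=500, seed=0):
    """Empirical eps-quantile of v_hat over random rank arrangements."""
    rng = random.Random(seed)
    vals = list(range(n * n))
    stats = []
    for _ in range(n_perm):
        rng.shuffle(vals)
        stats.append(v_hat([vals[i * n:(i + 1) * n] for i in range(n)]))
    stats.sort()
    return stats[int(eps * n_perm)]


def detect_edges(img, n, a0):
    """Mark the centre of every n x n window in which H_0 is rejected."""
    h, w = len(img), len(img[0])
    half = n // 2
    mask = [[False] * w for _ in range(h)]
    for i in range(half, h - half):
        for j in range(half, w - half):
            win = [row[j - half:j + half + 1]
                   for row in img[i - half:i + half + 1]]
            mask[i][j] = v_hat(win) < a0
    return mask
```

A typical call would compute `a0 = mc_threshold_2d(7, 0.01)` once and then apply `detect_edges(img, 7, a0)` to a noisy image stored as a list of rows. Since the threshold depends only on the window size and on $\varepsilon$, it does not need to be recomputed for each window position.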

Figure 1. A 101 x 101 synthetic image of square and circle step edges corrupted by noise.

Figure 2. Detected step edges of Figure 1.

REFERENCES

1. A. Wald and J. Wolfowitz, An exact test for randomness in the nonparametric case based on serial correlation, Annals of Mathematical Statistics 14, 378-388, (1943).
2. R.C. Geary, The contiguity ratio and statistical mapping, The Incorporated Statistician 5, 115-145, (1954).
3. A.D. Cliff and J.K. Ord, Spatial Processes: Models and Applications, Pion, London, (1981).
4. R.J. Aiyar, Asymptotic efficiency of rank tests of randomness against autocorrelation, Annals of the Institute of Statistical Mathematics 33, 255-262, (1981).
5. R.J. Aiyar, C.L. Gouillier and W. Albers, Asymptotic relative efficiencies of rank tests for trend alternatives, Journal of the American Statistical Association 74, 226-231, (1979).
6. M. Csörgő and L. Horváth, Nonparametric methods for changepoint problems, In Handbook of Statistics, Vol. 7. Quality Control and Reliability, (Edited by P.R. Krishnaiah and P.K. Sen), pp. 403-426, North-Holland, Amsterdam, (1988).
7. J. Hájek and Z. Šidák, Theory of Rank Tests, Academic Press, New York, (1967).
8. M. Kendall and J.K. Ord, Time Series, 3rd ed., Edward Arnold, UK, (1990).
9. P.R. Krishnaiah and B.Q. Miao, Review about estimation of change points, In Handbook of Statistics, Vol. 7. Quality Control and Reliability, (Edited by P.R. Krishnaiah and P.K. Sen), pp. 375-402, North-Holland, Amsterdam, (1988).
10. M. Kendall and A. Stuart, The Advanced Theory of Statistics, Vol. 2, 4th ed., Charles Griffin, London, (1979).
11. G.E. Noether, Asymptotic properties of the Wald-Wolfowitz test of randomness, Annals of Mathematical Statistics 21, 231-246, (1950).
12. G.K. Bhattacharyya, Tests of randomness against trend or serial correlation, In Handbook of Statistics, Vol. 4. Nonparametric Methods, (Edited by P.R. Krishnaiah and P.K. Sen), pp. 89-111, North-Holland, Amsterdam, (1984).
13. F. Lombard, Rank tests for changepoint problems, Biometrika 74, 615-624, (1987).
14. B.E. Brodsky and B.S. Darkhovsky, Nonparametric Methods in Change-Point Problems, Kluwer, Dordrecht, Netherlands, (1993).
15. A.I. Katsevich and A.G. Ramm, Consistency of rank test against multi-sample alternative: fixed and random design models, (submitted for publication).
16. A.I. Katsevich and A.G. Ramm, Consistency of rank test against trend in location, (submitted for publication).
17. J. Durbin and G.S. Watson, Testing for serial correlation in least squares regression I, Biometrika 37, 409-428, (1950).
18. A. Sard, The measure of critical values of differentiable maps, Bull. Amer. Math. Soc. 48, 883-890, (1942).
19. W.K. Pratt, Digital Image Processing, 2nd ed., Wiley, New York, (1991).
20. A. Rosenfeld and A.C. Kak, Digital Picture Processing, Vol. 2, 2nd ed., Academic Press, New York, (1982).