RELATIVE STABILITY FOR LOCAL MEDIAN ESTIMATE


1999, 19(1): 37-44

Yang Ying

Department of Probability and Statistics, Peking University, Beijing 100871, China

Abstract

Consider the nonparametric median regression model $Y_{ni} = g(x_{ni}) + \varepsilon_{ni}$, $1 \le i \le n$, where the $Y_{ni}$'s are the observations at the fixed design points $x_{ni} \in [0,1]$, the $\varepsilon_{ni}$'s are independent identically distributed random variables with median zero, and $g(x)$ is the smooth function of interest. Suppose the local median estimate $\hat g_{n,h}(x)$ of $g(x)$ admits a Bahadur representation. Under some regularity conditions, the relative stability of the local median estimate is established in the $L_2$ sense.

Key words: Local median estimate, relative stability, nonparametric median regression.

1991 MR Subject Classification: 62J

1 Introduction and Result

Consider the nonparametric median regression model

$$Y_{ni} = g(x_{ni}) + \varepsilon_{ni}, \quad 1 \le i \le n, \tag{1.1}$$

where $g : [0,1] \mapsto R$ is a smooth function to be estimated, $\{x_{ni}, 1 \le i \le n\}$ are non-random design points in the interval $[0,1]$, $\{\varepsilon_{ni}, 1 \le i \le n\}$ are independent identically distributed (iid) random variables with median zero and $\{Y_{ni}, 1 \le i \le n\}$ are the observations. Without loss of generality, assume that $0 = x_{n0} \le x_{n1} \le x_{n2} \le \cdots \le x_{nn} = 1$.

For $x \in [0,1]$ and $n \ge 1$, let $\{D_{n1}(x), D_{n2}(x), \dots, D_{nn}(x)\}$ be the order statistics of $\{x_{n1}, x_{n2}, \dots, x_{nn}\}$, ordered by the distances $\{|x - x_{ni}|, 1 \le i \le n\}$, with ties being broken by the chronological order. Let $Y_{ni}(x)$ and $\varepsilon_{ni}(x)$ denote the corresponding observation and random variable at $D_{ni}(x)$ $(1 \le i \le n)$, respectively. The estimate

$$\hat g_{n,h}(x) = \mathrm{median}\{Y_{n1}(x), Y_{n2}(x), \dots, Y_{nh}(x)\} \tag{1.2}$$

is called the local median estimator of $g(x)$, where the number of nearest neighbors $h$ plays the role of the smoothing parameter; if $h$ is even, $\hat g_{n,h}(x)$ is equal to the average of the two middle order statistics.

We consider the following quadratic measure of accuracy for $\hat g_{n,h}(x)$, the average squared error (ASE):

$$ASE(h) = n^{-1} \sum_{i=1}^{n} [\hat g_{n,h}(x_{ni}) - g(x_{ni})]^2. \tag{1.3}$$

1 Received Dec. 24, 1996; revised Sep. 15, 1997. Research supported by NSFC (Grant No.198.71003) and the Doctoral Foundation of Education of China.


ACTA MATHEMATICA SCIENTIA

Vol. 19

We will show the following asymptotic equivalence:

$$\lim_{n\to\infty} \sup_{h \in H_n} \left| \frac{ASE(h) - E(ASE(h))}{E(ASE(h))} \right| = 0 \quad \text{in probability}, \tag{1.4}$$

where $H_n$ is the index set specified below. When (1.4) holds, the estimate $\hat g_{n,h}(x)$ is called (weakly) relatively stable in the $L_2$ sense.
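The estimator and the ratio in (1.4) can be explored numerically. The following is a minimal sketch, not the author's code: the function names, the equally spaced design, and the standard normal errors are assumptions made only for illustration, and `np.median` supplies the "average of the two middle order statistics" rule for even $h$.

```python
import numpy as np

def local_median_estimate(x0, x, y, h):
    """Median of the h responses whose design points are nearest to x0."""
    # A stable sort keeps the original index order among tied distances,
    # which mimics breaking ties "by the chronological order".
    order = np.argsort(np.abs(x - x0), kind="stable")
    # np.median averages the two middle order statistics when h is even.
    return float(np.median(y[order[:h]]))

def average_squared_error(x, y, g, h):
    """ASE(h): mean squared error of the estimate over all design points."""
    fitted = np.array([local_median_estimate(xi, x, y, h) for xi in x])
    return float(np.mean((fitted - g(x)) ** 2))

def relative_deviation(g, n, h, n_rep=200, seed=0):
    """Monte Carlo values of |ASE(h) - E ASE(h)| / E ASE(h) for one fixed h.

    Assumptions for illustration only: design x_i = i/n and N(0, 1) errors;
    E ASE(h) is replaced by the average over the simulated replications.
    """
    rng = np.random.default_rng(seed)
    x = np.arange(1, n + 1) / n
    ases = np.array([
        average_squared_error(x, g(x) + rng.standard_normal(n), g, h)
        for _ in range(n_rep)
    ])
    return np.abs(ases - ases.mean()) / ases.mean()
```

Relative stability concerns the supremum of this ratio over all $h \in H_n$; the sketch only looks at a single $h$, so it illustrates the quantity rather than the theorem.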

The concept of relative stability for the kernel estimate of a density function was introduced by Devroye[1]. The usefulness of results such as (1.4) has been demonstrated for the kernel estimate in the iid setting by Marron and Härdle[2] and in the setting of stationary dependent observations by Vieu[3] (e.g., as an essential step in proving the asymptotic optimality of bandwidth selectors). In this article we establish (1.4) for the local median estimate. Asymptotic equivalence is also useful in analyzing bandwidth selectors; see, e.g., references [4-7] for the kernel estimate and [8] for the local median estimate.

To avoid edge effects we only estimate $g(x)$ for $x \in D = D_b = [0, 1-b]$, where $b \in (0, 1/2)$ is small enough. Accordingly, the ASE is redefined on $D_b$ as

$$d_h = n_D^{-1} \sum_{x_{ni} \in D} [\hat g_{n,h}(x_{ni}) - g(x_{ni})]^2, \tag{1.5}$$

where $n_D = \#\{i : 1 \le i \le n,\ x_{ni} \in D\}$.

Now we make the following assumptions.

(C.1) $\varepsilon_{n1}, \varepsilon_{n2}, \dots, \varepsilon_{nn}$ are iid random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ with a common continuous distribution function $F(x)$ and density function $f(x)$, where $F(0) = 1/2$ and $f(0) > 0$.

(C.2) The local median estimate $\hat g_{n,h}(x)$ admits the following Bahadur representation

$$\hat g_{n,h}(x) = g(x) + g''(x) M(x) (h/n)^2 + \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x) \le 0\} \right) + R_{n,h}(x) \tag{1.6}$$

as $n \to \infty$, where $M(x)$ is a function dependent on the design points satisfying $0 < m := \inf_{x \in [0,1]} M(x) \le \sup_{x \in [0,1]} M(x) =: M < \infty$, $I\{A\}$ denotes the indicator function of the set $A$,

$$H_n = H_n(a, b) = \{h : [a n^{\alpha}] \le h \le [b n^{\beta}]\},$$

$3/5 < \alpha \le 4/5 \le \beta < 37/45$ and $2\alpha - \beta > 3/5$. Furthermore, for each positive integer $m$, there exists a bounded $c_m > 0$ such that

$$\max_{x \in D} E R_{n,h}^{2m}(x) \le c_m \left[ (h/n)^{4m} + 1/h^{m} \right], \quad \forall h \in H_n, \tag{1.7}$$

as $n \to \infty$; in particular, for $m = 1$, $c_1 = c_1(n) \to 0$ as $n \to \infty$.

(C.3) $g(x)$ is twice differentiable in $[0,1]$, $\|g''\| := \max_{x \in [0,1]} |g''(x)| < \infty$ and

$$\lim_{n \to \infty} n_D^{-1} \sum_{x_{ni} \in D} (g''(x_{ni}))^2 := K_D > 0.$$

Theorem. Under the assumptions (C.1)-(C.3),

$$\lim_{n\to\infty} \sup_{h \in H_n} \left| \frac{d_h - E d_h}{E d_h} \right| = 0 \quad \text{in probability}. \tag{1.8}$$


Remark 1. Under some regularity conditions on the distribution function $F(z)$ of the random variables $\varepsilon_{ni}$'s, the fixed design points $x_{ni}$'s and the unknown function $g(z)$, condition (C.2) holds. In fact, (1.6) and (1.7) are the main results of papers written by the author, which will be reported elsewhere. The asymptotic mean squared error of the local median estimate is

$$M^2(x) (g''(x))^2 (h/n)^4 + \frac{1}{4 f^2(0) h},$$

which is minimised at $h_n(x) = \left(16 f^2(0) (g''(x))^2 M^2(x)\right)^{-1/5} n^{4/5}$. Theoretically, we can choose the smoothing parameter $h$ in the vicinity of $n^{4/5}$. Therefore, the assumption on $H_n$ is reasonable. The interested reader can refer to [8] for more details. Here we emphasize the relative stability instead of the Bahadur representation of the local median estimate. The applications of the Theorem in selecting the smoothing parameter $h$ for the local median estimate $\hat g_{n,h}(x)$ are discussed in [8].

Remark 2. If the design points $\{x_{n1}, x_{n2}, \dots, x_{nn}\}$ are generated by $P\{X \le x_{ni}\} = \int_0^{x_{ni}} v(x)\,dx$, $0 < v(x) < \infty$, then $M(x) = 1/\{24 v^2(x)\}$ in condition (C.2).
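The minimiser $h_n(x)$ quoted in Remark 1 can be checked with a short calculus step. The sketch below assumes that the asymptotic mean squared error is the sum of the squared bias term $M^2(x)(g''(x))^2(h/n)^4$ and the variance term $1/\{4f^2(0)h\}$, which is the expression for $EK_{n,h}^2(x)$ used in the proof; the symbol $\mathrm{AMSE}$ is introduced here only for illustration.

```latex
\begin{aligned}
\mathrm{AMSE}(h) &= M^2(x)\,(g''(x))^2\,(h/n)^4 + \frac{1}{4 f^2(0)\,h},\\[4pt]
\frac{d}{dh}\,\mathrm{AMSE}(h) &= \frac{4 M^2(x)\,(g''(x))^2\,h^3}{n^4} - \frac{1}{4 f^2(0)\,h^2} = 0
\;\Longleftrightarrow\;
h^5 = \frac{n^4}{16 f^2(0)\,(g''(x))^2 M^2(x)},\\[4pt]
h_n(x) &= \bigl(16 f^2(0)\,(g''(x))^2 M^2(x)\bigr)^{-1/5}\, n^{4/5}.
\end{aligned}
```

The second derivative of $\mathrm{AMSE}(h)$ is positive for $h > 0$, so this stationary point is the minimum, and it is of order $n^{4/5}$, in line with the choice $\alpha \le 4/5 \le \beta$ in $H_n$.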

2 Proof of Theorem

In this section, we give the proof of the Theorem. In what follows, for simplicity, the $x_{ni}$'s are abbreviated to $x_i$'s. All asymptotic statements are taken as $n$ ($h$) $\to \infty$, unless specified otherwise.

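Set

$$K_{n,h}(x) = M(x) g''(x) (h/n)^2 + \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x) \le 0\} \right),$$

$$J_{n,h}(x) = M(x) g''(x) (h/n)^2, \qquad \varepsilon_h(x) = \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x) \le 0\} \right).$$

First we estimate the lower bound of $E d_h$. By condition (C.2), for $m = 1$ and

$$\delta = \left\{ \frac{\min\{m^2 K_D,\ 1/(4 f^2(0))\}}{4 \max\{M \|g''\|,\ 1/(2 f(0))\}} \right\}^2,$$

there exists a positive integer $N_\delta$ such that $c_1 = c_1(n) < \delta$ for all $n \ge N_\delta$. Thus, conditions (C.2) and (C.3) yield

$$
\begin{aligned}
E d_h &= E \frac{1}{n_D} \sum_{x_i \in D} (\hat g_{n,h}(x_i) - g(x_i))^2 = E \frac{1}{n_D} \sum_{x_i \in D} [K_{n,h}(x_i) + R_{n,h}(x_i)]^2 \\
&\ge \frac{1}{n_D} \sum_{x_i \in D} \left( E K_{n,h}^2(x_i) - 2 \left[ E K_{n,h}^2(x_i) \right]^{1/2} \left[ E R_{n,h}^2(x_i) \right]^{1/2} \right) \\
&\ge \frac{1}{n_D} \sum_{x_i \in D} \left[ M^2(x_i) (g''(x_i))^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right] \\
&\quad - \frac{2}{n_D} \sum_{x_i \in D} \left[ M^2(x_i) (g''(x_i))^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right]^{1/2} \left[ E R_{n,h}^2(x_i) \right]^{1/2} \\
&\ge m^2 \left( \frac{1}{n_D} \sum_{x_i \in D} (g''(x_i))^2 \right) (h/n)^4 + \{4 f^2(0) h\}^{-1} \\
&\quad - 2 \left[ M^2 \|g''\|^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right]^{1/2} \sqrt{c_1(n)} \left[ (h/n)^4 + 1/h \right]^{1/2} \\
&= \left[ m^2 K_D (h/n)^4 + \{4 f^2(0) h\}^{-1} \right] - 2 \left[ M^2 \|g''\|^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right]^{1/2} \sqrt{c_1(n)} \left[ (h/n)^4 + 1/h \right]^{1/2} \\
&\ge \min\{m^2 K_D,\ \{4 f^2(0)\}^{-1}\} \left( (h/n)^4 + 1/h \right) - 2 \max\{M \|g''\|,\ 1/\{2 f(0)\}\} \sqrt{\delta} \left( (h/n)^4 + 1/h \right) \\
&\ge \frac12 \min\{m^2 K_D,\ 1/\{4 f^2(0)\}\} \left( (h/n)^4 + 1/h \right) \\
&\ge 2.5 \min\{m^2 K_D,\ 1/\{4 f^2(0)\}\}\, 4^{-4/5} n^{-4/5} =: K n^{-4/5},
\end{aligned}
$$

where $K = \frac52 \min\{m^2 K_D,\ 1/(4 f^2(0))\}\, 4^{-4/5}$ is a positive constant depending only on $K_D$ and $f(0)$.

Therefore, there exists a constant $K$ such that for $n$ large enough

$$E d_h \ge K n^{-4/5}. \tag{2.1}$$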
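The constant $2.5 \cdot 4^{-4/5}$ in the last step of the lower bound comes from minimising $(h/n)^4 + 1/h$ over $h$: the minimum is attained at $h = 4^{-1/5} n^{4/5}$ and equals $5 \cdot 4^{-4/5} n^{-4/5}$. A small numerical check, offered only as a sketch (the helper names and the value of `n` are illustrative assumptions):

```python
import numpy as np

def risk_envelope(h, n):
    """The quantity (h/n)^4 + 1/h that is bounded from below in the proof."""
    return (h / n) ** 4 + 1.0 / h

def closed_form_minimum(n):
    """Minimum over real h > 0, attained at h = 4^(-1/5) * n^(4/5)."""
    return 5.0 * 4.0 ** (-4.0 / 5.0) * n ** (-4.0 / 5.0)

n = 10_000
h_grid = np.arange(1, n + 1)              # admissible integer values of h
numeric_min = risk_envelope(h_grid, n).min()
h_star = 4.0 ** (-1.0 / 5.0) * n ** (4.0 / 5.0)

# The grid minimum can never fall below the continuous minimum, and for
# large n the two agree closely because the curve is flat near h_star.
print(numeric_min, closed_form_minimum(n), h_star)
```

Multiplying the closed-form minimum by the factor $\frac12 \min\{m^2 K_D, 1/(4f^2(0))\}$ from the preceding line gives the constant $K$.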

In what follows, we will use the $C_r$-inequality many times without indicating it at each appearance.

Set $D_j = \{x_i : i = (2h+1)k + j,\ k \ge 0,\ x_i \in D,\ 0 \le k \le [n/(2h+1)]\}$, $j = 0, 1, 2, \dots, 2h$, where $[x]$ denotes the integer part of $x$. The index set $\{x_i, x_i \in D\}$ becomes the union of the disjoint subsets $D_j$, $j = 0, 1, 2, \dots, 2h$, and

$$\left\{ \varepsilon_h(x_i) = \frac{1}{h f(0)} \sum_{k=1}^{h} \left( \frac12 - I\{\varepsilon_{nk}(x_i) \le 0\} \right),\ x_i \in D_j \right\}$$

is a set of iid random variables for each $j \in \{0, 1, 2, \dots, 2h\}$. From now on we treat $n_D/(2h+1)$ as an integer to avoid unnecessary complications. By condition (1.6),

$$\sum_{x_i \in D} \left[ (\hat g_{n,h}(x_i) - g(x_i))^2 - E(\hat g_{n,h}(x_i) - g(x_i))^2 \right] = V_1 + 2 V_2 + V_3,$$

where

$$V_1 = \sum_{x_i \in D} \left[ K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) \right], \qquad V_3 = \sum_{x_i \in D} \left[ R_{n,h}^2(x_i) - E R_{n,h}^2(x_i) \right],$$

$$V_2 = \sum_{x_i \in D} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E\left( K_{n,h}(x_i) R_{n,h}(x_i) \right) \right].$$
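The reason for splitting the design points into the classes $D_j$ is that two points whose indices differ by at least $2h+1$ have no nearest neighbors in common, so the corresponding sums $\varepsilon_h(x_i)$ are built from disjoint sets of errors. The sketch below illustrates this under assumptions that are not taken from the paper: strictly increasing design points, 0-based indices, and no restriction to $D$; all function names are hypothetical.

```python
import random

def neighbour_indices(x, i, h):
    """Indices of the h design points nearest to x[i] (ties broken by index)."""
    order = sorted(range(len(x)), key=lambda j: (abs(x[j] - x[i]), j))
    return set(order[:h])

def residue_classes(n, h):
    """Split the indices 0..n-1 into 2h+1 classes according to i mod (2h+1)."""
    step = 2 * h + 1
    return [list(range(j, n, step)) for j in range(step)]

def classes_have_disjoint_neighbourhoods(x, h):
    """True if, within every class, the h-neighbourhoods never overlap."""
    for cls in residue_classes(len(x), h):
        seen = set()
        for i in cls:
            nb = neighbour_indices(x, i, h)
            if seen & nb:       # an error term would be shared by two sums
                return False
            seen |= nb
    return True

rng = random.Random(0)
design = sorted(rng.random() for _ in range(200))   # distinct, increasing
print(classes_have_disjoint_neighbourhoods(design, h=7))
```

For sorted, distinct design points the $h$ nearest neighbors of $x_i$ form a block of consecutive indices containing $i$, hence lie within $h-1$ positions of $i$; this is why a spacing of $2h+1$ is more than enough.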

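Observe that for $h \in H_n$, any $\epsilon > 0$ and $k \ge 1$, by Chebyshev inequality

$$P\left\{ \sup_{h \in H_n} \frac{|d_h - E d_h|}{E d_h} \ge \epsilon \right\} \le b n^{\beta} \sup_{h \in H_n} P\{ |V_1 + 2 V_2 + V_3| \ge \epsilon n_D E d_h \} \le b n^{\beta} \sum_{l=1}^{3} \sup_{h \in H_n} U_l(h),$$

where $U_l = P\{ |V_l| \ge \frac13 \epsilon n_D E d_h \}$ for $l = 1, 3$ and $U_2 = P\{ |V_2| \ge \frac16 \epsilon n_D E d_h \}$. To prove the Theorem, it is enough to show that

$$n^{\beta} \sup_{h \in H_n} U_l(h) \to 0, \quad \text{as } n \to \infty, \quad \text{for } l = 1, 2, 3. \tag{2.2}$$

As to $l = 1$: We know from the decomposition of the index set $\{x_i, x_i \in D\}$ that

$$
\begin{aligned}
U_1(h) &= P\left\{ \left| \sum_{j=0}^{2h} \sum_{x_i \in D_j} \left[ K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) \right] \right| \ge \frac13 \epsilon n_D E d_h \right\} \\
&\le P\left\{ \sum_{j=0}^{2h} \left| \sum_{x_i \in D_j} \left[ K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) \right] \right| \ge \frac13 \epsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} \left[ K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) \right] \right| \ge \frac{1}{3(2h+1)} \epsilon n_D E d_h \right\}.
\end{aligned} \tag{2.3}
$$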

Tedious calculations show that (2.4)

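Therefore, the definitions of $K_{n,h}(\cdot)$ and $\varepsilon_h(\cdot)$ together with (2.4) yield

$$
\begin{aligned}
K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) &= [J_{n,h}(x_i) + \varepsilon_h(x_i)]^2 - E[J_{n,h}(x_i) + \varepsilon_h(x_i)]^2 \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \varepsilon_h^2(x_i) - 2 J_{n,h}(x_i) E\varepsilon_h(x_i) - E\varepsilon_h^2(x_i) \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \varepsilon_h^2(x_i) - 1/\{4 f^2(0) h\},
\end{aligned} \tag{2.5}
$$

and

$$
\begin{aligned}
\varepsilon_h^2(x_i) &= \left[ \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \right]^2 \\
&= \frac{1}{h^2 f^2(0)} \left[ \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right)^2 + \sum_{j \ne k} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac12 - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \right] \\
&= \frac{1}{h^2 f^2(0)} \left[ \frac{h}{4} + \sum_{1 \le j \ne k \le h} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac12 - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \right].
\end{aligned} \tag{2.6}
$$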

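It follows from (2.5) and (2.6) that

$$
\begin{aligned}
K_{n,h}^2(x_i) - E K_{n,h}^2(x_i) &= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac12 - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} w_{n,j}(x_i) w_{n,k}(x_i) \\
&=: 2 U_{11} + U_{12},
\end{aligned} \tag{2.7}
$$

where $U_{11} = J_{n,h}(x_i) \varepsilon_h(x_i)$, $U_{12} = \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} w_{n,j}(x_i) w_{n,k}(x_i)$ and $w_{n,j}(x_i) = 1/2 - I\{\varepsilon_{nj}(x_i) \le 0\}$, $x_i \in D$, $1 \le j \le h$. Note that for every fixed $i$, $w_{n,1}(x_i), \dots, w_{n,h}(x_i)$ are iid random variables.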
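The expansion (2.6) and the centring constant $1/\{4f^2(0)h\}$ in (2.5) can be verified exactly for small $h$, because each $w_{n,j}(x_i) = 1/2 - I\{\varepsilon_{nj}(x_i) \le 0\}$ takes the values $\pm 1/2$ with probability $1/2$ under (C.1). The following sketch enumerates all sign patterns with exact rational arithmetic; the function name and the sample value of $f(0)$ are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def check_square_expansion(h, f0):
    """Verify (2.6) for every sign pattern and return E[eps_h^2].

    Each w_j equals +1/2 or -1/2 with probability 1/2, since F(0) = 1/2.
    """
    half = Fraction(1, 2)
    total = Fraction(0)
    for w in product((half, -half), repeat=h):
        # Left-hand side of (2.6): the squared normalised sum.
        lhs = (sum(w) / (h * f0)) ** 2
        # Right-hand side: diagonal part h/4 plus the off-diagonal products.
        cross = sum(w[j] * w[k] for j in range(h) for k in range(h) if j != k)
        rhs = (Fraction(h, 4) + cross) / (h * f0) ** 2
        assert lhs == rhs
        total += lhs
    # All 2^h patterns are equally likely, so the mean is a plain average.
    return total / 2 ** h

f0 = Fraction(2, 5)                       # illustrative value of f(0)
second_moment = check_square_expansion(4, f0)
print(second_moment, 1 / (4 * f0 ** 2 * 4))
```

The off-diagonal products average to zero, which is why only the diagonal part $h/4$ survives in the expectation and the remainder $U_{12}$ in (2.7) is centred.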

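By (2.3) and (2.7),

$$
\begin{aligned}
U_1(h) &\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} [2 U_{11} + U_{12}] \right| \ge \frac{1}{3(2h+1)} \epsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} 2 U_{11} \right| + \left| \sum_{x_i \in D_j} U_{12} \right| \ge \frac{1}{3(2h+1)} \epsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} U_{11} \right| \ge \frac{1}{12(2h+1)} \epsilon n_D E d_h \right\} + \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} U_{12} \right| \ge \frac{1}{6(2h+1)} \epsilon n_D E d_h \right\} \\
&=: I_1 + I_2.
\end{aligned} \tag{2.8}
$$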

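Item $I_1$: Using Chebyshev inequality and the Dharmadhikari-Jogdeo (DJ) inequality [9], for every $k = 1, 2, \cdots$, (2.9)

Recalling the definition of $\varepsilon_h(x_i)$ and using the DJ inequality again,

$$
\begin{aligned}
E|\varepsilon_h(x_i)|^{2k} &= \frac{1}{(h f(0))^{2k}} E\left| \sum_{j=1}^{h} \left( \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \right|^{2k} \\
&\le \frac{1}{(h f(0))^{2k}} c_{2k} h^{k-1} \sum_{j=1}^{h} E\left| \frac12 - I\{\varepsilon_{nj}(x_i) \le 0\} \right|^{2k} \\
&= \frac{1}{f^{2k}(0) h^{k}} c_{2k} E\left| \frac12 - I\{\varepsilon_{1} \le 0\} \right|^{2k} = \frac{c_{2k}}{h^{k}},
\end{aligned} \tag{2.10}
$$

where $c_{2k}$ is a constant, which may take a different value at each appearance, depending only on $k$. Hence (2.9), (2.10) and the fact that $E d_h > 0$ yield

$$
\begin{aligned}
I_1 &\le \sum_{j=0}^{2h} \left( \frac{1}{12} \epsilon E d_h \right)^{-2k} \frac{(2h+1)^{k+1}}{n_D^{k+1}} c_{2k} \frac{c_{2k}}{h^{k}} \sum_{x_i \in D_j} |J_{n,h}(x_i)|^{2k} \\
&\le c_{2k} \epsilon^{-2k} (2h+1) (E d_h)^{-2k} \left( \frac{2h+1}{n_D} \right)^{k+1} \frac{1}{h^{k}} \max_{0 \le j \le 2h} \sum_{x_i \in D_j} \left| M(x_i) g''(x_i) (h/n)^2 \right|^{2k} \\
&\le c_{2k} \epsilon^{-2k} \|g''\|^{2k} (E d_h)^{-2k}\, h \left( \frac{h}{n} \right)^{k+1} \frac{1}{h^{k}} \cdot \frac{n}{h} \cdot \left( \frac{h}{n} \right)^{4k} \\
&= c_{2k} \epsilon^{-2k} (E d_h)^{-2k} h^{1+4k} n^{-5k}.
\end{aligned} \tag{2.11}
$$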

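Item $I_2$: By Theorem 2 in [10], for $k = 1, 2, \cdots$, tedious calculations show that

$$I_2 \le c_{2k} \epsilon^{-2k} (E d_h)^{-2k} h^{1-k} n^{-k}. \tag{2.12}$$

By (2.8), (2.9) and (2.12),

$$U_1(h) \le c\, (E d_h)^{-2k} \left( h^{1+4k} n^{-5k} + h^{1-k} n^{-k} \right), \tag{2.13}$$

where $c$ is a positive constant independent of $h$.

Therefore, when $k > \max\left\{ \frac{2\beta}{17/5 - 4\beta},\ \frac{\alpha + \beta}{\alpha - 3/5} \right\}$, (2.1) and (2.13) yield

$$n^{\beta} \sup_{h \in H_n} U_1(h) \le c\, n^{\beta} (n^{-4/5})^{-2k} \sup_{h \in H_n} \left( h^{1+4k} n^{-5k} + h^{1-k} n^{-k} \right) \to 0. \tag{2.14}$$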
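The condition on $k$ can be read off by substituting the extreme values of $h$ in $H_n$ into the bound: with $h \le b n^{\beta}$ the first term has order $n^{2\beta + k(4\beta - 17/5)}$, and with $h \ge a n^{\alpha}$ the second has order $n^{\alpha + \beta + k(3/5 - \alpha)}$. The sketch below evaluates these two exponents with exact fractions; the function names are hypothetical and the exponents are the ones obtained from this substitution, not quoted from the paper.

```python
from fractions import Fraction as Fr

def u1_exponents(alpha, beta, k):
    """Exponents of n in the two terms bounding n^beta * sup U_1(h).

    First term:  beta + 8k/5 + beta*(1 + 4k) - 5k   (uses h <= b * n^beta)
    Second term: beta + 8k/5 + alpha*(1 - k) - k    (uses h >= a * n^alpha)
    """
    first = 2 * beta + k * (4 * beta - Fr(17, 5))
    second = alpha + beta + k * (Fr(3, 5) - alpha)
    return first, second

def k_threshold(alpha, beta):
    """Smallest real k at which neither exponent is positive."""
    return max(2 * beta / (Fr(17, 5) - 4 * beta),
               (alpha + beta) / (alpha - Fr(3, 5)))

alpha = beta = Fr(4, 5)           # admissible: 3/5 < alpha <= 4/5 <= beta < 37/45
print(k_threshold(alpha, beta))   # both exponents vanish exactly at this k
print(u1_exponents(alpha, beta, 9))
```

Any integer $k$ strictly above the threshold makes both exponents negative, which is what the convergence to zero requires.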

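As to $l = 2$: As in the proof of the case $l = 1$, condition (C.2) yields

$$
\begin{aligned}
U_2(h) &= P\left\{ \left| \sum_{j=0}^{2h} \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right| \ge \frac16 \epsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right| \ge \frac{\epsilon n_D}{6(2h+1)} E d_h \right\} \\
&\le (2h+1) \max_{0 \le j \le 2h} \left( \frac{\epsilon n_D}{6(2h+1)} E d_h \right)^{-2k} E\left| \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right|^{2k} \\
&\le c\, h \left( \frac{n}{h} \right)^{-2k} (E d_h)^{-2k} \left( \frac{n}{2h+1} \right)^{k-1} \sum_{x_i \in D_j} E\left| K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right|^{2k} \\
&\le c\, h \left( \frac{n}{h} \right)^{-2k} (E d_h)^{-2k} \left( \frac{n}{h} \right)^{k} \max_{x_i \in D} E\left| K_{n,h}(x_i) R_{n,h}(x_i) \right|^{2k} \\
&\le c\, h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \max_{x \in D} \left[ E K_{n,h}^{4k}(x) \right]^{1/2} \left[ E R_{n,h}^{4k}(x) \right]^{1/2} \\
&\le c\, h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \left[ \left( \frac{h}{n} \right)^{8k} + \frac{1}{h^{2k}} \right].
\end{aligned}
$$

Therefore, when $k > \max\left\{ \frac{2\beta}{37/5 - 9\beta},\ \frac{\beta}{2\alpha - \beta - 3/5} \right\}$, (2.1) yields

$$
\begin{aligned}
n^{\beta} \sup_{h \in H_n} U_2(h) &\le c\, n^{\beta} \sup_{h \in H_n} h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \left[ \left( \frac{h}{n} \right)^{8k} + \frac{1}{h^{2k}} \right] \\
&\le c\, n^{\beta} \cdot n^{\beta} \cdot n^{-(1-\beta)k}\, n^{8k/5} \left[ n^{-8k(1-\beta)} + n^{-2k\alpha} \right] \\
&= c\, n^{2\beta - 9(1-\beta)k + 8k/5} + c\, n^{2\beta - (1-\beta)k + 8k/5 - 2k\alpha} \to 0.
\end{aligned} \tag{2.15}
$$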

As to $l = 3$: This is similar to the proof of the case $l = 2$; it is easy to see that $U_3(h) \le c\, h (h/n)^{k} (E d_h)^{-2k} \left[ (h/n)^{8k} + 1/h^{2k} \right]$. Therefore, when $k > \max\{ 2\beta/(37/5 - 9\beta),\ \beta/(2\alpha - \beta - 3/5) \}$,

$$n^{\beta} \sup_{h \in H_n} U_3(h) \to 0. \tag{2.16}$$

Combining (2.14), (2.15) and (2.16), we know that (2.2) holds provided $k > \max\{ (\alpha + \beta)/(\alpha - 3/5),\ 2\beta/(37/5 - 9\beta),\ \beta/(2\alpha - \beta - 3/5) \}$ and $n \to \infty$. The proof is complete.

References

1 Devroye L. The kernel estimate is relatively stable. Probab Th Rel Fields, 1988, 77: 521-536

2 Marron J S, Härdle W. Random approximation to some measures of accuracy in nonparametric curve estimation. J Multivariate Anal, 1986, 20: 91-113

3 Vieu P. Quadratic errors for nonparametric estimates under dependence. J Multivariate Anal, 1991, 39: 324-347

4 Hall P, Marron J S. Extent to which least squares cross validation minimizes integrated square error in nonparametric density estimation. Probab Th Rel Fields, 1987, 74: 567-581

5 Härdle W, Hall P, Marron J S. How far are automatically chosen regression smoothing parameters from their optimum? J Amer Statist Assoc, 1988, 83: 86-95

6 Härdle W, Vieu P. Kernel smoothing of time series. J Time Ser Anal, 1988, 9: 86-95

7 Hart W, Vieu P. Data-driven bandwidth choice for density estimation based on dependent data. Ann Statist, 1990, 18: 873-890

8 Yang Y. Nearest neighbor median estimate and the selection of smooth parameters [dissertation]. Beijing: Peking University, 1996. 1-129

9 Dharmadhikari S W, Jogdeo K. Bounds on moments of certain random variables. Ann Math Statist, 1969, 40: 1506-1508

10 Whittle P. Bounds for the moments of linear and quadratic forms in independent random variables. Theor Probab Appl, 1969, 5: 302-305