1999,19( 1):37--44
RELATIVE STABILITY FOR LOCAL MEDIAN ESTIMATE¹

Yang Ying
Department of Probability and Statistics, Peking University, Beijing 100871, China

Abstract Consider the nonparametric median regression model $Y_{ni} = g(x_{ni}) + \varepsilon_{ni}$, $1 \le i \le n$, where the $Y_{ni}$'s are the observations at the fixed design points $x_{ni} \in [0,1]$, $1 \le i \le n$, the $\varepsilon_{ni}$'s are independent identically distributed random variables with median zero, and $g(x)$ is the smooth function of interest. Suppose the local median estimate $\hat g_{n,h}(x)$ of $g(x)$ admits Bahadur's representation. Under some regular conditions, the relative stability of the local median estimate is established in the $L_2$ sense.

Key words Local median estimate, relative stability, nonparametric median regression.
1991 MR Subject Classification 62.1

1 Introduction and Result

Consider the nonparametric median regression model
$$Y_{ni} = g(x_{ni}) + \varepsilon_{ni}, \qquad 1 \le i \le n, \tag{1.1}$$
where $g : [0,1] \to \mathbb{R}$ is a smooth function to be estimated, $\{x_{ni}, 1 \le i \le n\}$ are non-random design points in the interval $[0,1]$, $\{\varepsilon_{ni}, 1 \le i \le n\}$ are independent identically distributed (iid) random variables with median zero and $\{Y_{ni}, 1 \le i \le n\}$ are the observations. Without loss of generality, assume that $0 = x_{n0} \le x_{n1} \le x_{n2} \le \cdots \le x_{nn} = 1$. For $x \in [0,1]$ and $n \ge 1$, let $\{D_{n1}(x), D_{n2}(x), \ldots, D_{nn}(x)\}$ be the order statistics of $\{x_{n1}, x_{n2}, \ldots, x_{nn}\}$, ordered by the distance $\{|x - x_{ni}|, 1 \le i \le n\}$, with ties being broken by the chronological order. Let $Y_{ni}(x)$ and $\varepsilon_{ni}(x)$ denote the corresponding observation and random variable at $D_{ni}(x)$ $(1 \le i \le n)$, respectively. The following estimate
$$\hat g_{n,h}(x) = \operatorname{median}\{Y_{n1}(x), Y_{n2}(x), \ldots, Y_{nh}(x)\} \tag{1.2}$$
is called the local median estimator of $g(x)$, where the number of nearest neighbors $h$ plays the role of the smoothing parameter; if $h$ is even, $\hat g_{n,h}(x)$ is equal to the average of the two middle order statistics. We consider the following quadratic measure of accuracy for $\hat g_{n,h}(x)$: the average squared error (ASE),
$$\mathrm{ASE}(h) = n^{-1} \sum_{i=1}^{n} [\hat g_{n,h}(x_{ni}) - g(x_{ni})]^2, \tag{1.3}$$
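The nearest-neighbour median and the ASE in (1.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the equally spaced design, the test function $g(x) = \sin(2\pi x)$ and the Laplace errors are hypothetical choices, not taken from the paper.

```python
import numpy as np

def local_median_estimate(x, design, y, h):
    """Median of the h observations whose design points are nearest to x."""
    dist = np.abs(np.asarray(design, dtype=float) - x)
    # A stable sort keeps the original (chronological) order among ties.
    nearest = np.argsort(dist, kind="stable")[:h]
    # np.median averages the two middle order statistics when h is even.
    return float(np.median(np.asarray(y, dtype=float)[nearest]))

def average_squared_error(g, design, y, h):
    """ASE(h) = n^{-1} * sum_i [g_hat(x_i) - g(x_i)]^2, as in (1.3)."""
    design = np.asarray(design, dtype=float)
    fitted = np.array([local_median_estimate(x, design, y, h) for x in design])
    return float(np.mean((fitted - g(design)) ** 2))

# Hypothetical example: equally spaced design, smooth g, median-zero Laplace noise.
rng = np.random.default_rng(0)
n = 200
design = np.arange(1, n + 1) / n
g = lambda x: np.sin(2 * np.pi * x)
y = g(design) + rng.laplace(scale=0.2, size=n)
print(average_squared_error(g, design, y, h=25))
```

The stable sort is what implements the tie-breaking rule; any other sort could pick a later observation among equidistant design points.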
¹ Received Dec.24,1996; revised Sep.15,1997. Research Supported by NSFC (Grant No.198.71003) and the Doctoral Foundation of Education of China.
38
ACTA MATHEMATICA SCIENTIA
Vol. 19
we will show the following asymptotic equivalence:
$$\lim_{n\to\infty} \sup_{h \in H_n} \left| \frac{\mathrm{ASE}(h) - E(\mathrm{ASE}(h))}{E(\mathrm{ASE}(h))} \right| = 0 \quad \text{in probability}, \tag{1.4}$$
where $H_n$ is the index set specified below. When (1.4) holds, the estimate $\hat g_{n,h}(x)$ is called (weakly) relatively stable in the $L_2$ sense.
The concept of relative stability for the kernel estimate of a density function was introduced by Devroye[1]. The usefulness of results such as (1.4) has been demonstrated for the kernel estimate in the iid setting by Marron and Härdle[2] and in the setting of stationary dependent observations by Vieu[3] (e.g., as an essential step to prove the asymptotic optimality of bandwidth selectors). In this article we establish (1.4) for the local median estimate. Asymptotic equivalence is also useful in analyzing bandwidth selectors; see, e.g., references [4-7] for the kernel estimate and [8] for the local median estimate.

To avoid edge effects we only estimate $g(x)$ for $x \in D = D_b = [0, 1-b]$, where $b \in (0, 1/2)$ is small enough. In this case, the ASE is redefined on $D_b$ as follows
$$d_h = n_D^{-1} \sum_{x_{ni} \in D} [\hat g_{n,h}(x_{ni}) - g(x_{ni})]^2, \tag{1.5}$$
where $n_D = \#\{i : 1 \le i \le n,\ x_{ni} \in D\}$.
Now we make the following assumptions.

(C.1) $\varepsilon_{n1}, \varepsilon_{n2}, \ldots, \varepsilon_{nn}$ are iid random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ with a common continuous distribution function $F(x)$ and density function $f(x)$. $F(0) = 1/2$, $f(0) > 0$.

(C.2) The local median estimate $\hat g_{n,h}(x)$ admits the following Bahadur's representation
$$\hat g_{n,h}(x) = g(x) + g''(x) M(x) (h/n)^2 + \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x) \le 0\} \right) + R_{n,h}(x) \tag{1.6}$$
as $n \to \infty$, where $M(x)$ is a function dependent on the design points satisfying
$$0 < m = \inf_{x \in [0,1]} M(x) \le \sup_{x \in [0,1]} M(x) = M < \infty,$$
$I\{A\}$ denotes the indicator function of the set $A$,
$$H_n = H_n(\alpha, \beta) = \{h : [a n^{\alpha}] \le h \le [b n^{\beta}]\},$$
$3/5 < \alpha \le 4/5 \le \beta < 37/45$ and $2\alpha - \beta > 3/5$. Furthermore, for each positive integer $m$, there exists a bounded $c_m > 0$ such that
$$\max_{x \in [0,1]} E R^{2m}_{n,h}(x) \le c_m \left[ (h/n)^{4m} + 1/h^{m} \right], \quad \forall h \in H_n, \tag{1.7}$$
as $n \to \infty$; in particular, for $m = 1$, $c_1 = c_1(n) \to 0$ as $n \to \infty$.

(C.3) $g(x)$ is twice differentiable in $[0,1]$,
$$\|g''\| := \max_{x \in [0,1]} |g''(x)| < \infty \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{n_D} \sum_{x_{ni} \in D} (g''(x_{ni}))^2 := K_D > 0.$$

Theorem Under the assumptions (C.1)-(C.3),
$$\lim_{n\to\infty} \sup_{h \in H_n} \left| \frac{d_h - E d_h}{E d_h} \right| = 0 \quad \text{in probability}. \tag{1.8}$$
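The ratio appearing in (1.8) can be explored with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration only: the equally spaced design, $g(x) = \sin(2\pi x)$, Gaussian errors, $b = 0.1$, the grid of $h$ between $n^{0.7}$ and $n^{0.8}$, and the use of a replicate average as a stand-in for $E d_h$. A small simulation of this kind cannot confirm the limit; it only shows how the quantity is computed.

```python
import numpy as np

def knn_median_fit(design, y, h):
    """Local median estimate at every design point (h nearest neighbours)."""
    fitted = np.empty(len(design))
    for i, x in enumerate(design):
        nearest = np.argsort(np.abs(design - x), kind="stable")[:h]
        fitted[i] = np.median(y[nearest])
    return fitted

def relative_deviation(n, h_grid, reps=20, b=0.1, seed=0):
    """sup over h_grid of |d_h - E d_h| / E d_h for the first replicate,
    with E d_h approximated by the average of d_h over all replicates."""
    rng = np.random.default_rng(seed)
    design = np.arange(1, n + 1) / n
    g = np.sin(2 * np.pi * design)           # hypothetical smooth g
    inside = design <= 1 - b                 # D_b = [0, 1 - b] as in the text
    d = np.empty((reps, len(h_grid)))
    for r in range(reps):
        y = g + 0.3 * rng.standard_normal(n)  # iid errors with median zero
        for c, h in enumerate(h_grid):
            err = knn_median_fit(design, y, h) - g
            d[r, c] = np.mean(err[inside] ** 2)
    mean_d = d.mean(axis=0)
    return float(np.max(np.abs(d[0] - mean_d) / mean_d))

for n in (100, 300):
    lo, hi = int(n ** 0.7), int(n ** 0.8)
    h_grid = list(range(lo, hi + 1, max(1, (hi - lo) // 4)))
    print(n, relative_deviation(n, h_grid))
```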
39
Yang: RELATIVE STABILITY FOR LOCAL MEDIAN ESTIMATE
No.1
Remark 1 Under some regular conditions on the distribution function $F(x)$ of the random variables $\varepsilon_{ni}$'s, the fixed design points $x_{ni}$'s and the unknown function $g(x)$, condition (C.2) holds. In fact, (1.6) and (1.7) are the main results of papers written by the author, which will be reported elsewhere. The asymptotic mean squared error of the local median estimate is
$$M^2(x) (g''(x))^2 (h/n)^4 + \frac{1}{4 f^2(0) h},$$
which is minimised at $h_n(x) = \left(16 f^2(0) (g'')^2(x) M^2(x)\right)^{-1/5} n^{4/5}$. Theoretically, we can choose the smoothing parameter $h$ in the vicinity of $n^{4/5}$. Therefore, the assumption on $H_n$ is reasonable. The interested reader can refer to [8] for more details. Here we emphasize the relative stability instead of Bahadur's representation of the local median estimate. The applications of the Theorem in selecting the smoothing parameter $h$ for the local median estimate $\hat g_{n,h}(x)$ are discussed in [8].
Remark 2 If the design points $\{x_{n1}, x_{n2}, \ldots, x_{nn}\}$ are generated by $P\{X \le x_{ni}\} = \int_0^{x_{ni}} v(x)\,dx$, $0 < v(x) < \infty$, then $M(x) = 1/\{24 v^2(x)\}$ in the condition (C.2).
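The minimiser quoted in Remark 1 can be checked numerically. The sketch below assumes the asymptotic mean squared error has the form $M^2 (g'')^2 (h/n)^4 + 1/\{4 f^2(0) h\}$ (squared bias plus variance of the leading terms in (1.6)); the function names and the numerical values (standard normal errors, $g'' = 2$, $M = 1/24$ as Remark 2 gives for $v \equiv 1$) are hypothetical.

```python
import math

def amse(h, n, f0, g2, M):
    """Squared bias M^2 g''^2 (h/n)^4 plus variance 1 / (4 f(0)^2 h)."""
    return (M * g2) ** 2 * (h / n) ** 4 + 1.0 / (4.0 * f0 ** 2 * h)

def optimal_h(n, f0, g2, M):
    """Closed form h_n(x) = (16 f(0)^2 g''(x)^2 M(x)^2)^(-1/5) * n^(4/5)."""
    return (16.0 * f0 ** 2 * g2 ** 2 * M ** 2) ** (-0.2) * n ** 0.8

# Hypothetical values: standard normal errors, g''(x) = 2, uniform design.
n, f0, g2, M = 1000, 1 / math.sqrt(2 * math.pi), 2.0, 1 / 24
h_star = optimal_h(n, f0, g2, M)
# Brute-force minimiser over integer h, for comparison with the closed form.
grid_best = min(range(1, n + 1), key=lambda h: amse(h, n, f0, g2, M))
print(h_star, grid_best)
```

Because the criterion is convex in $h$, the best integer $h$ lies within one unit of the closed-form value.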
2 Proof of Theorem

In this section, we give the proof of the Theorem. In what follows, for simplicity, the $x_{ni}$'s are abbreviated to $x_i$'s. All asymptotic statements are taken as $n\,(h) \to \infty$, unless specified otherwise.
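Set
$$K_{n,h}(x) = M(x) g''(x) (h/n)^2 + \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x) \le 0\} \right),$$
$$J_{n,h}(x) = M(x) g''(x) (h/n)^2, \qquad \varepsilon_h(x) = \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x) \le 0\} \right).$$
First we estimate the lower bound of $E d_h$. By condition (C.2), applying (1.7) with the integer $m = 1$, for
$$c_1 = \left\{ \frac{\min\{m^2 K_D,\ 1/(4 f^2(0))\}}{4 \max\{M \|g''\|,\ 1/(2 f(0))\}} \right\}^2$$
there exists a positive integer $N_0$ such that $c_1(n) < c_1$ for all $n \ge N_0$. Thus, conditions (C.2) and (C.3) yield
$$\begin{aligned}
E d_h &= E \frac{1}{n_D} \sum_{x_i \in D} (\hat g_{n,h}(x_i) - g(x_i))^2 = E \frac{1}{n_D} \sum_{x_i \in D} [K_{n,h}(x_i) + R_{n,h}(x_i)]^2 \\
&\ge \frac{1}{n_D} \sum_{x_i \in D} \left( E K^2_{n,h}(x_i) - 2 \left[ E K^2_{n,h}(x_i) \right]^{1/2} \left[ E R^2_{n,h}(x_i) \right]^{1/2} \right) \\
&= \frac{1}{n_D} \sum_{x_i \in D} \left[ M^2(x_i) (g''(x_i))^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right] \\
&\quad - \frac{2}{n_D} \sum_{x_i \in D} \left[ M^2(x_i) (g''(x_i))^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right]^{1/2} \left[ E R^2_{n,h}(x_i) \right]^{1/2} \\
&\ge m^2 \left( \frac{1}{n_D} \sum_{x_i \in D} (g''(x_i))^2 \right) (h/n)^4 + \{4 f^2(0) h\}^{-1} \\
&\quad - 2 \left[ M^2 \|g''\|^2 (h/n)^4 + \{4 f^2(0) h\}^{-1} \right]^{1/2} \sqrt{c_1(n)} \left[ (h/n)^4 + 1/h \right]^{1/2} \\
&\ge \min\{m^2 K_D,\ \{4 f^2(0)\}^{-1}\} \left( (h/n)^4 + 1/h \right) - 2 \max\{M \|g''\|,\ 1/\{2 f(0)\}\} \sqrt{c_1} \left( (h/n)^4 + 1/h \right) \\
&\ge \tfrac{1}{2} \min\{m^2 K_D,\ 1/\{4 f^2(0)\}\} \left( (h/n)^4 + 1/h \right) \\
&\ge 2.5 \min\{m^2 K_D,\ 1/\{4 f^2(0)\}\}\, 4^{-4/5} n^{-4/5} =: K n^{-4/5},
\end{aligned}$$
where, by (C.3), $n_D^{-1} \sum_{x_i \in D} (g''(x_i))^2$ has been replaced by its limit $K_D$ for $n$ large, the last step uses $\min_{h > 0} \{(h/n)^4 + 1/h\} = 5 \cdot 4^{-4/5} n^{-4/5}$, and $K = \tfrac{5}{2} \min\{m^2 K_D,\ 1/(4 f^2(0))\}\, 4^{-4/5}$ is a positive constant depending only on $K_D$ and $f(0)$.

Therefore, there exists a constant $K$ such that for $n$ large enough
$$E d_h \ge K n^{-4/5}, \quad \forall h \in H_n. \tag{2.1}$$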
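The rate $n^{-4/5}$ in the lower bound comes from minimising $(h/n)^4 + 1/h$ over $h$: setting the derivative $4h^3/n^4 - 1/h^2$ to zero gives $h = 4^{-1/5} n^{4/5}$ and the minimum value $5 \cdot 4^{-4/5} n^{-4/5}$. A short numerical check, with hypothetical function names:

```python
def phi(h, n):
    """phi(h) = (h/n)^4 + 1/h, the h-dependent factor in the lower bound."""
    return (h / n) ** 4 + 1.0 / h

def phi_min(n):
    """Closed-form minimum 5 * 4^(-4/5) * n^(-4/5)."""
    return 5.0 * 4.0 ** (-0.8) * n ** (-0.8)

def h_min(n):
    """Minimiser h = 4^(-1/5) * n^(4/5)."""
    return 4.0 ** (-0.2) * n ** 0.8

for n in (10**3, 10**4, 10**5):
    # The two printed values should agree up to floating-point rounding.
    print(n, phi(h_min(n), n), phi_min(n))
```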
In what follows, we will use the $C_r$-inequality many times and will not indicate it at each appearance.
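Set $D_j = \{x_i : i = (2h+1)k + j,\ k \ge 0,\ x_i \in D,\ 0 \le k \le [n/(2h+1)]\}$, $j = 0, 1, 2, \ldots, 2h$, where $[x]$ denotes the integer part of $x$. The index set $\{x_i, x_i \in D\}$ becomes the union of the disjoint subsets $D_j$, $j = 0, 1, 2, \ldots, 2h$, and
$$\left\{ \varepsilon_h(x_i) = \frac{1}{h f(0)} \sum_{k=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nk}(x_i) \le 0\} \right),\ x_i \in D_j \right\}$$
is a set of iid random variables for each $j \in \{0, 1, 2, \ldots, 2h\}$. From now on we treat $n_D/(2h+1)$ as an integer to avoid unnecessary complications. By condition (1.6),
$$\sum_{x_i \in D} \left[ (\hat g_{n,h}(x_i) - g(x_i))^2 - E(\hat g_{n,h}(x_i) - g(x_i))^2 \right] = V_1 + 2V_2 + V_3,$$
where
$$V_1 = \sum_{x_i \in D} \left[ K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) \right], \qquad V_3 = \sum_{x_i \in D} \left[ R^2_{n,h}(x_i) - E R^2_{n,h}(x_i) \right],$$
$$V_2 = \sum_{x_i \in D} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E\left( K_{n,h}(x_i) R_{n,h}(x_i) \right) \right].$$
Observe that for $h \in H_n$, $\forall \varepsilon > 0$ and $k \ge 1$, by Chebyshev inequality
$$P\left\{ \sup_{h \in H_n} \left| \frac{d_h - E d_h}{E d_h} \right| \ge \varepsilon \right\} \le b n^{\beta} \sup_{h \in H_n} P\{ |V_1 + 2V_2 + V_3| \ge \varepsilon n_D E d_h \} \le b n^{\beta} \sum_{l=1}^{3} \sup_{h \in H_n} U_l(h),$$
where $U_l = P\{ |V_l| \ge \tfrac{1}{3} \varepsilon n_D E d_h \}$ for $l = 1, 3$ and $U_2 = P\{ |V_2| \ge \tfrac{1}{6} \varepsilon n_D E d_h \}$. To prove the Theorem, it is enough to show that
$$n^{\beta} \sup_{h \in H_n} U_l(h) \to 0, \quad \text{as } n \to \infty, \quad \text{for } l = 1, 2, 3. \tag{2.2}$$

As to $l = 1$: We know from the decomposition of the index set $\{x_i, x_i \in D\}$ that
$$\begin{aligned}
U_1(h) &= P\left\{ \left| \sum_{j=0}^{2h} \sum_{x_i \in D_j} \left[ K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) \right] \right| \ge \tfrac{1}{3} \varepsilon n_D E d_h \right\} \\
&\le P\left\{ \sum_{j=0}^{2h} \left| \sum_{x_i \in D_j} \left[ K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) \right] \right| \ge \tfrac{1}{3} \varepsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} \left[ K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) \right] \right| \ge \frac{1}{3(2h+1)} \varepsilon n_D E d_h \right\}.
\end{aligned} \tag{2.3}$$
Tedious calculations show that
$$E \varepsilon_h(x_i) = 0, \qquad E \varepsilon^2_h(x_i) = 1/\{4 f^2(0) h\}. \tag{2.4}$$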
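Therefore, the definitions of $K_{n,h}(\cdot)$ and $\varepsilon_h(\cdot)$ and (2.4) yield
$$\begin{aligned}
K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) &= [J_{n,h}(x_i) + \varepsilon_h(x_i)]^2 - E[J_{n,h}(x_i) + \varepsilon_h(x_i)]^2 \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \varepsilon^2_h(x_i) - 2 J_{n,h}(x_i) E\varepsilon_h(x_i) - E\varepsilon^2_h(x_i) \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \varepsilon^2_h(x_i) - 1/\{4 f^2(0) h\},
\end{aligned} \tag{2.5}$$
and
$$\begin{aligned}
\varepsilon^2_h(x_i) &= \left[ \frac{1}{h f(0)} \sum_{j=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \right]^2 \\
&= \frac{1}{h^2 f^2(0)} \left[ \sum_{j=1}^{h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x_i) \le 0\} \right)^2 + \sum_{j \ne k} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac{1}{2} - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \right] \\
&= \frac{1}{h^2 f^2(0)} \left[ \frac{h}{4} + \sum_{1 \le j \ne k \le h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac{1}{2} - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \right].
\end{aligned} \tag{2.6}$$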
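It follows from (2.5) and (2.6) that
$$\begin{aligned}
K^2_{n,h}(x_i) - E K^2_{n,h}(x_i) &= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} \left( \frac{1}{2} - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \left( \frac{1}{2} - I\{\varepsilon_{nk}(x_i) \le 0\} \right) \\
&= 2 J_{n,h}(x_i) \varepsilon_h(x_i) + \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} w_{n,j}(x_i) w_{n,k}(x_i) \\
&=: 2 U_{11} + U_{12},
\end{aligned} \tag{2.7}$$
where $U_{11} = J_{n,h}(x_i) \varepsilon_h(x_i)$, $U_{12} = \frac{1}{f^2(0) h^2} \sum_{1 \le j \ne k \le h} w_{n,j}(x_i) w_{n,k}(x_i)$, and $w_{n,j}(x_i) = 1/2 - I\{\varepsilon_{nj}(x_i) \le 0\}$, $x_i \in D$, $1 \le j \le h$. Note that for every fixed $i$, $w_{n,1}(x_i), \ldots, w_{n,h}(x_i)$ are iid random variables.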
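The expansion of $\varepsilon_h^2$ into a diagonal part $h/4$ and an off-diagonal sum can be verified exactly for small $h$ by enumerating all sign patterns. The sketch assumes only that each $w_j = 1/2 - I\{\varepsilon_j \le 0\}$ takes the values $\pm 1/2$ with probability $1/2$, which follows from $F(0) = 1/2$ and continuity of $F$; the value $f(0) = 1/2$ used by default and the function name are hypothetical.

```python
from itertools import product
from fractions import Fraction

def check_identity(h, f0=Fraction(1, 2)):
    """Enumerate all 2^h outcomes of (w_1, ..., w_h), verify the expansion (2.6)
    outcome by outcome, and return the exact values (E eps_h, E eps_h^2)."""
    half = Fraction(1, 2)
    mean = second = Fraction(0)
    for w in product((half, -half), repeat=h):
        eps_h = sum(w) / (h * f0)
        cross = sum(w[j] * w[k] for j in range(h) for k in range(h) if j != k)
        # eps_h^2 = [h/4 + sum_{j != k} w_j w_k] / (h f(0))^2
        assert eps_h ** 2 == (Fraction(h, 4) + cross) / (h * f0) ** 2
        mean += eps_h / 2 ** h          # every outcome has probability 2^(-h)
        second += eps_h ** 2 / 2 ** h
    return mean, second

for h in (1, 2, 5, 8):
    mean, second = check_identity(h)
    # The last column is 1 / (4 f(0)^2 h), the centring constant in (2.5).
    print(h, mean, second, 1 / (4 * Fraction(1, 2) ** 2 * h))
```

Exact rational arithmetic is used so that the identity is tested without rounding error.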
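By (2.3) and (2.7),
$$\begin{aligned}
U_1(h) &\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} [2 U_{11} + U_{12}] \right| \ge \frac{1}{3(2h+1)} \varepsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} 2 U_{11} \right| + \left| \sum_{x_i \in D_j} U_{12} \right| \ge \frac{1}{3(2h+1)} \varepsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} U_{11} \right| \ge \frac{1}{12(2h+1)} \varepsilon n_D E d_h \right\} + \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} U_{12} \right| \ge \frac{1}{6(2h+1)} \varepsilon n_D E d_h \right\} \\
&=: I_1 + I_2.
\end{aligned} \tag{2.8}$$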
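Item $I_1$: Using Chebyshev inequality and the Dharmadhikari-Jogdeo (DJ) inequality [9], for $\forall k = 1, 2, \cdots$,
$$I_1 \le \sum_{j=0}^{2h} \left( \frac{\varepsilon n_D E d_h}{12(2h+1)} \right)^{-2k} E \left| \sum_{x_i \in D_j} U_{11} \right|^{2k} \le \sum_{j=0}^{2h} \left( \frac{\varepsilon n_D E d_h}{12(2h+1)} \right)^{-2k} c_{2k} \left( \frac{n_D}{2h+1} \right)^{k-1} \sum_{x_i \in D_j} E |U_{11}|^{2k}. \tag{2.9}$$
Recalling the definition of $\varepsilon_h(x_i)$ and using the DJ inequality again,
$$\begin{aligned}
E |\varepsilon_h(x_i)|^{2k} &= \frac{1}{(h f(0))^{2k}} E \left| \sum_{j=1}^{h} \left( 1/2 - I\{\varepsilon_{nj}(x_i) \le 0\} \right) \right|^{2k} \\
&\le \frac{1}{(h f(0))^{2k}} c_{2k} h^{k-1} \sum_{j=1}^{h} E \left| 1/2 - I\{\varepsilon_{nj}(x_i) \le 0\} \right|^{2k} \\
&= \frac{1}{f^{2k}(0)} \frac{1}{h^{k}} c_{2k} E \left| 1/2 - I\{\varepsilon_1 \le 0\} \right|^{2k} = \frac{c_{2k}}{h^{k}},
\end{aligned} \tag{2.10}$$
where $c_{2k}$ is a constant depending only on $k$, which may take a different value at each appearance. Hence (2.9), (2.10) and the fact that $E d_h > 0$ yield
$$\begin{aligned}
I_1 &\le \sum_{j=0}^{2h} \left( \tfrac{1}{12} \varepsilon E d_h \right)^{-2k} \frac{(2h+1)^{k+1}}{n_D^{k+1}} c_{2k} \frac{c_{2k}}{h^{k}} \sum_{x_i \in D_j} |J_{n,h}(x_i)|^{2k} \\
&= c_{2k} \varepsilon^{-2k} (E d_h)^{-2k} \left( \frac{2h+1}{n_D} \right)^{k+1} \frac{1}{h^{k}} \sum_{x_i \in D} \left| M(x_i) g''(x_i) (h/n)^2 \right|^{2k} \\
&\le c_{2k} \varepsilon^{-2k} M^{2k} \|g''\|^{2k} (E d_h)^{-2k} \left( \frac{h}{n} \right)^{k+1} \frac{1}{h^{k}} \cdot n \left( \frac{h}{n} \right)^{4k} \\
&= c_{2k} \varepsilon^{-2k} (E d_h)^{-2k} \frac{h^{4k+1}}{n^{5k}}.
\end{aligned} \tag{2.11}$$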
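Item $I_2$: By Theorem 2 in [10], for $k = 1, 2, \ldots$, tedious calculations show that
$$I_2 \le c_{2k} \varepsilon^{-2k} (E d_h)^{-2k} \frac{h^{1-k}}{n^{k}}. \tag{2.12}$$
By (2.8), (2.11) and (2.12),
$$U_1(h) \le c (E d_h)^{-2k} \left( h^{1+4k} n^{-5k} + h^{1-k} n^{-k} \right), \tag{2.13}$$
where $c$ is a positive constant independent of $h$.

Therefore, when $k > \max\left\{ \frac{2\beta}{17/5 - 4\beta},\ \frac{\alpha + \beta}{\alpha - 3/5} \right\}$, (2.1) and (2.13) yield
$$n^{\beta} \sup_{h \in H_n} U_1(h) \le c n^{\beta} (n^{-4/5})^{-2k} \sup_{h \in H_n} \left( h^{1+4k} n^{-5k} + h^{1-k} n^{-k} \right) \to 0. \tag{2.14}$$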
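The requirement on $k$ in (2.14) can be read off from the exponents of $n$. The sketch below assumes the bound has the form $c\,n^{\beta} (n^{-4/5})^{-2k} (h^{1+4k} n^{-5k} + h^{1-k} n^{-k})$, with $h \le b n^{\beta}$ used in the first term and $h \ge a n^{\alpha}$ in the second (for $k > 1$); the function names and the pair $(\alpha, \beta) = (3/4, 4/5)$ are hypothetical choices satisfying the constraints of (C.2).

```python
from fractions import Fraction as F

def u1_exponents(alpha, beta, k):
    """Exponents of n in the two terms of the bound on n^beta * sup_h U_1(h)."""
    # First term: n^beta * n^(8k/5) * (n^beta)^(1+4k) * n^(-5k)
    first = beta + F(8, 5) * k + beta * (1 + 4 * k) - 5 * k
    # Second term: n^beta * n^(8k/5) * (n^alpha)^(1-k) * n^(-k)
    second = beta + F(8, 5) * k + alpha * (1 - k) - k
    return first, second

def u1_threshold(alpha, beta):
    """Both exponents are negative exactly when k exceeds this value."""
    return max(2 * beta / (F(17, 5) - 4 * beta),
               (alpha + beta) / (alpha - F(3, 5)))

alpha, beta = F(3, 4), F(4, 5)   # hypothetical admissible pair
print(u1_threshold(alpha, beta))
print(u1_exponents(alpha, beta, 10), u1_exponents(alpha, beta, 11))
```

Exact fractions avoid rounding issues near the threshold, where one exponent changes sign.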
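As to $l = 2$: As done in the proof of the case $l = 1$, the condition (C.2) yields
$$\begin{aligned}
U_2(h) &= P\left\{ \left| \sum_{j=0}^{2h} \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right| \ge \tfrac{1}{6} \varepsilon n_D E d_h \right\} \\
&\le \sum_{j=0}^{2h} P\left\{ \left| \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right| \ge \frac{1}{6(2h+1)} \varepsilon n_D E d_h \right\} \\
&\le (2h+1) \max_{0 \le j \le 2h} \left( \frac{\varepsilon n_D E d_h}{6(2h+1)} \right)^{-2k} E \left| \sum_{x_i \in D_j} \left[ K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right] \right|^{2k} \\
&\le c h \left( \frac{n}{h} \right)^{-2k} (E d_h)^{-2k} \left( \frac{n}{2h+1} \right)^{k-1} \max_{0 \le j \le 2h} \sum_{x_i \in D_j} E \left| K_{n,h}(x_i) R_{n,h}(x_i) - E K_{n,h}(x_i) R_{n,h}(x_i) \right|^{2k} \\
&\le c h \left( \frac{n}{h} \right)^{-2k} (E d_h)^{-2k} \left( \frac{n}{h} \right)^{k} \max_{x_i \in D} E \left| K_{n,h}(x_i) R_{n,h}(x_i) \right|^{2k} \\
&\le c h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \max_{x \in D} \left[ E K^{4k}_{n,h}(x) \right]^{1/2} \left[ E R^{4k}_{n,h}(x) \right]^{1/2} \\
&\le c h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \left[ \left( \frac{h}{n} \right)^{8k} + \frac{1}{h^{2k}} \right].
\end{aligned}$$
Therefore, when $k > \max\left\{ \frac{2\beta}{37/5 - 9\beta},\ \frac{\beta}{2\alpha - \beta - 3/5} \right\}$, (2.1) yields
$$\begin{aligned}
n^{\beta} \sup_{h \in H_n} U_2(h) &\le c n^{\beta} \sup_{h \in H_n} h \left( \frac{h}{n} \right)^{k} (E d_h)^{-2k} \left[ \left( \frac{h}{n} \right)^{8k} + \frac{1}{h^{2k}} \right] \\
&\le c n^{\beta} \cdot n^{\beta} \cdot n^{-(1-\beta)k} n^{8k/5} \left[ n^{-8k(1-\beta)} + n^{-2k\alpha} \right] \\
&= c n^{2\beta - 9(1-\beta)k + 8k/5} + c n^{2\beta - (1-\beta)k + 8k/5 - 2k\alpha} \to 0.
\end{aligned} \tag{2.15}$$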
As to $l = 3$: This is similar to the proof of the case $l = 2$; it is easy to see that $U_3(h) \le c h (h/n)^{k} (E d_h)^{-2k} \left[ (h/n)^{8k} + 1/h^{2k} \right]$. Therefore, when $k > \max\{2\beta/(37/5 - 9\beta),\ \beta/(2\alpha - \beta - 3/5)\}$,
$$n^{\beta} \sup_{h \in H_n} U_3(h) \to 0. \tag{2.16}$$
Combining (2.14), (2.15) and (2.16), we know that (2.2) holds provided $k > \max\{(\alpha + \beta)/(\alpha - 3/5),\ 2\beta/(37/5 - 9\beta),\ \beta/(2\alpha - \beta - 3/5)\}$ and $n \to \infty$. The proof is complete.
References
1 Devroye L. The kernel estimate is relatively stable. Probab Th Rel Fields, 1988, 77: 521-536
2 Marron J S, Härdle W. Random approximation to some measures of accuracy in nonparametric curve estimation. J Multivariate Anal, 1986, 20: 91-113
3 Vieu P. Quadratic errors for nonparametric estimates under dependence. J Multivariate Anal, 1991, 39: 324-347
4 Hall P, Marron J S. Extent to which least squares cross validation minimizes integrated square error in nonparametric density estimation. Probab Th Rel Fields, 1987, 74: 567-581
5 Härdle W, Hall P, Marron J S. How far are automatically chosen regression smoothing parameters from their optimum? J Amer Statist Assoc, 1988, 83: 86-95
6 Härdle W, Vieu P. Kernel smoothing of time series. J Time Ser Anal, 1988, 9: 86-95
7 Hart W, Vieu P. Data-driven bandwidth choice for density estimation based on dependent data. Ann Statist, 1990, 18: 873-890
8 Yang Y. Nearest neighbor median estimate and the selection of smooth parameters [dissertation]. Beijing: Peking University, 1996. 1-129
9 Dharmadhikari S W, Jogdeo K. Bounds on moments of certain random variables. Ann Math Statist, 1969, 40: 1506-1508
10 Whittle P. Bounds for the moments of linear and quadratic forms in independent random variables. Theor Probab Appl, 1969, 5: 302-305