Moderate deviations and law of the iterated logarithm in L1(Rd) for kernel density estimators


Stochastic Processes and their Applications 118 (2008) 452–473 www.elsevier.com/locate/spa

Fuqing Gao

School of Mathematics and Statistics, Wuhan University, Wuhan 430072, PR China

Received 27 January 2005; received in revised form 25 April 2007; accepted 26 April 2007. Available online 29 April 2007

Abstract

Let $f_n(x)$ be the non-parametric kernel density estimator of a density function $f(x)$ based on a kernel function $K$. In this paper, we first prove two moderate deviation theorems in $L^1(\mathbb{R}^d)$ for $\{f_n(x), n \ge 1\}$. Then, as an application of the moderate deviations, we obtain a law of the iterated logarithm for $\{\|f_n - E f_n\|_1, n \ge 1\}$.

© 2007 Elsevier B.V. All rights reserved.

MSC: primary 60F10; 62G07; secondary 62F12

Keywords: Kernel density estimator; Moderate deviations; Law of the iterated logarithm

doi:10.1016/j.spa.2007.04.010

1. Introduction and main results

Let $\{X_i; i \ge 1\}$ be a sequence of independent and identically distributed (i.i.d.) random variables taking values in $\mathbb{R}^d$, defined on a probability space $(\Omega, \mathcal{F}, P)$ with unknown density function $f(x)$. Let $K$ be a measurable function such that

\[ K \ge 0, \qquad \int_{\mathbb{R}^d} K(x)\,dx = 1. \tag{1.1} \]

The kernel density estimator of $f$, based on the kernel function $K$, is defined by

\[ f_n(x) = \frac{1}{n a_n^d} \sum_{i=1}^n K\left(\frac{x - X_i}{a_n}\right), \qquad x \in \mathbb{R}^d, \tag{1.2} \]

where $\{a_n, n \ge 1\}$ is a bandsequence (width of windows), that is, a sequence of positive numbers tending to 0 and satisfying

\[ n a_n^d \to +\infty \quad \text{as } n \to \infty. \tag{1.3} \]

As usual, we denote $\|g\|_p = \left(\int_{\mathbb{R}^d} |g(x)|^p\,dx\right)^{1/p}$, $p \ge 1$. In [4] (see also [5]), Devroye proved that all types of $L^1$-consistency are equivalent to (1.3). The asymptotic normality of $\{\|f_n - E f_n\|_1, n \ge 1\}$ was studied by Csörgő and Horváth [1] and Horváth [7]. More recently, Giné, Mason and Zaitsev [13] considered the asymptotic normality of the $L^1$-norm density estimator process, and Louani [17] and Lei, Wu and Xie [15] (see Lei and Wu [16] for the density estimator in a Markov process) studied the large deviations in $L^1(\mathbb{R}^d)$ for $\{f_n(x), n \ge 1\}$. For the uniform consistency, uniform large deviations and uniform moderate deviations for $\{f_n(x), n \ge 1\}$, we refer to Einmahl and Mason [6], Giné and Guillou [10], Giné, Koltchinskii and Zinn [11], Louani [18], Gao [8] and the references therein. Giné and Mason [12] considered the law of the iterated logarithm for $\{\|f_n - E f_n\|_2^2 - E\|f_n - E f_n\|_2^2, n \ge 1\}$ by the KMT approximation, and indicated that their methods do not extend to the cases $\|\cdot\|_p$, $p \ne 2$.

The purpose of this paper is to study the moderate deviations and the law of the iterated logarithm in $L^1(\mathbb{R}^d)$ for $\{f_n, n \ge 1\}$. We find the best condition on the bandsequence such that $\{\|f_n - E f_n\|_1, n \ge 1\}$ satisfies the moderate deviation principle. A law of the iterated logarithm for $\{\|f_n - E f_n\|_1, n \ge 1\}$ is also obtained.

Let $b_n$, $n \ge 1$, be a sequence of positive real numbers satisfying

\[ \frac{n}{b_n} \to +\infty \quad \text{and} \quad \frac{n}{b_n^2} \to 0 \quad \text{as } n \to +\infty. \]
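To make the estimator (1.2) concrete, the following is a minimal sketch in plain Python. The box kernel, the Uniform(0,1) sample, the sample size and the choice $a_n = n^{-1/3}$ are all assumptions made for illustration; none of them comes from the paper.

```python
import random

def box_kernel(u):
    """Indicator of [-1/2, 1/2)^d: nonnegative and integrates to 1, as in (1.1)."""
    return 1.0 if all(-0.5 <= c < 0.5 for c in u) else 0.0

def kde(x, sample, a_n, kernel=box_kernel):
    """f_n(x) = (n a_n^d)^{-1} sum_i K((x - X_i) / a_n), with x a point of R^d."""
    n, d = len(sample), len(x)
    total = sum(kernel([(xj - sj) / a_n for xj, sj in zip(x, s)]) for s in sample)
    return total / (n * a_n ** d)

# Hypothetical data: n = 2000 draws from Uniform(0, 1), so d = 1 and f = 1 on (0, 1).
rng = random.Random(0)
sample = [(rng.random(),) for _ in range(2000)]
a_n = 2000 ** (-1.0 / 3.0)   # a_n -> 0 while n * a_n^d -> infinity, as in (1.3)

print(kde((0.5,), sample, a_n))   # an estimate of f(0.5) = 1
print(kde((5.0,), sample, a_n))   # far outside the support
```

The estimate at an interior point fluctuates around the true density, with a spread governed by $n a_n^d$, which is why (1.3) is needed for consistency.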

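We introduce the following condition:

(H1)
\[ \int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right) K^2(x)\,dx < \infty, \qquad \int_{\mathbb{R}^d} |x|^{pd} f(x)\,dx < \infty, \qquad \text{for some } p > 1. \tag{1.4} \]

Remark 1.1. If (H1) holds, then

\[ \int_{\mathbb{R}^d} \sqrt{f(x)}\,dx \le \left(\int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right)^{-1} dx\right)^{1/2} \left(\int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right) f(x)\,dx\right)^{1/2} < \infty, \]

and, substituting $u = z$ and $v = x - a_n z$ in the last step,

\[
\begin{aligned}
\int_{\mathbb{R}^d} \sqrt{\int_{\mathbb{R}^d} \frac{1}{a_n^d} K^2\left(\frac{x-y}{a_n}\right) f(y)\,dy}\;dx
&= \int_{\mathbb{R}^d} \sqrt{\int_{\mathbb{R}^d} K^2(z) f(x - a_n z)\,dz}\;dx \\
&\le \left(\int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right)^{-1} dx\right)^{1/2} \left(\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right) K^2(z) f(x - a_n z)\,dz\,dx\right)^{1/2} \\
&\le \left(\int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right)^{-1} dx\right)^{1/2} \left(\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \left(1 + 2^{pd-1}\left(|v|^{pd} + |a_n u|^{pd}\right)\right) K^2(u) f(v)\,du\,dv\right)^{1/2},
\end{aligned}
\]

and so

\[ \sup_{n \ge 1} \int_{\mathbb{R}^d} \sqrt{\int_{\mathbb{R}^d} \frac{1}{a_n^d} K^2\left(\frac{x-y}{a_n}\right) f(y)\,dy}\;dx < \infty. \]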

In fact, the last condition will be sufficient for the MDP in this paper. Define

\[
I(g) =
\begin{cases}
\dfrac{1}{2} \displaystyle\int_{\mathbb{R}^d} \left(\frac{g(x)}{f(x)}\right)^2 f(x)\,dx & \text{if } g \in L^1(\mathbb{R}^d) \text{ and } \displaystyle\int_{\mathbb{R}^d} g(x)\,dx = 0, \\[2ex]
+\infty & \text{otherwise},
\end{cases}
\tag{1.5}
\]

where $\frac{0}{0} = 0$.
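As a numerical illustration of the definition (1.5), the sketch below evaluates $I(g)$ and $\|g\|_1$ by a midpoint rule in dimension one. The standard normal density, the integration window $[-10, 10]$ and the test function $g = \lambda(I_A - I_B) f$ with $A = (-\infty, 0]$, $B = (0, \infty)$ are assumptions chosen for this example only. For this $g$ one has $\|g\|_1 = \lambda$ and $I(g) = \lambda^2/2$ by direct computation.

```python
import math

def normal_pdf(x):
    """Standard normal density, used here as a stand-in for f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(fn, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint rule on [lo, hi]; the window is an assumption of this sketch."""
    h = (hi - lo) / steps
    return h * sum(fn(lo + (k + 0.5) * h) for k in range(steps))

def rate(g, f=normal_pdf, tol=1e-9):
    """I(g) from (1.5): +infinity unless g integrates to 0, with 0/0 read as 0."""
    if abs(integrate(g)) > tol:
        return math.inf
    def integrand(x):
        fx = f(x)
        return 0.0 if fx == 0.0 else g(x) ** 2 / fx
    return 0.5 * integrate(integrand)

lam = 1.5
g = lambda x: lam * (1.0 if x <= 0 else -1.0) * normal_pdf(x)

l1_norm = integrate(lambda x: abs(g(x)))
print(l1_norm, rate(g), lam ** 2 / 2.0)
```

A function that does not integrate to zero, such as `normal_pdf` itself, is assigned the value `math.inf` by `rate`, matching the second branch of (1.5).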

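Theorem 1.1. Let (H1) hold. If the width of windows $\{a_n, n \ge 1\}$ satisfies

(BC)
\[ \frac{n}{b_n^2 a_n^d} \to 0 \quad \text{as } n \to +\infty, \]

then

(1) for any open subset $G$ in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$,

\[ \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \ge -\inf_{g \in G} I(g), \tag{1.6} \]

(2) for any open and convex subset $G$ in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$,

\[ \lim_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) = -\inf_{g \in G} I(g), \tag{1.7} \]

(3) for any compact subset $C$ in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$ and for any $\delta > 0$, there exists an open subset $G_\delta \supset C$ such that

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G_\delta\right) \le -\inf_{g \in C} I(g) + \delta, \tag{1.8} \]

in particular,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in C\right) \le -\inf_{g \in C} I(g). \tag{1.9} \]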
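The role of condition (BC) next to the two scaling requirements on $b_n$ can be checked for a concrete pair of sequences. The sketch below is only an illustration: the choices $b_n = n^{3/4}$, $a_n = n^{-1/4}$ and $d = 1$ are hypothetical and are not taken from the paper.

```python
def bc_ratio(n, a_n, b_n, d=1):
    """n / (b_n^2 a_n^d), the quantity that (BC) requires to tend to 0."""
    return n / (b_n ** 2 * a_n ** d)

def scaling_ratios(n, b_n):
    """(n / b_n, n / b_n^2): the first should grow, the second should vanish."""
    return n / b_n, n / b_n ** 2

# Hypothetical sequences: b_n = n^{3/4}, a_n = n^{-1/4}, d = 1.
for n in (10 ** 2, 10 ** 4, 10 ** 6):
    a_n, b_n = n ** -0.25, n ** 0.75
    print(n, scaling_ratios(n, b_n), bc_ratio(n, a_n, b_n))
```

With these choices the (BC) ratio equals $n^{-1/4}$, so it decreases to 0, while a faster-shrinking window such as $a_n = n^{-1/2}$ would leave the ratio constant and (BC) would fail.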

Remark 1.2. (1) By Fatou's lemma, $I(\cdot)$ is lower-semicontinuous in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$. Since $I(\cdot)$ is not a good rate function in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$, i.e., $\{g; I(g) \le l\}$ is not compact, the above theorem does not give a full moderate deviation principle.

(2) By the Dawson–Gärtner theorem (cf. [2] Theorem 3.4 or [3]) and the proof of Theorem 1.1(2), the full moderate deviation principle holds in the weak topology $(L^1, \sigma(L^1, L^\infty))$, i.e., (1.6) holds for each open subset $G$ in $(L^1, \sigma(L^1, L^\infty))$ and (1.9) holds for each closed subset $C$ in $(L^1, \sigma(L^1, L^\infty))$.

Theorem 1.2. (1) Let (H1) and (BC) hold. Then for any open subset $G \subset [0, +\infty)$,

\[ \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 \in G\right) \ge -\inf_{\lambda \in G} \frac{\lambda^2}{2}, \tag{1.10} \]

and for any closed subset $F \subset [0, +\infty)$,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 \in F\right) \le -\inf_{\lambda \in F} \frac{\lambda^2}{2}. \tag{1.11} \]

In particular, for any $\lambda > 0$,

\[ \lim_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 > \lambda\right) = -\frac{\lambda^2}{2}. \tag{1.12} \]

(2) Let $K$ be a bounded function with compact support, and let $f$ also have compact support. Then (1.12) holds if and only if (BC) is valid.

Remark 1.3. Theorem 1.2(2) shows that the condition (BC) is necessary for the MDP.

In order to get the law of the iterated logarithm, we need the following hypothesis on the kernel function $K$, taken from [10]:

(H2) $K$ is a bounded, square integrable function in the linear span (the set of finite linear combinations) of functions $k \ge 0$ satisfying the following property: the subgraph of $k$, $\{(s, u); k(s) \ge u\}$, can be represented as a finite number of Boolean operations among sets of the form $\{(s, u); p(s, u) \ge \psi(u)\}$, where $p$ is a polynomial on $\mathbb{R}^d \times \mathbb{R}$, and $\psi$ is an arbitrary real function.

The hypothesis is imposed because the class of functions

\[ \mathcal{F} = \left\{ K\left(\frac{x - \cdot}{a}\right);\ x \in \mathbb{R}^d,\ a \in \mathbb{R} \setminus \{0\} \right\} \]

is a bounded, measurable VC class of functions under the assumed condition. The condition (H2) is quite general: for example, it is satisfied if $K(x) = \phi(p(x))$, where $p$ is a polynomial and $\phi$ a bounded real function of bounded variation, or if the graph of $K$ is a pyramid, or if $K = I_{[-1,1]}$, etc.

Theorem 1.3. Let $K$ be a bounded function of bounded variation. Assume that (H2) holds, and that there exist $p > 1$ and $\alpha > (2p+1)d$ such that

(H3)
\[ \limsup_{|x|\to\infty} |x|^{\alpha} |K(x)| < \infty, \qquad \int_{\mathbb{R}^d} |x|^{2pd} f(x)\,dx < \infty. \]

If the width of windows $\{a_n, n \ge 1\}$ satisfies

\[ a_n \searrow 0, \qquad n a_n^d \nearrow +\infty, \qquad \limsup_{n\to\infty} \frac{\log a_n^{-d}}{\log\log n} < \infty, \tag{1.13} \]

then

\[ \limsup_{n\to\infty} \sqrt{\frac{n}{2\log\log n}}\, \|f_n - E f_n\|_1 = 1 \quad \text{a.s.} \tag{1.14} \]
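The normalised quantity in (1.14) can be computed for simulated data. The sketch below is a rough illustration under assumptions that are not in the paper: $d = 1$, Uniform(0,1) data, the box kernel $K = I_{[-1/2, 1/2)}$ (for which $E f_n$ has a closed form), $a_n = n^{-1/3}$, and a Riemann sum in place of the exact $L^1$ norm. Since (1.14) is an asymptotic statement, the value at a fixed moderate $n$ need not be close to 1.

```python
import bisect
import math
import random

def l1_deviation(sample, a_n, grid=4000):
    """Riemann-sum approximation of ||f_n - E f_n||_1 for Uniform(0,1) data
    and the box kernel K = 1_{[-1/2, 1/2)} in dimension d = 1."""
    xs = sorted(sample)
    n = len(xs)
    lo, hi = -a_n, 1.0 + a_n          # f_n and E f_n both vanish outside this range
    h = (hi - lo) / grid
    total = 0.0
    for k in range(grid):
        x = lo + (k + 0.5) * h
        left, right = x - a_n / 2.0, x + a_n / 2.0
        # K((x - X)/a_n) = 1 exactly when left < X <= right.
        count = bisect.bisect_right(xs, right) - bisect.bisect_right(xs, left)
        f_n = count / (n * a_n)
        # E f_n(x): Uniform(0,1) mass of the same window, divided by a_n.
        mean = max(0.0, min(right, 1.0) - max(left, 0.0)) / a_n
        total += abs(f_n - mean) * h
    return total

def lil_statistic(sample, a_n):
    """sqrt(n / (2 log log n)) * ||f_n - E f_n||_1, the quantity in (1.14)."""
    n = len(sample)
    return math.sqrt(n / (2.0 * math.log(math.log(n)))) * l1_deviation(sample, a_n)

rng = random.Random(1)
data = [rng.random() for _ in range(2000)]
stat = lil_statistic(data, 2000 ** (-1.0 / 3.0))
print(stat)
```

Repeating the computation along a growing sequence of sample sizes, with the running maximum recorded, is one way to visualise the lim sup, although convergence in iterated-logarithm results is typically very slow.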

Remark 1.4. Let $K$ be a bounded function with compact support. Let (1.14) hold for any density function $f$ with compact support; then by Lemma A.1, it is easy to get

\[ \sqrt{\frac{n}{2\log\log n}} \left( \|f_n - E f_n\|_1 - E\|f_n - E f_n\|_1 \right) \xrightarrow{P} 0. \]

Therefore, if (1.14) holds, then

\[ \limsup_{n\to\infty} \sqrt{\frac{n}{2\log\log n}}\, E\|f_n - E f_n\|_1 \le 1. \]

By (3.8), there exists a constant $C > 0$ such that for all $n \ge 1$,

\[ \left| \sqrt{\pi a_n^d \log\log n}\, \sqrt{\frac{n}{2\log\log n}}\, E\|f_n - E f_n\|_1 - \int \sqrt{a_n^d E\left(|\xi_{1,n}(x)|^2\right)}\,dx \right| \le \frac{C\sqrt{\pi a_n^d \log\log n}}{a_n^d \sqrt{2n\log\log n}}, \]

where $\xi_{1,n}(x) = \frac{1}{a_n^d}\left( K\left(\frac{x - X_1}{a_n}\right) - E\left(K\left(\frac{x - X_1}{a_n}\right)\right) \right)$. Therefore, $\lim_{n\to\infty} a_n^d \sqrt{n \log\log n} = +\infty$ (if $\liminf_{n\to\infty} a_n^d \sqrt{n\log\log n} < +\infty$, then $\liminf_{n\to\infty} a_n^d \log\log n = 0$, and so $\int \sqrt{f(x)}\,dx = 0$), and

\[ \limsup_{n\to\infty} \sqrt{\frac{1}{\pi a_n^d \log\log n}} \int \sqrt{f(x)}\,dx \le 1. \]

Now, we take a sequence of density functions $f^{(m)}$, $m \ge 1$, with compact support such that $\int \sqrt{f^{(m)}(x)}\,dx \to \infty$ to get

\[ \lim_{n\to\infty} a_n^d \log\log n = +\infty. \tag{1.15} \]

Therefore, a minimal condition on the bandsequence for the LIL in Theorem 1.3 is (1.15).

2. Proof of Theorem 1.1

The lower bound is shown by a measure transformation. The upper bound for an open convex subset follows from the Hahn–Banach theorem and the Chebyshev inequality.

Proof of Theorem 1.1(1). Let $G$ be an open subset in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$. For any $g \in G$, choose $\delta > 0$ such that $B(g, \delta) := \{\varphi \in L^1(\mathbb{R}^d); \|\varphi - g\|_1 \le \delta\} \subset G$. Then

\[ \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \ge \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\left\|\frac{n}{b_n}(f_n - E f_n) - g\right\|_1 < \delta\right). \]

Therefore, the following lemma implies Theorem 1.1(1). □

Lemma 2.1. Assume that (H1) and (BC) hold. Then for any $g \in L^1(\mathbb{R}^d)$, and for any $\delta > 0$,

\[ \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\left\|\frac{n}{b_n}(f_n - E f_n) - g\right\|_1 < \delta\right) \ge -I(g). \tag{2.1} \]

Proof. Without loss of generality, we assume that $(\Omega, \mathcal{F}, P) = (\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d), \mu)^{\mathbb{N}}$ where $\mu(dx) = f(x)\,dx$ and $\mathbb{N} = \{1, 2, \ldots\}$. Let $X_i(\omega) = \omega_i$, $i = 1, 2, \ldots$, be the coordinate variables on $\Omega$. If $I(g) = \infty$, then (2.1) is trivial. Hence, we only need to prove (2.1) for $g$ with $I(g) < \infty$. Furthermore, if $I(g) < \infty$, set

\[ \tilde g_N(x) = g(x)\, I_{\{\frac{1}{N} f(x) \le |g(x)| \le N f(x)\}} \quad \text{and} \quad g_N(x) = \tilde g_N(x) - f(x) \int \tilde g_N(y)\,dy; \]

then $\int g_N(x)\,dx = 0$, $\|g_N - g\|_1 \to 0$ and $I(g_N) \to I(g)$ as $N \to \infty$. Hence, we may assume that $g(x)/f(x)$ is a bounded function. Then for $n$ large enough,

\[ \nu_n(dx) = \left( f(x) + \frac{b_n}{n} g(x) \right) dx \]

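is a probability measure on $\mathbb{R}^d$, which is equivalent to $\mu$. Set $Q_n(dx_1, \ldots, dx_n) = \nu_n(dx_1) \cdots \nu_n(dx_n)$. Then for $n$ large enough, for any $\varepsilon > 0$, writing $D_{n,\delta} = \left\{ \left\|\frac{n}{b_n}(f_n - E f_n) - g\right\|_1 < \delta \right\}$,

\[
\begin{aligned}
P(D_{n,\delta})
&= \int_{D_{n,\delta}} \prod_{i=1}^n \left(1 + \frac{b_n}{n}\frac{g(x_i)}{f(x_i)}\right)^{-1} Q_n(dx_1, \ldots, dx_n) \\
&= \int_{D_{n,\delta}} \exp\left\{ -\sum_{i=1}^n \log\left(1 + \frac{b_n}{n}\frac{g(x_i)}{f(x_i)}\right) \right\} Q_n(dx_1, \ldots, dx_n) \\
&\ge \exp\left\{ -n\left( E^{\nu_n}\left(\log\left(1 + \frac{b_n}{n}\frac{g(X_1)}{f(X_1)}\right)\right) + \frac{b_n^2 \varepsilon}{n^2} \right) \right\} \times Q_n\left(A_{n,\varepsilon} \cap D_{n,\delta}\right),
\end{aligned}
\]

where

\[ A_{n,\varepsilon} := \left\{ \frac{n}{b_n^2} \sum_{i=1}^n \log\left(1 + \frac{b_n}{n}\frac{g(X_i)}{f(X_i)}\right) \le \frac{n^2}{b_n^2} E^{\nu_n}\left(\log\left(1 + \frac{b_n}{n}\frac{g(X_1)}{f(X_1)}\right)\right) + \varepsilon \right\}. \]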

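Since

\[ \frac{n}{b_n^2} \sum_{i=1}^n \log\left(1 + \frac{b_n}{n}\frac{g(X_i)}{f(X_i)}\right) = \frac{1}{b_n} \sum_{i=1}^n \frac{g(X_i)}{f(X_i)} - \frac{1}{2n} \sum_{i=1}^n \frac{g^2(X_i)}{f^2(X_i)} + O\left(\frac{b_n}{n}\right), \]

\[ \frac{n}{b_n} E^{\nu_n}\left(\frac{g(X_1)}{f(X_1)}\right) = 2I(g) + O\left(\frac{b_n}{n}\right), \qquad \frac{1}{2} E^{\nu_n}\left(\frac{g^2(X_1)}{f^2(X_1)}\right) = I(g) + O\left(\frac{b_n}{n}\right), \]

and

\[ \frac{n^2}{b_n^2} E^{\nu_n}\left(\log\left(1 + \frac{b_n}{n}\frac{g(X_1)}{f(X_1)}\right)\right) = I(g) + O\left(\frac{b_n}{n}\right), \]

by the Chebyshev inequality, for each $\eta > 0$ and for $n$ large enough, we have that

\[
\begin{aligned}
Q_n\left( \left| \frac{1}{2n} \sum_{i=1}^n \frac{g^2(X_i)}{f^2(X_i)} - I(g) \right| > \eta \right)
&\le Q_n\left( \left| \frac{1}{2n} \sum_{i=1}^n \frac{g^2(X_i)}{f^2(X_i)} - \frac{1}{2} E^{\nu_n}\left(\frac{g^2(X_1)}{f^2(X_1)}\right) \right| > \frac{\eta}{2} \right) \\
&\le \frac{1}{n\eta^2} E^{\nu_n}\left( \frac{g^2(X_1)}{f^2(X_1)} - E^{\nu_n}\left(\frac{g^2(X_1)}{f^2(X_1)}\right) \right)^2
\end{aligned}
\]

and

\[
\begin{aligned}
Q_n\left( \left| \frac{1}{b_n} \sum_{i=1}^n \frac{g(X_i)}{f(X_i)} - 2I(g) \right| > \eta \right)
&\le Q_n\left( \left| \frac{1}{b_n} \sum_{i=1}^n \left( \frac{g(X_i)}{f(X_i)} - E^{\nu_n}\left(\frac{g(X_1)}{f(X_1)}\right) \right) \right| > \frac{\eta}{2} \right) \\
&\le \frac{4n}{b_n^2 \eta^2} E^{\nu_n}\left( \frac{g(X_1)}{f(X_1)} - E^{\nu_n}\left(\frac{g(X_1)}{f(X_1)}\right) \right)^2.
\end{aligned}
\]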

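Therefore

\[ Q_n\left(A_{n,\varepsilon}\right) \to 1. \]

On the other hand, since $E^{Q_n} f_n = E f_n + \frac{b_n}{n} \int \frac{1}{a_n^d} K((x - y)/a_n)\, g(y)\,dy$ and $\int \left| \int \frac{1}{a_n^d} K((x - y)/a_n)\, g(y)\,dy - g(x) \right| dx \to 0$ as $n \to \infty$, we have

\[ \limsup_{n\to\infty} E^{Q_n}\left( \left\| \frac{n}{b_n}(f_n - E f_n) - g \right\|_1 \right) \le \limsup_{n\to\infty} E^{Q_n}\left( \left\| \frac{n}{b_n}(f_n - E^{Q_n} f_n) \right\|_1 \right). \]

Now, applying the Cauchy–Schwarz inequality to $E^{Q_n}\left( \left| \frac{n}{b_n}(f_n(x) - E^{Q_n} f_n(x)) \right| \right)$, it is easy to get that

\[
\begin{aligned}
E^{Q_n}\left( \left\| \frac{n}{b_n}(f_n - E^{Q_n} f_n) \right\|_1 \right)
&= \int_{\mathbb{R}^d} E^{Q_n}\left( \left| \frac{n}{b_n}(f_n(x) - E^{Q_n} f_n(x)) \right| \right) dx \\
&\le \sqrt{\frac{n}{b_n^2 a_n^d}} \int_{\mathbb{R}^d} \sqrt{ \frac{1}{a_n^d} E^{\nu_n}\left( K^2\left(\frac{x - X_1}{a_n}\right) \right) }\,dx.
\end{aligned}
\]

And so $\limsup_{n\to\infty} E^{Q_n}\left( \left\| \frac{n}{b_n}(f_n - E^{Q_n} f_n) \right\|_1 \right) = 0$, and

\[ Q_n\left( \left\| \frac{n}{b_n}(f_n - E f_n) - g \right\|_1 < \delta \right) \to 1. \]

Therefore, for any $\varepsilon > 0$,

\[
\begin{aligned}
\liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left( \left\| \frac{n}{b_n}(f_n - E f_n) - g \right\|_1 < \delta \right)
&\ge -\limsup_{n\to\infty} \frac{n^2}{b_n^2} \left( E^{\nu_n}\left(\log\left(1 + \frac{b_n}{n}\frac{g(X_1)}{f(X_1)}\right)\right) + \frac{b_n^2 \varepsilon}{n^2} \right) \\
&= -I(g) - \varepsilon.
\end{aligned}
\]

Letting $\varepsilon \to 0$, we obtain (2.1). □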



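Proof of Theorem 1.1(2). It is sufficient for (1.7) to prove that for any open convex subset $G$,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \le -\inf_{g \in G} I(g). \tag{2.2} \]

Now let $G$ be an open convex subset. Since (2.2) is trivial if $\inf_{g\in G} I(g) = 0$, we can assume $\inf_{g \in G} I(g) > 0$. For any $N > 0$ and $0 < \varepsilon < \inf_{g\in G} I(g)$, set $U = \{\varphi \in L^1(\mathbb{R}^d); I(\varphi) \le t_N\}$ where $t_N = \min\{N, \inf_{g\in G} I(g) - \varepsilon\}$. Then $U \cap G = \emptyset$, and hence by the Hahn–Banach theorem, there exist $h \in L^\infty(\mathbb{R}^d)$ and $c \in \mathbb{R}$ such that $H \cap U = \emptyset$ and $G \subset H$, where $H = \{\varphi \in L^1(\mathbb{R}^d); \int h(x)\varphi(x)\,dx > c\}$. Therefore, by the Chebyshev inequality, we have that for any $\alpha > 0$,

\[
\begin{aligned}
\limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right)
&\le \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n} \int h(x)(f_n(x) - E f_n(x))\,dx \ge c \right) \\
&\le -\alpha c + \limsup_{n\to\infty} \frac{n}{b_n^2} \log E\left( \exp\left\{ \alpha b_n \int h(x)(f_n(x) - E f_n(x))\,dx \right\} \right).
\end{aligned}
\]

By a Taylor expansion, it is easy to get

\[ \Lambda(h) \equiv \lim_{n\to\infty} \frac{n}{b_n^2} \log E\left( \exp\left\{ b_n \int h(x)(f_n(x) - E f_n(x))\,dx \right\} \right) = \frac{1}{2}\left( \int h(x)^2 f(x)\,dx - \left( \int h(x) f(x)\,dx \right)^2 \right). \]

Therefore for any $\alpha > 0$,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \le -\alpha c + \alpha^2 \Lambda(h), \]

and so, optimizing over $\alpha$,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \le -\frac{c^2}{4\Lambda(h)}. \]

Noting that $\varphi \in U$ implies $-\varphi \in U$, we have that $U \subset \{\varphi; \left|\int h(x)\varphi(x)\,dx\right| \le |c|\}$. Therefore,

\[
\begin{aligned}
2\Lambda(h) &= \int \left( h(x) - \int h(y) f(y)\,dy \right)^2 f(x)\,dx
= \sup_{2I(\varphi) \le 1} \left( \int \bar h(x)\varphi(x)\,dx \right)^2 \\
&= \frac{1}{2 t_N} \sup_{\varphi \in U} \left( \int h(x)\varphi(x)\,dx \right)^2 \le \frac{c^2}{2 t_N},
\end{aligned}
\]

where $\mu(dx) = f(x)\,dx$ and $\bar h(x) := h(x) - \int h(y)\,\mu(dy)$. Hence,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G\right) \le -t_N = -\min\left\{N, \inf_{g\in G} I(g) - \varepsilon\right\}. \]

Now, first letting $N \to \infty$, and then letting $\varepsilon \to 0$, we obtain (2.2). □

Proof of Theorem 1.1(3). Let $C$ be a compact subset. For any $\delta > 0$, and for any $g \in C$, there exists an open ball $U_g \ni g$ such that $\inf_{\varphi \in U_g} I(\varphi) \ge I(g) - \delta$, since $I(\cdot)$ is lower-semicontinuous. Choose finitely many $g_1, \ldots, g_m$ such that $C \subset \bigcup_{i=1}^m U_{g_i}$ and denote $G_\delta = \bigcup_{i=1}^m U_{g_i}$. Then

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in G_\delta\right) \le \max_{1 \le i \le m} \left\{ -\inf_{\varphi \in U_{g_i}} I(\varphi) \right\} \le -\inf_{g \in C} I(g) + \delta. \qquad \square \]

3. Proof of Theorem 1.2

The lower bound is a consequence of Theorem 1.1. The upper bound is proved in two basic steps and then follows Devroye's proof in [4]. The Devroye partition [4] plays an important role in the proof of the upper bound, but here it requires precise estimates to get the MDP.

Proof of Theorem 1.2(1). Define $\Psi : (L^1(\mathbb{R}^d), \|\cdot\|_1) \to [0, \infty)$ by $\Psi(\varphi) = \|\varphi\|_1$. Then $\Psi$ is continuous from $(L^1(\mathbb{R}^d), \|\cdot\|_1)$ to $[0, \infty)$ and

\[ \inf_{\Psi(g) = \lambda} I(g) = \frac{\lambda^2}{2}. \tag{3.1} \]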

In fact, it is clear that $\inf_{\Psi(g) = \lambda} I(g) \ge \frac{\lambda^2}{2}$, since $\|g\|_1 \le (2I(g))^{1/2}$ by the Cauchy–Schwarz inequality; on the other hand, if one takes $g(x) = \lambda (I_A(x) - I_B(x)) f(x)$ where $A \cap B = \emptyset$, $A \cup B = \mathbb{R}^d$ and $\int_A f(x)\,dx = \frac{1}{2}$, then $\|g\|_1 = \lambda$ and $I(g) = \frac{\lambda^2}{2}$. Therefore (3.1) holds.

Lower bound. Let $G$ be an open subset in $[0, \infty)$. Then $\Psi^{-1}(G)$ is an open subset in $(L^1(\mathbb{R}^d), \|\cdot\|_1)$; thus by Theorem 1.1,

\[
\begin{aligned}
\liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 \in G\right)
&= \liminf_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}(f_n - E f_n) \in \Psi^{-1}(G)\right) \\
&\ge -\inf_{\Psi(g) \in G} I(g) = -\inf_{\lambda \in G} \frac{\lambda^2}{2}.
\end{aligned}
\]

Upper bound. Let $F$ be a closed subset in $[0, \infty)$, and let $\lambda = \inf\{x; x \in F\}$. Without loss of generality, we can assume $\lambda > 0$. Then for any $0 < \varepsilon < \lambda$,

\[ P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 \in F\right) \le P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 > \lambda - \varepsilon\right). \]

Therefore, it is sufficient for the upper bound to prove that for any $\lambda > 0$,

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left(\frac{n}{b_n}\|f_n - E f_n\|_1 > \lambda\right) \le -\frac{\lambda^2}{2}. \]

By (H1), (BC) and Lemma A.1, without loss of generality we can assume that there exists a constant $1 < L < \infty$ such that $\{K \ne 0\} \cup \{f \ne 0\} \subset [-L+1, L-1]^d$, and

\[ K(x) = \sum_{j=1}^m c_j I_{A_j}(x), \qquad \sum_{j=1}^m c_j |A_j| = 1, \qquad |A_j| > 0, \]

where $0 < c_j < \infty$, $j = 1, \ldots, m$, are constants, $A_j \subset [-L, L]^d$, $j = 1, \ldots, m$, are disjoint rectangles, and $|A| = \int_A dx$ (for details, see Lemmas A.5–A.7 in the Appendix). Set $K_j(x) = (|A_j|)^{-1} I_{A_j}(x)$ and

\[ f_{n;j}(x) = \frac{1}{n a_n^d} \sum_{i=1}^n K_j\left(\frac{x - X_i}{a_n}\right). \]

Then by $f_n = \sum_{j=1}^m c_j |A_j| f_{n;j}$ and $\sum_{j=1}^m c_j |A_j| = 1$, we have that for any $\lambda > 0$,

\[ \left\{ \frac{n}{b_n}\|f_n - E f_n\|_1 > \lambda \right\} \subset \bigcup_{j=1}^m \left\{ \frac{n}{b_n}\|f_{n;j} - E f_{n;j}\|_1 > \lambda \right\}. \]

Hence, we only prove the upper bound for $K(x) = (|A|)^{-1} I_A(x)$, where $A$ is a rectangle. Without loss of generality, we assume that $A = [0,1]^d$, i.e., $K(x) = I_{[0,1]^d}(x)$. In this case,

\[ \|f_n - E f_n\|_1 = \frac{1}{a_n^d} \int_{\mathbb{R}^d} |\mu_n(x + a_n A) - \mu(x + a_n A)|\,dx, \]

where $\mu(B) = \int_B f(x)\,dx$ and $\mu_n(B) = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}(B)$ is the empirical measure for $X_i$, $i = 1, \ldots, n$. Define the partition $\Psi$ of $\mathbb{R}^d$ as follows (cf. [4]):

\[ \Psi := \left\{ \prod_{j=1}^d \left[ \frac{(i_j - 1) a_n}{N}, \frac{i_j a_n}{N} \right);\ i_j \in \mathbb{Z},\ j = 1, \ldots, d \right\}, \]

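where $N$ is a constant to be chosen later. Set

\[ D_x = (x + a_n A) - \bigcup_{\substack{B \in \Psi,\ B \subset x + a_n A \\ B \cap [-L,L]^d \ne \emptyset}} B. \]

Then,

\[ \frac{1}{a_n^d} \int_{\mathbb{R}^d} |\mu_n(x + a_n A) - \mu(x + a_n A)|\,dx \le \sum_{\substack{B \in \Psi \\ B \cap [-L,L]^d \ne \emptyset}} |\mu_n(B) - \mu(B)| + \frac{1}{a_n^d} \int_{\mathbb{R}^d} |\mu_n(D_x) - \mu(D_x)|\,dx, \]

where the last inequality is due to $\frac{1}{a_n^d} \int_{B \subset x + a_n A} dx \le 1$. Therefore for any $\lambda > 0$ and any $0 < \delta < \lambda$,

\[
\begin{aligned}
P\left( \frac{n}{b_n a_n^d} \int_{\mathbb{R}^d} |\mu_n(x + a_n A) - \mu(x + a_n A)|\,dx > \lambda \right)
&\le P\left( \sum_{\substack{B \in \Psi \\ B \cap [-L,L]^d \ne \emptyset}} |\mu_n(B) - \mu(B)| > \frac{b_n}{n}(\lambda - \delta) \right) \\
&\quad + P\left( \frac{n}{b_n a_n^d} \int_{\mathbb{R}^d} |\mu_n(D_x) - \mu(D_x)|\,dx > \delta \right).
\end{aligned}
\]

Let $\mathcal{F}_{n,N}$ denote the $\sigma$-algebra generated by the collection of sets $B \in \Psi$ with $B \cap [-L, L]^d \ne \emptyset$. Then the cardinality of $\mathcal{F}_{n,N}$ is at most equal to $2^{([\frac{2LN}{a_n}] + 2)^d}$. Hence,

\[ P\left( \sum_{\substack{B \in \Psi \\ B \cap [-L,L]^d \ne \emptyset}} |\mu_n(B) - \mu(B)| > \frac{b_n}{n}(\lambda - \delta) \right) \le 2^{(\frac{2LN}{a_n} + 2)^d} \sup_{B \in \mathcal{F}_{n,N}} P\left( \frac{n}{b_n} |\mu_n(B) - \mu(B)| > \frac{1}{2}(\lambda - \delta) \right). \]

By the Chebyshev inequality, it is easy to get that for any $t > 0$,

\[
\begin{aligned}
\sup_{B \in \mathcal{F}_{n,N}} P\left( \frac{n}{b_n} |\mu_n(B) - \mu(B)| > \frac{1}{2}(\lambda - \delta) \right)
&\le 2 \exp\left\{ -\frac{b_n^2}{2n}(\lambda - \delta) t \right\} \left( 1 + \frac{b_n^2 t^2}{2n^2} \sup_{B \in \mathcal{F}_{n,N}} E\left(I_B(X_1) - E(I_B(X_1))\right)^2 + o\left(\frac{b_n^2}{n^2}\right) \right)^n \\
&\le 2 \exp\left\{ -\frac{b_n^2}{2n}(\lambda - \delta) t \right\} \left( 1 + \frac{b_n^2 t^2}{8n^2} + o\left(\frac{b_n^2}{n^2}\right) \right)^n.
\end{aligned}
\]

Now we choose $N = N(n)$ such that $\lim_{n\to\infty} N(n) = \infty$ and $\left(\frac{2LN}{a_n} + 2\right)^d = o\left(\frac{b_n^2}{n}\right)$, which is possible by our condition (BC). Then, first we have

\[ \limsup_{N\to\infty} \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left( \sum_{\substack{B \in \Psi \\ B \cap [-L,L]^d \ne \emptyset}} |\mu_n(B) - \mu(B)| > \frac{b_n}{n}(\lambda - \delta) \right) \le -\sup_{t > 0} \left\{ \frac{1}{2}(\lambda - \delta) t - \frac{t^2}{8} \right\} = -\frac{(\lambda - \delta)^2}{2}. \tag{3.2} \]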
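The last equality in (3.2) is a one-dimensional maximisation: the quadratic $t \mapsto \frac{1}{2}(\lambda - \delta)t - \frac{t^2}{8}$ is maximised at $t = 2(\lambda - \delta)$, where it equals $(\lambda - \delta)^2/2$. A minimal numerical sketch of this step follows; the grid search, its range `t_max` and the sample values of $\lambda$ and $\delta$ are assumptions of the illustration.

```python
def exponent(t, lam, delta):
    """(lambda - delta) * t / 2 - t^2 / 8, the quantity maximised over t > 0."""
    return 0.5 * (lam - delta) * t - t * t / 8.0

def sup_over_grid(lam, delta, t_max=20.0, steps=20000):
    """Grid approximation of the supremum; assumes the maximiser lies in (0, t_max]."""
    return max(exponent(t_max * k / steps, lam, delta) for k in range(1, steps + 1))

lam, delta = 2.0, 0.5
print(sup_over_grid(lam, delta), (lam - delta) ** 2 / 2.0)
```

The same computation, with $-\alpha c + \alpha^2 \Lambda(h)$ minimised over $\alpha > 0$, gives the bound $-c^2/(4\Lambda(h))$ used in the proof of Theorem 1.1(2).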

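Next, we show that

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n a_n^d} \int_{\mathbb{R}^d} |\mu_n(D_x) - \mu(D_x)|\,dx > \delta \right) = -\infty. \tag{3.3} \]

Set $A^* = [\frac{1}{N}, 1 - \frac{1}{N})^d$. Then $D_x \subset x + a_n(A - A^*)$, and so

\[ \int_{\mathbb{R}^d} \mu(D_x)\,dx \le \int_{\mathbb{R}^d} \mu(x + a_n(A - A^*))\,dx = \int_{a_n(A - A^*)} dx \le \frac{2d\, a_n^d}{N}. \tag{3.4} \]

By the Cauchy–Schwarz inequality, we have

\[
E\left( \frac{n}{b_n a_n^d} \int_{\mathbb{R}^d} |\mu_n(D_x) - \mu(D_x)|\,dx \right)
\le \sqrt{\frac{n}{b_n^2 a_n^d}} \int_{[-L,L]^d} \sqrt{\frac{\mu(D_x)}{a_n^d}}\,dx
\le (2L)^{d/2} \sqrt{\frac{n}{b_n^2 a_n^d}} \sqrt{ \int_{[-L,L]^d} \frac{\mu(D_x)}{a_n^d}\,dx }.
\]

Hence

\[ \limsup_{n\to\infty} E\left( \frac{n}{b_n a_n^d} \int_{\mathbb{R}^d} |\mu_n(D_x) - \mu(D_x)|\,dx \right) = 0. \tag{3.5} \]

Taking $B = L^1(\mathbb{R}^d)$ and $\xi_i(x) = \frac{1}{b_n a_n^d}\left( \delta_{X_i}(D_x) - E(\delta_{X_i}(D_x)) \right)$ in Lemma A.1, by Lemma A.1, (3.4) and (3.5), we easily get (3.3). Finally by (3.2) and (3.3), we obtain

\[ \limsup_{n\to\infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n} \|f_n - E f_n\|_1 > \lambda \right) \le -\frac{\lambda^2}{2}. \qquad \square \]

(2) We only need to prove necessity. Let $K$ be a bounded function with compact support, and let $f$ also have compact support. If (1.12) holds, then

\[ \frac{n}{b_n} \|f_n - E f_n\|_1 \xrightarrow{P} 0. \tag{3.6} \]

Now we take $B = L^1(\mathbb{R}^d)$ and $\xi_{i,n}(x) = \frac{1}{a_n^d}\left( K\left(\frac{x - X_i}{a_n}\right) - E\left(K\left(\frac{x - X_i}{a_n}\right)\right) \right)$ in Lemma A.1; then by (3.6) and Lemma A.1, we have

\[ \frac{n}{b_n} E\left( \|f_n - E f_n\|_1 \right) \to 0. \]

Since $K$ is bounded, and $K$ and $f$ have compact support, we have therefore

\[ \limsup_{n\to\infty} \int \frac{a_n^d\, E\left(|\xi_{1,n}(x)|^3\right)}{E\left(|\xi_{1,n}(x)|^2\right)}\,dx < \infty, \tag{3.7} \]

where $\frac{0}{0} = 0$. By Lemma A.2, one has

\[ \left| \sqrt{n}\, E|f_n(x) - E f_n(x)| - \sqrt{\frac{2}{\pi}} \sqrt{E\left(|\xi_{1,n}(x)|^2\right)} \right| \le \frac{A\, E\left(|\xi_{1,n}(x)|^3\right)}{\sqrt{n}\, E\left(|\xi_{1,n}(x)|^2\right)}. \]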

Therefore

\[ \left| \frac{n}{b_n} E\|f_n - E f_n\|_1 - \sqrt{\frac{2}{\pi}} \frac{\sqrt{n}}{b_n \sqrt{a_n^d}} \int \sqrt{a_n^d E\left(|\xi_{1,n}(x)|^2\right)}\,dx \right| \le \int \frac{A a_n^d\, E\left(|\xi_{1,n}(x)|^3\right)}{b_n a_n^d\, E\left(|\xi_{1,n}(x)|^2\right)}\,dx. \tag{3.8} \]

Finally, by (3.7) and $\lim_{n\to\infty} \int \sqrt{a_n^d E\left(|\xi_{1,n}(x)|^2\right)}\,dx = \int \sqrt{f(x)}\,dx > 0$, we see that (3.8) implies (BC). □

4. Proof of Theorem 1.3

We use the MDP to show the upper bound and the lower bound for subsequences. The difficulty in the proof is to estimate the expectation of $\sup_{n_{k-1} \le n \le n_k} \|f_n - f_{n_k}\|_1$, where $n_k = [\gamma^k]$. We employ the VC-class method to solve this problem (Lemma 4.2).

Lemma 4.1. Let $K$ be a function with bounded variation. If there exists $\alpha > (2p+1)d$ such that

\[ \limsup_{|z|\to\infty} |z|^{\alpha} |K(z)| < \infty, \]

then

\[ \lim_{\delta \to 0} \int_{\mathbb{R}^d} (1 + |z|^{pd})^2 \sup_{|\gamma| \le \delta} |K(z) - K(\gamma z + z)|\,dz = 0. \]

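Proof. By the condition of the lemma, there exist constants $M > 1$, $L > 1$ such that

\[ |K(z)| \le M |z|^{-\alpha}, \qquad |z| \ge L, \]

and so

\[ \sup_{|\gamma| \le \delta} |K(z + \gamma z)| \le M (|z|(1 - \delta))^{-\alpha}, \qquad |z| \ge L/(1 - \delta). \]

Therefore, for any $0 < \varepsilon < 1$, there exists $R > L/(1 - \delta)$ such that for all $0 \le \delta \le 1/2$,

\[ \int_{|z| \ge R} \sup_{|\gamma| \le \delta} |K(z + \gamma z)|\, |z|^{2pd}\,dz \le M 2^{\alpha} \int_{|z| \ge R} |z|^{-\alpha} |z|^{2pd}\,dz \le \varepsilon. \]

Because $K$ is a function with bounded variation, $K$ can be written as $K = K_1 - K_2$, where $K_1(z) = \mu_1((-\infty, z])$ and $K_2(z) = \mu_2((-\infty, z])$, and $\mu_1$, $\mu_2$ are two finite positive measures on $\mathbb{R}^d$. Thus

\[
\begin{aligned}
\int_{|z| \le R} \sup_{|\gamma| \le \delta} |K(z) - K(z + \gamma z)|\, (1 + |z|^{pd})^2\,dz
&\le \int_{|z| \le R} \sup_{|\gamma| \le \delta} |K_1(z) - K_1(z + \gamma z)|\, (1 + |z|^{pd})^2\,dz \\
&\quad + \int_{|z| \le R} \sup_{|\gamma| \le \delta} |K_2(z) - K_2(z + \gamma z)|\, (1 + |z|^{pd})^2\,dz.
\end{aligned}
\]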

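For any $\delta > 0$ and any $z = (z_1, \ldots, z_d) \in \mathbb{R}^d$, set $\bar\delta_i = \delta\, \mathrm{sign}(z_i)$, $\underline\delta_i = \delta\, \mathrm{sign}(-z_i)$, $i = 1, \ldots, d$, and

\[ \bar z(\delta) := ((1 + \bar\delta_1) z_1, \ldots, (1 + \bar\delta_d) z_d), \qquad \underline z(\delta) := ((1 + \underline\delta_1) z_1, \ldots, (1 + \underline\delta_d) z_d). \]

Then, for all $|\gamma| \le \delta$,

\[ (1 + \gamma) z - \bar z(\delta) = ((\gamma - \bar\delta_1) z_1, \ldots, (\gamma - \bar\delta_d) z_d) \le 0, \]

and

\[ (1 + \gamma) z - \underline z(\delta) = ((\gamma - \underline\delta_1) z_1, \ldots, (\gamma - \underline\delta_d) z_d) \ge 0. \]

Therefore

\[ \sup_{|\gamma| \le \delta} |K_j(z) - K_j(\gamma z + z)| \le |K_j(z) - K_j(\bar z(\delta))| + |K_j(z) - K_j(\underline z(\delta))|, \qquad j = 1, 2, \]

and so, for any $0 \le \delta \le 1/2$,

\[
\begin{aligned}
&\int_{\mathbb{R}^d} (1 + |z|^{pd})^2 \sup_{|\gamma| \le \delta} |K(z) - K(\gamma z + z)|\,dz \\
&\qquad \le 2\varepsilon + 2 \max_{j=1,2} \int_{|z| \le R} (1 + |z|^{pd})^2 \left( |K_j(z) - K_j(\bar z(\delta))| + |K_j(z) - K_j(\underline z(\delta))| \right) dz.
\end{aligned}
\]

Since for $j = 1, 2$, $K_j(z) - K_j(\bar z(\delta)) \to 0$ and $K_j(z) - K_j(\underline z(\delta)) \to 0$ for almost every $z$ as $\delta \to 0$, by the Lebesgue dominated convergence theorem

\[ \limsup_{\delta \to 0} \int_{\mathbb{R}^d} (1 + |z|^{pd})^2 \sup_{|\gamma| \le \delta} |K(z) - K(\gamma z + z)|\,dz \le 2\varepsilon. \]

The proof is completed. □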

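Lemma 4.2. Assume that (H2) and (H3) hold. Then

\[ \limsup_{\gamma \to 1} \limsup_{k \to \infty} \sqrt{\frac{1}{2\log\log n_{k-1}}}\; E\left( \sup_{n_{k-1} \le n \le n_k} \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 \right) = 0, \]

where $n_k = [\gamma^k]$, $k \ge 1$, and

\[ \xi_{n,i}(x) = \frac{1}{a_n^d} \left( K\left(\frac{x - X_i}{a_n}\right) - E K\left(\frac{x - X_i}{a_n}\right) \right). \]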

Proof. Take $U \ge 2 \sup_{z \in \mathbb{R}^d} |K(z)|$ such that $U^2 A^2 \ge 1$. Since the class of functions $\mathcal{F} = \left\{ K\left(\frac{x - \cdot}{a}\right); x \in \mathbb{R}^d, a \in \mathbb{R} \setminus \{0\} \right\}$ is a bounded, measurable VC class of functions, the following classes of functions

\[ \mathcal{F}_{k,x} = \left\{ K\left(\frac{x - \cdot}{a_{n_k}}\right) - K\left(\frac{x - \cdot}{a_n}\right);\ n_{k-1} \le n \le n_k \right\}, \qquad k \ge 1,\ x \in \mathbb{R}^d, \]

are measurable VC classes of functions. Moreover, there is a common VC characteristic $(A, v)$ for these classes that does not depend on $k$ and $x$. For convenience, set

\[ \phi_k(x) := E\left\| \sum_{i=1}^{n_k} (g(X_i) - E g(X_i))^2 / n_k \right\|_{\mathcal{F}_{k,x}}, \qquad \psi_k(x) := (1 + |x|^{pd})^2 \phi_k(x), \]

and

\[ B_p = \int_{\mathbb{R}^d} (1 + |x|^{pd})^{-1}\,dx, \qquad \nu(dx) = B_p^{-1} (1 + |x|^{pd})^{-1}\,dx. \]

By Lemma A.4,

\[ E\left\| \frac{1}{\sqrt{n_k}} \sum_{i=1}^{n_k} (g(X_i) - E g(X_i)) \right\|_{\mathcal{F}_{k,x}} \le C \sqrt{ \phi_k(x) \log\frac{A^2 U^2}{\phi_k(x)} }. \]

Then by the Cauchy–Schwarz inequality and the concavity of the function $[0, +\infty) \ni x \mapsto -x \log x$, we have

\[
\begin{aligned}
\int_{\mathbb{R}^d} \sqrt{ \phi_k(x) \log\frac{A^2 U^2}{\phi_k(x)} }\,dx
&= B_p \int_{\mathbb{R}^d} (1 + |x|^{pd}) \sqrt{ \phi_k(x) \log\frac{A^2 U^2}{\phi_k(x)} }\,\nu(dx) \\
&\le B_p \sqrt{ \int_{\mathbb{R}^d} (1 + |x|^{pd})^2 \phi_k(x) \log\frac{A^2 U^2}{\phi_k(x)}\,\nu(dx) } \\
&= B_p \sqrt{ \int_{\mathbb{R}^d} \left( \psi_k(x) \log\left(A^2 U^2 (1 + |x|^{pd})^2\right) + \psi_k(x) \log\frac{1}{\psi_k(x)} \right) \nu(dx) } \\
&\le B_p \sqrt{ \int_{\mathbb{R}^d} \psi_k(x) \log\left(A^2 U^2 (1 + |x|^{pd})^2\right) \nu(dx) + \int_{\mathbb{R}^d} \psi_k(x)\,\nu(dx)\, \log\frac{1}{\int_{\mathbb{R}^d} \psi_k(x)\,\nu(dx)} }.
\end{aligned}
\]

Since for any $1 \le q \le 2$,

\[
\begin{aligned}
\int_{\mathbb{R}^d} (1 + |x|^{pd})^q \phi_k(x)\,dx
&\le 2 \int_{\mathbb{R}^d} \left(1 + |x|^{pd}\right)^q \int_{\mathbb{R}^d} \sup_{n_{k-1} \le n \le n_k} \left| K\left(\frac{x - y}{a_{n_k}}\right) - K\left(\frac{x - y}{a_n}\right) \right|^2 f(y)\,dy\,dx \\
&\le 2 a_{n_k}^d \int_{\mathbb{R}^d} f(y) \int_{\mathbb{R}^d} (1 + |y + a_{n_k} z|^{pd})^q \sup_{|\delta| \le \frac{n_k}{n_{k-1}} - 1} |K(z) - K(\delta z + z)|^2\,dz\,dy,
\end{aligned}
\]

and

\[ \int_{\mathbb{R}^d} \psi_k(x) \log\left(A^2 U^2 (1 + |x|^{pd})^2\right) \nu(dx) \le \int_{\mathbb{R}^d} 2 A^2 U^2 \psi_k(x) (1 + |x|^{pd})\,\nu(dx), \]

by (H3) and Lemma 4.1, there exist two bounded functions $B_i(k, \gamma)$, $i = 1, 2$, satisfying

\[ \limsup_{\gamma \to 1} \limsup_{k \to \infty} B_i(k, \gamma) = 0, \qquad i = 1, 2, \]

such that

\[ \int_{\mathbb{R}^d} E\left\| \frac{1}{\sqrt{n_k}} \sum_{i=1}^{n_k} (g(X_i) - E g(X_i)) \right\|_{\mathcal{F}_{k,x}} dx \le C \sqrt{ a_{n_k}^d B_1(k, \gamma) - a_{n_k}^d B_2(k, \gamma) \log\left(a_{n_k}^d B_2(k, \gamma)\right) }, \]

which implies

\[ \limsup_{\gamma \to 1} \limsup_{k \to \infty} \sqrt{\frac{1}{2 a_{n_k}^d \log\log n_{k-1}}}\; E\left( \sup_{n_{k-1} \le n \le n_k} \left\| \sum_{i=1}^{n_k} \left(a_{n_k}^d \xi_{n_k,i} - a_n^d \xi_{n,i}\right) \right\|_1 \right) = 0. \]

Similarly, it is easy to get

\[ \limsup_{\gamma \to 1} \limsup_{k \to \infty} \sqrt{\frac{1}{2 \log\log n_{k-1}}}\; E\left( \sup_{n_{k-1} \le n \le n_k} \left| \frac{a_n^d}{a_{n_k}^d} - 1 \right| \left\| \sum_{i=1}^{n_k} \xi_{n,i} \right\|_1 \right) = 0. \]

Therefore, the conclusion of the lemma follows from the triangle inequality. □



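Proof of Theorem 1.3. Step 1: Upper bound. For any $\gamma > 1$ fixed, set $n_k = [\gamma^k]$, $k \ge 1$. Applying Theorem 1.2 to $b_n = \sqrt{2n\log\log n}$, we have that for any $\varepsilon > 0$ and for $k$ large enough,

\[ P\left( \sqrt{\frac{n_k}{2\log\log n_k}}\, \|f_{n_k} - E f_{n_k}\|_1 > 1 + 2\varepsilon \right) \le \exp\{ -(1 + \varepsilon)^2 \log\log n_k \}. \]

By the Borel–Cantelli lemma and the arbitrariness of $\varepsilon$, we obtain

\[ \limsup_{k\to\infty} \sqrt{\frac{n_k}{2\log\log n_k}}\, \|f_{n_k} - E f_{n_k}\|_1 \le 1 \quad \text{a.s.} \tag{4.1} \]

Next we need to compare the whole sequence with the subsequence. Denote

\[ \xi_{n,i}(x) = \frac{1}{a_n^d} \left( K\left(\frac{x - X_i}{a_n}\right) - E K\left(\frac{x - X_i}{a_n}\right) \right). \]

Then for any $n \in \mathbb{N} \cap [n_{k-1}, n_k]$,

\[ \sqrt{\frac{n}{2\log\log n}}\, \|f_n - E f_n\|_1 \le \frac{ n_k \|f_{n_k} - E f_{n_k}\|_1 + \left\| \sum_{i=n}^{n_k} \xi_{n_k,i} \right\|_1 + \left\| \sum_{i=1}^{n} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} }. \tag{4.2} \]

Then it is easy to get the following inequalities:

\[ \limsup_{k\to\infty} \sup_{n_{k-1} \le n \le n_k} \left| \frac{a_n}{a_{n_k}} - 1 \right| \le \limsup_{k\to\infty} \sup_{n_{k-1} \le n \le n_k} \left( \frac{n a_n^d}{n_k a_{n_k}^d} \frac{n_k}{n_{k-1}} - 1 \right) \le \gamma - 1, \tag{4.3} \]

and

\[ \|\xi_{n_k,1} - \xi_{n,1}\|_1 \le 2 \sup_{|\eta| \le \frac{n_k}{n_{k-1}} - 1} \int_{\mathbb{R}^d} |K(z) - K(\eta z + z)|\,dz + 2\left( \frac{n_k}{n_{k-1}} - 1 \right). \tag{4.4} \]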

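By Lemma 4.2,

\[ \limsup_{\gamma \to 1} \limsup_{k \to \infty} \sqrt{\frac{1}{2\log\log n_{k-1}}}\; E\left( \sup_{n_{k-1} \le n \le n_k} \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 \right) = 0. \]

For any $0 < \varepsilon < 1$, choose $1 < \gamma_0$ such that for any $1 < \gamma \le \gamma_0$, there exists $k_\gamma \ge 1$ such that for all $k \ge k_\gamma$,

\[ \sup_{n_{k-1} \le j \le n_k} \|\xi_{n_k,1} - \xi_{j,1}\|_1 \le \frac{\varepsilon}{60}, \tag{4.5} \]

and

\[ \sqrt{\frac{1}{2\log\log n_{k-1}}}\; E\left( \sup_{n_{k-1} \le n \le n_k} \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 \right) < \frac{\varepsilon}{60}. \]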

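Take $B = L^1(\mathbb{R}^d) \times \{n_{k-1}, \ldots, n_k\}$, $\|g\| := \sup_{n_{k-1} \le j \le n_k} \int |g(x, j)|\,dx$ and $\xi_i(x, j) = \xi_{n_k,i}(x) - \xi_{j,i}(x)$. Write $c_k = \sqrt{2 n_{k-1} \log\log n_{k-1}}$. By Lemma A.3, for any $k \ge k_\gamma$,

\[
\begin{aligned}
P\left( \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=1}^{n} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 }{ c_k } > \varepsilon \right)
&\le P\left( \sup_{n_{k-1} \le n \le n_k}\ \sup_{n_{k-1} \le j \le n_k} \frac{ \left\| \sum_{i=1}^{n} (\xi_{n_k,i} - \xi_{j,i}) \right\|_1 }{ c_k } > \varepsilon \right) \\
&\le 9 P\left( \sup_{n_{k-1} \le j \le n_k} \frac{ \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{j,i}) \right\|_1 }{ c_k } > \frac{\varepsilon}{30} \right) \\
&\le 9 P\left( \sup_{n_{k-1} \le j \le n_k} \frac{ \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{j,i}) \right\|_1 }{ c_k } - E\left( \sup_{n_{k-1} \le j \le n_k} \frac{ \left\| \sum_{i=1}^{n_k} (\xi_{n_k,i} - \xi_{j,i}) \right\|_1 }{ c_k } \right) > \frac{\varepsilon}{60} \right).
\end{aligned}
\tag{4.6}
\]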

Then by Lemma A.1 and (4.5), we have for any $1 < \gamma \le \gamma_0$,

\[ \sum_{k=1}^{\infty} P\left( \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=1}^{n} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } > \varepsilon \right) < \infty. \]

Therefore, by the Borel–Cantelli lemma, we get for any $1 < \gamma \le \gamma_0$,

\[ \limsup_{k\to\infty} \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=1}^{n} (\xi_{n_k,i} - \xi_{n,i}) \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } \le \varepsilon \quad \text{a.s.} \tag{4.7} \]

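On the other hand, by Lemma A.3,

\[ P\left( \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=n}^{n_k} \xi_{n_k,i} \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } > 60\sqrt{\gamma - 1} \right) \le 9 P\left( \frac{ \left\| \sum_{i=1}^{n_k - n_{k-1}} \xi_{n_k,i} \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } > 2\sqrt{\gamma - 1} \right), \]

and so by Theorem 1.2, we also have

\[ \sum_{k=1}^{\infty} P\left( \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=n}^{n_k} \xi_{n_k,i} \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } > 60\sqrt{\gamma - 1} \right) < \infty, \]

which implies by the Borel–Cantelli lemma

\[ \limsup_{k\to\infty} \sup_{n_{k-1} \le n \le n_k} \frac{ \left\| \sum_{i=n}^{n_k} \xi_{n_k,i} \right\|_1 }{ \sqrt{2 n_{k-1} \log\log n_{k-1}} } \le 60\sqrt{\gamma - 1} \quad \text{a.s.} \tag{4.8} \]

Now, combining (4.1) and (4.7) with (4.8), we have for any $1 < \gamma \le \gamma_0$,

\[ \limsup_{k\to\infty} \sup_{n_{k-1} \le n \le n_k} \sqrt{\frac{n}{2\log\log n}}\, \|f_n - E f_n\|_1 \le \sqrt{\gamma} + 60\sqrt{\gamma - 1} + \varepsilon \quad \text{a.s.} \]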

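First letting $\gamma \searrow 1$, and then letting $\varepsilon \to 0$, we obtain the upper bound.

Step 2: Lower bound. For any $\gamma > 2$ fixed, set $n_k = [\gamma^k]$, $k \ge 1$. Then

\[ \sqrt{\frac{n_{k+1}}{2\log\log n_{k+1}}}\, \|f_{n_{k+1}} - E f_{n_{k+1}}\|_1 \ge \frac{ \left\| \sum_{i=n_k}^{n_{k+1}} \xi_{n_{k+1},i} \right\|_1 - \left\| \sum_{i=1}^{n_k} \xi_{n_{k+1},i} \right\|_1 }{ \sqrt{2 n_{k+1} \log\log n_{k+1}} }. \tag{4.9} \]

By (4.5), Lemma A.1 and the Borel–Cantelli lemma, we have

\[ \limsup_{k\to\infty} \frac{ \left\| \sum_{i=1}^{n_k} \xi_{n_{k+1},i} \right\|_1 }{ \sqrt{2 n_{k+1} \log\log n_{k+1}} } \le \frac{4}{\sqrt{\gamma}} \quad \text{a.s.} \tag{4.10} \]

By Theorem 1.2 and the Borel–Cantelli lemma, one can easily get that

\[ \limsup_{k\to\infty} \sqrt{\frac{1}{2 n_{k+1} \log\log n_{k+1}}} \left\| \sum_{i=n_k}^{n_{k+1}} \xi_{n_{k+1},i} \right\|_1 \ge \sqrt{\frac{\gamma - 1}{\gamma}} \quad \text{a.s.} \tag{4.11} \]

Combining (4.10) and (4.11) with (4.9), we get

\[ \limsup_{n\to\infty} \sqrt{\frac{n}{2\log\log n}}\, \|f_n - E f_n\|_1 \ge \limsup_{k\to\infty} \sqrt{\frac{n_{k+1}}{2\log\log n_{k+1}}}\, \|f_{n_{k+1}} - E f_{n_{k+1}}\|_1 \ge \sqrt{\frac{\gamma - 1}{\gamma}} - \frac{4}{\sqrt{\gamma}} \quad \text{a.s.} \]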

Finally, letting $\gamma \to \infty$, we obtain the lower bound. □

Acknowledgments

The author would like to thank Professor Liming Wu for his comments on the first version of this paper. The author is indebted to the referee, the Associate Editor and Editor-in-Chief Protter for their helpful suggestions and comments. The research is supported by the National Natural Science Foundation of China (No. 10271091, 10571139).

Appendix

Lemma A.1 (cf. [14]). Let $\{\xi_i, i = 1, \ldots, n\}$ be independent random variables taking values in a Banach space $(B, \|\cdot\|)$, and let $E(\xi_i) = 0$, $\|\xi_i\| \le c$, $i = 1, \ldots, n$, where $c$ is a constant. Then for any $\beta \ge \sum_{i=1}^n E(\|\xi_i\|^2)$ and any $\lambda > 0$,

\[ P\left( \left| \left\| \sum_{i=1}^n \xi_i \right\| - E\left\| \sum_{i=1}^n \xi_i \right\| \right| > \lambda \right) \le \min\left\{ 2\exp\left\{ -\frac{\lambda^2}{2\beta}\left( 2 - \exp\left\{ \frac{2c\lambda}{\beta} \right\} \right) \right\},\ 2\exp\left\{ -\frac{\lambda^2}{2nc^2} \right\} \right\}. \tag{A.1} \]

Lemma A.2 (cf. [20], or [5] p. 90). Let $\{\xi_i, i \ge 1\}$ be a sequence of independent and identically distributed real random variables with common variance 1, mean 0 and finite absolute moment of the third order. Then

\[ \left| E\left| \frac{\sum_{i=1}^n \xi_i}{\sqrt{n}} \right| - \sqrt{\frac{2}{\pi}} \right| \le \frac{A\, E(|\xi_1|^3)}{\sqrt{n}}, \]

where $A$ is a universal positive constant.
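The statement of Lemma A.2 can be checked exactly in one simple case. The sketch below assumes that the $\xi_i$ are independent $\pm 1$ signs (mean 0, variance 1, $E|\xi_1|^3 = 1$), for which $E|S_n/\sqrt{n}|$ is a finite binomial sum. The lemma does not give the value of $A$, so the comparison with $1/\sqrt{n}$ (that is, $A = 1$) is an assumption made only for this illustration.

```python
import math

def mean_abs_normalized_sum(n):
    """Exact E|S_n / sqrt(n)| where S_n is a sum of n independent +-1 signs."""
    # S_n = 2k - n when exactly k of the n signs equal +1.
    weighted = sum(math.comb(n, k) * abs(2 * k - n) for k in range(n + 1))
    return weighted / 2 ** n / math.sqrt(n)

LIMIT = math.sqrt(2.0 / math.pi)   # E|Z| for a standard normal Z

for n in (10, 100, 400):
    gap = abs(mean_abs_normalized_sum(n) - LIMIT)
    print(n, gap, 1.0 / math.sqrt(n))
```

In this symmetric case the gap shrinks faster than $1/\sqrt{n}$, which is consistent with the lemma giving an upper bound rather than an exact rate.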

470

F. Gao / Stochastic Processes and their Applications 118 (2008) 452–473

Lemma A.3 (Montgomery–Smith [19]). Let {ξi , i = 1, 2, . . .} be a sequence of independent and identical distributed random variables taking values in a Banach space (B, k · k) with E(ξi ) = 0. Then, for any λ > 0,



! ! k n

X

X

λ



. P sup ξi > λ ≤ 9P ξ >

i=1 i 30 1≤k≤n i=1 ˇ class. Let (S, S) be a Let us introduce some notations on the V C (Vapnik–Cervonenkis) measurable space and let F be a uniformly bounded collection of measurable functions on it. The class F is called to be a bounded measurable V C class of functions if it is separable, and if there exist positive numbers A and v such that, for every probability measure µ on (S, S) and every 0 < τ < 1,  v A N (F, k · k L 2 (µ) , τ kFk L 2 (µ) ) ≤ , τ where F = sup{|g|; g ∈ F} and N (F, k · k L 2 (µ) , τ ) denotes the τ -covering number of the metric space (F, k · k L 2 (µ) ); that is, the smallest number of balls of radius not larger than τ and centers in F needed to cover F. The pair (A, v) is called the characteristic of the class F. For any map Φ from F to R, denote Qby kΦkF = sup{|Φ(g)|; g ∈ F}. Let µ be any probability measure on (S, S), and let P = i∈N µi be the product probability measure of µi = µ, i ∈ N. Let ξi : S N 7→ S, i ∈ N, be the coordinate functions. Lemma A.4 (cf. [9]). Let F be a measurable uniformly bounded V C class of functions, and let U ≥ supg∈F kgk∞ . Then, there exists a constant C depending only on the characteristic (A, v) of the class F, such that

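$$
\frac{1}{\sqrt{n}}\, E\Bigl\| \sum_{i=1}^n \bigl( g(\xi_i) - E g(\xi_1) \bigr) \Bigr\|_{\mathcal{F}}
\le C \sqrt{ \frac{1}{n} E\Bigl\| \sum_{i=1}^n \bigl( g(\xi_i) - E g(\xi_1) \bigr)^2 \Bigr\|_{\mathcal{F}} \, \log\left( \frac{A^2 U^2}{ \frac{1}{n} E\bigl\| \sum_{i=1}^n \bigl( g(\xi_i) - E g(\xi_1) \bigr)^2 \bigr\|_{\mathcal{F}} } \right) }.
$$

Lemma A.5. Let $(H_1)$ and (BC) hold. For each $N \ge 1$, set

$$
f_{n,N}(x) = \frac{1}{n a_n^d} \sum_{i=1}^n K\left( \frac{x - X_i}{a_n} \right) I_{[-N,N]^d \cap \{K \le N\}}\left( \frac{x - X_i}{a_n} \right), \quad x \in \mathbb{R}^d.
$$

Then

$$
\limsup_{n \to \infty} \frac{n}{b_n} \sup_{N \ge 1} E\| f_n - f_{n,N} \|_1 = 0 \tag{A.2}
$$

and for any $\delta > 0$,

$$
\limsup_{N \to \infty} \limsup_{n \to \infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n} \| f_n - f_{n,N} \|_1 > \delta \right) = -\infty. \tag{A.3}
$$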


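Proof. In Lemma A.1, take $B = L^1(\mathbb{R}^d)$ and

$$
\xi_i = \frac{1}{a_n^d} \left( K\left( \frac{x - X_i}{a_n} \right) I_{([-N,N]^d \cap \{K \le N\})^c}\left( \frac{x - X_i}{a_n} \right) - E\left( K\left( \frac{x - X_i}{a_n} \right) I_{([-N,N]^d \cap \{K \le N\})^c}\left( \frac{x - X_i}{a_n} \right) \right) \right).
$$

Then $\|\xi_i\| \le 2 \| K I_{([-N,N]^d \cap \{K \le N\})^c} \|_1 \to 0$ as $N \to \infty$, and so by Lemma A.1,

$$
\limsup_{N \to \infty} \limsup_{n \to \infty} \frac{n}{b_n^2} \log P\left( \frac{1}{b_n} \left( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\| - E\Bigl( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\| \Bigr) \right) > \delta \right) = -\infty. \tag{A.4}
$$

By the Cauchy–Schwarz inequality, it is easy to get for any $\eta > 0$,

$$
\sup_{N \ge 1} \frac{1}{b_n} E\Bigl( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\| \Bigr) \le \sqrt{\frac{n}{b_n^2 a_n^d}} \int \sqrt{ \int K^2(z) f(x - a_n z)\,dz }\,dx.
$$

Hence, (A.2) holds and (A.3) follows from (A.4) and (A.2).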
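The step from Lemma A.1 to (A.4) is stated without detail. The following is a minimal sketch of one way to fill it in. It rests on two assumptions that this appendix does not restate: the shorthand $c_N$ is introduced here only for illustration, and $b_n^2/n \to \infty$ is assumed, as is usual for a moderate deviation scale.

```latex
% Shorthand (ours, not the paper's):
%   c_N := 2 \| K I_{([-N,N]^d \cap \{K \le N\})^c} \|_1, so that \|\xi_i\| \le c_N.
% Apply the second bound in (A.1) with c = c_N and \lambda = \delta b_n:
\begin{align*}
P\left( \frac{1}{b_n} \left( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\|
      - E\Bigl( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\| \Bigr) \right) > \delta \right)
  &\le 2 \exp\left( -\frac{\delta^2 b_n^2}{2 n c_N^2} \right), \\
\frac{n}{b_n^2} \log P\left( \frac{1}{b_n} \left( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\|
      - E\Bigl( \Bigl\| \sum_{i=1}^n \xi_i \Bigr\| \Bigr) \right) > \delta \right)
  &\le \frac{n}{b_n^2} \log 2 - \frac{\delta^2}{2 c_N^2}.
\end{align*}
% Assumption: b_n^2 / n \to \infty, so the first term on the right vanishes as n \to \infty.
% Since c_N \to 0 as N \to \infty, the remaining term tends to -\infty, which gives (A.4).
```

Only the second, Gaussian-type bound in (A.1) is used in this sketch, so it does not depend on the choice of $\beta$.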



Lemma A.6. Let $(H_1)$ and (BC) hold, and let $K$ be a bounded measurable function with compact support. For any $l > 0$, set $X_i^{(l)} = (\min\{X_{i,1}, l\}, \ldots, \min\{X_{i,d}, l\})$ and

$$
f_{n,l}(x) = \frac{1}{n a_n^d} \sum_{i=1}^n K\left( \frac{x - X_i^{(l)}}{a_n} \right), \quad x \in \mathbb{R}^d.
$$

Then

$$
\limsup_{n \to \infty} \frac{n}{b_n} E\| f_n - f_{n,l} \|_1 = 0 \tag{A.5}
$$

and for any $\delta > 0$,

$$
\limsup_{l \to \infty} \limsup_{n \to \infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n} \| f_n - f_{n,l} \|_1 > \delta \right) = -\infty. \tag{A.6}
$$

Proof. In Lemma A.1, take $B = L^1(\mathbb{R}^d)$ and

$$
\xi_i = \frac{1}{a_n^d} \left( K\left( \frac{x - X_i}{a_n} \right) - K\left( \frac{x - X_i^{(l)}}{a_n} \right) - E\left( K\left( \frac{x - X_i}{a_n} \right) - K\left( \frac{x - X_i^{(l)}}{a_n} \right) \right) \right).
$$

Then $\|\xi_1\|_1 \le 4$, and

$$
E(\|\xi_1\|_1^2) \le \int_{|y| \ge l} \left( \int \left| K(z) - K\left( z + \frac{y - l}{a_n} \right) \right| dz \right)^2 f(y)\,dy \le 4 \int_{|y| \ge l} f(y)\,dy.
$$

It is easy to get (A.5) by the Cauchy–Schwarz inequality. Therefore, by (A.5) and Lemma A.1, one immediately gets (A.6).

Lemma A.7. Assume that (BC) holds. Let $K$ be a bounded measurable function with compact support, and let $f(x)$ also have compact support. Then there exist two constants $M$ and $L$ and a sequence of simple functions

$$
K_l(x) = \sum_{j=1}^{m_l} c_{lj} I_{A_{lj}}(x), \quad l = 1, 2, \ldots,
$$


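where $0 < c_{l1}, \ldots, c_{l m_l} \le M$ are positive constants and $A_{l1}, \ldots, A_{l m_l} \subset [-L, L]^d$ are disjoint rectangles in $\mathbb{R}^d$, such that

$$
\limsup_{n \to \infty} \frac{n}{b_n} E\| f_n - f_n^{(l)} \|_1 = 0 \tag{A.7}
$$

and for any $\delta > 0$,

$$
\limsup_{l \to \infty} \limsup_{n \to \infty} \frac{n}{b_n^2} \log P\left( \frac{n}{b_n} \| f_n - f_n^{(l)} \|_1 > \delta \right) = -\infty, \tag{A.8}
$$

where

$$
f_n^{(l)}(x) = \frac{1}{n a_n^d} \sum_{i=1}^n K_l\left( \frac{x - X_i}{a_n} \right), \quad x \in \mathbb{R}^d.
$$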

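Proof. Set $M = 2 \sup_{x \in \mathbb{R}^d} |K(x)|$ and $L = 1 + \sup\{|x|;\ K(x) \ne 0\} + \sup\{|x|;\ f(x) \ne 0\}$. For any $l \ge 1$, choose positive constants $c_{l1}, \ldots, c_{l m_l} \le M$ and disjoint rectangles $A_{l1}, \ldots, A_{l m_l} \subset [-L, L]^d$ such that the function

$$
K_l(x) = \sum_{j=1}^{m_l} c_{lj} I_{A_{lj}}(x)
$$

satisfies $\|K - K_l\|_1 \le 1/l$. In Lemma A.1, take $B = L^1(\mathbb{R}^d)$ and

$$
\xi_i = \frac{1}{a_n^d} \left( K\left( \frac{x - X_i}{a_n} \right) - K_l\left( \frac{x - X_i}{a_n} \right) - E\left( K\left( \frac{x - X_i}{a_n} \right) - K_l\left( \frac{x - X_i}{a_n} \right) \right) \right).
$$

Then $\|\xi_1\| \le 2/l$, and (A.7) and (A.8) hold as in Lemma A.5.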
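The choice of $K_l$ in this proof can be made concrete in a simple case. The sketch below is an illustration under assumptions that are not in the paper: dimension $d = 1$, the Epanechnikov kernel $K(x) = \tfrac{3}{4}(1 - x^2)$ on $[-1, 1]$, equal-width cells, heights taken at cell midpoints, and $2l$ cells for index $l$. The lemma itself only asserts that suitable constants and rectangles exist; all function names here are hypothetical.

```python
def epanechnikov(x):
    """Bounded kernel with compact support [-1, 1] (illustrative choice)."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def step_approximation(kernel, support, cells):
    """Simple function: one constant height per equal-width cell of the support."""
    lo, hi = support
    width = (hi - lo) / cells
    pieces = []
    for j in range(cells):
        left = lo + j * width
        right = left + width
        # Height c_lj is the kernel value at the cell midpoint (positive here).
        pieces.append((left, right, kernel(0.5 * (left + right))))
    return pieces


def l1_error(kernel, pieces, points_per_cell=2000):
    """Midpoint-rule estimate of the L1 distance between kernel and step function."""
    total = 0.0
    for left, right, height in pieces:
        step = (right - left) / points_per_cell
        for k in range(points_per_cell):
            x = left + (k + 0.5) * step
            total += abs(kernel(x) - height) * step
    return total


def approximating_kernel(l):
    """K_l for the Epanechnikov kernel, using 2*l cells (an ad hoc choice)."""
    return step_approximation(epanechnikov, (-1.0, 1.0), cells=2 * l)


if __name__ == "__main__":
    for l in (1, 2, 5, 10):
        pieces = approximating_kernel(l)
        print(l, len(pieces), l1_error(epanechnikov, pieces))
```

With these choices the heights stay within $(0, M]$ for $M = 2 \sup_x |K(x)| = 1.5$, and the estimated $L^1$ error can be compared with the target $1/l$ from the proof. For a general bounded kernel with compact support, a different partition may be needed.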



References

[1] M. Csörgő, L. Horváth, Central limit theorems for $L_p$-norms of density estimators, Z. Wahrsch. Verw. 80 (1988) 269–291.
[2] D.W. Dawson, J. Gartner, Long time fluctuation of weakly interacting diffusions, Stochastics 20 (1987) 247–308.
[3] A. Dembo, O. Zeitouni, Large Deviations Techniques and Applications, 2nd ed., Springer, New York, 1998.
[4] L. Devroye, The equivalence of weak, strong and complete convergence in $L_1$ for kernel density estimates, Ann. Statist. 11 (1983) 896–904.
[5] L. Devroye, L. Györfi, Nonparametric Density Estimation. The $L_1$-View, Wiley, New York, 1985.
[6] U. Einmahl, D. Mason, An empirical process approach to the uniform consistency of kernel-type function estimators, J. Theoret. Probab. 13 (2000) 1–37.
[7] L. Horváth, On $L_p$-norms of multivariate density estimators, Ann. Statist. 19 (1991) 1933–1949.
[8] F.Q. Gao, Moderate deviations and large deviations for kernel density estimators, J. Theoret. Probab. 16 (2003) 401–418.
[9] E. Giné, A. Guillou, On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals, Ann. Inst. H. Poincaré (B) Probab. Statist. 37 (2001) 503–522.
[10] E. Giné, A. Guillou, Rates of strong uniform consistency for multivariate kernel density estimators, Ann. Inst. H. Poincaré (B) Probab. Statist. 38 (2002) 907–921.
[11] E. Giné, V. Koltchinskii, J. Zinn, Weighted uniform consistency of kernel density estimators, Ann. Probab. 32 (2004) 2570–2605.
[12] E. Giné, D.M. Mason, The law of the iterated logarithm for the integrated squared deviation of a kernel density estimator, Bernoulli 10 (2004) 721–752.
[13] E. Giné, D.M. Mason, A.Y. Zaitsev, The $L_1$-norm density estimator process, Ann. Probab. 31 (2003) 719–768.
[14] M. Ledoux, M. Talagrand, Probability in Banach Spaces: Isoperimetry and Processes, Springer-Verlag, Berlin, 1991.


[15] L.Z. Lei, L.M. Wu, B. Xie, Large deviations and deviation inequality for kernel density estimator in $L_1(\mathbb{R}^d)$-distance, in: Development of Modern Statistics and Related Topics, in: Series in Biostatistics, vol. 1, 2003, pp. 89–97.
[16] L.Z. Lei, L.M. Wu, Large deviations of kernel density estimator in $L_1$ for uniformly ergodic Markov processes, Stochastic Process. Appl. 115 (2005) 275–298.
[17] D. Louani, Large deviations limit theorems for the kernel density estimator, Scand. J. Statist. 25 (1998) 243–253.
[18] D. Louani, Large deviations for the $L_1$-distance in kernel density estimation, J. Statist. Plann. Inference 90 (2000) 177–182.
[19] S.J. Montgomery-Smith, Comparison of sums of independent identically distributed random vectors, Probab. Math. Statist. 14 (1993) 281–285.
[20] T.J. Sweeting, Speeds of convergence in the multidimensional central limit theorem, Ann. Probab. 5 (1977) 28–41.