Moderate deviation principles for Engel’s, Sylvester’s series and Cantor’s products


Statistics and Probability Letters 96 (2015) 247–254 Contents lists available at ScienceDirect Statistics and Probability Letters journal homepage: ...



Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro

Moderate deviation principles for Engel’s, Sylvester’s series and Cantor’s products✩ Wei Hu ∗ School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei 430072, PR China

Article info

Article history: Received 22 July 2014; received in revised form 22 September 2014; accepted 4 October 2014; available online 12 October 2014.

MSC: 60F10; 60H10

Keywords: Large deviation; Moderate deviation; Engel’s series; Sylvester’s series; Cantor’s products

Abstract

In this paper, we obtain the moderate deviation principles for Engel’s series, Sylvester’s series and Cantor’s products, which complement the results of Zhu (2014). © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Representations of real numbers have been widely studied in the literature (cf. Borel, 1947, Erdős et al., 1958, Lévy, 1947, Rényi, 1958 and references therein). One particular class of representations is Engel’s series, which is related to Sylvester’s series and Cantor’s products. Statistical properties such as the central limit theorem and the law of the iterated logarithm are well understood for Engel’s series, Sylvester’s series and Cantor’s products. Recently, Zhu (2014) studied the large deviation principles (LDPs) for all three expansions. In this short note, we further study the moderate deviation principle (MDP) for these representations.

To state our results, we recall some notation related to large and moderate deviations. For the general theory of LDPs, we refer to the book of Dembo and Zeitouni (1998). Let $(S, d)$ be a metric space and $\{Y_n : n \ge 1\}$ be a sequence of $S$-valued random variables on a probability space $(\Omega, \mathcal{F}, P)$. Let $\lambda_n$ be a sequence of positive real numbers satisfying $\lambda_n \to \infty$ as $n \to \infty$. A function $I(\cdot) : S \to [0, +\infty]$ is said to be a rate function if it is lower semicontinuous. The sequence $\{Y_n, n \ge 1\}$ is said to satisfy an LDP with speed $\lambda_n$ and rate function $I$ if for any Borel set $\Gamma$ in $S$

$$-\inf_{x \in \Gamma^\circ} I(x) \le \liminf_{n\to\infty} \frac{1}{\lambda_n} \log P(Y_n \in \Gamma) \le \limsup_{n\to\infty} \frac{1}{\lambda_n} \log P(Y_n \in \Gamma) \le -\inf_{x \in \bar\Gamma} I(x),$$

✩ Research supported in part by the National Natural Science Foundation of China (No. 11171262).



Tel.: +86 27 68752957. E-mail address: [email protected].

http://dx.doi.org/10.1016/j.spl.2014.10.006 0167-7152/© 2014 Elsevier B.V. All rights reserved.


where $\Gamma^\circ$ and $\bar\Gamma$ denote the interior and closure of $\Gamma$, respectively. The rate function is said to be good if it has compact level sets.

Now we consider a family of random variables $\{\xi_n, n \ge 1\}$. Assume that it satisfies a fluctuation theorem such as a central limit theorem, that is, there exists a sequence of positive numbers $\{b_n, n \ge 1\}$ such that $b_n(\xi_n - \theta) \to Y$ in law, where $\theta$ is a constant and $Y$ is a nontrivial random variable (see Feng and Gao, 2008 and Gao and Zhao, 2011). Generally, a large deviation principle for $\{Y_n := r_n(\xi_n - \theta), n \ge 1\}$ is called a moderate deviation principle for $\{\xi_n, n \ge 1\}$, where $r_n$ is an intermediate scale between 1 and $b_n$, that is, $r_n \to \infty$ and $b_n / r_n \to \infty$. Roughly speaking, the LDP characterizes the speed of convergence in the law of large numbers, while the MDP gives the speed of convergence towards the limit law in the central limit theorem.

The paper is organized as follows: in Sections 2–4 we consider the MDP for Engel’s series, Sylvester’s series and Cantor’s products, respectively.

2. Engel’s series

Any real number $0 < x < 1$ can be represented uniquely in the form of Engel’s series,

$$x = \frac{1}{q_1} + \frac{1}{q_1 q_2} + \cdots + \frac{1}{q_1 \cdots q_n} + \cdots$$

where $q_n = q_n(x) \in \mathbb{N}$ and $q_{n+1} \ge q_n \ge 2$ for any $n \in \mathbb{N}$. Borel (1947) proved that for a.e. $x$,

$$\lim_{n\to\infty} (q_n)^{1/n} = e.$$

Lévy (1947) and Erdős et al. (1958) showed that

$$\lim_{n\to\infty} P\left( x \in (0,1) : \frac{\log q_n - n}{\sqrt{n}} \le t \right) = \Phi(t) \tag{1}$$

where $P$ is the Lebesgue measure and $\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$. Rényi and Révész (1958) showed that (1) also holds for any probability measure which is absolutely continuous with respect to the Lebesgue measure. Erdős et al. (1958) gave a simple proof of the above result by observing that $\{q_n\}$ is a homogeneous Markov chain with transition probabilities

$$P(q_n = k \mid q_{n-1} = j) = \frac{j-1}{k(k-1)}, \qquad k \ge j,$$

and $P(q_1 = k) = \frac{1}{k(k-1)}$, $k \ge 2$.

By explicit calculation of the limit of the Laplace functional of $q_n$, Zhu (2014) proved that $P\left(\frac{\log q_n}{n} \in \cdot\right)$ satisfies a large deviation principle. A natural topic is to study its moderate deviations, i.e., to consider the asymptotic behavior of

$$\frac{\log q_n - n}{a_n},$$

where $\{a_n, n \ge 1\}$ is a sequence of positive numbers such that $\sqrt{n} \ll a_n \ll n$. When $a_n = \sqrt{n}$, this is the central limit theorem for $\log q_n$; when $a_n = n$, this is the LDP for $\log q_n$. Our main result in this section is the following.

Theorem 1. Assume that $x$ is distributed uniformly on $(0, 1)$ and $\{q_n, n \ge 1\}$ is the Engel’s series. Let $Q$ be any probability measure on $(0, 1)$ which is equivalent to the Lebesgue measure $P$ such that the Radon–Nikodym derivatives satisfy $\frac{dP}{dQ} \in L^p(Q)$ and $\frac{dQ}{dP} \in L^p(P)$ for any $1 < p < \infty$. Then for any sequence of positive numbers $\{a_n, n \ge 1\}$ satisfying

$$\frac{a_n}{\sqrt{n \log a_n}} \to \infty, \qquad \frac{a_n}{n} \to 0,$$

$(\log q_n - n)/a_n$ satisfies MDP with speed $n a_n^{-2}$ and good rate function $I(x) = \frac{x^2}{2}$ under $Q$.
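To make the digits $q_n(x)$ in Theorem 1 concrete, the following minimal sketch computes Engel digits of a rational number with the usual greedy recursion $q_n = \lceil 1/x_n \rceil$, $x_{n+1} = q_n x_n - 1$. The recursion and the helper names `engel_digits` and `engel_partial_sum` are assumptions made for illustration (the paper does not spell out an algorithm). For rational inputs this greedy recursion terminates after finitely many digits, whereas for almost every $x$ the expansion is infinite.

```python
from fractions import Fraction
from math import ceil

def engel_digits(x, max_terms=20):
    """Greedy Engel digits of a rational x in (0, 1) (illustrative helper)."""
    digits = []
    while x > 0 and len(digits) < max_terms:
        q = ceil(1 / x)      # smallest integer q with 1/q <= x
        digits.append(q)
        x = x * q - 1        # remainder that drives the next digit
    return digits

def engel_partial_sum(digits):
    """Sum of 1/(q_1 * ... * q_k) over the given digits."""
    total, prod = Fraction(0), 1
    for q in digits:
        prod *= q
        total += Fraction(1, prod)
    return total

digits = engel_digits(Fraction(3, 7))
print(digits, engel_partial_sum(digits))
```

Exact rational arithmetic is used so that the partial sum can be compared with $x$ without rounding error; the digits produced are non-decreasing, as required by $q_{n+1} \ge q_n \ge 2$.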

Remark 1. In fact, one may expect the result to hold for any sequence of positive numbers $\{a_n, n \ge 1\}$ satisfying

$$\frac{a_n}{\sqrt{n}} \to \infty, \qquad \frac{a_n}{n} \to 0.$$

Here we need the condition $\frac{a_n}{\sqrt{n \log a_n}} \to \infty$ in order to avoid technical difficulties. The main reason is that we cannot get finer estimates if we adopt the Gärtner–Ellis theorem.


Remark 2. If we take $a_n = n^p$, $p \in (\frac12, 1)$, the condition in Theorem 1 is satisfied.

Proof. Let $E(\cdot)$ denote the expectation under $P$ and $E_Q(\cdot)$ denote the expectation under $Q$. First of all, we derive an expression for

$$\lim_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} (\log q_n - n)\lambda \right).$$

In fact, for any fixed $\lambda \in \mathbb{R}$, denoting $\theta_n = \lambda \frac{a_n}{n}$, we have

$$E e^{\theta_n \log q_n} = \sum_{k=2}^{\infty} P(q_n = k) k^{\theta_n} = \sum_{k=2}^{N-1} P(q_n = k) k^{\theta_n} + \sum_{k=N}^{\infty} P(q_n = k) k^{\theta_n},$$

where $N$ is a constant which will be chosen later. By (1.13) in Zhu (2014), we can see that for any $N$,

$$E e^{\theta_n \log q_n} - \sum_{k=N}^{\infty} P(q_n = k) k^{\theta_n} \le \frac{(N-1)^{|\theta_n|+1}}{2^n}. \tag{2}$$

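By the assumption on $a_n$, we can choose $n$ large enough such that $\theta_n \in (-\frac12, \frac12)$. For such $n$,

$$\frac{1}{1-\theta_n} = \int_1^\infty x^{\theta_n - 2}\,dx = \sum_{k=j}^{\infty} \int_{k/j}^{(k+1)/j} x^{\theta_n - 2}\,dx \le \sum_{k=j}^{\infty} \frac{1}{j} \left(\frac{k}{j}\right)^{\theta_n - 2} \le (1 + n^{-1}) \sum_{k=j}^{\infty} \frac{j-1}{k(k-1)} \left(\frac{k}{j}\right)^{\theta_n}$$

for all $j \ge N$, where $N$ is a constant depending on $a_n^2$ ($N = [a_n^2] + 2$). Now for the above $N$, we have

$$\sum_{k=N}^{\infty} P(q_n = k) k^{\theta_n} \ge \sum_{j=N}^{\infty} P(q_{n-1} = j) j^{\theta_n} \sum_{k=j}^{\infty} \frac{j-1}{k(k-1)} \left(\frac{k}{j}\right)^{\theta_n} \ge \frac{1}{1+n^{-1}} \frac{1}{1-\theta_n} \sum_{j=N}^{\infty} P(q_{n-1} = j) j^{\theta_n} \ge \cdots \ge C_0(n) \left( \frac{1}{1+n^{-1}} \frac{1}{1-\theta_n} \right)^{n}, \tag{3}$$

where $C_0(n)$ is a constant depending on $a_n^2$. In fact, $C_0(n) = \sum_{j=[a_n^2]+2}^{\infty} \frac{1}{j^{3/2}(j-1)}$. By (3), we deduce that

$$\liminf_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} (\log q_n - n)\lambda \right) = \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log E\left(q_n^{\theta_n}\right) \right] \ge \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log C_0(n) + \frac{n^2}{a_n^2} \log\left( \frac{1}{1+n^{-1}} \frac{1}{1-\theta_n} \right) \right]. \tag{4}$$

By the assumption on $a_n$, we know that

$$\lim_{n\to\infty} \frac{n}{a_n^2} \log C_0(n) = \lim_{n\to\infty} \frac{n}{a_n^2} \log\left( \sum_{j=[a_n^2]+2}^{\infty} \frac{1}{j^{3/2}(j-1)} \right) = \lim_{n\to\infty} \frac{n \log a_n}{a_n^2} = 0. \tag{5}$$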

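Using Taylor’s formula, we deduce that

$$\lim_{n\to\infty} \frac{n^2}{a_n^2} \log \frac{1}{1+n^{-1}} = 0. \tag{6}$$

Combining (4)–(6) and then using Taylor’s formula again, we obtain

$$\liminf_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} (\log q_n - n)\lambda \right) = \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log E\left(q_n^{\theta_n}\right) \right]$$

$$\ge \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log C_0(n) + \frac{n^2}{a_n^2} \log \frac{1}{1+n^{-1}} + \frac{n^2}{a_n^2} \log \frac{1}{1-\theta_n} \right]$$

$$= \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) - \frac{n^2}{a_n^2} \log(1-\theta_n) \right] = \frac{\lambda^2}{2}.$$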
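The Taylor step behind the value $\lambda^2/2$ is the expansion $-\theta_n - \log(1-\theta_n) = \theta_n^2/2 + O(\theta_n^3)$ with $\theta_n = \lambda a_n / n \to 0$, so that multiplying by $n^2/a_n^2 = \lambda^2/\theta_n^2$ leaves $\lambda^2/2$ in the limit. The short sketch below evaluates this main term numerically. The choice $a_n = n^{3/4}$ (an instance of Remark 2) and the function name `scaled_main_term` are assumptions made only for illustration.

```python
import math

def scaled_main_term(lam, n, p=0.75):
    """(n^2/a_n^2) * (-theta_n - log(1 - theta_n)) with a_n = n**p (assumed)."""
    a_n = n ** p
    theta = lam * a_n / n
    # log1p keeps the small difference -theta - log(1 - theta) accurate
    return (n / a_n) ** 2 * (-theta - math.log1p(-theta))

for n in (10**4, 10**6, 10**8):
    # the printed values move towards lam**2 / 2 = 0.5 as n grows
    print(n, scaled_main_term(1.0, n))
```

The remaining terms in (4), namely those involving $C_0(n)$ and $1 + n^{-1}$, are handled separately by (5) and (6) and are not part of this check.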




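On the other hand, for any $n$ such that $\theta_n \in (-\frac12, \frac12)$, we have

$$\frac{1}{1-\theta_n} = \int_1^\infty x^{\theta_n - 2}\,dx = \sum_{k=j}^{\infty} \int_{k/j}^{(k+1)/j} x^{\theta_n - 2}\,dx \ge \sum_{k=j}^{\infty} \frac{1}{j} \left(\frac{k+1}{j}\right)^{\theta_n - 2} \ge (1 - n^{-1}) \sum_{k=j}^{\infty} \frac{j-1}{k(k-1)} \left(\frac{k}{j}\right)^{\theta_n}$$

for any $j \ge N$, where $N$ is a sufficiently large positive integer depending on $a_n^2$, $N = [2a_n^2] + 1$. Thus, using an argument similar to the one in the proof of the lower bound, we have

$$\sum_{k=N}^{\infty} P(q_n = k) k^{\theta_n} \le C_1(n) \left( \frac{1}{1-n^{-1}} \frac{1}{1-\theta_n} \right)^{n},$$

where $C_1(n)$ is a constant depending on $a_n^2$. In fact, $C_1(n) = \sum_{j=[2a_n^2]+1}^{\infty} \frac{1}{(j-1)^{3/2}}$. So

$$\limsup_{n\to\infty} \frac{n}{a_n^2} \log \frac{(N-1)^{3/2}}{2^n} = \limsup_{n\to\infty} \frac{n}{a_n^2} \log a_n^3 - \limsup_{n\to\infty} \frac{n^2}{a_n^2} \log 2 = -\infty.$$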

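It follows from (2) and a Taylor expansion that

$$\limsup_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} (\log q_n - n)\lambda \right) \le \max\left\{ \limsup_{n\to\infty} \left[ \frac{n^2}{a_n^2}(-\theta_n) + \frac{n}{a_n^2} \log \frac{(N-1)^{3/2}}{2^n} \right],\ \limsup_{n\to\infty} \left[ \frac{n^2}{a_n^2}(-\theta_n) + \frac{n}{a_n^2} \log C_1(n) + \frac{n^2}{a_n^2} \log\left( \frac{1}{1-n^{-1}} \frac{1}{1-\theta_n} \right) \right] \right\} = \frac{\lambda^2}{2}.$$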

Thus we have proved that

$$\Gamma(\lambda) := \lim_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} (\log q_n - n)\lambda \right) = \frac{\lambda^2}{2}, \qquad \lambda \in \mathbb{R}.$$
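The limit $\Gamma(\lambda) = \lambda^2/2$ coincides with the log-moment generating function of a standard Gaussian, and its Legendre transform $\sup_{\lambda}\{\lambda x - \Gamma(\lambda)\}$ equals $x^2/2$, with the supremum attained at $\lambda = x$. The following sketch checks this numerically on a grid; the grid bounds and the helper names `gamma_limit` and `legendre_transform` are assumptions for illustration only.

```python
def gamma_limit(lam):
    """Limiting log-moment generating function Gamma(lambda) = lambda^2 / 2."""
    return lam * lam / 2.0

def legendre_transform(x, lo=-10.0, hi=10.0, steps=20001):
    """Approximate sup over lambda in [lo, hi] of lambda*x - Gamma(lambda)."""
    best = float("-inf")
    for i in range(steps):
        lam = lo + (hi - lo) * i / (steps - 1)   # grid spacing 0.001
        best = max(best, lam * x - gamma_limit(lam))
    return best

for x in (0.0, 0.5, 1.5, -2.0):
    print(x, legendre_transform(x), x * x / 2.0)
```

The grid search is only reliable for $|x|$ well inside the interval $[lo, hi]$, since the maximizer is $\lambda = x$.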

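By virtue of the Gärtner–Ellis theorem (see Dembo and Zeitouni, 1998), we know that under the probability measure $P$, $(\log q_n - n)/a_n$ satisfies MDP with speed $n a_n^{-2}$ and good rate function

$$I(x) = \sup_{\lambda \in \mathbb{R}} \{ \lambda x - \Gamma(\lambda) \} = \frac{x^2}{2}, \qquad x \in \mathbb{R},$$

i.e., for any Borel set $A$, we have

$$-\inf_{x \in A^\circ} \frac{x^2}{2} \le \liminf_{n\to\infty} \frac{n}{a_n^2} \log P\left( \frac{\log q_n - n}{a_n} \in A \right) \le \limsup_{n\to\infty} \frac{n}{a_n^2} \log P\left( \frac{\log q_n - n}{a_n} \in A \right) \le -\inf_{x \in \bar{A}} \frac{x^2}{2}. \tag{7}$$

For any Borel set $A$ and positive numbers $p, q$ such that $\frac1p + \frac1q = 1$, by Hölder’s inequality,

$$P\left( \frac{\log q_n - n}{a_n} \in A \right) = E_Q\left( I_{\left\{ \frac{\log q_n - n}{a_n} \in A \right\}} \frac{dP}{dQ} \right) \le Q\left( \frac{\log q_n - n}{a_n} \in A \right)^{\frac1p} \left( E_Q \left( \frac{dP}{dQ} \right)^{q} \right)^{\frac1q}.$$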

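The last factor is finite and does not depend on $n$, and $n/a_n^2 \to 0$, so it does not contribute in the limit. Since $p > 1$ is arbitrary, letting $p \to 1$ we get

$$\liminf_{n\to\infty} \frac{n}{a_n^2} \log P\left( \frac{\log q_n - n}{a_n} \in A \right) \le \liminf_{n\to\infty} \frac{n}{a_n^2} \log Q\left( \frac{\log q_n - n}{a_n} \in A \right). \tag{8}$$

Using the condition $\frac{dQ}{dP} \in L^p(P)$, we can also show that

$$\limsup_{n\to\infty} \frac{n}{a_n^2} \log P\left( \frac{\log q_n - n}{a_n} \in A \right) \ge \limsup_{n\to\infty} \frac{n}{a_n^2} \log Q\left( \frac{\log q_n - n}{a_n} \in A \right). \tag{9}$$

Combining (8), (9) and (7) completes the proof of Theorem 1.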



Remark 3. As remarked by Zhu (2014), the conditional probability measure $Q\left( \frac{\log q_n}{n} \in \cdot \mid q_1 = q \right)$, $q \ge 2$, also satisfies an LDP. By the same method as above, we can also show that $Q\left( \frac{1}{a_n}(\log q_n - n) \in \cdot \mid q_1 = q \right)$ satisfies an MDP with speed $n a_n^{-2}$ and good rate function $I(x) = \frac{x^2}{2}$. The proof is routine.
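The Markov chain description of $\{q_n\}$ used in this section also makes the digits easy to simulate. Summing the transition probabilities gives the tail $P(q_n \ge k \mid q_{n-1} = j) = (j-1)/(k-1)$ for $k \ge j$, so one step can be sampled by inversion as $q_n = \lfloor (j-1)/U \rfloor + 1$ with $U$ uniform on $(0, 1]$; the law of $q_1$ corresponds to $j = 2$. The sketch below is a hedged illustration under Lebesgue measure $P$: the function names, the sample sizes and the use of floating-point arithmetic (adequate only for moderate $n$) are assumptions, not part of the paper.

```python
import math
import random

def next_engel_digit(j, u):
    """Inverse-transform step for P(q_n >= k | q_{n-1} = j) = (j - 1)/(k - 1)."""
    return math.floor((j - 1) / u) + 1

def sample_log_qn(n, rng):
    """Simulate q_1, ..., q_n and return log(q_n); q_1 is drawn with j = 2."""
    q = 2
    for _ in range(n):
        u = 1.0 - rng.random()          # uniform on (0, 1]
        q = next_engel_digit(q, u)
    return math.log(q)

def standardized_samples(n, size, seed=0):
    """Samples of (log q_n - n)/sqrt(n), the statistic appearing in (1)."""
    rng = random.Random(seed)
    return [(sample_log_qn(n, rng) - n) / math.sqrt(n) for _ in range(size)]

samples = standardized_samples(n=50, size=2000, seed=1)
print(sum(samples) / len(samples))      # should be close to 0 by (1)
```

Replacing $\sqrt{n}$ by an intermediate scale $a_n$ gives samples of $(\log q_n - n)/a_n$, although estimating the exponentially small probabilities in Theorem 1 by plain Monte Carlo would need far more samples than this sketch uses.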


3. Sylvester’s series

For a real number $x \in (0, 1)$, Sylvester’s series is defined via the expansion

$$x = \frac{1}{Q_1} + \frac{1}{Q_2} + \cdots + \frac{1}{Q_n} + \cdots$$

where $Q_{n+1} \ge Q_n(Q_n - 1) + 1$, $Q_n \ge 2$, $Q_n \in \mathbb{N}$. Let $P$ be the Lebesgue measure on $(0, 1)$. Erdős et al. (1958) observed that $\{Q_n\}$ is a homogeneous Markov chain under $P$ with

$$P(Q_1 = k_1, Q_2 = k_2, \ldots, Q_n = k_n) = \frac{1}{k_n(k_n - 1)}, \tag{10}$$

$$P(Q_n = k \mid Q_{n-1} = j) = \frac{j(j-1)}{k(k-1)} \tag{11}$$

for any $j \ge 2$ and $k \ge j(j-1) + 1$, and proved the central limit theorem

$$\lim_{n\to\infty} P\left( \frac{\log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n}{\sqrt{n}} \le t \right) = \Phi(t),$$

where $\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$. Zhu (2014) proved that $P\left( \frac1n \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} \in \cdot \right)$ satisfies a large deviation principle. We have the following.

Theorem 2. Assume that $x$ is a uniformly distributed random variable on $(0, 1)$ and $\{Q_n, n \ge 1\}$ is the Sylvester’s series. Let $Q$ be as in Theorem 1. Then for any sequence of positive numbers $\{a_n, n \ge 1\}$ satisfying

$$\frac{a_n}{\sqrt{n \log n}} \to \infty, \qquad \frac{a_n}{n} \to 0,$$

$\left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) / a_n$ satisfies MDP with speed $n a_n^{-2}$ and good rate function $I(x) = \frac{x^2}{2}$ under $Q$.
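As an illustration of the quantities in Theorem 2, the sketch below computes Sylvester digits of a rational number by the greedy rule $Q_n = \lceil 1/x_n \rceil$, $x_{n+1} = x_n - 1/Q_n$, and evaluates $\log\big(Q_n / \prod_{i=1}^{n-1} Q_i\big)$. The greedy rule and the helper names are assumptions made for illustration (the paper does not state an algorithm), and for rational inputs the recursion stops after finitely many digits.

```python
from fractions import Fraction
from math import ceil, log

def sylvester_digits(x, max_terms=10):
    """Greedy Sylvester digits of a rational x in (0, 1) (illustrative helper)."""
    digits = []
    while x > 0 and len(digits) < max_terms:
        q = ceil(1 / x)          # smallest integer q with 1/q <= x
        digits.append(q)
        x -= Fraction(1, q)      # subtract the unit fraction just used
    return digits

def sylvester_statistic(digits):
    """log(Q_n / (Q_1 * ... * Q_{n-1})) for the last digit Q_n in the list."""
    *head, last = digits
    return log(last) - sum(log(q) for q in head)

digits = sylvester_digits(Fraction(3, 7))
print(digits, sylvester_statistic(digits))
```

The digits grow very quickly because of the constraint $Q_{n+1} \ge Q_n(Q_n - 1) + 1$, which is why `max_terms` is kept small.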

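Proof. As before, $E(\cdot)$ denotes the expectation under $P$ and $E_Q(\cdot)$ denotes the expectation under $Q$. Firstly we will prove that

$$\lim_{n\to\infty} \Gamma_n(\lambda) := \lim_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} \left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) \lambda \right) = \frac{\lambda^2}{2}. \tag{12}$$

For any fixed real number $\lambda$, denoting $\theta_n = \lambda \frac{a_n}{n}$ and $E\left( \exp \theta_n \left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) \right) = \Upsilon_n(\lambda)$, by (10) and (11) we have

$$\Upsilon_n(\lambda) = e^{-n\theta_n} E\left( \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} \right)^{\theta_n} = e^{-n\theta_n} \sum_{k_1, \ldots, k_n} \left( \frac{k_n}{\prod_{i=1}^{n-1} k_i} \right)^{\theta_n} \frac{1}{k_n(k_n - 1)}$$

$$= e^{-n\theta_n} \sum_{k_1, \ldots, k_{n-1}} \left( \frac{k_{n-1}}{\prod_{i=1}^{n-2} k_i} \right)^{\theta_n} \frac{1}{k_{n-1}(k_{n-1} - 1)} \sum_{k_n \ge k_{n-1}(k_{n-1}-1)+1} \left( \frac{k_n}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{k_n(k_n - 1)},$$

where $k_n \ge k_{n-1}(k_{n-1} - 1) + 1 \ge k_{n-1} + 1 \ge n + 1$. Note that $k_{n-1} \ge n$. We can choose $n$ sufficiently large such that $\theta_n \in (-\frac14, \frac14)$. For such $n$, when $k_n = j \ge n + 1$ and $k_{n-1} \ge n$, we have

$$(1 - n^{-1}) k_{n-1}^2 < k_{n-1}(k_{n-1} - 1) + 1 < (1 + n^{-1}) k_{n-1}^2, \qquad k_{n-1}^2 < (1 - n^{-1})^{-1} k_{n-1}(k_{n-1} - 1),$$

$$(1 - n^{-1}) j^{\theta_n} < (j+1)^{\theta_n} < (1 + n^{-1}) j^{\theta_n}, \qquad (j+1)^2 < (1 + n^{-1})^2 j^2, \qquad j^2 \le (1 + n^{-1}) j(j-1).$$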


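Thus, writing $m = k_{n-1}(k_{n-1} - 1) + 1$ for brevity, we get

$$\frac{1}{1-\theta_n} = \int_1^\infty x^{\theta_n - 2}\,dx = \sum_{j=m}^{\infty} \int_{j/m}^{(j+1)/m} x^{\theta_n - 2}\,dx \ge \sum_{j=m}^{\infty} \frac{1}{m} \left( \frac{j+1}{m} \right)^{\theta_n - 2} = \sum_{j=m}^{\infty} \frac{(j+1)^{\theta_n}}{(j+1)^2} \frac{1}{m^{\theta_n - 1}}$$

$$\ge (1 - n^{-1}) \frac{(n-1)^2}{(n+1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}^2}{(j+1)^2}$$

$$\ge (1 - n^{-1}) \frac{(n-1)^2}{(n+1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{1}{(1+n^{-1})^2} \frac{k_{n-1}^2}{j^2}$$

$$\ge (1 - n^{-1}) \frac{(n-1)^2}{(n+1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{1}{(1+n^{-1})^3} \frac{k_{n-1}(k_{n-1} - 1)}{j(j-1)}$$

$$\ge \frac{(n-1)^5}{(n+1)^5} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{j(j-1)}$$

and similarly,

$$\frac{1}{1-\theta_n} = \int_1^\infty x^{\theta_n - 2}\,dx = \sum_{j=m}^{\infty} \int_{j/m}^{(j+1)/m} x^{\theta_n - 2}\,dx \le \sum_{j=m}^{\infty} \frac{1}{m} \left( \frac{j}{m} \right)^{\theta_n - 2} = \sum_{j=m}^{\infty} \frac{j^{\theta_n}}{j^2} \frac{1}{m^{\theta_n - 1}}$$

$$\le (1 + n^{-1})^2 \frac{(n+1)^2}{(n-1)^2} \sum_{j=m}^{\infty} \frac{j^{\theta_n}}{j(j-1)} \frac{1}{\left(k_{n-1}^2\right)^{\theta_n - 1}} = (1 + n^{-1})^2 \frac{(n+1)^2}{(n-1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}^2}{j(j-1)}$$

$$\le (1 + n^{-1})^2 \frac{(n+1)^2}{(n-1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{(1 - n^{-1}) j(j-1)}$$

$$= \frac{(1 + n^{-1})^2}{1 - n^{-1}} \frac{(n+1)^2}{(n-1)^2} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{j(j-1)} \le \frac{(n+1)^4}{(n-1)^4} \sum_{j=m}^{\infty} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{j(j-1)}.$$

It follows from the above inequalities that for sufficiently large $n$,

$$\left( \frac{n-1}{n+1} \right)^4 \frac{1}{1-\theta_n} \le \sum_{j \ge k_{n-1}(k_{n-1}-1)+1} \left( \frac{j}{k_{n-1}^2} \right)^{\theta_n} \frac{k_{n-1}(k_{n-1} - 1)}{j(j-1)} \le \left( \frac{n+1}{n-1} \right)^5 \frac{1}{1-\theta_n}.$$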

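By iteration, we get that for sufficiently large $n$,

$$e^{-n\theta_n} \frac{36^2}{n^4 (n+1)^4} \left( \frac{1}{1-\theta_n} \right)^{n-2} E\left( \frac{Q_2}{Q_1} \right)^{-\frac14} \le \Upsilon_n(\lambda) \le e^{-n\theta_n} \frac{n^5 (n+1)^5}{6^5} \left( \frac{1}{1-\theta_n} \right)^{n-2} E\left( \frac{Q_2}{Q_1} \right)^{\frac14},$$

where $E\left( \frac{Q_2}{Q_1} \right)^{-\frac14}$ and $E\left( \frac{Q_2}{Q_1} \right)^{\frac14}$ are two constants. Thus, using a Taylor expansion we obtain that

$$\liminf_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} \left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) \lambda \right) = \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log E\left( \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} \right)^{\theta_n} \right]$$

$$\ge \liminf_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log \frac{36^2}{n^4 (n+1)^4} + \frac{n(n-2)}{a_n^2} \log \frac{1}{1-\theta_n} \right] = \frac{\lambda^2}{2}$$

and

$$\limsup_{n\to\infty} \frac{n}{a_n^2} \log E \exp\left( \frac{a_n}{n} \left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) \lambda \right) = \limsup_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log E\left( \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} \right)^{\theta_n} \right]$$

$$\le \limsup_{n\to\infty} \left[ \frac{n^2}{a_n^2} (-\theta_n) + \frac{n}{a_n^2} \log \frac{n^5 (n+1)^5}{6^5} + \frac{n(n-2)}{a_n^2} \log \frac{1}{1-\theta_n} \right] = \frac{\lambda^2}{2}.$$

So far we have established (12). By the Gärtner–Ellis theorem we know that under the probability measure $P$, the sequence $\left( \log \frac{Q_n}{\prod_{i=1}^{n-1} Q_i} - n \right) / a_n$ satisfies MDP with speed $n a_n^{-2}$ and good rate function

$$I(x) = \sup_{\lambda \in \mathbb{R}} \{ \lambda x - \Gamma(\lambda) \} = \frac{x^2}{2}, \qquad x \in \mathbb{R}.$$

Finally, using the same method as in the last part of the proof of Theorem 1, we obtain our conclusion.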



4. Cantor’s products

Cantor’s product is defined via

$$x = \prod_{n=1}^{\infty} \left( 1 + \frac{1}{S_n} \right),$$

where $1 < x < 2$, $S_n \in \mathbb{N}$ and $S_{n+1} \ge S_n^2$. Rényi (1958) showed that $\{S_n\}$ is a Markov chain with transition probabilities

$$P(S_{n+1} = k \mid S_n = j) = \frac{j^2 - 1}{k(k-1)},$$

where $k \ge j^2$, $j \ge 2$, $k, j \in \mathbb{N}$. He also showed the strong law of large numbers

$$\lim_{n\to\infty} \left( \frac{S_n}{\prod_{i=1}^{n-1} S_i} \right)^{\frac1n} = e$$

and the central limit theorem

$$\lim_{n\to\infty} P\left( \frac{\log \frac{S_n}{\prod_{i=1}^{n-1} S_i} - n}{\sqrt{n}} \le t \right) = \Phi(t),$$

where $P$ is the probability measure under which $x$ is distributed uniformly on $(1, 2)$ and $\Phi(t) = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$. Zhu (2014) proved that $P\left( \frac1n \log \frac{S_n}{\prod_{i=1}^{n-1} S_i} \in \cdot \right)$ satisfies a large deviation principle. Similarly to the proof of Theorem 2, we can obtain the following result. We omit its proof.

Theorem 3. Assume that $x$ is a uniformly distributed random variable on $(1, 2)$ and $\{S_n, n \ge 1\}$ is the Cantor’s products. Let the probability measure $Q$ and $\{a_n, n \ge 1\}$ be as in Theorem 2. Then under $Q$, $\left( \log \frac{S_n}{\prod_{i=1}^{n-1} S_i} - n \right) / a_n$ satisfies MDP with speed $n a_n^{-2}$ and good rate function $I(x) = \frac{x^2}{2}$, $x \in \mathbb{R}$.
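For a concrete view of the digits $S_n$ in Theorem 3, the sketch below generates the first few factors of a Cantor product for a rational $x \in (1, 2)$. It assumes the greedy rule "take the smallest $s$ with $1 + 1/s < x$, then divide $x$ by $1 + 1/s$"; this rule, the strict inequality at ties and the helper names are assumptions for illustration, since the paper only states the form of the product and the constraint $S_{n+1} \ge S_n^2$.

```python
from fractions import Fraction
from math import floor

def cantor_digits(x, max_terms=5):
    """First digits S_1, S_2, ... of a Cantor product for rational x in (1, 2)."""
    digits = []
    for _ in range(max_terms):
        s = floor(1 / (x - 1)) + 1    # smallest integer s with 1 + 1/s < x
        digits.append(s)
        x = x * s / (s + 1)           # divide out the factor (1 + 1/s)
    return digits

def cantor_partial_product(digits):
    """Product of (1 + 1/S_i) over the given digits."""
    prod = Fraction(1)
    for s in digits:
        prod *= Fraction(s + 1, s)
    return prod

digits = cantor_digits(Fraction(3, 2), max_terms=4)
print(digits, float(cantor_partial_product(digits)))
```

Under this rule each digit is at least the square of the previous one, so the integers grow doubly exponentially and only a handful of terms can be computed exactly.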


References

Borel, E., 1947. Sur les développements unitaires normaux. C. R. Acad. Sci. Paris 225, 51.
Dembo, A., Zeitouni, O., 1998. Large Deviations Techniques and Applications. Springer-Verlag, New York.
Erdős, P., Szüsz, P., Rényi, A., 1958. On Engel’s and Sylvester series. Ann. Univ. L. Eotvos (Sect. Math.) 1, 7–32.
Feng, S., Gao, F., 2008. Moderate deviations for Poisson–Dirichlet distribution. Ann. Appl. Probab. 18 (5), 1794–1824.
Gao, F., Zhao, X., 2011. Delta method in large deviations and moderate deviations for estimators. Ann. Statist. 39 (2), 1211–1240.
Lévy, P., 1947. Remarques sur un théorème de M. Émile Borel. C. R. Acad. Sci. Paris 225, 918–919.
Rényi, A., 1958. On Cantor’s products. Colloq. Math. 6, 135–139.
Rényi, A., Révész, P., 1958. On mixing sequences of random variables. Acta Math. Acad. Sci. Hungar. 9, 389–393.
Zhu, L., 2014. On the large deviations for Engel’s, Sylvester’s series and Cantor’s products. Electron. Comm. Probab. 19 (2), 1–9.