On the almost sure invariance principle for dependent Bernoulli random variables


Statistics and Probability Letters 107 (2015) 264–271


Yang Zhang ∗, Li-Xin Zhang
School of Mathematical Sciences, Zhejiang University, Hangzhou 310027, China

Article history: Received 5 July 2015; Received in revised form 5 September 2015; Accepted 6 September 2015; Available online 11 September 2015.

MSC: 60F05; 60F17; 60G15

Abstract: We consider a sequence of dependent Bernoulli variables where the success probability of each trial, conditional on the past history, is a linear function of the mean number of successes achieved to that point. An almost sure invariance principle is established for the partial sum, and we also generalize the model to a multi-dimensional case, extending the results of Heyde (2004), James et al. (2008) and Wu et al. (2012). © 2015 Elsevier B.V. All rights reserved.

Keywords: Dependent Bernoulli random variables; Martingales; Central limit theorem; Almost sure invariance principle

1. Introduction

Consider a sequence of dependent Bernoulli random variables $\{X_n, n \ge 1\}$ where the $X_n$ are dependent in the following way: the success probability of each trial, conditional on all the previous trials, is a linear function of the mean number of successes achieved to that point. Precisely speaking, $X_1$ follows a Bernoulli distribution with parameter $p$ and, for $n \ge 1$,

\[ P(X_{n+1} = 1 \mid \mathcal{F}_n) = \alpha_n + \beta_n n^{-1} S_n, \tag{1} \]

where $S_n = \sum_{i=1}^{n} X_i$, $\alpha_n$ and $\beta_n$ are non-negative dependence parameters satisfying $\alpha_n + \beta_n \le 1$, and $\mathcal{F}_n = \sigma\{X_1, \ldots, X_n\}$ denotes the $\sigma$-field generated by $X_1, \ldots, X_n$. When $\alpha_n = p$ and $\beta_n = 0$, $\{X_n, n \ge 1\}$ reduces to the classical i.i.d. Bernoulli sequence. The parameters offer the possibility of allowing for overdispersion and thus make the model more flexible. Drezner and Farnum (1993) first introduced this generalized binomial model and derived the distribution of $S_n$. Their model was a special case of (1) with $\alpha_n = (1-\theta)p$ and $\beta_n = \theta$ for some $\theta \in (0,1)$. Heyde (2004) further studied their model and obtained the central limit theorem for the case $\theta \le 1/2$; he also obtained an almost sure convergence result when $\theta > 1/2$. The linear structure may indeed arise naturally: Drezner and Farnum (1993) studied 31 years of final standings of major league baseball teams, a data set which their generalized binomial model fits better. See Drezner and Farnum (1993) for more properties of the generalized binomial model.
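As a quick numerical illustration of model (1) (our own sketch, not part of the original paper; the parameter choices below are arbitrary), the sequence can be simulated directly from the conditional success probability:

```python
import numpy as np

def simulate_model(n, alpha, beta, p, rng):
    """Simulate X_1, ..., X_n from model (1):
    P(X_{k+1} = 1 | F_k) = alpha_k + beta_k * S_k / k."""
    X = np.zeros(n)
    X[0] = rng.random() < p          # X_1 ~ Bernoulli(p)
    S = X[0]
    for k in range(1, n):
        prob = alpha[k - 1] + beta[k - 1] * S / k
        X[k] = rng.random() < prob
        S += X[k]
    return X

rng = np.random.default_rng(0)
# Drezner-Farnum special case: alpha_n = (1 - theta) * p, beta_n = theta
theta, p, n = 0.3, 0.4, 10_000
X = simulate_model(n, np.full(n, (1 - theta) * p), np.full(n, theta), p, rng)
print(X.mean())   # S_n / n stays close to p = 0.4 here (theta < 1/2)
```

In the Drezner-Farnum case every $X_k$ has mean $p$, so the sample mean hovers near $p$ even though the trials are positively correlated.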



Corresponding author. E-mail addresses: [email protected] (Y. Zhang), [email protected] (L.-X. Zhang).

http://dx.doi.org/10.1016/j.spl.2015.09.008 0167-7152/© 2015 Elsevier B.V. All rights reserved.


Recently, James et al. (2008) extended Drezner and Farnum's model to $\theta = \theta_n$. Under suitable conditions, they derived the strong law of large numbers, the central limit theorem and the law of the iterated logarithm for $S_n$. Wu et al. (2012) explored model (1), obtaining the central limit theorem and the law of the iterated logarithm; they also considered the case where the conditional success probability involves a general function of $S_n/n$ and derived a strong law of large numbers. In the present paper, we first derive an almost sure invariance principle (ASIP) for model (1), which is a further development of the results mentioned above. The almost sure invariance principle is an important topic in probability theory and has received considerable attention in the literature; numerous related results have been established. We recommend Wu (2007) and the references therein for a brief introduction to the ASIP. We then generalize the model to a multi-dimensional case, where we keep the linear structure in order to prove limit theorems.

The rest of the paper is organized as follows. Section 2 contains a brief summary of the preliminary results. Section 3 deals with the almost sure invariance principle for model (1). In Section 4, we introduce a multi-dimensional model and derive its limit theorems. In the sequel, by $x_n \sim y_n$ we mean $\lim_{n\to\infty} x_n/y_n = 1$. For two sequences of r.v.'s $X_n$ and $Y_n$, $X_n = o_p(Y_n)$ and $X_n = o_{a.s.}(Y_n)$ mean $X_n/Y_n \to 0$ in probability and almost surely, respectively. The notation $\log$ is used for the natural logarithm and $\xrightarrow{d}$ for convergence in distribution. All limits are taken as $n \to \infty$ unless specified otherwise.

2. Preliminary results

Here we briefly summarize some of the main results related to model (1). For $n \ge 1$, write

\[ p_n = P(X_n = 1), \qquad a_n = \prod_{j=1}^{n-1}\left(1 + \frac{\beta_j}{j}\right), \qquad A_n^2 = \sum_{j=1}^{n} \frac{1}{a_j^2} \quad \text{and} \quad B_n^2 = \sum_{j=1}^{n} \frac{p_j(1-p_j)}{a_j^2}, \]

where $\prod_{i=1}^{0} x_i = 1$. Wu et al. (2012) proved that

\[ \lim_{n\to\infty} \frac{S_n - E(S_n)}{n} = 0 \quad \text{a.s.} \tag{2} \]

if and only if

\[ \sum_{j=1}^{\infty} \frac{1-\beta_j}{1+j} = \infty. \]

If the conditions

\[ \lim_{n\to\infty} B_n = \infty \quad \text{and} \quad \limsup_{n\to\infty} \frac{A_n}{B_n} < \infty \tag{3} \]

are satisfied, then (2) holds and, furthermore, we have the following central limit theorem and law of the iterated logarithm:

\[ \frac{S_n - E(S_n)}{a_n B_n} \xrightarrow{d} N(0,1), \tag{4} \]

\[ \limsup_{n\to\infty} \left(\pm \frac{S_n - E(S_n)}{a_n B_n \sqrt{\log\log(B_n)}}\right) = \sqrt{2} \quad \text{a.s.} \tag{5} \]
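The normalization in (4) can be probed by simulation (an illustrative sketch of ours, not from the paper), using the Drezner-Farnum case $\alpha_n = (1-\theta)p$, $\beta_n = \theta$, for which $p_j = p$ and both $a_n$ and $B_n^2$ are directly computable:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, p, n, reps = 0.3, 0.4, 1000, 1000

# a_j = prod_{i=1}^{j-1} (1 + theta/i)  and  B_n^2 = sum_j p(1-p)/a_j^2
a = np.ones(n)
a[1:] = np.cumprod(1 + theta / np.arange(1, n))
Bn = np.sqrt((p * (1 - p) / a**2).sum())

def sample_Sn():
    S = float(rng.random() < p)
    for k in range(1, n):
        S += rng.random() < (1 - theta) * p + theta * S / k
    return S

# E(S_n) = n*p in this special case, so standardize by a_n*B_n as in (4)
Z = np.array([(sample_Sn() - n * p) / (a[-1] * Bn) for _ in range(reps)])
print(round(Z.mean(), 2), round(Z.std(), 2))   # roughly 0 and 1
```

The empirical mean and standard deviation of the standardized partial sums should be close to 0 and 1, in line with (4).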

The key step in the proofs is to introduce a martingale and derive limit theorems for it. To be specific, define

\[ M_n = \frac{S_n - E(S_n)}{a_n} \quad \text{and} \quad D_1 = M_1, \qquad D_j = M_j - M_{j-1}, \quad j \ge 2. \]

From the calculations on page 459 of Wu et al. (2012), we know that $\{M_n, \mathcal{F}_n, n \ge 1\}$ is a martingale and

\[ D_j = \frac{X_j - E(X_j)}{a_j} - \frac{\beta_{j-1}}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j}, \quad j \ge 2. \]

Moreover, it follows from (1) that

\[
E(D_j^2 \mid \mathcal{F}_{j-1})
= E\left(\frac{X_j - 2p_j X_j + p_j^2}{a_j^2} \,\Big|\, \mathcal{F}_{j-1}\right) + \frac{\xi_j}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j^2}
\]
\[
= \frac{(1-2p_j)\left(\alpha_{j-1} + (j-1)^{-1}\beta_{j-1} S_{j-1}\right) + p_j^2}{a_j^2} + \frac{\xi_j}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j^2}
\]
\[
= \frac{(1-2p_j)\left(\alpha_{j-1} + (j-1)^{-1}\beta_{j-1} E(S_{j-1})\right) + p_j^2}{a_j^2} + \frac{\xi_j}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j^2}
\]
\[
= \frac{(1-2p_j)p_j + p_j^2}{a_j^2} + \frac{\xi_j}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j^2}
= \frac{p_j(1-p_j)}{a_j^2} + \frac{\xi_j}{j-1}\,\frac{S_{j-1} - E(S_{j-1})}{a_j^2}, \tag{6}
\]

where $\xi_j = \xi_j(\omega)$ may differ from line to line and there exists a finite $C > 0$ such that $|\xi_j(\omega)| < C$ for all $\omega \in \Omega$ and $j \ge 1$. The above conclusions and notations are important in proving asymptotic properties of $S_n$ and will be used frequently in the sections to follow.

3. An almost sure invariance principle for $S_n$

In this section, we derive a more advanced result for the partial sum $S_n$, namely the almost sure invariance principle, which is a further development of the results obtained by Heyde (2004), James et al. (2008) and Wu et al. (2012). From the CLT and LIL in Section 2, one may have noticed that $a_n^{-1}(S_n - E(S_n))$ shares similar properties with $W(B_n^2)$, where $\{W(t), t \ge 0\}$ denotes a standard Brownian motion. So a natural question arises: can we (possibly on a new probability space) approximate $a_n^{-1}(S_n - E(S_n))$ by $W(B_n^2)$ plus a remainder which is small in some stochastic sense? The following theorem gives such an approximation.

Theorem 1. Assume that (3) holds. Then we can redefine $\{X_n, n \ge 1\}$ on a new probability space without changing its distribution, and there exists a standard Brownian motion $\{W(t), t \ge 0\}$ on the same probability space such that

\[ \frac{S_n - E(S_n)}{a_n} - W(B_n^2) = o_{a.s.}\left(\sqrt{B_n^2 \log\log B_n}\right), \tag{7} \]

\[ \frac{S_n - E(S_n)}{a_n} - W(B_n^2) = o_p(B_n). \tag{8} \]

Proof. According to the Skorokhod embedding theorem (e.g., page 269 in Hall and Heyde, 1980), we can redefine $\{D_n, \mathcal{F}_n, n \ge 1\}$ (and hence $\{X_n, \mathcal{F}_n, n \ge 1\}$) on a new probability space without changing its distribution and, on the same probability space, there exists a standard Brownian motion $\{W(t), t \ge 0\}$ together with an $\mathcal{F}_n$-adapted sequence of non-negative random variables $\tau_n$ such that $M_n \stackrel{d}{=} W(T_n)$, where $T_n = \sum_{i=1}^{n} \tau_i$ and

\[ E(\tau_j \mid \mathcal{F}_{j-1}) = E(D_j^2 \mid \mathcal{F}_{j-1}), \qquad E(\tau_j^p \mid \mathcal{F}_{j-1}) \le C_p\, E(|D_j|^{2p} \mid \mathcal{F}_{j-1}) \quad \text{a.s.} \tag{9} \]

for $p \ge 1$ and some positive number $C_p$. Without loss of generality, we can assume $M_n = W(T_n)$. Recalling that $n^{-1}(S_n - E(S_n)) \to 0$ almost surely under (3), it follows from (6) that

\[ E(D_j^2 \mid \mathcal{F}_{j-1}) = \frac{p_j(1-p_j)}{a_j^2} + o_{a.s.}\left(\frac{1}{a_j^2}\right) \quad \text{as } j \to \infty, \]

which further implies

\[ \sum_{j=1}^{n} E(\tau_j \mid \mathcal{F}_{j-1}) = B_n^2 + o_{a.s.}(A_n^2) = B_n^2 + o_{a.s.}(B_n^2). \tag{10} \]

Note that $\hat{\tau}_j = \tau_j - E(\tau_j \mid \mathcal{F}_{j-1})$ are martingale differences with respect to $\mathcal{F}_j$ and, since $|D_j| \le 2/a_j$, by (9) we have

\[ E(\hat{\tau}_j^2 \mid \mathcal{F}_{j-1}) = E(\tau_j^2 \mid \mathcal{F}_{j-1}) - E^2(\tau_j \mid \mathcal{F}_{j-1}) \le C a_j^{-4} \]


almost surely for some finite $C > 0$. Hence,

\[
\sum_{j=1}^{\infty} \frac{E(\hat{\tau}_j^2 \mid \mathcal{F}_{j-1})}{A_j^4}
\le C \sum_{j=1}^{\infty} \frac{a_j^{-4}}{A_j^4}
\le C \sum_{j=1}^{\infty} \frac{a_j^{-2}}{A_j^4}
= \frac{C a_1^{-2}}{A_1^4} + C \sum_{j=2}^{\infty} \frac{A_j^2 - A_{j-1}^2}{A_j^4}
\le C a_1^2 + C \sum_{j=2}^{\infty} \frac{A_j^2 - A_{j-1}^2}{A_j^2 A_{j-1}^2}
= C a_1^2 + C \sum_{j=2}^{\infty} \left(\frac{1}{A_{j-1}^2} - \frac{1}{A_j^2}\right)
\le 2C a_1^2 < \infty \quad \text{a.s.}
\]

(here we use $a_j \ge 1$, $A_1^2 = a_1^{-2}$ and the telescoping of the last sum). It follows from Theorem 2.18 of Hall and Heyde (1980) and Kronecker's lemma that

\[ \sum_{j=1}^{n} \hat{\tau}_j = o_{a.s.}(A_n^2) = o_{a.s.}(B_n^2), \]

which, together with (10), yields

\[ T_n = \sum_{j=1}^{n} \tau_j = B_n^2 + o_{a.s.}(B_n^2). \]

Since $B_n \to \infty$, in view of Theorem 1.2.1 of Csörgő and Révész (1981), we obtain

\[ M_n = W(T_n) = W(B_n^2) + o_{a.s.}\left(\sqrt{B_n^2 \log\log B_n}\right), \]

which proves (7). It remains to verify (8). For any $\varepsilon > 0$ and $\delta > 0$ (we can choose $\delta < 1$),

\[
P\left(\frac{|W(T_n) - W(B_n^2)|}{B_n} > \varepsilon\right)
= P\left(\frac{|W(T_n) - W(B_n^2)|}{B_n} > \varepsilon,\ |T_n - B_n^2| > \delta B_n^2\right)
+ P\left(\frac{|W(T_n) - W(B_n^2)|}{B_n} > \varepsilon,\ |T_n - B_n^2| \le \delta B_n^2\right)
=: I_1 + I_2.
\]

Obviously, $I_1 \to 0$ as $n \to \infty$. Since $B_n^2$ is non-stochastic, by a variable substitution (the scaling property of $W$) we have

\[ I_2 \le P\left(\sup_{|s-1|\le\delta} |W(s) - W(1)| > \varepsilon\right). \]

Using the Lévy modulus of continuity for the Wiener process, we conclude that $I_2 \to 0$ as $\delta \to 0$. Choosing $\delta$ small enough and then letting $n \to \infty$ now yields (8), which completes the proof. □

Remark 1. If we could estimate $T_n$ more accurately, say $(T_n - B_n^2)/B_n^{2-\delta} \to 0$ for some $\delta > 0$, then we could derive a smaller remainder $o_{a.s.}(B_n^{\gamma})$ with $\gamma > 1 - \delta/2$. However, to obtain such results, one may need the convergence rate in the strong law of large numbers for $(S_n - E(S_n))/n$, which is not easy to derive.

Remark 2. It would be helpful and interesting to consider more general dependence relations, as in Wu et al. (2012). But since in that case it is not easy to construct a martingale related to $S_n$, we have not found a good way to derive similar asymptotic results.

By the central limit theorem and the law of the iterated logarithm for a standard Brownian motion, we easily obtain the following corollary, which combines Theorems 3 and 4 of Wu et al. (2012) and also demonstrates the power of the almost sure invariance principle.

Corollary 1. If condition (3) is satisfied, then the CLT (4) and the LIL (5) hold.

For the original model proposed by Drezner and Farnum (1993), we have an explicit form of $a_n$ and, as a consequence, the following approximations for $S_n$ can be derived.


Corollary 2. Consider the following model proposed by Drezner and Farnum (1993): let $\{X_n, n \ge 1\}$ be a sequence of Bernoulli variables with $X_1 \sim \text{Bernoulli}(p)$ and, for $n \ge 1$,

\[ P(X_{n+1} = 1 \mid \mathcal{F}_n) = \theta S_n/n + (1-\theta)p. \]

Then we can redefine $\{X_n, n \ge 1\}$ on a new probability space without changing its distribution, and the new probability space supports a standard Brownian motion $\{W(t), t \ge 0\}$ together with two Gaussian processes $\{G(t), t \ge 0\}$ and $\{\tilde{G}(t), t \ge 1\}$ such that:

(a) If $0 < \theta < 1/2$, then

\[ S_n - np = \sqrt{\frac{p(1-p)}{1-2\theta}}\, G(n) + o_{a.s.}\left(\sqrt{n \log\log n}\right), \tag{11} \]

\[ S_n - np = \sqrt{\frac{p(1-p)}{1-2\theta}}\, G(n) + o_p(\sqrt{n}), \tag{12} \]

where $\{G(t), t \ge 0\} \stackrel{d}{=} \{t^{\theta} W(t^{1-2\theta}), t \ge 0\}$.

(b) If $\theta = 1/2$, then

\[ S_n - np = \sqrt{p(1-p)}\, \tilde{G}(n) + o_{a.s.}\left(\sqrt{n \log n \log\log\log n}\right), \tag{13} \]

\[ S_n - np = \sqrt{p(1-p)}\, \tilde{G}(n) + o_p\left(\sqrt{n \log n}\right), \tag{14} \]

where $\{\tilde{G}(t), t \ge 1\} \stackrel{d}{=} \{\sqrt{t}\, W(\log t), t \ge 1\}$.
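Since $\operatorname{Var} G(n) = n$, part (a) implies in particular that $(S_n - np)/\sqrt{n}$ is approximately $N(0,\, p(1-p)/(1-2\theta))$ for large $n$; this is easy to probe by simulation (our own sketch with arbitrary parameters, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, p, n, reps = 0.25, 0.5, 2000, 1000

def sample_Sn():
    S = float(rng.random() < p)
    for k in range(1, n):
        S += rng.random() < theta * S / k + (1 - theta) * p
    return S

Z = np.array([(sample_Sn() - n * p) / np.sqrt(n) for _ in range(reps)])
target_sd = np.sqrt(p * (1 - p) / (1 - 2 * theta))   # = sqrt(0.5) here
print(round(Z.std(), 2), round(target_sd, 2))
```

The empirical standard deviation should land near the limiting value $\sqrt{p(1-p)/(1-2\theta)}$, illustrating how the dependence inflates the variance relative to the i.i.d. value $\sqrt{p(1-p)}$.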

Proof. Let $\{W(t), t \ge 0\}$ be the standard Brownian motion of Theorem 1. If $0 < \theta < 1/2$, from Lemma 3.1 of James et al. (2008) we know $a_n \sim n^{\theta}/(\theta\,\Gamma(\theta))$ and consequently $B_n^2 \sim \rho\, n^{1-2\theta}$, where $\rho = p(1-p)\theta^2\Gamma^2(\theta)/(1-2\theta)$. Therefore, in view of Theorem 1.2.1 of Csörgő and Révész (1981), we conclude that

\[ W(B_n^2) = W(\rho n^{1-2\theta}) + o_{a.s.}\left(\sqrt{n^{1-2\theta}\log\log n}\right), \]

which further implies

\[ a_n W(B_n^2) = \frac{n^{\theta}}{\theta\,\Gamma(\theta)}\, W(\rho n^{1-2\theta}) + o_{a.s.}\left(\sqrt{n \log\log n}\right). \]

Denote $G(t) = \frac{t^{\theta}}{\sqrt{\rho}}\, W(\rho t^{1-2\theta})$, $t \ge 0$. Clearly $\{G(t), t \ge 0\} \stackrel{d}{=} \{t^{\theta} W(t^{1-2\theta}), t \ge 0\}$ and from Theorem 1 we obtain

\[ S_n - np = a_n W(B_n^2) + o_{a.s.}\left(\sqrt{a_n^2 B_n^2 \log\log B_n}\right) = \sqrt{\frac{p(1-p)}{1-2\theta}}\, G(n) + o_{a.s.}\left(\sqrt{n \log\log n}\right), \]

which is exactly (11). To prove (12), first note that the following equation can be verified by the same argument as in the proof of (8):

\[ W(B_n^2) = W(\rho n^{1-2\theta}) + o_p\left(\sqrt{n^{1-2\theta}}\right). \]

Then, taking Theorem 1 into account, we have

\[ S_n - np = a_n W(B_n^2) + o_p(a_n B_n) = \sqrt{\frac{p(1-p)}{1-2\theta}}\, G(n) + o_p(\sqrt{n}), \]

which completes the proof of part (a).

If $\theta = 1/2$, then $a_n \sim 2\sqrt{n}/\sqrt{\pi}$, $B_n^2 \sim \pi p(1-p)\log n/4$ and consequently

\[ W(B_n^2) = W\left(\frac{\pi p(1-p)}{4}\log n\right) + o_{a.s.}\left(\sqrt{\log n \log\log\log n}\right). \]

Define

\[ \tilde{G}(t) = \frac{2\sqrt{t}}{\sqrt{\pi p(1-p)}}\, W\left(\frac{\pi p(1-p)}{4}\log t\right), \quad t \ge 1. \]

It is obvious that $\{\tilde{G}(t), t \ge 1\} \stackrel{d}{=} \{\sqrt{t}\, W(\log t), t \ge 1\}$ and, following calculations similar to those above, we can obtain

\[ S_n - np = \sqrt{p(1-p)}\, \tilde{G}(n) + o_{a.s.}\left(\sqrt{n \log n \log\log\log n}\right), \]


which proves (13). Finally, (14) can be deduced from

\[ W(B_n^2) = W\left(\frac{\pi p(1-p)}{4}\log n\right) + o_p\left(\sqrt{\log n}\right), \]

and we omit the details. □

Remark 3. We should point out that

\[ G(t) \stackrel{d}{=} \sqrt{1-2\theta}\; t^{\theta} \int_0^t s^{-\theta}\, dW(s) \quad \text{and} \quad \tilde{G}(t) \stackrel{d}{=} \sqrt{t} \int_1^t s^{-1/2}\, dW(s), \]

which can be verified by calculating the covariance structures.
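The first identity in Remark 3 can be checked for a concrete exponent (we take $\theta = 1/4$; the choice is ours) by comparing covariances for $s \le t$: the process $t^{\theta}W(t^{1-2\theta})$ has covariance $s^{\theta}t^{\theta}s^{1-2\theta}$, while the Itô isometry gives $(1-2\theta)\,s^{\theta}t^{\theta}\int_0^s u^{-2\theta}\,du$ for the stochastic integral:

```python
import sympy as sp

s, t, u = sp.symbols('s t u', positive=True)
theta = sp.Rational(1, 4)

# Covariance of t^theta * W(t^(1-2*theta)) at times s <= t
cov_process = s**theta * t**theta * s**(1 - 2*theta)
# Covariance of sqrt(1-2*theta)*t^theta * Int_0^t u^(-theta) dW(u),
# computed via the Ito isometry
cov_integral = (1 - 2*theta) * s**theta * t**theta \
    * sp.integrate(u**(-2*theta), (u, 0, s))
print(sp.simplify(cov_process - cov_integral))   # -> 0
```

The same computation with $\int_1^t s^{-1}\,ds = \log t$ verifies the second identity.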

4. A multi-dimensional generalization and its central limit theorem

In this section, we generalize model (1) to a multi-dimensional case. Here, $X_n = (X_{n1}, \ldots, X_{nd})$ takes values in $\{0, h_1, \ldots, h_d\}$, with $h_k = (0, \ldots, 1, \ldots, 0)$ denoting the $k$th coordinate unit vector. For $1 \le k \le d$ and $n \ge 1$,

\[ P(X_{n+1} = h_k \mid \mathcal{F}_n) = \alpha_{nk} + \beta_n n^{-1} S_{nk}, \tag{15} \]

where $S_n = \sum_{i=1}^{n} X_i$, and $\alpha_{nk}$ and $\beta_n$ are non-negative dependence parameters satisfying $\sum_{i=1}^{d} \alpha_{ni} + \beta_n \le 1$ and $\mathcal{F}_n = \sigma\{X_1, \ldots, X_n\}$. The initial distribution of $X_1$ is

\[ P(X_1 = h_k) = p_{1k} \quad \text{and} \quad P(X_1 = 0) = 1 - \sum_{i=1}^{d} p_{1i}. \]

When $d = 1$, model (15) reduces to (1) and, for fixed $k$, $\{X_{nk}, n \ge 1\}$ can also be viewed as a case of (1). For $1 \le k \le d$ and $n \ge 1$, denote

\[ p_{nk} = P(X_{nk} = 1), \qquad a_n = \prod_{j=1}^{n-1}\left(1 + \frac{\beta_j}{j}\right), \qquad A_n^2 = \sum_{j=1}^{n} \frac{1}{a_j^2}, \qquad B_{nk}^2 = \sum_{j=1}^{n} \frac{p_{jk}(1-p_{jk})}{a_j^2}. \]

It follows from the results in Section 2 that the following three statements are equivalent:

(i) There exists a $k$, $1 \le k \le d$, such that

\[ \lim_{n\to\infty} \frac{S_{nk} - E(S_{nk})}{n} = 0 \quad \text{a.s.} \tag{16} \]

(ii)

\[ \sum_{j=1}^{\infty} \frac{1-\beta_j}{1+j} = \infty. \tag{17} \]

(iii) For all $1 \le k \le d$,

\[ \lim_{n\to\infty} \frac{S_{nk} - E(S_{nk})}{n} = 0 \quad \text{a.s.} \tag{18} \]

Moreover, if the conditions

\[ \lim_{n\to\infty} B_{nk} = \infty \quad \text{and} \quad \limsup_{n\to\infty} \frac{A_n}{B_{nk}} < \infty \tag{19} \]

hold for some $1 \le k \le d$, then (16) holds and

\[ \frac{S_{nk} - E(S_{nk})}{a_n B_{nk}} \xrightarrow{d} N(0,1), \qquad \limsup_{n\to\infty} \left(\pm \frac{S_{nk} - E(S_{nk})}{a_n B_{nk}\sqrt{\log\log(B_{nk})}}\right) = \sqrt{2} \quad \text{a.s.} \]

Now we turn to the weak convergence of $S_n$. Since $\mathrm{Cov}(X_{ni}, X_{nj}) = -p_{ni}p_{nj} \ne 0$ for $i \ne j$, the components of $S_n$ are not independent, so the convergence of $S_n$ is a little more complicated. The next theorem shows that, under an additional suitable condition, the random vector $S_n$, properly normalized, converges in distribution to a normal vector. Before we proceed, we introduce a lemma concerning the central limit theorem for bounded martingale difference sequences, which is Lemma 3.4 in James et al. (2008).

Lemma 1. Let $\{Z_n, \mathcal{G}_n, n \ge 1\}$ be a sequence of bounded martingale differences. Assume that there exists a sequence of positive constants $\{W_n, n \ge 1\}$ such that $W_n \to \infty$ and $W_n^{-2} \sum_{j=1}^{n} E(Z_j^2 \mid \mathcal{G}_{j-1}) \xrightarrow{P} \sigma^2$. Then

\[ \frac{\sum_{j=1}^{n} Z_j}{W_n} \xrightarrow{d} N(0, \sigma^2). \]


Theorem 2. Suppose that, for all $1 \le k \le d$, condition (19) is satisfied and $p_{nk} \to p'_k$. Then the following central limit theorem holds:

\[ \frac{S_n - E(S_n)}{a_n A_n} \xrightarrow{d} N(0, \Sigma), \tag{20} \]

where $\Sigma = (\sigma_{ij})_{d\times d}$ is the covariance matrix with $\sigma_{ii} = p'_i(1-p'_i)$ and $\sigma_{ij} = -p'_i p'_j$ for $i \ne j$.

Proof. For $n \ge 1$ and $1 \le k \le d$, define

\[ M_n = (M_{n1}, \ldots, M_{nd}) = \left(\frac{S_{n1} - E(S_{n1})}{a_n}, \ldots, \frac{S_{nd} - E(S_{nd})}{a_n}\right), \]
\[ D_{1k} = M_{1k}, \qquad D_{jk} = M_{jk} - M_{j-1,k}, \quad j \ge 2. \]

Then, for each fixed $k$, $\{M_{nk}, \mathcal{F}_n, n \ge 1\}$ is a martingale and

\[ D_{jk} = \frac{X_{jk} - p_{jk}}{a_j} - \frac{\beta_{j-1}}{j-1}\,\frac{S_{j-1,k} - E(S_{j-1,k})}{a_j}, \quad j \ge 2. \]

Since $(S_{nk} - E(S_{nk}))/n \to 0$ almost surely under (19), calculations similar to those in (6) yield

\[ E(D_{jk}^2 \mid \mathcal{F}_{j-1}) = \frac{p_{jk}(1-p_{jk})}{a_j^2} + o_{a.s.}\left(\frac{1}{a_j^2}\right) \tag{21} \]

and

\[ E(D_{jk} D_{jl} \mid \mathcal{F}_{j-1}) = -\frac{p_{jk}\, p_{jl}}{a_j^2} + o_{a.s.}\left(\frac{1}{a_j^2}\right) \quad \text{for } k \ne l. \]

By the Cramér–Wold device (e.g., Theorem 29.4 in Billingsley, 1995), to prove (20) it suffices to prove that, for any real numbers $t_1, \ldots, t_d$, the following weak convergence holds:

\[ t_1 \frac{M_{n1}}{A_n} + \cdots + t_d \frac{M_{nd}}{A_n} \xrightarrow{d} N(0, \sigma^2), \tag{22} \]

where $\sigma^2 = \sum_{i=1}^{d} t_i^2 p'_i(1-p'_i) - \sum_{i\ne j} t_i t_j p'_i p'_j$.

Write $\tilde{D}_j = t_1 D_{j1} + \cdots + t_d D_{jd}$. It is clear that $\{\tilde{D}_j, \mathcal{F}_j, j \ge 1\}$ is also a martingale difference sequence. By (21) and some algebra, we obtain

\[ E(\tilde{D}_j^2 \mid \mathcal{F}_{j-1}) = \sum_{i=1}^{d} \frac{t_i^2\, p_{ji}(1-p_{ji})}{a_j^2} - \sum_{i\ne l} \frac{t_i t_l\, p_{ji} p_{jl}}{a_j^2} + o_{a.s.}\left(\frac{1}{a_j^2}\right) \quad \text{as } j \to \infty. \]

Note that $A_n^2 = \sum_{j=1}^{n} a_j^{-2} \to \infty$. So

\[
\frac{1}{A_n^2}\sum_{j=1}^{n} E(\tilde{D}_j^2 \mid \mathcal{F}_{j-1})
= \frac{\displaystyle\sum_{j=1}^{n}\sum_{i=1}^{d} \frac{t_i^2\, p_{ji}(1-p_{ji})}{a_j^2} - \sum_{j=1}^{n}\sum_{i\ne l}\frac{t_i t_l\, p_{ji}p_{jl}}{a_j^2} + o_{a.s.}(A_n^2)}{A_n^2}
\to \sum_{i=1}^{d} t_i^2\, p'_i(1-p'_i) - \sum_{i\ne l} t_i t_l\, p'_i p'_l \quad \text{a.s.}
\]

Since { Dj , j ≥ 1} is bounded, it follows from Lemma 1 that n 

 Dj

j =1

An

d

→ N (0, σ 2 ),

which proves (22) and thus completes the proof.
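As an informal check of Theorem 2 (a simulation sketch of ours, not part of the paper; the parameters are arbitrary and chosen so that $p_{1k} = \alpha_k/(1-\beta)$, which keeps $p_{nk} = p'_k$ for all $n$ and makes $E(S_{nk}) = n p'_k$ explicit):

```python
import numpy as np

rng = np.random.default_rng(2)
d, beta, n, reps = 2, 0.3, 1000, 800
alpha = np.array([0.2, 0.3])
p_lim = alpha / (1 - beta)           # p'_k; starting X_1 here keeps p_nk constant

a = np.ones(n)
a[1:] = np.cumprod(1 + beta / np.arange(1, n))
An = np.sqrt((1 / a**2).sum())

def draw(probs):
    """Pick category k with probability probs[k]; None means 'no success'."""
    u, c = rng.random(), 0.0
    for k in range(d):
        c += probs[k]
        if u < c:
            return k
    return None

def sample_Sn():
    S = np.zeros(d)
    k = draw(p_lim)                  # X_1
    if k is not None:
        S[k] = 1
    for m in range(1, n):
        k = draw(alpha + beta * S / m)
        if k is not None:
            S[k] += 1
    return S

Z = np.array([(sample_Sn() - n * p_lim) / (a[-1] * An) for _ in range(reps)])
Sigma = np.diag(p_lim * (1 - p_lim)) - np.outer(p_lim, p_lim) * (1 - np.eye(d))
print(np.round(np.cov(Z.T), 2))
print(np.round(Sigma, 2))            # empirical covariance should be near Sigma
```

The negative off-diagonal entries of the empirical covariance reflect the built-in competition between components: at most one coordinate of $X_n$ can succeed at each step.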



Remark 4. The condition pnk → p′k , meaning the convergence of the success probability for all components of Xn , is not so restrictive. For example, in the original model proposed by Drezner and Farnum (1993) and its extension proposed by James et al. (2008), pn = p for all n ≥ 1. If the parameters do not depend on n, i.e., αnk = αk and βn = β for all n ≥ 1, then by


the monotone convergence theorem, it is routine to derive that $p_{nk} \to \alpha_k/(1-\beta)$. Moreover, the following lemma ensures that $p_{nk} \to p'_k$ is satisfied as long as $\alpha_{nk} \to \alpha_k$ and $\beta_n \to \beta$.

Lemma 2. Assume $\{x_n, n \ge 1\}$ is a sequence of real numbers with recursion $x_{n+1} = b_n n^{-1} s_n + c_n$, where $s_n = \sum_{i=1}^{n} x_i$ and the non-negative parameters $b_n$, $c_n$ satisfy $b_n + c_n \le 1$. If $x_1 \in [0, 1]$, $b_n \to b < 1$ and $c_n \to c$, then $x_n \to \frac{c}{1-b}$.

Proof. The proof is still based on the monotone convergence theorem, but differs a little from the case where $b_n$ and $c_n$ are constants. It suffices to prove that $\lim_{n\to\infty} s_n/n$ exists. Let $y_n = s_n/n$; then $y_n$ satisfies the recursion

\[ y_{n+1} = \frac{c_n}{n+1} + \frac{n + b_n}{n+1}\, y_n. \]

It is straightforward to derive that if $y_n \le \frac{c_n}{1-b_n}$, then $y_n \le y_{n+1} \le \frac{c_n}{1-b_n}$, while if $y_n \ge \frac{c_n}{1-b_n}$, then $y_n \ge y_{n+1} \ge \frac{c_n}{1-b_n}$; that is, $y_{n+1}$ always lies between $y_n$ and $\frac{c_n}{1-b_n}$.

For any $\varepsilon > 0$, there exists an $N_1$ such that $\left|\frac{c_n}{1-b_n} - \frac{c}{1-b}\right| < \varepsilon$ holds for all $n \ge N_1$. Consider the case $y_{N_1} \ge \frac{c_{N_1}}{1-b_{N_1}}$. If $y_n \ge \frac{c_n}{1-b_n}$ holds for all $n \ge N_1$, then $y_n$ is decreasing for $n \ge N_1$, which, together with the fact that $y_n \in [0, 1]$, yields the existence of $\lim_{n\to\infty} y_n$. Otherwise, there must be an $N_2$ ($N_2 \ge N_1$) such that $\frac{c_{N_2}}{1-b_{N_2}} \le y_{N_2+1} \le \frac{c_{N_2+1}}{1-b_{N_2+1}}$, which, together with the above fact that $y_{n+1}$ lies between $y_n$ and $\frac{c_n}{1-b_n}$, means that $\left|y_n - \frac{c}{1-b}\right| < \varepsilon$ holds for all $n \ge N_2 + 1$, also ensuring the existence of $\lim_{n\to\infty} y_n$. The case $y_{N_1} \le \frac{c_{N_1}}{1-b_{N_1}}$ can be dealt with in the same way, and the proof is now completed. □

Example. If, for $n$ large enough, $\alpha_{nk} = \alpha_k + \frac{1}{n}$ and $\beta_n = \beta + \frac{1}{n}$, then from the above lemma we know $p_{nk} \to \frac{\alpha_k}{1-\beta}$, which means the central limit theorem (20) will hold provided (19) is satisfied for all $k$.
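A quick numerical check of Lemma 2 (our own sketch; the concrete sequences $b_n$, $c_n$ below are arbitrary choices satisfying the hypotheses):

```python
# Lemma 2: x_{n+1} = b_n * s_n / n + c_n with b_n -> b < 1 and c_n -> c
# should give x_n -> c / (1 - b).
N = 100_000
b, c = 0.4, 0.3                      # limits; b + c <= 1 and b < 1
x = 0.9                              # x_1 in [0, 1]
s = x
for n in range(1, N):
    b_n = b + 1 / (n + 10)           # non-negative, and b_n + c_n <= 1
    c_n = c - 1 / (2 * (n + 10))
    x = b_n * s / n + c_n
    s += x
print(round(x, 3), c / (1 - b))      # x_n approaches c/(1-b) = 0.5
```

The iterates settle on $c/(1-b)$ even though the parameters themselves are still drifting toward their limits, as the lemma asserts.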

Acknowledgments

We thank an anonymous referee and the Editor for their helpful comments, which led to a much improved version of this paper. This research was partially supported by the NSF of China (No. 11225104) and the Fundamental Research Funds for the Central Universities of China (No. 2015FZA3001).

References

Billingsley, P., 1995. Probability and Measure, third ed. John Wiley & Sons.
Csörgő, M., Révész, P., 1981. Strong Approximations in Probability and Statistics. Academic Press, New York.
Drezner, Z., Farnum, N., 1993. A generalized binomial distribution. Comm. Statist. Theory Methods 22 (11), 3051–3063.
Hall, P., Heyde, C.C., 1980. Martingale Limit Theory and its Application. Academic Press, New York.
Heyde, C.C., 2004. Asymptotics and criticality for a correlated Bernoulli process. Aust. N. Z. J. Stat. 46 (1), 53–57.
James, B., James, K., Qi, Y., 2008. Limit theorems for correlated Bernoulli random variables. Statist. Probab. Lett. 78 (15), 2339–2345.
Wu, W.B., 2007. Strong invariance principles for dependent random variables. Ann. Probab. 35 (6), 2294–2320.
Wu, L., Qi, Y., Yang, J., 2012. Asymptotics for dependent Bernoulli random variables. Statist. Probab. Lett. 82 (3), 455–463.