Bootstrapping the autocorrelation coefficient of finite Markov chains

Journal of Statistical Planning and Inference 32 (1992) 291-302. North-Holland.

C.D. Fuh*
Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, China

Received 20 February 1990; revised manuscript received 18 October 1991
Recommended by S.S. Gupta

Abstract: The estimation of the autocorrelation coefficient in simple Markov chains has been studied in recent years. Although the asymptotic distribution of an estimator based on the serial correlation is known, the difficulty of computing its variance-covariance makes it less applicable. A recent bootstrap method will be introduced here to simulate the distribution. The verification of the bootstrap method is also investigated.

AMS Subject Classification: Primary 62G05; secondary 60F05.

Key words and phrases: Autocorrelation coefficient; bootstrap; hitting time; Markov chain; stationary distribution; transition probability.

1. Introduction

Let X = (X_j; j = 1, 2, ...) be a homogeneous ergodic (positive recurrent, irreducible and aperiodic) Markov chain with finite state space S = {E_1, E_2, ..., E_s} and transition probability matrix P = (p_ij). The states E_i may be quantitative or qualitative. Throughout this paper we shall assume that the E_i are ordered categories or levels of a certain characteristic and, for simplicity, identify them with the numbers 1, 2, ..., s. The existence of the stationary distribution π = (π_j) is well known, and π satisfies the balance equation π = πP.

The problem of estimating the autocorrelation coefficient R (= cov(X_t, X_{t+1})/var(X_t)) of a Markov chain arises in applied probability and statistics. Some estimators of R have been proposed by Basawa (1972) based on the serial correlation, and their asymptotic properties were established there. Next, the problems of testing hypotheses and constructing confidence intervals for R involve the sampling distribution of an estimator of R.

Correspondence to: Dr. C.D. Fuh, Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, China.
* Research partially supported by the National Science Council of ROC Grant NSC 79-0208-M001-63.
0378-3758/92/$05.00 © 1992 Elsevier Science Publishers B.V. All rights reserved.


Recently, Efron's (1979) method of bootstrap has provided a general device to simulate the 'distribution'. The present paper discusses this method for the above situation. Several bootstrap methods for estimating the sampling distribution of the transition probability have been proposed. Kulperger and Prakasa Rao (1990) and Basawa et al. (1990) discuss the bootstrap method for a finite state Markov chain. A general investigation of bootstrapping Markov chains can be found in Fuh (1989) and Athreya and Fuh (1989). Datta and McCormick (1990) have investigated the accuracy of the conditional bootstrap proposed by Basawa et al. (1990). A survey paper in this area is Athreya and Fuh (1992). The bootstrap algorithm will be proposed in Section 2. In Section 3, we discuss some limiting theorems as well as the validity of the bootstrap method. Final remarks are presented in Section 4.

2. The bootstrap method

Let x = (x_1, ..., x_n) be a realization of an ergodic finite Markov chain {X_j; j ≥ 1} defined as above. The maximum likelihood estimators (MLE) of π_i and p_ij are π̂_n(i) = n_i/n and p̂_n(i,j) = n_ij/n_i, respectively, where n_i is the number of visits to state i, and n_ij is the number of i→j transitions during the sample. We define π̂_n = (π̂_n(i)) and P̂_n = (p̂_n(i,j)). The rank autocorrelation of first order associated with the above Markov chain is

    R = cov(X_k, X_{k+1}) / var(X_k) ≡ g(θ),

where θ = (θ_ij) and θ_ij = Pr(X_k = i, X_{k+1} = j) = π_i p_ij. Notice that Σ_ij θ_ij = 1 and Σ_j θ_ij = π_i. A natural estimator for R is R̂_n = g(θ̂_n), where θ̂_n = (θ̂_ij) and θ̂_ij = π̂_n(i) p̂_n(i,j) = n_ij/n. It is easy to verify that

    R̂_n = Σ_{k=1}^{n} (x_k − x̄)(x_{k+1} − x̄) / Σ_{k=1}^{n} (x_k − x̄)²,

where x̄ = (x_1 + ... + x_n)/n and x_{n+1} is replaced by x_1.

The bootstrap method for estimating the sampling distribution H_n of √n(R̂_n − R) can be described as follows:
(1) With the sample x, fit the transition probability P and the stationary distribution π by their MLEs P̂_n and π̂_n, respectively.
(2) Draw a bootstrap sample x* = {x_1*, ..., x_{N_n}*} from P̂_n, and calculate the bootstrap estimate

    R̂_n* = Σ_{k=1}^{N_n} (x_k* − x̄*)(x_{k+1}* − x̄*) / Σ_{k=1}^{N_n} (x_k* − x̄*)²,

where N_n is the bootstrap sample size, x̄* = (x_1* + ... + x_{N_n}*)/N_n, and x*_{N_n+1} is replaced by x_1*.
(3) Approximate the sampling distribution H_n of √n(R̂_n − R) by the conditional distribution H_n* of √N_n(R̂_n* − R̂_n) given x, which can be done by Monte-Carlo simulation.
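The three steps above can be sketched in Python as follows. This is illustrative code only, not from the paper; function names such as `mle`, `r_hat` and `bootstrap_H` are invented, and `N_n = n` is taken as the default bootstrap sample size.

```python
import random
from collections import defaultdict

def mle(x, s):
    """MLE of the transition matrix and stationary law from a path x over
    states 1..s: p_hat[i-1][j-1] = n_ij / n_i and pi_hat[i-1] = n_i / n."""
    n = len(x)
    n_i = defaultdict(int)
    n_ij = defaultdict(int)
    for k in range(n):
        n_i[x[k]] += 1
        n_ij[(x[k], x[(k + 1) % n])] += 1     # circular: x_{n+1} is x_1
    p_hat = [[(n_ij[(i, j)] / n_i[i]) if n_i[i] else 0.0
              for j in range(1, s + 1)] for i in range(1, s + 1)]
    pi_hat = [n_i[i] / n for i in range(1, s + 1)]
    return p_hat, pi_hat

def r_hat(x):
    """Serial-correlation estimator R_hat_n with the circular convention."""
    n = len(x)
    xbar = sum(x) / n
    num = sum((x[k] - xbar) * (x[(k + 1) % n] - xbar) for k in range(n))
    den = sum((x[k] - xbar) ** 2 for k in range(n))
    return num / den

def simulate(p, start, n, rng):
    """Draw a path of length n from the matrix p over states 1..s."""
    path, state = [], start
    for _ in range(n):
        path.append(state)
        u, acc = rng.random(), 0.0
        for j in range(1, len(p) + 1):
            acc += p[state - 1][j - 1]
            if u <= acc:          # inverse-CDF draw of the next state
                state = j
                break
    return path

def bootstrap_H(x, s, n_boot=200, N=None, seed=0):
    """Monte-Carlo approximation of H_n*: replicates of sqrt(N_n)(R*_n - R_hat_n)."""
    rng = random.Random(seed)
    N = N if N is not None else len(x)    # natural choice: N_n = n
    p_fit, _ = mle(x, s)
    r0 = r_hat(x)
    return [(N ** 0.5) * (r_hat(simulate(p_fit, x[0], N, rng)) - r0)
            for _ in range(n_boot)]
```

Quantiles of the list returned by `bootstrap_H` can then be used as a bootstrap approximation of the quantiles of H_n.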

That is, we treat R̂_n* as the maximum likelihood estimator of R̂_n based on the bootstrap sample x*, and investigate the conditional distribution of √N_n(R̂_n* − R̂_n) based on the sample x. The problem here is to verify that the distribution H_n and the conditional distribution H_n* are asymptotically close. Traditionally, we consider the limiting distributions of H_n and H_n*, and prove that √n(R̂_n − R) and √N_n(R̂_n* − R̂_n) have the same asymptotic distribution. In a recent paper, Lo (1989) proposes another criterion which does not rely on the validity of the CLT or Edgeworth expansions. His approach consists of expressing the bootstrapped difference of an estimator, R̂_n* − R̂_n, as the sum of two terms, of which one has the same distribution as R̂_n − R and the other is a negligible remainder. This representation of the bootstrap for Markov chains could be a considerable task, and needs further research. The asymptotic normality of √n(R̂_n − R) is well known (Derman, 1956). Therefore, we will use the first criterion described above. Since the multivariate normal distribution is continuous, by Pólya's theorem we need only prove the conditional central limit theorem for √N_n(R̂_n* − R̂_n) based on the given observation x. In this paper, we will use a central limit theorem for a functional of a double array of Markov chains, which has been investigated by Athreya and Fuh (1990). Next, we prove a new central limit theorem for finite state Markov chains, and derive our result as a corollary.

3. Central limit theorem for the bootstrap estimator

Let X be an ergodic (homogeneous, irreducible, positive recurrent) Markov chain with finite state space S = {1, 2, ..., s}. Assume that E_k T_k² < ∞, where T_k is the first hitting time of state k, and E_k refers to the expectation of the Markov chain with initial point k. It is known that

    √n(π̂_n − π) → N(0, Σ_π)  in distribution,
    √n(P̂_n − P) → N(0, Σ_P)  in distribution,

where Σ_π and Σ_P are the asymptotic variance-covariance matrices of π̂_n and P̂_n, respectively. Here, the convergence in distribution means that

    {√n(p̂_n(i,j) − p_ij); (i,j) ∈ S×S} → N(0, Σ_P)  in distribution;

the asymptotic variance-covariance matrix Σ_P is a block diagonal matrix, which has the form

    Σ_P = diag( (1/π_1)Σ^(1)(P), (1/π_2)Σ^(2)(P), ..., (1/π_s)Σ^(s)(P) ),

where Σ^(i)(P) = (p_ij(δ_jl − p_il)) and δ_jl is the Kronecker delta function, for j, l = 1, ..., s. The above asymptotic variance-covariance matrix is, in fact, a special case of a general central limit theorem for a double array of Markov chains in Proposition 1. The asymptotic variance-covariance matrix Σ_π has a similar form as above and will not be written here. By the definition of R and g(θ), it is easy to verify that g has continuous first order partial derivatives at θ. By the δ-method, we have

    √n(R̂_n − R) → N(0, Σ_R)  in distribution,

where Σ_R is the asymptotic variance-covariance matrix of R̂_n, which can be expressed as follows:

    Σ_R = G Σ_θ Gᵀ,

where G = (∂g/∂θ_11, ..., ∂g/∂θ_ss) is a 1 × s² vector and Σ_θ is the variance-covariance matrix of the asymptotic joint distribution of {√n(θ̂_ij − θ_ij); i, j ∈ S}. In order to get a confidence interval for R, one can use the above asymptotic distribution, but the computation of the variance-covariance matrix is rather difficult. Here, we propose an alternative approach by using the method of bootstrap. In order to verify the validity of the bootstrap method, we establish in this section that for almost all realizations of the Markov chain {X_j; j ≥ 1}, the conditional distribution of √N_n(R̂_n* − R̂_n) given x converges weakly to N(0, Σ_R) as n, N_n → ∞. For this purpose, we consider the following general CLT for a double array of Markov chains.

Let us define some notation first. Let X_n = (X_nj; j ≥ 0) be a sequence of ergodic Markov chains with general state spaces (S_n, 𝒮_n) such that for each n there exists a singleton Δ_n ∈ S_n with P_{nΔ_n}(T_{nΔ_n} < ∞) = 1, where P_{nx} refers to the probability distribution of X_n under X_n0 = x, and T_{nΔ_n} is the first hitting time of the Markov chain X_n at its recurrent point Δ_n on the state space S_n, namely,

    T_{nΔ_n} = inf{ j ≥ 1; X_nj = Δ_n },  = ∞ if no such j exists.

Define the occupation measure

    v_n(A) = E_{nΔ_n} Σ_{j=0}^{T_{nΔ_n}−1} I_A(X_nj),

where E_{nΔ_n} is the expectation of X_n under X_n0 = Δ_n. It is known (Athreya and Ney, 1978) that v_n(·) is σ-finite and invariant for X_n, that is,

    v_n(·) = ∫ v_n(dy) P_n(y, ·),

where P_n(y, ·) = Pr(X_n1 ∈ · | X_n0 = y) is the transition kernel of the Markov chain X_n. Assume that v_n(S_n) = E_{nΔ_n} T_{nΔ_n} < ∞ and set π_n(·) = v_n(·)/v_n(S_n). Here, π_n(·) is the invariant probability measure on S_n induced by the Markov chain X_n. Let f_n : S_n → ℝ be 𝒮_n-measurable, and let N_n be a sequence of sample sizes that goes to ∞ as n → ∞. We state the following two propositions without proof; the reader is referred to Athreya and Fuh (1990) for the details.
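As a quick numerical illustration (not from the paper; the two-state chain and function names below are invented), the normalized occupation measure of excursions between returns to an atom Δ does recover the stationary distribution of a finite chain:

```python
import random

# A two-state chain on S = {0, 1} with atom Delta = 0 (states relabeled from
# 1, 2 for convenience). Solving pi = pi P gives pi = (5/6, 1/6).
P = [[0.9, 0.1], [0.5, 0.5]]
PI_EXACT = [5 / 6, 1 / 6]

def step(state, rng):
    return 0 if rng.random() < P[state][0] else 1

def occupation_estimate(n_cycles, seed=1):
    """Estimate pi as v(.)/v(S), where v(A) = E_Delta sum_{0<=j<T_Delta} I_A(X_j):
    average visit counts over excursions from Delta and normalize."""
    rng = random.Random(seed)
    visits = [0.0, 0.0]
    for _ in range(n_cycles):
        state = 0                  # each cycle starts at the atom Delta = 0
        while True:
            visits[state] += 1
            state = step(state, rng)
            if state == 0:         # first return to Delta ends the cycle
                break
    total = sum(visits)            # Monte-Carlo analogue of v(S) = E_Delta T_Delta
    return [v / total for v in visits]
```

With a few tens of thousands of regeneration cycles the estimate typically agrees with (5/6, 1/6) to about two decimal places.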

Proposition 1. With the same notation defined as above, assume the following conditions hold:
(i) f_n ∈ L²(π_n), ∫ f_n dπ_n = 0 and (N_n π_n(Δ_n))^(−1/2) ∫ |f_n| dπ_n → 0;
(ii) N_n π_n(Δ_n) = N_n / E_{nΔ_n} T_{nΔ_n} → ∞ as n → ∞;
(iii) for each ε > 0, as n → ∞,

    E_{nΔ_n}( (η_{n1}(f_n))² I{ |η_{n1}(f_n)| > ε σ_n √(N_n π_n(Δ_n)) } ) / σ_n² → 0;

(iv) for each x, Δ_n and ε > 0, as n → ∞,

    s_n(ε) = sup_{m > εN_n} | P_n^(m)(x, Δ_n) − π_n(Δ_n) | → 0,

where P_n^(m)(x, A) = Pr(X_nm ∈ A | X_n0 = x),

    η_{n1}(f_n) ≡ Σ_{j=0}^{T_{nΔ_n}−1} f_n(X_nj),

    σ_n² = E_{nΔ_n}(η_{n1}(f_n))² = 2 ∫ f_n(x)(T_n f_n)(x) v_n(dx) − ∫ f_n²(x) v_n(dx)

and

    (T_n f_n)(x) = E_{nx}( Σ_{j=0}^{T_{nΔ_n}−1} f_n(X_nj) ).

Then, for any initial distribution of X_n0,

    (σ_n √(N_n π_n(Δ_n)))^(−1) Σ_{j=1}^{N_n} f_n(X_nj) → N(0, 1)  in distribution.


Assumption (i) of Proposition 1 is a technical one. Assumption (ii) can be interpreted as the bootstrap sample size converging to ∞, provided the consistency of the stationary probability π_n. Assumption (iii) is an analogue of the Lindeberg condition in the CLT for a double array of independent random variables. Assumption (iv), namely s_n(ε) → 0 as n → ∞ for each ε > 0, is a condition on the rate of convergence of the m-step transition probabilities to the atom.

Let {X_n} be a sequence of Markov chains with the same state space (S, 𝒮) and transition probability kernels P_n(·, ·). We assume without loss of generality that X_n, X have recurrent singletons Δ_n and Δ, respectively.

Proposition 2. If we replace assumption (iv) in Proposition 1 by the following assumptions:
(1) sup_x ||P_n(x, ·) − P(x, ·)|| → 0 as n → ∞;
(2) ||π_n(·) − π(·)|| → 0 as n → ∞;
(3) sup_x ||P^(L)(x, ·) − π(·)|| → 0 as L → ∞;
(4) there exists a finite measure μ on (S, 𝒮) such that P(X_n ∈ dy) = f_n(y)μ(dy), and there exists f ∈ L₁(μ) such that |f_n(y)| ≤ |f(y)| a.e. for all n;
where ||·|| is the total variation norm, and π_n(·) and π(·) are the stationary probabilities for P_n(·, ·) and P(·, ·), respectively, then the central limit theorem in Proposition 1 still holds.

Now, we consider Markov chains with finite state space S = {1, 2, ..., s}, and all X_n, X have the same recurrent point Δ.

Theorem 1. Let {X_nj; j = 1, ..., N_n} be a sequence of ergodic Markov chains with finite state space S and transition probability matrices P_n = (p_n(i,j)), such that p_n(i,j) → p_ij for all i, j, where P = (p_ij) is the transition probability matrix of a Markov chain {X_j; j ≥ 1} on the same state space S. Let {f_n} be a sequence of 𝒮-measurable functions. If

    sup_{n ≥ 1, x ∈ S} |f_n(x)| < ∞,

then

    (σ_n √(N_n π_n(Δ)))^(−1) Σ_{j=1}^{N_n} f_n(X_nj) → N(0, 1)  in distribution,

where σ_n² is the same as the one in Proposition 1.

Proof. Here, we need only check that, under the assumptions given above, the conditions in Proposition 2 hold. Let us prove the following lemma first.

Lemma 1. With the notation given above, there exists δ > 0 such that

    sup_n E_{nΔ} T_{nΔ}^(2+δ) < ∞.

Notice that since sup_n σ_n² < ∞, Lemma 1 implies assumption (iii) of Proposition 1.

Proof. Since the state space S is finite, without loss of generality we may assume that all entries of P_n are positive for all n, and let

    α = max_{i,j} p_ij,   α_n = max_{i,j} p_n(i,j),
    β = min_{i,j} p_ij,   β_n = min_{i,j} p_n(i,j).

Note that α_n → α and β_n → β. Let

    f^(j) = P_Δ{ X_k ≠ Δ for k < j, X_j = Δ },
    f_n^(j) = P_{nΔ}{ X_nk ≠ Δ for k < j, X_nj = Δ }.

Then f_n^(j) ≤ (1 − β_n)^(j−1). Given 0 < ε < β, there exists an integer n_0 such that β_n > β − ε for all n > n_0, so that P_{nΔ}{T_{nΔ} > j} ≤ (1 − β + ε)^(j−1) with 1 − β + ε < 1. The tails of T_{nΔ} are therefore uniformly geometric for n > n_0, and there exists δ > 0 such that sup_n E_{nΔ} T_{nΔ}^(2+δ) < ∞.

In order to complete the proof, we need to check that
(1) sup_{i,j} |p_n(i,j) − p_ij| → 0 as n → ∞;
(2) |π_n(i) − π_i| → 0 as n → ∞;
(3) |p_i^(L) − π_i| → 0 as L → ∞;
(4) lim inf_n p_n(i, Δ) > 0, so that N_n π_n(Δ) → ∞.


For (1), (2) and (3), this is true since the ergodic Markov chain has finite state space. For (4), since Δ is a recurrent point of the Markov chain, we have p_n(i, Δ) → p_{iΔ} > 0 as n → ∞. □

Next, in order to verify the asymptotic normality of √N_n(P̂_n* − P̂_n), we need to prove the following lemma first.

Lemma 2. Let X be an ergodic Markov chain with state space (E, 𝒮) and transition probability kernel P. Let g be a measurable function on (E × E, 𝒮 ⊗ 𝒮) such that g ∈ L₁(μ), where μ(dx × dy) = π(dx)P(x, dy) and π is the invariant measure on E. Then

    (1/(n+1)) Σ_{j=0}^{n} g(X_j, X_{j+1}) → ∫_E ∫_E g(x, y) P(x, dy) π(dx)

with probability 1.

Proof. Consider Y_n = (X_n, X_{n+1}) for n = 0, 1, ...; then Y_n is an ergodic Markov chain with state space E × E and invariant measure μ. Then we have

    lim_{n→∞} (1/(n+1)) Σ_{j=0}^{n} g(Y_j) = ∫_{E×E} g(z) μ(dz) = ∫∫_{E×E} g(x, y) μ(dx × dy),

where z = (x, y). For all A, B ∈ 𝒮, C = A × B, and Δ̃ = (Δ, Δ), we have

    v(C) = Σ_{m=0}^{∞} E_Δ̃( I_A(X_m) I_B(X_{m+1}) I(T > m) )
         = Σ_{m=0}^{∞} E_Δ̃( I_A(X_m) P(X_m, B) I(T > m) ),

by the Markov property. This implies that μ(dx × dy) = P(x, dy) π(dx), and thus

    ∫∫ g(x, y) μ(dx × dy) = ∫∫ g(x, y) P(x, dy) π(dx). □
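A small simulation (illustrative only; the two-state chain below is invented) shows the pair-chain ergodic average of Lemma 2 converging to Σ_{i,j} g(i,j) π_i p_ij for a finite chain:

```python
import random

# Two-state chain on {0, 1}; pi = (5/6, 1/6) solves pi = pi P.
P = [[0.9, 0.1], [0.5, 0.5]]
PI = [5 / 6, 1 / 6]

def g(x, y):
    # any integrable function on E x E; here the indicator of a 0 -> 1 transition
    return 1.0 if (x, y) == (0, 1) else 0.0

def pair_average(n, seed=2):
    """(1/(n+1)) sum_{j=0}^{n} g(X_j, X_{j+1}) along a single trajectory."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(n + 1):
        nxt = 0 if rng.random() < P[state][0] else 1
        total += g(state, nxt)
        state = nxt
    return total / (n + 1)

# Limit predicted by Lemma 2: sum_{i,j} g(i,j) pi_i p_ij = pi_0 * p_01 = 1/12.
limit = sum(g(i, j) * PI[i] * P[i][j] for i in (0, 1) for j in (0, 1))
```

For a trajectory of a few hundred thousand steps, the average typically lands within about 0.01 of the limit 1/12.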

Theorem 2. With the same notation given above, for almost all realizations of the process, we have

    √N_n(P̂_n* − P̂_n) → N(0, Σ_P)  in distribution

as n → ∞ and N_n → ∞, where Σ_P is the same as above.

as n + w and N,, + 03, where .XP is the same as above. Proof. The maximum likelihood estimator fi,(i,j) of P;j is a consistent estimator. That is, p,*(i,j) -+pij for all states i, j from the state space S. For each fixed n, the observation {XnJ; j= 1, . . . , N,} can be regarded as a Markov chain with finite state space S and transition probability matrix J?,(. , * ). By Theorem 1, for all f, g E L2(q1), we have fi(?,

f~-~[f(x)n,(dx))-+N(0,02(f))

m(,F,

y

in distribution,

and - 1 g(x)n,,(dw))

--+N(0, a*(g))

in distribution,

where

f (&7x+(~)-

and

Note that o’(f) Then N i( ’

=

and a2(g)

have a similar

cz,

f (&,>

Cz,

dxnj)


( Cz,

form as 0,” in Proposition

Sf (X)%(~) -

fg(x)nn(dw)

m(

.I20

>

f (x,zj)/Nn - Sf (XH,(~))

g(Xnj)/N,)(f

g(x)n,(~))

g(x,j)/N,-Sg(X)rc,(dw)) - (.!f(X)~,(~))~(C~~ ( C,“o g(x,zj)/N,)(S g(X)n,(b)) --f N(0, a*(f, g))

in distribution,

where

02(frg)=

(Sf (xMd-M202(g)

a*(f) (S g(x)7@4)*

+

(Sg(xMdM4

.

1.

Now, we define Y_nj = (X_nj, X_n(j+1)), which is also a Markov chain. Let

    f(x, y) = 1 if x = i, y = j; 0 otherwise,
    g(x, y) = 1 if x = i; 0 otherwise.

Then we have

    Σ_{j=1}^{N_n} f(Y_nj) / Σ_{j=1}^{N_n} g(Y_nj) = m_nij / m_ni,

where m_ni is the number of visits to state i and m_nij is the number of i→j transitions during the bootstrap sample. Note that m_nij/m_ni = p̂_n*(i,j). Hence

    √N_n( m_nij/m_ni − p̂_n(i,j) ) → N(0, σ²(P))  in distribution,

where σ²(P) is the asymptotic variance, which can be obtained from σ²(f, g). Since the state space is finite, the asymptotic normality still holds for any linear combination of √N_n(m_nij/m_ni − p̂_n(i,j)). By the Cramér-Wold device, we have

    √N_n(P̂_n* − P̂_n) → N(0, Σ_P)  in distribution. □
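Theorem 2 can be checked numerically. The following sketch (illustrative only; the chain, seeds and sample sizes are invented) compares the Monte-Carlo variance of the bootstrap replicates √N(p̂_n*(0,1) − p̂_n(0,1)) with the corresponding entry p_01(1 − p_01)/π_0 of the block-diagonal Σ_P:

```python
import random

# Two-state chain used for the experiment.
P = [[0.7, 0.3], [0.4, 0.6]]

def simulate(p, start, n, rng):
    path, s = [], start
    for _ in range(n):
        path.append(s)
        s = 0 if rng.random() < p[s][0] else 1
    return path

def mle(x):
    """MLE of the 2x2 transition matrix (circular convention x_{n+1} = x_1)."""
    n = len(x)
    cnt = [[0, 0], [0, 0]]
    for k in range(n):
        cnt[x[k]][x[(k + 1) % n]] += 1
    return [[cnt[i][j] / (cnt[i][0] + cnt[i][1]) for j in (0, 1)] for i in (0, 1)]

rng = random.Random(3)
x = simulate(P, 0, 5000, rng)          # original sample
p_fit = mle(x)
pi0 = x.count(0) / len(x)

N, B = 3000, 600                       # bootstrap sample size and replicates
reps = []
for _ in range(B):
    xs = simulate(p_fit, x[0], N, rng)
    reps.append((N ** 0.5) * (mle(xs)[0][1] - p_fit[0][1]))

mean = sum(reps) / B
var = sum((r - mean) ** 2 for r in reps) / (B - 1)
predicted = p_fit[0][1] * (1 - p_fit[0][1]) / pi0   # entry of Sigma_P
```

The empirical variance of the replicates typically agrees with the predicted Σ_P entry to within Monte-Carlo error.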

The central limit theorem for √N_n(π̂_n* − π̂_n) can be obtained similarly, and will not be repeated here. Kulperger and Prakasa Rao (1990) used the sup norm on stochastic matrices to prove the uniform convergence of √N_n(P̂_n* − P̂_n). For a Markov chain with countable state space, an analogue of Theorem 2 with a different idea of proof can be found in Athreya and Fuh (1989). We state the following proposition.

Proposition 3 (δ-method). Suppose {X_nk} is a double array of random d-vectors such that √n(X_nn − c) converges in distribution to the d-variate normal distribution with mean 0 and variance-covariance matrix Σ. Let f be a real-valued function from ℝᵈ to ℝ such that f has continuous first order partial derivatives at c. Then

    √n( f(X_nn) − f(c) ) → N(0, σ²)  in distribution,

where σ² = (f′(c))ᵀ Σ f′(c) and f′(c) is the column vector with entries ∂f(c)/∂x_j, for j = 1, 2, ..., d.

Now, since R̂_n defined as above is continuously differentiable at θ for the given observation x, by Theorem 2 and Proposition 3, we have:

Corollary 1. With the notation given above,

    √N_n(R̂_n* − R̂_n) → N(0, Σ_R)  in distribution,

where Σ_R is the asymptotic variance-covariance matrix, which has the same form as that of √n(R̂_n − R).
matrix which has the same form as

4. Remarks

(1) Regarding the condition of the finite second moment of the recurrence time of the Markov chain, i.e. E_k T_k² < ∞, this is true for an ergodic Markov chain with finite state space. For the countably infinite state space case, it is profitable to use various relations that reduce the order of these 'block' moment conditions. The relationship between E_k T_k^r and conditions on the rate of decay of the mixing coefficients of a chain was investigated in Bolthausen's paper (1982) for a noncyclic chain.

(2) There are three other estimates suggested by Basawa (1972), namely

    R̂_n1 = ( Σ_{i,j} i j π_i p̂_n(i,j) − μ² ) / σ²,

where π_j is assumed to be known completely, and μ and σ² are the known mean and variance of the stationary distribution. Further,

    R̂_n2 = ( Σ_{i,j} i j θ̂_ij − μ² ) / σ²   and   R̂_n3 = 1 − Σ_{i,j} (i − j)² n_ij / (2nσ²)

are two more estimates of R, the former requiring the knowledge of μ and σ², while the second depends on σ² only. The same argument developed in this paper can be applied to these three estimates.

(3) The choice of the bootstrap sample size for the finite state space case is independent of the original sample size, as long as both converge to ∞. A natural choice for the bootstrap sample size is simply to take it equal to the original one. The Monte-Carlo study of this problem with some modified bootstrap techniques will appear in a separate paper.

(4) The asymptotic efficiency between the classical normal approximation and the method of bootstrap involving the second order approximation (Edgeworth type expansion) for Pr(√n(P̂_n − P) ≤ t) is an interesting open question.
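As an illustration of remark (2) (illustrative code, not from the paper; the factor 2 in R̂_n3 reflects the stationary identity E(X_{t+1} − X_t)² = 2σ²(1 − R)), computing Σ_{i,j}(i − j)² n_ij from transition counts gives exactly Σ_k (x_{k+1} − x_k)² under the circular convention:

```python
from collections import defaultdict

def r_hat_3(x, sigma2):
    """R_hat_n3 = 1 - sum_{i,j} (i-j)^2 n_ij / (2 n sigma^2), with x_{n+1} = x_1."""
    n = len(x)
    n_ij = defaultdict(int)
    for k in range(n):
        n_ij[(x[k], x[(k + 1) % n])] += 1
    quad = sum((i - j) ** 2 * c for (i, j), c in n_ij.items())
    return 1.0 - quad / (2 * n * sigma2)

def r_hat_3_direct(x, sigma2):
    """The same estimator written directly on the path."""
    n = len(x)
    quad = sum((x[(k + 1) % n] - x[k]) ** 2 for k in range(n))
    return 1.0 - quad / (2 * n * sigma2)
```

Note that only σ² enters, in line with the remark that R̂_n3 does not require knowledge of μ.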

Acknowledgements

The author thanks the editor and two referees for their helpful comments and suggestions. Their comments led to a substantially improved article.


References

Athreya, K.B. and P. Ney (1978). A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc. 245, 493-501.
Athreya, K.B. and C.D. Fuh (1989). Bootstrapping Markov chains: Countable case. Technical Report B-89-7, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, ROC. To appear in J. Statist. Plann. Inference.
Athreya, K.B. and C.D. Fuh (1990). Central limit theorem for a double array of Harris chains. Technical Report B-90-1, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, ROC. To appear in Sankhyā.
Athreya, K.B. and C.D. Fuh (1992). Bootstrapping Markov chains. In: R. LePage and L. Billard, Eds., Exploring the Limits of Bootstrap. Wiley, New York, pp. 49-64.
Basawa, I.V. (1972). Estimation of the autocorrelation coefficient in simple Markov chain. Biometrika 59, 85-89.
Basawa, I., T. Green, W. McCormick and R. Taylor (1990). Asymptotic bootstrap validity for finite Markov chains. Comm. Statist. Theory Meth. 19, 1493-1510.
Bolthausen, E. (1982). The Berry-Esseen theorem for strong mixing Harris recurrent Markov chains. Z. Wahrsch. Verw. Gebiete 60, 283-289.
Datta, S. and W. McCormick (1990). Bootstrapping distribution theory for a finite state Markov chain based on i.i.d. sampling. Technical Report 139, University of Georgia, Athens, GA 30602.
Derman, C. (1956). Some asymptotic distribution theory for Markov chains with a denumerable number of states. Biometrika 43, 285-294.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1-26.
Fuh, C.D. (1989). The bootstrap method for Markov chains. Ph.D. dissertation, Iowa State University, Ames, IA, USA.
Kulperger, R.J. and B.L.S. Prakasa Rao (1990). Bootstrapping a finite state Markov chain. Sankhyā Ser. A 51, 178-191.
Lo, S.H. (1989). On some representations of the bootstrap. Probab. Theory Related Fields 82, 411-418.