Journal of Statistical Planning and Inference 32 (1992) 291-302
North-Holland

Bootstrapping the autocorrelation coefficient of finite Markov chains

C.D. Fuh*
Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, China

Received 20 February 1990; revised manuscript received 18 October 1991
Recommended by S.S. Gupta

Abstract: The estimation of the autocorrelation coefficient of simple Markov chains by way of the serial correlation has been studied in recent years. Although the asymptotic distribution of such an estimator is known, the difficulty of computing its asymptotic variance-covariance matrix makes it less applicable. A bootstrap method to simulate the distribution will be introduced here. The verification of the bootstrap method is also investigated.

AMS Subject Classification: Primary 62G05; secondary 60F05.

Key words and phrases: Autocorrelation coefficient; bootstrap; hitting time; Markov chain; stationary distribution; transition probability.
1. Introduction

Let X = {X_j; j = 1, 2, ...} be a homogeneous ergodic (positive recurrent, irreducible and aperiodic) Markov chain with finite state space S = {E_1, E_2, ..., E_s} and transition probability matrix P = (p_ij). The states E_i may be quantitative or qualitative. Throughout this paper we shall assume that the E_i are ordered categories or levels of a certain characteristic and, for simplicity, identify them with the numbers 1, 2, ..., s. The existence of the stationary distribution π = (π_j) is well known, and π satisfies the balance equation π = πP.

The problem of estimating the autocorrelation coefficient R = cov(X_t, X_{t+1})/var(X_t) of a Markov chain arises in applied probability and statistics. Some estimators of R based on the serial correlation have been proposed by Basawa (1972), and their asymptotic properties were established there. Next, the problems of hypothesis testing and confidence intervals for R involve the sampling distribution of an estimator of R.

Correspondence to: Dr. C.D. Fuh, Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, China.
* Research partially supported by the National Science Council of ROC Grant NSC 79-0208-M001-63.

0378-3758/92/$05.00 © 1992 Elsevier Science Publishers B.V. All rights reserved
Recently, Efron's (1979) bootstrap has provided a general device to simulate such a sampling distribution. The present paper discusses this method for the above situation. Several bootstrap methods for estimating the sampling distribution of the transition probability have been proposed. Kulperger and Prakasa Rao (1990) and Basawa et al. (1990) discuss the bootstrap method for a finite state Markov chain. A general investigation of bootstrapping Markov chains can be found in Fuh (1989) and Athreya and Fuh (1989). Datta and McCormick (1990) have investigated the accuracy of the conditional bootstrap proposed by Basawa et al. (1990). A survey paper in this area is Athreya and Fuh (1992). The bootstrap algorithm is proposed in Section 2. In Section 3, we discuss some limiting theorems as well as the validity of the bootstrap method. Final remarks are presented in Section 4.
2. The bootstrap method
Let x = (x_1, ..., x_n) be a realization of an ergodic finite Markov chain {X_j; j ≥ 1} defined as above. The maximum likelihood estimators (MLE) of π_i and p_ij are π̂_n(i) = n_i/n and p̂_n(i,j) = n_ij/n_i, respectively, where n_i is the number of visits to state i and n_ij is the number of i-to-j transitions during the sample. We define π̂_n = (π̂_n(i)) and P̂_n = (p̂_n(i,j)). The rank autocorrelation of first order associated with the above Markov chain is
  R = cov(X_k, X_{k+1}) / var(X_k) = g(θ),

where θ = (θ_ij) and θ_ij = Pr(X_k = i, X_{k+1} = j) = π_i p_ij. Notice that Σ_ij θ_ij = 1 and Σ_j θ_ij = π_i. A natural estimator for R is R̂_n = g(θ̂), where θ̂ = (θ̂_ij) and θ̂_ij = π̂_n(i) p̂_n(i,j) = n_ij/n. It is easy to verify that

  R̂_n = Σ_{k=1}^{n} (x_k − x̄)(x_{k+1} − x̄) / Σ_{k=1}^{n} (x_k − x̄)²,

where x̄ = (x_1 + ... + x_n)/n and x_{n+1} is replaced by x_1.

The bootstrap method for estimating the sampling distribution H_n of √n(R̂_n − R) can be described as follows:
(1) With the sample x, fit the transition probability P and the stationary distribution π by their MLEs P̂_n and π̂_n, respectively.
(2) Draw a bootstrap sample x* = {x_1*, ..., x*_{N_n}} from P̂_n, and calculate the bootstrap estimate

  R̂*_n = Σ_{k=1}^{N_n} (x_k* − x̄*)(x*_{k+1} − x̄*) / Σ_{k=1}^{N_n} (x_k* − x̄*)²,

where N_n is the bootstrap sample size, x̄* = (x_1* + ... + x*_{N_n})/N_n and x*_{N_n+1} is replaced by x_1*.
(3) Approximate the sampling distribution H_n of √n(R̂_n − R) by the conditional distribution H_n* of √N_n(R̂*_n − R̂_n) given x, which can be done by Monte-Carlo simulation.
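The three steps above can be sketched directly in code. The following Python fragment is a minimal illustration and not from the paper; the function names are ours, states are indexed 0..s−1 rather than 1..s, and starting each bootstrap path at x_1 is our own convention (the paper does not specify the initial state of the bootstrap sample).

```python
import numpy as np

def autocorr_estimate(x):
    """R_n: circular first-order sample autocorrelation (x_{n+1} := x_1)."""
    d = np.asarray(x, float) - np.mean(x)
    return float(np.sum(d * np.roll(d, -1)) / np.sum(d * d))

def mle_fit(x, s):
    """Step (1): MLEs pi_n(i) = n_i/n and p_n(i,j) = n_ij/n_i."""
    x = np.asarray(x)
    C = np.zeros((s, s))
    np.add.at(C, (x[:-1], x[1:]), 1)           # n_ij transition counts
    pi_hat = np.bincount(x, minlength=s) / len(x)
    P_hat = C / np.maximum(C.sum(axis=1, keepdims=True), 1)
    return pi_hat, P_hat

def bootstrap_distribution(x, s, B=500, N=None, rng=None):
    """Steps (2)-(3): B Monte-Carlo draws from the law of sqrt(N)(R*_n - R_n)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    N = len(x) if N is None else N
    _, P_hat = mle_fit(x, s)
    cdf = P_hat.cumsum(axis=1)                 # row-wise CDFs for sampling
    Rn = autocorr_estimate(x)
    draws = np.empty(B)
    for b in range(B):
        u = rng.random(N)
        path = np.empty(N, dtype=int)
        path[0] = x[0]                         # assumption: start at x_1
        for j in range(1, N):
            path[j] = min(np.searchsorted(cdf[path[j - 1]], u[j]), s - 1)
        draws[b] = np.sqrt(N) * (autocorr_estimate(path) - Rn)
    return draws
```

The empirical distribution of `draws` is the Monte-Carlo approximation of H_n* from step (3).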
That is, we treat R̂*_n as the maximum likelihood estimator of R̂_n based on the bootstrap sample x*, and investigate the conditional distribution of √N_n(R̂*_n − R̂_n) based on the sample x. The problem here is to verify that the distribution H_n and the conditional distribution H_n* are asymptotically close. Traditionally, we consider the limiting distributions of H_n and H_n*, and prove that √n(R̂_n − R) and √N_n(R̂*_n − R̂_n) have the same asymptotic distribution. In a recent paper, Lo (1989) proposes another criterion which does not rely on the validity of the CLT or Edgeworth expansions. His approach consists of expressing the bootstrapped difference of an estimator, R̂*_n − R̂_n, as the sum of two terms, of which one has the same distribution as R̂_n − R and the other is a negligible remainder. Such a representation of the bootstrap for Markov chains could be a considerable task, and needs further research. The asymptotic normality of √n(R̂_n − R) is well known (Derman, 1956). Therefore, we will use the first criterion described above. Since the multivariate normal distribution is continuous, by Polya's theorem we need only prove the conditional central limit theorem for √N_n(R̂*_n − R̂_n) based on the given observation x. In this paper, we will use a central limit theorem for a functional of a double array of Markov chains, which has been investigated by Athreya and Fuh (1990). Next, we prove a new central limit theorem for finite state Markov chains, and derive our result as a corollary.
3. Central limit theorem for the bootstrap estimator
Let X be an ergodic (homogeneous, irreducible, positive recurrent) Markov chain with finite state space S = {1, 2, ..., s}. Assume that E_k T_k² < ∞, where T_k is the first hitting time to state k, and E_k refers to the expectation of the Markov chain with initial point k. It is known that

  √n(π̂_n − π) → N(0, Σ_π)  in distribution,
  √n(P̂_n − P) → N(0, Σ_P)  in distribution,

where Σ_π and Σ_P are the asymptotic variance-covariance matrices of π̂_n and P̂_n, respectively. Here, the convergence in distribution means that

  {√n(p̂_n(i,j) − p_ij); (i,j) ∈ S × S} → N(0, Σ_P)  in distribution;

the asymptotic variance-covariance matrix Σ_P is a block diagonal matrix, which
has the form

  Σ_P = diag( (1/π_1) Σ¹(P), (1/π_2) Σ²(P), ..., (1/π_s) Σˢ(P) ),

where Σⁱ(P) = (p_ij(δ_jl − p_il)) and δ_jl is the Kronecker delta function, for j, l = 1, ..., s. The above asymptotic variance-covariance matrix is, in fact, a special case of a general central limit theorem for a double array of Markov chains in Proposition 1. The asymptotic variance-covariance matrix Σ_π has a similar form and will not be written here.

By the definition of R and g(θ), it is easy to verify that g has continuous first order partial derivatives at θ. By the δ-method, we have

  √n(R̂_n − R) → N(0, Σ_R)  in distribution,

where Σ_R is the asymptotic variance-covariance matrix of R̂_n, which can be expressed as follows:

  Σ_R = Gᵀ Σ_θ G,

where Gᵀ = (∂g/∂θ_11, ..., ∂g/∂θ_ss) is a 1 × s² vector and Σ_θ is the variance-covariance matrix of the asymptotic joint distribution of {√n(θ̂_ij − θ_ij); i, j ∈ S}. In order to get a confidence interval for R, one can use the above asymptotic distribution, but the computation of the variance-covariance matrix is rather difficult. Here, we propose an alternative approach using the method of bootstrap. In order to verify the validity of the bootstrap method, we establish in this section that for almost all realizations of the Markov chain {X_j; j ≥ 1}, the conditional distribution of √N_n(R̂*_n − R̂_n) given x converges weakly to N(0, Σ_R) as n, N_n → ∞. For this purpose, we consider the following general CLT for a double array of Markov chains.

Let us define some notations first. Let X_n = (X_{nj}; j ≥ 0) be a sequence of ergodic Markov chains with general state spaces (S_n, 𝒮_n) such that for each n there exists a singleton Δ_n ∈ S_n with P_{nx}(T_{nΔ_n} < ∞) = 1, where P_{nx} refers to the probability distribution of X_n under X_{n0} = x, and T_{nΔ_n} is the first hitting time of the Markov chain X_n to its recurrent point Δ_n on the state space S_n, namely,

  T_{nΔ_n} = inf{j ≥ 1 : X_{nj} = Δ_n},  and T_{nΔ_n} = ∞ if no such j exists.
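The block-diagonal form of Σ_P above is easy to compute numerically. The following Python sketch is ours, not from the paper; `sigma_block`, `sigma_P` and `stationary` are hypothetical helper names. Each block (1/π_i) Σⁱ(P) is just the multinomial covariance matrix of the i-th row of P, scaled by 1/π_i.

```python
import numpy as np

def sigma_block(P, i):
    """Sigma^i(P) with entries p_ij (delta_jl - p_il): the multinomial
    covariance matrix of row i of P."""
    p = P[i]
    return np.diag(p) - np.outer(p, p)

def sigma_P(P, pi):
    """Block-diagonal asymptotic covariance of sqrt(n)(P_hat - P):
    diag((1/pi_1) Sigma^1(P), ..., (1/pi_s) Sigma^s(P))."""
    s = P.shape[0]
    out = np.zeros((s * s, s * s))
    for i in range(s):
        out[i * s:(i + 1) * s, i * s:(i + 1) * s] = sigma_block(P, i) / pi[i]
    return out

def stationary(P):
    """Stationary distribution: normalized left eigenvector for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    return pi / pi.sum()
```

Every block has zero row sums (each row of P̂_n sums to one, so the errors within a row are linearly dependent) and is positive semi-definite; these structural facts give a quick sanity check of the formula.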
Define ν_n(A) = E_{nΔ_n} Σ_{j=0}^{T_{nΔ_n}−1} I_A(X_{nj}) for A ∈ 𝒮_n, where E_{nΔ_n} is the expectation of X_n under X_{n0} = Δ_n. It is known (Athreya and Ney, 1978) that ν_n(·) is σ-finite and invariant for X_n, that is,

  ν_n(·) = ∫_{S_n} ν_n(dy) P_n(y, ·),

where P_n(y, ·) = Pr(X_{n1} ∈ · | X_{n0} = y) is the transition kernel of the Markov chain X_n. Assume that ν_n(S_n) = E_{nΔ_n} T_{nΔ_n} < ∞ and set π_n(·) = ν_n(·)/ν_n(S_n). Here, π_n(·) is the invariant probability measure on S_n induced by the Markov chain X_n. Let f_n : S_n → R be 𝒮_n-measurable and let N_n be a sequence of sample sizes that goes to ∞ as n → ∞. We state the following two propositions without proof; the reader is referred to Athreya and Fuh (1990) for the details.

Proposition 1. With the same notations defined as above, assume the following conditions hold:
  (i) f_n ∈ L_2(π_n), ∫ f_n dπ_n = 0 and ∫ |f_n| dπ_n / √(N_n σ_n²) → 0;
  (ii) N_n π_n(Δ_n) = N_n / E_{nΔ_n} T_{nΔ_n} → ∞ as n → ∞;
  (iii) for each ε > 0, as n → ∞, the block sums η_{n1}(f_n) satisfy the analogue, for a double array, of the Lindeberg condition of the classical CLT;
  (iv) for each x, Δ_n and ε > 0, as n → ∞,

  δ_n(ε) = sup_{m > εN_n} |P_n^{(m)}(x, Δ_n) − π_n(Δ_n)| → 0,

where P_n^{(m)}(x, A) = Pr(X_{nm} ∈ A | X_{n0} = x),

  η_{n1}(f_n) = Σ_{j=0}^{T_{nΔ_n}−1} f_n(X_{nj}),   σ_n² = E_{nΔ_n}(η_{n1}(f_n))² = 2 ∫ f_n(x)(T_n f_n)(x) π_n(dx) − ∫ f_n²(x) π_n(dx),

and

  (T_n f_n)(x) = E_{nx} Σ_{j=0}^{T_{nΔ_n}−1} f_n(X_{nj}).

Then, for any initial distribution of X_{n0},

  (N_n σ_n²)^{−1/2} Σ_{j=1}^{N_n} f_n(X_{nj}) → N(0, 1)  in distribution.
Assumption (i) of Proposition 1 is a technical one. Assumption (ii) can be interpreted as the bootstrap sample size converging to ∞, provided the consistency of the stationary probability π_n. Assumption (iii) is an analogue of the Lindeberg condition of the CLT for a double array of independent random variables. Assumption (iv) requires δ_n(ε) → 0 as n → ∞ for each 0 < ε < 1.

Let {X_n; n ≥ 1} be a sequence of Markov chains with the same state space (S, 𝒮) and transition probability kernels P_n(·, ·). We assume without loss of generality that the X_n and X have recurrent singletons Δ_n and Δ, respectively.

Proposition 2. If we replace assumption (iv) in Proposition 1 by the following assumptions:
  (1) sup_x ||P_n(x, ·) − P(x, ·)|| → 0 as n → ∞;
  (2) ||π_n(·) − π(·)|| → 0 as n → ∞;
  (3) sup_x ||P^{(L)}(x, ·) − π(·)|| → 0 as L → ∞;
  (4) there exists a finite measure μ on (S, 𝒮) such that P(X_{n1} ∈ dy) = f_n(y) μ(dy), and there exists f ∈ L_1(μ) such that |f_n(y)| ≤ |f(y)| a.e. for all n;
where || · || is the total variation norm, and π_n(·) and π(·) are the stationary probabilities of P_n(·, ·) and P(·, ·), respectively, then the central limit theorem in Proposition 1 still holds.

Now, we consider a Markov chain with finite state space S = {1, 2, ..., s}, and assume all X_n, X have the same recurrent point Δ.
Theorem 1. Let {X_{nj}; j = 1, ..., N_n} be a sequence of ergodic Markov chains with finite state space S and transition probability matrices P_n = (p_n(i,j)) such that p_n(i,j) → p_ij for all i, j, where P = (p_ij) is the transition probability matrix of a Markov chain {X_j; j ≥ 1} on the same state space S. Let {f_n} be a sequence of S-measurable functions. If sup_{n ≥ 1, x ∈ S} |f_n(x)| < ∞, then

  (N_n σ_n²)^{−1/2} Σ_{j=1}^{N_n} f_n(X_{nj}) → N(0, 1)  in distribution,

where σ_n² is the same as the one in Proposition 1.

Proof. We need only check that, with the assumptions given above, the conditions in Proposition 2 hold. Let us prove the following lemma first.
Lemma 1. With the notations given above, there exists δ > 0 such that sup_n E_{nΔ}(T_{nΔ})^{2+δ} < ∞.

Notice that, since sup_n σ_n² < ∞, Lemma 1 implies assumption (iii) of Proposition 1.

Proof. Since the state space S is finite, we may assume without loss of generality that all entries of P_n are positive for all n. Let

  α = max_{i,j} p_ij,   α_n = max_{i,j} p_n(i,j),   β = min_{i,j} p_ij,   β_n = min_{i,j} p_n(i,j).

Note that α_n → α and β_n → β. Let

  f_n^{(j)} = P_{nΔ}{X_{nl} ≠ Δ for 1 ≤ l < j, X_{nj} = Δ | X_{n0} = Δ}

be the probability that the first return to Δ occurs at time j. Then f_n^{(1)} = p_n(Δ,Δ) and f_n^{(j)} ≤ (1 − β_n)^{j−1} for all j ≥ 1. Since β_n → β > 0, there exist ε > 0 and an integer n_0 ≥ 1 such that 1 − β_n ≤ 1 − β + ε < 1 for all n ≥ n_0. The return times T_{nΔ} therefore have uniformly geometric tails, which yields a δ > 0 with sup_n E_{nΔ}(T_{nΔ})^{2+δ} < ∞. □
In order to complete the proof, we need to check that:
  (1) sup_{i,j} |p_n(i,j) − p_ij| → 0 as n → ∞;
  (2) |π_n(i) − π_i| → 0 as n → ∞;
  (3) sup_i |p^{(L)}(i,j) − π_j| → 0 as L → ∞;
  (4) lim inf_n p_n(i,Δ) > 0, so that N_n π_n(Δ) → ∞.
For (1), (2) and (3), this is true since the ergodic Markov chain has finite state space. For (4), since Δ is a recurrent point of the Markov chain, we have p_n(i,Δ) → p_{iΔ} > 0 as n → ∞. □
Next, in order to verify the asymptotic normality of √N_n(P̂*_n − P̂_n), we need to prove the following lemma first.

Lemma 2. Let X be an ergodic Markov chain with state space (E, ℰ) and transition probability matrix P. Let g be a measurable function on (E × E, ℰ ⊗ ℰ) such that g ∈ L_1(μ), where μ(dx × dy) = π(dx)P(x, dy) and π is the invariant measure on E. Then

  (1/(n+1)) Σ_{j=0}^{n} g(X_j, X_{j+1}) → ∫_E ∫_E g(x, y) P(x, dy) π(dx)

with probability 1.
Proof. Consider Y_n = (X_n, X_{n+1}) for n = 0, 1, ...; then Y_n is an ergodic Markov chain with state space E × E and invariant measure μ. Then we have

  lim_{n→∞} (1/(n+1)) Σ_{m=0}^{n} g(Y_m) = ∫_{E×E} g(z) μ(dz) = ∫_{E×E} g(x, y) μ(dx × dy)

with probability 1, where z = (x, y). For all A, B ∈ ℰ and C = A × B, the invariant measure of Y_n satisfies

  μ(C) = lim_{m→∞} Pr(X_m ∈ A, X_{m+1} ∈ B) = ∫_A P(x, B) π(dx).

This implies that μ(dx × dy) = P(x, dy) π(dx), and thus

  ∫_{E×E} g(x, y) μ(dx × dy) = ∫_E ∫_E g(x, y) P(x, dy) π(dx). □
Theorem 2. With the same notations given above, for almost all realizations of the process, we have

  √N_n(P̂*_n − P̂_n) → N(0, Σ_P)  in distribution

as n → ∞ and N_n → ∞, where Σ_P is the same as above.

Proof. The maximum likelihood estimator p̂_n(i,j) of p_ij is a consistent estimator. That is, p̂_n(i,j) → p_ij for all states i, j in the state space S. For each fixed n, the observation {X_{nj}; j = 1, ..., N_n} can be regarded as a Markov chain with finite state space S and transition probability matrix P̂_n(·, ·). By Theorem 1, for all f, g ∈ L_2(π_n), we have

  √N_n( (1/N_n) Σ_{j=1}^{N_n} f(X_{nj}) − ∫ f(x) π_n(dx) ) → N(0, σ²(f))  in distribution,

and

  √N_n( (1/N_n) Σ_{j=1}^{N_n} g(X_{nj}) − ∫ g(x) π_n(dx) ) → N(0, σ²(g))  in distribution.

Note that σ²(f) and σ²(g) have a similar form as σ_n² in Proposition 1. Then

  √N_n( Σ_{j=1}^{N_n} f(X_{nj}) / Σ_{j=1}^{N_n} g(X_{nj}) − ∫ f(x) π_n(dx) / ∫ g(x) π_n(dx) ) → N(0, σ²(f, g))  in distribution,

where

  σ²(f, g) = σ²(f) / ( ∫ g(x) π_n(dx) )² + ( ∫ f(x) π_n(dx) )² σ²(g) / ( ∫ g(x) π_n(dx) )⁴.
Now, we define Y_{nj} = (X_{nj}, X_{nj+1}), which is also a Markov chain. Let

  f(x, y) = 1 if x = i and y = j, and 0 otherwise;   g(x, y) = 1 if x = i, and 0 otherwise.

Then we have

  Σ_{j=1}^{N_n} f(Y_{nj}) / Σ_{j=1}^{N_n} g(Y_{nj}) = m_{nij} / m_{ni},

where m_{ni} is the number of visits to state i and m_{nij} is the number of i-to-j transitions during the bootstrap sample. Note that, by Lemma 2, the ratio of the corresponding limits is π_n(i) p̂_n(i,j) / π_n(i) = p̂_n(i,j). Hence

  √N_n( m_{nij} / m_{ni} − p̂_n(i,j) ) → N(0, σ²(P))  in distribution,

where σ²(P) is the asymptotic variance, which can be obtained from σ²(f, g). Since the state space is finite, the asymptotic normality still holds for any linear combination of √N_n(m_{nij}/m_{ni} − p̂_n(i,j)). By the Cramér-Wold device, we have

  √N_n(P̂*_n − P̂_n) → N(0, Σ_P)  in distribution. □
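Theorem 2 can be illustrated by a small simulation. The sketch below is ours, with a hypothetical two-state chain; it compares the Monte-Carlo variance of √N_n(p̂*_n(0,1) − p̂_n(0,1)) over bootstrap replicates with the value p̂_n(0,1)(1 − p̂_n(0,1))/π̂_n(0) predicted by the block form of Σ_P (states are indexed from 0 here).

```python
import numpy as np

def simulate(P, n, rng, x0=0):
    """Simulate n steps of a chain with transition matrix P (states 0..s-1)."""
    cdf = P.cumsum(axis=1)
    u = rng.random(n)
    x = np.empty(n, dtype=int)
    x[0] = x0
    for j in range(1, n):
        x[j] = min(np.searchsorted(cdf[x[j - 1]], u[j]), P.shape[0] - 1)
    return x

def transition_mle(x, s):
    """MLE P_hat from the transition counts of a path."""
    C = np.zeros((s, s))
    np.add.at(C, (x[:-1], x[1:]), 1)
    return C / np.maximum(C.sum(axis=1, keepdims=True), 1)

rng = np.random.default_rng(7)
P = np.array([[0.7, 0.3], [0.4, 0.6]])        # hypothetical example chain
x = simulate(P, 2000, rng)
P_hat = transition_mle(x, 2)
pi_hat = np.bincount(x, minlength=2) / len(x)

# Bootstrap replicates of sqrt(N) * (p*_n(0,1) - p_hat(0,1)).
N, B = 2000, 200
z = np.array([np.sqrt(N) * (transition_mle(simulate(P_hat, N, rng), 2)[0, 1]
                            - P_hat[0, 1]) for _ in range(B)])

# Variance predicted by the (1,1) entry of the block (1/pi_0) Sigma^0(P_hat).
theory = P_hat[0, 1] * (1 - P_hat[0, 1]) / pi_hat[0]
```

With N = 2000 and B = 200 replicates, the Monte-Carlo variance of `z` and `theory` typically agree to within about ten percent.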
The central limit theorem for √N_n(π̂*_n − π̂_n) can be obtained similarly, and will not be repeated here. Kulperger and Prakasa Rao (1990) used the sup norm on stochastic matrices to prove the uniform convergence of √N_n(P̂*_n − P̂_n). For a Markov chain with countable state space, an analogue of Theorem 2 with a different idea of proof can be found in Athreya and Fuh (1989). We state the following proposition.

Proposition 3 (δ-method). Suppose {X_{nk}} is a double array of random d-vectors such that √n(X_{nn} − c) converges in distribution to the d-variate normal distribution with mean 0 and variance-covariance matrix Σ. Let f be a real-valued function from R^d to R such that f has continuous first order partial derivatives at c. Then

  √n(f(X_{nn}) − f(c)) → N(0, σ²)  in distribution,

where σ² = (f′(c))ᵀ Σ f′(c) and f′(c) is the column vector with entries ∂f(c)/∂x_j, for j = 1, 2, ..., d.

Now, since R̂*_n defined as above is continuously differentiable at θ̂ for the given observation x, by Theorem 2 and Proposition 3, we have:
Corollary 1. With the notations given as above,

  √N_n(R̂*_n − R̂_n) → N(0, Σ_R)  in distribution,

where Σ_R is the asymptotic variance-covariance matrix, which has the same form as that of √n(R̂_n − R).
4. Remarks

(1) Regarding the condition of the finite second moment of the recurrence time of the Markov chain, i.e. E_k T_k² < ∞: this is true for any ergodic Markov chain with finite state space. For the countably infinite state space case, it is profitable to use various relations that reduce the order of these 'block' moment conditions. The relationship between E_k T_k^r and conditions on the rate of decay of the mixing coefficients of a chain was investigated in Bolthausen's paper (1982) for a noncyclic chain.

(2) There are three other estimates suggested by Basawa (1972). The first, R_n1, covers the case where the π_j are assumed to be known completely and μ and σ² are the known mean and variance of the stationary distribution. Further,

  R_n2 = ( Σ_{i,j} i j n_ij/n − μ² ) / σ²   and   R_n3 = 1 − Σ_{i,j} (i − j)² n_ij / (2nσ²)

are two more estimates of R, the former requiring the knowledge of μ and σ², while the second depends on σ² only. The same argument developed in this paper can be applied to these three estimates.

(3) The choice of the bootstrap sample size for the finite state space case is independent of the original sample size, as long as both converge to ∞. A natural choice for the bootstrap sample size is simply to take it equal to the original one. A Monte-Carlo study of this problem with some modified bootstrap techniques will appear in a separate paper.

(4) The asymptotic efficiency of the classical normal approximation versus the method of bootstrap, involving the second order approximation (Edgeworth type expansion) for Pr(√n(P̂_n − P) ≤ t), is an interesting open question.
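Following remark (3), a natural implementation takes N_n = n and converts the simulated distribution H_n* into a confidence interval for R. The sketch below is ours; the 'basic' bootstrap interval [R̂_n − q_{1−α/2}/√n, R̂_n − q_{α/2}/√n], with q the quantiles of the simulated √n(R̂*_n − R̂_n), is a standard bootstrap device and is not prescribed by the paper (states are indexed from 0).

```python
import numpy as np

def autocorr(x):
    """Circular first-order sample autocorrelation (x_{n+1} := x_1)."""
    d = np.asarray(x, float) - np.mean(x)
    return float(np.sum(d * np.roll(d, -1)) / np.sum(d * d))

def transition_mle(x, s):
    """MLE of the transition matrix from a path with states 0..s-1."""
    C = np.zeros((s, s))
    np.add.at(C, (x[:-1], x[1:]), 1)
    return C / np.maximum(C.sum(axis=1, keepdims=True), 1)

def sample_path(P, n, rng, x0=0):
    """Inverse-CDF simulation of n steps from transition matrix P."""
    cdf = P.cumsum(axis=1)
    u = rng.random(n)
    x = np.empty(n, dtype=int)
    x[0] = x0
    for j in range(1, n):
        x[j] = min(np.searchsorted(cdf[x[j - 1]], u[j]), P.shape[0] - 1)
    return x

def bootstrap_ci(x, s, B=500, level=0.95, rng=None):
    """Basic bootstrap interval for R with bootstrap sample size N_n = n."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    Rn = autocorr(x)
    P_hat = transition_mle(x, s)
    z = np.array([np.sqrt(n) * (autocorr(sample_path(P_hat, n, rng)) - Rn)
                  for _ in range(B)])
    a = (1 - level) / 2
    q_lo, q_hi = np.quantile(z, [a, 1 - a])
    return Rn - q_hi / np.sqrt(n), Rn - q_lo / np.sqrt(n)
```

This avoids computing Σ_R explicitly, which is the point of the bootstrap approach taken in this paper.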
Acknowledgements

The author thanks the editor and two referees for their helpful comments and suggestions, which led to a substantially improved article.
References

Athreya, K.B. and P. Ney (1978). A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc. 245, 493-501.
Athreya, K.B. and C.D. Fuh (1989). Bootstrapping Markov chains: Countable case. Technical Report B-89-7, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, ROC. To appear in J. Statist. Plann. Inference.
Athreya, K.B. and C.D. Fuh (1990). Central limit theorem for a double array of Harris chains. Technical Report B-90-1, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, ROC. To appear in Sankhyā.
Athreya, K.B. and C.D. Fuh (1992). Bootstrapping Markov chains. In: R. LePage and L. Billard, Eds., Exploring the Limits of Bootstrap. Wiley, New York, pp. 49-64.
Basawa, I.V. (1972). Estimation of the autocorrelation coefficient in simple Markov chains. Biometrika 59, 85-89.
Basawa, I., T. Green, W. McCormick and R. Taylor (1990). Asymptotic bootstrap validity for finite Markov chains. Comm. Statist. Theory Meth. 19, 1493-1510.
Bolthausen, E. (1982). The Berry-Esseen theorem for strongly mixing Harris recurrent Markov chains. Z. Wahrsch. Verw. Gebiete 60, 283-289.
Datta, S. and W. McCormick (1990). Bootstrapping the sampling distribution for a finite state Markov chain based on i.i.d. sampling. Technical Report 139, University of Georgia, Athens, GA 30602.
Derman, C. (1956). Some asymptotic distribution theory for Markov chains with a denumerable number of states. Biometrika 43, 285-294.
Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1-26.
Fuh, C.D. (1989). The bootstrap method for Markov chains. Ph.D. dissertation, Iowa State University, Ames, IA, USA.
Kulperger, R.J. and B.L.S. Prakasa Rao (1990). Bootstrapping a finite state Markov chain. Sankhyā Ser. A 51, 178-191.
Lo, S.H. (1989). On some representations of the bootstrap. Probab. Theory Related Fields 82, 411-418.