An Algebraic Treatment of Finite Markov Chains*

CHIN LONG CHIANG, University of California, Berkeley, California 94720

Received 17 October 1979

ABSTRACT

Simple and explicit formulas are presented for the n-step transition probabilities p_{ij}(n) in finite Markov chains. Formulas are also obtained for the limiting probability distribution and the mean return time for irreducible ergodic finite chains. An example is given for illustration.
1. INTRODUCTION

Consider a finite Markov chain with states 1, 2, ..., s, and a stochastic matrix

\[
P = (p_{ij}), \qquad i, j = 1, \dots, s, \tag{1}
\]

where

\[
\sum_{j=1}^{s} p_{ij} = 1, \qquad i = 1, \dots, s. \tag{2}
\]

For every n ≥ 1, the n-step transition probabilities are denoted by p_{ij}(n) and the corresponding matrix by

\[
P(n) = \bigl(p_{ij}(n)\bigr). \tag{3}
\]

It is well known that the matrix P(n) is equal to the nth power of the matrix
*The research reported herein was performed pursuant to a grant (003-7601-P2021) from the Department of Health, Education and Welfare, Washington, D.C. The opinions and conclusions expressed herein are solely those of the author, and should not be construed as representing the opinions or policy of any agency of the United States Government. Presented at the Second Vilnius Conference on Probability Theory and Mathematical Statistics, July 1977, Vilnius, Lithuania, U.S.S.R.

MATHEMATICAL BIOSCIENCES 50: 1-12 (1980)
©Elsevier North Holland, Inc., 1980, 52 Vanderbilt Ave., New York, NY 10017
0025-5564/80/050001-12$02.25
P, or

\[
P(n) = P^{n}. \tag{4}
\]
Conventionally, one could appeal to Sylvester's theorem to derive the nth power of a matrix. But the resulting formula for P(n) is a product of s characteristic matrices corresponding to the s eigenvalues of the matrix P. Feller [4] uses the probability generating function to derive the individual elements p_{ij}(n) directly. His formula for p_{ij}(n) involves right eigenvectors and left eigenvectors, which themselves are to be obtained from two systems of s simultaneous equations (see also Karlin [5]). Thus, the formulas derived from conventional methods are not conducive to easy computation. The purpose of this paper is to present a simple and explicit expression for the individual n-step transition probabilities p_{ij}(n), the limiting probability distribution, and the mean return time. An example is given for illustration.

2. THREE LEMMAS
To derive the main results, we need the following three lemmas.

LEMMA 1

Let a_1, a_2, ..., a_s be distinct numbers; then we have

\[
\sum_{i=1}^{s} \frac{a_i^{\,r}}{\prod_{j \ne i} (a_i - a_j)} =
\begin{cases}
0 & \text{for } r = 0, 1, \dots, s-2, \\
1 & \text{for } r = s-1.
\end{cases} \tag{5}
\]
For proof of the lemma, see [1], p. 347.

LEMMA 2

Any root λ of the characteristic equation of the matrix P,

\[
|\lambda I - P| = 0, \tag{6}
\]

is not greater than unity in absolute value, and one of the roots is λ = 1.
Lemma 2 is well known. For every root λ_l, l = 1, ..., s, of Eq. (6), let

\[
A(\lambda_l) = (\lambda_l I - P) \tag{7}
\]

be the corresponding characteristic matrix and

\[
\operatorname{adj} A(\lambda_l) = \bigl(A_{ij}(\lambda_l)\bigr) \tag{8}
\]

the adjoint matrix of A(λ_l). Take the kth (nonzero) column from each of the matrices in (8), and formulate a matrix

\[
T(k) = \begin{pmatrix}
A_{1k}(\lambda_1) & A_{1k}(\lambda_2) & \cdots & A_{1k}(\lambda_s) \\
\vdots & \vdots & & \vdots \\
A_{sk}(\lambda_1) & A_{sk}(\lambda_2) & \cdots & A_{sk}(\lambda_s)
\end{pmatrix}, \tag{9}
\]

where k is any arbitrary but constant integer.

LEMMA 3

If the eigenvalues λ_1, ..., λ_s are distinct, then the components of the kth column of the inverse T^{-1}(k) of T(k) in (9) are independent of k and are given by

\[
\frac{T_{kl}(k)}{|T(k)|} = \frac{1}{\prod_{\substack{m=1 \\ m \ne l}}^{s} (\lambda_l - \lambda_m)}, \qquad l = 1, \dots, s, \tag{10}
\]

where T_{kl}(k) is the (k, l)th cofactor of T(k).
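Lemma 3 can also be verified numerically. The sketch below assumes an illustrative 3-state chain (not from the paper) with distinct eigenvalues; the adjoint matrices of (8) are computed from cofactors, and `k = 0` plays the role of the fixed column index (0-based, where the lemma's indices are 1-based):

```python
import numpy as np

def adjugate(M):
    """Adjoint (adjugate) matrix: adj(M)[i, j] is the (j, i)th cofactor of M."""
    s = M.shape[0]
    adj = np.zeros((s, s), dtype=complex)
    for i in range(s):
        for j in range(s):
            minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])          # eigenvalues 1, 0.4, 0.3 (distinct)
s = P.shape[0]
lam = np.linalg.eigvals(P)

k = 0                                    # an arbitrary fixed column index
# T(k): its l-th column is the k-th column of adj(lam_l I - P), as in Eq. (9)
T = np.column_stack([adjugate(lam[l] * np.eye(s) - P)[:, k] for l in range(s)])
Tinv = np.linalg.inv(T)

# Eq. (10): the k-th column of T^{-1}(k) has components 1 / prod_{m != l}(lam_l - lam_m)
for l in range(s):
    predicted = 1 / np.prod([lam[l] - lam[m] for m in range(s) if m != l])
    print(abs(Tinv[l, k] - predicted) < 1e-8)
```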
Proof of the lemma is given in [2].

3. THE MAIN RESULTS: THE MATRIX P HAS SINGLE EIGENVALUES

We now use the above lemmas to derive the n-step transition probabilities, the limiting distribution, and the mean return time.

THEOREM 1

If the matrix P in (1) has distinct eigenvalues λ_1, ..., λ_s, then the n-step transition probabilities are given by

\[
p_{ij}(n) = \sum_{l=1}^{s} \frac{A_{ij}(\lambda_l)\,\lambda_l^{\,n}}{\prod_{\substack{m=1 \\ m \ne l}}^{s} (\lambda_l - \lambda_m)}, \qquad i, j = 1, \dots, s; \quad n = 1, \dots. \tag{11}
\]
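Equation (11) can be checked directly against a matrix power. A minimal numerical sketch, assuming a small illustrative chain with distinct eigenvalues (the elements A_{ij}(λ_l) are taken from the adjoint matrices of (8), computed here from cofactors):

```python
import numpy as np

def adjugate(M):
    """Adjoint (adjugate) matrix: adj(M)[i, j] is the (j, i)th cofactor of M."""
    s = M.shape[0]
    adj = np.zeros((s, s), dtype=complex)
    for i in range(s):
        for j in range(s):
            minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

def pn_spectral(P, n):
    """n-step transition matrix via Eq. (11); assumes distinct eigenvalues."""
    s = P.shape[0]
    lam = np.linalg.eigvals(P)
    Pn = np.zeros((s, s), dtype=complex)
    for l in range(s):
        denom = np.prod([lam[l] - lam[m] for m in range(s) if m != l])
        Pn += adjugate(lam[l] * np.eye(s) - P) * lam[l] ** n / denom
    return Pn.real  # imaginary parts cancel for a real stochastic P

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
print(np.allclose(pn_spectral(P, 5), np.linalg.matrix_power(P, 5)))
```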
Proof. Every nonzero column in (8) is an eigenvector of the matrix P corresponding to the eigenvalue λ_l; that is,

\[
(\lambda_l I - P)
\begin{pmatrix} A_{1k}(\lambda_l) \\ \vdots \\ A_{sk}(\lambda_l) \end{pmatrix}
=
\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{12}
\]
When the eigenvalues are distinct, the vectors in the matrix T(k) in (9) are linearly independent; therefore the inverse T^{-1}(k) exists. It is easy to show that the matrix T(k) defined in (9) diagonalizes the matrix P as follows:

\[
T^{-1}(k)\,P\,T(k) =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_s
\end{pmatrix}, \tag{13}
\]

so that

\[
P = T(k)
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_s
\end{pmatrix}
T^{-1}(k). \tag{14}
\]

Direct computations give the nth power of P:

\[
P^{n} = T(k)
\begin{pmatrix}
\lambda_1^{\,n} & 0 & \cdots & 0 \\
0 & \lambda_2^{\,n} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_s^{\,n}
\end{pmatrix}
T^{-1}(k). \tag{15}
\]
Expanding both sides of (15) yields the formulas for the individual transition probabilities,

\[
p_{ij}(n) = \sum_{l=1}^{s} A_{ik}(\lambda_l)\,\lambda_l^{\,n}\,\frac{T_{jl}(k)}{|T(k)|}, \qquad i, j = 1, \dots, s, \quad n = 1, \dots. \tag{16}
\]

This formula holds true even if the s vectors of T(k) in (9) are taken from different columns of (8), that is, if k varies with l. In any event, the formula in (16) is independent of the choice of columns. However, if k is the same for all the s vectors, then the formula of p_{ij}(n) can be further simplified. In such cases, we set k = j and write

\[
p_{ij}(n) = \sum_{l=1}^{s} \lambda_l^{\,n}\,A_{ij}(\lambda_l)\,\frac{T_{jl}(j)}{|T(j)|}, \qquad i, j = 1, \dots, s, \quad n = 1, 2, \dots. \tag{17}
\]

Substituting (10) for k = j in (17), we recover (11) as required.
For an irreducible finite Markov chain with ergodic states, the limiting probabilities exist and are equal to the reciprocals of the corresponding mean return times μ_{jj}; i.e.,

\[
\pi_j = \frac{1}{\mu_{jj}}. \tag{18}
\]

Proof of (18) may be found in Chung [3] or Feller [4]. The following theorem gives explicit formulas for the limiting probabilities.

THEOREM 2

In an irreducible ergodic finite Markov chain with a stochastic matrix P given in (1), the limiting probabilities

\[
\lim_{n\to\infty} p_{ij}(n) = \pi_j, \qquad i, j = 1, \dots, s, \tag{19}
\]

are given by

\[
\pi_j = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \qquad j = 1, \dots, s, \tag{20}
\]

where A_{kk}(1) is the (k, k)th cofactor of the matrix A(1) = (I - P), and I is an s × s unit matrix.

Proof.
For every n > 0,

\[
p_{ij}(n+1) = \sum_{k=1}^{s} p_{ik}(n)\,p_{kj}. \tag{21}
\]

Taking the limit of both sides of (21) as n → ∞, we find the limiting probabilities satisfying the equation

\[
\pi_j = \sum_{i=1}^{s} \pi_i\,p_{ij}, \qquad j = 1, \dots, s. \tag{22}
\]
Let

\[
\pi = (\pi_1, \dots, \pi_s)' \tag{23}
\]

be a column vector of the limiting probabilities. Equation (22) may be rewritten as π = P'π, or

\[
(I - P')\,\pi = 0, \tag{24}
\]

where P' is the transpose of matrix P. Thus π is an eigenvector of P' corresponding to the eigenvalue λ = 1, and is given by

\[
\pi = c \begin{pmatrix} A'_{1j}(1) \\ A'_{2j}(1) \\ \vdots \\ A'_{sj}(1) \end{pmatrix}, \tag{25}
\]

where c is an arbitrary constant and A'_{ij}(1) is the (i, j)th cofactor of the matrix A'(1) = (I - P'). It is easy to show that

\[
A'_{ij}(1) = A_{ji}(1) = A_{ii}(1), \qquad i = 1, \dots, s, \tag{26}
\]
and (25) becomes

\[
\frac{\pi_1}{A_{11}(1)} = \frac{\pi_2}{A_{22}(1)} = \cdots = \frac{\pi_s}{A_{ss}(1)} = c. \tag{27}
\]

In order for {π_j} to be a probability distribution, the constant c must be

\[
c = \frac{1}{\sum_{k=1}^{s} A_{kk}(1)}. \tag{28}
\]

Substituting (28) in (27) yields (20).
COROLLARY

The mean return time of an ergodic state j is given by

\[
\mu_{jj} = \frac{\sum_{k=1}^{s} A_{kk}(1)}{A_{jj}(1)}. \tag{29}
\]

The corollary is implied in (18) and (20).
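Equations (20) and (29) reduce the limiting distribution and the mean return times to the s diagonal cofactors of I − P. A numerical sketch for an illustrative 3-state chain (the matrix is ours, not the paper's):

```python
import numpy as np

def cofactor(M, i, j):
    """(i, j)th cofactor of M."""
    minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
s = P.shape[0]
A1 = np.eye(s) - P                                  # A(1) = I - P
d = np.array([cofactor(A1, k, k) for k in range(s)])

pi = d / d.sum()                                    # Eq. (20)
mu = d.sum() / d                                    # Eq. (29): mu_jj = 1 / pi_j

print(pi)                                           # limiting distribution
print(np.allclose(pi, np.linalg.matrix_power(P, 200)[0]))
```

Note that πμ = 1 componentwise, which is exactly relation (18).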
We have derived in (11) a formula for the transition probability p_{ij}(n). Obviously, the limiting probability π_j may be derived from (11) by taking the limit as n → ∞:

\[
\pi_j = \lim_{n\to\infty} \sum_{l=1}^{s} \frac{A_{ij}(\lambda_l)\,\lambda_l^{\,n}}{\prod_{\substack{m=1 \\ m \ne l}}^{s} (\lambda_l - \lambda_m)}. \tag{30}
\]
The two formulas for π_j in (20) and (30) must be equal. The following theorem establishes this fact.

THEOREM 3

The formula (20) for the limiting probabilities π_j is equal to the formula in (30). Namely,

\[
\lim_{n\to\infty} \sum_{l=1}^{s} \frac{A_{ij}(\lambda_l)\,\lambda_l^{\,n}}{\prod_{\substack{m=1 \\ m \ne l}}^{s} (\lambda_l - \lambda_m)}
= \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \qquad i, j = 1, \dots, s. \tag{31}
\]
Proof. In view of Lemma 2, we let λ_1 = 1 and |λ_l| < 1 for l = 2, ..., s. As n → ∞, λ_l^n → 0 for l = 2, ..., s, and (31) becomes

\[
\frac{A_{ij}(1)}{\prod_{m=2}^{s} (1 - \lambda_m)} = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \tag{32}
\]

or, since A_{ij}(1) = A_{jj}(1), we need to show that

\[
\prod_{m=2}^{s} (1 - \lambda_m) = \sum_{k=1}^{s} A_{kk}(1). \tag{33}
\]

To prove (33) we introduce a matrix A(1) - ρI and solve the equation

\[
|A(1) - \rho I| = 0 \tag{34}
\]

to obtain the roots ρ_1, ρ_2, ..., ρ_s. It is well known in algebra that

\[
\rho_2 \rho_3 \cdots \rho_s + \rho_1 \rho_3 \cdots \rho_s + \cdots + \rho_1 \rho_2 \cdots \rho_{s-1} = \sum_{k=1}^{s} A_{kk}(1). \tag{35}
\]

Since A(1) = (I - P), we have

\[
A(1) - \rho I = (1 - \rho)I - P = \lambda I - P,
\]

with ρ = 1 - λ. The roots λ_1, ..., λ_s of the equation |λI - P| = 0 in (6) and the roots of the equation in (34) have the relationship

\[
\rho_l = 1 - \lambda_l \qquad \text{for } l = 1, \dots, s. \tag{36}
\]

Substituting (36) in (35) and recognizing ρ_1 = 1 - λ_1 = 0, we find

\[
(1 - \lambda_2)(1 - \lambda_3) \cdots (1 - \lambda_s) = \sum_{k=1}^{s} A_{kk}(1). \tag{37}
\]

Therefore, we recover (33) and hence (31) in the theorem.
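The key identity (33), ∏_{m=2}^{s}(1 − λ_m) = Σ_k A_{kk}(1), is easy to confirm numerically. A sketch for the same illustrative 3-state chain used above (λ_1 = 1 is placed first by sorting on the real part):

```python
import numpy as np

def cofactor(M, i, j):
    """(i, j)th cofactor of M."""
    minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
s = P.shape[0]

lam = sorted(np.linalg.eigvals(P), key=lambda z: -z.real)   # lam[0] = 1
lhs = np.prod([1 - lam[m] for m in range(1, s)])            # prod_{m>=2} (1 - lam_m)
rhs = sum(cofactor(np.eye(s) - P, k, k) for k in range(s))  # sum_k A_kk(1)
print(lhs, rhs)
```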
The probabilities p_{ij}(n) as given in (11) are always real functions of p_{ij}, since they are real elements of P^n, regardless of whether the stochastic matrix P has complex eigenvalues.

4. THE MAIN RESULTS: THE MATRIX P HAS MULTIPLE EIGENVALUES

In Sec. 3 we assumed that the eigenvalues λ_1, ..., λ_s of matrix P are distinct. We shall now extend the results to cases where the stochastic matrix P has multiple eigenvalues. The formulas for the probabilities p_{ij}(n) are given in the following theorem.

THEOREM 4

If the matrix P in (1) has multiple roots λ_1, ..., λ_r, with multiplicities s_1, ..., s_r, respectively, so that s_1 + ⋯ + s_r = s, then the nth order transition probabilities are given by
\[
p_{ij}(n) = \sum_{l=1}^{r} \sum_{m=0}^{s_l-1} \sum\cdots\sum\;
\prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r}
\frac{(-1)^{m_\alpha}\binom{s_\alpha+m_\alpha-1}{m_\alpha}}{(\lambda_l-\lambda_\alpha)^{s_\alpha+m_\alpha}}
\sum_{h=0}^{s_l-1-m} \binom{n}{s_l-1-m-h}\,
\lambda_l^{\,n-(s_l-1-m-h)}\,
\frac{1}{h!}\,\frac{d^{\,h}}{d\lambda^{h}} A_{ij}(\lambda)\Big|_{\lambda=\lambda_l},
\qquad i, j = 1, \dots, s, \tag{38}
\]

where Σ⋯Σ stands for r - 1 summations taken over m_α = 0, 1, ..., m, α ≠ l, α = 1, ..., r, such that (m_1 + ⋯ + m_r) - m_l = m, and

\[
\frac{d^{\,h}}{d\lambda^{h}} A_{ij}(\lambda)\Big|_{\lambda=\lambda_l}
\]

is the hth derivative of the cofactor A_{ij}(λ) of A(λ) evaluated at λ = λ_l.
Proof. We first assume that the matrix P has s distinct eigenvalues, divide them into r groups according to order of magnitude, (λ_{11}, ..., λ_{1s_1}), ..., (λ_{r1}, ..., λ_{rs_r}), and rewrite (11) as follows:

\[
p_{ij}(n) = \sum_{l=1}^{r} \sum_{\beta=1}^{s_l}
\frac{A_{ij}(\lambda_{l\beta})\,\lambda_{l\beta}^{\,n}}
{\prod_{\substack{\delta=1 \\ \delta\ne\beta}}^{s_l} (\lambda_{l\beta}-\lambda_{l\delta})
\prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r} \prod_{\gamma=1}^{s_\alpha} (\lambda_{l\beta}-\lambda_{\alpha\gamma})}. \tag{39}
\]

We shall derive (38) from (39) by taking the limit as λ_{lβ} → λ_l, β = 1, ..., s_l, l = 1, ..., r. Formally, we let λ_{lβ} = λ_l + ε_{lβ} and write

\[
p_{ij}(n) = \sum_{l=1}^{r} \sum_{\beta=1}^{s_l}
\frac{A_{ij}(\lambda_l+\varepsilon_{l\beta})\,(\lambda_l+\varepsilon_{l\beta})^{n}}
{\prod_{\substack{\delta=1 \\ \delta\ne\beta}}^{s_l} (\varepsilon_{l\beta}-\varepsilon_{l\delta})
\prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r} \prod_{\gamma=1}^{s_\alpha} (\lambda_l+\varepsilon_{l\beta}-\lambda_\alpha-\varepsilon_{\alpha\gamma})}. \tag{40}
\]

We evaluate the numerator in each term as follows:

\[
(\lambda_l+\varepsilon_{l\beta})^{n} = \sum_{\delta=0}^{n} \binom{n}{\delta}\,\varepsilon_{l\beta}^{\,\delta}\,\lambda_l^{\,n-\delta}
\]

and

\[
A_{ij}(\lambda_l+\varepsilon_{l\beta}) = \sum_{h\ge 0} \frac{\varepsilon_{l\beta}^{\,h}}{h!}\,\frac{d^{\,h}}{d\lambda^{h}} A_{ij}(\lambda)\Big|_{\lambda=\lambda_l}.
\]

This product is

\[
A_{ij}(\lambda_l+\varepsilon_{l\beta})\,(\lambda_l+\varepsilon_{l\beta})^{n}
= \sum_{k\ge 0} \varepsilon_{l\beta}^{\,k} \sum_{h=0}^{\min[k,\,s-1]}
\binom{n}{k-h}\,\lambda_l^{\,n-(k-h)}\,\frac{1}{h!}\,\frac{d^{\,h}}{d\lambda^{h}} A_{ij}(\lambda)\Big|_{\lambda=\lambda_l}, \tag{41}
\]

where min[k, s-1] represents the smaller of the two numbers k and s-1. As ε_{αγ} → 0, we find from the denominator of (40)

\[
\Bigl[\prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r} (\lambda_l+\varepsilon_{l\beta}-\lambda_\alpha)^{s_\alpha}\Bigr]^{-1}
= \sum_{m\ge 0} \varepsilon_{l\beta}^{\,m} \sum\cdots\sum
\prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r}
\frac{(-1)^{m_\alpha}\binom{s_\alpha+m_\alpha-1}{m_\alpha}}{(\lambda_l-\lambda_\alpha)^{s_\alpha+m_\alpha}}, \tag{42}
\]

where Σ⋯Σ stands for r - 1 summations taken over m_α = 0, 1, ..., m, α ≠ l, α = 1, 2, ..., r, such that (m_1 + ⋯ + m_r) - m_l = m. Substituting (41) and (42) in (40) yields
\[
p_{ij}(n) = \sum_{l=1}^{r} \sum_{k\ge 0} \sum_{m\ge 0}
\Bigl[\sum\cdots\sum \prod_{\substack{\alpha=1 \\ \alpha\ne l}}^{r}
\frac{(-1)^{m_\alpha}\binom{s_\alpha+m_\alpha-1}{m_\alpha}}{(\lambda_l-\lambda_\alpha)^{s_\alpha+m_\alpha}}\Bigr]
\sum_{h=0}^{\min[k,\,s-1]} \binom{n}{k-h}\,\lambda_l^{\,n-(k-h)}\,\frac{1}{h!}\,\frac{d^{\,h}}{d\lambda^{h}} A_{ij}(\lambda)\Big|_{\lambda=\lambda_l}
\sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{\,k+m}}{\prod_{\delta\ne\beta}(\varepsilon_{l\beta}-\varepsilon_{l\delta})}. \tag{43}
\]

According to Lemma 1,

\[
\sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{\,k+m}}{\prod_{\delta\ne\beta}(\varepsilon_{l\beta}-\varepsilon_{l\delta})}
= \begin{cases}
0 & \text{for } k+m = 0, 1, \dots, s_l-2, \\
1 & \text{for } k+m = s_l-1.
\end{cases} \tag{44}
\]

For k + m = s_l, s_l + 1, ..., we take the limit as ε_{lβ} → 0 to find

\[
\lim_{\varepsilon_{l\beta}\to 0} \sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{\,k+m}}{\prod_{\delta\ne\beta}(\varepsilon_{l\beta}-\varepsilon_{l\delta})} = 0
\qquad \text{for } k+m \ge s_l, \tag{45}
\]

since the numerator is of smaller order of magnitude than the denominator. For each m = 0, 1, ..., s_l - 1, we can find a value of k such that k + m = s_l - 1, or k = s_l - 1 - m. As a result, (43) reduces to (38). The proof of the theorem is complete.

Remark. While the probabilities p_{ij}(n) assume different expressions in (11) and (38) depending upon the multiplicities of the eigenvalues of matrix P, the limiting probability distribution (π_1, ..., π_s) given in (20) and the mean return times μ_{jj} given in (29) hold true also when the matrix P has multiple eigenvalues. This is because Theorem 2 does not require the matrix P to have all single eigenvalues, and λ = 1 is a single root.
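As a numerical illustration of the Remark, a chain with a repeated eigenvalue still obeys (20). The sketch below uses the symmetric random walk on three states (eigenvalues 1 and −1/2 with multiplicity 2); the example is ours, not the paper's:

```python
import numpy as np

def cofactor(M, i, j):
    """(i, j)th cofactor of M."""
    minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

# Eigenvalues of this P are 1 (simple) and -1/2 (multiplicity 2)
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
s = P.shape[0]
d = np.array([cofactor(np.eye(s) - P, k, k) for k in range(s)])
pi = d / d.sum()                                    # Eq. (20) still applies

print(pi)                                           # uniform distribution
print(np.allclose(np.linalg.matrix_power(P, 100), 1 / 3))
```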
5. AN EXAMPLE
This example appears in Feller [4]. We use it to illustrate the present method of computation. Let

\[
P = \begin{pmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 \\
\tfrac12 & \tfrac12 & 0 & 0 \\
0 & 0 & 1 & 0
\end{pmatrix}.
\]

The transitions are 1→4, 2→4, 3→1, 3→2, and 4→3. Therefore, all the four states communicate. There are four roots of the equation |λI - P| = 0; two of the roots are complex conjugates: λ_1 = 0, λ_2 = 1, λ_3 = (-1 + i√3)/2, and λ_4 = (-1 - i√3)/2. With these values we find λ_3² = λ_4, λ_4² = λ_3, λ_3³ = λ_4³ = 1,

\[
\prod_{m=2}^{4} (\lambda_1 - \lambda_m) = -1,
\qquad
\prod_{\substack{m=1 \\ m \ne l}}^{4} (\lambda_l - \lambda_m) = 3 \quad \text{for } l = 2, 3, 4,
\]

and

\[
A(\lambda_l) = \begin{pmatrix}
\lambda_l & 0 & 0 & -1 \\
0 & \lambda_l & 0 & -1 \\
-\tfrac12 & -\tfrac12 & \lambda_l & 0 \\
0 & 0 & -1 & \lambda_l
\end{pmatrix}, \qquad l = 1, 2, 3, 4.
\]

The formula for p_{ij}(n) varies with n. For n = 3k, 3k + 1, 3k + 2 (k = 1, 2, ...) we use (11) and compute

\[
P^{3k} = \begin{pmatrix}
\tfrac12 & \tfrac12 & 0 & 0 \\
\tfrac12 & \tfrac12 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
\qquad
P^{3k+1} = \begin{pmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 \\
\tfrac12 & \tfrac12 & 0 & 0 \\
0 & 0 & 1 & 0
\end{pmatrix},
\]

and

\[
P^{3k+2} = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\tfrac12 & \tfrac12 & 0 & 0
\end{pmatrix}.
\]

Therefore, the corresponding Markov chain is periodic with period t = 3; the system starts from scratch after every three steps.
To find the stationary probability distribution, we formulate the matrix

\[
A(1) = \begin{pmatrix}
1 & 0 & 0 & -1 \\
0 & 1 & 0 & -1 \\
-\tfrac12 & -\tfrac12 & 1 & 0 \\
0 & 0 & -1 & 1
\end{pmatrix}
\]

and compute the cofactors

\[
A_{11}(1) = \tfrac12, \qquad A_{22}(1) = \tfrac12, \qquad A_{33}(1) = 1, \qquad A_{44}(1) = 1.
\]

Therefore, the stationary distribution is

\[
(\pi_1, \pi_2, \pi_3, \pi_4) = \left(\tfrac16, \tfrac16, \tfrac13, \tfrac13\right).
\]

It is easy to check that
\[
\left(\tfrac16, \tfrac16, \tfrac13, \tfrac13\right)
\begin{pmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 \\
\tfrac12 & \tfrac12 & 0 & 0 \\
0 & 0 & 1 & 0
\end{pmatrix}
= \left(\tfrac16, \tfrac16, \tfrac13, \tfrac13\right).
\]
REFERENCES

1. C. L. Chiang, A stochastic model of competing risks of illness and competing risks of death, in Stochastic Models in Medicine and Biology (J. Gurland, Ed.), Univ. of Wisconsin Press, Madison, 1964, pp. 325-354.
2. C. L. Chiang, Introduction to Stochastic Processes in Biostatistics, Wiley, New York, 1968.
3. K. L. Chung, Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.
4. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed., Wiley, New York, 1968.
5. S. Karlin, A First Course in Stochastic Processes, Academic, New York, 1966.
6. F. R. Gantmacher, The Theory of Matrices, Vol. I, Chelsea, New York, 1960.