An Algebraic Treatment of Finite Markov Chains*

CHIN LONG CHIANG
University of California, Berkeley, California 94720

Received 17 October 1979

ABSTRACT

Simple and explicit formulas are presented for the $n$-step transition probabilities $p_{ij}(n)$ in finite Markov chains. Formulas are also obtained for the limiting probability distribution and the mean return time for irreducible ergodic finite chains. An example is given for illustration.

*The research reported herein was performed pursuant to a grant (003-7601-P2021) from the Department of Health, Education and Welfare, Washington, D.C. The opinions and conclusions expressed herein are solely those of the author, and should not be construed as representing the opinions or policy of any agency of the United States Government. Presented at the Second Vilnius Conference on Probability Theory and Mathematical Statistics, July, 1977, Vilnius, Lithuania, U.S.S.R.

MATHEMATICAL BIOSCIENCES 50:1-12 (1980)
© Elsevier North Holland, Inc., 1980, 52 Vanderbilt Ave., New York, NY 10017
0025-5564/80/050001+12$02.25

1. INTRODUCTION

Consider a finite Markov chain with states $1, 2, \ldots, s$, and a stochastic matrix

$$P = (p_{ij}), \qquad i, j = 1, \ldots, s, \tag{1}$$

where

$$\sum_{j=1}^{s} p_{ij} = 1, \qquad i = 1, \ldots, s. \tag{2}$$

For every $n \ge 1$, the $n$-step transition probabilities are denoted by $p_{ij}(n)$ and the corresponding matrix by

$$P(n) = (p_{ij}(n)). \tag{3}$$

It is well known that the matrix $P(n)$ is equal to the $n$th power of the matrix $P$, or

$$P(n) = P^{n}. \tag{4}$$
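Relation (4) can be evaluated directly by repeated multiplication. The sketch below is only an illustration: the two-state matrix `P` and the helper names `mat_mul` and `n_step` are hypothetical and do not come from the paper. Exact rational arithmetic is used so that the row sums in (2) can be checked without rounding.

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    s = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(s)) for j in range(s)]
            for i in range(s)]

def n_step(p, n):
    """Return P(n) = P**n for n >= 1 by repeated multiplication, as in (4)."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result

# Hypothetical two-state stochastic matrix (not taken from the paper).
P = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]

P2 = n_step(P, 2)
P5 = n_step(P, 5)
print(P2)
```

Each row of `P5` still sums to one, as (2) requires of any power of a stochastic matrix.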

Conventionally, one could appeal to Sylvester's theorem to derive the $n$th power of a matrix. But the resulting formula for $P(n)$ is a product of $s$ characteristic matrices corresponding to $s$ eigenvalues of matrix $P$. Feller [4] uses the probability generating function to derive individual elements $p_{ij}(n)$ directly. His formula for $p_{ij}(n)$ involves right eigenvectors and left eigenvectors, which themselves are to be obtained from two systems of $s$ simultaneous equations (see also Karlin [5]). Thus, the formulas derived from conventional methods are not conducive to easy computations. The purpose of this paper is to present a simple and explicit expression for individual $n$-step transition probabilities $p_{ij}(n)$, the limiting probability distribution, and the mean return time. An example is given for illustration.

2. THREE LEMMAS

To derive the main results, we need the following three lemmas.

LEMMA 1

Let $a_1, a_2, \ldots, a_s$ be distinct numbers; then we have

$$\sum_{i=1}^{s} \frac{a_i^{r}}{\prod_{j=1,\, j \ne i}^{s} (a_i - a_j)} = \begin{cases} 0 & \text{for } r = 0, 1, \ldots, s-2, \\ 1 & \text{for } r = s-1. \end{cases} \tag{5}$$

For proof of the lemma, see [1], p. 347.

LEMMA 2

Any root $\lambda$ of the characteristic equation of the matrix $P$,

$$|\lambda I - P| = 0, \tag{6}$$

is not greater than unity in absolute value, and one of the roots is $\lambda = 1$.
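Lemma 1 lends itself to a quick exact check. The following sketch evaluates the left-hand side of (5) with rational arithmetic; the numbers in `a` and the function name `lemma1_sum` are arbitrary illustrative choices, not values from the paper.

```python
from fractions import Fraction as F

def lemma1_sum(a, r):
    """Left-hand side of (5) for distinct numbers a[0], ..., a[s-1]."""
    total = F(0)
    for i, ai in enumerate(a):
        denom = F(1)
        for j, aj in enumerate(a):
            if j != i:
                denom *= ai - aj
        total += F(ai) ** r / denom
    return total

# Hypothetical distinct numbers, chosen only for illustration.
a = [F(2), F(-1), F(1, 3), F(5)]
s = len(a)

# One value for each exponent r = 0, 1, ..., s-1.
values = [lemma1_sum(a, r) for r in range(s)]
print(values)
```

According to (5), the list should consist of zeros for $r = 0, \ldots, s-2$ followed by a single one for $r = s-1$.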

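Lemma 2 is well known. For every root $\lambda_l$ of Eq. (6), $l = 1, \ldots, s$, let

$$A(\lambda_l) = (\lambda_l I - P) \tag{7}$$

be the corresponding characteristic matrix and

$$\begin{pmatrix} A_{11}(\lambda_l) & A_{21}(\lambda_l) & \cdots & A_{s1}(\lambda_l) \\ A_{12}(\lambda_l) & A_{22}(\lambda_l) & \cdots & A_{s2}(\lambda_l) \\ \vdots & \vdots & & \vdots \\ A_{1s}(\lambda_l) & A_{2s}(\lambda_l) & \cdots & A_{ss}(\lambda_l) \end{pmatrix} \tag{8}$$

the adjoint matrix of $A(\lambda_l)$, where $A_{ij}(\lambda_l)$ is the $(i,j)$th cofactor of $A(\lambda_l)$. Take the $k$th (nonzero) column from each of the matrices in (8), and formulate a matrix

$$T(k) = \begin{pmatrix} A_{k1}(\lambda_1) & A_{k1}(\lambda_2) & \cdots & A_{k1}(\lambda_s) \\ A_{k2}(\lambda_1) & A_{k2}(\lambda_2) & \cdots & A_{k2}(\lambda_s) \\ \vdots & \vdots & & \vdots \\ A_{ks}(\lambda_1) & A_{ks}(\lambda_2) & \cdots & A_{ks}(\lambda_s) \end{pmatrix}, \tag{9}$$

where $k$ is an arbitrary but constant integer, $1 \le k \le s$.

LEMMA 3

If the eigenvalues $\lambda_1, \ldots, \lambda_s$ are distinct, then the components of the $k$th column of the inverse $T^{-1}(k)$ of $T(k)$ in (9) are independent of $k$ and are given by

$$\frac{T_{kl}(k)}{|T(k)|} = \frac{1}{\prod_{m=1,\, m \ne l}^{s} (\lambda_l - \lambda_m)}, \qquad l = 1, \ldots, s, \tag{10}$$

where $T_{kl}(k)$ is the $(k,l)$th cofactor of $T(k)$.

Proof of the lemma is given in [2].

3. THE MAIN RESULTS: THE MATRIX P HAS SINGLE EIGENVALUES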

We now use the above lemmas to derive the $n$-step transition probabilities, the limiting distribution, and the mean return time.

THEOREM 1

If the matrix $P$ in (1) has distinct eigenvalues $\lambda_1, \ldots, \lambda_s$, then the $n$-step transition probabilities are given by

$$p_{ij}(n) = \sum_{l=1}^{s} \frac{A_{ji}(\lambda_l)\, \lambda_l^{n}}{\prod_{m=1,\, m \ne l}^{s} (\lambda_l - \lambda_m)}, \qquad i, j = 1, \ldots, s; \quad n = 1, \ldots. \tag{11}$$


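Proof. Every nonzero column in (8) is an eigenvector of the matrix $P$ corresponding to the eigenvalue $\lambda_l$; that is,

$$(\lambda_l I - P) \begin{pmatrix} A_{k1}(\lambda_l) \\ \vdots \\ A_{ks}(\lambda_l) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{12}$$

When the eigenvalues are distinct, the vectors in the matrix $T(k)$ in (9) are linearly independent; therefore the inverse $T^{-1}(k)$ exists. It is easy to show that the matrix $T(k)$ defined in (9) diagonalizes the matrix $P$ as follows:

$$T^{-1}(k)\, P\, T(k) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_s \end{pmatrix}, \tag{13}$$

so that

$$P = T(k) \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_s \end{pmatrix} T^{-1}(k). \tag{14}$$

Direct computations give the $n$th power of $P$:

$$P^{n} = T(k) \begin{pmatrix} \lambda_1^{n} & 0 & \cdots & 0 \\ 0 & \lambda_2^{n} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_s^{n} \end{pmatrix} T^{-1}(k). \tag{15}$$

Expanding both sides of (15) yields the formulas for individual transition probabilities,

$$p_{ij}(n) = \sum_{l=1}^{s} A_{ki}(\lambda_l)\, \lambda_l^{n}\, \frac{T_{jl}(k)}{|T(k)|}, \qquad i, j = 1, \ldots, s; \quad n = 1, \ldots. \tag{16}$$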

This formula holds true even if the $s$ vectors of $T(k)$ in (9) are taken from different columns of (8), that is, if $k$ varies with $l$. In any event, the formula in (16) is independent of the choice of columns. However, if $k$ is the same for all the $s$ vectors, then the formula of $p_{ij}(n)$ can be further simplified. In such cases, we set $k = j$ and write

$$p_{ij}(n) = \sum_{l=1}^{s} A_{ji}(\lambda_l)\, \lambda_l^{n}\, \frac{T_{jl}(j)}{|T(j)|}, \qquad i, j = 1, \ldots, s; \quad n = 1, 2, \ldots. \tag{17}$$

Substituting (10) for $k = j$ in (17) we recover (11) as required.
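Formula (11) can be tried numerically. The sketch below applies it to the four-state matrix of the example in Sec. 5, using the eigenvalues quoted there. The helper names `det`, `cofactor`, and `p_n` are hypothetical, and the cofactor $A_{ji}(\lambda_l)$ is read as the $(j,i)$th cofactor of $\lambda_l I - P$. Determinants are computed by plain Laplace expansion, which is adequate only for small $s$.

```python
import cmath

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cofactor(m, i, j):
    """(i, j)th cofactor of m, with 0-based indices."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

def p_n(P, eigenvalues, i, j, n):
    """Formula (11): p_ij(n) for distinct eigenvalues (0-based i, j; n >= 1)."""
    s = len(P)
    total = 0
    for l, lam in enumerate(eigenvalues):
        # Characteristic matrix A(lam) = lam*I - P, as in (7).
        A = [[(lam if r == c else 0) - P[r][c] for c in range(s)] for r in range(s)]
        denom = 1
        for m, mu in enumerate(eigenvalues):
            if m != l:
                denom *= lam - mu
        total += cofactor(A, j, i) * lam ** n / denom
    return total

# Matrix and eigenvalues of the example in Sec. 5.
P = [[0, 0, 0, 1], [0, 0, 0, 1], [0.5, 0.5, 0, 0], [0, 0, 1, 0]]
w = (-1 + cmath.sqrt(-3)) / 2
eigenvalues = [0, 1, w, w.conjugate()]

# Two-step transition matrix from (11); imaginary parts cancel up to rounding.
P2 = [[p_n(P, eigenvalues, i, j, 2).real for j in range(4)] for i in range(4)]
print([[round(x, 10) for x in row] for row in P2])
```

The complex terms for the conjugate pair of eigenvalues combine into real probabilities, so only the real part is kept after summation.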

For an irreducible finite Markov chain with ergodic states, the limiting probabilities exist and are equal to the reciprocal of the corresponding mean return times $\mu_{jj}$, i.e.,

$$\lim_{n \to \infty} p_{ij}(n) = \frac{1}{\mu_{jj}}, \qquad j = 1, \ldots, s. \tag{18}$$

Proof of (18) may be found in Chung [3] or Feller [4]. The following theorem gives explicit formulas for the limiting probabilities.

THEOREM 2

In an irreducible ergodic finite Markov chain with a stochastic matrix $P$ given in (1), the limiting probabilities

$$\lim_{n \to \infty} p_{ij}(n) = \pi_j, \qquad j = 1, \ldots, s, \tag{19}$$

are given by

$$\pi_j = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \qquad j = 1, \ldots, s, \tag{20}$$

where $A_{kk}(1)$ is the $(k,k)$th cofactor of the matrix $A(1) = (I - P)$, and $I$ is an $s \times s$ unit matrix.

Proof. For every $n > 0$,

$$p_{ki}(n+1) = \sum_{j=1}^{s} p_{kj}(n)\, p_{ji}, \qquad i = 1, \ldots, s. \tag{21}$$

Taking the limit of both sides of (21) as $n \to \infty$, we find the limiting probabilities satisfying the equation

$$\pi_i = \sum_{j=1}^{s} \pi_j\, p_{ji}, \qquad i = 1, \ldots, s. \tag{22}$$

Let

$$\pi = (\pi_1, \ldots, \pi_s)' \tag{23}$$

be a column vector of the limiting probabilities. Equation (22) may be rewritten as

$$\pi = P'\pi$$

or

$$(I - P')\pi = 0, \tag{24}$$

where $P'$ is the transpose of matrix $P$. Thus $\pi$ is an eigenvector of $P'$ corresponding to the eigenvalue $\lambda = 1$, and is given by

$$\pi = c \begin{pmatrix} A'_{i1}(1) \\ A'_{i2}(1) \\ \vdots \\ A'_{is}(1) \end{pmatrix}, \tag{25}$$

where $c$ is an arbitrary constant and $A'_{ij}(1)$ is the $(i,j)$th cofactor of the matrix $A'(1) = (I - P')$. It is easy to show that

$$A'_{ij}(1) = A_{ji}(1) = A_{jj}(1), \qquad j = 1, \ldots, s, \tag{26}$$

and (25) becomes

$$\pi = c \begin{pmatrix} A_{11}(1) \\ A_{22}(1) \\ \vdots \\ A_{ss}(1) \end{pmatrix}. \tag{27}$$

In order for $\{\pi_j\}$ to be a probability distribution, the constant $c$ must be

$$c = \frac{1}{\sum_{k=1}^{s} A_{kk}(1)}. \tag{28}$$

Substituting (28) in (27) yields (20).

COROLLARY

The mean return time of an ergodic state $j$ is given by

$$\mu_{jj} = \frac{\sum_{k=1}^{s} A_{kk}(1)}{A_{jj}(1)}. \tag{29}$$

The corollary is implied in (18) and (20).
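Formulas (20) and (29) need only the principal cofactors of $I - P$, so they are easy to compute exactly. The sketch below does so for a hypothetical three-state ergodic chain; the matrix `P` and the function names are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_cofactors(P):
    """Cofactors A_kk(1) of A(1) = I - P for k = 1, ..., s."""
    s = len(P)
    A = [[(1 if r == c else 0) - P[r][c] for c in range(s)] for r in range(s)]
    return [det([row[:k] + row[k + 1:] for r, row in enumerate(A) if r != k])
            for k in range(s)]

def limiting_distribution(P):
    """Formula (20): pi_j = A_jj(1) / sum_k A_kk(1)."""
    cof = principal_cofactors(P)
    total = sum(cof)
    return [a / total for a in cof]

def mean_return_times(P):
    """Formula (29): mu_jj = sum_k A_kk(1) / A_jj(1)."""
    cof = principal_cofactors(P)
    total = sum(cof)
    return [total / a for a in cof]

# Hypothetical irreducible, aperiodic three-state chain (not from the paper).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]

pi = limiting_distribution(P)
mu = mean_return_times(P)
print(pi, mu)
```

The resulting vector can be checked against (22) by confirming that it is reproduced when multiplied by `P` from the right.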


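We have derived in (11) a formula for the transition probability $p_{ij}(n)$. Obviously, the limiting probability $\pi_j$ may be derived from (11) by taking the limit as $n \to \infty$:

$$\pi_j = \lim_{n \to \infty} \sum_{l=1}^{s} \frac{A_{ji}(\lambda_l)\, \lambda_l^{n}}{\prod_{m=1,\, m \ne l}^{s} (\lambda_l - \lambda_m)}. \tag{30}$$

The two formulas for $\pi_j$ in (20) and (30) must be equal. The following theorem establishes this fact.

THEOREM 3

The formula (20) for the limiting probabilities $\pi_j$ is equal to the formula in (30). Namely,

$$\lim_{n \to \infty} \sum_{l=1}^{s} \frac{A_{ji}(\lambda_l)\, \lambda_l^{n}}{\prod_{m=1,\, m \ne l}^{s} (\lambda_l - \lambda_m)} = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \qquad i, j = 1, \ldots, s. \tag{31}$$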

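Proof. In view of Lemma 2, we let $\lambda_1 = 1$ and $|\lambda_l| < 1$ for $l = 2, \ldots, s$. As $n \to \infty$, $\lambda_l^{n} \to 0$ for $l = 2, \ldots, s$, and (31) becomes

$$\frac{A_{ji}(1)}{\prod_{m=2}^{s} (1 - \lambda_m)} = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}, \tag{32}$$

or, since $A_{ji}(1) = A_{jj}(1)$, we need to show that

$$\prod_{m=2}^{s} (1 - \lambda_m) = \sum_{k=1}^{s} A_{kk}(1). \tag{33}$$

To prove (33) we introduce a matrix $A(1) - \rho I$ and solve the equation

$$|A(1) - \rho I| = 0 \tag{34}$$

to obtain the roots $\rho_1, \rho_2, \ldots, \rho_s$. It is well known in algebra that

$$\rho_2 \rho_3 \cdots \rho_s + \rho_1 \rho_3 \cdots \rho_s + \cdots + \rho_1 \rho_2 \cdots \rho_{s-1} = \sum_{k=1}^{s} A_{kk}(1). \tag{35}$$

Since $A(1) = (I - P)$, we have

$$A(1) - \rho I = (1 - \rho) I - P = \lambda I - P,$$

where $\rho = 1 - \lambda$. The roots $\lambda_1, \ldots, \lambda_s$ of the equation

$$|\lambda I - P| = 0 \tag{6}$$

and the roots of the equation in (34) have the relationship

$$\rho_l = 1 - \lambda_l \qquad \text{for } l = 1, \ldots, s. \tag{36}$$

Substituting (36) in (35) and recognizing $\rho_1 = 1 - \lambda_1 = 0$, we find

$$(1 - \lambda_2)(1 - \lambda_3) \cdots (1 - \lambda_s) = \sum_{k=1}^{s} A_{kk}(1). \tag{37}$$

Therefore, we recover (33) and hence (31) in the theorem.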
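Identity (33) can be checked exactly on a small chain whose eigenvalues are rational. In the sketch below, the three-state matrix `P` is hypothetical; its eigenvalues $1$, $\tfrac12$, $0$ are not assumed but verified inside the code by evaluating $|\lambda I - P|$. The helper names are illustrative only.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def char_matrix(P, lam):
    """Characteristic matrix lam*I - P, as in (7)."""
    s = len(P)
    return [[(lam if r == c else 0) - P[r][c] for c in range(s)] for r in range(s)]

# Hypothetical three-state chain (not from the paper).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]

# Candidate eigenvalues, with lambda_1 = 1 listed first; each must satisfy (6).
eigenvalues = [F(1), F(1, 2), F(0)]
roots_ok = all(det(char_matrix(P, lam)) == 0 for lam in eigenvalues)

# Left-hand side of (33): product of (1 - lambda_m) over m = 2, ..., s.
lhs = F(1)
for lam in eigenvalues[1:]:
    lhs *= 1 - lam

# Right-hand side of (33): sum of the principal cofactors of A(1) = I - P.
A1 = char_matrix(P, F(1))
rhs = sum(det([row[:k] + row[k + 1:] for r, row in enumerate(A1) if r != k])
          for k in range(len(P)))
print(roots_ok, lhs, rhs)
```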

The probabilities $p_{ij}(n)$ as given in (11) are always real functions of $p_{ij}$, since they are real elements of $P^{n}$, regardless of whether the stochastic matrix $P$ has complex eigenvalues.

4. THE MAIN RESULTS: THE MATRIX P HAS MULTIPLE EIGENVALUES

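In Sec. 3 we assumed that the eigenvalues $\lambda_1, \ldots, \lambda_s$ of matrix $P$ are distinct. We shall now extend the results to cases where the stochastic matrix $P$ has multiple eigenvalues. The formulas for the probabilities $p_{ij}(n)$ are given in the following theorem.

THEOREM 4

If the matrix $P$ in (1) has multiple roots, $\lambda_1, \ldots, \lambda_r$, with multiplicities $s_1, \ldots, s_r$, respectively, so that $s_1 + \cdots + s_r = s$, then the $n$th order transition probabilities are given by

$$p_{ij}(n) = \sum_{l=1}^{r} \frac{\lambda_l^{n}}{\prod_{\alpha=1,\, \alpha \ne l}^{r} (\lambda_l - \lambda_\alpha)^{s_\alpha}} \sum_{m=0}^{s_l - 1} \left[ \sum \cdots \sum \prod_{\alpha=1,\, \alpha \ne l}^{r} \binom{s_\alpha + m_\alpha - 1}{m_\alpha} \frac{(-1)^{m_\alpha}}{(\lambda_l - \lambda_\alpha)^{m_\alpha}} \right] \sum_{h=0}^{s_l - 1 - m} \binom{n}{s_l - 1 - m - h} \lambda_l^{-(s_l - 1 - m - h)}\, \frac{1}{h!}\, \frac{d^{h}}{d\lambda^{h}} A_{ji}(\lambda_l), \qquad i, j = 1, \ldots, s, \tag{38}$$

where $\sum \cdots \sum$ stands for $r - 1$ summations taken over $m_\alpha = 0, 1, \ldots, m$, $\alpha \ne l$, $\alpha = 1, \ldots, r$, such that $(m_1 + \cdots + m_r) - m_l = m$, and

$$\frac{d^{h}}{d\lambda^{h}} A_{ji}(\lambda_l)$$

is the $h$th derivative of the cofactor $A_{ji}(\lambda)$ of $A(\lambda)$ evaluated at $\lambda = \lambda_l$.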


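Proof. We first assume that the matrix $P$ has $s$ distinct eigenvalues, divide them into $r$ groups according to order of magnitude: $(\lambda_{11}, \ldots, \lambda_{1 s_1}), \ldots, (\lambda_{r1}, \ldots, \lambda_{r s_r})$, and rewrite (11) as follows:

$$p_{ij}(n) = \sum_{l=1}^{r} \sum_{\beta=1}^{s_l} \frac{\lambda_{l\beta}^{n}\, A_{ji}(\lambda_{l\beta})}{\prod_{\delta=1,\, \delta \ne \beta}^{s_l} (\lambda_{l\beta} - \lambda_{l\delta}) \prod_{\alpha=1,\, \alpha \ne l}^{r} \prod_{\gamma=1}^{s_\alpha} (\lambda_{l\beta} - \lambda_{\alpha\gamma})}. \tag{39}$$

We shall derive (38) from (39) by taking the limit as $\lambda_{l\beta} \to \lambda_l$, $l = 1, \ldots, r$. Formally, we let $\lambda_{l\beta} = \lambda_l + \varepsilon_{l\beta}$ for $\beta = 1, \ldots, s_l$ and write

$$p_{ij}(n) = \sum_{l=1}^{r} \sum_{\beta=1}^{s_l} \frac{(\lambda_l + \varepsilon_{l\beta})^{n}\, A_{ji}(\lambda_l + \varepsilon_{l\beta})}{\prod_{\delta=1,\, \delta \ne \beta}^{s_l} (\varepsilon_{l\beta} - \varepsilon_{l\delta}) \prod_{\alpha=1,\, \alpha \ne l}^{r} \prod_{\gamma=1}^{s_\alpha} (\lambda_l + \varepsilon_{l\beta} - \lambda_\alpha - \varepsilon_{\alpha\gamma})}. \tag{40}$$

We evaluate the numerator in each term as follows:

$$(\lambda_l + \varepsilon_{l\beta})^{n} = \sum_{\delta=0}^{n} \binom{n}{\delta} \varepsilon_{l\beta}^{\delta}\, \lambda_l^{n-\delta}$$

and

$$A_{ji}(\lambda_l + \varepsilon_{l\beta}) = \sum_{h=0}^{s-1} \frac{\varepsilon_{l\beta}^{h}}{h!}\, \frac{d^{h}}{d\lambda^{h}} A_{ji}(\lambda_l).$$

This product is

$$(\lambda_l + \varepsilon_{l\beta})^{n}\, A_{ji}(\lambda_l + \varepsilon_{l\beta}) = \sum_{k \ge 0} \varepsilon_{l\beta}^{k} \sum_{h=0}^{\min[k,\, s-1]} \binom{n}{k-h} \lambda_l^{n-(k-h)}\, \frac{1}{h!}\, \frac{d^{h}}{d\lambda^{h}} A_{ji}(\lambda_l), \tag{41}$$

where $\min[k, s-1]$ represents the smaller of the two numbers $k$ and $s-1$. As $\varepsilon_{\alpha\gamma} \to 0$, we find from the denominator of (40)

$$\frac{1}{\prod_{\alpha=1,\, \alpha \ne l}^{r} (\lambda_l - \lambda_\alpha + \varepsilon_{l\beta})^{s_\alpha}} = \frac{1}{\prod_{\alpha=1,\, \alpha \ne l}^{r} (\lambda_l - \lambda_\alpha)^{s_\alpha}} \sum_{m \ge 0} \varepsilon_{l\beta}^{m} \left[ \sum \cdots \sum \prod_{\alpha=1,\, \alpha \ne l}^{r} \binom{s_\alpha + m_\alpha - 1}{m_\alpha} \frac{(-1)^{m_\alpha}}{(\lambda_l - \lambda_\alpha)^{m_\alpha}} \right], \tag{42}$$

where $\sum \cdots \sum$ stands for $r - 1$ summations taken over $m_\alpha = 0, 1, \ldots, m$, $\alpha \ne l$, $\alpha = 1, 2, \ldots, r$, such that $(m_1 + \cdots + m_r) - m_l = m$. Substituting (41) and (42) in (40) yields

$$p_{ij}(n) = \sum_{l=1}^{r} \frac{1}{\prod_{\alpha=1,\, \alpha \ne l}^{r} (\lambda_l - \lambda_\alpha)^{s_\alpha}} \sum_{m \ge 0} \sum_{k \ge 0} \left[ \sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{k+m}}{\prod_{\delta=1,\, \delta \ne \beta}^{s_l} (\varepsilon_{l\beta} - \varepsilon_{l\delta})} \right] \left[ \sum \cdots \sum \prod_{\alpha=1,\, \alpha \ne l}^{r} \binom{s_\alpha + m_\alpha - 1}{m_\alpha} \frac{(-1)^{m_\alpha}}{(\lambda_l - \lambda_\alpha)^{m_\alpha}} \right] \sum_{h=0}^{\min[k,\, s-1]} \binom{n}{k-h} \lambda_l^{n-(k-h)}\, \frac{1}{h!}\, \frac{d^{h}}{d\lambda^{h}} A_{ji}(\lambda_l). \tag{43}$$

According to Lemma 1,

$$\sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{k+m}}{\prod_{\delta=1,\, \delta \ne \beta}^{s_l} (\varepsilon_{l\beta} - \varepsilon_{l\delta})} = \begin{cases} 0 & \text{for } k + m = 0, 1, \ldots, s_l - 2, \\ 1 & \text{for } k + m = s_l - 1. \end{cases} \tag{44}$$

For $k + m = s_l, s_l + 1, \ldots$, we take the limit as $\varepsilon_{l\beta} \to 0$ to find

$$\lim_{\substack{\varepsilon_{l\beta} \to 0 \\ \beta = 1, \ldots, s_l}} \sum_{\beta=1}^{s_l} \frac{\varepsilon_{l\beta}^{k+m}}{\prod_{\delta=1,\, \delta \ne \beta}^{s_l} (\varepsilon_{l\beta} - \varepsilon_{l\delta})} = 0 \qquad \text{for } k + m \ge s_l, \tag{45}$$

since the numerator is of the smaller order of magnitude than the denominator. For each $m = 0, 1, \ldots, s_l - 1$, we can find a value of $k$ such that $k + m = s_l - 1$, or $k = s_l - 1 - m$. As a result, (43) reduces to (38). The proof of the theorem is complete.

Remark. While the probabilities $p_{ij}(n)$ assume different expressions in (11) and (38) depending upon the multiplicities of eigenvalues of matrix $P$, the limiting probability distribution $(\pi_1, \ldots, \pi_s)$ given in (20) and the mean return times $\mu_{jj}$ given in (29) hold true also when the matrix $P$ has multiple eigenvalues. This is because Theorem 2 does not require the matrix $P$ to have all single eigenvalues and $\lambda = 1$ is a single root.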

5. AN EXAMPLE

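This example appears in Feller [4]. We use it to illustrate the present method of computation. Let

$$P = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

The transitions are $1 \to 4$, $2 \to 4$, $3 \to 1$, $3 \to 2$, and $4 \to 3$. Therefore, all the four states communicate. There are four roots of the equation $|\lambda I - P| = 0$; two of the roots are complex conjugates: $\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = (-1 + i\sqrt{3})/2$, and $\lambda_4 = (-1 - i\sqrt{3})/2$. With these values we find $\lambda_3^{2} = \lambda_4$, $\lambda_4^{2} = \lambda_3$, $\lambda_3^{3} = \lambda_4^{3} = 1$,

$$\prod_{m=2}^{4} (\lambda_1 - \lambda_m) = -1, \qquad \prod_{m=1,\, m \ne l}^{4} (\lambda_l - \lambda_m) = 3 \quad \text{for } l = 2, 3, 4,$$

and

$$A(\lambda_l) = \begin{pmatrix} \lambda_l & 0 & 0 & -1 \\ 0 & \lambda_l & 0 & -1 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & \lambda_l & 0 \\ 0 & 0 & -1 & \lambda_l \end{pmatrix}, \qquad l = 1, 2, 3, 4.$$

The formula for $p_{ij}(n)$ varies with $n$. For $n = 3k, 3k+1, 3k+2$ we use (11) and compute

$$P^{3k} = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad P^{3k+1} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix},$$

and

$$P^{3k+2} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \end{pmatrix}.$$

Therefore, the corresponding Markov chain is periodic with period $t = 3$; the system starts from scratch after every three steps.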
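The period-three pattern can be confirmed by brute-force multiplication of the example matrix. This is a minimal sketch with exact fractions; the helper name `mat_mul` and the dictionary `powers` are illustrative choices, and direct multiplication stands in here for formula (11).

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    s = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(s)) for j in range(s)]
            for i in range(s)]

# Transition matrix of the example.
P = [[F(0), F(0), F(0), F(1)],
     [F(0), F(0), F(0), F(1)],
     [F(1, 2), F(1, 2), F(0), F(0)],
     [F(0), F(0), F(1), F(0)]]

# powers[n] holds P**n for n = 1, ..., 7.
powers = {1: P}
for n in range(2, 8):
    powers[n] = mat_mul(powers[n - 1], P)

# The powers repeat with period three, while P, P**2 and P**3 are all different.
repeats = all(powers[n + 3] == powers[n] for n in range(1, 5))
print(repeats, powers[2] == P, powers[3] == P)
```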

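To find the stationary probability distribution, we formulate the matrix

$$A(1) = \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}$$

and compute the cofactors

$$A_{11}(1) = \tfrac{1}{2}, \qquad A_{22}(1) = \tfrac{1}{2}, \qquad A_{33}(1) = 1, \qquad A_{44}(1) = 1.$$

Therefore, the stationary distribution is

$$(\pi_1, \pi_2, \pi_3, \pi_4) = \left(\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{3}, \tfrac{1}{3}\right).$$

It is easy to check that

$$\left(\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{3}, \tfrac{1}{3}\right) \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \left(\tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{3}, \tfrac{1}{3}\right).$$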
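The final check is a single vector-matrix product, sketched below with exact fractions. The reciprocals $1/\pi_j$ are also listed; reading them as mean return times follows (29), and applying that reading to this periodic chain is an assumption of the sketch rather than a statement made in the example.

```python
from fractions import Fraction as F

# Transition matrix and stationary distribution of the example.
P = [[F(0), F(0), F(0), F(1)],
     [F(0), F(0), F(0), F(1)],
     [F(1, 2), F(1, 2), F(0), F(0)],
     [F(0), F(0), F(1), F(0)]]
pi = [F(1, 6), F(1, 6), F(1, 3), F(1, 3)]

# Row vector times matrix: (pi P)_j = sum_i pi_i * p_ij.
pi_P = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

# Reciprocals 1/pi_j, which (29) would give as the mean return times.
mu = [1 / p for p in pi]
print(pi_P == pi, sum(pi) == 1, mu)
```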

REFERENCES

[1] C. L. Chiang, A stochastic model of competing risks of illness and competing risks of death, in Stochastic Models in Medicine and Biology (J. Gurland, Ed.), Univ. of Wisconsin Press, Madison, 1964, pp. 325-354.
[2] C. L. Chiang, Introduction to Stochastic Processes in Biostatistics, Wiley, New York, 1968.
[3] K. L. Chung, Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.
[4] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 3rd ed., Wiley, New York, 1968.
[5] S. Karlin, A First Course in Stochastic Processes, Academic, New York, 1966.
[6] F. R. Gantmacher, The Theory of Matrices, Vol. I, Chelsea, New York, 1960.