The stopped distributions of a controlled Markov chain with discrete time


Systems & Control Letters 6 (1985) 277-285
North-Holland
October 1985

Piotr ZAREMBA
Institute of Mathematics, Polish Academy of Sciences, P.O. Box 137, 00-950 Warsaw, Poland

Received 22 April 1985
Revised 3 August 1985

Abstract: A characterization is given, in terms of excessive functions, of possible stopped distributions of a controlled Markov chain with given initial distribution. Control here is understood as a mixing of a given finite number of Markov chains. It is an extension of results of Rost for more than one Markov kernel.

Keywords: Rost theorem, Discrete time Markov chain, Control, Excessive function.

1. Introduction

Suppose that $X$ is a Markov chain and $\mu$, $\nu$ are probability measures on the state space. In [1] Rost established when we can obtain $\nu$ as a distribution of the stopped Markov chain $X$ with initial distribution $\mu$. In this paper we generalise his result for the case of a finite number of Markov kernels.

Let $E$ be a countable state space and let $X^1, X^2, \ldots, X^n$ be Markov chains,

\[ X^i = (\Omega, F, (F_k)_{k \in N}, (X_k)_{k \in N}, P_i), \]

where $\Omega = E^N$, $X_k : \Omega \to E$, $X_k(\omega) = \omega_k$ for $\omega = (\omega_k)_{k \in N}$, $F_k = \sigma(X_i;\ i \le k)$, $k \in N$, $F = \sigma(X_i;\ i \in N)$, and $P_i$ is a Markov kernel on $E$, $1 \le i \le n$.

Definition 1. A function $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n) : E \to [0,1]^n$ such that for each $e \in E$ the equality $\sum_{i=1}^{n} \alpha_i(e) = 1$ holds will be called a control.

Definition 2. Let $\alpha$ be a control. Let $P_\alpha$ be defined as follows:

\[ P_\alpha(e, g) = \sum_{i=1}^{n} \alpha_i(e) P_i(e, g), \quad e, g \in E. \]

Then $P_\alpha$ will be called a controlled kernel.

Remark. $P_\alpha$ is a Markov kernel on $E$.
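Definitions 1 and 2 can be illustrated on a finite state space. The following is a minimal sketch, not part of the paper: the three-state kernels `P1`, `P2`, the control `alpha` and the helper `controlled_kernel` are hypothetical names chosen for illustration, and exact rational arithmetic is used so that row sums can be compared with 1 exactly.

```python
from fractions import Fraction

def controlled_kernel(alpha, kernels):
    """Return P_alpha(e, g) = sum_i alpha_i(e) * P_i(e, g) as a list of rows."""
    n_states = len(kernels[0])
    for e in range(n_states):
        # Definition 1: the weights at each state lie in [0, 1] and sum to 1.
        assert all(0 <= w <= 1 for w in alpha[e]) and sum(alpha[e]) == 1
    return [
        [sum(alpha[e][i] * kernels[i][e][g] for i in range(len(kernels)))
         for g in range(n_states)]
        for e in range(n_states)
    ]

half = Fraction(1, 2)
# Two hypothetical kernels on E = {0, 1, 2}: "step right" and "step left".
P1 = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
P2 = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]
# Hypothetical control: only P1 at state 0, a fair mix at 1, only P2 at 2.
alpha = [[1, 0], [half, half], [0, 1]]

P_alpha = controlled_kernel(alpha, [P1, P2])
for row in P_alpha:
    # The Remark: every row of P_alpha is again a probability vector.
    assert sum(row) == 1
```

Because each row of `P_alpha` is a convex combination of the corresponding rows of `P1` and `P2`, the row sums stay equal to 1, which is the content of the Remark in this finite setting.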

Definition 3. A function $T : (0,1] \times \Omega \to N$ will be called a randomized stopping time with respect to filtration $(F_k)_{k \in N}$ iff:
(a) $\forall r \in (0,1]$, $T(r, \cdot)$ is a stopping time with respect to filtration $(F_k)_{k \in N}$,
(b) $\forall \omega \in \Omega$, $T(\cdot, \omega)$ is decreasing and left continuous.

Let $X = (\Omega, F, (F_k)_{k \in N}, (X_k)_{k \in N}, P)$ be a Markov chain and $T$ a randomized stopping time with respect to $(F_k)_{k \in N}$.

0167-6911/85/$3.30 © 1985, Elsevier Science Publishers B.V. (North-Holland)

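Definition 4. $X_T$ is a function $X_T : \Omega \times (0,1] \to E$, defined as follows:

\[ X_T(\omega, r) = X_{T_r(\omega)}(\omega), \quad \omega \in \Omega,\ 0 < r \le 1, \]

where $T_r := T(r, \cdot)$.

Remark. $X_T$ is a measurable function from $(\Omega \times (0,1], F \otimes B((0,1]))$ to $(E, 2^E)$, where $B$ denotes Borel $\sigma$-algebra.

Proof. Let $W$ denote the set of pairs $(p, q)$ of rational numbers such that $0 < p < q \le 1$. Then the following equalities hold:

\[
\begin{aligned}
\{X_T = e\} &= \{(\omega, r) \in \Omega \times (0,1] : X_{T_r(\omega)}(\omega) = e\} \\
&= \bigcap_{n \in N} \bigcup_{(p,q) \in W} \{(\omega, r) \in \Omega \times (0,1] : X_{T_p(\omega)}(\omega) = e,\ p \le r \le q + 1/n,\ X_{T_q(\omega)}(\omega) = e,\ T_p(\omega) = T_q(\omega)\} \\
&= \bigcap_{n \in N} \bigcup_{(p,q) \in W} \Bigl( \{\omega \in \Omega : X_{T_p(\omega)}(\omega) = e,\ T_p(\omega) = T_q(\omega)\} \times \{r \in (0,1] : p \le r \le q + 1/n\} \Bigr).
\end{aligned}
\]

So we obtain that $\{X_T = e\} \in F \otimes B((0,1])$.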

Notation. Let $P_\mu$ denote the unique probability measure on $(\Omega, F)$ such that $(\Omega, F, (F_k)_{k \in N}, (X_k)_{k \in N}, P_\mu)$ gives us the Markov chain $X = (\Omega, F, (F_k)_{k \in N}, (X_k)_{k \in N}, P)$ with initial distribution $\mu$.

Definition 5. An $E$-valued random variable $X_T$ defined on the probability space $(\Omega \times (0,1], F \otimes B((0,1]), P_\mu \otimes \lambda)$, where $\lambda$ is the Lebesgue measure on $(0,1]$, will be called the value of the chain $X$ with the initial distribution $\mu$ at the randomized stopping time $T$.

Definition 6. A probability measure $\nu$ on the space $(E, 2^E)$ will be called the distribution of the process $X$ at the randomized stopping time $T$ if $\nu$ is the distribution of the random variable $X_T$.

Definition 7. We will call a distribution $\nu$ later than distribution $\mu$ by Markov kernels $P_1, P_2, \ldots, P_n$ iff there exist a control $\alpha$ and a randomized stopping time $T$ such that, if the Markov chain $X^\alpha = (\Omega, F, (F_k)_{k \in N}, (X_k)_{k \in N}, P_\alpha)$ has initial distribution $\mu$, then $X_T$ has distribution $\nu$. We will write that relation as $\mu \mid_{P_1, P_2, \ldots, P_n} \nu$.

The following example explains why we consider randomized stopping time instead of stopping time.

Example. Let $E = \{1, 2, 3, \ldots\}$, $P(i, i+1) = 1$, $\mu(\{1\}) = 1$, $\nu(\{1\}) = \nu(\{2\}) = \frac{1}{2}$. Then $T$ which fulfils the equality $E_\mu f(X_T) = \int_E f(x)\,\nu(dx)$ for all bounded functions $f$ ought to be such that $P_\mu(\{T = 1\}) = P_\mu(\{T = 2\}) = \frac{1}{2}$. The $\sigma$-field $F$, however, consists only of sets of $P_\mu$-measure 0 or 1. So $T$ is not a stopping time.

Definition 8. Let $P$ be a Markov kernel on $E$. We will say that a function $f : E \to R$ is $P$-excessive if

\[ \forall x \in E, \quad f(x) \ge \sum_{y \in E} f(y) P(x, y) \ge 0. \]
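The Example preceding Definition 8 can be made concrete with a randomized stopping time in the sense of Definition 3. The sketch below is an illustration under stated assumptions, not the paper's construction: the step function `T_STEPS` (stop at time 2 for $r \le 1/2$, at time 1 for $r > 1/2$) is one hypothetical choice that is decreasing and left continuous in $r$, and the path is taken to be the deterministic sequence $X_k = k$ forced by $P(i, i+1) = 1$ and $\mu(\{1\}) = 1$.

```python
from fractions import Fraction

# Deterministic chain of the Example: X_1 = 1 and P(i, i+1) = 1, so X_k = k.
def X(k):
    return k

# Hypothetical randomized stopping time, constant in omega because the path
# is deterministic. It is a decreasing, left-continuous step function of r,
# encoded as (right endpoint, value) pairs covering (0, 1/2] and (1/2, 1].
T_STEPS = [(Fraction(1, 2), 2), (Fraction(1), 1)]

def T(r):
    """Value of the stopping time for the randomization parameter r in (0, 1]."""
    for right_end, value in T_STEPS:
        if r <= right_end:
            return value
    raise ValueError("r must lie in (0, 1]")

def stopped_distribution():
    """Law of X_T under Lebesgue measure on (0, 1]: add up the step lengths."""
    law, left = {}, Fraction(0)
    for right_end, value in T_STEPS:
        state = X(value)
        law[state] = law.get(state, Fraction(0)) + (right_end - left)
        left = right_end
    return law

nu = stopped_distribution()
print(nu)
```

Each step of `T_STEPS` contributes its Lebesgue length to the state reached at that time, so the randomization over $r$ supplies exactly the mixing that no ordinary stopping time can provide for this deterministic chain.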

The main result of this paper says that $\mu \mid_{P_1, P_2, \ldots, P_n} \nu$ iff $(\mu, f) \ge (\nu, f)$ for all $f \in \bigcap_{i=1}^{n} S_{P_i}$, where $S_P$ is the set of bounded $P$-excessive functions.
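On a finite state space the inequality in the main result can be tested numerically for a handful of functions. The following sketch is only an illustration: the kernels `P1`, `P2`, the measures `mu`, `nu` and the list `candidates` are hypothetical, and checking finitely many functions can refute the condition but cannot establish it for the whole set $\bigcap_i S_{P_i}$.

```python
from fractions import Fraction

def apply_kernel(P, f):
    """(Pf)(x) = sum_y f(y) P(x, y)."""
    return [sum(P[x][y] * f[y] for y in range(len(f))) for x in range(len(f))]

def is_excessive(f, P):
    """Definition 8 on a finite state space: f(x) >= (Pf)(x) >= 0 for all x."""
    return all(fx >= pfx >= 0 for fx, pfx in zip(f, apply_kernel(P, f)))

def pairing(mu, f):
    """(mu, f) = sum_x mu({x}) f(x)."""
    return sum(m * v for m, v in zip(mu, f))

# Hypothetical kernels on E = {0, 1, 2}.
P1 = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]   # step right, state 2 absorbing
P2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # stay put
mu = [1, 0, 0]
nu = [Fraction(1, 2), 0, Fraction(1, 2)]

# A few hand-picked bounded functions; only some are excessive for both kernels.
candidates = [[1, 1, 1], [2, 1, 0], [0, 1, 2]]
common = [f for f in candidates if is_excessive(f, P1) and is_excessive(f, P2)]

# Necessary condition for "nu later than mu", restricted to the test functions.
holds = all(pairing(mu, f) >= pairing(nu, f) for f in common)
# The same check with the roles of mu and nu exchanged.
reverse_holds = all(pairing(nu, f) >= pairing(mu, f) for f in common)
```

In this toy example `nu` can be reached from `mu` by stopping at once with probability 1/2 and otherwise stepping right twice, which agrees with `holds`; the reversed check fails on the function `[2, 1, 0]`, so by the main result `mu` is not later than `nu`.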

2. Main result

Our problem of control can be reduced to some geometrical considerations in normed linear spaces. Auxiliary lemmas from linear analysis are given in the Appendix. Theorem 1 of this section summarizes facts from the Appendix, so Theorem 2, the main result of this paper, is a conclusion of Theorem 1. The proof of Theorem 1 also contains a scheme of construction of a control leading from distribution $\mu$ to $\nu$.

Let $Y$ be a normed linear space.

Definition 9. If $K \subset Y$, then $K$ is a cone iff:
(a) $\forall y \in K$, $\forall r > 0$, $ry \in K$,
(b) $\forall x, y \in K$, $\mathrm{conv}(\{x, y\}) \subset K$,
where $\mathrm{conv}(U) := \{y \in Y : y = rx + (1-r)z,\ x, z \in U,\ 0 \le r \le 1\}$ for $U \subset Y$.

Notation.

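For $z \in Y^*$,

\[ H_z := \{y \in Y : (z, y) > 0\}, \quad \bar{H}_z := \{y \in Y : (z, y) \ge 0\}, \quad L_z := \{y \in Y : (z, y) = 0\}. \]

Theorem 1. Let $W$ be a closed cone in the Banach space $X$, $b \in X^*$, $b \ne 0$, and $A_i : N \to X^*$, $1 \le i \le n$. If the following inclusion holds:

\[ W \cap \bigcap_{j=1}^{\infty} \bigcap_{i=1}^{n} \bar{H}_{A_i(j)} \subset \bar{H}_b, \]

then there exists a function $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n) : N \to [0,1]^n$ such that $\forall j \in N$, $\sum_{i=1}^{n} \alpha_i(j) = 1$ and the following inclusion holds:

\[ W \cap \bigcap_{j=1}^{\infty} \bar{H}_{\sum_{i=1}^{n} \alpha_i(j) A_i(j)} \subset \bar{H}_b. \]

Proof. Suppose that we define the function $\alpha$ on the set $\{1, 2, \ldots, k\} \subset N$ in such a way that

\[ W \cap \bigcap_{j=1}^{k} \bar{H}_{\sum_{i=1}^{n} \alpha_i(j) A_i(j)} \cap \bigcap_{j=k+1}^{\infty} \bigcap_{i=1}^{n} \bar{H}_{A_i(j)} \subset \bar{H}_b. \]

We will now define the value $\alpha(k+1)$. Let

\[ K = W \cap \bigcap_{j=1}^{k} \bar{H}_{\sum_{i=1}^{n} \alpha_i(j) A_i(j)} \cap \bigcap_{j=k+2}^{\infty} \bigcap_{i=1}^{n} \bar{H}_{A_i(j)}, \qquad P = \bigcap_{i=1}^{n} \bar{H}_{A_i(k+1)}. \]

If $K = \emptyset$ then $\alpha(k+1)$ could be arbitrary. Suppose then that $K \ne \emptyset$. In that case $P \cap H_{-b} \cap K = \emptyset$. We can assume that $\dim X < +\infty$. Indeed, if $\dim X = +\infty$ then by Lemma 3 we have

\[ (P \cap H_{-b}) \cap (K + V(P \cap H_{-b})) = \emptyset. \]

By Lemma 2 we can then reduce our consideration to the cones $P / V(P \cap H_{-b})$, $H_{-b} / V(P \cap H_{-b})$ and $(K + V(P \cap H_{-b})) / V(P \cap H_{-b})$ in the finite-dimensional quotient Banach space $X / V(P \cap H_{-b})$. By Theorem 4 we know that there exist numbers $\gamma_i \ge 0$, $1 \le i \le n$, which fulfil the equality

\[ \bar{H}_{\sum_{i=1}^{n} \gamma_i A_i(k+1)} \cap H_{-b} \cap K = \emptyset. \]

By the remark after Lemma 6 we know that $\sum_{i=1}^{n} \gamma_i > 0$. We can then define $\alpha(k+1)$ as follows:

\[ \alpha_i(k+1) = \frac{\gamma_i}{\sum_{j=1}^{n} \gamma_j}, \quad 1 \le i \le n. \]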

As a simple conclusion from Theorem 1 we obtain a theorem describing measures which could be obtained as distributions of a stopped controlled Markov chain.

Theorem 2. Let $E = \{e_1, e_2, e_3, \ldots\}$ be a countable state space; $P_1, P_2, \ldots, P_n$ be Markov kernels on $E$; and $\mu$, $\nu$ probability measures on $E$. The following conditions are equivalent:
(a) $\mu \mid_{P_1, P_2, \ldots, P_n} \nu$,
(b) $(\mu, f) \ge (\nu, f)$ for all $f \in \bigcap_{i=1}^{n} S_{P_i}$.

Proof. (a) $\Rightarrow$ (b). By assumption there is a control $\alpha$ such that $\mu \mid_{P_\alpha} \nu$. By the Rost theorem then $\forall f \in S_{P_\alpha}$, $(\mu, f) \ge (\nu, f)$. It is enough then to notice that $\bigcap_{i=1}^{n} S_{P_i} \subset S_{P_\alpha}$.

(b) $\Rightarrow$ (a). Let $X$ be the Banach space of bounded functions on $E$ with supremum norm. We can identify finite measures on $E$, in a natural way, with elements of the conjugate space $X^*$. Let functions $A_i$, $1 \le i \le n$, be defined by

\[ A_i(e) = \delta_e - P_i(e, \cdot), \quad e \in E, \]

where $\delta_e$ is a probability measure concentrated in point $e$. Because of the equivalence

\[ f(e) \ge P_i f(e) \iff (A_i(e), f) \ge 0 \]

we can describe the set of bounded $P_i$-excessive functions as a cone:

\[ S_{P_i} = X^+ \cap \bigcap_{e \in E} \bar{H}_{A_i(e)}, \quad 1 \le i \le n, \]

where $X^+$ is the set of nonnegative bounded functions on $E$. Condition (b) means then that

\[ X^+ \cap \bigcap_{e \in E} \bigcap_{i=1}^{n} \bar{H}_{A_i(e)} \subset \bar{H}_{\mu - \nu}. \]

By Theorem 1 we have then a control $\alpha$ which fulfils

\[ X^+ \cap \bigcap_{e \in E} \bar{H}_{\delta_e - P_\alpha(e, \cdot)} \subset \bar{H}_{\mu - \nu}. \]

Therefore $\forall f \in S_{P_\alpha}$, $(\mu, f) \ge (\nu, f)$, and by the Rost theorem $\mu \mid_{P_\alpha} \nu$.

The restrictive assumption concerning countability of the state space follows from the fact that the construction of the control $\alpha$ will not give its measurability in the case of a $\sigma$-field in $E$ different from $2^E$.

3. Commentary

Our definition of control (Definition 1) is natural and gives us a Markovian time-homogeneous control. One can ask, however, what is the set of possible stopped distributions of a controlled Markov chain when the control is not Markovian time-homogeneous. The answer is simple. Those sets are equal.

Definition 10. A function $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_n) : \bigcup_{k \in N} E^k \times \{k\} \to [0,1]^n$ such that for each $w \in \bigcup_{k \in N} E^k \times \{k\}$ the equality $\sum_{i=1}^{n} \gamma_i(w) = 1$ holds will be called a wide control.

Definition 11. Let $\gamma$ be a wide control and let $P_\gamma$ be defined as follows:

\[ P_\gamma : \bigcup_{k \in N} E^k \times \{k\} \times E \to [0,1], \]

\[ P_\gamma(e_1, e_2, \ldots, e_k, k, g) = \sum_{i=1}^{n} \gamma_i(e_1, e_2, \ldots, e_k, k)\, P_i(e_k, g), \quad e_1, e_2, \ldots, e_k, g \in E. \]

Then $P_\gamma$ will be called a wide controlled kernel.
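Definition 11 differs from Definition 2 only in that the mixing weights may depend on the whole history and on the time index. A minimal sketch of one row of $P_\gamma$ on a finite state space is given below; the kernels `P1`, `P2`, the wide control `gamma` and the helper `wide_controlled_row` are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def wide_controlled_row(gamma, kernels, history):
    """Row P_gamma(e_1, ..., e_k, k, .) = sum_i gamma_i(e_1..e_k, k) P_i(e_k, .)."""
    k = len(history)
    weights = gamma(history, k)
    # Definition 10: the weights lie in [0, 1] and sum to 1 for every history.
    assert all(0 <= w <= 1 for w in weights) and sum(weights) == 1
    last = history[-1]  # only the current state e_k enters the kernels P_i
    n_states = len(kernels[0])
    return [sum(w * P[last][g] for w, P in zip(weights, kernels))
            for g in range(n_states)]

# Hypothetical kernels on E = {0, 1, 2}.
P1 = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]   # step right
P2 = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]   # step left

def gamma(history, k):
    """Hypothetical wide control: once state 2 has been visited, mix the two
    kernels evenly; before that, use P1 at odd times and P2 at even times."""
    if 2 in history:
        return [Fraction(1, 2), Fraction(1, 2)]
    return [1, 0] if k % 2 == 1 else [0, 1]

row_a = wide_controlled_row(gamma, [P1, P2], (0,))        # k = 1, 2 not yet seen
row_b = wide_controlled_row(gamma, [P1, P2], (1, 2, 1))   # k = 3, 2 seen earlier
```

The two calls end in different current states and different histories; `row_b` shows a dependence on the past (the earlier visit to state 2) that a control in the sense of Definition 1 cannot express.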

Remark. $P_\gamma$ is a Markov kernel from $\bigcup_{k \in N} E^k \times \{k\}$ to $E$. For a definition, see [2], Ch. IX, p. 1-2.

Definition 12. A Markov kernel $Q$ from $\bigcup_{k \in N} E^k \times \{k\}$ to $E$ will be called a non-Markovian transition function on $E$.

Definition 13. Let $Q$ be a non-Markovian transition function on $E$. $Q_\mu$ is the unique probability measure on $(\Omega, F)$ such that for each $p \in N$ and each bounded measurable function $f : E^p \to R$,

\[ \int_\Omega f(X_1, X_2, \ldots, X_p)\, dQ_\mu = \int_E \cdots \int_E f(x_1, x_2, \ldots, x_p)\, Q(x_1, x_2, \ldots, x_{p-1}, p-1, dx_p)\, Q(x_1, x_2, \ldots, x_{p-2}, p-2, dx_{p-1}) \cdots Q(x_1, 1, dx_2)\, \mu(dx_1). \]

Definition 14. Let $T$ be a randomized stopping time with respect to the filtration $(F_k)_{k \in N}$. An $E$-valued random variable $X_T$ defined on the probability space $(\Omega \times (0,1], F \otimes B((0,1]), Q_\mu \otimes \lambda)$ will be called the value of the process $(X_k)_{k \in N}$ with initial distribution $\mu$ at the randomized stopping time $T$.

Notation. $\Omega_{Q,\mu} = (\Omega, F, (F_k)_{k \in N}, Q_\mu)$.

Note. The process $(X_k)_{k \in N}$ defined on the probability space $\Omega_{Q,\mu}$ is not a Markov chain.

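Definition 15. A probability measure $\nu$ on the space $(E, 2^E)$ will be called the distribution of the process $(X_k)_{k \in N}$ defined on the probability space $\Omega_{Q,\mu}$ at the randomized stopping time $T$, if $\nu$ is the distribution of the random variable $X_T$ defined on the probability space $\Omega_{Q,\mu}$. We will write that relation as $\nu \sim X_{T,Q,\mu}$.

Definition 16. We will call a distribution $\nu$ later in the wide sense than distribution $\mu$ by Markov kernels $P_1, P_2, \ldots, P_n$ iff there exist a wide control $\gamma$ and a randomized stopping time $T$ such that $\nu$ is the distribution of the process $(X_k)_{k \in N}$ defined on the probability space $\Omega_{P_\gamma,\mu}$ at the randomized stopping time $T$. We will write that relation as $\mu \parallel_{P_1, P_2, \ldots, P_n} \nu$.

Lemma 1. Suppose that $f \in \bigcap_{i=1}^{n} S_{P_i}$, $\gamma$ is a wide control and $\mu$ is a probability distribution on $E$. Then

\[ \forall k \in N, \quad E_{P_\gamma,\mu}[f(X_{k+1}) \mid F_k] \le f(X_k). \]

Proof. Let $\omega \in \Omega$. Then

\[
\begin{aligned}
E_{P_\gamma,\mu}[f(X_{k+1}) \mid F_k](\omega)
&= \sum_{e \in E} f(e)\, P_\gamma(\omega_1, \omega_2, \ldots, \omega_k, k, e) \\
&= \sum_{e \in E} f(e) \Bigl( \sum_{i=1}^{n} \gamma_i(\omega_1, \omega_2, \ldots, \omega_k, k)\, P_i(\omega_k, e) \Bigr) \\
&= \sum_{i=1}^{n} \gamma_i(\omega_1, \omega_2, \ldots, \omega_k, k) \Bigl[ \sum_{e \in E} f(e)\, P_i(\omega_k, e) \Bigr] \\
&\le \sum_{i=1}^{n} \gamma_i(\omega_1, \omega_2, \ldots, \omega_k, k)\, f(\omega_k) = f(\omega_k) = f(X_k(\omega)).
\end{aligned}
\]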

Definition 17. If $T$ is a randomized stopping time and $k \in N$, then $T \wedge k$ is a randomized stopping time such that $(T \wedge k)(r, \omega) = T_r(\omega) \wedge k$.

Theorem 3. Suppose that $E$ is a countable state space, $P_1, P_2, \ldots, P_n$ are Markov kernels on $E$ and $\mu$, $\nu$ are probability measures on $E$. Then the following equivalence holds:

\[ \mu \mid_{P_1, P_2, \ldots, P_n} \nu \iff \mu \parallel_{P_1, P_2, \ldots, P_n} \nu. \]

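Proof. Because the implication ($\Rightarrow$) is obvious it is enough to prove the reverse implication. Suppose that $\gamma$ is a wide control and $T$ is a randomized stopping time such that $\nu \sim X_{T,P_\gamma,\mu}$. Let $\nu_k \sim X_{T \wedge k,P_\gamma,\mu}$. Then $\mu = \nu_1$ and $\nu_k \to \nu$ weakly when $k \to \infty$, i.e. for each bounded function $f$, $(\nu_k, f) \to (\nu, f)$ when $k \to \infty$. For $k \in N$ and $f \in \bigcap_{i=1}^{n} S_{P_i}$ the inequality $(\nu_k, f) \ge (\nu_{k+1}, f)$ holds. Indeed, writing $E$ for the expectation on $\Omega_{P_\gamma,\mu} \times (0,1]$ with respect to the product with the Lebesgue measure $\lambda$,

\[
\begin{aligned}
(\nu_{k+1}, f) &= E[f(X_{T \wedge (k+1)})] = E[f(X_{T \wedge k});\ T \le k] + E[f(X_{k+1});\ T > k] \\
&\le E[f(X_{T \wedge k});\ T \le k] + E[f(X_k);\ T > k] = (\nu_k, f).
\end{aligned}
\]

The inequality in the above is obtained from Lemma 1. So we have that for $f \in \bigcap_{i=1}^{n} S_{P_i}$ and $k \in N$ the inequality $(\mu, f) \ge (\nu_k, f)$ holds, which gives us the inequality $(\mu, f) \ge \lim_{k \to \infty} (\nu_k, f) = (\nu, f)$. By Theorem 2 we obtain that $\mu \mid_{P_1, P_2, \ldots, P_n} \nu$.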

Appendix

Theorem 1 is based on the following definitions and lemmas.

Definition 18. For $K \subset Y$ and $x \in Y$, $K$ is a cone with vertex $x$ iff:
(a) $\forall y \in K$, $\forall r > 0$, $x + r(y - x) \in K$,
(b) $\forall y, z \in K$, $\mathrm{conv}(\{y, z\}) \subset K$.

Remark. $K$ is a cone iff $K$ is a cone with vertex 0.

Definition 19. $C(K) := \{x \in X : x = ry,\ y \in \mathrm{conv}(K),\ r > 0\}$.

Remark. $C(K)$ is a cone.

Definition 20. $V(K) := \{x \in X : K \text{ is a cone with vertex } x\}$.

Lemmas 2 and 3 will give us a few simple properties of cones.

Lemma 2. If $K$ is a cone and $L$ is a linear subspace of $X$, then:
(a) $V(K)$ is a linear subspace of $X$,
(b) $L + K$ is a cone and $L \subset V(K + L)$,
(c) $v \in V(K) \Rightarrow v + K = K$,
(d) $L \subset V(K) \Rightarrow L + K = K$.

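Proof. (a) Let $v, w \in V(K)$, $y \in K$, $r \in R$, $s > 0$. Then

\[ v + w + s(y - v - w) = v + s(\tfrac{1}{2}y - v) + w + s(\tfrac{1}{2}y - w) \in K, \]

so $v + w \in V(K)$. If $r = 0$ then $rv = 0 \in V(K)$. If $r > 0$ then

\[ rv + s(y - rv) = r(v + s(r^{-1}y - v)) \in K, \]

so $rv \in V(K)$. If $r < 0$ then

\[ rv + s(y - rv) = -sr(v + s^{-1}(-sr^{-1}y - v)) \in K, \]

so $rv \in V(K)$.

(b) Let $(l + k), (l_1 + k_1) \in L + K$, $s > 0$, $0 \le r \le 1$. Then

\[ s(l + k) = sl + sk \in L + K, \]
\[ r(l + k) + (1 - r)(l_1 + k_1) = (rl + (1 - r)l_1) + (rk + (1 - r)k_1) \in L + K, \]

so $L + K$ is a cone. Since

\[ l_1 + s((l + k) - l_1) = ((1 - s)l_1 + sl) + sk \in L + K \]

it follows that $L \subset V(K + L)$.

(c) If $k \in K$, $v \in V(K)$ then $v + k = 2v + \tfrac{1}{2}(2k - 2v) \in K$, so $v + K \subset K$. If $k \in K$, $v \in V(K)$ then by (a) $-v \in V(K)$, so $-v + k \in K$ and $k = v + (-v + k)$, therefore $K \subset v + K$.

(d) This is a conclusion from (c).

Lemma 3. If $K, P \subset X$ are cones then the following conditions are equivalent:
(a) $K \cap P = \emptyset$,
(b) $(K + V(P)) \cap P = \emptyset$.

Proof. (b) $\Rightarrow$ (a). Because $0 \in V(P)$ it follows that $K \subset K + V(P)$ and therefore $K \cap P = \emptyset$.
(a) $\Rightarrow$ (b). $\forall x \in X$, $(K + x) \cap (P + x) = \emptyset$. By Lemma 2(c), $\forall v \in V(P)$, $(K + v) \cap P = \emptyset$ and therefore $(K + V(P)) \cap P = \emptyset$.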

The following lemma shows us that certain cones can be separated by a hyperplane.

Lemma 4. Let $K \subset X$ be a closed cone, $\dim X < \infty$, $a_1, a_2, \ldots, a_n, b \in X^*$, and $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \cap K = \emptyset$. Then there exists $c \in X^*$ such that

\[ K \subset \bar{H}_c \quad \text{and} \quad \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \subset H_{-c}. \]

Proof. If $K$ or $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b$ is an empty set the statement is obvious. Suppose then that both sets are nonempty. We can choose $d_1, d_2, \ldots, d_n \in X^*$ such that

\[ \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \subset \bigcap_{i=1}^{n} H_{d_i} \cap H_b \quad \text{and} \quad \bigcap_{i=1}^{n} H_{d_i} \cap H_b \cap K = \emptyset. \]

By the Hahn-Banach theorem there exists then $c \in X^*$ which fulfils the following inclusions:

\[ K \subset \bar{H}_c, \qquad \bigcap_{i=1}^{n} H_{d_i} \cap H_b \subset H_{-c}. \]

Lemma 5 will describe some properties of an intersection of a convex set with a halfspace.

Lemma 5. Let $P \subset X$ be a convex set and $a, b \in X^*$, $a \ne b$, such that $P \cap H_a \subset \bar{H}_b$, $P \cap H_a \cap L_b \ne \emptyset$. Then $P \subset \bar{H}_b$.

Proof. Suppose that $x \in P \cap H_{-b}$ and $y \in P \cap H_a \cap L_b$. By convexity of $P$ we have that $[y, x] \subset P$, analogously $(y, x] \subset H_{-b}$. Because $y \in H_a$ there exists then $z \in (y, x]$ such that $[y, z] \subset H_a$. So we have $(y, z] \subset P \cap H_a \cap H_{-b}$, and that gives us a contradiction with the assumptions.

We will need a lemma describing connections between a cone and functionals designating it in finite-dimensional space.

Lemma 6. Let $\dim X < +\infty$; $a_1, a_2, \ldots, a_n, b \in X^*$. Then the following conditions are equivalent:
(a) $\bigcap_{i=1}^{n} \bar{H}_{a_i} \subset \bar{H}_b$;
(b) $\exists \alpha_1, \alpha_2, \ldots, \alpha_n \in R^+$, $b = \sum_{i=1}^{n} \alpha_i a_i$.

Proof. (b) $\Rightarrow$ (a). If $x \in \bigcap_{i=1}^{n} \bar{H}_{a_i}$ then $(b, x) = \sum_{i=1}^{n} \alpha_i (a_i, x) \ge 0$, so $x \in \bar{H}_b$.
(a) $\Rightarrow$ (b). Because $\dim X < +\infty$ we have $X^{**} = X$, hence there exist $b_1, b_2, \ldots, b_m \in X$ such that

\[ C(\{a_1, a_2, \ldots, a_n\}) = \bigcap_{j=1}^{m} \bar{H}_{b_j}. \]

So from $(a_i, b_j) \ge 0$, $1 \le i \le n$, $1 \le j \le m$, we obtain that $b_1, b_2, \ldots, b_m \in \bigcap_{i=1}^{n} \bar{H}_{a_i}$. If we take $a \notin C(\{a_1, a_2, \ldots, a_n\})$ then for some $b_j$ we will have $a \notin \bar{H}_{b_j}$. But $(a, b_j) < 0$ means that $\bigcap_{i=1}^{n} \bar{H}_{a_i} \not\subset \bar{H}_a$.

Remark. If $\sum_{i=1}^{n} \alpha_i = 0$ then $b = 0$.

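Theorem 4 will say that in finite dimensions in a special case we can separate cones by a certain halfspace.

Theorem 4. Let $\dim X < +\infty$; $a_1, a_2, \ldots, a_n, b \in X^*$; and $K \subset X$ be a closed cone. If $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \cap K = \emptyset$ then there exist $\alpha_1, \alpha_2, \ldots, \alpha_n \in R^+$ such that

\[ \bar{H}_{\sum_{i=1}^{n} \alpha_i a_i} \cap H_b \cap K = \emptyset. \]

Proof. If $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b = \emptyset$ then $\bigcap_{i=1}^{n} \bar{H}_{a_i} \subset \bar{H}_{-b}$, so by Lemma 6 we can find nonnegative $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that $-b = \sum_{i=1}^{n} \alpha_i a_i$. But $\bar{H}_{-b} \cap H_b \cap K = \emptyset$. Suppose then that $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \ne \emptyset$. By Lemma 4 we can find $c \in X^*$ such that

\[ \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \subset H_{-c} \quad \text{and} \quad K \subset \bar{H}_c. \]

If $b = -c$ then $H_b \cap K = \emptyset$, so the conclusion holds for any $\alpha_1, \alpha_2, \ldots, \alpha_n \in R^+$. Suppose then that $b \ne -c$ and let

\[ p = \sup\Bigl\{\gamma : L_{\gamma c + (1-\gamma) b} \cap \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \ne \emptyset \Bigr\}. \]

Because $\bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \subset H_{-c} \cap H_b$ it follows that $0 < p < 1$,

\[ L_{pc + (1-p)b} \cap \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \ne \emptyset \quad \text{and} \quad \bigcap_{i=1}^{n} \bar{H}_{a_i} \cap H_b \subset \bar{H}_{-[pc + (1-p)b]}. \]

The following inclusion then holds by Lemma 5:

\[ \bigcap_{i=1}^{n} \bar{H}_{a_i} \subset \bar{H}_{-[pc + (1-p)b]}. \]

By Lemma 6 we then obtain $-[pc + (1-p)b] = \sum_{i=1}^{n} \alpha_i a_i$, $\alpha_i \ge 0$, $i = 1, 2, \ldots, n$. Because $\bar{H}_{-[pc + (1-p)b]} \cap H_b \subset H_{-c} \cap H_b$ and $K \subset \bar{H}_c$ the theorem follows.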

References

[1] H. Rost, Markoff-Ketten bei sich füllenden Löchern im Zustandsraum, Ann. Inst. Fourier 21 (1) (1971) 253-270.
[2] C. Dellacherie and P.A. Meyer, Probabilités et Potentiel, Chs. IX-XI (Hermann, Paris, 1983).