OPTIMAL CONTROL AND DIFFERENTIAL GAMES WITH MEASURES

E. N. Barron,†‡ R. Jensen†§ and J. L. Menaldi‖¶

† Department of Mathematical Sciences, Loyola University of Chicago, Chicago, IL 60626, U.S.A.
‖ Department of Mathematics, Wayne State University, Detroit, MI 48202, U.S.A.

Nonlinear Analysis, Theory, Methods & Applications, Vol. 21, No. 4, pp. 241-268, 1993. Printed in Great Britain. 0362-546X/93 $6.00+.00 © 1993 Pergamon Press Ltd.

(Received 10 January 1991; received for publication 26 January 1993)

Key words and phrases: Optimal control and differential games, singular control, bounded variation trajectories, viscosity solutions.

1. INTRODUCTION AND SUMMARY

THE PROBLEMS of this paper are motivated by models of physical controlled systems in which the trajectory is a function of bounded variation. The time of the jumps, if any, and the new spatial positions are under the control of the designer. In the optimal control problem the objective is to minimize a cost involving a running cost and a cumulative cost against the control measure. When the measure involves only jumps this is a standard impulse control problem. But in this paper we are not restricting the measures merely to jumps; we are allowing general Radon measures.

In many problems of interest we can only manipulate the system with ordinary controls and we wish to do so to minimize a cost. But when the system is subject to disturbances one seeks to design the system so as to perform well under the worst possible circumstances. In this situation we assume that the disturbances are modelled by a measure term in the dynamics, with a cost incurred in the payoff. The worst case analysis assumes that the measures are chosen so as to maximize this payoff. Therefore, this model is a differential game in which the dynamics is a function of bounded variation, the payoff involves the measures, and we choose an ordinary control to minimize this payoff while the opponent (in some cases considered to be nature) is choosing the measures to maximize the payoff. A result of this paper is that the order of play of the maximizer and minimizer makes a difference. That is, the differential game with a maximizing measure and a minimizing ordinary control does not, in general, have a value. The upper value, i.e. the case when the maximizer has knowledge of the minimizer, is then the central object of interest in a worst case analysis. We will derive the results for both the upper and lower value and give a sufficient condition for the game to have a value.

The approach throughout this paper is dynamic programming, leading to the value functions and the associated Bellman and Isaacs equations. In the optimal control case the Bellman equation becomes a standard variational inequality with two first order operators. In the differential game case the Isaacs equation is a highly nonlinear, first order problem involving a minimization over a set which depends on the derivatives of the value function.

‡ Supported in part by AFOSR-86-0202, NSF DMS-9102967, and a grant from Loyola University.
§ Supported in part by AFOSR-86-0202, an NSF grant, and a grant from Loyola University.
¶ Supported in part by NSF DMS-9101360.


Precisely, the Isaacs equation for the upper value is
$$ V_t^+ + \min_{z \in Z(t,\, V_m^+,\, V_x^+)} \big[ V_x^+ \cdot f_1(t, x, z) + h_1(t, x, z) \big] = 0, $$
where
$$ Z(t, V_m, V_x) = \{\, z \in Z : V_m + V_x \cdot f_2(t, z) + h_2(t, z) \le 0 \,\}. $$
This equation has a discontinuous, generally nonconvex Hamiltonian. The equation for the lower value is even more complicated. A theory of first order partial differential equations encompassing such equations is the viscosity solution theory initiated by Crandall and Lions [1]. The first example of a Bellman equation with control sets depending on the solution arose in the consideration of an optimal control problem with a minimax cost [2, 3]. That is, the minimax problem consists of finding a control which minimizes the L^∞ norm of a function of time, the state, and the control. Using the well-known fact that the L^∞ norm of a function is the maximum over subprobability measures of the function integrated against the measure, we see that the minimax problem is a special case of the subject of this paper. This example is included at the end of this paper.

Some justification for taking the dynamic programming approach to the problems of this paper may be necessary. Control problems involving measures are extremely difficult to solve via necessary conditions [4, 5]. Such necessary conditions are not even known for the differential game. The Pontryagin conditions involve knowing a priori the support of the optimal measures, which in turn depends on the unknown adjoint variables. Further, one must then verify that one actually has an optimal control. The determination of the value function by solving the Bellman equation is not beyond the scope of numerical methods. Moreover, the Bellman equation leads to the candidate feedback optimal controls in the usual way. Finally, it is well known, and proved in [6], that for standard control problems there is an intimate connection between the adjoint variable in the Pontryagin conditions and the spatial gradient of the value function. In fact, the adjoint variable is the spatial gradient of the optimal cost evaluated along the optimal trajectory. Such a result is not so clear in problems involving measures.
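Since the minimax example recurs at the end of the paper, it may help to see the measure-theoretic identity behind it numerically. The following is our own small illustration (the sampled function g is a hypothetical choice, not data from the paper): on a grid, the L^∞ norm of a nonnegative function is the supremum of its integrals against subprobability measures, and a Dirac mass at the maximizer attains it.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 201)
g = np.abs(np.sin(3.0 * t)) + 0.5 * t      # hypothetical nonnegative sample

sup_norm = float(np.max(g))                # the L-infinity norm on the grid

# integrals of g against random subprobability measures (total mass <= 1)
best = -np.inf
for _ in range(500):
    w = rng.random(t.size)
    w = w / w.sum() * rng.random()         # nonnegative weights, sum <= 1
    best = max(best, float(np.dot(w, g)))

dirac = float(g[np.argmax(g)])             # point mass at the maximizer

assert best <= sup_norm + 1e-12            # no measure exceeds the sup-norm
assert abs(dirac - sup_norm) < 1e-12       # a Dirac mass attains it
```

No randomly drawn measure beats the sup-norm, while concentrating all admissible mass at the maximizer recovers it exactly.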
Finally, we mention that previous work regarding problems with measures in one form or another appears in [2, 4, 5, 7-15]. Necessary conditions are derived in [5, 13]. Problems with measures are more commonly called singular control problems. See [15, 17] for related examples.

2. THE OPTIMAL CONTROL PROBLEM

We consider the following model on the finite horizon [0, T]. The dynamics are
$$ d\xi = f_1(r, \xi(r), \zeta(r))\, dr + f_2(r, \zeta(r))\, d\mu(r), \qquad t < r \le T, \qquad \xi(t) = x \in R^1, \quad \mu(t) = m \in [0, 1]. \tag{2.1} $$
The controls (ζ, μ) are chosen from the class (Z × 𝔐_m)[t, T], where
$$ Z[t, T] = \{\, \zeta : [t, T] \to Z,\ \zeta \text{ Borel measurable} \,\}, $$
$$ \mathfrak{M}_m[t, T] = \{\, \mu : [t, T] \to [0, 1],\ \mu \text{ nondecreasing on } [t, T],\ \mu(t^-) = m,\ \mu \text{ right continuous on } [t, T] \,\}. \tag{2.2} $$


Z is a compact subset of some R^p, p ≥ 1, and m is any point in [0, 1]. Since ζ ∈ Z[t, T] is bounded and Borel measurable, r ↦ (h_2(r, ζ(r)), f_2(r, ζ(r))) are dμ integrable for any μ ∈ 𝔐_m[t, T]. We use the convention that μ(t⁻) = m, and so dμ may have a point mass at the initial point t. There is a one-one correspondence between Radon measures dμ and distribution functions μ.

The objective in this section will be to minimize the following cost over the class (Z × 𝔐_m)[t, T]:
$$ P_{t,x,m}(\zeta, \mu) = \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,T]} h_2(r, \zeta(r))\, d\mu(r). \tag{2.3} $$
We define the value function V: [0, T] × R^1 × [0, 1] → R^1 as follows:
$$ V(t, x, m) = \inf_{(\zeta, \mu) \in (Z \times \mathfrak{M}_m)[t,T]} P_{t,x,m}(\zeta, \mu). \tag{2.4} $$
We will make the following assumptions regarding the given functions f_i and h_i, i = 1, 2.

Assumption (A). For φ_i = f_i, h_i we assume φ_1: [0, T] × R^1 × Z → R^1 and φ_2: [0, T] × Z → R^1 are continuous in all arguments, and there is a constant K > 0 such that
$$ |\varphi_1(t, x, z)| \le K \quad \forall (t, x, z) \in [0, T] \times R^1 \times Z, \qquad |\varphi_2(t, z)| \le K \quad \forall (t, z) \in [0, T] \times Z, $$
and
$$ |\varphi_1(t, x, z) - \varphi_1(t, y, z)| \le K|x - y|, \quad x, y \in R^1; \qquad |\varphi_2(t, z) - \varphi_2(t', z)| \le K|t - t'|, \quad z \in Z. $$

The assumption (A) is more than sufficient to guarantee that for each pair of controls (ζ, μ) ∈ (Z × 𝔐_m)[t, T] there will be a unique trajectory ξ(·) on the interval [t, T]. This trajectory is not necessarily absolutely continuous, but it will be of bounded total variation. In general, a unique trajectory will not exist if we allow dependence of f_2 on x. For given admissible controls (ζ, μ), the associated trajectory starting from x ∈ R^1 is, by definition, the solution of
$$ \xi(\tau) = x + \int_t^\tau f_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,\tau]} f_2(r, \zeta(r))\, d\mu(r). $$
From the fact that ∫_{[t,T]} dμ ≤ 1, we easily verify that sup_{t ≤ τ ≤ T} |ξ(τ)| ≤ K, independent of the controls. Furthermore, ξ is right continuous.

Remark. We will only be considering the one-dimensional case in this paper to simplify the presentation. The extension to the n-dimensional case involves interpreting appropriately the meaning of the expressions f_2 · dμ and h_2 · dμ. This can be done in several ways, c.f. [5].

Our first theorem establishes the continuity of the value function.

THEOREM 2.1. Under the assumption (A), V is a continuous function of (t, x, m) ∈ [0, T] × R^1 × [0, 1]. In fact:
(1) |V(t, x, m) − V(t, y, m)| ≤ K|x − y|;
(2) 0 ≤ V(t, x, m_2) − V(t, x, m_1) ≤ K(m_2 − m_1) if 0 ≤ m_1 ≤ m_2 ≤ 1;
(3) −(K/(T − t_2))(t_2 − t_1) ≤ V(t_1, x, m) − V(t_2, x, m) ≤ K(t_2 − t_1) if 0 ≤ t_1 ≤ t_2 < T.


Proof. The hard part is establishing continuity in t, so we will first prove continuity in m. Continuity in x is easy and we will leave it for the reader. Fix m_1 < m_2 in [0, 1] and fix t ∈ [0, T], x ∈ R^1. For each ε > 0 we find (ζ_ε, μ_ε) ∈ (Z × 𝔐_{m_1})[t, T] such that
$$ V(t, x, m_1) \ge P_{t,x,m_1}(\zeta_\varepsilon, \mu_\varepsilon) - \varepsilon. \tag{2.5} $$
Define τ_0 as the first time τ after t for which μ_ε(τ) + m_2 − m_1 ≥ 1; if this condition never occurs then set τ_0 = T. Let
$$ \mu_2(\tau) = \begin{cases} \mu_\varepsilon(\tau) + m_2 - m_1, & \text{if } \tau < \tau_0, \\ 1, & \text{if } \tau \ge \tau_0. \end{cases} $$
Let ξ_2 be the trajectory for the controls (ζ_ε, μ_2) and ξ_ε the trajectory for (ζ_ε, μ_ε). These trajectories will be identical if τ_0 = T, so we assume that τ_0 < T. On the time interval [t, τ_0) the trajectories are identical. Let τ_0 ≤ τ ≤ T. We have from assumption (A), with K denoting a generic constant, that
$$ |\xi_2(\tau) - \xi_\varepsilon(\tau)| \le K \int_t^\tau |\xi_2(r) - \xi_\varepsilon(r)|\, dr + \Big| \int_{[t,\tau]} f_2(r, \zeta_\varepsilon(r))\, d\mu_2(r) - \int_{[t,\tau]} f_2(r, \zeta_\varepsilon(r))\, d\mu_\varepsilon(r) \Big| \le K \int_t^\tau |\xi_2(r) - \xi_\varepsilon(r)|\, dr + K(m_2 - m_1). $$
Gronwall's inequality allows us to conclude that
$$ \sup_{t \le \tau \le T} |\xi_2(\tau) - \xi_\varepsilon(\tau)| \le K(m_2 - m_1). $$
It then follows, by a similar calculation, that
$$ |P_{t,x,m_1}(\zeta_\varepsilon, \mu_\varepsilon) - P_{t,x,m_2}(\zeta_\varepsilon, \mu_2)| \le K(m_2 - m_1). $$
Consequently, using (2.5),
$$ V(t, x, m_1) \ge P_{t,x,m_2}(\zeta_\varepsilon, \mu_2) - K(m_2 - m_1) - \varepsilon \ge V(t, x, m_2) - K(m_2 - m_1) - \varepsilon. $$
So, we conclude that
$$ V(t, x, m_2) - V(t, x, m_1) \le K(m_2 - m_1). \tag{2.6} $$
For the other side we use the following lemma.


LEMMA 2.2. V is monotone nondecreasing in m ∈ [0, 1], i.e.
$$ V(t, x, m_2) - V(t, x, m_1) \ge 0, \qquad 1 \ge m_2 \ge m_1 \ge 0. \tag{2.7} $$

Proof. For each ε > 0 choose (ζ_ε, μ_2) ∈ (Z × 𝔐_{m_2})[t, T] such that
$$ V(t, x, m_2) \ge P_{t,x,m_2}(\zeta_\varepsilon, \mu_2) - \varepsilon. $$
Let μ_1 = μ_2 − m_2 + m_1. Then μ_1 starts at m_1 and is simply μ_2 shifted down by m_2 − m_1. Further, dμ_1 = dμ_2, so that the associated trajectories are identical. Therefore,
$$ V(t, x, m_2) \ge P_{t,x,m_2}(\zeta_\varepsilon, \mu_2) - \varepsilon = P_{t,x,m_1}(\zeta_\varepsilon, \mu_1) - \varepsilon \ge V(t, x, m_1) - \varepsilon, $$
completing the proof of the lemma. ∎

Combining (2.6) and (2.7), continuity in m and (2) is established.

Now we turn to continuity in t ∈ [0, T). Fix 0 ≤ t_1 < t_2 < T and fix x ∈ R^1, m ∈ [0, 1). For each ε > 0 there exists (ζ_2, μ_2) ∈ (Z × 𝔐_m)[t_2, T] such that
$$ V(t_2, x, m) \ge P_{t_2,x,m}(\zeta_2, \mu_2) - \varepsilon. \tag{2.8} $$
Set ζ_1(τ) = ζ_2(t_2) if t_1 ≤ τ < t_2 and ζ_1(τ) = ζ_2(τ) if t_2 ≤ τ ≤ T. Let μ_1(τ) = m if t_1 ≤ τ < t_2 and μ_1(τ) = μ_2(τ) if t_2 ≤ τ ≤ T. Then (ζ_1, μ_1) ∈ (Z × 𝔐_m)[t_1, T]. Finally, let ξ_2 be the trajectory on [t_2, T] for the controls (ζ_2, μ_2) and let ξ_1 be the trajectory on [t_1, T], also starting from x, for the controls (ζ_1, μ_1). Then it follows from assumption (A) and the fact that dμ_1(τ) = 0 if t_1 ≤ τ < t_2 and dμ_1(τ) = dμ_2(τ) if t_2 ≤ τ ≤ T, that
$$ |\xi_1(t_2) - x| \le K(t_2 - t_1) \qquad \text{and} \qquad \sup_{t_2 \le \tau \le T} |\xi_1(\tau) - \xi_2(\tau)| \le K(t_2 - t_1), $$
and so
$$ |P_{t_1,x,m}(\zeta_1, \mu_1) - P_{t_2,x,m}(\zeta_2, \mu_2)| \le \int_{t_1}^{t_2} |h_1(r, \xi_1(r), \zeta_1(r))|\, dr + K(t_2 - t_1) \le K(t_2 - t_1). $$
Therefore, from (2.8),
$$ V(t_2, x, m) \ge P_{t_1,x,m}(\zeta_1, \mu_1) - K(t_2 - t_1) - \varepsilon \ge V(t_1, x, m) - K(t_2 - t_1) - \varepsilon. $$
We conclude that
$$ V(t_1, x, m) - V(t_2, x, m) \le K(t_2 - t_1). \tag{2.9} $$
Next, we need to show that
$$ V(t_2, x, m) - V(t_1, x, m) \le \frac{K}{T - t_2}(t_2 - t_1). \tag{2.10} $$
We again begin with an ε > 0 and a pair (ζ_1, μ_1) ∈ (Z × 𝔐_m)[t_1, T] which is ε-optimal for V(t_1, x, m).


Define the functions s: [t_1, T] → [t_2, T] and τ: [t_2, T] → [t_1, T] by
$$ s(\tau) = t_2 + \frac{T - t_2}{T - t_1}(\tau - t_1), \qquad \tau(s) = t_1 + \frac{T - t_1}{T - t_2}(s - t_2). $$
Define the map Θ: C[t_1, T] → C[t_2, T] by (Θf)(s) = f(τ(s)). The map Θ is a linear isomorphism with norm 1. Now we consider the adjoint operator Θ*, which is also an isomorphism from Radon measures on [t_2, T] to Radon measures on [t_1, T]. Therefore, there exists a Radon measure μ_2 such that Θ*(μ_2) = μ_1, and for any φ ∈ C[t_1, T],
$$ \langle \varphi, \mu_1 \rangle = \int_{[t_1, T]} \varphi(r)\, d\mu_1(r) = \langle \varphi, \Theta^* \mu_2 \rangle = \langle \Theta \varphi, \mu_2 \rangle = \int_{[t_2, T]} \varphi(\tau(r))\, d\mu_2(r). \tag{2.11} $$
It is not hard to see, by suitably choosing φ, that μ_2 ∈ 𝔐_m[t_2, T]. We can extend relation (2.11) to the space of bounded, Borel measurable functions, since the Borel σ-field is contained in the μ_2-measurable σ-algebra. Then, by approximating a Borel measurable function by a sequence of continuous functions and using the dominated convergence theorem, we see that (2.11) will hold for any φ which is bounded and Borel measurable. Note that we are not saying that a continuous linear functional on the space of Borel functions is represented by a Radon measure. We are saying that the Radon measure representation of the continuous linear functional (with the sup norm) Θ can be extended to Borel functions using the L^1 norm with the μ measure.

Define ζ_2(s) = (Θζ_1)(s) = ζ_1(τ(s)). Let ξ_1 be the trajectory on [t_1, T] for the controls (ζ_1, μ_1), and let ξ_2 be the trajectory on [t_2, T], also starting from x, for the controls (ζ_2, μ_2).

LEMMA 2.3. We have
$$ (\zeta_2, \mu_2) \in (Z \times \mathfrak{M}_m)[t_2, T] \tag{2.12} $$
and
$$ \sup_{t_2 \le s \le T} |\xi_1(\tau(s)) - \xi_2(s)| \le \frac{K}{T - t_2}(t_2 - t_1). \tag{2.13} $$

Proof. We will only prove (2.13) since the proof of (2.12) is similar. (See theorem 3.1 below for the preliminaries for (2.12).) We have that
$$ \xi_2(s) = x + \int_{t_2}^{s} f_1(r, \xi_2(r), \zeta_2(r))\, dr + \int_{[t_2, s]} f_2(r, \zeta_2(r))\, d\mu_2(r) $$
and
$$ \xi_1(\tau(s)) = x + \int_{t_1}^{\tau(s)} f_1(b, \xi_1(b), \zeta_1(b))\, db + \int_{[t_1, T]} f_2(b, \zeta_1(b))\, 1_{[t_1, \tau(s)]}(b)\, d\mu_1(b). \tag{2.14} $$
We use the notation that 1_A is the characteristic function of the set A. Make the substitution b = τ(r) in the first integral in (2.14) and use the definition of Θ given in (2.11) in the second integral to get
$$ \xi_1(\tau(s)) = x + \frac{T - t_1}{T - t_2} \int_{t_2}^{s} f_1(\tau(r), \xi_1(\tau(r)), \zeta_2(r))\, dr + \int_{[t_2, s]} f_2(\tau(r), \zeta_2(r))\, d\mu_2(r). $$
Now we use the following facts:
$$ \Big| \frac{T - t_1}{T - t_2} - 1 \Big| = \frac{t_2 - t_1}{T - t_2} \qquad \text{and} \qquad |s - \tau(s)| = (t_2 - t_1)\,\frac{T - s}{T - t_2} \le t_2 - t_1. $$
Combining these facts, again using assumption (A), we get the estimate
$$ |\xi_1(\tau(s)) - \xi_2(s)| \le K \int_{t_2}^{s} |\xi_1(\tau(r)) - \xi_2(r)|\, dr + \frac{K}{T - t_2}(t_2 - t_1). $$
Gronwall's inequality then establishes that (2.13) holds. ∎

Now that we have an estimate on the trajectories, it is easy to verify that
$$ |P_{t_1,x,m}(\zeta_1, \mu_1) - P_{t_2,x,m}(\zeta_2, \mu_2)| \le \frac{K}{T - t_2}(t_2 - t_1). $$
We conclude that
$$ V(t_2, x, m) \le P_{t_2,x,m}(\zeta_2, \mu_2) \le P_{t_1,x,m}(\zeta_1, \mu_1) + \frac{K}{T - t_2}(t_2 - t_1) \le V(t_1, x, m) + \varepsilon + \frac{K}{T - t_2}(t_2 - t_1), $$

which gives the desired estimate (2.10). ∎

The proof of theorem 2.1 is completed using the next proposition. This result gives us the terminal and boundary conditions and shows that V is continuous on [0, T] × R^1 × [0, 1].

PROPOSITION 2.4. V satisfies the terminal condition
$$ \lim_{t \uparrow T} V(t, x, m) = V(T, x, m) = \min_{z \in Z,\; m \le a \le 1} h_2(T, z)(a - m) = \min\Big( (1 - m) \min_{z \in Z} h_2(T, z),\; 0 \Big) \tag{2.15} $$
and the boundary condition
$$ V(t, x, 1) = \gamma(t, x), \tag{2.16} $$
where
$$ \gamma(t, x) = \inf_{\zeta \in Z[t,T]} \int_t^T h_1(r, \xi(r), \zeta(r))\, dr, \qquad \text{with } d\xi/dr = f_1(r, \xi(r), \zeta(r)),\ \xi(t) = x, $$
is the value function for the optimal control problem in which the measures do not appear.

Proof. Fix m ≤ a ≤ 1, z ∈ Z and choose μ(t − 0) = m, μ(τ) = a if t ≤ τ ≤ T. We have a point mass at t if m < a. Then from (2.3), (2.4),
$$ V(t, x, m) \le \int_t^T h_1(r, \xi(r), z)\, dr + h_2(t, z)(a - m). $$
Let t ↑ T to get lim sup_{t↑T} V(t, x, m) ≤ min_{z ∈ Z, m ≤ a ≤ 1} h_2(T, z)(a − m).

For the other side, let (ζ, μ) ∈ (Z × 𝔐_m)[t, T] be arbitrary. Then from assumption (A),
$$ P_{t,x,m}(\zeta, \mu) = \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,T]} h_2(r, \zeta(r))\, d\mu(r) \ge \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,T]} \big( h_2(T, \zeta(r)) - K(T - r) \big)\, d\mu(r) \ge \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \Big( \min_{z \in Z} h_2(T, z) - K(T - t) \Big)\big( \mu(T) - m \big) \ge \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \min_{z \in Z,\; m \le a \le 1} \big( h_2(T, z) - K(T - t) \big)(a - m). $$
Consequently, since ζ and μ were arbitrary, letting t ↑ T we see that lim inf_{t↑T} V(t, x, m) ≥ min_{z ∈ Z, m ≤ a ≤ 1} h_2(T, z)(a − m), and the terminal condition (2.15) is verified.


Finally, to see that the boundary condition (2.16) is satisfied we simply observe that if the controls μ must start at 1 and be nondecreasing then they must stay at 1. That is, 𝔐_1[t, T] = {1}, and the result follows immediately from the proof of continuity of V in m. ∎

The proof of proposition 2.4 as well as theorem 2.1 is complete.

Remark. Suppose that we had a terminal cost, say g(ξ(T)), as well as a running cost, i.e. the cost functional is g(ξ(T)) + P_{t,x,m}(ζ, μ). In this case, the terminal condition becomes
$$ V(T, x, m) = \min_{z \in Z,\; m \le a \le 1} \big[ h_2(T, z)(a - m) + g\big( x + f_2(T, z)(a - m) \big) \big] $$
and the boundary condition becomes
$$ V(t, x, 1) = \gamma(t, x) = \inf_{\zeta \in Z[t,T]} \Big[ g(\xi(T)) + \int_t^T h_1(r, \xi(r), \zeta(r))\, dr \Big], $$
where dξ/dr = f_1(r, ξ(r), ζ(r)), ξ(t) = x.

The next result contains the dynamic programming principle for the optimal control problem.

PROPOSITION 2.5. Let assumption (A) hold. Then for any t < s ≤ T we have that
$$ V(t, x, m) = \inf_{(\zeta, \mu) \in (Z \times \mathfrak{M}_m)[t,s]} \Big[ \int_t^s h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,s)} h_2(r, \zeta(r))\, d\mu(r) + V(s, \xi(s^-), \mu(s^-)) \Big] \tag{DP1} $$
and
$$ V(t, x, m) = \min_{z \in Z,\; 1 - m \ge \delta \ge 0} \big[ h_2(t, z)\delta + V(t, x + \delta f_2(t, z), m + \delta) \big]. \tag{DP2} $$
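As a quick numerical reading of (DP2), the jump step can be evaluated by brute force. The stand-in value function V and the data f_2, h_2 below are hypothetical choices of ours; the sketch only shows the minimization over z and δ, not the paper's construction.

```python
import numpy as np

# hypothetical data and a stand-in for V(t, ., .) at a fixed time t
f2 = lambda t, z: z
h2 = lambda t, z: 0.5
V = lambda x, m: x * x + m

def jump_step(t, x, m, Z=(-1.0, 0.0, 1.0), n_delta=51):
    """Brute-force evaluation of the right-hand side of (DP2)."""
    best = np.inf
    for z in Z:
        for delta in np.linspace(0.0, 1.0 - m, n_delta):
            best = min(best, h2(t, z) * delta + V(x + delta * f2(t, z), m + delta))
    return best

# delta = 0 is admissible, so the jump step never exceeds V itself;
# here an actual jump (z = -1, delta near 0.25) strictly lowers the value
assert jump_step(0.0, 1.0, 0.2) < V(1.0, 0.2)
```

The inequality "jump step ≤ V" always holds since δ = 0 is a candidate; when it is strict, an immediate jump is advantageous.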

Proof. We will prove (DP2); the proof of (DP1) is standard and furthermore is very similar to [18, theorems 2.1, 2.3]. Let F(t, x, m) denote the right-hand side of (DP2). Since we can choose δ = 0 we see that F(t, x, m) ≤ V(t, x, m). For the other side, let z ∈ Z be fixed and ζ(τ) = z, t ≤ τ ≤ T. Fix 0 ≤ δ ≤ 1 − m. Let μ ∈ 𝔐_m[t, T] be defined by μ(t⁻) = m and μ(τ) = δ + m if t ≤ τ ≤ T. Let ξ(·) be the trajectory for the controls (ζ, μ). Then, for any ε > 0 with t + ε ≤ T, we have from (DP1) that
$$ V(t, x, m) \le \int_t^{t+\varepsilon} h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t, t+\varepsilon)} h_2(r, \zeta(r))\, d\mu(r) + V(t + \varepsilon, \xi(t + \varepsilon - 0), \mu(t + \varepsilon - 0)) = \int_t^{t+\varepsilon} h_1(r, \xi(r), \zeta(r))\, dr + h_2(t, z)\delta + V(t + \varepsilon, \xi(t + \varepsilon - 0), \mu(t + \varepsilon - 0)). \tag{2.17} $$
Letting ε → 0, since ξ(t + ε) → x + f_2(t, z)δ and μ is right continuous, we conclude from the continuity of V that
$$ V(t, x, m) \le h_2(t, z)\delta + V(t, x + f_2(t, z)\delta, m + \delta), \qquad \forall z \in Z,\ \forall\, 1 - m \ge \delta \ge 0. $$
Therefore, V(t, x, m) ≤ F(t, x, m) and the result is proved. ∎

Using the same method of proof we easily derive the following lemma.

LEMMA 2.6. The map δ ↦ min_{z ∈ Z} [h_2(t, z)δ + V(t, x + f_2(t, z)δ, m + δ)] is nondecreasing on [0, 1 − m].

Remark. We can combine (DP1) and (DP2) to get
$$ V(t, x, m) = \inf_{(\zeta, \mu) \in (Z \times \mathfrak{M}_m)[t,s]} \Big[ \int_t^s h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t,s)} h_2(r, \zeta(r))\, d\mu(r) + \min_{z \in Z,\; 1 - \mu(s^-) \ge \delta \ge 0} \big( h_2(s, z)\delta + V(s, \xi(s^-) + \delta f_2(s, z), \mu(s^-) + \delta) \big) \Big]. $$

Next we will derive the Bellman equation for the problem and prove that V is the viscosity solution of the equation. Define the Hamiltonians H_1: [0, T] × R^2 → R^1 and H_2: [0, T] × R^1 → R^1 by
$$ H_1(t, x, p_x) = \min_{z \in Z} \big( p_x f_1(t, x, z) + h_1(t, x, z) \big), \qquad H_2(t, p_x) = \min_{z \in Z} \big( p_x f_2(t, z) + h_2(t, z) \big). $$

THEOREM 2.7. Let assumption (A) hold. The value function V is the unique viscosity solution on the set Q = (0, T) × R^1 × (0, 1) of
$$ \min\big( V_t + H_1(t, x, V_x),\; V_m + H_2(t, V_x) \big) = 0 \tag{2.18} $$
and V satisfies the terminal condition (2.15) and boundary condition (2.16).

Before we give the proof of the theorem we recall from [19, 20] the definition of a (possibly discontinuous) viscosity solution of a Hamilton-Jacobi equation.

Definition 2.8. A function u: R^n → R^1 is a viscosity subsolution (supersolution) of the equation G(x, u, D_x u) = 0, where G: R^n × R^1 × R^n → R^1, if for any φ ∈ C^1(R^n) for which u* − φ has a maximum (u_* − φ has a minimum) at the point y, we have
$$ G^*(y, u^*(y), D_x \varphi(y)) \ge 0 \qquad \big( \text{respectively } G_*(y, u_*(y), D_x \varphi(y)) \le 0 \big), $$
where u*, u_* denote the upper and lower semicontinuous envelopes of u, respectively, and similarly for G*, G_*.
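For intuition about what equation (2.18) computes, here is a small finite-difference sketch of ours. The discretization (an upwind transport step for V_t + H_1 = 0 followed by a discrete jump operator in m) and all data are hypothetical choices of ours, not a scheme from the paper.

```python
import numpy as np

# hypothetical data: f1 = z, f2 = 0 (jumps move m only), h1 = x^2, h2 = 0.1
Z = (-1.0, 1.0)
h1 = lambda x: x * x
h2c = 0.1

xs = np.linspace(-1.0, 1.0, 41); dx = xs[1] - xs[0]
ms = np.linspace(0.0, 1.0, 11);  dm = ms[1] - ms[0]
dt = 0.4 * dx                     # CFL condition, since |f1| <= 1

# terminal condition (2.15): min((1 - m) * min_z h2, 0), constant in x
V = np.minimum((1.0 - ms) * h2c, 0.0)[None, :] + 0.0 * xs[:, None]

for _ in range(int(1.0 / dt)):
    # upwind transport step for V_t + min_z [V_x f1 + h1] = 0, backward in time
    Vx_f = (np.roll(V, -1, axis=0) - V) / dx
    Vx_b = (V - np.roll(V, 1, axis=0)) / dx
    H = np.full_like(V, np.inf)
    for z in Z:
        Vx = Vx_f if z >= 0 else Vx_b        # difference matching sign of f1 = z
        H = np.minimum(H, Vx * z + h1(xs)[:, None])
    V = V + dt * H
    V[0, :], V[-1, :] = V[1, :], V[-2, :]    # crude outflow boundary
    # jump operator (DP2): a step dm in m costs h2 * dm
    V[:, :-1] = np.minimum(V[:, :-1], h2c * dm + V[:, 1:])

assert np.all(np.isfinite(V))
assert np.all(V >= -1e-9)                    # nonnegative running cost keeps V >= 0
```

The splitting mirrors the variational inequality: the transport step enforces V_t + H_1 ≥ 0 with equality where no jump is used, while the jump operator enforces V_m + H_2 ≥ 0 in discrete form.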


In general, we see that a viscosity solution as well as the function G may be discontinuous. In our problem we have already proved the continuity of the proposed solution, and we have the continuous function G given by
$$ G(t, x, m, p_t, p_m, p_x) = \min\big( p_t + H_1(t, x, p_x),\; p_m + H_2(t, p_x) \big). $$
We now turn to the proof of the theorem.

Proof. Let φ be a smooth function on Q̄ and suppose that V − φ achieves a strict zero maximum at the point (t_0, x_0, m_0). We can always arrange, by modifying φ if necessary (c.f. [1, 21]), to have (t_0, x_0, m_0) ∈ (0, T) × R^1 × (0, 1). From (DP2) we have that V(t_0, x_0, m_0) = φ(t_0, x_0, m_0), and therefore, for every δ > 0,
$$ 0 \le \min_{z \in Z} \big[ h_2(t_0, z) + \delta^{-1} \big( \varphi(t_0, x_0 + \delta f_2(t_0, z), m_0 + \delta) - \varphi(t_0, x_0, m_0) \big) \big]. $$
Let δ → 0 and use the differentiability of φ to get
$$ 0 \le \min_{z \in Z} \big[ h_2(t_0, z) + \varphi_x(t_0, x_0, m_0) f_2(t_0, z) + \varphi_m(t_0, x_0, m_0) \big]. \tag{2.19} $$
Define the control μ ∈ 𝔐_{m_0}[t_0, T] by μ(τ) = m_0, t_0 ≤ τ ≤ T. From (DP1) we get, for any t_0 ≤ s ≤ T and any ζ,
$$ \varphi(t_0, x_0, m_0) = V(t_0, x_0, m_0) \le \int_{t_0}^{s} h_1(r, \xi(r), \zeta(r))\, dr + \varphi(s, \xi(s), m_0). $$
Notice that for the control μ the trajectory for each ζ is given by dξ/dr = f_1(r, ξ(r), ζ(r)), ξ(t_0) = x_0, and there are no jumps in either μ or ξ. Set s = t_0 + ε in the preceding, divide by ε and let ε → 0 to obtain
$$ 0 \le \min_{z \in Z} \big[ h_1(t_0, x_0, z) + \varphi_x(t_0, x_0, m_0) f_1(t_0, x_0, z) + \varphi_t(t_0, x_0, m_0) \big]. \tag{2.20} $$
Combining (2.19) and (2.20) we see that V is a subsolution of (2.18).

We need to prove finally that V is a supersolution of (2.18). Thus, suppose that V − φ has a strict zero minimum at the point (t_0, x_0, m_0) ∈ Q with φ a smooth function. Assume to the contrary that there is a constant C > 0 such that
$$ \varphi_t(t_0, x_0, m_0) + \min_{z \in Z} \big[ \varphi_x(t_0, x_0, m_0) f_1(t_0, x_0, z) + h_1(t_0, x_0, z) \big] \ge C \tag{2.21} $$
and
$$ \varphi_m(t_0, x_0, m_0) + \min_{z \in Z} \big[ \varphi_x(t_0, x_0, m_0) f_2(t_0, z) + h_2(t_0, z) \big] \ge C. \tag{2.22} $$


Fix z ∈ Z and define ξ(σ) by dξ(σ)/dσ = f_2(t_0, z), ξ(m_0) = x_0. From (2.22), since φ is smooth, we see that for all σ ∈ [m_0, m_0 + δ], for small ε > δ > 0,
$$ \varphi_m(t_0, \xi(\sigma), \sigma) + \varphi_x(t_0, \xi(\sigma), \sigma) f_2(t_0, z) + h_2(t_0, z) \ge C/2. $$
Integrate this from m_0 to m_0 + δ to obtain
$$ \varphi(t_0, x_0 + \delta f_2(t_0, z), m_0 + \delta) - \varphi(t_0, x_0, m_0) + \delta h_2(t_0, z) \ge \delta C/2. \tag{2.23} $$
Since V − φ has a strict zero minimum at (t_0, x_0, m_0), we obtain from (2.23)
$$ V(t_0, x_0 + \delta f_2(t_0, z), m_0 + \delta) + \delta h_2(t_0, z) \ge \varphi(t_0, x_0, m_0) + \delta C/2 = V(t_0, x_0, m_0) + \delta C/2 \tag{2.24} $$
for all z ∈ Z and sufficiently small ε > δ > 0. This inequality says that it is not optimal to jump to another position at time t_0.

LEMMA 2.9. If (2.24) holds, then there exists an ε > 0 such that for all t_0 < s < t_0 + ε,
$$ V(t_0, x_0, m_0) = \inf_{\zeta, \mu} \Big[ \int_{t_0}^{s} h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t_0, s]} h_2(r, \zeta(r))\, d\mu(r) + V(s, \xi(s), \mu(s)) \Big], $$
where the infimum on μ is taken on the class 𝔐_{m_0}[t_0, s] ∩ C[t_0, s].

Proof. For each integer n = 1, 2, ..., there exists (ζ_n, μ_n) ∈ (Z × 𝔐_{m_0})[t_0, T] such that
$$ V(t_0, x_0, m_0) + \frac{1}{n} \ge P_{t_0, x_0, m_0}(\zeta_n, \mu_n). $$
Let s_n ≥ t_0 be the first point of discontinuity of μ_n. We have, with Δ_n = μ_n(s_n) − μ_n(s_n − 0),
$$ V(t_0, x_0, m_0) + \frac{1}{n} \ge \int_{t_0}^{s_n} h_1(r, \xi_n(r), \zeta_n(r))\, dr + \int_{[t_0, s_n)} h_2(r, \zeta_n(r))\, d\mu_n(r) + \min_{z \in Z} \big[ \Delta_n h_2(s_n, z) + V(s_n, \xi_n(s_n - 0) + \Delta_n f_2(s_n, z), \mu_n(s_n - 0) + \Delta_n) \big]. $$
If there is a subsequence such that s_n → t_0 and Δ_n → δ' with δ' > 0, then using the continuity of V we obtain, letting n → ∞,
$$ V(t_0, x_0, m_0) \ge \min_{z \in Z} \big[ \delta' h_2(t_0, z) + V(t_0, x_0 + \delta' f_2(t_0, z), m_0 + \delta') \big]. $$
Using lemma 2.6 we have reached a contradiction of (2.24). ∎


Now fix ε given by lemma 2.9. Let 0 < ρ_0 < ε and let ζ ∈ Z[t_0, t_0 + ρ_0] and μ ∈ 𝔐_{m_0} ∩ C[t_0, t_0 + ρ_0], with (2.21) and (2.22) (with C replaced by C/2) holding at (r, ξ(r), μ(r)), t_0 ≤ r ≤ t_0 + ρ_0. Then we compute
$$ \varphi(t_0 + \rho_0, \xi(t_0 + \rho_0), \mu(t_0 + \rho_0)) - \varphi(t_0, x_0, m_0) + \int_{t_0}^{t_0 + \rho_0} h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t_0, t_0 + \rho_0]} h_2(r, \zeta(r))\, d\mu(r) \ge \rho_0 C. \tag{2.25} $$
Since V − φ has a strict zero minimum at (t_0, x_0, m_0), we obtain from (2.25) that
$$ V(t_0 + \rho_0, \xi(t_0 + \rho_0), \mu(t_0 + \rho_0)) - V(t_0, x_0, m_0) + \int_{t_0}^{t_0 + \rho_0} h_1(r, \xi(r), \zeta(r))\, dr + \int_{[t_0, t_0 + \rho_0]} h_2(r, \zeta(r))\, d\mu(r) \ge \rho_0 C. \tag{2.26} $$
This inequality, in view of lemma 2.9, is a contradiction. Therefore, V is shown to be a viscosity supersolution of (2.18) as well. Finally, the fact that V is the only viscosity solution of (2.18) follows from more general uniqueness results for first order Hamilton-Jacobi equations (c.f. [22-24]). ∎

Now we introduce the following optimal control problem with unbounded controls. Minimize
$$ P_{t,x,m}(\zeta, \alpha) = \int_t^T h_1(r, \xi(r), \zeta(r))\, dr + \int_t^T h_2(r, \zeta(r))\, \alpha(r)\, 1_{\{\mu < 1\}}\, dr \tag{2.27} $$
subject to
$$ d\xi/dr = f_1(r, \xi(r), \zeta(r)) + f_2(r, \zeta(r))\, \alpha(r)\, 1_{\{\mu < 1\}}, \qquad d\mu/dr = \alpha(r)\, 1_{\{\mu < 1\}}, \qquad \xi(t) = x \in R^1, \quad \mu(t) = m \in [0, 1], \tag{2.28} $$
over the class of controls (ζ, α) ∈ Z[t, T] × L^1_+[t, T], where
$$ L^1_+[t, T] = \Big\{ \alpha : [t, T] \to [0, \infty) : \int_t^T \alpha(r)\, dr < \infty \Big\}. $$


The function 1_{\{\mu < 1\}} is the characteristic function of the set {μ(s) < 1}. For α ∈ L^1_+[t, T] we see that μ(r) ∈ [0, 1] for all t ≤ r ≤ T. Furthermore, since the total mass of dμ is at most 1, we see that sup_{t ≤ r ≤ T} |ξ(r)| ≤ K, independently of the controls.

The value function for this problem is defined by
$$ W(t, x, m) = \inf_{(\zeta, \alpha) \in Z[t,T] \times L^1_+[t,T]} P_{t,x,m}(\zeta, \alpha). $$
It is easily seen that W is a bounded function under the assumption (A).

THEOREM 2.10. Let assumption (A) hold.
(1) W is a viscosity solution of (2.18) and satisfies the terminal condition (2.15) and boundary condition (2.16).
(2) The value function W is also the unique continuous viscosity solution of
$$ W_t(t, x, m) + H(t, x, W_m, W_x) = 0, \tag{2.29} $$
where
$$ H(t, x, p_m, p_x) = \begin{cases} H_1(t, x, p_x), & \text{if } p_m + H_2(t, p_x) \ge 0, \\ -\infty, & \text{if } p_m + H_2(t, p_x) < 0. \end{cases} \tag{2.30} $$
(3) W = V on Q̄.

Proof. The proof that W satisfies the terminal and boundary conditions is similar to that in proposition 2.4 and is left to the reader. We will prove that W is a viscosity solution of (2.18). In fact, this follows immediately from theorem I.1 of [7], but we will provide the details. The idea of the proof is to bound the controls α, which then results in a standard optimal control problem to which classical results apply. Therefore, we consider the control problem (2.27), (2.28), but we must choose the controls α from the class
$$ A_B[t, T] = \{\, \alpha : [t, T] \to [0, B] : \alpha \in L^1_+[t, T] \,\}, $$
for each fixed B > 0. When we use this class we will denote the corresponding value function by W^B. Now, using standard theory, W^B is the unique viscosity solution on Q of
$$ W_t + H^B(t, x, W_m, W_x) = 0, $$
where
$$ H^B(t, x, p_m, p_x) = \min_{z \in Z} \big[ p_x f_1(t, x, z) + h_1(t, x, z) - B\big( p_m + p_x f_2(t, z) + h_2(t, z) \big)^- \big]. $$
By considering the classes of control functions it is clear that B ≥ B' implies that W ≤ W^B ≤ W^{B'}. We conclude that W^B converges to some function Γ ≥ W which is upper semicontinuous. In fact, it is not hard for the reader to verify that on (0, T) × R^1 × [0, 1], Γ = W. Therefore, W is at least upper semicontinuous. We will now use the fact that W^B ↓ W to show that W is a viscosity solution of (2.18).
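The penalization mechanism can be checked numerically. The data below (Z, f_1, f_2, h_1, h_2) are hypothetical choices of ours, only meant to exhibit the two regimes of (2.30): when the constraint p_m + H_2 ≥ 0 holds, the penalty term vanishes and H^B reduces to H_1; when it fails, H^B → −∞ as B → ∞.

```python
import numpy as np

# hypothetical data for the sketch
Z = np.linspace(-1.0, 1.0, 201)
f1 = lambda t, x, z: z
f2 = lambda t, z: 1.0 + 0.0 * z
h1 = lambda t, x, z: z * z
h2 = lambda t, z: 0.0 * z

def HB(t, x, pm, px, B):
    # -B * (a)^- written as +B * min(a, 0)
    pen = np.minimum(pm + px * f2(t, Z) + h2(t, Z), 0.0)
    return float(np.min(px * f1(t, x, Z) + h1(t, x, Z) + B * pen))

def H1(t, x, px):
    return float(np.min(px * f1(t, x, Z) + h1(t, x, Z)))

# constraint satisfied (pm + px * f2 = 1 >= 0): H^B equals H1 for every B
assert abs(HB(0, 0, 1.0, 0.0, 1e6) - H1(0, 0, 0.0)) < 1e-12
# constraint violated (pm + px * f2 = -1 < 0): H^B -> -infinity as B grows
assert HB(0, 0, -1.0, 0.0, 1e6) < -1e5
```

This is exactly the monotone limit used in the proof: dropping the nonnegative penalty gives the H_1 inequality, while dividing by B isolates the constraint.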


Let W − φ achieve a zero unique maximum at the point (t_0, x_0, m_0) with φ a smooth function. We arrange, if necessary, to have t_0 > 0 and 0 < m_0 < 1. Then, by lemma A.2 in Barles and Perthame [25], for each B > 0, W^B − φ achieves a maximum at (t_B, x_B, m_B) and (t_B, x_B, m_B) → (t_0, x_0, m_0) as B → ∞. Since W^B is a subsolution, at (t_B, x_B, m_B)
$$ \varphi_t + \min_{z \in Z} \big[ \varphi_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) - B\big( \varphi_m + \varphi_x f_2(t_B, z) + h_2(t_B, z) \big)^- \big] \ge 0. \tag{2.31} $$
Since the penalty term B(·)⁻ is nonnegative, we may drop it to get
$$ \varphi_t + \min_{z \in Z} \big[ \varphi_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) \big] \ge 0 \qquad \text{at } (t_B, x_B, m_B). $$
Let B → ∞ to see that
$$ \varphi_t + \min_{z \in Z} \big[ \varphi_x f_1(t_0, x_0, z) + h_1(t_0, x_0, z) \big] \ge 0 \qquad \text{at } (t_0, x_0, m_0). \tag{2.32} $$
Also, divide through by B in (2.31), let B → ∞ and use assumption (A) to obtain
$$ \min_{z \in Z} \Big( -\big( \varphi_m + \varphi_x f_2(t_0, z) + h_2(t_0, z) \big)^- \Big) \ge 0, $$
which immediately implies that
$$ \varphi_m + \min_{z \in Z} \big[ \varphi_x f_2(t_0, z) + h_2(t_0, z) \big] \ge 0. \tag{2.33} $$
Combining (2.32) and (2.33) we conclude that W is a viscosity subsolution of (2.18).

Now suppose that W − φ achieves a zero unique minimum at the point (t_0, x_0, m_0) with φ a smooth function. Then, again by lemma A.2 in Barles and Perthame [25], for each B > 0, W^B − φ achieves a minimum at (t_B, x_B, m_B) and (t_B, x_B, m_B) → (t_0, x_0, m_0) as B → ∞. At (t_B, x_B, m_B),
$$ \varphi_t + \min_{z \in Z} \big[ \varphi_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) - B\big( \varphi_m + \varphi_x f_2(t_B, z) + h_2(t_B, z) \big)^- \big] \le 0. \tag{2.34} $$
If φ_m + min_{z ∈ Z} [φ_x f_2(t_0, z) + h_2(t_0, z)] ≥ C > 0, then, by continuity, at (t_B, x_B, m_B) for B sufficiently large
$$ \varphi_m + \min_{z \in Z} \big[ \varphi_x f_2(t_B, z) + h_2(t_B, z) \big] \ge C/2, $$
so the penalty term vanishes, and from (2.34) we see that
$$ \varphi_t + \min_{z \in Z} \big[ \varphi_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) \big] \le 0. \tag{2.35} $$
Letting B → ∞ we see that (2.35) holds at the point (t_0, x_0, m_0). Consequently, W is a supersolution of (2.18). Since the viscosity solution of (2.18) is known to be at least continuous, we know also from this fact that W must be continuous. Part (1) is proved.

Remark. It is not hard to directly establish the continuity of W; the details of the proof are similar to those of theorem 2.1.

We will prove part (2) of the theorem from the following lemma.


LEMMA 2.11. A continuous function is a viscosity solution of (2.18) if and only if it is also a viscosity solution of (2.29).

Proof. Let Γ be a viscosity solution of (2.18). It is obvious that Γ is then also a subsolution of (2.29), so we need only show it is a supersolution of (2.29). To this end, if Γ − φ has a minimum at (t_0, x_0, m_0), then
$$ \min\big( \varphi_t + H_1(t_0, x_0, \varphi_x),\; \varphi_m + H_2(t_0, \varphi_x) \big) \le 0. $$
If φ_m + H_2(t_0, φ_x) > 0 then φ_t + H_1(t_0, x_0, φ_x) ≤ 0. So, by (2.30), Γ is a supersolution of (2.29). On the other hand, if φ_m + H_2(t_0, φ_x) ≤ 0 then, again using the definition of the Hamiltonian H, φ_t + H(t_0, x_0, m_0, φ_m, φ_x) = −∞ ≤ 0. In either case we conclude that Γ is a supersolution of (2.29). Hence, a viscosity solution of (2.18) is also a viscosity solution of (2.29). The proof that Γ is a viscosity solution of (2.18) if it is a solution of (2.29) is similar and so we omit it. We conclude that the equations (2.18) and (2.29) are equivalent in the viscosity sense. ∎

Finally we will prove that W = V. We can appeal to uniqueness theorems for (2.18) (c.f. Barles [22]) to conclude that W = V, because we have shown that W and V satisfy the same equation and boundary conditions. We can also prove this directly, however, by using proposition 5.3 of [5]. Clearly, V(t, x, m) ≤ W(t, x, m). For the other side, given ε > 0 there exists a pair of controls (ζ_ε, μ_ε) with associated trajectory ξ_ε(·) which are ε-optimal:
$$ V(t, x, m) \ge P_{t,x,m}(\zeta_\varepsilon, \mu_\varepsilon) - \varepsilon. $$
According to [5, proposition 5.3] there exists a sequence (ζ_i, α_i) with associated trajectories ξ_i such that
$$ \xi_i \to \xi_\varepsilon, \qquad d\mu_i = \alpha_i\, dr \to d\mu_\varepsilon, \qquad \mathrm{meas}\{ r : \zeta_i(r) \ne \zeta_\varepsilon(r) \} \to 0 \quad \text{as } i \to \infty, $$
and
$$ \int_t^T h_1(r, \xi_i(r), \zeta_i(r))\, dr \to \int_t^T h_1(r, \xi_\varepsilon(r), \zeta_\varepsilon(r))\, dr, \qquad \int_t^T h_2(r, \zeta_i(r))\, \alpha_i(r)\, dr \to \int_{[t,T]} h_2(r, \zeta_\varepsilon(r))\, d\mu_\varepsilon(r). $$
Therefore, for i sufficiently large,
$$ V(t, x, m) \ge \int_t^T h_1(r, \xi_i(r), \zeta_i(r))\, dr + \int_t^T h_2(r, \zeta_i(r))\, \alpha_i(r)\, 1_{\{\mu_i < 1\}}\, dr - 2\varepsilon \ge W(t, x, m) - 2\varepsilon, $$
and the result follows. ∎

Remarks. (1) It follows from this result that the model with measures is not more general than that with unbounded control functions. (2) The Bellman equation formally tells us what the optimal controls are. For example, when V_m + H_2(t, V_x) > 0 the optimal measure control consists of doing nothing, i.e. dμ = 0. The optimal ζ control will then provide the minimum of the Hamiltonian H_1. The dμ measure, or equivalently, the α control will be nonzero only on the set where V_m + H_2(t, V_x) = 0. On this set the optimal ζ control will minimize the Hamiltonian H_2. The optimal measure could have an absolutely continuous as well as a singular component. We leave as an open problem the rigorous connection between the Bellman equation and the optimal control.
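The formal rule in remark (2) can be written as a small decision procedure. The value-function gradients and the data below are hypothetical stand-ins of ours; the sketch only encodes the case split, not a verified synthesis of optimal controls.

```python
import numpy as np

# hypothetical data for the sketch
Z = np.array([-1.0, 0.0, 1.0])
f1 = lambda t, x, z: z
f2 = lambda t, z: 1.0 + 0.0 * z
h1 = lambda t, x, z: x * x + z * z
h2 = lambda t, z: 1.0 + z

def feedback(t, x, Vx, Vm, tol=1e-9):
    """Formal feedback: if V_m + H2 > 0, set d mu = 0 and minimize H1;
    on the set V_m + H2 = 0 the measure may act and zeta minimizes H2."""
    obstacle = Vm + np.min(Vx * f2(t, Z) + h2(t, Z))      # V_m + H2(t, V_x)
    if obstacle > tol:
        z = Z[np.argmin(Vx * f1(t, x, Z) + h1(t, x, Z))]  # minimize H1
        return float(z), 0.0                               # measure inactive
    z = Z[np.argmin(Vx * f2(t, Z) + h2(t, Z))]            # minimize H2
    return float(z), 1.0                                   # measure may act

z, act = feedback(0.0, 1.0, Vx=0.0, Vm=0.5)
assert act == 0.0 and z == 0.0   # obstacle strictly positive: no measure used
```

The open problem mentioned above is precisely whether such a rule, read off from the Bellman equation, can be made rigorous.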


3. THE DIFFERENTIAL

GAME

In this section we will consider the differential game associated with the dynamics (2. I), (2.2) and payoff (2.3). The players will be the controls [ and ,U with C the minimizer and p the maximizer of P. We will work within the framework of Elliott and Kalton’s definition of differential games and refer to Elliott [26] for a basic synopsis of results on differential games in the connection with viscosity solutions. Many of the results for differential games are proved in a manner similar to that for the optimal control case. In the interest of brevity we will only provide the proofs which are substantially distinct from those of Section 2. In order to be precise about the differential game let us define the terms. A strategy for the maximizer is a map 01: Z[t, T] + 3n,[t, T], such that c,(r) = &(r), f I r I s for each t s s I T, implies that a[il](r) = ol[&](r), t I T 5 s. This defines CYas a nonanticrpating map. Let r(t) denote the class of strategies for ,D on [t, T]. Similarly, the class of nonanticipating strategies for i on [t, T], is denoted by A(t). A strategy for the minimizer is a nonanticipating map fl: 3n, [t, T] + Z[t, T]. We will sometimes write CYE 3n, [t, T], p E Z[t, T] to signify that the strategies map into a control function in the class. An outcome of ([, a(C)) (respectively (p(p), p)) must be an element of (Z x TK,)[t, T]. Then we have the following definition. Vf: [0, T] x R’ x [0, l] -+ R’ is defined

Definition 3.1. The upper value function V+: [0, T] × R^1 × [0, 1] → R^1 is defined by

V+(t, x, m) = sup_{α ∈ Γ(t)} inf_{ζ ∈ Z[t, T]} P_{t,x,m}(ζ, α[ζ]).

The lower value function V-: [0, T] × R^1 × [0, 1] → R^1 is defined by

V-(t, x, m) = inf_{β ∈ Δ(t)} sup_{μ ∈ M_m[t, T]} P_{t,x,m}(β[μ], μ).

THEOREM 3.2. Under assumption (A):

(1) V+ and V- are bounded and continuous on [0, T] × R^1 × [0, 1] and satisfy the terminal and boundary conditions

(TC)   V±(T, x, m) = min_{z ∈ Z} h2(T, z)(1 - m),
(BC)   V±(t, x, 1) = γ(t, x),

where γ is defined in (1.16).

(2) V+ satisfies the dynamic programming principles

V+(t, x, m) = min_{z ∈ Z} max_{1-m ≥ δ ≥ 0} { h2(t, z)δ + V+(t, x + δ f2(t, z), m + δ) },   (3.1)

V+(t, x, m) = sup_{α ∈ M_m[t, s]} inf_{ζ ∈ Z[t, s]} [ ∫_t^s h1(r, ξ(r), ζ(r)) dr + ∫_{[t, s]} h2(r, ζ(r)) dμ(r) + V+(s, ξ(s-), μ(s-)) ],   (3.2)

where μ = α[ζ] and ξ is the corresponding trajectory.


(3) V- satisfies the dynamic programming principles

V-(t, x, m) = max_{1-m ≥ δ ≥ 0} min_{z ∈ Z} { h2(t, z)δ + V-(t, x + δ f2(t, z), m + δ) },   (3.3)

V-(t, x, m) = inf_{β ∈ Z[t, s]} sup_{μ ∈ M_m[t, s]} [ ∫_t^s h1(r, ξ(r), ζ(r)) dr + ∫_{[t, s]} h2(r, ζ(r)) dμ(r) + V-(s, ξ(s-), μ(s-)) ],   (3.4)

where ζ = β[μ].

Remark. If we add a terminal cost to the payoff, say g(ξ(T)), then the terminal condition becomes

V+(T, x, m) = min_{z ∈ Z} max_{m ≤ a ≤ 1} { g(x + f2(T, z)(a - m)) + h2(T, z)(a - m) }   (3.5)

for V+, and

V-(T, x, m) = max_{m ≤ a ≤ 1} min_{z ∈ Z} { g(x + f2(T, z)(a - m)) + h2(T, z)(a - m) }   (3.6)

for V-. Of course these terminal conditions will not be the same in general. One should not, therefore, expect the game with measures to always have value.
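The gap between (3.5) and (3.6) is the elementary fact that a min–max always dominates the corresponding max–min, and the inequality can be strict. A toy finite illustration (the payoff matrix below is invented purely to exhibit the gap; it is not derived from the paper's data):

```python
# min_z max_a M[z][a] vs. max_a min_z M[z][a] for an invented payoff matrix.
# Rows are the minimizer's choices z, columns the maximizer's choices a.
M = [[1.0, 0.0],
     [0.0, 1.0]]

# Minimizer commits first: the maximizer best-responds on each row.
min_max = min(max(row) for row in M)
# Maximizer commits first: the minimizer best-responds on each column.
max_min = max(min(M[z][a] for z in range(2)) for a in range(2))

assert min_max == 1.0      # whatever row is chosen, a response yields 1
assert max_min == 0.0      # whatever column is chosen, a response yields 0
assert min_max >= max_min  # order of play matters: strict gap here
```

This is exactly the mechanism behind the remark that the game with measures need not have value.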

Proof. We will only prove some of the results stated, and only for the upper value; the proofs for the lower value are similar.

We prove first that V+ is continuous in t in one direction. Fix (x, m) ∈ R^1 × (0, 1). Let 0 < t1 < t2 < T and let ε > 0 be given. Then there is a strategy α1 ∈ Γ(t1) such that

V+(t1, x, m) ≤ P_{t1,x,m}(ζ1, α1[ζ1]) + ε,   ∀ ζ1 ∈ Z[t1, T].

Define the maps s: [t1, T] → [t2, T] and τ: [t2, T] → [t1, T] as in Section 2, and define the map Φ: C[t2, T] → C[t1, T] by (Φf)(r) = f(s(r)). Given ζ2 ∈ Z[t2, T], set ζ1 = Φζ2, let μ1 = α1[ζ1], and let μ2 be the image measure of μ1 under s. As in Section 2, μ2 is a Radon measure with μ2 ∈ M_m[t2, T]. Furthermore, for any φ ∈ C[t2, T],

⟨φ, μ2⟩ = ∫_{[t2, T]} φ(r) μ2(dr) = ∫_{[t1, T]} (Φφ)(r) μ1(dr) = ⟨Φφ, μ1⟩.

In fact, by dominated convergence, this is valid for any bounded Borel measurable φ. Then, invoking assumption (A), (2.12) of lemma 2.3 holds, and we conclude after some manipulation that

V+(t1, x, m) ≤ P_{t2,x,m}(ζ2, μ2) + ε + C(t2 - t1),   ∀ ζ2 ∈ Z[t2, T].

This implies that

V+(t1, x, m) ≤ V+(t2, x, m) + ε + C(t2 - t1).

The remaining estimates for continuity are similar to those of theorem 2.1 and are left to the reader.

Now we turn to the proof of (3.1). Let

F(t, x, m) = min_{z ∈ Z} max_{1-m ≥ δ ≥ 0} { h2(t, z)δ + V+(t, x + δ f2(t, z), m + δ) }.

By setting δ = 0 we see that F(t, x, m) ≥ V+(t, x, m). Next, given any ζ ∈ Z[t, T], set z = ζ(t). We can find 1 - m ≥ δ' = δ'(z) ≥ 0 so that

F(t, x, m) ≤ max_{1-m ≥ δ ≥ 0} { h2(t, z)δ + V+(t, x + δ f2(t, z), m + δ) } = h2(t, z)δ' + V+(t, x + δ' f2(t, z), m + δ').

If δ' = 0 we are done, so we assume that δ' > 0. Now, by definition of V+, there exists a strategy α' ∈ M_{m+δ'}[t, T] such that

V+(t, x + δ' f2(t, z), m + δ') ≤ P_{t, x+δ' f2(t,z), m+δ'}(ζ, α'[ζ]) + ε.

Define the strategy α'' ∈ M_m[t, T] by α''[ζ](t-) = m and α''[ζ](τ) = α'[ζ](τ) if t ≤ τ ≤ T. Then it is not hard to verify that

h2(t, z)δ' + P_{t, x+δ' f2(t,z), m+δ'}(ζ, α'[ζ]) = P_{t,x,m}(ζ, α''[ζ]),

so that, combining the preceding, we get

F(t, x, m) ≤ h2(t, z)δ' + V+(t, x + δ' f2(t, z), m + δ') ≤ h2(t, z)δ' + P_{t, x+δ' f2(t,z), m+δ'}(ζ, α'[ζ]) + ε = P_{t,x,m}(ζ, α''[ζ]) + ε.

This evidently implies that F(t, x, m) ≤ V+(t, x, m), completing the proof. The remaining assertions of the theorem are left to the reader. ∎

We will now focus on the upper value, V+, since we are taking the point of view that we are studying the differential game as a worst case analysis of a system subject to disturbances. Later we will state the results for the lower value, V-. Define the upper Hamiltonian H+: R^1 × [0, T] × R^3 → R^1 as

H+(a, t, x, p_m, p_x) = min_{z ∈ Z(a, t, p_m, p_x)} ( p_x f1(t, x, z) + h1(t, x, z) ),   (3.7)

where

Z(a, t, p_m, p_x) = { z ∈ Z : p_m + p_x f2(t, z) + h2(t, z) ≤ a }.   (3.8)

If Z(a, t, p_m, p_x) = ∅ then we set H+ = +∞. In general, one cannot expect such Hamiltonians to be continuous functions; in fact, this Hamiltonian is not continuous. In view of the definition of viscosity solution with discontinuous Hamiltonians, we have to calculate the upper and lower semicontinuous envelopes of H+. We do so in the next lemma. The statement of the lemma is similar to that of [3, proposition 2.5], but the proof here is simpler.

LEMMA 3.3. The upper semicontinuous envelope (H+)* of H+ is given by

(H+)*(a, t, x, p_m, p_x) = H+(a - 0, t, x, p_m, p_x).

The lower semicontinuous envelope (H+)_* is given by

(H+)_*(a, t, x, p_m, p_x) = H+(a + 0, t, x, p_m, p_x).

Proof. We will only prove the result for the upper semicontinuous envelope. By definition,

(H+)*(a, t, x, p_m, p_x) = lim sup { H+(b, s, y, q_m, q_x) : (b, s, y, q_m, q_x) → (a, t, x, p_m, p_x) }.

Given ε > 0, fix (s, y) ∈ B_ε(t, x) such that

|f1(t, x, z) - f1(s, y, z)| ≤ Kε,   |h1(t, x, z) - h1(s, y, z)| ≤ Kε,

and similarly for f2 and h2. Also fix (b, q_m, q_x) ∈ B_ε(a, p_m, p_x). Now, by a standard result in finite-dimensional penalization theory, we have that

H+(b, s, y, q_m, q_x) = sup_{λ ≥ 0} min_{z ∈ Z} ( q_x f1(s, y, z) + h1(s, y, z) + λ( q_m + q_x f2(s, z) + h2(s, z) - b )+ )
≤ sup_{λ ≥ 0} min_{z ∈ Z} ( p_x f1(t, x, z) + h1(t, x, z) + λ( p_m + p_x f2(t, z) + h2(t, z) - a + Kε )+ ) + |p_x|ε + Kε
= min_{z ∈ Z(a - Kε, t, p_m, p_x)} ( p_x f1(t, x, z) + h1(t, x, z) ) + |p_x|ε + Kε
= H+(a - Kε, t, x, p_m, p_x) + |p_x|ε + Kε.

Consequently, since ε was arbitrary,

(H+)*(a, t, x, p_m, p_x) ≤ H+(a - 0, t, x, p_m, p_x).

Since the reverse inequality follows from the definition of the upper envelope, the proof is complete. ∎

The next lemma is the useful analogue of [3, proposition 4.1] and lemma 2.9 above.
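The qualitative behaviour of H+ — a constrained minimum that is nonincreasing in a, jumps downward at levels where the constraint set Z(a, ·) gains elements, and is +∞ when that set is empty — can be seen concretely on a finite control set. The numbers below are invented solely for illustration:

```python
import math

# Hypothetical data for a two-point control set Z = {0, 1}:
# value[z] plays the role of p_x*f1 + h1, cost[z] of p_m + p_x*f2 + h2.
value = {0: 5.0, 1: 1.0}
cost = {0: -1.0, 1: 2.0}

def H_plus(a):
    """H+(a) = min{ value[z] : cost[z] <= a }, and +inf if no z is feasible."""
    feasible = [value[z] for z in value if cost[z] <= a]
    return min(feasible) if feasible else math.inf

assert H_plus(-2.0) == math.inf          # constraint set empty
assert H_plus(0.0) == 5.0                # only z = 0 feasible
assert H_plus(3.0) == 1.0                # both feasible, minimum drops
# Nonincreasing in a, with a downward jump exactly at a = cost[1] = 2,
# so (H+)*(2) = H+(2-0) = 5 while (H+)_*(2) = H+(2) = H+(2+0) = 1:
assert H_plus(2.0 - 1e-9) == 5.0 and H_plus(2.0) == 1.0
```

The left/right limits at the jump are precisely the upper and lower semicontinuous envelopes computed in Lemma 3.3.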

LEMMA 3.4. A continuous function u ∈ C(Q̄) is a viscosity solution of

max{ u_t + H+(0, t, x, u_m, u_x), u_m + H2(t, u_x) } = 0   (3.9)

if and only if u is a viscosity solution of

u_t + H+(0, t, x, u_m, u_x) = 0.   (3.10)

The advantage of the formulation (3.9) is that the minimum in H+ is always taken over a set which is nonempty. With these preliminaries completed we can now state the main result of this section.

THEOREM 3.5. V+ is a viscosity solution of (3.9) (or (3.10)) on (0, T) × R^1 × (0, 1).

Proof. We know that V+ is continuous on Q̄ = [0, T] × R^1 × [0, 1], so we need to verify the viscosity requirements.

Let V+ - φ achieve a strict maximum of zero at the point (t0, x0, m0). Without loss of generality we may assume that (t0, x0, m0) ∈ (0, T) × R^1 × (0, 1). We must show that

φ_t + H+(0 - 0, t0, x0, φ_m, φ_x) ≥ 0   at (t0, x0, m0).

Suppose this is not true. Then there exists a β > 0 for which

φ_t(t0, x0, m0) + H+(-4β, t0, x0, φ_m, φ_x) ≤ -4β   at (t0, x0, m0).   (3.11)

By definition of the Hamiltonian, this implies that there exists z* ∈ Z(-4β, t0, φ_m, φ_x) such that

φ_t(t0, x0, m0) + φ_x(t0, x0, m0) f1(t0, x0, z*) + h1(t0, x0, z*) ≤ -4β.

Consequently,

φ_t(s, y, ν) + φ_x(s, y, ν) f1(s, y, z*) + h1(s, y, z*) ≤ -3β   (3.12)

and

φ_m(s, y, ν) + φ_x(s, y, ν) f2(s, z*) + h2(s, z*) ≤ -3β   (3.13)

for every (s, y, ν) ∈ B_δ(t0, x0, m0), for some δ > 0. Set ζ*(r) ≡ z*. Now, the fact that (3.13) holds at (t0, x0, m0) implies that there exists a δ' > 0 such that

V+(t0, x0 + δ f2(t0, z*), m0 + δ) + δ h2(t0, z*) ≤ V+(t0, x0, m0) - 3βδ,   ∀ 0 < δ < δ'.   (3.14)

The proof of (3.14) is similar to that of (2.24). Next, (3.14) implies that there exists an ε > 0 such that for any t0 < s < t0 + ε we have

V+(t0, x0, m0) ≤ sup_α [ ∫_{t0}^s h1(r, ξ(r), ζ*(r)) dr + ∫_{[t0, s]} h2(r, ζ*(r)) dμ(r) + V+(s, ξ(s), μ(s)) ],   (3.15)

where the supremum is taken over strategies α which satisfy the property that the outcome μ of (ζ*, α[ζ*]) is in M_{m0} ∩ C[t0, s]. Again, the proof of this is similar to that of lemma 2.6 and uses the fact that

δ ↦ V+(t0, x0 + δ f2(t0, z*), m0 + δ) + δ h2(t0, z*)

is nonincreasing on [0, 1 - m0].

Fix t0 < ρ < t0 + ε so that (r, ξ(r), μ(r)) ∈ B_δ(t0, x0, m0), t0 ≤ r ≤ ρ. Here μ ∈ M_{m0} ∩ C[t0, t0 + ε] is arbitrary, and ξ is the (continuous) trajectory corresponding to (μ, ζ*). Then, using (3.12) and (3.13) and the change of variables formula for Stieltjes integrals — noting that φ is smooth — we get that

∫_{t0}^ρ ( φ_t + φ_x f1(r, ξ(r), ζ*(r)) ) dr + ∫_{[t0, ρ]} ( φ_m(r, ξ(r), μ(r)) + φ_x f2(r, ζ*(r)) ) dμ(r)
≤ - ∫_{t0}^ρ h1(r, ξ(r), ζ*(r)) dr - ∫_{[t0, ρ]} h2(r, ζ*(r)) dμ(r) - 3β(ρ - t0) - 3β[μ(ρ) - m0].

Consequently, since V+ - φ has a zero maximum at (t0, x0, m0), we get that

V+(ρ, ξ(ρ), μ(ρ)) + ∫_{t0}^ρ h1(r, ξ(r), ζ*(r)) dr + ∫_{[t0, ρ]} h2(r, ζ*(r)) dμ(r) ≤ V+(t0, x0, m0) - 3β(ρ - t0) - 3β[μ(ρ) - m0].   (3.16)

This is true for every μ ∈ M_{m0} ∩ C[t0, ρ]. Thus, using (3.15), we have arrived at a contradiction. Therefore, V+ is a subsolution.

Next we prove that V+ is a supersolution of (3.9). Let V+ - φ achieve a strict minimum of zero at the point (t0, x0, m0). Again, without loss of generality we may assume that (t0, x0, m0) ∈ (0, T) × R^1 × (0, 1). We must show that

φ_t(t0, x0, m0) + H+(0 + 0, t0, x0, φ_m, φ_x) ≤ 0   at (t0, x0, m0),

or, equivalently,

max{ φ_t + H+(0 + 0, t0, x0, φ_m, φ_x), φ_m + H2(t0, φ_x) } ≤ 0   at (t0, x0, m0).

Suppose to the contrary that there is a β > 0 such that

φ_t(t0, x0, m0) + H+(4β, t0, x0, φ_m, φ_x) ≥ 4β   at (t0, x0, m0).

Let ζ ∈ Z[t0, T] be arbitrary. We claim that

φ_m(t0, x0, m0) + φ_x(t0, x0, m0) f2(t0, ζ(t0)) + h2(t0, ζ(t0)) ≤ 4β.   (3.17)

Suppose instead that

φ_m(t0, x0, m0) + φ_x(t0, x0, m0) f2(t0, ζ(t0)) + h2(t0, ζ(t0)) > 4β.   (3.18)

In this case we let δ > 0 be such that m0 + δ ≤ 1 and

φ_m(t0, ξ̃(m), m) + φ_x(t0, ξ̃(m), m) f2(t0, ζ(t0)) + h2(t0, ζ(t0)) > 3β   if m0 ≤ m ≤ m0 + δ.

Since f2 and h2 are bounded and φ is smooth, we can choose δ to be independent of ζ. Define on [m0, m0 + δ] the trajectory dξ̃(m)/dm = f2(t0, ζ(t0)) with initial condition ξ̃(m0) = x0. We obtain

(d/dm) φ(t0, ξ̃(m), m) + h2(t0, ζ(t0)) > 3β,   m0 ≤ m ≤ m0 + δ.

Integrating this from m0 to m0 + δ,

φ(t0, x0 + δ f2(t0, ζ(t0)), m0 + δ) - φ(t0, x0, m0) + δ h2(t0, ζ(t0)) > 3βδ.

Since V+ - φ has a minimum at (t0, x0, m0), we conclude that

V+(t0, x0 + δ f2(t0, ζ(t0)), m0 + δ) + δ h2(t0, ζ(t0)) > V+(t0, x0, m0) + 3βδ.

Since ζ was arbitrary, this is a contradiction of (3.1). Thus (3.17) must hold.

Define the strategy α[ζ](r) ≡ μ(r) ≡ m0 on [t0, T]. Assume that we are given a control ζ ∈ Z ∩ C[t0, T]. Then, from (3.17), there exists t0 < s ≤ T such that

φ_m(τ, ξ(τ), μ(τ)) + φ_x(τ, ξ(τ), μ(τ)) f2(τ, ζ(τ)) + h2(τ, ζ(τ)) ≤ 3β,   t0 ≤ τ ≤ s,

where ξ denotes the trajectory associated with (ζ, μ). We claim that there exists such an s independent of ζ. Indeed, if not, then there would be a sequence s_j ↓ t0 along which (3.18) would be true; but we have already seen that (3.18) leads to a contradiction. Consequently, we see that ζ(τ) ∈ Z(3β, τ, φ_m, φ_x) for all t0 ≤ τ ≤ s. Using the definition of H+ we have

φ_t(τ, ξ(τ), μ(τ)) + φ_x(τ, ξ(τ), μ(τ)) f1(τ, ξ(τ), ζ(τ)) + h1(τ, ξ(τ), ζ(τ)) ≥ 3β,   t0 ≤ τ ≤ s.

This readily implies (we omit the frequently used details) that

V+(s, ξ(s), μ(s)) + ∫_{t0}^s h1(r, ξ(r), ζ(r)) dr + ∫_{[t0, s]} h2(r, ζ(r)) dμ(r) ≥ V+(t0, x0, m0) + 3β(s - t0).

Now, ζ was assumed continuous. But this inequality will hold for any ζ ∈ Z[t0, s], since s was independent of ζ and α is identically m0 and so is also independent of ζ. Consequently, we have found a strategy α ∈ M_{m0}[t0, T] such that for all ζ ∈ Z[t0, T] the previous inequality holds. But this contradicts (3.2). Thus, V+ is also a supersolution. This completes the proof. ∎

Next we prove that there is exactly one continuous viscosity solution of (3.10) satisfying the terminal condition (TC) and boundary condition (BC). We state the uniqueness result in the form of a comparison principle.

THEOREM 3.6. Let u be a continuous viscosity subsolution and v a continuous viscosity supersolution of (3.10), both satisfying the conditions (TC), (BC). Then u ≤ v on Q̄.

Proof. Assume that (u - v)(t0, x0, m0) = max(u - v) > 0. Let β > γ > 0 be small enough that

u(t0, x0, m0) - γ > v(t0, x0, m0) + β.

Define

ũ(t, x, m) = u(t, x, m) - γ/m - γ/t   and   ṽ(t, x, m) = v(t, x, m) + β/m + β/t.

Then it is straightforward to check that ũ is a viscosity subsolution of

ũ_t + (H+)*( γ/m², t, x, ũ_m, ũ_x ) - γ/t² = 0,

and ṽ is a viscosity supersolution of

ṽ_t + (H+)_*( -β/m², t, x, ṽ_m, ṽ_x ) + β/t² = 0.

Set w(t, x, m, y, n) = ũ(t, x, m) - ṽ(t, y, n). Then w is a subsolution of

w_t + (H+)*( γ/m², t, x, m, w_m, w_x ) - (H+)_*( -β/n², t, y, n, -w_n, -w_y ) - (β + γ)/t² = 0.   (3.19)

Let ε > 0 and consider the function

f_ε(t, x, m, y, n) = w(t, x, m, y, n) - (1/ε)|x - y|² - (1/ε)|m - n|².

Assume that this function achieves its maximum at a point (t_ε, x_ε, m_ε, y_ε, n_ε). Then it will follow from the continuity of u and v — more generally, from the upper (lower) semicontinuity of u (v) — that

(1/ε)|x_ε - y_ε|² → 0,   (1/ε)|m_ε - n_ε|² → 0   as ε → 0,   (3.20)

and

w(t_ε, x_ε, m_ε, y_ε, n_ε) → max w   as ε → 0.

Now, we may assume that max(ũ - ṽ) > 0. It is clear that we will have 0 < t_ε < T and 0 < m_ε, n_ε < 1 for all small ε. Writing φ for the penalization term in f_ε and applying the viscosity inequalities for ũ and ṽ at the maximum point, one obtains

H+( γ/m_ε² - 0, t_ε, x_ε, m_ε, φ_m, φ_x ) - H+( -β/n_ε² + 0, t_ε, y_ε, n_ε, -φ_n, -φ_y ) ≥ (β + γ)/t_ε².   (3.21)

Notice that φ_n = -φ_m and φ_x = -φ_y at (t_ε, x_ε, m_ε, y_ε, n_ε). Furthermore, since H+(a, t, x, p_m, p_x) is nonincreasing in its first argument, for every (q_m, q_x) ∈ R² we have that

H+( γ/m_ε² - 0, t_ε, x_ε, m_ε, q_m, q_x ) ≤ H+( 0, t_ε, x_ε, m_ε, q_m, q_x )   (3.22)

and

H+( -β/n_ε² + 0, t_ε, y_ε, n_ε, q_m, q_x ) ≥ H+( 0, t_ε, y_ε, n_ε, q_m, q_x ).   (3.23)

Combining (3.21)-(3.23) we see that

0 ≤ H+(0, t_ε, x_ε, m_ε, φ_m, φ_x) - H+(0, t_ε, y_ε, n_ε, φ_m, φ_x) - (β + γ)/t_ε² ≤ K|x_ε - y_ε| - (β + γ)/t_ε².   (3.24)

Since β > γ > 0, we can choose ε sufficiently small so that, using (3.20), the last part of (3.24) is nonpositive. This is a contradiction, so we conclude that u ≤ v.

The only gap we need to close is the fact that the maximum in the proof may not be achieved, because x is not known to lie in a bounded set. We can fix this in the following way. Let R > 0 and let κ ∈ C¹(R^1) be a function with 0 ≤ κ'(r) ≤ 1, κ(r) = 0 if r ≤ R, and κ(r) → ∞ as r → ∞. Then we modify the definitions of ũ and ṽ as follows:

ũ(t, x, m) = u(t, x, m) - γκ(|x|) - Kγ(T - t) - γ/m - γ/t,
ṽ(t, x, m) = v(t, x, m) + βκ(|x|) + Kβ(T - t) + β/m + β/t.

The proof continues as before with minor modifications. This completes the proof of the theorem. ∎
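The way the perturbation terms enter the perturbed equations for ũ (and, symmetrically, for ṽ) is a routine computation; we sketch it here as a filled-in step, using only that H+ depends on p_m through the constraint p_m + p_x f2 + h2 ≤ a:

```latex
% Sketch of the perturbation bookkeeping for \tilde u (our own filling-in
% of the routine step, under the sign conventions adopted above):
\tilde u = u - \frac{\gamma}{m} - \frac{\gamma}{t}
\;\Longrightarrow\;
u_m = \tilde u_m - \frac{\gamma}{m^{2}}, \qquad
u_t = \tilde u_t - \frac{\gamma}{t^{2}}.
% Shifting p_m by -\gamma/m^{2} merely raises the constraint level:
% p_m - \gamma/m^{2} + p_x f_2 + h_2 \le 0
%   \iff p_m + p_x f_2 + h_2 \le \gamma/m^{2},
% while the minimized objective p_x f_1 + h_1 is unchanged, so
H^{+}\bigl(0, t, x, u_m, u_x\bigr)
 = H^{+}\Bigl(\frac{\gamma}{m^{2}}, t, x, \tilde u_m, \tilde u_x\Bigr).
```

This is the origin of the shifted first arguments γ/m² and -β/n² appearing in (3.19) and (3.21)-(3.23).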

Remark. It is not hard to show that V+(t, x, m) = lim_{B→∞} W^B(t, x, m), where W^B(t, x, m) is the viscosity solution of

W^B_t(t, x, m) + min_{z ∈ Z} { W^B_x(t, x, m) f1(t, x, z) + h1(t, x, z) + B[ W^B_m(t, x, m) + W^B_x(t, x, m) f2(t, z) + h2(t, z) ]+ } = 0.

In fact, the proof follows from [25] by establishing that the corresponding Hamiltonian for W^B(t, x, m) converges appropriately to the Hamiltonian for V+. Also, W^B(t, x, m) satisfies the same terminal and boundary conditions as does V+(t, x, m). Notice that W^B(t, x, m) is the upper value of the differential game in which the functions μ are absolutely continuous and 0 ≤ dμ/dr ≤ B.

Now we will state the results for the lower value. To do so we need the definition of the lower Hamiltonian.
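The relation between W^B and V+ is a penalty approximation: the hard constraint defining Z(a, ·) in H+ is replaced by the term B[·]+, and as B → ∞ the penalized minimum increases to the constrained minimum (when the constraint is feasible). A finite sketch with invented two-point data:

```python
# Penalty approximation of a constrained minimum, as in the W^B remark.
# value[z] ~ p_x*f1 + h1, cost[z] ~ p_m + p_x*f2 + h2; the numbers are invented.
value = [5.0, 1.0]
cost = [-1.0, 2.0]

def H_constrained():
    """The hard-constrained Hamiltonian: min over z with cost[z] <= 0."""
    return min(v for v, c in zip(value, cost) if c <= 0)

def H_penalized(B):
    """The W^B-style Hamiltonian: constraint priced at rate B via [.]+."""
    return min(v + B * max(c, 0.0) for v, c in zip(value, cost))

# For small B the infeasible z = 1 is still cheaper; for large B it is
# priced out and the penalized value coincides with the constrained one.
assert H_penalized(1.0) == 3.0           # 1 + 1*2 = 3 beats 5
assert H_penalized(10.0) == 5.0          # 1 + 10*2 = 21 loses to 5
assert H_penalized(10.0) == H_constrained()
```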


Let

Γ(t, x) = co{ (f1(t, x, z), h1(t, x, z), f2(t, z), h2(t, z)) : z ∈ Z }

and

Ẑ(a, t, x, p_m, p_x) = { (ξ1, η1, ξ2, η2) ∈ Γ(t, x) : p_m + p_x ξ2 + η2 ≤ a }.

The notation co(A) denotes the closed convex hull of the set A. The lower Hamiltonian H- is defined by

H-(a, t, x, p_m, p_x) = min{ p_x ξ1 + η1 : (ξ1, η1, ξ2, η2) ∈ Ẑ(a, t, x, p_m, p_x) },   (3.25)

and H-(a, t, x, p_m, p_x) = +∞ if Ẑ(a, t, x, p_m, p_x) = ∅.

THEOREM 3.7. The lower value function V-(t, x, m) is the unique continuous viscosity solution of

V-_t + H-(0, t, x, V-_m, V-_x) = 0,   (t, x, m) ∈ Q,   (3.26)

and V- satisfies the terminal condition (TC) and boundary condition (BC).

We will leave the proof of this theorem to the reader. We note, however, the following lemma, which explains the origin of the lower Hamiltonian.

LEMMA 3.8. Fix (t, x, b, p_m, p_x) ∈ (0, T) × R^1 × R³. Then

max_{λ ≥ 0} min_{z ∈ Z} { p_x f1(t, x, z) + h1(t, x, z) + λ( p_m + p_x f2(t, z) + h2(t, z) - b ) } = H-(b, t, x, p_m, p_x).   (3.27)

Remark. The left side of (3.27) arises from considering the lower differential game with unbounded maximizing control α, as in (2.27), (2.28).

Proof of lemma 3.8. If Ẑ(b, t, x, p_m, p_x) = ∅ it is clear that (3.27) holds trivially, so we assume this set is not empty. Since min_A p = min_{co A} p for a linear function p, we know that

min_{z ∈ Z} { p_x f1(t, x, z) + h1(t, x, z) + λ( p_m + p_x f2(t, z) + h2(t, z) - b ) }
= min{ p_x ξ1 + η1 + λ( p_m + p_x ξ2 + η2 - b ) : (ξ1, η1, ξ2, η2) ∈ Γ(t, x) }.   (3.28)

The function

(λ, (ξ1, η1, ξ2, η2)) ↦ p_x ξ1 + η1 + λ( p_m + p_x ξ2 + η2 - b )

is certainly concave-convex. Therefore, we may apply the minimax theorem (for example, [16, theorem 49.A]) to see that

max_{λ ≥ 0} min_{z ∈ Z} { p_x f1(t, x, z) + h1(t, x, z) + λ( p_m + p_x f2(t, z) + h2(t, z) - b ) }
= min{ p_x ξ1 + η1 + max_{λ ≥ 0} λ( p_m + p_x ξ2 + η2 - b ) : (ξ1, η1, ξ2, η2) ∈ Γ(t, x) }
= min{ p_x ξ1 + η1 : (ξ1, η1, ξ2, η2) ∈ Ẑ(b, t, x, p_m, p_x) }
= H-(b, t, x, p_m, p_x).   (3.29)   ∎

The next result follows from the uniqueness property.
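Lemma 3.8 is the reason the convex hull appears in H-: the max-min on the left of (3.27) effectively lets the minimizer randomize over Z. A small invented example with two controls, where the λ-dual value agrees with the hull-constrained minimum but not with the minimum over feasible pure controls:

```python
# max_{lam >= 0} min_z [ value[z] + lam*cost[z] ] vs. constrained minima.
# value[z] ~ p_x*f1 + h1, cost[z] ~ p_m + p_x*f2 + h2 - b; data invented.
value = [0.0, 10.0]
cost = [1.0, -1.0]          # z = 0 violates the constraint, z = 1 satisfies it

def inner_min(lam):
    return min(v + lam * c for v, c in zip(value, cost))

# Scan lam on a grid containing the maximizing value lam = 5.
dual = max(inner_min(0.5 * k) for k in range(41))   # lam in {0, 0.5, ..., 20}

# Minimum over feasible *pure* controls: only z = 1 is feasible.
pure = min(v for v, c in zip(value, cost) if c <= 0)
# Minimum over the convex hull: mixtures theta*z0 + (1-theta)*z1 have
# mixed cost 2*theta - 1 <= 0, i.e. theta <= 1/2, and mixed value
# 10*(1-theta), minimized at theta = 1/2.
hull = min(10.0 * (1 - t / 100) for t in range(0, 51))

assert dual == 5.0 and hull == 5.0   # dual value = hull-constrained minimum
assert pure == 10.0                  # strictly worse without randomization
```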

COROLLARY 3.9. If

H+(0, t, x, p_m, p_x) = H-(0, t, x, p_m, p_x)   (3.30)

then the differential game associated with (2.1)-(2.4) has value; i.e. V+ = V-. If the payoff has a terminal cost g(ξ(T)) as well, with g Lipschitz continuous and bounded, and if, in addition to (3.30),

max_{m ≤ a ≤ 1} min_{z ∈ Z} { g(x + f2(T, z)(a - m)) + h2(T, z)(a - m) } = min_{z ∈ Z} max_{m ≤ a ≤ 1} { g(x + f2(T, z)(a - m)) + h2(T, z)(a - m) },

then this differential game has value.

Remark. When the differential game has a terminal cost and the maximum player can jump, it makes a difference which player has the last move.

We conclude this paper with the following special case of the results of this section. We begin by noting that all of the results can be extended to the case h2 = h2(t, x, z) if f2 ≡ 0. Take f2 = h1 ≡ 0, and assume that h2 ≥ 0. Thus, the trajectory is continuous and we only have a cost against the measures. Then problem (3.10) for the upper value, V+(t, x, m), becomes

V+_t + min{ V+_x f1(t, x, z) : z ∈ Z such that V+_m + h2(t, x, z) ≤ 0 } = 0,   (t, x, m) ∈ Q,

with

V+(T, x, m) = (1 - m) min_{z ∈ Z} h2(T, x, z),   V+(t, x, 1) = γ(t, x).

It is straightforward to verify that V+(t, x, m) = (1 - m) W(t, x) if (t, x, m) ∈ (0, T] × R^1 × (0, 1), where W is the unique viscosity solution of

W_t + min{ W_x f1(t, x, z) : z ∈ Z such that h2(t, x, z) ≤ W(t, x) } = 0,   (t, x) ∈ (0, T) × R^1,

with W(T, x) = min_{z ∈ Z} h2(T, x, z). But it was established in [3] that W is the value function of the minimax problem of minimizing the maximum cost max_{t ≤ r ≤ T} h2(r, ξ(r), ζ(r)) over the controls ζ.
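The duality fact invoked next — that the L∞ norm of f is the norm of the functional g ↦ ∫ f g on L¹ — has an elementary finite-dimensional analogue: the maximum of a vector is the supremum of its averages against probability weights. A sketch with invented data:

```python
# max f = sup { <f, p> : p >= 0, sum(p) = 1 }: the discrete analogue of the
# L^infinity / L^1 duality recalled in the text. The vector f is invented.
f = [3.0, -1.0, 7.0, 2.0]

def average(p):
    """Pairing <f, p> against a probability weight vector p."""
    assert all(x >= 0 for x in p) and abs(sum(p) - 1.0) < 1e-12
    return sum(fi * pi for fi, pi in zip(f, p))

uniform = average([0.25] * 4)                  # a generic admissible weight
point_mass = average([0.0, 0.0, 1.0, 0.0])     # mass on the maximizing index

assert uniform == 2.75 and uniform <= max(f)   # averages never exceed max f
assert point_mass == max(f) == 7.0             # and the supremum is attained
```

In the same way, a running maximum cost is recovered from integrals of the cost against suitably concentrated measures, which is the bridge between the minimax problem of [3] and the problem with measures.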

To understand the connection between this minimax problem and the problem with measures, simply recall the basic fact that the L∞ norm of a function f(x) is the norm of the functional on L¹, T(g) = ∫ f(x) g(x) dx. The results of this paper, therefore, generalize the main result of [3]. Furthermore, we have shown that V+_m = -W. This is connected to a result of Karatzas [17].

REFERENCES

1. CRANDALL M. G. & LIONS P.-L., Viscosity solutions of Hamilton-Jacobi equations, Trans. Am. math. Soc. 277, 1-42 (1983).
2. BARRON E. N., Differential games with maximum cost, Nonlinear Analysis 14, 971-989 (1990).
3. BARRON E. N. & ISHII H., The Bellman equation for minimizing the maximum cost, Nonlinear Analysis 13, 1067-1090 (1989).
4. MURRAY J. M., Existence theorems for optimal control and calculus of variations problems where the states can jump, SIAM J. Control Optim. 24, 412-438 (1986).
5. VINTER R. & PEREIRA F., A maximum principle for optimal processes with discontinuous trajectories, SIAM J. Control Optim. 26, 205-229 (1988).
6. BARRON E. N. & JENSEN R., The Pontryagin maximum principle from dynamic programming and viscosity solutions to first-order partial differential equations, Trans. Am. math. Soc. 298, 635-641 (1986).
7. BARLES G., An approach of deterministic unbounded control problems and of first order Hamilton-Jacobi equations with gradient constraints, Ph.D. Thesis, Université de Paris IX-Dauphine, pp. 2-31 (1989).
8. BARRON E. N., Viscosity solutions for the monotone control problem, SIAM J. Control Optim. 23, 161-171 (1985).
9. CHOW P. L., MENALDI J. L. & ROBIN M., Additive control of stochastic linear systems with finite horizon, SIAM J. Control Optim. 23, 858-899 (1985).
10. CAPUZZO DOLCETTA I. & EVANS L. C., Optimal switching for ordinary differential equations, SIAM J. Control Optim. 22, 143-161 (1984).
11. MENALDI J. L. & ROBIN M., On some cheap control problems for diffusion processes, Trans. Am. math. Soc. 278, 771-802 (1983).
12. MENALDI J. L. & ROBIN M., On singular control problems for diffusions with jumps, IEEE Trans. Autom. Control AC-29, 991-1004 (1984).
13. RISHEL R., An extended Pontryagin principle for control systems whose control laws contain measures, SIAM J. Control 3, 191-205 (1965).
14. SCHMAEDEKE W. W., Optimal control theory for nonlinear vector differential equations containing measures, SIAM J. Control 3, 231-280 (1965).
15. SUN M., Monotone control of a damped oscillator under random perturbations, IMA J. math. Control Inf. 5, 169-186 (1988).
16. ZEIDLER E., Nonlinear Functional Analysis and its Applications III: Variational Methods and Optimization. Springer, New York (1985).
17. KARATZAS I., Probabilistic aspects of finite-fuel stochastic control, Proc. Natl. Acad. Sci. USA 82, 5579-5581 (1985).
18. BARLES G., Deterministic impulse control problems, SIAM J. Control Optim. 23, 419-432 (1985).
19. ISHII H., Perron's method for Hamilton-Jacobi equations, Duke math. J. 55, 369-384 (1987).
20. ISHII H., Hamilton-Jacobi equations with discontinuous Hamiltonians on arbitrary open sets, Bull. Fac. Sci. Eng. Chuo Univ. 28, 33-77 (1985).
21. CRANDALL M. G., EVANS L. C. & LIONS P.-L., Some properties of viscosity solutions of Hamilton-Jacobi equations, Trans. Am. math. Soc. 282, 487-502 (1984).
22. BARLES G., Uniqueness and regularity results for first order Hamilton-Jacobi equations, Indiana Univ. math. J. 39, 443-466 (1990).
23. BARLES G., Existence results for first order Hamilton-Jacobi equations, Ann. Inst. H. Poincaré Analyse non linéaire 1, 325-340 (1984).
24. CRANDALL M. G., ISHII H. & LIONS P.-L., Uniqueness of viscosity solutions revisited, J. Math. Soc. Japan 39, 581-596 (1987).
25. BARLES G. & PERTHAME B., Discontinuous solutions of deterministic optimal stopping time problems, Math. Modelling numer. Analysis 21, 557-579 (1987).
26. ELLIOTT R. J., Viscosity Solutions and Optimal Control. Pitman, London (1987).