Nonlinear Analysis, Theory, Methods & Applications, Vol. 21, No. 4, pp. 241-268, 1993.
Printed in Great Britain.
0362-546X/93 $6.00+.00  © 1993 Pergamon Press Ltd

OPTIMAL CONTROL AND DIFFERENTIAL GAMES WITH MEASURES

E. N. BARRON,†‡ R. JENSEN†§ and J. L. MENALDI‖¶

†Department of Mathematical Sciences, Loyola University of Chicago, Chicago, IL 60626, U.S.A. and ‖Department of Mathematics, Wayne State University, Detroit, MI 48202, U.S.A.

(Received 10 January 1991; received for publication 26 January 1993)

Key words and phrases: Optimal control, singular control and differential games, bounded variation trajectories, viscosity solutions.
1. INTRODUCTION AND SUMMARY

THE PROBLEMS of this paper are motivated by models of physical controlled systems in which the trajectory is a function of bounded variation. The times of the jumps, if any, and the new spatial positions are under the control of the designer. In the optimal control problem the objective is to minimize a cost involving a running cost and a cumulative cost against the control measure. When the measure involves only jumps this is a standard impulse control problem. But in this paper we are not restricting the measures merely to jumps; we allow general Radon measures.

In many problems of interest we can only manipulate the system with ordinary controls, and we wish to do so to minimize a cost. But when the system is subject to disturbances one seeks to design the system so as to perform well under the worst possible circumstances. In this situation we assume that the disturbances are modelled by a measure term in the dynamics, with a cost incurred in the payoff. The worst case analysis assumes that the measures are chosen so as to maximize this payoff. Therefore, this model is a differential game in which the dynamics is a function of bounded variation, the payoff involves the measures, and we choose an ordinary control to minimize this payoff while the opponent (in some cases considered to be nature) chooses the measures to maximize the payoff. A result of this paper is that the order of play of the maximizer and minimizer makes a difference. That is, the differential game with a maximizing measure and a minimizing ordinary control does not, in general, have a value. The upper value, i.e. the case when the maximizer has knowledge of the minimizer, is then the central object of interest in a worst case analysis. We will derive the results for both the upper and lower value and give a sufficient condition for the game to have a value.
The approach throughout this paper is dynamic programming, leading to the value functions and the associated Bellman and Isaacs equations. In the optimal control case the Bellman equation becomes a standard variational inequality with two first order operators. In the differential game case the Isaacs equation is a highly nonlinear, first order problem involving a minimization over a set which depends on the derivatives of the value function. Precisely, the Isaacs equation for the upper value is

    V_t^+ + min_{z ∈ Z(t, V_m^+, V_x^+)} [V_x^+ · f_1(t, x, z) + h_1(t, x, z)] = 0,

where

    Z(t, V_m^+, V_x^+) = {z ∈ Z : V_m^+ + V_x^+ · f_2(t, z) + h_2(t, z) ≤ 0}.

‡ Supported in part by AFOSR-86-0202, NSF DMS-9102967, and a grant from Loyola University.
§ Supported in part by AFOSR-86-0202, an NSF grant, and a grant from Loyola University.
¶ Supported in part by NSF DMS-9101360.
This equation has a discontinuous, generally nonconvex Hamiltonian. The equation for the lower value is even more complicated. A theory of first order partial differential equations encompassing such equations is the viscosity solution theory initiated by Crandall and Lions [1]. The first example of a Bellman equation with control sets depending on the solution arose in the consideration of an optimal control problem with a minimax cost [2, 3]. That is, the minimax problem consists of finding a control which minimizes the L^∞ norm of a function of time, the state, and the control. Using the well-known fact that the L^∞ norm of a function is the maximum over subprobability measures of the function integrated against the measure, we see that the minimax problem is a special case of the subject of this paper. This example is included at the end of this paper.

Some justification for taking the dynamic programming approach to the problems of this paper may be necessary. Control problems involving measures are extremely difficult to solve via necessary conditions [4, 5]. Such necessary conditions are not even known for the differential game. The Pontryagin conditions involve knowing a priori the support of the optimal measures, which in turn depends on the unknown adjoint variables. Further, one must then verify that one actually has an optimal control. The determination of the value function by solving the Bellman equation is not beyond the scope of numerical methods. Moreover, the Bellman equation leads to the candidate feedback optimal controls in the usual way. Finally, it is well known, and proved in [6], that for standard control problems there is an intimate connection between the adjoint variable in the Pontryagin conditions and the spatial gradient of the value function. In fact, the adjoint variable is the spatial gradient of the optimal cost evaluated along the optimal trajectory. Such a result is not so clear in problems involving measures.
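The fact quoted above, that the L^∞ norm is the supremum over subprobability measures of the integral against the measure, can be checked numerically in a discretized setting. Everything below (the grid, the function, the random sampling) is our own illustration, not part of the paper: a weight vector with nonnegative entries summing to at most 1 plays the role of a subprobability measure, and a unit point mass at the maximizer attains the supremum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretized nonnegative function on a grid; its L-infinity norm is f.max().
f = np.abs(np.sin(np.linspace(0.0, 3.0, 50)))

# Random subprobability weights: w >= 0 with total mass sum(w) <= 1.
best = 0.0
for _ in range(5000):
    w = rng.random(50)
    w /= w.sum() * rng.uniform(1.0, 2.0)   # rescale so that sum(w) <= 1
    best = max(best, float(w @ f))

# A unit point mass at the argmax attains the supremum exactly.
point_mass = float(f.max())
print(point_mass, best)   # best never exceeds point_mass
```

Random sampling approaches the norm from below; only the degenerate (point mass) measure achieves it, which is the mechanism behind the minimax reformulation.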
Finally, we mention that previous work regarding problems with measures in one form or another appears in [2, 4, 5, 7-15]. Necessary conditions are derived in [5, 13]. Problems with measures are more commonly called singular control problems. See [15, 17] for related examples.

2. THE OPTIMAL CONTROL PROBLEM
We consider the following model on the finite horizon [0, T]. The dynamics are

    dξ = f_1(τ, ξ(τ), ζ(τ)) dτ + f_2(τ, ζ(τ)) dμ(τ)   if t < τ ≤ T,
    ξ(t) = x ∈ R^1,  μ(t) = m ∈ [0, 1].                                            (2.1)

The controls are (ζ, μ), chosen from the class (Z × 𝔐_m)[t, T], where

    Z[t, T] = {ζ : [t, T] → Z, ζ is Borel measurable},
    𝔐_m[t, T] = {μ : [t, T] → [0, 1], μ is nondecreasing on [t, T], μ(t⁻) = m,
                 μ is right continuous on [t, T]}.                                  (2.2)
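A trajectory of (2.1) can be sketched with a simple Euler scheme in which the measure control contributes an atom. All model data below (f_1, f_2, the control ζ, and the jump time and size) are illustrative choices of ours, not taken from the paper; because f_2 here does not depend on x, the state update across the atom is unambiguous.

```python
import numpy as np

# Illustrative data (ours): drift f1, jump coefficient f2, ordinary control zeta,
# and a measure control mu consisting of a single atom of size 0.4 at tau = 0.5.
f1 = lambda t, x, z: -x + z
f2 = lambda t, z: 1.0 + 0.5 * z          # independent of x, so the jump is unambiguous
zeta = lambda t: np.cos(t)               # a Borel measurable control

T, N = 1.0, 1000
dt = T / N
t, xi, mu = 0.0, 1.0, 0.0                # xi(t) = x = 1, mu(t) = m = 0
for _ in range(N):
    dmu = 0.4 if t < 0.5 <= t + dt else 0.0      # the atom of d(mu)
    xi += f1(t, xi, zeta(t)) * dt + f2(t, zeta(t)) * dmu
    mu += dmu
    t += dt

print(xi, mu)   # mu ends at 0.4; xi has a jump of size f2 * 0.4 at tau = 0.5
```

The resulting ξ is of bounded variation but not absolutely continuous, which is exactly the class of trajectories the section studies.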
Z is a compact subset of some R^p, p ≥ 1, and m is any point in [0, 1]. Since ζ ∈ Z[t, T] is bounded and Borel measurable, τ ↦ (h_2(τ, ζ(τ)), f_2(τ, ζ(τ))) is dμ integrable for any μ ∈ 𝔐_m[t, T]. We use the convention that μ(t⁻) = m, and so dμ may have a point mass at the initial point t. There is a one-one correspondence between Radon measures dμ and distribution functions μ.

The objective in this section will be to minimize the following cost over the class (Z × 𝔐_m)[t, T]:

    P_{t,x,m}(ζ, μ) = ∫_t^T h_1(r, ξ(r), ζ(r)) dr + ∫_[t,T] h_2(r, ζ(r)) dμ(r).      (2.3)

We define the value function V : [0, T] × R^1 × [0, 1] → R^1 as follows:

    V(t, x, m) = inf_{(ζ,μ) ∈ (Z × 𝔐_m)[t,T]} P_{t,x,m}(ζ, μ).                       (2.4)
We will make the following assumptions regarding the given functions f_i and h_i, i = 1, 2.

Assumption (A). For φ_i = f_i, h_i we assume φ_1 : [0, T] × R^1 × Z → R^1 and φ_2 : [0, T] × Z → R^1 are continuous in all arguments and there is a constant K > 0 such that

    |φ_1(t, x, z)| ≤ K,   |φ_2(t, z)| ≤ K,   ∀ (t, x, z) ∈ [0, T] × R^1 × Z,

and

    |φ_1(t, x, z) − φ_1(t, y, z)| ≤ K|x − y|,   ∀ (t, z) ∈ [0, T] × Z and x, y ∈ R^1;
    |φ_2(t, z) − φ_2(t′, z)| ≤ K|t − t′|,   z ∈ Z.
The assumption (A) is more than sufficient to guarantee that for each pair of controls (ζ, μ) ∈ (Z × 𝔐_m)[t, T] there will be a unique trajectory ξ(·) on the interval [t, T]. This trajectory is not necessarily absolutely continuous but it will be of bounded total variation. In general, a unique trajectory will not exist if we allow dependence of f_2 on x. For given admissible controls (ζ, μ), the associated trajectory starting from x ∈ R^1 is, by definition, the solution of

    ξ(τ) = x + ∫_t^τ f_1(r, ξ(r), ζ(r)) dr + ∫_[t,τ] f_2(r, ζ(r)) dμ(r).

From the fact that ∫_[t,T] dμ ≤ 1, we easily verify that sup_{t ≤ τ ≤ T} |ξ(τ)| ≤ K independently of the controls. Furthermore, ξ is right continuous.
Remark. We will only be considering the one-dimensional case in this paper to simplify the presentation. The extension to the n-dimensional case involves interpreting appropriately the meaning of the expressions f_2 · dμ and h_2 · dμ. This can be done in several ways, c.f. [5].

Our first theorem establishes the continuity of the value function.

THEOREM 2.1. Under the assumption (A), V is a continuous function of (t, x, m) ∈ [0, T] × R^1 × [0, 1]. In fact:
(1) |V(t, x, m) − V(t, y, m)| ≤ K|x − y|;
(2) 0 ≤ V(t, x, m_2) − V(t, x, m_1) ≤ K(m_2 − m_1) if 0 ≤ m_1 ≤ m_2 ≤ 1;
(3) −(K/(T − t_2))(t_2 − t_1) ≤ V(t_1, x, m) − V(t_2, x, m) ≤ K(t_2 − t_1) if 0 ≤ t_1 ≤ t_2 < T.
Proof. The hard part is establishing continuity in t, so we will first prove continuity in m. Continuity in x is easy and we will leave it for the reader. Fix m_1 < m_2 ∈ [0, 1] and fix t ∈ [0, T], x ∈ R^1. For each ε > 0 we find (ζ_ε, μ_ε) ∈ (Z × 𝔐_{m_1})[t, T] such that

    V(t, x, m_1) ≥ P_{t,x,m_1}(ζ_ε, μ_ε) − ε.                                        (2.5)

Define τ_0 as the first time τ after t for which μ_ε(τ) + m_2 − m_1 ≥ 1; if this condition never occurs then set τ_0 = T. Let

    μ_2(τ) = μ_ε(τ) + m_2 − m_1   if τ < τ_0;   μ_2(τ) = 1   if τ ≥ τ_0.

Let ξ_ε and ξ_2 denote the trajectories associated with (ζ_ε, μ_ε) and (ζ_ε, μ_2), respectively. These trajectories will be identical if τ_0 = T, so we assume that τ_0 < T. On the time interval [t, τ_0) the trajectories are identical. Let τ_0 ≤ τ ≤ T. We have from assumption (A), with K denoting a generic constant, that

    |ξ_ε(τ) − ξ_2(τ)| ≤ K ∫_t^τ |ξ_ε(r) − ξ_2(r)| dr
                        + | ∫_[t,τ] f_2(r, ζ_ε(r)) dμ_ε(r) − ∫_[t,τ] f_2(r, ζ_ε(r)) dμ_2(r) |
                      ≤ K ∫_t^τ |ξ_ε(r) − ξ_2(r)| dr + K(m_2 − m_1).

Gronwall's inequality allows us to conclude that

    sup_{t ≤ τ ≤ T} |ξ_ε(τ) − ξ_2(τ)| ≤ K(m_2 − m_1).

It then follows, by a similar calculation, that

    |P_{t,x,m_1}(ζ_ε, μ_ε) − P_{t,x,m_2}(ζ_ε, μ_2)| ≤ K(m_2 − m_1).

Consequently, using (2.5),

    V(t, x, m_1) ≥ P_{t,x,m_2}(ζ_ε, μ_2) − K(m_2 − m_1) − ε ≥ V(t, x, m_2) − K(m_2 − m_1) − ε.

So, we conclude that

    V(t, x, m_2) − V(t, x, m_1) ≤ K(m_2 − m_1).                                      (2.6)

For the other side we use the following lemma.
LEMMA 2.2. V is monotone nondecreasing in m ∈ [0, 1], i.e.

    V(t, x, m_2) − V(t, x, m_1) ≥ 0,   1 ≥ m_2 ≥ m_1 ≥ 0.                            (2.7)

Proof. For each ε > 0 choose (ζ_ε, μ_2) ∈ (Z × 𝔐_{m_2})[t, T] such that

    V(t, x, m_2) ≥ P_{t,x,m_2}(ζ_ε, μ_2) − ε.

Let μ_1 = μ_2 − m_2 + m_1. Then μ_1 starts at m_1 and is simply μ_2 shifted down by m_2 − m_1. Further, dμ_1 = dμ_2, so that the associated trajectories are identical. Therefore,

    V(t, x, m_2) ≥ P_{t,x,m_2}(ζ_ε, μ_2) − ε = P_{t,x,m_1}(ζ_ε, μ_1) − ε ≥ V(t, x, m_1) − ε,

completing the proof of the lemma. ∎
Combining (2.6) and (2.7), continuity in m and (2) is established. Now we turn to continuity in t ∈ [0, T). Fix 0 ≤ t_1 < t_2 < T and fix x ∈ R^1, m ∈ [0, 1]. For each ε > 0 there exists (ζ_2, μ_2) ∈ (Z × 𝔐_m)[t_2, T] such that

    V(t_2, x, m) ≥ P_{t_2,x,m}(ζ_2, μ_2) − ε.                                        (2.8)

Set ζ_1(τ) = ζ_2(t_2) if t_1 ≤ τ < t_2, and ζ_1(τ) = ζ_2(τ) if t_2 ≤ τ ≤ T. Let μ_1(τ) = m if t_1 ≤ τ < t_2, and μ_1(τ) = μ_2(τ) if t_2 ≤ τ ≤ T. Then (ζ_1, μ_1) ∈ (Z × 𝔐_m)[t_1, T]. Finally, let ξ_2 be the trajectory on [t_2, T] for the controls (ζ_2, μ_2) and let ξ_1 be the trajectory on [t_1, T], also starting from x, for the controls (ζ_1, μ_1). Then it follows from assumption (A) and the fact that dμ_1(τ) = 0 if t_1 ≤ τ < t_2, and dμ_1(τ) = dμ_2(τ) if t_2 ≤ τ ≤ T, that

    |ξ_1(t_2) − x| ≤ K(t_2 − t_1)   and   sup_{t_2 ≤ τ ≤ T} |ξ_1(τ) − ξ_2(τ)| ≤ K(t_2 − t_1);

and so

    |P_{t_1,x,m}(ζ_1, μ_1) − P_{t_2,x,m}(ζ_2, μ_2)| ≤ ∫_{t_1}^{t_2} |h_1(r, ξ_1(r), ζ_1(r))| dr + K(t_2 − t_1) ≤ K(t_2 − t_1).

Therefore, from (2.8),

    V(t_2, x, m) ≥ P_{t_1,x,m}(ζ_1, μ_1) − K(t_2 − t_1) − ε ≥ V(t_1, x, m) − K(t_2 − t_1) − ε.

We conclude that

    V(t_1, x, m) − V(t_2, x, m) ≤ K(t_2 − t_1).                                      (2.9)

Next, we need to show that

    V(t_2, x, m) − V(t_1, x, m) ≤ (K/(T − t_2))(t_2 − t_1).                          (2.10)

We again begin with an ε > 0 and a pair (ζ_1, μ_1) ∈ (Z × 𝔐_m)[t_1, T] such that V(t_1, x, m) ≥ P_{t_1,x,m}(ζ_1, μ_1) − ε.
Define the functions s : [t_1, T] → [t_2, T] and τ : [t_2, T] → [t_1, T] by

    s(τ) = t_2 + ((T − t_2)/(T − t_1))(τ − t_1),   τ(s) = t_1 + ((T − t_1)/(T − t_2))(s − t_2).

Define the map Θ : C[t_1, T] → C[t_2, T] by

    (Θφ)(s) = φ(τ(s)).

The map Θ is a linear isomorphism with norm 1. Now we consider the adjoint operator Θ*, which is also an isomorphism from Radon measures on [t_2, T] to Radon measures on [t_1, T]. Therefore, there exists a Radon measure μ_2 such that Θ*(μ_2) = μ_1, and for any φ ∈ C[t_1, T],

    ⟨φ, μ_1⟩ = ∫_[t_1,T] φ(r) dμ_1(r) = ⟨φ, Θ*μ_2⟩ = ⟨Θφ, μ_2⟩ = ∫_[t_2,T] φ(τ(r)) dμ_2(r).   (2.11)

It is not hard to see, by suitably choosing φ, that μ_2 ∈ 𝔐_m[t_2, T]. We can extend relation (2.11) to the space of bounded, Borel measurable functions since the Borel σ-field is contained in the μ_2-measurable σ-algebra. Then, by approximating a Borel measurable function by a sequence of continuous functions and using the dominated convergence theorem, we see that (2.11) will hold for any φ which is bounded and Borel measurable. Note that we are not saying that a continuous linear functional on the space of Borel functions is represented by a Radon measure. We are saying that the Radon measure representation of the continuous linear functional (with the sup norm) Θ can be extended to Borel functions using the L^1 norm with the μ measure.

Define ζ_2(s) = (Θζ_1)(s) = ζ_1(τ(s)). Let ξ_1 be the trajectory on [t_1, T] for the controls (ζ_1, μ_1) and let ξ_2 be the trajectory on [t_2, T] for the controls (ζ_2, μ_2).

LEMMA 2.3. With the above notation,

    μ_2 ∈ 𝔐_m[t_2, T]                                                                (2.12)

and

    sup_{t_2 ≤ s ≤ T} |ξ_1(τ(s)) − ξ_2(s)| ≤ (K/(T − t_2))(t_2 − t_1).                (2.13)
Proof. We will only prove (2.13) since the proof of (2.12) is similar. (See Section 3 for the preliminaries for (2.12).) We have that

    ξ_2(s) = x + ∫_{t_2}^{s} f_1(r, ξ_2(r), ζ_2(r)) dr + ∫_[t_2,s] f_2(r, ζ_2(r)) dμ_2(r)

and

    ξ_1(τ(s)) = x + ∫_{t_1}^{τ(s)} f_1(b, ξ_1(b), ζ_1(b)) db + ∫_[t_1,τ(s)] f_2(b, ζ_1(b)) dμ_1(b)
              = x + ∫_{t_1}^{T} f_1(b, ξ_1(b), ζ_1(b)) 1_[t_1,τ(s)](b) db
                  + ∫_[t_1,T] f_2(b, ζ_1(b)) 1_[t_1,τ(s)](b) dμ_1(b).                 (2.14)

We use the notation that 1_A is the characteristic function of the set A. Make the substitution b = τ(r) in the first integral in (2.14) and use the definition of Θ given in (2.11) in the second integral to get

    ξ_1(τ(s)) = x + ((T − t_1)/(T − t_2)) ∫_{t_2}^{s} f_1(τ(r), ξ_1(τ(r)), ζ_2(r)) dr
                  + ∫_[t_2,T] f_2(τ(r), ζ_2(r)) 1_[t_1,τ(s)](τ(r)) dμ_2(r).

Now we use the following facts:

    1_[t_1,τ(s)](τ(r)) = 1_[t_2,s](r)

and

    |s − τ(s)| = (t_2 − t_1)(T − s)/(T − t_2) ≤ t_2 − t_1.

Then, using assumption (A), we get the estimate

    |ξ_1(τ(s)) − ξ_2(s)| ≤ K ∫_{t_2}^{s} |ξ_1(τ(r)) − ξ_2(r)| dr + (K/(T − t_2))(t_2 − t_1).

Gronwall's inequality then establishes that (2.13) holds. ∎

Now that we have an estimate on the trajectories, it is easy to verify that

    |P_{t_1,x,m}(ζ_1, μ_1) − P_{t_2,x,m}(ζ_2, μ_2)| ≤ (K/(T − t_2))(t_2 − t_1).
We conclude that

    V(t_2, x, m) − V(t_1, x, m) ≤ (K/(T − t_2))(t_2 − t_1) + ε,

which, since ε is arbitrary, gives the desired estimate (2.10). The proof of theorem 2.1 is completed using the next proposition. This result gives us the terminal and boundary conditions and shows that V is continuous on [0, T] × R^1 × [0, 1].

PROPOSITION 2.4. V satisfies the terminal condition

    lim_{t ↑ T} V(t, x, m) = V(T, x, m) = min_{z ∈ Z, m ≤ a ≤ 1} h_2(T, z)(a − m)
                           = min((1 − m) min_{z ∈ Z} h_2(T, z), 0)                    (2.15)

and the boundary condition

    V(t, x, 1) = γ(t, x),                                                             (2.16)

where

    γ(t, x) = inf_{ζ ∈ Z[t,T]} ∫_t^T h_1(r, ξ(r), ζ(r)) dr,   with dξ/dτ = f_1(τ, ξ(τ), ζ(τ)), ξ(t) = x,

is the value function for the optimal control problem in which the measures do not appear.

Proof. Fix m ≤ a ≤ 1, z ∈ Z, take the constant control ζ ≡ z, and choose μ(t⁻) = m, μ(τ) = a if t ≤ τ ≤ T. We have a point mass at t if m < a. Then from (2.3), (2.4),
    V(t, x, m) ≤ ∫_t^T h_1(r, ξ(r), z) dr + h_2(t, z)(a − m).
Let t ↑ T to get lim sup_{t ↑ T} V(t, x, m) ≤ min_{z ∈ Z, m ≤ a ≤ 1} h_2(T, z)(a − m). For the other side, let (ζ, μ) ∈ (Z × 𝔐_m)[t, T] be arbitrary. Then from assumption (A),

    P_{t,x,m}(ζ, μ) = ∫_t^T h_1(r, ξ(r), ζ(r)) dr + ∫_[t,T] h_2(r, ζ(r)) dμ(r)
                    ≥ ∫_t^T h_1(r, ξ(r), ζ(r)) dr + ∫_[t,T] (h_2(T, ζ(r)) − K(T − r)) dμ(r)
                    ≥ ∫_t^T h_1(r, ξ(r), ζ(r)) dr + ∫_[t,T] (min_{z ∈ Z} h_2(T, z) − K(T − t)) dμ(r)
                    ≥ ∫_t^T h_1(r, ξ(r), ζ(r)) dr + (min_{z ∈ Z} h_2(T, z) − K(T − t))(μ(T) − m)
                    ≥ ∫_t^T h_1(r, ξ(r), ζ(r)) dr + min_{z ∈ Z, m ≤ a ≤ 1} (h_2(T, z) − K(T − t))(a − m).

Consequently, letting t ↑ T, since ζ and μ were arbitrary, we see that lim inf_{t ↑ T} V(t, x, m) ≥ min_{z ∈ Z, m ≤ a ≤ 1} h_2(T, z)(a − m), and the terminal condition (2.15) is verified.
Finally, to see that the boundary condition (2.16) is satisfied we simply observe that if the controls μ must start at 1 and be nondecreasing then they must stay at 1. That is, 𝔐_1[t, T] = {1}, and the result follows immediately from the proof of continuity of V in m. ∎

The proofs of proposition 2.4 and theorem 2.1 are complete.

Remark. Suppose that we had a terminal cost, say g(ξ(T)), as well as a running cost, i.e. the cost functional is g(ξ(T)) + P_{t,x,m}(ζ, μ). In this case, the terminal condition becomes

    V(T, x, m) = min_{z ∈ Z, m ≤ a ≤ 1} [h_2(T, z)(a − m) + g(x + f_2(T, z)(a − m))]

and the boundary condition becomes

    V(t, x, 1) = γ(t, x) = inf_{ζ ∈ Z[t,T]} [ g(ξ(T)) + ∫_t^T h_1(r, ξ(r), ζ(r)) dr ],

where dξ/dτ = f_1(τ, ξ(τ), ζ(τ)), ξ(t) = x.

The next result contains the dynamic programming principle for the optimal control problem.
PROPOSITION 2.5. Let assumption (A) hold. Then for any t < s ≤ T we have that

    V(t, x, m) = inf_{(ζ,μ) ∈ (Z × 𝔐_m)[t,s]} [ ∫_t^s h_1(r, ξ(r), ζ(r)) dr
                 + ∫_[t,s) h_2(r, ζ(r)) dμ(r) + V(s, ξ(s⁻), μ(s⁻)) ]                 (DP1)

and

    V(t, x, m) = min_{z ∈ Z, 1−m ≥ δ ≥ 0} [h_2(t, z)δ + V(t, x + δ f_2(t, z), m + δ)].   (DP2)
Proof. We will prove (DP2); the proof of (DP1) is standard and furthermore is very similar to [18, theorems 2.1, 2.3]. Let F(t, x, m) denote the right-hand side of (DP2). Since we can choose δ = 0 we see that F(t, x, m) ≤ V(t, x, m). For the other side, let z ∈ Z be fixed and ζ(τ) = z, t ≤ τ ≤ T. Fix 0 ≤ δ ≤ 1 − m. Let μ ∈ 𝔐_m[t, T] be defined by μ(t⁻) = m and μ(τ) = δ + m if t ≤ τ ≤ T. Let ξ(·) be the trajectory for the controls ζ, μ. Then, for any ε > 0, with t + ε ≤ T, we have from (DP1) that

    V(t, x, m) ≤ ∫_t^{t+ε} h_1(r, ξ(r), ζ(r)) dr + ∫_[t,t+ε) h_2(r, ζ(r)) dμ(r)
                 + V(t + ε, ξ(t + ε − 0), μ(t + ε − 0))
               = ∫_t^{t+ε} h_1(r, ξ(r), ζ(r)) dr + h_2(t, z)δ
                 + V(t + ε, ξ(t + ε − 0), μ(t + ε − 0)).                              (2.17)

Letting ε → 0, since ξ(t + ε) → x + f_2(t, z)δ and μ is right continuous, we conclude from the continuity of V that

    V(t, x, m) ≤ h_2(t, z)δ + V(t, x + f_2(t, z)δ, m + δ),   ∀ z ∈ Z, ∀ 1 − m ≥ δ ≥ 0.

Therefore, V(t, x, m) ≤ F(t, x, m) and the result is proved. ∎

Using the same method of proof we easily derive the following lemma.

LEMMA 2.6. The map δ ↦ min_{z ∈ Z} [h_2(t, z)δ + V(t, x + f_2(t, z)δ, m + δ)] is nondecreasing on [0, 1 − m].
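The jump operation on the right-hand side of (DP2) is a finite-dimensional minimization and is easy to realize on a grid. The sketch below brute-forces the minimum over z and δ; the data (h_2, f_2, the control set, and the stand-in value function) are made-up illustrations of ours, not prescribed by the paper.

```python
import numpy as np

# Illustrative data (ours), standing in for h2, f2, Z and a slice of V(t, ., .).
h2 = lambda t, z: z - 0.5
f2 = lambda t, z: 1.0
Z = np.linspace(0.0, 1.0, 11)

def jump_operator(V, t, x, m, n_delta=41):
    """Brute-force min over z in Z and 0 <= delta <= 1 - m of
    h2(t, z) * delta + V(x + delta * f2(t, z), m + delta)."""
    best = V(x, m)                        # delta = 0 is always admissible
    for z in Z:
        for delta in np.linspace(0.0, 1.0 - m, n_delta):
            best = min(best, h2(t, z) * delta + V(x + delta * f2(t, z), m + delta))
    return best

V_demo = lambda x, m: x * x + m           # a stand-in value function slice
val = jump_operator(V_demo, 0.0, -1.0, 0.2)
print(val)   # strictly below V_demo(-1.0, 0.2): jumping is profitable here
```

Since δ = 0 is admissible, the operator can never increase the value, which is why (DP2) writes V itself as this minimum over jumps, including the trivial one.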
Remark. We can combine (DP1) and (DP2) to get

    V(t, x, m) = inf_{(ζ,μ) ∈ (Z × 𝔐_m)[t,s]} [ ∫_t^s h_1(r, ξ(r), ζ(r)) dr + ∫_[t,s) h_2(r, ζ(r)) dμ(r)
                 + min_{z ∈ Z, 1−μ(s⁻) ≥ δ ≥ 0} (h_2(s, z)δ + V(s, ξ(s⁻) + δ f_2(s, z), μ(s⁻) + δ)) ].

Next we will derive the Bellman equation for the problem and prove that V is the viscosity solution of the equation.
Define the Hamiltonians H_1 : [0, T] × R^2 → R^1 and H_2 : [0, T] × R^1 → R^1 by

    H_1(t, x, p_x) = min_{z ∈ Z} (p_x f_1(t, x, z) + h_1(t, x, z)),
    H_2(t, p_x) = min_{z ∈ Z} (p_x f_2(t, z) + h_2(t, z)).

THEOREM 2.7. Let assumption (A) hold. The value function V is the unique viscosity solution on the set Q = (0, T) × R^1 × (0, 1) of

    min(V_t + H_1(t, x, V_x), V_m + H_2(t, V_x)) = 0,                                (2.18)

and V satisfies the terminal condition (2.15) and boundary condition (2.16).

Before we give the proof of the theorem we recall from [19, 20] the definition of a (possibly discontinuous) viscosity solution of a Hamilton-Jacobi equation.

Definition 2.8. A function u : R^n → R^1 is a viscosity subsolution (supersolution) of the equation

    G(x, u, D_x u) = 0,   where G : R^n × R^1 × R^n → R^1,

if for any φ ∈ C^1(R^n) for which u* − φ has a maximum (u_* − φ has a minimum) at the point y, we have

    G*(y, u*(y), D_x φ(y)) ≥ 0   (respectively G_*(y, u_*(y), D_x φ(y)) ≤ 0),

where u*, u_* denote the upper and lower semicontinuous envelopes of u, respectively. Similarly for G*, G_*.
In general, we see that a viscosity solution as well as the function G may be discontinuous. In our problem we have already proved the continuity of the proposed solution, and we have the continuous function G given by

    G(t, x, m, p_t, p_m, p_x) = min(p_t + H_1(t, x, p_x), p_m + H_2(t, p_x)).

We now turn to the proof of the theorem.

Proof. Let φ be a smooth function on Q̄ and suppose that V − φ achieves a strict zero maximum at the point (t_0, x_0, m_0). We can always arrange, by modifying φ if necessary (c.f. [1, 21]), to have (t_0, x_0, m_0) ∈ (0, T) × R^1 × (0, 1). From (DP2) and V(t_0, x_0, m_0) = φ(t_0, x_0, m_0) we have, for every δ > 0,

    0 ≤ min_{z ∈ Z} [h_2(t_0, z) + δ^{-1}(φ(t_0, x_0 + δ f_2(t_0, z), m_0 + δ) − φ(t_0, x_0, m_0))].

Let δ → 0 and use the differentiability of φ to get

    0 ≤ min_{z ∈ Z} [h_2(t_0, z) + φ_x(t_0, x_0, m_0) · f_2(t_0, z) + φ_m(t_0, x_0, m_0)].   (2.19)

Define the control μ ∈ 𝔐_{m_0}[t_0, T] by μ(τ) = m_0, t_0 ≤ τ ≤ T. From (DP1) we get, for any t_0 ≤ s ≤ T and any ζ,

    φ(t_0, x_0, m_0) = V(t_0, x_0, m_0) ≤ ∫_{t_0}^{s} h_1(r, ξ(r), ζ(r)) dr + V(s, ξ(s), m_0).

Notice that for the control μ the trajectory for each ζ is given by dξ/dτ = f_1(τ, ξ(τ), ζ(τ)), ξ(t_0) = x_0, and there are no jumps in either μ or ξ. Set s = t_0 + ε in the preceding, divide by ε and let ε → 0 to obtain

    0 ≤ min_{z ∈ Z} [h_1(t_0, x_0, z) + φ_x(t_0, x_0, m_0) · f_1(t_0, x_0, z) + φ_t(t_0, x_0, m_0)].   (2.20)

Combining (2.19) and (2.20) we see that V is a subsolution of (2.18).

We need to prove finally that V is a supersolution of (2.18). Thus, suppose that V − φ has a strict zero minimum at the point (t_0, x_0, m_0) ∈ Q with φ a smooth function. Assume to the contrary that there is a constant C > 0 such that

    φ_t(t_0, x_0, m_0) + min_{z ∈ Z} [φ_x(t_0, x_0, m_0) · f_1(t_0, x_0, z) + h_1(t_0, x_0, z)] ≥ C   (2.21)

and

    φ_m(t_0, x_0, m_0) + min_{z ∈ Z} [φ_x(t_0, x_0, m_0) · f_2(t_0, z) + h_2(t_0, z)] ≥ C.   (2.22)
Fix z ∈ Z and define ξ(·) by dξ(m)/dm = f_2(t_0, z), ξ(m_0) = x_0. From (2.22), since φ is smooth, we see that for all m ∈ [m_0, m_0 + δ], for small ε > δ > 0,

    φ_m(t_0, ξ(m), m) + φ_x(t_0, ξ(m), m) · f_2(t_0, z) + h_2(t_0, z) ≥ C/2.

Integrate this from m_0 to m_0 + δ to obtain

    φ(t_0, x_0 + δ f_2(t_0, z), m_0 + δ) − φ(t_0, x_0, m_0) + δ h_2(t_0, z) ≥ δC/2.   (2.23)

Since V − φ has a strict zero minimum at (t_0, x_0, m_0) we obtain from (2.23)

    V(t_0, x_0 + δ f_2(t_0, z), m_0 + δ) + δ h_2(t_0, z)
        ≥ φ(t_0, x_0 + δ f_2(t_0, z), m_0 + δ) + δ h_2(t_0, z)
        ≥ φ(t_0, x_0, m_0) + δC/2 = V(t_0, x_0, m_0) + δC/2,                          (2.24)

for all z ∈ Z and sufficiently small ε > δ > 0. This inequality says that it is not optimal to jump to another position at time t_0.

LEMMA 2.9. If (2.24) holds, then there exists an ε > 0 such that for all t_0 < s < t_0 + ε,

    V(t_0, x_0, m_0) ≥ inf [ ∫_{t_0}^{s} h_1(r, ξ(r), ζ(r)) dr + ∫_[t_0,s] h_2(r, ζ(r)) dμ(r) + V(s, ξ(s), μ(s)) ],

where the infimum on μ is taken over the class 𝔐_{m_0}[t_0, s] ∩ C[t_0, s].

Proof. For each integer n = 1, 2, ..., there exists (ζ_n, μ_n) ∈ (Z × 𝔐_{m_0})[t_0, T] such that

    V(t_0, x_0, m_0) + 1/n ≥ P_{t_0,x_0,m_0}(ζ_n, μ_n).

Let s_n ≥ t_0 be the first point of discontinuity of μ_n. We have that, with Δ_n = μ_n(s_n) − μ_n(s_n − 0),

    V(t_0, x_0, m_0) + 1/n ≥ ∫_{t_0}^{s_n} h_1(r, ξ_n(r), ζ_n(r)) dr + ∫_[t_0,s_n) h_2(r, ζ_n(r)) dμ_n(r)
        + min_{z ∈ Z} [Δ_n h_2(s_n, z) + V(s_n, ξ_n(s_n − 0) + Δ_n f_2(s_n, z), μ_n(s_n − 0) + Δ_n)].

If there is a subsequence such that s_n → t_0 and δ′ > 0 with Δ_n → δ′, then using the continuity of V we obtain, letting n → ∞,

    V(t_0, x_0, m_0) ≥ min_{z ∈ Z} [δ′ h_2(t_0, z) + V(t_0, x_0 + δ′ f_2(t_0, z), m_0 + δ′)].

Using lemma 2.6 we have reached a contradiction of (2.24). Hence no such subsequence exists, and the lemma follows. ∎
Now fix ε given by lemma 2.9. Let 0 < ρ_0 < ε and take ζ ∈ Z[t_0, t_0 + ρ_0] and μ ∈ 𝔐_{m_0} ∩ C[t_0, t_0 + ρ_0], with (2.21) and (2.22) (with C replaced by C/2) holding at (r, ξ(r), μ(r)), t_0 ≤ r ≤ t_0 + ρ_0. Then, computing the change in φ along the trajectory, we obtain

    φ(t_0 + ρ_0, ξ(t_0 + ρ_0), μ(t_0 + ρ_0)) − φ(t_0, x_0, m_0)
        + ∫_{t_0}^{t_0+ρ_0} h_1(r, ξ(r), ζ(r)) dr + ∫_[t_0,t_0+ρ_0] h_2(r, ζ(r)) dμ(r) ≥ ρ_0 C/2.   (2.25)

Since V − φ has a strict zero minimum at (t_0, x_0, m_0) we obtain from (2.25) that

    V(t_0 + ρ_0, ξ(t_0 + ρ_0), μ(t_0 + ρ_0)) − V(t_0, x_0, m_0)
        + ∫_{t_0}^{t_0+ρ_0} h_1(r, ξ(r), ζ(r)) dr + ∫_[t_0,t_0+ρ_0] h_2(r, ζ(r)) dμ(r) ≥ ρ_0 C/2.   (2.26)
This inequality contradicts lemma 2.9, since (ζ, μ) was an arbitrary pair with μ continuous. Therefore, V is shown to be a viscosity supersolution of (2.18) as well. Finally, the fact that V is the only viscosity solution of (2.18) follows from more general uniqueness results for first order Hamilton-Jacobi equations (c.f. [22-24]). ∎

Now we introduce the following optimal control problem with unbounded controls:

    Minimize  P_{t,x,m}(ζ, α) = ∫_t^T h_1(r, ξ(r), ζ(r)) dr + ∫_t^T h_2(r, ζ(r)) α(r) 1_{μ(r)<1} dr   (2.27)

subject to

    dξ/dτ = f_1(τ, ξ(τ), ζ(τ)) + f_2(τ, ζ(τ)) α(τ) 1_{μ(τ)<1},
    dμ/dτ = α(τ) 1_{μ(τ)<1},
    ξ(t) = x ∈ R^1,  μ(t) = m ∈ [0, 1],                                              (2.28)

over the class of controls (ζ, α) ∈ Z[t, T] × L^1_+[t, T], where

    L^1_+[t, T] = {α : [t, T] → [0, ∞) | ∫_t^T α(r) dr < ∞}.
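Part of the point of this reformulation is that a tall, narrow density α approximates a point mass of dμ. In the sketch below, all data are ours (an x-independent f_2 and a target jump of size 0.4 at τ = 0.5): setting α = A on an interval of length 0.4/A moves the state by an amount that approaches the effect of the pure jump as A grows.

```python
# Illustrative data (ours): x-independent jump coefficient.
f2 = lambda t: 1.0 + t

def drive(A, N=200_000, T=1.0):
    """Integrate dxi = f2 * alpha dt, dmu = alpha dt with a spike alpha = A
    on an interval of width 0.4 / A starting at t = 0.5."""
    dt = T / N
    k0 = int(round(0.5 / dt))              # first step of the spike
    kw = int(round(0.4 / (A * dt)))        # spike lasts kw steps
    xi, mu = 0.0, 0.0
    for k in range(N):
        t = k * dt
        alpha = A if k0 <= k < k0 + kw else 0.0
        if mu < 1.0:                       # the indicator 1_{mu < 1}
            xi += f2(t) * alpha * dt
            mu += alpha * dt
    return xi, mu

jump_limit = f2(0.5) * 0.4                 # effect of the pure atom 0.4 at t = 0.5
for A in (10.0, 100.0, 1000.0):
    xi, mu = drive(A)
    print(A, xi, abs(xi - jump_limit))     # the error shrinks as A grows
```

This is the mechanism behind part (3) of theorem 2.10 below: the unbounded-control problem can reproduce, in the limit, anything the measures can do.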
The function 1_{μ(s)<1} is the characteristic function of the set {μ(s) < 1}. For α ∈ L^1_+[t, T] we see that μ(r) ∈ [0, 1] for all t ≤ r ≤ T. Furthermore, since ∫_t^T α(r) 1_{μ(r)<1} dr ≤ 1, we see that ‖ξ‖_{L^∞} ≤ K, independently of the controls. The value function for this problem is defined by

    W(t, x, m) = inf_{(ζ,α) ∈ Z[t,T] × L^1_+[t,T]} P_{t,x,m}(ζ, α).

It is easily seen that W is a bounded function under the assumption (A).
THEOREM 2.10. Let assumption (A) hold.
(1) W is a viscosity solution of (2.18) and satisfies the terminal condition (2.15) and boundary condition (2.16).
(2) The value function W is also the unique continuous viscosity solution of

    W_t(t, x, m) + H(t, x, W_m, W_x) = 0,                                            (2.29)

where

    H(t, x, p_m, p_x) = H_1(t, x, p_x)   if p_m + H_2(t, p_x) ≥ 0,
    H(t, x, p_m, p_x) = −∞               if p_m + H_2(t, p_x) < 0.                   (2.30)

(3) W = V on Q̄.
Proof. The proof that W satisfies the terminal and boundary conditions is similar to that in proposition 2.4 and is left to the reader. We will prove that W is a viscosity solution of (2.18). In fact, this follows immediately from theorem I.1 of [7], but we will provide the details. The idea of the proof is to bound the controls α, which then results in a standard optimal control problem to which classical results apply. Therefore, we consider the control problem (2.27), (2.28), but we choose the controls α from the class

    A_B[t, T] = {α : [t, T] → [0, B] : α ∈ L^1_+[t, T]},

for each fixed B > 0. When we use this class we will denote the corresponding value function by W^B. Now, using standard theory, W^B is the unique viscosity solution on Q of

    W_t^B + H^B(t, x, W_m^B, W_x^B) = 0,

where

    H^B(t, x, p_m, p_x) = min_{z ∈ Z} [p_x f_1(t, x, z) + h_1(t, x, z) − B(p_m + p_x f_2(t, z) + h_2(t, z))^−].

By considering the classes of control functions it is clear that B ≥ B′ implies that W ≤ W^B ≤ W^{B′}. We conclude that W^B converges to some function Γ ≥ W which is upper semicontinuous. In fact it is not hard for the reader to verify that Γ = W on (0, T) × R^1 × [0, 1]. Therefore, W is at least upper semicontinuous. We will now use the fact that W^B ↓ W to show that W is a viscosity solution of (2.18).
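The effect of the penalization can be seen numerically: where p_m + H_2 ≥ 0 the penalty term is inactive and H^B coincides with H_1, while where p_m + H_2 < 0 the Hamiltonian H^B decreases without bound as B → ∞, recovering (2.30). The data below (f_1, h_1, f_2, h_2 and the gradient values) are an illustration of ours.

```python
import numpy as np

# Illustrative data (ours).
Z = np.linspace(-1.0, 1.0, 201)
f1 = lambda t, x, z: z
h1 = lambda t, x, z: z * z
f2 = lambda t, z: 1.0
h2 = lambda t, z: 0.5 + 0.5 * z

def HB(B, t, x, pm, px):
    # -(q)^- written as min(q, 0); penalized Hamiltonian on the control grid.
    neg = np.minimum(pm + px * f2(t, Z) + h2(t, Z), 0.0)
    return float(np.min(px * f1(t, x, Z) + h1(t, x, Z) + B * neg))

H1 = lambda t, x, px: float(np.min(px * f1(t, x, Z) + h1(t, x, Z)))

t, x = 0.0, 0.0
# Case p_m + H2 >= 0: the penalty vanishes and H^B equals H1 for every B.
print([HB(B, t, x, 1.0, 0.0) for B in (1, 10, 100)], H1(t, x, 0.0))
# Case p_m + H2 < 0: H^B decreases to -infinity as B grows.
print([HB(B, t, x, -2.0, 0.0) for B in (1, 10, 100, 1000)])
```

This is the standard penalty-approximation picture for the unbounded-control Hamiltonian (2.30), computed here by brute force over a discretized control set.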
Let W − φ achieve a zero unique maximum at the point (t_0, x_0, m_0) with φ a smooth function. We arrange, if necessary, to have t_0 > 0 and 0 < m_0 < 1. Then, by lemma A.2 in Barles and Perthame [25], for each B > 0, W^B − φ achieves a maximum at (t_B, x_B, m_B) and (t_B, x_B, m_B) → (t_0, x_0, m_0) as B → ∞. Since W^B is a subsolution, at (t_B, x_B, m_B)

    φ_t + min_{z ∈ Z} [φ_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) − B(φ_m + φ_x f_2(t_B, z) + h_2(t_B, z))^−] ≥ 0.   (2.31)

Since the subtracted term B(·)^− is nonnegative we may drop it to get

    φ_t + min_{z ∈ Z} [φ_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z)] ≥ 0   at (t_B, x_B, m_B).

Let B → ∞ to see that

    φ_t + min_{z ∈ Z} [φ_x f_1(t_0, x_0, z) + h_1(t_0, x_0, z)] ≥ 0   at (t_0, x_0, m_0).   (2.32)

Also, divide through by B in (2.31), let B → ∞ and use assumption (A) to obtain

    min_{z ∈ Z} [−(φ_m + φ_x f_2(t_0, z) + h_2(t_0, z))^−] ≥ 0,

which immediately implies that

    φ_m + min_{z ∈ Z} [φ_x f_2(t_0, z) + h_2(t_0, z)] ≥ 0.                            (2.33)

Combining (2.32) and (2.33) we conclude that W is a viscosity subsolution of (2.18).

Now suppose that W − φ achieves a zero unique minimum at the point (t_0, x_0, m_0) with φ a smooth function. Then, again by lemma A.2 in Barles and Perthame [25], for each B > 0, W^B − φ achieves a minimum at (t_B, x_B, m_B) and (t_B, x_B, m_B) → (t_0, x_0, m_0) as B → ∞. At (t_B, x_B, m_B),

    φ_t + min_{z ∈ Z} [φ_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z) − B(φ_m + φ_x f_2(t_B, z) + h_2(t_B, z))^−] ≤ 0.   (2.34)

If

    φ_m + min_{z ∈ Z} [φ_x f_2(t_0, z) + h_2(t_0, z)] ≥ C > 0

then, by continuity, at (t_B, x_B, m_B) for B sufficiently large

    φ_m + min_{z ∈ Z} [φ_x f_2(t_B, z) + h_2(t_B, z)] ≥ C/2,

so the penalization term in (2.34) vanishes. From (2.34) we then see that

    φ_t + min_{z ∈ Z} [φ_x f_1(t_B, x_B, z) + h_1(t_B, x_B, z)] ≤ 0.                  (2.35)

Letting B → ∞ we see that (2.35) holds at the point (t_0, x_0, m_0). Consequently, W is a supersolution of (2.18). Since the viscosity solution of (2.18) is known to be at least continuous, we know also from this fact that W must be continuous. Part (1) is proved.

Remark. It is not hard to directly establish the continuity of W. The details of the proof are similar to that of theorem 2.1.

We will prove part (2) of the theorem from the following lemma.
LEMMA 2.11. A continuous function is a viscosity solution of (2.18) if and only if it is also a viscosity solution of (2.29).

Proof. Let Γ be a viscosity solution of (2.18). It is obvious that Γ is then also a subsolution of (2.29), so we need only show it is a supersolution of (2.29). To this end, if Γ − φ has a minimum at (t_0, x_0, m_0), then

    min(φ_t + H_1(t_0, x_0, φ_x), φ_m + H_2(t_0, φ_x)) ≤ 0.

If φ_m + H_2(t_0, φ_x) > 0 then φ_t + H_1(t_0, x_0, φ_x) ≤ 0, so by (2.30), Γ is a supersolution of (2.29) at this point. On the other hand, if φ_m + H_2(t_0, φ_x) ≤ 0 then, again using the definition of the Hamiltonian H, φ_t + H(t_0, x_0, m_0, φ_m, φ_x) = −∞ ≤ 0. In either case we conclude that Γ is a supersolution of (2.29). Hence, a viscosity solution of (2.18) is also a viscosity solution of (2.29). The proof that Γ is a viscosity solution of (2.18) if it is a solution of (2.29) is similar, and so we omit it. We conclude that the equations (2.18) and (2.29) are equivalent in the viscosity sense. ∎

Finally, we will prove that W = V. We can appeal to uniqueness theorems for (2.18) (c.f. Barles [22]) to conclude that W = V, because we have shown that W and V satisfy the same equation and boundary conditions. We can also prove this directly, however, by using proposition 5.3 of [5]. Clearly, V(t, x, m) ≤ W(t, x, m). For the other side, given ε > 0 there exists a pair of controls (ζ_ε, μ_ε), with associated trajectory ξ_ε(·), which is ε-optimal:

    V(t, x, m) ≥ P_{t,x,m}(ζ_ε, μ_ε) − ε.

According to [5, proposition 5.3] there exists a sequence (ζ_i, α_i), with associated trajectories ξ_i, such that, as i → ∞,

    ξ_i → ξ_ε,   dξ_i → dξ_ε,   α_i dτ → dμ_ε(τ),   meas{τ : ζ_i(τ) ≠ ζ_ε(τ)} → 0.

Therefore,

    ∫_t^T h_1(r, ξ_i(r), ζ_i(r)) dr + ∫_t^T h_2(r, ζ_i(r)) α_i(r) 1_{μ_i(r)<1} dr
        → ∫_t^T h_1(r, ξ_ε(r), ζ_ε(r)) dr + ∫_[t,T] h_2(r, ζ_ε(r)) dμ_ε(r).

Finally, for i sufficiently large,

    V(t, x, m) ≥ P_{t,x,m}(ζ_i, α_i) − 2ε ≥ W(t, x, m) − 2ε,

and the result follows. ∎
Remarks. (1) It follows from this result that the model with measures is not more general than that with unbounded control functions.
(2) The Bellman equation formally tells us what the optimal controls are. For example, when V_m + H_2(t, V_x) > 0 the optimal measure control consists of doing nothing, i.e. dμ = 0. The optimal ζ control will then provide the minimum of the Hamiltonian H_1. The dμ measure, or equivalently, the α control, will be nonzero only on the set where V_m + H_2(t, V_x) = 0. On this set the optimal ζ control will minimize the Hamiltonian H_2. The optimal measure could have an absolutely continuous as well as a singular component. We leave as an open problem the rigorous connection between the Bellman equation and the optimal control.
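As a sanity check on this reading of the equation, one can run a crude backward dynamic-programming scheme that alternates an Euler step in the ordinary control (the V_t + H_1 branch) with the jump operation of (DP2) (the V_m + H_2 branch). Everything below (the model data, the grids, the interpolation) is an illustrative sketch of ours, not a convergence-proven solver.

```python
import numpy as np

# Illustrative model data (ours).
f1 = lambda t, x, z: z
h1 = lambda t, x, z: x * x + 0.1 * z * z
f2 = lambda t, z: 1.0
h2 = lambda t, z: 0.3
Z = np.linspace(-1.0, 1.0, 9)

xs = np.linspace(-2.0, 2.0, 21)
ms = np.linspace(0.0, 1.0, 11)
T, K = 1.0, 20
dt = T / K

def interp(V, x, m):
    # Bilinear interpolation of grid values; np.interp clamps outside the x-grid.
    i = int(np.clip(np.searchsorted(ms, m) - 1, 0, len(ms) - 2))
    w = (m - ms[i]) / (ms[i + 1] - ms[i])
    return (1 - w) * np.interp(x, xs, V[:, i]) + w * np.interp(x, xs, V[:, i + 1])

# Terminal condition (2.15): V(T, x, m) = min((1 - m) min_z h2(T, z), 0) = 0 here.
V = np.minimum((1.0 - ms)[None, :] * min(h2(T, z) for z in Z), 0.0) + np.zeros((len(xs), 1))

for k in range(K):
    t = T - (k + 1) * dt
    # Euler step in the ordinary control (the V_t + H1 operator).
    U = np.empty_like(V)
    for i, x in enumerate(xs):
        for j, m in enumerate(ms):
            U[i, j] = min(h1(t, x, z) * dt + interp(V, x + f1(t, x, z) * dt, m) for z in Z)
    # Jump operation of (DP2) applied at the same time level (the V_m + H2 operator).
    Vn = np.empty_like(V)
    for i, x in enumerate(xs):
        for j, m in enumerate(ms):
            Vn[i, j] = min(h2(t, z) * d + interp(U, x + f2(t, z) * d, m + d)
                           for z in Z for d in np.linspace(0.0, 1.0 - m, 6))
    V = Vn

print(V[10, 0])   # approximate V(0, 0, 0); doing nothing is optimal here, so ~0
```

On grid points where the jump branch achieves the minimum the measure control is active; elsewhere dμ = 0 and the ordinary control minimizes H_1, matching the feedback description above.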
3. THE DIFFERENTIAL GAME

In this section we will consider the differential game associated with the dynamics (2.1), (2.2) and payoff (2.3). The players will be the controls ζ and μ, with ζ the minimizer and μ the maximizer of P. We will work within the framework of Elliott and Kalton's definition of differential games and refer to Elliott [26] for a basic synopsis of results on differential games in connection with viscosity solutions. Many of the results for differential games are proved in a manner similar to that for the optimal control case. In the interest of brevity we will only provide the proofs which are substantially distinct from those of Section 2.

In order to be precise about the differential game let us define the terms. A strategy for the maximizer is a map α : Z[t, T] → 𝔐_m[t, T] such that ζ_1(r) = ζ_2(r), t ≤ r ≤ s, for each t ≤ s ≤ T, implies that α[ζ_1](r) = α[ζ_2](r), t ≤ r ≤ s. This defines α as a nonanticipating map. Let Γ(t) denote the class of strategies for μ on [t, T]. Similarly, the class of nonanticipating strategies for ζ on [t, T] is denoted by Δ(t). A strategy for the minimizer is a nonanticipating map β : 𝔐_m[t, T] → Z[t, T]. We will sometimes write α ∈ 𝔐_m[t, T], β ∈ Z[t, T] to signify that the strategies map into a control function in the class. An outcome (ζ, α[ζ]) (respectively (β[μ], μ)) must be an element of (Z × 𝔐_m)[t, T]. Then we have the following definition.

Definition 3.1. The upper value function V^+ : [0, T] × R^1 × [0, 1] → R^1 is defined by

    V^+(t, x, m) = sup_{α ∈ Γ(t)} inf_{ζ ∈ Z[t,T]} P_{t,x,m}(ζ, α[ζ]).

The lower value function V^− : [0, T] × R^1 × [0, 1] → R^1 is defined by

    V^−(t, x, m) = inf_{β ∈ Δ(t)} sup_{μ ∈ 𝔐_m[t,T]} P_{t,x,m}(β[μ], μ).

THEOREM 3.2. Under assumption (A):
(1) V^± : [0, T] × R^1 × [0, 1] → R^1 are bounded and continuous and satisfy the terminal condition

    V^±(T, x, m) = min_{z ∈ Z} (h_2(T, z))^+ (1 − m)                                  (TC)

and the boundary condition

    V^±(t, x, 1) = γ(t, x),                                                           (BC)

where γ is defined in (2.16).
(2) V^+ satisfies the dynamic programming principles

    V^+(t, x, m) = min_{z ∈ Z} max_{1−m ≥ δ ≥ 0} [h_2(t, z)δ + V^+(t, x + δ f_2(t, z), m + δ)]   (3.1)

and

    V^+(t, x, m) = sup_{α ∈ Γ(t)} inf_{ζ ∈ Z[t,s]} [ ∫_t^s h_1(r, ξ(r), ζ(r)) dr
                   + ∫_[t,s) h_2(r, ζ(r)) dα[ζ](r) + V^+(s, ξ(s⁻), α[ζ](s⁻)) ].       (3.2)
(3) $V^-$ satisfies the dynamic programming principles
\[
V^-(t,x,m)=\max_{1-m\ge\delta\ge0}\ \min_{z\in Z}\ \{h_2(t,z)\delta+V^-(t,\,x+\delta f_2(t,z),\,m+\delta)\}, \tag{3.3}
\]
\[
V^-(t,x,m)=\inf_{\beta\in Z[t,s]}\ \sup_{\mu\in\mathcal{M}_m[t,s]}\ \left\{\int_t^s h_1(r,\xi(r),\zeta(r))\,dr+\int_{[t,s)}h_2(r,\zeta(r))\,d\mu(r)+V^-(s,\xi(s-),\mu(s-))\right\}. \tag{3.4}
\]

Remark. If we add a terminal cost to the payoff, say $g(\xi(T))$, then the terminal condition becomes
\[
V^+(T,x,m)=\min_{z\in Z}\ \max_{m\le a\le1}\ \{g(x+f_2(T,z)(a-m))+h_2(T,z)(a-m)\} \tag{3.5}
\]
for $V^+$ and
\[
V^-(T,x,m)=\max_{m\le a\le1}\ \min_{z\in Z}\ \{g(x+f_2(T,z)(a-m))+h_2(T,z)(a-m)\} \tag{3.6}
\]
for $V^-$. Of course these terminal conditions will not be the same in general. One should not, therefore, expect the game with measures to always have value.
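To see concretely how the min-max in (3.5) can exceed the max-min in (3.6), the two terminal conditions can be evaluated numerically on a toy instance. All data below are hypothetical and are not taken from the paper: two controls with $f_2(T,z_1)=1$, $f_2(T,z_2)=3/2$, $h_2\equiv0$, terminal cost $g(y)=-\sin 2\pi y$, evaluated at $x=0$, $m=0$.

```python
import math

# Hypothetical data (not from the paper): Z = {z1, z2} with
# f2(T,z1) = 1.0, f2(T,z2) = 1.5, h2 = 0, and g(y) = -sin(2*pi*y).
f2 = {"z1": 1.0, "z2": 1.5}
h2 = {"z1": 0.0, "z2": 0.0}
g = lambda y: -math.sin(2.0 * math.pi * y)
x, m = 0.0, 0.0

def bracket(z, a):
    # the common bracket in (3.5) and (3.6)
    return g(x + f2[z] * (a - m)) + h2[z] * (a - m)

A = [m + i * (1.0 - m) / 1000 for i in range(1001)]          # a in [m, 1]

V_plus_T  = min(max(bracket(z, a) for a in A) for z in f2)   # (3.5), min-max
V_minus_T = max(min(bracket(z, a) for z in f2) for a in A)   # (3.6), max-min
print(V_plus_T, V_minus_T)   # min-max is 1, max-min is sin(pi/5) ~ 0.588
```

Each player's best jump size $a$ depends on the opponent's choice of $z$, so the order of the optimizations matters: here the min-max is $1$ while the max-min is only $\sin(\pi/5)\approx0.588$, so the game with this terminal cost cannot have value.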
Proof. We will only prove some of the results stated and only for the upper value. The proofs for the lower value are similar. We prove first that $V^+$ is continuous in $t$ in one direction. Fix $(x,m)\in R^1\times(0,1)$. Let $0<t_1<t_2<T$ and let $\varepsilon>0$ be given. Then there is a strategy $\alpha_1\in\Gamma(t_1)$ such that
\[
V^+(t_1,x,m)\le P_{t_1,x,m}(\zeta_1,\alpha_1[\zeta_1])+\varepsilon,\qquad\forall\,\zeta_1\in Z[t_1,T].
\]
Define the maps $S\colon[t_1,T]\to[t_2,T]$, $\tau\colon[t_2,T]\to[t_1,T]$ as in Section 2, and define the map $\Phi\colon C[t_2,T]\to C[t_1,T]$ by $(\Phi f)(s)=f(S(s))$. Given $\zeta_2\in Z[t_2,T]$ set $\zeta_1(r)=\zeta_2(S(r))$, $\mu_1=\alpha_1[\zeta_1]$ and $\mu_2=\Phi^*\mu_1$. As in Section 2, $\mu_2$ is a Radon measure with $\mu_2\in\mathcal{M}_m[t_2,T]$. Furthermore, for any $\varphi\in C[t_2,T]$,
\[
\langle\varphi,\mu_2\rangle=\int_{[t_2,T]}\varphi(r)\,d\mu_2(r)=\int_{[t_1,T]}(\Phi\varphi)(r)\,d\mu_1(r)=\langle\Phi\varphi,\mu_1\rangle.
\]
In fact, by dominated convergence, this is valid for any bounded Borel measurable $\varphi$. Then, involving assumption (A), lemma 2.3 and (2.12), we conclude after some manipulation that
\[
V^+(t_2,x,m)\le P_{t_1,x,m}(\zeta_2,\alpha_2[\zeta_2])+\varepsilon+K(t_2-t_1),\qquad\forall\,\zeta_2\in Z[t_2,T].
\]
This implies that
\[
V^+(t_2,x,m)\le V^+(t_1,x,m)+\varepsilon+K(t_2-t_1).
\]
The remaining estimates for continuity are similar to those of theorem 2.1 and are left to the reader.

Now we turn to the proof of (3.1). Let
\[
F(t,x,m)=\min_{z\in Z}\ \max_{1-m\ge\delta\ge0}\ \{h_2(t,z)\delta+V^+(t,\,x+\delta f_2(t,z),\,m+\delta)\}.
\]
By setting $\delta=0$ we see that $F(t,x,m)\ge V^+(t,x,m)$. Next, given any $\zeta\in Z[t,T]$ set $z=\zeta(t)$. We can find $1-m\ge\delta'=\delta'(z)\ge0$ so that
\[
F(t,x,m)\le\max_{1-m\ge\delta\ge0}\ \{h_2(t,z)\delta+V^+(t,\,x+\delta f_2(t,z),\,m+\delta)\}
=h_2(t,z)\delta'+V^+(t,\,x+\delta'f_2(t,z),\,m+\delta').
\]
If $\delta'=0$ we are done so we assume that $\delta'>0$. Now, by definition of $V^+$, there exists a strategy $\alpha'\in\mathcal{M}_{m+\delta'}[t,T]$ such that
\[
V^+(t,\,x+\delta'f_2(t,z),\,m+\delta')\le P_{t,x+\delta'f_2(t,z),m+\delta'}(\zeta,\alpha'[\zeta])+\varepsilon.
\]
Define the strategy $\alpha''\in\mathcal{M}_m[t,T]$ by $\alpha''[\zeta](t-)=m$ and $\alpha''[\zeta](\tau)=\alpha'[\zeta](\tau)$ if $t\le\tau\le T$. Then it is not hard to verify that
\[
P_{t,x+\delta'f_2(t,z),m+\delta'}(\zeta,\alpha'[\zeta])+\delta'h_2(t,z)=P_{t,x,m}(\zeta,\alpha''[\zeta]),
\]
so that, combining the preceding, we get that
\[
F(t,x,m)\le h_2(t,z)\delta'+V^+(t,\,x+\delta'f_2(t,z),\,m+\delta')
\le h_2(t,z)\delta'+P_{t,x+\delta'f_2(t,z),m+\delta'}(\zeta,\alpha'[\zeta])+\varepsilon
=P_{t,x,m}(\zeta,\alpha''[\zeta])+\varepsilon.
\]
This evidently implies that $F(t,x,m)\le V^+(t,x,m)$, completing the proof. The remaining assertions of the theorem are left to the reader. $\blacksquare$
We will now focus on the upper value, $V^+$, since we are taking the point of view that we are studying the differential game as a worst case analysis of a system subject to disturbances. Later we will state the results for the lower value, $V^-$. Define the upper Hamiltonian $H^+\colon R^1\times[0,T]\times R^3\to R^1$ as
\[
H^+(a,t,x,p_m,p_x)=\min_{z\in Z(a,t,p_m,p_x)}\bigl(p_x\cdot f_1(t,x,z)+h_1(t,x,z)\bigr) \tag{3.7}
\]
where
\[
Z(a,t,p_m,p_x)=\{z\in Z:\ p_m+p_x\cdot f_2(t,z)+h_2(t,z)\le a\}. \tag{3.8}
\]
If $Z(a,t,p_m,p_x)=\emptyset$ then we set $H^+=+\infty$. In general, one cannot expect such Hamiltonians to be continuous functions. In fact, this Hamiltonian is not continuous. In view of the definition of viscosity solution with discontinuous Hamiltonians, we have to calculate the upper and lower semicontinuous envelopes of $H^+$. We do so in the next lemma. The statement of the lemma is similar to that of [3, proposition 2.5] but the proof here is simpler.

LEMMA 3.3. The upper semicontinuous envelope $(H^+)^*$ of $H^+$ is given by
\[
(H^+)^*(a,t,x,p_m,p_x)=H^+(a-0,t,x,p_m,p_x).
\]
The lower semicontinuous envelope is given by
\[
(H^+)_*(a,t,x,p_m,p_x)=H^+(a+0,t,x,p_m,p_x).
\]

Proof. We will only prove the result for the upper semicontinuous envelope. By definition
\[
(H^+)^*(a,t,x,p_m,p_x)=\limsup\{H^+(b,s,y,q_m,q_x):\ (b,s,y,q_m,q_x)\to(a,t,x,p_m,p_x)\}.
\]
Given $\varepsilon>0$, fix $(s,y)\in B_\varepsilon(t,x)$ such that
\[
|f_1(t,x,z)-f_1(s,y,z)|\le K\varepsilon,\qquad |h_1(t,x,z)-h_1(s,y,z)|\le K\varepsilon,
\]
\[
|f_2(t,z)-f_2(s,z)|\le K\varepsilon,\qquad |h_2(t,z)-h_2(s,z)|\le K\varepsilon.
\]
Also fix $(b,q_m,q_x)\in B_\varepsilon(a,p_m,p_x)$. Now, by a standard result in finite-dimensional penalization theory, we have that
\[
\min_{z\in Z}\bigl(q_xf_1(s,y,z)+h_1(s,y,z)+\beta(q_m+q_xf_2(s,z)+h_2(s,z)-b)^+\bigr)
\]
\[
\le\min_{z\in Z}\bigl(p_xf_1(t,x,z)+h_1(t,x,z)+\beta(p_m+p_xf_2(t,z)+h_2(t,z)-a+K\varepsilon)^+\bigr)+|p_x|\varepsilon+K\varepsilon.
\]
Consequently, letting $\beta\to\infty$,
\[
H^+(b,s,y,q_m,q_x)\le\min_{z\in Z(a-K\varepsilon,t,p_m,p_x)}\bigl(p_xf_1(t,x,z)+h_1(t,x,z)\bigr)+|p_x|\varepsilon+K\varepsilon.
\]
Since $\varepsilon$ was arbitrary,
\[
(H^+)^*(a,t,x,p_m,p_x)\le H^+(a-0,t,x,p_m,p_x).
\]
Since the reverse inequality follows from the definition of the upper envelope, the proof is complete. $\blacksquare$

The next lemma is the useful analogue of [3, proposition 4.1] and lemma 2.9 above.
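The one-sided limits in lemma 3.3 can be seen already with a two-point control set. In the sketch below the numbers are hypothetical: $c(z)$ stands for the constraint value $p_m+p_x f_2(t,z)+h_2(t,z)$ in (3.8) and $q(z)$ for the cost $p_x f_1(t,x,z)+h_1(t,x,z)$ in (3.7).

```python
import math

# Hypothetical two-point instance of (3.7)-(3.8):
c = {"z1": 0.0, "z2": -1.0}   # constraint values p_m + p_x*f2 + h2
q = {"z1": 0.0, "z2": 5.0}    # costs p_x*f1 + h1

def H_plus(a):
    feas = [q[z] for z in c if c[z] <= a]   # the set Z(a, t, p_m, p_x)
    return min(feas) if feas else math.inf  # H+ = +inf when Z(a,...) is empty

eps = 1e-9
print(H_plus(0.0 - eps), H_plus(0.0), H_plus(0.0 + eps))  # 5.0 0.0 0.0
```

$H^+$ is nonincreasing in $a$ and jumps down each time the constraint set $Z(a,\cdot)$ gains a cheaper control; the left limit $H^+(a-0)$ therefore gives the upper envelope and the right limit $H^+(a+0)$ the lower one, exactly as in the lemma.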
LEMMA 3.4. A continuous function $u\in C(\bar Q)$ is a viscosity solution of
\[
\max\{u_t+H^+(0,t,x,u_m,u_x),\ u_m+H_2(t,u_x)\}=0 \tag{3.9}
\]
if and only if $u$ is a viscosity solution of
\[
u_t+H^+(0,t,x,u_m,u_x)=0. \tag{3.10}
\]

The advantage of the formulation (3.9) is that the minimum in $H^+$ is always taken over a set which is nonempty. With these preliminaries completed we can now state the main result of this section.

THEOREM 3.5. $V^+$ is a viscosity solution of (3.9) (or (3.10)) on $(0,T)\times R^1\times(0,1)$.
Proof. We know that $V^+$ is continuous on $\bar Q=[0,T]\times R^1\times[0,1]$ so we need to verify the viscosity requirements. Let $V^+-\varphi$ achieve a strict maximum of zero at the point $(t_0,x_0,m_0)$. Without loss of generality we may assume that $(t_0,x_0,m_0)\in(0,T)\times R^1\times(0,1)$. We must show that
\[
\varphi_t+H^+(0-0,t,x,\varphi_m,\varphi_x)\ge0\qquad\text{at }(t_0,x_0,m_0).
\]
Suppose this is not true. Then there exists a $\beta>0$ for which
\[
\varphi_t(t_0,x_0,m_0)+H^+(-4\beta,t_0,x_0,\varphi_m,\varphi_x)\le-4\beta\qquad\text{at }(t_0,x_0,m_0). \tag{3.11}
\]
By definition of the Hamiltonian, this implies that there exists $z^*\in Z(-4\beta,t_0,\varphi_m,\varphi_x)$ such that
\[
\varphi_t(t_0,x_0,m_0)+\varphi_x(t_0,x_0,m_0)f_1(t_0,x_0,z^*)+h_1(t_0,x_0,z^*)\le-4\beta.
\]
Consequently,
\[
\varphi_t(s,y,\nu)+\varphi_x(s,y,\nu)f_1(s,y,z^*)+h_1(s,y,z^*)\le-3\beta \tag{3.12}
\]
and
\[
\varphi_m(s,y,\nu)+\varphi_x(s,y,\nu)f_2(s,z^*)+h_2(s,z^*)\le-3\beta \tag{3.13}
\]
for every $(s,y,\nu)\in B_\delta(t_0,x_0,m_0)$ for some $\delta>0$. Set $\zeta^*(r)\equiv z^*$. Now, the fact that (3.13) holds at $(t_0,x_0,m_0)$ implies that there exists a $\delta'>0$ such that
\[
V^+(t_0,\,x_0+\delta f_2(t_0,z^*),\,m_0+\delta)+\delta h_2(t_0,z^*)\le V^+(t_0,x_0,m_0)-3\beta\delta,\qquad\forall\,0<\delta<\delta'. \tag{3.14}
\]
The proof of (3.14) is similar to that of (2.24). Next, (3.14) implies that there exists an $\varepsilon>0$ such that for any $t_0<s<t_0+\varepsilon$ we have
\[
V^+(t_0,x_0,m_0)\le\sup_\alpha\left\{\int_{t_0}^s h_1(r,\xi(r),\zeta^*(r))\,dr+\int_{[t_0,s]}h_2(r,\zeta^*(r))\,d\mu(r)+V^+(s,\xi(s),\mu(s))\right\}, \tag{3.15}
\]
where the supremum is taken over strategies $\alpha$ which satisfy the property that the outcome $\mu$ of $(\zeta^*,\alpha[\zeta^*])$ is in $\mathcal{M}_{m_0}\cap C[t_0,s]$. Again, the proof of this is similar to that of lemma 2.6 and uses the fact that
\[
\delta\mapsto V^+(t_0,\,x_0+\delta f_2(t_0,z^*),\,m_0+\delta)+\delta h_2(t_0,z^*)
\]
is nonincreasing on $[0,1-m_0]$. Fix $t_0<\rho<t_0+\varepsilon$ so that $(r,\xi(r),\mu(r))\in B_\delta(t_0,x_0,m_0)$, $t_0\le r\le\rho$. Here $\mu\in\mathcal{M}_{m_0}\cap C[t_0,t_0+\varepsilon]$ is arbitrary, and $\xi$ is the (continuous) trajectory corresponding to $(\mu,\zeta^*)$. Then, using (3.12) and (3.13) and the change of variable formula for Stieltjes integrals, noting that $\varphi$ is smooth, we get that
\[
\varphi(\rho,\xi(\rho),\mu(\rho))-\varphi(t_0,x_0,m_0)
=\int_{t_0}^\rho\bigl(\varphi_t+\varphi_xf_1(r,\xi(r),\zeta^*(r))\bigr)\,dr
+\int_{[t_0,\rho]}\bigl(\varphi_m(r,\xi(r),\mu(r))+\varphi_xf_2(r,\zeta^*(r))\bigr)\,d\mu(r)
\]
\[
\le-\int_{t_0}^\rho h_1(r,\xi(r),\zeta^*(r))\,dr-\int_{[t_0,\rho]}h_2(r,\zeta^*(r))\,d\mu(r)-3\beta(\rho-t_0)-3\beta[\mu(\rho)-m_0].
\]
Consequently, since $V^+-\varphi$ has a zero maximum at $(t_0,x_0,m_0)$ we get that
\[
V^+(\rho,\xi(\rho),\mu(\rho))+\int_{t_0}^\rho h_1(r,\xi(r),\zeta^*(r))\,dr+\int_{[t_0,\rho]}h_2(r,\zeta^*(r))\,d\mu(r)
\le V^+(t_0,x_0,m_0)-3\beta(\rho-t_0)-3\beta[\mu(\rho)-m_0]. \tag{3.16}
\]
This is true for every $\mu\in\mathcal{M}_{m_0}\cap C[t_0,\rho]$. Thus, using (3.15) we have arrived at a contradiction. Therefore, $V^+$ is a subsolution.

Next we prove that $V^+$ is a supersolution of (3.9). Let $V^+-\varphi$ achieve a strict minimum of zero at the point $(t_0,x_0,m_0)$. Again, without loss of generality we may assume that $(t_0,x_0,m_0)\in(0,T)\times R^1\times(0,1)$. We must show that
\[
\varphi_t(t_0,x_0,m_0)+H^+(0+0,t_0,x_0,\varphi_m,\varphi_x)\le0\qquad\text{at }(t_0,x_0,m_0),
\]
or, equivalently,
\[
\max\{\varphi_t+H^+(0+0,t_0,x_0,\varphi_m,\varphi_x),\ \varphi_m+H_2(t_0,\varphi_x)\}\le0\qquad\text{at }(t_0,x_0,m_0).
\]
Suppose to the contrary that there is a $\beta>0$ such that
\[
\varphi_t(t_0,x_0,m_0)+H^+(4\beta,t_0,x_0,\varphi_m,\varphi_x)\ge4\beta\qquad\text{at }(t_0,x_0,m_0).
\]
Let $\zeta\in Z[t_0,T]$ be arbitrary. We claim that
\[
\varphi_m(t_0,x_0,m_0)+\varphi_x(t_0,x_0,m_0)f_2(t_0,\zeta(t_0))+h_2(t_0,\zeta(t_0))\le4\beta. \tag{3.17}
\]
Suppose instead that
\[
\varphi_m(t_0,x_0,m_0)+\varphi_x(t_0,x_0,m_0)f_2(t_0,\zeta(t_0))+h_2(t_0,\zeta(t_0))>4\beta. \tag{3.18}
\]
In this case we let $\delta>0$ be such that $m_0+\delta\le1$ and
\[
\varphi_m(t_0,\xi(m),m)+\varphi_x(t_0,\xi(m),m)f_2(t_0,\zeta(t_0))+h_2(t_0,\zeta(t_0))>3\beta\qquad\text{if }m_0\le m\le m_0+\delta.
\]
Since $f_2$ and $h_2$ are bounded and $\varphi$ is smooth, we can choose $\delta$ to be independent of $\zeta$. Define on $[m_0,m_0+\delta]$ the trajectory $d\xi(m)/dm=f_2(t_0,\zeta(t_0))$ with initial condition $\xi(m_0)=x_0$. We obtain
\[
\frac{d}{dm}\varphi(t_0,\xi(m),m)+h_2(t_0,\zeta(t_0))>3\beta,\qquad m_0\le m\le m_0+\delta.
\]
Integrating this from $m_0$ to $m_0+\delta$,
\[
\varphi(t_0,\,x_0+\delta f_2(t_0,\zeta(t_0)),\,m_0+\delta)-\varphi(t_0,x_0,m_0)+\delta h_2(t_0,\zeta(t_0))>3\beta\delta.
\]
Since $V^+-\varphi$ has a minimum at $(t_0,x_0,m_0)$ we conclude that
\[
V^+(t_0,\,x_0+\delta f_2(t_0,\zeta(t_0)),\,m_0+\delta)+\delta h_2(t_0,\zeta(t_0))>V^+(t_0,x_0,m_0)+3\beta\delta.
\]
Since $\zeta$ was arbitrary, this is a contradiction of (3.1). Thus (3.17) must hold.

Define the strategy $\alpha[\zeta](r)=\mu(r)\equiv m_0$ on $[t_0,T]$. Assume that we are given a control $\zeta\in Z\cap C[t_0,T]$. Then, from (3.17) there exists $t_0<s\le T$ such that
\[
\varphi_m(\tau,\xi(\tau),\mu(\tau))+\varphi_x(\tau,\xi(\tau),\mu(\tau))f_2(\tau,\zeta(\tau))+h_2(\tau,\zeta(\tau))\le3\beta,\qquad t_0\le\tau\le s,
\]
where $\xi$ denotes the trajectory associated with $(\zeta,\mu)$. We claim that there exists such an $s$ independent of $\zeta$. Indeed, if not then there would be a sequence $s_j\downarrow t_0$ such that (3.18) would be true. But we have already seen that (3.18) leads to a contradiction. Consequently, we see that $\zeta(\tau)\in Z(3\beta,\tau,\varphi_m,\varphi_x)$ for all $t_0\le\tau\le s$. Using the definition of $H^+$ we have
\[
\varphi_t(\tau,\xi(\tau),\mu(\tau))+\varphi_x(\tau,\xi(\tau),\mu(\tau))f_1(\tau,\xi(\tau),\zeta(\tau))+h_1(\tau,\xi(\tau),\zeta(\tau))\ge3\beta,\qquad t_0\le\tau\le s.
\]
This readily implies (we omit the frequently used details) that
\[
V^+(s,\xi(s),\mu(s))+\int_{t_0}^s h_1(r,\xi(r),\zeta(r))\,dr+\int_{[t_0,s]}h_2(r,\zeta(r))\,d\mu(r)\ge V^+(t_0,x_0,m_0)+3\beta(s-t_0).
\]
Now, $\zeta$ was assumed continuous. But this inequality will hold for any $\zeta\in Z[t_0,s]$ since $s$ was independent of $\zeta$ and $\alpha$ is identically $m_0$ and so is also independent of $\zeta$. Consequently, we have found a strategy $\alpha\in\mathcal{M}_{m_0}[t_0,T]$ such that for all $\zeta\in Z[t_0,T]$ the previous inequality holds. But this contradicts (3.2). Thus, $V^+$ is also a supersolution. This completes the proof. $\blacksquare$

Next we prove that there is exactly one continuous viscosity solution of (3.10) satisfying the terminal condition (TC) and boundary condition (BC). We state the uniqueness result in the form of a comparison principle.

THEOREM 3.6. Let $u$ be a continuous viscosity subsolution and $v$ a continuous viscosity supersolution of (3.10), both satisfying the conditions (TC), (BC). Then $u\le v$ on $\bar Q$.
Proof. Assume that $(u-v)(t_0,x_0,m_0)=\max(u-v)>0$. Let $\beta>\gamma>0$ satisfy
\[
u(t_0,x_0,m_0)-\gamma>v(t_0,x_0,m_0)+\beta.
\]
Define $\tilde u(t,x,m)=u(t,x,m)-(\gamma/m)-(\gamma/t)$ and $\tilde v(t,x,m)=v(t,x,m)+(\beta/m)+(\beta/t)$. Then it is straightforward to check that $\tilde u$ is a subsolution of
\[
\tilde u_t+H^+\Bigl(\frac{\gamma}{m^2}-0,\,t,x,\tilde u_m,\tilde u_x\Bigr)-\frac{\gamma}{t^2}=0,
\]
and $\tilde v$ is a supersolution of
\[
\tilde v_t+H^+\Bigl(-\frac{\beta}{m^2}+0,\,t,x,\tilde v_m,\tilde v_x\Bigr)+\frac{\beta}{t^2}=0.
\]
Set $w(t,x,m,y,n)=\tilde u(t,x,m)-\tilde v(t,y,n)$. Then $w$ is a subsolution of
\[
w_t+H^+\Bigl(\frac{\gamma}{m^2}-0,\,t,x,w_m,w_x\Bigr)-H^+\Bigl(-\frac{\beta}{n^2}+0,\,t,y,-w_n,-w_y\Bigr)-\frac{\beta+\gamma}{t^2}=0. \tag{3.19}
\]
Let $\varepsilon>0$ and consider the function
\[
f_\varepsilon(t,x,m,y,n)=w(t,x,m,y,n)-\frac{1}{\varepsilon}|x-y|^2-\frac{1}{\varepsilon}|m-n|^2.
\]
Assume that this function achieves its maximum at a point $(t_\varepsilon,x_\varepsilon,m_\varepsilon,y_\varepsilon,n_\varepsilon)$. Then it will follow from the continuity of $u$ and $v$, more generally from the upper (lower) semicontinuity of $u$ ($v$), that
\[
\frac{1}{\varepsilon}|x_\varepsilon-y_\varepsilon|^2\to0,\qquad\frac{1}{\varepsilon}|m_\varepsilon-n_\varepsilon|^2\to0\qquad\text{as }\varepsilon\to0, \tag{3.20}
\]
and
\[
w(t_\varepsilon,x_\varepsilon,m_\varepsilon,y_\varepsilon,n_\varepsilon)\to\max w\qquad\text{as }\varepsilon\to0.
\]
Now, we may assume that $\max(\tilde u-\tilde v)>0$. It is clear that we will have $0<t_\varepsilon<T$ and $0<m_\varepsilon,n_\varepsilon<1$ for $\varepsilon$ sufficiently small. The subsolution property for (3.19) at the maximum point then gives
\[
0\le H^+\Bigl(\frac{\gamma}{m_\varepsilon^2}-0,\,t_\varepsilon,x_\varepsilon,q_m,q_x\Bigr)-H^+\Bigl(-\frac{\beta}{n_\varepsilon^2}+0,\,t_\varepsilon,y_\varepsilon,-p_n,-p_y\Bigr)-\frac{\beta+\gamma}{t_\varepsilon^2}, \tag{3.21}
\]
where $(q_m,q_x)$ and $(p_n,p_y)$ denote the derivatives of the penalty term with respect to $(m,x)$ and $(n,y)$. Notice that $q_m=-p_n$ and $q_x=-p_y$ at $(t_\varepsilon,x_\varepsilon,m_\varepsilon,y_\varepsilon,n_\varepsilon)$. Furthermore, since $H^+$ is nonincreasing in its first argument, for every $(q_m,q_x)\in R^2$ we have that
\[
H^+\Bigl(\frac{\gamma}{m_\varepsilon^2}-0,\,t_\varepsilon,x_\varepsilon,q_m,q_x\Bigr)\le H^+(0,t_\varepsilon,x_\varepsilon,q_m,q_x) \tag{3.22}
\]
and
\[
H^+\Bigl(-\frac{\beta}{n_\varepsilon^2}+0,\,t_\varepsilon,y_\varepsilon,p_m,p_x\Bigr)\ge H^+(0,t_\varepsilon,y_\varepsilon,p_m,p_x). \tag{3.23}
\]
Combining (3.21)-(3.23) we see that
\[
0\le H^+(0,t_\varepsilon,x_\varepsilon,q_m,q_x)-H^+(0,t_\varepsilon,y_\varepsilon,q_m,q_x)-\frac{\beta+\gamma}{t_\varepsilon^2}
\le K|x_\varepsilon-y_\varepsilon|-\frac{\beta+\gamma}{t_\varepsilon^2}. \tag{3.24}
\]
Since $\beta>\gamma>0$ we can choose $\varepsilon$ sufficiently small so that, using (3.20), the last part of (3.24) is nonpositive. This is a contradiction, so we conclude that $u\le v$.

The only gap we need to close is the fact that the maximum in the proof may not be achieved due to the fact that $x$ is not known to be in a bounded set. We can fix this in the following way. Let $R>0$ and $\kappa\in C^1(R^1)$ be a function with $0\le\kappa'(r)\le1$, $\kappa(r)=0$ if $r\le R$, and $\kappa(r)\to\infty$ as $r\to\infty$. Then we modify the definitions of $\tilde u$ and $\tilde v$ as follows:
\[
\tilde u(t,x,m)=u(t,x,m)-\gamma\kappa(|x|)-K\gamma(T-t)-\frac{\gamma}{m}-\frac{\gamma}{t},
\]
\[
\tilde v(t,x,m)=v(t,x,m)+\beta\kappa(|x|)+K\beta(T-t)+\frac{\beta}{m}+\frac{\beta}{t}.
\]
The proof continues as before with minor modifications. This completes the proof of the theorem. $\blacksquare$
Remark. It is not hard to show that $V^+(t,x,m)=\lim_{B\to\infty}W^B(t,x,m)$, where $W^B(t,x,m)$ is the viscosity solution of
\[
W^B_t(t,x,m)+\min_{z\in Z}\bigl\{W^B_x(t,x,m)f_1(t,x,z)+h_1(t,x,z)+B\bigl[W^B_m(t,x,m)+W^B_x(t,x,m)f_2(t,z)+h_2(t,z)\bigr]^+\bigr\}=0.
\]
In fact, the proof follows from [25] by establishing that the corresponding Hamiltonian for $W^B(t,x,m)$ converges appropriately to the Hamiltonian for $V^+$. Also, $W^B(t,x,m)$ satisfies the same terminal and boundary conditions as does $V^+(t,x,m)$. Notice that $W^B(t,x,m)$ is the upper value of the differential game in which the functions $\mu$ are absolutely continuous and $0\le d\mu/dr\le B$.

Now we will state the results for the lower value. To do so we need the definition of the lower Hamiltonian.
Let
\[
\Gamma(t,x)=\overline{\mathrm{co}}\,\{(f_1(t,x,z),\,h_1(t,x,z),\,f_2(t,z),\,h_2(t,z)):\ z\in Z\}
\]
and
\[
\hat Z(a,t,x,p_m,p_x)=\{(\xi_1,\eta_1,\xi_2,\eta_2)\in\Gamma(t,x):\ p_m+p_x\xi_2+\eta_2\le a\}.
\]
The notation $\overline{\mathrm{co}}(A)$ denotes the closed convex hull of the set $A$. The lower Hamiltonian $H^-$ is defined by
\[
H^-(a,t,x,p_m,p_x)=\min\{p_x\xi_1+\eta_1:\ (\xi_1,\eta_1,\xi_2,\eta_2)\in\hat Z(a,t,x,p_m,p_x)\}, \tag{3.25}
\]
and $H^-(a,t,x,p_m,p_x)=+\infty$ if $\hat Z(a,t,x,p_m,p_x)=\emptyset$.

THEOREM 3.7. The lower value function $V^-(t,x,m)$ is the unique continuous viscosity solution of
\[
V^-_t+H^-(0,t,x,V^-_m,V^-_x)=0,
\]
and $V^-$ satisfies the terminal condition (TC) and boundary condition (BC).

We will leave the proof of this theorem for the reader. We note, however, the following lemma which will explain the origin of the lower Hamiltonian.

LEMMA 3.8. Fix $(t,x)\in(0,T)\times R^1$ and $(b,p_m,p_x)\in R^3$. Then
\[
\max_{\lambda\ge0}\ \min_{z\in Z}\ \bigl[p_xf_1(t,x,z)+h_1(t,x,z)+\lambda(p_m+p_xf_2(t,z)+h_2(t,z)-b)\bigr] \tag{3.26}
\]
\[
=H^-(b,t,x,p_m,p_x). \tag{3.27}
\]

Remark. The left side of (3.27) arises from considering the lower differential game with unbounded maximizing control $\alpha$ as in (2.27), (2.28).

Proof of lemma 3.8. If $\hat Z(b,t,x,p_m,p_x)=\emptyset$ it is clear that (3.27) trivially holds, so we assume this set is not empty. Since the minimum of a linear function over a set $A$ coincides with its minimum over $\overline{\mathrm{co}}(A)$, we know that
\[
\min_{z\in Z}\bigl[p_xf_1(t,x,z)+h_1(t,x,z)+\lambda(p_m+p_xf_2(t,z)+h_2(t,z)-b)\bigr]
\]
\[
=\min\bigl\{p_x\xi_1+\eta_1+\lambda(p_m+p_x\xi_2+\eta_2-b):\ (\xi_1,\eta_1,\xi_2,\eta_2)\in\Gamma(t,x)\bigr\}. \tag{3.28}
\]
The function
\[
(\lambda,(\xi_1,\eta_1,\xi_2,\eta_2))\mapsto p_x\xi_1+\eta_1+\lambda(p_m+p_x\xi_2+\eta_2-b)
\]
is certainly concave-convex. Therefore, we may apply the minimax theorem (for example, [16, theorem 49.A]) to see that
\[
\max_{\lambda\ge0}\ \min_{z\in Z}\bigl[p_xf_1(t,x,z)+h_1(t,x,z)+\lambda(p_m+p_xf_2(t,z)+h_2(t,z)-b)\bigr]
\]
\[
=\max_{\lambda\ge0}\ \min\bigl\{p_x\xi_1+\eta_1+\lambda(p_m+p_x\xi_2+\eta_2-b):\ (\xi_1,\eta_1,\xi_2,\eta_2)\in\Gamma(t,x)\bigr\}
\]
\[
=\min\bigl\{p_x\xi_1+\eta_1:\ (\xi_1,\eta_1,\xi_2,\eta_2)\in\Gamma(t,x),\ p_m+p_x\xi_2+\eta_2\le b\bigr\}
=H^-(b,t,x,p_m,p_x). \tag{3.29}
\]
$\blacksquare$

The next result follows from the uniqueness property.
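The convexification in lemma 3.8 is what makes the Lagrangian max-min close the gap to the constrained minimum. A brute-force sketch on hypothetical two-point data (not from the paper): for each generator we record only the objective value $q=p_x\xi_1+\eta_1$ and the constraint slack $c=p_m+p_x\xi_2+\eta_2-b$.

```python
# Hypothetical generators of Gamma(t,x), reduced to (objective q, slack c):
P = [(0.0, 1.0),   # cheap but infeasible (c > 0)
     (4.0, -1.0)]  # expensive but feasible (c < 0)

T = [i / 1000 for i in range(1001)]   # mixing weights for the convex hull
hull = [((1 - t) * P[0][0] + t * P[1][0],
         (1 - t) * P[0][1] + t * P[1][1]) for t in T]

lams = [i * 0.01 for i in range(1001)]   # lambda in [0, 10]

# max-min of the Lagrangian over the hull (left side of (3.27)) ...
dual = max(min(q + lam * c for q, c in hull) for lam in lams)
# ... against the constrained minimum over the hull (H^-(b, ...)):
primal = min(q for q, c in hull if c <= 0.0)

# Without the convex hull the two sides differ:
raw_dual = max(min(q + lam * c for q, c in P) for lam in lams)
raw_primal = min(q for q, c in P if c <= 0.0)

print(dual, primal)          # both are 2.0 up to grid resolution
print(raw_dual, raw_primal)  # about 2.0 versus 4.0: a duality gap
```

On the hull both sides equal $2$ (the saddle point sits at $\lambda=2$ and the midpoint of the two generators), while over the raw two-point set the max-min is $2$ but the constrained minimum is $4$: without taking $\overline{\mathrm{co}}$ in the definition of $\Gamma(t,x)$, equality (3.27) would fail.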
COROLLARY 3.9. If
\[
H^+(0,t,x,p_m,p_x)=H^-(0,t,x,p_m,p_x) \tag{3.30}
\]
then the differential game associated with (2.1)-(2.4) has value; i.e. $V^+=V^-$. If the payoff has a terminal cost $g(\xi(T))$ as well, with $g$ Lipschitz continuous and bounded, and if, in addition to (3.30),
\[
\max_{m\le a\le1}\ \min_{z\in Z}\ \{g(x+f_2(T,z)(a-m))+h_2(T,z)(a-m)\}
=\min_{z\in Z}\ \max_{m\le a\le1}\ \{g(x+f_2(T,z)(a-m))+h_2(T,z)(a-m)\},
\]
then this differential game has value.

Remark. When the differential game has a terminal cost and the maximum player can jump, it makes a difference which player has the last move.
We conclude this paper with the following special case of the results of this section. We begin by noting that all of the results can be extended to the case $h_2=h_2(t,x,z)$ if $f_2\equiv0$. Take $f_2=h_1=0$, and assume that $h_2\ge0$. Thus, the trajectory is continuous and we only have a cost against the measures. Then, problem (3.10) for the upper value, $V^+(t,x,m)$, becomes
\[
V^+_t+\min\{V^+_xf_1(t,x,z):\ z\in Z\ \text{such that}\ V^+_m+h_2(t,x,z)\le0\}=0,\qquad(t,x,m)\in Q,
\]
with
\[
V^+(T,x,m)=(1-m)\min_{z\in Z}h_2(T,x,z),\qquad V^+(t,x,1)=\gamma(t,x).
\]
It is straightforward to verify that $V^+(t,x,m)=(1-m)W(t,x)$ if $(t,x,m)\in(0,T]\times R^1\times(0,1)$, where $W$ is the unique viscosity solution of
\[
W_t+\min\{W_xf_1(t,x,z):\ z\in Z\ \text{such that}\ h_2(t,x,z)\le W(t,x)\}=0,\qquad(t,x)\in(0,T)\times R^1,
\]
with
\[
W(T,x)=\min_{z\in Z}h_2(T,x,z).
\]
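The verification of the ansatz $V^+=(1-m)W$ is a routine substitution; a sketch:

```latex
% Substituting V^+(t,x,m) = (1-m)W(t,x) gives
%   V^+_t = (1-m)W_t,   V^+_x = (1-m)W_x,   V^+_m = -W,
% so the constraint V^+_m + h_2(t,x,z) \le 0 becomes h_2(t,x,z) \le W(t,x),
% and, since 1-m > 0, the equation factors:
\[
V^+_t + \min\{V^+_x f_1 :\ V^+_m + h_2 \le 0\}
  = (1-m)\Bigl[\,W_t + \min\{W_x f_1(t,x,z) :\ h_2(t,x,z) \le W(t,x)\}\Bigr] = 0,
\]
% which is exactly the equation for W, while at t = T
\[
V^+(T,x,m) = (1-m)\min_{z\in Z} h_2(T,x,z) = (1-m)W(T,x).
\]
```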
But it was established in [3] that $W(t,x)$ is the value function for the problem of minimizing the maximum cost, $W(t,x)=\inf_{\zeta\in Z[t,T]}\max_{t\le\tau\le T}h_2(\tau,\xi(\tau),\zeta(\tau))$. To understand the connection between this minimax problem and the problem with measures simply recall the basic fact that the $L^\infty$ norm of a function $f(x)$ is the norm of the functional on $L^1$, $T(g)=\int f(x)\cdot g(x)\,dx$. The results of this paper, therefore, generalize the main result of [3]. Furthermore, we have shown that $V^+_m=-W$. This is connected to a result of Karatzas [17].

REFERENCES

1. CRANDALL M. G. & LIONS P.-L., Viscosity solutions of Hamilton-Jacobi equations, Trans. Am. math. Soc. 277, 1-42 (1983).
2. BARRON E. N., Differential games with maximum cost, Nonlinear Analysis 14, 971-989 (1990).
3. BARRON E. N. & ISHII H., The Bellman equation for minimizing the maximum cost, Nonlinear Analysis 13, 1067-1090 (1989).
4. MURRAY J. M., Existence theorems for optimal control and calculus of variations problems where the states can jump, SIAM J. Control Optim. 24, 412-438 (1986).
5. VINTER R. & PEREIRA F., A maximum principle for optimal processes with discontinuous trajectories, SIAM J. Control Optim. 26, 205-229 (1988).
6. BARRON E. N. & JENSEN R., The Pontryagin maximum principle from dynamic programming and viscosity solutions to first-order partial differential equations, Trans. Am. math. Soc. 298, 635-641 (1986).
7. BARLES G., An approach of deterministic unbounded control problems and of first order Hamilton-Jacobi equations with gradient constraints, Ph.D. Thesis, Université de Paris IX-Dauphine, pp. 2-31 (1989).
8. BARRON E. N., Viscosity solutions for the monotone control problem, SIAM J. Control Optim. 23, 161-171 (1985).
9. CHOW P. L., MENALDI J. L. & ROBIN M., Additive control of stochastic linear systems with finite horizon, SIAM J. Control Optim. 23, 858-899 (1985).
10. CAPUZZO DOLCETTA I. & EVANS L. C., Optimal switching for ordinary differential equations, SIAM J. Control Optim. 22, 143-161 (1984).
11. MENALDI J. L. & ROBIN M., On some cheap control problems for diffusion processes, Trans. Am. math. Soc. 278, 771-802 (1983).
12. MENALDI J. L. & ROBIN M., On singular control problems for diffusions with jumps, IEEE Trans. Autom. Control AC-29, 991-1004 (1984).
13. RISHEL R., An extended Pontryagin principle for control systems whose control laws contain measures, SIAM J. Control 3, 191-205 (1965).
14. SCHMAEDEKE W. W., Optimal control theory for nonlinear vector differential equations containing measures, SIAM J. Control 3, 231-280 (1965).
15. SUN M., Monotone control of a damped oscillator under random perturbations, IMA J. math. Control Inf. 5, 169-186 (1988).
16. ZEIDLER E., Nonlinear Functional Analysis and its Applications. III: Variational Methods and Optimization. Springer, New York (1985).
17. KARATZAS I., Probabilistic aspects of finite-fuel stochastic control, Proc. Natl. Acad. Sci. USA 82, 5579-5581 (1985).
18. BARLES G., Deterministic impulse control problems, SIAM J. Control Optim. 23, 419-432 (1985).
19. ISHII H., Perron's method for Hamilton-Jacobi equations, Duke math. J. 55, 369-384 (1987).
20. ISHII H., Hamilton-Jacobi equations with discontinuous Hamiltonians on arbitrary open sets, Bull. Fac. Sci. Eng. Chuo Univ. 28, 33-77 (1985).
21. CRANDALL M. G., EVANS L. C. & LIONS P.-L., Some properties of viscosity solutions of Hamilton-Jacobi equations, Trans. Am. math. Soc. 282, 487-502 (1984).
22. BARLES G., Uniqueness and regularity results for first order Hamilton-Jacobi equations, Indiana Univ. math. J. 39, 443-466 (1990).
23. BARLES G., Existence results for first order Hamilton-Jacobi equations, Ann. Inst. H. Poincaré Analyse non linéaire 1, 325-340 (1984).
24. CRANDALL M. G., ISHII H. & LIONS P.-L., Uniqueness of viscosity solutions revisited, J. Math. Soc. Japan 39, 581-596 (1987).
25. BARLES G. & PERTHAME B., Discontinuous solutions of deterministic optimal stopping time problems, Math. Modelling numer. Analysis 21, 557-579 (1987).
26. ELLIOTT R. J., Viscosity Solutions and Optimal Control. Pitman, London (1986).