MAXIMUM PRINCIPLE FOR DIFFERENTIAL INCLUSIONS (NECESSITY)*

R. P. FEDORENKO

Moscow

(Received 3 December 1970, revised version 22 December 1970)
A PROOF of the maximum principle for differential inclusions, with weaker assumptions about the Bellman function than in the previous paper by the author, is given.

Below, as in [1], the time-optimal response problem is considered for a differential inclusion

$$\dot{x} \in K(x). \tag{1}$$

In [1] the maximum principle was obtained with some assumptions about the Bellman function $\omega(x)$ of this problem. Here a proof of the principle with considerably weaker assumptions about $\omega(x)$ will be given. At the same time somewhat stricter assumptions about the function $Q(x, \psi)$ will be made.
We recall that

$$Q(x, \psi) = \max_{k \in K(x)} (k, \psi).$$

Therefore, the following assumptions about $K(x)$ will be made below:

(1) $K(x + \delta x) \subset K(x, C\|\delta x\|)$, where $K(x, \delta)$ is an extension of $K(x)$ by spheres of radius $\delta$ (continuity of $K(x)$);

(2) $Q(x, \psi)$ is differentiable with respect to $x$, and

$$\|Q_x(x + \delta x, \psi) - Q_x(x, \psi)\| \le C_1\|\delta x\|, \quad \|Q_x(x, \psi + \delta\psi) - Q_x(x, \psi)\| \le C_2\|\delta\psi\|, \quad \|Q_x(x, \psi)\| \le C_3;$$

the constants $C$, $C_1$, $C_2$ and $C_3$ are independent of $x$, $\psi$;

(3) $K(x)$ is a convex bounded closed region for all $x$.

We will require the following lemma below.

*Zh. vychisl. Mat. mat. Fiz., 11, 4, 885-893, 1971.
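As an illustration of the function $Q(x, \psi) = \max_{k \in K(x)} (k, \psi)$, the following minimal sketch takes $K(x)$ to be a disc of fixed radius around a centre $c(x)$. The disc, the map `center` and the constant `RADIUS` are assumptions chosen for the example and do not come from the paper. For such a set the maximum has the closed form $(c(x), \psi) + r\|\psi\|$, which the sketch compares with a brute-force maximum over boundary points.

```python
import math

RADIUS = 1.0  # hypothetical radius of the disc K(x)


def center(x):
    # Hypothetical smooth centre map c(x); an assumption for illustration only.
    return (math.sin(x[0]), math.cos(x[1]))


def support(x, psi):
    """Closed form of Q(x, psi) = max over k in K(x) of (k, psi) for the disc K(x)."""
    c = center(x)
    return c[0] * psi[0] + c[1] * psi[1] + RADIUS * math.hypot(psi[0], psi[1])


def support_sampled(x, psi, n=3600):
    """Brute-force the same maximum over n boundary points of K(x)."""
    c = center(x)
    best = -math.inf
    for i in range(n):
        a = 2.0 * math.pi * i / n
        k = (c[0] + RADIUS * math.cos(a), c[1] + RADIUS * math.sin(a))
        best = max(best, k[0] * psi[0] + k[1] * psi[1])
    return best
```

Because $c(x)$ is smooth and bounded here, this particular $K(x)$ is convex, closed and bounded, and $Q$ is differentiable in $x$, so it is one simple instance in which assumptions of the kind (1) - (3) hold.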
Lemma 1

Let conditions (1) - (3) be satisfied. Let $k \in K(x)$. Then constants $C_4$, $C_5$ exist such that in the domain

$$W = K(x + \delta x) \cap \{w : (w, \psi) \ge (k, \psi) + Q_x(x, \psi)\,\delta x - C_4\|\delta x\|^2\}$$

a point $w$ can be found for which $\|k - w\|_\psi \le C_5\|\delta x\|$ (here $\|\cdot\|_\psi$ is the norm of the projection of the vector on the hyperplane orthogonal to $\psi$).

Proof.
We first establish that $C_4$ can be so chosen that $W \ne \emptyset$. Indeed, the point $k(x + \delta x, \psi) \in W$ (we recall that $(k(x + \delta x, \psi), \psi) = Q(x + \delta x, \psi)$). This follows from the estimate

$$Q(x + \delta x, \psi) \ge (k, \psi) + Q_x(x, \psi)\,\delta x - C_1\|\delta x\|^2,$$

and consequently $k(x + \delta x, \psi) \in W$ for $C_4 > C_1$.

We first consider the case $Q_x\delta x \ge 0$ (see Fig. 1), taking, for example, $C_4 = 2C_1$. We define the sphere $S$: $\|s - k\| \le C\|\delta x\|$. If $W \cap S \ne \emptyset$, the lemma is proved for $C_5 > C$. Therefore in what follows we will consider the case $W \cap S = \emptyset$; let $w \in W$ be the point in $W$ closest to $k$ (in the sense of the norm $\|\cdot\|_\psi$).

In Fig. 1 the situation is represented in the projection on the plane spanned by the vectors $\psi$ and $(w - k)$. We write $r(\delta x) = \|w - k\|_\psi$; let $A > C$ be some number which we determine more precisely later. If $r(\delta x) \le A\|\delta x\|$, the lemma is proved for $C_5 > A$. Hence in what follows we consider only the case $r(\delta x) > A\|\delta x\|$.

FIG. 1.

The projection of $W$ is situated in the region shaded in Fig. 1. It is easy to establish the following facts: the point $w$ closest to $k$ is projected into the lower left corner point of the region shaded in Fig. 1; there cannot be points of the projection of $W$ outside this region.
Indeed, in $S$ there exists a point $s^* \in K(x + \delta x)$ (because of the assumption (1)); for this point $(s^*, \psi) < (k, \psi) + Q_x\delta x - C_4\|\delta x\|^2$. If the nearest point were the point $w'$, then, because of the convexity of $K(x + \delta x)$, the segment $[w', s^*]$ would lie entirely within $K(x + \delta x)$ and $w'$ would not be the point of $W$ closest to $k$.

Moreover, $w$ cannot be an interior point of $K(x + \delta x)$; this is obvious. Also the projection of $w$ onto the plane of the vectors $\psi$ and $w - k$ cannot be an interior point of the projection of $K(x + \delta x)$ either. This follows from the convexity of $K(x + \delta x)$ and from the fact that $w$ is the point closest to $k$.

We consider the vector $\psi' = \alpha\psi + \beta(w - k)$. We take $\alpha$ and $\beta$ such that the projection of this vector will be a support to the projection of $K(x + \delta x)$ at the point $w$ and $\|\psi'\| = \|\psi\| = 1$. Since the projection of $\psi'$ is identical with $\psi'$, and for such a $\psi'$

$$\max_{k \in K(x + \delta x)} (k, \psi') = \max_{k \in K^*(x + \delta x)} (k, \psi')$$

(here $K^*$ is the projection of $K$), $\psi'$ also defines a hyperplane support to $K(x + \delta x)$ at the point $w \in K(x + \delta x)$. The projection of this hyperplane is obviously the line 3. Let $\psi'$ make with $\psi$ the angle $\varphi$ (see Fig. 1). We first establish that we may restrict ourselves exclusively to angles $\varphi < \pi/4$; indeed, in a $C\|\delta x\|$-neighbourhood of the point $k$ there exists the point $s^* \in K(x + \delta x)$; it must lie below the straight line 3 (Fig. 1), from which it easily follows that
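$$\tan\varphi \le \frac{C\|\delta x\| + \|Q_x\|\,\|\delta x\|}{A\|\delta x\| - C\|\delta x\|} = \frac{C + \|Q_x\|}{A - C},$$

and $A$ can be so chosen that $\varphi < \pi/4$. Then for the situation shown in Fig. 1, $r = r' + r''$; also, $r' = H \operatorname{ctg}\varphi = [Q_x\delta x - C_4\|\delta x\|^2]\operatorname{ctg}\varphi$. For $r''$ we have the expression

$$r'' = \frac{1}{\sin\varphi}(k - w, \psi') = \frac{1}{\sin\varphi}[(k, \psi') - (w, \psi')] = \frac{1}{\sin\varphi}[(k, \psi') - Q(x + \delta x, \psi')] \le \frac{1}{\sin\varphi}[Q(x, \psi') - Q(x + \delta x, \psi')],$$

so that

$$r \le \frac{1}{\sin\varphi}\{[Q_x\delta x - C_4\|\delta x\|^2]\cos\varphi + Q(x, \psi') - Q(x + \delta x, \psi')\}.$$

We transform the expression

$$Q(x, \psi') - Q(x + \delta x, \psi') = Q(x, \psi') - Q(x, \psi') - Q_x(x, \psi')\,\delta x + O(\|\delta x\|^2) \le -Q_x(x, \psi')\,\delta x + C_1\|\delta x\|^2.$$

Then (putting $\psi' = \psi + \delta\psi$) we have

$$r \le \frac{1}{\sin\varphi}\{-C_4\|\delta x\|^2\cos\varphi + (\cos\varphi - 1)\,Q_x(x, \psi)\,\delta x + C_2\|\delta\psi\|\,\|\delta x\| + C_1\|\delta x\|^2\}.$$

Since $\cos\varphi > \cos(\pi/4)$ for the angles $\varphi$ considered, we may consider $C_4$ to be so chosen that $C_4\cos\varphi > C_1$; the ratios $\|\delta\psi\| / \sin\varphi$ and $|\cos\varphi - 1| / \sin\varphi$ are uniformly bounded, and therefore

$$r(\delta x) \le C_5\|\delta x\|,$$

which is what it was required to prove.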
Remark. The case was considered where the straight line 3 intersects the straight line 1 on the right of the point $k$; if this intersection lies on the left of $k$, we have $r = r' - r''$, the same expression is valid for $r'$, and $r'' = (1 / \sin\varphi)(\psi', w - k)$, and we arrive at the same estimate for $r$.

To complete the proof of the lemma it is necessary to consider the case $Q_x\delta x < 0$. In $S$ there exists $s^* \in K(x + \delta x)$. If $(s^*, \psi) \ge (k, \psi) + Q_x(x, \psi)\,\delta x - C_4\|\delta x\|^2$, the lemma is proved with $C_5 > C$; we will consider only the case where all the points of $S \cap K(x + \delta x)$ are situated below the level 2 (see Fig. 2). Then the point $w \in W$ for which $\|w - k\|_\psi$ is a minimum occupies the corner position, by the same reasoning as in the case $Q_x\delta x > 0$. The remainder of the construction is the same as in the case already considered, and for $r(\delta x)$ we obtain the expression $r = r'' - r'$,

$$r' = -[Q_x(x, \psi)\,\delta x - C_4\|\delta x\|^2]\operatorname{ctg}\varphi.$$

After this the proof is reduced to the same estimate as before.

FIG. 2.

Note. After fixing $C_4 > C_1/2$ beforehand, it would be more precise to distinguish the situations not by the sign of $Q_x\delta x$, but by the sign of $Q_x\delta x - C_4\|\delta x\|^2$; it is obvious that nothing need be changed in the proof given above because of this refinement.

Therefore, the lemma is proved.
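The two inequalities of the lemma can be checked numerically in a special case. The sketch below assumes that $K(x)$ is the unit disc translated to a centre $c(x) = (\sin x_1, \cos x_2)$; this family, and the names `center`, `lemma_witness` and `check_lemma`, are hypothetical and serve only as an illustration. For translated sets the point $w = k + c(x + \delta x) - c(x)$ lies in $K(x + \delta x)$, and a Taylor estimate suggests that $C_4 = 1/2$, $C_5 = 1$ suffice for unit $\psi$.

```python
import math


def center(x):
    # Hypothetical centre map; K(x) is the unit disc around center(x).
    return (math.sin(x[0]), math.cos(x[1]))


def lemma_witness(x, dx, k):
    """Shift k in K(x) by the displacement of the centre to get w in K(x + dx)."""
    c0 = center(x)
    c1 = center((x[0] + dx[0], x[1] + dx[1]))
    return (k[0] + c1[0] - c0[0], k[1] + c1[1] - c0[1])


def qx_dx(x, dx, psi):
    """Q_x(x, psi) dx for Q(x, psi) = (center(x), psi) + |psi|."""
    return psi[0] * math.cos(x[0]) * dx[0] - psi[1] * math.sin(x[1]) * dx[1]


def check_lemma(x, dx, k, psi, c4=0.5, c5=1.0):
    """True if w satisfies both inequalities of the lemma (psi a unit vector)."""
    w = lemma_witness(x, dx, k)
    n2 = dx[0] ** 2 + dx[1] ** 2
    level_ok = (w[0] * psi[0] + w[1] * psi[1]
                >= k[0] * psi[0] + k[1] * psi[1]
                + qx_dx(x, dx, psi) - c4 * n2 - 1e-12)
    # The full Euclidean distance bounds the projected norm used in the lemma.
    close_ok = math.hypot(w[0] - k[0], w[1] - k[1]) <= c5 * math.sqrt(n2) + 1e-12
    return level_ok and close_ok
```

The general lemma is harder precisely because $K(x + \delta x)$ need not be a translate of $K(x)$, so no such explicit witness is available and the geometric argument of the proof is required.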
We now pass to a proof of the necessity of the maximum principle. Let $x(t)$ be the trajectory of the differential inclusion (1), joining the point $x_0$ to the point $x_1$ after a time $T$, that is, $x(T) = x_1$; let this trajectory be time-optimal. We also suppose that the trajectory $x(t)$ is such that $\dot{x}(t)$ is a piecewise-continuous function. With respect to the Bellman function $\omega(x)$ we only assume that at the point $x_1$ the surface of level $\omega(x) = T$ ($\omega(x_1) = T$) has a supporting hyperplane with the normal $\psi_1$. More precisely, we suppose the following: let the hyperplane $\Gamma_T$ be defined by the equation

$$\Gamma_T : \{x : (x - x_1, \psi_1) = 0\}.$$

Then any trajectory of the inclusion $\dot{y} \in K(y)$, connecting $x_0$ with the hyperplane $\Gamma_T$ and passing through a $\Delta$-neighbourhood of the trajectory $x(t)$, reaches $\Gamma_T$ after a time $T' \ge T - \eta(\Delta)$, where $\eta(\Delta) = o(\Delta)$.

Theorem

Let $K(x)$ satisfy the conditions (1) - (3). Let $x(t)$ be the time-optimal trajectory of the inclusion (1), connecting $x_0$ to $x_1$ after a time $T$; let $\omega(x)$ have at the point $x_1$ a hyperplane support with the normal $\psi_1$ (in the sense explained above) and $(\dot{x}(T), \psi_1) > 0$. Then the trajectory $x(t)$ satisfies the maximum principle, that is, along the trajectory $x(t)$ a vector $\psi(t)$ is defined which satisfies equation (2), where $\|\psi(t)\| = 1$ and

$$(\dot{x}(t), \psi(t)) = Q(x(t), \psi(t)) = \max_{k \in K(x(t))} (k, \psi(t)). \tag{3}$$
The scheme of the proof is as follows: first, along the trajectory $x(t)$ a vector $\psi(t)$ is determined from equation (2) by integration "backwards" from $T$ to $0$. Then, if (3) turns out to be infringed, a trajectory $y(t)$ is constructed which passes in a $\Delta$-neighbourhood of the trajectory $x(t)$ and connects $x_0$ with $\Gamma_T$ after a time $T'' < T - c\Delta$, $c > 0$; here $\Delta > 0$ is an arbitrarily small number, and this contradicts the assumption made.

Therefore, the vector $\psi(t)$ is obtained by integration of equation (2) along $x(t)$; the assumption (2) ensures the existence and uniqueness of $\psi(t)$. We introduce the function $R(t) = (\dot{x}(t), \psi(t))$. Let $R(t) = Q(t) = Q(x(t), \psi(t))$ for $t^* \le t \le T$, that is, the maximum principle is satisfied on the segment $[t^*, T]$, and $R(t) < Q(t)$ for $t < t^*$. We consider several different situations.
1. Let $t^*$ be a point of discontinuity of the function $R(t)$, that is, on some segment $[t^* - \Delta, t^*]$ we have $R(t) \le Q(t) - a$, $a > 0$. In this case we construct the trajectory $y(t)$ as follows: in the interval $[0, t^* - \Delta]$ we put $y(t) = x(t)$; in the interval $[t^* - \Delta, t^*]$ the trajectory $y(t)$ is constructed as the solution of the equation

$$\dot{y} = k(y, \psi(t^*)), \quad y(t^* - \Delta) = x(t^* - \Delta).$$

(The vector $k(y, \psi)$ is not unique, hence it would be more precise to speak of any trajectory of the inclusion $\dot{y} \in k(y, \psi(t^*))$, where $k(y, \psi)$ is the set of vectors $\{k : (k, \psi) = Q(y, \psi),\ k \in K(y)\}$.)

Here $\Delta > 0$ is a small number, and $k(y, \psi)$ is any of the vectors $k \in K(y)$ for which $(k, \psi) = Q(y, \psi)$. We determine the hyperplane

$$\Gamma_{t^*} : \{x : (x - x(t^*), \psi(t^*)) = 0\}.$$

Let $(x(t^* - \Delta) - x(t^*), \psi(t^*)) < 0$, that is, at the point $t^*$ the trajectory $x(t)$ intersects $\Gamma_{t^*}$ "from below upwards" (the case of opposite sign will be considered separately). We show that in this situation the trajectory $y(t)$ reaches the hyperplane $\Gamma_{t^*}$ after a time $\gamma\Delta$, $\gamma < 1$. This is almost obvious, since because of continuity

$$|(\dot{x}(t), \psi(t^*)) - R(t)| \le O(\Delta) \quad \text{at } [t^* - \Delta, t^*],$$

and the quantity $(\dot{x}, \psi(t^*))$ is the velocity of motion of the projection of the point $x(t)$ on the normal to $\Gamma_{t^*}$, and by hypothesis $R(t) \le Q(t) - a$. This velocity of the trajectory $x(t)$ is less by a finite quantity than that of $y(t)$ (for $y(t)$ the velocity along the normal is $Q(y(t), \psi(t^*)) = Q(x(t), \psi(t^*)) + O(\Delta)$).
Therefore, for some $\gamma < 1$ (independent of $\Delta$), $y^* = y(t^* - (1 - \gamma)\Delta) \in \Gamma_{t^*}$, that is, the trajectory $y(t)$ has "gained" the time $(1 - \gamma)\Delta$; thereby, naturally, it deviates from $x(t)$ by not more than $O(\Delta)$, that is, $\|x(t^*) - y^*\| \le A\Delta$.

Note. Here and below we omit the trivial calculation of expressions of type $O(\Delta)$ in terms of the constants $C_i$ occurring in conditions (1) - (3); it is obvious that all these estimates are uniform because of the assumed uniformity of the estimates in (1) - (3).

It is now necessary to show that after a time $T - t^* + \eta(\Delta)$ the trajectory $y(t)$ can be transferred from the point $y^* \in \Gamma_{t^*}$ to some point of $\Gamma_T$ while remaining in some $B\Delta$-neighbourhood of the trajectory $x(t)$ ($\eta(\Delta) = o(\Delta)$).
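The time gained in situation 1 can be illustrated to leading order by a small sketch. It assumes that, over the short interval, the velocity of $x(t)$ along the normal is the constant $Q - a$ and that of $y(t)$ is the constant $Q$, with $0 < a < Q$; the $O(\Delta)$ corrections of the text are ignored, and the function names are hypothetical.

```python
def gain_factor(q, a):
    """Leading-order gamma = (q - a) / q: y covers in time gamma * delta the
    normal distance that x covers in time delta."""
    if not (0.0 < a < q):
        raise ValueError("expected 0 < a < q")
    return (q - a) / q


def time_gained(q, a, delta):
    """Time (1 - gamma) * delta by which y reaches the hyperplane earlier."""
    return (1.0 - gain_factor(q, a)) * delta
```

Under these assumptions the gain is proportional to $\Delta$, which is what later makes it dominate corrections of order $\Delta^2$.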
For this we require the following construction. At every point $x(t)$, $t \in [t^*, T]$, we draw the hyperplane $\Gamma_t : \{x : (x - x(t), \psi(t)) = 0\}$, just as in [1]; we also notice that $Q(t) \ge q_0 > 0$ at $[t^*, T]$; this follows from the maximum principle (see [1], note 5). Since the trajectory considered cannot have self-intersections (their presence is incompatible with optimality), $\Gamma_t$ and $\Gamma_{t+\tau}$ do not intersect in some finite neighbourhood of the trajectory $x(t)$ (see [1], Fig. 3). Now, prescribing a small step $\tau$, we construct an Eulerian polygonal path for the trajectory $y(t)$ as follows. Let $t_n = t^* + n\tau$, $x_n = x(t_n)$, and let $y_n \in \Gamma_n = \Gamma_{t_n}$. Then a $y_{n+1}$ is constructed such that:

1) $y_{n+1} = y_n + \tau_n q_n$, where $q_n \in K(y_n)$,

2) $y_{n+1} \in \Gamma_{n+1}$,

3) $\tau_n \le \tau + A_1\|y_n - x_n\|^2\tau + A_2\tau^2$,

4) $\|q_n - \dot{x}(t_n)\| \le A_3\|x_n - y_n\|$,

5) $y_0 = y^*$,

6) $\|y_{n+1} - x_{n+1}\| \le \|y_n - x_n\|(1 + A_4\tau) + A_5\tau^2$.
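One step of the polygonal construction above can be sketched as follows. Given $y_n$, a velocity $q_n$ and the next hyperplane through $x_{n+1}$ with normal $\psi_{n+1}$, the step length $\tau_n$ is the value for which $y_n + \tau_n q_n$ lands on that hyperplane. The function name and the list-based vectors are assumptions made for the example; the choice of $q_n$ itself (which the text obtains from the lemma) is taken as given.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def step_to_hyperplane(y_n, q_n, x_next, psi_next):
    """Return (tau_n, y_next) with y_next = y_n + tau_n * q_n lying on the
    hyperplane {z : (z - x_next, psi_next) = 0}."""
    speed = dot(q_n, psi_next)  # velocity of the step along the normal
    if speed <= 0.0:
        raise ValueError("q_n must move towards the next hyperplane")
    offset = [a - b for a, b in zip(x_next, y_n)]
    tau_n = dot(offset, psi_next) / speed
    y_next = [a + tau_n * b for a, b in zip(y_n, q_n)]
    return tau_n, y_next
```

The guard on `speed` corresponds to the requirement in the text that the normal velocity stays bounded away from zero, which is what keeps $\tau_n$ close to $\tau$.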
We demonstrate the existence of such a construction. Indeed, $x_{n+1} = x_n + \tau k_n + O(\tau^2)$, where $k_n = \dot{x}(t_n)$. (More precisely, a $k_n \in K(x_n)$ exists such that $x_{n+1} = x_n + \tau k_n + O(\tau^2)$.) By the maximum principle

$$(k_n, \psi_n) = \max_{k \in K(x_n)} (k, \psi_n),$$

and the lemma asserts the existence of a $q_n \in K(y_n)$ such that

$$(q_n, \psi_n) \ge (k_n, \psi_n) + Q_x(y_n - x_n) - C_4\|y_n - x_n\|^2$$

and $\|q_n - k_n\| \le C_5\|y_n - x_n\|$. We take a point $y' = y_n + \tau q_n$; substituting it in the equation of the hyperplane $\Gamma_{n+1}$, we obtain

$$(y_n + \tau q_n - x_n - \tau k_n - O(\tau^2),\ \psi_n - \tau Q_x + O(\tau^2)) = (y_n - x_n, \psi_n) - \tau(Q_x, y_n - x_n) + \tau(q_n - k_n, \psi_n) + O(\tau^2) \ge -C_4\|y_n - x_n\|^2\tau + O(\tau^2).$$

Here we used the lemma and the relation $y_n \in \Gamma_n$, that is, $(y_n - x_n, \psi_n) = 0$. Since $Q(y_n, \psi_n) = Q(x_n, \psi_n) + O(\|y_n - x_n\|) \ge q_0 + O(\|y_n - x_n\|)$, we may consider that $Q(y_n, \psi_n) \ge q_0/2$, and a $\tau_n$ satisfying the estimate (3) exists for which $y_{n+1} = y_n + \tau_n q_n \in \Gamma_{n+1}$. Moreover, since $x_{n+1} = x_n + \tau k_n + O(\tau^2)$, we have

$$\|y_{n+1} - x_{n+1}\| = \|y_n + \tau_n q_n - x_n - \tau k_n + O(\tau^2)\| \le \|y_n - x_n\| + \tau\|q_n - k_n\| + |\tau_n - \tau|\,\|q_n\| + O(\tau^2)$$

$$\le \|y_n - x_n\| + \tau C_5\|y_n - x_n\| + A_1\tau\|y_n - x_n\|^2\,\|q_n\| + O(\tau^2) \le \|y_n - x_n\|(1 + A_4\tau) + A_5\tau^2.$$
Therefore, the estimate (6) is also satisfied. Using the estimates (1) - (6) for $n = 0, 1, \ldots, (T - t^*)/\tau - 1$, we obtain the following properties of the polygonal path $y_n$. The estimate (6) implies that the polygonal path $y_n$ is situated in some $(B_1\Delta + B_2\tau)$-neighbourhood of the trajectory $x(t)$ (the obvious calculation of the constant $B$ is omitted; we again emphasize that it is independent of $\Delta$ and $\tau$).

Summing the estimates (3), we obtain that the polygonal path $y_n$ connects $\Gamma_{t^*}$ with $\Gamma_T$ after a time $T' \le T - t^* + B_3\Delta^2 + B_4\tau$.

Carrying out this construction for an infinite sequence of decreasing $\tau \to 0$, we obtain a sequence of polygonal paths. This family of polygonal paths satisfies the condition of Arzelà's theorem, and from it we can select a convergent subsequence (converging in the metric $C$). It follows from the results of [2] that the limiting trajectory $y(t)$ is an absolutely continuous function and almost everywhere satisfies the differential inclusion $\dot{y} \in K(y)$. Also, $y(t)$ traverses a $B_1\Delta$-neighbourhood of the trajectory $x(t)$ and connects $\Gamma_{t^*}$ with $\Gamma_T$ after a time $T' \le T - t^* + B_3\Delta^2$.
Therefore, the trajectory $y(t)$ connects $x_0$ with $\Gamma_T$ after a time $T'' \le T - (1 - \gamma)\Delta + B_3\Delta^2$, which contradicts the hypothesis of the existence of a supporting hyperplane of the surface $\omega(x) = T$.

In the case $(x(t^* - \Delta) - x(t^*), \psi(t^*)) > 0$ (that is, the trajectory $x(t)$ "is reflected" from $\Gamma_{t^*}$) the matter is simpler, since in this case $x(t^* - \Delta) \in \Gamma_{t^{**}}$, where $t^{**} > t^*$ and $\|x(t^* - \Delta) - x(t^{**})\| = O(\Delta)$, and after this everything is the same as in the first case.
2. We now consider the situation where $t^*$ is a point of continuity of $R(t)$, that is, $R(t)/Q(t)$ for $t < t^*$ varies continuously from 1 to smaller values. In this case the proof is almost the same as in the first case. It is only necessary to prove that in this case also we may consider the function $R(t) = (\psi(t), \dot{x}(t)) \ge q_0 > 0$. This fact is a simple generalization of the result in note 5 of [1]. Indeed, let $R(t) = Q(t)$ up to the instant $t'$. The remark implies that, in the case where $Q(0) > 0$, the function $Q(t) = (\dot{x}(t), \psi(t))$ cannot decrease more quickly than an exponential on $[0, t']$, and consequently $Q(t) \ge q_0 > 0$. Let $t''$ be either the first point of discontinuity of $R(t)$ after $t'$, or the first root of the equation $R(t) = 0$; in other words, on $[t', t'']$ the function $R(t)$ is continuous and positive. If $(Q_x, \psi) \ge 0$ on $[t', t'']$, the function $Q(t)$ does not decrease; if $(Q_x, \psi) < 0$, we have $dQ/dt \ge (\partial Q/\partial x, \psi)\,Q(t)$ (since $Q(t) \ge R(t)$), and $Q(t)$ decreases not more rapidly than an exponential.

Therefore, two variants are possible: either $R(t_a) \le Q(t_a) - a$ for some $t_a > t'$, and then $Q(t) \ge q_0 > 0$ for all $t \in [0, t_a]$, or a transition from $R = Q$ to $R < Q$ occurs at the point of discontinuity. The second case has been considered; in the first the construction of the trajectory $y(t)$ is exactly the same.

It remains to note that we have used an estimate of the decrease in $Q(t)$ with increase in $t$, and we require an estimate for a decrease in $t$. However, this is really the same thing, because the trajectory $\{x(s), -\psi(s)\}$, considered with a reversed flow of time ($s = T - t$), also satisfies the maximum principle (where $K(x)$ is replaced by $-K(x)$, and $\psi(t)$ by $-\psi(t)$).
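We now discuss some of the assumptions made. We first note that the satisfaction of a Lipschitz condition on $\psi$ for the function $Q_x(x, \psi)$ is to some extent connected with a Lipschitz condition on $x$. The following general statement can be formulated. Let the function $f(x, y)$ satisfy the following continuity conditions:

(1) $|f(x, y + \delta) - f(x, y)| \le C_1|\delta|$;

(2) $f_x(x, y)$ exists and $|f_x(x + \Delta, y) - f_x(x, y)| \le C_2|\Delta|$

($C_1$ and $C_2$ are independent of $x$, $y$); then

$$|f_x(x, y + \delta) - f_x(x, y)| \le C_3|\delta|^{1/2}.$$

Proof. We have

$$f(x + \Delta, y) = f(x, y) + \Delta f_x(x, y) + O(\Delta^2), \quad |O(\Delta^2)| \le \int_0^\Delta C_2\,\xi\,d\xi = \frac{C_2}{2}\Delta^2.$$

In exactly the same way

$$f(x + \Delta, y + \delta) = f(x, y + \delta) + \Delta f_x(x, y + \delta) + O(\Delta^2).$$

Subtracting, we obtain

$$f(x + \Delta, y + \delta) - f(x + \Delta, y) - f(x, y + \delta) + f(x, y) = \Delta[f_x(x, y + \delta) - f_x(x, y)] + O(\Delta^2),$$

from which

$$|f_x(x, y + \delta) - f_x(x, y)| \le \frac{1}{\Delta}\bigl(|f(x + \Delta, y + \delta) - f(x + \Delta, y)| + |f(x, y + \delta) - f(x, y)|\bigr) + C_2\Delta \le \frac{2C_1\delta}{\Delta} + C_2\Delta.$$

Putting $\Delta = \delta^{1/2}$, we obtain

$$|f_x(x, y + \delta) - f_x(x, y)| \le 2C_1\delta^{1/2} + C_2\delta^{1/2} = (2C_1 + C_2)\,\delta^{1/2}.$$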
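The last step of the proof balances two terms of the form $a\delta/\Delta + b\Delta$ by the choice $\Delta = \delta^{1/2}$. The sketch below, with hypothetical names and generic positive constants `a` and `b` standing for the Lipschitz constants, compares that choice with the exact minimum over $\Delta$; both scale like $\delta^{1/2}$, so only the constant changes and not the exponent.

```python
import math


def bound(a, b, delta, step):
    """Right-hand side a * delta / step + b * step of the estimate."""
    return a * delta / step + b * step


def bound_sqrt_choice(a, b, delta):
    """Value for step = delta ** 0.5, which equals (a + b) * sqrt(delta)."""
    return bound(a, b, delta, math.sqrt(delta))


def bound_optimal(a, b, delta):
    """Minimum over step > 0, attained at step = sqrt(a * delta / b)."""
    return 2.0 * math.sqrt(a * b * delta)
```

Since $2\sqrt{ab} \le a + b$, the optimal step never does worse than $\Delta = \delta^{1/2}$, but it cannot improve the exponent $1/2$.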
The example of the function $f(x, y) = y^2 \sin(x / y)$ (proposed by N. N. Chentsov) shows that the exponent $1/2$ cannot be replaced by a greater one in the general case. However, the function $Q(x, \psi)$ is not a general function; it has a specific origin. Can a stronger result be obtained for it?

We now discuss the actual assumption that the function $Q(x, \psi)$ is differentiable with respect to $x$ (as for $\psi$, it is known that $Q(x, \psi)$ is directionally differentiable, that is,

$$\lim_{s \to +0} \frac{Q(x, \psi + s\varphi) - Q(x, \psi)}{s}$$

exists for all $\varphi$). This is a fairly stringent assumption, and it would be interesting to investigate the question in a more general situation. For example, if $K(x)$ is prescribed as the intersection of half-spaces

$$K(x) = \bigcap_p \Pi_p,$$

where $\Pi_p = \{z : (z, \psi_p) \le b_p(x)\}$, $p = 1, 2, \ldots, P$, the $b_p(x)$ being arbitrarily smooth functions, $Q(x, \psi)$ may not be differentiable, but only directionally differentiable. Thus, in a plane let $\psi_1 = (1; 1)$, $\psi_2 = (-1; 1)$, $\psi_3 = (0; 1)$, $b_1 = \alpha$, $b_2 = \alpha$, $b_3 = 2\alpha$. Then $Q_\alpha(\alpha, \psi_3)$ does not exist at the point $\alpha = 0$; at this point $Q_\alpha(\alpha, \psi_3)$ is discontinuous.

The assumption of the existence of a supporting hyperplane to the surface $\omega(x) = T$ is fairly natural and fully justified. In [3] the case of a "concave" arc of this surface is considered and it is shown how to construct an example where such a situation arises. However, it will be associated with non-uniqueness of the optimal trajectory: the starting manifold $\Gamma_0$ can be connected with $x_1$ by two essentially different trajectories with the same optimal time $T$, and the starting points $x_0 \in \Gamma_0$ for these trajectories are quite distinct. So this example does not contradict the statement of the naturalness of the assumption made.
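The polyhedral example with $\psi_1 = (1; 1)$, $\psi_2 = (-1; 1)$, $\psi_3 = (0; 1)$ and $b_1 = b_2 = \alpha$, $b_3 = 2\alpha$ can be checked directly. Adding the first two constraints gives $z_2 \le \alpha$, so the maximum of $(z, \psi_3) = z_2$ over $K(\alpha)$ is $\min(\alpha, 2\alpha)$. The sketch below (function names are hypothetical) evaluates the one-sided difference quotients at $\alpha = 0$.

```python
def feasible(z, alpha, tol=1e-12):
    """Membership test for K(alpha) = {z1 + z2 <= alpha, -z1 + z2 <= alpha, z2 <= 2 alpha}."""
    return (z[0] + z[1] <= alpha + tol
            and -z[0] + z[1] <= alpha + tol
            and z[1] <= 2.0 * alpha + tol)


def q3(alpha):
    """Q(alpha, psi_3): the maximum of z2 over K(alpha), attained at z1 = 0."""
    return min(alpha, 2.0 * alpha)


def one_sided_slopes(h=1e-3):
    """Left and right difference quotients of q3 at alpha = 0."""
    right = (q3(h) - q3(0.0)) / h
    left = (q3(0.0) - q3(-h)) / h
    return left, right
```

The left slope is 2 and the right slope is 1, so the derivative with respect to $\alpha$ jumps at $\alpha = 0$ even though each $b_p$ is linear.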
Translated by J. Berry.

REFERENCES

1. FEDORENKO, R. P. The maximum principle for differential inclusions. Zh. vychisl. Mat. mat. Fiz., 10, 6, 1385-1393, 1970.

2. FILIPPOV, A. F. Some questions of the theory of optimal control. Vestn. MGU, 2, 25-32, 1959.

3. FEDORENKO, R. P. The Cauchy problem for Bellman's dynamic programming equation. Zh. vychisl. Mat. mat. Fiz., 9, 2, 426-432, 1969.