An Application of Optimal Control to Midcourse Guidance

J. S. MEDITCH and L. W. NEUSTADT

Summary

A general midcourse guidance problem is formulated as an optimal control problem. The problem consists of bringing a vehicle on to a nominal trajectory with a minimum expenditure of fuel. An amplitude constraint is placed on the thrust (or control) vector. The equations are linearized by considering perturbations about the nominal. The form of the optimal control follows from the maximum principle and is a vector function of the system adjoint solution. The optimal control is 'on' only when a scalar function of the system adjoint solution exceeds a fixed threshold. When the control is 'on', the length of the control vector is equal to the constraint limit while its 'direction' is governed by the 'direction' of the system adjoint solution. The usually difficult problem of obtaining the initial conditions for the adjoint solution is reduced to the problem of maximizing a particular function. It is shown that this function possesses a unique maximum and that its gradient assumes an especially simple form. A direct synthesis procedure then follows by applying the method of steepest ascent. The method developed is particularly suited for digital computer solution.
Introduction

Recent advances in optimal control theory¹⁻³ have led naturally to an interest in applying these results to practical problems. The need for application studies is especially strong in the field of missile and space vehicle guidance and control, where such factors as fuel consumption, payload, mission time, and target errors are critical. Some results have been obtained in this field⁴⁻⁷. This paper presents a synthesis of the control function which is optimal in the sense of minimum fuel for the midcourse phase of a broad class of space missions. The central result of the paper is the development of an iterative computational procedure for synthesizing the optimal control. The procedure is simple in form and well suited for digital computation.
Midcourse Guidance as an Optimal Control Problem

A number of space missions may be subsumed under the configuration shown in Figure 1. The problem is that of transferring a space vehicle from one moving point A to another moving point B. Both A and B may lie on bodies which are spinning as well as moving along their respective orbits. Typically, the mission and its associated guidance operations are separated into three phases: launch or boost, midcourse, and terminal. These are shown in the figure.

Figure 1. General configuration for space missions

If the launch phase
could be executed perfectly, the vehicle would 'free-fall' to the destination point without requiring any corrections. However, imperfect launch guidance, resulting from sensor errors, incorrect thrusting, external disturbances, etc., causes errors to exist at the conclusion of this phase. Hence, midcourse corrections may be required. The term errors is used here to denote position and velocity deviations from the ideal 'free-fall' trajectory which is hereafter termed the nominal trajectory. There are basically two major problems associated with the overall guidance problem for each phase of a space mission. The first deals with the sensing and processing of information to determine the vehicle's state, such as its position and velocity. The second is concerned with utilizing this information to guide and control the vehicle so that the mission objectives are achieved. This paper presents a solution to the second problem for the midcourse phase, under the assumption that the position and velocity errors are known at the termination of the launch phase. These errors are assumed to be small enough to permit linearization of the vehicle's equations of motion about the nominal trajectory. In order to allow freedom in the terminal phase for manoeuvring and landing or docking, it is required that the midcourse corrections reduce the errors to zero at the end of the midcourse phase. In addition, it is required that the midcourse phase be executed in a fixed time equal to the time for the nominal flight. Physically, these two requirements mean that the vehicle will arrive at the desired destination point with the same velocity and at the same time as a vehicle following the nominal trajectory. Thus, the task of the terminal phase will be reduced considerably. It will be assumed that the midcourse phase is to be executed using a minimum amount of fuel in order to maximize the payload. 
In addition, it is assumed that the corrective thrust magnitude is constrained for obvious practical reasons.

The equations of motion of a space vehicle subject only to gravitational and propulsive forces may be written as

    \ddot{x}_i = f_i(x_1, x_2, x_3, t) + \frac{T_i(t)}{m},   i = 1, 2, 3     (1)

where x_1, x_2, and x_3 are the coordinates of the centre of mass of the vehicle in a Cartesian inertial coordinate system and the dots denote differentiation with respect to time. The gravitational accelerations, as represented by the f_i, are time dependent since the attracting bodies, e.g. planets, may be moving. The T_i(t) represent the propulsive forces, and m is the mass of the vehicle. It is assumed that the mass of the fuel consumed during midcourse flight is small compared to the total vehicle mass. Thus, m is essentially constant during the mission. Defining

    x_{i+3} = \dot{x}_i,   i = 1, 2, 3     (2)

the six equations, (1) and (2), may be written in vector form as

    \dot{x} = f(x, t) + \frac{1}{m} T(t)     (3)

Here x is the vector (x_1, ..., x_6); f depends only on the first three components of x (as well as time); and the vector T(t) is related to the T_i(t) of (1) in an obvious way.

It is sometimes more advantageous to use polar or spherical coordinates rather than Cartesian coordinates as in (1). In vector form an equation similar to (3) is obtained:

    \dot{y} = P(y, t) + Q(y) T(t)     (4)

For the sake of generality, it will be assumed the equations of motion are of the form of (4). Let y = Y + \delta y, where Y represents a nominal trajectory satisfying the free-fall equation

    \dot{Y} = P(Y, t)

Since y satisfies (4), \delta y satisfies the equation

    \delta\dot{y} = \left(\frac{\partial P}{\partial y}\right) \delta y + Q(y) T(t) + ...     (5)

where (\partial P / \partial y) is a 6 x 6 matrix of partial derivatives (evaluated along the nominal trajectory Y), and the dots on the right denote higher order terms in \delta y. The term Q(y) T(t), which represents a propulsive correction, will be considered to be small of the first order. Therefore, the difference [Q(Y) - Q(y)] T(t) will be considered to be a 'second-order' term. Thus, neglecting higher order terms, eqn (5) can be written as

    \delta\dot{y} = \left(\frac{\partial P}{\partial y}\right) \delta y + Q(Y) T(t)     (6)

Defining

    x(t) = \delta y(t),   A(t) = \left(\frac{\partial P}{\partial y}\right),   u(t) = T(t),   B(t) = Q(Y)

(6) becomes

    \dot{x}(t) = A(t) x(t) + B(t) u(t)     (7)

As an example consider the two-dimensional, restricted two-body problem. Let r_1 and \theta_1 be the polar coordinates of the centre of mass of the vehicle in an inertial coordinate system with origin in the centre of the attracting body. Let u_1 and u_2 be the components of the vehicle's radial and tangential thrust, respectively (see Figure 2). Letting \dot{r}_1 = r_2 and \dot{\theta}_1 = \theta_2, it is well known that the equations of motion take the form

    \frac{d}{dt} \begin{bmatrix} r_1 \\ r_2 \\ \theta_1 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} r_2 \\ r_1 \theta_2^2 - \mu / r_1^2 \\ \theta_2 \\ -2 r_2 \theta_2 / r_1 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1/m & 0 \\ 0 & 0 \\ 0 & 1/(m r_1) \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}     (8)

where m is the vehicle's mass, \mu = GM, G is the universal gravitational constant, and M is the mass of the attracting body.

Figure 2. Restricted two-body problem
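The two-body equations (8) can be exercised numerically. The sketch below is illustrative only and is not from the paper: it assumes unit values for \mu and m, uses a hand-rolled classical Runge-Kutta integrator, and checks that with the thrust off a circular orbit remains circular.

```python
import math

MU = 1.0    # gravitational parameter mu = G*M (assumed unit value for illustration)
MASS = 1.0  # vehicle mass m (assumed constant, as in the text)

def two_body_rhs(t, s, u1, u2):
    """Right-hand side of eqn (8): s = (r1, r2, th1, th2), r2 = dr1/dt, th2 = dth1/dt."""
    r1, r2, th1, th2 = s
    return (
        r2,
        r1 * th2**2 - MU / r1**2 + u1 / MASS,    # radial acceleration + radial thrust
        th2,
        -2.0 * r2 * th2 / r1 + u2 / (MASS * r1)  # tangential acceleration + thrust
    )

def rk4_step(f, t, s, h, *args):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, s, *args)
    k2 = f(t + h/2, [si + h/2*ki for si, ki in zip(s, k1)], *args)
    k3 = f(t + h/2, [si + h/2*ki for si, ki in zip(s, k2)], *args)
    k4 = f(t + h, [si + h*ki for si, ki in zip(s, k3)], *args)
    return [si + h/6*(a + 2*b + 2*c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]

# Free fall (u1 = u2 = 0) from a circular orbit: r1 = 1 with th2 = sqrt(MU)
# keeps the radius constant, since r1*th2^2 - MU/r1^2 = 0.
s = [1.0, 0.0, 0.0, math.sqrt(MU)]
h = 0.001
for i in range(1000):
    s = rk4_step(two_body_rhs, i * h, s, h, 0.0, 0.0)
print(s)
```

With thrust applied, the same integrator produces the perturbed motion that the linearization about the nominal trajectory approximates.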
Linearizing (8), with R, \Theta, \dot{R}, and \dot{\Theta} the values of r_1, \theta_1, r_2, and \theta_2 along the nominal trajectory, respectively, yields

    \frac{d}{dt} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \dot{\Theta}^2 + 2\mu/R^3 & 0 & 0 & 2R\dot{\Theta} \\ 0 & 0 & 0 & 1 \\ 2\dot{R}\dot{\Theta}/R^2 & -2\dot{\Theta}/R & 0 & -2\dot{R}/R \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 1/m & 0 \\ 0 & 0 \\ 0 & 1/(mR) \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}

where x_1 = \delta r_1, x_2 = \delta r_2, x_3 = \delta\theta_1, and x_4 = \delta\theta_2.

Returning to the general case, it will be assumed hereafter that x(t) is an n-dimensional column vector, the state vector; u(t) is an r-dimensional column vector, the control vector; and A(t) and B(t) are n x n and n x r matrices, respectively. Since the time interval over which midcourse guidance is performed is fixed, it is assumed that 0 \le t \le T in (7), where T is a known constant. The deviations which exist at the initiation of midcourse guidance are denoted by x(0). To null these errors, it is required that x(T) = 0. An amplitude constraint is placed on the thrust or control vector by requiring ||u(t)|| \le \lambda, where \lambda represents the maximum thrust available. The symbol || \cdot || denotes the Euclidean norm given by

    ||u(t)|| = \sqrt{u_1^2(t) + ... + u_r^2(t)}

where the u_i(t) are the components of u(t). Note that ||u(t)|| is simply the length or magnitude of the control (thrust) vector. Replacing the matrix B(t) by \lambda B(t), and u(t) by (1/\lambda) u(t), the constraint can be assumed to be normalized, and of the form ||u(t)|| \le 1.

Assuming a constant rate of expulsion of fuel, the amount of fuel consumed during the midcourse phase is proportional to the time integral of the length of the control (thrust) vector. Thus, the cost function to be minimized is given by

    S = \int_0^T ||u(t)|| \, dt     (9)

The formal mathematical statement of the problem is now presented. Given the linear dynamical system of (7), it is desired to drive the system from a known initial state x(0) in phase space to the origin in a fixed time T such that the cost function (9) is minimized subject to the constraint ||u(t)|| \le 1 for all t, 0 \le t \le T. Controls for which ||u(t)|| \le 1 and for which each component of u(t) is measurable for 0 \le t \le T are termed admissible. The set U of all such controls is defined by the relation

    U = \{ u(t) : ||u(t)|| \le 1, \; u(t) \text{ measurable}, \; 0 \le t \le T \}

where u(t) is an r-dimensional column vector. An admissible control which minimizes the integral in (9) and drives the system given by (7) from the initial state x(0) to the origin in time T is called an optimal control.

Derivation of the Optimal Control

The response of the system given by (7) is

    x(t) = X(t) \left[ x(0) + \int_0^t X^{-1}(s) B(s) u(s) \, ds \right]     (10)

where X(t) is the n x n matrix solution of

    \dot{X}(t) = A(t) X(t),   X(0) = I

and I is the identity matrix. Since X(t) is non-singular, x(T) = 0 if and only if

    -x(0) = \int_0^T X^{-1}(s) B(s) u(s) \, ds     (11)

In other words, for an admissible control u(t), (11) determines the initial state from which it is possible to reach the origin in time T using the control u(t). Let an additional state variable x_{n+1} be defined by

    x_{n+1}(t) = \int_0^t ||u(s)|| \, ds

Clearly, x_{n+1}(T) is the cost function. Now consider the set \Omega(T) defined by the relation

    \Omega(T) = \left\{ \left[ \int_0^T X^{-1}(t) B(t) u(t) \, dt, \; \int_0^T ||u(t)|| \, dt \right] : u(t) \text{ admissible} \right\}

A point (x_1, ..., x_n, x_{n+1}) belongs to \Omega(T) if and only if there is an admissible control u(t) which drives the initial state (-x_1, ..., -x_n) to the origin in time T with cost x_{n+1}. Since -u(t) is admissible if u(t) is, and since ||-u(t)|| = ||u(t)||, it is clear from the definition of \Omega(T) that this set is symmetric about the cost (x_{n+1}) axis. It is shown in the Appendix that \Omega(T) is convex, closed, and bounded. A typical three-dimensional representation of \Omega(T) is given in Figure 3 for a second-order system. For higher-order systems the simple two-dimensional representation of Figure 4 will be helpful in the discussions which follow.

Figure 3. Representation of \Omega(T) for second-order systems

Figure 4. Simplified representation of \Omega(T) for higher-order systems
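A minimal sketch of evaluating the response formula (10) numerically. All details here are illustrative assumptions, not from the paper: a hypothetical double-integrator example with A = [[0,1],[0,0]] and B = (0,1)', for which the transition matrix X(t) = [[1,t],[0,1]] is known in closed form, and a simple midpoint quadrature for the integral.

```python
# Hypothetical example: double integrator, A = [[0,1],[0,0]], B = (0,1)'.
# Its transition matrix in closed form is X(t) = [[1, t], [0, 1]].

def X(t):
    return [[1.0, t], [0.0, 1.0]]

def X_inv(t):
    return [[1.0, -t], [0.0, 1.0]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def response(x0, u, T, steps=2000):
    """Eqn (10): x(T) = X(T)[x(0) + integral_0^T X^{-1}(s) B u(s) ds], B = (0, 1)'."""
    h = T / steps
    acc = [0.0, 0.0]
    for k in range(steps):
        s = (k + 0.5) * h                        # midpoint quadrature node
        col = [X_inv(s)[0][1], X_inv(s)[1][1]]   # X^{-1}(s) B = second column of X^{-1}(s)
        acc[0] += col[0] * u(s) * h
        acc[1] += col[1] * u(s) * h
    return mat_vec(X(T), [x0[0] + acc[0], x0[1] + acc[1]])

# Constant unit control from rest over [0, 2]: direct integration of the double
# integrator gives position T^2/2 and velocity T, so x(T) should be (2, 2).
xT = response([0.0, 0.0], lambda s: 1.0, 2.0)
print(xT)
```

The same `response` routine, with X(t) obtained by integrating \dot{X} = A(t) X numerically, applies to any time-varying system (7).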
For some initial states it will be impossible to reach the origin even if full control ||u(t)|| = 1 is utilized throughout the interval [0, T]. The set of all such initial states is termed degenerate. Only cases where the initial states are non-degenerate will be considered here. If a non-degenerate x(0) is given, the minimum cost to reach the origin in time T is the least x_{n+1}(T) for which (-x(0), x_{n+1}(T)) \in \Omega(T). This cost is denoted by x^0_{n+1}(T). Obviously, (-x(0), x^0_{n+1}(T)) is a boundary point of \Omega(T). Let y_0 = (-x(0), x^0_{n+1}(T)). Since \Omega(T) is convex, there exists at least one (n+1)-dimensional row vector \eta^* such that

    \eta^* \cdot y_0 \ge \eta^* \cdot w     (12)

for all w \in \Omega(T). That is, a hyperplane of support may be constructed at y_0 with \eta^* a vector normal to this hyperplane at y_0. This result has a simple geometric interpretation which is given in Figure 5. The hyperplane of support is 'tangent' to the boundary of \Omega(T) at y_0. The vector \eta^* is normal to this plane and is directed 'out of' or 'away from' \Omega(T). Observe that the components of this vector are proportional to the direction numbers of the line 'normal' to the boundary of \Omega(T) at y_0. It is geometrically obvious (see Figure 5) and easily shown that the (n+1)st component of \eta^* is non-positive. With little loss of generality, it may be assumed negative. Since the length of \eta^* is obviously immaterial, the (n+1)st component of \eta^* is taken to be -1. Henceforth, it will be assumed that all 'normal' vectors to \Omega(T) with negative (n+1)st component have this component equal to -1.

Figure 5. Geometric interpretation of hyperplane of support and normal vector \eta^*

The function (of w) \eta^* \cdot w attains its maximum in \Omega(T) when w = y_0. In general, for w \in \Omega(T) given by

    w = \left[ \int_0^T X^{-1}(t) B(t) u(t) \, dt, \; \int_0^T ||u(t)|| \, dt \right]

\eta^* \cdot w becomes

    \eta^* \cdot w = \int_0^T \left[ \eta \cdot X^{-1}(t) B(t) u(t) - ||u(t)|| \right] dt     (13)

where \eta^* = (\eta, -1). It is desired to select an admissible control u(t) which maximizes \eta^* \cdot w. This function will be maximized by maximizing the integrand of (13) for all t, 0 \le t \le T. Let

    f(t) = \eta \cdot X^{-1}(t) B(t) u(t) - ||u(t)||     (14)

and define

    a(t) = \eta \cdot X^{-1}(t) B(t)     (15)

which is an r-dimensional row vector. Substituting (15) into (14) gives

    f(t) = a(t) \cdot u(t) - ||u(t)||     (16)

as the function to be maximized by the choice of an admissible control u(t) over the interval [0, T]. If a(t) = 0, f(t) will clearly be maximized by setting u(t) = 0. If a(t) \ne 0, f(t) will be maximized only if u(t) has the same direction as a(t), in which case a(t) \cdot u(t) = ||a(t)|| \, ||u(t)|| and

    f(t) = ||u(t)|| \, [ ||a(t)|| - 1 ]     (17)

The maximization of f(t) is now separated into the following three cases: (a) ||a(t)|| > 1; (b) ||a(t)|| < 1; (c) ||a(t)|| = 1.

Case (a)

Whenever ||a(t)|| > 1, f(t) is maximized by setting ||u(t)|| = 1, the maximum permissible. Hence, since the two vectors are in the same direction, the choice of u(t) is

    u(t) = \frac{a'(t)}{||a(t)||}

where the prime denotes the transpose. Note that the components of u(t) are simply the components of a'(t) normalized so that ||u(t)|| = 1.

Case (b)

Whenever ||a(t)|| < 1, f(t) is maximized by setting ||u(t)|| = 0, or equivalently, u(t) = 0. Since u(t) is also zero for the special case where ||a(t)|| = 0, as seen above, then u(t) = 0 whenever 0 \le ||a(t)|| < 1.

Case (c)

Whenever ||a(t)|| = 1, (17) reveals that the choice of ||u(t)|| is immaterial as long as u(t) is admissible. The system is defined to be regular if the set of points in [0, T] at which ||a(t)|| = 1 is of measure zero for every vector \eta^*. Only regular systems will be considered here. For such systems, the control which maximizes \eta^* \cdot w (for w \in \Omega(T)) is determined almost everywhere. Since sets of measure zero play no role in the systems under consideration, they will be ignored, so that the control which maximizes \eta^* \cdot w, that is, the optimal control, is uniquely determined. For the sake of completeness, however, the choice u(t) = 0 will be made whenever ||a(t)|| = 1. Note that because of the uniqueness, inequality (12) can be replaced by the strict inequality

    \eta^* \cdot y_0 > \eta^* \cdot w     (18)

for all w \in \Omega(T) different from y_0. Recalling the definition of a(t) from (15), and summarizing the above results, the admissible control which maximizes \eta^* \cdot w in (13) is given by

    u(t) = \begin{cases} \dfrac{[\eta \cdot X^{-1}(t) B(t)]'}{||\eta \cdot X^{-1}(t) B(t)||} & \text{if } ||\eta \cdot X^{-1}(t) B(t)|| > 1 \\[1ex] 0 & \text{if } ||\eta \cdot X^{-1}(t) B(t)|| \le 1 \end{cases}     (19)

for 0 \le t \le T.
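The switching law (19) is direct to implement once the row vector a(t) = \eta \cdot X^{-1}(t) B(t) is available at each t. A minimal sketch in plain Python, with hypothetical numeric inputs:

```python
import math

def optimal_control(a):
    """Eqn (19): given a(t) = eta . X^{-1}(t) B(t) as a list, return u(t).
    The control is 'on' (unit length, aligned with a) only when ||a(t)|| > 1."""
    norm = math.sqrt(sum(ai * ai for ai in a))
    if norm > 1.0:
        return [ai / norm for ai in a]   # full thrust in the direction of a(t)
    return [0.0] * len(a)                # coast whenever ||a(t)|| <= 1

print(optimal_control([3.0, 4.0]))   # ||a|| = 5 > 1: thrust on, unit direction
print(optimal_control([0.3, 0.4]))   # ||a|| = 0.5 <= 1: thrust off
```

Only the threshold crossing of ||a(t)|| matters for on/off timing; the direction of a(t) steers the thrust while it is on.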
Note from (19) that whether or not the control is 'on' is governed by whether or not ||\eta \cdot X^{-1}(t) B(t)|| exceeds a fixed threshold. Moreover, when the control is 'on', the system utilizes the full capability of this control by making ||u(t)|| = 1 and changing only the 'direction' of the control in accordance with the components of \eta \cdot X^{-1}(t) B(t).

Synthesis of the Optimal Control

Since \eta is the only unknown in (19), the problem of synthesizing the optimal control is that of determining an \eta that corresponds to a given initial state x(0). Another way of looking at this problem is to observe that the transpose of the row vector \eta \cdot X^{-1}(t) is the solution of the homogeneous system adjoint to (7):

    \dot{\psi}(t) = -A'(t) \psi(t),   \psi(0) = \eta'

where the prime denotes the transpose. Hence, the synthesis problem is equivalently that of determining the initial conditions on the system adjoint.

Now consider an (n+1)-dimensional vector of the form

    \lambda^* = (\lambda, -1)     (20)

where \lambda = (\lambda_1, ..., \lambda_n). For an arbitrary \lambda, let the function G(t, \lambda) be given by

    G(t, \lambda) = \begin{cases} \dfrac{[\lambda \cdot X^{-1}(t) B(t)]'}{||\lambda \cdot X^{-1}(t) B(t)||} & \text{if } ||\lambda \cdot X^{-1}(t) B(t)|| > 1 \\[1ex] 0 & \text{if } ||\lambda \cdot X^{-1}(t) B(t)|| \le 1 \end{cases}     (21)

which is obtained from (19) by replacing \eta by \lambda. For a regular system, G(t, \lambda) is unique almost everywhere for every vector \lambda. Define the vector function

    z(T, \lambda^*) = \left[ \int_0^T X^{-1}(t) B(t) G(t, \lambda) \, dt, \; \int_0^T ||G(t, \lambda)|| \, dt \right]

In the same way as relation (18) was derived, the relation

    \lambda^* \cdot z(T, \lambda^*) > \lambda^* \cdot \xi     (22)

for all \xi \in \Omega(T), \xi \ne z(T, \lambda^*), can also be obtained. Hence, z(T, \lambda^*) is on the boundary of \Omega(T). The hyperplane of support to \Omega(T) at z(T, \lambda^*) has the normal vector \lambda^*. This hyperplane is denoted by B(\lambda^*) as shown in Figure 6.

The geometric formulation of the problem of determining an \eta^* associated with a specified initial state x(0) is depicted in Figure 6. If \eta^* is known, the corresponding point on the boundary of \Omega(T) has the coordinates

    z(T, \eta^*) = (-x(0), x^0_{n+1}(T))     (23)

where x^0_{n+1}(T) has been defined previously. The hyperplane of support at this point is denoted by A. Let \lambda^* be any vector as defined in (20) such that z(T, \lambda^*) \ne z(T, \eta^*). As shown above, z(T, \lambda^*) is on the boundary of \Omega(T). If \xi = (\xi_1, ..., \xi_n, \xi_{n+1}) is any point in the space E^{n+1} which also lies in B(\lambda^*), then

    \lambda^* \cdot z(T, \lambda^*) = \lambda^* \cdot \xi     (24)

This is simply the equation of the plane B(\lambda^*). The line through the point (-x(0), 0) parallel to the (n+1)st coordinate axis intersects B(\lambda^*) at the point \xi = (-x(0), \xi_{n+1}) as shown in the figure. Substituting \xi = (-x(0), \xi_{n+1}) into (24) gives

    \lambda^* \cdot z(T, \lambda^*) = \lambda^* \cdot (-x(0), \xi_{n+1})     (25)

Figure 6. Representation of the problem of determining \eta^*

Since z(T, \lambda^*) \ne z(T, \eta^*), it follows from (22) that

    \lambda^* \cdot z(T, \lambda^*) > \lambda^* \cdot z(T, \eta^*)     (26)

Substituting (23) and (25) into (26) gives

    \lambda^* \cdot (-x(0), \xi_{n+1}) > \lambda^* \cdot (-x(0), x^0_{n+1}(T))     (27)

Expanding (27) yields

    -\lambda \cdot x(0) - \xi_{n+1} > -\lambda \cdot x(0) - x^0_{n+1}(T)

which simplifies to

    \xi_{n+1} < x^0_{n+1}(T)     (28)

This result is geometrically clear from Figure 6 and indicates that the value of \xi_{n+1} cannot exceed x^0_{n+1}(T). Solving (25) for \xi_{n+1} gives

    \xi_{n+1} = -\lambda^* \cdot (x(0), 0) - \lambda^* \cdot z(T, \lambda^*)     (29)

It is clear from (29) and Figure 6 that \xi_{n+1} is a function of \lambda, since \lambda^* = (\lambda, -1). Now, if z(T, \lambda^*) = z(T, \eta^*), it readily follows that

    \xi_{n+1} = x^0_{n+1}(T)     (30)

However, (30) holds if and only if z(T, \lambda^*) = z(T, \eta^*). Hence, the problem of synthesizing the optimal control has been reduced to the problem of finding an (n+1)-dimensional vector \lambda^* = (\lambda, -1) which maximizes \xi_{n+1} as given in (29).

Computation of the Optimal Control

Since the (n+1)st coordinate of \lambda^* is fixed, the maximization of \xi_{n+1} in (29) is actually with respect to the n-dimensional vector \lambda. From (21)

    \lambda^* \cdot z(T, \lambda^*) = \int_0^T \left[ \lambda \cdot X^{-1}(t) B(t) G(t, \lambda) - ||G(t, \lambda)|| \right] dt     (31)

in which \lambda is the only variable. To simplify the notation, let

    g(\lambda) = \lambda^* \cdot z(T, \lambda^*)     (32)
Substituting (32) into (29) and expanding the result gives

    \xi_{n+1}(\lambda) = -\lambda \cdot x(0) - g(\lambda)     (33)

The maximization of \xi_{n+1}(\lambda) may be performed readily by utilizing the method of steepest ascent. The variable \lambda is made a function of some parameter \sigma and the differential equation

    \frac{d\lambda}{d\sigma} = k \nabla \xi_{n+1}(\lambda)     (34)

is solved for \lambda(\sigma). Here \lambda is an n-dimensional column vector, k is some positive constant, and \nabla denotes the gradient. Then, under the conditions discussed below and a suitable initial choice of \lambda, that is, \lambda(0), it can be shown that the desired \eta is given by

    \eta = \lim_{\sigma \to \infty} \lambda(\sigma)

In particular, it will be shown below that if this limit does exist, it is precisely the desired \eta. Substitution of this \eta into (19) then gives the optimal control for the problem. It can be shown⁸ that \nabla \xi_{n+1}(\lambda) is a continuous function of \lambda and is given by

    \nabla \xi_{n+1}(\lambda) = -x(0) - \int_0^T X^{-1}(t) B(t) G(t, \lambda) \, dt     (35)

Hence, from (34), d\lambda/d\sigma is also continuous. Assume

    \lim_{\sigma \to \infty} \lambda(\sigma) = \lambda^0     (36)

where \lambda^0 is a constant n-dimensional vector. Since \nabla \xi_{n+1}(\lambda) is continuous,

    \lim_{\sigma \to \infty} \nabla \xi_{n+1}(\lambda(\sigma)) = \nabla \xi_{n+1}(\lambda^0)     (37)

From (36) and (37) it then follows that as \lambda \to \lambda^0, d\lambda/d\sigma \to 0, i.e.

    \nabla \xi_{n+1}(\lambda^0) = 0     (38)

Hence, for \lambda = \lambda^0 in (35), it is clear that

    -x(0) = \int_0^T X^{-1}(t) B(t) G(t, \lambda^0) \, dt     (39)

This means that G(t, \lambda^0) is precisely the control required to drive the system to the origin from x(0) in time T. Let \lambda^{0*} = (\lambda^0, -1). Then, from the definition of z(T, \lambda^*), it follows that

    z(T, \lambda^{0*}) = \left[ \int_0^T X^{-1}(t) B(t) G(t, \lambda^0) \, dt, \; \int_0^T ||G(t, \lambda^0)|| \, dt \right]     (40)

Substitution of (39) into (40) gives

    z(T, \lambda^{0*}) = \left[ -x(0), \; \int_0^T ||G(t, \lambda^0)|| \, dt \right]

From (23) (see also Figure 6),

    z(T, \eta^*) = (-x(0), x^0_{n+1}(T))

Hence, the first n coordinates of z(T, \lambda^{0*}) are the same as the first n coordinates of z(T, \eta^*). By hypothesis, z(T, \eta^*) is a boundary point of \Omega(T). From (22) it is clear that z(T, \lambda^{0*}) is also a boundary point of \Omega(T). Hence,

    z(T, \lambda^{0*}) = z(T, \eta^*)

and, therefore, G(t, \lambda^0) is the optimal control.

Substituting (35) into (34) yields

    \frac{d\lambda}{d\sigma} = -k \left[ x(0) + \int_0^T X^{-1}(t) B(t) G(t, \lambda) \, dt \right]     (41)

which may be rewritten as

    \frac{d\lambda}{d\sigma} = -k X^{-1}(T) \left\{ X(T) \left[ x(0) + \int_0^T X^{-1}(t) B(t) G(t, \lambda) \, dt \right] \right\}

From (10) the term in braces is the solution of (7) for x(T), where the choice of \lambda determines the control. Since this x(T) depends on \lambda, which is a function of \sigma in (41), define

    x(T, \lambda) = X(T) \left[ x(0) + \int_0^T X^{-1}(t) B(t) G(t, \lambda) \, dt \right]     (42)

Substituting (42) into (41) yields

    \frac{d\lambda}{d\sigma} = -k X^{-1}(T) x(T, \lambda)     (43)

The discrete form of (43) is

    \lambda^{(i+1)} - \lambda^{(i)} = -K X^{-1}(T) x(T, \lambda^{(i)})     (44)

where K is some positive constant and i is the index of iteration. A computational procedure is now clear. An initial 'guess' \lambda^{(0)} is made and substituted into (19) to determine the corresponding control. This control is then applied to the system of (7) to obtain x(T, \lambda^{(0)}). This result is substituted into (44) to compute \lambda^{(1)}, the first iteration on \lambda. The cycle is then repeated using (44) to iterate on \lambda each time. The process continues until ||\lambda^{(i+1)} - \lambda^{(i)}|| is less than some specified constant. Then, the optimal control is obtained from (19) using the last iteration on \lambda.
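The complete synthesis loop, (42) together with the discrete iteration (44), can be sketched on a hypothetical scalar-control double-integrator example (A = [[0,1],[0,0]], B = (0,1)', X(t) = [[1,t],[0,1]], so that \lambda \cdot X^{-1}(t)B = \lambda_2 - \lambda_1 t). The initial error x(0), the horizon T, the step constant K, the midpoint quadrature, and the step-halving safeguard on \xi_{n+1} are all illustrative assumptions added for this sketch; they are not prescribed by the paper.

```python
import math

T = 4.0
X0 = [1.0, 0.0]   # initial error x(0) (hypothetical values)
N = 2000          # quadrature points for the integrals

def a_of(t, lam):
    # a(t) = lam . X^{-1}(t) B; here X^{-1}(t) B = (-t, 1)'
    return lam[1] - lam[0] * t

def sweep(lam):
    """One pass over [0, T]: v = integral of X^{-1} B G dt, and g(lam) of eqn (32)."""
    h = T / N
    v0 = v1 = g = 0.0
    for k in range(N):
        t = (k + 0.5) * h
        a = a_of(t, lam)
        u = math.copysign(1.0, a) if abs(a) > 1.0 else 0.0  # G(t, lam), eqn (21)
        v0 += -t * u * h
        v1 += u * h
        g += (a * u - abs(u)) * h
    return (v0, v1), g

def x_T(lam):
    """Eqn (42): x(T, lam) = X(T)[x(0) + v], with X(T) = [[1, T], [0, 1]]."""
    (v0, v1), _ = sweep(lam)
    b0, b1 = X0[0] + v0, X0[1] + v1
    return [b0 + T * b1, b1]

def xi(lam):
    """Eqn (33): xi_{n+1}(lam) = -lam . x(0) - g(lam), the quantity being maximized."""
    _, g = sweep(lam)
    return -(lam[0] * X0[0] + lam[1] * X0[1]) - g

# Iteration (44): lam <- lam - K X^{-1}(T) x(T, lam), X^{-1}(T) = [[1, -T], [0, 1]],
# with a step-halving safeguard (an illustrative addition, not from the paper).
lam, K = [0.0, 0.0], 0.02
phis = [xi(lam)]
for _ in range(300):
    xT = x_T(lam)
    trial = [lam[0] - K * (xT[0] - T * xT[1]), lam[1] - K * xT[1]]
    if xi(trial) >= phis[-1]:    # accept only non-decreasing steps
        lam = trial
        phis.append(xi(trial))
    else:
        K *= 0.5
print(lam, phis[-1], x_T(lam))
```

For this example the optimal adjoint initial condition can be found in closed form (thrust arcs of length 2 - sqrt(3) at each end of [0, T], giving \lambda = (-1/\sqrt{3}, -2/\sqrt{3})), which provides a useful check on the quadrature and on the iterate the loop approaches.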
Discussion of Results

This paper has presented a synthesis procedure for a class of minimal effort controls. While the results are fairly general, it will prove instructive to discuss them in terms of the guidance application which motivated the study. By minimizing the fuel consumed in the midcourse phase, more space is made available on a given space vehicle for such items as life support equipment, communication systems, and scientific instruments. The scope and efficiency of space missions are, of course, directly dependent on the availability of additional equipment. While it is desirable that the space saved by using the optimal system be sizable, a saving of only a few pounds may permit the inclusion of experiments not possible otherwise. An important factor in space and weight considerations is whether or not the computations required can be done by equipment already on board the vehicle. If additional computer equipment is required, the question of savings in weight and space must be examined carefully. Just how close the terminal error x(T) will come to being zero in any practical system depends on a number of factors.
Some of the more important factors include the accuracy with which the initial error x(0) can be measured, the accuracy with which the control can be computed, and the capability of the midcourse propulsion system to implement this control. The first of these factors depends upon the performance limitations of measurement schemes and the associated sensors. In general, the longer one is willing to wait, the better will be the estimates of position and velocity errors. However, in waiting, initial errors are permitted to continue uncontrolled. The trade-off is obvious. One possible approach is to divide the mission interval [0, T] into two or more sub-intervals. Then, while control is being executed over one sub-interval on the basis of earlier error estimates, data can be sensed and processed to improve the error estimates for succeeding sub-intervals. While the system is now optimal over each sub-interval rather than over the entire interval [0, T], the control is based on more accurate position and velocity data. Since the control operates 'open-loop' once \eta is determined, this latter approach also permits the system to correct for disturbances occurring in previous sub-intervals. The accuracy with which the control can be computed is governed by how rapidly the solution of (43) converges. In addition, the amount of computing time allowed before control must be initiated, as well as computer speed, are key factors. Studies are needed to determine how small ||\lambda^{(i+1)} - \lambda^{(i)}|| must be before computation is terminated. This information is essential to ensure that ||x(T)|| can be brought within reasonable limits. As a result of the 'on-off' nature of the optimal control, non-throttlable propulsion can be used. Hence, the problems associated with controlling throttlable engines are completely circumvented. However, since the direction of the control is time-varying, some means must be provided for controlling this direction.
= [L (u**), F(u**)], so that Y*E Q (1'). Since for all t, it follows that F (u i ) ::; r (i = 1, 2). Thus
y*
Y:+ 1 =aF (u l )
IllIi (t)11 ::;
I
+ f3F (u 2 )::; (a+ 13) l' =T
To construct 11**, a control v (t) will first be constructed, with the property that 1 v (1)11 = 1 for all t (so that F(v) = r), and such that L (v) = L (u*). Let
it (t) = {
u*(t)
if u*(t)#O
(1,0, ... ,0) if u*(t)=O so that 1 ::::: 11 it (1)11 > 0 for all t. It is clear that it (t) is admissible and that
u
*( )_11 *11 it(t) t - u Ilit(t)11
(45)
Let -1
lJ(t)=X
(t)B(t)
it(t)
(46)
Ilit(t)11
Clearly, 17 (t) is measurable and bounded, and, by (45),
L (u*) =
f:
IJ (t)
Ilu* (t)11 dt
According to Lemma 2 of ref. 9, there is a measurable function with I ex (t) I == 1 such that L (u*) = \ ~ 17 (t) ex (t) dt. Let
ex (t)
it (t) v(t)=a(t) Ilit (t)il Then
Ilv (1)11
=
L(u*)=
I ex (t) I =
f:
(47)
1 for all t, and, by (46) and (47),
lJ(t)cx(t)dt=
f:
X-I (t)B(t)v(t)dt=L(v)
and v (t) has the desired properties. Repeating the above construction on sub-intervals of [0, r], it is possible to find admissible controls VS (t), for 0 ::; t ::; s (s < r), such that 1 VS (1)11 = 1 for all t ::; s, and
Appendix
In this appendix it will be proved that the set Q(τ) is convex, closed, and bounded. Since the components of X⁻¹(t), B(t), and u(t) (so long as u is admissible) for 0 ≤ t ≤ τ are uniformly bounded, it immediately follows that Q(τ) is bounded. Let

    L(u(t)) = ∫₀^τ X⁻¹(t) B(t) u(t) dt

and

    F(u(t)) = ∫₀^τ ‖u(t)‖ dt

To show that Q(τ) is convex it must be shown that if y¹ and y² are any two points in Q(τ), then y* = αy¹ + βy² also belongs to Q(τ) whenever α ≥ 0, β ≥ 0, α + β = 1. Since yⁱ ∈ Q(τ), yⁱ = [L(uⁱ(t)), F(uⁱ(t))], where uⁱ(t) is admissible (i = 1, 2). Let u*(t) = αu¹(t) + βu²(t); then

    ‖u*(t)‖ = ‖αu¹(t) + βu²(t)‖ ≤ α‖u¹(t)‖ + β‖u²(t)‖ ≤ α + β = 1

for all t (since u¹ and u² are admissible), so that u*(t) is admissible. The above inequalities also imply that F(u*) ≤ αF(u¹) + βF(u²). Since L is a linear operator, L(u*) = αL(u¹) + βL(u²). Thus

    y* = [αL(u¹) + βL(u²), αF(u¹) + βF(u²)] = [L(u*), y*_{n+1}]

where y*_{n+1} ≥ F(u*). An admissible control u**(t) will now be constructed such that L(u*) = L(u**) and F(u**) = y*_{n+1}, i.e. such that y* = [L(u**), F(u**)]. For each s, 0 ≤ s ≤ τ, let v^s(t) be an admissible control with ‖v^s(t)‖ = 1 such that

    ∫₀^s X⁻¹(t) B(t) v^s(t) dt = ∫₀^s X⁻¹(t) B(t) u*(t) dt

and let

    u^s(t) = v^s(t)   for 0 ≤ t ≤ s
    u^s(t) = u*(t)    for s < t ≤ τ

Then it is easily seen that L(u^s) = L(u*). Define θ(s) = F(u^s), so that θ(0) = F(u^0) = F(u*) ≤ y*_{n+1} and θ(τ) = F(u^τ) = τ ≥ y*_{n+1}. It is obvious that θ(s) is a continuous function of s, so that θ(s₀) = y*_{n+1} for some s₀, 0 ≤ s₀ ≤ τ. If u**(t) = u^{s₀}(t), u** has the promised properties.

To prove that Q(τ) is closed, let y^j = [L(u^j), F(u^j)], j = 1, 2, …, with u^j(t) admissible, be a sequence of points in Q(τ) with y^j → y*. It must be shown that y* ∈ Q(τ). Since |u_i^j(t)| ≤ 1 for all t, all j, and i = 1, …, r, the functions u_i^j(t) are uniformly bounded in the norm of the Hilbert space L²(0, τ). By a well-known property of L², there exists a subsequence u_i^{j_k}(t) such that u_i^{j_k}(t) → u_i*(t) weakly as k → ∞ for every i and some functions u_i*(t) in L². It will simply be assumed that u_i^j(t) → u_i*(t) weakly as j → ∞ for each i. By definition of weak convergence, L(u^j) → L(u*) as j → ∞, where u* = (u₁*, …, u_r*). It is a further consequence of the weak convergence that ‖u*(t)‖ ≤ 1 for almost all t, and that F(u*) ≤ lim_{j→∞} F(u^j). Thus y* = [lim L(u^j), lim F(u^j)] = [L(u*), y*_{n+1}] with y*_{n+1} ≥ F(u*). Now an admissible control u**(t) such that y* = [L(u**), F(u**)] can be constructed just as above, thereby proving that y* ∈ Q(τ) and that Q(τ) is closed.
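The convexity argument above can be checked numerically. The following is a minimal sketch assuming the illustrative choices X(t) = B(t) = I on a two-dimensional example; these are hypothetical stand-ins, not the paper's guidance dynamics.

```python
import numpy as np

# Discretize [0, tau] and approximate the functionals L and F by Riemann sums.
tau = 1.0
t = np.linspace(0.0, tau, 1001)
dt = t[1] - t[0]

def L(u):
    # L(u) = integral of X^{-1}(t) B(t) u(t) dt; here X = B = I (assumption).
    return u.sum(axis=1) * dt

def F(u):
    # F(u) = integral of ||u(t)|| dt (the fuel functional).
    return np.linalg.norm(u, axis=0).sum() * dt

# Two admissible controls: ||u(t)|| <= 1 for all t.
u1 = np.vstack([np.cos(t), np.sin(t)]) * 0.8   # constant norm 0.8
u2 = np.vstack([np.ones_like(t), np.zeros_like(t)]) * 0.5  # constant norm 0.5

alpha, beta = 0.3, 0.7
u_star = alpha * u1 + beta * u2

# ||u*(t)|| <= alpha ||u1(t)|| + beta ||u2(t)|| <= 1, so u* is admissible.
assert np.all(np.linalg.norm(u_star, axis=0) <= 1.0 + 1e-12)
# L is linear: L(u*) = alpha L(u1) + beta L(u2).
assert np.allclose(L(u_star), alpha * L(u1) + beta * L(u2))
# F is convex: F(u*) <= alpha F(u1) + beta F(u2).
assert F(u_star) <= alpha * F(u1) + beta * F(u2) + 1e-12
```

The three assertions mirror the three facts the proof uses: admissibility of the convex combination, linearity of L, and convexity of F.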
References
1. Boltyanskii, V. G., Gamkrelidze, R. V., and Pontryagin, L. S. The theory of optimal processes. I. The maximum principle. Amer. Math. Soc. Transl. Ser. 2, 18 (1961), 341
2. Kalman, R. E. The theory of optimal control and the calculus of variations. Res. Inst. Adv. Studies Tech. Rep. 61-3
3. Berkovitz, L. D. Variational methods in problems of control and programming. J. Math. Analysis and Applic. 3, No. 1 (1961), 145
4. Lukes, D. Application of Pontryagin's maximum principle in determining the optimal control of a variable mass vehicle. Amer. Rocket Soc. Paper 1927-61, August 1961
5. Leitmann, G. On a class of variational problems in rocket flight. J. Aero. Sci. 26, No. 9 (1959), 586
6. Isaev, V. K. L. S. Pontryagin's maximum principle and optimal programming of rocket thrust. Automat. Rem. Control 22, No. 18 (1961), 881
7. Bryson, A. E., et al. Determination of the lift or drag program that minimizes re-entry heating with acceleration or range constraints using a steepest descent computational procedure. Paper presented at Inst. Aero. Sci. meeting, New York, January 1961
8. Neustadt, L. W. On synthesizing optimal controls. Automatic and Remote Control, 1963. London; Butterworths: Munich; Oldenbourg
9. La Salle, J. P. The 'bang-bang' principle. Automatic and Remote Control, 1961. London; Butterworths
DISCUSSION

M. Athans, M.I.T. Lincoln Laboratory, Lexington, Massachusetts, U.S.A. In the literature on optimal control, the paper by Meditch and Neustadt is one of the few concerned with control constraints of the type ‖u(t)‖ ≤ 1. The purpose of this discussion is to present some recent results which provide the solution to a class of problems using the same control constraint. Consider the system

    ẋ(t) = f[x(t); t] + u(t)

where x(t), ẋ(t), u(t), and f are n-dimensional vectors and t is the (scalar) time.

Assumption 1. The control vector u(t) has n non-zero components u₁(t), …, u_n(t) and is constrained by

    ‖u(t)‖ = √(u₁²(t) + … + u_n²(t)) ≤ M for all t

Assumption 2.

    ⟨f[x(t); t], x(t)⟩ = g[‖x(t)‖; t] for all x(t) and t

where ⟨f, x⟩ is the scalar product of the vectors f and x, ‖x(t)‖ is the Euclidean norm of the vector x(t), and g is a scalar function of ‖x(t)‖ and t.

Under these assumptions it has been proved¹ that the control

    u(t) = −M x(t)/‖x(t)‖

will force any initial controllable state to the origin, 0, in minimum time. In the special case g[‖x(t)‖; t] = 0 for all x(t) and t, the time-optimal control u(t) = −M x(t)/‖x(t)‖ is also fuel-optimal to the origin in the sense that it minimizes the functional

    ∫₀^T ‖u(t)‖ dt

This fuel-optimality has also been proved²,³. The theoretical results given above can be used for the optimal angular velocity control of a tumbling body in space, using gimballed reaction jets.

References
1. Athans, M. and Falb, P. L. Time-optimal control for a class of non-linear systems. IEEE Trans. Automatic Control (October 1963)
2. Athans, M., Falb, P. L., and Lacoss, R. T. Time-, fuel-, and energy-optimal control of non-linear norm-invariant systems. IEEE Trans. Automatic Control (July 1963)
3. Athans, M., Falb, P. L., and Lacoss, R. T. On optimal control of self-adjoint systems. Proc. 1963 Joint Automatic Control Conf. (June 1963), pp. 113-120
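The control law u(t) = −M x(t)/‖x(t)‖ in Athans' discussion can be illustrated by simulation. This is a minimal sketch under an assumed skew-symmetric drift matrix A (a hypothetical example with ⟨Ax, x⟩ = 0, so g ≡ 0), not the systems treated in the cited references. Since d‖x‖/dt = −M along trajectories, the origin should be reached at t = ‖x(0)‖/M with fuel ∫‖u‖ dt = ‖x(0)‖.

```python
import numpy as np

# Hypothetical norm-invariant system: xdot = A x + u with A skew-symmetric,
# so <A x, x> = 0 and Assumption 2 holds with g = 0.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])          # pure rotation (illustrative choice)
M = 1.0                              # thrust bound ||u(t)|| <= M
x = np.array([3.0, 4.0])             # initial state, ||x(0)|| = 5

dt = 1e-4
t_elapsed = 0.0
fuel = 0.0
while np.linalg.norm(x) > 1e-3:
    u = -M * x / np.linalg.norm(x)   # full thrust directed against x
    x = x + (A @ x + u) * dt         # forward-Euler integration step
    fuel += M * dt                   # ||u(t)|| = M while thrusting
    t_elapsed += dt

# Analytically: d||x||/dt = -M, so arrival time = ||x(0)||/M = 5
# and fuel = ||x(0)|| = 5, up to discretization error.
print(round(t_elapsed, 2), round(fuel, 2))
```

The simulation confirms the closed-form prediction: the norm of the state decays linearly at rate M regardless of the rotation induced by A.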
J. S. MEDITCH, in reply Dr. Athans' interesting discussion is especially important since it presents one of the few classes of problems in which the minimal fuel control law can be obtained in closed form as a function of the 'instantaneous' state of the dynamical system, i.e., as a feedback law.