Copyright © IFAC 12th Triennial World Congress, Sydney, Australia, 1993
A DISTURBANCE ATTENUATION APPROACH TO ADAPTIVE CONTROL

D.F. Chichka and J.L. Speyer

Department of Mechanical, Aerospace and Nuclear Engineering, University of California at Los Angeles, Los Angeles, California, USA
Abstract. This paper discusses the control of systems with poorly known, not necessarily constant parameters under the effect of process and measurement noise. To limit the effects of the uncertainties and disturbances, a disturbance attenuation function is constructed and converted to a performance index. A multi-player game is then considered, in which the control attempts to minimize the performance index, which the uncertainties and disturbances attempt to maximize. By taking a dynamic programming approach, the problem is decomposed into a forward-looking optimal control problem and a past-time filtering problem. These are each much simpler than the original problem, allowing a larger class of problems to be tackled. Most of the results presented here are for linear systems, with some suggestions for their extension to non-linear cases.

Keywords: Adaptive Control, Disturbance Attenuation Problem, Game Theory
1. INTRODUCTION
In Rhee and Speyer [1], the disturbance attenuation problem is shown to extend the results of $H_\infty$ analysis from time-invariant systems on infinite intervals to time-varying systems on finite intervals. In this approach, a disturbance attenuation function is constructed. This is done by defining a measure of the uncertainties and disturbances, and a measure of their effects on the output. The function itself is a ratio of these measures. The disturbance attenuation function is then converted to a performance index similar to those in more common optimal control problems. The problem then becomes a zero-sum game, in which the initial uncertainties and the system disturbances are considered as intelligent adversaries attempting to maximize the performance index, with the control playing their opponent, trying to minimize it.
The controller resulting from this approach is a function both of the state and parameter estimates and of the associated curvature matrices, which act as pseudo-variances. Although this approach represents a type of separation, no explicit assumption is made of certainty equivalence, as found in the development of current adaptive control algorithms. In the sections of this paper in which the approach is developed, the systems considered will be linear. The uncertainties include a poor knowledge of the initial conditions, and uncertainty in the coefficient matrices of both the state and the control. These matrices are considered to be functions of some parameters. The functions are assumed to be well known, but the parameters are not. In later sections, the application of the approach to non-linear systems is discussed.
This approach has in the past produced very complex results when applied to other than a few special cases [2, 4]. Speyer, Fruchter, and Hahn [2] applied it to a scalar system, and for a constant unknown control parameter obtained a closed-form solution that hinged upon the solution of a fifth-order polynomial. In this paper, an alternate approach suggested by Bernhard [3] appears to produce some simplification. By using the Principle of Optimality, the problem is split into two parts, each to be solved separately, and rejoined by an algebraic "connection" condition. The resulting problems are much simpler than the original, although in general still quite complex. In many cases, however, they can be reduced to a manageable level.
In the next section, the general problem will be presented. The third section will discuss the dynamic programming approach to the problem, which results in a forward-looking optimal control problem and a rear-looking filtering problem. Section 4 will deal with the problem in which the unknown parameters are considered constant, for which extensive results can be obtained. Finally, section 5 will describe some ways in which the approach can be used to tackle non-linear problems.
Notational Comments: In what follows, $\tau$ will always be the running variable; $t$ will be some fixed time. Thus the overdot will refer to differentiation with respect to $\tau$. We also will use the notational convention $f_{t_1}^{t_2} \triangleq \{f(\tau) : t_1 \le \tau \le t_2\}$. The construction $\|y\|_Y$ refers to the Euclidean norm of $y$ weighted by the matrix $Y$: $\|y\|_Y^2 = y^T Y y$. Because so many variables are subscripted, derivatives (other than those with respect to $\tau$) will always be explicitly identified.

2. THE PROBLEM AND THE APPROACH

This section outlines the class of problems under primary consideration in this paper, and describes briefly the game-theoretic approach from which we begin.

2.1. Dynamic Model

Consider the class of problems

$$\dot{x} = A(\alpha)x + B(\alpha)u + w_1 \qquad (1)$$
$$\dot{\alpha} = w_2 \qquad (2)$$
$$z = Hx + v \qquad (3)$$

Here, $x(\cdot)$ is the $n$-dimensional state vector, $u(\cdot)$ an $m$-dimensional vector of control inputs, and $z(\cdot)$ the $j$-dimensional output vector. $w_1$, $w_2$, and $v$ are appropriately dimensioned disturbance inputs. We assume that all disturbance functions lie in $\mathcal{L}_2$ spaces and are defined $\forall\, \tau \in [0, t_f]$. The control function $u(\tau)$ is in the space of measurable functions everywhere defined on the interval.

The coefficient matrices $A$ and $B$ are known functions of the parameter vector $\alpha \in \mathbb{R}^p$. For the initial analysis, the form of the dependence is unimportant. In later sections, we will make some assumptions to allow specific results to be formulated.

Remark 1: It should be noted that the assumed evolution of $\alpha$ in the system formulation above is very simple. It might represent, for example, drift in some system parameters. This simple formulation has been chosen for convenience; more complex systems are certainly possible and worthy of investigation. •

In what follows, it will often be convenient to speak of the vector $\xi(\cdot) \triangleq \{x^T(\cdot)\ \alpha^T(\cdot)\}^T$.

2.2. The Performance Index

It is our objective to limit the effects of any possible combination of disturbance inputs and initial uncertainties. We take the disturbance attenuation approach: we attempt to limit some measure $\|y\|$ of the effect of the disturbance/uncertainty combination to some (preferably small, of course) multiple of some measure $\|w\|$ of that combination.

Consider the disturbance attenuation function, and require

$$\frac{\|y\|^2}{\|w\|^2} \le \theta \qquad (4)$$

The disturbance attenuation problem is to find a control such that this equation is satisfied for all admissible disturbance/uncertainty combinations in $\mathcal{L}_2$.

We define the measures in the disturbance attenuation function for the system outlined above as

$$\|y\|^2 = \|x(t_f)\|_{Q_f}^2 + \int_0^{t_f}\left(\|x\|_Q^2 + \|u\|_R^2\right)d\tau$$
$$\|w\|^2 = \|\xi(0) - \hat{\xi}_0\|_{P^{-1}}^2 + \int_0^{t_f}\left(\|w\|_{W^{-1}}^2 + \|v\|_{V^{-1}}^2\right)d\tau$$

Here $R$ is a positive definite weighting matrix and $Q$ and $Q_f$ are non-negative definite weightings. In the disturbance measure, $P$, $W$, and $V$ are positive definite. The vector $\hat{\xi}_0$ is a given estimate of the initial values of $x$ and $\alpha$.

2.3. The Game Problem

We will attempt to satisfy the disturbance attenuation criterion by using a game-theory approach. Convert the condition (4) to a game performance criterion as

$$J = \frac{1}{2}\left\{\|y\|^2 - \theta\|w\|^2\right\} \qquad (5)$$

where we have added the $\frac{1}{2}$ for later convenience. The problem is now a minimax game: the disturbances and initial uncertainties will attempt to maximize the performance index, while the control will try to keep it to a minimum. Thus we will be trying to solve the problem

$$\min_u \max_{\xi(0),\, w,\, v} J \qquad (6)$$

with $J$ defined above, subject to the dynamical system (1-3).
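As a purely numerical illustration of (4)-(6), the sketch below simulates the system (1)-(3) under sampled disturbances and accumulates the two measures and the game index. All data here (matrices, weights, horizon, attenuation level, and the placeholder control law) are hypothetical, and the control used is not the game-optimal one; the point is only how $\|y\|^2$, $\|w\|^2$, and $J$ are assembled.

```python
import numpy as np

# Hypothetical scalar example: accumulate the measures in (4) and the index (5).
dt, tf = 0.01, 2.0
steps = int(tf / dt)
Q, R, Qf = 1.0, 1.0, 1.0              # weights in the effect measure ||y||^2
Pinv = np.diag([10.0, 10.0])          # P^{-1}: weight on initial uncertainty in xi
Winv = np.diag([20.0, 5.0])           # W^{-1}: weight on (w1, w2)
Vinv = 50.0                           # V^{-1}: weight on measurement disturbance
theta = 5.0                           # candidate attenuation level

A = lambda a: -1.0 + a                # assumed parameter dependence A(alpha)
B = lambda a: 1.0

rng = np.random.default_rng(0)
x, alpha = 0.5, 0.2                   # true xi(0)
xi0_hat = np.zeros(2)                 # given estimate of xi(0)

e0 = np.array([x, alpha]) - xi0_hat
y_meas = 0.0                          # ||y||^2: effect of disturbances/uncertainty
w_meas = float(e0 @ Pinv @ e0)        # ||w||^2: starts with the initial-error term

for _ in range(steps):
    u = -x                            # placeholder control, not the game-optimal law
    w1, w2, v = 0.1 * rng.standard_normal(3)   # sampled L2 disturbance histories
    y_meas += (Q * x**2 + R * u**2) * dt
    wvec = np.array([w1, w2])
    w_meas += (float(wvec @ Winv @ wvec) + Vinv * v**2) * dt
    x += (A(alpha) * x + B(alpha) * u + w1) * dt   # Euler step of (1)
    alpha += w2 * dt                               # (2): parameter drift
y_meas += Qf * x**2                   # terminal term of ||y||^2

J = 0.5 * (y_meas - theta * w_meas)   # index (5); attenuation (4) holds when J <= 0
print(f"||y||^2 = {y_meas:.3f}  ||w||^2 = {w_meas:.3f}  J = {J:.3f}")
```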
3. DYNAMIC PROGRAMMING APPROACH

Define the function

$$\Psi(t, \xi_t) \triangleq \operatorname{extr} J_c[t; t_f] \qquad (7)$$

where

$$J_c[t; t_f] = \frac{1}{2}\left\{\|x(t_f)\|_{Q_f}^2 + \int_t^{t_f}\left[\|x\|_Q^2 + \|u\|_R^2 - \theta\left(\|w\|_{W^{-1}}^2 + \|v\|_{V^{-1}}^2\right)\right]d\tau\right\}$$

and $\xi_t$ denotes $\xi$ at time $t$. That is to say, given any time $t$ and conditions $x$ and $\alpha$ within the acceptable spaces at that time, $\Psi$ is the result of extremizing $J_c[t, t_f]$ over the remaining variables. Then, using the Principle of Optimality, the problem (6) can be written at any time $t$ as

$$\min_u \max_{\xi_t}\left\{\max_{\xi(0),\, w_0^t,\, v_0^t} J_f[0; t] + \Psi(t, \xi_t)\right\} \qquad (8)$$

where

$$J_f[0; t] = \frac{1}{2}\left\{-\theta\|\xi(0) - \hat{\xi}_0\|_{P^{-1}}^2 + \int_0^t\left[\|x\|_Q^2 + \|u\|_R^2 - \theta\left(\|w\|_{W^{-1}}^2 + \|v\|_{V^{-1}}^2\right)\right]d\tau\right\}$$

Thus the problem has been reduced to a pair of simpler problems. The construction of $\Psi(t, \xi_t)$, which is simply the optimal return function for this problem, will be referred to as the optimal control subproblem. Once this is done, the maximization of (5) can be attempted. This step will be known as the filtering subproblem.

Remark 2: An immediate simplification that should be noted is that the structure of the control subproblem indicates that $v_t^{t_f}$ is identically zero. This is due to the lack of weighting on the output. Also, since the filtering problem is entirely in the past, the output $z_0^t$ and the control history $u_0^t$ are known functions. The control term is then removed from the performance index in the filtering problem. These considerations lead to a much simpler problem than that laid out in (6). •
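The decomposition (8) rests on the index splitting additively at the intermediate time $t$, with the two pieces coupled only through $\xi_t$. A minimal sanity check of that split, for fixed (non-extremal) trajectories and made-up data:

```python
import numpy as np

# The running cost over [0, tf] equals the piece over [0, t] plus the piece
# over [t, tf]; dynamic programming exploits exactly this split. Data are made up.
dt, tf, t_split = 0.01, 2.0, 0.8
tau = np.arange(0.0, tf, dt)
Q, R, theta, Winv, Vinv = 1.0, 1.0, 5.0, 20.0, 50.0

x = np.exp(-tau); u = -x                       # stand-ins for stored trajectories
w = 0.05 * np.sin(3 * tau); v = 0.02 * np.cos(5 * tau)

run = 0.5 * (Q * x**2 + R * u**2 - theta * (Winv * w**2 + Vinv * v**2))
past = run[tau < t_split].sum() * dt           # integral piece of J_f[0; t]
future = run[tau >= t_split].sum() * dt        # integral piece of J_c[t; tf]
print(np.isclose(past + future, run.sum() * dt))   # True
```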
4. CONSTANT PARAMETERS

In this section, we assume that the unknown parameters are constants. Thus, the forcing function $w_2$ is identically zero, and the system is simply

$$\dot{x} = A(\alpha)x + B(\alpha)u + w_1 \qquad (9)$$
$$z = Hx + v \qquad (10)$$

In what follows, $x$ will refer not to the actual state variable but to our worst-case estimate of it, which we require to satisfy (9) and (10) as well as the extremal conditions to be derived.

4.1. The Optimal Control Problem

The optimal control subproblem is to solve (7) subject to the dynamics (9-10), as a function of the values of the state and parameters at time $t$. Keeping in mind Remark 2 at the end of the third section, we may ignore the measurement disturbance in this part of the problem. Thus we have

$$\Psi(t, x_t) = \min_u \max_w \frac{1}{2}\left\{\|x(t_f)\|_{Q_f}^2 + \int_t^{t_f}\left[\|x\|_Q^2 + \|u\|_R^2 - \theta\|w\|_{W^{-1}}^2\right]d\tau\right\}; \qquad x_t,\ \alpha\ \text{given.} \qquad (11)$$

This is a standard linear-quadratic two-player game, of the kind considered in Bryson and Ho [5] among others. The usual Hamilton-Jacobi-Bellman results give the optimal return function as

$$\Psi(t, x_t) = \frac{1}{2}x_t^T \Pi(t) x_t \qquad (12)$$

with the optimal control being given by

$$u(\tau) = -R^{-1}B^T\left(\frac{\partial \Psi}{\partial x}\right)^T(\tau) = -R^{-1}B^T\Pi(\tau)x(\tau) \qquad (13)$$

The Riccati matrix $\Pi(\cdot)$ is given by

$$\dot{\Pi} = -\Pi A - A^T\Pi - \Pi\left(\theta^{-1}W - BR^{-1}B^T\right)\Pi - Q \qquad (14)$$

subject to the final condition $\Pi(t_f) = Q_f$.
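A minimal numerical sketch of the control subproblem, assuming hypothetical matrices and attenuation level: the game Riccati equation (14) is swept backward from $\Pi(t_f) = Q_f$ by Euler steps, and the feedback gain of (13) is stored along the way. For too small a $\theta$ the Riccati solution can escape in finite time, the finite-horizon analogue of demanding an unachievable attenuation level.

```python
import numpy as np

# Backward Euler sweep of the game Riccati equation (14); all data hypothetical.
dt, tf, theta = 0.001, 2.0, 5.0
steps = int(tf / dt)

A = np.array([[0.0, 1.0], [-2.0, -0.5]])   # A(alpha) frozen at one candidate alpha
B = np.array([[0.0], [1.0]])
Q, Qf = np.eye(2), np.eye(2)
R = np.array([[1.0]])
W = 0.1 * np.eye(2)

Pi = Qf.copy()                              # final condition Pi(tf) = Qf
gains = [None] * steps                      # K(tau), with u = -K x as in (13)
for k in reversed(range(steps)):
    gains[k] = np.linalg.solve(R, B.T @ Pi)
    dPi = -Pi @ A - A.T @ Pi - Pi @ (W / theta - B @ np.linalg.solve(R, B.T)) @ Pi - Q
    Pi = Pi - dt * dPi                      # step backward from tf toward 0

print("Pi(0) =\n", Pi)                      # curvature of the return function (12)
print("K(0)  =", gains[0])
```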
4.2. The Filtering Problem

In this section we are looking back from time $t$. Thus, the control history $u_0^t$ and the output history $z_0^t$ are known functions. The performance index we are to extremize is the index of section 3, scaled by $\theta^{-1}$ (which leaves the extremizing solution unchanged),

$$J_f[0; t] = -\frac{1}{2}\left[\|x(0) - \hat{x}_0\|_{P^{-1}}^2 + (\alpha - \hat{\alpha})^2 P_\alpha\right] + \frac{1}{2}\int_0^t\left[\theta^{-1}\|x\|_Q^2 - \|w\|_{W^{-1}}^2 - \|z - Hx\|_{V^{-1}}^2\right]d\tau \qquad (15)$$

where we have used (10) to replace $v$ with the quantity $z - Hx$. As remarked in section 3, the control $u_0^t$ being known removes it from the performance index. We also note that knowledge of the control and measurement histories allows us to fully specify the history of the dynamic system through a knowledge of the initial condition $x(0)$, the parameter $\alpha$, and the disturbance history $w_0^t$. Thus the filtering problem reduces to

$$\max_{x_0,\, \alpha,\, w_0^t} J_f[0; t]$$

We take a standard variational approach. Using $\eta$ as a Lagrange multiplier vector associated with the dynamical system (9), we find the first-order necessary conditions to be

$$w = W\eta \qquad (16)$$
$$\dot{\eta} = \left(H^T V^{-1} H - \theta^{-1} Q\right)x - A^T\eta - H^T V^{-1} z \qquad (17)$$
$$P\eta(0) = x(0) - \hat{x}_0 \qquad (18)$$
$$\eta_t = \theta^{-1}\left(\frac{\partial \Psi}{\partial x_t}\right)^T \qquad (19)$$

together with the stationarity condition in $\alpha$,

$$\frac{\partial}{\partial \alpha}\int_0^t \eta^T (Ax + Bu)\, d\tau + \theta^{-1}\frac{\partial \Psi}{\partial \alpha} + (\hat{\alpha} - \alpha)P_\alpha = 0 \qquad (20)$$

Noting that $\eta$, $x$, and $\alpha$ are completely independent of one another, we write (20) as

$$\int_0^t \eta^T\left(\frac{dA}{d\alpha}x + \frac{dB}{d\alpha}u\right)d\tau + \theta^{-1}\frac{\partial \Psi}{\partial \alpha} + (\hat{\alpha} - \alpha)P_\alpha = 0 \qquad (21)$$
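Once the extremal histories are stored, the left-hand side of (21) is a quadrature plus two algebraic terms. A minimal sketch for a scalar parameter follows; the histories, the derivatives `dA`, `dB`, and the value of $\partial\Psi/\partial\alpha$ passed in are hypothetical stand-ins for quantities produced elsewhere.

```python
import numpy as np

def connection_residual(alpha, alpha_hat, P_alpha, theta, dt,
                        eta_hist, x_hist, u_hist, dA, dB, dPsi_dalpha):
    # Left-hand side of (21), scalar-parameter case; the histories are assumed
    # to be the stored extremal trajectories on [0, t].
    integrand = eta_hist * (dA(alpha) * x_hist + dB(alpha) * u_hist)
    return integrand.sum() * dt + dPsi_dalpha / theta + (alpha_hat - alpha) * P_alpha

# Illustrative call with made-up scalar histories:
tau = np.arange(0.0, 1.0, 0.01)
r = connection_residual(
    alpha=0.3, alpha_hat=0.25, P_alpha=2.0, theta=5.0, dt=0.01,
    eta_hist=0.1 * np.exp(-tau), x_hist=np.exp(-tau), u_hist=-np.exp(-tau),
    dA=lambda a: 1.0, dB=lambda a: 0.0,     # e.g. A(alpha) affine in alpha
    dPsi_dalpha=0.02)
print("residual of (21):", r)
```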
Remark 3: As it happens, it would be perfectly straightforward to include uncertainty in the $H$ matrix as well. The optimality condition (21) would include another term containing $dH/d\alpha$:

$$\frac{1}{2}\frac{\partial}{\partial \alpha}\|z - Hx\|_{V^{-1}}^2 = -(z - Hx)^T V^{-1}\frac{dH}{d\alpha}x$$

This would make what follows a bit more intricate, but would not be impossible. As it is, we will have constructions allowing terms of the form $x^T M x$ to be evaluated. The extension is not included here because it offers no additional insight. •

Following usual practice, we make the assumption

$$x = \hat{x} - S\eta \qquad (22)$$

Under this assumption, eqns (9) and (16)-(19) yield after some manipulation the filter equations
$$\dot{\hat{x}} = \left(A - \theta^{-1}SQ\right)\hat{x} + Bu - SH^TV^{-1}(z - H\hat{x}); \qquad \hat{x}(0) = \hat{x}_0 \qquad (23)$$

$$\dot{S} = AS + SA^T + S\left(H^TV^{-1}H - \theta^{-1}Q\right)S - W; \qquad S(0) = -P \qquad (24)$$

where we have chosen the initial conditions to satisfy (18). Note that the matrix $S$ is a function of the parameter vector $\alpha$.

We now substitute (22) into (19) to get the worst-case value of $x_t$:

$$x_t = \hat{x}(t) - \theta^{-1}S(t)\left(\frac{\partial \Psi}{\partial x_t}\right)^T \qquad (25)$$

Note that this is the value of $x_t$ that is required for the calculation of the optimal control command $u_t$. This result is explicit in the problem, requiring no assumption of certainty equivalence. We might refer to equations (21) and (25) as "connection conditions," in that they serve to connect the results of the control problem to those of the filtering problem through their inclusion of the control problem return function.
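A minimal sketch of propagating the filter (23)-(24) forward for one candidate $\alpha$, with hypothetical scalar data and a synthetic measurement stream; note the gain $-SH/V$ is positive here since $S(0) = -P < 0$.

```python
import numpy as np

# Forward Euler propagation of (23)-(24), scalar example; data are hypothetical.
dt, t_end, theta = 0.001, 2.0, 5.0
steps = int(t_end / dt)

alpha = 0.3
A = -1.0 + alpha                 # assumed scalar dependence A(alpha)
B, H = 1.0, 1.0
Q, V, W, P = 1.0, 1.0, 0.5, 0.5

rng = np.random.default_rng(1)
x_true = 0.8                     # truth model, used only to synthesize z
x_hat, S = 0.0, -P               # \hat{x}(0) = \hat{x}_0, S(0) = -P per (24)

for _ in range(steps):
    u = -0.5 * x_hat             # known control history (placeholder law)
    z = H * x_true + 0.02 * rng.standard_normal()
    # (23): estimate propagation
    dx_hat = (A - S * Q / theta) * x_hat + B * u - S * H / V * (z - H * x_hat)
    # (24): pseudo-variance propagation
    dS = 2 * A * S + S * (H * H / V - Q / theta) * S - W
    x_hat += dt * dx_hat
    S += dt * dS
    x_true += dt * (A * x_true + B * u)   # noise-free truth for the sketch

print(f"x_true = {x_true:.4f}  x_hat = {x_hat:.4f}  S = {S:.4f}")
```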
4.3. The Adaptive Control

At this point, we do not quite have a solution to the overall control problem. It still remains to satisfy (21) and (25) in terms of the results of the optimal control problem. To do this, we note that we can express $x_0^t$ completely in terms of $\alpha$ and $x_t$ by using (16) in (9) and solving the resulting linear equation as usual. We can use the result in (21) to generate a rather complicated algebraic expression for this optimality condition in terms of $\alpha$ and $x_t$. This gives us a set of equations which we can solve, perhaps via a Newton-Raphson technique, for these values.

This expression will, however, include terms that we can only compute by integrating $z(\cdot)$ and $u(\cdot)$ over the interval $[0, t]$. To solve for $x_t$ and $\alpha$, we will need to perform quadratures over the recorded histories of these functions, as these terms will be dependent upon $\alpha$. Further, the return function also depends upon the solution of a Riccati equation, itself dependent upon $\alpha$. We note, however, that for any particular value of $\alpha$, we can solve the problem as the measurements become available and control commands are issued. Thus, we might wish to carry along solutions for several values of $\alpha$ which blanket the region in which we expect this parameter to lie. We may then interpolate for the value for which (21) is satisfied, and interpolate for the resulting control.

Remark 4: In the special case in which the system matrices are constant, the problem can be solved completely. The optimal return function in such a case can be solved in closed form for any particular $\alpha$ using the Kalman-Englar method [6]. The filtering problem is not simplified, but the remaining difficulty can be dealt with as discussed. If the uncertainty is further restricted as in section 5.1, the problem is further simplified and a differential equation for the optimal control can be derived. •
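The blanket-of-$\alpha$ scheme above admits a very small sketch: carry a bank of solutions on a grid of candidate parameter values, evaluate the residual of (21) for each (for instance with a routine like `connection_residual` above), and interpolate both for the root and for the control command. The residual and control values below are made up; in practice each comes from one member of the bank.

```python
import numpy as np

def interpolated_control(alpha_grid, residuals, controls):
    # Linear interpolation for the zero of the (21)-residuals over the grid,
    # and the same interpolation applied to the per-model control commands.
    residuals = np.asarray(residuals); controls = np.asarray(controls)
    crossings = np.where(np.diff(np.sign(residuals)) != 0)[0]
    if crossings.size == 0:
        k = int(np.argmin(np.abs(residuals)))   # no bracketing: nearest model
        return alpha_grid[k], controls[k]
    i = crossings[0]
    frac = residuals[i] / (residuals[i] - residuals[i + 1])
    alpha_star = alpha_grid[i] + frac * (alpha_grid[i + 1] - alpha_grid[i])
    u_star = controls[i] + frac * (controls[i + 1] - controls[i])
    return alpha_star, u_star

# Five-model bank with made-up residuals and commands:
alpha_grid = np.linspace(0.0, 1.0, 5)
residuals = [0.9, 0.4, -0.1, -0.6, -1.2]    # residual of (21) at each alpha
controls = [-1.0, -0.8, -0.55, -0.3, -0.1]  # optimal command from each model
alpha_star, u_star = interpolated_control(alpha_grid, residuals, controls)
print(f"alpha* = {alpha_star:.3f}  u* = {u_star:.3f}")
```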
5. EXTENSIONS

This section deals with some uses of the method in cases of non-constant parameters or non-linear systems.

5.1. Non-constant Parameters

This class of problems is much more difficult than those already discussed, and to this point few useful results have been obtained for the general case. There is a fairly broad class of problems which can be approached with some success, however.

The discussion here assumes a scalar $\alpha$ rather than a vector; the inclusion of more parameters is straightforward. The optimal return function for this class of systems is in general quite complex and, depending upon the form of $B(\alpha)$, may have no tractable solution. The filtering problem, however, yields to a simple approach: the system equations may be re-written so that, since we are dealing with a past-time problem, they describe a known, time-varying linear system. The filtering problem is then solved as a standard game problem [2, 4]. Clearly, more complex systems can be approached in this manner, so long as they can be re-written in such a way that they may be solved by the methods of the last section.

5.2. Non-linear Systems

Consider the more general problem in which $\dot{x} = f(x, u, t)$. Rewrite the Dynamic Programming problem (8) as

$$\max_{\alpha,\, x_t}\left[Y(\alpha, x_t) + \Psi(\alpha, x_t)\right] \qquad (26)$$

where the new function $Y(\alpha, x_t)$ is the extremal

$$Y(\alpha, x_t) = \max_{x(0),\, w_0^t} J_f[0; t]; \qquad x_t,\ \alpha\ \text{given.}$$

We will refer to this as the optimal accumulation function for the problem. It can be shown that this function satisfies a forward HJB equation.

As this is a non-linear problem, it is doubtful that either the accumulation or return functions can be derived in so straightforward a manner as for the linear system. However, it may well be possible to assume a form for each and obtain an approximate solution.

6. CONCLUSIONS

The solution to the disturbance attenuation problem provides a conceptual structure for the development of robust adaptive control schemes. The formulation given here shows how naturally the problem can be divided into a control problem and an estimation problem. This is stated in its most elegant form in (26), in which the sum of the optimal return function $\Psi(\alpha, x_t)$ and the optimal accumulation function $Y(\alpha, x_t)$ is maximized with respect to the current state and the unknown parameters. Although in general this is quite a complex operation, there are many problems in which it reduces to a tractable set of non-linear algebraic equations which can be solved quickly enough for real-time implementation.

The essential elegance of this approach is that no assumption on the structure of the controller is made. For example, in most approaches to the adaptive control problem in the literature, the Certainty Equivalence Principle is invoked. However, since this is a deterministic approach to system uncertainty, important structural properties found in stochastic control are lost. For example, the dual control concept, in which the control issues commands designed to produce extra information about the state, does not seem to manifest itself. However, much work is required to understand the non-standard adaptive control laws generated by this approach.

7. ACKNOWLEDGMENT

This work was supported by the Air Force Office of Scientific Research under grant AFOSR 91-0077.
8. REFERENCES

[1] I. Rhee and J. L. Speyer, "A Game Theoretic Approach to a Finite-Time Disturbance Attenuation Problem," IEEE Transactions on Automatic Control, Sept. 1991.
[2] J. L. Speyer, G. Fruchter, and Y.-S. Hahn, "A Game Theoretic Approach to Dual Control," Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, May 20-22, 1992.
[3] T. Basar and P. Bernhard, $H_\infty$-Optimal Control and Related Minimax Design Problems, Birkhauser, Boston, 1991.
[4] Y.-S. Hahn, "Stochastic Adaptive Control for a Class of Dual Control Problems," Doctoral Dissertation, University of Texas, Austin, Texas, August 1990.
[5] A. E. Bryson and Y.-C. Ho, Applied Optimal Control, Revised printing, Hemisphere, 1975.
[6] H. Kwakernaak and R. Sivan, Linear Optimal Control Theory, Wiley-Interscience, 1972.