Adaptive Control by Worst-Case Duality


Copyright © IFAC Adaptive Systems in Control and Signal Processing, Budapest, Hungary, 1995



ADAPTIVE CONTROL BY WORST-CASE DUALITY

Chris B Graves and Sandor M Veres

School of Electronic and Electrical Engineering, The University of Birmingham, Edgbaston, B15 2TT, UK.

Abstract - The worst-case dual control problem under conditions of bounded disturbances and modelling errors is formulated. A comparison of the worst-case and stochastic approaches is briefly discussed. It is shown that in some cases the possibility of bad performance of an adaptive controller is unavoidable. A priori and a posteriori finite-time self-tuning are defined. Finally, in the area of a posteriori finite-time self-tuning, a combined scheme is given which is based on synergic interaction between identification and control.

Keywords: Adaptive control, duality, self-tuning control.

1. INTRODUCTION

Duality of identification and control in adaptive control has long been an elusive dream of theorists because of the complicated solution of the associated equations of optimality. In practice, instead of the combined exercise of identification and control, identification is often carried out first and control design second. In self-tuning control based on certainty equivalence, the estimator works separately, if not independently, from the controller. It has long been recognised in control system design that plant uncertainty must be taken into account, for instance by ensuring stability with respect to a set of model perturbations, which has led to the concept of worst-case design (Doyle and Stein, 1981). It has also long been known that optimal control under model uncertainty requires the solution of minimax control problems (Witsenhausen, 1968; Bertsekas and Rhodes, 1973). The associated Bellman equation proved, however, too complicated to be solved with the techniques available at the time.

Recently Polak et al. (1987) suggested a powerful method to solve minimax problems in the form of semi-infinite optimisation algorithms. Their suggestion was to make use of the model uncertainty gained on-line in control design and to optimise performance functions dependent upon model uncertainty. This has essentially been a move away from the traditional certainty equivalence principle. Extensions of one-step horizon loss functions of control performance by a term measuring model uncertainty have been developed in some cases (Wittenmark, 1975; Goodwin and Payne, 1977; Milito et al.; Wittenmark and Elevitch, 1985). Linearization of the loss function of the dual control problem was discussed by Wenk and Bar-Shalom (1980). Central tendency pole-assignment and minimum variance controllers (Moore et al., 1989; Ryall and Moore, 1989) calculate the control law most likely to achieve the objectives in view of the model parameter uncertainties. A two-step horizon adaptive dual controller for MIMO-ARMA systems has been introduced by Mookerjee and Bar-Shalom (1989). In adaptive pole-placement control a joint criterion of identification and error variance has been formulated by Åström (1993).

An adaptive control scheme will be called weakly dual if it makes some explicit use of the model uncertainty obtained by the on-line identification process and also takes into account the effect of inputs on future identification. In stochastic control the exact dual control problem has only been solved in some simple cases (Åström and Wittenmark, 1971; Alster and Belanger, 1974; Wittenmark, 1975, a survey; Sternby, 1976; Åström and Helmersson, 1986; Dumont and Åström, 1988). Those adaptive control schemes which lack any form of interaction between identification and control will be termed neutral (Feldbaum, 1966). A control scheme will be termed either a priori or a posteriori finite-time self-tuning depending on whether finite-time tuning is achieved within an a priori known or unknown finite time.

The present work was preceded by some related investigations on adaptive predictive control using parameter bounds (Veres and Norton, 1993), on weakly-dual schemes of adaptive control (Veres, 1995), on a worst-case framework of adaptive control (Veres and Warn, 1994) and a MATLAB toolbox of geometric operations on ellipsoids and polytopes (Veres and Hermsmeyer, 1993), which was frequently used to obtain the results in this area. Section 2 outlines some problems arising in worst-case adaptive control and Section 3 gives a convergent weakly-dual scheme.

This research was supported by EPSRC Grant GR/J10846.

2. THE WORST-CASE DUAL-CONTROL PROBLEM

Consider SISO models of the form

$$A(z^{-1})\,y(t) = B(z^{-1})\,u(t) + e(t) \qquad (2.1)$$

where

$$A(z^{-1}) = 1 + a_1 z^{-1} + \dots + a_n z^{-n}, \qquad B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2} + \dots + b_m z^{-m}.$$

This model can alternatively be written in the usual way as

$$y(t) = \theta^T \varphi(t-1) + e(t),$$

where

$$\varphi(t-1) = [\,y(t-1), \dots, y(t-n),\ u(t-1), \dots, u(t-m)\,]^T, \qquad \theta = [\,-a_1, \dots, -a_n,\ b_1, \dots, b_m\,]^T.$$

The equation error $e(t)$ could represent more than just output error: it could also cover errors from imperfect linearity of the plant, neglected high-frequency modes, input errors, etc. The only serious assumption is that a bound $\delta > 0$ with $|e(t)| \le \delta$ is known and that (2.1) is valid with some parameter vector $\theta$.

In this paper we will be mostly concerned with time-invariant plants. If the plant parameters are assumed to be constant for a time period, the set of all parameter values $\theta$ which are consistent with the I/O data up to time $t$ can be calculated as

$$P(t) = \bigcap_{i=1}^{M} \{\,\theta : |y(t-i+1) - \theta^T \varphi(t-i)| \le \delta\,\} \qquad (2.2)$$

where $M > 0$ is a length of "memory" during which the model parameter $\theta$ did not change. Recursive calculation of $P(t)$ means that the bounds are updated by intersecting with the set of parameter vectors $\theta$ feasible with the input-output data at time $t$. Ways of carrying out this calculation have been described extensively in the literature (Walter and Piet-Lahanier, 1991; Veres, 1992, 1993). Let the control performance be measured by the time-dependent criterion

$$C_N^w(t) = \sup_{1 \le i \le N} w_{i-1}\,|y(t+i) - y^*(t+i)| \qquad (2.3)$$

i.e. the maximum weighted deviation of the output $y$ from a given output reference $y^*$, where $w_i > 0$, $i \ge 0$, are given weights. One can then optimise the calculation of the control inputs over the time interval $[t, t+N]$, so that only minimization of the criterion $C_N^w(t)$ is considered as the final goal and no separate criterion is given to improve identification, i.e. parameter bounding. When the solution to minimizing $C_N^w(t)$ implicitly carries out identification and control, it will be called worst-case dual control. We are going to describe how this happens in a precise formulation of the dual control problem by the associated Bellman equation. In the stochastic context, equations similar to the ones presented here are known (see e.g. Åström and Wittenmark, 1989). Introduce the finite $N$-horizon worst-case criterion

$$C_{t,N}(P, \varphi) = \sup\{\,C_N^w(t) \mid \theta \in P,\ |e(t+i)| \le \delta,\ 1 \le i \le N\,\} \qquad (2.4)$$

where the set of control algorithms over which to minimise $C_{t,N}(P, \varphi)$ consists of control schemes in which the current control input $u(t+i)$, $i = 0, 1, \dots, N-1$, is calculated as some function of past inputs $u(t+i-j)$, $j \ge 1$, and past and present outputs $y(t+i-j)$, $j \ge 0$. Let $C^\circ_{t,N}(P, \varphi)$ denote the infimum of $C_{t,N}(P, \varphi)$ over these schemes, and let the weights be $w_i = w^i$, $i \ge 0$, with $0 < w < 1$. The next theorem states that the optimal solution to minimising $C_{t,N}(P, \varphi)$ obeys the principle of dynamic programming. Define $C^\circ_{t+N,0} = 0$ and let $U \subset [-\infty, \infty]$ be a given set of possible values of the control inputs $u(t+i)$, $i \ge 0$.
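The recursive bound update behind (2.2) and the criterion (2.3) can be illustrated with a minimal sketch. It assumes a scalar parameter, so that the feasible set is an interval and each new observation intersects it with a strip. The function names `update_interval` and `worst_case_criterion` are hypothetical and do not come from the paper or from the Geometric Bounding Toolbox.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]


def update_interval(bounds: Interval, phi: float, y: float,
                    delta: float) -> Optional[Interval]:
    """Intersect [a, b] with {theta : |y - theta*phi| <= delta}.

    Scalar-parameter special case of (2.2) (an assumption of this sketch).
    Returns None when the data are inconsistent with the current bounds.
    """
    a, b = bounds
    if phi == 0.0:
        # A zero regressor carries no information about theta.
        return (a, b) if abs(y) <= delta else None
    # Dividing by a negative regressor flips the inequality, hence the sort.
    lo, hi = sorted(((y - delta) / phi, (y + delta) / phi))
    a, b = max(a, lo), min(b, hi)
    return (a, b) if a <= b else None


def worst_case_criterion(outputs: Sequence[float],
                         references: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Criterion (2.3): max over i of w_{i-1} * |y(t+i) - y*(t+i)|."""
    return max(w * abs(y - r)
               for w, y, r in zip(weights, outputs, references))


# One observation shrinks the prior interval [1, 3] to [2, 2.5].
print(update_interval((1.0, 3.0), phi=2.0, y=4.5, delta=0.5))
```

For a parameter vector the same intersection produces a polytope rather than an interval, which is where polytope or ellipsoid bounding routines come in.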

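Theorem 2.1 (Bellman equation). The worst-case minima satisfy the equations

$$C^\circ_{t+i,N-i}\bigl(P(t+i), \varphi(t+i)\bigr) = \inf_{u(t+i) \in U}\ \sup_{\theta \in P(t+i),\ |\varepsilon| \le \delta} \max\bigl\{\, w\,C^\circ_{t+i+1,N-i-1}\bigl(P(t+i+1), \varphi(t+i+1)\bigr),\ |y(t+i+1) - y^*(t+i+1)| \,\bigr\} \qquad (2.5)$$

where

$$P(t+i+1) = \{\,\theta' \in P(t+i) : |\varphi^T(t+i)(\theta - \theta') + \varepsilon| \le \delta\,\}$$

and $\varphi(t+i+1)$ depends on $\varphi(t+i)$, $\theta$ and $\varepsilon$ via the formula

$$y(t+i+1) = \theta^T \varphi(t+i) + \varepsilon. \qquad \square$$

Lemma 2.1. For any $C, a, b \in \mathbb{R}$ with $0 < a < b$, the formula

$$\inf_{u}\ \sup_{a \le \theta \le b} |\theta u - C| = |C|\,\frac{b-a}{a+b} \qquad (2.8)$$

holds, and the infimum is attained for $u = \dfrac{2C}{a+b}$.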
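The minimiser in Lemma 2.1 can be checked numerically. The sketch below assumes that the lemma concerns the scalar minimax problem of choosing $u$ to minimise the largest value of $|\theta u - C|$ over $\theta \in [a, b]$ with $0 < a < b$. Under that assumption the worst case sits at an endpoint of the interval, and the best input equalises the two endpoint errors. The helper names are hypothetical.

```python
def worst_case_error(u: float, c: float, a: float, b: float) -> float:
    """Largest |theta*u - c| over theta in [a, b].

    |theta*u - c| is convex in theta, so the maximum is at an endpoint.
    """
    return max(abs(a * u - c), abs(b * u - c))


def minimax_input(c: float, a: float, b: float) -> float:
    """Input that equalises the two endpoint errors (assumes 0 < a < b)."""
    return 2.0 * c / (a + b)


c, a, b = 3.0, 1.0, 2.0
u_star = minimax_input(c, a, b)
best = worst_case_error(u_star, c, a, b)

# Brute-force comparison over a grid of candidate inputs in [0, 5].
grid = [i / 1000.0 for i in range(5001)]
grid_best = min(worst_case_error(u, c, a, b) for u in grid)

print(u_star, best, grid_best)
```

With these numbers the closed-form input and the grid search agree, and the attained worst-case error equals $|C|(b-a)/(a+b)$. This is also the structure of the input rule $u(t+1) = 2y^*(t+1)/(a_{t+1}+b_{t+1})$ for the static-gain model.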

A clear consequence of Theorem 2.1 is that calculating the feasible parameter sets is an inherent part of the exact solution of the worst-case dual-control problem. Next the solution of a worst-case dual-control problem will be illustrated.

Example 2.1. Assume that only an a priori interval bound $P(t_0) = [a_0, b_0]$ is known for the parameter $\theta$ of the simple static-gain model

$$y(t) = \theta\,u(t) + e(t) \qquad (2.6)$$

with known equation-error bound $\delta > 0$ for $e(t)$. The criterion

$$C_2^w(t) = \sup_{\theta \in P(t_0),\ |e_i| \le \delta,\ i = 0, 1} w_i\,|y(t+i) - y^*(t+i)|$$

will be minimised to obtain the current $u(t)$ given the setpoints $y^*(t)$, $y^*(t+1)$. Duality of identification and control in this case will mean that the choice of $u(t)$ balances between reducing $|y(t) - y^*(t)|$ and its effect on the parameter bounds calculated after $y(t)$ has been observed, which in turn will reduce $|y(t+1) - y^*(t+1)|$ through an appropriate $u(t+1)$.

Theorem 2.2. (i) Optimisation of $C_2^w(t)$ for $u(t)$ is equivalent to minimising a worst-case expression that depends on $u(t)$ through the updated parameter bounds

$$a = \max\Bigl\{a_t,\ \theta + \frac{\varepsilon - \delta\,\mathrm{sgn}(u(t))}{u(t)}\Bigr\}, \qquad b = \min\Bigl\{b_t,\ \theta + \frac{\varepsilon + \delta\,\mathrm{sgn}(u(t))}{u(t)}\Bigr\}.$$

(ii) Having $y(t) = \theta u(t) + \varepsilon$ observed and $a_{t+1}$, $b_{t+1}$ computed as the updated parameter bounds of $\theta$, the next optimum value of the input is

$$u(t+1) = \frac{2\,y^*(t+1)}{a_{t+1} + b_{t+1}}.$$

Proof. Lemma 2.1 can be used. $\square$

The following result gives an example of an infinite-horizon criterion with $N = \infty$. Assume the same model as in (2.6). Assume that the infinite-horizon performance is measured by

$$C_\infty(t) = \sup_{\theta \in P(t),\ |e_i| \le \delta,\ 1 \le i} w_i\,|y(t+i) - y^*(t+i)| \qquad (2.9)$$

Theorem 2.3 (Veres, 1994). Assume that the following conditions are satisfied:

(i) $0 < a_t < b_t$,
(ii) $y^*(t+k) = y^* > 0$, $k \ge 1$, is constant,
(iii) the allowed input set $U$ is large enough in the sense that $[\,y^*/b_t,\ y^*/a_t\,] \subset U$ holds, and the sequence of weights $0 < w_1 < w_2 < w_3 < \dots$ satisfies $\lim_{k \to \infty} w_k = 1$.

Then the following hold:

a) the limit $\lim_{k \to \infty} C^\circ_k$ exists,
b) $C^\circ_\infty = \lim_{k \to \infty} C^\circ_k$. $\square$

2.1 Comparison of stochastic and worst-case dual control

Stochastic dual control (see e.g. Chapter 7 of Åström and Wittenmark, 1989) is mainly concerned with optimizing criteria based on conditional expectations in a probabilistic sense. In practice this means considering statistical averages. Criteria of the type

$V(t)$, defined through expectations conditioned on $\mathcal{F}_{t-1}$, are used to measure performance, as opposed to (2.4). Here $\mathcal{F}_{t-1}$ is a $\sigma$-algebra of events representing knowledge of past I/O and of the parameter vector $\theta$. Then $V(t)$ satisfies a Bellman equation which is the stochastic version of (2.5). It is clear that in the stochastic approach the probability distributions of the disturbances and of the parameter vector play an important role. There are two assumptions inherent to this approach:

(S.i) the distributions of disturbances must be known,
(S.ii) the development engineer is mainly interested in average performance.

In contrast to the stochastic case, worst-case dual control makes no assumptions concerning the distribution of the disturbance except that its support is known in the form of bounds. A time-domain $\ell_\infty$ equation-error bound can accommodate disturbances of a wide spectrum with bounded amplitude, and also slight non-linearities or unknown linear dynamics bounded in the $\ell_1$-sense. Instead of (S.i) and (S.ii) the following hold in the case of worst-case dual control:

(W.i) a bound on the equation error of the model must be known,
(W.ii) the development engineer is mainly interested in worst-case performance.

In stochastic dual control, assuming the wrong distributions might endanger optimality or stability. Assuming overly conservative error bounds in worst-case dual control might also reduce optimality of performance. Still, one might suspect that in some practical cases assuming only bounds is a safer option for the development engineer.

2.2 Negative results on the exact worst-case dual controller

Negative results are verified statements on the limits of achievable performance of dual controllers. Such statements can be of considerable interest not only in theory but in practice. Since the dual controller "optimally" compromises identification and control, results of this kind might help to realise that some hoped-for solution does not exist for a particular practical problem. In this case the developer should decide to look for other solutions, beyond the scope of adaptive control. In this way limitations on the performance of all suboptimal controllers can also be devised. For an example consider the FIR models

$$y(t+k+1) = b_1 u(t+k) + b_2 u(t+k-1) + \dots + b_m u(t+k-m+1) + e(t+k+1),$$

$$\theta = (b_1, \dots, b_m), \qquad |e(t+k+1)| \le \delta, \qquad k = 1, \dots, N,$$

where the equation-error bound $\delta > 0$ is assumed to be a priori known. Knowledge of a safe upper bound $\delta$ might in practice be based on (i) the process being approximately linear, (ii) known measurement errors of the input/output, (iii) known exponential decay of the impulse response, so that the tail of the impulse response can be incorporated into $\delta$. Assume the plant is time invariant and let $P(t)$ be the feasible parameter region at time $t$. The control criterion will be formulated as in (2.3).

Theorem 2.4. Assume that there is a $\theta \in P(t)$ such that $\|\theta\|_2 < \delta$ and that $y^*(t+N) \ne 0$. Then the worst-case performance $C^\circ_{t,N}(P(t))$ of the optimal dual controller is such that

$$C^\circ_{t,N}(P(t)) \ge w_N\,|y^*(t+N)|. \qquad \square$$

2.3 A priori finite-time tuning

Calculating the exact solution to the worst-case dual control problem can be prohibitively complex. One useful conclusion can, however, be drawn from Theorem 2.1: updating of the parameter feasibility sets takes place continuously in the solution. What is most difficult to evaluate when solving the optimal-input problem is the effect of an input on future parameter bounding and on control performance. To achieve computability, in this section we will give an example of a substantial simplification of the original dual problem which retains an element of compromise between identification and control. An approximation to the problem can be obtained by opening the closed loop of identification and control over a time period $[t, t+N]$. Both worst-case identification and control performance during the interval $[t, t+N]$ can be taken into account to

calculate the inputs over $[t, t+N]$. A criterion $C_{ident}$ can be introduced to evaluate the parameter-bounding effect of inputs over $[t, t+N]$, based on knowledge of the model parameters at time $t$. At the same time the control inputs can be constrained by output behaviour, again based on the limited knowledge of the model parameters at time $t$. Thereby the closed loop of identification and control present in this scheme can be opened for a time period, say $[t, t+N]$, during which the inputs are optimised to reduce $C_{ident}$ under the constraint that they do not cause the output to become undesirable for any of the models from $P(t)$. Denote by $U_N$ the set of inputs over $[t, t+N]$ which satisfy the given output constraints for any of the models in $P(t)$. To calculate the control inputs over $[t, t+N]$, a minimax criterion based on $C_{ident}$ can be optimized over $U_N$. Since the length $N$ of the time period over which tuning takes place is fixed in advance, this type of self-tuning can be termed a priori finite-time tuning, as opposed to a posteriori finite-time tuning, where the length of the finite tuning period is not known in advance. Detailed discussion and examples of solutions to the a priori finite-time tuning problem can be found in Veres (1994).

3. A POSTERIORI FINITE-TIME TUNING BY A SYNERGIC SCHEME

In a case where performance is guaranteed after a finite period of time with a priori unknown duration, the respective method will be termed a posteriori finite-time tuning. This section illustrates such an algorithm, which uses a combination of control and identification criteria to calculate the control inputs. The full difference equation model (2.1) will again be used. The following assumptions will be made.

(C.1) Assume that the plant is time-invariant with parameter $\theta \in \mathbb{R}^{n+m}$ such that

$$y(t) = \varphi^T(t-1)\,\theta + e(t), \quad t \ge m+1, \qquad \varphi(t-1) = [\,y(t-1), \dots, y(t-n),\ u(t-1), \dots, u(t-m)\,]^T \qquad (3.1)$$

(C.2) A bound $\delta > 0$ is known such that $|e(t)| \le \delta$, $t \ge m+1$.

(C.3) Assume that it is known that $\theta \in P(0)$, where $P(0)$ is an a priori known set of parameter vectors defined as a polytope. It will also be assumed that all $\theta \in P(0)$ define a minimum-phase model and that $\theta(n+1) = b_1 \ne 0$, $\theta \in P(0)$.

Weakly-Dual Control Algorithm 1 (WDCA1)

Assume that a monotone decreasing sequence of polytopes $\{S_t\}_{t=1}^{\infty}$ is defined so that it contracts to the origin:

$$\bigcap_{t=1}^{\infty} S_t = \{0\}, \qquad S_t \supseteq S_{t+1}, \quad t \ge 1. \qquad (3.2)$$

This is the only requirement $\{S_t\}_{t=1}^{\infty}$ will have to satisfy to achieve asymptotic performance. Different choices of $\{S_t\}_{t=1}^{\infty}$ might determine the rate at which the performance level is attained. The recursive steps of the adaptive controller for each $t \ge 1$ are as follows:

(i) Polytope updating is performed under the condition that

$$|y(t-1) - \varphi^T(t-2)\,\hat\theta_{t-1}| > \delta \qquad (3.3)$$

to obtain the next parameter-bounding polytope

$$P(t) = P(t-1) \cap \{\,\theta : |y(t-1) - \varphi^T(t-2)\,\theta| \le \delta\,\} \qquad (3.4)$$

otherwise let $P(t) = P(t-1)$. Define $\hat\theta_t = \mathrm{centr}\{P(t)\}$.

(ii) Define $B'(q^{-1} \mid \theta) = \theta_{n+2}\,q^{-2} + \dots + \theta_{n+m}\,q^{-m}$. A set of feasible control inputs $U_t$ is calculated.

(iii) Compute the diameter vector $d(t)$ of $P(t)$ as the vector difference of the two most distant points of $P(t)$ and define

$$\hat\varphi(t) = [\,\hat\theta_t^T \varphi(t-1),\ y(t-1), \dots, y(t-n+1),\ u(t), \dots, u(t-m+1)\,]^T.$$

The control input $u(t) \in U_t$ is calculated by maximizing the criterion (3.5).

Comments. Step (i) of the WDCA1 algorithm defines conditional polytope updating, which can be carried out using recursive algorithms. The centre of $P(t)$ means the Chebyshev centre in the $\ell_\infty$ sense, i.e.

$$\hat\theta_t(i) = \frac{1}{2}\Bigl(\sup_{\theta \in P(t)} \theta(i) + \inf_{\theta \in P(t)} \theta(i)\Bigr), \qquad i = 1, 2, \dots, n+m.$$

In step (iii) of the algorithm

$$d(t) = \theta^s - \theta^i,$$

where $\theta^s$ and $\theta^i$ are the two most distant vertices within $P(t)$. Since the criterion in (3.5) reflects the position of the strip $\{\,\theta : |y(t+1) - \varphi^T(t)\,\theta| \le \delta\,\}$ relative to a line with direction $d(t)$, its maximization gives the largest chance of a substantial reduction of $P(t)$ by updating. The

following theorem states the asymptotic properties of the algorithm.

Theorem 3.1. Under assumptions (C.1)-(C.3), WDCA1 ensures asymptotic tracking performance.

Because of lack of space the proof is not included here but can be found in Veres (1994).

4. CONCLUSION

Worst-case dual control for SISO equation-error models was the topic of this paper. First the basic problem was formulated and illustrated on a static-gain example for both finite and infinite horizons. Another theorem has shown that in some cases one might obtain unacceptable worst-case performance that cannot be resolved even with the optimal dual controller. A comparison was made between the worst-case and stochastic approaches to dual control. Concepts of a priori and a posteriori finite-time self-tuning were introduced. Finally, a posteriori finite-time self-tuning was proved for a synergic scheme of identification and control.

REFERENCES

Alster, J. and P.R. Belanger (1974). A technique for dual adaptive control. Automatica, 10, 627-634.

Åström, K.J. (1993). Matching criteria for control and identification. Proc. European Control Conference '93, Groningen, pp. 248-251.

Åström, K.J. and A. Helmersson (1986). Dual control of an integrator with unknown gain. Comp. & Math. with Appls., 12, 653-662.

Åström, K.J. and B. Wittenmark (1971). Problems of identification and control. Journal of Mathematical Analysis and Applications, 34, 90-113.

Åström, K.J. and B. Wittenmark (1989). Adaptive Control. Addison Wesley, Reading, Massachusetts.

Bertsekas, D.P. and I.B. Rhodes (1973). Sufficiently informative functions and the minimax feedback control of uncertain dynamic systems. IEEE Trans. Automat. Control, AC-18, 112-123.

Doyle, J.C. and G. Stein (1981). Multivariable feedback design: concepts for a classical/modern synthesis. IEEE Trans. Automat. Control, AC-26, 4-16.

Dumont, G.A. and K.J. Åström (1988). Wood chip refiner control. IEEE Control Systems Magazine, 8, 38-43.

Feldbaum, A.A. (1966). Optimal Control Systems. Academic Press, New York.

Milito, R., C.S. Padilla, R.A. ...

Mookerjee, P. and Y. Bar-Shalom (1989). An adaptive dual controller for MIMO-ARMA systems. IEEE Trans. Automat. Control, AC-34, 795-800.

Moore, J.B., T. Ryall and L. Xia (1989). Central tendency adaptive pole assignment. IEEE Trans. Automat. Control, AC-34, 363-367.

Morse, A.S. (1980). Global stability of parameter adaptive control systems. IEEE Trans. Automat. Control, AC-25, 433-439.

Morse, A.S., D.Q. Mayne and G.C. Goodwin (1988). Applications of hysteresis switching in parameter adaptive control. Draft paper.

Morse, A.S., D.Q. Mayne and G.C. Goodwin (1992). Applications of hysteresis switching in parameter adaptive control. IEEE Trans. Automat. Control, AC-37, 1345-1354.

Norton, J.P. and S.M. Veres (1991). Developments in parameter bounding. In: Lecture Notes in Control and Information Sciences, Vol. 161, pp. 137-158.

Polak, E., S.E. Salcudean and D.Q. Mayne (1987). Adaptive control of ARMA plants using worst-case design by semi-infinite optimization. IEEE Trans. Automat. Control, AC-32, 388-396.

Ryall, T. and J.B. Moore (1989). Central tendency minimum variance adaptive control. IEEE Trans. Automat. Control, AC-34, 367-371.

Sternby, J. (1976). A simple dual control problem with an analytical solution. IEEE Trans. Automat. Control, AC-21, 840-844.

Veres, S.M. (1994). Worst-case Dual-control: Improvements of Performance in Process Control. Research Memorandum No. 40, School of E&E Eng., The University of Birmingham.

Veres, S.M. (1995). Identification by parameter bounding in adaptive control. Int. J. of Adaptive Control and Signal Processing, 10.

Veres, S.M. and S. Hermsmeyer (1993). Geometric Bounding Toolbox. Published and licensed by The University of Birmingham, BRD Ltd., Vincent Drive, Edgbaston, B15, UK.

Veres, S.M. and J.P. Norton (1990). Structure selection for parameter-bounding models: consistency and selection criterion. IEEE Trans. Automat. Control, 36, 474-481.

Veres, S.M. and J.P. Norton (1993). Predictive self-tuning control by parameter bounding and worst-case design. Automatica, 29, 911-928.

Veres, S.M. and K.J. Warn (1994). Performance measures and a new approach to adaptive control. Proc. IASTED Symp. on Modelling and Control, Grindelwald, February 21-23, 1994.

Walter, E. and H. Piet-Lahanier (1988). Estimation of parameter bounds from bounded-error data: a survey. Proc. 12th IMACS World Congress, pp. 467-472.

Wenk, C.J. and Y. Bar-Shalom (1980). A multiple model adaptive control algorithm for stochastic systems with unknown parameters. IEEE Trans. Automat. Control, AC-25, 703-710.

Witsenhausen, H.S. (1968). A minimax control problem for sampled linear systems. IEEE Trans. Automat. Control, AC-13, 5-21.

Wittenmark, B. (1975). Stochastic adaptive control methods: a survey. Int. J. Control, 21, 705-730.

Wittenmark, B. and C. Elevitch (1985). An adaptive control algorithm with dual features. Preprints of the 7th IFAC Symposium on Identification and System Parameter Estimation, York, U.K.