Automatica, Vol. 20, No. 1, pp. 39-50, 1984. Printed in Great Britain.
0005-1098/84 $3.00 + 0.00. Pergamon Press Ltd. © 1984 International Federation of Automatic Control
Predictor-based Self-tuning Control*

V. PETERKA†
Heavy-duty conditions of microprocessor self-tuning control motivated the development of a new numerical method for LQ-optimum control design--a kind of Wiener approach revival. Key Words--Control system synthesis; self-tuning control; adaptive control; self-adjusting systems; optimal control; direct digital control; stochastic control.
Abstract--Linear finite-memory output predictors updated in real-time appear to be a suitable internal representation of the system in a digital self-tuning controller. A new time-domain method of quadratic-optimum control synthesis for systems described by such predictors is presented. The synthesis covers both the servo problem and the regulation problem, including the program control (with preprogrammed command signal) and the feedforward from measurable external disturbances. Unlike the standard Riccati equation, the method leads to algorithms (or explicit formulae in low-order cases) which are numerically robust and therefore suitable for real-time computation using microprocessors with reduced wordlength.

1. INTRODUCTION

Most of the existing self-tuning control algorithms which appear practicable rest on the so-called certainty-equivalence hypothesis (Åström, 1980), i.e. on the enforced separation of identification and control. In this paper a similar simplification is employed using a linear finite-memory output predictor, updated in real-time, as an internal representation of the uncertain process to be controlled. In this sense the paper follows the line applied in Peterka (1970) and Clarke and Gawthrop (1975) but it differs substantially in the control strategy applied. Here the control strategy is designed to minimize, for the most recent predictor available, the expected value of a suitably chosen quadratic criterion embracing a given (finite or asymptotically infinite) control horizon. As in such a self-tuning controller the control law has to be repeatedly redesigned or updated, usually in each control step, it is required that the algorithm performing this task be numerically stable, free of singularities, uniform for all situations which might occur, and in addition as simple as possible. These requirements and the requirement to tune the controller also with respect to the preprogrammed sequence of the setpoints (program control) motivated the development of a new numerical method for LQ-optimum control synthesis which is the main topic of the paper. The problem is stated in Section 2 where the control objectives are defined and the corresponding models and assumptions are introduced. The main results are summarized in Section 3. Their constructive proof and the optimization method itself are presented in Section 4. In the concluding Section 5 possible extensions are outlined and some practically important points are recalled and briefly discussed.

2. CONTROL OBJECTIVES AND MODELS

The scheme in Fig. 1 shows the situation considered throughout the paper. The process P, with single input u and single output y, is stochastic due to unmeasurable internal and/or external disturbances. The self-tuning controller STC may include the feedforward from the measurable external disturbance v if available. It is required that the controller be automatically tuned also with respect to possible changes of the setpoint (command signal) w which may be either a priori known (program control), or uncertain (servo), or it may be required that the process output follow the output of a reference model the input of which is a priori uncertain (model following). The extension
* Received 11 June 1982; revised 23 May 1983. The original version of this paper was presented at the 6th IFAC Symposium on Identification and System Parameter Estimation which was held in Washington, D.C., U.S.A. during June 1982. The published proceedings of this IFAC meeting may be ordered from Pergamon Press Ltd, Headington Hill Hall, Oxford OX3 0BW, U.K. This paper was recommended for publication in revised form by associate editor P. Parks under the direction of editor B. D. O. Anderson.
† Institute of Information Theory and Automation, Czechoslovak Academy of Sciences, Prague 8, Pod vodárenskou věží 4, Czechoslovakia.
FIG. 1. Control loop considered.
for the case of more than one measurable external disturbance is straightforward and simple, and therefore is not considered explicitly.
Time indexing
The chronology of signal samples is defined so that the samples y(t−1) and v(t−1) are available when u(t) is computed whereas y(t) and v(t) are not known at that moment. The availability of the future setpoints (command signal) will be discussed and defined separately.

Output predictors
Two kinds of finite-memory predictors are considered. The first one has the standard linear form

ŷ(t) = −Σ_{i=1}^{∂a} a_i y(t−i) + Σ_{i=0}^{∂b} b_i u(t−i) + Σ_{i=1}^{∂d} d_i v(t−i) + k    (1)

where the integers ∂a, ∂b and ∂d define the memory of the predictor. It will be called the positional (P) predictor. The prediction error is introduced as

e(t) = y(t) − ŷ(t)    (2)
and is available as soon as the true output y(t) has been observed. In most cases of industrial process control the main task of a controller is to compensate stochastic disturbances which are inherently nonstationary like drifts, unpredictable load changes, etc. In such cases it is more appropriate to consider the following incremental (I) predictor

ŷ(t) = y(t−1) − Σ_{i=1}^{∂a} a_i Δy(t−i) + Σ_{i=0}^{∂b} b_i Δu(t−i) + Σ_{i=1}^{∂d} d_i Δv(t−i)    (3)

where Δu(t) = u(t) − u(t−1) and similarly for y and v. Note that in (3) the 'fixed point' in the input-output relation is the last observed output y(t−1) instead of the constant k in (1).
Process dead-time
The unknown time delay of the process is covered by the b-parameters in (1) or (3) which are estimated in real-time. For the sake of simplicity the a priori known delay, say κ, is not considered explicitly. It can be introduced in the following ways: (i) the corresponding number of leading b-parameters are not estimated and simply set to zero, or (ii) the known time delay is neglected in the control-design step (ŷ(t) is considered instead of ŷ(t+κ)) and the outputs which are not available but are required in the resulting control law are replaced by their predictions supplied by a predictor running in parallel. The latter appeared convenient in practice and is theoretically equivalent to the former if the prediction errors can be assumed uncorrelated.
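The two predictor forms (1) and (3) are straightforward to state in code. The sketch below is illustrative only; the function names and coefficient layout are ours, not the paper's:

```python
# One-step output predictors following (1) and (3).
# Coefficient lists: a[i-1] plays the role of a_i, b[i] of b_i, d[i-1] of d_i.

def predict_P(a, b, d, k, y, u, v, t):
    """Positional predictor (1): y_hat(t) from data up to and including u(t)."""
    y_hat = k
    y_hat -= sum(a[i - 1] * y[t - i] for i in range(1, len(a) + 1))
    y_hat += sum(b[i] * u[t - i] for i in range(len(b)))
    y_hat += sum(d[i - 1] * v[t - i] for i in range(1, len(d) + 1))
    return y_hat

def predict_I(a, b, d, y, u, v, t):
    """Incremental predictor (3): the 'fixed point' is y(t-1), k is absent."""
    dy = lambda s: y[s] - y[s - 1]
    du = lambda s: u[s] - u[s - 1]
    dv = lambda s: v[s] - v[s - 1]
    y_hat = y[t - 1]
    y_hat -= sum(a[i - 1] * dy(t - i) for i in range(1, len(a) + 1))
    y_hat += sum(b[i] * du(t - i) for i in range(len(b)))
    y_hat += sum(d[i - 1] * dv(t - i) for i in range(1, len(d) + 1))
    return y_hat
```

For data generated exactly by the positional model the two predictors agree, since (3) is the differenced form of (1).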
Performance criterion
Let tc be the current time index. The control law for t > tc is designed with the aim to minimize the criterion

J = (1/T) E{ Σ_{t=tc+1}^{tc+T} [(y(t) − w(t))² + ω̄ (Δu(t))²] + Σ_{t=tc+T−∂b+1}^{tc+T} ω̄_s (Δu(t))² }    (4)

where ω̄ ≥ 0, ω̄_s > 0 and T is the control horizon. The case of T → ∞ is of particular interest. Note that not the inputs u(t) but only their increments Δu(t) are penalized for both predictors. The reason is to ensure that the mean of the control error y(t) − w(t) be zero in steady state for any fixed setpoint. The choice of ω̄ is a performance-oriented 'knob' which makes it possible for the user to damp the movements of the actuator if it is required by the operating conditions. The auxiliary term, penalizing only the last control actions, is introduced in order to guarantee the asymptotic stability of the inputs also for ω̄ = 0. This makes the difference between the nonstabilized minimum variance control (Åström, 1970) and stabilized minimum variance control (Peterka, 1972) of a nonminimum-phase process. Note that the auxiliary term (0 < ω̄_s < ∞) is negligible for T → ∞ but only when the input is stable.
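For finite T the criterion (4) is a plain sum and can be evaluated directly for a given trajectory. A minimal sketch, in which the expectation is replaced by the observed value of a single realization (the function name and this deterministic setting are our assumptions):

```python
def criterion_J(y, w, u, tc, T, db, omega, omega_s):
    """Sample value of criterion (4) for one realization.

    y, w, u: sequences indexed by absolute time; db stands for the degree
    'd-b' of the predictor; omega and omega_s are the two penalty weights.
    """
    du = lambda t: u[t] - u[t - 1]
    main = sum((y[t] - w[t]) ** 2 + omega * du(t) ** 2
               for t in range(tc + 1, tc + T + 1))
    tail = sum(omega_s * du(t) ** 2
               for t in range(tc + T - db + 1, tc + T + 1))
    return (main + tail) / T
```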
Separation of identification and control
According to the certainty-equivalence hypothesis the control law is determined in each control step for the most recent predictor available, but under the assumption that its parameters remain constant within the control horizon T considered. This means that the criterion (4) is minimized under the simplifying assumption that the predictor parameters a_i, b_i, d_i are known and equal to their estimates based on the data observed up to and including the current time index tc. However, to be able to find the control strategy minimizing the criterion (4) in a well-defined manner some additional assumptions concerning the prediction errors have to be adopted. A standard way would be to assume that the prediction errors are mutually uncorrelated random variables with zero mean (discrete white noise). Instead of this the following assumptions, which lead to the same result but are somewhat more general and better suited for the conditions of self-tuning control, will be made.
A1: For tc < t ≤ tc + T

E e(t) = 0,   E ū(t) e(τ) = 0,   t ≤ τ ≤ tc + T    (5)
where ū(t) means u(t) or Δu(t), depending on whether the P-predictor (1) or the I-predictor (3) is considered, and is understood as a function, which is to be determined, of the data available up to and including the time index (t − 1).
A2: The covariances of the prediction errors do not depend on the control strategy in the set of strategies from which the optimal one is to be selected; otherwise they can be arbitrary but finite.
In this way, by simplification based on the certainty-equivalence hypothesis, the self-tuning controller is decomposed algorithmically into the identification part and the control-law-design part. Only the latter is discussed in detail in this paper. The unknown parameters of the predictor (1) or (3) can be identified in real time using recursive least squares (possibly modified by exponential forgetting and other tricks). However, for numerical reasons it is strongly recommended to use a square-root filter (Peterka, 1975; Peterka and Halousková, 1976), or its square-root free and somewhat faster variant (Bierman, 1977), for this purpose. The main modelling problem is to choose the structure of the predictor so that the assumptions A1 and A2 could be fulfilled, at least asymptotically, with good approximation. It is known that the linear least-squares regression, when applied to ergodic time series, produces residuals which are orthogonal with respect to the variables considered in the regressor. This indicates that the number of delayed inputs incorporated into the predictor (the number of b-parameters) may be most important. It accords well with practical experience. Practical experience also shows that in most cases of industrial process control the I-predictors should be preferred.

Measurable external disturbance
If the self-tuning controller includes the feedforward from the measurable external disturbance v and the predictor contains a linear combination of past values {v(t−i): i = 1, 2, ...,
∂d} then the minimization of the criterion (4) requires the adoption of a suitable model for the future development of the external disturbance for t > tc. As in most practical industrial cases the measurable external disturbances are nonstationary it is appropriate to design the self-tuning controller for a 'random walk'

v(t) = v(t−1) + e_v(t)    (6)

E[e_v(t) | v(t−1), u(t−2), ...] = E e_v(t) = 0.    (7)

The variances of the uncorrelated increments Δv(t) = e_v(t) are not required to be known and can be arbitrarily time-varying but finite. The 'random walk' (6) generalized in such a way is a very realistic model for load changes at unpredictable time instants, drifts, etc. Note that such a model does not need to be identified. In order to cover other cases also, an autoregressive model will be considered in the general development of the control-law design procedure in the next section [see (13)].

Command signal
Because the criterion (4) depends on the command signal w(t) for t > tc it is necessary to introduce a suitable model for its future development and to define on what values of w an admissible control law can operate. The following typical situations will be considered.
(A) Fixed setpoint. If the process output y is measured as the deviation from the given setpoint, then

w(t) = 0,   t > tc.

(B) Program control. The command signal is deterministically preprogrammed and the entire sequence of the future setpoints {w(t): t = tc + 1, tc + 2, ..., tc + T} is a priori known and available for the controller.
(C) Positional servo. The future course of the setpoints is a priori uncertain and is modelled as a random walk generalized in the above sense

w(t) = w(t−1) + e_w(t).    (8)
It is assumed that w(t) (the desired value of y(t)) is available when u(t) (preceding the output y(t)) is computed.
(D) Model following. It is required that the process output follow the output of a reference model introduced by the relation

w(t) + Σ_{i=1}^{∂am} a_m,i w(t−i) = Σ_{i=0}^{∂bm} b_m,i w_c(t−i)    (9)

where w_c is the input of the reference model (the overall command signal) modelled as a generalized random walk

w_c(t) = w_c(t−1) + e_wc(t)    (10)

and a_m,i, b_m,i are the reference-model parameters chosen by the designer (usually so that the steady-state gain is equal to one). It is assumed that w(t) and w_c(t) are available when u(t) is computed. There are two situations when it is appropriate to formulate the control problem as the model-following problem: (1) in case of unpredictable step-wise
setpoint changes [which might be modelled by the generalized random walk (8)] the minimization of the quadratic criterion may lead to overshoots which may be undesirable. This disadvantage of the quadratic criterion can be, as a rule, removed by the reference model (9) of first order

w(t) = α w(t−1) + (1 − α) w_c(t)    (11)

with just a single adjustable parameter (0 < α < 1). (2) If the overall command signal w_c is generated by a human operator (like in aircraft applications) it may be required to maintain the dynamic properties of the closed control loop as constant as possible and close to a nominal case prescribed by the reference model (9).

3. CONTROL SYNTHESIS

Notational conventions
In order to cover all cases discussed above, the following forms of the process models will be considered

y(t) + Σ_{i=1}^{∂ā} ā_i y(t−i) = Σ_{i=0}^{∂b} b_i ū(t−i) + Σ_{i=1}^{∂d} d_i v̄(t−i) + k + e(t)    (12)

v̄(t) + Σ_{i=1}^{∂ā_v} ā_v,i v̄(t−i) = e_v(t)    (13)

where for the P-predictor (1): ū = u, v̄ = v, ∂ā = ∂a, ā_i = a_i, and according to (6) ∂ā_v = 1, ā_v,1 = −1; for the I-predictor (3): ū = Δu, v̄ = Δv, ∂ā = ∂a + 1, ā_1 = a_1 − 1, ā_∂ā = −a_∂a, ā_i = a_i − a_{i−1} (1 < i < ∂ā), k = 0, and according to (7) ∂ā_v = 0.

Operating with elements of square matrices the following indexing will appear convenient

M(t, t−i) = M_i(t)

where i indicates the distance from the main diagonal to the left for the matrix element appearing in the row (t). Thus, for a lower triangular (LT) matrix it holds that M_i(t) = 0 for i < 0 and all t. We shall say that the matrix M is strictly lower triangular (SLT) if M_i(t) = 0 for i ≤ 0. The LT-matrix M for which it holds that M_i(t) = 0 for i > ∂M will be called a lower band (LB) matrix of degree ∂M. A LB-matrix with zero diagonal elements will be denoted as a strictly lower band (SLB) matrix. For a diagonal matrix the index i is superfluous and will be omitted. The matrix will be called row-stationary (RS) if it holds that M_i(t) = M_i for all t considered. It is also useful to introduce the matrix

    |  1          |
Θ = | -1   1      |    (14)
    |     -1   1  |
    |        .  . |

i.e. the RS-LB-matrix of degree 1 with Θ_0 = 1, Θ_1 = −1.

General solution
The system of T equations obtained from (12) and (13) for t = tc + 1, tc + 2, ..., tc + T can be written in the following matrix form

A y = B ū + D v̄ + e + k*    (15)

A_v v̄ = e_v + k*_v    (16)

where

ū = [ū(tc+1), ū(tc+2), ..., ū(tc+T)]',   y = [y(tc+1), y(tc+2), ..., y(tc+T)]'

and similarly for v̄, e and e_v. A, B, D and A_v are RS-LB-matrices of degrees ∂A = ∂ā, ∂B = ∂b, ∂D = ∂d and ∂A_v = ∂ā_v, respectively; A_0(t) = 1, A_i(t) = ā_i, B_i(t) = b_i, etc. in the appropriate range of the index i. The vectors k* and k*_v are defined as follows:

k* = [k + s_1(tc), ..., k + s_n(tc), k, ..., k]'
k*_v = [s_v,1(tc), ..., s_v,m(tc), 0, ..., 0]'    (17)

where n = max(∂ā, ∂b, ∂d), m = ∂ā_v,

s_j(tc) = Σ_{i=j}^{n} (−ā_i y(tc+j−i) + b_i ū(tc+j−i) + d_i v̄(tc+j−i))    (18)

s_v,j(tc) = −Σ_{i=j}^{m} ā_v,i v̄(tc+j−i).    (19)

The criterion (4) can be written in the following vector-matrix form

J = (1/T) E{(y − w)'(y − w) + ū'Θ̄'ΩΘ̄ū}    (20)

where (·)' means transposition, Ω is a diagonal matrix with

Ω(t) = ω̄            for tc < t ≤ tc + T − ∂b
Ω(t) = ω̄ + ω̄_s      for tc + T − ∂b < t ≤ tc + T

and

Θ̄ = Θ    for the P-predictor
Θ̄ = I    for the I-predictor.

In case of the P-predictor the term

(1/T) ω̄ (u²(tc) − 2u(tc)u(tc+1))    (21)

is omitted in (20) because it is insignificant in the sequel.
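The RS-LB matrix calculus can be made concrete in code. The sketch below (helper names are ours) builds T x T band matrices from their row-stationary coefficients and checks the commutation property of RS lower-triangular matrices that is exploited in Section 4:

```python
def rs_lb(coeffs, T):
    """T x T row-stationary lower-band matrix with M[t][t-i] = coeffs[i]."""
    M = [[0.0] * T for _ in range(T)]
    for t in range(T):
        for i, c in enumerate(coeffs):
            if t - i >= 0:
                M[t][t - i] = c
    return M

def matmul(X, Y):
    T = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(T)) for c in range(T)]
            for r in range(T)]

T = 6
A = rs_lb([1.0, -0.7, 0.1], T)     # A_0 = 1, A_1 = -0.7, A_2 = 0.1
B = rs_lb([0.5, 0.3], T)
Theta = rs_lb([1.0, -1.0], T)      # the matrix (14)
AB = matmul(A, B)
BA = matmul(B, A)
# RS lower-triangular matrices commute (each is a polynomial in the
# lower shift matrix), so AB equals BA elementwise.
```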
To make the main results easier to survey, the control synthesis minimizing the criterion (20) will be presented first as Result 1. Its constructive proof, and the optimization method itself, will be given in the next Section 4.

Result 1. Finite control horizon: The control law minimizing the criterion (20) for the process models (12) and (13) has the form

ū(t) = −Λ(t)⁻¹ [ Σ_{i=1}^{∂R} R_i(t) ū(t−i) + Σ_{i=1}^{∂S} S_i(t) y(t−i) + Σ_{i=1}^{∂Q} Q_i(t) v̄(t−i) + x_k(t) − x_w(t) ]    (22)

and is obtained as the tth row, t > tc + max(∂ā, ∂b, ∂d, ∂ā_v), of the vector-matrix relation

ū = −Λ⁻¹ [R ū + S y + Q v̄ + x_k − x_w]    (23)

where R, S and Q are SLB-matrices and Λ is diagonal. The matrices can be determined in the following way. Compose the symmetric band matrix

B'B + Ã'ΩÃ = M    (24)

where

Ã = Θ̄A    (25)

and perform its factorization

M = Φ'ΛΦ    (26)

where Φ is a LB-matrix with unit diagonal elements. Determine the SLB-matrices S, H and the LB-matrices F, G in the matrix equalities

B' = Φ'S + F'A    (27)

B'D = Φ'H + G'A_v.    (28)

The SLB-matrices R and Q are determined by the strictly lower parts of the matrix equalities

Φ'R ≐ F'B + Ã'ΩΘ̄    (29)

Φ'Q ≐ F'D − G'A_v    (30)

where ≐ indicates that only the equality of SL-parts is significant. Algorithms performing this task are described below. They operate recursively backwards with respect to natural time (for t = tc + T, tc + T − 1, tc + T − 2, ...), can run in parallel, require only a fixed memory size (depending on the degrees ∂ā, ∂b, ∂d and ∂ā_v, but not on T) and exhibit a high numerical stability.

The term x_k(t) in the optimal control law (22) is zero for I-predictors where k = 0. For P-predictors with k ≠ 0 this term is nonzero, in general, and can be calculated also recursively for decreasing t according to the 'upper-band' set of equations

Φ'x_k = F'k* + G'k*_v.    (31)

The term x_w(t), by which the command signal is incorporated into the control law (22), is obtained as follows:
(A) Fixed setpoint. The output y(t) is understood as the deviation from the given setpoint and

x_w(t) = 0.

(B) Program control. The term x_w(t) is the output of the filter operating on the future course of the command signal w backwards with respect to natural time according to the relation

Φ'x_w = B'w.    (32)

(C) Positional servo.

x_w(t) = Γ(t) w(t)    (33)

where Γ(t) is the element of the diagonal matrix Γ determined by the matrix equation

B' = Φ'Γ + Δ'Θ    (34)

where Δ is an auxiliary SLB-matrix. The matrix equation (34) is solved again recursively for decreasing t.
(D) Model following. The term x_w(t) is obtained as the output of the filter operating on the overall command signal w_c in real-time in the following way

s_m(t) = −Σ_{i=1}^{∂am} a_m,i s_m(t−i) + w_c(t)

x_w(t) = Σ_{i=0}^{∂Γm} Γ_m,i(t) s_m(t−i).    (35)

The coefficients Γ_m,i(t) are obtained as the elements of the LB-matrix Γ_m defined by the matrix equation

B'B_m = Φ'Γ_m + Δ'_m Θ A_m    (36)

which is solved recursively for decreasing t. Δ_m is an auxiliary SLB-matrix; A_m and B_m are RS-LB-matrices with the elements A_m,i(t) = a_m,i, B_m,i(t) = b_m,i, respectively.

Remark 1. Recall that in the important case of an incremental predictor (3), when combined with the external disturbance modelled as a generalized random walk (7), A_v = I and Θ̄ = I. In this case only the first matrix terms on the right-hand sides of (29) and (30) have nonzero elements under the main diagonal. Omitting the insignificant terms, the relations

Φ'R ≐ F'B,   Φ'Q ≐ F'D    (37)
are sufficient to determine R and Q. However, then neither the matrix G nor the matrix H is required and the decomposition (28) does not need to be performed.
Remark 2. The optimal control law (22) has been given only for t > tc + max(∂ā, ∂b, ∂d, ∂ā_v). However, this does not restrict its generality because, for the known model parameters, the time index tc can be shifted arbitrarily backwards. Nevertheless, it may be interesting to note that the first row in (23) gives the optimal control law operating on the state the components of which are defined, for t = tc, by (18) and (19). For instance, for the case discussed in Remark 1 the first row in (23) gives

Δu(tc+1) = −Λ(tc+1)⁻¹ [x_k(tc+1) − x_w(tc+1)]

where, according to (31) and (17),

x_k(tc+1) = Σ_{i=1}^{n} Ψ_{i−1}(tc+1) s_i(tc)

and Ψ_i(tc+1) are the elements in the first row of the matrix

Ψ' = (Φ')⁻¹F'.

Algorithms. The algorithm for the matrix factorization (26) is well known (Bierman, 1977). As the 'upper-diagonal-lower' factorization is required the algorithm must start in the lower right-hand corner of the matrix equality (26) and run backwards with respect to natural time (as in dynamic programming). An examination of the matrix decomposition (27) shows that also here the lower right-hand corner is the only place where the solution may start. The algorithm can be found by comparing the lower and upper parts of the matrix equality. Similarly, the algorithm to calculate the matrix R from the lower part of (29) can be derived. All three algorithms can run in parallel and can be unified into the single Algorithm OCS given below. The algorithms for solving (28) and (30) are similar and therefore are omitted here. However, it is worth noting that in the important special case, discussed in Remark 1, (28) is not required to be solved and Q_i(t) can be calculated in the very same way as R_i(t), just replacing b_i by d_i [see (37)].

Algorithm OCS (optimum control synthesis). In order to unify the algorithm let n = max(∂b, ∂a + 1) = max(∂b, ∂ã) and set zeros for parameter values where required. The algorithm operates backwards with respect to natural time with zero initial conditions for t > tc + T for all calculated variables:

Λ(t) = M_0(t) − Σ_{j=1}^{n} Λ(t+j) Φ_j(t+j)²

F_0(t) = b_0 − Σ_{j=1}^{n} (S_j(t+j) Φ_j(t+j) + ā_j F_j(t+j))

for i = 1 to n:

    Φ_i(t) = Λ(t)⁻¹ [M_i(t) − Σ_{j=1}^{n−i} Λ(t+j) Φ_j(t+j) Φ_{i+j}(t+j)]

    S_i(t) = −ā_i F_0(t) − Σ_{j=1}^{n−i} (S_{i+j}(t+j) Φ_j(t+j) + ā_{i+j} F_j(t+j))

    F_i(t) = b_i − Σ_{j=1}^{n−i} (S_j(t+j) Φ_{i+j}(t+j) + ā_j F_{i+j}(t+j))

    R_i(t) = b_i F_0(t) − Σ_{j=1}^{n−i} (R_{i+j}(t+j) Φ_j(t+j) − b_{i+j} F_j(t+j))

R_1(t) := R_1(t) − μω̄

where μ = 1 for the P-predictor and μ = 0 for the I-predictor. Note that there is just a single division in the entire algorithm, namely by Λ(t) in the factorization part. However, this number must be positive since the factorized matrix M is positive definite (except for the case when all b-parameters are zero and ω̄ = 0, i.e. when any control has no sense and the input is allowed to be arbitrary). The only potential numerical danger is the subtraction of positive numbers in the first row, the result of which must be the positive number Λ(t). This drawback can be removed by a modification of the algorithm developed by my colleague K. Smuk. The idea of this modification is to transform the left-hand side of (24) directly into the right-hand side of (26) using an algorithm similar to Bierman's UD-filter which guarantees the positive definiteness numerically without calculating the matrix M explicitly. Details will be reported elsewhere. Note also that the algorithm, when coded for some fixed n_max, can solve all cases for n ≤ n_max.

Asymptotic solution
The solution for the infinite control horizon can be found, if it exists, by increasing the number of iterations in the Algorithm OCS until stationarity is reached. However, in low-degree cases it may be more convenient to look for the stationary solution directly. It is well known that the multiplication of RS-LB-matrices is equivalent to the multiplication of polynomials. Hence, it is appropriate to look for the stationary solution in terms of polynomial equations. This can be done very simply by replacing the LB-matrices in the matrix relations (24) to (30) by polynomials introduced as follows:

ā = ā(ζ) = 1 + ā_1 ζ + ... + ā_∂ā ζ^∂ā in place of A,

and in general
P = P(ζ) = p_0 + p_1 ζ + ... + p_∂P ζ^∂P in place of a LB-matrix P. Note that p(0) = p_0, p(1) = p_0 + p_1 + ... + p_∂P and that, according to (14), θ(ζ) = 1 − ζ, and according to (25) for both P- and I-predictors

ã = ã(ζ) = θ̄(ζ) ā(ζ) = (1 − ζ) a(ζ).    (38)

It can be verified by inspection that, when considering the row-stationary zone in the matrix relations (24) to (30), the transpose P' must be replaced by the polynomial which can be called, with some abuse of language, reciprocal

p' = p(ζ⁻¹) = p_0 + p_1 ζ⁻¹ + ... + p_∂P ζ^−∂P.

Result 2. Infinite control horizon: If ζ is understood as the one-step-delay operator, e.g. ζū(t) = ū(t−1), then the stationary control law minimizing the criterion (4) for T → ∞, if it exists, can be written in the form

ū(t) = −δ⁻¹ [r ū(t) + s y(t) + q v̄(t) + x_k − x_w(t)]    (39)

where the scalar δ and the polynomials r, s, q (with r_0 = s_0 = q_0 = 0) are determined by the polynomial equations which follow from the RS-zones of the matrix equations (24)-(30):

b'b + ω̄ ã'ã = m = φ'δφ    (40)

b' = φ's + f'ā    (41)

b'd = φ'h + g'ā_v    (42)

φ'r ≐ f'b + ã'ω̄θ̄    (43)

φ'q ≐ f'd − g'ā_v    (44)

where ≐ means that only the equality of terms with positive powers of ζ is considered. The polynomial φ in the factorization (40) has no roots inside the unit circle and is normalized so that φ_0 = 1. The constant term x_k in (39) is zero for the I-predictor while for the P-predictor

x_k = (f(1)/φ(1)) k.    (45)

The term x_w(t) in the control law (39) is obtained as follows.
(A) Fixed setpoint. The output y means the deviation from the given setpoint and x_w(t) = 0.
(B) Program control. The term x_w(t) is obtained as the output of the filter b'/φ' by which the future command signal is filtered backwards with respect to natural time

x_w(t) = (b'/φ') w(t).    (46)

(C) Positional servo. The term x_w(t) depends only on the recent setpoint

x_w(t) = (b(1)/φ(1)) w(t).    (47)

(D) Model following. The term x_w(t) is generated by the filter operating on the overall command signal w_c in real time

x_w(t) = (γ_m/a_m) w_c(t).    (48)

The polynomial γ_m is determined by the polynomial equation

b'b_m = φ'γ_m + λ'_m θ a_m    (49)

where λ_m(0) = λ_m,0 = 0.

Remark 3. In the important special case discussed in Remark 1, ā_v = 1 and θ̄ = 1. Omitting the insignificant terms, the relations (43) and (44) read

φ'r ≐ f'b,   φ'q ≐ f'd.

However, then (42) is superfluous and it is advantageous to proceed as follows. Perform v = max(∂b, ∂d) steps of the polynomial division

f/φ = ψ_0 + ψ_1 ζ + ... + ψ_v ζ^v + ...

to obtain the polynomial ψ (∂ψ = v). The coefficients r_i and q_i of the control law are easily determined by comparing the terms with positive powers of ζ in the relations

r ≐ ψ'b,   q ≐ ψ'd.    (50)
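The construction of Remark 3 is directly computable. A minimal sketch (function names and sample coefficients are ours) that performs the truncated power-series division f/φ and then extracts the positive-power coefficients of ψ'b as in (50):

```python
def series_div(f, phi, nterms):
    """First nterms coefficients psi of the power series f(z)/phi(z), phi[0] != 0."""
    psi = []
    for i in range(nterms):
        fi = f[i] if i < len(f) else 0.0
        acc = fi - sum(phi[j] * psi[i - j]
                       for j in range(1, min(i, len(phi) - 1) + 1))
        psi.append(acc / phi[0])
    return psi

def positive_part_corr(psi, b, deg):
    """Coefficients x_1..x_deg of x = psi' b taken at positive powers of z,
    i.e. x_i = sum_j psi_j * b_{i+j}  (the correlation implied by (50))."""
    return [sum(psi[j] * b[i + j] for j in range(len(psi)) if i + j < len(b))
            for i in range(1, deg + 1)]
```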
The numbers ψ_i are, actually, the parameters of the stationary control law operating on the state introduced for t = tc by (18), see Remark 2.

Remark 4. If the order of the factorization (40) is not higher than 2 then the following simple formulae can be used to perform it explicitly:

λ = m_0/2 − m_2 + √[(m_0/2 + m_2)² − m_1²]

δ = [λ + √(λ² − 4m_2²)]/2

φ_1 = m_1/(δ + m_2),   φ_2 = m_2/δ.
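The explicit second-order factorization of Remark 4 can be checked numerically by a round trip: build m_0, m_1, m_2 from a known stable φ and recover δ, φ_1, φ_2. A minimal sketch (ours):

```python
from math import sqrt

def factor2(m0, m1, m2):
    """Solve m0 + m1(z + 1/z) + m2(z^2 + 1/z^2) = delta * phi(z) * phi(1/z),
    phi(z) = 1 + phi1*z + phi2*z^2, using the formulae of Remark 4."""
    lam = m0 / 2 - m2 + sqrt((m0 / 2 + m2) ** 2 - m1 ** 2)
    delta = (lam + sqrt(lam ** 2 - 4 * m2 ** 2)) / 2
    phi1 = m1 / (delta + m2)
    phi2 = m2 / delta
    return delta, phi1, phi2

# round-trip data: delta0 = 2, phi = 1 + 0.5 z + 0.2 z^2 (roots outside unit circle)
delta0, p1, p2 = 2.0, 0.5, 0.2
m0 = delta0 * (1 + p1 ** 2 + p2 ** 2)
m1 = delta0 * (p1 + p1 * p2)
m2 = delta0 * p2
```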
If the sampling period is chosen long enough compared to the significant part of the system step response then such low-order predictors appear satisfactory for a rather broad class of practical cases. However, then also the solution of (41)-(44) can be reduced to simple explicit formulae which can be well implemented on a microprocessor. For instance, a self-tuning controller based on the I-predictor with the structure ∂a = 1 (∂ā = 2), ∂b = 2, ∂d = 2, ∂ā_v = 0 can well manage many practical situations if the number of samples within the transient part of the step response is, say, up to six (but often even more). An important feature of the procedure is that, unlike other methods of numerical synthesis, the only singularity which might be encountered is the unavoidable one discussed below.

Remark 5. Inspection of (40) and (41) shows that (41) does not have any solution if the polynomials ā
and b have an unstable common factor. In this case the stationary solution does not exist. The Algorithm OCS (its decomposition part) does not converge, but for a finite horizon T it still retains its correct physical meaning. However, the process is not stabilizable.

Remark 6. In case of model following with a reference model of first order (11) the polynomial γ_m in (48) is

γ_m = γ_m,0 + γ_m,1 ζ.

Its coefficients can be easily determined using simple formulae, in terms of the ratios b(1)/φ(1) and b(α)/φ(α), which follow from the polynomial equation (49) with a_m(ζ) = 1 − αζ, b_m(ζ) = b_m,0 = 1 − α.

Remark 7. As in case of an I-predictor ā(1) = 0, it follows from (41) that
of a deterministic system follows precisely any a priori given discrete command signal even when the system is of non-minimum phase. When the system input is damped by penalization of its increments, co > 0, then the controller naturally starts to follow a sudden change in the setpoint earlier than it really occurs in order to avoid large movements of the actuator. This makes the transition smoother and easier to realize. 4. OPTIMIZATION METHOD In this section Result 1, from which the Result 2 directly follows, will be derived. The principle idea of the method is a suitable prearrangement of the criterion (20) into two nonnegative components the first one of which can be zeroed in one shot while the second one does not depend on the strategy applied and represents the attainable minimum of the criterion. Recall that the matrices/1 and/T,, in (15) and (161 are lower triangular with unit diagonal elements and therefore invertible. Moreover, the RS LTmatrices commute
b(1) 4)(1) - s(1) = ~ s i . i
Then the control law (39) with x~,~ determined by (47) can be given the form Au(t , = - - 6
l[E
riAul,_i)
q- E si(yit_i) i
t_~
wl,))
B A = - AB,
A
IB = Bd
~
DA = AD,
A- 1D = D A
1
which can be easily verified by inspection, Making use of these properties the vector of control errors (y - w) and the criterion (20) can be expressed in the following way y-w=B,4
(51)
+ ~qiAv,_i) 1
operating only on the time increments ofu and v and on the deviations of the past actual outputs from the present setpoint. R e m a r k 8. The self-tuning controller designed for
program control must have a buffer where a finite number, say T~, of the next setpoints are stored. To achieve the desired effect the memory of the buffer does not need to be large. As a rule, T~ < 10 is fully sufficient. Then it is natural to start the backward filtering (46) with the initial conditions
l~+z
z = A-l(e
+ k*) + D A - I A , T l ( e ,
+ k*) - w
(531
1
j = T E { ~ - I ~ ) , M ( , ~ 1~) + 2 ( A - l ~ ) ' B ' z + z'z~ where M is the symmetric band matrix (24). This matrix is positive definite (except the very special and uninteresting case when simultaneously B = 0 and co = 0 and any control has no sense) and therefore can be uniquely factorized according to (26). Applying this factorization the rearrangement of the criterion can continue as follows:
W(t + Tw + i) z W( t + l"w)
J =
E{~'A~ + 2ii'10')- 1B'z + . _,_t ~,
(54)
for i > 0.
b(1) Xw(~ + T,,, + i) ~ ~
(521
W(t + Tw)
This is, actually, the precise solution for the case when the uncertainty ofwt,+kl for k > Tw is modelled as the generalized random walk (8). The interested reader, who will perform some simulations for himself, will appreciate the favourable and sometimes amazing effect of the term Xw~,) calculated in this way. For instance, for co = 0 the sampled output
where = ( I ) / t - 1~.
{55)
Note that ft,) is a linear combination of I~I~; r _< t I but, due to the lower triangularity of@,4 1, does not contain t/m for z > t. This fact will appear very important. It is also the reason why the factorization (26) has been performed in the 'upper-diagonallower' form.
The cross-term in (54) can be written, using (53),

E ū'(Φ')^{-1}B'z = E{ ū'(Φ')^{-1}B'A^{-1}(e + k*) + ū'(Φ')^{-1}B'DA^{-1}A_v^{-1}(e_v + k_v*) - ū'(Φ')^{-1}B'w }.   (56)

Now the main trick of the method follows. Denote for a while

(Φ')^{-1}B'A^{-1} = P   (57)

and decompose this T × T matrix into the strictly lower part and the upper part

P = P_+ + P_-   (58)

where the SLT-matrix P_+ has zeros on and above the main diagonal whereas P_- has zeros under the main diagonal. The point, and the reason for the decomposition, is that

E ū'P_- e = 0

whatever P_- might be. This is due to the fact that this expression contains the products ū(t)e(τ) only for τ ≥ t, the mean values of which are zero according to the assumption A1, equation (5). Hence

E ū'Pe = E ū'P_+ e.

It is more convenient to consider the decomposition (58) of (57) in the following equivalent form

(Φ')^{-1}B'A^{-1} = SA^{-1} + (Φ')^{-1}F'   (59)

B' = Φ'S + F'A   (60)

where (60) coincides with (27). Note that the matrix relation (60) determines the SLB-matrix S and the LB-matrix F uniquely, including their orders. In this way the first bilinear form on the right-hand side of (56) can be expressed as follows:

E ū'(Φ')^{-1}B'A^{-1}(e + k*) = E ū'[ SA^{-1}(e + k*) + (Φ')^{-1}F'k* ].   (61)

In a very similar way also the second term on the right-hand side of (56) can be rearranged:

(Φ')^{-1}B'D(A_v A)^{-1} = H(A_v A)^{-1} + (Φ')^{-1}G'.   (62)

This decomposition leads to the matrix equality (28) by which the SLB-matrix H and the LB-matrix G are determined. Recall that the sequence {v(t)} has been introduced as a disturbance which is external with respect to the control loop. This means that

E ũ(t) e_v(τ) = 0    for τ ≥ t

and in analogy with (61)

E ū'(Φ')^{-1}B'DA^{-1}A_v^{-1}(e_v + k_v*) = E ū'[ HA^{-1}A_v^{-1}(e_v + k_v*) + (Φ')^{-1}G'k_v* ].   (63)

The suitable arrangement of the last term on the right-hand side of (56) depends on the way how the command signal w is generated and made available for the controller. To proceed generally at this stage express this term as follows:

E ū'(Φ')^{-1}B'w = E ū'x_w   (64)

and assume that x_w(t) is available when u(t) is computed. The determination of the vector x_w for the four situations (A, B, C, D) will be discussed separately. Making use of (61), (63) and (64) the bilinear term (56) of the quadratic criterion can be written as follows:

E ū'(Φ')^{-1}B'z = E ū'x   (65)

where

x = SA^{-1}(e + k*) + HA^{-1}A_v^{-1}(e_v + k_v*) + x_k - x_w   (66)

and, in coincidence with (31),

x_k = (Φ')^{-1}(F'k* + G'k_v*).   (67)

Now, by completion of squares, the criterion (54) can be given the form

J = (1/T) E[ ū'Λū + 2ū'x + z'z ]
  = (1/T) E[ (ū + Λ^{-1}x)'Λ(ū + Λ^{-1}x) + z'z - x'Λ^{-1}x ]
  = (1/T) E[ (ū + Λ^{-1}x)'Λ(ū + Λ^{-1}x) ] + J_0   (68)

where

J_0 = (1/T) E[ z'z - x'Λ^{-1}x ]   (69)

is the component of the criterion which cannot be influenced by any admissible control strategy. At the same time J_0 is the minimum of J which is attained by the control strategy satisfying the relation

ū + Λ^{-1}x = 0.   (70)

Substitution for ū and x from (55) and (66), respectively, gives

ΦA^{-1}ũ + Λ^{-1}[ SA^{-1}(e + k*) + HA^{-1}A_v^{-1}(e_v + k_v*) + x_k - x_w ] = 0.   (71)

Note that the control strategy fulfilling this relation is realizable, as each ũ(t) is generated as a linear combination of the past inputs, of the past prediction errors {e(τ), e_v(τ); t_c < τ < t} which are known at the moment, and of the initial state contained in k* and k_v* (17). However, this is not a very practical control law.
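The decomposition (58)-(60), which splits P = (Φ')^{-1}B'A^{-1} into its strictly lower part P_+ = SA^{-1} and upper part P_- = (Φ')^{-1}F', can be checked numerically. The sketch below reuses the same illustrative A, B and ω as before (hypothetical values, not the paper's data), recovers S and F' and verifies B' = Φ'S + F'A:

```python
import numpy as np

T, omega = 8, 0.1
A = np.eye(T) - 0.7 * np.eye(T, k=-1)              # illustrative band matrices
B = 0.5 * np.eye(T, k=-1) + 0.3 * np.eye(T, k=-2)
M = B.T @ B + omega * A.T @ A

J = np.eye(T)[::-1]                                 # UDL factorization M = Phi' Lam Phi
C = np.linalg.cholesky(J @ M @ J)
Phi = (J @ (C / np.diag(C)) @ J).T

P = np.linalg.solve(Phi.T, B.T) @ np.linalg.inv(A)  # P = (Phi')^-1 B' A^-1
P_plus = np.tril(P, k=-1)                           # strictly lower part
P_minus = np.triu(P, k=0)                           # upper part incl. diagonal

S = P_plus @ A               # from S A^-1 = P_+ : S is strictly lower triangular
Ft = Phi.T @ P_minus         # from (Phi')^-1 F' = P_- : Ft plays the role of F'

assert np.allclose(Phi.T @ S + Ft @ A, B.T)   # the identity B' = Phi'S + F'A, i.e. (60)
assert np.allclose(S, np.tril(S, k=-1))       # S strictly lower triangular
assert np.allclose(Ft, np.triu(Ft))           # F' upper triangular
```

In the exact band-limited theory S and F are band matrices of known orders; the dense triangles above are only a numerical illustration of the splitting.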
Therefore express A^{-1}(e + k*) and A_v^{-1}(e_v + k_v*) from (15) and (16), respectively, and substitute into (71) to obtain

(Φ - Λ^{-1}SB)A^{-1}ũ + Λ^{-1}[ Sy + (H - SD)A^{-1}ṽ + x_k - x_w ] = 0

or equivalently

ũ + Λ^{-1}[ Rũ + Sy + Qṽ + x_k - x_w ] = 0   (72)

where

R = (ΛΦ - SB)A^{-1} - Λ   (73)

Q = (H - SD)A^{-1}.   (74)

Note that the structure of the relation (72) coincides with (23) but from (73) and (74) it is not directly seen that the SLT-matrices R and Q are also band limited. To show this calculate S from (60) and substitute into the right-hand sides of (73) and (74):

R = (Φ')^{-1}[ (Φ'ΛΦ - B'B)A^{-1} + F'B ] - Λ   (75)

Q = (Φ')^{-1}[ (Φ'H - B'D)A^{-1} + F'D ].   (76)

From the matrix relations (24) and (26) and from (28) it is readily seen that

Φ'ΛΦ - B'B = ωA'A,    Φ'H - B'D = -G'A_v A.

In this way it is finally obtained

R = (Φ')^{-1}[ F'B + ωA' ] - Λ,

Q = (Φ')^{-1}[ F'D - G'A_v ].

Now it is already apparent that, in general, ∂R = max(∂B, ∂A) and ∂Q = max(∂D, ∂A_v). Note that the upper parts of the matrix equalities (75) and (76), including the main diagonals, are superfluous and must be identically zero. The matrix relations (29) and (30), which were to be proved, directly follow from (75) and (76).

It remains to derive the relations by which the vector x_w is determined.

(A) Fixed setpoint. In this trivial case x_w = 0, which does not need any comment.

(B) Program control. As the entire vector w is available for any time index t (t_c < t ≤ t_c + T) it directly follows from (64) that

x_w = (Φ')^{-1}B'w.

Since Φ' is an upper triangular band matrix, x_w can be computed by the backward filtering (46), starting from the end of the setpoint buffer.
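For the program-control case the vector x_w = (Φ')^{-1}B'w never requires forming (Φ')^{-1} explicitly: because Φ' is upper triangular, Φ'x_w = B'w is solved by a single backward sweep through the setpoint buffer, which is the essence of the backward filtering. A sketch with the same illustrative data as above (hypothetical values, not from the paper):

```python
import numpy as np

T, omega = 8, 0.1
A = np.eye(T) - 0.7 * np.eye(T, k=-1)               # illustrative band matrices
B = 0.5 * np.eye(T, k=-1) + 0.3 * np.eye(T, k=-2)
M = B.T @ B + omega * A.T @ A
J = np.eye(T)[::-1]
C = np.linalg.cholesky(J @ M @ J)
Phi = (J @ (C / np.diag(C)) @ J).T                  # M = Phi' Lam Phi

w = np.concatenate([np.zeros(4), np.ones(4)])       # preprogrammed setpoint step at t = 4
rhs = B.T @ w                                       # B' already looks ahead of time t:
assert np.isclose(rhs[2], 0.3)                      # rhs[2] = 0.5*w[3] + 0.3*w[4]

U = Phi.T                                           # upper triangular, unit diagonal
xw = np.zeros(T)
for t in range(T - 1, -1, -1):                      # backward sweep t = T-1, ..., 0
    xw[t] = rhs[t] - U[t, t + 1:] @ xw[t + 1:]

assert np.allclose(xw, np.linalg.solve(U, rhs))     # matches the direct solve
```

The nonzero x_w(t) ahead of the programmed step is exactly the feedforward "preview" action discussed in Remark 8.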
(C) Positional servo. The set of equations obtained from (8) for t = t_c + 2, t_c + 3, ..., t_c + T can be written in the matrix-vector form

A_w w = e_w   (77)

where, pro forma, e_w(t_c+1) = w(t_c+1). Substitution for w in (64) gives

E ū'x_w = E ū'(Φ')^{-1}B'A_w^{-1} e_w.   (78)

As w is by definition an external signal for the control loop it holds

E ũ(t) e_w(τ) = 0    for τ > t.   (79)

It is important to realize that (79) does not hold for τ = t. Thus, the same trick can be applied as before with the only difference that the matrix decomposition in (78)

(Φ')^{-1}B'A_w^{-1} = Γ A_w^{-1} + (Φ')^{-1}Λ_w'   (80)

B' = Φ'Γ + Λ_w'A_w   (81)

must be performed so that the main diagonal of the upper part (Φ')^{-1}Λ_w' be zero. Hence, Λ_w is a SLB-matrix. As φ_0 = 1, it can be seen from (81) that Γ is diagonal. Substitution of (80) into (78) gives

E ū'x_w = E ū'Γ A_w^{-1} e_w = E ū'Γw

which proves (33).

(D) Model following. As the reference model (9) must be chosen sufficiently stable it is possible and convenient to simplify the verification of the above given Results assuming zero initial conditions for this model. With this simplification the system of T equations obtained from (9) for t = t_c + 1, t_c + 2, ..., t_c + T can be written as follows:

A_m w = B_m w_c

where A_m and B_m are RS-LB-matrices with the elements a_{m,i} and b_{m,i}, respectively. Similarly to (77) we also have

A_w w_c = e_{wc}.

Using these relations the bilinear term (64) can be expressed as follows:

E ū'x_w = E ū'(Φ')^{-1}B'w = E ū'(Φ')^{-1}B'B_m A_m^{-1} A_w^{-1} e_{wc}.

Proceeding in the very same way as in the case (C) the following matrix decomposition is applied:

(Φ')^{-1}B'B_m A_m^{-1} A_w^{-1} = Γ_m A_m^{-1} A_w^{-1} + (Φ')^{-1}Λ_m'   (82)

where Λ_m is a SLB-matrix. This leads to the result

E ū'x_w = E ū'Γ_m A_m^{-1} A_w^{-1} e_{wc} = E ū'Γ_m A_m^{-1} w_c,

x_w = Γ_m A_m^{-1} w_c,

which coincides with (32), or equivalently

x_w = Γ_m s_w,    A_m s_w = w_c.   (83)

This verifies the real-time filter (35) while (82) is equivalent to the matrix relation (36) by which the LB-matrix Γ_m is determined.

5. CONCLUDING REMARKS
The matrix factorization (26) and the matrix decomposition (58) are the fundamental steps in the derivation of the presented method. Similar steps can be found also in the Wiener synthesis, namely spectrum factorization and partial fractioning (see e.g. Newton, Gould and Kaiser, 1957, or Chang, 1961). In this sense the method can be viewed as a certain revival of the Wiener approach. However, here it is applied to finite time intervals and nonstationary situations. Going through the derivation in Section 4 it can be seen that the feedforward paths from any number of measurable external disturbances can be synthesized in the very same way. Also the extension of the control-law synthesis for the case of preprogrammed loading, when one or more external variables v are a priori determined so that their future values {v(t+i), i > 0} are available when u(t) is being determined, is simple and straightforward. Unfortunately, the extension to the multi-input multi-output case is not so easy. The difficulty is that the matrices A, B and D in (15) are no longer row-stationary but only block-row-stationary and the commutation (52) does not apply automatically. As this commutation is inherent in the method it must be performed using an additional algorithm. This makes the method less elegant and less suitable (compared to square-root dynamic programming) for multivariate cases.

It has been stated in Section 2 that in most cases of industrial process control the I-predictors should be preferred to P-predictors. There are two main arguments for this recommendation. First, the control law synthesized for an I-predictor generates the increments of the system input (ũ = Δu) so that the controller possesses an integral action. This accords with the well-tried feedback structure generally used in practice to remove the steady-state bias and to compensate drifts and disturbances or load changes of relatively long duration, which is, as a rule, the primary task of any regulator.
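The integral action of an incremental control law can be seen on a toy loop. The sketch below uses a hypothetical first-order plant and a plain integrating controller (not the optimal law of the paper): a constant load disturbance is rejected without steady-state bias because the controller output keeps moving as long as a control error persists.

```python
# Hypothetical plant y(t) = 0.8 y(t-1) + 0.4 u(t) + d with constant load d.
# Incremental (Delta-u) controller: u(t) = u(t-1) + k * (w - y(t-1)).
a1, b1, d, w, k = 0.8, 0.4, 1.0, 2.0, 0.2
y = u = 0.0
for _ in range(300):
    u = u + k * (w - y)      # Delta-u form of the law => integral action
    y = a1 * y + b1 * u + d  # plant driven by an unmeasured load disturbance

assert abs(y - w) < 1e-6     # no steady-state bias despite d != 0
```

In steady state the increment of u is zero only when y = w exactly, whatever the constant load d; a positional (P-predictor based) law with the same structure would settle with a bias.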
The second but not less important argument in favor of I-predictors concerns the identification part of the self-tuning algorithm. Those who deal with self-tuning control systematically know that the plain application of the certainty-equivalence hypothesis in connection with exponential forgetting of old data may lead to the unfavourable phenomena called 'bursts' (Åström, 1979) and 'covariance windup' (Åström, 1980) if the closed control loop is not excited externally (for instance by setpoint changes) and if no special precautions are taken (Kárný, 1982). The main root of this difficulty is that the fixed linear control law, to which the self-tuning algorithm eventually converges, creates a linear dependence on the data in the regressor used to update the estimates of the predictor parameters. An insight into this problem can be obtained from the simple Example 4.4 in Peterka (1981). The advantage of the I-predictor is that, unlike the P-predictor and other non-incremental models, the corresponding optimal control law does not create this linear dependence. For instance, the parameters of the I-predictor

ŷ(t) = y(t-1) + b_0 Δu(t) + b_1 Δu(t-1) + b_2 Δu(t-2) - a_1 Δy(t-1)

are estimated by recursive least squares applied to the observation equation

Δy(t) = [b_0, b_1, b_2, -a_1] p(t) + e(t)

where p(t) is the regressor

p(t) = [Δu(t), Δu(t-1), Δu(t-2), Δy(t-1)]'.
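A minimal recursive-least-squares sketch for this I-predictor is given below. The parameter values are hypothetical and the data are noise-free for clarity; a real self-tuning identifier would add exponential forgetting and the anti-windup precautions discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
b0, b1, b2, a1 = 0.5, 0.3, 0.1, 0.6          # hypothetical true parameters
N = 200
du = rng.standard_normal(N)                  # exciting input increments
dy = np.zeros(N)
for t in range(2, N):
    dy[t] = b0*du[t] + b1*du[t-1] + b2*du[t-2] - a1*dy[t-1]

theta = np.zeros(4)                          # estimates of [b0, b1, b2, -a1]
P = 1e6 * np.eye(4)                          # large initial covariance
for t in range(2, N):
    p = np.array([du[t], du[t-1], du[t-2], dy[t-1]])   # regressor p(t)
    K = P @ p / (1.0 + p @ P @ p)            # RLS gain
    theta += K * (dy[t] - p @ theta)         # prediction-error update
    P -= np.outer(K, p @ P)                  # covariance update

assert np.allclose(theta, [b0, b1, b2, -a1], atol=1e-6)
```

With persistently exciting Δu the estimates converge to the true values; in closed loop without external excitation the regressor entries become linearly dependent unless the control law itself perturbs that dependence, as discussed next.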
The corresponding optimal structure of the control law, considered in the form (51) with the fixed setpoint w(t) = w, can be written as follows:

[λ, r_1, r_2, -s_2] p(t) = -(s_1 + s_2)(y(t-1) - w).

This clearly shows that any deviation of the output from the given setpoint perturbs the linear dependence within the regressor, which helps to avoid the phenomena mentioned above.

The numerical and computational aspects were the main motive for the development of the presented method of control synthesis. To conclude the paper some additional comments will be given to this, for self-tuning control ever so important, point. As outlined in Remark 4 the polynomial equations (40)-(44) can be used to derive explicit formulae for the control synthesis in the case of low-order predictors. It may be interesting to compare, from the computational point of view, this procedure with the pole-placement technique used by Wellstead, Prager and Zanker (1979) and by Wittenmark and Åström (1980) for a similar purpose. In the polynomial equation (43) only the equalities of terms with positive powers of ζ are significant for the determination of the polynomial r = r_1 ζ + r_2 ζ² + ... + r_{∂r} ζ^{∂r}. However, the equation itself holds generally. The unknown polynomial f could be eliminated from the equations (43) and (41). If the former is multiplied by the polynomial a, the latter is multiplied by b, and both equations are subtracted, then the following relation is obtained:

φ'[(λ + r)a + sb] = b'b + ωa'a = φ'λφ

where the second equality follows from (40). This means that for all ζ for which φ(ζ^{-1}) ≠ 0 it holds

(λ + r)a + sb = λφ.   (84)

Since the left-hand side is the characteristic polynomial of the closed control loop, the relation can be understood as the placement of the poles to
the positions prescribed by the polynomial φ which is optimal with respect to the quadratic criterion. A similar relation can be found in Peterka (1972) and might be used to determine the polynomials r and s directly. However, this is not a good way to proceed. As pointed out in Wittenmark and Åström (1980) the relation (84) leads to formulae which are singular whenever the polynomials a and b have a common factor (even when it is stable) and are numerically sensitive in the case of an almost common factor. As discussed in Remark 4 the formulae obtainable via (41) and (43) do not have this defect and can be given a form which is universal for all situations when the stationary optimal control exists. Note also that (41) and (43) determine the calculated polynomials uniquely, including their orders, and do not allow the cancellation of stable common factors in a and b which would lead to the nonoptimality of the controller in some transient situations. In most practical cases it is possible to retain the model order sufficiently low (without violating the assumption A1, equation (5), significantly) by a proper choice of the sampling interval (and, if need be, by a suitable prefiltering of the controlled output). Simple robust self-tuning controllers obtainable in this way will be reported in a separate paper where also experimental results and further important details concerning the identification part will be given.

Acknowledgements--The author is indebted to his colleagues, namely Drs J. Böhm, A. Halousková and M. Kárný, for extensive cooperation both in discussing concepts and collecting practical experience. Also the impulses obtained as a feedback from industrial practice, namely from Drs A. Lizr and A. Lizrová (pulp and paper production), Dr J. Cendelin (rolling mills) and Ing. J. Fessl (power plants), are gratefully acknowledged.

REFERENCES
Åström, K. J. (1970). Introduction to Stochastic Control Theory. Academic Press.
Åström, K. J. (1979). Self-tuning regulators--design principles and applications. Proceedings of Yale Workshop on Applications of Adaptive Control. Academic Press.
Åström, K. J. (1980). Design principles for self-tuning regulators. International Symposium on Methods and Applications in Adaptive Control. Bochum, Springer.
Bierman, G. J. (1977). Factorization Methods for Discrete Sequential Estimation. Academic Press.
Chang, S. S. L. (1961). Synthesis of Optimum Control Systems. McGraw-Hill.
Clarke, D. W. and P. J. Gawthrop (1975). Self-tuning controller. Proc. IEE, 122, 929.
Kárný, M. (1982). Identification with interruptions as an anti-bursting device. Kybernetika, 18, 320.
Newton, G. C., L. A. Gould and J. F. Kaiser (1957). Analytical Design of Linear Feedback Controls. John Wiley.
Peterka, V. (1970). Adaptive digital regulation of noisy systems. 2nd IFAC Symposium on Identification and Process Parameter Estimation. Paper 6.2, Academia, Prague.
Peterka, V. (1972). On steady-state minimum-variance control strategy. Kybernetika, 8, 219.
Peterka, V. (1975). A square-root filter for real-time multivariate regression. Kybernetika, 11, 53.
Peterka, V. and A. Halousková (1976). Effective algorithms for real-time multivariate regression. Proceedings of 4th IFAC Symposium on Identification and System Parameter Estimation, Tbilisi, Vol. 3, pp. 100-110.
Peterka, V. (1981). Bayesian approach to system identification. In P. Eykhoff (Ed.), Trends and Progress in System Identification, Chapter 8, Pergamon Press.
Wellstead, P. E., D. Prager and P. Zanker (1979). Pole assignment self-tuning regulator. Proc. IEE, 126, 781.
Wittenmark, B. and K. J. Åström (1980). Simple self-tuning controller. International Symposium on Methods and Applications in Adaptive Control. Bochum, Springer.