A GLOBALLY CONVERGENT RECURSIVE ADAPTIVE LQG REGULATOR
J. B. Moore
Abstract
Time-varying recursive Riccati equations are exploited for a globally convergent discrete-time adaptive regulator for a fixed but unknown plant. With the addition of sufficiently rich perturbations for consistent parameter estimation, the controller is asymptotically the optimal regulator in a linear quadratic Gaussian (LQG) sense with external perturbation signals. The scheme has advantages over globally convergent minimum variance based schemes in that there is a more direct trade-off between control energy and tracking error. For nonminimum phase plants, the scheme has an advantage of computational simplicity over globally convergent schemes based on solving Bezoutian equations for pole assignment, in that the Riccati equation is easier to update at each iteration than the solution of a Bezout equation.
Keywords
Adaptive control; adaptive systems; optimal control
1. INTRODUCTION

For plants with a known discrete-time state-space representation, the Linear Quadratic Gaussian (LQG) optimal design approach allows trade-offs between performance and control costs, Anderson and Moore (1971). The resultant designs have inherent robustness features and, with the introduction of frequency shaping in the performance index, there can be trades between control system robustness and performance for the nominal plant, as studied in Gupta (1980) and Moore et al (1984) and their references.

For the case when there are fixed but unknown plant parameters, it makes sense to estimate such parameters, or the closed-loop system parameters, on line, and use these in an adaptive LQG scheme. A number of possible schemes can be devised using this approach; see Mosca (1980), Astrom (1982), Bartolini et al (1984) for recent examples using input-output models. It is of interest to develop an LQG based scheme for stochastic adaptive control with a global convergence theory based on adaptive minimum tracking error variance control and least squares parameter estimation as in Goodwin et al (1981), Moore et al (1982-84).

In this paper, we develop an adaptive LQG control scheme, and derive its global convergence theory building on the earlier technical approach. The scheme of this paper is shown to be equivalent to an earlier variation of the adaptive pole assignment scheme save that there is a substitution of the Bezout equation solution by a Riccati equation update. The Riccati equation solution converges to that which can be used in a known plant LQG design, where the plant has a coordinate basis which is nonminimal.

An important aspect of the algorithm proposed in this paper is that there must be some sort of projection into a stability domain of the estimated plant with associated controller at each time instant, rather than of the actual plant and controller, as in the earlier theories based on ordinary differential equations (O.D.E.) of Ljung (1977). To carry out the projections in the former case is relatively straightforward. The theory of the paper guides in the algorithm design, and with persistently exciting input signals leads to an asymptotically optimal scheme in the sense of LQG control in the presence of the persistently exciting perturbations.

ARMAX Signal Model

In the first instance, let us consider the scalar ARMAX model with unit delay

$$A y_k = q^{-1} B u_k + C w_k \tag{2.1}$$

where $q^{-1}$ is a unit delay operator, $A = A(q^{-1})$ is a polynomial operator $1 + a_1 q^{-1} + a_2 q^{-2} + \dots + a_n q^{-n}$, and similarly

$$q^{-1} B = q^{-1} B(q^{-1}) = b_1 q^{-1} + b_2 q^{-2} + \dots + b_m q^{-m},$$

and $C = C(q^{-1}) = 1 + c_1 q^{-1} + \dots + c_\ell q^{-\ell}$. In the model (2.1), $y_k$, $u_k$ denote plant output, input, respectively, and $w_k$ is a zero mean "white" noise term with bounded covariance. More precisely, for the theory of subsequent sections,

$$E[w_k \mid F_{k-1}] = 0, \qquad E[w_k^2 \mid F_{k-1}] \le \sigma_w^2 < \infty, \qquad \limsup_{k \to \infty} \frac{1}{k} \sum_{i=1}^{k} w_i^2 < \infty$$

where $F_k$ denotes the $\sigma$-algebra generated by $w_1, w_2, \dots, w_k$. For the theorems of this paper let us impose the restriction that

$$\left(C^{-1} - \tfrac{1}{2}\right) \text{ is strictly positive real} \tag{2.2}$$

which is satisfied when the noise $C w_k$ is "near" white. This condition can be sidestepped using the ideas of Moore (1983).
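As a concrete reading of (2.1), the operator equation can be unrolled into the difference equation $y_k = -\sum_i a_i y_{k-i} + \sum_i b_i u_{k-i} + w_k + \sum_i c_i w_{k-i}$. The following is a minimal Python sketch of that recursion, assuming zero initial conditions; the function name `simulate_armax` and the list-based argument layout are illustrative choices, not notation from the paper.

```python
def simulate_armax(a, b, c, u, w):
    """Simulate A y_k = q^{-1} B u_k + C w_k with zero initial conditions.

    a = [a_1, ..., a_n], b = [b_1, ..., b_m], c = [c_1, ..., c_l] are the
    polynomial coefficients of (2.1); u and w are equal-length input and
    noise sequences. Names and layout are assumptions for illustration.
    """
    y = []
    for k in range(len(u)):
        acc = w[k]  # leading coefficient of C is 1
        for i, a_i in enumerate(a, start=1):
            if k - i >= 0:
                acc -= a_i * y[k - i]  # autoregressive part
        for i, b_i in enumerate(b, start=1):
            if k - i >= 0:
                acc += b_i * u[k - i]  # input enters with at least unit delay
        for i, c_i in enumerate(c, start=1):
            if k - i >= 0:
                acc += c_i * w[k - i]  # moving-average noise part
        y.append(acc)
    return y
```

For example, with `a = [-0.5]`, `b = [1.0]`, `c = [0.0]` and a unit pulse on `u`, the output is the delayed pulse response of $y_k = 0.5\,y_{k-1} + u_{k-1}$.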
State Space Descriptions

For the first state space description it is assumed that the ARMAX model is normalized so that $\ell = n = m$. Then (2.1) can be rewritten as

(2.3a)

(2.3b)

where

(2.3c)

This model has the form of an innovations representation, so that the associated Kalman filter is given from inspection and the state estimates $\hat{x}_{k/k-1}$, which we write simply as $\hat{x}_k$, are given from

(2.4)

Observe that the Kalman filter has the characteristic polynomial

(2.5)

Since $C(z)$ is minimum phase, the Kalman filter is stable, so that asymptotically $\hat{x}_k \to x_k$ and $\hat{w}_k \to w_k$ as $k \to \infty$. Of course if $\hat{x}_0 = x_0$ then $\hat{x}_k = x_k$.

A Non-Minimal Representation

The second state space description we consider is useful for parameter estimation purposes in the adaptive control algorithms to follow. Let us express (2.1) in terms of a parameter vector $\theta$, collecting the coefficients of $A$, $B$ and $C$, and a non-minimal state vector $x_k$, collecting past outputs, inputs and noise terms. With these definitions, we see that the outputs $y_k$ are linear in both the parameters $\theta$ and the states $x_k$ since $y_k = \theta' x_k + w_k$. The resulting non-minimal state space description is

(2.8a)

(2.8b)

where

(2.8c)

Again the Kalman filter has estimates $\hat{x}_k$ given from

(2.9a)

with $\hat{w}_k = y_k - \theta' \hat{x}_k$, or

(2.9b)

and the filter characteristic polynomial is

(2.10)

LQG Index

The performance index is the standard quadratic index with $Q \ge 0$, $R > 0$.

3. ADAPTIVE LQG CONTROL

Extended Least Squares

For parameter estimation purposes, consider the non-minimal model (2.8) where the measurements $y_k$ are linear in the unknown parameters $\theta$ and the states $x_k$. Extended least squares applied to this model leads to the recursions

(3.1a)

$$P_r = \left[ \sum_{i=0}^{r} \gamma_i \hat{x}_i \hat{x}_i' \right]^{-1} \tag{3.1b}$$

(3.1c)

initialized by $\hat{x}_0 = 0$, and $\hat{\theta}_k = 0$ until $k = r$, where $r$ is the smallest integer such that $P_r$ exists. A selection $\gamma_k = 1$ can be employed in practice, although to guarantee global convergence we assume a $\gamma_k$ selection as in Moore et al (1981). Of course (3.1c) has the alternative formulation

(3.1c)'
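The extended least squares idea described above can be sketched in Python as follows. This is a textbook recursive form under stated assumptions, not a transcription of the recursions (3.1): the weighting is $\gamma_k = 1$, the regressor is assumed to stack past outputs, past inputs and past posterior residuals (so that $\theta$ holds $-a_i$, $b_i$, $c_i$ under this sign convention), and $P$ is started from a large multiple of the identity (`p0`) rather than waiting until $P_r$ exists. The names `els_step` and `run_els` are hypothetical.

```python
def els_step(theta, P, x, y):
    """One recursive least squares update; returns (theta, P, residual).

    Assumes P is symmetric. The returned residual is the posterior
    prediction error, used as the noise estimate in later regressors.
    """
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    err = y - sum(theta[i] * x[i] for i in range(n))  # prior prediction error
    theta = [theta[i] + Px[i] * err / denom for i in range(n)]
    P = [[P[i][j] - Px[i] * Px[j] / denom for j in range(n)] for i in range(n)]
    w_hat = y - sum(theta[i] * x[i] for i in range(n))  # posterior residual
    return theta, P, w_hat


def run_els(y, u, n, m, l, p0=1e3):
    """Run the sketch over recorded data; returns the final estimate.

    Regressor layout (an assumption): [y_{k-1..k-n}, u_{k-1..k-m},
    w_hat_{k-1..k-l}], with zeros before the start of the record.
    """
    dim = n + m + l
    theta = [0.0] * dim
    P = [[p0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    w_hat = [0.0] * len(y)

    def past(seq, k, d):
        return seq[k - d] if k - d >= 0 else 0.0

    for k in range(len(y)):
        x = ([past(y, k, d) for d in range(1, n + 1)]
             + [past(u, k, d) for d in range(1, m + 1)]
             + [past(w_hat, k, d) for d in range(1, l + 1)])
        theta, P, w_hat[k] = els_step(theta, P, x, y[k])
    return theta
```

On noise-free data from $y_k = 0.5\,y_{k-1} + u_{k-1}$ with a sufficiently varied input, the first two entries of the estimate should approach 0.5 and 1.0 under this convention.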
A Perturbation on the Parameter Estimate

In (3.1c) there is introduced a perturbation $\Delta\theta_k$ on the parameter estimate $\hat{\theta}_k$.

(3.2a)

Also, $\Delta\theta_k$ is constrained so that the state estimator (3.1c) is exponentially asymptotically stable. Equivalently, as is readily checked using (3.1c)', $\Delta C_k$ can be chosen so that

$$(C_k + \Delta C_k)^{-1} \text{ as a time-varying system is exponentially asymptotically stable} \tag{3.2b}$$

A method for selecting $\Delta\theta_k$ subject to (3.2) and such as to ensure global convergence is given in the next section.

Recursive Update of Control Gain

Consider the adaptive LQG control law

$$u_k = -L_k \hat{x}_k \tag{3.3a}$$

where

(3.3b)

(3.3c)

(3.3d)

with $S_0 \ge 0$. Observe that now we are dealing with a forward time Riccati equation with time-varying coefficients.

Lemma 3.1 With the pair

$$[\Phi(\hat{\theta}_k + \Delta\theta_k), \Gamma] \text{ uniformly completely stabilizable} \tag{3.4}$$

then the solution $S_k$ of (3.3b) is bounded above in norm for all $k$. If in addition, the pair

$$[\Phi(\hat{\theta}_k + \Delta\theta_k), D] \text{ is uniformly completely detectable} \tag{3.5}$$

where $D$ is any $D$ satisfying $D'D = Q$, then the following closed-loop system is exponentially asymptotically stable

(3.6)

This system is the closed loop system consisting of the plant with estimates $(\hat{\theta}_k + \Delta\theta_k)$ in lieu of $\theta$ and the controller of (3.2a).

Proof Follows from Riccati theory, Anderson et al (1981).

Lemma 3.2 A sufficient condition for (3.4), with $(A_k + \Delta A_k)$ bounded above, to hold is that $Q$ is positive definite, or $Q = \mathrm{diag}(\rho^2\ 0\ 0 \dots 0)$ for some scalar $\rho \ne 0$. Furthermore, a sufficient condition for (3.5) to hold is that $(\hat{\theta}_k + \Delta\theta_k)$ belongs to the neighbourhood of some fixed $\bar{\theta}$ such that there are no common zeros in $A(\bar{\theta})$, $B(\bar{\theta})$.

Proof See Appendix

The following theorem is the main global convergence result of the paper.

Theorem 3.1 Consider the ARMAX signal model (2.1) with (2.2) holding. Consider the recursive parameter estimation scheme (3.1) and either of the control schemes of (3.3). Then with $\Delta\theta_k$ selections satisfying (3.2) yet so as to guarantee the stabilizability condition (3.4) and detectability condition (3.5), see Lemma 3.2, the adaptive scheme is almost surely globally stable. Moreover, the algorithm is globally convergent in the sense that the control converges asymptotically to the optimal LQG control if $\Delta\theta_k = 0$ for all $k \ge \bar{k}$ for some $\bar{k}$, and converges to the neighbourhood of the optimal LQG control if $\|\Delta\theta_k\|$ is small as $k \to \infty$. If in addition there are added to the control signals persistently exciting perturbations in the sense of Moore (1983), then there is consistent parameter estimation. Furthermore, should there be no common zeros in $A$ and $B$ of (2.1), then for sufficiently large $\bar{k}$ and all $k \ge \bar{k}$, a selection $\Delta\theta_k = 0$ can be made to ensure that the control scheme converges to the optimal LQG scheme with the added perturbation signals.

Proof Central ideas for the proof are developed subsequently.

Remarks It would be satisfying to develop a simple formula for achieving the $\Delta\theta_k$ selection required of the theorem. Such does not seem a priori easy to achieve, but in practice ad hoc techniques can be used which seem to work well. In particular, set $\Delta\theta_k = 0$ until $S_k$ exceeds some upper bound, then restart the algorithm. In practice this will probably not occur infinitely often, but the proof of such a result is elusive. Another approach is to freeze the controller until $S_k$ is below some bound guaranteed by persistence of excitation. This is the subject of current research.

A Relationship to Minimum Variance Control

Consider the application of minimum variance control to the plant (2.1), (2.8) augmented with output

(3.7)

and having input $u_k$, and noise input $w_k$. Selecting $u_k$ so that $\bar{y}_{k+1} = 0$, and thus $E[\bar{y}_{k+1} \mid F_k] = 0$, yields the minimum variance control applied to the augmented plant (2.8), (3.7). Likewise for $\hat{y}_{k+1}$. We now claim the following result.

Lemma 3.3 The adaptive control laws (3.3) applied to the original plant (2.1) yield adaptive minimum variance control of the augmented plant (3.7). An alternative expression for (3.7) is given as

(3.8)

$$q^{-n} \det\left(qI - [\Phi(\hat{\theta}_k + \Delta\theta_k) - KH]\right) \left(1 + L_k \left(qI - [\Phi(\hat{\theta}_k + \Delta\theta_k) - KH]\right)^{-1} \Gamma\right) \tag{3.9a}$$
$$q^{-n} \det\left(qI - [\Phi(\hat{\theta}_k + \Delta\theta_k) - KH]\right) \left(q^{-1} L_k \left(qI - [\Phi(\hat{\theta}_k + \Delta\theta_k) - KH]\right)^{-1} K\right) \tag{3.9b}$$

Here $E_k$, $F_k$ are time-varying moving average operators. For the case when $\hat{y}_k$ is derived from the estimated plant with estimates $(\hat{\theta}_k + \Delta\theta_k)$ replacing $\theta$ in the original plant, then the augmented plant operator acting on the input $u_k$ is

(3.10)

Furthermore,

(3.11)

Proof See Appendix

A Stability Result

We are now in a position to give a key result for the convergence theory.

Lemma 3.4 Consider that selections of $\Delta\theta_k$ are made such that the stabilizability condition (3.4) is satisfied, then $E_k$, $F_k$ in (3.8) are bounded in norm. If in addition, $\Delta\theta_k$ is such that the detectability condition (3.5) holds, and the stability condition (3.2b) holds, then $M_u^{-1}(\hat{\theta}_k + \Delta\theta_k)$ is exponentially asymptotically stable.

Proof See Appendix

4. GLOBAL CONVERGENCE THEORY

Adaptive Pole-Assignment Results

The simplest global convergence theory for control of ARMAX models is for minimum variance control. Since such control action forces closed-loop poles to plant zeros, the assumption of a minimum phase plant is crucial for achieving bounded controls and closed-loop stability. The condition (2.2) is also required, although this can be sidestepped. For the remainder of this section (2.2) is assumed to hold.

To cope with non-minimum phase plants and yet exploit the simplicity of the minimum variance theory, one idea is to study minimum variance control applied to ARMAX models augmented with compensators $E_k$, $F_k$, as in (3.7). Such an approach in essence leads to a pole assignment scheme. The first step is to assign poles specified as the zeros of $M$, and the next step is to solve the Bezoutian (3.9) at each $k$ for $E_k$, $F_k$, with $M_u = M$. For global convergence, and to achieve asymptotically the assigned poles, the theory requires that (2.2) holds and

$$M^{-1} \text{ is exponentially asymptotically stable} \tag{4.1a}$$

$$E_k, F_k \text{ are bounded above in norm by a suitable bounded selection of } \Delta A_k, \Delta B_k \text{ satisfying (3.2)} \tag{4.1b}$$

Generalizations of Adaptive Pole-Assignment Theory

The theory of Moore et al (1983) is trivially generalized to cope with time varying specified poles as the zeros of $M_k$. For this situation, the Bezoutian is solved for $E_k$, $F_k$, with $M_u = M_k$. A required condition is that $M_k^{-1}$ be exponentially asymptotically stable.

Convergence of Adaptive LQG Scheme (Proof of first part of Theorem 3.1.)

As spelled out in the previous section, there is a mathematical correspondence between the adaptive LQG equations and adaptive pole assignment via minimum variance control applied to a suitably augmented plant. Thus $M_k$, $E_k$, $F_k$ referred to in (4.1) can be identified with $M_u(\hat{\theta}_k + \Delta\theta_k)$, $E_k$, $F_k$ of (3.8), (3.10). Since $E_k$, $F_k$ are bounded in terms of $S_k(\hat{\theta}_k + \Delta\theta_k)$, we have that (4.1) translates here as

$$M_u^{-1}(\hat{\theta}_k + \Delta\theta_k) \text{ is exponentially asymptotically stable} \tag{4.2a}$$

$$S_k(\hat{\theta}_k + \Delta\theta_k) \text{ is bounded above in norm by a suitable selection of } \Delta\theta_k \text{ satisfying (3.2)} \tag{4.2b}$$

The identity (3.11) gives an interpretation of (4.2a) as being equivalent to the condition that the closed-loop system (3.6) be exponentially asymptotically stable. Applying Lemma 3.4, we see that (4.2) is satisfied under the conditions (3.4), (3.5), (3.2b) of Theorem 3.1. The above translations of generalizations of the results of Moore et al (1983) in fact now yield the desired convergence results, as stated in the first part of Theorem 3.1, for our adaptive LQG scheme. The reader is referred to Moore et al (1983) to extract more specific convergence formulae applicable to the adaptive pole-assignment case and thus to the adaptive LQG case.
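To make the forward-time Riccati update behind the control law (3.3) concrete, here is a minimal Python sketch for a single-input model. It uses the standard discrete-time recursion $L_k = (\Gamma' S_k \Gamma + R)^{-1} \Gamma' S_k \Phi_k$, $S_{k+1} = \Phi_k' S_k \Phi_k - L_k' (\Gamma' S_k \Gamma + R) L_k + Q$, which is assumed here to correspond to (3.3b)-(3.3d); the function name `riccati_gain_step` and the nested-list matrix layout are illustrative. In the adaptive setting, `Phi` would be rebuilt from the current perturbed estimate before each call.

```python
def riccati_gain_step(S, Phi, Gam, Q, R):
    """One forward-time Riccati step for a single-input model.

    S, Phi, Q are n-by-n nested lists (S and Q symmetric), Gam is a
    length-n list, R is a positive scalar. Returns (L, S_next).
    Standard textbook recursion, assumed to match the paper's update.
    """
    n = len(Gam)
    # S * Gam (column vector)
    SG = [sum(S[i][j] * Gam[j] for j in range(n)) for i in range(n)]
    # Scalar Gam' S Gam + R; no matrix inverse is needed for one input
    d = sum(Gam[i] * SG[i] for i in range(n)) + R
    # Row vector Gam' S Phi, using the symmetry of S
    GSP = [sum(SG[i] * Phi[i][j] for i in range(n)) for j in range(n)]
    L = [g / d for g in GSP]
    # Phi' S Phi
    SP = [[sum(S[i][k] * Phi[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    PSP = [[sum(Phi[k][i] * SP[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    S_next = [[PSP[i][j] - GSP[i] * GSP[j] / d + Q[i][j] for j in range(n)]
              for i in range(n)]
    return L, S_next
```

With a fixed `Phi`, repeated calls starting from `S = 0` converge to the steady-state LQ gain whenever the usual stabilizability and detectability conditions hold. A simple monitor on the size of `S_next` is one way to implement the restart heuristic mentioned in the Remarks following Theorem 3.1.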
5. CONCLUSION

Building on global convergence theories for stochastic adaptive minimum variance control of ARMAX input-output signal models and related adaptive pole assignment schemes, a global convergence theory for a class of recursive indirect adaptive LQG control schemes with persistently exciting perturbation signals has been developed. The theory relies on the algorithms being applied to non-minimal state space plant representations. Steps must be taken in the algorithms to avoid possible ill-conditioning due to unstable pole/zero cancellations in the estimated plant, or more precisely, lack of uniform controllability of the (time-varying) estimated plant. The paper does not claim to give the "best" steps to take, but rather gives a theoretical context for taking practical steps.
REFERENCES

Anderson, B.D.O. and R.M. Johnstone. Global adaptive pole positioning. Submitted for publication.
Anderson, B.D.O. and J.B. Moore (1971). Linear Optimal Control. Prentice-Hall, NY; see also Optimal Filtering. Prentice-Hall, 1979.
Anderson, B.D.O. and J.B. Moore (1981). Detectability and stabilizability of time varying discrete-time linear systems. SIAM J. of Control and Optimization, 19, 20-32.
Astrom, K.J. (1982). A linear quadratic gaussian self-tuner. Workshop on Adaptive Control, Florence, Italy, 1-20.
Bartolini, G., G. Casalino, F. Davoli, and R. Minciardi (1984). The ICOF approach to infinite horizon LQG adaptive control. Ricerche di Automatica.
Goodwin, G.C., P.J. Ramadge and P.E. Caines (1981). Discrete-time stochastic adaptive control. SIAM J. Contr. Optimiz., 19, 829-853.
Gupta, N.K. (1980). Frequency-shaped cost functionals: extensions of linear-quadratic-gaussian design methods. J. Guidance and Control, 3, 529-535.
Kumar, R. and J.B. Moore (1982). Convergence of adaptive minimum variance algorithms via weighting coefficient selection. IEEE Trans. Auto. Contr., AC-27, 146-153.
Ljung, L. (1977). Analysis of recursive stochastic algorithms. IEEE Trans. Auto. Contr., AC-22, 551-575.
Moore, J.B. (1983). Persistency of excitation in extended least squares. IEEE Trans. on Auto. Contr., AC-28, 60-68.
Moore, J.B. (1982). Sidestepping the positive real restriction for stochastic adaptive schemes. Workshop on Adaptive Control, Florence, Italy, 501-525; also Ricerche di Automatica, to appear.
Moore, J.B. and R. Kumar. Convergence of an adaptive control scheme applied to non-minimum phase plants. Submitted for publication. See also R. Kumar and J.B. Moore (1983). An adaptive minimum variance regulation for nonminimum phase plants. Automatica, 103.
Moore, J.B., D.L. Mingori and B.D.O. Anderson (1984). Frequency shaped linear optimal control with transfer function Riccati equations. Proc. of IFAC World Congress, Budapest.
Mosca, E. and G. Zappa (1980). MUSMAR: basic convergence and consistency properties. Lecture Notes in Control. Springer Verlag, 28, 189-199.
APPENDIX

Proof of Lemma 3.2
The first part of the lemma follows because with $D' = [\rho\ 0\ 0 \dots 0]$ and $\rho \ne 0$ then the pair $[\Phi, D]$ is in observable canonical form. The observability grammian $M$ is then bounded above and below as $\alpha_1 I < M < \alpha_2 I$ for some $\alpha_i > 0$. The second part follows since the controllability condition can be re-expressed as requiring that the system $(A_k + \Delta A_k) y_k = (B_k + \Delta B_k) u_k$ be uniformly completely controllable from its input $u_k$. For the constant $A$, $B$, $\Delta A$, $\Delta B$ case, it is known that, as long as there are no unstable pole/zero cancellations in $(A + \Delta A)$, $(B + \Delta B)$, then $y_k$ is controllable from $u_k$; equivalently, there exists a gain $L$ such that $[\Phi(\hat{\theta} + \Delta\theta) + \Gamma L]$ has eigenvalues inside the unit circle. It is not difficult to extend the above result for constant $(A + \Delta A)$, $(B + \Delta B)$ to the time-varying case where $(A_k + \Delta A_k)$, $(B_k + \Delta B_k)$ are in the neighbourhood of $(A + \Delta A)$, $(B + \Delta B)$. Details are omitted.

Proof of Lemma 3.3
Adaptive minimum variance controls are such as to set the one step-ahead plant output predictions to zero. Here we see from (3.7) that the adaptive control (3.3a) sets $\bar{y}_{k+1} = 0$ and thus $E[\bar{y}_{k+1} \mid F_k] = 0$, and is consequently an adaptive minimum variance control as claimed.

Proof of Lemma 3.4
The first result follows from Lemma 3.1 bounding $S_k$ under the stabilizability condition and consequently bounding $E_k$, $F_k$. The stabilizability and detectability conditions give exponential asymptotic stability for (3.6) as in Lemma 3.1. This together with the stability assumption (3.2b) in (3.11) yields the required exponential asymptotic stability of $M_u^{-1}(\hat{\theta}_k + \Delta\theta_k)$.