
Copyright © 1996 IFAC. 13th Triennial World Congress, San Francisco, USA.

ON THE USE OF MINIMAL PARAMETRIZATIONS IN MULTIVARIABLE ARMAX IDENTIFICATION

Roberto P. Guidorzi

Università di Bologna, Viale del Risorgimento 2, 40136 Bologna, Italy

e-mail: [email protected]

Abstract: This paper derives minimally parametrized ARMAX models for linear multivariable systems and describes the implementation of a prediction error method (PEM) for their estimation. It is shown that, contrary to the commonly accepted opinion, the determination of a suitable order and structure for the ARMAX model of a real process does not involve critical choices.

Keywords: identification, multivariable systems, ARMAX models, minimal parametrizations, canonical forms.

1. INTRODUCTION

The identification of multivariable models poses some specific problems that have no counterpart in the scalar case. Among them are the choice of minimal or reduced identifiable parametrizations and the selection of suitable model structures. The estimation of this information (a set of integers) from the data is considered a difficult and critical problem because of the negative effects of model structure ...

... physical process where the state variables are assumed constant. This qualitative analysis has been verified, during over 20 years of applications of identification techniques to real processes, in almost all cases ...

2. MINIMALLY PARAMETRIZED ARMAX MODELS

The ARMAX models considered in this paper rely on the structure of the deterministic models described by Guidorzi (1981), starting from the state-space innovations representation

x(t+1) = A x(t) + B u(t) + K w(t)   (1a)
y(t) = C x(t) + w(t)   (1b)

where x(t) ∈ R^n, y(t) ∈ R^m, u(t) ∈ R^p, w(t) ∈ R^m; alternatively, it would be possible to start from the transfer function, as is done by Hannan and Deistler (1988). The triple (A, B, C) is assumed completely reachable and observable, the eigenvalues of A and A − KC are assumed strictly inside the unit circle, and w(t) is a remote (unmeasurable) white process. This set of models will be denoted with E. Define now the auxiliary output

y*(t) = C x(t);   (2)

using the links between canonical state-space and input-output models described by Guidorzi (1975, 1981), the link between u(t), w(t) and y*(t) can be described by the polynomial input-output (MFD) model

P(z) y*(t) = Q(z) u(t) + R(z) w(t)   (3)

where P(z), Q(z) and R(z) are polynomial matrices of suitable dimensions in the unitary advance operator z:

P(z) = [p_ij(z)]   (i, j = 1, ..., m)   (4a)
Q(z) = [q_ij(z)]   (i = 1, ..., m; j = 1, ..., p)   (4b)
R(z) = [r_ij(z)]   (i, j = 1, ..., m)   (4c)

with entries

p_ii(z) = z^{ν_i} − α_{ii ν_i} z^{ν_i − 1} − ... − α_{ii1}   (5a)
p_ij(z) = −α_{ij ν_ij} z^{ν_ij − 1} − ... − α_{ij1}   (i ≠ j)   (5b)
q_ij(z) = β_{ij ν_i} z^{ν_i − 1} + ... + β_{ij1}   (6a)
r_ij(z) = γ_{ij ν_i} z^{ν_i − 1} + ... + γ_{ij1}.   (6b)

Model (3) has the following properties (Guidorzi 1975, 1981, 1989, Gevers and Wertz, 1984):

1) deg det P(z) = Σ_{i=1}^{m} ν_i = n   (7)

2) ν_ij = min(ν_i + 1, ν_j) for i > j
   ν_ij = min(ν_i, ν_j) for i < j   (8)

3) C(zI − A)^{−1} B = P^{−1}(z) Q(z)   (9)

4) C(zI − A)^{−1} K = P^{−1}(z) R(z)   (10)

5) P(z) and Q(z) are left coprime.

6) The scalars (ν_ij, α_ijk, β_ijk, γ_ijk) are the image of (A, B, K, C) in a complete set of independent invariants for the equivalence relation described, in E, by a change of basis in the state space.

7) Denote with S the set of all triples P(z), Q(z), R(z) satisfying properties 1), 3), 4) and 5) and with E the equivalence relation described, in S, by the premultiplication of these triples by a nonsingular unimodular matrix. The scalars (ν_ij, α_ijk, β_ijk, γ_ijk) are the image of P(z), Q(z), R(z) in a complete set of independent invariants for E.

8) The set of scalars (α_ijk, β_ijk, γ_ijk) is a minimal parametrization for P(z), Q(z), R(z).

9) Denote with S_c ⊂ S the subset of triples P(z), Q(z), R(z) satisfying properties 1), 2), 3), 4) and 5). S_c is a set of canonical forms for E on S.

Considering (1b), the ARMAX model

P(z) y(t) = Q(z) u(t) + (P(z) + R(z)) w(t) = Q(z) u(t) + S(z) w(t)   (11)

follows immediately. Denote now with ν the multi-index (ν_1, ..., ν_m); the number of parameters, ℓ, in P(z), Q(z) and R(z) is

ℓ = Σ_{i=1}^{m} Σ_{j=1}^{m} ν_ij + n (m + p),   where ν_ii = ν_i.   (12)

Define also the vector of parameters, θ ∈ R^ℓ,

θ = [ α_{111} ... α_{11ν_11} | ... | α_{1m1} ... α_{1mν_1m} | ... | α_{mm1} ... α_{mmν_m} |
      β_{111} ... β_{11ν_1} | ... | β_{1p1} ... β_{1pν_1} | ... | β_{mp1} ... β_{mpν_m} |
      γ_{111} ... γ_{11ν_1} | ... | γ_{1m1} ... γ_{1mν_1} | ... | γ_{mm1} ... γ_{mmν_m} ]^T.   (13)

Remark 1 - From property 7) it follows that the parametrization of the elements of S_c is, for every ν, minimal and that there exists a surjective map between the points of R^ℓ and the elements of S_c. S_c is thus an identifiable set of models (Picci 1982, Gevers and Wertz 1984).

Remark 2 - Denote with S_e the set of triples P(z), Q(z), S(z) and with θ_e ∈ R^{ℓ_e} the vector of their parameters. It can be verified that, in general, θ_e is not a minimal parametrization for model (11) and that no surjection exists between the points of R^{ℓ_e} and S_e. The proof can be easily obtained by observing that, because of the relations between the degrees of the entries of P(z) described by (8), some coefficients of the polynomials of S(z) can be equal to coefficients of polynomials in P(z); this happens when max(ν_i) − min(ν_i) ≥ 2. These considerations show that the triple P(z), Q(z), S(z), differently from P(z), Q(z), R(z), is not suitable for identification purposes because of the lack of independence among its coefficients, which would lead to estimating anticipative ARMAX models.
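As a concrete illustration of property 2) and of the parameter count (12), the following short sketch (Python, not part of the original paper) computes the indices ν_ij and ℓ for a given multi-index ν; it assumes, as in the reconstruction above, that the bounds of (8) are attained, and the function and variable names are only illustrative.

```python
import numpy as np

def structural_indices(nu, p):
    """Indices nu_ij of property 2)/(8) and parameter count ell of (12)
    for a multi-index nu = (nu_1, ..., nu_m) and p inputs (sketch)."""
    nu = list(nu)
    m, n = len(nu), sum(nu)
    nu_ij = np.zeros((m, m), dtype=int)
    for i in range(m):
        for j in range(m):
            if i == j:
                nu_ij[i, j] = nu[i]                    # nu_ii = nu_i
            elif i > j:
                nu_ij[i, j] = min(nu[i] + 1, nu[j])    # case i > j in (8)
            else:
                nu_ij[i, j] = min(nu[i], nu[j])        # case i < j in (8)
    ell = int(nu_ij.sum()) + n * (m + p)               # eq. (12)
    return nu_ij, ell

# Example: nu = (3, 3, 3) with p = 5 inputs gives ell = 27 + 9*8 = 99.
print(structural_indices((3, 3, 3), p=5))
```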

2.1 Minimally parametrized ARMAX predictors

The purpose of identification is often prediction, so that prediction error methods (PEM) constitute a standard approach requiring a predictive form (usually one-step-ahead) for the models. Generic (non-structured and non-canonical) ARMAX models with order (memory) μ are usually written in the form

P(z) y(t) = Q(z) u(t) + S(z) w(t)   (14)

where

P(z) = I − P_1 z^{−1} − ... − P_μ z^{−μ}   (15a)
Q(z) = Q_1 z^{−1} + ... + Q_μ z^{−μ}   (15b)
S(z) = I + S_1 z^{−1} + ... + S_μ z^{−μ}.   (15c)

Denoting with ŷ(t) the prediction of y(t) performed at time t − 1 (often indicated as ŷ(t|t−1)), the optimal one-step-ahead predictor, with innovation y(t) − ŷ(t) = w(t), is

ŷ(t) = Σ_{i=1}^{μ} (P_i + S_i) y(t − i) + Σ_{i=1}^{μ} Q_i u(t − i) − Σ_{i=1}^{μ} S_i ŷ(t − i).   (16)

Models (11) can be directly put in a predictive form only when all the structural indices ν_i are equal, since in all other cases the coefficients of z^{ν_M} (ν_M = max_i ν_i) in P(z) form a singular matrix. It is however easy to obtain a predictor for model (11) for every ν. Denote, to this purpose, with Δν_i the difference ν_M − ν_i and construct the nonsingular and non-unimodular (unless all indices are equal) matrix

M(z) = diag [ z^{Δν_1}, ..., z^{Δν_m} ]   (17)

and the model

P*(z) y(t) = Q*(z) u(t) + (P*(z) + R*(z)) w(t) = Q*(z) u(t) + S*(z) w(t)   (18)

where

P*(z) = M(z) P(z),   Q*(z) = M(z) Q(z),   R*(z) = M(z) R(z).   (19)

Because of relations (8), the coefficient matrix of z^{ν_M} in P*(z), denoted here with P*_0, is a lower triangular matrix with unitary elements on the main diagonal; the corresponding optimal predictor is

[P*(z) + R*(z)] ŷ(t) = R*(z) y(t) + Q*(z) u(t)   (20)

or, equivalently, denoting with P*_i, Q*_i, R*_i the coefficient matrices of z^{ν_M − i} in P*(z), Q*(z), R*(z),

ŷ(t) = (P*_0)^{−1} Σ_{i=1}^{ν_M} [ R*_i y(t − i) + Q*_i u(t − i) − (P*_i + R*_i) ŷ(t − i) ].   (21)
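A minimal sketch of predictor (20)-(21) follows (Python, added here for illustration and not taken from the paper); it assumes the coefficient matrices are supplied exactly in the ordering used in (21), with Q*_0 = R*_0 = 0.

```python
import numpy as np

def one_step_predictor(P, Q, R, y, u):
    """Recursion (21): P, Q, R are lists [X_0, ..., X_nuM] of the
    coefficient matrices of z^(nuM - i) in P*(z), Q*(z), R*(z)
    (Q[0] = R[0] = 0, P[0] lower triangular with unit diagonal);
    y is (L x m), u is (L x p).  Sketch only."""
    nu_M = len(P) - 1
    L, m = y.shape
    y_hat = np.zeros((L, m))
    P0_inv = np.linalg.inv(P[0])
    for t in range(nu_M, L):
        rhs = np.zeros(m)
        for i in range(1, nu_M + 1):
            rhs += R[i] @ y[t - i] + Q[i] @ u[t - i] \
                   - (P[i] + R[i]) @ y_hat[t - i]
        y_hat[t] = P0_inv @ rhs
    return y_hat
```

The first ν_M predictions are left at zero, which matches the use of the predictor for t > ν_M in Section 3.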


Remark 3 - Model (18) is not minimal because M(z) is not unimodular (deg det P*(z) = n + Σ_{i=1}^{m} Δν_i); its parametrization is however minimal (the same as that of model (11)) and also its memory, ν_M, coincides with that of model (11). Model (18) and predictor (20)-(21) can thus be used to implement prediction error methods without any drawback.

Remark 4 - Models (18) are formally similar to those proposed by Gevers and Wertz (1984), characterized, like models (18), by polynomials of equal degree on the main diagonal of P(z). Models (18) are however based on the choice of a basis preserving the same (minimal) parametrization of the canonical models (3) and all related advantages.

3. PEM ESTIMATION OF MINIMAL ARMAX PARAMETRIZATIONS

Consider a parametrization, θ, associated with the multi-index ν. Predictor (20)-(21) allows computing, for t > ν_M, the sequence of predicted outputs ŷ(t, θ) and the associated prediction errors

ε(t, θ) = y(t) − ŷ(t, θ).   (22)

Denoting with L the length of the available sequences, the sample covariance matrix of the errors (22) is

R_ε(θ) = (1 / (L − ν_M)) Σ_{t=ν_M+1}^{L} ε(t, θ) ε^T(t, θ).   (23)

Prediction error methods compute the parameters that minimize a suitable scalar function, V(R_ε(θ)), of the positive definite matrix (23). The choice V(R_ε(θ)) = det R_ε(θ) corresponds to the Maximum Likelihood method if the process belongs to the model set and w(t) is Gaussian (Söderström and Stoica, 1989). A more appealing choice from the computational standpoint is

V(R_ε(θ)) = tr [ S R_ε(θ) ]   (24)

where S is a positive definite weighting matrix.
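The quantities in (23)-(24) are immediate to compute once the prediction errors are available; a small sketch (Python, illustrative only) is reported below.

```python
import numpy as np

def pem_criterion(eps, S=None):
    """Sample covariance (23) of the prediction errors and the weighted
    trace criterion (24); eps is an (N x m) array whose rows are
    eps(t, theta), and S = None gives the plain, unweighted trace."""
    N, m = eps.shape
    R_eps = eps.T @ eps / N        # eq. (23)
    if S is None:
        S = np.eye(m)              # unweighted case, cf. (25) below
    return np.trace(S @ R_eps)     # eq. (24)
```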

A nice choice for S is S = [cov w(t)]^{−1}, but the covariance matrix of the noise is, in general, not known. It is however possible, as suggested by Falkus (1994), to approximate the process with a high-order ARX model (Ljung 1987) and to estimate cov w(t) with the sample covariance matrix of its residuals. Assuming the elements of w(t) as mutually independent, the loss function (24) corresponds to

V(R_ε(θ)) = tr R_ε(θ)   (25)

after scaling the output sequences with the inverses of the standard deviations of the residuals of the high-order ARX model. Assuming this operation as performed, the cost function to be minimized is, modulo the multiplication by a constant,

V(R_ε(θ)) = Σ_{t=ν_M+1}^{L} ε^T(t, θ) ε(t, θ).   (26)
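A sketch of the preliminary scaling step suggested by Falkus (1994) is given below (Python, illustrative only): a high-order ARX model is fitted by least squares and the standard deviations of its residuals are returned; the regressor layout is an assumption of this sketch, not a prescription of the paper.

```python
import numpy as np

def arx_residual_std(y, u, order):
    """Fit a high-order ARX model by least squares and return the
    standard deviations of its residuals (used to scale the outputs
    before minimizing (26)).  y is (L x m), u is (L x p)."""
    L = y.shape[0]
    rows, targets = [], []
    for t in range(order, L):
        past = [y[t - i] for i in range(1, order + 1)] + \
               [u[t - i] for i in range(1, order + 1)]
        rows.append(np.concatenate(past))
        targets.append(y[t])
    Phi, Y = np.asarray(rows), np.asarray(targets)
    Theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    residuals = Y - Phi @ Theta
    return residuals.std(axis=0)

# y_scaled = y / arx_residual_std(y, u, order=12)   # cf. Section 4, nu = (12, 12, 12)
```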

Performing a shift to rewrite (26) in the simpler form

V(θ) = Σ_{t=1}^{N} ε^T(t, θ) ε(t, θ),   (27)

where N = L − ν_M, and denoting with ψ(t, θ) the (ℓ × m) matrix

ψ(t, θ) = ∂ŷ^T(t, θ) / ∂θ = −∂ε^T(t, θ) / ∂θ,   (28)

the Gauss-Newton algorithm consists in the relation

θ^{k+1} = θ^k + [ Σ_{t=1}^{N} ψ(t, θ^k) ψ^T(t, θ^k) ]^{−1} Σ_{t=1}^{N} ψ(t, θ^k) ε(t, θ^k)   (29)

where θ^k denotes the estimate of θ after the k-th step. Introducing the (Nm × ℓ) matrix obtained by stacking the matrices ψ^T(t, θ^k),

H_ψ(θ^k) = [ ψ^T(1, θ^k); ψ^T(2, θ^k); ...; ψ^T(N, θ^k) ],   (30)

and the vector

ε(θ^k) = [ ε(1, θ^k); ε(2, θ^k); ...; ε(N, θ^k) ],   (31)

it is possible to rewrite (29) in the form

θ^{k+1} = θ^k + [ H_ψ^T(θ^k) H_ψ(θ^k) ]^{−1} H_ψ^T(θ^k) ε(θ^k)   (32)

or in the usual variation

θ^{k+1} = θ^k + a_k [ H_ψ^T(θ^k) H_ψ(θ^k) ]^{−1} H_ψ^T(θ^k) ε(θ^k)   (33)

where the scalar a_k ≤ 1 allows a better convergence control.
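A compact sketch of the update (32)-(33) is reported below (Python, illustrative only); it assumes that ψ^T(t, θ^k) has already been stacked into the (Nm × ℓ) matrix of (30) and the errors into the vector of (31).

```python
import numpy as np

def gauss_newton_step(theta, H_psi, eps_vec, a_k=1.0):
    """One iteration of (32)-(33).  H_psi is the (N*m x ell) matrix of
    (30), eps_vec the stacked error vector of (31), a_k <= 1 the step
    length used for convergence control.  Sketch only."""
    # Solve the normal equations (H^T H) d = H^T eps instead of inverting
    d = np.linalg.solve(H_psi.T @ H_psi, H_psi.T @ eps_vec)
    return theta + a_k * d
```

In practice the step length a_k can be reduced (e.g. halved) until V(θ^{k+1}) < V(θ^k), which is one common way of exploiting the variation (33).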

The implementation of the algorithm requires thus the computation of ψ(t, θ); making reference to model (18), straightforward differentiation gives, assuming ε(t) = w(t), i.e. a parametrization near to the true one,

S*(z) [ ∂ε(t) / ∂α_ijk ] = [ ∂P*(z) / ∂α_ijk ] ŷ(t)   (34)

with

[ ∂P*(z) / ∂α_ijk ] ŷ(t) = [ 0, ..., 0, −ŷ_j(t + k + Δν_i − 1), 0, ..., 0 ]^T   (i-th entry),   (35)

so that

∂ε(t) / ∂α_ijk = −ŷ^F_ij(t + k + Δν_i − 1)   (36)

where ŷ^F_ij(t) denotes the output of the filter

S*(z) ŷ^F_ij(t) = v(t)   (37)

driven by the input

v(t) = [ 0, ..., 0, ŷ_j(t), 0, ..., 0 ]^T   (i-th entry).   (38)

Similar considerations show that

∂ε(t) / ∂β_ijk = −u^F_ij(t + k + Δν_i − 1)   (39)

where u^F_ij(t) denotes the output of the filter

S*(z) u^F_ij(t) = v(t)   (40)

driven by the input

v(t) = [ 0, ..., 0, u_j(t), 0, ..., 0 ]^T   (i-th entry).   (41)

Finally,

∂ε(t) / ∂γ_ijk = −ε^F_ij(t + k + Δν_i − 1)   (42)

where ε^F_ij(t) denotes the output of the filter

S*(z) ε^F_ij(t) = v(t)   (43)

driven by the input

v(t) = [ 0, ..., 0, ε_j(t), 0, ..., 0 ]^T   (i-th entry).   (44)

The requested result is thus

ψ(t, θ) = [ ŷ^F_11(t + Δν_1) ... ŷ^F_11(t + ν_11 + Δν_1 − 1) | ... | ŷ^F_1m(t + Δν_1) ... ŷ^F_1m(t + ν_1m + Δν_1 − 1) | ...
          | ŷ^F_m1(t + Δν_m) ... ŷ^F_m1(t + ν_m1 + Δν_m − 1) | ... | ŷ^F_mm(t + Δν_m) ... ŷ^F_mm(t + ν_m + Δν_m − 1)
          | u^F_11(t + Δν_1) ... u^F_11(t + ν_1 + Δν_1 − 1) | ... | u^F_1p(t + Δν_1) ... u^F_1p(t + ν_1 + Δν_1 − 1) | ...
          | u^F_m1(t + Δν_m) ... u^F_m1(t + ν_m + Δν_m − 1) | ... | u^F_mp(t + Δν_m) ... u^F_mp(t + ν_m + Δν_m − 1)
          | ε^F_11(t + Δν_1) ... ε^F_11(t + ν_1 + Δν_1 − 1) | ... | ε^F_1m(t + Δν_1) ... ε^F_1m(t + ν_1 + Δν_1 − 1) | ...
          | ε^F_m1(t + Δν_m) ... ε^F_m1(t + ν_m + Δν_m − 1) | ... | ε^F_mm(t + Δν_m) ... ε^F_mm(t + ν_m + Δν_m − 1) ]^T.   (45)

Remark 5 - The computation of (45) requires 2m² + mp filterings with filters (37), (40) and (43), using the inputs (38), (41) and (44) obtained from the entries of ŷ(t), u(t) and ε(t).
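The filters (37), (40) and (43) all share the same left-hand polynomial matrix S*(z); a minimal sketch of this filtering step is given below (Python, illustrative only). It assumes that S*(z) has been rewritten in the delay operator, i.e. as the list of coefficient matrices [S_0, S_1, ..., S_d] with S_0 = P*_0 (nonsingular), and that zero initial conditions are used.

```python
import numpy as np

def s_inverse_filter(S, v):
    """Solve S*(z) x(t) = v(t) recursively, as in (37), (40), (43).
    S = [S_0, ..., S_d] are the delay-operator coefficient matrices
    (S_0 nonsingular); v is the (L x m) input built as in (38), (41)
    or (44), i.e. zero except for one entry.  Sketch only."""
    d = len(S) - 1
    L, m = v.shape
    S0_inv = np.linalg.inv(S[0])
    x = np.zeros((L, m))
    for t in range(L):
        acc = v[t].copy()
        for i in range(1, min(d, t) + 1):
            acc -= S[i] @ x[t - i]
        x[t] = S0_inv @ acc
    return x

# One block of columns of psi in (45) is obtained by driving the filter
# with the vector that carries y_hat[:, j] (or u[:, j], eps[:, j]) in
# position i and zeros elsewhere, and then shifting the result in time.
```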

Remark 6 - The PEM procedure outlined can be extended, with very minor variations, to overlapping models properly selected to avoid the presence of (apparently) anticipative links (Correa and Glover, 1982; Guidorzi and Beghelli, 1982; Gevers and Wertz, 1984; Hannan and Deistler, 1988). A very complete analysis of multivariable ARMAX identification can be found in Hannan et al. (1980), Hannan and Kavalieris (1984) and in Hannan and Deistler (1988).

Remark 7 - The Gauss-Newton algorithm allows a simpler implementation than the Newton-Raphson one but requires an initial estimate of θ not far from the optimal value. In the case of ARMAX models a possible strategy could consist in computing an initial IV estimate of the parameters α_ijk and β_ijk (Söderström and Stoica, 1989). It is then possible to compute the sequence of equation errors, e(t), of the ARMAX model and to estimate the parameters γ_ijk of the MA part of the model using an auxiliary high-order AR model (Ljung, 1987).

4. IDENTIFYING A REAL PROCESS: THE POWER PLANT OF PONT-SUR-SAMBRE

The data have been contributed by J.E.N. Richalet as a test case on the identification of multivariable processes. The available measures, performed on the 120 MW gas power plant of Pont-sur-Sambre (France), refer to the following inputs: 1) gas flow, 2) turbine gas opening, 3) super heater spray flow, 4) gas dampers, 5) air flow, while the outputs are: 1) steam pressure, 2) main steam temperature, 3) reheat steam temperature.

The original data (available from the author) consist of 2202 samples; the sampling time is Δt = 10.24 s. Because of the small sampling time with respect to the process time constants, every set of 6 consecutive samples has been substituted with its mean value, obtaining thus an equivalent sampling time of approximately 1 minute (61.44 s).
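This preprocessing step amounts to a simple block averaging; a sketch follows (Python, illustrative only, with the handling of a possible incomplete final block an assumption of the sketch).

```python
import numpy as np

def block_average(x, block=6):
    """Replace every group of `block` consecutive samples of the
    (L x q) array x with its mean (here 6 x 10.24 s ~ 61.44 s);
    trailing samples that do not fill a block are dropped."""
    n = (x.shape[0] // block) * block
    return x[:n].reshape(-1, block, x.shape[1]).mean(axis=1)

# 2202 samples at 10.24 s reduce to 367 samples at about one minute.
```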

As a first step, a high-order ARX model with order n = 36 and structure ν = (12, 12, 12) has been identified and the standard deviations of its residuals used to scale the data; the first output has been divided by 16.3771, the second by 4.3766 and the third by 3.8~86. ARMAX models with structures increasing from ν = (2,2,2) to ν = (7,7,7) have then been identified by means of the procedure described in Section 3. Table 1 reports the corresponding standard deviations of the innovations of the associated predictors and the values assumed by the loss function (27).

Table 1 - Performance of 16 different ARMAX models

  ν          σ_ε1    σ_ε2    σ_ε3    V(θ)
  (2,2,2)    1.36    1.31    1.29    5.23
  (3,2,2)    1.21    1.20    1.27    4.52
  (3,3,2)    1.19    1.17    1.29    4.45
  (3,3,3)    1.18    1.16    1.24    4.28
  (4,3,3)    1.19    1.16    1.22    4.25
  (4,4,3)    1.17    1.15    1.22    4.18
  (4,4,4)    1.17    1.14    1.22    4.16
  (5,4,4)    1.12    1.15    1.21    4.04
  (5,5,4)    1.08    1.12    1.22    3.91
  (5,5,5)    1.09    1.13    1.20    3.91
  (6,5,5)    1.09    1.10    1.16    3.74
  (6,6,5)    1.06    1.09    1.19    3.73
  (6,6,6)    1.06    1.08    1.14    3.59
  (7,6,6)    1.07    1.06    1.14    3.57
  (7,7,6)    1.07    1.07    1.16    3.64
  (7,7,7)    1.05    1.06    1.07    3.37

These results show the modest effect of variations in the order of the model from n = 6 to n = 21. Considering then models with order 9 as a reasonable compromise between accuracy and complexity, the structures (3,3,3), (4,2,3), (4,3,2), (2,4,3), (3,4,2), (2,3,4) and (3,2,4) have been compared; the results are reported in Table 2. Also in this case the variations in the model structures have only a modest effect on their performance, which can be evaluated by comparing the standard deviations of the innovations with those of the outputs, given, after the scaling, by σ_y1 = 29.18, σ_y2 = 39.57 and σ_y3 = 55.23.

Table 2 - Performance of order 9 ARMAX models with different structures

  ν          σ_ε1    σ_ε2    σ_ε3
  (3,3,3)    1.18    1.16    1.24
  (4,2,3)    1.18    1.19    1.24
  (4,3,2)    1.20    1.18    1.27
  (2,3,4)    1.38    1.23    1.26
  (3,2,4)    1.18    1.20    1.23
  (2,4,3)    1.45    1.36    1.27
  (3,4,2)    1.20    1.19    1.24

An even more effective picture of the performance of the whole family of models described in Tables 1 and 2 (covering 16 different orders and 22 structures) can be observed in Figures 1, 2 and 3, where the actual outputs (continuous lines) are compared with the worst predictions (dashed lines) over the whole family of models. The curves are almost indistinguishable.

Figure 1 - Worst prediction of output 1.

Figure 2 - Worst prediction of output 2.

Figure 3 - Worst prediction of output 3.

5. CONCLUDING REMARKS

This paper has recalled some attention on the erroneous conclusions that can derive when the fiction of a process belonging to the considered class of models is confused with reality in the identification of multivariable systems. The example that has been proposed is not based on an ad hoc simulated system but on a real process; most real processes exhibit the same behavior. While this does not justify, in any case, the use of sloppy mathematics or of improper modeling of data uncertainties, it can stimulate wider applications of procedures relying on really multivariable and minimally parametrized models.

REFERENCES

Correa, G.O. and K. Glover (1982). Multivariable identification using pseudo-canonical forms. Proceedings of the 6th IFAC Symposium on Identification and System Parameter Estimation, Washington, D.C., pp. 1110-1115.

Falkus, H. (1994). Parametric uncertainty in system identification. Ph.D. Thesis, Department of Electrical Engineering, Eindhoven University of Technology, The Netherlands.

Gevers, M. and V. Wertz (1984). Uniquely identifiable state-space and ARMA parametrizations for multivariable linear systems. Automatica, 20, pp. 333-347.

Guidorzi, R. (1975). Canonical structures in the identification of multivariable systems. Automatica, 11, pp. 361-374.

Guidorzi, R. (1981). Invariants and canonical forms for systems structural and parametric identification. Automatica, 17, pp. 117-133.

Guidorzi, R. (1989). Equivalence, invariance and dynamical system canonical modelling. Parts I and II. Kybernetika, 25, pp. 233-257 and 386-407.

Guidorzi, R. and S. Beghelli (1982). Input-output multistructural models in multivariable systems identification. Proceedings of the 6th IFAC Symposium on Identification and System Parameter Estimation, Washington, D.C., pp. 461-465.

Hannan, E.J., W. Dunsmuir and M. Deistler (1980). Estimation of vector ARMAX models. J. Multivariate Anal., 10, pp. 275-295.

Hannan, E.J. and L. Kavalieris (1984). Multivariate linear time series models. Adv. Appl. Probab., 16, pp. 492-561.

Hannan, E.J. and M. Deistler (1988). The Statistical Theory of Linear Systems. John Wiley, New York.

Ljung, L. (1987). System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, NJ.

Picci, G. (1982). Some numerical aspects of multivariable systems identification. Math. Programming Study, 18, p. 76.

Söderström, T. and P. Stoica (1989). System Identification. Prentice-Hall, Englewood Cliffs, NJ.