Global Identification and Optimal Input Design

Global Identification and Optimal Input Design

EXPERIMENT DESIGN Copyright © IFAC Identification and System Parameter Estimation. Budapest. Hungary 1991 GLOBAL IDENTIFICATION AND OPTIMAL INPUT DE...

1MB Sizes 0 Downloads 111 Views

EXPERIMENT DESIGN

Copyright © IFAC Identification and System Parameter Estimation. Budapest. Hungary 1991

GLOBAL IDENTIFICATION AND OPTIMAL INPUT DESIGN J. RichaIet, S. Abu el Ata-Doss and A. Co'ic ADERSA, 7 bd. du Marechal Juin, F-91370 Verrieres-le-Buisson, France

AbstrRct. The "p;l oba l" identif i catio n approach is addressed in this pRper. In this approach, the values of the parameters obtained from local identification are accompanied by 'lncertaintv intervals corresponding to the assumed error l evel. The geometric structure of the iso-distance domain in the paraiiletric space reflects th e distrib ution of the inforlllation to the different parameters. A siwplified strategy for optima l input des i p,n , which is a fundBlnental problem in identification, is presented. An example, with real data, is treatpri , showinr, the approach ef ficien cy . Key words: IIlodelling, estilllation, dorllain. input design.

identification, uncertainty spectrun will always yield a large \Ilcertainty on the high order parameters of a differential equation.

INTRa:u::TlOO

With the advent of Mode 1 Based Predictive Control (MEP C) , the resign of controllers has changed. Tuni ng pararneters have been r e pla ced by specificati on paraweters since a canputer aided design approach of the control strategy is now made easily accessible to non experts in control. If a III 0 de 1 of the plant is available, then very efficiently, with a proper trade-off between robustneso. cnd dynamics, a controller can be fully designed . Trning the controller appears now E,:; a gratification while the risky and t~re-spending job deals with "lOdelling and identification phases. The situation cppears nOli to be clearer:

The optimal design of' the protocol is then the ini t:.al step of the ..nole procedure. What is the "be st" protocol which will respect all the specific constraints of application and bring the best information? One can imagine that by some analytical optimization it is possible to approa::h this problem provided sorre morel were availrole. To get out of this vicious circle, an iterative procedure can be used. How good is a model ? - the prediction error of the proposed model should be l ess then a prescribed value but that condition is not at all sufficient; - the set of all models which fulfill this condi tion shruld be "small".

- ei thC'r a iBodel is not necessary , then a sirnple Signal B8Sed Control technique is to be used (e.g. PI). Tuni np, is done marually wi th a perfonnance to cost ratio which is difficult to rnatch. Fortunately 90 % of .i .ndustrial loops can be treated this way.

Thus modelling is a trade-off: the aS9..uned structure of the ulOrel should be cOHplex enc:ugh to be role to reproduce the process behaviour wi thin the stochastic enviroruuEn t, but not overparametrized so that uncertainty of the paraneters is reduced and that no paranete~collpensation should occur.

- or a d,odel is needed since l ocal tuning is illrpossible because of the co;nplexi ty of control. MEPC takes Ol/er, provided a model is available. but deals with the Inodel parameters, not with the controller my mor e. Local identification methods are abundmt and effiCient, either they come from estiulBtion techniques and do not use a simulator (e.g. least squares) or they come from modelling technique s cnd an expl ic i t s imul ator is to be adapted. They yield m estimated set of paraneters, some of tile" giving an uncertainty dcmain. The goal of i dent ification is to ex tract a penllanent IIlOdel structure from a set of input-output data. For practical reasons, professional rnodelling and identification, where economical comrnissioning constraints put a strong pressure on the operating conditions. tend to operate in an adverse envirorunent e.g. : low anpli rude. short duration of the :Jrotocol (set of test-signals), high l eve l of perturbation. e tc ... It is obvious that the "richness" o f the infonuation conte nt of the protocol (to be cEfined later on) will affect the ultiulBte quality of the r
All these properties in fact come from the adaptation of the protocol to the process. Opt1mizing the protocol cppears now as one of the most time consuming parts of the procedure ; unfortunately it is a theoretically difficul t problem, but practical solutions are availrole. For a canplete control cppl ication, it appears fran professional experience that more time is spent on modelling, identification md model evaluation than on control specification md design. It is ttus clear that global identification and protocol optimization ·are necessary (see Abu el Ata et aI, 1989 and Richalet, 1991). These two requirements will revc:ur a large allount of ca.lculation and need powerful corrputers, which may explain why these techniques appear relatively recently. because they could not be effictently considered in the past.

O'or instance, even wi th the most efficient identification algori thrn, a low frequency input

1121

At the minimum of D,.& = 0 and if the ffilnlJrRJUI is at the nOlllinal point (D(e.) = 0) , we have :

PARAMETRIC SPACE

The !lIOde 11 ing rhase I eads to a struc ture ci=pending on a vector of para!leters E = (Pk)k=l,K to which is ass::>ciated the paranetric space IRK

D(p+~p) = l ~pT A ~p 2 -

where the

If the ,nodel characterization is co rr ect, the obj~t (true systelll) will be represented, in this space, by a ooint called the nominal point corresponding to a vector Po rod the nodel by a poin t M corresponding to a- vec tor EM •

(i,j)

element of

°

2 ~

k = l, ••• ,K ci=fined by :

In order to compa re the object and lIlodel behaviours, one needs to introduce a topology of the parametric space. Two types of distance can be defined :

1, ••• ,K

denotes the vector

t;

A

St.nJCtJuraJ. and state dista'lces

f H 0i 0j dt

being the sensitivity functions k

and if

?

fH ~5..T

A is given by

(Ok)k=l,K

(4)

, then

dt

A is then a positive ci=finite matrix if the Ok's are all di fferent frolll zero .

Th" st ructural distance is given by D(O,r,1)

vi

In these conditions, near the lIunlllllllll point, D is a positive definite QJadratic fonn and the ioo-D is a closed hyper-€ 11 ipso id whose characteris tics are dete I'Inin e d by A 1. e. by the sensi tivi ty functions.

be inR a positi.ve ci=finite IlIatrix.

The s tate distan ce is given by D(0 ,[·1)

(2)

So cnd sr'1 denoting the object rod model outputs r especti vely for a sallle input signal on the ohse rvation horizon H To guararl t ee silllilar object rod ,!loci= l behaviours, ind?pendently of the data used f or identification, the iIIorel has to tend towards t he objec t according to a st ructural distance. Unfortunately Po is not kno~lT1. Only the state distance is accessible and can he minilllized • Thus the identification resul t v/ill depend on the data used rod val idation is then necessary .

Valley and ridge

We call valley and ridge the locus of points of the i so-D surfaces where the slope is minimun and maxilllllln respectively . In the hypothesis of quadratic approXimation, the valley corresponds to the great axis of the hypf.r-ell ipsoid i.e. to the eigenvector of A associated with the slllallest eigenvalue, cnd the ridge corresponds to the snaIl axis associated wi th the largest eigenvalue. The iso-D can thus be f'Pherical (a) or stretched (b) as shown below :

I so-distance

f o r gi ven data, an iso-distance (iso-D) is co nsti tuted of all points (all lIo dels) of the oara.netric space correspondinp, to sOlne state distance value (D*). D*

@f I

\ lso-D

Th e iro-D con ce pt pI ays a fundalllental r ol e in g l obal identification since, in this approach, the objective is to detennine the set of all models corresponding to the minilllulO distroce level which takes into a:;count the different oources of errors.

The condi tioning of the identification problan is tightly related to the shClle of the ioo-D which refl ects the relative distribution'of the infonnation, braJght by the data, in all paraneter directions. LOCAL IIlENl'IFICATlOO

Sensi tivi ty

~tions

The state distance can be considered as a fun ctio nal of p . Near a point p , a limited series expanSion-to the 2nd order yields

= D(p) + gT ~p + l ~pT A ~p (3) ---2-g rod A denoting respectively the gradient and hessian illatrix of D D(p +~p)

1122

In local identification, the objective consists of detenni.ning a particular moci=] from the class of mode ls considered (a point in the parametric space), which best represents the object behaviour. This model is obtained by optilHizing a state distance criterion. The identific ation methods can be classified into two nain categories which we mention here briefly as there is an abundant l i te ratu re on the subject .

Estilllaticn methods The ,dost CO!,idon anong these JIJ~thods are the least squares mct instrumental variable lIlethods, their r>?cursive version..c;, the least mem squares lIlethod, its nOrlnal ized version and the stochastic approxilnation method (see e.g. LjLTlg & Soderstrom, 1 S33 and Soderstrom & Stoica, 1983, 1989).

- correlation leads to a bias and at the limit, conplete correlation will yield v *2 = 0 thus giving the lIaximLllll bias. This resul t mif".,ht ~pear to be paradoxical: the best ,nodel in the state distmce sense is the worst in the structural distance sense. The interpretation is that, the optimization technique identifies the noise as belonging to the model.

Model methods The basic principle of these methods is the coulp
Global identification is concerned with iso-D determination (Richalet et aI, 1971). Local identification allows to find a point achieving the rniniJnLlln value of the state distance. Actually, it can only yield a local minilllum near the initialization point. Different initializations llay thus give different Inodel s. Different results cm also be obtained from applying different techniques since n'Jise affect these techniques differently.

If there is identity, a ,;]ode1 is thus obtained. If not, the 1,10del paralleters are modified to improve t'le sta tp di stance, thc objec ti ve be ing to progress in the parallletric space, along judicious dirr,ctions. This is done to favrur the convergence of the parcllileters towards the coordinates of the point xhi?ving the IninilnLllll criterion. T'1C''3(~ 'iI~ thods

are essent ially rxIlli.near prograoning for parametric ootilllization (e.g. Iliu"'lc I bl au , 1972) . r,1any types of me thods c an be usoc : ar.alytic, w~ometric, heuristic or randan. A str:1ter:{ using different methods for different phases of optimization is often adopted, the initialization being generally provided by a direct est iJ,ntion Idethod (e.g. least squares). t,~c hniqucs

The efficiency of these ,Iethods depends on the cond.i ti.oning of the iso-D surfaces. If the valley is tiP-J'1t md 1,ighly curved, first order analytic ule thods ( e .p;. grr,dient meth od) and heuristic IdethorJs (e.g. simplex Illetl1od) will progress difficul tly and will stop far frall the optimal point. If the surface is not qJaciratic, the second on:ipr mCllytic Illethods (e.g. Gauss-Newton lllethod) or ge 011,," tr ic Il.e thocts ( e . g. Pov.e 11 me thod) will conve rge slowly and also to a point far from the opti.;rHl one; they cm even diver'ge. In fir. t, experience shows that in practice, instead of searching for the best 1,Icthod to use, it is hetter to define a strategy for using different illethods for different phases : initialization, ulini"lizIng near the ini tial point then near the !llinil!lL1!J .

In realistic condi tions md due to noise (state or/ md struc tural ), the minillBl val ue of D cannot be zero 2 it must t.o'lke into a::count t"le noise level. Le t v · be this value. Ac tually, by applying a local identification ,nethod, one can obtain Dillin = v*2 < v 2 • This leads to a bias between the nOi>linal values of the paraneters (to be obtained) and the identified ones. Analysis of this result shows that the bias is related to the correlation between the noise and the sensi ti vi ty functions : - if no correlation exists, then bias

v*2

v2

i.e. no

This leads to the following question : what is the interest in searching for just the minii>lLllTI point ? In fact, it makes .nore sense to determine all points (models) corresponding to the minimllln possible value of the state distance taking into account all types of perturbations and modelling errors i.e. to detennine m iso-D ; this is the main concern of the global identification approa::h. The problelll of esti1!lating the lIuniJnum iso-D level will be addressed later. The iso-D gives an interval of variation for ea::h paraneter instead of one vcUue in the local CflProach. This allows us to test the model validity : using the same repres~ntation with different data sets, the corresponding iso-D's have to intersect. The iso-D stru-:ture, which is completely specified by the data used, allows the evaluation of the relative distribution of the inf'
lso-distalce detenninatia'l

Now, to determine an iso-D, many analytic, heuristic and randan lle thods exist (Richalet et al, 1971 ). A sillp l e procedure consists of lirni ting the problem to the determination of the parallelepiped circulllscribed to the iso-D, the main interest being to find the variation intervals of the parale ters related to the considered iso-D level, as shown in the figure below for a 2 d:Ulensions case :

"

____

w~

__ -

__

,

---r

,

,

/

/

/

identified model

, ----~ /

--

-

,

--,---

,,

t---~~--------~-----------------" We will begin by deterTllining the parallelepiped in

a particularly simple case md, in turn, generalize it by applying an iterative procedure.

1123

Particular case

and for a Riven protocol, the sensi tivi ty functions will repend on the point of the paranetric space, at which they are calcul ated. The canputation of these functions can be silrple and even analytic in some cases but, in general, it is done by silfulating the model using t he expression :

It is the case where the IllOrel is given by: K

sr~(n)

E.T ~(n) = l: k=l

Pk xk(n)

(5)

{Xk}k=l,K being Uleasure::! variables; e.g. past values of input and output. This is a simplified case since the sensitivity functions are given by : Ok = xl< Ilk. Then, for a fixed protocol, these functions will not depend on the point of the paralle tric space at which they are calcul ated. We v/ill show later that this does not apply in the p,ene ral case. If D* denotes the considered iso-D level, pOints to be detennined have to satisfy

6E.k having 6Pk cOlOponents, with

as kth canponent and zero other 6Pk slllall.

D can be approximated by

the

N

K

E {sM(n),p) + E 6Pk 0k(n,E.) - so(n)}2 n=l k=l

N

1-

1: {E.T ~(n) - so(n) } 2 n=l

2-

for each direction i (c orresponding to Pi), the gradient of D at the extreme points is in this direction. This leads to :

D*

i.e.

(6)

N

D(E.+6E.) = E {pT ~ (n,p) - \i(n ,E.) } 2 n=1 - -

(12)

where !;. = ( 0 k) k=1 K and \i = so-sM can be c;alculated. We thus get a1 expression similar to

(7)

\6) •

i

is the unit vector in the i th direction 1s a scalar , R i s the infonllation matrix :

where md N

R

Now to obtain the extrane points on the iso-D we proceed by iterations. Instead of aiming at the D* level f roll I the first step, intennediate values are introduced , which increase until D*.

N

E x(n)

~(n)T

n=l

'L = E so(n) n=l

~(n)

(8)

For a given direction i , the i terati ve procedure is applied for eac h extrene point separately. Ini tialization is done using a local identification resul t. At each iteration, a displacement 6 p is calcul ated by applying the procedure developed for the particular case a1d using (12). But there is a need for the computation of the gradient of D(E.+6E.) .

so thRt we can wri te (9)

with:

Repl acing p by (9) in eq;ation (6) gives a 2nd degree equation in the unknown A whose solution yields two values of A and froll expression (9), the two extreme points of the ioo-D in the i direction are then detennined. The ith components of these two points give the mcertainty interval on Pi correspondi ng to D* .

Fran (2) the expression for the gradient hessia1 A can be derived as follows:

£

and

N

!i(E.)

- 2

A(E.)

2

1: \i (E.) n=l

!;.(£)

N

The roove procedure is applied to all directions. Remark. The structure of the iso-D can be analyzed through the spectrum of the matrix R . If \nax is the greater eigenvalue and Ai is any other eigenvalue, Ai/ Amax allows the evaluation of the re 1 ati ve allount of infonllation in the direction corresponding to Ai wi th respect to the best information (that corresponding to Amax). Actually, if this ratio is snall (less than 0. 1 ), we can consider that the infonllation in this (U.rection is poor and if it is too snall , the iso-D is too Iluch stretched in this direction and can even be open. General case

In the gene ral case , the more 1 output sM is generated from the more 1; e.g. the case of the discrete transfer fmction representat i on

E n=l

N

!;. (E.) !;. (E.)T - 2

a ~ (r.)

E \i(E.) ~ n=1

this leads to the iterative fonllula :

which will be used in the procedure at ecch iteration g ~ (12) ca'"! even be neglected.

a

12

The iterat i ve procedure is stopped when the distance obtained is given by D*:. E (E small). A trial to analyze the ioo-D structure near the minimal point can be done by deterlOining the eige nvalues and eigenvectors of the matrix N

E ~(n,E.) ~( n'E.)T n=l

which can be considered here

as being the infonnation matrix (similarly to

(10)

+ bl e(n-l) + ... + b nb e(n-nb)

N

E ~(n) ~(n)T n =1

e resignating the system input. In these condi tions,

1124

in the particular case).

To avoid non convergence of the iterative procedure, in badly conditioned or even open iso-O cases, a limit on the number of iterations has to be fixed.

A way to simplify the problem cnd to make it nore adap t oole to practical requirements is to detennine the optimal protocol within a specific class. In what follows, such a procedure is proposed.

Sillli lar procedures can also be developed to dete nnine the iso-D projecti
Definiticn of a class The cl ass, for the candidate protocol, has to be defined in relation to the speCifications and constraints of each particular application. One ccn thus imagine a large runber of possibilities, tut to fix ideas, we give here a sirrple excvrple of a class for ~nich we have to choose :

Estimation of the iso-D level

The i so-O level estimation is. a fmdanental problem but a difficult one . Two view points can be

the horizon : this concerns the duration H of the protocol. It is always recornrnendabl e to consider - if possible - a large horizon. This will have two advantages; first, the signal frequency spectrum can be enriched cnd second, the chmce to get prOject ive independence between noise and sensi ti vi ty functions can be increased hence reducinr. noise effect. -

consi~rm

i) if only Dmin i i) if ..many data sets are avail abl e in thi s case, if J is the I1lIlnber of sets, we obtain the corresponding minilflUJlI val ues Dtnin( j) j=l, ... ,J and we assume Do(j) j = l, ... , J • Two cases are then possible

- the ampli tude : assurni ng the signa I to be centered e .g. around 0 , the minimum and maximun values .:!:. m of the signal anpli tude have to be fixed. - the stru:ture : prefercbly paranetrized in order

the intersection of the ioo-D domains is not void; a IIlodel can thus be taken in the common domain as shown in the figure be low.

to sirrplify the optimization procedure as will be seen l ater. As an example, the signal can be fonned by successive steps. It is clear that such a choice is not possibl e i f a constnaint prescribes the variation of the signal from + en to - en ffid vice versa. Now the duration of the steps is defined through parameters : e. g. periods decrE'asinp exponentially Le. .m,... _ _T::.,o_-,

T,

0··:·--·-···_- --.- ... -.- ....... _.......... .

Do{l)

. this intersection is void cnd one has eith_er to ac cept a less prec ise re suI t by increasing the levels Do(j) to let the iso-O domains intersect or re-exanine the model s truc ture; in this 1 ast case, the detenninjstic aspect (characterization of the morel) is preponrercnt on the stochastic aspect resul ting from noise.

i

0, .• ,L


The global identification EPproach deals wi th an infonnation problem. The iso-D gives light on the relative infonnaticn, brought by the protocol, on the different paraneters.

It is easy to show that,

in these condi tions, the signal is corrpletely defined by the choice of To and TL .

The role of noise will also depend on the protocol. : with a protocol which proVides no much infonnation on a parameter, Cl l ow l evel noise can brinp; about an undetenllination on this paratlleter a good deal larger thm that obtained wi th a high level noise by a protocol which further maxirnizes the j nfonllatjon on this pararreter.

Choice of criteria

I t cm reveale bad conditioning

An important question arises : how to control the p otential infonnation contained in the Signal s appl ied to the system? This is the optimal irlJut design problerll . The pr~ tical need encountered in this problern is fundawental especially as the duration of the experiment dedicated to the identification is short (often for economic reasons). Unfortunately, this cruc ial problem has n eve r been well enough addressed because of its difficulty and the fact that the theoretical work in the subject did not provide satisfactory results in practice.

1125

A class is defined and wi thin this class, a protocol is specified by particular values of paraneters: To ffid TL in the considered example. Optimizing the protocol will then consists of optimizing , over To and TL • some criterion in re lation wi th the infonnational identification problen. The objective is of course to have more inforMation, i.e. less uncertainty on the parameters, for a given iso-D level. Thus it makes sense to define the criteria using the mcertainty intervals {IPk}k=l,K but other possibilities also exist. If only one paraneter Pk is considered, then the problem consists to minimize IPk' In general, for K' fron the K paraneters, the criterion to be minimized is the product of the corresponding IPk's . At the limit if K' = K this

Global identification corresponding to a 5 % ioo-D level gives the following tct>le for ll1certainty intervals :

criterion repr e sents the volume of the pa rall e l epiped circuHlscribed t o the ioo-D . Ano ther objective could be the e
a1

a2

min

0.0177

0.0003

flax

0.1583

0.0C67

bO

1 1

b1

I

-0.01201 -0.0054 1 -0.00661 -0.0016

The figure below represents the iso-D intersection by a plane correspondi.ng to the paraneters a1 and bO •

cptimizing procedJre No nJ i n ear prog raruning t ec hnique s can be us ed to opt i,lIi ze o n e particul a r c ri teri o n o ver the pa rane ters defining th e protoc ol, but simple proc ro ures c a> al so be e ffi c i ent. In the case of two paralle ters , e .p;. To and TL in the previously 'lIention cd e xc.·.,-, le, one Cf/1 s c an values of To and TL and c.a l c ul ate, for each couple (To, TL) , the c rite rion va lue. The opti nal couple (To,TL) will then be sel ected.

+-___---!_ _ _ _+ ____

~---_+

_

(.It·

. . ..

,.~

I-

. . ..

,.~

,_ -',N

...............•........

Expe ri ence shows that a protocol whi ch is optimal f or on e particular parane ter need not be optimal at a ll fo r othe r pa rame ters. If many pa raneters of the 'IiOde 1 are considered, the ideal si tuation consists t o de si p)'1 m optima l protocol for each of them. But , if thi s i s no t pOSSibl e , which is often the case , one h one paralle ter or eve n all the parame t e rs . The epherizing criterion c m al so be used . Elca;!ple p,i ve he re the result of global identification epplied to a> exanple of hydrostatic transuission. Re a l data have bee n used.

'vIe

Optimising the protocol for a1 in a class of signals defined by: To and Ti TL i=l, ••• ,L wi th allpl i tude m = 1 vol t over a 6 sec. horizon, 1",i ves the following interval : min = 0.0185 and Hax = 0. 0230 which shows the efficiency of the procedure in reducing uncertainty. Similar resul ts are obtained for the other paraneters.

curren~ PO:~~~~ing~~bJl-_t_o_r_q_u_e~... r

C

The tmns fer functi on (I -) C) mode 1 chos en is (b o + bl

s) e-O· 05 s

Abu el Ata-Doss S., Coic A.. Richalet J. (1989) . Identifi c ation globale et optimisation de protoc oles. ADERSA report.

+ al s + a2 s 2

Hi Hllnelblau D. M. (1972) programTling . McGraw-Hill.

The l ocal i dentification gives a l = 0.0540 , <12 = 0 .0027 , bO = -0.0092 ,bl -0. 0034. The prot.oc 01 f or the input I and the corresponding out put s CObjec t and CMode 1 are illustrated on the f ollovlinp, fip;ure

,

,

Ljung L., SOderstrOm T. (1983). Theory and practice of recursive identification. The MlT Press. Ri c h a let J., Rault A., Pouliquen R. (1971). Identification des prooessus par la methocle du modele. Gardon & Breach. Theorie des systemes. Vo1.4 •

,

_-_1- _._.___ ... __ ___ ._ - _.- - -,- - - - - ... - - - _.I

_ ._

I

_ . _ __ ... . _ . _

~

I

_ __ __ ... . _ . _

,

.- . -

,

-

,.. ._ . _ . -

-

I

-1- _ _ -

, ,

- ' - 1-

- .-

-



-

.. -

-

-

_ .-

-

... -

.

-

-

_ .-

• Applied nonlinear

- .- 1- -- -- -- -- '::1-' - '- -- - ... - - - "- '-

Richalet J. (1991). Pratique de l' identification. To RPpear Edi tions Heroes. Soderstrom T., Stoica P. (1983). Instrumental vari able me thods for system identification. Springer Verlag, Lecture Notes in Control and Infonnation Sciences n057. Soderstrom T., Stoica P. (1989). System identification. Prentice Hall, International Series in SystelflS and Control Engineering.

1126