System Identification Methods: Unification and Information-Development Using Template Functions


Copyright © IFAC Control Science and Technology (8th Triennial World Congress) Kyoto, Japan, 1981

SYSTEM IDENTIFICATION METHODS: UNIFICATION AND INFORMATION-DEVELOPMENT USING TEMPLATE FUNCTIONS

P. Eykhoff, A. J. W. van den Boom and A. A. van Rede

Eindhoven University of Technology, Eindhoven, The Netherlands

Abstract. Based on the notion of template functions a multitude of system parameter estimation methods is presented as a coherent picture. This leads to increased insight and to new, practical estimation schemes, adaptable for a wide variety of situations. The goal of parameter estimation is the derivation of quantitative knowledge on unknown characteristics of dynamical systems. Consequently it is of basic interest to determine whether, fundamentally, the particular situation at hand offers enough information, i.e. whether the best estimation algorithm available could indeed provide the knowledge required. The next question is about the efficiency of the various estimation algorithms, i.e. the part of the potentially available information that becomes actually present. In this paper partial answers to these questions are given. This holds for recursive estimation as well as for "one shot" estimation in cases of linearity-in-the-parameters.

INTRODUCTION

The goal of system parameter estimation is to derive knowledge about unknown or partially known parameters. This implies a number of basic questions:
- what is the maximal amount of information that can be made available under particular well-defined circumstances?
- what is the optimal estimation technique?
- what is the information loss, using a feasible but perhaps non-optimal technique?

The availability of answers to these questions depends on the particular situation at hand:
- for some (comparatively simple) schemes a statistically-complete answer can be given, i.e. the development in time of the probability density functions of the unknown parameters, for short observation sequences as well as their asymptotic properties;
- for other more complex situations one (still) has to be satisfied with approximate answers and the asymptotic properties only, i.e. for long observation sequences.

We will restrict the discussion to algorithms discrete-in-time, viz. operating on sampled input and output signals. For continuous (analog) signals, however, comparable discussions apply.

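2 MODELS

The class of process models used is characterized by being "linear-in-the-parameters", e.g. cf. fig. 1:

$$x(k) = -a_1 x(k-1) - \dots - a_p x(k-p) + b_0 u(k) + \dots + b_q u(k-q) \qquad (1)$$

with

$$\mathbf{x} = [x(1), \dots, x(k)]^T \qquad (2)$$

$$\mathbf{u} = [u(1), \dots, u(k)]^T \qquad (3)$$

$$\boldsymbol{\theta}^T = [-a_1, \dots, -a_p, b_0, \dots, b_q] \qquad (4)$$

this can be noted as:

$$\mathbf{x} = \Omega(x,u)\,\boldsymbol{\theta} \qquad (5)$$

To these equations a term indicating a d.c. level of the output also can be added. The input and output may be observable only as contaminated by additive disturbances $\mathbf{m}$ and $\mathbf{n}$, which are supposed to have zero mean and to be mutually uncorrelated:

$$\mathbf{y} = \mathbf{x} + \mathbf{n} \quad \text{and} \quad \mathbf{v} = \mathbf{u} + \mathbf{m} \qquad (6)$$

Then

$$\mathbf{y} = \Omega(y,v)\,\boldsymbol{\theta} + \mathbf{e} \qquad (7)$$

with

$$\mathbf{e} = \mathbf{n} - \Omega(n,m)\,\boldsymbol{\theta} \qquad (8)$$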
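The matrix notation of eqs. (1), (4) and (5) can be made concrete with a small sketch. It assumes zero signal values before the start of the record; the helper names `build_omega` and `simulate`, the orders p = 2, q = 1 and the parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def build_omega(x, u, p, q):
    """Stack the regressors [x(k-1)..x(k-p), u(k)..u(k-q)] row by row.
    Samples before the start of the record are taken as zero (assumption)."""
    k = len(x)
    omega = np.zeros((k, p + q + 1))
    for i in range(1, p + 1):          # delayed outputs x(k-i)
        omega[i:, i - 1] = x[:k - i]
    for j in range(q + 1):             # inputs u(k-j)
        omega[j:, p + j] = u[:k - j]
    return omega

def simulate(theta, u, p, q):
    """Noise-free response of eq. (1); theta = [-a_1..-a_p, b_0..b_q]."""
    k = len(u)
    x = np.zeros(k)
    for n in range(k):
        past_x = [x[n - i] if n - i >= 0 else 0.0 for i in range(1, p + 1)]
        past_u = [u[n - j] if n - j >= 0 else 0.0 for j in range(q + 1)]
        x[n] = np.dot(theta, past_x + past_u)
    return x

rng = np.random.default_rng(0)
theta = np.array([0.7, -0.1, 1.0, 0.5])   # hypothetical: -a_1, -a_2, b_0, b_1
u = rng.standard_normal(50)
x = simulate(theta, u, p=2, q=1)
omega = build_omega(x, u, p=2, q=1)

# Eq. (5): the whole record satisfies x = Omega(x, u) theta.
residual = np.max(np.abs(x - omega @ theta))
```

In this sketch the model is linear in `theta` even though the columns of `omega` contain delayed outputs, which is the property the estimation schemes rely on.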

Some remarks have to be made on the restriction to linearity-in-the-parameters:

- rather general representations, including the characteristics of the process (output) noise, can be covered; cf. fig. 2 (Talmon and Van den Boom, 1973):

$$y(k) = -\sum_{i=1}^{p} a_i y_i(k) + \sum_{i=0}^{q} b_i u_i(k) + \sum_{i=1}^{r} c_i w_i(k) - \sum_{i=1}^{s} d_i e_i(k) + w(k) \qquad (9)$$

  The dynamics is again represented by the forward process parameters $\mathbf{b}$ and the backward process parameters $\mathbf{a}$. The samples w(k) represent white discrete noise; this is coloured by the forward noise parameters $\mathbf{c}$ and/or the backward noise parameters $\mathbf{d}$. With the appropriate vector definitions, this can be written:

$$y(k) = [\,\mathbf{y}^T \;\vdots\; \mathbf{u}^T \;\vdots\; \mathbf{w}^T \;\vdots\; \mathbf{e}^T\,] \begin{bmatrix} -\mathbf{a} \\ \mathbf{b} \\ \mathbf{c} \\ -\mathbf{d} \end{bmatrix} + w(k) \qquad (10)$$

  or

$$\mathbf{y} = \Omega(y,u,w,e)\,\boldsymbol{\theta} + \mathbf{w} \qquad (11)$$

  Note that this model is linear in all parameters, describing process and noise characteristics, but that the w and e samples are unobservable (and consequently should be derived in an iterative way when implementing such an estimation scheme).
- linearity-in-the-parameters does not imply linearity-in-dynamic-behaviour; these two concepts are unrelated. Note that eq.(1) can be extended with nonlinear terms.
- particular types of nonlinearities-in-the-parameters can be reduced to linear relations by means of transformations (e.g. reciprocal, logarithmic, and others) of some model variable(s).
- note that, in principle, the representation of eq.(1) can be used also for multi-input situations.

In spite of the importance of this topic a further discussion of modelling aspects is outside the scope of this paper.

3 ESTIMATION TECHNIQUES

The approaches to estimation to be considered here are based on the notion of "template functions" (Eykhoff, 1980). In short it refers to a premultiplication of the process-model equation (7) with a matrix $F^T$, where the dimensions of F are equal to the dimensions of $\Omega$. F is called "template function matrix". Then it is recognized that $\mathbf{e}$ is unobservable and under certain conditions $\frac{1}{k} F^T \mathbf{e}$ can be chosen to be equal to $\mathbf{0}$:

$$\frac{1}{k}\left(F^T \mathbf{y} - F^T \Omega\, \hat{\boldsymbol{\theta}}\right) = \frac{1}{k} F^T \mathbf{e} = \mathbf{0} \qquad (12)$$

where $\hat{\boldsymbol{\theta}}$ is the estimator of the unknown parameters $\boldsymbol{\theta}$. Consequently this estimator can be written as:

$$\hat{\boldsymbol{\theta}} = [F^T \Omega]^{-1} F^T \mathbf{y} \qquad (13)$$

provided of course that the inverse matrix exists. Substitution of the expression (7) for the process output into eq.(13) leads to:

$$\hat{\boldsymbol{\theta}} = \boldsymbol{\theta} + [F^T \Omega]^{-1} F^T \mathbf{e} \qquad (14)$$

In the sequel we will need $E[\mathbf{e}\,\mathbf{e}^T] = R$, the covariance matrix of the noise.

Note that, due to the linearity-in-the-parameters, these estimators can easily be put into a recursive form, having the same statistical properties as the "one-shot" estimators indicated by eq.(13). In discussing these statistical properties one notices that we may recognize three classes of situations:

class    F                Ω
I        deterministic    deterministic
II       deterministic    stochastic
III      stochastic       stochastic

Here "deterministic" refers to the situation where the signals concerned can be measured without uncertainty. In fig. 3 some examples are given, that will be considered briefly (cf. also Van den Boom, 1976).

Loosely speaking CLASS I contains the simple situations where:
a) only forward parameters are being estimated and where the process input can be measured without disturbances / measurement errors, cf. fig. 3a, or
b) only backward parameters are being estimated and where the process output can be measured without disturbances / measurement errors, cf. fig. 3b.
The template function matrix F can be chosen appropriately, i.e. as samples of functions that are independent of the process noise. (Note that F also may consist of sine/cosine functions, orthogonal functions, etc. that may be chosen independently of the process signals; this, however, will be discussed under class II.)

Some methods pertaining to class I:

ad a):
- correlation method
- matrix method with retarded observations
- Volterra method
- Wiener method
- least squares method
- weighted least squares method

F chosen as (consisting of): $\Omega(u(k-i), u^2(k-i), \dots)$; $\Omega(u) = U$; $\Omega(u)$; $WU$

ad b): comparable to a), e.g. F chosen as $\Omega(y(k-i))$

CLASS II contains the cases where there are additive disturbances at the output (and possibly also the input) signal, where forward as well as backward parameters are being determined and where:
c) F is chosen again as before. Note that to this class belongs the instrumental variable technique, viz. if the instrumental variables (template functions) are derived from an input signal that can be measured without disturbances or errors; cf. fig. 3c.
d) F is chosen independently of noisy measurements, e.g. based on rough knowledge that may be available a priori. Such situations may occur in estimation on batch processes or on biomedical systems; cf. fig. 3d.
e) F consists of a set of deterministic functions, chosen independently of the process signals, e.g. sine/cosine-, orthogonal-, Walsh functions, etc.; cf. fig. 3e.

Some methods pertaining to class II:

ad c):
- instrumental variable method (optimal covar.): F chosen as $R^{-1}\Omega(u,x)$
- instrumental variable method: F chosen as $\Omega(u,z)$, z is model output
- "delayed inputs" instrumental variable method: F chosen as $\Omega(u,u^*)$, u* is delayed outp.

ad d): F based on rough a priori knowledge, e.g., in the case of batch processes, averaged over an "ensemble" of previous batches.

ad e):
- Fourier method
- Laplace method
- orthogonal development
- Shinbrot's method
- Strejc's method

F chosen as (consisting of): $\sin \omega_i t$, $\cos \omega_i t$; $\exp(a_i t) \sin \omega_i t$; orthogonal functions; modulating functions

CLASS III contains the cases where the template matrix is derived using disturbed process signals. One such method is the differential approximation; as it leads to highly biased results we will leave this out of discussion. Other methods include:
f) F consists of observations of measurable signals and estimations of unobservable quantities, e.g. in the representation of eq.(11):

$$F = \Omega(y,u,\hat{w},\hat{e})$$

where $\hat{w}$ and $\hat{e}$ can only be estimated after a rough estimate of the parameters $\mathbf{a}$ and $\mathbf{b}$ from the input and output signals. Consequently the template function matrix develops in an iterative way; cf. fig. 3f.
g) F consists of noisy observations of process signals. Here it is assumed that two (or more) sequences of the same variable can be measured with errors that are mutually independent between these sequences. Such situations may occur if a quantity can be measured using different physical principles, e.g. measuring thickness by capacitive-, radiation-, or ultrasonic means; cf. fig. 3g.

Some methods pertaining to class III:

ad f):
- generalized least squares
- Clarke's method
- Hastings-James' method
- Young's method
- extended matrix method (Talmon and Van den Boom, 1973)
- extended least squares

F chosen as (consisting of): $\Omega(y,u)$; $\Omega(y,u,w)$; $\Omega(y,u,w,e)$

ad g):
- least-squares-like method (Van den Dungen, 1978)

F chosen as (consisting of): $\Omega(y_1, v_1)$

Now we will discuss separately these three classes of estimators.

4 INFORMATION DEVELOPMENT - CLASS I

This simple case, as mentioned before, includes the straightforward correlation techniques. Note that if $\Omega = U$ and F are deterministic then from eq.(14) it is clear that, if $F^T\mathbf{e}$ is independent Gaussian, then $\hat{\boldsymbol{\theta}}$ is Gaussian too. By determining the expectation and the covariance, viz.

$$E[\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}] = [F^T\Omega]^{-1} F^T E[\mathbf{e}] \qquad (15)$$

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = E[(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta})^T] = E\{[F^T\Omega]^{-1} F^T \mathbf{e}\,\mathbf{e}^T F\, [\Omega^T F]^{-1}\} \qquad (16)$$

we obtain the complete statistical information on the estimator (even for short sample lengths), as the Gaussian distribution is completely determined by its expectation and variance.

Note that $E[\mathbf{e}] = \mathbf{0}$ is sufficient for unbiasedness of the estimator and that eq.(16) can be simplified to:

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = [F^T U]^{-1} F^T R F\, [U^T F]^{-1} = B_F \qquad (17)$$
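The estimator of eq. (13) and the covariance of eq. (17) translate almost literally into code. The sketch below assumes a deterministic observation matrix U and a known noise covariance R. The function names, the exponentially decaying R, and the two illustrative choices of F (the unweighted F = U and a noise-weighted F = R⁻¹U) are assumptions made for this illustration only.

```python
import numpy as np

def template_estimate(F, omega, y):
    """Eq. (13): theta_hat = (F^T Omega)^{-1} F^T y."""
    return np.linalg.solve(F.T @ omega, F.T @ y)

def template_covariance(F, U, R):
    """Eq. (17): B_F = (F^T U)^{-1} F^T R F (U^T F)^{-1}, F and U deterministic."""
    A = np.linalg.inv(F.T @ U)          # (U^T F)^{-1} is the transpose of A
    return A @ F.T @ R @ F @ A.T

rng = np.random.default_rng(0)
k = 40
U = rng.standard_normal((k, 2))                     # hypothetical regressors
idx = np.arange(k)
R = 0.8 ** np.abs(np.subtract.outer(idx, idx))      # assumed coloured-noise covariance
theta = np.array([1.0, -0.5])                       # hypothetical parameters

# With noise-free data any admissible F returns theta exactly, cf. eq. (14).
theta_noise_free = template_estimate(U, U, U @ theta)

B_ls = template_covariance(U, U, R)                     # F = U
B_w = template_covariance(np.linalg.solve(R, U), U, R)  # F = R^{-1} U

# The choice of F changes the covariance; here B_ls - B_w is positive semidefinite.
gap = B_ls - B_w
min_eig = np.min(np.linalg.eigvalsh((gap + gap.T) / 2))
```

Comparing `B_ls` and `B_w` in this way gives a direct measure of the information lost by a particular choice of template function matrix.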

cf. Eykhoff (1980). From here we can discuss a variety of parameter estimation methods by making particular choices for F. At this point it may be helpful to consider the Markov estimator

$$\hat{\boldsymbol{\theta}} = [U^T R^{-1} U]^{-1} U^T R^{-1} \mathbf{y} \qquad (18)$$

This estimator is known to be optimal in the sense of having the minimum variance of all linear unbiased estimators. It can be rewritten in two ways by recognizing $R^{-1} = D^T D$, viz.

$$\hat{\boldsymbol{\theta}} = [\{U^T D^T\}\{D U\}]^{-1} \{U^T D^T\}\, D\mathbf{y} \qquad (19)$$

$$\hat{\boldsymbol{\theta}} = [\{U^T R^{-1}\}\, U]^{-1} \{U^T R^{-1}\}\, \mathbf{y} \qquad (20)$$

Eq.(19) represents an interpretation using a noise-whitening filter (Eykhoff, 1974, pag. 1'33) and eq.(20) can be recognized as a template-function interpretation. Now being interested in the latter representation it is worth noting that the template function matrix apparently consists of the independent variables weighted by the inverse noise covariance matrix:

$$F_{opt} = R^{-1} U \qquad (21)$$

For this optimal case corresponding to the Markov estimator the parameter covariance matrix is:

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}_{opt}] = [U^T R^{-1} U]^{-1} = [F_{opt}^T U]^{-1} = B_{opt} \qquad (22)$$

If $\hat{\boldsymbol{\theta}}$ is an unbiased estimate resulting from any other template function matrix F, then optimality in the sense of covariance means

$$\mathbf{z}^T (B_F - B_{opt})\, \mathbf{z} \ge 0 \qquad (23)$$

for all $\mathbf{z}$. This implies that the hyperellipsoid corresponding to $B_{opt}$ lies completely inside or touches the ellipsoid corresponding to $B_F$. Another point of interest is the fact that sometimes:

$$\det[B_{opt}^{-1} - B_F^{-1}] = 0 \qquad (24)$$

Condition (24) implies that both ellipsoids touch at at least two points; cf. Jesus (1979).

Summarizing for class I we notice:
- the information development during the estimation procedure is known even for short observation sequences;
- for Gaussian noise we obtain complete statistical information;
- the effect of the choice of a non-optimal template function matrix and its associated information loss can be studied.

5 INFORMATION DEVELOPMENT - CLASS II

This case, with observation matrix $\Omega$ stochastic and template function matrix F deterministic, has extensively been studied. Unfortunately the unbiasedness through eq.(15) cannot be guaranteed, due to the statistical relation between $\Omega$ and $\mathbf{e}$. Consequently one has to be satisfied with asymptotic properties for the number of observations k going to $\infty$. The estimator characteristics can be summarized as follows, cf. e.g. Ward (1977):

- assumptions:

$$E[\mathbf{e}] = \mathbf{0}; \qquad R = E[\mathbf{e}\,\mathbf{e}^T] \ \text{nonsingular}$$

$$\operatorname*{plim}_{k\to\infty} \tfrac{1}{k} F^T \Omega \ \text{exists and is nonsingular}; \qquad \operatorname*{plim}_{k\to\infty} \tfrac{1}{k} F^T \mathbf{e} = \mathbf{0} \qquad (25)$$

- properties: $\hat{\boldsymbol{\theta}}$ is consistent,

$$\operatorname*{plim}_{k\to\infty} \hat{\boldsymbol{\theta}} = \boldsymbol{\theta} \qquad (26)$$

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = [F^T \Omega]^{-1} F^T R F\, [\Omega^T F]^{-1} \qquad (27)$$

- the optimal template function/instrumental variable is:

$$F_{opt} = R^{-1}\, \Omega(u,x) \qquad (28)$$

Note that this requires undisturbed process signals which are of course not observable!

Again a number of remarks can be made.
- When choosing a non-optimal but feasible F its effect on the asymptotic properties can be studied. This is of real importance, as in practical situations the optimal instrumental variable can only be approximated in an iterative way.
- The choice of F, based on rough a priori knowledge, suggests itself and leads to new estimation methods well adapted to practical engineering and biomedical situations. As an example, such a priori knowledge may be chosen as the development in time of a batch process, averaged over a number of such previous batches. This provides us with acceptable, feasible template function matrices. An example from the biomedical field (quantitative measure of a cardiac infarction) can be found in Van Kemenade (1980). From that case one learns that the estimation results may be rather insensitive to deviations in the choice of the template function matrix.
- As indicated before, the F may also be chosen independently from process measurements and independently from a priori knowledge. Such a choice may be sine/cosine functions. In such a case the "projection" of the process signals on the F-space will generally be rather "inefficient" and badly adapted to the type of parameter information needed. This implies that more template functions are needed, and that the dimensions of F and $\Omega$ are not the same.

Summarizing for class II we notice:
- the information development during the estimation procedure can be studied conveniently (still) for the asymptotic case, i.e. for sizable observation sequences;
- the effect of choosing various kinds of template functions can be studied. Such choices include the use of rough a priori knowledge and the use of sets of deterministic functions.
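The role of the template functions in the noisy-output case can be illustrated with a small simulation. The sketch assumes a first-order process with an exactly measured white input and white output noise; the parameter values, the record length and the seed are hypothetical. It compares F built from delayed inputs (an instrumental-variable choice) with F taken equal to the noisy observation matrix itself (ordinary least squares).

```python
import numpy as np

rng = np.random.default_rng(1)
k = 20000
theta = np.array([0.7, 1.0])          # hypothetical values of [-a_1, b_0]

u = rng.standard_normal(k)            # input, assumed measured without error
x = np.zeros(k)
x[0] = theta[1] * u[0]
for n in range(1, k):
    x[n] = theta[0] * x[n - 1] + theta[1] * u[n]
y = x + rng.standard_normal(k)        # output with additive white disturbance

# Observation matrix Omega(y, u) = [y(k-1), u(k)]; stochastic because of the noise in y.
omega = np.column_stack([y[:-1], u[1:]])
target = y[1:]

F_ls = omega                                   # template functions taken from noisy signals
F_iv = np.column_stack([u[:-1], u[1:]])        # delayed inputs, independent of the noise

theta_ls = np.linalg.solve(F_ls.T @ omega, F_ls.T @ target)
theta_iv = np.linalg.solve(F_iv.T @ omega, F_iv.T @ target)

err_ls = abs(theta_ls[0] - theta[0])   # stays large: F^T e / k does not vanish
err_iv = abs(theta_iv[0] - theta[0])   # small for long records: F^T e / k tends to 0
```

For this setup the least squares estimate of the backward parameter stays visibly biased however long the record, whereas the delayed-input choice approaches the true value, in line with the consistency property stated for class II.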

6 INFORMATION DEVELOPMENT - CLASS III

The process description is chosen again as indicated by the eqs. (9) - (11), summarized in eq.(29):

$$\mathbf{y} = \Omega\,\boldsymbol{\theta} + \mathbf{w} \qquad (29)$$

If w is white noise then the unbiased, minimum variance estimate is given by

$$\hat{\boldsymbol{\theta}} = [\Omega^T \Omega]^{-1} \Omega^T \mathbf{y} \qquad (30)$$

where the method function matrix equals the matrix $\Omega$. If the elements of this matrix could be measured without uncertainty, then the covariance of the estimator could be written as:

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = \sigma_w^2\, [\Omega^T \Omega]^{-1} \qquad (31)$$

Note that, with an increasing number of observations, some parts of the matrix can be neglected due to statistical independence of the respective components. Based on the assumptions made with respect to the observability of the matrix elements and the whiteness of the noise it is clear that this covariance is an optimistic guess of the information that might ultimately be available. This guess is illustrated with a simple example. Assume the process and noise characteristics to be described by:

$$y(k) = -a_1 y(k-1) + b_0 u(k) + c_1 w(k-1) + w(k) \qquad (32)$$

with the assumptions

$$E[u^2] = \sigma_u^2; \quad E[w^2] = \sigma_w^2; \quad u, w \ \text{independent, white} \qquad (33)$$

Then eq.(31) is found to be

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = \frac{\sigma_w^2}{k} \begin{bmatrix} \dfrac{b_0^2 \sigma_u^2 + (1 + c_1^2 - 2 a_1 c_1)\,\sigma_w^2}{1 - a_1^2} & 0 & \sigma_w^2 \\ 0 & \sigma_u^2 & 0 \\ \sigma_w^2 & 0 & \sigma_w^2 \end{bmatrix}^{-1} \qquad (34)$$

Using the numerical values $a_1 = 0.7$, $b_0 = 1$, $c_1 = 0.3$, $k = 250$ and $\sigma_u = \sigma_w$, the covariance is

$$\mathrm{cov}[\hat{\boldsymbol{\theta}}] = 10^{-2} \begin{bmatrix} .176 & 0 & -.176 \\ 0 & .4 & 0 \\ -.176 & 0 & .576 \end{bmatrix} \qquad (35)$$

or $\sigma_{a_1} \approx .042$, $\sigma_{b_0} \approx .063$, $\sigma_{c_1} \approx .076$.

Once again we have to emphasize that this simple guess of the information must be too optimistic due to the assumption that the "template functions matrix" can be determined without uncertainty, which certainly is not the case. Another estimate can be found using the Cramér-Rao theorem, assuming Gaussian noise, along the lines indicated by Åström (1967) and Costongs (1979). To the results attained that way an interesting physical meaning can be found in terms of the signals that act as information carriers for the respective parameters.

Summarizing for class III we notice:
- a rough and simple estimation of the information available can be obtained quite simply;
- the information development during the estimation procedure can also be discussed using the Cramér-Rao bound, indicating the maximal amount of information that can be derived on the estimated parameters. Simulations for particular cases of the extended matrix method indicate a rather close approach to this bound, whereas the amount of computation time needed compares favourably with the maximum likelihood method.

Remaining to be discussed is the estimation situation indicated by fig. 3g, where two or more sequences of the same variable can be measured with errors that are mutually independent between these sequences. This type of stochastic template function matrix needs more extensive treatment and is discussed separately (Van den Dungen, 1978; Van den Dungen and Eykhoff, 1981).

7 FURTHER PERSPECTIVES

In spite of the notions indicated, there still remains the challenge to find an adequate connection to e.g. information/communication theory. The parameter values to be estimated can be considered as the messages sent by means of a coding that requires an optimal decoding, considering the noise present.

In spite of some problem statements along these ideas and relevant material available this line has not yet been adequately developed.

8 CONCLUSIONS

Based on the notion of template function(s) (matrices), a coherent picture is given that includes a wide variety of parameter estimation schemes. Due to the underlying linearity-in-the-parameters all these schemes can easily be brought into recursive form. Three classes have been recognized, where each class has its characteristic possibilities/limitations of describing the information development during the estimation procedure. This implies the discussion of optimal and non-optimal/feasible template function matrices.
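As a numerical cross-check of the class III example, eqs. (32)-(35), the optimistic covariance of eq. (31) can be evaluated for the first-order model with regressors [y(k-1), u(k), w(k-1)]. The sketch takes a_1 = 0.7, b_0 = 1, c_1 = 0.3 and k = 250; setting both noise variances to 1 is an assumption, since only the ratio of the variances enters the result. Replacing the sample moments in ΩᵀΩ/k by their expectations is likewise an approximation for long records.

```python
import numpy as np

a1, b0, c1, k = 0.7, 1.0, 0.3, 250
var_u = var_w = 1.0      # assumed absolute level; only the ratio matters

# Stationary output variance of y(k) = -a1 y(k-1) + b0 u(k) + c1 w(k-1) + w(k)
var_y = (b0**2 * var_u + (1 + c1**2 - 2 * a1 * c1) * var_w) / (1 - a1**2)

# Expected second moments of the regressors [y(k-1), u(k), w(k-1)]:
# E[y(k-1) w(k-1)] = var_w, and u is independent of both y(k-1) and w(k-1).
M = np.array([[var_y, 0.0,   var_w],
              [0.0,   var_u, 0.0],
              [var_w, 0.0,   var_w]])

cov = var_w / k * np.linalg.inv(M)   # large-sample form of sigma_w^2 (Omega^T Omega)^{-1}
std = np.sqrt(np.diag(cov))          # standard deviations for -a1, b0, c1
```

Under these assumptions `cov` agrees with the matrix quoted in eq. (35) to the three digits given there, and `std` with the quoted standard deviations.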

REFERENCES

Åström, K.J. (1967). On the achievable accuracy in identification problems. Preprints 1st IFAC Symp. on Identification, Academia, Prague. Paper 1.A.
Boom, cf. Van den Boom.
Costongs, J.W. (1979). Accuracy of the parameter estimation techniques in the program package SATER (in Dutch). M.Sc. Thesis, EE Dept., Eindhoven University of Technology.
Deutsch, R. (1965). Estimation Theory. Prentice Hall, Englewood Cliffs, N.J.
Dungen, cf. Van den Dungen.
Eykhoff, P. (1974). System Identification, Parameter and State Estimation. Wiley, London, 555 pp.
Eykhoff, P. (1980). System identification; an approach to a coherent picture through template functions. Electronics Letters, vol. 16, pp. 502-504.
Goldberger, A.S. (1964). Econometric Theory. Wiley, New York.
Jesus, E.R. (1979). Method functions and information. M.Sc. Thesis, EE Dept., Eindhoven University of Technology.
Kemenade, cf. Van Kemenade.
Kendall, M.G. and A. Stuart (1961). The Advanced Theory of Statistics. Griffin, London.
Rao, C.R. (1945). Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc., vol. 37, p. 81.
Shinbrot, M. (1957). On the analysis of linear and nonlinear systems. Trans. ASME, vol. 79, pp. 547-552.
Talmon, J.L. and A.J.W. Van den Boom (1973). On the estimation of the transfer function parameters of process- and noise dynamics using a single-stage estimator. Proc. 3rd IFAC Symp. on Identification and Syst. Param. Estimat., the Hague/Delft. North-Holland, Amsterdam, pp. 711-720.
Van den Boom, A.J.W. (1976). On the relation between weighted least-squares estimators and instrumental variable estimators. Proc. 4th IFAC Symp. on Identification and System Param. Estimat., Tbilisi. North-Holland, Amsterdam, 1978, pp. 1261-1271.
Van den Dungen, W.T.H.M. (1978). A generalisation of the least squares estimator (in Dutch). M.Sc. Thesis, EE Dept., Eindhoven University of Technology.
Van den Dungen, W.T.H.M. and P. Eykhoff (1981). A generalisation of least squares estimation. Proc. 8th IFAC Congress, Kyoto.
Van Kemenade, C.J.C. (1980). Estimation procedure using template functions for quantification of cardiac infarction (in Dutch). M.Sc. Thesis, EE Dept., Eindhoven University of Technology.
Ward, R.K. (1977). Notes on the instrumental variable. IEEE Trans. Autom. Control, vol. AC-22, pp. 482-484.

[Figs. 1-3: block diagrams not reproduced. Recoverable labels: Ω(u), F, "knowledge" feeding F, and the panel titles CLASS I, CLASS II and CLASS III of fig. 3.]