Optimal Input Design for Discrimination of Linear Stochastic Models Based on Kullback–Leibler Discrimination Information Measure


Copyright © IFAC Identification and System Parameter Estimation. Beijing. PRC 1988

T. Hatanaka and K. Uosaki

Department of Applied Physics, Osaka University, Suita, Osaka 565, Japan

Abstract. Optimal input design for stochastic dynamic system identification has attracted considerable attention as a way to obtain maximal information from the observation data. Most studies in this area have been devoted to accurate parameter estimation within a specified model structure. However, the problem of primary importance in system identification might be to determine the model structure itself. From this point of view, we have considered the optimal input design problem for autoregressive model discrimination based on the Kullback-Leibler discrimination information (KLDI) measure (Uosaki and Hatanaka (1987)). In this paper, we derive an optimal input based on the KLDI measure which efficiently discriminates more general linear stochastic models, including autoregressive models. Several digital simulation experiments show that the proposed input works well in general linear stochastic model discrimination compared to other inputs such as random input.

Keywords. Identification; modelling; stochastic systems; linear systems; discrete time systems; optimization; input design; model discrimination.

INTRODUCTION

The choice of the input signal may determine the nature and accuracy of the dynamic system model identified from identification experiments. Hence, the importance of input design for obtaining maximal information about the system from the observed input/output data has been recognized for a long time. A large number of investigations on this subject have been carried out (see surveys by Astrom and Eykhoff (1970), Mehra (1974a), Goodwin and Payne (1977), Zarrop (1979), Soderstrom and Stoica (1983) and Ljung (1987)).

Most of the studies on the optimal input design problem for dynamic systems aim at accurate parameter estimation within a specified model structure under some input/output constraints (see, for example, Aoki and Staley (1970), Mehra (1974b), Wahba (1980), Stoica and Soderstrom (1982), Ng, Qureshi and Cheah (1984), and Krolikowski and Eykhoff (1985)); they are extensions of the optimal experiment design which statisticians have studied extensively over the last three decades for regression problems (see Fedorov (1972), Silvey (1980)). However, the problem of primary importance in system identification might be to determine the model structure itself, as shown in the flow diagram of the identification process (Fig. 1).

Little attention has been paid to optimal input design from this viewpoint, though. The few exceptions are optimal input design for discrimination of autoregressive models by Uosaki, Tanaka and Sugiyama (1984, 1987) based on the $D_s$-criterion; by Krolikowski and Eykhoff (1984) based on the idea of the alternative instrumental determinant ratio (AIDR) test; and by Uosaki and Hatanaka (1987) based on the Kullback-Leibler discrimination information measure (Kullback (1959)). In this paper, we consider the input design problem for general linear stochastic model discrimination based on the Kullback-Leibler discrimination information measure. An optimal input is derived which maximizes the time increment of the Kullback-Leibler discrimination information measure under the constraint on input amplitude.

Fig. 1. Identification process (Zarrop (1979)). Flow diagram; recoverable labels: a priori knowledge, experimental goal, determination of model structure.

MODELS AND KULLBACK-LEIBLER DISCRIMINATION INFORMATION MEASURE

Consider a linear stochastic model with controllable input signal $\{u_t\}$,

$$y_t = a_1 y_{t-1} + \cdots + a_p y_{t-p} + b_1 u_{t-1} + \cdots + b_q u_{t-q} + \varepsilon_t, \tag{1}$$

where $\varepsilon_t$ is independently normally distributed with mean zero and constant variance $\sigma^2$. We assume the system is stable. We consider the problem of finding an input $u_t$ which efficiently determines the order of the model under the input constraint

$$|u_t| \le c, \tag{2}$$

where $c$ is a given constant. The order determination problem can be restated as a model discrimination problem, that is, discriminating one model among the following two rival models:

$$M_1:\quad y_t = a^{(1)}_1 y_{t-1} + \cdots + a^{(1)}_{p(1)} y_{t-p(1)} + b^{(1)}_1 u_{t-1} + \cdots + b^{(1)}_{q(1)} u_{t-q(1)} + \varepsilon^{(1)}_t = a^{(1)T} y^{t-1}_{t-n} + b^{(1)T} u^{t-1}_{t-m} + \varepsilon^{(1)}_t, \tag{3a}$$

$$M_2:\quad y_t = a^{(2)}_1 y_{t-1} + \cdots + a^{(2)}_{p(2)} y_{t-p(2)} + b^{(2)}_1 u_{t-1} + \cdots + b^{(2)}_{q(2)} u_{t-q(2)} + \varepsilon^{(2)}_t = a^{(2)T} y^{t-1}_{t-n} + b^{(2)T} u^{t-1}_{t-m} + \varepsilon^{(2)}_t, \tag{3b}$$

where $(p(1),q(1)) \ne (p(2),q(2))$, $\varepsilon^{(j)}_t$ is an independent normal random variable with mean zero and variance $\sigma^2(j)$ $(j=1,2)$, respectively, and

$$a^{(1)} = (a^{(1)}_1, a^{(1)}_2, \ldots, a^{(1)}_{p(1)})^T, \qquad b^{(1)} = (b^{(1)}_1, b^{(1)}_2, \ldots, b^{(1)}_{q(1)})^T,$$

$$a^{(2)} = (a^{(2)}_1, a^{(2)}_2, \ldots, a^{(2)}_{p(2)}, 0, \ldots, 0)^T, \qquad b^{(2)} = (b^{(2)}_1, b^{(2)}_2, \ldots, b^{(2)}_{q(2)}, 0, \ldots, 0)^T,$$

$$y^t_k = (y_t, y_{t-1}, \ldots, y_k)^T, \qquad u^t_k = (u_t, u_{t-1}, \ldots, u_k)^T,$$

and we can assume without loss of generality

$$n = p(1) \ge p(2), \qquad m = q(1) \ge q(2).$$

Remark 1. By imagining the case of approximating a higher order system by a lower order one, we can see that, in general,

$$a^{(1)}_k \ne a^{(2)}_k \quad (k=1,2,\ldots,p(2)), \qquad b^{(1)}_k \ne b^{(2)}_k \quad (k=1,2,\ldots,q(2)).$$

Hence, it should be noted that the settings are different from the previous model discrimination problems (Uosaki, Tanaka and Sugiyama (1984, 1987), Krolikowski and Eykhoff (1984)).

As a measure of distance between these two models, we employ the Kullback-Leibler discrimination information (KLDI) measure. The KLDI measure for discriminating in favor of the model $M_1$ over the model $M_2$ may be defined by

$$I_t[1:2;y^t] = \int p_1(y^t \mid u^{t-1}) \log \frac{p_1(y^t \mid u^{t-1})}{p_2(y^t \mid u^{t-1})}\, dy^t, \tag{4}$$

where $p_j(y^t \mid u^{t-1})$ is the probability density function of $y^t = (y_t, y_{t-1}, \ldots, y_1)^T$ given $u^{t-1} = (u_{t-1}, u_{t-2}, \ldots, u_1)^T$ under the model $M_j$ $(j=1,2)$, respectively, and it has the following properties:

$$I_t[i:j;y^t] \ge 0, \tag{5a}$$

$$I_t[i:j;y^t] = 0 \ \text{ if and only if } \ p_i(y^t \mid u^{t-1}) = p_j(y^t \mid u^{t-1}) \ \text{(models are identical)}. \tag{5b}$$

These facts suggest that it becomes easier to determine the model as the KLDI $I_t[1:2;y^t]$ becomes larger. Therefore it is natural to find an input which maximizes $I_t[1:2;y^t]$, or maximizes the time increment of the KLDI defined by

$$\Delta I_t[1:2;y^t] = I_t[1:2;y^t] - I_{t-1}[1:2;y^{t-1}], \tag{6}$$

in order to discriminate the models efficiently. Application of the chain rule

$$p_j(y^t \mid u^{t-1}) = p_j(y_t \mid y^{t-1}, u^{t-1})\, p_j(y^{t-1} \mid u^{t-2})$$

gives

$$\Delta I_t[1:2;y^t] = E_1\left[\int p_1(y_t \mid y^{t-1}, u^{t-1}) \log \frac{p_1(y_t \mid y^{t-1}, u^{t-1})}{p_2(y_t \mid y^{t-1}, u^{t-1})}\, dy_t\right],$$

where $E_1[\cdot]$ denotes the expectation with respect to the probability density function of $y^{t-1}$ given $u^{t-1}$ under the model $M_1$. Here, we have used the fact that $u_{t-1}$ does not affect $y^{t-1}$. In the following, we will focus only on the time increment of the KLDI $\Delta I_t[1:2;y^t]$ and seek an input $u_{t-1}$ which maximizes this criterion under the input constraint (2).

INPUT DESIGN FOR MODEL DISCRIMINATION

By the assumption of linearity of the system (1) and normality of $\varepsilon_t$, the probability distributions of $y^t$ given $u^{t-1}$, and of $y_t$ given $y^{t-1}$ and $u^{t-1}$, under the model $M_j$ $(j=1,2)$ are also normal, i.e.,

$$p_j(y^t \mid u^{t-1}) = (2\pi)^{-t/2}\, |\Sigma^t(j)|^{-1/2} \exp\left(-\tfrac{1}{2}\,(y^t - \mu^t(j))^T\, \Sigma^t(j)^{-1}\, (y^t - \mu^t(j))\right), \tag{10a}$$

$$p_j(y_t \mid y^{t-1}, u^{t-1}) = (2\pi\sigma^2(j))^{-1/2} \exp\left(-\frac{(y_t - m_t(j))^2}{2\sigma^2(j)}\right), \tag{10b}$$

where $\mu^t(j)$ and $\Sigma^t(j)$ are the mean vector and covariance matrix of $y^t$ given $u^{t-1}$ under $M_j$, and

$$m_t(j) = E_j[y_t \mid y^{t-1}, u^{t-1}], \quad (j=1,2).$$
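As a numerical illustration of properties (5a) and (5b), the following minimal sketch evaluates the integral in definition (4) by quadrature for two univariate normal densities of the form (10b), and compares it with the well-known closed form for the Kullback-Leibler information between two univariate normals. The function names, grid settings and numerical values are assumptions made for this illustration only; they do not come from the paper.

```python
import numpy as np


def normal_pdf(y, mean, var):
    """Univariate normal density, as in (10b)."""
    return np.exp(-(y - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


def kldi_numeric(m1, v1, m2, v2, half_width=12.0, n=200001):
    """Approximate I[1:2] = integral of p1*log(p1/p2) by the trapezoidal rule."""
    s = np.sqrt(max(v1, v2))
    y = np.linspace(min(m1, m2) - half_width * s,
                    max(m1, m2) + half_width * s, n)
    p1 = normal_pdf(y, m1, v1)
    # The log ratio is written analytically to avoid 0/0 far in the tails.
    log_ratio = (0.5 * np.log(v2 / v1)
                 - (y - m1) ** 2 / (2.0 * v1)
                 + (y - m2) ** 2 / (2.0 * v2))
    integrand = p1 * log_ratio
    dy = y[1] - y[0]
    return dy * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


def kldi_closed_form(m1, v1, m2, v2):
    """Closed form for two univariate normals N(m1, v1) and N(m2, v2)."""
    return 0.5 * np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / (2.0 * v2) - 0.5


# Hypothetical values: identical models give zero, different models a positive value.
same = kldi_numeric(0.0, 1.0, 0.0, 1.0)
diff_numeric = kldi_numeric(0.5, 1.0, 0.0, 2.0)
diff_closed = kldi_closed_form(0.5, 1.0, 0.0, 2.0)
print(same, diff_numeric, diff_closed)
```

The measure is not symmetric in its two arguments, which is why the text speaks of discriminating in favor of $M_1$ over $M_2$.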

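Using (10b), we have

$$\int p_1(y_t \mid y^{t-1}, u^{t-1}) \log \frac{p_1(y_t \mid y^{t-1}, u^{t-1})}{p_2(y_t \mid y^{t-1}, u^{t-1})}\, dy_t = \frac{1}{2}\log\frac{\sigma^2(2)}{\sigma^2(1)} + \frac{\sigma^2(1) - \sigma^2(2)}{2\sigma^2(2)} + \frac{(m_t(1) - m_t(2))^2}{2\sigma^2(2)}. \tag{11}$$

We can evaluate the expectation of the last term in (11) with respect to the probability density function of $y^{t-1}$ given $u^{t-1}$ under the model $M_1$ as

$$E_1\left[\frac{(m_t(1) - m_t(2))^2}{2\sigma^2(2)}\right] = \frac{1}{2\sigma^2(2)}\, E_1\left[\left\{(a^{(1)T} y^{t-1}_{t-n} + b^{(1)T} u^{t-1}_{t-m}) - (a^{(2)T} y^{t-1}_{t-n} + b^{(2)T} u^{t-1}_{t-m})\right\}^2\right]$$

$$= \frac{1}{2\sigma^2(2)}\left[\operatorname{tr}\left\{\left(\Sigma^{t-1}_{t-n}(1) + \mu^{t-1}_{t-n}(1)\,\mu^{t-1}_{t-n}(1)^T\right)\delta a\,\delta a^T\right\} + 2\,\delta a^T \mu^{t-1}_{t-n}(1)\,\delta b^T u^{t-1}_{t-m} + (u^{t-1}_{t-m})^T \delta b\,\delta b^T u^{t-1}_{t-m}\right], \tag{12}$$

where $\mu^{t-1}_{t-n}(1)$ and $\Sigma^{t-1}_{t-n}(1)$ are the mean vector and covariance matrix of $y^{t-1}_{t-n}$ under $M_1$, and

$$\delta a = a^{(1)} - a^{(2)} = (a^{(1)}_1 - a^{(2)}_1, \ldots, a^{(1)}_{p(2)} - a^{(2)}_{p(2)}, a^{(1)}_{p(2)+1}, \ldots, a^{(1)}_n)^T,$$

$$\delta b = b^{(1)} - b^{(2)} = (b^{(1)}_1 - b^{(2)}_1, \ldots, b^{(1)}_{q(2)} - b^{(2)}_{q(2)}, b^{(1)}_{q(2)+1}, \ldots, b^{(1)}_m)^T.$$

Thus we can express $\Delta I_t[1:2;y^t]$ as

$$\Delta I_t[1:2;y^t] = \frac{1}{2\sigma^2(2)}\left\{(u^{t-1}_{t-m})^T \delta b\,\delta b^T u^{t-1}_{t-m} + 2\,(u^{t-1}_{t-m})^T \delta b\,\delta a^T \mu^{t-1}_{t-n}(1)\right\} + \Lambda, \tag{13}$$

where

$$\Lambda = \frac{1}{2\sigma^2(2)} \operatorname{tr}\left[\left(\Sigma^{t-1}_{t-n}(1) + \mu^{t-1}_{t-n}(1)\,\mu^{t-1}_{t-n}(1)^T\right)\delta a\,\delta a^T\right] + \frac{\sigma^2(1) - \sigma^2(2)}{2\sigma^2(2)} - \frac{1}{2}\log\frac{\sigma^2(1)}{\sigma^2(2)}.$$

Here we determine only the input $u_{t-1}$ which maximizes the criterion (6), since $u_{t-2}, \ldots, u_{t-m}$ have already been determined. We can rewrite (13) as a quadratic function of $u_{t-1}$, that is,

$$\Delta I_t[1:2;y^t] = \frac{1}{2\sigma^2(2)}\left\{\delta b_1^2\, u_{t-1}^2 + 2\,T\,u_{t-1} + \Big(\sum_{j=2}^{m} \delta b_j u_{t-j}\Big)^2 + 2\Big(\sum_{j=2}^{m} \delta b_j u_{t-j}\Big)\Big(\sum_{j=1}^{n} \delta a_j m_{t-j}(1)\Big)\right\} + \Lambda, \tag{14}$$

where

$$T = \delta b_1\left(\sum_{j=2}^{m} \delta b_j u_{t-j} + \sum_{j=1}^{n} \delta a_j m_{t-j}(1)\right), \qquad \delta a_j = a^{(1)}_j - a^{(2)}_j, \qquad \delta b_j = b^{(1)}_j - b^{(2)}_j,$$

and $m_{t-j}(1)$ denotes the element of $\mu^{t-1}_{t-n}(1)$ corresponding to $y_{t-j}$. Since the coefficient of $u_{t-1}^2$ is nonnegative, the maximum over $|u_{t-1}| \le c$ is reached at an end point. It is easy to see that under the input constraint (2) the time increment of the KLDI $\Delta I_t[1:2;y^t]$ attains its maximum if we select the input $u_{t-1}$ as

$$u^o_{t-1} = c\,\operatorname{Sgn}(T). \tag{15}$$

As seen above, the optimal input $u^o_{t-1}$ depends on the true system, i.e., on the true values of $\delta a$, $\delta b$ and $m_{t-j}(1)$ $(j=1,\ldots,n)$. This means that, in practice, we must construct the optimal input based on their estimates obtained by, for example, the recursive least squares estimation method.

Remark 2. The input derived here maximizes the time increment of the KLDI, and is of course not optimal in maximizing the KLDI for the whole time period. However, it can be shown, as in Krolikowski and Eykhoff (1985), by a dynamic programming formulation with some approximation that the input maximizes the KLDI for the whole time period.

NUMERICAL EXAMPLE

Consider the problem of determining whether the order of a given model is (1,2) or (1,1) under the input constraint $|u_t| \le 1$. Here the true system is assumed to be of order (1,2) and is given by

$$y_t = 0.5\,y_{t-1} + 1.0\,u_{t-1} + b_2 u_{t-2} + \varepsilon_t,$$

where $\varepsilon_t$ is independently normally distributed with zero mean and unit variance, and the parameter $b_2$ is chosen as

(Case 1) $b_2 = -0.1$

(Case 2) $b_2 = -0.3$

Figure 2 shows an input sequence obtained by the proposed method and an RBS (random binary signal) input for Case 1. Here, as a criterion of model order determination, we employ Akaike's Information Criterion (AIC) defined by

$$\mathrm{AIC}(p,q) = \log(\mathrm{RSS}(p,q)) + 2(p+q+1),$$

where RSS(p,q) denotes the residual sum of squares for the model of order (p,q). By applying AIC, an estimate of the model order (p,q) is obtained as the value of p and q minimizing AIC(p,q). Here, it is known that p=1. Then only the order q should be estimated by

$$\hat{q} = \arg\min_q \mathrm{AIC}(1,q).$$

Fig. 2. Input sequences for identification (from t=500 to t=900): (a) proposed input; (b) RBS input.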
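The sign rule (15) behind the proposed input can be sketched as follows. This is only an illustration under stated assumptions: $T$ is computed as the coefficient of the linear term in $u_{t-1}$ obtained by expanding the quadratic form (13), the true differences $\delta a$, $\delta b$ and the means of the past outputs are treated as known (in practice they would be replaced by recursive estimates), and all function names and numerical values are hypothetical.

```python
import numpy as np


def input_dependent_part(u_new, past_u, delta_a, delta_b, mean_y):
    """Input-dependent part of (13), up to the positive factor 1/(2*sigma^2(2)).

    u_new   : candidate value of u_{t-1}
    past_u  : already fixed inputs (u_{t-2}, ..., u_{t-m})
    mean_y  : means of (y_{t-1}, ..., y_{t-n}) under model M1
    """
    u_vec = np.concatenate(([u_new], past_u))
    bu = delta_b @ u_vec
    return bu ** 2 + 2.0 * bu * (delta_a @ mean_y)


def sign_rule_input(past_u, delta_a, delta_b, mean_y, c):
    """Return c*sgn(T), where T multiplies u_{t-1} in the expanded quadratic."""
    T = delta_b[0] * (delta_b[1:] @ past_u + delta_a @ mean_y)
    return c if T >= 0 else -c


# Hypothetical numbers for n = 1, m = 2 and amplitude bound c = 1.
delta_a = np.array([0.1])
delta_b = np.array([0.2, -0.1])
past_u = np.array([1.0])
mean_y = np.array([0.8])
c = 1.0

u_star = sign_rule_input(past_u, delta_a, delta_b, mean_y, c)

# Brute-force check: no admissible input on a grid does better than u_star.
grid = np.linspace(-c, c, 201)
best_on_grid = max(input_dependent_part(u, past_u, delta_a, delta_b, mean_y)
                   for u in grid)
value_at_u_star = input_dependent_part(u_star, past_u, delta_a, delta_b, mean_y)
print(u_star, value_at_u_star, best_on_grid)
```

Because the quadratic in $u_{t-1}$ is convex, only the two end points $\pm c$ need to be compared, which is why the resulting input is a binary sequence like the one in Fig. 2(a).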

Hence, in the model estimation using AIC, it becomes easier to determine q if the differences of the value of AIC(1,q) from the adjacent values AIC(1,q-1) and AIC(1,q+1) are large, and if the difference AIC(1,q-1)-AIC(1,q) is positive and AIC(1,q)-AIC(1,q+1) is negative. Based on this observation, the relative difference of AIC at the true order q=2,

$$\Delta\mathrm{AIC}(1,2) = \frac{\mathrm{AIC}(1,1) - \mathrm{AIC}(1,2)}{\mathrm{AIC}(1,2)},$$

shows the discrimination ability for the model order in this case. Figure 3 shows the relative difference $\Delta\mathrm{AIC}(1,2)$ and the values of AIC(1,1) and AIC(1,2) for the proposed input $u^o_{t-1}$ and the RBS input for Case 1.

It is seen that, in this case, $\Delta\mathrm{AIC}(1,2)$ for the proposed input takes a large positive value as the number of observations increases. On the other hand, $\Delta\mathrm{AIC}(1,2)$ for the RBS input is negative. This implies that the proposed input gives the true model of order (1,2) correctly, while the RBS input never gives the true model. The corresponding results for Case 2 are shown in Fig. 4. In this case, though both inputs give the true model of order (1,2), the value of $\Delta\mathrm{AIC}(1,2)$ is much bigger for the proposed input than for the RBS input. Thus, the proposed input makes discrimination of the true model clearer and easier than random input. Simulation results for other true models, not given here, show the same characteristics in the behavior of $\Delta\mathrm{AIC}$ and indicate the applicability of the proposed input for general linear stochastic model discrimination.
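The order determination step can be sketched in code as follows, for the RBS input only. The sketch simulates the Case 2 system, fits models of order (1,1) and (1,2) by ordinary least squares, and evaluates AIC and $\Delta\mathrm{AIC}(1,2)$. Several choices are assumptions of this illustration and not taken from the paper: the AIC is computed in the common least-squares form $N\log(\mathrm{RSS}/N) + 2(p+q+1)$, the seed and the batch (rather than recursive) least-squares fit are arbitrary, and the function names are hypothetical. The numbers it prints are therefore not expected to reproduce Figs. 3 and 4.

```python
import numpy as np


def simulate(u, b2, rng):
    """y_t = 0.5*y_{t-1} + 1.0*u_{t-1} + b2*u_{t-2} + e_t, with e_t ~ N(0, 1)."""
    y = np.zeros(len(u))
    e = rng.standard_normal(len(u))
    for t in range(2, len(u)):
        y[t] = 0.5 * y[t - 1] + 1.0 * u[t - 1] + b2 * u[t - 2] + e[t]
    return y


def rss_arx(y, u, p, q, start):
    """Residual sum of squares of the least-squares fit of order (p, q)."""
    X = np.array([[y[t - i] for i in range(1, p + 1)] +
                  [u[t - j] for j in range(1, q + 1)]
                  for t in range(start, len(y))])
    target = y[start:]
    theta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ theta
    return float(resid @ resid)


def aic(rss, n_obs, p, q):
    """Assumed least-squares form of AIC: N*log(RSS/N) + 2*(p + q + 1)."""
    return n_obs * np.log(rss / n_obs) + 2 * (p + q + 1)


rng = np.random.default_rng(0)
N = 2000
u = rng.choice([-1.0, 1.0], size=N)   # random binary signal with |u_t| = 1
y = simulate(u, b2=-0.3, rng=rng)     # Case 2 of the example

start = 3                              # same rows for both fits
n_obs = N - start
rss11 = rss_arx(y, u, 1, 1, start)
rss12 = rss_arx(y, u, 1, 2, start)
aic11 = aic(rss11, n_obs, 1, 1)
aic12 = aic(rss12, n_obs, 1, 2)
delta_aic = (aic11 - aic12) / aic12    # relative difference at the true order
print(aic11, aic12, delta_aic)
```

Both fits use the same rows so that the residual sums of squares are comparable; with nested regressors the (1,2) fit can never have the larger RSS, and the AIC penalty term decides whether the extra parameter is worth keeping.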

Fig. 3. Behaviors of order determination criteria for Case 1 (horizontal axis: t, from 0 to 2000): (a) relative difference of AIC(1,1) and AIC(1,2); (b) AIC for model orders (1,1) and (1,2) by proposed input; (c) AIC for model orders (1,1) and (1,2) by RBS input.

Remark 3. Since AIC for model $M_j$ is an unbiased estimate of minus twice the average mean log-likelihood (Akaike (1974)),

$$S(p^*, p_j) = \int p^*(x) \log p_j(x)\, dx,$$

the difference of AIC for models $M_2$ and $M_1$ is an unbiased estimate of twice

$$\int p^*(y^{t-1}) \log \frac{p_1(y^{t-1})}{p_2(y^{t-1})}\, dy^{t-1},$$

where $p^*(y^{t-1})$ is the probability density function of $y^{t-1}$ corresponding to the true model. Hence, if model $M_1$ is the true model, the difference of the AICs becomes an unbiased estimator of twice the KLDI $I_{t-1}[1:2;y^{t-1}]$.

Fig. 4. Behaviors of order determination criteria for Case 2 (horizontal axis: t, from 0 to 2000): (a) relative difference of AIC(1,1) and AIC(1,2); (b) AIC for model orders (1,1) and (1,2) by proposed input; (c) AIC for model orders (1,1) and (1,2) by RBS input.

CONCLUSIONS

The optimal input design problem under an input constraint is discussed in this paper for efficient discrimination of general linear stochastic models. An optimal input is derived which maximizes the time increment of the Kullback-Leibler discrimination information (KLDI) measure and may yield a big value of the KLDI, making the difference between the models clearer. The input can be constructed by using the recursive estimates of the model parameters. Simulation studies indicate that the proposed input makes discrimination of the true model against the rival model easier compared to random input. It should be noted that an optimal input can be obtained similarly under Kullback's divergence criterion (Uosaki and Hatanaka (1988)). Since the input derived here is obtained by maximizing only the time increment of the KLDI, it is not optimal for the whole time period. So, the optimal input design for model discrimination in the frequency domain is now under investigation for long time period optimization.

REFERENCES

Akaike, H. (1974). A new look at the statistical model identification, IEEE Trans. Auto. Contr., AC-19, 716-723.
Aoki, M. and R. M. Staley (1970). On input synthesis in parameter identification, Automatica, 6, 431-440.
Astrom, K. J. and P. Eykhoff (1970). System identification - a survey, Automatica, 7, 123-162.
Goodwin, G. C. and R. L. Payne (1977). Dynamic System Identification, Academic Press, New York.
Krolikowski, A. and P. Eykhoff (1984). Aspects of input signal design for model order and parameter estimation in linear dynamical systems, In L. Ljung and K. J. Astrom (Ed.), Identification, Adaptive and Stochastic Control, Pergamon Press, Oxford, 735-740.
Krolikowski, A. and P. Eykhoff (1985). Input signal design for system identification: A comparative analysis, IFAC Identification and System Parameter Estimation 1985, Pergamon Press, Oxford, 915-920.
Kullback, S. (1959). Information Theory and Statistics, J. Wiley, New York.
Ljung, L. (1987). System Identification: Theory for the User, Prentice-Hall, New Jersey.
Mehra, R. K. (1974a). Optimal input signals for parameter estimation in dynamic systems - survey and new results, IEEE Trans. Auto. Contr., AC-19, 753-768.
Mehra, R. K. (1974b). Optimal input for linear system identification, IEEE Trans. Auto. Contr., AC-19, 192-200.
Ng, T. S., Z. H. Qureshi and Y. C. Cheah (1984). Optimal input design for an AR model with output constraints, Automatica, 20, 739-742.
Silvey, S. D. (1980). Optimal Design, Chapman and Hall, London.
Soderstrom, T. and P. G. Stoica (1983). Instrumental Variable Methods for System Identification, Springer-Verlag, Berlin.
Stoica, P. G. and T. Soderstrom (1982). A useful input parameterization for optimal experiment design, IEEE Trans. Auto. Contr., AC-27, 986-989.
Uosaki, K. and T. Hatanaka (1987). Optimal input design for autoregressive model discrimination based on Kullback-Leibler discrimination information, Preprints 10th IFAC World Congress on Automatic Control, 10, 376-380.
Uosaki, K. and T. Hatanaka (1988). Optimal input for autoregressive model discrimination based on Kullback's divergence, In M. Iri (Ed.), System Modelling and Optimization, Springer-Verlag, Berlin (in press).
Uosaki, K., I. Tanaka and H. Sugiyama (1984). Optimal input design for autoregressive model discrimination with constrained output variance, IEEE Trans. Auto. Contr., AC-29, 348-350.
Uosaki, K., I. Tanaka and H. Sugiyama (1987). Optimal input for discrimination of autoregressive models under output amplitude constraints, Int. J. Systems Sci., 18, 323-332.
Wahba, G. (1980). Parameter estimation in dynamic systems, IEEE Trans. Auto. Contr., AC-25, 235-238.
Zarrop, M. B. (1979). Optimal Experiment Design for Dynamic System Identification, Springer, Berlin.