
Copyright © IFAC 12th Triennial World Congress, Sydney, Australia. 1993

TRACKING THE OUTPUT OF A POORLY KNOWN LINEAR SYSTEM

R. Rishel
Department of Mathematics, University of Kentucky, Lexington, KY 40506, USA

Abstract. Tracking the output of a linear stochastic system whose parameters are unknown is considered. For a long-term discounted quadratic cost, the optimal control is shown to depend on estimates of the state of the linear system, computed as if the parameters were known, and on the conditional density of the unknown parameters.

I. PROBLEM FORMULATION

Consider the problem of tracking the scalar noisy output of a poorly known linear system. The n-dimensional linear system X_t has the form

    dX_t = A X_t dt + σ dW_t    (1)

with scalar output Y_t satisfying

    dY_t = h X_t dt + dV_t    (2)

where A, σ, and h are respectively n × n, n × m, and 1 × n matrices, and W_t and V_t are m-dimensional and 1-dimensional Wiener processes. The initial vector X_0 is normally distributed with mean μ and covariance matrix Σ.

The system is poorly known in that the parameters of the problem, that is the vector θ of matrices θ = (A, σ, h, μ, Σ), are unknown; however, enough is known about them to consider them as random quantities with a given prior probability density P(θ).

The problem is to choose a control u(t), based on the past measurements of Y_t, so that ∫₀ᵗ u(s) ds tracks Y_t. The error process Z_t satisfies

    dZ_t = dY_t − u_t dt.    (3)

It is desired to choose the control u(t) to minimize the expected long-term discounted combination of squared tracking error and control energy,

    E[ ∫₀^∞ e^(−βt) ( Z_t² + λ u_t² ) dt ].    (4)

II. DETERMINING THE OPTIMAL CONTROL

Let

    M_t = σ{ Y_s ; 0 ≤ s ≤ t }    (5)

denote the σ-field of measurements of the output of the linear system. The following theorem gives estimation results for the linear system and is critical in determining the optimal control.

Theorem I: The joint conditional density of the current state X_t and the parameters θ of the linear system, given the field M_t of past measurements of the output, is given by

    p(x, θ | M_t) = n(x; x̂_t(θ), Σ_t(θ)) π_t(θ) P(θ),    (6)

where n(x; x̄, Σ̄) is the normal density with mean x̄ and covariance matrix Σ̄, x̂_t(θ) is the solution of the Kalman filter equations

    dx̂_t(θ) = A x̂_t(θ) dt + Σ_t(θ) h′ [ dY_t − h x̂_t(θ) dt ];    x̂_0(θ) = μ,    (7)

Σ_t(θ) is the solution of the matrix Riccati equation

    dΣ_t(θ)/dt = A Σ_t(θ) + Σ_t(θ) A′ + σσ′ − Σ_t(θ) h′ h Σ_t(θ),    (8)

    Σ_0(θ) = Σ,    (9)

and π_t(θ) is given by

    π_t(θ) = exp( ∫₀ᵗ h x̂_s(θ) dY_s − ½ ∫₀ᵗ |h x̂_s(θ)|² ds ) / ∫ exp( ∫₀ᵗ h x̂_s(θ) dY_s − ½ ∫₀ᵗ |h x̂_s(θ)|² ds ) P(θ) dθ.    (10)

It is not difficult to obtain Theorem I using the Kallianpur–Striebel formula and a double conditioning with respect to both the measurements M_t and the unknown parameters θ.

Using Theorem I we can now state the form of the optimal control.

Theorem II: The optimal control for the tracking problem is given by

    u_t = λ⁻¹ ( K Z_t + J_t ),    (11)

where K is the positive root of the quadratic equation

    K² + βλK − λ = 0,    (12)

and J_t is given by

    J_t = −K ∫ h [ A − (β + λ⁻¹K) I ]⁻¹ x̂_t(θ) π_t(θ) P(θ) dθ.    (13)

Before giving the sketch of the proof of Theorem II, let us consider what is needed to implement its control; that is, what is involved in calculating J_t? If we define

    Λ_t(θ) = exp( ∫₀ᵗ h x̂_s(θ) dY_s − ½ ∫₀ᵗ |h x̂_s(θ)|² ds ),    (14)

a well known calculation with Itô's rule shows that Λ_t(θ) satisfies

    dΛ_t(θ) = Λ_t(θ) h x̂_t(θ) dY_t;    Λ_0(θ) = 1.    (15)

Theoretically, the equations (7) and (15) for x̂_t(θ) and Λ_t(θ) would be solved simultaneously for each value of the parameters θ; then π_t(θ) would be formed as in formula (10), that is π_t(θ) = Λ_t(θ) / ∫ Λ_t(θ) P(θ) dθ, which involves an integration over θ; and finally J_t in (13) involves another integration over θ. In practice a finite set of parameter values θ would be chosen, the equations (7) and (15) solved for these values, and the two integrations approximated by summations. While this is computationally intensive, it should be feasible.

III. SKETCH OF THE PROOF OF THEOREM II

Because of space requirements we will only give a sketch of the proof of Theorem II.

The innovations process ν_t for the system is defined by

    dν_t = dY_t − ( ∫ h x̂_t(θ) π_t(θ) P(θ) dθ ) dt.    (16)

From (10), Itô's formula, and an interchange of stochastic and ordinary integration, it can be shown that π_t(θ) satisfies

    dπ_t(θ) = π_t(θ) [ h x̂_t(θ) − ∫ π_t(θ) h x̂_t(θ) P(θ) dθ ] dν_t.    (17)
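The weight process π_t(θ) above is exactly what the finite-parameter implementation described at the end of Section II must compute. As a concrete illustration (the scalar model, the parameter grid, the Euler–Maruyama discretization, and all numerical values below are assumptions for this sketch, not from the paper), the following propagates a bank of Kalman filters (7)–(8) together with the log-likelihoods (14)–(15) on a grid of candidate drift parameters and forms the posterior weights by summation, the discrete analogue of (10):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar example (n = m = 1); every numerical value is an assumption.
a_true, sig, h = -1.0, 0.5, 1.0      # true drift, noise gain, observation coefficient
mu, Sig0 = 0.0, 1.0                  # prior mean and covariance of X_0
dt, T = 1e-3, 20.0

# Finite grid of candidate drift parameters with a uniform prior P(theta).
a_grid = np.array([-2.0, -1.0, 0.0, 1.0])
prior = np.full(len(a_grid), 1.0 / len(a_grid))

xhat = np.full(len(a_grid), mu)      # conditional means, one filter per grid point, eq. (7)
Sig = np.full(len(a_grid), Sig0)     # conditional covariances, eq. (8)
logL = np.zeros(len(a_grid))         # log Lambda_t(theta), eqs. (14)-(15)

x = mu + np.sqrt(Sig0) * rng.standard_normal()   # hidden state X_t
for _ in range(int(T / dt)):
    dW = np.sqrt(dt) * rng.standard_normal()
    dV = np.sqrt(dt) * rng.standard_normal()
    dY = h * x * dt + dV                          # observation increment, eq. (2)
    x += a_true * x * dt + sig * dW               # state increment, eq. (1)
    for i, a in enumerate(a_grid):
        # Ito integral uses the pre-update estimate:
        # d log Lambda = h*xhat dY - (1/2)(h*xhat)^2 dt, from (14)-(15).
        logL[i] += h * xhat[i] * dY - 0.5 * (h * xhat[i]) ** 2 * dt
        # Euler step of the Kalman filter (7) and the Riccati equation (8).
        xhat[i] += a * xhat[i] * dt + Sig[i] * h * (dY - h * xhat[i] * dt)
        Sig[i] += (2 * a * Sig[i] + sig ** 2 - (Sig[i] * h) ** 2) * dt

# Posterior weights: the summation analogue of pi_t(theta) P(theta) in (10).
w = np.exp(logL - logL.max()) * prior
w /= w.sum()
print(dict(zip(a_grid.tolist(), np.round(w, 3).tolist())))
```

With enough data the weight of the grid point nearest the true parameter should come to dominate; the cost of the scheme grows linearly with the size of the grid, which is the computational burden the paper notes.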

From this and Itô's formula it can be shown that

    d( x̂_t(θ) π_t(θ) ) = A x̂_t(θ) π_t(θ) dt + Γ_t(θ) dν_t,    (18)

where

    Γ_t(θ) = Σ_t(θ) h′ π_t(θ) + x̂_t(θ) π_t(θ) [ h x̂_t(θ) − ∫ π_t(θ) h x̂_t(θ) P(θ) dθ ].    (19)

Then (13) and (18) imply

    dJ_t = (β + λ⁻¹K) J_t dt − K ( ∫ h x̂_t(θ) π_t(θ) P(θ) dθ ) dt + Ψ_t dν_t,    (20)

where

    Ψ_t = −K ∫ h [ A − (β + λ⁻¹K) I ]⁻¹ Γ_t(θ) P(θ) dθ.    (21)
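A small numerical aside before the optimization argument (the values of β and λ below are arbitrary illustrative assumptions): the constant K is the positive root of the quadratic (12), and dividing (12) by λ gives 1 − βK − λ⁻¹K² = 0, which is exactly the cancellation of the Z_t² terms that drives the completion-of-square step that follows:

```python
import math

beta, lam = 0.1, 2.0   # discount rate and control weight; illustrative assumptions

# Positive root of K^2 + beta*lam*K - lam = 0, equation (12).
K = (-beta * lam + math.sqrt((beta * lam) ** 2 + 4 * lam)) / 2.0

print(K)                               # positive by construction
print(1.0 - beta * K - K ** 2 / lam)   # ~0: the Z_t^2 coefficient that vanishes
```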

Rewriting the error equation (3) in terms of the innovations (16) as

    dZ_t = ( ĥ_t − u_t ) dt + dν_t,  where ĥ_t = ∫ h x̂_t(θ) π_t(θ) P(θ) dθ,    (22)

and armed with the equation (20) for dJ_t, we are in a position to begin the optimization argument. Let u(t) be any control and Z_t the corresponding solution of (22). Consider the quantity

    e^(−βt) ( K Z_t² + 2 J_t Z_t ),    (23)

where K is the positive root of (12) and J_t is given by (13). Forming the differential below and using (20), (22), and Itô's rule,

    d[ e^(−βt) ( K Z_t² + 2 J_t Z_t ) ] + e^(−βt) ( Z_t² + λ u_t² ) dt
        = e^(−βt) { (1 − βK) Z_t² + 2 λ⁻¹K J_t Z_t + λ u_t² − 2 ( K Z_t + J_t ) u_t
          + 2 J_t ĥ_t + K + 2 Ψ_t } dt + 2 e^(−βt) ( K Z_t + J_t + Z_t Ψ_t ) dν_t.    (24)

Now since the minimum over u of λ u² − 2 ( K Z_t + J_t ) u is attained by u = λ⁻¹ ( K Z_t + J_t ) and equals

    −λ⁻¹ K² Z_t² − 2 λ⁻¹ K Z_t J_t − λ⁻¹ J_t²,    (25)

substituting (25) for the terms involving u_t on the right hand side of (24) and using (12), by which the resulting coefficient 1 − βK − λ⁻¹K² of Z_t² vanishes, implies

    d[ e^(−βt) ( K Z_t² + 2 J_t Z_t ) ] + e^(−βt) ( Z_t² + λ u_t² ) dt
        ≥ e^(−βt) ( 2 J_t ĥ_t − λ⁻¹ J_t² + K + 2 Ψ_t ) dt + 2 e^(−βt) ( K Z_t + J_t + Z_t Ψ_t ) dν_t.    (26)

Integrating both sides of (26) from 0 to T and taking expected values, using that the expected value of the stochastic integral is zero, gives

    E[ e^(−βT) ( K Z_T² + 2 J_T Z_T ) ] − E[ K Z_0² + 2 J_0 Z_0 ] + E[ ∫₀ᵀ e^(−βt) ( Z_t² + λ u_t² ) dt ]
        ≥ E[ ∫₀ᵀ e^(−βt) ( 2 J_t ĥ_t − λ⁻¹ J_t² + K + 2 Ψ_t ) dt ].    (27)

Letting T approach infinity gives

    E[ ∫₀^∞ e^(−βt) ( Z_t² + λ u_t² ) dt ]
        ≥ E[ K Z_0² + 2 J_0 Z_0 ] + E[ ∫₀^∞ e^(−βt) ( 2 J_t ĥ_t − λ⁻¹ J_t² + K + 2 Ψ_t ) dt ],    (28)

in which the right hand side does not depend on the control. Now if the argument above is repeated with the control u_t = λ⁻¹ ( K Z_t + J_t ), equality holds in (26), (27), and (28), showing that this control is optimal.
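To close, the pieces above can be combined into an end-to-end simulation of the adaptive tracking controller: a bank of Kalman filters (7)–(8) with likelihood weights (14)–(15) over a finite parameter grid, J_t computed by the summation analogue of (13), and the control u_t = λ⁻¹(K Z_t + J_t) of Theorem II. The scalar model, the grid, the step size, and all constants below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative scalar model (all values assumed): dX = aX dt + sig dW, dY = hX dt + dV.
a_true, sig, h, mu, Sig0 = -1.0, 0.5, 1.0, 0.0, 1.0
beta, lam = 0.1, 2.0
dt, T = 1e-3, 10.0

# K: positive root of K^2 + beta*lam*K - lam = 0, equation (12).
K = (-beta * lam + np.sqrt((beta * lam) ** 2 + 4 * lam)) / 2.0
c = beta + K / lam        # shift appearing in [A - (beta + K/lam) I]^{-1} of (13)

a_grid = np.array([-2.0, -1.0, 0.0, 1.0])      # candidate drifts; none equals c
prior = np.full(len(a_grid), 1.0 / len(a_grid))
xhat = np.full(len(a_grid), mu)
Sig = np.full(len(a_grid), Sig0)
logL = np.zeros(len(a_grid))

x = mu + np.sqrt(Sig0) * rng.standard_normal()
Z = cost = 0.0
for k in range(int(T / dt)):
    # Posterior weights over the grid, then J_t by the summation analogue of (13).
    w = np.exp(logL - logL.max()) * prior
    w /= w.sum()
    J = -K * np.sum(h * xhat * w / (a_grid - c))
    u = (K * Z + J) / lam                       # Theorem II control, eq. (11)

    dW = np.sqrt(dt) * rng.standard_normal()
    dV = np.sqrt(dt) * rng.standard_normal()
    dY = h * x * dt + dV
    x += a_true * x * dt + sig * dW
    Z += dY - u * dt                            # error process, eq. (3)
    cost += np.exp(-beta * k * dt) * (Z ** 2 + lam * u ** 2) * dt

    for i, a in enumerate(a_grid):
        logL[i] += h * xhat[i] * dY - 0.5 * (h * xhat[i]) ** 2 * dt   # (15)
        xhat[i] += a * xhat[i] * dt + Sig[i] * h * (dY - h * xhat[i] * dt)  # (7)
        Sig[i] += (2 * a * Sig[i] + sig ** 2 - (Sig[i] * h) ** 2) * dt      # (8)

print(f"discounted cost ~ {cost:.3f}, final weights {np.round(w, 3)}")
```

The design mirrors the paper's separation: the filter bank and the weights are computed exactly as if each candidate parameter were the true one, and only the scalar pair (K, J_t) enters the control law.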