Copyright © IFAC Singular Solutions and Perturbations in Control Systems, Pereslavl-Zalessky, Russia, 1997

OBSERVATION CONTROL PROBLEM FOR DISCRETE-CONTINUOUS SYSTEMS AS A SINGULAR CONTROL PROBLEM

Boris M. Miller, Karen V. Stepanyan

Institute for Information Transmission Problems, GSP-4, B. Karetny Per. 19, 101447 Moscow, Russia
e-mail: [email protected], [email protected]
Abstract: The problem of estimation and observation control is considered for a stochastic system with a state-estimation-dependent noise in the observations. The optimal estimation in the class of linear filters is obtained, and the separation principle for the linear-quadratic optimization problem is proved. This makes it possible to solve the control problems for the process and for the observations simultaneously. The resulting problem belongs to the class of singular control problems and can be solved by standard methods of impulsive control theory.

Keywords: stochastic systems, non-Gaussian processes, nonlinear control systems, filtering problems, optimal control.
1. INTRODUCTION

The problem of estimation and observation control is considered for a class of stochastic systems with state-estimation-dependent noise in the observations. Such systems arise in image processing and in correlation tracking systems (Miller, 1991), where the problem of optimal estimation was solved in the class of linear filters for systems with an affine dependence on the estimation error. This optimal linear estimation has the form of a Kalman filter; however, the equation for the covariance matrix differs from the standard one of Riccati type. Meanwhile, the closed form of the filtering equations makes it possible to prove a separation principle for this kind of problem and to extend the well-known separation result (Kuznetsov, et al., 1980) to a class of nonlinear stochastic control problems. Moreover, if the solution is sought in the class of linear filters and linear control laws, the originally stochastic problem of simultaneous process and observation control can be reduced to a deterministic one, which can be solved by standard methods of impulsive control theory (Miller, 1985; 1991; 1995).

The structure of the paper is as follows. In Section 2 we present the system model and the statement of the observation control problem. In Section 3 we consider an estimation of Kalman type and derive the equations for the optimal estimation and the covariance matrix. In Section 4 we discuss the properties of the estimation derived in the previous section and prove that this estimation is indeed optimal in the class of linear filters. In Section 5 we consider the problem of simultaneous process and observation control and prove the separation principle, so that the originally stochastic problem is reduced to a deterministic one which has the form of a standard impulsive control problem.
2. SYSTEM MODEL

Consider a dynamic stochastic system described by the stochastic differential equation
$$dx(t) = A(t)x(t)\,dt + C(t)u(t)\,dt + B(t)\,dW_t^0, \qquad x(0) = x_0, \quad t \in [0,T], \qquad (1)$$
where $x(t) \in R^n$ is the system state, the initial condition $x_0$ is Gaussian with parameters
$$E x_0 = m_0, \qquad \mathrm{cov}(x_0, x_0^*) = \gamma_0,$$
and $u(t)$ is a process control which depends on the observation process described by the equation
$$dy(t) = H(t,\alpha(t))x(t)v(t)\,dt + G(t, x(t)-\hat x(t))\,v^{1/2}(t)\,dW_t^1, \qquad y(0) = 0. \qquad (2)$$
In (1) and (2) $\{W_t^0\}$ and $\{W_t^1\}$ are standard Wiener processes, and $\hat x(t)$ is some estimation of $x(t)$ based on the values $\{y(s): 0 \le s \le t\}$. The initial condition $x_0$ and the processes $\{W_t^0\}$, $\{W_t^1\}$ are independent. The observation control is described by the functions $\alpha(t)$ and $v(t)$, satisfying the constraints
$$\alpha(t) \in U \ (U \text{ is a compact set}), \qquad v(t) \ge 0, \qquad \int_0^T v(t)\,dt \le M < \infty. \qquad (3)$$
Our aim is twofold: the first is to obtain an estimation process for $x(t)$ which would be optimal in some sense, and the second is to consider the problem of observation and process control simultaneously. It should be noted that the model (1), (2) does not belong to the class of conditionally Gaussian processes (Liptser and Shiryaev, 1977), due to the dependence of the observation noise on the process $x(t)$ and its estimation $\hat x(t)$.

3. LINEAR FILTERING PROBLEM

Since the solution of the optimal estimation problem for models like (1), (2) is unknown, we consider the problem of optimal linear estimation in the special case of an affine dependence of $G(\cdot)$ on the estimation error, i.e.,
$$G(t, x(t)-\hat x(t)) = G_0(t) + \langle G_1(t), x(t)-\hat x(t)\rangle, \qquad (4)$$
where $G_0(t) > 0$. We also consider the class of linear estimations
$$\hat x(t) = F_0(t) + \int_0^t L(t,s)\,dy(s) \qquad (5)$$
with deterministic functions $F_0$, $L(t,s)$, which, however, could depend on the given observation controls $\alpha(t)$, $v(t)$.

Definition 1. The estimation of type (5) with error covariance matrix
$$\gamma_0(t) = E\{(x(t)-\hat x(t))(x(t)-\hat x(t))^*\} \qquad (6)$$
will be called optimal in the class of linear estimations if
$$z^*\gamma_0(t)z \le z^*\tilde\gamma(t)z \qquad \forall z \in R^n, \ \forall t \in [0,T], \qquad (7)$$
where $\tilde\gamma(t)$ is the error covariance matrix of any estimation of type (5).

To derive the equation for the optimal linear estimation we first find the optimal Kalman-type estimation, which is described by the equation
$$d\hat x(t) = A(t)\hat x(t)\,dt + C(t)u(t)\,dt + K(t)\big(dy(t) - H(t,\alpha(t))\hat x(t)v(t)\,dt\big) \qquad (8)$$
with initial condition $\hat x(0) = m_0$. So, to define the optimal estimation according to Def. 1, we have to find the coefficient $K^0(t)$ which guarantees that relation (7) holds for an arbitrary $K(t)$.

Theorem 1. The optimal value of $K^0(t)$ is equal to
$$K^0(t) = \gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1}, \qquad (9)$$
where $\gamma(t)$ is the solution of the equation
$$d\gamma(t) = A(t)\gamma(t)\,dt + \gamma(t)A^*(t)\,dt + B(t)B^*(t)\,dt - \gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1} H(t,\alpha(t))\gamma(t)v(t)\,dt \qquad (10)$$
with initial condition $\gamma(0) = \gamma_0$, and the optimal estimation is described by the equation
$$d\hat x(t) = A(t)\hat x(t)\,dt + C(t)u(t)\,dt + \gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1}\big(dy(t) - H(t,\alpha(t))\hat x(t)v(t)\,dt\big) \qquad (11)$$
with $\hat x(0) = m_0$.

Remark 1. Equations (10), (11) look like a standard Kalman filter; however, the equation for $\gamma(t)$ differs from the standard Riccati one. Meanwhile, if $G_1 = 0$ we obtain the classical Kalman filter.
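To make the structure of the gain (9) and the covariance equation (10) concrete, the following sketch (not from the paper) integrates (10) with a simple Euler scheme for a hypothetical two-dimensional state and a scalar observation channel, reading $\langle G_1(t)\gamma(t)G_1^*(t)\rangle$ as $G_1^\top(t)\gamma(t)G_1(t)$; all matrices and the controls $\alpha(t)$, $v(t)$ are placeholder choices.

```python
import numpy as np

# Placeholder data (not from the paper): 2-D state, scalar observation channel.
n = 2
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = 0.3 * np.eye(n)

def H(t, alpha):                      # observation matrix H(t, alpha(t))
    return np.array([[1.0, 0.0]])

G0 = 0.5                              # scalar G_0(t) > 0
G1 = np.array([0.2, 0.1])             # G(t, e) = G_0 + <G_1, e>
alpha = lambda t: 0.0                 # observation control alpha(t) in U
v = lambda t: 1.0                     # observation intensity v(t) >= 0

gamma = 0.5 * np.eye(n)               # gamma(0) = gamma_0
dt, T = 1e-3, 5.0
for k in range(int(T / dt)):
    t = k * dt
    Ht = H(t, alpha(t))
    D = G0 * G0 + G1 @ gamma @ G1     # G_0 G_0^* + <G_1 gamma G_1^*>, scalar here
    K = gamma @ Ht.T / D              # optimal gain K^0(t), equation (9)
    # modified Riccati equation (10); note that v(t) scales only the update term
    dgamma = (A @ gamma + gamma @ A.T + B @ B.T
              - gamma @ Ht.T @ Ht @ gamma * (v(t) / D))
    gamma = gamma + dt * dgamma

print("gamma(T) =", gamma)
print("K^0(T)   =", K)
```

Setting G1 to zero reduces D to $G_0G_0^*$ and the recursion to the usual Kalman-Bucy covariance equation, in line with Remark 1.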
Proof of Theorem 1. For a given $K(t)$ denote $Z(t) = x(t) - \hat x(t)$ and $A_1(t) = A(t) - K(t)H(t,\alpha(t))v(t)$. Then $Z(t)$ satisfies the equation
$$dZ(t) = A_1(t)Z(t)\,dt + B(t)\,dW_t^0 - K(t)G_0(t)v^{1/2}(t)\,dW_t^1 - K(t)\langle G_1(t), Z(t)\rangle v^{1/2}(t)\,dW_t^1, \qquad (12)$$
and by applying Ito's formula to $Z(t)Z^*(t)$ we obtain the following equation for $\gamma(t) = EZ(t)Z^*(t)$:
$$d\gamma(t) = A_1(t)\gamma(t)\,dt + \gamma(t)A_1^*(t)\,dt + B(t)B^*(t)\,dt + K(t)G_0(t)G_0^*(t)K^*(t)v(t)\,dt + K(t)\langle G_1(t)\gamma(t)G_1^*(t)\rangle K^*(t)v(t)\,dt, \qquad \gamma(0) = \gamma_0. \qquad (13)$$
To derive the equation for the optimal value of $K(t)$ we put
$$K(t) = K^0(t) + \varepsilon\,\Delta K(t) \qquad (14)$$
and define the appropriate value of $\gamma^\varepsilon(t)$ in the form of the expansion $\gamma^\varepsilon(t) = \gamma(t) + \varepsilon\,\Delta\gamma(t) + o(\varepsilon)$. Applying the standard algebraic technique to equation (13), we obtain the following equation for $\Delta\gamma(t)$:
$$d\Delta\gamma(t) = \big(A_1(t)\Delta\gamma(t) + \Delta\gamma(t)A_1^*(t) + K^0(t)\langle G_1(t)\Delta\gamma(t)G_1^*(t)\rangle(K^0(t))^*v(t) + \Delta K(t)Q_1(t)v(t) + Q_1^*(t)\Delta K^*(t)v(t)\big)\,dt, \qquad \Delta\gamma(0) = 0, \qquad (15)$$
where
$$Q_1(t) = G_0(t)G_0^*(t)(K^0(t))^* + \langle G_1(t)\gamma(t)G_1^*(t)\rangle(K^0(t))^* - H(t,\alpha(t))\gamma(t).$$
As follows from condition (7), if $K^0(t)$ is optimal, then $\Delta\gamma(t) = 0$ for an arbitrary $\Delta K(t)$. Therefore $Q_1 \equiv 0$, and we obtain relation (9) for the optimal value of $K^0(t)$. By substitution of (9) into (13) we also obtain equation (10) and the optimal estimation in the form of equation (11).

4. PROPERTIES OF ESTIMATION

In the section above we have found the optimal linear estimation of Kalman type. Now we are in a position to prove some additional properties of this estimation.

Theorem 2. Suppose that the control law has the form $u(t) = L(t)\hat x(t)$. Then

1. the estimation described by equation (11) is unbiased, i.e.
$$E(x(t) - \hat x(t)) = 0;$$
2. the covariance of the estimation error, namely $\gamma(t) = \mathrm{cov}\big((x(t)-\hat x(t)), (x(t)-\hat x(t))^*\big)$, satisfies equation (10);
3. the estimation is orthogonal to the estimation error:
$$E\,\hat x(t)(x(t)-\hat x(t))^* = 0. \qquad (16)$$

Proof. The first statement easily follows from equation (12), and the second one is a direct corollary of Theorem 1. To prove the orthogonality property consider the variable $Z(t) = \hat x(t)(x(t)-\hat x(t))^*$. By applying Ito's formula we obtain
$$dZ(t) = A(t)\hat x(t)(x(t)-\hat x(t))^*\,dt + K(t)H(t,\alpha(t))(x(t)-\hat x(t))(x(t)-\hat x(t))^*v(t)\,dt + K(t)\big(G_0(t) + \langle G_1(t), x(t)-\hat x(t)\rangle\big)v^{1/2}(t)\,dW_t^1\,(x(t)-\hat x(t))^* + C(t)u(t)(x(t)-\hat x(t))^*\,dt + \hat x(t)(x(t)-\hat x(t))^*A^*(t)\,dt + \hat x(t)(B(t)\,dW_t^0)^* - \hat x(t)(x(t)-\hat x(t))^*H^*(t,\alpha(t))K^*(t)v(t)\,dt - \hat x(t)\big(K(t)[G_0(t) + \langle G_1(t), x(t)-\hat x(t)\rangle]v^{1/2}(t)\,dW_t^1\big)^* - K(t)\big(G_0(t) + \langle G_1(t), x(t)-\hat x(t)\rangle\big)\big(G_0(t) + \langle G_1(t), x(t)-\hat x(t)\rangle\big)^*K^*(t)v(t)\,dt.$$
For $\bar Z(t) = EZ(t)$, with $u(t) = L(t)\hat x(t)$ and $K(t) = K^0(t)$ from (9), this gives
$$d\bar Z(t) = A(t)\bar Z(t)\,dt + \gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1}H(t,\alpha(t))\gamma(t)v(t)\,dt + C(t)L(t)\bar Z(t)\,dt + \bar Z(t)A^*(t)\,dt - \bar Z(t)H^*(t,\alpha(t))K^*(t)v(t)\,dt - \gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1}\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]\big\{\gamma(t)H^*(t,\alpha(t))\big[G_0(t)G_0^*(t) + \langle G_1(t)\gamma(t)G_1^*(t)\rangle\big]^{-1}\big\}^*v(t)\,dt.$$
From the previous equation, after algebraic transformations, we have
$$d\bar Z(t) = (A(t) + C(t)L(t))\bar Z(t)\,dt + \bar Z(t)\big(A^*(t) - H^*(t,\alpha(t))K^*(t)v(t)\big)\,dt,$$
where $\bar Z(0) = 0$. Thus $\bar Z(t)$ satisfies a homogeneous linear differential equation with zero initial condition, hence $\bar Z(t) = 0$. This completes the proof.

Remark 2. Since the estimation $\hat x(t)$ belongs to the class of linear estimations of type (5) and satisfies the orthogonality condition (16), we can conclude that the estimation (11) is indeed optimal in the class of all linear estimations according to Def. 1. The rigorous proof of this statement follows from the orthogonality condition (16) and the differential equations (10), (11) which describe the estimation. However, this proof is rather long and immaterial for the further considerations; due to the lack of space it is omitted.
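As an empirical illustration of Theorem 2 (again not from the paper), the scalar sketch below simulates the system (1), the observations (2) and the filter (11) under a feedback $u(t) = L\hat x(t)$ with placeholder coefficients, and checks the unbiasedness and orthogonality properties by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
# Scalar illustration (placeholder coefficients): dx = a x dt + c u dt + b dW0,
# dy = h x v dt + (g0 + g1 (x - xhat)) sqrt(v) dW1, with feedback u = l * xhat.
a, b, c, h, g0, g1, l, v = -0.5, 0.4, 1.0, 1.0, 0.3, 0.2, -0.8, 1.0
m0, gamma0 = 1.0, 0.5
dt, T, paths = 1e-3, 2.0, 2000
steps = int(T / dt)

x = m0 + np.sqrt(gamma0) * rng.standard_normal(paths)   # x(0) ~ N(m0, gamma0)
xhat = np.full(paths, m0)                                # xhat(0) = m0
gamma = gamma0
for _ in range(steps):
    dW0 = np.sqrt(dt) * rng.standard_normal(paths)
    dW1 = np.sqrt(dt) * rng.standard_normal(paths)
    u = l * xhat
    dy = h * x * v * dt + (g0 + g1 * (x - xhat)) * np.sqrt(v) * dW1   # (2)
    D = g0 ** 2 + g1 ** 2 * gamma              # G0 G0* + <G1 gamma G1*>
    K = gamma * h / D                          # gain (9)
    x_new = x + (a * x + c * u) * dt + b * dW0                        # (1)
    xhat = xhat + (a * xhat + c * u) * dt + K * (dy - h * xhat * v * dt)  # (11)
    gamma = gamma + (2 * a * gamma + b ** 2 - gamma ** 2 * h ** 2 * v / D) * dt  # (10)
    x = x_new

err = x - xhat
print("bias          E(x - xhat)      =", err.mean())          # item 1
print("orthogonality E xhat(x - xhat) =", (xhat * err).mean())  # item 3
print("error variance vs gamma(T)     =", err.var(), gamma)     # item 2
```

Both empirical moments should be close to zero (up to Monte Carlo error), and the sample error variance should track the solution of (10).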
5. SIMULTANEOUS PROCESS AND OBSERVATION CONTROL

In this section we consider the control problem for the system described by equation (1) with the observation process (2). Our aim is to control simultaneously the process (1) and the observation (2) so as to minimize the performance criterion
$$J = E\int_0^T \big(x^*(t)P(t)x(t) + u^*(t)R(t)u(t)\big)\,dt \to \min. \qquad (17)$$
Since the model (1), (2) is nonlinear, we have to simplify the problem to make it possible to obtain the solution in a closed form. The idea of this simplification is that the control law can be taken in the form $u(t) = L(t)\hat x(t)$, where $\hat x(t)$ is the best linear estimate of the process (1) from the observation (2), so that $\hat x(t)$ satisfies equation (11). Therefore, the problem of simultaneous process and observation control can be formulated as the problem of finding the triple $\{L(t), \alpha(t), v(t)\}$ which satisfies (3) and minimizes the criterion (17). In the linear case the solution of this problem is known and can be formulated in the form of a separation principle (Kuznetsov, et al., 1980), which gives the opportunity to solve the process and observation control problems separately. The same result also holds in our case of the nonlinear system (1), (2).

Theorem 3. Assume that $R(t)$ in (17) is positive definite for all $t \in [0,T]$. Then the minimum value of the performance criterion can be achieved by a triple $\{L(t), \alpha(t), v(t)\}$ where:

1. $L(t)$ is given by the relation
$$L(t) = -R^{-1}(t)C^*(t)N(t), \qquad (18)$$
where $N(t)$ is the solution of the equation
$$\dot N(\tau) = -A^*(\tau)N(\tau) - N(\tau)A(\tau) - P^*(\tau) + N^*(\tau)C(\tau)R^{-1}(\tau)C^*(\tau)N(\tau)$$
with terminal condition $N(T) = 0$;

2. $\{\alpha(\cdot), v(\cdot)\}$ minimizes the performance criterion
$$\bar J = \int_0^T \mathrm{Sp}\big(P(t)\Gamma_0(t) + L^*(t)R(t)L(t)(\Gamma_0(t) - \gamma(t))\big)\,dt, \qquad (19)$$
where $\Gamma_0(t)$ satisfies the equation
$$\dot\Gamma_0(t) = A(t)\Gamma_0(t) + \Gamma_0(t)A^*(t) + B(t)B^*(t) + C(t)L(t)(\Gamma_0(t) - \gamma(t)) + (\Gamma_0(t) - \gamma(t))L^*(t)C^*(t) \qquad (20)$$
with initial condition $\Gamma_0(0) = E x_0 x_0^* = \gamma_0 + m_0 m_0^*$, $L(t)$ is given by (18), and $\gamma(t)$ is the solution of equation (10).

Remark 3. This theorem indeed expresses the separation principle for this kind of problem: the optimal process control $L(t)$ does not depend on the observations, just as in the classical linear problem. Meanwhile, to find the optimal observation control we have to solve the separate problem of optimal control for the system with dynamics described by equations (10), (20) and performance criterion (19).

Remark 4. The problem of optimal control for the system (10), (20) with performance criterion (19) belongs to the class of impulsive control problems, because the constraints (3) on the variable $v(t)$ allow the use of impulsive controls. However, this problem can be treated by standard methods of impulse control theory, as for the observation control problem for discrete-continuous systems in the linear case (Miller, 1991).
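A minimal sketch of item 1 of Theorem 3, under placeholder matrices: the Riccati equation for $N(t)$ is integrated backward from $N(T) = 0$ and the feedback gain $L(t)$ of (18) is recovered; the observation-control part of the theorem is not touched here.

```python
import numpy as np

# Placeholder data: backward Riccati integration for N(t) with N(T) = 0,
# then L(t) = -R^{-1}(t) C*(t) N(t) as in (18).
n = 2
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
C = np.array([[0.0], [1.0]])
P = np.eye(n)
R = np.array([[0.1]])
Rinv = np.linalg.inv(R)

dt, T = 1e-3, 5.0
steps = int(T / dt)
N = np.zeros((n, n))               # terminal condition N(T) = 0
L_path = []
for _ in range(steps):
    # dN/dt = -A* N - N A - P + N C R^{-1} C* N, integrated from T down to 0
    dN = -A.T @ N - N @ A - P + N @ C @ Rinv @ C.T @ N
    N = N - dN * dt                # step from t to t - dt
    L_path.append(-Rinv @ C.T @ N)
L_path.reverse()                   # L_path[k] approximates L(k * dt)
print("L(0) =", L_path[0])
```

With $N(t)$ (and hence $L(t)$) in hand, the remaining task of choosing $\{\alpha(\cdot), v(\cdot)\}$ for the system (10), (20) under the budget (3) so as to minimize (19) is the deterministic impulsive control problem referred to in Remark 4.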
Proof of Theorem 3. First, we notice that by virtue of the orthogonality condition (16) we have the relation
$$\gamma(t) = E\{(x(t)-\hat x(t))(x(t)-\hat x(t))^*\} = Ex(t)x^*(t) - E\hat x(t)x^*(t) - Ex(t)\hat x^*(t) + E\hat x(t)\hat x^*(t) = E(x(t)x^*(t)) - E(\hat x(t)\hat x^*(t)),$$
so that
$$E\hat x(t)\hat x^*(t) = E(x(t)x^*(t)) - \gamma(t). \qquad (21)$$
Next we derive the equation for $\Gamma_0(t) = E(x(t)x^*(t))$. Taking into account the chosen type of control law $u(t) = L(t)\hat x(t)$, we obtain
$$dx(t) = A(t)x(t)\,dt + C(t)L(t)\hat x(t)\,dt + B(t)\,dW_t^0, \qquad (22)$$
and $x(t)x^*(t)$ is the solution of the equation
$$d(x(t)x^*(t)) = x(t)\,dx^*(t) + dx(t)\,x^*(t) + B(t)B^*(t)\,dt = \big(x(t)x^*(t)A^*(t) + x(t)\hat x^*(t)L^*(t)C^*(t) + A(t)x(t)x^*(t) + C(t)L(t)\hat x(t)x^*(t) + B(t)B^*(t)\big)\,dt + dM_t,$$
where $M_t$ is a martingale. After the substitution of the relation $E\hat x(t)x^*(t) = E\hat x(t)\hat x^*(t)$, which is a consequence of (16), and of (21) into the previous relation, we obtain
$$\dot\Gamma_0(t) = A(t)\Gamma_0(t) + \Gamma_0(t)A^*(t) + B(t)B^*(t) + C(t)L(t)(\Gamma_0(t) - \gamma(t)) + (\Gamma_0(t) - \gamma(t))L^*(t)C^*(t). \qquad (23)$$
Now the performance criterion (17) can be rewritten in the form
$$J = \int_0^T \mathrm{Sp}\big(P(t)Ex(t)x^*(t) + L^*(t)R(t)L(t)E\hat x(t)\hat x^*(t)\big)\,dt = \int_0^T \mathrm{Sp}\big(P(t)\Gamma_0(t) + L^*(t)R(t)L(t)(\Gamma_0(t) - \gamma(t))\big)\,dt, \qquad (24)$$
and we can find the optimal $L(t)$, for fixed $\gamma(t)$, as the optimal matrix-valued control for the system (23) with performance criterion (24). Suppose that $L(t)$ is the optimal control and $L^\varepsilon(t) = L(t) + \varepsilon\,\Delta L(t)$. If $\Gamma_0(t)$ and the value $J^0$ correspond to the optimal $L(t)$, we can find the expansions of $\Gamma^\varepsilon(t)$ and $J^\varepsilon$ corresponding to $L^\varepsilon(t)$ in the forms
$$\Gamma^\varepsilon(t) = \Gamma_0(t) + \varepsilon\,\Delta\Gamma(t) + O(\varepsilon^2), \qquad J^\varepsilon = J^0 + \varepsilon\,\Delta J + O(\varepsilon^2).$$
Using standard perturbation methods we obtain the following equation for $\Delta\Gamma(t)$ and relation for $\Delta J$:
$$\frac{d}{dt}\Delta\Gamma(t) = (A(t) + C(t)L(t))\Delta\Gamma(t) + \Delta\Gamma(t)(A(t) + C(t)L(t))^* + C(t)\Delta L(t)(\Gamma_0(t) - \gamma(t)) + (\Gamma_0(t) - \gamma(t))\Delta L^*(t)C^*(t), \qquad (25)$$
and
$$\Delta J = \int_0^T \mathrm{Sp}\big[P(t)\Delta\Gamma(t) + \Delta L^*(t)R(t)L(t)(\Gamma_0(t) - \gamma(t)) + L^*(t)R(t)\Delta L(t)(\Gamma_0(t) - \gamma(t)) + L^*(t)R(t)L(t)\Delta\Gamma(t)\big]\,dt. \qquad (26)$$
For the optimal solution $\Delta J = 0$ for any $\Delta L(t)$. To check this condition we can apply to the equation in variations (25) the perturbation $\Delta L(t) = \Delta L\,\delta(t - \tau)$. This type of variation corresponds to a variation of Pontryagin type, i.e. to the replacement of $L(t)$ on an interval of length $\varepsilon$ in the neighborhood of the point $\tau$. Then
$$\Delta\Gamma(t) = \Phi(t,\tau)\big[C(\tau)\Delta L(\Gamma_0(\tau) - \gamma(\tau)) + (\Gamma_0(\tau) - \gamma(\tau))\Delta L^*C^*(\tau)\big]\Phi^*(t,\tau), \qquad t \ge \tau, \qquad (27)$$
and
$$\Delta J = \mathrm{Sp}\Big(\Delta L^*R(\tau)L(\tau)(\Gamma_0(\tau) - \gamma(\tau)) + L^*(\tau)R(\tau)\Delta L(\Gamma_0(\tau) - \gamma(\tau)) + \int_\tau^T (P(t) + L^*(t)R(t)L(t))\Delta\Gamma(t)\,dt\Big), \qquad (28)$$
where $\Phi(t,\tau)$ is the fundamental solution of the linear equation $\dot x = (A(t) + C(t)L(t))x$. The derivative of $\Delta J$ with respect to $\Delta L$ is equal to zero due to the optimality condition; hence we have the relation
$$R(\tau)L(\tau)(\Gamma_0(\tau) - \gamma(\tau))^* + C^*(\tau)\int_\tau^T \Phi^*(t,\tau)(P(t) + L^*(t)R(t)L(t))^*\Phi(t,\tau)\,dt\,(\Gamma_0(\tau) - \gamma(\tau))^* = 0, \qquad \forall\tau \in [0,T].$$
Since $\gamma(t)$ is the covariance matrix of the optimal estimation, we have $\Gamma_0(\tau) - \gamma(\tau) > 0$ for all $\tau \in [0,T]$, and by virtue of the condition $R(\cdot) > 0$ we obtain
$$L(\tau) = -R^{-1}(\tau)C^*(\tau)\int_\tau^T \Phi^*(t,\tau)(P(t) + L^*(t)R(t)L(t))^*\Phi(t,\tau)\,dt, \qquad \forall\tau \in [0,T]. \qquad (29)$$
Denote
$$N(\tau) = \int_\tau^T \Phi^*(t,\tau)(P(t) + L^*(t)R(t)L(t))^*\Phi(t,\tau)\,dt \qquad (30)$$
and derive the differential equation for $N(\cdot)$. The matrix-valued function $\Phi(t,\tau)$, as a fundamental solution of a linear differential equation, has the properties
$$\frac{\partial\Phi(t,\tau)}{\partial\tau} = -\Phi(t,\tau)(A(\tau) + C(\tau)L(\tau)), \qquad \frac{\partial\Phi^*(t,\tau)}{\partial\tau} = -(A(\tau) + C(\tau)L(\tau))^*\Phi^*(t,\tau), \qquad \Phi(\tau,\tau) = I, \quad t \ge \tau.$$
Therefore
$$\frac{dN(\tau)}{d\tau} = -(P(\tau) + L^*(\tau)R(\tau)L(\tau))^* - (A(\tau) + C(\tau)L(\tau))^*N(\tau) - N(\tau)(A(\tau) + C(\tau)L(\tau)),$$
and, substituting $L(\tau) = -R^{-1}(\tau)C^*(\tau)N(\tau)$ from (29), (30),
$$\frac{dN(\tau)}{d\tau} = -A^*(\tau)N(\tau) - N(\tau)A(\tau) - P^*(\tau) + N^*(\tau)C(\tau)R^{-1}(\tau)C^*(\tau)N(\tau),$$
with terminal condition $N(T) = 0$. This is exactly the equation for $N(\tau)$ given in the statement of Theorem 3,
which completes the proof.
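To indicate how the reduced deterministic problem of Remark 4 can be handled numerically, the sketch below (placeholder data, a constant feedback gain in place of the time-varying (18), and the same scalar-observation reading of $\langle G_1\gamma G_1^*\rangle$ as before) propagates $\gamma(t)$ by (10) and $\Gamma_0(t)$ by (20) and evaluates the criterion (19) for two observation-intensity profiles $v(\cdot)$ with the same budget of type (3).

```python
import numpy as np

# Placeholder data: evaluate the deterministic criterion (19) for a given
# observation-intensity profile v(t), propagating gamma(t) by (10) and
# Gamma0(t) by (20). L is taken constant here purely for brevity.
n = 2
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = 0.3 * np.eye(n)
C = np.array([[0.0], [1.0]])
P, R = np.eye(n), np.array([[0.1]])
H = np.array([[1.0, 0.0]])
G0, G1 = 0.5, np.array([0.2, 0.1])
L = np.array([[-1.0, -0.5]])            # placeholder feedback gain
m0, gamma0 = np.array([1.0, 0.0]), 0.5 * np.eye(n)

def cost(v_profile, dt=1e-3, T=5.0):
    gamma = gamma0.copy()
    Gamma0 = gamma0 + np.outer(m0, m0)  # Gamma0(0) = E x0 x0*
    J = 0.0
    for k in range(int(T / dt)):
        v = v_profile(k * dt)
        D = G0 * G0 + G1 @ gamma @ G1
        dgamma = (A @ gamma + gamma @ A.T + B @ B.T
                  - gamma @ H.T @ H @ gamma * (v / D))              # (10)
        dGamma0 = (A @ Gamma0 + Gamma0 @ A.T + B @ B.T
                   + C @ L @ (Gamma0 - gamma)
                   + (Gamma0 - gamma) @ L.T @ C.T)                  # (20)
        J += np.trace(P @ Gamma0 + L.T @ R @ L @ (Gamma0 - gamma)) * dt  # (19)
        gamma, Gamma0 = gamma + dgamma * dt, Gamma0 + dGamma0 * dt
    return J

# A constant intensity versus a burst of observations near t = 0; both spend
# the same budget integral of v(t) dt, as in constraint (3).
print(cost(lambda t: 1.0))
print(cost(lambda t: 5.0 if t < 1.0 else 0.0))
```

Optimizing over such profiles, including impulsive ones that concentrate the whole budget $M$ at isolated instants, is precisely the singular (impulsive) control problem mentioned in the abstract and in Remark 4.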
6. CONCLUSION

A new, genuinely nonlinear observation control problem has been considered. By applying the idea of linear filtering it is possible to reduce this problem to a deterministic control problem, for which all well-known methods can be applied.

Acknowledgements

This work was supported in part by INTAS Grants 94-697 and 93-2622, and by the Russian Basic Research Foundation, Grant No. 95-01-00573.
REFERENCES

Liptser R. Sh. and Shiryaev A. N. (1977). Statistics of Random Processes I, II. New York: Springer-Verlag.
Kuznetsov N. A., Liptser R. Sh., and Serebrovskii A. P. (1980). Optimal control and data processing in continuous time (linear system and quadratic functional). Automat. Remote Control, 41, No. 10, 1369-1374.
Miller B. M. (1985). Optimal control of observations in the filtering of diffusion processes. I, II. Automat. Remote Control, 46, No. 2, 207-214; No. 6, 745-754.
Miller B. M. (1991). Generalized optimization in problems of observation control. Automat. Remote Control, 52, No. 10, 83-92.
Miller B. M. (1995). Generalized solutions of nonlinear optimization problems with impulse control. I, II. Automat. Remote Control, 55, No. 4, 62-76; No. 5, 56-70.