Nonlinear Sequential Algorithms for Estimation under Uncertainty†

M. Z. DAJANI AND A. P. SAGE
Information and Control Sciences Center, SMU Institute of Technology, Dallas, Texas

Communicated by John M. Richardson
ABSTRACT

In many realistic estimation problems, the presence of the state vector to be estimated is uncertain. In this instance, use of the linear sequential conditional mean filter algorithms (the Kalman filter) results in suboptimum performance. This paper derives nonlinear sequential filter algorithms for conditional mean estimation of a Gauss-Markov process when there is uncertainty as to the presence of the Gauss-Markov process (signal process) in the observation. Associated estimation error variances are determined and a simple example is considered.
I. INTRODUCTION
This paper develops algorithms for conditional mean estimators under uncertainty as to the presence of the signal to be estimated. A simple derivation for the point estimate under uncertainty is first presented. A nonlinear discrete sequential estimator, and the continuous estimator which results when the sampling becomes dense, are then developed. The type of uncertain estimator problems discussed here is similar to those which result in the classical conditional mean linear estimator [1, 2]. For the classical estimator the message and observation models are

y(k+1) = A(k) y(k) + α G(k) w(k),   (1)
x(k) = β y(k),   (2)
z(k) = ξ C(k) x(k) + v(k).   (3)

α, β, and ξ are one for the classical certain estimation problem. x(k) is a Gauss-Markov n vector, the state of the system. w(k), the plant noise vector, is a white Gaussian m vector with known mean and variance. v(k), the observation noise vector, is a white zero mean Gaussian r vector. A(k), G(k), and C(k) are known coefficient matrices.

† This work was supported by the Air Force Office of Scientific Research, United States Air Force, under Contract F44620-68-C-0023.

Copyright © 1970 by American Elsevier Publishing Company, Inc.
The assumed prior statistics are

E{x(0)} = μ_x0,  var{x(0)} = V_x0,
E{w(k)} = μ_w(k),  var{w(k)} = Q(k),
E{v(k)} = 0,  var{v(k)} = R(k),   (4)
cov{w(k), v(k)} = cov{x(0), w(k)} = cov{x(k), v(k)} = 0.

The uncertain estimation problems considered are illustrated in Figure 1. In each case conditional mean estimation of x(k) is desired. These three problems can be represented by defining three sets of binary hypotheses:
Problem I:   H0: α = 0,  H1: α = 1  (β = ξ = 1);  P[H0] = q = 1 − p,  P[H1] = p.
Problem II:  H0: β = 0,  H1: β = 1  (α = ξ = 1);  P[H0] = q = 1 − p,  P[H1] = p.
Problem III: H0: ξ = 0,  H1: ξ = 1  (α = β = 1);  P[H0] = q = 1 − p,  P[H1] = p.   (5)
Middleton and Esposito [3] have developed a nonlinear uncertain conditional mean point estimator for problem II. Nahi [4] has developed a sequential uncertain estimator for problem III that is constrained to be a linear function of the observation; it is not the optimal estimator but is suboptimal due to this constraint. This paper differs from the foregoing two papers in that it develops the optimal nonlinear sequential algorithms for estimation under uncertainty. All three of the aforementioned problems are considered. Section II is devoted to the basic derivation of the uncertain point estimator. In Section III, sequential algorithms for the conditional mean uncertain estimators are developed, and a continuous version of these algorithms is derived. Finally, in Section IV, a simple example is solved to indicate the relative performance of the newly developed algorithms.

II. UNCERTAIN POINT ESTIMATOR
A simple derivation of the uncertain point estimator is sought. It is desired to estimate, from the observations z(k): k = 1, 2, ..., N, the state of a system which is assumed to evolve from the linear Markov model of (1) and (2). The conditional mean estimator is well known to be the optimal one for linear and nonlinear systems [1, 2] for cases in which the minimum error variance estimate is desired. This fact will be used here to develop the desired uncertain conditional mean point estimator.

FIGURE 1. Block diagram of three estimation under uncertainty problems.

The conditional mean estimator

x̃(k) ≜ E{x(k)|Z(k)}   (6)

can be obtained using the fundamental theorem of expectation as

x̃(k) = ∫ x(k) f[x(k)|Z(k)] dx(k),   (7)

where f[x(k)|Z(k)] denotes the conditional probability density function of x(k) conditioned upon Z(k) = {z(1), z(2), ..., z(k)}. Invoking Bayes' mixed probability rule yields

x̃(k) = ∫ x(k){f[x(k)|Z(k), H1] P[H1|Z(k)] + f[x(k)|Z(k), H0] P[H0|Z(k)]} dx(k),   (8)
where

f[x(k)|Z(k), H0] = N[μ̄_x(k), V̄_x(k)]   (9.I)

for problem I. The mean and variance of this normal density function evolve from

μ̄_x(k+1) = A(k) μ̄_x(k),
V̄_x(k+1) = A(k) V̄_x(k) A^T(k),

which are the expressions for the mean and variance of x(k) assuming no plant noise. Appropriate initial conditions are given by (4). For problems II and III this probability density is

f[x(k)|Z(k), H0] = δ[x(k)],   (9.II)
f[x(k)|Z(k), H0] = N[μ_x(k), V_x(k)],   (9.III)

where δ represents the Dirac delta or impulse function, and μ_x(k) and V_x(k) evolve from the relations

μ_x(k+1) = A(k) μ_x(k) + G(k) μ_w(k),
V_x(k+1) = A(k) V_x(k) A^T(k) + G(k) Q(k) G^T(k),

whose initial conditions are given by (4). Also,

x̂(k) ≜ E{x(k)|Z(k), H1}   (10)

is the certain conditional mean point estimate. Therefore, from (8), (9), and (10) it is seen that

x̃(k) = P[H1|Z(k)] x̂(k) + P[H0|Z(k)] μ̄_x(k),   (11.I)
x̃(k) = P[H1|Z(k)] x̂(k),   (11.II)
x̃(k) = P[H1|Z(k)] x̂(k) + P[H0|Z(k)] μ_x(k)   (11.III)

relate the conditional mean uncertain estimate to the certain estimate and the posterior hypothesis probability. The obvious generalization of (7) and (11) for the one stage uncertain prediction algorithm gives, for the three problems under consideration,

x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k) + P[H0|Z(k)] μ̄_x(k+1),   (12.I)
x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k),   (12.II)
x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k) + P[H0|Z(k)] μ_x(k+1).   (12.III)
These are the conditional mean uncertain point estimators of the waveform x(k) for problems I, II, and III. These estimators are nonlinear, since P[H1|Z(k)] is a nonlinear function of Z(k), and

P[H0|Z(k)] = 1 − P[H1|Z(k)].   (13)

Also, from Bayes' rule,

P[H1|Z(k)] = P(H1) f[Z(k)|H1] / {P(H1) f[Z(k)|H1] + P(H0) f[Z(k)|H0]}.   (14)

It is convenient to rewrite the foregoing two expressions in the forms

P[H1|Z(k)] = Λ(k)/[1 + Λ(k)],  P[H0|Z(k)] = 1/[1 + Λ(k)],   (15)

where

Λ(k) ≜ {P(H1)/P(H0)} f[Z(k)|H1]/f[Z(k)|H0]   (16)

is the likelihood ratio. Hence for problems I, II, and III:

x̃(k) = [Λ(k)/(1 + Λ(k))] x̂(k) + [1/(1 + Λ(k))] μ̄_x(k),   (17.I)
x̃(k) = [Λ(k)/(1 + Λ(k))] x̂(k),   (17.II)
x̃(k) = [Λ(k)/(1 + Λ(k))] x̂(k) + [1/(1 + Λ(k))] μ_x(k).   (17.III)

Similar relations also exist for the one stage prediction solution. It should be noted that (17.II) is the relation obtained by Middleton and Esposito [3]. The other point estimation algorithms (17.I and 17.III) are believed to be new. The statistical properties of this estimator may easily be developed; the estimator defined by equation (12) or (17) is unconditionally unbiased and consistent.
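As a concrete illustration of (15)-(17), the following Python sketch (ours, not the paper's; the function name and interface are assumptions) forms the uncertain point estimate from the certain estimate, the H0 conditional mean, and the likelihood ratio:

```python
import numpy as np

def uncertain_point_estimate(x_hat, mu_h0, lam):
    """Uncertain conditional mean estimate, eqs. (15)-(17).

    x_hat : certain conditional mean estimate, eq. (10)
    mu_h0 : conditional mean of x(k) under H0 (0 for problem II,
            the propagated prior mean for problems I and III)
    lam   : likelihood ratio of eq. (16)
    """
    p_h1 = lam / (1.0 + lam)  # P[H1|Z(k)], eq. (15)
    return p_h1 * np.asarray(x_hat) + (1.0 - p_h1) * np.asarray(mu_h0)
```

Passing mu_h0 = 0 recovers the problem II form (17.II).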
III. SEQUENTIAL ESTIMATION ALGORITHM

The need for sequential estimation algorithms is well known. In this section a sequential algorithm and an algorithm for the propagation of the estimation error variance will be derived. Finally, in the limit as the samples become dense, a continuous estimator is developed.
Discrete Sequential Estimator
In the following derivation, the discrete Kalman estimation algorithm is needed. For the case of one stage prediction [1, 2] the algorithms become

x̂(k+1|k) = A(k) x̂(k|k−1) + G(k) μ_w(k) + K(k)[z(k) − C(k) x̂(k|k−1)],   (18)

where

K(k) = A(k) V_x̂(k|k−1) C^T(k)[C(k) V_x̂(k|k−1) C^T(k) + R(k)]^{−1},   (19)

V_x̂(k+1|k) = A(k) V_x̂(k|k−1) A^T(k) + G(k) Q(k) G^T(k) − K(k) C(k) V_x̂(k|k−1) A^T(k).   (20)

Here x̂ is the certain estimator and V_x̂ the associated error variance. This one stage prediction estimate and associated error variance are the natural ones to use in computation of the likelihood ratio. They are related to the filtering algorithm and associated error variance through the relations

x̂(k+1|k) = A(k) x̂(k) + G(k) μ_w(k),
V_x̂(k+1|k) = A(k) V_x̂(k) A^T(k) + G(k) Q(k) G^T(k).
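A minimal NumPy transcription of the one stage predictor (18)-(20) may clarify the data flow; the function name and calling convention are ours, not the paper's:

```python
import numpy as np

def certain_predictor_step(x_hat, V, z, A, C, G, mu_w, Q, R):
    """One pass of the certain (H1) one-stage predictor.
    x_hat, V enter as x̂(k|k-1), V_x̂(k|k-1) and are returned as
    x̂(k+1|k), V_x̂(k+1|k); all arguments are NumPy arrays."""
    Vz = C @ V @ C.T + R                  # innovation covariance, eq. (27)
    K = A @ V @ C.T @ np.linalg.inv(Vz)   # gain K(k), eq. (19)
    x_next = A @ x_hat + G @ mu_w + K @ (z - C @ x_hat)    # eq. (18)
    V_next = A @ V @ A.T + G @ Q @ G.T - K @ C @ V @ A.T   # eq. (20)
    return x_next, V_next
```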
Initial conditions for the state estimate and error variance are the prior mean and variance of x at stage 0. In order to complete the algorithm it is necessary to determine a sequential relationship for P[H1|Z(k)]. For simplicity of derivation, it is convenient to let

P[H1|Z(k)] = γ(k) P[H1|Z(k−1)].   (21)

This assumption is legitimate since, using Bayes' chain rule, it is possible to write

P[H1|Z(k)] = f[z(k)|Z(k−1), H1] P[H1|Z(k−1)] / f[z(k)|Z(k−1)] = γ(k) P[H1|Z(k−1)],   (22)

where

γ(k) ≜ f[z(k)|Z(k−1), H1] / f[z(k)|Z(k−1)]
     = f[z(k)|Z(k−1), H1] / {f[z(k)|Z(k−1), H1] P[H1|Z(k−1)] + f[z(k)|Z(k−1), H0] P[H0|Z(k−1)]}.   (23)

This expression may be rewritten as

γ(k) = {Γ(k)/P[H0|Z(k−1)]} / {1 + Γ(k) P[H1|Z(k−1)]/P[H0|Z(k−1)]},   (24)
where

Γ(k) ≜ f[z(k)|Z(k−1), H1] / f[z(k)|Z(k−1), H0]   (25)

is a familiar term in sequential detection algorithms [1, 5]. Γ(k) can be calculated sequentially at every stage from [1]

Γ(k) = {det[E(k)] / det[V_z(k|k−1)]}^{1/2} exp(−½{[z(k) − C(k) x̂(k|k−1)]^T V_z^{−1}(k|k−1)[z(k) − C(k) x̂(k|k−1)] − [z(k) − C(k) x̄(k|k−1)]^T E^{−1}(k)[z(k) − C(k) x̄(k|k−1)]}),   (26)

where

V_z(k|k−1) = C(k) V_x̂(k|k−1) C^T(k) + R(k).   (27)
For problem I, E and x̄ are given by

E(k) = C(k) V_x̂(k|k−1, Q = 0) C^T(k) + R(k),
x̄(k|k−1) = x̂(k|k−1, μ_w = 0, Q = 0),   (28.I)

where V_x̂(k|k−1, Q = 0) and x̂(k|k−1, μ_w = 0, Q = 0) are the certain error variance and estimate which result when the plant noise is absent. For problem II, E and x̄ are given by

E(k) = R(k),  x̄(k|k−1) = 0,   (28.II)

whereas for problem III

E(k) = R(k),  x̄(k|k−1) = μ_x(k).   (28.III)
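The resulting probability recursion is compact enough to state in code. The sketch below is a hedged illustration, with our own names, of (21), (24), (26), and (27) applied to one measurement:

```python
import numpy as np

def update_posterior_h1(p_h1_prev, z, x_hat, V, x_bar, E, C, R):
    """Sequential update of P[H1|Z(k)] from P[H1|Z(k-1)].
    x_hat, V are x̂(k|k-1) and V_x̂(k|k-1); x_bar, E are the H0
    mean and covariance of (28)."""
    Vz = C @ V @ C.T + R                       # eq. (27)
    r1 = z - C @ x_hat                         # H1 innovation
    r0 = z - C @ x_bar                         # H0 innovation
    quad = r1 @ np.linalg.solve(Vz, r1) - r0 @ np.linalg.solve(E, r0)
    Gamma = np.sqrt(np.linalg.det(E) / np.linalg.det(Vz)) \
        * np.exp(-0.5 * quad)                  # eq. (26)
    p_h0_prev = 1.0 - p_h1_prev
    gamma = (Gamma / p_h0_prev) / (1.0 + Gamma * p_h1_prev / p_h0_prev)  # eq. (24)
    return gamma * p_h1_prev                   # eq. (21)
```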
Therefore, using the new measurement z(k) and the already known a posteriori probability of H1, P[H1|Z(k−1)], Γ(k) and hence γ(k) can be calculated at every stage, and hence P[H1|Z(k)] can be determined sequentially.

The error variance associated with x̃(k+1|k) is not explicitly needed, because the derivation used the Kalman filter as the basic algorithm in developing the uncertain estimation algorithm. Knowledge of the estimation error variance, however, is of great importance for comparative numerical studies. In the following, the error variance for the sequential estimator will be developed. The conditional error variance is defined for certain estimation as

V_x̂(k+1|k) ≜ E{[x(k+1) − x̂(k+1|k)][x(k+1) − x̂(k+1|k)]^T | Z(k), H1},   (29)

whereas for uncertain estimation it is defined as

V_x̃(k+1|k) ≜ E{[x(k+1) − x̃(k+1|k)][x(k+1) − x̃(k+1|k)]^T | Z(k)}.   (30)
In order to relate V_x̃(k+1|k) with V_x̂(k+1|k), Bayes' mixed chain rule is used in (30). Thus

V_x̃(k+1|k) = ∫ e(k+1|k) e^T(k+1|k) {f[x(k+1)|Z(k), H1] P[H1|Z(k)] + f[x(k+1)|Z(k), H0] P[H0|Z(k)]} dx(k+1),   (31)

where

e(k+1|k) ≜ x(k+1) − x̃(k+1|k)   (32)

is the uncertain estimation error. Using (9), (12), and (31) yields, for the three problems considered,

V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H0|Z(k)] V̄_x(k+1)
  + P[H1|Z(k)] P[H0|Z(k)] [x̂(k+1|k) − μ̄_x(k+1)][x̂(k+1|k) − μ̄_x(k+1)]^T,   (33.I)

V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H1|Z(k)] P[H0|Z(k)] x̂(k+1|k) x̂^T(k+1|k),   (33.II)

V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H0|Z(k)] V_x(k+1)
  + P[H1|Z(k)] P[H0|Z(k)] [x̂(k+1|k) − μ_x(k+1)][x̂(k+1|k) − μ_x(k+1)]^T.   (33.III)
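For instance, (33.III) is a two-component Gaussian-mixture covariance; a short sketch (our naming) of that combination:

```python
import numpy as np

def uncertain_variance_iii(p_h1, V_hat, V_x, x_hat, mu_x):
    """Eq. (33.III): mixture of the certain error variance V_hat and
    the prior variance V_x, plus a spread term between the certain
    estimate and the prior mean. Setting V_x = 0 and mu_x = 0 gives
    the problem II form (33.II)."""
    p_h0 = 1.0 - p_h1
    d = np.atleast_1d(x_hat - mu_x)
    return p_h1 * V_hat + p_h0 * V_x + p_h1 * p_h0 * np.outer(d, d)
```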
Thus estimation algorithms and associated error variances have been obtained for the three problems of interest in uncertain estimation. These algorithms are summarized in Table I.

Continuous Uncertain Estimator

Results pertaining to the evolution of the likelihood ratio function in continuous time [1, 6] are helpful in developing the continuous estimator. Let

L(t) ≜ ln Λ̄[Z(t)],   (34)

where

Λ̄[Z(t)] ≜ f[Z(t)|H1]/f[Z(t)|H0]   (35)

and Z(t) is the continuous observation process {z(λ); 0 ≤ λ ≤ t}. Λ̄(t) is related to Λ(t) of (16) by

Λ(t) = Λ̄(t) p/q.   (36)
TABLE I
ESTIMATION UNDER UNCERTAINTY ALGORITHMS

Message model:           x(k+1) = A(k) x(k) + G(k) w(k)
Observation model:       z(k) = C(k) x(k) + v(k)
Certain estimator:       x̂(k+1|k) = A(k) x̂(k|k−1) + G(k) μ_w(k) + K(k)[z(k) − C(k) x̂(k|k−1)]
Gain:                    K(k) = A(k) V_x̂(k|k−1) C^T(k)[C(k) V_x̂(k|k−1) C^T(k) + R(k)]^{−1}
Certain error variance:  V_x̂(k+1|k) = A(k) V_x̂(k|k−1) A^T(k) + G(k) Q(k) G^T(k) − K(k) C(k) V_x̂(k|k−1) A^T(k)

Uncertain one stage predictor algorithms
Message model:           y(k+1) = A(k) y(k) + α G(k) w(k),  x(k) = β y(k)
Observation model:       z(k) = ξ C(k) x(k) + v(k)

Problem I (H0: α = 0; H1: α = 1; P[H0] = q, P[H1] = p):
  Estimator:       x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k) + P[H0|Z(k)] μ̄_x(k+1),
                   μ̄_x(k+1) = A(k) μ̄_x(k)
  Error variance:  V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H0|Z(k)] V̄_x(k+1)
                   + P[H1|Z(k)] P[H0|Z(k)] [x̂(k+1|k) − μ̄_x(k+1)][x̂(k+1|k) − μ̄_x(k+1)]^T,
                   V̄_x(k+1) = A(k) V̄_x(k) A^T(k)

Problem II (H0: β = 0; H1: β = 1; P[H0] = q, P[H1] = p):
  Estimator:       x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k)
  Error variance:  V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k)
                   + P[H1|Z(k)] P[H0|Z(k)] x̂(k+1|k) x̂^T(k+1|k)

Problem III (H0: ξ = 0; H1: ξ = 1; P[H0] = q, P[H1] = p):
  Estimator:       x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k) + P[H0|Z(k)] μ_x(k+1),
                   μ_x(k+1) = A(k) μ_x(k) + G(k) μ_w(k)
  Error variance:  V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H0|Z(k)] V_x(k+1)
                   + P[H1|Z(k)] P[H0|Z(k)] [x̂(k+1|k) − μ_x(k+1)][x̂(k+1|k) − μ_x(k+1)]^T,
                   V_x(k+1) = A(k) V_x(k) A^T(k) + G(k) Q(k) G^T(k)

Propagation of the a posteriori probability of H1:
  Γ(k) = {det[E(k)]/det[V_z(k|k−1)]}^{1/2}
         × exp(−½{[z(k) − C(k) x̂(k|k−1)]^T V_z^{−1}(k|k−1)[z(k) − C(k) x̂(k|k−1)]
         − [z(k) − C(k) x̄(k|k−1)]^T E^{−1}(k)[z(k) − C(k) x̄(k|k−1)]}),
  where V_z(k|k−1) = C(k) V_x̂(k|k−1) C^T(k) + R(k) and
  E(k) = C(k) V_x̂(k|k−1, Q = 0) C^T(k) + R(k),  x̄(k|k−1) = x̂(k|k−1, μ_w = 0, Q = 0)   (Problem I)
  E(k) = R(k),  x̄(k|k−1) = 0                                                          (Problem II)
  E(k) = R(k),  x̄(k|k−1) = μ_x(k)                                                     (Problem III)

  P[H1|Z(k)] = γ(k) P[H1|Z(k−1)],  P[H0|Z(k)] = 1 − P[H1|Z(k)],
  γ(k) = {Γ(k)/P[H0|Z(k−1)]}/{1 + Γ(k) P[H1|Z(k−1)]/P[H0|Z(k−1)]}

Prior statistics:
  E{w(k)} = μ_w(k),  var{w(k)} = Q(k);  E{v(k)} = 0,  var{v(k)} = R(k)
  P[H1] = p,  P[H0] = q = 1 − p
  E{x(0)} = μ_x0,  var{x(0)} = V_x0
  x̂(0) = μ_x0,  V_x̂(0) = V_x0;  μ̄_x(0) = μ_x0,  V̄_x(0) = V_x0  (Problem I);
  μ_x(0) = μ_x0,  V_x(0) = V_x0  (Problem III)
The evolution of L(t) in time can be represented by the stochastic differential equation [1, 6]

dL(t) = x̂^T(t) C^T(t) R^{−1}(t) dη(t) − ½ x̂^T(t) C^T(t) R^{−1}(t) C(t) x̂(t) dt,   (37)
L(0) = 0,

where

z(t) dt = dη(t),   (38)

and x̂(t) is obtained as the solution of the continuous certain estimator
algorithm [1, 2], which evolves from the message and observation models that are the continuous equivalents of (1), (2), and (3):

dy(t)/dt = A(t) y(t) + α G(t) w(t),   (39)
x(t) = β y(t),   (40)
z(t) = ξ C(t) x(t) + v(t).   (41)

There are three uncertain estimation problems here, just as in the discrete case. For each problem, when hypothesis H1 is true the certain estimate is optimum. The algorithms for the certain estimator are

dx̂(t)/dt = A(t) x̂(t) + G(t) μ_w(t) + K(t)[z(t) − C(t) x̂(t)],   (42)
K(t) = V_x̂(t) C^T(t) R^{−1}(t),   (43)
dV_x̂(t)/dt = A(t) V_x̂(t) + V_x̂(t) A^T(t) + G(t) Q(t) G^T(t) − K(t) C(t) V_x̂(t),   (44)

with initial conditions x̂(0) = μ_x0 and V_x̂(0) = var[x(0)] = V_x0. The coefficient matrices in the continuous case are related to those in the discrete case by

A(t) = lim_{T→0} [A(kT) − I]/T,  G(t) = lim_{T→0} G(kT)/T,  C(t) = lim_{T→0} C(kT),
Q(t) = lim_{T→0} T Q(kT),  R(t) = lim_{T→0} T R(kT),

where T is the sampling period and where T → 0 also implies k → ∞, kT → t. From (34) and (35), (37) can be rewritten as

dΛ(t)/dt = Λ(t)[x̂^T(t) C^T(t) R^{−1}(t) z(t) − ½ x̂^T(t) C^T(t) R^{−1}(t) C(t) x̂(t)],   (45)
Λ(0) = p/q.   (46)

This equation must be interpreted as a stochastic differential equation. The continuous version of equations (17) can similarly be shown to be

x̃(t) = [Λ(t)/(1 + Λ(t))] x̂(t) + [1/(1 + Λ(t))] μ̄_x(t),   (47.I)
x̃(t) = [Λ(t)/(1 + Λ(t))] x̂(t),   (47.II)
x̃(t) = [Λ(t)/(1 + Λ(t))] x̂(t) + [1/(1 + Λ(t))] μ_x(t),   (47.III)

where

dμ_x(t)/dt = A(t) μ_x(t) + G(t) μ_w(t),  dμ̄_x(t)/dt = A(t) μ̄_x(t).

The initial conditions for these relations are μ_x(0) = μ̄_x(0) = μ_x0.
The error variance equations for optimum estimation under uncertainty are obtained in much the same way as for the discrete case. These relations are the continuous analogs of (33):

V_x̃(t) = P[H1|Z(t)] V_x̂(t) + P[H0|Z(t)] V̄_x(t) + P[H1|Z(t)] P[H0|Z(t)] [x̂(t) − μ̄_x(t)][x̂(t) − μ̄_x(t)]^T,
V_x̃(t) = P[H1|Z(t)] V_x̂(t) + P[H1|Z(t)] P[H0|Z(t)] x̂(t) x̂^T(t),
V_x̃(t) = P[H1|Z(t)] V_x̂(t) + P[H0|Z(t)] V_x(t) + P[H1|Z(t)] P[H0|Z(t)] [x̂(t) − μ_x(t)][x̂(t) − μ_x(t)]^T,

for problems I, II, and III respectively, where P[H1|Z(t)] = Λ(t)/[1 + Λ(t)] and P[H0|Z(t)] = 1/[1 + Λ(t)].
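In practice the continuous likelihood ratio is best propagated through its logarithm (37) rather than through (45) directly. Below is a hedged Euler sketch (names and step size are ours; a crude integration of the stochastic equation, not the paper's implementation):

```python
import numpy as np

def log_lr_euler_step(L, x_hat, z, C, R_inv, dt):
    """One Euler step of (37), approximating dη(t) ≈ z(t) dt over the
    step. L is ln Λ̄(t); Λ(t) of (45)-(46) is (p/q) * exp(L)."""
    s = x_hat @ C.T @ R_inv                    # x̂ᵀ(t) Cᵀ(t) R⁻¹(t)
    return L + (s @ z - 0.5 * s @ C @ x_hat) * dt
```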
IV. ILLUSTRATIVE EXAMPLE

In this section a simple example is mechanized and solved by using the results in Table I. The purpose of this example is to assess the relative performance of the new optimal algorithms. Problem III will be considered, and the optimum results will be compared with results from the linear algorithm in [4]. The stationary stochastic process x(k) is normally distributed with known mean and variance. It can be modeled as

x(k+1) = x(k),
z(k) = x(k) + v(k) : H1;  P[H1] = p,
z(k) = v(k) : H0;  P[H0] = q = 1 − p,

with prior statistics

V_x̂(1|0) = V_x(1|0) = R = 0.2,  x̂(1|0) = x̃(1|0) = μ_x(0) = 0.

For this particular example, problems I, II, and III all lead to the same optimum estimation algorithms. These are:

x̃(k+1|k) = P[H1|Z(k)] x̂(k+1|k),
x̂(k+1|k) = x̂(k|k−1) + K(k)[z(k) − x̂(k|k−1)],
K(k) = V_x̂(k|k−1)[V_x̂(k|k−1) + R]^{−1},
V_x̂(k+1|k) = V_x̂(k|k−1) − K(k) V_x̂(k|k−1).
The needed posterior probabilities are obtained from

P[H0|Z(k)] = 1 − P[H1|Z(k)],
P[H1|Z(k)] = γ(k) P[H1|Z(k−1)],
γ(k) = {Γ(k)/P[H0|Z(k−1)]}/{1 + Γ(k) P[H1|Z(k−1)]/P[H0|Z(k−1)]},
Γ(k) = {R/[R + V_x̂(k|k−1)]}^{1/2} exp(−½{[z(k) − x̂(k|k−1)]²/[R + V_x̂(k|k−1)] − z²(k)/R}).

The error variance under uncertainty for this (problem III) example is

V_x̃(k+1|k) = P[H1|Z(k)] V_x̂(k+1|k) + P[H0|Z(k)] V_x(k+1) + P[H1|Z(k)] P[H0|Z(k)] x̂²(k+1|k),

where V_x(k+1) = V_x(0) = 0.2.
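A compact simulation of this scalar example, under stated assumptions (the seed, horizon, and p = 0.7 are our illustrative choices; the paper's Monte Carlo settings are not given), might look like:

```python
import numpy as np

rng = np.random.default_rng(1)
R, p, n_steps, h1_true = 0.2, 0.7, 200, True

x = rng.normal(0.0, np.sqrt(0.2))   # x(k+1) = x(k): an unknown constant
x_hat, V, p_h1 = 0.0, 0.2, p        # x̂(1|0), V_x̂(1|0), P[H1]

for _ in range(n_steps):
    z = (x if h1_true else 0.0) + rng.normal(0.0, np.sqrt(R))
    # posterior probability from the predicted densities, eqs. (21), (24), (26)
    Gamma = np.sqrt(R / (R + V)) * np.exp(
        -0.5 * ((z - x_hat) ** 2 / (R + V) - z ** 2 / R))
    p_h1 = Gamma * p_h1 / ((1.0 - p_h1) + Gamma * p_h1)
    # certain one-stage predictor; A = C = 1, Q = 0
    K = V / (V + R)
    x_hat += K * (z - x_hat)
    V -= K * V
    x_tilde = p_h1 * x_hat          # uncertain estimate x̃(k+1|k)

print(f"x = {x:.3f}, x_tilde = {x_tilde:.3f}, P[H1|Z] = {p_h1:.3f}")
```

Averaging the squared error over repeated runs of this kind gives an ensemble estimate of the error variance of the sort plotted in Figures 2 and 3.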
The above algorithm and the corresponding linear one in [4] were simulated with several values of x(0) selected from a normal distribution with mean 0 and variance 0.2. Changing the value of the a priori probability of H1, namely p, does not change the steady state performance of the optimal algorithm. However, the "transient" performance is definitely affected by the change in p. In the case in which z(k) = x(k) + v(k), i.e., H1 is true, the optimal algorithm performance was found to be superior to that of the suboptimum algorithm of [4]. The minimum steady state estimation error variance in [4] is nonzero and is obtained from

lim_{k→∞} V_x̃(k) = (1 − p) V_x̂(0) = (1 − p) V_x(0),

which is zero only if p = 1, which corresponds to the certain estimation case.

FIGURE 2. Error variance: H1 true.
FIGURE 3. Error variance: H0 true.
Figures 2 and 3 show the ensemble estimation error variances V_x̃(k+1|k) for different values of p, the a priori probability of H1. As expected, the lowest error variances occur for H1 true when p is large, and for H0 true when p is small.
SUMMARY

This paper has developed sequential algorithms for estimation under uncertainty. The algorithms are nonlinear, involving the product of a Kalman filter estimate and a likelihood ratio term. The uncertain models are such as to include a variety of physical applications.

BIBLIOGRAPHY

1. Sage, A. P., and Melsa, J. L., Estimation Theory: With Applications to Communications and Control, McGraw-Hill, New York, 1970.
2. Sage, A. P., Optimum Systems Control, Prentice-Hall, Englewood Cliffs, N.J., 1968.
3. Middleton, D., and Esposito, R., Simultaneous detection and estimation of signals in noise, IEEE Trans. Information Theory 14, No. 3, pp. 434-444, May 1968.
4. Nahi, N., Optimal recursive estimation with uncertain observation, IEEE Trans. Information Theory 15, No. 4, pp. 457-462, July 1969.
5. Sage, A. P., and McLendon, J. R., Discrete sequential detection and likelihood ratio computation for nongaussian signals in gaussian noise, Proceedings Purdue University Centennial Year Symposium on Information Processing, pp. 589-598, April 1969.
6. Duncan, T., Evaluation of likelihood functions, Information and Control 13, pp. 62-74, 1968.

Received March 23, 1970