Pergamon
Computers chem. Engng Vol. 19, No. 2, pp. 187-195, 1995
0098-1354(94)E0046-8
Copyright © 1995 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.
ESTIMATION OF THE COVARIANCES OF THE PROCESS NOISE AND MEASUREMENT NOISE FOR A LINEAR DISCRETE DYNAMIC SYSTEM

J. ZHOU and R. H. LUECKE†

Department of Chemical Engineering, University of Missouri-Columbia, Columbia, MO 65211, U.S.A.

(Received 5 February 1993; final revision received 6 April 1994; received for publication 27 April 1994)

Abstract-There have been many papers written about tuning Kalman filters. Such tuning usually consists of adjustments to the values used for the covariances of the model and observation noise. In this paper we describe a procedure that uses the observations together with a linear process model to develop estimates for the effective values of the covariance matrices of both the process noise (Q) and the measurement noise (R). These are needed for maximum likelihood state estimation in control and optimization. A horizon state estimator is derived that is linearly unbiased and has a constant state estimation error covariance. The process and measurement models are combined with the state estimates and the constant state estimation error covariance to generate cumulative error covariances that are also constant. A maximum likelihood function and a linear regression technique are then used to obtain the diagonal elements of the covariance matrices of the process noise and the measurement noise. A simulation example for two chemical reactors in series is presented.
1. INTRODUCTION
Process plants typically have a large number of state variables, many of which interact. In modern multivariable control and optimization, a good estimator for the current state is often vital. Many of these variables, however, are corrupted by measurement noise; models of the process are limited in detail and accuracy and may be subject to unmeasured inputs. Statistical characterization of the noise, both in the measurements and in the model, is needed for estimation of the states with any maximum likelihood estimator such as the Kalman filter. Both the measurement noise covariance R and the process noise covariance Q must be known. In practical problems, however, these quite often are not available. At best there is often large uncertainty in Q and R because of inadequate knowledge about the nature of the process and measurement errors. Poor values of Q and R lead to poor state estimation. Some authors have discussed the impact on the dynamic response of the Kalman filter of using Q and R as tuning parameters (Mehra, 1970; Sage and Melsa, 1971; Ray, 1989; Myers et al., 1991). In this work, we develop a method to estimate Q and R using only the observations and the model for a linear discrete system with both measurement noise and process noise.

† To whom all correspondence should be addressed.
The general linear discrete stochastic dynamic system is taken as:

x_{k+1} = Ax_k + Bu_k + Gw_k,   (1)

z_k = Hx_k + v_k,   (2)

where the measurement sequence {z_k} and the control input sequence {u_k} are both given. The state x_k will be estimated. The process noise w_k and the measurement noise v_k are assumed to be zero-mean, white and mutually uncorrelated. The process noise covariance is assumed to be diagonal positive semidefinite (Q ≥ 0), the measurement noise covariance positive definite (R > 0), and both Q and R are independent of time.

2. THEORY
The moving horizon method is based on the idea that the value of the state variable at any time can be uniquely determined by the initial conditions, and estimated by minimizing a least squares function subject to the linear model constraints. The method comprises four recursive steps: the state estimation, the state estimation error covariance, the cumulative error covariance and, finally, the estimates of Q and R.
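Before walking through those steps, it helps to have the system (1)-(2) in executable form. The following sketch simulates a small system of that structure; the matrices and noise levels are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative 2-state, 1-measurement system of the form (1)-(2);
# all numerical values here are assumptions for the sketch.
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.02])   # diagonal, positive semidefinite (Q >= 0)
R = np.diag([0.05])         # positive definite (R > 0)

def simulate(x0, u, steps):
    """Generate states x_k and measurements z_k from equations (1)-(2)."""
    x = np.array(x0, dtype=float)
    xs, zs = [], []
    for k in range(steps):
        v = rng.multivariate_normal(np.zeros(R.shape[0]), R)
        zs.append(H @ x + v)              # z_k = H x_k + v_k, equation (2)
        xs.append(x.copy())
        w = rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
        x = A @ x + B @ u[k] + G @ w      # x_{k+1} = A x_k + B u_k + G w_k, equation (1)
    return np.array(xs), np.array(zs)

xs, zs = simulate([0.0, 0.0], np.zeros((300, 1)), 300)
```

The measurement and state sequences produced this way play the roles of {z_k} and the (normally unknown) true {x_k} in what follows.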
2.1. State estimation

The basic idea of the horizon method is that past states x_k can be estimated from N forward measurements, z_k to z_{k+N-1}, for each k. The measurements are linear functions of the estimate x̂_k along with the inputs u_k and the noise v_k and w_k. The linear dynamic system (1) can be combined with (2) to yield the following modelling equation based on the initial state x_k and j forward steps:

z_{k+j-1} = HA^{j-1}x_k + Σ_{r=1}^{j-1} HA^{r-1}Bu_{k+j-1-r} + Σ_{r=1}^{j-1} HA^{r-1}Gw_{k+j-1-r} + v_{k+j-1},
k = 1, 2, ..., M; j = 1, 2, ..., N.   (3)

N is the state estimation horizon; M is the covariance estimation horizon, which is used to estimate the cumulative error covariances. Thus, in this equation {z_{k+j-1}: k = 1, ..., M; j = 1, ..., N} are considered to be in hand. Estimates of x_k are thus "smoothing" estimates. While equation (3) appears to calculate the observations, it is better to think of it as an implicit equation using the observations and inputs for estimation of the state x_k and the noise sequences. Define the last two terms in equation (3) as the cumulative error:

e_{kj} = Σ_{r=1}^{j-1} HA^{r-1}Gw_{k+j-1-r} + v_{k+j-1}.   (4)

Because the process noise and measurement noise have been assumed to be zero-mean, white and mutually uncorrelated, the cumulative error e_{kj} is also zero-mean, white and mutually uncorrelated. The covariance of the cumulative error e_{kj} is:

W_j = cov(e_{kj}) = Σ_{r=1}^{j-1} (HA^{r-1}G)Q(HA^{r-1}G)^T + R,
k = 1, 2, ..., M; j = 1, 2, ..., N.   (5)

The cumulative error covariance W_j is a function of j, i.e. the length of the state estimation, but not a function of k. The state estimation modelling equations may thus have different error covariances depending on the length of the state estimation. The generalized least squares (GLS) estimation technique can be applied to efficiently estimate the state x_k:

min J(x_k) = Σ_{j=1}^{N} ‖ z_{k+j-1} − HA^{j-1}x_k − Σ_{r=1}^{j-1} HA^{r-1}Bu_{k+j-1-r} ‖²_{W_j^{-1}}.   (6)

If the weighting matrix W_j follows from equation (5), this gives a maximum likelihood estimator. The analytical solution of the unconstrained optimization problem of equation (6), with decision variable x_k, is obtained as:

x̂_k = Σ_{j=1}^{N} K_j ( z_{k+j-1} − Σ_{r=1}^{j-1} HA^{r-1}Bu_{k+j-1-r} ),   k = 1, 2, ..., M,   (7)

where

K_j = α(HA^{j-1})^T W_j^{-1},   (8)

α = [ Σ_{j=1}^{N} (HA^{j-1})^T W_j^{-1} (HA^{j-1}) ]^{-1}.   (9)

The weighting coefficient K_j is the measurement gain of z_{k+j-1}. This is very similar to the gain in the Kalman filter and determines the measurement influence in the state estimation. In general, {‖K_j‖: j = 1, 2, ...} is a quickly decreasing monotone sequence, because the error covariance W_j in equation (5) increases quickly as j increases. Thus, the first several gains, corresponding to the first several measurements {z_k, z_{k+1}, ...}, play important roles in the x_k-estimation. The matrix α ∈ R^{n×n} defined in equation (9) is not a function of time. It represents the statistical effect of the noise. α approaches zero as Q and R approach zero simultaneously. A condition guaranteeing the positive definiteness of α is given as follows:

Theorem 1. α ∈ R^{n×n} is positive definite if (A, H) is observable.

The proof of this theorem can be found in Appendix E. The next section will show that α actually is the state estimation error covariance. Therefore, this theorem gives a test for the boundedness of the state estimation error covariance. The state estimation problem depends on the observability of the original system. It is easy to test for observability by determining whether the rank of the observability matrix of (A, H) equals n, the dimension of the states. The state estimation horizon N in equation (7) must be greater than the rank of the observability matrix of (A, H) in the statistical sense. That is, there exists a minimum integer m such that for each N ≥ m we have rank[(H^T, (HA)^T, ..., (HA^{N-1})^T)^T] = n. This implies that the sample size must be greater than the number of unknowns in the regression function. In most situations, since ‖K_j‖ decreases quickly, the first several observations z_{k+j-1} in equation (7) are the most important for the state estimate x̂_k. The observation z_{k+j-1} for large j has little effect on the x_k-estimation. N is usually 2n or 3n.

The estimate of x_k in equation (7) is simply a linear combination of the N forward measurements with the measurement gains K_j. Solving equation (7) without much trouble gives an estimate of x_k for k = 1, 2, ..., M.
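The quantities of equations (5) and (7)-(9) can be sketched directly in code. The system matrices below are illustrative assumptions (not the paper's example); the final sum checks the identity Σ_j K_j HA^{j-1} = I from Appendix A, which is what makes the estimator unbiased.

```python
import numpy as np

# Sketch of the horizon-estimator quantities of equations (5) and (7)-(9).
# The system matrices are illustrative assumptions.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.02])
R = np.diag([0.05])
N = 6   # state estimation horizon, on the order of 2n-3n as suggested

def horizon_gains(A, G, H, Q, R, N):
    n = A.shape[0]
    HAs, Ws = [], []
    Ar = np.eye(n)                     # holds A^{j-1}
    acc = np.zeros_like(R)             # running sum in equation (5)
    for j in range(1, N + 1):
        HAs.append(H @ Ar)
        Ws.append(acc + R)             # W_j, equation (5)
        Mr = H @ Ar @ G
        acc = acc + Mr @ Q @ Mr.T      # add (H A^{j-1} G) Q (H A^{j-1} G)^T
        Ar = Ar @ A
    # alpha^{-1} = sum_j (H A^{j-1})^T W_j^{-1} (H A^{j-1}), equation (9)
    inv_alpha = sum(HA.T @ np.linalg.solve(W, HA) for HA, W in zip(HAs, Ws))
    alpha = np.linalg.inv(inv_alpha)
    # K_j = alpha (H A^{j-1})^T W_j^{-1}, equation (8)
    Ks = [alpha @ HA.T @ np.linalg.inv(W) for HA, W in zip(HAs, Ws)]
    return Ks, Ws, HAs, alpha

Ks, Ws, HAs, alpha = horizon_gains(A, G, H, Q, R, N)
# Appendix A identity: sum_j K_j H A^{j-1} = I, so the estimator is unbiased.
S = sum(K @ HA for K, HA in zip(Ks, HAs))
```

Given measurements, equation (7) is then just x̂_k = Σ_j K_j z_{k+j-1} (plus the input terms when B u_k is nonzero).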
2.2. State estimation error covariance

The state estimation error at time k is defined as the difference between the true value x_k and the corresponding estimated value x̂_k. Denoting this error by e_k, we can write:

e_k = x_k − x̂_k.   (10)

Note that the true value of x_k involves the true process model [equation (1)] with true noise that is unknown in practical problems. Hence e_k is known only to the process simulator. The estimated value x̂_k is also a random variable with some distribution. For a given k, the sampling distribution of x̂_k refers to the different values of x̂_k that would be obtained with repeated sampling when the state estimation horizon steps are held constant from sample to sample. It can be shown that the state estimation error, as defined in equation (10), can be expressed as:

e_k = − Σ_{j=1}^{N} K_j e_{kj},   (11)

where the K_j are the fixed quantities defined by equation (8) and the e_{kj} are the cumulative errors defined by equation (4). e_k is a linear combination of the e_{kj} for j from 1 to N and thus also is zero-mean, white and mutually uncorrelated. The sampling distribution of e_k can be described by its mean and covariance matrix:

μ = E{e_k} = 0,   (12)

P = cov(e_k) = α.   (13)

Equation (12) implies that the population mean of the state estimation error is zero; x̂_k is thus an unbiased linear estimator that tends neither to overestimate nor to underestimate systematically. The state estimation error covariance is a constant positive definite matrix α which does not depend explicitly on the measurements or on time and can be calculated as the state x_k is estimated. This is a key property for this work and will be used in the cumulative error covariance estimation. The absolute values of the elements of P decrease as the state estimation horizon N increases, since all the summed matrices in α are positive definite. That is, each new measurement in the regression reduces the state estimation error covariance, although the reduction becomes slight. The horizon state estimator defined in equation (7) is an unbiased linear filter like the Kalman filter, but is not necessarily a better estimator of x_k. Whether it is a poorer or a better estimator of the states is not the essential issue for this work. The horizon state estimator has the valuable property of a constant estimation error covariance α with which to estimate Q and R. For the Kalman filter, the a priori or a posteriori error covariance is dependent on time, especially at the beginning and during the first several iterations of the estimation process, until a bounded steady-state value is achieved under the conditions of detectability of (A, H) and reachability of (A, G√Q). Of course, a fixed-interval smoothing algorithm involving the Kalman filter could be employed to generate smoothed error covariances that change very little with time for an appropriate final time N. Such a smoothing scheme, which includes a Kalman filter, an information filter and a smoother, is more complicated than the horizon state estimator and has no guarantee of constant error covariances.

2.3. Cumulative error covariance

To estimate Q and R, we need to build a set of models with constant error covariances such that, through the whole sample, all errors are drawn from the same distribution. The error covariances, of course, involve Q and R, which is the principal logic for the estimation of Q and R. In real problems the true state values are unknown, so the estimated state values must be employed in the set of models. For that purpose, subtracting HA^{j-1}x̂_k from both sides of the modelling equation (3) and rearranging yields the following equivalent modelling equation:

z_{k+j-1} − HA^{j-1}x̂_k − Σ_{r=1}^{j-1} HA^{r-1}Bu_{k+j-1-r} = HA^{j-1}(x_k − x̂_k) + Σ_{r=1}^{j-1} HA^{r-1}Gw_{k+j-1-r} + v_{k+j-1},
k = 1, 2, ..., M; j = 1, 2, ....   (14)

Note that equation (14) has the same structure as equation (3). Equation (14) puts the emphasis on k, the time length of the covariance estimation, from 1 to M for each j, while equation (3) emphasizes j, the time length of the state estimation, from 1 to N for each k. The estimation horizons N and M are generally different, with M > N. Define the right side of the above equation as the total cumulative error. Denoting this error by ē_{kj}, we have:

ē_{kj} = HA^{j-1}(x_k − x̂_k) + Σ_{r=1}^{j-1} HA^{r-1}Gw_{k+j-1-r} + v_{k+j-1}.   (15)

This error remains zero-mean, white and mutually uncorrelated. The state estimation error e_k is correlated with e_{kj}, since e_k defined in equation (11) involves e_{kj}.
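The constancy claim of equation (13) can be spot-checked numerically before it is used: per the derivation in Appendix B, cov(e_k) = Σ_j K_j W_j K_j^T, and this sum collapses exactly to α. A sketch with illustrative matrices:

```python
import numpy as np

# Numerical check of equation (13): P = cov(e_k) = sum_j K_j W_j K_j^T = alpha.
# System matrices are illustrative assumptions.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.02])
R = np.diag([0.05])
N = 6

n = A.shape[0]
HAs, Ws, Ar, acc = [], [], np.eye(n), np.zeros_like(R)
for j in range(N):
    HAs.append(H @ Ar)
    Ws.append(acc + R)                                  # equation (5)
    Mr = H @ Ar @ G
    acc, Ar = acc + Mr @ Q @ Mr.T, Ar @ A
alpha = np.linalg.inv(sum(HA.T @ np.linalg.solve(W, HA)
                          for HA, W in zip(HAs, Ws)))   # equation (9)
Ks = [alpha @ HA.T @ np.linalg.inv(W) for HA, W in zip(HAs, Ws)]  # equation (8)

# cov(e_k) from equation (11), e_k = -sum_j K_j e_kj with cov(e_kj) = W_j:
P = sum(K @ W @ K.T for K, W in zip(Ks, Ws))
```

The equality P = α holds algebraically, not just statistically: substituting (8) gives Σ_j α(HA^{j-1})^T W_j^{-1} (HA^{j-1}) α = α α^{-1} α = α.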
The covariance of this total cumulative error can be shown to be:

W̄_j = cov(ē_{kj}) = W_j − (HA^{j-1})α(HA^{j-1})^T.   (16)

Let us briefly explain this equation and its significance. The error covariance W̄_j is constant for all k at each fixed j. This condition provides the crucial requirement cited at the beginning of this section. Secondly, the positive definite (but generally not diagonal) matrix (HA^{j-1})α(HA^{j-1})^T represents the influence of the state estimation errors. The minus sign takes into account the correlation between e_k and e_{kj}. Consider the simplest case, j = 1. From equation (5), W_1 = R for all k. We can use the measurement sequence {z_k: k = 1, ..., M} and the estimated state sequence {x̂_k: k = 1, ..., M} to estimate the measurement noise covariance R, which appears in equation (16), by employing the modelling equation (14). Similarly, for j = 2, W_2 = (HG)Q(HG)^T + R for all k. The sequences {z_k: k = 2, ..., M+1} and {x̂_k: k = 1, ..., M} can be used to estimate the matrix (HG)Q(HG)^T, which involves Q. In general, the sequences {z_k: k = j, ..., M+j−1} and {x̂_k: k = 1, ..., M} can be used to estimate W_j, which appears in equation (16) and involves Q and R, by employing equation (14). This fundamental strategy gives the covariance estimates.

There is one final complication. We cannot use the generalized least squares estimation technique to minimize some norm of the difference between an observation and its estimate with decision variables Q and R. Consider the density function of a simple normally distributed error ε with zero mean and variance σ²: f(σ²) = exp(−ε²/2σ²)/√(2πσ²). The true value of σ² cannot be estimated by minimizing only the exponential term ε²/2σ². In this work, the maximum likelihood (ML) estimation technique is applied to find the covariance estimate. With all statistics Gaussian, the maximum likelihood function for W̄_j can be given by:

L_j(W̄_j) = (2π)^{−Mp/2} (det W̄_j)^{−M/2} exp{ −(1/2) Σ_{k=1}^{M} (ē_{kj})^T (W̄_j)^{−1} (ē_{kj}) },   j = 1, 2, ...,   (17)

where ē_{kj} follows from the left side of equation (14). The maximum likelihood estimate of the total cumulative error covariance W̄_j is readily shown to be:

Ŵ̄_j = (1/M) Σ_{k=1}^{M} (ē_{kj})(ē_{kj})^T,   j = 1, 2, 3, ....   (18)
The resulting estimator is the usual sample covariance matrix, often called a mean square because a sum of squares has been divided by an appropriate number of degrees of freedom. To exploit the unbiasedness property, the denominator M in equation (18) could be reduced to M − 1 to account for the one degree of freedom lost by using the estimated state instead of the true state. It is convenient to write W_j in the following form:

W_j = Btr(D_j Q D_j^T) + R,   k = 1, 2, ..., M; j = 1, 2, 3, ...,   (19)

where D_1 = 0 and

D_j = [(HG)^T, ..., (HA^{j−2}G)^T]^T,   j = 2, 3, ....   (20)

D_j is the (j−1)th observability matrix for the linear discrete dynamic system. It is the generalized form, and returns to the ordinary form used in optimal control if G is an identity matrix. Btr(·) is the block trace of a square block matrix. Recalling equations (16) and (19), we can rearrange equation (18) for j = 1 and j ≥ 2 as follows:

Ŵ_1 = R̂ = (1/M) Σ_{k=1}^{M} (z_k − Hx̂_k)(z_k − Hx̂_k)^T + HαH^T,   (21)

Ŵ_j = Btr(D_j Q D_j^T) + R = (1/M) Σ_{k=1}^{M} (ē_{kj})(ē_{kj})^T + (HA^{j−1})α(HA^{j−1})^T,   j = 2, 3, ....   (22)

Equations (21) and (22) are the governing equations used in this work. The estimate of R can be obtained directly from equation (21), but the estimate of Q cannot be explicitly obtained from equation (22). This is the obstacle to estimating noise covariances when a linear system includes process noise (or model errors). For the simplest case in which both Q and R are scalar (p = q = 1), Q is easily calculated from equation (22) if D_j does not approach zero. For higher dimensions, the calculation of Q becomes more difficult. The matrix Btr(D_j Q D_j^T) is usually a summation of several symmetric matrices [(HA^{r−1}G)Q(HA^{r−1}G)^T for r = 1, ..., j−1] which involve Q and are possibly not of full rank. Therefore, we cannot obtain an explicit formula for Q by multiplying by a pseudoinverse matrix. Even the simpler case j = 2 may not work, because the matrix D_2 is generally not square and may not have independent rows. The error covariance horizon M is the sample size used to estimate R in (21) and the matrix Btr(D_j Q D_j^T) + R in (22). M must be sufficiently large that we have enough information to be able to
make a good maximum likelihood estimator. The sample means will converge in probability to the true values of R and of the matrix Btr(D_j Q D_j^T) + R as M approaches infinity. In most practical problems, M > 50 is sufficiently large. In the next section, a linear regression method is developed to estimate the elements of Q when both Q and R are diagonal.
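The block structure of equations (19)-(20) can be checked directly: stacking the blocks HA^{r-1}G into D_j and summing the p x p diagonal blocks of D_j Q D_j^T reproduces the W_j of equation (5). A sketch with illustrative matrices:

```python
import numpy as np

# D_j (equation 20) and the block trace Btr(D_j Q D_j^T) (equation 19),
# cross-checked against W_j from equation (5). Matrices are illustrative.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.02])
R = np.diag([0.05])
p = H.shape[0]

def D(j):
    """D_j = [(HG)^T, ..., (H A^{j-2} G)^T]^T, equation (20), for j >= 2."""
    blocks, Ar = [], np.eye(A.shape[0])
    for _ in range(j - 1):
        blocks.append(H @ Ar @ G)
        Ar = Ar @ A
    return np.vstack(blocks)

def btr(M):
    """Btr: sum of the p x p diagonal blocks of a square block matrix."""
    return sum(M[i:i + p, i:i + p] for i in range(0, M.shape[0], p))

def W(j):
    """W_j directly from equation (5)."""
    out, Ar = np.array(R, dtype=float), np.eye(A.shape[0])
    for _ in range(j - 1):
        Mr = H @ Ar @ G
        out, Ar = out + Mr @ Q @ Mr.T, Ar @ A
    return out

Dj = D(4)
Wj = btr(Dj @ Q @ Dj.T) + R     # equation (19) reproduces equation (5)
```

The diagonal blocks of D_j Q D_j^T are exactly the terms (HA^{r-1}G)Q(HA^{r-1}G)^T, which is why the block trace plus R recovers W_j.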
2.4. Q and R estimation

Q and R are assumed to be diagonal positive semidefinite. This condition implies that all the component disturbances of the measurement noise and the process noise are uncorrelated, so the covariances among those component disturbances are zero. This condition is usually reasonable in process engineering; e.g. a disturbance in temperature is not related to a disturbance in flowrate. In this case, then, only the diagonal elements need to be considered in equations (21, 22). It is convenient to define the diagonal elements of R and of (HA^{r−1}G)Q(HA^{r−1}G)^T as column vectors. Denoting these vectors by b^{(0)} and b^{(r)}, we have:

b^{(0)} = diag(R),   (23)

b^{(r)} = diag[(HA^{r−1}G)Q(HA^{r−1}G)^T],   r = 1, 2, ..., L.   (24)

Thus, equations (21, 22) can be written as the following formulae:

b^{(0)} = diag{ (1/M) Σ_{k=1}^{M} (z_k − Hx̂_k)(z_k − Hx̂_k)^T + HαH^T },   (25)

b^{(r)} = diag{ Ŵ_{r+1} − Ŵ_r },   r = 1, 2, ..., L,   (26)

where Ŵ_j is calculated from the right side of equation (22). Note that the subscript or superscript j on the right side of equation (22) has been changed to r (r = j − 1) in equation (26), following the definition of equation (24) and simplifying the notation in this context. Thus, a vector sequence {b^{(r)}: r = 1, ..., L} can be found from equation (26). All the constant matrices H, A and G are known, so the matrix HA^{r−1}G in the definition of b^{(r)} [equation (24)] can also be calculated. For convenience, we write the components of the matrix HA^{r−1}G as follows:

HA^{r−1}G = [ a^{(r)}_{11} ... a^{(r)}_{1q} ; ... ; a^{(r)}_{p1} ... a^{(r)}_{pq} ],   r = 1, 2, ..., L.   (27)

A linear equation is obtained from equation (24) in the following matrix form:

diag[(HA^{r−1}G)Q(HA^{r−1}G)^T] = C^{(r)} q,   r = 1, 2, ..., L,   (28)

where C^{(r)} is the matrix formed from the squared elements of HA^{r−1}G:

C^{(r)} = [ (a^{(r)}_{il})² ],   i = 1, ..., p; l = 1, ..., q;   r = 1, 2, ..., L,   (29)

and q is a q-dimensional vector of the diagonal elements of Q, which needs to be estimated. Combining equations (24), (26) and (28) gives:

b^{(r)} = C^{(r)} q,   r = 1, 2, ..., L.   (30)

Since C^{(r)} usually does not have q independent rows (generally p < q), the equations for r = 1, ..., L are stacked and solved by least squares:

q̂ = (C^T C)^{−1} C^T b,   (31)

where C = [(C^{(1)})^T, ..., (C^{(L)})^T]^T and b = [(b^{(1)})^T, ..., (b^{(L)})^T]^T. C is a (pL) × q matrix with the squared elements of the Lth observability matrix D_{L+1} defined in equation (20), and b is a (pL)-dimensional vector with elements calculated from equation (26). The estimated values of the diagonal elements of the process noise covariance Q can now be computed from the above equation if C has at least q independent rows. It can be shown that:

E{q̂} = q = (q_1, ..., q_q)^T,   (32)

cov(q̂) = E{(q̂ − q)(q̂ − q)^T} = Δ V_d(S) Δ^T − qq^T,   (33)

E{r̂} = r = diag(R) = (r_1, ..., r_p)^T,   (34)

cov(r̂) = E{(r̂ − r)(r̂ − r)^T} = V_d(S_0) − rr^T.   (35)

The notation in equations (33, 35) is defined in Appendix D. q and r are composed of the true diagonal elements of Q and R. Therefore, from equations (32, 34), the estimates of the diagonal elements of Q and R are unbiased. A general question now arises: how is L selected? Select L such that C has at least q independent rows, to make the q-estimation equation (31) work. A theorem is given to answer this question as follows:

Theorem 2. The diagonal elements of Q can be estimated from equation (31) by choosing L ≥ m, if m satisfies:

m = min{ j: rank[sgn(D_{j+1})] = q },   (36)

where sgn(·) is the sign function. The proof of this theorem is straightforward. In a generalization of the usual nomenclature of the process control texts and literature, a linear discrete dynamic system is called mth observable if m follows equation (36), or, equivalently, D_m is full rank.
This generalization extends the concept of observability to the estimation of Q. Another important point is that the condition rank[sgn(D_{L+1})] = q in equation (36), rather than rank(D_{L+1}) = q, is needed because full rank of D_{L+1} does not mathematically guarantee that C in equation (31) is full rank, which is a necessary condition for estimating Q. For example, some, but not all, of the elements of one row of D_{L+1} may have signs opposite to the corresponding elements of another row; the two rows can be linearly independent, yet their element-wise squares linearly dependent. In such a case we cannot conclude that C is full rank even if D_{L+1} is full rank, since the elements of C are the squared elements of D_{L+1}. For most chemical engineering problems, the condition rank(D_{L+1}) = q is in practice enough to guarantee that C has q independent rows.
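The squared-element regression of equations (29)-(31), together with the rank test of Theorem 2, can be sketched as follows. The matrices are illustrative assumptions, and b^{(r)} is formed here from the exact diag[(HA^{r-1}G)Q(HA^{r-1}G)^T] rather than from estimated Ŵ's, so the recovery is exact rather than statistical.

```python
import numpy as np

# Recovering diag(Q) via the squared-element regression of equations (29)-(31),
# plus the rank test of Theorem 2. Matrices are illustrative.
A = np.array([[0.7, 0.4],
              [0.1, 0.6]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
q_true = np.array([0.05, 0.10])    # diagonal elements of Q
Q = np.diag(q_true)
L = 3

C_blocks, b_blocks, D_blocks, Ar = [], [], [], np.eye(2)
for r in range(1, L + 1):
    Mr = H @ Ar @ G                          # H A^{r-1} G, equation (27)
    C_blocks.append(Mr ** 2)                 # C^{(r)}: squared elements, equation (29)
    b_blocks.append(np.diag(Mr @ Q @ Mr.T))  # b^{(r)}, equation (24)
    D_blocks.append(Mr)                      # blocks of D_{L+1}, equation (20)
    Ar = Ar @ A
C = np.vstack(C_blocks)                      # (pL) x q
b = np.concatenate(b_blocks)                 # (pL)-vector

# Theorem 2: the regression is workable when rank[sgn(D_{L+1})] = q.
rank_ok = np.linalg.matrix_rank(np.sign(np.vstack(D_blocks))) == Q.shape[0]

q_hat = np.linalg.lstsq(C, b, rcond=None)[0]  # equation (31)
```

For diagonal Q, diag(M Q M^T) = (M ∘ M) q element-wise, which is exactly why the squared-element matrix C^{(r)} makes the relation (30) linear in q.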
2.5. Algorithm

Equations (7, 25, 26, 31) constitute our algorithm for finding an estimate of (Q, R). Since the state estimation depends on Q and R, an iterative procedure is required. The key steps in the algorithm are:

(1) Select the parameters N, M and L, which are the numbers of points used in various aspects of the regression. These should be chosen as one generally chooses the number of data points for regression from a time series, i.e. large enough to converge in probability but not so large as to make the computation uneconomical.
(a) The state estimation horizon N must be greater than the rank of the observability matrix of (A, H). That is, there exists a minimum integer m such that for each N ≥ m we have rank[H^T, (HA)^T, ..., (HA^{N−1})^T] = n, where n is the dimension of the states. Simply put, the sample size must be greater than the number of unknowns in the regression function. In most situations, since the measurement gain K_j decreases quickly as j increases, the first several observations z_{k+j−1} in the state estimator (7) are the most important to the state estimate x̂_k; the observation z_{k+j−1} for large j has little effect on the x_k-estimation. A large value of N is not necessary in this step. N is usually 2n or 3n.
(b) The error covariance horizon M is the "sample size" used to estimate R in (21) and the matrix Btr(D_j Q D_j^T) + R in (22). M must be sufficiently large that we have enough information for a good maximum likelihood estimator. Theoretically, the sample means in (21) and (22) converge in probability to the true values of R and of Btr(D_j Q D_j^T) + R as M approaches infinity. In most practical problems, M > 50 is sufficiently large.
(c) The minimum choice of the linear regression number L for the estimation of the diagonal elements of Q is addressed in Theorem 2: L must be at least the integer m = min{j: rank[sgn(D_{j+1})] = q}. If L is small, more error may occur in Q, especially when ‖HA^{r−1}G‖ is small because the original system is ill-conditioned in observability. However, if L is large, the linkage between the observations and Q (through the matrix HA^{r−1}G) becomes far-reaching, possibly introducing more error. Note that M plus L must be less than the number of observations z_k. Typically, L ≈ M/2.
(2) The observations {z_k: k = 1, ..., M + L} are given and initial values of Q and R are selected.
(3) Calculate the state estimation error covariance α using equations (5) and (9).
(4) Calculate the state estimates {x̂_k: k = 1, ..., M} using equations (7) and (8).
(5) Calculate the error covariance estimates {Ŵ_j: j = 1, ..., L + 1} using equations (21) and (22); note that ē_{kj} in equation (22) follows from the left side of equation (14).
(6) Calculate {b^{(r)}: r = 0, 1, ..., L} using equations (25) and (26); the estimates of the diagonal elements of R are obtained by setting diag(R̂) = b^{(0)}.
(7) Calculate {C^{(r)}: r = 1, ..., L} using definitions (27) and (29); the estimates of the diagonal elements of Q are then obtained by solving the linear equation (31). If the residuals between the calculated and the guessed values of Q do not satisfy a smallness criterion, go back to step (2) and try again.

3. A SIMULATION EXAMPLE
Let us consider an isothermal continuous stirred-tank reactor (CSTR) with the irreversible first-order reactions A → B → C taking place (Example 3.2.5, Ray, 1989). The rates of reaction are given by r_1 = k_1 C_A and r_2 = k_2 C_B, where k_1 and k_2 are constants. The modelling equations for this system in the absence of control take the dimensionless form:

dx_1/dt = −(1 + c_2) x_1 + c_1 x_2,   (37)

dx_2/dt = 1 − (1 + c_1) x_2,   (38)

where x_1 = C_B/C_Af, x_2 = C_A/C_Af, c_1 = k_1V/F, c_2 = k_2V/F and t = t'F/V. A process noise w(t) is added to equation (38) to compensate for unmodelled system dynamics. Suppose that it is possible to measure the concentration of only species B, with a measurement noise v(t). Thus, the sampled system can be approximated by:

x_{k+1} = A x_k + G w_k,   (39)

z_k = H x_k + v_k,   (40)

where T is the sampling period and the state variable at time k is defined as x_k = (x_1, x_2)^T.

Table 1. The values of the parameters
k_1 = 172.5 h⁻¹               c_1 = k_1V/F = 230
k_2 = k_1/2 = 86.25 h⁻¹       c_2 = k_2V/F = 115
F = 1.5 m³ h⁻¹                T = Δt' F/V = (5/12) × 10⁻³
V = 2 m³                      x_1(0) = x_10 = 0.0
Δt' = 2 s                     x_2(0) = x_20 = 1.0

A = [ 0.9    0.096
      0.052  0.904 ],   G = [ 1.0  0
                              0    1.0 ],   H = (1  0).

The second observability matrix of (A, H, G) is full rank [rank(D_3) = q = 2, but rank(D_2) = 1] for the parameters given in Table 1. Thus, we can select L ≥ 2 to get the least squares estimate of Q.

This simulation example applied normally distributed process noise (for species A) and measurement noise (for species B), with zero mean and different true values of Q and R, generated from a random number generator (IMSL subroutine DRNNOA). The statistical average values of the true and estimated covariances of Q and R were computed using 200 different samples with different seeds for the random number generator. The state horizon N was chosen as 5, the covariance horizon M was 90, and the number of regression equations L was 50 for the least squares estimate of Q. The final results are shown in Table 2.

Table 2. Average values of true and estimated covariances of Q and R (representative rows)
Seeds       Q_true × 10³    Q_estm × 10³
101-300     0.100           0.107
201-400     0.999           1.108
301-500     4.998           5.215
401-600     19.946          21.321
501-700     79.889          83.038
601-800     4.980           5.951
701-900     19.973          19.011
801-1000    101.864         90.186

The statistical average estimated values are very close to the true ones. The average standard deviation for R is about 1%, while that for Q is about 25%. In this example, and in many real situations, the linkage between the observations and R is much more direct than that with Q. A 25% error in Q might seem serious for most estimations. Note, however, that Q is the covariance of the noise in the derivatives of the state variables and is estimated, in effect, by taking differences in R; the error in Q is a function of observability. For Kalman filter estimators, a 25% error in the estimate of Q is acceptable for most problems, since only order-of-magnitude guesses are usually available in practice. While this system of two consecutive reactions is simple, it illustrates the main ideas of the proposed method. More complex examples are under study.
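The whole procedure of Section 2.5 can be sketched end-to-end on a small system of the same 2-state, 1-measurement form. The matrices, horizons and noise levels below are illustrative choices, not the paper's values, and the true Q and R serve as the initial guess, as they would at convergence of the outer iteration.

```python
import numpy as np

# End-to-end sketch: simulate (1)-(2) with no inputs, estimate states by the
# horizon estimator (7), form the covariance estimates (21)-(22), read off
# diag(R) via (25), and solve the regression (31) for diag(Q).
rng = np.random.default_rng(7)
A = np.array([[0.7, 0.4],
              [0.1, 0.6]])
G = np.eye(2)
H = np.array([[1.0, 0.0]])
q_true = np.array([0.05, 0.10])
Q, R = np.diag(q_true), np.diag([0.02])
n, p = 2, 1
N, M, L, reps = 5, 400, 3, 20

# Gains and alpha from equations (5), (8), (9), using the guessed Q, R.
HAs, Ws, Ar, acc = [], [], np.eye(n), np.zeros_like(R)
for _ in range(max(N, L + 1)):
    HAs.append(H @ Ar)
    Ws.append(acc + R)                        # W_j, equation (5)
    Mr = H @ Ar @ G
    acc, Ar = acc + Mr @ Q @ Mr.T, Ar @ A
alpha = np.linalg.inv(sum(HA.T @ np.linalg.solve(W, HA)
                          for HA, W in zip(HAs[:N], Ws[:N])))   # equation (9)
Ks = [alpha @ HA.T @ np.linalg.inv(W)
      for HA, W in zip(HAs[:N], Ws[:N])]                        # equation (8)

W_hat = np.zeros((L + 1, p, p))
for _ in range(reps):
    T = M + max(N, L + 1)
    x, zs = np.zeros(n), []
    for _ in range(T):                        # simulate (1)-(2), B u_k = 0
        zs.append(H @ x + rng.multivariate_normal(np.zeros(p), R))
        x = A @ x + G @ rng.multivariate_normal(np.zeros(n), Q)
    zs = np.array(zs)
    xh = np.array([sum(Ks[j] @ zs[k + j] for j in range(N))
                   for k in range(M)])        # equation (7)
    for j in range(1, L + 2):
        e = zs[j - 1:j - 1 + M] - xh @ HAs[j - 1].T          # left side of (14)
        W_hat[j - 1] += ((e.T @ e) / M
                         + HAs[j - 1] @ alpha @ HAs[j - 1].T) / reps  # (21)-(22)

r_hat = np.diag(W_hat[0])                     # equation (25)
b = np.concatenate([np.diag(W_hat[r] - W_hat[r - 1])
                    for r in range(1, L + 1)])               # equation (26)
C = np.vstack([(H @ np.linalg.matrix_power(A, r) @ G) ** 2
               for r in range(L)])                           # equation (29)
q_hat = np.linalg.lstsq(C, b, rcond=None)[0]  # equation (31)
```

Consistent with the roughly 1% versus 25% deviations reported for R and Q in Section 3, r_hat tracks R closely while q_hat is considerably noisier; in a real application the iteration of Section 2.5 would refine an initial guess rather than start from the truth.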
4. SUMMARY
The horizon state estimator described here is linearly unbiased with a constant state error covariance. The ordinary Kalman filter and the various smoothing approaches involving the Kalman filter presented in texts and the literature may all fail this constancy property, which is necessary for estimating Q and R. In this work, we combine the measurement model into the process model and use the resulting estimated state values and the constant state estimation error covariance to generate the constant cumulative error covariances required by general estimation strategies. A maximum likelihood function and a linear regression technique yield the diagonal elements of Q and R efficiently. We also give a necessary condition for the choice of L, the number of regression equations used to estimate Q.
REFERENCES

Lewis F. L., Optimal Estimation. Wiley, New York (1986).
Mehra R. K., On the identification of variances and adaptive Kalman filtering. IEEE Trans. Aut. Control AC-15, 175-184 (1970).
APPENDIX A

Prove equation (11). Rewrite equation (7), using equations (3, 4), as follows:

x̂_k = Σ_{j=1}^{N} K_j ( z_{k+j−1} − Σ_{r=1}^{j−1} HA^{r−1}Bu_{k+j−1−r} )
    = Σ_{j=1}^{N} K_j ( HA^{j−1}x_k + e_{kj} )
    = ( Σ_{j=1}^{N} K_j HA^{j−1} ) x_k + Σ_{j=1}^{N} K_j e_{kj}.   (A1)

From definitions (8, 9):

Σ_{j=1}^{N} K_j HA^{j−1} = Σ_{j=1}^{N} α(HA^{j−1})^T W_j^{−1} HA^{j−1} = αα^{−1} = I.   (A2)

Hence x̂_k = x_k + Σ_j K_j e_{kj}, so e_k = x_k − x̂_k = −Σ_j K_j e_{kj}. Equation (11) holds since α is not singular.

APPENDIX B

Prove equation (13). From equations (5, 11) and definitions (8, 9), we have:

P = cov(e_k) = Σ_{j=1}^{N} K_j W_j K_j^T
  = Σ_{j=1}^{N} [α(HA^{j−1})^T W_j^{−1}] W_j [α(HA^{j−1})^T W_j^{−1}]^T
  = α [ Σ_{j=1}^{N} (HA^{j−1})^T W_j^{−1} (HA^{j−1}) ] α^T
  = αα^{−1}α
  = α.   (A3)

Note that both W_j^{−1} and α are symmetric and nonsingular.

APPENDIX C

Prove equation (16). Substituting equations (4, 11) into equation (15) results in:

ē_{kj} = HA^{j−1}(x_k − x̂_k) + e_{kj}
       = (I − HA^{j−1}K_j) e_{kj} − HA^{j−1} Σ_{s=1, s≠j}^{N} K_s e_{ks}.   (A4)

The covariance of ē_{kj} is:

W̄_j = (I − HA^{j−1}K_j) W_j (I − HA^{j−1}K_j)^T + (HA^{j−1}) ( Σ_{s=1, s≠j}^{N} K_s W_s K_s^T ) (HA^{j−1})^T
    = W_j − HA^{j−1}K_jW_j − W_j(HA^{j−1}K_j)^T + (HA^{j−1}) ( Σ_{s=1}^{N} K_s W_s K_s^T ) (HA^{j−1})^T.   (A5)

Note that:

HA^{j−1}K_jW_j = W_j(HA^{j−1}K_j)^T = HA^{j−1}α(HA^{j−1})^T,   (A6)

Σ_{s=1}^{N} K_s W_s K_s^T = α.   (A7)

Hence, equation (16) holds.

APPENDIX D

Prove equations (32, 33, 34, 35). Note that:

E{Ŵ̄_j} = W̄_j,   (A8)

W_{r+1} − W_r = (HA^{r−1}G)Q(HA^{r−1}G)^T.   (A9)

We have:

E{b̂^{(0)}} = diag(W_1) = diag(R) = r,   (A10)

E{b̂^{(r)}} = diag(W_{r+1} − W_r) = diag[(HA^{r−1}G)Q(HA^{r−1}G)^T] = C^{(r)}q.   (A11)

Thus, equations (32, 34) hold. To obtain the covariances of r̂ and q̂, we need to define a new function. Suppose that we have a sequence of square matrices with random entries given by:

A^{(r)} = [ a^{(r)}_{11} ... a^{(r)}_{1p} ; ... ; a^{(r)}_{p1} ... a^{(r)}_{pp} ],   r = 1, ..., L.   (A12)

The diagonal variance-covariance matrix of A^{(r)} is then defined as:

V_d(A^{(r)}) = [ E{a^{(r)}_{ii} a^{(r)}_{jj}} ],   i, j = 1, ..., p.   (A13, A14)

From equation (26), we have:

b̂^{(r)} = diag(Ŵ_{r+1} − Ŵ_r) = diag(S_r),   r = 0, 1, ..., L,   (A15)

where S_r = Ŵ_{r+1} − Ŵ_r, with the convention Ŵ_0 = 0 so that b̂^{(0)} = diag(Ŵ_1). Therefore, according to the definition of equation (A13), the diagonal variance-covariance matrices of S_0 and S are:

E{b̂^{(0)}(b̂^{(0)})^T} = V_d(S_0),   (A16)

E{b̂b̂^T} = V_d(S).   (A17)

Hence:

cov(r̂) = E{(r̂ − r)(r̂ − r)^T} = E{b̂^{(0)}(b̂^{(0)})^T} − rr^T = V_d(S_0) − rr^T,   (A18)

cov(q̂) = E{(q̂ − q)(q̂ − q)^T} = Δ E{b̂b̂^T} Δ^T − qq^T = Δ V_d(S) Δ^T − qq^T,   (A19)

where Δ = (C^TC)^{−1}C^T. Equations (33, 35) hold.

APPENDIX E

Prove Theorem 1.

Fact: the W_j ∈ R^{p×p} defined in equation (5) are positive definite. To show this fact, rewrite equation (5) as:

W_1 = R,
W_j = R + (HG)Q(HG)^T + (HAG)Q(HAG)^T + ... + (HA^{j−2}G)Q(HA^{j−2}G)^T,   j = 2, ..., N.   (A20)

We claim that (HA^{r−1}G)Q(HA^{r−1}G)^T ≥ 0 for r = 1, ..., j−1. Otherwise, if the symmetric matrix (HA^{r−1}G)Q(HA^{r−1}G)^T were not positive semidefinite, for some nonzero vector z ∈ R^p we would have:

0 > z^T(HA^{r−1}G)Q(HA^{r−1}G)^Tz = [√Q(HA^{r−1}G)^Tz]^T[√Q(HA^{r−1}G)^Tz] ≥ 0.

This is a contradiction. Since W_j consists of a sum of positive semidefinite terms and R > 0, the fact holds. Define:

D_N = [ H
        HA
        ...
        HA^{N−1} ],   (A21)

W^{−1} = blockdiag( W_1^{−1}, ..., W_N^{−1} ),   (A22)

where D_N ∈ R^{Np×n} is the observability matrix of (A, H) and the W_j ∈ R^{p×p} in (A22) are the weighting matrices defined in equation (5). Following these definitions, we can write equation (9) as:

α^{−1} = D_N^T W^{−1} D_N.

Since the W_j are positive definite, we are guaranteed the symmetry and positive definiteness of W^{−1}. Note that (A, H) is observable. This condition implies that rank(D_N) = n. Thus, the matrix D_N^T W^{−1} D_N is positive definite, and hence so is α. Indeed, if a given x ∈ R^n satisfies:

0 = x^T(D_N^T W^{−1} D_N)x = (D_N x)^T W^{−1}(D_N x),

then D_N x = 0. Since D_N has full column rank, this implies that x = 0.

NOMENCLATURE

A ∈ R^{n×n}
B ∈ R^{n×m}
G ∈ R^{n×q}
H ∈ R^{p×n}
Q ∈ R^{q×q}
R ∈ R^{p×p}
x ∈ R^n
u ∈ R^m
z ∈ R^p
w ∈ R^q
v ∈ R^p

n = dimension of states; m = dimension of control inputs; p = dimension of measurements; q = dimension of process noise.