Mathematics and Computers in Simulation XX (1978) 102-109 0 North-Holland Publishing Company
AN EXACT EQUIVALENCE BETWEEN THE DISCRETE- AND CONTINUOUS-TIME FORMULATIONS OF THE KALMAN FILTER M.W.A. SMITH Senior Lecturer,
School of Computer Science,
Ulster College, The Northern Ireland Polytechnic,
United Kingdom
and A.P. ROBERTS The Queen’s University of Belfast, United Kingdom
It is shown that if the definition of the covariance of a white noise sequence in discrete-time is derived from the accepted mathematical description for the covariance of a white noise process in continuous-time, compatibility between the discreteand continuous-time versions of the Kalman filter is complete. Consequently the approach to the limit of the discrete-time filter to obtain its continuous-time equivalent no longer depends on Kalman’s non-rigorous argument for dividing the covariante of a white noise sequence by the sampling interval At. Such an exact equivalence is essential for comparing the accuray of discrete-time computations with results obtained by numerically integrating the continuous-time filter equations. This approach provides a pragmatic technique for the determination of the most suitable sampling interval for discrete-time Kalman filtering. 1. Introduction
white noise source which may be correlated with w(t). In discrete-time the system (1) and measurements (2) may be expressed respectively as
An important aspect of control theory is concerned with the ability to measure the state of a system when measurements are corrupted by noise. A significant advance was made by Kalman (e.g. [6]) who devised an optimal filter to estimate the state of a linear system which may also be disturbed by noise. When a control is incorporated such a system may be described in continuous-time by x = F(t) x + G(r) w(t) + C(t) u(t) ,
x(kti)=~(kti,kj~(k)+r(k+i,k)w(k) +*(ktl,k)u(k), z(ktl)=H(jctl)x(k+l)+~(ktl),
(4)
where @(k t 1, k), I’(k + 1, k) and \k(k + 1, k) are the state, disturbance and control transition matrices respectively. Each of the matrices in (3) and (4) is of the same dimension as its continuous-time equivalent in eqs. (1) and (2). Kalman [S] originally devised his filter by analysing the problem in discrete-time to obtain the following recursive relationships, which provide optimal estimates 3 of the state of (3).
(1)
where x(t) is an Z-vector representing the state, w(t) a p-vector representing a white noise input and u(t) an r-vector representing the input or control of the system. The matrices F(t), G(t) and c(t), which describe the nature of the system, disturbance and control, respectively, are all assumed to be of compatible dimensions, The associated measurements, or outputs, of (1) are provided by z(t) = H(t) x(t) +Il(t) ,
(3)
yktlIk)=~(ktl,k)~(Q(kIk-l) tK(k)[z(k)-H(k)k(kIk-l)]+*(k+llk)u(k),
(2)
(9 K(k)=[~((ktl,k)P(kIk-l)H’(k)+r(k+l,k)S(k)]
where z is an m-vector (m < 1) and H(t) is the (m X Z) measurement matrix. The m-vector r~(t) represents a
X [H(k)P(klk 102
- l)H’(k)
+R(k)]-' ,
(6)
M. WA. Smith, A.P. Roberts / The Kalman filter
P(ktlIk)=Q(k+l,k){P(kIk-1) -[P(kIk-l)H’(k)-+r(k+l,k)S(k)] X [H(k)P(k 1k - 1) H’(k) + R(k)] -’ X [H(k)P(k 1k - 1) + S’(k) r’(k + 1, k)]} Q’(k + 1, k) +r(k+l,k)Q(k)r’(k+l,k).
(7)
If x”(k + 11k) is defined as the error in the estimate %(k+llk)ofx(k+l),then x”(k+llk)=x(k+l)-it(k+lik). The matrix qk + 11 k) is defined as the covariance of the predicted error in the estimate at time point k + 1 using information at the preceding sample point k. Thus
The matrix K(k) is called the Kalman gain. The covariances of the noise sequences are defined such that EfwO’) w’(k)) = Q(k)
6jk
7
E’iO’)tl’(k)I
=R(k)
ajk
>
E {Wo’) ‘l’(k))
= S(k)
6 jk
3
where is the Kronecker delta. All the stochastic quantities are assumed to be zero mean sequences. Subsequently, Kalman and Bucy [7] analysed the problem entirely in the continuous domain obtaining a formulation of the filter by solving the Wiener-Hopf integral equation. The continuous-time equations corresponding to (5), (6) and (7) may be expressed as 6jk
k=F(t)f++(t)[z(t)
-H(t)f]
K(t) = [p(tI t)ff(t)
+cyt)u(t),
+ G(t) S(f)] R-‘(t)
,
(8) (9)
i, = F(t) P + PF’(t) - [PI!‘(t) + G(t) S(t)] R-‘(t) X W”(t)
+ G(t)
s(t)l’ + G(t) Q G’(t) ,
(10)
where the noise covariances are defined by E{w(t) w’(7)) = Q(t) s(t - r) ) Eh(t)
tl’(~)) = R(t) s(f - 4 ,
E{w(Otl’(7)1
= $0 Nt - Q-19
in which s(t - 7) is the Dirac delta function.
An alter-
103
native simple derivation of the continuous-time Kalman filter has been provided by Johnson [4]. The Kalman filter has been studied in great depth to gain insight into its performance. Nevertheless there can be difficulties in its implementation. For instance, an a priori requirement is the knowledge of a considerable amount of statistical data concerning the system. Further, it has been found that under certain circumstances the estimates provided by the filter may diverge. A profusion of papers has appeared suggesting methods of overcoming these and other difficulties. Both continuous- and discrete-time versions of the filter have been used as the basis for further theoretical development with perhaps more significant success resulting from considerations in discrete-time. In addition, the computational performance of the filter has received much attention: Camp [2] presented results of a straightforward simulation of the filter, while Gaston and Rowlands [3] studied various integration algorithms for the implementation of a digital solution of the continuous-time equations. Mendel [8] has examined in detail the computational requirements for the realization of the discrete version of the filter. Bierman and Thornton [l] have compared a number of discretetime algorithms designed to overcome the numerical deficiencies of the conventional Kalman filter. One design parameter which can affect the quality of the results obtained from discrete-time computations is the choice of a suitable sampling interval. However few, if any, experiments or simulations appear to have been conducted to compare the performance of the discrete filter with results furnished by any of the available numerical algorithms when applied to the continuous-time filter equations. To perform such an investigation it would be necessary to simulate the system, including the additive noise inputs, together with exactly equivalent discrete- and continuous-time versions of the Kalman filter. There is a theoretical difficulty in establishing this equivalence which must be overcome before such work can proceed with confidence. In section 2 the nature of the difficulty is examined in the context in which it arises, that is, in the process of limiting the discrete Kalman filter to its continuous-time equivalent. Following a brief discussion in section 3, the apparent inconsistency is explored in section 4, and a suggestion is made to resolve the problem. Some numerical examples are included, in section 5, to illustrate the behaviour of
104
M. WA. Smith, A.P. Roberts / The Kalman filter
=
discrete-time computations and to clarify the application of the theory developed in the preceding section.
& {Z + F(t)
At + O(At)*} fit I t - At) H’(t)
+ (G(t) At + O(At)*) S,(t)] 2. The formulation
X [H(t) fit I t - At) H’(t) •t R,(t)]
of the problem
Originally, to obtain the continuous-time filter, Kalman limited the discrete-time equations by permitting the interval At between successive samples to tend to zero, while simultaneously allowing the number of samples n, within a constant finite time interval, to increase without limit. As c(t) u(t) and \k(k t 1, k) u(k) do not contribute to the problem, the control term will be omitted henceforth. If successive time points k - 1, k, k + 1 are indicated by t - At, t, t + At, respectively, (5) becomes
K(t) = A!d o ; +
{I + F(t) At + O(At)*} P(t 1t - At) [ sD(t)
Xg(t)
+ {G(t) At + O(At)*} at
+ K(t) [z(t) - H(t) 2(t I t - At)] . the relation
H(t)P(tIt-At)H’(t)+--
@(t+At,t)=Z+F(t)At+O(At)*,
=
where O(At)* is an (I Xl) matrix of all terms which include powers of At greater than unity, k((t t At 1t) - 2(t 1t - At) = F(t) At$(t 1t - At) +K(t)[z(t)
-H(t)
r?(tl t - At)] + O(At)*2(tl
t - At).
Dividing through by At, approaching the limit as both At -+ 0 and 12+ 00, and writing k(t I t) as Z(t), the difference eqaution becomes the differential equation: i(t)
= F(t) a(t) +$($t)/At)[z(t)
-H(t)
2(t)] .
RD(t>
At
1 1
K(t)/At is the continuous-time gain. In order to avoid confusion when a variable in discrete-time differs from the corresponding variable in continuous-time, a subscript D will be introduced to denote the discretetime version of that variable. To evaluate K(t) = substitute from (6), replacing LtAt+O, n+m KD(t)/At, k by t as above, and in addition incorporate the relation I’(t t At, t) = G(t) At + O(At>*,
(12)
-’
[fltl t) H’(t)+G(t) SD(t)] 6%) ,
(13)
as required. If identical reasoning is applied to approach the limit of (7) it is easily shown that (10) results. There is clearly an inconsistency in such a course of action: Dividing QD(k), RD(k) and SD(k) by At may imply that the continuous-time noise covariances are given by tit)
3
= Apo f?D(t)/At
R(t) =
Lt
&(t)/At,
(14)
At-0
At this stage it is clear that the limit LtA,+O, n+m
(11)
However as At -+ 0 and n + CQ,(11) would appear to indicate that the continuous-time gain increases without limit. To overcome this problem Kalman, referring to his argument as “a delicate matter”, introduced ad hoc the idea of dividing each of the noise covariances QD(k), l+,(k), SD(k) by At. The outcome is to provide ati additional At in (11) for manipulative purposes, so that
ift + AtI t) = @((t+ At, t) %(ti t - At)
Introducing
-‘.
s(t)
=
Lt At’0
s,(t)/At
,
while the actual process of approaching (12) to obtain (13) implies that Lt
At’0 *+-
Lt At-+0 n-+-
!&(t) = tit)
RD(f)
the limit of
1
= R(t),
to give
Lt SD(t) At-0 ?I+-
= s(t).
&,(t)/At
Moreover, the meaning of (14) is not at all obvious
105
hf. W.A. Smith, A.P. Roberts / The Kalman jilter
as such limits would appear to lack physical explanation.
4. Reconciliation Kalman filter
3. Discussion
In continuous-time it is generally accepted that the covariance R(t) of a white noise processq(t) is represented by
Kalman filter theory is widely accepted and is usually included in standard texts on estimation and control of stochastic systems. In these sources authors usually refer to Kalman’s non-rigorous argument, often introducing further elaboration. Nevertheless it would appear that the difficulty is more fundamental and a rigorous explanation is required, not only for intellectual satisfaction but also to permit the computational comparisons indicated in section 1. The Riccati equation (10) often exhibits very large rates of change during the initial stages of computation. When an integration algorithm is applied the step size must be sufficiently small to accommodate such a transient. However a small step size becomes unnecessary as the steady state is approached and results in excessive execution times. If, instead, a variable step size method is employed, not only may the overall execution time be reduced but greater accuracy can be achieved. When the discrete-time version of the Kalman filter is being used, it may be regarded as an alternative method of fixed step size approximation and should the results appear questionable, it would be prudent to repeat a specific computation in continuous-time. In this context the Runga-Kutta-Merson method (with perhaps a revised error criterion) would probably be the most reliable and thoroughly tested algorithm to apply. For such checks,_it is imperitive to ensure that the values of Q, R and S in continuoustime are consistent with those used during the discretetime computation. The greater accuracy anticipated will, of course, be at the expense of a considerably increased execution time. It would therefore be expected that continuous-time solutions would be used sparingly. The procedure of checking by a slow but accurate computation is familiar to analogue computer users, who frequently validate digitally a few of the many results obtained during an investigation. In the next section, the difference between the two versions of the Kalman filter is resolved, permitting such checks to be readily performed.
Qtl@)q’(r))
of the discrete- and continuous-time
=R(t) 60 - 7) 9
while in discrete-time described by
(15)
a white noise sequence is
E{rl (k) ‘I’@) =RD@)
6jk .
(16)
These descriptions are anomalous in that (16) is always bounded, while (15) is not. Moreover, in continuoustime
/&Y{q(t)11’(r)]dr= 0
[R(r)+r)dr=R(t)
(17)
b
for 0 < t < T, whereas the equivalent discrete-time, that is n Lt At-0
C RDQ] 6jkAt=
O .
i=l
expression in
(18)
Clearly the difficulty in approaching the limit of the discrete-time Kalman filter is associated with the definitions (15) and (16). Although continuous-time white noise cannot exist physically, it is an important concept in stochastic theory and, in the absence of a better mathematical formulation, its description by (15) will be accepted. It is therefore necessary to deduce a discrete-time definition consistent with (15). If &jk is replaced by 6ik/At then, from fig. 1, it is clear that Lt -FE=fi(t-r),wheret-r=At. At
At'0
The definition that
(16) may therefore be revised such
E{tl(k) ‘I’@} At = R&)
6jk .
(19)
Expressions (1.5) and (19) are now compatible. The subscript S is used to indicate the value of the discrete-time covariance using the suggested definition (19).
M. W.A. Smith, A.P. Roberts / The Kalman filter
106
Example 1. Consider a second order equation
Amplitude
j; = -1.5i
- 2.oy t osu,
where u = 7 andy(0) =3(O) = 0. To express this equation in state space representation, state variables yl, yz are defined as 4;, y respectively. Then
(23) Fig. 1.
where Comparing (15) and (19) the elements of Rs(k) are equal to the elements of covariance R(t) at the time point k = t while, from (16) and (19) R,,(k) = Rs(k)/At
(20)
so that (18) becomes
Lt
At-0
k j=l
The discrete-time
representation
Tkil)l
=@(k+l,k)
Ya(k + 1)
of (23) is
[ 1 yt(k)
t\k(k+
l,k)u(k)
yz (k)
(24)
&Sk) __ 6jkAt At
= Rs(k) ,
n+=
which is consistent with (17). Since the discrete-time Kalman filter was derived using (16) instead of (19) to limit to continuous-time, R,,(k) must be replaced by Rs(k)/At. Subsequently, as the limit is taken, Rs(k) is written as R(t). Furthermore, an immediate consequence is that limits of the form of (14) are no longer implied.
5. Computational
results and implications
Comparatively few studies appear to have been undertaken to establish techniques for choosing a suitable sample interval for the discrete-time realization of the Kalman filter. While the importance of an appropriate step size is well known, the inclusion of such details when reporting the outcome of computations is rare. In this section the characteristics of discretetime computations are studied by means of two simple examples which indicate conflicting requirements. Firstly, a second order differential equation is computed in both continuous- and discrete-time and secondly, using the theory developed in section 4, exactly equivalent continuous- and discrete-time versions of a Riccati equation are compared.
Assuming a constant step size At between time points k and k + 1, k = 0, 1,2, . . . . Q(k + 1, k) is given by the solution of ,
@(t,to) = FW,to)
(25)
subject to the initial condition @(to, to) = I and to At At - to. Also it can be shown that tk+ 1
\k(k+ l,k)=
s
@(tk+l,7)
Cd7.
(26)
*k
The result of computing size Runge-Kutta-Merson
(23) by the variable step algorithm over the interval
Table 1 RKM solution of eq. (23)
t 0.000 1.024 2.048 3.012 4.096 5.018 6.042 7.066 8.000 9.011 10.001
E;
Y
0.000 1.275 3.981 -1.503 -1.326 -1.789 2.576 1.189 -1.202 -3.328 -8.586
0.000 0.999 1.892 1.956 1.784 1.718 1.730 1.751 1.755 1.152 1.752
x 10-l x 10-l X 10-l x 1O-2 X 1O-2 x 1O-2 x 10-3 x 1O-3 x 1o-4
M. W.A. Smith, A.P. Roberts
Table 2 Discrete-time
solution
of eq. (24) using two different
/ The Kalman filter
107
sample intervals
t
Y
Y
i
Y
0.000 1.000 2.000 3.000 4.000 5.000 6.000 7.000 8.000 9.000 10.000
0.000 1.285 4.409 -1.352 -1.448 -1.949 2.563 1.317 -1.170 -3.312 -8.475
0.000 0.968 1.872 1.966 1.797 1.718 1.728 1.750 1.755 1.752 1.749
0.000 1.281 4.413 -1.298 -1.395 -1.574 2.895 1.669 +2.605x +3.756x +2.615
0.000 0.969 1.867 1.961 1.794 1.716 1.726 1.747 1.752 1.749 1.747
x x x x x x x x x
10-l 10-l 10-l 1o-2 1O-2 1o-2 1o-3 1O-3 1O-4
Step size 0.01
continuous-
and discrete-time -_____
solutions
turn depends on the covariance of error equation (7). As (7) is independent of (5) a meaningful comparison of the continuous- and discrete-time algorithms may be obtained by studying in isolation the propagation of the covariance of estimation error. As a second example the covariance of error of a system described by the scalar equation i(t) = fx(t) will be computed. Let the measurement be z(t) =x(t) + v(t), where a(t) is a stationary white zero mean process of covariance
RK4
Discrete-time
Step size
0.0001
0.1
0.01
t
PO)
p(k + Ilk)’
p(k+
p(k + 1 I k) =
#2(k+ l,k)p(klkp(klk-
l)rD (27)
l)+rD
where, in accordance with the theory presented in sec-
Ilk)
~~ 2.000 2.963 7.326 1.816
E{rl(t) v(r)) = rW - 7). From (29, @(k + 1, k) = efAf, where At is the sampling interval. Equations (7) and (10) respectively reduce to
of a Riccati
Algorithm
0.000 1.000 2.000 3.000
10-l 10-l 10-l 10-2 1O-2 1O-2 1O-3 lo-’ X 1O-3 --____
Step size 0.0001
0 < t < 10 is given in table 1, while discrete-time solutions using (24) for At = 0.01 and At = 0.0001 are presented in table 2. The numerical values for the elements of @(k + 1, k) and W(k t 1, k) were derived analytically from (25) and (26) and are therefore very accurate. Thus errors can be expected to be negligible when the step size is large. However, the cumulative effect of truncation is significant when the sample interval is very small; the incorrect signs obtained forj, when 8 < t < 10 indicate that the results eventually become meaningless. The implications that may be drawn from this example are that decreasing the step size will in general be detrimental to the accuracy of the solution and that a very great precision in the values of the elements of Q((k + 1, k) is essential. Example 2. The Kalman filtered estimates provided by (5) depend on the gain obtained from (6) which in
Table 3 Exactly equivalent
x x x x x x
equation
when a white noise covariance .___
is present ~~
0.005
0.001
0.0001
p(k + Ilk)
P(k ,f Ilk)
PUG+ Ilk)
2.000 2.920 7.221 1.790
2.000 2.956 7.309 1.812
2.000 2.960 7.310 1.810
--x X X x
lo3 1O-2 lo@ lo-’
2.000 2.232 5.519 1.368
x x x X
lo3 1O-2 10-5 lo-’
2.000 2.878 7.115 1.764
x x x X
lo3 1O-2 1o-5 lo-’
x x x x
lo3 1o-2 lo-’ 1o-7
x X x x
lo3 1O-2 1o-5 1o-7
x x x x
IO3 1O-2 1o-5 1O-7
108
M.WA. Smith, A.P. Roberts / The Kalman filter
tion 4, rD = r/At, and
When At = 0.0001
ri = p(2f - p/r) .
(28)
A Kalman filter computation differs from that of the second order equation discussed above in that prediction is involved. Assume that the initial value ~(01-1) is given; then ~(1 I0) can be claculated from (27). If the step size is reduced to At/2, to repeat the computation the state transition matrix becomes #j*(k + 1, k) andrD is doubled. Incorporating these changes the predicted value of p at k = 1 is now given by two cycles of (29) where j = k and j t 2 E k t 1. The accuracy of the predicted value is therefore dependent on the sampling interval. For example, if the step size is At, @(k + 1, k) = $, ~(01-1) = 4 and rD = 2, the predicted value at t = At is, from (27), p(Atl0) = A. When the step size is halved the predicted value at the same time point, from (29), isp(At1 At/2) = ;. Such a poor result is due to the choice of a step size unsuited to the dynamics of the system: @(k t 1, k) = 0.25 is equivalent to fAt = -1.386. When f= -3.0, r = 2, p(O) = 2000, discrete-time results obtained using a number of different values for At are given in table 3, together with those computed from (28) by a fourth order Runge-Kutta algorithm. In this case for the range of values of At adopted it is apparent that as the sampling interval is reduced the accuracy is improved. It would appear that At = 0.001 would offer the best compromise between minimum execution time and accuracy of result. 5.1. Discussion
1
r0.4963
0.5000
x 1o-4
0.6566 x lo-’
1 *
The effect of reducing At is that the elements of \k and the off diagonal elements of @ become smaller relative to the diagonal elements of @. Also, a larger number of computational cycles are required to span a given duration of time. Taken together, the outcome is a greater accumulation of error as the step size is decreased. In the second example the effect of truncation is minimal and ill-conditioning does not arise. It is an example to illustrate the consequences of step size selection on the accuracy of prediction. Combining the outcome of both examples it is apparent that to obtain state estimates by means of the discrete-time Kalman filter algorithm a compromise between two conflicting characteristics must be found. In order to do so, an accurate continuous-time solution used as a reference, is a pragmatic approach to the determination of a suitable value for the step size for subsequent discrete-time computation. For a specialized application of this nature a sophisticated control of the continuous-time variable step size integration algorithm should be designed, so that its output will occur properly aligned to prescribed time points and not as shown in table 1.
6. Conclusion
The first example was chosen to show the effects of cumulative error when no prediction is involved. In addition, the coefficients of the second order equation (23) were selected to show that such error is not confined to ill-conditioned cases. However, the discretetime representation (24) becomes increasingly ill-conditioned as the sample interval is reduced. When t = 0.01 -0.1985 X 10-l 0.9850 @(k+ l,k)= [ 0.9925 X lo-* 0.9999 ’ \k(k+ l,k)=
\k(k+ l,k)=
X lo-*1
1 0.2500 x lo- 4J
A difficulty concerning the equivalence of the discrete- and continuous-time versions of the Kalman filter has been resolved. It follows that to perform a filtering computation in continuous-time, when values QD, RD and SD are given for the sampled data case, the numerical values of the noise covariances which should be used are QD, RD and SD multiplied by the step size. Results from the examples computed in section 5 have indicated that increased accuracy does not necessarily occur as a consequence of decreasing the step size, and that the selection of step size for discretetime operation can be obtained by comparing the out-
M. W.A. Smith, A.P. Roberts / The Kalman filter
come of a few preliminary computations with the output obtained by integrating the continuous-time equations. An interesting corollary, from a theoretical point of view, is that if Q, R and S are constant in continuous-time and it is decided that computations should be performed in discrete-time using a variable sample interval, the corresponding quantities QD, RD and So vary inversely with that interval. Finally, it is noted that at time point k the covariance of a white noise sequence, if formulated by (19) represents the area of the rectangle between samples at k and k + 1, and that this is a consequence of the inclusion of the Dirac delta function in the continuoustime specification (15).
109
References 111 J.B. Bierman and C.L. Thornton, 23-35. 121 J.A. Camp, J. Comput. Physics [31 J.A. Gaston and J.R. Rowland, 2 (1975) 131-140.
Automatica 8 (1971) Comput.
13 (1977)
175-196. and Elect. Engng
[41 W.J. Johnson I.E.E.E. Trans. Autom. Control AC-14 (1969) 380-384. [51 R.E. Kalman, J. Basic Engr. (ASME Trans.) 82 D (1960) 35-45. [61 R.E. Kalman, Proc. 1st Symp. on Engng. Appl. of Random Function Thry. and Prob. (Wiley), (1963) 270-388. 171 R.E. Kalman and R.S. Bucy, J. Basic Engr. (ASME Trans.) 83 D (1961) p. 95. 181 J.M. Mendel, I.E.E.E. Trans. Autom. Control AC-16, (1971) 748-758.