Copyright © IFAC Identification and System Parameter Estimation, Budapest, Hungary 1991

ESTIMATING TIME-VARYING PARAMETERS IN REGRESSION MODELS

L. Guo, H.F. Chen and J.F. Zhang
Institute of Systems Science, Academia Sinica, Beijing 100080, PRC

Abstract. The Kalman filtering algorithm and the least-mean-squares (LMS) algorithm are two standard methods for estimating time-varying parameters in a linear regression model. This paper establishes upper bounds for the parameter tracking errors produced by these two algorithms, without resorting to stationarity or independence assumptions on the regressors.

Keywords: time-varying parameter, stochastic regression, Kalman filter, LMS, tracking error.
1. Introduction

Tracking or estimating a system or a signal whose properties vary with time is an important problem in system identification as well as in signal processing. The basic time-varying model is that of a regression:

    y_k = \varphi_k^T \theta_k + v_k,    (1)

where y_k and v_k are the scalar output and the noise respectively, and \varphi_k and \theta_k are, respectively, the r-dimensional stochastic regressor and the unknown time-varying parameter. It is convenient to denote the parameter variation at time k by \Delta_k:

    \Delta_k = \theta_k - \theta_{k-1}.    (2)

It is clear that if \Delta_k \equiv 0, and \varphi_k and v_k have the standard ARMAX forms (\varphi_k composed of past inputs and outputs, and v_k a moving average of a white noise), then (1) reduces to the time-invariant ARMAX model studied extensively in the literature. In the time-invariant case, the adaptation gain in the estimation algorithm is usually diminishing (for example, the gain 1/k in the stochastic gradient (SG) algorithm). Algorithms usually used for estimating constant parameters will fail in the time-varying case, since the parameter variations \Delta_k are not expected to vanish as k \to \infty. Hence algorithms with non-vanishing gains are naturally used. The analysis of such algorithms is no longer similar to that of, e.g., the SG algorithm; it hinges on the stability analysis of time-varying linear equations, a problem known for its difficulty. Section 2 presents some results on this topic which are necessary for our study. A key excitation condition for the time-varying case, called the "conditional richness condition", is presented and illustrated in Section 3. Two typical algorithms for tracking time-varying parameters, the Kalman filtering and the LMS algorithms, are studied in Sections 4 and 5.

2. Stability of Random Time-Varying Equations

We start with the following simple time-varying linear equation:

    x_{k+1} = (1 - a_k) x_k + \varepsilon_{k+1},    (3)

where a_k \in [0, 1) are random variables, and where the initial value x_0 satisfies E|x_0| < \infty. The solution of (3) can be expressed as

    x_{k+1} = \prod_{j=0}^{k} (1 - a_j)\, x_0 + \sum_{i=0}^{k} \Big[ \prod_{j=i+1}^{k} (1 - a_j) \Big] \varepsilon_{i+1},    (4)

where, as usual, \prod_{j=i+1}^{k} (\cdot) \triangleq 1 if k \le i.

For convenience of discussion, we introduce the following definition.

Definition 1. A random sequence \{x_k, k \ge 0\} defined on the basic probability space (\Omega, \mathcal{F}, P) is called L_p-stable (p > 0) if \sup_k E|x_k|^p < \infty. If such a sequence is generated by (3), then the equation (3) is called L_p-stable.

In the sequel, we shall use the following notation:

    \Phi(k, i) \triangleq \prod_{j=i+1}^{k} (1 - a_j).    (5)
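As a quick numerical check of the recursion (3) against its product-form solution (4), the two can be evaluated side by side. The following is a minimal sketch; the helper names and the particular distributions of a_k and \varepsilon_{k+1} are our own illustrative choices, not part of the paper:

```python
import random

def simulate(a, eps, x0):
    # Iterate x_{k+1} = (1 - a_k) x_k + eps_{k+1}   (equation (3))
    x = [x0]
    for k in range(len(a)):
        x.append((1 - a[k]) * x[k] + eps[k])
    return x

def closed_form(a, eps, x0, k):
    # Solution (4): x_{k+1} = prod_{j=0}^{k}(1 - a_j) x_0
    #               + sum_{i=0}^{k} [prod_{j=i+1}^{k}(1 - a_j)] eps_{i+1}
    def prod(i):
        # prod_{j=i+1}^{k} (1 - a_j); equals 1 when k <= i
        p = 1.0
        for j in range(i + 1, k + 1):
            p *= 1 - a[j]
        return p
    return prod(-1) * x0 + sum(prod(i) * eps[i] for i in range(k + 1))

random.seed(0)
n = 50
a = [random.uniform(0.0, 0.9) for _ in range(n)]   # a_k in [0, 1)
eps = [random.gauss(0.0, 1.0) for _ in range(n)]   # eps_{k+1}
x = simulate(a, eps, 1.0)
# largest discrepancy between the iterated and the closed-form values
err = max(abs(x[k + 1] - closed_form(a, eps, 1.0, k)) for k in range(n))
```

Here `err` is zero up to floating-point rounding, confirming that (4) is just the unrolled form of (3).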
A natural question is: under which conditions is equation (3) L_p-stable? A commonly used condition is that there exist constants c > 0 and \gamma \in (0, 1) such that

    E \prod_{j=i+1}^{k} (1 - a_j) \le c\,\gamma^{k-i},  \forall k \ge i,    (6)

which, as is easily seen, guarantees the L_p-stability of (3) provided that \{x_0, \varepsilon_k\} satisfies some moment conditions. The next result shows that (6) is also a necessary condition in some sense.

Proposition 1. Let \{a_k\} be a sequence of mutually independent random variables. Then for any \{\varepsilon_k\} \in M, the equation (3) is L_1-stable if and only if (6) holds, where M is the set defined as

    M \triangleq \{ \{\varepsilon_k\} : \{\varepsilon_k\} \in L_1 \text{ and } \{\varepsilon_k\} \text{ is independent of } \{a_k\} \}.

The proof is simple and is omitted here.

Since (6) plays a key role in the stability study of equation (3), it is important to investigate for which kinds of (possibly strongly correlated) random variables a_k condition (6) does hold. The results presented here will be used in later sections.

Lemma 1. Let \{a_m, \mathcal{F}_m\} be an adapted random sequence satisfying

    a_m \in [0, 1],    (7)

    E[a_{m+1} \mid \mathcal{F}_m] \ge \frac{a}{1 + \eta_m}  a.s.,    (8)

where \{\eta_m, \mathcal{F}_m\} is an adapted nonnegative sequence with

    E[\eta_{m+1}^{2+\delta} \mid \mathcal{F}_m] \le M  a.s.,  E\,\eta_0^{2+\delta} \le M_0,    (9)

and where a \in (0, 1), 0 < \delta < \infty and 0 \le M < \infty. Then there exist two constants c > 0 and \gamma \in (0, 1) such that

    E \prod_{k=m}^{n} (1 - a_{k+1}) \le c\,\gamma^{n-m+1},  \forall n \ge m \ge 0,    (10)

where c and \gamma depend only on a, M, M_0 and \delta.

Proof. It is easy to show that there is an adapted sequence \{\beta_m, \mathcal{F}_m\} such that

    \beta_{m+1} = b\,\beta_m + \varepsilon_{m+1},  1 - a \le b < 1,    (11)

and that the \eta_m appearing in (8) satisfies

    \eta_m \le \beta_m,  \forall m \ge 0,    (12)

where \{\varepsilon_k, \mathcal{F}_k\} is an adapted sequence satisfying

    \sup_{k} E[\varepsilon_{k+1}^{2+\delta} \mid \mathcal{F}_k] < \infty    (13)

and

    E \prod_{j=i+1}^{k} (1 - a_j) \le c\,\gamma^{k-i},  \forall k \ge i.    (14)

From this it is not difficult to show that (10) is true (see [1] for related derivations).

Lemma 2. Let \{x_k, \mathcal{F}_k\} be a nonnegative adapted process, x_k \ge 1, satisfying

    E[x_{k+1} \mid \mathcal{F}_k] \le x_k + c,    (15)

where c > 0 is a constant, and \{a_k\} is defined as in Lemma 1. Furthermore, assume that a_k \in [0, \bar a], \bar a < 1. Then there exist constants N > 0 and \lambda \in (0, 1) such that

    E \prod_{k=m}^{n} \Big( 1 - \frac{a_k}{x_k} \Big) \le N\,\lambda^{n-m+1},  \forall n \ge m \ge 0.    (16)

The proof is similar to that of Lemma 4 in [1].

We have studied some cases where (6) holds. As we mentioned before, (6) is a key condition in guaranteeing the L_p-stability of (3). However, from the L_p-stability of \{x_k\} we cannot directly infer the boundedness of sample averages of \{x_n\}. This issue is addressed in the following lemma [2]:

Lemma 3. Let \{f_k, \mathcal{F}_k\} be an adapted nonnegative sequence satisfying

    E[f_{k+1} \mid \mathcal{F}_k] \le (1 - a_{k+1})^{a} f_k + \varepsilon_{k+1}    (17)

for some a > 0, where \{a_k\} is the same as in Lemma 1, and \{\varepsilon_k, \mathcal{F}_k\} is nonnegative and satisfies

    \sup_{k \ge 0} E\,\varepsilon_{k+1}^{2+\delta} < \infty.    (18)

Then there exists a constant L such that

    \limsup_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n} f_k^{\beta/a} \le L\,B^{\beta/a}  a.s.,  \forall \beta \in (0, a),    (19)

whenever

    B \triangleq \limsup_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n} \varepsilon_i < \infty  a.s.

and the \delta appearing in (18) satisfies \delta > \frac{2\beta}{a - \beta}.

We remark that Lemmas 1-3 are crucial in establishing the results of Sections 4 and 5.

3. Conditional Richness Condition

Like the constant parameter case, some kind of excitation or richness on \varphi_i is necessary in estimating the
unknown time-varying parameters. The stationarity assumption on \varphi_i is widely used in the adaptive signal processing area for studying LMS algorithms (more discussion of LMS will be given in Section 5). Without any doubt it is reasonable in certain circumstances, but restrictive in general. For example, the stationarity assumption on \varphi_i excludes feedback control systems from consideration. Thus, finding a weaker richness condition on \varphi_i, which includes both stationary and nonstationary signals, is important in both theory and application. The following condition will be used in the next two sections.

Conditional Richness (CR) Condition. We say that an adapted sequence \{\varphi_k, \mathcal{F}_k\} (i.e., \varphi_k is \mathcal{F}_k-measurable for any k, where \{\mathcal{F}_k\} is a family of nondecreasing \sigma-algebras) satisfies the CR condition if there exists an integer h > 0 such that

    E\Big[ \sum_{k=m+1}^{m+h} \frac{\varphi_k \varphi_k^T}{1 + \|\varphi_k\|^2} \,\Big|\, \mathcal{F}_m \Big] \ge a_m I  a.s.,  \forall m \ge 0,    (20)

where \{a_m, \mathcal{F}_m\} is an adapted nonnegative sequence satisfying

    a_m \ge \frac{a}{1 + \eta_m}  a.s.,  \forall m \ge 0,    (21)

with \{\eta_m, \mathcal{F}_m\} being an adapted nonnegative sequence satisfying

    E[\eta_{m+1}^{2+\delta} \mid \mathcal{F}_m] \le M  a.s.,    (22)

and where a \in (0, 1), 0 < \delta < \infty and 0 \le M < \infty are constants.

At first glance, the CR condition looks rather complicated; however, it does have a clear meaning and is satisfied by a large class of stochastic signals. An important special case of (21) is when \eta_m \equiv 0, i.e., a_m \equiv a > 0. In this case (20) reduces to

    E\Big[ \sum_{k=m+1}^{m+h} \frac{\varphi_k \varphi_k^T}{1 + \|\varphi_k\|^2} \,\Big|\, \mathcal{F}_m \Big] \ge a I  a.s.,  \forall m \ge 0.    (23)

What (20) effectively means is that the matrix on the left-hand side of (20) may not be uniformly positive definite, since the sequence \{a_k\} may not be bounded away from zero along sample paths. We now give some examples to illustrate the CR condition.

Example 1. If there are constants 0 < \alpha < \beta < \infty and an integer h > 0 such that

    \varphi_k \varphi_k^T \le \beta I,  \forall k \ge 0,  and  \alpha I \le \sum_{k=m+1}^{m+h} \varphi_k \varphi_k^T  a.s.,  \forall m \ge 0,    (24)

then the CR condition (20) holds.

The proof is straightforward since in this case (23) holds. Note that if we regard (1) and (2) as a state space equation with state \theta_k, then (24) is nothing but the uniform observability condition for \theta_k. In the area of deterministic adaptive control, (24) is sometimes called the "sufficient richness condition". Notwithstanding the fairly wide use of (24), its verification appears to be very difficult if not impossible. Actually, (24) is mainly a deterministic hypothesis and excludes many standard signals, including any unbounded signals \varphi_k.

Example 2. Let \{\varphi_k\} be an r-dimensional \phi-mixing process. This means that there is a sequence \{\phi(h), h \ge 0\} such that

(i) \phi(h) \to 0 as h \to \infty; and

(ii) \sup_{A \in \mathcal{F}_{s+h}^{\infty},\, B \in \mathcal{F}_0^{s}} |P(A \mid B) - P(A)| \le \phi(h),  \forall h \ge 0,  \forall s \ge 0,

where, for any nonnegative integers s \ge 0 and h \ge 0,

    \mathcal{F}_{s+h}^{\infty} \triangleq \sigma\{\varphi_k,\, s+h \le k < \infty\},  \mathcal{F}_0^{s} \triangleq \sigma\{\varphi_k,\, 0 \le k \le s\}.

Suppose further that

    \inf_k \lambda_{\min}\big(E\,\varphi_k \varphi_k^T\big) > 0  and  \sup_k E\|\varphi_k\|^4 < \infty.    (25)

Then the CR condition (20) holds with \mathcal{F}_m = \mathcal{F}_0^{m}. Actually, we can prove that \{\varphi_k\} defined in Example 2 satisfies condition (23).

We remark that the \phi-mixing processes are a large class of random processes. In particular, any h-dependent random process (including moving average processes of order h) is \phi-mixing.

Example 3. Let \{\varphi_k\} be the output of the following linear stochastic model:

    x_{k+1} = A x_k + B \varepsilon_{k+1},  \varphi_k = C x_k + \xi_k,

where A \in \mathbb{R}^{n \times n}, B \in \mathbb{R}^{n \times q} and C \in \mathbb{R}^{r \times n} are deterministic matrices, A is stable and (A, B, C) is output controllable in the sense that

    \sum_{i=0}^{n-1} C A^i B \,(C A^i B)^T > 0.

Suppose that \{\varepsilon_k\} and \{\xi_k\} are independent processes which are also mutually independent, and satisfy

    E\varepsilon_k = 0,  E\xi_k = 0,  E[\varepsilon_k \varepsilon_k^T] \ge c I,  E\big[\|\varepsilon_k\|^{4(1+\mu)} + \|\xi_k\|^4\big] \le M < \infty,  \forall k \ge 0,

for some constants c > 0, \mu > 0 and M > 0. Then the CR condition (20) is fulfilled.

It is worth noting that \{\varphi_k\} defined in Example 3 can also be shown to satisfy (23) provided that \{\varepsilon_k, \xi_k\} are uniformly bounded [1].
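For a concrete bounded regressor, the deterministic richness bound (24) of Example 1 can be checked numerically. The sketch below is our own illustration, not from the paper: the regressor \varphi_k = (\sin k, \cos k)^T and the window length h = 3 are assumed choices, and the smallest eigenvalue of each window sum in (24) is computed by the closed 2x2 formula:

```python
import math

def min_eig_2x2(m):
    # smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]
    a, b, c = m[0][0], m[0][1], m[1][1]
    return (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b * b)

def window_gram(phi, m, h):
    # sum_{k=m+1}^{m+h} phi_k phi_k^T, the left-hand side of (24)
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(m + 1, m + h + 1):
        p = phi(k)
        for i in range(2):
            for j in range(2):
                g[i][j] += p[i] * p[j]
    return g

phi = lambda k: (math.sin(k), math.cos(k))   # bounded regressor, ||phi_k|| = 1
h = 3
# uniform excitation: smallest window eigenvalue over many windows
alpha = min(min_eig_2x2(window_gram(phi, m, h)) for m in range(500))
```

For this regressor the window sums turn out to have smallest eigenvalue 3/2 - |sin 3|/(2 sin 1), about 1.42, independently of m, so a uniform lower bound of the kind required in (24) holds with h = 3, while \varphi_k \varphi_k^T \le I supplies the upper bound.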
4. Analysis of Kalman Filter Based Algorithms

Note that if we regard (1) and (2) as a state space model with state \theta_k, then it is natural to use the Kalman filter to estimate the time-varying parameter \theta_k (see, e.g., [1]-[5]). The Kalman filter takes the following form:

    \hat\theta_{k+1} = \hat\theta_k + \frac{P_k \varphi_k}{R + \varphi_k^T P_k \varphi_k}\,\big(y_k - \varphi_k^T \hat\theta_k\big),    (26)

    P_{k+1} = P_k - \frac{P_k \varphi_k \varphi_k^T P_k}{R + \varphi_k^T P_k \varphi_k} + Q,    (27)

where P_0 \ge 0, R > 0 and Q > 0 are deterministic, and \hat\theta_0 can be arbitrarily chosen (here R and Q may be regarded as the a priori estimates for the variances of v_k and \Delta_k, respectively).

It is known that if \varphi_k is \mathcal{F}_{k-1}-measurable, where \mathcal{F}_{k-1} = \sigma\{y_i, i \le k-1\}, and \{\Delta_k, v_k\} is a Gaussian white noise process, then \hat\theta_k generated by (26) and (27) is the minimum variance estimate for \theta_k and P_k is the estimation error covariance, i.e.,

    P_k = E[\tilde\theta_k \tilde\theta_k^T \mid \mathcal{F}_{k-1}],    (28)

where \tilde\theta_k \triangleq \theta_k - \hat\theta_k, provided that Q = E\Delta_k\Delta_k^T, R = Ev_k^2, \hat\theta_0 = E\theta_0 and P_0 = E[\tilde\theta_0 \tilde\theta_0^T] (see, e.g., [6]).

In studying asymptotic properties of the algorithm (26) and (27), the primary issue is to establish boundedness (in some sense) of the tracking error \tilde\theta_k. This problem is reminiscent of the stability theory of the Kalman filter, and the standard condition for such a stability (boundedness) is (24). As we mentioned before, (24) is no longer suitable for the stability study of the Kalman filter (26), because in the present case \{\varphi_k\} is a random process rather than a deterministic sequence. Based on the work in [1], we shall adopt condition (20).

The following theorem shows that under the CR condition (20) the moment generating functions of \|P_k\|, k = 0, 1, \ldots, exist in a small neighborhood of the origin and are uniformly bounded in k.

Theorem 1. For \{P_k\} recursively defined by (27), if the CR condition (20) holds, then there exists a constant c^* > 0 such that for any c \in [0, c^*),

    \sup_{k \ge 0} E \exp\{c \|P_k\|\} \le C  and  \limsup_{k\to\infty} \frac{1}{k} \sum_{i=0}^{k} \exp\{c \|P_i\|\} \le C'  a.s.,

where C and C' are constants.

Theorem 1 is interesting by itself. The proof involves the use of Lemma 1; details are omitted.

Corollary 1. For \{P_k\} generated by (27), if the CR condition (20) is verified, then the following properties hold:

(i) \sup_{k \ge 0} E\|P_k\|^m < \infty,  \forall m > 0;

(ii) \limsup_{k\to\infty} \frac{1}{k} \sum_{i=0}^{k} \|P_i\|^m \le c < \infty  a.s.,  \forall m > 0;

(iii) \|P_k\| = O(\log k)  a.s. as k \to \infty.

We now proceed to analyse the tracking error \tilde\theta_k = \theta_k - \hat\theta_k. We first present a lemma. Denote

    V_k \triangleq \tilde\theta_k^T P_k^{-1} \tilde\theta_k.    (29)

V_k may be regarded as a stochastic Lyapunov function. Although it has the same form as that used in least squares analysis, here the analysis for it is completely different from there (see [1]) due to the different definition of P_k: there is an additional Q > 0 in (27), which prevents P_k from tending to zero.

Lemma 4. For the algorithm (26) and (27), the Lyapunov function V_k defined by (29) has the following property:

    E[V_{k+1} \mid \mathcal{F}_k] \le \Big( 1 - \frac{1}{a \|P_{k+1}\|} \Big) V_k + c\,z_k^2,    (30)

where a = 2\|Q^{-1}\|, z_k \triangleq \|v_k\| + \|\Delta_{k+1}\|, and c is a constant.

By Lemmas 2 and 4, we can prove the following main results of this section.

Theorem 2. Consider the time-varying model (1) and (2). Suppose that \{v_k, \Delta_k\} is a stochastic sequence and satisfies, for some p > 0 and \beta > 1,

    \sup_{k \ge 0} E\,z_k^{2p\beta} < \infty    (31)

and

    E\|\tilde\theta_0\|^{2p\beta} < \infty,    (32)

where z_k = \|v_k\| + \|\Delta_{k+1}\|, \tilde\theta_0 = \theta_0 - \hat\theta_0, and v_k, \Delta_k and \hat\theta_0 are respectively given by (1), (2) and (26). Then under the CR condition (20), the estimation error \{\theta_k - \hat\theta_k, k \ge 0\} generated by (26) and (27) is L_p-stable and

    \limsup_{k\to\infty} E\|\theta_k - \hat\theta_k\|^p \le A \big[ \sigma_p \log^{1+3p/2}(e + \sigma_p^{-1}) \big],  \sigma_p \triangleq \sup_{k \ge 0} E\,z_k^{2p},    (33)

where A is a constant depending on h, a, M, M_0 and \delta only. Moreover, if v_k \equiv 0 and \Delta_k \equiv 0 (i.e., \theta_k \equiv \theta_0), then

    E\|\tilde\theta_k\|^p \to 0  exponentially fast,    (34)

and

    \frac{1}{k} \sum_{i=0}^{k} \|\tilde\theta_i\|^q \to 0  a.s. exponentially fast,    (35)

for any q \in (0, p).
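The tracker (26)-(27) can be sketched in pure Python for r = 2, with the small matrix arithmetic written out by hand. This is our own illustration, not part of the paper: the regressor, the noise-free data and the function names are assumed choices. With v_k \equiv 0 and a constant parameter, the estimation error decays as described in (34):

```python
import math

def kf_track(y, phi, R=1.0, q=0.1):
    # Kalman-filter-based tracker (26)-(27) with r = 2 and Q = q * I
    theta = [0.0, 0.0]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for yk, pk in zip(y, phi):
        Pp = [P[i][0] * pk[0] + P[i][1] * pk[1] for i in range(2)]  # P_k phi_k
        denom = R + pk[0] * Pp[0] + pk[1] * Pp[1]                   # R + phi' P phi
        innov = yk - (pk[0] * theta[0] + pk[1] * theta[1])          # y_k - phi' theta
        theta = [theta[i] + Pp[i] * innov / denom for i in range(2)]        # (26)
        P = [[P[i][j] - Pp[i] * Pp[j] / denom + (q if i == j else 0.0)      # (27)
              for j in range(2)] for i in range(2)]
    return theta

true_theta = (1.0, -2.0)
phi = [(math.sin(k), math.cos(k)) for k in range(600)]
y = [p[0] * true_theta[0] + p[1] * true_theta[1] for p in phi]  # v_k = 0, Delta_k = 0
est = kf_track(y, phi)
err = math.hypot(est[0] - true_theta[0], est[1] - true_theta[1])
```

Because Q > 0 keeps P_k (and hence the gain) from tending to zero, the same recursion remains responsive when the parameter drifts, which is precisely why it is suited to the time-varying case.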
Remark 1. If in Theorem 2, \{\varphi_k\} and \{v_k, \Delta_k\} are assumed to be mutually independent, then for the L_p-stability of \{\hat\theta_k - \theta_k\}, condition (31) can be replaced by a weaker one:

    \sup_{k \ge 0} E\,z_k^{2p} < \infty,    (36)

which is a natural condition for the desired L_p-stability. What condition (31) means is that if the independency between \{\varphi_k\} and \{v_k, \Delta_k\} is removed, then the L_p-stability of \{\hat\theta_k - \theta_k\} is still preserved provided that the moment condition (36) is slightly strengthened.

Next, we present a result on time averages of the estimation error \{\hat\theta_k - \theta_k\}, which can be proved by using Lemmas 3 and 4.

Theorem 3. Consider the time-varying model (1) and (2). Suppose that \{v_k, \Delta_k\} is a stochastic sequence and, for some p > 0,

    c_p \triangleq \limsup_{k\to\infty} \frac{1}{k} \sum_{i=0}^{k-1} \big\{ \|v_i\|^p + \|\Delta_{i+1}\|^p \big\} < \infty  a.s.    (37)

Then under the CR condition (20), \{\hat\theta_k - \theta_k, k \ge 0\} is L_q-stable in the time average sense for any q \in (0, p), and

    \limsup_{k\to\infty} \frac{1}{k} \sum_{j=0}^{k} \|\hat\theta_j - \theta_j\|^q \le B\,(c_p)^{q/p},    (38)

where B is a constant depending on q, h, a, M, M_0 and \delta, but independent of the sample path. Furthermore, if v_k \equiv 0 and \theta_k \equiv \theta_0, then \hat\theta_k \to \theta_0 a.s. exponentially fast.

5. Analysis of LMS-Like Algorithms

A basic linear adaptive filtering algorithm used in adaptive signal processing is the following least mean squares (LMS) algorithm [7]:

    \hat\theta_{k+1} = \hat\theta_k + \mu_k \varphi_k \big( y_k - \varphi_k^T \hat\theta_k \big),    (39)

where \mu_k \in (0, 1) is a positive number, often called the step size, and y_k and \varphi_k are respectively the noisy output and the measured signal. The LMS is so named because the increment of the algorithm (39) is opposite to the (stochastic) gradient of the mean square error e_k = E(y_k - \varphi_k^T \hat\theta_k)^2, and (39) is a type of steepest descent algorithm that aims at recursively minimizing e_k.

The LMS algorithm is useful in many applications, and has naturally drawn much attention from researchers interested in its theoretical properties. As far as the tracking aspect is concerned, there is a vast literature on tracking error analysis, for example [7]-[12], among others. Most of the works require some sort of stationarity and independence of the observations \{y_k, \varphi_k\}. Here such a restriction will not be made and only the upper bound of the tracking error is studied. The condition imposed on \{\varphi_k\} is similar to the CR condition introduced in Section 3.

CR Condition for LMS. For the LMS algorithm with \mu_n \in \mathcal{F}_n, the regressor \{\varphi_n, \mathcal{F}_n\} is said to satisfy the CR condition if

    \mu_m \|\varphi_m\|^2 \le 1,  E\Big[ \sum_{k=m+1}^{m+h} \mu_k \varphi_k \varphi_k^T \,\Big|\, \mathcal{F}_m \Big] \ge \frac{1}{\alpha_m}\, I,  \forall m \ge 0,    (40)

where h is a positive integer and \{\alpha_m, \mathcal{F}_m\} is a nonnegative sequence satisfying \alpha_m \ge 1 and

    \alpha_m \le \frac{1 + \eta_m}{a}  a.s.,    (41)

where a \in (0, 1) is a constant and \{\eta_m, \mathcal{F}_m\} is a nonnegative sequence such that

    E[\eta_{m+1}^{2+\delta} \mid \mathcal{F}_m] \le M  a.s.,    (42)

with \delta > 0 and M < \infty being constants.

Clearly, if we take the step size \mu_k as \mu_k = \frac{1}{1 + \|\varphi_k\|^2}, then (40) coincides with the CR condition (20) introduced in Section 3. Recursively define, for n \ge m \ge 0,

    \Phi(n+1, m) = (I - \mu_n \varphi_n \varphi_n^T)\,\Phi(n, m),  \Phi(m, m) = I.    (43)

From (1), (2), (39) and (43) it is easy to see that

    \tilde\theta_{n+1} = (I - \mu_n \varphi_n \varphi_n^T)\,\tilde\theta_n + \Delta_{n+1} - \mu_n \varphi_n v_n
                       = \Phi(n+1, 0)\,\tilde\theta_0 + \sum_{i=0}^{n} \Phi(n+1, i+1)\,\big( \Delta_{i+1} - \mu_i \varphi_i v_i \big),    (44)

where \tilde\theta_n \triangleq \theta_n - \hat\theta_n.

From (44) we see that in order to prove the boundedness of \tilde\theta_{n+1} it is necessary to consider properties of the transition matrix \Phi(n, m) first (cf. Proposition 1). This is done in the following lemma.

Lemma 5. Under condition (40), there are constants c > 0 and \gamma \in (0, 1) such that

    E\|\Phi(n, m)\| \le c\,\gamma^{n-m},  \forall n \ge m \ge 0,

where \Phi(n, m) is defined by (43).
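The LMS recursion (39), with the normalized step size \mu_k = 1/(1 + \|\varphi_k\|^2) mentioned above, can also be sketched in a few lines. The scenario below is our own illustration (the regressor, noise level and drift size are assumed, not from the paper): the first run checks the exponential convergence claimed for the noise-free constant-parameter case, and the second lets \theta_k drift as a small random walk and records the time-averaged squared tracking error, which stays bounded in the spirit of Theorem 5:

```python
import math
import random

def lms_step(theta, pk, yk):
    # One step of LMS (39) with normalized step size mu_k = 1/(1 + ||phi_k||^2)
    mu = 1.0 / (1.0 + pk[0] ** 2 + pk[1] ** 2)
    innov = yk - (pk[0] * theta[0] + pk[1] * theta[1])
    return [theta[i] + mu * pk[i] * innov for i in range(2)]

# Case 1: constant parameter, no noise -> exponential convergence
true_theta = (0.5, 1.5)
theta = [0.0, 0.0]
for k in range(600):
    pk = (math.sin(k), math.cos(k))
    yk = pk[0] * true_theta[0] + pk[1] * true_theta[1]
    theta = lms_step(theta, pk, yk)
const_err = math.hypot(theta[0] - true_theta[0], theta[1] - true_theta[1])

# Case 2: slow random-walk parameter plus observation noise
random.seed(1)
drift_theta = [1.0, -1.0]
theta = [0.0, 0.0]
sq_err = []
for k in range(5000):
    pk = (math.sin(k), math.cos(k))
    yk = pk[0] * drift_theta[0] + pk[1] * drift_theta[1] + random.gauss(0.0, 0.1)
    sq_err.append((theta[0] - drift_theta[0]) ** 2 + (theta[1] - drift_theta[1]) ** 2)
    theta = lms_step(theta, pk, yk)
    drift_theta = [t + random.gauss(0.0, 0.01) for t in drift_theta]  # Delta_{k+1}

avg_sq_err = sum(sq_err[100:]) / len(sq_err[100:])  # skip the initial transient
```

Because the parameter keeps moving, the error never converges to zero; what the non-vanishing gain buys is a bounded time-averaged error, which is exactly the kind of guarantee formalized in this section.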
We now present the main results of this section, which can be proved by using Lemmas 5 and 3.

Theorem 4. Consider the time-varying model (1) and (2). Suppose that condition (40) holds and that for some \alpha > 0,

    \sigma_\alpha \triangleq \sup_{n \ge 0} E\big( |v_n|^{\alpha} + \|\Delta_n\|^{\alpha} \big) < \infty,  E\|\tilde\theta_0\|^{\alpha} < \infty.

Then the tracking error \tilde\theta_k = \theta_k - \hat\theta_k produced by the LMS algorithm (39) has the following property:

    \limsup_{n\to\infty} E\|\tilde\theta_{n+1}\|^{\beta} \le c\,\sigma_\alpha^{\beta/\alpha},  \forall \beta \in (0, \alpha),

where c is a positive constant. Moreover, if v_n \equiv 0 and \Delta_n \equiv 0, then E\|\tilde\theta_{n+1}\|^{\beta} \to 0 and \tilde\theta_{n+1} \to 0 a.s., both exponentially fast, for any \beta \in (0, \alpha).

Similarly, for the sample path average of the tracking error, we have the following result.

Theorem 5. Consider the time-varying model (1) and (2). Suppose that condition (40) holds and that for some \alpha > 0,

    \sigma \triangleq \limsup_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n} \big( |v_i|^{\alpha} + \|\Delta_i\|^{\alpha} \big) < \infty  a.s.

Then the tracking error \tilde\theta_k = \theta_k - \hat\theta_k produced by the LMS algorithm (39) satisfies

    \limsup_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n} \|\tilde\theta_k\|^{\beta} \le A\,\sigma^{\beta/\alpha}  a.s.

for any \beta \in (0, \alpha), provided that \delta > \frac{2\beta}{\alpha - \beta}, where \delta is the constant appearing in (42), and A is a constant.

References

[1] L. Guo, Estimating time-varying parameters by the Kalman filter based algorithm: stability and convergence, IEEE Trans. Autom. Control, Vol. 35, No. 2, 1990, 141-147.
[2] J.F. Zhang, L. Guo and H.F. Chen, L_p-stability of estimation errors of Kalman filter for tracking time-varying parameters, Int. J. of Adaptive Control and Signal Processing, Vol. 5, 1991.
[3] G. Kitagawa and W. Gersch, A smoothness priors time-varying AR coefficient modelling of nonstationary covariance time series, IEEE Trans. Autom. Control, Vol. 30, 1985, 48-56.
[4] A. Benveniste and G. Ruget, A measure of the tracking capability of recursive stochastic algorithms with constant gains, IEEE Trans. Autom. Control, Vol. 27, 1982, 639-649.
[5] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, MIT Press, Cambridge, MA, 1983.
[6] H.F. Chen, P.R. Kumar and J.H. van Schuppen, On Kalman filtering for conditionally Gaussian systems with random matrices, Systems and Control Letters, Vol. 13, 1989, 513-529.
[7] B. Widrow, J.M. McCool, M.G. Larimore and C.R. Johnson, Jr., Stationary and nonstationary learning characteristics of the LMS adaptive filter, Proc. IEEE, Vol. 64, 1976, 1151-1162.
[8] A. Benveniste, Design of adaptive algorithms for the tracking of time-varying systems, Int. J. of Adaptive Control and Signal Processing, Vol. 1, 1987, 3-29.
[9] R. Bitmead, Convergence in distribution of LMS-type adaptive parameter estimates, IEEE Trans. Autom. Control, Vol. AC-28, 1983, 54-60.
[10] E. Eweda and O. Macchi, Tracking error bounds of adaptive nonstationary filtering, Automatica, Vol. 21, No. 3, 1985, 293-302.
[11] O. Macchi, Optimization of adaptive identification for time-varying filters, IEEE Trans. Autom. Control, Vol. AC-31, 1986, 283-287.
[12] V. Solo, The limiting behavior of LMS, IEEE Trans. Acoustics, Speech, and Signal Processing, Vol. 37, 1989, 1909-1922.