Estimating Time-Varying Parameters in Regression Models

Copyright © IFAC Identification and System Parameter Estimation, Budapest, Hungary 1991

L. Guo, H.F. Chen and J.F. Zhang
Institute of Systems Science, Academia Sinica, Beijing 100080, PRC

Abstract. The Kalman filtering algorithm and the least-mean-squares (LMS) algorithm are two standard methods for estimating time-varying parameters in a linear regression model. This paper establishes upper bounds for the parameter tracking errors produced by these two algorithms, without resorting to stationarity or independence assumptions on the regressors.

Keywords: Time-varying parameter, stochastic regression, Kalman filter, LMS, tracking error.

1. Introduction

Tracking or estimating a system or a signal whose properties vary with time is an important problem in system identification as well as in signal processing. The basic time-varying model is that of a regression:

y_k = φ_kᵀ θ_k + v_k,   (1)

where y_k and v_k are the scalar output and the noise respectively, and φ_k and θ_k are, respectively, the r-dimensional stochastic regressor and the unknown time-varying parameter. It is convenient to denote the parameter variation at time k by Δ_k:

Δ_k = θ_k − θ_{k−1}.   (2)

It is clear that if Δ_k ≡ 0 and φ_k and v_k respectively have suitable forms, then (1) reduces to the time-invariant ARMAX model studied extensively in the literature. In the time-invariant case, the adaptation gain in the estimation algorithm is usually diminishing (for example, the gain 1/k in the stochastic gradient (SG) algorithm). Algorithms usually used for estimating constant parameters will fail in the time-varying case, since the parameter variations Δ_k are not expected to vanish as k → ∞. Hence algorithms with non-vanishing gains are naturally used. The analysis of such algorithms is no longer similar to that of, e.g., the SG algorithm, and hinges on the stability analysis of time-varying linear equations. The latter problem is known for its difficulty. Section 2 presents some results on this topic, which are necessary for our study. A key excitation condition in the time-varying case, called the "conditional richness condition", is presented and illustrated in Section 3. Two typical algorithms for tracking time-varying parameters, the Kalman filtering and the LMS algorithms, are studied in Sections 4 and 5.

2. Stability of Random Time-Varying Equations

We start with the following simple time-varying linear equation:

x_{k+1} = (1 − a_k) x_k + ξ_{k+1},   (3)

where the a_k ∈ [0, 1) are random variables, and where the initial value x_0 satisfies E|x_0| < ∞. The solution of (3) can be expressed as

x_{k+1} = ∏_{j=0}^{k} (1 − a_j) x_0 + Σ_{i=0}^{k} ∏_{j=i+1}^{k} (1 − a_j) ξ_{i+1},   (4)

where as usual ∏_{j=i+1}^{k} (·) ≜ 1 if k ≤ i.

For convenience of discussion, we introduce the following definition.

Definition 1. A random sequence {x_k, k ≥ 0} defined on the basic probability space (Ω, F, P) is called L_p-stable (p > 0) if sup_k E‖x_k‖^p < ∞. If such a sequence is generated by (3), then the equation (3) is called L_p-stable.

In the sequel, we shall use the following notation:

(5)
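Definition 1 can be probed numerically. The sketch below is ours, not the paper's: it takes (3) in the scalar form x_{k+1} = (1 − a_k)x_k + ξ_{k+1} and checks by Monte Carlo that sup_k E|x_k| stays bounded when the gains a_k are bounded away from 0; the distributions and constants are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
trials, n = 5000, 200

# Illustrative choices (not from the paper): gains a_k i.i.d. uniform on
# [0.1, 0.9), noise xi_k standard normal, x_0 = 0.
a = rng.uniform(0.1, 0.9, size=(trials, n))
xi = rng.standard_normal((trials, n))

# Run x_{k+1} = (1 - a_k) x_k + xi_{k+1} over many sample paths and
# track the Monte Carlo estimate of E|x_k|.
x = np.zeros(trials)
sup_moment = 0.0
for k in range(n):
    x = (1.0 - a[:, k]) * x + xi[:, k]
    sup_moment = max(sup_moment, float(np.abs(x).mean()))
```

If the gains are instead allowed to concentrate near 0, the same loop shows the moments growing, which is exactly the behavior that condition (6) rules out.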

A natural question is: under which conditions is equation (3) L_p-stable? A commonly used condition is that there exist constants c > 0, γ ∈ (0, 1) such that

E ∏_{j=i+1}^{k} (1 − a_j) ≤ c γ^{k−i},   ∀k ≥ i,   (6)

which, as is easily seen, guarantees the L_p-stability of (3) provided that {x_0, ξ_k} satisfies some moment conditions. The next result shows that (6) is also a necessary condition in some sense.

Proposition 1. Let {a_k} be a sequence of mutually independent random variables. Then for any {ξ_k} ∈ M, the equation (3) is L_1-stable if and only if (6) holds, where M is the set defined as

M ≜ {{ξ_k} : {ξ_k} ∈ L_1 and {ξ_k} is independent of {a_k}}.

The proof is simple and is omitted here.

Since (6) plays a key role in the stability study of equation (3), it is important to investigate for which kinds of (possibly strongly correlated) random variables a_k condition (6) does hold. The results presented here will be used in later sections.

Lemma 1. Let {a_m, F_m} be an adapted random sequence satisfying

a_m ∈ [0, 1],   (7)

(8)

where {η_m, F_m} is an adapted nonnegative sequence with

(9)

and where a ∈ [0, 1), 0 < δ < ∞ and 0 ≤ M < ∞. Then there exist two constants c > 0 and γ ∈ (0, 1) such that

E ∏_{k=m}^{n} (1 − a_{k+1}) ≤ c γ^{n−m+1},   ∀n ≥ m ≥ 0,   (10)

where c and γ depend only on a, M, M_0 and δ.

Proof. It is easy to show that there is an adapted sequence {β_m, F_m} with

β_{m+1} = b β_m + ε_{m+1},   0 < b < 1,   (11)

such that the sequence appearing in (9) satisfies (12), where {ε_k, F_k} is an adapted sequence satisfying (13) and (14). From this it is not difficult to show that (10) is true (see [1] for related derivations).

Lemma 2. Let {x_k, F_k} be a nonnegative adapted process with x_k ≥ 1, satisfying

(15)

where c > 0 is a constant and {a_k} is defined as in Lemma 1. Furthermore, assume that a_k ∈ [0, ā], ā < 1. Then there exist constants N > 0 and λ ∈ (0, 1) such that

∏_{k=m}^{n} (1 − a_k / x_k) ≤ N λ^{n−m+1},   ∀n ≥ m ≥ 0.   (16)

The proof is similar to that of Lemma 4 in [1].

We have studied some cases where (6) holds. As we mentioned before, (6) is a key condition in guaranteeing the L_p-stability of (3). However, from the L_p-stability of {x_k} we cannot directly infer the boundedness of sample averages of {x_n}. This issue is addressed in the following lemma [2]:

Lemma 3. Let {f_k, F_k} be an adapted nonnegative sequence satisfying

(17)

for some a > 0, where {a_k} is the same as in Lemma 1, and {ε_k, F_k} is nonnegative and satisfies

B ≜ limsup_{n→∞} (1/n) Σ_{i=0}^{n} ε_i < ∞.   (18)

Then there exists a constant L > 0 such that

limsup_{n→∞} (1/n) Σ_{i=0}^{n} f_i^β ≤ L B^β   a.s.,   ∀β ∈ (0, a),   (19)

whenever δ > a(2+β)/β.

We remark that Lemmas 1-3 are crucial in establishing the results of Sections 4 and 5.

3. Conditional Richness Condition

As in the constant parameter case, some kind of excitation or richness of φ_i is necessary for estimating the unknown time-varying parameters.
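In the spirit of Lemma 1, the exponential decay (6) can hold even for correlated gains. A quick Monte Carlo sketch (ours; the gain distribution and the envelope constants are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
trials, n = 2000, 60

# Correlated gains (illustrative): a_k averages two consecutive uniforms on
# [0, 0.4), so {a_k} is 1-dependent rather than mutually independent.
u = rng.uniform(0.0, 0.4, size=(trials, n + 1))
a = 0.5 * (u[:, :-1] + u[:, 1:])

# Monte Carlo estimate of E prod_{j=1}^{k} (1 - a_j) for k = 1..n
decay = np.cumprod(1.0 - a, axis=1).mean(axis=0)

# A geometric envelope c * gamma^k with generous constants
envelope = 2.0 * 0.85 ** np.arange(1, n + 1)
```

Here E(1 − a_k) = 0.8, so the estimated mean products fall well inside the envelope 2 · 0.85^k, i.e. a bound of the form (6) holds despite the dependence between the gains.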

The stationarity assumption on φ_i is widely used in the adaptive signal processing area for studying LMS algorithms (more discussion of LMS will be given in Section 5). Without any doubt it is reasonable in certain circumstances, but restrictive in general. For example, the stationarity assumption on φ_i excludes feedback control systems from consideration. Thus, finding a weaker richness condition on φ_i, one which covers both stationary and nonstationary signals, is important in both theory and application. The following condition will be used in the next two sections.

Conditional Richness (CR) Condition

We say that an adapted sequence {φ_k, F_k} (i.e., φ_k is F_k-measurable for any k, where {F_k} is a family of nondecreasing σ-algebras) satisfies the CR condition if there exists an integer h > 0 such that

E[ Σ_{k=m+1}^{m+h} φ_k φ_kᵀ / (1 + ‖φ_k‖²) | F_m ] ≥ (1/α_m) I   a.s.,   ∀m ≥ 0,   (20)

where {α_m, F_m} is an adapted nonnegative sequence satisfying

(21)

with {η_m, F_m} being an adapted nonnegative sequence satisfying

(22)

and where a ∈ [0, 1), 0 < δ < ∞ and 0 ≤ M < ∞ are constants.

At first glance the CR condition looks rather complicated; however, it has a clear meaning and is satisfied by a large class of stochastic signals. An important special case of (21) is when a = 0 and η_m ≡ α^{−1} for some α > 0. In this case (20) reduces to

E[ Σ_{k=m+1}^{m+h} φ_k φ_kᵀ / (1 + ‖φ_k‖²) | F_m ] ≥ α I   a.s.,   ∀m ≥ 0.   (23)

What (20) effectively allows is that the matrix on its left-hand side need not be uniformly positive definite, since the sequence {α_k} may not be bounded along the sample path. We now give some examples to illustrate the CR condition.

Example 1. If there are constants 0 < α < β < ∞ and an integer h > 0 such that

φ_k φ_kᵀ ≤ β I,   ∀k ≥ 0,   and   α I ≤ Σ_{k=m+1}^{m+h} φ_k φ_kᵀ   a.s.,   ∀m ≥ 0,   (24)

then the CR condition (20) holds.

The proof is straightforward, since in this case (23) holds. Note that if we regard (1) and (2) as a state space equation with state θ_k, then (24) is nothing but the uniform observability condition for θ_k. In the area of deterministic adaptive control, (24) is sometimes called the "sufficient richness condition". Notwithstanding the fairly wide use of (24), its verification appears to be very difficult if not impossible. Actually, (24) is mainly a deterministic hypothesis and excludes many standard signals, including any unbounded signals φ_k.

Example 2. Let {φ_k} be an r-dimensional φ-mixing process. This means that there is a sequence {φ(h) ≥ 0} such that

(i) φ(h) → 0 as h → ∞; and
(ii) sup_{A ∈ F_{s+h}^∞, B ∈ F_0^s} |P(A|B) − P(A)| ≤ φ(h),   ∀h ≥ 0, ∀s ≥ 0,

where, for any nonnegative integers s ≥ 0 and h ≥ 0,

F_{s+h}^∞ ≜ σ{φ_k, s + h ≤ k < ∞},   F_0^s ≜ σ{φ_k, 0 ≤ k ≤ s}.

Suppose further that

inf_k λ_min(E φ_k φ_kᵀ) > 0   and   sup_k E‖φ_k‖⁴ < ∞.   (25)

Then the CR condition (20) holds with F_m = F_0^m. Actually, we can prove that {φ_k} defined in Example 2 satisfies condition (23).

We remark that the φ-mixing processes form a large class of random processes. In particular, any h-dependent random process (including moving average processes of order h) is φ-mixing.

Example 3. Let {φ_k} be the output of the following linear stochastic model:

x_{k+1} = A x_k + B ε_{k+1},   φ_k = C x_k + ξ_k,

where A ∈ ℝ^{n×n}, B ∈ ℝ^{n×q} and C ∈ ℝ^{r×n} are deterministic matrices, A is stable and (A, B, C) is output controllable in the sense that

Σ_{i=0}^{n−1} C Aⁱ B (C Aⁱ B)ᵀ > 0.

Suppose that {ε_k} and {ξ_k} are independent processes which are also mutually independent, and satisfy

E ε_k = 0,   E ξ_k = 0,   ∀k ≥ 0,

E[ξ_k ξ_kᵀ] ≥ ε I,   ∀k ≥ 0,   and   E[‖ε_k‖^{4(1+μ)} + ‖ξ_k‖⁴] ≤ M < ∞,   ∀k ≥ 0,

for some constants ε > 0, μ > 0 and M > 0. Then the CR condition (20) is fulfilled.

It is worth noting that {φ_k} defined in Example 3 can also be shown to satisfy (23) provided that {ε_k, ξ_k} are uniformly bounded [1].
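The flavor of the CR condition can be seen numerically. The sketch below is ours, not the paper's: it generates an Example-3-style regressor and checks that windowed sums of the normalized outer products φ_k φ_kᵀ / (1 + ‖φ_k‖²), the quantity in our reading of (20) and (23), have smallest eigenvalue bounded away from zero on average. The matrices, noise levels and window length are arbitrary, and the sample average is only a proxy for the conditional expectation in (20).

```python
import numpy as np

rng = np.random.default_rng(2)

# Example-3-style regressor phi_k = C x_k + xi_k driven by a stable A.
# All matrices and noise levels are illustrative choices.
A = np.array([[0.5, 0.2],
              [0.0, 0.4]])
C = np.eye(2)

n, h = 5000, 5
x = np.zeros(2)
phi = np.empty((n, 2))
for k in range(n):
    phi[k] = C @ x + 0.3 * rng.standard_normal(2)
    x = A @ x + rng.standard_normal(2)      # B = I here

# Normalized excitation sums over windows of length h, in the spirit of (23)
w = phi[:, :, None] * phi[:, None, :] / (1.0 + (phi ** 2).sum(1))[:, None, None]
S = np.array([w[m:m + h].sum(axis=0) for m in range(0, n - h, h)])

# Smallest eigenvalue of the average window sum (eigvalsh sorts ascending)
lam_min = float(np.linalg.eigvalsh(S.mean(axis=0))[0])
```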

4. Analysis of Kalman Filter Based Algorithms

Note that if we regard (1) and (2) as a state space model with state θ_k, then it is natural to use the Kalman filter to estimate the time-varying parameter θ_k (see, e.g., [1]-[5]). The Kalman filter takes the following form:

θ̂_{k+1} = θ̂_k + (P_k φ_k)/(R + φ_kᵀ P_k φ_k) (y_k − φ_kᵀ θ̂_k),   (26)

P_{k+1} = P_k − (P_k φ_k φ_kᵀ P_k)/(R + φ_kᵀ P_k φ_k) + Q,   (27)

where P_0 ≥ 0, R > 0 and Q > 0 are deterministic, and θ̂_0 can be arbitrarily chosen (here R and Q may be regarded as a priori estimates of the variances of v_k and Δ_k, respectively).

It is known that if φ_k is F_{k−1}-measurable, where F_{k−1} = σ{y_i, i ≤ k − 1}, and {Δ_k, v_k} is a Gaussian white noise process, then θ̂_k generated by (26) and (27) is the minimum variance estimate of θ_k and P_k is the estimation error covariance, i.e.,

P_k = E[θ̃_k θ̃_kᵀ],   (28)

where θ̃_k ≜ θ_k − θ̂_k, provided that Q = E Δ_k Δ_kᵀ, R = E v_k², θ̂_0 = E θ_0 and P_0 = E[θ̃_0 θ̃_0ᵀ] (see, e.g., [6]).

In studying asymptotic properties of the algorithm (26) and (27), the primary issue is to establish boundedness (in some sense) of the tracking error θ̃_k. This problem is reminiscent of the stability theory of the Kalman filter, and the standard condition for such stability (boundedness) is (24). As we mentioned before, (24) is no longer suitable for the stability study of the Kalman filter (26), because in the present case {φ_k} is a random process rather than a deterministic sequence. Based on the work in [1], we shall adopt condition (20).

The following theorem shows that under the CR condition (20) the moment generating functions of ‖P_k‖, k = 0, 1, ..., exist in a small neighborhood of the origin and are uniformly bounded in k.

Theorem 1. For {P_k} recursively defined by (27), if the CR condition (20) holds, then there exists a constant c* > 0 such that for any c ∈ [0, c*),

sup_{k≥0} E exp{c‖P_k‖} ≤ C   and   limsup_{k→∞} (1/k) Σ_{i=0}^{k} exp{c‖P_i‖} ≤ C'   a.s.,

where C and C' are constants.

Theorem 1 is interesting in itself. The proof involves the use of Lemma 1; details are omitted.

Corollary 1. For {P_k} generated by (27), if the CR condition (20) is verified, then the following properties hold:

(i) sup_{k≥0} E‖P_k‖^m < ∞,   ∀m > 0;
(ii) limsup_{k→∞} (1/k) Σ_{i=0}^{k} ‖P_i‖^m ≤ c < ∞   a.s.,   ∀m > 0;
(iii) ‖P_k‖ = O(log k)   a.s. as k → ∞.

We now proceed to analyse the tracking error θ_k − θ̂_k. We first present a lemma. Denote

V_k ≜ θ̃_kᵀ P_k^{−1} θ̃_k.   (29)

V_k may be regarded as a stochastic Lyapunov function. Although it has the same form as that used in least squares analysis, here the analysis for it is completely different (see [1]) due to the different definition of P_k: there is an additional Q > 0 in (27), which prevents P_k from tending to zero.

Lemma 4. For the algorithm (26) and (27), the Lyapunov function V_k defined by (29) has the following property:

(30)

where a = 2‖Q^{−1}‖, z_k ≜ ‖v_k‖ + ‖Δ_{k+1}‖ and c is a constant.

By Lemmas 2 and 4, we can prove the following main results of this section.

Theorem 2. Consider the time-varying model (1) and (2). Suppose that {v_k, Δ_k} is a stochastic sequence satisfying, for some p > 0 and β > 1,

(31)

and

(32)

where z_k = ‖v_k‖ + ‖Δ_{k+1}‖, θ̃_0 = θ_0 − θ̂_0, and v_k, Δ_k, θ_0 and θ̂_0 are respectively given by (1), (2) and (26). Then under the CR condition (20), the estimation error {θ_k − θ̂_k, k ≥ 0} generated by (26) and (27) is L_p-stable and

limsup_{k→∞} E‖θ_k − θ̂_k‖^p ≤ A [σ_p log^{1+3p/2}(e + σ_p^{−1})],   (33)

where A is a constant depending only on h, a, M, M_0 and δ. Moreover, if v_k ≡ 0 and Δ_k ≡ 0 (i.e., θ_k ≡ θ_0), then

(34)

and

(35)

for any q ∈ (0, p).
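The algorithm (26)-(27) is straightforward to run. The following minimal sketch is ours, not the paper's: dimensions, noise levels and seeds are arbitrary, with Q and R set to the true variances as in the discussion around (28).

```python
import numpy as np

rng = np.random.default_rng(3)
r, n = 2, 2000
R, Q = 0.1, 0.01 * np.eye(r)     # a priori variances of v_k and Delta_k

# Time-varying model (1)-(2): random-walk parameter, Gaussian regressors/noise
theta = np.cumsum(0.1 * rng.standard_normal((n, r)), axis=0)
phi = rng.standard_normal((n, r))
y = (phi * theta).sum(axis=1) + np.sqrt(R) * rng.standard_normal(n)

# Kalman-filter-based tracker (26)-(27)
theta_hat = np.zeros(r)
P = np.eye(r)
err = np.empty(n)
for k in range(n):
    denom = R + phi[k] @ P @ phi[k]
    theta_hat = theta_hat + (P @ phi[k]) / denom * (y[k] - phi[k] @ theta_hat)
    P = P - np.outer(P @ phi[k], phi[k] @ P) / denom + Q   # P stays >= Q > 0
    err[k] = np.linalg.norm(theta[k] - theta_hat)
```

Note how the additional Q in (27) keeps P_k bounded away from zero, which is what gives the algorithm its non-vanishing adaptation gain.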

Remark 1. If in Theorem 2 {φ_k} and {v_k, Δ_k} are assumed to be mutually independent, then for the L_p-stability of {θ_k − θ̂_k}, Condition (31) can be replaced by a weaker one:

sup_{k≥0} E z_k^p < ∞,   (36)

which is a natural condition for the desired L_p-stability. What condition (31) means is that if the independence between {φ_k} and {v_k, Δ_k} is removed, then the L_p-stability of {θ_k − θ̂_k} is still preserved provided that the moment condition (36) is slightly strengthened.

Next, we present a result on time averages of the estimation error {θ_k − θ̂_k}, which can be proved by using Lemmas 3 and 4.

Theorem 3. Consider the time-varying model (1) and (2). Suppose that {v_k, Δ_k} is a stochastic sequence and, for some p > 0,

c_p ≜ limsup_{k→∞} (1/k) Σ_{i=0}^{k−1} {‖v_i‖^p + ‖Δ_{i+1}‖^p} < ∞   a.s.   (37)

Then under the CR condition (20), {θ_k − θ̂_k, k ≥ 0} is L_q-stable in the time average sense for any q ∈ (0, p), and

limsup_{k→∞} (1/k) Σ_{j=0}^{k} ‖θ_j − θ̂_j‖^q ≤ B (c_p)^{q/p}   a.s.,   (38)

where B is a constant depending on q, h, a, M, M_0 and δ, but independent of the sample path. Furthermore, if v_k ≡ 0, Δ_k ≡ 0 and θ_k ≡ θ_0, then θ̂_k → θ_0 a.s. exponentially fast.

5. Analysis of LMS-Like Algorithms

A basic linear adaptive filtering algorithm used in adaptive signal processing is the following least mean squares (LMS) algorithm [7]:

θ̂_{k+1} = θ̂_k + μ_k φ_k (y_k − φ_kᵀ θ̂_k),   (39)

where μ_k ∈ (0, 1) is a positive constant, often called the step-size, and y_k and φ_k are respectively the noisy output and the measured signal. The LMS is so named because the increment in (39) is opposite to the (stochastic) gradient of the mean square error

e_k = E(y_k − φ_kᵀ θ̂_k)²,

and (39) is a type of steepest descent algorithm that aims at recursively minimizing e_k.

The LMS algorithm is useful in many applications, and has naturally drawn much attention from researchers interested in its theoretical properties. As far as the tracking aspect is concerned, there is a vast literature on tracking error analysis, for example, [7]-[12], among others. Most of these works require some sort of stationarity and independence of the observations {y_k, φ_k}. Here such a restriction will not be made, and only the upper bound of the tracking error is studied. The condition imposed on {φ_k} is similar to the CR condition introduced in Section 3.

CR Condition for LMS

For the LMS algorithm with μ_n ∈ F_n, the regressor {φ_n, F_n} is said to satisfy the CR condition if

μ_m ‖φ_m‖² ≤ 1,   E{ Σ_{k=m+1}^{m+h} μ_k φ_k φ_kᵀ | F_m } ≥ (1/α_m) I,   ∀m ≥ 0,   (40)

where h is a positive integer and {α_m, F_m} is a nonnegative sequence satisfying α_m ≥ 1 and

(41)

where a ∈ (0, 1) is a constant and {η_m, F_m} is a nonnegative sequence such that

(42)

with δ > 0 and M < ∞ being constants.

Clearly, if we take the step-size μ_k as μ_k = 1/(1 + ‖φ_k‖²), then (40) coincides with the CR condition (20) introduced in Section 3. Recursively define, for all n ≥ m ≥ 0,

Φ(n+1, m) = (I − μ_n φ_n φ_nᵀ) Φ(n, m),   Φ(m, m) = I.   (43)

From (1), (2), (39) and (43) it is easy to see that

θ̃_{n+1} = (I − μ_n φ_n φ_nᵀ) θ̃_n + Δ_{n+1} − μ_n φ_n v_n
        = Φ(n+1, 0) θ̃_0 + Σ_{i=0}^{n} Φ(n+1, i+1) ζ_{i+1},   (44)

where θ̃_n ≜ θ_n − θ̂_n and ζ_{i+1} ≜ Δ_{i+1} − μ_i φ_i v_i.

From (44) we see that in order to prove the boundedness of θ̃_{n+1} it is necessary to consider properties of the transition matrix Φ(n, m) first (cf. Proposition 1). This is done in the following lemma.

Lemma 5. Under Condition (40), there are constants c and γ ∈ (0, 1) such that

E‖Φ(n, m)‖ ≤ c γ^{n−m},   ∀n ≥ m ≥ 0,

where Φ(n, m) is defined by (43).
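As a concrete illustration of (39), the sketch below (ours, not the paper's) tracks a slowly drifting parameter using the normalized step-size μ_k = 1/(1 + ‖φ_k‖²) noted above, which satisfies the requirement μ_k ‖φ_k‖² ≤ 1 in (40); all dimensions, noise levels and seeds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
r, n = 2, 5000

# Time-varying model (1)-(2) with a slow random-walk parameter (illustrative)
theta = np.cumsum(0.005 * rng.standard_normal((n, r)), axis=0)
phi = rng.standard_normal((n, r))
y = (phi * theta).sum(axis=1) + 0.1 * rng.standard_normal(n)

# LMS recursion (39); mu_k = 1/(1 + ||phi_k||^2) gives mu_k ||phi_k||^2 <= 1
theta_hat = np.zeros(r)
err = np.empty(n)
for k in range(n):
    mu = 1.0 / (1.0 + phi[k] @ phi[k])
    theta_hat = theta_hat + mu * phi[k] * (y[k] - phi[k] @ theta_hat)
    err[k] = np.linalg.norm(theta[k] - theta_hat)
```

Unlike the Kalman-filter tracker (26)-(27), no covariance matrix is propagated; the constant-type gain is what lets the recursion follow the drifting θ_k.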

We now present the main results of this section, which can be proved by using Lemmas 5 and 3.

Theorem 4. Consider the time-varying model (1) and (2). Suppose that Condition (40) holds and that, for some a > 0,

σ_a ≜ sup_{n≥0} E(|v_n|^a + ‖Δ_n‖^a) < ∞,   E‖θ̃_0‖^a < ∞.

Then the tracking error θ̃_k = θ_k − θ̂_k produced by the LMS algorithm (39) satisfies

limsup_{n→∞} E‖θ̃_{n+1}‖^a ≤ c σ_a,

where c is a positive constant. Moreover, if v_n ≡ 0 and Δ_n ≡ 0, then E‖θ̃_{n+1}‖^a → 0 exponentially fast as n → ∞, and θ̃_{n+1} → 0 exponentially fast as n → ∞.

Similarly, for the sample path average of the tracking error, we have the following result.

Theorem 5. Consider the time-varying model (1) and (2). Suppose that Condition (40) holds and that, for some a > 0,

σ̄_a ≜ limsup_{n→∞} (1/n) Σ_{i=0}^{n−1} (|v_i|^a + ‖Δ_{i+1}‖^a) < ∞   a.s.

Then the tracking error θ̃_k = θ_k − θ̂_k produced by the LMS algorithm (39) satisfies

limsup_{n→∞} (1/n) Σ_{i=0}^{n} ‖θ̃_i‖^β ≤ A (σ̄_a)^{β/a}   a.s.

for any β ∈ (0, a), provided that δ > a(2+β)/β, where δ is the constant appearing in (42) and A is a constant.

References

[1] L. Guo, Estimating time-varying parameters by the Kalman filter based algorithm: stability and convergence, IEEE Trans. Autom. Control, Vol. 35, No. 2, 1990, 141-147.
[2] J.F. Zhang, L. Guo and H.F. Chen, L_p-stability of estimation errors of Kalman filter for tracking time-varying parameters, Int. J. of Adaptive Control and Signal Processing, Vol. 5, 1991.
[3] G. Kitagawa and W. Gersch, A smoothness priors time-varying AR coefficient modelling of nonstationary covariance time series, IEEE Trans. Autom. Control, Vol. 30, 1985, 48-56.
[4] A. Benveniste and G. Ruget, A measure of the tracking capability of recursive stochastic algorithms with constant gains, IEEE Trans. Autom. Control, Vol. 27, 1982, 639-649.
[5] L. Ljung and T. Söderström, Theory and Practice of Recursive Identification, MIT Press, Cambridge, MA, 1983.
[6] H.F. Chen, P.R. Kumar and J.H. van Schuppen, On Kalman filtering for conditionally Gaussian systems with random matrices, Systems and Control Letters, Vol. 13, 1989, 513-529.
[7] B. Widrow, J.M. McCool, M.G. Larimore and C.R. Johnson, Jr., Stationary and nonstationary learning characteristics of the LMS adaptive filter, Proc. IEEE, Vol. 64, 1976, 1151-1162.
[8] A. Benveniste, Design of adaptive algorithms for tracking of time-varying systems, Int. J. of Adaptive Control and Signal Processing, Vol. 1, 1987, 3-29.
[9] R. Bitmead, Convergence in distribution of LMS-type adaptive parameter estimates, IEEE Trans. Autom. Control, Vol. AC-28, 1983, 54-60.
[10] E. Eweda and O. Macchi, Tracking error bounds of adaptive nonstationary filtering, Automatica, Vol. 21, No. 3, 1985, 293-302.
[11] O. Macchi, Optimization of adaptive identification for time-varying filters, IEEE Trans. Autom. Control, Vol. AC-31, 1986, 283-287.
[12] V. Solo, The limiting behavior of LMS, IEEE Trans. Acoustics, Speech, and Signal Processing, Vol. 37, 1989, 1909-1922.