Signal Processing 20 (1990) 227-245 Elsevier
227
RECURSIVE S Y S T E M IDENTIFICATION IN THE P R E S E N C E OF BURST D I S T U R B A N C E B. J E Y E N D R A N Department of Electrical Engineering, Yale University, New Haven, CT06511, U.S.A.
V.U. R E D D Y Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560012, India Received 2 March 1989 Revised 23 August 1989, 12 February 1990
Abstract. Conventional algorithms for the recursive identification of time varying systems will fail in the presence of disturbances which occur in bursts. Recently proposed methods to solve the problem suffer from certain drawbacks. In this paper, we present a new algorithm for FIR system identification which automatically detects the onset of a burst of disturbance and inhibits the updating of the parameters for the duration of the burst. Parameter updating is continued after the burst ceases, enabling the tracking of slow variations in the system parameters. Simulations are used to study the performance of the algorithm in the context of the well-known problem of near-end speaker interference in echo cancellation. The results demonstrate that the algorithm performs as predicted.
Zusammenfassung. Obliche Algorithmen zur rekursiven Identifikation zeitvarianter Systeme versagen bei BiindelstSrungen. Kiirzlich vorgeschlagene Verfahren zur LSsung dieses Problems haben bestimmte Nachteile. Wir schlagen einen neuen Algorithmus zur Identifikation von FIR-Systemen vor, bei dem das Einsetzen der Biindelstfrung erkannt wird und eine Anderung der Parameter fiir die Dauer der StiSrung verhindert. Die ,~mderung der Parameter wird nach Aussetzen der StSrung fortgesetzt, womit es den Systemparametern mSglich ist, langsamen ~,nderungen zu folgen. Simulationen werden verwendet um das Vernhalten des Algorithmus im Zusammenhang mit dem bekannten Problem des Nahnebensprechens bei der Echokompensation zu untersuchen. Die Ergebnisse zeigen, daJ3 sich der AIgorithmus wie erwartet verhiilt. Rrsumr. Les algorithmes d'identification rrcursive de syst~mes variant dans le temps conventionnels fonctionnent mal en prrsence d'interfrrences arrivant par paquets. Des mrthodes prrsentres rrcemment pour rrsoudre ce probl~me souffrent de certains drsavantages. Dans cet article nous prrsentons un nouvel algorithme d'identification de syst~mes FIR qui drtecte automatiquement le drbut d'une zone d'interfrrence et inhibe la mise ~ jour des param~tres pour la durre de cette zone. La mise h jour des param~tres reprend apr~s la fin de l'interfrrence, ce qui permet la poursuite de variations lentes de ces param&res. Des simulations sont utilisres pour &udier les performances de l'algorithme dans le contexte du probl~me bien connu d'interfrrence dQe au locuteur proche (en anglais near-end speaker interference) dans un annulateur d'rcho. Les rrsultats obtenus montrent que l'algorithme fonctionne comme prrvu.
Keywords. Burst disturbance, FIR system identification.
1. Introduction Recursive algorithms have been extensively used in system identification [8] as applied to adaptive control, adaptive noise cancellation, channel equalization, echo cancellation and multipath cancellation, etc. Much research has been carded 0165-1684/90/$03.50 ~) 1990- Elsevier Science Publishers B.V.
out to study their behaviour in the presence of noise, measurement errors or modelling errors. Algorithms exist that can guarantee almost sure convergence of the estimated parameters to the true system parameters in these situations, provided the noise or errors are bounded, have zero mean and a constant variance, and are uncorrelated
228
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
with the system input signal (see, for example, [5, Chapter 8]). However, existing recursive identification algorithms are not robust in the presence of sudden bursts of disturbance occurring in the measurement of the system output. The occurrence of a strong burst of disturbance acts as a large additive noise term in the observed error signal which will be interpreted erroneously by the recursive algorithm as a sudden increase in the mismatch between the estimated and true parameters. This would force the estimated parameters to diverge away from the identified values. If it can be assumed that the disturbances would not occur for some time after the algorithm has been initially started, one way to tackle the situation would be to use a fast algorithm like the recursive least squares (RLS) algorithm [8] to quickly identify the system parameters, and then freeze the identified parameters permanently after convergence. Though this would solve the problem if the assumptions were to be satisfied, there are some applications where the system parameters can vary slowly with time. In these situations, the algorithm has to be maintained alive in order to track the possible time variations. Thus, there is a need for a recursive identification algorithm which is immune to the presence of bursts of disturbance and is capable of following slow changes in the system parameters. Examples of such situations occur frequently in practice. (i) In adaptive control [5], recursive algorithms are used to identify the plant parameters, which could exhibit slow variations. Any abrupt change in the plant noise power level or a sudden burst disturbance occurring in the plant output measurement would lead to divergence of the identified parameters as discussed earlier. (ii) In echo cancellation for long distance telephony [9, 13], adaptive algorithms are used to identify the echo path. The identified echo path is used by an echo canceller to produce a synthetic echo which can cancel out the echo arising from far-end speech. Since the echo path can undergo Signal Processing
small variations during a conversation due to the connection or disconnection of extension phones or the transfer of calls through private exchanges, an algorithm which can quickly identify the echo path and thereafter follow its changes is needed. However, the occurrence of near-end speech would add bursts of large interference to the observed echo signal, thereby resulting in divergence of the estimated parameters when conventional identification algorithms are used. Two methods to tackle the above problem for the case of FIR systems have been proposed in [10, 12]. In [10], Reddy et al. proposed weighted recursive least squares and recursive least squares lattice algorithms to handle the occurrence of nearend speaker disturbance in adaptive echo cancellation. In [12], Shan and Kailath proposed an adaptive gain mechanism for adaptive algorithms to handle the presence of burst disturbance in a general situation. As discussed in Section 2, the method of Reddy et al. suffers from certain deficiencies, while that of Shan and Kailath appears to be unsuitable for solving the problem. In this paper, we present an improved version of the algorithm proposed by Reddy et al. [10] for handling burst disturbances in the recursive identification of slowly time varying FIR systems. The main feature of the algorithm is its capability to atuomatically detect the onset of bursts of disturbance and 'freeze' the estimated parameters during the burst intervals. When the disturbance ends, the algorithm 'defreezes' rapidly and the parameters are once again allowed to adapt. This allows the algorithm to track slowly varying parameters. We have used computer simulations to substantiate these claims and also to bring out its superiority over those proposed in [10] and [12]. The paper is organized as follows. In Section 2, we discuss the recent approaches and point out their drawbacks. In Section 3, we present our proposed algorithm. In Section 4, we discuss some related methods for tracking slowly varying parameters. Simulation results of our algorithm are presented in Section 5 and Section 6 concludes the paper.
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
2, Background
For later use we define the a priori estimated output
In this section, we first introduce our notation and then discuss the drawbacks of the earlier methods proposed by Reddy et al. [10] and Shan and Kailath [12]. The dynamic system to be identified is modelled as an FIR system shtisfying the equation
y(t) = ~O(t)To+ to(t) + d(t),
(1)
where
~o(t)=[u(t), u ( t - 1 ) , . . . ,
229
u(t-m)] T
(2)
is the regression vector containing past and present samples of the input u(t) to the system, 0 = [ 0 o , 0 1 , . . . , Ore]T
(3)
is the vector of parameters to be identified, to(t) is the stationary system noise which is assumed to be white, zero mean and of variance o.2, and d(t) is a non-stationary disturbance with zero mean and time varying variance o.2(t), d(t) is assumed to occur in bursts. When there is no disturbance, o.] (t) is zero. u(t), to(t) and d(t) are assumed to be mutually uncorrelated. In adaptive control applications, u(t) is the input to the unknown plant with parameters 0, and ~o(t)'r0 is the true plant output, to(t) is the stationary measurement noise, d(t) represents any nonstationary disturbance which occurs in the measurement of q~(t)rO, and y(t) is the observed plant output. In echo cancellation applications, u(t) is the far-end speech appearing at the input port of the hybrid, the echo path, which is represented by 0. ~o(t)rO is the true echo signal appearing at the output port of the hybrid, to (t) is the stationary measurement noise, d (t) is the near-end speech signal which occurs in bursts and y(t) is the observed echo signal. Denoting the estimated parameter vector as 0(t), the output error is defined as
e( t) = y( t) - ~(t)T0(t).
as
~(t) = ~ o ( t ) r 0 ( t - 1)
(6)
and the a priori output error as
*l(t) = y ( t ) - ~ ( t ) .
(7)
We also define the a priori estimation error as the difference between the true system output and the a priori estimated output
e(t)
= go(t)To
-- )3(t).
(8)
Using (1), (6), (7) and (8), r/(t) can be written as
71(t) = e(t)+ to(t) + d(t).
;(9)
The power in ,/(t) is given by o.2(t) = o.~(t) + o.2 + o.2(t),
(10)
where we have used the fact that u(t), to(t) and d(t) are mutually uncorrelated. Note that no time dependence is shown with the variance of the system noise as it is assumed to be constant.
2.1. The method of Reddy et al. Reddy et al. [10] proposed a weighted sum of squared errors, J = ~ g(n) eZ(n),
(11)
n=O
as the criterion function to be minimized. The weights {g(n)} are to be chosen such that low weight (g(n) < 1) is attached to measurements suspected of being corrupted by large measurement noise (caused by disturbance), and unit weight (g(n) = 1) is attached to measurements that are assumed to contain no such noise. The minimization of J yields the so called 'block' least squares solution,
(4)
O= [ ~=og(n)~o(n)tp(n)T]-'
(5)
x ~ g(n)~o(n)y(n).
Using (1) this can be written as
e(t)=~(t)To+to(t)+d(t)--~o(t)To(t).
(12)
n=0 Vol. 20, No. 3, July 1990
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
230
Defining
window using the observed a priori error ~(t) as
P ( t ) -1-~
g(n)~o(n)q~(n) x
(13)
n=O
and using a well-known matrix inversion lemma results in the weighted recursive least squares (WRLS) algorithm (we discuss only their RLS formulation), 0(t) = 0 ( t - 1)
g( t)P( t - 1) ~o(t) +l+g(t)tp(t)Tp(t_l)~(t
) n(t),
(14)
~,(t)= o-~(t-1)
g(t)=
2 o-o,
2 , o-v(t)
(16)
(17)
When there is no disturbance, v(t)=to(t) and 2 o-~(t) = o-o,, 2 so that g(t)= 1 and the updating of 0(t) and P(t) takes place normally. When disturbance sets in, o-2(t) increases substantially and g(t) becomes very low. This reduces the magnitude of the update terms in (14) and (15) correspondingly and 0(t) and P(t) are updated very little. Thus, the estimated parameters are prevented from diverging. Since o-~ and o-2(t) are not known, they have to be estimated and hence only an estimate of g(t) is available. The authors assumed that disturbance will not occur during the first N iterations after the commencement of the algorithm. N is assumed to be large enough to allow the algorithm to converge. During this initial period, the normal algorithm (with g(t)= 1) is run, and an estimate of the system noise power is computed over a short Signal Processing
,
~'~-~(t) = Ao-~(t " ~ - 1) + g(t)'r/2(t).
where o-~(t) is the power in the total measurement noise
v( t) = to(t)+ d( t).
°-2(t)
(15)
where ~(t) is the a priori output error. Now, suppose that g(t) is chosen as
(18)
where 0 < A < 1. The effective length of the window is approximately 1/(1 - A). After N iterations, within which the algorithm is assumed to have converged, the estimation error would be zero and the power in r/(t) would be equal to the system noise power o-2,. At t = N. the estimate of the power in v(t), o-~(t), is set equal to ~'~(N) and thereafter these quantities and if(t) are computed as follows: ~--'2~(t)= Ao'2"~t - 1)+ 7/2(t),
P(t)=P(t-1) g(t)P(t - 1)~o(t)¢(t)Tp(t-- 1) l+g(t)~o(t)Tp(t_l)~o(t) ,
o'o,(t) = ;to-~,(t - 1) + n2(t),
(19) (20) (21)
When a burst of disturbanc._e occurs, o-2(t) 2 ' becomes large compared to o-,~(t-1) and ~(t) becomes low, as required. The argument for computing ~-~'~(t) as in (21) is that when there is no disturbance, ~(t) is close to unity and hence (21) is approximately equal to (19), w~hile during the burst intervals, ~(t) is low and oM(t) 2 is updated very little. On the average, ~-~(t) would remain as a good estimate of the power in to (t). The modified RLS algorithm with ~(t) as in (20) is started at t = N + 1 and allowed to run thereafter. To handle slow variations in the system parameters, the authors use the cost function J ' = ~ A~-ng(n) e2(n), n=0
where 0 < A F < I is a forgetting factor. The minimization of J ' leads to the recursive algorithm as follows:
0(t) = 0(t - 1)
g(t)P(t - 1) ¢p(t) 4 Av+g(t)~o(t)XP(t_l)~p(t) ~7(t),
(22)
1 P(t) =~F ( P ( t - 1)
g( t)P(t - 1)~O(t)~O(t)Tp(t -- 1)~ A - - ~ g ( ~ ~ ) - ] "
(23)
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
Since only the estimate of g(t) is available, the algorithm uses if(t) in place of g(t). The factor AF in (23) prevents the elements of P (t) from decaying to zero. This maintains a non-zero gain for the update of 0(t) in (22), allowing it to track slow changes in 0. The tracking properties of the above algorithm, with g(t)= 1, have been well studied [1,7]. In [1], Eleftheriou and Falconer have proved that for channel estimation applications, its tracking performance will always be at least as good as that of the LMS algorithm [14] when the system parameters obey a first order Markov difference equation and vary slowly. We now point out the drawbacks of the above algorithm. (i) The first drawback is the assumption that the burst disturbance will not occur till the recursive algorithm has converged. Though it could be justified in the echo cancellation situation it is too restrictive for use in a general situation. (ii) The more severe drawback is its inability to track the slow variations in the system parameters. Though a forgetting factor AF< 1 is used in the algorithm, a defect in the estimation scheme given by (21) (as explained below), forces ~(t) to remain at a low value after the incidence of the first burst disturbance. This in turn forces the gain of the algorithm permanently to a low value, thereby freezing the parameter updation, i.e., variations in the system parameters cannot be followed. When a burst disturbance occurs for the first ti~me, ~(t) becomes low and the update term for tr2(t) decreases correspondingly. This means that during the burst interval, ~-"~(t) is essentially updated as
o',o(t)--~
1).
~z~(t) would be large compared to c r ~ ( t - 1 ) and ~(t) would remain small. Further, the magnitude of ~(t)~2(t) would be of lower order than what it was during the burst, because ~7(t) is lower and ~(t) is the same. This would cause o-~(t) to decay further and eventually to approach zero, thus forcing ~(t) to remain at a very low value after the first burst. As a result, the gain of the WRLS algorithm (cf. (22)) would be close to zero and the parameter updation would be nearly frozen.
2.2. The method of Shan and Kailath Shan and Kailath [12] proposed an automatic gain control mechanism to handle bursts of disturbance. Their adaptive gain g(t) is proportional to an estimate of the cross correlation, t;(t), (p(t) = E{e(t)u(t)}) between the output error e(t) and the system input u(t). Their main argument is as follows. Once the algorithm converges, the cross correlation between e(t) and u(t) is close to zero and so is the adaptive gain g(t). This puts the algorithm in an 'asleep' state. Any strong independent disturbance, it is claimed, causes very little change in the estimated parameters because of the low gain and this renders the algorithm immune to such disturbances. Whenever the system parameters change, e(t) becomes more correlated with u(t). This results in an increase in g(t) which 'awakens' the algorithm and allows it to track the changes. They proposed to compute the adaptive gain as follows. Define
•(t) = 1 ~ u(t-n), m
Since A < 1, this results in ~'~(t) decaying exponentially to a small value with a time constant z = 1 / ( 1 - A). If the disturbance persists for a period which is large compared to r, ~J~(t) would have become v~ery small by the end of t h e burst. Now, though cry(t) would decrease when the burst ends (because ~7(t) would have become low), it cap never become smaller than o,~, the system noise power. Therefore, even after the end of the burst,
231
(24)
n=o
which is the average of the elements in q~(t). 13(t) and g(t) are then computed as
~ ( t ) = A ~ ( t - 1 ) + ( 1 - A ) f i ( t ) e(t),
(25)
g(t + 1) = a~(t),
(26)
where c~ is a constant. If the {g(t)} are used as weights in the WRLS algorithm, (14) and (15), the {e(t)} used in the above computations are to be Vol. 20, No. 3, July 1990
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
232
replaced algorithm (i) The proposed algorithm turbance. algorithm
by {r/(t)}. The drawbacks of this are as follows. first drawback is that the mechanism for g(t) would not allow the adaptive to converge even when there is no disThe reason is that in an adaptive of the form
O(t) = O(t-- 1 ) + g ( t ) M ( t ) ~ ( t ) e(t),
(27)
{g(t)} should be positive for convergence of the algorithm (see the proof of the general WRLS algorithm in [5, Chapter 3]). The purpose of using the weights {g(t)} is to change only the magnitude of tlae update of 0(t) and not the direction, which is to be solely determined by M (t)~p (t). A negative g(t) will increase the update magnitude when it has to be decreased and vice versa. Here, the cross correlation estimate/~(t) can be negative as well as positive, and so g(t) could also become negative. This prevents the algorithm from converging. (ii) The second drawback is that the proposed adaptive gain mechanism will not really provide immunity to the burst disturbance. The reason is as follows. The true (ensemble averaged) cross correlation p(t) would become zero after the algorithm has converged and remain unaffected by any uncorrelated disturbance. Now, the estimate proposed for p(t) will be equal to the ensemble average only if the averaging is done over infinite data. But since {e(t)} is non-stationary, an exponentially weighted averaging scheme is used to estimate p(t) (cf. (25)) which uses only the most recent data in a window of length approximately 1/(1 - h ) . Hence, the terms in if(t) e(t) can never sum to zero and will in fact remain non-zero. This would be the case even if the algorithm is assumed to have converged. Now, when a burst disturbance occurs, the increase in e(t) results in an increase in if(t) e(t). Unless ( l - A ) is very low, this would cause a corresponding increase in fi(t), thus necessitating the use of a h very close to one. Though g(t) remains constant for A close to one, the magnitude of the update term in (27) increases due to the increase in e(t) causing the parameters to diverge. Signal Processing
Furthermore, if the disturbance occurs before 0(t) has converged, e(t) and u (t) would be significantly correlated resulting in a high value for g(t) which would further disrupt adaptation. (iii) The final drawback is the inability to track variations in the system parameters. As discussed earlier, for good estimation of p(t), h should be very close to one. This implies a large 'memory' for the estimate ~(t) which in turn prevents the algorithm from tracking the variations in the system parameters.
3. The proposed algorithm In this section, we present the proposed algorithm which is immune to the occurrence of the burst of disturbance and tracks slow variations in the FIR system parameters. Our algorithm differs from that of Reddy et al. [10] in several ways. We use a slightly different definition for g(t) and a more accurate method to evaluate it. Finally, the proposed algorithm starts right from the first iteration unlike the algorithm of Reddy et al.
3. I. Expression for g(t) Define
s( t) = e( t) + to( t),
(28)
which is the sum of the estimation error and the system noise. Using (28) and (9), the observed a priori error can be expressed as
77(t) = s( t) + d(t).
(28)
The expression that we propose for g(t) is
O'S(t) g(t) - cr2(t).
(30)
When a burst disturbance occurs, tr2(t) becomes large resulting in a substantial increase in tr2(t), whereas tr2(t) remains the same. This pulls g(t) down to a low value. When the burst is over, tr2(t) will be equal to irE(t). This brings g(t) back to
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance one. In practice, we can only have estimates of 0.2(t) and 0.2(t), and hence an estimate of g(t). To bring out the modification in the definition of g(t) in the proposed algorithm, we may use (16) and (17) to express g(t) (as defined in [10]) as
g(t)-
2 0.co
2 0.~,+ 0.~(t)
0.2(t)+o.2 g(t) -- 0.2(t) + o.2 + 0.2(t ) • Note that to determine g(t) (cf. (30)), we do not require an estimate of the system noise power 0.2 separately, and hence we need not wait for the a priori error power to converge to 0.2 . We can therefore start estimating o.2(t) and 0.2(t) right from the first iteration. This allows us to drop the restriction that disturbances can occur only after the convergence of the algorithm.
We need estimates of o.2(t) and 0.2(t) at each iteration to give an on-line estimate of g(t). We propose an unbiased exponentially weighted running average estimate for o.2(t), (31)
where 0 < A < 1 is the exponential weighting constant. ~(t) is then computed as
o.%(t)
(32) '
following the logic similar to that for the estim..ate (20). We now develop a rule for the estimate o.~(t), the motivation for which is as follows. The estimate ~ ( t ) should be computed using only the a priori error ~ (t) and it should not decay to zero during the periods of burst disturbance. Obviously, we cannot use an estimate of the form o.~'~t) = A~2~2(t- 1) + (1 - A) 7/2(t)
(33)
for o.~(t), because ~-~,(t) would increase when n ( t )
(34)
When there is no disturbance, ~(t) is close to one, and (34) is approximately equal to (31), as required. When disturbance is present, ~(t) is low and the upd.~ate term is negligble. Since A < 1, this results in o.2(t) exponentially decaying to a small quantity, as in the case of the estimate ~'32~(~t) in the method of Reddy et al. [10]. To prevent this, we replace the coefficient A in the first term of (34) by 1 - (1 - A)g2(t) and obtain the final expression for the estimate of try(t) as 0.2"~t) = (1 - (1 - A)~2(t))~2~2(t - 1) + (1 - A)~2(t) r/2(t). Now, when ~ ( t ) - 1
3.2. Computation of ~,(t)
~ ( t ) = ~_.-.. 2 ( t - 1)
becomes large during bursts of disturbance. To prevent this, we scale down the contribution of ,72(t) by the power in the disturbance signal. Since ~(t) is used to scale down ~(t) in the recursive update for O(t) (cf. (22)), we use ~2(t) to scale down 'r/2(t) in (33). With this, (33) becomes
~'~(t)=A-'~(t-1)+(1-A)ff,2(t)712(t).
whereas our modified definition for g(t) (as defined in (30)), in view of (28) and (29), gives
0.2"--(t)=Ao.%"-~(t-1)+(1-A)~q2(t),
233
(35)
(no disturbance),
0 . 2 ( t ) - A0.2(t- 1) + (1 - A)r/2(t)
(36a)
as required. When ~(t) is close to zero (disturbance present), o.2(t) = o.2(t - 1),
(36b)
which implies that during the burst disturbance o.2(t) remains at the level it was before the burst started. Because of this, when the burs~t ends, the decrease in r;(t) and consequently in o.2(t) forces ~(t) to increase towards unity again. The 'recovery period' needed for ~(t) to recover to a value close to one is evidently determined by the speed at which o.2(t) tracks changes in ~7(t), and this is controlled by the exponential weighting constant A. Using a A closer to one results in less noisy estimates fo.~r o.2(t) and ~"~2(t), whereas a smaller A allows o.2(t) to track more rapidly the sudden jump variations in ~(t) caused by the bursts of disturbance. The choice of A also has a bearing on the initial values to be chosen for ~n(t) and ~2~2(t). If A is Vol. 20, NO. 3. July 1990
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
234
small enough, the initial choices are not critical because initial conditions will die out within the interval o f l / ( 1 - A) iterations. IfA is close to unity, choosing ~"~(0) too large would make o.2"~(t) insensitive to the comparatively small increases in ~72(t) caused by disturbance bursts occurring before convergence. This prevents 8(t) from becoming small enough to freeze the parameter updation during such bursts. On the other hand, a A close to unity and too small a value for ~ ( 0 ) would force the estimate to take a long time to settle down, causing wide fluctuations in g(t)... ~ e best compromise for the initial choice of o'2(0) would be of the order of the system noise level. Note that o'2-"~0) should be made equal to ~-'~(0) because both ~'~(t) and o-2(t) would be estimating the same quantity in the absence of disturbance. ... Now, during the disturbance-free interv.,als, o.~(t) is updated in the same way as o.2(t). Whenever *l(t) exhibits a slowly increasing trend (without making jump changes as in the case when bursts..of disturbance come and go), both o.2(t) and o'2(t) can track these trends, and 8(t) would not become small. This situation will arise if the true system parameters vary slowly. Now, the gain of the WRLS algorithm is determined by g ( t ) P ( t ) . Since P(t) will always be greater than zero (because AF
3.3. The overall WRLS algorithm The overall WRLS algorithm is as follows:
0(0) = O, P(O) = ill,
o.2(0) = o.2(0) = O'o.
For all t I> 1,
~(t) = y(t) - ¢ ( t ) r 0 ( t - 1),
(37a)
6"~(t)=h~2(t-1)+(1-A)~72(t),
(37b)
8(t)= o.2(t-
(37c)
1)
o.2"~(t)= (1 - (1 - A)82(t))~'~(t - 1) + (1 - A)82(t) 7/2(t), Signal Processing
(37d)
O(t) = O(t - 1)
8( t)P( t-1)q~( t) -~ Av+8(t)~o(t)Tp(t__l)q~(t) rl(t),
(37e)
P(t)=-~F(P(t-1) 8 ( t ) P ( t - l ) ~ ( t)~p( t)T p( t--1).~ A F + 8 ( t ) ~ ( t ) T p ( t _ l ) ~ p ( t ) ].
(37f)
Here, I denotes the identity matrix and fl is a large quantity. The initial value o.o is to be chosen as suggested in the previous section.
4. Related techniques for parameter tracking To enable tracking of slowly varying parameters, we used a constant forgetting factor AF in our proposed algorithm. This technique suffers from some drawbacks--notably the estimator windup problem, which we briefly discuss in this section. We then describe some representative methods for identifying slowly varying parameters that attempt to overcome these problems. It should be noted that these methods do not address the problem of handling burst disturbances and are likely to fail in their presence. The estimator windup problem [2, 3] arises when input excitation is poor, i.e., ~o(t) is small. Such a situation forces P(t) to grow exponentially-because it is scaled by a factor AF< 1 (cf. (37f)), resulting in a very large gain for the estimator. The consequences of this are severe: (i) the estimates become highly susceptible to noise and numerical errors and (ii) a change in parameters or increase in excitation is overcompensated for, resulting in large oscillations in 0(t) for some time. Other problems with the method lie in the choice for AF, because a smaller AF exacerbates the windup problem, whereas a AF tOO close to unity retards parameter tracking. Fortescue et al. [3] proposed the use of a variable forgetting factor A(t) to overcome the above problems. They argued that a low output error arises from either poor excitation or low parameter error,
235
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
whereas a large output error implies a large parameter error (due to parameter change). To reflect this, A(t) was computed as A ( t ) = l - 1 / N ( t ) , where N(t) -
20 (1 - ~o(t)TP(t)q~(t))rl2(t)
(with ~o a specified constant) is the effective memory length at time t. A low ~7(t) thus yields a large m e m o r y length, to retain as much information as possible, whereas a large ~7(t) shortens memory length to enable quick convergence to the changed parameters. A problem with this method is that increase in r/(t) due to increase in noise level would be misinterpreted as due to change in parameters and result in an erroneous reduction of N ( t ) . Hagglund [6] suggested a directional forgetting technique and a WRLS-type algorithm, which could handle slow changes in the noise level. His technique was to forget information only in the direction where new information is gathered, resulting in a memory length that is different in different directions. The memory length depends on the excitation level and not on the output e r r o r - - w h e n little or no information is coming in, a long m e m o r y length is chosen to retain old information; otherwise memory length is shortened to place more reliance on recent data. The amount of forgetting was designed to make P ( t ) converge to a matrix proportional to L Hagglund's method shares with that of Fortescue et al. the limitation that a priori information about parameter variations is difficult to incorporate. Other methods for tracking slowly varying parameters include the 'constant trace' algorithm of Goodwin et al. [4] and the modified RLS algorithm with exponential resetting and forgetting, proposed by Salgado et al. [11].
5. Simulation results
In this section, we apply our algorithm to the problem of echo cancellation and study its performance using computer simulations. The algorithm
is tested on a variety of scenarios. We also present a few simulations with the algorithms of [10] and [12] to compare their performance with that of ours.
For simulating the echo path through the hybrid, we used an A R M A ( 2 , 2 ) filter whose impulse response is as shown in Fig. la. Note that the impulse response dies out within 35 sampling intervals. This allows us to use a 35 tap F I R filter to identify the echo path impulse response. We may point out here that in order to show the zeroth coefficient of the impulse response clearly, we have plotted one-sample delayed versions in all the cases. The A R M A filter equation is given by y( t) = Kh[ U( t) - - 0 . 7 2 9 3 2 2 3 u ( t - 1)
+4.41u(t--2))] + 0.2431074y(t- 1) --0.49y(t--2), where Kh is the gain constant which was adjusted to give the desired attenuation. Both the far-end and near-end speech-like signals (constituting the reference input and the disturbance signals, respectively) were simulated by passing white Gaussian noise of zero mean through an eighth order elliptic filter which bandlimited the signal to the telephone band of 300-3400 Hz. The gain of the filter was chosen to make the input and output signal variances equal. The white Gaussian samples were obtained using the BoxMueller transform on two independent uniform random numbers each generated from a separate random number generator routine initialized with different seeds. The system noise was a zero mean white Gaussian sequence. Both the speech-like signals are Gaussian distributed, being the outputs of linear systems driven by Gaussian inputs. Their amplitudes can be controlled by specifying the variance of the Gaussian driving inputs. We can also specify the iterations at which to turn on and turn off the near-end speech-like signals, so as to simulate the bursts of disturbance. The algorithm was initialized in all cases with 0(0) = 0, P(0) = 100 1 and ~z(0) = ~,,(0) = 0.01. Vol. 20, No. 3, July 1990
236
B. Jeyendran, V.U. Reddy / Recursioe system identification in burst disturbance
For computation of ~2~2(t) and ~ ( t ) , the tial weighting constant A was chosen as 0.95, which we found to yield the best compromise between noise in estimates and tracking ability. The choice of o-2(0) = 0.01 is not critical because its effect dies out within 20 (= 1/(1 - 0.95)) iterations.
Scenario 1
In this scenario, the burst rejection capability of our algorithm is demonstrated. The far-end speech signal power at the input port of the hybrid is set at 20 dB, and that of the output of the hybrid
0.15
0.10
_1
X 0,05 uJ
]
I l l . ,
I1,
-0.05
10
0
.....
,'
15
20
3b
2'5
40
COEFFICIENT NUMBER
Fig. la. True impulse response of the echo path.
25
15
5 /"1 0-4
(t)
(DB) -5
-15
h
-25
i 0
1800
i
i
, , 5/.=00 3600 ITERATIONS
'
72'00
9000
Fig. lb. Exponentially weighted running average estimate of the output error power, ~'~(t) (Scenario 1). signal Processing
237
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
(which represents the echo signal) at 0 d B by choosing Kh as 0.0216913. The powers of the nearend speech and the system noise are kept at 20 dB and - 2 0 dB, respectively. Initially, the algorithm is allowed to run without any near-end speaker disturbance (which acts as
a burst disturbance) till the 1999th iteration. At iteration 2000, a sequence of bursts of near-end speech, generated as described in the beginning of this section, is introduced to simulate the on-off nature of two-way telephone conversation. The burst sequence is chosen as follows. The first burst
1.0
0-8
0.6
~'~t~ 0.4
0.2
0
L
0
i
i
i
1800
3600
I
=
5400
i
9000
7200
ITERATIONS
Fig. lc. Behaviourof the adaptive weight, if(t) (Scenario 1).
0,15
0.10 ILl
0.05 i.t. I.L bJ 0
11 - 0.05
0
~
lb
1;
2b
2'5
3'0
3's
40
COEFFICIENT NUMBER
Fig. ld. Estimated impulse response of the echo path at the 8500th iteration (Scenario 1). Dots represent the true coefficientvalues. Vol. 20, No. 3, July 1990
238
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
begins and ends at iterations 2000 and 3000, respectively. The second burst appears at the 3500th iteration and lasts upto the 6500th iteration. The third burst spans the interval covering iterations 6750 to 8000. In Figs. lb and lc we plotted ~-'~(t), which is an exponentially weighted running average esti-
mate of the a priori output error power and the adaptive weight ~(t), respectively. In echo cancellation, the output error power represents the residual (i.e., uncancelled) echo power plus the system noise power. The plot of Fig. lb shows that the output error power has reduced to the system noise level of - 2 0 dB within 500 iterations,
1.0
0.8
0.6 A
9(t) 0.4
020 0
1800
I
I
3600 5400 ITERATIONS
I
7200
9000
Fig. le. Behaviour of the adaptive weight, ~(t), of the algorithm of [10] (Scenario 1).
0.16 /
0'121
0.08
~ct~ 0.04
-0.04
' 1800
= 5400 ITERATIONS
36100
, 7200
9000
Fig. lf. Behaviour of the adaptive weight, ~(t), of the algorithm of [12] (Scenario 1). Signal
Processing
239
B. Jeyendran, V.U. Reddy / Recursive system identificatimi in burst disturbance
07 0.4
ta
0.2
.J
t~ B tl. tl. t~
-0.2
-0.4
-0-6
5
10
15 20 25 COEFFICIENT NUMBER
30
35
/40
Fig. lg. Estimated impulse response of the echo path at the 8500th iteration, using the algorithm of [12]. (Scenario 1).
implying that the algorithm has converged and the residual echo power is nearly zero. Note also that the output error power rises to the disturbance level of 20 dB at the onset of the burst, remains nearly at this level during the burst interval and falls to the system noise level of - 2 0 dB when the burst is over. From the plot of Fig. lc we find that ~(t) remains close to one during the burst-free intervals and close to zero during the burst intervals as required. Further, the fall of ~(t) close to zero at the onset of the burst is nearly instantaneous, and its recovery to close to unity when the burst is over is very fast (approximately 150 iterations) indicating the other desirable feature of the algorithm-the presence and absence of bursts are detected with very little delay. The small fluctuations in if(t) during the burst-free ~ntervals are due to the fluctuations in the estimates of ~--~(t) and ~-'~'~(t), that are caused by the short averaging interval ( = 1 / ( 1 0.95)). In Fig. ld, we have plotted the estimated echo path impulse response as observed at the 8500th iteration, after all the bursts are over. For the purpose of comparison, we have indicated the
corresponding coefficient values o f the true impulse response as dots. Note that the estimated response is nearly identical to the true echo path impulse response shown in Fig. la, implying that our algorithm is immune to the burst interference caused by the near-end speech in the sense that it automatically freezes the updating of the estimated echo path parameters during the periods of nearend speech. To compare the performance of our algorithm with that of [10], we repeated the simulations of this scenario with the algorithm of [10], using identical signal realizations. The value of ;t in (19) and (21) was kept at 0.95, as in our case, and that of N, the initial burst-free period, was chosen as 1000---sufficiently larger than the 500 iterations required for convergence (see Fig. lb). The behaviour of ~(t) is plotted in Fig. le. For the first N = 1000 iterations if(t) is fixed at 1, as required by the algorithm of [10]. Thereafter the if(t) is estimated adaptively. Once adaptation starts, ~(t) reduces slowly because of the exponential decay of ~-~(t), falling to a value very close to zero after iteration 2000 when the first burst occurs, and remains at that value subsequently. This prevents Vol. 20, No. 3, July 1990
240
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
the RLS algorithm from tracking any changes in the echo path (or the system parameters, in general), as we pointed out in Section 2.1. On the other hand, the if(t) computed as in our algorithm remains close to unity between the periods of burst disturbances, thereby facilitating the tracking of echo path changes during these periods, as illustrated in Scenario 4. Next, we repeated the simulations with the algorithm of [ 12], again with identical signal realizations. The initial value of the cross correlation estimate /~(t) was chosen as 1, to reflect the high initial correlation between the output error and the reference input. As discussed in Section 2.2, the value of A in the estimation rule (25) for /~(t) should be very close to unity. Accordingly, we chose A =0.999, corresponding to an averaging window length of 1000 samples for/~(t). The value of a was chosen as 0.1. Figure i f shows the behaviour of if(t), computed as in (26). Initially, if(t) converges towards zero as the RLS algorithm converges and the output error becomes uncorrelated with the input. During the intervals of burst disturbance, ~(t) does not remain at zero; instead, it oscillates between positive and negative values, as predicted in Section 2.2, As a result, the esti-
mated echo path is forced to diverge during these intervals. This is confirmed by the plot of Fig. lg, which shows the estimated impulse response as observed at iteration 8500. Note that this is completely different from the true impulse response (Fig. la). Similar results were obtained with A = 0.998 and 0.995, and for different signal realizations. This example clearly brings out the effectiveness of our algorithm in handling bursts of disturbance, as well as the drawbacks of the algorithms of [10] and [12]. Scenario 2
To show that our algorithm does equally well even when the system noise level is not too low compared to the power level of other signals, we have kept the system noise power level at - 1 0 dB, leaving the power levels of other signals and the disturbance sequence as in Scenario 1, and repeated the experiment. Figures 2a and 2b give the plots of the estimates of the output error power and the adaptive weight ~(t). The plots show that the behaviour of the algorithm is the same as before. That is, the output error power rises to 20 dB when the burst starts, remains approximately
25
15
A2 O'-,,i (t) (DB)
5
U
-5
-15
-25
I
o
1BOO
36t00
5 4I0 0
i 7200
9000
ITERATIONS
Fig. 2a. Exponentially weighted running average estimate of the output error power, ~ ( t ) Signal Processing
(Scenario 2).
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
241
1.0
0.8
0.6
0./4
0.2
0 0
1800
i 3600 5/-,00 ITERATIONS
72100
9000
Fig. 2b. Behaviour of the adaptive weight, ~(t) (Scenario 2).
at this level during the burst interval and then falls to the system noise level ( - 1 0 dB) at the end of the burst, and the adaptive weight if(t) takes the values one and zero in the same way as before.
Scenario 3
In Section 3, we claimed that our algorithm does not require any initial burst-free interval for the output error power to converge to the system noise level, as is assumed in Reddy et al. [10]. To demonstrate this property of the algorithm, we have chosen the signal levels as in Scenario 1 and the burst sequence as follows. The first burst begins at the 100th iteration and ends at the 600th iteration, and the second burst spans the interval from 1600 to 2600 iterations. The plot of Fig. 3a shows that the output power falls to about - 7 dB in the first 100 iterations which, in view of the system noise level of - 2 0 dB, implies that the algorithm is far from convergence at the 100th iteration. With the occurrence of the burst the error power rises to 20 dB, the power level of the near-end speech. After the burst is terminated at the 600th iteration, the error power continues to decrease until it converges to the system noise level ( - 2 0 dB) at about
the 900th iteration. The adaptive weight ¢(t) (see Fig. 3b) reduces to a level close to zero at the 100th iteration and remains at this value during the burst interval. After the burst is over, it rises rapidly close to unity thereby enabling the algorithm to further adapt. Further behaviour of the error power and the adaptive weight is similar to that in the Scenario 1. Finally, we have plotted the estimated echo path impulse response as observed at the 3000th iteration in Fig. 3c. Note the close similarity between this response and the true impulse response shown by dots. This example demonstrates that the algorithm freezes the parameter updation during the burst intervals even when the burst occurs before the parameters have initially converged, and it continues to update the parameters after the burst is over.
Scenario 4
This scenario is chosen to demonstrate the ability of the algorithm to track a small change in the echo path, which may take place after the algorithm has run for a while. We have set the Vol. 20, No. 3, July 1990
242
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
forgetting factor in the RLS algorithm (AF) at 0.998 and power levels of the signals as in Scenario 1. The burst disturbance (i.e., the near-end speech) begins at the 2000th iteration and ends at the 3000th iteration. At iteration 5000, the echo path of Fig. la is randomly changed by 15%. That is, each coefficient of the impulse response is increased or
decreased randomly by an amount equal to 15% of its original value. The resulting echo path is shown in Fig. 4a. Figure 4b shows that the output error power converges to the system noise level initially, rises to the burst disturbance level and remains at this level during the burst interval. At iteration 5000,
25
15
/*'2
o-, 1 (t)
(DB)
-1
-25
I
0
600 =
1600 a 24a00 ITERATIONS
3200
4000
Fig. 3a. Exponentially weighted running average estimate of the output error power, ~'~(t) (Scenario 3).
1.0
0.6
0.6
0.4
0.2
8~)0
1600 24100 ITERATIONS
I
3200
Fig. 3b. Behaviour of the adaptive weight, ~(t) (Scenario 3). Signal Processing
4000
243
B. Jeyendran, V.U. Red@ / Recursive system identification in burst disturbance 0.15
0.10 lu ..i .<[ 1> 0.05 o M. M.
o
I
0
l!
__l
I
-0.0510
i I
!
I
l
I
!
z
l
~
t
f
I. I |
i
lb
~s 2'0 ~'5 CoEF~,C~Nr NUM.E~
,
i
•
,
•
3~
,
- -
3'5
40
Fig. 3c. Estimated impulse r e s p o n s e o f the echo path at the 3000th iteration (Scenario 3). D o t s represent the true coefficient values.
the error power increases by about 7 dB because of the change in the true echo path. However, the algorithm continues to adapt and the error power settles down to the system noise level by about the 7000th iteration. The estimated echo path impulse
response at the 7500th iteration, given in Fig. 4c, shows that the changed echo path (Fig. 4a) has indeed been identified. For ease of comparison, we have shown the coefficient values of the true impulse response of the changed echo path as dots.
0.15
0,10 tlJ _1
~0.05 ~J ou 0
J.
F
Ill=z,,
J
-O.OSl 0
lO
1'5
i
20
A
i
30
h
.... 35
40
COEFFICIENT NUMBER Fig.
4a. True i m p u l s e
r e s p o n s e o f the changed e c h o path.
Vol. 20, No. 3, July 1990
244
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance 25
15
5 A2 0-7 ( t ) (DB) -5
0
1800
3600 5400 ITERATIONS
7200
9000
Fig. 4b. Exponentially weighted running average estimate of the output error power, (r2n(t) (Scenario 4).
0.15
0.10 hi
tu 0.05 __q
LL 11.
tj
ll[t - 0.05
lb
: 1~ 2b ~5 COEFF,CIENT NUMBER
3'0
3's
40
Fig. 4c. Estimated impulse response of the changed echo path at the 7500th iteration (Scenario 4). Dots represent the true coefficient values of Fig. 4a.
The a b o v e s i m u l a t i o n s thus corroborate o u r claims o n the p e r f o r m a n c e o f o u r algorithm a n d also d e m o n s t r a t e its superiority over the earlier m e t h o d s p r o p o s e d in [10] a n d [12]. Signal Processing
6. Conclusion Conventional recursive system identification algorithms will fail when sudden bursts of
B. Jeyendran, V.U. Reddy / Recursive system identification in burst disturbance
disturbance appear in the measurement of the system output. The problem is especially critical in the identification of systems whose parameters vary slowly with time. In this paper, we have proposed a new algorithm, which is an improvement of the earlier methods [10, 12], to solve the above problem for the case of FIR systems. The improved algorithm is capable of automatically 'freezing' the updating of the estimated parameters during the periods of disturbance and 'defreezing' rapidly after the disturbances end. This would keep the algorithm alive to track the time varying parameters. In addition, the algorithm does not require the restriction imposed in the earlier methods that the disturbances can occur only after the initial convergence of the algorithm. The performance of the algorithm as applied to echo cancellation was studied using computer simulations. The simulation results support our predictions on the behaviour of the algorithm and also bring out its superiority over those proposed in [10] and [12].
References [1] E. Eleftheriou and D.D. Falconer, "Tracking properties and steady-state performance of RLS adaptive filter algorithms", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-34, No. 5, October 1986, pp. 1097-1110. [2] R.J. Evans, X. Xie, C. Zhang and Y.C. Soh, "Algorithms for discrete-time adaptive control of rapidly time-varying systems", in: C.T. Leondes, ed., Control and Dynamic Systems, Advances in Theory and Applications, Vol. 29, Academic Press, San Diego, CA, 1988, pp. 251-282.
245
[3] T.R. Fortescue, L.S. Kershenbaum and B.E. Ydstie, "Implementation of self-tuning regulators with variable forgetting factors", Automatica, Vol. 17, No. 6, November 1981, pp. 831-835. [4] G.C. Goodwin, D.J. Hill and M. Palaniswami, "Towards an adaptive robust controller", Proc. 7th IFAC Symposium on Identification and System Parameter Estimation, York, England, 1985, pp. 997-1002. [5] G.C. Goodwin and K.S. Sin, Adaptive Filtering, Prediction and Control, Prentice-Hall, Englewood Cliffs, N J, 1983. [6] T. Hagglund, "'Recursive estimation of slowly timevarying parameters", Proc. 7th IFAC Symposium on Identification and System Parameter Estimation, York, England, 1985, pp. 1137-1142. [7] F. Ling and J.G. Proakis, "'Non-stationary learning characteristics of least squares adaptive estimation algorithms", Proc. IEEE Internat. Conf. Acoust. Speech Signal Process. 1984, San Diego, CA, March 1984, pp. 3.7.1-3.7.4. [8] L. Ljung and T. Soderstrom, Theory and Practice of Recursire System Identification, MIT Press, Cambridge, MA, 1983. [9] D.G. Messerschmitt, "'Echo cancellation in speech and data transmission", IEEE J. Selected Areas Commun., Vol. SAC-2, No. 2, March 1984, pp. 283-296. [10] V.U. Reddy, T.J. Shan and T. Kailath, "'Application of modified least squares algorithms to adaptive echo cancellation", Proc. IEEE Internat. Conf. Acoust. Speech Signal Process. 1983, Boston, MA, April 1983, pp. 53-56. [11] M.E. Salgado, G.C. Goodwin and R.H. Middleton, "Modified least squares algorithm incorporating exponential resetting and forgetting", Internat. J. Control, Vol. 47, No. 2, February 1988, pp. 477-491. [12] T.J. Shan and T. Kailath, "Adaptive algorithms with an automatic gain control feature", IEEE Trans. Circuits Systems, Vol. CAS-35, No. 1, January 1988, pp. 122-127. [13] M.M. Sondhi and D.A. Berkley, "Silencing echos on the telephone network", Proc. IEEE, Vol. 68, No. 8, August 1980, pp. 948-963. [14] B. Widrow, J.M. McCool, M.G. Larimore and C.R. Johnson Jr., "Stationary and non-stationary learning characteristics of the LMS adaptive filter", Proc. IEEE, Vol. 64, No. 8, August 1976, pp. 1151-1162.
Vol. 20, No. 3, July 1990