Analysis of the TDLMS algorithm operating in a nonstationary environment

Eduardo Vinicius Kuhn (a), Javier Ernesto Kolodziej (b), Rui Seara (a,*)

(a) LINSE: Circuits and Signal Processing Laboratory, Department of Electrical Engineering, Federal University of Santa Catarina, 88040-900, Florianópolis, SC, Brazil
(b) GID-IE: Research and Development in Electronics Engineering Group, Department of Electronics Engineering, National University of Misiones, Oberá, Misiones, Argentina
(*) Corresponding author. E-mail addresses: [email protected] (E.V. Kuhn), koloj@fio.unam.edu.ar (J.E. Kolodziej), [email protected] (R. Seara).
Keywords: Adaptive filtering; Nonstationary environment; Stochastic analysis; TDLMS algorithm; Time-varying plant
Abstract

This paper presents a stochastic analysis of the transform-domain least-mean-square (TDLMS) algorithm operating in a nonstationary environment (time-varying plant) with real-valued correlated Gaussian input data, from which the analysis for the stationary environment follows as a special case. In this analysis, accurate model expressions are derived describing the transformed mean weight-error behavior, learning curve, transformed weight-error covariance matrix, steady-state excess mean-square error (EMSE), misadjustment, step size for minimum EMSE, degree of nonstationarity, as well as a relationship between misadjustment and degree of nonstationarity. Based on these model expressions, the impact of the algorithm parameters on its performance is discussed, clarifying the behavior of the algorithm vis-à-vis the nonstationary environment considered. Simulation results for the TDLMS algorithm are shown by using the discrete cosine transform, which confirm the accuracy of the proposed model for both transient and steady-state phases.

© 2015 Elsevier Inc. All rights reserved.
1. Introduction

One of the most powerful and well-known adaptive algorithms is the least-mean-square (LMS) algorithm [1]. Due to its low computational complexity and robustness over a wide range of operating conditions, this algorithm has been successfully used in many practical applications [2–5]. However, it is well known that, depending on the correlation level of the input data (i.e., the eigenvalue spread of the input autocorrelation matrix), the convergence rate of the LMS algorithm is compromised [1–5]. Aiming to overcome this drawback of the LMS algorithm, different strategies have been proposed and studied in the open literature [2–19]. Such strategies generally lead to better performance at the expense of a substantial increase in computational load (for details, see [5]). Nevertheless, in many practical applications, it is important to consider the tradeoff between computational complexity and convergence rate. To meet this challenge, the transform-domain LMS (TDLMS) algorithm [6] arises as an interesting alternative to the LMS algorithm. The TDLMS is obtained by including a preprocessing stage into the LMS [2,4,6]. This preprocessing performs an orthogonal transformation on the input data followed by power normalization [2,17]. Such an approach aims to reduce the correlation level of the input data, thus enhancing the algorithm convergence rate for correlated data [5,6].

Hence, taking into consideration the practical applicability of the TDLMS algorithm, its stochastic model becomes an interesting tool to predict the algorithm behavior under different operating conditions. In short, stochastic models can provide a comprehensive understanding as well as useful insights into the algorithm behavior [20], permitting one to identify undesired behavior and to modify the algorithm so as to overcome such a problem or even improve its performance under specific operating conditions [21]. Moreover, having model expressions at hand, the use of extensive Monte Carlo (MC) simulations for tuning the algorithm parameters as well as assessing its performance can be avoided [19–25], saving simulation time. However, the analysis of adaptive algorithms brings to light some mathematical challenges, which are mostly circumvented by the use of simplifying assumptions [5,25]. In general, accurate models require as few simplifying assumptions as possible, implying more mathematical difficulties during their development. Thereby, depending on the kind of assumptions used, there is a tradeoff between the difficulty of the involved mathematics and the accuracy of the obtained model. Concerning the TDLMS algorithm, it is important to mention that most of the analyses presented so far in the literature focus exclusively on the case of a stationary environment
[6,8,9,14,26,27]. On the other hand, the analysis of the TDLMS algorithm considering a nonstationary environment (i.e., a time-varying plant) has only been addressed in a few research works [15,25]. In [15], a modified power estimator is used in the TDLMS, leading to a model particularized for such a modified algorithm. Moreover, [15] assumes that the output of each transformation branch is a sequence of statistically independent samples [16], which significantly compromises the accuracy of the obtained model (especially in the transient phase). In [25], model expressions for predicting the steady-state excess mean-square error (EMSE) of some key algorithms (including the TDLMS) have been derived. Nevertheless, particularly for the TDLMS algorithm, the model expressions presented there are not accurate (see [25, Fig. 11]). In addition, even with respect to the steady-state algorithm behavior, the results obtained in [25] have not been properly discussed. Therefore, a complete and accurate analysis of the TDLMS algorithm operating in a nonstationary environment remains an open problem.

Another important aspect in the analysis of the TDLMS algorithm resides in the determination of expected values in the form
$$\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\hat{\mathbf{R}}(n)] \tag{1}$$

where $\hat{\mathbf{D}}(n)$ denotes a diagonal normalization matrix [2] and $\hat{\mathbf{R}}(n)$ represents an estimate of the input autocorrelation matrix. In the open literature, several approximations have been used to analytically determine (1), each of them having distinct accuracy levels [14,15,26,27]. In particular, better results are obtained in [27], where the Averaging Principle (AP) [28] is used to split (1) into two expected values, i.e.,

$$\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\hat{\mathbf{R}}(n)] \cong \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)]\,\mathrm{E}[\hat{\mathbf{R}}(n)] \tag{2}$$
where $\mathrm{E}[\hat{\mathbf{R}}(n)]$ is obtained from the input autocorrelation matrix and $\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)]$ is determined by using an approximate method for computing the integrals arising from the calculus of such an expected value, which are known as hyperelliptic integrals [29]. Although the use of the AP results in simpler model expressions, its accuracy is satisfactory only under restrictive operating conditions, i.e., when a large window is used to estimate $\hat{\mathbf{D}}(n)$ and when the input data are weakly correlated. Therefore, the determination of accurate analytical expressions for computing such expected values is still an open issue.

In this context, focusing on the analysis of the TDLMS algorithm operating in a nonstationary environment (time-varying plant), the present research work has the following goals:

i) to determine analytical solutions for expected values in the form of (1) considering real-valued correlated Gaussian input data;
ii) to obtain model expressions describing the transformed mean weight-error behavior, learning curve, and transformed weight-error covariance matrix;
iii) to derive model expressions for the EMSE in steady state, misadjustment, step size for minimum EMSE, and degree of nonstationarity, as well as a relationship between misadjustment and degree of nonstationarity; and
iv) to discuss how the algorithm parameters affect its performance.

A preliminary sketch of this work appears in [30] and [31]. Here, more accurate model expressions are derived and a deeper discussion about the algorithm behavior in a nonstationary environment is provided.

The paper is organized as follows. Section 2 describes the rule used to obtain time-varying plants characterizing the nonstationary environment considered and introduces the TDLMS algorithm.
Section 3 derives the proposed model and discusses how the algorithm parameters affect its performance. In Section 4, simulation results are shown and discussed, aiming to assess the accuracy of the proposed model under different operating scenarios. Finally, Section 5 presents concluding remarks.

Throughout this paper, the mathematical notation adopted follows the standard practice of using lower-case boldface letters for vectors, upper-case boldface letters for matrices, and both italic Roman and Greek letters for scalar quantities. In addition, superscript T stands for the transpose, E(·) denotes the expected value, and tr(·) characterizes the trace operator.

2. Problem statement

In this section, a brief description of the nonstationary environment considered in the development of the proposed model is firstly presented. Next, the general expressions describing the TDLMS algorithm are introduced.

2.1. Brief description of the nonstationary environment

A fundamental feature of adaptive filters is their capability to track variations of the underlying signal statistics in the surrounding environment. Hence, it is important to develop stochastic models characterizing the tracking capability of adaptive algorithms in nonstationary environments [5]. In this context, a plant (system to be identified) exhibiting a time-varying behavior is used here to describe the nonstationary environment considered [1–5]. For simplicity, we assume that the plant undergoes random variations according to a first-order Markov process, which is described by the following rule [4,5,32]:
$$\mathbf{w}_o(n+1) = \alpha\,\mathbf{w}_o(n) + \boldsymbol{\phi}(n) \tag{3}$$
where $0 < \alpha \leq 1$, $\mathbf{w}_o(0)$ characterizes an N-dimensional arbitrary starting vector, and $\boldsymbol{\phi}(n) = [\phi_0(n)\ \phi_1(n) \cdots \phi_{N-1}(n)]^{\mathrm{T}}$ denotes a plant perturbation vector (see Assumption A3 further ahead). In the literature, it is usually assumed that the value of parameter α in (3) is very close to 1 [3–5]; thus, the first-order Markov process [given by (3)] can be simplified (making α = 1) to the commonly used random-walk model [3,33]. On the other hand, instead of simplifying (3) to the random-walk model as done in [15] and [25], we have preferred to keep α as a free parameter in the development presented here. Thereby, since the proposed model is not restricted to a specific value of α, the condition in which (3) reduces to the random-walk model (i.e., α = 1) follows as a special case. Note also that, making α = 1 and $\boldsymbol{\phi}(n) = \mathbf{0}\ \forall n$ ($\sigma_\phi^2 = 0$) in (3), the plant weight vector becomes time invariant, i.e., $\mathbf{w}_o(n) = \mathbf{w}_o\ \forall n$; as a consequence, one returns to the stationary environment commonly considered in the analysis of adaptive algorithms. (A minimal simulation sketch of rule (3) is given below.)
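To make the plant rule (3) concrete, the following minimal NumPy sketch generates one realization of the time-varying plant. The Gaussian choice for φ(n) and the specific values of `alpha` and `sigma_phi2` are illustrative assumptions, not requirements of the model.

```python
import numpy as np

def evolve_plant(w_o, alpha, sigma_phi2, rng):
    """One step of the first-order Markov rule (3):
    w_o(n+1) = alpha * w_o(n) + phi(n), with phi(n) a zero-mean
    perturbation of variance sigma_phi2 per element (Assumption A3)."""
    phi = rng.normal(0.0, np.sqrt(sigma_phi2), size=w_o.shape)
    return alpha * w_o + phi

rng = np.random.default_rng(0)
w_o = rng.standard_normal(8)      # arbitrary starting vector w_o(0)
for n in range(1000):
    w_o = evolve_plant(w_o, 0.999, 5e-7, rng)
# alpha = 1 with sigma_phi2 = 0 recovers the stationary (time-invariant) plant.
```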
2.2. Revisiting the TDLMS algorithm

In order to track variations of the plant weights, an adaptive filter of the same order is used (see Fig. 1). This adaptive filter has its weights updated by the TDLMS algorithm (see Fig. 2), which is given by [2,6]

$$\tilde{\mathbf{w}}(n+1) = \tilde{\mathbf{w}}(n) + 2\mu\,\hat{\mathbf{D}}^{-1}(n)\,e(n)\,\tilde{\mathbf{x}}(n) \tag{4}$$

where $\tilde{\mathbf{w}}(n) = [\tilde{w}_0(n)\ \tilde{w}_1(n) \cdots \tilde{w}_{N-1}(n)]^{\mathrm{T}}$ is the transformed adaptive weight vector, $\tilde{\mathbf{x}}(n) = \mathbf{T}\mathbf{x}(n) = [\tilde{x}_0(n)\ \tilde{x}_1(n) \cdots \tilde{x}_{N-1}(n)]^{\mathrm{T}}$ denotes the transformed input vector resulting from applying a real-valued N × N orthogonal transform T [2] to the input vector $\mathbf{x}(n) = [x(n)\ x(n-1) \cdots x(n-N+1)]^{\mathrm{T}}$, $e(n)$ represents the error signal, μ is the step-size control parameter, and $\hat{\mathbf{D}}(n)$ characterizes
Fig. 1. Typical setup of a system identification problem.

a diagonal normalization matrix whose elements are determined by

$$\hat{\sigma}_i^2(n) = \frac{1}{M}\,\tilde{\mathbf{x}}_i^{\mathrm{T}}(n)\,\tilde{\mathbf{x}}_i(n) \tag{5}$$

with

$$\tilde{\mathbf{x}}_i(n) = [\tilde{x}_i(n)\ \tilde{x}_i(n-1) \cdots \tilde{x}_i(n-M+1)]^{\mathrm{T}} \tag{6}$$

denoting an M-dimensional (window length) vector that contains the output of the ith transformation branch. Note that, in practical implementations, the variance of the output of the ith transformation branch $\hat{\sigma}_i^2(n)$ is usually estimated through

$$\hat{\sigma}_i^2(n) = \beta\,\hat{\sigma}_i^2(n-1) + (1-\beta)\,\tilde{x}_i^2(n) \tag{7}$$

with $0 \leq \beta < 1$ [2,5,26]. Nevertheless, since the algorithm exhibits the same behavior considering either (5) or (7), the former expression has been used here for the sake of mathematical simplicity.
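For illustration, both power estimators can be written compactly as below; this is a hedged sketch (function and variable names are ours), and the rule of thumb β ≈ 1 − 1/M, which gives the exponential window (7) an effective memory comparable to the M-sample window of (5), is a common heuristic rather than a result of this paper.

```python
import numpy as np

def window_power(x_branch, M):
    """Eq. (5): sliding-window power of the i-th branch output."""
    tail = x_branch[-M:]
    return np.dot(tail, tail) / M

def recursive_power(sigma2_prev, x_new, beta):
    """Eq. (7): exponentially weighted power estimate, 0 <= beta < 1."""
    return beta * sigma2_prev + (1.0 - beta) * x_new**2
```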
By considering the system identification problem depicted in Fig. 1, the error signal can be expressed as

$$e(n) = d(n) - \tilde{\mathbf{w}}^{\mathrm{T}}(n)\,\tilde{\mathbf{x}}(n) \tag{8}$$

with

$$d(n) = \mathbf{w}_o^{\mathrm{T}}(n)\,\mathbf{x}(n) + z(n) = \tilde{\mathbf{w}}_o^{\mathrm{T}}(n)\,\tilde{\mathbf{x}}(n) + z(n) \tag{9}$$

where $\tilde{\mathbf{w}}_o(n) = \mathbf{T}\mathbf{w}_o(n)$ characterizes the time-varying transformed plant weight vector and $z(n)$ denotes the measurement noise (see Assumption A2 further ahead). Note that the adaptive filter output and the error signal are in the time domain, although $\tilde{\mathbf{x}}(n)$, $\tilde{\mathbf{w}}_o(n)$, and $\tilde{\mathbf{w}}(n)$ are in the transform domain [2].

Fig. 2. Block diagram of the transform-domain adaptive filter.
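A minimal sketch of one TDLMS iteration built from (4), (5), (8), and (9) is given next. The orthonormal DCT-II construction and the small regularizer `eps` (which keeps the division in (4) well defined when a branch power estimate is near zero) are implementation assumptions, not part of the analysis.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (an N x N real orthogonal transform)."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def tdlms_step(w_t, xt_win, d, mu, eps=1e-8):
    """One TDLMS iteration. xt_win is an (M, N) array holding the last M
    transformed input vectors; its first row is x~(n)."""
    x_t = xt_win[0]
    sigma2 = np.mean(xt_win**2, axis=0)             # branch powers, eq. (5)
    e = d - w_t @ x_t                               # error signal, eq. (8)
    w_t = w_t + 2 * mu * e * x_t / (sigma2 + eps)   # weight update, eq. (4)
    return w_t, e
```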
3. Proposed model

Now, we proceed to derive model expressions describing the algorithm behavior, taking into account the nonstationary environment considered and the particulars of the TDLMS algorithm. Specifically, in this section, model expressions for the transformed mean weight-error behavior, learning curve, and transformed weight-error covariance matrix are firstly derived. Then, model expressions for the steady-state EMSE, misadjustment, step size for minimum EMSE, and degree of nonstationarity, as well as a relationship between misadjustment and degree of nonstationarity, are also obtained. Based on these expressions, we also discuss how the algorithm parameters affect its performance.

Before proceeding further with the model development, let us firstly introduce the simplifying assumptions considered [2–5]:

A1. The input signal x(n) is obtained from a zero-mean correlated Gaussian stationary process with variance $\sigma_x^2$ and autocorrelation matrix $\mathbf{R} = \mathrm{E}[\mathbf{x}(n)\mathbf{x}^{\mathrm{T}}(n)]$.
A2. The measurement noise z(n) is obtained from a zero-mean stationary process with variance $\sigma_z^2$, which is assumed uncorrelated with any other signal in the system.
A3. The plant perturbation vector $\boldsymbol{\phi}(n)$ is obtained from a process with zero mean, variance $\sigma_\phi^2$, and autocorrelation matrix $\boldsymbol{\Omega} = \mathrm{E}[\boldsymbol{\phi}(n)\boldsymbol{\phi}^{\mathrm{T}}(n)]$, which is uncorrelated with any other signal in the system. Also, it is assumed that $\boldsymbol{\phi}(k)$ and $\boldsymbol{\phi}(l)$ are uncorrelated for $k \neq l$, i.e., $\mathrm{E}[\boldsymbol{\phi}(k)\boldsymbol{\phi}^{\mathrm{T}}(l)] = \mathbf{0}$ for $k \neq l$.
A4. For a slow adaptation condition (small step size), $\mathbf{w}(n)$, $\mathbf{w}_o(n)$, and $\mathbf{x}(n)$ are assumed statistically independent (for a different point of view about the independence assumption, see [34]).

Here, two important remarks about the assumptions considered can be made. Firstly, these assumptions also hold for the corresponding transformed variables, i.e., $\tilde{\mathbf{w}}(n)$, $\tilde{\mathbf{w}}_o(n)$, $\tilde{\mathbf{x}}(n)$, and $\tilde{\boldsymbol{\phi}}(n)$. Secondly, such assumptions have been commonly used in the analysis of adaptive algorithms to make the model development mathematically tractable (for details, see [2–5]).

3.1. Transformed mean weight-error behavior

In this section, an expression for the transformed mean weight-error behavior is derived. To this end, we first substitute (9) into (8) and the resulting expression into (4), which leads to

$$\tilde{\mathbf{w}}(n+1) = \tilde{\mathbf{w}}(n) + 2\mu\,\hat{\mathbf{D}}^{-1}(n)\,\tilde{\mathbf{x}}(n)\,[\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}_o(n) + z(n) - \tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}(n)]. \tag{10}$$

Then, using (3) and defining the transformed weight-error vector as

$$\tilde{\mathbf{v}}(n) = \tilde{\mathbf{w}}(n) - \tilde{\mathbf{w}}_o(n) \tag{11}$$

the transformed weight-error update equation becomes

$$\tilde{\mathbf{v}}(n+1) = [\mathbf{I} - 2\mu\,\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)]\,\tilde{\mathbf{v}}(n) + 2\mu\,\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z(n) + (1-\alpha)\,\tilde{\mathbf{w}}_o(n) - \tilde{\boldsymbol{\phi}}(n) \tag{12}$$

where I is an N × N identity matrix and $\tilde{\boldsymbol{\phi}}(n) = \mathbf{T}\boldsymbol{\phi}(n)$ denotes the transformed plant perturbation vector. Now, taking the expected value of both sides of (12) and considering Assumptions A2–A4, we get

$$\mathrm{E}[\tilde{\mathbf{v}}(n+1)] \cong (\mathbf{I} - 2\mu\mathbf{P})\,\mathrm{E}[\tilde{\mathbf{v}}(n)] + (1-\alpha)\,\mathrm{E}[\tilde{\mathbf{w}}_o(n)] \tag{13}$$
with
$$\mathrm{E}[\tilde{\mathbf{w}}_o(n)] = \alpha^n\,\tilde{\mathbf{w}}_o(0) \tag{14}$$

and

$$\mathbf{P} = \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\,\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)]. \tag{15}$$

The key point now is to determine a solution for (15), which depends on the statistical characteristics of the input data. Then, considering real-valued correlated Gaussian input data (see Assumption A1) and using the expected value definition, the elements of matrix P can be written as

$$p(i,j) = M \underbrace{\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}}_{M+1\ \text{fold}} \frac{\tilde{x}_i(n)\tilde{x}_j(n)}{\tilde{\mathbf{x}}_i^{\mathrm{T}}(n)\tilde{\mathbf{x}}_i(n)}\, f_x(\tilde{\mathbf{x}}_{i,j})\, d\tilde{\mathbf{x}}_{i,j} \tag{16}$$

where

$$f_x(\tilde{\mathbf{x}}_{i,j}) = \frac{1}{\sqrt{(2\pi)^{M+1}\det(\tilde{\mathbf{R}}_{i,j})}}\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}(n)\tilde{\mathbf{R}}_{i,j}^{-1}\tilde{\mathbf{x}}_{i,j}(n)} \tag{17}$$

represents the Gaussian probability density function [33] of

$$\tilde{\mathbf{x}}_{i,j}(n) = [\tilde{x}_i(n)\ \tilde{x}_i(n-1) \cdots \tilde{x}_i(n-M+1)\ \tilde{x}_j(n)]^{\mathrm{T}} \tag{18}$$

and $\tilde{\mathbf{R}}_{i,j} = \mathrm{E}[\tilde{\mathbf{x}}_{i,j}(n)\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}(n)]$ is the corresponding autocorrelation matrix. In Appendix A, we present in detail the procedure followed to compute (16), which yields

$$p(i,j) = M\,\tilde{\mathbf{q}}_i^{\mathrm{T}}\,\mathbf{H}_i\,\tilde{\mathbf{Q}}_i^{-1}\,\tilde{\mathbf{r}}_{i,j} \tag{19}$$

where $\tilde{\mathbf{Q}}_i$ denotes the eigenvector matrix arising from the eigendecomposition of $\tilde{\mathbf{R}}_i = \mathrm{E}[\tilde{\mathbf{x}}_i(n)\tilde{\mathbf{x}}_i^{\mathrm{T}}(n)]$, $\tilde{\mathbf{q}}_i$ is a vector containing the first row of $\tilde{\mathbf{Q}}_i$, $\tilde{\mathbf{r}}_{i,j} = \mathrm{E}[\tilde{\mathbf{x}}_i(n)\tilde{x}_j(n)]$ represents a vector of cross-correlation between $\tilde{\mathbf{x}}_i(n)$ [given by (6)] and $\tilde{x}_j(n)$, and $\mathbf{H}_i$ is a diagonal matrix whose elements are given by

$$h_i(l,l) \cong \frac{1}{\sqrt{2\lambda_{i,l}}\;G_i}\sum_{q=1}^{M/2}\left[A_{i,l,q}\ln(\lambda'_{i,q}) + B_{i,l}\ln(\lambda_{i,l})\right] \tag{20}$$

with

$$G_i = \prod_{k=1}^{M}\sqrt{\lambda_{i,k}}, \tag{21}$$

$$A_{i,l,q} = \frac{\lambda'_{i,q}\,\lambda_{i,l}}{\lambda'_{i,q} - \lambda_{i,l}}\ \prod_{\substack{j=1\\ j\neq q}}^{M/2}\frac{\lambda'_{i,j}}{\lambda'_{i,q} - \lambda'_{i,j}}, \tag{22}$$

$$B_{i,l} = \prod_{q=1}^{M/2}\frac{\lambda_{i,l}\,\lambda'_{i,q}}{\lambda_{i,l} - \lambda'_{i,q}} \tag{23}$$

and

$$\lambda'_{i,q} = \sqrt{\lambda_{i,(2q-1)}\,\lambda_{i,(2q)}},\qquad q = 1,2,\ldots,M/2 \tag{24}$$

characterizing the geometric mean of adjacent pairs of eigenvalues $\lambda_{i,k}$ of $\tilde{\mathbf{R}}_i$. Therefore, from the model expression (13), along with (14) and (19)–(24), the transformed mean weight-error behavior of the TDLMS algorithm can be predicted.
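As a cross-check for the analytical solution (19)–(24), matrix P in (15) can also be estimated by straightforward Monte Carlo averaging. The sketch below is a hedged illustration under stated assumptions: an AR(2) input generator (as used later in Section 4), the sliding-window power estimate (5), and our own function and parameter names.

```python
import numpy as np

def estimate_P(T, a1, a2, sigma_v2, M, n_iters=200_000, seed=0):
    """Monte Carlo estimate of P = E[D^-1(n) x~(n) x~^T(n)], eq. (15),
    for an AR(2) input x(n) = a1 x(n-1) + a2 x(n-2) + v(n), using the
    sliding-window power estimate (5) on each transformed branch."""
    rng = np.random.default_rng(seed)
    N = T.shape[0]
    x_line = np.zeros(N)           # tapped-delay line [x(n) ... x(n-N+1)]
    xt_win = np.zeros((M, N))      # last M transformed input vectors
    P = np.zeros((N, N))
    x1 = x2 = 0.0
    count = 0
    for n in range(n_iters):
        x0 = a1 * x1 + a2 * x2 + rng.normal(0.0, np.sqrt(sigma_v2))
        x2, x1 = x1, x0
        x_line = np.roll(x_line, 1); x_line[0] = x0
        xt = T @ x_line
        xt_win = np.roll(xt_win, 1, axis=0); xt_win[0] = xt
        if n >= N + M + 100:       # discard start-up transients
            sigma2 = np.mean(xt_win**2, axis=0)    # eq. (5), per branch
            P += np.outer(xt / sigma2, xt)         # D^-1(n) x~ x~^T
            count += 1
    return P / count
```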
3.2. Learning curve

For the TDLMS algorithm, an expression describing the learning curve can be obtained by substituting (9) into (8), and using (11), which results in

$$e(n) = -\tilde{\mathbf{v}}^{\mathrm{T}}(n)\,\tilde{\mathbf{x}}(n) + z(n). \tag{25}$$

Then, squaring (25), taking the expected value of both sides, and considering Assumptions A2 and A4, one obtains [27]

$$\mathrm{E}[e^2(n)] = \sigma_z^2 + \mathrm{tr}[\tilde{\mathbf{R}}\,\tilde{\mathbf{K}}(n)] \tag{26}$$

where $\tilde{\mathbf{R}} = \mathrm{E}[\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)]$ is the transformed autocorrelation matrix and

$$\tilde{\mathbf{K}}(n) = \mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)] \tag{27}$$

characterizes the transformed weight-error covariance matrix. Thereby, the algorithm learning curve can be predicted if $\tilde{\mathbf{K}}(n)$ is known.

3.3. Transformed weight-error covariance matrix

In this section, an expression describing the transformed weight-error covariance matrix $\tilde{\mathbf{K}}(n)$ is obtained. Thus, calculating the outer product of (12) [i.e., $\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)$] and taking the expected value of both sides of the resulting expression, we get

$$\begin{aligned}
\mathrm{E}[\tilde{\mathbf{v}}(n+1)\tilde{\mathbf{v}}^{\mathrm{T}}(n+1)]
&= \mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)] + (1-\alpha)\mathbf{E}_1(n) + (1-\alpha)\mathbf{E}_1^{\mathrm{T}}(n)\\
&\quad - 2\mu(1-\alpha)\mathbf{E}_2(n) - 2\mu(1-\alpha)\mathbf{E}_2^{\mathrm{T}}(n) - 2\mu\,\mathbf{E}_3(n) - 2\mu\,\mathbf{E}_3^{\mathrm{T}}(n) + 4\mu^2\mathbf{E}_4(n) + 4\mu^2\mathbf{E}_5(n)\\
&\quad + 2\mu\,\mathrm{E}[\tilde{\mathbf{v}}(n)z(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)] + 2\mu\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)]\\
&\quad - 4\mu^2\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)z(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)] - 4\mu^2\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]\\
&\quad - \mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)] - \mathrm{E}[\tilde{\boldsymbol{\phi}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)] + 2\mu\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)] + 2\mu\,\mathrm{E}[\tilde{\boldsymbol{\phi}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]\\
&\quad + (1-\alpha)^2\,\mathrm{E}[\tilde{\mathbf{w}}_o(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)] - (1-\alpha)\,\mathrm{E}[\tilde{\mathbf{w}}_o(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)] - (1-\alpha)\,\mathrm{E}[\tilde{\boldsymbol{\phi}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)]\\
&\quad + 2\mu(1-\alpha)\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)] + 2\mu(1-\alpha)\,\mathrm{E}[\tilde{\mathbf{w}}_o(n)z(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]\\
&\quad - 2\mu\,\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)] - 2\mu\,\mathrm{E}[\tilde{\boldsymbol{\phi}}(n)z(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)] + \mathrm{E}[\tilde{\boldsymbol{\phi}}(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)]
\end{aligned} \tag{28}$$

with

$$\mathbf{E}_1(n) = \mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)], \tag{29}$$

$$\mathbf{E}_2(n) = \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)], \tag{30}$$

$$\mathbf{E}_3(n) = \mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)], \tag{31}$$

$$\mathbf{E}_4(n) = \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)], \tag{32}$$

and

$$\mathbf{E}_5(n) = \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)z^2(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]. \tag{33}$$

Then, due to the statistical characteristics of z(n) and $\tilde{\boldsymbol{\phi}}(n)$ (see Assumptions A2 and A3), all terms containing such variables in (28)
are equal to zero except those involving $z^2(n)$ and $\tilde{\boldsymbol{\phi}}(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)$. In addition, taking into account that $\tilde{\mathbf{v}}(n)$ and $\tilde{\mathbf{w}}_o(n)$ are assumed statistically independent of $\tilde{\mathbf{x}}(n)$ (see Assumption A4), the remaining nonzero terms in (28) [i.e., (29)–(33)] can be rewritten as

$$\mathbf{E}_1(n) = \tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n), \tag{34}$$

$$\mathbf{E}_2(n) \cong \mathbf{P}[\tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n)], \tag{35}$$

$$\mathbf{E}_3(n) \cong \tilde{\mathbf{K}}(n)\,\mathbf{P}^{\mathrm{T}}, \tag{36}$$

$$\mathbf{E}_4(n) \cong 2\,\mathbf{P}\tilde{\mathbf{K}}(n)\mathbf{P}^{\mathrm{T}} + \mathbf{S}\,\mathrm{tr}[\tilde{\mathbf{R}}\tilde{\mathbf{K}}(n)], \tag{37}$$

and

$$\mathbf{E}_5(n) = \sigma_z^2\,\mathbf{S} \tag{38}$$

with

$$\tilde{\mathbf{K}}_o(n) = \mathrm{E}[\tilde{\mathbf{w}}_o(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)], \tag{39}$$

$$\tilde{\mathbf{K}}'(n) = \mathrm{E}[\tilde{\mathbf{w}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)], \tag{40}$$

and

$$\mathbf{S} = \mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]. \tag{41}$$

Therefore, from (34)–(41), (28) reduces to

$$\begin{aligned}
\tilde{\mathbf{K}}(n+1) &= \tilde{\mathbf{K}}(n) - 2\mu\,\tilde{\mathbf{K}}(n)\mathbf{P}^{\mathrm{T}} - 2\mu\,\mathbf{P}\tilde{\mathbf{K}}(n) + 4\mu^2\{2\,\mathbf{P}\tilde{\mathbf{K}}(n)\mathbf{P}^{\mathrm{T}} + \mathbf{S}\,\mathrm{tr}[\tilde{\mathbf{R}}\tilde{\mathbf{K}}(n)]\} + 4\mu^2\sigma_z^2\,\mathbf{S}\\
&\quad + (1-\alpha)[\tilde{\mathbf{K}}'(n) + \tilde{\mathbf{K}}'^{\mathrm{T}}(n)] - 2\mu(1-\alpha)\,\mathbf{P}[\tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n)] - 2\mu(1-\alpha)[\tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n)]^{\mathrm{T}}\mathbf{P}^{\mathrm{T}}\\
&\quad - (1-\alpha^2)\,\tilde{\mathbf{K}}_o(n) + \tilde{\boldsymbol{\Omega}}
\end{aligned} \tag{42}$$

with

$$\tilde{\boldsymbol{\Omega}} = \mathrm{E}[\tilde{\boldsymbol{\phi}}(n)\tilde{\boldsymbol{\phi}}^{\mathrm{T}}(n)]. \tag{43}$$

Note now that the model expression (42) depends on the knowledge of matrices $\tilde{\mathbf{K}}_o(n)$ [given by (39)] and $\tilde{\mathbf{K}}'(n)$ [given by (40)], which are related to the nonstationary environment (time-varying plant) considered. Hence, it is required to determine auxiliary model expressions for computing such matrices recursively. Particularly, a recursive expression for the plant autocorrelation matrix $\tilde{\mathbf{K}}_o(n)$ can be obtained by pre-multiplying both sides of (3) by T, determining the outer product of the resulting expression [i.e., $\tilde{\mathbf{w}}_o(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)$], taking the expected value of both sides, and considering Assumption A3. Thereby,

$$\tilde{\mathbf{K}}_o(n+1) = \alpha^2\,\tilde{\mathbf{K}}_o(n) + \tilde{\boldsymbol{\Omega}}. \tag{44}$$

Likewise, a recursive expression describing $\tilde{\mathbf{K}}'(n)$ can be obtained from (3) and (10) along with Assumption A4 as

$$\tilde{\mathbf{K}}'(n+1) = \alpha(\mathbf{I} - 2\mu\mathbf{P})\tilde{\mathbf{K}}'(n) + 2\mu\alpha\,\mathbf{P}\tilde{\mathbf{K}}_o(n). \tag{45}$$

To complete the model derivation, we still need to determine the matrix S given in (41). Then, following a way similar to the one used to obtain P, we derive S as shown in Appendix B. Specifically, the diagonal elements of S are obtained according to

$$s(i,i) = M^2\,\tilde{\mathbf{q}}_i^{\mathrm{T}}\,\mathbf{U}_i\,\tilde{\mathbf{q}}_i \tag{46}$$

where $\tilde{\mathbf{q}}_i$ denotes a vector containing the first row of $\tilde{\mathbf{Q}}_i$ and $\mathbf{U}_i$ is a diagonal matrix whose elements are

$$u_i(l,l) \cong \frac{-1}{4\sqrt{2\lambda_{i,l}}\;G_i}\sum_{q=1}^{M/2}\left\{\frac{A_{i,l,q}}{\lambda'_{i,q}}\left[1+\ln(2\lambda'_{i,q})\right] + \frac{B_{i,l}}{\lambda_{i,l}}\left[1+\ln(2\lambda_{i,l})\right]\right\} \tag{47}$$

with $G_i$, $A_{i,l,q}$, $B_{i,l}$, and $\lambda'_{i,q}$ given by (21), (22), (23), and (24), respectively. On the other hand, aiming to make the derivation mathematically tractable and since the off-diagonal elements of S have less impact on the overall solution than those on the main diagonal, the off-diagonal elements are computed by using the AP [28], yielding

$$s(i,j) \cong M^2\,\mathrm{E}\!\left[\frac{1}{\tilde{\mathbf{x}}_i^{\mathrm{T}}(n)\tilde{\mathbf{x}}_i(n)}\right]\mathrm{E}[\tilde{x}_i(n)\tilde{x}_j(n)]\,\mathrm{E}\!\left[\frac{1}{\tilde{\mathbf{x}}_j^{\mathrm{T}}(n)\tilde{\mathbf{x}}_j(n)}\right] \tag{48}$$

with $\mathrm{E}[\tilde{x}_i(n)\tilde{x}_j(n)]$ given by the (i,j)th element of $\tilde{\mathbf{R}}$ and $\mathrm{E}\{[\tilde{\mathbf{x}}_i^{\mathrm{T}}(n)\tilde{\mathbf{x}}_i(n)]^{-1}\}$ obtained as in [27]. Thereby, despite the use of the AP for obtaining the off-diagonal elements, matrix S is computed here in a more accurate way than that given in [27] (see Section 4.3). Finally, taking into account the proposed solutions for P given by (19)–(24) and S given by (46)–(48), along with the model expressions (26), (42), (44), and (45), the algorithm learning curve is completely characterized.
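To show how the pieces fit together, the following hedged sketch iterates the model recursions to produce the predicted learning curve. It assumes that P, S, the transformed autocorrelation matrix (`Rt`), the transformed perturbation autocorrelation matrix (`Omt`), and the initial matrices K(0), K'(0), and Ko(0) are available; the function and argument names are ours.

```python
import numpy as np

def model_learning_curve(P, S, Rt, Omt, K, Kp, Ko, mu, alpha, sigma_z2, n_iter):
    """Iterate the proposed model: (44) for Ko(n), (45) for K'(n),
    (42) for K(n), evaluating the learning curve (26) at each step."""
    mse = np.empty(n_iter)
    I = np.eye(P.shape[0])
    for n in range(n_iter):
        mse[n] = sigma_z2 + np.trace(Rt @ K)                  # eq. (26)
        D = Kp - Ko                                           # K'(n) - Ko(n)
        K_next = (K - 2*mu*(K @ P.T) - 2*mu*(P @ K)
                  + 4*mu**2 * (2*(P @ K @ P.T) + S*np.trace(Rt @ K))
                  + 4*mu**2 * sigma_z2 * S
                  + (1-alpha)*(Kp + Kp.T)
                  - 2*mu*(1-alpha)*(P @ D) - 2*mu*(1-alpha)*(D.T @ P.T)
                  - (1-alpha**2)*Ko + Omt)                    # eq. (42)
        Kp = alpha*(I - 2*mu*P) @ Kp + 2*mu*alpha*(P @ Ko)    # eq. (45)
        Ko = alpha**2 * Ko + Omt                              # eq. (44)
        K = K_next
    return mse
```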
3.4. Steady-state excess mean-square error

Here, an expression describing the steady-state EMSE $\xi_{\mathrm{ex}}(\infty)$ is derived. To this end, let us first recall the definition of the EMSE [4,5]

$$\xi_{\mathrm{ex}}(n) = \mathrm{E}\{[e(n) - z(n)]^2\} = \mathrm{E}\{[\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)]^2\}. \tag{49}$$

Thereby, the steady-state EMSE can be obtained by determining the product $\tilde{\mathbf{v}}^{\mathrm{T}}(n+1)\hat{\mathbf{D}}(n)\tilde{\mathbf{v}}(n+1)$ from (12), taking the expected value of both sides of the resulting expression, letting $n \to \infty$, considering (15), (34), (49),

$$\lim_{n\to\infty}\mathrm{E}[\tilde{\mathbf{v}}^{\mathrm{T}}(n+1)\hat{\mathbf{D}}(n)\tilde{\mathbf{v}}(n+1)] = \lim_{n\to\infty}\mathrm{E}[\tilde{\mathbf{v}}^{\mathrm{T}}(n)\hat{\mathbf{D}}(n)\tilde{\mathbf{v}}(n)] \tag{50}$$

and Assumptions A2–A4 (for details, see [5, Ch. 15–21]). Thus,

$$\begin{aligned}
\xi_{\mathrm{ex}}(\infty) &= \frac{1}{4\mu}(1-\alpha)\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\} + \frac{1}{4\mu}(1-\alpha)\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]^{\mathrm{T}}\}\\
&\quad + \frac{1}{4\mu}(1-\alpha)^2\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\mathbf{K}}_o(\infty)\} + \frac{1}{4\mu}\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\} + \mu\sigma_z^2\,\mathrm{tr}(\mathbf{P}) + \mu E_6(\infty)\\
&\quad - \frac{1}{2}(1-\alpha)\,\mathrm{tr}\{\tilde{\mathbf{R}}[\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\} - \frac{1}{2}(1-\alpha)\,\mathrm{tr}\{\tilde{\mathbf{R}}[\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]^{\mathrm{T}}\}
\end{aligned} \tag{51}$$

with

$$E_6(\infty) = \mathrm{E}[\tilde{\mathbf{x}}^{\mathrm{T}}(\infty)\tilde{\mathbf{v}}(\infty)\tilde{\mathbf{v}}^{\mathrm{T}}(\infty)\tilde{\mathbf{x}}(\infty)\tilde{\mathbf{x}}^{\mathrm{T}}(\infty)\hat{\mathbf{D}}^{-1}(\infty)\tilde{\mathbf{x}}(\infty)]. \tag{52}$$

Then, noting that

$$\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\} = \mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]^{\mathrm{T}}\} \tag{53}$$

and
$$\mathrm{tr}\{\tilde{\mathbf{R}}[\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\} = \mathrm{tr}\{\tilde{\mathbf{R}}[\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]^{\mathrm{T}}\} \tag{54}$$

(51) can be simplified to

$$\begin{aligned}
\xi_{\mathrm{ex}}(\infty) &= \frac{1}{2\mu}(1-\alpha)\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\} + \frac{1}{4\mu}(1-\alpha)^2\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\mathbf{K}}_o(\infty)\}\\
&\quad + \frac{1}{4\mu}\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\} + \mu\sigma_z^2\,\mathrm{tr}(\mathbf{P}) + \mu E_6(\infty) - (1-\alpha)\,\mathrm{tr}\{\tilde{\mathbf{R}}[\tilde{\mathbf{K}}'(\infty) - \tilde{\mathbf{K}}_o(\infty)]\}.
\end{aligned} \tag{55}$$

Now, taking into account that

$$\tilde{\mathbf{K}}_o(\infty) = \lim_{n\to\infty}\tilde{\mathbf{K}}_o(n) = \frac{1}{1-\alpha^2}\,\tilde{\boldsymbol{\Omega}} \tag{56}$$

and

$$\tilde{\mathbf{K}}'(\infty) = \lim_{n\to\infty}\tilde{\mathbf{K}}'(n) = \frac{2\mu\alpha}{1-\alpha^2}\,[(1-\alpha)\mathbf{I} + 2\mu\alpha\mathbf{P}]^{-1}\mathbf{P}\tilde{\boldsymbol{\Omega}} \tag{57}$$

(55) becomes

$$\xi_{\mathrm{ex}}(\infty) = \mu\sigma_z^2\,\mathrm{tr}(\mathbf{P}) + \mu E_6(\infty) + \frac{\mathrm{tr}(\{\alpha\,\mathrm{E}[\hat{\mathbf{D}}(\infty)] + (1-\alpha)\tilde{\mathbf{R}}\mathbf{P}^{-1}\}[(1-\alpha)\mathbf{I} + 2\mu\alpha\mathbf{P}]^{-1}\mathbf{P}\tilde{\boldsymbol{\Omega}})}{1+\alpha}. \tag{58}$$

Finally, considering

$$E_6(\infty) \cong \mathrm{tr}(\mathbf{P})\,\xi_{\mathrm{ex}}(\infty) \tag{59}$$

and

$$\tilde{\mathbf{R}}\mathbf{P}^{-1} \cong \mathrm{E}[\hat{\mathbf{D}}(\infty)] \tag{60}$$

the steady-state EMSE is obtained according to

$$\xi_{\mathrm{ex}}(\infty) \cong \frac{\mu\sigma_z^2\,\mathrm{tr}(\mathbf{P})}{1-\mu\,\mathrm{tr}(\mathbf{P})} + \frac{\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][(1-\alpha)\mathbf{I} + 2\mu\alpha\mathbf{P}]^{-1}\mathbf{P}\tilde{\boldsymbol{\Omega}}\}}{(1+\alpha)[1-\mu\,\mathrm{tr}(\mathbf{P})]}. \tag{61}$$
3.5. Misadjustment

Firstly, let us recall that the algorithm misadjustment $\mathcal{M}$ is defined as [4,5]

$$\mathcal{M} = \frac{\xi_{\mathrm{ex}}(\infty)}{\xi_{\min}} \tag{62}$$

where $\xi_{\mathrm{ex}}(\infty)$ denotes the steady-state EMSE while $\xi_{\min}$ characterizes the minimum MSE attainable in steady state. Thereby, substituting (61) into (62) and making $\xi_{\min} = \sigma_z^2$, we can write the misadjustment as a sum of two portions [2–4], i.e.,

$$\mathcal{M} = \mathcal{M}_1 + \mathcal{M}_2 \tag{63}$$

with

$$\mathcal{M}_1 = \frac{\mu\,\mathrm{tr}(\mathbf{P})}{1-\mu\,\mathrm{tr}(\mathbf{P})} \tag{64}$$

characterizing the misadjustment portion present in both stationary and nonstationary cases, and

$$\mathcal{M}_2 = \frac{\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)][(1-\alpha)\mathbf{I} + 2\mu\alpha\mathbf{P}]^{-1}\mathbf{P}\tilde{\boldsymbol{\Omega}}\}}{\sigma_z^2(1+\alpha)[1-\mu\,\mathrm{tr}(\mathbf{P})]} \tag{65}$$

denoting the portion related to the nonstationary environment considered (time-varying plant). Note from these model expressions that, except for the special case of uncorrelated input data (see footnote 2), the misadjustment of the TDLMS algorithm is affected by the choice of the transformation matrix T.

According to [3–5], the $\mathcal{M}_1$ portion [given by (64)] is due to the adaptive weight vector fluctuations (termed estimation misadjustment) while the $\mathcal{M}_2$ portion [given by (65)] is due to the plant weight vector tracking lag (termed lag misadjustment). Thereby, since the $\mathcal{M}_2$ portion is equal to zero only for the case of a stationary environment (α = 1 and $\tilde{\boldsymbol{\Omega}} = \sigma_\phi^2\mathbf{I}$ with $\sigma_\phi^2 = 0$), the misadjustment $\mathcal{M}$ [given by (63)] attained in a nonstationary environment is always larger than the one obtained in the stationary case. Also, observe from (63)–(65) that, except for the common term $1-\mu\,\mathrm{tr}(\mathbf{P})$, the $\mathcal{M}_1$ portion is directly proportional to μ while the $\mathcal{M}_2$ portion is inversely proportional to μ, ratifying the existence of a tradeoff between these two misadjustment portions mediated by adjusting the step size.

In order to clarify the impact of the plant variation speed on the adaptive algorithm, Fig. 3 plots the model expressions (63)–(65) as a function of the parameters α [Fig. 3(a)] and $\sigma_\phi^2$ [Fig. 3(b)]. (The remaining parameter values are the same as used later in Example 1 of Section 4.1, considering 40 dB SNR.) Notice from these figures that the misadjustment $\mathcal{M}$ is a convex function with respect to μ; consequently, there is a step-size value that leads to the minimum misadjustment $\mathcal{M}$. Particularly, observe from Fig. 3(a) that, for values of α close to 1, small misadjustment values are achieved by using intermediate step sizes; in contrast, as α is decreased, small misadjustment values are attained with small step sizes (μ tending to 0). In turn, notice from Fig. 3(b) that, for α close to 1, intermediate step sizes are required to attain the minimum misadjustment $\mathcal{M}$ over a wide range of values of $\sigma_\phi^2$; such a characteristic gradually vanishes as $\sigma_\phi^2$ decreases.

Fig. 3. Misadjustment surface [given by (63)–(65)] obtained by considering (a) α variable and $\sigma_\phi^2 = 5\times10^{-7}$, and (b) α = 0.999 and $\sigma_\phi^2$ variable.

For more details about the algorithm misadjustment in a nonstationary environment, Fig. 4 plots separate curves of (63), (64), and (65) for α = 0.999 and $\sigma_\phi^2 = 5\times10^{-7}$ [Fig. 4(a)], and $\sigma_\phi^2 = 10^{-5}$ [Fig. 4(b)]. From Fig. 4(a), observe that the $\mathcal{M}_1$ portion is an increasing function of μ and, hence, small misadjustment values are achieved with small step sizes; on the other hand, due to the characteristics of the $\mathcal{M}_2$ portion, small step sizes lead to high misadjustment values. Therefore, under certain conditions, it is important to consider an intermediate step-size value $\mu_{\mathrm{me}}$ that satisfies this tradeoff between the $\mathcal{M}_1$ and $\mathcal{M}_2$ portions, which then leads to the minimum misadjustment $\mathcal{M}$ [see Fig. 4(a)]. In turn, notice from Fig. 4(b) that for a fast time-varying plant (obtained by increasing $\sigma_\phi^2$) the overall misadjustment $\mathcal{M}$ is dominated by the lag misadjustment $\mathcal{M}_2$. Thereby, since the $\mathcal{M}_2$ portion is a convex function with respect to μ, a step size $\mu_{\mathrm{me}}^{\mathcal{M}_2}$ that leads to the minimum misadjustment $\mathcal{M}$ can also be determined for this condition [see Fig. 4(b)].

Fig. 4. Individual curves of misadjustment obtained from (63), (64), and (65) with α = 0.999. (a) $\sigma_\phi^2 = 5\times10^{-7}$. (b) $\sigma_\phi^2 = 10^{-5}$.

2. For uncorrelated input data, (63)–(65) can be simplified to $\mathcal{M} \cong \frac{\mu N}{1-\mu N} + \frac{\sigma_x^2\,\mathrm{tr}(\boldsymbol{\Omega})}{\sigma_z^2(1-\mu N)(1+\alpha)(1-\alpha+2\mu\alpha N)}$; consequently, in this case, the misadjustment does not depend on the transformation matrix T (see [2, pp. 479–483]).
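The convex tradeoff just discussed can be examined numerically with the hedged sketch below, which evaluates (63)–(65) over a step-size grid. It assumes that P, $\mathrm{E}[\hat{\mathbf{D}}(\infty)]$ (here `ED`), and $\tilde{\boldsymbol{\Omega}}$ (here `Omt`) have already been computed; all names are ours.

```python
import numpy as np

def misadjustment(mu, P, ED, Omt, alpha, sigma_z2):
    """Evaluate M1 (64), M2 (65), and M = M1 + M2 (63)."""
    trP = np.trace(P)
    M1 = mu * trP / (1.0 - mu * trP)                            # eq. (64)
    A = np.linalg.solve((1 - alpha) * np.eye(P.shape[0]) + 2 * mu * alpha * P,
                        P @ Omt)            # [(1-a)I + 2 mu a P]^-1 P Om~
    M2 = np.trace(ED @ A) / (sigma_z2 * (1 + alpha) * (1.0 - mu * trP))  # (65)
    return M1, M2, M1 + M2

# Scanning mu over a grid exposes the convex tradeoff seen in Figs. 3 and 4:
# mus = np.logspace(-4, -1, 50)
# curves = [misadjustment(m, P, ED, Omt, 0.999, 1e-4) for m in mus]
```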
3.6. Step size for minimum excess mean-square error

Here, since the misadjustment (equivalently, the steady-state EMSE) is a convex function with respect to μ, we derive approximate expressions for the optimum step size that minimizes $\xi_{\mathrm{ex}}(\infty)$. To this end, taking into account that intermediate step-size values are required to attain the minimum misadjustment especially for values of α very close to 1 [see Fig. 3(a)], expression (61) is first rewritten as

$$\xi_{\mathrm{ex}}(\infty) \cong \frac{\mu\sigma_z^2\,\mathrm{tr}(\mathbf{P})}{1-\mu\,\mathrm{tr}(\mathbf{P})} + \frac{\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\}}{4\mu[1-\mu\,\mathrm{tr}(\mathbf{P})]}. \tag{66}$$

Then, differentiating (66) with respect to μ, i.e.,

$$\frac{\partial\xi_{\mathrm{ex}}(\infty)}{\partial\mu} = \frac{\sigma_z^2\,\mathrm{tr}(\mathbf{P})}{[1-\mu\,\mathrm{tr}(\mathbf{P})]^2} - \frac{[1-2\mu\,\mathrm{tr}(\mathbf{P})]\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\}}{4\mu^2[1-\mu\,\mathrm{tr}(\mathbf{P})]^2} \tag{67}$$

setting the resulting expression equal to zero, and solving for $\mu_{\mathrm{me}}$, we get

$$\mu_{\mathrm{me}} = \frac{\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\}}{4\sigma_z^2}\left[\sqrt{1 + \frac{4\sigma_z^2}{\mathrm{tr}(\mathbf{P})\,\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\}}} - 1\right]. \tag{68}$$

Note that, for $\tilde{\boldsymbol{\Omega}} = \sigma_\phi^2\mathbf{I}$, (68) reduces to

$$\mu_{\mathrm{me}} = \frac{N\sigma_x^2\sigma_\phi^2}{4\sigma_z^2}\left[\sqrt{1 + \frac{4\sigma_z^2}{N\sigma_x^2\sigma_\phi^2\,\mathrm{tr}(\mathbf{P})}} - 1\right]. \tag{69}$$

Aiming to explain the impact of parameter $\sigma_\phi^2$ on (69), Fig. 5 shows the step size for minimum EMSE $\mu_{\mathrm{me}}$ as a function of $\sigma_\phi^2$. From this figure, one verifies that (69) is an increasing function of $\sigma_\phi^2$, presenting saturation for both very low and very high values of $\sigma_\phi^2$. Such a behavior can be interpreted as follows: firstly, for very low values of $\sigma_\phi^2$, the plant is approximately stationary, implying that $\mu_{\mathrm{me}}$ tends to zero; second, as $\sigma_\phi^2$ increases, the plant variation becomes faster and, hence, (69) leads to larger values of $\mu_{\mathrm{me}}$. In other words, for plants that undergo slow variation (mainly for small values of $\sigma_\phi^2$), the algorithm needs smaller step-size values to track the plant weight vector with small misadjustment. On the other hand, for plants that experience fast variation (mainly for large values of $\sigma_\phi^2$), larger step-size values are required to properly track the plant weight vector.

Fig. 5. Step size for minimum EMSE [given by (69)] obtained by considering $\sigma_\phi^2$ variable and α = 0.999.

In particular, for plants with very fast variation (very large values of $\sigma_\phi^2$), the overall misadjustment is dominated by the $\mathcal{M}_2$ portion [see Fig. 4(b)]. In this condition, the step size for minimum misadjustment (or steady-state EMSE) can be obtained from

$$\frac{\partial\mathcal{M}_2}{\partial\mu} = \frac{\mathrm{tr}\{\mathrm{E}[\hat{\mathbf{D}}(\infty)]\tilde{\boldsymbol{\Omega}}\}\,[2\mu\,\mathrm{tr}(\mathbf{P}) - 1]}{4\sigma_z^2[\mu - \mu^2\,\mathrm{tr}(\mathbf{P})]^2} \tag{70}$$

which results in an expression for the step-size value that depends only on the input data statistics, i.e.,

$$\mu_{\mathrm{me}}^{\mathcal{M}_2} = \frac{1}{2\,\mathrm{tr}(\mathbf{P})}. \tag{71}$$
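For convenience, (68) and (71) translate directly into short routines. The sketch below is illustrative; `ED` stands for $\mathrm{E}[\hat{\mathbf{D}}(\infty)]$ and `Omt` for $\tilde{\boldsymbol{\Omega}}$, both assumed precomputed.

```python
import numpy as np

def mu_me(P, ED, Omt, sigma_z2):
    """Step size for minimum EMSE, eq. (68)."""
    c = np.trace(ED @ Omt)      # tr{E[D^(inf)] Om~}
    p = np.trace(P)
    return (c / (4.0 * sigma_z2)) * (np.sqrt(1.0 + 4.0 * sigma_z2 / (p * c)) - 1.0)

def mu_me_M2(P):
    """Eq. (71): optimum step size when M2 dominates (very fast plants)."""
    return 1.0 / (2.0 * np.trace(P))
```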
3.7. Degree of nonstationarity

In this section, aiming to clarify the concepts of slow and fast variation of the plant weight vector, an expression for the degree of nonstationarity is derived. This metric is formally defined as [3–5,32]

$$\psi(n) = \sqrt{\frac{\mathrm{E}[y_{o,\mathrm{inc}}^2(n)]}{\mathrm{E}[z^2(n)]}} \tag{72}$$

where

$$y_{o,\mathrm{inc}}(n) = [\tilde{\mathbf{w}}_o(n+1) - \tilde{\mathbf{w}}_o(n)]^{\mathrm{T}}\,\tilde{\mathbf{x}}(n) \tag{73}$$

is the incremental output variation. Note that the numerator of (72) characterizes the average power introduced by the plant variation and the denominator denotes the minimum MSE [3]. Regarding the numerator of (72), pre-multiplying both sides of (3) by T, substituting the resulting expression into (73), taking the expected value of both sides, and considering Assumptions A3 and A4, we obtain

$$\mathrm{E}[y_{o,\mathrm{inc}}^2(n)] = \alpha^{2n}(1-\alpha)^2\,\mathrm{tr}[\tilde{\mathbf{K}}_o(0)\tilde{\mathbf{R}}] + \frac{2-\alpha^{2n}(1-\alpha)}{1+\alpha}\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}). \tag{74}$$

Then, substituting (74) into (72) and recalling that $\sigma_z^2$ denotes the variance of the measurement noise z(n), the degree of nonstationarity reads as

$$\psi(n) = \sqrt{\frac{\alpha^{2n}(1-\alpha)^2}{\sigma_z^2}\,\mathrm{tr}[\tilde{\mathbf{K}}_o(0)\tilde{\mathbf{R}}] + \frac{2-\alpha^{2n}(1-\alpha)}{\sigma_z^2(1+\alpha)}\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}})}. \tag{75}$$

Note from (75) that the degree of nonstationarity ψ(n) presents the following characteristics:

a) Exponential evolution, starting with the maximum value

$$\psi(0) = \sqrt{\frac{(1-\alpha)^2\,\mathrm{tr}[\tilde{\mathbf{K}}_o(0)\tilde{\mathbf{R}}] + \mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}})}{\sigma_z^2}} \tag{76}$$

and approaching

$$\psi(\infty) = \sqrt{\frac{2\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}})}{(1+\alpha)\sigma_z^2}}. \tag{77}$$

b) The degree of nonstationarity ψ(n) is larger for α < 1 than for α = 1.

c) For α = 1, (76) and (77) become

$$\psi(n)\big|_{\alpha=1} = \psi(0) = \psi(\infty) = \frac{\sqrt{\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}})}}{\sigma_z} \tag{78}$$

which is a result commonly found in the open literature [3–5,32].
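The steady-state value (77) is easy to evaluate in practice, as in the minimal sketch below (names `Omt` and `Rt` denote the precomputed $\tilde{\boldsymbol{\Omega}}$ and $\tilde{\mathbf{R}}$; both are assumptions of this illustration). Values much smaller than 1 indicate a plant varying slowly relative to the adaptation, which anticipates the tracking discussion in the next section.

```python
import numpy as np

def psi_inf(Omt, Rt, alpha, sigma_z2):
    """Steady-state degree of nonstationarity, eq. (77)."""
    return np.sqrt(2.0 * np.trace(Omt @ Rt) / ((1.0 + alpha) * sigma_z2))
```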
3.8. Relationship between misadjustment and degree of nonstationarity

Here, aiming to provide some insights on the conditions in which the adaptive filter would be able to properly track variations of the plant weight vector, we obtain an expression relating the degree of nonstationarity ψ(n) and the misadjustment $\mathcal{M}$. Thus, substituting (11) into (49), using (3), and considering Assumption A3, the EMSE is firstly rewritten as

$$\xi_{\mathrm{ex}}(n) = \mathrm{E}\{[\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}(n) - \alpha\,\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}_o(n-1)]^2\} + \mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}). \tag{79}$$

Similarly, multiplying both sides of (49) by α, using (11), considering

$$\alpha\,\tilde{\mathbf{w}}_o(n) = \tilde{\mathbf{w}}_o(n+1) - \tilde{\boldsymbol{\phi}}(n) \tag{80}$$

and taking into account Assumption A3, a second relationship is established as

$$\alpha^2\,\xi_{\mathrm{ex}}(n) = \mathrm{E}\{[\alpha\,\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}(n) - \tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{w}}_o(n+1)]^2\} + \mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}). \tag{81}$$

Then, adding (79) and (81), and rearranging the resulting expression, we have

$$\xi_{\mathrm{ex}}(n) = \frac{1}{1+\alpha^2}\left(\mathrm{E}\{[\tilde{\mathbf{x}}^{\mathrm{T}}(n)(\tilde{\mathbf{w}}(n) - \alpha\tilde{\mathbf{w}}_o(n-1))]^2\} + \mathrm{E}\{[\tilde{\mathbf{x}}^{\mathrm{T}}(n)(\alpha\tilde{\mathbf{w}}(n) - \tilde{\mathbf{w}}_o(n+1))]^2\} + 2\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}})\right). \tag{82}$$

From (82), it is possible to show that the following inequality holds:

$$\xi_{\mathrm{ex}}(n) \geq \frac{2}{1+\alpha^2}\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}) \geq \frac{2}{1+\alpha}\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}). \tag{83}$$

Now, dividing both sides of (83) by $\sigma_z^2$ and taking into account that $\xi_{\min} = \sigma_z^2$, we get

$$\frac{\xi_{\mathrm{ex}}(n)}{\xi_{\min}} \geq \frac{2}{(1+\alpha)\sigma_z^2}\,\mathrm{tr}(\tilde{\boldsymbol{\Omega}}\tilde{\mathbf{R}}). \tag{84}$$

Finally, considering (62) and (77), we find [from (84)] that the misadjustment of an adaptive filter operating under the nonstationary environment considered is lower bounded by

$$\mathcal{M} \geq \psi^2(\infty). \tag{85}$$

From this inequality, we can draw out the following important features [3,5,32]:

a) If ψ(∞) ≫ 1, variation of the plant weight vector is too fast for the filter to be able to track, leading to a larger misadjustment.

b) If ψ(∞) ≪ 1, the adaptive filter would generally be able to track variation of the plant weight vector, resulting in smaller misadjustment.

Therefore, we conclude that the condition ψ(∞) ≪ 1 should be fulfilled for the algorithm to be able to track plant variation with small misadjustment.

4. Simulation results

Now, aiming to assess the accuracy of the proposed model, some examples are presented considering a system identification problem (see Fig. 1). In these examples, MC simulations obtained by averaging 200 independent runs are compared with results obtained from model predictions. For these examples, a zero-mean correlated Gaussian input signal with unit variance is used (see Assumption A1), which is obtained from an AR(2) process given by
$$x(n) = 0.18\,x(n-1) - 0.85\,x(n-2) + v(n) \tag{86}$$

Fig. 6. Example 1. Transformed mean weight-error behavior in a nonstationary environment obtained from MC simulations (gray-ragged lines) and proposed model (13) (dark-dashed lines). (a) 40 dB SNR. (b) 60 dB SNR.
where v(n) is a white noise with variance $\sigma_v^2 = 0.2748$, resulting in an eigenvalue spread of the input autocorrelation matrix of χ = 45.9 for N = 8. (Note that, irrespective of the process used to generate the input data, the algorithm behavior can be adequately predicted through the proposed model if the input autocorrelation matrix R is known.) Here, the orthogonal transform T used to reduce the correlation level of the input data is the discrete cosine transform (DCT)³, since it approximates very well the optimal Karhunen–Loève transform [4]. In our simulations, unless otherwise stated, two different values of signal-to-noise ratio (SNR) are used, i.e., 40 dB and 60 dB SNR. For the sake of simplicity, the initial condition for the plant weight vector, used in (3), is obtained as

$$\mathbf{w}_o(0) = \frac{\mathbf{w}_{\mathrm{aux}}}{\mathbf{w}_{\mathrm{aux}}^{\mathrm{T}}\mathbf{w}_{\mathrm{aux}}} \tag{87}$$

with $\mathbf{w}_{\mathrm{aux}} = [\mathrm{sinc}(0)\ \mathrm{sinc}(1/N) \cdots \mathrm{sinc}((N-1)/N)]^{\mathrm{T}}$ denoting a vector that contains samples taken from the sinc function. In the presented examples, the adaptive filter weights are initialized in different ways, namely [3] (see the setup sketch after this list):

a) $\tilde{\mathbf{w}}(0) = \mathbf{T}\mathbf{w}_o(0)$, aiming to evaluate the tracking behavior in the nonstationary case; and

b) $\tilde{\mathbf{w}}(0) = \mathbf{0}$, in order to assess the algorithm behavior during the acquisition mode in the stationary case.

3. Although the DCT has been considered here, any other real-valued N × N orthogonal transform could be used instead.
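The simulation setup above can be reproduced with the following hedged sketch of the input generator (86) and the plant initialization (87); the seed and sequence length are arbitrary choices of this illustration, and np.sinc is the normalized sinc function.

```python
import numpy as np

def ar2_input(n_samples, sigma_v2=0.2748, rng=None):
    """Input signal (86): x(n) = 0.18 x(n-1) - 0.85 x(n-2) + v(n)."""
    if rng is None:
        rng = np.random.default_rng(1)
    x = np.zeros(n_samples)
    v = rng.normal(0.0, np.sqrt(sigma_v2), n_samples)
    for n in range(n_samples):
        x[n] = 0.18 * x[n - 1] - 0.85 * x[n - 2] + v[n]  # x[-1], x[-2] start at 0
    return x

def w_o_initial(N):
    """Plant starting vector (87), built from sinc samples."""
    w_aux = np.sinc(np.arange(N) / N)
    return w_aux / (w_aux @ w_aux)
```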
4.1. Assessment of the proposed model

In the following, four examples are presented aiming to verify the accuracy of the proposed model under different operating conditions.

Example 1. In this example, the proposed model is assessed under a nonstationary environment in order to confirm the accuracy of the model expressions describing the transformed mean weight-error behavior (13) and the learning curve (26). To this end, the following parameter values are used: N = 8, M = 8, and $\mu = 0.1\mu_{\mathrm{me}} = 0.00257$. In addition, to obtain the time-varying plant, we consider in (3) that α = 0.999 and that the elements of the plant perturbation vector φ(n) consist of samples taken from a white noise with autocorrelation matrix $\boldsymbol{\Omega} = \sigma_\phi^2\mathbf{I}$, where $\sigma_\phi^2$ is adjusted so that ψ(∞) = 0.2 for each SNR value considered. For this operating scenario, Fig. 6 shows curves describing the transformed mean weight-error behavior for only four of the weights (for the sake of clarity). Specifically, Figs. 6(a) and (b) depict the case of 40 dB and 60 dB SNR, respectively. In addition, Fig. 7 shows the learning curves obtained for both SNR values. From these figures, a very good match between MC simulations and the proposed model is verified for both transient and steady-state phases.

Fig. 7. Example 1. Learning curves in a nonstationary environment obtained by MC simulations (gray-ragged lines) and proposed model (26) (dark-dashed lines).

Example 2. In this example, considering also a nonstationary environment, the accuracy of the proposed model is assessed for different filter lengths and eigenvalue spreads of the input autocorrelation matrix. To this end, we now consider filters with N = 32 and N = 64 weights for M = 16 and M = 32, respectively. In particular for this operating scenario, the eigenvalue spreads of the input autocorrelation matrix are χ = 581.13 for N = 32 and χ = 941.17 for N = 64, both obtained by changing −0.85 to −0.95 in (86) with $\sigma_v^2 = 0.0966$. Here, the step size is μ = 0.000257, aiming to assure the algorithm stability, with the remaining parameter values being the same as in Example 1. The results obtained for this scenario are depicted in Fig. 8, in which a very good match between MC simulations and model predictions can be verified for these larger filter lengths and both SNR values considered. Note also that the proposed model performs well even for higher eigenvalue spreads, ratifying the accuracy of the methodology used to compute matrices P and S.

Fig. 8. Example 2. Learning curves for a nonstationary environment obtained from MC simulations (gray-ragged lines) and proposed model (26) (dark-dashed lines). (a) χ = 581.13 for N = 32 and M = 16. (b) χ = 941.17 for N = 64 and M = 32.

Example 3. Here, the accuracy of the model expression describing the steady-state EMSE is verified in a nonstationary environment for different step-size values. To this end, results predicted from the model expression (61) are compared with those obtained by MC simulations, averaging the last 100 EMSE values in steady state. Particularly for this example, we use three distinct values for the plant perturbation variance, i.e., $\sigma_\phi^2 = 5\times10^{-6}$, $\sigma_\phi^2 = 5\times10^{-8}$, and $\sigma_\phi^2 = 5\times10^{-10}$. The remaining parameter values are the same as in Example 1. In this context, Fig. 9 illustrates the steady-state
EMSE as a function of μ. From this figure, we verify that the model expression (61) predicts very well the steady-state EMSE, irrespective of the SNR and $\sigma_\phi^2$ value considered. Furthermore, observe from Figs. 9(a) and (b) that fast time-varying plants (obtained by increasing $\sigma_\phi^2$) lead to higher steady-state EMSE values due to the tracking lag. On the other hand, slow time-varying plants (obtained by decreasing $\sigma_\phi^2$) lead to lower steady-state EMSE values [see Figs. 9(e) and (f)]. Specifically in Figs. 9(c) and (f), notice that under certain conditions there exists an intermediate step-size value for minimum EMSE. Therefore, these results confirm the discussion presented in Sections 3.5 and 3.6.

Fig. 9. Example 3. Steady-state EMSE curves for a nonstationary environment obtained from MC simulations (×) and proposed model (61) (dark-dashed lines) for 40 dB SNR (left) and 60 dB SNR (right). (a) and (b) $\sigma_\phi^2 = 5\times10^{-6}$. (c) and (d) $\sigma_\phi^2 = 5\times10^{-8}$. (e) and (f) $\sigma_\phi^2 = 5\times10^{-10}$.

Example 4. This example has three main goals, namely: a) to assess the accuracy of the proposed model (via learning curve) in a stationary environment (α = 1 and $\sigma_\phi^2 = 0$); b) to verify the impact of the value of M on the model accuracy; and c) to compare the proposed model with that given in [27], which is derived exclusively using the AP. To this end, two different values of M are used, i.e., M = 8 and M = 64, whereas parameters N and μ are the same as in Example 1. The results obtained for this scenario are shown in Fig. 10. Notice that the model given in [27] does not properly predict the algorithm behavior during the transient phase for small values of M; according to [27], satisfactory results are obtained for M > 30. On the other hand, the proposed model exhibits very good accuracy for both transient and steady-state phases, irrespective of the value of M. Moreover, observe that the steady-state EMSE is smaller than that achieved in the nonstationary case (see Fig. 7), ratifying the discussion presented in Section 3.5.

Fig. 10. Example 4. Learning curves for a stationary environment obtained by MC simulations (gray-ragged lines), model of [27] (dark-dotted lines), and proposed model (26) (dark-dashed lines). (a) M = 8. (b) M = 64.

4.2. On the approximations E2(n)–E4(n)

In this section, simulation results are shown aiming to verify the validity of approximations (35)–(37). Such results are obtained
by considering the scenario described in Example 1 for the case of 40 dB SNR, using M = 8 and M = 64. These results are depicted in Figs. 11–13; in particular, Fig. 11 illustrates the accuracy of the first four elements on the main diagonal of (35) for both M = 8 and M = 64. From the presented plots, we verify the reasonability of the approximation used, which is confirmed by the very good match between the curves obtained by MC simulations of $\{\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)]\}_{i,i}$ and model $\{\mathbf{P}[\tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n)]\}_{i,i}$. In Figs. 12 and 13, the same presentation pattern of Fig. 11 is repeated for the approximations given by (36) and (37), respectively. Regarding Fig. 13(a), notice that (37) is less accurate for M = 8, becoming better as the value of M is increased [see Fig. 13(b)]. Therefore, observe that the approximations used in the model development exhibit satisfactory accuracy for both values of M considered.

Fig. 11. Test of approximation E2(n) [given by (35)] obtained by MC simulations of $\{\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\mathbf{w}}_o^{\mathrm{T}}(n)]\}_{i,i}$ (gray-ragged lines) and model $\{\mathbf{P}[\tilde{\mathbf{K}}'(n) - \tilde{\mathbf{K}}_o(n)]\}_{i,i}$ (dark-dashed lines). (a) M = 8. (b) M = 64.

Fig. 12. Test of approximation E3(n) [given by (36)] obtained by MC simulations of $\{\mathrm{E}[\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]\}_{i,i}$ (gray-ragged lines) and model $[\tilde{\mathbf{K}}(n)\mathbf{P}^{\mathrm{T}}]_{i,i}$ (dark-dashed lines). (a) M = 8. (b) M = 64.
Fig. 13. Test of approximation E4(n) [given by (37)] obtained by MC simulations of $\{\mathrm{E}[\hat{\mathbf{D}}^{-1}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\tilde{\mathbf{v}}(n)\tilde{\mathbf{v}}^{\mathrm{T}}(n)\tilde{\mathbf{x}}(n)\tilde{\mathbf{x}}^{\mathrm{T}}(n)\hat{\mathbf{D}}^{-1}(n)]\}_{i,i}$ (gray-ragged lines) and model $\{2\mathbf{P}\tilde{\mathbf{K}}(n)\mathbf{P}^{\mathrm{T}} + \mathbf{S}\,\mathrm{tr}[\tilde{\mathbf{K}}(n)\tilde{\mathbf{R}}]\}_{i,i}$ (dark-dashed lines). (a) M = 8. (b) M = 64.

Fig. 14. Comparisons of results obtained by MC simulations (×), the approach given in [27] (o), and the proposed approach. (a) Matrix P. (b) Matrix S.
4.3. On the solutions of P and S

Here, simulation results are presented in order to assess the accuracy of the proposed solutions for matrices P [given by (19)–(24)] and S [given by (46)–(48)] in comparison with those solutions obtained by using the AP in [27]. Such results, shown in Fig. 14, are obtained by considering the same parameter values as in Example 1. From this figure, one verifies that the proposed approach for computing both P and S exhibits very good accuracy, i.e., a very good match between the results obtained by MC simulations and the proposed solutions. However, the same accuracy cannot be verified by means of the results obtained from [27].

5. Concluding remarks

In this paper, a stochastic model for the TDLMS algorithm operating in a nonstationary environment (time-varying plant) was presented, from which a model for a stationary environment follows as a special case. Specifically, we derived model expressions for the transformed mean weight-error behavior, learning curve, transformed weight-error covariance matrix, steady-state EMSE, misadjustment, step size for minimum EMSE, and degree of nonstationarity, as well as a relationship between misadjustment and degree of nonstationarity. Based on these model expressions, we have also discussed how the algorithm parameters affect its performance and which conditions must be fulfilled so that the algorithm can properly track a time-varying plant. It is important to highlight that such expressions have been obtained by using a small number of simplifying assumptions and approximations, leading to a model that accurately predicts the algorithm behavior over a wide range of operating conditions. This accuracy is mainly due to the strategy used for computing some expected values arising from the model development. Moreover, in contrast to other models so far presented in the open literature, the proposed model is valid irrespective of the window length used to estimate the signal power at each transformation branch. Simulation results showed a very good match between MC simulations and model predictions, ratifying the effectiveness of the proposed model for both transient and steady-state phases.

Acknowledgments

The authors would like to thank the Handling Editor and the anonymous reviewers, whose valuable and constructive comments and suggestions have significantly improved the quality of this paper. This work was supported in part by the Brazilian National Council for Scientific and Technological Development (CNPq), and by the Inter-American Development Bank (IDB) under Grant UNaM PICTO 2011-0121.

Appendix A. Determination of P

In this appendix, the approximate approach considered for computing (15) is presented. To this end, let us first use the extended vector $\tilde{\mathbf{x}}_{i,j}(n)$, given in (18), to rewrite (5) so that
$$\frac{1}{M}\sum_{k=0}^{M-1}\tilde{x}_i^2(n-k) = \frac{1}{M}\,\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}(n)\,\mathbf{I}_{\mathrm{s}}\,\tilde{\mathbf{x}}_{i,j}(n) \tag{88}$$

with $\mathbf{I}_{\mathrm{s}}$ denoting an (M+1) × (M+1)-dimensional matrix defined as

$$\mathbf{I}_{\mathrm{s}} = \begin{bmatrix}\mathbf{I}_M & \mathbf{0}\\ \mathbf{0}^{\mathrm{T}} & 0\end{bmatrix} \tag{89}$$

where $\mathbf{I}_M$ represents an M × M identity matrix and 0 is a vector of 0's with appropriate dimension. Thus, substituting (17) into (16),
using (88), and dropping the index n for notational simplicity, the (i,j)th element of P can now be expressed as

$$p(i,j) = \frac{M}{\sqrt{(2\pi)^{M+1}\det(\tilde{\mathbf{R}}_{i,j})}} \underbrace{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}}_{M+1\ \text{fold}} \frac{\tilde{x}_i\tilde{x}_j}{\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\mathbf{I}_{\mathrm{s}}\tilde{\mathbf{x}}_{i,j}}\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\tilde{\mathbf{R}}_{i,j}^{-1}\tilde{\mathbf{x}}_{i,j}}\, d\tilde{\mathbf{x}}_{i,j} \tag{90}$$

where $\tilde{\mathbf{R}}_{i,j}$ denotes the autocorrelation matrix of $\tilde{\mathbf{x}}_{i,j}$. To determine (90), similarly to [35], we define an auxiliary function $f_{i,j}(\omega)$ as

$$f_{i,j}(\omega) = \frac{M}{\sqrt{(2\pi)^{M+1}\det(\tilde{\mathbf{R}}_{i,j})}} \underbrace{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}}_{M+1\ \text{fold}} \frac{\tilde{x}_i\tilde{x}_j}{\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\mathbf{I}_{\mathrm{s}}\tilde{\mathbf{x}}_{i,j}}\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\mathbf{L}_{i,j}^{-1}(\omega)\tilde{\mathbf{x}}_{i,j}}\, d\tilde{\mathbf{x}}_{i,j} \tag{91}$$

with

$$\mathbf{L}_{i,j}(\omega) = (2\omega\mathbf{I}_{\mathrm{s}} + \tilde{\mathbf{R}}_{i,j}^{-1})^{-1}. \tag{92}$$

Note that, for ω = 0, (91) is by definition the desired expected value, i.e.,

$$p(i,j) = f_{i,j}(0). \tag{93}$$

Then, applying partial differentiation in (91) with respect to ω, the term $\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\mathbf{I}_{\mathrm{s}}\tilde{\mathbf{x}}_{i,j}$ in the denominator of the integrand is eliminated, which results in

$$\frac{\partial f_{i,j}(\omega)}{\partial\omega} = -M\sqrt{\frac{\det[\mathbf{L}_{i,j}(\omega)]}{\det(\tilde{\mathbf{R}}_{i,j})}}\,\Upsilon(\omega) \tag{94}$$

with

$$\Upsilon(\omega) = \frac{1}{\sqrt{(2\pi)^{M+1}\det[\mathbf{L}_{i,j}(\omega)]}} \underbrace{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}}_{M+1\ \text{fold}} \tilde{x}_i\tilde{x}_j\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_{i,j}^{\mathrm{T}}\mathbf{L}_{i,j}^{-1}(\omega)\tilde{\mathbf{x}}_{i,j}}\, d\tilde{\mathbf{x}}_{i,j} \tag{95}$$

characterizing the cross-correlation between $\tilde{x}_i$ and $\tilde{x}_j$ when they are jointly Gaussian random variables (see Assumption A1) with covariance matrix given by (92). Since $\tilde{x}_i$ denotes the 1st element of vector $\tilde{\mathbf{x}}_{i,j}$ and $\tilde{x}_j$ the (M+1)th element, variable $\Upsilon(\omega)$ corresponds to the (1, M+1)th element of matrix $\mathbf{L}_{i,j}(\omega)$; thereby, (94) is simplified to

$$\frac{\partial f_{i,j}(\omega)}{\partial\omega} = \frac{-M\,[(2\omega\tilde{\mathbf{R}}_{i,j}\mathbf{I}_{\mathrm{s}} + \mathbf{I}_{M+1})^{-1}\tilde{\mathbf{R}}_{i,j}]_{1,M+1}}{\sqrt{\det(2\omega\tilde{\mathbf{R}}_{i,j}\mathbf{I}_{\mathrm{s}} + \mathbf{I}_{M+1})}}. \tag{96}$$

Now, considering the eigendecomposition [4,5]

$$\tilde{\mathbf{R}}_{i,j}\mathbf{I}_{\mathrm{s}} = \mathbf{Q}_{\mathrm{s},i,j}\,\boldsymbol{\Lambda}_{\mathrm{s},i,j}\,\mathbf{Q}_{\mathrm{s},i,j}^{-1} \tag{97}$$

where $\mathbf{Q}_{\mathrm{s},i,j}$ represents the eigenvector matrix and $\boldsymbol{\Lambda}_{\mathrm{s},i,j}$ is a diagonal matrix containing the eigenvalues of $\tilde{\mathbf{R}}_{i,j}\mathbf{I}_{\mathrm{s}}$, we have from (96) that

$$\frac{\partial f_{i,j}(\omega)}{\partial\omega} = \frac{-M\,[\mathbf{Q}_{\mathrm{s},i,j}(2\omega\boldsymbol{\Lambda}_{\mathrm{s},i,j} + \mathbf{I}_{M+1})^{-1}\mathbf{Q}_{\mathrm{s},i,j}^{-1}\tilde{\mathbf{R}}_{i,j}]_{1,M+1}}{\sqrt{\det(2\omega\boldsymbol{\Lambda}_{\mathrm{s},i,j} + \mathbf{I}_{M+1})}}. \tag{98}$$

Note that (98) can be rewritten in a more compact form as

$$\frac{\partial f_{i,j}(\omega)}{\partial\omega} = \frac{-M\,\mathbf{q}_{\mathrm{s},i,j}^{\mathrm{T}}(2\omega\boldsymbol{\Lambda}_{\mathrm{s},i,j} + \mathbf{I}_{M+1})^{-1}\mathbf{Q}_{\mathrm{s},i,j}^{-1}\mathbf{r}_{\mathrm{e},i,j}}{\sqrt{\det(2\omega\boldsymbol{\Lambda}_{\mathrm{s},i,j} + \mathbf{I}_{M+1})}} \tag{99}$$

where $\mathbf{q}_{\mathrm{s},i,j}$ denotes a vector containing the first row of $\mathbf{Q}_{\mathrm{s},i,j}$ and $\mathbf{r}_{\mathrm{e},i,j}$ represents a vector containing the last column of $\tilde{\mathbf{R}}_{i,j}$. Additional simplifications of (99) can be obtained by analyzing the structure of $\tilde{\mathbf{R}}_{i,j}\mathbf{I}_{\mathrm{s}}$. For instance, from the eigendecomposition (97), it follows that

$$\mathbf{Q}_{\mathrm{s},i,j} = \begin{bmatrix}\tilde{\mathbf{Q}}_i & \mathbf{0}\\ \tilde{\mathbf{r}}_{i,j}^{\mathrm{T}}\tilde{\mathbf{Q}}_i\tilde{\boldsymbol{\Lambda}}_i^{-1} & 1\end{bmatrix}, \tag{100}$$

$$\boldsymbol{\Lambda}_{\mathrm{s},i,j} = \begin{bmatrix}\tilde{\boldsymbol{\Lambda}}_i & \mathbf{0}\\ \mathbf{0}^{\mathrm{T}} & 0\end{bmatrix} \tag{101}$$

and

$$\mathbf{Q}_{\mathrm{s},i,j}^{-1} = \begin{bmatrix}\tilde{\mathbf{Q}}_i^{\mathrm{T}} & \mathbf{0}\\ -\tilde{\mathbf{r}}_{i,j}^{\mathrm{T}}\tilde{\mathbf{R}}_i^{-1} & 1\end{bmatrix} \tag{102}$$

where $\tilde{\mathbf{Q}}_i$ and $\tilde{\boldsymbol{\Lambda}}_i$ are matrices containing the eigenvectors and eigenvalues of $\tilde{\mathbf{R}}_i$, respectively, and $\tilde{\mathbf{r}}_{i,j}$ is the cross-correlation vector between $\tilde{\mathbf{x}}_i(n)$ and $\tilde{x}_j(n)$. Furthermore, it is possible to show that $\mathbf{r}_{\mathrm{e},i,j}$ can be written in terms of $\tilde{\mathbf{r}}_{i,j}$ as

$$\mathbf{r}_{\mathrm{e},i,j} = \begin{bmatrix}\tilde{\mathbf{r}}_{i,j}\\ \sigma_j^2\end{bmatrix} \tag{103}$$

and $\mathbf{q}_{\mathrm{s},i,j}$ in terms of $\tilde{\mathbf{q}}_i$ as

$$\mathbf{q}_{\mathrm{s},i,j} = \begin{bmatrix}\tilde{\mathbf{q}}_i\\ 0\end{bmatrix} \tag{104}$$

where $\sigma_j^2$ is the (j,j)th element of $\tilde{\mathbf{R}}$ and $\tilde{\mathbf{q}}_i$ denotes an M-dimensional vector formed by the elements of the first row of $\tilde{\mathbf{Q}}_i$.

Taking into account (100)–(104), (99) reduces to

$$\frac{\partial f_{i,j}(\omega)}{\partial\omega} = \frac{-M\,\tilde{\mathbf{q}}_i^{\mathrm{T}}(2\omega\tilde{\boldsymbol{\Lambda}}_i + \mathbf{I}_M)^{-1}\tilde{\mathbf{Q}}_i^{\mathrm{T}}\tilde{\mathbf{r}}_{i,j}}{\sqrt{\det(2\omega\tilde{\boldsymbol{\Lambda}}_i + \mathbf{I}_M)}}. \tag{105}$$

Finally, integrating both sides of (105) with respect to ω and considering (93), we obtain

$$p(i,j) = M\,\tilde{\mathbf{q}}_i^{\mathrm{T}}\,\mathbf{H}_i\,\tilde{\mathbf{Q}}_i^{\mathrm{T}}\,\tilde{\mathbf{r}}_{i,j} \tag{106}$$

with $\mathbf{H}_i$ denoting a diagonal matrix whose elements are given by

$$h_i(l,l) = \int_0^{\infty}\frac{d\omega}{(1+2\omega\lambda_{i,l})\sqrt{\prod_{k=1}^{M}(1+2\omega\lambda_{i,k})}} \tag{107}$$

where $\lambda_{i,k}$ characterizes the eigenvalues of $\tilde{\mathbf{R}}_i$. Therefore, the determination of P involves the calculus of a high-order hyperelliptic integral [see (107)]. However, there is no general solution for this kind of integral in the open literature [29]. Thereby, aiming to determine an analytical solution for such an integral, the approach described in [27] is considered to obtain (20) as an approximate solution.
Appendix B. Determination of S
Here, the procedure carried out for computing matrix (41) is described. Such a procedure considers two strategies, one to determine the main diagonal elements of S and another for the off-diagonal elements. Specifically, the main diagonal elements of S are obtained in a similar way as in Appendix A, whereas the off-diagonal elements are computed by using the AP as in [27]. Thus, from the Gaussian probability density function (see Assumption A1) [33], the main diagonal elements of S are first expressed as

$$s(i,i) = \frac{M^2}{\sqrt{(2\pi)^{M}\det(\tilde{\mathbf{R}}_i)}} \underbrace{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}}_{M\ \text{fold}} \frac{\tilde{x}_i^2}{(\tilde{\mathbf{x}}_i^{\mathrm{T}}\tilde{\mathbf{x}}_i)^2}\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_i^{\mathrm{T}}\tilde{\mathbf{R}}_i^{-1}\tilde{\mathbf{x}}_i}\, d\tilde{\mathbf{x}}_i. \tag{108}$$

Then, defining an auxiliary function $g_i(\omega)$, such that $g_i(\omega)$ returns to (108) for ω = 0 [i.e., $s(i,i) = g_i(0)$], and differentiating twice with respect to ω, we get

$$\frac{\partial^2 g_i(\omega)}{\partial\omega^2} = M^2\sqrt{\frac{\det[\mathbf{M}_i(\omega)]}{\det(\tilde{\mathbf{R}}_i)}}\,\Upsilon(\omega) \tag{109}$$

with

$$\Upsilon(\omega) = \frac{1}{\sqrt{(2\pi)^M\det[\mathbf{M}_i(\omega)]}} \underbrace{\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}}_{M\ \text{fold}} \tilde{x}_i^2\, e^{-\frac{1}{2}\tilde{\mathbf{x}}_i^{\mathrm{T}}\mathbf{M}_i^{-1}(\omega)\tilde{\mathbf{x}}_i}\, d\tilde{\mathbf{x}}_i \tag{110}$$

and

$$\mathbf{M}_i(\omega) = (2\omega\mathbf{I}_M + \tilde{\mathbf{R}}_i^{-1})^{-1}. \tag{111}$$

Now, since $\tilde{x}_i$ characterizes the first element of $\tilde{\mathbf{x}}_i$, variable $\Upsilon(\omega)$ corresponds to the (1,1)st element of $\mathbf{M}_i(\omega)$. Hence, making $\Upsilon(\omega) = [\mathbf{M}_i(\omega)]_{1,1}$ and using the eigendecomposition of $\tilde{\mathbf{R}}_i$, (109) reduces to

$$\frac{\partial^2 g_i(\omega)}{\partial\omega^2} = \frac{M^2\,\tilde{\mathbf{q}}_i^{\mathrm{T}}\tilde{\boldsymbol{\Lambda}}_i(2\omega\tilde{\boldsymbol{\Lambda}}_i + \mathbf{I}_M)^{-1}\tilde{\mathbf{q}}_i}{\sqrt{\det(2\omega\tilde{\boldsymbol{\Lambda}}_i + \mathbf{I}_M)}} \tag{112}$$

with $\tilde{\boldsymbol{\Lambda}}_i$, $\tilde{\mathbf{Q}}_i$, and $\tilde{\mathbf{q}}_i$ defined as in Appendix A. Therefore, integrating both sides of (112) twice with respect to ω, the diagonal elements of S are obtained as

$$s(i,i) = M^2\,\tilde{\mathbf{q}}_i^{\mathrm{T}}\,\mathbf{U}_i\,\tilde{\mathbf{q}}_i \tag{113}$$

where $\mathbf{U}_i$ denotes a diagonal matrix whose elements are

$$u_i(l,l) = \int_0^{\infty}\!\!\int_{\omega_1=\omega_2}^{\infty}\frac{\lambda_{i,l}\; d\omega_1\, d\omega_2}{(1+2\omega_1\lambda_{i,l})\sqrt{\prod_{k=1}^{M}(1+2\omega_1\lambda_{i,k})}}. \tag{114}$$

Finally, considering the approach described in [27], (47) is obtained as an approximate solution for the high-order hyperelliptic integral given in (114). In turn, (48) follows as an approximate solution for the off-diagonal elements of S.
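The hyperelliptic integrals (107) and (114) can also be evaluated numerically, which provides a reference for checking the closed-form approximations (20) and (47). The sketch below is a hedged illustration for (107) only, using SciPy quadrature; the function name and example eigenvalues are ours.

```python
import numpy as np
from scipy.integrate import quad

def h_numeric(eigs, l):
    """Numerical evaluation of the hyperelliptic integral (107) for the
    eigenvalues eigs of R~_i; a reference value against which the
    closed-form approximation (20) can be compared."""
    def integrand(w):
        return 1.0 / ((1.0 + 2.0 * w * eigs[l])
                      * np.sqrt(np.prod(1.0 + 2.0 * w * eigs)))
    val, _ = quad(integrand, 0.0, np.inf)
    return val

# Example usage (illustrative eigenvalues):
# eigs = np.array([2.0, 1.0, 0.5, 0.25]); print(h_numeric(eigs, 0))
```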
References

[1] B. Widrow, S.D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, 1985.
[2] B. Farhang-Boroujeny, Adaptive Filters: Theory and Applications, Wiley, Chichester, 1999.
[3] D.G. Manolakis, V.K. Ingle, S.M. Kogon, Statistical and Adaptive Signal Processing: Spectral Estimation, Signal Modeling, Adaptive Filtering, and Array Processing, McGraw-Hill, New York, 2000.
[4] S. Haykin, Adaptive Filter Theory, 4th ed., Prentice-Hall, Upper Saddle River, 2002.
[5] A.H. Sayed, Adaptive Filters, John Wiley & Sons, Hoboken, 2009.
[6] S.S. Narayan, A.M. Peterson, M.J. Narasimha, Transform domain LMS algorithm, IEEE Trans. Acoust. Speech Signal Process. 31 (3) (1983) 609–615.
[7] K. Ozeki, T. Umeda, An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties, Electron. Commun. Jpn. 67-A (5) (1984) 19–27.
[8] J.C. Lee, C.K. Un, Performance of transform-domain LMS adaptive digital filters, IEEE Trans. Acoust. Speech Signal Process. 34 (3) (1986) 499–510.
[9] D.F. Marshall, W.K. Jenkins, J.J. Murphy, The use of orthogonal transforms for improving performance of adaptive filters, IEEE Trans. Circuits Syst. 36 (4) (1989) 474–484.
[10] A. Gilloire, M. Vetterli, Adaptive filtering in subbands with critical sampling: analysis, experiments, and application to acoustic echo cancellation, IEEE Trans. Signal Process. 40 (8) (1992) 1862–1875.
[11] F. Beaufays, Transform-domain adaptive filters: an analytical approach, IEEE Trans. Signal Process. 43 (2) (1995) 422–431.
[12] S.L. Gay, S. Tavathia, The fast affine projection algorithm, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, Detroit, MI, May 1995, pp. 3023–3026.
[13] S.J. Chern, J.C. Horng, K.M. Wong, The performance of the hybrid LMS adaptive algorithm, Signal Process. 44 (1) (1995) 67–88.
[14] S. Hosur, A.H. Tewfik, Wavelet transform domain adaptive FIR filtering, IEEE Trans. Signal Process. 45 (3) (1997) 617–630.
[15] D.I. Kim, P. De Wilde, Performance analysis of the DCT-LMS adaptive filtering algorithm, Signal Process. 80 (8) (2000) 1629–1654.
[16] K. Mayyas, A note on "Performance analysis of the DCT-LMS adaptive filtering algorithm", Signal Process. 85 (7) (2005) 1465–1467.
[17] S. Zhao, Z. Man, S. Khoo, H.R. Wu, Stability and convergence analysis of transform-domain LMS adaptive filters with second-order autoregressive process, IEEE Trans. Signal Process. 57 (1) (2009) 119–130.
[18] G. Murmu, R. Nath, Convergence performance comparison of transform domain LMS adaptive filters for correlated signal, in: Proceedings of the International Conference on Devices and Communications (ICDECOM), Ranchi, India, February 2011, pp. 1–5.
[19] W. Yin, A.S. Mehr, Stochastic analysis of the normalized subband adaptive filter algorithm, IEEE Trans. Circuits Syst. I, Regul. Pap. 58 (5) (2011) 1020–1033.
[20] E.L.O. Batista, R. Seara, On the performance of adaptive pruned Volterra filters, Signal Process. 93 (7) (2013) 1909–1920.
[21] J.E. Kolodziej, O.J. Tobias, R. Seara, D.R. Morgan, On the constrained stochastic gradient algorithm: model, performance, and improved version, IEEE Trans. Signal Process. 57 (4) (2009) 1304–1315.
[22] E. Eweda, Transient and tracking performance bounds of the sign–sign algorithm, IEEE Trans. Signal Process. 47 (8) (1999) 2200–2210.
[23] A.I. Sulyman, A. Zerguine, Convergence and steady-state analysis of a variable step-size NLMS algorithm, Signal Process. 83 (6) (2003) 1255–1273.
[24] M. Moinuddin, A. Zerguine, A.U.H. Sheikh, Multiple-access interference plus noise-constrained least mean square (MNCLMS) algorithm for CDMA systems, IEEE Trans. Circuits Syst. I, Regul. Pap. 55 (9) (2008) 2870–2883.
[25] M.S.E. Abadi, J.H. Husøy, On the application of a unified adaptive filter theory in the performance prediction of adaptive filter algorithms, Digit. Signal Process. 19 (3) (2009) 410–432.
[26] E.M. Lobato, O.J. Tobias, R. Seara, A stochastic model for the transform-domain LMS algorithm, in: Proceedings of the European Signal Processing Conference (EUSIPCO), Vienna, Austria, September 2004, pp. 1833–1836.
[27] E.M. Lobato, O.J. Tobias, R. Seara, Stochastic modeling of the transform-domain εLMS algorithm, IEEE Trans. Signal Process. 56 (5) (2008) 1840–1852.
[28] C.G. Samson, U. Reddy, Fixed point error analysis of the normalized ladder algorithm, IEEE Trans. Acoust. Speech Signal Process. 31 (5) (1983) 1177–1191.
[29] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series, and Products, 7th ed., Academic Press, San Diego, 2007.
[30] E.M. Lobato, O.J. Tobias, R. Seara, Stochastic modeling of the transform-domain εLMS algorithm for a time-varying environment, in: Proceedings of the European Signal Processing Conference (EUSIPCO), Antalya, Turkey, September 2005, pp. 1–4.
[31] J.E. Kolodziej, O.J. Tobias, R. Seara, Stochastic analysis of the transform domain LMS algorithm for a non-stationary environment, in: Proceedings of the European Signal Processing Conference (EUSIPCO), Glasgow, Scotland, August 2009, pp. 1–4.
[32] O. Macchi, Adaptive Processing: The Least Mean Squares Approach with Applications in Transmission, Wiley, New York, 1995.
[33] C.W. Therrien, Discrete Random Signals and Statistical Signal Processing, Prentice-Hall, Englewood Cliffs, 1992.
[34] H.J. Butterweck, The independence assumption: a dispensable tool in adaptive filter theory, Signal Process. 57 (3) (1997) 305–310.
[35] M. Rupp, The behavior of LMS and NLMS algorithms in the presence of spherically invariant processes, IEEE Trans. Signal Process. 41 (3) (1993) 1149–1160.
Eduardo Vinicius Kuhn was born in Cascavel, PR, Brazil. He received the B.S. degree in Telecommunications Engineering from the Assis Gurgacz University, Cascavel, PR, Brazil, in 2009, and the M.Sc. degree in Electrical Engineering from the Federal University of Santa Catarina, Florianópolis, SC, Brazil, in 2012. Currently, he is working towards the Ph.D. degree in Electrical Engineering at the Federal University of Santa Catarina. His research interests include adaptive signal processing, speech processing, and digital communications systems.

Javier Ernesto Kolodziej was born in Posadas, Mns., Argentina. In 2002, he received the B.S. degree in Electronics Engineering from the National University of Misiones, Argentina, and the M.Sc. and Ph.D. degrees in Electrical Engineering from the Federal University of Santa Catarina, Florianópolis, Brazil, in 2006 and 2010, respectively. He joined the Department of Electronics Engineering at the National University of Misiones in 2001. His present research interests include adaptive signal processing theory and its applications.

Rui Seara was born in Florianópolis, SC, Brazil. He received the B.S. and M.Sc. degrees in Electrical Engineering from the Federal University of Santa Catarina, Brazil, in 1975 and 1980, respectively. In 1984, he received the Doctoral degree in Electrical Engineering from the Paris-Sud University, Paris, France. He joined the Electrical Engineering Department at the Federal University of Santa Catarina, Brazil, in 1976, where he is currently a Professor of Electrical Engineering and Director of LINSE–Circuits and Signal Processing Laboratory. His research interests include digital and analog filtering, adaptive signal processing algorithms, image and speech processing, and digital communications.