ARTICLE IN PRESS
Signal Processing 85 (2005) 1759–1779 www.elsevier.com/locate/sigpro
Efficient fast recursive least-squares adaptive complex filtering using real valued arithmetic

George-Othon Glentis

Technological Education Institute of Crete, Branch at Chania, Department of Electronics, 3, Romanou Str., Halepa, 73133 Chania, Greece

Received 26 January 2004; received in revised form 29 March 2005
Abstract

In this paper, a novel implementation of the fast recursive least squares (FRLS) algorithm for complex adaptive filtering is presented. Complex signals are treated as pairs of real signals, and operations are re-organized on a real arithmetic basis. A new set of signals and filtering parameters is introduced, expressed either as the sum or as the difference of the real and the imaginary parts of the original variables. The algorithmic strength reduction technique is subsequently applied, resulting in significant computational savings. In this way, an efficient FRLS algorithm is derived, where only two real valued filters are propagated, based on novel three-term recursions for the time updating of the pertinent parameters. The proposed algorithm is implemented using real valued arithmetic only, while reducing the number of required real valued multiplication operations by 23.5%, at the expense of a marginal 2.9% increase in the number of real valued additions. The performance of the proposed FRLS algorithm is investigated by computer simulations, in the context of adaptive system identification and adaptive channel equalization.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Adaptive filtering; Fast algorithms; Complex valued arithmetic
1. Introduction

Complex valued signals are encountered in many adaptive signal processing applications. Typical examples include the design of adaptive algorithms for channel identification or channel
equalization of telecommunication links, the design of adaptive beamformers or sidelobe cancellers in array signal processing, etc. [1–5]. While most of the existing real valued adaptive algorithms can easily be extended to handle complex valued signals, it is only recently that attention has been given to the specific implementation of the complex valued operations [6–9]. The fast complex valued multiplication method [10,11] has
0165-1684/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2005.03.013
been adopted for the design of efficient digital signal processing algorithms, including digital filtering and adaptive filtering. In the context of VLSI signal processing, the fast complex valued multiplication method has been introduced as an algorithmic strength reduction (SR) transform, and it has been successfully applied to the design of low-power, high-speed adaptive filters and equalizers [12–14,8,15,16].

In this paper, the fast recursive least squares (FRLS) algorithm for complex adaptive filtering is considered and an efficient real valued FRLS scheme is derived. Throughout this paper, the FRLS nomenclature introduced by the FAEST algorithm is adopted [17]. The FRLS FTF counterpart can be handled in a similar way [18]. Direct application of a fast complex valued multiplication method to the original FRLS scheme, where each individual complex multiplication is performed using a fast scheme [10,19,11], can reduce the number of required real multiplications by 23.5%, at the expense of a 70% increase in the required number of additions. To keep the overhead due to the extra additions low, both the input and the desired response signals, as well as all variables associated with the FRLS algorithm, are treated as pairs of real signals, and operations are re-organized on a real arithmetic basis. Auxiliary signals and filter parameters are introduced, which correspond to the generic signal transformation s_R(n) ± s_I(n), where s_R(n) and s_I(n) represent the real and the imaginary part of the complex valued signal s(n), respectively. The algorithmic strength reduction technique is subsequently applied to the transformed FRLS algorithm [9,16]. In this way, novel real valued three-term recursions are derived for the efficient time update of the FRLS parameters, resulting in significant computational savings.
The proposed FRLS algorithm utilizes real arithmetic only, for the efficient realization of the transversal filtering part as well as of the parameter updating part. The computational complexity of the proposed scheme, measured by the required number of real valued multiplications, is reduced by 23.5%, with a marginal 2.9% increase in the number of real valued additions. Computational savings without sacrificing performance is a task of primary interest in the design of high-speed adaptive systems, where processing power of several GOPS (giga operations per second) is needed. On the other hand, in slow sampling rate adaptive filtering systems, for example in speech processing and hearing aid applications, where special VLSI ASICs are required for small size and low-power implementation, the reduction in computational complexity is of primary interest, since it is directly related to the size and the power consumption of the final design. The performance of the proposed FRLS implementation is investigated by computer simulations, in the context of adaptive system identification and adaptive channel equalization. To avoid numerical instability, error feedback stabilization is utilized [20]. Simulation results indicate that both the original complex valued FRLS algorithm and the proposed efficient FRLS implementation using real valued arithmetic have similar convergence and stabilization properties.
2. Fast RLS adaptive filtering overview

Let us consider a complex valued FIR model described by the difference equation [3]

y(n) = Σ_{i=1}^{M} c_i^* x(n−i+1).  (1)

Here x(n) is a complex valued input signal and c_i, i = 1, ..., M, are the complex valued filter coefficients. (As for the superscripts throughout this paper, '*' means the complex conjugate, 'T' denotes the vector transpose, and 'H' stands for the Hermitian operator (conjugate and transpose).) The above equation is compactly written as

y(n) = c_M^H x_M(n).  (2)

Vector x_M(n) is the regressor

x_M(n) = [x(n) x(n−1) ... x(n−M+1)]^T  (3)

and vector

c_M = [c_1 c_2 ... c_M]^T  (4)

is the filter coefficients vector.
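As an illustrative sketch (variable names are ours, not from the paper), the model of Eqs. (1)–(3) amounts to a Hermitian inner product between the coefficient vector and the regressor:

```python
# Complex FIR filtering, y(n) = c_M^H x_M(n): conjugate the coefficients,
# multiply with the regressor [x(n), x(n-1), ..., x(n-M+1)]^T and sum.

def regressor(x, n, M):
    """x_M(n) of Eq. (3); samples before time 0 are taken as zero."""
    return [x[n - i] if n - i >= 0 else 0j for i in range(M)]

def fir_output(c, x, n):
    """y(n) of Eqs. (1)-(2)."""
    return sum(ci.conjugate() * xi
               for ci, xi in zip(c, regressor(x, n, len(c))))

c = [1 + 1j, 0.5 - 2j]            # filter coefficients, M = 2
x = [2 + 1j, -1 + 0j, 0 + 3j]     # input samples x(0), x(1), x(2)
y = fir_output(c, x, 2)           # (1-1j)*(3j) + (0.5+2j)*(-1) = 2.5 + 1j
```

Python's built-in complex type makes the conjugation in the Hermitian product explicit, which is exactly the operation the real valued reformulation of Section 4 decomposes.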
Given a sequence of a complex valued input signal x(k), and a complex valued desired response signal z(k), over a finite observation horizon, k = 0, 1, ..., n, the optimum parameters of the FIR model of Eq. (1) are estimated by minimizing the exponentially forgetting total squared error, i.e.,

c_M(n) : min_{c_M} ( Σ_{k=0}^{n} λ^{n−k} e_M^{c*}(k) e_M^c(k) ),
e_M^c(k) = z(k) − c_M^H(n) x_M(k).  (5)
Here, 0 < λ ≤ 1 is a parameter that controls the size of the exponentially fading memory of the adaptation mechanism. Minimization of the cost function (5), with respect to the filter coefficients vector, results in a set of linear equations, the so-called normal equations

R_M(n) c_M(n) = d_M(n).  (6)
Matrix R_M(n) is the sampled autocorrelation matrix of the input signal x(n), defined as

R_M(n) = Σ_{k=0}^{n} λ^{n−k} x_M(k) x_M^H(k).  (7)

Vector d_M(n) is the sampled cross-correlation vector between the input signal x(n) and the desired response signal z(n), i.e.,

d_M(n) = Σ_{k=0}^{n} λ^{n−k} x_M(k) z^*(k).  (8)
The RLS algorithm provides a method for the time updating of the inverse autocorrelation matrix, bypassing the direct matrix inversion imposed by Eq. (6). Indeed, application of the matrix inversion lemma [3, p. 480] to Eqs. (6)–(8) results in the following updating scheme:

ε_M^c(n) = (z(n) − c_M^H(n−1) x_M(n)) / (1 + w_M^H(n) x_M(n)),  (9)
c_M(n) = c_M(n−1) + w_M(n) ε_M^{c*}(n).  (10)

Parameter w_M(n) that appears in the above equations is the so-called Kalman gain vector, defined as [17,18]

w_M(n) = λ^{−1} R_M^{−1}(n−1) x_M(n).  (11)
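The recursion (9)–(11) can be checked numerically against a direct solution of the normal equations (6)–(8). The following minimal sketch is our own illustration, not the paper's code: M = 2, the inverse P(n) = R_M^{-1}(n) is propagated with the matrix inversion lemma, and the small regularization δI implied by the recursive initialization is included on both sides of the comparison.

```python
# RLS for M = 2: propagate P(n) = R_M^{-1}(n) with the matrix inversion lemma,
# update c_M(n) via Eqs. (9)-(11), then compare with solving Eq. (6) directly.

lam = 0.9        # forgetting factor lambda
delta = 1e-6     # R(-1) = delta*I, hence P(-1) = (1/delta)*I

def matvec(A, v):
    return [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]

def hdot(u, v):  # Hermitian inner product u^H v
    return u[0].conjugate()*v[0] + u[1].conjugate()*v[1]

def solve2(A, b):  # Cramer's rule for a 2x2 complex system A c = b
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    return [(b[0]*A[1][1] - A[0][1]*b[1]) / det,
            (A[0][0]*b[1] - b[0]*A[1][0]) / det]

x = [1 + 1j, 2 - 1j, -1 + 2j, 0.5 + 0.5j]   # input signal
z = [1 + 0j, -1 + 1j, 2 + 0j, 1 - 1j]       # desired response

def xM(n):  # regressor [x(n), x(n-1)]^T, zero before time 0
    return [x[n], x[n - 1] if n >= 1 else 0j]

P = [[1/delta, 0j], [0j, 1/delta]]
c = [0j, 0j]
for n in range(len(x)):
    xn = xM(n)
    w = [v / lam for v in matvec(P, xn)]                 # Eq. (11)
    eps = (z[n] - hdot(c, xn)) / (1 + hdot(w, xn))       # Eq. (9)
    c = [c[i] + w[i] * eps.conjugate() for i in (0, 1)]  # Eq. (10)
    Px = matvec(P, xn)                                   # inversion-lemma step
    den = lam + hdot(xn, Px)
    P = [[(P[i][j] - Px[i] * Px[j].conjugate() / den) / lam for j in (0, 1)]
         for i in (0, 1)]

# Direct solution of R_M(n) c_M(n) = d_M(n), Eqs. (6)-(8), including the
# regularization term implied by the recursive initialization.
n = len(x) - 1
R = [[delta * lam**(n + 1) * (i == j) for j in (0, 1)] for i in (0, 1)]
d = [0j, 0j]
for k in range(n + 1):
    xk = xM(k)
    for i in (0, 1):
        d[i] += lam**(n - k) * xk[i] * z[k].conjugate()          # Eq. (8)
        for j in (0, 1):
            R[i][j] += lam**(n - k) * xk[i] * xk[j].conjugate()  # Eq. (7)
c_direct = solve2(R, d)
```

Both paths produce the same coefficient vector, which is the point of the recursion: the O(M³) solve is replaced by an O(M²) update, and the fast RLS algorithms below reduce this further to O(M).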
According to the RLS algorithm, the inverse matrix R_M^{−1}(n−1) can be updated using an O(M²) recursive scheme [3]. Fast RLS algorithms go a step further, using fast O(M) adaptive schemes for the time update of the Kalman gain vector, bypassing the need for the adaptation of the inverse matrix R_M^{−1}(n−1) [17,18,21,4]. The key point in the development of fast RLS algorithms is the proper utilization of the shift invariance property of the data matrix. Indeed, the increased order data vector x_{M+1}(n) can be partitioned either in a lower or in an upper form, as

x_{M+1}(n) = [x_M(n); x(n−M)] = [x(n); x_M(n−1)].  (12)

These partitions of the data vector are utilized to obtain appropriate partitions of the increased order autocorrelation matrix R_{M+1}(n). It has been shown [17,18] that the Kalman gain vector w_M(n−1) at time n−1 is related to the Kalman gain vector at time n, as it follows from the upper and the lower form solution of the increased order system, Eq. (11). The fast RLS algorithm for complex adaptive filtering is tabulated in Table 1. Auxiliary vectors a_M(n) and b_M(n) are the least squares forward and backward predictors; they result from the general filtering scheme of Eq. (1) by setting z(n) = x(n+1) and z(n) = x(n−M), respectively.

The intrinsic numerical instability of the FRLS algorithm is treated using the error feedback stabilization method, originally proposed in [22,20]. Two specific variables are used for this purpose, the a priori backward prediction error e_M^b(n) and the Kalman gain power α_M(n). These are estimated by the algorithm as dictated by Eq. (10) and Eqs. (6), (14) of Table 1, respectively. At the same time, these parameters are re-estimated using Eqs. (11) and (15) of Table 1. The error signals Δ_M^b(n) and Δ_M^α(n), corresponding to the backward prediction error and the Kalman gain power, are utilized as a feedback mechanism to control the error propagation and to provide a new set of variables to be used by the algorithm, as dictated by Eqs. (13) and (17) of Table 1. Tuning variables s^b and s^α are carefully selected so that
Table 1
The stabilized complex valued fast RLS (FRLS) adaptive algorithm

Initialization:
c_M(−1) = 0, w_M(−1) = 0, a_M(−1) = 0, b_M(−1) = 0,
α_M(−1) = 1, α_M^f(−1) = λ^M α_M^b(−1), α_M^b(−1) = σ_x².

Forward prediction part:
1. e_M^f(n) = x(n) + a_M^H(n−1) x_M(n−1)
2. ε_M^f(n) = e_M^f(n)/α_M(n−1)
3. a_M(n) = a_M(n−1) − w_M(n−1) ε_M^{f*}(n)
4. α_M^f(n) = λ α_M^f(n−1) + e_M^f(n) ε_M^{f*}(n)
5. k_{M+1}^f(n) = λ^{−1} e_M^f(n)/α_M^f(n−1)
6. α_{M+1}(n) = α_M(n−1) + e_M^f(n) k_{M+1}^{f*}(n)

Kalman gain part:
7. w_{M+1}(n) = [0; w_M(n−1)] + k_{M+1}^f(n) [1; a_M(n−1)]
8. w_{M+1}(n) = [q_M(n); k_{M+1}^b(n)]
9. w_M(n) = q_M(n) − b_M(n−1) k_{M+1}^b(n)

Backward prediction part:
10. e_M^b(n) = λ α_M^b(n−1) k_{M+1}^b(n)
11. E_M^b(n) = x(n−M) + b_M^H(n−1) x_M(n)
12. Δ_M^b(n) = E_M^b(n) − e_M^b(n)
13. e_M^{b,i}(n) = E_M^b(n) + s_i^b Δ_M^b(n), i = 1, 2, 3
14. α_M(n) = α_{M+1}(n) − e_M^{b,1}(n) k_{M+1}^{b*}(n)
15. A_M(n) = 1 + w_M^H(n) x_M(n)
16. Δ_M^α(n) = A_M(n) − α_M(n)
17. α_M(n) = A_M(n) + s^α Δ_M^α(n)
18. ε_M^{b,i}(n) = e_M^{b,i}(n)/α_M(n), i = 2, 3
19. b_M(n) = b_M(n−1) − w_M(n) ε_M^{b,2*}(n)
20. α_M^b(n) = λ α_M^b(n−1) + e_M^{b,3}(n) ε_M^{b,3*}(n)

Filtering part:
21. e_M^c(n) = z(n) − c_M^H(n−1) x_M(n)
22. ε_M^c(n) = e_M^c(n)/α_M(n)
23. c_M(n) = c_M(n−1) + w_M(n) ε_M^{c*}(n)

Forward and backward prediction powers α_M^b(−1) and α_M^f(−1) are initialized using an estimate of the input signal power, σ_x². Eq. (8) defines a vector partition.
the overall algorithm remains stable. An additional constraint of the stabilized fast RLS algorithms is the nominal range within which the forgetting factor λ should lie in order for the algorithm to remain stable; as was explained in [18], λ ∈ (1 − 1/3M, 1) [20].
3. Algorithmic strength reduction

Complex valued arithmetic is required for the implementation of the FRLS algorithm when the input signals and/or the filter model are represented by complex valued variables. Complex valued addition is realized by a set of two real valued additions, i.e.,

(a + jb) + (c + jd) = (a + c) + j(b + d).  (13)

Complex valued multiplication can be realized by the classic method, which requires four real valued multiplications and two real valued additions, i.e.,

(a + jb)(c + jd) = (ac − bd) + j(ad + bc).  (14)
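As a quick numerical illustration (helper names are ours), the four-multiplication rule (14) and a three-multiplication rearrangement of the same product agree exactly:

```python
# Classic complex multiplication, Eq. (14): 4 real mults, 2 real adds.
def cmul_classic(a, b, c, d):
    return (a*c - b*d, a*d + b*c)

# Fast method: 3 real mults (t1, t2, t3) and 5 real additions/subtractions.
def cmul_fast(a, b, c, d):
    t1 = b * (c - d)
    t2 = (a - b) * c
    t3 = a * (c + d)
    return (t1 + t2, t3 - t2)   # (ac - bd, ad + bc)
```

Both functions return the real and imaginary parts of (a + jb)(c + jd); the fast variant trades one real multiplication for three extra real additions, which is exactly the trade-off the strength reduction transform exploits.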
Alternatively, the fast complex valued multiplication method can be applied, where the inherent dependencies of the partial products and sums are utilized to reduce the number of real valued multiplication operations, at the expense of some extra real valued additions [10,14,15,19,11]. In this case, three real valued multiplications and five real valued additions are required. Among others, two possible implementations of the fast complex valued multiplication are described by the following equations [19]:

ac − bd = b(c − d) + (a − b)c,
ad + bc = a(c + d) − (a − b)c  (15)

and

ac − bd = a(c + d) − (a + b)d,
ad + bc = b(c − d) + (a + b)d.  (16)

Efficient signal processing algorithms, for digital filtering and adaptive filtering, have been developed by taking into account the reduced complexity, fast complex valued multiplication method [6–9]. In the context of VLSI signal processing, transformations are modifications to the computational structure of a given algorithm, in a manner such that the input–output behavior is preserved. The application of the fast complex valued multiplication method belongs to a general class of algorithmic transformations, and is known as the algorithmic strength reduction transform. It has been successfully applied for the design of low-power, high-speed adaptive filters and equalizers [12,13,16].

Direct implementation of the fast complex valued multiplication method in the FRLS scheme can reduce the required number of real multiplication operations. However, although the number of real valued multiplications is reduced, there is a significant increase in the number of real valued additions and in the total number of real valued operations performed by the algorithm. Further improvement in the number of required real valued additions can be achieved by algorithm re-organization, using a set of auxiliary signals and filter parameters that are propagated through the algorithm and updated by the pertinent recursive equations, in accordance with the strength reduction real valued arithmetic imposed by Eqs. (15) and (16).

4. The SR FRLS adaptive algorithm

Application of the algorithmic strength reduction transform, described by Eqs. (15) and (16), to the FRLS adaptive algorithm results in an equivalent algorithmic description, called the strength reduced (SR) FRLS adaptive algorithm, where only real valued arithmetic is required. The main features of the SR FRLS algorithm are
the introduction of a set of real valued auxiliary signals that are fed into the transversal filters and are propagated through the algorithm, and the use of a set of real valued, modified filter parameters that are updated instead of the original ones, using novel three-term recursions. As a result, a minimum number of operations, i.e., real valued multiplications and additions, is attained.
Let us consider the complex filtering set up dictated by Eqs. (1)–(4). The input signal x(n) is expressed in terms of a real and an imaginary part, as x(n) = x_R(n) + j x_I(n) (throughout this paper, subscripts R and I denote the real and the imaginary parts of a scalar, complex valued variable, respectively). As a consequence, the corresponding complex regressor vector x_M(n) is written as

x_M(n) = x_M^R(n) + j x_M^I(n),  (17)

where vectors x_M^R(n) and x_M^I(n) denote the real and the imaginary part of x_M(n), respectively, i.e.,

x_M^R(n) = [x_R(n) x_R(n−1) ... x_R(n−M+1)]^T  (18)

and

x_M^I(n) = [x_I(n) x_I(n−1) ... x_I(n−M+1)]^T  (19)

(superscripts R and I denote the real and the imaginary parts of a vector, complex valued variable, respectively). In a similar way, the filter coefficients complex vector c_M(n) is expressed as

c_M(n) = c_M^R(n) + j c_M^I(n).  (20)

Using Eqs. (17) and (20), the filter output at time n, i.e., y(n) = c_M^H(n−1) x_M(n), is expanded as

y(n) = c_M^{R,T}(n−1) x_M^R(n) + c_M^{I,T}(n−1) x_M^I(n) + j (c_M^{R,T}(n−1) x_M^I(n) − c_M^{I,T}(n−1) x_M^R(n)).  (21)

Let us consider a set of auxiliary filter output signals, defined as

ŷ(n) = y_R(n) + y_I(n),  (22)
ỹ(n) = y_R(n) − y_I(n).  (23)

Using Eq. (21), variables ŷ(n) and ỹ(n) are estimated as

ŷ(n) = c̃_M^T(n−1) x_M^R(n) + ĉ_M^T(n−1) x_M^I(n),  (24)
ỹ(n) = ĉ_M^T(n−1) x_M^R(n) − c̃_M^T(n−1) x_M^I(n).  (25)

Here, vectors ĉ_M(n) and c̃_M(n) are transformed versions of the original filter coefficient vector c_M(n). They are defined as

ĉ_M(n) = c_M^R(n) + c_M^I(n),  (26)
c̃_M(n) = c_M^R(n) − c_M^I(n).  (27)

Consider the complex variable

Y(n) = ỹ(n) + j ŷ(n).  (28)

It is computed using Eqs. (24) and (25) as

Y(n) = (ĉ_M(n−1) + j c̃_M(n−1))^T (x_M^R(n) + j x_M^I(n)).  (29)

Application of the SR transform, Eq. (15), to the above equation results in

ŷ(n) = ĉ_M^T(n−1) x̂_M(n) − (ĉ_M(n−1) − c̃_M(n−1))^T x_M^R(n),  (30)
ỹ(n) = c̃_M^T(n−1) x̃_M(n) + (ĉ_M(n−1) − c̃_M(n−1))^T x_M^R(n).  (31)

The auxiliary regressors x̂_M(n) and x̃_M(n) that appear above are compatibly defined as

x̂_M(n) = x_M^R(n) + x_M^I(n),  (32)
x̃_M(n) = x_M^R(n) − x_M^I(n).  (33)
Eqs. (30) and (31) provide a cost effective way of computing variables ŷ(n) and ỹ(n), using 3M real valued multiplications and 4M − 1 real valued additions. On the contrary, evaluation of either ŷ(n) and ỹ(n) using Eqs. (24) and (25), or y(n) using Eq. (21), requires 4M real valued multiplications and 4M − 2 real valued additions. Thus, working with the transformed variables ŷ(n) and ỹ(n) instead of the original parameter y(n) results in a significant reduction in the required number of real valued multiplications. If, instead, the SR technique is applied on an individual basis, for each complex valued multiplication involved in the inner product computation y(n) = c_M^H(n−1) x_M(n), 3M real valued multiplications and 7M real valued additions are utilized. Estimation of the complex convolution y(n) = c_M^H(n−1) x_M(n) using the SR technique was originally introduced in the context of LMS adaptive filtering [9], where y_R(n) and y_I(n) were computed in an efficient way. Here, we approach
the subject in a slightly different way, estimating ŷ(n) and ỹ(n) instead of y_R(n) and y_I(n). Of course, the latter variables can be retrieved from the former by applying the inverse transformation implied by Eqs. (22) and (23).

Since ŷ(n) and ỹ(n) are estimated using the transformed filters ĉ_M(n) and c̃_M(n), it would be more efficient if these variables were updated directly, instead of updating the original c_M(n). Using Eq. (23) of Table 1, we get a couple of recursive equations of the form

c_M^R(n) = c_M^R(n−1) + w_M^R(n) ε_M^{c,R}(n) + w_M^I(n) ε_M^{c,I}(n),  (34)
c_M^I(n) = c_M^I(n−1) − w_M^R(n) ε_M^{c,I}(n) + w_M^I(n) ε_M^{c,R}(n).  (35)
If we insert the above recursions into Eqs. (26) and (27), we get

ĉ_M(n) = ĉ_M(n−1) + (ŵ_M(n) ε_M^{c,R}(n) − w̃_M(n) ε_M^{c,I}(n)),  (36)
c̃_M(n) = c̃_M(n−1) + (w̃_M(n) ε_M^{c,R}(n) + ŵ_M(n) ε_M^{c,I}(n)).  (37)

Here, the transformed Kalman gain vectors ŵ_M(n) and w̃_M(n) are defined as

ŵ_M(n) = w_M^R(n) + w_M^I(n),  (38)
w̃_M(n) = w_M^R(n) − w_M^I(n).  (39)

Consider the complex valued transformed coefficients vector defined as

C_M(n) = ĉ_M(n) − j c̃_M(n).  (40)

It is updated using Eqs. (36) and (37) as

C_M(n) = C_M(n−1) + (ŵ_M(n) − j w̃_M(n)) (ε_M^{c,R}(n) − j ε_M^{c,I}(n)).  (41)
Finally, application of the algorithmic strength reduction transform, Eq. (16), to the above equation results in

ĉ_M(n) = ĉ_M(n−1) + ŵ_M(n) ε̃_M^c(n) + (ŵ_M(n) − w̃_M(n)) ε_M^{c,I}(n),  (42)
c̃_M(n) = c̃_M(n−1) + w̃_M(n) ε̂_M^c(n) + (ŵ_M(n) − w̃_M(n)) ε_M^{c,I}(n).  (43)

In the above recursions, a set of transformed filtering error variables has been introduced according to the scheme

ê_M^c(n) = ẑ(n) − ŷ(n),  (44)
ẽ_M^c(n) = z̃(n) − ỹ(n),  (45)

where

ẑ(n) = z_R(n) + z_I(n),   z̃(n) = z_R(n) − z_I(n).  (46)
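The three-term recursions (42)–(43) can be checked against the complex valued update of Eq. (10); in this sketch (names ours) the updated hat/tilde pair is mapped back through Eqs. (26)–(27):

```python
# Three-term update of the transformed coefficients, Eqs. (42)-(43),
# versus the complex valued update c_M(n) = c_M(n-1) + w_M(n) * conj(eps).

w   = [0.2 + 0.1j, -0.3 + 0.4j]   # Kalman gain w_M(n)
cM  = [1.0 + 1.0j, 0.5 - 2.0j]    # c_M(n-1)
eps = 0.7 - 0.3j                  # a posteriori error eps_M^c(n)

c_ref = [c + wi * eps.conjugate() for c, wi in zip(cM, w)]   # Eq. (10)

w_hat = [wi.real + wi.imag for wi in w]        # Eq. (38)
w_til = [wi.real - wi.imag for wi in w]        # Eq. (39)
d_w   = [h - t for h, t in zip(w_hat, w_til)]  # w_hat - w_til
c_hat = [c.real + c.imag for c in cM]          # Eq. (26)
c_til = [c.real - c.imag for c in cM]          # Eq. (27)
e_hat = eps.real + eps.imag                    # eps_hat
e_til = eps.real - eps.imag                    # eps_til
e_I   = 0.5 * (e_hat - e_til)                  # eps^{c,I} = Im(eps)

g3 = [dw * e_I for dw in d_w]                  # shared third term
c_hat_new = [ch + wh * e_til + g
             for ch, wh, g in zip(c_hat, w_hat, g3)]   # Eq. (42)
c_til_new = [ct + wt * e_hat + g
             for ct, wt, g in zip(c_til, w_til, g3)]   # Eq. (43)
```

Per coefficient, the update costs three real multiplications (against four in the classical form), with the product `g3` shared between the two recursions.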
The computational complexity of Eqs. (42) and (43) is 3M real valued multiplications and 5M real valued additions. This figure should be compared with the computational complexity of the original adaptive scheme implied by Eqs. (34) and (35), which is either 4M real valued multiplications and 4M real valued additions, or 3M real valued multiplications and 7M real valued additions, depending on whether the classical or the fast elementwise type of complex valued multiplication is adopted. Clearly, the proposed recursions provide the best low complexity compromise among all the competitive schemes.

The SR transform can be applied to the forward and the backward prediction parameters that are involved in the computational flow of the FRLS algorithm of Table 1. Indeed, fast, three-term recursions can be derived for the efficient updating of the transformed forward prediction parameters, implied by Eqs. (1)–(4) of Table 1, as well as of the transformed backward prediction parameters, resulting from Eqs. (10)–(20) of Table 1. The SR FRLS formulas for updating the transformed forward and backward parameters are tabulated in Table 2, Eqs. (1)–(15) and (36)–(57), respectively. These reduced complexity recursions result from the general filtering scheme discussed so far, by setting ẑ(n) = x̂(n+1), z̃(n) = x̃(n+1), and ẑ(n) = x̂(n−M), z̃(n) = x̃(n−M), respectively. Finally, the derivation of the SR Kalman gain updating scheme is provided by Eqs. (16)–(35) of Table 2. The proof of the pertinent SR recursions is supplied in the Appendix.

The signal flow chart of the proposed FRLS algorithm is presented in Fig. 1. Twelve basic processing blocks are utilized, named after the
Table 2
Real valued implementation of the stabilized FRLS adaptive algorithm, using the strength reduction transform (SR FRLS)

Initialization:
ĉ_M(−1) = 0, c̃_M(−1) = 0, ŵ_M(−1) = 0, w̃_M(−1) = 0, â_M(−1) = 0, ã_M(−1) = 0,
b̂_M(−1) = 0, b̃_M(−1) = 0, d_M^w(−1) = 0,
α_M(−1) = 1, α_M^f(−1) = λ^M α_M^b(−1), α_M^b(−1) = σ_x².
x̂(n) = x_R(n) + x_I(n), x̃(n) = x_R(n) − x_I(n),
ẑ(n) = z_R(n) + z_I(n), z̃(n) = z_R(n) − z_I(n).

Forward prediction part:
1. ξ_1^f(n) = â_M^T(n−1) x̂_M(n−1)
2. ξ_2^f(n) = ã_M^T(n−1) x̃_M(n−1)
3. d_M^a(n−1) = â_M(n−1) − ã_M(n−1)
4. ξ_3^f(n) = d_M^{a,T}(n−1) x_M^R(n−1)
5. ê_M^f(n) = x̂(n) + (ξ_1^f(n) − ξ_3^f(n))
6. ẽ_M^f(n) = x̃(n) + (ξ_2^f(n) + ξ_3^f(n))
7. ε̂_M^f(n) = ê_M^f(n)/α_M(n−1)
8. ε̃_M^f(n) = ẽ_M^f(n)/α_M(n−1)
9. ε_M^{f,I}(n) = 0.5(ε̂_M^f(n) − ε̃_M^f(n))
10. g_M^{f,1}(n) = ŵ_M(n−1) ε̃_M^f(n)
11. g_M^{f,2}(n) = w̃_M(n−1) ε̂_M^f(n)
12. g_M^{f,3}(n) = d_M^w(n−1) ε_M^{f,I}(n)
13. â_M(n) = â_M(n−1) − g_M^{f,1}(n) − g_M^{f,3}(n)
14. ã_M(n) = ã_M(n−1) − g_M^{f,2}(n) − g_M^{f,3}(n)
15. α_M^f(n) = λ α_M^f(n−1) + 0.5(ê_M^f(n) ε̂_M^f(n) + ẽ_M^f(n) ε̃_M^f(n))

Kalman gain part:
16. k̂_{M+1}^f(n) = λ^{−1} ê_M^f(n)/α_M^f(n−1)
17. k̃_{M+1}^f(n) = λ^{−1} ẽ_M^f(n)/α_M^f(n−1)
18. g_{M+1}^{wf,1}(n) = k̂_{M+1}^f(n) [1; â_M(n−1)]
19. g_{M+1}^{wf,2}(n) = k̃_{M+1}^f(n) [1; ã_M(n−1)]
20. k_{M+1}^{f,I}(n) = 0.5(k̂_{M+1}^f(n) − k̃_{M+1}^f(n))
21. g_{M+1}^{wf,3}(n) = k_{M+1}^{f,I}(n) [0; d_M^a(n−1)]
22. ŵ_{M+1}(n) = [0; ŵ_M(n−1)] + g_{M+1}^{wf,1}(n) − g_{M+1}^{wf,3}(n)
23. w̃_{M+1}(n) = [0; w̃_M(n−1)] + g_{M+1}^{wf,2}(n) − g_{M+1}^{wf,3}(n)
24. α_{M+1}(n) = α_M(n−1) + 0.5(k̂_{M+1}^f(n) ê_M^f(n) + k̃_{M+1}^f(n) ẽ_M^f(n))
25. ŵ_{M+1}(n) = [q̂_M(n); k̂_{M+1}^b(n)], w̃_{M+1}(n) = [q̃_M(n); k̃_{M+1}^b(n)]
26. g_M^{wb,1}(n) = b̂_M(n−1) k̂_{M+1}^b(n)
27. g_M^{wb,2}(n) = b̃_M(n−1) k̃_{M+1}^b(n)
28. k_{M+1}^{b,I}(n) = 0.5(k̂_{M+1}^b(n) − k̃_{M+1}^b(n))
29. d_M^b(n−1) = b̂_M(n−1) − b̃_M(n−1)
30. g_M^{wb,3}(n) = d_M^b(n−1) k_{M+1}^{b,I}(n)
31. ŵ_M(n) = q̂_M(n) − g_M^{wb,1}(n) + g_M^{wb,3}(n)
32. w̃_M(n) = q̃_M(n) − g_M^{wb,2}(n) + g_M^{wb,3}(n)
33. ê_M^b(n) = λ α_M^b(n−1) k̂_{M+1}^b(n)
34. ẽ_M^b(n) = λ α_M^b(n−1) k̃_{M+1}^b(n)
35. A_M(n) = 1 + 0.5(ŵ_M^T(n) x̂_M(n) + w̃_M^T(n) x̃_M(n))

Backward prediction part:
36. ξ_1^b(n) = b̂_M^T(n−1) x̂_M(n)
37. ξ_2^b(n) = b̃_M^T(n−1) x̃_M(n)
38. ξ_3^b(n) = d_M^{b,T}(n−1) x_M^R(n)
39. Ê_M^b(n) = x̂(n−M) + (ξ_1^b(n) − ξ_3^b(n))
40. Ẽ_M^b(n) = x̃(n−M) + (ξ_2^b(n) + ξ_3^b(n))
41. Δ̂_M^b(n) = Ê_M^b(n) − ê_M^b(n)
42. Δ̃_M^b(n) = Ẽ_M^b(n) − ẽ_M^b(n)
43. ê_M^{b,i}(n) = Ê_M^b(n) + ŝ_i^b Δ̂_M^b(n), i = 1, 2, 3
44. ẽ_M^{b,i}(n) = Ẽ_M^b(n) + s̃_i^b Δ̃_M^b(n), i = 1, 2, 3
45. α_M(n) = α_{M+1}(n) − 0.5(ê_M^{b,1}(n) k̂_{M+1}^b(n) + ẽ_M^{b,1}(n) k̃_{M+1}^b(n))
46. Δ_M^α(n) = A_M(n) − α_M(n)
47. α_M(n) = A_M(n) + s^α Δ_M^α(n)
48. ε̂_M^{b,i}(n) = ê_M^{b,i}(n)/α_M(n), i = 2, 3
49. ε̃_M^{b,i}(n) = ẽ_M^{b,i}(n)/α_M(n), i = 2, 3
50. ε_M^{b,I}(n) = 0.5(ε̂_M^{b,2}(n) − ε̃_M^{b,2}(n))
51. g_M^{b,1}(n) = ŵ_M(n) ε̃_M^{b,2}(n)
52. g_M^{b,2}(n) = w̃_M(n) ε̂_M^{b,2}(n)
53. d_M^w(n) = ŵ_M(n) − w̃_M(n)
54. g_M^{b,3}(n) = d_M^w(n) ε_M^{b,I}(n)
55. b̂_M(n) = b̂_M(n−1) − g_M^{b,1}(n) − g_M^{b,3}(n)
56. b̃_M(n) = b̃_M(n−1) − g_M^{b,2}(n) − g_M^{b,3}(n)
57. α_M^b(n) = λ α_M^b(n−1) + 0.5(ê_M^{b,3}(n) ε̂_M^{b,3}(n) + ẽ_M^{b,3}(n) ε̃_M^{b,3}(n))

Filtering part:
58. y_1(n) = ĉ_M^T(n−1) x̂_M(n)
59. y_2(n) = c̃_M^T(n−1) x̃_M(n)
60. d_M^c(n−1) = ĉ_M(n−1) − c̃_M(n−1)
61. y_3(n) = d_M^{c,T}(n−1) x_M^R(n)
62. ê_M^c(n) = ẑ(n) − (y_1(n) − y_3(n))
63. ẽ_M^c(n) = z̃(n) − (y_2(n) + y_3(n))
64. ε̂_M^c(n) = ê_M^c(n)/α_M(n)
65. ε̃_M^c(n) = ẽ_M^c(n)/α_M(n)
66. ε_M^{c,I}(n) = 0.5(ε̂_M^c(n) − ε̃_M^c(n))
67. g_M^{c,1}(n) = ŵ_M(n) ε̃_M^c(n)
68. g_M^{c,2}(n) = w̃_M(n) ε̂_M^c(n)
69. g_M^{c,3}(n) = d_M^w(n) ε_M^{c,I}(n)
70. ĉ_M(n) = ĉ_M(n−1) + g_M^{c,1}(n) + g_M^{c,3}(n)
71. c̃_M(n) = c̃_M(n−1) + g_M^{c,2}(n) + g_M^{c,3}(n)

Forward and backward prediction powers α_M^b(−1) and α_M^f(−1) are initialized using an estimate of the input signal power, σ_x². Eqs. (25) define vector partitions.
main vector parameter that is processed by each block. Apart from the 'state' vector variables â_M(n), ã_M(n), b̂_M(n), b̃_M(n), ŵ_M(n), w̃_M(n), ĉ_M(n) and c̃_M(n), four extra local vector variables are used, i.e., d_M^a(n), d_M^b(n), d_M^w(n) and d_M^c(n). These are the differences between the corresponding 'hat' and 'tilde' state parameters, and are defined by Eqs. (3), (29), (53) and (60) of Table 2, respectively. The equations of Table 2 that describe the proposed stabilized FRLS adaptive algorithm can be partitioned to fit into the processing scheme of Fig. 1. Take, for example, the top-left block, characterized by the vector variable â_M(n). Then, Eqs. (1), (5), (7), (10), (13) and (16) of Table 2 may be assigned to this particular processing block.

Fig. 1. Block diagram of the proposed SR FRLS algorithm.
Remark 1. Eq. (29) implies a least-squares filtering interpretation, where the FIR model is parameterized by the transformed parameters vector C_M(n) = ĉ_M(n) − j c̃_M(n) and the desired response signal is set equal to Z(n) = z̃(n) + j ẑ(n). Obviously, the original filter c_M(n) can be recovered from the transformed parameters, as

c_M^R(n) = 0.5(ĉ_M(n) + c̃_M(n)),  (47)
c_M^I(n) = 0.5(ĉ_M(n) − c̃_M(n)).  (48)
In a similar way, the original filtering error, e_M^c(n) = z(n) − y(n), is retrieved from the corresponding transformed filtering errors, as

e_M^{c,R}(n) = 0.5(ê_M^c(n) + ẽ_M^c(n)),  (49)
e_M^{c,I}(n) = 0.5(ê_M^c(n) − ẽ_M^c(n)).  (50)
Remark 2. Despite the fact that the FRLS implements an RLS scheme, the algorithm exhibits poor numerical error performance, which appears even when it is carefully implemented in double precision arithmetic. Severe divergence problems are caused by the intrinsic instability of the FRLS algorithm, and extra care has to be taken to compensate for this undesirable effect. The most promising method for the stabilization of the FRLS algorithm was proposed in [22], which suggested the use of a small amount of computational redundancy, combined with a suitable feedback mechanism, to control the numerical errors. The key quantity is the difference between the instantaneous backward error and its value as determined by the algorithm. It was shown in [20] how this difference can be fed back into the algorithm and considerably improve its numerical performance. Two specific variables are used for this purpose, the a priori backward error e_M^b(n) and the Kalman gain power α_M(n). These are estimated by the complex valued FRLS algorithm as dictated by Eqs. (10) and (14) of Table 1, and defined by

E_M^b(n) = x(n−M) + b_M^H(n−1) x_M(n),
A_M(n) = 1 + w_M^H(n) x_M(n).  (51)

We form the differences

Δ_M^b(n) = E_M^b(n) − e_M^b(n),
Δ_M^α(n) = A_M(n) − α_M(n).  (52)

In infinite precision, these differences should be zero; in practice, however, Δ_M^b(n) grows exponentially while Δ_M^α(n) grows linearly. These errors are utilized as a feedback mechanism to control the error propagation and to provide a new set of variables to be used by the algorithm, as dictated by Eqs. (13) and (17) of Table 1. Tuning variables
s_i^b, i = 1, 2, 3, and s^α are carefully selected so that the overall algorithm remains stable.

In analogy with the complex valued FRLS algorithm, the differences between the backward errors, ẽ_M^b(n) and ê_M^b(n), and the Kalman gain power, α_M(n), and their estimated values have been identified as sensitive stability indicators for the SR FRLS adaptive algorithm of Table 2, where a real valued implementation of the FRLS adaptive algorithm using the strength reduction transform is adopted. These variables are defined as

Ẽ_M^b(n) = x̃(n−M) + b̃_M^T(n−1) x̃_M(n) + d_M^{b,T}(n−1) x_M^R(n),  (53)
Ê_M^b(n) = x̂(n−M) + b̂_M^T(n−1) x̂_M(n) − d_M^{b,T}(n−1) x_M^R(n),  (54)

where d_M^b(n−1) = b̂_M(n−1) − b̃_M(n−1), and

A_M(n) = 1 + 0.5(ŵ_M^T(n) x̂_M(n) + w̃_M^T(n) x̃_M(n)).  (55)

They are alternatively estimated by Eqs. (33), (34) and (45) of Table 2. Once again, we form the differences

Δ̃_M^b(n) = Ẽ_M^b(n) − ẽ_M^b(n),
Δ̂_M^b(n) = Ê_M^b(n) − ê_M^b(n)  (56)

and

Δ_M^α(n) = A_M(n) − α_M(n).  (57)
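The feedback mechanism of Eqs. (43)–(44) of Table 2 simply blends the two redundant evaluations of the same quantity; a scalar sketch (values and names ours):

```python
# E_b: backward error computed through the transversal filter (Table 2, (39)-(40));
# e_b: the same quantity obtained from the Kalman gain (Table 2, (33)-(34)).
# In infinite precision Delta = E_b - e_b is zero; in finite precision the
# difference is fed back with a tuning constant s.

E_b = 0.501
e_b = 0.499
delta_b = E_b - e_b          # stability indicator

def corrected(s):
    """Blended backward error: s = 0 keeps E_b, s = -1 recovers e_b."""
    return E_b + s * delta_b
```

Intermediate choices of s (and the analogous s^α for the Kalman gain power) damp the otherwise exponentially growing indicator Δ^b instead of letting it recirculate unchecked.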
These variables, which should be equal to zero in infinite precision, are used through appropriate feedback to improve the stability behavior of the algorithm, as dictated by Eqs. (43), (44) and (47) of Table 2. Parameters s̃_i^b, ŝ_i^b, i = 1, 2, 3, and s^α are tuning variables that are utilized for the stabilization of the algorithm. The stable performance of the resulting stabilized algorithm is verified via extensive simulations, as demonstrated in Section 5. Notice that the strength reduction technique adopted here implies a close relationship between the backward error indicators utilized by the FRLS algorithm of Table 1 and the SR FRLS
algorithm of Table 2, i.e.,

Δ̂_M^b(n) = Δ_M^{b,R}(n) + Δ_M^{b,I}(n),
Δ̃_M^b(n) = Δ_M^{b,R}(n) − Δ_M^{b,I}(n),  (58)

where Δ_M^{b,R}(n) and Δ_M^{b,I}(n) denote the real and the imaginary part of the variable Δ_M^b(n) defined by Eq. (52).
4.1. Complexity assessment

The computational complexity of the complex valued FRLS algorithm of Table 1 is given by

$C^{classical}_{FRLS} \simeq 8M\,\mathrm{CMUL} + 8M\,\mathrm{CADD} + 2M\,\mathrm{RMUL} + 2M\,\mathrm{RADD}$.   (59)

Here, CMUL, CADD, RMUL and RADD denote the two-operand complex valued multiplication, the two-operand complex valued addition, the two-operand real valued multiplication, and the two-operand real valued addition, respectively. Notice that: (a) multiplication by $\lambda$ is ignored, since it can be realized by a digit shift operation, provided that $\lambda$ is chosen to be a power of two, and (b) real valued divisions are treated as real valued multiplications.

Direct application of the fast complex valued multiplication method to the complex valued FRLS can reduce the number of real multiplications at the expense of an increased number of additions. Consider Eq. (59), which describes the complexity of the original FRLS adaptive algorithm. The equivalent complexity in terms of real valued operations is estimated using the fact that the classical complex valued multiplication requires 4 RMULs and 2 RADDs, while a complex valued addition can be performed with 2 RADDs. Thus, we get

$C^{classical}_{FRLS} = 34M\,\mathrm{RMUL} + 34M\,\mathrm{RADD}$.   (60)
On the other hand, if the direct fast complex valued multiplication is applied, where 3 RMULs and 5 RADDs are engaged, we get

$C^{direct\ SR}_{FRLS} = 26M\,\mathrm{RMUL} + 58M\,\mathrm{RADD}$.   (61)
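For concreteness, the 3-multiplication complex product referenced above can be sketched as follows. This is a minimal illustration of the classic trick; the variable names are ours, not taken from the paper's tables.

```python
# The classic 3-multiplication complex product (a + jb)(c + jd): three real
# multiplications and five real additions/subtractions, instead of the
# classical four multiplications and two additions.
def complex_mul_3rmul(a, b, c, d):
    """Return (real, imag) of (a + jb) * (c + jd) using 3 real multiplies."""
    m1 = c * (a + b)          # RMUL 1, RADD 1
    m2 = a * (d - c)          # RMUL 2, RADD 2
    m3 = b * (c + d)          # RMUL 3, RADD 3
    return m1 - m3, m1 + m2   # RADDs 4 and 5
```

For example, `complex_mul_3rmul(3, 4, 5, 6)` returns `(-9, 38)`, matching $(3+4j)(5+6j) = -9+38j$.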
Comparing Eqs. (60) and (61), we conclude that the direct application of the fast complex valued multiplication reduces the number of real valued multiplications by 23.5%, at the expense of 70.7% extra real valued additions.

The computational complexity of the proposed novel SR FRLS adaptive algorithm of Table 2 is

$C^{proposed\ SR}_{FRLS} \simeq 26M\,\mathrm{RMUL} + 35M\,\mathrm{RADD}$.   (62)

Comparing Eq. (62) with Eq. (60), which gives the equivalent real valued arithmetic complexity of the FRLS algorithm, it is established that the proposed SR FRLS adaptive algorithm requires about 23.5% fewer real valued multiplications, at the expense of 2.9% extra real valued additions. A comparison of the relative computational requirements among all FRLS adaptive algorithms discussed in this paper is provided in Table 3.

The amount of memory required for the implementation of the proposed SR FRLS adaptive algorithm of Table 2 is slightly increased compared to the memory requirements of the direct implementation of the FRLS algorithm of Table 1. Indeed, the complex valued FRLS adaptive scheme of Table 1 requires approximately 12M real valued memory cells, while the proposed real valued SR FRLS adaptive algorithm requires approximately 16M real valued memory cells. This figure can be further reduced to 14M real valued memory cells, at the expense of a small increase in the number of real valued additions (i.e., 37M instead of 35M real valued additions).

Achieving computational savings without sacrificing performance is of primary interest in the design of high-processing-rate adaptive systems. Implementation of the proposed FRLS algorithm
Table 3
Comparison of the relative computational requirements among three different realizations of the complex valued FRLS algorithm

Realization of the FRLS algorithm   Relative RMUL   Relative RADD
Classical                           100%            100%
Direct SR                           77.5%           171%
Proposed SR                         77.5%           103%
on a typical DSP processor with a single-cycle MAC (multiply and accumulate) unit requires 20% fewer cycles per processed sample, compared to the implementation of the original complex valued FRLS scheme. In contrast, if the fast multiplication is applied directly to each complex valued operation, the processing rate deteriorates by a factor of about 70%. When multiple DSP units are utilized for the implementation of the adaptive processing task, the proposed FRLS method results in hardware savings [23]. When implementation of the adaptive algorithm in an ASIC is under consideration, the proposed method could result in about a 20% reduction of the power dissipation and the silicon area of the fabricated circuit, since hardware multiplier units are an order of magnitude more complex than hardware addition/subtraction units [24,9,16].
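The operation counts of Eqs. (59)-(62) and the quoted percentage figures can be cross-checked with a few lines of arithmetic (a sketch; the per-sample counts 8M CMUL, 8M CADD, 2M RMUL and 2M RADD are taken from Eq. (59)):

```python
# Cross-check of the operation counts in Eqs. (59)-(62): per sample the FRLS
# recursions cost 8M CMULs, 8M CADDs, 2M RMULs and 2M RADDs; a CMUL costs
# 4 RMUL + 2 RADD classically, or 3 RMUL + 5 RADD with the fast method.
M = 100
cmul, cadd = 8 * M, 8 * M
classical = (4 * cmul + 2 * M, 2 * cmul + 2 * cadd + 2 * M)   # Eq. (60): (34M, 34M)
direct_sr = (3 * cmul + 2 * M, 5 * cmul + 2 * cadd + 2 * M)   # Eq. (61): (26M, 58M)
proposed = (26 * M, 35 * M)                                   # Eq. (62)
mul_saving = 100 * (classical[0] - proposed[0]) / classical[0]   # about 23.5%
add_penalty = 100 * (proposed[1] - classical[1]) / classical[1]  # about 2.9%
```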
5. Simulation results

The performance of the proposed FRLS adaptive algorithm for complex valued adaptive filtering and system identification is illustrated by means of computer simulations. Two important adaptive filtering applications are considered: (a) FIR system identification, and (b) equalization of a telecommunication channel using a linear equalizer.

5.1. Adaptive FIR system identification

The input time series used in the experiment is a complex valued stationary AR process of order two, i.e., $x(n) = a_1 x(n-1) + a_2 x(n-2) + \eta(n)$. The driving signal $\eta(n)$ is unit variance complex valued white Gaussian noise. This AR process is used as input to an FIR system of order $M = 100$. The filter taps are randomly selected from a complex valued, uniformly distributed random process. At the output of the FIR system, complex valued white Gaussian noise is added, resulting in an SNR of about 30 dB. Thus, the noise-contaminated output of the system is given by

$y(n) = \sum_{i=1}^{100} c_i\, x(n-i+1) + \nu(n)$.   (63)
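A minimal sketch of this experimental setup follows. The AR coefficients $a_1$, $a_2$ are not listed numerically in the text, so illustrative values are assumed:

```python
import numpy as np

# Sketch of the Section 5.1 setup: AR(2) input, unknown order-100 FIR system,
# output observed at roughly 30 dB SNR, as in Eq. (63). The AR coefficients
# a1, a2 below are illustrative assumptions (the text does not list them).
rng = np.random.default_rng(1)
N, M = 5000, 100
a1, a2 = 0.5, -0.3                               # assumed AR(2) parameters
eta = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)  # unit-variance CWGN
x = np.zeros(N, dtype=complex)
for n in range(2, N):
    x[n] = a1 * x[n - 1] + a2 * x[n - 2] + eta[n]

c = rng.uniform(-1, 1, M) + 1j * rng.uniform(-1, 1, M)  # unknown FIR taps
y_clean = np.convolve(x, c)[:N]                  # y(n) = sum_i c_i x(n-i+1)
p_sig = np.mean(np.abs(y_clean) ** 2)
p_noise = p_sig / 10 ** (30 / 10)                # noise power for a 30 dB SNR
nu = np.sqrt(p_noise / 2) * (rng.normal(size=N) + 1j * rng.normal(size=N))
y = y_clean + nu                                 # noisy observation
```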
The eigenvalue spread of the sampled autocorrelation matrix of the input signal is controlled by the parameters of the AR process. In our case, the AR model parameters are selected so that the resulting condition number of the sampled autocorrelation matrix is $\chi_R = O(10^2)$. The unknown FIR system is estimated using the classical FRLS adaptive algorithm of Table 1 and the SR FRLS adaptive algorithm of Table 2. The exponential forgetting parameter is set equal to $\lambda = 0.997$. The filtering error power for each case is computed by averaging the squared instantaneous filtering errors. Since the input data are stationary, ensemble averaging is replaced by time averaging over an exponentially decaying window with effective memory equal to 128 time instants. Simulation results are presented in Fig. 2, where the mean squared filtering error $E(e^c_M(n)\, e^{c*}_M(n))$ is plotted versus time.

The output of the SR FRLS algorithm is mathematically equivalent to that of the original FRLS algorithm, since the filtering errors $e^{c,R}_M(n)$, $e^{c,I}_M(n)$ and the filter coefficients $c^R_M(n)$, $c^I_M(n)$ can be retrieved from the transformed parameters $\hat e^c_M(n)$, $\tilde e^c_M(n)$ and $\hat c_M(n)$, $\tilde c_M(n)$, as dictated by Eqs. (47)–(50). The numerical equivalence of the two algorithms is demonstrated by computing the difference signals

$\Delta e^{c,R}_M(n) = e^{c,R}_M(n) - 0.5\left(\hat e^c_M(n) + \tilde e^c_M(n)\right)$,   (64)

$\Delta e^{c,I}_M(n) = e^{c,I}_M(n) - 0.5\left(\hat e^c_M(n) - \tilde e^c_M(n)\right)$,   (65)

$\Delta c^R_M(n) = \left\| c^R_M(n) - 0.5\left(\hat c_M(n) + \tilde c_M(n)\right) \right\|_1$,   (66)

$\Delta c^I_M(n) = \left\| c^I_M(n) - 0.5\left(\hat c_M(n) - \tilde c_M(n)\right) \right\|_1$.   (67)

These signals are plotted in Fig. 3. Clearly, the FRLS algorithm and the proposed SR FRLS algorithm are numerically equivalent, since the difference signals are very close to the machine precision constant.
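The SR ("hat"/"tilde") variables are the sum and difference of the real and imaginary parts, and Eqs. (64)-(67) rely on inverting this map. A sketch with a stand-in random complex error signal:

```python
import numpy as np

# The SR transform maps a complex signal to its sum ("hat") and difference
# ("tilde") of real and imaginary parts; the inverse map underlies the
# difference signals of Eqs. (64)-(67).
rng = np.random.default_rng(0)
e = rng.normal(size=8) + 1j * rng.normal(size=8)   # stand-in for e^c_M(n)
e_hat = e.real + e.imag                            # "hat": sum variable
e_til = e.real - e.imag                            # "tilde": difference variable
e_R = 0.5 * (e_hat + e_til)                        # recovered real part
e_I = 0.5 * (e_hat - e_til)                        # recovered imaginary part
assert np.allclose(e_R, e.real) and np.allclose(e_I, e.imag)
```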
Fig. 2. Adaptive FIR system identification: plot of the Mean Squared Error (MSE), $E(e^c_M(n)\, e^{c*}_M(n))$, versus time; (a) convergence behavior and (b) long run behavior.
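The learning curves of Fig. 2 are obtained by smoothing the squared errors with the exponentially decaying time average mentioned above. A sketch, assuming a leak factor of $1 - 1/128$ for an effective memory of 128 samples:

```python
import numpy as np

# Exponentially decaying time average of the squared error; an effective
# memory of `memory` samples corresponds to a leak factor rho = 1 - 1/memory.
def smoothed_mse(err, memory=128):
    rho = 1.0 - 1.0 / memory
    out = np.empty(len(err))
    acc = 0.0
    for n, e in enumerate(err):
        acc = rho * acc + (1.0 - rho) * np.abs(e) ** 2
        out[n] = acc
    return out

# A constant |e(n)| = 1 error converges to 0 dB on the learning curve.
mse_db = 10 * np.log10(smoothed_mse(np.ones(1000)))
```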
Simulation results allow us to conclude that both algorithms exhibit similar stabilization characteristics. The error feedback signals $\Delta^{b,R}_M(n)$, $\Delta^{b,I}_M(n)$ and $\Delta^{\alpha}_M(n)$, utilized by the FRLS algorithm, are depicted in the first column of Fig. 4. The error feedback signals $\hat\Delta^b_M(n)$, $\tilde\Delta^b_M(n)$ and $\Delta^{\alpha}_M(n)$, utilized by the SR FRLS algorithm, are depicted in the second column of Fig. 4. Clearly, both algorithms perform well when feedback stabilization is utilized.

5.2. Adaptive channel equalization

In this experiment, the proposed SR FRLS adaptive algorithm is used for the equalization of a telecommunication channel [2,3,5]. QPSK signalling is considered, and the ISI is simulated by passing the transmitted signal through a discrete-time linear channel. At the receiver, a linear symbol-spaced equalizer is engaged in order to cancel the ISI introduced by the channel. Although linear, symbol-spaced equalizers are considered here, the proposed SR FRLS methodology can be extended to the case of fractionally spaced and/or decision feedback equalizers, where multichannel SR FRLS algorithms can be applied. Several communication channels are considered for the evaluation of the SR FRLS linear equalizer. Following [8], channels $h_1$ and $h_2$ are considered, with impulse responses

$h_1 = \left[0.2194 e^{j\phi_0},\ e^{j\phi_1},\ 0.2194 e^{j\phi_3}\right]^T$
Fig. 3. Comparison between the output of the classical complex valued FRLS and the proposed SR FRLS adaptive algorithm. Difference signals $\Delta e^{c,R}_M(n)$, $\Delta e^{c,I}_M(n)$, $\Delta c^R_M(n)$ and $\Delta c^I_M(n)$ are plotted versus time.
and

$h_2 = \left[0.227 e^{j\phi_0},\ 0.406 e^{j\phi_1},\ 0.688 e^{j\phi_2},\ 0.406 e^{j\phi_3},\ 0.227 e^{j\phi_4}\right]^T$.

Here, the $\phi_i$ are random phases. Moreover, channel $h_3$ is also considered, with impulse response [25]

$h_3 = \left[0.1897 e^{j\phi_0},\ 0.5097 e^{j\phi_1},\ 0.6847 e^{j\phi_2},\ 0.4600 e^{j\phi_3},\ 0.1545 e^{j\phi_4}\right]^T$.

Finally, a time-varying channel $h_4(n)$ is considered, with impulse response given by the first order model [26,27,29]

$h_4(n) = \beta\, h_4(n-1) + q(n)$.

Here, $q(n)$ is a zero mean complex valued random signal having a diagonal covariance matrix, i.e., $E(q(n)\, q^H(n)) = \sigma^2_q I$. The model parameters are set equal to $\beta = 0.9999$, $\sigma^2_q = 0.01$, and $h_4(1) = h_3$. In all cases, additive complex white Gaussian noise is added to the received signal, resulting in an SNR equal to 20 dB.

The stabilized SR FRLS adaptive algorithm is employed for the equalization of all four channels tested. In all cases, feedback stabilization is employed, where the stabilization parameters are all set equal to $\sigma^b_i = \hat\sigma^b_i = \tilde\sigma^b_i = 0.5$, $i = 1, 2, 3$, and $\sigma^{\alpha} = 0.5$. The equalizers are initially trained using a known data sequence consisting of 100 data samples. After the training period, the equalizers are switched to a decision-directed mode. A set of 100 independent trials is run for each experiment, and the ensemble mean squared error (MSE) is estimated for each case. Details of each experiment, such as the equalizer length $M$, the decision delay
Fig. 4. Stability issues. Signals utilized for the error feedback stabilization of the classical complex valued FRLS (first column) and the proposed SR FRLS adaptive algorithm (second column). Notice that all signals of interest are kept at a low level, in the range of the machine precision.
$D$, and the forgetting factor $\lambda$, are summarized below:

Channel $h_1$: $M = 9$,  $D = 5$,  $\lambda = 0.98$;
Channel $h_2$: $M = 21$, $D = 11$, $\lambda = 0.985$;
Channel $h_3$: $M = 21$, $D = 11$, $\lambda = 0.985$;
Channel $h_4$: $M = 21$, $D = 11$, $\lambda = 0.985$.
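A sketch of the channel $h_3$ training experiment under these settings follows. The SR FRLS recursions themselves are not reproduced; a plain least-squares fit over the 100 training symbols stands in for the RLS-trained equalizer, and the random phases are drawn here for illustration:

```python
import numpy as np

# Sketch of the channel-h3 equalization experiment (M = 21, D = 11, 100
# training symbols, 20 dB SNR). A least-squares fit over the training block
# stands in for the SR FRLS-trained linear equalizer.
rng = np.random.default_rng(2)
N, M, D = 2000, 21, 11
sym = ((rng.integers(0, 2, N) * 2 - 1)
       + 1j * (rng.integers(0, 2, N) * 2 - 1)) / np.sqrt(2)   # unit-power QPSK
phi = rng.uniform(0, 2 * np.pi, 5)                            # random phases
h3 = np.array([0.1897, 0.5097, 0.6847, 0.4600, 0.1545]) * np.exp(1j * phi)
r = np.convolve(sym, h3)[:N]                                  # ISI channel output
p_noise = np.mean(np.abs(r) ** 2) / 10 ** (20 / 10)           # 20 dB SNR
r = r + np.sqrt(p_noise / 2) * (rng.normal(size=N) + 1j * rng.normal(size=N))

# M-tap regressor rows [r(n), r(n-1), ..., r(n-M+1)]; desired = delayed symbols
X = np.array([r[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
d = sym[np.arange(M - 1, N) - D]
w = np.linalg.lstsq(X[:100], d[:100], rcond=None)[0]          # training-mode fit
```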
The learning curve for each experiment is depicted in Fig. 5. Notice the striking difference between the results reported here and those of [8], as far as the number of training symbols is concerned. This is due to the fact that the convergence rate of RLS-type algorithms is independent of the eigenvalue distribution of the input data signals, a property that LMS-type algorithms do not possess. No numerical instability occurred during the simulation of the SR FRLS adaptive equalization algorithm for any of the four channels tested.
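The time-varying channel $h_4(n)$ of the last experiment can be simulated directly from its first order model (a sketch; zero initial phases are assumed, so that $h_4(1) = h_3$ with $\phi_i = 0$):

```python
import numpy as np

# First order Markov model for the time-varying channel:
# h4(n) = beta * h4(n-1) + q(n), with beta = 0.9999 and sigma_q^2 = 0.01,
# as quoted in the text. Zero initial phases are assumed for h4(1) = h3.
rng = np.random.default_rng(3)
beta, sigma_q2 = 0.9999, 0.01
h4 = np.array([0.1897, 0.5097, 0.6847, 0.4600, 0.1545], dtype=complex)
track = []
for n in range(1000):
    q = np.sqrt(sigma_q2 / 2) * (rng.normal(size=5) + 1j * rng.normal(size=5))
    h4 = beta * h4 + q        # channel taps drift as a near-unit-root process
    track.append(h4.copy())
```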
6. Conclusions

A novel implementation of the fast recursive least squares algorithm for complex adaptive filtering has been presented. The proposed method results in a reduced computational complexity adaptive scheme, where all complex valued operations are replaced by real valued counterparts. A set of auxiliary signals and filtering parameters has
been introduced, which are propagated through the algorithm and updated by the pertinent recursive equations, in accordance with the strength reduction real valued arithmetic. In this way, an efficient adaptive scheme has been established, based on novel three-term recursions for the time updating of the pertinent parameters. The proposed algorithm is implemented using real valued
Fig. 5. Adaptive channel equalization: Mean Squared Error plot $E(e^c_M(n)\, e^{c*}_M(n))$ versus time (convergence behavior and long run behavior), for four different communication channels: (a) channel $h_1$; (b) channel $h_2$; (c) channel $h_3$; (d) channel $h_4$.
arithmetic only, while reducing the number of required real valued multiplications by 23.5%, at the expense of a marginal 2.9% increase in the number of real valued additions. This provides a significant advantage over the direct application of a fast complex valued multiplication method to the original FRLS scheme, where the reduction in the number of real valued multiplications is overshadowed by a 70% increase in the required number of additions. The proposed scheme is
suitable for multiprocessor or VLSI ASIC implementation, achieving a reduction in the number of required DSP units, or in the power dissipation and the silicon area of the fabricated circuit. The performance of the proposed FRLS algorithm has been investigated by computer simulations, in the context of adaptive system identification and adaptive channel equalization. Simulation results indicate that both the original complex valued FRLS algorithm and the proposed efficient FRLS implementation using real valued arithmetic attain similar convergence and stabilization properties. Although single-channel complex adaptive filtering has been considered, the proposed SR FRLS methodology can be extended to the case of multichannel complex adaptive filtering [28,27].
Appendix A. SR Kalman gain updating

It remains to show how the Kalman gain parameters are updated using the proposed SR method. Consider the order updating equation of the Kalman gain vector, Eq. (7) of Table 1. Let us define

$w_M(n) = w^R_M(n) + j w^I_M(n)$

and

$k^f_{M+1}(n) = k^{f,R}_{M+1}(n) + j k^{f,I}_{M+1}(n)$,

where $k^{f,R}_{M+1}(n)$ and $k^{f,I}_{M+1}(n)$ represent the real and the imaginary part of the parameter $k^f_{M+1}(n)$. Then, Eq. (7) of Table 1 can be expanded as

$w^R_{M+1}(n) + j w^I_{M+1}(n) = \begin{bmatrix} 0 \\ w^R_M(n-1) \end{bmatrix} + j \begin{bmatrix} 0 \\ w^I_M(n-1) \end{bmatrix} + \begin{bmatrix} 1 \\ a^R_M(n-1) \end{bmatrix} k^{f,R}_{M+1}(n) - \begin{bmatrix} 0 \\ a^I_M(n-1) \end{bmatrix} k^{f,I}_{M+1}(n) + j \left( \begin{bmatrix} 1 \\ a^R_M(n-1) \end{bmatrix} k^{f,I}_{M+1}(n) + \begin{bmatrix} 0 \\ a^I_M(n-1) \end{bmatrix} k^{f,R}_{M+1}(n) \right)$.   (68)

Let us consider the transformed Kalman gain parameters, defined by Eqs. (38) and (39), in conjunction with Eq. (68). Thus, we get

$\hat w_{M+1}(n) = \begin{bmatrix} 0 \\ \hat w_M(n-1) \end{bmatrix} + \begin{bmatrix} k^{f,R}_{M+1}(n) + k^{f,I}_{M+1}(n) \\ \hat a_M(n-1)\, k^{f,R}_{M+1}(n) + \tilde a_M(n-1)\, k^{f,I}_{M+1}(n) \end{bmatrix}$   (69)

and

$\tilde w_{M+1}(n) = \begin{bmatrix} 0 \\ \tilde w_M(n-1) \end{bmatrix} + \begin{bmatrix} k^{f,R}_{M+1}(n) - k^{f,I}_{M+1}(n) \\ \tilde a_M(n-1)\, k^{f,R}_{M+1}(n) - \hat a_M(n-1)\, k^{f,I}_{M+1}(n) \end{bmatrix}$.   (70)

Define the vector

$W_M(n) = \tilde w_M(n) - j \hat w_M(n)$.   (71)

Using Eqs. (69) and (70), Eq. (71) is updated as follows:

$W_{M+1}(n) = \begin{bmatrix} 0 \\ W_M(n) \end{bmatrix} + \begin{bmatrix} 1 - j \\ \tilde a_M(n-1) - j \hat a_M(n-1) \end{bmatrix} \left( k^{f,R}_{M+1}(n) - j k^{f,I}_{M+1}(n) \right)$.   (72)

Finally, application of the strength reduction operation, Eq. (16), results in the following couple of equations:

$\hat w_{M+1}(n) = \begin{bmatrix} 0 \\ \hat w_M(n-1) \end{bmatrix} + \begin{bmatrix} 1 \\ \hat a_M(n-1) \end{bmatrix} \hat k^f_{M+1}(n) - \begin{bmatrix} 0 \\ \hat a_M(n-1) - \tilde a_M(n-1) \end{bmatrix} k^{f,I}_{M+1}(n)$   (73)

and

$\tilde w_{M+1}(n) = \begin{bmatrix} 0 \\ \tilde w_M(n-1) \end{bmatrix} + \begin{bmatrix} 1 \\ \tilde a_M(n-1) \end{bmatrix} \tilde k^f_{M+1}(n) - \begin{bmatrix} 0 \\ \hat a_M(n-1) - \tilde a_M(n-1) \end{bmatrix} k^{f,I}_{M+1}(n)$.   (74)

The auxiliary parameters $\hat a_M(n)$ and $\tilde a_M(n)$ introduced above are related to the forward predictor $a_M(n) = a^R_M(n) + j a^I_M(n)$ as

$\hat a_M(n) = a^R_M(n) + a^I_M(n), \qquad \tilde a_M(n) = a^R_M(n) - a^I_M(n)$.   (75)

Similarly, we get

$\hat k^f_{M+1}(n) = k^{f,R}_{M+1}(n) + k^{f,I}_{M+1}(n), \qquad \tilde k^f_{M+1}(n) = k^{f,R}_{M+1}(n) - k^{f,I}_{M+1}(n)$.   (76)

The order downdating recursion associated with the recursive updating of the Kalman gain vector can be treated in a similar way. Let us consider the dual order updating recursion for the Kalman gain vector, i.e.,

$w_{M+1}(n) = \begin{bmatrix} w_M(n) \\ 0 \end{bmatrix} + \begin{bmatrix} b_M(n-1) \\ 1 \end{bmatrix} k^b_{M+1}(n)$.   (77)

Notice that the above equation is implicitly expressed by Eqs. (8) and (9) of Table 1. Working in a similar way as before, we get the set of recursive equations

$\hat w_{M+1}(n) = \begin{bmatrix} \hat w_M(n) \\ 0 \end{bmatrix} + \begin{bmatrix} \hat b_M(n-1) \\ 1 \end{bmatrix} \hat k^b_{M+1}(n) - \begin{bmatrix} \hat b_M(n-1) - \tilde b_M(n-1) \\ 0 \end{bmatrix} k^{b,I}_{M+1}(n)$   (78)

and

$\tilde w_{M+1}(n) = \begin{bmatrix} \tilde w_M(n) \\ 0 \end{bmatrix} + \begin{bmatrix} \tilde b_M(n-1) \\ 1 \end{bmatrix} \tilde k^b_{M+1}(n) - \begin{bmatrix} \hat b_M(n-1) - \tilde b_M(n-1) \\ 0 \end{bmatrix} k^{b,I}_{M+1}(n)$.   (79)

Here, we have

$\hat b_M(n) = b^R_M(n) + b^I_M(n), \qquad \tilde b_M(n) = b^R_M(n) - b^I_M(n)$,   (80)

where $b^R_M(n)$ and $b^I_M(n)$ represent the real and the imaginary part of the backward predictor $b_M(n)$. Moreover,

$\hat k^b_{M+1}(n) = k^{b,R}_{M+1}(n) + k^{b,I}_{M+1}(n), \qquad \tilde k^b_{M+1}(n) = k^{b,R}_{M+1}(n) - k^{b,I}_{M+1}(n)$,   (81)

where $k^{b,R}_{M+1}(n)$ and $k^{b,I}_{M+1}(n)$ are the real and the imaginary part of the parameter $k^b_{M+1}(n)$. The order updating recursions, as well as the order downdating recursions, that are utilized for the efficient time updating of the Kalman gain vector using the SR transform, are listed in Eqs. (16)–(35) of Table 2. Eqs. (15), (24), (35), (45) and (57) of Table 2, which are used for the computation of the 'power' of the forward predictor, the Kalman gain and the backward predictor, result from applying the SR updating recursions developed so far to the definition equations of the pertinent 'power' parameters.
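The SR order updating pair of Eqs. (73) and (74) can be verified numerically against the complex valued recursion of Eq. (7) (a sketch with random data; note that the last term of the two updates is a single shared product, which is where the multiplication savings come from):

```python
import numpy as np

# Numerical check of the SR order-update pair, Eqs. (73)-(74), against the
# complex valued Kalman gain recursion w_{M+1} = [0; w_M] + [1; a_M] k^f.
rng = np.random.default_rng(4)
M = 6
w = rng.normal(size=M) + 1j * rng.normal(size=M)   # w_M(n-1)
a = rng.normal(size=M) + 1j * rng.normal(size=M)   # a_M(n-1)
kf = complex(rng.normal(), rng.normal())           # k^f_{M+1}(n)

w_next = np.concatenate(([0], w)) + np.concatenate(([1], a)) * kf  # Eq. (7)

# SR variables: hat = real + imag, tilde = real - imag
w_hat, w_til = w.real + w.imag, w.real - w.imag
a_hat, a_til = a.real + a.imag, a.real - a.imag
kf_hat, kf_til = kf.real + kf.imag, kf.real - kf.imag

shared = np.concatenate(([0], a_hat - a_til)) * kf.imag  # product shared by (73), (74)
w_hat_next = np.concatenate(([0], w_hat)) + np.concatenate(([1], a_hat)) * kf_hat - shared
w_til_next = np.concatenate(([0], w_til)) + np.concatenate(([1], a_til)) * kf_til - shared

assert np.allclose(w_hat_next, w_next.real + w_next.imag)
assert np.allclose(w_til_next, w_next.real - w_next.imag)
```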
References

[1] M. Bellanger, Adaptive Digital Filters and Signal Analysis, Marcel Dekker, New York, 1987.
[2] S. Benedetto, E. Biglieri, Principles of Digital Transmission, with Wireless Applications, Kluwer Academic Publishers, Dordrecht, 1999.
[3] S. Haykin, Adaptive Filter Theory, second ed., Prentice-Hall, Englewood Cliffs, NJ, 1991.
[4] N. Kalouptsidis, S. Theodoridis (Eds.), Adaptive System Identification and Signal Processing Algorithms, Prentice-Hall, Englewood Cliffs, NJ, 1993.
[5] J. Proakis, Digital Communications, third ed., McGraw-Hill, New York, 1995.
[6] R. Baghaie, T. Laakso, Implementation of low-power CDMA RAKE receivers using strength reduction transformation, in: Proceedings of the IEEE Nordic Signal Processing Symposium, Denmark, 1998, pp. 169–172.
[7] D. Bull, Efficient IQ filter structure for use in adaptive equalisation, Electron. Lett. 30 (24) (November 1994) 2018–2019.
[8] R. Perry, D. Bull, A. Nix, Efficient adaptive complex filtering algorithm with application to channel equalisation, IEE Proc.-Vis. Image Signal Process. 146 (2) (February 1999) 57–64.
[9] N. Shanbhag, M. Goel, Low-power adaptive filter architectures and their application to 51.84 Mb/s ATM-LAN, IEEE Trans. Signal Process. 45 (5) (May 1997) 1276–1290.
[10] A. Lamagna, Fast computer algebra, IEEE Comput. 15 (9) (September 1982) 43–56.
[11] S. Winograd, Arithmetic Complexity of Computations, SIAM, Philadelphia, 1980.
[12] A. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, R. Brodersen, Optimizing power using transformations, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 14 (1) (January 1995) 12–31.
[13] A. Chandrakasan, R. Brodersen, Minimizing power consumption in digital CMOS circuits, Proc. IEEE 83 (4) (April 1995) 498–523.
[14] K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, Wiley, New York, 1999.
[15] P. Pirsch, Architectures for Digital Signal Processing, Wiley, New York, 1996.
[16] N. Shanbhag, Algorithmic transformation techniques for low-power wireless VLSI systems design, Internat. J. Wireless Informat. Networks 5 (2) (1998) 147–171.
[17] G. Carayannis, D. Manolakis, N. Kalouptsidis, A fast sequential algorithm for least-squares filtering and prediction, IEEE Trans. Acoust. Speech Signal Process. 31 (12) (December 1983) 1394–1402.
[18] J. Cioffi, T. Kailath, Fast recursive LS transversal filters for adaptive processing, IEEE Trans. Acoust. Speech Signal Process. 32 (4) (April 1984) 304–337.
[19] A. Wenzler, E. Luder, New structures for complex multipliers and their noise analysis, in: Proceedings of the IEEE International Symposium on Circuits and Systems, ISCAS-1995, Seattle, pp. 1431–1435.
[20] D. Slock, T. Kailath, Numerically stable fast transversal filters for recursive least squares adaptive filtering, IEEE Trans. Signal Process. 39 (1) (January 1991) 92–114.
[21] G. Glentis, K. Berberidis, S. Theodoridis, Efficient least squares adaptive algorithms for FIR transversal filtering: a unified view, IEEE Signal Process. Mag. 16 (4) (April 1999) 13–42.
[22] J. Botto, G. Moustakides, Stabilizing the fast Kalman algorithms, IEEE Trans. Acoust. Speech Signal Process. 37 (9) (September 1989) 1342–1348.
[23] J. Sanchez, H. Barral, Multiprocessor implementation models for adaptive algorithms, IEEE Trans. Signal Process. 44 (9) (September 1996) 2319–2331.
[24] J. Diguet, D. Chillet, O. Sentieys, A framework for high level estimations of signal processing VLSI implementations, J. VLSI Signal Process. Syst. Signal Image Video Technol. 25 (July 2000).
[25] T. Vaidis, C. Weber, Block adaptive techniques for channel identification and data demodulation over band-limited channels, IEEE Trans. Comm. 46 (2) (February 1998) 232–243.
[26] E. Eweda, Comparison of RLS, LMS and Sign algorithms for tracking randomly time-varying channels, IEEE Trans. Signal Process. 42 (11) (November 1994) 2937–2944.
[27] D. Slock, L. Chisci, H. Lev-Ari, T. Kailath, Modular and numerically stable fast transversal filters for multichannel and multiexperiment RLS, IEEE Trans. Signal Process. 40 (4) (April 1992) 784–802.
[28] G. Glentis, N. Kalouptsidis, Fast adaptive algorithms for multichannel filtering and system identification, IEEE Trans. Signal Process. 40 (10) (October 1992) 2433–2458.
[29] L. Lindbom, M. Sternad, A. Ahlen, Tracking of time-varying mobile radio channels—Part I: The Wiener LMS algorithm, IEEE Trans. Comm. 49 (12) (December 2001) 2207–2217.