Frequency domain quaternion adaptive filters: Algorithms and convergence performance




Author's Accepted Manuscript — www.elsevier.com/locate/sigpro

PII: S0165-1684(16)30294-8
DOI: http://dx.doi.org/10.1016/j.sigpro.2016.11.002
Reference: SIGPRO6302

To appear in: Signal Processing
Received date: 30 April 2016
Revised date: 31 August 2016
Accepted date: 1 November 2016

Cite this article as: Francesca Ortolani, Danilo Comminiello and Aurelio Uncini, Frequency Domain Quaternion Adaptive Filters: Algorithms and Convergence Performance, Signal Processing, http://dx.doi.org/10.1016/j.sigpro.2016.11.002

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Frequency Domain Quaternion Adaptive Filters: Algorithms and Convergence Performance

Francesca Ortolani, Danilo Comminiello and Aurelio Uncini
Dept. of Information Engineering, Electronics and Telecommunications, University of Rome "Sapienza", Rome, Italy
[email protected], [email protected], [email protected]

Abstract

Recently, adaptive filtering was extended to quaternion-valued systems. Quaternion-valued algorithms exhibit improved geometrical properties compared with real- and complex-valued algorithms. Moreover, working in the frequency domain allows a fast execution along with a good convergence performance. In this work, we propose three different quaternion-valued adaptive algorithms operating in the frequency domain. Convergence properties are also analyzed in detail: the step size stability range is obtained in relation to the eigenvalues of the input autocorrelation matrix, and the Excess Mean Square Error (EMSE) is expressed in relation to the algorithm parameters. Finally, simulations support the proposal.

Keywords: Adaptive Filters, Convergence, Frequency Domain, Hypercomplex, Quaternions, Quaternion-Valued Adaptive Filtering.

1. Introduction

Adaptive filtering finds use in many different fields. It is customary to see adaptive filters embedded in RADAR and SONAR applications, weather forecasting, seismology, control applications (navigation, tracking and guidance), audio processing applications (e.g. audio restoration, immersive audio) and so on. Specifically, adaptive filters can be employed in noise, interference and echo cancelling, channel equalization, system identification and inverse system modeling. The work presented in this paper extends the use of adaptive filters to hypercomplex numbers, in particular quaternions. Traditionally, quaternions found application in electromagnetism, quantum physics and relativity. Indeed, it is possible to take advantage of their compact notation to express Maxwell's equations or the atomic orbital functions in a more accessible form. Recent applications, however, concern aeronavigation, kinematics, 3D graphics and image processing in general (e.g. color template matching). In effect, hypercomplex algebra allows the description of a wide variety of geometrical objects (points, lines, planes, hyperplanes, spheres, hyperspheres, ...), thus simplifying the treatment of many physical problems. Within the hypercomplex number subfields, multidimensional filtering makes it possible to process data according to their typical space dimensions, thus obtaining robust and coherent results. These features derive from considering and exploiting the correlation between the quaternion dimensions.

It has only been in the last decade that adaptive filtering was extended to quaternions [13, 27, 28, 32]. Algorithms such as Quaternion Least Mean Square (QLMS) [1] or Quaternion Recursive Least Squares (QRLS) [3] operate adaptation in the time domain at each time instant n, correspondingly with each new input sample. The downsides of these approaches are the time-variant impulse response of the adaptive filter and the time-domain convolution, which requires a high computational effort for long-tapped filters. In acoustic applications, because of the long-lasting tail of reflections to handle, the adaptive filter is designed to have a long impulse response, that is, a long memory is needed, thus resulting in a significant increase in the computational complexity of the algorithm [4, 5]. In such a case, time-domain processing is not recommended. A wiser approach to designing the filter combines a block implementation of an FIR filter and transform-domain processing. The algorithms presented in this paper operate weight adaptation in the frequency domain. This can be achieved efficiently with the use of the Fast Fourier Transform (FFT), which allows a fast computation of convolutions and cross-correlations.
Another reason to carry out frequency-domain adaptive filtering resides in the feasibility of improving the convergence performance of the algorithms by taking advantage of power normalization. Effectively, it is easier to get a lower eigenvalue disparity in the input signal autocorrelation matrix with frequency-domain algorithms than with their time-domain counterparts [6, 7]. The eigenvalue disparity, also known as eigenspread, determines the convergence rate of the algorithm, so the physical meaning of the eigenvalues, responsible for the algorithm's natural modes, should not be ignored. The eigenvalues represent the power of each frequency element (frequency bin). It is possible to compensate for power variation by means of power normalization, as shown in Section 3.3. In frequency-domain algorithms, the orthogonality property of the Discrete Fourier Transform (DFT) produces a diagonal input autocorrelation matrix as the transformation length tends to infinity. In other words, the DFT/FFT produces low-correlated signals. Consequently, when power normalization is operated, a more uniform convergence rate can be obtained easily.

The algorithms we present in this paper are derived from Block QLMS (read [5] for the real-valued Block LMS). The effort in our work consists in finding a formulation of the frequency-domain block algorithms as close as possible to Block QLMS and studying the convergence properties of these algorithms. Recently, a simple multichannel frequency-domain quaternion-valued algorithm was presented in [8]; that algorithm is based on the quaternion-valued LMS [1]. In our work we propose the Overlap-Save Quaternion Frequency Domain Adaptive Filter (OS-QFDAF) and its modifications (see [9] for the real- and complex-valued versions of these algorithms): the Circular Convolution Quaternion Frequency Domain Adaptive Filter (CC-QFDAF) and the quaternion-valued transform-domain Sliding Window algorithm, a particular case of the overlap-save algorithm with block length L = 1. All the algorithms can include an adaptive step size thanks to power normalization.

This paper is organized as follows: Section 2 gives a short introduction to quaternion algebra and its fundamental properties; the Overlap-Save Quaternion Frequency Domain Adaptive Filter (OS-QFDAF) is presented in Section 3. Additional transform-domain block algorithms are summarized in Appendix B. Sections 4 and 5 are dedicated to the analysis of the computational cost and the convergence properties of the OS-QFDAF algorithm, respectively. Finally, results from simulations are reported in Section 6.

2. Working with Quaternions

Before starting to use quaternions, it is advisable to take a look at quaternion algebra and its most important properties, in order to easily comprehend the computations in this article and attain awareness of the topic.

2.1. Fundamentals of Quaternion Algebra

Quaternions are a four-dimensional normed division algebra belonging to a subgroup of the hypercomplex numbers [10-14]. Specifically, quaternions form a geometric algebra (in the sense of Clifford algebras). A quaternion has 4 components: one scalar component (real part) and 3 imaginary components (vector part):

q = q0 + q1 i + q2 j + q3 k.    (1)

A quaternion having zero scalar part is also known as a pure quaternion.


The imaginary units, i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1), represent an orthonormal basis in R3 and follow the fundamental properties below:

ij = i × j = k,  jk = j × k = i,  ki = k × i = j    (2)

i^2 = j^2 = k^2 = −1.    (3)

The quaternion product is non-commutative, i.e. ij ≠ ji; in fact

ij = −ji,  jk = −kj,  ki = −ik.    (4)

The sum of quaternions is defined in the same way as the sum of complex numbers:

q ± p = (q0 + q1 i + q2 j + q3 k) ± (p0 + p1 i + p2 j + p3 k)
      = (q0 ± p0) + (q1 ± p1) i + (q2 ± p2) j + (q3 ± p3) k.    (5)

The product of two quaternions q1 = a0 + a1 i + a2 j + a3 k and q2 = b0 + b1 i + b2 j + b3 k is computed as

q1 q2 = (a0 + a1 i + a2 j + a3 k)(b0 + b1 i + b2 j + b3 k)
      = (a0 b0 − a1 b1 − a2 b2 − a3 b3)
      + (a0 b1 + a1 b0 + a2 b3 − a3 b2) i
      + (a0 b2 − a1 b3 + a2 b0 + a3 b1) j
      + (a0 b3 + a1 b2 − a2 b1 + a3 b0) k.    (6)

The conjugate of a quaternion q = w + xi + yj + zk is defined as

q* = w − xi − yj − zk.    (7)
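The product rule (6), together with (2)-(4) and the conjugate (7), is easy to verify numerically. A minimal sketch in Python/NumPy, representing a quaternion as a 4-vector (w, x, y, z) (the helper names are ours, not part of the paper):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z), eq. (6)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return np.array([a0*b0 - a1*b1 - a2*b2 - a3*b3,
                     a0*b1 + a1*b0 + a2*b3 - a3*b2,
                     a0*b2 - a1*b3 + a2*b0 + a3*b1,
                     a0*b3 + a1*b2 - a2*b1 + a3*b0])

def qconj(q):
    """Quaternion conjugate, eq. (7)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

i = np.array([0., 1., 0., 0.])
j = np.array([0., 0., 1., 0.])
k = np.array([0., 0., 0., 1.])

assert np.allclose(qmul(i, j), k)       # ij = k, eq. (2)
assert np.allclose(qmul(j, i), -k)      # ji = -k: non-commutativity, eq. (4)
assert qmul(i, i)[0] == -1.0            # i^2 = -1, eq. (3)

q = np.array([1., 2., 3., 4.])
assert np.allclose(qmul(q, qconj(q)), [np.sum(q**2), 0, 0, 0])  # q q* = |q|^2
```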

2.2. Quaternion-valued transforms

Transform-domain algorithms require mathematical transformations in order to (pre-)process input and output signals. Such transformations, e.g. the DFT and FFT, are quite uncommon in a quaternionic format. General information about quaternion-valued transformations can be found in [15, 24]. Using hypercomplex algebra to build a quaternion DFT/FFT function from scratch is quite an arduous endeavour. Fortunately, there is a quick and easy method that exploits the conventional DFT/FFT functions available in several programming environments by decomposing the quaternion-valued FFT (QFFT) into complex FFTs. This method was formerly developed in image processing and presented in [16, 17, 24-26]. Originally, the need for the development of a QFFT arose from the idea of collecting colour (RGB) or luminance/chrominance signals in one quaternion, rather than treating them as independent vectors [17-19, 25], thus permitting the generalization of several image processing techniques that depend on the Fourier transform of the colour. The non-commutativity of the quaternionic product gives rise to a two-sided monodimensional quaternion transform, i.e. quaternion transforms can be found either in a left- or a right-handed form (transposes of one another), as summed up in Table 1.

Table 1: Kernel definitions for the monodimensional QDFT (axis ν)

  Left:   e^(−νωn) f(·)
  Right:  f(·) e^(−νωn)

Equations (8a) and (8b) represent the quaternionic Fourier monodimensional left-handed transform (the exponential function is on the left) and its inverse:

F(u) = ∑_{n=0}^{N−1} exp(−2πν nu/N) f(n)    (8a)

f(n) = (1/N) ∑_{u=0}^{N−1} exp(2πν nu/N) F(u)    (8b)

where F(u) is the spectrum of f(n). Both F(u) and f(n) are quaternionic functions (of N samples) of the kind f(n) = w(n) + x(n)i + y(n)j + z(n)k. The versor ν is an arbitrarily chosen pure unitary quaternion and can be expressed as

ν = (x i + y j + z k) / √(x² + y² + z²).    (9)

The Fourier transform analyzes a signal according to sinusoidal components, and de Moivre's formula (e^(iθ) = cos θ + i sin θ) generalizes to any algebra where roots of −1 are definable. For instance, in the case of the quaternion Fourier transform, a quaternion root of −1 (ν² = −1) is required:

q = w + xi + yj + zk = |q| e^(νθ) = |q| (cos θ + ν sin θ).    (10)

The versor ν defined in (9) describes the spatial direction of the imaginary part of quaternion q, and the angle θ denotes the rotation angle about the axis ν.

As one can see, if ν = i and the input function is complex, equations (8a) and (8b) reduce to the conventional complex Fourier transform. Quaternions can be thought of as complex numbers whose real and imaginary parts are complex numbers in turn (Cayley-Dickson form):

q = a + b ν2    (11)

where a = w1 + x1 ν1 and b = y1 + z1 ν1. In this manner a quaternion q can be decomposed into

q = (w1 + x1 ν1) + (y1 + z1 ν1) ν2 = w1 + x1 ν1 + y1 ν2 + z1 ν3    (12)

where the first and second bracketed terms are named, respectively, the simplex and perplex parts of quaternion q, both isomorphic to the conventional complex field, since they are both defined in the same space as the complex operator ν1. Hence, each quaternion function f(n) can be formulated in terms of an orthonormal basis. The versors (ν2, ν3) must be chosen such that ν1 ⊥ ν2 ⊥ ν3, ν1 ν2 = ν3 and ν1 ν2 ν3 = −1. The advantage of using the basis (ν1, ν2, ν3) is that such a system of operators, isomorphic to the system (i, j, k), is not bound to the axes (i, j, k). The orthonormal basis can be represented in a compact matrix form (matrix of change of basis):

    ⎡ ν1x ν1y ν1z ⎤
B = ⎢ ν2x ν2y ν2z ⎥    (13)
    ⎣ ν3x ν3y ν3z ⎦

where

ν1 = ν1x i + ν1y j + ν1z k
ν2 = ν2x i + ν2y j + ν2z k
ν3 = ν3x i + ν3y j + ν3z k    (14)

Then it is possible to extract the component functions of f(n) in the new basis:

f(n) = w1(n) + x1(n) ν1 + y1(n) ν2 + z1(n) ν3
     = [w1(n) + x1(n) ν1] + [y1(n) + z1(n) ν1] ν2.    (15)

Each component function is obtained by the dot products below:

w1(n) = w(n)
x1(n) = ⟨ν1, x(n) i + y(n) j + z(n) k⟩
y1(n) = ⟨ν2, x(n) i + y(n) j + z(n) k⟩
z1(n) = ⟨ν3, x(n) i + y(n) j + z(n) k⟩    (16)
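The change of basis (13)-(16) can be sketched numerically. Below, ν1 is an arbitrarily chosen unit pure quaternion, ν2 is any unit pure quaternion orthogonal to it, and ν3 = ν1 ν2; the specific values and the helper qmul are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def qmul(p, q):
    # Hamilton product, quaternions as (w, x, y, z), eq. (6)
    a0, a1, a2, a3 = p; b0, b1, b2, b3 = q
    return np.array([a0*b0 - a1*b1 - a2*b2 - a3*b3,
                     a0*b1 + a1*b0 + a2*b3 - a3*b2,
                     a0*b2 - a1*b3 + a2*b0 + a3*b1,
                     a0*b3 + a1*b2 - a2*b1 + a3*b0])

# Arbitrary orthonormal basis (nu1, nu2, nu3) as in (13)-(14)
nu1 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
nu2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)  # any unit vector with nu1 . nu2 = 0
nu3 = np.cross(nu1, nu2)                         # nu1 nu2 = nu3 for orthogonal pure units

q = np.array([0.5, -1.0, 2.0, 0.25])             # sample quaternion (w, x, y, z)
v = q[1:]                                        # vector part

# Component extraction by the dot products of (16)
w1, x1, y1, z1 = q[0], nu1 @ v, nu2 @ v, nu3 @ v

# Reconstruction in the new basis, eqs. (12)/(15)
rec = np.concatenate(([w1], x1*nu1 + y1*nu2 + z1*nu3))
assert np.allclose(rec, q)

# The versors behave like (i, j, k): nu1 nu2 = nu3 and nu1 nu2 nu3 = -1
p1 = np.concatenate(([0.0], nu1)); p2 = np.concatenate(([0.0], nu2))
p3 = np.concatenate(([0.0], nu3))
assert np.allclose(qmul(p1, p2), p3)
assert np.allclose(qmul(qmul(p1, p2), p3), [-1.0, 0, 0, 0])
```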

After changing the basis, it is possible to separate the quaternion Fourier transform into the sum of two transforms, thanks to the linearity property of the Fourier transform:

F(u) = ∑_{n=0}^{N−1} e^(−2πν nu/N) (w1(n) + x1(n) ν1) + ∑_{n=0}^{N−1} e^(−2πν nu/N) (y1(n) + z1(n) ν1) ν2.    (17)
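Under the simplifying assumption ν = ν1 = i (so that the change of basis is the identity), the two sums in (17) reduce to two conventional complex FFTs, one for the simplex part and one for the perplex part. A minimal NumPy sketch (the function names are ours):

```python
import numpy as np

def qfft(w, x, y, z):
    """Left QFFT with transform axis nu = i, via Cayley-Dickson decomposition (17)."""
    simplex = np.fft.fft(w + 1j*x)   # transform of w1 + x1*nu1
    perplex = np.fft.fft(y + 1j*z)   # transform of y1 + z1*nu1 (carries the nu2 = j factor)
    return simplex, perplex

def iqfft(simplex, perplex):
    """Inverse QFFT: invert each complex transform and reassemble the components."""
    a = np.fft.ifft(simplex)
    b = np.fft.ifft(perplex)
    return a.real, a.imag, b.real, b.imag

rng = np.random.default_rng(0)
w, x, y, z = rng.standard_normal((4, 8))
S, P = qfft(w, x, y, z)
w2, x2, y2, z2 = iqfft(S, P)
assert np.allclose([w2, x2, y2, z2], [w, x, y, z])   # round trip recovers the signal

# If the signal is complex (y = z = 0), the QFFT reduces to the ordinary FFT
S, P = qfft(w, x, np.zeros(8), np.zeros(8))
assert np.allclose(S, np.fft.fft(w + 1j*x)) and np.allclose(P, 0)
```

For a general axis ν, the signal is first rotated into the (ν1, ν2, ν3) basis of (13)-(16) before applying the two complex FFTs.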

Once the quaternion transform is executed, the next step is to reassemble the transformed quaternion and change back to the original basis by means of the inverse change of basis (it is sufficient to transpose matrix B). One may wonder what implications the existence of a two-sided transform has for filtering applications. Tests conducted during the development of the algorithms presented in this paper revealed that, in order to avoid flawed algorithms, once a direction of rotation is chosen, it has to be kept unchanged.

3. Overlap-Save Quaternion Frequency Domain Adaptive Filter

The OS-QFDAF algorithm is a block algorithm. Block algorithms are defined by a periodic update equation, i.e. the filter coefficients are updated at each block iteration. In general, such algorithms differ in the length of the input block, the number of overlapping samples and the type of transform used (for those algorithms working in a transform domain). In the OS-QFDAF algorithm, the number of overlapping samples is denoted by M. Usually, the overlap length M is chosen equal to the filter length in the time domain, so we can conventionally refer to M as the filter length. In order to simplify the implementation, the transform length is chosen as N = M + L, where L is the block length.


3.1. Introduction to the algorithm

The inverse transform of the product of two DFT sequences provides a circular convolution in the time domain, while filtering operations are implemented with linear convolution. In order to obtain the linear convolution from the circular convolution, proper window constraints on the data are needed. If these constraints are not taken into account, with the idea of designing an algorithm with reduced computational cost, the algorithm may not converge to the Wiener optimum solution. Fast convolution may be performed using two different methods: overlap-add (OA) and overlap-save (OS). The former requires much more computation than needed [4], so the OS method with 50% overlap turns out to be the best performing choice. Therefore, the OS-QFDAF algorithm presented here embeds the OS method. The algorithm comes with power normalization, a strategy intended to improve convergence to the optimum, fully described later in this paper. The equations arising from the development of the OS-QFDAF are essentially the same as those of the complex OS-FDAF. This was achieved by deriving the OS-QFDAF from the Block QLMS algorithm as formulated in Appendix B. It is possible to store the input samples into a block matrix Xk and obtain a similar formulation for all block algorithms in both the time and transform domains. The final form of the OS-QFDAF algorithm results from transforming the Block QLMS equations into the frequency domain, being aware of the properties of complex convolution and correlation in the frequency domain [20]. The Quaternion Discrete Fourier Transform was introduced in Section 2.2, where a practical method for computing the Quaternion Fast Fourier Transform, based on the Cayley-Dickson symplectic decomposition, was suggested. A brief overview of the OS-QFDAF algorithm and comments about it are given just below. A block diagram of the algorithm is illustrated in Fig. 1.

3.2. Algorithm overview

Initialize the algorithm with

Winit = 0 (a 2M-by-1 null vector),  µ = µ0,  P0(m) = δ,  m = 0, 1, ..., N − 1    (18)

where Pk(m) is the power of the m-th frequency bin at block k, and µ0, δ are initialization constants to be chosen empirically. The following steps are to be executed for each new input block k.

Figure 1: OS-QFDAF block diagram.

Compute the QFFT of the filter input samples as

Xk = diag{QFFT([x^M_old  x^L_k]^T)}    (19)

where the input block consists of x^M_old and x^L_k, defined as

x^M_old = [x(kL − M + 1), · · · , x(kL − 1)]
x^L_k = [x(kL), · · · , x(kL + L − 1)].

Compute the filter output as

Yk = Xk Wk    (20)

ŷk = [IQFFT(Xk Wk)]_⌊L⌋    (21)

where Yk is the filter output in the frequency domain¹ and ŷk is the windowed filter output in the time domain².

¹ Diagonalization in (19) allows the filter output signal to be expressed in a formalism similar to that of the time-domain Block LMS [5].
² In OS, only the last L samples of the anti-transformed output block are useful in the time domain. In fact, it is not assured that the first M samples are zero, so windowing is necessary.

Let d̂k be the desired output vector in the time domain at block k:

d̂k = [d(kL)  d(kL + 1)  · · ·  d(kL + L − 1)]^T    (22)

The error vector in the time domain is defined as

êk = d̂k − ŷk    (23)

and transformed³ into the frequency domain as

Ek = QFFT([0M  êk]^T).    (24)

This algorithm updates the learning rates by means of power normalization, fully explained in Section 3.3:

µk = µ0 · diag[P_k^(−1)(0), . . . , P_k^(−1)(N − 1)].    (25)

Finally, we can update the filter weights as

Wk+1 = Wk + µk X^H_k Ek    (26)

where X^H_k Ek is the gradient estimate in the frequency domain at step k. The classic version of the OS-QFDAF algorithm constrains the gradient as follows:

Wk+1 = QFFT([ [IQFFT(Wk+1)]^⌈M⌉  0L ]^T).    (27)

The stochastic gradient in the time domain is defined as ∇̂Jk = [IQFFT(X^H_k Ek)]^⌈M⌉, i.e. only the first M samples are useful, since the last L samples are those relative to the circular correlation and it is not guaranteed that they are zero. Equation (27) applies the gradient constraint after updating the filter weights in the frequency domain. If the gradient constraint is neutralized, the product X^H_k Ek corresponds to a circular correlation in the time domain and the algorithm is called Unconstrained QFDAF. Usually, unconstrained algorithms exhibit a polarized convergence. In order to get convergence to the optimum, the filter length M should be increased, but this is not recommended. Modifications of the OS-QFDAF algorithm are the CC-QFDAF algorithm and the Sliding Window transform-domain algorithms, concisely presented in Appendix B.

³ In order to transform the error vector, zero-padding is needed.

3.3. Power Normalization

Power normalization makes it possible to have all modes converging at the same rate. This is achieved, as in (25), by assigning to each weight an individual step size, µ(m) = µ/P(m), where P(m) is an estimate of the average power in the m-th frequency bin. The elements of vector µk have the property of equalizing the convergence modes by whitening the input signal. When the power estimate P(m) is not available, as usually happens, especially when the input data are non-stationary, it has to be computed differently. The update of the power in the m-th frequency bin at step k (block k) is carried out by the recursion

Pk(m) = λ Pk−1(m) + (1 − λ) |Xk(m)|²    (28)

with m = 0, ..., N − 1. The parameter λ ∈ [0, 1] is a forgetting factor, and (28) is the expression of a low-pass filter. If λ = 1, the term |Xk(m)|² (the instantaneous energy of the m-th frequency bin) is ignored. At step k, the m-th step size can be defined as

µk(m) = µ / Pk(m)    (29)

where m = 0, ..., N − 1 and µ is a scalar chosen empirically.

4. Computational cost of OS-QFDAF

An approximation of the computational cost is given by taking into consideration the critical paths of the algorithm, i.e. multiplications and the computation of FFTs. Each QFFT requires the execution of 2 complex FFTs. The computation of one complex FFT involves N log2 N multiplications. The OS-QFDAF algorithm includes 5 QFFTs, 16N multiplications to compute the filter output and 16N multiplications to update the filter weights. Considering that a block has L = M samples, the computational cost of OS-QFDAF is approximately:

C_OS-QFDAF = 2 · 5N log2 N + 2 · 16N = 20M log2(2M) + 64M    (30)

where N = L + M = 2M.

Table 2: Computational Cost of Quaternion Adaptive Algorithms

Algorithm  | Computational Cost
QLMS       | 32 · M · n_samples
BQLMS      | 16 · M + 16 · L · n_blocks
OS-QFDAF   | 2 · 5N log2 N + 2 · 16N
CC-QFDAF   | 2 · 3N log2 N + 2 · 16N

In the QLMS algorithm, the computation of the filter output requires 4 · 4M = 16M multiplications for each sample, and so does the computation of the cross-correlation in the update equation. Overall, for M samples, the computational cost of QLMS is approximately C_QLMS = 32M · M = 32M². The complexity ratio between OS-QFDAF and QLMS reveals that the former is several times faster than its time-domain ancestor:

C_OS-QFDAF / C_QLMS = (5 log2(2M) + 16) / (8M).    (31)
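The ratio in (31) is easy to evaluate numerically; a small sketch:

```python
import math

def cost_ratio(M):
    """Complexity ratio C_OS-QFDAF / C_QLMS of eq. (31), with N = 2M."""
    return (5 * math.log2(2 * M) + 16) / (8 * M)

# For M = 64 the frequency-domain algorithm is roughly an order of magnitude cheaper
speedup = 1.0 / cost_ratio(64)
assert 9.0 < speedup < 11.0
```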

For example, for M = 64, the OS-QFDAF is about 10 times faster than the QLMS algorithm. A comparison among the algorithms is given in Table 2.

5. Convergence Properties

The study of the convergence properties of an algorithm gives clues about how the algorithm's behavior is conditioned by the values of the Rxx eigenvalues, the filter length, the block length and other parameters.

5.1. Step size stability range

For simplicity, we analyze the convergence behavior of the Unconstrained version of the OS-QFDAF algorithm. The analysis in the case of the constrained algorithm can be conducted the same way by adding the gradient constraint. Considering the test circuit in Fig. 2 (Section 6), the desired output in the frequency domain can be expressed as

Dk = GM,0 (Xk W0 + Vk)    (32)

where W0 is the optimum solution and GM,0 = F gM,0 F⁻¹ is a window constraint (necessary to get the linear convolution from the circular convolution). The transform denoted by F is the QFFT, and the matrix gM,0 is defined as

gM,0 = ⎡ I_{M,M}  0_{M,L} ⎤
       ⎣ 0_{L,M}  0_{L,L} ⎦    (33)

Vk denotes the additive noise in the frequency domain, as shown in Fig. 2. In other words, E0k is the error in the case Wk = W0 ⇒ E0k = Vk. The error in the frequency domain is defined as

Ek = Dk − Yk = GM,0 (Xk W0 + Vk − Xk Wk).    (34)

Substituting (34) into the unconstrained adaptation law (26), where GM,0 = I, we obtain:

Wk+1 = Wk + µk X^H_k GM,0 (Xk W0 + Vk − Xk Wk).    (35)

We define the weight error vector in the frequency domain:

Uk = Wk − W0    (36)

thus resulting in

Uk+1 = Uk + µk X^H_k GM,0 (Xk W0 + Vk − Xk Wk)
     = (I − µk X^H_k GM,0 Xk) Uk + µk X^H_k GM,0 Vk.    (37)

Taking the expectation of each term in (37), we assume that Xk and Vk are independent. Under this assumption we can apply the orthogonality principle, according to which E{X^H_k GM,0 Vk} = 0:

E{Uk+1} = (I − µk E{X^H_k GM,0 Xk}) E{Uk} + µk E{X^H_k GM,0 Vk}
        = (I − µk Ruxx) E{Uk}    (38)

where the autocorrelation matrix Ruxx is defined as

Ruxx = E{X^H_k GM,0 Xk}.    (39)

We see that the convergence behavior of the step-normalized Unconstrained QFDAF is controlled by the eigenvalues of the power-normalized autocorrelation matrix µk Ruxx. Matrix diagonalization within the right eigenvalue spectrum⁴, σR(A), is possible with the use of a unitary matrix Q, as stated in [13]:

σR(Q^H A Q) = σR(A).    (40)

The diagonal matrix having the right eigenvalues of Ruxx on the principal diagonal is denoted by Λ:

Λ = Q^H Ruxx Q.    (41)

⁴ λ is a right eigenvalue if there exists a non-zero x ∈ H such that Ax = xλ, and λ is a left eigenvalue if there exists a non-zero x ∈ H such that Ax = λx.

Letting Ûk = Q^H Uk be the transformed error vector, (38) can be rewritten using the new vectors (we do not consider power normalization here). We take the expectation of each element of (38):

E{Ûk+1} = (I − µΛ) E{Ûk}.    (42)

Equation (42) can be decomposed into a finite-difference-equation system as

E{Ûk+1(m)} = (1 − µλm) E{Ûk(m)}.    (43)

Back-substitution leads to

E{Ûk+1(m)} = (1 − µλm)^k E{Û−1(m)}.    (44)

Convergence occurs if |1 − µλm| < 1. The autocorrelation matrix is Hermitian, i.e. Rxx = R^H_xx, and the eigenvalues of a Hermitian matrix are all real. This property was formerly demonstrated by Charles Hermite for complex Hermitian matrices, and it can be extended to quaternions, at least for right eigenvalues. In effect, right eigenvalues can be computed by Cayley-Dickson decomposition, and each complex sub-matrix is Hermitian as well. If we consider the left eigenspectrum, the convergence properties are not elementary: whereas the right eigenspectrum is a quaternionic analogue of the complex eigenvalues [21], the left eigenspectrum is not, and it is not even unitarily invariant, so the autocorrelation matrix Rxx cannot be diagonalized as in (41). However, if both the eigenspectra σR(A) and σL(A) are finite, then σR(A) = σL(A). A proof is given in [21, 22]. In our case σR(A) is indeed finite: we can read in [13, 22] that σR(A) is infinite if and only if σR(A) ⊄ R. Nothing can be said about σL(A) a priori. In fact, examples showed that a Hermitian matrix does not necessarily produce a finite left eigenspectrum, so further investigation is essential. Detailed information about the computation of the quaternion eigenspectra is given in [13, 22], and a summary is reported in Appendix A. The step size convergence range, working with the right eigenspectrum, can be found by solving the inequality

|1 − µλm| < 1.    (45)

Definitively, the step size µ must be chosen such that

0 < µ < 1/λmax.    (46)
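A quick numerical illustration (real-valued for simplicity; the right eigenvalues of a Hermitian quaternion matrix are real in any case): since the trace of a positive semi-definite autocorrelation matrix is the sum of its eigenvalues, the trace-based bound 1/tr(Rxx) suggested next is always at least as strict as 1/λmax:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
Rxx = A @ A.T                    # symmetric positive semi-definite stand-in for Rxx
lam = np.linalg.eigvalsh(Rxx)    # real eigenvalues, as for a Hermitian matrix

assert np.isclose(lam.sum(), np.trace(Rxx))      # trace = sum of eigenvalues
assert 1.0 / np.trace(Rxx) <= 1.0 / lam.max()    # conservative bound <= exact bound
```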

Alternatively, given (38) and considering that

E{X^H_k GM,0 Xk} = E{tr(X^H_k GM,0 Xk)} = tr(E{X^H_k GM,0 Xk}) = tr(Rxx)    (47)

a conservative relation for choosing the step size is suggested below:

0 < µ < 1/tr(Rxx).    (48)

Weight adaptation in OS-QFDAF is independent for each filter weight, since each weight is associated with one mode of the adaptive filtering system. On the contrary, in the original time-domain LMS/QLMS algorithm each weight is accountable for multiple summed modes, and therefore the convergence rate cannot be optimized for any specific mode of the process. From (38) we see that the time constant τm for the m-th mode of the algorithm can be found by considering the time it takes for the error vector to decay to 1/e of its initial value:

|Û_τm(m)| = e⁻¹ |Û_−1(m)| = |(1 − µλm)^τm| |Û_−1(m)|  →  τm ln|1 − µλm| = −1.    (49)

Definitively, the time constant for the m-th mode is

τm = −1 / ln|1 − µλm|    (50)

where λm is the m-th eigenvalue of the autocorrelation matrix Ruxx of the input data Xk. In any case, taking advantage of power normalization, in a wide-sense stationary environment the weights tend to converge at the same rate.

5.2. Excess Mean Square Error

When the OS-QFDAF algorithm operates without power normalization, the Excess Mean Square Error (EMSE) can be computed in a straightforward way. With the definition in (36), the error at block k is

Ek = Dk − Yk = Dk − Xk Wk − Xk W0 + Xk W0 = Vk − Xk Uk.    (51)

Substituting (51) into the MSE curve, with the assumption that Vk and Xk are independent, the MSE at block k is

Jk = E{|Ek|²} = E{(Vk − Xk Uk)(V^H_k − U^H_k X^H_k)}
   = Jmin + E{Xk Uk U^H_k X^H_k}.    (52)

We define the weight error vector correlation as

Kk = E{Uk U^H_k} ∈ H^{M×M}    (53)

which can be conveniently diagonalized as

K̂k = Q^H Kk Q = diag[k̂k(i)] ∈ R^{M×M}.    (54)

Since K̂k is a diagonal matrix, the elements forming its principal diagonal are its eigenvalues. With the unitary similarity transformation in (54), we are assuming that we are using the right eigenspectrum and, since Kk is Hermitian, K̂k is real-valued. In view of the above, (52) can be rewritten as

Jk = Jmin + E{Xk Uk U^H_k X^H_k} = Jmin + E{Xk K̂k X^H_k}
   = Jmin + E{tr(K̂k X^H_k Xk)} = Jmin + tr(K̂k Rxx) = Jmin + tr(K̂k Λ)    (55)

where Λ = Q^H Rxx Q. The EMSE is defined as

J_EMSE = Jk − Jmin = tr(K̂k Λ) = ∑_{i=0}^{M−1} k̂k(i) λi.    (56)

The computation of k̂k(i) can be conducted as follows. Multiplying the error vector Uk = Uk−1 + µ X^H_k Ek by its Hermitian and taking the expectation of each term, we have

E{Uk U^H_k} = E{(I − µX^H_k Xk) Uk−1 U^H_{k−1} (I − µX^H_k Xk)^H} + µ² Jmin Rxx    (57)

where Jmin = E{Vk V^H_k} ∈ R. The weight error vector correlation can be expressed as

Kk = (I − µX^H_k Xk) Kk−1 (I − µX^H_k Xk)^H + µ² Jmin Rxx    (58)

and, diagonalizing as in (54),

K̂k = (I − µΛ) K̂k−1 (I − µΛ)^H + µ² Jmin Λ    (59)

we obtain a system of M finite difference equations:

k̂k(i) = (1 − µλi)² k̂k−1(i) + µ² Jmin λi    (60)

for i = 0, 1, . . . , M − 1. Back-substitution leads to

k̂k(i) = (1 − µλi)^{2k} k̂−1(i) + µ² Jmin λi ∑_{l=0}^{k−1} (1 − µλi)^{2l}.    (61)

As k → ∞,

lim_{k→∞} k̂k(i) = µ² Jmin λi ∑_{l=0}^{∞} (1 − µλi)^{2l} = µ² Jmin λi · 1/(1 − (1 − µλi)²) = µ Jmin / (2 − µλi).    (62)

Substituting (62) into (56):

J∞ = Jmin + J_EMSE = Jmin + ∑_{i=0}^{M−1} k̂∞(i) λi = Jmin + Jmin ∑_{i=0}^{M−1} µλi / (2 − µλi).    (63)

Equation (63) can be expanded into a Taylor series:

J∞ = Jmin + Jmin (µ/2) ∑_{i=0}^{M−1} λi (1 + (µ/2) λi + (µ/2)² λi² + . . .).    (64)

Recalling that the autocorrelation matrix Rxx is Hermitian, its right eigenvalues are real and, for µ ≪ 2/λmax, we have

J∞ ≈ Jmin + Jmin (µ/2) ∑_{i=0}^{M−1} λi = Jmin (1 + (µ/2) tr(Rxx)).    (65)

Definitively, the EMSE is equal to

J_EMSE ≃ Jmin (µ/2) tr(Rxx) = Jmin (µ/2) M · rxx[0] = Jmin (µ/2) M · {input power}.    (66)
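The small-step approximation can be checked numerically against the exact steady-state expression in (63); the eigenvalues and Jmin below are illustrative assumptions, not values from the paper:

```python
import numpy as np

lam = np.array([0.5, 1.0, 1.5, 2.0])   # assumed eigenvalues of Rxx
J_min = 0.01                           # assumed minimum MSE
mu = 0.01                              # mu << 2 / lam.max()

emse_exact = J_min * np.sum(mu * lam / (2.0 - mu * lam))   # steady state from eq. (63)
emse_approx = J_min * (mu / 2.0) * lam.sum()               # (mu/2) tr(Rxx) approximation

# For small mu the two expressions agree closely (here within 2%)
assert abs(emse_exact - emse_approx) / emse_exact < 0.02
```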

5.3. QDFT with Power Normalization

The combination of the DFT and power normalization is a powerful tool to obtain fast convergence in an adaptive algorithm. One of the effects of the DFT, the orthogonalizing property, is to asymptotically make the autocorrelation matrix diagonal as the DFT length N increases. This can be seen by finding a convenient expression for the autocorrelation matrix Rxx. The expectation of the squared magnitude of X(ω) is equal to

E[|X(ω)|²] = ∑_{n=0}^{N−1} ∑_{k=0}^{N−1} E[x(n) x*(k)] e^{−νω(n−k)/N}    (67)

where ϕxx(n − k) = E[x(n) x*(k)] is the autocorrelation function of x(n). Equation (67) is conveniently divided by N and rewritten as

E[|X(ω)|²] / N = ∑_{l=−N+1}^{N−1} (1 − |l|/N) ϕxx(l) e^{−νωl/N}    (68)

where the summation variable has been changed to $l = n - k$ and the function is computed over a triangular domain by multiplying the autocorrelation coefficient $\varphi_{xx}(n-k)$ by Bartlett's window $1 - |l|/N$. The input $x$ can be considered as an ARMA (AutoRegressive Moving Average) process, whose autocorrelation coefficients in the time domain are

\varphi_{xx}(l) = E\{x(n)\, x^*(n-l)\} = \sum_{p=0}^{P-1} a_p\, c_p^{|l|}   (69)

where $l = \ldots, -1, 0, 1, \ldots$, and $c_p$, $a_p$ and $P$ denote the process poles, the process coefficients and the order of the process, respectively. In the frequency domain the autocorrelation coefficients are transformed into

\Phi_{xx}\left(e^{-\mu 2\pi n/N}\right) = \sum_{l=-\infty}^{\infty} \varphi_{xx}(l)\, e^{-\mu 2\pi n l/N}   (70)

which is the expression of the Power Spectral Density (PSD) of the input process.


As $l \to \infty$, we can see that $\varphi_{xx}(l) \to 0$, and the diagonal elements of the autocorrelation matrix (divided by N for convenience) tend asymptotically to the PSD samples. Recalling the Fourier transform of the exponential series, the autocorrelation matrix can be rewritten by decomposing the coefficients as

\left[\frac{R_{xx}}{N}\right]_{n,m} =
\begin{cases}
\sum_{l=-N+1}^{N-1} \left(1 - \frac{|l|}{N}\right) \varphi_{xx}(l)\, e^{-\mu 2\pi n l/N}, & m = n \\
\frac{1}{N} \sum_{p=0}^{P-1} a_p \left[\frac{1}{c_p - e^{+\mu 2\pi m/N}} - \frac{1}{c_p^{-1} - e^{-\mu 2\pi n/N}}\right] + \mathrm{c.c.}, & m \neq n
\end{cases}   (71)

where n, m denote the matrix indexes. As the transform length N tends to infinity, the off-diagonal elements of the autocorrelation matrix tend to 0. As can be read from (71), when power normalization is applied, the elements of the principal diagonal of $R_{xx}$, multiplied by the normalization factors, all tend to the same value, thus leading to a single convergence mode. As N increases, $R_{xx}$ tends to the diagonal matrix $\Lambda$, and the power-normalized matrix tends to the identity, i.e. $R_{xx} \to \Lambda$, $\Lambda^{-1} R_{xx} \to I$.

6. Simulations

6.1. OS-QFDAF simulations

The results from the simulations presented in this paper were obtained by applying the transform domain algorithms to a system identification problem (Fig. 2). The system to be identified is characterized by a set of random weights $w_0$ (in the time domain), uniformly distributed in the range $[-1, 1]$. The quaternion filter input signal $x[n]$ is a unit-variance colored noise, obtained by filtering the white Gaussian noise $\eta[n]$ as

x[n] = b\, x[n-1] + \frac{\sqrt{1-b^2}}{\sqrt{4}}\, \eta[n]   (72)

where b determines the bandwidth of the signal. The additive signal $v[n]$ is defined in the same way as $x[n]$, but with a different bandwidth. The choice of the overlap length and the block length (M = L) determines the FFT length (N = M + L). The OS-QFDAF algorithm and its modifications exploit the power normalization technique in order to have all filter weights converging at the same rate. Figure 3 shows how power normalization affects the filter performance by whitening the input signal. The test is repeated with b = 0


Figure 2: Algorithm test circuit.

and b = 0.95. Whereas both the OS-QFDAF and the Sliding QFFT-QLMS algorithms converge at nearly the same rate when b is changed from 0 to 0.95, the QLMS algorithm converges very slowly when b = 0.95.

Figure 3: Power normalization effect on the filter performance. [Smoothed learning curves (n average = 10) of QLMS, Sliding QFFT-QLMS and OS-QFDAF against the MSE bound, for b = 0 (left) and b = 0.95 (right); MSE [dB] vs. samples.]
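The input generation in (72) can be sketched as follows, assuming the $1/\sqrt{4}$ factor distributes unit variance across four independent quaternion components (an interpretation, since the paper does not spell this out):

```python
import numpy as np

def colored_quaternion_noise(n_samples, b, seed=0):
    """Colored noise per Eq. (72); rows hold the four quaternion components."""
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal((4, n_samples))   # white Gaussian driving noise
    x = np.zeros((4, n_samples))
    g = np.sqrt(1.0 - b ** 2) / np.sqrt(4.0)    # keeps total quaternion variance at 1
    for n in range(1, n_samples):
        x[:, n] = b * x[:, n - 1] + g * eta[:, n]
    return x

x = colored_quaternion_noise(200000, b=0.95)
print(np.sum(np.var(x, axis=1)))                # close to 1 (unit total variance)
```

Each component is an AR(1) process with variance $(1-b^2)/4 \cdot 1/(1-b^2) = 1/4$, so the total quaternion variance is 1 regardless of b.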

The step size $\mu_0$ in a power-normalized algorithm is an overall parameter governing the diagonal inverse power matrix. Whether or not power normalization is present, the step size modifies the algorithm behavior in the same way as in QLMS: if the step size is too large, the filter tends to instability; if it is too small, more iterations are needed for the filter to reach the steady state. In between, as the step size decreases, the learning curve drops more slowly, but the steady-state mean square error is smaller. An example of how the step size modifies the filter behaviour is given in Fig. 4.

The second experiment compares the OS-QFDAF and its unconstrained counterpart. The scenario is the same depicted in Fig. 2. As can be seen from Fig. 5, all parameters being equal (b = 0.7, M = L = 32, N = 64, µ0 = 0.3, δ = 0, λ = 0.7), the OS-QFDAF converges to the optimum in fewer iterations and with a smaller excess MSE in the learning curve, in comparison with the Unconstrained OS-QFDAF. However, the execution of the Unconstrained OS-QFDAF is faster. The excess MSE is the gap between the steady-state MSE curve and the theoretical MSE bound.

Figure 4: Step size effect on the filter performance. [Smoothed learning curves (n average = 20) for µ0 = 0.08, 0.1 and 0.6, against the MSE bound; MSE [dB] vs. samples.]

Figure 5: OS-QFDAF vs Unconstrained OS-QFDAF. [Smoothed learning curves (n average = 20) against the MSE bound; MSE [dB] vs. samples.]
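The difference between the constrained and unconstrained variants is whether the frequency-domain gradient is projected back onto a causal M-tap response at every block. A minimal sketch of that projection, using the complex FFT as a stand-in for the QFFT:

```python
import numpy as np

def constrain_gradient(G_freq, M):
    """Overlap-save gradient constraint: back to the time domain, zero the
    last block (the circular wrap-around part), return to the frequency domain."""
    g = np.fft.ifft(G_freq)
    g[M:] = 0.0
    return np.fft.fft(g)

M = 4
G = np.fft.fft(np.arange(2 * M, dtype=float))   # some 2M-point gradient
Gc = constrain_gradient(G, M)
print(np.allclose(np.fft.ifft(Gc)[M:], 0))      # → True: only M causal taps remain
```

The unconstrained variant simply skips this FFT/IFFT pair, which saves two transforms per block at the cost of the larger excess MSE seen in Fig. 5.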

6.2. Evaluation of the Excess Mean-Square Error

As suggested by (66), the Excess MSE (EMSE), defined by (56), increases with the filter length M and/or the step size µ. An experimental proof of this was given in Fig. 4 with varying $\mu_0$. In the following test, we check the validity of (66). The OS-QFDAF algorithm was applied to the adaptive block in Fig. 2, and a comparison between the actual $J_\infty$ (the learning curve value at the steady state) and $J_\infty$ computed by means of (65) is given in Fig. 6. The EMSE has been defined as $J_{EMSE\infty} = J_\infty - J_{MSE\,Bound}$. The experiment reveals that the value of $J_\infty$ computed with (65) approximates the learning curve at the steady state quite well. The algorithm parameters were chosen as M = L = 32, N = 64, µ = {0.002, 0.003}, with b = 0.7. For coherence, the algorithm in this test works without power normalization, since (65) was derived in this setting.
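A real-valued LMS stand-in reproduces this check in a few lines (an illustrative analogue, not the paper's quaternion setup; the parameters below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
M, mu, J_min, n_iter = 16, 0.01, 1e-3, 200000
w0 = rng.uniform(-1, 1, M)                        # unknown system
x = rng.standard_normal(n_iter + M)               # unit-power white input
w = np.zeros(M)
sq_err = np.empty(n_iter)
for i in range(n_iter):
    xn = x[i:i + M][::-1]                         # input regressor
    d = w0 @ xn + np.sqrt(J_min) * rng.standard_normal()
    e = d - w @ xn
    w += mu * e * xn                              # LMS update
    sq_err[i] = e ** 2

J_inf = sq_err[n_iter // 2:].mean()               # measured steady-state MSE
J_pred = J_min * (1 + mu / 2 * M)                 # Eq. (65): tr(Rxx) = M here
print(J_inf, J_pred)                              # the two values agree closely
```

With unit-power white input, $\mathrm{tr}(R_{xx}) = M$, so the predicted misadjustment is $\mu M/2$; the measured steady-state MSE lands within a few percent of the prediction.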

Figure 6: Excess MSE in OS-QFDAF. [Smoothed learning curves (n average = 20) for µ = 0.002 and µ = 0.003 (M = 32, N = 64), against the MSE bound and $J_\infty$ (formula); MSE [dB] vs. samples.]

6.3. Performance evaluation in changing scenario

Given the experiment in Fig. 2, the weights of the system to be identified are changed in value twice during the simulation. The filter response (in terms of adaptation) varies in speed according to the amount of change: a greater change produces a slower adaptation, i.e. the algorithms need a higher number of iterations to reach the steady state after the alteration. The plots in Fig. 7 and Fig. 8 show the results. The tracking capability was tested on two filters with equal block length: OS-QFDAF and CC-QFDAF (b = 0.5; (OS-QFDAF) M = L = 32, N = 64, µ0 = 0.085, λ = 0.7, δ = 0; (CC-QFDAF) M = L = N = 32, µ0 = 0.045, λ = 0.7, δ = 0). Figure 8, in particular, reports the tracking of the weight $w_1 = w_{1a} + w_{1b} i + w_{1c} j + w_{1d} k$ of the adaptive filter.

Figure 7: Tracking capability - Learning curve: OS-QFDAF, CC-QFDAF. [Smoothed learning curves (n average = 20) against the MSE bound; MSE [dB] vs. samples.]

Figure 8: Tracking capability - Weights: OS-QFDAF, CC-QFDAF. [Four panels tracking the quaternion components $w[k]_{1a}$, $w[k]_{1b}$, $w[k]_{1c}$, $w[k]_{1d}$ of both filters against the optimum $w_{opt}$; weight value vs. samples.]

7. Conclusion

We have introduced the OS-QFDAF algorithm and its modifications (CC-QFDAF and Sliding Window QFFT-QLMS) for processing quaternion-valued signals in the frequency domain. We have seen that quaternion-valued algorithms provide a compact form for implementing digital filters which is consistent with the multidimensional nature of several problems in physics. After deriving the OS-QFDAF algorithm, we analyzed its convergence properties and computational cost. We obtained the stability range of the step size and found a mathematical relation between the EMSE and the algorithm parameters. The OS-QFDAF was obtained by transforming the Block QLMS algorithm into the frequency domain and following the rules of the overlap-save method to obtain a fast convolution. We also verified experimentally that the EMSE in OS-QFDAF changes with the step size and the filter length. Finally, we compared the tracking capabilities of the OS-QFDAF with those of the CC-QFDAF.

Appendix A. Quaternion Eigenvalues

Appendix A.1. Right eigenvalues

Let $A \in \mathbb{H}^{n \times n}$ and $x \in \mathbb{H}^n$. We can decompose A and x into sums of complex matrices and vectors, respectively:

A = A_1 + A_2 j, \qquad x = x_1 + x_2 j   (A.1)

where $A_1, A_2 \in \mathbb{C}^{n \times n}$ and $x_1, x_2 \in \mathbb{C}^n$. The characteristic equation for right eigenvalues, $Ax = x\lambda$, becomes

A^a x^a = \begin{bmatrix} A_1 & A_2 \\ -\bar{A}_2 & \bar{A}_1 \end{bmatrix} \begin{bmatrix} x_1 \\ -\bar{x}_2 \end{bmatrix} = \lambda \begin{bmatrix} x_1 \\ -\bar{x}_2 \end{bmatrix}.   (A.2)

Since the matrix $A^a$ is a $2n \times 2n$ complex matrix, it has exactly $2n$ complex eigenvalues, which appear in complex conjugate pairs. The final step is to choose those eigenvalues having non-negative imaginary part [13].

Appendix A.2. Left eigenvalues

The following two lemmas are given.

Lemma 1.1. For $p, q \in \mathbb{H}$,

\sigma_L(pI + qA) = \{p + qt : t \in \sigma_L(A)\}   (A.3)

where I is the identity matrix.

Lemma 1.2. If A is a (lower or upper) triangular matrix, then the left spectrum $\sigma_L(A)$ is the set of the distinct diagonal elements of A.

For example, given the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$:

(a) if $bc = 0$, then A is a triangular matrix and $\sigma_L(A) = \{a, d\}$, according to Lemma 1.2;

(b) if $bc \neq 0$, using Lemma 1.1:

\sigma_L(A) = a + b\, \sigma_L\left(\begin{bmatrix} 0 & 1 \\ b^{-1}c & b^{-1}(d-a) \end{bmatrix}\right).   (A.4)

The quantity $\lambda$ is a left eigenvalue if and only if there exists a nonzero vector $[x_1\ x_2]^T$ such that

\begin{bmatrix} 0 & 1 \\ b^{-1}c & b^{-1}(d-a) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \lambda \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.   (A.5)

In other terms,

\lambda^2 - b^{-1}(d-a)\lambda - b^{-1}c = 0.   (A.6)
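For complex entries (a commutative special case, with $A_2 = 0$), the left spectrum obtained from (A.4)-(A.6) must agree, up to conjugation, with the right spectrum from the adjoint construction (A.2); a quick numerical check of both, with randomly chosen entries:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, c, d = rng.standard_normal(4) + 1j * rng.standard_normal(4)
A = np.array([[a, b], [c, d]])

# Right eigenvalues via the complex adjoint of (A.2); with A2 = 0 the adjoint
# is block-diagonal, so its spectrum is eig(A) plus the conjugates.
Z = np.zeros((2, 2))
adjoint_eigs = np.linalg.eigvals(np.block([[A, Z], [Z, A.conj()]]))

# Left eigenvalues via Lemma 1.1 and the roots of (A.6): lambda = a + b*t
t = np.roots([1.0, -(d - a) / b, -c / b])
left = a + b * t

ok = all(np.isclose(adjoint_eigs, lam).any() or np.isclose(adjoint_eigs, lam.conj()).any()
         for lam in left)
print(ok)   # → True
```

In the genuinely quaternionic case this shortcut is unavailable: the roots of (A.6) must be found with the methods of [22], and left and right spectra generally differ.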

The eigenvalues in (A.4) are generally quaternion-valued. The difficulty here is that (A.6) is a quaternionic equation. Solving a 2nd-order quaternionic equation is not a problem, as explained in [22]. On the contrary, higher-order equations require advanced resolution methods.

Appendix B. Algorithms

In this Appendix, modifications of the OS-QFDAF algorithm are presented. Some vector definitions are given in Table B.3.

Appendix B.1. Block QLMS

Algorithm initialization: $w_{init}$ = random values.

Table B.3: Algorithm vector definitions (in the time domain)

filter input vector at step n:  $x_n = [x[n]\ x[n-1]\ \ldots\ x[n-M+1]]^T$
filter input vector at step k:  $x_k = [x[kL]\ x[kL-1]\ \ldots\ x[kL-L+1]]^T$
filter output:  $y_k = [y[kL]\ y[kL-1]\ \ldots\ y[kL-L+1]]^T$
error signal:  $e_k = [e[kL]\ e[kL-1]\ \ldots\ e[kL-L+1]]^T$
filter weights:  $w_k = [w_0[k]\ w_1[k]\ \ldots\ w_{M-1}[k]]^T$

with $n = kL + i$ the sample index, $k = 0, 1, 2, \ldots$ the block index, and $i = 0, 1, \ldots, M-1$ the internal block index.

At each new input block k do:

y[n] = w_k^T x_n = \sum_{l=0}^{M-1} w_k[l]\, x[kL + i - l]   (B.1)

e[n] = d[n] - y[n]   (B.2)

w_{k+1} = w_k + 2\, \frac{\mu_B}{L} \sum_{i=0}^{L-1} e[kL+i]\, x^*_{kL+i}   (B.3)

where $\mu_B$ is the block step size. Alternatively, the Block QLMS algorithm can be expressed by defining the block matrix $X_k$ by rows (R) and columns (C) as follows:

X_k^R = [x_{kL}\ x_{kL-1}\ \ldots\ x_{kL-L+1}]^T, \qquad X_k^C = [x[kL]\ x[kL-1]\ \ldots\ x[kL-M+1]]^T   (B.4)

where $x_{kL} = [x[kL]\ x[kL-1]\ \ldots\ x[kL-M+1]]^T$ and $x[kL] = [x[kL]\ x[kL-1]\ \ldots\ x[kL-L+1]]^T$.

[ ]T where xkL = x [kL] x [kL − 1] . . . x [kL − M + 1] and [ ]T x [kL] = x [kL] x [kL − 1] . . . x [kL − L + 1] . With the denition given in (B.4), at each input block k do: yk = wkT Xk

(B.5)

wk+1 = wk + 2

µB T H ek Xk . L

(B.6)
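A real-valued sketch of (B.1)-(B.3) (the quaternion version replaces ordinary products with quaternion products and $x^*$ with the quaternion conjugate; all parameters here are illustrative):

```python
import numpy as np

def block_lms(x, d, M, L, mu_B):
    """Block LMS per (B.1)-(B.3): accumulate the gradient over a block of L
    samples, then apply a single weight update per block."""
    w = np.zeros(M)
    xp = np.concatenate([np.zeros(M - 1), x])     # zero prehistory for x[n-1], ...
    for k in range(len(x) // L):
        grad = np.zeros(M)
        for i in range(L):
            n = k * L + i
            xn = xp[n:n + M][::-1]                # x_n = [x[n], ..., x[n-M+1]]
            e = d[n] - w @ xn                     # (B.1), (B.2)
            grad += e * xn
        w = w + 2 * (mu_B / L) * grad             # (B.3)
    return w

rng = np.random.default_rng(0)
w0 = rng.uniform(-1, 1, 8)                        # unknown system
x = rng.standard_normal(20000)
d = np.convolve(x, w0)[:len(x)]
w = block_lms(x, d, M=8, L=8, mu_B=0.05)
print(np.max(np.abs(w - w0)))                     # small: the system is identified
```

Averaging the gradient over the block leaves the convergence rate essentially unchanged with respect to sample-by-sample LMS, but allows one update per block, which is what makes the fast FFT implementation possible.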

Appendix B.2. Circular Convolution Quaternion Frequency Domain Adaptive Filter (CC-QFDAF)

Algorithm initialization:

W_{init} = 0 \text{ (2M-by-1 null vector)}, \qquad \mu = \delta_\mu, \qquad P_0(m) = \delta_m, \quad m = 0, 1, \ldots, N-1   (B.7)

where $P_k(m)$ is the power of the m-th frequency bin at block k, and $\delta_\mu$, $\delta_m$ are initialization constants to be chosen empirically.

At each new input block k do:

X_k = \mathrm{diag}\{\mathrm{QFFT}(x_k)\}   (B.8)
W_k = \mathrm{QFFT}(w_k)   (B.9)
Y_k = X_k W_k   (B.10)
y_k = \mathrm{IQFFT}(Y_k)   (B.11)
D_k = \mathrm{QFFT}(d_k)   (B.12)
E_k = D_k - Y_k   (B.13)
\mu_k = \mu \cdot \mathrm{diag}\{P_k^{-1}(0), \ldots, P_k^{-1}(M-1)\}   (B.14)
W_{k+1} = W_k + \mu_k X_k^H E_k   (B.15)
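A sketch of (B.8)-(B.15) with the complex FFT standing in for the QFFT; a first-order recursion with forgetting factor λ (as used in the simulations) is assumed for the per-bin power estimate, which the listing above does not spell out:

```python
import numpy as np

def cc_fdaf(x, d, M, mu0, lam=0.9, delta=1.0):
    """CC-FDAF per (B.8)-(B.15): block-by-block adaptation with circular
    convolution; the complex FFT stands in for the QFFT."""
    W = np.zeros(M, dtype=complex)
    P = np.full(M, delta)                          # per-bin power, P_0(m) = delta_m
    for k in range(len(x) // M):
        Xk = np.fft.fft(x[k * M:(k + 1) * M])      # diagonal of X_k  (B.8)
        Yk = Xk * W                                #                  (B.10)
        Ek = np.fft.fft(d[k * M:(k + 1) * M]) - Yk # (B.12), (B.13)
        P = lam * P + (1 - lam) * np.abs(Xk) ** 2  # assumed power update
        W = W + (mu0 / P) * np.conj(Xk) * Ek       # (B.14), (B.15)
    return W

# Identify a system that really is a circular M-point convolution
rng = np.random.default_rng(0)
M = 16
W0 = np.fft.fft(rng.uniform(-1, 1, M))
x = rng.standard_normal(16000)
d = np.concatenate([np.fft.ifft(W0 * np.fft.fft(x[k * M:(k + 1) * M])).real
                    for k in range(len(x) // M)])
W = cc_fdaf(x, d, M, mu0=0.1)
print(np.allclose(W, W0, atol=1e-4))
```

Because no overlap is used, the filtering here is a circular, not linear, convolution; this is what the overlap-save constraint of the OS-QFDAF removes, at the cost of extra transforms.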

Appendix B.3. Sliding Window Algorithms - Sliding QFFT-QLMS

The input block is pre-processed and decomposed into orthogonal components, each feeding its own adaptive filter in a parallel filter bank. Adaptation takes place in the transform domain.

Algorithm initialization:

W_{init} = 0, \qquad P_0(m) = \delta_m, \quad m = 0, 1, \ldots, M-1.   (B.16)

Let L = 1; at each new input sample n = 0, 1, ... do:

X_n = \mathrm{diag}\{\mathrm{QFFT}(x_n)\}   (B.17)
W_n = \mathrm{QFFT}(w_n)   (B.18)
Y_n = X_n W_n   (B.19)
e(n) = d(n) - \mathbf{1}^T Y_n   (B.20)
E_n = \mathbf{1}\, e(n)   (B.21)
\mu_n = \mu \cdot \mathrm{diag}\{P_n^{-1}(0), \ldots, P_n^{-1}(M-1)\}   (B.22)
W_{n+1} = W_n + \mu_n X_n^H E_n   (B.23)

with e(n) the error in the time domain, $E_n$ the error in the frequency domain, and $\mathbf{1}$ a column vector of ones. The QFFT in the algorithm can be replaced by any other unitary transform F, i.e. $F^H F = I$. The Sliding Window algorithms can be considered as a boundary case of OS-QFDAF where L = 1.

References

[1] C. C. Took, D. P. Mandic, The Quaternion LMS Algorithm for Adaptive Filtering of Hypercomplex Processes, IEEE Trans. Signal Process. 57 (4) (2009) 1316-1327.

[2] J. Navarro-Moreno, R. M. Fernández-Alcalá, C. C. Took, D. P. Mandic, Prediction of wide-sense stationary quaternion random signals, Signal Process. 93 (2013) 2573-2580.

[3] C. Jahanchahi, C. C. Took, D. P. Mandic, The Widely Linear Quaternion Recursive Least Squares Filter, Proc. 2nd Int. Workshop on Cognitive Information Processing (CIP) (2010).

[4] A. Uncini, Fundamentals of Adaptive Signal Processing, Springer (2015).

[5] A. H. Sayed, Adaptive Filters, Wiley (2008).

[6] A. S. Deeba, S. L. Wood, Convergence Rate Improvements for Frequency Domain Implementations of LMS Adaptive Filters, Proc. 24th Asilomar Conf. on Signals, Systems and Computers, 2 (1990) 754-757.

[7] F. Beaufays, Transform-Domain Adaptive Filters: An Analytical Approach, IEEE Trans. Signal Process. 43 (2) (1995) 422-431.

[8] M. Jiang, W. Liu, Y. L. Li, X. Zhang, Frequency-domain Quaternion-valued Adaptive Filtering and its Application to Wind Profile Prediction, IEEE Region 10 Conf. (TENCON) (2013) 1-5.

[9] J. J. Shynk, Frequency-domain and multirate adaptive filtering, IEEE Signal Process. Mag. 9 (1) (1992) 14-37.

[10] W. R. Hamilton, Elements of Quaternions, Longmans, Green & Co. (1866).

[11] E. Hitzer, Introduction to Clifford's Geometric Algebra, SICE J. of Control, Measurement, and System Integration 4 (1) (2011) 1-10.

[12] C. J. Joly, A Manual of Quaternions, Macmillan and Co. Ltd. (1905).

[13] F. Zhang, Quaternions and matrices of quaternions, Linear Algebra Appl. 251 (1997) 21-57.

[14] J. L. Brenner, Matrices of Quaternions, Pacific J. Math. 1 (3) (1951) 329-335.

[15] E. Hitzer, S. J. Sangwine (Eds.), Quaternion and Clifford Fourier Transforms and Wavelets, Birkhäuser (2013).

[16] T. A. Ell, S. J. Sangwine, Decomposition of 2D Hypercomplex Fourier Transforms into Pairs of Complex Fourier Transforms, Proc. 10th Eur. Signal Process. Conf. (EUSIPCO) (2000).

[17] S. J. Sangwine, T. A. Ell, Colour image filters based on hypercomplex convolution, IEE Proc. Vision, Image and Signal Process. 147 (2) (2000) 89-93.

[18] G. Wang, Y. Liu, T. Zhao, A quaternion-based switching filter for colour image denoising, Signal Process. 102 (2014) 216-225.

[19] M. I. Khalil, Applying Quaternion Fourier Transforms for Enhancing Color Images, Int. J. Image, Graphics and Signal Process. 4 (2) (2012) 9-15.

[20] A. V. Oppenheim, R. W. Schafer, Discrete-Time Signal Processing, Prentice Hall (1989).

[21] L. S. Siu, A Study of Polynomials, Determinants, Eigenvalues and Numerical Ranges over Real Quaternions, Ph.D. thesis, University of Hong Kong (1997).

[22] L. Huang, W. So, On left eigenvalues of a quaternionic matrix, Linear Algebra Appl. 323 (2001) 105-116.

[23] S. Haykin, Adaptive Filter Theory, Prentice Hall (1996).

[24] T. A. Ell, Hypercomplex spectral transformations, Ph.D. thesis, University of Minnesota (1992).

[25] S. Said, N. Le Bihan, S. J. Sangwine, Fast complexified quaternion Fourier transform, IEEE Trans. Signal Process. 56 (4) (2008) 1522-1531.

[26] S. C. Pei, J. J. Ding, J. H. Chang, Efficient Implementation of Quaternion Fourier Transform, Convolution, and Correlation by 2-D Complex FFT, IEEE Trans. Signal Process. 49 (11) (2001) 2783-2797.

[27] X. Zhang, W. Liu, Y. Xu, Z. Liu, Quaternion-valued robust adaptive beamformer for electromagnetic vector-sensor arrays with worst-case constraint, Signal Process. 104 (2014) 274-283.

[28] X. Gou, Z. Liu, Y. Xu, Biquaternion cumulant-MUSIC for DOA estimation of noncircular signals, Signal Process. 93 (2013) 874-881.

[29] B. Witten, J. Shragge, Quaternion-based Signal Processing, Proc. Soc. of Exploration Geophysicists Annual Meeting (SEG) (2006).

[30] Q. Barthélemy, A. Larue, J. I. Mars, About QLMS derivations, IEEE Signal Process. Lett. 21 (2) (2014) 240-243.

[31] C. Jahanchahi, D. P. Mandic, A class of quaternion Kalman filters, IEEE Trans. Neural Netw. Learn. Syst. 25 (3) (2014) 533-544.

[32] C. Jahanchahi, C. C. Took, D. P. Mandic, A class of quaternion-valued affine projection algorithms, Signal Process. 93 (2013) 1712-1723.

[33] C. C. Took, D. P. Mandic, Fusion of heterogeneous data sources: A quaternionic approach, IEEE Workshop on Machine Learning for Signal Process. (MLSP) (2008) 456-461.

[34] J. C. Lee, C. K. Un, Performance Analysis of Frequency-Domain Block LMS Adaptive Digital Filters, IEEE Trans. Circuits Syst. 36 (2) (1989) 173-189.

[35] B. Farhang-Boroujeny, K. S. Chan, Analysis of the Frequency-Domain Block LMS Algorithm, IEEE Trans. Signal Proc. 48 (8) (2000) 2332-2342.

[36] Quaternions, Interpolation and Animation. URL http://web.mit.edu/2.998/www/QuaternionReport1.pdf

[37] Quaternion Algebra and Calculus. URL http://www.geometrictools.com/Documentation/Quaternions.pdf

[38] N. A. Wiegmann, The Structure of Unitary and Orthogonal Quaternion Matrices, Illinois J. Math. 2 (3) (1958) 402-407.