Deterministic regression methods for unbiased estimation of time-varying autoregressive parameters from noisy observations


Signal Processing 92 (2012) 857–871


Hiroshi Ijima (a,*), Eric Grivel (b)

(a) Faculty of Education, Wakayama University, 640-8510 Wakayama, Japan
(b) Université de Bordeaux, IPB, UMR CNRS 5218 IMS, 33400 Talence, France

Article history: Received 25 November 2010; received in revised form 23 July 2011; accepted 21 September 2011; available online 29 September 2011.

Abstract

A great deal of interest has been paid to autoregressive parameter estimation in the noise-free case or when the observation data are disturbed by random noise. Tracking time-varying autoregressive (TVAR) parameters has also been discussed, but few papers deal with this issue when there is an additive zero-mean white Gaussian measurement noise. In this paper, we consider deterministic regression methods (or evolutive methods), where the TVAR parameters are assumed to be weighted combinations of basis functions. However, the additive white measurement noise leads to a weight-estimation bias when standard least squares methods are used. Therefore, we propose two alternative blind off-line methods that allow both the variance of the additive noise and the weights to be estimated. The first one is based on the errors-in-variables framework, whereas the second consists in viewing the estimation issue as a generalized eigenvalue problem. A comparative study with other existing methods confirms the effectiveness of the proposed methods. © 2011 Elsevier B.V. All rights reserved.

Keywords: Time-varying autoregressive (TVAR) model; parameter estimation; parameter tracking; errors-in-variables; generalized eigenvalue decomposition; least squares

1. Introduction

Time-varying autoregressive (TVAR) models are useful tools for various engineering applications, such as radar processing to model the clutter [3], biomedical applications [6,17], speech processing to capture local and global time-variations of the speech parameters [18], speech dereverberation to model time-varying channels related to moving speakers [6,15], etc. To track the TVAR parameters, two kinds of approaches, namely the stochastic and the deterministic regression (DR) methods, have been proposed in the literature. The first family includes recursive methods such as the recursive least squares (RLS) with a forgetting factor [8]. When using Kalman filtering [1] or H∞ filtering, the state vector stores the TVAR parameters to be tracked. In the second family, also called "evolutive methods", the parameter evolutions are assumed to be weighted linear combinations of basis time-dependent functions such as powers of time, Legendre polynomials, or Fourier functions. Usually, the basis choice can be based on a priori properties of the signal; discussions on determining the optimal basis can be found for instance in Ref. [16]. Then, the weights have to be estimated, for instance by using least squares (LS) methods. Hence, TVAR parameter estimation becomes a stationary identification issue. As the TVAR parameters are deduced by summing the weighted basis functions, a smoothed evolution of the parameters is obtained. For more details, the reader may refer to the pioneering works by Grenier in the field of speech analysis, transmission and recognition, such as [12,2].

* Corresponding author.
E-mail addresses: [email protected] (H. Ijima), [email protected] (E. Grivel).

0165-1684/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2011.09.020


However, when the TVAR process is disturbed by an additive measurement noise, using the above LS approaches leads to an estimation bias, and alternative approaches have to be considered. To our knowledge, few papers deal with this issue. As cumulants are insensitive to additive white Gaussian noise, there is a direct relation between the mth-order cumulant sequence of the noisy TVAR process and the weights. Nevertheless, directly using this approach with a sliding window leads to poor results, because the cumulant estimates may have a high variance when few data are available. Therefore, an adaptive gradient-type algorithm has been proposed in Ref. [4]. When dealing with stochastic methods, extended Kalman filtering and sigma-point Kalman filters (SPKF) [20], such as the unscented Kalman filter (UKF) and the central difference Kalman filter (CDKF), can be considered. In that case, the state vector stores both the TVAR parameters to be tracked and the TVAR process to be estimated from the noisy observations. Nevertheless, these methods require a priori knowledge of the noise variances. In [7], one of us recently proposed a recursive errors-in-variables (EIV) based method. It combines a Newton-Raphson (NR) algorithm to iteratively estimate the additive-noise variance and a sliding off-line noise-compensated Yule-Walker based method to deduce the TVAR parameters. In Refs. [19] and [11], particle filtering is considered, but the computational cost may be particularly high. Concerning the model order, some authors have discussed the importance of its selection. Among them, Costa and Hengstler [9] recently suggested using a modified version of the so-called predictive least squares (PLS) principle, initially introduced by Rissanen. They propose to choose the order that minimizes a weighted prediction-error energy, based on an exponential windowing using a forgetting factor. This adaptive algorithm has then been used successfully for time-frequency analysis.
In this paper, we focus our attention on DR-based methods. We propose two off-line methods to estimate the weights from the noisy observations. The first one is based on an EIV approach, and more particularly on the so-called Frisch scheme [5], whereas the second method consists in viewing the weight estimation as a generalized eigenvalue problem. The latter extends to the TVAR case the approach initially proposed by Davila [10] for AR parameter estimation from noisy observations. The remainder of the paper is organized as follows: Section 2 describes the problem statement. Section 3 deals with the methods we propose. Section 4 reports the results of the comparative study we carried out: first, we analyze the relevance of the approaches on synthetic data; then, results are provided with a real speech signal.

2. LS weight estimation for TVAR process

Let x(k) be a time-varying autoregressive (TVAR) process defined by

x(k) = -\sum_{n=1}^{p} a_n(k-n)\, x(k-n) + u(k),    (1)

where u(k) is the zero-mean driving process with variance \sigma_u^2 and where the TVAR parameters can be expressed by considering a function basis, as follows:

a_n(k) = \sum_{j=0}^{m} b_{nj}\, f_j(k),    (2)

where \{f_j(\cdot)\}_{j=0,1,\dots,m} are the basis functions and \{b_{nj}\}_{n=1,\dots,p;\ j=0,\dots,m} the corresponding weights.
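As an illustration of (1)-(2), the following sketch evaluates the time-varying parameters from the weights and draws samples of the process. The helper names (`tvar_params`, `simulate_tvar`) are ours, not the paper's, and the sign convention assumed here matches the transfer function H(z,k) = 1/(1 + \sum_n a_n(k-n) z^{-n}) used in Section 4.

```python
import numpy as np

def tvar_params(weights, basis, k):
    """Evaluate a_n(k) = sum_j b_nj f_j(k) for n = 1..p (Eq. (2)).

    weights: (p, m+1) array of b_nj; basis: list of m+1 callables f_j.
    """
    f = np.array([fj(k) for fj in basis])   # [f_0(k), ..., f_m(k)]
    return weights @ f                       # vector of a_n(k), n = 1..p

def simulate_tvar(weights, basis, N, sigma_u=1.0, rng=None):
    """Draw N samples of x(k) = -sum_n a_n(k-n) x(k-n) + u(k) (Eq. (1))."""
    rng = np.random.default_rng(rng)
    p = weights.shape[0]
    x = np.zeros(N)
    for k in range(N):
        acc = 0.0
        for n in range(1, p + 1):
            if k - n >= 0:
                acc -= tvar_params(weights, basis, k - n)[n - 1] * x[k - n]
        x[k] = acc + sigma_u * rng.standard_normal()
    return x
```

With the basis of Section 4.1 (f_0(k) = 1, f_1(k) = sin(2\pi k/N)), the weights b_11 = 1.8 and b_20 = 0.85 yield a_1(k) = 1.8 sin(2\pi k/N) and a constant a_2(k) = 0.85.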

The purpose of this section is to show how the weights \{b_{nj}\}_{n=1,\dots,p;\ j=0,\dots,m} can be estimated from the observations by using an off-line method, and to analyze the estimation bias when an additive white Gaussian noise disturbs the observations. Thus, using (2) and introducing the weight vector \theta_0 = [b_{10}\ b_{11}\ \cdots\ b_{1m}\ b_{20}\ \cdots\ b_{pm}]^T and X(k) = [f_0(k)\ \cdots\ f_m(k)]^T x(k), we obtain

a_n(k-n)\, x(k-n) = \theta_n^T X(k-n) = X^T(k-n)\, \theta_n, for n = 1,\dots,p,    (3)

where \theta_n = [b_{n0}\ \cdots\ b_{nm}]^T denotes the n-th subvector of \theta_0. Given (3), Eq. (1) becomes

x(k) = -[X^T(k-1)\ \cdots\ X^T(k-p)]\, \theta_0 + u(k).    (4)

Premultiplying both sides of (4) by the stacked vector [X^T(k-1)\ \cdots\ X^T(k-p)]^T and taking the expectation (u(k) being uncorrelated with the past of the process), we have

\theta_0 = -\left( E\left[ \begin{bmatrix} X(k-1) \\ \vdots \\ X(k-p) \end{bmatrix} [X^T(k-1)\ \cdots\ X^T(k-p)] \right] \right)^{-1} E\left[ \begin{bmatrix} X(k-1) \\ \vdots \\ X(k-p) \end{bmatrix} x(k) \right] = -(R_X^p)^{-1} r_X^p,    (5)

where

R_X^p = E\left[ \begin{bmatrix} X(k-1) \\ \vdots \\ X(k-p) \end{bmatrix} [X^T(k-1)\ \cdots\ X^T(k-p)] \right] and r_X^p = E\left[ \begin{bmatrix} X(k-1) \\ \vdots \\ X(k-p) \end{bmatrix} x(k) \right].
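Replacing the expectations in (5) by temporal means over the available samples gives the following off-line LS sketch (numpy-based; the function name is ours). It assumes noise-free observations and the sign convention of (1) and (4):

```python
import numpy as np

def ls_weights(x, basis, p):
    """Sample-average LS estimate theta_0 = -(R_X^p)^{-1} r_X^p (Eq. (5)).

    x: signal samples; basis: list of m+1 callables f_j; p: TVAR order.
    Returns theta_0 ordered as [b_10 .. b_1m, ..., b_p0 .. b_pm].
    """
    m1, N = len(basis), len(x)
    R = np.zeros((p * m1, p * m1))
    r = np.zeros(p * m1)
    for k in range(p, N):
        # stacked regressor [X^T(k-1) ... X^T(k-p)]^T, X(k) = [f_0(k)..f_m(k)]^T x(k)
        chi = np.concatenate([np.array([fj(k - n) for fj in basis]) * x[k - n]
                              for n in range(1, p + 1)])
        R += np.outer(chi, chi)
        r += chi * x[k]
    return -np.linalg.solve(R, r)
```

For a time-invariant AR model (m = 0, f_0(k) = 1) the estimate reduces to the usual LS solution of Yule-Walker-type normal equations.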

Here, the observation data y(k) are assumed to be disturbed by an additive zero-mean white Gaussian noise b(k) with variance \sigma_b^2:

y(k) = x(k) + b(k).    (6)

Now, similarly to X(k), let us introduce two vectors defined from the function basis, namely Y(k) = [f_0(k)\ \cdots\ f_m(k)]^T y(k) and B(k) = [f_0(k)\ \cdots\ f_m(k)]^T b(k). Based on Eq. (6), we can first easily derive the following relation:

X(k) = Y(k) - B(k).    (7)

Then, by substituting (7) into (5) and since the additive noise is white and zero-mean, one has

\theta_0 = -\left( E\left[ \begin{bmatrix} Y(k-1) \\ \vdots \\ Y(k-p) \end{bmatrix} [Y^T(k-1)\ \cdots\ Y^T(k-p)] \right] - E\left[ \begin{bmatrix} B(k-1) \\ \vdots \\ B(k-p) \end{bmatrix} [B^T(k-1)\ \cdots\ B^T(k-p)] \right] \right)^{-1} E\left[ \begin{bmatrix} Y(k-1) \\ \vdots \\ Y(k-p) \end{bmatrix} y(k) \right].    (8)

Then, let us focus our attention on the (m+1)p \times (m+1)p matrix

R_B^p = E\left[ \begin{bmatrix} B(k-1) \\ \vdots \\ B(k-p) \end{bmatrix} [B^T(k-1)\ \cdots\ B^T(k-p)] \right].

One can easily show that R_B^p is a block-diagonal matrix defined by the basis functions and the variance \sigma_b^2 of the additive noise. By denoting

F(j) = \begin{bmatrix} f_0^2(k-j) & \cdots & f_m(k-j) f_0(k-j) \\ \vdots & \ddots & \vdots \\ f_0(k-j) f_m(k-j) & \cdots & f_m^2(k-j) \end{bmatrix},

one has

R_B^p = \sigma_b^2 \begin{bmatrix} F(1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & F(p) \end{bmatrix}.    (9)

Consequently, if R_Y^p and r_Y^p are defined from the noisy observations similarly to R_X^p and r_X^p, respectively, Eq. (8) can be rewritten as follows:

\theta_0 = -(R_Y^p - R_B^p)^{-1} r_Y^p.    (10)
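When \sigma_b^2 is known, Eq. (10) can be applied directly, with temporal means in place of the expectations. A minimal sketch (the function name is ours; numpy assumed):

```python
import numpy as np

def noise_comp_weights(y, basis, p, sigma_b2):
    """Noise-compensated estimate theta_0 = -(R_Y^p - R_B^p)^{-1} r_Y^p (Eq. (10)),
    with expectations replaced by temporal means.  sigma_b2 is the (assumed known)
    variance of the additive white measurement noise; sigma_b2 = 0 gives the
    naive, biased LS estimate of Eq. (11)."""
    m1, N = len(basis), len(y)
    RY = np.zeros((p * m1, p * m1))
    RB = np.zeros((p * m1, p * m1))
    rY = np.zeros(p * m1)
    for k in range(p, N):
        fv = [np.array([fj(k - n) for fj in basis]) for n in range(1, p + 1)]
        chi = np.concatenate([fv[n - 1] * y[k - n] for n in range(1, p + 1)])
        RY += np.outer(chi, chi)
        rY += chi * y[k]
        # R_B^p = sigma_b^2 blkdiag(F(1),...,F(p)) with F(j) = f(k-j) f(k-j)^T, Eq. (9)
        for n in range(p):
            RB[n * m1:(n + 1) * m1, n * m1:(n + 1) * m1] += sigma_b2 * np.outer(fv[n], fv[n])
    cnt = N - p
    return -np.linalg.solve((RY - RB) / cnt, rY / cnt)
```

Calling it with `sigma_b2 = 0` reproduces the biased estimate, which is useful to visualize the bias derived in (11)-(15).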

If one does not compensate for the influence of the additive noise in (10), the weight estimate \theta^{(y)} satisfies

R_Y^p\, \theta^{(y)} = (R_X^p + R_B^p)\, \theta^{(y)} = -r_Y^p = -r_X^p.    (11)

Given (5), this means that

\theta^{(y)} = -(R_X^p)^{-1} R_B^p\, \theta^{(y)} - (R_X^p)^{-1} r_X^p = -(R_X^p)^{-1} R_B^p\, \theta^{(y)} + \theta_0,    (12)

or equivalently

\theta_0 = (I + (R_X^p)^{-1} R_B^p)\, \theta^{(y)}.    (13)

By using the classical matrix inversion lemma,(1) Eq. (13) becomes

\theta^{(y)} = (I + (R_X^p)^{-1} R_B^p)^{-1}\, \theta_0 = \theta_0 - (R_Y^p)^{-1} R_B^p\, \theta_0.    (14)

Since one can rewrite the above equation as follows:

\theta_0 = \theta^{(y)} + (R_Y^p)^{-1} R_B^p\, \theta_0,    (15)

the bias is equal to (R_Y^p)^{-1} R_B^p\, \theta_0. If the variance \sigma_b^2 of the noise is known, then the weights \theta_0 can be estimated directly using (10). However, if it is unknown, a noise-compensated approach is required to estimate the weights \theta_0. The purpose of the next section is to propose two kinds of approaches that allow both the variance of the additive noise and the weights to be estimated.

3. LS estimation of weights with noise compensation

In this paper, we suggest extending two approaches initially proposed for AR parameter estimation in [7] and [10] to the TVAR case. The first one is based on the EIV framework, whereas the second consists in viewing the estimation issue as a generalized eigenvalue problem.

3.1. DR approach based on errors-in-variables (DEIV)

Let us first rewrite (4) as follows:

x(k) - u(k) + [X^T(k-1)\ \cdots\ X^T(k-p)]\, \theta_0 = [x(k)-u(k)\ \ X^T(k-1)\ \cdots\ X^T(k-p)]\, \theta^p = 0,    (16)

where \theta^p = [1\ \ \theta_0^T]^T.

By premultiplying (16) by [x(k)-u(k)\ \ X^T(k-1)\ \cdots\ X^T(k-p)]^T and taking the expectation, we have

E\left[ \begin{bmatrix} x(k)-u(k) \\ X(k-1) \\ \vdots \\ X(k-p) \end{bmatrix} [x(k)-u(k)\ \ X^T(k-1)\ \cdots\ X^T(k-p)] \right] \theta^p = 0_{(p(m+1)+1) \times 1}.    (17)

Given (17), if one had noise-free observations, one alternative way to estimate the extended unknown weight vector \theta^p would be to search the kernel of the autocorrelation matrix of the vector [x(k)-u(k)\ \ X^T(k-1)\ \cdots\ X^T(k-p)]^T, i.e. the kernel of a (p(m+1)+1) \times (p(m+1)+1) matrix. However, when there are only noisy observations, this matrix is not available. Our purpose is hence to obtain this autocorrelation matrix from the (p(m+1)+1) \times (p(m+1)+1) autocorrelation matrix \bar{R}_Y^p of the noisy observations. As the variances of both the additive noise and the driving process are unknown, they also have to be estimated. For this purpose, let us look at the influence of the additive noise in Eq. (17) and express the autocorrelation matrix of the vector [x(k)-u(k)\ \ X^T(k-1)\ \cdots\ X^T(k-p)]^T from \bar{R}_Y^p. Thus, by substituting (6) and (7) into (17), one has

E\left[ \begin{bmatrix} y(k)-b(k)-u(k) \\ Y(k-1)-B(k-1) \\ \vdots \\ Y(k-p)-B(k-p) \end{bmatrix} [y(k)-b(k)-u(k)\ \ Y^T(k-1)-B^T(k-1)\ \cdots\ Y^T(k-p)-B^T(k-p)] \right] \theta^p = 0_{(p(m+1)+1) \times 1}.    (18)

Then, let us express (18) more explicitly. For j \geq 1, note that

E[(y(k)-b(k)-u(k))\,(Y^T(k-j)-B^T(k-j))] = [f_0(k-j)\ \cdots\ f_m(k-j)]\, E[y(k)\, y(k-j)]    (19)

(1) If A = B + C D^T, then A^{-1} = B^{-1} - B^{-1} C (I + D^T B^{-1} C)^{-1} D^T B^{-1}.
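The footnote's matrix inversion lemma can be checked numerically on a small example (the matrices below are chosen arbitrarily for illustration):

```python
import numpy as np

# Numeric check of the lemma: if A = B + C D^T then
# A^{-1} = B^{-1} - B^{-1} C (I + D^T B^{-1} C)^{-1} D^T B^{-1}
B = np.diag([2.0, 3.0, 4.0])
C = np.array([[1.0], [0.0], [1.0]])
D = np.array([[0.0], [1.0], [1.0]])
A = B + C @ D.T

Binv = np.linalg.inv(B)
lemma = Binv - Binv @ C @ np.linalg.inv(np.eye(1) + D.T @ Binv @ C) @ D.T @ Binv
```

This is the identity used to pass from (13) to (14), with B = R_X^p and C D^T = R_B^p.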


and

E[(y(k)-b(k)-u(k))^2] = E[y^2(k)] - \sigma_u^2 - \sigma_b^2.    (20)

If one introduces \bar{R}_Y^p, the autocorrelation matrix of the vector [y(k)\ \ Y^T(k-1)\ \cdots\ Y^T(k-p)]^T, and

\bar{R}_B^p = \begin{bmatrix} \sigma_u^2+\sigma_b^2 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & R_B^p & \\ 0 & & & \end{bmatrix},

Eq. (18) becomes

(\bar{R}_Y^p - \bar{R}_B^p)\, \theta^p = 0.    (21)

In (21), \bar{R}_Y^p - \bar{R}_B^p is positive semi-definite and \bar{R}_B^p is defined by the unknown variances \sigma_u^2 and \sigma_b^2. The formulation of an EIV estimation problem using the Frisch scheme [5] consists in determining, only on the basis of the noisy observations, the set of noise variances that satisfies the positive semi-definiteness condition

\bar{R}_Y^p - C^p(\alpha, \beta) = \bar{R}_Y^p - \begin{bmatrix} \alpha & 0 & \cdots & 0 \\ 0 & \beta F(1) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \beta F(p) \end{bmatrix} \succeq 0,    (22)

where \alpha and \beta are real positive scalars (\alpha playing the role of \sigma_u^2+\sigma_b^2 and \beta that of \sigma_b^2). To determine this set, let us consider candidate directions [\alpha_1\ \beta_1] such that \alpha_1^2 + \beta_1^2 = 1 and \alpha_1 \geq \beta_1. According to Appendix A, the 2-tuple [\lambda\alpha_1\ \ \lambda\beta_1] makes \bar{R}_Y^p - C^p(\alpha, \beta) positive semi-definite provided that \lambda^{-1} is the largest eigenvalue of (\bar{R}_Y^p)^{-1} C^p(\alpha_1, \beta_1). This hence leads to an infinite set of solutions defining a first curve S_p(\alpha_1, \beta_1). Then, one can iterate the same process by using another model order q higher than p, which leads to a second curve S_q(\alpha_1, \beta_1). In theory, the variances to be found belong to both curves. In practice, the expectation is replaced by the temporal mean over a sliding window; as a consequence, there is no longer an exact intersection between both curves. On the one hand, for the curve S_p(\alpha_1, \beta_1), one has

(\bar{R}_Y^p - C^p(\alpha_1 \lambda_{p,max}^{-1}, \beta_1 \lambda_{p,max}^{-1}))\, \theta^p = 0_{(p(m+1)+1) \times 1},

where \lambda_{p,max} is the largest eigenvalue of (\bar{R}_Y^p)^{-1} C^p(\alpha_1, \beta_1). On the other hand, for the curve S_q(\alpha_1, \beta_1), one has

(\bar{R}_Y^q - C^q(\alpha_1 \lambda_{q,max}^{-1}, \beta_1 \lambda_{q,max}^{-1}))\, \theta^q = (\bar{R}_Y^q - C^q(\alpha_1 \lambda_{q,max}^{-1}, \beta_1 \lambda_{q,max}^{-1})) \begin{bmatrix} \theta^q_{1:p} \\ \theta^q_{p+1:q} \end{bmatrix} = 0_{(q(m+1)+1) \times 1},

where \lambda_{q,max} is the largest eigenvalue of (\bar{R}_Y^q)^{-1} C^q(\alpha_1, \beta_1) and \theta^q_{1:p} stores the first p(m+1)+1 components of \theta^q. So, we suggest estimating the noise variances by minimizing the following criterion:

J(\alpha_1, \beta_1) = \| (\bar{R}_Y^p - C^p(\alpha_1 \lambda_{p,max}^{-1}, \beta_1 \lambda_{p,max}^{-1}))\, \theta^q_{1:p} \|^2,    (23)

where \|\cdot\| denotes the Frobenius norm.

The procedure for off-line TVAR estimation using the EIV approach can thus be summarized as follows:

1. Compute \bar{R}_Y^p and \bar{R}_Y^q.
2. Set initial values of \alpha_1 and \beta_1 under the conditions \alpha_1^2 + \beta_1^2 = 1 and \alpha_1 \geq \beta_1.
3. Construct C^q(\alpha_1, \beta_1) and compute the largest eigenvalue \lambda_{q,max} of (\bar{R}_Y^q)^{-1} C^q(\alpha_1, \beta_1).
4. Solve the eigenvalue decomposition of \bar{R}_Y^q - C^q(\alpha_1 \lambda_{q,max}^{-1}, \beta_1 \lambda_{q,max}^{-1}) to deduce the extended weight vector \theta^q. Keep the first p(m+1)+1 components of \theta^q to define \theta^q_{1:p}.
5. Compute the largest eigenvalue \lambda_{p,max} of (\bar{R}_Y^p)^{-1} C^p(\alpha_1, \beta_1).
6. Calculate the criterion J(\alpha_1, \beta_1) = \| (\bar{R}_Y^p - C^p(\alpha_1 \lambda_{p,max}^{-1}, \beta_1 \lambda_{p,max}^{-1}))\, \theta^q_{1:p} \|^2.
7. Obtain the estimates (\hat{\alpha}_1, \hat{\beta}_1) minimizing J(\alpha_1, \beta_1) by iterating steps 3 to 6 with respect to \alpha_1 and \beta_1 (\alpha_1^2 + \beta_1^2 = 1 and \alpha_1 \geq \beta_1).
8. Deduce the estimates of the noise variances \sigma_b^2 and \sigma_u^2 by \hat{\sigma}_b^2 = \hat{\beta}_1 \lambda_{p,max}^{-1} and \hat{\sigma}_u^2 = \hat{\alpha}_1 \lambda_{p,max}^{-1} - \hat{\beta}_1 \lambda_{p,max}^{-1}, respectively.
9. Deduce from (21) the extended weight vector \theta^p corresponding to the kernel of \bar{R}_Y^p - C^p(\hat{\sigma}_u^2 + \hat{\sigma}_b^2, \hat{\sigma}_b^2).
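A minimal sketch of steps 2-8, assuming the augmented correlation matrices \bar{R}_Y^p and \bar{R}_Y^q and the basis Gram blocks F(j) have already been formed from the data. The helper names (`build_C`, `kernel_vector`, `deiv_search`) are ours, and the plain grid search stands in for whatever minimizer is actually used:

```python
import numpy as np

def build_C(alpha, beta, F_blocks):
    """C(alpha, beta) = blkdiag(alpha, beta*F(1), ..., beta*F(p)), as in Eq. (22)."""
    n = 1 + sum(F.shape[0] for F in F_blocks)
    C = np.zeros((n, n))
    C[0, 0] = alpha
    i = 1
    for F in F_blocks:
        s = F.shape[0]
        C[i:i + s, i:i + s] = beta * F
        i += s
    return C

def kernel_vector(M):
    """Eigenvector of the smallest eigenvalue of a symmetric matrix."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return V[:, 0]

def deiv_search(RYp, RYq, Fp, Fq, n_grid=200):
    """Grid search over (alpha1, beta1) on the quarter circle
    alpha1^2 + beta1^2 = 1, alpha1 >= beta1 (steps 2-8)."""
    best = (np.inf, None)
    size_p = RYp.shape[0]
    for t in np.linspace(0.0, np.pi / 4, n_grid):  # alpha1 = cos t >= beta1 = sin t
        a1, b1 = np.cos(t), np.sin(t)
        # steps 3-4: order-q compensation and extended weight vector
        lq = np.max(np.real(np.linalg.eigvals(np.linalg.solve(RYq, build_C(a1, b1, Fq)))))
        thq = kernel_vector(RYq - build_C(a1 / lq, b1 / lq, Fq))
        # steps 5-6: order-p compensation and criterion J
        lp = np.max(np.real(np.linalg.eigvals(np.linalg.solve(RYp, build_C(a1, b1, Fp)))))
        J = np.linalg.norm((RYp - build_C(a1 / lp, b1 / lp, Fp)) @ thq[:size_p])
        if J < best[0]:
            best = (J, (b1 / lp, a1 / lp - b1 / lp))  # (sigma_b^2, sigma_u^2), step 8
    return best[1]
```

In the usage below, the matrices correspond analytically to a noisy AR(1) process (p = 1, m = 0, q = 2, f_0 = 1) with \sigma_u^2 = 1 and \sigma_b^2 = 0.5; the search recovers both variances.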

3.2. DR approach based on noise subspace (DNS)

In this second method, unlike the method proposed in the above section, we no longer aim at estimating the variance of the driving process; it can be deduced once the other quantities have been estimated. We only focus our attention on the joint estimation of the variance of the additive noise and the extended weight vector \theta^p. The method is still based on linear algebra results, but here one searches the kernel of a p(m+1) \times (p(m+1)+1) matrix. For this purpose, let us rewrite Eq. (10) as follows:

(R_Y^p - R_B^p)\, \theta_0 = -r_Y^p = -\begin{bmatrix} f_0(k-1)\, E(y(k-1) y(k)) \\ \vdots \\ f_m(k-p)\, E(y(k-p) y(k)) \end{bmatrix},    (24)

or equivalently

\left( [r_Y^p\ \ R_Y^p] - \sigma_b^2 \begin{bmatrix} 0_{p(m+1) \times 1} & \mathrm{blkdiag}(F(1), \dots, F(p)) \end{bmatrix} \right) \theta^p = 0_{p(m+1) \times 1}.    (25)

Given (4), for t > 0, one can easily show that

E(x(k) x(k-t)) = -[f_0(k-1)\, E(x(k-1) x(k-t))\ \cdots\ f_m(k-p)\, E(x(k-p) x(k-t))]\, \theta_0.    (26)

In addition, one has E(x(k-l) x(k-t)) = E(y(k-l) y(k-t)) for t > p \geq l \geq 0. Therefore, for t > p, combining these equalities with (26), one has

E(y(k) y(k-t)) + [f_0(k-1)\, E(y(k-1) y(k-t))\ \cdots\ f_m(k-p)\, E(y(k-p) y(k-t))]\, \theta_0 = 0,

or

[E(y(k) y(k-t))\ \ f_0(k-1)\, E(y(k-1) y(k-t))\ \cdots\ f_m(k-p)\, E(y(k-p) y(k-t))]\, \theta^p = 0.    (27)

Combining (25) and (27) for p < t \leq q, we obtain the following new relation:

(\tilde{R}_Y^p - \sigma_b^2 \tilde{R}_B^p)\, \theta^p = 0_{(pm+q) \times 1},    (28)

where the first p(m+1) rows of \tilde{R}_Y^p are [r_Y^p\ \ R_Y^p] and its q-p additional rows are [E(y(k) y(k-t))\ \ f_0(k-1)\, E(y(k-1) y(k-t))\ \cdots\ f_m(k-p)\, E(y(k-p) y(k-t))] for t = p+1, \dots, q (q \geq 1), and

\tilde{R}_B^p = \begin{bmatrix} 0_{p(m+1) \times 1} & \mathrm{blkdiag}(F(1), \dots, F(p)) \\ 0_{(q-p) \times 1} & 0_{(q-p) \times p(m+1)} \end{bmatrix}.

To obtain the weight vector \theta^p, Eq. (28) has to be solved. However, \tilde{R}_Y^p and \tilde{R}_B^p are not square. As described by Davila in [10], multiplying both sides of (28) by (\tilde{R}_Y^p - \sigma_b^2 \tilde{R}_B^p)^T, one has

(A_0 + \sigma_b^2 A_1 + \sigma_b^4 A_2)\, \theta^p = 0_{(p(m+1)+1) \times 1},

where A_0 = \tilde{R}_Y^{pT} \tilde{R}_Y^p, A_1 = -\tilde{R}_B^{pT} \tilde{R}_Y^p - \tilde{R}_Y^{pT} \tilde{R}_B^p, and A_2 = \tilde{R}_B^{pT} \tilde{R}_B^p. Let us define

P = \begin{bmatrix} A_0 & 0 \\ 0 & I \end{bmatrix} and Q = \begin{bmatrix} -A_1 & -A_2 \\ I & 0 \end{bmatrix}.

Eq. (28) is hence equivalent to

(P - \sigma_b^2 Q) \begin{bmatrix} \theta^p \\ \theta_e \end{bmatrix} = 0_{2(p(m+1)+1) \times 1},    (29)

where \theta_e = \sigma_b^2\, \theta^p is an auxiliary vector that is not directly of interest for the estimation. Therefore, the variance and the weights can be deduced by solving the generalized eigenvalue problem of the matrix pencil (P, Q): the weights can be read from the generalized eigenvector [\theta^{pT}\ \theta_e^T]^T corresponding to the generalized eigenvalue with the lowest modulus, which is equal to the variance \sigma_b^2. In practice, as in Section 3.1, the expectation is replaced by the temporal mean over a sliding window.
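The generalized eigenvalue step can be sketched as follows, assuming the stacked matrices of (28) are already available. The function name is ours, scipy is assumed for the generalized eigensolver, and the toy matrices in the test correspond to a noisy AR(1) case (p = 1, m = 0, q = 2) built analytically for illustration:

```python
import numpy as np
from scipy.linalg import eig

def dns_solve(RYt, RBt):
    """Solve (RYt - sigma_b^2 RBt) theta = 0 (Eq. (28)) via the linearized
    pencil (P, Q) of Eq. (29).  Returns (sigma_b^2, theta) with theta[0] = 1."""
    n = RYt.shape[1]
    A0 = RYt.T @ RYt
    A1 = -RBt.T @ RYt - RYt.T @ RBt
    A2 = RBt.T @ RBt
    P = np.block([[A0, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
    Q = np.block([[-A1, -A2], [np.eye(n), np.zeros((n, n))]])
    w, V = eig(P, Q)                       # generalized eigenvalues, some infinite
    finite = np.flatnonzero(np.isfinite(w))
    j = finite[np.argmin(np.abs(w[finite]))]  # lowest-modulus finite eigenvalue
    v = V[:n, j]
    v = v / v[0]                           # normalize so that theta^p[0] = 1
    return np.real(w[j]), np.real(v)
```

The smallest-modulus finite generalized eigenvalue gives \sigma_b^2 and the first p(m+1)+1 components of the associated eigenvector give \theta^p, as stated above.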

4. Simulation results

In this section, let us first analyze the performances of the proposed approaches with synthetic data. Then, the above methods are applied to a speech signal.

4.1. Results based on synthetic data

In this sub-section, we carry out a comparative study between the two deterministic regression (DR) methods proposed in this paper and five other existing approaches:

1) DR approach based on errors-in-variables (DEIV), proposed in Section 3.1;
2) DR approach based on noise subspace (DNS), proposed in Section 3.2;
3) TVAR version of the Yule-Walker (YW) estimation [12], directly used with the noisy observations;
4) cumulant-based adaptive gradient-type approach (CAG) [4];
5) standard Kalman filtering [1], directly used with the noisy observations;
6) extended Kalman filtering (EKF) [20];
7) recursive errors-in-variables approach (REIV) [7].

The noisy observations are generated using Eqs. (1), (2), and (6), where the variance of the driving process is equal to \sigma_u^2 = 1. The order of the TVAR model and the size of the basis are set to p = 2 and m = 1, respectively. The basis functions are given by

f_0(k) = 1 and f_1(k) = \sin(2\pi k / N), with N the number of samples.

First, a comparative study between DEIV, DNS and YW is carried out. The true values of the weights to be estimated are set to b_{10} = 0, b_{11} = 1.8, b_{20} = 0.85, and b_{21} = 0. The poles of the transfer function H(z,k) = 1 / (1 + \sum_{n=1}^{p} a_n(k-n) z^{-n}), evolving in time, and the spectrogram are given in Fig. 1. We consider 1400 samples of this TVAR process disturbed by an additive noise. The signal-to-noise ratio (SNR) is equal to 10 dB.

Fig. 1. Evolution of the poles in the z-plane (a) and the spectrogram of the noise-free TVAR process (b).

Table 1. Average of the weight estimates based on 50 trials (Monte Carlo simulations).

         True value    YW        DEIV     DNS
b_10     0             -0.002    0.003    0.004
b_11     1.8           1.25      1.75     1.85
b_20     0.85          0.39      0.81     0.89
b_21     0             0.005     0.002    0.001

Based on Monte Carlo simulations using 50 runs of the additive noise, Table 1 gives the weight estimates for the three methods. YW leads to a poor estimation of the weights: comparing the estimated TVAR parameters using YW with the true ones in Table 1 shows a sharp contrast. According to Fig. 2, which gives the TVAR parameters obtained from Eq. (2) using the weight estimates and the corresponding poles in the z-plane, the modulus of the poles estimated with YW is noticeably smaller than the modulus of the true ones. This leads to a spectrogram that is "smoother" along the frequency axis and confirms the result presented by Kay in [21] about the LS estimation of AR parameters from noisy observations. The DEIV and DNS approaches provide weight estimates that are closer to the true ones in Table 1; both methods make it possible to significantly compensate the influence of the additive noise.

Fig. 2. Estimations of the TVAR parameters (a) and the poles in the z-plane (b).

In Fig. 3, we present the root mean square error (RMSE) of the weight estimates using 50 independent trials for each SNR (varying from 0 to 50 dB). For a wide range of SNRs, the performances of the DEIV and DNS approaches are similar to those of the YW equations in the noise-free case. In addition, we can see that DEIV and DNS are reliable as long as the SNR is higher than 10 dB. This confirms the performances of the EIV method obtained in [13] to estimate AR parameters from noisy observations. It should be noted that for smaller SNRs, we suggest using DNS rather than DEIV.
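The synthetic data of this first experiment can be reproduced with the following sketch (our code, not the authors'; the sign convention x(k) = -a_1(k-1) x(k-1) - a_2(k-2) x(k-2) + u(k) is assumed, matching H(z,k) = 1/(1 + \sum a_n(k-n) z^{-n})):

```python
import numpy as np

def make_noisy_tvar(N=1400, snr_db=10.0, seed=0):
    """Generate the synthetic example of Section 4.1:
    weights b10 = 0, b11 = 1.8, b20 = 0.85, b21 = 0,
    basis f0(k) = 1, f1(k) = sin(2*pi*k/N), sigma_u^2 = 1,
    plus additive white Gaussian noise at the requested SNR."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(N)      # driving process, sigma_u^2 = 1
    x = np.zeros(N)
    for k in range(N):
        a1 = 1.8 * np.sin(2 * np.pi * (k - 1) / N)  # a_1(k-1)
        a2 = 0.85                                   # a_2(k-2)
        x[k] = u[k]
        if k >= 1:
            x[k] -= a1 * x[k - 1]
        if k >= 2:
            x[k] -= a2 * x[k - 2]
    sigma_b = np.sqrt(np.var(x) / 10 ** (snr_db / 10.0))
    y = x + sigma_b * rng.standard_normal(N)
    return x, y, sigma_b
```

With these weights the poles remain complex with modulus \sqrt{0.85} \approx 0.92 throughout the sweep, so the process stays stable.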

Fig. 3. RMSE of the weight estimations for each SNR.

Fig. 4. Estimations of the weights.

Let us now study the robustness of the off-line approaches with respect to the set of TVAR process samples under study. For this purpose, the TVAR process defined in the first test is still considered, but we now use a sliding window, whose length is 600, to estimate the weights b_{nj}. The weights depend on the windowed signal (or frame), are assigned to the central time of the frame and are denoted \hat{b}_{nj}(k). According to Figs. 4 and 5, we can notice that every method is sensitive to the time k. To make the estimation more robust, we suggest considering the weight estimates obtained for several values of k and averaging them. Given Table 2, where the results are based on 50 trials, we can see that the approaches become more robust.

Then, the DEIV and DNS approaches are compared with the other approaches: REIV, CAG, EKF and standard Kalman filtering. The evolution of the poles and the spectrogram of the noise-free TVAR process are illustrated in Fig. 6. The order of the TVAR model, the size of the basis, and the true value of the variance of the driving process are set to the same values as those used in the previous simulation. The true values of the weights are set to b_{10} = 0.5, b_{11} = -1, b_{20} = 0.9, and b_{21} = 0.02. In that case, the TVAR parameters are oscillatory. As depicted in Fig. 6(a), the poles evolve reciprocally. The SNR is equal to 8 dB. Concerning our approaches DEIV and DNS, it should be noted that the order of the basis is set to m = 3, which is different from the true order of the TVAR process. It can be seen that the DEIV, DNS, and EKF approaches provide estimates that "look like" the true curves. As expected, standard Kalman filtering is unable to provide a good estimation of the weights (see Fig. 7). This hence leads to a poor


Fig. 5. Estimations of the TVAR parameters (a) and the poles in the z-plane (b).

Table 2. Averages of the weight estimates based on the sliding-window method, at k = 600, 750, 900 (top) and at k = 580, 711, 826 (bottom).

               YW                              DEIV                            DNS
      True  k=600  k=750  k=900  mean    k=600  k=750  k=900  mean     k=600  k=750  k=900   mean
b_10  0     0.06   0.02   0.01   0.03    -0.02  0.02   0.04   0.03     0.02   0.01   -0.001  0.01
b_11  1.8   1.20   1.17   1.24   1.20    1.74   1.80   1.62   1.72     1.82   1.83   1.78    1.81
b_20  0.85  0.69   0.49   0.37   0.52    0.89   0.86   0.69   0.81     0.87   0.88   0.84    0.86
b_21  0     0.46   0.23   -0.05  0.21    0.08   0.02   0.003  0.03     0.02   0.02   -0.003  0.01

      True  k=580  k=711  k=826  mean    k=580  k=711  k=826  mean     k=580  k=711  k=826   mean
b_10  0     0.08   0.04   0.02   0.05    0.02   0.02   0.02   0.02     -0.03  0.005  0.003   -0.01
b_11  1.8   1.21   1.18   1.19   1.19    1.85   1.83   1.74   1.81     1.77   1.82   1.81    1.80
b_20  0.85  0.66   0.54   0.40   0.53    0.85   0.86   0.80   0.84     0.90   0.86   0.86    0.87
b_21  0     0.45   0.31   0.13   0.30    -0.004 0.003  0.03   0.01     0.05   0.007  0.003   0.02

estimation of the TVAR parameters and of the corresponding poles in the z-plane. The other approaches provide comparable results. CAG leads to a higher variance of the weight estimates over time. Then, Fig. 8 deals with the estimation of the evolution of the variance of the additive noise for the DEIV and DNS approaches. We can see that there are small variations of the variance estimate around the true value for both methods; the DEIV method is more accurate than the DNS method.

4.2. Results based on real data

In this sub-section, let us focus our attention on TVAR parameters for speech signal processing. AR parameters have been used for speech analysis, CELP (Code-Excited Linear Prediction) speech coding, and speech enhancement from a single sequence of noisy observations based on Kalman filtering [13] or H∞ filtering [14]. Here, we propose to look at TVAR parameter estimation for a speech signal frame disturbed by an additive white noise. More particularly, we consider French speech signals, sampled at 8 kHz and disturbed by an additive noise for an SNR equal to 22 dB. Among the sentences we have analyzed, we focus our attention in this paper on the following one: "le tribunal va bientôt rendre son jugement" (in English: "The Court will soon render its verdict").


Fig. 6. Evolutions of the poles in the z-plane (a) and the spectrogram of the noise-free TVAR process (b).

Fig. 7. Comparison of the estimation of the TVAR parameters for DEIV, REIV, and CAG (a); and for DNS, EKF and Kalman filtering (b).

In this paper, we present the analysis of some frames lasting approximately 200 ms, i.e. corresponding to 1600 samples. Both the estimations of the TVAR parameters from the noise-free observations and from the noisy observations are presented. The order of the TVAR process and the size of the basis are set to p = 10 and m = 3, respectively. Fig. 9 shows the temporal representation and the corresponding spectrogram of the sentence. Then, in Fig. 10, we present the temporal representation of a frame lasting 200 ms which is a part of the sentence; the corresponding noisy frame, with an SNR equal to 22 dB, and the spectrograms are also given. In Fig. 11, we present the temporal evolution of the TVAR parameters estimated by using our approaches and the TVAR version of the YW method. The results confirm those obtained in the previous sub-section: DEIV and DNS provide quite similar results. Fig. 12 deals with the estimation of the evolution of the variance of the additive noise for the DEIV and DNS approaches.


Fig. 8. Evolutions of the noise variance \sigma_b^2 estimated by the DEIV and DNS approaches.

Fig. 9. Speech signal and its spectrogram.

Fig. 10. Time representation of a frame lasting 200 ms and its spectrogram.

H. Ijima, E. Grivel / Signal Processing 92 (2012) 857–871

0

a6

a1

5

0

a7

a2

-5 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 5

-5 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec.

0

a8

a3

5

-5 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec.

0

a9

a4

5

0

a10

a5

-5 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 5

-5 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec.

869

4 2 0 -2 -4 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 4 2 0 -2 -4 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 4 2 0 -2 -4 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 4 2 0 -2 -4 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec. 4 2 0 -2 -4 3.4 3.42 3.44 3.46 3.48 3.5 3.52 3.54 3.56 3.58 3.6 Time sec.

Fig. 11. Estimation of the TVAR parameters a1–a10 (solid line: DEIV approach; dashed line: DNS approach; dotted line: standard YW method).


Fig. 12. Evolutions of the noise variance σ_b^2 estimated by the DEIV and DNS approaches.

5. Conclusions and perspectives

In this paper, two deterministic regression methods have been proposed to estimate the weights defining the TVAR parameters from noisy observations. In the DEIV approach, the weight vector corresponds to the kernel of a specific matrix and can be estimated by means of an eigenvalue decomposition; the variances of the additive noise and of the driving process are also estimated, following the errors-in-variables framework. In the DNS approach, a generalized eigenvalue decomposition allows both the variance of the additive noise and the weights to be estimated. Therefore, neither the DEIV nor the DNS approach requires any a priori information on the noise variance. Even if the computational cost of the DEIV approach is higher than that of the DNS approach (because of the noise-variance estimation), the DEIV approach provides results with higher accuracy. Both approaches provide smoothed evolutions of the TVAR parameters, unlike the REIV. We now focus our attention on single-microphone speech enhancement combining a TVAR model of speech with Kalman filtering. This will include an analysis of the appropriate frame size, psychoacoustic properties, the model-order selection issue, etc.

Appendix A

Let $\Omega_{M1} = [a_1 \; b_1]$ be such that $a_1^2 + b_1^2 = 1$ with $a_1 > 0$, $b_1 > 0$ and $a_1 \geq b_1$. Defining the relation $\Omega_{M1} = \lambda\,\Omega_M$ by $\lambda a = a_1$ and $\lambda b = b_1$, let us look at the semi-definite compensated matrix:

$$R_p^Y - C_p(a,b) = R_p^Y - \begin{bmatrix} a & 0 & \cdots & 0 \\ 0 & \Phi(1)b & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \Phi(p)b \end{bmatrix} \succeq 0.$$

As $R_p^Y$ is positive definite, $(R_p^Y)^{-1}$ exists and one has

$$I - (R_p^Y)^{-1} C_p(a,b) \succeq 0,$$

which is equivalent to

$$I - \frac{1}{\lambda}\,(R_p^Y)^{-1} C_p(a_1,b_1) \succeq 0.$$

At that stage, let us introduce the eigenvalue decomposition of $(R_p^Y)^{-1} C_p(a_1,b_1)$, whose eigenvalues are denoted $\{\lambda_i\}_{i=1,\ldots,p(m+1)+1}$. If $I - (1/\lambda_{\max})\,(R_p^Y)^{-1} C_p(a_1,b_1) \succeq 0$, its eigenvalues are positive or null, i.e. $\{1 - \lambda_i/\lambda_{\max}\}_{i=1,\ldots,p(m+1)+1} \geq 0$. Hence, $\lambda_{\max}$ is the largest eigenvalue of $(R_p^Y)^{-1} C_p(a_1,b_1)$.
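The appendix argument can be checked numerically. The snippet below is a toy verification, with randomly generated matrices standing in for $R_p^Y$ and $C_p(a_1, b_1)$ (all names and sizes are illustrative assumptions): the largest eigenvalue of $(R_p^Y)^{-1} C_p$ coincides with the largest generalized eigenvalue of the pair $(C_p, R_p^Y)$, which mirrors the generalized eigenvalue formulation exploited by the DNS approach.

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(1)

# Toy stand-ins: R plays the role of the positive-definite matrix R_p^Y and
# C that of the positive semi-definite compensation matrix C_p(a1, b1).
d = 5
A = rng.standard_normal((d, d))
R = A @ A.T + d * np.eye(d)                  # symmetric positive definite
C = np.diag(np.abs(rng.standard_normal(d)))  # diagonal PSD, like C_p

# Largest eigenvalue of R^{-1} C ...
lam_max = np.max(np.real(np.linalg.eigvals(np.linalg.solve(R, C))))

# ... coincides with the largest generalized eigenvalue of the pencil (C, R).
lam_gen = np.max(np.real(eig(C, R, right=False)))
assert np.isclose(lam_max, lam_gen)

# For lambda = lam_max, I - (1/lambda) R^{-1} C has nonnegative eigenvalues,
# the smallest one being numerically zero, as claimed in the appendix.
M = np.eye(d) - np.linalg.solve(R, C) / lam_max
print(np.min(np.real(np.linalg.eigvals(M))))
```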

References

[1] D. Aboutajdine, Y. Grenier, M. Najim, Non-stationary signal modeling and identification, in: Proceedings of the IFAC/IFORS Conference on Control Science and Technology for Development (CSTD 85), Beijing, 1985, pp. 1180–1187.
[2] A. Aboutajdine, N. Najim, Y. Grenier, M. Barlaud, G. Alengrin, J. Menez, On AR modelling of nonstationary signals, in: Proceedings of the Eighth IFAC/IFORS Symposium on Identification and System Parameter Estimation, Beijing, 1988.
[3] Y.I. Abramovich, N.K. Spencer, M.D.E. Turley, Time-varying autoregressive (TVAR) models for multiple radar observations, IEEE Transactions on Signal Processing 55 (4) (2007) 1298–1311.
[4] M. Bakrim, D. Aboutajdine, M. Najim, New cumulant-based approaches for non-Gaussian time-varying AR models, Signal Processing 39 (1–2) (1994) 107–115.
[5] S. Beghelli, R. Guidorzi, U. Soverini, The Frisch scheme in dynamic system identification, Automatica 26 (1990) 171–176.
[6] R.B. Pachori, P. Sircar, EEG signal analysis using FB expansion and second-order linear TVAR process, Signal Processing 88 (2) (2008) 415–420.
[7] W. Bobillet, R. Diversi, E. Grivel, R. Guidorzi, M. Najim, U. Soverini, Speech enhancement combining optimal smoothing and errors-in-variables identification of noisy AR processes, IEEE Transactions on Signal Processing 55 (12) (2007) 5564–5578.
[8] G.E.P. Box, G.M. Jenkins, et al., Time Series Analysis: Forecasting and Control, 3rd ed., Prentice Hall, Englewood Cliffs, NJ, 1994.
[9] A.H. Costa, S. Hengstler, Adaptive time-frequency analysis based on autoregressive modeling, Signal Processing 91 (4) (2011) 740–749.
[10] C.E. Davila, A subspace approach to estimation of autoregressive parameters from noisy measurements, IEEE Transactions on Signal Processing 46 (2) (1998) 531–534.
[11] W. Fong, S. Godsill, A. Doucet, M. West, Monte Carlo smoothing with application to audio signal enhancement, IEEE Transactions on Signal Processing 50 (2) (2002) 438–449.
[12] Y. Grenier, Time-dependent ARMA modeling of nonstationary signals, IEEE Transactions on Acoustics, Speech, and Signal Processing 31 (4) (1983) 899–911.
[13] W. Bobillet, R. Diversi, E. Grivel, R. Guidorzi, M. Najim, U. Soverini, Speech enhancement combining optimal smoothing and errors-in-variables identification of noisy AR processes, IEEE Transactions on Signal Processing 55 (12) (2007) 5564–5578.
[14] D. Labarre, E. Grivel, M. Najim, N. Christov, Dual H∞ algorithms for signal processing, application to speech enhancement, IEEE Transactions on Signal Processing 55 (11) (2007) 5195–5208.
[15] J.R. Hopgood, C. Evers, Block-based TVAR models for single-channel blind dereverberation of speech from a moving speaker, in: Proceedings of the IEEE Workshop on Statistical Signal Processing, 2007, pp. 274–278.
[16] J.P. Kaipio, P.A. Karjalainen, TVAR Modeling of Event Related Synchronization Changes: the Optimal Basis Approach, University of Kuopio, Department of Applied Physics Report Series, ISSN 0788-4672, 1995.
[17] J.P. Kaipio, P.A. Karjalainen, Estimation of event-related synchronization changes by a new TVAR method, IEEE Transactions on Biomedical Engineering 44 (8) (1997) 649–656.
[18] M. Najim, M. Salhi, D. Aboutajdine, A. Rajouani, M. Zyoute, Reconstruction of Arabic long vowels using time varying linear prediction technique, in: Proceedings of the 1988 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 88), New York, USA, April 1988.
[19] J. Vermaak, C. Andrieu, A. Doucet, S. Godsill, Particle methods for Bayesian modeling and enhancement of speech signals, IEEE Transactions on Speech and Audio Processing 10 (3) (2002) 173–185.
[20] E.A. Wan, R. Van Der Merwe, The unscented Kalman filter, in: Kalman Filtering and Neural Networks (Chapter 7), Adaptive and Learning Systems for Signal Processing, Communications, and Control, Wiley, 2001.
[21] S.M. Kay, Modern Spectral Estimation: Theory and Application, Prentice Hall, Englewood Cliffs, NJ, 1988.