Implementation aspects of various higher-order statistics estimators



J. Franklin Inst. Vol. 333(B), No. 3, pp. 349-367, 1996

Pergamon

S0016-0032(96)00022-1

Copyright © 1996 The Franklin Institute. Published by Elsevier Science Ltd. Printed in Great Britain. 0016-0032/96 $15.00 + 0.00

Implementation Aspects of Various Higher-order Statistics Estimators

by GEORGE C. W. LEUNG

and

DIMITRIOS HATZINAKOS

Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario, Canada M5S 1A4

ABSTRACT: The purpose of this paper is twofold. First, a modified robust maximum likelihood higher-order statistics estimator is derived based on the assumption that the obtained estimates (moments or cumulants) follow a generalized Gaussian distribution (GGD) of parameter α. By determining the parameter α which corresponds to the lowest bias and variance of estimation, the most appropriate GGD is determined as a function of data length. Then the efficiency of various higher-order statistics (HOS) estimators is assessed in terms of VLSI implementation merits such as data throughput delay, cost, and round-off noise performance. The obtained results provide guidelines in choosing the appropriate estimator in HOS applications. Copyright © 1996 Published by Elsevier Science Ltd

† Due to circumstances beyond the Publisher's control, this article appears without author corrections.

1. Introduction

For many years, the use of higher-order statistics (moments or cumulants) has been an increasing trend in the area of digital signal processing. Quite apart from the well known second-order analysis based on the assumption of a Gaussian signal model, higher-order statistics can capture the non-Gaussian information in the signal. They also preserve the nonminimum phase information of signals, which is suppressed by second-order analysis, and they are natural tools for the characterization of nonlinearities (1-5). However, the development of efficient higher-order moment and cumulant estimators is still at a preliminary stage. The most widely used estimator is the mean estimator, which is the maximum likelihood estimator under a Gaussian assumption for the moment or cumulant distribution. Other estimators are the robust estimators suggested in (6), which attempt to deal with outliers in the estimates obtained from short data records. On the other hand, efficient implementations of the mean estimator have been proposed in (7) and (8-11) to reduce the cost and data throughput delay. However, more has to be done on the above aspects of HOS estimators. First, investigation of the validity of the choice of estimator for a certain signal source is one aim of this paper. This is done using the generalized Gaussian distribution (GGD) (12, 13) of parameter α. Second, guidelines are given for the digital implementation of the currently known HOS estimators, based on a set of widely accepted figures of merit, namely the cost, data throughput delay and round-off noise with respect to the input data length.

The organization of the paper is as follows: first, a brief review of the definitions of the moments and cumulants of a random process is given. Since (7) has not yet appeared, a brief description of the idea behind Alshebeili's efficient algorithms is given in Section 1.2. The proposed estimator based on the GGD assumption is investigated in Section II. The VLSI implementation characteristics of various estimators and implementations are examined in Section III. Conclusions are drawn in Section IV.

1.1. Definition of moments and cumulants
The nth order moment of the real stationary discrete random process {x(i)} is defined as (1-3)

Mx,n(τ1, τ2, ..., τn−1) = E{x(i)x(i+τ1)···x(i+τn−1)}.   (1)

E{·} denotes statistical expectation. On the other hand, the nth order cumulant of {x(i)} can be written as a nonlinear function of moments of order less than or equal to n. The 2nd, 3rd and 4th order cumulants of zero-mean processes are utilized mostly in practice. They take the form

Cx,2(τ) = Mx,2(τ)   (2)

Cx,3(τ1, τ2) = Mx,3(τ1, τ2)   (3)

Cx,4(τ1, τ2, τ3) = Mx,4(τ1, τ2, τ3) − Mx,2(τ1)Mx,2(τ2 − τ3) − Mx,2(τ2)Mx,2(τ3 − τ1) − Mx,2(τ3)Mx,2(τ1 − τ2).   (4)

Relations between moments and cumulants for n > 4 can be found in (2, 3). In practice only a finite number of time samples {x(i)}, i = 1, 2, ..., N, is available. Then the well known mean estimator for the nth order moment takes the form

M̂x,n(τ1, τ2, ..., τn−1) = (1/N) Σ_{i=1}^{N} x(i)x(i+τ1)···x(i+τn−1)   (5)

for the non-redundant region 0 ≤ τ1 ≤ τ2 ≤ ··· ≤ τn−1 ≤ q << N, where q is some chosen maximum lag that is much smaller than N. Usually, the nth order cumulant estimate is obtained by properly combining moment estimates of order less than or equal to n.
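As a concrete illustration of the mean estimator of Eq. (5), the third-order moment lag can be computed as below (a Python sketch of ours, not from the paper; samples outside the record are treated as zero):

```python
import numpy as np

def third_order_moment(x, m, n):
    """Mean estimator (Eq. 5) of the third-order moment Mx,3(m, n),
    normalized by the full record length N."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    k = max(m, n)
    # keep all three indices i, i+m, i+n inside the record
    return np.sum(x[:N - k] * x[m:N - k + m] * x[n:N - k + n]) / N
```

For a zero-mean process this quantity is also the third-order cumulant estimate, per Eq. (3).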

1.2. Alshebeili's efficient algorithms
Alshebeili's efficient algorithms (7) manage to speed up the operation of the mean estimator by reducing the number of multiplications, which is the most costly arithmetic operation in terms of time. Two different forms were proposed in (7).
1.2.1. Cross-correlation based approach (CCBA). Consider the case of estimating the third-order cumulant lag Cx,3(m, n) of a zero-mean process {x(i)}. The expression in Eq. (6) can be separated into two parts, one depending on m only and the other on n only, as seen in Eq. (7).

Cx,3(m, n) = M̂x,3(m, n) = (1/N) Σ_{i=1}^{N} x(i)x(i+m)x(i+n)   (6)


FIG. 1. A schematic diagram of the block correlation based approach (overlapping blocks B1, B2, ..., B2(M+1) of moment terms).

= (1/N) Σ_{i=1}^{N} {x(i)x(i+n)} x(i+m)

= (1/N) w(n) x^T(m)   (7)
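In software terms, the separation in Eq. (7) amounts to computing the products x(i)x(i+n) once per n and reusing them for every m ≤ n. A rough sketch of ours (not the VLSI design of (7)):

```python
import numpy as np

def cumulants3_ccba(x, q):
    """Third-order cumulant estimates Cx,3(m, n) for 0 <= m <= n <= q,
    reusing the precomputed products w(i, n) = x(i)*x(i+n) (CCBA idea).
    Sketch only; samples beyond the record are treated as zero."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    xp = np.concatenate([x, np.zeros(q)])      # zero-pad so index i+n stays valid
    C = np.zeros((q + 1, q + 1))
    for n in range(q + 1):
        w = xp[:N] * xp[n:N + n]               # computed once, reused for all m
        for m in range(n + 1):
            C[m, n] = np.dot(w, xp[m:N + m]) / N
    return C
```

Each inner step costs one multiplication per sample instead of two, which is the source of the savings quoted below.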

where w(n) and x(m) are row vectors and 0 ≤ m ≤ n ≤ q. Thus, for a certain value of n, all the terms w(i, n) = x(i)x(i+n) can be calculated first and stored in memory so that they can be accessed for any value of 0 ≤ m ≤ n, instead of repeating the same multiplications. This can also be extended to the fourth-order moment, Mx,4(m, n, l), with the terms x(i)x(i+n) and x(i)x(i+m)x(i+n) all calculated and stored in memory so that no redundant multiplication is needed for any case of l where 0 ≤ l ≤ m ≤ n. Furthermore, no multiplication is needed to calculate the second-order moments separately, because the fourth-order cumulants of a zero-mean process depend only on the fourth-order and the second-order moments of the process. From the results in (7), the number of multiplications saved by the CCBA implementation reaches 50% for the third-order cumulant case and 88% for the fourth-order cumulant case (taking N = 128 and q = 10) compared to the direct implementation scheme. This represents a considerable saving in hardware power consumption and computational time.
1.2.2. Generalized block correlation based approach (GBCBA). Consider the third-order cumulant case with the summation expanded out:

Cx,3(τ1, τ2) = x(0)x(τ1)x(τ2) + ··· + x(τ1)x(2τ1)x(τ1 + τ2) + ···   (8)

= x(τ1)[x(0)x(τ2) + x(2τ1)x(τ1 + τ2)] + ···   (9)

The main effect of the above factorization of the two terms is to reduce the multiplications from four in Eq. (8) to only three in Eq. (9), with the number of additions unaffected. This overlapping of moment terms is best illustrated in Fig. 1, where Bi denotes a block of moment terms. In the example denoted by Eq. (9), the term x(0)x(τ2) lies inside block B1 and x(2τ1)x(τ1 + τ2) inside block B2. Each block in Fig. 1 has a size T3, which is the number of cumulant terms x(i)x(i+m)x(i+n) in the block:

T3 = min(m, n − m), if n ≠ m and m ≠ 0;  T3 = n, else.   (10)

The expressions for the third-order cumulant for each case of T3 are given in Table 1. The parameter M is defined as follows. Let

TABLE I
Third-order cumulant: Alshebeili's GBCBA implementation

T3 = m:
Cx,3(m, n) = Σ_{j=0}^{M} Σ_{i=2jT3}^{2jT3+T3−1} x(i+m)[x(i)x(i+n) + x(i+2m)x(i+m+n)]
T3 = n − m:
Cx,3(m, n) = Σ_{j=0}^{M} Σ_{i=2jT3}^{2jT3+T3−1} x(i+n)[x(i)x(i+m) + x(i+n−m)x(i+2n−m)]
T3 = n:
Cx,3(m, n) = Σ_{j=0}^{M} Σ_{i=2jT3}^{2jT3+T3−1} x(i+n)[x(i)x(i+m) + x(i+m+n)x(i+2n)]
m = n = 0:
Cx,3(0, 0) = Σ_{i=0}^{N−1} x³(i)

N − n = 2T3·S + s,  for 0 ≤ s < 2T3,

where S and s are integers. Then, assuming x(n) = 0 for n > N, the value of M will be S − 1 if s = 0, or S if s ≠ 0.

II. A New Estimator for HOS

Even though the mean estimator is the fundamental tool for estimating HOS, it is often found to have high estimation variance and bias. In (6) Nandi observed that in the case of short data lengths, the resultant cumulant distributions have longer tails than a Gaussian p.d.f., which leads to the high variance of the mean estimate. His strategy to rectify the effect of the outliers is either to exclude them, as in the outlier-excluded mean estimator, or to deflate their effect through some designed influence functions, as in the biweight (and wave) estimators, which are known as the robust estimators in (6). However, a better approach may be possible by choosing a distribution model with a parameter that controls the amount of outliers. For example, a GGD of parameter α becomes a good choice, with α controlling the size of the tails and outliers (12, 13):

f(η) = [α / (2β Γ(1/α))] exp(−(|η|/β)^α),  where 0 < α < ∞, −∞ < η < ∞.   (11)

For α = 2 we get the Gaussian p.d.f.; for α > 2 we tend to a more uniform distribution; while for α < 2 we obtain a p.d.f. with more outliers and heavier tails than the Gaussian p.d.f. Actually, for α = 1 we have the Laplacian p.d.f. In the area of parameter estimation, it is common to use Eq. (12) to relate the estimate z, the true value θ, and the noise of estimation η:

z = θ + η.   (12)
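The GGD density of Eq. (11) is easy to evaluate numerically; the sketch below (ours, with the scale β = 1 assumed by default) confirms the special cases α = 1 (Laplacian) and α = 2 (Gaussian, up to the variance parameterization):

```python
import math

def ggd_pdf(eta, alpha, beta=1.0):
    """Generalized Gaussian p.d.f. of Eq. (11):
    f(eta) = alpha / (2*beta*Gamma(1/alpha)) * exp(-(|eta|/beta)**alpha)."""
    coeff = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return coeff * math.exp(-((abs(eta) / beta) ** alpha))
```

For α = 1 this reduces to (1/2)e^(−|η|); for α = 2 it equals e^(−η²)/√π; larger α flattens the density towards a uniform shape, while smaller α thickens the tails.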

By choosing the GGD with parameter α as the p.d.f. of η, the maximum likelihood estimator can be constructed for estimation purposes.

2.1. Maximum likelihood estimator

Consider the case of the estimation of the third-order moment or cumulant of a zero-mean process. With the estimates zi becoming zi(m, n),† the true value of estimation θ substituted by Mx,3(m, n), and the same noise ηi(m, n), Eq. (12) becomes the following expressions.

† zi(m, n) = x(i)x(i+m)x(i+n), i = 1, 2, ..., N, in the case of the 3rd order moment.

zi(m, n) = Mx,3(m, n) + ηi(m, n)   (13)

ηi(m, n) = zi(m, n) − Mx,3(m, n).   (14)

Thus by substitution we obtain the expression for f(zi(m, n) | Mx,3(m, n)). Now, given the data samples zi(m, n), i = 1, 2, ..., N, and assuming that the ηi(m, n) come from a zero-mean GGD and are independent of each other for all i, we can write down the likelihood and log-likelihood functions (16)

ℒ = Π_{i=1}^{N} f(zi(m, n) | Mx,3(m, n))   (15)

ln ℒ = N ln [α / (2β Γ(1/α))] − (1/β^α) Σ_{i=1}^{N} |zi − Mx,3(m, n)|^α.   (16)

Thus maximizing ln ℒ with respect to Mx,3(m, n) is the same as minimizing the summation term, since the first term is constant with respect to Mx,3(m, n) for given α and β. By differentiation, we have

max ln ℒ ⇔ ∂ ln ℒ / ∂Mx,3(m, n) = 0   (17)

Σ_{i=1}^{N} α |zi − Mx,3(m, n)|^{α−1} sgn(zi − Mx,3(m, n)) (−1) = 0

Σ_{i=1}^{N} |zi − Mx,3(m, n)|^{α−1} sgn(zi − Mx,3(m, n)) = 0.   (18)

To get the closed form of the MLE of Mx,3(m, n) for a given α, we need to solve the nonlinear equation (18). The preferred method for doing that is the Newton-Raphson iteration, due to its fast convergence to the root Mx,3(m, n). The detailed algorithm is summarized below for the case of the third-order moment, with obvious modifications for the case of the nth order moment. For a given α (14, 15):
1. Choose values of m and n where n ≥ m ≥ 0.
2. Calculate z(i) = x(i)x(i+m)x(i+n) for i = 1, 2, ..., N.


3. Choose a value for α.
4. Carry out the Newton-Raphson iteration, where α ≠ 1 (since α = 1 leads to the median estimator):

M̂x,3^(k+1)(m, n) = M̂x,3^(k)(m, n) − g/g'   (19)

where

g = Σ_{i=1}^{N} |zi − M̂x,3^(k)(m, n)|^{α−1} sgn(zi − M̂x,3^(k)(m, n))

g' = −(α − 1) Σ_{i=1}^{N} |zi − M̂x,3^(k)(m, n)|^{α−2}

until |M̂x,3^(k+1)(m, n) − M̂x,3^(k)(m, n)| ≤ ε.
The initial value M̂x,3^(0)(m, n) is chosen to be the mean of z(i), and M̂x,3^(k)(m, n) is the kth iterate of M̂x,3(m, n). To reach conclusions for different values of α, the above procedure should be repeated for each α chosen. Also, as mentioned earlier, the calculation of cumulants of order n > 3 is accomplished by properly combining the moment estimates up to order n obtained using the above iterative procedure.

2.2. Simulation results
For simulation purposes, the following input signals were chosen for the MLE in Eq. (19): (1) discretely uniformly distributed i.i.d. 4-PAM {±1, ±3}, 7-PAM {0, ±1, ±2, ±3} and 8-PAM {±1, ±3, ±5, ±7} data sequences; (2) a continuously uniformly distributed i.i.d. sequence in the interval [−3, 3]; and (3) two correlated random sequences obtained by driving the linear moving average system (MA(2)) of Eq. (20)

y[k] = x[k] − 2.05x[k−1] + x[k−2]

(20)

with a discretely uniformly distributed i.i.d. (4-PAM) and a continuously uniformly distributed i.i.d. ([−3, 3]) sequence, respectively. The values of α used in the simulations range from 0.5 to 2.5 in intervals of 0.2. It should be clarified that the reported results represent the value of α for which the variance and bias of the estimate were both minimum. However, in many cases the values of α for minimum variance and minimum bias did not agree, so a balanced choice of α was made. It was found from the simulation results (15) that the fluctuation of the variance was much smoother than that of the bias, so least bias became the criterion for locating the appropriate α. The resulting choices of α are tabulated in Tables 3-8. The following observations were made (14, 15).
2.2.1. Zero-mean, uniformly distributed i.i.d. signals.
• Discretely distributed signal: 4-PAM (Table 3)
(a) When the true moment value is zero, the values of α seemed to increase from as


TABLE II
Assumptions for the number of gates, G, and delay, T, for all digital components; b represents the word-length in bits; Naddr stands for 'number of addresses'

Delay time, T                         No. of gates, G
T_ALU   = 5t                          G_ALU   = 16b
T_mult  = 4bt                         G_mult  = 10b² + 30b + 80
T_latch = 5t                          G_latch = 10b
T_RAM   = (5 + log2(Naddr))t

TABLE III
Results of α with best variance and bias performance: uniformly i.i.d. 4-PAM signal
Discretely uniformly distributed i.i.d. signal: 4-PAM {±1, ±3}

              Theoretical                 Length of input data
Moment        value          64     128    256    512    1024   100,000
Mx(0)         5              >2.5   >2.5   >2.5   >2.5   >2.5   >2.5
Mx(τ)         0              1.5    1.5    1.7    1.7    1.9    2.0
3rd order     0              1.1    1.3    1.3    1.5    1.5    1.7
Mx(0,0,0)     41             >2.5   >2.5   >2.5   >2.5   >2.5   >2.5
Mx(0,τ,τ)     25             2.0    2.0    2.0    2.0    2.0    2.0
Mx(τ,τ,τ)     0              1.1    1.1    1.1    1.3    1.3    1.5
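The 'theoretical value' column of Table III follows directly from the equiprobable 4-PAM alphabet; a quick check (our sketch):

```python
import numpy as np

levels = np.array([-3.0, -1.0, 1.0, 3.0])   # 4-PAM alphabet, equiprobable
m2 = np.mean(levels ** 2)                   # Mx(0) = E[x^2]
m4 = np.mean(levels ** 4)                   # Mx(0, 0, 0) = E[x^4]
m2_sq = m2 ** 2                             # Mx(0, tau, tau) = E[x^2]^2 for i.i.d. data
print(m2, m4, m2_sq)                        # 5.0 41.0 25.0
```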

TABLE IV
Results of α with best variance and bias performance: uniformly i.i.d. 7-PAM signal
Discretely uniformly distributed i.i.d. signal: 7-PAM {0, ±1, ±2, ±3}

              Theoretical          Length of input data
Moment        value          64     128    256    512    1024
Mx(0)         4              2.0    2.0    2.0    2.0    2.0
Mx(τ)         0              1.3    1.3    1.0    1.0    1.0
3rd order     0              1.1    1.0    1.0    1.0    1.0
Mx(0,0,0)     28             2.0    2.0    2.0    2.0    2.0
Mx(0,τ,τ)     16             2.0    2.0    2.0    2.0    2.0
Mx(τ,τ,τ)     0              1.0    1.0    1.0    1.0    1.0

TABLE V
Results of α with best variance and bias performance: uniformly i.i.d. 8-PAM signal
Discretely uniformly distributed i.i.d. signal: 8-PAM {±1, ±3, ±5, ±7}

              Theoretical          Length of input data
Moment        value          64     128    256    512    1024
Mx(0)         21             2.0    2.0    2.0    2.0    2.0
Mx(τ)         0              1.5    1.5    1.7    1.7    1.9
3rd order     0              1.0    1.0    1.0    1.0    1.0
Mx(0,0,0)     777            2.0    2.0    2.0    2.0    2.0
Mx(0,τ,τ)     441            2.0    2.0    2.0    2.0    2.0
Mx(τ,τ,τ)     0              1.0    1.0    1.0    1.0    1.0

TABLE VI
Results of α with best variance and bias performance: continuously uniformly distributed i.i.d. signal
Continuously uniformly distributed i.i.d. signal: uniform distribution within [−3, 3]

              Theoretical          Length of input data
Moment        value          64     128    256    512    1024
Mx(0)         3              2.0    2.0    2.0    2.0    2.0
Mx(τ)         0              1.3    1.0    1.0    1.0    1.0
3rd order     0              1.3    1.0    1.0    1.0    1.0
Mx(0,0,0)     16.2           2.0    2.0    2.0    2.0    2.0
Mx(0,τ,τ)     9              2.0    2.0    2.0    2.0    2.0
Mx(τ,τ,τ)     0              1.0    1.0    1.0    1.0    1.0

low as 1.1 up to 2.0 with increasing data length. This means that when the data length is increased sufficiently, the moment distribution becomes asymptotically Gaussian. This effect is more pronounced in the case of the second-order moments than in the third and fourth-order moments. As expected, the convergence to Gaussianity is slower as the order of the estimated moment increases. Even with a 100,000-sample sequence, the distribution of the fourth-order moments is still far from Gaussian. It should be pointed out that in this case of the 4-PAM signal, the median estimator had worse variance performance than the generalized Gaussian estimator at values of α close to, but not equal to, 1.0. To explain this, consider the equation denoting the maximum likelihood condition that leads to the median estimator.


TABLE VII
Results of α with best variance and bias performance: MA system, uniformly distributed 4-PAM as input
MA(2) output signal: uniformly distributed 4-PAM as input

              Theoretical          Length of input data
Moment        value          64     128    256    512    1024
3rd order     0              1.0    1.0    1.0    1.0    1.0
Mx(0,0,0)     2217           2.0    2.0    2.0    2.0    2.0
Mx(0,1,1)     1516           2.0    2.0    2.0    2.0    2.0
Mx(0,2,2)     978            2.0    2.0    2.0    2.0    2.0
Mx(0,3,3)     962            2.0    2.0    2.0    2.0    2.0
Mx(1,1,1)     −1545          2.0    2.0    2.0    2.0    2.0
Mx(2,2,2)     431            2.0    2.0    2.0    2.0    2.0
Mx(3,3,3)     0              1.0    1.0    1.0    1.0    1.0

TABLE VIII
Results of α with best variance and bias performance: MA system, continuously uniformly distributed signal [−3, 3] as input
MA(2) output signal: continuously uniformly distributed signal [−3, 3] as input

              Theoretical          Length of input data
Moment        value          64     128    256    512    1024
3rd order     0              1.0    1.0    1.0    1.0    1.0
Mx(0,0,0)     826            2.0    2.0    2.0    2.0    2.0
Mx(0,1,1)     558            2.0    2.0    2.0    2.0    2.0
Mx(0,2,2)     353            2.0    2.0    2.0    2.0    2.0
Mx(0,3,3)     346            2.0    2.0    2.0    2.0    2.0
Mx(1,1,1)     −571           2.0    2.0    2.0    2.0    2.0
Mx(2,2,2)     157            2.0    2.0    2.0    2.0    2.0
Mx(3,3,3)     0              1.0    1.0    1.0    1.0    1.0

Σ_{i=1}^{N} sgn(zi − θ) = 0.   (21)

Consider the case of the all-zero-lags third-order cumulant, which should be zero. The moment estimates will consist of the values {−27, −1, 1, 27} for 4-PAM. Since the median estimator works by sorting the sequence of estimates and


taking the value in the middle (or the average of the middle two values), the third-order moment will only be zero if there are equal numbers of values from {−27, −1} and from {1, 27}. Otherwise the moment estimate is most likely to be −1 or 1. This will certainly result in a high variance of estimation. This situation will most likely arise when an M-PAM signal with even M is used, such as the 4-PAM signal. Notice that when M is even, the symbol '0' is not a valid signal level. Thus the third-order cumulant terms x(i)x(i+m)x(i+n) will not be zero in any case, and the median estimator is unlikely to produce the value zero as an estimate.
(b) For the case of the even-order moments with all-zero lag, the best values of α seemed to lie beyond 2.5, as denoted by '>2.5' in Table 3. Since the underlying distribution becomes more uniform as α → ∞, this trend of α moving beyond 2.5 might suggest that the distribution of the discrete even-order moments is quite uniform. Examination of the moment distribution revealed that these moments of a PAM signal have exactly the same distribution as the original input PAM signal.
(c) For the case of all non-zero, non-all-zero-lag moments, such as M4,x(0, τ, τ), the value of α lies very close to 2 even for short data lengths, as shown in Table 3, meaning that the mean is the best estimator regardless of the variation of the data sample length. Actually, this observation for the 4-PAM signal was found to be more pronounced in all the other signals examined next.
• Discretely distributed signal: 7-PAM (Table 4)
(a) When the true moment value is zero, almost all the values of α were found to be 1.0, except at the short-data end with lengths of 64 or 128. For those cases, slight deviations towards 1.3 were observed. The trend here suggests that when the estimate value is expected to be zero, the median estimator is a good choice. In many cases, the estimate variances and biases were actually zero, making the choice of the median estimator most obvious. In the case of 7-PAM the median estimator has much better performance, since '0' is a valid signal level, making it most likely for the estimator to produce the value 0.
(b) For the case of the even-order moments with all-zero lag, it was found that α = 2.0 was the best choice for all signal lengths. For this value of α the estimator gave the smallest bias and a slightly higher variance than values of α > 2.0. This is a considerable difference compared to the case of 4-PAM, and it is explained by the observation that 7-PAM is closer to a continuously distributed signal than 4-PAM. Similar behaviour was observed for the 8-PAM signal examined later.
(c) For the case of all non-zero, non-all-zero-lag moments, such as M4,x(0, τ, τ), the value of α lies very close to 2, as shown in Table 4, meaning that the mean estimator is the choice regardless of the variation of the data sample length. These results for 7-PAM were more pronounced than the corresponding results for the 4-PAM signal.
• Discretely distributed signal: 8-PAM (Table 5)
By comparing the results of Table 5 with those of Table 4, we observe that the behaviour of α has changed considerably in some of the cases. In the case of all zero


lag second and fourth-order moments, namely Mx(0) and Mx(0, 0, 0) respectively, the behaviour is similar to that with a 7-PAM signal, with the bias close to zero when α = 2.0. Also, in the case of third and fourth-order moments with zero true moment value, the median estimator proved to be much better than in the case of 4-PAM. In general, the results with the 8-PAM signal were found to be, on average, closer to those of a continuously distributed signal than to the results with a 4-PAM signal.
• Continuously uniformly distributed signal within [−3, 3] (Table 6)
The simulation results verify the trend observed in going from a 4-PAM to a 7-PAM and then to an 8-PAM signal. In particular:
(a) When the true moment value is zero, the best value of α was chosen to be 1.0, since the value lies somewhere around 1.0, meaning that the median is the best estimator in this case. The change in data length did not affect the choice of α in the way it did in the discrete case. This observation agrees with the results in (6), where the conclusion was that the median estimator produced the best results in the case of 3rd order cumulants.
(b) When the moment value is non-zero, the best value of α was found to be 2.0, which means that the mean estimator is the choice here. This is observed even for short data lengths. However, it is difficult to decide whether the outlier effect mentioned in (6) exists here or not, since the variance at α = 2.0 was not the lowest among all α, but the bias increased sharply as α moved beyond the vicinity of 2.0. The outliers are still responsible for the high variance of the mean estimator when the data length is small, but the requirement of low bias maintains the mean estimator as the best choice in this case.
2.2.2. Linear MA(2) signals (Tables 7 and 8). For both cases of input (discretely [4-PAM] and continuously uniformly distributed signals) to the linear MA(2) system described in Eq. (20), the simulation results for the behaviour of α were very similar to the case of the uniformly distributed i.i.d. signal in the range [−3, 3]. The observations can be summarized as follows:
• For all the moments with non-zero true value, the lowest bias point is at α = 2.0, meaning that the mean estimator should be chosen here.
• When the true value of the moment is zero, e.g. in the case of Mx(3, 3, 3) and all the third-order moments, the best value of α is 1.0 for both low variance and low bias performance. This implies the choice of the median estimator.
In conclusion, we see that the assumption of Gaussianity for the distribution of moments or cumulants is not universally true. Gaussianity is a valid assumption when long data records are available or when the set of signal values is sufficiently 'rich'. This became clear from the discrete types of distributions considered in this paper. The utilization of a GGD has allowed us to determine more accurately this deviation from Gaussianity.
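The conclusion that, for zero-valued moments of a continuously distributed signal, the median (α = 1) beats the mean (α = 2) on short records is easy to reproduce with a small Monte Carlo sketch (an illustration of ours, not the paper's simulation code):

```python
import numpy as np

# Spread of the mean (alpha = 2) vs the median (alpha = 1) estimator of the
# all-zero-lag third-order moment, whose true value is 0 for a uniform signal.
rng = np.random.default_rng(0)
N, trials = 64, 2000
mean_est, med_est = [], []
for _ in range(trials):
    x = rng.uniform(-3.0, 3.0, N)
    z = x ** 3                       # products z(i) = x(i)^3
    mean_est.append(z.mean())
    med_est.append(np.median(z))
print(np.var(mean_est), np.var(med_est))
```

On short records the median's spread around the true value 0 is far smaller, matching the observation above.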

III. Assessment of HOS Estimators
Before any analysis can be carried out, it is necessary to list clearly the implementations of HOS estimators included in the coming assessment.


(1) Mean estimator: forms of implementation such as the direct implementation (15) taking into account the symmetry property, the efficient algorithms of Alshebeili (7), i.e. the cross-correlation based approach (CCBA) and the generalized block-correlation based approach (GBCBA), and the systolic array implementation (8-11).
(2) Robust estimators (6): median estimator, outlier-excluded mean estimator, biweight and wave estimators, weighted biweight and wave estimators.
(3) The generalized Gaussian estimator suggested in Section II.

3.1. Figures of merit and assumptions
In order to assess the efficiency of the VLSI implementations of a group of HOS estimators, a number of figures of merit have been agreed upon. Using the definitions of cost and data throughput delay suggested in (17, 18), the following criteria were chosen and applied:
• Number of mathematical operations, such as multiplications and additions, per cumulant lag.
• Data throughput delay (per output sample), defined as

Td ≜ Nmult Tmult + NALU TALU + NRAM TRAM + NROM TROM + Nlatch Tlatch
= Nmult Tmult + NALU TALU + NRAM TRAM + Nlatch Tlatch   (22)

since no algorithm involved here requires the use of ROM.

-- Nmult, NALU, NRAM, NROM, Nlatch: the number of multiplications, additions, RAM accesses, ROM accesses, and latchings per output sample (in this case, one cumulant lag).
-- Tmult, TALU, TRAM, TROM, Tlatch: the time needed to perform one multiplication, addition, RAM access, ROM access, and latching. A general definition of the delay time for each digital component is given in Table 2 (17). Notice that the definition of the time Td depends on the parameter t, the duration of one clock cycle. Since t relates closely to the available technology and is beyond the scope of this paper, we normalize Eq. (22) by dividing both sides by t. The data throughput delay Td is then expressed in clock cycles instead of units of time.
• Cost of digital hardware, defined as

Cs ≜ (G fc + KRAM BRAM fRAM + KROM BROM fROM) · Ptech   (23)

with
-- BRAM, BROM being the number of bits of RAM and ROM memory
-- KRAM = 0.3 and KROM = 0.03 being the relative costs of one bit of memory to one logic gate
-- fc = fRAM = fROM the potential operating speed in Hz
-- Ptech in watts per gate·Hz (or silicon area per gate·Hz)
-- G the total number of gates (i.e. GALU + Gmult + Glatch + Gcontrol). The control circuit will consist of counters for the time index i and the moment lags (l, m, n) and some registers for generating control signals.
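As a toy numerical reading of Eqs. (22)-(25) with the Table 2 component models (the per-lag operation counts and RAM size below are hypothetical placeholders, not figures from the paper):

```python
import math

b = 16                                        # word length in bits
T_ALU, T_MULT, T_LATCH = 5, 4 * b, 5          # Table 2 delays, in clock cycles
T_RAM = 5 + math.log2(1024)                   # 1024 RAM addresses assumed
G = 16 * b + (10 * b**2 + 30 * b + 80) + 10 * b   # G_ALU + G_mult + G_latch

# hypothetical operation counts per cumulant lag
N_mult, N_alu, N_ram, N_latch = 3, 2, 4, 6
Td = N_mult * T_MULT + N_alu * T_ALU + N_ram * T_RAM + N_latch * T_LATCH  # Eq. (22)

B_RAM = 2 * 1024 * b                          # hypothetical RAM size in bits
Cs_norm = G + 0.3 * B_RAM                     # Eq. (25), in 'No. of gates'
print(Td, Cs_norm)
```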


Cs will be defined in units of watts (or silicon area). We can normalize Cs by dividing it by fc Ptech and thus obtain the following expression

Cs,norm = Cs / (fc Ptech) ≜ G + 0.3 BRAM + 0.03 BROM   (24)

= G + 0.3 BRAM   (25)

Notice that no algorithm mentioned requires the use of ROM. This normalized cost function has units of 'No. of gates'. It should be further normalized by the number of cumulant lags computed throughout the non-redundant region, since the cost should be spread evenly between the cumulant lags. Thus the final unit of the cost function is 'No. of gates per cumulant lag'. A general definition of the number of gates for each digital component is given in Table 2 (17).
• Round-off noise, using techniques developed in (19). In floating-point arithmetic, the round-off noise variance is directly related to the number of additions in the HOS estimation and is more pronounced in the case of long data records. However, some of the existing algorithms involve other operations, such as sorting or recursive operations. Thus it is beneficial to have a general expression showing the factors on which the round-off noise depends. This is done by assuming a zero-mean, stationary input signal. Since an i.i.d. input signal is a special case of the stationary condition, it is used to simplify the general expression to a point where numerical values can be obtained. Graphs are plotted to facilitate the discussion.
The following assumptions and parameters are used in the digital implementation of all the available HOS estimators.
(1) Floating-point arithmetic, represented as sgn · M · 2^A, where M is the mantissa with 1/2 ≤ M < 1 and A is the exponent (a positive or negative integer), is used for the round-off noise analysis.
(2) Data word length b taken to be 16 bits.
(3) Uniformly distributed round-off noise over one quantization interval, so that

σo² = 2^(−2b) / 3   (26)

(4) Data sample length, N with 64 ~< N ~< 1024. With the assumptions and parameters clearly defined, numerical values of the data throughput delay Td and the cost function Cs can be obtained within the range of data sample length N. Also the same analysis can be done on the round-off noise for each implementation. [All the detail steps and analysis that led to the following results have been omitted in this paper and should be referred to (15), which contained the block diagrams and the closed form expressions of td, C, and the round-off noise of each implementation.] The numerical results are plotted in graphs--Figs 2-7--that show how the throughput delay, cost and round-offnoise vary with respect to N. This should facilitate the comparison and choosing of a HOS estimator according to the user's requirement. 3.2. Discussion From the graphs, the following observations are made: (1) The Alshebeili implementations proved to be a low cost, low data throughput

G. C. W. Leun# and D. Hatzinakos

362

T h i r d O r d e r C u m u l a n ( : D i l t a ThroughlPUl [ : ) e l a y / C u m u l a n ! l a g

Median

..........

7.5

........" 7

..........

..'"'"

(5o ~.e.r)

_. .

.-"

. .........

UI VVeC~i'~rW&VQ Oen.r__uu

6.5 ..'" ,m"

A

. ~ ' ~

, .... " --

' ~'

_..~...-" . . . .

"l'=([~iWOi~ h ~ l ~ V a v e

(20 iter)

"--

6 .<':"

o ,,

--gJ5.5

."

/

"

.,...~, ~ ' ~

- _ _ -

..,-~"

_ - - -

/ ' /

--

; y,

--

~

......

--

Systolic-Array Tnmmed-Mean Moan(GIBCBA)

~ ~ . . . . ~ o , r . = )

_---

5

' :" 4.5

I

..,'°"

I le ,..

4

do

3.5

40

4o

~o

d a t a length

,o'0o

,='0o

FIG. 2. Third order cumulant: data throughput delay per cumulant lag.

Third Order Cumulant : Cc4VCumulant lag 4.8 BiweighlANave 4.1B

!il!:i.. !i i!:!:!:!!!:;..........

.....

..

..

..

o.__

4.4 Median 4.2 . . . . . . . . . . . . . .

Systolic-Array

4

......,.'" -~3.8

3.6

Uean(GBCBA)

J

3.4

Mean(CCBA) 3.2

3

0

do

40

do

d a t a length

~o

lO'OO

FIG. 3. Third order cumulant: cost function per cumulant lag.


[Figure; curves shown versus data length for the Biweight/Wave Gen. Gaussian (50 iter), Median, Biweight/Wave, Trimmed-Mean, Mean(GBCBA) and Systolic-Array estimators.]

FIG. 4. Fourth order cumulant: data throughput delay per cumulant lag.

[Figure; curves shown versus data length for the Biweight/Wave, Mean(direct), Mean(GBCBA), Median, Mean(CCBA) and Systolic-Array estimators.]

FIG. 5. Fourth order cumulant: cost function per cumulant lag.


[Figure; curves shown versus data length for the Biweight (50 iter), Biweight (20 iter), Mean(direct), Mean(CCBA), Trimmed-Mean, Systolic-Array, Median and Mean(GBCBA) estimators.]

FIG. 6. Third order cumulant: roundoff error variance (upper bound).

[Figure; curves shown versus data length for the Biweight (50 iter), Mean(direct), Mean(CCBA), Systolic-Array, Mean(GBCBA) and Median estimators.]

FIG. 7. Fourth order cumulant: roundoff error variance (upper bound).


delay in both the third order and fourth order cumulant cases, with the cross-correlation based approach (CCBA) always being better than the generalized block-correlation based approach (GBCBA) in terms of both throughput delay and cost. It should be noted that the main feature of Alshebeili's efficient algorithms is the use of RAM blocks to store all the second-order terms, x(i)x(i+β), and third-order terms, x(i)x(i+β)x(i+γ), so that the same term can be accessed from memory instead of being recalculated, thus saving much multiplication time. However, because all these second- and third-order terms have to be calculated and stored before any calculation of moment terms can start, this leads to a slightly larger throughput delay than the mean estimator in its direct implementation form, except in the case of the fourth order cumulant, where the CCBA has better delay performance. This can be explained by noting that, during the calculation of the second order moments, all the x(i)x(i+β) terms have already been calculated and stored in RAM, and can be accessed in much less time than actually multiplying x(i) and x(i+β). This is not the case for either the direct implementation or Alshebeili's GBCBA form of the mean estimator. On the other hand, the cost functions per cumulant lag of the Alshebeili algorithms are always lower than that of the direct implementation of the mean estimator, since normalization has to be applied in Alshebeili's algorithms.

(2) For all the robust estimators, the cost and data throughput delay functions are observed to be higher than those of all the implementation schemes of the mean estimator, due to the higher complexity of the algorithms. This observation extends to the generalized Gaussian estimator, because the Newton-Raphson iteration it requires is also used in the biweight and wave estimators.
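The term-reuse idea behind the CCBA can be illustrated in software. The following Python sketch (ours, not the authors' VLSI design; the function name and the dictionary-based storage are illustrative stand-ins for the RAM blocks) precomputes every second-order product once and reuses it for each third-order lag:

```python
def third_order_cumulant_ccba(x, max_lag):
    """Sketch of a CCBA-style estimate of the third-order moment
    m3(b, g) = (1/N) * sum_i x(i) x(i+b) x(i+g) for a zero-mean sequence.
    All second-order products x(i)*x(i+b) are computed once and stored
    (the role played by the RAM blocks), then reused for every lag g
    instead of being recomputed."""
    N = len(x)
    # Pre-compute and store every second-order term x(i) * x(i+b).
    stored = {b: [x[i] * x[i + b] for i in range(N - b)]
              for b in range(max_lag + 1)}
    m3 = {}
    for b in range(max_lag + 1):
        for g in range(b, max_lag + 1):
            # Reuse the stored x(i)*x(i+b) terms: one new multiply per i.
            s = sum(stored[b][i] * x[i + g] for i in range(N - g))
            m3[(b, g)] = m3[(g, b)] = s / N   # symmetry in the lags
    return m3
```

In exact arithmetic this returns the same estimates as the direct triple-product computation; the saving is in the number of multiplications, which is what drives the cost and delay figures discussed above.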
The higher cost and throughput delay functions are expected, since the robust estimators either involve a recursive algorithm, as in the biweight, wave and generalized Gaussian estimators, or a sorting algorithm, as in the median estimator; both involve more computation than the mean estimator in any implementation form. Even the outliers-excluded mean estimator has to spend time calculating the mean and standard deviation of the moment terms before any result can be obtained. One point that must be clarified here is that there are many possible ways of implementing a median estimator. The one chosen in this paper uses a single comparator and a bubble sort algorithm, which produces a high data throughput delay but a low cost. More comparators can be used to improve the data throughput delay, but the algorithm then becomes more complicated to manage.

(3) For the systolic array, the data throughput delay performance was found to lie between that of Alshebeili's algorithms for the mean estimator and that of the robust estimators in the third-order cumulant case. In the fourth order cumulant case, however, Alshebeili's GBCBA algorithm needs extra time to calculate the second order moments, producing a higher throughput delay than the systolic array. Moreover, while the same systolic array can be used for all moments from first order (the mean) up to fourth order, the hardware needed by the mean and Alshebeili's algorithms increases considerably from the third order case to the fourth order case. Thus, comparatively, the systolic array implementation scheme has a lower cost than the mean and Alshebeili's algorithms.
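The single-comparator, bubble-sort median variant just described can be sketched in software. This is our illustrative reconstruction, not the paper's hardware design; the lag arguments b and g are illustrative:

```python
def bubble_sort_median(values):
    """Median via bubble sort, mirroring the single-comparator hardware
    variant: low cost, but O(n^2) comparisons funnelled through one
    comparator, hence a high data throughput delay."""
    v = list(values)
    n = len(v)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if v[j] > v[j + 1]:            # the single comparator
                v[j], v[j + 1] = v[j + 1], v[j]
    return v[n // 2] if n % 2 else 0.5 * (v[n // 2 - 1] + v[n // 2])

def median_cumulant_estimate(x, b, g):
    """Robust third-order cumulant estimate at lag (b, g): the median of
    the individual sample products x(i)x(i+b)x(i+g) rather than their
    mean, so that isolated outliers in short records are suppressed."""
    n = len(x) - max(b, g)
    terms = [x[i] * x[i + b] * x[i + g] for i in range(n)]
    return bubble_sort_median(terms)
```

Replacing `bubble_sort_median` with a parallel sorting network is the software analogue of adding comparators: the delay drops, at the price of a more complicated (and costlier) structure.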


different implementations, the following observations were made. For the mean estimator, in its direct implementation form, in Alshebeili's approaches and in the systolic array, the variance of the round-off noise increases linearly with the data length N. While the direct implementation form and Alshebeili's CCBA approach have the same round-off noise variance expression, given in (19), Alshebeili's GBCBA approach and the systolic array have better noise variance performance. The main reason is the segmentation inherent in those two approaches. In the GBCBA approach, the data samples are divided into blocks and additions are first performed within each block; the block results are then added together to form the moment estimate. In the systolic array, the data samples have to be segmented into blocks of 4, since that is how the array is constructed to work. This segmentation of the data samples makes the round-off noise variance smaller in these two approaches than in the direct implementation and the CCBA approach. Among the robust estimators, the median estimator appears to have a round-off noise performance independent of N, which should be much better than the various implementation schemes of the mean estimator, all of which grow linearly with N. Interestingly, however, Figs 6 and 7 show that the round-off noise variance of the GBCBA approach is less than that of the median estimator when N is relatively small; this was not obvious before the graphs were plotted. For the outliers-excluded mean estimator, the fraction of moment terms excluded leads to the same fractional reduction in the resultant round-off noise relative to the direct implementation form of the mean estimator. If the upper bound is considered, however, it coincides with the noise variance of the direct implementation form, since in the extreme case no outlier is found and excluded.
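The effect of segmentation on the accumulation order can be illustrated with a small sketch (ours, not the paper's hardware). In exact arithmetic both routines return the same sum; the benefit of the blocked version appears only in finite precision, where each term passes through a shorter chain of dependent additions:

```python
def direct_sum(terms):
    """Sequential accumulation, as in the direct and CCBA schemes: the
    first term passes through N-1 dependent additions, so the round-off
    noise variance grows linearly with N."""
    acc = 0.0
    for t in terms:
        acc += t
    return acc

def blocked_sum(terms, block=4):
    """Block-wise accumulation, as in the GBCBA and systolic-array
    schemes (the systolic array in the text works on blocks of 4):
    partial sums are formed within short blocks and then combined, so
    each term passes through far fewer dependent additions."""
    partial = [direct_sum(terms[i:i + block])
               for i in range(0, len(terms), block)]
    if len(partial) == 1:
        return partial[0]
    return blocked_sum(partial, block)   # combine partial sums recursively
```

This is the standard blocked (pairwise-style) summation argument, which is consistent with the GBCBA and systolic-array curves lying below the direct and CCBA curves in Figs 6 and 7.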
Finally, the biweight, wave and generalized Gaussian estimators are found to have much more complicated expressions for the round-off noise variance. Even though the final expressions are incomplete to some extent, their dependence on the data length N is found to be close to N², which can be regarded as an order worse than the mean estimator.
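As a software illustration of the iterative nature of these estimators, the following sketch (ours) computes a Tukey biweight location estimate. The paper's estimators use Newton-Raphson; here the simpler reweighted-mean fixed-point iteration is used instead, and the tuning constant c and iteration count are assumed parameters, not values from the paper:

```python
import statistics

def biweight_location(values, c=6.0, iters=20, eps=1e-12):
    """Tukey biweight location estimate by reweighted-mean iteration.
    Samples far from the current estimate receive weight zero, so
    outlying moment terms from short records are discounted rather than
    averaged in.  The iteration count plays the role of the 20/50
    iterations quoted for the biweight curves in Figs 2-7."""
    theta = statistics.median(values)
    # Median absolute deviation as a robust scale (guard against zero).
    mad = statistics.median(abs(v - theta) for v in values) or eps
    for _ in range(iters):
        num = den = 0.0
        for v in values:
            u = (v - theta) / (c * mad)
            if abs(u) < 1.0:
                w = (1.0 - u * u) ** 2      # biweight window
                num += w * v
                den += w
        theta = num / den if den else theta
    return theta
```

Each pass over the data involves multiplications, divisions and a data-dependent weight evaluation, which is why both the cost/delay figures and the round-off noise of this family of estimators are markedly worse than those of the mean estimator.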

4. Conclusion

This paper shows that the assumption of a Gaussian distribution for moment estimates is not universally true, especially when short data lengths are used. By adopting a generalized Gaussian distribution, we were able to locate the best value of α and select the best estimator, which may be the median estimator when α ≈ 1, the mean estimator when α ≈ 2, or some other estimator for other values of α. Also, in digital implementations of higher order statistics estimators, special care should be taken to balance the cost, data throughput delay and round-off errors against the intended application. The results obtained in this paper provide guidelines for choosing a HOS estimator or implementation for different lengths of data sequences.
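The α-dependent choice of estimator can be made concrete with a small sketch (ours, not part of the paper): the maximum-likelihood location estimate under a GGD of exponent α minimizes Σ|x(i) − θ|^α, and even a crude grid search recovers the sample mean at α = 2 and the sample median at α = 1:

```python
def ggd_location(values, alpha, grid_steps=2000):
    """ML location estimate under a generalized Gaussian density of
    exponent alpha: minimize sum(|x - theta|**alpha) over theta.  A grid
    search over [min, max] is used purely as a sketch; alpha = 2 yields
    the sample mean and alpha = 1 the sample median."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # degenerate case: all samples equal
        return lo
    best_theta, best_cost = lo, float("inf")
    for k in range(grid_steps + 1):
        theta = lo + (hi - lo) * k / grid_steps
        cost = sum(abs(v - theta) ** alpha for v in values)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta
```

In practice one would solve the first-order condition iteratively rather than search a grid, but the sketch shows why the mean and median are just the α = 2 and α = 1 special cases of the same GGD maximum-likelihood family.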

Acknowledgement

This work was supported by the Natural Sciences and Engineering Research Council (NSERC).


References

(1) D. R. Brillinger and M. Rosenblatt, "Asymptotic theory of estimates of k-th order spectra", in "Spectral Analysis of Time Series" (Edited by B. Harris), pp. 153-188, Wiley, NY, 1967.
(2) C. L. Nikias and J. M. Mendel, "Signal processing with higher-order spectra", IEEE Signal Process. Mag., pp. 10-37, July, 1993.
(3) J. M. Mendel, "Tutorial on higher-order statistics (spectra) in signal processing and system theory", Proc. IEEE, Vol. 79, pp. 278-305, 1991.
(4) C. L. Nikias and A. P. Petropulu, "Higher-order Spectral Analysis: A Nonlinear Signal Processing Framework", Prentice-Hall, Englewood Cliffs, NJ, 1993.
(5) D. Hatzinakos, "Higher-order spectral (HOS) analysis", in "Dynamic Control Systems" (Edited by C. Leondes), Academic Press, NY, 1995.
(6) A. K. Nandi, "On the robust estimation of third order cumulants in applications of higher order statistics", Proc. IEE-F, Vol. 140(6), pp. 380-389, 1993.
(7) A. Kh. Al Jabri, S. A. Alshebeili and A. N. Venetsanopoulos, "Efficient algorithms for computing third and fourth order cumulants", submitted to Signal Processing, under review.
(8) E. S. Manolakos, H. M. Stellakis and D. H. Brooks, "Parallel processing for biomedical signal processing: higher order spectral analysis", IEEE Computer, Vol. 24(7), pp. 33-34, March, 1991.
(9) H. M. Stellakis and E. S. Manolakos, "An array of processors for the real-time estimation of fourth- and lower-order moments", Signal Process., Vol. 36, pp. 341-354, April, 1994.
(10) H. M. Stellakis and E. S. Manolakos, "An architecture for the real-time estimation of cumulants", Proc. IEEE ICASSP'93, pp. IV-220-223, Minneapolis, MN, 1993.
(11) H. M. Stellakis and E. S. Manolakos, "A VLSI architecture for the order recursive estimation of higher order statistics", Proc. IEEE Signal Process. Workshop on HOS, pp. 96-100, June, 1993.
(12) M. K. Varanasi and B. Aazhang, "Parametric generalized Gaussian density estimation", J. Acoust. Soc. Am., Vol. 86(4), pp. 1403-1415, October, 1989.
(13) R. H. Lambert and C. L. Nikias, "Optimal equalization cost functions and maximum a posteriori estimation", Proc. MILCOM'95, 1995.
(14) G. Leung and D. Hatzinakos, "Efficient implementations of higher-order statistics estimators", Proc. ATHOS Workshop on H.O.S., Begur, Spain, June, 1995.
(15) G. Leung, "Efficient implementations of higher-order statistics estimators", Master's Thesis, University of Toronto, May, 1995.
(16) P. J. Bickel and K. A. Doksum, "Mathematical Statistics: Basic Ideas and Selected Topics", Holden-Day, San Francisco, CA, 1985.
(17) J. S. Ward, P. Barton, J. B. G. Roberts and B. L. Stanier, "Figures of merit for VLSI implementations of digital signal processing algorithms", IEE Proc. Part F, Vol. 131(1), pp. 64-70, February, 1984.
(18) H. H. Chiang, C. L. Nikias and A. N. Venetsanopoulos, "Efficient implementations of quadratic digital filters", IEEE Trans. Acoust., Speech, Signal Process., Vol. ASSP-34(6), December, 1986.
(19) D. Hatzinakos, "Analysis of floating point roundoff errors in the estimation of higher order statistics", Proc. IEE-F, Vol. 140(6), pp. 371-379, 1993.