
Signal Processing 10 (1986) 115-127 North-Holland


ESTIMATION OF FIRST- AND SECOND-ORDER MOMENTS OF A QUANTIZED STATIONARY SIGNAL

Thierry PUN* (Member EURASIP) and Murray EDEN

Biomedical Engineering and Instrumentation Branch, Division of Research Services, National Institutes of Health, Bethesda, MD 20205, U.S.A.

Received 3 January 1985; revised 20 May 1985 and 19 August 1985

Abstract. It is routinely desired to estimate certain statistical properties from the quantized representation of a signal. This article considers the estimators for the local average and variance of the process, and establishes expressions for the mean, variance, and envelopes of the probability density functions of these two estimators. The relationships obtained closely match simulation results, even in abnormal circumstances such as an estimation window of only a few elements, or a quantization step that comes close to the signal range. The results give a quantitative justification to the known fact that increasing quantizer resolution brings relatively little improvement in the quality of moment estimates.

Keywords. Quantization, estimation, moments, average, variance, probability density function, distribution.

* Present affiliation: Division of Experimental Physics, CERN (European Laboratory for Particle Physics), CH-1211 Genève 23, Switzerland.

1. Statement of the problem

In many data acquisition or data processing systems, analog input signals are first converted to digital form by quantization [1, 2]. The difference between the output signal, which can only take a finite number of values, and the input is called quantization error. Its effect is known as quantization noise, and has been the source of several studies. Widrow [3, 4] has formalized the quantization process into a framework similar to the classical one of sampling theory. Conditions were derived to obtain a uniformly distributed quantization error. Requicha [5] has discussed the problem of evaluating expected values of functions of quantized variables under Widrow's framework. Sripad and Snyder [6] extended Widrow's work to give



sufficient conditions for the quantization error to be uniform and uncorrelated with the input. Instead of requiring an input signal whose characteristic function (CF) has finite support [3, 4] (which is impossible to have simultaneously with limited dynamic range), they showed that it is sufficient for the CF to have zeroes at well defined locations.

The reduction of quantization error by tailoring the structure of the quantizer to the input signal was first considered by Max [7], who derived implicit equations for the optimum quantization intervals. This approach is conceptually elegant, and several articles describe the properties of optimum quantizers [8-11]. However, such devices require knowledge of the statistics of the input signal, and are more complex to implement than the usual uniform quantizer with equispaced steps.

Correlated quantization noise has undesired and poorly modeled effects, such as correlated perturbations in PCM signal transmission. Rather than giving conditions to be satisfied by the input signal in order to obtain uncorrelated quantization noise, Castanié [12, 13] introduced the concept of Random Reference Quantizing (RRQ). In such a scheme, transition points are randomly distributed. True noncorrelation between input and quantization error can be achieved, while linearity with respect to the average values is respected. This approach allows statistical estimation to be performed from very coarsely, randomly quantized processes. However, RRQs are more complex than deterministic quantizers, and the latter are used in practice by almost all experimental scientists for analyzing their data.

It is often desired to estimate some statistical properties of an unknown signal from its quantized representation [14]. Typical examples are the determination of the averages and standard deviations of locally stationary processes. These signals may be, for example, one-dimensional time series or two-dimensional images. The problem treated here is the determination of the accuracy of the estimated mean and variance, as well as the envelopes of their approximate Probability

Density Functions (PDF's) in the case of a deterministic uniform quantizer. The importance of having PDF's is primarily that they allow statistical inferences to be made, such as the determination of confidence limits or the prediction of the number of samples that have to be measured.

Whatever the dimensionality of the process under scrutiny, the usual manner of computing the first-order statistics is to examine the signal values sequentially. It is, therefore, sufficient to consider the problem of analyzing a one-dimensional realization x(t) of a stochastic process X(t, ξ). The variable t represents the independent variable of the observation, and usually denotes time; ξ accounts for the fact that what is observed is one realization of a stochastic process. Sampling x(t) results in a series of random variables X_i,

X_i,  i = 1, …, n,  X_i ∈ ℝ.   (1)

The signal x(t) is assumed to be locally stationary, at least over the n samples (hypothesis H1). The series X_i is quantized uniformly, yielding the observed (measured) variables

Y_i,  i = 1, …, n,   (2)

where h is the quantizer step and n the sample size (number of values in the 'estimation window'). The relationship between the input X and the output Y is, for each i,

Y_i = k·h  if  ½(2k−1)h < X_i ≤ ½(2k+1)h,  k integer, h ∈ ℝ⁺.   (3)

For simplification, it is assumed that there is no truncation of X or Y. The quantization error between the actual value X and the observed value Y is defined by

D_i = X_i − Y_i.   (4)
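As a concrete illustration of (3) and (4), the following minimal sketch (Python with numpy assumed; function names are illustrative, not from the original) implements the uniform mid-tread quantizer and the resulting error sequence:

```python
import numpy as np

def quantize(x, h):
    """Uniform quantizer of step h, equation (3): Y = k*h for the integer k
    whose cell ((2k-1)h/2, (2k+1)h/2] contains x (exact ties have zero
    probability for a continuous input, so np.round's tie rule is immaterial)."""
    return h * np.round(np.asarray(x) / h)

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=8)   # X_i: stationary Gaussian samples
y = quantize(x, h=0.5)                       # observed variables Y_i
d = x - y                                    # quantization error D_i, eq. (4)
print(np.c_[x, y, d])
```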

It must be noted that Y_i is uniquely determined by X_i (the inverse is not true). D_i and X_i are therefore strictly connected, which makes the statistical analysis of the quantization error difficult. The (unknown) ensemble mean and variance of x(t) for the stationary sequence under consideration are respectively given by

μ_X = μ = E[X],
σ_X² = σ² = E[(X − μ)²],   (5)

while the unbiased sample mean and variance are respectively given by

m_X = X̄ = (1/n) Σ_{i=1}^{n} X_i,
s_X² = (1/(n−1)) Σ_{i=1}^{n} (X_i − m_X)².   (6)

Statistical independence between samples is assumed (hypothesis H2); this implies

E[m_X] = E[X] = μ,
var(m_X) = var(X)/n = σ²/n,   (7)
E[X_i^r X_j^s] = E[X_i^r] E[X_j^s] = E[X^r] E[X^s],  i ≠ j.

The estimators m_Y and s_Y² of μ and σ², respectively, are the sample statistics of the measured Y,

m_Y = (1/n) Σ_{i=1}^{n} Y_i,   (8a)

s_Y² = (1/(n−1)) Σ_{i=1}^{n} (Y_i − m_Y)².   (8b)

In the following, statistical properties of these parameters are derived. Since the output of the quantizer can take only discrete values, the PDF of the random variable Y is also discrete. The problem presented above is therefore the same as the problem of sampling a continuous probability density function into a discrete histogram of values. In the present case, it is shown in Section 4.2 that the continuous PDF to be sampled is the one of X − D. This stresses the importance of the assumptions that can be made on the PDF of D.

2. Quantization error

Determination of statistical properties of the quantization error is an extensively treated question, both in the case of deterministic [3-11] and random reference quantizers [12, 13]. The bottom line is that although strict noncorrelation between D and X can be achieved in these two cases, in practice it is often an approximation for the deterministic quantizers (which are considered here). In addition to the hypothesis of noncorrelation between input signal and quantization error, another assumption is usually employed regarding the statistical distribution of D. A necessary and sufficient condition for this distribution to be uniform is that the characteristic function (CF) of X becomes zero at multiples of 1/h, where h is the quantization step [6]. A uniformly distributed input signal X whose dynamic range is an integer multiple of h verifies this condition. However, this is not the case for a Gaussian process, whose CF also has a Gaussian shape. The requirement is approximately satisfied if h does not significantly exceed the standard deviation of the signal distribution. Under this assumption, the quantization error D is modeled as uniformly distributed white noise, uncorrelated with x(t) (hypothesis H3):

p_D(d) = 1/h,  −½h ≤ d ≤ ½h.   (9)
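Hypothesis H3 is easy to probe numerically. In the sketch below (Python with numpy assumed), a Gaussian input with σ = h gives an error D whose second moment is close to h²/12 and whose correlation with X is small:

```python
import numpy as np

rng = np.random.default_rng(1)
h, sigma = 1.0, 1.0                         # ratio sigma/h = 1
x = rng.normal(0.0, sigma, size=100_000)
d = x - h * np.round(x / h)                 # quantization error D = X - Y

print(np.var(d), h**2 / 12)                 # E[D^2] close to h^2/12, eq. (9)
print(np.corrcoef(x, d)[0, 1])              # close to 0: noncorrelation (H3)
```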

Another hypothesis (H4) that permits many further simplifications is that, in addition to noncorrelation (E[(X − m_X)(D − m_D)] = 0), some higher order moments also satisfy separability relationships,

E[X_i D_j D_k D_l] = E[X_i] · E[D_j D_k D_l],
E[X_i X_j D_k D_l] = E[X_i X_j] · E[D_k D_l],   (10)
E[X_i X_j X_k D_l] = E[X_i X_j X_k] · E[D_l],
i, j, k, l ∈ {1, …, n}.
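The separability relations (10) can be checked the same way; with the indices dropped, e.g. E[X²D²] should approximately factor into E[X²]E[D²] for ratios σ/h near 1 (a sketch under the same assumptions as above):

```python
import numpy as np

rng = np.random.default_rng(2)
h, sigma = 1.0, 1.0
x = rng.normal(0.0, sigma, size=200_000)
d = x - h * np.round(x / h)

# One instance of (10): E[X^2 D^2] versus the product E[X^2] E[D^2].
print(np.mean(x**2 * d**2), np.mean(x**2) * np.mean(d**2))
```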

Since statistical independence is assumed between samples (H2), the indices can be dropped, yielding moments of degrees 1 to 3. Direct and exact evaluation of the right-hand sides of equations (10) would be feasible, assuming certain hypotheses regarding the PDF of X. This would alleviate the need for hypothesis H4, at the expense of significantly more complex derivations. As will be confirmed by simulation, hypothesis H4 is satisfied under normal circumstances (i.e., a ratio of standard deviation to quantization step larger than approximately 0.5). The mean

m_D = D̄ = (1/n) Σ_{i=1}^{n} D_i   (11)

is consequently uncorrelated with m_X. The four first moments of D are given by

E[D] = 0,
E[D²] = var(D) = h²/12,   (12)
E[D³] = 0,
E[D⁴] = h⁴/80.

Due to the independence between different D_i's (hypothesis H2),

E[m_D] = 0,
E[m_D²] = var(m_D) = h²/(12n),   (13)
E[m_D³] = 0  (symmetry of p_D(d)).

Even for very small n, m_D is approximately normally distributed (the Central Limit Theorem applied to identical uniform distributions is very accurate); therefore,

E[m_D⁴] = 3 var²(m_D) = h⁴/(48n²).   (14)

The ratio of signal standard deviation to quantization step is denoted by

ρ = σ/h.   (15)

When this ratio is smaller than approximately 0.5, simulation can provide interesting insights into the phenomenon (see Section 5). Such a coarse quantization step, however, is uncommon in practice, and the results below can be used in most cases.

3. Mean and variance of Ȳ and s_Y²

3.1. Mean and variance of Ȳ

The sample mean of Y is given by

m_Y = Ȳ = m_X − m_D,   (16)

from which it immediately follows that

E[m_Y] = μ − E[D] = μ  if E[D] = 0.   (17)

As expected, m_Y is an unbiased estimator of μ if E[D] = 0. Due to the linearity of the mathematical expectation E[·], the first equality in equation (17) is always true, whatever the correlation between X and D. The variance of m_Y is given by

var(m_Y) = σ²/n + var(D)/n = σ²/n + h²/(12n).   (18)

This indicates that m_Y is a consistent estimator: it tends towards the actual value when sufficiently many observations are taken (bias as well as variance decrease with n). In addition, the variance of m_Y is always larger than the variance of m_X.

3.2. Mean and variance of s_Y²

From its definition (8b), the sample variance of Y is given by

s_Y² = (1/(n−1)) Σ (Y_i − m_Y)² = (1/(n−1)) (Σ Y_i² − n m_Y²)   (19)

and, therefore,

E[s_Y²] = (n/(n−1)) (E[Y²] − E[m_Y²]).   (20)

Using the fact that

E[Y²] = E[(X − D)²] = μ² + σ² + E[D²] − 2μE[D]  (by H3),   (21)

the mean of s_Y² is given by

E[s_Y²] = σ² + var(D) = σ² + h²/12.   (22)

This result is not surprising, since it can be rewritten as

E[s_Y²] = var(X) + var(D) = var(Y).   (23)

It confirms that s_Y² is an unbiased estimator of var(Y) but not of σ². It always overestimates the latter by a bias equal to the variance of D (the Sheppard correction [15]). The determination of the variance of s_Y² is much more tedious. General formulas have been derived in the case of Random Reference Quantizing [12, 13], and could be used here. In the present context, it appears more straightforward to start from the definition

var(s_Y²) = E[s_Y⁴] − E[s_Y²]².   (24)
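Before evaluating (24) analytically, the bias (22)-(23) and Sheppard's correction can be illustrated numerically (a sketch, numpy assumed); the derivation then resumes from (24):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, h, n, reps = 1.0, 1.0, 50, 20_000

x = rng.normal(0.0, sigma, size=(reps, n))
y = h * np.round(x / h)                      # quantized estimation windows
s2y = np.var(y, axis=1, ddof=1)              # unbiased sample variance of Y

print(s2y.mean(), sigma**2 + h**2 / 12)      # E[s_Y^2] = sigma^2 + h^2/12, eq. (22)
print((s2y - h**2 / 12).mean())              # Sheppard-corrected: close to sigma^2
```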

The second term on the right-hand side is known; the first one can be obtained knowing that

s_Y⁴ = (1/(n−1)²) [(Σ Y_i²)² + n² m_Y⁴ − 2n m_Y² Σ Y_i²]
    = (1/(n−1)²) [(Σ X_i² + Σ D_i² − 2 Σ X_i D_i)² + n²(m_X − m_D)⁴
      − 2n(m_X − m_D)² × (Σ X_i² + Σ D_i² − 2 Σ X_i D_i)].   (25)

By developing m_X and m_D as Σ X_i/n and Σ D_j/n respectively, it is possible to obtain E[s_Y⁴] as a function of E[X], E[X²], E[X³], E[X⁴], E[D], E[D²], E[D³], E[D⁴]. However, the derivation is lengthy even though straightforward, and can be simplified by making further assumptions regarding the nature of the process x(t). The basic idea is to gain simplification by expressing the moments of m_X and m_D as functions of the moments of X and D.

4. Input signal with Gaussian distribution

It is assumed from here on that X follows a Gaussian distribution. This hypothesis is realistic in many situations. It allows the use of well-known expressions for the various moments of X. Other functions could be assumed as well, as long as they decrease rapidly enough in their tails to maintain the validity of hypothesis H3 [6], and their first four moments exist. If these are hard to determine, the Central Limit Theorem might be employed with benefit. Therefore,

p_X(x) = (1/(σ√(2π))) exp{−(x − μ)²/(2σ²)}.   (26)

The expressions for the moments are readily obtained,

E[X] = μ,
E[X²] = μ² + σ²,   (27)
E[X³] = μ³ + 3μσ²,
E[X⁴] = μ⁴ + 6μ²σ² + 3σ⁴,

and

E[m_X] = μ,
E[m_X²] = μ² + σ²/n,
E[m_X³] = μ³ + 3μσ²/n,   (28)
E[m_X⁴] = μ⁴ + 6μ²σ²/n + 3σ⁴/n².

4.1. Variance of s_Y²

Introducing the moments of X, m_X, D, m_D obtained previously in the expression of s_Y⁴ (equation (25)) yields its mathematical expectation. The developments are given in Appendix A, and the resulting expression is

var(s_Y²) = E[s_Y⁴] − E[s_Y²]²
  = ((n+1)/(n−1)) σ⁴ + ((n+1)/(n−1)) (h²σ²/6)
    + (h⁴/16) (5n² − 6n + 7)/(45(n−1)²) − (σ² + h²/12)²
  = 2σ⁴/(n−1) + h²σ²/(3(n−1)) + (½h)⁴ (4n+2)/(45(n−1)²).   (29)

It is easy to demonstrate that var(s_X²) is 2σ⁴/(n−1) [16], which is what equation (29) gives for h equal to zero (no quantization). Only in this degenerate case will s_Y² be a consistent estimator of σ². Using the ratio ρ defined in equation (15), we have

var(s_Y²) = var(s_X²) (1 + 1/(6ρ²) + (2n+1)/(720(n−1)ρ⁴)).   (30)

It is often true that ρ ≫ 1; in that case,

var(s_Y²) ≈ var(s_X²) (1 + 1/(6ρ²))  if ρ ≫ 1.   (31)

With a quantization step h equal to σ, the variance of s_Y² is approximately 16% larger than the variance of s_X². This has been experimentally observed [17]. Even with a rather coarse quantization, the precision of the estimator s_Y² remains quite good. This is not necessarily true for its accuracy: the estimator is biased, and nonconsistent unless the step h degenerates to zero.

4.2. Probability density function of Y

As mentioned in Section 1, the probability density function of Y is obtained by sampling the pdf of the continuous random variable X − D [1, 2, 3]. Formally,

p_Y(y) = Σ_k p_{Y,k}(y) · δ(y − kh)   (32)

with

p_{Y,k}(y) = ∫_{y−h/2}^{y+h/2} p_X(x) dx.   (33)

It can be noted that each term p_{Y,k} corresponds to the convolution product of p_X(v) and p_D(y − v), corresponding to one difference X_i − D_i of equation (4). Denoting the error function by

erf(u) = (1/√(2π)) ∫_{−∞}^{u} exp{−½w²} dw,   (34)

equation (33) becomes

p_{Y,k}(y) = erf((y − μ + ½h)/σ) − erf((y − μ − ½h)/σ).   (35)

If h → 0, then p_Y(·) → p_X(·). If h → ∞, then p_{Y,k}(y) → 0 for k ≠ 0 and p_Y(y) → p_{Y,0}(y) = 1.

4.3. Approximate envelope for the PDF of Ȳ

From definition (8a), the probability density function of m_Y is

p_{m_Y}(ȳ) = p_{Σ Y_i/n}(ȳ).   (36)

Since Y_i and Y_j are statistically independent for i ≠ j (hypothesis H2), this PDF is the multiple convolution of the individual p_Y(y)'s. Each term of this convolution product is a discrete PDF (equation (32)). Therefore, the average Ȳ has a discrete PDF. The larger n becomes, the more different possible combinations of Y are measured and the more different values of Ȳ are obtained. Developing (36) is feasible, but it is of more practical use to give an expression for the envelope of p_Ȳ(ȳ). This result is sufficient to allow interesting predictions relative to the possible variations of Ȳ. The envelope of Y being Gaussian, it follows for that of Ȳ that

env{p_Ȳ(ȳ)} ~ (2πσ_Ȳ²)^(−1/2) exp{−(ȳ − μ)²/(2σ_Ȳ²)}   (37a)

with

σ_Ȳ² = (1/n)(σ² + h²/12).   (37b)

The sign ~ indicates the existence of a scaling factor. This factor can be chosen so that the sum of theoretical envelope values, taken where the observed distribution is nonzero, equals unity. This has been done to derive the images given in Figs. 1, 2, and 3.
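For illustration, the envelope (37a)-(37b) can be superposed on the simulated discrete PDF of m_Y, including the scaling described above (a sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, h, n, reps = 0.0, 1.0, 1.0, 5, 10_000

x = rng.normal(mu, sigma, size=(reps, n))
m_y = (h * np.round(x / h)).mean(axis=1)     # sample means of Y (discrete values)

vals, counts = np.unique(m_y, return_counts=True)
freq = counts / reps                         # observed discrete PDF of Ybar

var_env = (sigma**2 + h**2 / 12) / n         # eq. (37b)
env = np.exp(-(vals - mu)**2 / (2 * var_env))
env /= env.sum()                             # scaling factor of eq. (37a)
print(np.column_stack([vals, freq, env]))
```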

Fig. 1. Theoretical and observed probability density functions of X, Y, D, m_Y, s_Y² (top to bottom), for ρ = 0.5, 1, and 2 (h = 2, 1, ½, left to right). Parameters: σ = 1, n = 5, 10 000 replications.

Fig. 2. Theoretical and observed probability density functions of X, Y, D, m_Y, s_Y² (top to bottom), for ρ = 0.5, 1, and 2 (h = 2, 1, ½, left to right). Parameters: σ = 1, n = 10, 2000 replications.

Fig. 3. Theoretical and observed probability density functions of X, Y, D, m_Y, s_Y² (top to bottom), for ρ = 0.5, 1, and 2 (h = 2, 1, ½, left to right). Parameters: σ = 1, n = 50, 500 replications.

4.4. Approximate envelope for the PDF of s_Y²

Although the expression of the distribution of s_X² is well known [16], the problem is more complex for s_Y². As before, only an expression for its envelope is derived. The sample variance of Y (equation (8b)) can be rewritten as

s_Y² = (1/(n−1)) Σ_{i=1}^{n} ((X_i − m_X) − (D_i − m_D))²
    = (1/(n−1)) Σ (V_i − C_i)² = (1/(n−1)) Σ W_i²   (38)

with

V_i = X_i − m_X = ((n−1)/n) X_i − (1/n) Σ_{j≠i} X_j,
C_i = D_i − m_D = ((n−1)/n) D_i − (1/n) Σ_{j≠i} D_j,   (39)
W_i = V_i − C_i = Y_i − m_Y,  i = 1, …, n.

Due to the independence hypothesis H2, and since X is normally distributed, each V_i has a Gaussian

distribution

p_V(v) = (σ_V √(2π))⁻¹ exp{−v²/(2σ_V²)}   (40a)

with

σ_V² = ((n−1)/n) σ².   (40b)

The variable D approximately exhibits a uniform distribution (hypothesis H3), and Σ D_j/n rapidly converges towards a normal distribution as n increases. But, at the same time, the distribution of C_i tends to become more uniform, with zero mean and variance

var(C_i) = ((n−1)/n) (h²/12).   (41)

However, especially when h does not significantly exceed σ, W_i is approximately normally distributed,

p_W(w) ≈ (σ_W √(2π))⁻¹ exp{−w²/(2σ_W²)}   (42a)

with mean 0 and

σ_W² = ((n−1)/n) (σ² + h²/12).   (42b)

Different terms V_i are not independent, since E[V_i V_j] = −σ²/n for i ≠ j. This is also true for the D_i's. Consequently, W_i and W_j for i ≠ j have a correlation term proportional to 1/n. When n increases, they become more nearly uncorrelated, and it is possible to apply classical results from sampling statistics (see, for example, [18]) to derive the limiting distribution for s_Y². Being interested in the envelope of the PDF of s_Y², it is assumed that W_1, …, W_n are n independent normal random variables of mean 0 and variance σ_W². Since W̄ = Σ W_i/n is identically zero by construction, Σ (W_i − W̄)²/σ_W² has a χ²_{n−1} distribution. According to (38),

s_Y² = K Σ_i (W_i − W̄)²/σ_W²   (43a)

with

K = σ_W²/(n−1) = (σ² + h²/12)/n = var(m_Y),   (43b)

and, finally, the envelope is given as

env{p(s_Y²)} ~ p_{χ², n−1}(s_Y²/K).   (44)

As in the previous section, the sign ~ indicates that there is a scaling factor. The mean and variance of a χ²_{n−1} distribution are n − 1 and 2(n − 1) respectively. Hence,

mean(s_Y²) ≈ (n−1)K = ((n−1)/n)(σ² + h²/12) = E[s_Y²](1 − 1/n)   (45a)

and

var(s_Y²) ≈ 2(n−1)K² = (2(n−1)/n²)(σ² + h²/12)².   (45b)

These results closely match (22) and (29) obtained by direct derivation, especially for large n. In the latter case, p(s_Y²) comes close to a normal distribution. However, small window sizes may be mandatory for signals having only short-term stationarity, hence the interest in having expressions valid in these circumstances. Simulations presented in the next section confirm that validity.
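These approximations can be checked against a simulation and against the exact result (29) (a sketch, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, h, n, reps = 1.0, 1.0, 50, 100_000

x = rng.normal(0.0, sigma, size=(reps, n))
s2y = np.var(h * np.round(x / h), axis=1, ddof=1)

K = (sigma**2 + h**2 / 12) / n               # eq. (43b)
print(s2y.mean(), (n - 1) * K)               # chi-square mean, eq. (45a)
print(s2y.var(), 2 * (n - 1) * K**2)         # chi-square variance, eq. (45b)

var29 = (2 * sigma**4 / (n - 1)              # exact variance, eq. (29)
         + h**2 * sigma**2 / (3 * (n - 1))
         + (h / 2)**4 * (4 * n + 2) / (45 * (n - 1)**2))
print(var29)
```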

5. Simulation

Extensive simulations have been performed (VAX 11/780), using the intrinsic VAX-11 FORTRAN random generator to obtain uniform deviates, then exploiting Hastings' approximation to the error function erf(x) [19]. This method provides one normal deviate per uniform one. Figs. 1, 2, and 3 show the probability density functions of X, Y, D, m_Y, and s_Y² for various values of n (size of the estimation window) and h (quantizer step). The jagged curves and bar graphs are obtained through simulation, while the solid drawings show the theoretical results. Figs. 4 and 5 illustrate the differences between theoretical predictions and actual simulations of the variances of m_Y (equation (18)) and s_Y² (equation (30)).
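The generation scheme can be reproduced with any rational approximation to the inverse normal CDF; the sketch below uses the classical Hastings-type formula from Abramowitz and Stegun (eq. 26.2.22, absolute error below 3·10⁻³), which is assumed to be representative of, though not necessarily identical to, the approximation of [19]:

```python
import numpy as np

def normal_from_uniform(u):
    """One normal deviate per uniform deviate u in (0, 1), via a
    Hastings-type rational approximation to the inverse normal CDF."""
    u = np.asarray(u, dtype=float)
    p = np.where(u < 0.5, u, 1.0 - u)                  # fold onto one tail
    t = np.sqrt(-2.0 * np.log(p))
    z = t - (2.30753 + 0.27061 * t) / (1.0 + 0.99229 * t + 0.04481 * t**2)
    return np.where(u < 0.5, -z, z)

rng = np.random.default_rng(6)
x = normal_from_uniform(rng.uniform(size=100_000))
print(x.mean(), x.std())                               # approximately 0 and 1
```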

Fig. 4. Theoretical (solid) and observed (dotted) values for the standard deviation of Ȳ, as a function of ρ = σ/h and of n (curves for n = 2, 10, 50).

Fig. 5. Theoretical (solid) and observed (dotted) values for the ratio 'measured average of var(Y) to its measured standard deviation', as a function of ρ = σ/h and of n (curves for n = 2, 10, 50). The theoretical value of E[s_Y²] is also shown (dashed).

The principal observation to be made is that the domain of validity of the preceding derivations, hence of hypotheses H3 and H4, is fairly large. Only for very coarse quantization steps (larger than about two standard deviations) together with very small sample sizes (less than about five data) does the agreement between model and simulation degrade significantly. This mostly appears for the envelope of the PDF of var(Y), and is due to the following causes:
– As the step size increases, 1/h decreases. For a given distribution of X, its CF takes larger values at multiples of 1/h, and Sripad and Snyder's condition [6] becomes more poorly satisfied. Correlation between X and D increases, even though the latter remains quite uniform. One consequence is that the assumption (Section 4.4) of different W_i's being uncorrelated no longer holds; it is probable that hypothesis H4 becomes less valid, since X and D become correlated. This phenomenon, however, is difficult to investigate analytically.
– When the sample size n decreases, the assumption of a Gaussian distribution for W becomes more inaccurate (equations (42)). However, it is known that, in the case of a uniform distribution, the Central Limit Theorem converges very quickly, and this effect may not be very significant.
– There are outliers in the simulation, appearing as isolated peaks in the figures. These may be due to the joint effect of rounding errors together with the binning procedure. That is, discrete intervals have to be defined to record the PDF's; the choices of numbers of bins, their positions, and sizes necessarily imply roundoff errors. This introduces problems in the determination of the

scaling factor for the envelope, but above all gives more variance in the measurement (discrepancies between envelope and simulation).

The validity of the model has been checked by running the Kolmogorov–Smirnov test (see, for example, [18]) for comparing an observed distribution with a continuous expected one. The test statistic is the maximum deviation between the cumulative sample distribution (observed) and the parent distribution (expected):

D_KS{F(u)} = max_u |F_exp(u) − F_obs(u)|.   (46)

This measure quantifies how well the two functions match. The distribution of D_KS is known and tabulated. For example, the hypothesis that the functions are 'the same' can be accepted at a significance level of 5% if D_KS is smaller than 1.36/√N, where N is the number of measurements; note that this particular bound is valid only for large N (>100). Assuming that X and Y are 'perfect', since they come from a supposedly reliable random generator, D_KS{X} and D_KS{Y} are reference values that can be compared with D_KS{D}, D_KS{m_Y}, and D_KS{s_Y²}. The statistic is nonlinear; a factor of two between two D_KS's does not imply a factor of two in the significance level. The envelopes were scaled, as explained in Section 4.3, and are shown in Figs. 1, 2, and 3. Numerical results are not reported in their entirety, since their number is large. Also, they match well what is visually observed in the figures. The ranges of σ, h, and n used for analysis are those detailed in the figure captions. Values of D_KS{D} and D_KS{m_Y} are found to be very close to D_KS{X}, sometimes even smaller. D_KS{s_Y²} is larger than the reference values for small sample sizes (two to eight times for n = 5, one to three times for n = 10), then becomes similar when n increases. Therefore, the model seems adequate over a wide range of n. For very small n, even if (44) is quite approximative, the discrepancy is not too large (see Figs. 4 and 5).
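With present-day tools the same check is immediate; for instance, the KS distance between simulated values of m_Y and their Gaussian envelope (37a)-(37b) can be computed as follows (a sketch, scipy assumed; as in the text, a continuous parent distribution is compared with a discrete observed one):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sigma, h, n, reps = 1.0, 1.0, 10, 2000

x = rng.normal(0.0, sigma, size=(reps, n))
m_y = (h * np.round(x / h)).mean(axis=1)

scale = np.sqrt((sigma**2 + h**2 / 12) / n)    # eq. (37b)
d_ks, p_value = stats.kstest(m_y, 'norm', args=(0.0, scale))
print(d_ks, 1.36 / np.sqrt(reps), p_value)     # compare with the 5% bound
```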

These observations made from Figs. 1, 2, and 3 can be summarized. First, the probability density function of D remains quite uniform with increasing h. It has been observed that this is true until the quantization step becomes approximately 2 to 3 standard deviations. Second, the envelopes of the discrete distributions are relatively accurate over most of the range of ρ and n. Third, these experiments cannot directly validate the noncorrelation hypotheses H3 and H4. Only indirect inferences can be made, namely that noncorrelation is probably verified as long as the model accurately reflects the observations.

Fig. 4 presents the evolution of the standard deviation of Ȳ as a function of n and ρ (equation (18)). As expected, it decreases as the inverse square root of n. When ρ = σ/h becomes small (<0.5), equation (18) predicts an increase of this variance. Actually, this is only possible as long as Y can take more than one value with a significant probability, i.e., the quantization steps are not too far out on the tails of the distribution of X. One condition is that the half-dynamic range of X exceeds the quantization step. Otherwise, Ȳ becomes univalued, and its variance cancels. In the present case of Gaussian input, this means ρ is larger than about 0.5.

Fig. 5 illustrates the variation of the ratio 'measured average of s_Y² to its measured standard deviation'. This can be regarded as a signal-to-noise ratio of var(Y). It is approximately constant over a very wide range of ρ = σ/h. There is thus no interest in quantizing a signal finely when only first-order statistics have to be estimated. Even a ratio ρ equal to 1 yields an almost optimum result (for a given n). Interpreting Fig. 5 for small ρ is more complex; observations similar to those shown here have been made in real experiments [17]. As explained for Fig. 4, when Y almost always takes one single value, i.e., when other values have a very low probability of occurrence, the variance of Y becomes very small. Consequently, the expectation and variance of this measure become very small, too. For a small n, chances of observing values of X outside the interval [−½h, ½h], i.e., different Y's, are smaller than for large n. This is the probable cause of the decrease in snr for small n, and the increase for large n.


However, in the case of a Gaussian signal, the probability of obtaining more than one different value of Y is extremely remote for a step h exceeding 6 to 8 standard deviations. Consequently, the snr becomes rather poorly defined for such small values of ρ.
6. Conclusion

When first-order statistics (average, standard deviation) need to be estimated from a quantized signal, it is useful to have a quantitative knowledge of their statistical distributions. This can help in setting confidence limits on their values, or in determining what sample size (estimation window) has to be used. Theoretical expressions for the expectation and variance of these statistical moments are derived, as well as limiting envelopes of their probability distribution functions. These expressions are experimentally shown to be accurate descriptions in most circumstances. Only for very small ratios of signal standard deviation over quantization step, together with limited sample size, does the description become inaccurate. Even then, the results presented are believed to be of some use, principally due to their simple formulation. Recourse to simulation becomes necessary when the ratio is lower than approximately 0.5 (although some numerical problems may arise). This limit offers an approximation for the domain of validity of the model and the various hypotheses used in its formalization. In particular, it appears that within this domain, noncorrelation between input and quantization noise, and uniformity of that noise, reflect reality rather well. The case detailed is the one of a Gaussian signal, although the derivations could be extended without significant difficulties to any distribution that decreases relatively rapidly for arguments of increasing magnitude, and consequently possesses the required first moments. If mathematical expressions are hard to obtain for these, use of the normal approximation remains a valid alternative (Central Limit Theorem) and the derivation can be pursued.

Acknowledgment

The authors wish to thank James R. Ellis and Benes Trus for their help, and also wish to express their gratitude for the usage of the NIH/DCRT Image Processing Facility VAX 11/780.

Appendix A. Determination of E[s_Y⁴]

Rewriting equation (25) as follows:

E[(n−1)² s_Y⁴] = E[(Σ X_i² + Σ D_i² − 2 Σ X_i D_i)²]
  + n² E[(m_X − m_D)⁴]
  − 2n E[(m_X − m_D)² × (Σ X_i² + Σ D_i² − 2 Σ X_i D_i)]
  = E1 + E2 + E3.   (A.1)

Making use of hypotheses H3 and H4, each of the three terms E1, E2, and E3 can be expressed as a function of the moments of X and D. For the first term E1,

E1 = E[(Σ X_i²)² + (Σ D_i²)² + 4(Σ X_i D_i)² − 4 Σ X_i² Σ X_j D_j − 4 Σ D_i² Σ X_j D_j + 2 Σ X_i² Σ D_j²].   (A.2)

Developing each individual sum yields

E[(Σ X_i²)²] = nE[X⁴] + n(n−1)E²[X²],
E[(Σ D_i²)²] = nE[D⁴] + n(n−1)E²[D²],
E[(Σ X_i D_i)²] = nE[X²]E[D²],
E[(Σ X_i²) Σ X_j D_j] ≡ 0  (since E[D] ≡ 0),   (A.3)
E[(Σ D_i²) Σ X_j D_j] ≡ 0,
E[Σ X_i² Σ D_j²] = n²E[X²]E[D²],

and, finally,

E1 = σ⁴(n² + 2n) + n²μ⁴ + μ²σ²(2n² + 4n) + h⁴(n/80 + (n² − n)/144) + (1/12)h²(σ² + μ²)(4n + 2n²).   (A.4)

For the second term E2,

E2 = n²(E[m_X⁴] + 6E[m_X²]E[m_D²] + E[m_D⁴])
   = n²μ⁴ + 6nμ²σ² + 3σ⁴ + h⁴/48 + ½h²nμ² + ½h²σ².   (A.5)

For the third term E3,

E3 = −(2/n) E[(Σ X_i)² Σ X_j² + (Σ X_i)² Σ D_j² − 2(Σ X_i)² Σ X_j D_j
  + (Σ D_i)² Σ X_j² + (Σ D_i)² Σ D_j² − 2(Σ D_i)² Σ X_j D_j
  − 2(Σ X_i)(Σ D_j)(Σ X_k²) − 2 Σ X_i Σ D_j Σ D_k²
  + 4 Σ X_i Σ D_j Σ X_k D_k].   (A.6)

Developing each individual sum yields

E[(Σ X_i)² Σ X_j²] = nE[X⁴] + n(n−1)E²[X²] + 2n(n−1)E[X]E[X³] + n(n−1)(n−2)E²[X]E[X²],
E[(Σ X_i)² Σ D_j²] = nE[D²](nE[X²] + n(n−1)E²[X]),
E[(Σ X_i)² Σ X_j D_j] ≡ 0  (since E[D] ≡ 0),
E[(Σ D_i)² Σ X_j²] = n²E[X²]E[D²],
E[(Σ D_i)² Σ D_j²] = nE[D⁴] + n(n−1)E²[D²],
E[(Σ D_i)² Σ X_j D_j] ≡ 0,
E[Σ X_i Σ D_j Σ X_k²] ≡ 0,
E[Σ X_i Σ D_j Σ D_k²] ≡ 0,
E[Σ X_i Σ D_j Σ X_k D_k] = nE[X²]E[D²] + n(n−1)E²[X]E[D²],   (A.7)

and, finally,

E3 = −2E[X⁴] − 2(n−1)E²[X²] − 4(n−1)E[X]E[X³] − 2(n−1)(n−2)E[X²]E²[X]
  − (4n + 8)E[X²]E[D²] − (n−1)(2n + 8)E²[X]E[D²]
  − 2E[D⁴] − 2(n−1)E²[D²].   (A.8)

The final expression is obtained by replacing each of the three terms by its development,

E[(n−1)² s_Y⁴] = (n² − 1)σ⁴ + (1/6)h²σ²(n² − 1) + (1/720)h⁴(5n² − 6n + 7).   (A.9)

References

T. Pun, M. Eden / Moments from a quantized signal [12] F. Castanir, "Stochastic computing: A bridge between signal processing and digital computing", Proc. EUSIPCO-80, North-Holland, Amsterdam, 1980, pp. 467482. [13] F. Castanir, "Linear mean transfer random quantization', Signal Processing, Vol. 7, No. 2, Oct. 1984, pp. 99-117. [ 14] M. Kunt, Traitement Numdrique des Signaux, Volume XX du Trait6 d'Electricitr, Editions Georgi, Lausanne, 1980, Chap. 6. [15] W.F. Sheppard, "On the calculation of the most probable values of frequency constants for data arranged "accordingly to equidistant divisions of scale', Proc. London Math. Soc., Vol. 29, 1898, p. 353.

127

[16] M.H. Quenouille, The Fundamentals of Statistical Reasoning, 2nd ed., Griffin's Statistical Monographs and Courses, M.G. Kendall, ed., Hafner, New York, Chap. 2, 1965. [17] W. H. Schuette, E. Carducci, G. E. Marti, S. E. Shack'ney and M. Eden, "The relationship between mean channel selection and the calculated coefficient of variation", submitted for publication, Oct. 1984. [18] D. Ransom Whitney, Elements of Mathematical Statistics, Holt, Rinehart & Winston, New York, Chap. 6, 1961. [19] C. Hastings, Jr., Approximations for Digital Computers, Princeton University Press, Princeton, N J, 1955, p. 192.

Vol. 10, NO. 2, March 1986