Reducing computational roundoff errors efficiently

Comput. & Ops. Res., Vol. 1, pp. 135-136. Pergamon Press, 1974. Printed in Great Britain

NOTES, IDEAS & TECHNIQUES

REDUCING COMPUTATIONAL ROUNDOFF ERRORS EFFICIENTLY

EDWARD L. MELNICK*

New York University, New York, N.Y. 10003, U.S.A.

Presented here is a new computational procedure for determining the statistical properties of data.

INTRODUCTION

The essence of operations research lies in the construction of a mathematical model which adequately describes the behavior of a phenomenon being studied. Often preceding this construction is an examination of summary statistics computed from observable data generated by the unknown process. Some of these statistics are the mean, variance, skewness and kurtosis coefficients. The computational form of the latter three statistics is based on

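$$m_k = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^k, \qquad k = 2, 3, 4, \tag{1}$$

where $X_i$, $i = 1, \dots, n$, are the observed data and $\bar{X}$ is the sample mean defined as $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. The sample variance is $m_2$, the skewness coefficient is $m_3/m_2^{3/2}$ and the kurtosis coefficient is $m_4/m_2^2$. Concentrating, for the moment, on the variance $m_2$, we recall the equivalent mathematical relationship

$$m_2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \bar{X}^2. \tag{2}$$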
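Equations (1) and (2) agree in exact arithmetic but need not agree in floating point. The following is a minimal Python sketch, not taken from the paper, that evaluates both forms of $m_2$ on the same data. The function names and the sample values are illustrative assumptions; the data were chosen to have a large mean and a small spread, the regime in which equation (2) subtracts two nearly equal large numbers.

```python
def variance_one_pass_raw(data):
    """Equation (2): mean of squares minus squared mean, one pass."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in data:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    return total_sq / n - mean * mean


def variance_two_pass(data):
    """Equation (1) with k = 2: deviations about the mean, two passes."""
    n = 0
    total = 0.0
    for x in data:          # first pass: the mean
        n += 1
        total += x
    mean = total / n
    dev_sq = 0.0
    for x in data:          # second pass: squared deviations
        dev_sq += (x - mean) ** 2
    return dev_sq / n


# Hypothetical data: mean 1e9 + 10, deviations -6, -3, 3, 6, so m2 = 22.5.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("equation (2), one pass :", variance_one_pass_raw(data))
print("equation (1), two pass :", variance_two_pass(data))
```

In double precision the two-pass result is exact for this data, while the one-pass raw form loses the answer entirely because the squares near $10^{18}$ cannot carry the low-order digits.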

In the olden days of the desk calculator, provisions were made for calculations with 20 significant digits, so equation (2) was almost always used to calculate $m_2$; this formula requires only one pass through the data file. With the arrival of electronic computers and their limited word size, equation (1) became the preferred calculating formula, since equation (2) is affected by roundoff errors, especially when $n$ is large and/or the absolute values of the $X_i$ are large (large being determined by the computer word size). This is especially true for computer programs that invert variance-covariance matrices (such as regression programs), since roundoff errors greatly influence the results of the computations when the matrices are ill conditioned. Studies have demonstrated that accuracy is enhanced by working with deviations about the mean rather than the raw data. Although greater accuracy is obtained, this procedure is unnecessarily slow and expensive since it requires two passes through the data file, once to compute $\bar{X}$ and then to compute $m_2$.

* Edward L. Melnick is Associate Professor of Statistics at New York University. He has been a mathematical statistician at the U.S. Census Bureau and has published in the Journal of Applied Probability, the IEEE Transactions on Information Theory, and Decision Sciences. He holds the B.A. degree in industrial psychology from Lehigh University, the M.S. in mathematical statistics from Virginia Polytechnic Institute and the Ph.D. in mathematical statistics from George Washington University.

A NEW COMPUTATIONAL PROCEDURE

If $x_c$, a good approximation of $\bar{X}$, is known, then

$$m_2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - x_c)^2 - (x_c - \bar{X})^2 \tag{3}$$

requires only one pass through the data, and the roundoff error will be of the same order of magnitude as that obtained from equation (1). However, since a good $x_c$ is rarely known, it can be obtained by the following iterative scheme. Define

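$$S_r^k = \sum_{i=1}^{r} (X_i - \bar{X}_r)^k \tag{4}$$

where $\bar{X}_r = \frac{1}{r} \sum_{i=1}^{r} X_i$. Then,

$$S_r^k = \sum_{i=1}^{r} \left( [X_i - \bar{X}_{r-1}] - [\bar{X}_r - \bar{X}_{r-1}] \right)^k \tag{5}$$

and using the binomial expansion

$$S_r^k = \sum_{i=1}^{r} (X_i - \bar{X}_{r-1})^k + \sum_{i=1}^{r} \sum_{j=1}^{k} (-1)^j \binom{k}{j} (X_i - \bar{X}_{r-1})^{k-j} (\bar{X}_r - \bar{X}_{r-1})^j. \tag{6}$$

Changing the order of summation in the last expression and substituting, where appropriate, the notation $S_{r-1}^k$,

$$S_r^k = S_{r-1}^k + (X_r - \bar{X}_{r-1})^k + \sum_{j=1}^{k} (-1)^j \binom{k}{j} \left[ S_{r-1}^{k-j} + (X_r - \bar{X}_{r-1})^{k-j} \right] (\bar{X}_r - \bar{X}_{r-1})^j. \tag{7}$$

Recognizing the identity $(\bar{X}_r - \bar{X}_{r-1}) = (1/r)(X_r - \bar{X}_{r-1})$, equation (7) can be expressed as

$$S_r^k = S_{r-1}^k + (X_r - \bar{X}_{r-1})^k + \sum_{j=1}^{k} \left( -\frac{1}{r} \right)^j \binom{k}{j} \left[ (X_r - \bar{X}_{r-1})^k + (X_r - \bar{X}_{r-1})^j S_{r-1}^{k-j} \right]. \tag{8}$$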
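The update in equation (8) can be sketched in Python as follows. This is an illustrative reading of the recurrence, not the author's program: the function name `running_central_sums`, the list `S` indexed by the power $k$, and the sample data are assumptions. The sketch relies on $S_{r-1}^0 = r - 1$ and $S_{r-1}^1 = 0$, which follow from definition (4), and it starts the recursion at the first observation so that the first deviation is not taken about an arbitrary origin.

```python
from math import comb


def running_central_sums(data, k_max=4):
    """One pass over data; returns (n, mean, S) with
    S[k] = sum_i (x_i - mean)**k for k = 0..k_max, updated via equation (8)."""
    n = 0
    mean = 0.0
    S = [0.0] * (k_max + 1)
    for x in data:
        if n == 0:
            # First observation: mean_1 = x_1, S_1^0 = 1, all other sums are 0.
            n, mean = 1, float(x)
            S[0] = 1.0
            continue
        r = n + 1
        d = x - mean                      # X_r - mean_{r-1}
        new_S = S[:]                      # keep S_{r-1} intact while updating
        for k in range(2, k_max + 1):
            correction = 0.0
            for j in range(1, k + 1):
                correction += ((-1.0 / r) ** j * comb(k, j)
                               * (d ** k + d ** j * S[k - j]))
            new_S[k] = S[k] + d ** k + correction
        new_S[0] = float(r)               # S_r^0 = r; S_r^1 stays 0
        S = new_S
        mean += d / r                     # mean_r = mean_{r-1} + d / r
        n = r
    return n, mean, S


# Hypothetical data with mean 5 and sums S^2 = 32, S^3 = 42, S^4 = 356.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n, mean, S = running_central_sums(data)
m2, m3, m4 = S[2] / n, S[3] / n, S[4] / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2
print(n, mean, m2, skewness, kurtosis)
```

Only the running mean and the sums of lower order are carried from one observation to the next, so the data file is read once.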

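Thus, $m_k = \frac{1}{n} S_n^k$, and this is obtained by only saving $\bar{X}_{r-1}$ and $S_{r-1}^j$, where $j = 1, \dots, k$. In the special case where $k = 2$, the variance is $m_2 = \frac{1}{n} S_n^2$ and, since $S_{r-1}^0 = r - 1$ and $S_{r-1}^1 = 0$, $S_r^2$ in equation (5) reduces to the form

$$S_r^2 = S_{r-1}^2 + \frac{r-1}{r} (X_r - \bar{X}_{r-1})^2.$$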
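For the variance alone, the one-pass update and the shifted form of equation (3) can be compared in a short Python sketch. The function names, the choice of the first observation as $x_c$, and the sample data are assumptions made for illustration; the paper does not prescribe them.

```python
def variance_running(data):
    """One pass: S_r^2 = S_{r-1}^2 + (r - 1)/r * (X_r - mean_{r-1})**2."""
    n = 0
    mean = 0.0
    s2 = 0.0
    for x in data:
        n += 1
        d = x - mean                 # deviation from the previous running mean
        s2 += (n - 1) / n * d * d    # contributes 0 for the first observation
        mean += d / n
    return s2 / n


def variance_shifted(data, x_c):
    """One pass, equation (3): deviations about a fixed approximation x_c."""
    n = 0
    dev_sum = 0.0
    dev_sq_sum = 0.0
    for x in data:
        d = x - x_c
        n += 1
        dev_sum += d
        dev_sq_sum += d * d
    shift = dev_sum / n              # equals mean - x_c
    return dev_sq_sum / n - shift * shift


# Hypothetical data: mean 1e9 + 10, deviations -6, -3, 3, 6, so m2 = 22.5.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(variance_running(data))
print(variance_shifted(data, data[0]))
```

Both functions read the data once. The running form needs no prior guess, whereas the shifted form is only as good as the supplied `x_c`.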

(Paper received 7 March 1973)