NUCLEAR INSTRUMENTS AND METHODS
164 (1979) 5 6 1 - 5 6 3 ,
t~
NORTH-HOLLAND
PUBLISHING
CO
A FAST APPROXIMATION TO THE COMPLEMENTARY ERROR FUNCTION FOR USE IN FITTING GAMMA-RAY PEAKS GARY W PHILLIPS
Radtatton Technology Dtvtston, Naval Research Laboratory, Washmgton, D.C 20375, U S A. Received 12 February 1979 A fast approximation to the complementary error function has been programmed and tested for use in the peak-shape function for fitting peaks in gamma-ray spectra The funcuon was compared for speed and accuracy on the NRL ASC 7 computer to the mathematical hbrary version of the complementary error function The approximation has resulted in a 50% time savings in the computer program HYPERMET which was developed at NRL for automatic analysis of gamma-ray spectra from germanium detectors.
1. Introduction The error function and its complement are widely used in fields such as computational physics, numerical methods and statistical analysis. A number of authors have found these functions useful in order to accurately fit the shape of gamma-ray peaks from germanium detectors l-a). All of these applications make heavy use of computer calculations. Unfortunately these functions tend to be relatively slow to calculate precisely on a digital computer. In program HYPERMET3), which was developed at the Naval Research Laboratory (NRL) for automatic peak analysis of gamma-ray spectra, the complementary error function appears in several te,rms of the peak-shape function which is used in an iterative least-squares fit to the peaks. Timing studies showed that the program was spending up to 80% of its central processing unit (cpu) time in the error-function routine. Since high precision was not required in this application, a search was made for a fast approximation to this function. 2. Definitions The error function erf (x) derives its name as the integral of the normal curve of error,
fo
eft(x) = (2/X/~)
e-'~dt.
It has a range of - 1 to +1 f o r - ~ _ < x _ < ~ . complementary error function is defined by erfc(x) = (2/X/~)
including that used in HYPERMET, require precise derivatives. For the complementary error function, d (erfc(x)) = -(2/x/~r)e -~2. dx In the gamma-ray peak-shape function used in HYPERMET3), the main term is a gaussian of width 6 and amplitude F
fl(x)
= Fexp(-x2/O2).
(For simplicity in the expressions below we will assume a gaussian width of 6 = 1.) Added to the main gaussian term are one or two exponentially decaying " s k e w " or "tail" terms on the low energy side of the peak and a constant " s t e p " term below the peak. Each of these, when folded with a gaussian representing random noise from the detector and associated electronics, is cut off smoothly beneath the peak by a factor with the functional form cerf(x- Xo) = ½erfc( x - Xo). This function rapidly approaches zero for x greater than the offset x0, which depends on the exponential slope and the gaussian width3). The derivative can be written d [cerf(x- Xo)] d e r f ( x - Xo) = ~x
The
e-t2dt
= 1 - eft(x). Most non-linear least-squares optimization routines,
= -(1/x/n) exp [ - ( X - X o ) 2 ] . Because the offset x0 differs for each term in the peak-shape function, calculation of the function cerf and its derivative deft is required up to three times at each data channel (corresponding to x) of the region being fit and for each iteration of the fit.
562 3. F a s t
G w PHILLIPS approximation
Since the amplitudes of the terms in the peakshape function in which the complementary error function appears are small relative to the main gaussian term, since the cutoff term cerf is very different from either 1 or 0 only for small absolute x (where A" is large), and since the reasons for inclusion of this term in the peak-shape function are largely empirical, a simpler approximation to ceft should perform just as well in the fit. (However, whichever approximation is used, its value and the value of its derivative must be calculated precisely at each channel and for each iteration in order to avoid cumulative errors which can effect the convergence of the fitting process.) Several approximations to the error function are given in ref. 5. The following approximation to the cutoff function cerf was chosen as sufficiently accurate and convenient for this purpose: cuff(x) = (b 1t +
b 2 t 2 -I- bat a)
exp ( - x2),
expressions involving arrays in the source code. Unfortunately, the complementary error function is available only as a scalar routine. For this test two routines 6) were written in FORTRAN to calculate cutf and duff for x varying between - 5 . 0 and +5.0, which covers a range typical for program HYPERMET. One routine was written to store intermediate results in order to compile efficiently in vector code. The second routine was written to compile efficiently in scalar code. A third program was written to calculate cerf and deft over the same range. In figs. 1 and 2 are plotted the results of the calculation for cutf and dutf and the errors epsc(x) = cerf(x) - cutf(x), epsd(x) = derf(x) - dutf(x), between x = 0 . 0 and 5.0. For negative x, cutf (x) = 1 . 0 - c u t f ( - x ) . The other curves are symmetric about x = 0 . Over the range of 0 < x < 1.65, the error curve epsc oscillates between _ 1 . 1 0 × 1 0 -5
for x>_0 and t = l / ( l + p x ) . The constant p = 0 . 4 7 0 4 7 and the polynomial coefficients are bl = 0.174 012 1, b 2 : 0.047 939 9, and b3 = 0.373 927 8. For x less than 0,
0
'
-1 -
~
I +
)
dutf(x) = d-~-(cutf(x))
'
I
r
]
h
I
I
I
J
"2
-3
The exact derivative is given by . +| (--)
I '
cuff(x) = 1 - cuff ( - x ) .
-4
I
. (~
EPSC
N
(--I
-S 0
= - ( 2 x ) cuff(x) - pt2 (bl + 2tb2 + 3 t2 ba) e x p ( - x 2 ) .
o
-s
>
-7
e o
results The cutoff function cuff and its derivative dutf were tested on the Texas Instruments ASC 7 computer system at NRL for speed and accuracy against the functions cerf and deft. The ASC 7 is a two-pipe-line vector-oriented machine which in an overlapped fashion can retrieve arrays of data, perform simple algebraic manipulations on the arrays, and store the results. For arrays of length n > 10, this is usually much more efficient than the corresponding scalar operations on the data element by element. Many of the standard mathematical functions have both vector and scalar routines available for their calculation, and the FORTRAN compiler is written so as to automatically invoke the vector routines when applicable for processing 4. Tests
and
-10 -11 -12 -13
-14
~ 0
I
I
I
1
2
3
4
X
Fig. l. Plot o f the fast a p p r o x i m a t i o n C U T F to the c o m p l e m e n tary error f u n c t i o n , and the error E P S C , as d e f i n e d m the text Plotted is the l ° l o g o f the a b s o l u t e v a l u e o f the f u n c t i o n s T h e s~gn ~s i n d i c a t e d in p a r e n t h e s e s a b o v e the c u r v e s T h e c u s p s in the c u r v e for E P S C result f r o m s i g n c h a n g e s as the funcUon p a s s e d t h r o u g h zero.
THE C O M P L E M E N T A R Y ERROR F U N C T I O N I
t
I
~ - )
'
I
J
I
cerf and deft. All codes were compiled and executed in double precision on the ASC 7 at NRL. Results are given for three different optional levels of compiler optimizationT), 1) I-level, scalar object code with direct translation of source code, no optimization. 2) J-level, scalar object code with local and global optimization. 3) K-level, vector object code with local and global optimization. The table shows significant savings in execution time at all compiler levels, with maximum savings at the K compiler level for the vector-coded version of the fast approximation, which executes in 2396 of the time required for the full calculation. At the I and J levels, maximum savings are found for the scalar-code version, which executes in about 60% of the time required for the full calculation. Further details of the tests and results are given in ref. 6.
I
DUTF
(-) (+)
EPSD
~-s -s >
563
-7
-a 0 -9
-10 -11 -12
5. Conclusion
-13 -14
i
I
i
,
2
i 3
,
i
,
4
I 5
X
Fig. 2. Plot of the derivative DUTF of the fast approximation to the complementary error function CUTF, and the error EPSD as defined in the text. The sign is indicated in parentheses above the curves. The cusps in the curve for EPSD result from sign changes as the function passes through zero.
The sign of the error is indicated in parentheses and changes at each cusp in the curve as the error passes through zero. For x > 1.65, epsc declines gradually with cutf. At x = 5, cuff= 7.99× 10 -.2 and lepscl -- 3.08 × 1 0 -~4 or about 496. Similar behavior is observed for the error epsd in the derivative duff. Table 1 gives the cpu execution times for 100 000 fast-approximation calculations of cutf and dutf in both the scalar- and vector-coded versions compared to the times for the full calculation of "]'ABLE 1 Complementary error function and derivative, time (s) for 100 000 calculations.
Compiler level I J K
Fast approximation Vectorscalercoded coded 7.23 5 50 1.35
5.63 5.04 2.86
Full calculation 8.82 8.49 5.95
In program HYPERMET, subroutine FUNC which calculates the peak-shape function has been rewritten in order to produce efficient vector code at the K-level of compiler optimization. This change resulted in a reduction in execution time to about 4096 of that required for the pr~yious version. The vector-coded fast approximation for the functions cuff and duff was then added in-line at each of three locations in the code where the functions cerf and deft had been used previously. With this change another factor of two in execution speed was achieved. As tile final result of these changes, the program now executes on the ASC 7 in about one-fifth the time required by the previous version to analyze a typical gamma-ray spectrum.
References 1) L. Vamell and J. Trischuk, Nucl. Instr. and Meth 76 (1969) 109. 2) M Dojo, Nucl Instr. and Meth. 115 (1974) 425. 3) G. W. Phillips and K. W. Marlow, Nucl. Instr. and Meth. 137 (1976) 525; NRL Memorandum Report 3198 (January 1976). 4) H.H. Jorch and J.L. Campbell, Nucl. Instr. and Meth. 143 (1977) 551. 5) Handbook of mathematwalfuncttons (Dover Pubhcat,ons, New York, 1965). 6) For source listings, see G.W. Philhps, NRL Memorandum Report 3963 (March 1979). 7) ASC FORTRAN reference manual, Manual no 93004~-3, Texas Instruments, Austin, Texas Oanuary 1978).