NUCLEAR
INSTRUMENTS
A COMPUTER
AND
METHODS
PROGRAM
81
(1970)
217-219;
FOR THE AUTOMATIC S. J.
IV(G) computer
NORTH-HOLLAND
PUBLIS~~ING
CO.
ANALYSIS OF y-RAY SPECTRA
MILLS
Received 25 November A FORTRAN
0
1969
program is reported which automatically locates alI the significant peaks in a read-in y-ray spectrum and determines their centres and areas.
This paper outlines a computer program which has been developed as an aid in the analysis of complex y-ray spectra obtained with Ge(Li) detectors. The object of the program, named AUTSPAN, is to locate automatically all the significant peaks in a read-in spectrum and determine their centres and areas. As in other programs of this typele4), the procedure followed can be divided into three main steps:
The automatic identi~~ation of peaks in a y-ray spectrum is usually based on the behaviour of either the first differences’ *3, or the second difference?) between the counts in successive channels of the spectrum. In order to increase the peak-detection sensitivity it is customary to carry out some form of data smoothing on these quantities before they are scanned for indications of y-ray peaks. In the present case, however, the author has preferred to smooth the spectrum itself. Smoothed values of both the first and second differences can then easily be obtained afterwards by simple subtraction, while the smoothing of the spectrum results in a much more accurate initial estimate of the parameters used in the peak-fitting procedure followed to determine the peak parameters. The smoothing procedure basically consists of obtaining a least-squares fit of a polynomial to an uneven number, M, of adjacent data points, and taking the value of the fitted polynomial at the central point as the smoothed value of that point’). Following the general trend ‘*3*6), provision has only been made for using quadratic and cubic functions for this purpose. However, other power functions can easily be utilized, if required, by simply making the appropriate changes to the values of a number of constants defined in the program. In fact, even the Mariscotti method of data smoothingz), which employs a moving average over a
number of adjacent points, can be used in this way. However,this type of smoothing function is known ‘) to flatten and broaden the peaks in the data to be smoothed, and the author has found that it leads to quite unsatisfactory results in cases where y-ray peaks are situated close to one another. The value of n that must be used to obtain the best smoothing depends on the average fwhm (in channels) of the peaks in the spectrum to be smooth~d1-3,6). It is therefore preferable not to write the program with a fixed value of n, as was done, for instance, by Barnes3). In the present case it was decided that it would probably be best to make a trial-and-error determination of this number for each type of spectrum, and it has therefore simply been programmed as a read-in constant which can assume any uneven value from 5 to 15. For the same reason provision has also been made for reading in the number of smoothing operations to be performed - it was found that in many cases, especially in cases where the peaks have a fwhm of a considerable number of chaunels, one smoothing operation is not sufficient for obtaining the desired degree of smoothing. In these cases it is advantageous to perform a number of consecutive smoothing operations before the peak-finding routine is entered.
Peak location Of several peak-locating methods that were investigated - existing ones” 3, as well as new ones - the following technique has finally been adopted: It is assumed that each peak in the spectrum can be represented by a Gaussian function superimposed on a linear background. The problem of peak location is thereby reduced to a search for regions of significant negative second differences in the smoothed spectrum”).
217
218
s.J. MILLS
Thus, if such a region is found, and it has a width greater than, or equal to, one-half the read-in average fwhm of the peaks in the spectrum, its centre of gravity is taken as the approximate position of a possible peak. However, in order to ensure that only statistically valid variations are processed, a rough estimate of the height of the peak above a linear background is made in each case, and all the peaks with a height less than a read-in factor times the square-root of the background at that point, or less than 10 counts, are rejected. No attempt has been made to impose any conditions based on distinctive peak characteristics which must be met before a peak is recognized as such. Pseudopeaks, for instance "peaks" due to Compton edge effects, will therefore also be processed. However, the author has preferred to leave it to the user himself to overlook these peaks, rather than run the risk of also rejecting true peaks because of too stringent tests in the program - as was found to happen in the Mariscotti program 2) when peaks were situated close to one another. Furthermore, it is advantageous to retain such peaks if a fit to them can possibly be obtained, as this will lead to a more accurate fit of overlapping true peaks that might be present.
Determination of the peak parameters In this step a Gaussian function, superimposed on a quadratic background, is fitted to each region of the raw spectrum where a significant peak has been found (if two or more peaks, up to a maximum of five, mutally interfere, they are fitted simultaneously). The parameters of the fitted Gaussian are then taken as the true peak parameters within the standard-errors-of-fit. If no fit can be obtained to a particular peak, it usually means that the "peak" is far from Gaussian in shape, and is therefore probably a spurious peak. The fitting region for each peak is determined by inspection of the smoothed first differences. The boundaries are fixed at those channels on either side of the peak where a statistically significant change in the sign of the first difference is observed1), i.e. in the valleys above and below the peak. However, as this sometimes results in a boundary being too far from the peak, they are limited to a maximum distance of twice the fwhm from the peak centre. Furthermore, if no valley is found between neighbouring peaks, or if the separation of the peaks is less than 2.5 times the fwhm, the
fitting region is extended to include both peaks. This procedure should be a considerable improvement on those 2'3) in which the boundaries are chosen at a fixed factor times the fwhm above and below the peak centre, and in which it can consequently easily happen that some of the boundaries fall on a steeply sloping part of the spectrum, with the result that a very poor initial estimate of the background under the specific peak is obtained. The fitting procedure is basically the same as that which was used, for instance, by Mariscotti 2) - the initial estimates of the fitting parameters are obtained in the same way, and the search for a best fit is performed by means of a similar non-linear least-squares method. However, several important improvements, adopted from a program written by Burdzik 7) for the analysis of charged-particle spectra, have been made to the search routine to make it more efficient: Firstly, the Oak Ridge and Oxford method of descent 8, 9) is used. This utilizes a parabolic approximation to speed up the search, instead of always taking the full parameter-step. The step-length multiplication factor z, however, is computed in a slightly different way: 1 z ---- C O N V / ( P m a x
if if
(Pmax ~---C O N V , ~0max > C O N V ,
where CONV is a read-in speed-of-convergence factor. In addition, z is cut by a factor of five whenever it is found that Z2 at the minimum point of the fitted parabola is larger than at the central point under consideration9). Secondly, a different test for termination of the search routine is employed. According to this a fit is achieved either when the calculated step-length of each parameter is found to be less than a read-in factor times the estimated standard-error-of-fit of that parameter, or when z has to be cut three times in the same fitting operation9). If, however, the number of iterations exceeds an input number before either of these is achieved, the solution is assumed to be divergent. Finally, it sometimes happens during the search that the height of a peak, probably a spurious peak, becomes negative. If this peak is fitted together with one, or more true peaks this unphysical behaviour may prevent a fit to these peaks being obtained. Therefore, in these cases the program sets the height of this peak equal to zero and excludes all its parameters
AUTOMATIC ANALYSIS OF ])-RAY SPECTRA
from the search procedure for the iteration in progressg). However, in each new iteration an attempt is made to bring this peak back into the search routine. The program has been written in F O R T R A N IV(G) for use on an IBM System 360/65 computer. A report containing the complete program listing of A U T S P A N , detailed instructions for its use and a sample input/ output is available on request. The author would like to thank Mr. G. F. Burdzik for permission to use the basic search routine of the program P E A K F I T , and for his help in incorporating it in A U T S P A N .
219
References 1) 2) 3) 4) 3) 6) ~) s)
9)
H. P. Yule, Anal. Chem. 38 (1966) 103. M. A. Mariscotti, Nucl. Instr. and Meth. 50 (1967) 309. V. Barnes, IEEE Trans. Nucl. Sci. NS-15, no. 3 (1968) 437. T. Inouye, T. Harper and N. C. Rasmussen, Nucl. Instr. and Meth. 67 (1969) 125. A. Savitzky and M. J. E. Golay, Anal. Chem. 36 (1964) 1627. H. P. Yule, Nucl. Instr. and Meth. 54 (1967) 61. G. F. Burdzik, to be published. M. A. Melkanoff, T. Sawada and J. Raynal, in Methods in computational physics (eds. B. Adler, S. Fernbach and M. Rotenberg; Academic Press, New York and London, 1966) p.l. M. A. Melkanoff, J. Raynal and T. Sawada, SEEK, a FORTRAN program for automatic searches in elastic scattering analyses with the Nuclear Optical Model, University of California Report no. 66-10 (Los Angeles, Jan. 1966).