Nuclear Instruments and Methods in Physics Research A245 (1986) 525-529 North-Holland, Amsterdam
AN APPROXIMATE F.T. AVIGNONE
ANALYTICAL
MAXIMUM
525
LIKELIHOOD
TECHNIQUE
II1
Department of Ptlvsics. Universi(v of South Carolina, Columbia, South Carolina 29208, USA Received 30 October 1985
An approximate analytical technique for calculating the likelihood function associated with the number of counts under a Gaussian peak, superimposed on a continuum, is demonstrated. The replacement of the Poisson statistical distribution by a normal distribution, coupled with an alternative definition of the a posteriori probabilities of the likelihood function, results in a simplification allowing the derivation of formulae in closed form. Results using this technique compare well to those from numerical computations using Poisson statistics and more conventional forms of likelihood functions. Sample applications are given.
1. Introduction
2. Analysis
In m a n y applications it is necessary to determine the n u m b e r of events contained in a spectral peak of k n o w n shape, superimposed on a significant background. It is also i m p o r t a n t to o b t a i n the corresponding variances to some desired level of confidence (CL). The m a n y familiar approaches to this general p r o b l e m include: least-squares fitting of the data to the k n o w n peak shape, m a x i m u m likelihood analyses [1] as well as a variety of techniques involving M o n t e Carlo simulations of the peak and b a c k g r o u n d [2]. In some cases the energy and width of the peak are k n o w n but the actual existence of a peak, with a finite n u m b e r of counts, is not evident and a determination, at some level of confidence, of the m a x i m u m n u m b e r of counts hiding in the statistically fluctuating b a c k g r o u n d is desired. Searches for zero-neutrino double beta decay of V6Ge in lowb a c k g r o u n d G e detectors are currently interesting examples [3-8]. More c o m m o n examples occur when such detectors are used to search for very m i n u t e quantities of specific radioisotopes. M a n y elegant statistical analyses address this problem; however, most require computers a n d software which at times are not available to workers in remote locations away from their laboratories. The purpose of this p a p e r is to present a simple, a p p r o x i m a t e m a x i m u m likelihood technique requiring a calculator a n d a normal distribution table. The latter is not required if one is content with confidence limits which are an integral multiple of the variance. This m e t h o d is applicable to the special case of a G a u s s i a n peak, superimposed on a b a c k g r o u n d of k n o w n shape, b o t h of which suffer statistical fluctuations which can be a p p r o x i m a t e d by normal distributions. We derive an explicit expression for the most p r o b a b l e n u m b e r of counts and a simple prescription for d e t e r m i n i n g the variance.
Let us consider a set of data points N(Xi) in a spectrum which is composed of a peak of Gaussian shape Y ( X , ) a n d a mean b a c k g r o u n d rn(X,). Here X i is a discrete variable corresponding to energy for example. We define Z i ~ N ( X , ) - m(X,), where in m a n y applications m(X,) is a c o n s t a n t for all i. Suppose we hypothesize X counts u n d e r the peak Y(X,), then we write
Y(X,)-
e ,x,-x,,p/2o~
(1)
where X 0 corresponds to the center of the peak and o, is a measure of its width. In the absence of statistical fluctuations, there would exist some value of X for which Y( X, ) = Z, for all values of i. Let us consider an experimental spectrum with a peak of k n o w n shape and k n o w n mean rn(X,) and which suffers statistical fluctuations obeying a known distribution. For some hypothesized value of X, there exists an a posteriori probability fx(X,Y,Z,) that each data point N(X,) will have the value observed in the experimental spectrum. The likelihood function L ( X ) is defined as the overall probability of observing the experimental values N(X,) if X and m are specified a priori, but subjected to instrumental dispersion and statistical fluctuations. The likelihood function L(X) is defined: n
L ( X ) = H fx(XiY, Z,),
(2)
1-1
where Y~=- Y(X,), a n d where fx can be calculated from the statistical distribution. We would like first to determine X -= X 0 which maximizes L. The log likelihood function l x has the same extremal values of X that L(X) has. It can be written as follows: n
z~ = ~ in f~( x,Y,Z,). i
0 1 6 8 - 9 0 0 2 / 8 6 / $ 0 3 . 5 0 ,v) Elsevier Science Publishers B.V. ( N o r t h - H o l l a n d Physics Publishing Division)
X
I
(3)
F. 7~ A vignone I11 / A n approximate analytical maximum likelihood technique
526
and
We maximize l x in the usual way where
l
dla d~ = k f~l(XtYtZi) i=~
dfx d ~ dY, d ) t
(4)
The probability fx can be defined in two ways: either as that for observing Z, as far from the expected value Y, as it was, or as that for being on some finite interval. In the first case, fx is proportional to the integral of the normal distribution from the net value of the data point, Z , = N ( X , ) - m(X,), to infinity. An alternative and equally meaningful definition of fx is that of finding Z, on some finite interval Z i to Z, + A Z,. In this case,
fx(X,Y, Zi) =-
fz,+az, e
1 Ov 2 ~
~z,- v,~2/20~ dZi,
(5)
Z,
which, for A Z, << %, is approximately proportional to the integrand itself. This definition of fx results in a variance of the likelihood which is, on the average, 28% larger than that derived with the integral definition. Accordingly this formulation will lead to more conservative limits on the maximum number of counts hiding in the fluctuations of the background. When the background, or continuum under the peak, is large compared to the peak contribution itself, another useful approximation can be made. In this case we can treat %. as a constant and we obtain a significant simplification of the derivatives. In this approximation, the maximizing value )to can be derived analytically while L()t) itself is Gaussian; hence, standard tables can be used to evaluate variances. With the above definitions and approximations we can easily show that 1 [ )te_,X,_Xo)2/2o~ 1 a2 2 v ~ Z , - G - - ~~-
1 df~d~ /x dY~ d)t
x e -~x,-x''¢/2°~,
(6)
and dlx
[ En Zj e -(x'-x°)2/2°2'
1
dX o ) ~
i 1
)t
~
e -'x,-x°)2/°~ ]
(7)
Ox2¢2¢~ i = 1 In the usual data set there are between 5 and 15 data points in the peak and the sums are trivial to evaluate with a standard pocket calculator. The general form for the derivative is, d/x
d-~ = a - b)t,
1
x,,)~/,,~
(10)
To within a constant of integration then,
l x = a X - ½ b X 2,
(11)
and L ( X ) cc e "x-'2hx2.
(12)
Completing the square in the exponent we see L(X)=
e"2/2h e -'2mx-x')2.
(13)
Having determined )t 0, the most likely value of )t, we then wish to calculate the confidence interval MCL). There are several ways of accomplishing this. The first is applicable when )to > 0 and is a straightforward ratio of integrals of L()t) written as follows: cL
d)t + Jx,,(~CC()t) d)t.
=
(14)
It is convenient to initiate a change of variable in which u =---b 1/2 ()t - )to) and u(CL) = b~/2(MCL) - )to). Using this substitution we see, =
2- d u = J0
e -2u d u . ~ -- u ( C L )
(15) To determine u(CL), and hence )t(CL), one selects the desired confidence interval CL and obtains u(CL) directly from normal distribution tables. The use of eq. (15) allows the determination of the variance to any arbitrary level of confidence by reference to the tables. The determination of the variance to an integral number of standard deviations is particularly simple now that we have shown L()t) to be Gaussian. We recall that in this case, o 2 = 1321X/O2)t[ -1
(16)
hence from eq. (11) we see o = b 1/2 immediately. In the case that )to < 0, it is more appropriate to offset the integrals in eq. (14). In effect we center L()t) at )t = )to < 0 and integrate from )t = 0 rather than from the centroid. The normal distribution tables can be used to perform the integrals numerically because L()t) is Gaussian. To use the tables for this purpose requires a simple change of variable X=- ])t [ / o where X0 =I)t0 l / ° . In this way the actual spectral shape is being accounted for but a tacit assumption is being made that the background shape is known and any dips and bulges are purely due to statistical fluctuations.
(8)
3. Examples
where )t o = a/b n
a-= - , a¢2rr 2~, EZ'
"
b = 2rro2o~,~l~ e-(X'
e -'x'-x°'~/2°~
(9)
Let us consider the background spectrum from a recent search for neutrinoless tiff-decay of 76Ge [3]. We
F.7".Avignone 111 / An approximate analytical maximum likelihood technique wish to determine whether or not the 2615 keV y-ray line from the decay of 232Th is present in the spectrum s h o w n in fig. 1 in which there is no obvious evidence of such a peak. T h e question is, how large a peak might be lost in the statistical fluctuations? W e subjected these d a t a to a s t a n d a r d numerical m a x i m u m likelihood analysis to o b t a i n the values X(68%), X(90%) and X(95%), which represent the m a x i m u m n u m b e r of counts to confidence levels of 68%, 90% and 95% respectively. The present a p p r o x i m a t e technique was then applied and the results c o m p a r e d in table 1. First, the m e a n was determined to be 1.47 by averaging over the 40 channel interval a b o u t 2615 keV. The values of Z i are: 0.51, 0.30, - 0 . 0 7 , - 0 . 7 2 , 0.50, 0.25 a n d - 0 . 6 8 . Since m is so small a n d N(Xi) approximately constant, we replace or with m in eq. (7). In cases in which there is a real peak we should approxim a t e % with the average value ~%,) c o m p u t e d as follows:
(o2) _ ox 2~X ~_~ e-(X'-X°)2/°~ + E e-(X'-X°fl/2a~" (19) We note, however, that this is only necessary in determining the variance b u t not for d e t e r m i n i n g X 0 since o2 cancels in the ratio X 0 = a/b. If we define 7qi=-exp[-(Xi-Xo)2/2o 2] we can write the following expression for the derivative of the log-likelihood function:
2~
~x
~v¢2~
Z Zini---,
(20)
o
Ox 2CTg . '7~'
U s i n g the seven c h a n n e l s a b o u t 2615 keV, a n d the values of Z i given above, we determine the values, 7
~_, Zi'o i = - 0 . 0 1 8 i=l
5
527
Table 1 Results obtained with approximate maximum likelihood method compared with those obtained by standard numerical techniques. The discrepancies between the three pairs of values are 2.1%, 15% and 18% for confidence levels of 0.68, 0.90 and 0.85 respectively Confidence level (CL)
~ (CL) approximate
X(CL) numerical
0.68 0.90 0.95
2.3 3.8 4.6
2.4 4.4 5.4
and 7
Y' 72 = 4.044.
(21)
i--1
Accordingly ~0 = - 0 . 0 2 6 , a = - 0 . 0 0 5 and b=0.186. We can refer to the s t a n d a r d normal distribution tables to d e t e r m i n e u(0.68) = 1, u(0.9) = 1.645 a n d u(0.95) = 1.96. W e then solve for u(CL) a n d hence X(CL). The results are given in table 1 with those o b t a i n e d with the numerical results of a m a x i m u m likelihood analysis b a s e d on Poisson statistics. In this case we can neglect the small negative value of h 0 a n d set it equal to zero. W e note that to a 68% level of confidence we have less than 2.31 counts u n d e r the 2615 keV peak in the decay of 2 3 2 T h in 253 d or < 3.8 × 10 - 4 counts per hour. To a level of confidence of 95% our technique results in a statement that we have fewer than 7.5 × 10 -4 counts per hour. Since these are statistical estimators there is no real difference between the numerical analysis using Poisson statistics a n d the present a p p r o x i m a t e method. Let us now consider the example of the spectrum s h o w n in fig. 2. We are searching for the 238.626 keV g a m m a ray peak from the decay of 212pb. We have 8 energy channels of interest over which the m e a n flat b a c k g r o u n d is 148. T h e detector resolution is fwhm = 1.88 keV which corresponds to ox = 1.11 channels. The
190
i
4
175
Z
160 o o 145
0 2605
2615 ENERGY IN k e V
2625
Fig. 1. Background in a Ge detector in the region of the 2614.47 keV y-ray line from the decay of 228Th.
130 229
I
239
249
ENERGY IN keV
Fig. 2. Ge detector energy spectrum in the r e , o n of the 238.626 keV y-ray peak from the decay of ZlzPb.
528
F.T. Avignone 111 / An approximate analytical maximum likelihood technique
d a t a can be summarized as Z i = - 3 , 0, 6, 30, 31, 21, 3 a n d - 4 . We find: 7
Y~ Z, rl, = 66.1 and 7
E ~/2 = 1.97.
(22)
i -+ ]
Accordingly we find k 0 = + 93.5 with a s t a n d a r d deviation a = b - 1 / 2 = 22.9. We can say then that the peak is there at the 4 a level or k 0 = 94 + 23. It is interesting to note that the result from a straighforward sum of the n u m b e r s Z i is 84 + 9, where we consider only Poisson statistical fluctuations a n d not how well the data fit the k n o w n O a u s s i a n resolution function. Finally, let us use this a p p r o x i m a t e m e t h o d in a case for which we expect the a p p r o x i m a t i o n to be worst. This is the case of a large O a u s s i a n peak, well above background, such that the a p p r o x i m a t i o n a,. = constant is certainly not valid. A plot is shown in fig. 3 which represents the K~I X-ray of lS3Eu observed in a Si, b e n b c r y s t a l spectrometer. The resolution is characterized by a = 2.12 energy units or channels or - 58 eV. The m e a n count rate in the linear c o n t i n u u m u n d e r the peak is 2504 counts per c h a n n e l while for i = 1-7, Z~ = 3873, 6872, 9739, 10866, 9740, 7201 a n d 4380. Accordingly, 7
Y'. Z,~/i = 40, 467 i--I
and 7
Y~ r/,z = 3.679.
(23)
i--1
F r o m eqs. (8) a n d (9) we find k 0 = 58 245. The best fit G a u s s i a n using this m e t h o d can then be written, X0
G ( X~ )
m
14
.
.
10 946"q~,
.
.
.
.
.
(24)
.
.
.
z
o
o
10
0
m o z.<
8 6
from which we calculate the seven values G ( X , ) = 4036, 7025, 9798, 10946, 9798, 7025 and 4036. These agree well with the values of Z i given above. The function given in eq. (24) is represented by the solid line in fig. 3. The a p p r o x i m a t i o n would fail if we were to use o~. = m, the mean count in the continuum. Using this prescription with eqs. (10) and (16) we see o, the variance on our value of X0, is 95 whereas V/~00 - 241. Using the prescription given in eq. (19) we calculate o = 318. We conclude that in this case this approximate analytical technique yields a likelihood function whose centroid is still accurately predicted but whose variance was a b o u t one third as large as it should have been unless a mean value of r~,. was used. In this case we o b t a i n a variance a b o u t 1 / 3 larger than that given by pure Poisson counting statistics. This is expected as there is some small contribution to the variance due to fitting error.
4. C o n c l u s i o n
We have d e m o n s t r a t e d a simple analytical m a x i m u m likelihood technique for determining the total n u m b e r of counts (or limits on it) which are u n d e r a Gaussian peak superimposed on a c o n t i n u u m of known shape. The simplification occurs if the statistical fluctuations are assumed to obey a normal distribution with a variance c o m p u t e d from Poisson statistics and if the a posteriori probabilities are calculated over small but finite intervals. The results are c o m p a r e d to a numerical analysis using Poisson statistics a n d to one case for which the normal distribution is not a good approximation because the mean count in the c o n t i n u u m is on the order of one count per unit energy interval. The use of n o r m a l statistics does not affect the resulting limits significantly. The technique is also d e m o n s t r a t e d in a case of a small peak on a significant b a c k g r o u n d ( - 150 counts per channel). Finally it is d e m o n s t r a t e d to yield an excellent fit to a very large Gaussian peak on a relatively small background. In this case, however, the validity of the a p p r o x i m a t i o n s used in obtaining variances are not as t r a n s p a r e n t a n d only estimates are obtained. This technique has the advantage of not requiring c o m p u t e r facilities and hence has b r o a d application particularly in radiochemistrry and nuclear physics data analyses of spectra from g e r m a n i u m and other high resolution detectors with approximately Gaussian instrumental responses.
0 "r
go
0
o
o
OOoo
11
22
CHANNEL NUMBER
Fig. 3. K~2 X ray of 153En observed in a silicon bent-crystal spectrometer. The solid line is the best fit Gaussian using the present techniques.
Acknowledgment This work was partially supported by the National Science F o u n d a t i o n u n d e r grant PHY-8306098.
F.T. Aoignone II1 / An approximate analytical maximum likelihood technique
References [1] S. Brandt, Statistical and Computational Methods in Data Analysis (North-Holland, Amsterdam, 1983) p. 132. [2] F.T. Avignone III, H.S. Miley, W.J. Padgett and D.W. Weir, Nucl. Instr. and Meth. A234 (1985) 315. [3] F.T. Avignone IIl, R.L. Brodzinski, D.P. Brown, J.C. Evans Jr., W.K. Hensley, J.H. Reeves and N.A. Wogman, Phys. Rev. Lett. 54 (1985) 2309. [4] E. Bellotti, O. Cremonesi, E. Fiorini, C. Liguori, A. Pullia, P. Sverzellati and L. Zanotti, Phys. Lett. 146B (1984) 450, [5] A. Forster, H. Kwon, J.K. Markey, F. Boehm and H.E. Henrikson, Phys. Lett. 138B (1984) 301.
529
[6] J.J. Simpson, P. Jagam, J.L. Campbell, H.L. Mann and B.C. Robertson, Phys. Rev. Lett. 53 (1984) 141. [7] H. Ejiri, N. Takahashi, T. Shibata, Y. Nagai, K. Okada, N. Kamikubota and T. Watanabe, Proc. Int. Symp. on Nuclear Spectroscopy and Nuclear Interactions, Osaka, Japan (1984) in press. [8] D.O. Caldwell, R.M. Eisberg, D.M. Grumm, D.L. Hale, M.S. Witherell, F.S. Goulding, D.A. Landis, N.W. Madden, D.F. Malone, R.H. Pehl and A.R. Smith, Phys. Rev. Lett. 54 (1985) 281.