130
Nuclear Instruments and Methods in Physics Research 223 (1984) 130- t36 North-Holland, Amsterdam
A NEW PROCEDURE
TO ANALYSE RADIOACTIVE EMISSION
COUNT DATA
C . M . O ' B R I E N a n d M. S T O N E Department of Statistical Science, University College London, Gower Street, London 14/CIE, England Received 12 September 1983
Nuclear analysis of a chemical mixture is considered in which counts of alpha, negatron beta or positron beta particle emissions are made. A statistical approach to the determination of the composition of a chemical mixture is presented and applied to both artificial and experimental data. Numerical comparisons with three currently adopted techniques for the analysis of multi-exponential data suggest a general superiority of the new procedure.
1. Introduction For many years experimentalists have been interested in determining the mixture composition of chemical specimens by analytical techniques. These have consisted of either (1) expensive separation procedures, such as those described by Bleuler and Goldsmith [1], for chemically active specimens or (2) nuclear analyses, such as activation analysis with neutrons or charged particles, for chemically inert specimens; e.g. diamond. A unified approach to the elemental determination of a chemical specimen of completely unknown composition has not appeared up to now. This paper sets out to rectify this deficiency, that lies more in the statistical analysis of nuclear activation data than in their experimental acquisition. The specimen consists of a stable combination of elements each with one or more isotopes. Let i 1, 2 . . . . . I index a combined listing of the known isotopes of all elements. Let ~r,, i = 1 , . . . , I denote their proportions in the specimen under analysis. For example, pure graphite consists of two isotopes with atomic number 6 and mass numbers 12 and 13, in the proportions 0.9889:0.0111 respectively. A number of distinct radioactive secondary materials are induced in the specimen by bombardment either with charged particles of a specified type and energy level or with neutrons. This irradiation is carried out for a controlled period of time that is effectively instantaneous, compared to the half-life, "r, of any induced
count __ o
u
y, Fr, u,+t
count , ,,~ u
Y2 F ~ - - u£t 2
count ~ u
Fig. 1. Diagrammatic representation of the particle emission counts, 0167-5087/84/$03.00 © Elsevier Science Publishers B.V. (North-Holland Physics Publishing Division)
radioactivity. The numbers of particle emissions resulting from this radioactivity are counted over chosen time periods (fig. 1) and these counts are then subjected to statistical analysis.
2. Model Corresponding to each of the isotopes that may comprise a specimen, there are a finite number D,(i = 1 . . . . . I ) of decay schemes that may or may not be initiated by a particular irradiation design. Each decay scheme is a (generally branching) process that starts with a particular initial parent state, induced by the irradiation, and ends when all transitions via daughter states have taken place. For simplicity of exposition, we will suppose that the decay schemes are non-branching. Consider the possibly activated, parent nuclei corresponding to a particular decay scheme for a particular isotope. Let A denote the number of these immediately after irradiation. Assume that A has a Poisson distribution with mean k h , where (1) k depends on the particular parent state, the irradiation design and the cross-sectional parameters of the isotope and (2) h is a composition parameter of the particular isotope, such that the values of ~, for different isotopes are in the same ratios as the isotopic proportions ~r1. . . . . ~rI in the specimen. If the design parameters rule out the activation of an isotope, the associated value of k will be set equal to zero when we come to data analysis. Denoting the
y ---T~ ) U [ t i m e since u+t° irradiation)
C.M. O'Brien, M. Stone / New analysis procedure parent state
[intermediate I
S,
$2
daughter $3
states S
131 final state
- - -
Sf
path of a single nucleus emission ,
e,
e~
×
~.(
ef.~
e 3 x
-
-
-
x
) U
o
Fig. 2. Decay history of a single parent state nucleus from an activated isotope. (e 1 beta particle emissions).
. . . . .
parent state by S 1 and the successive daughter states by S 2 . . . . . Sf, respectively, fig. 2 represents the decay history of a single S 1 nucleus. Let Aj(u), j = 1 . . . . . f denote the number of the A parent nuclei at time u = 0 that are in state ~ at time u. Note that Al( u ) + ... + A/( u) = A. The classical deterministic theory [2] of radioactive decay models such a chain of transitions as a first-order process with decay rate ttj between states Sj and Sj÷ v Strictly analogous probabilistic arguments formulate the model as a simple stochastic migration process (e.g. ref. [3]). By considering such models, one may conclude that, given A = a, ( A I ( U ) , A 2 ( u ) . . . . . A / ( u ) ) has a multinomial distribution with means ap l ( u ), ap 2( u ) . . . . . apf ( u ), respectively, where e -~'w,
ef_ 1 represent alpha, negatron beta or positron
period (u, u + t). It is shown in the appendix that /-1 X= E (f-j)[Aj(u)-Aj(u+t)]. j=l Using eq. (3), the mean of X is given by f--1
E(x)= kx E ( f - J ) [ p j ( u ) - p A . + t ) ] .
pj(u) =
(5)
j=l
forj = 1 ;
On substitution of eqs. (1) and (2), this reduces to the corresponding expression for the deterministic activity [4] of a non-branching decay process. Let Y(u, t) denote the total count of particles from the irradiated specimen for the period (u, u + t), together with any contribution from background radiation. (The Poisson character of the latter has been studied in ref. [5]). Then we will have
forj = 2 . . . . . f - - 1;
E{Y(u,t)}=bt+
J
I
Ojh e -uh",
(4)
h=l
(1) L-~hJ
•
Dr
E E(Xid),
(6)
i~l d~l
where b denotes the background rate, X, d denotes the n u m b e r of particle emissions during (u, u + t) rekulting from the decay scheme d of the i th activated isotope and each E(Xia ) is of the same form as eq. (5).
e-"~",
forj =f; j-i
H Ptl j I~ 1
~'h
3. Composition determination (2)
/=l
(l~h)
Combining this conditional multinomial distribution with the Poisson distribution for A results in ( Al( u), AE( U) . . . . . A / ( u )) being independent Poisson random variables with means given by
khpj(u),
j=l ..... f,
(3)
respectively. Assume that only one particle is emitted per transition. Let X denote the n u m b e r of particles emitted, for the parent state considered, over the time
It is unreasonable to attempt to fit the isotope model (6) to an observed series of emission count data as it stands, i.e. with the composition parameters ha,.:. , ~,t to be estimated for all the known isotopes. Existing statistical procedures would yield uninformative results if they were put to this task. Instead, we will take a sequential approach in which reduced models, corresponding to subsets of isotopes, are fitted in a "survival of the fittest" scheme: the most promising isotopes are introduced in order, with the possibility of elimination by later introductions, and the complexity of the finally accepted model is determined by cross-validatory prun-
132
C.M. O'Brien, M. Stone / New analysis procedure
ing, a special case of the method cross-validatory choice [6] that has been widely and successfully applied [7,8] since it was introduced ten years ago. In all, there are 2 / models corresponding to the inclusion or not of each isotope. Let M = {i 1. . . . . i,, } denote a general model that includes only the isotopes with indices i~ . . . . . i,,. Let Y/' J = 1 . . . . . n be the observed count of the number of emissions Y ( u j , tj) which, under an hypothesised model M, has expectation m
btj + Y'~ cjirhir,
(7)
r--1
with f,-I
Ci,.=k,. E (f, -I)[pt(uj)-p,(u/+t,)].
(8)
I-I
where the k, h and f of eq. (3) have been assigned the relevant isotope indices. The sequential procedure uses, as a criterion for inclusion or exclusion of an isotope in a model, its contribution to what may be labelled the pseudo-likelihood: ,,
LM(b, hi,, . . . . h,.,)
btj
-~
Cjirhlr
= H
The expression for L g would be the true likelihood if Y( ul, t 1) . . . . . Y( u . , G) were independently Poissonlydistributed with the (exact) expectations (7) - as is the case when ~, f i 2 £ . = 2. Let b, hi,, .... Xi. denote the unrestricted values of b, X~,. . . . . h,., respectively, that maximise L~t and define L M by =
=
. . . .
L M = L M ( ~ , ~,,, . . . . X,~),
provided that b, ~i~, .... X,,,, are admissible, i.e. b >/0 a n d S , > 0 , r = l . . . . . m;
= 0,
otherwise•
(lO)
When one considers extending a model by the introduction of a single additional isotope, i,.+1, a collinearity test will be applied: the additional isotope is required to be eligible in the sense that the coefficient of determination, R 2, of the n-vector
I Clim.I Cnilm+l
by the vectors
, t,
c,,,~ ]
ci,,, '
already in the model does not exceed an adjustable tolerance level such as 0.99 (e.g. ref. [9]). An isotope excluded by this test is, however, available for inclusion in any later model. Stage 1: Select the single-isotope model ( i 1 } that has the largest value of L M. If there is no such model with positive L M then end the procedure with the null, i.e. background only, model. Stage 2: Find the eligible second isotope, i 2, that produces the largest value of L M for the model { i 1, i 2 }. If this largest value is zero, end the procedure with { i~ }. Otherwise consider all potential eligible replacement for the first isotope i~ before proceeding to stage 3 with the largest L g value. Stage 3: Find the eligible third isotope, i 3, that gives the largest L m value when added to a model from stage 2. If this value is zero, the procedure is ended with the stage 2 model. Otherwise consider all potential eligible replacements for the first isotope and then, having made such replacement if L M is thereby increased, consider all potential eligible replacements for the second isotope, before proceeding to stage 4 with the largest L M value. The sequential procedure continues thus either until it ends as indicated or until it exhausts the eligible isotopes. The object of the sequential procedure is not to accept the end model at the stage when the procedure terminates (which will, for the applications envisaged, contain an unrealistically large number of isotopes) but to use it as the basis for cross-validatory pruning. How is pruning to be effected? We define a pruning level c~ as a cut-off value in the increment in In L M resulting from the addition of an isotope to a model. The pruning level is applied to the "end model" by "backward elimination" [10] as follows: starting at the end, we go back to the model from which the elimination of any isotope would produce a decrement in In L M in excess of a. This could feasibly be the end model itself although this would not usually be the case. Let ~/~ denote the fitted model that we obtain at pruning level a, with the associated estimates b ( a ) , Xi,(a) . . . . . ~im(a) of b, hi,, . . . . him, respectively, inserted. We use cross-validation to choose a, as follows. The whole sequential procedure is repeated n times, each time omitting just one of the observed counts. Let hTt7 denote M" when t h e j t h count is omitted and 9f' the prediction of yj under fitted model M^j~, that is, eq. (7) for M7. A cross-validatory assessment for level a is
C.M. O'Brien, M. Stone / New analysis procedure.
133
provided by the discrepancy criterion n
C(a) = •
/= 1
(YJ-YT)2
(11)
Y7
We choose ~ to be the supremum, e ?, of the values of a
that minimise C(~). The finally accepted model is then taken to be M s*, that is, the model corresponding to pruning level a ? applied to the fitted-model sequence for the whole dat~i.
E
\
o
E c
4. Simulations and applications A F O R T R A N IV program M O D S E L was written to perform the sequential procedure and a program P R U L E V written to determine the cross-validatory choice of pruning level. Details of the subroutines are described in [11] together with applications to experimental data and the algorithms of a more general sequential procedure that forces selected isotopes into the model. The following five examples - two simulation, two comparative and one experimental - serve to demonstrate the effectiveness of the new procedure.
4.1. Simulated examples We test the method for a simplified data base in which the method is restricted to just twelve selected parent states of activated isotopes: 13N('r = 9.96 min), 3SK(~" = 7.61 min), 5°rnMn(~- = 1.74 min), 54mCo( T = 1.46 min), 140 0" = 70.6 s) 17F('r = 64.5 s)
21Na(~- = 22.47 s), l°C(~- = 19.2 s), 24A1(~- = 2.07 s), 3smK(~" = 0.93 s), 2°Na(~" = 0.446 s), 36K(,r = 0.34 s),
instead of to the full range of parent states that would be used in practice. Four data sets I, 1I, III, IV were generated from subsets of these twelve, each with a random background radiation of 13 counts per minute, and are shown in fig. 3. In the analyses, we take k = l and the ~. values are then our estimates of the number of nuclei in the parent states composing the models. (I) Table 1 gives the output of M O D S E L . The true composition has been immediately picked up, but the procedure then goes to include four more parent states. Fig. 4 gives a plot of C(a) calculated by P R U L E V . We see that a t = 0.040, whose application to table 1 gives the finally accepted pruned model: 140 with composition estimate 999.5 and b = 12.6. (II) Table 2 gives the output of M O D S E L for which the sequential procedure ends with the true composition.
I
o
~
I0
I~
2'0
time
25
30
3b
4o
in m i n u t e s
Fig. 3. Randomly generated values of In(count/rain) for fl-decay and background radiation. Numbers of nuclei used in the simulations were: I, 1000 of 140; II, 1000 of 24A1; 13 of 3SinK; 1000 of 36K; III, background radiation only; IV, 700 of t3N; 10 of 24.4.1. Fig. 5 gives a plot of C ( a ) calculated by P R U L E V . We see that a t = 20.13, whose application to table 2 gives the finally accepted pruned model: 24A1, 38inK, 36K with composition estimates 988.0, 12.8, 997.1, respectively, and b = 12.5. Table 1 MODSEL output for dataset I Stages
Parent states in model at end of stage s
Composition estimates, ),
Increment in In L M at stage s
140
999.5
2.1 × 103
140 36K
991.1 0.3
3.9× 10 -3
50,-,Mn 140 2oNa
3.4 995.9 0.4
3.2 × 10- 3
5oreMn 14O 24AI 36K
3.3 996.0 0.1 0.3
2.8 × 10- 3
3s K 5°mMn 14O 24A1 36K
0.4 2.5 996.6 0.1 0.3
9.7 × 10- ~
C.M. O'Brien, M. Stone / New analysis procedure
134
[C(~-) d
004
Fig. 4. C(o0 for data set I.
Table 2 MODSEL output for dataset lI Stage s
Parent states in model at end of stage s
Composition estimates, ~
Increment in In L M at stage s
1
38inK
1981.3
1.3 X 10 4
2
l°C 20Na
303.0 1706.0
1.0 × 102
3
24A1 38mK 36K
988.0 12.8 997.1
4.7 × 10
t
I-- LJ
:L
J
L 20"13
Fig. 5. C(a) for data set II.
(III) This example illustrates a feature of the new procedure by showing what happens when count data recorded for an unactivated specimen are analysed, i.e. the count data are due solely to background radiation. Application of M O D S E L ends with the null model. P R U L E V determines that at = 0, giving the null model with b = 12.3 as the finally accepted pruned model.
(IV) For this example, application of M O D S E L ends with the three parent states 13N, 5°mMn, 36K, having yielded the single parent state 38K at stage 1 and the two parent states 13N, 24A1 at stage 2. Application of at =0.11, obtained from PRULEV, gives the finally accepted pruned model: 13N, 24A1 with composition estimates 652.5, 7.9, respectively, and b = 19.5. As a comparison with existing methods, the data sets III and IV were re-analysed using three currently adopted techniques for the decomposition of multiexponential data, namely, 1) the Graphical Peeling method [12,13] which uses the well-known procedure of "stripping", to fit exponentials singly in increasing order of their decay constant, 2) the method of de Prony [14,15] which fits a linear combination of exponentials as a solution of a difference equation with constant coefficients (see refs. [16-18] for a recent re-discovery of this approach), 3) The spectral analytic technique of Gardner et al. [19,20]. The results are as follows: (1II) The Graphical Peeling method correctly chooses background radiation only, with b = 12.3. De Prony's method, as described in [15], gives two exponential components: one with a real decay constant estimate of - 0 . 0 4 6 and the other with a complex decay constant estimate! The spectral analytic technique [20] gives, as may be expected, output that is difficult to interpret but suggests a peak decay constant of 0.1. (IV) The Graphical Peeling method incorrectly chooses background radiation only, with k, = 41.9. De Prony's method performs as in III but with a real decay constant estimate of -0.021. The spectral method gives two peaks, the first close to the decay constant for 13N but the second some distance from that for 24A1, reflecting the unconstrained nature of this method.
4.2. Experimental example The method of this paper has been successfully applied [11] to experimental data resulting from the 3 MeV and 9 MeV proton activation of 1) boron, carbon-12, carbon-13 and silicon standards, 2) NH4CI and graphite specimens. The case of boron is now discussed. Although only the nuclear reaction riB(p, n ) n C should be initiated in a pure boron specimen under 3 MeV proton bombardment, we test the method with I = 24 for the database consisting of the isotopes for the elements with atomic numbers 5-11, 13, 14, 16 and 17. The corresponding decay schemes are those for 3 MeV proton activation. The data, from which an estimate of the background radiation has been subtracted (unnecessarily from the point of view of the new procedure), are shown in fig. 6. Table 3 shows the output of M O D S E L (in which we
C.M. O'Brien, M. Stone / New analysisprocedure 15 ,p
C 1 0 .................
E
-___...__
\
~3 0 _c
5
0
............... ~-. . . . . . . . . . .
14~30
~. . . . . . . . . . . .
,'",
15~00 15:30 16~00 t i m e in h o u r s : m i n u t e s
16~30
Fig. 6. The fl-decay of llC(~"= 20.38 min) after 3 MeV proton activation of a boron standard (irradiation stopped at 14:25 h). Table 3 MODSEL output for 3 MeV proton activation of a boron standard a) A
Stage s
Parent states in model at end of stage s
(kX)
Inferred isotopes of the specimen
1
llC
1.05×106
HB/12C/14N
2
18F HC
2.34x 10s 9.57×10 s
180 11B/12C/14N
~) M"t: model at the end of stage 1. can specify zero background); since the cross-sectional parameters k are not known to us, the table gives the estimates of k~,. The true composition has been immediately picked up but the procedure then goes on to include one more parent state - only to reject this possibility (after crossvalidatory pruning) in favour of the finally accepted pruned model: HC. The cross-validatory selection agrees with our knowledge that liB(p, n)11C ~ l l B under bombardment with 3 MeV protons. However, this experiment was badly designed, because the reactions 12C(p, pn) 11C ---,11B and 1aN(p, ot)llc ~ nB are also initiated by the same irradiation design. On the basis of this nuclear analysis, it is impossible to decide whether the specimen is composed of 11B, 12C, 14N or a combination of the three.
135
rently adopted techniques, the experimental example of section 4.2 illustrates the need for a careful choice of irradiation design. Our method may be improvable in the specification of (a) the fitting criterion (currently the pseudo-likelihood), (b) the eligibility criterion (currently R2), and (c) the pruning criterion (currently the increment in the natural logarithm of the pseudo-likelihood). However, greater gains may be obtainable by attention to the accuracy of decay scheme parameters and the design of irradiation schemes. We plan to incorporate the programs MODSEL and PRULEV into a larger suite of programs which will, along with gamma-ray analysis routines (see, for example, refs. [21,22]), permit the automatic design and interpretation of nuclear analyses. We would like to thank Dr. H.J. Milledge of the Crystallography Unit of UCL for suggesting the physical problem from which this paper grew, Mr. R.F. Galbraith for his assistance and encouragement, and Dr. M.I. Edmonds of the University of Birmingham for the use of her experimental data. We are grateful to Prof. H.-J. St6ckmann, Fachbereich Physik der Philipps-Universit~t Marburg, for letting us use his spectral analysis subroutines. One of us (C.M.O'B) acknowledges the financial support of the Science and Engineering Research Council.
Appendix Let Ej denote the number of emissions of type ej (i.e. from Sj to 5 + 0 during the interval (u, u+t). Let ( j - a , j + b ) , a>~O,b>~O,a+b~O, denote the number of such emissions for which the nucleus is in state Sj_ a at time u and state Sj÷ b at time u + t. Let Rj denote the number of nuclei in state Sj at time u which are still in state Sj at time u + t. Then, forj = 1. . . . . f - 1,
Aj(u)=Rj+(j,j+I)+(j,j+2)+ =Rj+
Y'. ~
...+(j,f)
(12)
(j-l,j+l+b)
a > ~ 0 b>~0
5. Discussion The examples of the last section have shown that the sequential procedure and cross-validatory selection of an isotope model can produce satisfactory determinations of the composition of multi-exponential decay curves when half-life and cross-sectional parameters are known. Our examples had I = 12 or 24 but, because of the sequential nature of the method, there is no essential limit on the number of isotopes in the data base. While the procedure is certainly more successful than cur-
- E a>0
E (j-a,j+l+b)
(13)
b~>O
=Rj+Ej- ~ y" (j-,,,j+l+b) a>0
(14)
b~>0
and, f o r j = 2. . . . . f,
Aj(u + t) = Rj + ( j - 1 , j ) + ( j -
2 , j ) + ... + ( 1 , j ) (15)
=Rj+ ~_, ~., ( j - l - a , j + b ) a>~O b>~O
-
]~
Y~. ( j -
a>~O b > O
1 - a,j + b)
(16)
C.M. O'Brien, M. Stone / New analysis procedure
136
=Rj+E,_,- E Y'. ( j - a , j + l + b ) . a>O b>0
(17) Hence, using eqs. (14) and (17),
Ai(u)-Aj(u+t)=E,-E,_
,,
j=2 ..... f-1.
+ t) = E,.
(18)
(19)
Writing
Aj=Aj(u)-Aj(u+t),
j=l
..... f-l,
(20)
a n d using eqs. (18) a n d (19), we then have E 1 = A1
E2= A ~ + A 2
Ef_l=aa+A2+
...+af
~.
(21)
Whence f-1
g = ( f - 1)a, + ( f -
2 ) a 2 + ... + a i _ ,
(22)
/=1
= X,
1955) p. 198.
[5] N. Arley, On the theory of stochastic processes and their
Also
A~( u ) - A t ( u
[21 H. Bateman, Proc. Cam. Phil. Soc. 15 (1910) 423. [31 E. Renshaw, Biometrika 59 (1972) 49. [41 A.E.S. Green, Nuclear physics (McGraw-Hill, London,
(23)
assuming that only one particle is emitted during emission type e j , j = 1 . . . . . f - 1. Hence, the validity of eq. (4) is established.
References [1] E. Bleuler and G.J. Goldsmith, Experimental nucleonics (Pitman, London, 1952).
application to the theory of cosmic radiation (Wiley, New York, 1943). [61 M. Stone, J.R. Statist. Soc. B 36 (1974) 111. [71 A. Mabbett, M. Stone and J. Washbrook, Appl. Statist. 29 (1980) 198. [81 U. Hjorth and L. Holmqvist, Appl. Statist. 30 (1981) 264. [91 K.N. Berk, J. Amer. Statist. Assoc. 72 (1977) 863. [lO1 M.L. Thompson, Int. Statist. Rev. 46 (1978) 1. [111 C.M. O'Brien and M. Stone, Cross-validated composition determination in nuclear analysis, University College London (Department of Statistical Science) Research Report 29. [12] P. Mancini and A. Pilo, Comp. Biomed. Res. 3 (1970) 1. [131 D.H.D. West, Nucl. Instr. and Meth. 136 (1976) 137. [14] R. de Prony, J. l'Ecole Polytechn. (Paris) 1 (1795) 24. [151 1. Barrodale and D.D. Olesky, in The numerical solution of nonlinear problems, eds., C.T.H. Baker and C.P. Phillips (Clarendon Press, Oxford, 1981)ch. 20. [16] M. Vakselj, Fizika 7 (1975) 132. [171 T. Mukoyama, Nucl. Instr. and Meth. 179 (1981) 357. [181 E. Oliveri, O. Fiorella and M. Mangia, Nucl. Instr. and Meth. 204 (1982) 171. [191 D.G. Gardner, J.C. Gardner, G. Laush and W.W. Meinke, J. Chem. Phys. 31 (1959) 978. [201 H.-J. StOckmann, Nucl. Instr. and Meth. 150 (1978) 273. [211 J.T. Routti and S.G." Prussin, Nucl. Instr. and Meth. 72 (1969) 125. [221 W. Carder, T.D. MacMahon and A. Egan, Talanta 25 (1978) 21.