ANALYSIS OF TWO-DIMENSIONAL COINCIDENCE SPECTRA

Nuclear Instruments and Methods 41 (1966) 1-12; © North-Holland Publishing Co.

D. D. SLAVINSKAS, T. J. KENNETT, W. V. PRESTWICH and G. L. KEECH

McMaster University, Hamilton, Ontario, Canada
Received 27 October 1965

A model is presented for spectra obtained with two detectors in coincidence. Two methods of solution for the significant parameters are discussed. The first method is a full least squares solution which may lead to the inversion of an impractically large matrix. The second method avoids this difficulty by breaking up the solution into parts. Covariance matrices are derived for both methods of solution. A study of various least squares weighting estimates is also presented.

1. Introduction

Recent developments in nuclear spectroscopy have shown a considerable advance in data acquisition equipment and techniques. For instance, multiple correlation experiments with data fields of 16000 channels are becoming quite common. To make efficient use of these data a parallel development in data reduction methods is desirable. The associated large scale calculations are made feasible by the fortunate circumstance of high speed and large memory capacity computers becoming increasingly available.

The data considered here contain distortions introduced by the response of detecting systems. Aggravated by statistical uncertainties inherent in counting experiments, these distortions render accurate manual interpretation of data quite difficult. Mathematical techniques, such as reduction by the method of least squares, are required to obtain in some sense the best estimates of the physically meaningful parameters. An important class of such parameters are the relative intensities of nuclear transitions. A detecting system will respond to monoenergetic radiation by producing a line shape, the departure of which from a delta function is usually the main cause of distortion in the spectrum. If the response functions comprising a spectrum are known, it is possible to obtain the intensities by a simple least squares calculation. A subsequent study of residuals will indicate whether the model used was correct. The calculation can then be repeated with additional response functions and/or modifications to the presently used ones. In the case of γ-ray spectra obtained with a single NaI(Tl) detector such techniques have been developed to a high degree of accuracy, notably in the work of Heath et al.^1). It appears that coincidence spectra have not been similarly treated, and it is the purpose of the present work to provide a basis for filling this gap.

We will consider a two-dimensional time correlation experiment performed with detectors x and y. Suppose that detector x produces response functions X_i(x), where i = 1, 2, ..., n_x. The subscript i indexes the n_x types of nuclear transitions to which detector x is sensitive. Similarly we let detector y have n_y response functions Y_j(y). In general the numbers n_x and n_y are different. We could have, say, detector x sensitive to β-rays and detector y to γ-rays. The function X_i(x) would in this case represent the experimental β-ray spectrum with end-point E_i, whereas Y_j(y) would be the response to γ-rays of energy E_j. In the particular case of a γ-γ coincidence spectrum we expect to have n_x = n_y. We shall assume that the response functions are normalized to unity over their respective energy scales x and y, i.e.

\sum_x X_i(x) = \sum_y Y_j(y) = 1.

With this stipulation the response functions become probability distributions over the energy base. If counter x operates freely without coincidence restrictions we obtain the one-dimensional model spectrum

S_x(x) = \sum_{i=1}^{n_x} a_{xi} X_i(x).   (1)

The coefficients a_{xi} are related to the total detection efficiencies ε_{xi} and the number of nuclear transitions 𝒩_i in time T by

a_{xi} = ε_{xi} 𝒩_i.   (2)
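As a concrete illustration of eqs. (1) and (2), the following Python sketch builds a one-dimensional model spectrum from two normalized response functions. The Gaussian line shapes, efficiencies and transition numbers here are stand-ins chosen for illustration, not values from the paper.

```python
import numpy as np

def gaussian_response(n_channels, peak, width=2.0):
    """Stand-in line shape: a Gaussian photo-peak, normalized to unit sum so
    that the response is a probability distribution over the channel scale."""
    ch = np.arange(n_channels)
    r = np.exp(-0.5 * ((ch - peak) / width) ** 2)
    return r / r.sum()

n_channels = 33
eps_x = np.array([0.294, 0.283])        # assumed total detection efficiencies
N_trans = np.array([1.0e7, 3.0e6])      # assumed numbers of transitions in time T
a_x = eps_x * N_trans                   # eq. (2): a_xi = eps_xi * N_i

# Columns of X hold the normalized response functions X_i(x).
X = np.column_stack([gaussian_response(n_channels, 20),
                     gaussian_response(n_channels, 28)])
S_x = X @ a_x                           # eq. (1): S_x(x) = sum_i a_xi X_i(x)
```

Because each response column sums to one, the total content of the model spectrum is just the sum of the coefficients a_{xi}.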

When coincidences between detectors x and y are demanded the resultant spectrum fills a two-dimensional array. Since detector x is independent of detector y, the two-dimensional distribution function for a coincidence (i,j) is simply the product X_i(x)Y_j(y). Upon summing the coincidences between all possible pairs (i,j) we obtain the spectrum model

S(x,y) = \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} a_{ij} X_i(x) Y_j(y).   (3)

The coincidence coefficients a_{ij} are given by

a_{ij} = ε_{xi} ε_{yj} (𝒩_{ij} + k 𝒩_i 𝒩_j),   (4)

where 𝒩_{ij} in the first term refers to the number of true coincidences between events i and j in cascade. The second term represents chance coincidences. It contains the constant k which is related to the resolving time of the detecting system.

Knowledge of the coincidence coefficients a_{ij} can be of great importance in the construction of nuclear decay schemes. An added practical advantage is that the results can be made independent of detection efficiencies, which are usually difficult to determine. Using the one-dimensional coefficients a_{xi} = ε_{xi} 𝒩_i and a_{yj} = ε_{yj} 𝒩_j we can obtain

a_{ij} / (a_{xi} a_{yj}) = 𝒩_{ij} / (𝒩_i 𝒩_j) + k.   (5)

Here we have an n_x by n_y array of coefficients independent of ε_{xi} and ε_{yj}. If we are dealing with γ-γ coincidences, then 𝒩_{ii} = 0 and the diagonal elements determine the parameter k. This can be subtracted from the whole array, leaving the simplified elements 𝒩_{ij}/(𝒩_i 𝒩_j). This final array can be used to determine γ-ray cascades, branching ratios and even absolute decay rates. An example of such numerical calculations is given in a later section.

Solution for the coefficients a_{ij} can be effected by the application of least squares techniques to model eq. (3). There are n_x n_y coefficients; hence the full least squares solution requires the inversion of an n_x n_y by n_x n_y matrix. Clearly, this matrix can become prohibitively large when n_x and n_y reach the neighbourhood of 10 or more. As a consequence, the practicability of a full least squares solution is limited to cases with few nuclear transitions. For larger scale problems we can use a solution by parts in which the dimensions of x and y are treated separately. Model eq. (3) can be rewritten in the form of two coupled equations
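The reduction just described can be stated compactly with numpy arrays. The numbers below are hypothetical values for a two-transition γ-γ cascade, chosen only to exercise eqs. (4) and (5); the array `reduced` is what remains after the chance parameter k is removed.

```python
import numpy as np

# Hypothetical inputs for a two-transition gamma-gamma cascade.
N = np.array([1.0e7, 7.0e6])            # transition numbers N_i
N_coinc = np.array([[0.0, 7.0e6],       # true coincidences N_ij (N_ii = 0)
                    [7.0e6, 0.0]])
eps_x = np.array([0.29, 0.28])          # assumed efficiencies, detector x
eps_y = np.array([0.27, 0.25])          # assumed efficiencies, detector y
k_true = 1.0e-8                         # chance-coincidence constant

a_x = eps_x * N                         # one-dimensional coefficients, eq. (2)
a_y = eps_y * N
a = np.outer(eps_x, eps_y) * (N_coinc + k_true * np.outer(N, N))   # eq. (4)

ratio = a / np.outer(a_x, a_y)          # eq. (5): N_ij/(N_i N_j) + k
k_est = np.mean(np.diag(ratio))         # diagonal determines k, since N_ii = 0
reduced = ratio - k_est                 # simplified elements N_ij/(N_i N_j)
```

The efficiencies cancel in `ratio`, which is the practical point of eq. (5): the final array depends only on transition numbers and the resolving-time constant.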

S(x,y) = \sum_{i=1}^{n_x} b_i(y) X_i(x),   (6)

b_i(y) = \sum_{j=1}^{n_y} a_{ij} Y_j(y).   (7)

If the value of y is fixed, eq. (6) can be used as a model for a least squares fit in the x-dimension. This calculation, yielding n_x coefficients b_i(y), can be carried out for all the N_y values of y in the data field. As a result, least squares estimates are obtained for all n_x N_y intermediate coefficients b_i(y). Next we can fix the value of i and, using model eq. (7), perform a least squares fit in the y-dimension. After n_x such calculations one obtains estimates of all n_x n_y values of a_{ij}. The solution by parts requires a total of N_y + n_x least squares calculations, each involving the inversion of an n_x by n_x or n_y by n_y matrix, which is considerably smaller than the n_x n_y by n_x n_y matrix in the full solution. This reduction in size is a decided advantage in that a larger number of nuclear transitions can be considered in the analysis. In effect, the overall problem is reduced to a series of smaller problems by the use of the coupled eqs. (6) and (7).

There is another practical advantage to the solution by parts. The simultaneous solution does not provide estimates of the intermediate coefficients b_i(y), which represent a one-dimensional spectrum in coincidence with the ith nuclear transition. These spectra are important because they provide additional insight into the model. For instance, they may provide a basis for improving the accuracy of employed response functions or they may even reveal the existence of new transitions which were not included in the original model.

When we write down definite (non-statistical) quantities, the coupled eqs. (6) and (7) are mathematically equivalent to the model eq. (3). In our least squares calculations we are dealing with statistical intermediate coefficients b'_i(y) which in general do not satisfy an equation like (7) but are distributed around its values. Therefore the solution by parts can be expected to yield estimates of a_{ij} somewhat different from the true least squares estimates of the full solution. The character of this difference is investigated in the following sections. It is shown that both types of estimates are unbiased and that the solution by parts gives variances greater than or equal to the full solution variances. This reduction in accuracy of estimates is the price exacted for the practical advantages offered in the solution by parts. Fortunately the increase in variances appears to be small, only a few percent or less in the examples considered. Full covariance matrices of the coefficients a_{ij} will be derived for both methods of solution.

2. Full least squares solution

All quantities in model eq. (3) are definite, i.e. they are nonstatistical. If the size of the data field in the x and y dimensions is N_x and N_y, then the experimentally observed spectrum is a sample from an N_x N_y-dimensional distribution function. We assume that the observed spectrum can be written in the form

S'(x,y) = S(x,y) + Δ'(x,y),   (8)

where the Δ'(x,y) are statistically independent variables having normal probability distribution functions with zero mean and variances equal to S(x,y). Primes will be used in this paper for the indication of statistical quantities exclusively.

Upon fitting the model to experimental data we will obtain an estimate of the coefficients a_{ij},

a'_{ij} = a_{ij} + Δ'_{ij}.   (9)

The Δ'_{ij} are random variables (in general statistically dependent) whose full statistical properties will be determined later. We shall use the method of maximum likelihood, which provides unbiased estimates of the a_{ij} with minimum possible variances. For each data field point (x,y) we can form a likelihood function

L(x,y) = [2π S(x,y)]^{-1/2} exp{ -[S'(x,y) - S(x,y)]^2 / [2 S(x,y)] }.   (10)

The total likelihood function is obtained by taking the product over the entire field,

L = \prod_{x=1}^{N_x} \prod_{y=1}^{N_y} L(x,y).   (11)

We get the estimates a'_{ij} by maximizing L. This leads directly to the principle of least squares in that we must minimize the exponent in L,

\sum_{x=1}^{N_x} \sum_{y=1}^{N_y} [S'(x,y) - \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} a_{ij} X_i(x) Y_j(y)]^2 / S(x,y) → min.   (12)

Differentiating (12) with respect to a_{kl} and equating the result to zero yields n_x n_y normal equations

\sum_{x,y} [S'(x,y) - \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} a'_{ij} X_i(x) Y_j(y)] X_k(x) Y_l(y) / S(x,y) = 0,   (13)

where k = 1, 2, ..., n_x and l = 1, 2, ..., n_y.

In eq. (13) we have the n_x n_y unknown quantities a'_{ij}. However, the weights 1/S(x,y) are also unknown. To obtain a solution in practice one may use the values 1/S'(x,y), which can be fair estimates of the required weights. More will be said about the use of weights in a following section, but the theoretical discussion will proceed with the assumption that the true weights are obtainable. We are thus in a position to solve for the estimates a'_{ij}. In general this solution requires the inversion of an n_x n_y by n_x n_y matrix.

The solution has certain statistical properties which are best derived by the use of matrix algebra. In the following treatment it will be convenient to divide matrices into smaller submatrices. To avoid constant use of cumbersome spatial matrix arrangements a special type of notation was adopted. Matrices will be represented by enclosing a general element between double vertical bars (single bars being reserved for determinants). Two superscripts outside the bars will indicate the number of rows and columns. Curly braces will be used to denote a matrix element, whether it be just a number or a submatrix. For example, we define the matrix

Y = || {Y_j(y)}_{y,j} ||^{N_y, n_y} =
    | Y_1(1)    Y_2(1)    ...  Y_{n_y}(1)    |
    | Y_1(2)    Y_2(2)    ...  Y_{n_y}(2)    |
    | ...................................... |
    | Y_1(N_y)  Y_2(N_y)  ...  Y_{n_y}(N_y)  |.   (14)

The subscripts y and j indicate the row and column of the general element. Carrying this notation one step further we define a matrix of submatrices

Z = || { || {X_i(x) Y_j(y)}_{y,j} ||^{N_y, n_y} }_{x,i} ||^{N_x, n_x}.   (15)

The superscripts N_x and n_x refer to the number of rows and columns of submatrices. For elements the number of rows and columns is N_x N_y and n_x n_y, respectively. The spectrum S(x,y) can be represented by the vertically arranged vector (one-column matrix)

S = || { || {S(x,y)}_{y,1} ||^{N_y, 1} }_{x,1} ||^{N_x, 1}.   (16)

Likewise, for the coefficients a_{ij} we use the vector

A = || { || {a_{ij}}_{j,1} ||^{n_y, 1} }_{i,1} ||^{n_x, 1}.   (17)

These definitions enable us to write model eq. (3) in the simplified matrix notation

S = Z A.   (18)

To rewrite the normal eqs. (13) we make use of two further definitions. Let D' be the vector of residuals

D' = S' - Z A',   (19)

where the primes on S and A indicate the corresponding statistical vectors. We also introduce the diagonal matrix of weights

W = || { || {δ_{yy'} / S(x,y)}_{y,y'} ||^{N_y, N_y} δ_{xx'} }_{x,x'} ||^{N_x, N_x}.   (20)
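In numpy terms, the block matrix Z of eq. (15) is the Kronecker product of the response matrices X = ||{X_i(x)}|| and Y of eq. (14). The sketch below uses small synthetic Gaussian responses; it is meant only to make the ordering of rows (x,y) and columns (i,j) concrete, and the shapes are invented for the example.

```python
import numpy as np

def responses(n_channels, peaks, width=1.5):
    """Columns are synthetic line shapes normalized to unit sum; the small
    additive constant keeps every channel non-zero, as done in section 4."""
    ch = np.arange(n_channels)[:, None]
    r = np.exp(-0.5 * ((ch - np.asarray(peaks)) / width) ** 2) + 1e-3
    return r / r.sum(axis=0)

Nx, Ny = 12, 10
X = responses(Nx, [3, 8])               # N_x x n_x matrix of X_i(x)
Y = responses(Ny, [2, 7])               # N_y x n_y matrix of Y_j(y), eq. (14)

# eq. (15): row (x,y), column (i,j) holds X_i(x) * Y_j(y).
Z = np.kron(X, Y)

A = np.array([0.0, 100.0, 100.0, 0.0])  # coefficients a_ij stacked as in eq. (17)
S = Z @ A                               # model spectrum vector, eq. (18)
```

Row (x,y) of Z sits at index x·N_y + y and column (i,j) at index i·n_y + j, matching the nesting of eqs. (16) and (17).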

Condition (12) for maximizing the likelihood function now takes the form

D'^T W D' → min,   (21)

where superscript T indicates the transpose. The normal eqs. (13) are given by

Z^T W D' = 0.   (22)

If we define the design matrix

C = Z^T W Z,   (23)

then eq. (22) becomes

C A' = Z^T W S'.   (24)

To obtain a solution for A' we can multiply eq. (24) by the inverse of matrix C, provided such exists. The result is

A' = C^{-1} Z^T W S'.   (25)

Matrix C has some important properties which we shall now investigate. It is shown in textbooks on statistics that provided matrix Z (in eq. (23)) is made up of columns which form a set of linearly independent vectors (i.e. Z has rank n_x n_y), then C is a positive definite matrix. This means that det(C) > 0 and that the inverse C^{-1} exists. In order to prove that this requirement for Z is satisfied we shall have to assume that the n_x response functions X_i(x) and the n_y response functions Y_j(y) form two sets of linearly independent vectors. This assumption should always hold, since linear dependence indicates that at least one response function can be written as a linear combination of others and is therefore redundant.

First we shall assume that the columns of Z form linearly dependent vectors and then show that this leads to an impossible condition. The linear dependence in Z implies that we can find a non-zero vector

V = || { || {v_{ij}}_{j,1} ||^{n_y, 1} }_{i,1} ||^{n_x, 1}   (26)

such that

Z V = || {0} ||^{N_x N_y, 1}.   (27)

We can write the null vector Z V in more detail,

|| { || { \sum_{i=1}^{n_x} \sum_{j=1}^{n_y} X_i(x) Y_j(y) v_{ij} }_{y,1} ||^{N_y, 1} }_{x,1} ||^{N_x, 1} = 0^{N_x N_y, 1}.   (28)

If we look only at the subvectors indexed by x we can say that

|| {Y_j(y)}_{y,j} ||^{N_y, n_y} · || { \sum_{i=1}^{n_x} X_i(x) v_{ij} }_{j,1} ||^{n_y, 1} = 0^{N_y, 1}   (29)

for every x = 1, 2, ..., N_x. The second factor in eq. (29) cannot be a null vector for every x, since V is not a null vector and the X_i(x) are linearly independent. But then eq. (29) implies that the Y_j(y) are linearly dependent, which is not true. Hence we must reject the assumption that the columns of Z are linearly dependent, and our requirement for the quoted properties of matrix C is satisfied.

The inverse matrix C^{-1} is also positive definite, as can be seen by the use of the transformation V = C^{-1} U, so that

0 < V^T C V = U^T C^{-1} C C^{-1} U = U^T C^{-1} U.

Both C and C^{-1} are symmetric. In fact, we have

C^T = (Z^T W Z)^T = Z^T W Z = C.

Since matrix C has an inverse, the solution for A' in eq. (25) holds. The expectation value of A' is

E(A') = E(C^{-1} Z^T W S') = C^{-1} Z^T W E(S').

Since C^{-1}, Z^T and W are definite matrices, they can be excluded from expectation considerations. By the original assumption (8) the expectation in S' is just the vector of true values S = Z A. The final result is

E(A') = C^{-1} Z^T W Z A = A.   (30)

By virtue of eq. (30) we call A' an unbiased estimate of the true vector A. It remains to find the covariance matrix for A'. The normally distributed data vector S' has a diagonal covariance matrix given by

M_{S'} = W^{-1}.   (31)

The diagonal elements represent variances and are equal to S(x,y). Off-diagonal covariance elements are zero, since the components of S' are statistically independent. If we put G = C^{-1} Z^T W, then from eq. (25) we have

A' = G S'.   (32)

There is a theorem^2) which states that A' is also normally distributed and has a covariance matrix given by

M_{A'} = G M_{S'} G^T.   (33)

Accordingly we can write

M_{A'} = C^{-1} Z^T W W^{-1} W^T Z C^{-1} = C^{-1}.   (34)

Eqs. (25) and (34) provide a full least squares solution for A' together with its statistical properties.
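A minimal numerical check of eqs. (23)-(25) and (34) can be sketched as follows, again with synthetic Gaussian responses standing in for measured line shapes and an invented coefficient vector.

```python
import numpy as np

def responses(n_channels, peaks, width=1.5):
    ch = np.arange(n_channels)[:, None]
    r = np.exp(-0.5 * ((ch - np.asarray(peaks)) / width) ** 2) + 1e-3
    return r / r.sum(axis=0)            # normalized response columns

X = responses(12, [3, 8])
Y = responses(10, [2, 7])
Z = np.kron(X, Y)                       # design columns X_i(x)Y_j(y), eq. (15)
A_true = np.array([0.0, 100.0, 100.0, 0.0])
S = Z @ A_true                          # true spectrum, eq. (18)

W = np.diag(1.0 / S)                    # true weights, eq. (20)
C = Z.T @ W @ Z                         # design matrix, eq. (23)

# Simulated observation, eq. (8): normal deviates with variance S(x,y).
rng = np.random.default_rng(1)
S_obs = S + rng.normal(0.0, np.sqrt(S))

A_est = np.linalg.solve(C, Z.T @ W @ S_obs)    # normal eqs. (24)-(25)
cov_A = np.linalg.inv(C)                       # covariance matrix, eq. (34)
A_exact = np.linalg.solve(C, Z.T @ W @ S)      # noiseless check: recovers A_true
```

With noiseless data the solve step returns A exactly; with noise, the deviations of A_est from A_true are on the scale of the square roots of the diagonal of cov_A.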

It is possible to show that this solution represents in some sense the best possible estimate of the true vector A. We construct Fisher's information matrix^3) I according to the definition

I = || { -E( ∂^2 ln L / ∂A_{i,1} ∂A_{l,1} ) }_{i,l} ||^{n_x n_y, n_x n_y}.   (35)

The variables A_{i,1} are components of the vector

A = || {A_{i,1}}_{i,1} ||^{n_x n_y, 1}.   (36)

Carrying out the differentiation and some matrix algebra we obtain

I = Z^T W Z = C.   (37)

Let A have some other unbiased estimator A'' with a covariance matrix M_{A''}. Then there is a theorem^4) which states that for any vector V = || {V_{i,1}}_{i,1} ||^{n_x n_y, 1} we have

V^T M_{A''} V ≥ V^T I^{-1} V.   (38)

Consequently, since I^{-1} = C^{-1} = M_{A'}, we can write

V^T M_{A''} V ≥ V^T M_{A'} V.   (39)

An important implication of relation (39) is obtained by considering the scalar products

β' = V^T (A' - A),   (40)

β'' = V^T (A'' - A).   (41)

These two numbers have variances

D(β') = V^T M_{A'} V,   (42)

D(β'') = V^T M_{A''} V.   (43)

On the strength of inequality (39) we can write the result

D(β') ≤ D(β'').   (44)

Since this last relation is true for all vectors V, we can choose a particular vector whose only non-zero component is the jth component, V_{j,1} = 1. For this vector we have D(β') = D(A'_{j,1}) and D(β'') = D(A''_{j,1}), and hence

D(A'_{j,1}) ≤ D(A''_{j,1}).   (45)

Relation (45) permits us to assert that in the class of all unbiased estimates A'', the particular estimate A' has the smallest possible variance in all its individual components. It is in this sense that the vector A' is the best unbiased estimate of the true vector A.

Calculation of A' requires the inversion of the n_x n_y by n_x n_y matrix C. Obviously the number of lines (n_x and n_y) is severely limited by the necessity of keeping matrix C down to practical size. Fortunately it is possible to obtain another estimate A'', the calculation of which requires a repeated inversion of smaller matrices. This method is treated in the following section.

3. Solution by parts

The practical advantages of the solution by parts using model eqs. (6) and (7) have been discussed in the introduction. It has also been stated that this solution can be expected to yield estimates A'' different from the full solution estimates A'. According to inequality (45) the differences result in A'' having variances greater than those obtained in A'. In this section we shall present the formalism for the solution A'' and derive the covariance matrix M_{A''}.

Similar to definition (14) for Y, we introduce a matrix of response functions in the x-dimension

X = || {X_i(x)}_{x,i} ||^{N_x, n_x}.   (46)

In addition we shall need the data vectors

S'(y) = || {S'(x,y)}_{x,1} ||^{N_x, 1},   (47)

the vectors of intermediate coefficients

B'(y) = || {b'_i(y)}_{i,1} ||^{n_x, 1},   (48)

the diagonal matrix of weights

W(y) = || {δ_{xx'} / S(x,y)}_{x,x'} ||^{N_x, N_x},   (49)

and also the positive definite matrix

F(y) = X^T W(y) X.   (50)

Analogous to result (25) of the simultaneous solution, we obtain least squares estimates of the intermediate coefficients

B'(y) = F^{-1}(y) X^T W(y) S'(y).   (51)

It can easily be shown that the estimates B'(y) are unbiased [i.e. E(B'(y)) = B(y)] and that the covariance matrix for B'(y) is

M_{B'(y)} = F^{-1}(y).   (52)

This completes the fit to model eq. (6). Next we turn to eq. (7), which we use for fitting in the y-dimension.


In the set of n_x N_y coefficients b'_i(y) we will now have to fix the value of i and let y vary. Therefore we must rearrange them into a new set of vectors

B'_i = || {b'_i(y)}_{y,1} ||^{N_y, 1}.   (53)

Since all components of B'_i are calculated from statistically independent sets of data, the vector has a diagonal covariance matrix. In fact we have

M_{B'_i} = || { {F^{-1}(y)}_{i,i} δ_{yy'} }_{y,y'} ||^{N_y, N_y},   (54)

where {F^{-1}(y)}_{i,i} is the ith diagonal element of the inverse matrix F^{-1}(y). We define a matrix of weights

W_i = M_{B'_i}^{-1}   (55)

and a vector of estimates for the coincidence coefficients,

A''_i = || {a''_{ij}}_{j,1} ||^{n_y, 1},   (56)

where the double prime is used to distinguish these values from estimates obtained by the simultaneous solution. Use of model eq. (7) results in the solution

A''_i = K_i^{-1} Y^T W_i B'_i,   (57)

where the positive definite matrix K_i is defined by

K_i = Y^T W_i Y.   (58)

As before, we have the result that the solution is an unbiased estimate of the true values a_{ij} and that the covariance matrix is given by

M_{A''_i} = K_i^{-1}.   (59)

From (59) one can obtain all variances of the estimates a''_{ij} and also the covariances between terms with equal i. The other covariances are generally not zero, and we will now derive the full covariance matrix for all elements a''_{ij}. Joining all data vectors (47) together to form one long vertical vector we obtain the result

S'_0 = || {S'(y)}_{y,1} ||^{N_y, 1}.   (60)

We propose to find an n_x n_y by N_x N_y matrix P such that

A'' = P S'_0;   (61)

then the required covariance matrix will be given by

M_{A''} = P M_{S'_0} P^T,   (62)

where M_{S'_0} is the diagonal covariance matrix of the joined data vector,

M_{S'_0} = || { W^{-1}(y) δ_{yy'} }_{y,y'} ||^{N_y, N_y}.   (63)

Grouping the vectors of intermediate coefficients into

B' = || {B'(y)}_{y,1} ||^{N_y, 1},   (64)

the N_y least squares solutions (51) in the x-dimension can be expressed in one matrix equation,

B' = H S'_0,   (65)

where

H = || { F^{-1}(y) X^T W(y) δ_{yy'} }_{y,y'} ||^{N_y, N_y}.   (66)

Before we can group the solutions (57) for the y-dimension, we must rearrange the intermediate vector B' so that it consists of a string of vectors B'_i. This is accomplished by a rearranging matrix R such that

B'_0 = || {B'_i}_{i,1} ||^{n_x, 1} = R B'.   (67)

The desired matrix R is a permutation matrix of zeros and ones, which can be defined by

{R}_{(i,y),(y',i')} = δ_{yy'} δ_{ii'},   (68)

as substitution in (67) will readily show. To proceed with the grouping of solutions (57) into one matrix we introduce one more definition,

Q = || { K_i^{-1} Y^T W_i δ_{ii'} }_{i,i'} ||^{n_x, n_x}.   (69)

This enables us to write down the final answer for the vector of coincidence coefficients A'' = || {A''_i}_{i,1} ||^{n_x, 1},

A'' = Q R H S'_0,   (70)

from which we have

M_{A''} = Q R H M_{S'_0} H^T R^T Q^T.   (71)

After some matrix algebra, expression (71) can be reduced to

M_{A''} = || { K_i^{-1} Y^T U_{ij} Y K_j^{-1} }_{i,j} ||^{n_x, n_x},   (72)

where the diagonal matrix U_{ij} is defined by

U_{ij} = || { ρ_{ij}(y) δ_{yy'} }_{y,y'} ||^{N_y, N_y},   (73)

with

ρ_{ij}(y) = {F^{-1}(y)}_{i,j} [ {F^{-1}(y)}_{i,i} {F^{-1}(y)}_{j,j} ]^{-1}.   (74)

Consider now the diagonal submatrices of M_{A''}. Putting j = i we obtain

{K_i^{-1} Y^T U_{ii} Y K_i^{-1}}_{i,i} = K_i^{-1} Y^T W_i Y K_i^{-1} = K_i^{-1}.   (75)
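The two-stage procedure of eqs. (51) and (57) can be sketched as follows. Noiseless synthetic data are used so that the result can be checked against the known coefficients; the Gaussian responses, peak positions and coefficient values are stand-ins, not quantities from the paper.

```python
import numpy as np

def responses(n_channels, peaks, width=1.5):
    ch = np.arange(n_channels)[:, None]
    r = np.exp(-0.5 * ((ch - np.asarray(peaks)) / width) ** 2) + 1e-3
    return r / r.sum(axis=0)

X = responses(12, [3, 8])               # response matrix, eq. (46)
Y = responses(10, [2, 7])               # response matrix, eq. (14)
a_true = np.array([[0.0, 100.0],
                   [100.0, 0.0]])       # coincidence coefficients a_ij
S = X @ a_true @ Y.T                    # model spectrum S(x,y), eq. (3)

nx, ny, Ny = X.shape[1], Y.shape[1], Y.shape[0]
B = np.empty((nx, Ny))                  # intermediate coefficients b_i(y)
varB = np.empty((nx, Ny))               # diagonal elements of F^{-1}(y), eq. (54)
for y in range(Ny):                     # stage one: fits in the x-dimension
    w = 1.0 / S[:, y]                   # weights of eq. (49)
    F = X.T @ (w[:, None] * X)          # eq. (50)
    Finv = np.linalg.inv(F)
    B[:, y] = Finv @ (X.T @ (w * S[:, y]))     # eq. (51)
    varB[:, y] = np.diag(Finv)

a_est = np.empty((nx, ny))
cov_rows = []                           # K_i^{-1} for each i, eq. (59)
for i in range(nx):                     # stage two: fits in the y-dimension
    wi = 1.0 / varB[i]                  # weights W_i, eq. (55)
    K = Y.T @ (wi[:, None] * Y)         # eq. (58)
    a_est[i] = np.linalg.solve(K, Y.T @ (wi * B[i]))   # eq. (57)
    cov_rows.append(np.linalg.inv(K))
```

Each pass inverts only an n_x by n_x or n_y by n_y matrix, which is the practical point of the solution by parts.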

This result agrees with the covariance matrices given by eq. (59). The off-diagonal submatrices of M_{A''} give covariances which do not appear in expression (59), namely covariances between coefficients a''_{ij} having different values of i.

According to inequality (45) the variances in A'' are greater than or equal to the corresponding variances in the simultaneous solution vector A'. An exact comparison of variances can be made only after inverting the full solution matrix C. This, however, may be impractical in a routine calculation, since one of the reasons for going to the solution by parts is precisely to avoid such a matrix inversion. For this reason it is desirable to have some simple method of determining lower limits to the variances in A'. Using definition (23) and expressions (15) and (20) it is possible to write the design matrix C in terms of submatrices C_{ij},

C = || {C_{ij}}_{i,j} ||^{n_x, n_x},   (76)

where

C_{ij} = || { \sum_{x,y} [X_i(x) X_j(x) Y_p(y) Y_q(y) / S(x,y)] }_{p,q} ||^{n_y, n_y}.   (77)

It can be shown that for the variances D(a'_{kl}) in the vector A' the following inequality holds:

D(a'_{kl}) ≥ {C_{kk}^{-1}}_{l,l} ≥ 1/{C}_{i,i},   (78)

where the index i is given by i = (k-1)n_y + l. In eq. (78) the second term represents the lth diagonal element in the inverse of the kth diagonal submatrix C_{kk}. It is the stronger of the two given lower limits. The weaker limit [last term in eq. (78)] is just the reciprocal of the ith diagonal element of matrix C. These limits can be used to establish upper bounds on the magnification of variances introduced through the solution by parts. The following section presents examples of actual magnification and calculated upper limits.

4. Numerical calculations

Here we shall present some numerical results obtained using actual response functions of a 3″ × 3″ NaI(Tl) detector at 10 cm from a γ-ray source. The response functions X_i(x) and Y_j(y) were generated by a computer program supplied by Dr. N. P. Archer. The program interpolates between measured line shapes and permits specification of γ-ray energy and analyzer gain (keV/channel) at will. Numerical computations were performed on the IBM 7040 computer at McMaster University.

Calculations presented here fall into two main groups. First we shall compare the variances in solutions A' and A'' for a set of particular models. The models were varied so that the design matrix C gradually approached singularity. In the second part we construct a hypothetical decay scheme and the corresponding coincidence spectrum with appropriate statistical variations. Essential parameters will be calculated by the method of least squares to demonstrate the amount of information which can be extracted from initial data. A test and discussion of various weighting factors is also included.

The models chosen for the purpose of comparing variances were based on the simple scheme of two γ-rays in cascade. Response functions were the same for both dimensions and were generated using an assumed analyzer gain of 40 keV/channel. With the added specification that a_11 = a_22 = 0 and a_12 = a_21 = 100, the model coincidence spectrum can be written

S(x,y) = 100 [X_1(x) X_2(y) + X_2(x) X_1(y)].   (79)

A set of 13 models characterized by different response functions X_2(x) was used. Function X_1(x) was kept fixed with its photo-peak centered in channel 20. The photo-peak position of X_2(x) was varied from channel 30 to 20.1, as shown in column 1 of table 1. The spectrum field size used was 33 × 33 channels. With the given response functions (having shapes similar to the ones in fig. 2) the spectrum has some regions of zero content. Consequently the weighting function 1/S(x,y) contains singularities. This difficulty was eliminated by slightly modifying the response functions with the addition of a constant equal to 10^{-3}.

Variances of the full solution estimates a'_{ij} and the solution-by-parts estimates a''_{ij} were calculated for each model. Column 3 of table 1 gives the average variance

D̄(a'_{ij}) = (1/4) \sum_{i,j=1}^{2} D(a'_{ij}),

which is seen to increase quite rapidly as the two photo-peaks reach close proximity. The average ratio of variances

r = (1/4) \sum_{i,j=1}^{2} [D(a''_{ij}) / D(a'_{ij})]   (80)

is given in column 4. It is seen that this ratio is very close to unity, with the greatest departure being 5.4%.

The second column of table 1 provides a measure of the approach to singularity of design matrix C. Since C is positive definite and symmetric, the ratio of its determinant to the product of diagonal elements cannot be greater than unity. In equation form we have the singularity parameter

s = det(C) / \prod_i {C}_{i,i} ≤ 1.   (81)

TABLE 1
Comparison of variances.

Peak positions,  Singularity    Average variance  Average variance  Stronger       Weaker
channel nos.     parameter s    D̄(a'_ij)          ratio r           upper limit    upper limit
20, 30           0.465          130               1.030             1.227          1.502
20, 29           0.445          136               1.033             1.240          1.538
20, 28           0.424          143               1.036             1.255          1.582
20, 27           0.407          148               1.040             1.271          1.620
20, 26           0.421          144               1.041             1.262          1.601
20, 25           0.495          123               1.042             1.218          1.466
20, 24           0.590          103               1.040             1.167          1.339
20, 23           0.592          103               1.042             1.163          1.336
20, 22           0.319          154               1.054             1.295          1.661
20, 21           3.71 × 10^{-2}  702               1.036             2.19           5.13
20, 20.5         4.41 × 10^{-4}  8.05 × 10^3       1.004             6.68           47.2
20, 20.2         2.25 × 10^{-6}  1.23 × 10^5       1.0004            26.2           667
20, 20.1         2.2 × 10^{-7}   4.16 × 10^5       1.0001            46.9           2240

If C were diagonal this ratio would be equal to one. However, diagonality cannot be had in C with the present response functions, since they do not possess the required orthogonality. For photo-peak separations greater than one channel the value of s is near 0.5. As the separation is reduced, matrix C approaches singularity and the ratio s tends to zero. Uncertainties in least squares solutions become very large, but the ratio of variances r appears to approach unity.

Upper limits for the ratio r were calculated by the use of inequality (78). Column 5 of table 1 gives values of the stronger upper limit

l_1 = (1/4) \sum_{i,j=1}^{2} [D(a''_{ij}) / {C_{ii}^{-1}}_{j,j}].   (82)

The weaker limit given in column 6 is

l_2 = (1/4) \sum_{i,j=1}^{2} D(a''_{ij}) \sum_{x,y} [X_i^2(x) Y_j^2(y) / S(x,y)].   (83)

For most of the models tested both limits remain close to unity and therefore are useful. However, these limits provide little information when s approaches zero. This may be a small limitation, since in these cases the uncertainties are greater than the values of the least squares parameters.

Variances resulting from the two methods of solution were also calculated for the example given below, which has 4 γ-ray transitions and consequently 16 coincidence coefficients a_{ij}. The average variance ratio r was 1.01. It would thus appear that in most practical cases of coincidence spectra analysis the magnification of uncertainties introduced by the solution by parts is insignificant.

The hypothetical decay scheme used in the example of coincidence analysis is shown in fig. 1. There are four γ-ray transitions indexed from 1 to 4 in order of their energies 410, 600, 650 and 1250 keV. Intensities are indicated by the number of transitions during observation, 𝒩_i. The first transition γ1 is shown to have
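Both diagnostics introduced above, the singularity parameter of eq. (81) and the variance lower limits of eq. (78), are easy to evaluate once the design matrix C is at hand. The matrix below is a random positive definite stand-in, not one computed from response functions.

```python
import numpy as np

rng = np.random.default_rng(2)
nx, ny = 2, 2
M = rng.normal(size=(nx * ny, nx * ny))
C = M @ M.T + np.eye(nx * ny)           # stand-in positive definite design matrix

# Singularity parameter, eq. (81): equals 1 for a diagonal C and tends to 0
# as C approaches singularity (Hadamard's inequality gives s <= 1).
s = np.linalg.det(C) / np.prod(np.diag(C))

# Variance lower limits of eq. (78) for every component a_kl.
true_var = np.diag(np.linalg.inv(C)).reshape(nx, ny)    # eq. (34) variances
stronger = np.empty((nx, ny))
weaker = np.empty((nx, ny))
for k in range(nx):
    Ckk = C[k * ny:(k + 1) * ny, k * ny:(k + 1) * ny]   # diagonal submatrix C_kk
    stronger[k] = np.diag(np.linalg.inv(Ckk))           # {C_kk^{-1}}_{l,l}
    weaker[k] = 1.0 / np.diag(Ckk)                      # 1/{C}_{i,i}
```

The stronger limit requires inverting only the n_y by n_y diagonal blocks, so it remains cheap even when inverting the full C would not be.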

Fig. 1. Decay scheme used in example analysis of coincidence spectrum.

Fig. 2. Response functions of the four γ-ray transitions used in sample calculation.

an occurrence of 10^7 counts. Following this there are 2 branches. Transitions γ2 and γ3 in cascade have an occurrence of 70%, whereas the single γ4 takes up the remaining 30%. In order of increasing energy, the total efficiencies used were 0.294, 0.293, 0.293, 0.283 for detector x and 0.265, 0.265, 0.265, 0.254 for detector y. The parameter k in eq. (4) was fixed for 10% chance coincidences by setting k𝒩_1 = 0.1. Response functions were generated using the different gains of 20 keV/ch in the x-dimension and 25 keV/ch in the y-dimension. The set of functions for the y-dimension, Y_j(y), is shown in fig. 2. A data field of 3100 channels was used, with N_x = 62 and N_y = 50.

Model one-dimensional and two-dimensional (coincidence) spectra were calculated by the use of eqs. (1) through (4). To simulate experimental spectra, statistical deviations sampled from a normal distribution were added to the model. The distribution variances were equal to the model values in each channel. Resultant values were rounded off to the nearest integer to give the final channel content S'. If S' came out negative it was arbitrarily set to zero. When the model value S is large (say S > 5) then the distribution in S' is similar to a Poisson distribution (with ⟨S'⟩ = S) which obtains in actual counting experiments. This similarity becomes progressively better for increasing values of S.

The one-dimensional spectrum in the y-dimension, S'_y(y), is shown in fig. 3. All four response functions of fig. 2 show their presence in various intensities. A least squares calculation produced estimates a'_{yj} of 2.6528 × 10^6, 1.8537 × 10^6, 1.8556 × 10^6 and 7.6189 × 10^5 in order of increasing j. The largest fractional deviation
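The noise model described in this paragraph can be sketched as follows; the flat model spectrum is a stand-in for the one actually generated from the decay scheme.

```python
import numpy as np

def simulate_counts(model, rng):
    """Approximate Poisson counting statistics as described in the text: add
    normal deviates with variance equal to the model value in each channel,
    round to the nearest integer, and set negative channels to zero."""
    model = np.asarray(model, dtype=float)
    noisy = model + rng.normal(0.0, np.sqrt(np.maximum(model, 0.0)))
    return np.clip(np.rint(noisy), 0.0, None)

rng = np.random.default_rng(3)
model = np.full((62, 50), 400.0)        # stand-in model spectrum, N_x = 62, N_y = 50
observed = simulate_counts(model, rng)
```

For model values above a few counts the rounded normal deviates are a close stand-in for true Poisson sampling, which is the approximation the text relies on.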

Fig. 3. Singles spectrum corresponding to the decay scheme of fig. 1 and response functions of fig. 2.

from the model value ε_{yj}𝒩_j occurred in the first coefficient a'_{y1} and was 0.11%. The content of each channel in spectrum S'_y(y) is quite large (S' > 4000). Consequently the statistical deviations, which are of order √S', represent a small percentage change, too small to appear in fig. 3. With these "good" statistics the use of weighting factors 1/S' in the least squares calculation provided an adequate approximation to the true weights 1/S. A similar analysis of the spectrum S'_x(x) in the x-dimension yielded coefficients a'_{xi} of 2.9417 × 10^6, 2.0473 × 10^6, 2.0564 × 10^6 and 8.4843 × 10^5. The largest fractional deviation from the model value was 0.26% in the third coefficient. Assuming the detection efficiencies are known, these coefficients can be used to determine the intensities of each γ-ray transition. Using coincidence spectra it is possible to obtain the intensities 𝒩_i independently of the efficiencies ε_{xi} or ε_{yj}.

The generated coincidence spectrum S'(x,y) contains areas of very low channel content. For these channels 1/S' can be a very poor approximation of the true weight. An iterative process was used to obtain better weight estimates. Weights 1/S' were used in a preliminary least squares solution by parts. Channels with S' < 5 were ignored in the calculation. The ensuing coincidence coefficients were then substituted in model eq. (3) to obtain the improved weight estimates 1/S''. A second least squares solution by parts (rejecting channels with S'' < 5) yielded the final estimates a''_{ij}. Comparison of results obtained with various weight estimates is made at the end of this section.

Intermediate coefficients b'_i(y) are plotted in fig. 4. Each shown function of y corresponds to a spectrum in the y-dimension which is in coincidence with the ith transition recorded in detector x. Except for small
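A hedged sketch of the two-pass weighting scheme just described, using a full (rather than by-parts) fit for brevity; the responses, peak positions and coefficient scale are invented for the example.

```python
import numpy as np

def responses(n_channels, peaks, width=1.5):
    ch = np.arange(n_channels)[:, None]
    r = np.exp(-0.5 * ((ch - np.asarray(peaks)) / width) ** 2) + 1e-3
    return r / r.sum(axis=0)

def weighted_fit(Z, S_obs, w):
    """One weighted least squares pass for the linear model S = Z A."""
    C = Z.T @ (w[:, None] * Z)
    return np.linalg.solve(C, Z.T @ (w * S_obs))

def two_pass_fit(Z, S_obs, threshold=5.0):
    """Fit with weights 1/S', rebuild the model for improved weights 1/S'',
    and refit; channels below the threshold get zero weight (are ignored)."""
    w1 = np.where(S_obs >= threshold, 1.0 / np.maximum(S_obs, 1.0), 0.0)
    A1 = weighted_fit(Z, S_obs, w1)
    S_model = Z @ A1                    # substitute back into model eq. (3)
    w2 = np.where(S_model >= threshold, 1.0 / np.maximum(S_model, 1e-12), 0.0)
    return weighted_fit(Z, S_obs, w2)

Z = np.kron(responses(12, [3, 8]), responses(10, [2, 7]))
A_true = np.array([5.0e3, 1.0e4, 1.0e4, 5.0e3])
S = Z @ A_true                          # noiseless spectrum for a simple check
A_est = two_pass_fit(Z, S)
```

The second pass matters in low-content regions, where 1/S' is a poor stand-in for the true weight 1/S but the rebuilt model value is not.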


estimate k’ = 1.017 x IO* which is 1.79ib above the model value. For low counting rates this parameter is related to the analyzer resolving time r through the expression

IO5

IO4

2r = kT,

0’

I03.1C

Y

b’,(Y) IC

Y.10"

IC

Y. IO'

IO

1oa

where T is the duration of experiment. Diagonal elements of table 2 are expected to be zero. The degree of their departure from this value is a rough indication of the overall accuracy of the table entries. The large values of second, third and fourth elements in row 1 and column 1 indicate that transition y1 has cascade coincidences with all other transitions. Thus yr cannot have any competing branches. Further examination of table 2 reveals that transitions yZ and y3 are in cascade whereas y4 forms a separate branch. These conclusions are in agreement with the original decay scheme of fig. 1. For4 >=Nj it is possible to write 4j/(4Nj)

= JV’;ccij/(k;crijJQ = l/h;,

IO’

I I

IO

I

I

20

30

II

I

40

1

2 3 4

of values

-0.018

9.982 9.923 10.032

coefficients

b(‘(y).

TABLE 2 [Jlr~/(.Nd,)]

10.023 0.018 14.223 - 0.003

where xii is the fraction of transitions yi leading to transitions yj. Averaging elements with (i,j) equal to (1,2), (1,3), (1,4), (2,1), (3,l) and (4,1) we thus obtain the value 9.978 x 10m8 the reciprocal of which yields JV; = I .0022 x 107.

contributions due to chance coincidences, we see only transitions yZ, y3 and y4 in b;(y); y1 and y3 in b;(y); y1 and y2 in b;(y) and finally only yr in b;(y). These simplified spectra can be an advantage in studying response function shapes or searching for new transitions not included in the old model. In contrast, the one-dimensional spectrum of fig. 3 contains all response functions in their full intensities. Table 2 presents the 4 x 4 matrix x,/(&4) calculated with the aid of eq. (5). The value taken for chance parameter k was the average over all diagonal elements in matrix aij/(a,,aYj). The result was the Matrix

(85)

5(

Y .chon.

Fig. 4. Intermediate

(84)

1

x 10s.

3

4

9.932 14.276 -0.011

9.975 - 0.039 - 0.033

-0.041

0.010
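The averaging of the table 2 elements can be reproduced numerically. The matrix below is our reading of the table entries (the row/column layout is an assumption recovered from the surrounding text), in units of 10⁻⁸:

```python
import numpy as np

# Our reading of the table 2 entries N_ij/(N_i N_j), in units of 1e-8
# (the row/column layout is an assumption).
M = np.array([
    [-0.018,  9.982,  9.923, 10.032],
    [10.023,  0.018, 14.223, -0.003],
    [ 9.932, 14.276, -0.041, -0.011],
    [ 9.975, -0.039, -0.033,  0.010],
]) * 1e-8

# N'_1 from the six elements coupling gamma-1 with the other transitions.
pairs = [(0, 1), (0, 2), (0, 3), (1, 0), (2, 0), (3, 0)]
n1 = 1.0 / np.mean([M[i, j] for i, j in pairs])   # ~1.0022e7

# N'_2 = N'_3 from elements (2,3) and (3,2).
n2 = 1.0 / np.mean([M[1, 2], M[2, 1]])            # ~7.018e6
```

With these entries the reciprocal averages reproduce the quoted intensity estimates to within rounding.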

This result is only 0.22% above the model value 𝒩₁. Similarly from elements (2,3) and (3,2) we obtain the estimate 𝒩'₂ = 𝒩'₃ = 7.018 × 10⁶, which exceeds the model value 7 × 10⁶ by only 0.26%. An estimate of the intensity in the competing branch γ₄ is obtained by simply taking the difference 𝒩'₁ − 𝒩'₂.

Let us now examine the effects of various weight approximations. The principle of maximum likelihood reduces to the method of least squares only when the random multidimensional data vector belongs to a normal distribution. In nuclear counting experiments the content S' in a given channel belongs to a Poisson distribution with mean value S. This distribution has the variance E(S' − S)² = S. Strictly speaking, the least squares method applied to nuclear spectra does not give maximum likelihood solutions. The latter could be obtained in principle by maximizing the total likelihood function written as a product of Poisson distribution functions. This would result in cumbersome expressions whose solution requires iterative procedures. Fortunately, in the limit of large values of


S the Poisson distribution approaches a normal distribution with mean value and variance equal to S. When the channel contents S' are high we are thus able to obtain least squares solutions which are very close to maximum likelihood. The required weights are given by the reciprocals of the variance, 1/S. However, since the value of S is unknown, it is necessary to use some approximation. A commonly used weight is 1/S'. This can lead to some difficulties, especially where the counts are low. The value of S' may be zero, leading to an infinite weight even when the true weight 1/S is finite. It is thus necessary that all channels with zero content be excluded from the least squares calculation. Moreover, the weight expectation ⟨1/S'⟩ is generally different from the true weight 1/⟨S'⟩. Defining the Poisson distribution

    P(S,S') = (S^S'/S'!) e^(−S),    (86)

we have

    ⟨1/S'⟩ = Σ_{S'=Sₜ}^{∞} P(S,S')(1/S'),    (87)

where Sₜ is the lowest channel content not rejected from the least squares fit. Ratios of 1/S' to the true weight 1/S are shown by the solid curves in fig. 5. The six curves are for Sₜ = 1, 2, 3, 4, 5 and 10. It is seen that for most values of S the estimates 1/S' are on the average too high. For Sₜ = 1 the excess is almost 30% near S = 4 and gradually diminishes with increasing S. For S = 100 the excess is 1%. A probably better estimate of the weights is given by 1/(S' + 1). With this estimate it is not necessary to reject channels with zero counts. For a given S the weight expectation is

    ⟨1/(S' + 1)⟩ = Σ_{S'=Sₜ}^{∞} P(S,S')/(S' + 1).    (88)

[Fig. 5. Ratio of the average weight ⟨W'⟩ to the "true" weight 1/S as a function of the average channel content S = ⟨S'⟩. The solid curves are for weights W' = 1/S'; the dotted curves are for weights W' = 1/(S' + 1). Numerical labels refer to various values of the acceptance limit Sₜ.]

When no channels are rejected (i.e. Sₜ = 0) the ratio of ⟨1/(S' + 1)⟩ to 1/S is simply 1 − e^(−S), which rapidly approaches unity with increasing S. These ratios are shown in fig. 5 by the dotted curves, calculated for Sₜ = 0, 1, 2, 3, 4, 5 and 10. It is apparent that the averages ⟨1/(S' + 1)⟩ approach the true weights 1/S much faster than do the values ⟨1/S'⟩ of the solid curves.

Better weights than either of the two above estimates can be obtained by a process of iteration. Weights 1/S' or 1/(S' + 1) can be used in an initial solution for the least squares parameters. The latter can then be substituted in the model equation to obtain model spectrum estimates S'' for each point in the data field. A second least squares calculation can then be performed using the more accurate weight estimates 1/S''.

To test the various weighting schemes the previously discussed two-dimensional spectrum with 16 coefficients aᵢⱼ was used. Least squares solutions by parts were applied using the three varieties of weights and a number of limits Sₜ. A basis of comparison was obtained by defining the dispersion parameter

    d = Σ_{i,j=1}^{4} (a'ᵢⱼ − aᵢⱼ)²/σ²ᵢⱼ.    (89)
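The dispersion parameter of eq. (89) is a chi-square-like sum and is trivial to compute; a minimal sketch (names ours):

```python
import numpy as np

def dispersion(a_est, a_model, var):
    """Dispersion parameter d of eq. (89): sum of (a'_ij - a_ij)^2 / sigma_ij^2."""
    a_est, a_model, var = (np.asarray(v, dtype=float) for v in (a_est, a_model, var))
    return float(np.sum((a_est - a_model) ** 2 / var))
```

With 16 coefficients each exactly one standard deviation from its model value, d comes out to 16 — the value expected for a good fit.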

The same set of variances σ²ᵢⱼ was used in all calculations of d. It was obtained from the covariance matrices Kᵢ⁻¹ in the solution with weights W' = 1/S'' and limit Sₜ = 5. In a good fit d is expected to have a value near 16. Larger values will result when the estimates a'ᵢⱼ have excessive deviations from the model values aᵢⱼ. Results are presented in table 3. Column 1 shows the acceptance limit Sₜ, ranging from 0 to 25. Columns 2 and 3 give the number of channels which fell below the limit Sₜ and were excluded from the least squares calculation. Results in column 2 obtain in calculations using weights W' = 1/S' and W' = 1/(S' + 1), whereas column 3 corresponds to the weights W' = 1/S''.

Table 3
Test of various weighting functions.

                  No. of channels rejected            Dispersion parameter d
  Acceptance           out of 3100
  limit Sₜ     (limit on S')   (limit on S'')   W' = 1/S'   W' = 1/(S'+1)   W' = 1/S''
      0              0               0              -           55.0            -
      1            588             193            38.5          26.8          12.5
      5            750             570            25.3          24.9          13.6
     10            836             765            33.4          32.3          14.7
     15            890             835            14.9          14.7          14.9
     20            939             897            16.0          15.9          14.7
     25              -             947            13.8          13.6          14.7

The last three columns give the parameter d obtained with the various weights. The iterative weights 1/S'' were calculated from the results of an initial solution with W' = 1/S' and Sₜ = 5.

In all cases the weights 1/(S' + 1) appear superior to the weights 1/S' in that their associated dispersion parameter d has smaller values. As expected, the difference diminishes with increasing Sₜ. Also expected is the observed tendency for d to decrease at higher values of Sₜ. On the other hand, the iterative weights 1/S'' yield values of d relatively independent of Sₜ in the given range. It thus appears that in cases where channels with very low content must be used, the iterative weights 1/S'' are preferable. When Sₜ can be set at 15 or 20, the weights 1/(S' + 1) seem to be adequate.


5. Summary and conclusions

Two methods of solution for the coincidence coefficients aᵢⱼ have been presented. The first method is based on a direct least squares calculation using model eq. (3) and provides unbiased estimates of aᵢⱼ with the minimum possible variances. The second method introduces the intermediate coefficients bᵢ(y) of model eqs. (6) and (7) and is based on a solution by parts. It has been shown that this method too provides unbiased estimates of the coefficients aᵢⱼ; however, the associated variances are not minimal. Numerical calculations with a series of models indicate that the magnification of variances incurred through the solution by parts is negligible for the present application.

One practical advantage of the solution by parts is that it deals with considerably reduced design matrices. Thus the required matrix inversion is rendered less difficult. When the spectrum analyzed contains many components this simplification can result in a great reduction of computing time. In the given example with 4 γ-ray transitions a full least squares solution required 41 min of calculation on the IBM 7040 computer, whereas a solution by parts required less than 14 min.

The intermediate coefficients bᵢ(y) arising in the solution by parts are useful in that they represent simplified versions of the one-dimensional spectrum. These spectra supply additional insight into the problem and may provide a basis for possible revision of the least squares model employed. A numerical example was provided to illustrate the value of the coefficients aᵢⱼ. They can be used to determine competing branches and to calculate absolute decay rates independently of the detector efficiencies. A study of various estimates of the weighting function indicated that for a channel with content S' the estimate 1/(S' + 1) is superior to the estimate 1/S'. The best estimates are provided by the iterative weights 1/S'', where the results from a previous least squares calculation are used to calculate the model spectrum S''. Although the present theory was applied specifically to nuclear coincidence spectra, no such limitation need be imposed. The two least squares methods can obviously be applied to any two-dimensional model which can be written as a linear combination of product functions X(x)·Y(y).

The authors wish to acknowledge the financial assistance received from the Province of Ontario and the National Research Council.

References
1) R. L. Heath, R. G. Helmer, L. A. Schmittroth and G. A. Cazier, The calculation of gamma-ray shapes for sodium iodide scintillation spectrometers, U.S.A.E.C. Report no. IDO (1965).
2) C. R. Rao, Advanced statistical methods in biometric research (Wiley, New York, 1952) p. 53.
3) Yu. V. Linnik, Method of least squares and principles of the theory of observations (Pergamon Press, London, 1961) p. 84.
4) Ibid., p. 85.