324
NEUTRON
Nuclear Instruments and Methods in Physics Research A234 (1985) 324-330 North-Holland, Amsterdam
SPECTRUM
M. MATZKE
UNFOLDING
BY THE MONTE CARLO METHOD
and K. WEISE
Physikalisch -Technische Bundesanstalt, D - 3300 Braunschweig Fed. Rep. Germany
Received 24 September 1984
Bayes’ formalism is applied to neutron dosimetry problems. It is shown that this formalism and a special Monte Carlo algorithm (importance sampling random walk) can be used for parameter estimation in cases where the least-squares method fails. Three examples are discussed. The essential advantages of this method are that it makes possible the evaluation of confidence limits in neutron dose problems and the determination of the (unfolded) neutron spectrum from a set of a few reaction rates.
1. Introduction
the form
The determination of the spectral neutron fluence is one of the main problems in reactor neutron meteorology also called reactor dosimetry, see e.g. refs. [1,2]. Various algorithms exist which allow the spectral neutron fluence +E(E) to be unfolded from a set A of measured reaction rates. The integral equations describing the measurements are discretized and can be represented by an usually considerably underdetermined system of linear equations
xz= (A
A =S+,
(1)
where A = (A,, . . . , A,)T is the vector of the N experimental reaction rates, S is the reaction cross section (i.e. response) matrix with N rows and M columns (M > N). The element &, expresses the average cross section of the reaction i (i=l,..., N) in the energy interval E,_, sE
T indicates transposition of a matrix. The measured reaction rates A, are considered to be The superscript
estimates of the expectation values qA,]. Additional information is often available such as the empirical covariance matrix KA of the measured reaction rates and some a priori knowledge of +, usually specified by a vector 4, again considered to be an estimated expectation value. The covariance matrix KS is sometimes also approximately known. The various algorithms differ in the methods used to obtain and include this additional information. Most of them use the least-squares method and aim at a solution which minimizes an expression of 0168~9002/85/$03.30 0 Elsevier Science Publishers (North-Holland Physics Publishing Division),
B.V.
-
S+)TK,-l(A- sS>
+(+-4)TK;‘(+-i). An instructive review of the use of the least-squares method in spectrum unfolding can be found in ref. [2]. The least-squares method is a powerful tool in many unfolding problems. It may, however, lead to difficulties in three cases unfortunately frequently encountered; namely if 1) the a priori group fluence 4 or the reaction rates A contain large uncertainties, 2) the a priori information on 4 and K+ is incomplete, 3) constraints in addition to eq. (3) are specified. In the first case unphgsical results, such as negative elements of the group fluence vector (& < 0) may be obtained [3], the second case often leads to an ill-conditioned system of minimization equations of x2 and in the third case a system of non-linear equations from eq. (3) may arise, sometimes associated with numerical problems. It is the aim of this paper to propose an algorithm which removes these difficulties and leads not only to a solution vector I$ but also to a multivariate probability density of this vector (see also ref. [4]). This algorithm is based on the original ideas of Bayes and Laplace (see refs. [5,6]) and is a consequent application of the probability theory, it includes the least-squares method in principle. It consists mainly of a special Monte Carlo technique which is widely used in thermodynamic statistics (see e.g. refs. [7,8]). Monte Carlo methods have been used in spectrum unfolding before [9], but until now only minimization procedures connected with the least-
M. Matzke, K. Weise / Neutron spectrum unfolding squares method have been applied. The algorithm proposed here seems to be unknown in literature related to neutron dosimetry. Three problems arising in neutron dosimetry will be discussed, in all of which the usual least-squares method is not effective.
2. Bayes' theorem Bayes (see refs. [5,6]) was the first to propose a consistent procedure to combine a priori and measurement information. He suggested that a procedure to maximize the a posteriori probability should be performed in estimating parameters. This concept can be based on the two essential statements of Bayes and Laplace [6]: 1) the a posteriori probability is proportional to the a priori probability times the likelihood, 2) if no a priori information on the parameters is known, an uniform probability density of the parameters must be assumed [10]. Jeffreys [6] calls the first statement the "chief rule involved in the process of learning from experience" and the second "the principle of the equal distribution of ignorance". Since the method can be applied to a rather large class of physical problems it is attempted here to give a general description. It might, however, be instructive to bear in mind the reactor dosimetry problem which was introduced above, where the "parameters" to be determined are the spectral neutron group fluence values, and where the "observations" are the measured reaction rates. The construction of the a posteriori probability requires, according to Bayes, the assumption of an a priori probability and the likelihood. Given some observations A = (A 1. . . . . AN) T randomly drawn from a probability density, the aim is to evaluate parameters ~ = (~1 . . . . . ¢M), for which a model A i = fi(~b) (i = 1. . . . . N) is valid. In the example discussed in the introduction, this model fi is given by eq. (1). The parameters ~ may also be regarded as drawn from a probability density, the a priori and the a posteriori respectively. If there is no a priori information on the parameters ~ available, the second statement of Bayes and Laplace is used and for each parameter ¢v an interval is specified [~,, ~i,, ¢v, max] which contains ~ with certainty. However, all values of ~, are equally probable. If there is a priori information one is often able to restrict the parameters to a rather small interval or to assign a more detailed probability density for the parameter vector. There may, for instance, exist a previous calculation or estimation ~, with the covariance matrix K¢, so that one can assume approximately a normal multivariate density distribution for non-negative q~v values according to
325
Percy [21 and Dragt [111:
P0 ( ~ ) = C exp( - ½(~b -
~)TK~I(~
--
~b)),
(4)
where C is a normalization constant. The next step leading to the likelihood is more obvious. Considering the observations Ai as random with the expectation value fi(ck) with q) fixed, the corresponding probability density of the Ai must be taken as the likelihood. There is no restriction on the form of this distribution. In most cases one should also assume a multivariate normal probability density for the likelihood: PL(A Iq)) = C' e x p ( - ½(A --jr)TK~X(A
-.f)),
(5)
where C' is another normalization constant. Here, KA is the empirical covadance matrix of the measurements. After these a priori considerations the a posteriori probability density for the parameter vector follows from Bayes' theorem [5,6]: P ( ¢ I A ) = C"PL(A I¢) P0 ( ¢ ) ,
(6)
with the normalization constant C " = 1 / P ( A ) , where P ( A ) is a pure number. Eq. (6) is the fundamental equation of Bayes [6], which enables all intended operations to be performed for the parameter vector ~ e.g. the evaluation of a parameter vector which maximizes P(~IA). The latter is the so-called Bayes' estimator [5] of the parameter vector. Since the whole probability density of the parameter vector ~ is available for use, not only the most probable value can be evaluated but also an estimate of the expectation value of this parameter vector can be made. In addition, expectation values, standard deviations and confidence limits for functions of the parameter vector can be evaluated. This procedure, which seems to be more efficient than the evaluation of only the Bayes' estimator, is described in the next section.
3. Random walk and importance sampling Monte Carlo Bearing in mind the dosimetry problem, it should be the aim of the evaluation to determine the group fluence vector ~ as an expectation value of the a posteriori probability distribution. The covariance matrix of within this distribution is also required. It is expected that this covariance matrix will contain contributions of propagated uncertainties either from the observations (likelihood) or from the a priori information. To generalize the method, the expectation value of a function g(q0 must be evaluated, leading to the problem of an M-dimensional integration: E [ g ] = f P ( c k l A ) g ( ~ ) d q ~ l . . , doom,
(7)
which in general can only be solved by numerical methods. In thermodynamic particle statistics a well known
M. Matzke, 1(. Weise / Neutron spectrum unfolding
326
algorithm exists for the solution of M-dimensional integrals [7,8]: the importance sampling random walk. A complete description of this method is given by Wood [7], therefore only the essential features will be indicated here. The method consists in performing a random walk in the M-dimensional space spanned by the parameters. Each point in this parameter space is termed parameter vector. The random walk is started at an arbitrary point in this space and then certain Markov chain transition probabilities are used to produce a successive set of parameter vectors. The links of this chain are denoted by 4~(n). For the construction of such transition probabilities the special form of the distribution P ( ~ I A ) of eq. (6) is taken into account, so that parameter vectors can be avoided where Pn = P(q~(n)IA) is rather small. In practice one starts with a vector q~(0), which for reasons of computing time should originate from a region of high probability. At each step n the probability density Pn must be calculated. Then a trial state n + 1 (new parameter vector ~ ( n + 1)) is chosen randomly and uniformly (i.e. with equal probability for any state n + 1) from the possible states in the neighbourhood of ~(n). In the procedure most preferred for the selection of the neighbouring state, one randomly chosen component of the vector q~(n) is changed randomly with a maximum aUowed displacement [7]. The probability densities P,+z and P, are then compared. If P,+I > P, the trial state is accepted as a realization of the Markov chain. In the other case ( P , + I / P , <_ 1) a new random number ~ between 0 and 1 is compared with r = P , + I / P , . For ~ _< r the trial state is accepted while for ~ > r it is rejected and the old state n is taken as a new realization n + 1. In thermodynamic particle physics there exist several additional methods for an acceleration of the Markov process [8]. It can be shown in general [7] that this procedure allows averages to be calculated as T
1 = y E g ( 4 , ( n ) ) ,
(8)
where n indicates the Markov chain realization, and T is the number of trials. In addition the variance of g can be evaluated by means of
,,~ =
(9)
It can be shown that in the limit T--, oo these mean values converge asymptotically against the corresponding expectation values. The Monte Carlo bias of mean values may be obtained from the results of repeated Markov chains as known from theory of repeated sample measurements in physics [5].
4. Examples Three examples have been investigated by means of a computer program MIEKE written for the a.m. algorithm. The first one is a simple polynomial fit connected with problems of neutron source strength calibration in large rooms [12,13]. The counting rate Y~ of a long counter [12] as a function of the distance ri to a neutron point source can be written as yi r2 = epl + ep2ri + ~3r 2 ,
where the parameters qh, q'2, 4'3 describe contributions of the source strength, the air scattering and the room background [12] (i.e. the reaction rates in eq. (1) are replaced by Y,ri2 and the cross section matrix S~, by ri"). Here an ordinary least-squares calculation performed for 28 pairs (Y,, r~) measured by Kluge [13] resulted in a negative (unphysical) parameter q~2. The constraint q~2 > 0, however, requires a non linear least-squares calculation [5] combined with an iterative solution. The uncertainty of the parameters for this technique can only be approximately estimated [5]. By Bayes' formalism one may remove these difficulties, since the constraint q~2/>0 only reduces the parameter space for the a priori probability density. It has been found that this formalism can be applied to the example under consideration even in the bad case where one parameter has a standard deviation of more than 80%. The calculated results for this example are consistent with the measurements of Kluge [13] in the sense of a X z test. The second example is the estimation of parameters in connection with multi-channel peak analysis [14]. Using a Gaussian model for the peak structure A,=¢zexp(-(k-
q~2)2/ 4 b2 ),
(10)
where A , is the counting rate in channel k, non-linear minimization equations would yield from ordinary least-squares methods. Bayes' formalism proved to work well in this case too [14]. These two examples have been investigated mainly to check the consistency of the program. For routine analysis, there are various other much better and faster methods for solving these two simple problems. In fact, the computing time for running the program is more than one hour CPU time on a TR440 computer for each problem. However, in comparing the methods it becomes apparent that Bayes' formalism seems to be the best one, as is properly combines the a priori and a posteriori information and in the first example avoids unphysical results. This becomes more evident in considering the last example known from radiation protection dosimetry which, in principle, is the problem of reactor dosimetry introduced before. This problem can also be described by eq. (1). However, an additional equation links the
0""[-2'~0
03
1.0
I.S
2.0
2.5
¸
10-1
t I
,
I
I
100
,
, I , 101 E
t I , 102 i
I
103 r>.
,
10 ~'
, I
,
10 s
t I
,
106
, I
,
, I
10 ~ eV 108
,x c,
Fig. 1. Calculated spectral energy fluence ~bE(E) E (histogram) for spectrum 1. The smooth curve represents the values used for the simulation of the Bonner sphere reaction rates.
I
3.0
33
4.0
x 10 -2 Cm"2
4.5
~
-
10 0
'101
102 E Fig. 2. The same as fig. 1 for spectrum 2.
o.OOo10-I
0.01
0.02;
0.03
0.04
0.05
~ 0.06
0.07
0.08
0.09;
0.10
0.11
cm-2
0.12
103 t:::=,-
104
10 s
106
-,4
107 e V 101
328
f
M. Matzke, K. Weise/ Neutron spectrum unfolding evaluated:
Table 1 Relative standard deviations of the dose equivalent H Relative standard Spectrum I deviation of: reaction rates H
O.05 = foH'P( H)dH = f£TP( H)dH ,
10% 5% 1% 10% 5% 1% 34.4% 27.7% 22.0% 42.3% 38.4% 28.4%
H = RTep
(11)
where fir is the dose equivalent and R the corresponding phantom response vector [15]. In this problem one is not only interested in an average or an most probable value but also in limits of an interval which contains fir at a certain level of confidence [15]. Obviously Bayes' formalism provides not only the possibility of determining the expectation value fir but also enables one, under the assumptions (4) and (5), to calculate the whole probability density of fir which may be used for the estimation of confidence limits, A set of 8 reaction rates (count rates)A~ (i = 1. . . . . 8) is considered obtained by simulating measurements with Bonner spheres (diameter arbitrarily chosen) [16] in an "unknown" neutron fluenee. The uncertainty of the reaction rates is assumed to be known (standard deviations), the covarianee matrix K A is here taken as diagonal. The model eq. (1) for ~ is taken with 48 energy groups. The uncertainty of Si, is neglected. As nothing a priori about the group fluence vector is known (with the exception of ~ >_0), Bayes' formalism seems to be the most adequate here, due to the fact that 48 fluence values have to be determined from only 8 observations. Following the statements in sect. 2 the a priori probability density is constructed as a product of single step functions:
vo( ~ ) = CE( ~ I ) . . . e ( ~,8), 02) with E(x) = 1 for 0 < x < Xm~ and E(x) = 0 otherwise. A normal likelihood distribution is assumed:
PL(AI~,)= C' exp(-- ½(A - I ) T K ~ ' ( A - f ) ) ,
03)
with f = $~. The elements of the diagonal covariance matrix /(A are the squares of the standard deviations of the reaction rates. The mean values of the fluence vector components and their standard deviation can now be calculated according to eqs. (8) and (9). Importance sampling Monte Carlo random walk also provides the probability density of the dose equivalent:
RT,)),
(15)
Spectrum 2
group fluence vector to the quantity of main interest:
e( u) = (8( H -
329
(14)
where the parenthesis indicates the average (eq. (8)) and 8 ( x ) is the Dirac delta function. The 90% confidence interval of H may now be
leading to [ H 1, H2] as the desired interval. The MIEKE program has been tested taking the example of a 252Cf fission neutron spectrum with a 1/E contribution for intermediate neutrons (spectrum 1) as well as for the extreme case of a rather narrow neutron resonance at 25 keV including a small background contribution at other energies (spectrum 2). The only known information fed into the program was the simulated reaction rates Ai(i = 1, . . . , 8) and their standard deviations together with the constraint that each component of ~ should be non-negative and less than an estimated upper bound (eq. (12)). Three runs for each spectrum were performed with 1%, 5% and 10% standard deviation of the reaction rates, respectively. In figs. 1 and 2 the results of the spectrum evaluation for the 5% runs are shown. The agreement between expected spectral fluence values (i.e. the values used for the simulation of the reaction rates) and the calculated is rather good, although it should be mentioned that for this "few channel unfolding" the average fluence values have only statistical meaning since the uncertainty (relative standard deviation of the mean) reaches 70% to 100% in several energy groups. It may therefore be concluded that the a priori information is insufficient for the determination of the fluence vector. The evaluation of the dose equivalent probability density, however, shows the convenience of the program (figs. 3 and 4) even in the case where little a priori information is available. In table I the relative standard deviations of the dose equivalent for the several runs are listed. It can be concluded that even in the case of exactly known reaction rates there remains an uncertainty of the dose equivalent due to the lack of a priori information on the spectral fluence vector. The limits are 21% for spectrum 1 and 26% for spectrum 2, and it may be concluded that no improvement of the single reaction rate measurement can reduce this range of uncertainty. In figs. 3 and 4 the probability densities of the dose equivalent are shown for the two spectra under consideration. The form of the curves is asymmetrical leading to asymmetrical confidence intervals. The expectation value of H for the probability density P(H) and the value which results from the spectrum used for the simulation of the reaction rates are indicated together with the 90% confidence limits.
5. Conclusions The method presented here should be considered as an alternative procedure to ordinarily used least-squares
330
M. Matzke, K. Weise / Neutron spectrum unfolding
techniques. The advantages may be listed as follows: it handles the propagation of information exactly in the sense of statistics, - it works efficiently even in cases which lead to difficulties when using the least-squares method, in particular for n o n linear problems, - it seems to be an adequate method for application in cases where nothing is known a priori on the parameter vector, - it allows the determination of an a posteriori probability density of the parameters and of confidence limits, it can be used even in the case where the n u m b e r of parameters is much larger than the n u m b e r of observations (e.g. few channel unfolding). Unfortunately the computing time is at present considerable, particularly if a parameter vector of large dimensions occurs. Further investigations to accelerate the speed of the program seem to be necessary [17]. In particle physics such procedures are known as force bias technique [18], umbrella sampling [19] and preferential sampling [20]. The authors wish to thank A. Alevra for the data on the response functions of the Bonner spheres. Note added in proof: A similar formalism based on Bayes' theory with an alternative Monte Carlo technique has been presented at the 5th A S T M - E U R A T O M Symposium on Reactor Dosimetry in Geesthacht, F R G , September 24-28, 1984 by M. Nakazawa, N. Ueda, T. Taniguchi and A. Sekiguchi.
R e f e r e n c e s
[1] W.L. Zijp, E.M. Zsolnay, H.J. Nolthenius, E.J. Szondi, G.C.H.M. Verhaag, D.E. Cullen and C. Ertek, Report ECN-128 (INDC(NED)-7) (1983). [2] F.K. Percy, Proc. IAEA Technical Committee Meeting, Oak Ridge (1977) Report IAEA TECDOC-221 (1979) 195. [3] M. Matzke, Proc. 3rd ASTM-Euratom Symp. on Reactor Dosimetry, Ispra (1979) Report EUR 6813, vol. 2 (1980) 721. [4] M. Matzke, Annual Report of the Physikalisch-Technische Bundesanstalt, Braunschweig, ISSN 0340-4366 (1981) 188. [5] B.R. Martin, Statistics for Physicists (Academic Press, London, New York, 1971). [6] H. Jeffreys, Theory of Probability (Clarendon Press, Oxford, 1967). [7] W.W. Wood, in: Physics of simple Liquids, ed., H.N.V. Temperley (North-Holland, Amsterdam, New York, 1968) p. 115. [8] K. Binder (ed.) Application of the Monte Carlo Method in Statistical Physics (Springer, Berlin, Heidelberg, New York, Tokyo 1984). [9] K. O'Brien and R. Sanna, Nucl. Instr. and Meth. 91 (1971) 573. [10] S. Wagner, PTB-Mitt, 79 (1969) 343. [11] J.B. Dragt, J.W.M. Dekker, H. Gruppelaar and A.J. Jarlssen, Nucl. Sci. Eng. 62 (1977) 117. [12] J.B. Hunt, to be published in Rad. Prot. [13] H. Kluge, private communication, to be published. [14] K. Knauf, private communication, to be published. [15] B.R.L. Siebert, R. Hollnagel and R. Jahr, Phys. Med. Biol. 28 (1983) 521. [16] A. Alevra, to be published. [17] M. Matzke, B. Siebert, to be published as Report PTBFMRB. [18] D. Ceperlay, G.V. Chester and M.H. Kalos, Phys. Rev. B16 (1977) 3081. [19] G.M. Torrie and J.P. Valleau, Modem Theoretical Chemistry, vol. 5, ed., B.J. Berne (Plenum, New York, 1977). [20] J.C. Owicki and H.A. Scheraga, Chem. Phys. Lett. 47 (1977) 600.