Applied Radiation and Isotopes 70 (2012) 712–719
Contents lists available at SciVerse ScienceDirect
Applied Radiation and Isotopes journal homepage: www.elsevier.com/locate/apradiso
Coverage intervals according to MARLAP, Bayesian statistics and the new ISO 11929 for ionising radiation measurements Dieter Grientschnig a,n, Ignacio Lira b a b
Chemical Laboratories, B¨ ohler Edelstahl, Mariazeller Str. 25, 8605 Kapfenberg, Austria ´lica de Chile, Vicun ˜a Mackenna 4860, Santiago, Chile Department of Mechanical and Metallurgical Engineering, Pontificia Universidad Cato
a r t i c l e i n f o
a b s t r a c t
Article history: Received 27 May 2010 Received in revised form 25 July 2011 Accepted 8 December 2011 Available online 16 December 2011
The coverage intervals stipulated by ISO 11929 (2010) for estimating the uncertainty from ionising radiation measurements of replicate samples are compared with those of MARLAP ( ¼Multi-Agency Radiological Laboratory Analytical Protocols Manual) and of Bayesian statistics. The latter two intervals agree well despite their different concepts. Whereas for either of them the ratio of the length of the coverage interval and MARLAP’s standard uncertainty grows when the number of samples decreases, no such growth arises for the interval mandated by ISO 11929 (2010). It may therefore be too short (e.g. for three samples by a factor of approximately 2). & 2011 Elsevier Ltd. All rights reserved.
Keywords: Measurement uncertainty Bayesian data analysis Replicate samples Empirical standard deviation Extra-Poisson variation Gaussian distribution
1. Introduction A new edition of the international standard ISO 11929 for measurements of ionising radiation has come into force in March 2010, which replaces all eight parts of the preceding ISO 11929 standards series. Despite being mainly concerned with decision thresholds and detection limits, the new standard also includes the calculation of coverage intervals. Only the latter aspect of this standard will be investigated in the present paper. The study will consider the case of extra-Poisson variation determined by replicate measurements, thus contrasting a statistical analysis of low-level radioactivity measurement recently published in this journal by Heisel et al. (2009), which did not accommodate situations of such a kind. The new ISO 11929 (2010) is claimed to proceed from the principles of Bayesian statistics. It stipulates a procedure for calculating coverage intervals that differs from the one employed in parts 1–4 of the previous ISO 11929 standards series, which are based on conventional (i.e. frequentist) statistics. These parts were in line with the ‘‘Multi-Agency Radiological Laboratory Analytical Protocols Manual’’ (MARLAP, 2004) which in turn, according to its section 19.3, adopts the methods for evaluating measurement uncertainty laid down in the ‘‘Guide to the Expression of
n
Corresponding author. Tel.: þ43 3862 20 36498; fax: þ 43 3862 20 37430. E-mail address:
[email protected] (D. Grientschnig).
0969-8043/$ - see front matter & 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.apradiso.2011.12.017
Uncertainty in Measurement’’ (JCGM 100, 2008), subsequently simply called the GUM. In view of this equivalence of MARLAP and GUM, mostly references to the former document will be made in Sections 2–6 below, although all these references pertain to the latter as well. While the GUM (JCGM 100, 2008) mandates the propagation of uncertainties, its recently published Supplement 1 (JCGM 101, 2008) advocates the propagation of probability distributions instead. In the framework of GUM Supplement 1 such distributions are not a frequency distributions but representations of ‘‘state of knowledge’’ or, equivalently, of ‘‘degree of belief’’. In this sense, as shown by Elster et al. (2007), GUM Supplement 1 embodies a Bayesian approach, not one of conventional statistics. By contrast, MARLAP and GUM contain elements of both these branches of statistics. The means advocated in GUM Supplement 1 for the propagation of distributions is the Monte Carlo method. However, as demonstrated in a recent review (Lira and Grientschnig, 2010), for measurement models involving a small number of quantities the Monte Carlo approach can be replaced by analytical calculations followed by numerical integration. Therefore, the procedures of GUM Supplement 1 (JCGM 101, 2008) and of Lira and Grientschnig (2010) are equivalent and will hereinafter be referred to as the calculation ‘‘according to Bayesian statistics’’. As stated in the introduction of the GUM ‘‘it is often necessary to provide an interval about the measurement result that may be expected to encompass a large fraction of the distribution of values that could reasonably be attributed to the quantity subject
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
to measurement’’. The GUM requires the best estimate of the measurand to serve as the centre of this interval and the ‘‘expanded uncertainty’’ obtained with a chosen coverage factor to be its half-width. As mentioned in clause 6.2.2 of the GUM, the interval so defined has a certain ‘‘coverage probability’’ or ‘‘level of confidence’’. For this interval neither of the terms ‘‘coverage interval’’ or ‘‘confidence interval’’ nor any other short designation formed by placing a modifier in front of ‘‘interval’’ is introduced in the GUM, whereas the ‘‘International Vocabulary of Metrology’’ (JCGM 200, 2008) does attribute the term ‘‘coverage interval’’ to it. By contrast, the same kind of interval is termed ‘‘confidence interval’’ in the new ISO 11929 (2010). But this designation is not well-chosen because it has a specific meaning in conventional statistics, established by clause 1.28 of ISO 3534-1 (2006), which is not applicable in the context of Bayesian statistics. Therefore, we will adopt the terminology of the ‘‘International Vocabulary of Metrology’’ instead and use the term ‘‘coverage interval’’ throughout, regardless of whether it is determined according to MARLAP, Bayesian statistics or the new edition of ISO 11929 (2010). The objective of this study is to compare for a particular example of activity measurement, presented in Section 2, the coverage intervals that ensue from the three approaches at hand. Sections 3, 4 and 5 are devoted to dealing with the example according to MARLAP, Bayesian statistics and the new ISO 11929 (2010), respectively. The article concludes with a discussion of the results and a summary.
2. Example of specific
137
Cs activity measurement
Table 1 Limits of the 95% coverage interval of the specific MARLAP for the original data, based on K samples. Gross count data
137
Cs activity according to
Result and uncertainty
Coverage interval
K
nG
sG
a
u(a)
nA,eff
aL
aU
3 4 5 6 8 10
745.7 739.3 747.0 728.5 740.5 778.2
193.1 158.2 138.1 131.6 126.4 159.4
12.61 12.48 12.64 12.26 12.51 13.28
2.28 1.62 1.27 1.11 0.93 1.04
2.02 3.06 4.15 5.23 7.48 9.54
2.89 7.38 9.16 9.45 10.34 10.94
22.34 17.59 16.13 15.08 14.68 15.62
Specific activity given in Bq/kg.
Table 2 Limits of the 95% coverage intervals of the specific 137Cs activity according to Bayesian statistics and to the new ISO 11929 for the original data, based on K samples. K
3 4 5 6 8 10
Bayesian statistics
New ISO 11929
c
aL
aU
aA
ax
1.0158 1.0023 1.0003 1.0001 1.0000 1.0000
4.11 7.43 9.14 9.43 10.33 10.93
21.13 17.55 16.16 15.10 14.70 15.64
8.14 9.30 10.15 10.09 10.69 11.23
17.09 15.67 15.14 14.44 14.33 15.33
Specific activity given in Bq/kg.
2.1. Original data The following example derives from a scenario considered by Michel and Kirchhoff (1999) of measuring the specific 137Cs activity of a batch of waste matter that despite reduction to small pieces and mechanical homogenisation is still fairly inhomogeneous with respect to activity. The average activity per mass of this waste shall be determined on the basis of several samples taken from different spots of the batch. Up to ten samples are assumed to be collected and subjected to a single-channel gross counting measurement with a counting time tG ¼3600 s for each sample. By ‘‘counting time’’ we refer to the measuring system’s live time, the uncertainty of which is assumed to be negligible. The gross counting results adopted from Michel and Kirchhoff (1999) are nG,i ¼ ð847,523,867,720,778,636,881,672,1102,756Þ,
713
i ¼ 1,:::,10: ð1aÞ
In order to study the effect of the number of samples on the width of the coverage interval of the resulting specific activity, either all 10 counting results are accounted for or just part of them from the beginning of the series, i.e. nG,i, i¼ 1,...,K, with K set to 3, 4, 5, 6, 8 or 10. The means nG and empirical standard deviations sG for these six numbers K of individual counting results are displayed in Table 1, which along with Table 2 also shows the coverage intervals ensuing from these data. Given that the standard deviations exceed by far the square roots of the means nG , the scatter of the results cannot be solely due to Poisson counting statistics but must originate predominantly from other causes, such as random effects of sampling and sample treatment. Therefore it seems reasonable to assume, as Michel and Kirchhoff (1999) did, that the gross counts originate from a Gaussian distribution of unknown variance. The quantity estimated by means of the gross counts in conjunction with the corresponding counting time tG is the gross count rate G. Following the scenario of Michel and Kirchhoff (1999), the gross effect measurements are complemented with counting the
radiation of 10 different blank samples, allotting a counting time of 7200 s for each of them. The standard deviation of these 10 counting results agrees well with the square root of the mean counts observed, so that Poisson counting statistics can be assumed to apply. Thus, for the sake of simplicity we consider only the sum nB ¼2561 of the counts for the 10 blank samples and their total counting time tB ¼72,000 s. The fact that this number of blank counts is supposed to be drawn from a Poisson distribution enables us to forgo the empirical determination of the variance suggested by Michel and Kirchhoff (1999). The quantity inferable from the information described is the blank count rate B. The third quantity required is the calibration factor F, obtained by dividing the portion of the count rate originating from transformations of 137Cs in the sample through its specific 137Cs activity. This factor equals the product of the efficiency, i.e. the fraction of transformations of 137Cs leading to counts, and the mass of the sample. However, due to attenuation effects within the sample, its mass may have a bearing on the efficiency. Therefore, a fixed sample mass is considered, so that it can be fused with the respective efficiency to the calibration factor F. The estimate f of this factor is supposed to originate from an earlier calibration utilising a sample of known specific 137Cs activity, whose mass of 0.800 kg concurs with that of the waste samples at hand. Adopting from Michel and Kirchhoff (1999) the efficiency of 0.017 obtained in this way leads to f¼0.0136 s 1 Bq 1 kg. Instead of taking this as the perfectly known value of the factor F we accommodate its uncertainty by assuming that only its lower and upper limits, f Df and fþ Df, are given, where Df¼ 0.0003 s 1 Bq 1 kg, and that all values within these limits are equally likely. This corresponds to adopting a rectangular probability distribution for F. 2.2. Modified data The original data for the example are such that the uncertainties arising from the blank measurement and the calibration
714
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
factor are almost negligible in comparison with the uncertainty caused by the variation of the activities of the samples. To contrast this scenario we also deal with a synthetic situation that yields virtually the same estimates of the samples’ activities, but whose specification apart from that is markedly different. In the modified scenario the rectangular probability distribution for F has the same mean as in Section 2.1 but a half-length of Df¼0.0008 s 1 Bq 1 kg, the counting times are tG ¼tB ¼720 s, the corresponding list of gross counts reads nG,i ¼ ð170,105,174,144,156,127,176,134,220,152Þ,
i ¼ 1,:::,10, ð1bÞ
and nB ¼ 26 blank counts are observed. For these modified data the uncertainties originating from the blank measurement and the calibration factor are both half as large as the uncertainty related to the gross counts in the case K ¼10 and therefore have a significant impact even when the number of samples employed is only moderate. 2.3. Measurement model The specific activity A is determined by utilising its ‘‘mathematical model of the measurement’’ (JCGM 100, 2008) A¼
GB : F
ð2Þ
The new standard ISO 11929 (2010) mandates that a coverage interval of this quantity should only be calculated if the hypothesis of zero specific activity is disproved, signifying detection of 137Cs activity. For the example considered this condition is fulfilled in the case of the original as well as the modified data because it will become obvious below that 137Cs activity would actually be detected with respect to any reasonable error of the first kind.
¼f
1
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u2 ðgÞ þ u2 ðbÞ þa2 u2 ðf Þ,
ð3Þ
where g, b and f are assumed to be uncorrelated and thus no covariance terms arise. The standard uncertainty u(g) is obtained by a ‘‘Type A’’ evaluation and equals the estimated standard deviation of the mean of the individual gross count rate results observed, viz. vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u K X 1 sG 1u 1 uðgÞ ¼ pffiffiffiffi ¼ t ð4Þ ðn nG Þ2 , tG K t G KðK1Þ i ¼ 1 G,i where the use has been made of the above-mentioned assumption that the uncertainty of the counting time is negligible. The letter K denotes the number of samples included in the evaluation of the gross count rate estimate, from which the number of degrees of freedom nG ¼K 1 of u(g) results. Estimating the standard uncertainties u(b) due to Poisson counting statistics and u(f) associated with a rectangular distribution are instances of ‘‘Type B’’ uncertainty evaluation as pointed out in MARLAP (2004). Heeding the recommendation therein pertaining to numbers of counts not very close to zero, we take pffiffiffiffiffi nB uðbÞ ¼ , ð5Þ tB and employ the formula for the standard uncertainty ensuing from a rectangular distribution,
Df uðf Þ ¼ pffiffiffi : 3
ð6Þ
The degrees of freedom nB of u(b) can be estimated by (JCGM 100, 2008; MARLAP, 2004)
nB ¼
1 u2 ðbÞ 1 nB ¼ : 2 VarðuðbÞÞ 2 Varðn1=2 Þ
ð7Þ
B
3. Coverage intervals according to MARLAP A coverage interval according to MARLAP (2004) is centred on the estimate of the quantity of interest, which for the specific activity example is obtained by inserting the estimates of the gross count rate g ¼ nG =t G , the blank count rate b¼nB/tB and the calibration factor f¼0.0136 s 1 Bq 1 kg into Eq. (2), giving the values of a¼ (g b)/f shown in Table 1 for the original data and in Table 3 for the modified data. The construction of such a coverage interval is based on the principle of propagating uncertainties. In view of Eq. (2) this implies that the combined standard uncertainty u(a) of the estimate a is derived from the standard uncertainties u(g), u(b) and u(f) of the estimates g, b and f of the input quantities by sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 2 2 @a @a @a uðaÞ ¼ u2 ðgÞ þ u2 ðbÞ þ u2 ðf Þ @g @b @f Table 3 Limits of the 95% coverage interval of the specific MARLAP for the modified data, based on K samples. Gross count data
137
Result and uncertainty
Cs activity according to
Coverage interval
K
nG
sG
a
u(a)
nA,eff
aL
aU
3 4 5 6 8 10
149.7 148.3 149.8 146.0 148.3 155.8
38.7 31.8 27.7 26.5 25.4 31.8
12.63 12.48 12.64 12.25 12.48 13.26
2.38 1.75 1.43 1.29 1.14 1.24
2.36 4.11 6.58 9.27 16.31 18.69
3.76 7.67 9.21 9.35 10.08 10.67
21.50 17.30 16.08 15.16 14.89 15.85
Specific activity given in Bq/kg.
When the product of the true blank count rate and the blank counting time is 10 or larger, which can evidently be taken for granted in the current circumstances, the variance of the square root of the Poisson variable is very close to 1/4 (Bartlett, 1936) and Eq. (7) can be approximated by nB E2nB. If the lower and upper limits for the calibration factor F are so chosen that its value is most unlikely to lie outside the interval defined by these limits, a very large number of degrees of freedom can be assigned to u(f) (JCGM 100, 2008). Then, choosing nF-N represents a reasonable approximation, due to which the Welch– Satterthwaite formula simplifies to 4 1 u ðgÞ u4 ðbÞ nA,eff ¼ f 4 u4 ðaÞ þ , ð8Þ
nG
nB
resulting in effective degrees of freedom which are seen from Tables 1 and 3 to be close to nG ¼ K 1 for the original data but larger than that for the modified data. The expanded uncertainty required for constructing a coverage interval of the specific activity that has an approximate level of confidence of 0.95 can now be computed by multiplying the 0.975—quantile of Student’s t-distribution for nA,eff degrees of freedom with the standard uncertainty u(a). This expanded uncertainty is subtracted from or added to the estimate of the specific activity, yielding the lower limit aL and the upper limit aU of the coverage interval reported in Table 1 for the original data and in Table 3 for the modified data.
4. Bayesian coverage intervals Radioactivity measurements have been analysed with the help of Bayesian statistics in the current notation for many years
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
(Little, 1982), including instances that involved a calibration factor (Miller et al., 2002) as in the present case. The procedure we use in the following is modelled after Laedermann et al. (2005) and in particular after the equivalent approach of Lira and Grientschnig (2010). Naturally, Bayesian data analysis proceeds from the same measurement model as MARLAP (2004), i.e. Eq. (2), for the example at hand. However, in the Bayesian framework this equation does not serve for the calculation of the estimate of an output quantity from the estimates of the input quantities but for the propagation of probability distributions of the unknown true values of the quantities involved. Such a distribution has to be understood as a representation of the state of knowledge about the respective quantity, not as a frequency distribution expected to arise when a measurement is performed repeatedly (of which a sampling distribution is an important representative). The former kind of distribution is a function of a parameter value, designated by a lowercase Greek letter, whereas the latter is a function of an observed value, represented by a lowercase Latin letter. The probability density functions of both types of distributions are in the following denoted by the letter p and can be distinguished by their argument variables. The letter p will also be used to designate priors, which due to the absence of prior knowledge in this study are non-informative and do not constitute genuine distributions but improper (i.e. not normalizable) functions. To permit applying the measurement model already available, the joint state-of-knowledge distribution of its input quantities has to be established. We start with the distribution of the gross count rate. It represents the unknown location parameter g of K independent identical Gaussian sampling distributions ! 1 ððnG,i =t G ÞgÞ2 exp pðnG,i 9g, sÞ ¼ , i ¼ 1,:::,K, ð9Þ 2s2 ð2pÞ1=2 st G whose common scale parameter s, viz. the standard deviation, is also unknown. Since the values nG,i, i¼1,...,K are presupposed to be uncorrelated, their joint sampling distribution is simply formed by multiplying the individual probability densities p(nG,i9g,s), i¼1,...,K, which yields after rearrangement ! 1 ðK1ÞðsG =t G Þ2 þKðgðnG =t G ÞÞ2 : pðnG,1 ,:::,nG,K 9g, sÞp K exp s 2s2 ð10Þ Due to the absence of prior knowledge about g and s a noninformative prior shall be assigned. Today, the so-called reference priors are the most widely accepted choice for that purpose and they have already proved to be expedient in radionuclide measurement (Barrera et al., 2007). Reference priors for all common sampling distributions can be drawn from a catalogue (Yang and Berger, 1998), where for the Gaussian statistical model of both g and s unknown the entry is p(g,s)p1/s. According to Bayes’ theorem, the product of the prior and the sampling distribution of Eq. (10) is proportional to the posterior distribution of g and s, given the gross counts nG,i, i¼ 1,...,K. Integration over the parameter value s has been shown e.g. by Jeffreys (1961) to lead to a marginal distribution fulfilling 0
1 gðnG =t G Þ pðg9nG ,sG Þp@1 þ K1 sG =ðt G K 1=2 Þ
!2 1K=2 A ,
715
g Z0 and thus represents a truncated and renormalised tdistribution. The true value of the blank count rate B, designated by the variable b, constitutes the parameter of a Poisson sampling distribution pðnB 9bÞ ¼
ð bt B Þ n B expðbt B Þ, nB !
ð12Þ
which is discrete and thus a probability mass function. Multiplying it with the pertinent reference prior p(b)pb 1/2 (Yang and Berger, 1998), applying Bayes’ theorem and normalising gives the posterior distribution of b pðb9nB Þ ¼ t B
ðbt B ÞnB 0:5 expðbt B Þ ¼ t B Gaðbt B 9nB þ0:5, 1Þ, GðnB þ 0:5Þ
ð13Þ
where Ga(btB9nB þ0.5, 1), in conformity with the definition by Bernardo (2003), is a gamma distribution of btB with shape parameter nB þ0.5 and scale parameter 1. Since the only information IF about the value j of the calibration factor F consists of the knowledge that this value is contained between f Df and fþ Df, its state-of-knowledge distribution supported on that interval is uniform, viz. pðj9IF Þ ¼
1 : 2Df
ð14Þ
With the individual distributions of the input quantities now available in the form of Eqs. (11), (13) and (14), the joint posterior distribution of these quantities can be established. While the distribution of the calibration factor is independent of both count rate distributions, the latter two are mutually dependent. The reason is that the true count rate values are related by g Z b. Lira and Grientschnig (2010) addressed this dependence by considering a conditional distribution. However, dependence of such a kind leading to zero probability density in a portion of the support can alternatively be taken into account by excluding the portion from the support where g o b. The latter approach avoids using a prior having zero probability density in certain parts of the support, which is considered inadvisable (Bernardo, 2003, p. 9). Applied to the current example it results in the joint state-ofknowledge distribution of the input quantities nG sG tB pðg, b, j9nG ,sG ,nB ,IF Þp St g , pffiffiffiffi ,K1 Gaðbt B 9nB þ 0:5,1Þ, 2Df tG tG K ð15Þ whose support is delimited by f Dfr j rfþ Df, b Z0 and g Z b. Next, a change of variable is performed to introduce the value a of the specific activity, thereby eliminating the value of one arbitrarily chosen input quantity. Which of them is expressed through the other quantities makes no difference for the marginal distribution of the specific activity to be calculated, although it has a bearing on the appearance of the formula encoding it (Lira and Grientschnig, 2010). For instance, if g is eliminated, the marginal distribution of a, supported on a Z0, takes the form pða9nG ,sG ,nB ,IF Þ Z Z 1 db ¼c 0
f þ Df
dj f Df
nG sG jt B St aj þ b , pffiffiffiffi ,K1 GaðbtB 9nB þ 0:5,1Þ 2Df tG tG K
ð16Þ ð11Þ
whose support determines of course its normalisation constant. If supported on the whole real line Eq. (11) constitutes Student’s tdistribution of location parameter nG =t G , scale parameter sG/(tGK1/2) 1 1=2 and K 1 degrees of freedom, i.e. Stðg9nG t 1 ,K1Þ in the G ,sG t G K notation of Bernardo (2003). However, since the gross count rate is a non-negative quantity the distribution is restricted to the support
The constant c following from the normalisation condition is close to unity if the posterior distributions of the gross and the blank count rates encoded by Eqs. (11) and (13) do not overlap significantly, because then the constraint g Z b has almost no effect. This situation prevails in the current example, and the larger the number K of samples becomes, the more closely it is met, as Tables 2 and 4 show for the original and the modified data, respectively. The probability density functions for these two
716
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
Table 4 Limits of the 95% coverage intervals of the specific 137Cs activity according to Bayesian statistics and to the new ISO 11929 for the modified data, based on K samples. K
Bayesian statistics
3 4 5 6 8 10
5. Coverage intervals according to the new ISO 11929
New ISO 11929
c
aL
aU
aA
ax
1.0161 1.0023 1.0003 1.0001 1.0000 1.0000
4.01 7.25 8.89 9.12 9.93 10.55
21.21 17.68 16.35 15.34 14.98 15.91
7.96 9.05 9.83 9.73 10.26 10.83
17.30 15.92 15.45 14.78 14.71 15.68
Specific activity given in Bq/kg.
0.4
p()/Bq-1 kg
0.3
0.2
0.1
0.0 8
10
12
14
16
18
/Bq kg−1 Fig. 1. Posterior distributions of the specific 137Cs activity a in the instance of K¼ 10 samples for the original data (solid line) and the modified data (dashed line).
sets of data in the case of K ¼10, corresponding to the normalised Eq. (16), are depicted in Fig. 1. The coverage interval [aL, aU] of the specific activity that shall be established has to satisfy Z aU aL
pða9nG ,sG ,nB ,IF Þda ¼ P,
respect. Therefore, these potential alternatives have no bearing on the exposition.
ð17Þ
where P signifies the desired coverage probability. Among the infinitely many intervals of coverage probability P, that of shortest length is preferable. It ensues from numerically minimising aU aL subject to the constraint composed of Eqs. (16) and (17), which results for P¼0.95 in the coverage intervals whose limits are presented in Tables 2 and 4. Note that considering the sampling distributions of the gross counts and the blank counts independently and assigning the respective reference priors for these basic statistical models is not the only existing Bayesian approach. Another option would be a genuine Bayesian reference analysis. In the situation at hand this requires establishing a joint sampling distribution for the gross and blank counts in terms of the specific activity and of nuisance parameters (Bernardo, 2003). Since the calibration factor is not precisely known but knowledge about it in the form of a probability distribution exists, a so-called ‘‘reference prior with partial information’’ according to Sun and Berger (1998) has to be constructed. Although this approach (which is beyond the scope of the current paper) may lead to slightly different coverage intervals than the one above, the dependence of their lengths on K will be virtually the same. Even choosing a uniform prior instead of a reference prior would not make a significant difference in this
The procedure of the new ISO 11929 (2010) for calculating coverage intervals combines elements of the two approaches described in Sections 3 and 4, respectively. It starts with the propagation of uncertainties from the input quantities to the output quantity in exactly the way specified in MARLAP. Thus, Eqs. (2)–(6) are utilised within the procedure of the new ISO 11929 (2010) as well, leading to the values a and u(a) called therein ‘‘primary measurement result’’ and ‘‘standard uncertainty associated with the primary measurement result’’, respectively. However, the notions of degrees of freedom, coverage factor and expanded uncertainty are absent from ISO 11929 (2010). Instead, it invokes the principle of maximum entropy to establish a Gaussian probability distribution of mean a and standard deviation u(a) supported on the whole real line. It is claimed to be a probability distribution of the true value of the specific activity although this quantity is known to be non-negative. Based on this unbounded distribution a decision threshold and a detection limit are established. By restricting this distribution to the interval [0,N] followed by renormalization, a different distribution for the purpose of calculating coverage intervals is introduced, which seems to be intended as an approximation to the Bayesian posterior distribution because it makes allowance for the knowledge that the specific activity is non-negative. The limits of the coverage interval ensuing from this truncated Gaussian distribution can be calculated analytically. In contrast to the preceding section, ISO 11929 (2010) stipulates probabilistically symmetrical coverage intervals, whose lower and upper limits are written as aA ¼ a þ zA uðaÞ and a x ¼ a þ z x uðaÞ, respectively, where zA and z x are expressed as quantiles of the standard normal distribution. Since the general equations for the probability values of these quantiles given in ISO 11929 (2010) are neither proved nor substantiated by references therein, they will be derived in the appendix. For the example at hand the ratio a/u(a) is so large that the truncation of the underlying Gaussian distribution at the origin resulting in its renormalization has virtually no effect, which leads to values zA and z x in good agreement with the (17P)/2—quantiles of the standard normal distribution. The ensuing limits aA and a x of the 95% coverage interval for the two sets of data are listed in Tables 2 and 4.
6. Discussion The calculation of Bayesian coverage intervals in Section 4 shows that for a simple measurement model, such as the one considered, obtaining the marginal posterior distribution of the output quantity by direct numerical integration is a viable alternative to approximating this distribution by a Monte Carlo method according to GUM Supplement 1 (JCGM 101, 2008). As clauses 7.7.2 and D.1 of that document reveal, calculation of a shortest coverage interval based on the result of a Monte Carlo study is so complicated that even establishing a continuous approximation to the distribution function for the output quantity may be indicated for that purpose. By contrast, the procedure utilised in Section 4 above permits such a coverage interval to be computed straightforwardly by minimising its length subject to the constraint of achieving the desired coverage probability. The lengths of all types of coverage intervals considered depend markedly on the empirical standard deviation sG, which may change with every further measurement result taken into
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
account, as Tables 1 and 3 confirm. In the given example dependence on K of the standard uncertainty u(a) ensuing from MARLAP (2004) pffiffiffiffi only originates from u(g) due to its proportionality to sG = K . In order to study exclusively the effect the number of measurements has on the length of the coverage pffiffiffiffi interval in excess of the aforementioned dependence on sG = K , we consider the ratio of its half-length hA and MARLAP’s standard uncertainty u(a). This ratio is depicted in Figs. 2 and 3 for all three types of coverage intervals as a function of K. For the original data Fig. 2 shows close agreement of the coverage intervals of MARLAP (2004) with those of the Bayesian approach for KZ4, although the underlying concepts are different. One conceptual difference is e.g. that the lower limit of the coverage interval according to MARLAP (2004) can become negative, which is excluded when either of the alternative procedures is used. The reason why nonetheless the agreement is so close can
10 8
K
6 5 4 3 0
1
2
3
4
5
hA /u(a) Fig. 2. Half-lengths hA of the coverage intervals of the specific 137Cs activity according to MARLAP (grey), Bayesian statistics (white) and the new ISO 11929 (black) expressed as multiples of MARLAP’s standard uncertainty u(a) for the original data, when counting results of K replicate samples are employed for estimating the gross count rate.
10 8
K
6 5 4 3 0
1
2
3
4
5
hA /u(a) Fig. 3. Half-lengths hA of the coverage intervals of the specific 137Cs activity according to MARLAP (grey), Bayesian statistics (white) and the new ISO 11929 (black) expressed as multiples of MARLAP’s standard uncertainty u(a) for the modified data, when counting results of K replicate samples are employed for estimating the gross count rate.
717
be explained as follows. When the original data apply, the uncertainty of the specific activity originates overwhelmingly from the determination of the gross count rate. Therefore, the distribution of the specific activity expressed by Eq. (16) is similar to and only slightly broader than the gross count rate distribution transformed with the help of the approximate relation g ¼ af þnB t 1 B , which yields a t-distribution of K 1 degrees of 1/2 1 freedom shifted by ðnG t 1 f), G nB t B Þ=f , scaled by sG/(tGK restricted to a Z0 and renormalised. The two distributions are almost indiscernible even in the case K¼10, where the dominance of the uncertainty contribution of the gross count rate is least pronounced. In the absence of prior information, a Gaussian sampling distribution of unknown scale parameter is known to have the exceptional feature that the coverage intervals ensuing from it by a Bayesian analysis in conjunction with a reference prior are numerically identical to the confidence limits it yields under the frequentist paradigm (Bernardo, 2003). However, this is only true if the Bayesian posterior distribution involved is unbounded or if the restriction of its support has no significant effect. The latter applies when K Z4, as the normalisation constants in Table 2 reveal. By contrast, for K¼ 3 the restriction to a Z0 comes to bear and consequently the coverage interval obtained by the Bayesian approach in this instance differs somewhat from that according to MARLAP (2004). This is seen to be the reason for the difference observed if one compares the coverage interval of the above-mentioned approximate t-distribution for K ¼3 with that of its counterpart restricted to the positive half-axis and renormalised. In the case of MARLAP’s coverage interval the ratio hA/u(a) equals just the coverage factor, i.e. the 0.975—quantile of Student’s t-distribution for the (generally non-integer) degrees of freedom nA,eff, thus accounting for the statistical uncertainty that arises from a small number of measurements. In the Bayesian approach this statistical uncertainty is inherently allowed for and therefore does not necessitate the concept of degrees of freedom (Kacker and Jones, 2003). Contrary to that the coverage interval stipulated by the new ISO 11929 (2010) disregards this statistical uncertainty completely. Whereas this interval approaches the lengths of those of the other two procedures in the case of a large number of observed gross count values, it is unrealistically short if the calculation of the interval is based only on few gross count measurements. If the uncertainty is primarily due to replicate sampling from a Gaussian statistical model of unknown variance (as in the situation of the original data), the length of the coverage interval pffiffiffiffihas to increase in excess of the proportionality relation hA p1= K when the number K of measured values decreases. Otherwise the coverage interval would not make allowance for the fact that the uncertainty associated with the estimate of the standard deviation of the mean indubitably depends on K. In the case of the coverage interval advanced in the new ISO 11929 (2010) the ratio hA/u(a) equals almost exactly the 0.975-quantile of the Gaussian distribution only if K approaches infinity and is thus in line with MARLAP (2004) solely in this limiting case. For K¼3, however, this interval is too short by a factor of about 2 compared to both other procedures. Therefore, constructing a coverage interval in this way is not advisable if K falls short of, say, 10. The new ISO 11929 (2010) and the section ‘‘Additional remarks’’ of a concomitant paper (Weise et al., 2009) are not the first instances where this problem arises because the standards series now replaced comprised formerly ISO 11929-7 (2005), where the concept underlying the new standard was introduced and which was affected by the same shortcoming. Actually, this flaw is contained already in an earlier paper (Weise ¨ and Woger, 1993), where the discrepancy relative to the corresponding result of conventional statistics was explicitly
718
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
mentioned but classified as a ‘‘slight difference’’. Despite the introduction of ISO 11929-7 in 2005, the procedure prescribed for calculating coverage intervals in situations like the one of the above example has until March 2010 been that of ISO 11929-2 (2000), now superseded by ISO 11929 (2010). Since in ISO 119292 (2000) the statistical uncertainty originating from a small number of measurements was taken into account by using a quantile of the t-distribution, it gave coverage intervals of reasonable lengths roughly corresponding to those according to MARLAP (2004) in the situation at hand. Consequently the replacement of ISO 11929-2 (2000) by the new ISO 11929 (2010) represents a change for the worse in that respect and leads to coverage intervals of markedly different lengths from now on. In a study by Calmet et al. (2008) of the precursor (ISO 11929-7, 2005) of the new ISO 11929 (2010), the difference of the results compared to MARLAP (2004) went unnoticed, since that investigation did not cover the case of repeated measurements under the influence of random effects of other than counting statistical nature. As annex F of ISO 11929 (2010) reveals, the shortcomings of the procedure of this new standard for computing a coverage interval can be attributed to using the principle of maximum entropy in a context where it should not be applied. This principle leads to a Gaussian probability distribution of mean g and standard deviation u(g) when these two pieces of information are provided (Jaynes, 1968; Lira, 2009). However, the situation at hand is different because the information available consists of the average nG =t G and its empirical standard deviation sG/(tGK1/2) resulting from the measurements of K samples. Consequently the statistical uncertainty has to be taken into account, especially if K is small, and no accepted procedure for applying the principle of maximum entropy in these circumstances exists. Finally, it seems worth mentioning some particular aspects related to the modified data. Their distinguishing features in comparison with the original data are the larger uncertainties associated with the blank count rate and the calibration factor. Therefore, larger uncertainty of the specific activity and longer coverage intervals are to be expected. Comparison of Tables 1 and 3 reveals that the first expectation indeed materialises. However, as to the lengths of the coverage intervals, the calculation according to MARLAP produces an anomaly: for K ¼3, 4, 5 the modified data lead to shorter coverage intervals of this kind than the original data. This originates from the fact that although the modification of the data causes an extension of the standard uncertainty, this is overcompensated by the reduction of the coverage factor that the effective degrees of freedom calculated by the Welch–Satterthwaite equation brings about. The other two kinds of coverage intervals do not show such illogical behaviour. Additionally, the increase of the degrees of freedom ensuing from the modification of the data implies that the ratio hA/u(a) displayed in Fig. 3 rises more moderately when K becomes smaller than in the case of the original data represented by Fig. 2.
7. Summary The procedure of the new ISO 11929 (2010) for computing a coverage interval from radiation measurements of K replicate samples in the case of extra-Poisson variations has been compared with the pertinent procedures of MARLAP (2004) and Bayesian statistics by means of a specific example. It was shown that the latter two calculations yield intervals whose lengths are in fairly good agreement. Given that in the example considered the major contribution to the uncertainty originates from K measurements conforming to a Gaussian statistical model of unknown variance, the lengths of the coverage intervals pffiffiffiensuing ffi from these two approaches increase in excess of 1= K if the
number of samples taken into account is reduced, thereby reflecting the statistical uncertainty that arises if only few samples are available. By contrast, the procedure mandated by the new ISO 11929 does not take this uncertainty into account, thus producing a coverage interval whose length is much shorter than those of the other two procedures when the number of samples measured is small.
Acknowledgements ¨ Special thanks are extended to Gunter Kanisch of the German ¨ ‘‘Johann Heinrich von Thunen Institut’’ for a private communication in which he raised the issue of the questionable procedure stipulated in the new ISO 11929 for processing results of repeated radiation measurements affected by extra-Poisson variations. The second author gratefully acknowledges the support of Fondecyt (Chile) research Grant 1095160.
Appendix The calculation of the lower limit aA and the upper limit a x of the coverage interval according to the new ISO 11929 (2010) proceeds from the Gaussian distribution with mean a and standard deviation u(a), truncated at the origin and renormalised by division through the value F(a/u(a)), where F designates the standard normal cumulative distribution function. By considering the portion of this distribution that extends from the lower limit of the coverage interval to infinity it is seen that the equation A 1 1þ P a a a ¼ 1F F ðA:1Þ 2 uðaÞ uðaÞ applies. Analogously, from the portion of the distribution between the upper limit of the coverage interval and infinity ensues x 1 1P a a a ¼ 1F F ðA:2Þ 2 uðaÞ uðaÞ hence the quantiles zA ¼ ðaA aÞ=uðaÞ and z x ¼ ða x aÞ=uðaÞ of the standard normal distribution associated with the limits of the coverage interval correspond to the probability values 1 F(a/ u(a))(17P)/2. With the help of those quantiles the limits of the coverage interval, aA ¼ a þ zA uðaÞ and a x ¼ a þ z x uðaÞ, are calculated. Note that the equations (29) and (30) of ISO 11929 (2010) representing the probability values of the quantiles zA and z x look differently and less symmetrically than those derived here, since the former equations are (i) not established in terms of the coverage probability P but of its complement and (ii) one of the relations underlying them is aA ¼ azA uðaÞ instead of aA ¼ a þ zA uðaÞ. Although by definition the areas under the distribution curve to the left and to the right of such a probabilistically symmetrical coverage interval are both equal to (1 P)/2, the limits of this coverage interval are not arranged symmetrically about the primary measurement result a. The smaller F(a/u(a)) is, the more pronounced this asymmetry gets. By contrast, the corresponding coverage interval of shortest length is symmetrical about the primary measurement result provided that the tail area of the distribution extending between 2a and infinity corresponds to a probability less than 1 P. If this condition, expressed by the inequality (F(a/u(a))) 1 1r1 P, is not fulfilled the shortest coverage interval becomes single-sided, because then its lower limit coincides with the physical limit of zero (Little, 1982; Roe and Woodroofe, 2001). But the values of a and u(a) in Tables 1 and 3 make evident that in the example at hand F(a/u(a)) is very close to unity. Under this condition the shortest coverage interval is
D. Grientschnig, I. Lira / Applied Radiation and Isotopes 70 (2012) 712–719
two-sided and symmetrical about the primary measurement result, i.e. the mode. The probabilistically symmetrical coverage interval then roughly concurs with it and is almost symmetrical about the mode, since the effect of the truncation of the distribution at the origin is insignificantly small. References ˜ ez-Lagos, R., Bernardo, J.M., 2007. A Bayesian Barrera, M., Romero, M.L., Nun approach to assess data from radionuclide activity analyses in environmental samples. Anal. Chim. Acta 604, 197–202. Bartlett, M.S., 1936. The square root transformation in analysis of variance. J. R. Stat. Soc. Suppl. 3, 68–78. Bernardo, J.M., 2003. Bayesian statistics. In: Viertl, R. (Ed.), Probability and Statistics, belonging to: Encyclopedia of Life Support Systems (EOLSS). , Eolss Publishers, Oxford /http://www.eolss.netS. Calmet, D., Herranz, M., Idoeta, R., 2008. Characteristic limits in radioactivity measurement: from Currie’s definition to the to the international standard ISO-11929 publication. J. Radioanal. Nucl. Chem. 276, 299–304. ¨ Elster, C., Woger, W., Cox, M.G., 2007. Draft GUM Supplement 1 and Bayesian statistics. Metrologia 44, L31–L32. Heisel, M., Kaether, F., Simgen, H., 2009. Statistical analysis of low-level screening measurements via gamma-spectroscopy. Appl. Radiat. Isot. 67, 741–745. ISO 3534-1, 2006. Statistics – Vocabulary and Symbols – Part 1: General Statistical Terms and Terms Used in Metrology. International Organization for Standardization, Geneva. ISO 11929, 2010. Determination of the Characteristic Limits (Decision Threshold, Detection Limit and Limits of the Confidence Interval) for Measurements of Ionizing Radiation—Fundamentals and Application. International Organization for Standardization, Geneva. ISO 11929-2, 2000. Determination of the Detection Limit and Decision Threshold for Ionizing Radiation Measurements—Part 2: Fundamentals and Application to Counting Measurements with the Influence of Sample Treatment. International Organization for Standardization, Geneva. ISO 11929-7, 2005. Determination of the Detection Limit and Decision Threshold for Ionizing Radiation Measurements—Part 7: Fundamentals and General Applications. International Organization for Standardization, Geneva. Jaynes, E.T., 1968. Prior probabilities. IEEE Trans. Syst. Sci. Cybern. 4, 227–241. JCGM 100, 2008. Evaluation of Measurement Data—Guide to the Expression of Uncertainty in Measurement (GUM 1995 with Minor Corrections). Joint Committee for Guides in Metrology, Se vres. /http://www.bipm.org/utils/ common/documents/jcgm/JCGM_100_2008_E.pdfS.
719
JCGM 101, 2008. Evaluation of Measurement Data—Supplement 1 to the ‘‘Guide to the Expression of Uncertainty in Measurement’’—Propagation of Distributions using a Monte Carlo Method. Joint Committee for Guides in Metrology, Se vres. /http://www.bipm.org/utils/common/documents/jcgm/JCGM_101_2008_E. pdfS. JCGM 200, 2008. International Vocabulary of Metrology—Basic and General Concepts and Associated Terms (VIM), third ed. Joint Committee for Guides in Metrology, Se vres /http://www.bipm.org/utils/common/documents/jcgm/ JCGM_200_2008.pdfS. Jeffreys, H., 1961. Theory of Probability, third ed. Clarendon, Oxford. Kacker, R., Jones, A., 2003. On use of Bayesian statistics to make the guide to the expression of uncertainty in measurement consistent. Metrologia 40, 235–248. Laedermann, J.-P., Valley, J.-F., Bochud, F.O., 2005. Measurement of radioactive samples: application of the Bayesian statistical decision theory. Metrologia 42, 442–448. Lira, I., 2009. The probability distribution of a quantity with given mean and variance. Metrologia 46, L27–L28. Lira, I., Grientschnig, D., 2010. Bayesian assessment of uncertainty in metrology: a tutorial. Metrologia 47, R1–R14. Little, R.J.A., 1982. The statistical analysis of low-level radioactivity in the presence of background counts. Health Phys. 43, 693–703. MARLAP, 2004. Multi-Agency Radiological Laboratory Analytical Protocols Manual. United States Environmental Protection Agency, Washington, DC /http:// www.epa.gov/radiation/marlap/manual.htmlS. Michel, R., Kirchhoff, K., 1999. Nachweis-, Erkennungs- und Vertrauensgrenzen bei ¨ ¨ Kernstrahlungsmessungen. TUV-Verlag, Koln. Miller, G., Martz, H.F., Little, T.T., Guilmette, R., 2002. Using exact Poisson likelihood functions in Bayesian interpretation of counting measurements. Health Phys. 83, 512–518. Roe, B.P., Woodroofe, M.B., 2001. Setting confidence belts. Phys. Rev. D 63, 013009. Sun, D., Berger, J.O., 1998. Reference priors with partial information. Biometrika 85, 55–71. ¨ ¨ Weise, K., Kanisch, G., Michel, R., Schlager, M., Schrammel, D., Taschner, M., 2009. Monte Carlo determination of the characteristic limits in measurement of ionising radiation—fundamentals and numerics. Radiat. Prot. Dosim. 135, 169–196. ¨ Weise, K., Woger, W., 1993. A Bayesian theory of measurement uncertainty. Meas. Sci. Technol. 4, 1–11. Yang, R., Berger, J.O., 1998. A Catalog of Noninformative Priors. Parexel International, Northbrook and Duke University, Durham /http://www.stats.org.uk/ priors/noninformative/YangBerger1998.pdfS.