Nuclear Data Sheets 120 (2014) 106–108
Method of Best Representation for Averages in Data Evaluation

M. Birch∗ and B. Singh
Department of Physics and Astronomy, McMaster University, Hamilton, Ontario, Canada L8S 4M1
∗ Corresponding author: [email protected]

A new method for averaging data for which incomplete information is available is presented. For example, this method would be applicable during data evaluation, where only the final outcomes of the experiments and the associated uncertainties are known. The method is based on using the measurements to construct a mean probability density for the data set. This "expected value method" (EVM) is designed to treat asymmetric uncertainties and has distinct advantages over other methods of averaging, including giving a more realistic uncertainty, being robust to outliers, and being consistent under various representations of the same quantity.
I. INTRODUCTION
Within much of the scientific literature only the final results of experiments, along with their associated uncertainties, are reported, as opposed to entire data sets being made available. Therefore, it is often the case that detailed statistical information concerning a quantity of interest (e.g. a half-life) is not available; however, this information is required to perform advanced statistical analysis on combinations of data sets (see e.g. [1]). We have derived a new method of averaging to be used in situations where the original data sets are not accessible (e.g. in nuclear data evaluation), which has the following advantages over other averaging methods: it generates a more realistic uncertainty, is robust to outliers, and is consistent under various representations of the same quantity.
II. SCOPE OF EVM
We make the following assumptions about the data to be averaged: the measurements are independent and accurate, and the uncertainties are well estimated. EVM is specifically designed to handle the case where only the final result of each experiment is available and not the original raw data. The final result of an experiment ($\mu \pm \sigma$) is often the average of many trials and can therefore be interpreted as giving the mean ($\mu$) and variance ($\sigma^2$) of the probability density function of a random variable associated with the measured quantity. Given only this information, according to the principle of maximum entropy, the best probability density one can assign is Gaussian [2]. In the case that the uncertainty is asymmetric ($\mu^{+a}_{-b}$) we generalize the Gaussian probability density as follows:

$$
A(x;\mu,a,b) =
\begin{cases}
\sqrt{\dfrac{2}{\pi(a+b)^2}}\; e^{-\frac{(x-\mu)^2}{2b^2}}, & x \le \mu \\[1.5ex]
\sqrt{\dfrac{2}{\pi(a+b)^2}}\; e^{-\frac{(x-\mu)^2}{2a^2}}, & x > \mu
\end{cases}
\qquad (1)
$$
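As a purely illustrative sketch (not part of the original paper), Eq. (1) can be coded directly; the function name and interface below are our own choices.

```python
# Illustrative sketch of the asymmetric Gaussian of Eq. (1).
# A value reported as mu with uncertainties +a/-b is modeled with width b
# below mu and width a above mu; the common prefactor keeps the density normalized.
import math

def asym_gauss(x, mu, a, b):
    """Probability density A(x; mu, a, b) of Eq. (1)."""
    norm = math.sqrt(2.0 / (math.pi * (a + b) ** 2))
    width = b if x <= mu else a
    return norm * math.exp(-((x - mu) ** 2) / (2.0 * width ** 2))
```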
III. THE EXPECTED VALUE METHOD (EVM)
For each experiment, the associated probability density is expected to be proportional to the distribution of frequencies which would result from repeating the experiment many times. To obtain the total frequency distribution given by the entire set of measurements, we simply sum over the expected frequencies given by each experiment; re-normalizing this sum would correspond to summing the probability densities. Let the mean probability density function, $M(x)$, for a data set $\{\mu_i{}^{+a_i}_{-b_i}\}_{i=1}^{n}$ consisting of $n$ measurements be given by

$$
M(x) = \frac{1}{n}\sum_{i=1}^{n} A(x;\mu_i,a_i,b_i). \qquad (2)
$$

The result of this averaging method, $x_E$, is obtained by computing the discrete expectation value of $M(x)$ sampled at each measured value $\mu_i$,

$$
x_E = \sum_{i=1}^{n} w_i\,\mu_i, \quad \text{where } w_i = \frac{M(\mu_i)}{\sum_{j=1}^{n} M(\mu_j)}. \qquad (3)
$$
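A minimal sketch of Eqs. (2) and (3), building on the asym_gauss function above; the measurement format (value, upper uncertainty, lower uncertainty) and the function names are assumptions of ours, not the authors' code.

```python
# Sketch of Eqs. (2)-(3); each measurement is a tuple (mu, a, b) = (value, +unc, -unc).
def mean_density(x, measurements):
    """M(x) of Eq. (2): the average of the individual densities."""
    return sum(asym_gauss(x, mu, a, b) for mu, a, b in measurements) / len(measurements)

def evm_average(measurements):
    """x_E and the weights w_i of Eq. (3)."""
    mus = [mu for mu, _, _ in measurements]
    m_at_mu = [mean_density(mu, measurements) for mu in mus]
    total = sum(m_at_mu)
    weights = [m / total for m in m_at_mu]
    x_e = sum(w * mu for w, mu in zip(weights, mus))
    return x_e, weights
```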
There are two estimates of the uncertainty associated with $x_E$, defined as the internal and external uncertainties. The internal uncertainty follows from a theorem on the variance of a probability density which results from a linear combination of normal distributions [3],

$$
\sigma_{\mathrm{int}+}^2 = \sum_{i=1}^{n} w_i^2 a_i^2 \quad \text{and} \quad \sigma_{\mathrm{int}-}^2 = \sum_{i=1}^{n} w_i^2 b_i^2. \qquad (4)
$$

The external uncertainty is calculated as the standard deviation of $M(x)$, again sampling only at the measured values,

$$
\sigma_{\mathrm{ext}}^2 = \sum_{i=1}^{n} w_i\,(\mu_i - x_E)^2. \qquad (5)
$$

To be conservative, the final quoted uncertainty should be the one for which the variance of the associated (asymmetric) normal distribution would be greater.
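Continuing the sketch above, the internal and external uncertainties of Eqs. (4) and (5) follow directly from the weights; choosing which of the two to quote, per the conservative rule just stated, is left to the user.

```python
# Sketch of Eqs. (4)-(5), using evm_average from the previous sketch.
def evm_uncertainties(measurements):
    x_e, w = evm_average(measurements)
    # Internal: propagation of the individual (asymmetric) uncertainties, Eq. (4).
    sigma_int_plus = math.sqrt(sum(wi ** 2 * a ** 2 for wi, (_, a, _) in zip(w, measurements)))
    sigma_int_minus = math.sqrt(sum(wi ** 2 * b ** 2 for wi, (_, _, b) in zip(w, measurements)))
    # External: weighted spread of the measured values about x_E, Eq. (5).
    sigma_ext = math.sqrt(sum(wi * (mu - x_e) ** 2 for wi, (mu, _, _) in zip(w, measurements)))
    return x_e, (sigma_int_plus, sigma_int_minus), sigma_ext
```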
IV. GOODNESS OF FIT IN EVM

A modification of the usual chi-square goodness-of-fit measure used in ordinary weighted averaging serves as the goodness-of-fit measure for EVM. This test is motivated by a statistical test presented in Section 9.3 of Ref. [3]. We define the quantity $Q$ to be
M. Birch et al.
2
(Nh − nph ) (Nl − npl ) + , npl nph
(6)
where $N_l$ and $N_h$ are the numbers of measured points lower and higher than the mean, respectively, $p_l = \int_{-\infty}^{x_E} M(x)\,dx$ is the probability of a point being below the mean, $p_h = \int_{x_E}^{\infty} M(x)\,dx = 1 - p_l$ is the probability of a point being above the mean, and $n = N_l + N_h$ is the total number of measurements. As $n \to \infty$ the distribution of $Q$ approaches a chi-square distribution with one degree of freedom. Given this fact, we may transform $Q$ into an easily interpretable percentage,

$$
\alpha = 1 - \operatorname{erf}\!\left(\sqrt{\tfrac{Q}{2}}\right), \qquad (7)
$$

where $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$ and $\alpha$ is called the "confidence level". The latter is a measure of the probability that the data are realizations of a random variable with probability density given by $M(x)$. It should be noted that, since $Q$ has only an approximate chi-square distribution, the confidence level is also approximate. One should be cautious of very high confidence levels, as they could indicate that $\alpha$ is not adequate to draw a conclusion about the hypothesis because the number of points is too small for the distribution of $Q$ to be approximated by a chi-square distribution. Of course, high confidence may also simply mean that the data set is highly consistent and can be described very well by the model. We also stress that this confidence level is a test of the validity of the underlying hypothesis of the EVM procedure and is not associated with the confidence interval for the mean given by the quoted uncertainty, which is always equal to one standard deviation (68%).
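A sketch of the goodness-of-fit test of Eqs. (6) and (7) is given below; the numerical integration of $M(x)$ (a simple midpoint Riemann sum over an ad hoc grid) is our simplification, not a prescription from the paper.

```python
# Sketch of Eqs. (6)-(7), using mean_density and evm_average from the sketches above.
def evm_confidence(measurements, grid_points=20000):
    x_e, _ = evm_average(measurements)
    mus = [mu for mu, _, _ in measurements]
    n = len(mus)
    n_l = sum(1 for mu in mus if mu < x_e)
    n_h = n - n_l
    # p_l = integral of M(x) from -infinity to x_E, approximated by a midpoint sum
    # starting far (10 widths) below the lowest measurement.
    lo = min(mu - 10.0 * b for mu, _, b in measurements)
    dx = (x_e - lo) / grid_points
    p_l = sum(mean_density(lo + (i + 0.5) * dx, measurements) for i in range(grid_points)) * dx
    p_h = 1.0 - p_l
    q = (n_l - n * p_l) ** 2 / (n * p_l) + (n_h - n * p_h) ** 2 / (n * p_h)
    alpha = 1.0 - math.erf(math.sqrt(q / 2.0))
    return q, alpha
```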
V. ADVANTAGES OF EVM

A. Recommending a Realistic Uncertainty

The internal and external uncertainties of EVM reflect different physical sources of uncertainty, and both have mathematical motivation. The internal estimate follows from the theorem on linear combinations of normal distributions noted above and corresponds to the limitation of the precision of the measurements. The external uncertainty of EVM represents the variation of the results among the measurements and summarizes how well our current knowledge of the true value is described by the mean. Therefore, the external uncertainty estimate of EVM offers a mathematically motivated (using the variance of the probability density), meaningful and realistic uncertainty in the case of discrepant data. This is not the case in ordinary weighted averaging. A note of caution: the external uncertainty comes from the variance (spread) of the measurements in the data set, so its legitimacy is highly contingent on the accuracy assumption of the measurements. Older measurements which may be subject to systematic errors can cause the EVM uncertainty to be quite large; the user should therefore be careful in selecting his or her data set.

B. Robust to Outliers
The weight of a particular measurement in EVM is proportional to the expected frequency of that measurement based on the data set. When the same value has a high expected frequency in more than one experiment, the sum of these frequencies results in a very large weight. Since outliers do not benefit from this summing of frequencies, they will have a reduced weight in comparison with the other measurements and therefore will not have a large effect on the average. This property is illustrated in Fig. 1, which shows results from a test in which simulated data sets, where a small number of data points were taken from an "outlying distribution", were averaged by various methods in order to determine how much influence the outliers have on the mean given by each method. The unweighted average, weighted average and Normalised Residuals Method (NRM) [4] all show a dependence, while EVM is consistent with a constant value.

FIG. 1. The root mean square (RMS) error in the determination of the mean of a known normal distribution for different averaging methods as a function of the quantity of outliers in the sample (in percent). The unweighted average (orange triangles), weighted average (blue diamonds) and Normalised Residuals Method (NRM) [4] (purple squares) all show a dependence, while EVM (red circles) is consistent with a constant value. The error bars show the standard deviation for the set of results from different runs of the Monte Carlo simulation.
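A rough Monte Carlo sketch in the spirit of the test behind Fig. 1 is shown below; the specific distributions, uncertainties and sample sizes are illustrative assumptions of ours, not the parameters actually used by the authors.

```python
# Toy robustness test: compare the RMS error of EVM and of the ordinary
# weighted average as the fraction of outlying measurements grows.
import random

def rms_errors(outlier_fraction, n_meas=20, n_trials=1000, true_value=1.0, sigma=0.1):
    sq_err_evm, sq_err_wavg = 0.0, 0.0
    for _ in range(n_trials):
        data = []
        for _ in range(n_meas):
            # A few points are drawn from an "outlying distribution" shifted by 5 sigma.
            centre = true_value + 0.5 if random.random() < outlier_fraction else true_value
            data.append((random.gauss(centre, sigma), sigma, sigma))  # symmetric uncertainties
        x_e, _ = evm_average(data)
        wts = [1.0 / s ** 2 for _, s, _ in data]
        wavg = sum(wt * mu for wt, (mu, _, _) in zip(wts, data)) / sum(wts)
        sq_err_evm += (x_e - true_value) ** 2
        sq_err_wavg += (wavg - true_value) ** 2
    return (sq_err_evm / n_trials) ** 0.5, (sq_err_wavg / n_trials) ** 0.5

# e.g. compare rms_errors(0.0) with rms_errors(0.05): the weighted-average error
# should grow with the outlier fraction while the EVM error stays roughly flat.
```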
C. Consistent Under Alternative Representations
For some physical quantities, there is more than one numerical representation. For example, consider the reduced transition probability, B(E2), and mean lifetime, τ, of an excited state in a nucleus. One quantity can be converted to the other via a simple relationship, e.g. in the case of the first excited state in even-even nuclei [5], and both correspond to the same underlying property related to the structure of the nucleus. Therefore, when presented with a set of data which consists of both B(E2) and mean lifetime measurements, the recommended value should not depend on the chosen representation of the data set. However, the weighted average result of converting every measurement to a lifetime will not always be consistent with the result of converting every measurement to a B(E2) value. Using EVM, the results will be consistent regardless of the representation of the data set. Consider the example of ⁴⁸Ti given in Table I. The EVM results are consistent in either representation, while the weighted average values nearly disagree within their uncertainties.

TABLE I. Measured B(E2)↑ values and mean lifetimes of the first 2⁺ state in ⁴⁸Ti. EVM and weighted average results are given for the data represented as B(E2) values, B*, and as lifetimes, τ*. The EVM results are consistent, while the two weighted average results barely agree within uncertainties. References cited are keynumbers in the NSR database [6].

Mean Lifetime (ps)   B(E2)↑ (e²b²)     Reference
5.7(3) a,c           0.0778(+43−39)    2000Er01
6.7(6) a             0.0662(+65−54)    1981Ca10, 1977Ca14
4.3(20) a            0.103(+90−33)     1978Li13
4.2(+30−17) a        0.106(+72−44)     1978DeYT
8.29(124) a,c        0.0535(+94−70)    1973Fi15
6.0(13) a            0.074(+20−13)     1973Ba02
5.3(8) a             0.084(+15−11)     1972WaYZ
6.16(+36−32)         0.0720(40) a      1971De29
5.48(+60−49)         0.081(8) a        1970MiZQ
6.43(+61−51)         0.069(6) a        1970Ha24
1.2(+50−3) a,b       0.37(+12−30) b    1969Ka10
5.54(+139−92)        0.080(16) a       1967Af03
3.6(15) a            0.123(+88−36)     1964Bo22
7.1(22) a            0.062(+28−15)     1963Ak02
6.3(+16−11)          0.070(14) a       1960An07
3.17(+127−70)        0.140(40) a       1959Al95
6.0(20) a            0.074(+37−18)     1958Kn36
14.3(+34−23) b       0.031(6) a,b      1956Te26

6.12(18)             0.0725(21)        B* (W. Ave.)
5.87(19)             0.0756(+25−24)    τ* (W. Ave.)
5.9(+10−8)           0.075(11)         B* (EVM)
5.83(86)             0.076(+13−10)     τ* (EVM)

a Original value measured in experiment.
b Value not used in averaging as result appears to be an outlier in the data set.
c Uncertainty increased to 15% in 1973Fi15 and 5% in the inverse kinematic experiment of 2000Er01 to account for systematic uncertainty in stopping powers.
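As a toy check of this consistency (our own construction, not from the paper), a few of the lifetime measurements of Table I can be converted to B(E2) values and averaged in both representations. The conversion constant C below is inferred from the paired values in Table I (τ[ps]·B(E2)[e²b²] ≈ 0.443 for this transition) and the uncertainties are treated as symmetric for simplicity; both are illustrative assumptions.

```python
# Toy representation-consistency check, reusing evm_average from the earlier sketch.
C = 0.443  # e^2 b^2 * ps, inferred from Table I; illustrative only

def to_other_representation(measurement):
    """Map (mu, +a, -b) through x -> C/x; a decreasing map swaps the uncertainties."""
    mu, a, b = measurement
    new_mu = C / mu
    return (new_mu, C / (mu - b) - new_mu, new_mu - C / (mu + a))

lifetimes = [(5.7, 0.3, 0.3), (6.7, 0.6, 0.6), (4.3, 2.0, 2.0), (8.29, 1.24, 1.24)]
b_values = [to_other_representation(m) for m in lifetimes]

x_tau, _ = evm_average(lifetimes)
x_b, _ = evm_average(b_values)
print(x_tau, C / x_b)  # the two EVM results should be consistent (cf. Table I)
```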
VI. CONCLUSIONS

We have presented the Expected Value Method (EVM) of averaging experimental data. This procedure has been developed for independent and accurate measurements with reliable uncertainty estimates when detailed statistical information is unavailable. Results follow from computing the expectation value of a probability density function which is constructed from the data set. Mathematically justified internal and external uncertainty estimates are derived which physically reflect different sources of uncertainty. EVM is advantageous over traditional methods of averaging since the process returns a realistic uncertainty estimate, is robust to outliers, and gives consistent results under alternative representations of the same data.
Acknowledgements: This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Office of Nuclear Physics in the Office of Science of the U.S. Department of Energy.
[1] The ATLAS Collaboration et al., ATL-PHYS-PUB-2011-11, CMS NOTE-2011/005 (2011).
[2] E.T. Jaynes, G.L. Bretthorst, Probability Theory: The Logic of Science, Cambridge University Press (2003).
[3] R.V. Hogg, A.T. Craig, Introduction to Mathematical Statistics, Macmillan Publishing (1978).
[4] M.F. James, R.W. Mills, D.R. Weaver, Nucl. Instrum. Methods A 313, 277 (1992).
[5] S. Raman, C.W. Nestor Jr., P. Tikkanen, At. Data Nucl. Data Tables 78, 1 (2001).
[6] http://www.nndc.bnl.gov/nsr