ARTICLE IN PRESS
Nuclear Instruments and Methods in Physics Research A 534 (2004) 152–155 www.elsevier.com/locate/nima
Distinguishability of hypotheses$ S.I. Bityukova,, N.V. Krasnikovb a
Institute for High Energy Physics, Prospect 60-letiya, Octyabrya 7a, 142281 Protvino Moscow Region, Russian Federation b Institute for Nuclear Research RAS, Prospect 60-letiya, Octyabrya 7a, 142281 Moscow 117312, Russia Available online 29 July 2004
Abstract The possible approach to estimate the distinguishability of hypotheses is considered. The differences between the distinguishability of hypotheses and the probability of making a correct decision in hypotheses testing are discussed. The proposed estimator allows to take into account systematics and statistical uncertainties in determination of signal and background rates. r 2004 Elsevier B.V. All rights reserved. PACS: 02.50.Le; 02.50.Cw; 07.05.Fb Keywords: Hypotheses testing; Uncertainties; Distinguish-ability
Let us consider a statistical hypothesis
1. Introduction In the paper [1] the concept of the probability of making a correct decision in the testing of hypotheses as the estimator of quality of planned experiments is proposed. In the paper [2] this concept is used in the determination of exclusion limits. In parallel with this estimator which allows to compare the planned experiments, we propose the measure of the distinguishability of hypotheses. This measure is the property of hypotheses under testing. $ The participation in the WorkShop is supported by the Organizing Committee of ACAT’03. Corresponding author. E-mail address:
[email protected] (S.I. Bityukov).
H0 : new physics is present in Nature against an alternative hypothesis H1 : new physics is absent in Nature: The value of uncertainty is defined by the probability to reject H0 when it is true (Type I error) a ¼ Pðreject H0 jH0 is trueÞ and the probability to accept H0 when H1 is true (Type II error) b ¼ Pðaccept H0 jH0 is falseÞ: Suppose that the probability of observing n events in an experiment is described by the
0168-9002/$ - see front matter r 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.nima.2004.07.038
ARTICLE IN PRESS S.I. Bityukov, N.V. Krasnikov / Nuclear Instruments and Methods in Physics Research A 534 (2004) 152–155
153
function f ðn; mÞ with parameter m, and that the expected numbers of signal and background events (ms and mb , respectively) are known, i.e. there are considered two conditional distributions of probabilities
Accordingly, 1 k^ is the probability to make a correct choice with the given critical region. The 1 k^ can be considered as the estimator of quality of planned experiment. The advantages of this estimator are
f 0 ðnÞ ¼ f ðn; ms þ mb Þ; f 1 ðnÞ ¼ f ðn; mb Þ:
ð1Þ
2. The probability of making a correct decision
Let us specify what we mean by the probability of making a correct decision about the presence or absence of a new phenomenon in a planned experiment. Let us define the criterion for the hypothesis choice and calculate the probability of making a correct decision. This is possible, because we construct the critical region in such a way that the probability of an incorrect choice in favour of one of the hypotheses is independent of whether H0 or H1 is true. After choosing a critical region in some way, we can estimate the possible Type I ð^aÞ and Type II ^ In the case of applying the equal-tailed errors ðbÞ. ^ see for example [3]), their combination test (^a ¼ b, k^ ¼
a^ þ b^ 2
ð2Þ
is the probability of making incorrect choice in favour of one of the hypotheses. In actuality, we must estimate the random value k ¼ a þ b ¼ k^ þ e, where k^ is a constant and e is a stochastic term. a is the fraction of incorrect decisions if H0 is true. Then b is absent because H1 is not realised in Nature. Correspondingly, b is the fraction of incorrect decisions if H1 takes place; then a is absent. If H0 is true, the Type I error equals a^ and the error of our estimator (Eq. (2) is ^ ^ e^ ¼ k^ a^ ¼ ð^a þ bÞ=2 a^ ¼ ð^a bÞ=2. Similarly, if H1 is true, the Type II error equals b^ and the error of the estimator is ^ ^ e^ ¼ k^ b^ ¼ ð^a þ bÞ=2 b^ ¼ ð^a bÞ=2. Thus the ^ stochastic term takes the values ð^a bÞ=2. If we ^ require a^ ¼ b, both errors of the estimation are equal to 0 ðk^ a^ ¼ k^ b^ ¼ 0Þ. As a result, the estimator (Eq. (2)) gives the probability of making an incorrect decision in future hypothesis testing.
in case of continuous distributions this probability is independent of which hypothesis is chosen as H0 , which is H1 , and which is true, in case of discrete distributions the error e^ can be taken into account, this estimator allows the comparison of planned experiments.
3. The possible measure of the distinguishability of hypotheses The probability of making a correct decision in hypotheses testing has disadvantages to be a measure of the distinguishability of hypotheses, namely,
the equal-tailed test gives non-minimal magnitudes of sum of Type I and Type II errors [1,2], the k^ mainly assumes values in ½0; 1=2 (the desirable region for probabilistic measure is [0,1] [4]), the equal-tailed test does not always give a single-valued critical region for complicated distributions. The value1
1 k~ ¼ 1
a^ þ b^ ^ 2 ð^a þ bÞ
ð4Þ
is devoid of these disadvantages if we use the equal probability test [5]. 1
If we will use the geometric approach (let us the A is a set of possible realizations of the result of the planned experiment if the hypothesis H0 takes place in Nature and the B is a set of possible realizations of the result of the planned experiment if the hypothesis H1 takes place) then we have the total number of the possibilities for decision equals to A [ B and the fraction of incorrect decisions will be k~ ¼
T A B a^ þ b^ S ¼ : ^ A B 2 ð^a þ bÞ
ð3Þ
ARTICLE IN PRESS S.I. Bityukov, N.V. Krasnikov / Nuclear Instruments and Methods in Physics Research A 534 (2004) 152–155
154
There are three possibilities. (a) Distributions f 0 ðnÞ and f 1 ðnÞ have no overlapping, hence, the distributions are completely distinguishable and any result of the experiment will give the correct choice between hypotheses, i.e. 1 k~ ¼ 1. (b) Distributions f 0 ðnÞ and f 1 ðnÞ coincide completely. It means, that it is impossible to get a correct answer, i.e. f 0 ðnÞ and f 1 ðnÞ are not distinguishable, i.e. 1 k~ ¼ 0. (c) Distributions f 0 ðnÞ and f 1 ðnÞ do not coincide, but they have an overlapping, i.e. k~ is the fraction of incorrect decisions under the equal probability test. If in this case we hold the ^ designation k^ ¼ ð^a þ bÞ=2 then, k~ ¼
k^ : 1 k^
ð5Þ
4. The case of Poisson distribution Let the probability of observing n events in an experiment be described by a Poisson distribution with parameter m, i.e. mn f ðn; mÞ ¼ em : n!
ð6Þ
Then the Type I and II errors and, correspondingly, the measure of the distinguishability of hypotheses can be written as a^ ¼
nc X
f ði; ms þ mb Þ ¼
nc X
i¼0
b^ ¼ 1
f 0 ðiÞ
i¼0 nc X
f ði; mb Þ ¼ 1
i¼0
1 k~ ¼ 1
nc X
^ 2 ð^a þ bÞ
with critical value ms nc ¼ lnðms þ mb Þ lnðmb Þ
ð9Þ
The influence of systematics and statistical uncertainties in determination of mean numbers of signal and background rates ms and mb can be taken into account in this approach by easy way [6].
5. Conclusion In this report the possible approach to estimate the distinguishability of hypotheses is considered. The differences between the distinguishability of hypotheses and the probability of making a correct decision in hypotheses testing are discussed. The proposed estimator allows to take into account systematics and statistical uncertainties in determination of signal and background rates.
Acknowledgements The authors thank V.A. Kachanov, Louis Lyons, V.A. Matveev and V.F. Obraztsov for support of the given work. The authors also are grateful to S.S. Bityukov, Yu.P. Gouz, G. Kahrimanis, A. Nikitenko, V.V. Smirnova and V.A. Taperechkin for fruitful discussions and constructive criticism. S.B. thanks Toshiaki Kaneko and Fukuko Yuasa. This work has been supported by Grant RFBR 03-02-16933.
f 1 ðiÞ References
i¼0
a^ þ b^
conserves the linearity with respect to time ms t : nc t ¼ lnðms þ mb Þ lnðmb Þ
ð7Þ
ð8Þ
where square brackets mean the integer part of a number. Notice, that this critical value in case of the process of Poisson with parameter mt practically
[1] S.I. Bityukov, N.V. Krasnikov, The probability of making a correct decision in hypotheses testing as estimator of quality of planned experiments, in: G. Erickson, Y. Zhai (Eds.), Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 23rd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Jackson Hole, Wyoming, 3–8 August 2003, AIP Conference Proceedings, Vol. 707, Melville, NY, 2004, pp. 455–464, also e-Print: physics/ 0309031. [2] S.I. Bityukov, N.V. Krasnikov, in: L. Lyons, R. Mount, R. Reitmeyer (Eds.), Proceedings of the Conference
ARTICLE IN PRESS S.I. Bityukov, N.V. Krasnikov / Nuclear Instruments and Methods in Physics Research A 534 (2004) 152–155 Statistical Problems in Particle Physics, Astrophysics, and Cosmology (PhyStat’2003), SLAC, Stanford, CA, USA, 8–11 September, 2003, SLAC-R-703, pp. 318–320; econf C030908: THNT002, 2003. [3] J.O. Berger, B. Boukai, Y. Wang, Stat. Sci. 12 (1997) 133. [4] A.L. Gibbs, F.E. Su, On choosing and bounding probability metrics, e-Print: math.PR/0209021.
155
[5] S.I. Bityukov, N.V. Krasnikov, Nucl. Instr. and Meth. A 452 (2000) 518. [6] S.I. Bityukov, N.V. Krasnikov, Nucl. Instr. and Meth. A 502 (2003) 795; see also, S.I. Bityukov, JHEP 09 (2002) 060; e-Print: hep-ph/ 0207130.