A guided tour of Chernoff bounds




Information Processing Letters 33 (1989/90) 305-308
North-Holland

Torben HAGERUP and Christine RÜB

Fachbereich Informatik, Universität des Saarlandes, D-6600 Saarbrücken, FRG

Communicated by R. Wilhelm
Received 14 April 1989
Revised 4 September 1989

We give elementary derivations of the various inequalities collectively known as Chernoff bounds. Chernoff bounds are strong upper bounds on the probability of obtaining very few or very many heads in a series of independent coin tosses. This note aims at making known results and their proofs accessible to a wider audience; it contains little or no new material.

Keywords: Chernoff bounds, coin tossing, Bernoulli trials, probabilistic analysis

The following notation is used throughout: Pr(A) denotes the probability of an event A, E(X) the expected value of a random variable X, ln the natural logarithm function and exp its inverse. We write exp(x) and e^x interchangeably. We need the inequalities stated in the following lemma.

Lemma.
(1) 1 + a ≤ e^a for all a ∈ ℝ;
(2) (1 + b/x)^x ≤ e^b for all x > 0 and all b ≥ −x;
(3) e^{−ε²/2} ≤ e^ε·(1+ε)^{−(1+ε)} ≤ e^{−ε²/3} for 0 ≤ ε ≤ 1;
(4) e^ε·(1+ε)^{−(1+ε)} ≤ e^{−ε²/2} for −1 < ε ≤ 0.

Proof. Let g(a) = e^a − (1 + a), a ∈ ℝ. Then g''(a) = e^a > 0 for all a ∈ ℝ, while g'(0) = g(0) = 0. Hence g(a) ≥ 0 for all a ∈ ℝ, which proves (1). Putting a = b/x and raising both sides of (1) to the xth power gives (2). Now for q ∈ {0, 1}, let

  f_q(ε) = ε + (3 − q)ε²/6 − (1 + ε)·ln(1 + ε),   ε > −1.

Then

  f_q'(ε) = −ln(1 + ε) + ε − qε/3,   f_q''(ε) = −1/(1 + ε) + 1 − q/3.

The sign variation of f_q and its first two derivatives is shown in Table 1.


Table 1
Sign variation of f_q, f_q' and f_q''

               f_0''   f_0'   f_0   |   f_1''   f_1'   f_1
 −1 < ε < 0     −       +      −    |    −       +      −
 ε = 0          0       0      0    |    −       0      0
 0 < ε ≤ 1      +       +      +    |   −/+      −      −

(For q = 1, f_1'' changes sign from − to + at ε = 1/2; since f_1'(1) = 2/3 − ln 2 < 0, f_1' nevertheless stays negative on all of (0, 1].)

The sign variation of f_0 immediately implies (4) and the left half of (3). From the sign variation of f_1 we get, for 0 ≤ ε ≤ 1, the right half of (3).

Let n ∈ ℕ and let p_1, …, p_n ∈ ℝ with 0 ≤ p_i ≤ 1 for i = 1, …, n. Put p = (p_1 + ⋯ + p_n)/n and m = np. Let X_1, …, X_n and Y_1, …, Y_n be independent random variables with

  Pr(X_i = 1) = p_i,   Pr(X_i = 0) = 1 − p_i,   for i = 1, …, n,
  Pr(Y_i = 1) = p,     Pr(Y_i = 0) = 1 − p,     for i = 1, …, n.

We are interested in the behaviour of the random variables S = X_1 + ⋯ + X_n and S' = Y_1 + ⋯ + Y_n. We first quickly derive the most useful results. Let ε ≥ 0 and t ≥ 0. Then

  Pr(S ≥ (1+ε)m) ≤ Pr(e^{tS} ≥ e^{t(1+ε)m}) ≤ e^{−t(1+ε)m}·E(e^{tS}),

the last step by Markov's inequality. Since X_1, …, X_n are independent, we further get (also using (1))

  E(e^{tS}) = E(e^{t(X_1 + ⋯ + X_n)}) = E(e^{tX_1} ⋯ e^{tX_n}) = ∏_{i=1}^{n} E(e^{tX_i})
            = ∏_{i=1}^{n} (p_i·e^t + (1 − p_i)) = ∏_{i=1}^{n} (1 + p_i(e^t − 1))
            ≤ ∏_{i=1}^{n} e^{p_i(e^t − 1)} = exp((e^t − 1)·∑_{i=1}^{n} p_i) = e^{m(e^t − 1)}.

Putting t = ln(1 + ε) yields

  Pr(S ≥ (1+ε)m) ≤ (1 + ε)^{−(1+ε)m}·e^{εm},

and hence

  Pr(S ≥ (1+ε)m) ≤ ( e^ε / (1+ε)^{1+ε} )^m.                                   (5)

By the right half of (3),

  Pr(S ≥ (1+ε)m) ≤ e^{−ε²m/3},   0 ≤ ε ≤ 1.                                   (6)
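
The following small Python script is not part of the original note; it is a quick numerical sanity check of (5) and (6) in the homogeneous case p_1 = ⋯ = p_n = p, where S is binomially distributed, comparing the exact upper-tail probability with the two bounds. The parameters n = 100, p = 0.3 and the chosen ε values are arbitrary.

    # Numerical check of bounds (5) and (6) in the homogeneous case, where S is
    # binomially distributed with parameters n and p and m = n*p (example values only).
    from math import ceil, comb, exp

    def upper_tail(n, p, k):
        """Exact Pr(S >= k) for S ~ Binomial(n, p)."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(ceil(k), n + 1))

    n, p = 100, 0.3          # arbitrary example parameters
    m = n * p
    for eps in (0.1, 0.3, 0.5, 1.0):
        exact = upper_tail(n, p, (1 + eps) * m)
        bound5 = (exp(eps) / (1 + eps)**(1 + eps))**m    # right-hand side of (5)
        bound6 = exp(-eps * eps * m / 3)                 # right-hand side of (6)
        print(f"eps={eps:.1f}  exact={exact:.3e}  (5)={bound5:.3e}  (6)={bound6:.3e}")

For these parameters the exact probability is dominated by both bounds, with (5) somewhat tighter than (6), as the right half of (3) predicts.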

Correspondingly, for 0 ≤ ε ≤ 1 and t ≥ 0,

  Pr(S ≤ (1−ε)m) = Pr(m − S ≥ εm) ≤ Pr(e^{t(m−S)} ≥ e^{tεm})
                 ≤ e^{−tεm}·E(e^{t(m−S)}) = e^{tm(1−ε)}·E(e^{−tS})


and

  E(e^{−tS}) = ∏_{i=1}^{n} E(e^{−tX_i}) = ∏_{i=1}^{n} (p_i·e^{−t} + (1 − p_i)) = ∏_{i=1}^{n} (1 − p_i(1 − e^{−t}))
             ≤ ∏_{i=1}^{n} e^{−p_i(1 − e^{−t})} = exp(−(1 − e^{−t})·∑_{i=1}^{n} p_i) = e^{−m(1 − e^{−t})}.

Putting t = −ln(1 − ε) yields

  Pr(S ≤ (1−ε)m) ≤ ( e^{−ε} / (1−ε)^{1−ε} )^m,

from which, by (4), the left half of (3), and continuity at ε = 1, we get

  Pr(S ≤ (1−ε)m) ≤ e^{−ε²m/2},   0 ≤ ε ≤ 1.                                   (7)

As a consequence of (5),

  Pr(S ≥ (1+ε)m) ≤ ( e / (1+ε) )^{(1+ε)m}.

In particular, since e/(1+ε) ≤ e/6 < 1/2 when (1+ε)m ≥ 6m,

  Pr(S ≥ r) ≤ 2^{−r},   r ≥ 6m.                                               (8)
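
Again as an illustration only (not from the original note), the same kind of numerical check can be run for the lower-tail bound (7) and the threshold bound (8); the parameters below are arbitrary and are chosen so that the threshold 6m in (8) does not exceed n.

    # Numerical check of the lower-tail bound (7) and the threshold bound (8)
    # for S ~ Binomial(n, p) with m = n*p (example values only).
    from math import ceil, comb, exp, floor

    def lower_tail(n, p, k):
        """Exact Pr(S <= k) for S ~ Binomial(n, p)."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, floor(k) + 1))

    def upper_tail(n, p, k):
        """Exact Pr(S >= k) for S ~ Binomial(n, p)."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(ceil(k), n + 1))

    n, p = 100, 0.05         # arbitrary example parameters
    m = n * p
    for eps in (0.2, 0.5, 0.9):
        print(f"eps={eps:.1f}  exact={lower_tail(n, p, (1 - eps) * m):.3e}  "
              f"(7)={exp(-eps * eps * m / 2):.3e}")

    r = 6 * m                # smallest threshold allowed by (8)
    print(f"Pr(S >= {r:g}) = {upper_tail(n, p, r):.3e}  <=  2^(-r) = {2.0**(-r):.3e}")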

We next derive some sharper but more complicated bounds. We henceforth consider only the case p_1 = ⋯ = p_n = p. Let p ≤ a < 1 and t ≥ 0. Choosing ε to make (1+ε)m = an, we get from the previous calculations

  Pr(S' ≥ an) ≤ e^{−tan}·( p·e^t + (1 − p) )^n.

For t = ln( a(1−p) / (p(1−a)) ) this becomes

  Pr(S' ≥ an) ≤ ( (p/a)^a·( (1−p)/(1−a) )^{1−a} )^n,   p < a < 1.             (9)

Let us introduce the abbreviation S' ⪰ k ("S' is at least as extreme as k"), defined by

  S' ⪰ k  means  S' ≥ k,   if k ≥ pn,
  S' ⪰ k  means  S' ≤ k,   if k ≤ pn.

By an interchange of a with 1 − a and of p with 1 − p, (9) also covers the lower tail, so that

  Pr(S' ⪰ an) ≤ ( (p/a)^a·( (1−p)/(1−a) )^{1−a} )^n,   0 < a < 1.             (10)


Since, by (2),

  ( (1−p)/(1−a) )^{1−a} = ( 1 + (a−p)/(1−a) )^{1−a} ≤ e^{a−p},

we also have

  Pr(S' ⪰ an) ≤ ( (p/a)^a·e^{a−p} )^n,   0 < a < 1.                           (11)

Putting na = k, we may finally derive the inequality

  Pr(S' ⪰ k) ≤ ( np/k )^k·e^{k−np},   0 < k < n.                              (12)
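
As a final illustration, again not part of the original note, the sketch below evaluates the right-hand side of (12) and compares it with the exact binomial tail in the direction selected by the ⪰ relation; the parameter values are arbitrary.

    # Numerical check of bound (12): Pr(S' ⪰ k) <= (np/k)^k * e^(k - np), where
    # S' ~ Binomial(n, p) and "S' ⪰ k" means S' >= k if k >= np and S' <= k otherwise.
    # Parameter values are arbitrary examples.
    from math import comb, exp

    def extreme_tail(n, p, k):
        """Exact Pr(S' ⪰ k) for S' ~ Binomial(n, p)."""
        def pmf(i):
            return comb(n, i) * p**i * (1 - p)**(n - i)
        if k >= n * p:
            return sum(pmf(i) for i in range(k, n + 1))
        return sum(pmf(i) for i in range(0, k + 1))

    def bound12(n, p, k):
        """Right-hand side of inequality (12)."""
        return (n * p / k)**k * exp(k - n * p)

    n, p = 100, 0.3          # arbitrary example parameters; np = 30
    for k in (10, 20, 40, 50):
        print(f"k={k:3d}  exact={extreme_tail(n, p, k):.3e}  (12)={bound12(n, p, k):.3e}")
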
The material in this note was drawn from several sources. The fundamental technique goes back to Chernoff [2]. While we have not reproduced any of Chernoff's original formulas, putting r = 1 in his equation (5.11) and exponentiating it while combining it with equations (3.5) and (3.6) yields our equation (10). Our treatment is based mostly on the derivation in [6], and equations (5) and (7) were taken from there. Equation (6) and the left half of (7) apparently were first formulated in [1]. Equations (9) and (11) are from [5], while (12) appears in [7] and a close relative of the left half of (12) is the form preferred in [3]. More general results may be found in [4, p. 104, Theorems 6 and 7].

[1] D. Angluin and L.G. Valiant, Fast probabilistic algorithms for Hamiltonian circuits and matchings, J. Comput. System Sci. 18 (1979) 155-193.
[2] H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, Ann. Math. Statist. 23 (1952) 493-507.
[3] P. Erdős and J. Spencer, Probabilistic Methods in Combinatorics (Academic Press, New York, 1974).
[4] M. Hofri, Probabilistic Analysis of Algorithms (Springer, New York, 1987).
[5] R.M. Karp, Probabilistic analysis of algorithms, Class Notes, Univ. of California, Berkeley, 1988.
[6] P. Raghavan, Probabilistic construction of deterministic algorithms: Approximating packing integer programs, J. Comput. System Sci. 37 (1988) 130-143.
[7] L.G. Valiant and G.J. Brebner, Universal schemes for parallel communication, in: Proc. 13th Ann. ACM Symp. on Theory of Computing (1981) 263-277.