… for $\alpha > p$. Anderson and Samuels (1967) proved that $P(S_n \ge n\alpha) \le P(T_n \ge n\alpha)$ for $(\lambda+1)/n \le \alpha$.
P. Deheuvels / Large deviations by Poisson approximations

Remark 2.1. Several bounds have been obtained for $d_K$ which are, at times, sharper than the upper bound given in (2.5). For instance, Serfling (1978) proved that $d_K \le \tfrac{3}{4}\lambda_2$. Shorgin (1977) and Deheuvels and Pfeifer (1986b, 1988) obtain the evaluations

$$d_K = \hat d_K + r_K, \eqno(2.6)$$

where

$$\hat d_K = \frac{\lambda_2}{2\lambda}\,\max\Big\{ \frac{\lambda^a e^{-\lambda}(a-\lambda)}{a!},\ \frac{\lambda^b e^{-\lambda}(\lambda-b)}{b!} \Big\}, \eqno(2.7)$$

with

$$a = \big[\lambda + \tfrac12 + (\lambda+\tfrac14)^{1/2}\big], \qquad b = \big[\lambda + \tfrac12 - (\lambda+\tfrac14)^{1/2}\big], \eqno(2.8)$$

$[u]$ denoting the integer part of $u$. It is noteworthy (see e.g. Deheuvels and Pfeifer (1988)) that we have $|r_K| = o(\hat d_K)$ and $\hat d_K = (1+o(1))\,\lambda_2/(2\lambda\sqrt{2\pi e})$ as $\lambda \to \infty$, whenever $\lambda_2/\lambda \to 0$, which improves upon (2.5) by a factor of $1/(2\sqrt{2\pi e})$. In general (see e.g. Deheuvels and Pfeifer (1986a)), $d_K$ is always close to $\hat d_K$, while the ratio $(\lambda_2/\lambda)(1-e^{-\lambda})/d_K$ is bounded above. Thus, the order of approximation given by (2.5) in (2.3)-(2.4) is sharp up to a multiplicative constant.
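As an illustration of the Kolmogorov distance $d_K$ between $S_n$ and the Poisson variable $T_n$, the distance can be computed exactly for small $n$ by dynamic programming. The script below is ours, not the paper's; it checks the classical Le Cam-type inequality $d_K \le d_{TV} \le \lambda_2 = \sum_j p_j^2$, which is weaker than the bounds discussed above but certain.

```python
import math

# Illustrative sketch (not from the paper): exact Kolmogorov distance
# between a Poisson-binomial sum S_n and T_n ~ Po(lambda), compared with
# the classical Le Cam bound d_K <= d_TV <= lambda_2 = sum_j p_j^2.

def poisson_binomial_pmf(ps):
    """Exact pmf of S_n = X_1 + ... + X_n by dynamic programming."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        pmf = new
    return pmf

def poisson_pmf(lam, k):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def kolmogorov_distance(ps, kmax=200):
    lam = sum(ps)
    pmf = poisson_binomial_pmf(ps)
    d, Fs, Ft = 0.0, 0.0, 0.0
    for k in range(kmax):
        Fs += pmf[k] if k < len(pmf) else 0.0
        Ft += poisson_pmf(lam, k)
        d = max(d, abs(Fs - Ft))   # both cdfs jump only at integers
    return d

ps = [0.05, 0.1, 0.02, 0.08, 0.04, 0.06]
dK = kolmogorov_distance(ps)
lam2 = sum(p * p for p in ps)
assert dK <= lam2
```

Since both distributions are supported on the integers, the supremum defining $d_K$ is attained at integer points, which the loop above exploits.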
Remark 2.2. The lower bounds in (2.3) and (2.4) may not be improved, as can be seen from the definition (2.1) of $d_K$. If one replaces $d_K$ in (2.3) and (2.4) by a function $d(\alpha)$ of $\alpha$, $n$, $\lambda$ and $\lambda_2$, better bounds can be obtained by using either the semigroup operator setting of Deheuvels and Pfeifer (1988) (see e.g. their Theorem 3.1), or the Stein-Chen approach of Barbour and Jensen (1989) (see e.g. their Theorem 4). We will not follow either of these approaches here, in spite of their potential interest.

Theorem 2.2. There exists an absolute constant $A \le 0.7975$ such that

$$\delta_n = d_K(S_n, U_n) = \sup_x |P(S_n \le x) - P(U_n \le x)| \le A(\lambda-\lambda_2)^{-1/2}. \eqno(2.9)$$

Proof. By the Berry-Esseen theorem (see Van Beek (1972), Hall (1982)), we have

$$\delta_n \le A(\lambda-\lambda_2)^{-3/2} \sum_{j=1}^n E(|X_j - p_j|^3) \le A(\lambda-\lambda_2)^{-1/2},$$

where we have used the facts that $|X_j - p_j| \le 1$ for $j = 1, \dots, n$, and $\sum_{j=1}^n E(|X_j - p_j|^2) = \lambda - \lambda_2$. $\square$
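Theorem 2.2 can be checked numerically: the sketch below (ours, not the paper's) computes the exact Kolmogorov distance between $S_n$ and the normal $N(\lambda, \lambda-\lambda_2)$ and compares it with the Berry-Esseen bound using Van Beek's constant $A = 0.7975$.

```python
import math

# Illustrative numerical check (not from the paper) of Theorem 2.2:
# d_K(S_n, U_n) <= 0.7975 * (lambda - lambda_2)^(-1/2).

def poisson_binomial_pmf(ps):
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        pmf = new
    return pmf

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

ps = [0.3, 0.5, 0.2, 0.4, 0.6, 0.35, 0.45, 0.25]
lam = sum(ps)
var = sum(p * (1 - p) for p in ps)   # = lambda - lambda_2
sd = math.sqrt(var)
pmf = poisson_binomial_pmf(ps)

# sup over x of |P(S_n <= x) - Phi((x - lam)/sd)| is attained at the
# integer jumps of the binomial cdf (from the left or the right).
delta, F = 0.0, 0.0
for k in range(len(pmf)):
    z = (k - lam) / sd
    delta = max(delta, abs(F - norm_cdf(z)))   # limit from the left of k
    F += pmf[k]
    delta = max(delta, abs(F - norm_cdf(z)))   # value at x = k

assert delta <= 0.7975 / sd
```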
Remark 2.3. It is obvious from higher order expansions of $P(S_n \le x)$ that

$$d_K(S_n, T_n) < d_K(S_n, U_n), \eqno(2.10)$$

while

$$d_K(S_n, T_n) > d_K(S_n, U_n), \eqno(2.11)$$

according as the relevant parameter lies below or above the threshold $0.7670\ldots$, where $a = 0.2784\ldots$ is the positive solution of the equation $1 + a + \log a = 0$.
Thus, we see from Theorems 2.1, 2.2 and by making use of the arguments above, that the best rate of approximation which can be achieved for the Kolmogorov distance between the distributions of $S_n$ and of $T_n$ or $U_n$ is

$$O\Big( \min\Big\{ \frac{\lambda_2}{\lambda},\ \lambda_2,\ (\lambda-\lambda_2)^{-1/2} \Big\} \Big). \eqno(2.12)$$

It follows that the evaluations of $|P(\pm S_n \ge \pm n\alpha) - P(\pm T_n \ge \pm n\alpha)|$ and of $|P(\pm S_n \ge \pm n\alpha) - P(\pm U_n \ge \pm n\alpha)|$ which may be deduced from these theorems become inefficient for the evaluation of $P(\pm S_n \ge \pm n\alpha)$ when the latter probabilities become negligible with respect to $\min\{\lambda_2/\lambda,\ \lambda_2,\ (\lambda-\lambda_2)^{-1/2}\}$. This motivates Section 3 below.

3. Large deviations and Poisson approximations
We start this section by introducing some further notation. For any $-\infty < t < \infty$, define by exponential centering an auxiliary sequence $X_1^t, \dots, X_n^t$ of independent Bernoulli random variables such that

$$P(X_j^t = 1) = 1 - P(X_j^t = 0) = p_j^t := \frac{p_j e^t}{1 - p_j + p_j e^t} \quad\text{for } j = 1, \dots, n. \eqno(3.1)$$

Observe that $\Psi_j(s) := E(\exp(sX_j)) = 1 - p_j + p_j e^s$, and that $\Psi_j^t(s) := E(\exp(sX_j^t)) = \Psi_j(s+t)/\Psi_j(t)$ for all $-\infty < s, t < \infty$. It follows that, with $S_n^t := X_1^t + \dots + X_n^t$, $E(\exp(sS_n^t)) = \prod_{j=1}^n \{\Psi_j(s+t)/\Psi_j(t)\}$, whence

$$P(S_n = k) = \exp\Big( \sum_{j=1}^n L_j(t) - kt \Big) P(S_n^t = k) \quad\text{for } k = 0, 1, \dots \text{ and } -\infty < t < \infty, \eqno(3.2)$$

where $L_j(t) := \log \Psi_j(t)$ for $j = 1, \dots, n$. Moreover, $E(S_n^t) = nM_n(t)$, where

$$M_n(t) := n^{-1} \sum_{j=1}^n p_j^t, \eqno(3.3)$$

and $\mathrm{Var}(S_n^t) = \sigma_n^2(t)$, where

$$\sigma_n^2(t) = nM_n'(t) = \sum_{j=1}^n p_j^t(1 - p_j^t). \eqno(3.4)$$
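The exponential-centering identity (3.2) is exact, not asymptotic, and can be verified directly for a small example. The following sketch is ours (illustrative only): it builds the tilted probabilities $p_j^t$ of (3.1) and checks (3.2) term by term.

```python
import math

# Illustrative verification (not from the paper) of identity (3.2):
#   P(S_n = k) = exp(sum_j L_j(t) - k t) * P(S_n^t = k),
# with p_j^t = p_j e^t / (1 - p_j + p_j e^t) and L_j(t) = log Psi_j(t).

def pb_pmf(ps):
    """Exact Poisson-binomial pmf by dynamic programming."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        pmf = new
    return pmf

ps = [0.1, 0.3, 0.2, 0.05, 0.4]
t = 0.7
pst = [p * math.exp(t) / (1 - p + p * math.exp(t)) for p in ps]  # tilted p_j^t
L = sum(math.log(1 - p + p * math.exp(t)) for p in ps)           # sum_j L_j(t)

pmf = pb_pmf(ps)       # law of S_n
pmf_t = pb_pmf(pst)    # law of S_n^t
err = max(abs(pmf[k] - math.exp(L - k * t) * pmf_t[k]) for k in range(len(pmf)))
assert err < 1e-12
```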
The assumption, made in (1.1), that $0 < p_j < 1$ for $j = 1, \dots, n$, implies that $S_n$ is nondegenerate, or equivalently that

$$\mathrm{Var}(S_n) = \sigma_n^2(0) = \lambda - \lambda_2 = \sum_{j=1}^n p_j(1-p_j) > 0. \eqno(3.5)$$

Moreover, (1.1) also implies that, for all $j = 1, \dots, n$, $p_j^t \to 0$ as $t \to -\infty$ and $p_j^t \to 1$ as $t \to \infty$. Since (3.5) is readily seen to be equivalent to $\sigma_n^2(t) > 0$ for all $-\infty < t < \infty$, it follows that $M_n(t)$ increases from 0 to 1 as $t$ increases from $-\infty$ to $\infty$. Denote accordingly by $t^* = t_n^*(\alpha)$ the unique root of the equation

$$M_n(t^*) = \alpha \quad\text{for } 0 < \alpha < 1. \eqno(3.6)$$
It will be convenient, from now on, to set $p = n^{-1}\lambda = n^{-1}\sum_{j=1}^n p_j = M_n(0)$. It is straightforward from the monotonicity of $M_n(\cdot)$ that $t^* < 0$ for $\alpha < p$ and $t^* > 0$ for $\alpha > p$. By setting $t = t^*$ in (3.2), we obtain readily the following equalities:

$$P(S_n \ge n\alpha) = \exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big) \sum_{k \ge n\alpha} e^{-(k-n\alpha)t^*} P(S_n^{t^*} = k), \eqno(3.7)$$

and

$$P(S_n \le n\alpha) = \exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big) \sum_{k \le n\alpha} e^{-(k-n\alpha)t^*} P(S_n^{t^*} = k). \eqno(3.8)$$
From this point on, the classical theory of large deviations uses a normal approximant for $F_n^{t^*}(y) := P((S_n^{t^*} - n\alpha)/\sigma_n(t^*) \le y)$. Denoting by $\Phi$ and $\phi$ the standard normal distribution function and density, respectively, this approach leads to approximating

$$\sum_{k \ge n\alpha} e^{-(k-n\alpha)t^*} P(S_n^{t^*} = k) = \int_0^\infty e^{-y t^* \sigma_n(t^*)}\, dF_n^{t^*}(y) \eqno(3.9)$$

by

$$\int_0^\infty e^{-y t^* \sigma_n(t^*)}\, d\Phi(y) = e^{t^{*2}\sigma_n^2(t^*)/2}\,\{1 - \Phi(t^*\sigma_n(t^*))\} \quad\text{for } p < \alpha < 1, \eqno(3.10)$$

and

$$\sum_{k \le n\alpha} e^{-(k-n\alpha)t^*} P(S_n^{t^*} = k) = \int_{-\infty}^0 e^{-y t^* \sigma_n(t^*)}\, dF_n^{t^*}(y) \eqno(3.11)$$

by

$$\int_{-\infty}^0 e^{-y t^* \sigma_n(t^*)}\, d\Phi(y) = e^{t^{*2}\sigma_n^2(t^*)/2}\,\{1 - \Phi(-t^*\sigma_n(t^*))\} \quad\text{for } 0 < \alpha < p. \eqno(3.12)$$

The fact that $t^* > 0$ in (3.10) while $t^* < 0$ in (3.12) implies that

$$\Big|\int_0^\infty e^{-y t^* \sigma_n(t^*)}\,(dF_n^{t^*}(y) - d\Phi(y))\Big| \le 2 d_K(S_n^{t^*}, U_n^{t^*}) \quad\text{for } p < \alpha < 1, \eqno(3.13)$$

and

$$\Big|\int_{-\infty}^0 e^{-y t^* \sigma_n(t^*)}\,(dF_n^{t^*}(y) - d\Phi(y))\Big| \le 2 d_K(S_n^{t^*}, U_n^{t^*}) \quad\text{for } 0 < \alpha < p, \eqno(3.14)$$

where $U_n^{t^*}$ denotes a random variable following a normal $N(nM_n(t^*), \sigma_n^2(t^*))$ distribution. Combining (3.9)-(3.14) with (2.9) yields the following theorem.

Theorem 3.1. For any $p < \alpha < 1$,

$$P(S_n \ge n\alpha) = \big\{ e^{t^{*2}\sigma_n^2(t^*)/2}\,(1 - \Phi(t^*\sigma_n(t^*))) + \psi_n^+(\alpha) \big\}\exp\Big(\sum_{j=1}^n L_j(t^*) - n\alpha t^*\Big), \eqno(3.15)$$

and for any $0 < \alpha < p$,

$$P(S_n \le n\alpha) = \big\{ e^{t^{*2}\sigma_n^2(t^*)/2}\,(1 - \Phi(-t^*\sigma_n(t^*))) + \psi_n^-(\alpha) \big\}\exp\Big(\sum_{j=1}^n L_j(t^*) - n\alpha t^*\Big), \eqno(3.16)$$

where, for $A$ as in (2.9),

$$|\psi_n^\pm(\alpha)| \le 2A\,\sigma_n(t^*)^{-1}. \eqno(3.17)$$
Remark 3.1. Since

$$\lim_{z\to\infty} \frac{z(1 - \Phi(z))}{\phi(z)} = 1, \eqno(3.18)$$

it follows from (3.15)-(3.18) that if $|t^*| \to \infty$ and $\sigma_n(t^*) \to \infty$, we have, under the assumptions leading to (3.15) and (3.16), respectively,

$$P(\pm S_n \ge \pm n\alpha) = \frac{1+o(1)}{|t^*|\,\sigma_n(t^*)\sqrt{2\pi}}\, \exp\Big(\sum_{j=1}^n L_j(t^*) - n\alpha t^*\Big). \eqno(3.19)$$

In particular, in the i.i.d. case where $p_1 = \dots = p_n = p$, we obtain $t^* = \log[\alpha(1-p)/(p(1-\alpha))]$, $\sigma_n^2(t^*) = n\alpha(1-\alpha)$, and $L_j(t^*) = \log((1-p)/(1-\alpha))$ for $j = 1, \dots, n$. It follows that (3.19) holds when $\alpha \to 0$, $p \to 0$, $n\alpha \to \infty$, together with $p/\alpha \to 0$ or $p/\alpha \to \infty$.
Remark 3.2. Theorem 3.1 is far from the best result one could obtain here, and serves mainly to illustrate the method. By taking care of the lattice structure of $S_n$ (see e.g. Petrov (1965, 1975)), sharper evaluations of the coefficients of $\exp(\sum_{j=1}^n L_j(t^*) - n\alpha t^*)$ in (3.15) and (3.16) can be made. However, the corresponding results fail to give any reasonable estimate for $P(\pm S_n \ge \pm n\alpha)$ when the asymptotic normality of $S_n^{t^*}$ does not hold, i.e. when $\sigma_n(t^*) \not\to \infty$. If this is the case, then the most reasonable approximant for the distribution of $S_n^{t^*}$ is given by a Poisson distribution. This motivates the sequel, where such an analysis is achieved.

We will make use of the following notation, where $t^* = t_n^*(\alpha)$ is as in (3.6) and $\alpha \in (0, 1)$ is arbitrary. Let

$$\lambda_k^t := \sum_{j=1}^n (p_j^t)^k = \sum_{j=1}^n \Big( \frac{p_j e^t}{1 - p_j + p_j e^t} \Big)^k \quad\text{for } k = 1, 2, \dots \text{ and } -\infty < t < \infty, \eqno(3.20)$$

and set $\lambda_k^* = \lambda_k^{t^*}$ for $k = 1, 2, \dots$, so that $\lambda^* = \lambda_1^* = n\alpha$. Denote by $T_n^t$ a random variable following a Poisson $\mathrm{Po}(\lambda^t)$ distribution, and set $T_n^* = T_n^{t^*}$ (note that $T_n^*$ follows a Poisson $\mathrm{Po}(n\alpha)$ distribution). Set further

$$d_K^t = d_K(S_n^t, T_n^t), \quad d_{TV}^t = d_{TV}(S_n^t, T_n^t), \quad d_K^* = d_K^{t^*} \quad\text{and}\quad d_{TV}^* = d_{TV}^{t^*}. \eqno(3.21)$$
Let also, for $\mu > 0$ and $y > 0$,

$$P^+(\mu, y) = \sum_{k \ge \mu y} \frac{\mu^k e^{-\mu}}{k!} = P(\Pi(\mu) \ge \mu y), \eqno(3.22)$$

and

$$P^-(\mu, y) = \sum_{0 \le k \le \mu y} \frac{\mu^k e^{-\mu}}{k!} = P(\Pi(\mu) \le \mu y), \eqno(3.23)$$

where $\Pi(\mu)$ denotes a random variable following a Poisson $\mathrm{Po}(\mu)$ distribution. The main theorem of this section, stated below, gives the basic form of our approximations.
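The quantities (3.22)-(3.23) are plain Poisson tail probabilities and can be computed with a few lines of code. The helper below is ours (names are not the paper's); it also checks that when $\mu y$ is an integer the two tails overlap exactly in the single term $k = \mu y$.

```python
import math

# Illustrative helpers (ours, not the paper's) for the Poisson tails
# P^+(mu, y) = P(Pi(mu) >= mu*y) and P^-(mu, y) = P(Pi(mu) <= mu*y)
# of (3.22)-(3.23).

def po_pmf(mu, k):
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def P_plus(mu, y):
    m = math.ceil(mu * y)
    return 1.0 - sum(po_pmf(mu, k) for k in range(m))

def P_minus(mu, y):
    m = math.floor(mu * y)
    return sum(po_pmf(mu, k) for k in range(m + 1))

# When mu*y is an integer, P^+ + P^- counts the atom at k = mu*y twice:
mu, y = 10.0, 1.2          # mu*y = 12
total = P_plus(mu, y) + P_minus(mu, y)
assert abs(total - 1.0 - po_pmf(mu, 12)) < 1e-12
```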
Theorem 3.2. For any $p < \alpha < 1$,

$$P(S_n \ge n\alpha) = \{P^+(n\alpha e^{-t^*}, e^{t^*}) + \delta_n^+(\alpha)\}\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha(1 - e^{-t^*}) \Big), \eqno(3.24)$$

and for any $0 < \alpha < p$,

$$P(S_n \le n\alpha) = \{P^-(n\alpha e^{-t^*}, e^{t^*}) + \delta_n^-(\alpha)\}\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha(1 - e^{-t^*}) \Big), \eqno(3.25)$$

where

$$|\delta_n^\pm(\alpha)| \le 2 d_K^*\, \exp(-n\alpha(e^{-t^*} - 1 + t^*)) \le 2\,\frac{\lambda_2^*}{n\alpha}\,(1 - e^{-n\alpha})\exp(-n\alpha(e^{-t^*} - 1 + t^*)). \eqno(3.26)$$

Proof. By (3.2) taken with $t = t^*$, we have for $p < \alpha < 1$,

$$P(S_n \ge n\alpha) = \exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big)\Big\{ \sum_{k \ge n\alpha} e^{-(k-n\alpha)t^*} P(T_n^* = k) + \sum_{k \ge n\alpha} e^{-(k-n\alpha)t^*} \big(P(S_n^{t^*} = k) - P(T_n^* = k)\big) \Big\} =: D_1 + D_2. \eqno(3.27)$$

By (3.22), we obtain readily that

$$D_1 = P^+(n\alpha e^{-t^*}, e^{t^*})\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha(1 - e^{-t^*}) \Big), \eqno(3.28)$$

while, in view of $t^* > 0$, a summation by parts yields

$$|D_2| \le 2 d_K^*\, \exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big) \le 2 d_{TV}^*\, \exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big). \eqno(3.29)$$

Combining (3.27), (3.28) and (3.29) with (2.5) taken with $\lambda = \lambda^* = n\alpha$ and $\lambda_2 = \lambda_2^*$ yields (3.24) and (3.26) in the '+' case. A similar argument applied to $P(S_n \le n\alpha)$ completes the proof. $\square$
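The decomposition $D_1 + D_2$ above can be tested numerically on a small example. The script below is an illustration of ours (the bisection solver for $t^*$ is our own device, not the paper's): it computes the exact upper-tail probability, the Poisson main term $D_1$, and checks that their difference is within the summation-by-parts bound $2 d_K^* \exp(\sum_j L_j(t^*) - n\alpha t^*)$.

```python
import math

# Illustrative check (not from the paper) of the decomposition behind
# Theorem 3.2: |P(S_n >= n*alpha) - D_1| <= 2 d_K^* exp(sum L_j - n a t*).

def pb_pmf(ps):
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        pmf = new
    return pmf

def po_pmf(mu, k):
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

ps = [0.1, 0.2, 0.15, 0.05, 0.25, 0.1, 0.2, 0.12]
n = len(ps)
alpha = 0.4                        # alpha > p = mean(ps), hence t* > 0

def M(t):
    return sum(p * math.exp(t) / (1 - p + p * math.exp(t)) for p in ps) / n

lo, hi = -50.0, 50.0               # bisection for M(t*) = alpha (our choice)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if M(mid) < alpha else (lo, mid)
t = 0.5 * (lo + hi)

na = n * alpha
L = sum(math.log(1 - p + p * math.exp(t)) for p in ps)
pst = [p * math.exp(t) / (1 - p + p * math.exp(t)) for p in ps]
pmf_t = pb_pmf(pst)                # law of the tilted sum S_n^{t*}

m0 = math.ceil(na)
P_exact = sum(math.exp(L - k * t) * pmf_t[k] for k in range(m0, n + 1))
Pplus = 1.0 - sum(po_pmf(na * math.exp(-t), k) for k in range(m0))
D1 = Pplus * math.exp(L - na * (1 - math.exp(-t)))

# exact Kolmogorov distance between S_n^{t*} and T_n^* ~ Po(n*alpha)
dK, Fs, Fp = 0.0, 0.0, 0.0
for k in range(200):
    Fs += pmf_t[k] if k <= n else 0.0
    Fp += po_pmf(na, k)
    dK = max(dK, abs(Fs - Fp))

assert abs(P_exact - D1) <= 2 * dK * math.exp(L - na * t) + 1e-12
```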
In the sequel, set

$$\bar t = \log\Big(\frac{\alpha(1-p)}{p(1-\alpha)}\Big), \qquad \tilde t = \log\Big(\frac{\alpha(1-\tilde p)}{\tilde p(1-\alpha)}\Big) \qquad\text{and}\qquad \hat t = \log\Big(\frac{\alpha(1-\hat p)}{\hat p(1-\alpha)}\Big), \eqno(3.30)$$

where

$$\tilde p = \Big[ n^{-1} \sum_{j=1}^n p_j^{-1} \Big]^{-1} \qquad\text{and}\qquad \hat p = 1 - \Big[ n^{-1} \sum_{j=1}^n (1-p_j)^{-1} \Big]^{-1}. \eqno(3.31)$$

Since $M_n(\cdot)$ is nondecreasing, it follows from (3.30)-(3.31) and by elementary analysis that

$$\hat t \le \bar t \le \tilde t. \eqno(3.32)$$

It may be verified that $\tilde t$ (respectively $\bar t$ or $\hat t$) is the value of $t^*$ obtained when $S_n$ follows a binomial $B(n, \tilde p)$ (respectively $B(n, p)$ or $B(n, \hat p)$) distribution. In general, $t^*$ is related to $\bar t$, $\tilde t$ and $\hat t$ via the following inequalities.

Theorem 3.3. For any $p < \alpha < 1$, we have

$$0 < \bar t \le t^* \le \tilde t, \eqno(3.33)$$

while for any $0 < \alpha < p$,

$$\hat t \le t^* \le \bar t < 0. \eqno(3.34)$$

Remark 3.3. For unequal values of $p_1, \dots, p_n$, there does not exist a simple expression of $t^*$ in closed form. However, since $M_n(t)$ and $M_n'(t)$ are easily computed for each value of $t$, it is straightforward to evaluate $t^*$ by a numerical algorithm (for instance of Newton-Raphson type) solving $M_n(t^*) = \alpha$ in either of the intervals given by (3.33) and (3.34). The latter bounds are sharp in the sense that, in the i.i.d. case where $p_1 = \dots = p_n = p$, we have $\bar t = \tilde t = \hat t = t^*$.
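The Newton-Raphson evaluation suggested in Remark 3.3 can be sketched as follows (an illustration of ours, using $M_n'(t) = \sigma_n^2(t)/n$ from (3.4) and the bracket end-point $\bar t$ as starting value; the function names are not the paper's).

```python
import math

# Sketch (ours) of the Newton-Raphson evaluation of t* from Remark 3.3,
# started at the bracket end-point t-bar of Theorem 3.3.

def t_star(ps, alpha, tol=1e-12):
    n = len(ps)
    def M(t):
        return sum(p * math.exp(t) / (1 - p + p * math.exp(t)) for p in ps) / n
    def Mprime(t):           # = sigma_n^2(t)/n, cf. (3.4)
        s = 0.0
        for p in ps:
            q = p * math.exp(t) / (1 - p + p * math.exp(t))
            s += q * (1 - q)
        return s / n
    pbar = sum(ps) / n       # arithmetic mean, the p of the text
    t = math.log(alpha * (1 - pbar) / (pbar * (1 - alpha)))   # t-bar
    for _ in range(100):
        step = (M(t) - alpha) / Mprime(t)
        t -= step
        if abs(step) < tol:
            break
    return t

ps = [0.05, 0.15, 0.1, 0.2, 0.08, 0.12]
alpha = 0.35
t = t_star(ps, alpha)

# check the bracket (3.33): for alpha > p, t-bar <= t* <= t-tilde
pbar = sum(ps) / len(ps)
ptil = 1.0 / (sum(1.0 / p for p in ps) / len(ps))   # harmonic mean
tbar = math.log(alpha * (1 - pbar) / (pbar * (1 - alpha)))
ttil = math.log(alpha * (1 - ptil) / (ptil * (1 - alpha)))
assert tbar - 1e-9 <= t <= ttil + 1e-9
```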
Proof. We limit ourselves to the proof of (3.33), since (3.34) may be obtained by similar arguments. Set $x_j = p_j/p$ and $y_j = \tilde p/p_j$ for $j = 1, \dots, n$, and let

$$G(u_1, \dots, u_n) = \alpha(1-p) \sum_{j=1}^n \frac{u_j}{1 - \alpha + u_j(\alpha - p)}, \eqno(3.35)$$

and

$$H(u_1, \dots, u_n) = \alpha(1-\tilde p) \sum_{j=1}^n \frac{1}{(1-\alpha)u_j + \alpha - \tilde p}. \eqno(3.36)$$

By the Lagrange multiplier method, we see that the extrema of $G$ under the constraint $\sum_{j=1}^n u_j = n$ are obtained by solving in $u_1, \dots, u_n$ and $\Lambda$ the equations

$$\frac{\partial}{\partial u_j}\Big( \frac{\alpha(1-p)\,u_j}{1 - \alpha + u_j(\alpha - p)} \Big) - \Lambda = 0 \quad\text{for } j = 1, \dots, n, \qquad\text{and}\qquad \sum_{j=1}^n u_j = n. \eqno(3.37)$$

Since the first $n$ equations of (3.37) imply that $u_1 = \dots = u_n$, it follows that $G(1, \dots, 1) = n\alpha$ is an extremum. To see whether it is a maximum or a minimum, it suffices to compute the corresponding Hessian matrix, which is diagonal since $(\partial^2/\partial u_i \partial u_j)G = 0$ for $i \ne j$, the diagonal terms being given by

$$\frac{\partial^2 G(1, \dots, 1)}{\partial u_j^2} = -\frac{2\alpha(1-\alpha)}{(1-p)^2}\,(\alpha - p) \quad\text{for } j = 1, \dots, n.$$

We see that $G(1, \dots, 1) = n\alpha$ is a maximum for $\alpha > p$ and a minimum for $\alpha < p$. It follows that

$$G(1, \dots, 1) = n\alpha = nM_n(t^*) \ge G(x_1, \dots, x_n) = nM_n(\bar t) \quad\text{for } \alpha > p.$$

A similar argument based on (3.36) shows likewise that

$$H(1, \dots, 1) = n\alpha = nM_n(t^*) \le H(y_1, \dots, y_n) = nM_n(\tilde t).$$

This readily proves (3.33) as sought. $\square$
In the remainder of this section, we investigate the asymptotic sharpness of the bounds given in Theorem 3.2 for $P(\pm S_n \ge \pm n\alpha)$. We will make use at times of the following additional assumptions on $p_1 = p_{1,n}, \dots, p_n = p_{n,n}$, $n \ge 1$, $p = p_n$ and $\alpha = \alpha_n$, as $n \to \infty$:

(A1) $\max(p_1, \dots, p_n)/\min(p_1, \dots, p_n) = O(1)$;
(A2) $n\alpha \to \infty$;
(A3) $n\alpha \ge 1$ is an integer;
(A4) $n\alpha^3 \to 0$.

Theorem 3.4. Under (A1)-(A4), we have for $p < \alpha < 1$,

$$P(S_n \ge n\alpha) = (1+o(1))\, P^+(n\alpha e^{-t^*}, e^{t^*})\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha(1 - e^{-t^*}) \Big), \eqno(3.38)$$

and for $0 < \alpha < p$,

$$P(S_n \le n\alpha) = (1+o(1))\, P^-(n\alpha e^{-t^*}, e^{t^*})\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha(1 - e^{-t^*}) \Big), \eqno(3.39)$$

as $n \to \infty$, where $P^\pm(\cdot, \cdot)$ is as in (3.22)-(3.23).
Proof. Using Stirling's formula (i.e. $m! = (m/e)^m \sqrt{2\pi m}\,\exp(\theta_m/(12m))$ with $0 < \theta_m < 1$ for $m \ge 1$), we obtain that for all $\mu > 0$ and $y > 0$ such that $\mu y \ge 1$ is an integer,

$$\frac{\mu^{\mu y} e^{-\mu}}{(\mu y)!} \ge \frac{1}{2\sqrt{2\pi\mu y}}\,\exp(-\mu(1 - y + y \log y)). \eqno(3.40)$$

By (3.40) and (A3), it follows that

$$P^\pm(n\alpha e^{-t^*}, e^{t^*}) \ge \frac{1}{2\sqrt{2\pi n\alpha}}\,\exp(-n\alpha(e^{-t^*} - 1 + t^*)). \eqno(3.41)$$

Combining (3.24), (3.25), (3.26) and (3.41), we see that (3.38) holds for $p < \alpha < 1$ ((3.39) holds for $0 < \alpha < p$) whenever

(A5) $\lambda_2^* = o((n\alpha)^{1/2})$.

We will show that the conditions of the theorem ensure that (A5) holds. For this sake, recall that

$$\lambda_2^* = \sum_{j=1}^n \Big( \frac{p_j e^{t^*}}{1 - p_j + p_j e^{t^*}} \Big)^2, \eqno(3.42)$$

which by (3.30) and (3.33) is, for $p < \alpha < 1$, less than or equal to the same expression with $t^*$ replaced by $\tilde t$. Next, we use (A4) to show that $\alpha \to 0$, and (A1) to obtain that $(1/p^2)\sum_{j=1}^n p_j^2 = O(n)$. This, when combined with (3.42), shows that $\lambda_2^* = O(n\alpha^2)$. By (A4), it follows that (A5) holds. A similar argument with the formal replacements of $\tilde t$ by $\hat t$ in (3.42), and of (3.33) by (3.34) yields, for $0 < \alpha < p$,

$$\lambda_2^* \le \sum_{j=1}^n \Big( \frac{p_j e^{\hat t}}{1 - p_j + p_j e^{\hat t}} \Big)^2. \eqno(3.43)$$

By (A1), we have $p_j = O(p)$ uniformly over $j = 1, \dots, n$. Moreover, by (A2) and (A4), it follows as before that $\lambda_2^* = O(n\alpha^2)$, whence (A5) holds, completing the proof. $\square$
It is interesting to seek a simple approximant of the coefficients $P^\pm(n\alpha e^{-t^*}, e^{t^*})$ in (3.38)-(3.39). For this sake, introduce the function

$$u(x, y) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\exp(x(e^{i\xi} - 1 - i\xi))}{1 - y e^{i\xi}}\, d\xi. \eqno(3.44)$$

This function satisfies (see e.g. Hoglund (1979)), for $0 < y < 1$,

$$\lim_{x\to\infty} \sqrt{2\pi x}\; u(x, y) = \frac{1}{1-y}. \eqno(3.45)$$

In view of (3.44)-(3.45), we have the following approximation of $P(\pm S_n \ge \pm n\alpha)$.

Theorem 3.5. Assume that (A1)-(A4) hold, together with

(A6) $np \to \infty$.

Then, for any $p < \alpha < 1$, we have

$$P(S_n \ge n\alpha) = (1+o(1))\, u(n\alpha, e^{-t^*})\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big), \eqno(3.46)$$

while for $0 < \alpha < p$,

$$P(S_n \le n\alpha) = (1+o(1))\, u(n\alpha, e^{t^*})\exp\Big( \sum_{j=1}^n L_j(t^*) - n\alpha t^* \Big). \eqno(3.47)$$
Proof. By Theorem 3.3 and (3.32), we have

$$e^{\hat t} \le e^{t^*} \le e^{\tilde t}. \eqno(3.48)$$

By (3.30)-(3.31), (3.48) and (A1)-(A3), it follows that

$$n\alpha\, e^{-t^*} \ge \frac{n\tilde p(1-\alpha)}{1-\tilde p}. \eqno(3.49)$$

Thus, by (A6) and (3.49), $n\alpha e^{-t^*} \to \infty$. This last condition enables us to apply Theorem A of Hoglund (1979) to evaluate $P^\pm(n\alpha e^{-t^*}, e^{t^*})$ as deviations of the Poisson distribution. Note that the original assumptions in Hoglund's theorem require $e^{t^*}$ to be bounded. However, in the particular case of the Poisson distribution, a direct calculus based on Stirling's formula shows that this restriction is not necessary. In view of Theorem 3.3, we so obtain the conclusion of Theorem 3.5. $\square$
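The contour integral (3.44) is easy to evaluate numerically. The sketch below (ours, illustrative) computes $u(x, y)$ by trapezoidal quadrature; for integer $x$ and $0 < y < 1$ the integral also equals the weighted Poisson sum $\sum_{k \le x} y^{x-k} x^k e^{-x}/k!$, which provides an exact cross-check, and the limit (3.45) can be observed for large $x$.

```python
import cmath
import math

# Illustrative evaluation (not from the paper) of u(x, y) from (3.44),
# cross-checked against its series form for integer x, and against (3.45).

def u(x, y, N=4096):
    s = 0.0
    for j in range(N):
        xi = -math.pi + 2.0 * math.pi * (j + 0.5) / N
        z = (cmath.exp(x * (cmath.exp(1j * xi) - 1 - 1j * xi))
             / (1 - y * cmath.exp(1j * xi)))
        s += z.real              # the integral is real by symmetry
    return s / N

def u_sum(x, y):
    # valid for integer x and 0 < y < 1
    return sum(y ** (x - k) * math.exp(k * math.log(x) - x - math.lgamma(k + 1))
               for k in range(x + 1))

assert abs(u(30, 0.4) - u_sum(30, 0.4)) < 1e-8
# (3.45): sqrt(2 pi x) u(x, y) -> 1/(1 - y) as x -> infinity
assert abs(math.sqrt(2 * math.pi * 400) * u(400, 0.4) - 1 / 0.6) < 0.2
```

For integer $x$ the integrand is a smooth periodic function of $\xi$, so the trapezoidal rule converges extremely fast.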
4. Simple approximations of $P(\pm S_n \ge \pm n\alpha)$

In order to avoid the technicalities introduced by the numerical solution of the equation (3.6), i.e. $M_n(t^*) = \alpha$, or alternatively by the replacement of $t^*$ by an upper or lower bound selected among $\bar t$, $\tilde t$ or $\hat t$, we will assume here that $p_1 = \dots = p_n = p$. It follows from Section 2 that our arguments may be extended without difficulty through simple inequalities to the general case where $p_1, \dots, p_n$ are arbitrary. Following the notation introduced in Section 1, we shall denote by $S_n$ a random variable following a binomial $B(n, p)$ distribution. A direct application of the results of Section 3 shows that

$$e^{t^*} = \frac{\alpha(1-p)}{p(1-\alpha)}, \qquad L_j(t^*) = \log\Big( \frac{1-p}{1-\alpha} \Big) \quad\text{for } j = 1, \dots, n. \eqno(4.1)$$
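The closed form (4.1) can be checked in a couple of lines (an illustration of ours): substituting $e^{t^*} = \alpha(1-p)/(p(1-\alpha))$ into $M_n$ returns $\alpha$ exactly.

```python
import math

# Quick illustrative check (not from the paper) that (4.1) solves (3.6)
# in the i.i.d. case.

p, alpha = 0.05, 0.2
t = math.log(alpha * (1 - p) / (p * (1 - alpha)))
M = p * math.exp(t) / (1 - p + p * math.exp(t))    # M_n(t*) in the iid case
assert abs(M - alpha) < 1e-12

L = math.log(1 - p + p * math.exp(t))              # L_j(t*)
assert abs(L - math.log((1 - p) / (1 - alpha))) < 1e-12
```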
An application of Theorem 3.4 in combination with (4.1) and routine computations yields the following corollary of this theorem.

Corollary 4.1. Under (A2)-(A4), we have, as $n \to \infty$, for $p < \alpha < 1$,

$$P(S_n \ge n\alpha) = (1+o(1))\, P^+(n\alpha e^{-t^*}, e^{t^*})\Big( \frac{1-p}{1-\alpha} \Big)^n \exp(-n\alpha(1 - e^{-t^*})), \eqno(4.2)$$

while for $0 < \alpha < p$,

$$P(S_n \le n\alpha) = (1+o(1))\, P^-(n\alpha e^{-t^*}, e^{t^*})\Big( \frac{1-p}{1-\alpha} \Big)^n \exp(-n\alpha(1 - e^{-t^*})), \eqno(4.3)$$

where $\lambda_2^* = n\alpha^2$.
Remark 4.1. A comparison of (4.2)-(4.3) with the exact binomial probabilities shows that these expressions are obtained by replacing the coefficient $n!/(n-k)!$ by $n^k e^{-n\alpha}/(1-\alpha)^{n-k}$ in the expression

$$P(S_n = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}.$$

An alternative proof of (4.2)-(4.3) could be obtained by using Stirling's formula for $n!$ and $(n-k)!$, which yields, when $n \to \infty$, $n - k \to \infty$, $k/n \to 0$ and $k = n\alpha(1+o(1))$,

$$\frac{n!}{(n-k)!} = (1+o(1))\, \frac{n^k e^{-n\alpha}}{(1-\alpha)^{n-k}}.$$
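This Stirling replacement can be observed numerically. The sketch below is ours (the choice $\alpha = n^{-0.4}$, which satisfies $n\alpha \to \infty$ and $n\alpha^3 \to 0$, is an assumption of the illustration): the logarithm of the ratio of the two sides tends to 0 as $n$ grows.

```python
import math

# Illustrative check (not from the paper) of the replacement of
# n!/(n-k)! by n^k e^{-n alpha}/(1-alpha)^{n-k} when n -> infinity,
# n - k -> infinity, k/n -> 0 and k = n*alpha*(1+o(1)).

def log_ratio(n, alpha):
    k = round(n * alpha)
    exact = math.lgamma(n + 1) - math.lgamma(n - k + 1)     # log n!/(n-k)!
    approx = k * math.log(n) - n * alpha - (n - k) * math.log(1 - alpha)
    return exact - approx

# alpha = n^{-0.4} satisfies (A2) and (A4); the log-ratio shrinks with n
errs = [abs(log_ratio(n, n ** (-0.4))) for n in (10**4, 10**5, 10**6)]
assert errs[-1] < 0.1 and errs[-1] < errs[0]
```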
Remark 4.2. By combining Corollary 4.1 with Theorem 2.1, one obtains simple upper bounds for $P(\pm S_n \ge \pm n\alpha)$ under very general conditions. We do not offer here a detailed numerical investigation of such bounds, for the sake of brevity.
Acknowledgement

We thank the referees for helpful comments concerning the contents of this paper.
References

Anderson, T.W. and S.M. Samuels (1967). Some inequalities among binomial and Poisson probabilities. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 1-12.
Barbour, A.D. and P. Hall (1984). On the rate of Poisson convergence. Math. Proc. Cambridge Philos. Soc. 95, 473-480.
Barbour, A.D. and J.L. Jensen (1989). Local and tail approximations near the Poisson limit. Scand. J. Statist. 16, 75-87.
Deheuvels, P. and D. Pfeifer (1986a). A semigroup approach to Poisson approximation. Ann. Probab. 14, 663-676.
Deheuvels, P. and D. Pfeifer (1986b). Operator semigroups and Poisson convergence in selected metrics. Semigroup Forum 34, 203-224.
Deheuvels, P. and D. Pfeifer (1988). On a relationship between Uspensky's theorem and Poisson approximations. Ann. Inst. Statist. Math. 40, 671-681.
Deheuvels, P., M.L. Puri and S.S. Ralescu (1989). Asymptotic expansions for sums of nonidentically distributed Bernoulli random variables. J. Multivariate Anal. 28, 282-303.
Hall, P. (1982). Rates of Convergence in the Central Limit Theorem. Pitman, Boston, MA.
Hoeffding, W. (1956). On the distribution of the number of successes in independent trials. Ann. Math. Statist. 27, 713-721.
Hoglund, T. (1979). A unified formulation of the central limit theorem for small and large deviations from the mean. Z. Wahrsch. Verw. Geb. 49, 105-117.
Johnson, N.L. and S. Kotz (1969). Discrete Distributions. Houghton Mifflin, Boston, MA.
Molenaar, W. (1973). Approximations to the Poisson, Binomial and Hypergeometric Distribution Functions. Math. Centre Tracts Vol. 31, Math. Centrum, Amsterdam.
Petrov, V.V. (1965). On the probabilities of large deviations of sums of independent random variables. Theory Probab. Appl. 10, 287-298.
Petrov, V.V. (1975). Sums of Independent Random Variables. Springer, Berlin.
Serfling, R. (1978). Some elementary results on Poisson approximation in a sequence of Bernoulli trials. SIAM Rev. 20, 567-579.
Shorack, G.R. and J.A. Wellner (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
Shorgin, S.Ya. (1977). Approximation of a generalized binomial distribution. Theory Probab. Appl. 22, 846-850.
Van Beek, P. (1972). An application of Fourier methods to the problem of sharpening the Berry-Esseen inequality. Z. Wahrsch. Verw. Geb. 23, 187-196.