Journal of Statistical Planning and Inference 141 (2011) 1380–1393
Polynomial structures in rank statistics distributions

Claude Lefèvre (a,*), Philippe Picard (b)

(a) Université Libre de Bruxelles, Département de Mathématique, Campus de la Plaine, C.P. 210, B-1050 Bruxelles, Belgium
(b) Université de Lyon, Institut de Science Financière et d'Assurances, 50 Avenue Tony Garnier, F-69007 Lyon, France
Article history: Received 16 February 2010; accepted 15 October 2010; available online 21 October 2010.

Keywords: Rank statistics; Kolmogorov–Smirnov two-sample test; Generalized Appell and Abel–Gontcharoff polynomials; Negative binomial type polynomials; Lattice path enumeration.

Abstract

This paper deals with the classical problem of how to evaluate the joint rank statistics distributions for two independent i.i.d. samples from a common continuous distribution. It is pointed out that these distributions rely on an underlying polynomial structure of negative binomial type. That property is exploited to obtain, in a systematic and unified way, closed forms and simple recursions, some well established, for computing the joint tail and rectangular probabilities of interest. © 2010 Elsevier B.V. All rights reserved.
1. Introduction

There is a vast literature devoted to the derivation and analysis of the distributions of order statistics and rank statistics for i.i.d. samples. Much can be found in the comprehensive books by Conover (1999), David and Nagaraja (2003), Arnold et al. (2008) and Shorack and Wellner (2009).

Motivation: A previous paper (Denuit et al., 2003) was concerned with the exact joint tail and rectangular distributions of order statistics for i.i.d. random variables with arbitrary common law. In the one-sided case, the distributions were shown to rely on an underlying polynomial structure which is of Abel–Gontcharoff type for the left tail and of Appell type for the right tail. A major interest of that property is to allow us to obtain, in a unified way, different recursions for evaluating these probabilities. The two-sided case was analyzed in a similar way. An application to the Kolmogorov–Smirnov goodness-of-fit test was also presented. The sequence of polynomials that plays a key role in this study is the family of monomials $\{x^n/n!,\ n \ge 0\}$.

The present work deals with the exact joint distributions of the rank statistics for two independent i.i.d. samples from a common continuous law. Our purpose is to point out how a similar methodology can be applied when the basic sequence of polynomials is now the family of negative binomial type polynomials $\{\binom{x+n-1}{n},\ n \ge 0\}$. In Section 2, we show that the left and right tail distributions can be written in terms of a special case, named negative binomial, of the wider family of generalized Abel–Gontcharoff and Appell polynomials. The main properties of these polynomials are examined in the Appendix. The existence of such a polynomial structure will provide us with closed forms and simple recursions for determining the probabilities of interest. In Section 3, we prove that the two-sided distributions can be evaluated by means of two remarkable recursions which rely on a similar, but more complex, algebraic structure.
Finally, the results are illustrated in Section 4 by applying the Kolmogorov–Smirnov (K–S) two-sample test to some data sets.
(*) Corresponding author: C. Lefèvre. doi:10.1016/j.jspi.2010.10.007
The approach developed here is closely related to the pioneering paper by Steck (1969) and the nice works by Mohanty (1980) and Niederhausen (1981). It starts by using a well-known straightforward connection with problems of lattice path enumeration. For this topic, see e.g. Mohanty (1979) and Balakrishnan (1997) with the references therein.

K–S two-sample test: Following Steck (1969), we introduce (and will illustrate) the problem through the K–S two-sample test. Let $X_{(1)},\ldots,X_{(n)}$ and $Y_{(1)},\ldots,Y_{(m)}$ be the order statistics for two independent samples from continuous distribution functions $F$ and $G$. Let $F_n(x)$ and $G_m(x)$ be the corresponding empirical distribution functions. To test the null hypothesis $H_0: F = G$ against the one-sided alternatives $H_1: F < G$ or $H_1: F > G$, the K–S one-sided statistics are $D^- = \max_x [G_m(x) - F_n(x)]$ or $D^+ = \max_x [F_n(x) - G_m(x)]$. For the two-sided alternative $H_1: F \ne G$, the K–S two-sided statistic is $D = \max(D^-, D^+)$.

Suppose that $H_0$ is true. The two samples may be combined into a single one of size $m+n$. Let $i + R_i$ denote the rank of $X_{(i)}$, $1 \le i \le n$, in this new ordered sample. By construction, $0 \le R_1 \le \cdots \le R_n \le m$. We are going to show how to determine the joint tail or rectangular distributions of the random vector $(R_1,\ldots,R_n)$. Now, it is rather easy to see that the distribution functions of $D^-$, $D^+$ and $D$ can be rewritten as probabilities of this form. Specifically,
$$P(D^- \le d) = P(R_1 \le t_1,\ldots,R_n \le t_n), \quad\text{where } t_i = \min(\lfloor m(i-1)/n + md \rfloor,\ m),$$
with $\lfloor z \rfloor$ representing the largest integer $\le z$,
$$P(D^+ \le d) = P(R_1 \ge s_1,\ldots,R_n \ge s_n), \quad\text{where } s_i = \max(0,\ \lceil mi/n - md \rceil),$$
with $\lceil z \rceil$ the smallest integer $\ge z$, and
$$P(D \le d) = P(s_1 \le R_1 \le t_1,\ldots,s_n \le R_n \le t_n).$$
Note that $D^-$ and $D^+$ have the same distribution (as the two vectors $(m - R_1,\ldots,m - R_n)$ and $(R_n,\ldots,R_1)$ have the same distribution, and $m - s_i = t_{n-i+1}$, $1 \le i \le n$).

2. One-sided rank statistics distributions

Let us consider the rank statistics framework described above.
The distribution of the random vector $(R_1,\ldots,R_n)$ can be obtained by counting lattice paths in a particular grid of $\mathbb{N}^2$. Specifically, the $m+n$ ordered random variables are followed one by one and, starting at point $(0,0)$, one associates a step $(1,0)$ to each occurrence of an $X_{(i)}$ and a step $(0,1)$ to each occurrence of a $Y_{(j)}$, so that the first coordinate counts the $X$'s and the second the $Y$'s already met. Such a path, random by construction, is non-decreasing and always stops at point $(n,m)$. In fact, it can be simply represented by the sequence of points $\{(0,R_1),(1,R_2),\ldots,(n-1,R_n),(n,m)\}$. The number of paths from $(0,0)$ to $(n,m)$ being equal to $\binom{m+n}{n}$, one gets, for $0 \le r_1 \le r_2 \le \cdots \le r_n \le m$,
$$P(R_1 = r_1, R_2 = r_2,\ldots,R_n = r_n) = 1 \Big/ \binom{m+n}{n}.$$

Our purpose in this section is to determine the joint left or right tail distributions of $(R_1,\ldots,R_n)$. This means that a boundary, upper or lower, will now be imposed on the lattice paths in the grid. Denote such a lattice path by $\{(0,y_0),(1,y_1),\ldots,(n-1,y_{n-1})\}$ (the last point $(n,m)$ being omitted in the notation). Let $\{s_0,\ldots,s_{n-1}\}$ be a non-decreasing family of non-negative integers with $s_{n-1} \le m$, and consider an associated boundary $\mathcal{F} = \{(0,s_0),(1,s_1),\ldots,(n-1,s_{n-1})\}$. Given some integer $z \in [0,s_0]$, we denote by $N_{u,n}(z,s_0,s_1,\ldots,s_{n-1})$ the number of lattice paths satisfying
$$y_0 \le s_0,\ y_1 \le s_1,\ \ldots,\ y_{n-1} \le s_{n-1} \quad\text{and}\quad y_0 \ge z.$$
When $z = 0$, this means that $\mathcal{F} \equiv \mathcal{F}_u$ is an upper boundary to the lattice paths in the grid. Then, one has
$$P(z \le R_1 \le s_0, R_2 \le s_1,\ldots,R_n \le s_{n-1}) = N_{u,n}(z,s_0,\ldots,s_{n-1}) \Big/ \binom{m+n}{n}. \tag{2.1}$$
Similarly, given some integer $z \in [s_{n-1},m]$, we write $N_{l,n}(s_0,s_1,\ldots,s_{n-1},z)$ for the number of lattice paths such that
$$y_0 \ge s_0,\ y_1 \ge s_1,\ \ldots,\ y_{n-1} \ge s_{n-1} \quad\text{and}\quad y_{n-1} \le z.$$
When $z = m$, $\mathcal{F} \equiv \mathcal{F}_l$ corresponds to a lower boundary in the grid. Then,
$$P(R_1 \ge s_0,\ldots,R_{n-1} \ge s_{n-2},\ z \ge R_n \ge s_{n-1}) = N_{l,n}(s_0,\ldots,s_{n-1},z) \Big/ \binom{m+n}{n}. \tag{2.2}$$
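The counts $N_{u,n}$ and $N_{l,n}$ can be checked numerically on small cases. The Python sketch below evaluates the defining constraints by a direct dynamic programme over path heights; it is not one of the recursions of the paper, and the names `count_upper`, `count_lower` and `prob_upper` are illustrative choices.

```python
from fractions import Fraction
from math import comb

def count_upper(z, s):
    """Count non-decreasing paths (y_0..y_{n-1}) with z <= y_0 and y_i <= s[i]."""
    if not s:
        return 1
    top = s[-1]
    # ways[y] = number of partial paths whose current height is y
    ways = [1 if z <= y <= s[0] else 0 for y in range(top + 1)]
    for bound in s[1:]:
        running, nxt = 0, []
        for y in range(top + 1):
            running += ways[y]                  # previous height may be any value <= y
            nxt.append(running if y <= bound else 0)
        ways = nxt
    return sum(ways)

def count_lower(s, z):
    """Count non-decreasing paths with y_i >= s[i] for all i and y_{n-1} <= z."""
    if not s:
        return 1
    ways = [1 if y >= s[0] else 0 for y in range(z + 1)]
    for bound in s[1:]:
        running, nxt = 0, []
        for y in range(z + 1):
            running += ways[y]
            nxt.append(running if y >= bound else 0)
        ways = nxt
    return sum(ways)

def prob_upper(z, s, m):
    """Left-tail probability (2.1) for a sample of size n = len(s) against size m."""
    n = len(s)
    return Fraction(count_upper(z, s), comb(m + n, n))
```

With the trivial boundary $s_i = m$ and $z = 0$, every path is counted and `prob_upper` returns 1, which is a convenient sanity check.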
2.1. Polynomial structures

In case of solely an upper horizontal boundary at level $x-1$ ($\le m$), $N_{u,n}(z = 0, x-1,\ldots,x-1) = N_{l,n}(0,\ldots,0,z = x-1)$ and this number is obviously equal to the basic polynomial $e_n(x) = \binom{x+n-1}{n}$. We are going to establish that more generally, the numbers $N_{u,n}$ and $N_{l,n}$ correspond to some polynomials $G_n$ and $A_n$ of negative binomial type, respectively.

Proposition 2.1. Given the above upper and lower boundaries (with $n \ge 1$),
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = G_n(1-z \mid \{-s_0,\ldots,-s_{n-1}\}), \tag{2.3}$$
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = A_n(1+z \mid \{s_0,\ldots,s_{n-1}\}), \tag{2.4}$$
where the polynomials $G_n$ and $A_n$ are of negative binomial type.
Proof. Let us begin with $N_{u,n}$. This number can be represented under the form
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = \sum_{y_0=z}^{s_0}\ \sum_{y_1=y_0}^{s_1} \cdots \sum_{y_{n-1}=y_{n-2}}^{s_{n-1}} 1. \tag{2.5}$$
Choose an arbitrary integer $a$ satisfying $a > s_{n-1}$. Then, for $0 \le i \le n-1$, let us operate the change of indices $j_i = a - y_i$ and put $1 + u_i = a - s_i$, where $u_i \ge 0$ as $a > s_i$. The sum (2.5) so becomes
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = \sum_{j_0=1+u_0}^{a-z}\ \sum_{j_1=1+u_1}^{j_0} \cdots \sum_{j_{n-1}=1+u_{n-1}}^{j_{n-2}} 1.$$
Note that the $u_i$'s are non-increasing and $a - z \ge a - s_0 = 1 + u_0$. From (A.25), we then find that
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = G_n(a-z \mid \{u_0,\ldots,u_{n-1}\}).$$
Inserting the expression of the $u_i$'s gives
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = G_n(a-z \mid \{a-1-s_0,\ldots,a-1-s_{n-1}\}),$$
which becomes (2.3) by an elementary property of $G_n$ (mentioned after (A.16)).

Let us turn to the number $N_{l,n}$. We can express it as
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = \sum_{y_{n-1}=s_{n-1}}^{z}\ \sum_{y_{n-2}=s_{n-2}}^{y_{n-1}} \cdots \sum_{y_0=s_0}^{y_1} 1. \tag{2.6}$$
Choose now any integer $b > x$. For $0 \le i \le n-1$, let us make the transforms $j_i = b - x + y_i$ and put $1 + v_i = b - x + s_i$, where $v_i \ge 0$ by construction. Then, (2.6) becomes
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = \sum_{j_{n-1}=1+v_{n-1}}^{b-x+z}\ \sum_{j_{n-2}=1+v_{n-2}}^{j_{n-1}} \cdots \sum_{j_0=1+v_0}^{j_1} 1,$$
and using (A.25) and (A.10), we obtain
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = A_n(b-x+z \mid \{v_0,\ldots,v_{n-1}\}).$$
Substituting the expression of the $v_i$'s then leads to (2.4). □
Remark. (i) For a horizontal barrier at level $x-1$, (2.3) and (2.4) reduce to $e_n(x)$, as expected. Indeed, from (2.3) and formula (A.19), one gets
$$N_{u,n}(0,x-1,\ldots,x-1) = G_n(1 \mid \{-x+1,\ldots,-x+1\}) = e_n(x).$$
Similarly, using (A.10) and (A.19), $N_{l,n}(0,\ldots,0,x-1)$ given by (2.4) becomes $e_n(x)$.

(ii) $N_{u,n}$ and $N_{l,n}$ are directly connected. So, consider all the paths counted by $N_{l,n}$. Making a rotation of $180^\circ$ and following the paths in reversed time shows that
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = N_{u,n}(0,z-s_{n-1},\ldots,z-s_0).$$
Formulas (2.3) and (2.4) can now be linked through properties of $G_n$ and $A_n$. Starting from (2.3) and thanks to (A.10), we have
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = G_n(1 \mid \{-z+s_{n-1},\ldots,-z+s_0\}) = A_n(1 \mid \{-z+s_0,\ldots,-z+s_{n-1}\}),$$
which is equivalent to (2.4).

If the boundary is linear, (2.3) and (2.4) provide quite simple results. Indeed, $G_n$ and $A_n$ can then be written explicitly from (A.19) with (A.10).

Corollary 2.2. Let $s_i = a + bi$, $0 \le i \le n-1$, with $a,b \ge 0$. Then,
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = (1-z+a)\,\frac{e_n(1-z+a+bn)}{1-z+a+bn}, \tag{2.7}$$
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = [1+z-a-b(n-1)]\,\frac{e_n(1+z-a+b)}{1+z-a+b}. \tag{2.8}$$
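As a quick illustration of Corollary 2.2, the following Python sketch evaluates the closed forms (2.7) and (2.8) with exact rational arithmetic and compares them with a naive enumeration of paths. The helper names (`e`, `upper_linear`, `lower_linear`, `brute_force`) are assumptions made for this example, and the enumeration is only practical for small $n$.

```python
from fractions import Fraction
from itertools import product

def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n!, i.e. binom(x+n-1, n)."""
    value = Fraction(1)
    for i in range(n):
        value = value * (x + i) / (i + 1)
    return value

def upper_linear(z, a, b, n):
    """Closed form (2.7) for the upper boundary s_i = a + b*i."""
    x = 1 - z + a + b * n
    return (1 - z + a) * e(n, x) / x

def lower_linear(z, a, b, n):
    """Closed form (2.8) for the lower boundary s_i = a + b*i."""
    x = 1 + z - a + b
    return (1 + z - a - b * (n - 1)) * e(n, x) / x

def brute_force(low, high):
    """Enumerate non-decreasing (y_0..y_{n-1}) with low[i] <= y_i <= high[i]."""
    ranges = [range(lo, hi + 1) for lo, hi in zip(low, high)]
    return sum(1 for ys in product(*ranges)
               if all(ys[i] <= ys[i + 1] for i in range(len(ys) - 1)))

# Boundary s = (1, 3, 5), i.e. a = 1, b = 2, n = 3
print(upper_linear(0, 1, 2, 3), brute_force([0, 0, 0], [1, 3, 5]))
print(lower_linear(6, 1, 2, 3), brute_force([1, 3, 5], [6, 6, 6]))
```

Both lines print the same pair of values for this boundary, in agreement with the rotation argument of Remark (ii).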
In case of an arbitrary (non-decreasing) boundary, a closed form for (2.3) and (2.4) follows from (A.20) with (A.10). We note that (2.9) below with $z = 0$ corresponds to formula (21) derived in Mohanty (1980).
Corollary 2.3.
$$N_{u,n}(z,s_0,\ldots,s_{n-1}) = \sum_{r=1}^{n} \sum_{C_r} e_{j_1}(1-z+s_0) \prod_{i=2}^{r} e_{j_i}(s_{\sigma_{i-1}} - s_{\sigma_{i-2}}), \tag{2.9}$$
$$N_{l,n}(s_0,\ldots,s_{n-1},z) = \sum_{r=1}^{n} \sum_{C_r} e_{j_1}(1+z-s_{n-1}) \prod_{i=2}^{r} e_{j_i}(s_{n-1-\sigma_{i-2}} - s_{n-1-\sigma_{i-1}}), \tag{2.10}$$
where $\sigma_0 = 0$, $\sigma_i = j_1 + \cdots + j_i$, $i \ge 1$, and $C_r = \{(j_1 \ge 1,\ldots,j_r \ge 1) : \sigma_r = n\}$.

2.2. Associated recursions

Thanks to the properties of $G_n$ and $A_n$, the numbers of paths $N_u$ and $N_l$ can be easily determined by recursion. Specifically, let $U = \{s_i, i \ge 0\}$ be a non-decreasing sequence of non-negative integers, and consider the successive boundaries $\mathcal{F}_k = \{(0,s_0),(1,s_1),\ldots,(k-1,s_{k-1})\}$ for $k = 1,2,\ldots$. Write $\mathcal{F}_{u,k}$ in case of an upper boundary and let $N_{u,k}(z,U) \equiv N_{u,k}(z,s_0,\ldots,s_{k-1})$ be the number of paths below this boundary with, as before, $z \le y_0$ for some integer $z \in [0,s_0]$; put $N_{u,0}(z,U) = 1$. Similarly, let $N_{l,k}(z,U) \equiv N_{l,k}(s_0,\ldots,s_{k-1},z)$ be the number of paths above the lower boundary $\mathcal{F}_{l,k}$ such that $y_{k-1} \le z$ for some integer $z \ge s_{k-1}$; put $N_{l,0}(z,U) = 1$.

Proposition 2.4. The $N_{u,k}(z,U)$'s satisfy the following recursion: for any $y \in \mathbb{Z}$,
$$e_k(y+1-z) = \sum_{j=0}^{k} e_{k-j}(y-s_j)\, N_{u,j}(z,U), \qquad k \ge 1. \tag{2.11}$$
Proof. Given any real family $\tilde{U}$, we have from (A.18) that
$$e_k(y+1-z) = [S^y e_k(x)]_{x=1-z} = \sum_{j=0}^{k} e_{k-j}(\tilde{u}_j)\, G_j(y+1-z \mid \tilde{U}).$$
Let us take $\tilde{u}_i = y - s_i$, $0 \le i \le k-1$; this gives
$$e_k(y+1-z) = \sum_{j=0}^{k} e_{k-j}(y-s_j)\, G_j(1-z \mid -U),$$
where $-U = \{-s_i, i \ge 0\}$, hence (2.11) by virtue of (2.3). □
In particular, choosing $y = z-1$ in (2.11) yields, by (A.1),
$$0 = \sum_{j=0}^{k} e_{k-j}(z-1-s_j)\, N_{u,j}(z,U), \qquad k \ge 1. \tag{2.12}$$
On the other hand, taking $y = s_{k-1}$ gives
$$e_k(s_{k-1}+1-z) = \sum_{j=0}^{k} e_{k-j}(s_{k-1}-s_j)\, N_{u,j}(z,U), \qquad k \ge 1. \tag{2.13}$$
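Solving (2.13) for its $j = k$ term (whose coefficient is $e_0 = 1$) gives $N_{u,k}$ in terms of the earlier counts. A minimal Python sketch of that recursion follows; the function names are illustrative, and the integer evaluation of $e_n$ relies on the fact that a product of $n$ consecutive integers is divisible by $n!$.

```python
from math import factorial

def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for an integer x."""
    num = 1
    for i in range(n):
        num *= x + i
    return num // factorial(n)   # exact: n consecutive integers are divisible by n!

def upper_counts(z, s):
    """Return [N_{u,0}, ..., N_{u,n}] for the upper boundary s, using (2.13)."""
    counts = [1]                              # N_{u,0} = 1 by convention
    for k in range(1, len(s) + 1):
        top = s[k - 1]
        total = e(k, top + 1 - z)             # left-hand side of (2.13)
        for j in range(k):
            total -= e(k - j, top - s[j]) * counts[j]
        counts.append(total)                  # remaining term is e_0 * N_{u,k}
    return counts

print(upper_counts(0, [1, 3, 4]))
```

For the boundary $(1,3,4)$ with $z = 0$ this prints `[1, 2, 7, 23]`, which can be confirmed by listing the paths by hand.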
When $z = 0$, both recursions (2.12) and (2.13) were derived by Mohanty (1980) (formulas (16) and (18)) by using a combinatorial approach. Note that in (2.12), the terms in the sum have alternating signs; in (2.13), all the terms are non-negative, which is numerically more stable.

Proposition 2.5. The $N_{l,k}(z,U)$'s satisfy the following recursion: for any $y \in \mathbb{Z}$,
$$A_k(y+1+z \mid U) = \sum_{j=0}^{k} e_{k-j}(y)\, N_{l,j}(z,U), \qquad k \ge 1. \tag{2.14}$$
Proof. By (A.13), for any real family $\tilde{U}$,
$$A_k(y+1+z \mid \tilde{U}) = \sum_{j=0}^{k} e_{k-j}(y)\, A_j(1+z \mid \tilde{U}).$$
Choosing $\tilde{u}_i = s_i$, $0 \le i \le k-1$, and using (2.4) then yields (2.14). □
In particular, since $A_k(s_{k-1} \mid \{s_0,\ldots,s_{k-1}\}) = 0$ (see (A.12) with $r = 0$), taking $y = s_{k-1}-1-z$ in (2.14) gives the recursion
$$0 = \sum_{j=0}^{k} e_{k-j}(s_{k-1}-1-z)\, N_{l,j}(z,U), \qquad k \ge 1. \tag{2.15}$$
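Recursion (2.15) can be used in the same way for a lower boundary, again by isolating the $j = k$ term. The Python sketch below is one possible reading of it; the names are illustrative and $z \ge s_{n-1}$ is assumed, as in the text.

```python
from math import factorial

def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for an integer x (possibly negative)."""
    num = 1
    for i in range(n):
        num *= x + i
    return num // factorial(n)   # exact division

def lower_counts(s, z):
    """Return [N_{l,0}, ..., N_{l,n}] for the lower boundary s and end level z, using (2.15)."""
    counts = [1]                              # N_{l,0} = 1 by convention
    for k in range(1, len(s) + 1):
        arg = s[k - 1] - 1 - z                # negative, since z >= s_{k-1}
        counts.append(-sum(e(k - j, arg) * counts[j] for j in range(k)))
    return counts

print(lower_counts([1, 1, 3], 4))
```

For the boundary $(1,1,3)$ with $z = 4$ this prints `[1, 4, 10, 16]`.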
Observe that the terms in the sum (2.15) are of alternating signs. Another possible method is to put $z = -1$ and then $y = 1+z$ in (2.14); using (2.4), this yields the formula
$$N_{l,k}(z,U) = \sum_{j=0}^{k} e_{k-j}(1+z)\, A_j(0 \mid U), \qquad k \ge 1, \tag{2.16}$$
where the $A_j(0 \mid U)$'s can now be determined recursively through (A.14). Note that the terms in these sums are not of constant sign.

3. Two-sided rank statistics distributions

We pursue the analysis by examining the joint rectangular probabilities for the random vector $(R_1,\ldots,R_n)$. Let $U = \{s_i, i \ge 0\}$ and $V = \{t_i, i \ge 0\}$ be two non-decreasing sequences of non-negative integers satisfying $s_i \le t_i$ for all $i$. Then, with respect to the previous grid, one has
$$P(s_0 \le R_1 \le t_0,\ldots,s_{n-1} \le R_n \le t_{n-1}) = N_{lu,n}(U,V) \Big/ \binom{m+n}{n}, \tag{3.1}$$
where $N_{lu,n}(U,V) \equiv N_{lu,n}(s_0,\ldots,s_{n-1},t_0,\ldots,t_{n-1})$ is the number of paths of the form $\{(0,y_0),\ldots,(n-1,y_{n-1})\}$ such that
$$s_0 \le y_0 \le t_0,\ \ldots,\ s_{n-1} \le y_{n-1} \le t_{n-1}.$$
Furthermore, that joint probability is allowed to contain, as before, the supplementary event that either $z \le R_1$ for some integer $z \in [0,t_0]$ or $R_n \le z$ for some integer $z \in [s_{n-1},m]$. This addition implies an obvious adaptation of formula (3.1).

Let us now focus on the associated lattice path counting problem. We start by introducing a family of functions, $\{f_n, n \ge 0\}$, defined on $\mathbb{Z}$ by
$$f_0 = 1 \quad\text{and}\quad f_n(x) = (-1)^n \binom{-x}{n}_+, \qquad n \ge 1, \tag{3.2}$$
where we put $\binom{x}{n}_+ = \binom{x}{n}$ (resp. $= 0$) if $x \ge n$ (resp. $x < n$). These $f_n$'s remind us, of course, of the negative binomial type polynomials $e_n$. In fact, it is directly checked that an equivalent expression is
$$f_n(x) = e_n(x)\,\mathbf{1}(x < 0), \qquad n \ge 1, \tag{3.3}$$
i.e. $f_n$, $n \ge 1$, corresponds to a truncated $e_n$ with value 0 on $\mathbb{Z}^+$.

Here too, we consider two-sided boundaries $\mathcal{F}_{lu,k} = \{(s_0,t_0),\ldots,(s_{k-1},t_{k-1})\}$, for $k = 1,2,\ldots$. We will mainly derive two important relations (given by Propositions 3.1 and 3.2) which allow us to determine the number of paths desired. Both formulas are expressed in terms of the functions $f_n$ (and thus, the polynomials $e_n$). Their algebraic structures are roughly similar to those pointed out in Section 2 for the one-sided case.

3.1. A recursion relation

Let $N_{lu,k}(z,U,V) \equiv N_{lu,k}(z,s_0,\ldots,s_{k-1},t_0,\ldots,t_{k-1})$ be the number of paths inside $\mathcal{F}_{lu,k}$ such that $y_0 \ge z$ for some integer $z \in [0,t_0]$. Put $N_{lu,0}(z,U,V) = 1$. Formula (3.4) below provides a recursion of $N_{lu,k}(z,U,V)$ in function of $N_{lu,j}(z,U,V)$, $0 \le j \le k-1$.

Proposition 3.1.
$$f_k(s_{k-1}-z) = \sum_{j=0}^{k} f_{k-j}(s_{k-1}-t_j-1)\, N_{lu,j}(z,U,V), \qquad k \ge 1. \tag{3.4}$$
Proof. By definition,
$$N_{lu,k}(z,U,V) = \sum_{y_0=s_0 \vee z}^{t_0}\ \sum_{y_1=s_1 \vee y_0}^{t_1} \cdots \sum_{y_{k-1}=s_{k-1} \vee y_{k-2}}^{t_{k-1}} 1 \equiv Q_0 Q_1 \cdots Q_{k-1}\, 1, \quad\text{where } Q_j = \sum_{y_j=s_j \vee y_{j-1}}^{t_j}, \tag{3.5}$$
after putting $y_{-1} = z$. Each $Q_j$ can be decomposed under the form
$$Q_j = \mathbf{1}(y_{j-1} \le s_j) \sum_{y_j=s_j}^{t_j} + \mathbf{1}(y_{j-1} > s_j) \sum_{y_j=y_{j-1}}^{t_j} = \sum_{y_j=s_j}^{t_j} - \mathbf{1}(y_{j-1} \ge 1+s_j) \sum_{y_j=s_j}^{y_{j-1}-1} \equiv S_j - S'_j. \tag{3.6}$$
Thus, one sees that $Q_0 = S_0 - S'_0$, $Q_0 Q_1 = Q_0 S_1 - Q_0 S'_1 = Q_0 S_1 - S_0 S'_1 + S'_0 S'_1$, and so on. In general, making successive decompositions (3.6) in (3.5), from the right to the left, until the first appearance of an $S_j$ leads to the expansion
$$Q_0 Q_1 \cdots Q_{k-1} = \sum_{j=0}^{k-1} (-1)^{k-1-j}\, Q_0 \cdots Q_{j-1}\, S_j\, S'_{j+1} \cdots S'_{k-1} + (-1)^k S'_0 \cdots S'_{k-1}. \tag{3.7}$$
By (3.5), one knows that $Q_0 \cdots Q_{j-1}\, 1 = N_{lu,j}(z,U,V)$. Now, putting $z_j = 1 + j + y_j$ allows us to express $S'_j$ as
$$S'_j = \mathbf{1}(1+j+s_j \le z_{j-1}) \sum_{z_j=1+j+s_j}^{z_{j-1}},$$
so that
$$S'_0 \cdots S'_{j-1} = \mathbf{1}(y_{-1} \ge j+s_{j-1}) \sum_{z_0=1+s_0}^{y_{-1}}\ \sum_{z_1=2+s_1}^{z_0} \cdots \sum_{z_{j-1}=j+s_{j-1}}^{z_{j-2}}.$$
From (A.27), (A.28), we then find that
$$S'_0 \cdots S'_{j-1}\, 1 = [e_j(y_{-1}-j+1-s_{j-1})]_+ = \binom{y_{-1}-s_{j-1}}{j}_+. \tag{3.8}$$
Consider the operator $E$ that maps $s_j$ (resp. $y_j$) to $s_{j+1}$ (resp. $y_{j+1}$), for all $j$ (as e.g. in (A.15)). This leads us to write that
$$S'_{j+1} \cdots S'_{k-1}\, 1 = E^{j+1}(S'_0 \cdots S'_{k-j-2})\, 1 = [e_{k-j-1}(y_j-k+j+2-s_{k-1})]_+. \tag{3.9}$$
Observe that $S_j$ is equivalent to $S'_j$ when $t_j+1$ is substituted for $y_{j-1}$. Writing $S'_j \cdots S'_{k-1}\, 1$ by means of (3.8), we so get that
$$S_j\, S'_{j+1} \cdots S'_{k-1}\, 1 = [e_{k-j}(t_j-k+j+2-s_{k-1})]_+. \tag{3.10}$$
Now, inserting (3.9), (3.10) in (3.5), (3.7) yields
$$N_{lu,k}(z,U,V) = (-1)^k [e_k(z-k+1-s_{k-1})]_+ + \sum_{j=0}^{k-1} (-1)^{k-1-j} [e_{k-j}(t_j-k+j+2-s_{k-1})]_+\, N_{lu,j}(z,U,V). \tag{3.11}$$
From the definitions (A.21) of $e_n$ and (3.2) of $f_n$, one can then write (3.11) as in (3.4). □
Expressing $f_n$ by means of (3.3), the recursion (3.4) becomes
$$e_k(s_{k-1}-z)\,\mathbf{1}(s_{k-1} \le z) = \sum_{j=r_k(U,V)}^{k} e_{k-j}(s_{k-1}-t_j-1)\, N_{lu,j}(z,U,V), \qquad k \ge 1, \tag{3.12}$$
where $r_k(U,V) = \min\{j \ge 0 : t_j + 1 > s_{k-1}\}$ ($\le k-1$ by definition). Now, let us consider the special case $z = 0$. Remember that $N_{lu,k}(0,U,V) \equiv N_{lu,k}(U,V)$, the number of paths inside $\mathcal{F}_{lu,k}$. Then, (3.12) provides the simple recursion
$$0 = \sum_{j=r_k(U,V)}^{k} e_{k-j}(s_{k-1}-t_j-1)\, N_{lu,j}(U,V), \qquad k \ge 1. \tag{3.13}$$
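Recursion (3.13) lends itself to a few lines of code. The Python sketch below uses the truncated functions $f_n$ of (3.3), so that the lower summation index $r_k(U,V)$ is handled implicitly by the vanishing terms; the function names are illustrative and both boundaries are assumed non-decreasing with $s_i \le t_i$.

```python
from math import factorial

def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for an integer x."""
    num = 1
    for i in range(n):
        num *= x + i
    return num // factorial(n)   # exact division

def f(n, x):
    """Truncated polynomial (3.3): f_0 = 1 and f_n(x) = e_n(x) 1(x < 0) for n >= 1."""
    if n == 0:
        return 1
    return e(n, x) if x < 0 else 0

def two_sided_counts(s, t):
    """Return [N_{lu,0}, ..., N_{lu,n}] for lower boundary s and upper boundary t, using (3.13)."""
    counts = [1]
    for k in range(1, len(s) + 1):
        # Terms with t_j + 1 <= s_{k-1} vanish through f, which reproduces the index r_k(U,V).
        counts.append(-sum(f(k - j, s[k - 1] - t[j] - 1) * counts[j] for j in range(k)))
    return counts

print(two_sided_counts([0, 1, 1], [1, 2, 3]))
```

For the boundaries $s = (0,1,1)$ and $t = (1,2,3)$ this prints `[1, 2, 4, 10]`.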
Recursion (3.13) was also obtained by Mohanty (1980) (formula (23)) using a different method.

Remark. (i) Let us compare the recursions derived here for $N_{lu,k}$ and in Section 2 for $N_{u,k}$ (where the upper boundary is now defined through the sequence $\{t_i, i \ge 0\}$). We first notice that formula (3.4) has an algebraic form that looks like an Abelian expansion of type (A.17). In the same vein, we also see that the recursion (3.12) presents a structure roughly similar to (2.11) with $y = s_{k-1}-1$ (and $t_i$ substituted for $s_i$). Clearly, the algebraic differences observed come from the fact that the basic elements in the two-sided case are the functions $f_n$, which are truncated polynomials $e_n$.

(ii) The decompositions (3.6) might be inserted in (3.5) from the left to the right until the first appearance of an $S_j$. That leads to the expansion
$$Q_0 Q_1 \cdots Q_{k-1} = \sum_{j=0}^{k-1} (-1)^j\, S'_0 \cdots S'_{j-1}\, S_j\, Q_{j+1} \cdots Q_{k-1} + (-1)^k S'_0 \cdots S'_{k-1}.$$
By adopting analogous arguments, one then obtains the following relations:
$$N_{lu,k}(z,U,V) = \sum_{j=0}^{k} f_j(s_{j-1}-z)\, N_{lu,k-j}(E^j U, E^j V), \qquad k \ge 1, \tag{3.14}$$
where $s_{-1} \equiv z$ and $E^j U$ (resp. $E^j V$) $= \{s_{j+i}$ (resp. $t_{j+i}$), $i \ge 0\}$, $j \ge 0$. The use of (3.14) requires first evaluating the $N_{lu,k-j}(E^j U, E^j V)$'s, which can be done from the recursion (3.25) derived later.

3.2. An expansion formula

Let $N_{lu,k}(U,V,z) \equiv N_{lu,k}(s_0,\ldots,s_{k-1},t_0,\ldots,t_{k-1},z)$ be the number of paths inside the boundary $\mathcal{F}_{lu,k}$ such that $y_{k-1} \le z$ for some integer $z \ge s_{k-1}$. Put $N_{lu,0}(U,V,z) = 1$. Formula (3.15) below provides an expansion of $N_{lu,k}(U,V,z)$ in function of $N_{lu,j}(U,V)$, $0 \le j \le k$.

Proposition 3.2.
$$N_{lu,k}(U,V,z) = \sum_{j=0}^{k} f_{k-j}(z-t_j)\, N_{lu,j}(U,V), \qquad k \ge 1. \tag{3.15}$$
Proof. We follow a similar approach but on the basis of the formula
$$N_{lu,k}(U,V,z) = \sum_{y_{k-1}=s_{k-1}}^{z \wedge t_{k-1}}\ \sum_{y_{k-2}=s_{k-2}}^{y_{k-1} \wedge t_{k-2}} \cdots \sum_{y_0=s_0}^{y_1 \wedge t_0} 1 \equiv w_{k-1} w_{k-2} \cdots w_0\, 1, \tag{3.16}$$
where
$$w_j = \sum_{y_j=s_j}^{y_{j+1} \wedge t_j},$$
with $y_k \equiv z$. Each $w_j$ is expressed as
$$w_j = \sum_{y_j=s_j}^{t_j} - \mathbf{1}(y_{j+1} < t_j) \sum_{y_j=y_{j+1}+1}^{t_j} \equiv \sigma_j - \sigma'_j. \tag{3.17}$$
Making successive decompositions (3.17) in (3.16) from the left to the right until the first appearance of a $\sigma_j$ then gives
$$w_{k-1} w_{k-2} \cdots w_0 = \sum_{j=0}^{k-1} (-1)^{k-1-j}\, \sigma'_{k-1} \cdots \sigma'_{j+1}\, \sigma_j\, w_{j-1} \cdots w_0 + (-1)^k \sigma'_{k-1} \cdots \sigma'_0. \tag{3.18}$$
Evidently, one knows that $w_{j-1} \cdots w_0\, 1 = N_{lu,j}(U,V,z)$, and therefore,
$$\sigma_j\, w_{j-1} \cdots w_0\, 1 = N_{lu,j+1}(U,V,t_j) = N_{lu,j+1}(U,V), \tag{3.19}$$
the latter equality coming from the fact that, by definition, $N_{lu,j+1}(U,V,t_j)$ gives the number of paths inside $\mathcal{F}_{lu,j+1}$. Moreover, putting $z_l = l + t_{j-1} - y_{j-1-l}$ for $0 \le l \le j-1$, one has
$$\sigma'_{j-1} \cdots \sigma'_0 = \mathbf{1}(y_j + j \le t_0) \sum_{z_0=0}^{t_{j-1}-y_j-1}\ \sum_{z_1=1+t_{j-1}-t_{j-2}}^{z_0} \cdots \sum_{z_{j-1}=j-1+t_{j-1}-t_0}^{z_{j-2}},$$
so that by (2.11), (2.12),
$$\sigma'_{j-1} \cdots \sigma'_0\, 1 = [e_j(t_0-y_j-j+1)]_+ = \binom{t_0-y_j}{j}_+. \tag{3.20}$$
Furthermore, for $j \le k-2$,
$$\sigma'_{k-1} \cdots \sigma'_{j+1}\, 1 = E^{j+1}(\sigma'_{k-j-2} \cdots \sigma'_0)\, 1 = [e_{k-1-j}(t_{j+1}-z-k+j+2)]_+. \tag{3.21}$$
By combining (3.18)–(3.21), we then obtain that
$$N_{lu,k}(U,V,z) = \sum_{j=0}^{k-2} (-1)^{k-1-j} [e_{k-1-j}(t_{j+1}-z-k+j+2)]_+\, N_{lu,j+1}(U,V) + N_{lu,k}(U,V) + (-1)^k \binom{t_0-z}{k}_+. \tag{3.22}$$
Using (A.21) and (3.2), formula (3.22) can be rewritten as (3.15). □
By (3.3), the relations (3.15) can also be expressed as
$$N_{lu,k}(U,V,z) = \sum_{j=k \wedge r_z(V)}^{k} e_{k-j}(z-t_j)\, N_{lu,j}(U,V), \qquad k \ge 1, \tag{3.23}$$
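The expansion (3.15), equivalently (3.23), can be sketched in Python as follows. The counts $N_{lu,j}(U,V)$ are first obtained from recursion (3.13) and then combined with the truncated functions $f_n$; all function names are illustrative, and $z \ge s_{k-1}$ is assumed as in the text.

```python
from math import factorial

def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for an integer x."""
    num = 1
    for i in range(n):
        num *= x + i
    return num // factorial(n)   # exact division

def f(n, x):
    """Truncated polynomial (3.3): f_0 = 1 and f_n(x) = e_n(x) 1(x < 0) for n >= 1."""
    if n == 0:
        return 1
    return e(n, x) if x < 0 else 0

def two_sided_counts(s, t):
    """[N_{lu,0}, ..., N_{lu,n}] from recursion (3.13)."""
    counts = [1]
    for k in range(1, len(s) + 1):
        counts.append(-sum(f(k - j, s[k - 1] - t[j] - 1) * counts[j] for j in range(k)))
    return counts

def end_constrained_count(s, t, z):
    """N_{lu,k}(U,V,z): paths inside the two-sided boundary with y_{k-1} <= z, via (3.15)."""
    k = len(s)
    counts = two_sided_counts(s, t)
    total = counts[k]                         # j = k term, since f_0 = 1
    for j in range(k):
        total += f(k - j, z - t[j]) * counts[j]
    return total

print(end_constrained_count([0, 1, 1], [1, 2, 3], 2))
```

For $s = (0,1,1)$, $t = (1,2,3)$ and $z = 2$ this prints `6`; taking $z = t_{k-1} = 3$ returns the unconstrained count 10.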
In (3.23), $r_z(V) = \min\{j \ge 0 : t_j > z\}$. To determine $N_{lu,k}(U,V,z)$, it thus suffices to calculate the numbers $N_{lu,j}(U,V)$. These will be provided by the recursion (3.13).

Remark. (i) Formula (3.15), or (3.23), is especially simple. It has exactly the same structure as the formula (A.16) given by Niederhausen (1981) in the framework of an extended Sheffer (generalized Appell) theory. In that context, the family $\{N_{lu,k}(U,V,z), k \ge 0\}$ corresponds to a so-called Sheffer sequence for the operator $\Delta$ and with basic polynomials $\{e_n, n \ge 0\}$. The reason for this appellation is that, looking at the right hand side of (3.15) and defining the polynomials
$$\tau_{k,i}(z) = \sum_{j=i}^{k} e_{k-j}(z-t_j)\, N_{lu,j}(U,V), \qquad 0 \le i \le k,$$
one directly sees that each family $\{\tau_{i+n,i}, n \ge 0\}$, $i \ge 0$, is a generalized Appell sequence.

(ii) Making successive decompositions (3.17) in (3.16) from the right to the left until a first $\sigma_j$ would give the expansion
$$w_{k-1} w_{k-2} \cdots w_0 = \sum_{j=0}^{k-1} (-1)^j\, w_{k-1} \cdots w_{j+1}\, \sigma_j\, \sigma'_{j-1} \cdots \sigma'_0 + (-1)^k \sigma'_{k-1} \cdots \sigma'_0.$$
One can then deduce the following relations:
$$f_k(z-t_0) = \sum_{j=0}^{k} f_j(s_{j-1}-t_0-1)\, N_{lu,k-j}(E^j U, E^j V, z), \qquad k \ge 1. \tag{3.24}$$
In particular, choosing $z = t_{k-1}$ in (3.24) yields
$$0 = \sum_{j=0}^{k \wedge r_{t_0}(U)} e_j(s_{j-1}-t_0-1)\, N_{lu,k-j}(E^j U, E^j V), \qquad k \ge 1, \tag{3.25}$$
where $r_{t_0}(U) = \max\{j \ge 1 : s_{j-1} < t_0 + 1\}$.

4. Some numerical illustrations

The recursions obtained in the paper are simple to implement and, in general, do not raise any serious stability problems. For instance, one can easily construct a table of quantiles for the K–S two-sample test such as given in the book by Conover (1999). In this section, we will apply this test to two particular data sets in biostatistics. The variable under study is the CRP level (mg/l) measured in two independent samples, one of size $n = 8$ for patients that receive a drug (prednisone) and the other of size $m = 10$ for patients that get a placebo (CRP, the C-reactive protein, reflects acute inflammation). The measures are presented for two different days.

(i) Here are the observations obtained at day 1:
with prednisone: 2.2, 3.8, 4.6, 8.3, 8.4, 18.5, 32.0, 58.3,
with placebo: 0.9, 2.5, 3.5, 4.5, 9.4, 18.5, 26.3, 37.2, 39.5, 79.0.
The corresponding empirical distribution functions (d.f.), $F_n$ and $G_m$, are drawn in Fig. 1. One sees that $d^- = \max_x \{G_m(x) - F_n(x)\} = 7/40$, $d^+ = \max_x \{F_n(x) - G_m(x)\} = 9/40$ and $d = \max\{d^-, d^+\} = 9/40$. As recalled in the Introduction, the d.f. of $D^-$ and $D^+$ can be expressed in terms of the left and right tail distributions of $(R_1,\ldots,R_n)$. With our data, that gives
$$P(D^- \le d^-) = P(R_1 \le 1, R_2 \le 3, R_3 \le 4, R_4 \le 5, R_5 \le 6, R_6 \le 8, R_7 \le 9, R_8 \le 10),$$
$$P(D^+ \le d^+) = P(R_1 \ge 0, R_2 \ge 1, R_3 \ge 2, R_4 \ge 3, R_5 \ge 4, R_6 \ge 6, R_7 \ge 7, R_8 \ge 8).$$
The d.f. $P(D \le d)$ is obtained by requiring the ranks to be between the two boundaries above. Under $H_0$, these probabilities can be computed by means of the recurrences (2.12), (2.15) and (3.13). So, we find that the p-values for the three tests are $P(D^- \ge d^-) = 0.6225$, $P(D^+ \ge d^+) = 0.5135$ and $P(D \ge d) = 0.89823$. That implies that no significant difference between prednisone and placebo was detected in CRP levels at level 0.05 ($H_0$ is not rejected with both one- and two-sided tests). This is not surprising as at day 1, the drug has not yet produced any effect.
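The statistics $d^-$, $d^+$ and the two boundaries quoted for day 1 can be reproduced with a short Python sketch. It uses exact fractions to avoid rounding at the floor and ceiling steps; the function names are illustrative, and the boundary formulas are those recalled in the Introduction.

```python
from fractions import Fraction
from math import ceil, floor

prednisone = [2.2, 3.8, 4.6, 8.3, 8.4, 18.5, 32.0, 58.3]            # sample X, n = 8
placebo = [0.9, 2.5, 3.5, 4.5, 9.4, 18.5, 26.3, 37.2, 39.5, 79.0]   # sample Y, m = 10

def ks_statistics(x, y):
    """Return (d_minus, d_plus) = (max(G_m - F_n), max(F_n - G_m)) as exact fractions."""
    n, m = len(x), len(y)
    d_minus = d_plus = Fraction(0)
    for point in sorted(set(x) | set(y)):     # the e.d.f. only change at data points
        f_n = Fraction(sum(v <= point for v in x), n)
        g_m = Fraction(sum(v <= point for v in y), m)
        d_minus = max(d_minus, g_m - f_n)
        d_plus = max(d_plus, f_n - g_m)
    return d_minus, d_plus

def upper_bounds(d, n, m):
    """t_i = min(floor(m(i-1)/n + m d), m), i = 1..n, for the event {D^- <= d}."""
    return [min(floor(Fraction(m * (i - 1), n) + m * d), m) for i in range(1, n + 1)]

def lower_bounds(d, n, m):
    """s_i = max(0, ceil(m i/n - m d)), i = 1..n, for the event {D^+ <= d}."""
    return [max(0, ceil(Fraction(m * i, n) - m * d)) for i in range(1, n + 1)]

d_minus, d_plus = ks_statistics(prednisone, placebo)
print(d_minus, d_plus)                 # 7/40 9/40
print(upper_bounds(d_minus, 8, 10))    # [1, 3, 4, 5, 6, 8, 9, 10]
print(lower_bounds(d_plus, 8, 10))     # [0, 1, 2, 3, 4, 6, 7, 8]
```

The value 18.5 appears in both samples; evaluating both empirical d.f. at each distinct data point treats that tie as a simultaneous jump.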
(ii) Let us examine the observations obtained at day 8:
with prednisone: 0.9, 1.0, 1.1, 1.3, 2.0, 4.3, 6.4, 12.9,
with placebo: 0.4, 2.1, 3.3, 5.0, 7.7, 9.2, 26.7, 29.3, 39.4, 44.5.
Fig. 1. Empirical d.f. $F_n$ (prednisone) and $G_m$ (placebo), day 1.

Fig. 2. Empirical d.f. $F_n$ (prednisone) and $G_m$ (placebo), day 8.

Fig. 3. Theoretical d.f. of $D^+$ and $D$ under $H_0$, when $n = 8$ and $m = 10$.
Fig. 2 shows the associated empirical d.f. Here, $d^- = 1/10$ and $d^+ = 21/40$. Using the vector $(R_1,\ldots,R_8)$, one gets
$$P(D^- \le d^-) = P(R_1 \le 1, R_2 \le 2, R_3 \le 3, R_4 \le 4, R_5 \le 6, R_6 \le 7, R_7 \le 8, R_8 \le 9),$$
$$P(D^+ \le d^+) = P(R_1 \ge 0, R_2 \ge 0, R_3 \ge 0, R_4 \ge 0, R_5 \ge 1, R_6 \ge 3, R_7 \ge 4, R_8 \ge 5).$$
This yields the p-values $P(D^- \ge 1/10) = 0.80808$, $P(D^+ \ge 21/40) = 0.04756$ and $P(D \ge 21/40) = 0.09511$. At the 0.05 level of significance, prednisone significantly reduces the CRP level ($H_0$ is rejected with a one-sided test). With a two-sided test, no effect of prednisone is detected at the same level ($H_0$ is not rejected at level 0.05, but is rejected at level 0.10). Thus, the effect of the drug is not yet sufficiently strong at day 8. For data at day 15 (not given here), the decision is clearly to reject $H_0$ at level 0.05.

(iii) To check the efficiency of the approach, we have also evaluated, under $H_0$, the d.f. $P(D^+ \le d)$ ($= P(D^- \le d)$) and $P(D \le d)$, $d \in [0,1]$. The corresponding graphs are given in Fig. 3. The simplicity of the recursions obtained allows us to perform such computations in a fast and precise manner.
Acknowledgments

This research was carried out while C. Lefèvre was visiting the Institut de Science Financière et d'Assurances, Université de Lyon. We are grateful to Prof. S.G. Mohanty for motivating our interest in this problem through his nice paper of 1980. We thank the Editor and the referee for useful remarks and suggestions.

Appendix A. Two special families of polynomials

Appell polynomials are classical in the literature; they cover the subfamilies of Hermite, Laguerre, Bernoulli and Euler polynomials (see e.g. Kaz'min, 1988). Generalized Appell polynomials, $\{A_n\}$, often called Sheffer polynomials, are also quite standard (see e.g. Mullin and Rota, 1970). They occur as solutions to a number of enumeration problems, some with applications in two-sample statistical tests (see e.g. Niederhausen, 1981, 1986, 1997). In applied probability, they are used when computing the ruin probability in insurance risk models (see e.g. Picard and Lefèvre, 1997; De Vylder, 1999; Ignatov and Kaishev, 2000, 2004; Lefèvre and Picard, 2006).

Although of similar construction, Abel–Gontcharoff polynomials, due to Gontcharoff (1937), seem to be less popular. In stochastic modelling, they arise, for instance, when determining the final state distribution in epidemic processes (see e.g. Lefèvre and Picard, 1990; Ball and O'Neill, 1999). Generalized Abel–Gontcharoff polynomials, $\{G_n\}$, were introduced by Picard and Lefèvre (1996) to study a wider class of first-passage time problems (see also Picard and Lefèvre, 2003).

In fact, generalized Appell and Abel–Gontcharoff polynomials are closely related and can be defined in a common framework. They can also be extended to functions of similar types.

A.1. $\{A_n\}$ and $\{G_n\}$ polynomials

Let us summarize the main points of the theory as presented in Picard and Lefèvre (1996). Consider the vector space $P$ of polynomials $B$ on $\mathbb{R}$. Let $\{e_n, n \ge 0\}$ be a family of polynomials of degree exactly equal to $n$ (this is a basis of $P$). It is assumed that
$$e_0 = 1 \quad\text{and}\quad e_n(0) = 0, \quad n \ge 1. \tag{A.1}$$
A linear operator $\Delta$ is then defined on $P$ by stipulating that
$$\Delta e_0 = 0 \quad\text{and}\quad \Delta e_n = e_{n-1}, \quad n \ge 1. \tag{A.2}$$
In many cases (as in Section A.2 below), the $e_n$'s satisfy a convolution type property: for $x,y \in \mathbb{R}$,
$$e_n(x+y) = \sum_{j=0}^{n} e_j(x)\, e_{n-j}(y), \qquad n \ge 0. \tag{A.3}$$
Denote by $S^a$, $a \in \mathbb{R}$, the shift operator defined by $(S^a B)(x) = B(x+a)$. Under (A.3), it is known that the operator $\Delta$ is shift-invariant, i.e. $S^a \Delta = \Delta S^a$. As a consequence, any polynomial $B$ of degree $n$ can be expanded by a Taylor type formula:
$$B(x) = \sum_{j=0}^{n} \Delta^j B(a)\, e_j(x-a), \tag{A.4}$$
$\Delta^j$ being the $j$-th iterate of the operator $\Delta$. The formal generating function of the $e_n$'s, i.e. $f(x,s) = \sum_{n=0}^{\infty} e_n(x) s^n$, is also given by
$$f(x,s) = \exp[x\,g(s)], \tag{A.5}$$
where $g(s)$ is a formal series such that $g(0) = 0$. The operator $\Delta$ commutes with the differentiation operator $D$, and it can be expressed as a power series with respect to $D$ (and reciprocally):
$$\Delta = g^{-1}(D) \quad [\text{reciprocally } D = g(\Delta)]. \tag{A.6}$$
Let us introduce an inverse operator to $\Delta$. Given any real $u$, such an operator, $I_u$ say, is a linear operator on $P$ defined by
$$I_u e_n = e_{n+1} - e_{n+1}(u)\, e_0, \qquad n \ge 0. \tag{A.7}$$
So, $\Delta I_u = \Delta^0$ (but $I_u \Delta \ne \Delta^0$ in general).

We can now give the two central definitions. Let $U = \{u_i, i \ge 0\}$ be a family of real numbers. Attached to $U$, the generalized Appell polynomials $A_n(\cdot \mid U)$ of degree $n \ge 0$ are built as
$$A_0(\cdot \mid U) = 1 \quad\text{and}\quad A_n(\cdot \mid U) = I_{u_{n-1}} I_{u_{n-2}} \cdots I_{u_0} e_0, \quad n \ge 1, \tag{A.8}$$
and the generalized Abel–Gontcharoff polynomials $G_n(\cdot \mid U)$ of degree $n \ge 0$ as
$$G_0(\cdot \mid U) = 1 \quad\text{and}\quad G_n(\cdot \mid U) = I_{u_0} I_{u_1} \cdots I_{u_{n-1}} e_0, \quad n \ge 1. \tag{A.9}$$
Observe that for fixed $n$, $A_n(\cdot \mid U)$ and $G_n(\cdot \mid U)$ depend on the family $U$ only through its first $n$ terms $u_0,\ldots,u_{n-1}$. Furthermore, these polynomials are linked by the identity
$$A_n(\cdot \mid u_0,\ldots,u_{n-1}) = G_n(\cdot \mid u_{n-1},\ldots,u_0), \qquad n \ge 0. \tag{A.10}$$
For $A_n(\cdot \mid U)$, $n \ge 1$, (A.8) implies that
$$\Delta^r A_n(\cdot \mid U) = A_{n-r}(\cdot \mid U), \qquad 0 \le r \le n, \tag{A.11}$$
$$\Delta^r A_n(u_{n-1-r} \mid U) = \delta_{n,r}, \qquad 0 \le r \le n. \tag{A.12}$$
Note also that $A_n(x \mid x+U) = A_n(0 \mid U)$, where $x + U = \{x + u_i, i \ge 0\}$. Under the condition (A.3) and thanks to the property (A.11), the Taylor type formula (A.4) yields: for $x,y \in \mathbb{R}$,
$$A_n(x+y \mid U) = \sum_{j=0}^{n} A_{n-j}(y \mid U)\, e_j(x), \qquad n \ge 0. \tag{A.13}$$
This provides a simple algorithm to evaluate the polynomials. Indeed, (A.13) with $y = 0$ gives $A_n(x \mid U)$ in function of $A_j(0 \mid U)$, $0 \le j \le n$. These terms can then be determined recursively from the relations
$$\delta_{n,0} = \sum_{j=0}^{n} A_{n-j}(0 \mid U)\, e_j(u_{n-1}), \qquad n \ge 0, \tag{A.14}$$
which follow by choosing $x = u_{n-1}$ and $y = 0$ in (A.13) and using (A.12) with $r = 0$.

For $G_n(\cdot \mid U)$, $n \ge 1$, putting $E^r U = \{u_{r+i}, i \ge 0\}$ where $r \ge 0$, we have
$$\Delta^r G_n(\cdot \mid U) = G_{n-r}(\cdot \mid E^r U), \qquad 0 \le r \le n, \tag{A.15}$$
$$\Delta^r G_n(u_r \mid U) = \delta_{n,r}, \qquad 0 \le r \le n, \tag{A.16}$$
and $G_n(x \mid x+U) = G_n(0 \mid U)$. Using (A.16), one obtains a generalized Abelian type expansion for any polynomial $B$ of degree $n$:
$$B(x) = \sum_{j=0}^{n} \Delta^j B(u_j)\, G_j(x \mid U). \tag{A.17}$$
Note that the condition (A.3) is not required here. Formula (A.17) provides an algorithm to evaluate the polynomials. Indeed, taking $B = e_n$ in (A.17) yields the recursion
$$e_n(x) = \sum_{j=0}^{n} e_{n-j}(u_j)\, G_j(x \mid U), \qquad n \ge 1. \tag{A.18}$$
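As an illustration of (A.18), the Python sketch below evaluates $G_0(x \mid U),\ldots,G_n(x \mid U)$ for the negative binomial type polynomials $e_n(x) = \binom{x+n-1}{n}$ of Section A.2, by isolating the $j = n$ term of the sum. The function names are illustrative, and the choice of this particular family $\{e_n\}$ is an assumption made to obtain a concrete example.

```python
from fractions import Fraction

def e(n, x):
    """Negative binomial type polynomial e_n(x) = x (x+1) ... (x+n-1) / n!."""
    value = Fraction(1)
    for i in range(n):
        value = value * (x + i) / (i + 1)
    return value

def abel_gontcharoff(n, x, u):
    """Return [G_0(x|U), ..., G_n(x|U)] via recursion (A.18); only u[0..n-1] is used."""
    values = [Fraction(1)]                       # G_0 = 1
    for k in range(1, n + 1):
        total = e(k, x)                          # left-hand side of (A.18)
        for j in range(k):
            total -= e(k - j, u[j]) * values[j]
        values.append(total)                     # remaining term is e_0 * G_k
    return values

# Upper-boundary counts through (2.3): N_{u,k}(0, s) = G_k(1 | -s) with s = (1, 3, 4)
print([int(v) for v in abel_gontcharoff(3, 1, [-1, -3, -4])])
```

The last line prints `[1, 2, 7, 23]`, the path counts below the boundary $(1,3,4)$ obtained through Proposition 2.1.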
If (A.3) is satisfied, an alternative method for calculating $G_n(x|U)$ is to use the Taylor type formula (A.4) around $a = 0$. It then suffices to find the coefficients $G_{n-j}(0|E^j U)$, $0 \le j \le n$. For that, applying $\Delta^r$ to (A.4) (with $a = 0$), taking $x = u_r$ and using (A.16) yields the recursion
$$\delta_{n,r} = \sum_{j=r}^{n} G_{n-j}(0|E^j U)\, e_{j-r}(u_r), \quad 0 \le r \le n.$$
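The recursion above is a triangular system in the coefficients $c_j = G_{n-j}(0|E^j U)$, so it can be solved by back-substitution from $r = n$ down to $r = 0$. The Python sketch below assumes the negative binomial family (A.21) for concreteness; the names `taylor_coeffs` and `G` are hypothetical helpers, not the authors' notation.

```python
from fractions import Fraction
from math import factorial


def e_negbin(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n!, which satisfies condition (A.3)."""
    num = Fraction(1)
    for k in range(n):
        num *= Fraction(x) + k
    return num / factorial(n)


def taylor_coeffs(U, n, e):
    """Return c with c[j] = G_{n-j}(0|E^j U), j = 0..n, by back-substitution."""
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)  # row r = n: c_n * e_0 = delta_{n,n} = 1
    for r in range(n - 1, -1, -1):
        # row r < n: 0 = c_r * e_0 + sum_{j > r} c_j e_{j-r}(u_r)
        c[r] = -sum(c[j] * e(j - r, U[r]) for j in range(r + 1, n + 1))
    return c


def G(x, U, n, e):
    """G_n(x|U) = sum_j c_j e_j(x), the Taylor type expansion around 0."""
    c = taylor_coeffs(U, n, e)
    return sum(c[j] * e(j, x) for j in range(n + 1))


print(taylor_coeffs([1, 2], 2, e_negbin))
print(G(4, [2, 1, 0], 3, e_negbin))
```

Once the coefficients are known, evaluating $G_n$ at further points $x$ costs only one pass over $e_0(x), \ldots, e_n(x)$.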
A simple particular case for $G_n(x|U)$ is when $U$ is an affine sequence, i.e. $u_i = a + bi$, $i \ge 0$. If $x \ne u_n$, then a quite explicit formula holds true:
$$G_n(x|U) = (x - u_0)\, \frac{e_n(x - u_n)}{x - u_n}, \quad n \ge 0. \tag{A.19}$$
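A quick numerical sanity check of (A.19) is possible through recursion (A.18): since that recursion determines the $G_j$ uniquely, the closed form must make its residual vanish for every $n$. The sketch below does this for the negative binomial family (A.21); `H` and `residual_A18` are illustrative names, and the test points are arbitrary choices that avoid $x = u_j$.

```python
from fractions import Fraction
from math import factorial


def e_negbin(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n!."""
    num = Fraction(1)
    for k in range(n):
        num *= Fraction(x) + k
    return num / factorial(n)


def H(n, x, a, b, e):
    """Right-hand side of (A.19) for u_i = a + b*i; requires x != u_n."""
    u_n = a + b * n
    return (x - a) * e(n, x - u_n) / (x - u_n)


def residual_A18(n, x, a, b, e):
    """e_n(x) - sum_{j=0}^{n} e_{n-j}(u_j) H_j(x|U); zero when H_j = G_j."""
    return e(n, x) - sum(
        e(n - j, a + b * j) * H(j, x, a, b, e) for j in range(n + 1)
    )


for n in range(4):
    # decreasing affine sequence u_i = 2 - i, evaluated at x = 4
    print(n, H(n, 4, 2, -1, e_negbin), residual_A18(n, 4, 2, -1, e_negbin))
```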
Eq. (A.19) can be proved by showing that the right-hand side, $H_n(x|U)$ say, is a polynomial of degree $n$ that satisfies the conditions (A.16), as $G_n(x|U)$ does. A closed-form formula still exists in the general case (a fact we had not noticed until now).

Property A.1.
$$G_n(x|U) = \sum_{r=1}^{n} \sum_{C_r} e_{j_1}(x - u_0) \prod_{i=2}^{r} e_{j_i}(u_{s_{i-2}} - u_{s_{i-1}}), \quad n \ge 1, \tag{A.20}$$
where $s_i = j_1 + \cdots + j_i$, $i \ge 1$, with $s_0 = 0$, and $C_r = \{(j_1 \ge 1, \ldots, j_r \ge 1) : s_r = n\}$.

Proof. We apply the Taylor type formula (A.4) in a chain, around the first term of the successive shifted families $U$. So, firstly, $G_n(x|U)$ is expanded around $u_0$ (a zero of the polynomial), which provides coefficients $G_{n-j_1}(u_0|E^{j_1} U)$ for $1 \le j_1 \le n$. Then, each coefficient is expanded, with $u_0$ as a variable, around $u_{j_1}$ (a zero of the new polynomial), hence the coefficients $G_{n-j_1-j_2}(u_{j_1}|E^{j_1+j_2} U)$ for $1 \le j_2 \le n - j_1$, which are now expanded, with $u_{j_1}$ as variable, around $u_{j_1+j_2}$, and so on. □
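The sum in (A.20) runs over all compositions $(j_1, \ldots, j_r)$ of $n$ into positive parts, which makes it easy to evaluate by enumeration for small $n$. The following Python sketch does so for either basic sequence; `compositions` and `G_closed` are illustrative names, and the enumeration is exponential in $n$, so it is meant as a check rather than as an efficient algorithm.

```python
from fractions import Fraction
from math import factorial


def e_monomial(n, x):
    """e_n(x) = x^n / n!."""
    return Fraction(x) ** n / factorial(n)


def e_negbin(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n!."""
    num = Fraction(1)
    for k in range(n):
        num *= Fraction(x) + k
    return num / factorial(n)


def compositions(n):
    """Yield all tuples (j_1, ..., j_r) of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def G_closed(x, U, n, e):
    """Evaluate G_n(x|U) through the closed form (A.20), for n >= 1."""
    total = Fraction(0)
    for js in compositions(n):
        term = e(js[0], x - U[0])
        s_prev2, s_prev1 = 0, js[0]          # s_{i-2} and s_{i-1} for i = 2
        for j in js[1:]:
            term *= e(j, U[s_prev2] - U[s_prev1])
            s_prev2, s_prev1 = s_prev1, s_prev1 + j
        total += term
    return total


print(G_closed(4, [2, 1, 0], 3, e_negbin))
print(G_closed(5, [1, 3], 2, e_monomial))
```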
The simple Appell and Abel–Gontcharoff polynomials are obtained in the special situation where
$$e_n(x) = x^n/n!, \quad n \ge 0.$$
Then, $\Delta$ corresponds to the differentiation operator $D$, and $I_u$ becomes the usual integration operator (i.e. $I_u e_n(x) = \int_u^x e_n(t)\, dt$, $n \ge 0$).

A.2. Negative binomial type polynomials

The polynomials of relevance in our work correspond to another special case for the $e_n$'s, namely
$$e_n(x) = \binom{x+n-1}{n} = (-1)^n \binom{-x}{n}, \quad n \ge 0. \tag{A.21}$$
That choice for the $e_n$'s is often considered in the theory of Sheffer polynomials (e.g. Mullin and Rota, 1970). Note that the condition (A.1) is indeed satisfied. Observe also that $e_n(1) = 1$ for all $n \ge 0$ and $e_n(x) = 0$ if $x = 0, -1, \ldots, -n+1$. The previous formulas (A.8) and (A.9) allow us to define the associated generalized Appell and Abel–Gontcharoff polynomials. The polynomials $e_n$, $A_n$ and $G_n$ are called here of negative binomial type. The reason is that the combinatorial coefficients (A.21) are reminiscent, for fixed $x$, of the negative binomial distribution and, as a function of $x$, of the negative binomial Lévy process (e.g. Kozubowski and Podgórski, 2007). In the sequel, it is enough to define these polynomials only on $\mathbb{Z}$ (the set of integers), instead of $\mathbb{R}$.

To begin with, let us consider the $e_n(x)$'s on $\mathbb{R}$. Their generating function $\varphi(x,s)$ is given by
$$\varphi(x,s) = (1-s)^{-x}, \quad |s| < 1,$$
which is of the form (A.5) with $g(s) = \ln[1/(1-s)]$. Obviously, it satisfies $\varphi(x+y,s) = \varphi(x,s)\varphi(y,s)$ for all $x, y \in \mathbb{R}$, so that the convolution type (A.3) holds true and the operators $S$ and $\Delta$ then commute. By (A.6), $D = g(\Delta) = \ln[I/(I-\Delta)]$, hence $I - \Delta = \exp(-D)$. Noticing that $\exp(-D) = S^{-1}$, one gets that
$$\Delta = I - S^{-1},$$
i.e. $\Delta$ corresponds to the backward difference operator.

Let us now focus our attention on $\mathbb{Z}$. Our purpose is to show that the polynomials $A_n$ and $G_n$ have a simple closed form expression. As explained above, they can also be determined by recursion. Firstly, we notice that the zeros of $e_n$ are kept and the operator $\Delta$ remains internal (as $\Delta e_n$ is valued on $\mathbb{Z}$). The following expansion formula of $e_n(x+a)$ is standard; a proof is sketched for clarity.

Lemma A.2. For $x, a \in \mathbb{Z}$,
$$e_n(x+a) = e_n(x) + \mathrm{sign}(a) \sum_{j=1+a\wedge 0}^{a\vee 0} e_{n-1}(x+j), \quad n \ge 1, \tag{A.22}$$
where $\mathrm{sign}(a)$ denotes the sign of $a$.

Proof. If $a \ge 0$, one writes
$$S^a - I = \sum_{j=1}^{a} (S^j - S^{j-1}) = \sum_{j=1}^{a} S^j \Delta,$$
and applying it to $e_n(x)$ yields (A.22). If $a \le 0$, it suffices to use that
$$S^a - I = \sum_{j=a+1}^{0} (S^{j-1} - S^j) = -\sum_{j=a+1}^{0} S^j \Delta. \qquad \square$$
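The elementary facts used so far for the negative binomial family can be checked numerically on the integers. The sketch below, with illustrative names `e`, `sign` and `rhs_A22`, evaluates (A.21) and the right-hand side of (A.22), and the accompanying loop compares the latter with $e_n(x+a)$ over a small grid; the grid bounds are arbitrary.

```python
from fractions import Fraction
from math import factorial


def e(n, x):
    """e_n(x) = C(x+n-1, n) = x (x+1) ... (x+n-1) / n!, for any integer x."""
    num = 1
    for k in range(n):
        num *= x + k
    return Fraction(num, factorial(n))


def sign(a):
    """Sign of an integer: -1, 0 or 1."""
    return (a > 0) - (a < 0)


def rhs_A22(n, x, a):
    """Right-hand side of (A.22), with sum limits 1 + min(a, 0) and max(a, 0)."""
    lo, hi = 1 + min(a, 0), max(a, 0)
    return e(n, x) + sign(a) * sum(e(n - 1, x + j) for j in range(lo, hi + 1))


# Compare both sides of (A.22) on a small grid of integers.
mismatches = [
    (n, x, a)
    for n in range(1, 5)
    for x in range(-5, 6)
    for a in range(-5, 6)
    if rhs_A22(n, x, a) != e(n, x + a)
]
print("mismatches:", mismatches)
```

The same function `e` also lets one confirm that the backward difference $e_n(x) - e_n(x-1)$ equals $e_{n-1}(x)$, which is the statement $\Delta = I - S^{-1}$ applied to the basic sequence.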
Thanks to (A.22), we can write explicitly the inverse operator $I_u$ defined in (A.7).

Lemma A.3. Given any $u \in \mathbb{Z}$, then for any polynomial $B$ on $\mathbb{Z}$,
$$I_u B(x) = \mathrm{sign}(x-u) \sum_{j=1+x\wedge u}^{x\vee u} B(j). \tag{A.23}$$
Proof. First, consider $B(x) = e_n(x)$, $n \ge 0$. Taking $u = x + a$ in (A.7) and inserting (A.22), we obtain
$$I_u e_n(x) = -[e_{n+1}(x+a) - e_{n+1}(x)] = -\mathrm{sign}(a) \sum_{i=1+a\wedge 0}^{a\vee 0} e_n(x+i) = \mathrm{sign}(x-u) \sum_{j=1+x+(u-x)\wedge 0}^{x+(u-x)\vee 0} e_n(j),$$
which is (A.23) for $B = e_n$. Suppose now that $B = \sum_{n \ge 0} a_n e_n$. Using the previous result, we then get
$$I_u B(x) = \sum_{n \ge 0} a_n I_u e_n(x) = \mathrm{sign}(x-u) \sum_{n \ge 0} a_n \sum_{j=1+x\wedge u}^{x\vee u} e_n(j) = \mathrm{sign}(x-u) \sum_{j=1+x\wedge u}^{x\vee u} \sum_{n \ge 0} a_n e_n(j),$$
hence formula (A.23). □
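Formula (A.23) turns $I_u$ into a plain signed summation, which is straightforward to code. The sketch below wraps it as a higher-order function; the names `I`, `e` and `sign` are illustrative, and the polynomial $B$ is represented simply as a Python callable on the integers.

```python
from fractions import Fraction
from math import factorial


def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for integer x."""
    num = 1
    for k in range(n):
        num *= x + k
    return Fraction(num, factorial(n))


def sign(a):
    """Sign of an integer: -1, 0 or 1."""
    return (a > 0) - (a < 0)


def I(u, B):
    """Return the function x -> I_u B(x) given by the signed sum (A.23)."""
    def IuB(x):
        lo, hi = 1 + min(x, u), max(x, u)
        return sign(x - u) * sum(B(j) for j in range(lo, hi + 1))
    return IuB


# I_u e_2 evaluated at a few integer points, for u = 0.
Iu_e2 = I(0, lambda j: e(2, j))
print([Iu_e2(x) for x in range(-3, 4)])
```

Two properties are worth checking on such an implementation: $I_u B(u) = 0$, and the backward difference of $I_u B$ gives back $B$, so that $I_u$ is indeed a right inverse of $\Delta$ on $\mathbb{Z}$.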
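An explicit formula for $G_n$ easily follows from (A.23). The one for $A_n$ is then straightforward from the identity (A.10).

Proposition A.4. For $n \ge 1$ and $x, u_0, u_1, \ldots \in \mathbb{Z}$,
$$G_n(x|U) = \prod_{i=0}^{n-1} \left( \mathrm{sign}(j_{i-1} - u_i) \sum_{j_i = 1 + j_{i-1}\wedge u_i}^{j_{i-1}\vee u_i} \right) e_0(x), \tag{A.24}$$
with the convention $j_{-1} = x$; the product stands for the successive (nested) application of the signed summations, the factor $i = 0$ being the outermost one. In particular, if $x > u_0 \ge u_1 \ge u_2 \ge \cdots$,
$$G_n(x|U) = \prod_{i=0}^{n-1} \sum_{j_i = 1 + u_i}^{j_{i-1}} e_0(x), \tag{A.25}$$
and if $x < u_0 \le u_1 \le u_2 \le \cdots$,
$$G_n(x|U) = (-1)^n \prod_{i=0}^{n-1} \sum_{j_i = 1 + j_{i-1}}^{u_i} e_0(x). \tag{A.26}$$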
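Read as nested sums, (A.24) and (A.25) translate directly into short recursive functions. The sketch below is a naive, exponential-time evaluation intended only to make the notation concrete; `G` and `nested_sum` are illustrative names, and the degree $n$ is taken to be the length of the list `U`.

```python
def sign(a):
    """Sign of an integer: -1, 0 or 1."""
    return (a > 0) - (a < 0)


def G(x, U):
    """G_n(x|U), n = len(U), via (A.24): outermost signed sum first."""
    if not U:
        return 1  # G_0 = e_0 = 1
    u0, rest = U[0], U[1:]
    lo, hi = 1 + min(x, u0), max(x, u0)
    # sign(x - u_0) * sum over j of G_{n-1}(j | EU)
    return sign(x - u0) * sum(G(j, rest) for j in range(lo, hi + 1))


def nested_sum(x, U):
    """Right-hand side of (A.25): sums over j_i = 1 + u_i, ..., j_{i-1}."""
    if not U:
        return 1
    return sum(nested_sum(j, U[1:]) for j in range(1 + U[0], x + 1))


print(G(4, [2, 1, 0]), nested_sum(4, [2, 1, 0]))  # non-increasing U, x > u_0
print(G(0, [2, 3]))                               # x < u_0 <= u_1, case (A.26)
print(G(4, [0, 2, 1]))                            # general U needs the signs
```

In the monotone case (A.25), `nested_sum` simply counts the integer tuples with $x \ge j_0 \ge \cdots \ge j_{n-1}$ and $j_i \ge 1 + u_i$.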
Proof. From the definition (A.9), $G_n(x|U) = I_{u_0} G_{n-1}(x|EU)$, and applying (A.23) then gives
$$G_n(x|U) = \mathrm{sign}(x-u_0) \sum_{j=1+x\wedge u_0}^{x\vee u_0} G_{n-1}(j|EU).$$
By successive iterations, this yields formula (A.24). If $x > u_0 \ge u_1 \ge u_2 \ge \cdots$, one has $j_{-1} = x > u_0$ and $j_{i-1} \ge 1 + u_{i-1} > u_i$ for $1 \le i \le n-1$, so that (A.24) becomes (A.25). Similarly for (A.26). □

Formula (A.25) is simple and elegant. It can be extended to an arbitrary integer sequence $U$ as follows.

Corollary A.5. For $n \ge 1$ and $u_0, u_1, \ldots \in \mathbb{Z}$, put $W_{n-1} = \{w_{i,n-1}, 0 \le i \le n-1\}$ where $w_{i,n-1} = \max\{u_i, u_{i+1}, \ldots, u_{n-1}\}$. Then, for any integer $x > w_{0,n-1}$,
$$G_n(x|W_{n-1}) = \prod_{i=0}^{n-1} \sum_{j_i = 1 + u_i}^{j_{i-1}} e_0(x). \tag{A.27}$$
In particular, if $u_0 \le u_1 \le u_2 \le \cdots$, then for $x \ge 1 + u_{n-1}$ (respectively $<$),
$$G_n(x|W_{n-1}) = e_n(x - u_{n-1}) \quad (\text{respectively} = 0). \tag{A.28}$$
Proof. Let $J$ be the right-hand side of (A.27). Consider any given index $j_i$ in $J$. To have non-null terms, one needs $j_i \ge 1 + u_i$, $j_i \ge j_{i+1} \ge 1 + u_{i+1}$, $j_i \ge j_{i+1} \ge j_{i+2} \ge 1 + u_{i+2}, \ldots$. Thus, the only terms retained are $j_i \ge 1 + \max\{u_i, u_{i+1}, \ldots, u_{n-1}\} = 1 + w_{i,n-1}$. As a consequence, $J$ reduces to
$$J = \prod_{i=0}^{n-1} \sum_{j_i = 1 + w_{i,n-1}}^{j_{i-1}} e_0(x),$$
and this time, the $w_{i,n-1}$'s form a non-increasing sequence (with respect to $i$). From (A.25), we then deduce that if $x > w_{0,n-1}$, $J = G_n(x|W_{n-1})$ as stated in (A.27). Now, if the $u_i$'s are non-decreasing, $w_{i,n-1} = u_{n-1}$ for all $i$. So, if $x > u_{n-1}$, we obtain, using (A.19), that $J = G_n(x|\{u_{n-1}, \ldots, u_{n-1}\}) = e_n(x - u_{n-1})$, i.e. formula (A.28). The identity $J = e_n(x - u_{n-1})$ could also be proved by a direct combinatorial argument. □
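The reduction used in this proof can be illustrated numerically: the nested sum built on $U$ coincides with the one built on the suffix maxima $W_{n-1}$, and for a non-decreasing $U$ it collapses to $e_n(x - u_{n-1})$. The Python sketch below uses small arbitrary sequences; `suffix_max`, `nested_sum` and `e` are illustrative names.

```python
from fractions import Fraction
from math import factorial


def e(n, x):
    """e_n(x) = x (x+1) ... (x+n-1) / n! for integer x."""
    num = 1
    for k in range(n):
        num *= x + k
    return Fraction(num, factorial(n))


def suffix_max(U):
    """Return W with w_i = max(u_i, ..., u_{n-1})."""
    W, m = [], None
    for u in reversed(U):
        m = u if m is None else max(m, u)
        W.append(m)
    return W[::-1]


def nested_sum(x, U):
    """J: nested sums over j_i = 1 + u_i, ..., j_{i-1}, with j_{-1} = x."""
    if not U:
        return 1
    return sum(nested_sum(j, U[1:]) for j in range(1 + U[0], x + 1))


U = [0, 2, 1]
W = suffix_max(U)
print(W, nested_sum(4, U), nested_sum(4, W))   # same value for U and W

V = [0, 1, 3]                                  # non-decreasing sequence
print(nested_sum(5, V), e(3, 5 - V[-1]))       # both sides of J = e_n(x - u_{n-1})
print(nested_sum(3, V))                        # empty sums when x < 1 + u_{n-1}
```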
References

Arnold, B.C., Balakrishnan, N., Nagaraja, H.N., 2008. A First Course in Order Statistics. SIAM, Classics in Applied Mathematics 54, Philadelphia.
Ball, F.G., O'Neill, P., 1999. The distribution of general final state random variables for stochastic epidemic models. J. Appl. Probab. 36, 473–491.
Balakrishnan, N. (Ed.), 1997. Advances in Combinatorial Methods and Applications to Probability and Statistics. Birkhäuser, Boston.
Conover, W.J., 1999. Practical Nonparametric Statistics, third ed. Wiley, New York.
David, H.A., Nagaraja, H.N., 2003. Order Statistics, third ed. Wiley, Hoboken.
Denuit, M., Lefèvre, C., Picard, P., 2003. Polynomial structures in order statistics distributions. J. Statist. Plann. Inference 113, 151–178.
De Vylder, F.E., 1999. Numerical finite-time ruin probabilities by the Picard–Lefèvre formula. Scand. Actuar. J. 2, 97–105.
Gontcharoff, W., 1937. Détermination des Fonctions Entières par Interpolation. Hermann, Paris.
Ignatov, Z.G., Kaishev, V.K., 2000. Two-sided bounds for the finite time probability of ruin. Scand. Actuar. J. 1, 46–62.
Ignatov, Z.G., Kaishev, V.K., 2004. A finite-time ruin probability formula for continuous claim severities. J. Appl. Probab. 41, 570–578.
Kaz'min, Y.A., 1988. Appell polynomials. In: Hazewinkel, M. (Ed.), Encyclopedia of Mathematics, vol. 1. Kluwer, Dordrecht, pp. 209–210.
Kozubowski, T.J., Podgórski, K., 2007. Invariance properties of the negative binomial Lévy process and stochastic self-similarity. Int. Math. Forum 30, 1457–1468.
Lefèvre, C., Picard, P., 1990. A non-standard family of polynomials and the final size distribution of Reed–Frost epidemic processes. Adv. Appl. Probab. 22, 25–48.
Lefèvre, C., Picard, P., 2006. A nonhomogeneous risk model for insurance. Comput. Math. Appl. 51, 325–334.
Mohanty, S.G., 1979. Lattice Path Counting and Applications. Academic, New York.
Mohanty, S.G., 1980. On some computational aspects of rectangular probabilities. In: Colloquia Mathematica Societatis János Bolyai, Nonparametric Statistical Inference, vol. 32, pp. 597–617.
Mullin, R., Rota, G.-C., 1970. On the foundation of combinatorial theory, III: theory of binomial enumeration. In: Harris, B. (Ed.), Graph Theory and its Applications. Academic, New York, pp. 167–213.
Niederhausen, H., 1981. Sheffer polynomials for computing exact Kolmogorov–Smirnov and Rényi type distributions. Ann. Statist. 9, 923–944.
Niederhausen, H., 1986. The enumeration of restricted random walks by Sheffer polynomials with applications to statistics. J. Statist. Plann. Inference 14, 95–114.
Niederhausen, H., 1997. Lattice path enumeration and Umbral calculus. In: Balakrishnan, N. (Ed.), Advances in Combinatorial Methods and Applications to Probability and Statistics. Birkhäuser, Boston, pp. 15–27.
Picard, P., Lefèvre, C., 1996. First crossing of basic counting processes with lower non-linear boundaries: a unified approach with pseudopolynomials (1). Adv. Appl. Probab. 28, 853–876.
Picard, P., Lefèvre, C., 1997. The probability of ruin in finite time with discrete claim size distribution. Scand. Actuar. J. 1, 58–69.
Picard, P., Lefèvre, C., 2003. On the first meeting or crossing of two independent trajectories for some counting processes. Stochastic Process. Appl. 104, 217–242.
Steck, G.P., 1969. The Smirnov two-sample tests as rank tests. Ann. Math. Statist. 40, 1449–1466.
Shorack, G.R., Wellner, J.A., 2009. Empirical Processes with Applications to Statistics. SIAM, Classics in Applied Mathematics 59, Philadelphia.