Journal of Statistical Planning and Inference 142 (2012) 1464–1479
Exact and asymptotic results for pattern waiting times

M.V. Koutras (Department of Statistics and Insurance Science, University of Piraeus, 80, Karaoli & Dimitriou Street, 18534 Piraeus, Greece)
F.S. Milienos (Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, 83200 Karlovassi, Samos, Greece)
Article history: Received 31 May 2011; Received in revised form 17 October 2011; Accepted 28 December 2011; Available online 8 January 2012

Abstract
Waiting time problems for the occurrence of a pattern have attracted considerable research interest. Several results, including Poisson or compound Poisson approximations as well as normal approximations, have appeared in the literature. In addition, a number of asymptotic results have been developed by making use of the finite Markov chain imbedding technique and the Perron–Frobenius eigenvalue. In the present paper we present a recursive scheme for the evaluation of the tail probabilities of the waiting time for the first and the r-th occurrence of a pattern. A number of asymptotic results (along with their rates of convergence) that do not require the existence of the Perron–Frobenius eigenvalue are also offered. These results cover a quite wide class of pattern waiting time problems and, in most cases, perform better than the ones using the Perron–Frobenius eigenvalue. © 2012 Elsevier B.V. All rights reserved.
Keywords: Finite Markov chain imbedding; Pattern waiting times; Multistate trials; Success runs
1. Introduction

The concept of pattern arises in a quite natural way in many applications comprising experimental trials with two or more possible outcomes in each trial. For example, an educational psychologist evaluates his subject's learning capability or his method's efficiency by examining patterns of level achievements of students exposed to the learning process, Grant (1946) and Bogartz (1965); an ecologist identifies the spread of a disease by studying specific patterns of infected/noninfected plants in a transect through a field, Pielou (1963a,b, 1977); many quality control plans base the acceptance/rejection of the sample lot on the occurrence of prolonged sequences of successive working/failed components (this is a special case of pattern that has been termed a "run"), Wolfowitz (1943) and Balakrishnan et al. (1993). The term pattern (and its special case, run) is used in the field of Probability and Statistics in almost the same way as it is used in common language. Thus, if a sequence of multistate trials is considered (a sequence of trials with two or more possible outcomes in each trial), the term pattern E designates a specific string or family of strings with given composition. For example, in the sequence 11123133123312 we have one run of 1's of length 3, two runs of 3's of length 2, two occurrences of the pattern E = 123, four occurrences of the composite pattern E = 12 or 13, etc. Pattern problems, in the simplest form of a run, have attracted the attention of probabilists and statisticians as far back as the 18th century. Historically, the origin of problems pertaining to success runs in sequences of binary trials begins with De Moivre (1756). Their widespread applicability in numerous scientific fields (Psychology, Meteorology, non-parametric Statistical Inference, Quality Control, etc.)
has sustained continuous research interest and led to several variations, extensions and generalizations of the original concept and set-up, thereby arriving at the more general idea of pattern. Two
* Corresponding author.
E-mail addresses: [email protected] (M.V. Koutras); [email protected], [email protected] (F.S. Milienos).
0378-3758/$ - see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2011.12.028
contemporary areas where the application of distribution theory of runs and patterns had beneficial influence are reliability theory (cf. Chao et al., 1995 for a review) and start-up demonstration tests (see, Viveros and Balakrishnan, 1993 or Balakrishnan et al., 1995). Fu and Koutras (1994) developed a unified method for capturing the exact distribution of the number of runs of specified length by employing a Markov chain imbedding technique. Koutras and Alexandrou (1995) refined the method and expressed these distributions in terms of multidimensional binomial type probability vectors. Fu (1996) extended the original method to cover the case of arbitrary patterns (instead of runs). Doi and Yamamoto (1998) and Han and Aki (1999) considered the case of multivariate run related distributions and offered simple solutions to the problem by exploiting proper extensions of the Markov chain imbedding technique. The Markov chain approach possesses great potential and simplifies substantially the solution of the problem it is applied to. A great advantage of this approach is that, by some trivial adjustments in the form of the transition probability matrix, it is applicable not only to the cases where the sequence of multistate trials consists of independent and identically distributed (i.i.d.) trials; one can still exploit it when the independence assumption is dropped and e.g. a sequence of trials evolving according to a stationary Markov chain is considered. Run or pattern related problems under Markovian dependence or independence set-ups have been studied by Schwager (1983), Hirano and Aki (1993), Balasubramanian et al. (1993), Aki and Hirano (1993), Mohanty (1994), Fu et al. (2003), Makri et al. (2007), Fu and Johnson (2009), Dafnis et al. (2010), and Stefanov (2011). The results presented in the aforementioned works pertain to the study of the number of appearances of a run or, more generally, a pattern E. 
A closely related problem that could be looked upon is the investigation of the waiting time until the first or, in general, the r-th occurrence of the pattern E (see Koutras, 1997). Waiting time problems have been studied by numerous authors; for a comprehensive list of publications in this area the interested reader may refer to the monographs by Balakrishnan and Koutras (2002), Fu and Lou (2003), and Glaz et al. (2001). During the last decades, several publications dealt with the problem of establishing exact and asymptotic results for the tail probabilities of waiting time distributions associated with pattern occurrences, see e.g. Godbole (1991), Godbole and Schaffner (1993), Koutras (1996a), Fu et al. (2003), and Fu and Johnson (2009). In the first two papers, a Poisson or compound Poisson approximation is exploited, in the sense that, under appropriate conditions, the total variation distance between the distribution of interest and the corresponding probabilities of a Poisson or compound Poisson distribution tends to zero. In the third paper, exact and asymptotic results are provided for the waiting time distribution of a special pattern. In the papers by Fu and his colleagues, the finite Markov chain imbedding technique is brought into play and several asymptotic results are developed by placing specific assumptions on the eigenspaces of the associated transition probability matrix. A crucial point for setting up their approach is that the non-negative $s \times s$ matrix involved in the finite Markov chain imbedding procedure possesses a real eigenvalue $\lambda_1 < 1$ which is greater in modulus than any other eigenvalue $\lambda_j$, $j = 2, \ldots, s$; $\lambda_1$ is then explicitly used in the formulae that provide the asymptotic approximations.
One could resort to the celebrated Perron–Frobenius theory to guarantee the existence of such an eigenvalue for a special class of matrices, namely primitive matrices; however, this class cannot cover all matrices emerging from pattern waiting time problems. In the present paper, motivated by the aforementioned observation, we pursue a different class of asymptotic results which, instead of using $\lambda_1$, are based on the quantity $\sum_{i=1}^{s} \lambda_i^n$ (the latter is a non-negative real number for any non-negative matrix). Although we focus on the waiting time for the first occurrence of a pattern E (simple or composite) in a sequence of multistate trials, our approach can be adequately adapted to the more general problem of the waiting time $T_r$ for the r-th occurrence ($r > 1$). The exposition has been organized as follows. After the introduction of the necessary notations and definitions (Section 2), a general result is established in Section 3 which offers a recursive scheme for the computation of the tail probabilities of T and $T_r$. The derivation of these results and the asymptotic formulae of the subsequent sections are accomplished by exploiting the Markov chain imbedding technique. More specifically, in Section 4, the asymptotic distributions of T and $T_r$ are investigated under fairly general conditions on the nature of the substochastic matrix associated with the pattern of interest. All suggested asymptotic formulae use the quantity $\sum_{i=1}^{s} \lambda_i^n$ and not $\lambda_1$ only (which is traditionally used in the publications up to date); in addition, their rate of convergence is provided and a brief discussion is offered on how they can be exploited to develop computationally tractable approximations. Finally, in Section 5 a numerical study is carried out to verify the quality of the suggested approximations.
2. Definitions and notations

Let $Z_1, Z_2, \ldots$ be an (infinite) sequence of multistate trials and denote by E a specific pattern, i.e. a fixed length string of symbols from the alphabet that describes the possible outcomes of the trials. Denote by $X_n$ the number of occurrences of E in $Z_1, Z_2, \ldots, Z_n$ for fixed $n \ge 1$ and by T the waiting time for the first occurrence of E in the sequence $Z_1, Z_2, \ldots$. It is clear that the probability function $f(n) = P(T = n)$, $n = 1, 2, \ldots$, of T can be expressed via the tail probabilities of T

$\bar{F}(n) = P(T > n)$, n = 0, 1, ...   (1)

(convention: $\bar{F}(0) = 1$) as follows:

$f(n) = \bar{F}(n-1) - \bar{F}(n)$, n = 1, 2, ....   (2)
Also, it will prove very helpful to associate $\bar{F}(n)$ with the point probability of $X_n$ at zero, namely

$\bar{F}(n) = P(X_n = 0)$, n = 1, 2, ....   (3)
The usefulness of the last formula arises from the fact that, in many cases, $X_n$ can be studied by exploiting an appropriate Markov chain. As Fu and Lou (2003) elucidated, a substantial number of problems associated with pattern enumeration can be addressed in a unified way by establishing an appropriate finite Markov chain imbedding framework. Such a framework offers neat and compact representations of the underlying distributions which are amenable to further study of their properties. In the rest of this paper we shall be assuming that the pattern enumerating random variable is a Markov chain imbeddable variable in the sense of the Fu and Koutras (1994) definition. We recall that the non-negative finite integer-valued random variable $X_n$ is called a Markov chain imbeddable variable if there exists a Markov chain $\{Y_t : t \ge 0\}$ defined on a state space $\Omega = \{\omega_1, \omega_2, \ldots\}$, which can be partitioned as

$\Omega = \bigcup_{x \ge 0} C_x$
and its probability mass function can be captured by considering the projection of the probability space of $Y_n$ onto $C_x$, i.e.

$P(X_n = x) = P(Y_n \in C_x) = \pi \left( \prod_{t=1}^{n} \Lambda_t \right) \sum_{i: \omega_i \in C_x} e_i'$, x = 0, 1, ..., $\ell_n$

($\ell_n = \max\{x : P(X_n = x) > 0\}$ stands for the upper end point of $X_n$ and $e_i'$ is a column vector having 1 at the i-th coordinate and 0 elsewhere). The vector

$\pi = (P(Y_0 = \omega_1), P(Y_0 = \omega_2), \ldots)$

contains the initial probabilities of the Markov chain $\{Y_t : t \ge 0\}$. In order to keep the previous (and several forthcoming results) valid for n = 0, the following convention will be used: $\prod_{t=a}^{b} \Lambda_t = I$ if $a > b$. Evidently, if $\Lambda_t = \Lambda$ for all $t \ge 1$ (homogeneous Markov chain), the exact distribution of $X_n$ takes on the next form

$P(X_n = x) = \pi \Lambda^n \sum_{i: \omega_i \in C_x} e_i'$, x = 0, 1, ..., $\ell_n$.   (4)
Since in most practical applications the underlying probability models can be described by homogeneous Markov chain imbeddable variables, it is sufficient for our purposes, and also of greater simplicity, to adhere to this class of variables. On the basis of (1)–(4) the obvious conclusion to be drawn is that the distribution of the waiting time random variable T is determined by

$\bar{F}(n) = P(X_n = 0) = \pi \Lambda^n \sum_{i: \omega_i \in C_0} e_i'$, n = 0, 1, ...

(convention: $P(X_0 = 0) = 1$). Let us next rearrange the states of $\Omega = \{\omega_1, \omega_2, \ldots\}$ so that the states of $C_0$ correspond to the first $s = |C_0|$ states of $\Lambda$ and accumulate the states of $C_x$, $x = 1, 2, \ldots, \ell_n$, in a single absorbing state. It is then evident that $P(X_n = 0)$ can be obtained by virtue of the formula

$\bar{F}(n) = P(T > n) = P(X_n = 0) = \pi_0 \Lambda_0^n \mathbf{1}'$, n = 0, 1, ...,   (5)
where $\Lambda_0$ is the upper-left $s \times s$ part of $\Lambda$, $\pi_0$ contains the first s entries of the initial probability vector $\pi$ and $\mathbf{1} = (1, 1, \ldots, 1)$. It is worth mentioning that (5) provides the probability that the reduced Markov chain (i.e. the one described by the s + 1 states) does not enter its absorbing state until time n. In the next paragraphs we shall establish several formulae that facilitate the evaluation of the tail probabilities $\bar{F}(n) = P(T > n)$ (and thereof, the probability function $f(n) = P(T = n)$, by applying (2)) of the waiting time random variable T. The evaluation will be carried out through quantities arising from the substochastic matrix $\Lambda_0$.

3. A recursive scheme for the tail probabilities

In the present section we are going to exploit expression (5) in order to establish a simple linear recursive scheme for the waiting time tail probabilities $\bar{F}_n = \bar{F}(n) = P(T > n)$, $n = 0, 1, \ldots$. More specifically we have the following result.

Proposition 1. Let

$P_s(w) = \det(wI - \Lambda_0) = \sum_{i=0}^{s} a_i w^i$   (6)
be the characteristic polynomial of the substochastic matrix $\Lambda_0$. Then the $\bar{F}_n$ satisfy the recurrence

$\bar{F}_n = -\sum_{j=n-s}^{n-1} a_{j-n+s} \bar{F}_j$ for n > s.   (7)
Proof. According to the well known Cayley–Hamilton theorem, the characteristic polynomial $P_s(w)$ of $\Lambda_0$ should vanish when w is replaced by $\Lambda_0$, i.e.

$\sum_{i=0}^{s} a_i \Lambda_0^i = 0.$

Premultiplying the last equality by $\pi_0 \Lambda_0^{n-s}$ (for n > s) and postmultiplying by $\mathbf{1}'$ we get

$\sum_{i=0}^{s} a_i (\pi_0 \Lambda_0^{n-s+i} \mathbf{1}') = \pi_0 \Lambda_0^{n-s}\, 0\, \mathbf{1}' = 0,$

which, by virtue of (5), yields

$\sum_{i=0}^{s} a_i \bar{F}_{n-s+i} = 0$, n > s.

The desired result is readily ascertainable by changing the variable of summation to $j = n - s + i$ and making use of the fact that the coefficient of $w^s$ in the characteristic polynomial $P_s(w)$ equals 1 (i.e. $a_s = 1$). □

It is of interest to recall that the coefficients $a_0$ and $a_{s-1}$ of the characteristic polynomial $P_s(w)$ are given by the expressions

$a_0 = (-1)^s \det(\Lambda_0)$, $a_{s-1} = -\mathrm{tr}(\Lambda_0)$.
Manifestly, in order to launch the recursion (7) one needs (as initial conditions) the values of $\bar{F}_1, \bar{F}_2, \ldots, \bar{F}_s$. It is quite common in several applications to have $\bar{F}_1 = \bar{F}_2 = \cdots = \bar{F}_{s-1} = 1$, in which case (7) yields

$\bar{F}_{s+1} = -\sum_{j=1}^{s} a_{j-1} \bar{F}_j = -a_{s-1} \bar{F}_s - \sum_{j=0}^{s-2} a_j.$
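Proposition 1 is easy to check numerically. For s = 2 the coefficients of (6) reduce to $a_2 = 1$, $a_1 = -\mathrm{tr}(\Lambda_0)$, $a_0 = \det(\Lambda_0)$, so (7) reads $\bar{F}_n = \mathrm{tr}(\Lambda_0)\bar{F}_{n-1} - \det(\Lambda_0)\bar{F}_{n-2}$. A minimal sketch in Python; the matrix used is the success-run matrix for k = 2 treated below, and p = 0.3 is an illustrative value:

```python
# Numerical check of recurrence (7) for an s = 2 substochastic matrix:
# Fbar_n = tr(L0)*Fbar_{n-1} - det(L0)*Fbar_{n-2}   (a_2 = 1, a_1 = -tr, a_0 = det)
p, q = 0.3, 0.7                    # illustrative success/failure probabilities
L0 = [[q, p], [q, 0.0]]            # success-run (k = 2) substochastic matrix

def mat_vec(A, v):
    """Multiply a small square matrix by a column vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def tail_direct(n, pi0=(1.0, 0.0)):
    """Fbar(n) = pi0 * L0^n * 1' computed by repeated matrix-vector products."""
    v = [1.0] * len(L0)
    for _ in range(n):
        v = mat_vec(L0, v)
    return sum(pi0[i] * v[i] for i in range(len(v)))

trace = L0[0][0] + L0[1][1]
det = L0[0][0] * L0[1][1] - L0[0][1] * L0[1][0]

# initial conditions Fbar_0, Fbar_1, Fbar_2, then the recurrence for n > s = 2
F = [tail_direct(0), tail_direct(1), tail_direct(2)]
for n in range(3, 20):
    F.append(trace * F[-1] - det * F[-2])

assert all(abs(F[n] - tail_direct(n)) < 1e-12 for n in range(20))
```

The recursion reproduces the direct matrix-power values of (5) to machine precision, at a fraction of the cost when n is large.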
Reapplying (7) for higher values of n we may express the tail probabilities $\bar{F}_n$, n > s, in terms of $\bar{F}_s$ and the coefficients $a_0, a_1, \ldots, a_{s-1}$ of the characteristic polynomial (6). As an illustration of how Proposition 1 can be applied, let us consider the problem of studying the distribution of the longest success run in a sequence of repeated trials with two possible outcomes S (success) or F (failure). A typical method for handling success run problems is by establishing a Markov chain $\{Y_t, t \ge 0\}$ over $\Omega = \{0, 1, \ldots\}$ which keeps track of the current success run length (see Taylor and Karlin, 1984). Denoting by p, q = 1 - p the probability of success, failure respectively, it is easily verified that the transition probability matrix is

$\begin{pmatrix} q & p & 0 & 0 & \cdots \\ q & 0 & p & 0 & \cdots \\ q & 0 & 0 & p & \cdots \\ q & 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$

If we wish to study the waiting time T for the first occurrence of a success run of length $k \ge 1$, we can accumulate states $k, k+1, \ldots$ in an absorbing state and, in view of the results presented in Section 2, we have

$\bar{F}_n = P(T > n) = \pi_0 \Lambda_0^n \mathbf{1}'$, n = 0, 1, ...,

where

$\Lambda_0 = \begin{pmatrix} q & p & 0 & \cdots & 0 \\ q & 0 & p & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ q & 0 & 0 & \cdots & p \\ q & 0 & 0 & \cdots & 0 \end{pmatrix}_{k \times k}.$   (8)

Denoting by $L_n$ the length of the longest success run in a sequence of n trials, we may write $P(L_n < k) = P(T > n) = \bar{F}_n$ and the computation of the distribution of $L_n$ could be carried out by establishing recurrence relations for $\bar{F}_n$.
The characteristic polynomial of $\Lambda_0$ is effortlessly obtained as

$P_s(w) = \det(wI - \Lambda_0) = w^k - \sum_{j=0}^{k-1} q p^j w^{k-1-j} = \sum_{j=0}^{k} a_j w^j,$

where

$a_k = 1$, $a_j = -q p^{k-j-1}$, j = 0, 1, ..., k-1,   (9)

and a direct application of Proposition 1 yields the recurrence

$\bar{F}_n = \sum_{j=0}^{k-1} q p^{k-1-j} \bar{F}_{n-k+j}$ for n > k.   (10)
The last recursion scheme can be exploited in conjunction with the obvious initial conditions

$\bar{F}_0 = \bar{F}_1 = \cdots = \bar{F}_{k-1} = 1$ and $\bar{F}_k = 1 - p^k$   (11)

to compute the quantities $\bar{F}_n = P(L_n < k)$ for any $n \ge 1$ (and any k, provided that we modify the scheme according to the desired value of k). It should be stressed that the quantities $\bar{F}_n$ of the above example can also be interpreted as reliabilities of a consecutive-k-out-of-n:F system with component failure probabilities p (for a review, see e.g. Chao et al., 1995). In this framework, the recurrence relation (10) has been established by several approaches in the reliability literature (see also Koutras, 1996b). The aforementioned results carry over very nicely to the case where the independence assumption between successive trials is relaxed. Consider for example a sequence of dependent trials such that the probability of success/failure of the n-th trial depends on the number of consecutive successes that have been observed in the preceding trials (such a sequence has been termed by Aki (1985) a "binary sequence of order k"; see also Philippou (1988) for related results under this set-up). This might be a realistic scenario for reliability structures where massive breakdowns of consecutive components (note that in the reliability set-up a breakdown corresponds to a success) increase the load on the next component and therefore affect their survival probabilities. If we denote by $p_{1|i}$ the conditional probability of observing a success given that there are exactly i consecutive successes just before it (i = 0, 1, ..., k-1) and by $q_{0|i} = 1 - p_{1|i}$ the respective conditional failure probability, it is readily obtained that the transition probability matrix (8) should be replaced by

$\Lambda_0 = \begin{pmatrix} q_{0|0} & p_{1|0} & 0 & \cdots & 0 \\ q_{0|1} & 0 & p_{1|1} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ q_{0|k-2} & 0 & 0 & \cdots & p_{1|k-2} \\ q_{0|k-1} & 0 & 0 & \cdots & 0 \end{pmatrix}_{k \times k}.$
The characteristic polynomial now reads

$P_s(w) = \det(wI - \Lambda_0) = w^k - \sum_{j=0}^{k-1} q_{0|j} \left( \prod_{v=1}^{j} p_{1|v-1} \right) w^{k-1-j} = \sum_{j=0}^{k} a_j w^j,$

where

$a_k = 1$ and $a_j = -q_{0|k-1-j} \prod_{v=1}^{k-1-j} p_{1|v-1}$, j = 0, 1, ..., k-1,
and by a straightforward application of Proposition 1 the next recursion scheme ensues:

$\bar{F}_n = -\sum_{j=n-k}^{n-1} a_{j-n+k} \bar{F}_j$ for n > k,

with initial conditions (similar to (11))

$\bar{F}_0 = \bar{F}_1 = \cdots = \bar{F}_{k-1} = 1$ and $\bar{F}_k = 1 - \prod_{j=1}^{k} p_{1|j-1}.$
Proposition 1 is applicable to more general problems which have been considered substantially more difficult than the study of success runs in a sequence of binary trials. For example, in the sooner succession quota problem as stated by Ebneshahrashoob and Sobel (1990), one is interested in the waiting time until a run of k consecutive successes or a run of r consecutive failures (whichever comes first) is observed in a sequence of i.i.d. binary trials. Let us briefly illustrate how this problem can be tackled by the aid of Proposition 1, in a more general framework where trinary trials are generated by a first order Markov dependent model (see Koutras and Alexandrou, 1997). To fix the notation, consider a time homogeneous Markov chain $Z_1, Z_2, \ldots, Z_n$ on $\Omega = \{0, 1, 2\}$ with initial probabilities $P(Z_1 = j) = p_j$, for j = 0, 1, 2, and transition probabilities

$P(Z_t = j \mid Z_{t-1} = i) = p_{ij}$ for $t \ge 2$ and $i, j \in \{0, 1, 2\}.$

Then the waiting time T for the first occurrence of a run of k consecutive 1's or r consecutive 2's can be easily investigated by taking advantage of a Markov chain imbedding technique with a suitable substochastic matrix $\Lambda_0$ (the explicit form of which can be found in Koutras and Alexandrou, 1997).
The characteristic polynomial of $\Lambda_0$ equals

$P_s(w) = \det(wI - \Lambda_0) = \sum_{i=0}^{s} a_i w^i$

with (see Koutras and Alexandrou, 1997)

$a_s = 1$, $a_{s-1} = -p_{00}$,

$a_{s-2-c} = -c_2 S_c\, p_{11}^x p_{22}^y - c_1 S_{c-1}\, p_{11}^x p_{22}^y - c_3\, p_{11}^c I[c \le k-2] - c_4\, p_{22}^c I[c \le r-2]$, c = 0, 1, ..., s-2,

where $s = k + r - 1$, $c_1 = p_{21} p_{02} p_{10} + p_{12} p_{01} p_{20} - p_{00} p_{12} p_{21}$, $c_2 = p_{12} p_{21}$, $c_3 = p_{10} p_{01}$, $c_4 = p_{20} p_{02}$, and the symbol $S_c$ denotes a sum extended over the integer solutions (x, y) of the system

$x + y = c$ with $0 \le x \le k-2$, $0 \le y \le r-2$
(convention: the above sum equals zero if no such solution exists). The first part of the initial conditions for launching the above recursive scheme is given by $\bar{F}_0 = \bar{F}_1 = \cdots = \bar{F}_{\min\{k,r\}-1} = 1$, while the remaining necessary initial tail probabilities $\bar{F}_i$, $i = \min\{k,r\}, \ldots, s$, may be computed by appealing to formula (5) or any other convenient computational scheme. We conclude this section by presenting a result for the tail probability of the waiting time for the r-th occurrence of a pattern.

Proposition 2. Let $T_r$ denote the waiting time for the r-th non-overlapping occurrence of a pattern E in the sequence of trials $Z_1, Z_2, \ldots$ and

$\bar{F}_n^{(r)} = P(T_r > n)$, n = 0, 1, ...,

its tail probabilities. If $P_s(w) = \sum_{i=0}^{s} b_i w^i$ is the characteristic polynomial of the substochastic matrix $\Lambda_0$ associated with E, then the $\bar{F}_n^{(r)}$ obey the recurrence

$\bar{F}_n^{(r)} = -\sum_{j=n-rs}^{n-1} a_{j-n+rs} \bar{F}_j^{(r)}$, n > rs,   (12)

where

$a_i = \sum \frac{r!}{r_0! r_1! \cdots r_s!}\, b_0^{r_0} b_1^{r_1} \cdots b_s^{r_s}$, i = 0, 1, ..., rs,

with the last summation being extended over the non-negative integer solutions $\{r_0, \ldots, r_s\}$ of the linear system

$r_0 + r_1 + \cdots + r_s = r$, $r_1 + 2 r_2 + \cdots + s r_s = i.$   (13)
Proof. It is obvious that $P(T_r > n) = P(X_n < r) = P(X_n^* = 0)$, where $X_n^*$ enumerates the occurrences of the composite pattern "r non-overlapping appearances of the pattern E". The transition probability matrix can be expressed in the form

$\Lambda = \begin{pmatrix} \Lambda_0^* & \mathbf{u}' \\ \mathbf{0} & 1 \end{pmatrix},$
where $\Lambda_0^*$ is an $rs \times rs$ matrix that can be partitioned as follows (see Fu and Johnson, 2009):

$\Lambda_0^* = \begin{pmatrix} \Lambda_0 & \mathbf{c}' & 0 & \cdots & 0 \\ 0 & \Lambda_0 & \mathbf{c}' & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \Lambda_0 & \mathbf{c}' \\ 0 & 0 & \cdots & 0 & \Lambda_0 \end{pmatrix}_{rs \times rs}$   (14)

and $\mathbf{c}'$ is an $s \times 1$ column vector (while $\mathbf{u}$ is an $rs \times 1$ vector, of the form $\mathbf{u} = (\mathbf{0} \mid \mathbf{c})$). According to Proposition 1, the tail probabilities $\bar{F}_n^{(r)}$ obey the linear recurrence stated in (12) with the coefficients $a_j$, $j = 0, 1, \ldots, rs$, being determined via the characteristic polynomial

$P^*(w) = \det(wI - \Lambda_0^*) = \sum_{i=0}^{rs} a_i w^i.$

Appealing to standard matrix theory arguments we may readily conclude that

$P^*(w) = \det(wI - \Lambda_0^*) = (\det(wI - \Lambda_0))^r$

and, since $P_s(w) = \det(wI - \Lambda_0) = \sum_{i=0}^{s} b_i w^i$, the following identity ensues:

$\sum_{i=0}^{rs} a_i w^i = \left( \sum_{i=0}^{s} b_i w^i \right)^r.$   (15)
Using the polynomial expansion formula, the RHS takes on the form

$\left( \sum_{i=0}^{s} b_i w^i \right)^r = \sum \frac{r!}{r_0! r_1! \cdots r_s!}\, b_0^{r_0} b_1^{r_1} \cdots b_s^{r_s} w^{r_1 + 2 r_2 + \cdots + s r_s},$

where the summation extends over all the non-negative integers $r_i$ subject to the constraint $r_0 + r_1 + \cdots + r_s = r$. The desired result is easily captured by grouping the summation terms according to the values of $r_1 + 2 r_2 + \cdots + s r_s = i$ and comparing them to the coefficients of the LHS of (15). □

Note that formulae calling for the calculation of all integer solutions of linear systems such as (13) have appeared in several works of Philippou and his colleagues (see, e.g. Philippou and Makri, 1986, Theorem 2.1 and Lemma 3.1). As an illustration, let us consider a sequence of binary trials with success probability p and the pattern E = 101. Then (see, e.g. Fu and Lou, 2003)

$\Lambda_0 = \begin{pmatrix} p & 0 & q \\ p & q & 0 \\ 0 & q & 0 \end{pmatrix}$

and the characteristic polynomial is

$P_3(w) = \det(wI - \Lambda_0) = w^3 - w^2 + w(p - p^2) + (2p^2 - p^3 - p).$

If we are interested in the second appearance of the pattern, we can generate the tail probabilities $\bar{F}_n^{(2)} = P(T_2 > n)$, n = 0, 1, ..., by exploiting the recursive scheme

$\bar{F}_n^{(2)} = -\sum_{j=n-6}^{n-1} a_{j-n+6} \bar{F}_j^{(2)}$ for n > 6,

where the coefficients $a_i$ are given by

$a_i = \sum \frac{2!}{r_0! r_1! r_2! r_3!}\, b_0^{r_0} b_1^{r_1} b_2^{r_2} b_3^{r_3}$, i = 0, 1, ..., 6,

with

$b_3 = 1$, $b_2 = -1$, $b_1 = p - p^2$, $b_0 = 2p^2 - p^3 - p,$

while the summation is extended over the non-negative integer solutions $(r_0, r_1, r_2, r_3)$ of the system

$r_0 + r_1 + r_2 + r_3 = 2$, $r_1 + 2 r_2 + 3 r_3 = i.$

Hence,

$a_6 = 1$, $a_5 = -2$, $a_4 = 1 + 2p - 2p^2$, $a_3 = -4p + 6p^2 - 2p^3$, $a_2 = 2p - 3p^2 + p^4$, $a_1 = -2p^2 + 6p^3 - 6p^4 + 2p^5$, $a_0 = p^2 - 4p^3 + 6p^4 - 4p^5 + p^6.$
A sufficient set of initial conditions to launch the recursion is offered by

$\bar{F}_0^{(2)} = \bar{F}_1^{(2)} = \cdots = \bar{F}_5^{(2)} = 1$, $\bar{F}_6^{(2)} = 1 - p^4 (1-p)^2.$
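Since the $a_i$ of Proposition 2 are just the coefficients of $(P_s(w))^r$, in practice they can be produced by one polynomial multiplication instead of enumerating the solutions of (13). A sketch for the example above (E = 101, r = 2; p = 0.5 is an illustrative value, and the greedy left-to-right counter below is our implementation of the non-overlapping counting convention):

```python
from itertools import product

p = 0.5                                    # illustrative success probability
q = 1.0 - p
b = [2 * p**2 - p**3 - p, p - p**2, -1.0, 1.0]   # b_0, b_1, b_2, b_3 of P_3(w)

def poly_mul(u, v):
    """Coefficient list of the product of two polynomials."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

a = poly_mul(b, b)                         # a_0, ..., a_6 of (P_3(w))^2

# recurrence (12): Fbar_n = -sum_{j=n-6}^{n-1} a_{j-n+6} * Fbar_j for n > 6
F2 = [1.0] * 6 + [1.0 - p**4 * q**2]       # initial conditions Fbar_0, ..., Fbar_6
for n in range(7, 13):
    F2.append(-sum(a[j - n + 6] * F2[j] for j in range(n - 6, n)))

def count_101(seq):
    """Non-overlapping occurrences of 101 (counting restarts after each match)."""
    cnt = i = 0
    while i <= len(seq) - 3:
        if seq[i:i + 3] == (1, 0, 1):
            cnt += 1
            i += 3
        else:
            i += 1
    return cnt

def brute_tail(n):
    """P(T_2 > n) by enumerating all 2^n binary sequences."""
    return sum(p ** sum(s) * q ** (n - sum(s))
               for s in product((0, 1), repeat=n) if count_101(s) < 2)

assert all(abs(F2[n] - brute_tail(n)) < 1e-12 for n in range(1, 13))
```

The polynomial-multiplication route scales painlessly to larger r: the coefficients of $(P_s(w))^r$ are obtained by r - 1 successive calls to `poly_mul`.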
4. Asymptotic results

The exact calculation of the tail probabilities $\bar{F}_n$, though feasible through the recurrence provided by Proposition 1, becomes tedious when n increases (not to mention the accumulated rounding error when many iterations are required). An asymptotic estimate may therefore be useful. In the present section we develop an asymptotic result for $\bar{F}_n$ by comparing it to the trace of the matrix $\Lambda_0^n$. More specifically, if $\lambda_1, \lambda_2, \ldots, \lambda_s$ are the s (not necessarily distinct) eigenvalues of $\Lambda_0$, we shall prove that, as $n \to \infty$, the behaviour of $\bar{F}_n$ may be determined by virtue of the behaviour of

$t_n = \sum_{i=1}^{s} \lambda_i^n = \mathrm{tr}(\Lambda_0^n)$, n = 1, 2, ....

Let us denote by

$\rho = \rho(\Lambda_0) = \max_{1 \le i \le s} |\lambda_i|$

the radius of the spectral circle of $\Lambda_0$, i.e. the maximum modulus of all eigenvalues of $\Lambda_0$. Using the formula

$\sum_{n=0}^{\infty} (w \Lambda_0)^n = (I - w \Lambda_0)^{-1}$, $|w| < 1/\rho$,   (16)
the generating function of $\bar{F}_n$, $n \ge 0$ (see also (5))

$G(w) = \sum_{n=0}^{\infty} \bar{F}_n w^n = \pi_0 \left( \sum_{n=0}^{\infty} (w \Lambda_0)^n \right) \mathbf{1}'$

takes on the form

$G(w) = \sum_{n=0}^{\infty} \bar{F}_n w^n = \pi_0 (I - w \Lambda_0)^{-1} \mathbf{1}'$, $|w| < 1/\rho$.   (17)
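Identity (17) is easy to confirm numerically. For the $2 \times 2$ success-run matrix of Section 3 (k = 2; p = 0.3 and w = 0.5 are illustrative choices inside the disc of convergence), the truncated series $\sum_n \bar{F}_n w^n$ agrees with the closed form $\pi_0 (I - w\Lambda_0)^{-1} \mathbf{1}'$, which for this matrix and $\pi_0 = (1, 0)$ works out to $(1 + pw)/(1 - qw - pqw^2)$:

```python
# Check (17): G(w) = pi0 (I - w*L0)^{-1} 1' equals the power series sum of Fbar_n w^n.
# L0 = [[q, p], [q, 0]] is the k = 2 success-run matrix; pi0 = (1, 0).
p, q = 0.3, 0.7
w = 0.5                                    # |w| < 1/rho for this matrix

# tail probabilities via the recurrence Fbar_n = q*Fbar_{n-1} + p*q*Fbar_{n-2}
F = [1.0, 1.0]
for _ in range(2, 80):
    F.append(q * F[-1] + p * q * F[-2])

series = sum(Fn * w ** n for n, Fn in enumerate(F))   # truncated power series

# closed form: pi0 (I - w*L0)^{-1} 1' = (1 + p*w) / (1 - q*w - p*q*w**2)
closed = (1 + p * w) / (1 - q * w - p * q * w ** 2)
assert abs(series - closed) < 1e-10
```

Eighty terms suffice here because the series tail is of order $(w\rho)^n$, which decays geometrically for $|w| < 1/\rho$.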
Let us assume that there are m distinct eigenvalues of $\Lambda_0$, i.e. the spectrum of $\Lambda_0$ is

$\sigma(\Lambda_0) = \{\lambda_1, \lambda_2, \ldots, \lambda_m\},$

where $m \le s$ and $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_m|$ (a relabelling of $\lambda_1, \lambda_2, \ldots, \lambda_s$ has been applied so that the distinct ones receive the indices 1, 2, ..., m; in addition, the eigenvalues have been ordered in decreasing order of their moduli). In the sequel we shall denote by $u_i$, i = 1, 2, ..., m, the multiplicities of $\lambda_i$, i = 1, 2, ..., m, and with no further notice we shall be assuming that, in case two or more eigenvalues have the same modulus, they have been arranged in decreasing multiplicity order. The next lemma provides an expression for the difference $\bar{F}_n - \theta t_n$ ($\theta$ is a constant independent of n) which will be exploited in the sequel to establish results for the tail probabilities $\bar{F}_n$, $n \ge 1$.

Lemma 1. If the eigenvalue $\lambda_1$ of the substochastic matrix $\Lambda_0$ has multiplicity $u_1 = 1$, then for large n there exists a constant $\theta$ such that

$\bar{F}_n - \theta t_n = \sum_{i=2}^{m} p_i(n) \lambda_i^n,$   (18)

where the $p_i(n)$ are polynomials in n of degree at most $u_i - 1$, i = 2, 3, ..., m.

Proof. Let us denote by $c_{ij}(w) = (-1)^{i+j} \det(M_{ij}(w))$, $i, j \in \{1, 2, \ldots, s\}$, the cofactors of $I - w\Lambda_0$ (recall that $M_{ij}(w)$ is the matrix $I - w\Lambda_0$ with its i-th row and j-th column deleted; hence $\det(M_{ij}(w))$ will be a polynomial in w of degree at most s-1 and the same holds true for $c_{ij}(w)$ as well). Since

$(I - w \Lambda_0)^{-1} = \frac{1}{\det(I - w \Lambda_0)}\, \mathrm{adj}(I - w \Lambda_0),$

where $\mathrm{adj}(I - w \Lambda_0) = (c_{ij}(w))' = (c_{ji}(w))_{s \times s}$ is the adjoint (or adjugate) matrix of $I - w \Lambda_0$, expression (17) may be restated as follows:

$G(w) = \frac{\pi_0\, \mathrm{adj}(I - w \Lambda_0)\, \mathbf{1}'}{\det(I - w \Lambda_0)} = \frac{\sum_{i=1}^{s} \sum_{j=1}^{s} \pi_{0j}\, c_{ij}(w)}{\det(I - w \Lambda_0)},$
where $\pi_{0j}$, j = 1, 2, ..., s, are the s coordinates of the vector $\pi_0$. Taking into account that $\det(I - w \Lambda_0) = \prod_{i=1}^{m} (1 - w \lambda_i)^{u_i}$, we conclude that G(w) (which is a rational function of w) admits a partial fraction decomposition of the form

$G(w) = \frac{\sum_{i=1}^{s} \sum_{j=1}^{s} \pi_{0j}\, c_{ij}(w)}{\det(I - w \Lambda_0)} = \sum_{i=1}^{m} \sum_{v=1}^{u_i} \frac{d_{iv}}{(1 - w \lambda_i)^v} + R(w),$   (19)

where the $d_{iv}$ are constants (independent of n) and R(w) is a polynomial in w. It is worth mentioning that the constants $d_{iv}$ corresponding to real eigenvalues are real, while those corresponding to complex eigenvalues might be complex too. Expanding $(1 - w \lambda_i)^{-v}$ we arrive at the following power series representation:

$G(w) = \sum_{n=0}^{\infty} \bar{F}_n w^n = \sum_{n=0}^{\infty} \left( \sum_{i=1}^{m} \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} \lambda_i^n \right) w^n + R(w).$

For n greater than the degree of R(w), comparing the coefficients of $w^n$ in the power series of both sides, we get

$\bar{F}_n = \sum_{i=1}^{m} \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} \lambda_i^n$

or equivalently, since $u_1 = 1$,

$\bar{F}_n = d_{11} \lambda_1^n + \sum_{i=2}^{m} \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} \lambda_i^n.$   (20)

The desired result follows immediately by setting $\theta = d_{11}$ and observing that

$\bar{F}_n - \theta t_n = \sum_{i=2}^{m} \left( \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} - u_i d_{11} \right) \lambda_i^n = \sum_{i=2}^{m} p_i(n) \lambda_i^n,$   (21)

where

$p_i(n) = \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} - u_i d_{11} = \sum_{v=1}^{u_i} \frac{d_{iv}}{(v-1)!} \prod_{j=1}^{v-1} (n+j) - u_i d_{11}$

is a polynomial in n of degree at most $u_i - 1$, for i = 2, 3, ..., m. □
Expression (18) would be of no practical use in cases where $\theta$ is a (purely) complex number. Therefore, we shall next focus on cases where $\theta$ is real, and a proper normalization of the difference $\bar{F}_n - \theta t_n$ (i.e. dividing it by an appropriate normalizing sequence) leads to an absolutely bounded sequence, thereby establishing an asymptotic approximation of $\bar{F}_n$ by $\theta t_n$. The next two propositions offer results of this type.

Proposition 3. If the substochastic matrix $\Lambda_0$ has only one eigenvalue $\lambda_1$ on its spectral circle and its multiplicity is one (i.e. $u_1 = 1$), then there exists a non-negative real constant $\theta$ such that

$\bar{F}_n = \theta t_n + O(|\lambda_2|^n n^{u_2 - 1}).$

Proof. A well known result for non-negative matrices states that their spectral radius belongs to their spectrum (see, e.g. Meyer, 2000). Applying this result to the substochastic matrix $\Lambda_0$ yields $\rho \in \sigma(\Lambda_0) = \{\lambda_1, \lambda_2, \ldots, \lambda_m\}$ and, since we have assumed that $\Lambda_0$ has only one eigenvalue on its spectral circle, we conclude that $\rho = \lambda_1 \ge 0$ and

$\rho = \lambda_1 = |\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_m|.$

Note that, if the size of $\Lambda_0$ is greater than one (s > 1), the above inequality implies that $\rho = \lambda_1 > 0$ (this is also true for the case s = 1, since $\lambda_1 = 0$ corresponds to a transition probability matrix $\Lambda$ with two absorbing states, which is of no interest for our study). Let $\theta = d_{11}$ be the constant used in the proof of Lemma 1. Since $\lambda_1 \in \mathbb{R}$ we conclude that $\theta \in \mathbb{R}$; moreover, by virtue of (20), which is valid due to the assumption $u_1 = 1$, we may write

$\frac{\bar{F}_n}{\lambda_1^n} = \theta + \sum_{i=2}^{m} \sum_{v=1}^{u_i} d_{iv} \binom{n+v-1}{n} \left( \frac{\lambda_i}{\lambda_1} \right)^n \ge 0$

(for n greater than the degree of R(w)). Letting $n \to \infty$, the last inequality yields that $\theta$ is a non-negative real number. Exploiting (21), for $|\lambda_2| \ne 0$ we deduce

$\left| \frac{\bar{F}_n - \theta t_n}{\lambda_2^n n^{u_2 - 1}} \right| \le \sum_{i=2}^{m} \left| \frac{p_i(n)}{n^{u_2 - 1}} \right| \left| \frac{\lambda_i}{\lambda_2} \right|^n$   (22)
and it is not difficult to verify that the RHS converges to a non-negative real number. More specifically, if 9l2 9 4 9l3 9 Z Z 9lm 9 it is clear that n p ðnÞ li n l ¼0 lim pi ðnÞ i ¼ 0 ) lim ui 1 n-1 n-1 n 2 l2 l2 for all i ¼ 3; 4, . . . ,m, while if 9l2 9 ¼ 9l3 9 ¼ ¼ 9ll 9 49ll þ 1 9 Z Z9lm 9 (2r l rm) we may write n X m l m X X pi ðnÞ li n pi ðnÞ li ¼ pi ðnÞ þ nu2 1 l nu2 1 nu2 1 l 2 2 i¼2 i¼2 i ¼ lþ1 with the second sum (in the RHS of the above equality) converging to zero and the first converging to a constant (note that pi(n) are polynomials of degree at most ui 1 r u2 1, for 2 ri r l). n n Since we have proved that 9ðF n ytn Þ=l2 nu2 1 9 is bounded as n-1 and the quantities F n ytn , 9l2 9 nu2 1 are real numbers we may state that n
F n ytn ¼ Oð9l2 9 nu2 1 Þ, which is equivalent to the assertion of the proposition. Finally, if 9l2 9 ¼ 0 the expression (18) leads us immediately to the above result. & It is worth noting that, since K0 is substochastic we have 0 r r ¼ rðK0 Þ r 1. The trivial case r ¼ 0 is of no interest (all eigenvalues vanish), while the case r o 1 is what is usually met. On the other hand if 9l1 9 49l2 9, Proposition 3 reveals that lim
Fn
t
n-1 y n
¼1
ð23Þ
(for ytn a0) and therefore the waiting time tail probabilities F n can be satisfactorily approximated by F n ytn :
ð24Þ
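To make Proposition 3 and the approximation (24) concrete, here is a minimal numerical sketch in Python (the $2\times 2$ matrix is an arbitrary toy example of ours, not one of the matrices constructed in the previous sections; for it one can verify by hand that $F_n = 1.2\cdot 0.7^n - 0.2\cdot 0.2^n$, so that $\theta = 1.2$ and $F_n - \theta\tau_n = O(0.2^n)$):

```python
import numpy as np

# Toy substochastic matrix with simple eigenvalues 0.7 and 0.2,
# so rho = lambda_1 = 0.7 > |lambda_2| and Proposition 3 applies.
K0 = np.array([[0.5, 0.3],
               [0.2, 0.4]])
pi0 = np.array([1.0, 0.0])   # initial probability vector
one = np.ones(2)

def F(n):
    """Tail probability F_n = pi0 K0^n 1'."""
    return pi0 @ np.linalg.matrix_power(K0, n) @ one

def tau(n):
    """tau_n = tr(K0^n) = sum of the n-th powers of the eigenvalues."""
    return np.trace(np.linalg.matrix_power(K0, n))

# Closed form (by diagonalization): F_n = 1.2*0.7^n - 0.2*0.2^n, hence
# theta = lim F_n / tau_n = 1.2, and the ratio stabilizes geometrically fast.
eigs = sorted(np.linalg.eigvals(K0).real)
theta_hat = F(20) / tau(20)
```

The ratio $F_{20}/\tau_{20}$ already agrees with $\theta = 1.2$ to many decimal places, since the error is driven by $(\lambda_2/\lambda_1)^n = (2/7)^n$.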
A class of substochastic matrices that satisfies the conditions of Proposition 3 is the class of primitive matrices. A non-negative square matrix $\mathbf{A}$ is said to be primitive if there is a positive integer $n$ such that $\mathbf{A}^n > 0$ (i.e. the elements of $\mathbf{A}^n$ are all positive). According to the well-known Perron–Frobenius Theorem (see, e.g. Seneta, 1981), if $\mathbf{A}$ is a non-negative primitive matrix, then there exists an eigenvalue $\rho$ (the Perron–Frobenius eigenvalue) of $\mathbf{A}$ such that: (i) $\rho$ is a real positive number, (ii) $\rho > |\lambda|$ for any eigenvalue $\lambda \neq \rho$ and (iii) the multiplicity of $\rho$ equals 1, i.e. $\rho$ is a simple eigenvalue.

Corollary 1. If the substochastic matrix $\mathbf{K}_0$ is primitive then there exists a positive real constant $\theta$ such that
$$F_n = \theta\tau_n + O(|\lambda_2|^n n^{\upsilon_2-1}).$$

Proof. It is clear that if the substochastic matrix $\mathbf{K}_0$ is primitive we shall have
$$|\lambda_1| = \lambda_1 > |\lambda_2| \geq \cdots \geq |\lambda_m| \geq 0, \qquad \upsilon_1 = 1$$
and therefore Proposition 3 guarantees the validity of the stated formula for $\theta = \delta_{11} \geq 0$. We shall next prove that the constant $\theta = \delta_{11}$ is now positive. To achieve that, we may take advantage of the fact that, for primitive matrices, we have (see Seneta, 1981)
$$\frac{\mathrm{adj}(\mathbf{I}-(1/\rho)\mathbf{K}_0)}{\prod_{i=2}^{m}(1-(1/\rho)\lambda_i)^{\upsilon_i}} = \mathbf{v}_2'\mathbf{v}_1, \qquad (25)$$
where $\mathbf{v}_1, \mathbf{v}_2$ are respectively the positive left and right eigenvectors of $\rho = \lambda_1$ (which always exist). Taking into account that $\upsilon_1 = 1$ and the partial fraction decomposition (19), we get
$$\delta_{11} = \lim_{w\to\rho^{-1}}[G(w)(1-w\rho)] = \lim_{w\to\rho^{-1}}\frac{\pi_0\,\mathrm{adj}(\mathbf{I}-w\mathbf{K}_0)\mathbf{1}'}{\prod_{i=1}^{m}(1-w\lambda_i)^{\upsilon_i}}(1-w\rho) = \pi_0\mathbf{v}_2'\mathbf{v}_1\mathbf{1}' > 0. \qquad \Box$$

It should be stressed that, due to the construction of the substochastic matrix $\mathbf{K}_0$, at least one of its rows has row sum strictly less than 1 and no row sum vanishes. If $\mathbf{K}_0$ is primitive, then taking into account the fact that the Perron–Frobenius eigenvalue is bounded from below by the minimum row sum and from above by the maximum row sum of $\mathbf{K}_0$, while equality on either side implies equality throughout (see, e.g. Seneta, 1981), we conclude that $0 < \lambda_1 < 1$ and therefore $|\lambda_i| < 1$ for all $i = 2,3,\ldots,m$. Note also that if $\lambda_1 > |\lambda_i|$ for $i = 2,3,\ldots,m$, then
$$\lim_{n\to\infty}\frac{\tau_n}{\lambda_1^n} = \lim_{n\to\infty}\left(1 + \sum_{i=2}^{m}\left(\frac{\lambda_i}{\lambda_1}\right)^n\right) = 1,$$
and the asymptotic results (23) and (24) may also be stated as follows:
$$\lim_{n\to\infty}\frac{F_n}{\theta\lambda_1^n} = 1, \qquad F_n \approx \theta\lambda_1^n.$$
These formulae have appeared before in the literature under a reliability framework (see, e.g. Boutsikas and Koutras, 2001) or as results for run-related problems (see, e.g. Fu et al., 2003). Let us next state an asymptotic result for $F_n$ that covers a wider class than the primitive matrices, namely the class of irreducible matrices.

Proposition 4. If the substochastic matrix $\mathbf{K}_0$ is irreducible then there exists a positive real constant $\theta$ such that
$$F_n = \theta\tau_n + O(\rho^n).$$

Proof. Since $\mathbf{K}_0$ is irreducible, all its eigenvalues that lie on its spectral circle have multiplicity 1 (see, e.g. Meyer, 2000). If $\lambda_1$ is the only eigenvalue of $\mathbf{K}_0$ on its spectral circle, the desired result is readily deduced from Proposition 3 (or Corollary 1, since a non-negative irreducible matrix with only one eigenvalue on its spectral circle is primitive). Let us next consider the case where
$$\rho = |\lambda_1| = |\lambda_2| = \cdots = |\lambda_\ell| > |\lambda_{\ell+1}| \geq \cdots \geq |\lambda_m|$$
with $\upsilon_1 = \upsilon_2 = \cdots = \upsilon_\ell = 1$ ($2 \leq \ell \leq m$). Given that $\upsilon_1 = 1$, Lemma 1 is still applicable and, choosing again $\theta = \delta_{11}$ (without loss of generality assume that $\rho = \lambda_1$), we may gain identity (21) with $p_i(n)$, $i = 1,2,\ldots,\ell$ being polynomials of degree $\upsilon_1 - 1 = 0$, i.e. $p_i(n) = \mu_i$ (constants). Inequality (22) now takes the form
$$\frac{|F_n - \theta\tau_n|}{\rho^n} \leq \sum_{i=1}^{\ell}|\mu_i| + \sum_{i=\ell+1}^{m}|p_i(n)|\left|\frac{\lambda_i}{\rho}\right|^n.$$
Apparently, the RHS converges to $\sum_{i=1}^{\ell}|\mu_i|$ and the proof is completed by verifying, in the same fashion as in the proof of Corollary 1, that $\theta = \delta_{11}$ is a positive real number ((25) is also true for non-negative irreducible matrices; see Seneta, 1981, Theorem 1.5, p. 22). $\Box$

A key point in establishing Propositions 3 and 4 was the assumption that the multiplicity $\upsilon_1$ of the maximum (in modulus) eigenvalue $\lambda_1$ of the substochastic matrix $\mathbf{K}_0$ equals 1. As a matter of fact, it was this assumption that led to expression (18) of Lemma 1, which played a crucial role in the proofs of Propositions 3 and 4. On the other hand, in Proposition 2 we encountered a case where the characteristic polynomial of interest is of the form $\det(\mathbf{I}-w\mathbf{K}_0^*) = (\det(\mathbf{I}-w\mathbf{K}_0))^r$; hence, for $r > 1$, all the eigenvalues of $\mathbf{K}_0^*$ have multiplicity greater than 1 and none of the aforementioned results can be applied in order to gain an asymptotic approximation for the waiting time of interest. Fortunately, the approach used for the case $\upsilon_1 = 1$ can be properly adapted to cover the case $\upsilon_1 > 1$, at the expense of a more complicated form for the multiplicative factor $\theta$ in the approximation $\theta\tau_n$. We summarize below some results for this case, with some hints on the procedure that could be exploited to establish them. Let us assume that $\rho = \rho(\mathbf{K}_0) = |\lambda_1| < 1$ and that the multiplicity $\upsilon_1$ of $\lambda_1$ is greater than 1. If $|\lambda_1| > |\lambda_2|$ it follows that $\lambda_1 = \rho > 0$ (recall that the spectral radius $\rho = \rho(\mathbf{K}_0)$ of a substochastic matrix belongs to its spectrum $\sigma(\mathbf{K}_0) = \{\lambda_1, \lambda_2, \ldots, \lambda_m\}$) and therefore, appealing to (19), we may obtain a partial fraction decomposition for $G(w)$ with all $\delta_{1v}$, $v = 1,2,\ldots,\upsilon_1$ being real constants. A similar expansion is feasible in the case where there are $\ell$ ($2 \leq \ell \leq m$) eigenvalues on the spectral circle ($\rho = |\lambda_1| = |\lambda_2| = \cdots = |\lambda_\ell|$) and we choose $\lambda_1$ to be the one that coincides with $\rho$.
Let $\theta_n$ denote the polynomial of degree (at most) $\upsilon_1 - 1$
$$\theta_n = \sum_{v=1}^{\upsilon_1}\delta_{1v}\binom{n+v-1}{n}. \qquad (26)$$
Arguing in exactly the same way as in the proofs of Lemma 1 and Proposition 3, we may gain the following asymptotic results:

(a) If $|\lambda_1| > |\lambda_2|$ then
$$F_n = \theta_n\tau_n + O(|\lambda_2|^n n^{\max\{\upsilon_1,\upsilon_2\}-1}). \qquad (27)$$

(b) If $\rho = \lambda_1 = |\lambda_2| = \cdots = |\lambda_\ell|$ (recall that $\upsilon_1 \geq \upsilon_2$) then
$$F_n = \theta_n\tau_n + O(\rho^n n^{\upsilon_1-1}).$$

Needless to say, (a) and (b) do not cover all the cases where the multiplicity $\upsilon_1$ is greater than 1. In case (b) an additional condition has been imposed, namely that, among the eigenvalues lying on the spectral circle of $\mathbf{K}_0$, the one that coincides with $\rho$ has the greatest multiplicity. Apparently, this is not necessarily valid for every substochastic matrix; nevertheless, this extra condition is usually fulfilled for the matrices $\mathbf{K}_0$ emerging from Markov chain imbeddable variables.
In view of the above, it becomes clear that, under the conditions stated in (a) and (b), one may use the approximation
$$F_n \approx \theta_n\tau_n \qquad (28)$$
for large values of $n$. It is of interest to note that the coefficient $\delta_{1\upsilon_1}$ of $n^{\upsilon_1-1}$ in the polynomial $\theta_n$ is non-negative (this can easily be verified by reasoning similar to that used in the proof of Proposition 3 to establish that $\theta = \delta_{11} \geq 0$); hence $\theta_n$ becomes positive for sufficiently large values of $n$, a fact that makes the approximation indicated in (28) plausible.

Unfortunately, the use of formulae (24) and (28) requires the prior determination of the $\lambda_i$, $i = 1,2,\ldots,s$ (so that $\tau_n = \sum_{i=1}^{s}\lambda_i^n$ can be evaluated) and of the value of the constant $\theta = \delta_{11}$ (for (24)) or the constants $\delta_{1v}$, $v = 1,2,\ldots,\upsilon_1$ (for (28)). These calculations make our approach unwieldy, especially if the dimensionality of $\mathbf{K}_0$ or the multiplicity $\upsilon_1$ is large (in which case the use of computationally demanding numerical algorithms is almost indispensable). We shall conclude the present section by describing some simple ideas on how to use (24) and (28) for evaluating $F_n$ without resorting to the exact calculation of the $\lambda_i$, $i = 1,2,\ldots,s$ and/or the $\delta_{1v}$, $v = 1,2,\ldots,\upsilon_1$.

Apparently $\tau_1 = \sum_{i=1}^{s}\lambda_i = \sum_{i=1}^{m}\upsilon_i\lambda_i$ does not require the calculation of the $\lambda_i$, since one may evaluate it through the expression $\tau_1 = \mathrm{tr}(\mathbf{K}_0)$ by adding up the diagonal entries of $\mathbf{K}_0$. The next proposition offers a simple recursive scheme for the calculation of all $\tau_n$, $n \geq 2$ by making use of $\tau_1$ and the coefficients of the characteristic polynomial (6) of $\mathbf{K}_0$.

Proposition 5. Let
$$H(w) = \sum_{n=0}^{\infty}\tau_{n+1}w^n, \qquad |w| < 1/\rho$$
be the generating function of the sequence $\tau_{n+1}$, $n \geq 0$. If $a_i$, $i = 0,1,\ldots,s$ are the coefficients of the characteristic polynomial (6) of $\mathbf{K}_0$, then

(a) $H(w)$ can be expressed as follows:
$$H(w) = \sum_{n=0}^{\infty}\tau_{n+1}w^n = -\frac{Q_s'(w)}{Q_s(w)},$$
where $Q_s(w) = w^s P_s(1/w) = \sum_{i=0}^{s}a_{s-i}w^i$.

(b) The numbers $\tau_n$, $n \geq 1$ satisfy the recurrences
$$\sum_{i=0}^{\min\{s,n-1\}}a_{s-i}\tau_{n-i} = \begin{cases}-na_{s-n} & \text{if } 1 \leq n \leq s,\\ 0 & \text{if } n > s.\end{cases} \qquad (29)$$
Proof. (a) Note first that, by virtue of the expression $\tau_{n+1} = \mathrm{tr}(\mathbf{K}_0^{n+1})$, we may gain the formula
$$H(w) = \sum_{n=0}^{\infty}\mathrm{tr}(\mathbf{K}_0^{n+1})w^n = \sum_{n=0}^{\infty}\mathrm{tr}(w^n\mathbf{K}_0^{n+1}) = \mathrm{tr}\left(\mathbf{K}_0\sum_{n=0}^{\infty}(w\mathbf{K}_0)^n\right),$$
and making use of formula (16) we deduce
$$H(w) = \mathrm{tr}(\mathbf{K}_0(\mathbf{I}-w\mathbf{K}_0)^{-1}), \qquad |w| < 1/\rho.$$
If $\lambda_1, \lambda_2, \ldots, \lambda_s$ are the $s$ (not necessarily distinct) eigenvalues of $\mathbf{K}_0$, it can readily be verified that $1-w\lambda_i$ are the $s$ eigenvalues of $\mathbf{I}-w\mathbf{K}_0$ and $\lambda_i/(1-w\lambda_i)$, $i = 1,2,\ldots,s$ are the $s$ eigenvalues of $\mathbf{K}_0(\mathbf{I}-w\mathbf{K}_0)^{-1}$. Hence,
$$\prod_{i=1}^{s}(1-w\lambda_i) = \det(\mathbf{I}-w\mathbf{K}_0) = w^s\det\left(\frac{1}{w}\mathbf{I}-\mathbf{K}_0\right) = w^sP_s(1/w) = Q_s(w)$$
and
$$\sum_{i=1}^{s}\frac{\lambda_i}{1-w\lambda_i} = \mathrm{tr}(\mathbf{K}_0(\mathbf{I}-w\mathbf{K}_0)^{-1}) = H(w).$$
Since
$$Q_s'(w) = \frac{d}{dw}\prod_{i=1}^{s}(1-w\lambda_i) = -\sum_{i=1}^{s}\frac{\lambda_i}{1-w\lambda_i}\prod_{j=1}^{s}(1-w\lambda_j) = -Q_s(w)\sum_{i=1}^{s}\frac{\lambda_i}{1-w\lambda_i},$$
we may write
$$-\frac{Q_s'(w)}{Q_s(w)} = \sum_{i=1}^{s}\frac{\lambda_i}{1-w\lambda_i} = \mathrm{tr}(\mathbf{K}_0(\mathbf{I}-w\mathbf{K}_0)^{-1}) = H(w)$$
and the proof of part (a) is complete.
(b) The set of recurrences (29) can easily be established if we rewrite the equality $-Q_s'(w) = H(w)Q_s(w)$ in the form
$$-\sum_{i=1}^{s}ia_{s-i}w^{i-1} = \left(\sum_{n=0}^{\infty}\tau_{n+1}w^n\right)\left(\sum_{i=0}^{s}a_{s-i}w^i\right)$$
and perform the power series multiplication in the RHS to gain the identity
$$-\sum_{n=1}^{s}na_{s-n}w^{n-1} = \sum_{n=1}^{\infty}\left(\sum_{i=0}^{\min\{s,n-1\}}a_{s-i}\tau_{n-i}\right)w^{n-1}. \qquad \Box$$
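The recursive scheme of Proposition 5 is straightforward to implement. The following sketch (our own minimal illustration; the $3\times 3$ matrix is an arbitrary example) obtains the coefficients of the characteristic polynomial $P_s$ numerically and produces $\tau_1, \tau_2, \ldots$ through the recurrences (29), cross-checking against the direct evaluation $\tau_n = \mathrm{tr}(\mathbf{K}_0^n)$:

```python
import numpy as np

def taus_from_charpoly(K0, N):
    """Compute tau_1,...,tau_N via the recurrences (29) (Newton's identities).

    c[0..s] holds the coefficients of P_s(w) = det(wI - K0), highest power
    first (c[0] = 1), as returned by numpy.poly; with the notation of (29),
    c[i] = a_{s-i}.
    """
    c = np.poly(K0)
    s = len(c) - 1
    tau = [0.0] * (N + 1)                    # tau[0] is unused
    for n in range(1, N + 1):
        acc = -n * c[n] if n <= s else 0.0   # the RHS of (29)
        acc -= sum(c[i] * tau[n - i] for i in range(1, min(s, n - 1) + 1))
        tau[n] = acc
    return tau[1:]

# Arbitrary substochastic example; no eigenvalue computation is needed.
K0 = np.array([[0.2, 0.5, 0.1],
               [0.3, 0.1, 0.4],
               [0.0, 0.6, 0.3]])
tau_rec = taus_from_charpoly(K0, 12)
tau_dir = [np.trace(np.linalg.matrix_power(K0, n)) for n in range(1, 13)]
```

The recursion starts, as noted above, from $\tau_1 = \mathrm{tr}(\mathbf{K}_0)$ (the case $n = 1$ of (29)) and never requires the roots of $P_s$.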
It is worth mentioning that the recurrence relations (29) are the so-called Newton's identities (see, e.g. Meyer, 2000). Note also that, for $n > s$, the recurrences satisfied by $\tau_n$ are exactly the same as the ones satisfied by $F_n$ (compare the second part of (29) to (7)). Finally, it is of interest to draw the reader's attention to the fact that the pair of recurrence relations (29) is sufficient for the evaluation of all $\tau_n$, $n \geq 1$, starting with the initial value $\tau_1 = \mathrm{tr}(\mathbf{K}_0)$; thus, if the coefficients of the characteristic polynomial $P_s(w)$ have been determined, there is no need to compute its roots $\lambda_i$, $i = 1,2,\ldots,s$ (real or complex) in order to calculate $\tau_n = \sum_{i=1}^{s}\lambda_i^n$, $n = 1,2,\ldots$.

Though one can gain the value of $\tau_n$, $n \geq 1$ without having calculated the eigenvalues $\lambda_i$, $i = 1,2,\ldots,m$, the determination of the constant $\theta$ appearing in (24) or of the polynomial $\theta_n$ shown in (26) cannot be performed equally easily; to arrive at $\theta$ or $\theta_n$, not only do we need the eigenvalues $\lambda_i$, $i = 1,2,\ldots,m$ and their multiplicities $\upsilon_i$, but a partial fraction decomposition (as shown in (19)) should be carried out as well. However, one can exploit (24) for producing the values of $F_n$, $n \geq n_0$ without having $\theta$ at hand, provided that the evaluation of $F_{n_0}$ is computationally tractable and $n, n_0$ are large enough to justify a reasonably good accuracy of the approximating formula (24). Evidently, under these conditions we may write
$$\frac{F_n}{F_{n_0}} \approx \frac{\theta\tau_n}{\theta\tau_{n_0}} = \frac{\tau_n}{\tau_{n_0}},$$
which yields
$$F_n \approx \frac{\tau_n}{\tau_{n_0}}F_{n_0}, \qquad n \geq n_0. \qquad (30)$$
Apparently, the evaluation of $F_n$, $n \geq n_0$ by the aid of the last formula requires the value of $F_{n_0}$ and the values of $\tau_n$ and $\tau_{n_0}$, $n \geq n_0$ (Proposition 5 may be helpful for gaining the last quantities). In a similar fashion, when approximation (28) is of good quality, we may state that
$$\frac{F_n}{F_{n_0}} \approx \frac{\theta_n\tau_n}{\theta_{n_0}\tau_{n_0}}.$$
If $\lim_{n_0\to\infty}(n/n_0) = 1$ then, by virtue of (26), we get
$$\lim_{n_0\to\infty}\frac{\theta_n}{\theta_{n_0}} = 1$$
and therefore formula (30) can be used for this case as well. A second practical approach that could be used to avoid the computational effort required for the exact calculation of $\theta$ in formula (24) is to estimate it through the ratios $F_n/\tau_n$, $n \geq 1$ (for $\tau_n \neq 0$). More specifically, if the evaluation of the exact probabilities $F_n$, $n \geq 1$ is computationally tractable for a number of selected values of $n$, we may create the respective ratios $F_n/\tau_n$ and check whether their values converge to a (positive) constant; this offers a reasonable estimate of $\theta$, and then formula (24) can be applied to capture the tail probabilities of interest (for large $n$).

5. Numerical study

In the present section we illustrate how the results described in Section 4 can be exploited to establish computationally manageable and quite accurate approximations for the waiting time tail probabilities $F_n$, $n \geq 1$. Although our results are applicable to various wide classes of substochastic matrices $\mathbf{K}_0$, we shall begin with an example where $\mathbf{K}_0$ is primitive. In this case the celebrated Perron–Frobenius theory has already been used extensively in the relevant literature to establish asymptotic results for waiting time tail probabilities (see Fu et al., 2003; Boutsikas and Koutras, 2001; Chao and Fu, 1991), so it seems reasonable to proceed to a comparison between the approximations gained by the different approaches. The main difference between our results and the ones already published is that, instead of using $\lambda_1^n$ ($\lambda_1 > 0$ being the Perron–Frobenius eigenvalue of $\mathbf{K}_0$) to deduce an approximation of $F_n$, we rely on $\tau_n = \mathrm{tr}(\mathbf{K}_0^n) = \sum_{i=1}^{s}\lambda_i^n$. It goes without saying that if the remaining eigenvalues of $\mathbf{K}_0$ are much smaller (in modulus) than $\lambda_1$, then the values of $\lambda_1^n$ and $\tau_n$ will be practically the same (especially for large $n$), and the two approaches will result in practically the same approximation.
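For the success-run example that follows, the ratio-based estimation of $\theta$ and the comparison between $\theta\tau_n$ and $\theta\lambda_1^n$ can be sketched as below. The imbedding used (states $0,\ldots,k-1$ recording the current run length, with a failure resetting the run) is one standard construction and is assumed here only as a stand-in for the matrix (8); the parameters $k = 5$, $p = 0.75$ are an arbitrary illustrative choice:

```python
import numpy as np

def run_matrix(k, p):
    """Substochastic matrix for the waiting time of the first success run of
    length k: states 0..k-1 hold the current run length (one standard
    imbedding, assumed here as a stand-in for the matrix (8))."""
    q = 1.0 - p
    K0 = np.zeros((k, k))
    K0[:, 0] = q                      # a failure resets the run
    for i in range(k - 1):
        K0[i, i + 1] = p              # a success extends the run
    return K0

k, p = 5, 0.75
K0 = run_matrix(k, p)
pi0 = np.zeros(k); pi0[0] = 1.0
one = np.ones(k)

F    = lambda n: pi0 @ np.linalg.matrix_power(K0, n) @ one   # exact tails
tau  = lambda n: np.trace(np.linalg.matrix_power(K0, n))     # tau_n
lam1 = max(np.linalg.eigvals(K0), key=abs).real              # PF eigenvalue

# Estimate theta from the ratio F_n/tau_n at a moderate n (the ratios
# stabilize geometrically fast here), then form both approximations of F_40.
theta = F(60) / tau(60)
approx_tau = theta * tau(40)          # F_n ~ theta * tau_n        (24)
approx_pf  = theta * lam1 ** 40       # F_n ~ theta * lambda_1^n
```

Since the subdominant eigenvalues are well inside the spectral circle for this matrix, both approximations land very close to the exact tail, mirroring the behaviour reported in Table 1.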
Let us consider the waiting time problem for the first occurrence of a success run of length $k \geq 1$ in a sequence of Bernoulli trials with success probability $p$. As illustrated in the discussion following Proposition 1, the tail probabilities $F_n$
are given by $F_n = \pi_0\mathbf{K}_0^n\mathbf{1}'$, where $\mathbf{K}_0$ is the substochastic matrix (8), which can easily be verified to be primitive. Moreover, $\tau_1 = \mathrm{tr}(\mathbf{K}_0) = q = 1-p$ and, using the coefficients (9) of the characteristic polynomial of $\mathbf{K}_0$, one can easily compute the values of $\tau_n$, $n \geq 1$ via the recurrence relations offered in Proposition 5. The exact values of the tail probabilities $F_n$, $n \geq 1$ can also be effortlessly derived through the recurrence (10) and the initial conditions (11). Fig. 1 depicts, for several choices of $k$ and $p$, how fast the sequence of ratios $F_n/\tau_n$, $n = 1,2,\ldots$ converges to a positive constant; this value can be used next as a reasonable estimate of the parameter $\theta$ appearing in Proposition 3. In Table 1 we present the exact values of $F_n$, the approximation achieved by (24) and the approximation deduced by $\theta\lambda_1^n$ (see, e.g. Fu et al., 2003; Boutsikas and Koutras, 2001) for $(k,p) = (25, 0.995), (30, 0.990)$ and several choices of $n$. The boldfaced entries indicate the best approximation for each value of $n$. It is clear that, in most cases, the new approximation ($\theta\tau_n$) outperforms the one that uses the Perron–Frobenius eigenvalue, although the difference between them is not large. Let us next consider the case of the waiting time $T = T_r$ for the $r$-th non-overlapping occurrence of a success run of length $k \geq 1$ in a sequence of Bernoulli trials with success probability $p$. Then the tail probabilities $F_n^{(r)} = P(T_r > n)$ may be expressed as
$$F_n^{(r)} = \pi_0(\mathbf{K}_0^*)^n\mathbf{1}',$$
where $\mathbf{K}_0^*$ is an $rk \times rk$ blocked matrix of the form (14) with the main diagonal blocks $\mathbf{K}_0$ given by (8) and the $k \times 1$ column vector $\mathbf{c}_0'$ having its only non-zero entry, equal to $1-p$, at its last coordinate. As already stated in the proof of Proposition 2, $\det(\mathbf{I}-w\mathbf{K}_0^*) = (\det(\mathbf{I}-w\mathbf{K}_0))^r$ and, since $\mathbf{K}_0$ has only one positive eigenvalue $\lambda_1$ on its spectral circle (recall that the matrix $\mathbf{K}_0$ given by (8) is primitive), the same will be true for $\mathbf{K}_0^*$ as well. However, for $r > 1$, the multiplicity of the eigenvalue $\lambda_1$ of $\mathbf{K}_0^*$ equals $\upsilon_1 = r > 1$ and the Perron–Frobenius theory cannot be applied.

Fig. 1. The sequence $F_n/\tau_n$, for the waiting time for the first success run of length $k$ in a sequence of Bernoulli trials (left panel: $k = 25$, $p = 0.995$; right panel: $k = 30$, $p = 0.990$).

Table 1
Exact and approximate values for the tail probabilities $F_n$ of the waiting time for the first success run of length $k$ in a sequence of Bernoulli trials.

$k = 25$, $p = 0.995$, $\theta = 8.1538$:

  $n$    $\tau_n$                  $F_n$       $\theta\tau_n$   $\theta\lambda_1^n$
  30     0.007370                  0.095724    0.060096         0.168060
  40     0.005534                  0.051613    0.045129         0.044068
  45     0.003835                  0.029558    0.031274         0.022566
  65     0.000218                  0.001752    0.001774         0.001552
  75     0.000044                  0.000351    0.000355         0.000407
  85     0.000014                  0.000113    0.000112         0.000107
  95     $3.38924\times 10^{-6}$   0.0000268   0.0000276        0.0000277

$k = 30$, $p = 0.990$, $\theta = 4.2491$:

  $n$    $\tau_n$    $F_n$      $\theta\tau_n$   $\theta\lambda_1^n$
  35     0.037949    0.223315   0.161249         0.313317
  45     0.031920    0.149345   0.135634         0.145529
  50     0.026430    0.112360   0.112303         0.099182
  70     0.004871    0.021597   0.020699         0.021398
  80     0.002496    0.010277   0.010605         0.009939
  90     0.001022    0.004428   0.004344         0.004616
  100    0.000513    0.002180   0.002181         0.002144

Despite that, according to the discussion following Proposition 4, we can now
use the asymptotic result (27), because in this case we have $|\lambda_1| > |\lambda_2|$. As a consequence, the approximating formula (30) remains valid and may be exploited to compute $F_n$ for large values of $n$. Table 2 displays the approximations derived by the aid of (30) for the tail probabilities of the waiting time for the third occurrence ($r = 3$) of a non-overlapping success run of length 5 ($k = 5$) in a sequence of Bernoulli trials with success probability $p = 0.75$. The entries corresponding to the case $n = n_0$ are the exact tail probabilities $F_n$ (they have been deduced by an alternative exact computational scheme). Evidently, the approximate values depicted in Table 2 increase in quality as $n_0$ approaches $n$.

As a final example, let us consider a classical occupancy problem with $k$ urns and an unlimited supply of balls. Balls are randomly distributed in the urns by inserting one ball at a time in one of the $k$ urns. Denote by $T$ the waiting time (number of balls to be used) until all urns are occupied by at least one ball, i.e. until all $k$ urns are non-empty. As indicated by Fu and Lou (2003), the system can be analysed by a Markov chain with states $0, 1, \ldots, k$, where state $i$ designates that exactly $i$ urns are occupied. Considering state $k$ as an absorbing state, it is not difficult to verify that $F_n = P(T > n)$, $n \geq 1$ can be evaluated by the aid of formula (5) with the $k \times k$ matrix $\mathbf{K}_0$ given by (see Fu and Lou, 2003)
$$\mathbf{K}_0 = \begin{pmatrix}
0 & 1 & & & & \\
 & \frac{1}{k} & \frac{k-1}{k} & & & \\
 & & \ddots & \ddots & & \\
 & & & \frac{i}{k} & \frac{k-i}{k} & \\
 & & & & \ddots & \ddots \\
 & & & & & \frac{k-1}{k}
\end{pmatrix}.$$
It is evident that the spectrum of $\mathbf{K}_0$ is
$$\sigma(\mathbf{K}_0) = \left\{0, \frac{1}{k}, \ldots, \frac{k-1}{k}\right\}$$
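The occupancy chain above is easy to validate numerically: since $\mathbf{K}_0$ is upper triangular, its spectrum can be read off the diagonal, and the tail $\pi_0\mathbf{K}_0^n\mathbf{1}'$ can be cross-checked against the classical inclusion–exclusion (coupon collector) expression $P(T > n) = \sum_{i=1}^{k-1}(-1)^{i+1}\binom{k}{i}((k-i)/k)^n$, which is standard and used here only as an independent check. A minimal sketch (the value $k = 6$ is an arbitrary choice):

```python
import numpy as np
from math import comb

def occupancy_K0(k):
    """Substochastic matrix on the transient states 0,...,k-1 (number of
    occupied urns): a new ball hits an already occupied urn with
    probability i/k and a new urn with probability (k-i)/k."""
    K0 = np.zeros((k, k))
    for i in range(k):
        K0[i, i] = i / k
        if i + 1 < k:
            K0[i, i + 1] = (k - i) / k
    return K0

k = 6
K0 = occupancy_K0(k)
pi0 = np.zeros(k); pi0[0] = 1.0
one = np.ones(k)

spectrum = sorted(np.linalg.eigvals(K0).real)   # expected: 0, 1/k, ..., (k-1)/k

def tail_matrix(n):
    return pi0 @ np.linalg.matrix_power(K0, n) @ one

def tail_coupon(n):
    # classical inclusion-exclusion for P(T > n) in the coupon collector problem
    return sum((-1) ** (i + 1) * comb(k, i) * ((k - i) / k) ** n
               for i in range(1, k))
```

Both routes give identical tails; in particular $F_n = 1$ for $n < k$, as it should.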
Table 2
The tail probabilities $F_n$ as computed by the approximation $F_n \approx (\tau_n/\tau_{n_0})F_{n_0}$.

$k = 5$, $r = 3$, $p = 0.75$ (the entries with $n = n_0$ are the exact tail probabilities):

  $n_0$    $n=40$     $n=50$     $n=60$     $n=70$     $n=80$     $n=90$     $n=100$
  20       0.985399   0.978179   0.971012   0.963898   0.956835   0.949825   0.942865
  25       0.989029   0.981782   0.974589   0.967448   0.960359   0.953323   0.946338
  35       0.996327   0.989027   0.981780   0.974587   0.967446   0.960357   0.953321
  40       0.999995   0.992668   0.985395   0.978175   0.971008   0.963894   0.956831
  50                  0.999991   0.992664   0.985391   0.978171   0.971004   0.96389
  60                             0.999985   0.992658   0.985385   0.978165   0.970998
  70                                        0.999976   0.992650   0.985376   0.978157
  80                                                   0.999965   0.992638   0.985365
  90                                                              0.999951   0.992624
  100                                                                        0.999934
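As a sketch of how approximation (30) operates in the $r$-th occurrence setting, the code below builds an $rk \times rk$ imbedding of our own (block $j$ records the current run length after $j$ completed runs); it is only a hypothetical stand-in for the matrix (14), and taking the run-forming symbol to occur with probability $a = 0.25$ per trial is our reading of the configuration behind Table 2. The point illustrated is self-contained: the approximation $(\tau_n/\tau_{n_0})F_{n_0}$ is compared against the exact tail computed from the same matrix.

```python
import numpy as np

def blocked_matrix(k, r, a):
    """One concrete rk x rk substochastic matrix for the waiting time of the
    r-th non-overlapping run of length k of a symbol with per-trial
    probability a (a hypothetical stand-in for the matrix (14)): completing
    a run moves the chain to the next block, or to absorption."""
    K = np.zeros((r * k, r * k))
    for j in range(r):
        for i in range(k):
            K[j * k + i, j * k] = 1.0 - a          # run broken: back to length 0
            if i + 1 < k:
                K[j * k + i, j * k + i + 1] = a    # run extended
            elif j + 1 < r:
                K[j * k + i, (j + 1) * k] = a      # run completed: next block
    return K

k, r, a = 5, 3, 0.25
Ks = blocked_matrix(k, r, a)
pi0 = np.zeros(r * k); pi0[0] = 1.0
one = np.ones(r * k)

F   = lambda n: pi0 @ np.linalg.matrix_power(Ks, n) @ one
tau = lambda n: np.trace(np.linalg.matrix_power(Ks, n))

# Approximation (30): F_n ~ (tau_n / tau_{n0}) * F_{n0}.
n, n0 = 100, 80
approx = tau(n) / tau(n0) * F(n0)
exact  = F(n)
```

Because here $\upsilon_1 = r > 1$, (30) omits the polynomial factor $\theta_n/\theta_{n_0}$, so it slightly underestimates the tail, with the error shrinking as $n_0$ approaches $n$.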
Fig. 2. Exact (continuous line) and approximate (dotted line) values for $F_n$ in a classical occupancy problem (left panel: $k = 10$; right panel: $k = 20$).
and $\mathbf{K}_0$ is neither a primitive nor an irreducible matrix. Nevertheless, since $\rho(\mathbf{K}_0) = (k-1)/k = \lambda_1 < 1$, $\upsilon_1 = 1$ and $\lambda_1$ is the only eigenvalue of $\mathbf{K}_0$ on its spectral circle, we can apply Proposition 3 to obtain the asymptotic result
$$F_n = P(T > n) = \theta\tau_n + O(|\lambda_2|^n n^{\upsilon_2-1}) = \theta\sum_{i=1}^{k-1}\left(\frac{i}{k}\right)^n + O\left(\left(\frac{k-2}{k}\right)^n\right).$$
In Fig. 2 the approximate values $\theta\tau_n = \theta\sum_{i=1}^{k-1}(i/k)^n$ have been sketched along with the exact tail probabilities $F_n$ for $k = 10, 20$. It is evident that, as $n$ increases, the quality of the approximation improves and, as $n \to \infty$, the approximate value converges to the exact tail probability $F_n$.

References

Aki, S., 1985. Discrete distributions of order k on a binary sequence. Annals of the Institute of Statistical Mathematics 37, 205–224.
Aki, S., Hirano, K., 1993. Discrete distributions related to succession events in a two-state Markov chain. In: Matusita, K., Puri, M.L., Hayakawa, T. (Eds.), Statistical Science and Data Analysis. VSP International Science Publishers, Zeist, pp. 467–474.
Balakrishnan, N., Koutras, M.V., 2002. Runs and Scans with Applications. John Wiley and Sons, New York.
Balakrishnan, N., Balasubramanian, K., Viveros, R., 1993. On sampling inspection plans based on the theory of runs. The Mathematical Scientist 18, 113–126.
Balakrishnan, N., Balasubramanian, K., Viveros, R., 1995. Start-up demonstration tests under correlation and corrective action. Naval Research Logistics 42, 1271–1276.
Balasubramanian, K., Viveros, R., Balakrishnan, N., 1993. Sooner and later waiting time problems for Markovian Bernoulli trials. Statistics and Probability Letters 18, 153–161.
Bogartz, R., 1965. The criterion method: some analyses and remarks. Psychological Bulletin 64, 1–14.
Boutsikas, M.V., Koutras, M.V., 2001. Reliability approximation for Markov chain imbeddable systems. Methodology and Computing in Applied Probability 2, 393–411.
Chao, M.T., Fu, J.C., 1991. The reliability of large series systems under Markov structure. Advances in Applied Probability 23, 894–908.
Chao, M.T., Fu, J.C., Koutras, M.V., 1995. A survey of the reliability studies of consecutive-k-out-of-n: F systems and its related systems. IEEE Transactions on Reliability 44, 120–127.
Dafnis, S.D., Antzoulakos, D.L., Philippou, A.N., 2010. Distributions related to $(k_1,k_2)$ events. Journal of Statistical Planning and Inference 140, 1691–1700.
De Moivre, A., 1756. The Doctrine of Chance, 3rd ed. Chelsea Publishing Co, New York.
Doi, M., Yamamoto, E., 1998. On the joint distribution of runs in a sequence of multi-state trials. Statistics and Probability Letters 39, 133–141.
Ebneshahrashoob, M., Sobel, M., 1990. Sooner and later waiting time problems for Bernoulli trials: frequency and run quotas. Statistics and Probability Letters 9, 5–11.
Fu, J.C., 1996. Distribution theory of runs and patterns associated with a sequence of multistate trials. Statistica Sinica 6, 957–964.
Fu, J.C., Johnson, B.C., 2009. Approximate probabilities for runs and patterns in i.i.d. and Markov-dependent multistate trials. Advances in Applied Probability 41, 292–308.
Fu, J.C., Koutras, M.V., 1994. Distribution theory of runs: a Markov chain approach. Journal of the American Statistical Association 89, 1050–1058.
Fu, J.C., Lou, W.Y.W., 2003. Distribution Theory of Runs and Patterns and its Applications: A Finite Markov Chain Imbedding Approach. World Scientific, Singapore.
Fu, J.C., Wang, L., Lou, W.Y.W., 2003. On exact and large deviation approximation for the distribution of the longest run in a sequence of two-state Markov dependent trials. Journal of Applied Probability 40, 346–360.
Glaz, J., Naus, J., Wallenstein, S., 2001. Scan Statistics. Springer Verlag, New York.
Godbole, A.P., 1991. Poisson approximations for runs and patterns of rare events. Advances in Applied Probability 23, 851–865.
Godbole, A.P., Schaffner, A.A., 1993. Improved Poisson approximations for word patterns. Advances in Applied Probability 25, 334–347.
Grant, D., 1946. New statistical criteria for learning and problem solution in experiments involving repeated trials. Psychological Bulletin 43, 272–282.
Han, Q., Aki, S., 1999. Joint distributions of runs in a sequence of multi-state trials. Annals of the Institute of Statistical Mathematics 51, 419–447.
Hirano, K., Aki, S., 1993. On number of occurrences of success runs of specified length in a two-state Markov chain. Statistica Sinica 3, 313–320.
Koutras, M.V., 1996a. On a waiting time distribution in a sequence of Bernoulli trials. Annals of the Institute of Statistical Mathematics 48, 789–806.
Koutras, M.V., 1996b. On a Markov chain approach for the study of reliability structures. Journal of Applied Probability 33, 357–367.
Koutras, M.V., 1997. Waiting times and number of appearances of events in a sequence of discrete random variables. In: Balakrishnan, N. (Ed.), Advances in Combinatorial Methods and Applications to Probability and Statistics. Birkhäuser, Boston, pp. 363–384.
Koutras, M.V., Alexandrou, V.A., 1997. Sooner waiting time problems in a sequence of trinary trials. Journal of Applied Probability 34, 593–609.
Koutras, M.V., Alexandrou, V.A., 1995. Runs, scans and urn model distributions: a unified Markov chain approach. Annals of the Institute of Statistical Mathematics 47, 743–766.
Makri, F.S., Philippou, A.N., Psillakis, Z.M., 2007. Shortest and longest length of success runs in binary sequences. Journal of Statistical Planning and Inference 137, 2226–2239.
Meyer, C., 2000. Matrix Analysis and Applied Linear Algebra. SIAM, Philadelphia.
Mohanty, S.G., 1994. Success runs of length-k in Markov dependent trials. Annals of the Institute of Statistical Mathematics 46, 777–796.
Philippou, A.N., 1988. On multiparameter distributions of order k. Annals of the Institute of Statistical Mathematics 40, 467–475.
Philippou, A.N., Makri, F.S., 1986. Successes, runs and longest runs. Statistics and Probability Letters 4, 101–105.
Pielou, E.C., 1963a. Runs of healthy and diseased trees in transects through an infected forest. Biometrics 19, 603–614.
Pielou, E.C., 1963b. The distribution of the diseased trees with respect to healthy ones in a patchily infected forest. Biometrics 19, 450–459.
Pielou, E.C., 1977. Mathematical Ecology. John Wiley and Sons, New York.
Seneta, E., 1981. Non-negative Matrices and Markov Chains, 2nd ed. Springer, New York.
Schwager, S., 1983. Run probabilities in sequences of Markov dependent trials. Journal of the American Statistical Association 78, 168–175.
Stefanov, V.T., 2011. Compound patterns and generating functions: from basic waiting times to counts of occurrence. Journal of Statistical Planning and Inference 141, 2298–2302.
Taylor, H.M., Karlin, S., 1984. An Introduction to Stochastic Modeling. Academic Press, San Diego.
Viveros, R., Balakrishnan, N., 1993. Statistical inference from start-up demonstration test data. Journal of Quality and Technology 22, 119–130.
Wolfowitz, J., 1943. On the theory of runs with some applications to quality control. Annals of Mathematical Statistics 14, 280–288.