Available online at www.sciencedirect.com
Mathematics and Computers in Simulation 79 (2009) 2293–2301
Joint distribution of the number of vertices with given different outdegrees in Galton–Watson forest

Tatiana Mylläri
Åbo Akademi University, Åbo, Finland

Available online 21 November 2008
Abstract

A Galton–Watson forest consisting of $N$ roots (or trees) and $n$ non-root vertices is considered. We study limit distributions of the number of vertices of a given outdegree in such a forest. In this work the joint distribution of the numbers of vertices with two given different outdegrees $r_1$ and $r_2$ is considered. A local limit theorem for this characteristic is proved.

© 2008 IMACS. Published by Elsevier B.V. All rights reserved.

Keywords: Local limit theorem; Galton–Watson forest; Outdegree of vertices
E-mail address: [email protected]. doi:10.1016/j.matcom.2008.11.003

1. Introduction

The mathematical concepts of trees and forests are used both for modeling and analyzing various phenomena and for developing mathematical and statistical methods. Examples include algorithm analysis, electrical network modeling, methods in applied statistics, random equation theory, random mapping theory, and the theory of super-Brownian motion (a measure-valued process).

There are different ways to construct random forests. One is to use classical graph theory, where a tree is a connected graph without cycles and a forest is a graph without cycles. Another is to use branching processes. The idea of using the theory of branching processes to study random forests is based on the intuitive image of a tree as a realization of a branching process. In this context, Galton–Watson branching processes are the most natural to use.

The Galton–Watson process is a stochastic process arising from Francis Galton's investigation of the extinction of surnames. Sir Francis Galton formulated the following problem (The Educational Times, 1873): what is the probability of aristocratic surnames becoming extinct? Reverend Henry William Watson found a solution to this problem. The solution was published in a joint paper entitled "On the probability of extinction of families" (1874). Assume that surnames are passed on to all male children by their father. Suppose the number of a man's sons is a random variable $\xi$ distributed on the set $\{0, 1, 2, 3, \ldots\}$, and that the numbers of different men's sons are independent random variables with the same distribution. If the expectation $E\xi \le 1$ then the surname will surely die out (excluding the degenerate case in which every man has exactly one son), while if $E\xi > 1$ then there is a positive probability that it will survive forever.

We define the classical Galton–Watson process following Karlin and Taylor [2]. Let $\xi_1^{(t)}, \xi_2^{(t)}, \ldots$, $t = 1, 2, \ldots$, be independent identically distributed random variables with the probability distribution (the so-called offspring distribution)

\[ P\{\xi_i^{(t)} = k\} = p_k, \quad i = 1, 2, \ldots, \quad k = 0, 1, 2, \ldots, \tag{1} \]

and $\sum_{k=0}^{\infty} p_k = 1$. The Galton–Watson branching process starting with $N$ particles is a family of random variables $\{\mu(t) : t = 0, 1, 2, \ldots\}$ defined by $\mu(0) = N$ and

\[ \mu(t) = \xi_1^{(t)} + \ldots + \xi_{\mu(t-1)}^{(t)}. \]
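The recursive definition above can be tried out numerically. The following is a minimal sketch, not taken from the paper: the function name `simulate_generations`, the dictionary representation of the offspring distribution, and the cap `max_generations` are all assumptions made for illustration.

```python
import random

def simulate_generations(offspring_pmf, n_roots, max_generations, rng):
    """Return [mu(0), mu(1), ...] for a Galton-Watson process started
    from n_roots particles; offspring_pmf maps k to p_k (hypothetical format)."""
    values = list(offspring_pmf.keys())
    weights = list(offspring_pmf.values())
    sizes = [n_roots]
    for _ in range(max_generations):
        current = sizes[-1]
        if current == 0:
            break  # extinct: every later generation is empty
        # mu(t) = xi_1 + ... + xi_{mu(t-1)} with i.i.d. offspring counts
        sizes.append(sum(rng.choices(values, weights=weights, k=current)))
    return sizes

if __name__ == "__main__":
    rng = random.Random(2008)
    pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # hypothetical offspring law with mean 1
    sizes = simulate_generations(pmf, n_roots=5, max_generations=50, rng=rng)
    print("generation sizes:", sizes)
    print("offspring produced so far:", sum(sizes[1:]))
```

If the run ends in extinction before the cap is reached, the sum of all generation sizes after the first is the total number of offspring of the realization.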
The random variable $\mu(t)$ is interpreted as the number of particles in the $t$-th generation of the process. If $\mu(t) = 0$ for some $t$ we define $\mu(t + 1) = 0$. Denote by $n$ the total number of offspring. The set of all realizations of a Galton–Watson process, with the probability measure induced by this process in a natural way, is called a Galton–Watson forest with $N$ trees and $n$ non-root vertices. If $N = 1$, then we have a Galton–Watson tree. For the general concept of a random forest we refer the reader to [7].

Many different characteristics of a Galton–Watson forest have been studied earlier by different researchers, e.g., Yu. Pavlov, M. Drmota, B. Gittenberger. This paper can be considered a continuation of the earlier work [6], which studied the distribution of the number of vertices with a given outdegree. The outdegree of a vertex is the number of branches emanating from this vertex. Local limit theorems for this characteristic were proved earlier. It was shown that the limiting distribution is normal with parameters depending on how $N$ and $n$ approach infinity. In this paper, we consider the joint distribution of the numbers of vertices with two given different outdegrees $r_1$, $r_2$. We refer also to [1], where M. Drmota and B. Gittenberger consider the distribution of nodes of given degree for some classes of random trees, such as unrooted unlabeled trees, plane trees, and labeled trees.

Computer algebra systems such as Mathematica or Maple are very useful in such studies, as they help to deal with the bulky calculations. In this work we mainly used Mathematica, using Maple only occasionally to duplicate calculations in some very tedious cases.

2. Joint distribution of the number of vertices with the given different outdegrees

Let $\xi_j$, $j = 1, 2, \ldots$, be a sequence of independent identically distributed (i.i.d.) non-negative integer-valued random variables with probability distribution

\[ P\{\xi_j = k\} = p_k. \tag{2} \]

Assume that $p_0 \ne 0$. Consider the vector $r = (r_1, r_2)$, where $r_1$, $r_2$ are integers such that $0 < r_1 < r_2$. Introduce a random variable $\xi_j^{(r)}$ with the distribution

\[ P\{\xi_j^{(r)} = k\} = P\{\xi_j = k \mid \xi_j \ne r_1, \ \xi_j \ne r_2\}, \quad k = 0, 1, 2, \ldots \tag{3} \]
Let

\[ S_n = \xi_1 + \xi_2 + \ldots + \xi_n, \qquad S_n^{(r)} = \xi_1^{(r)} + \xi_2^{(r)} + \ldots + \xi_n^{(r)}. \]

Denote by $\nu(N, n; r)$ the pair of numbers of vertices with outdegrees $r = (r_1, r_2)$ in a Galton–Watson forest with $N$ trees and $n$ non-root vertices. Let $k = (k_1, k_2)$, $k_1 + k_2 \le N + n$. Now we have

Theorem 1. The joint distribution of the number of vertices with outdegrees $r_1$, $r_2$ in the Galton–Watson forest with $N$ trees and $n$ non-root vertices is given by

\[ P\{\nu(N, n; r) = k\} = \binom{N+n}{k_1} p_{r_1}^{k_1} (1 - p_{r_1})^{N+n-k_1} \binom{N+n-k_1}{k_2} p_{r_2}^{k_2} (1 - p_{r_2})^{N+n-k_1-k_2} \times \frac{P\{S^{(r)}_{N+n-k_1-k_2} = n - k_1 r_1 - k_2 r_2\}}{P\{S_{N+n} = n\}}. \]
This theorem could be proved using the generalized allocation scheme (see [3]).
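To get a feeling for the random pair studied in Theorem 1, one can sample small forests by brute force. The sketch below is only an illustration under stated assumptions: it grows $N$ independent Galton–Watson trees vertex by vertex and keeps a realization only if it has exactly $n$ non-root vertices (simple rejection sampling). The helper names `sample_forest_outdegrees` and `outdegree_counts`, the offspring law, and the `max_tries` cap are hypothetical and are not part of the paper; the closed form of Theorem 1 is not used.

```python
import random
from collections import Counter

def sample_forest_outdegrees(offspring_pmf, n_trees, n_nonroot, rng, max_tries=100_000):
    """Rejection-sample the outdegrees of a forest with n_trees roots and exactly
    n_nonroot non-root vertices; return None if no attempt is accepted."""
    values = list(offspring_pmf.keys())
    weights = list(offspring_pmf.values())
    for _ in range(max_tries):
        outdegrees = []
        pending = n_trees   # vertices whose children have not been drawn yet
        children = 0        # non-root vertices created so far
        while pending > 0 and children <= n_nonroot:
            k = rng.choices(values, weights=weights)[0]
            outdegrees.append(k)
            pending += k - 1
            children += k
        if pending == 0 and children == n_nonroot:
            return outdegrees
    return None

def outdegree_counts(outdegrees, r1, r2):
    """Numbers of vertices with outdegree r1 and with outdegree r2."""
    counts = Counter(outdegrees)
    return counts[r1], counts[r2]

if __name__ == "__main__":
    rng = random.Random(7)
    pmf = {0: 0.5, 1: 0.25, 2: 0.25}   # hypothetical offspring law
    tally = Counter()
    for _ in range(2000):
        seq = sample_forest_outdegrees(pmf, n_trees=3, n_nonroot=4, rng=rng)
        tally[outdegree_counts(seq, 1, 2)] += 1
    for pair, freq in sorted(tally.items()):
        print(pair, freq / 2000)
```

Every accepted sequence has $N + n$ entries summing to $n$, which is the constraint that makes the two counts dependent.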
3. Local limit theorem for the number of vertices with two given different outdegrees

Let $\xi$ be a random variable with probability distribution (2) and generating function

\[ F(z) = \sum_{k=0}^{\infty} p_k z^k. \tag{4} \]

We assume that $p_0 \ne 0$ and that there exist $0 < i_1 < i_2 < i_3$ with

\[ p_{i_1} \ne 0, \quad p_{i_2} \ne 0, \quad p_{i_3} \ne 0. \tag{5} \]

Further, it is assumed that the equation $zF'(z) = F(z)$ has a solution $c > 0$ satisfying $F'(c) < \infty$ and $F''(c) < \infty$. Let $\lambda > 0$ be such that $F(\lambda) < \infty$ and introduce the random variable $\zeta$ with the distribution

\[ P\{\zeta = k\} = p_k(\lambda) = \frac{\lambda^k p_k}{F(\lambda)}, \quad k = 0, 1, 2, \ldots \tag{6} \]
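The transformation (6) is easy to evaluate for an offspring distribution with finite support. The following sketch, with the hypothetical helper names `tilted_pmf`, `mean` and `variance` and an example distribution chosen only for illustration, computes $p_k(\lambda)$ and checks numerically that the mean of $\zeta$ equals $\lambda F'(\lambda)/F(\lambda)$.

```python
def tilted_pmf(p, lam):
    """Distribution (6): p_k(lam) = lam^k p_k / F(lam), where p[k] = p_k
    is a finite list (an assumption made for this sketch)."""
    weights = [pk * lam ** k for k, pk in enumerate(p)]
    total = sum(weights)  # this is F(lam)
    return [w / total for w in weights]

def mean(pmf):
    return sum(k * q for k, q in enumerate(pmf))

def variance(pmf):
    m = mean(pmf)
    return sum((k - m) ** 2 * q for k, q in enumerate(pmf))

if __name__ == "__main__":
    p = [0.25, 0.5, 0.25]   # hypothetical offspring law with mean 1
    lam = 0.5
    zeta = tilted_pmf(p, lam)
    F = sum(pk * lam ** k for k, pk in enumerate(p))
    dF = sum(k * pk * lam ** (k - 1) for k, pk in enumerate(p) if k > 0)
    print("p_k(lambda):", zeta)
    print("mean of zeta:", mean(zeta), " lambda F'(lambda)/F(lambda):", lam * dF / F)
    print("variance of zeta:", variance(zeta))
```

For $\lambda = 1$ the function returns the original distribution, so the family (6) contains the offspring law itself.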
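We can suppose, without loss of generality, that the distribution of the number of offspring in the underlying Galton–Watson process is of the form (6). Moreover, we can assume that the mean of $\xi$ equals 1. Assume also that the variance of $\xi$ exists and equals $V$. Let $0 < r_1 < r_2$ and $r = (r_1, r_2)$. For later use we define a random variable $\zeta^{(r)}$ with the distribution

\[ P\{\zeta^{(r)} = k\} = P\{\zeta = k \mid \zeta \ne r_1, \ \zeta \ne r_2\}, \quad k = 0, 1, 2, \ldots \tag{7} \]

The random variable $\zeta$ has a lattice distribution, and we let $d$ denote its span. The span of the distribution of $\zeta^{(r)}$ is denoted by $d_r$. It is easy to see that $d_r$ is divisible by $d$. Consider the distribution of $\zeta$, and let

\[ j := \min\{k > 0 : p_k > 0\}, \quad l_1 := \min\{k > 0 : p_{j+k} > 0\}, \quad l_2 := \min\{k > l_1 : p_{j+k} > 0\}. \]

Define

\[ t_1 := \inf\{k \ge 0 : p_k^{(r)} > 0\} \quad \text{and} \quad t_2 := \inf\{k > t_1 : p_k^{(r)} > 0\}, \]

where $p_k^{(r)}$ denotes the probability that $\zeta^{(r)} = k$. Concrete values of $t_1$ and $t_2$ for different values of $r_1$ and $r_2$ are given in Table 1.

Let $w$ be the least non-negative integer such that $j + w$ determines the span $d$ of the distribution of $\zeta$, i.e., $j + w$ is the least positive integer such that the conditional distribution $P\{\zeta = k \mid \zeta \le j + w\}$ has the span $d$. Let $v_r$ be the least non-negative integer such that $j + v_r$ determines the span $d_r$ of the distribution of $\zeta^{(r)}$. Denote

\[ \sigma_i^2 = p_{r_i}(\lambda) \left( 1 - p_{r_i}(\lambda) - \frac{(m - r_i)^2 p_{r_i}(\lambda)}{\sigma^2} \right), \quad i = 1, 2, \]

where $m$ and $\sigma^2$ are the expectation and the variance of $\zeta$ (see Section 4). Let

\[ r^2 = \frac{p_{r_1}(\lambda)(1 + o(1))}{1 - 2p_{r_1}(\lambda) - 2p_{r_2}(\lambda) - (m - r_1)^2 p_{r_1}(\lambda)/\sigma^2 - (m - r_2)^2 p_{r_2}(\lambda)/\sigma^2}. \]

Here the scalar $r$ plays the role of the correlation parameter of the limiting normal law and should not be confused with the vector $r = (r_1, r_2)$.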
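The lattice quantities $d$, $d_r$, $t_1$ and $t_2$ introduced above can be computed directly for any concrete distribution with finite support. The sketch below is illustrative only: the function names and the example distribution are assumptions, and the span is computed as the greatest common divisor of the differences between support points.

```python
from functools import reduce
from math import gcd

def support(pmf):
    """Indices k with pmf[k] > 0."""
    return [k for k, q in enumerate(pmf) if q > 0]

def span(pmf):
    """Largest d such that all atoms lie on one lattice a + d*Z
    (returns 0 for a point mass)."""
    atoms = support(pmf)
    return reduce(gcd, (k - atoms[0] for k in atoms[1:]), 0)

def exclude_outdegrees(pmf, r1, r2):
    """Law of zeta conditioned on zeta != r1 and zeta != r2, as in (7);
    assumes r1, r2 < len(pmf)."""
    mass = 1.0 - pmf[r1] - pmf[r2]
    return [0.0 if k in (r1, r2) else q / mass for k, q in enumerate(pmf)]

def first_two_atoms(pmf):
    """(t1, t2): the two smallest points of the support."""
    atoms = support(pmf)
    return atoms[0], atoms[1]

if __name__ == "__main__":
    zeta = [0.3, 0.0, 0.3, 0.0, 0.2, 0.0, 0.2]   # hypothetical law on {0, 2, 4, 6}
    zeta_r = exclude_outdegrees(zeta, 2, 6)       # remove outdegrees r1 = 2, r2 = 6
    d, d_r = span(zeta), span(zeta_r)
    print("d =", d, " d_r =", d_r, " d divides d_r:", d_r % d == 0)
    print("(t1, t2) =", first_two_atoms(zeta_r))
```

In this example removing two support points makes the lattice coarser, which illustrates why $d_r$ can be a proper multiple of $d$.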
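Table 1
Values of $t_1$ and $t_2$ for different combinations of $r_1$ and $r_2$ and corresponding estimates of $m(r)$ and $\sigma^2(r)$.

\begin{tabular}{llllll}
\hline
$r_1$ & $r_2$ & $t_1$ & $t_2$ & $m(r)$ & $\sigma^2(r)$ \\
\hline
$r_1 = 0$ & $r_2 = j$ & $l_1$ & $l_2$ & $l_1 + \frac{l_2 p_{l_2}}{p_{l_1}} \lambda^{l_2 - l_1} + o(\lambda^{l_2 - l_1})$ & $\frac{(l_2 - l_1)^2 p_{l_2}}{p_{l_1}} \lambda^{l_2 - l_1} + o(\lambda^{l_2 - l_1})$ \\
$r_1 = 0$ & $r_2 = l_1$ & $j$ & $l_2$ & $j + \frac{l_2 p_{l_2}}{p_j} \lambda^{l_2 - j} + o(\lambda^{l_2 - j})$ & $\frac{(l_2 - j)^2 p_{l_2}}{p_j} \lambda^{l_2 - j} + o(\lambda^{l_2 - j})$ \\
$r_1 = 0$ & $r_2 \ge l_2$ & $j$ & $l_1$ & $j + \frac{l_1 p_{l_1}}{p_j} \lambda^{l_1 - j} + o(\lambda^{l_1 - j})$ & $\frac{(l_1 - j)^2 p_{l_1}}{p_j} \lambda^{l_1 - j} + o(\lambda^{l_1 - j})$ \\
$r_1 = j$ & $r_2 = l_1$ & $0$ & $l_2$ & $\frac{l_2 p_{l_2}}{p_0} \lambda^{l_2} + o(\lambda^{l_2})$ & $\frac{l_2^2 p_{l_2}}{p_0} \lambda^{l_2} + o(\lambda^{l_2})$ \\
$r_1 = j$ & $r_2 \ge l_2$ & $0$ & $l_1$ & $\frac{l_1 p_{l_1}}{p_0} \lambda^{l_1} + o(\lambda^{l_1})$ & $\frac{l_1^2 p_{l_1}}{p_0} \lambda^{l_1} + o(\lambda^{l_1})$ \\
$r_1 \ge l_1$ & $r_2 \ge l_2$ & $0$ & $j$ & $\frac{j p_j}{p_0} \lambda^{j} + o(\lambda^{j})$ & $\frac{j^2 p_j}{p_0} \lambda^{j} + o(\lambda^{j})$ \\
\hline
\end{tabular}

Theorem 2. Let $N, n \to \infty$ in such a way that $n$ takes values divisible by $d_r$ and $n/N^2 \to 0$. Let

\[ \lambda^{\max\{w, v_r, t_2 - t_1 - j\}} n \to \infty, \tag{8} \]

where $\lambda$ is determined by

\[ \frac{\lambda F'(\lambda)}{F(\lambda)} = \frac{n}{N+n}. \]

Then

\[ P\{\nu(N, n; r) = k d_r/d\} = \frac{d_r (1 + o(1))}{d \, 2\pi (N+n) \sigma_1 \sigma_2 \sqrt{1 - r^2}} \exp\left\{ -\frac{u_1^2 + u_2^2 - 2 r u_1 u_2}{2(1 - r^2)} \right\} \]

uniformly in the integers $k_1$ and $k_2$ such that

\[ u_1 = \frac{k_1 d_r/d - (N+n) p_{r_1}(\lambda)}{\sigma_1 \sqrt{N+n}} \quad \text{and} \quad u_2 = \frac{k_2 d_r/d - (N+n) p_{r_2}(\lambda)}{\sigma_2 \sqrt{N+n}} \]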
lie in any finite fixed intervals.

Remark. In Theorem 2 we assume that $p_{r_1}, p_{r_2} \ne 0$. If $p_{r_1} = 0$ or $p_{r_2} = 0$, then from Theorem 1 we have $P\{\nu(N, n; r) = k d_r/d\} = 0$.

4. Auxiliary lemmas

As can be seen from Theorem 1, to obtain the limit distribution of $\nu(N, n; r)$ we must study the limit distribution of sums of independent and identically distributed random variables. Assume that the offspring distribution of the underlying branching process is given by (6) with $0 < \lambda \le 1$. Recall that it is assumed that the mean of $\xi$ is 1. Then the distribution in Theorem 1 takes the form

\[ P\{\nu(N, n; r) = k\} = P\{\nu(N, n; r_1, r_2) = (k_1, k_2)\} = \binom{N+n}{k_1} p_{r_1}^{k_1}(\lambda) (1 - p_{r_1}(\lambda))^{N+n-k_1} \binom{N+n-k_1}{k_2} p_{r_2}^{k_2}(\lambda) (1 - p_{r_2}(\lambda))^{N+n-k_1-k_2} \times \frac{P\{S^{(r)}_{N+n-k_1-k_2} = n - k_1 r_1 - k_2 r_2\}}{P\{S_{N+n} = n\}}, \]
where

\[ S_n = \zeta_1 + \ldots + \zeta_n, \qquad S_n^{(r)} = \zeta_1^{(r)} + \ldots + \zeta_n^{(r)}, \]

and $\zeta_i$, $i = 1, 2, \ldots$, and $\zeta_i^{(r)}$, $i = 1, 2, \ldots$, are sequences of i.i.d. random variables distributed as $\zeta$ and $\zeta^{(r)}$, respectively. Let $m = m(\lambda)$ and $\sigma^2 = \sigma^2(\lambda)$ be the expectation and the variance of $\zeta$, respectively. These exist by (6). Introduce

\[ \sigma_*^2 = (p_{r_1}(\lambda) + p_{r_2}(\lambda)) \left( 1 - p_{r_1}(\lambda) - p_{r_2}(\lambda) - \frac{(m - r_1)^2 p_{r_1}(\lambda) + (m - r_2)^2 p_{r_2}(\lambda)}{\sigma^2} \right). \]

Let now $N$ and $n$ be given and take $\lambda$ such that

\[ \frac{\lambda F'(\lambda)}{F(\lambda)} = \frac{n}{N+n} \tag{9} \]

holds. To begin with, we recall the following two lemmas. Lemma 1 is Lemma 2.2.1 in Pavlov ([7], p. 41) and Lemma 2 is proved in Mylläri [5].

Lemma 1. Eq. (9) has a unique solution $\lambda = \lambda(n, N)$ such that $0 < \lambda < 1$, and as $N, n \to \infty$ the following assertions hold:
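(a) if $n/N \to 0$ and $j := \inf\{k > 0 : p_k(\lambda) > 0\} = \inf\{k > 0 : p_k > 0\}$ then

\[ \lambda^j = \frac{n}{N} \, \frac{p_0}{j p_j} (1 + o(1)), \]

(b) if $0 < C_1 \le n/N \le C_2 < \infty$ then $0 < C_3 \le \lambda \le C_4 < 1$,
(c) if $n/N \to \infty$ then $\lambda \to 1$.

Lemma 2. Let $N, n \to \infty$ so that $\lambda^w n \to \infty$ and $n/N^2 \to 0$, where $\lambda$ is determined by the relation (9) and $w$ is the least non-negative integer such that $j + w$ determines the span $d$ of the distribution of $\zeta$ (see Section 3). Then for a non-negative integer $h$ divisible by $d$

\[ P\{S_{N+n} = h\} = \frac{d(1 + o(1))}{\sigma \sqrt{2\pi(N+n)}} \exp\left\{ -\frac{(h - n)^2}{2\sigma^2 (N+n)} \right\} \]

uniformly in $h$ such that $(h - n)/(\sigma\sqrt{N+n})$ lies in any finite fixed interval.

Next, define

\[ m(r) := E\zeta^{(r)} = \frac{m - r_1 p_{r_1}(\lambda) - r_2 p_{r_2}(\lambda)}{1 - p_{r_1}(\lambda) - p_{r_2}(\lambda)} \tag{10} \]

and

\[ \sigma^2(r) := \operatorname{Var} \zeta^{(r)} = \frac{\sigma^2}{(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))^2} \left( 1 - p_{r_1}(\lambda) - p_{r_2}(\lambda) - \frac{(m - r_1)^2 p_{r_1}(\lambda) + (m - r_2)^2 p_{r_2}(\lambda)}{\sigma^2} \right). \tag{11} \]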
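Eq. (9), Lemma 1(a) and Lemma 2 can be explored numerically for a concrete offspring law. The following sketch is an illustration under assumptions, not the method used in the paper: it solves (9) by bisection (relying on the left-hand side being increasing in $\lambda$ and on the offspring mean being 1), computes the exact distribution of $S_{N+n}$ by repeated convolution, and compares it with the normal expression of Lemma 2 without the $1 + o(1)$ factor. All function names and the example distribution are hypothetical.

```python
import math

def tilted_pmf(p, lam):
    """Distribution (6) for a finite list p with p[k] = p_k."""
    weights = [pk * lam ** k for k, pk in enumerate(p)]
    total = sum(weights)  # this is F(lam)
    return [w / total for w in weights]

def tilted_mean(p, lam):
    """m(lam) = lam F'(lam) / F(lam), the left-hand side of Eq. (9)."""
    return sum(k * q for k, q in enumerate(tilted_pmf(p, lam)))

def solve_lambda(p, N, n, tol=1e-14):
    """Bisection for the root of Eq. (9) in (0, 1)."""
    target = n / (N + n)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tilted_mean(p, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sum_pmf(pmf, M):
    """Exact distribution of a sum of M i.i.d. variables (plain convolution)."""
    dist = [1.0]
    for _ in range(M):
        new = [0.0] * (len(dist) + len(pmf) - 1)
        for s, ps in enumerate(dist):
            for k, pk in enumerate(pmf):
                new[s + k] += ps * pk
        dist = new
    return dist

def lemma2_approx(h, N, n, sigma2, d=1):
    """Right-hand side of Lemma 2 without the (1 + o(1)) factor."""
    M = N + n
    return d / math.sqrt(2 * math.pi * M * sigma2) * math.exp(-(h - n) ** 2 / (2 * sigma2 * M))

if __name__ == "__main__":
    p = [0.25, 0.5, 0.25]   # hypothetical offspring law with mean 1 and span 1
    N, n = 100, 50
    lam = solve_lambda(p, N, n)
    zeta = tilted_pmf(p, lam)
    m = sum(k * q for k, q in enumerate(zeta))
    sigma2 = sum((k - m) ** 2 * q for k, q in enumerate(zeta))
    exact = sum_pmf(zeta, N + n)[n]
    print("lambda =", lam)
    print("exact P{S = n} =", exact, " Lemma 2 value =", lemma2_approx(n, N, n, sigma2))
```

For this particular offspring law Eq. (9) can also be solved by hand, giving $\lambda = n/(2N + n)$, which offers an independent check of the bisection; for $n/N \to 0$ this agrees with the leading term $n p_0/(N j p_j) = n/(2N)$ of Lemma 1(a).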
Let $\hat\psi(u)$ denote the characteristic function of $\zeta^{(r)} - m(r)$, and $\hat\varphi_K(u)$ the characteristic function of $(S_K^{(r)} - K m(r))/(\sigma(r)\sqrt{K})$. In the proofs below $C_1, C_2, \ldots$ stand for positive constants.

Lemma 3. Let $N, n \to \infty$ so that $n/N^2 \to 0$, and let $K = K(N, n)$ be an integer such that $K(N, n) = (N + n)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))(1 + o(1))$, where $\lambda$ is determined by the relation (9). Assume that $\lambda^{t_2 - t_1} K \to \infty$. Then

\[ \hat\varphi_K(u) \to e^{-u^2/2} \tag{12} \]

uniformly in $u$ in any finite fixed interval.

Proof. For sufficiently small $u$, we have the Taylor expansion

\[ \hat\psi(u) = 1 - u^2 \sigma^2(r)/2 + u^3 Q(u)/6, \tag{13} \]

where $|Q(u)| \le C_1 E|\zeta^{(r)} - m(r)|^3 < \infty$. Then for sufficiently large $K$ and any fixed $u$ we have

\[ \ln \hat\varphi_K(u) = K \ln\left( 1 - \frac{u^2}{2K} + \frac{u^3}{6\sigma^3(r)\sqrt{K^3}} \, Q\!\left( \frac{u}{\sigma(r)\sqrt{K}} \right) \right). \tag{14} \]

To prove (12) it is enough to show that

\[ \frac{1}{\sigma^3(r)\sqrt{K}} \, Q\!\left( \frac{u}{\sigma(r)\sqrt{K}} \right) = o(1). \tag{15} \]
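We consider first the case $n/N \to 0$. Using formulae (10) and (11) we get the following estimates for $m(r)$ and $\sigma^2(r)$:

\[ m(r) = t_1 + \frac{t_2 p_{t_2}}{p_{t_1}} \lambda^{t_2 - t_1} + o(\lambda^{t_2 - t_1}), \qquad \sigma^2(r) = \frac{(t_2 - t_1)^2 p_{t_2}}{p_{t_1}} \lambda^{t_2 - t_1} (1 + o(1)). \]

Estimates for $m(r)$ and $\sigma^2(r)$ for different values of $r_1$ and $r_2$ are given in Table 1. From

\[ E|\zeta^{(r)} - m(r)|^3 = \sum_{i \ne r_1, r_2} \frac{|i - m(r)|^3 p_i \lambda^i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))}, \]

using the estimates above and the assumptions made, we get

\[ E|\zeta^{(r)} - m(r)|^3 = \frac{|t_1 - m(r)|^3 p_{t_1} \lambda^{t_1} + |t_2 - m(r)|^3 p_{t_2} \lambda^{t_2} + o(\lambda^{t_2})}{p_{t_1} \lambda^{t_1} + p_{t_2} \lambda^{t_2} + o(\lambda^{t_2})} \le C_2 \lambda^{t_2 - t_1} + o(\lambda^{t_2 - t_1}), \]

i.e.,

\[ \left| \frac{1}{\sigma^3(r)\sqrt{K}} \, Q\!\left( \frac{u}{\sigma(r)\sqrt{K}} \right) \right| \le \frac{C_2 \lambda^{t_2 - t_1}}{\sigma^2(r) \sqrt{\sigma^2(r) K}} \le \frac{C_3}{\sqrt{\lambda^{t_2 - t_1} K}} \to 0 \]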
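(because $\lambda^{t_2 - t_1} K \to \infty$), i.e., (15) is fulfilled.

Next consider the case $0 < C_4 \le n/N \le C_5 < \infty$. Since in this case $0 < C_6 \le \lambda \le C_7 < 1$ (see (b) in Lemma 1) we obtain

\[ E|\zeta^{(r)} - m(r)|^3 \le \sum_{i=0}^{\infty} \frac{|i - m(r)|^3 p_i \lambda^i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))} \le C_8 + C_9 \sum_{i=0}^{\infty} \frac{d^3}{d\lambda^3} \lambda^i \le C_8 + \frac{C_{10}}{(1 - \lambda)^4} \le C_{11}. \]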
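Combining this with (13) and $0 < C_{12} \le \sigma^2(r) \le C_{13} < \infty$ we obtain that condition (15) is fulfilled.

Finally, consider the case $n/N \to \infty$, $n/N^2 \to 0$. We have

\[ E|\zeta^{(r)} - m(r)|^3 = \sum_{i \ne r_1, r_2} |i - m(r)|^3 \frac{p_i \lambda^i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))} < C_{14} + \sum_{i=j}^{\infty}{}^{\!*} \frac{i^3 p_i \lambda^i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))}. \]

Because $F(\lambda) \to 1$ as $\lambda \to 1$ when $n/N \to \infty$ (see (c) in Lemma 1), we obtain

\[ \sum_{i=j}^{\infty}{}^{\!*} \frac{i^3 p_i \lambda^i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))} \le \sup_k k\lambda^k \sum_{i=j}^{\infty}{}^{\!*} \frac{i^2 p_i}{F(\lambda)(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))}. \]

The function $x \mapsto x\lambda^x$ reaches its maximum at $x = -1/\ln\lambda$, where it equals $1/(e\,|\ln\lambda|)$, and $|\ln\lambda| \ge 1 - \lambda$ for $0 < \lambda < 1$. Since $\lambda \to 1$, we have

\[ \sup_k k\lambda^k \le \frac{C_{15}}{1 - \lambda}. \]

Hence

\[ E|\zeta^{(r)} - m(r)|^3 \le \frac{C_{16}}{1 - \lambda} \le C_{17} \frac{n}{N}. \]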
Further, in this case $0 < C_{18} \le \sigma^2(r)$. Then

\[ \frac{1}{\sigma^3(r)\sqrt{K}} \, Q\!\left( \frac{u}{\sigma(r)\sqrt{K}} \right) = O\!\left( \frac{n}{N^2} \right). \tag{16} \]
Since $n/N^2 \to 0$ it is seen that (15) is fulfilled in this case too. The proof is complete.

Lemma 3 shows that the distribution of $S_K^{(r)}$, after centering and normalization, converges weakly to the normal law. We now prove that local convergence also takes place.

Lemma 4. Let $N, n \to \infty$ such that $n/N^2 \to 0$, and let $K = K(N, n)$ be as in Lemma 3. Assume that $\lambda^{\max(j + v_r, \, t_2 - t_1)} K \to \infty$. Then for a non-negative integer $h$ divisible by $d_r$

\[ P\{S_K^{(r)} = h\} = \frac{d_r (1 + o(1))}{\sigma(r)\sqrt{2\pi K}} \exp\left\{ -\frac{(h - K m(r))^2}{2\sigma^2(r) K} \right\} \]

uniformly in $h$ lying in any finite fixed interval.

Proof. Assume first that $d_r = 1$. The basic tool in our proof is Mukhin's theorem (see [4]). Define

\[ H_K = \inf_{1/4 \le \alpha \le 1/2} K \, E\langle (\zeta_2^{(r)} - \zeta_1^{(r)}) \alpha \rangle^2, \]

where $\langle x \rangle$ is the distance from $x$ to the nearest integer. Consider first the case $N, n \to \infty$ such that $0 < C_1 \le n/N$ and $n/N^2 \to 0$. According to Mukhin's theorem, the local limit theorem follows from the integral limit theorem if the condition (Mukhin's first condition)

\[ \sigma^2(r) K = O(H_K) \tag{17} \]

is fulfilled.
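The quantity $H_K$ can be approximated numerically for a distribution with finite support, which helps to see why it detects the lattice structure. The sketch below is only illustrative: the function name `mukhin_H`, the uniform grid over $\alpha \in [1/4, 1/2]$ (used in place of the exact infimum) and the example distributions are assumptions made here.

```python
import math

def dist_to_nearest_int(x):
    """<x>: distance from x to the nearest integer."""
    return abs(x - round(x))

def mukhin_H(pmf, K, grid=201):
    """Grid approximation of H_K = inf over alpha in [1/4, 1/2] of
    K * E <(z2 - z1) * alpha>^2, with z1, z2 independent with law pmf."""
    atoms = [(k, q) for k, q in enumerate(pmf) if q > 0]
    best = math.inf
    for step in range(grid):
        alpha = 0.25 + 0.25 * step / (grid - 1)
        value = sum(qk * qi * dist_to_nearest_int((k - i) * alpha) ** 2
                    for k, qk in atoms for i, qi in atoms)
        best = min(best, value)
    return K * best

if __name__ == "__main__":
    print(mukhin_H([0.5, 0.5], K=100))        # span 1: strictly positive
    print(mukhin_H([0.5, 0.0, 0.5], K=100))   # span 2: vanishes at alpha = 1/2
```

For a distribution of span 2 the choice $\alpha = 1/2$ makes every term vanish, which is consistent with the proof treating the case $d_r = 1$ first.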
is fulfilled. Consider secondly the case n/N → 0. Let
2 BK (u) = K x2 F (dx), |x|≤u
(r)
(r)
where F is the distribution function of ζ2 − ζ1 . Then, again according Mukhin’s theorem (use now the second condition), √ the local limit theorem follows from the integral limit theorem if HK → ∞ and there exists M > 0 such that σ(r) K = O(BK (M)). To see that this holds consider (r)
(r)
2
E( (ζ2 − ζ1 )α ) =
∞
(k − i)α2
k,i=0
pk (λ)pi (λ) . (1 − pr1 (λ) − pr2 (λ))2
Since $j + v_r$ determines the span of the distribution of $\zeta^{(r)}$, it follows that there exists $\alpha \in [1/4, 1/2]$ such that

\[ \sum_{k,i=0}^{j + v_r - 1} \langle (k - i)\alpha \rangle^2 \frac{p_k(\lambda) p_i(\lambda)}{(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))^2} = 0, \]

and for all $\alpha \in [1/4, 1/2]$

\[ \sum_{k,i=0}^{j + v_r} \langle (k - i)\alpha \rangle^2 \frac{p_k(\lambda) p_i(\lambda)}{(1 - p_{r_1}(\lambda) - p_{r_2}(\lambda))^2} > 0. \]

Now using (6) it is not difficult to get that $C_2 K \lambda^{j + v_r} \le H_K$. Hence, the condition $H_K \to \infty$ is fulfilled. Moreover, it is clear that for sufficiently large $M$ we have

\[ \int_{|x| \le M} x^2 F(dx) \ge C_3 \sigma^2(r). \]

Hence,

\[ \frac{\sigma^2(r) K}{B_K^2(M)} \le C_4, \]

i.e., $\sigma(r)\sqrt{K} = O(B_K(M))$, and the second condition in Mukhin's theorem is valid. The lemma is now proved when $d_r = 1$. If $d_r > 1$ we can consider the random variables $\eta^{(r)} = \zeta^{(r)}/d_r$ having the span 1, and this completes the proof.
5. The sketch of the proof of Theorem 2

First we check that we can use Lemmas 2 and 4, i.e., that condition (8) implies the conditions $\lambda^w n \to \infty$ in Lemma 2 and $\lambda^{j + v_r} K \to \infty$ in Lemma 4, where $K = N + n - (k_1 + k_2) d_r/d$. According to the normal approximation of the binomial distribution,

\[ \binom{M}{m} p_r^m(\lambda) (1 - p_r(\lambda))^{M - m} = \frac{1 + o(1)}{\sqrt{2\pi M p_r(\lambda)(1 - p_r(\lambda))}} \exp\left\{ -\frac{(m - M p_r(\lambda))^2}{2 M p_r(\lambda)(1 - p_r(\lambda))} \right\} \]

uniformly in

\[ \frac{m - M p_r(\lambda)}{\sqrt{2 M p_r(\lambda)(1 - p_r(\lambda))}} \]

lying in any finite fixed interval. Using these approximations for the binomial distributions in Theorem 1 together with the results of Lemmas 2 and 4, after some calculations we get the result of Theorem 2.

References

[1] M. Drmota, B. Gittenberger, The distribution of nodes of given degree in random trees, J. Graph Theory 31 (3) (1999) 227–253.
[2] S. Karlin, H.M. Taylor, A First Course in Stochastic Processes, Academic Press, New York, 1975.
[3] V.F. Kolchin, Random Mappings, Springer, New York, 1986.
[4] A.V. Mukhin, Local limit theorems for lattice random variables, Theory Prob. Appl. 27 (4) (1992) 698–713.
[5] T. Mylläri, Limit distributions for the number of leaves in a random forest, Adv. Appl. Prob. 34 (4) (2002).
[6] T. Mylläri, Yu. Pavlov, Limit distributions of the number of vertices of a given outdegree in a random forest, J. Math. Sci. 138 (1) (2006) 5424–5433.
[7] Yu.L. Pavlov, Random Forests, VSP, Utrecht, 2000.