ELSEVIER
Statistics & Probability Letters 32 (1997) 339-342
A note on the growth of random trees

J.D. Biggins, D.R. Grey*

Probability and Statistics Section, School of Mathematics and Statistics, University of Sheffield, Sheffield S3 7RH, UK

Received January 1996; revised March 1996

Abstract

We provide a unification and generalization of several recent results on the asymptotics, as the number of nodes increases, of the heights of trees grown according to various rules.

Keywords: Search tree; Pyramid; Recursive tree; General branching process; First birth problem; Exponential growth; Logarithmic growth
1. Introduction and theory

A random tree is grown one branch at a time, starting at a root node, in the following way. For a given sequence of non-negative numbers $(w_i : i = 0, 1, 2, \ldots)$ satisfying $w_0 = 1$ by convention and $w_1 > 0$ to avoid triviality, at any stage of growth, independently of the past, the probability that a given node receives the next daughter node depends upon the number $i$ of daughter nodes already possessed by that node, and is found by dividing $w_i$ by the sum of such quantities over all existing nodes.

This model includes the following as special cases.

1. The binary search tree, where $w_0 = 1$, $w_1 = \frac{1}{2}$ and $w_i = 0$ for $i \ge 2$. This arises in a computer science context where each node has two potential daughters, left and right. Numbers, assumed to be independent identically distributed continuous random variables, are inspected one at a time and placed in the correct position on the tree (ordered from left to right) by a sequence of comparisons starting at the root node. In the current context, the left/right distinction is unimportant.

2. The random $m$-ary pyramid ($m \ge 2$), where $w_i = 1$ for $i < m$ and $w_i = 0$ for $i \ge m$. The relevance of this model to the spread of chain letters is discussed by Mahmoud (1994).

3. The linear recursive tree, where $w_i = 1 + bi$ for all $i$ and some constant $b \ge 0$. We use this name to describe a model discussed by Pittel (1994), special cases being the uniform recursive tree with $b = 0$ and the plane-oriented tree with $b = 1$.
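The growth rule described above can be sketched directly as a simulation. The following is a minimal illustration, not code from the paper: the names `grow_tree` and `bst_weight` are hypothetical, and the quadratic-time sampling is chosen for clarity rather than speed.

```python
import random

def grow_tree(weight, n, seed=0):
    """Add n nodes to a root, one at a time, following the weight rule.

    weight(i) is w_i, the weight of a node that already has i daughters.
    Returns (parent, depth): parent[k] is the mother of node k (-1 for the
    root) and depth[k] is the distance of node k from the root.
    """
    rng = random.Random(seed)
    parent, depth, daughters = [-1], [0], [0]
    for _ in range(n):
        # Each existing node is chosen with probability w_i / (sum of weights).
        w = [weight(d) for d in daughters]
        mother = rng.choices(range(len(w)), weights=w)[0]
        daughters[mother] += 1
        parent.append(mother)
        depth.append(depth[mother] + 1)
        daughters.append(0)
    return parent, depth

def bst_weight(i):
    """Binary search tree weights: w_0 = 1, w_1 = 1/2, w_i = 0 for i >= 2."""
    return (1.0, 0.5)[i] if i < 2 else 0.0

if __name__ == "__main__":
    parent, depth = grow_tree(bst_weight, 1000, seed=1)
    print("height after 1000 added nodes:", max(depth))
```

Passing `lambda i: 1.0 if i < m else 0.0` or `lambda i: 1.0 + b * i` as `weight` would give the pyramid and linear recursive tree cases in the same way.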
* Corresponding author.
0167-7152/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0167-7152(96)00092-2
The purpose of this note is to show how the asymptotic behaviour of $h_n$, the height of the tree after $n$ nodes have been added, may be determined under quite general conditions, including various special cases, such as the above three, which have previously been considered.

We first embed this process in continuous time as follows, according to an idea introduced by Pittel (1984) for the binary search tree, and crucial in obtaining all the subsequent results mentioned here. Construct a general (Crump-Mode-Jagers) branching process in which, starting with a single ancestor at time 0, each individual independently never dies but gives birth to single individuals after independent exponential interbirth intervals with parameters $w_0, w_1, w_2, \ldots$ respectively (so that if $w_i = 0$ for some $i$ then the family size cannot exceed $i$). Then, regarding individuals as nodes in a tree, the order of births through time exactly corresponds probabilistically to our original model.

Next, again following earlier authors, we invoke the extension of Kingman's (1975) theorem due to Biggins (1977). This states that if $B_k$ is the time of the first birth in generation $k$ then
$$\frac{B_k}{k} \to \gamma \quad \text{a.s. as } k \to \infty,$$

where

$$\gamma = \sup\{a : \mu(a) < 1\}, \qquad \mu(a) = \inf_{\theta > 0} e^{\theta a}\,\phi(\theta),$$

and $\phi$ is the Laplace transform of the mean reproduction measure, which here may easily be computed to be

$$\phi(\theta) = \sum_{i=0}^{\infty} \prod_{j=0}^{i} \frac{w_j}{w_j + \theta}.$$

For this result, we need to assume that $\phi(\theta)$ is finite for some $\theta > 0$ (whence, by dominated convergence, $\lim_{\theta \to \infty} \phi(\theta) = 0$ and then $\mu(0+) = 0$; also $\lim_{a \to \infty} \mu(a) \ge 2$ by comparison with some binary tree, so that $\gamma$ is positive and finite). In the linear recursive tree, we show later that $\phi(\theta) < \infty$ for $\theta > b$. An easy comparison allows us to conclude that $\limsup_{i \to \infty} i^{-1} w_i < \infty$ is a sufficient condition for the above assumption on $\phi$ to hold, which comfortably accommodates all the special cases mentioned earlier.

Now if $t_n$ denotes the time of the $n$th birth, using the obvious fact that $B_{h_n} \le t_n < B_{h_n + 1}$, we obtain

$$\frac{t_n}{h_n} \to \gamma \quad \text{a.s. as } n \to \infty.$$
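The continuous-time embedding and the sandwich between the birth time $t_n$ and the first-birth times $B_k$ can be illustrated with a small event-driven simulation. This is only a sketch under the assumptions stated in the comments; `simulate_births`, `first_birth_times` and `bst_weight` are hypothetical names, not taken from the paper.

```python
import heapq
import random

def simulate_births(weight, n, seed=0):
    """Run the branching process until n births have followed the ancestor.

    Each individual with i children so far waits an independent
    exponential time with rate weight(i) = w_i before its next birth.
    Returns (times, gen): birth time and generation of each individual,
    in order of birth; individual 0 is the ancestor, born at time 0.
    """
    rng = random.Random(seed)
    times, gen, kids = [0.0], [0], [0]
    heap = [(rng.expovariate(weight(0)), 0)]  # (time of next birth, mother)
    while len(times) <= n:
        t, mother = heapq.heappop(heap)
        child = len(times)
        times.append(t)
        gen.append(gen[mother] + 1)
        kids.append(0)
        kids[mother] += 1
        rate = weight(kids[mother])
        if rate > 0:  # w_i = 0 means the mother has no further births
            heapq.heappush(heap, (t + rng.expovariate(rate), mother))
        heapq.heappush(heap, (t + rng.expovariate(weight(0)), child))
    return times, gen

def first_birth_times(times, gen):
    """B_k: time of the first birth in generation k, for each k observed."""
    B = {}
    for t, g in zip(times, gen):
        B.setdefault(g, t)  # times are in birth order, so the first wins
    return B

def bst_weight(i):
    """Binary search tree weights: w_0 = 1, w_1 = 1/2, w_i = 0 for i >= 2."""
    return (1.0, 0.5)[i] if i < 2 else 0.0

if __name__ == "__main__":
    times, gen = simulate_births(bst_weight, 2000, seed=3)
    B = first_birth_times(times, gen)
    h = max(gen)
    print("h_n =", h, " t_n / h_n =", times[-1] / h, " B_h / h =", B[h] / h)
```

Because the exponential clocks are memoryless, the individual that gives birth next is chosen with probability proportional to its current rate, which is the discrete rule of the original model.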
It remains to obtain a good estimate of $t_n$. Pittel (1994) and Mahmoud (1994) took an ad hoc approach to this, but we take advantage of existing theory, which also yields a more general conclusion. Nerman's (1981) important theorem on almost sure exponential growth of the general branching process allows an excellent estimate of $t_n$ to be derived under weak conditions. However, we invoke instead Theorem 1 of Biggins (1995), which has even weaker hypotheses, but a much weaker conclusion. It states that if $Z(t)$ is the number of births up to time $t$ and $\alpha = \inf\{\theta : \phi(\theta) < 1\}$ (which here is positive since $\phi(0+) \ge 2$ and finite since $\phi(\theta) \to 0$ as $\theta \to \infty$), then

$$\frac{\log_e Z(t)}{t} \to \alpha \quad \text{a.s. as } t \to \infty,$$
from which it follows, putting $t = t_n$, that

$$\frac{t_n}{\log_e n} \to \alpha^{-1} \quad \text{a.s. as } n \to \infty,$$

and this enables us to conclude that

$$\frac{h_n}{\log_e n} \to (\alpha\gamma)^{-1} \quad \text{a.s. as } n \to \infty.$$
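Spelled out, the two limits combine as follows. This is only a restatement of the argument, assuming $Z(t_n) = n$ up to the convention for counting the ancestor (which does not affect the limit) and using $t_n \to \infty$:

```latex
\frac{\log_e n}{t_n} = \frac{\log_e Z(t_n)}{t_n} \to \alpha,
\qquad
\frac{h_n}{\log_e n}
  = \frac{h_n}{t_n} \cdot \frac{t_n}{\log_e n}
  \to \gamma^{-1} \cdot \alpha^{-1}
  = (\alpha\gamma)^{-1}
\quad \text{a.s. as } n \to \infty .
```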
2. Calculations

Analytic methods may sometimes be used before resorting to numerical methods to calculate the various constants appearing in the above limiting results. For the uniform recursive tree, we can be completely explicit: $\phi(\theta) = \theta^{-1}$, $\alpha = 1$, $\mu(a) = ea$, $\gamma = e^{-1}$ and $(\alpha\gamma)^{-1} = e$, a result obtained by Devroye (1987). For the binary search tree, $\phi(\theta) = 2/(1 + 2\theta)$, $\alpha = \frac{1}{2}$, $\mu(a) = a e^{1 - a/2}$ for $a < 2$ and $2$ for $a \ge 2$, and $\gamma$ is the smaller positive root of the equation $a e^{1 - a/2} = 1$ (Devroye, 1986). For the random $m$-ary pyramid, in general $\phi(\theta) = \theta^{-1}\{1 - (1 + \theta)^{-m}\}$; in the case $m = 2$, Mahmoud (1994) obtains that $\alpha = (\sqrt{5} - 1)/2$ and $\gamma$ is a root of a complicated equation not reproduced here; in the case $m = 3$, $\alpha$ is the positive root of $\theta^3 + 2\theta^2 - 2 = 0$ and we have calculated $\gamma$ numerically. For the linear recursive tree with $b > 0$, by expressing $\phi$ in terms of Gamma functions and then Beta functions, we may calculate (denoting $\beta = b^{-1}$) that
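$$\phi(\theta) = \frac{1}{B(\beta, \beta\theta)} \sum_{i=0}^{\infty} B(\beta + 1 + i, \beta\theta) = \frac{1}{B(\beta, \beta\theta)} \int_0^1 \frac{x^{\beta}}{1 - x}\,(1 - x)^{\beta\theta - 1}\,dx = \frac{B(\beta + 1, \beta\theta - 1)}{B(\beta, \beta\theta)} = (\theta - b)^{-1}$$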
for $\theta > b$, and so $\alpha = b + 1$, $\mu(a) = a e^{ba + 1}$ and $\gamma$ is the unique root of $a e^{ba + 1} = 1$. This result was obtained by Pittel (1994), although the formula for $\phi(\theta)$ was derived less directly.

These results are summarised numerically (using the plane-oriented tree as an example of the linear recursive tree) in Table 1.
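The Beta-function steps for the linear recursive tree can be checked numerically. The sketch below compares a partial sum of the defining series for $\phi(\theta)$, the Beta-function ratio and the closed form $(\theta - b)^{-1}$; the helper names and the test values $b = 0.5$, $\theta = 3$ are illustrative assumptions, chosen so that the series converges quickly.

```python
import math

def log_beta(x, y):
    """log B(x, y) via log-Gamma functions (x, y > 0)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def phi_partial(theta, b, terms):
    """Partial sum of sum_i prod_{j<=i} w_j / (w_j + theta), w_j = 1 + b*j."""
    total, prod = 0.0, 1.0
    for j in range(terms):
        w = 1.0 + b * j
        prod *= w / (w + theta)
        total += prod
    return total

def term_as_beta_ratio(i, theta, b):
    """The i-th term written as B(beta+1+i, beta*theta) / B(beta, beta*theta)."""
    beta = 1.0 / b
    return math.exp(log_beta(beta + 1 + i, beta * theta)
                    - log_beta(beta, beta * theta))

def phi_beta(theta, b):
    """B(beta+1, beta*theta-1) / B(beta, beta*theta); requires theta > b > 0."""
    beta = 1.0 / b
    return math.exp(log_beta(beta + 1, beta * theta - 1)
                    - log_beta(beta, beta * theta))

if __name__ == "__main__":
    b, theta = 0.5, 3.0
    print(phi_partial(theta, b, 500), phi_beta(theta, b), 1.0 / (theta - b))
```

For $\theta$ only slightly larger than $b$ the series converges slowly, so the partial sum is a poor check there; the Beta ratio does not have that limitation.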
Table 1

Model                     $\alpha$   $\gamma$   $(\alpha\gamma)^{-1}$
Binary search tree        0.5000     0.4639     4.3111
Random binary pyramid     0.6180     0.4056     3.9891
Random ternary pyramid    0.8393     0.3760     3.1690
Uniform recursive tree    1.0000     0.3679     2.7183
Plane-oriented tree       2.0000     0.2785     1.7956
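The tabulated constants can be recomputed from the formulas for $\phi$ given in Section 2. The sketch below is one possible numerical route, not necessarily the authors' method: it finds $\alpha$ by bisection, evaluates $\log \mu(a)$ by a ternary search over $\theta$ (the objective $a\theta + \log\phi(\theta)$ is convex), and finds $\gamma$ by a second bisection. All function names, search intervals and iteration counts are assumptions. In a hand check the results agreed with Table 1 to about three decimal places.

```python
import math

def phi_finite(weights):
    """phi for weights (w_0, ..., w_{m-1}), all positive, with w_i = 0 afterwards."""
    def phi(theta):
        total, prod = 0.0, 1.0
        for w in weights:
            prod *= w / (w + theta)
            total += prod
        return total
    return phi

def phi_linear(b):
    """Closed form (theta - b)^{-1} for w_i = 1 + b*i; infinite for theta <= b."""
    return lambda theta: 1.0 / (theta - b) if theta > b else math.inf

def alpha(phi, hi=50.0):
    """Point at which the decreasing function phi crosses 1 (bisection)."""
    lo = 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

def log_mu(phi, a, theta_min, width=200.0):
    """min over theta > theta_min of a*theta + log phi(theta), by ternary search."""
    g = lambda th: a * th + math.log(phi(th))
    lo, hi = theta_min, theta_min + width
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return g(0.5 * (lo + hi))

def gamma(phi, theta_min, a_hi=5.0):
    """sup{a : mu(a) < 1}, by bisection (mu is nondecreasing in a)."""
    lo = 0.0
    for _ in range(60):
        mid = 0.5 * (lo + a_hi)
        if log_mu(phi, mid, theta_min) < 0.0:
            lo = mid
        else:
            a_hi = mid
    return lo

# (phi, smallest theta at which phi may be finite) for each model in Table 1
MODELS = {
    "Binary search tree": (phi_finite([1.0, 0.5]), 0.0),
    "Random binary pyramid": (phi_finite([1.0] * 2), 0.0),
    "Random ternary pyramid": (phi_finite([1.0] * 3), 0.0),
    "Uniform recursive tree": (phi_linear(0.0), 0.0),
    "Plane-oriented tree": (phi_linear(1.0), 1.0),
}

def table():
    rows = {}
    for name, (phi, theta_min) in MODELS.items():
        al, ga = alpha(phi), gamma(phi, theta_min)
        rows[name] = (al, ga, 1.0 / (al * ga))
    return rows

if __name__ == "__main__":
    for name, (al, ga, c) in table().items():
        print(f"{name:24s} {al:.4f} {ga:.4f} {c:.4f}")
```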
3. Concluding remarks

There is a deliberate monotonicity in the way the tabulated results have been presented: namely, for each $i$, $w_i$ is non-decreasing from top to bottom. The fact that under such circumstances $\alpha$ increases and $\gamma$ decreases is easily confirmed theoretically; we have found no comparable explanation for the monotonicity of $(\alpha\gamma)^{-1}$, although intuitively the more the tree is encouraged to grow sideways the more slowly can it be expected to grow upwards. A further example of this monotonicity is the linear recursive tree, where it is not hard to show analytically that $(\alpha\gamma)^{-1}$ decreases as $b$ increases.

It is possible to allow $w_i = \infty$ for some values of $i$, provided that the assumption on $\phi$ holds. This corresponds to the possibility that a node whose number of daughter nodes reaches $i$ automatically receives the next daughter node; or, in the branching process formulation, to the possibility of multiple births (i.e. twins, triplets, etc.).

The formalities for the linear recursive tree may be extended with little change to the case $b = -m^{-1}$ for some integer $m \ge 2$, because if $w_m = 0$ then it does not matter if $w_i$ is defined to be negative for $i > m$. The binary search tree is the special case $m = 2$ of this.

If $h_n$ is the maximum generation number among the first $n$ births of a general branching process, the methods described continue to apply to show that $h_n / \log_e n \to (\alpha\gamma)^{-1}$. However, the rules for growing the tree sequentially will not take an attractive form in general. One interesting case covered by the more general result is the $m$-ary search tree, which is a generalization of the binary search tree. Pittel (1994) strengthened Devroye's (1990) result on the heights of these trees to almost sure convergence by essentially this route, and so that paper can be consulted for a description of the appropriate general branching process.
The derivation of the behaviour of $h_n$ for the $m$-ary search tree based on identifying a general branching process can also be found in Biggins (1996); the relevant section there was prepared independently of Pittel (1994) but has several ideas in common with it.
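The monotonicity in $b$ noted above for the linear recursive tree can be checked numerically from the closed forms of Section 2, namely $\alpha = b + 1$ and $\gamma$ the root of $a e^{ba + 1} = 1$. The sketch below is a numerical illustration for a few values of $b$, not a proof; the function names are hypothetical.

```python
import math

def gamma_linear(b):
    """Root a in (0, 1) of a * exp(b*a + 1) = 1 for b >= 0, by bisection.

    The left-hand side is increasing in a, so the root is unique.
    """
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.log(mid) + b * mid + 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def height_constant(b):
    """(alpha * gamma)^{-1} for the linear recursive tree, with alpha = b + 1."""
    return 1.0 / ((b + 1.0) * gamma_linear(b))

if __name__ == "__main__":
    for b in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(f"b = {b:3.1f}   (alpha*gamma)^-1 = {height_constant(b):.4f}")
```

The values $b = 0$ and $b = 1$ correspond to the uniform recursive tree and the plane-oriented tree rows of Table 1.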
References

Biggins, J.D. (1977), Chernoff's Theorem in the branching random walk, J. Appl. Probab. 14, 630-636.
Biggins, J.D. (1995), The growth and spread of the general branching random walk, Ann. Appl. Probab. 5, 1008-1024.
Biggins, J.D. (1996), How fast does a general branching random walk spread? in: K.B. Athreya and P. Jagers, eds., Classical and Modern Branching Processes, IMA Volumes in Mathematics and its Applications, Vol. 84 (Springer, New York) pp. 19-40.
Devroye, L. (1986), A note on the height of binary search trees, J. Assoc. Comput. Mach. 33, 489-498.
Devroye, L. (1987), Branching processes in the analysis of the height of trees, Acta Inform. 24, 277-298.
Devroye, L. (1990), On the height of random m-ary search trees, Random Structures Algorithms 1, 191-203.
Kingman, J.F.C. (1975), The first birth problem for an age-dependent branching process, Ann. Probab. 3, 790-801.
Mahmoud, H.M. (1994), A strong law for the height of random binary pyramids, Ann. Appl. Probab. 4, 923-932.
Nerman, O. (1981), On the convergence of supercritical general (C-M-J) branching process, Z. Wahrsch. verw. Gebiete 57, 365-395.
Pittel, B. (1984), On growing random binary trees, J. Math. Anal. Appl. 103, 461-480.
Pittel, B. (1994), Note on the heights of random recursive trees and random m-ary search trees, Random Structures Algorithms 5, 337-347.