Computer Methods and Programs in Biomedicine, 33 (1990) 165-169
Elsevier COMMET 01158
Section I. Methodology
A simple algorithm for generating neuronal dendritic trees
Jean M. Pallo
Département d'Informatique, Université de Bourgogne, 21004 Dijon, France
A simple, efficient algorithm is presented for generating the codewords of all neuronal dendritic trees with a given number of terminal nodes. Furthermore, a procedure is developed for deciding whether different codewords correspond to topologically equivalent trees.
Keywords: Neuronal tree; Data structure; Lexicographic generation
1. Introduction
For many reasons it is often useful to have lists of all the shapes of trees of a certain type. Previous research on tree generation, found mostly in computer science, has been concerned primarily with binary trees, i.e. trees in which every node has either zero or two successors. Many papers contain algorithms for generating all binary trees. Typically, the trees are encoded as integer sequences and those sequences are then generated lexicographically (see [7] and the references given therein). In contrast, dendritic fields of neurones can be represented as trees whose nodes have a variable number of successors [2]. In this paper, we present a coding for neuronal trees. This coding is used to generate all neuronal trees with a given number of terminal nodes. We then give rules for representing neuronal trees in canonical form: neuronal trees which appear to have different shapes but are topologically equivalent have the same canonical form.
Correspondence: J.M. Pallo, Département d'Informatique, Université de Bourgogne, B.P. 138, 21004 Dijon, France.
0169-2607/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)
2. Notations and definitions
In a (rooted, ordered) tree, every node except the root has a parent. Every node has m ≥ 0 children (the order is significant) and each of these children is the root of a tree called a subtree of this node. The number of subtrees of a node is called the degree of that node. A node of degree zero is called a terminal node. A nonterminal node is called a branch node [4]. The height of a node is the number of nodes on the unique path from the root to this node. The term 'neuronal tree' will be used to refer to a tree in which the degree of the branch nodes is variable and greater than or equal to 2 [1]. Translated into dendritic terminology, the root is taken to be the axon hillock and the terminal nodes are the tips of the terminal segments. The order of magnitude of branching at a node may be described as dichotomous if the degree of that node is 2, trichotomous if the degree is 3, and so on [2].
Let S_n denote the set of neuronal trees with n terminal nodes. Let c_n = |S_n|, where |.| denotes the cardinality of a set. c_n is the well-known n-th Schröder number [8]. c_n can be computed by the linear recurrence formula [3]:
\[
(n+1)\,c_{n+1} = 3(2n-1)\,c_n - (n-2)\,c_{n-1} \quad (n \ge 2), \qquad c_1 = c_2 = 1.
\]
TABLE 1
The number c_n of neuronal trees with n terminal nodes (3 ≤ n ≤ 12)
n     c_n
3     3
4     11
5     45
6     197
7     903
8     4279
9     20793
10    103049
11    518859
12    2646723
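The recurrence for c_n can be checked numerically. The following Python sketch (the function name `schroeder_counts` is chosen here for illustration and does not come from the paper) iterates the recurrence from c_1 = c_2 = 1 and reproduces the values listed in Table 1.

```python
def schroeder_counts(n_max):
    """Return a dict {n: c_n} for 1 <= n <= n_max, using the linear recurrence."""
    c = {1: 1, 2: 1}
    for n in range(2, n_max):
        # (n + 1) * c_{n+1} = 3 * (2n - 1) * c_n - (n - 2) * c_{n-1}
        numerator = 3 * (2 * n - 1) * c[n] - (n - 2) * c[n - 1]
        # The division is exact because every c_n is an integer.
        c[n + 1] = numerator // (n + 1)
    return {n: c[n] for n in range(1, n_max + 1)}


if __name__ == "__main__":
    for n, value in schroeder_counts(12).items():
        print(n, value)
```

Integer arithmetic is used throughout, so the sketch stays exact for larger n as well.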
Unfortunately, no explicit formula for c_n as a function of n has ever been obtained. See for example Table 1.
Given a neuronal tree T, let d_T be the degree of its root and T_i be the subtree rooted at the i-th son of the root of T. We say that two neuronal trees T and T' are in B-order [7], and we write T < T', if:
(1) d_T < d_T', or
(2) d_T = d_T' and for some i ∈ [1, d_T] we have
    (a) T_j = T'_j for all j ∈ [1, i-1] and
    (b) T_i < T'_i.
We say that two integer sequences u = (u_1, u_2, ..., u_n) and u' = (u'_1, u'_2, ..., u'_m) are in lexicographic order, and we write u < u', if there exists i ∈ [1, min(n, m)] such that
(1) u_j = u'_j for all j ∈ [1, i-1] and
(2) u_i < u'_i.
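The two orderings can be made concrete with a small Python sketch. It assumes, purely for illustration, that a tree is stored as a nested tuple of its subtrees with the empty tuple `()` standing for a terminal node; the names `b_less` and `lex_less` are hypothetical.

```python
def b_less(t, u):
    """True if tree t precedes tree u in B-order (trees are nested tuples)."""
    if len(t) != len(u):
        # Rule (1): the tree whose root has the smaller degree comes first.
        return len(t) < len(u)
    for a, b in zip(t, u):
        if a != b:
            # Rule (2): compare the first pair of differing subtrees.
            return b_less(a, b)
    return False  # identical trees


def lex_less(u, v):
    """True if sequence u precedes sequence v in the lexicographic order of the text."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return False  # no differing position within the common length


if __name__ == "__main__":
    leaf = ()
    cherry = (leaf, leaf)
    print(b_less((leaf, cherry), (cherry, leaf)))
    print(lex_less((1, 0, 1, 1, 0, 0, 0), (1, 0, 2, 0, 0, 0)))
```

Note that `lex_less` follows the definition literally: a sequence that is a proper prefix of another is not considered smaller, which differs from Python's built-in tuple comparison.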
3. Codeword representation of neuronal trees
The problem of representing (or encoding) a tree by an integer sequence has received substantial attention (see [6] and the references given therein). An integer sequence is said to be feasible with respect to a certain encoding scheme if it represents a tree under that scheme. Basically, the uniqueness of the representation and the capability of constructing the tree from its representation (i.e. decodability) are necessary conditions for a tree encoding scheme.
Given a neuronal tree T, we label each terminal node with zero and each branch node with its degree minus one (let us recall that in a neuronal tree the degrees of branch nodes are 2, 3, 4, ...). We then read these labels in preorder, i.e. if the root of T has subtrees T_1, ..., T_m, visit the root of T and then traverse T_1, ..., T_m. We obtain an integer sequence s_T called the codeword of T. See for example Figs. 1 and 2. It can be easily proved that T < T' iff s_T < s_T', and that the integer sequence s = (s_1, ..., s_l) is a feasible codeword of a neuronal tree iff s_l = 0 and for all k ∈ [1, l-1]:
\[
\sum_{1 \le i \le k} s_i \;\ge\; \bigl|\{\, j \in [1,k] \mid s_j = 0 \,\}\bigr| .
\]
Fig. 1. The 11 neuronal trees with four terminal nodes. Codewords: 1010100, 1011000, 102000, 1100100, 1101000, 1110000, 120000, 200100, 201000, 210000, 30000.
Fig. 2. Two neuronal trees of S_17 and S_14, with codewords 22030000010210000200120000 and 130210000001200300000.
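The codeword scheme of this section can be sketched in Python as follows. Trees are again assumed to be nested tuples with `()` as a terminal node, and the names `encode`, `decode` and `is_feasible` are hypothetical. The prefix test follows the condition stated above; the final equality test (the labels before the last position sum to exactly the number of zeros among them) is an extra assumption of this sketch, added so that sequences such as (2, 0) are rejected.

```python
def encode(tree):
    """Preorder codeword: 0 for a terminal node, degree - 1 for a branch node."""
    word = [len(tree) - 1] if tree else [0]
    for child in tree:
        word.extend(encode(child))
    return word


def is_feasible(s):
    """Check whether the integer sequence s is the codeword of a neuronal tree."""
    if not s or s[-1] != 0:
        return False
    total, zeros = 0, 0
    for label in s[:-1]:
        if label < 0:
            return False
        total += label
        zeros += (label == 0)
        if total < zeros:          # prefix condition from the text
            return False
    return total == zeros          # assumed closing condition (see lead-in)


def decode(s):
    """Rebuild the nested-tuple tree from a feasible codeword."""
    pos = 0

    def parse():
        nonlocal pos
        label = s[pos]
        pos += 1
        if label == 0:
            return ()
        return tuple(parse() for _ in range(label + 1))

    tree = parse()
    if pos != len(s):
        raise ValueError("trailing labels after the end of the tree")
    return tree


if __name__ == "__main__":
    word = [int(ch) for ch in "130210000001200300000"]  # second tree of Fig. 2
    print(is_feasible(word), word.count(0), encode(decode(word)) == word)
```

Counting the zeros of a codeword gives the number of terminal nodes, which is how the two codewords of Fig. 2 can be matched to S_17 and S_14.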
4. The generating algorithm
The following algorithm generates lexicographically all codewords of neuronal trees with n terminal nodes. Thus neuronal trees are generated in B-order:
    Begin with s = (1, 0, 1, 0, ..., 1, 0, 1, 0, 0)
    l := 2n - 1
    while i := max{k ∈ [1, l] | s(k) ≠ 0} ≠ 1 do
        if s(i-1) = 0 then q := l - i + 1 else q := l - i endif
        t := s(i) - 1
        s(i-1) := s(i-1) + 1
        l := i + t + q - 1
        s(l) := 0
        for j := 1 to t do s(l-2j+1) := 0; s(l-2j) := 1 enddo
        for j := i to l - 2t - 1 do s(j) := 0 enddo
    enddo
TABLE 2
List of the codewords of the 45 neuronal trees with five terminal nodes
101010100  101011000  10102000   101100100  101101000
101110000  10120000   10200100   10201000   10210000
1030000    110010100  110011000  11002000   110100100
110101000  110110000  11020000   111000100  111001000
111010000  111100000  11200000   12000100   12001000
12010000   12100000   1300000    20010100   20011000
2002000    20100100   20101000   20110000   2020000
21000100   21001000   21010000   21100000   2200000
3000100    3001000    3010000    3100000    400000
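A Python transcription of the generating algorithm may look as follows. The function name `generate_codewords` and the unused padding cell at index 0 (which keeps the 1-based indices of the pseudocode) are choices made for this sketch. The update of t is written as t := s(i) - 1, which is how the step is understood here; it reproduces the 11 codewords of Fig. 1 and the 45 codewords of Table 2.

```python
def generate_codewords(n):
    """Yield, in lexicographic order, the codewords of all neuronal trees
    with n terminal nodes (each codeword is a tuple of labels)."""
    # s[1..l] holds the current codeword; s[0] is padding.
    s = [0] + [1, 0] * (n - 1) + [0]
    l = 2 * n - 1
    while True:
        yield tuple(s[1:l + 1])
        # Position of the last nonzero label (1 if there is none).
        i = max((k for k in range(1, l + 1) if s[k] != 0), default=1)
        if i == 1:
            return
        # q: number of zeros in positions i-1 .. l of the current codeword.
        q = l - i + 1 if s[i - 1] == 0 else l - i
        # t: label sum still to be placed after position i-1.
        t = s[i] - 1
        s[i - 1] += 1
        l = i + t + q - 1
        s[l] = 0
        for j in range(1, t + 1):        # tail (1, 0) repeated t times
            s[l - 2 * j + 1] = 0
            s[l - 2 * j] = 1
        for j in range(i, l - 2 * t):    # leading zeros of the new suffix
            s[j] = 0


if __name__ == "__main__":
    for word in generate_codewords(4):
        print("".join(map(str, word)))
```

Using a generator keeps memory use proportional to one codeword, which matters because the number of trees grows quickly with n.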
See for example Table 2 for n = 5.
Let S_{n,m} denote the set of neuronal trees with n terminal nodes whose branch nodes all have degrees less than or equal to m (m ≤ n). We have |S_{n,n}| = c_n, |S_{n,n-1}| = c_n - 1 and
\[
|S_{n,2}| = \frac{1}{n}\binom{2n-2}{n-1},
\]
the (n-1)-th Catalan number, which counts the binary trees with n - 1 branch nodes (and thus n terminal nodes). If we do not wish to generate all the neuronal trees of S_n but only the neuronal trees of S_{n,m}, we use the following algorithm:
    Begin with s = (1, 0, 1, 0, ..., 1, 0, 1, 0, 0)
    l := 2n - 1
    s(0) := 0
    while i := max{k ∈ [1, l] | s(k) ≠ 0 and s(k-1) < m - 1} ≠ 1 do
        t := Σ_{i ≤ k ≤ l} s(k)
        q := |{k ∈ [i-1, l] | s(k) = 0}|
        s(i-1) := s(i-1) + 1
        t := t - 1
        l := i + t + q - 1
        s(l) := 0
        for j := 1 to t do s(l-2j+1) := 0; s(l-2j) := 1 enddo
        for j := i to l - 2t - 1 do s(j) := 0 enddo
    enddo
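The degree-bounded variant can be sketched in the same way. The selection rule used below, s(k) ≠ 0 and s(k-1) < m - 1, and the initial value of t (the sum of the labels from position i to l) are assumptions of this sketch where the printed pseudocode is hard to read; the function name `generate_bounded` is hypothetical. The counts it produces agree with the Catalan numbers for m = 2 and with c_n for m = n.

```python
def generate_bounded(n, m):
    """Yield, in lexicographic order, the codewords of the neuronal trees with
    n terminal nodes whose branch nodes all have degree <= m (n, m >= 2)."""
    # s[0] = 0 plays the role of the sentinel s(0); s[1..l] is the codeword.
    s = [0] + [1, 0] * (n - 1) + [0]
    l = 2 * n - 1
    while True:
        yield tuple(s[1:l + 1])
        # Last nonzero label whose predecessor may still be increased.
        i = max((k for k in range(1, l + 1)
                 if s[k] != 0 and s[k - 1] < m - 1), default=1)
        if i == 1:
            return
        t = sum(s[i:l + 1])                                    # labels in i .. l
        q = sum(1 for k in range(i - 1, l + 1) if s[k] == 0)   # zeros in i-1 .. l
        s[i - 1] += 1
        t -= 1
        l = i + t + q - 1
        s[l] = 0
        for j in range(1, t + 1):
            s[l - 2 * j + 1] = 0
            s[l - 2 * j] = 1
        for j in range(i, l - 2 * t):
            s[j] = 0


if __name__ == "__main__":
    print([len(list(generate_bounded(5, m))) for m in range(2, 6)])
```

Because the new suffix only uses the labels 0 and 1, every generated codeword respects the bound as long as the incremented label does.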
TABLE 3
Numbers of neuronal trees with n terminal nodes whose branch nodes all have degrees ≤ m (m ≤ n ≤ 12)
n \ m        2        3        4        5        6        7        8        9       10       11       12
4            5       10       11
5           14       38       44       45
6           42      154      189      196      197
7          132      654      850      894      902      903
8          429     2871     3951     4215     4269     4278     4279
9         1430    12925    18832    20377    20717    20782    20792    20793
10        4862    59345    91542   100463   102531   102960   103037   103048   103049
11       16796   276835   452075   503191   515521   518224   518756   518846   518858   518859
12       58786  1308320  2261753  2553291  2625909  2642484  2645955  2646605  2646709  2646722  2646723
Table 3 shows that if we constrain the degrees of branch nodes to be less than or equal to 3, we get approximately half of the possible neuronal trees (about 49% for n = 12). On the other hand, if the bound on these degrees is 4, we already reach most of the neuronal trees (about 85% for n = 12, and more for smaller n).
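The entries of Table 3 can also be cross-checked without enumerating any codeword. The sketch below is not the method of the paper: it is a plain dynamic-programming count, under the assumption that a tree with n ≥ 2 terminal nodes is a root of degree d (2 ≤ d ≤ m) followed by an ordered sequence of d smaller trees. The names `trees` and `forests` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(d, n, m):
    """Number of ordered sequences of d trees with n terminal nodes in total."""
    if d == 0:
        return 1 if n == 0 else 0
    # The first tree takes k terminal nodes; the other d - 1 trees need at least one each.
    return sum(trees(k, m) * forests(d - 1, n - k, m)
               for k in range(1, n - d + 2))


@lru_cache(maxsize=None)
def trees(n, m):
    """Number of neuronal trees with n terminal nodes and branch degrees 2..m."""
    if n == 1:
        return 1  # a single terminal node
    return sum(forests(d, n, m) for d in range(2, min(m, n) + 1))


if __name__ == "__main__":
    total = trees(12, 12)
    for m in (2, 3, 4):
        print(m, trees(12, m), round(trees(12, m) / total, 3))
```

The printed ratios give the proportion of all trees with 12 terminal nodes that satisfy each degree bound.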
5. Transforming neuronal trees into canonical form
The problem of finding a canonical representation of neuronal trees occurs when different neuronal trees are compared [5]. It is necessary to represent neuronal trees in a canonical form so that neuronal trees which appear to have different branching patterns but are topologically equivalent have the same canonical form. Canonical forms can be recursively defined as follows: a neuronal tree T is in canonical form if the subtrees T_1, T_2, ..., T_m of its root are in canonical form and if T_1 ≥ T_2 ≥ ... ≥ T_m. See for example Fig. 3.
Fig. 3. Canonical forms of the neuronal trees of Fig. 2, with codewords 22300000021200000012100000 and 132100000001230000000.
To a given neuronal tree T, the class of topologically equivalent neuronal trees can be associated. It can be easily seen that the canonical form of T is merely the greatest neuronal tree of this class with respect to the B-ordering. For a neuronal tree in canonical form, each branch node has subtrees T_1, ..., T_m such that T_1 ≥ ... ≥ T_m. Hence, to put a neuronal tree in canonical form, the subtrees of each branch node must be sorted. Let h be the greatest height of the branch nodes of a given neuronal tree T. Branch nodes of height h have only terminal nodes as children, so nothing needs to be sorted there. To avoid repetitions, we start by sorting the subtrees of all branch nodes of height h - 1, then of height h - 2, ..., and we terminate by sorting the subtrees of the root, which has height 1. Consequently, obtaining the canonical form requires a number of sorts at most equal to the number of branch nodes.
Given the codeword s = (s(1), ..., s(l)) of a neuronal tree, the following algorithm computes the height h(i) of the i-th node in preorder (1 ≤ i ≤ l) using an additional array a:
    h(1) := 1
    a(1) := s(1) + 1
    for i := 2 to l do
        if s(i) = 0 then a(i) := 0 else a(i) := s(i) + 1 endif
        m := max{k ∈ [1, i-1] | a(k) ≠ 0}
        a(m) := a(m) - 1
        h(i) := h(m) + 1
    enddo
To get the codewords of the subtrees of the i-th branch node (s(i) ≠ 0) in lexicographic order, we use the following algorithm, which computes the indices p(1), ..., p(s(i)+1) at which the s(i) + 1 subtrees terminate:
    j := i
    for k := 1 to s(i) + 1 do
        c := 0
        f := 0
        while c ≠ f + 1 do
            j := j + 1
            if s(j) = 0 then c := c + 1 else f := f + s(j) endif
        enddo
        p(k) := j
    enddo
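The two procedures above, and the sorting step they support, can be sketched in Python as follows. `node_heights` and `subtree_ends` transcribe the pseudocode, with 1-based positions i mapped onto 0-based Python lists. `canonical` is a hypothetical helper that sorts recursively from the leaves upward rather than level by level; this is an assumption of the sketch, chosen for brevity, and it performs the same sorts.

```python
def node_heights(s):
    """Heights of the nodes in preorder; the root has height 1."""
    l = len(s)
    h = [0] * (l + 1)      # 1-based, index 0 unused
    a = [0] * (l + 1)      # a[k]: children of node k not yet visited
    h[1], a[1] = 1, s[0] + 1
    for i in range(2, l + 1):
        a[i] = s[i - 1] + 1 if s[i - 1] != 0 else 0
        m = max(k for k in range(1, i) if a[k] != 0)   # parent of node i
        a[m] -= 1
        h[i] = h[m] + 1
    return h[1:]


def subtree_ends(s, i):
    """1-based positions p(1), ..., p(s(i)+1) where the subtrees of node i end."""
    ends, j = [], i
    for _ in range(s[i - 1] + 1):
        c = f = 0          # c: zeros seen, f: sum of labels seen
        while c != f + 1:
            j += 1
            if s[j - 1] == 0:
                c += 1
            else:
                f += s[j - 1]
        ends.append(j)
    return ends


def canonical(s):
    """Codeword of the canonical form: subtrees sorted in decreasing order."""
    if s[0] == 0:
        return list(s)
    parts, start = [], 2
    for end in subtree_ends(s, 1):
        parts.append(canonical(s[start - 1:end]))
        start = end + 1
    parts.sort(reverse=True)   # lexicographic order of codewords = B-order
    return [s[0]] + [label for part in parts for label in part]


if __name__ == "__main__":
    word = [int(ch) for ch in "130210000001200300000"]
    print("".join(map(str, canonical(word))))
```

Two codewords then describe topologically equivalent trees exactly when `canonical` returns the same list for both.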
References
[1] S.B. Berger and L.W. Tucker, Binary tree representation of three-dimensional, reconstructed neuronal trees: a simple, efficient algorithm, Comput. Methods Programs Biomed. 23 (1986) 231-235.
[2] M. Berry and P.M. Bradley, The application of network analysis to the study of branching patterns of large dendritic fields, Brain Res. 109 (1976) 111-132.
[3] L. Comtet, Analyse Combinatoire, Vol. 1 (Presses Universitaires de France, 1970).
[4] D.E. Knuth, The Art of Computer Programming, Vol. 1, Fundamental Algorithms (Addison-Wesley, Reading, MA, 1973).
[5] W.P. Ireland, Pattern conserving data structure and algorithms for computations on dendritic trees, Comput. Biomed. Res. 22 (1989) 44-51.
[6] J.M. Pallo, Enumerating, ranking and unranking binary trees, Comput. J. 29 (1986) 171-175.
[7] J.M. Pallo and R. Racca, A note on generating binary trees in A-order and B-order, Int. J. Comput. Math. 18 (1985) 27-39.
[8] E. Schröder, Vier combinatorische Probleme, Z. Math. Phys. 15 (1870) 361-376.