General combinatorics of RNA secondary structure

General combinatorics of RNA secondary structure

Mathematical Biosciences 191 (2004) 69–81 www.elsevier.com/locate/mbs General combinatorics of RNA secondary structure Bo Liao *, Tian-ming Wang Depa...

332KB Sizes 0 Downloads 79 Views

Mathematical Biosciences 191 (2004) 69–81 www.elsevier.com/locate/mbs

General combinatorics of RNA secondary structure Bo Liao *, Tian-ming Wang Department of Applied Mathematics, Dalian University of Technology, Dalian of Liaoning, Dalian 116024, China Received 25 July 2002; received in revised form 31 October 2003; accepted 13 May 2004

Abstract The total number of RNA secondary structures of a given length with minimal hairpin loop length m(m > 0) and with minimal stack length l(l > 0) is computed, under the assumption that all base pairs can occur. Asymptotics are derived from the determination of recurrence relations of decomposition properties. Ó 2004 Elsevier Inc. All rights reserved. Keywords: RNA secondary structure; Generating function; Asymptotic enumeration

1. Introduction Determining the shape a single-stranded RNA takes in solution is an important problem in computation biology; see [21]. In [22] the problem of prediction is shown to be polynomial. The prediction algorithms are combinatorial and the enumeration of secondary structure is a natural problem (see [5,6,8]). In these enumeration studies, the specific identities of the bases are ignored, in effect all possible base pairs are allowed. In this way, the entire set of possible RNA secondary structures is studied. Here we will focus on enumeration problems which are related to the secondary structure of RNA [1–9,20]. This sort of studies has a long history starting with the investigations of Waterman *

Corresponding author. Tel.: +86 41 1470 8351 8708; fax: +86 41 1470 6100. E-mail addresses: [email protected] (B. Liao), [email protected] (T.-m. Wang).

0025-5564/$ - see front matter Ó 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.mbs.2004.05.004

70

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

and Smith [8] who gave the first formal frame-work for the topic [6]. As shown in [5] the sequences arising in the enumeration of secondary structures which can occur under various reasonable restriction may be consider as natural generalizations of the Catalan and Motzkin numbers. For most of the problems considered one finds similar decomposition patterns. This observation is used in [17] to describe algorithms and techniques for computing generating functions for certain RNA configurations. Based on the traditional approach [10–16,18,19], i.e., based on the determination of recurrence relation from decomposition properties of the objects, the authors of [2] obtain several enumeration results for restricted configurations of secondary structures. But all results obtained previously under the assumption that the minimal loop length is 1 or the minimal stack length is one. Now, we will consider general combinatorics of the RNA secondary structure with minimal hairpin loop length m(m > 0) and with minimal stack length l(l > 0). In paper [1], we have computed the number of hairpins and cloverleaves of a given length with minimal hairpin loop length m(m > 0) and with minimal stack length l(l > 0). In this paper, we will consider other structure elements of RNA secondary structure. We computed the total number of RNA secondary structures of a given length, as a generation of the above paper [2]. In Section 2 we introduce the definition of secondary structure and structural elements, and the basic recursion. Section 3 presents the recursion formulas for the exact enumeration of various types of constrained secondary structures as well as their structural elements. In Section 4 asymptotics to these recursions are derived.

2. Secondary structures and structural elements 2.1. Definitions Definition 2.1 (Penner and Waterman [3]). A secondary structure is a vertex-labelled graph on n vertices with an adjacency matrix A fulfilling (i) ai,i+1 = 1 for 1 6 i 6 n  1, (ii) For each i there is at most a single k 5 i  1, i + 1 such that ai,k = 1, (iii) If ai,j = ak,l = 1 and i < k < j then i < l < j. We will call an edge (i, k), ji  kj 5 1 a base pair. A vertex i connected only to i  1 and i + 1 will be called unpaired. A vertex i is said to be interior the base pair (k, l), if k < i < l. If, in addition, there is no base pair (p, q) such that k < p < i < q, we will say that i is immediately interior to the base pair (k, l). Definition 2.2 [2]. Suppose A is the adjacency matrix for a secondary structure of size n. A secondary structure consists of the following structure elements (Fig. 1): (i) A stack consists of subsequent base pairs (p  k, q + k), (p  k + 1,q + k  1), . . ., (p, q) such that neither (p  k  1, q + k + 1) nor (p + 1, q  1) is a base pair. k + 1 is the length of the stack, (p  k, q + k) is the terminal base pair of the stack.

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

71

(ii) The sequence i + 1, i + 2, . . ., j  1 is a loop, if i + 1, i + 2, . . ., j  1 are all unpaired and ai,j = 1. The pair (i, j) is said to the foundation of the loop. (iii) The sequence i + 1, i + 2, . . ., j  1 is a bulge, if i + 1, i + 2, . . ., j  1 are all unpaired i and j are both paired, and ai,j 5 1. (iv) An external vertex is an unpaired vertex which does not belong to a loop. A collection of adjacent external vertices is called an external element. If it contains the vertex 1 or n it is a free end, otherwise it is called joint. (v) A tail is a sequence 1, 2, . . ., j resp. j, j + 1, . . ., n where 1, 2, . . ., j resp. j, j + 1, . . ., n are unpaired and j + 1 resp. j  1 is paired. (vi) A hairpin is the longest sequence i + 1, i + 2, . . ., j  1 containing exactly one loop such that ai+1,j1 = 1 and ai,j = 0. The paired points i + 1 and j  1 will be called the foundation of the hairpin. Definition 2.3 [2]. A stack [(p, q), . . ., (p + k, q  k)] is called terminal if p = 1 or q = n or if the two vertices p  1 and q + 1 are not interior to any base pair. The substructure enclosed by the terminal base pair (p, q) of a terminal stack will be called a component of u. We will say that a structure on n vertices has a terminal base pair if (1, n) is a base pair. From the combinatorial point of view it makes perfect sense to consider the general problem with a minimum number m (m > 0) of unpaired vertices in each hairpin loop and with minimal stack length l (l > 0). Throughout the remainder of this paper we will consider the general problem.

Fig. 1. Multibranch loop; hairpin; stack.

72

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

2.2. The basic recursion Let W(n) be the number of structures with minimal stack length l, and let W*(n) be the number of structures on n digits which have only stacks of length at least l if an additional terminal base pair is attached. Furthermore, let W**(n) be the number of structures on n digits which have all stacks of length at least l for which (1, n) is not a base pair. These three numbers fulfill for l > 1 the coupled recursions Wnþ1 ðlÞ ¼ Wn ðlÞ þ

n1 X

Wk ðlÞWnk1 ðlÞ;

n P m þ 2l;

k¼mþ2l2

Wn ðlÞ ¼

ðnmÞ=2 X

W n P m þ 2l; n2k ðlÞ

k¼l1

W n ðlÞ

¼ Wn ðlÞ  Wn2 ðlÞ;

W nþ1 ðlÞ ¼ Wn ðlÞ þ

n2 X

n P m þ 2l; Wk ðlÞWnk1 ðlÞ;

n P m þ 2l;

k¼mþ2l2 1; Wn ðlÞ ¼

Wn ðlÞ ¼ W 0; n < m þ 2l: n ðlÞ ¼ P1 P P1 1 n Let wðxÞ ¼ n¼0 Wn ðlÞxn , /ðxÞ ¼ n¼0 Wn ðlÞxn , hðxÞ ¼ n¼0 W n ðlÞx , tm ðxÞ ¼

m1 X k¼0

xk ; sm ðxÞ ¼

m1 X

kxk :

k¼1

P n Lemma 2.1 [2]. Suppose yn P 0 and yðxÞ ¼ 1 n¼0 y n x is of the form   x x ; yðxÞ ¼ bðxÞ þ 1  a where a > 0 is real, b(x) are analytic near a and x is real but not a non-negative integer. If y(x) is analytic for j x j < a and x = a is the only singularity of y on its circle of convergence, then  n gðaÞ 1x 1 : n yn CðxÞ a Lemma 2.2 [2]. Let /(x, y) be analytic for j x j < a + d and j y j < b(a) + d, d > 0. Suppose y is of the form  x1=2 gðxÞ: yðxÞ ¼ bðxÞ þ 1  a P n Let zðxÞ ¼ 1 n¼0 zn x be a generating function of the form z = U(x, y). Then zn ¼ Uy ða; bðaÞÞ: lim n!1 y n Lemma 2.3 [2]. Let U(x, y) be analytic as in Lemma 2.2, suppose y is the form  x1=2 yðxÞ ¼ bðxÞ þ 1  gðxÞ: a

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

Let zðxÞ ¼

P1

zðxÞ ¼

n n¼0 zn x

73

be a generating function of the form

1 Uðx; yÞ; ab  xy

where b = b(a), then zn 2Uða; bÞn ; ag2 ðaÞ yn where an bn means limn!1 abnn ¼ 1. Lemma 2.4 [2]. The generating function w, /, h fulfills the coupled functional equations w ¼ 1 þ xw þ x2 /w; x2ðl1Þ ðh  tm ðxÞÞ; 1  x2 h ¼ w  x2 /:



n pffiffi n3=2 ð1 Þ , where a is the smallest positive solution of p(x) = Lemma 2.5 [2]. Wn gðaÞ a 2 p

[(1  x)(1  x2 + x2l) + x2ltm(x)]2  4x2l(1  x2 + x2l) = 0 that satisfies sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi 1 dpðxÞ gðaÞ ¼ 2l a 6¼ 0: x dx a Throughout the remainder of this paper we will assume that a denote the solutions of Eqs. p(x) = 0 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi and b ¼ a1l 1  a2 þ a2l . Lemma 2.6 [2]. If l = 1, then tm ðaÞ ¼

3a  1 ; a2

sm ðaÞ ¼

3a  1 ð1  2aÞ2 m 2 ; að1  aÞ a ð1  aÞ

g2 ðaÞ ¼

Theorem 2.7 lim

n!1

Wn ðlÞ 1 : ¼ 2 Wn ðlÞ a b2

Proof. From Lemma 2.4 we obtain /¼

wð1  xÞ  1 : x2 w

Application of Lemma 2.2 immediately yields the desired result. h

ð1  2aÞð2 þ m  2maÞ : ð1  aÞa3

74

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

Theorem 2.8 lim

n!1

W 1  a2 n ðlÞ : ¼ Wn ðlÞ a2l b2

Proof. From Lemma 2.4 we obtain h¼

1  x2 ð1  x2 Þ½wð1  xÞ  1 / þ t ðxÞ ¼ þ tm ðxÞ: m x2l w x2ðl1Þ

Application of Lemma 2.2 completes the proof. h

3. Recursion Theorem 3.1. Let Jn(b, l) denote the number of structures on n vertices with exactly b components, then n1 X Wk ðlÞJ nk1 ðb  1; lÞ; n P m þ 2l J nþ1 ðb; lÞ ¼ J n ðb; lÞ þ k¼mþ2l2

J n ðb; lÞ ¼ 0;

b > 0; n < m þ 2l;

J n ð0; lÞ ¼ 1;

n P 0:

Proof. Because the number Jn+1(b, l) of structures on n + 1 vertices with b components can be computed as follows: (i) adding an unpaired digit to a structure on n vertices does not change the number of components, we obtain Jn(b, l) structures with b components; (ii) introducing an additional pair makes the bracketed part of length k a single component and does not affect the remainder of the sequence. There are b  1 components in the remainder, we have Wk ðlÞ times all the structures with b  1 components in the reminder of the sequence. Summing over k, we can get the recursion. h Theorem 3.2. Let En(b, l) denote the number of structures on n vertices with exactly b external digits, then n1 X Wk ðlÞEnk1 ðb; lÞ; Enþ1 ðb; lÞ ¼ En ðb  1; lÞ þ k¼mþ2l2

Enþ1 ð0; lÞ ¼

n1 X

Wk ðlÞEnk1 ð0; lÞ;

n P m þ 2l; n  2l  1 P b > 0;

k¼mþ2l2

En ðb; lÞ ¼ 0;

b > n  2l  1; or; n < m þ 2l;

En ðb; lÞ ¼ l; b ¼ n  2l  1:

Proof. The number En+1(b, l) of structures with b external digits can be computed as follows: (i) Adding an unpaired base to each structure on n digits, because adding a external digit, we obtain En(b  1, l) structures with b external digits.

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

75

(ii) Inserting a base pair (1, k + 2) makes the bracketed part length k no external digit, we have Wk ðlÞ times all the structures with b external digits in the reminder of the sequence. Summing over k, we can get the recursion. h Theorem 3.3. Let En(l) be the number of external digits, then n1 X Enþ1 ðlÞ ¼ En ðlÞ þ Wn ðlÞ þ Wk ðlÞEnk1 ðlÞ; n P m þ 2l; k¼mþ2l2

En ðlÞ ¼ 0;

n < m þ 2l:

Proof. The number En+1(l) of external digits can be computed as follows: (i) Adding an unpaired base to each structure on n digits, we obtain En(l) external digits plus the Wn(l) newly added ones. (ii) Inserting a base pair (1, k + 2), because there is no external digits in the newly bracketed part of length k, we have Wk ðlÞ times all the external digits in the reminder of the sequence. Summing over k, we can get the recursion. h Theorem 3.4. Let In(l) denote the number of components in the set of all secondary structure on n vertices, then n1 X Wk ðlÞ½I nk1 ðlÞ þ Wnk1 ðlÞ; n P m þ 2l; I nþ1 ðlÞ ¼ I n ðlÞ þ k¼mþ2l2

I n ðlÞ ¼ 0;

n < m þ 2l:

Proof (i) Adding an unpaired base to each structure on n digits does not change the number of components, we obtain In(l) components. (ii) Inserting a base pair (1, k + 2), we have Wk ðlÞ times all the components in the reminder of the sequence plus the number of structures that can be formed the remainder of the structure. Summing over k, we can get the recursion. h Theorem 3.5. Let Un(l) denote the number of unpaired bases in the set of all structures on n vertices, then n1 X ½Wk ðlÞU nk1 ðlÞ þ Wnk1 ðlÞU k ðlÞ; n P m þ 2l; U nþ1 ðlÞ ¼ ½U n ðlÞ þ Wn ðlÞ þ l1 X U n ðlÞ ¼ ðn  2pÞ;

k¼mþ2l2

n < m þ 2l:

p¼0

Proof. Because the total number of Un+1 of unpaired bases in the set of all structures with n + 1 bases can be computed as follows: (i) Adding an unpaired base to each structure on n digits we obtain their Un(l) unpaired digits plus the Wn(l) newly added ones.

76

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

(ii) Introducing a base pair (1, k + 2) we have Wk ðlÞ times all the unpaired digits in the reminder of the sequence plus all the unpaired digits in the newly bracketed part of length k times the number of structures that can be formed from the reminder of the structure. Summing over k we can get the recursion. h Theorem 3.6. Let Pn(l) denote the number of base pairs on n vertices with, then n1 X

P nþ1 ðlÞ ¼ P n ðlÞ þ

Wk ðlÞP nk1 ðlÞ þ Wnk1 ðlÞ½P k ðlÞ þ Wk ðlÞ;

n P m þ 2l;

k¼mþ2l2

P n ðlÞ ¼

l1 X

p;

n < m þ 2l:

p¼0

Proof (i) Adding an unpaired base to each structure on n digits, we obtain Pn(l) base pairs. (ii) Inserting a base pair (1, k + 2), we have Wk ðlÞ times all the base pairs in the reminder of the sequence plus all base pairs within the newly formed base pair times the number of structures in the tail. Summing over k, we can get the recursion. h

Theorem 3.7. Let Nn(l) be the number of stacks on n vertices, then n1 X

N nþ1 ðlÞ ¼ N n ðlÞ þ

fWk ðlÞN nk1 ðlÞ þ Wnk1 ðlÞ½N k ðlÞ þ Wk ðlÞg

k¼mþ2l2 n1 X



Wk2 ðlÞWnk1 ðlÞ;

n P m þ 2l;

N n ðlÞ ¼ 0;

n < m þ 2l:

k¼mþ2l2

Proof. Because the number Nn+1(l) of stacks in the set of structures on n + 1 digits consists of all stacks on n digits plus all stacks in the tail times the number of structures with the newly introduced base pair plus all stacks within the newly formed base pair times the number of structures in the tail. The newly formed base pair introduces an additional stack for all the Wk ðlÞ  Wk2 ðlÞ structures in its interior without a terminal base pair. (For the Wk2 ðlÞ structures with terminal base pair a stack is elongated.) h Theorem 3.8. Let Qn(b, l) denote the number of loops with b unpaired digits in the set of all secondary structures, then Qnþ1 ðb; lÞ ¼ Qn ðb; lÞ þ

n1 X

fWk ðlÞQnk1 ðb; lÞ þ Wnk1 ðlÞ½Qk ðb; lÞ þ Ek ðb; lÞg;

k¼mþ2l2

n P m þ 2l; Qn ðb; lÞ ¼ 0;

b > 0; n < m þ 2l:

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

77

Proof. For n + 1 vertices we retain all loops from the set of loops on n digits by adding a vertex to the 3 0 end. In addition, we have to count all loops in the tail-substructure for each possible structure that lies interior to the new base pair. The third contribution consists of an loops interior to the new base pair times all possible structures in the tail, A loop with b unpaired vertices remains unchanged and each structure with exactly b external vertices within the new base pair gives rise to an additional loop with b unpaired digits.

4. Asymptotic enumeration Theorem 4.1. The number of structures with b components Jn(b, l), fulfills b1

lim

n!1

J n ðb; lÞ b½bð1  aÞ  1 : ¼ bþ1 Wn ðlÞ bbþ1 ð1  aÞ

P1 Proof. Let jb ðxÞ ¼ n¼0 J n ðb; lÞxn be the generating function for the number of secondary structures with exactly b components. It is straightforward to derive

2 b x2 /ðxÞ x /ðxÞ j ðxÞ ¼ j0 ðxÞ jb ðxÞ ¼ xjb ðxÞ þ x /ðxÞjb1 ðxÞ ¼ 1  x b1 1x 2

1 . Using the functional equation for /(x), some compuand from Jn(0, l) = 1, we obtain j0 ðxÞ ¼ 1x b 1 wð1xÞ1 tations yield jb ðxÞ ¼ 1x ½ wð1xÞ  . From Lemma 2.2, we find that

lim

n!1

J n ðb; lÞ b½bð1  aÞ  1b1 : ¼ Wn ðlÞ bbþ1 ð1  aÞbþ1

 2

a 12a b1 Corollary 1. If l = 1, then limn!1 JWn ðb;1Þ ¼ b ð1aÞ . 2 ð 1a Þ n ð1Þ

Theorem 4.2. The number of structures with b external digits, En(b, l), fulfills  b En ðb; lÞ bþ1 ab : lim ¼ n!1 Wn ðlÞ ð1 þ abÞ2 1 þ ab P n Proof. Let eb ðxÞ ¼ 1 n¼0 E n ðb; lÞx be the generating function for the number of secondary structures with exactly b external digits. From theorem yields the functional equation eb ðxÞ ¼ xeb1 ðxÞ þ x2 /ðxÞeb ðxÞ ¼

b x x ðxÞ ¼ e0 ðxÞ e b1 1  x2 /ðxÞ 1  x2 /ðxÞ

substituting the functional equation for /, we obtain e0 ðxÞ ¼ 1x21/ðxÞ. Therefore

b wðxÞx wðxÞ : eb ðxÞ ¼ 1 þ xwðxÞ 1 þ xwðxÞ

78

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

From Lemma 2.2, we find that  b En ðb; lÞ bþ1 ab : lim ¼ 2 n!1 Wn ðlÞ ð1 þ abÞ 1 þ ab



¼ bþ1 ð1 Þb . Corollary 2. If l = 1, then limn!1 EWn ðb;1Þ 4 2 n ð1Þ Theorem 4.3. The number of external digits En(l), fulfills lim

n!1

En ðlÞ ¼ 2ab: Wn ðlÞ

P n Proof. Let eðxÞ ¼ 1 n¼0 En ðlÞx be the generating function for the number of external digits. From the recursion about En(l), we can obtain eðxÞ ¼ xeðxÞ þ xwðxÞ þ x2 /ðxÞeðxÞ: Using the functional equation for /(x), we find that eðxÞ ¼

xwðxÞ ¼ xw2 ðxÞ: 1  x  x2 /ðxÞ

Application of Lemma 2.2, immediately yields the desired result. h Corollary 3. If l = 1, then limn!1 WEnnð1Þ ¼ 2. ð1Þ Theorem 4.4. The number of components In(l), fulfills lim

n!1

I n ðlÞ ¼ 2bð1  aÞ  1: Wn ðlÞ

P n Proof. Let iðxÞ ¼ 1 n¼0 I n ðlÞx be the generating function for the number of components. From the recursion about In(l), we can obtain iðxÞ ¼ xiðxÞ þ x2 /ðxÞiðxÞ þ x2 /ðxÞwðxÞ: Using the functional equation for /(x), we find that iðxÞ ¼

x2 /ðxÞwðxÞ ¼ w2 ðxÞð1  xÞ  wðxÞ: 1  x  x2 /ðxÞ

Application of Lemma 2.2, immediately yields the desired result. h Corollary 4. If l = 1, then limn!1 WI nnð1Þ ¼ 2a  3. ð1Þ

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

79

Theorem 4.5. The number of unpaired digits Un(l), fulfills lim

n!1

U n ð1Þ 2ab þ 2la2 b½ðl  1Þtmþ2l2 ðaÞ  smþ2l2 ðaÞ ¼ ; Wn ð1Þ ð1  a2 b2 Þ2

U n ðlÞ 2a þ mð1  2aÞ n; Wn ðlÞ 2 þ m  2ma

l > 1;

l ¼ 1:

P n Proof. Let uðxÞ ¼ 1 n¼0 U n ðlÞx be the generating function for the number of unpaired digits, we find immediately the functional equation uðxÞ ¼ xuðxÞ þ xwðxÞ þ x2 /ðxÞuðxÞ þ x2 wðxÞuðxÞ  x2 l½smþ2l2 ðxÞ  ðl  1Þtmþ2l2 ðxÞwðxÞ: Using the functional equation for /(x), some computations yield uðxÞ ¼ ¼

xwðxÞ  x2 l½smþ2l2 ðxÞ  ðl  1Þtmþ2l2 ðxÞwðxÞ 1  x  x2 /ðxÞ  x2 wðxÞ xw2 ðxÞ  x2 l½smþ2l2 ðxÞ  ðl  1Þtmþ2l2 ðxÞw2 ðxÞ : 1  x2 w2 ðxÞ

If l = 1, application of Lemma 2.3, else if l > 1, application of Lemma 2.2 completes the proof. h Theorem 4.6. The number of base pairs, Pn(l), fulfills lim

n!1

P n ð1Þ 3a2 b2  a2 b½2 þ lðl  1Þtmþ2l2 ðaÞ  a4 b4 ¼ ; Wn ð1Þ ð1  a2 b2 Þ2

P n ðlÞ ð1  aÞn ; Wn ðlÞ 2 þ m  2ma

l > 1;

l ¼ 1:

P n Proof. Let p0 ðxÞ ¼ 1 n¼0 P n ðlÞx be the generating function for the number of base pairs. From the recursion about Pn(l), we find immediately the functional equation p0 ðxÞ ¼ xp0 ðxÞ þ x2 w2 ðxÞ þ x2 /ðxÞp0 ðxÞ þ x2 wðxÞp0 ðxÞ  x2 tmþ2l2 ðxÞwðxÞ x2 lðl  1Þ tmþ2l2 ðxÞwðxÞ: 2 Using the functional equation for /(x), some computations yield 

2

x2 w2 ðxÞ  x2 tmþ2l2 ðxÞwðxÞ  x lðl1Þ tmþ2l2 ðxÞwðxÞ 2 p ðxÞ ¼ 2 2 1  x  x /ðxÞ  x wðxÞ 0

¼

2x2 w3 ðxÞ  x2 tmþ2l2 ðxÞw2 ðxÞ½2 þ lðl  1Þ : 2½1  x2 w2 ðxÞ

If l = 1, application of Lemma 2.3, else if l > 1, application of Lemma 2.2 completes the proof. h

80

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

Theorem 4.7. The number of stacks, Nn(l), fulfills lim

n!1

N n ðlÞ a4 b2 ð1  b2 Þ þ 3a2 b2  2a2 b½tmþ2l2 ðaÞ þ 1  a þ a2 ; ¼ 2 Wn ðlÞ ð1  a2 b2 Þ

N n ðlÞ ð1  aÞ2 ð1 þ aÞ n; Wn ðlÞ 2 þ m  2ma

l > 1;

l ¼ 1:

P n Proof. Let vðxÞ ¼ 1 n¼0 N n ðlÞx be the generating function for the number of stacks. From the recursion about Nn(l), we find immediately the functional equation vðxÞ ¼ xvðxÞ þ x2 w2 ðxÞ þ x2 /ðxÞvðxÞ þ x2 wðxÞvðxÞ  x4 /ðxÞwðxÞ  x2 tmþ2l2 ðxÞwðxÞ: Using the functional equation for /(x), some computations yield vðxÞ ¼ ¼

x2 w2 ðxÞ  x4 /ðxÞwðxÞ  x2 tmþ2l2 ðxÞwðxÞ 1  x  x2 /ðxÞ  x2 wðxÞ x2 w3 ðxÞ  x2 w2 ðxÞð1  xÞ þ x2 wðxÞ  x2 tmþ2l2 ðxÞw2 ðxÞ : 1  x2 w2 ðxÞ

If l = 1, application of Lemma 2.3, else if l > 1, application of Lemma 2.2 completes the proof. h Theorem 4.8. The number of loops with b unpaired digits, Qn(b, l), fulfills

bþ2   Qn ðb; lÞ 1 ab 2 þbþ1 ; lim ¼ n!1 Wn ðlÞ 1  ab 1  a2 b2 1 þ ab Qn ðb; lÞ að1  aÞ n  bþ1 ; Wn ðlÞ ð1  2aÞð2 þ m  2maÞ 2

l > 1;

l ¼ 1:

P n Proof. Let qb ðxÞ ¼ 1 n¼0 Qn ðb; lÞx be the generating function for the number of loops with exactly b unpaired digits. From Theorem 3.7, we obtain qb ðxÞ ¼ xqb ðxÞ þ x2 /ðxÞjqb ðxÞ þ x2 wðxÞ þ x2 wðxÞeb ðxÞ ¼

x2 wðxÞeb ðxÞ : 1  x  x2 /ðxÞ  x2 wðxÞ

Using the functional equation for /(x) and eb(x). Some computation yield

b x2 w2 ðxÞ xwðxÞ wðxÞ : qb ðxÞ ¼ 2 1  x2 w ðxÞ 1 þ xwðxÞ 1 þ xwðxÞ If l = 1, application of Lemma 2.3, else if l > 1, application of Lemma 2.2 completes the proof. h

B. Liao, T.-m. Wang / Mathematical Biosciences 191 (2004) 69–81

81

5. Conclusion If l = 1 then ab = 1, wn(l) = Sn, where Sn denote the number of RNA secondary structures of a given length n with minimal hairpin loop length m (m > 0), we can immediately conclude that our results from Theorems 4.1–4.8 are the generation of the results in paper [2].

References [1] B. Liao, T.-M. Wang, General combinatorics of RNA hairpins and cloverleaves, J. Chem. Inf. Comput. Sci. 43 (4) (2003) 1138. [2] I.L. Hofacker, P. Schuster, P.F. Stadler, Combinatorics of RNA secondary structures, Discrete Appl. Math. 88 (1998) 207. [3] R.C. Penner, M.S. Waterman, Spaces of RNA secondary structures, Adv. Math. 101 (1993) 31. [4] W.R. Schnitt, M.S. Waterman, Linear trees and RNA secondary structure, Discrete Math. 12 (1994). [5] P.R. Stein, M.S. Waterman, On some new sequences generalizing the Catalan and Motzkin numbers, Discrete Math. 26 (1978) 261. [6] M.S. Waterman, Secondary structure of single-stranded nucleic acids, Adv. Math. Suppl. Studies 1, Hall, London, 1995. [7] M.S. Waterman, Introduction to Computational Biology: Maps: Sequences and Genomes, Chapman and Hall, London, 1995. [8] M.S. Waterman, T.F. Smith, Combinatorics of RNA hairpins and cloverleaves, Stud. Appl. Math. 60 (1978) 91. [9] M.S. Waterman, T.F. Smith, RNA secondary structure: a complete mathematical analysis, Math. Biosci. 42 (1978) 257. [10] W.N. Hsich, Proportions of irreducible diagrams, Stud. Appl. Math. 52 (1973) 227. [11] D. Kleitman, Proportions of irreducible diagrams, Stud. Appl. Math. 49 (1970) 297. [12] P.R. Stein, On a class of linked diagrams, I. Enumeration, J. Comb. Theory A 24 (1978) 357. [13] P.R. Stein, C.J. Everett, On a class of linked diagrams, II. Asymptotics, Discrete Math. 22 (1978) 309. [14] J. Touchard, Surune problene de configurations et sur les fractions continues, Can. J. Math. 4 (1952) 2. [15] A.M. Odlyzko, Asymptotic enumeration methods, in: R.L. Graham, M. Grotschel, L. Lovasz (Eds.), Handbook of Combinatorics, Vol. II, Elsevier, Amsterdam, 1995, p. 1021. [16] A. Meir, J.W. Moon, On an asymptotic method in enumeration, J. Comb. Theory A 51 (1989) 77. [17] J.A. Howell, T.F. Smith, M.S. Waterman, Computation of generating functions for biological sequences, SIAM. J. Appl. Math. 39 (1980) 119. [18] E.A. Bender, Asymptotic methods in enumeration, SIMA Rev. 16 (1974) 485. [19] M. Regnier, F. Tahi, Enumeration and asymptotics in computational biology, in: Mathematical Analysis for Biological Sequences Workshop Trondheim, Norway, 1996. [20] X.G. Vienmt, M.V. de chaumont, Enumeration of RNAs secondary structures by complexity, in: V. Capasso, E. Grosso, S.L. Paver-Fontana (Eds.), Mathematics in Medicine and Biology, Lect. Notes in Biomath, Vol. 57 Springer, Berlin, 1985, p. 360. [21] M. Zuker, D. Sank off, RNA secondary structures and their prediction, Bull. Math. Biol. 46 (4) (1984) 591. [22] M.S. Waterman, T.F. Smith, Rapid dynamic programming algorithms for RNA Secondary structure, Adv. Appl. Math. 7 (1986) 455.