Information Processing Letters 81 (2002) 289–296
Algorithms for transitive closure
A. Koubková a,∗, V. Koubek b,1
a Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 118 00 Praha 1, Czech Republic
b Department of Theoretical Computer Science and Mathematical Logic, and Institute of Theoretical Computer Science, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 118 00 Praha 1, Czech Republic
Received 24 October 2000; received in revised form 21 June 2001
Communicated by L. Boasson
Abstract
Let $\sigma(n)$ denote the number of all strongly connected graphs on an $n$-element set. We prove that $\sigma(n) \ge 2^{n^2} \cdot (1 - n(n-1)/2^{n-1})$. Hence the algorithm computing a transitive closure by a reduction to acyclic graphs has expected time $O(n^2)$ under the assumption of a uniform distribution of input graphs. Furthermore, we present a new algorithm constructing the transitive closure of an acyclic graph. © 2002 Elsevier Science B.V. All rights reserved.
Keywords: Computational complexity; Graph algorithms; Algorithms; Transitive closure
* Corresponding author. Financial support of the Grant Agency of the Czech Republic under grant no. 201/98/1451 is acknowledged.
E-mail addresses: [email protected] (A. Koubková), [email protected] (V. Koubek).
1 The second author gratefully acknowledges the support of the project LN00A056 of the Ministry of Education of the Czech Republic.
0020-0190/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0020-0190(01)00245-9

1. Introduction
For a directed graph G = (X, R) (the elements of X are called nodes of G, the elements of R are called arcs of G), let $G^+ = (X, R^+)$ denote the transitive closure of G, and let $G^* = (X, R^*)$ denote the reflexive and transitive closure of G. The transitive closure and the reflexive and transitive closure form a basis of many constructions in various fields of mathematics and computer science. Therefore many papers have been devoted to algorithms constructing the transitive closure. Roughly speaking, we can divide these algorithms into two groups according to the tools used to produce the transitive closure.

The first group utilizes a known relation between paths of a given length and powers of the adjacency matrix. Therefore the basic operation for these algorithms is Boolean matrix multiplication and graphs are represented by adjacency matrices, see [2]. The basic facts about these algorithms can be found in the monographs by Aho, Hopcroft and Ullman [1], or Mehlhorn [4].

The algorithms from the second group are based on graph search techniques. The input graph (X, R) for these algorithms is usually given by a simple list X of nodes and a family {xR | x ∈ X} of simple lists of direct successors (here xR = {y ∈ X | (x, y) ∈ R}). Sometimes it is suitable to also have a family {Rx | x ∈ X} of simple lists of direct predecessors (here Rx = {y ∈ X | (y, x) ∈ R}). The most successful algorithms using this concept are due to Schnorr, see [5], and Simon, see [6]. Under the assumption of uniform distribution of input graphs, Schnorr's algorithm constructs
the transitive closure of a graph with n nodes and m arcs in expected time O(E(n, m)), where E(n, m) is the expected number of arcs in the transitive closure of the input graph. The worst-case time complexity of Schnorr's algorithm is O(nm). Simon's algorithm constructs the transitive closure only for acyclic graphs. The expected time complexity of Simon's algorithm for a random acyclic graph with n nodes and a probability p is $O(n^2)$ whenever $p \ge (\log^2 n)/n$ and $O(n^2 \log\log n)$ whenever $p < (\log^2 n)/n$. The worst-case time complexity of Simon's algorithm is better than the time complexity of Schnorr's algorithm, see [6]. Simon's algorithm is a modification of the algorithm suggested by Goralčíková and Koubek in [3]. For this algorithm it was proved that its expected time for a random graph with n nodes and a probability $p \ge (\log n)/n$ is $O(n^2)$ [7].

The relation between the reflexive and transitive closure of a general graph and the reflexive and transitive closure of an acyclic graph is given by the well-known meta-algorithm TRCL. To describe TRCL we recall several notions. For a graph G = (X, R), let C(G) denote the set of all strongly connected components of G and, for a node x ∈ X, let C(x) denote the strongly connected component of G containing x.

Meta-algorithm TRCL
Input: the input graph G = (X, R).
Output: the reflexive and transitive closure $G^* = (X, R^*)$ of G.
(1) Construct the set C(G) of all strongly connected components of G.
(2) Compute the graph S(G) = (C(G), C(R)) where $C(R) = \{(C, D) \mid C \ne D,\ C, D \in C(G),\ \exists c \in C \text{ and } d \in D \text{ with } (c, d) \in R\}$.
(3) Compute the reflexive and transitive closure $S(G)^* = (C(G), C(R)^*)$ of the graph S(G).
(4) Construct the set $R^* = \{(a, b) \mid (C(a), C(b)) \in C(R)^*\}$.
(5) The reflexive and transitive closure of G is $G^* = (X, R^*)$.

First we sketch the analysis of TRCL. A verification of the correctness of TRCL is straightforward, see [3]. By Tarjan [8], step (1) requires O(|X| + |R|) time in the worst case.
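The five steps of TRCL can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the names `tarjan_scc` and `trcl` and the dictionary-of-lists graph representation are assumptions made here, the component search is a plain recursive version of Tarjan's algorithm (so it is suitable for small graphs only), and step (3) is filled in with a simple set-union closure that stands in for whichever acyclic-closure procedure is plugged in.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable


def tarjan_scc(nodes: List[Node], succ: Dict[Node, List[Node]]) -> Tuple[Dict[Node, int], int]:
    """Step (1): number the strongly connected components (recursive Tarjan).

    Components are numbered in the order they are completed, so every arc
    between two different components goes from a higher to a lower number.
    """
    index: Dict[Node, int] = {}
    low: Dict[Node, int] = {}
    stack: List[Node] = []
    on_stack: Set[Node] = set()
    comp: Dict[Node, int] = {}
    count = 0

    def visit(v: Node) -> None:
        nonlocal count
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp[w] = count
                if w == v:
                    break
            count += 1

    for v in nodes:
        if v not in index:
            visit(v)
    return comp, count


def trcl(nodes: List[Node], succ: Dict[Node, List[Node]]) -> Dict[Node, Set[Node]]:
    """Return, for every node a, the set {b | (a, b) in R*}."""
    comp, count = tarjan_scc(nodes, succ)                    # step (1)
    csucc: List[Set[int]] = [set() for _ in range(count)]    # step (2): S(G)
    for v in nodes:
        for w in succ.get(v, ()):
            if comp[v] != comp[w]:
                csucc[comp[v]].add(comp[w])
    creach: List[Set[int]] = []                              # step (3): closure of S(G)
    for c in range(count):                                   # sinks come first
        reach = {c}
        for d in csucc[c]:
            reach |= creach[d]                               # d < c, already computed
        creach.append(reach)
    members: List[List[Node]] = [[] for _ in range(count)]
    for v in nodes:
        members[comp[v]].append(v)
    # steps (4) and (5): expand component-level reachability back to nodes
    return {v: {w for d in creach[comp[v]] for w in members[d]} for v in nodes}
```

For example, `trcl([0, 1, 2, 3], {0: [1], 1: [0, 2], 2: [3], 3: []})` maps both 0 and 1 to the whole node set, since they lie in the same component.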
Clearly, step (2) also requires O(|X| + |R|) time in the worst case. Steps (4) and (5) require $O(|X| + |R^*|)$ time in the worst case. Thus the time complexity of the meta-algorithm TRCL depends on the time complexity of step (3). Observe that the graph S(G) is acyclic; thus we reduce the construction of the reflexive and transitive closure of general graphs to the construction of the reflexive and transitive closure of acyclic graphs. The following theorem formalizes this fact.

Theorem 1. If step (3) uses O(ϕ(|X|, |R|)) time in the worst case, where ϕ(n, m) is a function of two variables, then the time complexity of TRCL in the worst case is $O(\varphi(|X|, |R|) + |X| + |R^*|)$.

The present note has two goals. In Section 2 we estimate the number of strongly connected graphs on a set X. A consequence of this estimate is the fact that if step (3) of TRCL runs in polynomial time, then under the uniform distribution of input graphs the expected time of TRCL is $O(|X|^2)$. Thus we deduce that most of the algorithms constructing the reflexive and transitive closure behave similarly to Schnorr's algorithm. We also derive estimates of other statistical characteristics of TRCL.

Section 3 addresses the second goal of this note. We suggest a new algorithm for the construction of the transitive closure of acyclic graphs, using an idea similar to Simon's algorithm, or to that of Goralčíková and Koubek [3]. We also give an analysis of the new algorithm. We prove that the new algorithm is preferable to the algorithm of Goralčíková and Koubek, but its relation to Simon's algorithm is unknown.

A direct generalization of Theorem 1 to the expected time is impossible because the distribution of input graphs for step (3) is unknown. This motivates the first goal of this note: to give some properties of this distribution. The importance of this is clear because the expected time is a more natural and important characteristic of the behaviour of an algorithm than the time complexity in the worst case. In fact, we prove a more general result.
From statistical analysis it is known that the expected value is only one of the characteristics of a random variable. The full characterization of a random variable in statistical analysis is given by the knowledge of its moments. Yet the classical statistical models are not quite suitable for our situation because we compute only upper estimates of time complexity. Thus the deviation from the estimate in general is not important. In fact, if the time consumed by the algorithm on a given input is less than the estimate then this deviation is not interesting, but if the time consumed by the algorithm on a given input is greater than the estimate then the deviation is a concern. In this case we want to know, for example, the probability that this deviation occurs, or the value of this deviation, and so on. This leads us to the definition of the kth positive a-estimate moment as a replacement for the classical kth central moment. Its basic statistical properties relevant to the behaviour of algorithms will be given in a subsequent paper.

Definition. Let x be a nonnegative random variable whose expected value is less than or equal to a number a. For k > 1, the kth positive a-estimate moment of x is the expected value of the random variable $\max\{(x - a)^k, 0\}$.
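As a small illustration of the definition, the following Python sketch computes a positive a-estimate moment for a finite distribution. The function name and the representation of the distribution as (value, probability) pairs are assumptions made here. The sketch uses $(\max\{x - a, 0\})^k$, which coincides with the formula in the definition for odd k; for even k the literal formula would also count values below a, which the surrounding discussion suggests is not intended, so this reading is an assumption.

```python
from fractions import Fraction
from typing import Iterable, Tuple, Union

Number = Union[int, float, Fraction]


def positive_estimate_moment(dist: Iterable[Tuple[Number, Number]], a: Number, k: int) -> Number:
    """Expected value of max(x - a, 0)**k for a finite distribution.

    `dist` is an iterable of (value, probability) pairs; only the part of
    the distribution exceeding the estimate `a` contributes.
    """
    return sum(p * max(x - a, 0) ** k for x, p in dist)


# Hypothetical running times: 10 steps with probability 9/10, 100 steps otherwise.
# The expected value is 19, so a = 20 is an admissible estimate.
example = [(10, Fraction(9, 10)), (100, Fraction(1, 10))]
first = positive_estimate_moment(example, 20, 1)
second = positive_estimate_moment(example, 20, 2)
```

Only the rare slow run contributes to either moment; the frequent fast run, which lies below the estimate, contributes nothing.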
2. Strongly connected graphs
For a graph G = (X, R), let $G^0 = (X, R^0)$ denote the reflexive closure of G. Since G and $G^0$ have the same strongly connected components, we can restrict ourselves to the class of all reflexive graphs. (Observe that a computation of the reflexive closure requires O(|X| + |R|) time.) Let $\sigma(n)$ denote the number of strongly connected reflexive graphs on an n-element set, and let $\tau(n)$ denote the number of all reflexive graphs on an n-element set with at least two strongly connected components. Observe that the number of all reflexive graphs on an n-element set is $2^{n^2-n}$ and hence $\sigma(n) + \tau(n) = 2^{n^2-n}$ for all n.

Lemma 2. For all natural numbers n,
$$\tau(n) \le \sum_{k=1}^{n-1} \binom{n}{k}\, 2^{k^2-k}\, 2^{(n-k)^2-(n-k)}\, 2^{k(n-k)}.$$
Proof. Assume that G = (X, R) is a reflexive graph with |X| = n that is not strongly connected. Then there exists a strongly connected component C of G satisfying
(c) if y ∈ C and (x, y) ∈ R then x ∈ C.
Let Y be a non-empty proper subset of X with |Y| = k. Then there exist at most $2^{k^2-k}\, 2^{(n-k)^2-(n-k)}\, 2^{(n-k)k}$ reflexive graphs on X such that Y is a strongly connected component satisfying (c). Since there exist $\binom{n}{k}$ sets Y ⊆ X with |Y| = k, the proof is complete. ✷

Obviously,
$$\sum_{k=1}^{n-1} \binom{n}{k}\, 2^{k^2-k}\, 2^{(n-k)^2-(n-k)}\, 2^{k(n-k)} = \sum_{k=1}^{n-1} \binom{n}{k}\, 2^{n^2-n-nk+k^2} = 2^{n^2-n} \sum_{k=1}^{n-1} \frac{\binom{n}{k}}{2^{nk-k^2}}.$$
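Lemma 2 and the identity above can be checked by brute force for very small n. The following Python sketch is only a sanity check under stated assumptions: the function names are hypothetical, loops are treated as implicit (so a reflexive graph on n nodes is determined by its $n^2 - n$ possible non-loop arcs), and the enumeration is feasible only for n up to about 4.

```python
from fractions import Fraction
from itertools import product
from math import comb


def is_strongly_connected(n, arcs):
    """True if node 0 reaches every node and every node reaches node 0."""
    def reaches_all(adj):
        seen, stack = {0}, [0]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n

    fwd = [[] for _ in range(n)]
    bwd = [[] for _ in range(n)]
    for u, v in arcs:
        fwd[u].append(v)
        bwd[v].append(u)
    return reaches_all(fwd) and reaches_all(bwd)


def sigma_tau(n):
    """Brute-force (sigma(n), tau(n)) over all reflexive graphs on n nodes."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    sigma = sum(
        is_strongly_connected(n, [p for p, bit in zip(pairs, bits) if bit])
        for bits in product((0, 1), repeat=len(pairs))
    )
    return sigma, 2 ** len(pairs) - sigma


def lemma2_bound(n):
    """Right-hand side of Lemma 2."""
    return sum(
        comb(n, k) * 2 ** (k * k - k) * 2 ** ((n - k) ** 2 - (n - k)) * 2 ** (k * (n - k))
        for k in range(1, n)
    )


def simplified_bound(n):
    """The same sum in the simplified form 2^(n^2-n) * sum C(n,k) / 2^(nk-k^2)."""
    return 2 ** (n * n - n) * sum(
        Fraction(comb(n, k), 2 ** (n * k - k * k)) for k in range(1, n)
    )
```

For n = 3, the enumeration gives 18 strongly connected graphs out of 64, so $\tau(3) = 46$, while the bound of Lemma 2 evaluates to 96.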
The next technical lemma enables us to estimate this sum.

Lemma 3. If n and k are natural numbers with 0 < k < n and 4 ≤ n, then
$$\frac{\binom{n}{k}}{2^{(n-k)k}} \le \frac{n}{2^{n-1}}.$$

Proof. The statement is true for k = 1 and k = n − 1. From the definition of binomial coefficients it follows that
$$\frac{\binom{n}{k+1}}{2^{(n-k-1)(k+1)}} \cdot \frac{\frac{k+1}{n-k}}{2^{2k-n+1}} = \frac{\binom{n}{k}}{2^{(n-k)k}} = \frac{\binom{n}{k-1}}{2^{(n-k+1)(k-1)}} \cdot \frac{\frac{n+1-k}{k}}{2^{n-2k+1}} \qquad (1)$$
for all k with 1 < k < n − 1. If $1 < k \le \frac{1}{4}n$ then
$$\frac{\frac{n+1-k}{k}}{2^{n-2k+1}} \le \frac{n+1}{2^{n-n/2+2}} \le 1,$$
if $\frac{1}{4}n \le k < \frac{1}{2}n$ then
$$\frac{\frac{n-k+1}{k}}{2^{n-2k+1}} \le \frac{\frac{n-n/4+1}{n/4}}{8} = \frac{3 + \frac{4}{n}}{8} \le 1$$
because (n − k + 1)/k is a decreasing function of k, and if $k = \frac{1}{2}n$ then
$$\frac{\frac{n+1-n/2}{n/2}}{2^{n-2n/2+1}} = \frac{1 + \frac{2}{n}}{2} \le 1.$$
If $\frac{3}{4}n \le k < n - 1$ then
$$\frac{\frac{k+1}{n-k}}{2^{2k-n+1}} \le \frac{\frac{n-1}{2}}{2^{3n/2-n+1}} = \frac{n-1}{2^{n/2+2}} \le 1,$$
and if $\frac{1}{2}n < k \le \frac{3}{4}n$ then
$$\frac{\frac{k+1}{n-k}}{2^{2k-n+1}} \le \frac{\frac{3n/4+1}{n-3n/4}}{8} = \frac{3 + \frac{4}{n}}{8} \le 1$$
because (k + 1)/(n − k) is an increasing function of k. Thus two inductions over k complete the proof. ✷
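The inequality of Lemma 3 is easy to confirm numerically with exact integer arithmetic. The sketch below (function names are hypothetical) clears denominators, so that $\binom{n}{k}/2^{(n-k)k} \le n/2^{n-1}$ becomes $\binom{n}{k}\, 2^{n-1} \le n\, 2^{(n-k)k}$; it is a sanity check for a finite range of n, not a substitute for the proof.

```python
from math import comb


def lemma3_holds(n: int, k: int) -> bool:
    """Check C(n,k) / 2^((n-k)k) <= n / 2^(n-1) without any rounding."""
    return comb(n, k) * 2 ** (n - 1) <= n * 2 ** ((n - k) * k)


def check_lemma3(max_n: int) -> bool:
    """Check the inequality for all 4 <= n <= max_n and 0 < k < n."""
    return all(lemma3_holds(n, k) for n in range(4, max_n + 1) for k in range(1, n))
```

Equality holds at the endpoints k = 1 and k = n − 1, which is why the proof starts its two inductions there.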
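Hence we conclude that
$$\sum_{k=1}^{n-1} \frac{\binom{n}{k}}{2^{(n-k)k}} \le \sum_{k=1}^{n-1} \frac{n}{2^{n-1}} = \frac{(n-1)n}{2^{n-1}},$$
and summarize as follows.

Theorem 4. For every n ≥ 4,
$$\tau(n) \le \frac{n(n-1)}{2^{n-1}}\, 2^{n^2-n} \quad\text{and}\quad \sigma(n) \ge 2^{n^2-n} \left(1 - \frac{n(n-1)}{2^{n-1}}\right).$$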
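To get a feeling for how quickly the bound of Theorem 4 approaches 1, the fraction $1 - n(n-1)/2^{n-1}$ can be tabulated exactly. The following Python sketch uses a hypothetical function name; note that the bound is vacuous (not positive) for n = 4 and n = 5 and becomes informative only from n = 6 on.

```python
from fractions import Fraction


def strongly_connected_fraction_bound(n: int) -> Fraction:
    """Lower bound 1 - n(n-1)/2^(n-1) on sigma(n) / 2^(n^2-n) from Theorem 4."""
    return 1 - Fraction(n * (n - 1), 2 ** (n - 1))


# Exact values of the bound for a few sizes (stated for n >= 4 in Theorem 4).
table = {n: strongly_connected_fraction_bound(n) for n in (4, 6, 10, 20)}
```

Already for n = 20 the bound exceeds 0.999, which is the quantitative reason why graphs that are not strongly connected contribute so little to the expected running time.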
Thus the number of strongly connected graphs on an n-element set is greater than or equal to $2^{n^2}(1 - n(n-1)/2^{n-1})$. By Theorem 4, the probability that an input graph with n nodes is strongly connected, under the assumption of a uniform distribution of input graphs, is greater than $1 - n(n-1)/2^{n-1}$. We apply this fact to estimate the expected time of the meta-algorithm TRCL and the positive estimate moments of the expected time. Let n denote the number of nodes of an input graph.

Theorem 5. Assume that the distribution of input graphs is uniform and that the procedure constructing the transitive closure of acyclic graphs used in step (3) requires polynomial time. Then the expected time of TRCL is $O(n^2)$ and the kth positive $O(n^2)$-estimate moment is $O(1/2^{n/2})$.

Proof. In our analysis, we assume that any input graph is reflexive. Since the construction of the reflexive closure requires O(|X| + |R|) time and because for every reflexive graph G = (X, R) there exist exactly $2^{|X|}$ graphs of which G is the reflexive closure, we see that our result is not affected by this assumption. Assume that step (3) requires $O(|C(G)|^j)$ time for some natural number j, where C(G) is the set of all strongly connected components of an input graph G. Thus TRCL uses $O(|X| + |R^*| + |C(G)|^j) = O(|X| + |X|^2 + |X|^j) = O(n^{\max\{2,j\}})$ time if G is not strongly connected and $O(|X| + |X|^2) = O(n^2)$ time if G is strongly connected. Hence the expected time of TRCL is
$$\frac{1}{2^{n^2-n}} \left( \sigma(n)\, n^2 + \tau(n)\, n^{\max\{2,j\}} \right) \le \frac{1}{2^{n^2-n}} \left( n^2\, 2^{n^2-n} + \frac{n(n-1)}{2^{n-1}}\, 2^{n^2-n}\, n^{\max\{j,2\}} \right) \le n^2 + \frac{n^{4+j}}{2^{n-1}} = O(n^2).$$
Now we compute the kth positive $O(n^2)$-estimate moment. If the input graph G is strongly connected, then the contribution of G to the moment is 0. If G is not strongly connected and j > 2, then the contribution of G to the moment is $O(n^{jk})$, and if G is not strongly connected and j ≤ 2, then the contribution of G to the moment is also 0. Thus the kth positive $O(n^2)$-estimate moment is
$$\frac{1}{2^{n^2-n}} \left( 0 \cdot \sigma(n) + n^{jk}\, \tau(n) \right) \le \frac{n(n-1)}{2^{n-1}}\, n^{jk} \le \frac{n^{jk+2}}{2^{n-1}} = O\!\left(\frac{1}{2^{n/2}}\right). \quad ✷$$

To generalize Theorem 5, first observe that classical algorithms constructing the transitive closure of acyclic graphs use $O(n^3)$ time, where n is the number of nodes of an input graph. For a graph G = (X, R), let Pr(G) denote the probability that G is an input graph. Thus
$$\sum \{\Pr(G) \mid G \text{ is a graph on } X\} = 1$$
for all sets X.

Theorem 6. If the procedure constructing the transitive closure of acyclic graphs used in step (3) requires $O(n^3)$ time in the worst case, and if for all sufficiently large sets X
$$\sum \{\Pr(G) \mid G \text{ is a graph on } X \text{ with at most } |X|^{2/3} \text{ strongly connected components}\} \ge \frac{|X|^3 - |X|^2}{|X|^3},$$
then the expected time of TRCL is $O(n^2)$ and the kth positive $O(n^2)$-estimate moment is $O(n^{3k-1})$.
Proof. If G = (X, R) is a graph with at most $|X|^{2/3}$ strongly connected components, then TRCL requires $O(|X| + |R^*| + (|X|^{2/3})^3) = O(|X|^2)$ time, and if G = (X, R) has more than $|X|^{2/3}$ strongly connected components then TRCL runs in time $O(|X|^3)$. Thus the expected time of TRCL for graphs with n nodes is
$$O\!\left(n^2 + \left(1 - \frac{n^3 - n^2}{n^3}\right) n^3\right) = O(n^2 + n^2) = O(n^2).$$
To compute the kth positive $O(n^2)$-estimate moment we estimate the difference between the running time of TRCL and its estimate. If a graph G = (X, R) has at most $|X|^{2/3}$ strongly connected components, then the difference is at most 0; otherwise the difference is less than $O(|X|^3)$. Hence the kth positive $O(n^2)$-estimate moment for graphs with n nodes is
$$O\!\left(\left(1 - \frac{n^3 - n^2}{n^3}\right) n^{3k}\right) = O\!\left(\frac{n^2}{n^3}\, n^{3k}\right) = O(n^{3k-1}). \quad ✷$$

3. A new algorithm for acyclic graphs
We suggest a new algorithm constructing the reflexive and transitive closure of acyclic graphs. The algorithm exploits a special data structure, an extended successor list, which makes an effective calculation of the reflexive and transitive closure possible. Assume that G = (X, R) is an acyclic graph and let x ∈ X. Then a G-extended successor list of a node x (a G-ext-successor list of x, for short) is a pair $(L_x, \psi_x)$ where $L_x = \{y_0 = x, y_1, \ldots, y_k\}$ is a simple one-linked list representing the set $xR^* = \{y \in X \mid (x, y) \in R^*\}$ beginning with x, and $\psi_x$ is a mapping from the set {0, 1, ..., k} into the set {1, 2, ..., k + 1} such that
(s1) $i < \psi_x(i)$ for all i = 0, 1, ..., k;
(s2) if $(y_i, y_j) \in R^*$ then $j < \psi_x(i)$;
(s3) if $i \le j < \psi_x(i)$ then $(y_i, y_j) \in R^*$.
Observe that, by (s2) and (s3), any G-ext-successor list $(L_x, \psi_x)$ fulfills
(s4) if $i \le j < \psi_x(i)$ then $\psi_x(j) \le \psi_x(i)$.
The value $\psi_x(i)$ can be represented as a pointer from the ith place of the list $L_x$ to the $\psi_x(i)$th place of $L_x$, where a pointer to the (k + 1)st place of $L_x$ is the special pointer NIL.
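Conditions (s1) to (s3) can be made concrete with a small checker. The Python sketch below rests on representation choices that are assumptions made here and not taken from the paper: $L_x$ is a Python list, $\psi_x$ is a list of integers of the same length, the value `len(L)` plays the role of NIL, and `reach[v]` is the set of all w with $(v, w) \in R^*$.

```python
from typing import Dict, Hashable, List, Set

Node = Hashable


def is_ext_successor_list(L: List[Node], psi: List[int], reach: Dict[Node, Set[Node]]) -> bool:
    """Check that (L, psi) is an ext-successor list of the node L[0]."""
    # L must be a simple list representing exactly the set xR* for x = L[0].
    if len(set(L)) != len(L) or set(L) != reach[L[0]]:
        return False
    for i, yi in enumerate(L):
        if not i < psi[i] <= len(L):                     # (s1); len(L) is NIL
            return False
        for j, yj in enumerate(L):
            if yj in reach[yi] and not j < psi[i]:       # (s2)
                return False
            if i <= j < psi[i] and yj not in reach[yi]:  # (s3)
                return False
    return True


# Two graphs on {0, 1, 2}: a fork 0 -> 1, 0 -> 2 and a path 0 -> 1 -> 2.
fork = {0: {0, 1, 2}, 1: {1}, 2: {2}}
path = {0: {0, 1, 2}, 1: {1, 2}, 2: {2}}
```

For the fork, `([0, 1, 2], [3, 2, 3])` is a valid pair, whereas for the path the pointer of node 1 must skip over node 2 as well, giving `([0, 1, 2], [3, 3, 3])`. Each pointer marks the end of the block of successors that directly follows a node in the list, which is what later allows a whole block to be skipped in one step.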
The data structure G-ext-successor list is motivated by depth-first search, see [1,4]. To obtain the reflexive and transitive closure of an acyclic graph G = (X, R), it suffices to systematically construct G-ext-successor lists of all nodes of G, and this is the task of our algorithm.

An important role in our algorithm is played by topological sorting. For a graph G = (X, R), a topological sorting of G is any linear order ≤ on X for which x ≤ y whenever (x, y) ∈ R. It is known that a graph G has a topological sorting if and only if it is acyclic. In fact, there exists an algorithm that for a given graph G = (X, R) decides whether G is acyclic and in the affirmative case constructs a topological sorting of G in time O(|X| + |R|), see [4]. In what follows, we require that any acyclic graph G = (X, R) is represented so that, for a topological sorting ≤ of G, the list X representing the underlying set of G is sorted by the dual of ≤ (that is, x precedes y in the list X just when y ≤ x) and the lists xR for x ∈ X representing direct successors are sorted by ≤ (that is, y precedes z in the list xR exactly when y ≤ z, for all y, z ∈ xR and all x ∈ X). In a standard way, we can transform an arbitrary representation of G to the desired form with respect to a given topological sorting in time O(|X| + |R|), see, e.g., [5]. Therefore we suppose that any acyclic graph is given in the desired form (for some topological sorting).

The following procedure Successorlist constructs a G-ext-successor list of a given node x of G under the assumption that G-ext-successor lists of all nodes y ∈ xR are given. The procedure uses two stacks S and S1 that are initialized empty, and all nodes are initialized as unmarked.

Successorlist(G, x)
(1) y0 ← x, mark x, k ← 0,
(2) for every y ∈ xR in the order of xR do
(3)   assume that (Ly = {u0, u1, ..., ul}, ψy) is the G-ext-successor list of y, j ← 0
(4)   while j ≤ l do
(5)     if uj is not marked then
(6)       k ← k + 1, yk ← uj, mark uj,
(7)       while top of S is (i, i′) for some i and i′ with i′ ≤ j do
(8)         (i, i′) ← top of S, pop (i, i′) from S, ψx(i) ← k
(9)       enddo
(10)      if ψy(j) ≠ NIL then
(11)        push (k, ψy(j)) to S
(12)      else push k to S1 endif
(13)      j ← j + 1,
(14)    else j ← ψy(j)
(15)    endif
(16)  enddo
(17)  while S ≠ ∅ do
(18)    (i, i′) ← top of S, pop (i, i′) from S, ψx(i) ← k + 1
(19)  enddo
(20)  while S1 ≠ ∅ do
(21)    i ← top of S1, pop i from S1, ψx(i) ← k + 1
(22)  enddo
(23) enddo
(24) ψx(0) ← k + 1, Lx is the list {y0, y1, ..., yk}
(25) the output is (Lx, ψx).
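The procedure above, together with the outer loop that applies it to all nodes in the dual of a topological order, can be sketched in Python as follows. This is an illustrative reading of the pseudocode and not the authors' code: the names `successor_list` and `trancl`, the use of Python lists in place of one-linked lists, a per-call set of marked nodes, and the convention that `len(L)` stands for NIL are all assumptions made here.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Node = Hashable
ExtList = Tuple[List[Node], List[int]]   # (L_x, psi_x); psi value len(L_x) is NIL


def successor_list(x: Node, succ_x: Sequence[Node], ext: Dict[Node, ExtList]) -> ExtList:
    """Build the ext-successor list of x from those of its direct successors.

    `succ_x` is xR sorted by the topological order; `ext[y]` must already
    hold the ext-successor list of every y in xR.
    """
    L: List[Node] = [x]                   # line 1
    psi: List[int] = [0]                  # psi[0] is set on line 24
    marked = {x}
    for y in succ_x:                      # line 2: one "y-search" per successor
        Ly, psi_y = ext[y]
        S: List[Tuple[int, int]] = []     # pairs (place in L, pointer into Ly)
        S1: List[int] = []                # places in L whose pointer in Ly is NIL
        j = 0
        while j < len(Ly):                # line 4
            u = Ly[j]
            if u not in marked:           # line 5
                L.append(u)               # line 6
                psi.append(0)
                marked.add(u)
                k = len(L) - 1
                while S and S[-1][1] <= j:        # lines 7-9: close finished blocks
                    i, _ = S.pop()
                    psi[i] = k
                if psi_y[j] != len(Ly):           # lines 10-12
                    S.append((k, psi_y[j]))
                else:
                    S1.append(k)
                j += 1                            # line 13
            else:
                j = psi_y[j]                      # line 14: skip the marked block
        end = len(L)                      # lines 17-22: remaining blocks end here
        for i, _ in S:
            psi[i] = end
        for i in S1:
            psi[i] = end
    psi[0] = len(L)                       # line 24
    return L, psi


def trancl(order: Sequence[Node], succ: Dict[Node, Sequence[Node]]) -> Dict[Node, ExtList]:
    """Apply successor_list to all nodes, sinks first.

    `order` must be a topological sorting of the acyclic graph `succ`.
    """
    pos = {v: i for i, v in enumerate(order)}
    ext: Dict[Node, ExtList] = {}
    for x in reversed(order):
        ext[x] = successor_list(x, sorted(succ.get(x, ()), key=pos.__getitem__), ext)
    return ext
```

For the graph with arcs 0→1, 0→2, 1→3, 2→3, 3→4, 0→4 and the order 0, 1, 2, 3, 4, node 0 receives the list [0, 1, 3, 4, 2]: during the search through the list of node 2, the already marked node 3 is skipped together with its whole block in a single step.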
Recall that NIL and the (k + 1)th place of the list {v0, v1, ..., vk} are synonymous.

If G = (X, R) is an acyclic graph, then we form a set $R^-$ from the set R by omitting all arcs (x, y) ∈ R such that there exists a path in G from x to y of length at least 2. Then $(X, R^-)$ is the least graph (with respect to inclusion) whose reflexive and transitive closure is the same as the reflexive and transitive closure of G. The graph $(X, R^-)$ is called the transitive reduct of G; it exists for any acyclic graph, see [1,4].

The next auxiliary lemma proves the correctness of the procedure Successorlist. It will be exploited in the time analysis of the main algorithm. In the next lemma, for the sake of simplicity, the instruction $y_k \leftarrow v$ for a natural number k and a node v ∈ X is interpreted as the insertion of v as the kth member of the list $L_x$. If we apply the algorithm Successorlist to G and x, then the execution of lines 3–22 for a given u ∈ xR will be called a u-search.

Lemma 7. Let G = (X, R) be an acyclic graph with a topological sorting ≤ and let x be a node of G. If a G-ext-successor list is given for all z ∈ X with x < z and if we apply the algorithm Successorlist to G and x, then
(1) if y is inserted into the list $L_x$, then y is marked and either y = x or there exists u ∈ xR with $y \in L_u$;
(2) if u ∈ xR and $y \in L_u$ then the u-search does not execute line 5 for y if and only if there exists $z \in L_u$ with $(z, y) \in R^*$ such that z is marked before the u-search;
(3) for every u ∈ xR, after the u-search all $y \in L_u$ are contained in $L_x$ and are marked;
(4) the list $L_x$ after the run of Successorlist is a simple list consisting of all y ∈ X with $(x, y) \in R^*$ and the first member of $L_x$ is x;
(5) if $\psi_x(j)$ is defined and y is the jth member of $L_x$, then every z ∈ X with $(y, z) \in R^*$ is the kth member of $L_x$ for some $k < \psi_x(j)$, and if z ∈ X is the kth member of $L_x$ for $j < k < \psi_x(j)$, then $(y, z) \in R^*$;
(6) the pair $(L_x, \psi_x)$ is a G-ext-successor list of x.

Proof. If a node v is inserted into the list $L_x$, then either this was done by line 1, in which case v = x and v is marked, or it was done by line 6 in the course of some u-search. In the latter case v is a member of $L_u$, u ∈ xR and v is marked. Thus (1) is proved.

We prove (2). Observe that if y is the jth member of $L_u$ for some u ∈ xR, then the u-search does not execute line 5 for y if and only if there exists a kth member z of $L_u$ such that z is marked before the u-search and $k < j < \psi_u(k)$ (lines 13 and 14). By (s3), $(z, y) \in R^*$ and the proof of (2) is complete.

Observe that y ∈ X is marked at the moment when y is inserted into $L_x$. We prove (3) by induction over xR. Assume that $y \in L_u$ is not marked before the u-search. Then no $z \in L_u$ with $(z, y) \in R^*$ is marked. Indeed, otherwise z was marked in a v-search for some v preceding u in xR. Since $(L_v, \psi_v)$ is a G-ext-successor list of v, we conclude that $y \in L_v$ and then, by the induction hypothesis, y has to be marked, a contradiction. Thus by (2), the u-search executes line 5 for y, and y is inserted into $L_x$ and is marked. If y is marked before the u-search, then y was a member of $L_x$ before the u-search and (3) is proved.

From (1) and (3) it follows that $L_x$ consists of all y ∈ X with $(x, y) \in R^*$. If a node is inserted into $L_x$, then it is marked, and because no marked node is inserted into $L_x$ we conclude that $L_x$ is simple. By line 1, x is the first member of $L_x$ and (4) is proved.

Observe that $\psi_x(0)$ satisfies (5), by line 24. Assume that j > 0 and that y is the jth member of $L_x$. Then there exists u ∈ xR such that the execution of line 6 in the course of the u-search inserts y into $L_x$. Then there exists k such that y is the kth member of $L_u$. From lines 10 and 11 it follows that either $\psi_u(k)$ is less than or equal to the length of $L_u$ and then $(j, \psi_u(k))$ is inserted into S, or $\psi_u(k)$ is greater than the length of $L_u$ and then j is inserted into S1. Observe that $\psi_x(j)$ is defined at the moment when either $(j, \psi_u(k))$ is deleted from S (lines 7–9 or 17–19) or j is deleted from S1 (lines 20–22).

First assume that $\psi_u(k)$ is less than or equal to the length of $L_u$ and there exists a least number l > j such that the u-search executes line 6 for the lth member of $L_u$ and $l \ge \psi_u(k)$. By lines 7–9, the lth member of $L_u$ becomes the $\psi_x(j)$th member of $L_x$. At this moment all members of $L_u$ preceding the lth member of $L_u$ are marked and hence are members of $L_x$. The first statement follows from (s2). If z is the j′th member of $L_x$ with $j < j' < \psi_x(j)$, then necessarily z is the k′th member of $L_u$ for some $k < k' < \psi_u(k)$, and (s3) completes the proof of the second statement of (5).

Now assume either that $\psi_u(k)$ is less than or equal to the length of $L_u$ and, whenever the u-search executes line 6 for some l > k, then $l < \psi_u(k)$, or that $\psi_u(k)$ is greater than the length of $L_u$. Hence after the u-search any member of $L_x$ subsequent to y is the k′th member of $L_u$ for some $k < k' < \psi_u(k)$. By lines 17–22, $\psi_x(j)$ is the length of $L_x$ after the u-search plus 1 and hence (5) immediately follows.

Since $\psi_x(i) > i$ for all i such that $\psi_x(i)$ is defined, we deduce from (5) that $\psi_x$ satisfies the conditions (s1), (s2) and (s3). From (4) it follows that $(L_x, \psi_x)$ is a G-ext-successor list of x and (6) is proved. ✷

Now we suggest an algorithm constructing the reflexive and transitive closure of acyclic graphs exploiting the procedure Successorlist.

Trancl(G = (X, R))
(1) Assume that for a topological sorting ≤ of G, the graph G is represented by the simple one-linked list X sorted dually to ≤ and a family of simple one-linked lists xR for x ∈ X sorted by ≤.
(2) For every x ∈ X in the order of the list X apply Successorlist(G, x).
(3) The output is the graph represented by the list X and a family of lists $L_x$ for x ∈ X.

For an acyclic graph (X, R) and for a node x ∈ X, let $r_x$ denote the maximum of 1 and of the number of arcs in the transitive reduct of the graph $(X_x, R_x)$, where $X_x = \{y \in X \mid (x, y) \in R^*\} = xR^*$ and $R_x = R \cap (X_x \times X_x)$. The following theorem characterizes the time complexity of the algorithm Trancl.

Theorem 8. The algorithm Trancl computes the reflexive and transitive closure of any acyclic graph G = (X, R) in time $O(\sum_{x \in X} r_x)$ in the worst case.

Proof. Since the list X is sorted dually to the topological sorting ≤, we conclude that when the algorithm constructs a G-ext-successor list of a node x ∈ X, the G-ext-successor lists for all nodes y ∈ xR have already been constructed. Thus, by Lemma 7(4) and (6), the procedure Trancl is correct.

To estimate the time used by the algorithm Trancl for an input graph G = (X, R), we compute the running time of the procedure Successorlist for a node x ∈ X. First we observe that any run of the while-cycles on lines 7–9, or on lines 17–19, or on lines 20–22 creates a pointer $\psi_x(i)$ and requires O(1) time. Hence the time consumed by all runs of these while-cycles is $O(|xR^*|)$. Any run of the if-test on lines 5–15 without the while-cycle on lines 7–9 uses O(1) time. Since the if-condition on line 5 is satisfied for every member of $L_x$ exactly once, we deduce that the procedure Successorlist for a node x uses time proportional to the number of if-tests on line 5. First, for every u ∈ xR the u-search executes line 5 for u. If y distinct from u is a member of $L_u$ for some u ∈ xR, then there exists an arc (z, y) of the transitive reduct of G with $(u, z) \in R^*$ and, by Lemma 7(2), the u-search executes line 5 for y if and only if no node $z \in L_u$ such that the arc (z, y) belongs to the transitive reduct of G was marked before the u-search. Since $|xR| \le r_x$ and since the transitive reduct of $(X_x, R_x)$ is the induced subgraph of the transitive reduct of G, we deduce that Successorlist for a node x uses $O(r_x)$ time because any execution of line 5 corresponds to an arc of $R_x$. The proof of Theorem 8 is complete. ✷

The worst-case time estimate in Theorem 8 is very rough. For an acyclic graph G = (X, R) with a topological sorting ≤ and for nodes x and u of G with $(x, u) \in R^-$, let $r_{x,u}$ denote the number of nodes y of G such that
• $(u, y) \in R^*$;
• $(v, y) \in R^*$ for some v ∈ xR with x < v < u;
• for all nodes z of G with $(z, y) \in R^-$ and $(u, z) \in R^*$ there exists no node v of G with v ∈ xR, x < v < u and $(v, z) \in R^*$.
Define
$$r_G = \sum_{(x,u) \in R^-} r_{x,u}.$$
Then, in fact, we proved the following result.

Theorem 9. Let G = (X, R) be an acyclic graph and let ≤ be a topological sorting of G. Then the algorithm Trancl for G with ≤ uses $O(r_G + |R^*|)$ time.

4. Conclusion
We give an estimate of the number of strongly connected graphs. As a consequence of this estimate we find a broad class of algorithms constructing the reflexive and transitive closure of a graph with expected time $O(n^2)$ (where n is the number of nodes of an input graph). The result is similar to Schnorr's result. However, we were not successful in finding a satisfactory estimate of the number of strongly connected graphs with a given number of nodes and arcs. Therefore we did not obtain a result closer to Schnorr's.

We present a new algorithm for the construction of the reflexive and transitive closure of acyclic graphs and its time analysis. To compare the new algorithm with classical ones it is necessary to find a relation between $r_G$ and other characteristics of G, which is unknown, as well as a description of a topological sorting with the least $r_G$. By a comparison of single steps we obtain that there exists a constant c > 0 such that for every acyclic graph G the running time of the new algorithm for G is less than c times the running time of the algorithm of Goralčíková and Koubek for G. On the other hand, how the new algorithm compares with Simon's algorithm is unknown.

For a natural number n let $G_n = (X, R)$ be a graph such that X = {(i, j) | i = 0, 1, ..., n − 1, j = 0, 1, 2} and R = {((i, j), (k, l)) | i, k = 0, 1, ..., n − 1, j = 0, 1, l = j + 1}. It is easy to see that there exists a constant c > 0 such that the running time of the new algorithm on $G_n$ is greater than $cn^3$. Thus the time complexity of the new algorithm in the worst case is $\Omega(n^3)$ (the same is true for Simon's algorithm). The new algorithm seems to be suitable for acyclic graphs with long paths.
References
[1] A.V. Aho, J.E. Hopcroft, J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1976.
[2] M.J. Fischer, A.R. Meyer, Boolean matrix multiplication and transitive closure, in: Proc. of 12th Annual Symp. on Switching and Automata Theory, 1971, pp. 129–131.
[3] A. Goralčíková, V. Koubek, A reduct and closure algorithm for graphs, in: Proc. Math. Found. of Comp. Sci. 1979, Lecture Notes in Comput. Sci., Vol. 74, Springer, Berlin, 1979, pp. 301–307.
[4] K. Mehlhorn, Data Structures and Algorithms II: Graph Algorithms and NP-Completeness, EATCS Monographs in Comput. Sci., Vol. 2, Springer, Berlin, 1984.
[5] C.P. Schnorr, An algorithm for transitive closure with linear expected time, SIAM J. Comput. 7 (1978) 124–133.
[6] K. Simon, An improved algorithm for transitive closure on acyclic digraphs, Theoret. Comput. Sci. 58 (1988) 325–346.
[7] K. Simon, D. Crippa, F. Collenberg, On the distribution of the transitive closure in a random acyclic digraph, in: Algorithms - ESA'93 (Bad Honnef), Lecture Notes in Comput. Sci., Vol. 726, Springer, Berlin, 1993, pp. 345–356.
[8] R.E. Tarjan, Depth first search and linear graph algorithms, SIAM J. Comput. 1 (1972) 146–160.