Parallel Computing 24 (1998) 1245–1261
Node-to-set and set-to-set cluster fault tolerant routing in hypercubes

Qian-Ping Gu *, Shietung Peng

Department of Computer Software, The University of Aizu, Aizu-Wakamatsu, Fukushima, 965-80 Japan

Received 27 September 1997; received in revised form 16 March 1998
Abstract

We study node-to-set and set-to-set fault tolerant routing problems in $n$-dimensional hypercubes $H_n$. The node-to-set routing problem is: given a node $s$ and a set of nodes $T = \{t_1, \ldots, t_k\}$, find $k$ node-disjoint paths $s \to t_i$, $1 \le i \le k$. The set-to-set routing problem is: given two sets of nodes $S = \{s_1, \ldots, s_k\}$ and $T = \{t_1, \ldots, t_k\}$, find $k$ node-disjoint paths, each connecting a node of $S$ and a node of $T$. From Menger's theorem, it is known that these two problems in $H_n$ can tolerate at most $n - k$ arbitrary faulty nodes. In this paper, we prove that both routing problems can tolerate $n - k$ arbitrary faulty subgraphs (called clusters) of diameter 1. For $2 \le k \le n$, we show that, in the presence of at most $n - k$ faulty clusters of diameter at most 1, the $k$ paths of length at most $n + 2$ for node-to-set routing in $H_n$ can be found in $O(kn)$ optimal time and the $k$ paths of length at most $n + k + 2$ for set-to-set routing in $H_n$ can be found in $O(kn \log k)$ time. The upper bound $n + 2$ on the length of the paths for node-to-set routing in $H_n$ is optimal. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Algorithm; Node-disjoint paths; Interconnection network; Node fault tolerant routing
* Corresponding author. E-mail: [email protected]

0167-8191/98/$ – see front matter © 1998 Elsevier Science B.V. All rights reserved. PII: S0167-8191(98)00050-7

1. Introduction

Node fault tolerant routing is one of the central issues in today's interconnection networks and it has been discussed extensively [1–4]. Most work on node fault tolerant routing has been done on a graph in which a certain number of arbitrary nodes have failed. For a specific routing problem, we say a graph can tolerate $l$ faulty nodes if, given at most $l$ arbitrary faulty nodes, the required routing paths exist for the routing problem. For example, node-to-node routing in an $n$-connected graph can tolerate at most $n - 1$ arbitrary faulty nodes. However, if there are more than $l$ faulty nodes, what are the conditions for the existence of the routing paths, and how can they be found? Recently, this problem has been attracting much attention and several approaches, such as forbidden sets [5] and cluster fault tolerant routing (CFT routing) [6–8], have been developed.

The approach of CFT routing is to reduce the number of "faulty nodes" that the routing has to deal with by using subgraphs of small diameter to cover the faulty nodes. It is known that for certain routing problems in some interconnection networks with regular structures, if multiple faulty nodes can be covered by a subgraph of small diameter, those faulty nodes can be viewed as "one fault" rather than several [6–8]. In practice, failed processors can often be covered by fewer subgraphs of small diameter. For example, suppose two routing jobs, one between nodes $s_1$ and $t_1$ and the other between $s_2$ and $t_2$, are performed simultaneously. The nodes in the routing path between $s_1$ and $t_1$ may be viewed as faulty nodes by the routing between $s_2$ and $t_2$, and vice versa. In this example, $l$ faulty nodes can be covered by $l/d$ subgraphs of diameter $d$.

In this paper, we consider node-to-set and set-to-set fault tolerant routing problems in the $n$-dimensional hypercubes $H_n$. The node-to-set routing problem is: given a node $s$ and a set of nodes $T = \{t_1, \ldots, t_k\}$, find $k$ node-disjoint paths $s \to t_i$. The set-to-set routing problem is: given two sets of nodes $S = \{s_1, \ldots, s_k\}$ and $T = \{t_1, \ldots, t_k\}$, find $k$ node-disjoint paths, each connecting a node of $S$ and a node of $T$. For $k = 1$, the above problems become node-to-node routing, which has been studied extensively. We assume $k \ge 2$ for the above problems. In what follows, we use disjoint paths for node-disjoint paths.
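To make the node-to-set problem statement concrete, the following is a minimal sketch of a checker for a proposed solution. It assumes nodes of $H_n$ are written as bit strings, and the names `is_edge` and `valid_node_to_set` are illustrative only; they do not come from the paper. It verifies a given set of paths and does not construct them.

```python
from itertools import combinations


def is_edge(u: str, v: str) -> bool:
    """Two hypercube nodes (bit strings) are adjacent iff they differ in one bit."""
    return len(u) == len(v) and sum(a != b for a, b in zip(u, v)) == 1


def valid_node_to_set(s, targets, faulty, paths):
    """Check that `paths` solves node-to-set routing from s to `targets`.

    Each path is a list of nodes that starts at s and ends at a distinct
    target. Paths must be fault-free and may share no node other than s.
    """
    if sorted(p[-1] for p in paths) != sorted(targets):
        return False
    for p in paths:
        if p[0] != s or len(set(p)) != len(p):
            return False  # wrong source or repeated node
        if any(node in faulty for node in p):
            return False  # path is not fault-free
        if not all(is_edge(u, v) for u, v in zip(p, p[1:])):
            return False  # consecutive nodes are not adjacent
    for p, q in combinations(paths, 2):
        if (set(p) & set(q)) != {s}:
            return False  # paths meet somewhere other than s
    return True
```

For instance, in $H_3$ with $s = 000$, $T = \{011, 101\}$ and faulty node $010$, the paths $000 \to 001 \to 011$ and $000 \to 100 \to 101$ pass the check, while two paths that both use node $001$ do not.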
From Menger's theorem, it is known that both node-to-set routing and set-to-set routing in an $n$-connected graph can tolerate at most $n - k$ arbitrary faulty nodes [9]. However, it is not trivial to find the routing paths guaranteed by the connectivity. For general graphs $G$, the paths for the two problems can be found by maximum-flow based algorithms in $\mathrm{Poly}(|V(G)|)$ time [9]. For interconnection networks with regular structure like hypercubes, $\mathrm{Poly}(|V(G)|)$ time is far from efficient.

Hypercubes are interesting interconnection topologies for parallel computation and communication networks and have been used in many commercial and experimental parallel computers. Numerous works have been done on hypercubes [10,11,4,12]. For node-to-set routing in $H_n$, Rabin showed that $k$, $k \le n$, disjoint paths $s \to t_i$, $1 \le i \le k$, of optimal length (at most $n + 1$) can be found in $O(kn)$ time [4]. For set-to-set routing in $H_n$, it was shown that $k$ fault-free disjoint paths can be found in $O(kn \log k)$ time, in the presence of at most $n - k$ faulty nodes [13].

In this paper, we prove that both node-to-set and set-to-set routing problems can tolerate as many as $n - k$ arbitrary faulty clusters of diameter 1 rather than $n - k$ faulty nodes. In particular, in the presence of at most $n - k$ faulty clusters of diameter at most 1, we give an $O(kn)$ optimal time algorithm which constructs $k$ fault-free disjoint paths of length at most $n + 2$ for node-to-set routing in $H_n$, and give an $O(kn \log k)$ time algorithm which constructs $k$ fault-free disjoint paths of length at most $n + k + 2$ for set-to-set routing in $H_n$. For node-to-set routing, $n + 2$ is the optimal upper bound on the length of $k$ fault-free disjoint paths. The results of this paper show that for node-to-set and set-to-set routings in $H_n$, a faulty cluster of diameter 1 can be viewed as a single fault.
The rest of this paper is organized as follows. In Section 2, we give the preliminaries of the paper. Node-to-set routing and set-to-set routing are discussed in Sections 3 and 4, respectively. The final section concludes the paper.

2. Preliminaries

A path in a graph $G$ is a sequence of edges of the form $(s_1, s_2)(s_2, s_3) \cdots (s_{k-1}, s_k)$, where $s_i \in V(G)$ for $1 \le i \le k$ and $s_i \ne s_j$ for $i \ne j$. The length of a path is the number of edges in the path. We sometimes denote the path from $s_1$ to $s_k$ by $s_1 \to s_k$. We call a path fault-free if there is no faulty node in the path. For a path $P = (s_1, s_2)(s_2, s_3) \cdots (s_{k-1}, s_k)$, we also use $P$ to denote the set $\{s_1, \ldots, s_k\}$ of the nodes in path $P$, if no confusion arises. For any two nodes $s, t \in V(G)$, $d(s, t)$ denotes the distance between $s$ and $t$, i.e., the length of the shortest path connecting $s$ and $t$. The diameter of $G$ is $d(G) = \max\{d(s, t) \mid s, t \in G\}$. In this paper, $\langle n \rangle = \{1, 2, \ldots, n\}$ and logarithms are to base 2.

For a graph $G$, a cluster $C$ of $G$ is a connected subgraph of $G$. $C$ is used to denote both the cluster and the set of the nodes in the cluster if no confusion arises. A cluster $C$ is called faulty if all nodes in $C$ are faulty. Let $\mathcal{F}$ be a set of faulty clusters. $|\mathcal{F}|$ denotes the cardinality of $\mathcal{F}$, $d(\mathcal{F}) = \max\{d(C) \mid C \in \mathcal{F}\}$ denotes the diameter of $\mathcal{F}$, and $F = \bigcup_{C \in \mathcal{F}} C$ denotes the set of the nodes of the clusters in $\mathcal{F}$. Parameters $|\mathcal{F}|$, $d(\mathcal{F})$, and $|F|$ are used as criteria to evaluate the CFT routing properties of a graph. These parameters are put into a triple $(m, d, l)$. For a particular CFT routing problem in a graph $G$, a triple $(m, d, l)$ is called a features number of $G$ if, for any set $\mathcal{F}$ of faulty clusters in $G$ with $|\mathcal{F}| \le m$, $d(\mathcal{F}) \le d$, and $|F| \le l$, the required routing paths for the routing problem exist. A features number $(m, d, l)$ is called an optimum features number for a specific CFT routing problem if, for any $(m, d, l) < (m', d', l')$, $(m', d', l')$ is not a features number for the problem.$^1$ We denote an optimum features number for CFT routing as OCFT number. For an OCFT number $(m, d, l)$ with $l = m \cdot \max\{|C| \mid C \subseteq G, d(C) = d\}$, the OCFT number will be denoted (and only in this case) by two parameters $(m, d)$. A pair $(m, d)$ is an OCFT number if $(m, d)$ is a features number and, for any $(m', d')$ with $(m, d) < (m', d')$, $(m', d')$ is not a features number.

$^1$ The partial order $\le$ on $(m, d, l)$ is defined as: $(m, d, l) \le (m', d', l')$ if $m \le m'$, $d \le d'$, and $l \le l'$. $(m, d, l) < (m', d', l')$ if $(m, d, l) \le (m', d', l')$ and $(m, d, l) \ne (m', d', l')$.

In this paper, the $n$-dimensional hypercube, denoted by $H_n$, is the undirected graph on node set $V(H_n) = \{0, 1\}^n$ such that there is an edge from $u \in V(H_n)$ to $v \in V(H_n)$ if and only if $u$ and $v$ differ in exactly one bit position. Fig. 1 shows $H_3$. There are $2^n$ nodes in $H_n$, and each node has exactly $n$ edges incident upon it. $H_n$ is $n$-connected and has diameter $d(H_n) = n$. The hypercube $H_n$ has a good recursive structure. It can be partitioned into two disjoint $(n - 1)$-dimensional subcubes on the $k$th dimension for any $k$ with $1 \le k \le n$. For $n \ge 1$, the 0-subcube of $H_n$, denoted by $H^0_{n-1}$, is defined to be the subgraph of $H_n$ induced by the set of nodes whose $k$th bit is 0. Define similarly the 1-subcube $H^1_{n-1}$ of $H_n$. Partitioning $H_n$ into subcubes and connecting some nodes in one subcube to some nodes in the other subcube by fault-free disjoint paths of short length plays a key role in this paper. The following lemma shows that each node of one subcube can be connected to $n$ distinct nodes in the other subcube by $n$ disjoint paths of length at most 2.

Fig. 1. The 3-dimensional hypercube $H_3$.

Lemma 1 (Ref. [8]). For any node $s \in H_n$, $n \ge 1$, partition $H_n$ into $H^0_{n-1}$ and $H^1_{n-1}$ such that $s \in H^0_{n-1}$ ($s \in H^1_{n-1}$). Then there are $n$ disjoint paths of length at most 2 that connect $s$ to $n$ distinct nodes in $H^1_{n-1}$ ($H^0_{n-1}$).
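The structural facts about $H_n$ used above (node count, degree, diameter, and the split into two subcubes on a chosen dimension) can be checked on small instances. The sketch below assumes nodes are bit strings whose leftmost character is bit 1; the helper names `nodes`, `flip`, `neighbors`, `eccentricity`, and `subcubes` are illustrative and not taken from the paper.

```python
from collections import deque
from itertools import product


def nodes(n):
    """All 2^n nodes of H_n as bit strings."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def flip(u, i):
    """Return u with its i-th bit (1-indexed) negated."""
    j = i - 1
    return u[:j] + ("1" if u[j] == "0" else "0") + u[j + 1:]


def neighbors(u):
    """The n nodes adjacent to u, one per bit position."""
    return [flip(u, i) for i in range(1, len(u) + 1)]


def eccentricity(u):
    """Largest breadth-first-search distance from u within H_n."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in neighbors(x):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())


def subcubes(n, k):
    """Split V(H_n) on the k-th bit into the 0-subcube and the 1-subcube."""
    zero = [u for u in nodes(n) if u[k - 1] == "0"]
    one = [u for u in nodes(n) if u[k - 1] == "1"]
    return zero, one
```

Each node `u` of the 0-subcube has exactly one neighbor in the 1-subcube, namely `flip(u, k)`; the length-1 path of Lemma 1 uses this edge.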
For a node $s = a_1 a_2 \cdots a_n \in H_n$, $s^{(i)}$, $1 \le i \le n$, denotes the node $a_1 \cdots a_{i-1} \bar{a}_i a_{i+1} \cdots a_n$, where $\bar{a}_i$ is the logical negation of $a_i$. Similarly, $s^{(i_1, i_2, \ldots, i_k)}$ denotes the node $b_1 \cdots b_n$, where $b_{i_j} = \bar{a}_{i_j}$, $1 \le j \le k$, and $b_l = a_l$ for $l \in \langle n \rangle - \{i_1, \ldots, i_k\}$. Given $s$ with the 1st bit equal to 0, the $n$ paths of Lemma 1 are $P_1 : s \to s^{(1)} \in H^1_{n-1}$ (of length 1), and $P_j : s \to s^{(j)} \to s^{(j,1)} \in H^1_{n-1}$ (of length 2), $2 \le j \le n$ (see Fig. 2). Obviously, a cluster of diameter 1 can block at most one of the $n - 1$ paths of length 2 for $s$. More precisely, the following holds.
Fig. 2. The n paths for s in Lemma 1.
Lemma 2. For $s \in H^0_{n-1}$, let $P_1$ be the routing path of length 1 and $P_j$, $2 \le j \le n$, be the $n - 1$ routing paths of length 2 given in Lemma 1. Then:
1. For any cluster $C$ of diameter at most 1 with $s \notin C$ and $C \cap H^0_{n-1} \ne \emptyset$, $C$ can block at most one of the $n$ paths $P_1, \ldots, P_n$.
2. For any cluster $C$ of diameter at most 1 with $C \subseteq H^1_{n-1}$, $C$ can block at most one of the $n - 1$ paths $P_j$, $2 \le j \le n$, of length 2.
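Lemma 2 can be checked exhaustively on small hypercubes. The sketch below builds the paths $P_1, \ldots, P_n$ of Lemma 1 for a node $s$ whose first bit is 0, enumerates every cluster of diameter at most 1 (a single node or a single edge), and tests both parts of the lemma. Bit strings with bit 1 leftmost are an assumed encoding, and the function names are illustrative, not the paper's.

```python
from itertools import product


def flip(u, *positions):
    """Negate the bits of u at the given 1-indexed positions, i.e. s^(i1,...,ik)."""
    bits = list(u)
    for i in positions:
        bits[i - 1] = "1" if bits[i - 1] == "0" else "0"
    return "".join(bits)


def lemma1_paths(s):
    """Paths P_1, ..., P_n from s (first bit 0) into the 1-subcube on bit 1."""
    n = len(s)
    paths = {1: [s, flip(s, 1)]}
    for j in range(2, n + 1):
        paths[j] = [s, flip(s, j), flip(s, j, 1)]
    return paths


def clusters_of_diameter_at_most_1(n):
    """All single nodes and all edges of H_n, as frozensets of nodes."""
    all_nodes = ["".join(b) for b in product("01", repeat=n)]
    result = {frozenset([u]) for u in all_nodes}
    for u in all_nodes:
        for i in range(1, n + 1):
            result.add(frozenset([u, flip(u, i)]))
    return result


def blocked(paths, cluster):
    """Indices of the paths that contain a node of the cluster."""
    return {j for j, p in paths.items() if cluster & set(p)}


def check_lemma2(s):
    """Return True if no cluster of diameter <= 1 violates Lemma 2 for s."""
    paths = lemma1_paths(s)
    for c in clusters_of_diameter_at_most_1(len(s)):
        hit = blocked(paths, c)
        touches_h0 = any(u[0] == "0" for u in c)
        if s not in c and touches_h0 and len(hit) > 1:
            return False  # would contradict part 1
        if not touches_h0 and len(hit - {1}) > 1:
            return False  # would contradict part 2
    return True
```

Part 2 only counts the length-2 paths: a cluster inside the 1-subcube can still hit $P_1$ together with one $P_j$. In $H_3$ with $s = 000$, the edge $(100, 110)$ meets both $P_1$ and $P_2$.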
Notice that a cluster of diameter 1 may block both path $P_1$ and path $P_j$ (see Fig. 2).

Node-to-node routing, i.e., finding a routing path between two nodes $s$ and $t$, is one of the bases of the problems considered in this paper. For node-to-node CFT routing, it is known that $(n - 1, 1, 2n - 3)$ is an OCFT number of $H_n$ [8]. In particular, we have the following result.

Lemma 3 (Ref. [8]). Given a set $\mathcal{F}$ of faulty clusters with $|\mathcal{F}| \le n - 1$, $d(\mathcal{F}) \le 1$, and $|F| \le 2n - 3$, and non-faulty nodes $s$ and $t$ in $H_n$, a fault-free path $s \to t$ of length at most $n + 2$ can be found in $O(n)$ time.

In this paper, we prove that for $k \ge 2$, $(n - k, 1)$ is an OCFT number for both node-to-set CFT routing and set-to-set CFT routing in $H_n$. We show algorithms which, given $\mathcal{F}$ with $|\mathcal{F}| \le n - k$ and $d(\mathcal{F}) \le 1$, find the routing paths for the two routing problems.

3. Node-to-set routing

In this section, we show an algorithm for node-to-set CFT routing in $H_n$. The algorithm follows a divide-and-conquer strategy. For $k \ge 2$, given $\mathcal{F}$, $s$, and $T$ with $|\mathcal{F}| \le n - k$, $d(\mathcal{F}) \le 1$, and $|T| = k$, partition $H_n$ into $H^0_{n-1}$ and $H^1_{n-1}$ such that $T_0 = T \cap H^0_{n-1} \ne \emptyset$ and $T_1 = T \cap H^1_{n-1} \ne \emptyset$. Assume that $s \in H^0_{n-1}$ and $s_1 \in H^1_{n-1}$ is the neighbor of $s$. The nodes of $T_0$ are connected to $s$ in $H^0_{n-1}$ and the nodes of $T_1$ are connected to $s_1$ in $H^1_{n-1}$, recursively. The paths from $T_1$ to $s_1$ are reconstructed into paths from $T_1$ to $s$. The basic idea of the reconstruction is as follows. Assume that $s_i$, $1 \le i \le n$, are the $n$ neighbors of $s$ ($s_1 \in H^1_{n-1}$) and $s'_i \in H^1_{n-1}$, $2 \le i \le n$, are the neighbors of $s_i$, respectively. Notice that $s'_i$ are also neighbors of $s_1$ (see Fig. 2). After the partition of $H_n$, we first find the paths from $T_1$ to $s_1$ with the constraint that each path $s_1 \to t_i \in T_1$ passes through the neighbor $s'_{j_i}$ of $s_1$ such that $s_{j_i}$ is fault-free. This can be done by marking $s'_i$ as a faulty node if $s_i$ is faulty and then finding the fault-free disjoint paths from $s_1$ to $T_1$. The paths $s_1 \to s'_{j_i} \to t_i \in T_1$ are reconstructed into paths $s \to s_{j_i} \to s'_{j_i} \to t_i$. Marking such $s_{j_i}$ faulty, we then find the paths from $s$ to $T_0$ in $H^0_{n-1}$. Lemma 3 is used when the size of $T$ is reduced to 1.

Notice that for $k = 2$, node-to-set CFT routing on $H_n$ is reduced to two node-to-node CFT routings on $H^0_{n-1}$ and $H^1_{n-1}$, respectively. If the $n - 2$ faulty clusters of diameter 1 are in the same subcube, say $H^0_{n-1}$, then we may not be able to solve node-to-node CFT routing in $H^0_{n-1}$, because $H^0_{n-1}$ can be disconnected by $n - 2$ faulty clusters of diameter 1 (e.g., $H_3$ of Fig. 1 is disconnected by faulty clusters $(100, 110)$ and $(001, 011)$). If this happens then the node of $T \cap H^0_{n-1}$ will be connected to $s$ by a path which passes through $H^1_{n-1}$ as well. Considering the above, we first give a subroutine which handles the case of $k = 2$ (see Fig. 3).

Lemma 4. Given a set $\mathcal{F}$ of faulty clusters with $|\mathcal{F}| \le n - 2$ and $d(\mathcal{F}) \le 1$, a non-faulty node $s$ and $T = \{t_1, t_2\}$ in $H_n$, $n \ge 2$, 2 fault-free disjoint paths $s \to t_1$ and $s \to t_2$ of length at most $n + 2$ can be constructed in $O(n)$ time.

Proof. If $t_1$ or $t_2$, say $t_1$, is the neighbor of $s$ then $(s, t_1)$ is the path we need. Marking $t_1$ faulty, there will be at most $n - 1$ faulty clusters of diameter at most 1 with at most $2n - 3$ faulty nodes in total. Therefore, $t_2$ can be connected to $s$ by a fault-free path in $H_n$ of length at most $n + 2$ in $O(n)$ time by Lemma 3. So we assume neither $t_1$ nor $t_2$ is the neighbor of $s$. Following the algorithm, we prove the lemma in two cases.

Case 1. $\forall C_i \in \mathcal{F}$, $C_i \subseteq H^0_{n-1}$: (The case $\forall C_i \in \mathcal{F}$, $C_i \subseteq H^1_{n-1}$ can be proved similarly.) In this case, $H^1_{n-1}$ is fault-free, which implies $s_1$ is fault-free. From $|\mathcal{F}| \le n - 2$ and Lemma 2, we can find another fault-free path $s \to s_j \to s'_j \in H^1_{n-1}$ of length 2. Similarly, we can find two fault-free paths which connect $t_1$ to two distinct nodes in $H^1_{n-1}$. Let $t_1 \to t'_1 \in H^1_{n-1}$ be one of the two paths with $t'_1 \ne t_2$. Now, we connect $\{t'_1, t_2\}$ to $\{s_1, s'_j\}$ by disjoint paths in $H^1_{n-1}$ (Fig. 4(a)). To do so, we further partition $H^1_{n-1}$ into $H^0_{n-2}$ and $H^1_{n-2}$ such that $t'_1 \in H^0_{n-2}$ and $t_2 \in H^1_{n-2}$. If $s_1$ and $s'_j$ are separated, say $s_1 \in H^0_{n-2}$ and $s'_j \in H^1_{n-2}$ (Fig. 4(b)), then $t'_1$ is connected to $s_1$ in $H^0_{n-2}$ and $t_2$ is connected to $s'_j$ in $H^1_{n-2}$ by shortest paths of length at most $n - 2$. Assume that $s_1$ and $s'_j$ are in the same subcube, say $H^1_{n-2}$ (Fig. 4(c)). Then we connect $t'_1$ to the neighbor in $H^0_{n-2}$ of $s_1$ and connect $t_2$ to $s'_j$. It is easy to check that the length of paths $s \to t_1$ and $s \to t_2$ is at most $n + 2$.
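Case 1 routes through the fault-free subcube because the subcube that holds all the faults may itself be disconnected, as the remark before Lemma 4 points out with the example of $H_3$ and the faulty clusters $(100, 110)$ and $(001, 011)$. The following sketch checks that example with a breadth-first search over fault-free nodes. The bit-string encoding and the names `neighbors` and `component` are assumptions made for illustration.

```python
from collections import deque


def neighbors(u):
    """Nodes of the hypercube that differ from bit string u in exactly one bit."""
    flip = {"0": "1", "1": "0"}
    return [u[:i] + flip[u[i]] + u[i + 1:] for i in range(len(u))]


def component(start, faulty):
    """Set of fault-free nodes reachable from start along fault-free paths."""
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in neighbors(x):
            if y not in faulty and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen


# The two faulty clusters of diameter 1 from the H_3 example.
faulty = {"100", "110", "001", "011"}
reachable = component("000", faulty)
```

Here `reachable` contains only `000` and `010`, so the fault-free nodes `101` and `111` cannot be reached from `000` inside this copy of $H_3$.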
Fig. 3. Algorithm for node-to-set CFT routing in $H_n$ for $k = 2$.
Fig. 4. Case 1 of algorithm Router1: connecting $\{t'_1, t_2\}$ to $\{s_1, s'_j\}$ in $H^1_{n-1}$.
It takes $O(n)$ time to partition $H_n$. Enumerating the $n$ paths for $s$ and the $n$ paths for $t_1$ given by Lemma 1, it takes $O(n)$ time to find the fault-free paths $t_1 \to t'_1$ and $s_j \to s'_j$. Obviously, $\{t'_1, t_2\}$ can be connected to $\{s_1, s'_j\}$ in $O(n)$ time as well.

Case 2. $\exists C_i, C_j \in \mathcal{F}$ s.t. $C_i \cap H^0_{n-1} \ne \emptyset$ and $C_j \cap H^1_{n-1} \ne \emptyset$: The key of this case is how to connect $t_2$ to an $s'_j$ by a fault-free path of length at most $n$ such that $s_j \to s'_j$ is fault-free. From Lemma 2, there is a fault-free path $s_j \to s'_j \in H^1_{n-1}$. Since $d(s, t_1) > 1$, $t_1 \ne s_j$. We can connect $t_2$ to $s'_j$ by Lemma 3. However, the direct application of Lemma 3 for connecting $t_2$ and $s'_j$ may give a path of length $(n - 1) + 2 = n + 1$. To find a fault-free path $t_2 \to s'_j$ of length at most $n$, we use Lemma 3 to connect $t_2$ to $s_1$, forcing the path to pass through $s'_j$ (which is a neighbor of $s_1$). To do so, we mark the node $s'_i$ faulty if $s_i$ is faulty and $s'_i$ is fault-free. Then if $t_2$ is connected to $s_1$ by a fault-free path (except the node $s_1$), the path must pass through a neighbor node $s'_j$ of $s_1$ such that $s_j$ is fault-free. Note that for each marked $s'_i$, there must be a faulty cluster $C \in \mathcal{F}$ such that $s_i \in C$ and $C \subseteq H^0_{n-1}$ (Fig. 5(a)). In addition, for $C, C' \in \mathcal{F}$, $s_i \in C$ and $s_j \in C'$ with $i \ne j$, $C \ne C'$. From these, at most $|\{C_i \mid C_i \subseteq H^0_{n-1}\}|$ nodes in $\{s'_i \mid 2 \le i \le n\}$ are marked faulty. Therefore, from the condition of this case, there will be at most $n - 2$ faulty clusters of diameter at most 1 with at most $2(n - 1) - 3$ faulty nodes in total in $H^1_{n-1}$ after the marking. Thus, a path $t_2 \to s'_j \to s_1$ of length at most $(n - 1) + 2 = n + 1$ (the length of $t_2 \to s'_j$ is at most $n$), such that $t_2 \to s'_j \to s_j$ is fault-free, can be found in $H^1_{n-1}$ by Lemma 3. Notice that for $t_2 = s'_j$ with $s_j$ fault-free, the path $t_2 \to s'_j$ is trivial. If $t_2 = s'_l$ with $s_l$ faulty then we can enumerate the $n - 2$ disjoint paths of length 3 between $t_2$ and $s_1$ (the $n - 2$ paths contain the $n - 2$ neighbors $s'_i$, $2 \le i \le n$, of $s_1$, one path has one neighbor) to find the path $t_2 \to s'_j$, because a cluster of diameter at most 1 can block at most one of the $n - 2$ disjoint paths (Lemma 2).

Fig. 5. Case 2 of algorithm Router1: connecting $t_2$ to $s$ via $s_1$ or $s_j$.

If $s_1$ is fault-free then we have a fault-free path $s \to s_1 \to t_2$ of length at most $n + 2$ (Fig. 5(a)). Obviously, $s \to t_1$ of length at most $(n - 1) + 2 = n + 1$ can be found in $H^0_{n-1}$ by Lemma 3. If $s_1$ is faulty then we get a fault-free path $s \to s_j \to s'_j \to t_2$ of length at most $n + 2$ (Fig. 5(b)). Mark $s_j$ faulty. Since $s_1$ is faulty, there is a faulty cluster $C \subseteq H^1_{n-1}$. Therefore, there are at most $n - 2$ faulty clusters of diameter at most 1 with at most $2(n - 1) - 3$ faulty nodes in total in $H^0_{n-1}$ after marking $s_j$ faulty. Thus, a fault-free path $s \to t_1$ of length at most $(n - 1) + 2 = n + 1$ can be found by Lemma 3 in $H^0_{n-1}$. It is easy to see that the time complexity of Router1 is $O(n)$.

The algorithm for node-to-set CFT routing in $H_n$ for $k > 2$ is given in Fig. 6.

Theorem 5. Given a set $\mathcal{F}$ of faulty clusters with $|\mathcal{F}| \le n - k$ and $d(\mathcal{F}) \le 1$, a non-faulty node $s$, and $T = \{t_1, \ldots, t_k\}$ in $H_n$, $2 \le k \le n$, $k$ fault-free disjoint paths $s \to t_i$ of length at most $n + 2$ can be constructed in $O(kn)$ optimal time.

Proof. We use induction on $k$ to prove the theorem. For $k = 2$, the theorem holds by Lemma 4. Assume that the theorem holds for $k - 1 \ge 2$ and we prove it for $k$. If some node $t_i \in T$ is the neighbor of $s$ then connect $t_i$ to $s$ by a path of length 1. Mark $t_i$ faulty. Then the remaining $k - 1$ nodes of $T$ can be connected to $s$ by the induction hypothesis. So we assume that $\forall t_i \in T$, $d(s, t_i) > 1$.

Partition $H_n$ as we did in algorithm Node-To-Set. From $|\mathcal{F}| \le n - k$ and Lemma 2, at least $n - 1 - (n - k) = k - 1 \ge |T_1|$ paths in $\{s_i \to s'_i \mid 2 \le i \le n\}$ are fault-free. For any node $t_j \in T$, from $d(s, t_j) > 1$, we have $t_j \ne s_i$, $1 \le i \le n$. We call Node-To-Set recursively to connect $s_1$ to $t_i \in T_1$, forcing the paths to pass through the neighbors $s'_{j_i}$ of $s_1$ such that $s_{j_i} \to s'_{j_i} \to t_i$ are fault-free. Before calling Node-To-Set, we first handle the nodes $t_i$ with $d(s_1, t_i) = 1$.
Fig. 6. Algorithm for node-to-set CFT routing in $H_n$ for $k \ge 3$.
For $t_i = s'_{j_i}$ with $s_{j_i}$ fault-free, the path $t_i \to s'_{j_i}$ is trivial. For each $t_i = s'_{l_i}$ with $s_{l_i}$ faulty, we find the path $t_i \to s'_{j_i}$ with $s_{j_i}$ fault-free by enumerating the $n - 2$ disjoint paths of length 3 from $t_i$ to $s_1$. Obviously, one faulty cluster or a path $t_i \to s'_{j_i}$ can block at most one of the $n - 2$ paths from $t_j$ to $s_1$ for $t_j \ne t_i$. Thus, the fault-free disjoint paths $t_i \to s'_{j_i}$ of length 0 or 2 for $d(s_1, t_i) = 1$ can be found. For $t_i = s'_{j_i}$, mark $t_i$ faulty. For each node $t_i = s'_{l_i}$, mark $(t_i \to s'_{j_i})$ as two faulty clusters; one is a single node and the other is of diameter 1 (see Fig. 7).

Fig. 7. Connect $t_i$ with $d(s_1, t_i) = 1$ to $s'_{j_i}$.

The nodes $t_i \in T_1$ with $d(s_1, t_i) > 1$ are connected to $s'_{j_i}$ as follows. Similar to the proof of Lemma 4, we mark $s'_i$ faulty if $s_i$ is faulty and $s'_i$ is fault-free. Assume that $l$ nodes $s'_i$ are marked. Then there are at least $l + |\{t_i \mid t_i = s'_{l_i}\}|$ faulty clusters $C$ of $\mathcal{F}$ with $C \subseteq H^0_{n-1}$. So, there are at most $n - k - l - |\{t_i \mid t_i = s'_{l_i}\}|$ faulty clusters $C$ of $\mathcal{F}$ with $C \cap H^1_{n-1} \ne \emptyset$. On the other hand, the number of newly marked faulty clusters is $|\{t_i \mid t_i = s'_{j_i}\}| + 2|\{t_i \mid t_i = s'_{l_i}\}| + l$. Therefore, the total number of faulty clusters in $H^1_{n-1}$ is at most $n - k + |\{t_i \mid d(s_1, t_i) = 1\}|$, and at least one cluster is a single node if $|\{t_i \mid d(s_1, t_i) = 1\}| + l > 0$. The number of nodes $t_i$ with $d(s_1, t_i) > 1$ is $|T_1| - |\{t_i \mid d(s_1, t_i) = 1\}|$. From the induction hypothesis or Lemma 3, paths $s_1 \to s'_{j_i} \to t_i$ (where $d(s_1, t_i) > 1$ and $s_{j_i} \to s'_{j_i} \to t_i$ are fault-free) can be found in $H^1_{n-1}$. The length of the paths $s_1 \to t_i \in T_1$ is at most $(n - 1) + 2$ (induction hypothesis), and thus the length of $t_i \to s'_{j_i}$ is at most $n$.

After the nodes of $T_1$ are connected to $s'_{j_i}$ (where $s_{j_i} \to s'_{j_i}$ are fault-free) we connect $s'_{j_i}$ to $s$. If $s_1$ is fault-free then we find $|T_1| - 1$ paths $s \to s_{j_i} \to s'_{j_i} \to t_i$ and one path $s \to s_1 \to s'_{j_i} \to t_i$. We obtain $|T_1|$ fault-free disjoint paths $s \to t_i \in T_1$ of length at most $n + 2$. Mark the $|T_1| - 1$ nodes $s_{j_i}$ faulty. Then there are at most $n - k + |T_1| - 1 = (n - 1) - |T_0|$ faulty clusters in $H^0_{n-1}$.

Assume that $s_1$ is faulty. Find $|T_1|$ paths $s \to s_{j_i} \to s'_{j_i} \to t_i$. We obtain $|T_1|$ fault-free disjoint paths $s \to t_i \in T_1$ of length at most $n + 2$. Since $s_1$ is faulty, there is a faulty cluster $C \in \mathcal{F}$ with $C \subseteq H^1_{n-1}$ and thus, there are at most $n - k - 1 + |T_1| = (n - 1) - |T_0|$ faulty clusters in $H^0_{n-1}$ after marking the nodes $s_{j_i}$ faulty. If $|T_1| = 1$ then $|T_0| = k - |T_1| \ge 2$. If $|T_1| \ge 2$ then at least one faulty cluster is a marked $s_{j_i}$, a single node. Therefore, from the induction hypothesis or Lemma 3, $|T_0|$ disjoint fault-free paths of length at most $(n - 1) + 2 = n + 1$ from $s$ to $T_0$ can be found in $H^0_{n-1}$. These $|T_0|$ paths do not contain any faulty node or any node in the paths from $s$ to $T_1$.

Let $t(k, n)$ be the time complexity of algorithm Node-To-Set. Then it is easy to see that $t(k, n) = t(|T_0|, n - 1) + t(|T_1|, n - 1)$, where $1 \le |T_0|, |T_1| \le k - 1$, $|T_0| + |T_1| = k$, and $t(k, n) = O(n)$ for $k \le 2$. Thus, the time complexity of algorithm Node-To-Set is $O(kn)$. Since there are $k$ paths of length $\Omega(n)$ to be constructed, $O(kn)$ is the optimal time upper bound on node-to-set CFT routing in $H_n$.
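The step from the recurrence to the $O(kn)$ bound can be illustrated numerically. The sketch below evaluates the worst case of $t(k, n) = t(|T_0|, n - 1) + t(|T_1|, n - 1)$ over all splits $|T_0| + |T_1| = k$, under the simplifying assumption that the base case costs exactly $n$ (the hidden constant in $O(n)$ is taken to be 1). The function name `worst_case_cost` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def worst_case_cost(k: int, n: int) -> int:
    """Worst case of t(k, n) = t(k0, n-1) + t(k-k0, n-1), with t(k, n) = n for k <= 2."""
    if k <= 2:
        return n  # assumed unit-constant base case
    return max(
        worst_case_cost(k0, n - 1) + worst_case_cost(k - k0, n - 1)
        for k0 in range(1, k)
    )
```

By induction, each split contributes at most $|T_0|(n - 1) + |T_1|(n - 1) = k(n - 1) \le kn$, which is the bound the numerical values stay under.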
Theorem 6. For $k \ge 2$, $(n - k, 1)$ is an OCFT number of $H_n$ for node-to-set CFT routing.

Proof. From Theorem 5, $(n - k, 1)$ is a features number of $H_n$ for node-to-set CFT routing. Let $\mathcal{F} = \{(x_1, y_1), \ldots, (x_l, y_l)\}$, where $l = n - k + 1$ and for $1 \le i \le l$, $x_i = a_1 \cdots a_{n-1} 0$ and $y_i = a_1 \cdots a_{n-1} 1$ with $a_i = 1$ and $a_j = 0$ for $1 \le j \le n - 1$ and $j \ne i$. Then $|\mathcal{F}| = n - k + 1$ and $d(\mathcal{F}) = 1$. It is easy to check that the connectivity of $H_n - \{x_1, y_1, \ldots, x_l, y_l\}$ is $k - 1$. By Menger's theorem [9], a graph $G$ has $k$ disjoint paths from $s$ to $\{t_1, \ldots, t_k\}$ iff $G$ is $k$-connected. Thus, $(n - k + 1, 1)$ is not a features number of $H_n$ for node-to-set CFT routing. Similarly, we can show $(n - k, 2)$ is not a features number as well.

Theorem 7. There is an $\mathcal{F}$ with $|\mathcal{F}| = n - k$ and $d(\mathcal{F}) = 1$, and a non-faulty node $s$ and $T = \{t_1, \ldots, t_k\}$ in $H_n$ such that the length of the fault-free disjoint paths from $s$ to $T$ is at least $n + 2$.

Proof. Let $s = 1 \cdots 111$, $t_1 = 0 \cdots 010$, $t_2 = 0 \cdots 011$, and $\mathcal{F} = \{(x_1, y_1), \ldots, (x_{n-2}, y_{n-2})\}$, where for $1 \le i \le n - 2$, $x_i = a_1 \cdots a_{n-2} 01$ and $y_i = a_1 \cdots a_{n-2} 11$, with $a_i = 1$ and $a_j = 0$ for $j \ne i$. Then it is easy to check that the length of the shortest fault-free path from $s$ to $t_2$ that does not contain $t_1$ is $n + 2$.

Theorem 7 shows that the upper bound $n + 2$ on the length of the paths for node-to-set CFT routing in $H_n$ is optimal.

4. Set-to-set routing

Now we show the algorithm for set-to-set CFT routing in $H_n$. The algorithm follows a partition-then-route strategy similar to that in the last section. Given a set $\mathcal{F}$ of faulty clusters, and two sets $S = \{s_1, \ldots, s_k\}$ and $T = \{t_1, \ldots, t_k\}$ of non-faulty nodes in $H_n$, we partition $H_n$ such that $T_0 = T \cap H^0_{n-1} \ne \emptyset$ and $T_1 = T \cap H^1_{n-1} \ne \emptyset$. Let $S_0 = S \cap H^0_{n-1}$ and $S_1 = S \cap H^1_{n-1}$. If $|S_0| \ne |T_0|$, say $|S_0| > |T_0|$, then we route some nodes of $S_0$ into the opposite subcube to get subsets $S'_0 \subseteq H^0_{n-1}$ and $S'_1 \subseteq H^1_{n-1}$ of $S$, with $|S'_0| = |T_0|$ and $|S'_1| = |T_1|$, and the subproblems are recursively solved in $H^0_{n-1}$ and $H^1_{n-1}$, respectively. When we connect a node $s_i \in S'_0$ to a node $t_{j_i} \in T_0$, some node $s_j \in S_0 - S'_0$ may appear in the path $s_i \to t_{j_i}$. If this happens then a substitution technique is used that connects $s_j$ to $t_{j_i}$ and re-routes $s_i$ into $H^1_{n-1}$. If $|S|$ and $|T|$ become 1 then Lemma 3 is used. We first give a subroutine which handles the case of $k = 2$ (see Fig. 8). To show the correctness of subroutine Router2, we need the following lemma.

Lemma 8. For $s_1, s_2 \in H^0_{n-1}$, let $P_1, \ldots, P_n$ be the $n$ routing paths for $s_1$ and $Q_1, \ldots, Q_n$ be the $n$ routing paths for $s_2$, given by Lemma 1. Then, for $s_1 \ne s_2$, at least $2n - 2$ paths of $P_1, \ldots, P_n, Q_1, \ldots, Q_n$ are disjoint except at the end-nodes $s_1$ or $s_2$.
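Lemma 8 can be explored on small cases by listing which pairs $(P_i, Q_j)$ share a node other than $s_1$ or $s_2$. The sketch below assumes the bit-string encoding with bit 1 leftmost and the partition on the first bit. The helper names, in particular `meeting_pairs`, are illustrative and not part of the paper.

```python
def flip(u, *positions):
    """Negate the bits of u at the given 1-indexed positions."""
    bits = list(u)
    for i in positions:
        bits[i - 1] = "1" if bits[i - 1] == "0" else "0"
    return "".join(bits)


def lemma1_paths(s):
    """Paths P_1, ..., P_n of Lemma 1 for a node s whose first bit is 0."""
    n = len(s)
    paths = {1: [s, flip(s, 1)]}
    for j in range(2, n + 1):
        paths[j] = [s, flip(s, j), flip(s, j, 1)]
    return paths


def meeting_pairs(s1, s2):
    """Pairs (i, j) such that P_i (from s1) and Q_j (from s2) share a node
    other than the end-nodes s1 and s2."""
    ends = {s1, s2}
    P, Q = lemma1_paths(s1), lemma1_paths(s2)
    return {
        (i, j)
        for i, p in P.items()
        for j, q in Q.items()
        if (set(p) & set(q)) - ends
    }
```

With $s_1 = 0000$, the call `meeting_pairs("0000", "0100")` returns the two pairs $(1, 2)$ and $(2, 1)$, `meeting_pairs("0000", "0110")` returns $(2, 3)$ and $(3, 2)$, and `meeting_pairs("0000", "0111")` is empty. In each case, dropping one path per meeting pair leaves at least $2n - 2$ paths.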
Proof. Assume $s_1 = a_1 a_2 \cdots a_n$ and $s_2 = b_1 b_2 \cdots b_n$. If $d(s_1, s_2) \ge 3$ then obviously the $2n$ paths $P_1, \ldots, P_n$ and $Q_1, \ldots, Q_n$ are disjoint except at the end-nodes $s_1$ or $s_2$. For $d(s_1, s_2) = 1$, assume that $a_k = \bar{b}_k$ and $a_i = b_i$, $1 \le i \le n$ and $i \ne k$. It is easy to see that path $P_k$ meets path $Q_1$, path $P_1$ meets path $Q_k$, and no other pair of paths $P_l$ and $Q_m$ has a common node except the end-nodes $s_1$ or $s_2$. Thus, taking $\mathcal{P} = \{P_i, Q_i \mid 1 \le i \le n, i \ne k\}$, the paths in $\mathcal{P}$ are disjoint except at end-nodes $s_1$ or $s_2$ and $|\mathcal{P}| = 2n - 2$. For $d(s_1, s_2) = 2$, assume $a_k = \bar{b}_k$, $a_j = \bar{b}_j$, and $a_i = b_i$, $1 \le i \le n$ and $i \ne j, k$. Then in the $2n$ paths $P_1, \ldots, P_n$ and $Q_1, \ldots, Q_n$, path $P_k$ meets path $Q_j$ and path $P_j$ meets path $Q_k$, and no other pair of paths has a common node except the end-nodes $s_1$ or $s_2$. Thus, $2n - 2$ of the above $2n$ paths are disjoint except at the end-nodes $s_1$ or $s_2$.

Lemma 9. Given a set $\mathcal{F}$ of faulty clusters with $|\mathcal{F}| \le n - 2$ and $d(\mathcal{F}) \le 1$, and two sets of mutually disjoint, non-faulty nodes $S = \{s_1, s_2\}$ and $T = \{t_1, t_2\}$ in $H_n$, $2 \le n$, 2 fault-free disjoint paths of length at most $n + 4$, each connecting a node of $S$ and a node of $T$, can be constructed in $O(n)$ time.

Proof. We prove the lemma by induction on $n$. For $n = 2$, the paths are trivial and the length of the paths is 1. Assume that the lemma holds for $n - 1 \ge 2$ and we prove it for $n$. Partition $H_n$ as in subroutine Router2.

Fig. 8. Algorithm for set-to-set CFT routing in $H_n$ for $k = 2$.

Case 1. $\forall C_i \in \mathcal{F}$, $C_i \subseteq H^0_{n-1}$: From Lemma 2, we can find a fault-free path $t_1 \to t'_1 \in H^1_{n-1}$ of length at most 2 with $t'_1 \ne t_2$. Similarly, we can find disjoint fault-free paths $s_i \to s'_i \in H^1_{n-1}$ of length at most 2 for $s_i \in S \cap H^0_{n-1}$. Since $H^1_{n-1}$ is fault-free, following a similar argument as in the proof of Case 1 of Lemma 4, $\{s'_1, s'_2\}$ and $\{t'_1, t_2\}$ can be connected by fault-free disjoint paths of length at most $n$ in $H^1_{n-1}$. Thus, the lemma holds.

Case 2. $\exists C_i, C_j \in \mathcal{F}$, $C_i \cap H^0_{n-1} \ne \emptyset$ and $C_j \cap H^1_{n-1} \ne \emptyset$: Assume that $s_1$ and $s_2$ are separated by the partition. From the condition of the case, there are at most $n - 2$ faulty clusters of diameter at most 1 with at most $2(n - 1) - 3$ faulty nodes in total in $H^0_{n-1}$. Therefore, the node of $S \cap H^0_{n-1}$ can be connected to $t_1$ by Lemma 3 in $H^0_{n-1}$. Similarly, the node of $S \cap H^1_{n-1}$ and $t_2$ can be connected in $H^1_{n-1}$. The length of the paths is at most $(n - 1) + 2 = n + 1$.

Now, we assume $s_1, s_2 \in H^0_{n-1}$ ($s_1, s_2 \in H^1_{n-1}$ can be proved symmetrically). We first check if there are fault-free paths $P : s_1 \to s'_1 \in H^1_{n-1}$ and $Q : s_2 \to s'_2 \in H^1_{n-1}$, where $s_1 \notin Q$ and $s_2 \notin P$. If path $P$ does not exist then it must be the case that a faulty cluster $C \subseteq H^1_{n-1}$ blocks one path of length 1 and one path of length 2, the remaining $n - 3$ clusters block $n - 3$ paths of length 2, and $s_2$ blocks one path of length 2, of the $n$ paths given in Lemma 1 for $s_1$ (Fig. 9(a)). Since at least $n - 1$ faulty nodes appear in the routing paths for $s_1$, from Lemma 8 and the fact that there are at most $2n - 4$ faulty nodes in $H_n$, path $Q$ exists. In fact, if $H^1_{n-1}$ contains only one faulty cluster $C$ then we can find a path $Q$ of length 1 for $s_2$. If the path $Q$ of length 1 is blocked then there must be another faulty cluster $C' \subseteq H^1_{n-1}$ with $C' \ne C$. Therefore, marking the subpath $Q \cap H^0_{n-1}$ faulty, there are at most $n - 2$ faulty clusters of diameter at most 1 with at most $2(n - 1) - 3$ faulty nodes in total in $H^0_{n-1}$. This implies that $s_1$ can be connected to $t_1$ by Lemma 3 in $H^0_{n-1}$. Obviously, node $s'_2$ can be connected to $t_2$ in $H^1_{n-1}$ by Lemma 3 as well.

So, we assume both $P$ and $Q$ exist. From a similar argument as above, we can further assume that $P$ and $Q$ are disjoint. We find the fault-free path $P : s_1 \to s'_1 \in H^1_{n-1}$ and a fault-free path $R : s_2 \to t_1$ in $H^0_{n-1}$ (see Fig. 9(b)). If paths $P$ and $R$ are disjoint then we are done. Assume that paths $P$ and $R$ intersect at node $u$ and $t_1 \to v$ is the longest subpath of $R$ that does not contain any node of $P$. We find the fault-free path $Q : s_2 \to s'_2 \in H^1_{n-1}$ which is disjoint with $P$. If $Q$ intersects path $t_1 \to v$ then connect $s_2$ to $t_1$ and route $s_1$ into $H^1_{n-1}$ by $P$. Assume that $Q$ and $t_1 \to v$ are disjoint. Then obviously, paths $Q$ and $s_1 \to t_1$ are disjoint. Thus, $s_1$ can be connected to $t_1$ and $s_2$ can be routed into $H^1_{n-1}$ by fault-free disjoint paths. $s'_1$ or $s'_2$ can be connected to $t_2$ in $H^1_{n-1}$ by Lemma 3.
Fig. 9. Connect $\{s_1, s_2\}$ to $\{t_1, t_2\}$.
The length of the paths from fs1 ; s2 g to ft1 ; t2 g is at most
n ÿ 1 2 2 n 3. Obviously, the time complexity of Router2 is O
n: The algorithm for set-to-set CFT routing in Hn for k P 3 is given in Fig. 10. Theorem 10. Given a set F of faulty clusters with jFj 6 n ÿ k and d
F 6 1, and two sets of mutually disjoint, non-faulty nodes S fs1 ; . . . ; sk g and T ft1 ; . . . ; tk g in Hn , 2 6 k 6 n, k fault-free disjoint paths of length at most n k 2, each path connects a node of S and a node of T , can be constructed in O
kn log k time. Proof. The theorem is proved by induction on k. For k 2, the theorem follows directly from Lemma 9. Assume that the theorem holds for k ÿ 1 P 2 and we prove it for k in Hn . Following T we1 divide the proof into the following cases. T 0 the algorithm, 1 6 ; and S1 Hnÿ1 6 ;) or (S0 S and 9Ci 2 F, Ci Hnÿ1 ): Case 1. (S0 S Hnÿ1 Assume that jS0 j > jT0 j P jT1 j. The other cases can be proved similarly. We claim 1 that jS0 j ÿ jT0 j nodes si of S0 can be connected to s0i 2 Hnÿ1 , s0i 62 S, by fault-free 1 exists then the disjoint paths of length at most 2. If 8si 2 S0 , the path si ! s0i 2 Hnÿ1 claim holds. If there is a node s 2 S0 such that all the n paths of Lemma 1 for s are blocked by faulty nodes and the nodes si of S, si 6 s, then at least n ÿ
k ÿ 1 faulty nodes appear in the n paths for s. From Lemma 8, for each node si 2 S0 , si 6 s, there are n ÿ 2 disjoint paths for si that are disjoint with the n paths for s. Since the total number of faulty nodes is at most 2
n ÿ k, at most 2
n ÿ k ÿ
n ÿ k 1 n ÿ k ÿ 1 faulty nodes may block the n ÿ 2 paths for si . The k ÿ 2 nodes sj (or the routing path for sj ) of S, sj 6 s; si , may block k ÿ 2 additional paths. Totally at most n ÿ k ÿ 1 k ÿ 2 < n ÿ 2 paths for si can be blocked. Thus, for any si 6 s of S0 , we 1 . From jT0 j P 1, the claim holds. can ®nd a fault-free path si ! s0i 2 Hnÿ1 0 1 and Hnÿ1 , To balance the number of nodes to be connected in Hnÿ1 1 m jS0 j ÿ jT0 j jSj ÿ jS1 j ÿ jT0 j nodes of S0 are routed into Hnÿ1 , that generate m
Fig. 10. Algorithm for set-to-set CFT routing in $H_n$ for $k \ge 3$.
new faulty clusters of diameter at most 1 in $H_{n-1}^0$. Assume $|S_1| \ge 1$. If $\exists C \subseteq H_{n-1}^1$ then there are at most $(n - k - 1) + m \le (n - 2) - |T_0|$ faulty clusters of diameter at most 1 in $H_{n-1}^0$. By the induction hypothesis (for $|T_0| \ge 2$) or Lemma 3 (for $|T_0| = 1$), the nodes of $T_0$ can be connected to $|T_0|$ nodes of $S_0$. If $\forall C \in F$, $C \cap H_{n-1}^0 \ne \emptyset$ then there are at most $(n - k) + m \le (n - 1) - |T_0|$ faulty clusters in $H_{n-1}^0$. Since $|S_0| > |T_0| \ge |T_1| > |S_1|$, $|S_0| > |S_1| + m$. Therefore, from $\forall C \in F$, $C \cap H_{n-1}^0 \ne \emptyset$, $m \ge 1$ nodes of $S_0$ can be routed to $H_{n-1}^1$ by disjoint paths of length 1, resulting in $m$ faulty clusters of single node in $H_{n-1}^0$. Thus, the nodes of $T_0$ can be connected to $|T_0|$ nodes of $S_0$ by the induction or Lemma 3.

Assume $|S_1| = 0$. Let $C_1, \ldots, C_l$ be the clusters of $F$ with $C_1, \ldots, C_l \subseteq H_{n-1}^1$. From the condition of this case, $l \ge 1$. If $l \ge 2$ then there are at most $(n - k - l) + m \le (n - 2) - |T_0|$ faulty clusters of diameter at most 1 in $H_{n-1}^0$. If $l = 1$ then $C_1$ can block the routing paths of length 1 for at most two nodes of $S_0$. From $|S_0| = |S| \ge 3$, we can route at least one node of $S_0$ into $H_{n-1}^1$ by a fault-free path of length 1, resulting in at least one faulty cluster of single node in $H_{n-1}^0$. Thus, the nodes of $T_0$ can be connected to $|T_0|$ nodes of $S_0$ in $H_{n-1}^0$ by the induction or Lemma 3.

Obviously, the nodes of $S$ and $T$ in $H_{n-1}^1$ can be connected by disjoint fault-free paths from the induction hypothesis.

Case 2. $S \subseteq H_{n-1}^0$ and $\forall C_i \in F$, $C_i \cap H_{n-1}^0 \ne \emptyset$: In this case, for $s_i \in S$, path $s_i \to s'_i \in H_{n-1}^1$ of length 1 is fault-free. Let $S'$ be a subset of $S$ with $|S'| = |T_0|$. By the induction hypothesis, we can connect $S'$ to $T_0$ in $H_{n-1}^0$. For the node $s_j \in S - S'$, if $s_j$ does not appear in any path $s_i \to t_{j_i} \in T_0$ then find the path $s_j \to s'_j \in H_{n-1}^1$ of length 1. Notice that for $s_i \in S'$, path $s_i \to t_{j_i} \in T_0$ may contain some nodes in $(S - S')$. In this case, let $s_j$ be the nearest node to $t_{j_i}$ among the nodes of the path $s_i \to t_{j_i}$ which are in $(S - S')$. Then, we disconnect $s_i$ from $t_{j_i}$, connect $s_j$ to $t_{j_i}$, and find the path $s_i \to s'_i \in H_{n-1}^1$ (see Fig. 11). The subproblem in $H_{n-1}^1$ can be done by the induction hypothesis.

Let $L(k, n)$ be the length of the paths found by algorithm Set-To-Set. Then it is easy to see $L(k, n) \le \max\{L(k_1, n - 1), L(k_2, n - 1)\} + 2$, where $1 \le k_1, k_2 \le k$, $k_1 + k_2 = k$, $L(2, n) \le n + 4$ and $L(1, n) \le n + 2$. From this, $L(k, n) \le n + k + 2$.

Let $t(k, n)$ be the time complexity of algorithm Set-To-Set. Then $t(k, n) = t(k_1, n - 1) + t(k_2, n - 1) + t(n)$, where $1 \le k_1, k_2 \le k - 1$, $k_1 + k_2 = k$, $t(1, n) \le cn$ for some constant $c$, and $t(n)$ is the time complexity of partitioning $H_n$ and then routing
Fig. 11. Connect $S'$ to $T_1$.
some nodes of $S$ into the opposite subcube. From this, $t(2, n) = t(1, n - 1) + t(1, n - 1) + t(n) \le 2cn + t(n)$. Assume $t(j, n) \le jcn + (j - 1)t(n)$ for all $1 \le j \le k - 1$. Then,

$$t(k, n) = t(k_1, n - 1) + t(k_2, n - 1) + t(n) \le k_1 cn + (k_1 - 1)t(n - 1) + k_2 cn + (k_2 - 1)t(n - 1) + t(n) \le kcn + (k - 1)t(n).$$

Therefore, $t(k, n) = O(kn + kt(n))$. Clearly, the partition takes $O(n)$ time. After the partition, $O(k)$ nodes of $S$ may be routed into the opposite subcube. It takes $O(n)$ time to find a fault-free routing path for one node of $S$. Therefore, a trivial upper bound on $t(n)$ is $O(kn)$ and $t(k, n) = O(k^2 n)$. However, it should be noted that if it takes $O(n)$ time to route one node $s_i$ then most of the faulty nodes are close to $s_i$ and it may take less time to route the other nodes. We claim that $t(n) = O(n \log k)$. To prove this, we need the following combinatorial properties of $H_n$.

Lemma 11. For a node $u \in H_n$, let $N(u) = \{v \mid v \in H_n, d(u, v) \le 1\}$. For any set of nodes $X \subseteq H_n$ define $f(x) = |\{y \mid x \in N(y), y \in X\}|$, and $F(X) = \sum_{x \in X} f(x)$. Then $F(X) \le |X| \log |X|$.

Proof. The lemma is proved by induction on $|X|$. It is trivial to show that the lemma holds for $|X| \le 4$. Now we assume that the lemma holds for $|X| \le m - 1$, and prove the case of $|X| = m$. We partition $H_n$ into two subcubes $H_{n-1}^0$ and $H_{n-1}^1$ such that $H_{n-1}^0$ and $H_{n-1}^1$ contain $l$ and $m - l$ nodes of $X$, respectively, with $1 \le l \le m - 1$. From the induction hypothesis we have $F(X \cap H_{n-1}^0) \le l \log l$ and $F(X \cap H_{n-1}^1) \le (m - l) \log (m - l)$. It is clear that each node of $H_{n-1}^0$ is a neighbor node of exactly one node in $H_{n-1}^1$ and vice versa. Therefore, we have

$$F(X) \le l \log l + (m - l) \log (m - l) + 2 \min\{l, m - l\}.$$

It is not difficult to show that $l \log l + (m - l) \log (m - l) + 2 \min\{l, m - l\} \le m \log m$ for $1 \le l \le m - 1$. Thus, the lemma holds.
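The bound of Lemma 11 can be checked numerically on small hypercubes. The following Python sketch is not part of the paper's algorithm. It assumes that nodes of $H_n$ are encoded as $n$-bit integers, that $N(u)$ is read as the set of nodes at distance exactly 1 from $u$ (under this reading $F(X)$ is twice the number of hypercube edges with both endpoints in $X$), and that the logarithm is taken to base 2. The function names are illustrative only. If $u$ itself is also counted in $N(u)$, every node of $X$ contributes one extra pair, which adds $|X|$ to the count and leaves the asymptotic bound unchanged.

```python
import math
import random


def neighbor_pair_count(nodes, n):
    """F(X): number of ordered pairs (x, y) of adjacent nodes, both in X."""
    node_set = set(nodes)
    count = 0
    for x in node_set:
        for bit in range(n):
            # Flipping one bit gives a hypercube neighbor of x.
            if x ^ (1 << bit) in node_set:
                count += 1
    return count


def lemma11_holds(nodes, n):
    """Check F(X) <= |X| log2 |X| for one node set (assumed reading of N)."""
    size = len(set(nodes))
    if size == 0:
        return True
    # Small tolerance guards against floating-point rounding in log2.
    return neighbor_pair_count(nodes, n) <= size * math.log2(size) + 1e-9


def check_random_subsets(n, trials, seed=0):
    """Test the bound on random subsets of H_n; True if none violates it."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(1, 2 ** n)
        subset = rng.sample(range(2 ** n), size)
        if not lemma11_holds(subset, n):
            return False
    return True
```

A full subcube meets the bound with equality under this reading: for all 8 nodes of $H_3$ the count is 24, which equals $8 \log_2 8$.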
Lemma 12. Let $X, Y \subseteq H_n$ with $|X| \ge |Y|$. For $x \in X$, define $g(x, Y) = |\{y \mid x \in N(y), y \in Y\}|$ and $G(X, Y) = \sum_{x \in X} g(x, Y)$. Then $G(X, Y) = O(|X| \log |Y|)$.

Proof. Partition $X$ into $r = \lceil |X| / |Y| \rceil$ disjoint subsets $X_1, \ldots, X_r$ with $||X_i| - |X_j|| \le 1$ ($1 \le i, j \le r$). Then from Lemma 11,

$$G(X, Y) = \sum_{i=1}^{r} G(X_i, Y) \le \sum_{i=1}^{r} F(X_i \cup Y) = O(|X| \log |Y|).$$
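The counting step in the proof of Lemma 12 can be illustrated on small instances. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: nodes are $n$-bit integers, $N(y)$ is read as the nodes at distance exactly 1 from $y$, and `f_total`, `g_total`, `balanced_parts`, and `lemma12_terms` are hypothetical helper names. It computes $G(X, Y)$ directly, again through the partition $X_1, \ldots, X_r$, and finally the sum of $F(X_i \cup Y)$ that bounds it.

```python
import math


def f_total(nodes, n):
    """F(X): ordered pairs of adjacent nodes inside the set."""
    node_set = set(nodes)
    return sum(1 for x in node_set for bit in range(n)
               if x ^ (1 << bit) in node_set)


def g_total(xs, ys, n):
    """G(X, Y): pairs (x, y) with x in X, y in Y, and x adjacent to y."""
    y_set = set(ys)
    return sum(1 for x in set(xs) for bit in range(n)
               if x ^ (1 << bit) in y_set)


def balanced_parts(xs, r):
    """Split X into r disjoint parts whose sizes differ by at most 1."""
    ordered = sorted(set(xs))
    return [ordered[i::r] for i in range(r)]


def lemma12_terms(xs, ys, n):
    """Return (G(X,Y), sum of G(X_i,Y), sum of F(X_i union Y))."""
    r = math.ceil(len(set(xs)) / len(set(ys)))
    parts = balanced_parts(xs, r)
    g_direct = g_total(xs, ys, n)
    g_by_parts = sum(g_total(part, ys, n) for part in parts)
    f_bound = sum(f_total(set(part) | set(ys), n) for part in parts)
    return g_direct, g_by_parts, f_bound
```

The first two returned values always agree because the parts are disjoint, and the third is never smaller, since every counted pair $(x, y)$ with $x \in X_i$ is also an adjacent pair inside $X_i \cup Y$.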
It takes $O(m_i)$ time to route $s_i$ if $m_i$ of the $n$ paths for $s_i$ are blocked. From this, it takes $O\bigl(\sum_{s_i \in S} m_i\bigr)$ time to route the nodes of $S$. Let $Y$ be the set of nodes to be routed and $X$ be the set of nodes which may block the routing paths needed. Then $g(x, Y)$ defined in Lemma 12 gives an upper bound on the number of the routing
paths for the nodes to be routed that are blocked by node $x \in X$. Therefore, from Lemma 12,

$$t(n) = O\Bigl(\sum_{s_i \in S} m_i\Bigr) \le G(X, Y) = O(|X| \log |Y|).$$

From this, $|X| = O(n)$, and $|Y| = O(k)$, we have $t(n) = O(n \log k)$. Thus, the time complexity of algorithm Set-To-Set is $O(kn \log k)$.

5. Concluding remarks

In this paper, we proved that node-to-set and set-to-set routing in $H_n$ can tolerate $n - k$ arbitrary faulty clusters of diameter 1 rather than $n - k$ faulty nodes. We also gave efficient algorithms for node-to-set and set-to-set CFT routing in $H_n$. How to deal with clusters of diameter greater than 1 for routing problems in $H_n$ is open. It can be shown that $H_n$ can be separated by one cluster of diameter 4. CFT routing in other interconnection networks is also worth further investigation.
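The remark that one cluster of diameter 4 can separate $H_n$ can be illustrated with one possible construction. The construction is an assumption of this sketch and is not taken from the paper: take all nodes within distance 2 of a node $v$, except one neighbor $u$ of $v$. Every neighbor of $u$ then lies in the cluster, so removing the cluster isolates $u$. The Python code below encodes nodes as $n$-bit integers, fixes $v = 0$ and $u = 1$, and checks the cluster's diameter and the disconnection by breadth-first search. All function names are illustrative.

```python
from collections import deque


def bfs_distances(start, allowed, n):
    """Distances from start inside the subgraph of H_n induced by `allowed`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for bit in range(n):
            nxt = node ^ (1 << bit)
            if nxt in allowed and nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def separating_cluster(n):
    """Nodes within distance 2 of v = 0, without the neighbor u = 1."""
    return {x for x in range(2 ** n) if bin(x).count("1") <= 2 and x != 1}


def induced_diameter(nodes, n):
    """Diameter of the induced subgraph, or None if it is disconnected."""
    worst = 0
    for start in nodes:
        dist = bfs_distances(start, nodes, n)
        if len(dist) != len(nodes):
            return None
        worst = max(worst, max(dist.values()))
    return worst


def is_separated(n):
    """True if removing the cluster leaves a disconnected graph."""
    cluster = separating_cluster(n)
    rest = set(range(2 ** n)) - cluster
    reached = bfs_distances(1, rest, n)
    return len(reached) < len(rest)
```

For $n = 5$ the cluster has 15 nodes, is connected with diameter 4, and its removal leaves node $u = 1$ cut off from the remaining nodes.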
Acknowledgements

The authors thank the anonymous reviewers for their helpful comments and suggestions. A preliminary version of this paper was presented at the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN'94). This research was partially supported by the Founding of Group Research Projects at The University of Aizu.
References

[1] D.F. Hsu, On container with width and length in graphs, groups, and networks, IEICE Trans. Fundamental of Electronics, Information, and Computer Sciences, E77-A (4) (1994) 668–680.
[2] D.F. Hsu, Interconnection Networks and Algorithms (special issue), Networks 23 (4) (1993).
[3] J.C. Bermond, Interconnection Networks (special issue), Discrete Appl. Math., 1992.
[4] M.A. Rabin, Efficient dispersal of information for security, load balancing, and fault tolerance, J. ACM 36 (2) (1989) 335–348.
[5] A.H. Esfahanian, Generalized measures of fault tolerance with application to n-cube networks, IEEE Trans. Computers 38 (11) (1989) 1586–1591.
[6] Q. Gu, S. Peng, k-pairwise cluster fault tolerant routing in hypercubes, IEEE Trans. Computers 46 (9) (1997) 1042–1049.
[7] Q. Gu, S. Peng, Node-to-node cluster fault tolerant routing in star graphs, Information Processing Letters 56 (1995) 29–35.
[8] Q. Gu, S. Peng, An efficient algorithm for node-to-node routing in hypercubes with faulty clusters, The Computer Journal 39 (1) (1996) 14–19.
[9] J. McHugh, Algorithmic Graph Theory, Prentice-Hall, Englewood Cliffs, NJ, 1990.
[10] C.L. Seitz, The cosmic cube, Communication of ACM 28 (1) (1985) 22–33.
[11] Y. Saad, M.H. Shultz, Topological properties of hypercubes, IEEE Trans. Computers 37 (7) (1988) 867–872.
[12] S. Madhavapeddy, I.H. Sudborough, A topological property of hypercubes: Node disjoint paths, in: Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990, pp. 532–539.
[13] Q. Gu, S. Okawa, S. Peng, Set-to-set fault tolerant routing in hypercubes, IEICE Trans. Fundamentals E79A (4) (1996) 483–488.