Computer Networks and ISDN Systems 26 (1993) 227-232 North-Holland
Incremental distributed asynchronous algorithm for minimum spanning trees * Yung H. Tsin School of Computer Science, University of Windsor, Windsor, Ont., Canada N9B 3P4
Abstract A distributed algorithm for updating a minimum spanning tree when a new vertex is added to the underlying network (a connected, undirected, weighted graph) is presented. The algorithm runs asynchronously and the processor at each vertex of the network is required to know only information concerning its adjacent edges. The number of messages transmitted is bounded by 6N and the time complexity is O(H) where N and H( ~
Keywords: Distributed algorithms; Communication networks; Minimum spanning trees; Graph theory; Incremental algorithms; Vertex insertion.
* Research supported by the Natural Sciences and Engineering Research Council of Canada under Grant NSERCOGP0007811 and by the University of Windsor Research Board Grant.
Correspondence to: Y.H. Tsin, School of Computer Science, University of Windsor, Windsor, Ont., Canada N9B 3P4.

1. Introduction

A distributed computer network can be modelled as a connected, undirected, weighted graph, $G = (V, E)$, in which every vertex corresponds to a computer system and every edge corresponds to a communication link between two computer systems. The weight associated with each edge represents the traffic cost of the communication link corresponding to that edge. The problem of constructing an MST (Minimum Spanning Tree) for a computer network is well known and important, because many problems in distributed computer networks are reducible to the MST problem. Numerous distributed algorithms for this problem have thus been proposed [1,3-8,12]. Among them the most efficient is that of Awerbuch [1], which has $O(N)$ time and $O(N \log_2 N + |E|)$ message complexity, where $N = |V|$.

In reality, the structure of any network is subject to modification. This brings up the MST updating problem, which concerns recomputing the MST when changes are made to the underlying network. An obvious solution to the MST updating problem is to construct an MST for the new network from scratch. However, this could be unnecessarily expensive when much of the new MST coincides with the old one. A better solution is thus to exploit and modify the structure of the old MST so as to produce the new one. Algorithms of this type, which make use of existing structure to avoid computing from scratch, are called incremental algorithms. The changes we consider in this paper are vertex addition and deletion. We assume that these changes are incremental in the sense that they are made to the network one at a time. For vertex deletion, by
0169-7552/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved
Y.H. Tsin / Incremental distributed asynchronous algorithm
adopting the idea underlying Tsin's parallel algorithm [11], an $O(H' + N'')$ time and $O(N \log_2 N'' + |E'|)$ message distributed algorithm can be derived, where $H'$ ($< N$) and $N''$ ($< N$) are the maximum height of the fragments and the number of fragments, respectively, of the old MST that remain after the vertex deletion, and $E'$ ($\subseteq E$) is $E$ with the edges incident upon the deleted vertex (vertices) removed. However, as Pawagi and Ramakrishnan [10] pointed out, deleting a vertex from a network may result in removing the entire MST, so any incremental algorithm for vertex deletion, including the aforementioned one, has worst-case time and message complexities no better than those of constructing the new MST from scratch. We therefore do not pursue this problem any further.

For vertex insertion, however, the situation is much better. We present an asynchronous distributed incremental algorithm which handles one vertex insertion in $O(H)$ time and $O(N)$ messages, where $H$ is the height of the old MST (note that $H < N$). This result is superior both to the algorithm which constructs the new MST from scratch and to the previously known incremental algorithm of Parker and Samadi [9], which requires $O(N^2)$ time and messages as well as complicated routing tables and locking mechanisms. By incorporating a vertex selection algorithm into our algorithm, the resulting algorithm can handle $k$ ($\ge 1$) vertex insertions in $O(k \cdot \max(H, H_1, \ldots, H_{k-1}))$ time and $O(kN + k^2)$ message transmissions, where $H_i$ is the height of the MST after $i$ new vertices have been handled. The previously known algorithm requires $O(kN^2 + k^2)$ time and messages in the worst case [9].

We assume without loss of generality that an MST is a rooted tree. Hence, it makes sense to talk about the father and the sons of a vertex in an MST.
The model we use here has been used in [1,4,7,9]. Specifically, the processor at each vertex knows only (i) its unique 'id' number; (ii) the labels and the weights of all its adjacent edges; (iii) which of its adjacent edges are tree-edges of the old MST; (iv) the edge joining it to its father vertex in the old MST. Each processor executes the same local algorithm asynchronously, which consists of sending out messages along adjacent edges and processing messages received from adjacent edges. Messages at each vertex are processed in first-come-first-served order. The edges are bidirectional in that messages can be transmitted independently along them in both directions. The time it takes a message to pass from a vertex to an adjacent vertex is unpredictable but finite.
2. The distributed MST updating algorithm for vertex insertion

Our MST updating algorithm is based on the following lemma.

Lemma 1. An edge e of G is not an edge of the MST of G if and only if e is the maximum edge of a cycle in G.

Let $G = (V, E)$ be a connected, undirected, weighted graph and $T = (V, E')$ be an MST of G with root r. Suppose a new vertex z is inserted into G, resulting in a new graph $G_z = (V \cup \{z\}, E \cup E_z)$, where $E_z$ is the set of new weighted edges connecting z with vertices in G. Our task is to construct an MST $T'$ for $G_z$. Let $T_z = (V \cup \{z\}, E' \cup E_z)$. Using Lemma 1, it is easily shown that an MST of $T_z$ is also an MST of $G_z$. Since $T_z$ is a subgraph of $G_z$, it is obviously easier to construct $T'$ from $T_z$. However, instead of using
Yung H. Tsin received his B.Sc. degree in mathematics from Nanyang University in 1972, and his M.Sc. and Ph.D. degrees in computer science from the University of Calgary in 1979 and the University of Alberta in 1983, respectively. He is currently an associate professor at the University of Windsor. His research interests are mainly in the design and analysis of algorithms for sequential, parallel and distributed computer models. Dr. Tsin is a member of ACM and SIAM.
the existing MST algorithms, all of which construct an MST by starting from small fragments of the graph and successively extending them to produce the MST, we use a different approach which starts from the entire graph and gradually trims it down to an MST by repeatedly deleting maximum edges on cycles. This approach to constructing an MST has at least two advantages. First, cycles can be deleted in parallel. Second, for graphs with few cycles (such as $T_z$), it could outperform all the existing MST algorithms. Although, in principle, our algorithm is essentially the same as the greedy algorithm for constructing a maximum weight cobasis of a matroid [2, pp. 495-496] as well as Pawagi and Ramakrishnan's parallel algorithm [10], the detailed implementation is completely different.
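As a rough illustration of the trimming approach, the following Python sketch starts from all the edges of a small graph and repeatedly deletes the maximum edge of some cycle until no cycle remains. It is a sequential sketch only, not the distributed algorithm of this paper; the function names (find_cycle, trim_to_mst), the edge representation and the example weights are assumptions made for illustration.

```python
def _close_cycle(parent, u, w):
    """Edges of the cycle formed by the non-tree edge {u, w} and the search tree."""
    up_u = [u]
    while parent[up_u[-1]] is not None:
        up_u.append(parent[up_u[-1]])
    pos = {x: i for i, x in enumerate(up_u)}
    cycle = [frozenset((u, w))]
    x = w
    while x not in pos:              # climb from w to the first common ancestor
        cycle.append(frozenset((x, parent[x])))
        x = parent[x]
    for i in range(pos[x]):          # climb from u to that ancestor
        cycle.append(frozenset((up_u[i], up_u[i + 1])))
    return cycle


def find_cycle(vertices, edges):
    """Return the edges of some cycle of a simple undirected graph, or None."""
    adj = {v: [] for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].append(w)
        adj[w].append(u)
    parent = {}
    for start in vertices:
        if start in parent:
            continue
        parent[start] = None
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w == parent[u]:
                    continue
                if w in parent:      # edge {u, w} closes a cycle
                    return _close_cycle(parent, u, w)
                parent[w] = u
                stack.append(w)
    return None


def trim_to_mst(vertices, edges):
    """Delete the maximum edge of a cycle until the graph is acyclic."""
    remaining = dict(edges)
    while True:
        cycle = find_cycle(vertices, remaining)
        if cycle is None:
            return remaining
        del remaining[max(cycle, key=remaining.get)]


# Hypothetical example: old tree r-a, r-b, a-c plus three new edges to z.
E = lambda a, b: frozenset((a, b))
tz = {E("r", "a"): 4, E("r", "b"): 7, E("a", "c"): 2,
      E("c", "z"): 3, E("b", "z"): 1, E("r", "z"): 10}
mst = trim_to_mst("rabcz", tz)
print(sorted(mst.values()))
```

In this example the graph has six edges on five vertices, so exactly two deletions are needed; a graph with few cycles needs correspondingly few trimming steps.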
Definition. Let $v \in V$. A z-path of v is a path in $T_z$ joining v to z such that every intermediate vertex in the path is a descendant of v in T. A cycle C in $T_z$ is said to be rooted at vertex v if v is the vertex in C that is closest to the root r.

Our method is to destroy, for each $v \in V$, all the cycles rooted at v. Clearly, every cycle rooted at v consists of two disjoint z-paths of v. We can therefore destroy all the cycles rooted at v by deleting the maximum edge of every z-path of v except the one whose maximum edge has the smallest weight among all those maximum edges (Fig. 1).

The distributed algorithm uses the following messages, whose lengths are easily seen to be at most $(3 + \log_2 m)$ bits, where m is the maximum edge weight in $T_z$.
Awake: its format is ("awake").
Candidate: its format is ("candidate", wt), where wt is the maximum edge weight on a z-path of its receiver or is $+\infty$. If wt = $+\infty$, the message is called a null candidate message.
Delete: its format is ("delete", wt), where wt is the weight of an edge.
Invert: its format is ("invert").
Complete: its format is ("complete").
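The five message formats can be written down as a small Python type. This is only a sketch: the class name Message, the constant names and the validation rules are assumptions for illustration, and reading the constant 3 in the length bound as a 3-bit tag for the five message kinds is one plausible interpretation rather than something the text states.

```python
import math
from dataclasses import dataclass
from typing import Optional

KINDS = ("awake", "candidate", "delete", "invert", "complete")
# Five kinds can be told apart with a 3-bit tag (an interpretation, see above).
TAG_BITS = math.ceil(math.log2(len(KINDS)))


@dataclass(frozen=True)
class Message:
    kind: str
    wt: Optional[float] = None   # present only in Candidate and Delete messages

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown message kind: {self.kind}")
        carries_weight = self.kind in ("candidate", "delete")
        if carries_weight != (self.wt is not None):
            raise ValueError(f"wrong fields for a {self.kind} message")

    @property
    def is_null_candidate(self) -> bool:
        # A Candidate message with wt = +infinity is a null candidate message.
        return self.kind == "candidate" and self.wt == math.inf


null = Message("candidate", math.inf)
real = Message("candidate", 7)
```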
Initially, all the vertices are in the 'sleep' state except the new vertex z. Vertex z initiates the execution of the updating algorithm by sending out a Candidate message, ("candidate", cost(j)), along each adjacent edge j where cost(j) is the weight of edge j.
Fig. 1. The z-paths of vertex v; edges marked x are to be deleted by the algorithm.
A vertex v is awakened by either a Candidate message sent from z or an Awake message sent from an adjacent vertex. Once v is awakened, if v has at least one son (vertex z is regarded as a son of every adjacent vertex), then v enters the 'awake' state and immediately sends out Awake messages to all its adjacent vertices. It then waits for the Candidate messages from all its sons. Otherwise, v simply sends a null candidate message to its father and remains in the 'sleep' state. Awake messages are ignored by vertices which are already in the 'awake' state.

Each awakened vertex v maintains two local registers, weight and wherefrom. Weight is initialized to $+\infty$. At any time when weight $\ne +\infty$, weight contains the maximum edge weight of a z-path of v. This z-path reaches v via the edge wherefrom and has the smallest maximum edge weight among all the z-paths of v found up to that point in time. Whenever v receives a Candidate message ("candidate", wt) via an adjacent edge j, it reacts as follows:
(i) If wt = $+\infty$, vertex v discards the message.
(ii) If the message is the first non-null Candidate message, vertex v stores wt into weight and j into wherefrom.
(iii) If the message is not the first non-null Candidate message, then a cycle consisting of two z-paths of v whose maximum edge weights are weight and wt, respectively, has been found. Vertex v compares weight with wt and sends a Delete message, ("delete", wt'), along the z-path whose maximum weight is larger, where wt' is that larger weight. It then updates the registers weight and wherefrom if necessary.

Guided by the wherefrom registers, the Delete message is relayed along that z-path until an edge whose weight equals wt' is encountered. Then the two end-vertices of that edge delete it from $T_z$. Let w be the end-vertex of the deleted edge which is farther away from v. Clearly, the directions of all the edges joining w and z on that z-path must be inverted. So vertex w initiates an Invert message and passes it down that z-path until vertex z is reached. Every vertex, excluding z but including w, which receives the Invert message makes edge wherefrom its new father edge. All the vertices on that z-path, except v and z, then enter the 'sleep' state. Vertex z enters the 'sleep' state only if it has received a message from every adjacent edge.

After vertex v has received and processed all the Candidate messages from its sons, no cycle rooted at v remains in $T_z$. If v has not received any non-null Candidate message, then v simply sends a null Candidate message to its father and enters the 'sleep' state. Otherwise, it should be clear that at this point in time there is exactly one z-path of v remaining in $T_z$. If $v \ne r$, then vertex v extends this z-path to include its father edge by sending a ("candidate", wt') message to its father vertex, where wt' = max{weight, cost(v's father-edge)}. The extended path is a z-path of v's father. If v = r, then vertex r sends out a Complete message along this z-path and then enters the 'sleep' state. Every vertex receiving this message enters the 'sleep' state, except vertex z. When the Complete message reaches z, vertex z makes the edge from which it received the message its father edge.
Moreover, if z has received a message from every adjacent edge, it will enter the 'sleep' state. Execution of the updating algorithm terminates when all vertices of $T_z$ are asleep again.
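The net effect of the Candidate and Delete messages can be mimicked by a centralized, sequential sketch in Python. It does not model asynchrony, the Awake, Invert or Complete messages, or the sleep states; it only reproduces the rule that each vertex keeps the z-path with the smallest maximum edge and deletes the maximum edge of every other z-path. The function name insert_vertex, its argument layout and the example data are assumptions. Unlike the Delete message, which carries only a weight, each candidate here also carries the edge attaining it, so that the edge to delete need not be searched for.

```python
def insert_vertex(father, cost, z, z_cost):
    """Sequential sketch of the cycle-destroying rule.

    father : dict, vertex -> its father in the old MST (the root maps to None)
    cost   : dict, frozenset({v, father[v]}) -> weight of that old tree edge
    z      : the new vertex
    z_cost : dict, vertex -> weight of the new edge joining that vertex to z
    Returns (edges of the new tree, set of deleted edges).
    """
    sons = {v: [] for v in father}
    root = None
    for v, f in father.items():
        if f is None:
            root = v
        else:
            sons[f].append(v)
    deleted = set()

    def surviving_z_path(v):
        # A candidate is (maximum weight on a z-path of v, edge attaining it).
        candidates = []
        if v in z_cost:                       # z is regarded as a son of v
            candidates.append((z_cost[v], frozenset((v, z))))
        for s in sons[v]:
            below = surviving_z_path(s)
            if below is not None:             # None stands for a null candidate
                e = frozenset((v, s))         # extend the path by edge {v, s}
                candidates.append(max(below, (cost[e], e), key=lambda c: c[0]))
        if not candidates:
            return None
        keep = min(candidates, key=lambda c: c[0])
        for c in candidates:
            if c is not keep:                 # destroy a cycle rooted at v
                deleted.add(c[1])
        return keep

    surviving_z_path(root)
    all_edges = set(cost) | {frozenset((v, z)) for v in z_cost}
    return all_edges - deleted, deleted


# Hypothetical example: old tree r-a (4), r-b (7), a-c (2); z joins c, b and r.
E = lambda a, b: frozenset((a, b))
father = {"r": None, "a": "r", "b": "r", "c": "a"}
cost = {E("r", "a"): 4, E("r", "b"): 7, E("a", "c"): 2}
new_tree, deleted = insert_vertex(father, cost, "z", {"c": 3, "b": 1, "r": 10})
```

In the example, the root r sees three z-paths with maximum weights 10, 4 and 7, keeps the one with maximum 4, and deletes the edges of weight 10 and 7.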
3. Time and message complexities

Let one time unit be the longest time required to transmit a message between two adjacent vertices. By observing that the longest distance between z and any vertex in $T_z$ is at most $2H + 1$ (H is the height of the old MST), we can use an inductive argument to prove that $2H + 1$ time units after vertex z initiates the execution of the updating algorithm, all the vertices in $T_z$ must have awakened. After awakening, a vertex must perform a sequence of tasks before it re-enters the 'sleep' state. This sequence of tasks, in the worst case, consists of:
(i) sending out Awake messages to adjacent vertices;
(ii) waiting for Candidate messages from the sons;
(iii) sending a Candidate message to the father and Delete messages to all but one son;
(iv) waiting for a replying message from the father;
(v) sending a message to the only son who has not received a message in task (iii).
Tasks (i), (iii) and (v) each take 1 time unit. Task (ii) takes at most $h + 1$ time units, where h is the height of the vertex in T ($h = -1$ for vertex z). This is because every Candidate message the vertex receives originates at either a pendant (a vertex of degree 1) or the new vertex z, and the distance between the vertex and a pendant or z is at most $h + 1$. Task (iv) takes $2(H - h)$ time units. Therefore, the above five tasks take at most $2H + 5$ time units. Hence, execution of the updating algorithm takes at most $4H + 6$ time units.

Let $\deg_G(v)$ denote the degree of vertex v in graph G and let $|V| = N$. Vertex z sends out only $\deg_{T_z}(z)$ messages. Each vertex $v \ne z$ sends out at most $\deg_T(v)$ Awake messages, at most 1 Candidate message, and at most $\deg_{T_z}(v) - 1$ Delete, Invert or Complete messages. Therefore the total number of messages transmitted is at most
$$\deg_{T_z}(z) + \sum_{v \in V} \deg_T(v) + \sum_{v \in V} 1 + \sum_{v \in V} (\deg_{T_z}(v) - 1) \le 6N - 4.$$
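The arithmetic behind the two bounds can be spelled out as follows. This is a sketch under two assumptions: that the Awake messages are counted over the old tree T while the Delete, Invert and Complete messages are counted over $T_z$, and that z is adjacent to at most N vertices. The symbol d, standing for the degree of z in $T_z$, is introduced here only for brevity.

```latex
% Time: the five tasks of a vertex of height h, then the whole execution.
\[
  1 + (h + 1) + 1 + 2(H - h) + 1 \;=\; 2H - h + 4 \;\le\; 2H + 5
  \qquad (h \ge -1),
\]
\[
  (2H + 1) + (2H + 5) \;=\; 4H + 6 .
\]
% Messages: T has N - 1 edges, and each new edge adds 1 to one degree in V.
\[
  \sum_{v \in V} \deg_T(v) = 2(N - 1), \qquad
  \sum_{v \in V} \deg_{T_z}(v) = 2(N - 1) + d ,
\]
\[
  d + 2(N - 1) + N + \bigl( 2(N - 1) + d - N \bigr)
  \;=\; 2d + 4N - 4 \;\le\; 6N - 4
  \qquad (d \le N).
\]
```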
4. Handling $k$ ($\ge 1$) vertex insertions

To extend the above updating algorithm to handle $k$ ($\ge 1$) vertex insertions, all we need is to incorporate a vertex selection algorithm into the algorithm. The function of this algorithm is to ensure that at any time only one new vertex can initiate an execution of the updating algorithm, and that once an execution of the updating algorithm is initiated, no other new vertex can initiate another execution until the current execution terminates. The vertex selection algorithm is centralized at the root r and works as follows: whenever a new vertex w is added to the graph,
w has to inform r of its existence and request permission to initiate an execution of the updating algorithm. This is accomplished as follows. Vertex w arbitrarily selects one adjacent vertex in the current MST and sends it a ("Request", w) message. This Request message is then passed along a path composed of father edges to r. As the distributed updating algorithm may be running concurrently with the vertex selection algorithm, this path, in the worst case, consists of the following three parts: a path from w to a vertex u whose father edge has been changed; a path from u to z, where z is the new vertex that initiated the current execution of the updating algorithm; and a path from z to r through which z receives the Complete message. Let $H$ and $H'$ denote, respectively, the height of the MST before and after the current execution initiated by z terminates. It is easily verified that the Request message takes at most $H + H' \le 2\max(H, H')$ time units and message transmissions to reach r.

The root r maintains a queue to keep all the Request messages that have arrived so that it can process one Request message at a time. Whenever r receives a Request message, if no other vertex has initiated an execution of the updating algorithm, then r processes the request immediately. Otherwise, it simply enters the message into the queue. To process a Request message ("Request", z), r sends a ("Start", z) message to vertex z to inform z that it may proceed to initiate an execution of the updating algorithm. The message is passed along the path through which the Request message was sent. Obviously, this requires at most $\max(H, H')$ time units and message transmissions. After sending out the Start message, r will not process another Request message until it is informed of the termination of the current execution initiated by z. This is easily accomplished by having vertex z return the Complete message to r after z re-enters the 'sleep' state. Obviously, it takes the Complete message at most $H'$ time units and $H'$ transmissions to return to r. When r receives the returning Complete message, it processes the next Request message, if any, in the queue.

It is important to note that for multiple vertex insertions, each Candidate and Awake message must include a new field to store the identity of the vertex which initiates the current execution of the updating algorithm. This is because, owing to the asynchronous nature of the distributed algorithm, it is possible that a pendant of the old MST which is adjacent to z may receive an Awake message from its father before receiving the Candidate message from z. Without z's identity included in the Awake message, the pendant would not know that it is its new neighbour z which initiated the current execution. As a result, it could regard itself as a pendant and hence incorrectly send out a null Candidate message to its father.

Let $H_i$ be the height of the MST after i new vertices have been included. Note that $H_i \le N + i$. To handle k vertex insertions, the vertex selection algorithm and the distributed updating algorithm must each be executed k times. This requires at most
$$4k \max(H, H_1, H_2, \ldots, H_{k-1}) + (4H + 6) + (4H_1 + 6) + \cdots + (4H_{k-1} + 6) = O(k \cdot \max(H, H_1, H_2, \ldots, H_{k-1}))$$
time units and
$$4k \max(H, H_1, H_2, \ldots, H_{k-1}) + (6N - 4) + (6(N + 1) - 4) + \cdots + (6(N + k - 1) - 4) = O(kN + k^2)$$
messages.

Finally, it is worth pointing out that the above updating algorithm for vertex insertion can be modified to handle edge insertion. The resulting time and message complexities are the same as those above and are comparable to the previously known ones [9]. As the techniques employed are similar to those used above, we refrain from discussing the details here.

References

[1] B. Awerbuch, Optimal distributed algorithms for minimum spanning trees, counting, leader election and related problems, Proc. ACM Symposium on Theory of Computing, May 1987, pp. 230-240.
[2] C. Berge, Graphs and Hypergraphs (North-Holland, Amsterdam, 1973).
[3] E.J. Chang, Decentralized algorithms in distributed systems, Technical Report CSRG-103, Department of Computer Science, University of Toronto, October 1979.
[4] F. Chin, H.F. Ting, An almost linear time and O(n log n + e) messages distributed algorithm for minimum-weight spanning trees, Proc. 26th Symposium on Foundations of Comp. Sci., Portland, OR, October 1985, pp. 257-266.
[5] Y. Dalal, Broadcast protocols on packet switched computer networks, Technical Report No. 128, Department of Electrical Engineering, Stanford University, April 1977.
[6] E. Gafni, Improvements in the time complexity of two message-optimal election algorithms, Proc. 4th ACM Symposium on Principles of Distr. Compt., Minaki, Ont., Canada, August 1985, pp. 175-185.
[7] R. Gallager, P. Humblet and P. Spira, A distributed algorithm for minimum-weight spanning trees, ACM TOPLAS 5 (1) (1983) 63-77.
[8] D.S. Parker, B. Samadi, Distributed minimum spanning trees algorithms, Proc. International Conference on Performance of Data Communication Systems and Their Applications, INRIA, Paris, France, September 1981.
[9] D.S. Parker and B. Samadi, Adaptive distributed minimum spanning trees algorithms, Proc. Symposium on Reliability in Distributed Software and Data Base Systems, Pittsburgh, PA, July 1981, pp. 138-144.
[10] S. Pawagi and I.V. Ramakrishnan, An O(log2n) time algorithm for parallel update of minimum spanning trees, Inform. Process. Letters 22 (5) (1986) 223-229.
[11] Y.H. Tsin, On handling vertex deletion in updating minimum spanning trees, Inform. Process. Letters 27 (4) (1988) 167-168.
[12] D.W. Wall, Mechanisms for broadcast and selective broadcast, Technical Report No. 190, Department of Electrical Engineering, Stanford University, June 1980.