Ring based termination detection algorithm for distributed computations


Information Processing Letters 29 (1988) 149-153
North-Holland
26 October 1988

S. HALDAR and D.K. SUBRAMANIAN
Department of Computer Science and Automation, Indian Institute of Science, Bangalore 560 012, India

Communicated by W.L. Van der Poel
Received 6 November 1987

Keywords: Distributed program, distributed termination, process communication, ring

1. Introduction

A distributed program P should terminate soon after performing the task for which it was written. For a sequential program, termination is a trivial issue; but, for a distributed program, an additional measure is needed to ensure this condition. Many algorithms have already appeared in the literature [1,2,4,5,6] in recent years. The majority of them (like [4,6]) assign the responsibility to a unique centralised process in the system. This causes performance bottlenecks in the system. A few algorithms belong to the distributed category [1,2,5]. In this paper we present a distributed and fully symmetric algorithm, where all processes follow an identical protocol. This is an improved algorithm similar to the ones described in [2,5].

After introducing the termination detection problem in Section 2, an improved algorithm is developed with correctness argument in Section 3. Section 4 concludes with comments on performance aspects of the algorithm presented.

2. Distributed termination problem

Let a distributed program P consist of n communicating sequential processes p_0, p_1, ..., p_(n-1). The processes communicate among themselves by exchanging messages. Each process p_i has a local predicate B_i, 0 <= i < n. Let B be the conjunction of the local predicates B_i, where the values of the B_i's correspond to the same time instant for all the processes. B is the Global Termination Condition (GTC) of P. A program P can only terminate when all processes of program P satisfy their local predicates simultaneously, i.e., when the GTC is satisfied.

0020-0190/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

3. Algorithm for distributed termination

The processes of a distributed program P are assumed to be connected by a Hamiltonian ring as in [1,2,5] (see Fig. 1), where the control communication to detect the termination takes place in a single direction, say anti-clockwise. Each process knows its successor in the ring. When the local condition B_i of a process p_i is true, process p_i is said to be in the passive state, and in the active state otherwise. Apart from control communication, basic communication may take place among the processes when they are active. A passive process never initiates a basic communication; but a process in any state can initiate or engage in control communication.

Fig. 1. Ring of processes (..., p_(i-1), p_i, p_(i+1), ...).

Distributed termination consists of two phases: the termination detection phase for detecting the global terminating condition, followed by termination of each process.

We shall now explain some terminology which will be used in the ensuing discussion.
(a) The control section of each process contains three boolean variables PASV, RCTM, and SNDTM as in [1,2], each with initial value false. PASV becomes true when the process changes from the active to the passive state. SNDTM is set to true when the process sends a termination message to its successor. RCTM is set to true when the process receives a termination message from its predecessor.
(b) An active process stores the identifications of all those processes (a subset of {p_0, ..., p_(n-1)}) with which it enters into communication, as in [2].
(c) A process p_i also keeps a process identification variable, FARTHEST, whose value is the farthest process [3] down the ring with which p_i has communicated while p_i was active.
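The control section described in items (a) to (c) above can be pictured as a small record. The following is only a sketch: the class name `ControlSection`, the lower-case field names, and the use of integer process indices are assumptions made for illustration, not notation from the paper.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ControlSection:
    """Control state kept by one process (hypothetical layout)."""
    pasv: bool = False    # PASV: the process has changed from active to passive
    rctm: bool = False    # RCTM: a termination message was received from the predecessor
    sndtm: bool = False   # SNDTM: a termination message was sent to the successor
    ids: Set[int] = field(default_factory=set)   # (b) partners contacted while active
    farthest: Optional[int] = None                # (c) farthest partner down the ring

    def is_empty(self) -> bool:
        """True when no stored process identification remains."""
        return not self.ids


cs = ControlSection()
cs.ids.add(3)   # hypothetical: this process exchanged a basic message with p_3
```

All three flags start out false, matching the initial values stated in (a).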

Define the sequence relation <: among processes: p_i <: p_k means that p_k is ahead of p_i in the ring.

If p_i communicates with all other processes, then

    FARTHEST(p_i) := p((i + n - 1) mod n).

If p_i communicates with a subset of processes, say {p((i+j) mod n), p((i+k) mod n), ..., p((i+m) mod n)}, where j ...

Fig. 2. Control-section configurations (a)-(d) of a process: FARTHEST, PASV, RCTM, SNDTM and current sequence nos. (e.g. FARTHEST = ..., PASV = true, RCTM = false, SNDTM = false).
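The ring ordering and the FARTHEST value can be illustrated with modular arithmetic. This is a minimal sketch under stated assumptions: processes are numbered 0..n-1, the successor of p_i is p((i+1) mod n), the relation <: is read strictly, and for a subset of partners FARTHEST is taken to be the partner with the largest ring offset from p_i. The function names `offset`, `between` and `farthest` are hypothetical.

```python
from typing import Iterable


def offset(i: int, k: int, n: int) -> int:
    """Number of hops from p_i to p_k when travelling in the ring direction."""
    return (k - i) % n


def between(j: int, i: int, f: int, n: int) -> bool:
    """p_j <: p_i <: p_f, read strictly: p_i lies after p_j and before p_f."""
    return 0 < offset(j, i, n) < offset(j, f, n)


def farthest(i: int, partners: Iterable[int], n: int) -> int:
    """Partner of p_i with the largest ring offset from p_i (assumed reading).

    Raises ValueError if `partners` is empty.
    """
    return max(partners, key=lambda k: offset(i, k, n))
```

With n = 5, a process p_1 that has communicated with every other process gets `farthest(1, [0, 2, 3, 4], 5) == 0`, which agrees with p((i + n - 1) mod n).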

3.1. Termination detection phase

p_i, upon becoming passive, issues a control message for detecting the termination condition. Let this message reach p_j without any negative response (i.e., with KEY = 0); at that time the configuration of the control section of p_j will be one of those shown in Fig. 2. p_j will generate a negative response when its control section status is either (a) or (b) of Fig. 2, and a positive response if its status is either (c) or (d) of Fig. 2, as in [2].

When a process receives a termination detection message, it does the following before forwarding the message to its successor.
(a) As soon as the detection message gets falsified, the bit flag KEY is set to 1 and remains 1 until the message is purged [5].
(b) The unique identification carried with the detection message is removed from the control section of the visited process, if present.
(c) If KEY becomes 1 before the control message generated by p_i reaches FARTHEST(p_i), the message is purged by FARTHEST(p_i).
(d) As soon as the control message generated by a process p_i crosses FARTHEST(p_i), the message is purged if it is received by an active process or by a process having a nonempty control section.

The algorithm (for p_i) of the termination detection phase is fully described below.

(1) Upon p_i becoming passive:

    begin
      PASV(p_i) := true;
      KEY := 0;
      send DM(p_i, FARTHEST(p_i), KEY) to succ(p_i);
      (* DM stands for Detection Message *)
      (* succ(p_i) denotes the successor of p_i on the ring *)
    end;

(2) Upon receiving a message DM(p_j, FARTHEST(p_j), KEY) from some predecessor:

    begin
      if KEY = 1 then
        (* either some processes are active or their list of process identifiers is not empty *)
        begin
          if p_i = p_j then
            begin
              KEY := 0;
              forward DM(p_i, p((i+1) mod n), KEY) to succ(p_i);
            end
          else if p_j <: p_i <: FARTHEST(p_j) then
            begin
              if p_j ∈ ID(p_i) then remove p_j from ID(p_i);
              (* ID denotes the set of identifications stored in p_i *)
              forward DM(p_j, FARTHEST(p_j), KEY) to succ(p_i)
            end
          else
            begin
              if p_j ∈ ID(p_i) then remove p_j from ID(p_i);
              if p_i is active or not(empty(ID(p_i))) then
                purge DM(p_j, FARTHEST(p_j), KEY)
              else
                forward DM(p_j, FARTHEST(p_j), KEY) to succ(p_i)
            end
        end
      else if p_i = p_j then
        enter termination phase (* DM has returned back to initiator *)
      else if p_i = FARTHEST(p_j) then
        begin
          if p_j ∈ ID(p_i) then remove p_j from ID(p_i);
          if p_i is passive and empty(ID(p_i)) then
            forward the message to succ(p_i)
          else
            purge DM(p_j, FARTHEST(p_j), KEY);
        end
      else if p_j <: p_i <: FARTHEST(p_j) then
        begin (* p_i lies on the path p_j to FARTHEST(p_j) *)
          if p_j ∈ ID(p_i) then remove p_j from ID(p_i);
          if p_i is active or not(empty(ID(p_i))) then KEY := 1;
          forward message DM(p_j, FARTHEST(p_j), KEY) to succ(p_i)
        end
      else if p_i is passive and empty(ID(p_i)) then
        forward message DM(p_j, FARTHEST(p_j), KEY) to succ(p_i)
      else
        purge the message DM(p_j, FARTHEST(p_j), KEY)
    end;
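The receive rule of step (2) can be read as a pure function from the receiver's state and the incoming message to an action. The sketch below is one possible transcription, not the authors' implementation: the names `Proc`, `DM` and `on_dm`, the integer process indices, the strict reading of <:, and the successor being p((i+1) mod n) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Proc:
    active: bool = False                         # False means passive
    ids: Set[int] = field(default_factory=set)   # ID(p_i)


@dataclass(frozen=True)
class DM:
    origin: int     # p_j, the initiator
    farthest: int   # FARTHEST(p_j)
    key: int        # 0 = unfalsified, 1 = falsified


def on_dm(i: int, p: Proc, m: DM, n: int) -> Tuple[str, Optional[DM]]:
    """Handle DM at p_i; returns ("forward", msg), ("purge", None) or ("terminate", None)."""
    j, f = m.origin, m.farthest
    on_path = 0 < (i - j) % n < (f - j) % n      # p_j <: p_i <: FARTHEST(p_j), strict (assumed)
    if i != j:
        p.ids.discard(j)                         # remove p_j from ID(p_i) if present
    busy = p.active or bool(p.ids)
    if m.key == 1:
        if i == j:
            # falsified message is back at its initiator: reissue with KEY reset;
            # the second field follows this sketch's reading of step (2)
            return "forward", DM(i, (i + 1) % n, 0)
        if on_path:
            return "forward", m
        return ("purge", None) if busy else ("forward", m)
    if i == j:
        return "terminate", None                 # unfalsified DM returned to initiator
    if i == f:
        return ("purge", None) if busy else ("forward", m)
    if on_path:
        return "forward", DM(j, f, 1 if busy else 0)   # falsify, but keep forwarding
    return ("purge", None) if busy else ("forward", m)
```

Returning an action instead of sending directly keeps the rule testable without a network; a driver loop would deliver each forwarded message to the successor.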

3.2. Termination phase

This phase is similar to the one of Arora et al. [2]. So, here we only provide their algorithm.

(a) Upon determining the GTC by a process, say p_i, do the following:

    begin
      SNDTM := true;
      send termination message to succ(p_i)
    end;

(b) Upon receipt of termination message by p_i:

    begin
      RCTM := true;
      if SNDTM then terminate
      else
        begin
          SNDTM := true;
          send termination message to succ(p_i);
          terminate
        end
    end;

3.3. Correctness of the algorithm

To establish the correctness of the algorithm presented we have to prove the following assertions:
(1) At least one process is able to detect the Global Termination Condition when it is satisfied (true).
(2) There is no possibility of detecting a false termination.

3.3.1. Proof of assertion (1)

Case 1. Let the process p_i be the latest process to become passive. Also, suppose that at that time all other processes have already become passive and their control messages are not in transit in the ring. Hence, the control sections of the other processes would at most contain the identification of p_i. As per the algorithm, the detection message of p_i would remain unfalsified (KEY = 0) until it reaches back to p_i. Eventually, p_i would enter the termination phase.

Case 2. Let process p_i be the latest process to become passive. Also, suppose that at that time some of the detection messages issued by some processes are in transit in the ring. Either all the messages or at least that sent by p_i will return to their respective issuer. Let p_k be the process which receives a message issued by p_i. p_k will set KEY to 1 (i.e., the message gets falsified) before forwarding the message issued by p_i, if its control section contains some identifications of processes whose control messages are in transit in the ring. Eventually, the message will reach p_i (if p((i + n - 1) mod n) is its farthest process or the control section beyond the farthest process is empty) and p_i will start the same protocol for the second time. As the messages are served FCFS, when the second message issued by p_i reaches p_k, p_k's control section will be empty. So, p_k will forward the control message of p_i with a positive response. Finally, p_i will enter the termination phase.
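Once some process enters the termination phase, the termination message of Section 3.2 makes one trip around the ring. The following sketch simulates steps (a) and (b) sequentially; the function name `run_termination`, the integer indices, and the successor being p((i+1) mod n) are assumptions made for illustration.

```python
from typing import List


def run_termination(n: int, detector: int) -> List[int]:
    """Circulate the termination message; return processes in the order they terminate."""
    sndtm = [False] * n
    rctm = [False] * n
    order: List[int] = []
    # (a) the process that determined the GTC sends the termination message
    sndtm[detector] = True
    receiver = (detector + 1) % n
    while True:
        # (b) receipt of the termination message
        rctm[receiver] = True
        if sndtm[receiver]:
            order.append(receiver)        # already sent earlier: terminate
            break
        sndtm[receiver] = True            # send to successor, then terminate
        order.append(receiver)
        receiver = (receiver + 1) % n
    return order
```

In this sketch the detecting process terminates last, after the message has visited every other process exactly once.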

3.3.2. Proof of assertion (2)

Detecting a false Global Termination Condition implies that a process gets back its own unfalsified detection message while some process is either active or has a nonempty control section. But this is impossible, because when the message moves around the ring, it is either purged or falsified if it meets an active process or a process having a nonempty control section. A falsified message can never force a process to enter the termination phase.

4. Concluding remarks

We have presented a fully distributed and symmetric algorithm for the distributed termination problem. It does not make use of the concept of time-stamp and clock synchronization (as in [5]). Also, we did not use any sequence number as in [2]. Here, we allow the detection message issued by process p_i to be purged by the FARTHEST(p_i) process, or by an active process beyond that, or by a process with a nonempty control section beyond FARTHEST(p_i) (similar to [5]). Here we have used the concept of a farthest process in the ring (like [3]), but we do not hold up any message by an active process (like [3]). Our solution is simple and it can intuitively be said that, in the average case, it takes a smaller number of messages to detect the Global Termination Condition than that of [2]. In the extreme case, both methods use the same number of messages.

References

[1] R.K. Arora, S.P. Rana and M.N. Gupta, Distributed termination detection algorithm for distributed computations, Inform. Process. Lett. 22 (6) (1986) 311-314 (see also: R.K. Arora et al., Letter to the Editor, Inform. Process. Lett. 29 (1) (1988) 53-55).
[2] R.K. Arora, S.P. Rana and M.N. Gupta, Ring based detection algorithm for distributed computations, Microprocessing & Microprogramming 19 (3) (1987) 219-226.
[3] R.K. Arora and N.K. Sharma, A methodology to solve distributed termination problem, Inform. Systems 8 (1) (1983) 37-39.
[4] E.W. Dijkstra and C.S. Scholten, Termination detection for diffusing computations, Inform. Process. Lett. 11 (1) (1980) 1-4.
[5] S.P. Rana, A distributed solution of the distributed termination problem, Inform. Process. Lett. 17 (1) (1983) 43-46.
[6] R.W. Topor, Termination detection for distributed computations, Inform. Process. Lett. 18 (1) (1984) 33-36.