Synchronization of Concurrent Activities in a Local Area Computer Network


SOFTWARE FOR DISTRIBUTED CONTROL SYSTEMS III

Copyright © IFAC Software for Computer Control, Madrid, Spain 1982

SYNCHRONIZATION OF CONCURRENT ACTIVITIES IN A LOCAL AREA COMPUTER NETWORK

H. Rzehak

Department of Computer Science, University of the Federal Armed Forces, Munich, Federal Republic of Germany

Abstract. In local area computer networks it may be necessary to synchronize activities which are allocated to different computer systems of the network. These networks can still operate with restricted performance in the case of a breakdown of a single computer. In this case it may happen that some activities in the still working computer systems remain in a blocked state (waiting for synchronization) and can never leave this state, because the signal for the release must be produced by an activity allocated to the computer system which fails. The paper first gives a brief summary of synchronization concepts for centralized computer systems. Then it is shown how these concepts could be modified for an implementation on local area computer networks. It is described how to handle the problem of blocked activities if a computer system of the network fails.

Keywords. Decentralized systems; synchronization; concurrency control; fault tolerance; error recovery.

INTRODUCTION

Today local area computer networks play an important role in control of industrial processes. As no commonly accepted definition of the term "local area computer network" exists, we consider two properties inherent in this term:

- The distance between the nodes of the computer network is sufficiently small to maintain a fast and cheap data transmission system, commonly via a bus or ring, and the standing charges are independent of the load of the communication network.

- Each computer of the network is an autonomous (sub)system and can operate for some tasks in a stand-alone mode. This allows one to distinguish clearly between a multiprocessor and a local area computer network.

We assume that only one organization is responsible for the whole network and that agreements can therefore easily be made on how to coordinate the different parts of software in the subsystems of the network. It is not necessary to prefer those concepts for the coordination which work with a minimum of information transmitted over the network, because this transmission doesn't affect any costs as long as the communication system is not overloaded (Clark, Pogran, Reed, 1978).

The term "synchronization" is used to refer to the solution of either of two problems:

(S1) specification and control of the joint activity of cooperating sequential processes, or

(S2) serialization of concurrent access to shared objects by multiple processes.

Problem (S1) covers a characteristic situation in computer control of industrial processes, where the progress of an activity depends on the progress of other activities. (In this paper the term activity is used synonymously with the term process.) Since cooperating sequential processes can be synchronized by means of mutually exclusive operations on a common data object, the literature frequently deals with problem (S2), which is a key problem in data base systems (Kohler, 1981). That is why the overview in the next paragraph mainly gives solutions of problem (S2).

SYNCHRONIZATION IN SINGLE COMPUTER SYSTEMS

The implementation of basic constructs to solve problem (S2), so-called synchronization primitives, is possible if the following three conditions hold:

(C1) There exists an indivisible operation (atomic action) which is executed one at a time and can't be interrupted by other activities or external events.

(C2) It is possible to remove the processor from an activity waiting for some event, e.g. for synchronization, to avoid any form of "busy waiting". Usually the corresponding functions are performed by an operating system.

(C3) If an activity has entered a region controlled by a synchronization primitive it must leave this region through the regular exit; in particular it doesn't terminate within this region.

With condition (C1) we can distinguish whether an atomic action is performed before another or not. This means that we can define a sequence of atomic actions. It depends on the hardware and software architecture of the computer system which action can be considered as an atomic action. Fig. 1 gives an example for the implementation of a binary semaphore in a multiprocessor environment. The atomic action is an operation of the common memory, taking into account that a sequence of operations in one of the processors is not an atomic action if the same sequence can be performed concurrently by another processor. In the case of a single processor system condition (C2) is generally necessary for concurrent activities. In a multiprocessor environment this condition additionally maintains an economic implementation (Hoare, 1978).
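The role of conditions (C1) to (C3) can be illustrated with a minimal Python sketch of a binary semaphore built on a READ AND CLEAR operation. This is not the implementation of Fig. 1: the names `SharedCell`, `BinarySemaphore` and `run_demo` are hypothetical, the inner lock merely stands in for the arbitration of the common memory, and blocking is delegated to `threading.Condition` in place of an operating system.

```python
import threading


class SharedCell:
    """One word of common memory; the inner lock stands in for memory arbitration."""

    def __init__(self, value=True):
        self._value = value
        self._arbiter = threading.Lock()

    def read_and_clear(self):
        # Atomic action (C1): return the old value and clear the cell in one step.
        with self._arbiter:
            old = self._value
            self._value = False
            return old

    def set_true(self):
        with self._arbiter:
            self._value = True


class BinarySemaphore:
    """Binary semaphore with only two operations: request (P) and release (V)."""

    def __init__(self):
        self._sema = SharedCell(True)            # preset: the region is free
        self._waiting = threading.Condition()

    def request(self):
        with self._waiting:
            while not self._sema.read_and_clear():
                self._waiting.wait()             # (C2): block instead of busy waiting

    def release(self):
        with self._waiting:
            self._sema.set_true()
            self._waiting.notify()


def run_demo(workers=4, rounds=200):
    sema = BinarySemaphore()
    counter = {"value": 0}

    def task():
        for _ in range(rounds):
            sema.request()
            current = counter["value"]           # read-modify-write inside the region
            counter["value"] = current + 1
            sema.release()                       # (C3): leave through the regular exit

    threads = [threading.Thread(target=task) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]
```

Nothing in the sketch forces a task to call `release`, which mirrors the point that condition (C3) remains the responsibility of the user.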

[Fig. 1 (diagram not reproduced): a common memory holds the variable SEMA, preset to TRUE. Processor I (task A) and processor II (task B) each execute READ AND CLEAR SEMA; IF SEMA, the task continues, ELSE the task waits for SEMA. On leaving the region each task executes SEMA := TRUE. A timing diagram shows that READ AND CLEAR SEMA consists of the memory operations READ SEMA and CLEAR SEMA.]

Fig. 1 Implementation of binary semaphore in a multiprocessor environment

The example shows that an implementation of synchronization primitives can't guarantee that condition (C3) holds. It is the responsibility of the user to check that it holds. Therefore more sophisticated mechanisms have been established to check condition (C3) at compile time. Fig. 2 shows the various techniques for synchronization (Levi, 1981). Some remarks may briefly describe the relations between these techniques:

The semaphore mechanism has been shown in the example above (Fig. 1). It should be noticed that the programmer cannot check the value of a semaphore. There are only two operations for the entrance and exit of the controlled region (P- and V-operation by Dijkstra, 1968). The semaphore must be preset at a definite value, e.g. by the operating system or the compiler. The critical region concept is equivalent to the region controlled by a semaphore, but the programmer has to declare the region where an object is used exclusively by an activity:

    with object do statements done;

The corresponding operations on semaphores can be produced automatically by the compiler. Semaphores and critical regions solve the problem of mutual exclusion. In contrast, the conditional critical region solves the problem of producer and consumer activities: an activity produces some objects whereas another activity consumes these objects concurrently. It is clear that this problem can be solved by semaphores or critical regions, but the notation in a program seems unnatural in some way.
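A rough Python analogue may make the difference concrete. In the sketch below, which is an illustration under assumptions and not taken from the paper, the `with self._region:` block plays the part of `with object do ... done`, and `wait_for` supplies the condition of a conditional critical region. The names `BoundedBuffer` and `transfer` are hypothetical.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Shared object that may only be used inside its critical region."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._region = threading.Condition()

    def put(self, item):
        with self._region:                       # enter the critical region
            # Conditional part: proceed only while there is free space.
            self._region.wait_for(lambda: len(self._items) < self._capacity)
            self._items.append(item)
            self._region.notify_all()

    def get(self):
        with self._region:
            # Conditional part: proceed only while an object is available.
            self._region.wait_for(lambda: len(self._items) > 0)
            item = self._items.popleft()
            self._region.notify_all()
            return item


def transfer(values, capacity=2):
    """Run one producer and one consumer activity and return what was consumed."""
    buffer = BoundedBuffer(capacity)
    received = []

    def producer():
        for value in values:
            buffer.put(value)

    def consumer():
        for _ in range(len(values)):
            received.append(buffer.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

Entry and exit of the region are generated by the `with` statement, so the programmer cannot forget the exit operation, which is the practical benefit the critical region has over a bare semaphore.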

In the monitor concept an object with exclusive access, together with all operations on this object represented by procedures, forms an abstract data type - the monitor. The synchronization is performed within the monitor in such a way that only one procedure of the monitor can be executed at a time. This means that a monitor call can be considered as an atomic action. The programmer is forced to write down the object declarations together with the procedures of the monitor. There exist some modifications, e.g. nested monitors or dynamically created and deleted monitors, which include the possibility of deadlocks. It is possible to implement monitors by semaphores.

[Fig. 2 (diagram not reproduced): synchronization techniques for concurrent activities, divided into explicit control of synchronization (primitives for the problem of mutual exclusion, primitives for cooperating sequential processes) and higher, procedure oriented techniques.]

Fig. 2 Survey of synchronization techniques

A message passing system consists of activities which can send messages to another activity by putting them into a system buffer (mail box) which belongs to the receiver. A message contains the identification of the transmitter as well as of the receiver. After sending a message the transmitter can proceed unless the mail box is full. The receiver takes a message out of the mail box if there is any and proceeds by interpretation of it. If the mail box is empty the receiver stops, waiting for a message. There is no direct connection between transmitter and receiver. To avoid deadlocks a message passing system must satisfy a set of conditions so that it is free of deadlock at any time. Some other slightly different forms of message passing systems are published in the literature. E.g. the channel in MASCOT can be considered as a mail box. Such a system can be implemented by a monitor which contains the object "mail box". Generally message passing systems are not easy to handle in an environment where activities exist only dynamically over a certain period of time.

A basic synchronization tool in Ada has some properties of both the monitor concept and message passing systems. The basic form is a so-called "rendezvous". Fig. 3 gives an example. Like a monitor the entry point - this is the accept-statement - of an activity (task) can be entered exclusively and parameters can be passed. Then the statements within the

    task body generiere_bot is
       nachster_code: CHARACTER;
    begin
       loop
          -- statements to produce nachster_code
          dekodiere.sende_zeichen (nachster_code);      -- "rendezvous"
       end loop;
    end generiere_bot;

    task body dekodiere is
       code, zeichen: CHARACTER;
    begin
       loop
          accept sende_zeichen (C: in CHARACTER) do
             code := C;
          end;
          -- statements to decode the value of code and to store
          -- the result in zeichen
          accept empfange_zeichen (C: out CHARACTER) do
             C := zeichen;
          end;
       end loop;
    end dekodiere;

    task body drucke_bot is
       nachstes_zeichen: CHARACTER;
       index: INTEGER;
       zeile: STRING (1 .. 132);
    begin
       index := 1;
       loop
          dekodiere.empfange_zeichen (nachstes_zeichen);   -- "rendezvous"
          zeile (index) := nachstes_zeichen;
          if index < 132 then
             index := index + 1;
          else
             drucke (zeile);
             index := 1;
          end if;
       end loop;
    end drucke_bot;

Fig. 3 The rendezvous in Ada


do ... end of the accept-statement are executed by the called activity, that is the task with the accept-statement, if it has reached this statement. If the called activity has reached the accept-statement first, it must wait like an activity waiting for a message in a message passing system. It should be noticed that only the calling activity must know the identifier of the called activity. In extension of the basic rendezvous a choice of some accept-statements can be given in a select-statement. The rendezvous happens if the conditions for one of the accept-statements hold.

SYNCHRONIZATION IN LOCAL AREA COMPUTER NETWORKS

In a local area computer network some activities to be synchronized may be allocated to different computers of the network. In the light of the last paragraph we look at how to use or modify the synchronization concepts for this case.

First we have to check whether the conditions (C1), (C2) and (C3) hold in a local area computer network. As we have seen, the implementation of synchronization primitives can't guarantee that condition (C3) holds. This is the responsibility of the user. We further assume that an operating system controls the network, preferably in a decentralized form.

To satisfy condition (C1) we have no common memory as in the example of Fig. 1. An atomic action can only be performed either by a single computer of the network or by the communication system if it is a unique bus or ring system. The latter is quite unusual because it leads to special forms of bus protocols. Usually each synchronization task is assigned to a certain computer of the network which performs the associated atomic action.

In principle each of the activities participating in a synchronization task has to send a request to that subsystem which performs the associated atomic action and gets back an answer whether it can continue or not. This is very similar to the operation of message passing systems and it seems quite natural to use such systems for the synchronization of distributed activities. As stated in the last paragraph, message passing systems are not easy to handle in an environment where activities exist only dynamically, and they must be structured carefully to prevent deadlocks. We can implement simpler concepts by using a substitute activity which represents the calling activity in the subsystem performing the atomic action. Messages are only passed between the calling activity and its substitute. The message oriented technique is hidden from the user. Fig. 4 gives an example for the implementation of a monitor. Semaphores can be implemented in the same way. The effect of synchronization is delayed by the transmission time. In principle an anomaly of the time ordering of events is possible (see: Lamport, 1978a), but with respect to the short transmission time in a local network it can be neglected in most of the applications.
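The substitute technique can be sketched in Python as follows. This is a minimal single-process illustration under assumptions, not the implementation of Fig. 4: two `queue.Queue` objects stand in for the network connection, threads stand in for activities in different subsystems, and the names `Monitor`, `Substitute`, `RemoteMonitor` and `run_demo` are hypothetical.

```python
import queue
import threading


class Monitor:
    """Object with exclusive access; only one procedure runs at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0

    def add(self, amount):
        with self._lock:
            self._total += amount
            return self._total


class Substitute(threading.Thread):
    """Represents one calling activity inside the subsystem that owns the monitor."""

    def __init__(self, monitor):
        super().__init__(daemon=True)
        self._monitor = monitor
        self.requests = queue.Queue()   # messages from the calling activity
        self.replies = queue.Queue()    # answers back to the calling activity

    def run(self):
        while True:
            message = self.requests.get()
            if message is None:         # caller has finished
                return
            procedure, args = message
            self.replies.put(getattr(self._monitor, procedure)(*args))


class RemoteMonitor:
    """Stub seen by the calling activity; the message passing stays hidden."""

    def __init__(self, monitor):
        self._substitute = Substitute(monitor)
        self._substitute.start()

    def call(self, procedure, *args):
        self._substitute.requests.put((procedure, args))
        return self._substitute.replies.get()   # wait for the answer

    def close(self):
        self._substitute.requests.put(None)
        self._substitute.join()


def run_demo(callers=3, calls=50):
    monitor = Monitor()

    def activity():
        remote = RemoteMonitor(monitor)
        for _ in range(calls):
            remote.call("add", 1)
        remote.close()

    threads = [threading.Thread(target=activity) for _ in range(callers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return monitor.add(0)
```

Each caller blocks in `call` until its substitute answers, so the delay of the synchronization equals the round trip of the two messages. The blocking `replies.get()` has no time limit, so a caller would wait forever if the subsystem holding the monitor stopped answering.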


[Fig. 4 (diagram not reproduced): an activity in subsystem i reaches the MONITOR in subsystem j through a substitute activity Q located in subsystem j.]

Fig. 4 Synchronization by a distributed monitor

SYNCHRONIZATION AND ERRONEOUS STATES OF SUBSYSTEMS

It is quite natural to look at how the rest of a network can operate if a single subsystem fails, even if fault tolerance is not the very reason for the implementation of a local area computer network. Suppose that an activity terminates irregularly in a region controlled e.g. by a semaphore, because a processor fails; then all other activities which have performed a request on the same semaphore will wait forever. To prevent this, we can install a book-keeping for all activities being in a region controlled by a semaphore. By this it is possible to perform the release operation if an activity terminates irregularly. It is much easier to prevent a "wait forever" situation using a distributed monitor (see Fig. 4). This situation can only occur if that subsystem fails which contains the monitor kernel (supposing that the data transmission between calling activity and monitor kernel is possible in any case).

Returning to more generality, there are two possible ways to handle the problem:

- The system detects each erroneous state which may lead to a "wait forever" state and reacts in such a way that all activities in the sound subsystems can run.

- The system detects each erroneous state which leads to a "wait forever" state and branches to an exception handling part of the user program. Enough information must be passed to select an appropriate action.

The first alternative seems very comfortable for the user, but in many cases it is not possible to continue activities unchanged in the running part of the network whereas some activities have terminated in the subsystem with a breakdown. The reconfiguration of the system should be done by the user because he has more knowledge of which functions of the whole system are essential. This means that the second alternative should be preferred.

Synchronization techniques don't pay attention to the occurrence of failures. The only exception is Ada, where a time out condition can be given for an accept-statement by a delay-statement. This means that the called activity (which contains the accept-statement) will wait only a definite time and then continue performing the statements associated with the delay-statement. It is not clear what happens if the called activity terminates irregularly.

In the monitor and the rendezvous concept it is not possible to have access to global objects within the controlled region. All external objects must be passed as arguments of the call. This may cause some overhead in many applications of

local area computer networks for process control. Therefore we propose two different solutions:

A message passing solution. Synchronization is done by a message passing system. In any case the transmitting activity will be continued and an error code will be delivered containing information about the status of the receiver. Checks for deadlocks and the reconfiguration are the task of the user. For better convenience the user can simulate distributed semaphores or monitors. This leads to a straightforward implementation.

A solution with an extended semaphore concept. It is quite unusual to have a READ AND CLEAR operation as a language primitive; the atomic action is not the semaphore itself but the READ AND CLEAR operation. The other solution is directed by the consideration that the user would otherwise have to maintain appropriate waiting queues. We propose to enlarge the semaphore concept by

- an access function to read the value of a semaphore without changing it; example:

      value_sema := sema

  assigns the value to the variable value_sema;

- an access function to the queue associated with a sema; example:

      name_task_i := sema.queue(i)

  assigns the name of the task on position i of the queue to the variable name_task_i; for i=0 this is the name of the active task.

With these access functions a monitoring task can detect a "waiting forever" state and perform the appropriate actions. The monitoring task can be activated with an appropriate time schedule before performing a request-statement on a semaphore. Compared with a message passing solution this solution establishes a higher dependency on the availability of the subsystem containing the semaphore.

CONCLUSION

Considering different synchronization techniques in a local area computer network, two solutions should be preferred. The message passing solution is more symmetric because it doesn't assign a special role to any of the subsystems. A solution with an extended semaphore concept allows one to avoid "waiting forever" states in the case of a breakdown of a subsystem. This solution is unsymmetric because the subsystem containing the semaphore plays a special role. To gain practical experience the author is going to implement the extended semaphore concept on a hierarchical local area computer network. For such a system best results are expected because the hierarchy reflects the unsymmetry of this solution.

REFERENCES

Ada reference manual. United States Department of Defense, July 1980, DoD MIL-STD-1815

Clark, D.D., K.T. Pogran and D.P. Reed (1978). "An introduction to local area networks", Proc. of the IEEE 66, pp. 1497-1517

Dijkstra, E.W. (1968). "Cooperating sequential processes", in: Programming Languages (Editor: F. Genuys), pp. 43-112, Academic Press, New York 1968

Ethernet. "The Ethernet, a local area network", common specification of DEC, Intel and Xerox, version 1.0, Sept. 1980

Hoare, C.A.R. (1978). "Communicating sequential processes", Commun. ACM 21, pp. 666-677

Kohler, W.H. (1981). "A survey of techniques for synchronization and recovery in decentralized computer systems", in Computing Surveys 13, pp. 149-183; (with an annotated bibliography)

Lamport, L. (1978a). "Time, clocks and the ordering of events in a distributed system", in Commun. ACM 21, pp. 558-565

Lamport, L. (1978b). "The implementation of reliable distributed multiprocess systems", Computer Networks 2, pp. 95-114

Levi, P. (1981). "Betriebssysteme für Realzeitanwendungen", Datakontext-Verlag, Köln 1981

MASCOT Suppliers Association. The official handbook of MASCOT, Computing Standards Section, Royal Signals and Radar Establishment, Malvern, Worcestershire

Steusloff, H. (1977). "Zur Programmierung von räumlich verteilten, dezentralen Prozessrechensystemen", Dissertation, Universität Karlsruhe 1977