On facilities for interprocess communication


Volume 12, number 5

INFORMATION PROCESSING LETTERS

13 October 1981

ON FACILITIES FOR INTERPROCESS COMMUNICATION

David M. HARLAND, Department of Computational Science, University of St. Andrews, Fife, Scotland

Received 9 March 1981; revised version received 8 June 1981

Concurrency, message passing, interacting processes, deadlock

1. Introduction

When parallelism is explicitly introduced into a computing system, several problems arise. Some of these problem areas will be considered here, and a powerful set of communication facilities will be presented to overcome them. An implementation of these facilities using a message passing mechanism will then be discussed.

The particular aspects of the concurrency problem to which we shall direct our attention are:

- synchronising interactions to retain the integrity of data being passed;
- preserving internal sequencing of data during its transfer, and permitting that data to be read either
  (a) strictly sequentially, forming a continuous record (e.g. in data logging), or
  (b) as an up-to-date record, by permitting the receiver to skip over back-logs of unwanted data (e.g. in process control environments, where old data are worse than useless).

To these basic requirements we may add:

- facilities for readily specifying programmer defined scheduling constraints, and
- the ability for some instances to supervise the activities of others. This requires that we be able to dynamically determine the running identity of a concurrent entity and its behaviour, in terms of the static definition that it is executing.

2. Messages as a means of communication

To tackle all of these requirements we will use a language which employs a message passing mechanism for communication between concurrent instances. In [1] a language was proposed which included concurrency as a fundamental concept. A concurrent entity had a value of type instance, which uniquely identified it in the running system, and it was an incarnation of a static lexical description of its behaviour, a value of type definition. These two types are distinct and, like all other types in the language, following Strachey [2], are ‘first class’ citizens.

All interactions take place by explicitly sending and receiving messages. Messages in a concurrent environment are regarded as the parallel equivalent of the sequential procedure’s parameter passing mechanism. Procedure parameters are thus merely a special case of the message system. The message system is clean because it is uniform and, as will be shown, it is an elegant mechanism for programming parallel interactions.

The notion of a message is simple: it is a value of any data type. This may be either a simple scalar value or a data structure; all data types in the language have the same rights, and so all can be passed as messages. Message passing is a high-level concept, as it abstracts over all the formatting and buffering, synchronisation and transfer activities involved in parallel interactions.
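To make the distinction between a definition and its instances concrete, the following Python sketch models a definition as a static description and an instance as a uniquely identified incarnation of it. Python is not the language of the paper, and the names Definition, Instance, new and is_executing are assumptions introduced purely for illustration.

```python
import itertools


class Definition:
    """Static description of behaviour (a value of type definition)."""

    def __init__(self, name):
        self.name = name


class Instance:
    """One incarnation of a definition, carrying a unique identity."""

    _identities = itertools.count(1)       # hypothetical source of identities

    def __init__(self, definition):
        self.identity = next(Instance._identities)
        self.definition = definition


def new(definition):
    """Create a fresh instance from a definition and return it."""
    return Instance(definition)


def is_executing(instance, definition):
    """Report whether `instance` is an incarnation of `definition`."""
    return instance.definition is definition


if __name__ == "__main__":
    demo = Definition("demo")
    x = new(demo)
    y = new(demo)
    # Two instances of one definition share behaviour but not identity.
    print(x.identity != y.identity, is_executing(x, demo))
```

Keeping the two notions as separate kinds of value is what allows a supervisor to ask both which entity it is dealing with and what that entity is doing.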

0020-0190/81/0000-0000/$02.50 © 1981 North-Holland


3. Instances and definitions

An instance can be created from a definition at any time, by using the new operator. This takes a definition value and returns an identity which denotes the instance created from it. For example:

    let x := new demo.

4. Message passing

A message, composed of a value, can be sent by:

    send m to x,

and a message from that instance can be requested by:

    receive i from x,

where i is a local variable in the receiving instance.

In the case of simply sending a message the sender can continue immediately, but requesting that a message be collected involves an automatic delay until such a message becomes available, should one not already be in the queue. These are two asynchronous actions: messages can be sent before they are needed, and backlogs of messages can be read in without disturbing the senders. There is a built-in synchronisation when an as yet unsent message is requested. A synchronous ‘handshake’ of messages can be written in the sugared construction:

    send m to x and receive i,

which is a one-liner shorthand for several distinct operations. It is succinct, yet embodies all of the necessary synchronisation and buffering activities.

These constructions are not merely documentation for lower level facilities, they are the constructs themselves. There is no need to concern the user with the details of how messages are passed, and how instances are synchronised, whenever this proves to be necessary. Message passing is a high level concept.

5. Management facilities

Having established the basics of starting up an instance from a definition and of sending and receiving messages, we can exploit the simplicity and conciseness to accommodate the above requirements.

5.1. Identifying the activity being undertaken

The instance values are unique and serve to identify particular members of the system; this is why we use them to direct messages. The definitions, being distinct from their incarnations as instances, can be related to individual instance values to determine whether an instance is executing that particular definition. It is therefore possible to find out, dynamically, what any particular instance is doing (in terms of the available definitions). For example:

    if x is demo then ... else ...,

where “x” is a value of type instance and “demo” is a value of type definition. This is self explanatory.

5.2. Data flow in the system

Instances communicate via messages, and messages are sent asynchronously. Messages accumulate in queues attached to the target instances, from where they can be ‘read in’ when needed. These queues serve to buffer the transfer of data within the system. By holding temporary overruns, and so absorbing variable transaction rates, the queue mechanism implements the synchronisation. It is the flexibility of the message queue which makes the system tolerant of unpredictable data rates, and it is this which enables us to separate out user-oriented scheduling requirements from system scheduling constraints. As we shall see later, the user can write private scheduling algorithms directly.

The asynchronous nature of a queue means that messages can arrive from a variety of sources, in any order, whilst those from a particular source may be read in sequence, irrespective of any other messages, from other sources, already in the queue. By examining the queues, to determine the origin of the messages and the nature of the senders, nondeterministic interactions can be undertaken.
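The queueing behaviour described above can be pictured with a small Python sketch. It is not the implementation used in the system: the class Mailbox and its methods are hypothetical names, and a thread condition variable stands in for whatever synchronisation the interpreter actually performs. Sending never delays the sender, receiving waits until a message from the requested source is present, and messages from one source are delivered in the order in which they were sent.

```python
import threading
from collections import deque


class Mailbox:
    """Queue attached to one instance: no-wait send, waiting receive."""

    def __init__(self):
        self._queue = deque()              # (sender, value) pairs in arrival order
        self._arrived = threading.Condition()

    def send(self, sender, value):
        """Append a message and return at once; the sender is never delayed."""
        with self._arrived:
            self._queue.append((sender, value))
            self._arrived.notify_all()

    def receive(self, source):
        """Return the oldest message from `source`, waiting until one arrives."""
        with self._arrived:
            while True:
                for index, (sender, value) in enumerate(self._queue):
                    if sender == source:
                        del self._queue[index]
                        return value
                self._arrived.wait()       # nothing from `source` yet


if __name__ == "__main__":
    box = Mailbox()
    box.send("x", 1)
    box.send("y", "other source")
    box.send("x", 2)
    # Messages from "x" come out in the order sent, regardless of "y".
    print(box.receive("x"), box.receive("x"), box.receive("y"))
```

The handshake form, send followed by receive, would correspond to a call to send on the partner's mailbox followed by a call to receive on one's own.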


5.2.1. A continuous record

In situations where a continuous stream of data is required it is merely necessary to request that the next message from that source be read in (with an automatic wait should it not yet have arrived). This is the default action, but can be explicitly specified by using the keyword first, e.g.:

    receive first i from x.

5.2.2. An up-to-date record

In some cases it is essential that only the most recent data be used, and that all intermediate messages from that source be discarded (but leaving messages from other sources undisturbed). This can be specified quite easily by the keyword latest, e.g.:

    receive latest i from x.

These simple mechanisms provide quite a powerful data handling facility.

If an instance, such as a line printer driver, is neither interested in who it serves, nor in what they are doing, then it can use the predefined instance identity “system”, which matches against any instance. This permits an instance to serve others in general, without taking specific action to find out the identities and requirements of all the members of the system. The “system” instance-valued literal, when used for input of a message with either the first or the latest options, accepts either the first or the latest message from any source. When used for output this instance identity serves to send a copy of the given message to every active instance in the system, providing, as a result of the uniformity of the system, a ‘broadcast’ facility:

    send xyz to system.

Since every instance has an identity of its own, it is only natural that an instance be able to access its own identity, by way of a local constant called “identification”, which it can then pass around the system, to other instances, and so identify itself explicitly. Because the system is orthogonal it is even possible for an instance to send a message to itself!

5.3. Interrogating message queues

Merely requesting that a message be received incorporates an automatic delay until such a message becomes available, if one is not already in the queue. There are situations, e.g. in an instance serving a variety of other instances, where this enforced delay is unsatisfactory. In such cases it is more sensible to cycle, polling the queue, in order to decide what action to undertake next. It is therefore necessary to provide a means of interrogating a queue to determine whether or not there are messages available from particular sources, or classes of sources. This can be achieved by the “message” predicate:

    if message x do ...,

which, given an instance value, reports whether there are any messages currently in the queue from that source or, given a definition value, reports whether there are any messages from any instance executing that definition. If no argument is given then it simply returns whether there are any messages at all, from any source.

5.4. Scheduling

Being able to interrogate message queues, identify messages, and select the order in which to consume them provides the basis for a user-oriented scheduling mechanism. For example, if an instance services messages from a variety of producers, then it can select its work according to some in-built priority scheme, written by the user.

Although the system automatically buffers messages, the user could simulate this if he wished. The reader/writer problem could be written so that the ‘common buffer’ is actually managed by a separate instance, this taking requests from a “reader” and a “writer” instance. The buffer manager then serves the writer before the reader simply by:

    if message writer then ... else ....

This is simple for the trivial case of a single writer and reader instance pair. If, however, there were many readers and writers then the buffer manager would have to discover which of its messages came from instances executing a write-oriented definition, and which were from a read-oriented definition.
This can be done using the is operator mentioned above, or by using the write-oriented definition value to find out whether there are any messages from any writer instances, whoever they may be, before serving read-oriented instances. This is extremely concise!

So far all communication has been with either instances which know of each other personally, or with instances of known definitions. If interacting directly they must have been placed in contact with each other either by being directly related, via default communications paths, or by having been sent their mutual identities via the message system, so that they could communicate directly.

Any instance can send a message to any other instance, so long as it has been given its identity. However, that instance might not know of, and so will not be expecting messages from, such an instance, and because it does not know its identity it cannot specifically ask for its message to be read in! If the definition that such an instance is executing is known in advance then, as in the case of the extended reader/writer problem, the message could be retrieved by asking for the message from an instance of that definition. If however the receiver does not know of the sender directly, and does not know which definition it is executing, then it simply knows that it has a message from somewhere, but has no way of directly asking for it.

The generality of the message scheme being outlined permits even this sort of situation to be managed sensibly. Apart from polling the queue to see if there are any messages, either at all or from specific sources, it is possible for an instance to request the identities of all senders of messages currently in its queue. The system primitive census achieves this. It returns a set (sets are one of the data types in the language) of identities. This set value can then be examined to decide how to manage the messages in the queue at that time.

Unlike in the above cases, where only previously known instances could be served, the receiver now has the identities of all of the senders, and so can ask for their messages in whatever order is deemed appropriate. The mechanism is completely general purpose: if census is given an argument, corresponding to a currently active instance in the system, then it samples the input queue attached to that instance, otherwise it examines that of the current instance.

A two-way link can establish itself automatically, as once one instance knows another, the second can find out the identity of the sender of the ‘anonymous’ message. This is quite an attractive capability, and it permits a system of interacting instances to ‘grow’ new communications paths as and when desired.

As a data type a set is merely an unordered collection of values. The set returned by census is a set which is composed entirely of instance identities. There are various operations on a set; one is a simple membership predicate:

    if census has x then ... else ...,

where “x” is again a previously known instance value, would test for messages from the abovementioned instance. This is as before. The advantages are only realised when the set of identities is used in conjunction with the is operator discussed earlier, or the classof operator. The is operator reports whether a particular instance is executing a particular definition. The classof operator takes an instance and returns its definition value. This definition value may not have been known to that particular instance before.

By testing elements of the census set the senders, even if previously unheard of and so with unknown requirements, can always be identified in terms of ‘what they are doing’ as well as ‘who they are’. This permits scheduling-by-activity, as well as scheduling-by-identity. For example, if we take a census of the queue and save it, by:

    let ids := census,

then by using an iterative set element selector (which applies a statement to each member of a set, in some ‘randomised’ order) we can service all of those messages from instances performing a certain function, whatever their identities:

    forall sender in ids do if sender is tape_driver do ....

In this way it is possible to test the senders against known definitions; if it is necessary to find out the definition being executed by any given instance, then the classof operator will report this. It is possible, therefore, to write a definition which records the names and activities of all instances which send it messages, even if they were unheard of before the message arrived.
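The queue interrogation facilities can be sketched in Python as follows. This is an illustration under several assumptions rather than the system's own mechanism: definitions are represented by plain strings, the automatic wait on an empty queue is replaced by an exception, and the latest option is taken to discard every older queued message from the chosen sender. The method names message, census and receive mirror the constructs in the text; everything else is hypothetical.

```python
from collections import deque


class Instance:
    """A running entity; `definition` names the description it executes."""

    def __init__(self, definition):
        self.definition = definition
        self.queue = deque()               # (sender, value) pairs in arrival order

    def send(self, sender, value):
        self.queue.append((sender, value))

    def matches(self, sender, source):
        """`source` is None (any sender), an Instance, or a definition name."""
        return source is None or sender is source or sender.definition == source

    def message(self, source=None):
        """The `message` predicate: is anything queued from `source`?"""
        return any(self.matches(sender, source) for sender, _ in self.queue)

    def census(self):
        """Set of identities of all senders with messages in the queue."""
        return {sender for sender, _ in self.queue}

    def receive(self, source=None, latest=False):
        """Take the first (oldest) or latest (newest) matching message."""
        hits = [i for i, (s, _) in enumerate(self.queue) if self.matches(s, source)]
        if not hits:
            raise LookupError("nothing queued; the real construct would wait")
        sender, value = self.queue[hits[-1] if latest else hits[0]]
        if latest:
            # Drop the whole backlog from that sender, other senders untouched.
            self.queue = deque(m for m in self.queue if m[0] is not sender)
        else:
            del self.queue[hits[0]]
        return value


if __name__ == "__main__":
    server = Instance("server")
    tape, sensor = Instance("tape_driver"), Instance("sensor")
    server.send(tape, "block 1")
    server.send(sensor, 20.5)
    server.send(sensor, 21.0)
    server.send(tape, "block 2")
    ids = server.census()                        # let ids := census
    for sender in ids:                           # forall sender in ids do
        if sender.definition == "tape_driver":   # if sender is tape_driver do
            while server.message(sender):
                print(server.receive(sender))
    print(server.receive(sensor, latest=True))   # receive latest i from sensor
```

Because census returns identities rather than messages, the receiver stays free to decide the order of service afterwards, which is the point the text makes about scheduling-by-activity.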


Note that such high level scheduling is based on the nature of the concurrent entities and their activities, rather than on some ad hoc priority scheme. This mechanism enables the user to write his scheduling algorithms directly into general purpose message managing systems.
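As one possible reading of a user-written scheduling algorithm, the Python sketch below orders a queue of pending messages by the definition their senders are executing, so that, for instance, writers are served before readers. The function service_order and the rank table are assumptions made for illustration; the paper does not prescribe any particular scheme.

```python
def service_order(queue, rank):
    """Order queued (definition, payload) messages by sender activity.

    `rank` maps a definition name to a position (lower is served first);
    definitions missing from the table are served last. Python's sorted()
    is stable, so messages of equal rank keep their arrival order.
    """
    return sorted(queue, key=lambda msg: rank.get(msg[0], float("inf")))


if __name__ == "__main__":
    queue = [("reader", "get"), ("writer", "put 1"),
             ("logger", "note"), ("writer", "put 2")]
    # Serve every writer before any reader; unknown activities go last.
    for definition, payload in service_order(queue, {"writer": 0, "reader": 1}):
        print(definition, payload)
```

The stable ordering matters here: it preserves the internal sequencing of messages from senders engaged in the same activity while still letting the user impose a preference between activities.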

6. Conclusions

Since several message-oriented systems have already been implemented it is fair to ask how that proposed here is an improvement on such systems. Message passing systems can be categorised in various ways.

Firstly, we can classify systems by the format of their messages. Most use fixed sized buffers, into which the messages must be squeezed. Naturally, once a buffer size is specified one has the problem of messages which are larger than that buffer size. Here all messages are the same size but each is a value, which can be an arbitrarily complex data structure. In a single address space this can be highly efficient for ‘large messages’ because data structures are passed as pointers. This is possible in our system because data structures cannot be selectively updated, as a matter of design (as can be seen in the ontology diagram of [1]). If messages are to be passed between processors, to instances on other machines, then direct i/o is performed, and the operation is correspondingly slower, as copying would be required for large messages. In this multi-processor system, however, all the buffering and i/o transfers would be undertaken by the interpreters on the respective machines; all that the instances (on separate processors) would do is pass a message as normal. Thus instance values are not simply numerical values (as in most other systems); they are more abstract, embodying the identification of their processor and their relationship to it. An interpreter running (possibly on a variety of machines) manages all the detailed buffering of physical i/o transfers. The language, therefore, is free of such matters, i.e. it is high level.

One can also categorise message systems by their type-checking mechanisms. Most are either typeless, as in TRIPOS [3], or heavily type-checked, as in Thoth [4]. To permit polymorphism, the heavily checked systems introduce extraneous syntactic constructs, such as variations on the Pascal variant-records, in Thoth itself. We take the view that whilst types and type-checking are important, static checks are too restrictive, so we implement a dynamic type-checking mechanism via a tagged architecture. This ensures complete freedom of action, yet it is fully protected.

We support a receiver-specified message request, with a no-wait send, and use (invisible) queues to hold accumulated messages. The queues can grow arbitrarily long, because they reside in the same heap-based address space as the rest of the system. This contrasts with the blocking sends of a fully synchronous system, e.g. CSP [5], and the need for special reply-messages, as introduced in Thoth. A no-wait send is preferred because it both reduces process switching and increases parallelism in a system.

As demonstrated, the current system offers considerable scope for nondeterminism, because an instance can evaluate its workload (in its queue) and schedule its own activities accordingly. It can always identify the senders of its messages, and can read the messages in as necessary. It can respond at any time, without being delayed.

This system is extremely orthogonal, unlike its predecessors. Indeed, many of the Thoth restrictions and ‘extra’ concepts are avoided here, and the dynamic type-checking, instance values and asynchronism make our system much easier to use. In short, we have here an interactive facility [6] which is both a programming language and its own ‘shell’, in which concurrency is a fundamental concept, not an add-on feature, and a completely uniform generalised message system, passing arbitrarily complex values, is the only means of communication.

It is quite clear that these communication primitives provide considerable freedom of action. All of the constructions embody the essence of the desired activity without involving the user in the details of the interactions. It is felt that it is only by abstracting out the essential purpose of a construct, and by pitching it at such a high level of participation, that the immensely complicated interactions taking place in a system of concurrent instances can be reduced to an intellectually manageable form.


7. Acknowledgements

This work was financed by the Science Research Council (UK).

References

[1] D.M. Harland, Concurrency in a language employing messages, Information Processing Lett. 12 (2) (1981) 59-62.

[2] C. Strachey, Fundamental concepts of programming languages, Oxford University Programming Research Group.

[3] M. Richards, A.R. Aylward, P. Bond, R.D. Evans and B.J. Knight, TRIPOS - A portable operating system, Software Practice and Experience 9 (1979) 513.

[4] W.M. Gentleman, Message passing between sequential processes, Software Practice and Experience 11 (1981) 435.

[5] C.A.R. Hoare, Communicating sequential processes, Comm. ACM 21 (1978) 666.

[6] D.M. Harland, Introduction to Protocol, St. Andrews University, Department of Computer Science (1980).