Remarks on a real-time, master-slaves operating system

L. Richter, Y. Wallach

Microprocessing and Microprogramming 7 (1981) 304-311
Conventional multiprocessor systems consist of a few, usually large, processors connected to a common memory and controlled by an operating system which "floats", i.e. may be activated by any of the processors (a decentralized system). Unfortunately, their speed will not increase with the number of processors p. This is explained on a free scale (Fig. 1) by the increasing demands of the operating system. This results from the context-switching and program-swapping features of its (multiprogramming) operating system and primarily because common data demand protection by semaphores, locks etc. The features mentioned are not required by process-control applications. Thus, centralized systems have lately been suggested [7,8,10,18]. They all may be modelled by the system of Fig. 2. It has a "master" which enforces information transfer, a single bus which connects the master, the p "slaves", the peripherals and memory, and p local memories ("stores") which enable the slaves to work independently of and concurrently with the other slaves. (The memory processor may be disregarded until the section on memory organization.) These systems work in an "Alternating Sequential/Parallel" ASP-mode [18], exemplified in Fig. 3.

Fig. 1. Time as a function of p (total time and the share spent in operating-system functions vs. the number of processors p).

There is an initialization phase during which the master transfers programs and data to the stores, after which the following steps are repeated: the master transfers information from and between slaves and completes step S_i by starting all p slaves. The slaves work during step P_i autonomously, for times dictated by data in their stores, until all complete their "tasks". An interrupt "wakes up" the master, which then initiates another "information-exchange" step S_{i+1}, and so on until the end of the job. ASP-systems have the following advantages: (a) Their price is lower and their cost/effectiveness better, because [6] they use more effectively advanced technology (LSI units, microprocessors, minicomputers etc.) produced in larger quantities.

Fig. 3. ASP-mode (sequential steps S_i, S_{i+1} of the master alternating with parallel steps P in which the slaves execute their tasks).
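The S/P alternation of Fig. 3 is essentially a barrier pattern: the master runs alone during S_i, releases all p slaves for P_i, and sleeps until the last of them reports completion. As a minimal sketch (not the SMS 201 implementation; the slave work and the information exchange are stand-in print statements, and shared memory plays the role of the stores and the bus), the cycle could look as follows in C with POSIX threads.

```c
#define _POSIX_C_SOURCE 200112L
#include <pthread.h>
#include <stdio.h>

#define P 4          /* number of slaves (placeholder value)              */
#define STEPS 3      /* number of S/P cycles in the "job" (placeholder)   */

static pthread_barrier_t start_P;   /* master releases slaves into a P-step     */
static pthread_barrier_t end_P;     /* last finishing slave "wakes" the master  */

/* Placeholder for the autonomous work a slave does on its local store.   */
static void slave_task(int id, int step)
{
    printf("slave %d: P-step %d done\n", id, step);
}

static void *slave(void *arg)
{
    int id = (int)(long)arg;
    for (int step = 0; step < STEPS; step++) {
        pthread_barrier_wait(&start_P);   /* wait until the S-step is complete  */
        slave_task(id, step);             /* work concurrently: step P          */
        pthread_barrier_wait(&end_P);     /* models the "all done" interrupt    */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[P];

    pthread_barrier_init(&start_P, NULL, P + 1);  /* p slaves + the master */
    pthread_barrier_init(&end_P, NULL, P + 1);

    for (long i = 0; i < P; i++)
        pthread_create(&t[i], NULL, slave, (void *)i);

    for (int step = 0; step < STEPS; step++) {
        printf("master: S-step %d (information exchange)\n", step);
        pthread_barrier_wait(&start_P);   /* "start all p slaves"               */
        pthread_barrier_wait(&end_P);     /* sleep until all slaves finish      */
    }

    for (int i = 0; i < P; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```

The second barrier models the interrupt that wakes the master only after all p slaves have finished their tasks.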


Fig. 4. Parts of EPOC.

The dispatcher has a good approximation to the state of the network: prefiltered, relevant and checked out. Since this data forms the basis for the other programs, STEP is central (Fig. 4) to EPOC. It is first used for checking input data against low and high boundaries (supplied by STEP). This is the "Limit-checking Program" LIMP. It is also used by the "Economic Dispatch Program" EDIP, the "Contingency Analysis Program" CAP and the "Monitoring, Alarm and Display" MAD program. The aim of EDIP is to adjust generation so that all loads are supplied and the cost of generation is minimal (pp. 959-972 of [22]); in short, it is an optimization program. The (security) constraints may considerably [5,11] modify the minimum operating cost. It may also be required to compute the effects of losing one or more lines of the network, or even to find that branch (or branches) which, if removed, cause other parts of the network to be overloaded. EDIP suggests the best generation of energy; CAP recommends corrective actions in case of contingencies: generation rescheduling, line switching, load-shedding (pp. 884-892 of [22]) etc. MAD will monitor the line diagrams, lists of de-energized systems, of circuit-breakers and relays etc. It will include an alarm system for any voltage, current or power outside its limits, and a display program. It will also include "supporting activities" such as interchange billing, energy accounting and tariffication, printing of reports and logs, scheduling of work orders and maintenance, accounting etc. It relies heavily on sorting [13].
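As a rough illustration of the limit-checking done by LIMP, the C sketch below compares each measurement against a low and a high boundary and reports violations; the function name, the parallel-array layout and the sample per-unit values are invented for the example, and the real boundaries would come from STEP.

```c
#include <stdio.h>

/* Check each of n measurements against its low/high boundary and report
 * violations.  Returns the number of values found outside their limits.  */
static int limp_check(const double *value, const double *low,
                      const double *high, int n)
{
    int violations = 0;
    for (int i = 0; i < n; i++) {
        if (value[i] < low[i] || value[i] > high[i]) {
            printf("LIMP: measurement %d = %g outside [%g, %g]\n",
                   i, value[i], low[i], high[i]);
            violations++;
        }
    }
    return violations;
}

int main(void)
{
    double v[]  = { 0.98, 1.07, 0.93 };   /* e.g. per-unit voltages (made up) */
    double lo[] = { 0.95, 0.95, 0.95 };
    double hi[] = { 1.05, 1.05, 1.05 };

    printf("%d violation(s)\n", limp_check(v, lo, hi, 3));
    return 0;
}
```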

Two important observations are:
(a) The programs are simple, but the data size is enormous. Consider EDIP. The best-known algorithm [5] is a simple gradient optimization program which adjusts iteratively the steady-state of the network until an optimum is achieved. This steady-state problem, called "Load-Flow" LOAF, may be defined as follows: given all node powers, calculate the node voltages. Since power depends quadratically on voltage and both are complex quantities, LOAF of a network with n nodes will lead to a set of 2n algebraic, non-linear equations. The difficulty is that n for the "average" American network is about 1000 and that EDIP has to solve these 2000 equations repeatedly. CAP also has to compute LOAF repeatedly for every contingency. The size of the problem is thus readily appreciated.
(b) The network changes dynamically (circuits in/out, generation up/down etc.). The programs mentioned will therefore "normally" proceed according to some "cycle", say LIMP, STEP, EDIP, CAP and MAD, but should a real contingency appear, the entire data-base is valueless (because it changes continuously). Hence, in cases of emergencies, network data should not be stored.
Efficient ASP-versions of most problems mentioned above were already devised. LOAF was ASP'ed in [14,15,16], sets of linear equations basic to LOAF, EDIP and CAP in [2,3,4], EDIP in [11,12,17], state-estimation in [1,19] and sorting in [13,20]. This rather complete list proves that the programs exist; the on-line operating system does not. In the present paper, guidelines for such operating systems will be developed.
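For reference, the 2n non-linear equations of LOAF follow from the standard nodal power-balance formulation (not spelled out in the paper): with n complex node voltages and an n x n bus admittance matrix, each node contributes one active-power and one reactive-power equation.

```latex
% Standard load-flow (LOAF) equations: two real equations per node, 2n in total.
% P_i, Q_i : specified active and reactive injections at node i
% |V_i|, \theta_i : voltage magnitude and angle;  Y_{ik} = G_{ik} + jB_{ik}
\begin{aligned}
P_i &= |V_i| \sum_{k=1}^{n} |V_k| \left( G_{ik}\cos\theta_{ik} + B_{ik}\sin\theta_{ik} \right),\\
Q_i &= |V_i| \sum_{k=1}^{n} |V_k| \left( G_{ik}\sin\theta_{ik} - B_{ik}\cos\theta_{ik} \right),
\qquad \theta_{ik} = \theta_i - \theta_k,\quad i = 1,\dots,n.
\end{aligned}
```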

3. Cyclic Work-Load

The programs EDIP, STEP and CAP are similar in that they rely on the Newton-Raphson method of solution. This in turn will "factorize" the system matrix once and then reuse it for a number of "back-substitutions". We may assess the required time roughly, as follows: the most efficient algorithm [16] for the first part of all the above programs is factorization. It will

eliminate half the matrix in, say, s "steps", each one during

t_1 = 2mω/(pk)    (2)

time units, with m being the size of the matrix, ω the "operation" time of one multiplication μ and one addition α, and k a factor depending on sparsity. The average ω of a number of minicomputers was found to be about 10 μsec. Thus s·t_1 time units are needed to eliminate half of the matrix; the remaining full matrix is best solved by "block" methods. For each factorization, 3 to 5 "back-substitutions" [4] are needed to complete a solution, each of them requiring an additional, smaller amount of time.
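The factorize-once, back-substitute-many pattern exploited by the Newton-Raphson cycle can be illustrated on a tiny dense system. This is only a sketch of the pattern: the paper's algorithms work on large, sparse, optimally ordered matrices distributed over the slaves, whereas the example below uses plain in-place LU factorization without pivoting on a 3 x 3 matrix.

```c
#include <stdio.h>

#define N 3

/* In-place LU factorization without pivoting (illustration only). */
static void factorize(double a[N][N])
{
    for (int k = 0; k < N; k++)
        for (int i = k + 1; i < N; i++) {
            a[i][k] /= a[k][k];                 /* multiplier, stored in L   */
            for (int j = k + 1; j < N; j++)
                a[i][j] -= a[i][k] * a[k][j];   /* update the trailing block */
        }
}

/* One "back-substitution": solve L*U*x = b reusing the stored factors.    */
static void solve(const double a[N][N], const double *b, double *x)
{
    double y[N];
    for (int i = 0; i < N; i++) {               /* forward substitution      */
        y[i] = b[i];
        for (int j = 0; j < i; j++)
            y[i] -= a[i][j] * y[j];
    }
    for (int i = N - 1; i >= 0; i--) {          /* backward substitution     */
        x[i] = y[i];
        for (int j = i + 1; j < N; j++)
            x[i] -= a[i][j] * x[j];
        x[i] /= a[i][i];
    }
}

int main(void)
{
    double a[N][N] = { { 4, 1, 0 }, { 1, 4, 1 }, { 0, 1, 4 } };
    double b1[N] = { 5, 6, 5 }, b2[N] = { 1, 0, 1 }, x[N];

    factorize(a);                 /* done once per Newton-Raphson matrix    */
    solve(a, b1, x);              /* ...then reused for every new RHS       */
    printf("x1 = %g %g %g\n", x[0], x[1], x[2]);
    solve(a, b2, x);
    printf("x2 = %g %g %g\n", x[0], x[1], x[2]);
    return 0;
}
```

The point is that factorize() is paid for once per Newton-Raphson matrix, while solve() is the cheap step repeated for every back-substitution.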

A network has n nodes (bus-bars) and each bus is connected on the average to about 2 lines, so that the number of lines is 2n. We measure active and reactive power on both ends of each line for STEP (m = 8n) and both value and angle of the voltage for both EDIP and CAP (m = 2n). For simplicity let us assume that we need c solutions for EDIP and STEP to converge and a single solution for each of the 5 cases of contingency to be checked.

It was found [14,15,17,19] that for networks with n of about 1000, the factor k is about 100. The resulting total time (8) for 1000 buses would be about 10/p minutes. On the other hand, k for small and/or dense systems is much lower than 100, so that the time is higher than that indicated by (8). Let us now compute the time required by the remaining programs, if done by the (single) master. The number of measurements is 8n (for all lines) plus 2n (for all nodes). Assuming 1.65 μsec for a single comparison, and that LIMP checks each value against a minimum and a maximum point, it needs

t = 10n x 2 x 1.65 = 33n μsec.    (9)

The output will use "smart" terminals which normally include 64 Kbytes of memory, work at about 9600 baud, have raster scan and good colour quality. They will use no more time than LIMP. Additionally, MAD would have to sort the data to be displayed, but as shown in [13], this time is inversely proportional to p. Thus, if p displays are assumed, a similar time will also be needed for output; the total time for the master is therefore on the order of 2 x 33n = 66n μsec.    (10)
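The comparison drawn in the next paragraph can be reproduced from these estimates; the short C program below only repeats the arithmetic of (9) and (10) and the 10/p-minute figure quoted for (8), so the numbers are the paper's estimates, not measurements.

```c
#include <stdio.h>

int main(void)
{
    double n = 1000.0;                      /* buses in the "average" network  */
    double t_limp_us = 10.0 * n * 2 * 1.65; /* (9): 10n values, 2 comparisons  */
    double t_master_us = 2.0 * t_limp_us;   /* (10): output costs no more      */

    for (int p = 2; p <= 16; p *= 2) {
        double t_slaves_s = 10.0 * 60.0 / p;   /* (8): about 10/p minutes      */
        printf("p=%2d  master %.1f ms  slaves %.0f s\n",
               p, t_master_us / 1000.0, t_slaves_s);
    }
    return 0;
}
```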

Comparing the master's total time (10) with the time (8) needed by the slaves, we reach the conclusion: if we consider only the work-load cycle, the master will need much less time for completing limit-checking and displays than the slaves need for STEP, EDIP and CAP. Note though that:
(a) In spite of it, the failure of the master incapacitates the entire system and thus defeats the main purpose of using a pps, namely its higher availability. We intend to deal with it by making the system work in two "states": the production and the self-testing state. During the production period, with its sequential and parallel steps, the cyclic work is done under the complete and exclusive control of the master. A test program resides permanently in all stores as well as in the main and secondary memory. At specified intervals τ a timer initiates a self-testing period, i.e. execution of this program. During the test, the master becomes dependent, with each of the slaves numbered 1 to p being the master for a short testing time. If we consider the p slaves threaded in a circular chain, then first slave 1 checks the master and slaves p and 2, then slave 2 checks the master and slaves 1 and 3, and so on. After p test steps the system has been redundantly checked and the master regains control. There are two possible failure cases: (a) The failure of a slave is recognized by three indications: its "left" and "right" neighbours point to it, and the testing slave, when it was itself a master, had to be terminated by the clock. The faulty slave must be removed and, since the number of slaves decreases by 1, the data must be re-formed. (b) The failure of the master will be recognized p times (as each of the slaves consecutively checks the master). For a completely symmetrical system, the master will be replaced by one of the slaves, and both the system programs reloaded and the data transformed. Note also that: (c) The test program is run by a slave because it should check the master more than the slaves and because there are p slaves but only one master. This test strategy may be called "p - 1 out of p".
(b) Availability should not be confused with reliability. For reliability, hardware is duplicated (say [21]), so that a cpu may be replaced without degrading service. In our scheme, service will decline at faults but the cost is not doubled. Furthermore, it should be mentioned that when service degrades too far (e.g. slower than one cycle per minute) the system may be considered to have failed.
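One self-testing period of the "p - 1 out of p" strategy can be sketched as a schedule over the circular chain of slaves. In the C sketch below, test_unit() is a stand-in for the real diagnostic, and the fault-attribution rule (both neighbours point at a slave and it overran its slot as test-master) is reduced to simple bookkeeping; all names are invented for the illustration.

```c
#include <stdio.h>
#include <stdbool.h>

#define P 5                        /* number of slaves (placeholder)       */

/* Stub: "unit u is tested by slave s and found healthy?"                  */
static bool test_unit(int s, int u)
{
    (void)s;
    return u != 3;                 /* pretend slave 3 is broken             */
}

int main(void)
{
    int suspect_votes[P + 1] = { 0 };    /* index 0 = master, 1..P = slaves  */
    bool overran_clock[P + 1] = { false };

    for (int i = 1; i <= P; i++) {       /* slave i is test-master in step i */
        int left  = (i == 1) ? P : i - 1;
        int right = (i == P) ? 1 : i + 1;

        if (!test_unit(i, 0))     suspect_votes[0]++;      /* check master    */
        if (!test_unit(i, left))  suspect_votes[left]++;   /* left neighbour  */
        if (!test_unit(i, right)) suspect_votes[right]++;  /* right neighbour */

        /* A faulty test-master never finishes its slot: the clock stops it. */
        if (!test_unit(i, i))
            overran_clock[i] = true;
    }

    if (suspect_votes[0] > 0)
        printf("master failed (%d of %d slaves report it)\n", suspect_votes[0], P);
    for (int u = 1; u <= P; u++)
        if (suspect_votes[u] == 2 && overran_clock[u])
            printf("slave %d failed: both neighbours point to it "
                   "and it was stopped by the clock\n", u);
    return 0;
}
```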

Since all processes are (centrally) controlled by the single master, there can be no unsolvable conflicts. Simultaneous requests will be sequentialized. We may differentiate between internal and external, and between predictable and unpredictable interrupts. They are summarized in Table 1, which also lists their (Greek) name, the time of interrupt handling and a priority (with 1 being the highest priority).

Table 1
Interrupts

Interrupt                             Name   Type        Handling time   Priority
Hardware failure in the master                           varying         1
Hardware failure in a slave                                              2
Request by the dispatcher             δ      external                    3
Changes in the electrical network     α      external                    4
Start of the test procedure           τ      internal                    5
Program fault in the master                  internal    long            6
Program fault in a slave                     internal    short           7
End of a parallel P-step              π      internal    very short      8

The P-steps are so short that we must classify their interrupts π as predictable. With a small variance, they "clock" the cyclic work-load according to the times computed earlier and according to the relevant data. The time of the test interrupt (τ) must be fixed beforehand; the timing of π depends on data which is not constant. Therefore synchronization between π and τ is impossible. Still, τ is completely predictable. We may combine τ with a timer for yet another purpose. The hardware is such [18] that the master is activated only after all slaves have finished their P-tasks. Thus, should a hardware failure prevent one of the slaves from issuing its interrupt, no π will appear. τ may then act as a "timer-interrupt" telling the master to take over. We next discuss some of the unpredictable interrupts. If the dispatcher is dissatisfied with any of the results on the display, he will interrupt the system (δ) and activate one of the query programs.
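The use of τ as a fall-back timer can be sketched as follows: the master counts the end-of-P-step interrupts π, but only up to a deadline, and takes over if the count is still short when the timer fires. The polling loop, the stub slave_done() and the 3-second interval below are stand-ins for the real interrupt hardware and for the actual value of τ.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdbool.h>
#include <time.h>

#define P 4

/* Stub for "has slave i raised its end-of-P-step interrupt yet?"          */
static bool slave_done(int i, int elapsed_s)
{
    return i != 2 || elapsed_s > 100;   /* pretend slave 2 never reports    */
}

int main(void)
{
    const int tau_s = 3;                /* test/timer interval (placeholder) */
    time_t start = time(NULL);
    int done = 0;
    bool reported[P] = { false };

    while (done < P) {
        int elapsed = (int)(time(NULL) - start);
        for (int i = 0; i < P; i++)
            if (!reported[i] && slave_done(i, elapsed)) {
                reported[i] = true;
                done++;
            }
        if (elapsed >= tau_s) {         /* tau fires: master takes over      */
            printf("timer: only %d of %d P-interrupts arrived, "
                   "suspecting a slave failure\n", done, P);
            break;
        }
        struct timespec ts = { 0, 50 * 1000 * 1000 };   /* poll every 50 ms  */
        nanosleep(&ts, NULL);
    }
    if (done == P)
        printf("all slaves finished: start next S-step\n");
    return 0;
}
```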

We could also, at quiet moments, initiate some common functions such as collection of data related to energy exchange with neighbouring companies, billings, logs etc. We should though not forget that if an unexpected event occurred in the network, say a thunderstorm, a short-circuit etc., then we should not rely completely on the dispatcher. In this case, as well as in the case of a hardware failure, an interrupt (α) should activate both a message in the display and some preventive action by the system. We may set up a timing diagram for the predictable interrupts π and τ, but only a probability distribution as far as the unpredictable interrupts δ are concerned. According to Section 3 the first part may be computed for a given network size n and given p. The second part can only be assumed on the basis of experience. From talking to industry representatives we gathered that no statistical data really exists, but that interrupts due to network changes occur "about every half-hour" and "serious trouble, once every few weeks". If this information is taken as an "order of magnitude", the conclusion will be drawn that the master's current work may simply be overwritten (see Table 1). The reason is that until the end of handling the interrupt, the network probably changed, so the results we might get then would be based on false data. Moreover, if an external interrupt (α, δ) arrives while the master does I/O, it should stop due to the same consideration.

The system has three levels of memory: the local memories of the slaves, the main memory of the master (the same as the local memories) and a common back-up store. This back-up store contains the entire system, i.e. all loadable programs for both the master and the slaves, the self-testing software and the necessary data-files. The local memories each hold a "resident" part, which includes a status indicator (Table 2).

Table 2
State identification

(Each of the sixteen four-bit states 0-15 identifies, for a processor, its Role (Master or Slave), the current Period (work-load or test-time), whether the processor is Active or Inactive, and whether it is Busy or Idle within the current P-step or S-step; states 7 and 13 are invalid.)
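Table 2 implies that a processor's state fits into a four-bit word. The exact bit assignment is not given here, so the C sketch below simply picks one plausible layout (role, period, active, busy) to show how such a status indicator might be packed and decoded; the bit positions are an assumption.

```c
#include <stdio.h>

/* Hypothetical bit assignment for the 4-bit state word of Table 2.        */
enum {
    ST_ROLE_MASTER = 1 << 3,   /* 1 = master, 0 = slave                     */
    ST_PERIOD_WORK = 1 << 2,   /* 1 = work-load period, 0 = test-time       */
    ST_ACTIVE      = 1 << 1,   /* 1 = active, 0 = inactive                  */
    ST_BUSY        = 1 << 0    /* 1 = busy in a P-step, 0 = idle (S-step)   */
};

static void print_state(unsigned s)
{
    printf("%2u: %-6s %-9s %-8s %s\n", s,
           (s & ST_ROLE_MASTER) ? "master" : "slave",
           (s & ST_PERIOD_WORK) ? "work-load" : "test-time",
           (s & ST_ACTIVE)      ? "active"    : "inactive",
           (s & ST_BUSY)        ? "busy"      : "idle");
}

int main(void)
{
    /* Under this layout 0111b would be state 7, which Table 2 marks as    */
    /* invalid, so use an active-but-idle slave as the first example.      */
    print_state(ST_PERIOD_WORK | ST_ACTIVE);                          /* state  6 */
    print_state(ST_ROLE_MASTER | ST_PERIOD_WORK | ST_ACTIVE | ST_BUSY); /* state 15 */
    return 0;
}
```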

The status indicator is part of the "resident data" of each of the fast memories. Additionally, each resident part holds:
- The startup address, which holds the location counter after the initial startup.
- A small configuration table which defines the number of the current master, the number of the cpu and its left and right neighbours.
- An interrupt-indicator field.
- A table of free and occupied memory blocks.
During normal work-load the master will be its own left and right neighbour. The extensions of a higher-level language which enable direct programming of the system [4] are such that a "parallel procedure" is compiled and transferred to all slaves at its declaration. This means that all parallel procedures would be loaded into the local stores ahead of running the program, thereby wasting valuable memory. Delaying the transfer of a parallel procedure to the time of its actual call will not help, since if it is called inside a DO-loop, this method will waste a lot of time for transfer. The only solution is to let the programmer or compiler indicate to the operating system when a particular procedure is needed and when it may be overwritten. If stores are occupied and recovered, this will be recorded in memory tables as "used" and "free" areas. The same applies to data of the electric network, the compiler and some system programs, which occupy their areas permanently, and to some buffers etc., which are intermittently used and free. We thus need memory tables which will indicate the status of each block of our statically managed memories.
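A memory table for such statically managed stores can be as simple as a fixed array of block descriptors marked "used" or "free". The sketch below is one possible layout; the block size, table length and the occupy/release interface are invented for the illustration.

```c
#include <stdio.h>
#include <string.h>

#define NBLOCKS 8

struct block {
    unsigned start;              /* first address of the block              */
    unsigned size;               /* length in words                         */
    char     owner[12];          /* "" = free, else program or data name    */
};

static struct block store_map[NBLOCKS];

/* Mark the first free block large enough as used by `owner`; returns its
 * index, or -1 if the request cannot be satisfied.                         */
static int occupy(unsigned size, const char *owner)
{
    for (int i = 0; i < NBLOCKS; i++)
        if (store_map[i].owner[0] == '\0' && store_map[i].size >= size) {
            strncpy(store_map[i].owner, owner, sizeof store_map[i].owner - 1);
            return i;
        }
    return -1;
}

static void release(int i)
{
    store_map[i].owner[0] = '\0';        /* block becomes "free" again       */
}

int main(void)
{
    for (int i = 0; i < NBLOCKS; i++)
        store_map[i] = (struct block){ .start = i * 1024u, .size = 1024u };

    int a = occupy(1000, "EDIP");        /* parallel procedure loaded        */
    int b = occupy(512,  "buffers");
    release(a);                          /* may be overwritten again         */
    if (b >= 0)
        printf("buffers at block %d, start %u\n", b, store_map[b].start);
    return 0;
}
```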

6. Conclusions

Parts of an operating system for an "alternating sequential/parallel" ASP-system were defined and it was shown that the load on the master is not as high as predicted. The cyclic work-load requires a number of slaves proportional to the problem size n. System control is relatively easy because of the central position of the master. Interrupts can be processed fast; memory management does not require much storage or time. The overall conclusion is that changing from an off-line operating system (as used in SMS) to an on-line operating system should lead to simplification rather than the other way around.

References

[5] H.W. Dommel, W.F. Tinney, "Optimal power-flow solutions", IEEE Trans. Power Apparatus and Systems, Vol. PAS-87, No. 10, 1968, pp. 1866-1876.
[6] S.H. Fuller, "Price/performance comparison of C.mmp and the PDP-10", Proc. Third Annual Symposium on Computer Architecture, IEEE Computer Society, 1976, pp. 195-202.
[7] R. Kober, H. Kopp, Ch. Kuznia, "The multiprocessor system SMS 201", 1976, pp. 56-64.
[8] R. Kober, "Numerical weather forecast with the multi-microprocessor system SMS 201", 1977, pp. 225-230.
[22] Proc. of the IEEE, Special issue on the use of computers in the power industry, 1974.

L. Richter is Professor at the Computer Science Department of the University of Dortmund, Germany. He has conducted various research projects concerning operating systems. His research interests include resource management in multitasking systems, computer structures, firmware engineering (particularly vertical migration) and all aspects of parallel processing.

Y. Wallach is Professor at the Electrical and Computer Engineering Department of Wayne State University, Detroit, USA. He has held previous positions at the Technion in Haifa, Israel, at the University of Dortmund, Germany, and at Purdue University, USA. His research interests are in the areas of parallel processing, microprocessors and control applications.