A data flow numerical processing operator


North-Holland, Microprocessing and Microprogramming 25 (1989) 133-138

P. Abellard and B. Barbagelata
Laboratoire d'Automatique et d'Informatique Appliquées de Toulon, Université de Toulon, 83130 La Garde, France

1 - INTRODUCTION

Efficiently carrying out a parallel calculation with a conventional multiprocessor machine poses some difficulties and does not always yield the expected performance. The three main difficulties to be solved are:

- Limited concurrency: the use of von Neumann's model, characterized by a sequential control mode and shared memory areas, reduces the concurrency between the various tasks of the program and the number of tasks carried out simultaneously.
- A complex control system: in order to manage processor-to-processor and processor-to-memory communications, and to resolve conflicts arising in access to resources, a complex control system is required. Bus contention and saturation complicate this control and reduce its efficiency.
- Programming difficulties: the problem in programming multiprocessor machines is to avoid interference between the various instructions of the parallel program. Such interference is significant when a conventional program is carried out by a multiprocessor system.

All of this makes the analysis of the system behaviour more difficult and more complex.

In a Data Flow architecture, these three problems are non-existent. This architecture is structurally different from that of conventional multiprocessor machines and does not imply additional hardware requirements for the execution of parallel programs (1, 2, 3, 4).

2 - DATA FLOW ARCHITECTURE (Figure 1)

In a Data Flow architecture, there is no central processor: it is replaced by a section of processor modules (arithmetic and logic units, input/output processors, ...). There is no central RAM memory: it is replaced by a section of memory modules holding addresses, operation codes, operands, ... Neither is there any program counter: a decision-making array enables connection of one memory module section output to the appropriate processing unit, and a distribution array enables connection of a processing unit output to the memory module section concerned (5).

A representation model of parallel calculation is absolutely necessary, on the one hand to express the parallelism, on the other hand to determine the dynamic behaviour of the system (6). Among the existing models, an extension of the basic Petri Nets, dubbed "Data Flow Petri Nets", has been selected (7).

3 - DATA FLOW PETRI NETS

3-1 : Reminder of the definition.

A Data Flow Petri Net is a 7-tuple $\langle R, \omega, \varphi, \psi, X, O, C \rangle$ in which:

- $R$ is a conformable Petri Net with two-part places: $P_v$ and $P_o$ are the sets of places associated respectively with variables and operators.
- $\omega$ is a surjective mapping, $\omega/P_v : P_v \to X$ ; $\omega/P_o : P_o \to O$, such that if $p_i \in P_o$, $p_j \in P_o$ and $\omega(p_i) = \omega(p_j)$ for $i \neq j$, then $\forall t_l \in {}^{\circ}p_i$, $\forall t_k \in {}^{\circ}p_j$, $\{\omega({}^{\circ}t_l)\} \neq \{\omega({}^{\circ}t_k)\}$, so two identical operators can't work on the same set of data.
- $\varphi$ is an injective mapping $\varphi : X \to M$ such that $\forall p \in P_v$, $ME \in M \Rightarrow ME = \varphi(\omega(p))$; $M = \{ME_1, \ldots, ME_u\}$ is called the set of memory areas.
- $\psi$ is a surjective mapping $\psi : T \to C$.
- $X = \{x_1, x_2, \ldots, x_u\}$ is a set of variables (real, integer, logical) with values in domains $D_1, D_2, \ldots, D_u$.
- $O = \{o_1, o_2, \ldots, o_t\}$ is a set of operators defined as internal mappings of $D_1 \times D_2 \times \cdots \times D_u$.

- $C = \{c_1, c_2, \ldots, c_r\}$ is a set of conditions (predicates) on the variables of $X$.

Figure 1 : Data Flow Architecture (blocks: processing units, processing section, pack of results, pack of operations, instructions section).
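The definition of section 3-1 can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `DataFlowPetriNet`, its field names and the place labels `p1` to `p6` are hypothetical, and the transitions, the mapping $\psi$ and the conditions $C$ are left out. The sketch keeps $\omega$ as two dictionaries (one per kind of place) and checks the rule that two identical operators cannot work on the same set of data.

```python
from dataclasses import dataclass

@dataclass
class DataFlowPetriNet:
    variable_places: dict  # omega restricted to P_v: place -> variable
    operator_places: dict  # omega restricted to P_o: place -> operator
    operands: dict         # operator place -> variable places feeding it

    def memory_area(self, place):
        # phi(omega(p)): one memory area per variable, so phi is injective.
        return "ME_" + self.variable_places[place]

    def duplicated_operations(self):
        """Pairs of operator places applying the same operator to the same data."""
        seen, clashes = {}, []
        for place, op in self.operator_places.items():
            data = frozenset(self.variable_places[p] for p in self.operands[place])
            if (op, data) in seen:
                clashes.append((seen[(op, data)], place))
            else:
                seen[(op, data)] = place
        return clashes

# Hypothetical net: two multipliers working on distinct pairs of variables.
net = DataFlowPetriNet(
    variable_places={"p1": "w21", "p2": "z12", "p3": "w24", "p4": "z42"},
    operator_places={"p5": "mul", "p6": "mul"},
    operands={"p5": ("p1", "p2"), "p6": ("p3", "p4")},
)
print(net.duplicated_operations())   # prints []
print(net.memory_area("p1"))         # prints ME_w21
```

A net in which both multipliers read `p1` and `p2` would be reported as a clash, which is the situation the constraint on $\omega$ rules out.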

3-2 : Representation.

Figure 2 shows the representation of an operation carried out with an operator and a set of variables. Software and hardware simulations of those nets have been realized with previously studied elementary modules, assembled together according to the net architecture (8).

- $t_i$ and $t_j$ are respectively called the input and output transitions of the operator $o_r$ associated with the place $p_{ij}$.
- ${}^{\circ}t_i = \{p_1, p_2, \ldots, p_r\}$ and $t_j^{\circ} = \{p'_1, p'_2, \ldots, p'_s\}$ respectively represent the data places necessary for the operator $o_r$ and the results obtained.

Figure 2 : Representation of an operation.

3-3 : Marking.

A mark put down in a "variable" place means that the value of the variable is written. A mark put down in an "operator" place means that the operator is activated. We assume that a place stores at most one mark, so the nets are safe.

3-4 : Example.

Figure 3 shows the Data Flow Petri Net of the calculation: $z_{22} = a_{22} - w_{21} z_{12} - w_{24} z_{42}$.

3-5 : Marker graph.

In order to study the dynamic behaviour of the net we can use the marker graph (figure 4).

4 - APPLICATION TO COMPUTATIONAL LINEAR ALGEBRA

Some new parallel algorithms are well suited to implementation on such a parallel architecture. This is the case of Evans's Quadrant Interlocking Method for solving linear systems of equations which, contrary to the Gaussian method or triangular LU decomposition, leads not to sequential but to concurrent relations (9).

Figure 4 : Marker graph.
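A marker graph can be generated mechanically from the marking rule of section 3-3. The sketch below is a minimal illustration, not the authors' tool: the numbering of the eleven places and six transitions is an assumption made for this example (it is not read from Figure 3), and the function names are hypothetical. A transition is taken to be enabled when all its input places are marked and, because the net is safe, all its output places are empty.

```python
from collections import deque

# Hypothetical numbering for z22 = a22 - w21*z12 - w24*z42:
# p1..p4 hold w21, z12, w24, z42; p8 holds a22; p5, p6, p10 are operator
# places; p7 and p9 hold the two products; p11 holds z22.
TRANSITIONS = {
    "t1": ({1, 2}, {5}),      # operands ready -> first multiplier activated
    "t2": ({5}, {7}),         # first product written
    "t3": ({3, 4}, {6}),      # operands ready -> second multiplier activated
    "t4": ({6}, {9}),         # second product written
    "t5": ({7, 8, 9}, {10}),  # products and a22 ready -> subtraction activated
    "t6": ({10}, {11}),       # result z22 written
}

def enabled(marking, pre, post):
    # Safe net: every input place is marked and every output place is empty.
    return pre <= marking and not (post & marking)

def fire(marking, pre, post):
    return frozenset((marking - pre) | post)

def marker_graph(initial):
    """Breadth-first exploration of all markings reachable from `initial`."""
    start = frozenset(initial)
    edges, seen, todo = [], {start}, deque([start])
    while todo:
        m = todo.popleft()
        for name, (pre, post) in TRANSITIONS.items():
            if enabled(m, pre, post):
                nxt = fire(m, pre, post)
                edges.append((m, name, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return seen, edges

def as_bits(marking, n_places=11):
    return "".join("1" if p in marking else "0" for p in range(1, n_places + 1))

markings, edges = marker_graph({1, 2, 3, 4, 8})
dead = [m for m in markings if not any(src == m for src, _, _ in edges)]
print(as_bits({1, 2, 3, 4, 8}), "->", [as_bits(m) for m in dead])
# prints 11110001000 -> ['00000000001']
```

With this numbering the two multiplications can fire in either order, so the graph has several paths that all end in the single marking where only the result place is marked.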

Figure 3 : Data Flow Petri Net.

Let the system $Ax = b$ be solved, and let us consider the decomposition of $A$ into "butterfly" matrices, $A = WZ$, with:

$$W = \begin{pmatrix} 1 & 0 & 0 & 0 \\ w_{21} & 1 & 0 & w_{24} \\ w_{31} & 0 & 1 & w_{34} \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad Z = \begin{pmatrix} z_{11} & z_{12} & z_{13} & z_{14} \\ 0 & z_{22} & z_{23} & 0 \\ 0 & z_{32} & z_{33} & 0 \\ z_{41} & z_{42} & z_{43} & z_{44} \end{pmatrix}$$

In order to determine the coefficients of $W$ and $Z$, we write out the equality $A = WZ$, and thus we obtain, for rows I and IV:

I : $z_{11} = a_{11}$, $z_{12} = a_{12}$, $z_{13} = a_{13}$, $z_{14} = a_{14}$

IV : $z_{41} = a_{41}$, $z_{42} = a_{42}$, $z_{43} = a_{43}$, $z_{44} = a_{44}$

which require no calculation. In the same manner, we have for row II:

II : $w_{21} z_{11} + w_{24} z_{41} = a_{21}$, $w_{21} z_{12} + z_{22} + w_{24} z_{42} = a_{22}$, $w_{21} z_{13} + z_{23} + w_{24} z_{43} = a_{23}$, $w_{21} z_{14} + w_{24} z_{44} = a_{24}$.

From the first and the last equations, we obtain:

$$w_{21} = \frac{a_{24} - z_{44} a_{21}/z_{41}}{z_{14} - z_{11} z_{44}/z_{41}} \quad\text{and}\quad w_{24} = \frac{a_{24} - z_{14} a_{21}/z_{11}}{z_{44} - z_{41} z_{14}/z_{11}},$$

and by substitution in the second and third equations, we obtain:

$$z_{22} = a_{22} - w_{21} z_{12} - w_{24} z_{42} \quad\text{and}\quad z_{23} = a_{23} - w_{21} z_{13} - w_{24} z_{43}.$$

Similarly, for row III, we have:

III : $w_{31} z_{11} + w_{34} z_{41} = a_{31}$, $w_{31} z_{12} + z_{32} + w_{34} z_{42} = a_{32}$, $w_{31} z_{13} + z_{33} + w_{34} z_{43} = a_{33}$, $w_{31} z_{14} + w_{34} z_{44} = a_{34}$.

As above, the values of $w_{31}$ and $w_{34}$ are deduced from the first and the last equations, and by substitution in the second and the third equations, we obtain $z_{32}$ and $z_{33}$. The Data Flow Petri Nets modelling these calculations are illustrated in figures 5 and 6.

By using the relation $A = WZ$, the linear system $Ax = b$ can now be expressed with two linear systems: $Wy = b$ and $Zx = y$. For solving $Wy = b$, we proceed as follows:

Figure 5 : Data Flow Petri Net of decomposition 1.

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ w_{21} & 1 & 0 & w_{24} \\ w_{31} & 0 & 1 & w_{34} \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}$$

We can see immediately that $y_1 = b_1$, $y_4 = b_4$, $w_{21} y_1 + y_2 + w_{24} y_4 = b_2$ and $w_{31} y_1 + y_3 + w_{34} y_4 = b_3$, from which $y_2 = b_2 - y_1 w_{21} - y_4 w_{24}$ and $y_3 = b_3 - y_1 w_{31} - y_4 w_{34}$. Once $y$ is known, we have to solve $y = Zx$ by proceeding as follows:

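$$\begin{pmatrix} z_{11} & z_{12} & z_{13} & z_{14} \\ 0 & z_{22} & z_{23} & 0 \\ 0 & z_{32} & z_{33} & 0 \\ z_{41} & z_{42} & z_{43} & z_{44} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}$$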

Starting from the center, we solve the system $z_{22} x_2 + z_{23} x_3 = y_2$ and $z_{32} x_2 + z_{33} x_3 = y_3$ to obtain:

$$x_3 = \frac{y_3 - z_{32} y_2/z_{22}}{z_{33} - z_{32} z_{23}/z_{22}} \quad\text{and}\quad x_2 = \frac{y_3 - z_{33} y_2/z_{23}}{z_{32} - z_{33} z_{22}/z_{23}}.$$

By substitution in $z_{11} x_1 + z_{12} x_2 + z_{13} x_3 + z_{14} x_4 = y_1$ and $z_{41} x_1 + z_{42} x_2 + z_{43} x_3 + z_{44} x_4 = y_4$, we can calculate $x_1$ and $x_4$.
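The whole procedure for the 4x4 case can be checked numerically. The following is a minimal sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, indices are 0-based (so `a[1][3]` is $a_{24}$), no pivoting is done, and every divisor appearing in the formulas is assumed to be non-zero. For brevity $x_2$ is obtained by back-substitution in the first center equation, and $x_1$, $x_4$ by solving the remaining 2x2 system with Cramer's rule.

```python
def qif_decompose(a):
    """Return (W, Z) with A = W Z for a 4x4 matrix A, row by row (I to IV)."""
    z = [[0.0] * 4 for _ in range(4)]
    w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    z[0] = [float(v) for v in a[0]]   # row I: z1j = a1j
    z[3] = [float(v) for v in a[3]]   # row IV: z4j = a4j
    z11, z14, z41, z44 = z[0][0], z[0][3], z[3][0], z[3][3]
    for i in (1, 2):                  # rows II and III are independent
        ai1, ai4 = a[i][0], a[i][3]
        wi1 = (ai4 - z44 * ai1 / z41) / (z14 - z11 * z44 / z41)
        wi4 = (ai4 - z14 * ai1 / z11) / (z44 - z41 * z14 / z11)
        w[i][0], w[i][3] = wi1, wi4
        for j in (1, 2):
            z[i][j] = a[i][j] - wi1 * z[0][j] - wi4 * z[3][j]
    return w, z

def solve_w(w, b):
    """Solve W y = b: y1 and y4 are read off, y2 and y3 follow directly."""
    y1, y4 = float(b[0]), float(b[3])
    y2 = b[1] - w[1][0] * y1 - w[1][3] * y4
    y3 = b[2] - w[2][0] * y1 - w[2][3] * y4
    return [y1, y2, y3, y4]

def solve_z(z, y):
    """Solve Z x = y starting from the center 2x2 block."""
    x3 = (y[2] - z[2][1] * y[1] / z[1][1]) / (z[2][2] - z[2][1] * z[1][2] / z[1][1])
    x2 = (y[1] - z[1][2] * x3) / z[1][1]
    r1 = y[0] - z[0][1] * x2 - z[0][2] * x3   # remaining system in x1, x4
    r4 = y[3] - z[3][1] * x2 - z[3][2] * x3
    det = z[0][0] * z[3][3] - z[0][3] * z[3][0]
    x1 = (r1 * z[3][3] - z[0][3] * r4) / det
    x4 = (z[0][0] * r4 - z[3][0] * r1) / det
    return [x1, x2, x3, x4]

# Example data chosen for this sketch; b is A times (1, 2, 3, 4).
A = [[4, 1, 2, 1], [1, 5, 1, 2], [2, 1, 6, 1], [1, 2, 1, 7]]
b = [16, 22, 26, 36]
W, Z = qif_decompose(A)
x = solve_z(Z, solve_w(W, b))
print([round(v, 6) for v in x])
```

The loop over rows II and III has no dependency between its two iterations, which is the concurrency the method offers compared with a sequential elimination.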

Figure 6 : Data Flow Petri Net of decomposition 2.

    .EQUATE HOST=0;
    .MODULE EXONE=1;
    .INPUT ARC1,ARC2,ARC3,ARC4;
    .OUTPUT ARC8;
    .LINK ARC5=NODE1(ARC1,ARC2);
    .LINK ARC6=NODE2(ARC3,ARC4);
    .LINK ARC7=NODE3(ARC5,ARC6);
    .LINK ARC8=NODE4(ARC7, );
    .FUNCTION NODE1=MUL,QUEUE(QUE1,1);
    .FUNCTION NODE2=MUL,QUEUE(QUE2,1);
    .FUNCTION NODE3=ADD,QUEUE(QUE3,1);
    .FUNCTION NODE4=OUT1(HOST,0);
    .MEMORY QUE1=AREA(1);
    .MEMORY QUE2=AREA(1);
    .MEMORY QUE3=AREA(1);
    .START;
    .DATA EXEC(EXONE,ARC1);
    .DATA EXEC(EXONE,ARC2);
    .DATA EXEC(EXONE,ARC3);
    .DATA EXEC(EXONE,ARC4);
    .END.

Figure 7 : Implementation of a calculation on a µPD 7281 processor.

We can see that the two nets described above can be reused. Those nets feature a high degree of parallelism.

5 - OPERATOR IMPLEMENTATION

The nets obtained can be directly implemented on a µPD 7281 data flow processor (10, 11). Figure 7 gives an example of implementation for the calculation: $y_2 = z_{22} x_2 + z_{23} x_3$.
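The data-driven execution of this four-node graph can be emulated in a few lines. This is only a sketch of the firing principle, not a model of the µPD 7281 itself: the node and arc names mirror the Figure 7 listing, but the assignment of `ARC1` to `ARC4` to $z_{22}$, $x_2$, $z_{23}$, $x_3$ is an assumption, a plain FIFO per arc stands in for the queue memories, and `NODE4` simply forwards its token in place of the output to the host.

```python
import operator
from collections import deque

# node -> (function, input arcs, output arc)
NODES = {
    "NODE1": (operator.mul, ("ARC1", "ARC2"), "ARC5"),
    "NODE2": (operator.mul, ("ARC3", "ARC4"), "ARC6"),
    "NODE3": (operator.add, ("ARC5", "ARC6"), "ARC7"),
    "NODE4": (lambda v: v, ("ARC7",), "ARC8"),   # stands in for the output to the host
}

def run(inputs):
    """Fire nodes whenever all their operand tokens are present.

    `inputs` maps each input arc to the list of values fed to it."""
    arcs = {name: deque() for _, ins, out in NODES.values() for name in (*ins, out)}
    for arc, values in inputs.items():
        arcs[arc].extend(values)
    fired = True
    while fired:
        fired = False
        for func, ins, out in NODES.values():
            if all(arcs[a] for a in ins):            # every input arc holds a token
                args = [arcs[a].popleft() for a in ins]
                arcs[out].append(func(*args))
                fired = True
    return list(arcs["ARC8"])

# y2 = z22*x2 + z23*x3 with z22 = 2, x2 = 3, z23 = 4, x3 = 5
print(run({"ARC1": [2], "ARC2": [3], "ARC3": [4], "ARC4": [5]}))   # prints [26]
```

No program counter appears anywhere: the order of operations is fixed only by the availability of tokens, and feeding several values per arc streams successive results through the same graph.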

6 - CONCLUSION

Data flow, or data-driven, machines feature a high degree of parallelism between the tasks that can be carried out in parallel, and consequently allow a level of performance substantially higher than that of conventional machines. In this paper, a static data flow architecture has been presented. Its regular, cellular structure, in which communications are simple, makes it possible to take advantage of the progress achieved in Large Scale Integration to build fast, low-cost machines with data flow processors. Though the architecture presented corresponds to an operator for solving linear systems of equations (an important subset of filtering and image processing problems), the principles discussed remain valid for numerous other applications, such as robotics. Data Flow Petri Nets have been used for modelling data flow parallel programs. A software tool has been developed for their automatic print-out, validation and arrangement, in order to determine the temporal behaviour of the programs represented and to optimize their implementation on a data flow machine (12).

REFERENCES

1 - RUMBAUGH J : A Data Flow Multiprocessor. IEEE Transactions on Computers, Vol C26, n° 2, pp 138-146, February 1977.
2 - AGERWALA T, ARVIND : Data Flow Systems. IEEE Computer, February 1982.
3 - ARVIND, VINOD KHATAIL, PINGALI : A Data Flow Architecture with tagged tokens. Laboratory for Computer Sciences, MIT/LCS/TM-114, September 1980.
4 - ARVIND, VINOD KHATAIL : A multiple processor data flow machine that supports generalized procedures. Eighth Annual Architecture Conference IEEE, pp 291-302, May 1981, Minneapolis.
5 - DONAN A : Paradocs, a highly parallel data flow computer and its data flow language. Microprocessors and Microsystems, n° 7, pp 20-31, 1981.
6 - KRISHNA M. KAVI, BILL P. BUCKLES, U. NARAYAN BHAT : A formal definition of data flow graph models. IEEE Transactions on Computers, Vol C35, n° 11, pp 940-948, November 1986.
7 - ALMHANA J : Modélisation par Réseaux de Petri à Flux de Données. Application à la synthèse de l'opérateur de Riccati rapide. Thèse de Doctorat d'Etat, Marseille, Juin 1983.
8 - BARBAGELATA B, ABELLARD P : Parallel processing modelling with Data Flow Petri Nets. First European Workshop on parallel processing techniques for simulation, UMIST, 28-29 October 1985, Manchester.
9 - EVANS D.J : Design of parallel numerical algorithms. First European Workshop on parallel processing techniques for simulation, UMIST, 28-29 October 1985, Manchester.
10 - BARBAGELATA B, ABELLARD P : Opérateurs de calcul parallèle modélisés par Réseaux de Petri à Flux de Données. Onzième Colloque GRETSI, 1-5 Juin 1987, Nice.
11 - BARBAGELATA B, ABELLARD P : Data Flow Petri nets for Data Flow processors. The Petri Net Newsletter, n° 24, pp 18-20, August 1986.
12 - BARBAGELATA B, ABELLARD P : A Data Flow Multiprocessor. Eighth European Workshop on Applications and Theory of Petri Nets, 24-26 June 1987, Zaragoza.
13 - GUILIERI A, BARBAGELATA B, ABELLARD P : Systolic arrays modelling by Data Flow Petri Nets. First European Workshop on parallel processing techniques for simulation, UMIST, 28-29 October 1985, Manchester.
14 - ABELLARD P, BARBAGELATA B : Systolic array design. Ninth European Workshop on Applications and Theory of Petri Nets, 22-24 June 1988, Venice.
15 - MESHACH W : Data Flow IC makes short work of tough processing chores. Electronic Design, n° 17, pp 191-206, May 1984.
16 - ABELLARD P : Contribution à l'étude d'extensions des réseaux de Petri à Flux de Données à la télésymbiotique assistée par calculateur. Thèse de Doctorat ès Sciences, 14 Juin 1988, Toulon.