Science of Computer Programming 37 (2000) 225–252
www.elsevier.nl/locate/scico
Shapeliness analysis of functional programs with algebraic data types

Thomas Nitsche*
Institut für Kommunikations- und Softwaretechnik, Technische Universität Berlin, Franklinstr. 28/29, Sekr. FR5-13, 10587 Berlin, Germany
Abstract

Data distribution algebras are an abstract notion for the description of parallel programs. Their dynamic execution can be optimized if they are shapely. In this paper we describe a shape analysis which allows compile-time shapeliness tests. It operates on the structure of algebraic data types and works for arbitrary functional programs rather than only shapely ones. Besides a first-order calculus we also propose a higher-order version which can handle higher-order functions as well. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Parallel programming; Functional programming; Skeletons; Data distribution algebras; Algebraic data types; Shape analysis
1. Introduction

Programming parallel systems is much more complex than the sequential case, because synchronization and communication issues additionally have to be taken into account. On systems with distributed memory, like massively parallel processing (MPP) systems and workstation clusters, the distribution of work and data has to be handled explicitly. But also for the growing class of symmetric multiprocessor (SMP) systems, where (virtual) shared memory simplifies programming, we have to take data distribution into account, because non-uniform memory-access architectures require the exploitation of data locality for an efficient execution.

Data distribution algebras [25, 30, 31] are an abstract notion for the description of parallel programs. The key idea behind the concept is that a data structure is split into a cover of overlapping subobjects which may be allocated to different processors. This makes it possible to express data locality in terms of computations on subobjects as well as
* Corresponding author. E-mail address: [email protected] (T. Nitsche).
0167-6423/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0167-6423(99)00028-3
communication on an abstract level in terms of overlapping areas. In contrast to common data-parallel languages like HPF [15] or C* [8], where arrays are the only parallel data type, it allows the use of arbitrary algebraic data types. Programs are described using skeletons [5, 10, 11], that is, higher-order functions with an efficient implementation on different parallel systems. Parallelism is encapsulated within the skeletons, which operate on covers.

Covers are defined in a functional language in terms of a splitting and a corresponding gluing operation. In a dynamic execution model, split distributes the subobjects onto different processors, while glue collects the data. In order to avoid unnecessary communication we can eliminate pairs of equivalent split and glue functions. This leads to a more static data distribution where only the overlapping parts are exchanged among the processors. Experimental results confirm that for iterative problems such optimizations reach the efficiency of handwritten programs. The applicability of the optimizations is ensured if split and glue are shapely.

Shape and content are two aspects of a data structure. The shape contains the structure information, like the index range of a matrix or the node structure of a tree, while the content is the set of elements within this structure, like the elements in the fields of a matrix, in the nodes of a tree and so on. In a shapely function [17, 19] the shape, i.e. the structure, of the result depends only on the shape of the input data structure but not on the data values themselves. This data-independence of shapely structures gives rise to many program optimizations and other applications, especially in parallel programming [17].

In this paper we describe the shapeliness analysis of functional programs. It is based on the structure of algebraic data types and allows compile-time optimizations for the dynamic execution of covers.
In order to detect data-dependent swapping of elements between different subobjects, and hence possibly different processors, we extend the usual notion of shapeliness by some information about the order of elements within the structure. As we allow shape functions, we do not have to restrict the language to shapely expressions but can even analyse higher-order functions.

This paper is organized as follows: Section 2 motivates the shapeliness analysis with the optimized dynamic execution of data distribution algebras. In Section 3 we describe the main idea of our shapeliness analysis. A first-order calculus is given in Section 4, while Section 5 extends it to the higher-order case and discusses the fixpoint calculation. An example cover is analysed in Section 6. Finally, Section 7 concludes.
2. Motivation

In this section we briefly review the basic concepts of data distribution algebras; a more detailed description can be found in [30]. Their optimized dynamic execution [23] is the main motivation for our approach to shape analysis.
2.1. Covers

Conceptually, a cover C of an object O is just a set C = {S_i | i ∈ I} of subobjects S_i ⊆ O such that their union yields O again, that is, ∪C = ∪_{i∈I} S_i = O. Moreover, the following requirements have to be met:
• Ownership: Each of the subobjects is partitioned into an own part own(S_i) and a foreign part foreign(S_i), i.e., S_i = own(S_i) ∪ foreign(S_i) and own(S_i) ∩ foreign(S_i) = ∅.
• Partitioning: The own parts of all subobjects have to be a partition of the object, that is, ⊎_{i∈I} own(S_i) = O.

The idea is that the different subobjects will be allocated to different processors. Then the own part specifies local data, while the foreign part can be regarded as a reference to the data of another processor. A function application to an object yielding a new object corresponds to the application of a (local) function to the subobjects yielding a cover of new subobjects, where the overlapping foreign parts specify, on an abstract level, the necessary communication.

Formally, every cover can be described functionally as a refinement of the following generic specification in terms of splitting an object into subobjects and a corresponding gluing function:

COVER C[α]
  TYPE obj[α]             -- whole object
       subobj[α]          -- local subobjects
       cover[subobj[α]]   -- structure of the cover
  FUN split_C : obj[α] → cover[subobj[α]]
  FUN glue_C  : cover[subobj[α]] → obj[α]
  AXM glue_C ∘ split_C = Id
To illustrate the idea, let us consider as an example the division of a sequence into a list of sublists. This can be defined using obj = seq[α], subobj = block, cover = seq and

TYPE block == block(FOREIGN left : seq[α]; OWN inner : seq[α]; FOREIGN right : seq[α])

by the two operations

FUN split_SeqBlock : seq[α] → seq[block[α]]
FUN glue_SeqBlock  : seq[block[α]] → seq[α]
Fig. 1 shows a sequence split into four subobjects, with one element overlapping to the left and two to the right.
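Such a split/glue pair can be sketched in Haskell. This is a hypothetical rendering, not the paper's OPAL definition (that is given in Fig. 5): the function names, the equal block sizes, and the overlap parameters l and r are our own assumptions.

```haskell
-- A block keeps its own elements plus overlapping foreign borders.
data Block a = Block
  { left  :: [a]   -- FOREIGN overlap to the left
  , inner :: [a]   -- OWN local elements
  , right :: [a]   -- FOREIGN overlap to the right
  } deriving Show

-- splitSeqBlock l r p xs: p blocks with l elements of left overlap
-- and r elements of right overlap (assumes p divides length xs).
splitSeqBlock :: Int -> Int -> Int -> [a] -> [Block a]
splitSeqBlock l r p xs = map mk [0 .. p - 1]
  where
    size = length xs `div` p
    mk i =
      let lo = i * size
          hi = lo + size
      in Block (take (min l lo) (drop (max 0 (lo - l)) xs))  -- left border
               (take size (drop lo xs))                       -- own part
               (take r (drop hi xs))                          -- right border

-- glue keeps only the own parts; since they partition the object,
-- the axiom glue . split = Id holds by construction.
glueSeqBlock :: [Block a] -> [a]
glueSeqBlock = concatMap inner
```

For the sequence of Fig. 1 (l = 1, r = 2, p = 4), `glueSeqBlock (splitSeqBlock 1 2 4 xs)` returns `xs` again.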
Fig. 1. An overlapping sequence-block cover with p = 4 subobjects.
The full definition of the sequence-block cover can be found in Fig. 5.

2.2. Skeletons

Skeletons are defined as certain higher-order functions in terms of covers. An example is the well-known map skeleton, which applies a function to all elements of a certain data structure:

f * [a_1, …, a_n] = [f(a_1), …, f(a_n)].

If the cover cover[α] possesses a map operator '*', we can lift a function

FUN f : subobj[α] → subobj[β]

on subobjects onto the original object obj[α]:

Map_C(f) == glue_C ∘ (f*) ∘ split_C.

Analogously, other skeletons can be defined over covers.

2.3. Execution model

A dynamic approach to the automatic transformation of covers considers the splitting of an object into subobjects as a generation of tasks and a distribution of subobjects onto the different processors, while glue collects the data. We can recursively repeat this procedure if some subobjects are to be split in turn. This allows an easy treatment of composed covers and thus of nested skeletons. In addition, we can also integrate task and data parallelism: besides data-parallel skeletons like map and reduce, which operate on covers, we can represent task-parallel skeletons like farm or divide-and-conquer [7, 10] within the same framework.

For systems with shared memory this works well, as we have a set of threads which operate on subobjects and thus exploit data locality. On a system with distributed memory, like MPPs or workstation clusters, we have the problem that a program generally consists of more than just one skeleton application. If not much work is done locally, this results in a bad computation-to-communication ratio, as every split and glue operation may involve a great amount of communication: all the local data may be transferred to another processor. Instead of doing useful work, the processors are just sending their local data to a master processor (glue) and immediately receiving the data again (split) for the next iteration step.
This glue/split problem can be overcome if the subtasks do not terminate immediately. A subsequent split of the object, together with the corresponding redistribution
of the data onto the processors, can then be left out. Instead, we just update the values of the overlapping foreign parts. For iterative programs we can eliminate the exchange of the subobjects in all intermediate steps, which leads to a more static data distribution. This corresponds to the view of a cover as a distributed object whose subobjects are allocated on different processors and communicate only locally with their "neighbours", as specified by the overlapping parts. Thus the overhead of the initial distribution and final collection becomes negligible with an increasing number of iteration steps, and the program reaches the efficiency of handwritten programs without covers [23].

Note that due to the overlapping parts of the subobjects, a simple map fusion does not work here, as in general Map_C(f) ∘ Map_C(g) ≠ Map_C(f ∘ g). This implies a communication phase between the two steps to exchange the overlapping foreign values.

2.4. Shape analysis

Unfortunately, we cannot always eliminate the split/glue functions, as they can, in principle, arbitrarily change the data structure. A necessary prerequisite is that the data distribution, with the possible exception of the overlapping foreign parts, remains unchanged. In particular, a given element must not dynamically change its position within the data structure, and especially must not be put into another subobject, as this may result in hidden communication with another processor.

However, if split and glue are shapely functions [17, 19], that is, the resulting structure (shape) depends only on the structure of the input but not on the data itself, this is ensured. For a shapely cover the communication structure is determined only by the shape, i.e. the structure, of the data type and is therefore fixed. This allows us to replace the split/glue functions by a precalculated function which just exchanges the overlapping foreign parts with the corresponding processors.
If, however, the cover is non-shapely, then the subobject to which a certain referenced element belongs may change in every step, depending on the element values of, potentially, all other subobjects. That means that the referenced subobject and its processor may be different in every step, i.e., the communication structure of the overlapping foreign parts depends on the values within the data structure and can only be calculated once the values of the other subobjects are known. This requires collecting the subobjects on a master processor and thus prohibits the optimizations described above. Shape analysis thus yields conditions for the applicability of our optimizations.

3. Basic idea

We therefore want to ensure that splitting and gluing do not dynamically change the subobject into which a certain data element is placed. This is the case if they operate independently of the data values and use only structure information, like the size of a list or the order of its elements.
Let us consider a few examples. For a sequence xs = [0, 1, 2, 3, 4] we get

map(succ)(xs) = [1, 2, 3, 4, 5]                                (1)
map(λx. IF x < 2 THEN 2 ELSE x FI)(xs) = [2, 2, 2, 3, 4]       (2)
filter(even?)(xs) = [0, 2, 4]                                  (3)
LET x == ft(xs)          -- first element
    y == ft(rt(xs))      -- second element
IN  [max(x, y), min(x, y)] = [1, 0]                            (4)
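The four expressions can be replayed directly in Haskell (a sketch using the standard list functions `head` and `tail` in place of the selectors ft and rt):

```haskell
-- the example sequence
xs :: [Int]
xs = [0, 1, 2, 3, 4]

ex1, ex2, ex3, ex4 :: [Int]
ex1 = map succ xs                            -- (1) -> [1,2,3,4,5]
ex2 = map (\x -> if x < 2 then 2 else x) xs  -- (2) -> [2,2,2,3,4]
ex3 = filter even xs                         -- (3) -> [0,2,4]
ex4 = let x = head xs         -- first element
          y = head (tail xs)  -- second element
      in [max x y, min x y]                  -- (4) -> [1,0]
```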
1. This expression is shapely, as every element of the sequence is just incremented without changing the order.
2. Also shapely. The new elements themselves, however, which are calculated by the mapped function, might be non-shapely. The distinction between (1) and (2) only becomes relevant in the case of nested data structures.
3. Non-shapely: the resulting sequence structure depends on the element values in a data-dependent way.
4. Problematic: the length of the sequence is constant (always 2), but the order of the resulting elements depends on its values and may change.

As the structure of the result, a sequence with two elements, is independent of the values within the given sequence, expression (4) is considered shapely in traditional approaches [17]. In our setting, however, it has to be considered non-shapely, because a different order of the elements within the sequence may result in another data distribution over the subobjects, and this can only be decided dynamically at runtime, when the concrete data values are known. Note that from a functional point of view, x and y are values and the result is just the application of some function to these values. From an operational point of view, subobjects may be allocated on different processors, so values depending on them may involve communication to read these values and to store the result on the corresponding target processors. To account for this in our analysis, we treat the values like references to (sub-)objects.

In the case of nested data structures, where the elements of the list are not natural numbers but complex objects themselves, we are interested in how a map changes the elements according to their shapeliness. While in expression (1) a type constructor¹ (see below) is applied to every element, which has no impact on their shape, the resulting elements in (2) depend on a conditional, so they might be non-shapely. Nested types will be described in Section 3.3.

3.1. Separation of shape and data

In contrast to [21], where the shape of a vector is just its length, the size of a structure is not sufficient for our purposes, as we also have to consider the order of elements
¹ Here we assume the natural numbers are defined as zero or the successor of a natural number, as in TYPE nat == zero | succ(pred : nat).
within the structure, as in (4). In addition, we allow arbitrary algebraic data types like trees or graphs, so we also have to handle more complex data structures than just vectors. Because the calculation of the actual shape of a complex, dynamic data structure is quite expensive, we are content with a compile-time shapeliness test rather than a run-time calculation of the exact shape.

The basic approach is to distinguish shapely and data-dependent expressions and functions. Shapely functions operate only with structure information, i.e. the shapes of their input parameters, to construct the structure of the result, while in data-dependent functions the content, i.e. the values of the elements, like the minimal element of a sequence, also influences the shape of the result. An example of a shapely function is map, which applies a function to each element of a list but leaves the shape unchanged. In contrast, filter removes elements from a list that do not satisfy a certain predicate and is hence non-shapely.

3.2. Shape of algebraic types

Non-shapeliness occurs if the shape of the result depends on the content values of the input. The only source of data dependency which can occur in a program are conditionals. However, we do not assume that the program is written with a special shape conditional ifs as in FISh [21] but allow an ordinary functional program. To distinguish between structure (shape) and content values of a data structure we use the definition of the parameterized algebraic data type. Consider for example the parameterized sequence type over data elements α:

TYPE seq[α] == empty | ::(ft : α, rt : seq[α])
Such a definition implicitly defines, in the algebraic functional language OPAL [12, 13], a set of constructors to build expressions, selectors to select elements, as well as discriminators² to check which constructor has created an expression:

-- Constructors
FUN empty : seq[α]                 -- empty sequence
FUN ::    : α × seq[α] → seq[α]    -- cons
-- Selectors
FUN ft : seq[α] → α                -- first element
FUN rt : seq[α] → seq[α]           -- rest of sequence
-- Discriminators
FUN empty? : seq[α] → bool         -- is the sequence empty?
FUN ::?    : seq[α] → bool         -- has it a cons cell?

² The name of a discriminator function is derived by adding '?' to the name of the corresponding constructor.
We consider the list of cons-cells of the sequence as structure information while its element values belong to the content part.
A pattern-based function definition thus exploits structure information and is shapely, provided that the function body is shapely as well:

DEF length(empty)    == 0
DEF length(Ft :: Rt) == succ(length(Rt))
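The OPAL kernel operations above have a direct Haskell analogue. This is a sketch: the names `isEmpty` and `isCons` stand in for the discriminators empty? and ::?, and the selectors are partial on the empty sequence, as in OPAL.

```haskell
-- Haskell analogue of the OPAL seq type and its derived functions.
data Seq a = Empty | Cons a (Seq a)

-- Selectors: ft is non-recursive (it selects content),
-- rt is recursive (it returns a partial structure).
ft :: Seq a -> a
ft (Cons x _) = x
ft Empty      = error "ft: empty sequence"

rt :: Seq a -> Seq a
rt (Cons _ xs) = xs
rt Empty       = error "rt: empty sequence"

-- Discriminators: pure structure tests, shapeliness-preserving.
isEmpty, isCons :: Seq a -> Bool
isEmpty Empty = True
isEmpty _     = False
isCons        = not . isEmpty

-- Pattern-based definition: shapely, only structure information is used.
len :: Seq a -> Int
len Empty       = 0
len (Cons _ xs) = 1 + len xs
```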
For our analysis rules we will only consider a functional kernel language using discriminators to examine the structure, as well as selector and constructor functions. Pattern matching can then be added as syntactic sugar for certain kinds of conditionals.

Discriminators are test functions on the structure, so they are content-independent and hence preserve shapeliness. If the selector rt is applied to a sequence, it returns a sequence as well, but without the first element. In the following we will call such a selector function recursive, as it has type t → t and thus returns a partial structure. Although rt changes the structure, as the sequence gets shorter, shapeliness is not affected, because rt operates independently of the elements within the list: if the input sequence xs is shapely, so is the result rt(xs). In contrast, ft is a non-recursive selector; it selects element values from the sequence. Conditionals over shapely expressions remain shapely; only if an element value is used as a condition is a content dependency caused. The constructor :: puts a data element into a sequence and hence "hides" (or wraps) the element. For a shapely sequence xs we therefore get

empty?(xs)                      -- shapely (discriminator)
rt(xs)                          -- shapely (partial structure)
ft(xs)                          -- element value (content), may cause content dependency
ft(xs) :: rt(xs)                -- shapely (element ft(xs) is wrapped)
length(rt(xs))                  -- shapely (no content values are used)
IF empty?(xs) THEN …            -- shapely (discriminator)
IF ::?(xs) THEN …               -- shapely (discriminator)
IF length(rt(xs)) = 1 THEN …    -- shapely (only structure information used)
IF ft(xs) < 2 THEN …            -- content dependent (NOT shapely)
3.3. Nested types

For a nested data structure we have the problem that we have to know which parts of the nested data structure belong to the structure and which are content values. If the data elements themselves are also defined algebraically, we actually cannot differentiate between content part and structure in a nested type from the type definition alone. We could assume that only "basic", non-parameterized types like nat, real or bool contain data values and all structured types contain shape information. The other extreme is to treat only the outermost structure as shape and the rest as (complex) data elements. In practice, it is somewhere in between.

Consider for example the type seq[block[seq[pair[nat, seq[real]]]]], which describes the types of the sequence-block cover SeqBlock[pair[nat, seq[real]]] of Fig. 1, where a sequence of, e.g., measurements with different values at discrete time points is split into a sequence of blocks. Here pair[nat, seq[real]] is the type of the data elements, while the three outermost type levels contain the shape of the cover. For nested data types we therefore consider the type tree and count the structure level of a subexpression.
All levels below a certain threshold are considered as part of the structure, while the others are regarded as content. This threshold is set by selecting the structure level of the initial data structure; in Section 6.1 we describe how this level is chosen for data distribution algebras. Applications of non-recursive selectors go down towards the data values and thus increment the structure level, while constructors wrap the data in a certain structure and thus decrement the level. If there is more than one level of the structure part within the type tree, the structure level of an expression may even become negative, as we use 1 to denote the first level of content values.

Note that constructors only wrap those elements which are non-recursive parameters. For x :: xs the element x is encapsulated in the resulting sequence by one structure
level, while the structure level of the partial list xs is the same as that of the enlarged sequence x :: xs, as both sequences have the same type.

Thus we can distinguish conditions on the structure part, which preserve shapeliness, from conditionals on the content part, which introduce data dependencies. This is similar to the distinction of shape and data conditionals in FISh. However, we do not require the user to write such special conditionals but try to derive this information automatically. If there is a content dependency, we store the dependency level at which it occurred. This dependency level is incremented and decremented by selector and constructor applications, respectively, in a similar way to the structure level. From this dependency level we can derive whether the dependency affects the structure or only the content of the data. If, for instance, an expression of the type above has a dependency at level 2, that means that one element of the pair, i.e. the measure time : nat or the values : seq[real] or both, is content dependent. If in the current context the first three levels cover[subobj[seq[α]]] belong to the structure, the whole expression will still be considered shapely, as the shape of the structure is not affected by the dependency on the data level.³

3.4. Multiple dependencies

The structure and dependency level are sufficient to analyse shapeliness of the structure part of a data structure if only the size of the data structure, as in FISh, or the graph of (cons-)cells is considered as the relevant shape information. In our setting, however, the order of elements within the data structure also has to be considered, as shown in expression (4). This could be achieved by analysing the topology of the structure or the index of elements.
As this approach does not allow a static analysis for arbitrary data types, we are content with partial shape information and only differentiate between expressions which depend on just a single data value and those that depend on multiple values. A dependency on multiple values indicates a potential swapping of these element values and hence a change in the order of elements. However, the difference between multiple and single dependency only becomes apparent in the case where all those type levels of the structure which belong to the shape are data-independent, but the first data level is not. In this case the order of elements within the shape is important. If the elements are single dependent, then they depend only on themselves or some sub-components thereof, as in expression (2). If the elements are multiple dependent, then they may have changed their order, as in (4).

Consider for example the nested sequence xss = [[0, 1], [3, 2]], where only the outer sequence shall belong to the shape, i.e., [0, 1] and [3, 2] are data values. Then in

map([max, min])(xss) = [[1, 0], [3, 2]],
³ The formal shapeliness of this expression would be S_⊥^{−2,−3}; see Section 4.
the sublists [1, 0] and [3, 2] contain a multiple dependency, but as this only influences the data part, the shape of the result is not affected and hence shapely.⁴ If, however, the elements of the outer sequence were ordered such that the first element is the maximum w.r.t. the sum of its elements within the sublist, i.e. [[3, 2], [1, 0]], then the swapping would occur on the first data level and hence result in non-shapeliness.

The shapeliness information thus consists of:
• The structure level, sometimes also called selector level, which denotes where we are in a nested data structure. It counts, from a given start value, the number of non-recursive selector applications, as those select element values of a data structure. Constructors have the opposite effect and work as wrappers.
• The dependency level, which denotes the structure level in a nested data structure at which a content dependency has occurred. If this level is positive, the expression is content dependent and thus not shapely.
• A flag for single or multiple dependencies.

The analysis of recursive functions shows that this is not enough, because it would result in an analysis which is more restrictive than necessary. Note that the result of a recursive call may introduce a (single) dependency due to a data-dependent conditional within the function body. The fixpoint calculation would then take this dependency in the recursive branch and, together with the function's own dependency, pessimistically assume a multiple dependency. Consider for example the filter function:

FUN filter : (α → bool) → seq[α] → seq[α]
DEF filter(p)(s) ==
      IF empty?(s) THEN empty
      ELSE IF p(ft(s)) THEN ft(s) :: filter(p)(rt(s))
           ELSE filter(p)(rt(s))
           FI
      FI
This function contains a data-dependent conditional IF p(ft(s)), and therefore the result of the recursive call filter(p)(rt(s)) is also data-dependent. To prevent the undesirable behaviour that together this would lead to a multiple dependency, we use an additional content-dependency flag to check for a multiple dependency. Recursive calls clear this flag to ensure that only the dependencies within the function body itself are taken into account.
⁴ The first element of the result list only depends on some calculation f([x1, x2]) = [max, min]([x1, x2]) of the data element [x1, x2] = [0, 1]; the second element f([y1, y2]) only depends on [y1, y2] = [3, 2]. The resulting sequence [f([x1, x2]), f([y1, y2])] is therefore shapely, i.e. the (multiple) dependency on the data level (f([x1, x2]) = [max, min]([x1, x2]) = [1, 0]) has no impact.
4. First-order analysis

4.1. Formal shapeliness representation

Formally, we represent the shapeliness of an expression as a tuple (l, d, m, f), where
• l ∈ Z^∞_⊥ = {⊥} ∪ Z ∪ {∞} denotes the structure level. For l > 0 the current expression is considered as content, that is, an element of a surrounding structure.
• d ∈ Z^∞_⊥ is the dependency level. If d > 0, the expression is data dependent, and d denotes the number of type levels which have to be wrapped to get a shapely expression. If, for instance, an element x with dependency level 1 is wrapped in the list [x] = ::(x, empty), then this list has dependency level 0 and is hence shapely. (The list structure, one element within the list, is data independent.)
• m ∈ {I, S, M} denotes whether the expression depends on a single (S) or on multiple (M) values, or is independent of content (I).
• f ∈ {⊥, c} is the content-dependency flag, which describes whether the expression is a condition on a data value (c) or not (⊥). It is used to prevent the analysis of a recursive function from always yielding a multiple dependency, which would be too restrictive.

Thus the shapeliness of an expression is of type

Σ₀ = Z^∞_⊥ × Z^∞_⊥ × {I, S, M} × {⊥, c}.
Note that all component sets have domain structure if we define

⊥ < i < i + 1 < ∞  ∀i ∈ Z,   I < S < M   and   ⊥ < c.
To improve readability we will write m_f^{l,d} instead of (l, d, m, f) and omit components which are ⊥. So m_⊥^{l,d} will be abbreviated as m^{l,d}, and for d = ⊥ we will write m_f^l, or even m^l if also f = ⊥.

Note that only the dependency level d and the multiple-dependency component m are relevant for deciding whether an expression is shapely or not; the structure level and the content-dependency flag are only needed for the calculation. For m = I no dependency has occurred at all, so I^l means shapeliness regardless of the structure level l, although it does not make much sense to speak of shapeliness for content values (l > 0). S_f^{l,d} denotes shapeliness only if the dependency level d is not positive. In the case M_f^{l,d}, where a multiple dependency has occurred, we have to consider that on the corresponding dependency level d the order of the elements may have changed, so actually level d + 1 has to be considered as content dependent. This already implies non-shapeliness for d ≥ 0. A function is considered shapely if it preserves shapeliness in its arguments.
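The shapeliness domain and its orderings can be modelled concisely in Haskell. This is a hypothetical sketch of Σ₀, not code from the paper; the derived Ord instances yield exactly the orderings ⊥ < i < i + 1 < ∞, I < S < M and ⊥ < c.

```haskell
-- Z with bottom and infinity; derived Ord gives Bot < Lv i < Lv (i+1) < Inf.
data Lvl  = Bot | Lv Int | Inf deriving (Eq, Ord, Show)
data Mult = I | S | M          deriving (Eq, Ord, Show)  -- I < S < M
data Flag = NoC | C            deriving (Eq, Ord, Show)  -- NoC (⊥) < C

-- A shapeliness tuple (l, d, m, f).
data Shape = Shape { lvl :: Lvl, dep :: Lvl, mult :: Mult, flag :: Flag }
  deriving (Eq, Show)

-- Successor/predecessor on Z∞⊥, as used by selectors and constructors.
succL, predL :: Lvl -> Lvl
succL Bot = Bot; succL (Lv i) = Lv (i + 1); succL Inf = Inf
predL Bot = Bot; predL (Lv i) = Lv (i - 1); predL Inf = Inf

-- Component-wise join (the maximum w.r.t. the partial order on tuples).
join :: Shape -> Shape -> Shape
join (Shape l1 d1 m1 f1) (Shape l2 d2 m2 f2) =
  Shape (max l1 l2) (max d1 d2) (max m1 m2) (max f1 f2)

-- Shapely iff d is not positive; for M, already d >= 0 is fatal,
-- since the element order on level d may have changed.
shapely :: Shape -> Bool
shapely (Shape _ d M _) = d <  Lv 0
shapely (Shape _ d _ _) = d <= Lv 0
```

With this model, the four example expressions of Section 3 get the verdicts I^0 (shapely), S^{0,0} (shapely), S^{0,1} (non-shapely) and M^{0,0} (non-shapely).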
Fig. 2. First-order shapeliness rules.
4.2. Analysis rules

Fig. 2 shows the rules for the shapeliness analysis. It assumes that the program is well typed. We separate the set of functions into constructors (C), selectors (S), discriminators (D) and user-defined functions (F). Discriminators do not change shapeliness (Discr). The same holds for a recursive selector s : t → t like rt (Sel2). A non-recursive selector s : t1 → t2 with t1 ≠ t2, like ft, however, increments the structure and dependency level (Sel1):

succ : Σ₀ → Σ₀
m_f^{l,d} ↦ m_f^{succ(l),succ(d)},

where succ on Z^∞_⊥ is the successor function on the integers with the natural extensions succ(⊥) = ⊥ and succ(∞) = ∞. The predecessor function pred : Σ₀ → Σ₀ is defined analogously.

4.3. Constructors

Constructors are more complicated. Consider for example

:: : α × seq[α] → seq[α],
which appends an element to a sequence. Both the element and the sequence carry shapeliness information, so the shapeliness of the resulting sequence is the worst case, i.e. the maximum of the shapeliness σ_xs of the partial sequence xs and the decremented shapeliness pred(σ_x) of the wrapped element x (Constr):

{x : σ_x, xs : σ_xs} ⊢ (x :: xs) : pred(σ_x) ⊔ σ_xs.

The function ⊔ is defined component-wise

⊔ : Σ₀ × Σ₀ → Σ₀
(l1, d1, m1, f1) ⊔ (l2, d2, m2, f2) ↦ (max(l1, l2), max(d1, d2), max(m1, m2), max(f1, f2))

as the maximum according to the partial order ⊑ defined as

(l1, d1, m1, f1) ⊑ (l2, d2, m2, f2)  iff  l1 ≤ l2 ∧ d1 ≤ d2 ∧ m1 ≤ m2 ∧ f1 ≤ f2,

where ⊥ < i < i + 1 < ∞, I < S < M and ⊥ < c. For a general constructor c : t1 × ··· × tn → t we have to consider data-element components (ti ≠ t) and substructures (ti = t) in the case of recursive types, which yields

c(e1, …, en) : ⨆_{ti ≠ t} pred(σ_i) ⊔ ⨆_{ti = t} σ_i.

Constants are constructors with no arguments, so (Constr) yields as the special case
c : I^⊥ = (⊥, ⊥, I, ⊥).

They are thus neutral elements w.r.t. ⊔.

4.4. Conditionals

Up to now we have only changed the structure levels. Content dependencies can occur in conditionals. There we have to check whether the condition depends on a data value. It does not if the expression e1 with shapeliness σ1 = (m1)_{f1}^{l1,d1} is neither a content value, i.e. its structure level is not positive, nor content dependent itself, i.e. its dependency level is not positive. This is formally expressed by l1 ≤ 0 and d1 ≤ 0, which is equivalent to σ1 ⊑ M_c^{0,0} = (0, 0, M, c), as c and M are the maxima in the corresponding components and hence do not affect the comparison. If the conditional is not content dependent, we return the worst-case shapeliness of both branches and clear the content-dependency flag:

clear : Σ₀ → Σ₀
m_f^{l,d} ↦ m_⊥^{l,d}
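As a reading aid, the domain Σ0 and the operations succ, pred, clear, ⊔ and ⊑ defined above can be rendered as a small executable sketch. This is our own Python modelling, not part of the paper: the names Shape, lle, lub etc. are invented, and levels are modelled as integers extended with the markers "bot" and "inf".

```python
from dataclasses import dataclass

BOT, INF = "bot", "inf"           # bottom and infinity of Z_bot^inf
MULT = {"I": 0, "S": 1, "M": 2}   # multiplicity order: I < S < M

def lle(a, b):
    """Order on levels: bot < ... < -1 < 0 < 1 < ... < inf."""
    return a == BOT or b == INF or (a != INF and b != BOT and a <= b)

def lmax(a, b):
    return b if lle(a, b) else a

def lsucc(a):
    return a if a in (BOT, INF) else a + 1

def lpred(a):
    return a if a in (BOT, INF) else a - 1

@dataclass(frozen=True)
class Shape:
    l: object   # structure (selector) level
    d: object   # dependency level
    m: str      # multiplicity: "I", "S" or "M"
    f: bool     # content-dependency flag (False = bot, True = c)

def le(s, t):                       # the partial order on Sigma_0
    return (lle(s.l, t.l) and lle(s.d, t.d)
            and MULT[s.m] <= MULT[t.m] and s.f <= t.f)

def lub(s, t):                      # component-wise maximum (join)
    return Shape(lmax(s.l, t.l), lmax(s.d, t.d),
                 max(s.m, t.m, key=MULT.get), s.f or t.f)

def succ(s):                        # non-recursive selection: one level up
    return Shape(lsucc(s.l), lsucc(s.d), s.m, s.f)

def pred(s):                        # constructor wrapping: one level down
    return Shape(lpred(s.l), lpred(s.d), s.m, s.f)

def clear(s):                       # reset the content-dependency flag
    return Shape(s.l, s.d, s.m, False)
```

For instance, succ(Shape(0, BOT, "I", False)) yields Shape(1, BOT, "I", False), matching succ(I^0) = I^1.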
Note that we might lose shapeliness information if both branches of a conditional are constants, as for example in

    FUN not : bool → bool
    DEF not(b) == IF true?(b) THEN false ELSE true FI
For that reason we also take the shapeliness σ1 of the condition e1 into account for the result (Cond1):

    ⊢ IF e1 THEN e2 ELSE e3 FI : clear(σ1 ⊔ σ2 ⊔ σ3).

If the condition depends on a content value or a content-dependent expression, that is σ1 ⋢ M_c^{0,0}, then we have either a multiple dependency or a dependency on a single value. A multiple dependency occurs if the expression in at least one branch is content dependent as well, which means its dependency flag is set: I_c^⊥ ⊑ σ2 ⊔ σ3. In that case the dependency level is set to at least 1 and the multiple-dependency as well as the content-dependency flag are set (Cond3):

    ⊢ IF e1 THEN e2 ELSE e3 FI : M_c^{max(l1,l2,l3), max(d1,d2,d3,1)} = σ1 ⊔ σ2 ⊔ σ3 ⊔ M_c^{⊥,1}.
Otherwise we only have a dependency on a single value (Cond2):

    ⊢ IF e1 THEN e2 ELSE e3 FI : σ1 ⊔ σ2 ⊔ σ3 ⊔ S_c^{⊥,1}.
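The case analysis behind (Cond1)–(Cond3) can be combined into one decision procedure. The following sketch is again our own Python modelling (the names cond, M_C_00, I_C_BOT and the Shape helpers are invented), not the paper's implementation:

```python
from dataclasses import dataclass

BOT, INF = "bot", "inf"
MULT = {"I": 0, "S": 1, "M": 2}

@dataclass(frozen=True)
class Shape:
    l: object; d: object; m: str; f: bool

def lle(a, b):
    return a == BOT or b == INF or (a != INF and b != BOT and a <= b)

def lmax(a, b):
    return b if lle(a, b) else a

def le(s, t):
    return (lle(s.l, t.l) and lle(s.d, t.d)
            and MULT[s.m] <= MULT[t.m] and s.f <= t.f)

def lub(s, t):
    return Shape(lmax(s.l, t.l), lmax(s.d, t.d),
                 max(s.m, t.m, key=MULT.get), s.f or t.f)

def clear(s):
    return Shape(s.l, s.d, s.m, False)

M_C_00 = Shape(0, 0, "M", True)       # M_c^{0,0}
I_C_BOT = Shape(BOT, BOT, "I", True)  # I_c^{bot}: smallest flagged element

def cond(s1, s2, s3):
    """Shapeliness of IF e1 THEN e2 ELSE e3 FI from the subterm shapes."""
    branches = lub(s2, s3)
    if le(s1, M_C_00):                    # condition not content dependent
        return clear(lub(s1, branches))   # (Cond1)
    if le(I_C_BOT, branches):             # a branch is flagged as well
        return lub(lub(s1, branches), Shape(BOT, 1, "M", True))  # (Cond3)
    return lub(lub(s1, branches), Shape(BOT, 1, "S", True))      # (Cond2)
```

A shapely condition such as I^0 triggers (Cond1) and clears the flag; a condition with positive structure level triggers (Cond2) or, if a branch is already flagged, (Cond3).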
4.5. Other rules

Variables have the shapeliness stated in the environment (Var), where the disjoint union Γ ⊎ {x : σ} shall mean that the variable x is unique within the environment. For LET x == e1 IN e2 we append the shapeliness σ1 of e1 to the environment (Let). Γ ⊎ {x : σ1} again means uniqueness of x in the environment, so we have to perform α-conversion if necessary. For the application of user functions the shapeliness of the parameter is added to the environment for the derivation of the body (Appl).

4.6. Examples

Now we can give the shape of our example expressions from the last section. Under the assumption of a shapely sequence xs : I^0 (that is (0,⊥,I,⊥)), we get
1. {xs : I^0} ⊢ map(succ)(xs) : I^0 = (0,⊥,I,⊥): the shapeliness does not change and the result is therefore shapely as well.
2. {xs : I^0} ⊢ map(λx. IF x < 2 THEN 2 ELSE x FI)(xs) : S^{0,0} = (0,0,S,⊥): the result list itself is shapely, but the elements are content dependent, as ft(map(λx. IF ... FI)(xs)) : S^{1,1}.
3. {xs : I^0} ⊢ filter(even?)(xs) : S^{0,1} = (0,1,S,⊥): not shapely, as the dependency level is greater than zero.
4. {xs : I^0} ⊢ LET x == ... IN [max(x,y), min(x,y)] : M^{0,0} = (0,0,M,⊥): the data elements depend on multiple content values. As the element order may thus have changed, d = 0 already implies non-shapeliness.

4.7. Explicit shape selectors

The analysis described so far is more restrictive than necessary. If structure information is encoded as a data value instead of as a type variant, i.e. as part of the structure, shapeliness analysis yields content-dependent and thus non-shapely expressions. An example is the union type α ⊎ β. If it is defined as

    TYPE
    union == left(val : α)
             right(val : β)

the corresponding variant is part of the structure. However, if we define this as

    TYPE variant == left
                    right
    TYPE union == pair(discr : variant, val : α)

the variant is considered as a data element. Thus both versions would yield different results in the shapeliness analysis. The reason is that every non-recursive component is considered as data part and hence increments the selector level, while the selection of a recursive component or a discriminator test leaves the level unchanged. This also leads to the problem that for a pair either both components are data elements or both belong to the structure part.
To overcome these problems we introduce an explicit notation SHAPE for components which shall belong to the shape:

    TYPE union == pair(SHAPE discr : variant, val : α)
If such a component is selected, the structure level remains unchanged, as is the case for recursive components. The same holds for constructor applications. The analysis rules in Fig. 2 thus have to be changed as follows, where SHAPE?(ti) = true if the subtype ti has the SHAPE-notation.
    c : t1 × ··· × tn → t ∈ C,   Γ ⊢ ei : σi
    ──────────────────────────────────────────────────────────────────────── (Constr')
    Γ ⊢ c(e1,...,en) : ⊔_{ti ≠ t ∧ ¬SHAPE?(ti)} pred(σi) ⊔ ⊔_{ti = t ∨ SHAPE?(ti)} σi

    s : t1 → t2 ∈ S,  t1 ≠ t2,  ¬SHAPE?(t2),  Γ ⊢ e : σ
    ──────────────────────────────────────────────────── (Sel1')
    Γ ⊢ s(e) : succ(σ)

    s : t1 → t2 ∈ S,  t1 = t2 ∨ SHAPE?(t2),  Γ ⊢ e : σ
    ─────────────────────────────────────────────────── (Sel2')
    Γ ⊢ s(e) : σ
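The effect of the SHAPE-marker on the modified rules can be sketched as follows. The SHAPE? predicate is modelled as a boolean shape_marked argument, and all names here are our own illustration, not the paper's implementation:

```python
from dataclasses import dataclass

BOT, INF = "bot", "inf"
MULT = {"I": 0, "S": 1, "M": 2}

@dataclass(frozen=True)
class Shape:
    l: object; d: object; m: str; f: bool

def lle(a, b):
    return a == BOT or b == INF or (a != INF and b != BOT and a <= b)

def lmax(a, b):
    return b if lle(a, b) else a

def lshift(a, by):
    return a if a in (BOT, INF) else a + by

def lub(s, t):
    return Shape(lmax(s.l, t.l), lmax(s.d, t.d),
                 max(s.m, t.m, key=MULT.get), s.f or t.f)

def succ(s): return Shape(lshift(s.l, 1), lshift(s.d, 1), s.m, s.f)
def pred(s): return Shape(lshift(s.l, -1), lshift(s.d, -1), s.m, s.f)

def analyse_selector(src_t, dst_t, shape_marked, s):
    # (Sel2'): recursive selector or SHAPE component -> unchanged
    if src_t == dst_t or shape_marked:
        return s
    return succ(s)          # (Sel1'): plain data component

def analyse_constructor(result_t, comps, shapes):
    """comps: list of (component_type, shape_marked); shapes: their shapeliness."""
    acc = Shape(BOT, BOT, "I", False)   # neutral element w.r.t. lub
    for (t, marked), s in zip(comps, shapes):
        # (Constr'): SHAPE and recursive components keep their level,
        # data components are wrapped and hence decremented
        acc = lub(acc, s if (t == result_t or marked) else pred(s))
    return acc
```

With this, constructing a tree node whose children component is SHAPE-marked keeps a shapely I^0 argument at level 0 instead of pushing it below the structure part.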
Now we can define types where one component belongs to the structure and contains shape information like a discriminator, while the other components contain ordinary data values. An application is a tree type where every node may have a different number of children:

    TYPE tree == node(val : α, SHAPE children : seq[tree])
Here val denotes the data value within a node, while children contains the structure of the tree. If the children were considered as data values (children is not a directly recursive selector of type tree → tree), the fixpoint would yield ∞ and thus always indicate non-shapeliness. If we search the transitive closure of selectors for recursive subcomponents, this SHAPE-notation can be derived automatically for types with mutually recursive components.

5. Higher-order analysis

The first-order rules described in the last section enable a static analysis of the program. Problems occur if we allow higher-order functions, that is, functions passed as parameters or returned as results. These would have to be resolved in a similar way to the standard closure technique, where we have to suspend the analysis until all (data) parameters are available and have to construct a single expression containing all function definitions which are used. As a result, we could only analyse the shapeliness of a
higher-order function if we knew the definitions of all its parameter functions, which prohibits separate compilation of independent structures. If we instead allow shapeliness functions we can simplify this analysis. So we extend the shapeliness type to the domain

    Σ ::= Σ0 | Σ × Σ | Σ → Σ.

The functions succ, pred, clear, ⊔ and ⊑ are extended component-wise. The expressions of the functional kernel language are defined as follows:

    E ::= x                           variable
        | c e1 ... en                 constructor
        | s e                         selector
        | d e                         discriminator
        | e1 e2                       application
        | LET x == e1 IN e2           LET
        | IF e1 THEN e2 ELSE e3 FI    conditional
        | λx. e                       abstraction
        | rec f. e                    recursion
        | (e1, e2)                    tupling
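The kernel-language grammar can be transliterated into one algebraic data type. A Python sketch of our own (class and field names are invented for illustration):

```python
from dataclasses import dataclass

# One class per production of E.
@dataclass
class Var:    name: str
@dataclass
class Constr: name: str; args: tuple
@dataclass
class Sel:    name: str; arg: object
@dataclass
class Discr:  name: str; arg: object
@dataclass
class App:    fun: object; arg: object
@dataclass
class Let:    var: str; bound: object; body: object
@dataclass
class If:     cond: object; then: object; els: object
@dataclass
class Lam:    var: str; body: object
@dataclass
class Rec:    fun: str; body: object
@dataclass
class Tup:    fst: object; snd: object

# length == rec length. \xs. IF empty?(xs) THEN 0 ELSE succ(length(rt(xs))) FI
length = Rec("length", Lam("xs",
    If(Discr("empty?", Var("xs")),
       Constr("0", ()),
       App(Var("succ"), App(Var("length"), Sel("rt", Var("xs")))))))
```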
To improve readability we will use some syntactic sugar like brackets, multiple bindings in a LET-definition, etc.

5.1. Analysis rules

The rules for the higher-order shapeliness analysis are shown in Fig. 3. They are in most cases similar to the first-order case (Fig. 2). For instance, discriminators do not change the shapeliness in the first-order case, so they are the identity function here (Discr). The shapeliness of a selector is either the identity or the successor function, depending on whether it is a recursive selector (Sel2) or not (Sel1). The three cases of the conditional are encoded by the special functions less and dep, so that we only need one rule for the conditional (Cond). The function less compares two shapeliness values, while the function dep is called when a content dependency has been detected and we have to distinguish between single and multiple dependencies. This is done according to the content-dependency flags of the expressions in both branches of the conditional:

    less : Σ0 × Σ0 × Σ × Σ → Σ
    less(σ0, σ1, σ2, σ3) = σ2   if σ0 ⊑ σ1
                           σ3   otherwise

    dep : Σ → Σ
Fig. 3. Higher-order shapeliness rules.
    dep(σ) = σ ⊔ S_c^{⊥,1}                       if σ = m_⊥^{l,d}
           = M_c^{l,max(d,1)} = σ ⊔ M_c^{⊥,1}    if σ = m_c^{l,d}
           = (dep(σ1), dep(σ2))                  if σ = (σ1, σ2)
           = λσ1. dep(σ2)                        if σ = λσ1. σ2
For σ1 ⊑ M_c^{0,0} we get from the definition of less σ_IF···FI = clear(σ1 ⊔ σ2 ⊔ σ3) as in (Cond1), while otherwise σ_IF···FI = σ1 ⊔ dep(σ2 ⊔ σ3). The first two cases in the definition of dep encode the single dependency (σ = m_⊥^{l,d} means σ has no content-dependency flag) and the multiple dependency, similar to (Cond2) and (Cond3). The other cases are the extension to the higher-order domain. Note that the maximum is only defined if both arguments, respectively their corresponding expressions, have the same type, which is the case for the two branches of a conditional:

    σ ⊔ σ' ∈ Σ0                         if σ, σ' ∈ Σ0
    σ ⊔ σ' = (σ1 ⊔ σ1', σ2 ⊔ σ2')       if σ = (σ1, σ2), σ' = (σ1', σ2'), type(σ) = type(σ')
    σ ⊔ σ' = λσ1. (σ2 ⊔ σ2'[σ1/σ1'])    if σ = λσ1. σ2, σ' = λσ1'. σ2', type(σ) = type(σ')

The shapeliness σ1 of the condition e1 : bool is implicitly considered as a constant function or tuple to meet this type requirement, i.e., the maximum function is extended to ⊔ : Σ0 × Σ → Σ.
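The functions less and dep, including the lifting of dep to tuples and shapeliness functions, can be sketched as follows. This is our own modelling: Python tuples stand for Σ × Σ and Python functions for Σ → Σ, and all names are invented:

```python
from dataclasses import dataclass

BOT, INF = "bot", "inf"
MULT = {"I": 0, "S": 1, "M": 2}

@dataclass(frozen=True)
class Shape:
    l: object; d: object; m: str; f: bool

def lle(a, b):
    return a == BOT or b == INF or (a != INF and b != BOT and a <= b)

def lmax(a, b):
    return b if lle(a, b) else a

def le(s, t):
    return (lle(s.l, t.l) and lle(s.d, t.d)
            and MULT[s.m] <= MULT[t.m] and s.f <= t.f)

def lub(s, t):
    return Shape(lmax(s.l, t.l), lmax(s.d, t.d),
                 max(s.m, t.m, key=MULT.get), s.f or t.f)

def dep(s):
    """Record a detected content dependency, lifted over the domain Sigma."""
    if isinstance(s, tuple):                  # dep((s1, s2)) = (dep(s1), dep(s2))
        return tuple(dep(x) for x in s)
    if callable(s):                           # dep(\s1. s2) = \s1. dep(s2)
        return lambda a: dep(s(a))
    if s.f:                                   # flag already set: multiple dependency
        return lub(s, Shape(BOT, 1, "M", True))
    return lub(s, Shape(BOT, 1, "S", True))   # single dependency

def less(s0, s1, s2, s3):
    return s2 if le(s0, s1) else s3
```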
The other rules for variables, lambda abstraction and function application are straightforward.

5.2. Fixpoint calculation

Recursive functions lead to fixpoints. Their existence is ensured as all shape functions are continuous.

Proposition 1. succ, pred, clear, dep and λ(x,y,z). less(x, M_c^{0,0}, y, z) are continuous functions.

Proof. As tupling and function-lifting preserve continuity for element-wise defined functions [32], we only have to show continuity on Σ0.
• As succ is defined component-wise and {⊥, c} and {I, S, M} are finite sets, it is sufficient to prove that succ on Z⊥^∞ is continuous. This, however, is clear as Z⊥^∞ is totally ordered.
• pred: analogously to succ.
• clear = Id_{Z⊥^∞} × Id_{Z⊥^∞} × Id_{{I,S,M}} × (f ↦ ⊥) is continuous, because it is defined component-wise from identity and constant functions.
• dep: Let σi be an ascending chain. If I_c^⊥ ⊑ σK, i.e. σK = m_c^{lK,dK}, then I_c^⊥ ⊑ σi for all i ≥ K, and therefore I_c^⊥ ⊑ ⊔σi as well as dep(σi) = σi ⊔ M_c^{⊥,1} for all i ≥ K. Thus ⊔ dep(σi) = ⊔σi ⊔ M_c^{⊥,1} ⊑ dep(⊔σi). Since dep(σ) ⊑ σ ⊔ M_c^{⊥,1} for all σ according to the definition of dep, this implies ⊔ dep(σi) = dep(⊔σi).
• λ(x,y,z). less(x, M_c^{0,0}, y, z): Let σi = (σi1, M_c^{0,0}, σi3, σi4) with σi1 ∈ Σ0 and σi3, σi4 ∈ Σ be an ascending chain.
Case 1: There exists K1 such that σK1,1 ⋢ M_c^{0,0}. As σK1,1 = m_{fK1}^{lK1,dK1}, this implies either lK1 > 0 or dK1 > 0, i.e. either the selector or the dependency level of σK1,1 is positive. Because σi1 is ascending, this implies either li ≥ lK1 > 0 or di ≥ dK1 > 0, and thus σi1 ⋢ M_c^{0,0}, for all i ≥ K1. This means ⊔σi1 ⋢ M_c^{0,0} and less(σi1, M_c^{0,0}, σi3, σi4) = σi4 for all i ≥ K1. Therefore less(⊔σi1, M_c^{0,0}, ⊔σi3, ⊔σi4) = ⊔σi4 = ⊔ less(σi1, M_c^{0,0}, σi3, σi4).
Case 2: Otherwise, if σK1 ⊑ M_c^{0,0} for all K ∈ N, then less(σi1, M_c^{0,0}, σi3, σi4) = σi3 for all i, as well as ⊔σi1 ⊑ M_c^{0,0}. This implies less(⊔σi1, M_c^{0,0}, ⊔σi3, ⊔σi4) = ⊔σi3 = ⊔ less(σi1, M_c^{0,0}, σi3, σi4).
Note that less with arbitrary arguments is not continuous! Consider for example the chain σi = (I^{i+1}, I^i, I^0, I^1), i ∈ N. Here ⊔ less(σi) = ⊔ I^1 = I^1, but less(⊔σi) = less(I^∞, I^∞, I^0, I^1) = I^0. But as only less(·, M_c^{0,0}, ·, ·) is used in the rules in Fig. 3, this does not matter. Therefore the fixpoint always exists. The only problem is that its value may be ∞ and thus its calculation be infinite. For that reason we stop the shapeliness analysis if
selector or dependency level reaches a certain upper bound, and consider the corresponding function as non-shapely if the analysis has not terminated by then. Continuity ensures that the fixpoint is either finite or the calculation will yield infinity and hence reach such a finite upper bound. For finite fixpoints the values of selector and dependency level are bounded by the number of non-recursive selector applications, which can thus be derived from the program text. This upper bound can be used for the termination check, as a greater value indicates ∞ as the fixpoint. The same holds for the number of wrapper levels created by constructors, which yields a lower bound. For an important subclass of recursive functions, where every recursive call uses the same shapeliness as the function argument, this is not necessary, as the fixpoint in this case is finite.

5.3. Example: Functions on sequences

Example 2. Consider the function
    FUN length : seq[α] → nat
    DEF length == rec length. λxs. IF empty?(xs) THEN 0 ELSE succ(length(rt(xs))) FI
which calculates the length of a sequence. Its shapeliness can be derived as follows:
    empty? : id,  xs : σs  ⟹  {xs : σs} ⊢ empty?(xs) : σs
    xs : σs,  rt : id  ⟹  {xs : σs} ⊢ rt(xs) : σs
    length : σlength  ⟹  {xs : σs} ⊢ length(rt(xs)) : σlength(σs)
    succ : id  ⟹  {xs : σs} ⊢ succ(length(rt(xs))) : σlength(σs)
    0 : I^⊥

    {xs : σs} ⊢ IF empty?(xs) THEN 0 ELSE succ(length(rt(xs))) FI :
        less(σs, M_c^{0,0}, clear(σs ⊔ I^⊥ ⊔ σlength(σs)), σs ⊔ dep(I^⊥ ⊔ σlength(σs)))

    ⊢ λxs. IF ... FI : λσs. less(σs, M_c^{0,0}, clear(σs ⊔ σlength(σs)), σs ⊔ dep(σlength(σs)))

    ⊢ length : σlength = λσs. clear(less(σs, M_c^{0,0}, clear(σs ⊔ σlength(σs)), σs ⊔ dep(σlength(σs))))
which leads to the fixpoint iteration

    Φ^0(⊥) = ⊥
    Φ^1(⊥) = λσs. clear(less(σs, M_c^{0,0}, clear(σs ⊔ Φ^0(⊥)(σs)), σs ⊔ dep(Φ^0(⊥)(σs))))
           = λσs. clear(less(σs, M_c^{0,0}, clear(σs), σs ⊔ S_c^{⊥,1}))
             (since Φ^0(⊥)(σs) = ⊥ and dep(⊥) = S_c^{⊥,1})
    Φ^2(⊥) = Φ(Φ(⊥))
Fig. 4. Shapeliness of functions on sequences.
           = λσs. clear(less(σs, M_c^{0,0}, clear(σs ⊔ Φ^1(⊥)(σs)), σs ⊔ dep(Φ^1(⊥)(σs))))
           = λσs. clear(less(σs, M_c^{0,0}, clear(σs) ⊔ clear(σs), σs ⊔ (σs ⊔ S_c^{⊥,1})))
           = Φ^1(⊥).

Thus the fixpoint is

    σlength = λσs. clear(less(σs, M_c^{0,0}, clear(σs), σs ⊔ S_c^{⊥,1}))
            = λσs. less(σs, M_c^{0,0}, clear(clear(σs)), clear(σs ⊔ S_c^{⊥,1}))
            = λσs. less(σs, M_c^{0,0}, clear(σs), clear(σs) ⊔ S^{⊥,1}),

and length is a shapely function because {xs : I^0} ⊢ length(xs) : I^0. Analogously we can derive the shapeliness of other functions on sequences as well as on trees, matrices and other data structures. Fig. 4 shows as examples the shapeliness functions for the concatenation function ++, for drop, which removes the first n elements of a sequence, for take, which keeps the first n elements and removes the rest, and for map, filter and reduce. Their corresponding function definitions can be found in [22].
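Both the derived fixpoint σlength and the bounded fixpoint search of Section 5.2 can be checked against the Σ0 sketch. The helper names are again our own, and bounded_fixpoint only illustrates the cut-off idea, not the paper's concrete algorithm:

```python
from dataclasses import dataclass

BOT, INF = "bot", "inf"
MULT = {"I": 0, "S": 1, "M": 2}

@dataclass(frozen=True)
class Shape:
    l: object; d: object; m: str; f: bool

def lle(a, b):
    return a == BOT or b == INF or (a != INF and b != BOT and a <= b)

def lmax(a, b):
    return b if lle(a, b) else a

def le(s, t):
    return (lle(s.l, t.l) and lle(s.d, t.d)
            and MULT[s.m] <= MULT[t.m] and s.f <= t.f)

def lub(s, t):
    return Shape(lmax(s.l, t.l), lmax(s.d, t.d),
                 max(s.m, t.m, key=MULT.get), s.f or t.f)

def clear(s): return Shape(s.l, s.d, s.m, False)

def succ(s):
    up = lambda a: a if a in (BOT, INF) else a + 1
    return Shape(up(s.l), up(s.d), s.m, s.f)

def less(s0, s1, s2, s3):
    return s2 if le(s0, s1) else s3

M_C_00 = Shape(0, 0, "M", True)

def sigma_length(s):
    # the fixpoint derived above:
    # \s. less(s, M_c^{0,0}, clear(s), clear(s) |_| S^{bot,1})
    return less(s, M_C_00, clear(s), lub(clear(s), Shape(BOT, 1, "S", False)))

def bounded_fixpoint(phi, start, bound, max_iter=100):
    """Iterate phi from start; give up once a level exceeds the bound."""
    s = start
    for _ in range(max_iter):
        nxt = phi(s)
        if nxt == s:
            return s                                  # finite fixpoint
        if not (lle(nxt.l, bound) and lle(nxt.d, bound)):
            return None                               # diverging: assume infinity
        s = nxt
    return None
```

Applying sigma_length to a shapely argument I^0 returns I^0 again, while the diverging functional succ is cut off at the chosen bound.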
6. Example: Sequence block cover

Fig. 5 shows the definition of a general sequence block cover, which splits a list into p blocks of sublists with l (resp. r) elements overlapping to the left (resp. right). Although the code should be self-descriptive, we give some explanatory remarks. α, p, l and r are parameters of the cover (lines 1–5). While objects are lists over data elements (7), subobjects are partial lists together with overlapping lists to the left and right (9–11). Gluing just selects the own parts of each subobject and concatenates the resulting lists (13). Splitting the list into sublists of size length(O)/p is more complicated. Here a function split' is called which gets the overlapping left elements and the size of each subobject as additional parameters (17). It calculates the current own part (20) and the overlaps to the right (22) and to the left (23). In lines 25–26 we create one subobject block and recursively split the rest of the list.
Fig. 5. Definition of a sequence block cover.
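Since Fig. 5 is not reproduced here, the following value-level Python sketch only illustrates the intended behaviour of split and glue, under the simplifying assumption that p divides the list length; it is our own illustration, not the OPAL code of the figure:

```python
def split(xs, p, l, r):
    """Cut xs into p blocks of (left overlap, own part, right overlap)."""
    assert len(xs) % p == 0           # simplifying assumption: p divides the length
    n = len(xs) // p
    return [(xs[max(0, i * n - l):i * n],          # FOREIGN overlap to the left
             xs[i * n:(i + 1) * n],                # OWN part
             xs[(i + 1) * n:(i + 1) * n + r])      # FOREIGN overlap to the right
            for i in range(p)]

def glue(blocks):
    # keep only the own parts and concatenate them
    return [x for (_, own, _) in blocks for x in own]
```

Here glue(split(xs, p, l, r)) == xs for any element values: the result structure depends only on the length of xs, which is the value-level counterpart of the shapeliness property established below.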
If we start with a shapely object O : I^0, then we can derive

    {O : I^0} ⊢ split(O) : I^{-2}.

Thus the split operation is shapely. However, its structure level decreases by 2. This is due to the fact that split actually adds two wrappers to the data: one is the cover type itself, the other is the subobject type, which adds an additional level as it encapsulates the sublists. Glue does just the opposite and removes these two levels:

    {C : I^{-2}} ⊢ glue(C) : I^0.

If we look at the code, we can see that glue first selects the own part of all subobjects (map) and then flattens the list of sublists to a single list (reduce). Note that we have to consider that cover[subobj[seq[α]]] is the scope of the structure part, which can be done through the assumption C : I^{-2}. Otherwise shapeliness analysis would pessimistically indicate that glue was not shapely: {C : I^0} ⊢ glue(C) : S^{2,1}. Because split and glue are both shapely, the sequence block cover can be optimized in a dynamic execution.⁵ This result can easily be extended to column-block covers if we represent matrices as nested sequences of sequences.

6.1. Choosing the selector level

The example above shows the necessity of choosing the proper start value for the selector level, as this defines the border between structure and data part in a nested data structure. In the case of a cover this information can be derived from the types for object, subobject and cover. Split considers the object as the relevant structure and its parameters as data values. So for TYPE obj[α] == seq[α] we start with I^0, because the outermost sequence is the only part of the structure here. Glue considers the cover and the subobject as structure and only the parameters of the subobjects as data values. In the case of the sequence block cover this means that the outermost sequence (cover[·]), the block wrapper and the left, inner and right sequences belong to the structure part.
These are three structure levels above the data part, so we have to start the analysis with I^{-2}, as we have done above. Using the explicit SHAPE-notation for certain data type components, the derivation of the number of structure levels is not even necessary. We can just implicitly mark all selectors within the types obj, subobj and cover as belonging to the structure unless they select some parameter values (in the case of obj or subobj).
⁵ In general, it is sufficient to show that split ◦ glue is shapely. If, however, split and glue are shapely themselves, the same holds for their composition, as function composition preserves shapeliness.
For the sequence block cover this looks like

    TYPE obj[α]    == seq[α]
    TYPE cover[α]  == seq[SHAPE α]
    TYPE subobj[α] == block(SHAPE FOREIGN left  : seq[α],
                            SHAPE OWN     inner : seq[α],
                            SHAPE FOREIGN right : seq[α])
In this case we can always start with I^0, as the selector level remains unchanged for SHAPE-components. This would produce

    {O : I^0} ⊢ split(O) : I^0
    {C : I^0} ⊢ glue(C) : I^0.

7. Related work and conclusion

Parallel programming with skeletons [4, 5, 10, 11] has been an active field of research in the last few years. Our approach is two-fold. First, we use predefined skeletons as optimized parallel functions. Second, we allow the programmer to define his own skeletons in terms of covers (data distributions) and automatically generate the necessary communication. That differs from systems like HOPP [26, 27] or P3L [1, 9, 24], where the set of skeletons is fixed within the compiler, which then optimizes the program according to a cost calculus, and from Skil [2, 3], which allows only flat skeletons to be defined. Possible inefficiencies of an automatic transformation can be avoided if predefined and hence optimized skeletons are used. In [23] we have described a dynamic execution of covers as well as optimizations which, for the class of iterative algorithms, allow the generation of code that nearly reaches the efficiency of an optimized static data distribution. Shapeliness of the cover ensures the applicability of such optimizations. Thus we are mainly interested in the information whether the structure of an expression depends on its data elements or only on the structure, i.e. the shape. That differs from the approach taken by Jay et al. in the Sydney Shape project [17–19], where the concrete shape is calculated. Unlike the FISh language [21] and its predecessor VEC [20], which uses a special shape conditional ifs and allows only shapely programs, we do
not restrict the set of possible programs but want to find out which ones are shapely. As this is in general undecidable, we can only get a subset of the shapely functions and expressions. For the analysis of higher-order functions we propose a higher-order analysis where shapeliness functions are possible. A related approach to parallel programming is the Bird–Meertens Formalism (BMF) [14, 16, 28, 29]. Here a program is transformed from an abstract specification to an efficient parallel program, where the parallelization is done via homomorphisms⁶ defined over the data structures, i.e. the structure of algebraic or categorical data types. To analyse real program libraries we have to consider handcoded functions in our calculus. They are not defined functionally but in another language like C or Java, and are just called via a hand-coding interface. Therefore we cannot analyse their definitions and have to make a worst-case assumption about their shapeliness. Only for unstructured types like real or natural numbers can we weaken this assumption, as no data selections can occur in such a type. For other handcoded types like arrays the library programmer has to provide the shapeliness of a handcoded function. As future work we intend to extend this analysis to a topological shape analysis, where a neighbourhood relation is derived from the data-type definition. The algebraic data type implicitly defines a traversal of a data structure, from which we can derive index information. This would allow us to weaken the worst-case assumption of multiple-dependent values, as changes of the element order are directly visible. Another application is the generation of explicit communication functions.

Acknowledgements

The work is being supported by a scholarship from the German Research Foundation (DFG). I would like to thank Peter Pepper and Barry Jay for numerous remarks on an earlier version of this paper, as well as the anonymous referees for their valuable comments.

References

[1] B. Bacci, M.
Danelutto, S. Orlando, S. Pelagatti, M. Vanneschi, P3L: a structured high-level parallel language and its structured support, Tech. Rep. HPL-PSC-93-55, Hewlett-Packard Laboratories, Pisa Science Center, 1993.
[2] G.H. Botorog, High-level parallel programming and the efficient implementation of numerical algorithms, Ph.D. Thesis, Mathematisch-Naturwissenschaftliche Fakultät der Rheinisch-Westfälischen Technischen Hochschule Aachen, January 1998.
[3] G.H. Botorog, H. Kuchen, Efficient high-level parallel programming, Theoret. Comput. Sci., Special Issue on Parallel Computing (1998).
[4] D.K.G. Campbell, Towards the classification of algorithmic skeletons, Tech. Rep. YCS 276, Department of Computer Science, University of York, 1996.
[5] M. Cole, Algorithmic skeletons: structured management of parallel computation, Research Monographs in Parallel and Distributed Computing, MIT Press, Cambridge, MA, 1989.
⁶ Or, more generally, near-homomorphisms [6] or mutumorphisms [16].
[6] M. Cole, Parallel programming with list homomorphisms, Parallel Process. Lett. 5 (2) (1995).
[7] M. Cole, On dividing and conquering independently, in: C. Lengauer, M. Griebl, S. Gorlatch (Eds.), Proc. 3rd Int. Euro-Par Conf. (EuroPar'97), Passau, Lecture Notes in Computer Science, Vol. 1300, Springer, Berlin, August 1997, pp. 634–637.
[8] C* Language Reference Manual, Thinking Machines Corporation, 1991.
[9] M. Danelutto, F. Pasqualetti, S. Pelagatti, Skeletons for data parallelism in P3L, in: C. Lengauer, M. Griebl, S. Gorlatch (Eds.), Proc. 3rd Int. Euro-Par Conf. (EuroPar'97), Passau, Lecture Notes in Computer Science, Vol. 1300, Springer, Berlin, August 1997, pp. 619–628.
[10] J. Darlington, A.J. Field, P.G. Harrison, P.H.G. Kelly, D.W.N. Sharp, Q. Wu, Parallel programming using skeleton functions, in: A. Bode, M. Reeve, G. Wolf (Eds.), Proc. PARLE '93, Lecture Notes in Computer Science, Vol. 694, Springer, Berlin, 1993, pp. 146–160.
[11] J. Darlington, Y. Guo, H.W. To, J. Yang, Parallel skeletons for structured composition, in: Proc. 5th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, ACM Press, New York, July 1995, pp. 19–28.
[12] K. Didrich, A. Fett, C. Gerke, W. Grieskamp, P. Pepper, OPAL: design and implementation of an algebraic programming language, in: J. Gutknecht (Ed.), Programming Languages and System Architectures, Int. Conf., Zürich, Switzerland, March 1994, Lecture Notes in Computer Science, Vol. 782, Springer, Berlin, 1994, pp. 228–244.
[13] K. Didrich, W. Grieskamp, C. Maeder, P. Pepper, Programming in the large: the algebraic-functional language Opal 2, in: Proc. 9th Int. Workshop on Implementation of Functional Languages, St. Andrews, Scotland, September 1997 (IFL'97), Selected Papers, Lecture Notes in Computer Science, Vol. 1467, Springer, Berlin, 1998, pp. 323–338.
[14] S. Gorlatch, Formal derivation of divide-and-conquer programs: a case study in the multidimensional FFTs, in: D.
Mery (Ed.), Formal Methods for Parallel Programming: Theory and Applications, 1997, pp. 80–94.
[15] High Performance Fortran Forum, High Performance Fortran language specification, Sci. Programm. 2 (1) (1993).
[16] Z. Hu, M. Takeichi, W.-N. Chin, Parallelization of calculational forms, in: 25th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL'98), ACM Press, New York, January 1998.
[17] C.B. Jay, Shape analysis for parallel computing, in: J. Darlington (Ed.), Proc. 4th Int. Parallel Computing Workshop, Imperial College London, 25–26 September 1995, Imperial College/Fujitsu Parallel Computing Research Centre, 1995, pp. 287–298.
[18] C.B. Jay, Shape in computing, ACM Comput. Surv. 28 (2) (1996) 355–357.
[19] C.B. Jay, J.R.B. Cockett, Shapely types and shape polymorphism, in: D. Sannella (Ed.), Programming Languages and Systems — ESOP'94: 5th European Symp. on Programming, Edinburgh, U.K., April 1994, Proc., Lecture Notes in Computer Science, Springer, Berlin, 1994, pp. 302–316.
[20] C.B. Jay, M.I. Cole, M. Sekanina, P. Steckler, A monadic calculus for parallel costing of a functional language of arrays, in: C. Lengauer, M. Griebl, S. Gorlatch (Eds.), Euro-Par'97 Parallel Processing, Lecture Notes in Computer Science, Vol. 1300, Springer, Berlin, August 1997, pp. 650–661.
[21] C.B. Jay, P.A. Steckler, The functional imperative: shape!, in: C. Hankin (Ed.), Programming Languages and Systems: 7th European Symp. on Programming, ESOP'98, held as part of the Joint European Conf. on Theory and Practice of Software, ETAPS'98, Lisbon, Portugal, March/April 1998, Lecture Notes in Computer Science, Vol. 1381, Springer, Berlin, 1998, pp. 139–153.
[22] T. Nitsche, Optimising dynamic execution of data distribution algebras using shape analysis, Tech. Rep. TR98-3, Technical University of Berlin, Department of Computer Science, 1998.
[23] T. Nitsche, Optimizing dynamic execution of data distribution algebras, in: Proc.
GI-Workshop Parallel-Algorithmen, -Rechnerstrukturen und -Systemsoftware (PARS'98), Karlsruhe, September 1998, pp. 119–128.
[24] S. Pelagatti, A methodology for the development and the support of massively parallel programs, Ph.D. Thesis, Università di Pisa-Genova-Udine, Technical Report TD-11/93, March 1993.
[25] P. Pepper, M. Sudholt, Deriving parallel numerical algorithms using data distribution algebras: Wang's algorithm, in: Proc. 30th Hawaii International Conf. on System Sciences, 7th–10th January 1997.
[26] R. Rangaswami, HOPP — a higher-order parallel programming model, in: Algorithms and Parallel VLSI Architectures, 1995.
[27] R. Rangaswami, A cost analysis for a higher-order parallel programming model, Ph.D. Thesis, University of Edinburgh, 1996.
[28] D.B. Skillicorn, The Bird–Meertens formalism as a parallel model, in: NATO ARW "Software for Parallel Computation", June 1992.
[29] D.B. Skillicorn, Foundations of Parallel Programming, Cambridge University Press, Cambridge, 1994.
[30] M. Sudholt, The transformational derivation of parallel programs using data distribution algebras and skeletons, Ph.D. Thesis, Fachgruppe Übersetzerbau, Fachbereich Informatik, Technische Universität Berlin, August 1997.
[31] M. Sudholt, C. Piepenbrock, K. Obermayer, P. Pepper, Solving large systems of differential equations using covers and skeletons, in: 50th IFIP WG 2.1 Working Conf. on Algorithmic Languages and Calculi, Chapman & Hall, London, February 1997.
[32] R.D. Tennent, Semantics of Programming Languages, Prentice-Hall, Englewood Cliffs, NJ, 1991.