Express: proposal for uniform notations
A J E van Delft
A notational standard called Express is proposed that can be applied in several software engineering areas, e.g., requirements definition, design, and implementation. The kernel of Express is simple but powerful through application of mathematical forms of expressions. Notations in Express are uniform: for example, data structures are denoted in the same way as program structures. Express can be used in combination with templates that software engineers can fill with natural language and expressions.
Keywords: programming languages, notational standard, software specification, requirements specification
Department of Computer Science, University of Leiden, Niels Bohrweg 1, PO Box 9512, 2300 RA Leiden, The Netherlands. Current address: Delftware Technology, Gentsestraat 165, 2587 HP Den Haag, The Netherlands
vol 31 no 3 april 1989
During the process of software engineering various documents must be produced that define relevant aspects of the software. The presentation of the information is important. Documentation should be clear for those who use it, and it should not allow misunderstandings. Moreover, it should be possible to produce, study, and maintain the documentation efficiently. In this context, documentation standards can be defined with respect to the layout of pages, chapters, and paragraphs, to character styles, tables, diagrams, and pictures, and to the use of natural and synthetic or formal languages.
Formal languages are generally meant for the specification of software tasks. Among them are programming languages, formal specification languages, and formal notation standards. The latter are relatively simple in structure and not necessarily strict; they prescribe how to apply special symbols and keywords to specifications. One example is the standard for data dictionaries in structured systems design 1.
This paper proposes a simple and powerful formal notation standard called Express. The next section discusses existing formal languages and diagramming techniques. A large number of formal languages have been developed and proposed for software specifications, but despite the claims made by their proposers, their application by software engineers is limited. Apart from programming languages, software engineers often stick with
natural language and diagrams, which then results in incomplete or ill-digested amounts of documentation. A standard for expressing their findings concisely could help practitioners in their tasks. Criteria for such a standard are identified in the section 'Criteria for notational standard'. The section 'Express' introduces the proposed Express standard, which aims to fulfil the identified criteria. For the sake of simplicity, descriptions in Express basically consist of expressions; i.e., words, operators, and parentheses. Subsequent sections are about operators that occur in Express, their precedences, laws that they obey, and layout of expressions.
A way is presented of enriching Express with templates, which are texts with special fields that software engineers can fill with expressions. Examples of the use of some templates are presented in Appendix 1. Two examples are then presented of practical applications of Express; namely, how Express can be applied within the Jackson Structured Programming technique 2 and how a concept from Express can improve Chen's standard for entity-relationship diagrams 3. The final section discusses the kinds of tools that can be used for Express. One tool is the programming system for SCRIPTIC, a powerful and simple programming language based on Express. Appendix 2 discusses this in more detail.
FORMAL LANGUAGES AND DIAGRAMMING TECHNIQUES
In the last few years a number of formal languages have been developed to help software engineers with requirements specifications and designs. Among them are LOTOS 4, Process Algebra 5,6, Communicating Sequential Processes (CSP) 7, Vienna Development Method (VDM) 8, and Backus Naur Form (BNF) 9. For each of these languages, practical problems can be identified that the language can describe well. Research and development projects related to these languages are often ambitious. Researchers demand automated tools, which are then built.
However, a major problem is the reluctance of software engineers to use formal languages for requirements and designs. The
0950-5849/89/030143-17 $3.00 © 1989 Butterworth & Co (Publishers) Ltd
languages suffer from one or more of the following problems:
• Perfectionism, which allocates time inefficiently. Many formal languages are designed for precise specification of a special kind of software, and they make specification as hard as programming, or even harder. Using them creates an additional problem: the correctness of a program with respect to its formal specification should be demonstrated. But even this correctness is no guarantee that the program will answer the demands of its users.
• Limited applicability and flexibility. Most of the languages are meant for special purposes. BNF is only meant for describing context-free grammars, and Process Algebra and CSP are meant to describe the behaviour of processes. VDM allows algebraic specification of functions. The area of software testing is not covered by most languages. A formal language that is as strict as a programming language does not allow the engineer to express all his findings. The language may become an obstacle rather than an aid.
• Large complexity. It often requires a large effort to learn one single programming language well, and learning an additional formal specification language requires additional intellectual grasp. In its search for perfection, CSP is more complicated than many programming languages. In an attempt to overcome the problem of limited applicability, LOTOS incorporates a number of distinct facilities for expressing behaviour of processes and for expressing data structures and operations on them. Its vocabulary is as large as the vocabulary of a programming language.
• Abundant use of special symbols, as in CSP, VDM, and Process Algebra. Not only are Greek characters used, but so are dedicated symbols. This makes the formalism inaccessible to outsiders. Moreover, these notations require special text processing facilities. This problem has been recognised in LOTOS, which is in a sense a practical version of Process Algebra. In LOTOS, strange symbols have simply been replaced by keywords.
Experience points out that a more fruitful approach to helping software engineers is to provide them with a simple notational standard that does not aim at perfection. Good examples are the various kinds of diagrams and the formal notations in the data dictionary of Structured Systems Design 1 and the diagrams in Jackson Structured Programming (JSP) 2. Often a diagram can present structures better than text can. This holds especially when the diagram is a graph with one or more cycles in it. However, diagrams have their drawbacks too:
• They are often difficult to implement in documentation. Diagrams must either be drawn by hand, or with help of special software packages such as workbenches. When made by hand, diagrams are hard to modify. Workbenches are sold with great promises, but up to
now buyers are disappointed. Desktop publishing systems may improve the situation in the near future, although for comment sections in programs no solution is to be expected. Most compilers cannot process program text files that contain diagrams, except for diagrams that consist of characters.
• Diagrams may fill a lot of paper with little information.
• Diagrams often need additional text to be fully understandable.
CRITERIA FOR NOTATIONAL STANDARD
By analyzing and comparing formal languages and diagramming techniques as they are mentioned in the previous section, some criteria for a notational standard can be identified:
• Flexibility. The standard should serve the software engineer as a handy tool, allowing him to express all his findings. Application of the standard does not necessarily need to be to perfection.
• Applicability. The standard should be applicable in the following dimensions.
o Phases in software engineering. For instance, distinguishing between the phases requirements definition, design, implementation, module testing, system testing, and maintenance.
o Methods and techniques, such as SASD, SDM, and JSP. Most methods prescribe their own notational standards, but these are not essential.
o Product ranges. For instance, differentiation can be made between realtime systems, interactive systems, data processing systems, and expert systems.
• Simplicity. The standard should be easy to apply and comprehend. Although at first sight conflicting with the former criterion, simplicity may be achieved by means of:
o Simple grammatical rules.
o Use of common and recognisable symbols. Their meaning should correspond to their everyday use. They should be available on common keyboards and text processors. To apply the standard in commentary sections in program texts, the symbols should be such that they are accepted by compilers.
o Uniformity. Similar types of structures in different application areas should appear in similar ways. Apart from the easy-to-learn soberness, uniformity in general has the advantage of visualizing relations between structures. For instance, the Jackson Structured Programming technique uses uniform representations for data and program structures.
• Conciseness. The standard should allow for concise formulations, thus eliminating the danger of huge amounts of documentation that contain relatively little information.
The criteria mentioned above can help to develop a good
information and software technology
standard, but only practical experience with a standard can show the real value. Too often practical evaluations of proposed standards and languages are omitted.
EXPRESS
Express is a proposal for a standard that meets the criteria in the last section. The uniform base of Express is expressions: pieces of text with words, operators (mathematical symbols and punctuation marks), and parentheses. The way these expressions are applied can vary from loose and informal to strict and formal, depending on the circumstances. When used formally, Express can be supported by computers. For instance, the author is currently involved in the development of SCRIPTIC, a powerful enrichment of conventional languages, which is based on Express.
A convenient way to apply Express is to define notions by means of equations. A word or group of words that represents a notion is followed by the equals symbol (=) and a description. This description is in general an Express expression, but the possibility is left open that it is a piece of natural language. The words that occur in the expression refer to other notions, which may in turn be defined by means of equations, but they may speak for themselves as well. The software engineer can work top-down, bottom-up, or otherwise.
Most operators in the expressions are binary, which means that they have two operands. The operators have different precedences, which can be overruled with parentheses, by analogy with the rules for mathematical expressions. Special operators may occur in expressions that do not get their meaning from Express but from some other standard.
It is possible to describe the syntax of well-formed Express expressions in Express itself. Instead of such an elaborate but complicated description, a small but incomplete one can be given. Here a small description is given that does not express the fact that in well-formed expressions there are as many left- as right-hand parentheses, nor how many operands the operators have:
Expression = (Word | Operator | Parenthesis);..
The operator '|' denotes a choice, and the semicolon followed by the periods means that there exists a sequence of a number of times of the aforementioned. Additional definitions can be given for the notions Operator and Parenthesis:
Operator = Equals | Pair of Periods | Comma | Arrow | Ampersand | Colon | Bar | Minus | Semicolon
Parenthesis = Leftparenthesis | Rightparenthesis
These definitions express which symbols are operators and which are parentheses. The three definitions can be combined into one:
Expression = ( Word | Operator (= Equals | Pair of Periods | Ampersand | Colon | Comma | Bar | Semicolon | Arrow | Minus ) | Parenthesis (= Leftparenthesis | Rightparenthesis) );..
The nested definitions start with the significant composed symbol (=. As abundant use of parentheses may cause confusion, special care should be taken with their layout. Nesting large texts, as in the description of Operator above, can become unclear. In this case a compromise is:
Expression = ( Word | Operator | Parenthesis (= Leftparenthesis | Rightparenthesis) );..
Operator = ....
In principle it should be possible that the occurrence of a notion within an expression is replaced by its definition, enclosed by parentheses if necessary. For the current example above this yields:
Expression = ( Word | Equals | Bar | Comma | Colon | Semicolon | Arrow | Minus | Ampersand | Pair of Periods | Leftparenthesis | Rightparenthesis );..
Expressions may be related to natural language, by letting operators correspond to phrases. For instance, the operator = corresponds to the phrase is and | corresponds to or. This relation may help the software engineer in formulating his expressions while thinking in natural language. Moreover, the relation is useful for reading expressions. It also allows for a loose application of Express; namely, texts in natural language where phrases such as is and or have been replaced by their corresponding operators.
EXPRESS OPERATORS
During the development of software, various kinds of definitions should be made. These can be static structures such as entity-relationship models, database designs, data structures in programs, modular and procedural composition of modules, and import and export of items between modules. There are also dynamic structures, e.g., behaviour of entities (entity life histories), behaviour of programs, and test scripts. Finally, there are predicates with which conditions can be defined with respect to the static and dynamic structures. A large overlap exists among the building blocks or operators that are needed to describe static and dynamic structures and predicates. For example, the Express operator | can be applied to obtain the logical operation or on two predicates, or to obtain the union of two sets.
Because of this overlap only a limited set of operators is needed to express a lot of constructs. Therefore the meaning of operators is overloaded. Overloading may cause trouble when Express is applied carelessly, but it is not difficult to apply Express in an appropriate way. The operators in Express are used to formalize constructs such as composition, choice, condition, sequence, and multiplicity. These operators are presented below one at a time, with examples from their application areas, e.g., data analysis, modular design, and specification of behaviour of programs. The word 'function' will be reserved for the context of system analysis. In the context of programs, the word 'procedure' will be used to avoid ambiguity.
Ampersand (&)
The ampersand in Express has the same meaning as in natural written language. & always has two operands for which it denotes a symmetrical composition, and it is pronounced and. Depending on the context, A = B&C can mean:
• The predicate A is true if and only if predicates B and C are both true.
• The set A is the intersection of the sets B and C.
The relation with the former meaning becomes clear in the equivalence of the composed predicate:
(E is an element of set S) & (E is an element of set T)
with the single predicate:
E is an element of set S&T
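The equivalence between the composed predicate and the single predicate can be checked mechanically. The following is a minimal sketch in Python, which happens to overload & for both booleans and sets; the sample sets S and T and the range of candidate elements are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sample sets, chosen only for illustration.
S = {1, 2, 3, 4}
T = {3, 4, 5}
candidates = range(0, 8)

def in_both(e):
    # Composed predicate: (E is an element of set S) & (E is an element of set T)
    return (e in S) & (e in T)

def in_intersection(e):
    # Single predicate: E is an element of set S&T
    return e in (S & T)

# The two predicates agree on every candidate element.
agree = all(in_both(e) == in_intersection(e) for e in candidates)
print(agree)
```

The same symbol serves both readings here, which mirrors the overloading that Express relies on.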
It is recognised that in Express one symbol only is not sufficient to express and. In the context of sets, the word and may not only denote an intersection, but also a Cartesian product. Therefore, the following operator is introduced.
Comma (,)
As is the case for the ampersand, the meaning of the comma in Express is the same as in normal written language, and it is pronounced and. In Express, commas are used to denote Cartesian products of sets, elements of those products (or ordered coordinate pairs), and related structures such as parameter lists. Conceptually, the comma is a symmetrical operator, but sequence may be important in ordered pairs. Depending on the context, A = B,C can mean:
• The record, entity, or relation A has attributes B and C.
• The function A consists of subfunctions B and C.
• The system A consists of parts (programs, modules) B and C.
• A screen layout A consists of fields B and C.
• The program statement A consists of the parallel execution of the statements B and C.
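The Cartesian-product reading of the comma, and the remark that sequence may matter in ordered pairs, can be illustrated with a small Python sketch. The sets B and C below are hypothetical sample data.

```python
from itertools import product

# Hypothetical attribute domains, for illustration only.
B = {"red", "green"}
C = {1, 2, 3}

# A = B,C read as the Cartesian product of B and C.
A = set(product(B, C))

print(len(A))             # every combination of one B value with one C value
print(("red", 1) in A)    # an ordered pair (B value, C value)
print((1, "red") in A)    # the swapped pair is a different element
```

Conceptually the composition is symmetrical, but the elements of the product are ordered pairs, which is why the swapped pair is not a member.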
Bar (|)
The application of the bar in Express is an extension of its use in BNF and in the programming language MODULA-2, and it can be compared with the plus-sign in Process Algebra. It always has two operands between which it denotes a choice, and it is pronounced or. It may or may not be specified how this choice is controlled. As a concept, choice is closely related to addition in mathematics, and therefore several notational standards reserve the plus-sign for it. A previous version of Express did so as well, but the bar turned out to be clearer. Some confusion may occur with the ampersand and the comma, because the bar often behaves like and. A choice A or B can be described by choose from A and B. Therefore these operators should be used carefully. Depending on the context, A = B|C can mean:
• The predicate A is true if and only if at least one of the predicates B and C is true.
• The domain A (enumerated type) offers a choice from values B and C.
• The domain or set A is a union of B and C.
• The program statement A consists of a choice between the statements B and C.
The bar in predicates means an inclusive or: if at least one of the operands is true, then the result of the operation is true. For the specification of static structures and behaviour, the bar yields an exclusive operation: either one of the operands is to exist or to happen, unless they are the same.
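The predicate and set readings of the bar described above can be sketched in Python, where | is likewise defined on both booleans and sets. The sets B and C and the list of probe values are assumptions made for the example.

```python
# Predicates: the bar is an inclusive or.
for b in (False, True):
    for c in (False, True):
        print(b, c, b | c)

# Sets: the bar is a union. B and C are hypothetical sample sets.
B = {"mon", "tue"}
C = {"tue", "wed"}
A = B | C

def member_of_either(e):
    # (e is an element of B) | (e is an element of C)
    return (e in B) | (e in C)

# Membership of the union coincides with the inclusive or of the memberships.
same = all(member_of_either(e) == (e in A) for e in ["mon", "tue", "wed", "thu"])
print(sorted(A), same)
```

The exclusive reading for behaviour (exactly one alternative happens) is not modelled here; it would need a notion of execution that this sketch leaves out.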
Semicolon (;)
The semicolon always has two operands for which it denotes an asymmetrical composition, and it is pronounced and then. Its application in Express is an extension of its use in the programming languages PASCAL and MODULA-2. The meaning of the semicolon is like the meaning of the comma, the major difference being that the semicolon always denotes a sequence. Depending on the context, A = B;C can mean:
• The linear text A consists of a sequence of texts B and C.
• The array (or linked list or file) A contains elements B and C in this sequence.
• The array (or linked list or file) A is a concatenation of arrays (lists, files) B and C.
• The program statement A consists of the sequencing in time of the statements B and then C.
• The precedence list A is given by: first B and then C.
Pair of periods (..)
For both static and dynamic structures, the pair of periods denotes repetition of the foregoing text. The pair of periods acts in an expression as a special operand, but it can also be regarded as an operator that has two operands, of which the latter is either the comma or the semicolon. In A,.. the pair of periods has two operands, A and the comma. It is pronounced so on.
In the context of data, A = B,.. means that structure A is a Cartesian product of any number of times of structure B. In other words, structure A consists of a repetition of structure B. A = B,..,B denotes one or more Bs. For repeating structures that involve sequences of other structures (lists of items, texts, or program statements), the periods are used in combination with the semicolon: A = B;..;C means that A is a sequence of zero or more times B, concluded by C. How the repetition is controlled does not need to be specified.
The pair of periods may as well be used to denote sets of numbers or characters. For example, 1..10 means a set that contains the numbers 1, 2, ... up to 10.
When reading a subexpression that is followed by a comma or semicolon and a pair of periods, it does not become clear that this structure occurs a number of times until the pair of periods is encountered. Meyer 10 solved this disadvantage by the requirement that the subexpression be enclosed by braces, as in {subexpr;..}. But a comparable construct in Express would still not clarify immediately whether the iteration is sequential or not. Alternatively, it could be required that the subexpression is both preceded and followed by the pair of periods and the comma or semicolon, as in ..,subexpr,.. and ..;subexpr;... For the time being Express adopts the suffix notation because this appears to be simpler.
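As a rough analogy, and not part of Express itself, the sequential repetition "zero or more times B, concluded by C" corresponds to the regular expression B*C, and the range of numbers from 1 up to 10 corresponds to a Python range. The single-letter texts below are stand-ins for arbitrary structures.

```python
import re

# Analogy for A = B;..;C : zero or more times B, concluded by C.
zero_or_more_b_then_c = re.compile(r"(?:B)*C")

print(bool(zero_or_more_b_then_c.fullmatch("C")))      # zero repetitions of B
print(bool(zero_or_more_b_then_c.fullmatch("BBBC")))   # three repetitions of B
print(bool(zero_or_more_b_then_c.fullmatch("BB")))     # not concluded by C

# Analogy for the set of numbers 1, 2, ... up to 10.
numbers = set(range(1, 11))
print(min(numbers), max(numbers), len(numbers))
```

The regular expression only covers the sequential (semicolon) form; the Cartesian (comma) form has no ordering in time and is not captured by this analogy.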
Colon (:)
The colon is used for several purposes, by analogy with the situation in PASCAL and MODULA-2, and it always has two operands. In the contexts of data structures and of sets, A:B is pronounced A is a B, meaning that an object (element, entity, or attribute) A is in the domain (set, type, class) B. The colon then indicates that a further description follows, comparable with its function in natural written language.
In several other contexts, e.g., behaviour and data structures, the colon may point at a condition. To distinguish this use from the former one, either the left or the right operand is marked by means of braces or the use of italics. Examples are A:{B} and {A}:B, and also A:B with the marked operand set in italics. The marked operand expresses a condition and is called a guard. Marking will often prevent the need for delimiting parentheses.
{A}:B is pronounced if A then B and means that the condition A should be true for B to happen or exist. If A is false then B does not happen, nor can the subexpression be skipped, yielding a deadlock situation if there are no alternatives. By means of the bar, alternatives may be added to obtain short expressions comparable with the CASE- and IF-statements in PASCAL. {A}:B | {C}:D means that if A is true then B happens, or if C is true then D happens. If both A and C are true, then the choice is not determined by the guards.
A:{B} is pronounced A yielding B, meaning that when A has happened condition B holds.
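The behaviour of guarded alternatives described above (one guard true, several guards true, no guard true) can be modelled with a short Python sketch. The function name enabled, the Deadlock exception, and the representation of alternatives as (guard, behaviour) pairs are assumptions introduced for this illustration only.

```python
class Deadlock(Exception):
    """Raised when no guard of a guarded choice holds."""

def enabled(alternatives):
    """Return the behaviours whose guards are true.

    alternatives is a list of (guard, behaviour) pairs and models
    a guarded choice such as {A}:B | {C}:D.
    """
    ready = [behaviour for guard, behaviour in alternatives if guard]
    if not ready:
        # No guard holds and the subexpression cannot be skipped.
        raise Deadlock("no guard holds and there is no alternative")
    return ready

print(enabled([(True, "B"), (False, "D")]))   # only B can happen
print(enabled([(True, "B"), (True, "D")]))    # the guards do not determine the choice
```

Returning every enabled behaviour, rather than picking one, keeps the sketch faithful to the statement that the choice is left open when several guards hold.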
Arrows (<-, ->)
Arrows are used for several purposes, as in diagrams and schemes. Arrows may be built into the text with standard keyboard symbols: horizontal bars and angle brackets. Some examples are:
• (A,B) -> (C,D) : attributes C and D depend on the combination of attributes A and B.
• A,.. -B-> C : entity A has an n-1 relation B with entity C.
• A <- (B,C) : module A imports items B and C.
• A -B,C-> D : module A exports items B and C to module D.
• A -> (B&C) : predicate A implies the predicate B&C.
The examples above are based on global overviews. Arrows may be used in some other context, e.g., in a description of an entity or a module. Then the arrows may be used without a left or a right operand. Examples for the context of an item A:
• ,.. -B-> C : entity A has an n-1 relation B with entity C.
• <-B,C- D : function A has input dataflows with items B and C from function D.
• -B,C-> (D,E) : function A has output dataflows that contain items B and C, to functions D and E.
• <-B,C- D : module A imports items B and C from module D.
Minus (-)
The minus in Express is used as a unary operator for negating predicates. Binary use is defined by A - B = A & -B. In the context of sets, A - B denotes the difference of sets A and B. In guarded choices the mere minus that is followed by the colon denotes a condition that forms the negation of the alternative guards, like the ELSE construct in PASCAL. For instance, {G}:A | {H}:B | -:C means: if G holds then behaviour A, or if H holds then behaviour B, or, if neither G nor H holds, then behaviour C.
Moreover, the minus can denote an empty structure, so that optional structures can be expressed. A | - denotes a choice between A and an empty structure, or in other words A need not exist or happen. A traditional IF G THEN A ENDIF construct may be expressed as {G}:A | -:-, whereas {G}:A will result in a deadlock situation if G is not true.
A traditional statement such as WHILE cond DO A END, not specifying further behaviour, can be expressed in Express with the minus: ({cond}:A);..;({-cond}:-). A shorter notation mentions the condition only once, as in A;..;({-cond}:-) or ({cond}:A);... A more traditional notation, however, with the help of keywords such as WHILE and UNTIL, may be clearer to many engineers: A;..WHILE cond and A;..UNTIL cond. These keywords are not included in Express, but for the time being Express does not forbid any enhancements because practice should point out what is best.
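Two of the statements about the minus can be made concrete in Python: the definition of binary minus through negation, read on sets, and the WHILE construct read as repetition of a guarded behaviour followed by the negated guard. The universe U, the sample sets, and the helper names neg and while_loop are assumptions of this sketch; a complement only makes sense relative to some chosen universe.

```python
# Hypothetical universe, needed so that the complement -B is defined.
U = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5}

def neg(s):
    # Unary minus on a set, read as complement within U.
    return U - s

# Binary minus defined through negation: A - B = A & -B.
print(A - B == A & neg(B))

def while_loop(cond, body, state):
    # Models ({cond}:A);..;({-cond}:-): repeat A while the guard holds,
    # then pass through the empty structure once the guard is false.
    while cond(state):
        state = body(state)
    return state

print(while_loop(lambda n: n < 5, lambda n: n + 1, 0))
```

The loop sketch treats behaviour A as a state-transforming function, which is one possible interpretation and not something the notation prescribes.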
Overview
For the application areas that have illustrated the operators above, an overview of the main use of the operators (except for the equals-sign) is as follows:
Dataflow Analysis <- comma, arrows
Data Analysis <- comma, arrows
Modular Design <- comma, arrows
Sets <- comma, ampersand, bar, minus, colon
Predicates <- ampersand, bar, minus, arrows, Sets
Data Structures <- semicolon, pair of periods, Sets
Precedence List <- semicolon, comma
Behaviour <- semicolon, comma, bar, minus, colon, pair of periods, arrows
Remarks
After this, three important questions remain to be answered:
• Is the set of Express operators complete?
• Is there any overlap between the operators?
• Does the meaning of each operator become clear from its context?
The first question can hardly be answered theoretically, but rather by experience. In many cases additional notations will be needed, and the next section gives some suggestions.
The second question should yield a negative answer, since overlap means that it will not always be clear what operators to use. When the arrows are not taken into consideration, the set of operators does not contain overlap. There is some overlap, however, between the arrows and other asymmetrical operators, though this does not appear to be critical. For instance, it is natural to let an arrow denote sequences, as is done in CSP, although for that purpose the semicolon is more widely applied in programming languages. Guarded commands and postconditions can be expressed well with the help of arrows instead of a colon, as is done in the programming language ADA. This has the advantage of uniformity with predicates that express implications by means of arrows. Express adopts the colon for this purpose, conforming to the programming languages C, PASCAL, and MODULA-2. This will be changed when experience points out that the arrows are better suited.
The third question must also be answered by experience. The software engineer can clarify the meaning of his expressions by providing additional textual information. For example, the overview earlier in this section is preceded by the text For the application areas ...., which leaves no doubt about the meaning of the overview, although arrows have not been used this way earlier in this paper.
SOME ADDITIONAL NOTATIONS
The kernel of Express as it is presented above is enriched with notations to support application in areas such as data analysis, predicate logic, and program pseudocode. The additional notations are as far as possible according to existing conventions.
Mathematical notations
Mathematical expressions can be formed as is customary with symbols such as +, -, *, /, #, {, }, =, <, >, <=, and >=. The last five operators may be used not only for comparing numbers, but also for comparing sets. In program pseudocode, := can be applied for assignments of values to variables. Some examples are:
• #S : The number of elements of set S.
• {1,2} : The set that consists of the elements 1 and 2.
• S>=T : The set S is equal to or a superset of set T.
• n>m : The number n is greater than the number m.
Period
The use of the period in Express is an extension of its use in the programming languages C, PASCAL, and MODULA-2 for the qualification of record fields and module items. A.B is pronounced B of A or field B of A or item B of A, denoting some part B of item A.
Per cent symbol
An additional operator % is needed for specifying behaviour that is terminated by some event. A%B is pronounced A possibly terminated by B and denotes a behaviour A that will halt as soon as behaviour B occurs. Applications are terminating program execution because of some exceptional situation and, in combination with a pair of periods, exiting from a loop.
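One way to picture a behaviour that halts as soon as another behaviour occurs is a loop that checks for the terminating event before each step. The sketch below is only an interpretation under stated assumptions: behaviour A is modelled as a list of step functions, event B as a polling function, and the name run_until is hypothetical.

```python
def run_until(steps_of_a, b_occurred):
    """Perform the steps of behaviour A until event B is observed.

    steps_of_a: iterable of callables, one per step of A (hypothetical model).
    b_occurred: callable returning True once the terminating event B has happened.
    Returns the number of steps of A that were performed.
    """
    performed = 0
    for step in steps_of_a:
        if b_occurred():
            break          # B occurred: A halts, as in exiting from a loop
        step()
        performed += 1
    return performed

log = []
steps = [lambda i=i: log.append(i) for i in range(5)]
count = run_until(steps, lambda: len(log) >= 3)
print(count, log)
```

Polling between steps is a simplification; an event that interrupts A in the middle of a step would need a different model.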
Quantification
In predicates the symbols ∃ and ∀ are commonly used for existential and universal quantification, but these symbols are rarely available on text processors. Express uses the symbols E? and A! instead, in combination with the colon instead of the ∈ symbol. A quantifying expression precedes the quantified expression, and it is marked in italics or, if the right operand is already in italics or italics are not available, by braces. Inside the expression, the colon can be used to denote the domain of quantifying variables, just like the declaration of free variables. For example, a predicate each person descends from some other person is formalized with help of the predicate ... DescendsFrom ... by:
{A! Child: Person} {E? Parent: Person} Child DescendsFrom Parent
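Over a finite domain, the universally and existentially quantified predicate above maps directly onto Python's all and any. The set of persons and the descends_from relation below are toy data invented for the example (the relation is deliberately cyclic so that the predicate can hold on a closed set).

```python
# Toy data, purely illustrative: pairs are (child, parent).
persons = {"Ann", "Bob", "Cid"}
descends_from = {("Bob", "Ann"), ("Cid", "Bob"), ("Ann", "Cid")}

def each_person_descends(relation):
    # {A! Child: Person} {E? Parent: Person} Child DescendsFrom Parent
    return all(
        any((child, parent) in relation for parent in persons)
        for child in persons
    )

print(each_person_descends(descends_from))
# Removing Ann's parent makes the universal quantification fail.
print(each_person_descends(descends_from - {("Ann", "Cid")}))
```

The nesting order of all and any follows the order of the quantifying expressions, which precede the quantified expression in the Express formulation as well.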
In the parameter lists of procedure declarations and calls the exclamation-mark marks input-parameters, and the question-mark marks output-parameters. Parameters that are used for both input and output are marked with !?. The analogy with predicates becomes clear by regarding a procedure specification in Express:
means that the operator operates only on nearby text. The precedences in Express are defined in such a way that they correspond as much as possible to comparable precedences in mathematics. Thus the equals sign has the lowest precedence of all binary operators, followed by the per cent symbol and then by the bar and the minus. The semicolon, the ampersand, and the comma share a higher precedence, while the highest is for the colon and the arrows. The minus in unary use still has lower binding strength. The pair of periods and the minus can be used as normal operands. Expressions in which sequences of operators with equal precedences occur should be interpreted from left to right. For example, a; - , b; c,.. is equal to ( ( ( a : - ) , b) ; c ) . . . . The precedence order of the main binary operators can be expressed in Express:
P(tnPar!:Doml, OutPar?:Dom2) . . . .
is pronounced the procedure P will deliver for each value oflnPar in the domain Domq some value of OutPar in the domain Dora2 by .... This analogy is not new: predicates and procedures are treated uniformly in the programming language PROLOG. As most parameters tend to be input parameters, it is possible to adopt the rule that input parameters are not marked with !, while in-outand out-parameters are still marked with T? and ?. An example is OpenFile (filehandle?, filename, mode, done?).
Underscoring

In data analysis it is important to distinguish key attributes from others. Following the present custom, key attributes are underscored in Express. When underscoring is not possible on the text processor used, surrounding underscore symbols can be used (_example_). For the specification of FOR loops in program pseudocode, the loop counter can be marked by underscoring it. The analogy with key attributes is that the separate iterations are primarily identified by the value of the loop counter. For example, instead of FOR i = 1 TO N DO Print(i) END it is possible to specify {AT _i_: 1..N} Print(i);.. Some will prefer to write {FOR _i_: 1..N} Print(i);..

Three periods

Specifications may often be shortened by the use of three consecutive periods, pronounced as et cetera. The actual meaning of this symbol should become clear just by intuition, unlike the situation for the pair of periods, which strictly denotes an iteration of some operand.
PRECEDENCES OF OPERATORS

Like operators in mathematics, Express operators have different precedences (binding strengths), which can be overridden by means of parentheses. A high precedence
vol 31 no 3 april 1989
Express Precedence List = equals; (bar, minus); (semicolon, ampersand, comma); (colon, arrows)
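The precedence order and the left-to-right rule can be made concrete with a small precedence-climbing sketch in Python. It is only an illustration and not part of Express: the function names (tokenize, parse, fully_parenthesize) and the numeric strength values are assumptions chosen here, and the arrows, the unary minus, and the pair of periods are left out for brevity.

```python
import re

# Binding strengths as described in the text (larger binds tighter):
# '=' is weakest, then '%', then '|' and '-', then ';', '&' and ',',
# and ':' is strongest. The numbers themselves are an assumption.
PRECEDENCE = {"=": 1, "%": 2, "|": 3, "-": 3, ";": 4, "&": 4, ",": 4, ":": 5}

TOKEN = re.compile(r"\s*([=%|\-;&,:()]|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(text):
    """Split an expression into operator, parenthesis, and identifier tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_operand(tokens):
    """Parse an identifier or a parenthesized sub-expression."""
    if not tokens:
        raise ValueError("operand expected")
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens, 1)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("closing parenthesis expected")
        return inner
    if token in PRECEDENCE or token == ")":
        raise ValueError(f"operand expected, found {token!r}")
    return token


def parse(tokens, min_strength=1):
    """Precedence climbing; operators of equal strength group left to right."""
    left = parse_operand(tokens)
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_strength:
        operator = tokens.pop(0)
        right = parse(tokens, PRECEDENCE[operator] + 1)
        left = f"({left}{operator}{right})"
    return left


def fully_parenthesize(text):
    """Return the expression with every binary operation parenthesized."""
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise ValueError(f"unexpected token {tokens[0]!r}")
    return result


print(fully_parenthesize("a;b,c"))    # ((a;b),c)
print(fully_parenthesize("x=a|b;c"))  # (x=(a|(b;c)))
```

The first call shows the left-to-right rule for the semicolon and the comma, which share one precedence; the second shows the equals sign binding weakest and the semicolon binding tighter than the bar.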
LAWS FOR OPERATORS

In recent years, interest in mathematical properties of programs and specifications has grown 5,6,11. It appears that when programs and specifications are written down with the use of expressions, laws can be formulated with respect to equivalences between these expressions. The laws can be interpreted as rewrite rules that clarify the meaning of shorthand notations. Simple classes of laws concern idempotency, commutativity, associativity, and distributivity of operators. Obviously in Express several laws hold too, but an important question is whether such laws are independent of the context in which expressions are used. If there is a strong dependence, the engineer who applies Express should be aware of it, and this would make Express harder to use.

A binary operator (op) is said to be idempotent if the expression x(op)x is equal to x. The bar and the ampersand are idempotent.

A binary operator (op) is called commutative if the expression x(op)y is equivalent to y(op)x. This seems to hold for the equals sign, the ampersand, the comma, and the bar, but there are some restrictions. The equals sign is generally used in definitions where its left-hand side consists of a single operand, not an expression. The comma is conceptually commutative, but the order of its operands can be relevant, for instance in coordinate pairs and parameter lists.

A binary operator (op) is called associative if the expression (x(op)y)(op)z is equivalent to x(op)(y(op)z). Associativity implies that in such an expression the parentheses may be omitted, which makes the expression clearer. In principle, associativity holds for the ampersand, the comma, the bar, and the semicolon. However, parentheses can be used in combination with the comma for the notation of parameter lists and coordinate pairs, as (x,y). In these situations, removing
parentheses is not justified. When the pair of periods acts as an operand for the comma or the semicolon, associativity does not hold either: (x;y);.. is not equal to x;(y;..).

A binary operator (op1) is said to be left-distributive over an operator (op2) if the expression x(op1)(y(op2)z) is equal to (x(op1)y)(op2)(x(op1)z). Right-distributivity is defined by analogy with the expression (y(op2)z)(op1)x. For commutative operators left- and right-distributivity are the same, so that the single term distributivity can be applied. Distributive laws in general imply that concise expressions may be used instead of larger ones, and the meaning of such a concise expression can be found by rewriting it into its larger equivalents. There are quite a lot of combinations of Express operators for which the distributivity should be checked, which is done below. It will appear that many operators distribute over the bar and the minus.

For the ampersand, for instance, x&(y&z) is equal to (x&y)&(x&z), because of associativity, commutativity, and idempotency; x&(y|z) is equal to x&y|x&z; and x&(y-z) is equal to x&y - x&z.
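In the context of sets, several of the laws discussed in this section can be checked mechanically. The following Python sketch does so by brute force over all subsets of a small universe. It is an illustration under assumptions: Python's set operators &, | and - stand in for the Express ampersand, bar, and minus, and the names UNIVERSE, subsets, holds, and LAWS are chosen here and do not come from Express.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # a small universe, chosen here for illustration


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    return [frozenset(chosen)
            for size in range(len(universe) + 1)
            for chosen in combinations(universe, size)]


def holds(law):
    """Check law(x, y, z) for all triples of subsets of UNIVERSE."""
    return all(law(x, y, z)
               for x, y, z in product(subsets(UNIVERSE), repeat=3))


# Every expression is fully parenthesized on purpose, because Python's
# operator precedences differ from those of Express.
LAWS = {
    "x&(y&z) = (x&y)&(x&z)": lambda x, y, z: x & (y & z) == (x & y) & (x & z),
    "x&(y|z) = (x&y)|(x&z)": lambda x, y, z: x & (y | z) == (x & y) | (x & z),
    "x&(y-z) = (x&y)-(x&z)": lambda x, y, z: x & (y - z) == (x & y) - (x & z),
    "x|(y|z) = (x|y)|(x|z)": lambda x, y, z: x | (y | z) == (x | y) | (x | z),
    "x|(y-z) = (x|y)-(x|z)": lambda x, y, z: x | (y - z) == (x | y) - (x | z),
    "(x&y)-z = (x-z)&(y-z)": lambda x, y, z: (x & y) - z == (x - z) & (y - z),
    "(x|y)-z = (x-z)|(y-z)": lambda x, y, z: (x | y) - z == (x - z) | (y - z),
    "x-(y&z) = (x-y)&(x-z)": lambda x, y, z: x - (y & z) == (x - y) & (x - z),
    "x-(y|z) = (x-y)|(x-z)": lambda x, y, z: x - (y | z) == (x - y) | (x - z),
    "x&y = x-(x-y)": lambda x, y, z: x & y == x - (x - y),
    "x-(y&z) = (x-y)|(x-z)": lambda x, y, z: x - (y & z) == (x - y) | (x - z),
    "x-(y|z) = (x-y)&(x-z)": lambda x, y, z: x - (y | z) == (x - y) & (x - z),
    "x-(y|z) = (x-y)-z": lambda x, y, z: x - (y | z) == (x - y) - z,
}

results = {name: holds(law) for name, law in LAWS.items()}
for name, outcome in results.items():
    print(f"{name:24} {'holds' if outcome else 'fails'}")
```

Three entries are expected to fail: the bar over the minus, and the two left-distributive forms of the minus. The check covers sets only; it says nothing about the contexts of behaviour or ordered pairs, where the text notes that some laws depend on the circumstances.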
• The bar in Express distributes over itself: x|(y|z) is equal to (x|y)|(x|z), because of associativity, commutativity, and idempotency. However, the bar does not distribute over the minus: x|(y-z) is not equal to (x|y)-(x|z).
• The equals sign, being a special operator, does not distribute: x=(y(op2)z) is in general not equal to (x=y)(op2)(x=z). Neither do other operators distribute over the equals sign.
• Depending on the context, the comma may distribute over the bar. x,(y|z) = x,y|x,z holds, for instance, when a combination of either the items x and y or x and z is meant. When x, y, and z are sets and the comma denotes a Cartesian product, distributing over the bar, the ampersand, and the minus is mathematically correct.
• The semicolon is right-distributive over the bar in the context of behaviour and data structures: (x|y);z = x;z|y;z. Left-distributivity, formulated as x;(y|z) = x;y|x;z, can well be assumed too, but there are reasons not to do so in the context of behaviour. According to the theory of Process Algebra (where the plus sign is applied instead of the bar), the difference between the two expressions lies in the moment of choice between the branches: either before or after the occurrence of x. Thus, left-distributivity may or may not be assumed, depending on the circumstances.
• The arrows distribute over the comma. In graphs, an arc that has multiple labels can be replaced by a set of arcs with single labels. By analogy, x--(u,v)-->y is equivalent to x--u-->y, x--v-->y and (x,y)--u-->z equals x--u-->z, y--u-->z. In predicates, the arrow is distributive over the bar and the ampersand in the direction of the arrow: x-->(y&z) = x-->y & x-->z and x-->(y|z) = x-->y | x-->z.
• The colon can be used as 'is a', in the context of sets. For ordered pairs neither (x,y):z = x:z,y:z nor x:(y,z) = x:y,x:z holds. The colon can also be used to mark guards in behaviour expressions: {x}:(y|z) = {x}:y|{x}:z and {x}:(y,z) = {x}:y,{x}:z. Moreover, {x|y}:z = {x}:z|{y}:z. Note that here the context in which the bar occurs switches from predicates to behaviour, which is not possible for the ampersand.
• The ampersand distributes over itself, the bar, and the minus, in both the contexts of predicates and of sets.
• The minus in the contexts of predicates and of sets is right-distributive over the bar and the ampersand, but not over the minus itself:
  o x&y - z = (x-z)&(y-z)
  o (x|y) - z = (x-z)|(y-z)
  However, the minus is not left-distributive: x - y&z is not equal to (x-y)&(x-z) and x-(y|z) is not equal to (x-y)|(x-z). In contrast, the following laws apply:
  o x&y = x-(x-y)
  o x-y = (x-y)-(y-z)
  o x - y&z = (x-y)|(x-z)
  o x-(y|z) = (x-y)&(x-z) = (x-y)-z
• Several laws hold for the pair of periods, like x; " = x . . and - ; . . . . . Mathematically x;.. is the solution Y of the equation Y = -|x;Y and x,.. is the solution of Y = -|x,Y.

The conclusion is that the Express operators obey several laws that are intuitively clear and in general independent of the context. Practical use of Express should point out whether the laws can lead to confusion.

LAYOUT

Layout of expressions is an important issue, especially for complicated ones. Express allows a lot of information to be expressed on a single line, which is sometimes good for clarity. Often, however, clarity is enhanced by the use of more than one line. The layout must be in conformity with the precedences of the operators. Indent levels can mark refinements. An operator with a high precedence (thus having strong binding strength) must have its operands closer to itself, relative to operators with lower precedences. The convention for written natural language may be adopted that no blank is put before and one is put behind the comma, the colon, and the semicolon. The following conventions suit the use of Express as a pseudocode:
• The comma and the semicolon are never put at the beginning of a line.
• The bar and, in binary use, the minus are never put at the end of a line.
• In enumerating definitions that do not all fit on a single line, operators are preferably put into columns below one another.
information and software technology
• Line-crossings and blank spaces next to operators must be according to their precedences: a high precedence of an operator requires that its operands appear nearby.
• The indent level is increased at the start of each nested definition, and decreased at the end.

An example for this convention:
ExampleExpression - Term1 ( - ~Predicate1 &Predicate21 Predicate3}: Term2 I [ -- Predicate41: Term3); Term3 ( - Term4 & Term5 ) : ( Term6 ( ~Predicate5l:Term7; .; ~Predicate6] :Term8); Term9 ;.. ) ; Term10 % Term11
Expressions can be laid out as trees. For example, A = A1, A2 (= A21, A22, A23), A3 (= A31, A32) is equivalently laid out as

A = A1,
    A2 (= A21, A22, A23),
    A3 (= A31, A32)
an expression does not provide sufficient explanation about the aspect being described. Therefore it may be wise to define dedicated sets of keywords for software development projects. When it is decided to use keywords, a different style will be applied for Express: less use of definitions by means of the equals symbol, and more definitions within sections like the TYPE, CONST, VAR, and PROCEDURE sections in the programming languages PASCAL and MODULA-2. It is recommended to use capitals or boldfaced characters for keywords, and lower-case characters for other words.

A way of working with keywords is the use of templates: outlines of texts with keywords and fields that are to be filled with Express expressions or natural language. The context of the expressions will then be determined by the surrounding keywords. The software engineer or his organization can decide what templates he will use. Some templates are presented below. They are meant for requirements analysis and design phases in software engineering. Fields between '(' and ')' brackets are to be filled by the software engineer: for (expression) an Express expression, for (expression,..) a number of expressions, for (ident) an identifier (a word or group of words), and for (text) text in natural language. The engineer may use the following abbreviations for predefined data types:
or as A = A1, A2 (= A21, A22, A23), A3 (= A31, A32)
KEYWORDS AND TEMPLATES
Express is in a certain sense a small language. It does not contain keywords like ENTITY, RELATION, KEY, TYPE, RECORD, ARRAY, SET, CONSTANT, VARIABLE, MODULE, PROCEDURE, FUNCTION, REPEAT, and CASE. Express limits itself to the set of operators already discussed. The reason for this is the aim of Express to support the entire cycle of software engineering. There are many aspects of software engineering that need illumination, depending on the situation. Apart from the aspects that the keywords mentioned above focus on, additional information can be thought of, such as author, date, version and purpose of formal notations, expected and measured file activity, turnover and growth, input and output data streams and commands, and pre- and postconditions; this list is far from complete. Special keywords and language constructs could be included in Express to denote these aspects, but that would make Express complicated. Moreover, the choice of keywords would inevitably be arbitrary and incomplete. The operators of Express can be seen as a special kind of keywords, or replaced by keywords such as IS and AND, but special symbols like = and & are more striking and require less space. Because of the absence of natural language keywords, Express is suitable for many situations, just as mathematical expressions are. Express is also accessible to people who speak different languages. Keywords are necessary, however, when the context of
• S1, S2, ...: strings of at most 1, 2, ... symbols.
• I1, I2, ...: an integer number, represented as a string of at most 1, 2, ... symbols.
• R2.1, ...: a real number, represented in fixed-point notation consisting of 2 digits before and 1 behind the decimal point.

In the next templates it is possible to fill in the entity types that the system being developed should deal with. The first template is meant for an enumeration. The second template is meant for a table that defines attributes for entities by means of the equals sign, the comma, the colon, the semicolon, the bar, and the underscoring.

ENTITIES (ident) = (expression)

ENTITY   = ATTRIBUTES
(ident)  = (expression)
Relationships between entities can be notated into the next template, where the arrow operator is used:

ENTITY        -- RELATION (= ATTRIBUTES)  --> ENTITY
......................................................
(expression)  -- (expression)             --> (expression)
The following template is meant for illuminating entities one by one: their attributes, relations with other entities, and a general description.

ENTITY (ident)
ATTRIBUTES (expression;..)
RELATIONS (expression;..)
DESCRIPTION (text)
When a design for a relational database is made, it is relevant to state what attributes are dependent on others, and what attributes are key attributes. The following template is meant for this purpose and is filled with the operators arrow, comma, and underscoring. Such an attribute table can serve as a base for a normalization process.
the others are detail cards. Each detail card carries an integer amount. It is required to produce a report showing the totals of amount for all keys. For this problem Jackson presents the diagrams shown in Figure 1 and following pseudocode: CARDFILE-REPORT
ATTRIBUTES    --> DEPENDENT ATTRIBUTES
..............................................
(expression)  --> (expression)
A table that contains the names of the relations that result from the normalization process is suggested by the following template:

RELATION  = FIELDS
.......................................
(ident)   = (expression)
The structures of files being designed are illuminated by the next template: what the medium and the format will be, and what systems have access to it.

FILE (ident)
MEDIUM (text)
FORMAT (expression)
SIZE (text)
[R|W] ACCESS (expression)
DESCRIPTION (text)
Appendix 1 shows examples of the use of these and other templates.
EXAMPLE: JACKSON STRUCTURED PROGRAMMING (JSP)

As stated before, Express can be used within existing software development methods. Here this is demonstrated for Jackson Structured Programming. The first step in this method is to describe the structure of relevant data items by means of so-called Jackson diagrams. A Jackson diagram is a tree with the root node at the top. The nodes in the tree represent items and are drawn as labelled boxes. Items can be refined by attaching subtrees to them. When an item consists of a sequential structure, the box that corresponds to the item has an array of subtrees hanging from it, placed from left to right. Boxes can be annotated with stars and circles, denoting repetition and choice. The second step is to compose these data structures into a global program structure, again visualized by a Jackson diagram. In the last step, the executable operations required are listed, and each one is allocated to its right place in the program structure, which is notated as pseudocode.

An example, taken from Jackson 2, shows how things can be notated as well in Express: A cardfile of punched cards is sorted into ascending sequence of values of a key which appears in each card. Within this sequence, the first card for each group of cards with a common key value is a header card, while
sequence
    open cardfile; read cardfile; write title;
    REPORT-BODY iteration until cardfile.eof
        total := 0; groupkey := header.key; read cardfile;
        GROUP-BODY iteration until cardfile.eof or detail.key <> groupkey
            total := total + detail.amount; read cardfile;
        GROUP-BODY end
        write totalline (groupkey, total);
    REPORT-BODY end
    close cardfile;
CARDFILE-REPORT end
Three Express alternatives for the data structure descriptions are presented here. The first is similar to the specification that Jackson provides in BNF. It gives a clear top-down presentation, but it is produced much faster than the corresponding Jackson diagrams:

cardfile    = group;..
group       = header; groupbody
groupbody   = detail;..
report      = title; reportbody
reportbody  = totalline;..
Three of these five definitions can be nested within the other two, which is done in the second alternative. It is more concise, while still containing the top-down information. However, this nesting decreases clarity:

cardfile  = group (= header; groupbody (= detail;..));..
report    = title; reportbody (= totalline;..)
In the third Express alternative the top-down information has been omitted, yielding a clear and concise description:

cardfile  = header; (detail;..);..
report    = title; (totalline;..)
For the global program structure there are again three alternatives for the diagram that Jackson supplies. The shortest one is given:

consume cardfile and produce report =
    produce title; (consume header; (consume detail;..); produce totalline;..)
Now the global program structure can be transformed into a detailed program structure. Again, of three possibilities the shortest one is given:

consume cardfile and produce report =
    open cardfile; read cardfile; write title;
    ( total := 0; groupkey := header.key; read cardfile;
      ( total := total + detail.amount; read cardfile;.. );
      {cardfile.eof | -(detail.key = groupkey)}: write totalline (groupkey, total);
    ..);
    {cardfile.eof}: close cardfile
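For comparison, the detailed program structure can be transcribed almost line by line into a conventional programming language. The Python sketch below is one possible reading, not Jackson's or the author's code. It assumes that each card is a tuple (kind, key, amount), that the report is returned as a list instead of being printed, and that the title is the placeholder string "TITLE"; all of these are assumptions made for the illustration.

```python
def cardfile_report(cards):
    """Produce the report for a cardfile sorted by key.

    `cards` is a sequence of (kind, key, amount) tuples in which the first
    card of every key group is the header card (an assumed record layout).
    """
    report = ["TITLE"]                    # write title
    cards = iter(cards)                   # open cardfile
    card = next(cards, None)              # read cardfile (read ahead)
    while card is not None:               # one iteration per group
        total = 0                         # total := 0
        groupkey = card[1]                # groupkey := header.key
        card = next(cards, None)          # read cardfile
        # Consume detail cards until end of file or a different key.
        while card is not None and card[1] == groupkey:
            total += card[2]              # total := total + detail.amount
            card = next(cards, None)      # read cardfile
        report.append((groupkey, total))  # write totalline (groupkey, total)
    return report                         # closing the file is implicit here


example_cards = [
    ("header", "A", None),
    ("detail", "A", 3),
    ("detail", "A", 4),
    ("header", "B", None),
    ("header", "C", None),
    ("detail", "C", 5),
]
print(cardfile_report(example_cards))
# ['TITLE', ('A', 7), ('B', 0), ('C', 5)]
```

The two nested loops correspond to the two nested iterations of the Express expression, and the condition of the inner loop is the negation of the guard {cardfile.eof | -(detail.key = groupkey)}.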
Figure 1. Jackson diagrams to represent cardfile example
Selection and backtracking, other main themes in the JSP technique, are handled in Express by means of the bar and the per cent symbol. From the foregoing, it can be concluded that Express can well be applied within the three stages of JSP. Originally in JSP, diagrams are used for representing both data structures and global program structures, and a pseudocode is used for detailed program structures. All can be done as well in the form of expressions, enabling a smooth transition between the three stages.
EXAMPLE: ENTITY-RELATIONSHIP MODELLING

Express can also be applied for entity-relationship modelling. The cardfile example from the previous section gives an illustration. Table 1, a filled template, summarizes in Express what entities and relations exist.

Table 1. Filled template, summarizing entities and relations in Express

ENTITY     -- RELATION   --> ENTITY
Cardfile   --Contains    --> Header
Cardfile   --Contains    --> Detail card,..
Report     --Contains    --> Title
Report     --Contains    --> Totalline,..
Report     --Is about    --> Cardfile
Totalline  --Summarizes  --> Detail card,..

Diagrams are another way to represent entities and relations. In the widely applied standard of Chen for entity-relationship diagrams 3, arrow directions denote 1-1 and 1-n relations. Relation texts are placed in diamonds. An entity-relationship diagram for the cardfile example, according to the Chen standard, is shown in Figure 2.

Figure 2. Entity-relationship diagram to represent cardfile example, according to Chen standard

The direction in which a relation text between two entities must be read is in general not clear from such a diagram and may well be opposite to an arrow. This turns out to be a drawback of the Chen standard. In Express specifications of relations, the directions of the arrows denote the directions for the names of the relations, and 1-n and m-n relations are marked with pairs of periods near the corresponding entities, which is much clearer. Therefore, it is recommended to reserve the arrow direction in entity-relationship diagrams to indicate the relation text, and to adopt an annotation at the end points of the arrows for denoting 1-n and m-n relations. It is possible to take the pair of periods, in imitation of Express. Moreover, the diamonds are not needed in many cases, so that relation texts can appear as labels at the arrows. This simplifies the production of the diagram. The proposed diagram for the cardfile example is shown in Figure 3.
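The convention that the arrow gives the reading direction of the relation name, while the pair of periods marks the 'many' side, is simple enough to process mechanically. The Python sketch below parses the rows of Table 1 into records. It assumes an ASCII rendering of the arrows as `--name-->` and only looks for the pair of periods on the target side; the names Relation, ROWS, and parse_row are hypothetical and chosen for this illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    name: str
    target: str
    many: bool  # True when the target is followed by the pair of periods


# Rows of Table 1, written with an assumed ASCII form of the arrows.
ROWS = [
    "Cardfile --Contains--> Header",
    "Cardfile --Contains--> Detail card,..",
    "Report --Contains--> Title",
    "Report --Contains--> Totalline,..",
    "Report --Is about--> Cardfile",
    "Totalline --Summarizes--> Detail card,..",
]


def parse_row(row):
    """Split 'source --name--> target[,..]' into a Relation record."""
    source, rest = row.split("--", 1)
    name, target = rest.split("-->", 1)
    target = target.strip()
    many = target.endswith(",..")
    if many:
        target = target[: -len(",..")].strip()
    return Relation(source.strip(), name.strip(), target, many)


relations = [parse_row(row) for row in ROWS]
for relation in relations:
    kind = "1-n" if relation.many else "1-1"
    print(f"{relation.source} {relation.name.lower()} {relation.target} ({kind})")
```

Because the relation name is always read from source to target, each record can be turned into a sentence without the ambiguity noted for the diamonds of the Chen standard.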
TOOLS FOR EXPRESS

Useful tools for Express are simple, depending on the way Express is applied:
• Express needs tools for processing texts, such as pencil and paper, blackboard and chalk, or, on computers, simple text editors and word processors.
• When Express is used in combination with templates, help will be offered on computers by more sophisticated tools that can present a choice of templates to the software engineer. This will decrease the mental ballast and ease insertion of templates into texts. Such tools are text editors and word processors with facilities to define templates or macros, or with facilities to handle several text files at the same time, thus allowing the software engineer to copy pieces of text from one file to another. In recent years a number of so-called idea processors have appeared, e.g., the program More on Apple Macintosh computers. Besides basic editing functions, More offers outline views and the use of templates.
• When Express is applied for the formal description of behaviour of programs, the SCRIPTIC package is useful. SCRIPTIC can be described as a superset of conventional programming languages such as MODULA-2, PASCAL, or C, which uses Express expressions for the definition of program structures. SCRIPTIC was originally meant as a tool for rapid prototyping of user interfaces, but it turned out to be a high-level programming language useful for implementations. SCRIPTIC also offers part of the functionality of object-oriented and parallel programming languages, and of parser generators and PROLOG. SCRIPTIC can therefore be applied well in areas such as simulation of communication protocols and development of parsers. Because of its fundamental basic constructs, SCRIPTIC is a tool for those who study theories like Process Algebra. Appendix 2 provides more information.
Figure 3. Proposed entity-relationship diagram to represent cardfile example, with direction in which relation text between two entities clarified

CONCLUSION
Express has been proposed as a sober but powerful standard for notations, offering a good prospect for practical applications. It is better suited for the representation of hierarchical structures and behaviour than for the representation of more general structures such as dataflows between functions and relationships between entities. Above all, the clarity of dataflow diagrams 12 is hard to match with Express. There are situations in which it will be more convenient to use tables, matrices, data dictionaries, algebraic specification languages, conventional programming languages, and fourth-generation languages. Express should be applied when it is a better alternative, which should prove to be the case on trying. The simplicity and power of the SCRIPTIC language show that the underlying notations proposed by Express are sound, at least in the context of describing behaviour.
In the future several activities with respect to Express will take place. Express itself will be evaluated and might be enhanced, which does not necessarily mean enlarged. There are development activities for the SCRIPTIC package, and a comparable product for database design might be developed as well. Express and its tools should be tried out in practical software engineering environments. This requires the cooperation of professional software engineers, who are invited to contact the author.
ACKNOWLEDGMENTS

The author thanks Thiel Chang and Lex Hendriks of the Dutch GAK-automation organization for their useful suggestions on Express. Dick Bruin, Kurt Anzenhofer, and Frans Peters of the Department of Computer Science at the University of Leiden had useful comments on earlier versions of this paper that were gratefully adopted. Paul Kranenburg and Dony van Vliet contributed to the implementation of the SCRIPTIC language.

REFERENCES

1 Page-Jones, M The practical guide to structured systems design Yourdon Press, New York, NY, USA (1980)
2 Jackson, M A Principles of program design Academic Press, London, UK (1975)
3 Chen, P P 'The entity-relationship model: toward a unified view of data' ACM Trans. Database Syst. Vol 1 No 1 (March 1976) pp 9-37
4 International Organization for Standardization 'Information processing systems--open systems interconnection, LOTOS--a formal description technique based on the temporal ordering of observational behaviour' ISO/TC97/SC21 DP 8807 ISO, Geneva, Switzerland (1985)
5 Bergstra, J A and Klop, J W 'Algebra of communicating processes' in de Bakker, J W, Hazewinkel, M and Lenstra, J K Proc. CWI Symp. Mathematics and Computer Science North-Holland, Amsterdam, The Netherlands (1986) pp 61-94
6 Bergstra, J A and Klop, J W 'Process algebra for synchronous communication' Inf. Control Vol 60 (1985) pp 77-121
7 Hoare, C A R Communicating sequential processes Prentice-Hall, Englewood Cliffs, NJ, USA (1985)
8 Bjørner, D and Jones, C B Formal specification and software development Prentice-Hall, Englewood Cliffs, NJ, USA (1982)
9 Backus, J W 'The syntax and semantics of the proposed international algebraic language of the Zürich ACM-GAMM Conference' in ICIP Proc. (Paris, France (1959)) Butterworths, London, UK (1960) pp 125-132
10 Meyer, B Object oriented software construction Prentice-Hall, Englewood Cliffs, NJ, USA (1988)
11 Hoare, C A R et al. 'Laws of programming' Commun. ACM Vol 30 No 8 (August 1987)
12 Yourdon, E and Constantine, L L Structured design Yourdon Press, New York, NY, USA (1978)
13 Huijsmans, D P, van Delft, A J E and Knip, C A C 'Waalsurf: molecular graphics on a personal computer' Comput. Graph. Vol 11 No 4 (1987) pp 449-458
APPENDIX 1. EXAMPLES OF USE OF TEMPLATES

This appendix presents some examples of the use of the templates described in the section 'Keywords and templates'. They originate from the development of the program Waalsurf at the University of Leiden 13. Waalsurf is meant to support chemists in drawing molecules on computer screens. Only small pieces of the requirements definition and design of Waalsurf are shown, all in Univers Light font.
Requirements definition

Waalsurf can communicate with a user by means of a keyboard, a mouse device, a display, and a printer, depending on their availability. In the following, 'User' will refer to this user, but if applicable he may be referred to with 'Keyboard', 'Mouse', 'Display', and 'Printer'. Moreover, Waalsurf deals with molecular data files originating from chemical databases.

EXTERNAL ENTITIES Waalsurf = User (= Keyboard, Mouse, Display, Printer), Chemical database

EXTERNAL ENTITY    <-> FLOW
.............................................................................
Keyboard           --> Command to be executed, Molecule name, Label text
Mouse              --> Command to be executed, Atoms and groups to be selected
Display            <-- Choice menu, Status report, Picture of molecule
Printer            <-- Status report, Picture of molecule
Chemical database  <-> Molecule data file
EXTERNAL ENTITY Printer
DESCRIPTION Should be able to print both text and colours. Colour printers are supported by Waalsurf ...
Waalsurf manages data about molecules, atoms, bonds, elements, groups and types of groups. Molecules are identified with a name of at most 36 characters, and are often identified with a formula consisting of at most 36 alphanumeric characters. Molecules consist of atoms, which have fixed positions with respect to one another. There are about a hundred different types of atoms, which are called elements. Elements have a name consisting of at most 12 characters, but they can be identified with a chemical symbol of 1 or 2 characters as well. An atom has an ion-radius and a covalence radius, which is characteristic for its type of element. Sometimes a molecule is considered as a set of groups of atoms. For proteins and DNA there are about 35 types of groups, which are identified by a name of at most 12 characters or a code of 3 characters. A single group can contain several atoms of 1 element, each of which can have a special role. Chemists denote the types of these roles by a code of 1 character behind the chemical symbol of the atom. Conventions exist for colouring atoms and groups on the basis of their type, but chemists sometimes want to define colours themselves. In addition, chemists use numberings for the atoms and groups in molecules. Sometimes they wish to define a name themselves. Atoms in molecules are generally bound to one another. Each atom has at most 8 bonds with other atoms. Sometimes chemists wish to give the bonds special colours.
ENTITIES Waalsurf = Molecule, Atom, Element, Group, GroupType, Bond

ENTITY Molecule
ATTRIBUTES Name:S36, Formula:S36
RELATIONS --contains--> Atom;..
DESCRIPTION At most 1000 atoms will be required per molecule

ENTITY     = ATTRIBUTES
.............................................................................
Molecule   = Name:S36, Formula:S36
Atom       = Number:I, Colour, Position:(X,Y,Z:R), (special name:S12|-)
Element    = Name:S12, Colour, Symbol:S2, IonRadius:R, CovalenceRadius:R
Group      = Number:I
GroupType  = Name:S12, Colour
Bond       = Colour

ENTITY      -- RELATION (= ATTRIBUTES)     --> ENTITY
.............................................................................
Molecule    --contains                     --> Atom,..
Atom        --is a                         --> Element
Bond        --(connects with               --> Atom --> Atom)
Atom        --belongs to (= Role:S1|-)     --> Group
Group       --is a                         --> GroupType
Element,..  --belongs to (= Role:S1|-)     --> GroupType,..
Waalsurf can present molecules with special spatial effects. The user can choose from wireframes, ball-stick models and Vanderwaals models. Waalsurf offers a set of functions for spatial manipulation (movement, rotation, scaling), as well as animations, file handling (molecule data, elements and group types), colouring, labelling of atoms and groups with indexes and names, selection (of part of the molecule, for colouring and labelling), output to printers (including colour printers) and cameras, editing (addition and deletion of atoms and bonds) and demonstrations (creation, display and modification).

FUNCTIONS Waalsurf =
    display in several representations (= (normal|stereo), (wireframe|ball-stick|Vanderwaals)),
    spatial manipulation (= movement, rotation, scaling),
    file handling (= (molecules, elements, grouptypes), (reading|writing)),
    output for hardcopy (= to black/white printer, to colour printer, to camera),
    editing (= (addition|deletion), (atom|bond)),
    demonstrations (= creation, display, modification),
    animation, colouring, labelling, selection, ...
FUNCTION colouring
IN   Element,..    --> colour,..
     GroupType,..  --> colour,..
     Selection     --> part of molecule that will have new colour
     User          --> (new colour,.. | colouring is according to (element,.. | grouptype,..))
OUT  atom;..       <-- colour,..
CONTROL user modifies colour set-up by means of a dedicated menu.
DESCRIPTION This function allows the user to adjust the colours of the atoms on display. A change of the colour attributes of the atoms will result in a different picture on the screen, after redrawing ... The layout for the corresponding colouring menu is: ...
PERFORMANCE When redrawing of the molecule is expected to take less than 1.0 second, it will be carried out each time a colour attribute is changed. In other cases, long response times are avoided because redrawing must be activated by the user. For this purpose a prompting button appears in the menu. Redefinition of default colours for elements and grouptypes is not supported by Waalsurf. This is done by the user by means of a text editor.
SOURCE/DESTINATION                  <- DATAFLOW ->                   DESTINATION/SOURCE
Display in several representations  <- Position, Colour, bond,.. --  Atom,..
Spatial manipulation                <- Position,.. ->                Atom,..
Colouring                           <- colour,.. --                  Element,.., GroupType,..
Colouring                           <- part of molecule --           Selection
Colouring                           <- new colour, OK, Cancel, colouring according to (element|grouptype),.. --  User
Colouring                           -- colour,.. ->                  Atom,..
The program Waalsurf will consist of 1 file, which on IBM PCs will have the name Waalsurf.COM, on Sun 3 Waalsurf.EXE, ... Data files for Waalsurf will be Element.DEF and Group.DEF, and files generated by the user, with extensions .DAT and .STP. Below, '*' will be used as a wild-card: for example, '*.DAT' means 'files of which the name has extension .DAT'.
PROGRAM FILES Waalsurf = {IBM PC}:Waalsurf.COM | {Sun 3}:Waalsurf.EXE | ...
DATA FILES Waalsurf = Element.DEF, Group.DEF, *.DAT, *.STP
The Waalsurf program files will be generated from Modula-2 source code modules. The names of these modules all start with the string 'ws'. The main program resides in wsMain, facilities for file input and output in wsFileIO, drawing facilities in wsDraw, having local modules for wireframes, ball-stick models and Vanderwaals models, spatial manipulation facilities in wsManipulate, animation in wsAnimate, adjustment to the colouring in wsColourAdj, selection in wsSelect, ... Device dependent source code will be in the modules wsPCDevice for IBM PCs, and wsSun3Device for Sun.
MODULES Waalsurf = wsMain, wsFileIO, wsMolData, wsManipulate, wsAnimate, wsColourAdj, wsSelect,
  wsDraw (= WaalDraw, BallDraw, StickDraw, StereoDraw),
  ({IBM PC}:wsPCDevice | {Sun 3}:wsSun3Device | ...), ...
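The wild-card convention used in the DATA FILES expression corresponds to ordinary shell-style pattern matching. As a minimal sketch, the Python standard library function `fnmatchcase` can test a file name against the four patterns; the helper name `is_waalsurf_data_file` and the example file names are hypothetical, and case-sensitive matching is an assumption.

```python
from fnmatch import fnmatchcase

# Patterns copied from the DATA FILES expression.
DATA_FILE_PATTERNS = ["Element.DEF", "Group.DEF", "*.DAT", "*.STP"]


def is_waalsurf_data_file(filename):
    """Return True if the name matches one of the DATA FILES patterns.

    fnmatchcase compares case-sensitively on every platform, so
    "x.dat" does not match "*.DAT" in this sketch.
    """
    return any(fnmatchcase(filename, pattern) for pattern in DATA_FILE_PATTERNS)
```

For instance, a user file named `benzene.DAT` would be accepted, whereas the program file `Waalsurf.COM` would not.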
REQUIREMENT ITEM                    <- IMPLEMENTING ITEM
Display in several representations  <- wsDraw
Spatial manipulation                <- wsManipulate
Animation                           <- wsAnimate
Atom                                <- wsMolData.(AtomType, Molecule)
Molecule                            <- *.DAT, wsMolData.(MoleculeType, Molecule)
Element                             <- Element.DAT, wsMolData.(ElementType, Element)
vol 31 no 3 april 1989
SOURCE/DESTINATION  <- DATAFLOW ->                   DESTINATION/SOURCE
wsDraw              <- Position, Colour, bond,.. --  wsMolData
wsManipulate        <- Position,.. ->                wsMolData
wsColourAdj         -- colour,.. ->                  wsMolData
wsColourAdj         <- Atom,.. --                    wsSelect
wsColourAdj         <- new colour, OK, Cancel, colouring according to (element|grouptype) --  Keyboard, Mouse
SOURCE/DESTINATION                 <- [P|C|T|V] TRANSPORTED ITEM --               DESTINATION/SOURCE
wsDraw, wsManipulate, wsColourAdj  <- [T] MoleculeType, AtomType, [V] Molecule --  wsMain
wsColourAdj                        <- [P] SelectionMenu --                         wsSelect
FILE Element.Def
MEDIUM Internal hard disk or floppy disk
FORMAT 'WAALSURF ELEMENTS FILE'; Linefeed; (Name:S12; Space; ColourCode (= Red:I3; Space; Green:I3; Space; Blue:I3); Space; Symbol:S2; Space; Ionradius:R6; Covalenceradius:R6; Linefeed;..); EndOfFile
SIZE 3 to 6 kilobytes
[R|W] ACCESS wsFileIO [R], text editor [W]
DESCRIPTION Contains information about elements: name, default colour (RGB values in interval [0..255]), chemical symbol, and ion and covalence radii measured in ångström.
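The FORMAT field above is precise enough to drive a reader for one element record. The Python sketch below assumes that S12 and S2 denote strings of 12 and 2 characters, that each colour component is an integer of 3 characters, and that R6 denotes a real number of 6 characters; this reading of the field codes is an assumption. The names `Element`, `format_element_line` and `parse_element_line` are hypothetical and do not come from wsFileIO, and the record used in the usage note is a placeholder, not real element data.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    colour: tuple            # (red, green, blue), each in 0..255
    symbol: str
    ion_radius: float        # in angstrom
    covalence_radius: float  # in angstrom


def format_element_line(e):
    """Write one record: Name:S12 Space Red Space Green Space Blue
    Space Symbol:S2 Space Ionradius:R6 Covalenceradius:R6."""
    r, g, b = e.colour
    return (f"{e.name:<12} {r:3d} {g:3d} {b:3d} {e.symbol:<2} "
            f"{e.ion_radius:6.2f}{e.covalence_radius:6.2f}")


def parse_element_line(line):
    """Read one fixed-width record using the assumed column layout."""
    return Element(
        name=line[0:12].strip(),
        colour=(int(line[13:16]), int(line[17:20]), int(line[21:24])),
        symbol=line[25:27].strip(),
        ion_radius=float(line[28:34]),
        covalence_radius=float(line[34:40]),
    )
```

Writing a placeholder record such as `Element("Demo", (10, 200, 255), "Dm", 1.25, 0.5)` with `format_element_line` and reading it back with `parse_element_line` should give an equal record, which is a quick check that the column positions agree with each other.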
APPENDIX 2. PROGRAMMING LANGUAGE SCRIPTIC
SCRIPTIC is a superset of conventional programming languages such as MODULA-2, PASCAL, and C. It uses Express expressions for the definition of program structures. The SCRIPTIC extension of MODULA-2 consists of only six keywords and three symbols, since MODULA-2 already includes most Express symbols. Currently, a programming system for SCRIPTIC is under development on Sun computers. The first commercial version should become available in the summer of 1989, consisting of a preprocessor and a run-time system. The preprocessor translates SCRIPTIC texts into MODULA-2. The part of the preprocessor that parses the input texts is written in SCRIPTIC itself. The run-time system handles process scheduling, among other things, and it has been designed with the help of Express.
As in MODULA-2 and PASCAL, the main program in SCRIPTIC is placed at the bottom of the module, but here it is an Express expression instead of a statement sequence. The typical SCRIPTIC construct for refinements is the SCRIPT: a kind of template that can be compared with a conventional MODULA-2 procedure. It may have formal parameters and local variables. A SCRIPT has a body that is an expression, whereas MODULA-2 procedures are built as sequences of statements.
The expressions are as usual built with operators, operands, and parentheses. The following operators are supported: the semicolon for sequential composition, the bar for alternative composition, the comma for parallel composition, the per cent symbol for termination, and the colon for marking guards. Guards are Boolean expressions in MODULA-2, and they are enclosed by braces. As operands, names of scripts can occur with optional actual parameters, which may be compared with procedure calls. Other kinds of operands are the minus and the pair of periods (behind a comma or a semicolon, denoting parallel and sequential iterations), and pieces of MODULA-2 code, which are placed between braces. As an example, part of a SCRIPTIC version of the Game of Life is presented.
In the Game of Life a human player defines which cells of the playing board are alive or dead, so that an initial pattern of living cells appears on a computer screen. When the player gives the command, the pattern on the board will start to change according to a certain algorithm.
  SCRIPTIC MODULE Life;
  FROM LifeAlgorithm IMPORT Board ....; (* this is common Modula-2 code... *)
  SCRIPTS
    Help         = Key ('?'); {Info};..
    Game         = Edit;..; Start; ({Generation};..); StopToEdit;..
    Edit         = Left | Right | Up | Down | Live | Dead
    SpeedControl = (Faster | Slower);..
    Faster       = Key (FasterKey); {IF Speed < MaxSpeed THEN Speed := Speed+1 END}
    Slower       = ...
    Left         = Key (LeftKey); ({x = 0}:{x := XLimit} | {x <> 0}:{x := x - 1}); {Position (x,y)}
    Live         = {NOT Board [x,y]}: Key (LiveKey); {Set (Board[x,y])}
    StopSession  = Key ('x'); {Print ('Exit Life')}
  BEGIN
    {Initialisations; Info}; (Game, SpeedControl, Help)
    % StopSession
  END Life.
Here scripts are defined within a SCRIPTS section. A script may also be defined in the way a procedure is declared in MODULA-2, e.g., SCRIPT Info; BEGIN {Print (HelpMessage)} END Info; By analogy with scripts, procedures in SCRIPTIC may be defined within a so-called PROCEDURES section. The above program states that first of all some initializations are performed, and information is displayed on the screen about the operation of the game. Thereafter the script Game actually starts. In parallel with Game two facilities exist, SpeedControl and Help. These enable the player to control the speed of the game and to get help. At any time the player can press the x key to activate the script StopSession, which will halt the program. The game itself consists of an iteration of editing of the board and generation of new patterns. Editing, or defining cells, is done with the help of a cursor, which can be moved in any direction by pressing keys. Generation of new patterns may be halted by the player in order to return to editing mode.
SCRIPTIC supports more parallelism than the example shows. Subexpressions that are executed in parallel to each other can communicate in the way communication occurs in Process Algebra. For example, when it has been defined in the SCRIPTS section that a,b = {ModulaCode}, then the parallel activation of a and b may result in communication, which is the execution of the MODULA-2 code. Moreover, the expression x,.. denotes zero or more objects or processes with behaviour x. In combination with communication or guards this construct enables basic object-oriented programming and building systolic arrays. There is no need for additional language constructs like Create that are included in most object-oriented languages.
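The three composition operators of this appendix (semicolon, bar and comma) can be illustrated by enumerating the action sequences that a small expression allows. The Python sketch below is not the SCRIPTIC run-time system: it assumes that parallel composition means arbitrary interleaving of atomic actions, it leaves out guards, iteration, termination and communication, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Act:          # an atomic action, e.g. a piece of code in braces
    name: str


@dataclass(frozen=True)
class Seq:          # x ; y   sequential composition
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Alt:          # x | y   alternative composition
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Par:          # x , y   parallel composition
    left: "Expr"
    right: "Expr"


Expr = Union[Act, Seq, Alt, Par]


def interleavings(a, b):
    """All merges of tuples a and b that keep the order inside each."""
    if not a:
        return {b}
    if not b:
        return {a}
    return ({(a[0],) + rest for rest in interleavings(a[1:], b)}
            | {(b[0],) + rest for rest in interleavings(a, b[1:])})


def traces(expr):
    """Set of complete action sequences allowed by the expression."""
    if isinstance(expr, Act):
        return {(expr.name,)}
    if isinstance(expr, Seq):
        return {l + r for l in traces(expr.left) for r in traces(expr.right)}
    if isinstance(expr, Alt):
        return traces(expr.left) | traces(expr.right)
    if isinstance(expr, Par):
        return {t for l in traces(expr.left) for r in traces(expr.right)
                for t in interleavings(l, r)}
    raise TypeError(f"not an expression: {expr!r}")
```

Under these assumptions, `traces(Seq(Act("a"), Alt(Act("b"), Act("c"))))` yields the two sequences a b and a c, and `traces(Par(Act("a"), Act("b")))` yields a b and b a.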