The role of PASCAL in software engineering

PASCAL was the first programming language to implement many of the concepts now seen as fundamental to good software engineering practice. Patricia Samwell introduces and illustrates the use of the major PASCAL constructs

The second of a series of papers on modern high-level programming languages for microprocessors introduces PASCAL, which was the first language to implement strong data typing and support for structured programming. The major language constructs (program structures, control structures, simple and structured data types etc.) are discussed with respect to their support for good software engineering practice. The development of a program to simulate the behaviour of a simple computer is outlined as a programming example.

high-level languages    PASCAL    software engineering
The programming language PASCAL¹ was developed by Niklaus Wirth between the years 1968 and 1973. It is notable for the fact that it was the first practical language to implement many of the concepts, such as strong data typing and control structures to support structured programming², that are seen as fundamental to good software engineering practice. It is now a widely used language defined by an International Standards Organization (ISO) standard, and a measure of its success is that almost all subsequent imperative programming languages that have gained a wide degree of acceptance have been based on PASCAL.

An earlier paper³ in this series discussed the criteria against which programming languages should be measured, namely their support for the design and production of correct, reliable, maintainable software. In discussing PASCAL, therefore, we should consider the extent to which it meets these criteria.

The major problem faced by designers and implementors of realistically large software systems is that of complexity, and the principal tools to deal with complexity are the processes of abstraction and refinement⁴. Consequently, programming languages should support expression of abstract concepts but must also provide means of realizing the concepts. To allow the programmer to express abstract concepts, PASCAL provides a hierarchical program structure and user definition of data types; realization is achieved by performing a series of transformations, or refinements, on the original procedures and data structures until an executable form is obtained.

If we assume that the programmer has expressed the original model of the solution correctly (and the likelihood of this being so will be influenced by how closely the language being used matches the programmer's concept of the problem) then the correctness and reliability of the final program will depend on the accuracy of the refinements made to the original solution. As the refinements represent only informal (as opposed to mathematically formal) transformations, their preservation of correctness will depend primarily on the clarity and ease with which they can be made. Aspects of PASCAL which facilitate this process are, for example, unambiguous control structures with a single entry point and a single exit point, and hierarchical data types with compiler-enforced type checking. Meaningful choice of data names (and appropriate comments) also helps the programmer to implement ideas and is a primary means of communication when it comes to program maintenance.

Software development by successive refinement allows the maintenance programmer to follow each explicit design decision which has been made. If subsequently there are changes to the software requirements or environment, it is possible to see clearly which levels of abstraction are affected, and the extent of amendment required.

This paper is intended to present the characteristics, level and scope of use of PASCAL, and to indicate the support it can offer in software engineering practice. The next section provides an informal introduction to the major language constructs. For a more rigorous definition the reader is referred to one of the many available textbooks⁵ on the subject. After this the development, in outline, of a program to simulate the behaviour of a simple computer is described. This is given as an illustration of the refinement of an abstract concept (in this case, a process) to an executable PASCAL program.

Centre for Information Engineering, The City University, Northampton Square, London EC1V 0HB, UK
MODULA-2 will be the subject of the next paper in this series (Vol 11 No 4 May 1987)
0141-9331/87/03141-08 $03.00 © 1987 Butterworth & Co. (Publishers) Ltd Vol 11 No 3 April 1987
PASCAL LANGUAGE CONSTRUCTS

The concepts of abstraction and refinement are the programmer's principal aids in developing computer programs. Presented with the details of a complex problem, the programmer can abstract to a higher level at which the concept to be realized is isolated from details of its implementation and environment.

[Diagram: abstraction leads from the details to the concept; refinement leads from the concept back to the details]

Once the concept has been isolated, a model for a solution can be developed. The solution will be implemented by refining the model until it is realizable, i.e. until it is expressed in an executable language. With this approach³ programs are developed by successive refinements of (active) algorithms and the (passive) data structures on which they operate.

The basic mechanisms provided by PASCAL to support refinement are the procedure, for algorithms, and the user-defined type for data structures. Underlying the procedure are PASCAL control statements and underlying user-defined data types are standard PASCAL data types and structures. To put it another way, procedures and user-defined data types allow programmers to abstract away from PASCAL's language constructs towards a model of the concept they wish to realize.

The language constructs provided by PASCAL can be summarized under the headings of program structures, control structures, and data types and structures.

Program structures

A PASCAL program has the following (simplified) form.

    program name (input, output);
    {this is the outline structure of a program}
    constant definitions;
    type definitions;
    variable declarations;
    procedure and function declarations;
    begin
      body of program
    end.

The parameter list following the program name specifies the channels used by the program to communicate with its environment, the default values generally being 'keyboard' and 'screen'. Note that in PASCAL the semicolon acts as a statement separator and not as a statement terminator, and that a comment is delimited by braces { }.

The body of a PASCAL program is a sequence of statements. A procedure (or function) is a subprogram that has essentially the same form as a program except that it starts with the word 'procedure' (or 'function'). The statements in the program body manipulate program variables by the basic operations of input, output and assignment, the sequence of these operations being determined by control statements.

Simple data types

'Boolean', 'integer', 'real' and 'char' are standard PASCAL data types. Some examples are shown below.

    const
      low = 0;
      high = 100;
    var
      firstvalue, secondvalue, thirdvalue: integer;
      inrange: boolean;
      minimum: integer;
      operand1, operand2, result: real;
      operator: char;

The constants 'low' and 'high' are of type 'integer'. The variables are all of scalar data type. The programmer can define further scalar types ('user-defined types') by specifying a subrange of an existing type, i.e.

    type
      byte = 0 .. 255;
    var
      register: byte

or by enumeration.

    type
      day = (mon, tue, wed, thu, fri, sat, sun);
    var
      today: day

Control structures

PASCAL control structures fall into two main categories: conditional (if, case) and repetitive (while, repeat, for). The 'if' statement may take either the form 'if ... then', or the form 'if ... then ... else' (Figure 1). An example of an 'if ... then' statement, together with some assignment statements, is

    firstvalue := 30;
    secondvalue := 50;
    thirdvalue := (firstvalue - secondvalue) * 2;
    minimum := firstvalue;
    if secondvalue < firstvalue then
      minimum := secondvalue;
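As a minimal sketch, the integer declarations, the assignments and the 'if ... then' statement from this section can be assembled into a complete program of the form outlined under 'Program structures'. The program name 'minimumdemo' and the 'writeln' calls are illustrative additions and are not taken from the paper.

```pascal
program minimumdemo (output);
{Illustrative wrapper: the name 'minimumdemo' and the output}
{statements are assumptions added for this sketch}
var
  firstvalue, secondvalue, thirdvalue: integer;
  minimum: integer;
begin
  firstvalue := 30;
  secondvalue := 50;
  thirdvalue := (firstvalue - secondvalue) * 2;
  {start by assuming the first value is the smaller one}
  minimum := firstvalue;
  if secondvalue < firstvalue then
    minimum := secondvalue;
  writeln('thirdvalue = ', thirdvalue);
  writeln('minimum = ', minimum)
end.
```

With these values 'thirdvalue' becomes -40 and, because 50 is not less than 30, 'minimum' keeps the value 30.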
Alternatively, we might have written

    if firstvalue < secondvalue then
      minimum := firstvalue
    else
      minimum := secondvalue;

Figure 1. 'If' statements: a, if boolean-expression then statement1; b, if boolean-expression then statement1 else statement2

The boolean expressions may themselves be complex, and the normal logical operators and boolean values exist. For example, we could determine whether 'thirdvalue' was in range by

    if (thirdvalue >= low) and (thirdvalue <= high) then
      inrange := true
    else
      inrange := false;

or, more simply, by

    inrange := (thirdvalue >= low) and (thirdvalue <= high)

'If' statements may be nested to any depth but care should be taken to avoid structures of the form

    if ... then ... if ... then ... else ...

which are likely to be misinterpreted by a human reader. Additional blocks, delimited by 'begin' and 'end', can be used to resolve the structure as either

    if ... then begin if ... then ... end else ...

or

    if ... then begin if ... then ... else ... end

A single 'if' statement allows, at most, two courses of action. The 'case' statement (Figure 2)

    case i of
      i1: statement1;
      i2: statement2;
      ...
      iN: statementN
    end

allows one of a number of courses of action to be selected and is often preferable to a nested 'if' statement. For example

    case operator of
      '+': result := operand2 + operand1;
      '-': result := operand2 - operand1;
      '*': result := operand2 * operand1;
      '/': result := operand2 / operand1
    end

Figure 2. 'Case' statement

'Operator' is of type 'char' (see above), but it could of course take many values other than '+', '-', '*' and '/' and so generate a runtime error. It would be tedious to enumerate all other 'char' values and deal with their occurrence; consequently the lack of an 'else' clause may be a potential source of errors of which the programmer should be aware.

PASCAL's repetitive control structures 'while' and 'repeat' (Figure 3) have essentially the same constituent parts (a boolean expression to be evaluated and a statement or sequence of statements to be executed) but differ in their relative placement.

Figure 3. Repetitive control structures: a, while boolean-expression do statement; b, repeat statement(s) until boolean-expression

Assuming the declaration

    var
      x, power, p, count: integer;

and the assignments

    power := 1;
    count := 0;
    p := 5;

then, assuming that x has been initialized, x⁵ could be calculated either by

    while count < p do
    begin
      power := power * x;
      count := count + 1
    end

or by

    repeat
      power := power * x;
      count := count + 1
    until count >= p
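The two loop forms can be compared directly by wrapping each in a function, as in the following sketch. The names 'loopdemo', 'powerwhile' and 'powerrepeat' and the test value x = 2 are assumptions made for illustration; the loop bodies are those given in the text.

```pascal
program loopdemo (output);
{Illustrative sketch: program and function names are hypothetical}

function powerwhile (x, p: integer): integer;
var
  power, count: integer;
begin
  power := 1;
  count := 0;
  {the test comes first, so the body may run zero times}
  while count < p do
  begin
    power := power * x;
    count := count + 1
  end;
  powerwhile := power
end;

function powerrepeat (x, p: integer): integer;
var
  power, count: integer;
begin
  power := 1;
  count := 0;
  {the test comes last, so the body always runs at least once}
  repeat
    power := power * x;
    count := count + 1
  until count >= p;
  powerrepeat := power
end;

begin
  writeln('while,  p = 5: ', powerwhile(2, 5));
  writeln('repeat, p = 5: ', powerrepeat(2, 5));
  writeln('while,  p = 0: ', powerwhile(2, 0));
  writeln('repeat, p = 0: ', powerrepeat(2, 0))
end.
```

With x = 2 and p = 5 both functions return 32; the two calls with p = 0 show where the forms diverge.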
However, if p had been initialized to 0, 'while' would not have executed its (compound) statement at all but 'repeat' would have executed its sequence of statements once, and so returned an incorrect answer.

A further example of the while statement, which also illustrates the use of the standard successor function 'succ' with enumerated types, is

    type
      day = (mon, tue, wed, thu, fri, sat, sun);
    var
      today: day;
      workingdays: integer;
    begin
      workingdays := 0;
      today := mon;
      while today <= fri do
      begin
        workingdays := workingdays + 1;
        today := succ(today)
      end
    end

As we would expect, 'wed' is the successor of 'tue', which is the successor of 'mon'. The ordering is not cyclic: 'mon', the first value of the type, has no predecessor, and 'sun', the last, has no successor. Similarly, PASCAL also has a standard predecessor function, 'pred'.

The third repetitive control structure in PASCAL is the 'for' statement, which has the following form.

    for controlvariable := initial-value to (downto) final-value do statement

The flowchart representing this loop is shown in Figure 4. On each iteration of the loop, the control variable is either incremented ('to') or decremented ('downto'), as in

    sum := 0;
    for i := 1 to 9 do sum := sum + x

or

    sum := 0;
    for i := 9 downto 1 do sum := sum + x

Figure 4. Repetitive control structure: the 'for' statement (for controlvariable := initial-value to (downto) final-value do statement)

Structured data types

PASCAL has four structured data types:

• arrays
• records
• sets
• files

These are all static structures, i.e. values of these types cannot be created or destroyed during program execution. There is also a pointer data type which provides for dynamic list structures.

Components of an array are of identical type and are accessed by subscripting.

    var
      i: integer;
      memory: array[0 .. 4095] of integer;

    for i := 0 to 4095 do memory[i] := 0

Components of a record, on the other hand, may be of different types.

    type
      date = record
        day: 1 .. 31;
        month: (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec);
        year: 1900 .. 2000
      end;

Here, a record of type 'date' consists of a day (of subrange type from 1 to 31) followed by a month (of enumerated type Jan, Feb, ..., Dec) followed by a year (of subrange type from 1900 to 2000). The components of the record are called fields and are accessed by name. A more complex record, 'employee', is as follows.

    type
      grade = 1 .. 10;
      employee = record
        name: array[1 .. 25] of char;
        dept: (sales, marketing, production, finance);
        joiningdate: date;
        case salarygrade: grade of
          1, 2, 3, 4, 5, 6: (weeklysalary: 0 .. 500; bonus: 0 .. 300);
          7, 8, 9, 10: (monthlysalary: 0 .. 2000; expenses: integer)
      end

'Employee' is a variant record. All records of type 'employee' start with a name (of type 'array of char') followed by a 'dept' (of enumerated type) followed by 'joiningdate' (of type 'date', with 'date' itself being a record). The final part of the record, however, depends on the value of the variable 'salarygrade' (of subrange type 'grade'). If 'salarygrade' is in the range from 1 to 6 then there will be a weekly salary and a bonus (both of subrange type) but, if 'salarygrade' is in the range from 7 to 10, there will be a monthly salary (of subrange type) and an (integer) expenses field. Assuming the following variables of type 'employee'

    var
      manager, clerk: employee

we could write an assignment statement such as

    clerk.bonus := 200

assigning the value 200 to the bonus field of the variable 'clerk'.

If data structures are deeply nested within one another it can be tedious to name each field specifically at every reference to it. To alleviate this, PASCAL provides a 'with' statement so that instead of writing

    manager.monthlysalary := manager.monthlysalary + 50;
    manager.expenses := 500

we could write

    with manager do
    begin
      monthlysalary := monthlysalary + 50;
      expenses := 500
    end

where the qualifier 'manager' is implicitly applied to the field identifiers of its record within the inner block.

The 'set' data type, together with the usual set operators, allows structures of 'collections' of components (of the same type) to be handled.

    type
      alphabet = 'a' .. 'z';
      possibleword = set of alphabet

The values of type 'possibleword' are sets of components of type 'alphabet' (i.e. lower-case letters); the type itself is the powerset (i.e. the set of all subsets) of 'alphabet'.

The 'file' data type is a sequence of components all of the same type. Using the definition of 'employee' given above we can write

    var
      staff: file of employee

The length (i.e. the number of components) of the file is not fixed by the file type definition. There are several basic file handling procedures including Reset (initiation of reading), Rewrite (initiation of writing) and the boolean end-of-file function 'eof'. Also associated with a file variable F is a buffer variable F^ (of the component type) which is used to access the file via the procedures Get and Put.

A text file, of the predefined type 'Text', is a sequence of characters subdivided into variable-length lines. As well as 'eof', it has an end-of-line boolean function 'eoln' associated with it.
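The record, 'with' and set constructs described in this section can be exercised together in a short program. The following is a sketch only: it uses a cut-down 'employee' record without the name and date fields, and the program name 'recorddemo', the variable 'vowels' and the sample values are assumptions introduced for illustration.

```pascal
program recorddemo (output);
{Illustrative sketch: 'recorddemo', 'vowels' and all sample}
{values are assumptions; 'employee' is reduced for brevity}
type
  grade = 1 .. 10;
  employee = record
    dept: (sales, marketing, production, finance);
    case salarygrade: grade of
      1, 2, 3, 4, 5, 6: (weeklysalary: 0 .. 500; bonus: 0 .. 300);
      7, 8, 9, 10: (monthlysalary: 0 .. 2000; expenses: integer)
  end;
  alphabet = 'a' .. 'z';
  possibleword = set of alphabet;
var
  manager: employee;
  vowels: possibleword;
  ch: char;
  count: integer;
begin
  {select the monthly-paid variant before using its fields}
  manager.dept := finance;
  manager.salarygrade := 8;
  manager.monthlysalary := 1200;
  with manager do
  begin
    monthlysalary := monthlysalary + 50;
    expenses := 500
  end;
  writeln('monthly salary: ', manager.monthlysalary);
  writeln('expenses: ', manager.expenses);

  {a set value is built with a set constructor and tested with 'in'}
  vowels := ['a', 'e', 'i', 'o', 'u'];
  count := 0;
  for ch := 'a' to 'z' do
    if ch in vowels then
      count := count + 1;
  writeln('vowels counted: ', count)
end.
```

The program reports a monthly salary of 1250, expenses of 500 and a count of 5. Setting the tag field 'salarygrade' before touching 'monthlysalary' matters: the fields of the other variant are not meaningful while the tag is in the range 7 to 10.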
Procedures and functions

A procedure is a named block which may have local definitions and declarations in the same way as a program does. Specific information from the external environment is passed to the procedure via the parameter list. When a procedure is declared, a formal parameter list is specified; when the procedure is called the formal parameters are replaced by actual parameters. Parameters may be called either by value or by reference; parameters called by reference are known as variable parameters and are preceded by 'var'.

In general, reference parameters are used to return results from an activation of the procedure to the environment. During the activation, references to the parameter are treated as references to the actual parameter that existed before, and will continue to exist after, the activation.

Parameters whose values do not change, or whose changed values are not of interest, are called by value. On activation of a procedure, value parameters are copied into the procedure's activation record (i.e. into its runtime local storage) and during the activation references to the parameters are treated as references to this copy. When the procedure terminates, the activation record, including the value parameters, disappears.

During a procedure activation, then, the objects which are visible are the objects declared locally within the procedure, the parameters, and the objects which are global to the procedure. The objects global to the procedure are those which were in scope (i.e. visible) when the procedure was declared (as opposed to when it was called).

The procedure 'matrixmultiply' shown below has three arrays of integers as parameters. It multiplies X by Y to give a result in Z.

    type
      mat1 = array[1 .. p, 1 .. q] of integer;
      mat2 = array[1 .. q, 1 .. r] of integer;
      mat3 = array[1 .. p, 1 .. r] of integer;
    var
      A: mat1;
      B: mat2;
      C: mat3;

    procedure matrixmultiply (var X: mat1; var Y: mat2; var Z: mat3);
    var
      i: 1 .. p;
      j: 1 .. r;
      k: 1 .. q;
      temp: integer;
    begin
      for i := 1 to p do
        for j := 1 to r do
        begin
          temp := 0;
          for k := 1 to q do
            temp := temp + X[i, k] * Y[k, j];
          Z[i, j] := temp
        end
    end

    matrixmultiply (A, B, C)

In this case all three parameters are called by reference although X and Y remain unchanged and therefore could have been called by value, but some thought should be given to the use of structured types, such as arrays, as parameters. Calling by value would have caused copies of X and Y to be formed in the activation record of the procedure. Depending on the size of X and Y, the time and storage requirements of the copying could be significant and, as X and Y are only read, never modified, in this procedure, a call by reference is preferable.

PASCAL also allows procedures (or functions) themselves to be passed as parameters.

Functions differ from procedures in that they return a value of a particular type (the type of the function) which may be assigned etc. in the usual way. Some standard PASCAL functions, e.g. the boolean end-of-file function 'eof', have already been introduced above.

Recursive procedures (i.e. those which call themselves from within the procedure body) are allowed in PASCAL. These are particularly useful when dealing with recursively defined data structures (e.g. lists) but care should be taken with their use as there are situations when iterative methods are preferable.
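The difference between value and variable parameters, and the form of a recursive function, can be seen in a small sketch. All names here ('paramdemo', 'increment', 'factorial' and the parameter names) are hypothetical and chosen only for illustration.

```pascal
program paramdemo (output);
{Illustrative sketch: every identifier here is hypothetical}
var
  a, b: integer;

procedure increment (byvalue: integer; var byreference: integer);
begin
  {byvalue is a copy held in the activation record}
  byvalue := byvalue + 1;
  {byreference denotes the caller's own variable}
  byreference := byreference + 1
end;

function factorial (n: integer): integer;
begin
  {the function calls itself until the base case n <= 1}
  if n <= 1 then
    factorial := 1
  else
    factorial := n * factorial(n - 1)
end;

begin
  a := 10;
  b := 10;
  increment(a, b);
  writeln('a = ', a);
  writeln('b = ', b);
  writeln('5! = ', factorial(5))
end.
```

After the call, 'a' is still 10 because only the copy was changed, whereas 'b' is 11; 'factorial(5)' returns 120. The factorial is a case where an iterative loop would serve equally well, which is the kind of situation the caution about recursion refers to.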
Pointer variables

Pointer variables allow the creation of dynamic data structures whose size can change during the execution of a program. A pointer variable is a pointer to variables of a specific type.

    type
      nextemployee = ^employee

Variables of type 'nextemployee' will be pointers to variables of type 'employee'. There are two standard PASCAL procedures, 'new' and 'dispose', that create and destroy the objects pointed at by pointer variables. Consider the example

    type
      nextemployee = ^employee;
      employee = record
        next: nextemployee;
        name: array[1 .. 25] of char;
        dept: (sales, marketing, production, finance);
        joiningdate: date;
        case salarygrade: grade of
          1, 2, 3, 4, 5, 6: (weeklysalary: 0 .. 500; bonus: 0 .. 300);
          7, 8, 9, 10: (monthlysalary: 0 .. 2000; expenses: integer)
      end;
    var
      staff: nextemployee;

    new(staff)

Type 'nextemployee' is a pointer to variables of type 'employee'. Type 'employee' has been altered slightly from its previous definition: it now has an extra field which is a pointer to variables of type 'employee'. The variable 'staff' can be thought of as follows

[Diagram: 'staff' is a pointer to a record of type 'employee']

The procedure call new(staff) will create a record of type 'employee' and put a pointer to the record in 'staff' (Figure 5). There are no data values in the record, and the only way of referring to it is via 'staff'. For example

    staff^.dept := finance

assigns the value 'finance' to the 'dept' field of the record at which 'staff' is pointing.

Figure 5. Record of type 'employee' (showing the fixed part and the variant part of the record)

Lists of records can be built up as follows.

    new(staff^.next)

will create another employee variable and put a pointer to it in the next field of the record which is pointed at by 'staff'.

    staff^.next^.dept := sales

assigns the value 'sales' to the 'dept' field of the new record, and

    staff^.next^.next := nil

assigns the null pointer ('nil') to the next field of the new record (Figure 6). In practice, working pointer variables or 'with' statements are used to avoid repetitive complex pointer addressing.

Figure 6. Building up a list of records

In the same way as procedure 'new' dynamically builds data structures, procedure 'dispose' is used to dynamically remove elements of data structures. Together they form the basis of a very powerful programming technique.

It can be seen that there is recursion in the definition of 'nextemployee' and 'employee'. Structures built with variables of these types are therefore recursive and it may be appropriate to handle them with recursive procedures.

WRITING PROGRAMS IN PASCAL

The strength of PASCAL as a problem-solving language lies in its descriptive facilities. The programmer may define data types to represent the objects that must be dealt with and corresponding procedures to manipulate the objects.

By way of an example, this section develops the outline of a program to model the operation of a simple computer. The example is not on a realistic scale but this type of modelling is frequently used as a first step in designing new computer architectures.

The PASCAL program must simulate processes, i.e. it must simulate the execution by the computer of its programs. The design for the model is shown in Figure 7; this model is of course only one of a family of possible designs. Some of the components of the model, such as the process itself, are active and will be represented in PASCAL by program structures, typically procedures. Other components, such as the memory, are passive and will be represented by data structures.

Figure 7. Model for the operation of a simple computer

The process (which is the execution by the processor and control section of a program held in the memory) can be realized by a procedure.

    procedure process(var mem: memory);
    var
      proc: processor;
      eop: boolean;
    begin
      eop := false;
      while not eop do
      begin
        fetch(mem, proc);
        execute(proc, eop, mem)
      end
    end

The parameter 'mem', of type 'memory', is called by reference as it will return the results of the process execution.

The function of the process is to perform fetch-and-execute cycles until the program terminates. This is realized by the 'while' statement which implements the control section of the processor. The processor itself consists of a register set comprising an instruction register, a program counter, an accumulator, a stack pointer and a flag register.

    type
      processor = record
        instructionreg: instruction;
        programcounter, accumulator, stackpointer: word;
        flags: flagregister
      end

where 'flagregister' is defined as

    type
      flagregister = record
        zero, carry: boolean
      end

The memory contains the program, the stack and a data area

    type
      memory = record
        p: prgram;
        st: array[0 .. 511] of word;
        data: array[0 .. 1023] of word
      end

and the program consists of an ordered collection of instructions.

    type
      prgram = array[0 .. 4095] of instruction

The collection of instructions is called 'prgram' because 'program' is a reserved word in PASCAL and must therefore not be used as an identifier.

An instruction has two parts to it, an opcode (which may be 'load', 'store', 'add', 'jump', 'push', 'pop' or 'halt') and a 16-bit address.

    type
      word = 0 .. 65535;
      operation = (load, store, add, jump, push, pop, halt);
      instruction = record
        opcode: operation;
        address: word
      end

Returning to the control section of the processor, 'fetch' and 'execute' are both active components implemented by procedures.
    procedure fetch(var mem: memory; var proc: processor);
    begin
      with proc do
      begin
        instructionreg := mem.p[programcounter];
        programcounter := programcounter + 1
      end
    end

The following is an outline of procedure 'execute' which, in its turn, would activate the components of the ALU.

    procedure execute(var proc: processor; var eop: boolean; var mem: memory);
    begin
      with proc do
        case instructionreg.opcode of
          load:  ... ;
          store: ... ;
          add:   ... ;
          halt:  ...
        end
    end

Assuming the variable declaration

    var
      m: memory

'process' would be called as

    process(m)
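To show how the pieces fit together, the following sketch fills in the outline with one possible set of instruction semantics and runs a four-instruction program. The type definitions and the procedures 'fetch' and 'process' follow the paper; the bodies of the 'execute' arms, the register reset in 'process', the 16-bit wraparound in 'add' and the sample program are all assumptions made for illustration, since the paper leaves them unspecified.

```pascal
program simulate (output);
{Sketch only: the instruction semantics, the register reset and the}
{sample program are assumptions; the paper gives 'execute' in outline}
type
  word = 0 .. 65535;
  operation = (load, store, add, jump, push, pop, halt);
  instruction = record
    opcode: operation;
    address: word
  end;
  prgram = array[0 .. 4095] of instruction;
  memory = record
    p: prgram;
    st: array[0 .. 511] of word;
    data: array[0 .. 1023] of word
  end;
  flagregister = record
    zero, carry: boolean
  end;
  processor = record
    instructionreg: instruction;
    programcounter, accumulator, stackpointer: word;
    flags: flagregister
  end;
var
  m: memory;

procedure fetch (var mem: memory; var proc: processor);
begin
  with proc do
  begin
    instructionreg := mem.p[programcounter];
    programcounter := programcounter + 1
  end
end;

procedure execute (var proc: processor; var eop: boolean; var mem: memory);
var
  operand: word;
begin
  with proc do
    case instructionreg.opcode of
      load: accumulator := mem.data[instructionreg.address];
      store: mem.data[instructionreg.address] := accumulator;
      add:
        begin
          operand := mem.data[instructionreg.address];
          {assumed wraparound at 65536, computed without leaving 0..65535}
          flags.carry := operand > 65535 - accumulator;
          if flags.carry then
            accumulator := accumulator - (65535 - operand) - 1
          else
            accumulator := accumulator + operand;
          flags.zero := accumulator = 0
        end;
      jump: programcounter := instructionreg.address;
      push: begin mem.st[stackpointer] := accumulator; stackpointer := stackpointer + 1 end;
      pop: begin stackpointer := stackpointer - 1; accumulator := mem.st[stackpointer] end;
      halt: eop := true
    end
end;

procedure process (var mem: memory);
var
  proc: processor;
  eop: boolean;
begin
  {assumed reset state; the paper does not show initialization}
  proc.programcounter := 0;
  proc.accumulator := 0;
  proc.stackpointer := 0;
  proc.flags.zero := false;
  proc.flags.carry := false;
  eop := false;
  while not eop do
  begin
    fetch(mem, proc);
    execute(proc, eop, mem)
  end
end;

begin
  {sample program: data[2] := data[0] + data[1]}
  m.data[0] := 2;
  m.data[1] := 3;
  m.p[0].opcode := load;   m.p[0].address := 0;
  m.p[1].opcode := add;    m.p[1].address := 1;
  m.p[2].opcode := store;  m.p[2].address := 2;
  m.p[3].opcode := halt;   m.p[3].address := 0;
  process(m);
  writeln('data[2] = ', m.data[2])
end.
```

Running the sample program leaves the value 5 in data[2]. The sketch does not check that an address lies within the 1024-word data area or that the stack pointer stays within the stack; a fuller model would need to decide how such faults are reported.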
CONCLUSIONS

This paper has presented the major features of PASCAL and illustrated their use in program design and development. What programming languages offer, however, is support for, and not a guarantee of, good software production. The constructs provided by a language must therefore underpin a more comprehensive philosophy of software engineering.

PASCAL's undoubted success as a programming language is a result of its provision of basic language constructs which support the techniques of abstraction and refinement discussed above. That is not to say, of course, that it is the perfect language, and there are a number of issues (in particular concurrent and realtime programming) which it does not address at all. It has, however, gained very wide acceptance, being available on almost all computers, and has become a standard communication medium for textbooks, journals and research reports in the areas of software engineering and algorithm description.

Since its inception, many languages have been developed from PASCAL. Two of these, MODULA-2 and ADA, are the subject of subsequent papers in this series. It is instructive to observe from their development the lessons in language design and application which have been learnt from PASCAL, the consequent extension of support for software engineering techniques (notably in the areas of visibility and generics) and the inclusion of facilities for realtime and concurrent programming.

ACKNOWLEDGEMENT

This article is based in part on material used in lectures for the Fourth IEE Vacation School on 'Software Engineering for Microprocessor Systems'.
REFERENCES

1 Wirth, N 'The programming language Pascal' Acta Informatica Vol 1 (1971) pp 35-63
2 Dahl, O J, Dijkstra, E W and Hoare, C A R Structured programming Academic Press, Orlando, FL, USA (1972)
3 Davies, A C 'Features of high-level languages for microprocessors' Microprocessors Microsyst. Vol 11 No 2 (March 1987) pp 77-87
4 Wirth, N 'Program development by stepwise refinement' Commun. ACM Vol 14 (1971) pp 221-227
5 Jensen, K and Wirth, N (Revised by Mickel, A B and Miner, J F) 'Pascal user manual and report' ISO Pascal Standard 3rd Edition Springer Verlag, Heidelberg, FRG (1985)
Patricia M Samwell is a lecturer in information engineering at The City University, London, UK. Her research interests are in software engineering, particularly in relation to concurrent processing. A member of the IEE, BCS and ACM, she holds a BSc in mathematics from the University of Leeds, UK, and an MSc in computer science and PhD in computer engineering from The City University, London.
Microprocessors and Microsystems