A Multi-pass Translation Scheme for ALGOL 60
E. N. HAWKINS and D. H. R. HUXTABLE
The English Electric Co. Ltd.
1. INTRODUCTION
In the past few years techniques have been developed which enable algorithmic languages like ALGOL 60 to be translated in one pass through the input text. Such translators operate at high speed and enable the user to operate a 'load and go' system. The machine program resulting from such a translation is, however, slower than that which can be produced by multi-pass translators. With the new machines this penalty does not assume the proportions that it did with the older machines. There still exists, however, a large class of users for whom the new machines are barely large enough or fast enough. It is for this reason that the authors have developed a multi-pass scheme (for translating ALGOL 60 on KDF9) which is designed to take advantage of the structure of the machine. Large portions of the scheme are of general interest, and applicable to any machine. Some of it, however, is applicable only to KDF9, and is included for the sake of completeness.
2. BASIC STRUCTURE AND OBJECTS OF THE SCHEME
The main object of the scheme is to produce a translation of ALGOL 60 into KDF9 machine code which will run as efficiently as possible. Efficiency includes such things as minimum running time and minimum machine storage requirements. The main attack has obviously to be centred on an efficient solution to the evaluation of subscript lists. Such a solution is only possible if the variables appearing in a subscript list have a well-defined sequence of values. Such a definition is supplied explicitly by means of 'for clauses'. Naturally the solution cannot be applied to 'for statements' which contain quantities which interfere with the defined sequence. Since any procedure is capable of such an alteration, but very few do, some classification of the procedures must take place. This classification has the further desirable property that it enables the storage allocation process to be simplified.

An arithmetic expression can also be subjected to an optimizing process provided it does not contain functions liable to produce side-effects which are unknown to the translator. Functions with 'side-effects' are functions which during the course of evaluation change the value of a variable which is either planted as a parameter by name or referred to 'non-locally'. P. Naur in his Course of ALGOL 60 Programming (Ref. 9) defines a function 'sneak' with this property. This term will be used to describe all such functions. A procedure is regarded as being a 'sneak' until proved otherwise. The procedure classification process therefore increases the number of expressions which can be optimized.

The system with full diagnostic aids and library requires a KDF9 with 8K core storage, four magnetic tape units, paper tape reader and punch. If fewer than four units are available, then by eliminating some of the facilities and/or increasing the time to translate, the scheme can run with only two magnetic tape units. This minimum is not, however, regarded as practical.

The scheme operates in seven distinct phases:
1. Input.
2. Syntactic check and reduction of the input text to a form suitable for processing by the later phases.
3. Procedure classification.
4. Storage allocation.
5. Index optimization.
6. Translation and formula optimization.
7. Final compilation and output.
The scheme is organized around the storage allocation system. So rather than go through the above phases in order, it is proposed to consider phases 4, 3, 5, 6 in detail and in that order. Phases 1, 2, 7 will then be discussed in broad outline only, since they are relevant only in support of the work done in the other phases.

3. A BRIEF DESCRIPTION OF KDF9 IN TERMS OF ITS USER CODE
The main feature of this machine is its nesting accumulator. This accumulator consists of a series of cells called N1, N2 ... N16. All transfers take place to or from N1; when such a transfer takes place the contents (if any) of N2, N3 ... N16 are moved (nested) up or down one cell. The arithmetic functions operate on the contents of the top two cells (N1 and N2). The answer is left in N1 and the contents of N3 moved into N2, etc. Various manipulative functions, which re-order the contents of the top 2, 3 or 4 cells of the accumulator, are listed below, together with some of the arithmetic functions.
Function     N1     N2     N3     N4
(initial)    a      b      c      d
+            b+a    c      d
-            b-a    c      d
×            b×a    c      d
÷            b/a    c      d
REV          b      a      c      d
REVD         c      d      a      b
PERM         b      c      a      d
CAB          c      a      b      d
DUP          a      a      b      c
DUPD         a      b      a      b
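The behaviour of the nesting accumulator can be checked with a small simulation. The following Python sketch is only an illustration: the class and function names are hypothetical, the store is modelled as a list whose first element plays the role of N1, the 16-cell limit is ignored, and the last arithmetic step of the sample program given at the end of this section is read as the division ÷.

```python
class NestingStore:
    """Toy model of the nesting accumulator; cells[0] stands for N1."""

    def __init__(self):
        self.cells = []

    def fetch(self, value):
        # A fetch such as Y1 nests the store down and places the value in N1.
        self.cells.insert(0, value)

    def store(self):
        # A store such as =Y6 removes N1 and nests the store up.
        return self.cells.pop(0)

    def binary(self, op):
        # Arithmetic functions combine N2 (left operand) with N1 (right operand).
        a = self.cells.pop(0)
        b = self.cells.pop(0)
        self.cells.insert(0, op(b, a))

    def rev(self):
        # REV interchanges N1 and N2.
        self.cells[0], self.cells[1] = self.cells[1], self.cells[0]

    def dup(self):
        # DUP places a copy of N1 on top.
        self.cells.insert(0, self.cells[0])

    def dupd(self):
        # DUPD places a copy of the pair N1, N2 on top.
        self.cells[0:0] = self.cells[0:2]


def run(program, memory):
    """Execute a ';'-separated instruction string against a dict of Y-stores."""
    acc = NestingStore()
    arithmetic = {
        "+": lambda b, a: b + a,
        "-": lambda b, a: b - a,
        "×": lambda b, a: b * a,
        "÷": lambda b, a: b / a,
    }
    for word in program.split(";"):
        word = word.strip()
        if word in arithmetic:
            acc.binary(arithmetic[word])
        elif word == "REV":
            acc.rev()
        elif word == "DUP":
            acc.dup()
        elif word == "DUPD":
            acc.dupd()
        elif word.startswith("="):
            memory[word[1:]] = acc.store()
        else:
            acc.fetch(memory[word])
    return memory


program = ("Y2; Y1; DUPD; DUP; ×; ×; DUP; ×; +; REV; "
           "Y4; Y3; ×; DUP; ×; DUP; +; +; ÷; =Y6")
a, b, c, d = 2.0, 3.0, 5.0, 7.0
memory = run(program, {"Y1": a, "Y2": b, "Y3": c, "Y4": d})
expected = a * (a**3 * b**2 + 1) / (b + 2 * c**2 * d**2)
```

With these sample values the simulated program leaves in Y6 the same value as the direct evaluation held in `expected`, which gives some confidence in the reading of the instruction sequence.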
Most of the ALGOL operators are defined as single KDF9 functions. Some, however, have to be simulated, for example ÷ and ↑. The division ÷ in the above list denotes floating-point division and corresponds to the ALGOL 60 /. Fixed-point and floating-point operations are available.

Transfers to or from the core store are achieved by instructions written in the form =Yy or Yy. Y is the main class; there are, however, limited classes YA, YB ... YZ and also V-stores. These latter can be set as constants to the program by means of instructions to the compiler of the form Vn = 'constant'. Other integer constants of up to 15 bits can be put into N1 with instructions of the form SET p (where p is a signed integer).

There are 15 Q-stores available for modification of the addresses specified in Fetch and Store instructions. These Q-stores are divided into three equal parts, Counter/Increment/Modifier (referred to as C, I, M), the whole store being referred to as Q. If we wish to modify a transfer address by the contents of the modifier part of a Q-store, the instruction =YyMm or YyMm is written. It is also possible to change the value of the modifier M by the value of the increment I and decrease the counter C by 1 in any transfer instruction. This is achieved by writing Q after the instruction, i.e. YyMmQ. Such modification takes place after the transfer has been initiated, i.e. after the address has been evaluated. Storage is possible directly into the three parts of a Q-store, e.g. =C1, =I3, =M15, in which case the least significant 16 bits of N1 are taken and put into the appropriate part. Alternatively the whole 48 bits can be stored by instructions of the form =Q14. The importance of this class of instruction will become apparent in the section on Index Optimization (Section 6).

The remaining feature is the Subroutine Jump Nesting Store (SJNS). This is a nesting store similar in operation to the nesting accumulator. It is used to store return addresses for use on exit from subroutines. The address in the top cell of the SJNS is the address to which control is transferred by the use of the instruction EXIT. Addresses can be transferred to or from the SJNS (from the nesting accumulator) by use of the instruction =LINK or LINK. The return addresses are set by Subroutine Jump instructions JSs, where 's' is the required label. This facility is available regardless of whether 's' is a subroutine label or just an ordinary reference point.

Conditional Jumps are conditional on the state of the top cell or the relationship between the top two cells of the accumulator. Such Jumps always nest up one place. The tests on the top cell only are comparisons with zero and are written Jr = Z, Jr ≠ Z, etc. Jumps conditional on the top two cells are limited to Jr ≠, Jr =. For a more detailed description of KDF9 and its User Code reference should be made to Refs. 7, 13.

A program for evaluating the expression

$Z = \frac{a(a^3 b^2 + 1)}{b + 2c^2 \times d^2}$

with a, b, c, d in Y1, Y2, Y3, Y4 and Z in Y6:

Y2; Y1; DUPD; DUP; ×; ×; DUP; ×; +; REV; Y4; Y3; ×; DUP; ×; DUP; +; +; ÷; =Y6

4. THE STORAGE ALLOCATION SYSTEM
The ALGOL syntax allows, by means of the block structure, an automatic form of storage economy. Space need only be reserved for those variables which are currently available. This storage is simply nested; the last declaration passed is always the first one to be cleared back. This leads automatically to the idea of a 'stack'. Basically 'stacked storage' is storage in a vector which is continually expanding and contracting. The end of the vector is indicated by means of an ARROW. This ARROW contains the address of the Next Free Space (NFS); when it is desired to store a new quantity it goes into the NFS and the ARROW is moved on one place (for an array the ARROW is moved on several places). When a variable is no longer accessible according to the ALGOL syntax then the ARROW reverts to the value it had before that variable was declared.
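The simple stack mechanism just described can be sketched in a few lines of Python. This is an illustration only: the class `Stack` and its methods are hypothetical names, and the declarations used are those of Example 4.1.

```python
class Stack:
    """Stacked storage: a vector plus an ARROW holding the Next Free Space."""

    def __init__(self):
        self.cells = []   # the storage vector
        self.arrow = 0    # address of the Next Free Space (NFS)

    def declare(self, name, size=1):
        # Reserve `size` consecutive cells and move the ARROW past them.
        start = self.arrow
        for offset in range(size):
            self.cells.append(f"{name}[{offset + 1}]" if size > 1 else name)
        self.arrow += size
        return start

    def release_to(self, arrow):
        # Leaving a block: the ARROW reverts to the value it had on entry.
        del self.cells[arrow:]
        self.arrow = arrow


stack = Stack()
stack.declare("A", 4)
stack.declare("P")
stack.declare("Q")
mark = stack.arrow            # ARROW on entry to the first inner block
for name in ("S", "T", "U"):
    stack.declare(name)
arrow_inside_first_block = stack.arrow
stack.release_to(mark)        # end of the first inner block
for name in ("X", "Y"):
    stack.declare(name)
```

The inner blocks reuse the same cells: S, T, U are cleared back before X, Y are stored.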
As an example consider the following ALGOL structure:
Example 4.1.
  L1: begin real array A[1:4]; integer P, Q;
  L2:   begin integer S, T, U;
  L3:
        end;
  L4:   begin integer X, Y;
  L5:
        end
      end

The stack structure at the various points in this block is as follows:
  At L2: A[1], A[2], A[3], A[4], P, Q, ↑
  At L3: A[1], A[2], A[3], A[4], P, Q, S, T, U, ↑
  At L4: A[1], A[2], A[3], A[4], P, Q, ↑
  At L5: A[1], A[2], A[3], A[4], P, Q, X, Y, ↑
(↑ points to the address contained in the ARROW location.) It will be seen that if by some mechanism the program returns to L1 without leaving the block which bears that label (i.e. by a recursive procedure call) then the whole process is merely repeated, e.g.

  At L2: A[1], A[2], A[3], A[4], P, Q, A[1], A[2], A[3], A[4], P, Q, ↑

This possibility, together with the possibility that the array A may have dynamic bounds (i.e. suffix bounds which vary at run time), means that by using this simple mechanism it is impossible to allocate an address to any variable until run-time. The first step in the solution of this problem is to remove the indefinite storage requirements of arrays. Instead of trying to allocate space to contain the elements at translate time let us allocate a single word, which is loaded at run time with information about where the elements are located. The 'picture' at L2 would now be:

  A, P, Q, A[1], A[2], A[3], A[4], ↑
the contents of location A being set up (at run time) to point to A[1]. The variables A, P, Q can therefore be given explicit addresses. If, however, it is required to re-enter at L1 we still have to generate a new set of A, P, Q, etc. This is achieved by making the locations allocated to A, P, Q relative to some address whose value can be changed. This address is known as the BASE ADDRESS (BA) and is used to refer to the particular set of A, P, Q in current use.

This treatment can be extended to cover more than one block, provided recursive entry is not required to any of the internal blocks of the set. In general a procedure body is a block (or is regarded as such) and recursive entry may be required. There is no recursive entry requirement on the internal blocks, hence these blocks can be regarded as a unit. Thus for storage, and consequently much else of the system, the basic unit of program is the procedure, the main or base program being treated as a procedure without parameters. This unit is termed a 'Programme Level'.

The ARROW is of course initially set to the NFS after the fixed-storage requirements for the entire level have been set. In Example 4.1 the maximum number of fixed stores is required at L3 and is A, P, Q, S, T, U. Therefore the ARROW for the unit is set at the NFS after 'U'. The storage layout is therefore as follows:

  At L2: A, P, Q, -, -, -, A[1], A[2], A[3], A[4], ↑
  At L3: A, P, Q, S, T, U, A[1], A[2], A[3], A[4], ↑
  At L4: A, P, Q, -, -, -, A[1], A[2], A[3], A[4], ↑
  At L5: A, P, Q, X, Y, -, A[1], A[2], A[3], A[4], ↑
['-' denotes an unused location.] There is a certain penalty incurred in storage economy. The unused locations make for a loss in storage, but as only simple variables are involved the loss is unlikely to be important. The fixed storage part of a Programme Level is therefore allocated at translation time, and an initial value for the ARROW (in the above case it is the location ultimately occupied by A[1]) set on entry to the Programme Level.

When at run time an array declaration is encountered the Bound Pair List expressions are evaluated and sufficient space reserved for the variables constituting the array. The array location (A) is updated to point to the elements, and to the vector of information derived from the Bound Pair List Expressions. This vector in current jargon is known as the 'Dope Vector' and will be referred to as such. Since the number of such expressions is fixed at translation time, space is reserved for the Dope Vector in the fixed space for the level, one such vector for each Bound Pair List.

Such increments of the ARROW value occur therefore at various begins within a Programme Level. Obviously the value of the ARROW must be decreased by the same amount when the corresponding ends are found. Since it is possible to leave a block before the corresponding end is found and jump to some lower 'block level', this decrement is achieved by a table look-up of the value corresponding to the block level to which control is being transferred. Space is therefore reserved in the fixed space for an Arrow Vector (AV). A new element of the AV is set up on entry to a new block with the current value of the ARROW. This is then updated in step with the ARROW according to the array declarations present. The value is then available, when return is made to that block level either via a jump or end, to reinstate the correct value of the ARROW. The entries in the Arrow Vector are keyed to the block level. The size of the Vector is therefore equal to the maximum block depth reached within the Programme Level.

Similarly information must also be stored concerning the Base Address and ARROW positions whenever a Programme Level is left for a new one. This information is analogous to the LINK instructions used in returning from a subroutine to the main programme, and is termed the Data Link. It consists of three quantities:
1. The Programme Level name.
2. Base Address value.
3. ARROW value.
This is stored as one word, which together with an adjacent word containing the Instruction Link, occupies the first two locations of the new Programme Level, and refers to the previous level. Thus on exit to a previous Programme Level we can restore the previous values of the bounds of the storage.
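The interplay of ARROW, Arrow Vector and Data Link can be sketched as follows. The Python below is an illustration under stated assumptions, not the KDF9 implementation: all names are hypothetical, the new level's Base Address is assumed to be the ARROW value at the moment of entry, and the Arrow Vector (which the scheme keeps in the fixed store of the level) is simply saved alongside the Data Link.

```python
from dataclasses import dataclass


@dataclass
class DataLink:
    level_name: str     # 1. the Programme Level name
    base_address: int   # 2. Base Address value
    arrow: int          # 3. ARROW value


class Runtime:
    def __init__(self, fixed_size):
        self.level, self.base, self.arrow = "main", 0, fixed_size
        self.av = []      # Arrow Vector of the current level, keyed by block level
        self.saved = []   # stands in for the Data Links held in the stack

    def enter_block(self):
        # A new AV element is set up with the current value of the ARROW.
        self.av.append(self.arrow)

    def declare_array(self, n_elements):
        first = self.arrow
        self.arrow += n_elements
        self.av[-1] = self.arrow   # AV entry kept in step with the ARROW
        return first

    def return_to_block(self, block_level):
        # Used both for `end` and for a jump to an enclosing block.
        del self.av[block_level:]
        self.arrow = self.av[block_level - 1]

    def enter_level(self, name, fixed_size):
        self.saved.append((DataLink(self.level, self.base, self.arrow), self.av))
        self.level, self.base = name, self.arrow
        self.arrow = self.base + fixed_size
        self.av = []

    def exit_level(self):
        link, self.av = self.saved.pop()
        self.level, self.base, self.arrow = (
            link.level_name, link.base_address, link.arrow)


rt = Runtime(fixed_size=6)
rt.enter_block()            # block level 1
rt.declare_array(4)         # ARROW moves from 6 to 10
rt.enter_block()            # block level 2
rt.declare_array(3)         # ARROW moves from 10 to 13
rt.enter_level("P", 5)      # a procedure call: new Base Address, new ARROW
base_inside_p = rt.base
rt.exit_level()             # the Data Link restores Base Address and ARROW
arrow_after_exit = rt.arrow
rt.return_to_block(1)       # jump from block level 2 back to block level 1
```

The table look-up in `return_to_block` is what allows a jump out of several nested blocks to reinstate the ARROW in one step.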
In general a Programme Level has a certain number of formal parameters associated with it. Owing to the requirements in implementing calls of formal procedures these parameters must be in identical positions. The next 'n' locations are therefore reserved for the parameters (Ref. 3). The layout of the storage for a level is shown in Fig. 4.2. All quantities are referred to by addresses relative to the Base Address.
[FIG. 4.2. Storage layout of a Programme Level. Regions labelled in the figure, from the ARROW end to the BASE ADDRESS end: fixed store, Dope Vectors, Arrow Vectors, parameter locations, Instruction Link, Data Link.]
The Dope Vectors and fixed stores are in fact intermingled. The main result of this form of storage allocation is that variables only become non-local if they belong to a different level from the one in which they are being used. Access to a non-local variable is achieved by finding the value of the Base Address of the level to which it is local. There are two ways of doing this:
1. To keep an immediately accessible vector of values for each Programme Level.
2. To search for the last use of that level by examination of the Data Links.
The snag with scheme 1 is the problem of keeping the vector up to date, especially when abnormal exits are made from levels. The snag with scheme 2 is the time factor. The scheme adopted is a combination of both schemes. Scheme 1 is adopted for those levels which are non-recursive (a fact which immediately removes the updating problem), and scheme 2 is used for those levels which are recursive.

Examination of the special properties of the Base Level or Main Program immediately reveals the fact that it cannot recurse. It is further used in only one dynamic position in the stack, i.e. the first. Therefore its Base Address can be fixed, and is fixed at 0. Access to non-local variables which are local to this level is therefore immediate. It will be seen that this system embodies the speed of a fixed storage system and yet retains the flexibility required to implement either recursive procedures or the recursive use of a procedure.

Anticipating the procedure classification process described later, Special Functions are defined such that they require no dynamic storage. Therefore we can index the storage requirements from the ARROW instead of from a new Base Address, i.e. their storage allocation is as shown in Fig. 4.3. The Arrow Vector and Index Information are completely unnecessary by definition. The entry to such a function is therefore considerably faster than that to the more general function or procedure.
Some indication has already been given, in the description of the fixed storage assigned to a Programme Level, of the means used to reduce the stack manipulations necessary when transfer of control is made via a designational expression. The general treatment is only required if the label to which control is being transferred is local to another Programme Level, i.e. an escape path is being followed by jumping out of a procedure.
[FIG. 4.3. Storage allocation for a Special Function, indexed from the ARROW. Regions labelled in the figure: fixed storage, parameters, unused links.]
Jumps from one block to an enclosing block within the same level merely involve extraction from the Arrow Vector of the ARROW value appropriate to the destination block. Jumps within a block are mere transfers of control. The general treatment requires the use of both the Data Link (to restore the Base Address) and the Arrow Vector (to restore the ARROW, depending on the particular block of the level entered). A normal end exit from a Programme Level uses the Data Link to restore both these quantities, since return is being made to the original block of the previous level.

Own variables have not yet been mentioned. The use of such variables is as defined in Ref. 10, despite the lack of unanimous support for that proposal. Consequently such variables are located in the main Programme Level, and the only property in which they differ from the other variables of that level is their scope.

5. PROCEDURE CLASSIFICATION
The advantages gained from the classification process have already been indicated in Section 4. This classification divides the procedures used in a program into three classes:
1. Special functions.
2. Those procedures which it is possible to use recursively.
3. Those procedures which can only be used in a simple manner.
Special functions

A special function is a procedure which satisfies all of the following rules:
1. It is a function designator.
2. Specifiers are limited to 'type'.
3. All parameters are called by value.
4. No internal procedure statements.
5. No reference is made to variables which are non-local to the procedure.
6. There are no abnormal exits (i.e. transfers to labels which are non-local to the procedure).
7. Any local declarations do not include arrays or own variables or switches.

Examination of this set of rules shows that such a function has two very important properties:
1. It is incapable of producing 'side-effects' and if the parameters planted produce 'side-effects' such effects are external to the procedure.
2. It does not require any storage which involves alteration of the ARROW, either explicitly (local arrays or procedure statements) or implicitly during the evaluation of parameters called by name.

All these conditions are detectable in one pass through the procedure declaration. This pass, for organization reasons, is best done in a backward run through the procedure. This is possible if the 'locale' (the Programme Level to which an identifier is local) of any identifier is known (see Section 8 on Input).
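The seven rules amount to a conjunction of simple facts about one procedure declaration. As a rough illustration, assuming those facts have already been gathered in the single pass described above, the test might be sketched in Python as follows; the record `ProcedureFacts`, its field names and the two sample procedures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcedureFacts:
    is_function_designator: bool              # rule 1
    specifiers_are_types_only: bool           # rule 2
    all_parameters_by_value: bool             # rule 3
    has_internal_procedure_statements: bool   # rule 4
    refers_to_non_locals: bool                # rule 5
    has_abnormal_exits: bool                  # rule 6
    declares_arrays_own_or_switches: bool     # rule 7


def is_special_function(p):
    # A special function must satisfy every one of the seven rules.
    return (p.is_function_designator
            and p.specifiers_are_types_only
            and p.all_parameters_by_value
            and not p.has_internal_procedure_statements
            and not p.refers_to_non_locals
            and not p.has_abnormal_exits
            and not p.declares_arrays_own_or_switches)


# A hypothetical function of value parameters only, touching nothing outside itself.
cube = ProcedureFacts(True, True, True, False, False, False, False)
# The same function, except that it assigns to a non-local counter.
counting_cube = ProcedureFacts(True, True, True, False, True, False, False)
```

Here `cube` qualifies and `counting_cube` does not, since a reference to a non-local variable is enough to make a side-effect possible.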
Recursive or simple

The problem of dividing the remaining procedures into the two other classes is far from trivial. It should be pointed out that the system to be outlined does not take any account of dynamic conditions. As an example of such a condition consider the following procedure structure.

Example 5.1.
  begin
    real procedure P1 (F1); real procedure F1;
      begin P1 := F1 (x) + ... end;
    real procedure P2 (F2); real procedure F2;
      begin P2 := F2 (x) + ... end;
    procedure PE (B, p); Boolean B; real procedure p;
      begin if B then p (P1) else p (P2);
      end;
    PE (true, P2); PE (false, P1);
  end
It is obvious that in the given situation a recursion cannot occur. However, the system will classify procedures P1 and P2 as recursive. It would be by no means obvious if more general Boolean expressions were used instead of true and false (i.e. the system 'fails safe').

In general terms a recursion can arise in one of two ways: the recursive definition or declaration and the recursive use. An example of the recursive declaration is the definition of a square root by Newton's Approximation.

Example 5.2.
  real procedure SQRT (a) approximation: (x) tolerance: (eps);
  value a, x, eps; real a, x, eps;
  SQRT := if abs (x ↑ 2 - a) < eps then x else SQRT (a, (x + a/x)/2, eps);

Such recursive declarations can obviously involve more than one declaration, and consequently more than one procedure. The recursive use of a procedure is in general the more subtle process. The example given above (5.1) is a good illustration of the kind of thing that can happen. Such recursions are usually explicitly finite in depth, whereas the recursive definition (5.2) can cycle an indefinite number of times.

To detect recursive situations it is therefore necessary to note not only the declaration structure but also the dynamic statement structure, and to combine the results of the two surveys. Fundamentally the problem reduces to tracing through the call structure of the program. The method used is similar to that used in detecting precedence loops in multiprogramming theory (Ref. 5). Let a procedure call structure be represented diagrammatically by a sequence of blocks connected to one another by directional links. As an example consider the following system:
Procedure P1 can call any one of procedures P2, P3, P4, which in turn either call other procedures or terminate the call sequence, i.e. return to the calling procedure. The system can be described by means of a matrix of Boolean variables. The rows and columns correspond to procedures, the rows designating the calling procedure and the columns the called procedure. When a procedure (P1) calls other procedures (P2, P3, P4) then the Boolean variables corresponding to columns P2, P3, P4 of row P1 take the value 'true'. A description of the above system is therefore:
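[Call matrix: rows are the calling procedures P1 to P5, columns the called procedures P1 to P5. Row P1 carries a 1 in columns P2, P3 and P4; the positions of the remaining entries are not recoverable from this copy.]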
It will be noticed that the calls are oriented, i.e. P1 calls P2 but P2 does not in the diagram call P1 (it does ultimately do so) and, further, only digits corresponding to one step are inserted. In order to extract the required information (that for instance there exists a sequence P1 P2 P5 P4 P1) the matrix is processed using the following algorithm. Let the matrix be Boolean array A[1:n, 1:n];

  begin integer i, j, k, p;
    for p := 1, 2 do
      for j := 1 step 1 until n do
        for i := 1 step 1 until n do
        begin
          if A[i,j] then
          begin
            for k := 1 step 1 until n do
              A[i,k] := A[i,k] ∨ A[j,k]
          end
        end
  end

This has the obvious effect that if a procedure P1 calls P2 then it has 'access' to all the calls by P2, i.e. in the example given P1 can call P5 via P2. It can be proved that one such pass is required to complete the full connection or call matrix (Ref. 8).
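The completion of the call matrix can be tried out directly. The Python sketch below performs a single sweep with the intermediate index j outermost, which is the loop order for which one pass is known to suffice. The call graph used is an assumption chosen only to be consistent with the text (P1 calls P2, P3, P4, and the chain P1 P2 P5 P4 P1 exists); the function and variable names are hypothetical.

```python
def complete_call_matrix(matrix):
    """Return the completed (transitively closed) copy of a Boolean call matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]   # work on a copy
    for j in range(n):               # j: the procedure called on the way
        for i in range(n):           # i: the calling procedure
            if a[i][j]:
                # i calls j, so i gains access to every call made by j.
                for k in range(n):
                    a[i][k] = a[i][k] or a[j][k]
    return a


names = ["P1", "P2", "P3", "P4", "P5"]
# Assumed one-step calls, consistent with the description in the text.
calls = {
    "P1": ["P2", "P3", "P4"],
    "P2": ["P5"],
    "P3": [],
    "P4": ["P1"],
    "P5": ["P4"],
}
matrix = [[callee in calls[caller] for callee in names] for caller in names]
completed = complete_call_matrix(matrix)

# A true entry on the leading diagonal marks a procedure that can reach itself.
recursive = [name for index, name in enumerate(names) if completed[index][index]]
```

For this assumed graph `recursive` comes out as P1, P2, P4 and P5, while P3, which calls nothing, is classified as simple.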
The matrix completed by this process is as follows:
        P1   P2   P3   P4   P5
  P1     1    1    1    1    1
  P2     1    1    1    1    1
  P3
  P4     1    1    1    1    1
  P5     1    1    1    1    1
It will be seen that P1 calls P1 and P2 calls P2, etc., i.e. P1, P2, P4, P5 are in some way involved in a sequence of calls which ultimately call themselves. Recursion is thus indicated by a Boolean value true in the leading diagonal.
Setting up the precedence matrix

It has already been intimated that recursions can depend on actual-formal correspondences, either via parameters involving expressions or parameters involving procedure identifiers. The matrix when set up must therefore include all possible 'values' of 'formal procedures'. A 'value' of a 'formal procedure' is the 'actual procedure' given in a statement of the procedure with that parameter in its parameter list. A 'formal procedure' is an identifier used as a formal parameter specified as a procedure; it has all the attributes of a normal procedure except a declaration. An 'actual procedure' is a procedure which has a declaration. Obviously if a value of a formal procedure P is Q, then all calls by P become calls by Q and all calls of P become calls of Q, even if Q is itself a formal procedure. The matrix is therefore set up initially with four distinct regions as follows:
              actual   formal
  actual        A1       A2
  formal        A3       A4
Digits placed in A1 correspond to actual procedures calling actual procedures, in A2 actuals calling formals, in A3 formals calling actuals, in A4 formals calling formals. Examples of the various types of calls in skeleton form are:

Actual calling Actual (A1 calling sin):
  real procedure A1 (x); real x; A1 := sin (x) + x ↑ 2
Actual calling Formal (A2 calling P):
  real procedure A2 (P); procedure P; begin real x; ... P (x); ... A2 := x ↑ 3 end
Formal calling Actual (P calling A1):
  procedure A3 (P); procedure P; begin real x; ... P (A1, x); ... end
Formal calling Formal (P calling Q):
  procedure A4 (P, Q); procedure P, Q; begin ... P (Q); ... end
Once the matrix has been set up in the above form we merely observe the actual-formal correspondences and use them to map the rows and columns of the formal procedure on to the rows and columns of the actual procedure. This mapping process can involve a chain of correspondences. In Example 5.3:
  q has the 'value' p
  p has the 'value' sin

Example 5.3.
  begin real result;
    real procedure P (p); real procedure p;
    begin
      real procedure Q (q); real procedure q;
      begin real x; ... Q := q (x); ... end;
      ... P := Q (p); ...
    end;
    result := P (sin) ...
  end
Therefore by implication q has the 'value' sin. This is a similar problem to that involved in listing the possible calls of a procedure. We therefore set up an actual-formal correspondence matrix. In Example 5.3 this has the following structure:

                        'Actual'
                 P    Q    sin    p    q
  'Formal'  p              1
            q                     1
This matrix is then 'reduced' using the above algorithm and yields:
                 P    Q    sin    p    q
            p              1
            q              1      1
This matrix is now used to control the 'mapping process'. (The value true implies a map.)
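The reduction of the correspondences and the mapping they control can be illustrated for Example 5.3. The Python sketch below uses sets in place of Boolean matrices; the function names are hypothetical, and the one-step calls assumed (P calls Q, Q calls its formal q) are a reading of the example rather than the initial matrix of the paper.

```python
def reduce_correspondences(value_of):
    """Extend formal -> value correspondences along chains (q -> p -> sin)."""
    closed = {formal: set(values) for formal, values in value_of.items()}
    changed = True
    while changed:
        changed = False
        for values in closed.values():
            for value in list(values):
                extra = closed.get(value, set()) - values
                if extra:
                    values |= extra
                    changed = True
    return closed


def map_calls(calls, closed):
    """Map rows and columns of each formal procedure on to those of its values."""
    mapped = {caller: set(callees) for caller, callees in calls.items()}
    for formal, values in closed.items():
        for value in values:
            # All calls by the formal become calls by its value.
            mapped.setdefault(value, set()).update(calls.get(formal, set()))
            # All calls of the formal become calls of its value.
            for callees in mapped.values():
                if formal in callees:
                    callees.add(value)
    return mapped


# Correspondences read off Example 5.3: q has the value p, p has the value sin.
closed = reduce_correspondences({"q": {"p"}, "p": {"sin"}})

# Assumed one-step calls: P := Q (p) and Q := q (x).
calls = {"P": {"Q"}, "Q": {"q"}, "sin": set(), "p": set(), "q": set()}
mapped = map_calls(calls, closed)
```

After the mapping the row for Q contains sin, i.e. the call of the formal q inside Q is recognised as a possible call of the actual procedure sin.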
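The call matrix as set up initially:

[Matrix with rows and columns P, Q, sin, p, q and four entries set; the positions of the entries are not recoverable from this copy.]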
The mapping process gives:

[Matrix with rows and columns P, Q, sin, p, q and six entries set; the positions of the entries are not recoverable from this copy.]
which, it can be seen, yields the correct connections.

Having seen the required structure of the basic call matrix we can now examine briefly the methods adopted to set it up. Fundamentally the process is one of looking at procedure declarations and procedure statements and correlating the two (to find the actual-formal correspondences). Two lists are therefore set up, a Declaration List and a Statement List. The former is basically a copy of the procedure headings used in the declarations. It is, however, augmented by dummy declarations derived for the formal procedures from one of their corresponding actual procedures. (One requirement is therefore that all the actual procedures corresponding to a given formal procedure shall have identical specifications.) This dummy declaration serves a second purpose in providing the control necessary to set up the parameters when a formal procedure call is encountered. This list is retained during the translation and serves to hold general information about a given Programme Level, e.g. the fixed storage requirements.

The second list, the Statement List, reflects the nested structure of procedure statements. In setting up the list we are interested in only three types of parameters: those corresponding to formal parameters specified as procedures or labels or 'type'. These can be further reduced to two classifications, procedures and expressions. The nested property arises because an expression (arithmetic, Boolean, or designational) can contain statements of function designators. Statements are therefore divisible into two classes, those which occur in their own right, and those which occur within other statements. In the primary statement a note is made concerning the level in which that statement is occurring; the importance of this will be apparent later. The statements in any given level are indexed by a List Index and when an expression contains a statement it points to a new level of the List Index.
Diagrammatically the structure is as follows (→ means an address reference):

[Diagram: a primary List Index points to a series of Statement names; each statement carries its parameter entries, marked 'procedure' or 'expression', and an expression entry points to a further List Index with its own Statement names. Other labels appearing in the diagram: formal, variable list, procedure.]

The whole list is compiled in a single string indexed by the address references indicated by pointers. Only the first (primary) List Index is accessible. The Statement List enables a rapid scan to be made yielding the required information to set up the connection matrix. It will be remembered that the mapping process involves correlation between the two lists. The main thing that this list achieves is a ready placement of the level in which any of the statements in the secondary lists occurs. The point is that if an expression is called by value, then it is called not by the procedure whose statement contains the expression but in the enclosing level, i.e. the level in which the statement is being made. The setting up of the Statement List enables exact notes to be made of which parameters are expressions and how they are called. This involves the dummy declarations mentioned in the Declaration List.

6. OPTIMIZATION OF FOR STATEMENTS AND SUBSCRIPTED VARIABLES
As most computer programmers know, the computation time for addresses of subscripted variables can often be substantially reduced in cases where systematic operations are performed. This is particularly true of machines with index registers. Consider the situation:

  for k := 1 step 1 until n do C[i,j] := C[i,j] + A[i,k] × B[k,j];
where i, j, k, n are simple variables of declared type integer.
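The address arithmetic behind this statement can be made concrete with a short sketch. The Python below is an illustration only: it assumes arrays stored compactly by columns with all lower bounds equal to 1, each array is described by a hypothetical (base, rows) pair, and the function names are not taken from the scheme. It compares a literal recomputation of the addresses of A[i,k] and B[k,j] on every cycle with a version that computes them once and then advances each by a fixed step.

```python
def address(base, rows, i, j):
    """Address of X[i, j] for an array stored by columns, lower bounds 1."""
    return base + (i - 1) + rows * (j - 1)


def literal_addresses(i, j, n, a, b):
    # Each cycle recomputes both addresses, one multiplication apiece.
    return [(address(*a, i, k), address(*b, k, j)) for k in range(1, n + 1)]


def incremental_addresses(i, j, n, a, b):
    # Initial addresses are formed once, outside the loop.
    addr_a = address(*a, i, 1)
    addr_b = address(*b, 1, j)
    step_a = a[1]   # A[i,k] advances by the number of rows of A as k advances by 1
    step_b = 1      # B[k,j] advances by 1 as k advances by 1
    result = []
    for _ in range(n):
        result.append((addr_a, addr_b))
        addr_a += step_a
        addr_b += step_b
    return result


# A declared as A[1:10, 1:n] at a hypothetical base 1000; B with 7 rows at base 5000.
array_a = (1000, 10)
array_b = (5000, 7)
literal = literal_addresses(3, 2, 6, array_a, array_b)
incremental = incremental_addresses(3, 2, 6, array_a, array_b)
```

Both versions visit the same addresses; the second needs no multiplication inside the loop, and the address of C[i,j], which does not depend on k, need be formed only once.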
If arrays are compactly stored by columns a multiplication is implied for each subscript, and one for the actual product, making five in all. This is clearly uneconomic and any programmer will quickly recognize that:
1. The address of C[i,j] is unaffected by k.
2. If array A is declared array A[1:10, 1:n], the address of A[i,k] advances by 10 when k advances by 1.
3. The address of B[k,j] advances by 1 when k advances by 1 regardless of the declaration of B.
4. The loop is traversed n times (unless n < 0).

From this deduced information a more economic programme can be written than a transliteration of the ALGOL statement. However, there exists in ALGOL 60 the possibility of situations where a casual interpretation of a statement is misleading. Let us in the above example remove the condition that j is a simple integer variable and substitute the following declaration:

  integer procedure j; begin k := k + i; i := i + k; j := i end
This makes the behaviour pattern of the addresses, to which the subscripted variables refer, much less obvious. This example illustrates one of many traps in the way of optimum translation. In practice the most frequent situation is the simpler one so that economization where possible is worth while. Thus the translation of for statements may be dichotomized as follows:
1. The detection of situations which may contain a trap, and then translating those parts of the program literally.
2. The detection of simple situations, where algorithms can be constructed for more economic translation, and then realizing these algorithms.
The idea of mechanical optimization of for statements is not new (Refs. 11 and 6) but for clarification of the methods used to detect traps some description of the particular optimization method adopted is required. So we give, firstly, the general method of translation of for statements and subscripted variables; secondly, a description of the optimization techniques; and, finally, the methods by which trap situations may be detected and the resulting effects on optimization. Before continuing it is worth while pointing out one or two general points of strategy.
Each Programme Level is treated completely separately. This is essential because of the dynamic way in which Programme Levels interact. The 'for statement' structure within each level is dealt with in 'inside-out' order. This again is essential to the method as:
1. Inner loops require prior treatment.
2. The behaviour of an inner loop frequently influences the possibilities of optimization in an outer.
If (in an obvious notation) a 'for statement' structure is as follows:
( ) ( ( ( ) ) ( ) ) ( ( ) )
i i k l m m l r r k s t t s
then the order of processing will be: i, m, l, r, k, t, s.
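The inside-out order can be illustrated with a small sketch. It is written in Python rather than ALGOL, and the function name `inside_out_order` and the string encoding of the bracket structure are assumptions made for illustration; the translator described here works on tagged text held on magnetic tape, not on strings.

```python
def inside_out_order(brackets, labels):
    """Return 'for statement' labels in the order their closing bracket is met.

    brackets: string of '(' and ')' giving the nesting of for statements.
    labels:   string of the same length naming the loop each bracket belongs to.
    """
    if len(brackets) != len(labels):
        raise ValueError("brackets and labels must have the same length")
    open_loops = []   # loops entered but not yet closed
    order = []        # loops in processing order (inner before outer)
    for bracket, label in zip(brackets, labels):
        if bracket == "(":
            open_loops.append(label)
        elif bracket == ")":
            if not open_loops or open_loops.pop() != label:
                raise ValueError("unbalanced or mislabelled structure")
            order.append(label)
        else:
            raise ValueError("unexpected character: " + bracket)
    if open_loops:
        raise ValueError("unclosed 'for statement'")
    return order


# The structure quoted in the text.
print(inside_out_order("()((())())(())", "iiklmmlrrkstts"))
# -> ['i', 'm', 'l', 'r', 'k', 't', 's']
```

A loop is emitted when its closing bracket is reached, so every inner loop is listed before the loop that encloses it.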
See, in this connection, Ref. 12. This ordering is achieved by re-arranging the program and introducing 'tags' where inner for statements have been removed. A Tag List is retained as the processing proceeds and is one means by which information is transmitted 'outwards'. Other means are discussed. Where the ALGOL program, as here, is retained on magnetic tape a little ingenuity enables this inversion process to be done without too much time being wasted on winding and re-winding. After the 'for statements' have been processed the program is re-ordered in its original form with regard to the 'for statements' but Programme Levels are retained separately as this is more convenient for the translation proper. The following trivial change to the ALGOL syntax is introduced for convenience of explanation.
General translation of 'for clauses' and subscripted variables
In the running program arrays are stored compactly by columns. In concrete terms this means that we define a storage function for the elements of an array as follows. If an array is declared
array A [L1 : U1, L2 : U2, ..., Ln : Un];
where $L_j$ and $U_j$, $1 \le j \le n$, are arithmetic expressions, then we define $l_1, u_1, \ldots, l_n, u_n$ as the particular values of $L_1, \ldots, U_n$ evaluated from left to right at the time the declaration of A is encountered dynamically. Then we now define:
$$\text{address}(A[i_1, i_2, \ldots, i_n]) = \text{address}(A[l_1, \ldots, l_n]) + \sum_{j=1}^{n} (i_j - l_j) \times \Delta_j$$
where $i_j$ is the particular value yielded by the subscript expression $I_j$ when the subscripted variable is encountered dynamically, and also where
$$\Delta_1 = 1, \qquad \Delta_{j+1} = \Delta_j \times (u_j - l_j + 1), \quad 1 \le j \le n.$$
It suffices to note that each element of the array has a unique address and these are compactly contained in a region of store of size $\Delta_{n+1}$, the total number of elements in the array. The defining relation can now be arranged:
$$\text{address}(A[i_1, \ldots, i_n]) = \Bigl(\text{address}(A[l_1, \ldots, l_n]) - \sum_{j=1}^{n} l_j \times \Delta_j\Bigr) + \sum_{j=1}^{n} i_j \times \Delta_j$$
In the arrangement it should be noted that the first part is a function of the declaration only and the second a function of the $\Delta$'s and the particular subscript only. It is now possible to describe the general method of obtaining the address of a subscripted variable. When an array declaration is encountered the translator produces instructions to evaluate the lower and upper bounds from left to right and places these values on the top of the stack (advancing the ARROW appropriately). The translator then manufactures instructions which give the following quantities as parameters to a closed subroutine:
1. The number of dimensions of the array.
2. The address of the 'array box' relative to the base of the current level.
3. The address of the Dope Vector relative to the base of the current level.
The closed subroutine, known as the Array Declaration Subroutine, is then entered.
The action of the Array Declaration Subroutine is:
1. To compute three quantities for entry in the 'array box':
(a) $\text{address}(A[l_1, \ldots, l_n]) - \sum_{j=1}^{n} l_j \times \Delta_j$;
(b) $\text{address}(A[l_1, \ldots, l_n])$;
(c) the address of the dope vector.
The addresses are computed absolutely, i.e. independent of level. The $\text{address}(A[l_1, \ldots, l_n])$ is where $l_1$ was planted before entry to the subroutine.
2. To set up the values of the Dope Vector. This is a set of n locations in the fixed space of a level, consisting of $\Delta_{n+1}, \Delta_2, \Delta_3, \ldots, \Delta_n$ in this order.
3. The arrow is set equal to $\text{address}(A[l_1, \ldots, l_n]) + \Delta_{n+1}$.
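As a rough illustration of the storage function and of the quantities placed in the 'array box', the following Python sketch computes the increments for a declared array and then forms the address of an element from them. The names `ArrayBox`, `declare_array` and `subscript_address` are hypothetical. The sketch assumes column-wise storage with the first increment equal to 1 and each later increment equal to the previous one multiplied by the extent of the previous dimension. It keeps the first increment explicitly in the dope vector, which is a simplification of the layout described above.

```python
from dataclasses import dataclass


@dataclass
class ArrayBox:
    virtual_origin: int  # address(A[l1..ln]) minus the sum of l_j * delta_j
    base: int            # address(A[l1..ln]), the first element
    dope: list           # [delta_1, ..., delta_n]
    size: int            # delta_(n+1), the total number of elements


def declare_array(base, bounds):
    """bounds is a list of (lower, upper) pairs, already evaluated left to right."""
    deltas = [1]                                     # delta_1 = 1
    for lower, upper in bounds:
        deltas.append(deltas[-1] * (upper - lower + 1))
    size = deltas.pop()                              # delta_(n+1)
    origin = base - sum(lower * d for (lower, _), d in zip(bounds, deltas))
    return ArrayBox(origin, base, deltas, size)


def subscript_address(box, subscripts):
    """General (unoptimized) address: one multiplication per subscript."""
    return box.virtual_origin + sum(i * d for i, d in zip(subscripts, box.dope))


# Hypothetical example: array A[1:10, 1:5] planted at absolute address 1000.
box = declare_array(1000, [(1, 10), (1, 5)])
print(box.dope, box.size)                # -> [1, 10] 50
print(subscript_address(box, (1, 1)))    # -> 1000
print(subscript_address(box, (10, 5)))   # -> 1049
```

The part of the address that depends only on the declaration is computed once, at declaration time, so that each later reference costs only the sum over the subscripts.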
The general method of evaluation of a subscripted variable is only used when optimization is not possible for some reason and the method used is as follows. The subscript expressions $I_1, \ldots, I_n$ are evaluated and put on top of the stack. Then as parameters to a closed subroutine, the Subscript Address Evaluator, the following are supplied:
1. The address of the 'array box'.
2. The number of dimensions of the array.
The action of the subroutine is to fetch the 'array box' and then form
$$\sum_{j=1}^{n} i_j \times \Delta_j$$
It then adds this to
$$\text{address}(A[l_1, \ldots, l_n]) - \sum_{j=1}^{n} l_j \times \Delta_j$$
giving the absolute address of the subscripted variable. This it leaves in the top cell of the nesting accumulator, i.e. a standard location. The use of this address is determined by the subsequent external instructions.
When an array is a parameter to a procedure call some manipulation is necessary. If an array is called as a value parameter then the translator puts into the formal array box initially the parameter in the actual array box. After the actual parameters have all been set up the translator sets entry to the Array Copying Subroutine after supplying it with the address of the formal array box. This subroutine makes a copy of the array on the top of the stack and the amended version of the array box is replaced in the formal parameter
position. If an array is called by name the actual to formal transfer is set up but no copying is done.
The translation process proper has no knowledge of for statements and every for statement in the ALGOL program will be opened out by the for statement processor. This opening out is in strict accordance with 4.6 of Ref. 1, except for 4.6.4.2, where reformulation 14 of Ref. 10 is preferred. The 'for processor' produces the necessary ALGOL statements and labels to replace the 'for clause'. As labels have at this stage already been processed these labels are specially marked as requiring no stack manipulation. When a 'for list' contains more than one 'for list' element the whole of the statement following the 'for clause' is made into a closed subroutine. This subroutine is called individually by the expanded form of each element. Again, this method is only invoked in full when no optimization is possible.
For statement optimization
Let us consider the following statement:
for k := a step b until c do begin x := x + A[m, k] end
where a, b, c, k, m are simple integers. Now from the previous discussion:
$$\text{address}(A[m, k]) = \text{address}(A[l_1, l_2]) + \sum_{j=1}^{2} (i_j - l_j) \times \Delta_j$$
where $i_1 = m$ and $i_2 = k$. We now observe:
1. $\text{address}(A[m, a]) = \text{address}(A[l_1, l_2]) - \sum_{j=1}^{2} l_j \times \Delta_j + m \times \Delta_1 + a \times \Delta_2$;
2. the increment to $\text{address}(A[m, k])$ at each iteration is $b \times \Delta_2$.
Thus if before entering the loop we load a register with address(A[m, a]) and add to this register on each iteration the value $b \times \Delta_2$, then at all times we have available the correct address of A[m, k]. If the register is the M part of a Q-store the incrementation and address modification is obviously simple. This idea can be expressed concisely by the following definitions:
1. L(A[m, k]) means the absolute address of A[m, k]. The L stands for location.
2. I(A[n, p]) means the value of $\Delta_1 \times n + \Delta_2 \times p$, where $\Delta_1$, $\Delta_2$ refer to array A. It is an increment rather than an address.
We can now express the optimization process as:
R2: =I(A[O,b]); Rl: = L (A [m,~a]); for k : = a step b until c do begin x: = x + 'contents of the address contained in RI' ; Rl : = RI + R2 end Alternatively, if there is a Q-store available, e.g. Q15 115: = I (A [0, bJ ); MIS: = L (A a]); for k: = a step b until c do begin x: = x + 'contents of the address contained in MIS'; 'increment MIS' end
em,
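The effect of the two registers can be checked with a short simulation. This is only a sketch in Python: the store is modelled as a dictionary from absolute addresses to values, and the names `step_until`, `sum_row_general` and `sum_row_optimized` are invented for the illustration. The variables `r1` and `r2` play the parts of R1 and R2 above.

```python
def step_until(a, b, c):
    """Yield the values of k in 'for k := a step b until c' (b a nonzero integer)."""
    sign_b = 1 if b > 0 else -1
    k = a
    while sign_b * (k - c) <= 0:
        yield k
        k += b


def sum_row_general(store, origin, d1, d2, m, a, b, c):
    """x := x + A[m, k], recomputing the full address on every iteration."""
    x = 0
    for k in step_until(a, b, c):
        x += store[origin + m * d1 + k * d2]
    return x


def sum_row_optimized(store, origin, d1, d2, m, a, b, c):
    """The same sum with the address held in a register and stepped by an increment."""
    r2 = 0 * d1 + b * d2            # R2 := I(A[0, b])
    r1 = origin + m * d1 + a * d2   # R1 := L(A[m, a])
    x = 0
    for _ in step_until(a, b, c):
        x += store[r1]              # contents of the address contained in R1
        r1 += r2                    # R1 := R1 + R2
    return x


# Hypothetical array A[1:3, 1:4] stored by columns from address 100,
# with A[i, j] holding the value 10*i + j.
d1, d2 = 1, 3
origin = 100 - (1 * d1 + 1 * d2)
store = {origin + i * d1 + j * d2: 10 * i + j
         for i in range(1, 4) for j in range(1, 5)}

print(sum_row_general(store, origin, d1, d2, 2, 1, 1, 4))    # -> 90
print(sum_row_optimized(store, origin, d1, d2, 2, 1, 1, 4))  # -> 90
```

Inside the loop the optimized form needs one addition per iteration in place of the multiplications of the general form.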
Now one point of the inside-out working can be illustrated by the following example of an operation on a square array:
Y: for i := 1 step 1 until n do
Z: for j := i + 1 step 1 until n do A[j, i] := A[i, j];
The first statement Z can be processed, giving
Z: I14 := I(A[1, 0]); M14 := L(A[i + 1, i]); I15 := I(A[0, 1]); M15 := L(A[i, i + 1]);
for j := i + 1 step 1 until n do
begin 'address M14' := 'address M15'; 'increment M14 and M15' end
Now the initial values of I14, I15 are recognizable as independent of i, and M14 and M15 can be further processed when considering statement Y. After replacing the increments by arithmetic expressions we obtain:
Y: I14 := a1; I15 := a2; R1 := L(A[2, 1]); R2 := a1 + a2; R3 := L(A[1, 2]); R4 := a1 + a2;
for i := 1 step 1 until n do
begin Z: M14 := R1; M15 := R3;
for j := i + 1 step 1 until n do
begin 'address M14' := 'address M15'; 'increment M14, M15' end;
R1 := R1 + R2; R3 := R3 + R4 end
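The register form of the square-array example can be compared with the literal form by simulation. The Python sketch below is an illustration only: the store is a dictionary keyed by absolute address, `origin` stands for the address of the (possibly fictitious) element A[0, 0], and the function names are assumptions. The variables `r1` to `r4`, `m14`, `m15`, `i14` and `i15` mirror the registers named in the text.

```python
def transpose_direct(store, origin, d1, d2, n):
    """A[j, i] := A[i, j] for j > i, computing every address in full."""
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            store[origin + j * d1 + i * d2] = store[origin + i * d1 + j * d2]


def transpose_registers(store, origin, d1, d2, n):
    """The same operation using initial addresses and increments only."""
    i14, i15 = d1, d2                  # I(A[1, 0]) and I(A[0, 1])
    r1 = origin + 2 * d1 + 1 * d2      # R1 := L(A[2, 1])
    r3 = origin + 1 * d1 + 2 * d2      # R3 := L(A[1, 2])
    r2 = r4 = d1 + d2                  # step of both start addresses as i advances
    for i in range(1, n + 1):
        m14, m15 = r1, r3
        for j in range(i + 1, n + 1):
            store[m14] = store[m15]    # 'address M14' := 'address M15'
            m14 += i14
            m15 += i15
        r1 += r2
        r3 += r4


# Hypothetical 4 x 4 array stored by columns, A[i, j] holding 10*i + j.
n, d1, d2, origin = 4, 1, 4, 0
original = {origin + i * d1 + j * d2: 10 * i + j
            for i in range(1, n + 1) for j in range(1, n + 1)}

direct = dict(original)
transpose_direct(direct, origin, d1, d2, n)
optimized = dict(original)
transpose_registers(optimized, origin, d1, d2, n)
print(direct == optimized)   # -> True
```

Both start addresses advance by the same quantity when i advances by 1, which is why R2 and R4 hold the same value.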
Further economies associated with R1, R2, R3, R4 are possible, quite apart from economy in counting, but this would obscure the basic principle. At each stage in the processing of a 'for statement' the intermediate registers R and the quantities assigned to them are not explicitly entered into the ALGOL program but are contained in the Register List and tagged to indicate their occurrence in the program. Now, since in practice it is highly desirable that this Register List be retained in the fast access storage (i.e. cores), it is necessarily of bounded size. Thus it was decided that some restriction must be placed on the number of subscripts and the complexity of arithmetic expressions allowed in each subscript position. The restrictions decided upon were the following:
1. Only subscripted variables with less than four subscripts to be considered.
2. The only subscript expressions considered are those which can be expressed in the following form:
(a) Containing no identifier or constant which is not a simple integer. Those admissible are called elements.
(b) Containing not more than three elements.
(c) The elements are not connected by more than three operators, two adding and one multiplication.
(d) For other reasons, not quadratic in any variable.
Examples:
Types considered: A[i], A[l × i + m, −n × j − p]
Types not considered: A[1, 2, 3, 4], A[B[i]], A[if b then 5 else 4], A[l ÷ m], A[sin(PI/2)]
Thus a total of not more than nine identifiers and nine operators express an element in the Register List. It should be clearly understood that these are not programming restrictions but merely a restriction internal to the
translator. Cognizance of the restrictions though may enable a programmer to improve his program running speed should he so desire. The Register List is organized on a nested principle with 'lock-out' points determined by the placing of tags. For example, with the bracket structure
( ( ) ( ( ) ) )
i j j k l l k i
the order of processing is j, l, k, i. Now clearly when processing l we must 'lock-out' information about j although this will be in the list (waiting for i). While processing k again j is locked out while information about l is highly relevant. This lock-out principle applies to several other lists. There are two other types of optimization applicable to subscripted variables and the Register List. These are:
1. Economies due to the fact that $\Delta_1 = 1$ for all arrays and the $\Delta_i$ are computable at translation time for fixed bound arrays.
2. Equivalence of certain registers in the Register List.
As an example of the first, consider A[i + 7, j + 9] with an array declaration array A[1:30, 1:50]. Now, clearly, if instead of addressing this subscripted variable from store 0 it is addressed from store 7 + 9 × 30 = 277, we can consider this as a subscripted variable A[i, j]. This is, of course, achieved by altering the basic fetch or store instruction. This type of economy is performed before all others. Since $\Delta_1 = 1$, we can reduce A[i + 7, j + 9] to A[i, j + 9], regardless of the declaration. Associated with this is the fact that if at any time we must compute
$$\sum_{j=1}^{n} l_j \times \Delta_j$$
for a fixed bound array, and the $l$'s are known integers, then this calculation is done at translation time. To facilitate this economy information on fixed bound arrays is retained in a Fixed Array List.
As an example of the second type of economy reconsider the operation on a square array. Here R2 and R4 are seen to be identical and arise from two identical entries in the Register List. Thus one of R2, R4 is redundant and can be immediately eliminated. There are, however, more complex ways in which such identities can arise. Also temporary registers which contain different quantities but have disjunct scopes can use the same store. To achieve these economies when an entry is made in the Register List it is given a temporary number which is replaced later by an actual register number. To facilitate this a list containing register numbers and their equivalents is retained. Care has to be
exercised in this context to ensure that registers which appear equivalent are in fact so. For example, consider a structure:
( ( ) ( ) )
i j j k k i
Now two registers may emerge from j and k appearing equivalent in i, but if either j or k increments this register, then they are not actually equivalent (although they may use the same store due to their disjunct scopes in i and k). Intermediate registers are eventually located in the fixed space about a level.
The other sort of optimization which it is possible to perform is associated with 'for list' elements. Let us consider first the 'for list' element A step B until C. If A, B and C are constants, then the number of iterations is computable at translation time and use is made of the counting facility available with Q-stores. If A, B and C are not constant, then some optimization can be achieved by computing the values of B, C and sign(B) outside the loop and putting these values into intermediate registers. These quantities are held in a third type of intermediate register, E (cf. types L and I). If B is a constant, and thus sign(B) is fixed, then the test if sign(B) × (V − C) > 0 can be modified to if V > C or if V < C.
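For constant A, B and C the iteration count can be found in closed form. The Python sketch below shows one way of doing so and checks it against the sign test quoted above. The function names are assumptions, and the sketch presumes integer values, a nonzero step, and a body that does not assign to the controlled variable.

```python
def sign(x):
    return (x > 0) - (x < 0)


def iteration_count(a, b, c):
    """Number of times the body of 'for v := a step b until c' is executed."""
    if b == 0:
        raise ValueError("a zero step gives no well-defined count")
    return max(0, (c - a) // b + 1)   # floor division handles both directions


def simulated_count(a, b, c):
    """Reference count obtained by applying the test sign(B) x (V - C) > 0."""
    v, count = a, 0
    while not sign(b) * (v - c) > 0:
        count += 1
        v += b
    return count


def exhausted(b):
    """With B constant the sign test reduces to a single comparison."""
    if b > 0:
        return lambda v, c: v > c
    return lambda v, c: v < c


print(iteration_count(1, 2, 10))    # -> 5   (v = 1, 3, 5, 7, 9)
print(iteration_count(10, -3, 1))   # -> 4   (v = 10, 7, 4, 1)
print(iteration_count(5, 1, 1))     # -> 0
```

A count known at translation time can be loaded directly into the counter of a Q-store, so that no comparison against C is needed inside the loop.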
Consider next E while F. There is considerably less that can be done to this type of element. However, if F is the symbol true we may remove the test. Also, there is one case which receives special treatment. Say we have either
for i := p, i + m while F do
or
for i := i + m while F do
where p and m are simple integer variables or integer constants; these situations can be regarded as equivalent to
for i := p step m until '¬ F' do
and
for i := i + m step m until '¬ F' do
respectively.
There is little point in trying to optimize in the case of an element which is just a single arithmetic expression.
Detection of optimization inhibitors
Previously we have discussed the types of optimization which the translator attempts to achieve. As pointed out, if certain features exist in the
particular ALGOL program being translated, then optimization becomes more difficult. The features which are significant in this context can be summarized as follows:
1. A statement exists either statically or dynamically in the for body which
(a) makes assignment to the control variable;
(b) makes assignation to a variable which occurs either statically or dynamically in the arithmetic expression of a subscripted variable, or the B or C of an A step B until C element, or the E or F of an E while F element;
(c) causes exit from the 'for statement' before the list is exhausted.
2. The control variable is not a simple integer variable.
The existence of any of the above conditions will in the translator under discussion inhibit optimization to a greater or lesser extent. The reasons for these inhibitions are usually difficulties of logic but they are often influenced by practical considerations when dealing with a machine of finite fast access storage. We now consider in more detail the ways in which these 'undesirable' situations may be detected and the effect that this detection has on the optimization.
One way in which an undesirable assignation or jump may occur is by means of a procedure call. In order to determine which calls have this effect the procedure declarations are inspected and divided into three classes:
1. Normal Procedures, which are defined as procedures which have:
(a) no own variables declared;
(b) no abnormal exit or use of a switch;
(c) no non-local assignments;
(d) calls only normal procedures internally;
(e) no parameters specified as label or switch;
(f) no assignation to a parameter specified as integer.
2. Conditional Sneak Procedures, which are the same as Normal Procedures except for the relaxation of condition (f).
3. Sneak Procedures, which are procedures which are neither Normal nor Conditional Sneak.
The implications of the above definitions are that a call of a Normal Procedure cannot have an undesirable effect.
A call of a Conditional Sneak Procedure may only assign to a certain limited number of integer variables which are listed (Integer Assignment
List). A call of a Sneak Procedure may affect any variable or have an abnormal exit. It is seen that condition (d) implies a recursive definition of these three classes. The recursion can be conveniently broken by logical operations involving the call matrix. (If a vector representing the truth of a property of the procedures is logically premultiplied by the call matrix this will yield a vector showing all the procedures which by implication have this property.) The existence of the call of a Sneak Procedure anywhere in a 'for statement' completely inhibits optimization. It also has significance in the placing of Q's discussed later.
Apart from procedures, undesirable effects may arise from direct assignation to variables. However, the treatment is relatively simple when we remember that only simple integer variables may occur in subscript expressions which we attempt to optimize. At an early stage in the processing of a 'for statement' the body is reviewed and direct assignments to simple integer variables are noted in a list called the Integer Assignment List (I.A. List). This list is worked on the lock-out principle like the Register List. If the I.A. List threatens to exceed capacity then the translator assumes it has met a Sneak procedure. This device may, of course, result in a less efficient translation. It is preferable, however, to halting the computer due to a restriction which the programmer may have difficulty in understanding. This sort of device is used in nearly all cases where lists are retained. Having established those simple integer variables to which assignation is made, any subscript expression which contains such a variable is non-optimizable and this subscripted variable is now translated in the general way. If the arithmetic expressions B in A step B until C or E in E while F contain:
1. a call of a non-special procedure;
2. a variable which is not a simple integer;
3. a non-integer constant;
4. an if clause;
5. a variable in the I.A. List,
then all optimizable subscripts which are a function of the control variable are changed to non-optimizable and translated in the general way.
In order to discover go-to statements leading out of a 'for statement', the labels in designational expressions are entered in a list called the Right-Hand-Label List. The labels in the 'for body' are entered in a list called the Left-Hand-Label List. After the lists are compiled the two lists are compared
and corresponding labels cancelled. If there is an excess in the Right-Hand-Label List, then we have an exit of the type we are looking for. The Right-Hand Labels are retained in the list for comparison purposes in any surrounding for statement. Excess members in the Left-Hand List are ignored. (They may well be just descriptive labels.) In the event of a non-standard exit the control variable must be retained explicitly. If the control variable of a 'for statement' is not a simple integer variable the 'for statement' is regarded as non-optimizable.
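The three-way procedure classification of this section, with its recursive condition (d), can be sketched as a propagation to a fixed point over the call relation. The Python below is an illustration under stated assumptions rather than the translator's actual mechanism: each procedure is described by two hypothetical flags, `local_sneak` (it breaks one of (a), (b), (c) or (e) in its own text) and `assigns_int_param` (it breaks (f)), plus the set of procedures it calls. The sketch reads the definitions literally, so a procedure calling a Conditional Sneak Procedure is itself classed as a Sneak.

```python
def classify(procedures):
    """Map each procedure name to 'normal', 'conditional sneak' or 'sneak'."""
    # Procedures that fail to be Normal on the evidence of their own text.
    not_normal = {name for name, p in procedures.items()
                  if p["local_sneak"] or p["assigns_int_param"]}
    # Condition (d): a caller of a non-Normal procedure is itself not Normal.
    # Repeat until nothing changes (the effect of repeated logical
    # multiplication of the property vector by the call matrix).
    changed = True
    while changed:
        changed = False
        for name, p in procedures.items():
            if name not in not_normal and p["calls"] & not_normal:
                not_normal.add(name)
                changed = True
    classes = {}
    for name, p in procedures.items():
        if name not in not_normal:
            classes[name] = "normal"
        elif not p["local_sneak"] and not (p["calls"] & not_normal):
            classes[name] = "conditional sneak"   # only (f) is relaxed
        else:
            classes[name] = "sneak"
    return classes


# Hypothetical program: names and properties are invented for the example.
procedures = {
    "square":    {"local_sneak": False, "assigns_int_param": False, "calls": set()},
    "bump":      {"local_sneak": False, "assigns_int_param": True,  "calls": set()},
    "escape":    {"local_sneak": True,  "assigns_int_param": False, "calls": set()},
    "wrapper":   {"local_sneak": False, "assigns_int_param": False, "calls": {"square"}},
    "uses_bump": {"local_sneak": False, "assigns_int_param": False, "calls": {"bump"}},
    "outer":     {"local_sneak": False, "assigns_int_param": False, "calls": {"uses_bump", "square"}},
}
print(classify(procedures))
```

Starting from the procedures known to be non-Normal and only ever adding to that set means that a recursive procedure with no offending feature of its own remains Normal.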
Q-stores
There remain two points with regard to Q-stores (index registers). The first of these is the placing of the automatic increment symbol. The second is the question of the buffering of Q-stores due to the fact that these are fixed registers in the computer, whereas the Programme Level concept is dynamic. As previously mentioned, Q-stores are divided into three parts, Counter, Increment and Modifier. Now it is possible on any modified instruction to addend a Q. Under these circumstances after the fetch or store instruction has been obeyed 1 is subtracted from the Counter and the Increment is added to the Modifier. Thus, ideally, one would like to place the Q's on the last M reference to a particular subscript combination in the 'for body'. However, this is not always possible because a subscripted variable may occur in a part of the 'for statement' either not necessarily executed or executed more than once during each iteration. Under these circumstances one must resort to an incrementation by other means. Hardware instructions exist for this purpose which we will call the Dummy Increment. Although more complex analysis of the 'for statement' is possible, the following is adopted as a practical compromise solution to the problem. The statements of the 'for body' are examined to detect:
1. Labelled statements.
2. Go-to statements.
3. Statements containing an if clause.
4. Statements containing the call of a Sneak procedure.
5. Statements which are blocks which themselves contain declarations of arrays.
Any such statement is called divergent. After this review all statements up to but not including the first divergent statement are called the Start of a 'for body'. All statements occurring after the last divergent statement are called the Finish of the 'for body'. The rest of the statements are called the Middle of the 'for body'.
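The division into Start, Middle and Finish can be sketched as follows in Python. The input is assumed to be one truth value per statement of the 'for body', saying whether that statement is divergent. The function name is hypothetical, and treating a body with no divergent statement as consisting wholly of Start is an assumption of the sketch, since the text does not cover that case.

```python
def partition_for_body(divergent):
    """Label each statement of a 'for body' as 'start', 'middle' or 'finish'.

    divergent: list of booleans, True where the statement is divergent.
    """
    count = len(divergent)
    positions = [k for k, flag in enumerate(divergent) if flag]
    first = positions[0] if positions else count   # no divergent statement: all Start
    last = positions[-1] if positions else -1
    parts = []
    for k in range(count):
        if k < first:
            parts.append("start")
        elif k > last:
            parts.append("finish")
        else:
            parts.append("middle")   # from first to last divergent statement inclusive
    return parts


print(partition_for_body([False, False, True, False, True, False]))
# -> ['start', 'start', 'middle', 'middle', 'middle', 'finish']
```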
Now references to Q-stores are classified according to whether the last reference occurs in the Start, Middle or Finish. Then Q's are addended according to the following rule. If the last occurrence is in the Start or Finish this last occurrence is given the Q. If the last occurrence occurs in the Middle, then a Dummy Increment is planted in the Finish. This depends on the fact that if the 'for loop' is to be re-iterated, control must pass through Start and Finish and can only go through these once. For these purposes the left part list of an assignment statement is regarded as if it occurred after the expression.
As the 'for loop' and subscript optimization is done for each level independently and the Q-stores by their nature are not in the stack, there arises a problem when one level calls another. Say, for example, that the program is dynamically in a level which uses Q13 to Q15 and a procedure call is encountered which uses Q14 and Q15 internally. Now clearly Q14 and Q15 must be unloaded into the stack and brought back later if or when required. Considerations of reducing this unloading and re-loading effort and the possibility of un-natural exits from procedures rapidly lead to the following rule. Let us suppose that level i uses n(i) Q-stores and further that level i is invoked by levels $j_1, \ldots, j_m$. When entering level i from any of $j_1, \ldots, j_m$ it is necessary to unload 'p' of the Q-stores where:
$$p = \min\bigl(n(i), \max_{1 \le k \le m} n(j_k)\bigr)$$
The mechanization of this is achieved in the following way. As the 'for optimizer' operates the n(i) for each i is discovered. After every level has been processed $\max_k n(j_k)$ can easily be computed by reference to the procedure call matrix. 'p' is then computed for each level and this information is handed to the translator proper. The translator ensures that the entry and exit mechanisms for each procedure dump and restore 'p' Q-stores. To reduce further the number of buffer transfers requires a dynamic analysis of the program.
7. TRANSLATION
The process of translation is split into two distinct phases. The first, transliteration of the ALGOL string into an intermediate code, is a single pass; the second, optimization of the intermediate code to produce KDF9 user code, is a multi-pass processor. The optimizer is under the control of the transliterator and is entered at well-defined points in the ALGOL program.
The transliterator operates in a manner similar to that adopted by all one-pass systems. Its next operation is decided by comparison of operators and delimiters (Ref. 6). The code it produces, however, is designed for easy processing by the optimizer. This code assumes a machine with an infinite nesting store, and is fundamentally a 'reverse-polish' code. However, included in the code words is information about the operands, i.e. their types and the codes which produce them. This makes the code resemble a normal 3-address function code, the third address being implicit in the instruction number. As an example consider the representation of the expression:
a − b × c
1.  a  -  F   fetch 'a'.
2.  b  -  F   fetch 'b'.
3.  c  -  F   fetch 'c'.
4.  3  2  ×   multiply the results of codes 3 and 2.
5.  4  1  −   subtract the result of code 4 from 1.
The column of 'operators' (the right-hand column) is a straight reverse-polish representation of the expression. The two operands of an operator (e.g. × in code 4) are assumed to be in N1 and N2 in that order (i.e. 3 is in N1).
The code includes provision for describing Fetch and Store in considerable detail. Since none of the stack manipulations are generated at this point, the Fetch and Store codes contain descriptions of locale and modifiers. The type of any given arithmetic operation is determined at translate time. This is possible if sources of ambiguity are removed. There are three such sources: lack of specifications on a call by name, conditional arithmetic expressions, and the definition of ↑ where an integer exponent is used. The system requires all formal parameters to be specified, thus removing ambiguity one. Conditional arithmetic expressions produce a result of ambiguous type if the two alternative expressions are different, for example: begin real x; integer y; ... x := if x
a closed subroutine if i ~ 8. Because of the redundant floating and fixing operations that this definition is likely to produce, integers must be limited to the size of the mantissa of the floating point number (on KDF9 39 digits).
THE TRANSLITERATION AND OVERALL ORGANIZATION
The action of the Transliteration can be divided into three classes:
A. Generation of the intermediate code.
B. Control of the Optimization process, i.e. definition of entry to the Optimizer.
C. Output of machine code either direct (storage manipulation) or the results of the Optimizer.
The actions are decided on the basis of a comparison between the current operator or delimiter in the program and the top operator or delimiter in an operator stack or push-down list. The operators are divided into eleven classes as follows (twelve if the null class is included).
1. Arithmetic operators: + − ↑ × ÷ /
2. Relational operators: < ≤ = ≥ > ≠
3. Boolean operators: ¬ ∧ ∨ ⊃ ≡
4. Arithmetic expression brackets: ( )
5. Expression delimiters: if then else :=
6. Conditional statement delimiters: IF THEN ELSE
7. Organization delimiters: go to ;
8. Bound pair list delimiters: [ ] ,
9. Suffix list delimiters: [ ] ,
10. Parameter delimiters: ( ) ,
11. Block delimiters: begin end
It will be noticed that for step while until do are all missing. These have been resolved into more fundamental ALGOL statements by the suffix and index evaluation process. Distinction is made between the various uses of some of the delimiters. The most important distinction is that between if then else as used in a conditional expression and if then else (here written as IF THEN ELSE) in a conditional statement. The optimization process is entered whenever a complete assignment is found. It should be noted that where suffixes or bound pair lists are being evaluated by subroutine, then each element is effectively an assignment to the parameter location for that subroutine. Therefore any pair of delimiters
in classes 6, 7, 8, 9, 10 generate an entry to the optimization process; some of them generate an entry to the output process as well. These latter are pairs taken from:
1. go to, semi-colon, semi-colon
2. [ , ]
3. ( , )
4. IF, THEN, ELSE
5. begin end
begin and end generate direct output of stack manipulation instructions either before or after the output corresponding to statements. In addition to the assignments the optimizer also processes the expression between IF and THEN. The optimizer takes the required sequence of codes and produces the corresponding machine instructions; it then replaces the codes by a reference to this sequence of instructions. This is because of the possibility of nested assignments as in parameter planting or suffix evaluation. It will be noted that such nested assignments can only occur between the brackets of classes 8, 9, 10, 11 which are therefore used to control the field of operation of the optimizer at any point within a statement. Normally the optimizer treats the whole statement which usually corresponds to a 'begin - ;' pair of delimiters or 'THEN - ELSE' or 'THEN - ;' or 'ELSE - ;' etc.
The parameter planting process is controlled by the Declaration List set up during the procedure classification phase. The location of the formal parameter is the current ARROW position, which is stepped on one or two places, after it is planted, depending on the parameter specification. This process is also used in planting suffix or bound pair values for subroutines dealing with the general suffixed variable and the array declarations. The translation process operates on a Programme Level as a unit. These units are in any order with the proviso that those levels which have been specified as 'special functions' shall be translated first.
Optimization
The prime aim of this phase of translation is to produce an efficient evaluation of an expression which makes full use of the KDF9 Nesting Accumulator. The fact that the code produced by the transliteration phase undergoes further processing considerably simplifies the problem of the type of an operator, and the point at which suffixes or functions are evaluated within an expression. It is well known that there are many traps awaiting the unwary program which attempts to optimize an expression. It is obviously desirable in the
evaluation of an expression which contains suffixed variables and functions that the subscript values and function values should be determined before the main body of the expression is started. The reason is ofcourse that any interruption in the normal reverse-polish flow of instructions leads to undesirable effects such as necessity for unloading the Nesting Accumulator. The basic bar to optimization of an expression is the general form of a function. This can preclude any attempt to re-order the evaluation of subexpressions, Any function within an expression, together with its parameter, is replaced by a reference to a sequence ofinstructions which evaluate the parameters, plant them, and enter the function. Obviously such sequences may refer to sub-sequences. As an example consider the following expression. Example 7.1.
    a + b × sin(x + cos(y) + z) + t
The first sequence evaluated is one to plant the value y to the function cos. We then evaluate x + cos(y) + z to plant to sin. The next is to evaluate the whole expression. The sequence of lists of subexpressions is as follows:
    R1    'y', planted as a parameter
    R2    R1, enter cos
    R3    'x + R2 + z'
    R4    R3, enter sin
    R5    a + b × R4 + t
Each of these references is marked with an S if the subexpression it indexes contains a 'sneak' property, or a reference to a sequence marked as S. If sin is assumed to have a sneak property, then R4 and consequently R5 are both labelled with an S. It will be remembered that a classification of the procedures has been defined which marks functions which can have no peculiar properties: the Special Functions. Any reference can be moved to the front of a sequence if that reference is not marked with an S. This has the effect of putting the evaluation of a function before the evaluation of the body of the expression, a situation which is highly desirable in that it avoids the disruption of the storage of Partial Results which occurs if the Nesting Accumulator has to be unloaded. The first operation of the optimizer is to 'float' such references as the expression contains as far 'up' the expression as they will go. In the example we can float R2 to the top of the sequence denoted by R3.
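As a rough illustration of the marking rule, the following Python sketch propagates S marks through the five sequences of Example 7.1 and lists the references that may float to the front of a sequence. The dictionary layout and the names `mark_sneaks` and `floatable_refs` are assumptions made for this illustration and are not taken from the KDF9 translator. Treating an S-marked reference as a barrier that stops the scan is likewise a simplification.

```python
# Sequences of Example 7.1. "items" holds operands, operators and references;
# "enters" names the function entered at the end of the sequence, if any.
SEQUENCES = {
    "R1": {"items": ["y"], "enters": None},
    "R2": {"items": ["R1"], "enters": "cos"},
    "R3": {"items": ["x", "+", "R2", "+", "z"], "enters": None},
    "R4": {"items": ["R3"], "enters": "sin"},
    "R5": {"items": ["a", "+", "b", "*", "R4", "+", "t"], "enters": None},
}


def mark_sneaks(sequences, sneak_functions):
    """Return the names of all sequences that carry an S mark.

    A sequence is marked if it enters a sneak function, or if it refers
    to a sequence that is itself marked. The loop repeats until no new
    mark is added, so marks propagate through any depth of nesting.
    """
    marked = set()
    changed = True
    while changed:
        changed = False
        for name, seq in sequences.items():
            if name in marked:
                continue
            direct = seq["enters"] in sneak_functions
            inherited = any(item in marked for item in seq["items"])
            if direct or inherited:
                marked.add(name)
                changed = True
    return marked


def floatable_refs(name, sequences, marked):
    """List the references in a sequence that may be evaluated first.

    Scanning stops at the first S-marked reference (assumed barrier).
    """
    result = []
    for item in sequences[name]["items"]:
        if item in marked:
            break
        if item in sequences:
            result.append(item)
    return result


marked = mark_sneaks(SEQUENCES, {"sin"})
print(sorted(marked))                           # ['R4', 'R5']
print(floatable_refs("R3", SEQUENCES, marked))  # ['R2']
```

With sin assumed to be a sneak, the sketch marks R4 and R5 and still allows R2 to float to the top of R3, in agreement with the text.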
The barriers to this, and to all the remaining optimizing operations, are references to expressions which are Sneaks, and labels generated within an expression during the evaluation of a conditional expression. For example,
    if exp 1 then exp 2 else exp 3
is treated by the organization as one expression. However, the labels generated by translation of if then else form barriers to the optimizer, which therefore regards the above sequence as a sequence of three independent expressions. The operations are also suspended whenever a reference is made to a formal parameter of type real, integer or Boolean which is called by name. Calls by value and calls by name of arrays do not affect the optimizer since they do not generate 'sneaks'. Functions ultimately leave their answers in the Nesting Store, so that a reference to a sequence of instructions calling a function can be regarded as a fetch operation. Suffix expressions are treated like function designators in that the evaluation produces a value for the modifier which is left in the Nesting Store. When it is required to use this suffix to fetch an element of an array it is extracted from the Nesting Store and put into a standard modifier. The process adopted, if the suffix value is not accessible, is dealt with under the general topic of Partial Result storage. The rest of the action of the optimizer is best described in conjunction with an example. The example chosen is the following simple expression.
Example 7.2.
    c × (a - b) ↑ 4 - c × (a - b) + a × b
where a, b, c are all real. The intermediate code produced by the transliterator is (type markers are omitted):
     1. c -F.
     2. a -F.
     3. b -F.
     4. 3 2 -.
     5. '4' -F.    (constant)
     6. 5 4 ↑.
     7. 6 1 ×.
     8. c -F.
     9. a -F.
    10. b -F.
    11. 10 9 -.
    12. 11 8 ×.
    13. 12 7 -.
    14. a -F.
    15. b -F.
    16. 15 14 ×.
    17. 16 13 +.
The first pass detects and eliminates common subexpressions. This is achieved by merely scanning the expression for identical codes. (This includes identical fetch codes.) When an identity is found the second code is deleted and all references to it are replaced by references to the first one. Allowance is made for commutative functions. It should be noted that only computationally equivalent subexpressions are detected, i.e.
    a + b + c ≠ a + (b + c)
This means that if a programmer inserts brackets the optimization process assumes that they are there for a purpose, and they are not overruled. The result of this pass on the example is as follows:
     1. c -F.
     2. a -F.
     3. b -F.
     4. 3 2 -.
     5. '4' -F.
     6. 5 4 ↑.
     7. 6 1 ×.
    12. 4 1 ×.
    13. 12 7 -.
    16. 3 2 ×.
    17. 16 13 +.
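The scan for identical codes can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the tuple layout `(number, operator, operands)`, the name `eliminate_common`, and the use of `*` and `^` for the multiply and power operators are all choices made for this illustration, not the translator's own representation. Commutative operators are matched by sorting their operand references before comparison.

```python
COMMUTATIVE = {"*", "+"}

# Intermediate code of Example 7.2: (number, operator, operands).
# "F" is a fetch; other operators refer to earlier codes by number.
CODES = [
    (1, "F", ("c",)), (2, "F", ("a",)), (3, "F", ("b",)),
    (4, "-", (3, 2)), (5, "F", ("'4'",)), (6, "^", (5, 4)),
    (7, "*", (6, 1)), (8, "F", ("c",)), (9, "F", ("a",)),
    (10, "F", ("b",)), (11, "-", (10, 9)), (12, "*", (11, 8)),
    (13, "-", (12, 7)), (14, "F", ("a",)), (15, "F", ("b",)),
    (16, "*", (15, 14)), (17, "+", (16, 13)),
]


def eliminate_common(codes):
    """Delete every code identical to an earlier one and redirect references."""
    first_seen = {}  # canonical form -> number of the first code computing it
    alias = {}       # deleted code number -> surviving code number
    kept = []
    for number, op, operands in codes:
        if op == "F":
            resolved = operands  # a variable or constant, not a reference
            key = (op, resolved)
        else:
            # Redirect references to deleted codes before comparing.
            resolved = tuple(alias.get(ref, ref) for ref in operands)
            if op in COMMUTATIVE:
                key = (op, tuple(sorted(resolved)))
            else:
                key = (op, resolved)
        if key in first_seen:
            alias[number] = first_seen[key]
        else:
            first_seen[key] = number
            kept.append((number, op, resolved))
    return kept


reduced = eliminate_common(CODES)
print([number for number, _, _ in reduced])
# [1, 2, 3, 4, 5, 6, 7, 12, 13, 16, 17]
```

Because brackets produce different operand references, `a + b + c` and `a + (b + c)` yield different codes and are never merged by this scan.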
Apart from eliminating repetitive fetching of variables, the duplicate calculation of (a - b) has also been eliminated. Note further that code 16 now refers to the results of codes 3 and 2. It is obviously desirable that, if possible, code 16 should be calculated whilst its operands are in the top cells of the Nesting Accumulator. This leads to the basic idea of the next process, which is to 'float' the codes up to the earliest position at which both operands become available. The fetch instructions are fixed points in this process and do not move, although naturally codes can move past them.
When two or more codes operating on one or more common operands have to be placed in the list, they are placed in their numerical order. The result of this process on the example is on the left:
     1. c -F.             1. c -F.
     2. a -F.             2. a -F.
     3. b -F.             3. b -F.
     4. 3 2 -.           16. 3 2 ×.
    16. 3 2 ×.            4. 3 2 -.      LL
    12. 4 1 ×.           12. 4 1 ×.
     5. '4' -F.           5. '4' -F.
     6. 5 4 ↑.            6. 5 4 ↑.      LL
     7. 6 1 ×.            7. 6 1 ×.      LL
    13. 12 7 -.          13. 12 7 -.     LL
    17. 16 13 +.         17. 16 13 +.    LL
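The floating pass described above can be sketched in Python. The sketch assumes one particular reading of the rule: fetches keep their order, and after each fetch every code whose operands are all available is emitted, wave by wave, with the codes of one wave in numerical order. The function name `float_codes` and the tuple layout are illustrative only. On the example this reading reproduces the left-hand listing.

```python
# Codes of Example 7.2 after common subexpression elimination:
# (number, operator, operands). "F" is a fetch and is a fixed point.
REDUCED = [
    (1, "F", ("c",)), (2, "F", ("a",)), (3, "F", ("b",)),
    (4, "-", (3, 2)), (5, "F", ("'4'",)), (6, "^", (5, 4)),
    (7, "*", (6, 1)), (12, "*", (4, 1)), (13, "-", (12, 7)),
    (16, "*", (3, 2)), (17, "+", (16, 13)),
]


def float_codes(codes):
    """Return code numbers in the order produced by floating."""
    fetches = [code for code in codes if code[1] == "F"]
    pending = {code[0]: code for code in codes if code[1] != "F"}
    available = set()
    order = []
    for number, _, _ in fetches:
        order.append(number)
        available.add(number)
        # Emit waves of codes whose operands have all become available.
        while True:
            ready = sorted(
                n for n, (_, _, operands) in pending.items()
                if all(ref in available for ref in operands)
            )
            if not ready:
                break
            for n in ready:
                order.append(n)
                available.add(n)
                del pending[n]
    return order


print(float_codes(REDUCED))
# [1, 2, 3, 4, 16, 12, 5, 6, 7, 13, 17]
```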
It will be noted that 16 has moved to a position adjacent to 4. One further scan is now required to order correctly those codes which have been placed in position according to the arbitrary rule given above. The rule applied is to re-order such codes in the reverse order to that in which they are initially used. The result is to interchange 4 and 16, since 4 is used before 16. One last run is to mark the last appearance or use of any operand by an L marker. The final result is on the right above. Production of machine code can now take place. This is basically one through pass of the intermediate codes. During this run a record is kept of the current state of the Nesting Accumulator; this record is used to generate the manipulative functions required to obtain the operands. The program generated is as follows:
    Instruction     Current Nesting Accumulator     Note
    'c'             1
    'a'             2, 1
    'b'             3, 2, 1
    DUPD            3, 2, 3, 2, 1                   1
    ×               16, 3, 2, 1
    PERM            3, 2, 16, 1
    -               4, 16, 1
    CAB             1, 4, 16
    DUPD            1, 4, 1, 4, 16
    ×               12, 1, 4, 16
    CAB             4, 12, 1, 16
    DUP × DUP ×     6, 12, 1, 16                    2
    CAB             1, 6, 12, 16
    ×               7, 12, 16
    REV             12, 7, 16
    -               13, 16
    +               17

1. Operands 3, 2 are both required again; copies are therefore generated.
2. Calculation of (a - b) ↑ 4. Open sequences of instructions are generated if the power used is a positive integer ≤ 7.
The use of manipulative functions to obtain the required operands should be noted.
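The accumulator column above can be checked with a small symbolic simulation in Python. This is a sketch under assumptions: the effects of DUPD, PERM, CAB and REV are inferred from the accumulator states listed in the table rather than taken from the KDF9 manual, the raising to the fourth power is modelled as a single step named `POWER` that replaces the top cell, and all operation names used as tags (`FETCH`, `BINARY`, `POWER`) are invented for the illustration.

```python
def run(program):
    """Return the accumulator contents (top cell first) after each step."""
    nest = []
    trace = []
    for instr, arg in program:
        if instr == "FETCH":        # push a new operand
            nest.insert(0, arg)
        elif instr == "DUPD":       # duplicate the top two cells
            nest[0:0] = nest[0:2]
        elif instr == "PERM":       # top cell moves to third place
            nest[0:3] = [nest[1], nest[2], nest[0]]
        elif instr == "CAB":        # third cell moves to the top
            nest[0:3] = [nest[2], nest[0], nest[1]]
        elif instr == "REV":        # exchange the top two cells
            nest[0:2] = [nest[1], nest[0]]
        elif instr == "BINARY":     # two operands replaced by result `arg`
            nest[0:2] = [arg]
        elif instr == "POWER":      # top operand replaced by result `arg`
            nest[0:1] = [arg]
        else:
            raise ValueError(f"unknown instruction: {instr}")
        trace.append(list(nest))
    return trace


# The generated program of Example 7.2; numbers are intermediate codes.
PROGRAM = [
    ("FETCH", 1), ("FETCH", 2), ("FETCH", 3),
    ("DUPD", None), ("BINARY", 16), ("PERM", None), ("BINARY", 4),
    ("CAB", None), ("DUPD", None), ("BINARY", 12), ("CAB", None),
    ("POWER", 6), ("CAB", None), ("BINARY", 7), ("REV", None),
    ("BINARY", 13), ("BINARY", 17),
]

trace = run(PROGRAM)
for state in trace:
    print(", ".join(str(cell) for cell in state))
```

The final state is the single cell 17, and no state holds more than five cells, so every operand stays within reach of the manipulative instructions.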
Partial result organization
It may so happen that, despite the efforts at organizing the sequence of operations, an operand may be either inaccessible (i.e. not in one of the top four cells of the accumulator when it is wanted) or may cause the capacity of the accumulator to be exceeded. In either case the operand in question has to be stored as a Partial Result. A note is made of this fact and the production of machine code for the current expression or assignment is restarted. When that operand is now 're-generated' it is immediately stored in the chosen location. Thereafter, whenever it is required, it is fetched from that location. Partial Results may also occur through having to unload the entire accumulator. This happens when a reference to a function which is a Sneak is found, or when a formal parameter with a 'type' specification is called by name. Such Partial Results occupy the same kind of locations as the more normal kind outlined above. However, there is no need to restart. The locations used for the partial results are in the fixed storage space of the level. Since an expression is contained within a block, the partial results are stored on 'top' of the fixed space required up to and including that block. The process in effect uses where possible the unused spaces in the fixed store. If no such spaces are available the initial value of the ARROW is advanced. One particular result of the system is that if a restart has to be made after a complete unloading has been done, then all the results which were unloaded together will now be unloaded one by one as they are generated.
This is a desirable feature on machines with Advance Control, and may result in a faster program.
8. INPUT AND SYNTACTIC CHECK
Phase one
This is primarily designed as an external to internal hardware representation converter. The external representation can be defined and can be anything from 5-hole teleprinter code to any 6-bit character code produced by a 'Flexowriter', the particular hardware representation in use being read as data to this phase. The internal hardware representation is in 8-bit characters. In parallel with this hardware conversion a table is compiled of all the declared identifiers together with their internal representation. This table is based on the block and procedure structure of the program. During the construction of this table the checking process starts; the check at this stage is a check of the bracket structure only.
Phase two
During this phase a full scope and syntactic check is done on the program. The 'scope check' checks that all identifiers and labels are used in contexts in which their use is valid, and uses the block structured tables compiled in phase one. Each identifier is thus converted from a character string to a binary internal representation. Each identifier becomes unique and carries with it a complete description of its type and its place of origin in the program (its locale). The syntactic check builds up syntactic units from basic symbols to primaries, to expressions, to basic statements, to statements, and checks the structure of each. It does not, for obvious reasons, check the semantic restrictions of the language. The result of these two phases is therefore either a syntactically correct program or a series of error messages designed to aid correction of the ALGOL text. In view of the later phases the most important thing is that the identifiers have been processed so that all required information about them is immediately available without searching for it.
Phase seven
Final compilation and output is merely conversion to binary of the user code produced by the previous phase.
The advantages gained in going to an intermediate user language are fourfold:
1. The reference problem is removed.
2. The storage problem is removed in that the size of the resulting program need not be considered.
3. It allows the use of other classes of store like the constants (V-stores) where size is again undefined until the end of the translation.
4. The code can be output for diagnostic purposes.
9. CONCLUSIONS
The scheme outlined above exists in detailed block diagram form at the time of writing (Jan. 1962). The outline given has of necessity been restricted to principles of operation. But it is hoped that sufficient detail has been given to show that ALGOL 60, when intelligently used, can be made to yield efficient machine-code programs. Naturally, if it is used with this intention, then the programmer must not use the full flexibility of the language. Some self-discipline on the part of the programmer will be required, and such programs should be written in the most obvious manner; if 'tricks' are used then they will probably not be optimized. No mention has been made of the scheme which is being written round ALGOL 60 to convert it from a language into a User's System. Such a system includes methods of correcting, testing and running ALGOL 60 programs, and is as important as the translator if efficient use is to be made of the machine. It is intended that the system will relay to the programmer the information which he requires to 'debug' his program, as well as providing an overall sequencing control.
ACKNOWLEDGEMENTS
Development of the input phases, especially that part of it which deals with specifiable hardware representation, is due to Mr. A. G. Price. This paper is published by permission of the Director of Research, Nelson Research Laboratories, The English Electric Co. Ltd., Stafford, and The Manager, Data Processing and Control Systems Division, The English Electric Co. Ltd., Kidsgrove.
REFERENCES
1. NAUR, P. (editor), 'Report on the Algorithmic Language ALGOL 60'. Num. Mat. 2, 106-137 (1960).
2. DIJKSTRA, E. W., 'Recursive Programming'. Num. Mat. 2, 312-318 (1960).
3. JENSEN, J. and NAUR, P., 'An Implementation of ALGOL 60 Procedures'. BIT, 1, 38 (1961).
4. JENSEN, J., MANDRUP, P. and NAUR, P., 'A Storage Allocation Scheme for ALGOL 60'. BIT, 2, 89-102 (1961).
5. MARIMONT, ROSALIND B., 'Checking the Consistency of Precedence Matrices'. J.A.C.M., 6, No. 2 (1959).
6. SAMELSON, K. and BAUER, F., 'Sequential Formula Translation'. Commun. A.C.M., 3, No. 2 (1960).
7. KDF9 Programming Manual. English Electric (Data Processing and Control Systems Division), publication.
8. WARSHALL, S., 'A Theorem on Boolean Matrices'. J.A.C.M., 9, No. 2 (1962).
9. NAUR, P., Course of ALGOL 60 Programming, Regnecentralen, Copenhagen.
10. NAUR, P. (editor), ALGOL Bulletin 14, Regnecentralen, Copenhagen.
11. BACKUS et al., 'The Fortran Automatic Coding System'. Proc. Western Joint Computer Conference (1957).
12. ERSHOV, A. P., A Programming Programme for a High Speed Electronic Computer, Academy of Sciences of the U.S.S.R. (Moscow, 1958). English translation published by the Pergamon Press.
13. DAVIS, G. M., 'The English Electric KDF9 Computer System'. The Computer Bulletin (Dec. 1960).