J. SYSTEMS SOFTWARE 1991; 14:3-15
A Comparative Study of Five Language Independent Programming Environments
P. Pintelas
Department of Computer Engineering, University of Patras and Computer Technology Institute (C.T.I.), P.O. Box 1122, 26110 Patras, Greece
S. Tragoudas
Computer Technology Institute (C.T.I.), P.O. Box 1122, 26110 Patras, Greece
This paper is concerned with language independent programming environments only. Five such environments are surveyed: MENTOR, CPS (Cornell Program Synthesizer), IPE (Incremental Programming Environment), GANDALF and PECAN. The survey examines the environments from several points of view: functional aspects, design targets, language incorporation, tools for the program transformation cycle, user communication and interface, and tool modification capabilities. The similarities between the systems, their individual strengths and weaknesses and their future research directions are analyzed and discussed.
1. INTRODUCTION
A programming environment aims at reducing the cost of program development and maintenance by automating the development and modification cycle. Programming environments are classified either as language independent or as language specific, depending on whether they can be customized for any programming language or are tailor-made to accommodate program development for a single programming language. Integrated program support environments (IPSEs) consist of a set of tools, with varying degrees of tool integration, in different environments, to support the phases of the life-cycle. The functional quality of IPSEs is highly improved if they are layered: their tools are found on discrete layers [19]. Language specific IPSEs support the development and modification cycle of programs written in a specific
Address correspondence to P. Pintelas, Dept. of Computer Engineering, University of Patras, Patras 26110 Greece. © Elsevier Science Publishing Co., Inc., 655 Avenue of the Americas, New York, NY 10010
high-level language only. The construction of such environments is a demanding and time-consuming task and their modification, to support program development in a different programming language, is practically equivalent to reconstructing the environment. Language independent IPSEs, on the other hand, are guided by a design philosophy which aims at providing means for configuring the environment so that it can support program development in any programming language. It is relatively easy for an environment implementor to construct a working environment based on a specific language. This paper is only concerned with language independent programming environments. More specifically it is concerned with MENTOR [8, 14], CPS (Cornell Program Synthesizer) [14, 32, 35, 36], IPE (Incremental Programming Environment) [9, 19], GANDALF [10, 11, 21] and PECAN [28, 29]. It is aimed at researchers involved in the design, internal organization and usage of language independent programming environments. It examines and compares the above-mentioned environments from the following points of view: functional aspects, design targets, implementation and tools, program development cycle, language introduction (concrete and abstract syntax), user communication and friendliness, and ease of tool replacement. These environments consist of a compiler or interpreter, a debugger, a linker, a loader, an editor, a set of tools to provide for the incorporation of a programming language into the environment and a set of tools for documentation, static semantic checking, version control, etc. Although none of the environments supports the first two phases of the life cycle (specification and
design) and some [6] do not support maintenance, they can still be characterized as software development environments. The common characteristic they all share is that the editor and all other tools or packages operate on an abstract syntax tree as intermediate code for the programs. For this reason all environments supply tools to maintain the internal representation of the abstract syntax trees. At this point it may be useful to define a few terms.
Section 3 gives a functional view of the surveyed environments discussing issues such as user interface, tool integration, and program representation as abstract syntax trees. The rest of the paper is a detailed view of tools and facilities found in each environment.
The concrete syntax [7] is a representation of the language necessary for the external representation of program units written in this language. The abstract syntax describes the internal representation of the abstract syntax trees and was first introduced by McCarthy [18] in the VDL (Vienna definition language [16, 17]). It can be represented as an AND/OR graph where the OR vertices refer to alternatives and the AND vertices to compositions. There exists a correspondence between the abstract and the concrete syntax of a language. Through the abstract syntax it is possible to extend a language by manipulating the OR vertices. This extension is independent of the concrete syntax. The abstract syntax tree is a form of intermediate code, for the environment's compiler, which does not contain the superfluous information found in the parse tree. More specifically, the intermediate nodes of an abstract syntax tree contain semantic components while the leaves are only constants or identifiers. In contrast, parse trees contain non-terminals on the intermediate nodes and terminals on the leaves. Although the syntax trees differ slightly from one environment to the other, they all comply with the above definitions.
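The difference between a parse tree and an abstract syntax tree can be illustrated with a small sketch. The grammar, the `Node` class and the operator name `plus` are assumptions made only for this illustration and are not taken from any of the surveyed environments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node; `label` is a grammar symbol, an operator or a token."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def size(self) -> int:
        """Total number of nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children)

    def leaves(self) -> List[str]:
        """Leaf labels in left-to-right order."""
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]


# Parse tree for the phrase "(a + 1)" under a hypothetical grammar:
#   expr -> term | expr "+" term      term -> id | num | "(" expr ")"
# Interior nodes are non-terminals, leaves are terminals.
parse_tree = Node("term", [
    Node("("),
    Node("expr", [
        Node("expr", [Node("term", [Node("a")])]),
        Node("+"),
        Node("term", [Node("1")]),
    ]),
    Node(")"),
])

# Abstract syntax tree for the same phrase: the interior node is an
# operator, the leaves are an identifier and a constant.
ast = Node("plus", [Node("a"), Node("1")])

print(parse_tree.size(), parse_tree.leaves())
print(ast.size(), ast.leaves())
```

The parentheses and the chain of non-terminals appear only in the parse tree; the abstract tree keeps just the operator and its operands.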
The major contribution of language independent programming environments (LIPEs) is in automating the tedious and time-consuming process of building a software development environment for the construction and maintenance of software systems. If one follows the taxonomy of software development environments as described in Dart et al. [5], one may classify LIPEs as structure oriented environments which are used to automate the development and maintenance phase of the life cycle. The main characteristic of LIPEs is their ability to support manipulation of program structures in a language-independent manner. The implementor encapsulates the syntactic and semantic properties of a programming language with the help of a specification language or tables. The description of the language for which a working programming environment is constructed is introduced to each LIPE in a different way. In any case, the environment implementor must supply information about:
Language independent programming environments are capable of manipulating hierarchically structured information which could represent a programming language, VLSI design, documentation preparation, etc. If the environment is capable of processing only one of these types of information, it is called special purpose; otherwise it is a truly general purpose environment. The environments mentioned so far are general purpose. There exists, however, at least one special purpose environment dedicated to VLSI design, namely CEYX [14]. VLSI design requires the description of different views of properties, e.g., geometric, electric, of a hierarchically structured object. These descriptions are usually expressed in different languages. CEYX provides for a quick selection of alternative representations for VLSI objects. Section 2 discusses language independent programming environments and their contribution in building systems to facilitate the software development process.
2. LANGUAGE INDEPENDENT PROGRAMMING ENVIRONMENTS
1. lexical syntax for the description of language tokens,
2. concrete syntax for the external representation of the programs,
3. abstract syntax for the description of abstract syntax trees,
4. dynamic semantics for checking input and output relationship consistency,
5. static semantics for checking data structure consistency, and
6. unparsing (pretty printing [23]) for readable representation of abstract syntax trees on the screen.
After such information is supplied by the environment implementor, all five language independent programming environments produce a language-specific program development and modification cycle which consists of the four phases:
1. introduction and modification of a program through the editor,
2. syntax and semantic program correctness through the translator,
3. creation of absolute code with the help of a linker and/or loader, and
4. support of program debugging through the debugger.
LIPEs have made several contributions to technology. They have provided direct manipulation of program structures, multiple views of concrete programs generated from the same abstract program structure, incremental checking of static semantics and semantic information accessible to the implementor, etc. Perhaps the most important contribution is their ability to formally describe the syntax and the static semantics of a language from which an instance of a structure editor can be generated. The editor is the central component of the environment and plays the role of the interface through which the user interacts and through which all structures are manipulated. Incremental program construction as provided by the environments is an improvement in the compile-load-execute cycle. Further contributions are made to support programming-in-the-large and programming-in-the-many as well as history logs and access lists. Static semantic analyzers process the program structure and decorate it with semantic information which the user can access through the editor. Compiler technology, such as attribute grammar evaluators, has been successfully extended to support incremental compilation as demonstrated in IPE, GANDALF, and PECAN. Until recently LIPEs had been accepted primarily as teaching aids in university courses while they had found little acceptance in industry. Lately, as they have become mature enough, they are becoming available in commercial products.
3. FUNCTIONAL ASPECTS OF LIPEs
This section aims at a functional evaluation of the facilities offered by the five environments, although it is recognized that the diversity of functionality provided by the environments and the lack of agreed-upon terminology make such an evaluation very difficult. The evaluation of environments is much more difficult than the evaluation of single components such as static semantic analyzers. It is therefore beyond the scope of this paper to provide a complete functional evaluation of the surveyed environments, as this requires knowledge of and, most importantly, user experience with all five environments, some of which were not fully operational at the time of the survey. The only user experience of the authors was gained from using GANDALF (and, to a lesser extent, MENTOR). Functional aspects of the five LIPEs are discussed in this section. These include facilities for language support, structure editing, programming-in-the-large, incremental compilation, granularity of tools, debugging, verification, etc. Not all environments provide the same facilities.
MENTOR, for example, is the only environment to provide concurrent support for programs developed in different programming languages. Different languages can be manipulated during a single session and they can even be mixed in the same MENTOR object. Appendix A provides a summary of the functional aspects of the surveyed environments. Facilities for language support imply formal means of presenting the abstract and concrete syntax of a programming language to the environment. METAL and SSL are two examples of specification languages used by MENTOR and CPS, respectively. Tabular form is used by the other environments. The structure editor uses abstract syntax trees which were first introduced in MENTOR. Purely structural editing enforces construction of syntactically correct programs, but presents difficulties in entering and modifying language expressions. To overcome these difficulties, CPS allows for the representation of expressions as text. Structure editors are currently being used to support only the coding phase of the life-cycle and they are being viewed as tools for programming-in-the-small, although they can also be used as tools for the specification and design phase. As the environments generate the textual representation of a program from its structure it is possible for different external representations (views) to be generated from the same structure. This allows users to view programs at different levels of detail in different display windows. The PECAN prototype has demonstrated the feasibility of producing graphical representations from program structures. Tool integration and granularity are very important functional characteristics and relate to systems designed to achieve a high degree of cohesion and loose coupling. For different tools to be properly integrated, they must either be adapted to understand a common structural representation or there must be mechanisms for consistent updating of structures through multiple views.
A layered system design structure such as that of PECAN allows for easy tool replacement and modification with minimal disturbance to the rest of the system. The case is not the same with the other environments. The user interface is in a sense the most important part of any environment and is used as the yardstick by which the user judges system quality. Systems with poor interface are liable to be rejected irrespective of facilities offered. Questions which relate to user interface include: Does the interface cater to a number of different terminals? Can the user easily package up his own commands? Does the user have to call each tool explicitly? Can the user work concurrently with other users? The list of questions may be very long and it is
not easy to give an answer unless significant working experience is gained from all environments. As the main user interface is through the structure editor, one may conclude that the most user-friendly interface is provided by PECAN, which supports multiple program views and multiple screen windows and does not restrict the user to writing his program text within placeholders only. In most environments the user need not be aware of what tools he is invoking or communicating with at any particular time, as is the case with standard operating systems. In the environments produced from GANDALF, for example, many of the desired functions associated with system version control and project management are performed without any explicit user request. Incremental program construction is accomplished by reducing the compilation time with the help of a structure editor, providing incremental linking and loading of a system and source-level debugging support. Incremental evaluators for attribute grammars are used to perform a minimal recomputation of attribute values when a modification is made in a program unit under construction. Static semantic information can be attached to program structures and be incrementally analyzed by algorithms that traverse the structures, as in CPS. The environments track the user's changes to the program unit structure and reanalyze only those parts that are affected. This can be done either by explicit user request or upon exit from the modified program unit. Attribute grammars have been successfully extended to support such incremental compilation in IPE, GANDALF, and PECAN. CPS provides an interpreter while in MENTOR the scanner and parser are created by scanner and parser generators. Debugging is the process of locating where errors have occurred in a program unit and correcting the incorrect code. Source level debugging is provided by all environments. In MENTOR, a debugging interpreter is included acting on the abstract syntax tree.
The user is supplied with an execution profile of his program. In CPS, the source-level debugger within the editor provides for powerful tracing of programs at text line level and detection of user modifications which enforce reconstruction of the symbol table information. Because of the interpretive nature of the system, a reverse execution facility is provided which allows for backward execution of programs. In IPE and GANDALF, source-level debugging is provided via editor commands. The unit for tracing and break is the procedure. Single stepping is difficult to achieve. Verification of program correctness involves providing a mathematical proof of correctness which demonstrates the correspondence between a program and its specifications, thus showing that a program is
"correct." The only environment that provides a theorem prover for program verification by introducing first-order predicate logic is MENTOR. Program verification in IPE is supported by a mechanism for run-time checking of assertions for segments of the program. System version control (programming-in-the-large) provides for system versions to be automatically generated. Project management support provides protection against chaotic modifications of the state of a project by treating system descriptions as collections of typed objects. The only environment that provides mechanisms for version control and project management is GANDALF. Many of these mechanisms are activated without explicit user request.
4. DESIGN OBJECTIVES
MENTOR is a general purpose environment oriented toward the implementation, documentation, testing, validation, and management of programs. The Cornell Program Synthesizer (or synthesizer generator) is a language independent programming environment whose major design goal is the provision of tools for syntax directed editing (including incremental checking of static semantic constraints) of programs in the introduced language using a fixed set of three traversal and editing commands. IPE (now part of GANDALF) is an incremental programming environment designed for a single user and therefore provides no support for either project management or version control. The GANDALF programming environment, based on previous work on IPE, was designed to provide support for project management, version control, enforcement of semantic correctness, and experimental data bases. PECAN was designed to exploit the new generation of powerful personal workstations with high graphic display capabilities. The attempt to adopt a layered integration of tools has led to an easy tool and package replacement capability.
5. IMPLEMENTATION AND TOOLS
MENTOR and its tools were written in Pascal and implemented initially on a Honeywell-Bull 68 supported by the MULTICS operating system. It has also been transported to VAX under UNIX for Ada and Pica [14]. In version 5 the available languages are: Pascal, Ada, Metal, Flip, and Rapport. MENTOR includes a structured editor for the incorporated concrete language syntax and an interpreter for the syntax tree manipulation language MENTOL which also provides for the specifications of procedures. It provides tools to support documentation and optimization and a theorem prover for program verification. Debugging is supported
through a debugging interpreter. The scanner and the parser needed for the incorporated language are created by external scanner and parser generators. The design and implementation of CPS started in 1978 under UNIX in C. It has also been implemented on Terak micros [36]. The first language on which a CPS environment was built was PL/CS, a version of PL/I [4], while a later environment was that for the Pica toy language. A Pascal environment is under development. CPS includes an editor which is a mixture of a syntax tree driven editor and a text editor, an interpreter for the execution of programs, a source level debugger, and a link/loader. External scanner and parser generators are used as with MENTOR. IPE is written in GC, a simple extension of C, runs under UNIX, and is implemented on DEC VAX for GC. IPE includes a structured (language independent) editor, an incremental compiler [19] which exploits the capabilities of an interpreter, a linker/loader, the ALOEGEN preprocessor for the incorporation of any new language, the KERNEL (editor tool [22]) which is used to convert the user commands to “primitive” commands, a number of packages for the administration of the data base, and a source level debugger which is part of the Control Module of the editor. The scanner and parser are incorporated with the editor. GANDALF, as mentioned earlier, is based on IPE and consequently includes all of IPE's tools. It is implemented in GC under UNIX on DEC VAX computers. In addition to the IPE tools, GANDALF also includes SMILE, which is responsible for the management of the data base which supports the incremental compiler, DBGEN which along with ALOEGEN is used for the incorporation of each new language, the ARL editor for the action routines language, and ALOE (language oriented editor) which includes an extended IPE KERNEL [34].
GANDALF also includes a scope package, a package splitting syntax trees into subtrees, a system version control for the description of intermodule communication, and a package for project management [21]. As with IPE, there is no parser but, instead, a parsing algorithm [12] in charge of the incremental parsing. PECAN [30] is implemented at Brown University on the APOLLO workstations under UNIX and is written in C. It includes an editor which, as in CPS, is a mixture of structured and text editor, and an incremental compiler which is under development. A number of preprocessors are used to provide for a tabular representation of the incorporated language. The tools and packages of PECAN are distributed over a number of layers. On the lowest layer the following packages are found:
• ASH [27], a facility for screen and window management,
• VT [24], a package to support the terminal interface,
• SGP [20], a graphics support package,
• WILLOW [26], which supports the creation and manipulation of windows, and
• MAPLE [3], a screen based input utility similar to GANDALF's KERNEL.
On the next layer two packages are found:
• PLUM [25], which is responsible for managing the data base, and
• ASPEN, which manipulates the syntax trees and provides support for tree editing functions.
Finally, on the top layer one finds the parser and the incremental compiler. The parser is directed by the tabular information as supplied by the preprocessors and is accompanied by the LEX scanner generator for generating the scanner. The incremental compiler consists of the following modules:
• the Control Module, which includes the debugger,
• the Symbol Module,
• the Data Type Module,
• the Expression Module, and
• the Control Flow Module.
6. ISSUES RELATING TO TOOLS AND PROGRAM TRANSFORMATION CYCLE
6.1 MENTOR
As mentioned, MENTOR includes a constructive editor in which the MENTOL commands have been incorporated. MENTOL is not simply an editor command language but also provides facilities for the definition of procedures. There are predefined procedures in MENTOL for tree traversal with parallel execution of commands on the tree nodes, creation of new identifiers, strings and comments, management of help files, debugging, etc. Debugging is done at source level and the user is provided with an integrated debugging interface, e.g., procedure PROFILE supplies an execution profile of the program. Specialized interpreters assist the user in computations or program rearrangements. Since there is no standard semantic checking mechanism, programs may contain semantic errors while being developed. Program development can be considered as a multistep sequence where each step is taken care of by a processor whose output is input to the next step processor. The sequence includes the following steps:
1. Syntax correctness check via parser activation.
2. Scoping correctness check via the scope checker to check the declaration of identifiers.
3. Operations type checking via a type checker.
4. Checking for run-time errors and aliases via data flow analysis routines and a symbolic interpreter.
5. Program verification via a debugging interpreter which acts on the syntax tree and the symbol table. Verification is implemented by introducing first-order predicates.
6. Program optimization mechanism. Local optimizations are made by program transformations while global optimizations are made at source level with the help of a set of MENTOL procedures.
The scope checker, type checker, data flow analysis routines, symbolic interpreter, and the verification mechanism are all implemented as MENTOL procedures. MENTOR does not produce executable programs; when a program is to be executed it is transported to a host compiler, compiled, and subsequently executed.
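The idea of a multistep sequence, in which the output of each processor is the input of the next, can be sketched as follows. This is only an illustration written in Python rather than MENTOL; the flat statement list, the two step functions and their names are hypothetical stand-ins for a syntax tree and for the scope checking and optimization steps.

```python
from typing import Callable, List, Tuple

# A program unit is modelled as a list of (kind, name) statements. This is a
# stand-in for a syntax tree, not MENTOR's actual representation.
Program = List[Tuple[str, str]]
State = Tuple[Program, List[str]]        # (program, diagnostics so far)
Step = Callable[[State], State]


def scope_check(state: State) -> State:
    """Report every identifier that is used before being declared."""
    program, diagnostics = state
    declared = set()
    found = list(diagnostics)
    for kind, name in program:
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            found.append(f"undeclared identifier: {name}")
    return program, found


def remove_unused(state: State) -> State:
    """A local optimization done as a program transformation."""
    program, diagnostics = state
    used = {name for kind, name in program if kind == "use"}
    kept = [(kind, name) for kind, name in program
            if kind != "declare" or name in used]
    return kept, diagnostics


def run_pipeline(program: Program, steps: List[Step]) -> State:
    """Feed the output of each step processor into the next one."""
    state: State = (program, [])
    for step in steps:
        state = step(state)
    return state


unit = [("declare", "x"), ("declare", "y"), ("use", "x"), ("use", "z")]
result, messages = run_pipeline(unit, [scope_check, remove_unused])
print(result)
print(messages)
```

Because every step has the same signature, a step can be added, removed or reordered without touching the others.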
6.2 CPS
In CPS, the parser is called by the editor on a phrase-by-phrase basis. The interpreter is of the classic type and its operation is based on the internal representation which is discussed in Section 5. Code generation is done every time program segments are introduced into the unparsing templates. Execution follows editing immediately. Execution is suspended when some program item is missing and resumed as soon as the item is put in. Debugging is done at source level. The unit for break and trace [19] is the text line and not a statement or expression. As the user is aware of the transition from one tool to the other, program execution can be controlled. During execution the user may decide to modify the declaration of a variable. On detecting a request for declaration modification, CPS removes the old symbol table and replaces it with the new one. If the user resumes program execution during the process of symbol table reconstruction, the interpreter detects the modification. Traced programs, as might be expected, are slow, while in pacing mode the interpreter waits at every “go to” statement for user response before it continues. In stepping mode, the interpreter waits for a resume command to continue execution. Unresolved references are satisfied by the linker and subsequent loading is taken care of by the loader. The user may check the execution (run-time facility-single stepping), the execution speed (a mechanism is provided through the comments) and the scope of run-time supervision. Finally, a reverse execution facility is provided.
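Two of the behaviours described above, suspension at a missing program item and reverse execution, can be sketched with a toy interpreter. The statement form, the class name and the undo log are assumptions made for this illustration; the source does not describe how CPS implements either facility.

```python
from typing import Dict, List, Optional, Tuple

Assign = Tuple[str, int]      # (variable, constant): a deliberately tiny statement form
_ABSENT = object()            # marks "variable had no value before this step"


class Interpreter:
    """Runs assignments in order; `None` in the program is an unfilled placeholder."""

    def __init__(self, program: List[Optional[Assign]]):
        self.program = program
        self.pc = 0                                   # index of the next statement
        self.env: Dict[str, int] = {}
        self.undo_log: List[Tuple[str, object]] = []  # (variable, previous value)

    def run(self) -> str:
        """Execute until the end or until a missing item is reached."""
        while self.pc < len(self.program):
            statement = self.program[self.pc]
            if statement is None:
                return "suspended"
            variable, value = statement
            self.undo_log.append((variable, self.env.get(variable, _ABSENT)))
            self.env[variable] = value
            self.pc += 1
        return "finished"

    def step_back(self) -> bool:
        """Undo the most recently executed statement (reverse execution)."""
        if not self.undo_log:
            return False
        variable, previous = self.undo_log.pop()
        if previous is _ABSENT:
            del self.env[variable]
        else:
            self.env[variable] = previous
        self.pc -= 1
        return True


session = Interpreter([("x", 1), None, ("x", 2)])
print(session.run(), session.env)       # stops at the placeholder
session.program[1] = ("y", 5)           # the user fills in the missing item
print(session.run(), session.env)       # execution resumes from the same point
session.step_back()
print(session.env)
```

Logging the previous value of each assigned variable is one simple way to make every step reversible; it trades memory for the ability to run backward.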
6.3 IPE
IPE uses an incremental compiler instead of an interpreter. Compilers have the advantage that a high-level program representation may be translated to machine code which is not necessarily that of the host machine. The replacement unit is the procedure. Procedures have fixed entry points which refer to the actual machine code of the procedure. When the machine code of a procedure is replaced, the new code may be longer than the old copy, in which case it does not fit in the same address space. The code of some procedures may then have to be relocated (moved). To allow for this relocation without enforcing full program relinking, local references are stored at locations relative to the procedure code, whereas global references are placed in an area which is not affected by procedure replacements. Thus the program code is condensed (packed) by block-move operations. For debugging purposes there exists a mapping between the syntax tree representation and the machine code representation. The mapping is made by the code generator and only basic nodes are mapped. The code generator is also responsible for certain optimizations which should not affect the mapping. The user is not aware of the transitions from one tool to the other within the environment. Linking, loading and execution take place on an incremental basis. If a procedure is called and its body is missing (not written yet), the execution is suspended and is not resumed until the user fills in the missing code. Side effects are possible in some cases; for example, when the executable representation of a procedure is replaced, it is possible that the return address on the activation stack is no longer legal. Similarly, if some statement is added in an active procedure at a place which control has already passed, the user is responsible for deciding whether this is legal or not.
In cases where the entry points of the procedures on the activation stack have changed, control is taken over by recovery mechanisms.
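The role of fixed entry points during procedure replacement can be sketched as follows. The `CodeArea` class, its word-list memory and the entry table layout are assumptions chosen for illustration; they are not IPE's data structures.

```python
from typing import Dict, List, Tuple


class CodeArea:
    """Packed code for all procedures plus a table of entry points.

    Callers are assumed to reach a procedure only through `entry`, so moving
    a procedure's code requires updating one table slot, not every caller.
    """

    def __init__(self) -> None:
        self.memory: List[str] = []                    # "machine code" words
        self.entry: Dict[str, Tuple[int, int]] = {}    # name -> (start, length)

    def add(self, name: str, body: List[str]) -> None:
        self.entry[name] = (len(self.memory), len(body))
        self.memory.extend(body)

    def replace(self, name: str, body: List[str]) -> None:
        """Replace one procedure; later procedures are block-moved as needed."""
        start, length = self.entry[name]
        delta = len(body) - length
        # Splicing shifts everything stored after the old body (the block move).
        self.memory[start:start + length] = body
        self.entry[name] = (start, len(body))
        for other, (other_start, other_length) in list(self.entry.items()):
            if other != name and other_start > start:
                self.entry[other] = (other_start + delta, other_length)

    def code_of(self, name: str) -> List[str]:
        start, length = self.entry[name]
        return self.memory[start:start + length]


area = CodeArea()
area.add("f", ["f0", "f1"])
area.add("g", ["g0"])
area.replace("f", ["F0", "F1", "F2"])    # the new body is longer than the old one
print(area.entry["g"], area.code_of("g"))
```

The code stays packed after the replacement, and `g` is still found through its table slot even though its address changed. Return addresses already on an activation stack are not covered by this sketch.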
6.4 GANDALF
In GANDALF the program transformation cycle is the same as that of IPE except that in GANDALF the semantic checking is enforced. The user communicates with the ALOE editor via editing commands, language-dependent constructive commands, and language-dependent extended commands. The Kernel [2] of the editor analyzes these commands into primitive functions and each primitive function is transformed into three daemon calls [34].
6.5 PECAN
In PECAN, the PLUM package provides support for the data base administration, data distribution, reordering, field referencing, storage in files, data structure retrieval, and changes in data structures. ASPEN is responsible for the management of syntax trees, for supplying information when a node has changed, and for supporting the undo requirement. The incremental compiler of PECAN consists of:
• The control module, which includes the debugger, which is similar to the IPE debugger. The function of the control module is analogous to that of GANDALF and supports the incremental construction and destruction of flow graphs.
• The symbol module, which provides an incremental symbol table which handles the difficulties of symbol processing inherent in modern programming languages.
• The data type module, which supports the built-in and structured types of the language as well as type conversions.
• The expression module, which builds the expression trees on source operations incrementally.
• The command module, which in cooperation with PLUM maintains two copies of the statements that the user has executed so far. The first copy is a list of primitive statements while the second copy is a list of user-oriented statements. This module provides the global undo/redo facility.
The editor allows the user to move anywhere in the program and perform any text editing functions. It parses the program text and constructs its internal representation. The program transformation cycle in PECAN was not complete (as indicated in Reiss [29]) since the environment was still under development.
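A global undo/redo facility built on two parallel records, one of primitive statements and one of user-oriented statements, might look like the sketch below. The list-of-lines buffer, the `ins`/`del` primitives and the class name are hypothetical; the source does not specify how PECAN's command module and PLUM store these lists.

```python
from typing import List, Tuple

Primitive = Tuple[str, int, str]      # ("ins" | "del", index, text)


class CommandLog:
    """Keeps a primitive log and a user-level log and undoes whole user commands."""

    def __init__(self) -> None:
        self.buffer: List[str] = []
        self.primitive_log: List[Primitive] = []
        self.user_log: List[Tuple[str, int]] = []            # (command name, primitive count)
        self.redo_stack: List[Tuple[str, List[Primitive]]] = []

    def _apply(self, primitive: Primitive) -> None:
        operation, index, text = primitive
        if operation == "ins":
            self.buffer.insert(index, text)
        else:
            del self.buffer[index]

    @staticmethod
    def _inverse(primitive: Primitive) -> Primitive:
        operation, index, text = primitive
        return ("del" if operation == "ins" else "ins", index, text)

    def _record(self, name: str, primitives: List[Primitive]) -> None:
        for primitive in primitives:
            self._apply(primitive)
            self.primitive_log.append(primitive)
        self.user_log.append((name, len(primitives)))

    def execute(self, name: str, primitives: List[Primitive]) -> None:
        self._record(name, primitives)
        self.redo_stack.clear()           # a new command invalidates pending redos

    def undo(self) -> bool:
        if not self.user_log:
            return False
        name, count = self.user_log.pop()
        newest_first = [self.primitive_log.pop() for _ in range(count)]
        for primitive in newest_first:    # invert in reverse order of execution
            self._apply(self._inverse(primitive))
        self.redo_stack.append((name, newest_first[::-1]))
        return True

    def redo(self) -> bool:
        if not self.redo_stack:
            return False
        name, primitives = self.redo_stack.pop()
        self._record(name, primitives)
        return True


log = CommandLog()
log.execute("add two lines", [("ins", 0, "a"), ("ins", 1, "b")])
log.execute("drop first line", [("del", 0, "a")])
log.undo()
print(log.buffer)
```

The user-level log decides how many primitives belong to one undo step, while the primitive log supplies the operations to invert.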
7. LANGUAGE INCORPORATION
As mentioned, to incorporate a language in an environment, the environment implementor must supply information about the lexical syntax, concrete syntax, abstract syntax, static and dynamic semantics, and unparsing schemes. Some environments, e.g., CPS, employ attribute grammars and attribute grammar evaluation algorithms [13, 33] as a means of introducing the language and the semantics. Attribute grammars provide for the incorporation of attributes to the nonterminal symbols and semantic rules to the productions of a context-free grammar. The attributes are of two types:
• Inherited attributes, which are based on the attributes of the ancestors of the non-terminal and which are computed on the tree from top to bottom, and
• Synthesized attributes, which are based on the attributes of descendants of the non-terminals and which are computed from bottom to top.
MENTOR uses regular expressions to describe the lexical syntax and an external scanner generator. For the description of the concrete syntax, MENTOR uses a BNF description and a parser generator which produces LALR(1) parsers. No repetition or alternative in the concrete production rules is allowed, and all language constructs which may be created during an editing session are given as alternatives of the initial symbol of the grammar. For describing the abstract syntax, MENTOR has its own specification language, METAL. The abstract syntax specification contains all information needed to determine the correct relationship between the operators and classes (or sorts) during the creation and modification of the syntax trees. Classes show the legal offspring of a node in the syntax tree. Syntax trees in MENTOR are not binary trees. The reserved words of a language appear as label nodes, a metavariable (extensible node) exists only once in a syntax tree, and lists have a fixed number of children. There are no braces in the syntax trees and the unparser is responsible for determining priorities. In ambiguous situations, the unparser uses the minimum number of braces needed to ensure correct parsing. Comments and programs from different programming languages are allowed in the syntax trees. Static semantic checking is not provided through METAL but the user may write MENTOL programs which take care of static semantics and have them called automatically. No dynamic semantic checking is provided either. However, an interpreter which embodies the dynamic semantics and operates on the syntax tree can be written.
METAL provides no specific unparsing scheme but unparsing procedures may be added to the environment. A prototype unparser can be written in Pascal, which specifies the communication between the unparser and MENTOR. When the tables of a language are loaded, MENTOR creates a set of predefined schemas, one for each operator, which consists of an operator and the metavariables on which the operator applies. In this way, one obtains access to metavariables without restricting control to a single path on the syntax tree. The tabular description of the language is given to the editor via the compiler.
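The rule that the unparser inserts only the minimum grouping needed for correct reparsing can be illustrated with a short sketch. It is not MENTOR's unparser (which the text says may be written in Pascal); it assumes that the "braces" are the grouping parentheses of expressions, and the operator table, the `Leaf` and `BinOp` node types, and the `unparse` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical operator table (not from MENTOR): a higher number binds
# tighter, and every operator is treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


@dataclass
class Leaf:
    text: str


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, BinOp]


def unparse(node: Node, parent_prec: int = 0, is_right_child: bool = False) -> str:
    """Return the text of a tree, adding parentheses only where reparsing needs them."""
    if isinstance(node, Leaf):
        return node.text
    prec = PRECEDENCE[node.op]
    left = unparse(node.left, prec, False)
    right = unparse(node.right, prec, True)
    text = f"{left} {node.op} {right}"
    # Parenthesize when the parent binds tighter, or when an equal-priority
    # operator sits on the right of a left-associative parent.
    needs_parens = prec < parent_prec or (prec == parent_prec and is_right_child)
    return f"({text})" if needs_parens else text


a, b, c = Leaf("a"), Leaf("b"), Leaf("c")
print(unparse(BinOp("*", BinOp("+", a, b), c)))   # (a + b) * c
print(unparse(BinOp("-", BinOp("-", a, b), c)))   # a - b - c
print(unparse(BinOp("-", a, BinOp("-", b, c))))   # a - (b - c)
```

Because the tree itself carries no grouping symbols, the priorities live entirely in the unparser's table, which is the division of labor the text describes.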
In CPS, lexical and concrete syntax are introduced in a manner similar to MENTOR, the only difference being that CPS allows repetition in the production rules. The syntax trees are constructed through the attribute grammar evaluation mechanism [31, 33]. The abstract syntax is introduced with the assistance of SSL (synthesizer specification language), which accepts inherited and synthesized attributes. Through SSL, the attributes existing on the nodes of the syntax tree can be visualized on the abstract syntax. Syntax trees in CPS are subject to the following restrictions.
• No operator is allowed to exist in a class more than once.
• Lists are nodes with an unfixed (variable) number of children.
• Only one language exists in every tree.
Static semantic checking is provided through the capabilities of attribute grammars. During an editing session, for each node on the syntax tree, values for the attributes of the node are automatically built. These values are used by the corresponding semantic rules for static semantic checking. The environment allows references to non-local attributes. There is no standard mechanism for dynamic semantic checking. Instead, C functions can be written for the interpretation of the syntax tree, or SSL expressions can be associated with the nodes of the tree. In CPS, an unparsing template is connected with every abstract rule, while the external representation of each unparsing template is held in an internal table. For each screen representation it is required that the internal syntax tree representation uses the unparsing templates. Each cursor position corresponds to a node in the tree, but only forward (preorder) cursor movement is possible. The display routine traverses the tree in preorder, keeping track of the position of the external representation of the corresponding template. For this reason, there is a table mapping internal node addresses to screen coordinates. This table is updated as the tree is traversed.
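The way inherited and synthesized attributes support static semantic checking can be sketched in a few lines. This is an illustrative toy, not SSL and not the CPS evaluator: the node kinds, the `Node` class, and the `evaluate` function are assumptions made for the example, and the tree is evaluated in one full pass rather than incrementally. The environment of declared names is the inherited attribute (passed from top to bottom), while the type and the list of errors are synthesized (returned from bottom to top).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    kind: str                                             # "block", "decl", "use", "num", "add"
    value: object = None                                  # identifier, literal, or (name, type)
    children: List["Node"] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)     # inherited attribute
    type: Optional[str] = None                            # synthesized attribute
    errors: List[str] = field(default_factory=list)       # synthesized attribute


def evaluate(node: Node, env: Dict[str, str]) -> Tuple[Optional[str], List[str]]:
    """Pass env downward (inherited) and return type and errors upward (synthesized)."""
    node.env = env
    node.type, node.errors = None, []
    if node.kind == "num":
        node.type = "int"
    elif node.kind == "use":
        if node.value in env:
            node.type = env[node.value]
        else:
            node.errors.append(f"undeclared identifier '{node.value}'")
    elif node.kind == "add":
        for child in node.children:
            evaluate(child, env)
            node.errors.extend(child.errors)
        types = {child.type for child in node.children}
        if None not in types:
            if len(types) == 1:
                node.type = types.pop()
            else:
                node.errors.append("operands of '+' have different types")
    elif node.kind == "block":
        scope = dict(env)
        for child in node.children:
            if child.kind == "decl":
                name, declared_type = child.value
                scope[name] = declared_type       # visible to later siblings
            else:
                evaluate(child, dict(scope))
                node.errors.extend(child.errors)
    return node.type, node.errors


program = Node("block", children=[
    Node("decl", ("x", "int")),
    Node("add", children=[Node("use", "x"), Node("num", 1)]),
    Node("add", children=[Node("use", "y"), Node("num", 2)]),
])
_, errors = evaluate(program, {})
print(errors)   # ["undeclared identifier 'y'"]
```

An editor built on this idea would re-evaluate only the attributes affected by an edit; the sketch recomputes everything for brevity.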
In IPE, the incorporation of the language into the environment is done via tabular information; thus, the need for a specification language is eliminated. The editor is completely structured and contains the scanner and the parser of the language. Because of the ALOEGEN preprocessor, a language is incorporated in IPE via a standard interface which includes the concrete syntax, the syntax/semantics interface, and the user interface. The concrete syntax includes multiple unparsing schemes and formatting commands. A mechanism (trapdoors/traproots) allowing for dynamic change of the representation on the screen is available. The syntax and semantic interface [15] includes action routines (written in C), attributes, and a boolean variable for connection with a name table. The syntax trees may contain only one language. On the nodes there are attributes which the action routines act upon. Lists are represented as n-nodes. Furthermore, there are the nonterminal nodes for control flow and data definition language constructs, corresponding to unparsing templates and containing proper information through the attributes. Static semantic checking is not included but can be implemented by employing action routines. The kernel [23] analyzes the user commands into "primitive" commands and for each one action routines are called in a predefined way. Thus, incremental checking is achieved but no semantic correctness is enforced. Dynamic semantic checking can also be implemented through action routines. The user interface assists in specifying the priorities, the synonyms for operators, and routines for lexical syntax correctness which activate a scanner within the editor.
In GANDALF, the incorporation of a language into the environment follows the same procedure as in IPE since the ALOEGEN preprocessor is used in both environments. However, further information is incorporated into the environment via DBGEN [15]. DBGEN provides for the introduction of a name for the concrete grammar, the attribute grammar, the extended command table (for extended commands written in the ARL language), the external region environment table, the command restriction table, the unparsing schemes, and the default procedures for accessing the system-dependent routines of the system. GANDALF's action routines are written in the ARL language [34]. ARL is a very flexible language for static and dynamic semantic checking on the internal syntax trees.
It is the use of this language as well as the extensions of GANDALF to project management and version control that forced GANDALF’s designers to extend IPE’s mechanism for the incorporation of a language. In contrast with IPE, GANDALF enforces semantic correctness. In PECAN, as with IPE and GANDALF, the language is incorporated in tabular form with the help of four preprocessors. READER, the analog of ALOEGEN in IPE and GANDALF, is the first preprocessor and accepts information about the syntax trees. READER must be supplied with the semantics of each abstract syntax production, information for the editor’s parser, the incremental compiler’s modules, and ASPEN. Information for the parser includes the definition of the concrete syntax, priorities relating to each syntax rule, and the lexeme definitions. Information for the compiler, the control module in particular, includes
program segments for each abstract syntax rule which implement the semantic actions and other information such as the unparsing scheme formats. The second preprocessor builds the tables for the symbol module. Its input is a list of classes of names, objects, and scopes, together with the rules for name analysis taken from READER. The third preprocessor supports the data module and takes as input information on the use of data types and the data type statements from READER. The fourth preprocessor supports the expression module of the compiler and accepts as input information about the use of expressions and the specifications for their construction, as provided by READER. As in all other environments except MENTOR, only one language can exist in a tree. The trees have an internal and an external representation and a mapping mechanism similar to those of IPE and GANDALF.
8. USER COMMUNICATION AND INTERFACE, EDITOR
The user interface provides the view of the environment as seen by the user. The way the user communicates and interacts with the environment is as important as all other facilities offered by the environment put together. All environments provide, as one of their most valuable ingredients, a structured editor as a means of user interface and communication. The editor operates on the syntax trees. In general, the abstract syntax tree is accessed through either language-independent or language-dependent editor commands. In MENTOR the editor embodies the statements of MENTOL [8]. A menu mode interface is provided for beginners and multiple-language sessions, supporting three windows for system dialogs, text, and help respectively.
CPS contains an editor with screen communication capabilities which is driven by the constants of each incorporated language in which the programs are written. The editor is a hybrid between a structured and a text editor. Unparsing templates are built with editing commands. The user may write in placeholders found in the unparsing templates. If illegal text is written in some placeholder, the text is highlighted until corrected (error prevention mechanism). An error correction mechanism is made possible because of the hybrid nature of the editor. In general, the user communicates with the environment via editing commands and language-dependent commands. The environment allows the use of external procedures in C, thus facilitating the addition of new tools.
The editor in IPE [1] is structured and language-independent. It is automatically created by the kernel and the language description. The user communicates with the editor through language-independent editing commands. Screen communication with the environment is also supported by the IPE editor. During program development, the user writes program text in placeholders found within unparsing templates. If syntactically illegal text is written in a placeholder, the text is highlighted until corrected. Such error prevention and correction mechanisms are activated by the editor in many cases.
The GANDALF environment provides a reasonably good screen interface and a user communication facility which is similar to that of IPE.
The most advanced user interface and screen display facilities are offered by PECAN, which supports multiple program views and multiple screen windows. The PECAN editor is a mixture of a structured and a text editor and allows the user to write program text anywhere rather than only within placeholders. It consists of two independent modules. The first module provides the user interface for editing, while the second module is responsible for providing a formatted display of the syntax trees. The user is free to move anywhere in the program and perform any text-editing function. This kind of operation is very important since most program changes are textual. The editor parses the user text and arranges for its internal representation as well as the screen representation. In template mode, the PECAN editor allows all the capabilities offered by ALOE or the CPS editor. It provides capabilities for extending the communication with the user through the extended commands. Finally, the user has the advantage of being able to use a pointing device.
9. TOOL REPLACEMENT AND MODIFICATION
In MENTOR, the implementor may expand or replace the environment's tools by writing external procedures in MENTOL. Likewise in CPS the implementor may write C procedures, thus achieving tool replacement and modification. This kind of facility (external procedures for tool expansion) is not provided in GANDALF nor, ipso facto, in IPE. Tool replacement in GANDALF is a rather difficult task. The situation is completely different in PECAN due to the layered design of the environment. It is a rather simple task to modify any of the tools supporting the environment, without affecting any of the other tools or packages. This flexibility in PECAN is a result of the layered design and the communication by message-passing between layers.
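Why message-passing between layers makes tool replacement easy can be shown with a minimal sketch. It does not reproduce PECAN's actual protocol or module names; `MessageBus`, the `"unparse"` message type, and both unparser functions are hypothetical. The point is only that a client layer names a message, not a tool, so the tool behind the message can be swapped without touching the client.

```python
from typing import Callable, Dict, List


class MessageBus:
    """Hypothetical stand-in for the channel between layers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def register(self, message_type: str, handler: Callable[[dict], str]) -> None:
        # Registering again under the same message type replaces the tool.
        self._handlers[message_type] = handler

    def send(self, message_type: str, payload: dict) -> str:
        return self._handlers[message_type](payload)


def plain_unparser(payload: dict) -> str:
    return " ".join(payload["tokens"])


def indented_unparser(payload: dict) -> str:
    return "    " * payload.get("depth", 0) + " ".join(payload["tokens"])


def editor_display(bus: MessageBus, tokens: List[str], depth: int = 0) -> str:
    # The editor layer knows only the message type, never the tool itself.
    return bus.send("unparse", {"tokens": tokens, "depth": depth})


bus = MessageBus()
bus.register("unparse", plain_unparser)
print(editor_display(bus, ["x", ":=", "1"], depth=1))   # x := 1

bus.register("unparse", indented_unparser)              # tool replaced
print(editor_display(bus, ["x", ":=", "1"], depth=1))   # indented by four spaces
```

Here `editor_display` is unchanged across the replacement, which mirrors the claim that a tool can be modified without affecting the other tools or packages.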
10. SUMMARY/CONCLUSIONS
The main advantages of MENTOR are the concurrent support for programs written in different programming languages, its portability, and its friendliness toward the user. A major disadvantage is that the facilities it offers for the program design and modification cycle are restricted, and the environment implementor has to write program segments in order to add more facilities to the environment.
The main advantages of CPS include the screen support for user communication and the capability to add semantic rules acting on the nodes of the syntax tree, where the attributes are, thus achieving static and dynamic semantic checking. A further advantage is the existence of the interpreter, which provides quick intermediate code generation, quick error detection, and the capability for user execution control. On the other hand, the execution of programs is slow and there is no support for optimization. It cannot be extended for programming-in-the-large and programming-in-the-many. Finally, as with IPE, GANDALF, and PECAN, it provides no concurrent support for programs written in different programming languages.
Advantages of IPE include the incremental compiler, which combines the advantages of interpreters and compilers, good user communication, and screen interface support. On the other hand, it is impossible to extend IPE for programming-in-the-large and programming-in-the-many.
GANDALF includes the advantages of IPE and additionally provides good user communication and screen interface, enforces semantic correctness during program evolution, supports experimental data bases, provides tools for programming-in-the-large, and can be extended for programming-in-the-many. Its main drawback is the difficulty in replacing tools.
PECAN's advantages include easy tool replacement, good support for screen interface, and extendable user communication. On the other hand, it is not so easy to extend PECAN to provide for programming-in-the-large and programming-in-the-many.
Viewing all environments together, one distinguishes the characteristics they have in common and those in which they differ (a detailed comparative table for all environments is found in Appendix B). GANDALF is the only environment which supports version control and project management. All environments provide a screen interface for user communication. The user interface in MENTOR, GANDALF, and PECAN can be extended by writing extra procedures. The introduction of a new language in the environment is made possible either via specification languages, as in MENTOR and CPS, or via tables, as in all other environments. All except MENTOR provide unparsing support. In MENTOR, this can be made possible only via external Pascal procedures. Standard mechanisms for static and dynamic semantic checking are only provided by GANDALF, while CPS provides only for static semantic checking. The editor is a hybrid between a text and a structured editor except in IPE, where the editor is completely structured. An incremental compiler is provided in IPE, GANDALF, and PECAN, whereas CPS provides an interpreter and MENTOR a compiler. Source-level debugging facilities are provided by all environments. MENTOR is the only one which does not support execution of programs, and PECAN is the only one to provide multiple views and graphics support. All environments support some central data base and GANDALF also supports experimental data bases. Finally, all environments can be considered as integrated and user friendly.
As regards current research and future directions for the environments, the situation is as follows. Referring to MENTOR, one observes that the user, via the MENTOL language, can extend the capabilities of the environment by writing extended program segments, a laborious and time-consuming task. The difficulties of such a task can be significantly reduced if the implementation language of the environment provides proper language and data structures. It is in the direction of developing such programming languages that the research efforts are focused to achieve a MENTOL environment which is highly integrated. In CPS, the situation is similar to MENTOR regarding tool replacement, modification, and integration. The research direction in CPS is toward providing the capability for storage and retrieval of the syntax trees to and from external files, i.e., attempting to provide some minimal version control system [35]. IPE was the first research step toward the GANDALF programming environment.
It is obvious that the capabilities of GANDALF can be added to IPE but the philosophy behind the IPE design has restrained further research. GANDALF was designed with the previous experience of IPE and embodies tools and facilities for programming-in-the-large. By providing multiple views in GANDALF, the environment can be extended for programming-in-the-many. Current research is focused on investigating methods for easy tool replacement. PECAN is not fully implemented yet. The implementation of the multilevel incremental compiler must be
complete by now. Currently it supports only programming-in-the-small but its design philosophy allows it to be extended and provide for programming-in-the-large and programming-in-the-many. As LIPEs become mature, they are becoming available in commercial products. So far, little empirical data have been collected to indicate whether LIPEs actually increase productivity [5]. Initial attempts to scale up the environments to support programming-in-the-large and programming-in-the-many have encountered difficulties. Techniques currently used in many environments have shortcomings in terms of providing efficient, persistent storage for large structures and in coordinating concurrent access to the structures for multiple users or tools.
REFERENCES
1. ALOE Users' and Implementors' Guide (Fourth Edition), The Gandalf Project, Dept. of Comp. Sc., Carnegie-Mellon University, Pittsburgh (October 1984).
2. V. Ambriola and B. J. Staudt, The ALOE action routine language manual, Dept. of Comp. Sc., Carnegie-Mellon University, Pittsburgh (November 1985).
3. M. H. Brown and S. P. Reiss, Maple Reference Manual, Brown University (December 1982).
4. R. Conway and R. Constable, PL/CS-A disciplined subset of PL/I, Techn. Report 76-293, Dept. Computer Science, Cornell Univ. (1976).
5. S. A. Dart, R. J. Ellison, P. H. Feiler, and A. N. Habermann, Software development environments, IEEE Computer, 20, 18-28 (1987).
6. L. P. Deutsch and E. A. Taft, Requirements for an experimental programming environment, CSL-80-10, Xerox Corporation (June 1980).
7. H. Diel, Language representation based on abstract syntax, IBM Labor. Boeblingen.
8. V. Donzeau-Gouge, G. Huet, G. Kahn, and B. Lang, Programming environments based on structured editors: The MENTOR experience, in Barstow et al. (eds.), Interactive Programming Environments, McGraw-Hill, 1984.
9. R. J. Ellison and B. J. Staudt, The evolution of the Gandalf system, Journal of Systems and Software, 5, 107-119 (1985).
10. N. Habermann and D. Notkin, Gandalf software development environment, from The Second Compendium of Gandalf Documentation, Dept. of Comp. Science, Carnegie-Mellon University (January 1982).
11. N. Habermann and D. Notkin, GANDALF software development environments, IEEE Trans. on Soft. Eng., SE-12, 1117-1127 (1986).
12. G. E. Kaiser and E. Kant, Incremental parsing without a parser, Journal of Systems and Software, 5, 121-144 (1985).
13. K. Katayama, Translation of attribute grammars into procedures, ACM Transactions on Programming Languages and Systems, 6, 345-369 (1984).
14. P. Klint, A survey of three language-independent programming environments, INRIA Tech. Report No 257 (December 1983).
15. C. W. Krueger, The GANDALF system reference manuals, Dept. of Comp. Sc., Carnegie-Mellon University, Pittsburgh (January 1986).
16. P. Lucas, P. Lauer and H. Stiegleitner, Method and notation for the formal definition of programming languages, IBM Laboratory Vienna Techn. Report, TR 25 087 (July 1970).
17. P. Lucas and Walk, On the formal description of PL/I, Annual Review in Automatic Programming, 6, 105 (1970).
18. J. McCarthy, Towards a mathematical science of computation, in Information Processing 1962, North-Holland Publ. Comp., Amsterdam, 1963, pp. 21-28.
19. R. Medina-Mora and P. H. Feiler, An incremental programming environment, Dept. of Computer Science, Carnegie-Mellon University, Tech. Report CMU-CS-80-126 (April 1980).
20. D. B. Nanian and J. N. Pato, Simple graphics package, Brown University (September 1982).
21. D. Notkin, The GANDALF project, Journal of Systems and Software, 5, 91-105 (1985).
22. D. S. Notkin and G. E. Kaiser, The implementation of the Gandalf software development environment, Second Compendium of Gandalf Documentation, Dept. of Comp. Sc., Carnegie-Mellon University, Pittsburgh (May 1982).
23. D. C. Oppen, Pretty printing, ACM Transactions on Programming Languages and Systems, 5, 449-477 (1983).
24. S. P. Reiss, Virtual terminal package, Brown University (September 1982).
25. S. P. Reiss, PLUM: A data structure management package, Brown University (December 1982).
26. S. P. Reiss, Willow: A window manager, Brown University (September 1983).
27. S. P. Reiss, A screen handler, Brown University (October 1983).
28. S. P. Reiss, Graphical program development with PECAN program development system, Proc. ACM SIGSOFT/SIGPLAN Soft. Eng. Symposium on Practical Softw. Development Environments (April 1984).
29. S. P. Reiss, PECAN: Program development systems that support multiple views, IEEE Trans. on Soft. Eng., SE-11, 276-285 (1985).
30. S. P. Reiss, J. N. Pato, and M. H. Brown, An environment for workstations, Techn. Report CS-84-03, Dept. Comp. Science, Brown University (Jan. 1984).
31. T. Reps, Generating Language-Based Environments, MIT Press, Cambridge (1984).
32. T. Reps and T. Teitelbaum, The synthesizer generator, Proc. of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, 42-48 (April 1984).
33. T. Reps and T. Teitelbaum, Language processing in program editors, IEEE Computer, 20, 29-40 (1987).
34. B. J. Staudt, The Implementor's Guide to Writing Daemons for Aloe, Dept. of Comp. Sc., Carnegie-Mellon University, Pittsburgh (December 1985).
35. T. Teitelbaum and T. Reps, The Cornell Program Synthesizer: A syntax directed programming environment, Comm. ACM, 24, 563-573 (1981).
36. T. Teitelbaum, T. Reps, and S. Horwitz, The why and wherefore of the Cornell Program Synthesizer, in Tutorial: Software Development Environments, A. I. Wasserman (ed.), IEEE Computer Society Order No. 385 (1981).
APPENDIX A
FUNCTIONS   MENTOR   CPS   IPE   GANDALF   PECAN
Language incorporation by specification language or tables
Structure editor
Multiple views
Easy tool replacement
User interface
User interface extensibility
METAL SSL hybrid between text (in place holders) and structure editor no yes
table driven structure editor
table driven table driven hybrid between text and structure editor
no
yes yes
no
(3
yes,
nr no
f;g no
;4”, y=(6)
Incremental compilation
via MENTOL procs compiler
interpreter
incremental compiler
incremental compiler
Source level debugging
yes
yes
yes
yes
Program verification
Version control
Project management
Concurrent multiple language support
User friendly
yes no
yes, also supports reverse execution no no no no
mechanism for run-time no no no
checking of assertions
no no no no
yes
yes
yes
n;
ye:*, yes@)
YesO) incremental compiler
yes yes IYO
yes
(1) a. Language independent commands, e.g., up, delete.
    b. Language dependent commands, e.g., MENTOL procedure for finding the next procedure declaration respecting the scope rules.
(2) a. Language independent commands, e.g., up, down.
    b. Language dependent commands for the creation of new language constructs, e.g., while or 'wh'.
(3) Similar to (a) and (b) in CPS.
(4) Via language-independent commands, language-dependent commands, and language-dependent extended commands.
(5) Via language-independent and language-dependent commands.
(6) Via the extended commands and the DBgen.
(7) By writing procedures in the action routines language.
(8) By associating annotations with nodes in the tree.
(9) Easy to learn MENTOR, easy to debug, fast enough in execution.
PROPERTIES Attribute grammars Support for dynamic semantic checking Integrated Central DB support Redo mechanism Lexical syntax via specification language or tables Parser Unparsing support via the specification language or tables Multiple unparsing schemes via specification language or tables implementation language General purpose internal representation of syntax trees
MENTOR no no (1) yes yes no yes via regular expressions yes no, user adds Pascal procs for unparsing no
APPENDIX B IPE CSP no
GANDALF yes
PECAN no
yes no (2)
no (3)
yes, via the extended commands
no (4)
yes
Yes
yes yes yes yes via lexical routines parsing algorithm
yes yes no
yes no yes via regular expressions yes
yes no yes via lexical routines no
yes
no
Pascal
C
yes no
yes yes (6)
yes
G.C. yes Yes TCOLada-like
yes (5) yes
yes
yes
yes
yes
G.C.
C
yes
yes yes
Yes TCOLada-like
TCOLada
PROPERTIES Support for static semantic checking Enforces semantic correctness Portable Graphics support Optimization Supports program execution Parsing/unparsing of external files Experimental DB support Language-independent tools for data flow analysis (1) It is possible (2) It is possible (3) (4) (5) (6) (7) (8) (9) (10)
MENTOR
APPENDIX B (continued) CSP IPE
GANDALF
PECAN
no, user must add MENTOL procs no
yes (7)
no, unless given via
yes, via
no, only via semantic
attribute grammars
no (9)
semantic actions in C no
yes
actions (user) no
yes (8) no mainly done by the code generator no
no no difficult due to interpreter
yes no by the code generator
yes no
yes yes
yes yes (10)
yes no
yes yes
yes (not yet?) no
no yes, (global data
no no
yes no
yes (flow view)
yes no no, (language dependent procs)
by the code generator
flow procs)
(1) It is possible to write a Pascal interpreter to act on the syntax tree.
(2) It is possible to add C procedures for the interpretation of the syntax tree or to associate SSL expressions with the nodes.
(3) It is possible to write action routines in C to take care of dynamic semantics.
(4) It is possible to write an interpreter in the language for the semantic actions.
(5) External scanner driven by lexeme definitions in tables.
(6) Provides forward cursor movement only.
(7) Through the inherited/synthesized attributes introduced via SSL and the attribute grammar mechanism.
(8) The most portable environment due to lack of internal representation.
(9) Highlights the error.
(10) Supports unparsing specifications for the editor.
no