YT—the Yacc tracer


Journal of Microcomputer Applications (1988) 11, 281-299
COMMUNICATION


R. Adamov
University of Zürich, Institut für Informatik, Winterthurerstr. 190, CH-8057 Zürich

This paper describes the realization of Yt, a tracer for the well-known Unix™ compiler compiler Yacc [1]. The main task of this tracer is to serve educational purposes. The basic objective during the development of the tracer was to make the analysis of parsers produced by Yacc simple, efficient, and user friendly. This is achieved through the specific (multilevel) form of displaying the current status of the pushdown automaton (PDA) and the interactive mode of controlling the tracer execution.

1. Introduction

Over the years Yacc has become an extremely popular software development tool, not only in the area of compiler construction, but in other areas as well. One of the reasons for the general popularity of compiler compilers is the fact that a large number of programs may be regarded as language processors transforming input strings into sequences of actions and/or output strings. In this report we will go into the details of the Yacc parser generator only as much as we need to describe the interface between Yacc and Yt. Interested readers are directed to the corresponding references [2,3]. The second section contains a very general description of the parsers produced by Yacc, along with a description of the Yacc/Yt interface. A brief list of Yt features is included as well. The third section handles the display organization and the user interface. The fourth section deals with the structural organization of the Yacc-produced parsers, introduces Yt-specific modifications, and examines the issue of grammar independence. The fifth section describes the Yt generation process and the structural organization of the tracer code. Some application specific Yt extensions are mentioned in the sixth section. The seventh section contains some concluding remarks along with comments regarding possible future extensions/modifications.

2. Yacc/Yt Interface

Yacc is a parser generator for a set of LALR(1) grammars [1, 3]. It accepts the grammar of a language in Backus-Naur form (the specification of the language syntax) and produces a parser for this language (file y.tab.c) as its output. The user may augment the grammar rules (productions) with actions, thus defining the meaning (semantics) of the language. The resulting output file consists of three parts:

- a set of tables which define the parsing table and are grammar dependent;

0745-7138/88/040281+19 $03.00/0 © 1988 Academic Press Limited


- grammar independent driver code whose execution is controlled by the contents of these tables;
- code for the user supplied actions.

Input submitted to the parser is analysed by the lexical analyser (the yylex procedure). The input stream is broken up into basic units (tokens) which are passed to the parser. Depending on the current PDA state and the current input symbol, the parser consults the parsing table(s) in order to determine the next action.

Our primary goal was to make every state transition of the parser explicit (visible). Special effort was made to show the conditions which led to each particular state transition. To achieve these goals we took the following steps:

- we thoroughly analysed the driver code of the Yacc-produced parser and pinpointed the optimal points for taking snapshots of the parser state, i.e. we located the driver code sections where the relevant events of the parsing process take place;
- we instrumented the driver with procedure calls which embody the links to the actual tracer program.

The tracer may be considered grammar independent as long as the interface between the tracer and the parser generator is well defined and invariant. This makes tracer extensions easier to realize. The situation is presented in Figure 1. The shaded parts are Yt related.

One very important feature of Yt is that it is an interactive tracer. Yt offers a user friendly interface to grammar independent tracing of Yacc-produced parsers. It can be used either for examining the correctness of newly developed grammars or for educational purposes, i.e. to enable extensive tracing of the steps executed while analysing an input string for a given grammar. The analysis is done in a multilevel manner, by simultaneously tracing the parser at the lower (PDA) level and at the higher (BNF) level.
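The idea of instrumenting a grammar independent driver with procedure calls can be illustrated with a minimal sketch. It is not Yt's code: the hook names (on_shift, on_reduce, on_accept, on_error), the recording helpers, and the toy grammar are all assumptions made for illustration, and the small LR automaton is hard-coded instead of being read from Yacc tables. The point is only that each relevant parsing event is reported to the tracer at one well defined place in the driver loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook set: the paper does not name Yt's tracing functions,
   so every identifier in this sketch is illustrative only. */
typedef struct {
    void (*on_shift)(int from, int to, char tok);
    void (*on_reduce)(int rule, int exposed, int to);
    void (*on_accept)(void);
    void (*on_error)(int state, char tok);
} trace_hooks;

enum { MAXDEPTH = 64 };

/* Toy grammar:  rule 1: S -> ( S )    rule 2: S -> x
   States 0..5 of its LR automaton are hard-coded here in place of the
   tables a real y.tab.c would carry. */
static int shift_target(int state, char tok)
{
    if ((state == 0 || state == 2) && tok == '(') return 2;
    if ((state == 0 || state == 2) && tok == 'x') return 3;
    if (state == 4 && tok == ')') return 5;
    return -1;                                   /* no shift possible */
}

/* Driver loop: returns 1 if the input is accepted, 0 on a syntax error. */
int parse(const char *input, const trace_hooks *h)
{
    int stack[MAXDEPTH];
    int sp = 0;

    assert(input != NULL && h != NULL);
    stack[0] = 0;
    for (;;) {
        int state = stack[sp];
        char tok = *input;                       /* '\0' is the end marker */

        if (state == 3 || state == 5) {          /* reduce states */
            int rule = (state == 5) ? 1 : 2;
            int to;
            sp -= (rule == 1) ? 3 : 1;           /* pop the right-hand side */
            to = (stack[sp] == 0) ? 1 : 4;       /* goto on S */
            h->on_reduce(rule, stack[sp], to);   /* instrumentation point */
            stack[++sp] = to;
        } else if (state == 1 && tok == '\0') {
            h->on_accept();                      /* instrumentation point */
            return 1;
        } else {
            int to = shift_target(state, tok);
            if (to < 0 || sp + 1 >= MAXDEPTH) {
                h->on_error(state, tok);         /* instrumentation point */
                return 0;
            }
            h->on_shift(state, to, tok);         /* instrumentation point */
            stack[++sp] = to;
            input++;
        }
    }
}

/* A trivial "tracer" that records one letter per event. */
static char trace_log[128];
static size_t trace_len;

static void put(char c)
{
    if (trace_len + 1 < sizeof trace_log) {
        trace_log[trace_len++] = c;
        trace_log[trace_len] = '\0';
    }
}
static void rec_shift(int from, int to, char tok) { (void)from; (void)to; (void)tok; put('s'); }
static void rec_reduce(int rule, int exposed, int to) { (void)exposed; (void)to; put('r'); put((char)('0' + rule)); }
static void rec_accept(void) { put('a'); }
static void rec_error(int state, char tok) { (void)state; (void)tok; put('e'); }

/* Parses input with the recording hooks and returns the event log. */
const char *traced_parse(const char *input)
{
    static const trace_hooks rec = { rec_shift, rec_reduce, rec_accept, rec_error };
    trace_len = 0;
    trace_log[0] = '\0';
    parse(input, &rec);
    return trace_log;
}
```

Because the driver only talks to the tracer through the hook structure, the same driver can be paired with a different tracer (for example one that updates windows instead of appending to a log) without touching the parsing logic.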

Figure 1. Parser instrumentation process.

Figure 2. Display organization (Monitor, BNF, PDA, Guide, Comment, and State stack windows).

The lower level tracing takes the form of a slightly modified parsing table whose contents change dynamically with each step of the parsing process. The higher level tracing takes the form of a dynamically changing table of entries. Each entry is an ordered pair: a BNF production and the most recent part of the input string seen at a given point in the parsing process.

The two tracing levels take the form of two overlapping windows: the PDA window and the BNF window (Figure 2). The PDA window (lower level tracing) is normally in the foreground, since this is where changes occur much more often, and the BNF window (higher level tracing) is in the background. The BNF window comes to the front only when the respective BNF production is activated (e.g. by a reduce action of the parser).

The tracing process is controlled by the user in an interactive manner (see the next section). The user may define an input string to be processed or change the tracing mode and speed at any moment during the tracing process. Another feature of the user interface is the possibility to backtrack the parser to any of the steps executed so far. This feature is not restricted to the steps executed while parsing the currently analysed input string (see section 5). A user interface defined in this way enables each user to adjust the complete tracing process to their own liking.

3. The display organization/user interface

The display is organized as a set of dedicated overlapping windows. We tried to avoid overlapping as much as we could, but due to the quantity of information to be presented, a certain degree of overlapping remained. The display organization is presented in Figure 2. The PDA window is slightly shifted to the left in order to show that this window partly overlaps the BNF window. The display is divided into six windows. Four windows have headers which indicate their purpose. We will start with the headerless one.

3.1 Guide window

The guide window is an interface between the tracing process and the user. The window pops up at the beginning of each tracer run and prompts the user to modify the default values of the parameters necessary for the tracer execution. After the initial dialogue, this window disappears and pops up either at the beginning of the next run or during the current run (at user request). The initial dialogue assists the user in defining the values of the following parameters:

- Input file. The user is given the option either to use the default input file (standard input) or to define some other one. If the default option is chosen, the user is prompted with a shell-like interface similar to any typical Unix command interpreter accepting input strings from standard input. An input string may span several lines. The end of input is marked by an EOF (ctrl-D) character. If some other file is opted for, the user may use it without modification, modify it and then use it, or create it if it does not exist. The popular 'vi' editor is activated from within the tracer to support these editing activities.
- Tracing pace. This parameter defines the pace at which the tracing is done. It has two possible values: step and contiguous. If the value step is chosen, execution of each substep of the tracer run must be explicitly requested by the user (by pressing the space bar). Note that we differentiate between steps and substeps. A step denotes any change in the state of the PDA. A substep denotes any change in one of the following windows: PDA window, BNF window, State stack window, and comment window. A single step is typically composed of several substeps. If the value contiguous is chosen, the tracer executes substeps at regular time intervals determined by the value of the next parameter.
- Sleep grain. This is the time interval between substeps of the tracer run for the contiguous tracing pace. The parameter's name reflects the fact that this timing is realized by the sleep(g.grain) system call. g.grain is an integer field of the global data structure g where all important tracer parameters are stored. Its value ranges between 1 and 9 (seconds).
- Foretrack option. This feature offers the functionality of the fast-forward button on a tape recorder. The user may 'roll' the tracer run 'fast-forward' from the current point to some later point in the input string analysis. The name foretrack was chosen in order to indicate the relation between this option and the next one (the backtrack option). In the current implementation the 'fast-forward' run stops when a certain specified state of the PDA has been reached. Implementation of the following extensions to this halting criterion is under way:
  - stop when a given line of an input string/file has been reached,
  - stop after a given number of steps (state changes),
  - stop when a given token type has been recognized,
  - stop when the current part of an input string matches a given regular expression (full regular expressions assumed, like those used by the egrep pattern searching utility [4]).
- Backtrack option. This option enables backtracking the tracer for a certain number of steps. It does not necessarily have to be related to only one tracer run. Depending on how the main() function is organized, it is possible to store the tracing data for all tracer runs and backtrack beyond the limits of the current tracer run. This option is discussed more thoroughly later on.
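The pacing parameters described above can be sketched as follows. Only the structure name g and the field g.grain (with its range of 1 to 9 seconds) come from the text; the pace and halt_state fields, the function names, and the use of getchar for reading the space bar are assumptions made for illustration (a curses-based tracer would more plausibly read keys through the curses library).

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* sleep(); POSIX */

typedef enum { PACE_STEP, PACE_CONTIGUOUS } pace_t;

/* Only the names 'g' and 'grain' appear in the paper; the remaining
   fields are hypothetical. */
static struct {
    pace_t pace;       /* step or contiguous */
    int grain;         /* seconds between substeps, 1..9 */
    int halt_state;    /* foretrack target state, -1 if none */
} g = { PACE_STEP, 1, -1 };

/* Stores a new sleep grain; rejects values outside 1..9. */
int set_grain(int seconds)
{
    if (seconds < 1 || seconds > 9)
        return -1;
    g.grain = seconds;
    return 0;
}

/* Sets the PDA state at which a fast-forward run should stop. */
int set_halt_state(int state)
{
    assert(state >= -1);
    g.halt_state = state;
    return g.halt_state;
}

/* Halting criterion of the current implementation: a given PDA state. */
int foretrack_done(int current_state)
{
    return g.halt_state >= 0 && current_state == g.halt_state;
}

/* Called between substeps: waits for the space bar in step mode,
   sleeps g.grain seconds in contiguous mode. */
void pace_substep(void)
{
    if (g.pace == PACE_STEP) {
        int c;
        do {
            c = getchar();
        } while (c != ' ' && c != EOF);
    } else {
        sleep((unsigned)g.grain);
    }
}
```

Keeping all parameters in one global structure mirrors the description of g in the text and lets the guide dialogue change the pace at any moment without passing state through the driver.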

3.2 Monitor window

The monitor window is closely related to the guide window. Since the guide window pops up and remains on the screen only for short time intervals, the monitor window displays the current settings of the tracing parameters, the source of the input stream, and the current tracing pace. The upper part of Figure 3 illustrates both windows after the definition of the initial parameters of a typical parser/tracer run. The lower part of this figure shows an input string.

3.3 PDA window

The PDA window is divided into three areas (fields): the states area, the item set area, and the actions area. Each entry (row) within this window is related to one PDA state. The first field of the entry (states area) displays the state number. The elements comprising the item set for the relevant PDA state are displayed sequentially (one after another) in the item set area of the entry. The actions area displays the actions which are to be executed in this state depending on the next input token received from the lexical analyser.

Figure 4 presents the PDA window in one of the final steps of the syntax analysis for the input defined in Figure 3. The final (sixth) stage of the high order calculator program [4] is traced in this case. The high order calculator is used in that book to illustrate the implementation of a language in six stages, each of which contributes some additional features. The first stage (hoc1) is a simple four-function calculator.

Figure 3. Parameter definition protocol.

Figure 4. The PDA window.

Hoc2 and hoc3 introduce variables, built-in functions (sin, log, exp, etc.), and useful constants (e.g. PI, the Euler-Mascheroni constant, the golden ratio). Hoc4 generates code for each statement which is subsequently interpreted, rather than evaluated on the fly. The fifth stage introduces control-flow statements (if-else, while), statement grouping with { and }, and relational operators. The final stage (hoc6) offers support for recursive functions and procedures. The BNF grammar for hoc6 is given in Appendix 1.

Figure 4 illustrates some of the details regarding the presentation of the dynamic behaviour of the PDA. This is also called tracing at the PDA level. Since the length of the window is limited, we had to use scrolling in order to be able to show every state change of the PDA. The following scrolling strategy was chosen in order to minimize scenery changes:

- if the next PDA state is within the current window, proceed without scrolling;
- if the next PDA state is outside (above) the current window, scroll upwards, making the next state the one at the top of the window;
- if the next PDA state is outside (below) the current window, scroll downwards, making the next state the bottom one.

On a typical 24 rows x 80 columns display (e.g. a vt100), the PDA window is limited to 11 rows. Tracer windows automatically expand on larger terminals without any modification of the program. This is done by exploiting features of the curses library [5].
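The three scrolling rules above can be written as one small function. This is a sketch under a simplifying assumption that is not stated in the paper: states are listed in numeric order, one state per window row. The function name scroll_top and its parameters are hypothetical.

```c
#include <assert.h>

/* top:  number of the state shown in the first row of the window
   rows: height of the window in rows (11 on a 24 x 80 display)
   next: number of the state the PDA enters next
   Returns the state number that should be in the first row afterwards. */
int scroll_top(int top, int rows, int next)
{
    assert(rows > 0);
    if (next < top)
        return next;              /* above: next state becomes the top row */
    if (next >= top + rows)
        return next - rows + 1;   /* below: next state becomes the bottom row */
    return top;                   /* inside the window: no scrolling */
}
```

With an 11-row window showing states 0 to 10, a transition to state 45 gives scroll_top(0, 11, 45), which moves the window so that states 35 to 45 are shown and state 45 sits in the bottom row.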

3.4 BNF window

The BNF window is used for tracing at the next higher (Backus-Naur form) level. The window is divided into two areas: the BNF production area and the input read area. Each window entry (row) is related to one production rule of the grammar. The first field of the entry (the BNF production area) displays the respective production rule, while the second one shows the processed part of the input string.

The activation of the next production rule is first indicated at the PDA level/window, by highlighting the respective state and action fields. After these events, the BNF window overlaps the PDA window, the respective production rule is highlighted, and the input read field is updated. Figure 5 illustrates the sequence of events. The PDA window overlaps the BNF window again in the next substep, as the tracing proceeds.
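Since the input read field has a fixed width, only part of a long input can be shown in it; the paper states that ellipses are used and that a fixed number of input characters is displayed. A minimal sketch of such a field filler follows. The function name fit_field, the choice to keep the most recent (rightmost) characters, and the placement of the ellipsis at the left edge are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

enum { FIELD_MAX = 80 };   /* widest field this sketch supports */

/* Returns text unchanged if it fits into width characters; otherwise
   returns "..." followed by the last (width - 3) characters of text.
   The result lives in a static buffer that the next call overwrites. */
const char *fit_field(const char *text, size_t width)
{
    static char buf[FIELD_MAX + 1];
    size_t len;

    assert(text != NULL && width >= 4 && width <= FIELD_MAX);
    len = strlen(text);
    if (len <= width) {
        memcpy(buf, text, len + 1);            /* fits: copy with terminator */
    } else {
        memcpy(buf, "...", 3);                 /* mark the cut-off part */
        memcpy(buf + 3, text + len - (width - 3), width - 3);
        buf[width] = '\0';
    }
    return buf;
}
```

For an 8-character field, fit_field("simpl(1,2,3)", 8) yields "...,2,3)", while a short input such as "x+1" is shown unchanged.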

Figure 5. Substep sequence.

This figure also illustrates some of the drawbacks of our approach to displaying data. We chose to keep the fields within windows constant and to restrict the number of windows. As a result, production rules which are longer than the field length cannot be shown completely (refer to rule 31 in Figure 5 and in Appendix 1), and neither can the full length of the part of the input string currently read. The first drawback does not seem to be that serious (we are usually interested in the rule number). The second one was overcome by introducing ellipses and showing only a fixed number of the input characters seen at a given point in the parsing process. The number of characters shown is limited by the field length. What we gained was easier window handling.

In Figure 5 we see two instances of the display. The one at the bottom of the figure is two substeps 'younger' than the one at the top. A very informal description of the order in which the (sub)steps are executed is given in Figure 6.

3.5 Comment window/Stack window

Figure 5 illustrates two other windows as well. The state stack window displays the current contents of the state stack of the parser. The top of the stack contains the vital information (the PDA's current state), which is why it is always displayed. The comment window displays messages from the parser in the upper field, and the user defined messages (results, strings, diagnostic messages) in the lower field. Readers who have already had some experience with Yacc will notice that some parser messages are similar to the ones obtained when using Yacc in the debugging mode. This set of messages was extended in order to meet the educational purpose of the tracer.

    while (input string not processed) {
        parser directed state change;
        if (!first state change)
            unhighlight previous state;
        highlight current state;
        while (item set != empty)
            display next item;
        update stack window;
        update comment window;              /* parser field */
        if (!first state change)
            unhighlight previous action;
        display & highlight current action;
        if (current action == reduce by rule_number) {
            BNF window overlaps PDA window;
            if (current rule != first rule activated)
                unhighlight previous rule;
            highlight rule(rule_number);
            display input read;
            PDA window overlaps BNF window;
            update stack window;            /* pop states */
            update comment window;          /* parser field */
            if (user action specified) {
                execute user action;
                update comment window;      /* action field */
            }
        }
    }

Figure 6. Substep sequencing description.

3.6 Executing and interpreting the tracer

Once the tracing has started, the user can interrupt the tracer by pressing the g key. The guide window pops up, and the user is prompted to choose among quitting, restarting the tracer with the same input string or with a new one, tracing pace modification, sleep interval modification, foretrack, and backtrack.

If the backtrack option is chosen, the user may browse through the execution trace. An example execution trace is shown in Figure 7. The execution trace is divided into two fields: the state field, which is just a list of PDA states, and the list of execution steps for the relevant state. When scrolling up/down, the complete execution trace is moved. When shifting left/right, only the list of steps is moved, towards the leftmost/rightmost step of the execution trace. States remain where they are to make orientation easier. The list of numbers separated by commas represents the list of the relevant steps during the parsing process (this is the reason why we differentiate between steps and substeps). It is during the execution of these steps that the PDA is in the respective state.

Figure 7. Execution trace.

After the user has inspected the execution trace, the user is prompted to specify the target step to backtrack to. To go back to a certain visit of a specific state, it suffices to specify the execution step to go back to. For instance, to go back to the moment when the PDA was in state 14 for the fourth time in the previous example, one has to specify step 224 as the target step. After the user has specified the step number, the tracer backtracks to the moment when the specified step was executed, i.e. to the moment when the first substep of this step was executed.

The target specification for the backtrack option is rather primitive at the moment. We would like to extend it to be similar to the one used for the foretrack option. However, this extension is not under way, since our intention is first to experiment with multiple ways of logging execution traces and with different forms of presenting them to the user.
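The relation between states, visits, and step numbers in the execution trace can be sketched as a lookup over a step log. The log layout (one PDA state recorded per step, steps numbered from 1), the function name nth_visit_step, and the sample data are assumptions made for illustration; the sample is invented and does not reproduce the trace of Figure 7.

```c
#include <assert.h>
#include <stddef.h>

/* states[i] holds the PDA state entered at step i + 1.
   Returns the step number of the n-th visit of the given state,
   or -1 if the state was visited fewer than n times. */
int nth_visit_step(const int *states, size_t nsteps, int state, int n)
{
    size_t i;

    assert(states != NULL && n >= 1);
    for (i = 0; i < nsteps; i++)
        if (states[i] == state && --n == 0)
            return (int)(i + 1);
    return -1;
}

/* Invented sample log, for demonstration only. */
const int demo_states[] = { 0, 3, 14, 5, 14, 7, 14, 2, 14 };
const size_t demo_nsteps = sizeof demo_states / sizeof demo_states[0];
```

In this sample, state 14 is entered at steps 3, 5, 7 and 9, so its fourth visit corresponds to step 9; that step number is what a user would type as the backtrack target.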

4. Grammar independence

We postulated from the very beginning that users with earlier Yacc experience should have no problems when using Yt. This implies that the user needs only a general idea of the global structure of the Yacc input/specification file (any_name.y) in order to be able to specify a Yt input file. Figure 8 illustrates the additional entities which are to be added to the any_name.y file in order to make it a Yt input. The figure shows the overall organization of the Yacc input file [4]; the necessary additions (set in boldface in the printed figure) are marked with a /* Yt */ comment. We see that the additions take place in the main function (where the program execution starts) and in the yylex function (the lexical analyser). The main function given in Figure 8 is a typical one: the parts that are not Yt additions imply that the parser runs in a 'forever' loop, calling yyparse (the parser function) for each new input string. The semantics of the additions are as follows:

- init_trace does the first part of tracer initialization. It is called only once per tracer activation.
- setjmp(begin) (a library function) saves the stack environment in begin for later use by longjmp (another library function), which restores the environment saved by the last setjmp. These functions implement what is often referred to as a 'non-local goto'. A longjmp(begin) is used to implement the 'restart tracer with a new input' feature of Yt.
- setjmp(backtrack) serves a similar purpose, but is used to make tracer backtracking faster. Backtracking is implemented by simply reanalysing the same input string up to the specified halting location. longjmp(backtrack) makes a non-local jump to the beginning of the current tracer run.
- reinit_trace does the second part of tracer initialization. It is called once per tracer run.
- sleep(g.grain): the tracer rests for one g.grain after the parsing is done, and is then reinitialized for the next tracer/parser run.
- g.yytext is another field of the g data structure. It is reserved for the current input token, and this is where Yt expects to find it. This is why calls to the sprintf library function should be inserted into the lexical analyser. sprintf takes the current token and stores it in g.yytext, according to a given format.

This summarizes the changes which have to be made within a Yacc input file in order to make it a Yt input file.

    %{
        C statements like #include, declarations, etc. Optional section.
    %}
        yacc declarations: lexical tokens, grammar variables,
        precedence and associativity information
    %%
        grammar rules and actions
    %%
        more C statements (optional);

    main()
    {
        ...
        init_trace();                         /* Yt */
        setjmp(begin);                        /* Yt */
        while (1) {
            setjmp(backtrack);                /* Yt */
            reinit_trace();                   /* Yt */
            yyparse();
            sleep(g.grain);                   /* Yt */
        }
    }

    yylex()
    {
        ...
        sprintf(g.yytext, format, token);     /* Yt */
        ...
    }

Figure 8. Yacc input file structure/modifications.

We considered making the above changes automatic and rejected this idea because 'hard-wiring' the main function would restrict the flexibility of organizing the tracing/parsing activities.

One of the factors that made tracer development much easier was the Yacc feature intended to help users understand the actual PDAs generated by this parser generator. This is the -v (verbose) option, which instructs Yacc to produce a y.output file containing a description of the PDA in a human readable form. We used this option to obtain all the information we needed for displaying item sets and actions within the PDA window, as well as the production rules within the BNF window. Another option which we also exploited was the -d option, which instructs Yacc to produce a y.tab.h file with (among other things) the list of all token names defined by the user along with the respective values assigned to them by the parser generator. One might say that Yt is grammar independent, but for that it owes much to Yacc for being so generous in producing detailed information regarding generated parsers.

The actual guinea-pigs for Yt were the DING DONG DELL example from the Yacc manual/report [1], along with all six stages of the high order calculator (hoc) program from the Unix Programming Environment book [4]. The initial development (80% of the final version) was done with the DING DONG DELL example, after which the unmodified version was applied to the first four stages of hoc. The rest of the tracer (mostly some application dependent extensions) was developed using the last two stages of hoc (see section 6).
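The 'restart tracer with a new input' mechanism built on setjmp and longjmp, described earlier in this section, can be illustrated in isolation. The sketch below is not Yt's main function: tracer_run, run_with_restarts, and the counters are hypothetical stand-ins, and the restart request that would normally come from the guide menu is simulated by a flag. Counters are kept in static storage because automatic variables modified between setjmp and longjmp have indeterminate values unless declared volatile.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf begin_env;   /* plays the role of 'begin' in Figure 8 */
static int restarts;        /* how often control came back via longjmp */
int runs;                   /* how many tracer runs were started */

/* Hypothetical stand-in for one tracer run: when the user asks for a
   restart, control jumps straight back to the setjmp call. */
void tracer_run(int user_wants_restart)
{
    if (user_wants_restart)
        longjmp(begin_env, 1);
}

/* Starts runs until max_restarts restart requests have been served,
   then lets the last run finish normally. Returns the restart count. */
int run_with_restarts(int max_restarts)
{
    assert(max_restarts >= 0);
    runs = 0;
    restarts = 0;
    if (setjmp(begin_env) != 0)
        restarts++;                        /* arrived here through longjmp */
    runs++;
    tracer_run(restarts < max_restarts);   /* simulated user request */
    return restarts;
}
```

Calling run_with_restarts(3) starts four runs in total: three are abandoned through longjmp and the fourth returns normally. The same pattern, with a second jmp_buf inside the loop, corresponds to the setjmp(backtrack) call of Figure 8.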

5. Implementation details

The actual generation of the final tracer/parser file proceeds automatically, with the help of the make [7, 8] file update utility. make executes commands in the description file to update one or more target names. Figure 9 shows the parts of the description file for ding.y (the Yacc input file for the DING DONG DELL example) relevant to the discussion. This illustrates the tracer generation process.

The first three lines are simple make macros. Lines starting with a non-white space character are called dependency lines. The names to the left of the colons are target names, which depend on the items listed to the right. The lines immediately following the dependency lines (the ones starting with white space characters) are command lines. These are executed when a target is older than any of the items it depends on.

The semantics of these lines are as follows: ding is dependent on all object files defined in the macro OBJS (${OBJS} is a macro reference). When any of these is younger than the target, ding has to be rebuilt (line 6). Line 6 further states that the objects are linked together with the curses library and the Yt library, where all the tracer routines are located. The next dependency line states that ding.o is dependent on the respective input file (ding.y), which is first preprocessed by the sed.stdio routine, resulting in the ding+.y file.

     1  CFLAGS = -O
     2  YFLAGS = -d -v    # force the creation of y.tab.h and y.output
     3  OBJS   = ding.o ytabh.o yprod.o actions.o buffers.o
     4
     5  ding:     ${OBJS}
     6            cc ${CFLAGS} -o ding ${OBJS} -lcurses -lyt
     7
     8  ding.o:   ding.y
     9            sed.stdio ding.y > ding+.y
    10            yacc ${YFLAGS} ding+.y
    11            rm ding+.y
    12            ed - y.tab.c < ed.script
    13            cc ${CFLAGS} -c y.tab.c
    14            mv y.tab.o ding.o
    15
    16  ytabh.o:  y.tab.h
    17            awk.ytabh > ytabh.c
    18            cc ${CFLAGS} -c ytabh.c

Figure 9. Typical tracer makefile.

Yacc-ing this file results in three files: y.tab.c, y.tab.h and y.output. The first one is modified with the ed editor by using a standard set of editor commands located in the ed.script file. This modified file is compiled and the resulting object file is renamed from y.tab.o to ding.o. The third dependency line states that ytabh.o is dependent on y.tab.h, and the two command lines describe how ytabh.o is produced. Other files listed in OBJS are handled in a similar manner.

Any casual user of Unix will notice that names like sed.stdio and awk.ytabh do not belong to the standard utility set; these are small filters written by using standard Unix rapid prototyping facilities: awk, the pattern scanning and processing language/utility [9, 10]; sed, the stream editor [4]; and sh, the standard command programming language/utility [11]. None of them is longer than 50 lines.

The previous Yt version had a simpler make description file, because both the y.output and y.tab.h files were read in at the beginning of each tracer run and searched for the relevant items which were to be displayed. The problem with this version was that the size of the y.output file grows quickly with the complexity of the grammar submitted to Yacc, so we ended up searching as much as 32 kbytes each time we (re)initialized the tracer, with obvious consequences for start-up time. Since the lists of productions and token names are invariant for a given grammar, we decided to build them only once (during the generation process), save them in a couple of C source files (yprod.c, ytabh.c, actions.c, and buffers.c), and 'hard-wire' them into the tracer.

Here is a brief description of the files which comprise the tracer library:

init.c
    Consists basically of two parts. The first part handles the initialization of windows, signal handling, memory allocation for tables and buffers, and their initialization. The second part is responsible for the necessary actions regarding reinitialization for each tracer run.

guide.c
    Contains the code which controls the interactive dialogues between the user and the tracer. The code may be subdivided into: the part which handles dialogues during the parsing process, the part which deals with dialogues during the code 'execution' process (see the next section), and the set of functions used by both.

trace.c
    Represents the central part of the tracer code. This file mostly consists of the code of the 12 tracing functions which are referenced from within the modified y.tab.c file. In addition, this file contains code for pacing control of the tracer run, archival, state stack and comment window updates, as well as a set of functions for searching for items to be displayed.

mapping.c
    Handles the scrolling (up/down) and shifting (left/right) of window contents, as well as formatting the output to be displayed.

tables.c
    Contains the code for the handling of exceptional conditions, signal handling, and the table initialization/modification code used by most of the other functions of the tracer.

util.c
    A set of handy functions used during the tracing process, as well as 'scaffolding' functions to be used when (re)debugging the tracer: functions for dumping the contents of tables, buffers, and tracer data structures in general.
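The decision to extract token names once, at generation time, can be illustrated by the kind of line-level work a filter such as awk.ytabh has to do on y.tab.h. The paper does not show that filter or the layout of ytabh.c, so the following C sketch is only an assumed equivalent: the token_def type and the parse_define function are hypothetical, and the sketch handles only lines of the form "# define NAME value".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One token definition taken from a y.tab.h line (hypothetical type). */
typedef struct {
    int ok;           /* 1 if the line was a numeric token definition */
    char name[64];    /* token name, e.g. NUMBER */
    int value;        /* token value assigned by the parser generator */
} token_def;

/* Recognizes lines such as "# define NUMBER 257" or "#define VAR 258".
   Any other line (typedefs, non-numeric defines) yields ok == 0. */
token_def parse_define(const char *line)
{
    token_def t;

    assert(line != NULL);
    memset(&t, 0, sizeof t);
    /* White space in the format matches any amount of input white space,
       including none, so both spellings of the directive are accepted. */
    if (sscanf(line, " # define %63s %d", t.name, &t.value) == 2)
        t.ok = 1;
    return t;
}
```

Running such a recognizer over every line of y.tab.h at build time and emitting the results as an initialized C array is one way to obtain a table that can be compiled and linked into the tracer, instead of being searched for at every (re)initialization.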

Interaction with Yt is completely menu driven. When interacting with Yt, the user remains in full-screen mode. The menu system is almost completely implemented with shell procedures/scripts activated from within the tracer. Since menus are interpreted instead of compiled, one has much more freedom to experiment with different menu designs. The external structure of this menu system is tree-like. There are two types of menus: general purpose menus (like the Guide Menu in Figure 3), and parameter collection menus. A general purpose menu contains the menu's name, a heading, optional help or descriptive text, indexed menu items, and a prompt asking the user to choose a menu item. Parameter collection menus prompt the user for values that are passed to the tracer. The only thing that must remain invariant is the format of the temporary file through which these values are communicated. We are considering the possibility of introducing an on-line help facility as a simple extension of the menu system.

6. Some application dependent extensions

This part covers some of the extensions which we were tempted to make within the last three stages of hoc. The development of hoc consists of six stages. The first three stages (hoc1, hoc2, hoc3) are organized in such a way that parsing an input string causes immediate evaluation of each recognized (sub)expression. The last three stages instead generate, for a given input string, code for a simple (virtual) machine which is supposed to execute it. This code is executed only after the complete input string has been parsed/translated. The hypothetical computer is a simple stack machine. We found it very instructive to trace both the parsing of the input string and the ‘execution’ of the generated code, so our tracer was extended to include this feature as well. The changes (additions) actually introduced in the tracer code were minor. They comprised the introduction of three additional windows: the program window, the frame stack window (which corresponds roughly to what one usually encounters as the activation record stack), and the value stack window, which is the actual stack of the virtual machine. The three windows are placed in front of all other windows and overlap the BNF, the PDA, and the guide windows. Figure 10 shows one sequence while tracing the execution of the code generated by calling/activating the simpl function (see Figure 3): simpl(sin(2), exp(2.3*log(1.5)), PI). The guide.c module had to be extended in order to handle the additional dialogues which take place during the ‘execution’ of code. Code execution tracing is done in a manner analogous to the tracing of the syntax analysis. The set of dialogues already described (foretrack and backtrack included) is available to the user during code execution tracing. This feature looks very attractive, but is not of special importance to the Yt implementation.
Most of the changes were actually done in the code.c module of hoc which is responsible for generation/execution of the code for the virtual stack machine. However, the extensions were simple to implement since we already had all the prototype solutions for these problems within the tracer code.
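A minimal sketch of a traced stack machine may help to picture what the program and value stack windows show. It is not the hoc or Yt code: hoc stores pointers to C functions in its program array, whereas the sketch below uses a hypothetical OpCode enumeration, a hypothetical Instr structure and a hypothetical run_traced function, and supports only three instructions.

```c
#include <stdio.h>

/* Hypothetical opcodes; hoc itself stores pointers to C functions
   (constpush, mul, ...) in its program array instead. */
typedef enum { OP_CONST, OP_ADD, OP_MUL, OP_STOP } OpCode;

typedef struct {
    OpCode op;
    double value;            /* operand, used by OP_CONST only */
} Instr;

#define STACK_MAX 64

static const char *op_name(OpCode op)
{
    switch (op) {
    case OP_CONST: return "constpush";
    case OP_ADD:   return "add";
    case OP_MUL:   return "mul";
    default:       return "STOP";
    }
}

/* Execute prog up to OP_STOP. With trace != 0, print the program
   counter, the instruction and the value stack after every step.
   Stores the top of stack in *result and returns 0; returns -1 on
   stack overflow, underflow, or an empty final stack. */
int run_traced(const Instr *prog, int trace, double *result)
{
    double stack[STACK_MAX];
    int top = 0;             /* number of values on the stack */

    for (int pc = 0; prog[pc].op != OP_STOP; pc++) {
        switch (prog[pc].op) {
        case OP_CONST:
            if (top >= STACK_MAX) return -1;
            stack[top++] = prog[pc].value;
            break;
        case OP_ADD:
        case OP_MUL:
            if (top < 2) return -1;
            if (prog[pc].op == OP_ADD)
                stack[top - 2] += stack[top - 1];
            else
                stack[top - 2] *= stack[top - 1];
            top--;
            break;
        default:
            return -1;       /* unknown instruction */
        }
        if (trace) {
            printf("%2d: %-9s |", pc, op_name(prog[pc].op));
            for (int i = 0; i < top; i++)
                printf(" %g", stack[i]);
            putchar('\n');
        }
    }
    if (top < 1) return -1;
    *result = stack[top - 1];
    return 0;
}
```

For the instruction sequence constpush 2, constpush 3, add, constpush 4, mul, STOP the function stores 20 in *result; with tracing switched on, each printed line corresponds to one step that a tracer could display in its program and value stack windows.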

[Figure 10: tracer screen with the INPUT READ, PROGRAM, FRAME and STACK windows. The STACK window holds the values 3.1415927, 2.5410306 and 0.90929743; the FRAME window shows simpl 3 29; the comment window reads ‘reducing by (61)’ and ‘... popping 3 state(s) from stack’.]

Figure 10. Code execution tracing sequence.

7. Conclusions, extensions and applications

Yt offers help in two areas: the analysis of existing programs/grammars and the development/specification of new programs/grammars. The time needed to analyse a grammar thoroughly has been significantly reduced. There is no need for time-consuming searches through the y.output file, which tends to become extremely large for non-trivial grammars. One of the non-trivial grammars analysed with Yt was the C grammar as specified in the portable C compiler [12]. The foretrack/backtrack feature of the tracer has proven to be extremely useful for novices because it enables frequent re-examination of already analysed steps, something unavoidable in the learning process. In addition, the user can dynamically change the tracing pace to the value that suits him/her best. No reference to mice is to be found in this article, since we assumed that, in spite of their popularity and user friendliness, the current state of affairs (at least in our environment) is such that most students have only mouse-less alphanumeric terminals at their disposal. Switching to high-resolution bitmap terminals poses no problem for the current form of the tracer display, but it raises an interesting challenge: to move from Yacc tracing to Yacc animation by using the graphical capabilities of these devices. Finally, we expect that the experience acquired during the development of Yt, as well as the experience we are about to acquire with the Yt extensions, will be a great help in developing an interactive tool for the structural analysis/presentation of larger program pieces.


References

1. S. C. Johnson 1979. Yacc: Yet another compiler compiler. UNIX Time Sharing System: UNIX Programmer’s Manual, Seventh Edition, Volume 2B. AT&T Bell Laboratories, Murray Hill, NJ.
2. A. V. Aho & S. C. Johnson 1974. LR parsing. Computing Surveys, 6 (2), 99-124.
3. A. V. Aho, R. Sethi & J. D. Ullman 1986. Compilers: Principles, Techniques and Tools. New York: Addison-Wesley.
4. B. W. Kernighan & R. Pike 1984. The Unix Programming Environment. Englewood Cliffs, NJ: Prentice-Hall.
5. K. C. R. C. Arnold. Screen updating and cursor movement optimization: a library package. UNIX Programmer’s Supplementary Documents, 4.3BSD.
6. M. E. Lesk 1975. Lex-A lexical analyzer generator. Computer Science Technical Report No. 89, AT&T Bell Laboratories, Murray Hill, NJ.
7. S. E. Feldman 1979. Make-A program for maintaining computer programs. Software Practice and Experience, 9, 255-265.
8. S. E. Feldman 1979. Make-A program for maintaining computer programs. UNIX Time Sharing System: UNIX Programmer’s Manual, Seventh Edition, Volume 2B. AT&T Bell Laboratories, Murray Hill, NJ.
9. A. V. Aho, B. W. Kernighan & P. J. Weinberger 1979. Awk-A pattern scanning and text processing language. Software Practice and Experience, 9, 267-279.
10. A. V. Aho, B. W. Kernighan & P. J. Weinberger 1988. The AWK Programming Language. New York: Addison-Wesley.
11. S. G. Kochan & P. H. Wood 1985. UNIX Shell Programming. London: Hayden Books.
12. S. C. Johnson & D. Seely 1986. A tour through the portable C compiler. UNIX Programmer’s Supplementary Documents, 4.3BSD.

Rade Adamov is currently with the Institute of Computer Science of the University of Zurich. Before joining the university in 1985, he had several years of industrial experience in digital systems design, telemetry, and the design of supervisory, control, and data acquisition systems. His research interests include software quality assurance, software testing, and compiler construction. He received his Dipl. Ing. and MS degrees in electrical engineering from the University of Belgrade, Yugoslavia, and the PhD in computer science from the University of Dortmund, FR Germany.

Appendix 1. The grammar for hoc6

list:       /* nothing */                           (1)
          | list '\n'                               (2)
          | list defn '\n'                          (3)
          | list asgn '\n'                          (4)
          | list stmt '\n'                          (5)
          | list expr '\n'                          (6)
          | list error '\n'                         (7)
          ;
asgn:       VAR '=' expr                            (8)
          | ARG '=' expr                            (9)
          ;
stmt:       expr                                    (10)
          | RETURN                                  (11)
          | RETURN expr                             (12)
          | PROCEDURE begin '(' arglist ')'         (13)
          | PRINT prlist                            (14)
          | while cond stmt end                     (15)
          | if cond stmt end                        (16)
          | if cond stmt end ELSE stmt end          (17)
          | '{' stmtlist '}'                        (18)
          ;
cond:       '(' expr ')'                            (19)
          ;
while:      WHILE                                   (20)
          ;
if:         IF                                      (21)
          ;
begin:      /* nothing */                           (22)
          ;
end:        /* nothing */                           (23)
          ;
stmtlist:   /* nothing */                           (24)
          | stmtlist '\n'                           (25)
          | stmtlist stmt                           (26)
          ;
expr:       NUMBER                                  (27)
          | VAR                                     (28)
          | ARG                                     (29)
          | asgn                                    (30)
          | FUNCTION begin '(' arglist ')'          (31)
          | READ '(' VAR ')'                        (32)
          | BLTIN '(' expr ')'                      (33)
          | '(' expr ')'                            (34)
          | expr '+' expr                           (35)
          | expr '-' expr                           (36)
          | expr '/' expr                           (37)
          | expr '*' expr                           (38)
          | expr '^' expr                           (39)
          | '-' expr                                (40)
          | expr GT expr                            (41)
          | expr GE expr                            (42)
          | expr LT expr                            (43)
          | expr LE expr                            (44)
          | expr EQ expr                            (45)
          | expr NE expr                            (46)
          | expr AND expr                           (47)
          | expr OR expr                            (48)
          | NOT expr                                (49)
          ;
prlist:     expr                                    (50)
          | STRING                                  (51)
          | prlist ',' expr                         (52)
          | prlist ',' STRING                       (53)
          ;
defn:       FUNC procname '(' ')' stmt              (54)
          | PROC procname '(' ')' stmt              (55)
          ;
procname:   VAR                                     (56)
          | FUNCTION                                (57)
          | PROCEDURE                               (58)
          ;
arglist:    /* nothing */                           (59)
          | expr                                    (60)
          | arglist ',' expr                        (61)
          ;