Automatic software test generation


M Camuffo, M Maiocchi* and M Morselli

The paper deals with the automation of quality control activities in software development. It proposes the description of the functional specifications of a program (syntax and semantics) using a formalism based on two-level grammars. Some tools, which can generate automatically both tests and related expected results and can provide a measure of the functional coverage, are presented. Limitations are examined and discussed. Real applications are shown, evaluating the economic advantages in industrial experiences.

software quality, quality control, …
In the last few years the problem of software program quality has had a high profile and gained considerable support to find a possible solution. Comparison between software quality (and mainly confidence in the quality) and the quality obtained in other kinds of products (e.g., cars, foods, etc.) is discouraging, especially taking into account the growing role played by software components in many complex products (such as aircraft, industrial plants, power plants, etc.). Moreover, while people accept a fault due to 'external' reasons (age, shocks, heat, etc.), they can hardly stand the idea of errors already present in the product before its use.

Software products can be modified and changed at low cost with respect to other technologies (such as hardware), and for this reason all complex, evolutionary, and poorly understood problems are approached through software technologies, producing a high level of uncertainty in software applications.

Owing to these problems, as well as to the impossibility of determining foundations for general problem solving, methodologies have been set up (this term is used as opposed to 'theories'), together with expressive languages for describing the work going on, so that their results could be joined with the diffusion of formal (or semi-formal, often graphical) languages. In fact methodologies began to emerge in programming (structured programming [1], the Warnier [2], Jackson [3], and Phos [4] methods, etc.), for which formal languages (the programming languages) were available, and then turned towards design (such as Petri nets [5] or structured design [6]) and specifications (such as Petri nets, SADT [7], structured analysis [8], etc.).

Etnoteam S.p.A., Via A. Bono Cairoli 34, 20127 Milan, Italy. *Also at Department of Computer Science, University of Milan, Via Moretto da Brescia 9, 20133 Milan, Italy. Paper submitted: 2 August 1989. Revised version received: 19 December 1989.

vol 32 no 5 june 1990

The methods were simply related to the project activity from a technical point of view; management needs were quickly discovered and pointed out in response to the fast-growing size and complexity of the processes: a clear software life-cycle subdivided into coordinated phases, configuration management and change control techniques, documentation and standards, and verification and validation (V&V) are the basis of what is called, together with the previous techniques, software engineering.

The management aspects are actually the most important elements for a successful software design. However, they are not yet widely accepted as part of program management in software production: minimal requirements on the subject have been a matter of recommendations and norms, such as NATO AQAP-13 [9] and the ISO 9000 series [10]. Following the rules outlined as recommendations in software project management is a precondition for assuring receipt of a good product, but not an assurance: each product must be accurately verified and validated by itself, to guarantee proper confidence in its behaviour. Thus verification and validation (V&V) activities are planned in all recommendations. The following are defined:

• Verification: an activity to check if the result of a design phase is completely coherent with its inputs and with the stated production rules. ('Are we implementing the product correctly?')
• Validation: an activity to check if the result of a phase is coherent with the initial requirements of the product. ('Are we implementing the right product?')

The recommendations give the proper identification of the type, way, and quantity of V&V to be performed; these elements and the reasons for the choices are then documented in a quality plan, describing the V&V approach for a single project. But the cost of V&V activities can be high, meeting and perhaps exceeding 50% of the overall expenses of the project.
Therefore, opposing interests are raised between purchaser and provider:

• The customer requires the maximum possible correctness and reliability in the product, but does not like overpaying for quality control.
• The producer does not like increasing project costs if not refunded by the purchaser, so prefers to reduce V&V activities despite the impact on the quality of the final product.

From this point of view, the solution must imply a strong

0950-5849/90/050337-10 © 1990 Butterworth-Heinemann Ltd


reduction of V&V costs, possibly through their (at least partial) automation.


TESTING: WHY, WHAT, HOW

The term 'quality', when applied to software, has various meanings and is generally intended as a mix of attributes useful in different degrees for different kinds of products and users: this paper deals only with correctness, the attribute most relevant to reliability. V&V activities are carried on according to many techniques, ranging from reviews to walkthroughs, inspections, etc.; but the most important and expensive is the testing activity [11, 12]. A test is composed of:

• a set of data to exercise a program
• a set of expected results from its execution
• a mechanism (programs, rules given to humans, etc.) to carry the system under test into a predefined state, to submit input data, and to collect output results
• a mechanism (a program, rules given to humans, etc.) to compare actual results with expected ones
• a description of test scope, that is, the list of 'elements' (behaviour of the program, physical entities such as variables or control paths, etc.) on which the test has focused

Moreover, a self-checking test is defined as a test that contains some ad hoc control statements able to check the congruence between expected results and real ones, computed during the exercise of the test itself. Testing is practically unavoidable, as it is the only way to demonstrate that the program behaviour matches the required one, and it is related to validation. Unfortunately, tests cannot guarantee complete program correctness because of the large number of possible execution histories. So, for that purpose, only those values that represent a class are selected, supposing that the routine will also run correctly for the other values in the class once the related test has been correctly executed. Tests are classified by many criteria [13], two of which are particularly relevant to the discussion:

• Positive/negative tests. A positive test checks that a program works correctly when correctly used (i.e., when the input data are provided correctly and in the proper form); a negative test checks that a program also works correctly when misused, i.e., that incorrect inputs (values or formats) are recognised, discarded, and signalled, and no crashes or damage occur.
• Functional/structural tests. A functional test checks that the item functions properly as provided to the user, while a structural test exercises physical parts of the program: it is working correctly when executing a specific control path.

Tests cannot guarantee program correctness, but an adequately tested program is recognisable, compared to one


less adequately tested, because of the rare occurrence of malfunctions. This indicates that the correctness of software can be increased through adequate testing. If it were possible to determine, through a proper statistical analysis, the probability that an execution would present a malfunction, the number of tests to be performed to reach the proper mean time between failures could be defined. So the availability of a uniform testing approach is determined by the capability to measure this quantity as it relates to quality control.

[Figure 1. Test running phase in the software life-cycle: requirement analysis, specification, design, implementation, testing (with test preparation and quality control), and maintenance, with feedback loops]
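To make the statistical idea concrete, the number of failure-free test executions needed to support a claim about the per-execution failure probability follows from elementary reliability arithmetic. The sketch below is an illustration added here (it is not part of the paper's tooling) and uses the standard bound n ≥ ln(1−c)/ln(1−p) for p the claimed failure probability and c the desired confidence.

```python
import math

def runs_needed(p: float, confidence: float) -> int:
    """Minimum number of failure-free test runs needed to claim, with the
    given confidence, that the per-execution failure probability is below p."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

# e.g. to support 'fewer than 1 failure per 1000 executions' at 95% confidence:
print(runs_needed(0.001, 0.95))  # 2995
```

As expected, tightening either the probability bound or the confidence level raises the number of required runs, which is why measuring this quantity is central to budgeting the testing effort.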

EFFECTIVENESS AND EFFICIENCY OF VALIDATION PHASE

Unfortunately, software engineering techniques are not so advanced in quality control, and in particular in testing activities, with regard to effectiveness and efficiency. That is:

• Effectiveness. What is required is an objective measure of the quantity of quality control performed. While structural testing approaches happen to be the key for obtaining measures of topological testing, a primary need is a measure of functional testing, this being the real goal for matching the product behaviour with user expectations. A formal description of user requirements allows an unambiguous definition of what a functional item is and a clear measure of its exercise.
• Efficiency. Two aspects have to be considered: efficiency in production and efficiency in use. As far as efficiency in test production is concerned, a formal description of the product allows automatic test generation and considerably reduces man-time effort. Moreover, the test running phase induces a feedback activity (see Figure 1) which can largely affect the delivery time, as roughly shown in Figure 2.

Figure 2 shows that the quantity of work done by the quality control group increases because of the larger amount of executable and non-regression tests produced as a consequence of discovered errors, fixed by the project group; this fact also influences the project group, waiting for the completion of the quality control activity [14]. The use of automatic self-checking tests largely reduces the above problem. Nevertheless, formal specification techniques currently available still have some problems with practical use in an industrial environment;

information and software technology

their use is complex and their applicability is not general. Term algebras [15] are a good example of formalisms that, although effective in describing things, have not been accepted in the industrial environment. Furthermore, these formal specification techniques tend to describe the expected product behaviour, without any effort in specifying the handling of user mistakes and errors. In spite of that, the availability of a formal specification language [16], even if such a language can only describe the correct behaviour of a program, gives unusual advantages that are good enough to justify the cost involved in partly using such a technique, not necessarily oriented to the automatic generation of the final product:

• First, formal specifications, even if partial and only referring to a correct use of the product, can be coupled with an interpreter that 'executes' them, supplying an efficient prototype for measures of adequacy and for experimentation.
• Moreover, these specifications, coupled with the interpreter, induce a topology in the functional description, and thus a measure of the quantity of 'elements' inside the specification (the functional size of a product and of its exercise by means of the tests), and allow the development of tools that automatically generate all test items meaningful for an exhaustive functional coverage of the product, coupled with the expected output related to the tests themselves.

The presented work is in the field of the automatic generation of test items, starting from formal specifications, and refers to some tools and experiences performed in the concrete industrial environment of the Olivetti Labs.

[Figure 2. Quantity of work done by the quality control group over time]

[Figure 3. Automation of the test generation phase: a formal description and coverage requests feed the automatic test generator (ATG), which produces self-checking tests and coverage information]

PROPOSED APPROACH

The present work addresses functional testing, based on the assumption that a software product must be validated according to user requests [17]. Note that among the various kinds of 'users', software layers can also be considered. In this approach the functional specification of a product points out what must be validated. Moreover, in



the authors' point of view, even if it is not proved how good the structural approach is, it does support the functional one (for instance, in effectiveness). A validation system for a software product must address many organisational and industrial aspects, including:

• Tests must be developed in a short time, to execute tests as soon as possible, and at low cost.
• Both costs and time for test execution must be low, to avoid delays in the release of the product (usually a long time is required because of poor test automation, with a lot of data-entry operations and visual checks of the results).
• It is necessary to handle test results, avoiding 'volatile' checks (usually employed by a programmer during a visual check) by recording the launch reports on suitable media. This is possible only with automatic checks.

Hence a test must be automatic, that is, it must be thought of as a set of:

• input data
• expected results
• launch instructions, to know how to run the product, give it input, and collect output
• programs that compare actual output with expected ones
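As a sketch, the four ingredients listed above can be bundled into one object. The names below are hypothetical and serve only to illustrate the structure of an automatic test; the "product under test" is a trivial stand-in.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class AutomaticTest:
    input_data: Any                       # data submitted to the product under test
    expected: Any                         # results expected from the execution
    launch: Callable[[Any], Any]          # how to run the product and collect output
    compare: Callable[[Any, Any], bool]   # actual-vs-expected check

    def run(self) -> str:
        actual = self.launch(self.input_data)
        return "PASS" if self.compare(actual, self.expected) else "FAIL"

# A trivial product under test: an integer adder.
test = AutomaticTest(
    input_data=(2, 3),
    expected=5,
    launch=lambda args: args[0] + args[1],
    compare=lambda actual, expected: actual == expected,
)
print(test.run())  # PASS
```

Recording the PASS/FAIL outcome on suitable media, rather than relying on a visual check, is exactly what makes such a test non-volatile in the sense described above.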

This approach is completely general and suitable also in those cases in which a visual check seems unavoidable (e.g., screen or printer output) [18]. Cost reduction of software quality control can be obtained through automation of the test generation phase: a tool is described below that can generate test items, starting from a formal description of the functional specifications of the product to be tested (see Figure 3). As can be seen from Figure 3, the use of formal specifications allows the development phase and test preparation to be performed concurrently.

To define specifications formally, an ad hoc language is needed that can describe both the syntactical (including contextual parts) and semantic aspects of a software product. For this purpose, context-free parametric grammars (abbreviated cf.pg) are used: particular two-level grammars described in the following. A cf.pg also includes the semantics description. The formal functional specifications describe the program behaviour, i.e., the cf.pg describes all the legal input to the program, together with the 'meaning' of each, given by the semantic part. So a test can be considered as a phrase derived by the cf.pg, together with the related 'meaning'. Figure 4 shows a high-level schema of the technique.

[Figure 4. High-level schema of the automatic test generation technique: the syntax is parsed into an internal form, from which the generator derives tests; the interpreter executes the internal form of the semantics to produce expected results; the test equipper combines tests and expected results into self-checking tests]

The generator is a program that can generate syntactically and contextually correct tests through a left-most derivation of the attributed syntax (cf.pg) given in input. It is possible to drive the generation by forcing the application of some production rules. The interpreter is a program that simulates the behaviour of the product under validation, by 'execution' of the semantics, and outputs the expected results related to the test built by the generator. The test equipper is a program that, aggregating the generated tests and expected results, can build self-checking tests that automatically verify the congruence between the actual and expected results [19].

The following presents the case of compiler testing, where the main points of the approach are immediately identifiable, i.e.:

• syntax, which is the syntax of the language translated by the compiler under test (together with contextual aspects)
• semantics, which is the formal translation of the semantic actions bound to the syntactical language clauses
• program tests, which are programs written in the source language as input for the compiler
• expected results, which are the results of the execution of the program tests
• final tests, which are the program tests suitably turned into self-checking tests

Finally, some less straightforward cases are discussed, namely negative tests, termination, and all-branches coverage.

CONTEXT-FREE PARAMETRIC GRAMMARS


The cf.pg are two-level grammars described with a formalism similar to Backus-Naur Form (BNF). The main characteristic of cf.pg is the capability of associating contextual attributes with nonterminal symbols, so that the applicability of a production depends both on the congruence of the nonterminal in the left part and on the coherence with the contextual attributes. A cf.pg production consists of:

• A left part, composed of a nonterminal symbol, possibly followed by a list of attributes called the parametric part; use of these attributes allows management of the contextual aspects of the product to be described.
• A right part, composed of a sequence of terminal and nonterminal symbols; nonterminals can be associated with a parametric part, which can in turn contain a sequence of terminals and nonterminals, like a generic right part, without limits on the nesting depth.

The following presents some examples of both the syntax description and the semantics. A complete formal presentation of cf.pg theory has been given elsewhere [20, 21].
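The two-part structure just described can be modelled directly as a data structure. The sketch below (hypothetical names, for illustration only) shows a left part as a nonterminal carrying a parametric part, and a right part as a mixed sequence of terminals and nonterminals, with nesting allowed inside parametric parts.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Nonterminal:
    name: str
    # Parametric part: a sequence of terminals and nonterminals,
    # nested without limits on depth, as in the cf.pg definition.
    params: List[Union[str, "Nonterminal"]] = field(default_factory=list)

@dataclass
class Production:
    left: Nonterminal                     # nonterminal plus parametric part
    right: List[Union[str, Nonterminal]]  # terminals and nonterminals

# Example: <factor>(E1*L) ::= ( <expression>(E1*L) )
p = Production(
    left=Nonterminal("factor", ["E1", "*", "L"]),
    right=["(", Nonterminal("expression", ["E1", "*", "L"]), ")"],
)
print(p.left.name, len(p.right))  # factor 3
```

Note that the outer parentheses in the right part are plain terminal strings, while the parametric parts live inside the `Nonterminal` objects, mirroring the distinction the grammar notation makes.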

Example of cf.pg syntax

In the description of a programming language the attributes of the grammar belong to one of the following sets:

• ID, set of allowed identifiers, which represents all names that can be used within the language rules
• Ei, user-defined support sets (they can contain, for instance, a group of terminals useful to handle contextual aspects)
• T, set of terminals
• N, natural numbers

In the following example the syntax of a Pascal expression is described. First, the usual BNF is presented; then it is changed into cf.pg. The BNF description is:

<expression> ::= <term> | <expression> <addop> <term>
<term> ::= <factor> | <term> <multop> <factor>
<factor> ::= <varof> | ( <expression> )
<multop> ::= * | / | div | mod | and
<addop> ::= + | - | or
<varof> ::= <identifier>

where <identifier> is not further specified, as it is a generic identifier of a variable. At this point the sets described above have to be defined, starting with the ID set, the set of allowed identifiers. (In the real generation process these identifiers are automatically built during the derivation, but in this example they are assumed to be present already.) Let:

ID = {k, i, j, x}

Now define the set E1, used to group the types in the Pascal fragment above. Hence:

E1 = {integer, real, boolean}

Let T be the set of all terminal symbols required in the expression that do not belong to any set Ei. In the example:

T = {var, *, /, mod, div, and, +, -, or}

Finally, a language L is built on the alphabet {ID, Ei, T, N, (, )} that provides the necessary relationship between variables and their type (a kind of symbol table); for brevity only the final result of that language is presented, which contains simply:

L = {var boolean k, var integer i, var integer j, var real x}

The cf.pg productions are:

1  <expression>(E1*L) ::= <term>(E1*L)
2  <expression>(E1*L) ::= <expression>(E1*L) <addop>(E1) <term>(E1*L)
3  <term>(E1*L) ::= <factor>(E1*L)
4  <term>(E1*L) ::= <term>(E1*L) <multop>(E1) <factor>(E1*L)
5  <factor>(E1*L) ::= <varof>(E1*L)
6  <factor>(E1*L) ::= ( <expression>(E1*L) )
7  <varof>(E1*LV1 var E1 ID, LV2) ::= ID
8  <multop>(real) ::= *
9  <multop>(real) ::= /
10 <multop>(integer) ::= mod
11 <multop>(integer) ::= div
12 <multop>(integer) ::= *
13 <multop>(boolean) ::= and
14 <addop>(real) ::= +
15 <addop>(real) ::= -
16 <addop>(integer) ::= +
17 <addop>(integer) ::= -
18 <addop>(boolean) ::= or

The symbol '*' in the right part of productions (8) and (12) is the multiply operator and should not be confused with the asterisk that appears as a separator in some parametric parts, as in (1) and (2). In rule (6) the most external parentheses are terminals and not delimiters of the parametric part.

Example of cf.pg semantics

To show how the semantic specification works, a description of simple assignment statements is introduced. Build the set of allowed identifiers:

ID = {id(i) | 1 <= i <= n}, where n is an integer >= 1

Then introduce the set of terminals:

X = {integer, real}

Two languages should be built: DC, to describe the declaration statements, and ST, to describe the assignment statements:

DC = {ID : X;}*
ST = {ID := ID;}*

Finally, introduce the set of values V:

V = {0, 0.0, 1, 1.1}

Let P be the set of the following productions:

1)  <S> ::= <execute>(PRG)
2)  <execute>(begin DC ST end) ::= <executestat>(ST, <initial>(DC))
3)  <initial>(ID : integer; DC) ::= ID:0; <initial>(DC)
3') <initial>(ID : integer; DC) ::= ID:1; <initial>(DC)
4)  <initial>(ID : real; DC) ::= ID:0.0; <initial>(DC)
4') <initial>(ID : real; DC) ::= ID:1.1; <initial>(DC)
5)  <initial>() ::= epsilon
6)  <executestat>(ID1 := ID2; ST, DC) ::= <executestat>(ST, <executeassign>(ID1 := ID2 * DC))
7)  <executeassign>(ID1 := ID2 * DC) ::= <assign>(ID1 <value>(ID2 * DC) * DC)
8)  <value>(ID * DC1 ID : V; DC2) ::= V
9)  <assign>(ID V * DC1 ID : V1; DC2) ::= DC1 ID : V; DC2
10) <executestat>(, DC) ::= DC

PRG represents the program whose operational semantics is described by the productions shown above. For example, such a program could be:

begin
id1 : integer; id2 : integer; id3 : real; id4 : integer;
id1 := id2; id3 := id3; id4 := id2;
end

in which a dummy symbol separates the declaration part from the statements.

TEST GENERATION

Usually, quality control consists of:

• Test plan definition, in which the functional areas of the product to be validated are defined and the costs and quantity of tests are planned.
• Checklist building, in which, starting from the functional specifications, the functionalities of the product to be validated are pointed out.
• Test specifications definition, in which functionalities are grouped together to build tests that will exercise them. During this phase tests with similar characteristics are grouped together in test chains.
• Test building, in which the test specifications are translated into data and procedures, to be provided as input to the software product.

Using an automatic test generator (starting from formal specifications) to automate the quality control phase, define, according to the cf.pg approach:

• Functional areas, which are particular 'sections' of the syntax. For instance, the productions that describe the input/output statements, or the expressions, etc., can be considered a functional area. The functional areas definition is a manual operation.
• Checklist. A single production rule of the syntax can be considered a functional item, so the whole syntax can be considered the checklist.
• Test specification, which is the sequence of production rules used by the automatic generator to produce a particular test. This information can be automatically collected by the tool during the generation phase.
• Test program, which is automatically built by derivation of the input syntax.
• Expected results, which are produced by the interpreter according to the described semantics.
• Self-checking test, which is automatically built by a test equipper that introduces expected results into the test program in an appropriate way.

Generation algorithm

The generator produces programs that are correct with respect to the syntactical and contextual description, through a left-most derivation of the input cf.pg. As the purpose of testing is the complete check of the functional items of a specification, the generator has the goal of producing a set of tests able to exercise, at least once, each production rule of the cf.pg. To reach this goal, a specific program examines the cf.pg and associates a 'weight' with each production rule, meaning the minimum number of times the rule has to be applied for a complete coverage, or to reach the coverage requested by the user. The program can thus also determine the minimum number of generations required for the complete coverage [22]. The derivation process is therefore monitored by the production weights and the required number of tests.

Now for a brief schematic description of the generation. Using a derivation stack, the algorithm pushes onto it the start symbol of the grammar and executes the following steps:

• If a nonterminal symbol is on top of the stack, then a production is selected from those that have that nonterminal symbol on their left side (the selection criteria are explained below) and the top of the stack is replaced with the right part of the chosen production; the left-most element of the right part of the production becomes the new top of the stack (in this way a left-most derivation is performed).
• If the top of the stack is not a nonterminal symbol, then the symbol is written to the output file of the generator and one element is popped from the stack.

The process terminates when the stack becomes empty. At this point, if the number of requested generations is greater than one, the start symbol of the grammar is again pushed onto the stack and the above steps are repeated.

In addition to the tests, at the end of each run the generator gives information about the productions that have been used and the coverage of the productions with respect to the initial request (weights).
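The derivation loop and the weight-ratio selection rule can be sketched as follows. This is a minimal illustration over a plain context-free grammar without parametric parts; the grammar and weights are hypothetical (the real generator operates on cf.pg), but the stack discipline, the applied/weight ratio, and the shortest-derivation fallback for termination follow the description in the text.

```python
import random

# Grammar: nonterminal -> list of alternatives (each a list of symbols).
# Nonterminals are the dictionary keys; everything else is a terminal.
GRAMMAR = {
    "S": [["put", "N", "S"], ["var", "=", "top", "S"], []],
    "N": [["1"], ["2"]],
}
# Minimum number of applications wanted per production (the 'weights').
weights = {("S", 0): 2, ("S", 1): 2, ("S", 2): 1, ("N", 0): 1, ("N", 1): 1}

def generate(rng: random.Random) -> list:
    used = {key: 0 for key in weights}
    out, stack = [], ["S"]
    while stack:
        top = stack.pop()
        if top not in GRAMMAR:        # terminal: write it to the output
            out.append(top)
            continue
        alts = list(range(len(GRAMMAR[top])))
        # Choose the production with the lowest applied/weight ratio;
        # ties are broken at random, as in the selection criterion.
        ratio = lambda i: used[(top, i)] / weights[(top, i)]
        best = min(ratio(i) for i in alts)
        i = rng.choice([a for a in alts if ratio(a) == best])
        # Once every ratio is >= 1, prefer the shortest alternative so
        # that the derivation terminates.
        if best >= 1:
            i = min(alts, key=lambda a: len(GRAMMAR[top][a]))
        used[(top, i)] += 1
        stack.extend(reversed(GRAMMAR[top][i]))   # left-most derivation
    return out

print(" ".join(generate(random.Random(0))))
```

The `used` table at the end of a run is exactly the coverage report mentioned above: it tells how many times each production was applied, relative to its weight.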


The criterion for the selection among productions is as follows. When the top of the stack is a nonterminal symbol, the algorithm looks for all the grammar productions with that nonterminal symbol in the left part. If only one production satisfies this requirement, then a check is made on the congruence of its left nonterminal parametric part (the 'formal parametric part') with that of the nonterminal on top of the stack (the 'actual parametric part'): if the check is satisfactory then the production is applied (otherwise an error in the grammar is signalled). If more productions satisfy the above-mentioned requirement, then the selection among them is as follows:

• Among all the productions with an acceptable parametric part, the ratio between the number of times that the production has already been selected and its weight is evaluated, choosing the production with the lowest ratio.
• If there are more productions with the same ratio value, then the selection is random.
• If all eligible productions have a ratio value greater than or equal to one (this means that they have been applied at least a number of times equal to their weight), then the production generating a terminal string through the shortest derivation is selected: this guarantees termination of the derivation when the grammar has been written correctly.

This last type of selection is done by a procedure obtained by adapting an algorithm [23] to the cf.pg. The weights of production rules and the number of tests to be generated can also be inserted manually, allowing:

• specific tests on sharply selected parts of the program
• different coverage levels for reducing or increasing the exercising effort

As an example, some productions that describe the syntactical and contextual aspects of some operations on a stack abstract type are shown:

1)  <start>() ::= <op_on_stack>()
2)  <op_on_stack>(LV) ::= <put>(LV)
3)  <op_on_stack>(N LV) ::= <top>(N LV)
4)  <op_on_stack>(N LV) ::= <pop>(N LV)
5)  <op_on_stack>(LV) ::= <empty>(LV)
6)  <op_on_stack>(LV) ::= epsilon
7)  <put>(LV) ::= put N <op_on_stack>(N LV)
8)  <top>(N LV) ::= var = top <op_on_stack>(N LV)
9)  <pop>(N LV) ::= var = pop <op_on_stack>(LV)
10) <empty>(LV) ::= var = empty <op_on_stack>(LV)

In accordance with the formal definition of cf.pg, this gives the following sets:

ID = empty set
Ei = empty set
T  = {var, top, put, pop, empty, =}, the set of terminals
N  = natural numbers

From the above grammar, and applying the generation algorithm, it is possible to produce tests such as:


PROG1.01

put 9
var = top
put 3
var = pop
var = top
var = empty
put 8
var = pop
var = empty
var = top
put 9

which can be considered, for instance, as a test for the validation of a compiler that accepts such a language. Associated with the generated test source, the automatic test generator produces a trace of the applied rules (see Figure 5) and some information about the frequency of use of the syntax productions (see Figure 6). This information is useful for the management and tuning of the quality control phase: it consists of statistical data on the functional coverage assured by the produced tests. In this regard, a good topological coverage of the production set provides a good functional coverage of the object under test.

[Figure 5. Trace of applied rules]

[Figure 6. Frequency of use of syntax productions]

To present the derivation process in more detail, Figure 7 shows the derivation tree of the first two instructions of the test above. More details about the derivation process are given elsewhere [20].

[Figure 7. Derivation tree of the first two instructions of the test]

Interpretation algorithm

The interpreter has as input data:

• the tests to be interpreted
• the operational semantics, given by cf.pg, of the language in which the tests are written

The interpreter output consists of a sequence of couples, 'variable, associated value', which form the final state vector expected from the execution, and some information about the semantic rules used, which can be useful in knowing the functional coverage of the semantic actions of the language.

At the beginning of the interpretation, the test program is copied into the parametric part of the semantic start symbol, which is put on top of the stack used for the interpretation phase. The interpretation process can be seen as a normal cf.pg derivation, the output string of which will be the state vector of the program. The interpretation algorithm can be explained as follows:

• If a nonterminal symbol is on top of the stack, then, among all the productions that expand that nonterminal, the first one (following the order of writing) whose parametric part is congruent with the parametric part of the nonterminal on top of the stack is chosen; thus a particular structure of the program is selected. At this point the element on top of the stack is replaced with the right part of the selected production, so that the left-most element becomes the new top of the stack. This action can be considered as the start of the actions that simulate the selected structure. If there are no applicable rules, the semantics is written wrongly, and an error message is given.
• If the 'C' symbol is on top of the stack (which in the semantics indicates the need for a mathematical computation), the computation function is activated to evaluate the expression contained in the parametric part of 'C'. The top of the stack is then replaced by a variable belonging to the set V, whose value is the result of the evaluated expression.
• Finally, if an element different from a nonterminal and 'C' is on top of the stack, the element is written to the output file of the interpreter and popped from the stack.
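The net effect of these semantic rules on the stack example can be illustrated with a direct simulation. The sketch below is a plain re-implementation of the stack behaviour (not the cf.pg rewriting machinery itself): it executes a generated test and emits the sequence of values assigned to var, i.e., the expected-results vector the interpreter is meant to produce.

```python
def interpret(tokens):
    """Simulate the stack abstract type: 'put N' pushes, 'var = pop' pops,
    'var = top' reads the top, 'var = empty' tests emptiness. Returns the
    sequence of values assigned to var (the expected results)."""
    stack = []
    results = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "put":
            stack.append(int(tokens[i + 1])); i += 2
        else:                       # tokens[i:i+3] is ['var', '=', op]
            op = tokens[i + 2]; i += 3
            if op == "pop":
                results.append(stack.pop())
            elif op == "top":
                results.append(stack[-1])
            else:                   # 'empty'
                results.append(len(stack) == 0)
    return results

program = "put 9 var = top put 3 var = pop var = top var = empty".split()
print(interpret(program))  # [9, 3, 9, False]
```

Pairing each value of this vector with a check against the corresponding var assignment is precisely what the test equipper does when it turns the plain test into a self-checking one.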


Figure 8. Derivation tree of the semantics related to the interpretation of the first two lines of the test

The process described above ends when the stack is empty. The criterion used to evaluate the congruence between the formal parametric parts and the actual ones is similar to the criterion used in the generation phase. As an example of the interpretation process, a fragment of the semantics, expressed by cf.pg productions, of the operations allowed on the stack abstract type is given:

1) <startsem>(LV)           ::= <eval>(LV *)
2) <eval>(put N LV1 * LV2)  ::= <eval>(LV1 * N LV2)
3) <eval>(top LV1 * N LV2)  ::= C(N) <eval>(LV1 * N LV2)
4) <eval>(pop LV1 * N LV2)  ::= C(N) <eval>(LV1 * LV2)
5) <eval>(empty LV1 * LV2)  ::= <empty>(LV2) <eval>(LV1 * LV2)
6) <eval>(var = LV1 * LV2)  ::= <eval>(LV1 * LV2)
7) <eval>(* LV2)            ::= epsilon
8) <empty>(L)               ::= "false"
9) <empty>()                ::= "true"

From the interpretation phase the following expected results, related to the generated program shown above, are obtained:

false
9
9
3
9
false
8
false
9

As for the syntax derivation, Figure 8 presents a derivation tree of the semantics related to the interpretation of the first two lines of the test used in the example. Such a derivation produces the first of the expected results above.

Starting from the outputs produced by the generator (test source) and by the interpreter (expected results), the test equipper can produce the final self-checking test, which has the following form:

start_of_test
put 9
var = empty
if (var = false) then PASS else FAIL
var = pop
if (var = 9) then PASS else FAIL
put 9
var = top
if (var = 9) then PASS else FAIL
put 3
var = pop
if (var = 3) then PASS else FAIL
var = top
if (var = 9) then PASS else FAIL
var = empty
if (var = false) then PASS else FAIL
put 8
var = pop
if (var = 8) then PASS else FAIL
var = empty
if (var = false) then PASS else FAIL
var = top
if (var = 9) then PASS else FAIL
put 9
end_of_test

Executing such a test validates the product, obtaining as output the result of the comparison between the real and the expected results. More details about the interpretation process are presented elsewhere²¹.

REAL EXPERIENCES

The described tool has been used in real activities to validate several software products: the approach can be applied to any product for which a formal description is possible. Even where the formalism is not used in full, a large set of products can be covered. In practice, every software product specified by SADT, automata, PDL, CFL, decision tables, graphs, or other equivalent formalisms can be represented using cf.pg. The following briefly examines the various experiences.

Programming languages

The tool is particularly appropriate for programming languages with well defined syntax and semantics. The generated tests are built using both phases (generation and interpretation) to obtain self-checking programs.

Cobol
The generated tests concern arithmetic statements (ADD, SUBTRACT, MULTIPLY, DIVIDE, COMPUTE, etc.) in the different syntaxes allowed by the language, using different types of variables and their combinations, including field length, presence or absence of a sign, and other field attributes. After each arithmetic operation, the variables whose values have been modified are compared with the expected results. The generated test items that cover these characteristics number about 900, grouped in 80 test programs. The syntax consists of about 200 productions and the semantics of about 60 rules. Another significant experience in the Cobol environment concerns the validation of part of the concurrent input/output; about 100 syntactic and 50 semantic productions have been written.
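The combinational sweep over operand attributes described for the Cobol tests can be illustrated with a toy enumeration. The verbs are from the text, while the field lengths and sign values are invented placeholders; this is not the real cf.pg grammar.

```python
from itertools import product

# Invented attribute values illustrating combinational coverage of the
# Cobol arithmetic tests (verb x field length x sign).
VERBS   = ['ADD', 'SUBTRACT', 'MULTIPLY', 'DIVIDE']
LENGTHS = [1, 5, 18]        # digits in the (hypothetical) PIC clause
SIGNS   = ['S', '']         # signed / unsigned field

items = list(product(VERBS, LENGTHS, SIGNS))
print(len(items))           # 4 * 3 * 2 = 24 combinations
```

Each combination would correspond to one generated test item (statement plus comparison with the expected result).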

Pascal
Pascal syntax and semantics were defined first, as they were used during the development phase of the described tool to evaluate, step by step, the state of the product.

information and software technology

Both syntax and semantics consist of about 150 production rules. Pascal tests have not been used industrially, because the test suite for this language was already complete; nevertheless, the syntax can easily be used to generate new tests quickly.

Operating systems
In contrast to language validation, here the environment in which the tests are executed is important, so attention to the productions that contribute to the environment definition is necessary.

MOS
Some tests were generated for some commands (FORMAT60, CMPK, MKKEYED, COPY, etc.) of MOS, which is an operating system of the Olivetti L1 family. The syntax consists of about 250 productions and the semantics of about 200. About 800 tests were generated.

Unix
At present, the automatic generation of tests for Unix commands is under development for the Olivetti LSX/L2 family. The emphasis is particularly on the negative aspects, to validate system robustness.

Procedural interface products
The most important experience in this area concerns communication protocols, mainly the validation of a file transfer product (UFT). All tests in this environment have been built completely automatically; UFT consists of a set of functions with a procedural interface in Pascal, so the generated tests are Pascal programs. The cf.pg description of UFT is composed of 60 productions, generating about 165 tests. Other activities in this area deal with C-ISAM and transactional environments.
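A generated self-checking test for a procedural interface boils down to calling each function and comparing the outcome with the interpreter's expected result. The sketch below conveys the shape in Python (the real UFT tests are Pascal programs; the function names, status codes, and expected values are invented):

```python
# Shape of a generated self-checking test for a procedural interface.
class FakeTransfer:
    """Stand-in for the file-transfer procedures under test (invented)."""
    def open_session(self):
        return 0                              # 0 = OK status code
    def send(self, name):
        return len(name)                      # bytes accepted (invented)

def check(label, actual, expected, log):
    """Compare a real result with the interpreter's expected result."""
    log.append(f"{label}: {'PASS' if actual == expected else 'FAIL'}")

log = []
ft = FakeTransfer()
check('open_session', ft.open_session(), 0, log)   # expected result: 0
check('send', ft.send('file1'), 5, log)            # expected result: 5
print(log)   # -> ['open_session: PASS', 'send: PASS']
```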

Applications
The cf.pg formalism is useful for describing menu-structured products, and the coverage of the grammar is easily guaranteed. The main experience in this environment is the automatic validation of OLICALC (a particular evolution of VISICALC, running under MS-DOS on Olivetti PCs). To perform the OLICALC validation, it is necessary to test versions for different customizations: in fact the user interface is based on the native language (i.e., one version for France, one for the UK, etc.). Using the described tool, tests for each version can be obtained simply by modifying the terminals of the grammar. The grammar consists of about 250 productions, which generate about 800 tests.
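Deriving one test suite per customization then amounts to instantiating the same abstract grammar with a different terminal vocabulary. A minimal sketch, with invented command names standing in for the real OLICALC keywords:

```python
# Per-customization instantiation: one abstract grammar, one terminal
# table per native-language version (command names are invented).
TERMINALS = {
    'uk': {'CMD_SAVE': '/Save',   'CMD_QUIT': '/Quit'},
    'fr': {'CMD_SAVE': '/Sauver', 'CMD_QUIT': '/Quitter'},
}

TEST_TEMPLATE = ['CMD_SAVE', 'file1', 'CMD_QUIT']   # abstract test item

def instantiate(template, version):
    """Map abstract terminals to the vocabulary of one customization."""
    table = TERMINALS[version]
    return [table.get(sym, sym) for sym in template]

print(instantiate(TEST_TEMPLATE, 'fr'))   # -> ['/Sauver', 'file1', '/Quitter']
```

The same abstract test items thus serve every localized version unchanged.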


OPEN PROBLEMS AND FUTURE DEVELOPMENTS

The approach has intrinsic limitations; in fact a generated test program:
• could never terminate
• could contain some control structures driven by Boolean expressions, which can produce unexecutable paths, thus making the tests on the functionalities present in such paths useless

The solution to the above problems requires some ad hoc actions:
• Nonterminating test programs can, on user request, be modified by the described tool so that a counter computes the number of iterations already executed; a too high (parametric) value of this counter is signalled by the interpreter, which stops and gives a diagnostic message. At that point, possible actions to modify the test are left to the user.
• Moreover, the interpreter gives the real coverage of the paths derived from selection structures; in this case too, the user can modify the particular Boolean expression, to allow the execution of uncovered paths and the related execution of the functionalities present in such paths.

Nevertheless, product effectiveness is not really limited by these problems, which affect, in every respect, only a small part of the whole product to be validated.

Another problem is the generation of 'negative tests', i.e., tests checking the correct behaviour of the product with respect to its misuse. The difficulty in the formal semantic definition comes from the presence of a large number of wrong cases (which are not easy to formalize). That leads to a systematic set of automatic perturbations in the use of the syntax production rules, to provide wrong cases without computing the expected results.

A final possible development is the interaction of the described product with 'quality control workstations'²⁴,²⁵.
Such workstations, whose purpose is semiautomatic test production and test documentation handling, could extract the information for the formal description straight from the functional specification documents, thus binding the test definition activity automatically both to the previous one, related to the product definition, and to the next one, related to test generation and execution. An aspect to underline, even if it is a long-term idea, is prototyping: the existence of an interpreter able to simulate the software product behaviour starting from the product input data could make it possible to use the formal semantic description in place of the product to be implemented. In fact, at that point, the interpreter can execute the same functions as the described product does.

ACKNOWLEDGMENTS

The work presented in this paper was born as degree theses²⁰,²¹ and is based on work done by I. Spadafora²⁶ for


the formal definition of the context-free parametric grammars. Thanks are due to Massimo Vercelli, of Olivetti S.p.A., for help during the development phase and for useful suggestions on the paper. Thanks also to Anna Cavigioli for the revisions. A particular acknowledgment goes to the Olivetti users who helped to collect statistical information.

REFERENCES
1 Dijkstra, E W and Hoare, C A R Structured programming Academic Press, London, UK (1972)
2 Warnier, J D Introduzione alla programmazione Etas Kompass (1974)
3 Jackson, M A Principles of program design Academic Press, London, UK (1975)
4 Maiocchi, M and Spoletini, E 'PHOS: una evoluzione nelle metodologie di programmazione' Quaderni di Informatica Vol II (1979)
5 Peterson, J L 'Petri nets' ACM Comput. Surv. Vol 9 No 3 (1977)
6 Myers, G W Reliable software through composite design Petrocelli (1975)
7 Ross, D T 'Structured analysis: a language for communicating ideas' IEEE Trans. Soft. Eng. Vol 3 No 1 (January 1977)
8 DeMarco, T Structured analysis and system specification Prentice Hall, Englewood Cliffs, NJ, USA (1979)
9 NATO AQAP-13 NATO software control system requirements (August 1987)
10 International Organisation for Standardization ISO 9000 ISO, Geneva, Switzerland (1987)
11 Myers, G W The art of software testing …
…
14 … ration' in Proc. Fourth Colloquium on Software Engineering Paris, France (1988)
15 Ehrig, H and Mahr, B Fundamentals of algebraic specification 1 Springer-Verlag, Berlin, FRG (1985)
16 Balzer, R 'Programming in the 1990's' in Convegno Linguaggi di specifica per la produzione di software CNUCE, Pisa, Italy (6-7 November 1984)
17 Cicu, A and Maiocchi, M 'Il testing dei prodotti software: il punto di vista del responsabile di produzione' in Convegno AICA 1982 Padova, Italy (October 1982)
18 Faccia, G and Pizzoferro, A 'Software testing: l'automazione del controllo qualità visivo' in Convegno La certificazione del software: principi, organizzazione, metodi, strumenti ed esperienze Milan, Italy (4-6 April 1984)
19 Meomartino, I 'L'automazione del controllo dei risultati per test generati automaticamente nella certificazione del software' PhD dissertation University of Milan, Italy (1986)
20 Camuffo, M 'Definizione e sviluppo di un parser generalizzato per la costruzione automatica di check list e per la generazione automatica di test' PhD dissertation University of Milan, Italy (1986)
21 Morselli, M 'Definizione e sviluppo di un interprete semantico generalizzato per sistemi di test e check list automatici' PhD dissertation University of Milan, Italy (1986)
22 Nodaro, C 'Dalle specifiche formali alla generazione automatica dei test per la certificazione funzionale completa di prodotti software' PhD dissertation University of Milan, Italy (1986)
23 Purdom, P 'A sentence generator for testing parsers' BIT Vol 12 (July 1972) pp 366-375
24 Maiocchi, M, Mazzetti, M, Oliva, M and Villa, M 'TEFAX: una test factory automatizzata per il controllo funzionale di progetti software' in Convegno La certificazione del software: principi, organizzazione, metodi, strumenti ed esperienze Milan, Italy (4-6 April 1984)
25 Biamonti, G, Faccia, G, Picciau, G and Valent, B 'ATHENA: l'automazione di un metodo. Un ambiente di tools integrati per il controllo di qualità di prodotti software' in Convegno La certificazione del software: principi, organizzazione, metodi, strumenti ed esperienze Milan, Italy (4-6 April 1984)
26 Spadafora, I and Bazzichi, F 'An automatic generator for testing compilers' IEEE Trans. Soft. Eng. Vol 8 No 7 (July 1982)
