The application of module regression testing at TRIUMF




Nuclear Instruments and Methods in Physics Research A293 (1990) 377-381 North-Holland


Pamela A. BROWN
TRIUMF, 4004 Wesbrook Mall, Vancouver, BC, Canada V6T 2A3

Daniel HOFFMAN
Department of Computer Science, University of Victoria, PO Box 1700, Victoria, BC, Canada V8W 2Y2

At TRIUMF, as in any accelerator environment, there are stringent constraints upon the reliability of the control-system software. The programs must run in a complex environment with heterogeneous hardware and software. Production testing is limited to infrequent maintenance times. Therefore, regression testing of modules is essential in order to meet the reliability requirements on the system. Research at the University of Victoria has developed a practical approach to module regression testing aimed at reducing the cost of test development, execution and maintenance. This paper describes a project which applies this approach to developments at TRIUMF.

1. Introduction

The fundamental goal of our research is to improve system quality and reduce maintenance costs through systematic module regression testing. While considerable attention is given to testing during software development, this is not the only time testing is required. As Brooks [1] points out:

"As a consequence of the introduction of new bugs, program maintenance requires far more system testing per statement written than any other programming. Theoretically, after each fix one must run the entire bank of test cases previously run against the system, to ensure that it has not been damaged in an obscure way. In practice such regression testing must indeed approximate this theoretical ideal, and it is very costly."

Successful regression testing depends on the ability to maintain and execute economically a large set of test cases throughout the life span of a software system. While system testing is usually emphasized, module testing is also important. It is difficult to test thoroughly a module in its production environment, just as it is difficult to test effectively a chip on its production board. IEEE testing standards [2] emphasize the benefits of testing software components, not just complete systems.

2. Terminology and principles of testing

2.1. Terminology

Scaffolding [1] consisting of drivers and stubs has long been used in modular software development. Consider a module M. A driver is a program written to call routines provided by M. A stub is a program that serves as a substitute for a routine called by M. Both batch and interactive drivers are useful. Batch drivers are effective for regression testing, where large numbers of test cases must be run repeatedly, but they are awkward for debugging, where the behavior of one test case determines what other test cases are interesting. Conversely, interactive drivers are good for debugging, but poor for regression testing.
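To make the terminology concrete, here is a minimal sketch, in C, of a batch-style driver and a stub for a hypothetical module M. All of the names (m_convert, hw_read) and the test data are invented for this example; the paper's own scaffolding is described in section 5.

```c
#include <stdio.h>

/* --- Module under test, M (a toy stand-in for illustration) --------- */
int hw_read(int channel);           /* routine M expects from elsewhere  */

int m_convert(int channel)          /* routine M provides to its callers */
{
    return hw_read(channel) * 2;
}

/* --- Test scaffolding ------------------------------------------------ */
/* Stub: replaces the real hw_read so M can be exercised in isolation. */
int hw_read(int channel)
{
    return 100 + channel;           /* canned value, no real hardware */
}

/* Batch driver: calls M's routine for a table of cases and compares
   the returned values against expected results. */
int main(void)
{
    static const struct { int input; int expected; } cases[] = {
        { 0, 200 }, { 1, 202 }, { 7, 214 }
    };
    int failures = 0;

    for (int i = 0; i < (int)(sizeof cases / sizeof cases[0]); i++) {
        int actual = m_convert(cases[i].input);
        if (actual != cases[i].expected) {
            printf("case %d: expected %d, got %d\n",
                   i, cases[i].expected, actual);
            failures++;
        }
    }
    printf("%d failure(s)\n", failures);
    return failures;
}
```

The stub gives the tester full control over the values M receives, and the driver makes M's responses directly visible; a batch driver of this style can be rerun unattended after every change, which is what makes it suitable for regression testing.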

Code coverage measures are used to determine the percentage of statements, branches or paths in a program that are executed by a given set of test data. While even 100% coverage does not guarantee program correctness, coverage measures do provide useful information on test data adequacy. For software testing, we adapt two important concepts well known in hardware testing [3]. Controllability refers to the ease with which an arbitrary stimulus may be applied to a module or a submodule. Observability refers to the ease with which the required behavior of a module or submodule may be observed.

2.2. Principles

Our approach to testing is based on the following principles.





Systematic module testing for production systems. Often tests are developed during or after implementation and are discarded shortly after acceptance. It is important to plan testing early in development and to deliver and maintain tests along with the production code.

Focus component testing on module interfaces. Traditionally, component testing has focused on units: individual procedures or functions. There are significant benefits in focusing instead on modules: collections of related routines. In module testing, test cases may be expressed in terms of calls and return values rather than internal data-structure values. As a result, test cases are simpler and it is easier to automate test harness construction.

Isolate the module under test. It is typically difficult to test a module M thoroughly while it is installed in a production system. M's routines may not be directly accessible. If M is a general-purpose module, some of its access programs may not be called at all in a particular production system. Errors in other modules may appear to be errors in M; errors in M may be masked by errors in other modules. Using test scaffolding, M may be tested in complete isolation from the production environment. In short, when M is running in its production environment, controllability and observability are significantly reduced. In practice, modules are best tested with a mixture of scaffolding and production code; the critical tradeoff is between the benefits realized through isolation and the cost of developing and maintaining the scaffolding.

Apply automation cost-effectively. Rather than automating tasks where manual approaches are cost-effective or where it is unclear how automation can be applied, focus on clerical tasks performed repeatedly. These tasks are the least costly to automate, and savings on tasks that are repeated often are realized a large number of times.

3. Related work

It is useful to review the literature on software testing in terms of the required testing tasks, which are the following: (1) build the test harness, (2) generate the inputs, (3) generate the expected outputs, (4) execute the module under test on the generated inputs, and (5) compare the actual outputs to the expected outputs. In successful testing, the savings from detected errors must exceed the costs of generating, maintaining, executing and evaluating the tests. While this statement appears to be obvious, it is often ignored. A test oracle, or source of expected outputs, is normally assumed as given, and schemes are proposed that generate large quantities of test inputs. In practice, the generation of expected outputs would be done manually, at significant cost. Testing research has focused on generation of the inputs, covering the underlying theory of input selection criteria [4], and particular criteria such as path testing [5], data-flow testing [6] and functional testing [7].

A variety of tools is commercially available for instrumenting programs to measure code coverage [8,9]. There has been some work on tools for automated driver generation for modules [10-12], but their commercial application has been limited.

4. Environment description

The TRIUMF cyclotron accelerates negative hydrogen ions to energies which vary between 60 and 540 MeV. The proposed 30 GeV Kaon Factory will use TRIUMF as an injector. The basic architecture of the TRIUMF control system is described in ref. [13]. Frequent changes are in the nature of a laboratory environment: demands for change come from experiments, from improvements and enhancements to the machine, and from the necessity to update existing equipment.

4.1. Maintenance cycle

In an accelerator environment, the software tester is constrained in many ways. Equipment is rarely available for testing; even when the equipment is available, it is not feasible or possible to test the software under all conditions. Furthermore, the tester must ensure that the software responds reasonably when the equipment is outside its normal operating specifications, conditions which may be undesirable to achieve in reality. Since some accelerator software is intended to prevent equipment damage, such programs should be tested carefully before trying them on the real machine. The TRIUMF schedule includes a day for maintenance about once a week, as well as shutdown periods of approximately a month twice a year. In order to use these periods efficiently for software integration tests, it is important that software be tested as thoroughly as possible beforehand.

4.2. Device interface

In the selection of a project for a first application of module testing methods, there were several desirable characteristics. The project should be useful and its code should be modular. Its scope should be small enough to be done in a few months, but large enough both to illustrate the utility of the methodology and to expose any problems with it. A software interface to accelerator devices, through CAMAC, was available in the Data General systems; its translation for use in the VAX workstations was a good candidate for the application of modular regression testing techniques. It would be useful, it was divided into modules whose behavior was well understood, and the initial testing project could include as much or as little of the total device interface project as necessary for its purposes.




Fig. 1. General testing scheme.

5. Test experience

5.1. Choice of module

The trivial stub often used in module testing does not lend itself well to many of the problems encountered in the accelerator environment, where modules must be tested for their responses to different conditions of the real-world hardware. Specifications for typical device-interface software modules involve not only subroutine parameters and results, but also translated outputs to hardware or software modules closer to the real world, and inputs from the real world (see fig. 1).

CMCSH (CAMAC with short address form) was chosen as an example of such a module. It functions in the Data General system as the fundamental CAMAC subroutine. After translating it to a shell which calls the standard CAMAC subroutines provided in the VAX [14], many Data General Fortran programs can easily be ported to the VAX. These standard routines include CFSA (Perform Single CAMAC Action). CMCSH performs several functions depending on the CAMAC address specified. In the parallel system, a standard CAMAC address is sent to the appropriate hardware registers, and data is read back as specified by the F code. The VAX version need only translate the encoded address into the format required by CFSA, and return the data and status as required. CMCSH also performs sequences of CAMAC operations to provide access to serial crates connected to modules in the parallel system. In this case, the VAX version must perform the corresponding sequences of CFSA calls.
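As an illustration of the kind of translation just described, the following sketch shows a CMCSH-style shell in C. The packed short-address layout, the routine prototypes and the status handling are assumptions made for the example; the paper does not give CMCSH's internals, and the real VAX library binding of CFSA may differ. A stand-in CFSA is included only so that the fragment is self-contained, in the spirit of the stubs described below.

```c
#include <stdio.h>

/* Assumed C prototype for the standard "Perform Single CAMAC Action"
 * routine; the actual VAX library binding may differ. */
void CFSA(int f, int ext, int *data, int *q);

/* Sketch of a CMCSH-style shell for the parallel-system case: unpack a
 * hypothetical short-form CAMAC address into branch/crate/station/
 * subaddress fields, repack them in the form assumed for CFSA, and
 * perform the single CAMAC action. */
int cmcsh_shell(int short_addr, int f, int *data, int *status)
{
    int a = short_addr         & 0x0F;   /* subaddress A (bits 0-3, assumed)   */
    int n = (short_addr >> 4)  & 0x1F;   /* station N    (bits 4-8, assumed)   */
    int c = (short_addr >> 9)  & 0x07;   /* crate C      (bits 9-11, assumed)  */
    int b = (short_addr >> 12) & 0x07;   /* branch B     (bits 12-14, assumed) */
    int ext = (b << 16) | (c << 12) | (n << 4) | a;  /* assumed CFSA format    */
    int q;

    CFSA(f, ext, data, &q);              /* single CAMAC action                */
    *status = q;                         /* report the Q response as status    */
    return 0;
}

/* Stand-in CFSA stub so the sketch compiles without the CAMAC library;
 * during module testing the real CFSA is replaced by a stub in much the
 * same way (sections 5.2 and 5.3). */
void CFSA(int f, int ext, int *data, int *q)
{
    printf("CFSA: F=%d ext=0x%05X data=%d\n", f, ext, *data);
    *data = 0;
    *q = 1;
}

int main(void)
{
    int data = 7, status;
    cmcsh_shell(0x1234, 16, &data, &status);   /* F=16 is a CAMAC write */
    printf("status (Q) = %d\n", status);
    return 0;
}
```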

5.2. Interactive tester

At the same time as the first version of CMCSH was written, an interactive driver and an interactive stub for CFSA were written. The driver prompts for the CAMAC address, F code and data, and reports the data and status returned. The CFSA stub reports the CAMAC address passed on by CMCSH and prompts for data and status to return to CMCSH.

Fig. 2. CMCSH batch testing scheme.

5.3. Batch tester

The batch driver and stub are controlled by data in a test script file, as shown in fig. 2. The driver reads the CAMAC address, F code and data from the script file, calls CMCSH and reads expected results from the file to compare with the data and status returned. The CFSA stub compares the CAMAC address passed on by CMCSH with the expected value read from the script file, and reads the data and status to return.

The script file consists of lines of the following types:
- Par: CMCSH parameters - encoded BCNA, F, data;
- Exp: expected CFSA parameters - translated BCNA, F, data;
- Ret: CFSA results - data, Q, X, status;
- Res: expected CMCSH results - data, translated status, return number.

The significance of the script file is determined by the way it is processed by the batch driver. The program performs the following steps for each test case (the driver's side of the loop is sketched after the list):
- driver: read next Par line;
- driver: call CMCSH, passing Par values;
- CFSA stub: read next Exp line;
- CFSA stub: compare Exp values against actual parameters;



- CFSA stub: read next Ret line;
- CFSA stub: return to CMCSH;
- driver: read next Res line;

- driver: compare Res values against values returned by CMCSH;
- driver: report discrepancies.
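A sketch of the driver's side of this loop is given below, in C. The script field layout shown in the comment, the CMCSH prototype and the way the "return number" is modelled are assumptions; the CFSA stub, which shares the same script stream and consumes the Exp and Ret lines while CMCSH executes, is omitted here.

```c
#include <stdio.h>
#include <string.h>

/* Module under test (the Fortran CMCSH in the paper); this C prototype
 * and the meaning of the return value are assumptions for the sketch. */
extern int CMCSH(int short_addr, int f, int *data, int *status);

/* Hypothetical script excerpt for one test case (field layout assumed):
 *   Par 0x1234 16 7      encoded BCNA, F, data given to CMCSH
 *   Exp 0x11034 16 7     translated BCNA, F, data expected at CFSA
 *   Ret 0 1 1 0          data, Q, X, status for the stub to return
 *   Res 0 1 0            expected CMCSH data, translated status, return number
 */
int run_script(FILE *script)
{
    char tag[8];
    int failures = 0;

    while (fscanf(script, "%7s", tag) == 1) {
        if (strcmp(tag, "Par") != 0)
            break;                        /* Exp and Ret lines are consumed by the stub */

        /* driver: read next Par line and call CMCSH with its values */
        int addr, f, data, status;
        fscanf(script, "%i %d %d", &addr, &f, &data);
        int ret = CMCSH(addr, f, &data, &status);
        /* (while CMCSH runs, the CFSA stub reads the Exp and Ret lines) */

        /* driver: read the Res line and compare against what came back */
        int exp_data, exp_status, exp_ret;
        fscanf(script, "%*s %d %d %d", &exp_data, &exp_status, &exp_ret);
        if (data != exp_data || status != exp_status || ret != exp_ret) {
            printf("mismatch: data %d/%d, status %d/%d, return %d/%d\n",
                   data, exp_data, status, exp_status, ret, exp_ret);
            failures++;
        }
    }
    return failures;
}
```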



The test cases were chosen to include all special parameter values and to ensure that all of the program statements were executed. The script file included the following cases:
- CAMAC branch address ≥ 8 - to test standard serial crate software;
- CAMAC crate address = 0 - to test TRIUMF serial crate software;
- F code bit 3 set - for test cycle;
- |data| in (2²⁴ - 1, 2²⁴ - 1 + A) and F set to write - to test the data overflow check (where A is some small value; see the sketch below);
- |data| ≥ 2²⁴ - 1 + A and F set to not write - to make sure overflow is ignored when not writing.

After the module tests had been completed, the CFSA stub was replaced with the production routine for interactive integration tests of CMCSH.
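The write-overflow cases above straddle the largest value that fits in a 24-bit CAMAC data word. A minimal sketch of the kind of check they exercise is shown below; which function codes CMCSH treats as writes and the use of the absolute value are assumptions, since the paper does not show CMCSH's internal logic.

```c
#include <stdio.h>
#include <stdlib.h>

#define CAMAC_DATA_MAX ((1L << 24) - 1)   /* largest magnitude in a 24-bit CAMAC data word */

/* Hypothetical form of the check the write-overflow test cases target:
 * only write operations (here taken as CAMAC F codes 16-23) carry data
 * toward the hardware, so only they are checked against the 24-bit range. */
static int write_overflows(int f, long data)
{
    int is_write = (f >= 16 && f <= 23);
    return is_write && labs(data) > CAMAC_DATA_MAX;
}

int main(void)
{
    /* Values straddling the boundary, as in the test cases above. */
    printf("%d\n", write_overflows(16, CAMAC_DATA_MAX));      /* 0: in range, write        */
    printf("%d\n", write_overflows(16, CAMAC_DATA_MAX + 2));  /* 1: overflow, write        */
    printf("%d\n", write_overflows(0,  CAMAC_DATA_MAX + 2));  /* 0: read, overflow ignored */
    return 0;
}
```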

6. Results

The interactive tester took about half a day to write and debug. For the batch tester it was difficult to decide how much of the debugging time was spent in debugging the testing apparatus, as opposed to the program under test. The batch testing process, writing and debugging the batch tester and the batch script, cost about a week of work, to test a module which itself took a week to code.

The process of generating and running the testing apparatus was useful. While selecting the test cases, it was discovered that the program did not handle all necessary cases. The interactive tester was useful to ensure that the program ran to completion. The batch tests exposed several bugs in address translation and status interpretation; these might well have cost as much to find during integration testing as our scaffolding took to construct.

The VAX Performance and Coverage Analyzer (PCA) [9] was used to measure branch coverage. The preliminary run of the PCA revealed an untested section of error-handling code. After the modification of the script, 100% coverage was achieved.

7. Conclusions

7.1. The usefulness of module testing

Three main advantages can be seen.

(a) The approach enables the programmer to do a considerable amount of testing without access to the hardware. Thus, in a new system, testing can begin before the hardware is ready, and in a working system, program modifications can be tested when the hardware is not available. Because modules which interface to hardware constitute a substantial portion of accelerator data acquisition and control software, these methods have a wide application in the accelerator environment.

(b) After software modification, rerunning the complete set of batch tests is extremely easy. Thus, the programmer is encouraged to test thoroughly, and the test results are automatically recorded.

(c) We considered the cost of the original approach (manual integration testing only) versus the cost of the proposed approach (automated module testing followed by manual integration testing). These costs were estimated to be about equal. However, automated module testing has the additional benefits that it can be done without the production environment and that tests, once stored, can be rerun at no cost.

7.2. The proper role of coverage measures

Complete coverage is necessary for proper testing; software with portions of code which have never been run should not be released. Coverage is extremely difficult to achieve when software must respond to conditions in the real world. It is time-consuming and difficult to change equipment conditions, but, by using scaffolding, the stub can be controlled to test programs completely.

7.3. Limitations

The most obvious limitation of the method is the large initial cost of testing schemes. Proper testing requires a firm commitment from the management if it is to be consistently carried out in the face of software deadlines. Another limitation is that module testing does not eliminate the need for integration testing; modules must also be tested with real equipment in a real system.

References

[1] F.P. Brooks, The Mythical Man-Month (Addison-Wesley, 1975).
[2] IEEE Standard for Software Unit Testing (Soft. Eng. Tech. Comm. of the IEEE Computer Society, May 1987).
[3] E.J. McCluskey, Logic Design Principles (Prentice-Hall, 1986).
[4] J.B. Goodenough and S.L. Gerhart, IEEE Trans. Soft. Eng. SE-1 (1975) 156.
[5] W.E. Howden, IEEE Trans. Soft. Eng. SE-2 (1976) 208.
[6] S. Rapps and E. Weyuker, IEEE Trans. Soft. Eng. SE-11 (1985) 367.
[7] W.E. Howden, IEEE Trans. Soft. Eng. SE-6 (1980) 162.
[8] S-TCAT/C - System Test Coverage Analysis Tool for C (Software Research, Inc., San Francisco, CA, 1989).
[9] Guide to VAX Performance and Coverage Analyzer (Digital Equipment Corporation, Maynard, MA, 1987).

[10] D.J. Panzl, Proc. AFIPS Nat. Comp. Conf. (AFIPS, 1978) 609.
[11] A. Jagota and V. Rao, Proc. Pacific Northwest Software Quality Conf. (IEEE Computer Society, 1986) 147.
[12] D.M. Hoffman, Proc. Conf. on Software Maintenance (IEEE Computer Society, 1989).


[13] D.A. Dohan and D.P. Gurd, Europhys. Conf. on Computing in Accelerator Design and Operations, eds. W. Busse and R. Zelazny (Springer, Berlin, 1983) 332.
[14] IEEE Standard Subroutines for CAMAC, ESONE SR/01 (1981).
