Reliability Engineering and System Safety 43 (1994) 87-94
Quality assessment of software prototypes

Robert Lindermeier
Institute for Quality and Safety in Electronics, TÜV Bayern Sachsen e.V., Westendstr. 199, PO Box 210420, D-80674 München, Germany.
(Received 9 March 1993; accepted 9 June 1993)
This paper introduces a concept for the assessment of software prototypes. On the basis of the international standard ISO/IEC CD 12119 it shows that there are differences as well as similarities between traditional methods of software testing and the assessment of software prototypes. Prototype assessments consist mainly of user-driven and ergonomic (usability-related) assessments. The degree to which user requirements are realised is also an important assessment item. To perform correctness assessments a proper installation is essential; without this, errors cannot be assigned properly. Furthermore, it is vital to have reliable documentation available for correctness assessments: verbal information is insufficient, especially where no or only poor general documentation is available. The assessment should be outlined on a black-box basis, because the final implementation of the program will normally use a different programming language. To maintain the advantage of short reaction times during prototype development it is also important to avoid overly long assessment intervals.
INTRODUCTION
In response to user demands in recent years, the computer industry has had to integrate more and more quality aspects into its development strategies. An increase in product quality can only be achieved by paying attention to quality requirements and ensuring adherence to them. A method for doing this is supported by the international standard ISO/IEC CD 12119 [Ref. 1], which is a derivative of the German standard DIN 66285 [Ref. 2]. This standard defines the quality of software according to its product characteristics. A testing procedure for software products is intended to ensure adherence to the quality requirements demanded by this standard. This paper discusses how far this also applies to the assessment of explorative software prototypes.

Software that is developed according to the so-called explorative prototyping method is intended to give the customer a 'look and feel' impression of the program at an early stage. If the customer wants any modifications, they can easily be integrated and the customer is asked again to examine the program. If no further alterations are needed, the prototype can be used as a model for the final program, which will be developed using traditional methods and programming languages.

So why is it necessary, when using this explorative prototyping method, to develop the 'same' program a second time? There are two reasons. First, to ensure a quick reaction to customer wishes for program alterations, which could not be achieved using traditional methods. Secondly, these prototypes are normally developed using programming languages that ensure quick visual results but cannot always guarantee an efficient and reliable performance of the final program. Efficiency and reliability, however, are two elementary quality requirements, and therefore a second implementation using traditional methods is essential at the present state of technology.

Figure 1 shows an example of the development of explorative software prototypes; particular consideration is given to the test aspect. In Fig. 1 grey boxes represent the prototyping process. Starting with initial ideas, or needs that turned up during maintenance, new requirements for the further development process are collected. When the collection process has come to an end, the requirements are analysed to guarantee that only sound and reasonable demands will be used during the subsequent development steps. On the basis of these requirements the prototype and a supplementary system description are realized simultaneously. Both the prototype and this additional system description, which covers all items that are not contained in the prototype itself, form the system specification.
Reliability Engineering and System Safety 0951-8320/94/$06.00 © 1994 Elsevier Science Publishers Ltd, England.
[Figure 1: flow diagram. Grey boxes show the prototyping process (requirement collection and analysis, realization of the prototype and the system description, prototype test); white boxes show the traditional phases System Design, System Realization, Delivery and Installation, System Test and Use of System.]
Fig. 1. Development and assessment of explorative software prototypes.
Before the software engineering process continues, the system specification is checked repeatedly against the user requirements. This prototype test loop assesses the consistency of the system specification with the user requirements; after the end of this process the system specification can therefore be used as a reliable basis for further development. From now on the system developers can keep to the phases of traditional software engineering, such as system design, system realization and system installation. This part of the development is characterized by the white boxes in Fig. 1. In this part of the development the second test unit can be executed. It comprises the verification of each development step as well as the verification of the fully installed system in its final working environment against the system specification (Fig. 1, dashed arrows). Further information about prototyping is provided in [Refs 3,4].
2 TEST BASICS

2.1 Quality requirements for software tests

In this section we lay down some basics which apply to all kinds of software tests and which are also applicable to software prototypes. In principle, a test plan based on quality requirements has to be developed. This test plan serves as a test directive and has to describe the quality requirements, the objectives and the objects of the test to be performed. A test without such pre-defined quality requirements is rather dubious: without precise testing directives, results can easily be subject to the personal judgement of the tester. It is clear that without quality requirements a program cannot be tested. This also applies to vague quality requirements like 'quick response time', 'good portability', 'great flexibility', etc.

Another basic feature is reproducibility of the test. Testing procedures have to be transparent; that is, testing methods and testing organization must be recorded and executed in such a way that results are reproducible at any time. Test methods and organization should be largely independent of personal judgement. Finally, it must be borne in mind that one of the most important tasks of software testing is to detect errors, although software testing is not intended to prove the absence of errors. In consequence, software testing itself is in a way a destructive process because of its inherent criticism. That is why testing should never be entrusted to the program developers only: they might always have a tendency to prove the correctness of their program. More test rules are supplied, for example, in [Refs 5,6].
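As a hypothetical illustration (the function and the 200 ms threshold are our own assumptions, not taken from the standard), a vague requirement such as 'quick response time' becomes testable only once it is quantified, yielding a reproducible pass/fail verdict instead of personal judgement:

```python
import time

# Illustrative sketch: the vague requirement "quick response time" is
# quantified as "responds within 200 ms" so that it becomes testable.
RESPONSE_TIME_LIMIT_S = 0.200  # quantified quality requirement (assumed value)

def respond(query):
    """Stand-in for the program function under test."""
    return query.upper()

def test_response_time(query="status"):
    start = time.perf_counter()
    respond(query)
    elapsed = time.perf_counter() - start
    # The verdict is reproducible: anyone re-running the test gets
    # a pass/fail result against the same recorded threshold.
    return elapsed <= RESPONSE_TIME_LIMIT_S

print(test_response_time())  # → True for this trivial stand-in
```

The same pattern applies to 'good portability' or 'great flexibility': each must be reduced to a recorded, checkable criterion before it can appear in a test plan.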
In addition to these rules, the standard ISO/IEC CD 12119 supplies further information about software testing. It defines quality requirements for software products, and especially for programs, and it also suggests testing methods for these quality requirements. Both the quality requirements and the testing methods are useful tools in the context of quality assessment of software prototypes, even if they cannot be applied in full. Therefore a brief summary of the standard's elements relevant to prototype assessment is given. With respect to this ISO standard, a software
product consists basically of three elements:

• the product description;
• the user documentation of the program;
• the actual program (software and data).

These three elements are subject to a test according to this standard. In the following a short summary of the standard is given.

The product description consists of information about the software product and contains important characteristics concerning installation and use of the product. Unfortunately, software prototypes seldom have any product description; therefore we will go no further into that topic. (For further information about the product description see Ref. 1.) More important for the assessment of software prototypes are the user documentation and the prototype itself (software and data).

The ISO standard demands complete and correct user documentation for software products. All necessary user information for installation, everyday use and maintenance should be provided clearly and free from contradiction or ambiguity. Other quality requirements for the documentation are understandability for the specified users and ease of overview.

If executed according to the instructions, the installation of the software shall be successful. All documented features shall be executable in the presented form and according to their description. All programs and data shall comply with the documentation, be consistent with all other parts and execute in the correct manner. The combination of software and hardware shall be reliable and robust: exploitation of capacity up to its limits or incorrect input should never result in system crashes or in loss or corruption of data. Usability of the software product also entails relevant and understandable system messages. In particular, error messages should provide detailed information concerning cause and error correction.
Furthermore, it should always be possible for the user to find out which function is being executed. The program shall provide information for the user which is easy to comprehend. The execution of functions which have serious consequences shall be reversible, or the program shall give clear warning regarding the consequences and request confirmation before executing the command, especially concerning erasure and overwriting of data.

For the assessment itself a test plan is required. The test history shall be documented in a test record, of which the test plan is a part. The test record shall be written in such a way that the test can always be repeated by using it; description and results of the test shall be part of the test record. Testing of a software product requires that the complete product and the required documentation as specified in the
product description are available, as well as the necessary system configuration. If training is mentioned in the product description, the tester should have access to the training material. All parts of the software product shall be tested according to the criteria mentioned above. In addition to the testing of programs and data, successful installation shall also be tested. To what extent these rules can be applied to software prototype assessments is considered in section 3. Another essential aspect of software tests, which is not fully described in the standard, is the organizational framework for software tests. This is discussed in the following.
2.2 Organizational framework for software tests

For an efficient execution of software tests and for the assessment of software prototypes it is necessary to use an organizational framework. The organizational framework presented in the following, and described in detail in Ref. 7, is black-box oriented. It was developed during many practical applications and proved its coherence with DIN 66285 as well as its capability to cope with everyday problems, such as:

• project organization;
• efficient management of testing and test results;
• consistent quality of testing;
• achieving homogeneous results on the one hand and a maximum of flexibility for each tester on the other;
• coordination and harmonization of parallel test tasks;
• disproportionate increase in manpower expense in relation to increased test accuracy;
• the controversy between the demand to reduce manpower expense and the requirement of reproducibility of the tests;
• the lack of necessary and helpful tools.
The structure of the organizational framework is oriented towards four aspects: (1) temporal order; (2) functional order; (3) hierarchical order; and (4) organizational embedding. Figure 2 demonstrates the three orders of testing and the organizational embedding. According to the temporal order, the test is subdivided into pre-test, main-test and follow-up tests. The pre-test classifies a product as testable or not; if a product is found to be non-testable, for example because it contains too many errors, there will be no main-test. The main-test consists of the actual program testing. The follow-up tests are intended to survey the program's error corrections and future enhancements. Pre-tests and follow-up tests can be conducted periodically, whereas the main-test is normally executed only once for each product.
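The temporal order can be sketched as a simple gate: the pre-test decides whether a main-test is run at all, and follow-up tests survey subsequent corrections. This is a minimal illustration; the phase names follow the text, while the threshold and the function names are our own assumptions:

```python
# Sketch of the temporal order of testing (pre-test, main-test, follow-up).
# The error threshold and structure are assumptions for illustration only.

def pre_test(error_count, max_tolerable=10):
    """Classify the product as testable or not."""
    return error_count <= max_tolerable

def run_temporal_order(error_count, main_test, follow_up):
    if not pre_test(error_count):
        # Too many errors: the product is not testable, no main-test follows.
        return "not testable: no main-test"
    results = [main_test()]          # the main-test is executed only once
    results.append(follow_up())      # follow-up tests may repeat periodically
    return results

verdict = run_temporal_order(
    error_count=42,
    main_test=lambda: "main-test passed",
    follow_up=lambda: "corrections surveyed",
)
print(verdict)  # → not testable: no main-test
```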
Parallel testing requires dividing the program up into separate groups; each program group can then be tested independently. Program functions which require the support of several groups in order to work properly are called complex functions. These have to be tested taking into consideration the test results of the groups involved. Finally, a test run completes the program checks. The order of such testing dependencies is called the functional order of testing.

For clarity and completeness the test record is structured by a hierarchical order. On the top level is the test record itself, which describes the progress of the test, expected results and final results. The second level contains checklists, which contain questions concerning program characteristics. In addition, forms are used for more complicated questions. The results of these forms are reported in the checklists, and their final results are transferred to the test record.

The organizational embedding of software testing is oriented towards project management strategies. The project manager is responsible for the concept, coordination, management of work tasks, and control of project information and testing quality. Practical experience showed that it is desirable for the manager to work with three to five testers. Regular organizational and work meetings are to be held on topics such as control of work results, exchange of information between the testers and discussion of particular problems. Furthermore, the project manager makes minor adjustments to the organizational framework if necessary. More information about special aspects of project management which go beyond the first approach given in Ref. 7 is available in Ref. 8.

[Figure 2: three-dimensional order of software tests, combining the temporal order (pre-test, main-test, follow-up test; number and kind of errors determine the succession of phases), the functional order (grouping of program functions) and the hierarchical order (test record, checklists, forms).]
Fig. 2. Three-dimensional order of software tests.

3 ASSESSMENT STRATEGY FOR SOFTWARE PROTOTYPES
As pointed out in section 2, each test depends on the availability of quality requirements and product information. Unfortunately, most software prototypes do not provide much information and rarely come with a product description or elaborated user documentation. On the other hand, potential future users play a more important role in the assessment of software prototypes than in traditional test methods. Therefore the testing strategies described above are not directly applicable in their entirety to the assessment of prototypes. Three of the most common cases of software prototype assessment, and some general rules for the assessment of software prototypes, will be discussed to indicate where the guidelines given in section 2 need adaptation. Because prototypes are usually developed in different programming languages from their final implementations, source code tests are rather pointless; further discussion will therefore concentrate on black-box assessments.

3.1 General rules for the assessment of software prototypes

The basic advantage of software development using the prototyping approach is the availability of direct results after short reaction times. To keep this advantage the intervals between programming and assessment have to be as short as possible. Each developer can support this by stating which requirements have been realized, pointing out the differences between the individual prototype versions and delivering a list of known bugs with the prototype. To avoid too long a delay between the start of programming and the time when the first assessment results are available, the first assessable prototype has to be delivered at a very early project stage. This is useful even if only a few quality requirements can be assessed at that point.

3.2 Assessment of software prototypes without further information

If no user requirements and no further documentation are available, an assessment is almost impossible.
There are only a few quality requirements that can be assessed according to ISO/IEC CD 12119:

• no contradictions within the program;
• uniformity of terms, program reactions and program control.
These requirements are program inherent and can be assessed without additional information. Without further data, assessments can go no further than that: quality requirements that demand more information than the prototype alone can supply cannot be checked. Examples are all quality requirements referring to the product description, the documentation and the installation, as well as quality requirements like functionality, correctness, robustness and performance of the prototype.

The situation improves if additional quality requirements can be defined. This is possible if representative future users can be found for practical operation of the prototype. In that case the users' requirements are used as a basis for the definition of quality requirements. These user requirements reflect the future users' expectations of the program and put the quality requirement usability into concrete terms. Additional requirements such as task-oriented performance and self-explanatory, ergonomic usage of the prototype can be developed in cooperation between users and assessors. This also applies to the aspects of contradiction, consistency and uniformity. Other requirements such as robustness, correctness, etc. are still not assessable without further prototype information.

The assessment is performed by defining representative tasks for user groups. The users then work on these tasks with the prototype, and the assessors record difficulties and problems in protocols for later analysis. These protocols collect the basic information for evaluating how far the quality requirements are fulfilled. This evaluation is still dependent on the amount of prior information about the prototype and on a pre-defined quality requirement catalogue. In many cases the assessor will be able to suspect the existence of errors but cannot always verify them because of the lack of information.
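A minimal sketch of such a protocol-based, user-driven assessment (all names, fields and tasks below are illustrative assumptions, not taken from the paper): representative tasks are assigned to user groups, and the difficulties observed by the assessors are collected in protocols for later analysis:

```python
# Hypothetical sketch of protocol-based usability assessment.
# The task texts, user groups and fields are assumptions for illustration.

from dataclasses import dataclass, field

@dataclass
class TaskProtocol:
    task: str                                     # representative task given to the users
    user_group: str
    problems: list = field(default_factory=list)  # difficulties recorded by the assessors

    def fulfils_requirement(self):
        """Crude evaluation: a task counts as fulfilled if no problems were recorded."""
        return not self.problems

protocols = [
    TaskProtocol("scan a test image", "curators", problems=["help text missing"]),
    TaskProtocol("store scan results", "curators"),
]

fulfilled = sum(p.fulfils_requirement() for p in protocols)
print(f"{fulfilled}/{len(protocols)} representative tasks without reported problems")
# → 1/2 representative tasks without reported problems
```

In practice the evaluation of such protocols is far richer than a pass/fail count, but the principle is the same: the protocols, not the tester's impressions, form the assessable record.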
The organizational framework will mostly be similar to that of traditional tests. This applies to the temporal order as well as to the hierarchical order of testing, although in the latter case the checklists have to be adapted for prototype testing. In particular, the items mentioned above which refer to quality requirements that cannot be assessed under the given conditions have to be removed. As the assessor will not have documentation of the prototype, he has to survey the program in order to get a general overview of the program structure. This initial information is later refined in breadth and depth, and new aspects of program features are detected in the course of the assessment. Consequently, the functional order cannot be designed at the start of the assessment but must be developed gradually.
3.3 Assessment of software prototypes with a user requirement catalogue
It can be assumed here that an analysis of user requirements was carried out before the prototype was developed, and that its results are available in the form of a catalogue from which quality requirements can be derived. In addition to the quality requirements listed in section 3.2, others derived from this catalogue can also be assessed. The requirement catalogue enables the assessor to check the complete fulfilment of the stated requirements. Furthermore, it can be assessed whether the specified program functions perform as described in the requirement catalogue (complete functionality).

When using such a requirement catalogue it must be borne in mind that user requirements change frequently and quickly, especially during prototype development. It must also be taken into consideration that many requirement parameters (e.g. existence, importance, feasibility, etc.) can be affected when users change their minds about realization demands. Therefore a more flexible and powerful maintenance strategy than in traditional software development models is required to keep the requirement catalogue and the test plan up to date. Otherwise, confusion in handling the different prototype versions and their specific requirements is unavoidable.

The organizational framework is similar to the one described in section 3.2. The functional order of testing can also be oriented towards the requirement catalogue. However, supplements to the functional order during the test will usually be necessary, since the list of requirements is not stable at the beginning. Additionally, it is to be expected that the degree of detail in the functional order will increase.

3.4 Assessment of software prototypes with a user requirement catalogue and further documentation
Sometimes the prototype is accompanied by additional system descriptions such as development protocols, help texts, user handbooks, etc. Regardless of which material is to be assessed, the first step is always to check whether the information is still up to date and available in writing. This material should be integrated into the assessment process according to its degree of completeness. Descriptions which are similar to user documentation can be used for the assessment of the quality requirements described in section 2.1. Usually prototypes do not have complete documentation, and therefore the requirement of completeness with regard to documentation is superfluous in most cases. If there is documentation of parts of the prototype
concerning use, installation and maintenance, this can be assessed against quality requirements, especially with regard to functionality and correctness (see section 2.1). Most prototyping tools do not support robustness and reliability, because these aspects are not included in the normal development process of prototypes. If this is true for the prototype to be assessed, these requirements are to be excluded from the assessment. Furthermore, parts of the program without documentation remain non-assessable.

To perform assessments of correctness, robustness and reliability, a proper installation has to be assured. Regardless of which of these quality assessments is performed, installation must be carried out according to exact written documentation or should be done by the developer. In either case it is essential to check the soundness of the installation by an appropriate procedure, which must be provided by the developer. If this is not done, errors cannot be assigned properly. To avoid difficulties with the assignment of interface errors, only one integrated software package should be installed. The organizational framework is similar to that given in section 3.2.
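Such a soundness check can be as simple as verifying the installed files against a checksum manifest delivered with the installation package. The following sketch is purely illustrative (the manifest format and function names are our own assumptions, not a procedure described in the paper):

```python
# Hypothetical installation soundness check: compare installed files against
# a checksum manifest supplied by the developer.

import hashlib
import os
import tempfile

def file_sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def check_installation(manifest):
    """manifest: dict mapping installed file path -> expected SHA-256 digest.
    Returns the list of missing or corrupted files (empty means sound)."""
    bad = []
    for path, expected in manifest.items():
        if not os.path.exists(path) or file_sha256(path) != expected:
            bad.append(path)
    return bad

# Tiny self-check with a temporary file standing in for an installed file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"package contents")
    installed = f.name
manifest = {installed: hashlib.sha256(b"package contents").hexdigest()}
print(check_installation(manifest))  # → [] (installation sound)
os.remove(installed)
```

Without some such procedure, errors found later cannot be reliably assigned either to the program or to a faulty installation.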
4 EXAMPLE OF A SOFTWARE PROTOTYPE ASSESSMENT

The research project 'VASARI', which was partly sponsored by the European Commission, is an example of the prototype approach. One aim of the project was the development of a prototype capable of scanning paintings with very high resolution and of storing that digital information. Using those data, analysis of the texture and colorimetric aspects of painting surfaces by digital imaging becomes possible. This process should provide a high amount of information about colour and surface with a minimum loss of information. A requirement analysis executed at the start of the project showed the following results:

• The hardware structure should be modular, consisting of a scanner operating a movable camera that advances relative to the picture to be scanned, and a high-capacity workstation with a true-colour option.
• The software should fulfil all reasonable user requirements and guarantee easy handling. The user interface should be ergonomic and easily understandable. If possible, all help functions should be integrated in the prototype.

Figure 3 illustrates the prototype set-up as planned.

Fig. 3. The VASARI prototype.

After several development steps, assessments concentrated on the last two versions of the VASARI prototype. These versions were assumed to contain most of the user requirements derived from the requirement catalogue (e.g. help functions) of the prototype. The assessments were supported by another research project on software quality, SCOPE, which was also partly funded by the European Commission. The part of the assessments considered in this publication concerns the colour and texture analysis features of the software.

All versions were assessed on the basis of the rules introduced in sections 3.2 and 3.3. In order to restrict the assessment expenses, only the most relevant assessable program features were selected. This could be done with the help of the available documents and also by criteria such as importance and frequency of use of the planned program functions. Assessment concentrated on the following requirements: usability, realization of requirements and, in some cases, correctness. Assessment of consistency is expensive in terms of time and manpower; because consistency was a subordinate quality requirement in this project, it was only assessed superficially.

The following experiences obtained during the assessments support the strategy given in section 3:

• The VASARI prototype was only very poorly documented. On the other hand, potential future users could be identified. Therefore the main part of the prototype assessments consisted of user-driven and ergonomic (usability-related) assessments. To assess ergonomic quality requirements not only ISO/IEC CD 12119 but also ISO 9241-10 [Ref. 9] was considered.
• In addition to the ergonomic assessments, the degree of realization of user requirements could be assessed because a requirement catalogue as described in section 3.3 was provided. However, in the present assessment it turned out that there was a considerable deficit in the proper maintenance of requirements. Although at each stage of development new user requirements and assessment aims turned up while others had to be dropped or changed, no adaptations of the
requirement catalogue were made after the initial set-up. Therefore it was impossible to trace the original meaning of certain requirements and, in consequence, impossible to decide whether a requirement was realized in the actual prototype version or not.
• Because of its size, the VASARI prototype was developed by different manufacturers using the programming language C instead of one of the typical prototyping languages. The single packages were installed and linked together independently in the test laboratory, according to verbal instructions from the developers or by the developers themselves. In consequence, no integral soundness check of the installation was provided by the developers. This caused problems when errors had to be assigned to distinct packages, especially when developers asserted the integrity of their own packages and attributed malfunctions to an incorrect installation.
• In a few cases where user documentation was available, functional correctness assessments were attempted. Unfortunately, most of these functions were not realized in the examined prototype versions. In the few cases where assessment was possible, it could not be decided whether the errors were caused by erroneous program functions or by incorrect installation.
• The documentation itself was not maintained regularly during development. Therefore the documentation quality was not rated as such, but it still influenced the correctness assessment of program functions.
• During the assessment and development cycles the manufacturers often complained about long assessment intervals. In consequence, new prototype versions were sometimes delivered before the end of the current assessment cycle, which meant that the bugs found in these versions were not fixed before the follow-up versions were released.
• The assessment results for several prototype features improved during the examinations. On the other hand, after the end of the VASARI project some features still needed improvement.
The VASARI prototype assessment showed that the assessment of software prototypes demands special handling compared with traditional programs. To cope with these demands, the rules given in section 3 seem to be an appropriate basis.

5 CONCLUSIONS

Our practical investigations during the two research projects, VASARI and SCOPE, indicated that there
are considerable differences as well as similarities between traditional software testing methods and the assessment of software prototypes. The assessability of software prototypes is related to the degree of obtainable information about the prototype and a clear understanding of the quality requirements actually demanded. If no such information is available, for example in the form of documentation or a user requirement catalogue, the prototype is not assessable in its basic parts. If future users are available for the assessments, investigations on an ergonomic basis become possible. If the prototype is delivered with precise information about its functionality and its installation, more sophisticated quality requirements like correctness and robustness can be assessed. To carry out a reasonable assessment of software prototypes we recommend:

• At the start of prototype development future users should be identified and prepared for the assessment process. This ensures that substantial ergonomic assessments can be carried out even if no further information about the prototype is available.
• A requirement catalogue that can later serve as an assessment basis must be defined, and maintained properly, even before prototype development starts.
• Important program functions and the installation procedure should be sufficiently documented and maintained in written form during the development process to allow the assessment of quality requirements like correctness, robustness, etc. Furthermore, it should be ensured that these functions will actually be implemented.
• To avoid integration difficulties the prototype should be developed as one integrated package with clearly defined responsibilities.
• The assessment of prototypes should be outlined on a black-box basis, because the programming code of the prototype is irrelevant for the final implementation language; white-box prototype assessment results would no longer be valid for the final implementation.
• Assessments have to start at a very early point of development and should be carried out as quickly as possible in order to avoid too long assessment intervals.

By paying attention to these recommendations it should be possible to minimize potential problems. However, further investigations into the assessment of software prototypes are still necessary. These should concentrate on further automation of the black-box assessment method as well as on a standardization of user-supported assessments.
REFERENCES

1. ISO/IEC CD 12119: Information processing; software packages; quality requirements and testing. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), 1992.
2. DIN 66285: Anwendersoftware - Gütebedingungen und Prüfbestimmungen. Beuth, Berlin, 1990.
3. Spitta, T., Software Engineering und Prototyping. Springer, Berlin, 1989.
4. Budde, R., Kautz, K.-H., Kuhlenkamp, K. & Züllighoven, H., Prototyping. Springer, Berlin, 1992.
5. DeMillo, R. A., McCracken, W. M., Martin, R. J. & Passafiume, J. F., Software Test and Evaluation. Benjamin/Cummings, Menlo Park, California, 1987.
6. Myers, G. J., The Art of Software Testing. John Wiley & Sons, New York, 1979. (In German: Myers, G. J., Methodisches Testen von Programmen. Oldenbourg, München, 1991.)
7. Lindermeier, R. & Siebert, F., Proposal for the organization of final software black-box tests. In IFIP Working Conference on Approving Software Products, ed. W. Ehrenberger, 17-19 September 1990, Garmisch-Partenkirchen. North-Holland, 1990, pp. 183-93.
8. Gilb, T., Principles of Software Engineering Management. Addison-Wesley, New York, 1988.
9. ISO 9241-10 Draft: Visual display terminals used for office tasks, Part 10: Dialogue interface. The International Organization for Standardization (ISO), 1991.