Information and Software Technology 48 (2006) 645–659 www.elsevier.com/locate/infsof
A method for assigning a value to a communication protocol test case

Richard Lai a,*, Yong Soo Kim b

a Department of Computer Science and Computer Engineering, La Trobe University, Bundoora, Vic. 3086, Australia
b College of Software, Kyungwon University, Songnam, Kyunggi-Do 461-701, South Korea

Received 30 August 2004; revised 30 June 2005; accepted 6 July 2005. Available online 18 August 2005.
Abstract

One of the main problems in industrial testing is the enormous number of test cases derived from any complex communication protocol. Due to budget constraints and tight schedules, the number of test cases has to be kept within a certain limit. However, imposing a limit on the number of test cases raises some issues. For instance, what criteria should be used for selecting the test cases? How can we ensure that important test cases have not been excluded? We propose that assigning a value to each of the test cases of a test suite can provide a solution. By doing so, the relative importance of each of the test cases can be ranked and an optimal test suite can then be designed. The value of a test case is to be measured in economic terms, which could be based on the probability that a particular case will occur and the probability that an error is likely to be uncovered. This paper presents a method for assigning a value to a test case of a communication protocol; it is based on sensitivity analysis, which involves execution, infection and propagation probabilities. To illustrate the method, the results of applying it to the INRES protocol are presented. © 2005 Elsevier B.V. All rights reserved.

Keywords: Formal Description Technique; Protocol testing; Estelle; Sensitivity analysis; Test case value
1. Introduction

The advance in telecommunications technology has resulted in the rapid development of many complex communication protocols. Communication protocol testing plays an increasingly crucial role in the development of computer networks, as it provides a means of enhancing the reliability of communication software. Due to the complexity of these protocols, it is essential to develop a strategy for effective testing. The advance in protocol specification using Formal Description Techniques (FDTs) in recent years has opened up a new horizon for protocol testing [1]. Internationally standardized FDTs include SDL [2], Estelle [3] and LOTOS [4]. A test sequence for a protocol is a sequence of input–output pairs derived from the protocol specification. It can be generated from a formal specification for conformance testing. Successes have been achieved by academia on automatic test case generation from formal specifications [5,6]. In general, more work has been done in
automating test case generation from Estelle specifications [7] than has been done with SDL or LOTOS. Conformance testing is concerned with the conformance of an implementation under test (IUT) to the standards [8]. The International Organisation for Standardisation (ISO) has been working towards defining an abstract testing methodology and a framework for specifying conformance test suites since 1984. The effort resulted in the standard ISO 9646 [9], which defines the details of the Conformance Testing Methodology and Framework (CTMF). A test notation called the Tree and Tabular Combined Notation (TTCN) [9] has also been developed; TTCN is used so that conformance test suites can be shared among testers. CTMF is a very general framework, applicable to the widest possible range of specifications and products. Despite the work by ISO and academia, there is still a big gap between testing practice and research results published in journals and reported at conferences [10]. This gap between academic and industrial testing practices, and the fact that academia has not been addressing real-life testing issues and problems, account for the fact that academic testing methods are seldom used in industry [10]. Thus, there is a growing need for academic research on communication protocol testing to become more
industrially relevant in order to help narrow the gap.

One of the main problems in industrial testing is the enormous number of test cases derived from any complex communication protocol; it would take many man-years to test all the possible test cases. Due to budget constraints and tight schedules, the number of test cases has to be kept within a certain limit. However, imposing a limit on the number of test cases raises some issues. For instance, what criteria should be used for selecting the test cases? How can we ensure that important test cases have not been excluded? What technique should be used for designing an optimal test suite?

We propose that assigning a value to each of the test cases of a test suite can provide a solution. By doing so, the relative importance of each of the test cases can be ranked and an optimal test suite can then be designed. The value of a test case is to be measured in economic terms, which could be based on the probability that a particular case will occur and the probability that an error is likely to be uncovered. For this purpose, the testability and sensitivity of a program need to be analysed and understood. A program's testability is a prediction of its ability to hide faults when the program is black-box-tested with inputs selected randomly from a particular input distribution [12]. A program has high testability when it readily reveals faults through random black-box testing; a program with low testability is unlikely to reveal faults through random black-box testing. A program with low testability is dangerous because considerable testing may make it appear that the program has no faults when in reality it has many. 'Sensitivity' means a prediction of the probability that a fault will cause a failure in the software at a particular location under a specified input distribution [12]. For instance, if a location has a sensitivity of 0.99 under a particular distribution, almost any input in the distribution that executes the location will cause a program failure. On the other hand, if a location has a sensitivity of 0.01, relatively few inputs from the distribution that execute the location would cause the program to fail, regardless of whether faults exist at that location.

This paper presents a method for assigning a value to a test case of a communication protocol; the method uses a software testability technique called Sensitivity Analysis [13], which is based on the Propagation, Infection, and Execution (PIE) analysis model [11]. The Contest tool [14] is employed to generate test cases from an Estelle [3] specification of a communication protocol; based on the three analyses, each transition of an implementation under test is analyzed, and its execution, infection and propagation probabilities can be determined. The value of a test case can then be derived from the number of transitions executed combined with this set of probabilities. To illustrate the method, the results of applying it to the INRES protocol are presented. It is our wish that this paper prompts other researchers to develop different methods for assigning a value to a test case.
2. Related work

A detailed study of formal methods with regard to protocol testing can be found in [1,25], which pointed out that the Unique Input/Output (UIO) test sequence method [26–28] can achieve a high fault coverage. However, some faults are still undetected in some fully specified machines [29]. Therefore, in [29] a comprehensive analysis of fault coverage (including modeling) for completely specified machines has been discussed in order to alleviate the problems described in [1,25]. To estimate fault coverage in [29], a functional fault model is defined with respect to three types of faults [29]. There have been analytical attempts both to classify faults and to characterize the kinds of faults that a certain conformance test generation procedure can detect [29]. Even though interesting, some of these results are based on imprecise definitions [30], and fault masking should be used to argue the relative merits of various testing methods [29].

Several studies [31–33] have tried to find out how to measure the goodness of a set of test cases and how to generate or select test suites with some good coverage measure. The development and implementation of a test case selection algorithm based on coverage metrics and testing distances between control execution sequences are presented in [32]. An improvement of the approach is presented in [33], in which the definition of the metric in [31] has been improved in order to tackle test selection and test coverage for protocols with parallelism and recursion. A study related to fault models is discussed in [34]; its purpose is to show the importance of fault models in testing, and to describe various fault models that correspond to different description techniques, which are used for hardware, software and/or communication protocols. Predicting the number of faults is not always necessary; it may be enough to identify the most troublesome modules [35]. In [35], discriminant analysis has been applied to identify fault-prone modules, and a very large telecommunication system (approximately 1.3 million lines of code) has been used for a case study. That study focuses on non-parametric discriminant analysis rather than the parametric approach in its modeling methodology. It also found that information about the reuse of each module from a prior release significantly improves the accuracy of the predictions of software quality models.

Gotha [36] is a tool for generating an abstract test suite for a finite state machine (FSM), driven by a coverage model. The finite state machine is described in a high-level language for modeling concrete systems. Such systems may be hardware architectures or components, software systems, communication protocols, or other complex systems and processes. A test case is a sequence of stimuli for the model. The Gotha prototype was originally developed for hardware architecture models but has also been exploited for modeling and testing software systems. Due to an increasing interest in SDL, MSC and TTCN based tools for validation and test
generation, Autolink [37] was developed to support the automatic generation of TTCN test suites from SDL and MSC specifications. This tool does not address test selection issues. The TorX tool [38] is a prototype testing tool for conformance testing of reactive software. The tool requires a real implementation and a (formal) specification of that implementation; the specification describes the system behaviour that the real implementation is allowed to perform. TGV [39] is a tool for the generation of conformance test suites for protocols. It explores the notion of a test purpose as a way to specify a set of test cases; it is based on the model of input/output transition systems and uses algorithms derived from verification technology. It links test purposes, test cases and the reference specification, and also explores the similarities and differences between the specification of test cases and the specification of programs.

As the above review shows, the approach and purpose of our work in assigning a value to a communication protocol test case are distinct from those of others. The concept of assigning a value to a communication protocol test case is the first known contribution to this area of research. Also, our method is based on a three-part failure model, comprising evidence of execution coverage, altered data states caused by syntactic mutants, and propagation of simulated infections, as compared to other approaches, which are normally based on a single fault model.
3. Sensitivity analysis

This section gives background knowledge on Sensitivity Analysis [13]. If the presence of faults in programs guaranteed program failure, every program would be highly testable. However, this is not necessarily true. Each set of variable values after the execution of a location in a computation is called a 'data state'. After executing a fault, the resulting data state might be corrupted; if there is corruption in a data state, infection has occurred and the data state contains an error. The error is called a 'data-state error'. In order for an error to be observable, the following conditions must be satisfied: the fault is executed, infection occurs, and the infection causes an incorrect output.

Sensitivity is related to testability, but the terms are not equivalent. Sensitivity focuses on a single location in a program and the effects a fault at that location can have on the program's output behavior. Testability encompasses the whole program and its sensitivities under a given input distribution. Sensitivity analysis [13] is the process of determining the sensitivity of a location in a program. From the collection of sensitivities over all locations, the program's testability can be determined. Sensitivity analysis requires that every location be analyzed for three properties: the probability of execution occurring, the probability of infection
occurring, and the probability of propagation occurring. Execution analysis is the process of estimating the likelihood that inputs execute some statements in the specification under test [12]; infection analysis is the process of estimating the likelihood that syntactic mutants are able to produce corrupted internal states with respect to inputs [12]; and propagation analysis is the process of estimating the likelihood that an infected program state will result in corrupted output [12].
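To make the three conditions concrete, the following is a minimal Python sketch of the execute–infect–propagate chain behind these definitions. The function, the location names and the output values are illustrative assumptions of ours, not part of the paper's tooling:

```python
# A minimal sketch of the fault/failure chain behind sensitivity analysis:
# a fault leads to an observable failure only if it is (1) executed,
# (2) infects the data state, and (3) the infection propagates to output.

def observable_failure(trace, faulty_location, infected, output_master, output_actual):
    """Return True when all three conditions hold for one test input."""
    executed = faulty_location in trace           # condition 1: execution
    if not executed:
        return False
    if not infected:                              # condition 2: infection
        return False
    return output_actual != output_master        # condition 3: propagation

# Example: the fault at location 'L7' is executed and infects the data
# state, but the outputs agree, so no failure is observed.
print(observable_failure(["L1", "L7", "L9"], "L7", True, "ICONconf", "ICONconf"))  # False
```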
4. The method

Our proposed method for assigning a value to a test case is based on the three techniques of sensitivity analysis: execution, infection and propagation analysis. These analyses are applied to an Estelle [3] specification. Since most of the dynamic behavior of an Estelle specification is centred on its transitions, we analyze each transition under test using sensitivity analysis. To approximate the fault/failure model at a transition of an Estelle specification, the following questions need to be answered.

Transition executed. Given a test suite, what is the probability that a particular transition is executed? The answer to this question has nothing to do with whether there is a fault at the transition; hence, we do not need to simulate faults.

Faulty transition. What is the probability that a faulty transition will infect a data state? This question requires fault simulation. We apply techniques that are similar to those of weak mutation testing [21]. We assess how likely it is that a fault at a transition will affect the output of the next transition, and how this fault at the transition will affect the data state immediately after the transition fires.

Causing an infection. What is the probability that a faulty transition would cause an infected data state such that an implementation would produce an incorrect output? To answer this question, we apply an analysis similar to strong mutation testing [20]. We simulate the data-state effect that a mutant will have on the output generated by an implementation.

Fig. 1 shows the overall approach to assigning a value to a test case in a test suite generated from an Estelle specification. The Contest tool [14] is applied to generate test cases from a given Estelle specification. These test cases are provided to the three analyses, from which we obtain three sets of estimates (execution, infection and propagation). Using these estimates, the last step is to assign a value to a test case generated from a given Estelle specification.

[Fig. 1. The method: test cases generated from an Estelle specification are provided to the execution, infection and propagation analyses; the resulting execution, infection and propagation estimates are then analyzed to produce a test case value.]

4.1. Some definitions

We define an implementation of an Estelle specification IE as a function g that maps a domain of possible test cases
to a range of possible outputs. With the same domain and perhaps a different range, we let function f represent the desired behavior of g; f can be thought of as a functional specification of g. An oracle is a recursive predicate on input–output pairs that checks whether or not f has been implemented for an input: for example, if f(x) = y, oracle ω(x, y) is TRUE. It is necessary to be able to say whether a particular output of g is correct or incorrect for a particular test case x, with the latter implying that g(x) ≠ f(x), and the former implying that g(x) = f(x). The failure probability of IE with respect to a set of test cases, t_IE, is the probability that IE produces an incorrect output for a selected test case.

A data state of IE is a set of mappings between all variables (declared and dynamically allocated) and their values at a point during execution; in a data state we include both the program test case used for this execution and the value of IE's counter. In this approach, the execution of a transition is considered to be atomic; hence data states can only be viewed between transitions. A data state error is an incorrect variable/value pairing in a data state, where correctness is determined by an assertion between transitions. We refer to a data state error as an infection; the data state and the variable with the incorrect value at that point are termed infected if a data state error exists. Propagation of a data state error occurs when the data state error affects the output.

It can be said that IE contains a fault with respect to a set of test cases if there exists at least one test case for which IE fails. We may know that a fault exists in IE; however, in general, we cannot identify a single transition as the exclusive cause of the failure. For instance, several transitions may interact to cause the failure, or IE may be missing a required computation, which could be inserted in many different places to correct the problem. However, if IE is annotated with assertions about the correct data state before and after a particular transition Ti, and if there exists a test case for which Ti's succeeding data state violates the assertion and Ti's preceding data state does not, then Ti contains a fault.

To define the execution, infection and propagation probabilities, we first introduce the notation. Let T denote the set of n_T transitions of IE: {T_1, T_2, ..., T_{n_T}}, and let TC be a set of n test cases of IE: {tc_1, tc_2, tc_3, tc_4, ..., tc_n} (where 1 ≤ j ≤ n).

4.1.1. Execution probability

Let ε_{iIE_j} represent the probability that tc_j will execute T_i of IE. It is defined as follows:

$$\varepsilon_{iIE_j} = \begin{cases} 1 & \text{if } T_i \text{ is executed by } tc_j \ (\text{where } 1 \le j \le n) \\ 0 & \text{otherwise} \end{cases}$$
Now the estimate of ε_{iIE_j} (the execution estimate) is denoted by ε̂_{iIE}; it is found by taking the sample mean of ε_{iIE_j}. Accordingly, ε̂_{iIE} is defined as follows:

$$\hat{\varepsilon}_{iIE} = \frac{\sum_{j=1}^{n} \varepsilon_{iIE_j}}{n}$$

and

$$\hat{\varepsilon}_T = (\hat{\varepsilon}_{1IE}, \hat{\varepsilon}_{2IE}, \ldots, \hat{\varepsilon}_{n_T IE})$$

where ε̂_T is the set of execution estimates of T.

4.1.2. Infection probability

Infection probability is the probability that a change in IE causes a change in the resulting internal computational state. In order to define an infection probability, we let M_i represent a set of p_i mutants of T_i: {m_{i1}, m_{i2}, ..., m_{ip_i}} (where 1 ≤ y ≤ p_i). Now let λ_{m_{iy}} denote the infection probability of m_{iy}, and let T_{i_next} be the transition executed immediately after T_i. λ_{m_{iy}IE_j} is then defined as the probability that the succeeding output of T_{i_next} from the master of IE is different from the succeeding output of T_{i_next} that m_{iy} creates. Next, let O_{IET_{i_next}}(tc_j) represent the output that exists after executing T_{i_next} from the master of IE with tc_j, and let O_{m_{iy}T_{i_next}}(tc_j) represent the output produced after executing T_{i_next} from the replacement of m_{iy} with tc_j; then λ_{m_{iy}IE_j} is defined below:

$$\lambda_{m_{iy}IE_j} = \begin{cases} 1 & \text{if } O_{m_{iy}T_{i_{next}}}(tc_j) \ne O_{IET_{i_{next}}}(tc_j) \\ 0 & \text{otherwise} \end{cases}$$

The estimate of λ_{m_{iy}IE_j} is termed the infection estimate. It is denoted by λ̂_{m_{iy}IE}, and is defined as follows:

$$\hat{\lambda}_{m_{iy}IE} = \frac{\sum_{j=1}^{n} \lambda_{m_{iy}IE_j}}{n}$$
and

$$\hat{\lambda}_T = (\{\hat{\lambda}_{m_{11}IE}, \hat{\lambda}_{m_{12}IE}, \ldots, \hat{\lambda}_{m_{1p_1}IE}\}, \{\hat{\lambda}_{m_{21}IE}, \hat{\lambda}_{m_{22}IE}, \ldots, \hat{\lambda}_{m_{2p_2}IE}\}, \ldots, \{\hat{\lambda}_{m_{n_T1}IE}, \hat{\lambda}_{m_{n_T2}IE}, \ldots, \hat{\lambda}_{m_{n_Tp_{n_T}}IE}\})$$
where λ̂_T is the set of infection estimates of T.

4.1.3. Propagation probability

Propagation probability is the probability that a forced change in an internal computational state propagates and causes a change in the output of an implementation of the specification. To define the propagation probability, we let x_{il} be the l-th variable x of T_i at location l, and let D_{x_{il}} denote a set of m data states of x_{il}: {d_{x_{il}1}, d_{x_{il}2}, ..., d_{x_{il}m}} (where 1 ≤ v ≤ m). Now let ψ_{x_{il}IE_v} represent the propagation probability of x_{il}: it is the probability that the infection affecting variable x_{il} in the v-th data state succeeding T_i will make IE's output differ from what would normally be produced. Let O_M be an output of the master of IE, and O_P be an output of IE after perturbing the data state for x_{il}; then ψ_{x_{il}IE_v} is defined as follows:

$$\psi_{x_{il}IE_v} = \begin{cases} 1 & \text{if } O_M \ne O_P \\ 0 & \text{otherwise} \end{cases}$$

The estimate of the propagation probability (the propagation estimate) is denoted by ψ̂_{x_{il}IE}. It is defined as follows:

$$\hat{\psi}_{x_{il}IE} = \frac{\sum_{v=1}^{m} \psi_{x_{il}IE_v}}{m}$$
and

$$\hat{\psi}_T = (\{\hat{\psi}_{x_{11}IE}, \hat{\psi}_{x_{12}IE}, \ldots, \hat{\psi}_{x_{1m_1}IE}\}, \{\hat{\psi}_{x_{21}IE}, \hat{\psi}_{x_{22}IE}, \ldots, \hat{\psi}_{x_{2m_2}IE}\}, \ldots, \{\hat{\psi}_{x_{n_T1}IE}, \hat{\psi}_{x_{n_T2}IE}, \ldots, \hat{\psi}_{x_{n_Tm_{n_T}}IE}\})$$

where ψ̂_T is the set of propagation estimates of T, and m_1, m_2, ..., m_{n_T} are the numbers of mutants of T_1, T_2, ..., T_{n_T}, respectively.

4.2. Algorithms

In this section, the algorithms that implement the three analyses are presented.

4.2.1. Execution analysis

Execution analysis is based on program structure [15]. Structural testing methods attempt to cover specific types of software structure with at least one test case. For instance, statement testing is a structural testing method that attempts to execute every statement at least once, and branch testing is a structural testing method that attempts to execute each branch at least once. Execution analysis estimates the probability of executing a particular transition when a test case is selected. Statement and branch testing provide weak criteria, as they do not ensure failure if a fault exists: executing a statement during statement testing and not observing program failure provides only one data point for estimating whether or not the statement contains a fault [15]. Because execution analysis estimates execution probabilities, it has an advantage over structural testing methods, since it can indicate the likelihood of a particular statement being executed.

Given an Estelle protocol specification (with the master specification denoted as M) and n test cases, execution analysis includes the following steps.

† Master specification testing. For the master Estelle specification M, do the following.
1. Translate M to its C/C++ implementation using an Estelle compiler.
2. Compile the C/C++ implementation to an executable object (IE) using a C/C++ compiler.

† Execution analysis.
1. Set X to zero.
2. Execute IE with n test cases according to a test case generator.
3. Increment X each time Ti is executed, making sure that the counter is incremented at most once per test case: if Ti is executed repeatedly for the same test case, X is only incremented once.
4. The execution estimate for Ti is X/n.
5. Perform steps one to four for all transitions in the specification.
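The counting loop above is straightforward to express in code. The following is a minimal Python sketch, under the assumption that each test case has already been run and the set of transitions it covered recorded; the trace representation and the example data are ours, not the Contest tool's:

```python
# Minimal sketch of execution analysis (Section 4.2.1), assuming each test
# case has been executed against IE and its covered transitions recorded.
# The trace data below is illustrative, not taken from the INRES experiment.

def execution_estimates(traces, transitions):
    """traces: list of sets, one per test case, of the transitions it
    executed. Returns the execution estimate X/n for every transition."""
    n = len(traces)
    estimates = {}
    for t in transitions:
        # X counts at most one execution of t per test case.
        x = sum(1 for covered in traces if t in covered)
        estimates[t] = x / n
    return estimates

traces = [{"T1", "T2", "T5"}, {"T1", "T3", "T6"},
          {"T1", "T13", "T17"}, {"T1", "T2", "T8", "T4"}]
print(execution_estimates(traces, ["T1", "T2", "T3"]))
# {'T1': 1.0, 'T2': 0.5, 'T3': 0.25}
```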
4.2.2. Infection analysis

Infection analysis is similar to the process employed in fault-based testing [15], which defines faults in terms of their syntax and aims at demonstrating that certain faults are not in a program [16–18]. Mutation testing [19–21] is a fault-based testing strategy that evaluates program test cases. It takes a program and produces n versions (mutants) of the program that are syntactically different from the program. The goal of strong mutation testing [20] is to find a set of test cases that distinguish the mutants from the program, while weak mutation testing [21] selects test cases that cause all imagined infections to be created by a possibly infinite set of mutants. Infection analysis is different from weak mutation testing: it measures the effect mutants have on succeeding data states [15]. Syntactic changes are made to the internal computational state, and infection analysis finds the probability that a particular mutant affects the data state, estimating the infection probability for each mutant. In other words, infection analysis tests a location's ability
to sustain a syntactic change and yet not alter the data state that results from executing the mutant. When a syntactic mutant alters the data state, we call this data state an altered data state. Infection analysis involves the following steps.

† Mutation preparation. Prepare mutants for Ti of an Estelle specification. To generate mutants, we apply the concept of a Finite Complete Set of Alternatives (FCSA) [22].

† Master specification testing. For the master Estelle specification M, do the following.
1. Translate M to its C/C++ implementation using an Estelle compiler; and
2. Compile the C/C++ implementation to an executable object (IE) using a C/C++ compiler.

† Weak mutation testing [21] (infection analysis). For each transition in the specification, do the following.
1. Set X to zero.
2. Execute IE with n test cases according to a test case generator.
3. Execute the mutant object code on the current test case and compare the output of Tinext with the master behavior representation. If the outputs are not identical, increment X.
4. Repeat step 3 up to tcn. The infection estimate for miy at Ti is X/n.
5. Perform steps one to four for all mutants.
6. Perform step five for all other transitions.
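As a sketch, the weak-mutation comparison can be phrased as follows in Python. Here the master and each mutant are modeled as callables returning the output observed after Tinext, which is our simplification of the compiled-object comparison above; the toy behaviors are illustrative only:

```python
# Minimal sketch of infection analysis (Section 4.2.2). The master and a
# mutant are modeled as functions from a test case to the output observed
# after Ti_next; in the real method these are compiled Estelle objects.

def infection_estimate(master_output, mutant_output, test_cases):
    """Fraction of test cases for which the mutant's output after Ti_next
    differs from the master's (the X/n estimate for this mutant)."""
    x = sum(1 for tc in test_cases
            if mutant_output(tc) != master_output(tc))
    return x / len(test_cases)

# Illustrative toy behaviors: the mutant mishandles negative inputs.
master = lambda tc: "AK" if tc >= 0 else "DR"
mutant = lambda tc: "AK"                       # always answers AK
print(infection_estimate(master, mutant, [-2, -1, 0, 1]))  # 0.5
```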
4.2.3. Propagation analysis

Propagation analysis generalizes the concept of fault-based testing by analyzing classes of faults in terms of their semantic effect on a data state [15]. It directly modifies program data states and measures the effect on the computation of the program, and involves the following steps.

† Data state preparation. Initialise the data states of the variables for Ti in the Estelle specification.

† Master specification testing. For the master Estelle specification M, perform the same steps as in the case of infection analysis.

† Strong mutation testing [20] (propagation analysis). For each transition in the specification, do the following.
1. Perturb the data state of variable xil in Ti and translate the Estelle specification to a C/C++ implementation.
2. Execute IE with n test cases according to a test case generator.
3. Execute IE on tcj and compare OP(tcj) with OIE(tcj), where OP(tcj) is the output of this propagation analysis for tcj and OIE(tcj) is the corresponding output of IE, and record the number of times that the program outputs differ.
4. Repeat steps one to three n times. The propagation estimate for xil at Ti is Ndiff/n (where Ndiff is the number of times that the program outputs differ).
5. Repeat steps 1–4 for the other variables.
6. Perform step 5 for all other transitions.

4.3. The estimates

From the three analyses, we obtain three probability estimates for each transition in IE.

† Set 1. Execution estimate: an estimate of the probability that Ti of IE is executed.
† Set 2. Infection estimate: for each mutant of Mi at Ti, given that the transition is executed, an estimate of the probability that the mutant will adversely affect the output of Tinext.
† Set 3. Propagation estimate: for each variable in (xi1, xi2, ...) at Ti, given that the output of Tinext changes, an estimate of the probability that the output of IE changes.

Based on the above estimates, the following terms can be defined.

Definition 3.1. Let ε̂^{tc_y}_{xIE} represent a set of execution probabilities that are created by tc_y: {ε̂^{tc_y}_{1IE}, ε̂^{tc_y}_{2IE}, ..., ε̂^{tc_y}_{q_yIE}}. Given two test cases tc1 and tc2, tc1 is said to be more likely to be executed than tc2 if:
$$\sum_{x=1}^{q_1} \hat{\varepsilon}^{tc_1}_{xIE} > \sum_{x=1}^{q_2} \hat{\varepsilon}^{tc_2}_{xIE}$$
Definition 3.2. If a test case produces a higher infection estimate for an implementation of a master specification and a mutant specification, this test case is said to discriminate the implementation of both specifications.
Definition 3.3. Let λ̂^{TC_y}_{m_{iy}E} represent the infection estimate that is produced by TC_y, where TC_y represents a test suite. Given two test suites, TC_1 = {tc_{11}, tc_{12}, ..., tc_{1n}} and TC_2 = {tc_{21}, tc_{22}, ..., tc_{2n}}, TC_1 is said to be more effective, or more discriminating, than TC_2 if

$$\hat{\lambda}^{TC_1}_{m_{iy}E} > \hat{\lambda}^{TC_2}_{m_{iy}E}$$
Definition 3.4. If a test case has a high propagation estimate, it is said to propagate to the output of an implementation of the master specification.

Definition 3.5. Let ψ̂^{tc_y}_{IE} represent the propagation estimate that is produced by tc_y. Given two test cases tc1 and tc2, tc1 is said to be more propagating than tc2 if

$$\hat{\psi}^{tc_1}_{IE} > \hat{\psi}^{tc_2}_{IE}$$
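Returning to the propagation algorithm of Section 4.2.3, a minimal Python sketch of the perturb-and-compare loop follows, using the Ndiff/M formulation also used later in Section 5.4.3. The run function stands in for executing the compiled implementation and, like the toy behavior, is an assumption of ours:

```python
# Minimal sketch of propagation analysis (Section 4.2.3): perturb one
# variable's data state after Ti, re-run the implementation, and count how
# often the final output changes. run() stands in for the compiled IE.

def propagation_estimate(run, test_cases, variable, data_states):
    """Propagation estimate Ndiff/M for one variable at one transition,
    where M = n * (number of data states used to perturb the variable)."""
    n_diff = 0
    for tc in test_cases:
        o_master = run(tc, perturb=None)                       # O_M
        for value in data_states:
            o_perturbed = run(tc, perturb=(variable, value))   # O_P
            if o_perturbed != o_master:
                n_diff += 1
    return n_diff / (len(test_cases) * len(data_states))

# Toy implementation: the output depends on 'counter' only when tc == 1.
def run(tc, perturb=None):
    counter = tc if perturb is None else perturb[1]
    return "DR" if (tc == 1 and counter > 3) else "AK"

print(propagation_estimate(run, [0, 1], "counter", [2, 4, 6]))  # ~0.33
```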
4.4. Obtaining a test case value

We are now in a position to obtain a test case value (TCV) using the three estimates: the execution, infection and propagation estimates. To get a TCV, the following questions need to be considered.

† How many transitions did a particular test case execute during execution analysis?
† How many infections of the outputs of Tinext were produced by the syntactic mutants during infection analysis when the transitions were executed?
† How many of the simulated infections of the transitions executed by this test case changed the outputs of IE during propagation analysis?

We believe that those test cases executing more transitions, creating more alterations to the outputs of Tinext and propagating simulated infections more frequently than others can be expected to have a higher value than those test cases that do so less frequently. We can also say that the higher the value a test case has, the more faults it is likely to uncover, if faults exist.

Let Ntcj represent the number of transitions that a test case tcj executes. For each Ti executed by tcj, there are Si syntactic mutants used during infection analysis and Ri simulated infections used during propagation analysis. Let si represent the number of syntactic mutants that actually cause alterations to the output of Tinext at Ti under tcj, and let ri represent the number of simulated infections that propagate from Ti to the output of IE under tcj. We can then obtain two scores for tcj using the following equations:

$$a_{tc_j} = \frac{1}{N_{tc_j}} \sum_i \frac{s_i}{S_i} \qquad b_{tc_j} = \frac{1}{N_{tc_j}} \sum_i \frac{r_i}{R_i}$$

where i iterates over all the transitions executed by tcj in both equations. Averaging over all transitions executed by tcj, a_{tc_j} and b_{tc_j} characterize the ability of tcj to create alterations to the outputs of Tinext and to propagate simulated infections, respectively. The equation for the TCV of tcj is a combination of three terms: one for the number of transitions executed, one for the ability to create alterations to the outputs of Tinext, and one for the ability to propagate simulated infections:

$$TCV_{tc_j} = N_{tc_j} \, a_{tc_j} \, b_{tc_j}$$

As expected, a large number of executed transitions will produce a high TCV. If two test cases are similarly successful in creating alterations to the outputs of Tinext and propagating simulated infections at the transitions, then the test case that executes more transitions will have the higher TCV.
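A minimal Python sketch of this computation, using the per-transition counts defined above; the numeric counts in the example are illustrative, not drawn from the INRES experiment:

```python
# Minimal sketch of the TCV equation: TCV = N * a * b, where a and b
# average, over the transitions a test case executes, the fraction of
# mutants that altered the next output (s_i/S_i) and the fraction of
# simulated infections that propagated (r_i/R_i).

def test_case_value(per_transition):
    """per_transition: list of (s_i, S_i, r_i, R_i) tuples, one entry per
    transition executed by the test case."""
    n = len(per_transition)
    a = sum(s / S for s, S, _, _ in per_transition) / n
    b = sum(r / R for _, _, r, R in per_transition) / n
    return n * a * b

# A hypothetical test case covering three transitions:
counts = [(2, 4, 1, 3), (3, 3, 2, 3), (0, 5, 0, 3)]
print(round(test_case_value(counts), 2))  # 0.5
```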
5. Implementing the method

How to implement the method is best illustrated by applying it to a communication protocol. We have chosen the INRES protocol [23]. The reasons for our choice are that it contains many basic Open Systems Interconnection (OSI) concepts, and that it is not too big and is easy to understand. It is a connection-oriented protocol that runs between two protocol entities: an initiator and a responder. The protocol entities communicate by exchanging the CR, CC, DT, AK and DR protocol data units. The communication between the two protocol entities proceeds in three phases: the connection establishment phase, the data transmission phase and the disconnection phase. Fig. 2 shows the basic structure of the INRES protocol.

[Fig. 2. The INRES protocol: the INITIATOR (at ISAPini) and the RESPONDER (at ISAPres) exchange CR and DT, and CC, DR and AK, protocol data units over the Medium Service via NSAP1 and NSAP2, using NDATreq and NDATind; the service primitives include ICONreq, ICONconf, ICONind, ICONresp, IDATreq, IDATind, IDISreq and IDISind.]

5.1. The procedure

The procedure required to obtain TCVs is described below.

Step 1: Formal specification of the protocol. A formal specification of the protocol needs to be written first. In this case, the INRES protocol has been specified in Estelle. It consists of two modules: initiator and responder. It would be too lengthy to include the specification in this paper; more details can be found in [23].

Step 2: Normalization of the modified specifications. Normalization is a static analysis technique for Estelle specifications [24]. It is like symbolic evaluation, which tries to find symbolic paths in the specifications. The normalization process transforms the input specification into another in which each transition has only a single path.

Step 3: Data flow and control flow generation. Control flow deals with viewing the normalized transitions with a focus on control flow [24], without considering data flow. Each
module is a finite state machine. Data flow analysis, on the other hand, focuses on the action part of the normalized transitions; this process extracts information about the memory part of the extended finite state machines.

Step 4: Generation of test cases. The information from data flow analysis and control flow analysis is combined to obtain the transition tours. The transition tour generation tool, mstourgen [24], is used to obtain all the transition tours from the control flow graph, after which the infeasible paths are removed. Sub-tours are generated using the transition tours, the data flow graphs and the test suite structure. Sub-tours are then combined in different groups to obtain the test cases.

Step 5: Implementation. To do the experiment, the INRES protocol needs to be implemented. A detailed description of the implementation is given in Section 5.2.

Step 6: Obtaining the three estimates. The execution, infection and propagation estimates can now be obtained by using the definitions and algorithms of Section 4.

Step 7: Estimating a TCV. This step is the highlight of the work. A TCV can be obtained by using the formula described
in Section 4. By obtaining the TCVs, the relative importance of each of the test cases can be ranked and an optimal test suite can then be designed.

5.2. Protocol implementation

Fig. 3 shows the methodology for implementing the INRES protocol. There are three main parts: adaptation to the implementation environment, C-code generation and the C-compiler.

[Fig. 3. An implementation of the INRES protocol: the INRES Estelle specification is adapted into Inres Initiator and Inres Responder specifications, each with its interface, CODER and Medium Interface modules; the C-code of each part is then generated and compiled into an executable object.]

Adaptation to the implementation environment. To adapt the INRES protocol to the implementation environment of the experiment, the original Estelle specification of the INRES protocol was modified by separating it into two modules: (Coder, Initiator) and (Coder, Responder). For each module, the main role of the two interface modules INITIATOR/RESPONDER and MEDIUM INTERFACE is to hide the specifications from the external environment. The INRES protocol is implemented as a single operating system task. Every external message received is transformed into the appropriate
data structure as defined in the specification and sent over the channel to either the Initiator/Responder or Coder modules.

C-code generation. The modified specification is then ready for generating C-code. It is noted that a reference implementation can be generated by using the EDT tool [40] to compile an Estelle specification to executable C code. However, for the purpose of this experiment, the C-code for both the initiator and responder specifications was manually written. The C-code focuses on the movement of the transitions in the specification.

C-compiler. Finally, the C-compiler is used to compile both the initiator and responder C code; the executable programs for them are thus obtained.

5.3. Mutant generation

One of the main steps of infection analysis is to construct a set of mutants according to the master Estelle specification. The method of generating a set of mutants is based on the concept of a Finite Complete Set of Alternatives (FCSA) [22].

5.3.1. Level of mutations

Mutants can be classified into two levels: major and minor.

1. Major mutation. For an Estelle-based transition, a major mutation involves modifying:
† the state of the from clause; and
† the predicate of the provided clause.

2. Minor mutation. The scope of minor mutation is as follows:
† the state of the to clause; and
† all the syntactic elements of all the executable statements between the begin and end statements of a transition block.

Note that mutants in the state of the when clause are not generated, as it is assumed that the when clause will accept certain inputs to execute the transition. With these two levels of mutants, the method will not only test all the transitions of a module but also check all the state-transition structures.

5.3.2. Element and mutant definitions

We connect all syntactically possible alternatives of all the variables to the operators appearing in the executable parts of the Estelle specification. The definition given here has a very specific scope; in particular, only syntactically possible elements are included. By syntactically possible, we mean that these alternatives have to meet the requirements of Estelle syntax, especially the following two constraints:

† variables must be declared before being referenced; and
† alternatives must be syntactically compatible with the other elements in the statements of the clauses which refer to them.
To understand the alternative definition better, transition 1 of the initiator (in normalized form) is taken as an example. The transition structure is as follows:

{transition 1}
trans {1}
  when isap.iconreq
  from disconnected
  to wait
  begin
    counter := 1;
    nsdu.id := cr;
    output nsap.ndatreq(nsdu)
  end;

For this transition, there are a total of five syntactic elements which can have mutation alternatives, namely: the starting state disconnected; the next state wait; the parameters counter and nsdu.id; and the parameter of the output event, nsdu. These can have the syntactically possible mutants shown in Table 1.

Table 1. Mutants for example transition 1

Elements        Mutants
disconnected    wait, connected, sending
wait            disconnected, connected, sending
counter         number
nsdu.id         (none)

5.3.3. Mutant construction

Mutant construction is based on the idea of constructing an FCSA [22]. For a given transition, the steps for building mutants are described below.

1. Extract all the syntactic elements that can be mutated. According to the concepts of major and minor mutations, these elements can be divided into the following three parts:
† the state variables appearing in the from and to clauses;
† all context variables appearing in a provided clause and in the executable statements of a transition block; and
† all operations related to context variables.

2. For each category of syntactic element, build its mutants based on the following principles (a sketch of this enumeration follows the list):

† State variable. Select all other state variables from the state-definition-part of the module-body-definition containing the transition under test. If the state variable appears in the from clause, add EITHER; otherwise, add same.

† Input event. Select all other plausible input events from the channel-definition part of the module-body-definition containing the transition under test.

† Each context variable.
– Identify its type.
– Select all other variables which have the same type from: the variable-declaration-part of the transition under test; the variable-declaration-part of the module-body-definition containing the transition under test; and the variable-declaration-parts of the module-body-definitions of all the ancestor module bodies containing the transition under test.
– When an alternative is a numerical variable, enumerate all its possible values.

† Each operator. Select all other possible operators which have the same type.
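As referenced above, the following is a minimal Python sketch of FCSA-style enumeration for the example transition, limited to the state-variable principle; the transition encoding is an assumption of ours:

```python
# Minimal sketch of FCSA-style mutant enumeration (Section 5.3.3), limited
# to the state-variable principle. The transition encoding is illustrative.

ALL_STATES = ["disconnected", "wait", "sending", "connected"]

def state_mutants(transition):
    """Enumerate major (from-clause) and minor (to-clause) state mutants."""
    mutants = []
    for clause in ("from", "to"):
        original = transition[clause]
        for alt in ALL_STATES:
            if alt != original:          # select every other state variable
                mutant = dict(transition)
                mutant[clause] = alt
                mutants.append(mutant)
    return mutants

t1 = {"when": "isap.iconreq", "from": "disconnected", "to": "wait"}
for m in state_mutants(t1):
    print(m["from"], "->", m["to"])      # six mutants, as in Table 1
```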
5.4. Constructing the estimates

5.4.1. Execution estimate

There are 12 and 9 test cases generated for the initiator and the responder, respectively. The execution estimates for both the initiator and the responder are given in Tables 2 and 3.

Table 2. Execution estimates for transitions in the INRES Initiator

Transition  Estimate    Transition  Estimate
T1          1.00        T21         0.08
T2          0.75        T24         0.08
T3          0.17        T25         0.08
T4          0.08        T28         0.08
T5          0.08        T29         0.08
T6          0.08        T30         0.08
T8          0.42        T32         0.08
T9          0.08        T33         0.08
T10         0.08        T34         0.08
T11         0.17        T36         0.08
T12         0.08        T37         0.08
T13         0.08        T38         0.08
T15         0.17        T39         0.08
T16         0.17        T41         0.08
T17         0.08        T42         0.08
T19         0.08        T44         0.08
T20         0.08        T45         0.08

Table 3. Execution estimates for transitions in the INRES Responder

Transition  Estimate    Transition  Estimate
T1          0.33        T13         0.11
T2          0.33        T14         0.11
T3          0.11        T15         0.11
T4          0.11        T16         0.11
T5          0.11        T17         0.11
T6          0.11        T18         0.11
T7          0.11        T19         0.11
T8          0.11        T20         0.11
T9          0.22        T21         0.11
T10         0.11        T22         0.11
T11         0.11        T23         0.11
T12         0.11

Now we consider which test cases are more likely to be executed than others, according to the definitions described in Section 4. To allow a clear comparison, the following equation is applied to assign a degree of execution to a test case; the results are shown in Tables 4 and 5:

$$E_{tc_i} = \frac{\sum_{j=k}^{n_i} \hat{\varepsilon}_{T_j}}{\sum_{l=m}^{n_{total}} \hat{\varepsilon}_{T_l}}$$

where E_{tc_i} represents the degree of execution for tc_i; n_i is the number of transitions that are executed by tc_i; n_total is the total number of transitions that are executed by all test cases during the testing; and 1 ≤ m, k ≤ n_total. The higher E_{tc_i} is, the more likely tc_i is to be executed.

Table 4. Transition execution for the INRES Initiator

Test case  Executed transitions                              E_tci
tc1        T1, T2, T5                                        0.37
tc2        T1, T3, T6                                        0.25
tc3        T1, T13, T17                                      0.23
tc4        T1, T2, T8, T4                                    0.45
tc5        T1, T2, T12, T16                                  0.40
tc6        T1, T2, T25                                       0.37
tc7        T1, T30, T34, T38, T39, T3, T45, T2, T21          0.66
tc8        T1, T2, T29, T33, T37, T8, T20                    0.50
tc9        T1, T2, T42, T44, T16                             0.42
tc10       T1, T2, T8, T9, T10, T11, T15                     0.53
tc11       T1, T2, T8, T19, T24                              0.47
tc12       T1, T2, T8, T28, T32, T36, T11, T41, T15          0.56

Table 5. Transition execution for the INRES Responder

Test case  Executed transitions                              E_tci
tc1        T7                                                0.03
tc2        T24                                               0.03
tc3        T11                                               0.03
tc4        T14                                               0.03
tc5        T17                                               0.03
tc6        T20                                               0.03
tc7        T1, T13, T16, T19, T2, T9                         0.38
tc8        T1, T2, T12, T15, T18, T9                         0.38
tc9        T1, T2, T3, T4, T5, T21, T8, T6, T22, T23, T10    0.52
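The degree-of-execution calculation can be sketched in a few lines of Python; the values below reproduce Table 4's entry for tc1 from Table 2 (only the estimates needed for the example are listed, and the total 5.01 is our sum of all the estimates in Table 2):

```python
# Minimal sketch of the degree-of-execution equation (Section 5.4.1):
# E_tci = (sum of execution estimates of the transitions tc_i executes)
#         / (sum of execution estimates of all executed transitions).

ESTIMATES = {"T1": 1.00, "T2": 0.75, "T5": 0.08}   # Table 2 (excerpt)
TOTAL = 5.01   # our sum of all execution estimates in Table 2

def degree_of_execution(executed, estimates, total):
    return sum(estimates[t] for t in executed) / total

print(round(degree_of_execution(["T1", "T2", "T5"], ESTIMATES, TOTAL), 2))  # 0.37
```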
5.4.2. Infection estimate

To demonstrate the principle, the procedure for obtaining the infection estimate for T1 of the initiator is outlined below.

Mutation preparation. A set of mutants of T1 is manually generated based on the mutant generation concepts mentioned previously. As a result, the mutants described in Table 1 are obtained.

Applying infection analysis to T1. The infection analysis is applied as follows:

1. Set variable X = 0.
2. Substitute a mutant into the initiator, execute the mutant object code on the current test case that executes T1, and compare the output of T1next with the output of T1next of the master object code. If the outputs are not identical, increment X.
3. Repeat step 2 with different test cases. The infection estimate for the mutant at T1 is X/n (see Table 6), where n is the number of test cases applied.
4. Perform steps one to three for all mutants in Table 1.

Lastly, the above procedure was applied to all transitions of the initiator and responder. The results are shown in Tables 7 and 8, where t is the number of times that step 3 has been repeated, and the mutants are abbreviated as follows: Fd, from disconnected; Fw, from wait; Fs, from sending; Fc, from connected; Td, to disconnected; Tw, to wait; Ts, to sending; Tc, to connected.

Table 6. Infection estimate for each mutant at T1 (cumulative X values, transition 1 of the initiator)

tc^T1_t             Fw     Fc     Fs     Tc     Ts     Td     Number
tc^T1_1             0      0      0      1      1      1      0
tc^T1_2             0      0      0      2      1      2      1
tc^T1_3             0      0      0      3      1      2      2
tc^T1_4             0      0      0      3      1      2      2
tc^T1_5             0      0      0      4      2      3      2
tc^T1_6             0      0      0      5      3      4      2
tc^T1_7             0      0      0      6      4      5      2
tc^T1_8             0      0      0      7      5      6      2
tc^T1_9             0      0      0      8      6      7      2
tc^T1_10            0      0      0      9      7      8      2
tc^T1_11            0      0      0      10     8      9      2
tc^T1_12            0      0      0      11     9      10     2
Infection estimate  0.00   0.00   0.00   0.92   0.75   0.83   0.17
5.4.3. Propagation estimate

The infection estimates just obtained show how the mutants at the selected transitions will infect the data state. We now estimate the probability that an infected data state at the transitions will propagate to the output. To obtain the estimates, propagation analysis has been applied to both the initiator and responder test cases. To demonstrate the method, the details of obtaining the propagation estimate for T1 are described below.

Data preparation. The data states of the variables of T1 are first prepared. The data states and the perturbed variables of T1 that are used in the experiment are as follows:

Variable    Data states
counter     2, 4, 6
number      0, 1, 2, 3, 4, 5, 6
nsdu.id     cr, cc, dt, ak, dr
Applying propagation analysis to T1. The propagation analysis is applied as follows:

1. Perturb a variable with a data state (see the data preparation above) in T1.
2. Execute the object code on the test case that executes T1, and compare the output of the executable master INRES Initiator (OM) with the output of the executable INRES Initiator after perturbing the variable with the data state (OP).
3. Repeat step 2 for the n test cases.
4. Repeat steps 1 and 2 for the other data states. The propagation estimate for the variables at T1 is Ndiff/M (where Ndiff is the number of times that OM ≠ OP, and M is n × the number of data states of the variable).
5. Repeat steps one to four for the other variables.

Finally, we applied the above procedure to all transitions of the initiator and the responder. The results are shown in Tables 9 and 10. Note that the data used to perturb each variable of the transitions are as follows: counter: 2, 4, 6; number: 0, 1, 2, 3, 4, 5, 6; nsdu.id: cr, cc, dt, ak, dr.

5.5. Test case values

Now we are in a position to assign values to the test cases generated for the initiator and responder; there are 12 and 9 test cases for the initiator and the responder, respectively. To give a better understanding of how the TCVs are obtained, the procedure for assigning values to tc7 of the initiator and tc9 of the responder is given below.
Table 7. Infection estimates for the INRES Initiator (infection estimate for each mutant at Ti)

Ti    Fd     Fw     Fs     Fc     Td     Tw     Ts     Tc     Number  Counter
T1    –      0.00   0.00   0.00   0.83   –      0.75   0.92   0.17    –
T2    1.00   –      1.00   1.00   0.70   0.70   0.70   –      –       0.20
T3    1.00   –      1.00   1.00   0.5    –      0.5    0.5    0.5     –
T4    1.00   1.00   –      1.00   –      0.10   0.20   0.10   0.00    –
T5    1.00   1.00   –      –      0.00   0.20   0.10   0.00   –       –
T6    1.00   –      1.00   1.00   –      0.20   0.15   0.15   0.00    –
T7    –      1.00   1.00   1.00   –      0.10   0.10   0.10   0.10    0.20
T8    1.00   1.00   1.00   –      0.80   0.80   –      0.60   0.40    0.00
T9    1.00   1.00   –      1.00   0.95   0.90   0.95   –      –       0.00
T10   1.00   1.00   –      1.00   0.95   1.00   0.90   –      –       0.80
T11   1.00   1.00   –      1.00   0.10   0.10   –      0.10   0.90    0.80
T12   1.00   1.00   1.00   –      0.10   0.10   0.10   –      0.90    0.80
T13   1.00   –      1.00   1.00   0.10   0.10   0.10   –      0.90    0.10
T14   –      1.00   1.00   1.00   –      0.20   0.20   0.20   0.20    0.10
T15   1.00   1.00   –      1.00   –      0.15   0.10   0.15   –       0.00
T16   1.00   1.00   1.00   –      –      0.20   0.25   0.10   0.10    0.00
T17   1.00   –      1.00   1.00   –      0.10   0.15   0.10   0.10    0.10
T18   –      1.00   1.00   1.00   –      0.25   0.10   0.10   0.00    0.10
T19   1.00   1.00   –      1.00   0.30   0.85   –      0.20   0.00    –
T20   1.00   1.00   –      1.00   –      0.10   0.10   0.20   0.15    –
T21   1.00   1.00   1.00   –      –      0.30   0.20   0.20   0.00    –
T22   1.00   –      1.00   1.00   –      0.10   0.10   0.15   0.00    –
T23   –      1.00   1.00   1.00   –      0.05   0.00   0.10   0.00    –
T24   1.00   1.00   –      1.00   –      0.05   0.00   0.10   –       –
T25   1.00   1.00   1.00   –      –      0.10   0.10   0.10   –       –
T26   1.00   –      1.00   1.00   –      0.10   0.10   0.10   –       –
T27   –      1.00   1.00   1.00   –      0.05   0.10   0.10   –       –
T28   1.00   1.00   –      1.00   0.00   0.00   –      0.00   –       –
T29   1.00   1.00   1.00   –      0.05   0.05   0.05   –      –       –
T30   1.00   –      1.00   1.00   0.00   0.00   0.05   –      –       –
T31   –      1.00   1.00   1.00   –      0.10   0.10   0.00   –       –
T32   1.00   1.00   –      1.00   0.00   0.00   –      0.00   –       –
T33   1.00   1.00   1.00   –      0.00   0.00   0.00   –      –       –
T34   1.00   –      1.00   1.00   0.05   –      0.00   0.05   –       –
T35   –      1.00   –      1.00   –      0.10   0.10   0.10   –       –
T36   1.00   1.00   –      1.00   0.00   0.00   –      0.05   –       –
T37   1.00   1.00   1.00   –      0.95   0.90   1.00   –      –       –
T38   1.00   –      1.00   1.00   0.00   –      0.00   0.00   –       –
T39   1.00   –      1.00   1.00   0.90   –      0.40   0.85   –       –
T40   –      1.00   1.00   1.00   –      0.00   0.00   0.00   –       –
T41   1.00   1.00   –      1.00   0.10   0.10   –      0.10   –       –
T42   1.00   1.00   1.00   –      0.00   0.00   0.00   –      –       –
T43   –      1.00   1.00   1.00   –      0.20   0.20   0.10   –       –
T44   1.00   1.00   1.00   –      0.00   0.00   0.00   –      –       –
T45   1.00   –      1.00   1.00   0.90   –      0.90   0.90   –       –
T46   –      1.00   1.00   1.00   –      0.10   0.10   0.10   –       –
T47   –      1.00   1.00   1.00   –      0.10   0.10   0.10   –       –

Execution. To begin with, execution analysis for tc7 and tc9 is performed. The results are that tc7 executes nine transitions and tc9 executes 11 transitions; thus Ntc7 and Ntc9 can be defined.

Infection. Next, infection analysis is performed. It is based on 60 semantically different mutants distributed among the nine transitions of the initiator, and 44 mutants among the 11 transitions of the responder. a_tc7 and a_tc9 are calculated as follows:

$$a_{tc_7} = 0.11(0.00 + 0.50 + 0.50 + 0.83 + 1.00 + 0.30 + 1.00 + 0.42 + 0.00) = 0.50$$

$$a_{tc_9} = 0.09(0.50 + 1.00 + 1.00 + 0.50 + 0.50 + 0.75 + 0.50 + 0.50 + 0.50 + 0.75 + 0.00) = 0.59$$
Table 8. Infection estimates for the INRES Responder (infection estimate for each mutant at Ti)

Ti    Fd     Fw     Fc     Td     Tw     Tc
T1    –      0.00   0.00   0.67   –      0.00
T2    1.00   –      1.00   0.33   0.33   –
T3    1.00   1.00   –      0.95   1.00   –
T4    1.00   1.00   –      0.13   0.13   –
T5    1.00   1.00   –      0.10   0.00   0.00
T6    1.00   –      1.00   0.13   –      0.33
T7    –      1.00   1.00   –      0.43   0.14
T8    1.00   1.00   –      0.00   –      0.33
T9    1.00   1.00   –      –      0.33   0.22
T10   1.00   –      1.00   –      0.43   0.33
T11   –      1.00   1.00   –      0.33   –
T12   1.00   1.00   –      0.10   0.10   0.00
T13   1.00   –      1.00   0.00   –      0.00
T14   –      1.00   1.00   –      0.00   –
T15   1.00   1.00   –      0.00   0.00   0.10
T16   1.00   –      1.00   0.10   –      0.00
T17   –      1.00   1.00   –      0.00   –
T18   1.00   1.00   –      0.23   0.33   0.87
T19   1.00   –      1.00   0.90   –      0.10
T20   –      1.00   1.00   –      0.00   –
T21   1.00   1.00   –      0.13   1.00   0.00
T22   1.00   –      1.00   0.00   –      0.38
T23   1.00   –      1.00   0.43   –      0.00
T24   –      1.00   1.00   –      0.00   –
Table 9. Propagation estimates for the INRES Initiator (propagation estimate for each perturbed variable at Ti)

Ti    Counter  Number  nsdu.id    Ti    Counter  Number  nsdu.id
T1    0.28     0.19    0.60       T25   0.04     0.04    0.18
T2    0.23     0.23    0.24       T26   0.00     0.03    0.12
T3    0.33     0.13    0.40       T27   0.19     0.10    0.27
T4    0.05     0.00    0.10       T28   0.48     0.57    0.74
T5    0.00     0.10    0.13       T29   0.13     0.08    0.15
T6    0.10     0.10    0.15       T30   0.33     0.13    0.42
T7    0.00     0.08    0.16       T31   0.04     0.04    0.10
T8    0.11     0.40    0.32       T32   0.42     0.50    0.63
T9    0.11     0.86    0.17       T33   0.12     0.08    0.14
T10   0.12     0.42    0.14       T34   0.33     0.13    0.43
T11   0.67     0.43    0.20       T35   0.03     0.00    0.12
T12   0.67     0.43    0.20       T36   0.40     0.47    0.50
T13   0.65     0.42    0.18       T37   0.12     0.08    0.14
T14   0.08     0.00    0.12       T38   0.30     0.12    0.10
T15   0.00     0.16    0.14       T39   0.30     0.12    0.10
T16   0.07     0.06    0.13       T40   0.02     0.04    0.04
T17   0.00     0.04    0.10       T41   0.67     0.43    0.18
T18   0.04     0.04    0.10       T42   0.23     0.33    0.74
T19   0.12     0.02    0.80       T43   0.07     0.04    0.10
T20   0.03     0.03    0.17       T44   0.23     0.33    0.74
T21   0.03     0.03    0.17       T45   0.33     0.06    0.80
T22   0.03     0.03    0.17       T46   0.00     0.02    0.02
T23   0.02     0.00    0.03       T47   0.00     0.02    0.02
T24   0.10     0.04    0.15
Table 10. Propagation estimates for the INRES Responder (propagation estimate for each perturbed variable at Ti)

Ti    Number  nsdu.id    Ti    Number  nsdu.id
T1    0.00    0.00       T13   0.05    0.12
T2    0.05    0.27       T14   0.02    0.05
T3    0.47    0.74       T15   0.03    0.08
T4    0.02    0.74       T16   0.05    0.12
T5    0.00    0.87       T17   0.02    0.05
T6    0.00    0.87       T18   0.03    0.08
T7    0.07    0.14       T19   0.04    0.10
T8    0.03    0.90       T20   0.02    0.05
T9    0.07    0.24       T21   0.00    0.85
T10   0.04    0.10       T22   0.03    0.89
T11   0.04    0.10       T23   0.02    0.01
T12   0.00    0.00       T24   0.02    0.01
Propagation. Propagation analysis is then performed, and b_tc7 and b_tc9 are calculated as follows:

$$b_{tc_7} = 0.11(0.20 + 0.20 + 0.20 + 0.20 + 0.20 + 0.33 + 0.33 + 0.067 + 0.00) = 0.19$$

$$b_{tc_9} = 0.09(0.00 + 0.42 + 0.42 + 0.83 + 0.00 + 0.33 + 0.83 + 0.00 + 0.00 + 0.00) = 0.26$$

Test case values. Therefore, the TCVs for tc7 and tc9 are:

$$TCV_{tc_7} = N_{tc_7} \times a_{tc_7} \times b_{tc_7} = 9 \times 0.50 \times 0.19 = 0.86$$

$$TCV_{tc_9} = N_{tc_9} \times a_{tc_9} \times b_{tc_9} = 11 \times 0.59 \times 0.26 = 1.69$$

The results for all the TCVs for both the initiator and the responder are shown in Table 11.

Table 11. Test case values

INRES Initiator
tcj    Ntcj   atcj   btcj   TCVtcj
tc1    3      0.55   0.20   0.33
tc2    3      0.39   0.16   0.19
tc3    3      0.26   0.29   0.23
tc4    4      0.57   0.10   0.23
tc5    4      0.35   0.45   0.63
tc6    3      0.47   0.21   0.30
tc7    9      0.50   0.19   0.86
tc8    7      0.49   0.16   0.55
tc9    5      0.20   0.38   0.38
tc10   7      0.60   0.16   0.67
tc11   5      0.48   0.16   0.38
tc12   9      0.45   0.19   0.77

INRES Responder
tcj    Ntcj   atcj   btcj   TCVtcj
tc1    1      0.38   0.1    0.04
tc2    1      0.01   0.02   0.00
tc3    1      0.34   0.07   0.02
tc4    1      0.01   0.04   0.00
tc5    1      0.01   0.04   0.00
tc6    1      0.06   0.04   0.00
tc7    6      0.42   0.04   0.10
tc8    6      0.46   0.04   0.11
tc9    11     0.59   0.26   1.69
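As a sketch of how the TCVs support test case ranking, the following Python fragment reproduces the top of Table 11's initiator ranking; the dictionary is transcribed from the table:

```python
# Ranking test cases by TCV (Table 11, INRES Initiator). With a limited
# testing budget, a tester would keep the highest-valued cases first.

tcv = {"tc1": 0.33, "tc2": 0.19, "tc3": 0.23, "tc4": 0.23,
       "tc5": 0.63, "tc6": 0.30, "tc7": 0.86, "tc8": 0.55,
       "tc9": 0.38, "tc10": 0.67, "tc11": 0.38, "tc12": 0.77}

ranking = sorted(tcv, key=tcv.get, reverse=True)
print(ranking[:3])   # ['tc7', 'tc12', 'tc10'] -- tc7 ranks highest
```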
6. Conclusions

One of the main problems in industrial testing is the enormous number of test cases derived from any complex communication protocol. Due to budget constraints and tight schedules, the number of test cases has to be kept within a certain limit. However, imposing a limit on the number of test cases raises some issues. For instance, what criteria should be used for selecting the test cases? How can we ensure that important test cases have not been excluded? What technique should be used for designing an optimal test suite? A good test case is one that has a high probability of finding undiscovered errors.

In this paper, we have proposed that assigning a value to each of the test cases of a test suite can provide a solution to the problem. By having test case values, the relative importance of each of the test cases of a test suite can be ranked; an optimal test suite can be designed; and priority can then be given to testing those cases with higher values, should time run out. Our method is based on sensitivity analysis, which involves execution, infection and propagation analysis. It helps select test cases that are likely to produce failures if faults exist, and it helps quantify the fault-revealing power of a test case. In order to assign a value to a test case, the three analyses are used to provide a set of execution, infection and propagation probabilities with respect to a test case. From there, we define a test case value as the combination of this set of three probabilities with the number of transitions executed.

To demonstrate the usefulness of our method, it has been applied to the INRES protocol. The results show that, of the test cases generated from the INRES Initiator, tc7 has the highest TCV and is therefore the most likely to produce a failure if faults exist; in the case of the Responder, it is tc9. On inspecting the specifications, the test cases tc7 and tc9 involve the largest numbers of transitions. The results are therefore consistent with what we would expect when a real implementation is tested, simply because the more complex the specification is, the more likely an implementation is to contain a fault.

To conclude, our method for assigning a value to a test case is able to rank the relative importance of each of the test cases of a test suite from the point of view of finding undiscovered errors, if they exist. Two tests having a similarly high TCV may cover a very similar set of faults. Testing is the least understood part of software development and still relies much on the experience and wisdom of a competent tester. Even with TCVs produced for the test cases of a test suite, testers still need to exercise their judgement to decide whether they agree with the rankings. However, without the TCVs, the testers would have no idea whether two cases are of similar importance. At the least, this work will enable testers to quickly prioritise the test cases, and to judge whether
R. Lai, Y.S. Kim / Information and Software Technology 48 (2006) 645–659
two cases having similar values may cover a similar set of faults.
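As a small illustration of this prioritisation, the Responder values from Table 11 can be ordered so that the highest-value cases are executed first when time runs out. Only the sort-by-TCV idea comes from our method; the data layout below is ours.

    # Illustrative only: rank the INRES Responder test cases of Table 11
    # by TCV so that, under a time budget, the highest-value cases run first.
    responder_tcv = {"tc1": 0.04, "tc2": 0.00, "tc3": 0.02, "tc4": 0.00,
                     "tc5": 0.00, "tc6": 0.00, "tc7": 0.10, "tc8": 0.11,
                     "tc9": 1.69}
    ranked = sorted(responder_tcv, key=responder_tcv.get, reverse=True)
    print(ranked[:3])  # ['tc9', 'tc8', 'tc7'] are tested first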
Acknowledgements

The assistance of A. Saekow in carrying out this research is hereby acknowledged.
References

[1] D.P. Sidhu, T.K. Leung, Formal methods for protocol testing: a detailed study, IEEE Transactions on Software Engineering 15 (4) (1989).
[2] Recommendation Z.100, Specification and Description Language SDL, Com X-R15-E 100 (1987).
[3] ISO/IEC 9074, Information Processing Systems—Open Systems Interconnection—ESTELLE—A Formal Description Technique Based on an Extended State Transition Model, 1989.
[4] ISO/IEC 8807, Information Processing Systems—Open Systems Interconnection—LOTOS—A Formal Description Technique Based on the Temporal Ordering of Observational Behavior, 1989.
[5] D.Y. Lee, J.Y. Lee, A well-defined Estelle specification for the automatic test generation, IEEE Transactions on Computers 40 (4) (1991).
[6] S. Fujiwara, G.V. Bochmann, F. Khendek, M. Amalou, A. Ghedamsi, Test selection based on finite state models, IEEE Transactions on Software Engineering 17 (6) (1991).
[7] B. Forghani, B. Sarikaya, Automatic dynamic behaviour generation in TTCN format from Estelle specifications, in: Proceedings of the Second International Workshop on Protocol Test Systems, 1989.
[8] D. Rayner, Standardizing conformance testing for OSI, Computer Networks and ISDN Systems 14 (1) (1987).
[9] CCITT Recommendation X.290—ISO/IEC 9646-1, Information Technology—Open Systems Interconnection—Conformance Testing Methodology and Framework, Part 1: General Concepts; Part 2: Abstract Test Suite Specification; Part 3: The Tree and Tabular Combined Notation; Part 4: Test Realization; Part 5: Requirements on Test Laboratories and Clients for the Conformance Assessment Process, 1991.
[10] R. Lai, W. Leung, Industrial and academic protocol testing: the gap and the means of convergence, Computer Networks and ISDN Systems 27 (4) (1995) 537–547.
[11] J. Voas, PIE: a dynamic failure-based technique, IEEE Transactions on Software Engineering 18 (8) (1992) 717–727.
[12] J. Voas, L. Morell, K. Miller, Predicting where faults can hide from testing, IEEE Software 8 (2) (1991).
[13] J. Voas, K. Miller, J. Payne, PISCES: a tool for predicting software testability, in: Proceedings of the Symposium on Assessment of Quality Software Development Tools, IEEE Computer Society, New Orleans, LA, 1992, pp. 297–309.
[14] B. Sarikaya, B. Forghani, S. Eswara, Estelle-based test generation tool, Computer Communications 14 (9) (1991) 534–544.
[15] J. Voas, K. Miller, The revealing power of a test case, Journal of Software Testing, Verification, and Reliability, Wiley, New York, 1992, pp. 25–42.
[16] L. Morell, A model for code-based testing schemes, in: Proceedings of the Fifth Annual Pacific Northwest Software Quality Conference, 1987.
[17] L. Morell, Theoretical insights into fault-based testing, in: Proceedings of the Second Workshop on Software Testing, Validation, and Analysis, 1988.
[18] L. Morell, A theory of fault-based testing, IEEE Transactions on Software Engineering (1990).
[19] A. Offutt, The coupling effect: fact or fiction?, in: Proceedings of the ACM SIGSOFT Third Symposium on Software Testing, Analysis, and Verification, 1989.
[20] R. DeMillo, R. Lipton, F. Sayward, Hints on test data selection: help for the practising programmer, IEEE Computer (1978).
[21] W. Howden, Weak mutation testing and completeness of test sets, IEEE Transactions on Software Engineering (1982).
[22] R. Probert, G. Fuyin, Mutation testing of protocols: principles and preliminary experimental results, in: Protocol Test Systems, 1991.
[23] R.M. Hierons, T.H. Kim, H. Ural, On the testability of SDL specifications, Computer Networks (2004).
[24] B. Sarikaya, Principles of Protocol Engineering and Conformance Testing, Ellis Horwood, Chichester, New York, 1993.
[25] A.T. Dahbura, K. Sabnani, An experience in estimating fault coverage of a protocol test, in: Proceedings of IEEE INFOCOM, IEEE Computer Society Press, Piscataway, NJ, 1988, pp. 71–79.
[26] K. Sabnani, A. Dahbura, A protocol test generation procedure, Computer Networks and ISDN Systems 15 (1988) 285–297.
[27] A.V. Aho, A.T. Dahbura, D. Lee, M.U. Uyar, An optimization technique for protocol conformance test generation based on UIO sequences and rural Chinese Postman tours, in: S. Aggarwal, K.K. Sabnani (Eds.), Protocol Specification, Testing, and Verification VIII, North-Holland, New York, 1988.
[28] Y.N. Shen, F. Lombardi, A.T. Dahbura, Protocol conformance testing using multiple UIO sequences, in: E. Brinksma, G. Scollo, C.A. Vissers (Eds.), Protocol Specification, Testing and Verification IX, North-Holland, New York, 1990.
[29] F. Lombardi, Y.-N. Shen, Evaluation and improvement of fault coverage of conformance testing by UIO sequences, IEEE Transactions on Communications 40 (8) (1992).
[30] H. Motteler, A. Chung, D. Sidhu, Fault coverage of UIO-based methods for protocol testing, in: O. Rafiq (Ed.), Protocol Test Systems VI (C-19), North-Holland, New York, 1994.
[31] S.T. Vuong, J.A. Curgus, On test coverage metrics for communication protocols, in: J. Kroon, R.J. Heijink, E. Brinksma (Eds.), Protocol Test Systems IV, North-Holland, New York, 1992.
[32] M. McAllister, S.T. Vuong, J. Alilovic-Curgus, Automated test case selection based on test coverage metrics, in: G.V. Bochmann, R. Dssouli, A. Das (Eds.), Protocol Test Systems V, North-Holland, New York, 1993.
[33] J. Alilovic-Curgus, S.T. Vuong, A metric-based theory of test selection and coverage, in: Protocol Test Systems V, North-Holland, New York, 1993.
[34] G.V. Bochmann, A. Das, R. Dssouli, M. Dubuc, A. Ghedamsi, G. Luo, Fault models in testing, in: J. Kroon, R.J. Heijink, E. Brinksma (Eds.), Protocol Test Systems IV, North-Holland, New York, 1992.
[35] T.M. Khoshgoftaar, E.B. Allen, K.S. Kalaichelvan, N. Goel, Early quality prediction: a case study in telecommunications, IEEE Software (1996).
[36] M. Benjamin, D. Geist, A. Hartman, G. Mas, R. Smeets, Y. Wolfsthal, A study in coverage-driven test generation, in: Proceedings of the 36th ACM/IEEE Conference on Design Automation, IEEE Computer Society Press, 1999.
[37] B. Koch, J. Grabowski, D. Hogrefe, M. Schmitt, Autolink: a tool for automatic test generation from SDL specifications, in: Second IEEE Workshop on Industrial Strength Formal Specification Techniques, IEEE Computer Society Press, 1998.
[38] L. du Bousquet, S. Ramangalahy, S. Simon, C. Viho, A. Belinfante, R.G. de Vries, Formal test automation: the conference protocol with TGV/TORX, in: Proceedings of the IFIP TC6/WG6.1 13th International Conference on Testing Communicating Systems: Tools and Techniques, Kluwer, 2000.
[39] Y. Ledru, L. du Bousquet, P. Bontron, O. Maury, C. Oriat, M.-L. Potet, Test purposes: adapting the notion of specification to testing, in: Proceedings of the 16th IEEE International Conference on Automated Software Engineering, IEEE Computer Society Press, 2001.
[40] S. Budkowski, Estelle development toolset (EDT), Computer Networks and ISDN Systems 25 (1) (1992).