Testing input validation in Web applications through automated model recovery


Available online at www.sciencedirect.com

The Journal of Systems and Software 81 (2008) 222–233 www.elsevier.com/locate/jss

Hui Liu *, Hee Beng Kuan Tan
School of Electrical and Electronic Engineering, Block S2, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
Available online 25 May 2007

Abstract

Input validation is essential and critical in Web applications. It is the enforcement of constraints that any input must satisfy before it is accepted to raise external effects. We have discovered some empirical properties for characterizing input validation in Web applications. In this paper, we propose an approach for the automated recovery of an input validation model from program source code. The recovered model is represented in a variant of the control flow graph, called the validation flow graph, which shows the essential input validation features implemented in programs. Based on the model, we then formulate two coverage criteria for testing input validation. The two criteria can be used to guide the structural testing of input validation in Web applications. We have evaluated the proposed approach through case studies and experiments. © 2007 Published by Elsevier Inc.

1. Introduction

Most Web applications process input submitted from their external environment to raise external effects. For example, many of these applications receive input submitted by users to update the database they maintain. They enforce that any input submitted from their external environment must satisfy the required constraints before it is accepted to raise external effects; input that violates the constraints is rejected and no external effect is raised. This enforcement is called input validation. Input validation plays a key role in the control and accuracy of inputs submitted to a system. It is essential and critical in Web applications. As stated in the design guidelines for secure Web applications in Microsoft’s online resource for developers (MSDN, 2006), proper input validation is one of the strongest measures of defense against today’s application attacks. Input validation is a challenging issue; however, the design and implementation of input

* Corresponding author. E-mail addresses: [email protected] (H. Liu), [email protected] (H.B. Kuan Tan). 0164-1212/$ - see front matter © 2007 Published by Elsevier Inc. doi:10.1016/j.jss.2007.05.007

validation are carried out by a large population of application developers rather than by a small, highly specialized group of software experts. Therefore, having an effective method for testing input validation is important for software quality assurance. Recently, some techniques have been proposed for testing Web applications. Of these techniques, only the one proposed by Offutt et al. (2004) focuses on the testing of input validation. In their approach, test cases with invalid inputs are created by bypassing client-side validation checking, in order to test the input validation implemented in Web applications. However, the technique is based purely on the syntax of the client-side user interface and does not analyze any server-side programs. Hence, we propose in this paper an approach for testing input validation in Web server programs, which provides a more thorough and adequate way to test input validation in Web applications. It is an enhancement and extension of our previous work (Liu and Tan, 2006). In this paper, we introduce an input validation model and some empirical properties for characterizing input validation. The input validation model is represented in a variant of the control flow graph, called the validation flow graph, which shows essential input validation features

H. Liu, H.B. Kuan Tan / The Journal of Systems and Software 81 (2008) 222–233

implemented in programs. Based on that, we formulate two coverage criteria for testing input validation in Web applications. The proposed approach aims to provide systematic structural testing of the implementation of input validation in Web applications. It is a two-stage approach: first, the input validation model is automatically recovered from program source code; next, the two coverage criteria are used to guide the testing of input validation. The paper is organized as follows. Section 2 introduces the input validation model. Section 3 discusses the coverage criteria for testing input validation. Section 4 presents the automated recovery of the input validation model and the techniques for testing input validation. Section 5 reports our evaluation of the proposed approach. Section 6 compares our work with related work. Finally, Section 7 concludes the paper.

2. The input validation model

The analysis of input validation implemented in a program is based on the control flow graph of the program. We adopt the formalism of control flow graphs from Sinha et al. (2001). A control flow graph (CFG) of a program P is a directed graph G = (N, E), in which N is a set of nodes and E = {(n, m) | n, m ∈ N} is a set of edges connecting the nodes. Each node in G represents one statement in P, and each edge in G represents a possible flow of control between two statements in P. A CFG has an entry node n_entry and an exit node n_exit, representing entry to and exit from P, respectively. Let v and w be nodes in a CFG such that w ≠ v. Node v dominates node w if and only if every directed path from the entry node to w contains v. Node v postdominates node w if and only if every directed path from w to the exit node contains v. Node v is control-dependent on node w if and only if w has successors w' and w'' such that v postdominates w' but v does not postdominate w''. For example, in the CFG shown in Fig. 1b, node 4 is control-dependent on node 2 and node 8 is control-dependent on node 6.

Let G be the CFG of a program. A node in G that raises effects in its external environment is called an effect node. A node v in G such that an input submitted by the user is accessible at v, and v dominates all nodes w ≠ v at which the input is also accessible, is called an input node. For example, in a Web database application, once an input is submitted through a form, a server program is executed. In the CFG of a server program that processes input submitted by users, the entry node is an input node, as any input submitted is accessible at each node in the CFG. All the nodes that update the database maintained for the application are effect nodes. In this paper, we are concerned only with effects in a program that are influenced by the input accessed; hence, the term “effect” always refers to an external effect that is influenced by inputs accessed in a program. Let t be an input node in the CFG of a program. In an execution of the program, the input submitted at t is said to


be accepted if, after the execution, there is an effect raised by the program. Otherwise, the input is said to be rejected. Note that if all the effects raised during the execution are subsequently removed in the execution, then after the execution there is no effect raised. Next, we introduce a variant of the control flow graph, called the validation flow graph, which provides a higher-level view of the input validation feature implemented in a program. The formal definition of the validation flow graph is as follows. Let G = (N, E) be the CFG of a program, where N and E are its sets of nodes and edges, respectively. Let N' be the subset of nodes in G that have one of the following properties:

(1) Entry node n_entry and exit node n_exit.
(2) All input nodes.
(3) All effect nodes.
(4) Any node n in G such that a node in N' is control-dependent on n.
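The four rules above amount to a fixed-point computation over the control-dependence relation. The following Python sketch illustrates this on the AddtoCart example, taking the entry node as the input node and node 15 as the effect node. It is only an illustration under stated assumptions: the CFG edges are reconstructed from the example paths the paper lists for Fig. 1b (the figure itself is not reproduced here), postdominance is treated as reflexive so that the paper's examples (node 4 on node 2, node 8 on node 6) come out as stated, and all function and variable names are hypothetical.

```python
# CFG of AddtoCart: node -> successors. ASSUMPTION: edges reconstructed from
# the example paths the paper lists for Fig. 1b; the figure is not shown here.
CFG = {
    "entry": [1], 1: [2], 2: [3, 4], 3: ["exit"], 4: [5], 5: [6],
    6: [7, 8], 7: ["exit"], 8: [9, "exit"], 9: [10], 10: [11],
    11: [12, 13], 12: [13], 13: [8, 14], 14: [15], 15: [8], "exit": [],
}
INPUT_NODES = {"entry"}   # the entry node is the input node
EFFECT_NODES = {15}       # node 15 is the only effect node


def postdominators(cfg, exit_node="exit"):
    """Reflexive postdominator sets, computed by iterating to a fixed point."""
    nodes = set(cfg)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            succ_sets = [pdom[s] for s in cfg[n]]
            new = {n} | (set.intersection(*succ_sets) if succ_sets else set())
            if new != pdom[n]:
                pdom[n], changed = new, True
    return pdom


def control_dependence(cfg, pdom):
    """Map each branching node w to the nodes v (v != w) control-dependent on w."""
    deps = {}
    for w, succs in cfg.items():
        if len(succs) < 2:
            continue
        for v in cfg:
            if v == w:
                continue
            hits = [v in pdom[s] for s in succs]
            # v postdominates one successor of w but not another
            if any(hits) and not all(hits):
                deps.setdefault(w, set()).add(v)
    return deps


def vfg_nodes(input_nodes, effect_nodes, deps):
    """Apply rules (1)-(4): close the seed set under control dependence."""
    n_prime = {"entry", "exit"} | set(input_nodes) | set(effect_nodes)
    changed = True
    while changed:
        changed = False
        for w, dependents in deps.items():
            if w not in n_prime and dependents & n_prime:
                n_prime.add(w)
                changed = True
    return n_prime


PDOM = postdominators(CFG)
DEPS = control_dependence(CFG, PDOM)
N_PRIME = vfg_nodes(INPUT_NODES, EFFECT_NODES, DEPS)
print(N_PRIME)
```

On this reconstructed graph the closure adds nodes 13, 8, 6 and 2 to the seed set, while node 11 stays out because only node 12 is control-dependent on it.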

Let E' be the set of edges that connect the nodes in N'. For each unordered pair of nodes (n, m) in N', if there is a path in G from n to m without passing through another node in N', an edge (n, m) is included in E'. Formally, we define E' = {(n, m) | n, m ∈ N', n ≠ m, and there exists a path (n, n_1, ..., n_k, m) in G such that k ≥ 0 and, for all j with 1 ≤ j ≤ k, n_j ∉ N'}. Then the flow graph H = (N', E') is called the validation flow graph (VFG) of the program, where N' and E' are its sets of nodes and edges, respectively. The following property holds in a validation flow graph H: for each path (n_0 = n_entry, n_1, ..., n_{k-1}, n_k = n_exit) in H, there is always a path in G that passes through all n_j (0 ≤ j ≤ k), obtained as follows: for each 0 ≤ j ≤ k - 1, if (n_j, n_{j+1}) is an edge in G, follow the edge; otherwise, follow any path from n_j to n_{j+1} in G.

Fig. 1a shows a program AddtoCart written in JavaServer Pages for an online shopping system. The program is executed when a user submits a request to add selected items to the shopping cart. The source code of the program is simplified for illustration purposes. Fig. 1b shows the CFG of the program. As shown in Fig. 1b, in the CFG of the AddtoCart program, the entry node n_entry is an input node, and node 15 is the only effect node. Since node 15 is control-dependent on node 13, node 13 should be included in the VFG of the program. Through control dependency analysis, it is easy to verify that nodes 8, 6, and 2 should also be included in the VFG. Although at node 11 the program checks whether the variable count can be converted to an integer, this check does not control the execution of node 15; thus node 11 should not be included in the VFG of the program. The VFG of the program is shown in Fig. 1c.

Let t be an input node of a program. A path in a CFG or VFG of the program is called an input path of t if it satisfies the following two conditions:


Fig. 1. An AddtoCart program in an online shopping system. (a) Simplified source code of the program. (b) The CFG of the program. (c) The VFG of the program.

(1) It contains the input node t.
(2) If it passes through a loop structure, it enters the loop exactly once (i.e., it neither exits the loop right away nor iterates the loop).

As defined by Tai (1996), a simple predicate is a Boolean variable or a relational expression, possibly with one or more NOT operators. In general, a predicate node in a flow graph consists of a number of simple predicates composed by Boolean operators. To characterize the control structure of the input validation implemented in a program, we next define the validation node through the identification of candidate validation nodes in a program.

Let H be the VFG of a program P and let t be an input node in H. A predicate node d in H such that there is an effect node that is control-dependent on d is called a candidate validation node (c-validation node) of t. Further, a predicate node d in H such that a c-validation node of t is control-dependent on d is also called a c-validation node of t. A c-validation node d is called a validation node of t if the following two properties hold:

(1) d defines a selection structure.
(2) There is an input path in H that contains d and does not pass through any effect node.

Note that, by this definition, node d cannot define a loop structure such as a “for” loop or a “while” loop. For example, as discussed earlier, Fig. 1c is the VFG of the program shown in Fig. 1a, in which the


entry node is an input node and node 15 is an effect node. In the VFG, it can be easily verified that nodes 13, 6 and 2 are validation nodes. However, node 8 is not a validation node. Next, we present the two empirical properties we have discovered for characterizing input validation. The first property states a necessary condition for implementing input validation in a program.

Property 1. Let t be an input node of a program P. If input submitted at t is validated in P, then there exists at least one validation node of t.

The second property partitions the set of input paths through the CFG of a program according to whether they accept or reject input submitted by the user.

Property 2. Let t be an input node of a program P. Let K be the set of input paths of t through the CFG of P such that they pass through the same set of validation nodes of t and follow the same branch at each of these validation nodes. It is highly probable that there are only two cases: (1) Any input submitted at t through executing each path in K is accepted. In this case, K is called an acceptance class of t. (2) Any input submitted at t through executing each path in K is rejected. In this case, K is called a rejection class of t.

2.1. Hypothesis testing

We have conducted hypothesis testing to statistically validate Properties 1 and 2. The testing is based on the binomial test (Montgomery and Runger, 2003). For the hypothesis testing of each of the two properties, the null hypothesis H0 states that the property holds for less than 99% of the cases, and the alternative hypothesis H1 states that the property holds for 99% or more of the cases. The binomial test statistic z is computed as follows:

$$ z = \frac{X/n - p}{\sqrt{p(1 - p)/n}} \qquad (1) $$

where n is the sample size for the test, X is the number of cases that support the alternative hypothesis H1, and p is the hypothesized value for the proportion of cases in the population that support H1 (in our test, p = 0.99). Choosing 0.05 as the type I error probability, we get the critical z-score 1.645. Hence, if the z-score calculated from a sample is greater than 1.645, we reject H0; otherwise, we do not reject H0. The sample for the testing was drawn from eight open source systems downloaded from Sourceforge (2006) and China Webmaster (2006) and eight systems developed by senior computer science students. The details of the sample drawn are listed in Table 1. We analyzed all the programs in these systems through the use of part of the prototype tool discussed in Section 5, and collected the programs that


Table 1
Samples for hypothesis testing

No.  System                        LOC      # Total programs  # Input programs
1    Roomba (Sourceforge)          4212     59                16
2    JavaLibrary (Sourceforge)     12180    79                16
3    Smacs (Sourceforge)           13462    209               24
4    Bugtracker (Sourceforge)      1755     25                13
5    JspShop (Webmaster)           8131     93                29
6    ART (Sourceforge)             12841    88                33
7    NMS (Webmaster)               3617     48                12
8    StudentRecord (Webmaster)     2847     41                14
9    OnlineMarket                  9286     96                18
10   SubjectRegister               6854     45                13
11   FriendsMatch                  14365    126               23
12   EasyBooking                   4194     57                15
13   TransportClaim                2528     34                12
14   ResaleOnline                  11540    104               27
15   PersonalBlog                  1639     21                9
16   FlowerShop                    2082     36                11
     Total                         111533   1161              285

access inputs submitted by the user to raise external effects as the sample cases for testing the two properties. In Table 1, the first eight systems are open source systems and the rest were developed by students. The column # Total programs shows the total number of programs in each system, and the column # Input programs indicates the number of programs in each system that access input submitted by the user to raise external effects. The figures exclude the supporting libraries used in each system. There are in total 285 input programs in these systems. Each of these programs forms a case in our sample for testing Properties 1 and 2. As the open source systems are presumed to have been tested and the systems developed by students are at different phases of testing, our sample contains programs in various stages of development. Through careful examination of each case, we confirmed that all the cases gave affirmative support for the two properties. Applying formula (1) with p = 0.99 and X = n = 285 gives z = (1 - 0.99) / sqrt(0.99 × 0.01 / 285) ≈ 1.697, which exceeds 1.645; hence, we conclude that both Properties 1 and 2 hold for 99% or more of the cases at the 5% level of significance. Although both properties hold for all the cases in the sample, hypothesis testing is still appropriate because the two properties do not necessarily hold. Theoretically, a program can implement input validation in an infinite number of ways. For example, Figs. 2 and 3 show two unusual programs written in pseudocode and their respective control flow graphs. In Fig. 2, node 1 is an input node, and the program accepts the input variable x if x > 0. The input submitted at node 1 is validated; however, there is no validation node in the program, as node 3 defines a loop structure. Therefore, it violates Property 1. In Fig. 3, node 1 is an input node, and the program accepts the input variable x only if 0 < x ≤ 20. In this program, node 2 is the only validation node of node 1. Note that the paths (n_entry, 1, 2, 3, 6, 7,


Fig. 2. An unusual program A. (a) Pseudocode of the program and (b) The CFG of the program.

Fig. 3. An unusual program B. (a) Pseudocode of the program and (b) the CFG of the program.

n_exit) and (n_entry, 1, 2, 3, 4, 5, 7, n_exit) both contain node 1, pass through node 2 and follow the same branch at node 2; hence, they belong to the same class. However, any input submitted at node 1 through executing the path (n_entry, 1, 2, 3, 6, 7, n_exit) is accepted, and any input submitted at node 1 through executing the path (n_entry, 1, 2, 3, 4, 5, 7, n_exit) is rejected because the effect raised at node 4 is cancelled at node 5. Therefore, it violates Property 2. Nevertheless, the two examples do not contradict our statement that Properties 1 and 2 hold empirically, because real programs are not usually written in these ways. Property 1 is based on the fact that a program should properly process both expected and unexpected input, so that only valid input is accepted for raising effects while invalid input is rejected. Clearly, we can use Property 1 to infer whether input submitted to a program is not validated. Based on Property 2, we can divide the set of input paths of an input node t through the CFG of a program into different classes according to the set of validation nodes of t they pass through and the branch traversed at each of these nodes. This partition is called the validation node-based

path partition (VN-based path partition) of t. The set of validation nodes and the branch at each of these nodes contained in an input path of t through the CFG of a program can be extracted to represent one class in the partition, and it is called a VN-based path partition criterion. Clearly, the set of VN-based path partition criteria can be obtained by traversing the input paths of t through the VFG of a program. As discussed earlier, Fig. 1b and c are the CFG and VFG of the AddtoCart program shown in Fig. 1a, in which n_entry is the input node, and nodes 2, 6 and 13 are validation nodes. It can be easily verified from the VFG of the program that there are four VN-based path partition criteria, each of which represents one class in the partition: (2T) represents a rejection class that rejects the input submitted by the user if the user is not logged in; (2F, 6T) represents a rejection class that rejects the input submitted if the user is logged in but the product ID or quantity has an invalid value; (2F, 6F, 13F) represents a rejection class that rejects the input submitted if the user is logged in, the product ID and quantity are not null, but the product ID does not exist in the database; and (2F, 6F, 13T) represents an acceptance class that accepts the input submitted by the user if the user is logged in, the product ID and quantity are not null, and the product ID exists in the database. It can be further verified from the CFG of the program that there are four classes in the VN-based path partition, each of which corresponds to one of the VN-based path partition criteria: {(n_entry, 1, 2, 3, n_exit)}, {(n_entry, 1, 2, 4, 5, 6, 7, n_exit)}, {(n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 8, n_exit), (n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 13, 8, n_exit)}, and {(n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 8, n_exit), (n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 8, n_exit)}.
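The derivation of the four criteria can be sketched in Python as a traversal of the input paths through the VFG. This is a minimal sketch under assumptions, not the authors' implementation: the VFG edges are derived from the CFG paths listed above (Fig. 1c is not reproduced here), the T/F labels at nodes 2, 6 and 13 follow the criteria just described, the labels "body" and "leave" at the loop node 8 are hypothetical, and all identifiers are illustrative.

```python
# VFG of AddtoCart: node -> list of (successor, branch label).
# ASSUMPTION: edges derived from the CFG paths listed in the text.
VFG = {
    "entry": [(2, None)],
    2: [("exit", "T"), (6, "F")],
    6: [("exit", "T"), (8, "F")],
    8: [(13, "body"), ("exit", "leave")],   # loop node; labels are hypothetical
    13: [(15, "T"), (8, "F")],
    15: [(8, None)],
    "exit": [],
}
LOOP_HEADERS = {8}
VALIDATION_NODES = {2, 6, 13}
EFFECT_NODES = {15}


def input_paths(vfg, node="entry", entered=frozenset()):
    """Yield entry-to-exit paths, as lists of (node, branch), that enter
    every loop exactly once (no immediate exit, no second iteration)."""
    if node == "exit":
        yield [("exit", None)]
        return
    for target, label in vfg[node]:
        if node in LOOP_HEADERS:
            # First visit must take the loop body; a later visit must leave.
            if (label == "body") == (node in entered):
                continue
        seen = entered | {node} if node in LOOP_HEADERS else entered
        for rest in input_paths(vfg, target, seen):
            yield [(node, label)] + rest


def vn_partition(vfg):
    """Map each VN-based path partition criterion to its class kind."""
    classes = {}
    for path in input_paths(vfg):
        criterion = tuple(f"{n}{b}" for n, b in path if n in VALIDATION_NODES)
        accepted = any(n in EFFECT_NODES for n, _ in path)
        classes[criterion] = "acceptance" if accepted else "rejection"
    return classes


PARTITION = vn_partition(VFG)
for criterion, kind in PARTITION.items():
    print(criterion, kind)
```

A path is classified as belonging to an acceptance class when it contains an effect node and to a rejection class otherwise, which is the rule the paper applies when extracting criteria from input paths.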

3. Coverage criteria for testing input validation

The objective of testing input validation is to test the constraints implemented in a program for accepting or rejecting input submitted by the user, so as to ensure the accuracy of the input validation implemented in the program. Based on the input validation model, we introduce in this section two input validation coverage (IVC) criteria for the testing of input validation. The VN-based path partition of an input node t divides the input paths of t through the CFG of a program into different classes. Each class can be further classified as an acceptance class or a rejection class. Input submitted at t through executing any path in an acceptance class is accepted, while input submitted at t through executing any path in a rejection class is rejected. Thus, each acceptance class implements one way to accept input submitted at t, and each rejection class implements one way to reject input submitted at t. Based on this, we propose the path-based IVC for input submitted at t:

Path-based IVC: At least one path in each class in the VN-based path partition of t has been exercised.
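One way to check a test suite against the path-based IVC is to map every exercised CFG path to its VN-based path partition criterion and count the classes reached. The Python sketch below does this for the four classes of the AddtoCart example. Reporting the result as a ratio of covered classes is an assumption made here for illustration, as are the successor-to-branch table (inferred from the listed paths) and all identifiers.

```python
# Branch label taken at each validation node, keyed by the CFG successor
# that follows it. ASSUMPTION: inferred from the AddtoCart paths in the text.
BRANCH = {
    2: {3: "T", 4: "F"},
    6: {7: "T", 8: "F"},
    13: {14: "T", 8: "F"},
}

# The four VN-based path partition criteria of the AddtoCart example.
ALL_CRITERIA = {
    ("2T",),
    ("2F", "6T"),
    ("2F", "6F", "13F"),
    ("2F", "6F", "13T"),
}


def criterion_of(path):
    """Extract the sequence of (validation node, branch) from a CFG path."""
    criterion = []
    for node, successor in zip(path, path[1:]):
        if node in BRANCH:
            criterion.append(f"{node}{BRANCH[node][successor]}")
    return tuple(criterion)


def path_based_ivc(executed_paths):
    """Fraction of partition classes exercised by at least one path."""
    covered = {criterion_of(p) for p in executed_paths} & ALL_CRITERIA
    return len(covered) / len(ALL_CRITERIA)


# One path from each of the four classes.
SUITE = [
    ["entry", 1, 2, 3, "exit"],
    ["entry", 1, 2, 4, 5, 6, 7, "exit"],
    ["entry", 1, 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 8, "exit"],
    ["entry", 1, 2, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 8, "exit"],
]
print(path_based_ivc(SUITE))       # all four classes are exercised
print(path_based_ivc(SUITE[:2]))   # only the first two rejection classes
```

Two paths from the same class count once, so adding the second path of an already covered class does not raise the measure.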


As discussed earlier, in the CFG of the AddtoCart program shown in Fig. 1b, there are four classes of paths in the VN-based path partition. To satisfy the path-based IVC, at least one path needs to be selected from each of the four classes. For example, one test suite can consist of the following four paths: (n_entry, 1, 2, 3, n_exit), (n_entry, 1, 2, 4, 5, 6, 7, n_exit), (n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 12, 13, 8, n_exit), and (n_entry, 1, 2, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 8, n_exit). A validation node in a program may consist of a number of simple predicates composed by Boolean operators; however, the path-based IVC does not consider the complexity of the simple predicates that constitute a validation node. Therefore, for many safety-critical Web applications in which the complexity of the simple predicates in a validation node is of concern, testing input validation based on the path-based IVC alone may not be sufficient. The condition-based IVC is proposed next to address the testing of the simple predicates in each validation node of an input node t.

Condition-based IVC: All possible combinations of the outcomes of the simple predicates within each validation node of t have been taken at least once.

If there are n simple predicates in a validation node, the number of test cases required to satisfy the condition-based IVC for testing the node is 2^n. In Web applications, the number of simple predicates in a validation node is usually small; thus the condition-based IVC is feasible. For example, to test input validation in the AddtoCart program shown in Fig. 1a, the condition-based IVC aims to cover the following two conditions at node 2: “(username == null) = true” and “(username == null) = false”, and the following four combinations of conditions at node 6: “(prodID == null) = true AND (prodCount == null) = true”, “(prodID == null) = true AND (prodCount == null) = false”, “(prodID == null) = false AND (prodCount == null) = true” and “(prodID == null) = false AND (prodCount == null) = false”. The two input validation coverage criteria provide the basis for testing input validation in Web applications, where they can be used to guide test case selection. The choice between the criteria depends on the time and resources available. In addition, the two coverage criteria can also be used to measure the adequacy of any test suite in testing input validation.
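The 2^n combinations required by the condition-based IVC can be enumerated mechanically. The Python sketch below lists them for nodes 2 and 6 of the AddtoCart example, together with the branch each combination would exercise. The simple predicates are those quoted above; composing the two predicates of node 6 with OR is an assumption (Fig. 1a is not reproduced here), and the identifiers are illustrative.

```python
from itertools import product

# Simple predicates of each validation node and the Boolean composition that
# decides the branch. ASSUMPTION: node 6 combines its two predicates with OR.
VALIDATION_PREDICATES = {
    2: (["username == null"], lambda a: a),
    6: (["prodID == null", "prodCount == null"], lambda a, b: a or b),
}


def outcome_combinations(node):
    """Yield (assignment, branch) for all 2^n outcome combinations of the
    simple predicates in the given validation node."""
    names, decide = VALIDATION_PREDICATES[node]
    for outcome in product([True, False], repeat=len(names)):
        branch = "T" if decide(*outcome) else "F"
        yield dict(zip(names, outcome)), branch


for node in VALIDATION_PREDICATES:
    for assignment, branch in outcome_combinations(node):
        print(node, assignment, "-> branch", branch)
```

Each assignment is a target for one test case; the branch tells which side of the validation node an input path for that test case has to follow.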

4. The proposed approach This section discusses the approach for testing input validation through automated model recovery. In this approach, the input validation model is first recovered from program source code. The model recovered gives an overview of the input validation implemented in the program. Based on the model and the two coverage criteria proposed in Section 3, we then propose to test the details of the input validation implemented in the program. The following two subsections elaborate the automated


model recovery and the testing of input validation, respectively.

4.1. Recovery of input validation model

For each program P in a system that accepts input submitted by the user and raises effects, the CFG G of P is constructed first, and then the VFG H is constructed from G. The VFG provides a higher-level view of the input validation implemented in the program; from it, the validation nodes and the VN-based path partition criteria can be computed. Detailed steps are presented next.

Step 1: Construct the VFG H of P. First, the input nodes and the effect nodes need to be identified. In a Web application, as discussed earlier, for a server program that processes input submitted by the user, the entry node in the CFG of the program is an input node, and any node that updates the database maintained is an effect node. Such operations are usually implemented through standard programming interfaces; thus they can be identified automatically through program analysis. Next, the VFG H of P can be constructed from the CFG G as follows:

(1) Compute the set N' of nodes from G according to the definition given in Section 2.
(2) Include all the nodes in N' in H.
(3) For each unordered pair of nodes n and m in H: if there is a path in G from n to m without passing through another node in H, include an edge (n, m) in H.

Step 2: Compute the set D of validation nodes. Let t be the input node and let F be the set of effect nodes computed in Step 1. First, the set C of c-validation nodes of t is computed through control dependency analysis based on the effect nodes in F. The set D of validation nodes of t is then computed as follows. First, the set of input paths of t through H is constructed using a depth-first search algorithm. During the path traversal, any loop construct is traversed only once. Next, for each c-validation node c ∈ C, if c defines a selection structure and there is an input path of t that contains c and does not contain any effect node in F, then c is identified as a validation node of t and included in D. Property 1 is then applied to check whether input submitted by the user is left unvalidated in the program. If so, no further processing is needed.

Step 3: Compute the set Q of the VN-based path partition criteria. The set of input paths of t through H has been computed in Step 2. For each of the input paths computed, identify the partition criterion q that consists of the sequence of validation nodes contained in the path and the branch traversed at each of these nodes, and include q in set Q if it


has not yet been included. Further, if the path from which q is extracted does not contain any effect node, q represents a rejection class; otherwise, q represents an acceptance class. Each criterion in Q represents a class of input paths of t in G that traverse the same sequence of validation nodes and follow the same branch at each of these nodes, as specified in the criterion.

4.2. Testing of input validation

For a program P that accepts input from the user and raises effects, the recovered input validation model of P contains the following information: the input node t and the set of effect nodes F, the CFG G and the VFG H of P, the set D of validation nodes and the set Q of the VN-based path partition criteria. Using the recovered model, the testing of input validation can be conducted based on the path-based IVC or the condition-based IVC; the choice is left to the user. To test input validation based on the path-based IVC, an input path through G needs to be selected for each class in the VN-based path partition of t. For each VN-based path partition criterion q in Q, an input path of t through G is traversed by following the branch at each validation node specified in q. During the traversal, any loop construct is traversed only once. Once a path is determined, a test case to exercise the path can be generated manually or using any existing path-oriented test case generation technique (Korel, 1990; Ferguson and Korel, 1996; Gupta et al., 1998; Wegener et al., 2001; Harman et al., 2004). If the path is infeasible, another path is traversed according to q. This is repeated until a feasible path in the class represented by q is located and exercised, or no further traversal can be made. To test input validation based on the condition-based IVC, first, for each validation node d in D, the combinations of outcomes of the simple predicates in d are computed. Next, for each such combination, the branch at d to be exercised is identified, and an input path of t through G is traversed in such a way that the identified branch is traversed and any loop construct is traversed only once. Test cases to cover each of the combinations can be generated manually or using existing test case generation techniques through transformations. For instance, one way is again to use path-oriented test case generation techniques. A test case to cover a combination of outcomes of the simple predicates in a validation node d can be generated as follows:

(1) Replace the condition defined in d with the conjunction of the following simple predicates: for each simple predicate s in d, the predicate (s = its Boolean value set in the combination).
(2) Identify the branch at d that will be exercised with the combination and locate an input path of t through G that follows the identified branch.
(3) Generate a test case to pass through the input path using one of the path-oriented test case generation techniques.

5. Evaluation

To validate the proposed approach, a prototype tool called TIVT (Tool for Input Validation Testing) has been developed. A case study and a controlled experiment were then conducted using TIVT to demonstrate the feasibility and effectiveness of the proposed approach. In the case study, we did a general check on the existence of input validation in eight open source systems. The controlled experiment was conducted for testing input validation in a Web application developed in a student project.

5.1. The prototype tool

TIVT was developed using the Java Architecture for Bytecode Analysis (JABA, 2006) from Georgia Institute of Technology. It targets Web database applications written in Java. The architecture of TIVT is shown in Fig. 4. It consists of four components: a Program Analyzer, an Input Validation Recovery (IV Recovery) component, an Input Validation Testing (IV Testing) component, and a Path-based IVC Analyzer. The Program Analyzer and the IV Recovery work together to automatically recover input validation from program source code. Based on the information recovered, the IV Testing computes the paths and constraints to guide the test case generation. At this stage, the test case generation and execution need to be performed manually. The Path-based IVC Analyzer computes the degree of path-based IVC for a given test suite. Detailed descriptions of each of the components are given below. The Program Analyzer uses JABA’s API to analyze Java programs. By parsing a class file of a Java program, JABA performs control and data flow analysis and builds

Fig. 4. The Architecture of TIVT.


a CFG or an interprocedural CFG for the program. The Preprocessor pre-compiles all files in a system into class files. Based on the Byte Code Engineering Library (BCEL, 2006), the Instrumentor provides runtime execution information.

The IV Recovery includes three modules: a VFG Constructor, a Validation Node Identifier (V-node Identifier) and a Criteria Extractor, which implement the three steps discussed in Section 4.1. The VFG Constructor produces a VFG for each program that accesses input submitted by users to raise external effects, the V-node Identifier computes the validation nodes in such a program, and the Criteria Extractor computes the VN-based path partition criteria by analyzing the input paths through the VFG of the program.

The IV Testing contains a Path Constructor and a Constraints Extractor, which implement the techniques discussed in Section 4.2. Based on the VN-based path partition criteria, the Path Constructor constructs the input paths through the CFG of a program to cover every class in the VN-based path partition. For each validation node, the Constraints Extractor computes all possible combinations of the outcomes of the simple predicates in the node. Each such combination is then passed to the Path Constructor, which selects input paths through the CFG of the program that contain the node and the branch taken under the combination. Depending on the user’s choice of path-based IVC or condition-based IVC, the corresponding information is computed to guide the test case generation.

As a by-product, the Path-based IVC Analyzer is built to compute the degree of path-based IVC for a given test suite. The degree is computed from the path execution report, which records the number of classes in the VN-based partition that have been covered by the test suite. At this stage, all the test cases in a test suite are executed manually on the instrumented code.
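The two computations described above can be illustrated with a minimal sketch, written in Python rather than the Java used by TIVT. It shows the enumeration of outcome combinations performed by the Constraints Extractor and the degree of path-based IVC reported by the Path-based IVC Analyzer. The function names, the class identifiers and the report format are assumptions made for illustration only. The degree is assumed here to be the percentage of classes in the VN-based path partition that are covered by at least one executed test case.

```python
from itertools import product


def predicate_combinations(simple_predicates):
    """Enumerate all outcome combinations of the simple predicates in one validation node."""
    names = list(simple_predicates)
    return [dict(zip(names, outcome))
            for outcome in product([True, False], repeat=len(names))]


def path_based_ivc_degree(partition_classes, covered_classes):
    """Return the percentage of VN-based partition classes covered by a test suite.

    partition_classes: identifiers of all acceptance and rejection classes.
    covered_classes: class identifiers hit during execution (hypothetical report
    format; repeats are allowed because several test cases may hit the same class).
    """
    classes = set(partition_classes)
    if not classes:
        raise ValueError("the partition must contain at least one class")
    covered = classes & set(covered_classes)
    return 100.0 * len(covered) / len(classes)


# Hypothetical validation node with two simple predicates: 2**2 = 4 combinations.
combos = predicate_combinations(["quantity > 0", "quantity <= stock"])

# Hypothetical partition with one acceptance class (A1) and three rejection classes.
degree = path_based_ivc_degree(["A1", "R1", "R2", "R3"], ["A1", "R2", "R2"])
```

A validation node with n simple predicates yields 2^n combinations under this enumeration; for a concrete program some of them may be infeasible, in which case no input path exists for them.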
Based on the number of VN-based path partition criteria, the degree of path-based IVC for a given test suite is calculated.

5.2. Case study on testing the existence of input validation

A case study was conducted on eight open source systems, which are also used in the hypothesis testing. All eight systems are presumed to have been tested. As discussed earlier, Table 1 shows the number of programs in those systems that access input submitted by users to raise external effects (called input programs). Based on Property 1, we then did a general check on the number of input programs in these systems in which input is not validated.

The eight systems are briefly described below. Roomba is a web-based room booking system for small to medium-sized hotels. Smacs is a web-based facility for the management of casual staff in an organization. JavaLibrary is an electronic library in which registered users can browse books, journals, magazines, etc. Bugtracker is a web-based tool for bug reporting and bug tracking in software development. Art is a query and reporting tool for information

Table 2
Results of the case study

Systems                        # Total programs   # Input programs   # Programs w/o IV
Roomba (Sourceforge)                         59                 16                   7
JavaLibrary (Sourceforge)                    79                 16                   0
Smacs (Sourceforge)                         209                 24                   8
Bugtracker (Sourceforge)                     25                 13                   0
JspShop (Webmaster)                          93                 29                   0
ART (Sourceforge)                            88                 33                   5
NMS (Webmaster)                              48                 12                   9
StudentRecord (Webmaster)                    41                 14                   0
Total                                       642                157                  29

sharing on the web. JspShop is a commercial online shopping system. NMS is a news management system for updating and editing news releases. StudentRecord provides a web-based educational facility for teachers and students to manage their courses and records.

TIVT was used to process the eight systems. A summary of the results is shown in Table 2. The column # Programs w/o IV gives the number of programs in which input is not validated. In all, the 8 systems contain 157 input programs, which is 24% of the 642 programs in total. Among them, there are 29 programs in which input is not validated, which accounts for 18% of the input programs. The figures show that input validation is an important feature, yet its implementation is neglected in a substantial share of input programs. Manual investigation of the programs in which input is not validated showed, not surprisingly, that most of them could cause problems such as unexpected runtime exceptions, security vulnerabilities, or violations of database integrity.

Two typical examples are shown in Fig. 5. The code segment in Fig. 5a is extracted from the Roomba system, in which inputs are accessed at lines 2, 4, 5 and 6, and the system database is updated at line 11. The program accesses user inputs and uses them directly in database insertion or modification without any checking of the input values, and this can cause two problems. One is that an invalid input can result in a Java SQL runtime exception if the value does not comply with its entry type defined in the database schema (e.g. inserting a string into a database column where an integer is required). The other is that the program is vulnerable to SQL injection, which causes security issues. Another example is the code segment extracted from the Smacs system, shown in Fig. 5b. In the code, bean is an instance of the class TestingCard as defined in line 1. The program checks the input data at lines 2 and 4 and sets the corresponding field of bean if the value is not null. However, although a check is performed on the user inputs, the method bean.create( ) at line 6 is always executed regardless of the results of the check. This program can confuse users who provide invalid input. It can also cause the insertion of invalid records into the database.


Fig. 5. Code segments of two programs with no input validation. (a) Code extracted from Roomba – bookings/saveBooking.jsp (b) Code extracted from Smacs – insertTestingCard.jsp.
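The defect pattern described for Fig. 5b can be sketched as follows. This is a minimal illustration in Python, not the JSP code of Smacs: the field names, the parameter dictionary and the list standing in for the database are all hypothetical. The first function checks the inputs but raises the external effect on every path, so the check is not input validation in the sense used in this paper. The second function adds a rejection path on which no external effect is raised.

```python
def insert_without_validation(params, database):
    """Mirror of the defect: inputs are checked, but the insert always happens."""
    record = {}
    if params.get("name") is not None:      # check only guards the field assignment
        record["name"] = params["name"]
    if params.get("code") is not None:
        record["code"] = params["code"]
    database.append(record)                 # external effect raised on every path
    return True


def insert_with_validation(params, database):
    """Corrected sketch: invalid input is rejected and no external effect is raised."""
    name = params.get("name")
    code = params.get("code")
    if name is None or code is None:
        return False                        # rejection path, database untouched
    database.append({"name": name, "code": code})
    return True


# Hypothetical request in which the "code" parameter is missing.
incomplete_request = {"name": "card-1"}

db_defective = []
insert_without_validation(incomplete_request, db_defective)

db_corrected = []
accepted = insert_with_validation(incomplete_request, db_corrected)
```

With the incomplete request, the defective variant stores a record that lacks the code field, while the corrected variant leaves the database unchanged.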

5.3. Experiment on testing input validation

To evaluate the feasibility and effectiveness of the two proposed coverage criteria for input validation, we conducted two experiments. The first was performed on the open source system Smacs, which is also used in the case study discussed in the previous section. As discussed earlier, Smacs is a web-based facility for the management of casual staff in an organization, which can store rosters, produce pay schedules and automate roster generation based on staff requirements and staff availability. There are altogether 80 files in the system and the size of the source code is 13462 LOC. The second experiment was performed on the system ResaleOnline, developed by senior computer science students, which is also used in the hypothesis testing. ResaleOnline provides a platform for school students, staff and alumni to resell their used items. There are altogether 107 files in the system and the size of the source code is 11540 LOC. It has been fully tested and well documented. For the Smacs system, since no specification was available, two students studied and tested it thoroughly to produce a detailed specification for the system. Apart from the system used and this preparation, the design of the two experiments is the same.

As discussed earlier, there are 24 input programs in Smacs and 27 input programs in ResaleOnline (shown in Table 1). In both experiments, we seeded 30 input validation errors into the input programs in the system under test: 6 extra conditions, 6 missing conditions, 6 incorrect expressions and 12 logical operator errors. By making the specification and implementation inconsistent, we were able to compare the error detection ability of the test suites designed. Nine computer science students who had no previous knowledge of the two systems under test voluntarily participated in the experiments. They were given several hours to study the systems before the commencement of the experiments. We then randomly divided the 9 students into three groups.

Each experiment consists of two parts. In Part I, the 3 students in Group 1 were required to perform functional testing of the system under test. Each of them was required to perform the task independently in three steps: (1) Design test cases based on the specifications. (2) Execute the test cases. (3) Verify the testing results and identify the errors in the source code. In addition, the execution of each test suite designed was traced and analyzed by TIVT so that the path-based IVC was computed. We analyzed and evaluated the test suites designed by the students based on the number of test cases in the test suite, the path-based IVC, and the number of input validation errors detected. A summary of the results is shown in Table 3. The column Cpiv shows the overall path-based IVC for the input programs in the system, and # IV-Errors shows the number of input validation errors identified by each test suite. As shown in Table 3, no test suite gives 100% Cpiv for testing the input programs in the system and none of the test suites manages to detect all the input validation errors seeded in the source code. Not surprisingly, a test suite with higher Cpiv tends to have a better ability to detect input validation errors. We also observed that most of the undetected errors are those on extra conditions or logical operators. The results show that the testing of input validation is a non-trivial task, and that coverage criteria for measuring the adequacy of the testing can be very useful.
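One way to see why seeded logical operator errors are easy to miss in functional testing is to list the outcome combinations of the simple predicates on which a faulty condition disagrees with the specified one. The sketch below is a hypothetical illustration in Python and does not reproduce any error seeded in the experiments. It assumes a validation node whose specified condition is the conjunction of two independent simple predicates p and q.

```python
from itertools import product


def specified(p, q):
    """Specified validation condition: accept only if both predicates hold."""
    return p and q


def logical_operator_fault(p, q):
    """Seeded fault of the 'logical operator' kind: 'and' replaced by 'or'."""
    return p or q


def missing_condition_fault(p, q):
    """Seeded fault of the 'missing condition' kind: the check on q is dropped."""
    return p


def revealing_combinations(spec, impl):
    """Outcome combinations (p, q) on which the implementation deviates from the spec."""
    return [(p, q) for p, q in product([True, False], repeat=2)
            if spec(p, q) != impl(p, q)]


operator_revealing = revealing_combinations(specified, logical_operator_fault)
missing_revealing = revealing_combinations(specified, missing_condition_fault)
```

In this sketch the operator fault is revealed only when exactly one predicate is false. A test suite that exercises the node with one fully valid input and one input violating both predicates covers both branches of the node without exposing the fault, whereas covering all four combinations exposes it.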
In Part II of the experiment, the students in Group 2 and Group 3 were required to test the input validation implemented in the system under test based on path-based IVC and condition-based IVC, respectively, with the aid of TIVT. Each of them was therefore required to design test cases based on the testing criteria generated by TIVT. Apart from this step, the procedure was the same as in Part I. For the testing based on path-based IVC, TIVT determined that there are altogether 18 acceptance classes and 61 rejection classes in the VN-based path partition in Smacs,

Table 3
The results of functional testing

           Experiment 1 (Smacs)                      Experiment 2 (ResaleOnline)
Student    # Test cases   Cpiv (%)   # IV-Errors     # Test cases   Cpiv (%)   # IV-Errors
1                   176         46            17               98         31            19
2                   340         72            23              226         77            26
3                   297         61            20              150         43            21

and there are altogether 29 acceptance classes and 74 rejection classes in the VN-based path partition in ResaleOnline. The results of the testing are shown in Table 4. All the students in Group 2 managed to design test cases that cover each class in the VN-based partition for the system under test. The numbers of input validation errors detected by the students are also very close to each other. This is expected because each student designed the test cases according to the paths generated by TIVT, which were constructed by the same algorithm. We further observed that most of the undetected errors were logical operator errors hidden in validation nodes that consist of more than one simple predicate.

For the testing based on condition-based IVC, TIVT determined that there are altogether 64 validation nodes in the 24 input programs in Smacs and 54 validation nodes in the 27 input programs in ResaleOnline. A summary of the complexity of the simple predicates in those validation nodes is shown in Table 5. The row n shows the number of simple predicates in a validation node, and the row # V-nodes gives the number of validation nodes. The average number of test cases designed by the students in Group 3 is 195 for Smacs and 133 for ResaleOnline. One student managed to identify all the errors seeded in Smacs, and two students managed to identify all the errors seeded in ResaleOnline. The average number of errors identified is 29.3 and 29.7 for the two systems, respectively.

The results of Part II of the experiment show that, based on the two proposed coverage criteria, the testing of input validation implemented in a system can be more systematic and thorough. In particular, the test cases generated based on condition-based IVC can detect more complex logic errors in validation nodes. We observed that the total number of validation nodes in an input program is usually not too large, and the number of simple predicates in a validation node is usually small; hence, the number of test cases required for both path-based IVC and condition-based IVC is not too large.

Table 4
The results of the testing based on path-based IVC

           Experiment 1 (Smacs)            Experiment 2 (ResaleOnline)
Student    # Test cases   # IV-Errors      # Test cases   # IV-Errors
1                    79            24               103            27
2                    79            25               103            25
3                    79            24               103            26

Table 5
Complexity of the simple predicates in validation nodes

n                            1     2     3     4
# V-nodes   Smacs           33    15     3     3
            ResaleOnline    41    12     1     0

6. Related work

The earliest work related to input validation testing mostly focused on the automated generation of programs to test compilers (Bazzichi and Spadafora, 1982). These techniques do not generate test cases. More recently, Beizer (1990) proposes a method for syntax testing that uses graphs to specify user commands. The method applies graph coverage techniques to manually generate test cases. Marick (1995) also proposes an approach to syntax testing based on informal guidelines. However, these techniques cover only syntax, which is just one part of input validation. Recently, two techniques that are more closely related to the proposed technique have been proposed for testing input validation in data-intensive systems (Hayes and Offutt, 1999; Offutt et al., 2004). The input validation testing (IVT) proposed by Hayes and Offutt (1999) is a specification-based testing method that automatically analyzes user interface specifications to generate test cases. Test cases for both valid and invalid inputs are generated. Motivated by the importance of security in Web applications, bypass testing was proposed by Offutt et al. (2004). Bypass testing is a special kind of IVT to test the robustness of Web applications. By bypassing client-side checking, test cases with invalid inputs are created to test the input validation implemented in the applications. The proposed approach shares with bypass testing the objective of testing input validation of Web applications; however, there is a major theoretical difference between the two approaches. The proposed approach automatically recovers the input validation model from program source code and then proceeds to test the implemented input validation through structural testing, while bypass testing does not require any program source code and generates test cases purely based on the syntax of the client-side user interface. Many tools and research works have focused on the testing of Web applications (Hower, 2006).
Most of them are for static validation and measurement of Web applications, such as protocol conformance, stress testing and link checking, and do not directly support functional testing of Web applications. More recent research has looked into testing functional requirements of Web applications through formal techniques. Liu et al. (2000) propose a Web application test model, which considers each Web component as an object, and test cases are generated based on data flow among those objects. Ricca and Tonella (2001) propose a UML model of Web applications that enables Web application analysis and drives test case generation. Both techniques extract the models from source code, and employ and extend traditional structural and data flow testing to the Web application domain. An agent-based framework is proposed by Qi et al. (2005) for modeling Web applications. The method greatly reduces the complexity of Web applications, and data flow testing is performed at different functional levels. The proposed approach shares with these approaches the recovery of models from source code and the use of structural testing techniques; however, it focuses on the modeling and testing of input validation, an important feature in Web applications.

An important key to ensuring software quality is the measurement of test quality. Test coverage is one of the key measures for this purpose (Weyuker et al., 1991; Zhu et al., 1997). Input validation is a critical feature of Web applications. It is important to verify that this feature is sufficiently tested before a system is implemented. To the best of our knowledge, no coverage criterion has so far been proposed for testing input validation. Many criteria have been proposed for structural coverage, such as statement coverage, branch coverage and path coverage (Beizer, 1990; Myers et al., 2004). Multiple-condition coverage and modified-condition/decision coverage have been proposed for testing the logical expressions of individual decisions (Chilenski and Miller, 1994). Data flow criteria for examining the definitions and references through program flow have also been proposed (Ntafos, 1988; Clarke et al., 1989). Recently, some other criteria have been proposed for testing systems from different aspects. Memon et al. (2001) propose event-based coverage criteria for GUI events and their interactions, in which the GUI components are represented by an event-flow graph. Based on this graph, intra-component coverage criteria are used to evaluate the adequacy of tests on events within a component and inter-component criteria are used to assess the adequacy of test sequences across components. Rountev et al. (2005) present a family of coverage criteria for testing object interactions based on reverse-engineered sequence diagrams, which are extensions of traditional coverage criteria through the use of the sequences of messages in the diagrams. Furthermore, the approach proposed in this paper incorporates the empirical properties, which are constructed and iteratively refined through empirical study and statistical validation.
This method is commonly adopted in medical research, and recently, more and more software engineering researchers have started to explore empirical software engineering methodologies (Kitchenham et al., 2002; Joseph et al., 2006) to solve software engineering problems.

7. Conclusion

In this paper, we have introduced an input validation model represented by a validation flow graph. Based on it, some properties for implementing input validation and two coverage criteria for testing input validation have been discussed. The concept of VN-based path partition divides the paths through the CFG of a program into acceptance classes and rejection classes, and path-based IVC is proposed to ensure that at least one path in each class of the partition has been exercised. Condition-based IVC is proposed for testing the detailed implementation of each validation node. In the proposed approach, the input validation model is automatically recovered from program source code. Test cases for testing input validation are then

generated based on the path-based IVC or condition-based IVC. Our case study suggests that programmers should not overlook the input validation feature. It should be properly designed, implemented and tested during the software development process. Further, testing input validation is a non-trivial task. The empirical results have shown that the proposed approach, as implemented in the prototype tool, provides an effective way to adequately test input validation in Web applications. As input validation is an important means of ensuring the accuracy and adequate control of input in a system, it is a main concern in computer auditing. Hence, we believe the techniques proposed in this paper could be useful in aiding computer auditors in the auditing of input control features. Furthermore, feature-oriented software testing is a promising area for further research.

Acknowledgments

The authors would like to thank Mary Jean Harrold from Georgia Institute of Technology for sharing the JABA program analysis tool, which helped tremendously in building the prototype system.

References

Bazzichi, F., Spadafora, I., 1982. An automatic generator for compiler testing. IEEE Trans. Software Eng. SE-8 (4), 343–353.
BCEL, 2006. Byte code engineering library. http://jakarta.apache.org/bcel/.
Beizer, B., 1990. Software Testing Techniques. Van Nostrand Reinhold, New York.
Chilenski, J.J., Miller, S.P., 1994. Applicability of modified condition/decision coverage to software testing. Software Eng. J. 9 (5), 193–200.
China Webmaster, 2006. Open source website: http://code.cnzz.cn.
Clarke, L.A., Podgurski, A., Richardson, D.J., Zeil, S.J., 1989. Formal evaluation of data flow path selection criteria. IEEE Trans. Software Eng. 15 (11), 1318–1332.
Ferguson, R., Korel, B., 1996. The chaining approach for software test data generation. ACM Trans. Software Eng. Methodol. 5 (1), 63–86.
Gupta, N., Mathur, A.P., Soffa, M.L., 1998. Automated test data generation using an iterative relaxation method. In: Proceedings of the Sixth International Symposium on the Foundations of Software Engineering (FSE-6), pp. 231–244.
Harman, M., Hu, L., et al., 2004. Testability transformation. IEEE Trans. Software Eng. 30 (1), 3–16.
Hayes, J.H., Offutt, A.J., 1999. Increased software reliability through input validation analysis and testing. In: Proceedings of the 10th International Symposium on Software Reliability Engineering, pp. 199–209.
Hower, R., 2006. Web site test tools and site management tools.
JABA, 2006. Java architecture for bytecode analysis: http://www.cc.gatech.edu/aristotle/Tools/jaba.html.
Joseph, R.R., Sebastian, E., et al., 2006. Experimental program analysis: a new program analysis paradigm. In: Proceedings of the 2006 International Symposium on Software Testing and Analysis. ACM Press, Portland, Maine, USA.
Kitchenham, B.A., Pfleeger, S.L., et al., 2002. Preliminary guidelines for empirical research in software engineering. IEEE Trans. Software Eng. 28 (8), 721–734.
Korel, B., 1990. Automated software test data generation. IEEE Trans. Software Eng. 16 (8), 870–879.
Liu, C.-H., Kung, D.C., Hsia, P., Hsu, C.-T., 2000. Structural testing of web applications. In: 11th International Symposium on Software Reliability Engineering, pp. 84–96.
Liu, H., Tan, H.B.K., 2006. Automated verification and test case generation for input validation. In: Proceedings of the First Workshop on Automation of Software Test (co-located with ICSE 2006), pp. 29–35.
Marick, B., 1995. The Craft of Software Testing: Subsystem Testing Including Object-based and Object-oriented Testing. PTR Prentice Hall, Englewood Cliffs, NJ.
Memon, A.M., Soffa, M.L., Pollack, M.E., 2001. Coverage criteria for GUI testing. In: Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, pp. 256–267.
Montgomery, D.C., Runger, G.C., 2003. Applied Statistics and Probability for Engineers. John Wiley and Sons, New York.
MSDN, 2006. Design guidelines for secure web application: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/secmod/html/secmod77.asp.
Myers, G.J., Badgett, T., Thomas, T.M., Sandler, C., 2004. The Art of Software Testing. John Wiley and Sons, Hoboken, NJ.
Ntafos, S.C., 1988. A comparison of some structural testing strategies. IEEE Trans. Software Eng. 14 (6), 868–874.
Offutt, J., Wu, Y., Du, X., Huang, H., 2004. Bypass testing of web applications. In: Proceedings of the 15th International Symposium on Software Reliability Engineering, pp. 187–197.
Qi, Y., Kung, D., Wong, E., 2005. An agent-based testing approach for web applications. In: Computer Software and Applications Conference, 2005, COMPSAC 2005, 29th Annual International, 26–28 July 2005, pp. 45–50.
Ricca, F., Tonella, P., 2001. Analysis and testing of web applications. In: Proceedings of the 23rd International Conference on Software Engineering, pp. 25–34.
Rountev, A., Kagan, S., Sawin, J., 2005. Coverage criteria for testing of object interactions in sequence diagrams. In: Proceedings of the Eighth International Conference on Fundamental Approaches to Software Engineering (FASE’05), pp. 289–304.
Sinha, S., Harrold, M.J., Rothermel, G., 2001. Interprocedural control dependence. ACM Trans. Software Eng. Methodol. 10 (2), 209–254.
Sourceforge, 2006. Open source website: http://sourceforge.net.
Tai, K.-C., 1996. Theory of fault-based predicate testing for computer programs. IEEE Trans. Software Eng. 22 (8), 552–562.
Wegener, J., Baresel, A., et al., 2001. Evolutionary test environment for automatic structural testing. Inform. Software Technol. 43 (14), 841–854.
Weyuker, E.J., Weiss, S.N., Hamlet, D., 1991. Comparison of program testing strategies. In: Proceedings of the Symposium on Testing, Analysis, and Verification (TAV4), p. 1.
Zhu, H., Hall, P.A.V., May, J.H.R., 1997. Software unit test coverage and adequacy. ACM Comput. Surv. 29 (4), 366–427.