Theoretical Computer Science 576 (2015) 45–60
Complexity of node coverage games ✩

Farn Wang a,b,c,∗, Sven Schewe d, Jung-Hsuan Wu a

a Department of Electrical Engineering, National Taiwan University, Taiwan, ROC
b Graduate Institute of Electronic Engineering, National Taiwan University, Taiwan, ROC
c Research Center for Information Technology Innovation (CITI), Academia Sinica, Taiwan, ROC
d Department of Computer Science, University of Liverpool, UK
Article info

Article history: Received 27 May 2014; Received in revised form 24 January 2015; Accepted 3 February 2015; Available online 7 February 2015. Communicated by V.Th. Paschos.

Keywords: Testing; Nondeterminism; Coverage; Game; Strategy; Complexity

Abstract

Modern software systems may exhibit nondeterministic behavior due to many unpredictable factors. In this work, we propose the node coverage game, a two-player turn-based game played on a finite game graph, as a formalization of the problem of testing such systems. Each node in the graph represents a functional equivalence class of the software under test (SUT). One player, the tester, wants to maximize the node coverage, measured by the number of nodes visited when exploring the game graph, while his opponent, the SUT, wants to minimize it. An optimal test would maximize the cover, and it is an interesting problem to find the maximal number of nodes that the tester can guarantee to visit, irrespective of the responses of the SUT. We show that the decision problem of whether this guarantee is less than a given number is NP-complete. We also discuss two extensions of our result and present a testing procedure based on our result for the case where the SUT is not so hostile.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction

Coverage-based techniques [29,30] have been widely used in the management of testing projects for large and complex software systems. The idea is to model the software under test (SUT) as a finite number of functional equivalence classes (FECs). The number of FECs that a test plan covers is then used as an indication of the completeness of a verification task and of the quality of the SUT. For white-box testing, typical test criteria include line (statement) coverage, branch coverage, path coverage, dataflow coverage, class coverage, function/method coverage, and state coverage [29,30]. For black-box testing, popular criteria include input domain coverage, GUI event coverage, etc. Coverage techniques are a common part of the quality management techniques that are applied in industry for quality control.
Full coverage does not normally imply the correctness of the SUT, but empirical studies [17] have shown that coverage-based techniques are effective in detecting software faults. High coverage shows that the test engineers have systematically and methodically sampled the behaviors of the SUT. However, most programs are so complicated that, even if all specified items of a program have been tested, the management can obtain high confidence in the quality of the program, but not a correctness proof. For example, even if we have achieved full branch coverage in white-box testing (that is, even if all conditional branches have been covered in the source code execution), usually the majority of all combinations of truth values are still left unobserved.
✩ The work is partially supported by Grant MOST 103-2221-E-002-150-MY3 and the Project "Performance Testing Techniques That Reflect User Experiences" of the Research Center for Information Technology Innovation, Academia Sinica, Taiwan, ROC.
∗ Corresponding author at: Department of Electrical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei 106, Taiwan, ROC. E-mail address: [email protected] (F. Wang).
The correctness of a program—e.g., formalized as the question whether the program violates a safety predicate—is undecidable, so the holy grail of proven correctness is unobtainable. However, if a project or quality manager sees low coverage, then the manager knows for sure that the behaviors of some parts of the SUT have not been observed. Thus, software project managers use coverage techniques to evaluate the progress of the testing tasks for an SUT, to manage the quality control process, and ultimately to evaluate their confidence in the correctness and quality of the SUT. It is therefore important for testers and managers to create test plans that yield high coverage, so that they can gain high confidence in the quality of the SUT and the testing tasks.
However, the test coverage of an SUT has to be achieved under various assumptions and in the face of nondeterministic responses of the SUT [29,31]. For example, when we observe how a request message is served by a server SUT, we have no control over whether the server SUT will finish serving the request, deny the request, or be unaware of the request, e.g., due to loss of connection. The best that a verification engineer can do is to use various strategies to try to reach as many FECs of the server SUT as possible.
We propose to model this problem as a two-player finite-state game [31], which we call a node coverage game (NC-game for short). The first player is the tester (the maximizer; he for short) and the second is the SUT (the minimizer; she for short). The two players play on a finite game graph with nodes for the FECs, where the nodes are partitioned into those owned by the tester and those owned by the SUT. The tester and the SUT together move a pebble from node to node according to the transition relation of the game graph. The tester chooses the next node when the pebble is on one of his nodes, while the SUT chooses the successor when the pebble is on one of her nodes. The objective of the tester is to maximize the number of visited (covered) nodes, while the SUT wants to minimize the number of covered nodes.
An interesting question about NC-games is how much coverage the tester can guarantee, no matter how the SUT reacts to the test input. We call this guarantee the maximal coverage guarantee (MCG for short). The MCG is calculated under the conservative assumption that the SUT is malicious and tries to minimize the coverage. (For example, a server SUT may decline all interaction requests.) This is common in nondeterministic systems, and a test coverage below the MCG certainly implies a deficiency in test execution.
We show that the MCG decision problem (the problem whether the SUT can resolve the nondeterminism to prevent a cover of more than a given threshold c) is NP-complete. This complexity is lower than those established for related coverage problems [5,22], which are PSPACE-complete (cf. Section 2). Our proof technique is based on new observations on games and may be of independent interest. Specifically, we observe that although optimal tester strategies may need memory to achieve the MCG, optimal SUT strategies only need to be prefix-consistent, which means that once the SUT makes an optimal decision at a node in a play, she can stick to that decision for the rest of the play. All in all, our result is not only of theoretical interest, but may also serve as a solid theoretical foundation for testing research.
In the remainder of the paper, we first review related work in Section 2 and basic concepts of game theory in Section 3. In Section 4, we define NC-games. We then establish the lower bound and the upper bound of the complexity of the MCG decision problem, which we prove to be NP-complete, in Section 5 and Section 6, respectively. In Section 7, we discuss the complexity of resettable NC-game decision problems and edge coverage game decision problems. Finally, in Section 8, we discuss how to apply our result to software testing in practice when the SUT is not so hostile.

2. Related work

2.1. Classic problems

One related classic problem is the two-player reachability game [9,21,25,26] on directed graphs. The game can be turn-based or concurrent. The two players together move a pebble in the game graph. Player 1 wins if and only if a node in a target set is eventually reached. Computing pure winning strategies for reachability games can be done in PTIME, via the standard attractor construction (sketched at the end of this subsection).
Another related classic problem is the Canadian traveler problem [22] of Papadimitriou and Yannakakis. A problem instance is a two-player game. The players are given a partially observable game graph and do not know which edges are connected. Player 1 may try an edge when he is at the source of the edge. Player 2 then decides whether this edge is connected or not. The connectivity of an edge, once decided, cannot be changed by anyone. The answer to a problem instance is whether the ratio of the traveling distance over the optimal static traveling distance is lower than a given number. This problem is PSPACE-complete. In contrast, we show that the MCG decision problem is 'only' NP-complete.
Yet another classic problem that could be related to this work is the pursuit-evasion game, sometimes called a graph-searching game [15,19,23], on a directed graph with a group of searchers and a group of evaders. A typical version of the problem is played as follows. Initially, each searcher (and each evader) stays in a node. In each round, a searcher (or an evader) can move to another node, depending on various restrictions on the velocity. An evader is caught if she and a searcher are in the same node. The decision problem asks whether a given number of searchers is enough to catch all evaders. Important (and particularly simple) variants of these games are cops-and-robber games, which are used to determine the complexity of directed graphs. The most relevant games of this type are the games used to determine the DAG width [2,20] and entanglement [3] of a directed graph. For general pursuit-evasion games, the mobility and velocity of the evaders do not seem natural for modeling software bugs. Their role would, however, be quite restricted, as even for the simple cops-and-robber games mentioned above, the DAG width and entanglement of the games used in our hardness proof are the lowest possible values above those of DAGs.
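As an aside, the PTIME bound for reachability games mentioned above comes from the standard attractor construction. The following sketch is a generic illustration only; the adjacency-map encoding and the toy example are assumptions of the sketch, not notation from this paper.

```python
def attractor(tester_nodes, edges, targets):
    """Standard attractor construction for reachability games: the set of nodes from
    which player 1 (the tester) can force the pebble into `targets`, computed as a
    backward fixpoint in polynomial time.  `edges` maps each node to the set of its
    successors; every node not in `tester_nodes` belongs to player 2."""
    attr = set(targets)
    changed = True
    while changed:
        changed = False
        for v, succ in edges.items():
            if v in attr or not succ:
                continue
            # owner of v needs one successor in attr; the opponent must have all of them there
            if (v in tester_nodes and succ & attr) or (v not in tester_nodes and succ <= attr):
                attr.add(v)
                changed = True
    return attr

# Player 1 can force reaching {"t"} from "s" but not from "u":
print(attractor({"s"}, {"s": {"t", "u"}, "u": {"u"}, "t": {"t"}}, {"t"}))  # {'s', 't'}
```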
2.2. Proposition coverage game

A recent work on test coverage games is the proposition coverage game by Chatterjee, Alfaro, and Majumdar [5]. Their game graph is the same as ours, except that each game node is labeled with a set of atomic propositions. The goal of the tester (SUT) is to cover as many (respectively, as few) propositions as possible. They showed that the decision problem of the maximal proposition coverage guarantee is PSPACE-complete. Chatterjee et al. also showed that when the game graph is recurrent (i.e., there exists a tester strategy from every node back to the initial node), the complexity of proposition coverage games drops to co-NP-complete. Their techniques involve a reduction from probabilistic games. In contrast, our complexity result is established with new observations on games, which can be useful in future research. In fact, node coverage has long been accepted in the practice of software testing. Thus, our result implies that, for software testing, coverage analysis can be achieved without incurring PSPACE complexity.

2.3. Optimization problems in coverage

In practice, software testing projects are usually executed in phases with separated concerns. After test requirements (e.g., nodes in FSMs, or lines in source code) are specified in the test specification phase, we carry out the phase of test case construction to cover the test requirements. The following phase is usually "test prioritization" or "test scheduling" [14,27], which aims to determine the execution order of the test cases for another objective, for example, efficiency of coverage, and is considered independent of test case construction for coverage. The phase-wise execution of testing projects helps in the separation of concerns in the quality control process. At the moment, the value of test prioritization techniques is usually validated with experiments. In Subsection 7.1, we present resettable NC-games, with the restriction that from every node there is a reset edge to the initial node. Resettable NC-games seem a natural foundation for studying the test prioritization problem. In the future, our MCG decision problem could be extended with constraints on the counts of reset edges.

3. Preliminaries

We first introduce some basic notations. Let N be the set of non-negative integers. We write [i, j] for the set of integers inclusively between i and j. Also, (i, j], [i, j), and (i, j) are shorthands for [i + 1, j], [i, j − 1], and [i + 1, j − 1], respectively. Given a finite set D, we use |D| to denote the size of D. Given two sets D and D′, we use D − D′ to denote the set difference of D and D′, that is, D − D′ = {d | d ∈ D, d ∉ D′}. Suppose that we are given an (infinite) sequence φ = v0 v1 . . . with elements in a set V. For every i ∈ N, we let φ(i) = vi. We use V* to denote the set of finite sequences of elements in V. Given two sequences φ1 and φ2 such that φ1 is finite, we use φ1φ2 to denote their concatenation. Also, |φ| denotes the length of φ.
Our NC-game is played on a directed graph, conventionally called a game graph. There are many graph-based coverage techniques in the literature [1,28,30]. Game graphs can be obtained from the control-flow graphs of source programs in white-box testing [16], finite-state machine (FSM) models in the requirements [7,12,13,18,24], state-charts in UML specifications [4,6], etc. Conceptually, a node in the graph represents an FEC of the SUT. An edge represents a transition between FECs. A node can be owned either by the tester (the maximizer, or player 1) or by the SUT (the minimizer, or player 2). At the beginning of the game, there is a pebble in a dedicated initial node of the graph. Depending on who owns the node that contains the pebble, the owner of the node chooses a transition to move the pebble from the current node to the next node. The interleaved choices of the SUT and the tester extend a play—an infinite sequence of nodes in the game graph. The size of the set of nodes that occur in the play is the coverage of the play. The above-mentioned concepts can be formalized as follows.

Definition 1 (Game graph). A game graph G = ⟨V1, V2, E, ν⟩ is a weighted finite directed graph with finite node set V1 ∪ V2, edge (transition) set E ⊆ (V1 ∪ V2) × (V1 ∪ V2), and gain function ν : (V1 ∪ V2) → N. We require that V1 ∩ V2 = ∅. Nodes in V1 are owned by the tester, while those in V2 are owned by the SUT. □

For convenience, from now on, without saying it explicitly, we assume that we are in the context of a given game graph G = ⟨V1, V2, E, ν⟩. Moreover, we let V denote V1 ∪ V2. Fig. 1 shows an example game graph G. Nodes in V1 are drawn as circles; those in V2 are drawn as squares. The gain of each node is written below the node name. The continuous interaction between the SUT and the tester creates an infinite play, defined as follows.

Definition 2 (Plays and play prefixes). A play ψ is a function from N to V such that, for all i ≥ 0, (ψ(i), ψ(i + 1)) ∈ E. A play prefix φ of ψ is a mapping from an interval [0, k] to V such that, for all i ∈ [0, k], φ(i) = ψ(i). □
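To make these definitions concrete, the following sketch shows one possible plain-Python encoding of a game graph and of the coverage gain of a play prefix. The encoding and the toy graph are illustrative assumptions only; in particular, the toy graph is not the graph of Fig. 1.

```python
# A possible plain encoding of a game graph <V1, V2, E, nu> (Definition 1).
tester_nodes = {"v0", "v3"}                       # V1, owned by the tester (maximizer)
sut_nodes = {"v1", "v2"}                          # V2, owned by the SUT (minimizer)
edges = {"v0": {"v1", "v2"}, "v1": {"v3"},        # E, as an adjacency map
         "v2": {"v3"}, "v3": {"v0"}}
gain = {v: 1 for v in tester_nodes | sut_nodes}   # nu, here constant 1

def coverage_gain(gain, play_prefix):
    """nu(psi): the total gain of the distinct nodes visited by a play prefix."""
    return sum(gain[v] for v in set(play_prefix))

print(coverage_gain(gain, ["v0", "v1", "v3", "v0"]))  # 3 distinct nodes are covered
```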
Fig. 1. A game graph.
The following notations are for convenience of presentation. Given a play prefix φ : [0, k] → V, the length of φ, denoted |φ|, is k + 1. If φ is of infinite length, |φ| = ∞. Given two integers j and k in [0, |φ|) with j ≤ k, we use φ[j, k] to denote the play prefix φ(j)φ(j + 1) . . . φ(k). We use last(φ) = φ(|φ| − 1) to denote the last node in φ if the length of φ is finite (|φ| ≠ ∞).
Given a play (or play prefix) ψ, we use ⟦ψ⟧ to denote the range of ψ, that is, ⟦ψ⟧ = {ψ(k) | k ∈ [0, |ψ|)}. Also, by abuse of notation, ν(ψ) denotes the coverage gain of ψ, i.e., ν(ψ) = Σ_{v ∈ ⟦ψ⟧} ν(v). Without mentioning it explicitly, we assume that a play has infinite length. A play ψ with ψ(0) = v is called a v-play.
In choosing transitions at a node owned by a player, the player may look up the play prefix that leads to the current node, investigate what decisions the opponent has made along the prefix, and select the next node he or she moves to. Such decision-making by a player can be formalized as follows.

Definition 3 (Strategy). A strategy is a function from play prefixes to a successor node. Formally, a strategy σ is a function from V* to V such that, for every φ ∈ V*, (last(φ), σ(φ)) ∈ E. A strategy σ is memoryless (positional) if the choice of σ only relies on the current node of the pebble, that is, for every two play prefixes φ and φ′, last(φ) = last(φ′) implies σ(φ) = σ(φ′). If σ is not memoryless, it is called memoryful. □

Given regular expressions [11] ρ1, . . . , ρn over the alphabet V and nodes v1, . . . , vn ∈ V, we may use [ρ1 → v1, . . . , ρn → vn] to (partially) specify a strategy. The languages of ρ1, . . . , ρn are supposed to be disjoint from one another. For a strategy σ, a rule ρi → vi means that, for every play prefix φ in the language of ρi, σ(φ) = vi. For example, in Fig. 1, a memoryless strategy of the tester can be specified with [V* v0 → v1, V* v3 → v2]. A memoryful strategy of the tester can be specified with [v0 → v1, v0 V* v0 → v2, V* v3 → v2].
Note that, in Definition 3, we do not distinguish between the strategies of the two players. As a player can only influence the decisions made at his or her nodes, we call a play φ σ-conform for a tester strategy σ if, for all i ∈ N, φ(i) ∈ V1 implies φ(i + 1) = σ(φ[0, i]). Likewise, we call it σ-conform for an SUT strategy σ if, for all i ∈ N, φ(i) ∈ V2 implies φ(i + 1) = σ(φ[0, i]). In the remainder of the paper, we denote the set of all strategies by Σ. Together with an initial node r, two strategies σ1, σ2 ∈ Σ of the tester and the SUT, respectively, define a unique play, which is conform to both. We denote this play by play(r, σ1, σ2).

Definition 4 (Traps). For p ∈ {1, 2}, a p-trap is a subset V′ ⊆ V such that player 3 − p has a strategy to keep all plays from leaving V′. Formally, we require that:
• for every v ∈ V′ ∩ Vp and every (v, v′) ∈ E, v′ ∈ V′; and
• for every v ∈ V′ − Vp, there exists a (v, v′) ∈ E with v′ ∈ V′.
For convenience, in this work, a 1-trap is called a tester trap while a 2-trap is called an SUT trap. □
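As an illustration of Definition 4, the following sketch checks whether a given node set is a p-trap; it assumes the plain set/dict encoding used in the earlier sketch.

```python
def is_trap(nodes, p, tester_nodes, sut_nodes, edges):
    """Check whether `nodes` is a p-trap (Definition 4): player 3-p has a strategy to
    keep every play inside `nodes`.  p is 1 (tester trap) or 2 (SUT trap)."""
    owned_by_p = tester_nodes if p == 1 else sut_nodes
    for v in nodes:
        succ = edges.get(v, set())
        if v in owned_by_p:
            if not succ <= nodes:       # player p must be unable to leave `nodes`
                return False
        elif not succ & nodes:          # player 3-p must be able to stay inside
            return False
    return True
```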
4. Node coverage game (NC-game)

An NC-game is defined by a game graph G and an initial node.

Definition 5 (Node coverage game, NC-game). A node coverage game ⟨G, r⟩ is a pair of a game graph G and an initial node r ∈ V. In the game, the tester (player 1) tries to cover as much node weight as possible in plays, while the SUT (player 2) tries to cover as little node weight as possible. □

For convenience, from now on, unless explicitly stated otherwise, we assume that we are in the context of an NC-game ⟨G, r⟩. The maximal coverage guarantee (MCG) from r of G, denoted mcg(G, r), is

mcg(G, r) = max_{σ1 ∈ Σ} min_{σ2 ∈ Σ} ν(play(r, σ1, σ2)).

Intuitively, this is the maximal coverage gain from r that the tester can guarantee no matter how the SUT may respond.
A strategy σ1 of the tester is optimal if it can be used by the tester to achieve at least mcg(G, r) coverage no matter how the SUT responds in the game. Formally, σ1 is optimal for the tester if and only if min_{σ2 ∈ Σ} ν(play(r, σ1, σ2)) = mcg(G, r) holds. Symmetrically, a strategy σ2 of the SUT is optimal if and only if max_{σ1 ∈ Σ} ν(play(r, σ1, σ2)) = mcg(G, r) holds.
The complexity of a computational problem is usually studied in the framework of decision problems. Our MCG decision problem is defined as follows.

Definition 6 (MCG decision problem). An MCG decision problem instance for a c ∈ N asks whether mcg(G, r) ≤ c. □
4.1. Finite characterization of the MCG value

Note that the definition of the MCG seems to imply that we may need to examine infinite plays to calculate it. In fact, it is possible to deduce the MCG from a few plays of finite length. Let us consider an optimal strategy σ1 of the tester and an optimal strategy σ2 of the SUT, and assume that play(r, σ1, σ2) is the infinite play ψ. Then, eventually, there are two integers j < k such that ψ(j) = ψ(k) and ⟦ψ[0, j]⟧ = ⟦ψ[0, k]⟧. That is, ψ[0, j] and ψ[0, k] end at the same node and with the same coverage. We can construct optimal strategies σ1′ and σ2′, respectively, for the tester and the SUT with the following contraction procedure.
• σ1′ is the same as σ1 except that, for every play prefix φ, σ1′(ψ[0, j]φ) = σ1(ψ[0, k]φ).
• Similarly, σ2′ is the same as σ2 except that, for every play prefix φ, σ2′(ψ[0, j]φ) = σ2(ψ[0, k]φ).
It is then apparent that σ1′ and σ2′ are also optimal for the tester and the SUT, respectively. Moreover, ⟦play(r, σ1′, σ2′)⟧ = ⟦play(r, σ1, σ2)⟧ and ν(play(r, σ1′, σ2′)) = ν(play(r, σ1, σ2)) = mcg(G, r). By repeating the contraction procedure only on those j and k with ν(ψ[0, j]) = ν(ψ[0, k]) < mcg(G, r), we can eventually construct optimal strategies σ̂1 and σ̂2 that together develop a play reaching the MCG in no more than |V|² steps. The reason is that, along any play ψ, ⟦ψ[0, j]⟧ grows monotonically with j. In a segment of |V| + 1 nodes along ψ, if the coverage does not increase, then ψ must have visited the same node twice in the segment, by the pigeonhole principle.
The above argument in fact also tells us that there is an optimal tester strategy that contracts the play as much as possible at positions j and k with j < k, ψ(j) = ψ(k), ⟦ψ[0, j]⟧ = ⟦ψ[0, k]⟧, and ν(ψ[0, j]) < mcg(G, r). We define contracting strategies as follows.

Definition 7. A strategy σ is contracting if, for every play prefix ψ and all j < k < |ψ| with ψ(j) = ψ(k) and ⟦ψ[0, j]⟧ = ⟦ψ[0, k]⟧, σ(ψ) = σ(ψφ) for any φ ∈ (ψ[j + 1, k])*. □

Intuitively, if a contracting strategy sees no increase in coverage between two consecutive visits to the same node, then it will stay with that coverage unless the other player makes a change.

Lemma 1. For any node coverage game, there is an optimal contracting tester strategy. □

Based on Lemma 1, we obtain the following equivalent characterization of the MCG value:

max_{σ1 ∈ Σ} min_{σ2 ∈ Σ} ν(play(r, σ1, σ2)) = min_{σ2 ∈ Σ} max_{σ1 ∈ Σ} ν(play(r, σ1, σ2))
  = min_{σ2 ∈ Σ} max_{σ1 ∈ Σ, σ1 contracting} ν(play(r, σ1, σ2)).

Against contracting strategies of the tester, the SUT can also counter with contracting strategies without conceding more coverage: since the tester has decided not to push for more coverage, the SUT can collaborate to stop coverage increments. Thus, we obtain the following lemma.

Lemma 2. For any NC-game,

max_{σ1 ∈ Σ} min_{σ2 ∈ Σ} ν(play(r, σ1, σ2)) = min_{σ2 ∈ Σ} max_{σ1 ∈ Σ} ν(play(r, σ1, σ2))
  = min_{σ2 ∈ Σ, σ2 contracting} max_{σ1 ∈ Σ, σ1 contracting} ν(play(r, σ1, σ2)). □
Lemma 2 is interesting because, when both players use contracting strategies, the limit of the coverage of the play is reached within |V|² steps along each play. This leads to the following lemma.

Lemma 3. Given a node coverage game ⟨G, r⟩,

mcg(G, r) = max_{σ1 ∈ Σ, σ1 contracting} min_{σ2 ∈ Σ, σ2 contracting} ν(play(r, σ1, σ2)[0, |V|²]).
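The finite horizon of Lemma 3 can be turned into a direct, brute-force evaluation: both players are unfolded for |V|² steps and the value of a leaf is the gain of the nodes covered so far. The following sketch is exponential in |V| and only meant for very small examples; it is not the NP algorithm developed in Section 6. It assumes the plain encoding used earlier and that every node has at least one successor, so plays never get stuck.

```python
from functools import lru_cache

def mcg_bruteforce(tester_nodes, sut_nodes, edges, gain, r):
    """Minimax evaluation of the |V|^2-step truncated game, using the step bound of
    Lemma 3.  The memoized state is (current node, set of covered nodes, steps left)."""
    horizon = len(gain) ** 2

    @lru_cache(maxsize=None)
    def value(v, covered, steps_left):
        if steps_left == 0:
            return sum(gain[u] for u in covered)
        child = [value(u, covered | frozenset({u}), steps_left - 1) for u in edges[v]]
        return max(child) if v in tester_nodes else min(child)

    return value(r, frozenset({r}), horizon)
```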
Fig. 2. A game graph for (x1 ∨ x2 ∨ x3) ∧ (x̄1 ∨ x̄2).
Proof. When both σ1 and σ2 are contracting, along any segment of length |V| + 1 in play(r, σ1, σ2), the coverage must increase unless it has already reached the coverage limit of play(r, σ1, σ2). But the coverage can increase at most |V| times. Thus, within |V|² steps, the coverage limit of play(r, σ1, σ2) is reached. □

Lemma 3 is useful for showing that the MCG problem is well defined. In addition, it also suggests a PSPACE algorithm for the MCG decision problem: we only have to use O(|V|² + 1) space to enumerate all play prefixes that can be constructed with contracting strategies of the two players, and checking whether a play prefix violates the contracting-strategy condition can also be done in PSPACE. However, we can do better than PSPACE. In fact, our main theoretical result is the NP-completeness of the MCG decision problem.

Theorem 1. The MCG decision problem is NP-complete for both constant and general ν. □

In the following two sections, we establish the lower bound and the upper bound, respectively, of the complexity of the MCG decision problem.

5. NP-hardness of the MCG decision problem

The hardness is easy to establish by a standard reduction from a SAT problem: we reduce the satisfiability problem of Boolean formulas in conjunctive normal form (CNF) [8] to the MCG decision problem. As outlined in Fig. 2, we translate a Boolean formula η in CNF with n atomic propositions x1, x2, . . . , xn and m clauses C1, C2, . . . , Cm into an NC-game ⟨G, dx1⟩ as follows.
• We have m + 3n + 1 nodes: the nodes xi, x̄i, and dxi for i = 1, . . . , n, a node y, and a node Cj for j = 1, . . . , m.
• From each node dxi, the SUT can choose to go to xi or x̄i.
• From the nodes xi and x̄i, dx_{i+1} is the only successor for i < n, and y is the only successor of xn and x̄n.
• From y, the tester can choose to go to C1, . . . , Cm.
• For each clause Cj with 1 ≤ j ≤ m and each literal l (some xi or x̄i) in Cj, the SUT can go from Cj to l.
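For concreteness, the following sketch constructs this game from a CNF formula. The DIMACS-style clause encoding and the node naming are assumptions of the sketch, not notation from the paper; the output uses the plain graph encoding of the earlier sketches.

```python
def sat_to_nc_game(clauses, n):
    """Construct the NC-game of the reduction above for a CNF formula over variables
    x1..xn.  Each clause is a list of non-zero integers (i for xi, -i for its negation).
    Returns (tester_nodes, sut_nodes, edges, gain, initial_node)."""
    tester, sut, edges = set(), set(), {}
    for i in range(1, n + 1):
        sut.add(("d", i))                                  # decision node dx_i (SUT)
        tester |= {("x", i), ("nx", i)}                    # literal nodes x_i and ~x_i (tester)
        edges[("d", i)] = {("x", i), ("nx", i)}
        nxt = ("d", i + 1) if i < n else "y"
        edges[("x", i)] = {nxt}
        edges[("nx", i)] = {nxt}
    tester.add("y")
    edges["y"] = {("C", j) for j in range(1, len(clauses) + 1)}
    for j, clause in enumerate(clauses, start=1):
        sut.add(("C", j))                                  # one SUT node per clause
        edges[("C", j)] = {("x", l) if l > 0 else ("nx", -l) for l in clause}
    gain = {v: 1 for v in tester | sut}
    return tester, sut, edges, gain, ("d", 1)

# The formula of Fig. 2: (x1 v x2 v x3) & (~x1 v ~x2)
game = sat_to_nc_game([[1, 2, 3], [-1, -2]], n=3)
```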
The size of G is m + 3n + 1 and the reduction can be done in polynomial time. If the formula is satisfiable, the SUT can force a cover of at most m + 2n + 1 nodes: intuitively, she can guess a satisfying assignment and restrict the cover to either the node xi or the node x̄i for i = 1, . . . , n (depending on the assignment), dxi for i = 1, . . . , n, y, and Cj for j = 1, . . . , m. If there is no satisfying assignment, she still has to cover dxi for i = 1, . . . , n and either the node xi or the node x̄i for i = 1, . . . , n in the first 2n steps, and she cannot prevent coverage of all Cj for j = 1, . . . , m. Moreover, when read as an assignment, her choice of the nodes xi or x̄i for i = 1, . . . , n must violate some clause Cj. When the tester forces the SUT to cover this Cj, she therefore has to cover one further node xi or x̄i. Note, however, that what we show is the existence of an SUT strategy; consequently, testing the existence of a tester strategy that guarantees a larger cover is coNP-hard.

Lemma 4. The MCG decision problem is NP-hard, even for constant ν.

Proof. We reduce the satisfiability problem of Boolean formulas in conjunctive normal form (CNF) [8] to the MCG decision problem. Suppose we have a CNF formula with atomic propositions x1, x2, . . . , xn and clauses C1, C2, . . . , Cm. The node coverage game ⟨G, dx1⟩ that we construct is formally defined as follows.
• V1 = {y} ∪ {x̄i, xi | 1 ≤ i ≤ n}. For each atomic proposition xi, the nodes x̄i and xi represent interpreting xi as false and true, respectively. y is a special node intuitively used to force the truth valuation of an arbitrary clause (and, ultimately, of all clauses).
• V2 = {dxi | 1 ≤ i ≤ n} ∪ {C1, . . . , Cm}. Each node dxi represents a decision node for choosing between the interpretations xi = false and xi = true. Each node Cj is used for the truth valuation of clause Cj.
• E is the minimal set satisfying the following restrictions:
– for each 1 ≤ i ≤ n, (dxi, x̄i) ∈ E and (dxi, xi) ∈ E;
– for each 1 ≤ i < n, (x̄i, dx_{i+1}) ∈ E and (xi, dx_{i+1}) ∈ E;
– (x̄n, y) ∈ E and (xn, y) ∈ E;
– for each 1 ≤ j ≤ m, (y, Cj) ∈ E;
– for each clause Cj with 1 ≤ j ≤ m and each literal¹ l in Cj, (Cj, l) ∈ E.
• dx1 is the starting node of the game.
• For each node v ∈ V, ν(v) = 1.
Fig. 2 shows the game graph for an example Boolean formula. The size of G is m + 3n + 1 and the reduction can be done in polynomial time. It now suffices to show that the CNF formula Ψ is satisfiable if, and only if, the optimal node coverage of the game graph described above is m + 2n + 1 or smaller (it is precisely m + 2n + 1, but this is unimportant for the proof), and unsatisfiable if, and only if, it is strictly bigger. We prove this claim in two directions.
(⇒) Let us first assume that the best coverage the tester can guarantee is m + 2n + 1 nodes. We want to show that this assumption implies the existence of an interpretation I from {x1, . . . , xn} to {true, false} satisfying the given formula. We observe that the minimal coverage must always include all dxi (n nodes) and y (1 node), because these nodes are always among the first 2n + 1 visited nodes, irrespective of the chosen strategies, and all Cj (m nodes), because the tester can reach each of these nodes from any position of the game. Further, we observe that, within the first 2n visited nodes, there is, for all i ∈ {1, . . . , n}, either a visit to node xi or to node x̄i. These are already m + 2n + 1 nodes.
The interpretation I is defined as follows. Pick optimal strategies σ1 and σ2 of the tester and the SUT, respectively. I assigns true to xi if and only if the node xi is covered in the first 2n steps of play(dx1, σ1, σ2). The set of nodes xi or x̄i visited within the first 2n steps is fixed at the beginning. If the minimal coverage shall be (less than or) equal to m + 2n + 1, then the tester must not be able to cover any additional node. But as the tester can reach each Cj, this means that, for each Cj, some successor node (which is either an xi or an x̄i node) must have been covered in the first 2n steps. According to the construction of the game, the interpretation I that maps each xi to true if xi is covered (and to false if x̄i is covered) therefore makes the CNF formula Ψ true.
(⇐) Vice versa, let us fix such a satisfying interpretation I. We now consider a strategy of the SUT that moves from dxi to xi if xi is evaluated to true by I and to x̄i if xi is evaluated to false by I, and that moves from each Cj to a node xi (resp. x̄i) such that the respective literal xi (resp. ¬xi) in Cj is true. Clearly, the coverage is m + 2n + 1. □

Another interpretation of our proof is that checking the size of the smallest tester trap that contains a node is NP-hard. One comment on our proof is that the games we use have a very simple structure: after removing the node y, they become DAGs. Consequently, each game has entanglement [3] 1, DAG width [2,20] 2, and cycle width [10] 2. Thus, the hardness of Lemma 4 holds even for such simple game graphs.

6. Inclusion in NP of the MCG decision problem

In this section, we present an NP algorithm for the MCG decision problem on general game graphs. Game graphs with simpler structures (DAGs plus self-loops) can be evaluated using minimax in linear time.
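For the simple case just mentioned, a minimax evaluation can look as follows. This is only a sketch for DAGs (plus self-loops), in the plain encoding used earlier; it is not correct for graphs with non-trivial cycles and is not the NP algorithm for general graphs.

```python
from functools import lru_cache

def mcg_on_dag(tester_nodes, sut_nodes, edges, gain, r):
    """Guaranteed node coverage when the game graph is a DAG (plus self-loops, which
    add nothing to node coverage): no node can be revisited, so the value of v is
    nu(v) plus the max (tester node) or min (SUT node) over the values of its proper
    successors.  Linear in |V| + |E| with memoization."""
    @lru_cache(maxsize=None)
    def value(v):
        succ = [u for u in edges.get(v, set()) if u != v]   # drop self-loops
        if not succ:                                        # sink, or only a self-loop
            return gain[v]
        best = max if v in tester_nodes else min
        return gain[v] + best(value(u) for u in succ)
    return value(r)
```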
To show inclusion in NP, we show that the SUT has an optimal strategy that can be described in polynomial space and checked in polynomial time. This may look slightly unusual to the testing community, as a strategy would rather be expected for the tester. But a consequence of our results is that the co-problem of determining whether the tester has a strategy of a certain quality is coNP-complete, and we cannot hope to generally have an optimal tester strategy with a similarly simple description. We start by outlining how an optimal strategy of the SUT can be described, and then give an intuition on why this is the case.

6.1. Main idea of our NP algorithm

The key ingredient of our optimal strategy of the SUT is the following. From each node v, the SUT offers the tester a set P v of nodes that the tester can cover. We refer to P v as the pseudo trap for v. Intuitively, the pseudo trap P v for v is a strongly connected part of the graph, in which the tester can cycle around to visit all nodes in P v . After all nodes in P v have been covered, the tester can leave the pseudo trap P v (unless it is a proper tester trap). Note that P v is not necessarily a tester trap, since the tester may be in a position to leave it. With the exception of singleton sets P v , the SUT, however, must have a successor node in P v for each of her nodes in P v . Starting at a node v where P v = {v} or v is a tester node, the tester obviously cannot obtain a higher value than
– the value obtained for covering all nodes in the pseudo trap P v , plus
– the maximum of the values obtained for the optimal coverings he can enforce from any successor w ∉ P v of his nodes within the pseudo trap.
For tester traps, the latter value is 0.
¹ A literal is a Boolean variable or a negated Boolean variable.
Table 1
Choice of P v and c v in Fig. 3(b).

v     r    a      b      c      d    e    f    g    h    i      j      k      m    n      p      q
P v   {r}  {a,c}  {b,c}  {a,c}  {d}  {e}  {f}  {g}  {h}  {i,j}  {j,k}  {j,k}  {m}  {n,q}  {p,q}  {p,q}
c v   9    7      8      7      8    5    4    6    6    5      5      5      3    2      2      2
A sufficient description of an optimal SUT strategy is to provide, for each node v, such a pseudo trap P v and the gain c v that the tester can at most obtain against the SUT when starting in v. A witness can thus be described in the simple form provided in Table 1. Checking whether a description is such an SUT strategy can be done in PTIME.
We use the game graph in Fig. 3(a) to explain these ideas. The game graph can be partitioned into three non-trivial maximal SCCs: the nodes in regions (I), (II), and (III). There are also two trivial maximal SCCs, node h and node m. The maximal SCCs form a DAG (directed acyclic graph). After moving from one maximal SCC to another, the play cannot return to the previous SCCs. Thus, for better coverage, the tester can only hope to cover as many nodes as possible in one maximal SCC before moving to the next. However, the tester can only direct a play along a root-to-leaf path through the DAG. As a minimizer, the SUT can install pseudo traps within the non-trivial SCCs from Fig. 3(a) to restrict the coverage of the plays. For example, in Fig. 3(b), the SUT may introduce six different non-trivial pseudo traps:
• P a = P c = {a, c},
• P b = {b, c},
• P i = {i, j},
• P j = P k = {j, k},
• P n = {n, q}, and
• P p = P q = {p, q}.
As we can see, the pseudo traps are not disjoint. Consequently, determining the relevant pseudo trap requires memory: it has to be updated every time the relevant pseudo trap is left. It is also easy to see why this memory is necessary: a memoryless SUT strategy that always moves from q to p would imply that, when starting in m, the tester can cover {m, n, p, q} by moving to n. Likewise, when the SUT always moves from q to n, the tester could cover {m, n, p, q} when starting from m by moving to p. Memorizing the relevant trap allows the SUT to always play back into that trap, resulting in a smaller cover.
The choice of the pseudo traps is accompanied by the cover obtained when starting at each individual node, provided in the second line of Table 1. Adding these values is for convenience; they can be inferred from the selected pseudo traps. To explain the values further, let us consider a game that starts in a. For example, ca = 7 and P a = {a, c} imply that, after a is first entered, the SUT would enforce the following.
• The SUT would only allow seven nodes (including a) to be covered from now on.
• As P a is a non-trivial SCC, the SUT will not leave P a from an SUT node. (Thus, P a can only be left from a node in P a ∩ V1.)
• After leaving P a , at most five nodes (ca − |P a | = 5) will be covered.
Once the values of P v are chosen, the values of c v can be computed in a bottom-up way. For example, cr = 9 reflects the claim that the SUT can restrict the coverage from r to 9 nodes. In order to do so, she first allows the tester to cover P r = {r}. The claim for P r boils down to "the tester is allowed to gain as much as he can in {r} until he decides to leave P r , and he will not visit P r again." Calculating cr , we find that it is Σ_{v ∈ P r} ν(v) + max(ca, cb) = 9. A similar explanation applies to ca = Σ_{v ∈ P a} ν(v) + ce = 7 and ch = Σ_{v ∈ P h} ν(v) + max(ci, ck) = 6.
From r, the tester can choose to go either to a or to b. This choice determines the next pseudo trap that the game moves to, P a = {a, c} or P b = {b, c}, both of which are actual pseudo traps. The value of P c does not matter, since the choice at c is already determined when either a or b is first entered.
Note that such a description might be consistent without the SUT strategy being optimal. To emphasize this, we have described a non-optimal strategy. An optimal strategy is described in Table 2. The proof that an optimal strategy can be described in such a simple way leads us to the interesting observation of prefix-consistency for optimal SUT strategies. Prefix-consistency is a weaker requirement than memorylessness: it allows the selection of a successor of an SUT node v to depend on the history, but if v itself has occurred before, the same successor needs to be chosen as before. Given her objective to minimize the cover, the existence of optimal prefix-consistent strategies for the SUT is not surprising.
Fig. 3. Framework of resilience design.
Table 2
Optimal choice of P v and c v for the game from Fig. 3(a).

v     r    a    b    c      d      e        f        g        h    i      j      k      m    n      p      q
P v   {r}  {a}  {b}  {c,d}  {c,d}  {e,f,g}  {e,f,g}  {e,f,g}  {h}  {i,j}  {j,k}  {j,k}  {m}  {n,q}  {p,q}  {p,q}
c v   8    4    7    2      2      3        3        3        6    5      5      5      3    2      2      2
In the following, we shall establish the existence of prefix-consistent optimal SUT strategies as an intermediate lemma.
Once we restrict our focus to prefix-consistent optimal SUT strategies, one can view a run as the building of a directed graph. Let us assume we start in a node v and the SUT follows a prefix-consistent strategy. Then there is a (not necessarily unique) maximal SCC that contains v and that can be constructed by the tester. (This might be the trivial component that contains only v if the tester cannot return to v at all.) The pseudo trap P v can be chosen to reflect such an SCC. It is an abstraction in that it only records the nodes occurring in this SCC, but this is enough for the purpose of offering to cover it.
Fig. 4. Game graphs with memoryful strategies.
Note that, with such an SCC component, the tester could do exactly this: first cover it completely, and then leave it from a tester node of his choice—unless it is a tester trap or a trivial SCC component that consists only of a single SUT node. Finally, note that a set P_{v′} chosen after leaving a previously assigned set P v to a node v′ ∉ P v can be selected by the tester independent of the history.

6.2. Existence of prefix-consistent optimal SUT strategies

Before we turn to the NP-completeness proof, we will establish that the SUT can play prefix-consistently, that is, it can make the same decision every time it comes to the same node. Note that this is different from memorylessness, as the history up to the first visit of v may determine the choice. For example, in Fig. 4, the optimal strategy for the SUT is prefix-consistent but not memoryless. The SUT must check whether the tester has chosen v1 or v2 to decide whether to transit to v1 or v2 from v3, in order to contain the coverage at 3 instead of 4. Prefix-consistent strategies make sense for the SUT, since choosing an already-executed transition does not increase the coverage gain of the tester. Prefix-consistent strategies prove to be sufficient as optimal strategies of the SUT. The formal definition of prefix-consistent strategies follows. Note that Fig. 4 also serves as an example where the SUT needs memory for her optimal decisions. This need for memory is reflected in the strategy described above: the choice of the SUT is not made when v3 is reached, but when v0 is left, either to v1 or to v2. At this point, the memory is taken into account.

Definition 8 (Prefix-consistent strategies). A strategy σ ∈ Σ is prefix-consistent if, for every two play prefixes φ and φψ with last(φ) = last(φψ), σ(φ) = σ(φψ). □

For example, with the game graph from Fig. 4, a strategy satisfying [v0 v1 (v3 v1)* v3 → v1, v0 v2 (v3 v2)* v3 → v2] is prefix-consistent, while a strategy satisfying [v0 V* v3 v1 v3 → v2, v0 V* v3 v2 v3 → v1] is not prefix-consistent.

Lemma 5. In an NC-game, the SUT has an optimal strategy which is prefix-consistent.

Proof. Assume that this is not the case. Let σ−1 be an optimal strategy, in the sense that it guarantees the minimal gain c for the SUT; it may or may not be prefix-consistent. We define a sequence of SUT strategies σ0, σ1, σ2, σ3, . . . as follows. Let φ = v0 v1 v2 . . . vi be the prefix of a σ_{i−1}-conform play. If vi ∈ V2, then we choose a φ′ such that
1. φφ′ is the prefix of a σ_{i−1}-conform play,
2. φφ′ ends in vi, and
3. among such φφ′, the set of nodes covered is maximal; that is, irrespective of the strategy of the tester, no further node is covered on any finite sequence that returns to vi.
We use this φ′ to infer σi from σ_{i−1} as follows. We define, for every history φφ1φ2 with last(φφ1) = vi and vi ∉ ⟦φ2⟧ (that is, φ1 is either empty or ends in vi, and φ2 does not contain vi), σi(φφ1φ2) = σ_{i−1}(φφ′φ2).
We now select the limit strategy σ∞ = lim_{i→∞} σi. σ∞ is well defined, as the reaction on a play prefix of length i is the same as the reaction of σi. The way we update the function clearly ensures prefix-consistency. We show optimality by induction. As induction basis, σ−1 is optimal by assumption. For the induction step, let us assume for contradiction that σ_{i−1} is optimal, but σi is not; that is, assume that ψ = v0 v1 . . . vi . . . is a σi-conform play with ν(ψ) > c. We now distinguish two cases.
1. vi occurs infinitely often in ψ. But then, by condition (3) of our construction, the nodes covered by ψ are covered by the play prefix φφ′ of a σ_{i−1}-conform play. Consequently, ⟦ψ⟧ ⊆ ⟦φφ′⟧ holds, which implies ν(ψ) ≤ ν(φφ′). As our induction hypothesis in particular implies ν(φφ′) ≤ c, this contradicts the assumption ν(ψ) > c.
2. vi occurs only finitely often; say it occurs last at position k ≥ i. Let ψ′ = v0 v1 . . . vk and ψ = ψ′ψ′′. Then, by the same argument as above, the nodes covered in ψ′ are contained in the nodes covered by φφ′ (⟦ψ′⟧ ⊆ ⟦φφ′⟧).
Also, by construction, φφ′ψ′′ is a σ_{i−1}-conform play. We thus obtain ⟦ψ′ψ′′⟧ ⊆ ⟦φφ′ψ′′⟧, and consequently ν(ψ) ≤ ν(φφ′ψ′′). As our induction hypothesis provides ν(φφ′ψ′′) ≤ c, this contradicts the assumption ν(ψ) > c.
After having established that all σi are optimal, let us assume for contradiction that σ∞ is not, that is, that there is a σ∞-conform play φ with ν(φ) = c′ > c. But then, some finite initial sequence φ′ of φ must have the gain c′ = ν(φ′), and φ′ must be a prefix of a σ_{|φ′|}-conform play ψ. Consequently, ν(ψ) ≥ ν(φ′) > c holds (contradiction). □

6.3. NP algorithm

Having established that it is enough to consider prefix-consistent SUT strategies, we outline an algorithm that guesses a witness SUT strategy that includes, for each node v of the game,
– the coverage c v that the SUT allows the tester to obtain if we start from this node (that is, in the game that is modified only in that the initial node is changed to v), and
– a set P v ⊆ V (the pseudo trap from v) of nodes that includes v and satisfies the following constraints:
  • P v = {v} is a singleton and v is an SUT node, or
  • for each SUT node u in P v there is a node w in P v such that (u, w) ∈ E.
For a game that starts in v, the SUT would intuitively offer the tester to cover P v . The game is then intuitively divided into two phases: the 'covering phase', where P v may be covered by the tester, followed by a phase where the game does not return to P v . We will argue that the SUT can play optimally by playing from a node w reached after P v is left as if the game would start in w. Not returning to P v is merely a property of an optimal SUT strategy that starts in w and not technically required.
Intuitively, the SUT strategy explained above is recursive in the sense that, at a node v, the SUT proposes to contain the plays in P v until an exit (u, w) is chosen by the tester with u ∈ P v ∩ V1 and w ∉ P v , and then the strategy works recursively at w. (Unless P v = {v} ⊆ V2 is a singleton set that contains an SUT node, in which case the SUT selects the successor.) This recursion is also reflected by our calculation of the c v from the SCCs succeeding P v . Based on these observations, we thus have the following proposal for an optimal SUT strategy.

Definition 9 (Witness strategy). A witness strategy is given as a set {(P v , c v ) ∈ 2^V × N | v ∈ V} if it satisfies the following side constraints for all v ∈ V.
• v ∈ P v , and
• if P v is not a singleton, then, for all SUT nodes w ∈ P v ∩ V2, there is an edge (w, w′) ∈ E with w′ ∈ P v ; that is, the SUT can stay in non-singleton pseudo traps.
The witness strategy for the SUT described by this set uses a memory of one node v m , where v m is initialized to the initial node. For the memorized node v m , the SUT tries to stay in P v m at her nodes. Whenever the pseudo trap P v m is left to a node v ∉ P v m , then v m is updated to v. The SUT tries to stay in P v m on her turn. By definition, this is possible unless P v m = {v m } is a singleton, v m is an SUT node, v m is not a sink, and the self-loop (v m , v m ) is not an edge of the game. In this case, the SUT moves to a successor node v with minimal c v among the successors of v m .
A witness strategy is consistent if it satisfies the following side constraints:
– Case S (for singleton): if P v = {v}, v is an SUT node, v is not a sink, and the self-loop (v, v) is not an edge of the game, then we have c v = min{c u + ν(v) | (v, u) ∈ E}, and
– Case T (for trapping): otherwise, c v = max{c w | u ∈ P v ∩ V1, (u, w) ∈ E, w ∉ P v } + ν(P v ), where the maximum over the empty set is 0 (to account for the case that P v is a tester trap). □

Consistency of witness strategies is easy to check.

Lemma 6. Consistency can be checked in polynomial time. □
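As an illustration of Definition 9 and Lemma 6, the following sketch checks consistency of a proposed witness. The plain set/dict graph encoding of the earlier sketches is assumed; the function name and argument shapes are purely illustrative.

```python
def is_consistent_witness(tester_nodes, sut_nodes, edges, gain, P, c):
    """Check the side constraints of Definition 9 together with Cases S and T, i.e.,
    whether {(P[v], c[v]) | v} is a consistent witness.  P maps every node to a set
    of nodes and c maps every node to a number; runs in polynomial time (Lemma 6)."""
    for v in tester_nodes | sut_nodes:
        Pv, succ = set(P[v]), edges.get(v, set())
        if v not in Pv:
            return False
        if len(Pv) > 1 and any(not (edges.get(w, set()) & Pv) for w in Pv & sut_nodes):
            return False                      # some SUT node could be forced out of P_v
        case_s = Pv == {v} and v in sut_nodes and bool(succ) and v not in succ
        if case_s:                            # Case S: singleton SUT node that must move on
            expected = min(c[u] for u in succ) + gain[v]
        else:                                 # Case T: gain of P_v plus the best exit value
            exits = [c[w] for u in Pv & tester_nodes
                     for w in edges.get(u, set()) if w not in Pv]
            expected = (max(exits) if exits else 0) + sum(gain[u] for u in Pv)
        if c[v] != expected:
            return False
    return True
```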
To follow a consistent witness, the SUT needs a set variable T to record the current set of nodes that the tester is allowed to cover. The following steps describe how a consistent witness strategy can be followed. Initially, T is ∅. At a node v, the SUT first updates T according to the following cases.
• If v is a case S node, T is set to ∅.
• If v is a case T node, the following cases are further considered.
– If T is ∅, then T is set to P v . (This refers to the case that v is the initial node.)
– If v ∈ T, then we are still in the same set of nodes that the witness strategy has allowed the tester to cover. Thus, we ignore P v and let T stay unchanged.
– If v ∉ T, then we have just left the last set of nodes that the witness strategy allowed the tester to cover. Thus, we reset T to P v . Note that v may be a tester node.
Then, at a node v ∈ V2, the SUT makes the following decision.
• If T = ∅, this implies that v is a case S node, and the SUT picks a successor u with (v, u) ∈ E and c u = min{c w | (v, w) ∈ E}.
• If T ≠ ∅, the SUT chooses a u ∈ T with (v, u) ∈ E.
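The procedure above can be sketched as a single step function; again, the plain encoding of the earlier sketches is assumed, and names such as sut_step are purely illustrative.

```python
def sut_step(tester_nodes, sut_nodes, edges, c, P, T, v):
    """One step of the SUT following a consistent witness (P, c), as described above.
    T is the memorized set of nodes the tester is currently allowed to cover; returns
    the updated T and, if v is an SUT node, the successor chosen by the SUT (None
    otherwise)."""
    succ = edges.get(v, set())
    case_s = set(P[v]) == {v} and v in sut_nodes and bool(succ) and v not in succ
    # update the memorized set T
    if case_s:
        T = set()                              # case S node: forget the pseudo trap
    elif not T or v not in T:
        T = set(P[v])                          # initial node, or we just left the old trap
    choice = None
    if v in sut_nodes:
        if not T:                              # case S: move towards the smallest c-value
            choice = min(succ, key=lambda u: c[u])
        else:                                  # stay inside the current pseudo trap
            choice = next(iter(succ & T))
    return T, choice
```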
Lemma 7. If ν(v) ≥ 1 for all v ∈ V, then the SUT can, for every consistent witness, guarantee a gain of at most c v from every node v ∈ V.

Proof. We show this by induction over c v . For the induction basis, this is clearly true if P v is a tester trap, as this implies that {c w | u ∈ P v ∩ V1, (u, w) ∈ E, w ∉ P v } is empty, and consequently c v = ν(P v ) holds. Note that this is in particular the case for c v = 1. For the induction step, we can follow the witness strategy to obtain, by the induction hypothesis, all guarantees smaller than c v . We now distinguish two cases.
1. Case S of v: Then there must be a successor u (with (v, u) ∈ E) with c u = c v − ν(v) < c v . By the induction hypothesis, the SUT can restrict the gain from u to c u using the witness strategy, and consequently the gain from v to c v .
2. Case T of v: Following a witness strategy allows the tester to first cover (some or all) nodes in P v , and then possibly to continue to a successor w ∉ P v of a tester node in P v . (Recall that a witness strategy of the SUT will not leave P v from an SUT node.) We have c v > max{c x | u ∈ P v ∩ V1, (u, x) ∈ E, x ∉ P v } ≥ c w by the consistency of the witness. Thus, the tester can guarantee a gain of at most c w after moving to w, while clearly at most ν(P v ) has been gained before. □

Note that the correctness argument does not require that P v is not visited again. It is indeed possible to construct consistent witnesses that allow this—not all consistent witnesses are optimal. We now show that an optimal consistent witness exists.

Theorem 2. For a game with ν(v) ≥ 1 for all v ∈ V where an optimal SUT can restrict the gain to m v when starting in node v, there is a consistent witness with c v = m v for all nodes v.

Proof. We show by induction that, for games with a minimal gain of m v , we can construct a witness with c v = m v for all v ∈ V. By the previous lemma, this shows that the SUT can force that, when starting at v, the gain is bounded by m v . For the induction basis, this is clearly true if there is a tester trap P v with m v = ν(P v ). For such tester traps, we can simply select c v = ν(P v ) = m v . Note that this in particular includes the case m v = 1.
For the induction step, we exploit that prefix-consistent strategies are sufficient for the SUT (Lemma 5). Let us assume that m v is the correct value, and that the SUT uses a fixed prefix-consistent optimal strategy. Then we can look at a run as a producer of edges in a directed graph: we start with the game graph restricted to the transitions whose source is a tester node; that is, we remove exactly the transitions that exit an SUT node. During the run, every time we pass an SUT node, we add the selected transition to the digraph. For a run, we choose the maximal of these graphs such that the added transition belongs to the same SCC as the starting node v. For every run, there is obviously one such graph, and there is obviously a (not necessarily unique) graph among them where this SCC is maximal. We select such a graph G and select P v to be the set of nodes in the same SCC as v. For the SUT nodes in this SCC, we memorize the (due to prefix-consistency unique) outgoing edge selected by the strategy.
Moreover, we pick, for each successor w of a tester node in P v , a history h w consistent with our prefix-consistent optimal strategy such that (1) initially P v is covered completely and (2) then a transition to w is taken. It is obvious that such a history exists, as a play in which P v has become an SCC of the constructed graph can be extended to cover P v and then to move on to w. From w, the SUT plays as if it used the old strategy from v and had previously seen h w . We call the inferred strategy of the SUT to continue from w w-consistent. Note that a w-consistent strategy is (1) prefix-consistent and (2) guarantees that P v is not visited again, as P v is a maximal SCC.
Note that the following strategy provides the same guarantees: while in P v , stay in P v , using the transitions from G; once P v is left to a node w, follow the w-consistent strategy. It is therefore an optimal prefix-consistent strategy from v. (It is optimal because the SUT has the described strategy to achieve the same guarantees as under the optimal prefix-consistent strategy we started with.) Note that, by construction, P v is not reachable from w under the w-consistent strategy. (If it were, the SCC would not be maximal.) Consequently, we have c w < c v , and, by the induction hypothesis, we can verify the correct c w using our witnesses.
Let us assume for contradiction that this c w is smaller than the result obtained by using the w-consistent strategy from w. Then the strategy from above could be strictly improved to: while in P v , stay in P v , using the transitions from G; once P v is left to w, follow an optimal strategy that starts in w. This contradicts the optimality of the strategy.
This leaves the special case where P v = {v} contains a single SUT node without a self-loop, but with some successor. In this case, we have a clearly defined w-consistent strategy for the successor w selected by the SUT (and we memorize the edge (v, w)), together with the guarantee that, under the w-consistent strategy, v cannot be visited. This provides us with a strategy for the SUT to restrict the gain from w to c v − ν(v). □

Note that the restriction to ν(v) ≥ 1 is not a real restriction. For a game with n nodes, we can use the gain function ν′ with ν′(v) = 1 + n · ν(v) instead: a strategy is optimal for ν if it is optimal for ν′. Putting the lemmas of this section together, we obtain inclusion in NP. With the matching hardness result from Lemma 4, we get our main theorem.

Theorem 1. The MCG decision problem is NP-complete for both constant and general ν. □

7. Some variations

7.1. Resettable NC-games

From the perspective of a tester, one might argue that it should be possible to re-start tests. A simple reduction from the MCG decision problem with re-start to the MCG problem without re-start is provided here. As the hardness argument is not affected, we obtain a similar theorem.
A slight variation of the problem is thus to allow the tester to re-start the game at any time. This variation is natural, as one can argue that a test can be re-started. We first note that the hardness proof is not affected: the SUT can repeatedly guess the same satisfying assignment for the CNF SAT problem. For the inclusion, we can compare two games: the game with re-set, and a game where every node is doubled into an in-node and an out-node, both with the same gain. A transition from v to w becomes a transition from the out-node of v to the in-node of w, and the new initial node is the in-node of the old initial node. The in-node of v has two outgoing transitions: one to the new initial node and one to the out-node of v. While all in-nodes are tester nodes, the out-node of v belongs to the player that owned v. The only other change applied is that, if v is a sink in the game with re-set, then we add a transition from the out-node of v to the new initial node. It is now easy to show that the maximal gain of the re-start game is exactly half the maximal gain of the new game. We can show this by simulating the games.

Theorem 3. The MCG decision problem with re-start is NP-complete for both constant and general ν. □

7.2. Edge coverage games

Edge coverage is another basic testing criterion in industry. From the game perspective, the tester wants to cover as many edges as possible for this criterion. It corresponds to branch coverage, since the edges from a node can be interpreted as branching decisions at the node.

Definition 10 (Edge coverage game, EC-game). An edge coverage game ⟨G, r⟩ is a pair of an edge game graph G = ⟨V1, V2, E, ε⟩ and an initial node r ∈ V1 ∪ V2. The edge game graph is identical to a node game graph except that ε now assigns natural-number weights to the edges.
In the game, the tester (player 1) tries to cover as much edge weight as possible in plays, while the SUT (player 2) tries to cover as little edge weight as possible. □

For a play φ of an EC-game ⟨G, r⟩, we use ε(φ) to denote the weight sum of the edges in φ, that is, ε(φ) = Σ_{(v,u) ∈ {(φ(k), φ(k+1)) | k ∈ N}} ε(v, u). The maximal edge coverage guarantee (MECG) from r of G, denoted mecg(G, r), is max_{σ1 ∈ Σ} min_{σ2 ∈ Σ} ε(play(r, σ1, σ2)). A strategy σ1 of the tester is optimal if it can be used by the tester to achieve at least mecg(G, r) coverage no matter how the SUT responds in the game. The complexity of a computational problem is usually studied in the framework of decision problems. Our MECG decision problem is defined as follows.

Definition 11 (MECG decision problem). Given an EC-game ⟨G, r⟩ and a c ∈ N, the MECG decision problem asks whether mecg(G, r) ≤ c. □
We can establish the complexity upper bound of the MECG decision problem by reducing any MECG problem to an MCG problem. The idea is to treat each edge of the EC-game as a special node. Specifically, for any EC-game ⟨G, r⟩ with G = ⟨V1, V2, E, ω⟩, we construct an NC-game ⟨G', r⟩ with G' = ⟨V1', V2', E', ν'⟩ as follows (a constructive sketch of this reduction and of the converse reduction below is given after Theorem 4).

• V1' = V1 ∪ E and V2' = V2. For convenience, the nodes introduced from E into V1' are called edge nodes.
• E' = {(u, (u, v)), ((u, v), v) | (u, v) ∈ E}.
• ν'(x) = 0 if x ∈ V1 ∪ V2, and ν'(x) = 1 if x ∈ E.

Note that each edge (u, v) becomes a node of the NC-game G' with exactly one outgoing edge, so there is no non-trivial decision to make at edge nodes. With this reduction, we can establish the complexity upper bound of the MECG decision problem.

Lemma 8. Given an EC-game G, the NC-game G' constructed as above, and a c ∈ N, mecg(G, r) ≤ c if and only if mcg(G', r) ≤ c. □

Similarly, we can establish the complexity lower bound of the MECG decision problem by reducing the MCG decision problem to the MECG decision problem. The idea is to convert each node of the NC-game into an edge. Given an NC-game ⟨G, r⟩ with G = ⟨V1, V2, E, ν⟩, we construct an EC-game ⟨G', r⟩ with G' = ⟨V1', V2', E', ω⟩ as follows.
• V1' = V1 ∪ {v̄ | v ∈ V1} and V2' = V2 ∪ {v̄ | v ∈ V2}.
• E' = {(v̄, v) | v ∈ V1 ∪ V2} ∪ {(u, v̄) | (u, v) ∈ E}.
• For all v ∈ V1 ∪ V2, ω((v̄, v)) = ν(v), and every edge of the form (u, v̄) gets weight ω((u, v̄)) = 0.

Note that a newly added node v̄ has exactly one outgoing edge, so there is no non-trivial decision to make at v̄. In any play, visiting v̄ forces the edge (v̄, v) to be covered next, so the coverage contributed through v̄ is the same as that of the edge (v̄, v). We can then establish the complexity lower bound of the MECG problem with the following lemma.

Lemma 9. Given an NC-game G, the EC-game G' constructed as in the last paragraph, and a c ∈ N, mcg(G, r) ≤ c if and only if mecg(G', r) ≤ c. □

Note that both reductions above are computable in PTIME. Based on the two lemmas above, we can establish the following theorem on the NP-completeness of the MECG decision problem.

Theorem 4. The MECG decision problem is NP-complete. □
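Both reductions are purely syntactic graph transformations. The following Python sketch mirrors the two bullet lists above; it is an illustration under assumed data representations, and the function names ec_to_nc and nc_to_ec as well as the tuple/dictionary encodings are hypothetical, not the paper's notation.

# Illustrative sketch of the two PTIME reductions of Section 7.2.
# The game representation (plain sets/dicts) and the helper names are
# assumptions for this example.

def ec_to_nc(V1, V2, E, omega):
    """EC-game -> NC-game: every edge (u, v) becomes an 'edge node' owned by
    the tester; original nodes get weight 0."""
    V1p = set(V1) | set(E)          # V1' = V1 ∪ E (edge nodes are tester nodes)
    V2p = set(V2)                   # V2' = V2
    Ep = set()
    for (u, v) in E:
        Ep.add((u, (u, v)))         # enter the edge node ...
        Ep.add(((u, v), v))         # ... and leave it to the edge's target
    nu_p = {x: 0 for x in set(V1) | set(V2)}
    nu_p.update({e: 1 for e in E})  # ν'(x) = 1 for edge nodes, 0 otherwise
    return V1p, V2p, Ep, nu_p

def nc_to_ec(V1, V2, E, nu):
    """NC-game -> EC-game: every node v gets a copy v̄ whose single outgoing
    edge (v̄, v) carries the node weight ν(v); all other edges get weight 0."""
    def bar(v):
        return ("bar", v)           # a fresh copy of v (hypothetical encoding)
    V1p = set(V1) | {bar(v) for v in V1}
    V2p = set(V2) | {bar(v) for v in V2}
    Ep, omega = set(), {}
    for v in set(V1) | set(V2):
        Ep.add((bar(v), v)); omega[(bar(v), v)] = nu[v]
    for (u, v) in E:
        Ep.add((u, bar(v))); omega[(u, bar(v))] = 0
    return V1p, V2p, Ep, omega

Both constructions add only linearly many nodes and edges, which matches the PTIME claim above.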
8. Practical testing when the SUT is not hostile

Our approach assumes a zero-sum game model, in which the SUT tries to minimize the coverage. Such a model is useful for deriving a theoretical lower bound on the coverage guarantee. This lower bound is conservative, however, and might be too pessimistic in practice. In this section, we discuss how to use the knowledge of the MCG and its certificates to generate test cases for an SUT that is not known to be hostile. The motivation is that the assumption that the SUT actively opposes high coverage is not natural, but merely conservative: the game graph is the result of an abstraction, and there is no reason to assume that the SUT plays rationally or prefix-consistently.

As explained in Subsection 6.1, the MCG certificate consists of a pseudo trap P_v for each node v ∈ V1 ∪ V2, where P_v describes an SCC that a prefix-consistent optimal SUT allows the tester to cover. For the example from Fig. 4, we can use the following certificate: P_v0 = {v0}, P_v1 = {v1, v3}, P_v2 = {v2, v3}, and P_v3 = {v3}. Note that P_v1 and P_v2 override P_v3, since v3 ∈ P_v1 and v3 ∈ P_v2. After a pseudo trap is fully covered, the tester can leave it to a node of his choice, where he can obtain the highest total coverage. The pseudo traps together constitute a witness for a strategy of the SUT. For each MCG certificate, we can thus construct a chain of pseudo traps visited from the initial node such that the sum of the sizes of the pseudo traps in the chain is exactly the MCG. For example, in Fig. 4, a pseudo trap chain would be P_v0 P_v1. Based on the pseudo trap chain, we can then construct a test case, i.e., a play prefix, that corresponds to the full expansion of the component SCCs in the chain. Specifically, given such an SCC chain S0 S1 . . . Sm, the test case is a chain of play prefixes ψ0 ψ1 . . . ψm with the following constraints.

• For each k ∈ [0, m], ψk is a path in Sk that covers all nodes in Sk.
• For each k ∈ [0, m − 1], (last(ψk), ψk+1(0)) ∈ E.

For convenience, we call such a ψ0 . . . ψm an MCG test case. For example, in Fig. 4, an MCG test case would be v0 v1 v3 v1.
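One simple way to expand an SCC chain into such a test case is to compute, inside each pseudo trap, a walk that visits all of its nodes, and to join consecutive walks by a single edge of the game graph. The sketch below is only an illustration of this step: the adjacency-dictionary representation, the helper names, and the BFS-based covering walk are assumptions made for this example, not the paper's T_MCG procedure.

# Illustrative sketch: expand a chain of SCCs (pseudo traps) into a play
# prefix. Assumes each SCC is strongly connected and that consecutive SCCs
# in the chain are connected by at least one edge of the game graph.
from collections import deque

def shortest_path(adj, src, targets, allowed):
    """BFS inside `allowed` from src to the nearest node in `targets`."""
    parent, queue = {src: None}, deque([src])
    while queue:
        x = queue.popleft()
        if x in targets:
            path = []
            while x is not None:
                path.append(x); x = parent[x]
            return path[::-1]
        for y in adj.get(x, ()):
            if y in allowed and y not in parent:
                parent[y] = x; queue.append(y)
    return None

def covering_walk(adj, scc, start):
    """A walk (it may revisit nodes) inside `scc` that starts at `start`
    and visits every node of the SCC."""
    walk, uncovered = [start], set(scc) - {start}
    while uncovered:
        hop = shortest_path(adj, walk[-1], uncovered, set(scc))
        walk += hop[1:]
        uncovered -= set(hop)
    return walk

def mcg_test_case(adj, scc_chain, root):
    """Concatenate covering walks ψ0 ψ1 ... ψm over the pseudo-trap chain,
    linked by single edges (last(ψk), ψk+1(0)) ∈ E."""
    case, current = [], root
    for k, scc in enumerate(scc_chain):
        psi = covering_walk(adj, scc, current)
        case += psi
        if k + 1 < len(scc_chain):
            # pick any edge from the last node of ψk into the next SCC
            current = next(y for y in adj[psi[-1]] if y in scc_chain[k + 1])
    return case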
Algorithm 1. Dynamic testing of an SUT that is not hostile.
1:  while true do
2:    Call T_MCG(G, r) and assume it returns an MCG test case v0 v1 . . . vn.
3:    for (k = 0; k < n; k++) do
4:      if vk ∈ V2 then
5:        Get the next node u from the SUT.
6:        if u ≠ vk+1 then break the for-loop; end if
7:      end if
8:    end for
9:    if k == n then exit, since the SUT has been as hostile as expected on the present MCG test case v0 . . . vn; else
10:     for every node v of G do reset ν(v) to zero if v ∈ {v0, . . . , vk, u}; end for
11:     Let r be u.
12:   end if
13: end while
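For concreteness, the following Python rendering of Algorithm 1 is a minimal sketch, not part of the paper. It assumes a procedure t_mcg(game, r) that returns an MCG test case and an SUT interface sut.next_node() that reports the SUT's move at SUT-owned nodes; both interfaces, as well as the attributes game.V2 and game.nu, are hypothetical.

# Minimal sketch of Algorithm 1, under assumed interfaces:
#   t_mcg(game, r)   -> an MCG test case [v0, v1, ..., vn] (hypothetical oracle)
#   sut.next_node()  -> the node the SUT actually moves to (hypothetical stub)
# `game.V2` is the set of SUT nodes and `game.nu` the node-weight map.

def dynamic_test(game, r, t_mcg, sut):
    while True:
        case = t_mcg(game, r)                     # line 2: v0 v1 ... vn
        deviation = None
        for k in range(len(case) - 1):            # lines 3-8
            if case[k] in game.V2:                # line 4: SUT-owned node
                u = sut.next_node()               # line 5
                if u != case[k + 1]:              # line 6: unexpected move
                    deviation = (k, u)
                    break
        if deviation is None:                     # line 9: the SUT was as
            return case                           #   hostile as expected; stop
        k, u = deviation
        for v in list(case[:k + 1]) + [u]:        # line 10: covered nodes
            game.nu[v] = 0                        #   no longer carry weight
        r = u                                     # line 11: restart from u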
If an MCG test case ψ is executed as expected, then the theoretical coverage lower bound is achieved. Otherwise, an unexpected node u is observed while executing ψ; we can then adjust the weights (i.e., reset the weight of all covered nodes to zero), obtain a new MCG test case for G with u as the new initial node, and start testing again. We can repeat these steps until an MCG test case is executed as expected. Assuming the availability of a procedure T_MCG(G, r) that generates such an MCG test case for G from node r, we present Algorithm 1, a dynamic testing procedure for SUTs that may not be hostile. The procedure has the advantage that it guarantees the coverage lower bound when the SUT is hostile and, at the same time, adapts to gain more node coverage when the SUT is not so hostile. It repeatedly obtains an MCG test case v0 . . . vn at line 2 and then executes the test case with the loop from line 3 to line 8. The execution loop is broken only when the node chosen by the SUT is not the one expected by the MCG test case, at line 6. If the MCG test case is executed as expected, then the SUT has acted as hostilely as expected on this MCG test case and we exit. Otherwise, the SUT was not so hostile and we try the next MCG test case.

A further improvement might be obtained by optimistically extending the test cases. We first describe this for tester traps. If the pseudo trap we are currently in (according to the test plan) is a tester trap that is already covered, then a test plan would normally end. It might, however, be possible to leave the tester trap and cover further nodes in case the SUT cooperates. In the example from Fig. 4, one might play repeatedly into node v3 in the hope that the SUT eventually plays to the node that is not yet covered. Similarly, when any non-trivial pseudo trap is already covered, the tester can check whether there are SUT nodes from which the SUT might leave the pseudo trap to a node that provides a higher coverage than any node reachable from the tester-owned nodes of the pseudo trap. In this case, the tester could direct the game to such an SUT node repeatedly (hoping that the SUT takes such a transition) before leaving the pseudo trap himself. Note, however, that there is no theoretical answer to the question of how long the tester should keep trying either of these options.

9. Conclusion

We have investigated the theory behind coverage techniques in software testing from a game perspective. We have established that the MCG decision problem is NP-complete, while the coverage problems in previous game-graph frameworks [5,22] are PSPACE-complete. This may imply that our framework of NC-games can bring about more efficiency in various computational aspects of software testing than those of [5,22].

References

[1] P. Ammann, J. Offutt, Introduction to Software Testing, Cambridge University Press, 2008.
[2] D. Berwanger, A. Dawar, P. Hunter, S. Kreutzer, Dag-width and parity games, in: STACS'06, in: LNCS, vol. 3884, 2006, pp. 524–536.
[3] D. Berwanger, E. Grädel, Entanglement – a measure for the complexity of directed graphs with applications to logic and games, in: LPAR'04, in: LNCS, vol. 3452, Springer-Verlag, 2005, pp. 209–223.
[4] L. Briand, Y. Labiche, A UML-based approach to system testing, Softw. Syst. Model. 1 (1) (2002) 10–42.
[5] K. Chatterjee, L. de Alfaro, R. Majumdar, The complexity of coverage, in: 6th Asian Symposium on Programming Languages and Systems (APLAS), in: LNCS, vol. 5356, Springer-Verlag, 2008, pp. 91–106.
[6] P. Chevalley, P. Thévenod-Fosse, Automated generation of statistical test cases from UML state diagrams, in: 25th IEEE Annual International Computer Software and Applications Conference (COMPSAC), October 2001.
[7] T. Chow, Testing software designs modeled by finite-state machines, IEEE Trans. Softw. Eng. SE-4 (3) (May 1978) 178–187.
[8] S. Cook, The complexity of theorem proving procedures, in: 3rd ACM Symposium on Theory of Computing (STOC), ACM, 1971, pp. 151–158.
[9] L. de Alfaro, T.A. Henzinger, O. Kupferman, Concurrent reachability games, Theoret. Comput. Sci. 386 (3) (November 2007) 188–217.
[10] L.C. Eggan, Transition graphs and the star-height of regular events, Michigan Math. J. 10 (4) (1963) 385–397.
[11] J. Hopcroft, J. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, 1979.
[12] W.E. Howden, Methodology for the generation of program test data, IEEE Trans. Comput. C-24 (May 1975) 554–559.
[13] J.C. Huang, An approach to program testing, ACM Comput. Surv. 7 (3) (September 1975) 113–128.
[14] J.A. Jones, M.J. Harrold, Test-suite reduction and prioritization for modified condition/decision coverage, in: Proceedings of the International Conference on Software Maintenance (ICSM), October 2001, pp. 92–101.
[15] A.S. LaPaugh, Recontamination does not help to search a graph, J. ACM 40 (2) (1993) 224–245.
[16] H. Ledgard, M. Marcotty, A genealogy of control structures, Commun. ACM 18 (November 1975) 629–639.
[17] M.R. Lyu, Z. Huang, S.K.S. Sze, X. Cai, An empirical study on testing and fault tolerance for software reliability engineering, in: International Symposium on Software Reliability Engineering (ISSRE), Los Alamitos, CA, USA, IEEE Computer Society, 2003, p. 119.
[18] T.J. McCabe, A complexity measure, IEEE Trans. Softw. Eng. 2 (4) (December 1976) 308–320.
[19] N. Megiddo, S.L. Hakimi, M.R. Garey, D.S. Johnson, C.H. Papadimitriou, The complexity of searching a graph, J. ACM 35 (1) (January 1988) 18–44.
[20] J. Obdržálek, Dag-width: connectivity measure for directed graphs, in: SODA'06, ACM Press, 2006, pp. 814–821.
[21] M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
[22] C.H. Papadimitriou, M. Yannakakis, Shortest paths without a map, Theoret. Comput. Sci. 84 (1991) 127–150.
[23] T. Parsons, Pursuit-evasion in a graph, in: Y. Alavi, D.R. Lick (Eds.), Theory and Applications of Graphs, Springer, Berlin, 1976, pp. 426–441.
[24] S. Pimont, J. Rault, A software reliability assessment based on a structural behavioral analysis of programs, in: Second International Conference on Software Engineering (ICSE), October 1976.
[25] A. Pnueli, R. Rosner, On the synthesis of a reactive module, in: 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 1989, pp. 179–190.
[26] P.J. Ramadge, W.M. Wonham, The control of discrete event systems, Proc. IEEE 77 (1) (January 1989) 81–98, Special Issue on Discrete Event Systems.
[27] H. Srikanth, Requirements-based test case prioritization, in: Student Research Forum at the 12th ACM SIGSOFT International Symposium on the Foundations of Software Engineering, 2004.
[28] E.J. Weyuker, The evaluation of program-based software test data adequacy criteria, Commun. ACM 31 (1988) 668–675.
[29] E.J. Weyuker, How to judge testing progress, J. Inf. Softw. Technol. 45 (5) (2004) 323–328.
[30] E.J. Weyuker, In defense of coverage criteria, in: Proceedings of the 11th ACM/IEEE International Conference on Software Engineering (ICSE), May 1989.
[31] M. Yannakakis, Testing, optimization, and games, in: IEEE LICS, 2004, pp. 78–88.