Journal of Computer and System Sciences 66 (2003) 40–65 http://www.elsevier.com/locate/jcss
Verification of relational transducers for electronic commerce Marc Spielmann Department WNI, University of Limburg, Universitaire Campus, B-3590 Diepenbeek, Belgium Received 1 October 2000; revised 1 January 2002
Abstract

Motivated by recent work of Abiteboul, Vianu, Fordham, and Yesha, we investigate the verifiability of transaction protocols specifying the interaction of multiple parties via a network. The protocols which we are concerned with typically occur in the context of electronic commerce applications and can be formalized as relational transducers. We introduce a class of powerful relational transducers based on Gurevich's abstract state machines and show that several verification problems related to electronic commerce applications are decidable for these transducers. © 2003 Published by Elsevier Science (USA).
1. Introduction

One of the main reasons for the enormous and still accelerating growth of the World Wide Web is the strong interest of commercial enterprises in electronic commerce, i.e., in offering services on and conducting business via the Web. Electronic commerce offers challenges to various disciplines of computer science, such as cryptography (for security and authentication), databases (to support electronic transactions), and formal methods (for the design and verification of transaction protocols). A stimulating introduction to electronic commerce from the point of view of computer science can be found in [3,4].

In this paper, we study the automatic verifiability of transaction protocols specifying the interaction of multiple parties via a network. The protocols which we are concerned with typically occur in the context of electronic commerce applications. As an example, consider an electronic warehouse where multiple customers interact with a supplier via the Internet. In this scenario the transaction protocol of the supplier specifies how to react to requests of customers. Possible actions may include sending bills to customers, initiating delivery of products, updating the database of the supplier, etc.

As a general framework to formalize transaction protocols for electronic commerce applications, Abiteboul et al. have recently put forward relational transducers [3]. A relational
transducer can be viewed as an interactive computational device equipped with an active database. In every computation step, the transducer receives from its environment a collection of input relations (e.g., lists of orders and payments from various customers) and reacts by producing a collection of output relations (e.g., lists of bills and items to be sent to customers). During a computation step, the transducer may also update its internal database, which is a relational database consisting of a dynamic and a static component. The dynamic component contains all temporary data necessary to keep track of ongoing transactions (e.g., a list of the currently ordered items). We refer to this component as the memory of the transducer. The purpose of the static component is to provide all static data, i.e., data which does not change during a transaction (e.g., the catalog of a supplier). We call this component the database of the transducer. Several verification problems concerning relational transducers have been studied and partially solved in [3], among them: (1) the problem of verifying temporal properties of relational transducers (‘‘Does a transaction protocol meet its specification?’’) (2) the log validation problem (‘‘Does a transaction carried out on a remote customer site actually conform to the transaction protocol of the supplier?’’), and (3) the problem of deciding equivalence of relational transducers (‘‘Does a transaction protocol that was customized by a business partner still conform to the original protocol? That is, are two protocols equivalent with respect to transactions?’’). Although each of these problems is undecidable in general, positive results, i.e., decidability of variants of these problems, were obtained for a class of restricted relational transducers, called Spocus transducers [3]. 
A drawback of Spocus transducers is that their memory relations are cumulative: the memory of a Spocus transducer only accumulates all previous input; it cannot be updated otherwise. This restricts the scope of applicability of Spocus transducers substantially. In the present paper, we investigate the decidability of the above verification problems for relational transducers more powerful than Spocus transducers. Our transducers, which we call ASM (relational) transducers, are defined by simple, rule-based abstract state machine programs over relational vocabularies. Abstract state machines (ASMs, formerly called evolving algebras) have become the foundation of a successful methodology for specification and verification of complex dynamic systems [5,11,12]. ASM transducers are more powerful than Spocus transducers due to the following two reasons. Firstly, rule applications are in general guarded by first-order formulas. In contrast, Spocus transducers are defined by semi-positive datalog rules, i.e., rules guarded by conjunctive formulas (built from atomic and negated atomic formulas). As a consequence, the one-step complexity of ASM transducers is much higher than that of Spocus transducers. In fact, for ASM transducers it is undecidable whether they produce any output in the first step on any database, while for Spocus transducers this problem is easily seen to be decidable. Secondly, the memory relations of ASM transducers are not necessarily cumulative. Indeed, the memory of an ASM transducer can be seen as an active database with immediate triggering of insertion and deletion actions [2,16]. Since none of the above verification problems is, in its full generality, decidable for ASM transducers, we focus on natural restrictions of these problems. The restrictions we impose are
natural in the sense that they occur often in electronic commerce applications such that solutions of the restricted problems are still of practical importance. More precisely, we show that the verification problems become decidable for ASM transducers if one of the following two conditions is satisfied:
* The (static) database relations of an ASM transducer are known. This restriction is applicable if, say, the catalog of a supplier does not change frequently so that it becomes feasible to adjust verification to the currently valid catalog.
* The maximal input flow which an ASM transducer is exposed to is a priori limited. The maximal input flow is, roughly speaking, the maximal amount of input data forwarded to a relational transducer in one computation step. This restriction should be applicable in most electronic commerce applications. It is motivated by the observation that a relational transducer running in a realistic environment never receives 'too much' input in a single computation step, due to physical and technical limitations of the environment. For instance, the maximal input flow of a transducer running on a web server is limited by the number of clients accepted by the server, the capacity of the local network, the one-step capacity of the server, etc.

Under the first condition, we observe decidability of the verification problems for the class of all ASM transducers. Under the second condition, decidability is obtained only for a class of restricted ASM transducers, called ASM transducers with input-bounded quantification. Transducers of the latter kind are less powerful than general ASM transducers because first-order quantification in the guards of their rules is bounded to the active domain of the current input. As it turns out, the maximal arity of the relations employed by ASM transducers is a major source for the complexity of the verification problems.
We show that the restricted variants of the verification problems are Pspace-complete if the maximal arity of the employed relations is a priori bounded. Imposing an upper bound on the maximal arity should be no serious obstacle for real-life applications, since the arities of relations used in practice tend to be rather small. On the other hand, if there is no fixed upper bound on the arities of relations, then many of the restricted problems become EXPspace-complete. Related work. The general model of relational transducers as well as the verification problems which we consider here are adopted from or inspired by recent work of Abiteboul et al. [3]. (We slightly deviate from their framework; for details the reader is referred to Section 2.3 and the first paragraph in Section 3.) ASM relational transducers are based on the ASM computation model of Gurevich [11,12]. Applications of ASMs to database theory can be found in [9] and [10]. The latter work employs ASMs to specify active databases specially tailored for electronic commerce applications. Runs of ASM transducers can be viewed as temporal databases in the spirit of Abiteboul et al. [1]. We use first-order temporal logic (see, e.g., [1,8]) to express properties of runs of relational transducers (recall the first verification problem). Our decidability results are implied by a reduction to the finite satisfiability problem for existential transitive closure logic (see, e.g., [7,20]) which is decidable by results in [14,17]. The reduction borrows elements of a construction due to Immerman and Vardi [13]. An application of the same construction in a related context can be found in [19].
Outline. In Sections 2 and 3, we introduce ASM relational transducers and formally define the verification problems which we are concerned with. Since none of the verification problems is decidable for the class of all ASM transducers, we propose natural restrictions of the problems in Section 4. In Sections 5 and 6, we present our main results and briefly discuss some open problems. The appendix contains some technical details concerning the proofs of our results.
2. ASM relational transducers

We consider relational transducers as defined in [3], though we will not assume any familiarity with that paper. We start with a self-contained definition of a powerful model of relational transducers based on ASMs [11,12]. Our transducers, which we call ASM (relational) transducers, are defined by simple, rule-based ASM programs over relational vocabularies.

2.1. ASM transducer programs

A transducer vocabulary $U$ is a quintuple $(U_{\mathrm{in}}, U_{\mathrm{db}}, U_{\mathrm{mem}}, U_{\mathrm{out}}, U_{\mathrm{log}})$ of finite, relational vocabularies (or, schemas) where the first four vocabularies are pairwise disjoint and $U_{\mathrm{log}} \subseteq U_{\mathrm{in}} \cup U_{\mathrm{out}}$. Sometimes, we also denote by $U$ the relational vocabulary $U_{\mathrm{in}} \cup U_{\mathrm{db}} \cup U_{\mathrm{mem}} \cup U_{\mathrm{out}}$. The intended meaning will be clear from the context. In the following, FO stands for first-order logic with equality. (In this paper we ignore questions concerning domain independence, safety of queries, etc.)

Definition 2.1. Let $U$ be a transducer vocabulary. An ASM transducer program $P$ over $U$ is a finite set of rules of the form

if $\varphi(\bar{x})$ then $(\neg)R(\bar{x})$

where $\varphi(\bar{x})$ is a FO formula over $U_{\mathrm{in}} \cup U_{\mathrm{db}} \cup U_{\mathrm{mem}}$ with $\mathrm{free}(\varphi) = \{\bar{x}\}$, and $R(\bar{x})$ is an atomic FO formula over $U_{\mathrm{mem}} \cup U_{\mathrm{out}}$. $R(\bar{x})$ must occur positively on the right-hand side of the above rule if $R \in U_{\mathrm{out}}$. $\varphi(\bar{x})$ is called the guard of the rule.

The semantics of an ASM transducer program $P$ is similar to the one-step semantics of a datalog$^{\neg\neg}$ program [2], except that $P$ treats inconsistent updates as 'no operations'.

2.2. Semantics

Let $U$ be a transducer vocabulary, and let $P$ be an ASM transducer program over $U$. $U$ and $P$ define a relational transducer $T_{(U,P)}$ as follows. A state $S$ over $U$ is a finite structure over the relational vocabulary $U$. For every $U' \subseteq U$, let $S|U'$ denote the reduct of $S$ to $U'$, i.e., the structure over $U'$ obtained from $S$ by removing the interpretations of the symbols in $U \setminus U'$. To ease notation, we regularly write $S|_{\mathrm{db}}$ instead of $S|U_{\mathrm{db}}$, and $S|_{\mathrm{db,mem}}$ instead of $S|(U_{\mathrm{db}} \cup U_{\mathrm{mem}})$, and so forth.
A state $S$ can be viewed as being composed of the four components $S|_{\mathrm{in}}$, $S|_{\mathrm{db}}$, $S|_{\mathrm{mem}}$, and $S|_{\mathrm{out}}$. Intuitively, $S|_{\mathrm{in}}$ is the current input of the transducer $T_{(U,P)}$ in state $S$, and $S|_{\mathrm{out}}$ is the output that was produced during the last
computation step of $T_{(U,P)}$ (i.e., the step which led to state $S$). $S|_{\mathrm{db}}$ and $S|_{\mathrm{mem}}$ are the database and memory of $T_{(U,P)}$ in state $S$, respectively.

Remark. Instead of including the input and output of a transducer into its states, one could alternatively consider states which contain only the database and memory of the transducer, and view the input and output as labels of transitions between states.

The relational transducer $T_{(U,P)}$ (defined by $U$ and $P$) is a mapping from states over $U$ to finite structures over $U_{\mathrm{mem}} \cup U_{\mathrm{out}}$. If $S$ is a state over $U$, then $T_{(U,P)}(S)$ determines the memory and output of the transducer $T_{(U,P)}$ in a successor state $S'$ of $S$. The new input of $T_{(U,P)}$ in state $S'$ will be provided by the environment. The database of $T_{(U,P)}$ in $S'$ is the same as in $S$. We come to the definition of $T_{(U,P)}$. For simplicity, suppose that
* whenever $(\neg)R(\bar{x})$ is the right-hand side of a rule in $P$, then the variable tuple $\bar{x}$ consists of pairwise distinct variables, and
* whenever $(\neg)R(\bar{x})$ and $(\neg)R'(\bar{y})$ are the right-hand sides of two rules in $P$, and $R = R'$, then the variable tuples $\bar{x}$ and $\bar{y}$ are identical.
For every $R \in U_{\mathrm{mem}} \cup U_{\mathrm{out}}$, define the following FO formulas:

\[ \varphi_R(\bar{x}) := \bigvee \{\varphi(\bar{x}) : (\text{if } \varphi(\bar{x}) \text{ then } R(\bar{x})) \in P\} \]
\[ \psi_R(\bar{x}) := \bigvee \{\varphi(\bar{x}) : (\text{if } \varphi(\bar{x}) \text{ then } \neg R(\bar{x})) \in P\} \]
\[ \chi_R(\bar{x}) := (\varphi_R(\bar{x}) \wedge \neg\psi_R(\bar{x})) \vee (\varphi_R(\bar{x}) \wedge \psi_R(\bar{x}) \wedge R(\bar{x})) \vee (\neg\varphi_R(\bar{x}) \wedge \neg\psi_R(\bar{x}) \wedge R(\bar{x})), \]

where $\bigvee \emptyset := \mathrm{false}$. For every state $S$ over $U$, let $\varphi_R^S$ and $\chi_R^S$ denote the answer relations of the queries $\varphi_R$ and $\chi_R$ on $S$, respectively. $T_{(U,P)}$ maps $S$ to a finite structure over $U_{\mathrm{mem}} \cup U_{\mathrm{out}}$ defined as follows:
* the universe of $T_{(U,P)}(S)$ is that of $S$;
* for every $R \in U_{\mathrm{mem}}$, the interpretation of $R$ in $T_{(U,P)}(S)$ is $\chi_R^S$; and
* for every $R \in U_{\mathrm{out}}$, the interpretation of $R$ in $T_{(U,P)}(S)$ is $\varphi_R^S$.

This concludes the definition of $T_{(U,P)}$.

Definition 2.2. An ASM (relational) transducer $T$ is a pair $(U, P)$ consisting of a transducer vocabulary $U$ and an ASM transducer program $P$ over $U$. ASM-T denotes the class of ASM transducers.

For the sake of brevity, we will often blur the distinction between an ASM transducer $T = (U, P)$ and the relational transducer $T_{(U,P)}$ defined by it. In particular, both will be denoted by $T$.
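The one-step semantics defined above can be sketched in Python as follows. This is only an illustration under simplifying assumptions that are not part of the paper: guards are given as Python predicates instead of FO formulas, relations are sets of tuples, and the names `step`, `rules`, `Add`, `Del`, `M`, and `Echo` are hypothetical. The variables `insert` and `delete` play the roles of $\varphi_R$ and $\psi_R$, and the memory update follows $\chi_R$, so an inconsistent update leaves the old value unchanged.

```python
from itertools import product

def step(state, universe, rules, memory, output):
    """Compute T(S): the memory and output relations of a successor state.

    state  : dict mapping every relation name to a set of tuples
    rules  : list of (guard, negated, name) with guard(state, xs) -> bool
    memory : dict name -> arity of the memory relations
    output : dict name -> arity of the output relations
    """
    successor = {}
    for name, arity in {**memory, **output}.items():
        relation = set()
        for xs in product(universe, repeat=arity):
            # insert ~ phi_R(xs): some rule asks to insert xs into R.
            insert = any(g(state, xs) for g, neg, r in rules if r == name and not neg)
            # delete ~ psi_R(xs): some rule asks to delete xs from R.
            delete = any(g(state, xs) for g, neg, r in rules if r == name and neg)
            if name in output:
                holds = insert  # output relations are interpreted by phi_R
            else:
                old = xs in state[name]
                # chi_R: a consistent insertion, or the old value whenever the
                # update is inconsistent or no rule fires.
                holds = (insert and not delete) or (insert == delete and old)
            if holds:
                relation.add(xs)
        successor[name] = relation
    return successor

# Hypothetical toy transducer: Add and Del are input relations, M is a memory
# relation, Echo is an output relation.
rules = [
    (lambda s, xs: xs in s["Add"], False, "M"),     # if Add(x) then M(x)
    (lambda s, xs: xs in s["Del"], True, "M"),      # if Del(x) then not M(x)
    (lambda s, xs: xs in s["Add"], False, "Echo"),  # if Add(x) then Echo(x)
]
state = {"Add": {(1,), (2,)}, "Del": {(2,), (3,)}, "M": {(2,), (3,)}, "Echo": set()}
result = step(state, [1, 2, 3], rules, {"M": 1}, {"Echo": 1})
print(result)
```

In this toy state, element 2 is subject to both an insertion and a deletion, so its membership in `M` is simply carried over from the previous state, which is the 'no operation' treatment of inconsistent updates.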
Remark. Readers familiar with the ASM computation model may have noticed that syntax and semantics of ASM transducer programs do not entirely conform to the ASM framework [12]. This deviation from the standard definitions serves to improve readability and allows us to simplify the technical presentation. It is however not essential for the (general) ASM transducer model presented here.

2.3. Runs

In contrast to [3], we consider infinite runs of relational transducers over a fixed universe (see also the remark below). Let $T$ be an ASM transducer of vocabulary $U$. A database $D$ appropriate for $T$ is a finite structure over $U_{\mathrm{db}}$. For technical reasons, we shall always assume that $D$ contains at least two elements. An input sequence $\bar{I}$ appropriate for $T$ and $D$ is an infinite sequence $(I_i)_{i \in \omega}$ of finite structures over $U_{\mathrm{in}}$ where each $I_i$ has the same universe as $D$. Note that the universe of $D$ can be a proper superset of the active domain of $D$. Hence, each $I_i$ may contain input tuples some or all of whose components do not belong to the active domain of $D$. For example, $I_i$ may contain orders from 'new' customers, i.e., customers which do not occur in any database or memory relation of $T$. Note also that, if $U_{\mathrm{in}}$ is not empty, $\bar{I}$ may be non-periodic (and thus may not be finitely representable).

Let $D$ and $\bar{I}$ be as above. A run $r$ of $T$ on $(D, \bar{I})$ is an infinite sequence $(S_i)_{i \in \omega}$ of states over $U$ uniquely determined by the following conditions. For every $i \in \omega$:
* $S_i|_{\mathrm{in}} = I_i$;
* $S_i|_{\mathrm{db}} = D$; and
* $S_i|_{\mathrm{mem,out}} = T(S_{i-1})$ if $i > 0$; otherwise, every relation of $S_i|_{\mathrm{mem,out}}$ is the empty relation (i.e., in the initial state $S_0$ the memory and output are empty).
The reduct of $r$ to some vocabulary $U' \subseteq U$, denoted $r|U'$, is the infinite sequence $(S_i|U')_{i \in \omega}$. Again, we write $r|_{\mathrm{db}}$ instead of $r|U_{\mathrm{db}}$, and $r|_{\mathrm{db,mem}}$ instead of $r|(U_{\mathrm{db}} \cup U_{\mathrm{mem}})$, etc. The output and log produced during $r$ are the infinite sequences $r|_{\mathrm{out}}$ and $r|_{\mathrm{log}}$, respectively. Logs have been introduced in [3] to capture the semantically significant input-output behavior of relational transducers (see the finite log validation problem in the next section).

Remark. In [3], the authors considered finite runs of relational transducers over a possibly expanding universe. More precisely, they considered runs on pairs $(D, \bar{I})$ where $\bar{I}$ is a finite but arbitrary sequence $(I_i)_{i \le n}$ of finite structures over $U_{\mathrm{in}}$. In particular, the universe of each $I_i$ is not necessarily identical with the universe of $D$. We can run an ASM transducer $T$ on such a pair $(D, \bar{I})$ as follows. Let $D'$ denote the expansion of the universe of $D$ with all elements occurring in $\bar{I}$. Obtain the database $\mathcal{D}'$ from $D$ by replacing the universe of $D$ with $D'$. Similarly, obtain $I'_i$ from $I_i$. For every $i > n$, let $I'_i$ denote the 'empty' structure with universe $D'$. Now, define the run of $T$ on $(D, \bar{I})$ to be the run of $T$ on $(\mathcal{D}', (I'_i)_{i \in \omega})$. With respect to this convention, the Spocus transducer model of [3] can be regarded as a special case of the ASM transducer model. In fact, every Spocus transducer $T$ can easily be converted into an ASM transducer $T'$ such that for every pair $(D, \bar{I})$ as above, the finite run of $T$ on $(D, \bar{I})$ is an initial segment of the infinite run of $T'$ on $(D, \bar{I})$.
An example of a typical, admittedly very simple relational transducer follows. It is inspired by an example in [3] and will serve as our running example.

Example 2.1. The ASM transducer $T_{\mathrm{supp}}$ defined below specifies the transaction protocol of a supplier whose customers can order products, are billed for them, and are supplied with the ordered products on payment of the corresponding bills. To improve readability, we display the program of $T_{\mathrm{supp}}$ in a slightly relaxed syntax, using nested rules (with the obvious meaning) and rules of the form (if $\varphi(\bar{x}, \bar{y})$ then $(\neg)R(\bar{x})$) where $\{\bar{x}\} \subsetneq \mathrm{free}(\varphi) = \{\bar{x}, \bar{y}\}$. Formally, a rule of the latter form has to be replaced with (if $\exists\bar{y}\,\varphi(\bar{x}, \bar{y})$ then $(\neg)R(\bar{x})$).

transducer $T_{\mathrm{supp}}$:
relations:
  input: Order, Pay
  database: Price, Available
  memory: PastOrder
  output: SendBill, Deliver, RejectPay, RejectOrder
  log: Pay, SendBill, Deliver
memory rules:
  if $Order(x) \wedge Available(x) \wedge \neg PastOrder(x)$ then $PastOrder(x)$
  if $PastOrder(x) \wedge Pay(x, y) \wedge Price(x, y)$ then $\neg PastOrder(x)$
output rules:
  if $Order(x) \wedge Available(x)$ then
    if $\neg PastOrder(x) \wedge Price(x, y)$ then $SendBill(x, y)$
    if $PastOrder(x)$ then $RejectOrder(x)$
  if $Pay(x, y)$ then
    if $PastOrder(x) \wedge Price(x, y)$ then $Deliver(x)$
    else $RejectPay(x, y)$

The following table sketches the input and output components of a run of $T_{\mathrm{supp}}$:

      | S0                 | S1                            | S2                 | S3                             | S4         | S5
In    | Order(a), Order(b) | Pay(a, 5)                     | Order(a), Order(b) | Pay(a, 5)                      |            | ...
Out   |                    | SendBill(a, 5), SendBill(b, 8) | Deliver(a)         | SendBill(a, 5), RejectOrder(b) | Deliver(a) | ...
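The run sketched in the table above can be replayed with a small simulation. The following Python sketch hard-codes the rules of $T_{\mathrm{supp}}$ from Example 2.1. The contents of Price and Available are assumptions chosen to match the bills shown in the table, and the names `t_supp_step` and `run` are hypothetical.

```python
PRICE = {("a", 5), ("b", 8)}   # assumed content of the database relation Price
AVAILABLE = {"a", "b"}          # assumed content of the database relation Available

def t_supp_step(order, pay, past_order):
    """Apply all rules of T_supp once; return (new PastOrder, output relations)."""
    new_orders = {x for x in order if x in AVAILABLE and x not in past_order}
    rejected = {x for x in order if x in AVAILABLE and x in past_order}
    paid = {x for (x, y) in pay if x in past_order and (x, y) in PRICE}
    # Memory rules: the insertion guard requires not PastOrder(x) and the
    # deletion guard requires PastOrder(x), so no inconsistent update occurs.
    new_past = (past_order - paid) | new_orders
    output = {
        "SendBill": {(x, y) for (x, y) in PRICE if x in new_orders},
        "RejectOrder": rejected,
        "Deliver": paid,
        "RejectPay": {(x, y) for (x, y) in pay
                      if not (x in past_order and (x, y) in PRICE)},
    }
    return new_past, output

def run(inputs):
    """Return the output components of the states S0, S1, ... for the inputs."""
    past_order = set()
    output = {name: set() for name in ("SendBill", "RejectOrder", "Deliver", "RejectPay")}
    outputs = []
    for order, pay in inputs:
        outputs.append(output)  # output of S_i was produced by the previous step
        past_order, output = t_supp_step(order, pay, past_order)
    return outputs

# Inputs (Order, Pay) of the states S0..S4 as in the table.
inputs = [({"a", "b"}, set()), (set(), {("a", 5)}),
          ({"a", "b"}, set()), (set(), {("a", 5)}), (set(), set())]
outputs = run(inputs)
for i, out in enumerate(outputs):
    print(f"S{i}:", {name: sorted(rel) for name, rel in out.items() if rel})
```

Because PastOrder(a) is deleted once the bill for product a is paid, the second order for a in state S2 is billed again, whereas the second order for the still unpaid product b is rejected.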
Notice that the memory relation of $T_{\mathrm{supp}}$ is not cumulative. This enables customers to order one and the same product multiple times (in principle, infinitely often) during a session. $T_{\mathrm{supp}}$ provides an example of a relational transducer which is not definable in the Spocus transducer model of [3]. Since every Spocus transducer can be viewed as a particularly simple ASM transducer (recall the remark above), we conclude that ASM transducers are more powerful than Spocus transducers.
2.4. Computational power of ASM transducers

We compare the computational power of ASM transducers with that of Turing machines. Since Turing machines do not interact with their environment during run-time, we restrict ourselves to ASM transducers which use their database as input and perform a computation without reading any further input. Let $T$ be an ASM transducer of vocabulary $U$ where $U_{\mathrm{in}} = \emptyset$ and $U_{\mathrm{out}}$ contains the two boolean symbols halt and accept. Furthermore, let $Q$ be a boolean query over $U_{\mathrm{db}}$ (i.e., a class of finite structures over $U_{\mathrm{db}}$ closed under isomorphisms). We say that $T$ computes $Q$ if for every database $D$ appropriate for $T$:
1. $T$ on $D$ reaches a halting state, and
2. the first halting state reached by $T$ on $D$ is accepting iff $D \in Q$.

Using well-known results from finite model theory (see, e.g., [2,7]), the proof of the next lemma is an easy exercise.

Lemma 2.1. A boolean query is expressible in partial fixed-point logic (or, equivalently, in datalog$^{\neg\neg}$) iff it is computable by an ASM transducer. On ordered databases, ASM transducers compute precisely the class of Pspace-computable boolean queries.
3. Verification problems

In this paper, we focus on the following three verification problems: (1) the problem of verifying temporal properties of relational transducers, (2) the finite log validation problem, and (3) the problem of deciding log equivalence of two relational transducers. The first and the third problem concern the correctness and customization of relational transducers. The second problem is related to fraud detection. While the first two problems are adopted from [3], the third problem is based on a notion of equivalence which is stronger than the notion of equivalence considered in [3]. Intuitively, two relational transducers are log equivalent if they produce the same semantically significant output whenever they run on the same database and receive the same input. We think that log equivalence can provide a useful notion of equivalence for customization of relational transducers. Formal definitions of the three verification problems follow.
3.1. Verifying temporal properties

This problem concerns the correctness of relational transducers and thus arises while designing transducers. To specify requirements on relational transducers we propose first-order temporal logic (FTL) [1,8], a formalism suitable to express properties of runs of transducers. For the reader's convenience we recall the definition of FTL below and use the opportunity to introduce a fragment of FTL which will play an important role in Section 5.

Definition 3.1. First-order Temporal Logic (FTL) is obtained from FO by means of the following additional formula-formation rule:

(T) If $\varphi$ and $\psi$ are formulas, then $\mathbf{X}\varphi$ and $\varphi\,\mathbf{U}\,\psi$ are formulas.

The free and bound variables of FTL formulas are defined in the obvious way. Let T denote the closure of the set of FO formulas under negation, disjunction, and the above rule (T). The universal closure of T, denoted UT, is the set of FTL formulas of the form $\forall\bar{x}\,\varphi$ with $\varphi \in$ T and $\mathrm{free}(\varphi) \subseteq \{\bar{x}\}$.

Informally speaking, a formula of the form $\mathbf{X}\varphi$ expresses that “$\varphi$ holds in the next state”, and a formula of the form $\varphi\,\mathbf{U}\,\psi$ means that “$\varphi$ holds until $\psi$ holds”. We sketch the formal definition of the semantics of FTL formulas only with respect to runs of ASM transducers. Let $r = (S_i)_{i \in \omega}$ be a run of an ASM transducer of vocabulary $U$, and let $\varphi$ be a FTL formula over $U$ with $\mathrm{free}(\varphi) = \{\bar{x}\}$. Simultaneously for every $i \in \omega$ and all interpretations $\bar{a}$ of the variables $\bar{x}$ (chosen from the universe of $S_0$), define the satisfaction relation $(r, i, \bar{a}) \models \varphi$ by induction on the construction of $\varphi$:

$(r, i, \bar{a}) \models \varphi$ :⇔ $S_i \models \varphi[\bar{a}]$, if $\varphi$ is an atomic formula
$(r, i, \bar{a}) \models \mathbf{X}\varphi$ :⇔ $(r, i+1, \bar{a}) \models \varphi$
$(r, i, \bar{a}) \models \varphi\,\mathbf{U}\,\psi$ :⇔ there is a $j \ge i$ such that $(r, j, \bar{a}) \models \psi$ and for all $k$ satisfying $i \le k < j$, $(r, k, \bar{a}) \models \varphi$.

All other cases are defined as usual. We say that $r$ satisfies a FTL sentence $\varphi$ if $(r, 0) \models \varphi$.

The problem of verifying temporal properties of ASM transducers can now be defined as a decision problem. Consider a class $C$ of ASM transducers and a fragment $F$ of FTL. The problem of verifying $C$-transducers against $F$-specifications is defined as follows:

verify$(C, F)$: Given an ASM transducer $T \in C$ and a sentence $\varphi \in F$ over the vocabulary of $T$, decide whether every run of $T$ satisfies $\varphi$.

Example 3.1. Recall from Example 2.1 the ASM transducer $T_{\mathrm{supp}}$ specifying the transaction protocol of a supplier. For the supplier it might be desirable to enable customers to pay an ordered product and simultaneously reorder the very same product. We can specify this requirement on $T_{\mathrm{supp}}$ by means of the UT formula displayed below. Intuitively, the formula says
that an order for a product will be rejected in the next step only if that product is currently ordered and not correctly paid:

\[ \forall x\, \mathbf{G}[\mathbf{X}\,RejectOrder(x) \to (PastOrder(x) \wedge \neg\exists y (Pay(x, y) \wedge Price(x, y)))], \]

where $\mathbf{G}\psi$ abbreviates the formula $\neg(\mathrm{true}\,\mathbf{U}\,\neg\psi)$ whose meaning is “$\psi$ holds in every state”. Let $\varphi$ denote the above UT formula. Does $T_{\mathrm{supp}}$ meet the specification $\varphi$? In other words, is the pair $(T_{\mathrm{supp}}, \varphi)$ a positive instance of the problem verify(ASM-T, UT), for short $(T_{\mathrm{supp}}, \varphi) \in$ verify(ASM-T, UT)? It is not hard to see that the answer is no. However, one can upgrade $T_{\mathrm{supp}}$ to an ASM transducer $T^{+}_{\mathrm{supp}}$ such that $(T^{+}_{\mathrm{supp}}, \varphi) \in$ verify(ASM-T, UT). For instance, remove the conjunct $\neg PastOrder(x)$ from the guard of the first memory rule of $T_{\mathrm{supp}}$, and instead add $\neg Order(x)$ as a new conjunct to the guard of the second memory rule. Furthermore, replace the first (nested) output rule of $T_{\mathrm{supp}}$ with the following rule:

if $Order(x) \wedge Available(x)$ then
  if $\neg PastOrder(x) \wedge Price(x, y)$ then $SendBill(x, y)$
  if $PastOrder(x)$ then
    if $Pay(x, y) \wedge Price(x, y)$ then $SendBill(x, y)$
    if $\neg\exists y (Pay(x, y) \wedge Price(x, y))$ then $RejectOrder(x)$

Remark. Although FTL does not include past-tense temporal operators, one can mimic such operators using additional memory relations to record the history of ongoing transactions.

3.2. Validating finite logs

This problem is related to fraud detection and arises if, e.g., for efficiency reasons, the relational transducer of a supplier is allowed to run on remote customer sites. In such a distributed scenario, it can be vitally important for the supplier to be able to verify that a transaction carried out on a remote site actually conforms to the transaction protocol. That is, the supplier may need to check whether a transaction is valid in the sense that it is the result of a run of the transducer originally distributed among customers.

Since in many applications not all of the information exchanged during a transaction is really important for the supplier (e.g., inquiries about prices), the semantically significant input and output relations of a relational transducer are specified as log relations [3]. A record of a transaction is now a finite sequence of collections of log relations. Each collection in the sequence contains the log relations at a particular time of the transaction. Let $T$ be an ASM transducer of vocabulary $U$, and let $\bar{L} = (L_0, \ldots, L_n)$ be a finite sequence of finite structures over $U_{\mathrm{log}}$. $\bar{L}$ is called a finite log of $T$ if there exists a run $r$ of $T$ such that $\bar{L}$ is an
initial segment of $r|_{\mathrm{log}}$. The finite log validation problem for a class $C \subseteq$ ASM-T is the following decision problem:

fin-log-val$(C)$: Given an ASM transducer $T \in C$ and a finite sequence $\bar{L}$ of finite structures over the log vocabulary of $T$, decide whether $\bar{L}$ is a finite log of $T$.

3.3. Deciding log equivalence

The main motivation for considering this problem is customization of relational transducers. To enhance competitiveness, a supplier may want to allow customers to modify or upgrade the supplier's transducer for their convenience or to conform to their own transaction protocol. This raises the question whether a customized transducer still conforms to the supplier's transducer.

Consider two ASM transducers $T_1$ and $T_2$ with the same log vocabulary. Let $U$ denote the (component-wise) union of the vocabularies of $T_1$ and $T_2$. Obtain $T'_1$ from $T_1$ by replacing the vocabulary of $T_1$ with $U$. Similarly, obtain $T'_2$ from $T_2$. We say that $T_1$ and $T_2$ are log equivalent if for every run $r_1$ of $T'_1$ and every run $r_2$ of $T'_2$ the following implication holds:

\[ (r_1|_{\mathrm{db,in}} = r_2|_{\mathrm{db,in}}) \Rightarrow (r_1|_{\mathrm{log}} = r_2|_{\mathrm{log}}). \]

In other words, whenever $T_1$ and $T_2$ run on the same database and receive the same input, then $T_1$ and $T_2$ produce the same logs. The corresponding decision problem for a class $C \subseteq$ ASM-T is defined as follows:

log-eq$(C)$: Given two ASM transducers $T_1, T_2 \in C$ with the same log vocabulary, decide whether $T_1$ and $T_2$ are log equivalent.

Example 3.2. Verify that $T_{\mathrm{supp}}$ and $T^{+}_{\mathrm{supp}}$ in Examples 2.1 and 3.1 are not log equivalent. Further, observe that adding to the program of an ASM transducer new output rules which do not affect log-output relations preserves log equivalence (see also [3]). For instance, a customer may propose to add to the program of $T_{\mathrm{supp}}$ the following output rule:

if $PendingBills \wedge PastOrder(x) \wedge Price(x, y) \wedge \neg Pay(x, y)$ then $Rebill(x, y)$

where PendingBills is a new input relation and Rebill is a new output relation. If Rebill is not specified as a log relation, then the obtained ASM transducer is obviously log equivalent to $T_{\mathrm{supp}}$.

Lemma 3.1. log-eq(ASM-T) is polynomial-time reducible to verify(ASM-T, UT).

Proof (Crux). Consider an instance $(T_1, T_2)$ of log-eq(ASM-T). From $T_1$ and $T_2$ one can obtain an ASM transducer $T$ and a UT sentence $\varphi$ such that $(T, \varphi) \in$ verify(ASM-T, UT) iff $(T_1, T_2) \in$ log-eq(ASM-T). The idea is to let $T$ step-wise simulate $T_1$ and $T_2$ in parallel, using separate memory and output relations for the simulation of $T_1$ and $T_2$, respectively. $\varphi$ can then check whether $T_1$ and $T_2$ produce in every step the same log-output relations. □
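Returning to the first claim of Example 3.2, a witness for the fact that $T_{\mathrm{supp}}$ and $T^{+}_{\mathrm{supp}}$ are not log equivalent can be found by a bounded search over short input sequences. The Python sketch below is an illustration only, not the reduction of Lemma 3.1. It assumes a one-product database (Price(a, 5), Available(a)), computes only the log relations Pay, SendBill, and Deliver (RejectOrder and RejectPay are omitted because they do not occur in logs), and uses the hypothetical names `step`, `log`, and `find_witness`.

```python
from itertools import product

PRICE = {("a", 5)}   # assumed database relation Price
AVAILABLE = {"a"}    # assumed database relation Available

def step(order, pay, past, upgraded):
    """One step of T_supp (upgraded=False) or T_supp^+ (upgraded=True).

    Returns (new PastOrder, SendBill, Deliver).
    """
    ok = order & AVAILABLE
    paid = {x for (x, y) in pay if x in past and (x, y) in PRICE}
    bill = {(x, y) for (x, y) in PRICE if x in ok - past}
    if upgraded:
        # Paying and reordering in the same step triggers a new bill.
        bill |= {(x, y) for (x, y) in PRICE & pay if x in ok & past}
        # Insert on every available order; delete only if paid and not reordered.
        new_past = (past - (paid - order)) | ok
    else:
        new_past = (past - paid) | (ok - past)
    return new_past, bill, paid

def log(inputs, upgraded):
    """Log entries (Pay, SendBill, Deliver) of the states S0, S1, ..."""
    past, bill, deliver = set(), set(), set()
    entries = []
    for order, pay in inputs:
        entries.append((pay, bill, deliver))
        past, bill, deliver = step(order, pay, past, upgraded)
    return entries

def find_witness(length):
    """Search all input sequences of the given length for differing logs."""
    orders = [frozenset(), frozenset({"a"})]
    pays = [frozenset(), frozenset({("a", 5)})]
    for inputs in product(list(product(orders, pays)), repeat=length):
        if log(inputs, False) != log(inputs, True):
            return inputs
    return None

witness = find_witness(3)
print(witness)
```

A search of this kind can only refute log equivalence on the chosen database; it cannot establish it, since log equivalence quantifies over all databases and all infinite input sequences.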
4. Natural restrictions

Decidability and complexity of the verification problems verify$(C, F)$, fin-log-val$(C)$, and log-eq$(C)$ obviously depend on the choice of the class $C \subseteq$ ASM-T. For instance, if we set $C =$ ASM-T (and assume that $F$ contains the specification $\mathbf{X}\,accept$), then all three problems are undecidable by Trakhtenbrot's theorem (see, e.g., [7]). This leaves us with two options for how to proceed: we may
* impose restrictions on ASM transducers, or
* consider simplified versions of the verification problems.
The first approach was successfully pursued in [3] and led to the Spocus transducer model. Here, we follow the second approach and attempt to solve simplified versions of verify, fin-log-val, and log-eq for ASM transducers. We consider two different kinds of simplification:
1. The database of an ASM transducer is given.
2. The maximal input flow which an ASM transducer is exposed to is a priori limited.

Each simplification induces restricted variants of the verification problems, which are defined next.

4.1. Providing the database

This restriction is applicable if the database of a relational transducer changes so rarely that it becomes feasible to adjust verification to the currently valid database. As an example, consider a relational transducer $T$ running on a notebook computer of a salesperson. The database of $T$ may contain the catalog of a supplier and may be stored on a CD-ROM. In view of the fact that $T$ will run on this particular database only, one can fix the database while verifying $T$. We denote by verify$^{\mathrm{db}}$ the variant of verify where only runs on a given database are considered. Formally, verify$^{\mathrm{db}}$ is defined as follows:

verify$^{\mathrm{db}}(C, F)$: Given an ASM transducer $T \in C$, a sentence $\varphi \in F$ over the vocabulary of $T$, and a database $D$ appropriate for $T$, decide whether every run of $T$ on $D$ satisfies $\varphi$.

The corresponding variants of fin-log-val and log-eq, denoted fin-log-val$^{\mathrm{db}}$ and log-eq$^{\mathrm{db}}$, respectively, are defined similarly.

Lemma 4.1. fin-log-val$^{\mathrm{db}}$(ASM-T) is polynomial-time reducible to the complement of verify$^{\mathrm{db}}$(ASM-T, UT).

Proof (Crux). Consider an instance $(T, \bar{L}, D)$ of fin-log-val$^{\mathrm{db}}$(ASM-T). We define a UT sentence $\varphi$ such that $(T, \varphi, D) \notin$ verify$^{\mathrm{db}}$(ASM-T, UT) iff $(T, \bar{L}, D) \in$ fin-log-val$^{\mathrm{db}}$(ASM-T). Let $U$ be the vocabulary of $T$, fix an enumeration $a_1, \ldots, a_m$ of the universe of $D$, and suppose that $\bar{L} = (L_0, \ldots, L_n)$. For every $i \in \{0, \ldots, n\}$, let $t_i(x_1, \ldots, x_m)$ denote the (unique) atomic type in $x_1, \ldots, x_m$ over $U_{\mathrm{log}}$ satisfying $L_i \models t_i[a_1, \ldots, a_m]$. Now define $\varphi$ to be

\[ \neg\exists x_1, \ldots, x_m \Big( \bigwedge_{i \ne j} x_i \ne x_j \wedge \bigwedge_{i=0}^{n} \mathbf{X}^i t_i \Big). \]

□
Notice that it always suffices to validate a given transaction with respect to the database that was used during the transaction. Thus, in many cases it suffices to solve fin-log-val$^{db}$ instead of fin-log-val. This particularly holds for applications where transactions are validated during transaction time. However, for applications where the database changes frequently, verifying temporal properties or deciding log equivalence with respect to a fixed database is impractical or entirely useless. In such cases one may try to impose the following restriction.

4.2. Limiting the maximal input flow

This restriction is motivated by the observation that the amount of input data which a relational transducer $T$ receives from its environment in a single computation step usually does not exceed a certain limit due to physical and technical limitations of the environment. Consequently, we may focus on runs of $T$ where in every state the number of tuples in every input relation is bounded by some a priori fixed natural number $N$ (depending only on the environment of $T$). This motivates the following definition.

Definition 4.1. Let $\rho = (S_i)_{i \in \omega}$ be a run of an ASM transducer of vocabulary $\Upsilon$. The maximal input flow of $\rho$ is the maximum of the set $\{|R^{S_i}| : R \in \Upsilon_{in},\ i \in \omega\}$, where $|R^{S_i}|$ denotes the cardinality of the input relation $R^{S_i}$ seen as a set of tuples.

Remark. Alternatively, one could define the maximal input flow to be the maximum of the total number of input tuples in every state. For technical reasons, we prefer the above definition.

Let $N$ be a natural number. We denote by verify$^{in \le N}$ the variant of verify where only runs with maximal input flow $\le N$ are considered. Formally, verify$^{in \le N}$ is defined as follows:

verify$^{in \le N}$($\mathcal{C}, \mathcal{F}$): Given an ASM transducer $T \in \mathcal{C}$ and a sentence $\varphi \in \mathcal{F}$ over the vocabulary of $T$, decide whether every run of $T$ with maximal input flow $\le N$ satisfies $\varphi$.

The corresponding variants of fin-log-val and log-eq, denoted fin-log-val$^{in \le N}$ and log-eq$^{in \le N}$, respectively, are defined along the same lines. The next lemma indicates that for ASM transducers, verify$^{in \le N}$ reduces to verify. Let $T^*$ be an ASM transducer whose output vocabulary contains the boolean symbol error. We call a run of $T^*$ error-free if error does not hold in any state of the run.

Lemma 4.2. Fix some natural number $N$. Every ASM transducer $T$ whose output vocabulary does not contain the boolean symbol error can be modified so that the error-free runs of the obtained ASM transducer are precisely the runs of $T$ with maximal input flow $\le N$.

Equipped with the terminology established so far, we can now state our main results concerning the verifiability of ASM transducers.
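As an illustration of Definition 4.1, the following Python sketch computes the maximal input flow of a finite prefix of a run and marks the states whose input exceeds a bound $N$. The representation of a state as a dictionary from relation names to sets of tuples, and the names `maximal_input_flow` and `exceeds_bound`, are assumptions made for this example only; the sketch does not reproduce the construction behind Lemma 4.2.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: a state maps each relation name to a set of tuples.
State = Dict[str, Set[Tuple]]


def maximal_input_flow(prefix: List[State], input_relations: Set[str]) -> int:
    """Largest cardinality of any input relation in any state of the prefix.

    Definition 4.1 ranges over an infinite run; here only a finite prefix
    is inspected, so the result is a lower bound for the whole run.
    """
    return max(
        (len(state.get(name, set()))
         for state in prefix
         for name in input_relations),
        default=0,
    )


def exceeds_bound(prefix: List[State], input_relations: Set[str],
                  bound: int) -> List[bool]:
    """For each state, report whether some input relation holds more than
    `bound` tuples (the situation an error flag would have to detect)."""
    return [
        any(len(state.get(name, set())) > bound for name in input_relations)
        for state in prefix
    ]


if __name__ == "__main__":
    prefix = [
        {"Order": {("a",), ("b",)}, "Pay": {("a", 5)}},
        {"Order": set(), "Pay": {("a", 5), ("b", 7), ("c", 1)}},
    ]
    print(maximal_input_flow(prefix, {"Order", "Pay"}))
    print(exceeds_bound(prefix, {"Order", "Pay"}, 2))
```

A prefix with maximal input flow at most $N$ is one for which `exceeds_bound` returns only `False` entries.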
5. Verifiability results

We first consider ASM transducers which are supposed to run on a specific database only. For every natural number $k$, let verify$_k$ denote the restriction of verify to instances in which only relation symbols of arity $\le k$ occur. The corresponding restrictions of fin-log-val and log-eq are denoted by fin-log-val$_k$ and log-eq$_k$, respectively.

Theorem 5.1. The following problems are Pspace-complete for any $k \ge 0$:

(1) verify$_k^{db}$(ASM-T, UT)
(2) fin-log-val$_k^{db}$(ASM-T)
(3) log-eq$_k^{db}$(ASM-T)

In other words, if the maximal arity of the employed relations is a priori bounded, then verifying temporal properties (expressible in the fragment UT $\subseteq$ FTL), validating finite logs, and deciding log equivalence are Pspace-complete problems for ASM transducers which run on a specific database only.

Proof (Crux). Hardness of problem (1) is implied by a reduction from the Pspace-complete satisfiability problem for quantified boolean formulas (see, e.g., [15]). Consider a quantified boolean sentence $\gamma$. We define an instance $(T, \varphi, D)$ of problem (1) such that $(T, \varphi, D)$ is a positive instance of the problem iff $\gamma$ is satisfiable. Let $v$ be a variable not occurring in $\gamma$. Obtain $\gamma'$ from $\gamma$ by replacing every atomic subformula of the form $x$ (i.e., a propositional variable) with the first-order formula $(x = v)$. $\gamma'$ can be viewed as a first-order formula with $\mathrm{free}(\gamma') = \{v\}$. Now define the instance $(T, \varphi, D)$ as follows. Let $T$ be defined by the rule (if $\exists v\, \gamma'(v)$ then accept), set $\varphi = \mathbf{X}\,accept$, and let $D$ be an arbitrary database (containing at least two elements). Similar reductions show hardness of problems (2) and (3). (Consider, e.g., the mappings $\varphi \mapsto (T, (\emptyset, \{accept\}), D)$ and $\varphi \mapsto (T, (\text{if true then accept}), D)$, respectively.) Containment of problem (1) is implied by a reduction to the finite satisfiability problem for existential transitive closure logic. The construction is rather technical and postponed to the appendix (see Corollary A.4). Containment of problems (2) and (3) then follows from the reductions in Lemmas 3.1 and 4.1. $\Box$

Note that problems (1), (2), and (3) remain Pspace-hard if the transducer vocabulary and the database are fixed. All three problems remain in Pspace if the arities of database and memory relations are a priori bounded, even if the arities of input and output relations are not. If there is no fixed upper bound on the arities of database and memory relations, then verification becomes more expensive.

Theorem 5.2. The following problems are in EXPspace for any $N \ge 0$:

(1') verify$^{db,\,in \le N}$(ASM-T, UT)
(2') fin-log-val$^{db,\,in \le N}$(ASM-T)
(3') log-eq$^{db,\,in \le N}$(ASM-T)

The first and the third problem are even EXPspace-complete.
Proof (Crux). Hardness of problem (1') is implied by a reduction from the model checking problem for partial fixed-point logic, which is EXPspace-complete [21,22]. (The complexity of the model checking problem for a logic L is also known as the combined complexity of L [6].) Consider a finite structure $\mathcal{A}$ and a sentence $\psi$ of partial fixed-point logic, both over the same vocabulary $\Upsilon$. We define an instance $(T, \varphi, D)$ of problem (1') such that $(T, \varphi, D)$ is a positive instance iff $\mathcal{A} \models \psi$. W.l.o.g., it can be assumed that $\psi$ has the form $\exists \bar{v}\, [\mathrm{PFP}_{X,\bar{x}}\, \psi'(X, \bar{x})](\bar{v})$ with $\psi' \in$ FO and $\bar{v} \notin \mathrm{free}(\psi')$ (see, e.g., [7]). From $\psi$ one can obtain an ASM transducer $T$ with input vocabulary $\emptyset$, database vocabulary $\Upsilon$, memory vocabulary $\{X\}$, and output vocabulary $\{accept\}$ such that for every database $D'$ over $\Upsilon$, any run of $T$ on $D'$ satisfies $\mathbf{F}\,accept$ iff $D' \models \psi$. Hence, if we set $\varphi = \mathbf{F}\,accept$ and $D = \mathcal{A}$, then $(T, \varphi, D)$ is the desired instance of problem (1'). To prove hardness of problem (3'), we reuse the above construction. Let $T_\emptyset$ denote the ASM transducer with the empty program and the same vocabulary as $T$. One can show that $\mathcal{A} \models \psi$ iff $(T, T_\emptyset, \mathcal{A})$ is a negative instance of problem (3'). This implies that $(\mathcal{A}, \psi) \mapsto (T, T_\emptyset, \mathcal{A})$ is a reduction from the model checking problem for partial fixed-point logic to the complement of problem (3'). Since EXPspace is closed under complementation, we obtain hardness of problem (3'). Containment of problems (1'), (2'), and (3') is proved in the appendix (see again Corollary A.4 and recall Lemmata 3.1 and 4.1). $\Box$

As pointed out in the last section, Theorem 5.1 provides a sufficiently general solution of the verification problems for applications where the design and verification of ASM transducers can be adjusted to specific databases. Since it often suffices to solve fin-log-val$^{db}$ instead of fin-log-val (recall the comment following Lemma 4.1), the theorem settles the finite log validation problem for many other applications as well. Next, we turn to ASM transducers that need to be verified for all databases because their databases change frequently or may even change during run time (see the remark at the end of this section).

5.1. ASM transducers with input-bounded quantification

Unfortunately, limiting the maximal input flow alone does not suffice to obtain decidability of any of the three verification problems for the class of all ASM transducers. This is again a consequence of Trakhtenbrot's theorem and is due to the expressive power of first-order quantification in the guards of ASM transducer rules. However, the situation changes for ASM transducers which use, instead of unbounded first-order quantification, a kind of bounded quantification specially tailored for input-driven devices like relational transducers. The main idea is to restrict the range of first-order quantifiers to the active domain of the current input. As an example, consider the following output rule taken from the ASM transducer $T^+_{supp}$ in Example 3.1:

if $Order(x) \wedge Available(x) \wedge PastOrder(x) \wedge \neg \exists y\,(Pay(x, y) \wedge Price(x, y))$ then $RejectOrder(x)$

The quantification of the variable $y$ in the guard of this rule is 'guarded' by the input relation $Pay$, which is why the range of $y$ can safely be restricted to the active domain of the current input. In fact, the quantification of $y$ in the above rule is input-bounded in the sense of the next definition.
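The effect of guarding the quantifier by an input relation can be illustrated with a small Python sketch that evaluates the guard of the rule above on one state. All relations are assumed to be given as sets of tuples, and the function name `reject_orders` is hypothetical. The point of the sketch is that the witness $y$ is searched for only among the tuples of the input relation $Pay$, never in the whole database domain.

```python
from typing import Set, Tuple


def reject_orders(order: Set[Tuple], available: Set[Tuple],
                  past_order: Set[Tuple], pay: Set[Tuple],
                  price: Set[Tuple]) -> Set[Tuple]:
    """Return all (x,) satisfying
    Order(x) and Available(x) and PastOrder(x)
    and not exists y (Pay(x, y) and Price(x, y)).

    The existential quantifier is evaluated by scanning `pay` only,
    i.e. y ranges over the active domain of the current input.
    """
    rejected = set()
    for (x,) in order:
        if (x,) in available and (x,) in past_order:
            paid_correctly = any(
                px == x and (px, y) in price for (px, y) in pay
            )
            if not paid_correctly:
                rejected.add((x,))
    return rejected


if __name__ == "__main__":
    products = {("p1",), ("p2",)}
    pay = {("p1", 10), ("p2", 3)}
    price = {("p1", 10), ("p2", 4)}
    print(reject_orders(products, products, products, pay, price))
```

In this example the payment for `p2` does not match its price, so only `p2` satisfies the guard.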
Definition 5.1. Let $\Upsilon$ be a transducer vocabulary. An atomic formula of the form $R(\bar{t})$ is also called an input atom (resp. memory atom, output atom) if $R$ is a symbol in $\Upsilon_{in}$ (resp. $\Upsilon_{mem}$, $\Upsilon_{out}$). The input-bounded fragment of FO, denoted FO$^I$, is obtained from FO by replacing the formula-formation rule for (unbounded) quantification with the following rule for input-bounded quantification:

(IBQ) If $\bar{x}$ is a tuple of variables, $\alpha$ is an input atom with $\bar{x} \subseteq \mathrm{free}(\alpha)$, and $\varphi$ is a formula such that for every memory and output atom $\beta$ occurring in $\varphi$, $\mathrm{free}(\beta) \cap \bar{x} = \emptyset$, then $\exists \bar{x}\,(\alpha \wedge \varphi)$ and $\forall \bar{x}\,(\alpha \to \varphi)$ are formulas.

An ASM transducer with input-bounded quantification (or ASM$^I$ transducer for short) is an ASM transducer in whose program all rules are guarded by FO$^I$ formulas. ASM$^I$-T denotes the class of ASM$^I$ transducers. Let T$^I$ denote the closure of the set of FO$^I$ formulas under negation, disjunction, and rule (T) (see Definition 3.1). The universal closure of T$^I$, denoted UT$^I$, is the set of FTL formulas of the form $\forall \bar{x}\, \varphi$ with $\varphi \in$ T$^I$ and $\mathrm{free}(\varphi) \subseteq \{\bar{x}\}$.

To see an example of an ASM$^I$ transducer and a UT$^I$ specification, recall $T^+_{supp}$ and $\varphi$ in Example 3.1. Also, verify that Lemma 4.2 remains true for ASM$^I$ transducers.

Remark. ASM$^I$ transducers and Spocus transducers are incomparable in the following sense: there exists a run of an ASM$^I$ (resp. Spocus) transducer such that no Spocus (resp. ASM$^I$) transducer can produce that run. (Recall Example 2.1 and note that the Spocus transducer model allows unrestricted projections in the guards of output rules.)

Theorem 5.3. The following problems are Pspace-complete for any $N \ge 2$ and any $k \ge 1$:

(4) verify$_k^{in \le N}$(ASM$^I$-T, UT$^I$)
(5) log-eq$_k^{in \le N}$(ASM$^I$-T)

In other words, if the maximal arity of the employed relations and the maximal input flow are a priori bounded, then verifying temporal properties (expressible in the fragment UT$^I \subseteq$ FTL) and deciding log equivalence are Pspace-complete problems for ASM$^I$ transducers.

Proof (Crux). Hardness of problem (4) follows from a reduction similar to the first reduction in the proof of Theorem 5.1. Consider a quantified boolean sentence $\gamma$, and let $\gamma'(v) \in$ FO be obtained from $\gamma$ as in the proof of Theorem 5.1. Let $Bool$ be a unary relation symbol (denoting an input relation). Obtain $\gamma''(v) \in$ FO$^I$ from $\gamma'(v)$ by replacing every FO quantifier with the corresponding $Bool$-bounded quantifier (e.g., replace $\exists x\, \varphi$ with $\exists x\,(Bool(x) \wedge \varphi)$, and $\forall x\, \varphi$ with $\forall x\,(Bool(x) \to \varphi)$). It is easy to see that there exists a sentence $\theta \in$ FO$^I$ equivalent to $\exists^{=2} x\, Bool(x)$. Let $T$ now be defined by the rule (if $\theta \to \exists v\,(Bool(v) \wedge \gamma''(v))$ then accept), and again set $\varphi = \mathbf{X}\,accept$. Verify that $(T, \varphi)$ is an instance of problem (4), and that $(T, \varphi)$ is a positive instance of
the problem iff $\gamma$ is satisfiable. A similar reduction shows hardness of problem (5). For containment of problems (4) and (5), the reader is referred to the appendix (see Corollary A.1 and recall Lemma 3.1). $\Box$

As before, we observe an exponential blow-up of the space complexity if there is no fixed upper bound on the arities of database and memory relations.

Theorem 5.4. The following problems are in EXPspace for any $N \ge 0$:

(4') verify$^{in \le N}$(ASM$^I$-T, UT$^I$)
(5') log-eq$^{in \le N}$(ASM$^I$-T)

Both problems are even EXPspace-complete if in the guards of ASM$^I$ transducers quantification over constants is permitted (see Theorem A.2 in the appendix).

Remark. So far, we have only considered relational transducers whose database does not change during run time. To verify an ASM$^I$ transducer $T$ with an active database, i.e., a database that can be updated during a run of $T$, one may proceed as follows. Suppose that $R_1, \dots, R_n$ are the active database relations of $T$. Modify $T$ so that, in the first step, it copies each $R_i$ to a new, still empty memory relation $R'_i$, and then, in all subsequent steps, uses $R'_i$ instead of $R_i$. Every update of an $R_i$ is now treated as input and redirected to $R'_i$. There are some disadvantages to this solution, however. By definition of input-bounded quantification, it is prohibited to quantify inside active database relations, which are now memory relations. Furthermore, if the maximal input flow is limited, then database updates may jam customer inputs.

6. Conclusion

Taking the framework provided in [3] as a starting point, we have investigated the verifiability of transaction protocols typically occurring in the context of electronic commerce applications. We have introduced a class of powerful relational transducers to specify such transaction protocols, and have shown that verification problems of practical importance can be solved for these transducers. A still open problem is the verification of systems of interacting relational transducers [3]. We are optimistic that our investigations in this paper can serve as a basis for future research in this direction. Another open problem concerns efficiently verifiable relational transducers. Pspace complexity is likely to be too expensive for large-scale applications, especially when it comes to verification with respect to a given database. The question is whether there are further natural restrictions such that the corresponding verification problems are in Ptime.

Acknowledgments

I am grateful to Erich Grädel for drawing my attention to the subject of this paper and for useful comments on an earlier draft. I would also like to thank Serge Abiteboul, Eric Rosen, and Victor Vianu for helpful comments and suggestions.
Appendix

We sketch the proofs of our containment assertions in Theorems 5.1–5.4. Further details can be found in [20]. As pointed out earlier, it suffices to show that problems (1) and (4) are contained in Pspace and that problems (1') and (4') are contained in EXPspace. It will be convenient to assume that database vocabularies may contain constant symbols (and thus to consider databases which contain elements denoted by those symbols) and furthermore to assume that every database vocabulary contains (at least) the two constant symbols 0 and 1. These two assumptions do not affect the validity of any of our results. The following problem will be of particular interest to us:

run-sat($\mathcal{C}, \mathcal{F}$): Given an ASM transducer $T \in \mathcal{C}$ and a sentence $\varphi \in \mathcal{F}$ over the vocabulary of $T$, decide whether there exists a run of $T$ satisfying $\varphi$.

Obviously, $(T, \varphi) \in$ verify($\mathcal{C}, \mathcal{F}$) iff $(T, \neg\varphi) \notin$ run-sat($\mathcal{C}, \neg\mathcal{F}$), where $\neg\mathcal{F}$ denotes the set of negated $\mathcal{F}$ formulas. Recall the definition of the fragment T $\subseteq$ FTL (see Definition 3.1). For any complexity class $K$ closed under complementation and polynomial-time reductions, we have verify($\mathcal{C}$, UT) $\in K$ iff run-sat($\mathcal{C}$, T) $\in K$, and verify($\mathcal{C}$, UT$^I$) $\in K$ iff run-sat($\mathcal{C}$, T$^I$) $\in K$. This observation justifies considering run-sat in place of verify in the subsequent discussion. The entire construction is presented in three steps. The first step outlines the general direction of the construction in the form of a polynomial-time reduction from run-sat(ASM-T, T) to the finite satisfiability problem for transitive closure logic. Refinements of this reduction in the second and third step then imply the desired containment results.

Step 1: reduction of run-sat(ASM-T, T)

Consider an ASM transducer $T$ of vocabulary $\Upsilon$. A run $\rho = (S_i)_{i \in \omega}$ of $T$ is called periodic if there exist $s \ge 0$ and $p \ge 1$ such that for every $i \ge s$, $S_i = S_{i+p}$. Notice that $T$ has non-periodic runs if $\Upsilon_{in}$ is not empty. The next lemma is a generalization of an observation in [18].

Lemma A.1. Let $\varphi$ be a sentence over $\Upsilon$ in the fragment T $\subseteq$ FTL. If there exists a run of $T$ satisfying $\varphi$, then there also exists a periodic run of $T$ satisfying $\varphi$.

A finite or infinite sequence $(S_i)_{i \in \kappa}$ of states over $\Upsilon$ is called consistent if for all $i, j \in \kappa$, $S_i|_{db} = S_j|_{db}$. We encode finite consistent sequences (presumably representing periodic runs of $T$) as finite structures. Suppose that $\sigma$ is such a sequence, say, $\sigma = (S_0, \dots, S_q)$. Let $D$ be the domain of the database $S_0|_{db}$, and set $I = \{0, \dots, q\}$. W.l.o.g., we may assume that $D$ and $I$ are disjoint. Obtain $\Upsilon^+$ from $\Upsilon$ by

1. adding to $\Upsilon$ two new set symbols, say, $D'$ and $I'$, and
2. increasing the arity of every relation symbol in $\Upsilon_{in} \cup \Upsilon_{mem} \cup \Upsilon_{out}$ by one.

Define the finite structure $\mathcal{A}_\sigma$ over $\Upsilon^+$ (encoding $\sigma$) as follows:

* the universe of $\mathcal{A}_\sigma$ is $D \cup I$,
* $D'$ and $I'$ are interpreted as $D$ and $I$, respectively,
* every $R \in \Upsilon_{db}$ is interpreted as in the database $S_0|_{db}$, and
* every $k$-ary $R \in \Upsilon_{in} \cup \Upsilon_{mem} \cup \Upsilon_{out}$ is interpreted as a $(k+1)$-ary relation containing a tuple $(\bar{a}, i)$ iff $\bar{a} \in R^{S_i}$.
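The encoding of a finite consistent sequence as a single structure can be sketched in Python as follows. The state representation (dictionaries from relation names to sets of tuples), the tagging of indices as pairs `("idx", i)` to keep $I$ disjoint from $D$, and the function name `encode_sequence` are assumptions of this example and not part of the construction in the text.

```python
from typing import Dict, Iterable, List, Set, Tuple

State = Dict[str, Set[Tuple]]


def encode_sequence(states: List[State],
                    db_relations: Iterable[str],
                    dynamic_relations: Iterable[str],
                    domain: Iterable) -> Tuple[set, Dict[str, Set[Tuple]]]:
    """Encode a consistent sequence (S_0, ..., S_q) as one structure.

    - universe: D united with the index set I
    - D' and I' are interpreted as D and I
    - database relations are taken from S_0
    - each k-ary input/memory/output relation becomes (k+1)-ary,
      containing (a, i) iff a is in the relation in state S_i
    """
    domain = list(domain)
    # Tag indices so that I is disjoint from D even if D contains integers.
    indices = [("idx", i) for i in range(len(states))]
    structure: Dict[str, Set[Tuple]] = {
        "D'": {(d,) for d in domain},
        "I'": {(i,) for i in indices},
    }
    for name in db_relations:
        structure[name] = set(states[0].get(name, set()))
    for name in dynamic_relations:
        structure[name] = {
            tup + (indices[i],)
            for i, state in enumerate(states)
            for tup in state.get(name, set())
        }
    universe = set(domain) | set(indices)
    return universe, structure


if __name__ == "__main__":
    states = [
        {"Price": {("p", 1)}, "Order": {("p",)}},
        {"Price": {("p", 1)}, "Order": set(), "Bill": {("p", 1)}},
    ]
    universe, structure = encode_sequence(
        states, ["Price"], ["Order", "Bill"], ["p", 1])
    print(structure["Order"], structure["Bill"])
```

Appending the state index as an extra argument is what allows a single first-order formula to relate the contents of two different states of the sequence.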
To ease notation, we may subsequently use one and the same letter to denote both a free variable of a formula and an interpretation of that variable. The intended meaning will be clear from the context.

Proposition A.1. From $T$ one can obtain in polynomial time an FO formula $\chi_T(i, i')$ over $\Upsilon^+$ such that for every $\sigma$ as above and for all $i, i' \in \{0, \dots, q\}$:

$$\mathcal{A}_\sigma \models \chi_T[i, i'] \iff T(S_i) = S_{i'}|_{mem,out}.$$

For a logic L, we denote by fin-sat(L) the finite satisfiability problem for L. For example, fin-sat(FO+TC) is the finite satisfiability problem for transitive closure logic (FO+TC) [7]. As a first step toward a proof of our containment assertions, we display a polynomial-time reduction from run-sat(ASM-T, T) to fin-sat(FO+TC). This reduction will serve as a guideline in the second and third step of the construction.

Consider an instance $(T, \varphi)$ of run-sat(ASM-T, T), and let $\Upsilon$ be the vocabulary of $T$. We define an (FO+TC) sentence $\chi_{T,\varphi}$ over $\Upsilon^+$ which has a finite model iff there exists a run of $T$ satisfying $\varphi$. (The definition closely follows a construction first presented in [13] as a translation of the branching-time logic CTL$^*$ into (FO+TC); see also [19].) In the following, we regard formulas of the form $\alpha \mathbf{B} \beta$ as well-formed FTL formulas whose semantics is given by $\alpha \mathbf{B} \beta := \neg(\neg\alpha\, \mathbf{U}\, \neg\beta)$. In this extended syntax we can, w.l.o.g., assume that $\varphi$ is in negation normal form (which means that every negation in $\varphi$ occurs in front of an atomic subformula). Let $\mathrm{cl}(\varphi)$ be the set of those subformulas of $\varphi$ whose occurrence is not strictly inside some FO subformula of $\varphi$. Note that $\varphi$ can be built from the FO formulas in $\mathrm{cl}(\varphi)$ by means of disjunction, conjunction, and the temporal operators $\mathbf{X}$, $\mathbf{U}$, and $\mathbf{B}$. For every FO formula $\theta \in \mathrm{cl}(\varphi)$, let $\theta^+(i) \in$ FO($\Upsilon^+$) be obtained from $\theta$ by replacing

1. every input, memory, and output atom of the form $R(\bar{t})$ with $R(\bar{t}, i)$, and
2. every FO quantifier with the corresponding $D'$-bounded quantifier.
For every $\theta \in \mathrm{cl}(\varphi)$, introduce a boolean variable $b_\theta$. If $\theta$ has the form $\alpha \mathbf{U} \beta$, then introduce an additional boolean variable $m_\theta$ different from $b_\theta$. (Intuitively, $b_\theta$ will represent the truth value of $\theta$ in a particular state of a run of $T$. $m_\theta$ will serve as a reminder that $\theta$ has still to be satisfied sometime in the future. For further details, the reader is referred to [13].) By $\bar{b}$ (resp. $\bar{m}$) we denote an enumeration of the boolean variables $b_\theta$ (resp. $m_\theta$) in some arbitrary but fixed order. Define next $\in$ FO($\Upsilon^+$) with $\mathrm{free}(\mathrm{next}) \subseteq \{i, i', \bar{b}, \bar{b}', \bar{m}, \bar{m}'\}$ to be

$$(i \in I') \wedge \chi_T(i, i') \wedge \bigwedge_{\theta \in \mathrm{cl}(\varphi)} \mathrm{next}_\theta,$$

where $\chi_T(i, i')$ is obtained from $T$ according to Proposition A.1, and

$$\mathrm{next}_\theta := \begin{cases}
b_\theta \to \theta^+(i) & \text{if } \theta \in \mathrm{FO} \\
b_\theta \to (b_\alpha \vee(\wedge)\, b_\beta) & \text{if } \theta = \alpha \vee(\wedge)\, \beta \\
b_\theta \to b'_\alpha & \text{if } \theta = \mathbf{X}\alpha \\
(b_\theta \to (b_\beta \vee (b_\alpha \wedge b'_\theta))) \wedge (m'_\theta \to (m_\theta \vee b_\beta)) & \text{if } \theta = \alpha \mathbf{U} \beta \\
b_\theta \to (b_\beta \wedge (b_\alpha \vee b'_\theta)) & \text{if } \theta = \alpha \mathbf{B} \beta.
\end{cases}$$

We come to the definition of $\chi_{T,\varphi}$. Let TCS denote the strict version of the transitive closure operator TC. Note that TCS is definable in (FO+TC) (see, e.g., [13]).

$$\chi_{T,\varphi} := \exists i \bar{b}\, i' \bar{b}' \bar{m}' \big( \mathrm{initial}(i) \wedge \mathrm{run}(i \bar{b},\, i' \bar{b}' \bar{m}') \big)$$

$$\mathrm{initial}(i) := \bigwedge_{R \in \Upsilon_{mem} \cup \Upsilon_{out}} (\forall \bar{x} \in D')\, \neg R(\bar{x}, i)$$

$$\mathrm{run}(i \bar{b},\, i' \bar{b}' \bar{m}') := [\mathrm{TC}_{i \bar{b} \bar{m},\, i' \bar{b}' \bar{m}'}\, \mathrm{next}](i \bar{b} \bar{0},\, i' \bar{b}' \bar{0}) \;\wedge\; [\mathrm{TCS}_{i \bar{b} \bar{m},\, i' \bar{b}' \bar{m}'}\, \mathrm{next}](i' \bar{b}' \bar{0},\, i' \bar{b}' \bar{m}') \;\wedge\; b_\varphi \;\wedge \bigwedge_{\alpha \mathbf{U} \beta \in \mathrm{cl}(\varphi)} (b'_{\alpha \mathbf{U} \beta} \to m'_{\alpha \mathbf{U} \beta}).$$

It is not difficult to verify that $\chi_{T,\varphi}$ can be obtained from $(T, \varphi)$ in polynomial time, and that $(T, \varphi)$ is a positive instance of run-sat(ASM-T, T) iff $\chi_{T,\varphi}$ is a positive instance of fin-sat(FO+TC). (For the "only-if" direction use Lemma A.1. For the "if" direction unwind the essential part of a model of $\chi_{T,\varphi}$ to obtain a periodic run of $T$ satisfying $\varphi$.) Next, we refine the above reduction to a polynomial-time reduction from run-sat$^{in \le N}$(ASM$^I$-T, T$^I$) to a decidable subproblem of fin-sat(FO+TC).

Step 2: reduction of run-sat$^{in \le N}$(ASM$^I$-T, T$^I$)

Consider an ASM$^I$ transducer $T$ of vocabulary $\Upsilon$. For every structure $\mathcal{A}$ over $\Upsilon$, let $\mathcal{A}^C$ denote the substructure of $\mathcal{A}$ induced by the set of constants of $\mathcal{A}$ (i.e., the set of those elements in $\mathcal{A}$ which are denoted by constant symbols in $\Upsilon_{db}$). A consistent sequence $(S_i)_{i \in \omega}$ of states over $\Upsilon$ is called a local run of $T$ if every relation of $(S_0|_{mem,out})^C$ is the empty relation, and for every $i \in \omega$, $T(S_i)^C = (S_{i+1}|_{mem,out})^C$.

Lemma A.2. Let $\varphi$ be a sentence over $\Upsilon$ in the fragment T$^I \subseteq$ FTL. If there exists a local run of $T$ satisfying $\varphi$, then there also exists a (genuine) run of $T$ satisfying $\varphi$.

In order to succinctly formulate our next result, we introduce a new kind of bounded quantification.

Witness-bounded quantification. A finite set of variables and constant symbols is also called a witness set. For a witness set $W$ and a variable $x$ not in $W$, let $(x \in W)$ abbreviate the formula
$\bigvee_{\nu \in W} x = \nu$. Intuitively, $(x \in W)$ holds iff the interpretation of $x$ matches the interpretation of some symbol in $W$.

Definition A.1. The witness-bounded fragment of FO, denoted FO$^W$, is obtained from FO by replacing the formula-formation rule for first-order quantification with the following rule for witness-bounded quantification:

(WBQ) If $W$ is a witness set, $x$ is a variable not in $W$, and $\varphi$ is a formula, then $(\exists x \in W)\varphi$ and $(\forall x \in W)\varphi$ are formulas.

The free and bound variables of FO$^W$ formulas are defined as usual. In particular, $x$ occurs bound in $(\exists x \in W)\varphi$ and $(\forall x \in W)\varphi$, whereas all variables in the witness set $W$ occur free in these formulas. We view FO$^W$ as a fragment of FO where formulas of the form $(\exists x \in W)\varphi$ and $(\forall x \in W)\varphi$ are mere abbreviations for $\exists x\,(x \in W \wedge \varphi)$ and $\forall x\,(x \in W \to \varphi)$, respectively. It is easy to see that FO$^W$ is as expressive as the quantifier-free fragment of FO. However, FO$^W$ allows us to represent (certain) quantifier-free formulas exponentially more succinctly (unless Pspace = NP).

We proceed toward a reduction of run-sat$^{in \le N}$(ASM$^I$-T, T$^I$). Fix a natural number $N \ge 1$. In what follows, we implicitly assume that every transducer vocabulary $\Upsilon$ comes equipped with an arbitrary but fixed order on $\Upsilon_{in}$. Let $T$ be an ASM$^I$ transducer of vocabulary $\Upsilon$, and let $\sigma = (S_0, \dots, S_q)$ be a consistent sequence of states over $\Upsilon$ with maximal input flow $\le N$, i.e., $|R^{S_i}| \le N$ for every $R \in \Upsilon_{in}$ and every $i \in \{0, \dots, q\}$. (One may think of $\sigma$ as a finite representation of a periodic, local run of $T$.) Because $N$ is an upper bound on the maximal input flow of $\sigma$, the input component $S_i|_{in}$ of each state $S_i$ can be encoded as a tuple of domain elements so that the length of this tuple depends only on $N$ and $\Upsilon_{in}$. We now fix such an encoding. Let $D$ be the domain of the database $S_0|_{db}$, and let $R_1, \dots, R_n$ be an enumeration of $\Upsilon_{in}$ according to the order on $\Upsilon_{in}$. For each $R_j$, let $k_j$ denote the arity of $R_j$. An $N(1 + k_j)$-tuple $\bar{d}$ of elements in $D$ is called an encoding of $R_j^{S_i}$ if for every $l \in \{1, \dots, N\}$, there exist $b_l \in D$ and $\bar{c}_l \in D^{k_j}$ such that $\bar{d} = (b_1 \bar{c}_1, \dots, b_N \bar{c}_N)$ and the following two conditions hold:

1. for every $\bar{a} \in R_j^{S_i}$, there exists an $l \in \{1, \dots, N\}$ such that $b_l = 0$ and $\bar{c}_l = \bar{a}$;
2. for every $l \in \{1, \dots, N\}$, $b_l = 0$ implies $\bar{c}_l \in R_j^{S_i}$.

Set $L(N, \Upsilon_{in}) = \sum_{j=1}^{n} N(1 + k_j)$. An $L(N, \Upsilon_{in})$-tuple $\bar{e}$ of elements in $D$ is called an encoding of $S_i|_{in}$ if for every $j \in \{1, \dots, n\}$, there exists an encoding $\bar{d}_j$ of $R_j^{S_i}$ such that $\bar{e} = (\bar{d}_1, \dots, \bar{d}_n)$.
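The tuple encoding of bounded input relations can be made concrete with a short Python sketch. The helper names `encode_relation`, `decode_relation`, and `encode_input` are hypothetical, and marking unused slots with the flag 1 and an arbitrary filler element is a choice made for this example: the definition above only requires that a flag equal to 0 marks a genuine member of the relation.

```python
from typing import Dict, List, Set, Tuple


def encode_relation(relation: Set[Tuple], arity: int, bound: int,
                    filler) -> Tuple:
    """Encode a relation with at most `bound` tuples as an N(1+k)-tuple.

    Each of the N blocks is (b_l, c_l): flag b_l = 0 marks c_l as a member
    of the relation; unused blocks carry flag 1 and filler elements.
    """
    members = sorted(relation, key=repr)  # any fixed order works
    if len(members) > bound:
        raise ValueError("relation exceeds the maximal input flow")
    blocks: List = []
    for l in range(bound):
        if l < len(members):
            blocks.extend((0,) + tuple(members[l]))
        else:
            blocks.extend((1,) + (filler,) * arity)
    return tuple(blocks)


def decode_relation(encoding: Tuple, arity: int) -> Set[Tuple]:
    """Recover the relation: collect every block whose flag is 0."""
    width = 1 + arity
    return {
        tuple(encoding[p + 1:p + width])
        for p in range(0, len(encoding), width)
        if encoding[p] == 0
    }


def encode_input(state_in: Dict[str, Set[Tuple]],
                 ordered_vocab: List[Tuple[str, int]],
                 bound: int, filler) -> Tuple:
    """Concatenate the encodings of R_1, ..., R_n in the fixed order."""
    result: Tuple = ()
    for name, arity in ordered_vocab:
        result += encode_relation(state_in.get(name, set()), arity,
                                  bound, filler)
    return result


if __name__ == "__main__":
    vocab = [("Order", 1), ("Pay", 2)]
    e = encode_input({"Order": {("a",)}, "Pay": {("a", "b")}}, vocab, 2, 0)
    print(len(e), e)
```

With $N = 2$ and input relations of arities 1 and 2, the encoding has length $2(1+1) + 2(1+2) = 10$, in line with the formula for $L(N, \Upsilon_{in})$.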
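Proposition A.2. Recall the definitions of $\Upsilon^+$ and $\mathcal{A}_\sigma$ (prior to Proposition A.1) and set $\Upsilon^* = \Upsilon^+ \setminus \Upsilon_{in}$ and $\mathcal{A}^*_\sigma = \mathcal{A}_\sigma|\Upsilon^*$. From $T$ one can obtain in polynomial time an FO$^W$ formula $\chi^*_T(\bar{e}, i, i')$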
over $\Upsilon^*$ such that for every $\sigma$ as above, for all $i, i' \in \{0, \dots, q\}$, and for every encoding $\bar{e}$ of $S_i|_{in}$:

$$\mathcal{A}^*_\sigma \models \chi^*_T[\bar{e}, i, i'] \iff T(S_i)^C = (S_{i'}|_{mem,out})^C.$$

Proof. Let $\bar{e}$ be an $L(N, \Upsilon_{in})$-tuple of pairwise distinct variables such that the variables $i$ and $i'$ do not occur in $\bar{e}$. For each $R_j \in \Upsilon_{in}$, there exists a quantifier-free formula $\mathrm{decode}_j(\bar{e}, \bar{x})$ such that for every state $S$ in $\sigma$, for every encoding $\bar{e}$ of $S|_{in}$, and for every $k_j$-tuple $\bar{a}$ of elements of $S$, $S \models \mathrm{decode}_j[\bar{e}, \bar{a}]$ iff $\bar{a} \in R_j^S$. For each $R \in \Upsilon_{mem} \cup \Upsilon_{out}$, define $\varphi_R(\bar{x}), \chi_R(\bar{x}) \in$ FO$^I$($\Upsilon$) as in Section 2. Furthermore, if $R \in \Upsilon_{mem}$, set $\theta_R(\bar{x}) = \chi_R(\bar{x})$; otherwise, set $\theta_R(\bar{x}) = \varphi_R(\bar{x})$. Let $C$ be the set of constant symbols in $\Upsilon_{db}$, and let $W$ be the set of variables occurring in $\bar{e}$. Both $C$ and $W$ are witness sets. W.l.o.g., we can assume that no variable in $W \cup \{i, i'\}$ occurs in any $\theta_R(\bar{x})$. Obtain $\theta^*_R(\bar{e}, \bar{x}, i) \in$ FO$^W$($\Upsilon^*$) from $\theta_R(\bar{x})$ by replacing

1. every input atom of the form $R_j(\bar{t})$ with $\mathrm{decode}_j(\bar{e}, \bar{t})$;
2. every memory and output atom of the form $R'(\bar{t})$ with $R'(\bar{t}, i)$; and
3. every FO quantifier with the corresponding $W$-bounded quantifier.

Finally, define $\chi^*_T(\bar{e}, i, i') \in$ FO$^W$($\Upsilon^*$) to be

$$\bigwedge_{R \in \Upsilon_{mem} \cup \Upsilon_{out}} \big[ (\forall \bar{x} \in C)\,(\theta^*_R(\bar{e}, \bar{x}, i) \leftrightarrow R(\bar{x}, i')) \big].$$

$\chi^*_T$ can be obtained from $T$ in polynomial time if $N$ is a priori fixed. $\Box$

Let (FO$^W$ + TC) denote FO$^W$ augmented with the transitive closure operator TC. An occurrence of a TC operator in a (FO$^W$ + TC) formula is called positive if the occurrence is in the scope of an even number of negations. By (FO$^W$ + posTC) we denote the set of those (FO$^W$ + TC) formulas in which every occurrence of a TC operator is positive.

Theorem A.1. For any $N \ge 0$, run-sat$^{in \le N}$(ASM$^I$-T, T$^I$) is polynomial-time reducible to fin-sat(FO$^W$ + posTC).

Proof. Consider an instance $(T, \varphi)$ of run-sat$^{in \le N}$(ASM$^I$-T, T$^I$). Let $\Upsilon$ be the vocabulary of $T$, and set $\Upsilon^* = \Upsilon^+ \setminus \Upsilon_{in}$. We provide an (FO$^W$ + posTC) sentence $\chi^*_{T,\varphi}$ over $\Upsilon^*$ which has a finite model iff there exists a run of $T$ with maximal input flow $\le N$ that satisfies $\varphi$. Recall the reduction from run-sat(ASM-T, T) to fin-sat(FO + TC) in the first step, and let $\mathrm{cl}(\varphi)$, $b_\theta$, $m_\theta$, $\bar{b}$, and $\bar{m}$ be as in that reduction. Notice that every FO formula $\theta$ in $\mathrm{cl}(\varphi)$ now is an FO$^I$ formula. For each such $\theta$, obtain $\theta^*(\bar{e}, i) \in$ FO$^W$($\Upsilon^*$) from $\theta$ as in the proof of Proposition A.2. The FO formula next now becomes an FO$^I$ formula over $\Upsilon^*$ with free variables among $\bar{e}, i, i', \bar{b}, \bar{b}', \bar{m}, \bar{m}'$: define next$^*$ to be

$$(\bar{e} \in D') \wedge (i \in I') \wedge \chi^*_T(\bar{e}, i, i') \wedge \bigwedge_{\theta \in \mathrm{cl}(\varphi)} \mathrm{next}^*_\theta,$$

where $\chi^*_T(\bar{e}, i, i')$ is obtained from $T$ according to Proposition A.2, and $\mathrm{next}^*_\theta$ is defined as $\mathrm{next}_\theta$, except if $\theta \in$ FO$^I$. In that case, set $\mathrm{next}^*_\theta = b_\theta \to \theta^*(\bar{e}, i)$.
The definition of $\chi^*_{T,\varphi}$ follows. As in the proof of Proposition A.2, let $C$ denote the set of constant symbols in $\Upsilon_{db}$.

$$\chi^*_{T,\varphi} := \exists \bar{e}\, i \bar{b}\, \bar{e}' i' \bar{b}' \bar{m}' \big( \mathrm{initial}^*(i) \wedge \mathrm{run}^*(\bar{e} i \bar{b},\, \bar{e}' i' \bar{b}' \bar{m}') \big)$$

$$\mathrm{initial}^*(i) := \bigwedge_{R \in \Upsilon_{mem} \cup \Upsilon_{out}} (\forall \bar{x} \in C)\, \neg R(\bar{x}, i)$$

$$\mathrm{run}^*(\bar{e} i \bar{b},\, \bar{e}' i' \bar{b}' \bar{m}') := [\mathrm{TC}_{\bar{e} i \bar{b} \bar{m},\, \bar{e}' i' \bar{b}' \bar{m}'}\, \mathrm{next}^*](\bar{e} i \bar{b} \bar{0},\, \bar{e}' i' \bar{b}' \bar{0}) \;\wedge\; [\mathrm{TCS}_{\bar{e} i \bar{b} \bar{m},\, \bar{e}' i' \bar{b}' \bar{m}'}\, \mathrm{next}^*](\bar{e}' i' \bar{b}' \bar{0},\, \bar{e}' i' \bar{b}' \bar{m}') \;\wedge\; b_\varphi \;\wedge \bigwedge_{\alpha \mathbf{U} \beta \in \mathrm{cl}(\varphi)} (b'_{\alpha \mathbf{U} \beta} \to m'_{\alpha \mathbf{U} \beta}).$$

Observe that $\chi^*_{T,\varphi}$ as defined above is not an (FO$^W$ + posTC) formula. However, the desired (FO$^W$ + posTC) formula can be obtained from $\chi^*_{T,\varphi}$ by removing the prefix of existential quantifiers, and regarding all free variables of the resulting formula as new constant symbols. To complete the proof, verify that $\chi^*_{T,\varphi}$ can be obtained from $(T, \varphi)$ in polynomial time, and that $(T, \varphi)$ is a positive instance of run-sat$^{in \le N}$(ASM$^I$-T, T$^I$) iff $\chi^*_{T,\varphi}$ is a positive instance of fin-sat(FO$^W$ + posTC). (For the "only-if" direction again use Lemma A.1. For the "if" direction unwind the essential part of a model of $\chi^*_{T,\varphi}$ to obtain a local run of $T$ satisfying $\varphi$. Lemma A.2 then yields a run of $T$ with maximal input flow $\le N$ which is also a model of $\varphi$.) $\Box$

Corollary A.1. For any $k, N \ge 0$, problem (4) is in Pspace and problem (4') is in EXPspace.

Proof. Let (E + TC) denote the existential fragment of (FO + TC). By fin-sat$_k$(E + TC) we denote the restriction of fin-sat(E + TC) to instances in which only relation symbols of arity $\le k$ occur. In [20] it is shown that

(i) fin-sat$_{(k)}$(FO$^W$ + posTC) is polynomial-time reducible to fin-sat$_{(k)}$(E + TC),
(ii) fin-sat$_k$(E + TC) is in Pspace, and
(iii) fin-sat(E + TC) is in EXPspace.

Together with Theorem A.1, we have run-sat$_k^{in \le N}$(ASM$^I$-T, T$^I$) $\in$ Pspace and run-sat$^{in \le N}$(ASM$^I$-T, T$^I$) $\in$ EXPspace. Since both Pspace and EXPspace are closed under complementation and polynomial-time reductions, the observation following the definition of run-sat implies the corollary. $\Box$

Remark. One can arrange the definition of $\chi^*_{T,\varphi}$ in the proof of Theorem A.1 so that it becomes a sentence over $\Upsilon^+ \setminus (\Upsilon_{in} \cup \Upsilon_{out})$. Hence, imposing an upper bound $k$ on the arities of database and memory relations suffices to avoid an exponential blow-up of the space complexity.

The next corollary of Theorem A.1 will be useful in the subsequent third step of the construction.
Let FO$^{IC}$ denote the fragment of FO obtained from FO$^I$ (see Definition 5.1) by means of the additional formula-formation rule (WBQ) for witness-bounded quantification (see Definition A.1), with the restriction that this rule can only be applied to witness sets
containing constant symbols only. Define ASMIC -T and TIC as ASMI -T and TI in Definition 5.1, respectively, except that now every occurrence of FOI in the definition is replaced with FOIC : Corollary A.2. For any NX0; run-satinpN ðASMIC -T; TIC Þ is polynomial-time reducible to fin-satðFOW þ posTCÞ: The proof of this corollary is similar to the proof of Theorem A.1. Theorem A.2. The following problems are EXPspace-complete for any NX0: ð400 Þ verifyinpN ðASMIC -T; UTIC Þ ð500 Þ log-eqinpN ðASMIC TÞ For containment of problem ð400 Þ proceed as in the proof of Corollary A.1, but now use Corollary A.2 instead of Theorem A.1. Containment of problem ð500 Þ then follows from the observation that the reduction in Lemma 3.1 also reduces problem ð500 Þ to problem ð400 Þ: Hardness of problem ð500 Þ is proved at the end of this section (see below Corollary A.4). Step 3: reduction of run-satdb ðASM-T; TÞ We provide a reduction from run-satdb ðASM-T; TÞ to run-satðASMIC -T; TIC Þ: This reduction together with the reduction in Corollary A.2 will imply our containment assertions concerning problems (1) and ð10 Þ: Theorem A.3. run-satdb ðASM-T; TÞ is polynomial-time reducible to run-satðASMIC -T; TIC Þ: Proof. Consider an instance ðT; j; DÞ of run-satdb ðASM-T; TÞ; and suppose that U is the vocabulary of T: W.l.o.g., we can assume that every element in D is denoted by a constant symbol in Udb : We construct an instance ðT 0 ; j0 Þ of run-satðASMIC -T; TIC Þ such that there exists a run of T 0 satisfying j0 iff there exists a run of T on D satisfying j: Let C be the set of constant symbols in Udb : Obtain T 0 from T by replacing in the program of T every FO quantifier with the corresponding C-bounded quantifier. 
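Obviously, $T'$ is an $\mathrm{ASM}^{IC}$ transducer of vocabulary $\Upsilon$. $T'$ and $T$ are equivalent on $D$ in the sense that for every infinite sequence $\sigma$ of states over $\Upsilon$, $\sigma$ is a run of $T'$ on $D$ iff $\sigma$ is a run of $T$ on $D$. Obtain $\varphi^{C}$ from $\varphi$ by replacing every FO quantifier with the corresponding $C$-bounded quantifier. Define $\varphi' \in \mathrm{T}^{IC}(\Upsilon)$ to be

$$\bigwedge_{D \models \gamma} \gamma \;\wedge\; \mathbf{G}\Bigl(\bigwedge_{R \in \Upsilon_{\mathrm{in}}} \forall \bar{x}\,\bigl(R(\bar{x}) \rightarrow \bar{x} \in C\bigr)\Bigr) \;\wedge\; \varphi^{C},$$

where $\gamma$ ranges in the set of atomic and negated atomic sentences over $\Upsilon_{\mathrm{db}}$. □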
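The quantifier replacement used in this proof can be illustrated with a small sketch. It assumes that a $C$-bounded quantifier lets the quantified variable range only over the elements denoted by the constant symbols in $C$ (our reading of the rule (WBQ) restricted to constant witness sets). The formula classes and the functions `bound_quantifiers` and `holds` are hypothetical names introduced for illustration only; they do not appear in the paper.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical mini-AST for first-order formulas; none of these names come from the paper.
@dataclass(frozen=True)
class Atom:
    rel: str
    args: Tuple[str, ...]   # variable names or constant symbols

@dataclass(frozen=True)
class Not:
    sub: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Exists:               # unrestricted quantifier: x ranges over the whole domain
    var: str
    body: object

@dataclass(frozen=True)
class BoundedExists:        # assumed reading: x ranges only over the named witnesses
    var: str
    witnesses: Tuple[str, ...]
    body: object


def bound_quantifiers(f, constants):
    """Replace every Exists by a BoundedExists whose witness set is `constants`."""
    c = tuple(constants)
    if isinstance(f, Atom):
        return f
    if isinstance(f, Not):
        return Not(bound_quantifiers(f.sub, c))
    if isinstance(f, And):
        return And(bound_quantifiers(f.left, c), bound_quantifiers(f.right, c))
    if isinstance(f, Exists):
        return BoundedExists(f.var, c, bound_quantifiers(f.body, c))
    if isinstance(f, BoundedExists):
        return BoundedExists(f.var, f.witnesses, bound_quantifiers(f.body, c))
    raise TypeError(f"unknown formula node: {f!r}")


def holds(f, domain, consts, rels, env=None):
    """Evaluate f in the structure (domain, consts, rels) under the assignment env."""
    env = env or {}
    if isinstance(f, Atom):
        values = tuple(env[t] if t in env else consts[t] for t in f.args)
        return values in rels[f.rel]
    if isinstance(f, Not):
        return not holds(f.sub, domain, consts, rels, env)
    if isinstance(f, And):
        return (holds(f.left, domain, consts, rels, env)
                and holds(f.right, domain, consts, rels, env))
    if isinstance(f, Exists):
        return any(holds(f.body, domain, consts, rels, {**env, f.var: a})
                   for a in domain)
    if isinstance(f, BoundedExists):
        return any(holds(f.body, domain, consts, rels, {**env, f.var: consts[w]})
                   for w in f.witnesses)
    raise TypeError(f"unknown formula node: {f!r}")


phi = Exists("x", Atom("R", ("x",)))
phi_c = bound_quantifiers(phi, ["a", "b"])
consts = {"a": 1, "b": 2}

# Every element is named by a constant: both versions agree.
named = ({1, 2}, consts, {"R": {(2,)}})
# Element 3 has no name: only the unrestricted quantifier can reach it.
unnamed = ({1, 2, 3}, consts, {"R": {(3,)}})
```

On `named`, both `phi` and `phi_c` evaluate to true, which mirrors the assumption that every element of $D$ is denoted by a constant symbol. On `unnamed`, the two formulas differ, which suggests why that assumption is needed for $T'$ and $T$ to be equivalent on $D$.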
Corollary A.3. (i) For any $N \ge 0$, $\text{run-sat}^{\mathrm{db},\,\mathrm{in} \le N}(\text{ASM-T}, \mathrm{T})$ is polynomial-time reducible to $\text{fin-sat}(\mathrm{FO}^{W} + \mathrm{posTC})$. (ii) For any $k \ge 0$, $\text{run-sat}^{\mathrm{db}}_{k}(\text{ASM-T}, \mathrm{T})$ is polynomial-time reducible to $\text{fin-sat}_{k+1}(\mathrm{FO}^{W} + \mathrm{posTC})$.

Proof. Verify that the previous theorem remains valid if we impose an upper bound on the maximal input flow. The first assertion is then implied by Corollary A.2. We obtain a proof of the second assertion by composing the reductions in Theorem A.3 and Corollary A.2. Consider an instance $(T, \varphi, D)$ of $\text{run-sat}^{\mathrm{db}}_{k}(\text{ASM-T}, \mathrm{T})$ as in the proof of Theorem A.2. Let $d$ be the cardinality of $D$. First, obtain from $(T, \varphi, D)$ an instance $(T', \varphi')$ of $\text{run-sat}(\mathrm{ASM}^{IC}\text{-T}, \mathrm{T}^{IC})$ according to the reduction in Theorem A.3. Then, with $N := d^{k}$, obtain from $(T', \varphi')$ an instance $\chi^{*}_{T',\varphi'}$ of $\text{fin-sat}(\mathrm{FO}^{W} + \mathrm{posTC})$ according to the reduction in Corollary A.2. $\chi^{*}_{T',\varphi'}$ is polynomial-time computable because $k$ is a priori fixed and $L(N, \Upsilon_{\mathrm{in}})$ is polynomially bounded in the size of $D$ and $\Upsilon_{\mathrm{in}}$. □

Corollary A.4. For any $k, N \ge 0$, problem (1) is in PSPACE and problem $(1')$ is in EXPSPACE.

The proof of this corollary is similar to the proof of Corollary A.1. We still owe the reader a proof of Theorem A.2.

Proof of Theorem A.2. Containment has already been discussed. For hardness it suffices to consider problem $(5'')$. We modify the reduction which implied hardness of problem $(3')$ (recall the proof of Theorem 5.2) and apply some of the ideas in the proof of Theorem A.3.
The obtained mapping will reduce the model checking problem for partial fixed-point logic to the complement of problem $(5'')$. Since EXPSPACE is closed under complementation, this will imply hardness of problem $(5'')$. Consider a finite structure $A$ and a sentence $\psi$ of partial fixed-point logic, both over the same vocabulary $\Upsilon$. We may assume that every element of $A$ is denoted by some constant symbol in $\Upsilon$. Let $C$ be the set of constant symbols in $\Upsilon$. We construct an $\mathrm{ASM}^{IC}$ transducer $T^{*}$ such that, if $T_{\emptyset}$ is the $\mathrm{ASM}^{IC}$ transducer with the empty program and the same vocabulary as $T^{*}$, then $A \models \psi$ iff $(T^{*}, T_{\emptyset})$ is a negative instance of problem $(5'')$. In a first step, obtain the ASM transducer $T$ from $\psi$ as in the proof of Theorem 5.2. Recall that, for every database $D'$ over $\Upsilon$, any run of $T$ on $D'$ satisfies $\mathbf{F}\,\mathit{accept}$ iff $D' \models \psi$. In a second step, obtain the $\mathrm{ASM}^{IC}$ transducer $T^{C}$ from $T$ by replacing in the program of $T$ every FO quantifier with the corresponding $C$-bounded quantifier. Suppose that $P$ is the program of $T^{C}$. Set $\chi_{A} = \bigwedge_{A \models \gamma} \gamma$, where $\gamma$ ranges in the set of atomic and negated atomic sentences over $\Upsilon$. Finally, let the ASM transducer $T^{*}$ be defined by the program $\{\textbf{if } \chi_{A} \textbf{ then } P\}$. □
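To make the guard $\chi_{A}$ concrete, the following minimal sketch enumerates the atomic and negated atomic sentences that hold in a small finite structure whose elements are all named by constants. It assumes that atomic sentences comprise equalities between constant symbols and relational atoms over constant symbols; the function name `atomic_diagram` and the string encoding of literals are hypothetical choices made for illustration, not the paper's notation.

```python
from itertools import product


def atomic_diagram(consts, rels, arities):
    """List the atomic and negated atomic sentences true in a finite structure.

    consts  : dict mapping each constant symbol to the element it denotes
    rels    : dict mapping each relation symbol to its set of tuples
    arities : dict mapping each relation symbol to its arity
    """
    names = sorted(consts)
    literals = []
    # Equalities and inequalities between constant symbols (assumed to count
    # as atomic sentences).
    for c, d in product(names, repeat=2):
        sign = "" if consts[c] == consts[d] else "not "
        literals.append(f"{sign}{c} = {d}")
    # One literal per relation symbol and per tuple of constant symbols.
    for rel in sorted(rels):
        for args in product(names, repeat=arities[rel]):
            values = tuple(consts[c] for c in args)
            sign = "" if values in rels[rel] else "not "
            literals.append(f"{sign}{rel}({', '.join(args)})")
    return literals


# A two-element structure with a single edge from a to b.
diagram = atomic_diagram({"a": 1, "b": 2}, {"E": {(1, 2)}}, {"E": 2})
```

In this sketch the conjunction of all returned literals plays the role of $\chi_{A}$. For a fixed vocabulary the number of literals is polynomial in the number of constants, since each relation symbol of arity $r$ contributes $|C|^{r}$ literals.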
References

[1] S. Abiteboul, L. Herr, J. Van den Bussche, Temporal connectives versus explicit timestamps to query temporal databases, J. Comput. System Sci. 58 (1) (1999) 54–68.
[2] S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison-Wesley Publishing Company, Reading, MA, 1995.
[3] S. Abiteboul, V. Vianu, B. Fordham, Y. Yesha, Relational transducers for electronic commerce, J. Comput. System Sci. 61 (2) (2000) 236–269.
[4] N.R. Adam, Y. Yesha, Electronic commerce: an overview, in: N.R. Adam, Y. Yesha (Eds.), Electronic Commerce, Lecture Notes in Computer Science, Vol. 1028, Springer, Berlin, 1996, pp. 5–12.
[5] E. Börger, J. Huggins, Abstract state machines 1988–1998: commented ASM bibliography, Bulletin of the EATCS 64 (1998) 105–127, see also http://www.eecs.umich.edu/gasm.
[6] E. Dantsin, T. Eiter, G. Gottlob, A. Voronkov, Complexity and expressive power of logic programming, ACM Computing Surveys 33 (3) (2001) 374–425.
[7] H.D. Ebbinghaus, J. Flum, Finite Model Theory, Springer, Berlin, 1995.
[8] E.A. Emerson, Temporal and modal logic, in: J. van Leeuwen (Ed.), Handbook of Theoretical Computer Science, Vol. B, Elsevier Science Publishers B.V., Amsterdam, 1990, pp. 995–1072.
[9] B. Fordham, S. Abiteboul, Y. Yesha, Evolving databases: an application to electronic commerce, in: Proceedings of the International Database Engineering and Applications Symposium (IDEAS), August 1997.
[10] G. Gottlob, G. Kappel, M. Schrefl, Semantics of object-oriented data models - the evolving algebra approach, in: J.W. Schmidt, A.A. Stogny (Eds.), Next Generation Information Technology, Lecture Notes in Computer Science, Vol. 504, Springer, Berlin, 1991, pp. 144–160.
[11] Y. Gurevich, Evolving algebras 1993: Lipari guide, in: E. Börger (Ed.), Specification and Validation Methods, Oxford University Press, Oxford, 1995, pp. 9–36.
[12] Y. Gurevich, May 1997 draft of the ASM guide, Technical Report CSE-TR-336-97, University of Michigan, May 1997.
[13] N. Immerman, M.Y. Vardi, Model checking and transitive closure logic, in: Proceedings of 9th International Conference on Computer-Aided Verification (CAV '97), Lecture Notes in Computer Science, Vol. 1254, Springer, Berlin, 1997, pp. 291–302.
[14] A. Levy, I. Mumick, Y. Sagiv, O. Shmueli, Equivalence, query-reachability, and satisfiability in datalog extensions, in: Proceedings of 12th ACM Symposium on Principles of Database Systems (PODS '93), ACM Press, New York, 1993, pp. 109–122.
[15] C.H. Papadimitriou, Computational Complexity, Addison-Wesley Publishing Company, Reading, MA, 1994.
[16] P. Picouet, V. Vianu, Semantics and expressiveness issues in active databases, J. Comput. System Sci. 57 (3) (1998) 325–355.
[17] E. Rosen, An existential fragment of second order logic, Arch. Math. Logic 38 (1999) 217–234.
[18] A.P. Sistla, E.M. Clarke, The complexity of propositional linear temporal logics, J. Assoc. Comput. Mach. 32 (3) (1985) 733–749.
[19] M. Spielmann, Automatic verification of abstract state machines, in: Proceedings of 11th International Conference on Computer-Aided Verification (CAV '99), Lecture Notes in Computer Science, Vol. 1633, Springer, Berlin, 1999, pp. 431–442.
[20] M. Spielmann, Abstract State Machines: Verification Problems and Complexity, Ph.D. Thesis, RWTH Aachen, 2000.
[21] M. Vardi, The complexity of relational query languages, in: Proceedings of 14th ACM Symposium on Theory of Computing (STOC '82), ACM Press, New York, 1982, pp. 137–146.
[22] M. Vardi, On the complexity of bounded-variable queries, in: Proceedings of 14th ACM Symposium on Principles of Database Systems (PODS '95), ACM Press, New York, 1995, pp. 266–276.