J. Symbolic Computation (1996) 21, 211–243

Unification in the Union of Disjoint Equational Theories: Combining Decision Procedures

FRANZ BAADER† AND KLAUS U. SCHULZ‡

† LuFg Theoretical Computer Science, RWTH Aachen, Ahornstr. 55, 52074 Aachen, Germany
‡ CIS, University of Munich, Wagmüllerstr. 23, 80538 München, Germany

(Received 15 November 1993)

Most of the work on the combination of unification algorithms for the union of disjoint equational theories has been restricted to algorithms that compute finite complete sets of unifiers. Thus the developed combination methods usually cannot be used to combine decision procedures, i.e., algorithms that just decide solvability of unification problems without computing unifiers. In this paper we describe a combination algorithm for decision procedures that works for arbitrary equational theories, provided that solvability of so-called unification problems with constant restrictions—a slight generalization of unification problems with constants—is decidable for these theories. As a consequence of this new method, we can, for example, show that general A-unifiability, i.e., solvability of A-unification problems with free function symbols, is decidable. Here A stands for the equational theory of one associative function symbol. Our method can also be used to combine algorithms that compute finite complete sets of unifiers. Manfred Schmidt-Schauß's combination result, until now the most general result in this direction, can be obtained as a consequence of this fact. We also obtain the new result that unification in the union of disjoint equational theories is finitary, if general unification—i.e., unification of terms with additional free function symbols—is finitary in the single theories. © 1996 Academic Press Limited

1. Introduction

E-unification is concerned with solving term equations modulo an equational theory E. The theory is called “unitary” (“finitary”) if the solutions of a system of equations can always be represented by one (finitely many) solution(s). Otherwise the theory is of type “infinitary” or “zero” [see e.g., Siekmann (1989); Jouannaud and Kirchner (1991); Baader and Siekmann (1994) for an introduction to unification theory]. Equational theories of unification type unitary or finitary play an important rôle in automated theorem provers with “built in” theories (see e.g., Plotkin, 1972; Stickel, 1985), in generalizations of the Knuth-Bendix algorithm (see e.g., Jouannaud and Kirchner, 1986; Bachmair, 1991), and in logic programming with equality (see e.g., Jaffar et al., 1987). The reason is that


these applications usually require algorithms which compute finite complete sets of unifiers, i.e., finite sets of unifiers from which all unifiers can be generated by instantiation. However, with the recent development of constraint approaches to theorem proving (see e.g., Bürckert, 1990), term rewriting (see e.g., Kirchner and Kirchner, 1989), and logic programming (see e.g., Jaffar and Lassez, 1987 or Colmerauer, 1990), the computation of finite complete sets of unifiers is no longer indispensable for these applications. It is sufficient to decide satisfiability of the constraints, that is, e.g., solvability of the unification problems. In the present paper, the design of decision procedures for unification in the combination of equational theories will be a major issue. First, we explain why this combination problem arises naturally when using unification algorithms in the application domains mentioned above.

the signature matters

When considering unification in equational theories one has to be careful with regard to the signature over which the terms of the unification problems can be built. This leads to the distinction between elementary unification (where the terms to be unified are built over the signature of the equational theory, i.e., the function symbols occurring in the axioms of the theory), unification with constants (where additional free constant symbols may occur), and general unification (where additional free function symbols of arbitrary arity may occur). The following facts show that there really is a difference between the three types of E-unification:

(i) There exist theories that are unitary with respect to elementary unification, but finitary with respect to unification with constants. An example for such a theory is the theory of Abelian monoids, i.e., the theory of an associative-commutative (AC) function symbol with a unit element (see e.g., Herold, 1987).
(ii) There exists an equational theory for which elementary unification is decidable, but unification with constants is undecidable (see Bürckert, 1986).
(iii) From the development of the first algorithm for AC-unification with constants (Stickel, 1975; Livesey and Siekmann, 1975) it took almost a decade until the termination of an algorithm for general AC-unification was shown by Fages (1984, 1987).

The applications of theory unification mentioned above require algorithms for general unification. This fact is illustrated by the following example.

Example 1.1. The theory A = {f (f (x, y), z) = f (x, f (y, z))} only contains the binary symbol f . When talking about A-unification, one first thinks of unifying modulo A terms built by using just the symbol f and variables, or equivalently, of unifying words over the alphabet V of all variables. However, suppose that a resolution theorem prover—which has built in the theory A—receives the formula ∃x: (∀y: f (x, y) = y ∧ ∀y: ∃z: f (z, y) = x) as axiom. In a first step, this formula must be Skolemized, i.e., the existential quantifiers


have to be replaced by new function symbols. In our example, we need a nullary symbol e and a unary symbol i in the Skolemized form ∀y: f (e, y) = y ∧ ∀y: f (i(y), y) = e of the axiom. This shows that, even if we start with formulae containing only terms built over f , the theorem prover eventually has to handle terms containing additional free symbols.

the combination problem

We have mentioned that the question of how algorithms for elementary unification (or for unification with constants) can be used to obtain algorithms for general unification is nontrivial and important for applications. More generally, one would often like to derive algorithms for unification in the union of disjoint equational theories, i.e., in the union of several equational theories over disjoint signatures, from unification algorithms for the single theories. The importance for applications of this so-called “combination problem” is illustrated by the following example.

Example 1.2. Assume that we want to compute a canonical term rewriting system for the theory of Boolean rings. Thus we have a signature consisting of two binary symbols “+” and “∗”, a unary symbol “−”, and two nullary symbols “0” and “1”. Since addition and multiplication in Boolean rings are associative and commutative, and since commutativity cannot be oriented into a terminating rewrite rule, we must use rewriting modulo associativity and commutativity of “+” and “∗”. Thus, critical pairs must also be computed modulo associativity and commutativity of these two symbols. To be more precise, we consider the theories AC+ := {(x + y) + z = x + (y + z), x + y = y + x} and AC∗ := {(x ∗ y) ∗ z = x ∗ (y ∗ z), x ∗ y = y ∗ x}. Critical pairs are computed with the help of general unification modulo AC+ ∪ AC∗, i.e., modulo the union of the two disjoint equational theories AC+ and AC∗. This example can also be used to demonstrate that going from elementary unification to general unification is in fact an instance of the combination problem. If we define the free theory for “−”, “0” and “1” to be F0,1,− = {−x = −x, 0 = 0, 1 = 1}, then one can use elementary unification modulo AC+ ∪ AC∗ ∪ F0,1,− instead of general unification modulo AC+ ∪ AC∗ for computing critical pairs.

When considering the combination problem, attention was until now mostly restricted to finitary unifying theories, and by a unification algorithm one meant a procedure that computes a finite complete set of unifiers. The problem was first considered in Stickel (1975), Stickel (1981), Fages (1987) and Herold and Siekmann (1987) for the case where several AC-symbols and free symbols may occur in the terms to be unified. More general combination problems were, for example, treated in Kirchner (1985), Tidén (1986), Herold (1986), Yelick (1987) and Boudet et al. (1989), but the theories considered in these papers always had to satisfy certain restrictions (such as being collapse-free or regular†) on the syntactic form of their defining identities.

† A theory E is called collapse-free if it does not contain an identity of the form x = t where x is a variable and t is a non-variable term, and it is called regular if the left and right hand sides of the identities contain the same variables.


The problem was finally solved, in what is until now its most general form, by Schmidt-Schauß (1989). His algorithm imposes no restriction on the syntactic form of the identities. The only requirements for a combination of disjoint theories E, F are:

(i) All unification problems with constants must be finitary solvable in E and F.
(ii) All constant elimination problems must be finitary solvable in E and F.

A more efficient version of this algorithm has been described by Boudet (1993). The method of Schmidt-Schauß can also handle theories that are not finitary. In this case, procedures enumerating complete sets of unifiers for the single theories can be combined to a procedure enumerating a complete set of unifiers for their union. However, even if unification in the single theories is decidable, this does not show how to obtain a decision algorithm for unifiability in the combined theory.

The infinitary theory A = {f (f (x, y), z) = f (x, f (y, z))} is an example for this case. Plotkin (1972) described a procedure that enumerates minimal complete sets of A-unifiers for general A-unification problems, and Makanin (1977) showed that A-unification with constants is decidable. But in 1991, decidability of general A-unification was still mentioned as an open problem by Kapur and Narendran (1991) in their table of known decidability and complexity results for unification. Such a decision procedure could, for example, be useful when building associativity into a theorem prover via constraint resolution; and it could be used to make Plotkin's enumeration procedure terminating for equations having finite complete sets of A-unifiers.

In his paper on unification in the combination of arbitrary disjoint equational theories, Schmidt-Schauß (1989) also treats the problem of how to combine decision procedures. But in this case he needs decision procedures for general unification in the single theories as prerequisites for his algorithm. Thus his result cannot be used to solve the above mentioned open problem of decidability of general A-unification.

The research presented in this paper builds on the ideas of Schmidt-Schauß and Boudet. It was motivated by the question of how to design a decision procedure for general A-unification. However, the results we have obtained are more general. We shall present a method that allows one to decide unifiability in the union of arbitrary disjoint equational theories, provided that solvability of so-called unification problems with constant restrictions—a slight generalization of unification problems with constants—is decidable for the single theories. In addition, our method can also be used to combine algorithms that compute finite complete sets of unifiers.

These main results and some of their interesting consequences will be described in the next section. Among these consequences are the new results that general A-unification is in fact decidable, and that the union of disjoint equational theories is finitary if the single theories are finitary with respect to general unification. In Section 3 we shall present the algorithm for the decision problem, and describe how it can also be used to generate complete sets of unifiers. Section 4 proves the correctness of the method. In Section 5 we shall describe conditions under which algorithms for solving unification problems with constant restrictions exist. Some of the consequences mentioned in Section 2 depend on these results. In Section 6 we consider optimization techniques.


2. Main Results and Consequences

As mentioned in the introduction, we must consider a slight generalization of E-unification problems with constants, so-called E-unification problems with constant restriction, which will be introduced below. Having an algorithm that solves this kind of problem is the only prerequisite necessary for our combination method.

Recall that an E-unification problem with constants is a finite set of equations Γ = {s1 = t1 , . . . , sn = tn }, where the terms s1 , . . . , tn are built from variables, the function symbols occurring in the axioms of E, and additional free constant symbols. Now, an E-unification problem with constant restriction is an ordinary E-unification problem with constants, Γ, where each free constant c occurring in the problem Γ is equipped with a set Vc of variables, namely, the variables in whose image c must not occur. A solution of the problem is an E-unifier σ of Γ such that for all c, x with x ∈ Vc , the constant c does not occur in xσ. Complete sets of solutions of unification problems with constant restriction are defined as in the case of ordinary unification problems.

It turns out that our combination method does not really need an algorithm that can handle E-unification problems with arbitrary constant restrictions; it is sufficient to deal with problems with a so-called linear constant restriction. Such a restriction is induced by a linear ordering on the variables and free constants as follows: Let X be the set of all variables and C be the set of all free constants occurring in Γ. For a given linear ordering < on X ∪ C, the sets Vc are defined as {x | x is a variable with x < c}.

We are now ready to formulate our first main result, which is concerned with combining decision algorithms. The decomposition algorithm that is used to establish this result will be described in the next section.

Theorem 2.1. Let E1 , . . . , En be equational theories over disjoint signatures such that solvability of Ei -unification problems with linear constant restriction is decidable for i = 1, . . . , n. Then unifiability is decidable for the combined theory E1 ∪ · · · ∪ En .

By “unifiability” we mean here solvability of elementary unification problems. However, we shall see below that the result can be lifted to general unification, and to solvability of unification problems with linear constant restriction. The theorem also has several other interesting consequences, which are listed below.

1. Let E be an equational theory such that solvability of E-unification problems with linear constant restriction is decidable. Then solvability of general E-unification problems is decidable. In fact, for a given set Ω of function symbols we can always build the free theory FΩ as exemplified in Example 1.2. It is easy to see that FΩ satisfies the assumption of the theorem; and obviously, any general unification problem modulo E can be seen as an elementary unification problem modulo E ∪ FΩ (where Ω contains all the additional free function symbols occurring in the problem).
2. This argument also shows why the result of the theorem can be lifted to general unification: in order to get decidability of general unification modulo E1 ∪ · · · ∪ En , apply the theorem to E1 , . . . , En , FΩ .
3. General A-unifiability is decidable. For A, decidability of unification problems with constant restriction is an easy consequence (see Baader and Schulz, 1993) of a result by Schulz (1991) on a generalization of Makanin's procedure. This result shows that it is still decidable whether a given A-unification problem with constants has a solution for which the words substituted for the variables in the problem are elements of given regular languages over the constants. It is easy to see that problems with constant restriction are a special case of these more generally restricted problems.
4. General AI-unifiability, where AI := A ∪ {f (x, x) = x}, is decidable. This was also stated as an open problem in Kapur and Narendran (1991). For AI, decidability of unification problems with constant restriction easily follows from the well-known fact [see e.g. Howie (1976)] that finitely generated idempotent semigroups are finite [see Baader and Schulz (1993) for details].
5. If solvability of the Ei -unification problems with linear constant restriction can be decided by an NP-algorithm, then unifiability in the combined theory is also NP-decidable. This fact will become obvious once we have described our decomposition algorithm.
6. As a consequence one obtains simple proofs of Kapur and Narendran's results (Kapur and Narendran, 1991) that solvability of general AC- and ACI-unification problems can be decided by NP-algorithms. For these theories, NP-decidability of unification problems with constant restriction can be shown very similarly to the case of ordinary unification problems with constants [see Baader and Schulz (1993) for details].
7. Let E1 , . . . , En be equational theories over disjoint signatures such that solvability of general Ei -unification problems is decidable for i = 1, . . . , n. Then unifiability is decidable for the combined theory E1 ∪ · · · ∪ En . This result, which was first proved by Schmidt-Schauß [see Schmidt-Schauß (1989), Theorem 10.6], can also be obtained as a corollary to our theorem. In fact, we shall show that solvability of E-unification problems with linear constant restriction can be reduced to solvability of general E-unification problems (see Section 5). Together with the second consequence mentioned above, this reduction also shows that the result of Theorem 2.1 can be lifted to unification problems with linear constant restriction.
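The linear constant restriction induced by an ordering, and the check that a given substitution respects it, are cheap to compute. The following Python sketch merely illustrates the definitions above; the term encoding and the names induced_restriction and respects_restriction are ours, not notation from the paper.

```python
# Illustration only: computing the linear constant restriction induced by an
# ordering on X ∪ C, and checking whether a substitution respects it.
# Terms are nested tuples: ("f", t1, ..., tn); variables/constants are strings.

def induced_restriction(order, variables, constants):
    """order: list giving the linear ordering on variables and constants.
    Returns {c: V_c} with V_c = {x in variables | x < c}."""
    position = {symbol: i for i, symbol in enumerate(order)}
    return {c: {x for x in variables if position[x] < position[c]}
            for c in constants}

def occurs(symbol, term):
    """Does the free constant `symbol` occur in `term`?"""
    if isinstance(term, str):
        return term == symbol
    return any(occurs(symbol, argument) for argument in term[1:])

def respects_restriction(substitution, restriction):
    """A solution must not introduce c into xσ for any x in V_c."""
    return all(not occurs(c, substitution.get(x, x))
               for c, forbidden in restriction.items()
               for x in forbidden)

# In the worked example of Section 3, the ordering z1 < z2 < x3 < x < x1 < y is
# chosen; in Γ_{5,1} only x3 and y are variables, while x1, z1, z2 are constants.
restriction = induced_restriction(
    ["z1", "z2", "x3", "x", "x1", "y"], {"x3", "y"}, {"x1", "z1", "z2"})
assert restriction["x1"] == {"x3"} and restriction["z1"] == set()
```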

The algorithm that will be introduced for proving Theorem 2.1 can also be used to compute complete sets of unifiers.

Theorem 2.2. Let E1 , . . . , En be equational theories over disjoint signatures such that all Ei -unification problems with linear constant restriction have finite complete sets of solutions (i = 1, . . . , n). Then the combined theory E1 ∪ · · · ∪ En is finitary.

Again, we are talking about elementary unification for the combined theory; but as in the case of the decision problem, the result can easily be lifted to general unification, and to unification problems with linear constant restriction. It should be noted that this result is effective in the sense that we really obtain an algorithm computing finite complete sets of unifiers for the combined theory, provided that for the single theories there exist algorithms computing finite complete sets of solutions of unification problems with linear constant restriction. In the following, we mention two other interesting consequences of the theorem.

8. Let E1 , . . . , En be equational theories over disjoint signatures that are finitary with respect to general unification. Then the combined theory E1 ∪ · · · ∪ En is finitary.


In fact, we can show how finite complete sets of unifiers for general Ei -unification problems can be used to construct finite complete sets of solutions for unification problems with linear constant restriction (see Section 5).

9. Algorithms that compute finite complete sets of unifiers for unification with constants, and finite complete sets of constant eliminators, can be used to construct an algorithm computing finite complete sets of solutions for unification problems with constant restriction (see Section 5). As a consequence, the combination result of Schmidt-Schauß (1989, Corollary 7.14) mentioned in the introduction can also be obtained as a corollary to Theorem 2.2.

the logical status

Most of the results mentioned above use the notion of an E-unification problem with linear constant restriction. At first sight, a constant restriction such as “c must not occur in the value of x” seems to be a rather technical condition without an abstract logical meaning. This impression is wrong, however, in the particular case of a linear constant restriction. In fact, these conditions have a clear logical status.

Let E be an equational theory with signature Σ, and let L(Σ) denote the (purely equational) first order language associated with Σ. We consider the positive fragment L+ (Σ) of L(Σ): formulae of L+ (Σ) are built from equational atoms s = t using the Boolean operators conjunction and disjunction (but no negation), and both universal and existential quantification. Let φ be an L+ (Σ)-sentence of the form

(∗)   φ = Q1 x1 · · · Qk xk ((s1 = t1 ) ∧ · · · ∧ (sn = tn )),

where the Qi are existential or universal quantifiers. From φ we obtain in a canonical way an E-unification problem with linear constant restriction, Γ, as follows: Γ consists of the equations si = ti (i = 1, . . . , n). Existentially (universally) quantified variables of φ are treated as variables (constants), and the linear order on these variables and constants is given by the quantifier prefix, i.e., x1 < x2 < ... < xk . Obviously, this translation also works in the converse direction, and thus we obtain for every E-unification problem with linear constant restriction, Γ, a unique L+ (Σ)-sentence φ of the form (∗). In Section 5 we shall prove that φ is a theorem of E if, and only if, Γ has a solution. This fact has several consequences. We say that the positive fragment of the theory of E is decidable (or, shorter, the positive theory of E is decidable) if there exists an algorithm that decides for arbitrary L+ (Σ)-sentences† φ whether φ is a theorem of E.

10. Let E be an equational theory. The positive theory of E is decidable if, and only if, solvability of E-unification problems with linear constant restriction is decidable.

Now Theorem 2.1 and Result 7 can be reformulated:

Theorem 2.3. Let E1 , . . . , En be equational theories over disjoint signatures. Then the positive theory of E1 ∪ · · · ∪ En is decidable if the positive fragments of all subtheories Ei are decidable, for i = 1, . . . , n.

† Note that arbitrary L+ (Σ)-sentences are not necessarily of the form (∗) since they may contain disjunction. In Section 5 we show how disjunction is taken into account.
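The translation between sentences of the form (∗) and unification problems with linear constant restriction is purely syntactic. The following Python sketch illustrates it; the representation of the quantifier prefix and the helper name sentence_to_restricted_problem are our own choices, not notation from the paper.

```python
# Illustration of the canonical translation described above: existentially
# quantified variables become variables, universally quantified ones become
# free constants, and the linear ordering is read off the quantifier prefix.

def sentence_to_restricted_problem(prefix, equations):
    """prefix: list of ("exists"|"forall", name) in quantifier order.
    equations: list of (lhs, rhs) term pairs.
    Returns (equations, variables, constants, restriction), where restriction
    maps each constant c to V_c = {variables quantified before c}."""
    variables, constants, restriction = set(), set(), {}
    seen_variables = []
    for quantifier, name in prefix:
        if quantifier == "exists":
            variables.add(name)
            seen_variables.append(name)
        else:  # "forall": treated as a free constant
            constants.add(name)
            restriction[name] = set(seen_variables)
    return equations, variables, constants, restriction

# ∃x ∀y ∃z. (f(x, y) = z) corresponds to the restriction V_y = {x}:
_, xs, cs, res = sentence_to_restricted_problem(
    [("exists", "x"), ("forall", "y"), ("exists", "z")],
    [(("f", "x", "y"), "z")])
assert res == {"y": {"x"}} and xs == {"x", "z"} and cs == {"y"}
```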


Unification problems with linear constant restriction have been considered elsewhere. It is well-known that already Herbrand (1930, 1967) considered unification of first order terms. But what he really studied in this context were in fact unification problems with linear constant restriction, with an order induced by a quantifier prefix as described above. The interplay between mixed quantifier prefixes and constant restrictions in the context of simply typed λ-terms has recently been considered by Miller (1992).

3. The Decomposition Algorithm

For the sake of convenience we shall restrict the presentation to the combination of two theories. The combination of more than two theories can be treated analogously. Before we can start with the description of the algorithm we must introduce some notation.

Let E1 , E2 be two equational theories built over the disjoint signatures Ω1 , Ω2 , and let E = E1 ∪ E2 denote their union. Since we are only interested in elementary E-unification, we can restrict our attention to terms built from variables and symbols of Ω1 ∪ Ω2 . The elements of Ω1 will be called 1-symbols and the elements of Ω2 2-symbols. A term t is called an i-term iff it is of the form t = f (t1 , ..., tn ) for an i-symbol f (i = 1, 2). A subterm s of a 1-term t is called an alien subterm of t iff it is a 2-term such that every proper superterm of s in t is a 1-term. Alien subterms of 2-terms are defined analogously. An i-term s is pure iff it contains only i-symbols and variables. An equation s = t is pure iff there exists an i, 1 ≤ i ≤ 2, such that s and t are pure i-terms or variables; this equation is then called an i-equation. Please note that according to this definition equations of the form x = y where x and y are variables are both 1- and 2-equations. In the following, the symbols x, y, z, with or without subscripts, will always stand for variables.

Example 3.1. Let Ω1 consist of the binary (infix) symbol “◦” and Ω2 of the unary symbol “h”, let E1 := {x ◦ (y ◦ z) = (x ◦ y) ◦ z} be the theory which says that “◦” is associative, and let E2 := {h(x) = h(x)} be the free theory for “h”. The term y ◦ h(z ◦ h(x)) is a 1-term, which has h(z ◦ h(x)) as its only alien subterm. The equation h(x1 ) ◦ x2 = y is not pure, but it can be replaced by two pure equations as follows. We replace the alien subterm h(x1 ) of h(x1 ) ◦ x2 by a new variable z. This yields the pure equation z ◦ x2 = y. In addition, we consider the new equation z = h(x1 ). This process of replacing alien subterms by new variables is called variable abstraction. It will be the first of the five steps of our decomposition algorithm.

the main procedure

The input for the decomposition algorithm is an elementary E-unification problem, i.e., a system Γ0 = {s1 = t1 , . . . , sn = tn }, where the terms s1 , . . . , tn are built from variables and the function symbols occurring in Ω1 ∪ Ω2 , the signature of E = E1 ∪ E2 . The first two steps of the algorithm are deterministic, i.e., they transform the given system of equations into one new system.

Step 1: variable abstraction. Alien subterms are successively replaced by new variables until all terms occurring in the system are pure. To be more precise, assume that s = t or t = s is an equation in the current system, and that s contains the alien subterm s1 . Let x be a variable not occurring in the current system, and let s′ be the term obtained from s by replacing s1 by x. Then the original equation is


replaced by the two equations s′ = t and x = s1 . This process is iterated until all terms occurring in the system are pure. It is easy to see that this can be achieved after finitely many iterations. We obtain a new system Γ1 . Now all the terms in the system are pure, but there may still exist non-pure equations, consisting of a 1-term on one side and a 2-term on the other side.

Step 2: split non-pure equations. Each non-pure equation of the form s = t is replaced by two equations x = s, x = t, where the x are always new variables.

It is quite obvious that these two steps do not change solvability of the system. The result is a system Γ2 , which consists of pure equations. The third and the fourth step are nondeterministic, i.e., a given system is transformed into finitely many new systems. Here the idea is that the original system is solvable iff at least one of the new systems is solvable.

Step 3: variable identification. Consider all possible partitions of the set of all variables occurring in the system. Each of these partitions yields one of the new systems as follows. The variables in each class of the partition are “identified” with each other by choosing an element of the class as representative, and replacing in the system all occurrences of variables of the class by this representative.

Step 4: choose ordering and theory indices. This step does not change a given system, it just adds some information that will be important in the next step. For a given system Γ3 obtained in Step 3, consider all possible strict linear orderings < on the variables of the system, and all mappings ind from the set of variables into the set of theory indices {1, 2}. Each pair (<, ind) yields one of the new systems Γ4 obtained from the given one.

The last step is again deterministic. It splits each of the systems already obtained into a pair of pure systems.

Step 5: split systems. A given system Γ4 is split into two systems Γ5,1 and Γ5,2 such that Γ5,1 contains only 1-equations and Γ5,2 only 2-equations. These systems can now be considered as unification problems with linear constant restriction. In the system Γ5,i , the variables with index i are still treated as variables, but the variables with alien index j ≠ i are treated as free constants. The linear constant restriction for Γ5,i is induced by the linear ordering chosen in the previous step.

The output of the algorithm is thus a finite set of pairs (Γ5,1 , Γ5,2 ) where the first component Γ5,1 is an E1 -unification problem with linear constant restriction, and the second component Γ5,2 is an E2 -unification problem with linear constant restriction.

Proposition 3.2. The input system Γ0 is solvable if and only if there exists a pair (Γ5,1 , Γ5,2 ) in the output set such that Γ5,1 and Γ5,2 are solvable.

A proof of this proposition is given in the next section. Obviously, if solvability of E1 - and E2 -unification problems with linear constant restrictions is decidable, the proposition implies decidability of elementary E-unifiability, which proves Theorem 2.1.
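Steps 1 and 2 are purely syntactic and easy to implement on any term representation. The following Python sketch is only an illustration, not code from the paper; the term encoding (variables as strings, applications as tuples) and the names abstract_aliens and purify are our own choices.

```python
# Illustrative sketch of Steps 1 and 2 (the deterministic part).
# Terms: variables are strings, applications are tuples (symbol, arg1, ..., argn).
# `theory_of` maps every function symbol to its theory index 1 or 2.

from itertools import count

fresh = (f"_v{i}" for i in count())          # supply of new variables

def index_of(term, theory_of):
    return None if isinstance(term, str) else theory_of[term[0]]

def abstract_aliens(term, theory_of, new_equations):
    """Replace every alien subterm of `term` by a new variable (Step 1),
    recording the new equations x = alien_subterm."""
    if isinstance(term, str):
        return term
    i = theory_of[term[0]]
    purified_args = []
    for argument in term[1:]:
        j = index_of(argument, theory_of)
        if j is not None and j != i:          # alien subterm found
            x = next(fresh)
            new_equations.append((x, abstract_aliens(argument, theory_of, new_equations)))
            purified_args.append(x)
        else:
            purified_args.append(abstract_aliens(argument, theory_of, new_equations))
    return (term[0],) + tuple(purified_args)

def purify(equations, theory_of):
    """Steps 1 and 2: return an equivalent system of pure equations."""
    result = []
    for s, t in equations:
        extra = []
        s, t = (abstract_aliens(u, theory_of, extra) for u in (s, t))
        i, j = index_of(s, theory_of), index_of(t, theory_of)
        if i is not None and j is not None and i != j:   # Step 2: split mixed equation
            x = next(fresh)
            result += [(x, s), (x, t)]
        else:
            result.append((s, t))
        result += extra
    return result

# Example 3.1: h(x1) ∘ x2 = y, with ∘ a 1-symbol and h a 2-symbol.
theory_of = {"∘": 1, "h": 2}
print(purify([(("∘", ("h", "x1"), "x2"), "y")], theory_of))
# e.g. [(('∘', '_v0', 'x2'), 'y'), ('_v0', ('h', 'x1'))]
```

The printed system corresponds to the variable abstraction carried out by hand in Example 3.1, with a machine-generated name instead of z.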


an example

We consider the theories E1 and E2 of Example 3.1, and the unification problem {h(x) ◦ y = y ◦ h(z1 ◦ z2 )}.

Step 1: variable abstraction. This step results in the new system {x1 ◦ y = y ◦ x2 , x1 = h(x), x2 = h(x3 ), x3 = z1 ◦ z2 }.

Step 2: split non-pure equations. Since all equations are already pure, nothing is done in this step.

Step 3: variable identification. As an example, we consider the partition where x1 and x2 are in one class, and all the other variables are in singleton classes. Choosing x1 as representative for its class, we obtain the new system {x1 ◦ y = y ◦ x1 , x1 = h(x), x1 = h(x3 ), x3 = z1 ◦ z2 }.

Step 4: choose ordering and theory indices. As an example, we take the linear ordering z1 < z2 < x3 < x < x1 < y, and the theory indices ind(x1 ) = ind(x) = ind(z1 ) = ind(z2 ) = 2 and ind(x3 ) = ind(y) = 1.

Step 5: split systems. On the one hand, we get the system Γ5,1 = {x1 ◦ y = y ◦ x1 , x3 = z1 ◦ z2 } consisting of pure 1-equations. In this system the variables with index 1, i.e., x3 and y, are still treated as variables, but the variables of index 2, i.e., x1 , z1 and z2 , are treated as free constants. The linear constant restriction induced by the linear ordering is given by Vx1 = {x3 }, Vz1 = Vz2 = ∅. On the other hand, we obtain the system Γ5,2 = {x1 = h(x), x1 = h(x3 )} consisting of pure 2-equations. Here x and x1 are treated as variables, and x3 is treated as a free constant. The constant restriction is given by Vx3 = ∅.

This pair (Γ5,1 , Γ5,2 ) is one element in the set obtained as output of the algorithm. It is easy to see that Γ5,1 has the solution {x3 ↦ z1 ◦ z2 , y ↦ x1 }, and Γ5,2 has the solution {x1 ↦ h(x3 ), x ↦ x3 }. Consequently, the proposition implies that the original system has a solution.

combination of unifiers

The decomposition algorithm can also be used to compute complete sets of unifiers for elementary (E1 ∪ E2 )-unification problems, provided that one can compute finite complete sets of solutions for all Ei -unification problems with linear constant restriction (i = 1, 2). The reason is that solutions of the problems Γ5,1 , Γ5,2 in the output of the algorithm can be combined to solutions of the original input system. This combined solution is defined inductively over the linear ordering chosen in Step 4 of the algorithm.
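Before looking at the combination of unifiers in more detail, we note that Steps 3–5, together with Proposition 3.2, immediately yield a (highly nondeterministic) decision procedure. The following Python sketch enumerates the finitely many choices by brute force, using the same term encoding as the previous sketch; the component procedures decide_1 and decide_2, which are assumed to decide E1- and E2-unification with linear constant restriction, are black boxes, and all names are our own.

```python
# Brute-force sketch of Steps 3-5: enumerate variable partitions, linear
# orderings and index assignments, split the system, and ask the (assumed)
# component decision procedures about the two restricted problems.

from itertools import permutations, product

def partitions(elements):
    """All set partitions of a list of variables (Step 3)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        for k, block in enumerate(smaller):
            yield smaller[:k] + [[first] + block] + smaller[k + 1:]
        yield [[first]] + smaller

def rename(term, representative):
    if isinstance(term, str):
        return representative.get(term, term)
    return (term[0],) + tuple(rename(argument, representative) for argument in term[1:])

def equation_index(equation, theory_of):
    """1 or 2 if one side is an i-term, None for variable-variable equations."""
    for side in equation:
        if not isinstance(side, str):
            return theory_of[side[0]]
    return None

def decide_combined(pure_equations, variables, theory_of, decide_1, decide_2):
    """pure_equations: the pure system Γ2 produced by Steps 1-2.
    decide_i(equations, vars_i, restriction) is assumed to decide E_i-unification
    with linear constant restriction (the remaining variables act as constants)."""
    for partition in partitions(sorted(variables)):
        representative = {x: block[0] for block in partition for x in block}
        identified = [(rename(s, representative), rename(t, representative))
                      for s, t in pure_equations]
        reps = sorted({representative[x] for x in variables})
        for order in permutations(reps):                       # Step 4: linear ordering
            for indices in product((1, 2), repeat=len(reps)):  # Step 4: theory indices
                ind = dict(zip(order, indices))
                for i, decide in ((1, decide_1), (2, decide_2)):
                    gamma_i = [eq for eq in identified
                               if equation_index(eq, theory_of) in (i, None)]
                    vars_i = {x for x in reps if ind[x] == i}
                    restriction = {c: {x for x in vars_i if order.index(x) < order.index(c)}
                                   for c in reps if ind[c] != i}
                    if not decide(gamma_i, vars_i, restriction):
                        break
                else:
                    return True     # a pair (Γ_{5,1}, Γ_{5,2}) with both parts solvable
    return False
```

This enumeration is of course exponential; it only mirrors the nondeterministic description above and makes the NP upper bound of Consequence 5 plausible.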


Assume that σ1 is a solution of Γ5,1 and σ2 is a solution of Γ5,2 . Without loss of generality we may assume that the substitution σi maps all variables of index i to terms containing only variables of index j ≠ i (which are treated as free constants in Γ5,i ) or new variables, i.e., variables not occurring in Γ0 , Γ5,1 , or Γ5,2 . This can simply be achieved by renaming variables if necessary.

First, we define the combined solution σ on the variables occurring in the system obtained after Step 4 of the algorithm. Note that the input system Γ0 may contain additional variables, which have been replaced during the variable identification step. Let x be the least variable with respect to the linear ordering chosen in Step 4, and let i be its index. Since the solution σi of Γ5,i satisfies the constant restriction induced by the linear ordering, the term xσi does not contain any variables of index j ≠ i (recall that these variables are treated as free constants in Γ5,i ). Thus we can simply define xσ := xσi . Now let x be an arbitrary variable with index i, and let y1 , . . . , ym be the variables with index j ≠ i occurring in xσi . Since σi satisfies the constant restriction induced by the linear ordering, the variables y1 , . . . , ym (which are treated as free constants in Γ5,i ) have to be smaller than x. This means that y1 σ, . . . , ym σ are already defined. The term xσ is now obtained from xσi by replacing the yk by yk σ (k = 1, . . . , m). Because we have assumed that the other variables occurring in xσi are new variables, we thus have xσ = xσi σ. Finally, let x be a variable of the input system that has been replaced by the variable y during the variable identification step. Thus yσ is already defined, and we can simply set xσ := yσ. For all variables z not occurring in the input system we define zσ := z.

Example 3.3. For the above example, the solutions σ1 = {x3 ↦ z1 ◦ z2 , y ↦ x1 } and σ2 = {x1 ↦ h(x3 ), x ↦ x3 } of Γ5,1 , Γ5,2 are combined to {z1 ↦ z1 , z2 ↦ z2 , x3 ↦ z1 ◦ z2 , x ↦ z1 ◦ z2 , x1 ↦ h(z1 ◦ z2 ), x2 ↦ h(z1 ◦ z2 ), y ↦ h(z1 ◦ z2 )}.

This construction can now be used to generate complete sets of unifiers for elementary (E1 ∪ E2 )-unification problems. For a given system Γ0 , let {(Γ¹5,1 , Γ¹5,2 ), . . . , (Γⁿ5,1 , Γⁿ5,2 )} be the output of the decomposition algorithm, if it is applied to the input Γ0 . For i = 1, . . . , n and j = 1, 2, let Mi,j be a complete set of solutions of the Ej -unification problem with linear constant restriction Γⁱ5,j .

Proposition 3.4. The set of substitutions

⋃_{i=1,...,n} {σ | σ is the combined solution obtained from σ1 ∈ Mi,1 and σ2 ∈ Mi,2 }

is a complete set of (E1 ∪ E2 )-unifiers of the input system Γ0 . A proof of this proposition will be given in the next section. Obviously, if all the sets Mi,j are finite, then the complete set given by the proposition is also finite, which proves Theorem 2.2.
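The combined solution can be computed by a single pass over the variables in the chosen linear ordering. The following Python sketch (our own naming and term encoding, consistent with the earlier sketches) mirrors the inductive definition given above and reproduces Example 3.3.

```python
# Sketch of the combination of unifiers: walk up the linear ordering and
# instantiate, in xσ_i, the smaller variables of the other index that have
# already received a value.

def apply(substitution, term):
    """Apply a substitution (dict: variable -> term) to a term."""
    if isinstance(term, str):
        return substitution.get(term, term)
    return (term[0],) + tuple(apply(substitution, argument) for argument in term[1:])

def combine(order, ind, sigma_1, sigma_2, identified_with=None):
    """order: variables of Γ4 listed in the linear ordering chosen in Step 4.
    ind[x] is the theory index of x; sigma_i solves Γ_{5,i}.
    identified_with maps variables of Γ0 removed in Step 3 to their representatives."""
    sigma = {}
    for x in order:                       # least variable first
        sigma_i = sigma_1 if ind[x] == 1 else sigma_2
        # the other-index variables in xσ_i are smaller than x, so their
        # σ-values are already defined; hence xσ = xσ_iσ.
        sigma[x] = apply(sigma, sigma_i.get(x, x))
    for x, y in (identified_with or {}).items():
        sigma[x] = sigma[y]               # variables removed by identification
    return sigma

# Example 3.3: recombining the two solutions of the worked example.
sigma = combine(
    order=["z1", "z2", "x3", "x", "x1", "y"],
    ind={"z1": 2, "z2": 2, "x3": 1, "x": 2, "x1": 2, "y": 1},
    sigma_1={"x3": ("∘", "z1", "z2"), "y": "x1"},
    sigma_2={"x1": ("h", "x3"), "x": "x3"},
    identified_with={"x2": "x1"})
assert sigma["y"] == ("h", ("∘", "z1", "z2"))   # y ↦ h(z1 ∘ z2), as in the text
```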


4. Correctness of the Decomposition Algorithm

In this section, we shall prove Proposition 3.2 and Proposition 3.4, which shows that our combination method is correct both for the decision problem and for the problem of computing complete sets of unifiers. Before we can start with our task, we must introduce a useful tool, which has first been utilized in connection with the combination problem in Boudet et al. (1989), namely unfailing completion applied to the combined theory.

Let E1 , E2 be equational theories over disjoint signatures Ω1 , Ω2 . We assume that both theories are consistent, i.e., they have at least one model of cardinality greater than one, or equivalently, the identity x =Ei y does not hold in either theory. One can now apply unfailing completion [see e.g., Dershowitz and Jouannaud (1990) for definitions and properties] to the combined theory E = E1 ∪ E2 . This yields a possibly infinite ordered-rewriting system R which is confluent and terminating on ground terms. In the following, we shall also apply this system to terms containing variables from a fixed countable set of variables X0 ; but this is not a problem because these variables can simply be treated like constants. In particular, this means that the simplification ordering used during the completion must also take care of these additional “constants.”

The ordered-rewriting system R consists of (possibly infinitely many) equations g = d. Such an equation can be applied to a term s ∈ T (Ω1 ∪ Ω2 , X0 ) iff there exists an occurrence u in s and a substitution τ such that s = s[u ← gτ ] (s = s[u ← dτ ], resp.) and gτ is greater than dτ (dτ is greater than gτ , resp.) with respect to the simplification ordering. This application results in the new term s[u ← dτ ] (s[u ← gτ ], resp.). It is easy to see that, because the signatures of E1 and E2 are disjoint, the system R is the union of two systems R1 and R2 , where the terms in Ri are built over the signature Ωi (i = 1, 2). In fact, Ri is just the system that would be obtained by applying unfailing completion to Ei . This is an easy consequence of the definition of critical pairs used for unfailing completion, and of the fact that E1 and E2 are assumed to be consistent.

Let T (Ω1 ∪ Ω2 , X0 ) be the set of terms built from function symbols in Ω1 ∪ Ω2 and variables in X0 , and let T↓R denote its R-irreducible elements. We consider an arbitrary bijection π : T↓R −→ Y where Y is a set of variables of appropriate cardinality. This bijection induces mappings π1 , π2 of terms in T (Ω1 ∪ Ω2 , X0 ) to terms in T (Ω1 ∪ Ω2 , Y ) as follows. For variables x ∈ X0 , xπ1 := π(x) (note that variables are always R-irreducible). If t = f (t1 , . . . , tn ) for a 1-symbol f , then tπ1 := f (t1π1 , . . . , tnπ1 ). Finally, if t is a 2-term then tπ1 := y where y = π(s) for the unique R-irreducible element s of the =E -class of t. The mapping π2 is defined analogously. The mappings πi may be regarded as projections that map a possibly mixed term to an i-pure term. We write these mappings as superscripts to distinguish them from substitutions. The inverse π−1 of π can be seen as a substitution that maps the variables y in Y back to the terms π−1 (y), and is the identity on all other variables. Obviously, we have tπi π−1 =E t for all terms t ∈ T (Ω1 ∪ Ω2 , X0 ), and if t is an R-irreducible term or an i-term such that all its alien subterms are R-irreducible, then (tπi )π−1 = t. A substitution σ is called R-normalized on a finite set of variables Z iff zσ ∈ T↓R for all variables z ∈ Z.
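Given a procedure that computes R-normal forms, the projections πi are straightforward to realize. The sketch below is only an illustration of the definition above: normalize (mapping a term to the R-irreducible representative of its =E-class) is assumed as a black box, the bijection π is built lazily, and all names are our own.

```python
# Sketch of the projection π_i: map a possibly mixed term to an i-pure term by
# replacing every maximal alien subterm with the variable that π assigns to the
# R-normal form of its =_E-class.

class Projection:
    def __init__(self, theory_of, normalize):
        self.theory_of = theory_of      # function symbol -> theory index 1 or 2
        self.normalize = normalize      # assumed black box: term -> R-normal form
        self.pi = {}                    # finite part of the bijection π: normal form -> variable

    def variable_for(self, normal_form):
        # π is injective: equal normal forms get the same variable, new ones a fresh one.
        return self.pi.setdefault(normal_form, f"_y{len(self.pi)}")

    def project(self, term, i):
        """Compute term^{π_i}."""
        if isinstance(term, str):                       # variables are R-irreducible
            return self.variable_for(term)
        if self.theory_of[term[0]] == i:                # i-symbol: descend into arguments
            return (term[0],) + tuple(self.project(argument, i) for argument in term[1:])
        # alien subterm (j-term for j != i): collapse it to π of its normal form
        return self.variable_for(self.normalize(term))

# Tiny usage example with the free theory for h as E_2 (normal form = term itself):
projection = Projection({"∘": 1, "h": 2}, normalize=lambda t: t)
print(projection.project(("∘", "y", ("h", "x")), 1))   # e.g. ('∘', '_y0', '_y1')
```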
The next lemma will be important in the proof of Proposition 3.2.

Lemma 4.1. Let s, t be pure i-terms or variables, and let σ be a substitution that is R-normalized on the variables occurring in s, t. Then sσ =E tσ iff (sσ)πi =Ei (tσ)πi .


Proof. (1) The if-direction is easy to prove. Obviously, (sσ)πi =Ei (tσ)πi implies that (sσ)πi =E (tσ)πi , and thus (sσ)πi π−1 =E (tσ)πi π−1 . By our assumptions on s, t and σ, the j-terms (for j ≠ i) in sσ and tσ are R-irreducible, which finally yields sσ = (sσ)πi π−1 =E (tσ)πi π−1 = tσ.

(2) From sσ =E tσ follows the existence of an R-irreducible term r that is a common R-descendant of sσ and tσ. Let us now consider the derivation s0 := sσ →R s1 →R · · · →R r more closely. The goal is to show (s0)πi =Ei (s1)πi =Ei · · · =Ei rπi . Symmetrically, we could then also deduce (tσ)πi =Ei rπi , which would prove the lemma. The case where s is a variable is trivial since then s0 is R-irreducible, which yields s0 = r. Thus assume that s is a pure i-term. Since all alien subterms of sσ are R-irreducible, the first step of the derivation from s0 to r must take place at an occurrence u which is not inside an alien subterm of s0 = sσ. In particular, this means that it is done by applying a rule g = d of Ri . To be more precise, there exists a substitution τ such that s0 = s0 [u ← gτ ], s1 = s0 [u ← dτ ], and gτ is greater than dτ with respect to the simplification ordering. From the fact that u is not inside an alien subterm of s0 we obtain that (s0)πi = (s0)πi [u ← (gτ )πi ] and (s1)πi = (s0)πi [u ← (dτ )πi ]. In order to conclude (s0)πi =Ei (s1)πi , it thus remains to be shown that (gτ )πi =Ei (dτ )πi . To see this, we define the substitution τπi := {x ↦ (xτ )πi | x occurs in g or d}. Since g, d are pure i-terms or variables, we have g(τπi ) = (gτ )πi and d(τπi ) = (dτ )πi . Because g = d ∈ Ri implies g =Ei d, we thus obtain (gτ )πi = g(τπi ) =Ei d(τπi ) = (dτ )πi .

If we want to continue by induction, we have to know that all alien subterms of s1 are R-irreducible. This need not be the case for arbitrary derivations from sσ to r. The problem is that we only have an ordered-rewriting system that is terminating on ground terms. For this reason it may well be the case that d contains variables not contained in g; and in general we cannot be sure that the images of these variables under τ do not introduce reducible alien subterms into s1 . However, if we assume that the derivation from sσ to r is a bottom-up derivation where all the matching substitutions (such as our τ ) are R-normalized, then τ cannot introduce reducible alien subterms. This assumption can be made without loss of generality because it is easy to see that, whenever a term is not R-irreducible, then we can apply a rule of R to this term in a way that satisfies the constraints of the assumption.

proof of proposition 3.2

For the remainder of the paper we shall use the following notational convention: Γ0 always denotes an input system of the decomposition algorithm. Γi denotes a system that is reached after applying Step i of the decomposition algorithm to Γi−1 (1 ≤ i ≤ 4), and (Γ5,1 , Γ5,2 ) denotes an output pair of the algorithm, resulting from a decomposition of Γ4 as described in Step 5. Furthermore, Yi denotes the set of variables occurring in system Γi (0 ≤ i ≤ 4), and Y5,j denotes the subset of Y4 containing the variables of index j (j = 1, 2). Thus Y4 is the disjoint union of Y5,1 and Y5,2 .

First, we show soundness of the decomposition algorithm, i.e., we demonstrate that Γ0 is solvable if there exists a pair (Γ5,1 , Γ5,2 ) in the output set such that Γ5,1 and Γ5,2 are solvable. Assume that σ1 is a solution of Γ5,1 and σ2 is a solution of Γ5,2 . In the previous section we have already described how these two solutions of the single problems can be combined to a substitution σ, which we have called the combined solution. It remains


to be shown that σ is in fact a solution of Γ0 . Obviously, it is sufficient to prove that σ is a solution of the system Γ4 that was split in Step 5 into Γ5,1 and Γ5,2 . Let s = t be an equation in Γ4 , and assume without loss of generality that this equation was put into Γ5,1 in Step 5. Thus we know that sσ1 =E1 tσ1 . As an easy consequence of the definition of σ, one obtains that σ = σ1 σ. Since sσ1 =E1 tσ1 obviously implies sσ1 σ =E1 tσ1 σ, and thus also sσ1 σ =E tσ1 σ, this shows that sσ =E tσ.

In the second part of the proof we must show completeness of the decomposition algorithm, i.e., we must demonstrate that there exists a pair (Γ5,1 , Γ5,2 ) in the output set such that Γ5,1 and Γ5,2 are solvable, if Γ0 is solvable. Let σ be a solution of Γ0 . Without loss of generality we assume that σ is also a solution of Γ2 , that the set Y2 of all variables occurring in this system is disjoint from X0 , and that σ is R-normalized on Y2 . In particular, this implies that the variables occurring in yσ for y ∈ Y2 are elements of X0 . The solution σ can be used to define the correct alternatives in the nondeterministic steps of the decomposition algorithm:

(i) The partition of the set of all variables, which has to be chosen in the third step, is defined as follows: two variables y and z are in the same class iff yσ = zσ. Obviously, this means that σ is also a solution of Γ3 , the system obtained after the variable identification step corresponding to this partition.
(ii) In the fourth step, the variable y gets index i if yσ is an i-term. If yσ is itself a variable, y gets index 1. (This is arbitrary, we could have taken index 2 as well.)
(iii) In the fourth step, we must also choose an appropriate linear ordering on the variables occurring in the system. Consider the strict partial ordering defined by y < z iff yσ is a strict subterm of zσ. We take an arbitrary extension of this partial ordering to a linear ordering on the variables occurring in the system.

The choices we have just described determine a system Γ4 and a particular pair of systems (Γ5,1 , Γ5,2 ) in the output set of the decomposition algorithm. Since Step 4 does not modify any equation, σ solves Γ4 . It remains to be shown that Γ5,1 , Γ5,2 are solvable. In order to define solutions σi of these systems, we consider a bijection π from the R-irreducible elements of T (Ω1 ∪ Ω2 , X0 ) onto a set of variables Y . This bijection has to satisfy two conditions. First, Y should contain Y4 , the set of variables occurring in Γ4 . Since σ is assumed to be R-normalized on Y2 , we have that yσ is R-irreducible for all variables y ∈ Y4 . The second condition on π is that π(yσ) = y for all y ∈ Y4 . For the satisfiability of these conditions, the variable identification step is important. The reason is that only because of this step we can be sure that Γ4 does not contain two different variables y, y′ with yσ = y′σ.

As described above, the bijection π induces mappings π1 , π2 . These mappings will now be used to construct the solutions σi for i = 1, 2. The substitution σi is defined on the variables y ∈ Y4 by yσi := (yσ)πi . If y is a variable of index j ≠ i, the term yσ is either a variable in X0 or a j-term. In both cases we obtain yσi = (yσ)πi = π(yσ) = y by definition of σi and πi . This shows that σi really treats the variables of index j as constants.

Now assume that s = t is an equation in Γ5,i . Since this equation is also contained in Γ4 , and since σ solves Γ4 , we know that sσ =E tσ. Since σ was assumed to be R-normalized on Y2 , and since s = t is an i-equation, we can apply Lemma 4.1 to get (sσ)πi =Ei (tσ)πi . Using the definition of σi and the fact that s = t is an i-equation, it is easy to see that (sσ)πi = sσi and (tσ)πi = tσi . Thus σi really solves the equation s = t.


It remains to be shown that σi satisfies the constant restriction. Assume that x is a variable of index i, and that the variable y of index j ≠ i (which is treated as a constant in Γ5,i ) occurs in xσi . We must show that x is not an element of Vy , i.e., that x ≮ y. Recall that xσi = (xσ)πi , and that xσ is R-irreducible. Thus, since y ∉ X0 , the occurrence of y in xσi must come from the occurrence of yσ as a subterm of xσ. Because of the identification step, the fact that x and y are different variables also implies that xσ and yσ are different terms. Thus yσ is a strict subterm of xσ, which yields y < x because of the way the linear ordering was chosen.

proof of proposition 3.4

In the first part of the proof of Proposition 3.2 we have already shown that the elements of the set of substitutions defined in the formulation of Proposition 3.4 are solutions of Γ0 . It remains to be shown that this set is complete. Let τ be a solution of Γ0 . Without loss of generality we assume that τ is also a solution of Γ2 , that the set Y2 of all variables occurring in this system is disjoint from X0 , and that τ is R-normalized on Y2 . In the second part of the proof of Proposition 3.2 we have shown that τ can be used to find a pair of systems (Γ5,1 , Γ5,2 ) in the output set of the decomposition algorithm, and to construct solutions τ1 and τ2 of these systems. This construction makes use of a bijection π and mappings π1 , π2 induced by this bijection as described above.

Since τi is a solution of Γ5,i , there exists an element σi in the complete set of solutions of Γ5,i and a substitution λi such that τi =Ei σi λi ⟨Y5,i ⟩.† Without loss of generality we may assume that the substitution σi maps the variables in Y5,i to terms containing only variables of index j ≠ i (which are treated as constants by λi ) or new variables from a set of variables Zi . We may assume that the domain of λi is Zi , and that the sets X0 , Y2 , Z1 , Z2 are pairwise disjoint.

As described in the first part of the proof of Proposition 3.2, the solutions σ1 , σ2 of Γ5,1 , Γ5,2 can be combined to a solution σ of Γ0 . Since this combined solution is an element of the set of substitutions defined in the formulation of Proposition 3.4, it remains to be shown that there exists a substitution λ such that τ =E σλ ⟨Y0 ⟩. We define λ := (λ1 ∪ λ2 )π−1 , where (λ1 ∪ λ2 ) is meant to denote the substitution that is equal to λi on Zi (i = 1, 2), and the identity on all variables not contained in Z1 ∪ Z2 .

First, we show τ =E σλ ⟨Y5,1 ∪ Y5,2 ⟩. The proof is by induction on the linear ordering < chosen in Step 4 of the decomposition algorithm. Without loss of generality, we consider a variable y ∈ Y5,1 . By the definition of σ, we have yσλ = (yσ1 )σλ. The variables occurring in the term yσ1 are either variables of index 2, i.e., elements of Y5,2 , or new variables, i.e., elements of Z1 . We want to show that on these variables, the substitutions σλ and λ1 π−1 coincide modulo E. Let z1 be an element of Z1 occurring in yσ1 . Since we have assumed that the elements of Z1 are new variables, z1 is not in the domain of σ, which yields z1 σλ = z1 λ. By definition of λ, and since z1 ∈ Z1 , we obtain z1 λ = z1 λ1 π−1 . If y is the least variable with respect to the linear ordering <, then the term yσ1 does not contain a variable of Y5,2 , because σ1 satisfies the linear constant restriction induced by <.

† Recall that, for a finite set Z of variables, δ1 =E δ2 ⟨Z⟩ means that zδ1 =E zδ2 for all z ∈ Z.


Now assume that y is an arbitrary variable in Y5,1 , and let y2 be an element of Y5,2 occurring in yσ1 . Since σ1 satisfies the linear constant restriction, we know that y2 < y. By induction, we thus obtain y2 σλ =E y2 τ . We also have y2 τ = (y2 τ )π1 π−1 = y2 τ1 π−1 . Since τ1 and λ1 treat variables of index 2 as constants, we know that y2 τ1 π−1 = y2 π−1 = y2 λ1 π−1 . Thus we have shown y2 σλ =E y2 λ1 π−1 . To sum up, we have just shown that, for all variables z occurring in yσ1 , we have zσλ =E zλ1 π−1 . Consequently, we obtain yσλ = (yσ1 )σλ =E (yσ1 )λ1 π−1 =E yτ1 π−1 = (yτ )π1 π−1 = yτ as required.

Finally, assume that x is a variable occurring in Γ0 , but x ∉ Y5,1 ∪ Y5,2 . This means that x has been substituted by a variable y ∈ Y5,1 ∪ Y5,2 during the variable identification step of the algorithm. On the one hand, this means that yτ = xτ (since this must have triggered the identification). On the other hand, because of this identification step, we have defined xσ := yσ. Thus we have xτ = yτ =E yσλ = xσλ. This completes the proof of the proposition.

5. Solving Unification Problems with Constant Restriction

We have seen that algorithms for E-unification with linear constant restriction may be used to obtain—via our combination method—algorithms for general unification. In the first part of this section we shall describe how, conversely, algorithms for general unification can be used to solve unification problems with linear constant restriction. In the second part, constant elimination algorithms together with algorithms for unification with constants are used to solve unification problems with arbitrary constant restriction. The third part of this section is concerned with the logical characterization of E-unification problems with linear constant restriction mentioned in Section 2. In the following, E is assumed to be an arbitrary consistent equational theory.

5.1. using algorithms for general unification

In this subsection we shall consider both the problem of deciding solvability and of generating complete sets of solutions of unification problems with linear constant restrictions.

the decision problem

Let Γ be an E-unification problem with a linear constant restriction, and let < be the linear ordering by which this restriction is induced. In the following, let X denote the set of all variables and C denote the set of all free constants occurring in Γ. Our goal is to construct a general E-unification problem Γ′ such that Γ is solvable iff Γ′ is solvable. In this new system Γ′, the free constants of Γ will be treated as variables, i.e., the solutions are allowed to substitute terms for these “constants.” For any free constant c of Γ we introduce a new (free) function symbol fc of arity |Vc |. Recall that Vc = {x ∈ X | x < c} is the set of variables in whose σ-image c must not occur for the solution σ of Γ to satisfy the constant restriction. The general E-unification problem—in which the free constants of Γ are treated as variables—is now defined as

Γ′ := Γ ∪ {c = fc (x1 , . . . , xn ) | c ∈ C and Vc = {x1 , . . . , xn }} .

Proposition 5.1. The E-unification problem with linear constant restriction, Γ, is solvable iff the general E-unification problem Γ′ is solvable.
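The construction of Γ′ is a one-liner once the sets Vc are available. The following Python sketch (our own helper name and encoding of the fresh symbols fc) builds Γ′ as defined above; it also illustrates, with the example discussed below (Example 5.2), why the construction is only claimed for linear constant restrictions.

```python
# Sketch of the reduction of Proposition 5.1: turn a problem with linear
# constant restriction into a general E-unification problem by adding, for
# every free constant c, the equation c = f_c(x1, ..., xn) with V_c = {x1,...,xn}.

def to_general_problem(equations, restriction):
    """equations: list of (lhs, rhs); restriction: {constant c: set V_c}.
    In the result the former constants are treated as variables, and
    "f_" + c stands for a fresh free function symbol of arity |V_c|."""
    extra = [(c, ("f_" + c,) + tuple(sorted(v_c)))
             for c, v_c in restriction.items()]
    return equations + extra

# Warning (Example 5.2 below): with the non-linear restriction V_c = {x},
# V_d = {y}, the original problem {x = d, y = c} is solvable, but the
# constructed general problem is not; the construction is only correct for
# restrictions induced by a linear ordering.
print(to_general_problem([("x", "d"), ("y", "c")], {"c": {"x"}, "d": {"y"}}))
# [('x', 'd'), ('y', 'c'), ('c', ('f_c', 'x')), ('d', ('f_d', 'y'))]
```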


Please note that the proposition only holds for unification problems with linear constant restriction. The following example demonstrates that the construction described above cannot be used for unification problems with arbitrary constant restriction.

Example 5.2. Let E be the empty theory, and let x, y be variables and c, d be free constants. We consider the following E-unification problem with constant restriction: Γ = {x = d, y = c}, Vc = {x}, Vd = {y}. It is easy to see that this restriction cannot be induced by a linear ordering on {x, y, c, d}. Obviously, the problem has the solution {x ↦ d, y ↦ c}. The corresponding general E-unification problem is Γ′ = {x = d, y = c, c = fc (x), d = fd (y)}, where c, d are now treated as variables. It is easy to see that this problem does not have a solution.

However, as we shall prove below, our construction is correct for unification problems with linear constant restriction. But first, let us demonstrate that this construction also works for the case of algorithms computing complete sets of unifiers.

generating complete sets of solutions

More precisely, we shall describe how we can obtain a finite complete set of solutions of Γ, provided that a finite complete set of E-unifiers of Γ′ exists. Let R be the possibly infinite ordered-rewriting system that is obtained when applying unfailing completion to E. We assume that the simplification ordering used during the completion also takes into account the additional symbols fc and the variables from a countable set X0 of new variables (which are, however, treated as constants by the ordering). This means that we can apply R to terms built out of symbols in the signature of E, the additional symbols fc , and variables in X0 . Let T↓R be the R-irreducible elements of the set of these terms.

Now we shall show how an element σ′ of a complete set of E-unifiers of Γ′ can be used to define a solution σ of Γ. Without loss of generality, we may assume that σ′ is R-normalized on the variables occurring in Γ′. In fact, for any substitution there exists an =E -equivalent substitution that is R-normalized on the variables occurring in Γ′; and exchanging an element of a complete set of E-unifiers of Γ′ by an =E -equivalent substitution still leaves us with a complete set of E-unifiers of Γ′.

Let π be a bijection from T↓R onto a set of variables Y . This bijection has to satisfy two conditions. First, Y should contain all the free constants occurring in Γ (which are treated as variables in Γ′). Since σ′ is assumed to be R-normalized on the variables occurring in Γ′, we have that cσ′ is R-irreducible for all these constants c. The second condition on π is that π(cσ′) = c for these constants c. The two conditions are satisfiable because for c ≠ c′ we have cσ′ ≠ c′σ′. In fact, since σ′ solves Γ′, we know that cσ′ =E fc (x1 σ′, . . . , xn σ′) and c′σ′ =E fc′ (x1 σ′, . . . , xn σ′). But this implies that cσ′ has fc as root symbol, and c′σ′ the different symbol fc′ .

As described in Section 4, the bijection π induces a mapping π1 . To this purpose we treat the symbols of the signature of E as 1-symbols and the symbols fc as 2-symbols. The mapping π1 is now used to define our solution σ of Γ. For all variables x occurring


in Γ we define xσ := (xσ′)π1 . The constants c of Γ are really treated as constants by σ, i.e., cσ = c. However, note that c = (cσ′)π1 holds anyway.

Proposition 5.3. Let C(Γ′) be a complete set of E-unifiers of Γ′, which are (without loss of generality) assumed to be R-normalized on the variables occurring in Γ′. Then the set C(Γ) := {σ | σ′ ∈ C(Γ′)}, where σ is constructed out of σ′ as described above, is a complete set of solutions of the E-unification problem with linear constant restriction, Γ.

Again, the proposition only holds for unification problems with linear constant restriction.

proof of proposition 5.1

Recall that X denotes the set of all variables and C denotes the set of all free constants occurring in Γ. To prove the “only-if” direction, assume that σ is a solution of Γ. Without loss of generality, we may assume that for all x ∈ X the variables occurring in xσ are new variables (i.e., variables not contained in X), and that σ is the identity on all variables y ∉ X. We define a substitution σ′ on X ∪ C (where the elements of C are now treated as variables) by induction on the linear ordering < that induces the constant restriction of Γ.

First, we consider the least element of X ∪ C with respect to <. If this is a variable x ∈ X, then for all c ∈ C we have x ∈ Vc . This implies that xσ does not contain any of these free constants, and we can define xσ′ := xσ. If the least element of X ∪ C is a constant c ∈ C, then Vc = ∅. This means that fc is a constant symbol, and we define cσ′ := fc . Now let x be an arbitrary element of X, and let c1 , . . . , cm ∈ C be the free constants occurring in xσ. Since σ satisfies the constant restriction induced by the linear ordering, the constants c1 , . . . , cm (which are treated as variables in Γ′) must be smaller than x. This means that we may assume by induction that c1 σ′, . . . , cm σ′ are already defined. The term xσ′ is now obtained from xσ by replacing the ck by ck σ′ (k = 1, . . . , m). Finally, let c be an arbitrary element of C. By definition, the system Γ′ contains the equation c = fc (x1 , . . . , xn ), where x1 , . . . , xn are those elements of X that are smaller than c with respect to <. Thus we can assume by induction that x1 σ′, . . . , xn σ′ are already defined, and we set cσ′ := fc (x1 σ′, . . . , xn σ′).

It remains to be shown that σ′ is a solution of Γ′. Obviously, the definition of σ′ implies that it solves the equations c = fc (x1 , . . . , xn ) in Γ′. Now let s = t be an equation of Γ. Since σ solves Γ, we know that sσ =E tσ. In addition, it is easy to see that for all c ∈ C we have cσ′ = cσσ′ and for all x ∈ X we have xσ′ = xσσ′. But then sσ′ = sσσ′ =E tσσ′ = tσ′.

To prove the “if” direction, assume that σ′ is a solution of Γ′. Without loss of generality, we assume that σ′ is R-normalized on the set X ∪ C of all variables occurring in Γ′. As described above, σ′ can be used to define a new substitution σ as follows: for all variables x occurring in Γ one defines xσ := (xσ′)π1 . We have to show that σ solves Γ. Let s = t be an equation of Γ. Since σ′ solves Γ′, we know that sσ′ =E tσ′. Now we can apply Lemma 4.1 to obtain (sσ′)π1 =E (tσ′)π1 . Using the definition of σ and the fact that the terms s, t do not contain the symbols fc , it is easy to see that (sσ′)π1 = sσ and (tσ′)π1 = tσ. Thus σ really solves the equation s = t.


It remains to be shown that σ satisfies the constant restriction. Let c ∈ C be a free constant. Since σ′ solves Γ′, we know that cσ′ =E fc(x1σ′, . . . , xnσ′), where {x1, . . . , xn} = Vc. In addition, σ′ was assumed to be R-normalized on X ∪ C, which implies that cσ′, x1σ′, . . . , xnσ′ are R-irreducible terms; and since the symbol fc does not occur in any rule of R, the term fc(x1σ′, . . . , xnσ′) is also R-irreducible. Thus we have cσ′ = fc(x1σ′, . . . , xnσ′), which shows that the term cσ′ is not a subterm of any of the terms x1σ′, . . . , xnσ′. But then c cannot occur in xiσ = (xiσ′)π1 (i = 1, . . . , n).

proof of proposition 5.3

Now we shall show that the set C(Γ), as defined in the formulation of Proposition 5.3, is a complete set of solutions of Γ. In the second part of the proof of Proposition 5.1, we have already shown that the substitution σ constructed from a solution σ′ of Γ′ is itself a solution of Γ. Thus C(Γ) is a set of solutions of Γ. It remains to be shown that it is a complete set.

As before, let X denote the set of variables and C the set of free constants occurring in Γ. Recall that the elements of C are treated as variables in Γ′. Let τ be a solution of Γ. Without loss of generality, we assume that, for all x ∈ X, the variables occurring in xτ are elements of X0. As shown in the first part of the proof of Proposition 5.1, τ can be used to define a solution τ′ of Γ′. Because of our assumption on τ, it is easy to see that for all z ∈ X ∪ C, the variables occurring in zτ′ are elements of X0. Since C(Γ′) is a complete set of E-unifiers of Γ′, there exists an element σ′ of C(Γ′) and a substitution λ′ such that τ′ =E σ′λ′ ⟨X ∪ C⟩. Let σ be the element of C(Γ) constructed out of σ′. Our aim is to define a substitution λ that satisfies τ =E σλ ⟨X⟩. Let Y0 ⊆ Y denote the set of all variables occurring in the terms xσ for x ∈ X. These terms may also contain elements of C, which must, however, be treated as constants by λ.

In the construction of σ out of σ′ we have used a bijection π from T↓R onto a set of variables Y which contains C. In addition, this bijection had to satisfy π(cσ′) = c for all c ∈ C. The substitution σ was then defined by xσ := (xσ′)π1 for all x ∈ X. In order to be able to reverse the construction of τ′ out of τ we shall now consider an analogous bijection µ from T↓R onto Y. The condition on µ is that µ((cτ′)↓R) = c for all c ∈ C. Here (cτ′)↓R denotes the unique R-irreducible element of the =E-class of cτ′. As an easy consequence of this condition together with the definition of τ′, we obtain that xτ = (xτ′)µ1 for all x ∈ X. The substitution λ is now defined on all y ∈ Y0 as yλ := (yπ⁻¹λ′)µ1.

To complete the proof of the proposition we must show that τ =E σλ ⟨X⟩. In this proof we need a lemma that is a stronger version of Lemma 4.1 for the special case of the union of an arbitrary equational theory E with a disjoint free theory.

Lemma 5.4. Let s, t be terms built out of symbols in the signature of E, the additional symbols fc, and variables in X0. Then s =E t iff sµ1 =E tµ1.

Proof. (1) From sµ1 =E tµ1 we can deduce sµ1µ⁻¹ =E tµ1µ⁻¹, and we also have s =E sµ1µ⁻¹ and t =E tµ1µ⁻¹.

(2) Obviously, it is sufficient to prove the "only-if" direction for the case where t is obtained from s by one application of an identity of E. Thus assume that g = d is an identity of E, u is an occurrence in s, and τ is a substitution such that s = s[u ← gτ],


t = s[u ← dτ]. If the occurrence u is strictly below an occurrence of a free function symbol, then it is easy to see that sµ1 = tµ1. Otherwise, we have sµ1 = sµ1[u ← (gτ)µ1] and tµ1 = sµ1[u ← (dτ)µ1]. What remains to be shown is (gτ)µ1 =E (dτ)µ1, and this can be done as in the proof of Lemma 4.1.

We can now continue with the proof of the proposition. For all x ∈ X we have xτ′ =E xσ′λ′ = (xσ′)π1π⁻¹λ′ = xσπ⁻¹λ′. But then the lemma yields xτ = (xτ′)µ1 =E (xσπ⁻¹λ′)µ1. It remains to be shown that (xσπ⁻¹λ′)µ1 =E xσλ. For this purpose, we define a substitution (π⁻¹λ′)µ1 on Y0 ∪ C as follows: for all z ∈ Y0 ∪ C, z(π⁻¹λ′)µ1 := (zπ⁻¹λ′)µ1. Because the terms xσ for x ∈ X do not contain any of the free function symbols fc, it is easy to see that (xσπ⁻¹λ′)µ1 = (xσ)(π⁻¹λ′)µ1. Obviously, the substitutions λ and (π⁻¹λ′)µ1 coincide on Y0, but for c ∈ C we need not have c(π⁻¹λ′)µ1 = c. However, we can show that c(π⁻¹λ′)µ1 =E c. In fact, cπ⁻¹λ′ = cσ′λ′ =E cτ′, and thus the lemma yields (cπ⁻¹λ′)µ1 =E (cτ′)µ1. But our assumption on µ yields (cτ′)µ1 = c. To sum up, we have just seen that λ =E (π⁻¹λ′)µ1 ⟨Y0 ∪ C⟩, which implies (xσ)λ =E (xσ)(π⁻¹λ′)µ1. This completes the proof of the proposition.

5.2. using algorithms for constant elimination and for unification with constants

In this subsection we shall consider unification problems with arbitrary constant restrictions. It will be shown how to reduce solving this kind of problems to solving both unification problems with constants and constant elimination problems. A constant elimination problem in the theory E is a finite set ∆ = {(c1, t1), . . . , (cn, tn)} where the ci's are free constants (i.e., constant symbols not occurring in the signature of E) and the ti's are terms (built over the signature of E, variables, and free constants). A solution to such a problem is called a constant eliminator. It is a substitution σ such that for all i, 1 ≤ i ≤ n, there exists a term t′i not containing the free constant ci with t′i =E tiσ. The notion complete set of constant eliminators is defined analogously to the notion complete set of unifiers.

Let Γ be an E-unification problem with arbitrary constant restriction. The goal is to construct a complete set of solutions of this problem. In the first step, we just ignore the constant restriction, and solve Γ as an ordinary E-unification problem with constants. Let C(Γ) be a complete set of E-unifiers of this problem. In the second step, we define for all unifiers σ ∈ C(Γ) a constant elimination problem ∆σ as follows: ∆σ := {(c, xσ) | c is a free constant in Γ and x ∈ Vc}. For all σ ∈ C(Γ), let Cσ be a complete set of solutions of the constant elimination problem ∆σ.

Before we can describe the complete set of solutions of the E-unification problem with constant restriction, Γ, we must define a slightly modified composition "⊗" of substitutions. Let σ be an element of C(Γ), and let τ be a constant eliminator in Cσ. Without loss of generality, we assume that τ is the identity on variables not occurring in ∆σ, and that the terms yτ for variables y occurring in ∆σ contain only new variables. In particular, we shall need for technical reasons that they do not contain variables occurring in terms xσ for variables x occurring in Γ.

For a given variable x, let {c1, . . . , cn} be the set of all constants ci occurring in Γ such that x ∈ Vci. If this set is empty (i.e., n = 0), we define x(σ⊗τ) := xστ. Now assume


that n > 0. Obviously, we have {(c1, xσ), . . . , (cn, xσ)} ⊆ ∆σ. Since τ is a solution of ∆σ, there exist terms s1, . . . , sn such that for all i, 1 ≤ i ≤ n, xστ =E si and ci does not occur in si. It is easy to see that this also implies the existence of a single term s such that c1, . . . , cn do not occur in s and xστ =E s. We define x(σ⊗τ) := s.

Proposition 5.5. The set

⋃_{σ ∈ C(Γ)} {σ⊗τ | τ ∈ Cσ}

is a complete set of solutions of the E-unification problem with constant restriction, Γ.

Proof. First, we show that the elements σ⊗τ of this set solve all the equations s ≐ t of Γ. Since σ is an E-unifier of Γ, we have sσ =E tσ, which implies sστ =E tστ. But the definition of σ⊗τ was such that στ =E σ⊗τ, and thus we get s(σ⊗τ) =E t(σ⊗τ). Second, we prove that σ⊗τ satisfies the constant restriction. Assume that x ∈ Vc. Then the constant elimination problem ∆σ contains the tuple (c, xσ). By definition of σ⊗τ, we obtain that x(σ⊗τ) is a term s not containing c.

Finally, it remains to be shown that the set is complete. Assume that θ is a solution of the E-unification problem with constant restriction, Γ. In particular, this means that θ solves the E-unification problem Γ (where the restrictions are ignored). Hence there exist an element σ of the complete set C(Γ) and a substitution λ such that θ =E σλ ⟨X⟩, where X denotes the set of all variables occurring in Γ. Thus we have (i) for all x ∈ X: xθ =E (xσ)λ, and (ii) for all c with x ∈ Vc: c does not occur in xθ, which shows that λ solves the constant elimination problem ∆σ. Consequently, there exist an element τ of the complete set Cσ and a substitution λ′ such that λ =E τλ′ ⟨Y⟩, where Y denotes the set of all variables occurring in ∆σ. Without loss of generality, we assume that zλ′ = zλ for all variables z not occurring in one of the terms yτ with y ∈ Y.

We want to show that, for all x ∈ X, we have xθ =E x(σ⊗τ)λ′. For all x ∈ X, we know that xθ =E xσλ, and since στ =E σ⊗τ we also have x(σ⊗τ)λ′ =E xστλ′. Thus it remains to be shown that xσλ =E xστλ′. We must distinguish two cases. First, assume that (c, xσ) ∈ ∆σ for some c. In this case all variables occurring in xσ are elements of Y, and thus λ =E τλ′ ⟨Y⟩ yields xσλ =E xστλ′. For the second case, assume that xσ contains a variable z that is not an element of Y, the set of all variables occurring in ∆σ. We are finished if we can show that, nevertheless, zλ =E zτλ′ holds for all such variables z. Since τ was assumed to be the identity on variables not occurring in ∆σ, we have zτ = z. Since z occurs in xσ for a variable x ∈ X, our second assumption on τ implies that z does not occur in any term yτ with y ∈ Y. But then zλ′ = zλ by our assumption on λ′, which completes the proof of xσλ =E xστλ′.

5.3. linear constant restrictions and positive sentences

Let Σ denote the signature of E. At the end of Section 2 we have mentioned a relationship between conjunctive L+(Σ)-sentences and E-unification problems with linear constant restriction. In this subsection we shall consider this relationship in more detail. In particular, we shall prove Result 10 of Section 2.
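Before going into the details, the following sketch recaps, at the decision level, the reduction described in Subsection 5.2 above: a unification problem with arbitrary constant restriction is solvable iff some unifier in a (here assumed to be finite) complete set, computed while ignoring the restriction, admits a constant eliminator for the associated problem ∆σ. The function names and the problem representation are hypothetical placeholders used only for this sketch.

def solvable_with_constant_restriction(gamma, V, complete_unifiers, has_constant_eliminator):
    # gamma: an E-unification problem with constants (the restriction is ignored at first);
    # V: a dict mapping each free constant c to the set Vc of variables that must not contain c;
    # complete_unifiers(gamma): a finite complete set of E-unifiers of gamma (black-box solver);
    # has_constant_eliminator(delta): decides solvability of the constant elimination problem delta.
    for sigma in complete_unifiers(gamma):
        delta = [(c, sigma[x]) for c, Vc in V.items() for x in Vc]   # the problem Δσ
        if has_constant_eliminator(delta):
            return True
    return False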


Let φ be a positive Σ-sentence. Obviously, it can be represented (modulo logical equivalence) in the form

φ = Q1 x1 · · · Qk xk ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} ri,j ≐ si,j ),

where the Qi are existential or universal quantifiers, and x1, . . . , xk are distinct variables. The associated set of E-unification problems with linear constant restriction, UPLCR(φ), consists of n problems Γ1, . . . , Γn, where Γi := {ri,1 ≐ si,1, . . . , ri,mi ≐ si,mi}. In these equations, existentially (universally) quantified variables of φ are treated as variables (constants), and the linear order x1 < x2 < · · · < xk is induced by the quantifier prefix of φ.

Proposition 5.6. Let φ ∈ L+(Σ) be a positive Σ-sentence. Then φ is a theorem of E iff one of the E-unification problems with linear constant restriction in UPLCR(φ) has a solution.

Proof. Let us assume that

φ = Q1 x1 · · · Qk xk ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} ri,j ≐ si,j )

is a Σ-theorem of E. We shall describe a series of equivalent statements, eventually showing that one of the E-unification problems with linear constant restriction in UPLCR(φ) has a solution.

First, we apply to φ a transformation that is known as Skolemization of universally quantified variables. For all universally quantified variables xν of φ, we replace in the matrix ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} ri,j ≐ si,j ) all occurrences of xν by the term fxν(xµ1, . . . , xµk). Here ∃xµ1, . . . , ∃xµk is the sequence of existential quantifiers preceding ∀xν in the quantifier prefix of φ; the symbol fxν is a new free function symbol ("Skolem function symbol"), where distinct Skolem function symbols are used for distinct universally quantified variables. Let Σ̂ be the enlarged signature that is obtained from Σ by adding the Skolem function symbols for all universally quantified variables of φ. We denote the positive existential Σ̂-sentence that is obtained as result of the Skolemization process by

φS = ∃xµ1 · · · ∃xµl ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} rSi,j ≐ sSi,j ).
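As a concrete illustration of this Skolemization step, the sketch below replaces each universally quantified variable by a Skolem term over the existentially quantified variables that precede it. The term encoding and the simplification of the matrix to a flat list of equations are assumptions of this illustration, not part of the paper.

def skolemize_universals(prefix, equations):
    # prefix: list of (quantifier, variable) pairs, e.g. [("E", "x"), ("A", "y")];
    # equations: pairs of terms, where terms are variables (strings) or tuples ("f", t1, ..., tn).
    subst, existentials = {}, []
    for quant, var in prefix:
        if quant == "E":
            existentials.append(var)
        else:
            # The ∀-variable var is replaced by f_var(y1, ..., yk) over the preceding ∃-variables.
            subst[var] = ("f_" + var,) + tuple(existentials) if existentials else "f_" + var
    def apply(t):
        if isinstance(t, str):
            return subst.get(t, t)
        return (t[0],) + tuple(apply(a) for a in t[1:])
    return [(apply(lhs), apply(rhs)) for (lhs, rhs) in equations]

# For instance, ∃x ∀y: f(x, y) ≐ y is turned into f(x, f_y(x)) ≐ f_y(x):
print(skolemize_universals([("E", "x"), ("A", "y")], [(("f", "x", "y"), "y")]))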

Claim 1: φS is a Σ̂-theorem of E.

Proof of Claim 1: By compactness, we may assume that E is given by a finite set of equational axioms. In fact, if φS is a logical consequence of a (possibly infinite) set of equational axioms, then it is already a logical consequence of a finite subset of this set of axioms. Assume, for notational convenience, that the universal sentence ∀ȳ e axiomatizes E, where φ and ∀ȳ e do not have any variables in common. By assumption, φ ∨ ¬∀ȳ e is a valid sentence. Hence, the logically equivalent sentence in prefix normal form

Q1 x1 · · · Qk xk ∃ȳ ( ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} ri,j ≐ si,j ) ∨ ¬e )

is valid as well. Skolemization of universal quantifiers [which preserves validity, just as


Skolemization of existentially quantified variables preserves satisfiability; see Chang and Lee (1973)] leads to the valid sentence

∃xµ1 · · · ∃xµl ∃ȳ ( ⋀_{i=1,...,n} ( ⋁_{j=1,...,mi} rSi,j ≐ sSi,j ) ∨ ¬e ),

which is logically equivalent to φS ∨ ¬∀ȳ e. This shows that Claim 1 holds.

In order to relate φS with the unification problems in UPLCR(φ), we use the fact that, for positive Σ̂-sentences, validity with respect to the class of all Σ̂-models of E and validity with respect to the E-free algebra T(Σ̂, X)/=E are equivalent conditions.†

† This equivalence is well-known in universal algebra; see, e.g., Bockmayr (1992) for a self-contained proof.

Claim 2: One of the sentences

φSi := ∃xµ1 · · · ∃xµl ⋀_{j=1,...,mi} ( rSi,j ≐ sSi,j )

holds in T(Σ̂, X)/=E, for some 1 ≤ i ≤ n.

Proof of Claim 2: Since φS is positive, Claim 1 and the above observation yield that T(Σ̂, X)/=E satisfies φS. Claim 2 follows because existential quantification distributes over disjunction, and since an algebra satisfies a disjunction of sentences iff it satisfies one of the disjuncts.

Each such (valid) sentence may be considered as a (solvable) general E-unification problem

{rSi,1 ≐ sSi,1, . . . , rSi,mi ≐ sSi,mi}.

An equivalent general E-unification problem is

Γ′i := {ri,1 ≐ si,1, . . . , ri,mi ≐ si,mi} ∪ FS,

where FS consists of all equations of the form xν ≐ fxν(xµ1, . . . , xµk), which relate the universally quantified variables xν of φ with their Skolem terms. Obviously, this proves

Claim 3: One of the general E-unification problems Γ′i has a solution, for some 1 ≤ i ≤ n.

Claim 4: One of the E-unification problems in UPLCR(φ) has a solution.

Proof of Claim 4: Note that the problems Γ′i have exactly the form of the general E-unification problems Γ′ that we have considered in Subsection 5.1 when proving the equivalence between E-unification problems with linear constant restriction and general E-unification problems. It is easy to see that Γ′i is just the general E-unification problem that is associated with the E-unification problem with linear constant restriction Γi ∈ UPLCR(φ) that contains the equations {ri,1 ≐ si,1, . . . , ri,mi ≐ si,mi}. Claim 4 is now an immediate consequence of Proposition 5.1.

Until now, we have shown the "only-if" direction of Proposition 5.6. Since it is easy to see that each of the steps leading from one claim to the next can be inverted, the "if" direction follows as well.

Proposition 5.6 yields that the positive theory of E is decidable, if solvability of E-unification problems with linear constant restriction is decidable. Conversely, every E-unification problem with linear constant restriction, Γ = {rj ≐ sj | j = 1, . . . , m}, can


be translated into an equivalent L+(Σ)-sentence

φ = Q1 x1 · · · Qk xk ⋀_{j=1,...,m} ( rj ≐ sj ),

where {x1, . . . , xk} are the variables and free constants occurring in Γ, and Qi is a universal quantifier (existential quantifier) if xi is a constant (variable). The order in the quantifier prefix is given by the linear order that induces the constant restriction. Obviously, UPLCR(φ) = {Γ}, and thus Proposition 5.6 implies that Γ has a solution iff φ is a theorem of E.

Corollary 5.7. Let E be an equational theory. Then the positive fragment of E is decidable if, and only if, solvability of E-unification problems with linear constant restriction is decidable.

6. Optimized Decision Procedures

Until now, our emphasis was mainly on giving a conceptually clear description of the decomposition algorithm and its consequences, with the goal of keeping the algorithm and the proofs as transparent as possible. For an actual implementation, one should, of course, try to avoid as many nondeterministic choices as possible. In the following, we shall restrict our attention to the combination of decision procedures. For the combination of algorithms computing complete sets of unifiers, additional optimizations may become possible that utilize the computed unifiers [see, e.g., Boudet (1993) for optimizations in this direction].

6.1. an obvious optimization for the general case

For the combination of decision procedures, not many optimizations seem to be possible without imposing additional restrictions on the theories to be combined. For the general case, we shall consider one fairly obvious technique for restricting the choice of the linear ordering. It depends on the observation that reordering variables of the same type in the output systems does not influence the resulting constant restrictions, as long as the relationship to variables of different type is not changed. To make this more precise, let us assume that the set Y of all variables is equipped with a standard ordering ≺Y. Let Y4 be the set of variables of an output system (Γ5,1, Γ5,2) with indexing ind and linear ordering <.

Definition 6.1. We say that the pair (<, ind) is ≺Y-compatible iff two variables y1, y2 of the same type that are consecutive with respect to < satisfy y1 ≺Y y2. If this condition holds only for all variables of type i, then we say that (<, ind) is ≺Y-compatible with respect to variables of type i (i = 1, 2).

Proposition 6.2. Let Γ0 be a solvable input system of the decomposition algorithm. Then there exists a solvable output system (Γ5,1, Γ5,2) with indexing ind and linear ordering < such that (<, ind) is ≺Y-compatible.

Proof. The "if" direction is trivial. To show the "only-if" direction, assume that


Γ0 is solvable. By Proposition 3.2 we know that there exists a solvable output pair. If the associated pair (<, ind) is not ≺Y-compatible then there are variables y1 < y2 of the same type that are consecutive with respect to < such that y2 ≺Y y1. Since interchanging the order of y1 and y2 does not modify the <-relationships of these two variables with variables of different type, the induced constant restrictions are also not affected thereby. This shows that we can reach, after finitely many steps, a solvable output system where the linear ordering together with ind is ≺Y-compatible.

The proposition shows that it is sufficient to choose ≺Y-compatible pairs (<, ind) in Step 4 of the decomposition algorithm.

6.2. combination with the free theory

To obtain further optimizations, we shall consider a particular class of combination problems, namely those where an equational theory E is combined with an instance of the free theory FΩ := {f(x1, . . . , xn) = f(x1, . . . , xn) | f ∈ Ω}. This combination yields a unification algorithm for general E-unification. We shall proceed in three steps. The original decomposition algorithm, as described in Section 3, will be called "Algorithm I" in the following. First, we present an optimized version of the decomposition algorithm that always works in the specified context (Algorithm II). The idea is to use the most general unifier (mgu) of free subsystems to rule out certain choices in the nondeterministic steps. Then we discuss a second optimization (Algorithm III) that is possible under certain assumptions on the theory E. These assumptions concern the type of unification algorithms that are available for E. For example, the theories A and AI satisfy these requirements. Finally, we introduce a further simplification (Algorithm IV) that is possible when dealing with a collapse-free theory E (such as A).

optimized decomposition algorithm—algorithm II

Context: Combination of an equational theory E with the free theory FΩ. Decision problem for general E-unification.

The input is a finite system Γ0 of equations between terms built from variables, free constants, and function symbols belonging to Ω or the signature of E. As before, Γi denotes a system that is reached after step i, and Yi denotes the set of variables occurring in Γi. If all equations of Γi are pure, then Γi,F denotes the subsystem of all equations containing only free function symbols, and Γi,E denotes the subsystem with the equations containing function symbols from the signature of E. In the following, we shall consider a most general unifier µi of the free subsystem Γi,F, for i = 2, 3. Without loss of generality, we may assume that these unifiers are "nice" as defined below.

Definition 6.3. A most general unifier µi of Γi,F satisfying

(i) τ = µiτ for all unifiers τ of Γi,F, and
(ii) the variables occurring in the terms xµi for x ∈ Yi are again in Yi


is called nice†.

Now, let us describe the optimized decomposition algorithm. The first two steps are unchanged. The third (nondeterministic) step of the original algorithm is split into a deterministic and a nondeterministic substep.

Step 3.1: first (deterministic) variable identification. We try to solve the free subsystem Γ2,F, thereby treating all elements of Y2 as variables. If Γ2,F is unsolvable, then we stop with failure. Now, assume that Γ2,F is solvable, and let µ2 be a nice most general unifier of this system. We consider the partition π1 of Y2 in which two variables x, y belong to the same class iff xµ2 = yµ2. Based on this partition we identify variables in Γ2, as described in the original Step 3. This yields the system Γ3.1 with set of variables Y3.1.

Step 3.2: second (nondeterministic) variable identification. Here we choose a partition π2 of Y3.1 and identify variables in Γ3.1 according to this partition. This yields the system Γ3 with set of variables Y3. If Γ3,F is unsolvable, or if the nice mgu of Γ3,F identifies two distinct variables x and y of Y3 (the set of variables obtained after the second identification), then we stop with failure. Otherwise, we continue with the next step. Now, assume that Γ3,F is solvable with nice mgu µ3 such that xµ3 ≠ yµ3 for all x ≠ y in Y3.

In order to improve on Algorithm I in Step 4, we use µ3 to restrict the choices of the indices and, subsequently, the ordering of variables with index F‡. Once a linear ordering of the F-variables is fixed, there will be exactly one extension to a linear ordering on all variables (i.e., no more nondeterministic choices are necessary here).

Step 4: choose theory indices and ordering. First, we choose a variable indexing ind that satisfies the following condition:

(1) for every variable x ∈ Y4, if xµ3 = f(t1, . . . , tn) for a free function symbol f, then ind(x) = F.

Let Y3,F ⊆ Y3 denote the set of all variables x with ind(x) = F, and let Y3,E ⊆ Y3 be the remaining set of variables. We choose a linear ordering

In principle, (3) means that E-variables are made as large as possible compared to F-variables. In fact, the E-variable y is only made smaller than an F-variable x if this is required by the mgu µ3 and the already chosen ordering

(a) In Step 3, partition πI is chosen in such a way that variables x and y are identified iff xσ = yσ.
(b) In Step 4 we define indI(x) := E iff xσ = h(t1, . . . , tn) for a symbol h belonging to the signature of E, and indI(x) := F otherwise.
(c) The linear ordering

the ordering

restriction and unification problems with partially specified linear constant restriction have the same status. In fact, if solvability of the former class of problems is decidable, then the same holds for the latter class of problems. This is so because every partial ordering on a finite set has only a finite number of linear extensions. Thus, given a unification problem with a partially specified linear constant restriction, one could just enumerate all admissible linear extensions of the underlying partial ordering, and test whether one of them yields a solvable problem with linear constant restriction. This would mean, however, that the nondeterminism in the choice of the linear ordering is just pushed from the decomposition algorithm into the algorithm for the single theories. One could call this a disjunctive treatment of the E-unification problems with partially specified linear constant restriction, since the problem is reduced to a large disjunction of E-unification problems with linear constant restriction.

Considering partially specified linear constant restrictions makes sense only if a non-disjunctive treatment is possible, i.e., if the effort to decide solvability of unification problems with such restrictions does not drastically exceed the effort to decide solvability of unification problems with linear constant restriction. Examples of theories for which this is the case are the theories A and AI [see Baader and Schulz (1993)]. For these theories, a disjunctive treatment of all linear extensions can be avoided, and thus the corresponding nondeterminism is really eliminated, not only pushed into the algorithm for the single theory.

optimized decomposition algorithm—algorithm III

Context: Decision problem for general E-unification, where solvability of E-unification problems with partially specified linear constant restriction is decidable in a "non-disjunctive manner," as explained above.

We simply proceed as in Algorithm II until we reach the point in Step 4 where the variable indexing ind has been chosen. Now, we do not choose a linear ordering
Combining Decision Procedures

241

assume that decidability of E-unification problems with partially specified linear constant restriction is decidable in “non-disjunctive manner.” A prominent example where the following version of the decomposition algorithm can be used is the theory A, expressing associativity of a binary function symbol [compare Baader and Schulz (1993)]. optimized combination algorithm—algorithm IV Context: Decision problem for general E-unification, where E is collapse-free. Without loss of generality, we may assume that the input problem does not contain an . equation of the form f (s1 , ..., sm ) = h(t1 , . . . , tn ) where f is a free function symbol and h is a symbol of the signature of E. Since E is collapse-free, we could immediately stop with failure in such a case. Step 1 remains as in Algorithms II and III. Obviously, Γ1 cannot have non-pure equations. Thus Step 2 can be omitted. Step 3.1 (first variable identification) remains almost as in Algorithms II and III. We add, however, a control step that detects unsolvability: if xµ2 = f (s1 , . . . , sm ) for a free function symbol f and if Γ2,E contains an equation . x = h(t1 , . . . , tn ), then we stop with failure. In Step 3.2 (second variable identification), we do not identify variables x and y if xµ2 = f (t1 , . . . , tm ) for a free function symbol f . and if Γ3.1,E contains an equation y = h(s1 , . . . , sn ). Step 4 of Algorithm III is modified, eliminating some nondeterminism in the choice of a variable indexing ind. We define . ind(y) = E whenever Γ3,E contains an equation y = h(s1 , . . . , sn ), where h belongs to the signature of E. Correctness and completeness of Algorithm IV follow immediately from Proposition 6.6 and from the fact that the theory E is collapse-free. 7. Conclusions We have presented a new method for treating the problem of unification in the union of disjoint equational theories. Unlike most of the other methods developed for this purpose, it can be used to combine decision procedures as well as procedures computing finite complete sets of unifiers. Applicability of our method depends on a new type of prerequisite, namely on the solvability of unification problems with linear constant restriction. Presupposing the existence of a constant elimination algorithm—as necessary for the method of Schmidt-Schauß—seems to be a stronger requirement. In fact, we have seen that constant elimination procedures can be used to solve unification problems with arbitrary constant restriction. However, it is still an open problem whether there exists an equational theory for which solving unification problems with linear constant restriction is finitary (or decidable) but solving unification problems with arbitrary constant restriction is not. In any case, we have seen that unification with linear constant restriction seems to be the more natural concept, since it has a clear logical interpretation: for an equational theory E, unification with linear constant restriction is decidable iff the positive theory of E is decidable. Our main results, together with the results described in the Section 5.1, show that there is also a close correspondence between solving unification problems with linear constant restriction and solving general unification problems. For a given equational theory, the first kind of problems is decidable (finitary solvable) if, and only if, the second kind of problems is so. As an interesting open problem it remains the question whether there

242

F. Baader and K. U. Schulz

exists an equational theory for which unification with constants is decidable (finitary), but general unification—or equivalently, solving unification problems with linear constant restrictions—is not. Using the logical interpretation of unification with linear constant restriction, this open problem can also be reformulated as follows: Is there an equational theory E for which the positive AE-fragment† is decidable, but the full positive theory is undecidable? One should note that there already exist such results for the case of single equations, i.e., unification problems of cardinality one. Narendran and Otto (1990) have shown that there exists an equational theory E such that solvability is decidable for E-unification problems (with constants) of cardinality one, but is undecidable for E-unification problems of cardinality greater than one, and thus also for general Eunification problems. Since the submission of this article, its results have been generalized to the case of disunification problems (Baader and Schulz, 1995a), the combination of theories with shared constants (Ringeissen, 1992), and to the combination of solvers for more general constraint languages (Baader and Schulz, 1995b,c). References Bachmair, L. (1991) Canonical equational proofs, Progress in Theoretical Computer Science, Birkh¨ auser, Boston Baader, F., Schulz, K.U. (1993) General A- and AX-unification via optimized combination procedures, in H. Abdulrab, J.-P. P`ecuchet, editors, Proceedings of the Second International Workshop on Word Equations and Related Topics, Rouen, 1991, Springer LNCS 677, pp. 23–42. Baader, F., Siekmann, J.(1994) Unification theory, in D. Gabbay, C. Hogger, J. Robinson, editors, Handbook of Logic in Artificial Intelligence and Logic Programming, Oxford University Press, Oxford, UK, pp. 41–125. Baader, F., Schulz, K.U. (1995a) Combination techniques and decision problems for disunification Theoretical Computer Science B 142, pp. 229–255. Baader, F., Schulz, K.U. (1995b) Combination of constraint solving techniques: An algebraic point of view Proceedings of the 6th International Conference on Rewriting Techniques and Applications, RTA95 Kaiserslautern, Germany, Springer LNCS 914, pp. 352–366. Baader, F., Schulz, K.U. (1995c) On the combination of symbolic constraints, solution domains and constraint solvers Proceedings of the International Conference on Principles and Practice of Constraint Programming, CP95, Cassis, France, Springer LNCS 976, pp. 380–397. Bockmayr, A. (1992) Algebraic and logical aspects of unification in K.U. Schulz, editor, Proceedings of the First International Workshop on Word Equations and Related Topics, T¨ ubingen, Germany, 1990, Springer LNCS 572, pp. 171–180. Boudet, A., Jouannaud, J.P., Schmidt-Schauß, M. (1989) Unification in Boolean rings and Abelian groups J. Symbolic Computation 8, pp. 449–477. Boudet, A. (1993) Combining unification algorithms J. Symbolic Computation 16, pp. 597–626. B¨ urckert, H.-J. (1986) Some relationships between unification, restricted unification, and matching Proceedings of the 8th International Conference on Automated Deduction, Springer LNCS 230, pp. 514–524. B¨ urckert, H.-J. (1990) A resolution principle for clauses with constraints Proceedings of the 10th International Conference on Automated Deduction, Springer LNCS 449, pp. 178–192 Chang, Ch.-L., Lee, R.Ch.-T. (1973) Symbolic Logic and Mechanical Theorem Proving, Computer Science Classics, Academic Press, San Diego Colmerauer, A. (1990) An introduction to PROLOG III C. 
ACM 33, pp. 69–90. Dershowitz, N., Jouannaud, J.P. (1990) Rewrite systems in J. van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume B, North-Holland, pp. 244–320 Fages, F. (1984) Associative-commutative unification Proceedings of the 7th International Conference on Automated Deduction, Springer LNCS 170, pp. 194–208. Fages, F. (1987) Associative-commutative unification J. Symbolic Computation 3, pp. 57–275. † This fragment consist of positive formulae with a quantifier-prefix consisting of a block of universal quantifiers followed by a block of existential quantifiers. It is easy to see that such formulae correspond to unification problems with (free) constants.

Combining Decision Procedures

243

Herbrand, J. (1930). Recherches sur la Théorie de la Démonstration, Ph.D. Thesis, Sorbonne, Paris. Reprinted in W.D. Goldfarb, editor, Logical Writings, Reidel, 1971.
Herbrand, J. (1967). Investigations in proof theory: The properties of true propositions, in J. van Heijenoort, editor, From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Harvard University Press, pp. 525–581.
Herold, A. (1986). Combination of unification algorithms, Proceedings of the 8th International Conference on Automated Deduction, Springer LNCS 230, pp. 450–469.
Herold, A. (1987). Combination of unification algorithms in equational theories, Dissertation, Fachbereich Informatik, Universität Kaiserslautern, 1987.
Herold, A., Siekmann, J.H. (1987). Unification in Abelian semigroups, J. Automated Reasoning 3, pp. 247–283.
Howie, J.M. (1976). An Introduction to Semigroup Theory, Academic Press, London, 1976.
Jaffar, J., Lassez, J.L. (1987). Constraint logic programming, Proceedings of the 14th POPL Conference, Munich, pp. 111–119.
Jaffar, J., Lassez, J.L., Maher, M. (1984). A theory of complete logic programs with equality, J. Logic Programming 1, pp. 211–223.
Jouannaud, J.P., Kirchner, H. (1986). Completion of a set of rules modulo a set of equations, SIAM J. Computing 15, pp. 1155–1194.
Jouannaud, J.P., Kirchner, C. (1991). Solving equations in abstract algebras: A rule-based survey of unification, in J.-L. Lassez, G. Plotkin, editors, Computational Logic: Essays in Honor of Alan Robinson, MIT Press, pp. 257–321.
Kapur, D., Narendran, P. (1992). Complexity of unification problems with associative-commutative operators, J. Automated Reasoning 9, pp. 261–288.
Kirchner, C. (1985). Méthodes et Outils de Conception Systématique d'Algorithmes d'Unification dans les Théories équationnelles, Thèse d'Etat, Univ. Nancy, France.
Kirchner, C., Kirchner, H. (1989). Constrained equational reasoning, Proceedings of the SIGSAM 1989 International Symposium on Symbolic and Algebraic Computation, ACM Press, pp. 382–389.
Livesey, M., Siekmann, J.H. (1975). Unification of AC-Terms (Bags) and ACI-Terms (Sets), Internal Report, University of Essex, 1975, and Technical Report 3-76, Universität Karlsruhe.
Makanin, G.S. (1977). The problem of solvability of equations in a free semigroup, Mat. USSR Sbornik 32, pp. 129–198.
Miller, D. (1992). Unification under a mixed prefix, J. Symbolic Computation 14, pp. 321–358.
Narendran, P., Otto, F. (1990). Some results on equational unification, Proceedings of the 10th Conference on Automated Deduction, Springer LNCS 449, pp. 276–291.
Plotkin, G. (1972). Building in equational theories, Machine Intelligence 7, pp. 73–90.
Ringeissen, Ch. (1992). Unification in a combination of equational theories with shared constants and its application to primal algebras, Proceedings of the International Conference on Logic Programming and Automated Reasoning, LPAR'92, Springer LNCS 624, pp. 261–272.
Schulz, K. (1991). Makanin's Algorithm – Two Improvements and a Generalization, CIS-Report 91-39, CIS, University of Munich.
Schmidt-Schauß, M. (1989). Unification in a combination of arbitrary disjoint equational theories, J. Symbolic Computation 8, pp. 51–99.
Siekmann, J.H. (1989). Unification theory: A survey, J. Symbolic Computation 7, pp. 207–274.
Stickel, M. (1975). A complete unification algorithm for associative-commutative functions, Proceedings of the International Joint Conference on Artificial Intelligence, Tbilisi, USSR, pp. 71–82.
Stickel, M. (1981). A unification algorithm for associative-commutative functions, J. ACM 28, pp. 423–434.
Stickel, M. (1985). Automated deduction by theory resolution, J. Automated Reasoning 1, pp. 333–356.
Tiden, E. (1986). Unification in combinations of collapse free theories with disjoint sets of function symbols, Proceedings of the 8th International Conference on Automated Deduction, Springer LNCS 230, pp. 431–449.
Yelick, K. (1987). Unification in combinations of collapse free regular theories, J. Symbolic Computation 3, pp. 153–182.