ELSEVIER
Information
Processing Letters 5 1 ( 1994) 6 l-66
Information Processing Letteis
A term equality problem equivalent to Graph Isomorphism David A. Basin Mnx-Plan&-Znstitut
fir Znfirmatik,
’
Zm Stadtwald, O-66123 Saarbriicken, Germany
Communicated by H. Ganzinger; received 19 January 1994; revised 19 April 1994
Abstract We demonstrate that deciding if two terms containing otherwise uninterpreted associative, commutative, and associative-commutative function symbols and commutative variable-binding operators are equal is polynomially equivalent to determining if two graphs are isomorphic. The reductions we use provide insight into this result and suggest polynomial time special cases. Keywords: Automatic theorem proving, Computational complexity; Equality reasoning; Graph isomorphism
1. Introduction Decision procedures for equality play an important role in automated theorem proving. We examine the complexity of a specific problem in equality reasoning: determining if two terms, which possibly contain associative, commutative, and associative-commutative (AC) functions and commutative variable-binding operators (that is, adjacent operators may be swapped) are equal. We prove that this problem is polynomially equivalent to deciding if two graphs are isomorphic. The exact complexity of graph isomorphism, and hence our problem, is still open. While graph isomorphism is clearly in NP, it has neither been proven NP-complete nor in P [9,13]. If one removes variable-binding operators, then polynomial AC-equality decision proce’ Email: basinampi-sb.mpg.de.
dures are trivial. For example, if + is an (infix) AC-function then we can normalize a term like Sl + s2 + . . . + s,, by sorting the si and left associating the result. More complex terms (e.g., the si themselves contain various associative and commutative functions) may be normalized by a bottom up traversal that left associates and sorts subterms as permitted by associativity and commutativity. Two terms are equal iff their normalized counterparts are identical. This kind of a procedure does not work though in the presence of bound variables, as the variable renaming (a-conversion) can, as we shall see, effect normalization. The problem we study arose in reasoning about hardware devices. A common approach to representing hardware (see [ 5 ] ) is to represent devices as relations on their external ports; relations are combined via conjunction and wires between internal ports are represented with existential quantification. For example, if
OOZO-0190/94/$07.00 @ 1994 Elsevier Science B.V. All rights reserved SSDZOO20-0190(94)00084-C
D.A. Basin /Information Processing Letters 51 (1994) 61-66
62
xur(a, b, u) represent an exclusive-or gate with inputs a and b and output o then a half adder sum(a, b, tin, sum) may be defined as the relation 3w.xor(cin,b,w)
Axor(w,a,sum).
In representing hardware, terms are the formulae of the predicate calculus. But when considering if two terms represent the same circuit it should not matter how internal wires are named or ordered, or how devices are ordered. That is, we consider the equality of these terms modulo associativity and commutativity of A, commutativity of 3, and renaming of bound variables. Consider the following example, where pI and pz are distinct uninterpreted predicates that are neither associative nor commutative. s =
~~~~(PlbG~)
APl(V,U))
f =
~W~XPI(~,X)
A (~2kw)
APZ(u,v)
(1)
AP~(~,w))@)
These terms are equal. If w and x in (2) are renamed to v and u, the renamed term is normalized by sorting quantifies lexicographically and left associating and sorting the summed terms, then the equality is apparent. The critical point is that if we simply normalized s and t (s, as written, is already in the above described normal form) without renaming w and x, then the normal form oft,
would not rename to s. Though simple, this example suggest that perhaps contrary to one’s intuition, that our slightly extended equality problem is not likely to be as easy as the binding operator free problem. Of course, in the above example, simply comparing the two p2 terms tells us what the proper bound variable renaming should be. In general, the problem is not so simple. The difficulty arises because both subterms and binding operators may be reordered and this obscures what, if any, bound variable renamings would allow rewriting the terms to a-variants. If the binding operators do not commute then equality again is decidable in polynomial time. For
example, if 1 is a non-commutative variablebinding operator, then we can decide the equality of two terms tl = Ixl.Ax2.. . .Axk.sI and t2 = Ay,.Ayz.. . . Ay,+s2 where the Si contain associative and commutative function symbols by renaming each bound yi by xi and normalizing tl and the renamed t2. Our complexity result is similar to that of [ 3,141 in that they analyze the complexity of matching problems in the presence of associative and commutative function symbols. Indeed, their result that AC-Matching is NP-complete might suggest that our problem is also. However, their problem has more latitude then ours. For example, the substitutions that they consider need not be variable bijections which is required in our problem. Moreover, it would be very surprising if their NP-complete problems are reducible to our problem which we show to be isomorphism complete. Other work relevant to ours is that of Bachmair and Plaisted on associative commutative term rewriting systems [ 21 and that of Hullot [ 121 and Gramlich and Denzinger [ 10 ] on associative-commutative matching. However, none of these authors consider complications introduced by variable binding operators. Our paper is organized as follows. Section 2 contains preliminaries. Section 3 contains problem statements and the proof that our problem is isomorphism complete. The final section considers special cases and draws conclusions.
2. Preliminaries Let I(V, F’) be a collection of terms constructed from denumerable sets of variables V and operators FT. Functions in 3 include binding operators which are instances of binding operator families. Specifically a binding operator family represents an denumerable set of binding operators indexed by members of V. For example if our language is that of the I-calculus, we have a binding operator family 1 with instances AX,A,, * * . and ;Ix.x would be short-hand for ;1, (x). We call the index variable of an operator 0,
D.A. Basin /Information Processing Letters 51 (1994) 61-66
the variable bound by 8. (As a notational convention we shall name functions by Greek letters when they are known to bind variables, and use lower-case Roman letters otherwise. ) We define free and bound variables in the standard way and define substitution to respect variable binding (see, e.g., [ 81 for an account of this in the predicate calculus). For example, if s is a term, the result of substituting t in s for a variable x, denoted by s [t/x] is defined recursively by the following cases: l y [t/x] = if y # x then y else t. = f(tl[t/xl,...,tn[t/xl), . f(t1 . ..tn)[t/xl when f is not a binding operator. l B,,(s’)[t/x] = ifx # y then O,(s’[t/x]) else 0, (S’). In the last condition, to avoid free-variable capture, we assume that t does not contain y free. When this is not the case we must rename bound variables (see (Y below) to avoid capture. We write [tl, tz,. . . , &/x1,x2,. . . ,x,] to represent the iterative application of the substitutions [ti/xi
I.
We terms tional when
will be reasoning about the equality of where their operators satisfy certain equaaxioms. A binary operator f is associative it satisfies
(A)
fk.f-(~,z))
= _f(_f(x,~),z)
for all x and y, and f is commutative when (C)
“f-(&Y)
= f(Y,X).
We call a binding operator family 8 commutative when for every x, y E V and t E 7 (V, F) ~3
e,(e,w)
We call functions
= e,v3,w).
that obey both (A) and (C)
AC-functions and operators that obey (C') Cbinding operators. Finally, we identify as equal so called cY-convertible terms, which are those terms differing only in the names of their bound variables. That is, for every binding operator family 8E3,sEI(V,7)andxEV (a)
e,(s)
= e,(slY/x])
where y is neither free nor bound in s
We
call
two
terms
63
s and
t
associative-
commutative equivalent (denoted
by s “=” t) when s = t is an equational consequence can be proven by using instances of (A), (C), (C’) (corresponding to functions that are associative and/or commutative) and (a) as axiom schemata. For example, the terms given in (1) and (2) are equal when conjunction satisfies (A) and (C) and existential quantification satisfies (C’).
3. Problem statements and equivalence proof Our proof that deciding AC-term equality is polynomially equivalent to Graph Isomorphism uses the following problems.
(1) (Directed) Graph Isomorphism (called DGI in the directed case and GI in the undirected case ) Instance: Two (directed) graphs Gi = (V,Ei) and (32 = (6~72). Question: Is G1 N Gz? That is, is there a bijection 4 : VI + V2 such that (Vi, Uj ) E Ei iff ($(vi),4(vj)) E E2? (2) Labeled Directed Graph Isomorphism (called LDGI) Instance: Two directed graphs Gi = ( VI, El ) and G2 = (V2, E2 ) that may have labels associated with their vertices and edges. Question: Is Gi N G2? That is, is there a bijection 4 : V, + V2 such that (Vi, Vi) E El iff (4(Vi),$(Vi)) E E2 and $J preserves edge and vertex labelings? (3) AC-Equality of Terms Containing Commutative Binding Operators (called ACE) Instance: Two terms s, t E 7 (V, F) where F possibly contains associative and commutative functions and commutative binding operators. Question: Is s “g t? Our equivalence proof proceeds in two parts. First we demonstrate that GI polynomially reduces to ACE.
64
D.A. Basin /Information Processing Letters 51 (1994) 61-66
Lemma 1. GI
k(Vi2,VujZ),~~~,
f(g(vi,_,,vj,_,),g(vi,,vj,))...))))
(3)
We construct t analogously using G2. Both constructions may be accomplished in polynomial time (and in log space). Correctness is easy to establish. If 4 is an isomorphism from Gi to G2, then s contains a subterm g (vik, vj, ) iff t contains either the subterm S(~(Vi,),~(Vj,)) or g(+(Vj,),4(Vg)) It follows that s “=” t. Conversely, if s “g t, then there must be a renaming of binding variables in s, such that the resulting term may be rewritten to t using (A), (C), and (C’). This renaming is an isomorphism from Gt to G2. 0 To prove the converse define a functionfiontierf of subterms oft.
of Lemma 1, we first (t ) which returns a list
frontierf (t ) frontierf (al ) @ . . . @ fiontierf (a, ) = ift = f (a,,...,a,), { [t ]
otherwise.
2 Edges in GI and G2 are unordered; being commutative.
this corresponds
to g
The operator @ appends its operand lists and [t ] represents the singleton list with member t. We use the following facts (see, e.g., [2,10] ) about frontier in our proof. Fact 2. If f
is associative, frontierf(s) [Cl>. . ., C,], frontierf(t) = [Dl,. . .,I&], for 1 < i < k, Ci “LeDi, then s “Ae t.
= and
Fact 3. Zf f is an AC-function, and s and t are
terms such that frontierf (s) is (pointwise) ACequivalent to a permutation of frontierf (t ) , then s “E t. We may now prove the following: Lemma 4. ACE
65
D.A. Basin /Information Processing Letters 51 (1994) 61-66
Case 3d: f is both associative and commutative (n = 2). Same as Case 3c, except do not label the edges leaving U. Case4:t-8,,(8,(...8,(t’)...))wherethe outermost term constructor oft’ is not 8. Case 4a: 8 is not commutative. Let Gt be rooted by a new vertex v labeled by 8. Add an edge from v to GIf. In addition, add edges from v to II new vertices WI, ~2,. . . , w,, and label the ith edge by the integer i. For each wi, add edges from Wi to the vertices in Gtl labeled by xi, and erase the “xi’‘-labels on these vertices. Case 4b: 8 is commutative. Same as Case 4a, except do not label the edges from v to the wi. If s E t then some finite application of (A), (C), (C’), and bound variable renaming will rewrite s and t to some term w. It is easy to check that each such rewrite does not, up to isomorphism, alter the image of a term under p_ Conclude that Gs N Gt . The converse is more complicated. Formally, we prove by induction on k that for all terms s and t where height(G,) 5 k and G, N GI, then s “g t. The function height computes the height of its LDAG argument (i.e., the length of the longest path). We proceed by cases on the outermost operator of s (which must be identical to the outermost operator oft since their images under p are isomorphic). Case 1: s = c, where c is a constant. As G, N GI, t must also be c. Case 2: s - x, where x is a variable. As Gs N Gt, t must also be x. Case3:s=f(sl,..., s,)andtrf(tl,..., tn). Case 3a: f is neither associative nor commutative. Then G, N Gt establishes an isomorphism between Gs, and Gti. By the induction hypothesis, we have that Si “=”ti, and it follows (as equality is a congruence relation) that s “2 t. Case 3b: f is commutative and is not associative. Then Gs N Gt implies either G,, N G,, and GQ N Gtl, or alternatively, G,, N G,, and Gs2 N Gt,.Either case is possible as, unlike Case 3a, the edges leaving the roots f are unlabeled. In the first case, we have by the induction hypothesis that si “=”tl and s2 “g t2 from which s “2 t follows by congruence. In the second case s1 “g t2
and s2 “g tl and our result follows by congruence and the commutativity off. Case 3c: f is associative and is not commutative. G, is a graph rooted by f with children Gc, >. . . , Gc, and [Cl,... , C,] = frontierf (s). Similarly Gt is a graph rooted by f with = children CD,, . . . , Gn, where [Oi, . . . , I&] frontierf (t ). Hence, G, N Gt implies that Gci N Goi and, by the induction hypothesis, it follows that Ci “=”Di. By Fact 2, s “2 t. Case 3d: f is both associative and commutative. Same as Case 3c, except that the isomorphism from Gs to Gt only establishes that some permutation of the Gc,, . . . , Gc, is isomorphic to the CD,,..., Go,. It follows by the induction hypothesis and Fact 3 that s “=”t. Case 4: s E 8,, (fl,, (. . . 0, (s’) . . .) ) and t E e,,uw...e,,u~)
. ..)I.
Case 4a: 8 is not commutative. implies that Gacsl) N Gt, where stitution [vi,. . . , y,/xl, . . . , x, 1. tion hypothesis, a(~‘) “2 t’, and ey,(...eyn((T(S’))...)
Then Gs N Gt d is the subBy the inducit follows that
“=“e,,(...e,,(ty...).~y
renaming bound variables we conclude s “=”t. Case 4b: 8 is commutative. Same as Case 4a, except that o is a bijection from {xi,. . . ,x,} to {VI , . . . , yn}. Permuting the names of binding variables does not effect AC-equality as 0 is commutative. 0 Putting the previous two lemmas together, we conclude: Theorem 5. ACE is polynomial equivalent to GI.
4. Discussion We have given fairly direct reductions between graph isomorphism and AC-equality. In both directions, the reductions are efficient enough that a previously coded solution for either problem could be used as a decision procedure for the other. The reduction from ACE to GI also provides insight into the complexity of AC-term equality and the possibility of polynomial decision proce-
66
D.A. Basin /Information Processing Letters 51 (1994) 61-66
dures for special cases. Observe that if we eliminate binding operators (and hence bound variables), then the labeled DAGs associated with terms would be labeled trees and we could decide equality by extending polynomial-time tree isomorphism algorithms [ 1 ] to handle vertex and edge labelings. This confirms our observation in the first section: AC-term equality is tractable in theories without binding variables. Even when binding variables are present, there are cases where deciding AC-equality is still tractable. For example, as planar graph isomorphism is decidable in polynomial time [ 111, so is AC-equality when the corresponding term graphs contain no subgraphs contractable to KS or Ks,s. Non-planarity arises when terms share multiple bound variables across multiple subterms. For example, three independent subterms containing the same three bound variables will correspond to a graph containing a subgraph contractable to K3,3. For example in hardware verification the terms we encounter are relations that represent devices and the only binding variables are existential quantifies which represent internal wires and are shared by relations that are “wired together”. In this domain, many gate-level devices have small fan-in and fan-out and, as a result, share few bound variables; their corresponding graphs are often planar. Acknowledgement This work benefited from discussions with Dexter Kozen, Paliath Narendran, and Alexander Bockmayr. References [ 1] A. Aho, J. Hopcroft and J. Ullman, The Design and
Analysis of Computer Algorithms (Addison-Wesley, Reading, MA, 1974). [2] L. Bachmair and D.A. Plaisted, Associative path orderings, in: proc. 1st Internat. Conf on Rewriting Techniques and Applications, Dijon, France (1985) 241-254.
[ 31 D. Benanav, D. Kapur and P. Narendran, Complexity of matching problems, in: Proc. 1st Intern. Conf on Rewriting Techniques and Applications, Dijon, France (1985) 417-429. [4] KS. Booth and C.J. Colboum, Problems polynomially equivalent to graph isomorphism, Tech. Rept. CS-7704, University of Waterloo, 1979. [S] A. Camilleri, M. Gordon and T. Melham, Hardware verification using higher-order logic, in: D. Borrione, ed., From HDL Descriptions to Guaranteed Correct Circuit Designs North-Holland, Amsterdam, 1987) 43-67. N. Dershowitz et al., Associative-commutative rewriting. in: Proc. 8th Internat. Joint Conf on Artificial Intelligence, Karlsruhe, Germany (1983) 940-944. M. Fontet, Automorphismes de graphes et planarite, Asterisque (1976) 73-90. J.H. Gallier, Logic for Computer Science (Harper & Row, New York, 1986). [9] M.R. Garey and D.S. Johnson, Computers and Intractability, A Guide to the Theory of NPCompleteness (Freeman, New York, 1979). [lo] B. Gramlich and J. Denzinger, Efficient acmatching using constraint propagation, Tech. Rept., SEKI Report SR-88-15, FB Informatik, Universith Kaiserslautem, 1988. [ 111 J. Hopcroft and J.K. Wong, A linear time algorithm for isomorphism of planar graphs, in: Proc. 6th Ann. ACM Symp. on Theory of Computing (1974) 172184. Associative-commutative pattern [12] J.M. Hullot, matching, in: Proc. IJCAI-79, Tokyo, Japan (1979) 406-412. [ 131 D.S. Johnson, The NP-completeness column: An ongoing guide, J. Algorithms 9 (1988) 426-444. [ 141 D. Kapur and P. Narendran, NP-completeness of the set unification and matching problems, in: Proc. 8th Internat. Conf on Automated Deduction, Oxford, UK (1986) 489-495.