Information Processing Letters 53 (1995) 237-242
Solvability of word equations modulo finite special and confluent string-rewriting systems is undecidable in general
Friedrich Otto
Fachbereich Mathematik/Informatik, Universität GH Kassel, 34109 Kassel, Germany
Communicated by L. Boasson; received 1 September 1994; revised 2 November 1994
Abstract
A finite, special, and confluent string-rewriting system $S$ is constructed such that it is undecidable in general whether a word equation is solvable modulo $S$. Thus, (word) unification modulo $S$ is undecidable.
Keywords: Formal languages; Word equations; String-rewriting
1. Introduction
The problem of deciding whether an equation has a solution is one of the most fundamental problems in mathematics. Under the name of “unification” this problem has received much attention in the computer science literature. For various theories, algorithms have been developed that not only decide whether a given equation has a solution modulo the theory considered, but that, in the affirmative case, also compute a basis for the set of all solutions, that is, a “complete set of most general unifiers” (see [15] for an overview). One of the most important results in this area is Makanin’s proof that it is decidable whether a word equation $u \equiv v$ has a solution in a free semigroup [6]. Here $u$ and $v$ are strings that contain letters from a given alphabet $\Sigma$ as well as variables from a set of variables $V$, and the equation $u \equiv v$ is said to be solvable if there exists a morphism (a “solution”) $\varphi : V \to \Sigma^*$ such that $\varphi(u)$ and $\varphi(v)$ are identical strings.
This means that “unifiability modulo associativity” is decidable. Since Makanin’s paper appeared, his algorithm has been the object of many research activities. The objectives have been to simplify the proof of the termination and correctness of his algorithm [12,14], to develop simpler algorithms for deciding the solvability of word equations [5,13], and to compute a description of the set of all solutions of a solvable word equation [8]. Observe that a word equation can have a minimal complete set of most general unifiers that is infinite, that is, the theory of associativity is of unification type infinitary.
Makanin also proved that the solvability of word equations in free groups is decidable [7]. Since the free group on $n$ generators can be seen as a factor monoid of the free monoid on $2n$ generators, it is only natural to ask whether the decidability of the solvability of word equations can be generalized to a still larger class of nonfree monoids.
Book [2] introduces a restricted class of logical formulae that he calls “linear sentences”. The purely existential linear sentences are just disjunctions of word equations containing no variable more than once. Book proves that it is decidable in polynomial time whether an existential linear sentence is valid with respect to an interpretation that is induced by a finite, monadic, and confluent string-rewriting system. In fact, he considers constrained solutions in that, for each variable occurring in the sentence under consideration, a regular set may be specified as the domain for that variable.
In a recent paper Oleshchuk applies the technique of narrowing to word equations, showing that it is decidable whether a word equation of a certain restricted form has a solution modulo a finite string-rewriting system that is homogeneous of degree 2, that is, each rule of this system has a left-hand side of length 2 and the empty string $\lambda$ as right-hand side [10]. Actually, Oleshchuk works with finite homogeneous systems of degree 2 that are in addition confluent, and then he uses a result of Book [3] which states that, for each finite and homogeneous string-rewriting system of degree 2, there exists a confluent system of the same type that is isomorphic to the original system under a morphism that identifies some of the letters.
To which classes of finite string-rewriting systems can Makanin’s decidability result be generalized? On the one hand, it is not hard to see that the Post Correspondence Problem can be reduced to the problem of deciding whether a word equation is solvable modulo a finite, monadic, and confluent string-rewriting system (see, e.g., [4, Section 4.5]). On the other hand, Adian has shown that there exists a finite homogeneous string-rewriting system $S_3$ of degree 3 such that the word problem for $S_3$ is undecidable [1], and hence, since the word problem for $S_3$ is obviously a special case of the problem of deciding whether a word equation has a solution modulo $S_3$, the latter problem is also undecidable.
Here we show that Makanin’s result cannot be carried over to the class of all finite and special string-rewriting systems that are confluent (see the next section for the definitions). We construct a fixed finite string-rewriting system $S$ that is special and confluent such that it is undecidable in general whether a given word equation has a solution modulo $S$. In particular, this shows that for finite, special, and confluent string-rewriting systems the validity of nonlinear sentences in the sense of Book is undecidable.
2. Definitions and results
Let $\Sigma$ be a finite alphabet, and let $X := \{x_i \mid i \geq 1\}$ be a countable set of (existential) variables such that $\Sigma \cap X = \emptyset$. A word equation (over $\Sigma$) is an expression of the form $u \equiv v$, where $u$ and $v$ are strings from $(\Sigma \cup X)^*$. For $w \in (\Sigma \cup X)^*$, $X(w)$ denotes the set of variables that occur in the string $w$, that is, $X(w) := \{x_i \in X \mid |w|_{x_i} > 0\}$. A mapping $\varphi : X(u) \cup X(v) \to \Sigma^*$ is called a solution of the word equation $u \equiv v$ if the strings $\varphi(u)$ and $\varphi(v)$ coincide. Here $\varphi$ is extended to a morphism from $(X(u) \cup X(v) \cup \Sigma)^*$ into $\Sigma^*$ in the unique possible way. The solvability problem for word equations is the following decision problem:
Instance: A word equation $u \equiv v$ over $\Sigma$.
Question: Does this word equation have a solution, that is, is there a mapping $\varphi : X(u) \cup X(v) \to \Sigma^*$ such that the strings $\varphi(u)$ and $\varphi(v)$ are identical?
As mentioned in the Introduction, Makanin has shown that this problem is decidable [6]. Here we are interested in a generalization of this problem. Let $S$ be a string-rewriting system on $\Sigma$. Then $\leftrightarrow^*_S$ denotes the Thue congruence on $\Sigma^*$ that is generated by $S$. The solvability problem for word equations modulo $S$ is the following decision problem:
Instance: A word equation $u \equiv v$ over $\Sigma$.
Question: Does this equation have a solution modulo $S$, that is, is there a mapping $\varphi : X(u) \cup X(v) \to \Sigma^*$ such that $\varphi(u) \leftrightarrow^*_S \varphi(v)$ holds?
Obviously, the word problem for $S$ is reducible to this problem. Thus, the solvability problem for word equations modulo $S$ is undecidable for each finite string-rewriting system $S$ that has an undecidable word problem. Therefore, we are interested in certain restricted classes of finite string-rewriting systems.
A finite string-rewriting system $S$ is called length-reducing if $|l| > |r|$ holds for each rule $(l \to r) \in S$. Here $|w|$ denotes the length of the string $w$. The system $S$ is called monadic if it is length-reducing and $r \in \Sigma \cup \{\lambda\}$ holds for each rule $(l \to r) \in S$, where $\lambda$ denotes the empty string. Further, $S$ is a special system if it is length-reducing and $r = \lambda$ for each rule $(l \to r) \in S$, and it is homogeneous of degree $k$ if it is special and $|l| = k$ holds for each rule $(l \to r) \in S$. Finally, $S$ is confluent if $u \leftrightarrow^*_S v$ implies that there exists a string $w$ such that $u \to^*_S w$ and $v \to^*_S w$. Here $\to^*_S$ denotes the reduction relation that is induced by $S$, which is the reflexive and transitive closure of the following single-step reduction relation:
\[ u \to_S v \quad \text{iff} \quad \exists g, h \in \Sigma^* \; \exists (l \to r) \in S : \; u = glh \text{ and } v = grh. \]
A length-reducing and confluent system $S$ defines a unique normal form for each congruence class $[w]_S := \{u \in \Sigma^* \mid u \leftrightarrow^*_S w\}$, since each such class contains one and only one string that is irreducible modulo $S$. By $IRR(S)$ we denote the set of strings that are irreducible modulo $S$. Thus, for each system of this form the word problem is decidable in linear time [4]. For more information on string-rewriting systems the reader is asked to consult the literature, e.g., [4].
Here we want to establish the following undecidability result.
Theorem 1. There exists a finite, special, and confluent string-rewriting system $S$ such that the solvability problem for word equations modulo $S$ is undecidable.
Since the solvability problem is a special case of the problem of determining whether a nonlinear sentence in the sense of Book ([2], see also [4]) is valid, we obtain the following consequence.
Corollary 2. There exists a finite, special, and confluent string-rewriting system $S$ such that it is undecidable in general whether or not a nonlinear sentence is valid under the interpretation induced by S.
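The definitions of this section can be illustrated by a minimal Python sketch. It assumes single-character letters, uppercase characters as variables, and a one-rule special system `ab -> λ`; all function names are hypothetical. The reduction loop is a naive strategy chosen for brevity, not the linear-time procedure referred to in [4].

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]  # (left-hand side, right-hand side)


def normal_form(word: str, rules: List[Rule]) -> str:
    """Apply rules until no left-hand side occurs in the word.

    This terminates for length-reducing systems, since each step shortens
    the word. The result is a unique normal form only if the system is
    also confluent.
    """
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word


def congruent(u: str, v: str, rules: List[Rule]) -> bool:
    """Word problem for a length-reducing, confluent system: compare normal forms."""
    return normal_form(u, rules) == normal_form(v, rules)


def apply_morphism(term: str, phi: Dict[str, str]) -> str:
    """Extend phi to a morphism: variables are replaced, letters are kept."""
    return "".join(phi.get(symbol, symbol) for symbol in term)


def solves_modulo(u: str, v: str, phi: Dict[str, str], rules: List[Rule]) -> bool:
    """Check whether phi is a solution of the word equation u = v modulo the system."""
    return congruent(apply_morphism(u, phi), apply_morphism(v, phi), rules)


# Assumed example: the special system {ab -> λ}, which has no overlaps.
S = [("ab", "")]
```

With this system, `solves_modulo("Xb", "", {"X": "a"}, S)` holds because `ab` reduces to the empty string, whereas the assignment `{"X": "b"}` is not a solution. Checking a given assignment is easy; the undecidability result concerns the existence of some solution.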
In the rest of this note we present a proof of Theorem 1.
3. The proof
Theorem 1 will be proved by a reduction from the Post Correspondence Problem (PCP). In fact, it is known (see, e.g., [9]) that there exists a set of pairs of nonempty strings $P = \{(u_i, v_i) \mid i = 2, \ldots, k\} \subset \{a,b\}^+ \times \{a,b\}^+$ such that the following problem is undecidable:
Instance: Two strings $u_1, v_1 \in \{a,b\}^+$.
Question: Does the modified PCP $\{(u_1, v_1)\} \cup P$ have a solution, that is, does there exist a sequence of integers $i_1, \ldots, i_n \in \{2, \ldots, k\}$ such that $u_1 u_{i_1} \cdots u_{i_n} = v_1 v_{i_1} \cdots v_{i_n}$?
We define a special string-rewriting system $S_p(P)$ on the alphabet $\Gamma := \{a, b, a', b', z_2, \ldots, z_k, z_2', \ldots, z_k'\}$ as follows:
\[
\begin{array}{ll}
u_i z_i \to \lambda, & i = 2, \ldots, k,\\
z_i' \rho(v_i') \to \lambda, & i = 2, \ldots, k,\\
z_i' z_i \to \lambda, & i = 2, \ldots, k,\\
a a' \to \lambda, & \\
b b' \to \lambda, &
\end{array}
\]
where $\rho : \Gamma^* \to \Gamma^*$ denotes the reversal function, and $' : \{a,b\}^* \to \{a',b'\}^*$ denotes the obvious isomorphism. There are no overlaps between left-hand sides of rules of $S_p(P)$, and hence, $S_p(P)$ is a finite, special, and confluent system.
Let $\Delta := \Gamma \cup \{\$\}$, where \$ is an additional symbol, and, for $u_1, v_1 \in \{a,b\}^+$, let $\Phi(u_1, v_1)$ denote the following word equation over $\Delta$:
\[ \Phi(u_1, v_1) := \; \$ x_1 x_3 \$ x_4 x_2 \$ x_4 x_3 \$ u_1 x_1 x_2 \rho(v_1') \$ \equiv \$^5, \]
where $x_1, \ldots, x_4$ are variables. Finally, we define four regular subsets of $\Delta^*$:
\[ R_1 := \{a,b\}^*, \quad R_2 := \{a',b'\}^*, \quad R_3 := \{z_2, \ldots, z_k\}^*, \quad R_4 := \{z_2', \ldots, z_k'\}^*. \]
Lemma 3. If the modified PCP $\{(u_1, v_1)\} \cup P$ has a solution, then there exist strings $s_i \in R_i$, $i = 1, \ldots, 4$, such that
\[ \$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ \leftrightarrow^*_{S_p(P)} \$^5. \]
Proof. Let $i_1, \ldots, i_n \in \{2, \ldots, k\}$ be such that $u_1 u_{i_1} \cdots u_{i_n} = v_1 v_{i_1} \cdots v_{i_n}$. Take $s_1 := u_{i_1} \cdots u_{i_n} \in R_1$, $s_2 := \rho(v_{i_1}' \cdots v_{i_n}') \in R_2$, $s_3 := z_{i_n} \cdots z_{i_1} \in R_3$, and $s_4 := z_{i_1}' \cdots z_{i_n}' \in R_4$. Then
\[
\begin{array}{rl}
\$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ = & \$ u_{i_1} \cdots u_{i_n} z_{i_n} \cdots z_{i_1} \, \$ \, z_{i_1}' \cdots z_{i_n}' \rho(v_{i_n}') \cdots \rho(v_{i_1}') \, \$ \, z_{i_1}' \cdots z_{i_n}' z_{i_n} \cdots z_{i_1} \\
 & \$ \, u_1 u_{i_1} \cdots u_{i_n} \rho(v_{i_n}') \cdots \rho(v_{i_1}') \rho(v_1') \, \$ \\
\leftrightarrow^*_{S_p(P)} & \$^5. \qquad \Box
\end{array}
\]
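The construction of Lemma 3 can be checked mechanically on a small instance. The following Python sketch is only an illustration: the instance `P`, the strings `u1`, `v1`, the token names (`z2`, `z2'`, and so on) and all function names are assumptions made here, not taken from the paper. Words are tuples of tokens because the letters $a'$, $z_i$, $z_i'$ are not single characters.

```python
from typing import Dict, List, Sequence, Tuple

Word = Tuple[str, ...]
Rule = Tuple[Word, Word]


def prime(w: str) -> Word:
    """The isomorphism ' from {a,b}* to {a',b'}*, applied letter by letter."""
    return tuple(c + "'" for c in w)


def rev(w: Word) -> Word:
    """The reversal function rho."""
    return tuple(reversed(w))


def build_sp(P: Dict[int, Tuple[str, str]]) -> List[Rule]:
    """Rules of the special system for the pairs (u_i, v_i), i = 2..k."""
    rules: List[Rule] = [(("a", "a'"), ()), (("b", "b'"), ())]
    for i, (u, v) in P.items():
        z, zp = f"z{i}", f"z{i}'"
        rules.append((tuple(u) + (z,), ()))          # u_i z_i -> lambda
        rules.append(((zp,) + rev(prime(v)), ()))    # z_i' rho(v_i') -> lambda
        rules.append(((zp, z), ()))                  # z_i' z_i -> lambda
    return rules


def normal_form(word: Sequence[str], rules: List[Rule]) -> Word:
    """Naive reduction to an irreducible word (terminates: rules shorten words)."""
    word = tuple(word)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            n = len(lhs)
            for pos in range(len(word) - n + 1):
                if word[pos:pos + n] == lhs:
                    word = word[:pos] + rhs + word[pos + n:]
                    changed = True
                    break
            if changed:
                break
    return word


def lemma3_strings(P: Dict[int, Tuple[str, str]], indices: List[int]):
    """The strings s1..s4 chosen in the proof of Lemma 3 for an index sequence."""
    s1 = tuple("".join(P[i][0] for i in indices))
    s2 = rev(prime("".join(P[i][1] for i in indices)))
    s3 = tuple(f"z{i}" for i in reversed(indices))
    s4 = tuple(f"z{i}'" for i in indices)
    return s1, s2, s3, s4


def left_side(u1: str, v1: str, s) -> Word:
    """The left-hand side of the equation Phi(u1, v1) with x_i replaced by s_i."""
    s1, s2, s3, s4 = s
    d = ("$",)
    return (d + s1 + s3 + d + s4 + s2 + d + s4 + s3 + d
            + tuple(u1) + s1 + s2 + rev(prime(v1)) + d)


# Assumed toy instance: u1 u2 = "a" + "ba" = "ab" + "a" = v1 v2.
P = {2: ("ba", "a"), 3: ("b", "bb")}
u1, v1 = "a", "ab"
rules = build_sp(P)
```

For the index sequence `[2]`, which solves this toy instance, the word `left_side(u1, v1, lemma3_strings(P, [2]))` reduces to five `$` symbols; for `[3]`, which is not a solution, it does not.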
Lemma 4. If there exist strings $s_i \in R_i$, $i = 1, \ldots, 4$, such that
\[ \$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ \leftrightarrow^*_{S_p(P)} \$^5, \]
then the modified PCP $\{(u_1, v_1)\} \cup P$ has a solution.
Proof. Let $s_i \in R_i$ ($i = 1, \ldots, 4$) be chosen such that the above congruence holds. Since $\$^5 \in IRR(S_p(P))$, and since no left-hand side of a rule of $S_p(P)$ contains an occurrence of the symbol \$, we see that
\[ s_1 s_3 \to^*_{S_p(P)} \lambda, \quad s_4 s_2 \to^*_{S_p(P)} \lambda, \quad s_4 s_3 \to^*_{S_p(P)} \lambda, \quad \text{and} \quad u_1 s_1 s_2 \rho(v_1') \to^*_{S_p(P)} \lambda. \]
Since $s_1 \in \{a,b\}^*$ and $s_2 \in \{a',b'\}^*$, $u_1 s_1 s_2 \rho(v_1') \to^*_{S_p(P)} \lambda$ by only using the rules $aa' \to \lambda$ and $bb' \to \lambda$, that is, $u_1' s_1' = \rho(s_2 \rho(v_1')) = v_1' \rho(s_2)$. Since $s_1 \in \{a,b\}^*$ and $s_3 \in \{z_2, \ldots, z_k\}^*$, $s_1 s_3 \to^*_{S_p(P)} \lambda$ implies that $s_3 = z_{i_n} \cdots z_{i_1}$ and $s_1 = u_{i_1} \cdots u_{i_n}$ for some indices $i_1, \ldots, i_n \in \{2, \ldots, k\}$. Analogously, $s_4 s_2 \to^*_{S_p(P)} \lambda$ implies that $s_4 = z_{j_1}' \cdots z_{j_m}'$ and $s_2 = \rho(v_{j_m}') \cdots \rho(v_{j_1}')$ for some indices $j_1, \ldots, j_m \in \{2, \ldots, k\}$. Finally, $s_4 s_3 \to^*_{S_p(P)} \lambda$ implies that $n = m$ and $i_h = j_h$ for all $h = 1, \ldots, n$, that is, $s_1 = u_{i_1} \cdots u_{i_n}$ and $s_2 = \rho(v_{i_1}' \cdots v_{i_n}')$. Hence, we conclude that $u_1 s_1 = u_1 u_{i_1} \cdots u_{i_n} = v_1 v_{i_1} \cdots v_{i_n}$, which means that $i_1, \ldots, i_n$ is a solution for the modified PCP $\{(u_1, v_1)\} \cup P$. $\Box$
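The proof of Lemma 4 reads the index sequence off the string $s_3 = z_{i_n} \cdots z_{i_1}$. A minimal Python sketch of this decoding step follows; it assumes, purely for illustration, that the letters $z_i$ are written as tokens `z2`, `z3`, ..., and that `P` maps each index to its pair. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def decode_indices(s3: Sequence[str]) -> List[int]:
    """Recover i_1, ..., i_n from s3 = z_{i_n} ... z_{i_1} (tokens like 'z2')."""
    return [int(token[1:]) for token in reversed(s3)]


def is_mpcp_solution(u1: str, v1: str,
                     P: Dict[int, Tuple[str, str]],
                     indices: List[int]) -> bool:
    """Check u1 u_{i1} ... u_{in} = v1 v_{i1} ... v_{in} directly."""
    top = u1 + "".join(P[i][0] for i in indices)
    bottom = v1 + "".join(P[i][1] for i in indices)
    return top == bottom


# Assumed toy instance: "a" + "ba" equals "ab" + "a".
P = {2: ("ba", "a"), 3: ("b", "bb")}
```

The decoded sequence is guaranteed to be a solution only when the strings $s_1, \ldots, s_4$ satisfy the congruence of Lemma 4; `is_mpcp_solution` merely rechecks that conclusion on a concrete instance.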
Together these two lemmata give the following undecidability result.
Corollary 5. Let $S_p(P)$ be the above finite, special, and confluent string-rewriting system on $\Delta$, and let $R_1, \ldots, R_4$ be the above regular subsets of $\Delta^*$. Then the following problem is undecidable:
Instance: Two strings $u_1, v_1 \in \{a,b\}^+$.
Question: Is the existential sentence
\[ \exists x_1, x_2, x_3, x_4 : \; \$ x_1 x_3 \$ x_4 x_2 \$ x_4 x_3 \$ u_1 x_1 x_2 \rho(v_1') \$ \equiv \$^5 \]
true under the interpretation induced by the system $S_p(P)$ and the regular sets $R_1, \ldots, R_4$?
Thus, the problem of validity for nonlinear existential sentences is undecidable for the finite, special, and confluent string-rewriting system $S_p(P)$. Hence, the decidability result of [2] does not carry over to nonlinear sentences, even if attention is restricted to a fixed finite, special, and confluent string-rewriting system.
The validity problem for nonlinear existential sentences differs from the solvability problem for word equations modulo $S_p(P)$ in that, for each variable, a regular subset $R \subseteq \Delta^*$ is chosen as the domain for this variable. However, the subsets $R_1, \ldots, R_4$ chosen above are of a very restricted form only. We now introduce four further symbols, and we add some more rules to the system $S_p(P)$. Let $\Sigma := \Delta \cup \{\alpha, \beta, \gamma, \delta\}$, and let $S$ denote the following string-rewriting system on $\Sigma$:
\[ S := S_p(P) \cup \{a\alpha \to \lambda, \; b\alpha \to \lambda, \; \beta a' \to \lambda, \; \beta b' \to \lambda, \; \gamma z_i \to \lambda, \; z_i' \delta \to \lambda \mid i = 2, \ldots, k\}. \]
Then $S$ is a finite special system, and it is easily verified that $S$ is confluent, since it does not have any nontrivial critical pairs. For $u_1, v_1 \in \{a,b\}^+$, let $w_1(u_1, v_1) \in (\Sigma \cup \{x_1, \ldots, x_4, y_1, \ldots, y_4\})^*$ and $w_2 \in (\Sigma \cup \{y_1, \ldots, y_4\})^*$ denote the following strings, where $x_1, \ldots, x_4, y_1, \ldots, y_4$ are variables:
\[
\begin{array}{rl}
w_1(u_1, v_1) := & y_1 \alpha \$ \beta y_2 \$ \gamma y_3 \$ y_4 \delta \$ x_1 y_1 \$ y_2 x_2 \$ y_3 x_3 \$ x_4 y_4 \$ x_1 x_3 \$ x_4 x_2 \$ x_4 x_3 \$ u_1 x_1 x_2 \rho(v_1') \$, \\
w_2 := & \alpha y_1 \$ y_2 \beta \$ y_3 \gamma \$ \delta y_4 \$^9.
\end{array}
\]
Lemma 6. If there are strings $s_i \in R_i$, $i = 1, \ldots, 4$, such that
\[ \$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ \leftrightarrow^*_{S_p(P)} \$^5, \]
then the equation $w_1(u_1, v_1) \equiv w_2$ has a solution modulo $S$.
Proof. Let $s_i \in R_i$, $i = 1, \ldots, 4$, be such that the above congruence holds. We define a morphism $\psi : \{x_1, \ldots, x_4, y_1, \ldots, y_4\} \to \Sigma^*$ as follows:
\[ \psi(x_i) := s_i, \; i = 1, \ldots, 4, \quad \psi(y_1) := \alpha^{|s_1|}, \quad \psi(y_2) := \beta^{|s_2|}, \quad \psi(y_3) := \gamma^{|s_3|}, \quad \psi(y_4) := \delta^{|s_4|}. \]
Then
\[
\begin{array}{ll}
\alpha \psi(y_1) = \alpha^{|s_1|+1} = \psi(y_1) \alpha, & \beta \psi(y_2) = \beta^{|s_2|+1} = \psi(y_2) \beta, \\
\gamma \psi(y_3) = \gamma^{|s_3|+1} = \psi(y_3) \gamma, & \delta \psi(y_4) = \delta^{|s_4|+1} = \psi(y_4) \delta.
\end{array}
\]
Further,
\[
\begin{array}{ll}
\psi(x_1 y_1) = s_1 \alpha^{|s_1|} \to^*_S \lambda, & \psi(y_2 x_2) = \beta^{|s_2|} s_2 \to^*_S \lambda, \\
\psi(y_3 x_3) = \gamma^{|s_3|} s_3 \to^*_S \lambda, & \psi(x_4 y_4) = s_4 \delta^{|s_4|} \to^*_S \lambda,
\end{array}
\]
since $s_1 \in \{a,b\}^*$, $s_2 \in \{a',b'\}^*$, $s_3 \in \{z_2, \ldots, z_k\}^*$, and $s_4 \in \{z_2', \ldots, z_k'\}^*$. Therefore, we see that $\psi(w_1(u_1, v_1)) \leftrightarrow^*_S \psi(w_2)$, that is, $w_1(u_1, v_1) \equiv w_2$ has a solution modulo $S$. $\Box$
Lemma 7. If $\psi : \{x_1, \ldots, x_4, y_1, \ldots, y_4\} \to \Sigma^*$ is a solution of the equation $w_1(u_1, v_1) \equiv w_2$ modulo $S$, then $s_i := \psi(x_i) \in R_i$, $i = 1, \ldots, 4$, and
\[ \$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ \leftrightarrow^*_{S_p(P)} \$^5. \]
Proof. Assume that $\psi : \{x_1, \ldots, x_4, y_1, \ldots, y_4\} \to \Sigma^*$ is a solution of the equation $w_1(u_1, v_1) \equiv w_2$ modulo $S$. Let $s_i := \psi(x_i)$ and $t_i := \psi(y_i)$, $i = 1, \ldots, 4$. Since $S$ is confluent, we can assume without loss of generality that these strings are irreducible modulo $S$. Hence, $\psi(w_2) = \alpha t_1 \$ t_2 \beta \$ t_3 \gamma \$ \delta t_4 \$^9$ is irreducible, implying that $\psi(w_1(u_1, v_1)) \to^*_S \psi(w_2)$. None of the rules of $S$ contains an occurrence of the symbol \$, and hence,
\[ |\psi(w_2)|_\$ = 12 + \sum_{i=1}^{4} |t_i|_\$ = |\psi(w_1(u_1, v_1))|_\$ = 12 + 2 \sum_{i=1}^{4} |t_i|_\$ + 3 \sum_{i=1}^{4} |s_i|_\$, \]
implying that $|s_i|_\$ = |t_i|_\$ = 0$, $i = 1, \ldots, 4$. Thus, $\psi(w_1(u_1, v_1)) \to^*_S \psi(w_2)$ implies the following:
\[
\begin{array}{lr}
t_1 \alpha \to^*_S \alpha t_1, \quad \beta t_2 \to^*_S t_2 \beta, \quad \gamma t_3 \to^*_S t_3 \gamma, \quad t_4 \delta \to^*_S \delta t_4, & (1) \\
s_1 t_1 \to^*_S \lambda, \quad t_2 s_2 \to^*_S \lambda, \quad t_3 s_3 \to^*_S \lambda, \quad s_4 t_4 \to^*_S \lambda, & (2) \\
\$ s_1 s_3 \$ s_4 s_2 \$ s_4 s_3 \$ u_1 s_1 s_2 \rho(v_1') \$ \to^*_S \$^5. & (3)
\end{array}
\]
Since $S$ is length-reducing, and $|t_1 \alpha| = |\alpha t_1|$, we conclude that $t_1 \alpha = \alpha t_1$, which in turn yields that $t_1 = \alpha^{r_1}$ for some $r_1 \in \mathbb{N}$. Analogously, we obtain that $t_2 = \beta^{r_2}$, $t_3 = \gamma^{r_3}$, and $t_4 = \delta^{r_4}$ for some $r_2, r_3, r_4 \in \mathbb{N}$. Since $s_1 t_1 = s_1 \alpha^{r_1} \to^*_S \lambda$, and since $a\alpha \to \lambda$ and $b\alpha \to \lambda$ are the only rules of $S$ containing occurrences of the symbol $\alpha$, this means that $s_1 \in \{a,b\}^*$ and $r_1 = |s_1|$. Analogously, we see that $s_2 \in \{a',b'\}^*$ and $r_2 = |s_2|$, $s_3 \in \{z_2, \ldots, z_k\}^*$ and $r_3 = |s_3|$, and $s_4 \in \{z_2', \ldots, z_k'\}^*$ and $r_4 = |s_4|$. Thus, $s_i \in R_i$, $i = 1, \ldots, 4$, and in the reduction (3) above only rules from the subsystem $S_p(P)$ of $S$ are used. $\Box$
The above lemmata imply that the modified PCP $\{(u_1, v_1)\} \cup P$ has a solution if and only if the equation $w_1(u_1, v_1) \equiv w_2$ has a solution modulo $S$. This completes the proof of Theorem 1.
4. Concluding remark
As defined above, the solvability problem for word equations modulo a string-rewriting system $R$ is in fact the word unification problem modulo the system $R$, that is, it corresponds to equational unification modulo the set of equations $R$ plus associativity. If we interpret each symbol $a \in \Sigma$ as a unary function symbol $a(\cdot)$, then the only terms containing variables are those of the form $a_{i_1}(a_{i_2}(\cdots(a_{i_m}(x))\cdots))$, where $a_{i_j} \in \Sigma$, $j = 1, \ldots, m$. Thus, in this case the only equations that we obtain are of the form $u(x) \equiv v(y)$, where $u, v \in \Sigma^*$, and $x$ and $y$ are variables that are not necessarily distinct. However, if $S$ is a finite, special, and confluent string-rewriting system on $\Sigma$, then, given $u, v \in \Sigma^*$, it is decidable whether or not there exists a morphism $\varphi : \{x, y\} \to \Sigma^*$ such that $\varphi(ux) = u\varphi(x) \leftrightarrow^*_S v\varphi(y) = \varphi(vy)$ holds. Thus, unification modulo the term-rewriting system $S' := \{l(x) \to r(x) \mid (l \to r) \in S\}$ is decidable. In fact, if $x = y$, then the set $\{w \in IRR(S) \mid uw \leftrightarrow^*_S vw\}$ of irreducible solutions is a regular language, which can be constructed effectively from $S$, $u$, and $v$ [11]. Analogously, if $x \neq y$, then the set $\{(w_1, w_2) \in IRR(S) \times IRR(S) \mid uw_1 \leftrightarrow^*_S vw_2\}$ is a recognizable subset of $\Sigma^* \times \Sigma^*$, which again can be constructed effectively from $S$, $u$, and $v$. Actually, these observations hold for all finite, monadic, and confluent string-rewriting systems.
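For the case $x = y$, the set of irreducible solutions can at least be explored by brute force on a small example. The Python sketch below enumerates irreducible words up to a length bound; it does not perform the effective construction of the regular language from [11]. The system `ab -> λ`, the strings `u` and `v`, and the function names are assumptions chosen for illustration.

```python
from itertools import product
from typing import List, Tuple

Rule = Tuple[str, str]


def normal_form(word: str, rules: List[Rule]) -> str:
    """Naive reduction; terminates because every rule shortens the word."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word


def irreducible_solutions(u: str, v: str, rules: List[Rule],
                          alphabet: str, max_len: int) -> List[str]:
    """All irreducible w with |w| <= max_len such that uw and vw are congruent.

    Congruence is tested by comparing normal forms, which is sound for a
    length-reducing and confluent system.
    """
    solutions = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if normal_form(w, rules) != w:
                continue  # skip reducible words
            if normal_form(u + w, rules) == normal_form(v + w, rules):
                solutions.append(w)
    return solutions


# Assumed example: the special, confluent system {ab -> λ} with u = "ba", v = "".
S = [("ab", "")]
```

In this example `irreducible_solutions("ba", "", S, "ab", 3)` yields exactly the irreducible words of length at most 3 that begin with `b`, which is consistent with the solution set being a regular language.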
References
[1] S.I. Adian, Defining relations and algorithmic problems for groups and semigroups, Proc. Steklov Inst. Math. 85 (1967).
[2] R.V. Book, Decidable sentences of Church-Rosser congruences, Theoret. Comput. Sci. 24 (1983) 301-312.
[3] R.V. Book, Homogeneous Thue systems and the Church-Rosser property, Discrete Math. 48 (1984) 137-145.
[4] R.V. Book and F. Otto, String-Rewriting Systems (Springer, New York, 1993).
[5] J. Jaffar, Minimal and complete word unification, J. ACM 37 (1990) 47-85.
[6] G.S. Makanin, The problem of solvability of equations in a free semigroup, Math. USSR Sbornik 32 (1977) 129-198.
[7] G.S. Makanin, Equations in a free group, Math. USSR Izvestija 21 (1983) 483-546.
[8] G.S. Makanin and H. Abdulrab, On general solution of word equations, in: Results and Trends in Theoretical Computer Science, Lecture Notes Computer Science 812 (Springer, Berlin, 1994) 251-263.
[9] P. Narendran and F. Otto, Some results on equational unification, in: M.E. Stickel, ed., Proceedings 10th CADE, Lecture Notes Artificial Intelligence 449 (Springer, Berlin, 1990) 276-291.
[10] V. Oleshchuk, Word equations over Thue systems, Talk presented at the Conference on Semigroups, Automata and Languages, Porto, Portugal, June 20-25, 1994.
[11] F. Otto, On two problems related to cancellativity, Semigroup Forum 33 (1986) 331-356.
[12] J.P. Pécuchet, Equations avec Constantes et Algorithme de Makanin, Thèse 3e Cycle, Université de Rouen, France, 1981.
[13] K.U. Schulz, Makanin’s algorithm for word equations - Two improvements and a generalization, in: K.U. Schulz, ed., Word Equations and Related Topics, Proceedings, Lecture Notes Computer Science 572 (Springer, Berlin, 1990) 85-150.
[14] K.U. Schulz, Word unification and transformation of generalized equations, J. Automated Reasoning 11 (1993) 149-184.
[15] J. Siekmann, An introduction to unification theory, in: R.B. Banerji, ed., Formal Techniques in Artificial Intelligence. A Sourcebook (North-Holland, Amsterdam, 1990) 369-424.