ARTIFICIAL INTELLIGENCE


RESEARCH NOTE

A Practically Efficient and Almost Linear Unification Algorithm

Gonzalo Escalada-Imaz and Malik Ghallab

Laboratoire d'Automatique et d'Analyse des Systèmes, C.N.R.S., 7, Av. du Colonel Roche, 31077 Toulouse, France

Recommended by M. Bidoit and A. Martelli

ABSTRACT

An efficient unification algorithm for first-order terms is described. It relies on the known theoretical framework of homogeneous and valid equivalence relations on terms, and it makes use of a union-find algorithmic schema, thus keeping an almost linear worst-case complexity. Its advantages over similar algorithms are a very low overhead and practical efficiency, even for small terms, due to simple data structures and careful design tradeoffs. The proposed algorithm is described in enough detail to be easily implemented. Its main part is proved correct, and its complexity analyzed. Comparative experimental results that support its advantages are given.

1. Introduction

Most knowledge representation formalisms in AI rely on some unification or pattern-matching algorithm. Such an operation is probably the most frequent one in AI software. Except for a few particular cases where a set of patterns can be efficiently compiled (e.g. [4, 5]), unification of first-order terms remains in general a costly operation. Furthermore, it was shown to be inherently sequential [3]: parallelism would not significantly improve it. The standard unification algorithm [11] is widely used because it is simple, easy to understand and teach, readily implemented, and has better practical performance than other algorithms, although its worst-case complexity is exponential. Linear [10] and almost linear algorithms [6, 9] are known. They are however quite involved, and require a fairly large number of data

Artificial Intelligence 36 (1988) 249-263
0004-3702/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

structures, the management of which significantly overloads their average performance for small terms.

The purpose of this paper is to present an efficient unification algorithm. It is based on the theoretical framework developed in [6] and it makes use of a union-find algorithmic schema, thus keeping an almost linear worst-case complexity. Its advantages over similar algorithms are a very low overhead and practical efficiency, even for small terms. Indeed, data structures were chosen so as to minimize initializations, book-keeping and other constant operations. The main tradeoffs in the design of the proposed algorithm were resolved in favor of its average complexity. Relative frequencies of unification success and of the two cases of failure were taken into account. Moreover, the algorithm can take advantage of the semi-unification case (variables in only one term). Experimental results, summarized in Section 5, support the evidence that this algorithm outperforms known algorithms over a broad set of test data. The proposed algorithm needs few data structures: only two pointers per distinct variable. It involves two separate steps, detailed with examples in Sections 2 and 3, so as to be easily implemented.

Some notations and definitions are required. Let {x1, x2, ...} be a set of variables, {a, b, ...} a set of constants, and {f, g, h, ...} a set of k-adic function symbols. A term t is a variable, a constant, or a k-adic function applied to k terms t1, ..., tk, called subterms of t. A substitution σ = {(ti\xi)} is a mapping from some variables to some terms; σ(t) results from the simultaneous substitution in t of all occurrences of each xi of σ by the corresponding ti. σ is a unifier of two given terms t and t' iff σ(t) = σ(t'). If t and t' are unifiable, then there exists a unique most general unifier (mgu) σ solving σ(t) = σ(t'): any other unifier is obtained from σ by further substitution.
The unification problem is to find such an mgu. A convenient concept for solving this problem is that of equivalence relations on a set of terms. Two properties are of interest:

Definition 1.1. A homogeneous equivalence relation is such that:
(1) a variable may be equivalent to any term t;
(2) two non-variable terms t and t' are equivalent iff:
- they correspond to the same constant or function symbol (homogeneity condition); and
- their ith subterms ti and t'i are pairwise equivalent.

Definition 1.2. A valid equivalence relation is a homogeneous relation with a partial order on equivalence classes such that the class of t is before the class of ti whenever ti is a subterm of t.

A common basis to several unification algorithms is the following theorem [6]: t and t' are unifiable iff there exists a valid equivalence relation that makes t equivalent to t'. If such a relation exists, equivalence classes define the mgu σ of


t and t': if x belongs to a class containing only variables, then σ(x) is any element of the class; otherwise σ(x) = σ(ti) for a term ti equivalent to x. The proposed algorithm is naturally decomposed into two steps:

Step 1. Build a homogeneous equivalence relation.
Step 2. Build from Step 1 a valid equivalence relation and obtain the unifier σ.
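To make the notation concrete, the term representation and the application of a substitution can be sketched in Python as follows. The encoding (Var objects for variables, tuples (f, t1, ..., tk) for applications, 0-argument tuples for constants) is an assumption of this sketch, not the paper's own data structures.

```python
from dataclasses import dataclass

# Assumed encoding (not the paper's): a variable is a Var object, a
# constant c is the tuple ("c",), and f(t1, ..., tk) is ("f", t1, ..., tk).

@dataclass(frozen=True)
class Var:
    name: str

def apply_subst(sigma, t):
    """sigma(t): simultaneously replace each variable x of sigma by sigma[x]."""
    if isinstance(t, Var):
        return sigma.get(t, t)          # unbound variables stay as they are
    head, *args = t
    return (head, *(apply_subst(sigma, a) for a in args))

x, y = Var("x"), Var("y")
t = ("f", x, ("g", y))
# {x -> a, y -> x} is applied simultaneously: the x introduced for y is
# not substituted again.
print(apply_subst({x: ("a",), y: x}, t))
```

With this encoding, σ is a unifier of t and t' exactly when apply_subst(σ, t) equals apply_subst(σ, t').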

2. Homogeneous Equivalence Relation (HERE)

To unify t and t' we first make t and t' equivalent and propagate the relation through their corresponding subterms, testing the homogeneity of each class. The procedure HERE carries out this propagation. On input t and t' it stops with the output "clash" iff the homogeneity condition fails at some step.

    HERE(t, t')
      if t and t' are not identical variables or constant symbols then do
        if t is a variable then VAR-HERE(t, t')
        else if t' is a variable then VAR-HERE(t', t)
        else do
          assume t = f(t1, t2, ..., tk), t' = f'(t'1, t'2, ..., t'k)
          if f ≠ f' then exit(clash)
          else for i = 1 to k do HERE(ti, t'i)
    end

The homogeneous equivalence relation, built explicitly by the procedure VAR-HERE, is represented as a directed graph, called Gh, with the following properties:
- nodes in Gh are terms; to each variable in t or t' corresponds one single node;
- each connected component of Gh corresponds to an equivalence class and contains at most one non-variable term;
- for any variable node x in Gh: out-degree(x) ≤ 1; for a non-variable node t: out-degree(t) = 0, in-degree(t) = 1.

Thus, for each variable x only one pointer r(x) is needed. Initially r(x) = nil. Let us first introduce a simplified and incomplete version of procedure VAR-HERE:

    VAR-HERE(u, t)
      if r(u) = nil or u is a marked variable then r(u) ← t
      else if t is a variable and r(t) = nil then r(t) ← u
      else do
        mark u
        HERE(r(u), t)
        unmark u
    end

252

G. E S C A L A D A - I M A Z A N D M. G H A L L A B

These procedures perform mainly two distinct operations:
- find: determine, through recursive calls, to which equivalence class a particular variable belongs; and
- union: define a new class as the union of two equivalence classes.
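The simplified HERE/VAR-HERE pair can be sketched in Python as follows, under the Var/tuple term encoding; the dictionary r holds the r-pointers of Gh and the set marked plays the role of the mark/unmark bookkeeping. This is an illustrative reading of the pseudocode above, not the authors' code (their implementation was in LISP).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

class Clash(Exception):
    """Raised when the homogeneity condition fails."""

def here(t, tp, r, marked):
    if t == tp:                     # identical terms: nothing to propagate
        return
    if isinstance(t, Var):
        var_here(t, tp, r, marked)
    elif isinstance(tp, Var):
        var_here(tp, t, r, marked)
    else:
        if t[0] != tp[0] or len(t) != len(tp):
            raise Clash((t[0], tp[0]))
        for ti, tpi in zip(t[1:], tp[1:]):
            here(ti, tpi, r, marked)

def var_here(u, t, r, marked):
    if r.get(u) is None or u in marked:
        r[u] = t                    # union by a single pointer assignment
    elif isinstance(t, Var) and r.get(t) is None:
        r[t] = u
    else:
        marked.add(u)               # mark u to cut loops during the recursion
        here(r[u], t, r, marked)
        marked.discard(u)

x, y = Var("x"), Var("y")
r, marked = {}, set()
here(("f", x, x), ("f", ("a",), y), r, marked)
print(r)    # x points to the constant a, and y joins x's class
```

Unifying f(a) against f(b) with the same procedures raises Clash, since the homogeneity condition fails on the constants.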

Example 2.1. Let

    t = f(h(x1, x2, x3), h(x6, x7, x8), x3, x6),
    t' = f(h(g(x4, x5), x1, x2), h(x7, x8, x6), g(x5, a), x5).

After processing the first two subterms of t and t', at the beginning of the recursion on HERE(x3, g(x5, a)), the current graph Gh is (arcs are the r-pointers):

    x3 → x2 → x1 → g(x4, x5)
    x6 → x7 → x8

There will be three recursive calls to HERE and VAR-HERE along the path from x3 to g(x4, x5): this is the find operation for x3. The next recursion HERE(g(x4, x5), g(x5, a)) adds to Gh a third connected component:

    x4 → x5 → a

The next call HERE(x6, x5) leads similarly to a recursive traversal of the loop [x6, x7, x8, x6]. At this step x6 is marked: its class does not contain a non-variable term. The next VAR-HERE(x6, x5) performs the union of two classes, giving the final graph Gh with two connected components:

    x3 → x2 → x1 → g(x4, x5)
    x7 → x8 → x6 → x5 → a
    x4

Notice that a loop in Gh cannot appear in a connected component containing a non-variable term. Furthermore, equivalence classes restricted to non-variable terms do not appear in our representation (e.g. h(x6, x7, x8) and h(x7, x8, x6)). Once the homogeneity condition is checked, such classes are irrelevant for the rest of the unification algorithm. Moreover, since there is at most one non-variable term per class, homogeneous subterms are similarly implicit (e.g. g(x5, a)). Procedure HERE is in fact a direct translation of Definition 1.1.


Let Nu and Nf be respectively the number of union and the number of find operations performed on some terms. In the previous example Nu = 9 (8 unions were just the addition of a variable to a class) and Nf = 2. In general Nu ≥ Nf. This justifies the chosen representation for the equivalence relation: a union is carried out by one single pointer assignment. However, find operations remain costly in the simplified version given above; let us improve it.

The best union-find algorithm known relies on the so-called collapse and weight rules [1]. Classes are represented as trees. Any path of length n traversed during a find is "collapsed" to one node with n successors, thus simplifying further finds. The union of two classes is done by making the root of the smaller class a successor node of the root of the larger class. The weight rule, however, involves two costly find operations (on both variables, e.g. for HERE(xi, xj)), whereas in most cases none or only one would be necessary. This, in addition to the load due to extra data structures, makes the weight rule practically inefficient for our purpose. We propose an adaptation of a union-find procedure with a strategy that minimizes the number Nf and relies only on the collapse rule, thus warranting an almost linear complexity. For that purpose two procedures will be helpful:

    source(x)
      let path ← x.nil
      while r(x) is a variable do
        z ← r(x)
        path ← z.path
        r(x) ← done
        x ← z
      end
      if r(x) = done then z.path ← path
      return(path)

    collapse(var-list, v)
      for all x in var-list do r(x) ← v
    end

source(x) traverses the path up from x, to the last variable v in this path if it is loop-free, or up to the first variable to appear twice on the path. Variables along the path are put in a list and marked "done" using their r-pointers. The dot notation stands for the concatenation operator, used as a list constructor or selector. Notice that for a loop-free path the source variable v is not marked, thus preserving r(v). collapse(var-list, v) directs the r-pointers of all variables in var-list to a source variable v. The complete version of procedure VAR-HERE is:


    VAR-HERE(u, t)
      if r(u) = nil then r(u) ← t
      else if t is not a variable then do
        let v.path ← source(u)
        collapse(path, v)
        if r(v) = nil then r(v) ← t
        else RECUR-HERE(v, r(v), t)
      else do
        if r(t) = nil then r(t) ← u
        else do
          let v.path ← source(u)
          if r(v) = nil or r(v) = done then collapse(v.path, t)
          else do
            let w.path2 ← source(t)
            if v = w then collapse(path.path2, v)
            else do
              z ← r(w)
              collapse(w.path.path2, v)
              if z ≠ nil and z ≠ done then RECUR-HERE(v, r(v), z)
    end

    RECUR-HERE(v, y, t)
      r(v) ← loop
      HERE(y, t)
      r(v) ← y
    end
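For comparison with the procedures above, the underlying union-find schema restricted to the collapse rule (path compression without the weight rule) can be sketched generically in Python; the dictionary r stands for the r-pointers, with nil represented by an absent key. This is a generic sketch of the schema, not a transcription of source and collapse, which additionally detect loops among variables and mark them "done".

```python
def find(r, x):
    """Follow r-pointers to the source of x's class, collapsing the path."""
    path = []
    while r.get(x) is not None:
        path.append(x)
        x = r[x]
    for y in path:                 # collapse rule: redirect the whole path
        r[y] = x
    return x

def union(r, x, y):
    """Merge the classes of x and y with a single pointer assignment."""
    rx, ry = find(r, x), find(r, y)
    if rx != ry:
        r[rx] = ry

r = {}
union(r, "x1", "x2")
union(r, "x2", "x3")
print(find(r, "x1") == find(r, "x3"))   # True: one class, flat after collapsing
```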

In all cases where the union operation cannot be carried out in a straightforward step, VAR-HERE performs a find and a collapse. A second find is performed only when it cannot be avoided: for the union of two classes where the first one contains a non-variable term and the second one is not a singleton. (The finds are also avoided when one of the two terms is a variable that has no predecessors in Gh.) In this case the collapse is global. A recursive call to HERE is necessary only when both classes contain a non-variable term. This call is done through procedure RECUR-HERE, which marks the source variable v, as in the simplified version of VAR-HERE (to avoid loops in singular cases due to the fact that recursive calls are not always restricted to follow graph Gh). The graph Gh built for the above example is:

    x2 → x1 → g(x4, x5)
    x3 → x1

    x4 → x5 → a
    x6 → x5
    x7 → x5
    x8 → x5

3. Valid Equivalence Relation (VERE)

The second and final step of the proposed algorithm consists in checking the validity of the homogeneous equivalence relation built in Step 1, and in defining explicitly the unifier. Validity will be checked on a graph Gv, obtained from Gh in the following way:
- Gv has the same set of nodes as Gh; and
- for each node in Gv labeled by a function term we add an arc to each variable x appearing at any depth in this term.

Thus two arcs will be added in the above graph: from g(x4, x5) to x4 and to x5.

Lemma 3.1. There is a valid equivalence relation between t and t' iff any loop in the corresponding graph Gv contains only variable terms.

Proof. If such a relation exists, then there is a homogeneous equivalence relation such that its set of classes is partially ordered by the term-inclusion order; this set corresponds to a loop-free graph. Thus graph Gv is either loop-free or every loop must be restricted to a connected component of Gh (i.e. an equivalence class) and consequently contains only variable terms. Conversely, if Gv meets the lemma condition, then the connected components of Gh can be partially ordered by the term-inclusion order, and hence classes in the corresponding homogeneous equivalence relation are similarly ordered. □

Procedure VERE checks the lemma condition and at the same time defines the unifier. For that, a second data structure is needed: a pointer s(x) gives for each variable x the term to be substituted for x. Initially s(x) = nil, and it remains nil if x is not in the unifier.

    VERE(u)
      let v.path ← source(u)
      if r(v) = nil then do                          ;case 1
        z ← v
        r(v) ← done
      else if r(v) is a constant term then do        ;case 2
        s(v) ← z ← r(v)
        r(v) ← done
      else if r(v) is a function term then do        ;case 3
        y ← r(v)
        r(v) ← done
        for all x in v.path do s(x) ← loop
        s(v) ← z ← TERM-VERE(y)
      else do                                        ;case 4
        if s(v) = nil then z ← v
        else if s(v) = loop then exit(loop)
        else z ← s(v)
      for all x in path do s(x) ← z
      return(z)


The graph Gv is not built explicitly but results instead from a depth-first search by procedure TERM-VERE on the function terms of graph Gh.

    TERM-VERE(t)
      assume t = f(t1, ..., tk)
      for i = 1 to k do
        if ti is a variable then do
          if r(ti) = done then do
            if s(ti) = loop then exit(loop)
            else replace in t the subterm ti by s(ti)
          else if r(ti) ≠ nil then replace ti by VERE(ti)
        else if ti is a function term then replace ti by TERM-VERE(ti)
      end
      return(t)

Given a variable u as input, VERE(u) traverses Gh from u up to the source variable v, marking variables along this path "done". The four possible cases are summarized below:

    Case 1. u → ... → w → v → nil.
    Case 2. u → ... → w → v → constant.
    Case 3. u → ... → w → v → f(t1, ..., tk).
    Case 4. u → ... → w → v → done.

In Case 1 (respectively Case 2), v (respectively r(v)) is a common substitution for all variables on the path. In Case 3 the subterms t1 to tk are successively examined:
- if ti is a function term, then TERM-VERE proceeds to a recursive search;
- if ti is a variable term not yet processed, then a forbidden loop may exist: variables on the path from u to v, including v, are marked "loop" using their s-pointers, and a recursive call to VERE(ti) is carried out;
- if ti is a variable already processed (r(ti) = done), then either we are on a forbidden loop (s(ti) = loop) or s(ti) has been safely computed and should be substituted for ti in the common substitution of all variables on the path.

In Case 4 a path leading to v has already been processed; there is a forbidden loop if s(v) = loop, otherwise the common substitution for the variables of the path is either s(v), or v if s(v) = nil. The above analysis shows that procedure VERE is in accordance with Definition 1.2, and proves its correctness.

In the previous example a call VERE(x2) will give the following sequence: x2 → x1 → g(x4, x5): TERM-VERE issues a recursive call on x4.


x4 → x5 → a: x4 and x5 are marked "done" and have their s-pointers set to a; no second recursion on x5 is necessary. Finally x2 and x1 are also marked "done" and their common substitution defined as g(a, a).

Procedure VERE can be improved in several ways, for example:
(i) Collapse operations carried out during the first step can easily be remembered (a list of all collapsed paths) and taken advantage of during the second step. VERE will be called with two arguments: a variable u and possibly a list of variables collapsed on u; those variables will be substituted by the same term as u, hence avoiding their individual processing.
(ii) Calls to VERE should be avoided when the length of the path from u to v is less than 2: a few simpler instructions suffice in this case. In particular, whenever one of the two initial terms does not contain variables (the graph Gh is a forest and the roots of its trees are the terms of the unifier), the behavior of the algorithm is close to that of a pattern-matching algorithm.
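The combined effect of VERE and TERM-VERE — walking r-pointer chains, marking variables "in progress" to detect forbidden loops, and caching the computed substitution in the s-pointers — can be sketched in Python as below. This is a simplified recursive reading under the Var/tuple encoding, not the paper's procedures; the Loop sentinel plays the role of the "loop" mark on s-pointers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

class Loop(Exception):
    """Raised when a forbidden loop (cycle through a function term) is found."""

def resolve(t, r, s):
    """Return the fully substituted term for t; s caches results per variable."""
    if isinstance(t, Var):
        if t in s:
            if s[t] is Loop:         # t is still "in progress": forbidden loop
                raise Loop(t)
            return s[t]
        s[t] = Loop                  # mark t "in progress" (the loop mark)
        nxt = r.get(t)
        res = t if nxt is None else resolve(nxt, r, s)
        s[t] = res                   # mark t "done" with its substitution
        return res
    head, *args = t
    return (head, *(resolve(a, r, s) for a in args))

x, y = Var("x"), Var("y")
print(resolve(x, {x: ("g", y), y: ("a",)}, {}))   # ('g', ('a',))
```

On x → g(x) the same walk re-enters x while it is still marked "in progress" and raises Loop, which is exactly the forbidden-loop case of the lemma.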

4. Analysis of the Algorithm

The complete unification algorithm consists in calling HERE(t, t') once and, if it succeeds, calling VERE(u) for all variables in the two terms (starting preferably with variables on which a collapse has been achieved). The complexity of the first step is that of a union-find algorithm using only the collapse rule, that is O(Nu + Nf · C(Nu, Nf)), where [13]

    C(Nu, Nf) = max{1, log(Nu²/Nf)/log(2Nf/Nu)}.

In our case, the numbers of union and find operations are bounded by Nu ≤ ... and Nf ≤ ..., where d is the number of distinct variables in t and t' and m is the total number of variable occurrences. In general Nf < Nu. Thus procedure HERE is linear in O(Nu + Nf) except for the case where ½Nu < Nf < ... In fact the nonlinearity conditions are much more stringent: C(Nu, Nf) is a large upper bound, and for a significant number of the examples studied we had the inequalities Nf ≤ ½Nu and Nf ≤ ...(m − d). Notice that the number of calls to HERE is linear in the total number of subterms involved in t and t'. Indeed, for classes with non-variable subterms the union operation is followed by the homogeneity test:
- two given subterms are examined only once (because of the marking in RECUR-HERE); and
- only one subterm is kept after each such test.

The second step tests the acyclicity of graph Gv and generates explicitly the unifier. It is linear in the number of arcs in Gv: each variable node visited is


marked "done" and not revisited; a non-variable node is visited only once since its in-degree is 1. Graph Gv has at most one arc per distinct variable, plus k arcs per equivalence class, k being the arity of the only non-variable term in this class.
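Putting the two steps together, a self-contained end-to-end sketch of the algorithm's structure (Step 1: homogeneous relation with clash detection; Step 2: validity check and explicit unifier) might look as follows. It follows the simplified procedures, without the collapse-rule machinery, and the term encoding and all names are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

class Clash(Exception): pass   # homogeneity failure
class Loop(Exception): pass    # forbidden cycle

def unify(t, tp):
    """Two-step sketch: build the r-pointer graph, then check validity
    and extract the most general unifier as a dict."""
    r, s = {}, {}

    def here(a, b, marked=frozenset()):
        if a == b:
            return
        if isinstance(a, Var):
            var_here(a, b, marked)
        elif isinstance(b, Var):
            var_here(b, a, marked)
        else:
            if a[0] != b[0] or len(a) != len(b):
                raise Clash((a[0], b[0]))
            for ai, bi in zip(a[1:], b[1:]):
                here(ai, bi, marked)

    def var_here(u, t, marked):
        if r.get(u) is None or u in marked:
            r[u] = t
        elif isinstance(t, Var) and r.get(t) is None:
            r[t] = u
        else:
            here(r[u], t, marked | {u})   # mark u while recursing

    def resolve(t):
        if isinstance(t, Var):
            if t in s:
                if s[t] is Loop:
                    raise Loop(t)
                return s[t]
            s[t] = Loop
            res = t if r.get(t) is None else resolve(r[t])
            s[t] = res
            return res
        return (t[0], *(resolve(a) for a in t[1:]))

    here(t, tp)                     # Step 1: may raise Clash
    for v in list(r):               # Step 2: may raise Loop
        resolve(v)
    return {v: w for v, w in s.items() if w != v}

x, y = Var("x"), Var("y")
print(unify(("f", x, y), ("f", ("g", ("a",)), ("b",))))
```

As in the algorithm proper, the two failure cases are kept separate: a clash surfaces during Step 1, before any cycle-checking work is done, while a cycle only surfaces in Step 2.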

5. Comparisons and Experimental Results

A detailed analysis of the number of operations performed enabled us to compare the proposed algorithm favorably to the six best known algorithms [2, 6, 9, 10-12]. This section summarizes the comparisons, together with empirical results that support the conclusion.

The algorithm in [11] is the standard and most widely implemented unification procedure. Its practical performance for small terms is due to the use of simple data structures (only one needs to be initialized). However, its worst-case complexity is exponential. Contrary to what has been believed, this algorithm remains exponential even without the acyclicity test (PROLOG "occur check") [2]. It involves several costly and redundant operations [6, 12]. Improvements proposed in [2, 12] make use of more data structures, at the cost of the practical simplicity that is the main interest of this algorithm. (In [2] the asymptotic worst-case complexity is lowered to that of a quadratic algorithm.)

The algorithm in [10] proved that unification is a linear problem. To warrant strict linearity, complex data structures were used: terms are dags, and five pointers per variable are needed. The algorithm processes the dags through costly operations: terms are considered of depth one, and deeper terms are reached step by step. Furthermore, clash detection is closely related to cycle checking (cycles are very rare): if a clash is detected, unnecessary operations for cycle checking have been performed. The algorithm in [10], the only one of strict linearity, is mainly of theoretical interest.

The algorithm in [6] is an almost linear three-step procedure. Step 1 obtains every distinct subterm at any depth and puts them as nodes in a graph G similar to, but much larger than, Gh (it contains all homogeneous classes completely and explicitly). Step 2 checks the existence of a homogeneous relation and relies on the union-find algorithm with the collapse and weight rules. Therefore, as was mentioned, it needs a larger number of finds and unions (with additional data structures for the weights) than our algorithm. Step 3 is a topological sorting algorithm [7] that tests the acyclicity condition. The proposed procedure VERE performs this test more efficiently. Notice that our algorithm does not require a step corresponding to Step 1.

In the algorithm in [9] each class is represented as a list of variables and a list of non-variable terms. When a union of two classes is carried out, a class pointer for every term in one of the classes must be updated. This operation may have an O(n²) worst-case complexity. In order to reduce it to O(n log n), it is proposed in [9] to move class pointers to the largest class: this requires


additional data structures and overhead. The acyclicity condition is checked by repeatedly finding the root class of the partial order and removing it. But extracting explicitly the root class of the term-inclusion order is a costly operation that requires fairly complex data structures. Furthermore, if a clash failure is found, some overhead to detect cycles has been unnecessarily paid for, as in [10].

Of the known polynomial algorithms, the one in [9], although O(n log n), was shown to have the best practical performance. We implemented this algorithm and our algorithm, together with the standard algorithm, in a LISP environment with fairly similar term representations. Extensive tests on published examples were carried out. We summarize below the running time performances of the proposed algorithm (noted EG) compared to those of [9] (noted MM), as ratios to the running time of the standard algorithm (to make the figures independent of the programming language and the computer, and thus more useful to the reader for further comparisons).

Result 1. Average ratios for two sets (S1 and S2 in Appendix A) of unifiable couples of terms typically found in the literature are given below:

          S1     S2
    EG   0.51   0.25
    MM   1.3    0.74

Result 2. Compared performances for typical theorem proving applications are summarized below. The first column (S3) corresponds to a geometry problem [8], S4 to an algebra problem [8], and S5 to the same problem without variable renaming (to introduce some failures due to cycles). The figures given are the running time ratios averaged over all possible calls of the algorithms for each corresponding set of terms (see Appendix A). Other applications were tested, with similar results.

          S3     S4     S5
    EG   0.6    0.57   0.59
    MM   3.2    2.4    2.2

Result 3. Terms of variable arity n (from 1 to 20) are considered here. The algorithms were run on sets of unifiable couples (given in Appendix A). The second set corresponds to a case for which the standard algorithm is exponential (measures were stopped at n = 6).

    n         1     2     3     4     5     6
    S6  EG   1.2   0.85  0.6   0.49  0.42  0.36
        MM   5.3   2.5   1.8   1.4   1.2   1
    S7  EG   1.1   0.75  0.3   0.15  0.07  0.03
        MM   6.7   2.8   1     0.46  0.17  0.07
    S8  EG   1.2   0.83  0.58  0.45  0.4   0.35
        MM   5.1   2.5   1.8   1.3   1.1   0.96

We did not carry out an implementation of the algorithm in [2] and extensive comparisons to this quadratic algorithm. However, relying on the only figures given in [2] about the running times of the algorithms in [2] and [9] for two examples, S7 and a second similar example (on which the algorithm of [11] is exponential), we can reach the following conclusions. Let CB/MM be the ratio of the running time of the algorithm in [2] to that of the algorithm in [9]:
- For example S7, CB/MM ≈ 0.5 for n = 2, CB/MM ≈ 1 for n = 4, and CB/MM > 1 for n > 14. For the second example CB/MM > 1 also for n > 20, although CB/MM ≈ 0.6 for n = 4.
- In our case, for example S7 we start at EG/MM ≈ 0.16 and the ratio remains < 0.5 for large n on this and other examples.

In addition to this partial comparison we should also note, in favor of our algorithm, that the algorithm in [2] takes as input a dag (preprocessed terms) and gives as output a graph that must be traversed to make the unifier explicit.

In conclusion, the proposed algorithm presents some interesting properties:
(1) Its worst-case complexity is almost linear.
(2) It requires only two simple data structures (pointers) per distinct variable.
(3) It processes separately and efficiently the two failure cases: first the clash situation (the "clash" exit in HERE), then the cycle case (the "loop" exits in VERE and TERM-VERE). Indeed, in practical applications clashes happen much more frequently than failures due to cycles. In some particular cases (such as f(x, x) and f(g(x), g(g(x)))) cycles are detected earlier, in HERE.
(4) It takes advantage of the semi-unification (pattern-matching) case, for which it behaves as a simple pattern-matching algorithm.
(5) For a significant set of test data it outperforms the standard algorithm as well as the polynomial algorithm that was known to have the best practical performance.
(6) It gives as output the explicit unifier, which is required in typical applications.

Since this algorithm is easily implemented, it should be a good substitute for the standard algorithm.


Appendix A

Set S1:
    t = f(x1, g(x2, x3), x2, b), t' = f(g(h(a, x5), x2), x1, h(a, x4), x4)   from [9];
    t = f(x, f(., x)), t' = f(f(y, a), f(z, f(b, z)))                        from [6];
    t = A(B(v), C(u, v)), t' = A(B(w), C(w, D(x, y)))                        from [10];
    t = q(x1, g(x1), x2, h(x1, x2), x3, k(x1, x2, x3)),
    t' = q(y1, y2, c(y2), y3, f(y2, y3), y4)                                 from [11];
    t = p(f(x, g(x, y)), h(z, y)), t' = p(z, h(f(u, v), g(d, c)))            from [12].

Set S2:
    t = f(x1, x2, x3, x4, x5, x6, x7, x8, x9), t' = f(x2, x3, x4, x5, x6, x7, x8, x9, a);
    t = p(w, y, x, u), t' = p(g(y, y, y), g(x, x, x), g(u, u, u), g(a, a, a));
    t = p(x1, x3, g(x2), c, h(x3, x4, f(x4, b)), x4), t' = p(f(x3, y1), b, y1, x2, y2, l(y1, b));
    t = p(x, y, a, b, x4, x5), t' = p(f(y), x, x2, x3, x3, x2);
    t = p(x, a, b, c, a, y), t' = p(b, x, y, c, a, z);
    t = p(x1, x2, x3, x4, x5, f(y), x7, x8, x9, x10, f(a), x7),
    t' = p(x2, x3, x4, x5, x6, x6, x8, x9, x10, x11, x11, x1).

Set S3 (geometry problem):
    {s(u1, v1, x1, y1), s(c, a, b, c), s(u2, v2, x2, y2), s(b, c, c, a),
     s(u3, v3, x3, y3), s(a, c, b, c), te(x4, y4, z4), te(a, b, c),
     te(x5, y5, z5), te(a, b, c), c(u6, v6, w6, x6, y6, z6), c(a, c, b, b, c, a),
     te(x7, y7, z7), te(b, c, a), te(u8, v8, w8), te(a, c, b),
     s(a, b, b, a), s(x9, y9, y9, x9), s(c, b, c, a), s(c, b, c, a),
     s(a, c, b, c), s(a, c, b, c), A(c, a, b, c, b, a), A(c, a, b, c, b, a)}.

Set S4 (algebra problem):
    {p(x1, h(x1), h(x1)), p(e, y2, y2), p(k(e), z, e), p(u3, z3, w3),
     p(x4, v4, e), p(g(y5), y5, e), p(y6, z6, v6), p(e, y7, y7),
     p(g(v8), e, k(e)), p(x9, v9, w9), p(u10, z10, k(e)), p(e, y11, y11),
     p(y12, k(e), e), p(g(y13), y13, e), p(g(v14), g(k(e)), e), p(g(y15), y15, e)}.

Set S5 (algebra problem without variable renaming):
    {p(x, h(x), h(x)), p(e, y, y), p(k(e), z, e), p(u, z, w),
     p(x, v, e), p(g(y), y, e), p(y, z, v), p(e, y, y),
     p(g(v), e, k(e)), p(x, v, w), p(u, z, k(e)), p(e, y, y),
     p(y, k(e), e), p(g(y), y, e), p(g(v), g(k(e)), e), p(g(y), y, e)}.

Set S6:
    t = f(xn, xn-1, ..., x1),
    t' = f(xn-1, ..., x1, a).

Set S7:
    t = f(xn, xn-1, ..., x1),
    t' = f(g(xn+1, xn+1, xn+1), ..., g(x2, x2, x2)).

Set S8:
    t = f(x1, x2, ..., xn), t' = f(a, x1, ..., xn-1);
    t = f(x1, x2, ..., xn), t' = f(g(x2, x2, x2), g(x3, x3, x3), ..., g(xn+1, xn+1, xn+1));
    t = f(x, x, ..., x), t' = f(xn, xn-1, ..., x1, a);
    t = f(x1, x2, ..., xn-1, xn, x1), t' = f(a, x1, x2, ..., xn-2, xn-1, a).

ACKNOWLEDGMENT

This work was supported by the French research program GRECO/PRC-IA and by a fellowship from the Basque Government Eusko Jaurlaritza.

REFERENCES

1. Aho, A.V., Hopcroft, J.E. and Ullman, J.D., The Design and Analysis of Computer Algorithms (Addison-Wesley, Reading, MA, 1974).
2. Corbin, J. and Bidoit, M., A rehabilitation of Robinson's unification algorithm, in: R.E.A. Mason (Ed.), Information Processing 83 (North-Holland, Amsterdam, 1983) 909-914.
3. Dwork, C., Kanellakis, P.C. and Mitchell, J.C., On the sequential nature of unification, Memo LCS, TM-257, MIT, Cambridge, MA (1984).
4. Forgy, C.L., RETE: A fast algorithm for the many pattern/many object pattern match problem, Artificial Intelligence 19 (1982) 17-37.
5. Ghallab, M., Decision trees for optimizing pattern-matching algorithms in production systems, in: Proceedings IJCAI-81, Vancouver, BC (1981) 310-312.
6. Huet, G.P., A unification algorithm for typed lambda-calculus, Theor. Comput. Sci. 1 (1975) 27-57.
7. Knuth, D.E., The Art of Computer Programming 1: Fundamental Algorithms (Addison-Wesley, Reading, MA, 1969).
8. Loveland, D.W., Automated Theorem Proving: A Logical Basis (North-Holland, Amsterdam, 1978).


9. Martelli, A. and Montanari, U., An efficient unification algorithm, ACM Trans. Program. Lang. Syst. 4 (2) (1982) 258-282.
10. Paterson, M.S. and Wegman, M.N., Linear unification, J. Comput. Syst. Sci. 16 (1978) 158-167.
11. Robinson, J.A., A machine oriented logic based on the resolution principle, J. ACM 12 (1) (1965) 23-41.
12. Robinson, J.A., Computational logic: The unification computation, in: B. Meltzer and D. Michie (Eds.), Machine Intelligence 6 (Edinburgh University Press, Edinburgh, 1971).
13. Tarjan, R.E., Efficiency of a good but not linear set union algorithm, J. ACM 22 (2) (1975) 215-225.

Received August 1987; revised version received April 1988