Lower bounds on type checking overloading

Lower bounds on type checking overloading

Information Processing EUEVIER Information Processing Letters 57 ( 1996) Letters 9-l 3 Lower bounds on type checking overloading ’ Dennis M. Vo...

439KB Sizes 4 Downloads 119 Views

Information Processing EUEVIER

Information Processing Letters 57

( 1996)

Letters

9-l 3

Lower bounds on type checking overloading



Dennis M. Volpano* Department

ofComputer

Science, Naval Postgraduare

School, Monterey,

CA 93943,

USA

Received 24 August 1995; revised 24 October 1995 Communicated by D. Gries

Abstract Smith has proposed an elegant extension of the ML type system for polymorphic functional languages with overloading. Type inference in his system requires solving a satisfiability problem that is undecidable if no restrictions are imposed on overloading. We explore the effect of recursion and the structure of type assumptions in overloadings on the problem’s complexity.

Keywords:

Type theory; Overloading; Computational complexity

vu,, . . , ~,withxI:rl

,...,

x,,:r,.r

1. Introduction

u ::=

The ML type system [ 6,1] has been studied extensively and has some well-known limitations. One practical limitation is that it prohibits overloading by allowing no more than one assumption per identifier in a type context. In an attempt to overcome this limitation, Smith gives an elegant type system that merges MLstyle polymorphism and overloading [ 1O-l 21. The system has the usual set of unquantified types given by

The set {a,, . . . , a,,} is the set of quantiJied variables Of(T,{X, :71,..., Xm : 7,) the set of constraints of CT,and T the body of CT.If there are no constraints, the with portion of the type scheme is omitted.

7 ::= ff / 7 * 7’ / x(71,. . . ) 7,) where x is an n-ary type constructor, like int or matrix, and (Yis a type variable. The set of quantified types, or type schemes, is given by ’ This

material is based upon activities supported by the National

Science Foundation

under Agreement

No. CCR-9400592.

Any

opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. * Email: [email protected].

0020-0190/96/$12.00

@

SSDlOO20-0190(95)00184-O

The type inference rules are given in Fig. 1. We can prove typing judgements of the form A I- e : CTwhere A is a finite set of assumptions of the form x : u, called a type context. A type context A may contain more than one typing for an identifier x; in this case we say that x is overloaded in A. We use the notation A t C to represent A l- x : T, for all x : 7 in C, and let IAl denote the number of assumptions in type context A. Observe that when C is empty the (Y-intro) and (Velim) rules reduce respectively to type generalization and specialization in the type system of Damas and Milner [ 11. The second hypothesis of the (V-intro) rule says that a constraint set can be moved into a type scheme only if the constraint set is satisfiable; the typing A E e : VG with C . T cannot be derived unless we know that there is some way of instantiating the

1996 Elsevier Science B.V. All rights reserved

IO

D.M.

Volpanoi Injhnation

Processing

Letters 57 (1996)

9-13

( hypoth) (A-intro)

A U {x : T} k e A t- Ax. e

: 7’ : T --t r’

(x

does not occur in A)

(x

does not occur in A)

(---elim)

A t e d

(let)

: r’

A b e

: (I : CT}k e’ : 7 A k let x = e in e’ : 7 A u {.I

(V-intro)

AUCte:r

A 1 C[CU .= fr] A t e (V-elim)

: V’a with

C ,r

(Cunot

free in A)

Ake:VbwithC.T

A t C[& := ir] A k e : ~[a. := 5-1 (-0)

Fig. 1. Typing rules with overloading.

c so that C is satisfied. This leads to the following problem: Definition 1. CS-SAT is the problem of given a constraint set C and a type context A, is there a substitution S such that A t- CS?

For example, the type context in Fig. 2 is recursive due to the assumptions for * and +. As a result of recursion, we have infinitely-many types at which * and + have instances.

2. Lower bounds In practice, we would expect a type inference algorithm to generate C by specializing the constraints of type schemes in A. So from now on, we assume that constraints in C can be formed by specializing constraints in A. CS-SAT is undecidable as was shown by Smith [ IO]. Type schemes in this system permit very expressive type contexts to be constructed, some of which may be recursive in the following sense. Definition 2. Let S = {x,_y, . . .} be the set of identifiers with assumptions in a type context A. Then the dependency relation of A is a binary relation R on S such that x R y iff x : u E A and y : T is a constraint of u (x and y are not necessarily distinct). Definition 3. A type context A with dependency lation R is recursive iff 3x. x R~+x.

re-

Although CS-SAT is decidable when recursion is prohibited, it remains hard if the structure of type assumptions is not restricted, requiring nondeterministic exponential time. Theorem 4. CS-SAT is NEXPTIME nonrecursive type contexts.

complete for

Proof. Given a nonrecursive type context A and a constraint set C, construct a set E of equations as follows. For each constraint x : r E C, nondeterministically choose an assumption x : ‘v’& with C’ . r’ in A, add to E equation 7 = #[(Y := 71, where 7 is new, and add the members of C’[ E := 71 to C. Then E is unifiable iff the original set C is satisfiable with respect to A. The size of E is at most exponential in IAl. Whether E is unifiable can be decided in linear time so C&SAT

D.M.

Information ProcessingLetrers 57 (1996) 9-13

Volpanol

+ : real -+ real ---) real

+:tJawith+:a+a+a. matrix( cu) 4 * : int +

int 4

matrix( cr) -+ mafrix( a)

int

* : real --r real -b real *:Vawith+:a+a+a, matrix( LY)+ matrix(a) Fig.

2. A recursive

*:cy+a-+a. + matrix(a)

type context.

is in NEXPTIME. For hardness, let B={

C,‘(“)$I : E

---$

40

--f 01 + u2 ---f “.

+ a, ---f &}

in the PTIME reduction of Theorem 4.2 in [ 131; a, is an input string x to a nondeterministic Turing machine M of exponential time complexity. Then B is satisfiable under A., iff M accepts x. 0 alal.

Next we consider the complexity of CS-SATwith the style of overloading permitted in the lazy functional programming language Haskell [ 41. An identifier may be overloaded in Haskell, but only through multiple instance declarations, which give rise to what we call a Haskell type context. A Haskell type context is very structured. Definition 5. Suppose X is the set of identifiers of a nonempty type context H. Then H is a Haskell type context if for each x E X, there is a type 7, with exactly one free type variable (y,, possibly occurring more than once, such that all assumptions for x in H are described by the set x: ‘dyl with CL .T,[cY, :=,yl(rl)]

If x has only one assumption (n = I ), then it is, technically speaking, not overloaded, nevertheless it may appear in a constraint. The body of a type scheme in an assumption must be formed by specializing a type with one level of type structure, or in other words, by a single type constructor x, parameterized perhaps by one or more type variables. Further, in a constraint .r : p, it must be possible to form p by merely renaming the free type variable of r,., the type used in forming all assumptions for J. The recursive type context of Fig. 2 is an example of a Haskell type context. Consider the assumptions for +. The bodies of its type schemes are formed by instantiating (Yof the type LY----tLY4 LYwith real and matrix(cYj,

whereto>

1,x/

?’ : p E c;

# x;fori

:=,yn(YIl)]

I3

# j,and

implies

?‘~X,p=~,.[cu~:=y],andy~y;. If x is overloaded (tz > 1) then the type 7, used in forming the assumptions for x is the least common generalization [ 81 of the types T,( cy, :=

Xl(7l)l,...,~,l~.,

:=xn(7t*)l

so 5-+ = a ---f (Y -+ ff.

Regular tree languages recognized by DR tree automata characterize the types of identifiers in Haskell type contexts. Formally, a k-ary, Z-valued tree is a mapping t : dom( t) + Z where &m(t) C { 1,. . , k}” is a nonempty set and closed under prefixes. We can assume for our purposes that 2 is a ranked alphabet of type constructors xi, . . . , ,yn. We let FL denote the set of all finite Z-trees. A deterministic root-to-frontier (DR) Z-tree automaton M is a pair (A, S) such that 1. A is a finite DR Z-algebra ( A, 2)) and 2. S E A is the initial state. A DR Z-algebra is a pair (A, 2) where A is a nonempty set of states and, for every cr E & with m > 0 there is a partial transition function gd : A + A”‘. If (T E 20, then crd 2 A [3]. A run of M on a tree t is a mapping g : dom( t) ----t A such that g(c) = S, and if t(w) = (T, CJE Znl for m > 0, and g( \i,) = a, then ad(a)

( x: ‘dy,, with Clr.7,[a.,

II

= (g(wl),g(w2)

,...,

g(wm)).

A rung of M on a tree t is accepting if g( 1~) t t( w)~ for every node it in the frontier of t. The set of all trees on which M has accepting runs is the language ofM, written T(M). Deciding whether the intersection of a sequence of DR tree automata is nonempty is EXPTIME complete. Hardness can be proved by a log-space generic reduction from polynomial-space bounded alternating Turing machines ( ATMs) [ 91. The lower bound on CS-SATfor Haskell type contexts follows directly from this result while the upper bound follows from the following lemma:

12

D.M.

Volpanoi Information

Processing Letters 57 (1996)

9-13

Lemma 6. Suppose H is a Haskell type context. If x is an identi’er of H then a DR tree automaton M., can be constructed such that

intersect by checking the emptiness of automaton

T(M.,)

This can be done in PTIME [ 21. Then accept iff this intersection is nonempty for all type variables in C. Hardness follows from a log-space reduction from the DR tree automaton intersection problem, Given a sequence of tree aUtOmata MI, Mz, . . . , Mk, where M; = (A;, S;)and d; is the Z-algebra (A;, Z), we construct a constraint set C and a Haskell type context H such that C is satisfiable with respect to H iff ni”_, T( Mi) is nonempty. Assume the sets of states A,,... , Ak are disjoint. For all 1 < i ,< k add to H a set of assumptions for M; as follows. If aA’ = (al,. . , a,,,) , for u E Z,,,, then add to H,

= {v 1H t- x: ~_,-[a.\-:= n-l}.

Further, such automata can be constructed for all identijers of H in time exponential (space polynomial) in IHI.

Proof. Given a Haskell type context H, let X be the set of identifiers of H. Let A = p(X) and 2 be the ranked alphabet of type constructors in H. Define M, = (A,{x}), where A = (A, 2>, and A is initially constructed so that T(d,{ }) = Fz. Then extend A with the following transitions. For every r 2 X, let r E xd iff all identifiers in r have an instance assumption in H at nullary type constructor x. Let xd(r) = rk) iff all have an instance at type construc(rl,..., tor x((Y,, . . . , ak), for k > 0. Set rj, for 1 < j 5 k, contains an identifier, say z, iff there is an identifier in r whose instance assumption at x has a constraint on z involving aj, There are 21’1subsets to consider, each of which can be stored in at most 1x1 space. Cl The following theorem was proved by Seidl /9] in the framework of Nipkow and Prehofer’s type system (71. A proof is given here for Smith’s system. Theorem 7. CS-SAT is EXPTIME Haskell type contexts.

complete for

Proof. Let H be a Haskell type context and C a constraint set. First construct a DR tree automaton (A, {x}) for each identifier x of H. By Lemma 6, this can be done in exponential time. Suppose that for each constraint x : p E C, p = ~,[a., := n-1, for some type 7~.Let g be a run of (A, {x}) on 71.If there is a node w E dam(r) such that V(W) = x,,,, m 3 0, and g(w) is undefined, then reject. If IThas type variables, then associate a DR tree automaton with every occurrence of a variable in T as follows. For every node w such that r(w) = fl and g(w) = a, assign to this occurrence of type variable /?, tree automaton (A, a). Now for every type variable ac in C, with occurrences aI. . . , a,, , decide whether the languages of their assigned tree automata (d,al),...,(d,a,,,)

(d,al

U...Ua,).

a:V/(YI+..a,,,wifha,:al,...,

a,:q,,.

and for u E &, add a : CTfor all a E uAg. Then with C = {S, : a,.. . , Sk : a}, we have nk, T( M;) is nonempty iff C is satisfiable with respect to H. 0 Now we consider the complexity of CS-SATwhen all assumptions for an overloaded identifier, say x, in a Haskell type context are formed by specializing 7, with unary or nullary type constructors only. An example of this kind of overloading is given in Fig. 2. We call such contexts unary Haskell type contexts. Definition 8. A Haskell type context H is a unary Haskell type context if all assumptions for every identifier x of H have the form x : T,~[CY,~:= x0], for some nullary type constructor x0, or x : VP with C .~.,.[a.~:= ,y(,@], for some unary type constructor x. Theorem 9. CS-SAT is PSPACE complete for unary Haskell type contexts. Proof. Let H be a unary Haskell type context and C a constraint set. Suppose that for each constraint x : p E C, p = 7, [ (Y,,:= ~1, for some type z-. Let g be a run of (A, {x}) on 7~,where (A,{x})is constructed “on demand” and in polynomial space by Lemma 6. As in Theorem 7, assign tree automaton (A,a) to a type variable /? in rr if z-(w) = /? and g(w) = a for some node w. If a variable LYin C has occurrences

D.M.

Valpanol

Injtirmatian

Processing

aI ill,, then check whether the languages of their assigned tree automata (A,a~>,...,(A,ant)intersect by checking the emptiness of (A, UI U . . . U a,,). This automaton represents a DFA since H is a unary Haskell type context. Thus emptiness can be checked by searching an on-demand construction of it nondeterministically in log 21”’ space, or DSPACE( (H12). Hardness follows from a log-space reduction from the DFA intersection problem [ 51. Let Mk be a sequence of DFAs, where /Vi is MI,M2,..., (Q;, 2, &, 90,, F;), and the sets of states Ql , . , Qk are disjoint. Create a unary Haskell type context H where for all I 6 i 6 k, assumptions are added to H for Mi as follows. For all 9,9’ E Qi and u E 2 such that S,( 9, a) = 9’, add to H, 3.

.

t

Letrers 57 (1996)

13

9-13

3. Summary We have observed that it is necessary to simultaneously restrict both recursion in overloadings and the structure of type assumptions if there is to be any hope of solving CS-SAT efficiently. Surprisingly, even for the simple kind of overloading allowed in Haskell the problem is hard.

References L.

Damas

and R.

Milner,

Principal

type

schemes for

functional programs, in: Proc. 9rh ACM Symp. on Principles of Programming

[21 J.

Camp/.

[31 F. 9 : ‘da with 9’ : a. u(a)

Languages

( 1982) 207-2 12.

Doner, Tree acceptors and some of their applications, J. System SC;. 4 ( 1970) 406-45

I.

Gecseg and M. Steinby, Tree Automata

Budapest, Hungary,

( Akademi Kiadb,

1984).

[41 Report on the programming language Haskell, A non-strict, purely functional language, Version 1.2, ACM SlGPLAN

and for all 9 E Fi add 9 : E where E is a nullary type constructor. Then with C = (90, : a,. . . ,90x : a}, nf=, L(M;) is nonempty respect to H. 0

iff C is satisfiable

with

Notices

27 ( 1992).

[51 D. Kozen, Lower bounds for natural proof systems, in: Prw. 18th IEEE Ann. Symp. on Foundations of’Computer Science,

( 1977)

Long Beach, CA

254-266.

[61 R. Milner, A theory of type polymorphism in programming, J. Comput. System Sci. 17 ( 1978) 348-375.

And finally we consider unary Haskell type contexts without recursion.

171 T. Nipkow and C. Prehofer, Type checking type classes, in: Proc. 20th ACM Symp. on Princip1e.r of Programming

Theorem 10. CS-SAT is NP hard for nonrecursive, unuty Huskell type contexts.

[81 J.C. Reynolds, Transformational systems and the algebraic structure of atomic formulas, Machine Intelligence 5 ( 1970)

Languajies

Proof. The proof is by a PTIME reduction from 3CNF-SAT. Suppose E is a 3CNF formula with clauses dl, . . ,d, and distinct variables xl,. . . ,x,,. The satisfying truth assignments for each clause dj are recognizable as unary trees by a DR tree automaton Mi = (di, S;) that can be constructed in O(m) time. That is, B,(&(...B,,,(&) ...)) E T(M;) iff the assignment of truth values BI, . . . , B,, to xl,. . . ,x,, respectively satisfies di. Assume the sets of states for the n automata are disjoint. Then for all 1 6 i < n, add to a type context H, the assumption

135-151.

[ 101

complete,

G.S. Smith, Polymorphic type inference for languages with overloading and subtyping, Ph.D. Thesis, Tech. Rept. 91. 1230, Dept. of Computer Science, Cornell University,

[Ill

I99 I.

G.S. Smith, Polymorphic type inference with overloading and subtyping, in: Proc. Computer Science 668

TAPSOFT

( Springer,

‘93,

Berlin,

Lecture Notes in

) 67 l-685.

I993

(121 G.S. Smith, Principal type schemes for functional programs with overloading and subtypin g, Sci. Compul. Programmin,q 23 ( 1994) [l31

D.M. ML on

197-226.

Volpano typability Funcrional

(Springer,

if B*l (a) = b and a : E if a E &*I. Then {SI : Q,.... s,,: a} is satisfiable under H iff E is satisfiable, 0 and H is nonrecursive and unary.

409-418.

[91 H. Seidl, Haskell overloading is DEXPTIME Inform. Process. Left. 52 ( 1994) 57-60.

Archifecrure,

a : Vex with h: a. B(a)

(1993)

and C.S. with

Smith,

overloading,

Programming Lecture

Notes

On

the complexity

in:

Prw.

Languages on

Berlin, 1991 ) 15-28.

Computer

5th

and

of

Con$

Computer

Science

523