ARTICLE IN PRESS
Journal of Computer and System Sciences 69 (2004) 499–524 http://www.elsevier.com/locate/jcss
On the reducibility of sets inside NP to sets with low information content
Mitsunori Ogihara (a,*,1) and Till Tantau (b,2)
(a) Department of Computer Science, University of Rochester, P.O. Box 270226, Rochester, NY 14627, USA
(b) International Computer Science Institute, 1947 Center Street, Berkeley, CA 94704, USA
Received 2 July 2002; revised 8 March 2004
Abstract

This paper studies for various natural problems in NP whether they can be reduced to sets with low information content, such as branches, P-selective sets, and membership comparable sets. The problems that are studied include the satisfiability problem, the graph automorphism problem, the undirected graph accessibility problem, the determinant function, and all logspace self-reducible languages. Some of these are complete for complexity classes within NP, but for others an exact complexity theoretic characterization is not known. Reducibility of these problems is studied in a general framework introduced in this paper: prover–verifier protocols with low-complexity provers. It is shown that all these natural problems indeed have such protocols. This fact is used to show, for certain reduction types, that these problems are not reducible to sets with low information content unless their complexity is much less than what it is currently believed to be. The general framework is also used to obtain a new characterization of the complexity class L: L is the class of all logspace self-reducible sets in $\mathrm{L}^{\text{L-sel}}$.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Computational complexity; Selectivity; Membership comparability; Self-reduction; Sets with low information content; Prover–verifier protocols
* Corresponding author.
E-mail addresses: [email protected] (M. Ogihara), [email protected] (T. Tantau).
1 Supported in part by NSF Grants EIA-0080124, EIA-0205061, and DUE-9980943 and by NIH Grants RO1AG18231 (5-25589) and P30-AG18254.
2 Work done in part while visiting the University of Rochester, New York, and while working at the Technical University of Berlin. Supported in part by the TU Berlin Erwin-Stephan-Prize grant and through a postdoc fellowship by the DAAD (German Academic Exchange Service).
0022-0000/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.jcss.2004.03.003
M. Ogihara, T. Tantau / Journal of Computer and System Sciences 69 (2004) 499–524
1. Introduction

This paper continues a line of research that studies whether problems inside NP become tractable if we have, possibly restricted, oracle access to a set with low information content. We study this question for a wide range of natural problems inside NP, including the NP-complete satisfiability problem sat and problems having seemingly much less complexity than sat such as the graph isomorphism problem gi, the graph automorphism problem ga, the circuit value problem cvp, the graph accessibility problem gap, and the undirected graph accessibility problem ugap.

Studies of reducibilities to sets with low information content are motivated by the question of how precomputation helps to speed up decision algorithms. Although sets with low information content can be arbitrarily complex, they become computationally tractable when a small amount of extra advice (in the sense of Karp and Lipton [26]) is available for each word length. The small number of advice bits makes sets with low information content attractive as oracles: in order to simulate an oracle with low information content (a task that arises automatically in practice, but also often in theory) we just need a small number of advice bits for each word length in order to answer queries. These bits could be obtained in a computationally expensive precomputation process, could be held in a database, or they could be hardwired into a machine or circuit. Thus, by studying whether a language is reducible to a set with low information content, we study essentially whether the language becomes tractable if precomputation is permitted to obtain a small piece of information that is dependent only on the length of the input.

Classic examples of sets with low information content are tally sets, i.e., languages over a unary alphabet. Tally sets have low information content since they contain only one bit of information per word length: either there is a word of a specific length or there is not.
More general examples are sparse sets, which contain only polynomially many words per word length. The information content of a sparse set is the positions of the words on each length level.

In this paper we focus on classes of sets with low information content that are more "structured." The first class we study is the class P-sel of all P-selective sets [43], which are sets for which there exist polynomial-time computable selectors. A selector picks from any two input words one word that is more "likely" to be in the language. P-selective sets have low information content since they become tractable if $O(n^2)$ advice bits are available for words of length $n$ (see [27]). The presence of the selector functions makes P-selective sets more structured than sparse sets, because we can use the selector to derive some information about the set. Second, we study the classes $\text{P-mc}(f)$ of all $f$-membership comparable sets [38], where $f\colon \mathbb{N} \to \mathbb{N}$ is a parameter function. Although membership comparable sets are less structured than P-selective sets, there still exist membership comparing functions for them. Such a function excludes one possibility for the characteristic string of any $f(n)$ words of length at most $n$.

One can easily show that sets with low information content can become arbitrarily complex. For example, there are uncountably many tally sets and thus there must exist even non-recursive tally sets. Similarly, one can show that there exist uncountably many P-selective sets and, for each non-trivial $f$, uncountably many $f$-membership comparable sets. Their arbitrarily high complexity is another reason why sets with low information content make interesting oracles. Intuitively, having oracle access to a difficult problem should allow us to solve other difficult problems efficiently. However, this intuition often turns out to be wrong for sets
with low information content because many results of the following form are known: "if problem $A$ can be solved with (possibly restricted) oracle access to a specific class of sets with low information content, then some unlikely complexity class collapses occur." In this paper we give numerous new results of this kind. Table 1 gives a (non-exhaustive) overview of known results of this type, together with the improvements and new results obtained in the present paper. In the table, det denotes the function that maps an integer matrix to its determinant, #ga denotes the function that maps a graph to the number of its automorphisms, L-sel and $\text{L-mc}(f)$ denote the classes of all logspace selective sets and all logspace $f$-membership comparable sets, ALL denotes the class of all languages, SPARSE denotes the class of all sparse languages, and BRANCH denotes the class of all branches. A branch is the set of all finite prefixes of an infinite symbol string; in other words, branches contain one word of length $i$ for every number $i$ and are closed under prefix.
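Table 1
Summary of results

| Problem | $\mathrm{P}^{\text{P-sel}}_{n^{1/2-\epsilon}\text{-tt}}$ | $\mathrm{P}^{\text{P-sel}}_{n^{1-\epsilon}\text{-tt}}$ | $\mathrm{P}^{\text{P-mc(const)}}_{\text{tt}}$ | P-mc(log) |
|---|---|---|---|---|
| sat | $\mathrm{sat} \in \mathrm{P}^{\,a}$ | $\mathrm{sat} \in \mathrm{P}^{\,a}$ | $\mathrm{sat} \in \mathrm{RP}^{\,b}$ | $\mathrm{sat} \in \mathrm{RP}^{\,c}$ |
| (1sat, sat) | $\mathrm{sat} \in \mathrm{RP}^{\,4.2}$ | $\mathrm{sat} \in \mathrm{RP}^{\,4.2}$ | $\mathrm{sat} \in \mathrm{RP}^{\,4.2}$ | — |
| gi | $\mathrm{gi} \in \mathrm{P}^{\,a}$ | $\mathrm{gi} \in \mathrm{RP}^{\,4.3}$ | $\mathrm{gi} \in \mathrm{RP}^{\,4.3}$ | $\mathrm{gi} \in \mathrm{RP}^{\,4.3}$ |
| (1gi, gi) | — | — | — | — |
| ga | $\mathrm{ga} \in \mathrm{P}^{\,a}$ | $\mathrm{ga} \in \mathrm{P}^{\,a}$ | $\mathrm{ga} \in \mathrm{P}^{\,4.7}$ | — |
| (1ga, ga) | $\mathrm{ga} \in \mathrm{P}^{\,4.7}$ | $\mathrm{ga} \in \mathrm{P}^{\,4.7}$ | $\mathrm{ga} \in \mathrm{P}^{\,4.7}$ | — |

| Problem | L/log | $\mathrm{L}^{\text{L-sel}}$ | $\mathrm{L}^{\mathrm{BRANCH}}$ | L-mc(const) |
|---|---|---|---|---|
| cvp | $\mathrm{cvp} \in \mathrm{L}^{\,d}$ | $\mathrm{cvp} \in \mathrm{L}^{\,4.10}$ | $\mathrm{cvp} \in \mathrm{L}^{\,4.10}$ | — |
| gap | $\mathrm{gap} \in \mathrm{L}^{\,d}$ | $\mathrm{gap} \in \mathrm{L}^{\,4.18}$ | $\mathrm{gap} \in \mathrm{L}^{\,4.18}$ | $\mathrm{gap} \in \mathrm{L}^{\,4.19}$ |
| ugap | $\mathrm{ugap} \in \mathrm{L}^{\,4.24}$ | $\mathrm{ugap} \in \mathrm{L}^{\,4.23}$ | $\mathrm{ugap} \in \mathrm{L}^{\,4.23}$ | — |

| Function | $\mathrm{FL}^{\mathrm{ALL}}_{O(\log n)\text{-T}}$ | $\mathrm{FL}^{\text{L-sel}}$ | $\mathrm{FP}^{\mathrm{ALL}}_{O(\log n)\text{-T}}$ |
|---|---|---|---|
| det | $\mathrm{det} \in \mathrm{FL}^{\,e}$ | $\mathrm{det} \in \mathrm{FL}^{\,4.17}$ | — |
| #ga | — | — | $\mathrm{gi} \in \mathrm{RP}^{\,f}$, $\mathrm{ga} \in \mathrm{P}^{\,4.9}$ |

Consequences of the problems and functions on the left-hand side being elements of the classes at the top. For promise problems this means that at least one solution of the promise problem is in the class at the top. For example, the first entry in the first table states that if sat is truth-table reducible to a set in P-sel with $n^{1/2-\epsilon}$ truth-table queries for words of length $n$, then $\mathrm{sat} \in \mathrm{P}$. The entry below states that if (1sat, sat) has a solution in $\mathrm{P}^{\text{P-sel}}_{n^{1/2-\epsilon}\text{-tt}}$, then $\mathrm{sat} \in \mathrm{RP}$ by Theorem 4.2. Results with a superscript theorem number are shown in the present paper.
a Shown in [2,10,38].
b Shown in [9,48].
c Shown in [44].
d Shown in [7].
e Shown in [8].
f Shown in [11].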
1.1. Membership enumerable sets

Those results in Table 1 that concern languages (rather than functions) are of the form "if $A$ is reducible to a set with low information content, then $A$ can be decided efficiently." Most of the proofs of these results follow a similar pattern and can be divided into two parts.

For the first part of the proofs we introduce a new notion, namely membership enumerable sets. For such sets a polynomial-time or logspace Turing machine can output for any tuple of input words a polynomially sized set of possibilities for the characteristic string of these words. Our key observation is that many structured sets with low information content are membership enumerable. For polynomial-time computations this is generally easily seen and has been implicitly used in the literature for some time. For logspace computations our results are new.

The second part is based on the fact that many natural sets $A \in \mathrm{NP}$ have prover–verifier protocols in which the prover can be computed with non-adaptive queries to a set closely related to $A$. If the prover can be computed with oracle access to $A$ itself, one says that search non-adaptively reduces to decision for $A$. For many sets in NP such as gap and cvp, it is easy to see that search non-adaptively reduces to decision. For other sets such as ga and the problems complete for $\mathrm{L}^{\#\mathrm{L}}$, more complicated constructions are needed to prove that search non-adaptively reduces to decision. In our proofs we improve known constructions and give new ones.

The combination of prover–verifier protocols and membership enumeration is not new. Different authors have used similar arguments, see for example [7,9,48,51]. However, our new notion of membership enumerability makes the technique behind these arguments much clearer. Previous proofs typically contained some sort of ad hoc argument why a small set of possibilities for the characteristic string of the membership queries could be computed.
These ad hoc arguments were often neither optimal nor easy to follow. The notion of membership enumerability replaces them with a simple, clean proof technique.

1.2. Logspace selectivity and a new characterization of L

For natural problems that are known to lie inside P, but that are not known to lie in L, such as the circuit value problem cvp and the graph accessibility problem gap, we can also ask whether sets with low information content are helpful for deciding these problems in logarithmic space. Results of van Melkebeek [51] and Balcázar [7] indicate that the same effects as in the polynomial-time setting occur: unlikely collapses are caused by the reducibility of natural problems to sets with low information content. Van Melkebeek has shown that if either cvp or gap is $\le^{\log}_{\mathrm{btt}}$-reducible to a sparse set, then $\mathrm{P} = \mathrm{L}$, respectively, $\mathrm{NL} = \mathrm{L}$. Balcázar has shown that the same occurs if either $\mathrm{cvp} \in \mathrm{L}/\log$ or $\mathrm{gap} \in \mathrm{L}/\log$.

As an application of our study of logspace selectivity we prove several results that go in the same direction. For example, our results imply that, for each of cvp, gap, and ugap, if the problem is decidable in logarithmic space with oracle access to a branch (the set of all finite prefixes of an infinite symbol string), then the problem is in L. Note that we do not restrict the number of queries. The results on branches are just a corollary to a general result below, which is about $\mathrm{L}^{\mathrm{BC}(\text{L-sel})}$, the logspace Turing reducibility closure of BC(L-sel), the Boolean closure of L-sel.
Theorem 1.1. For each $\mathcal{C} \in \{\mathrm{P}, \mathrm{SAC}^1, \mathrm{CFL}, \mathrm{DCFL}, \mathrm{L}^{\#\mathrm{L}}, \mathrm{NL}, \mathrm{Mod}_2\mathrm{L}, \mathrm{Mod}_3\mathrm{L}, \ldots, \mathrm{FewL}, \mathrm{UL}, \mathrm{SL}\}$, $\mathcal{C} \subseteq \mathrm{L}^{\mathrm{BC}(\text{L-sel})}$ implies $\mathcal{C} \subseteq \mathrm{L}$.

This theorem is the "summary" of different individual theorems proved later on, one for each of the classes mentioned in the theorem. These individual theorems are Theorems 4.10, 4.11, 4.12, 4.14, 4.16, 4.18, 4.21, and 4.23.

An interesting aspect of the above theorem concerns the important conjecture "if $\mathrm{NP} \subseteq \mathrm{P}^{\text{P-sel}}$, then $\mathrm{P} = \mathrm{NP}$." Proving this conjecture is the ultimate goal of studying reduction closures of sets with low information content, but it is known [4] that even watered-down versions of this conjecture can be either true or false relative to some oracle. Theorem 1.1 shows that many plausible logspace analogues do hold.

For all of the classes mentioned in Theorem 1.1 (except for SL, UL, and FewL, where we use a slightly different argument) the proof of the implication is based on the following new characterization of L in terms of logspace self-reducibility [7]. It is a strong counterpart to the characterization of P as the class of polynomial-time self-reducible sets belonging to class P-sel [12] and to similar characterizations of NP [23] and of $\mathrm{NP} \cap \mathrm{coNP}$ [22].

Theorem 1.2. L is the class of logspace self-reducible sets in $\mathrm{L}^{\mathrm{BC}(\text{L-sel})}$.
1.3. Reducibility of functions to sets with low information content

Our proof techniques apply not only to decision problems but also to function problems. We study two functions, the determinant function det, which takes an integer matrix as input and produces its determinant as output, and the function #ga, which takes an undirected graph as input and produces the number of automorphisms of this graph as output. In Theorem 4.17 we show that if the determinant function can be computed in logarithmic space with Turing oracle access to an L-selective set, we can also compute this function in logarithmic space without using any oracles. In Theorem 4.9 we show that if #ga can be computed in polynomial time asking $O(\log n)$-Turing queries to any oracle, then $\mathrm{ga} \in \mathrm{P}$.

These results demonstrate some advantages of our proof framework. The result on #ga was obtained mainly by rephrasing a known result of [8] in terms of membership enumerability and then applying our structural results on membership enumerability. We believe that our proof framework not only gives short, elegant proofs, but also helps to focus one's attention on the special properties of individual problems, rather than on reproving general results.

1.4. Employed techniques

Our results are based on a general analysis of the new notion of membership enumerability and on individual analyses of the different natural problems. These individual analyses employ numerous proof techniques. In the following, we give a short overview of the most important techniques that we use.
For the results on the graph isomorphism problem we use the interactive proof protocol proposed by Goldreich et al. [19]. For the result on the graph automorphism problem we use the parallel census technique of Hartmanis [20], Hartmanis et al. [21] and the observation of Lozano and Torán [32] and of Agrawal and Arvind [1] that search non-adaptively reduces to decision for the graph automorphism problem. For the characterization of L we show that search reduces to decision for logspace self-reducible sets, and use the observation of Tantau [45] (see also [36]) that the tournament reachability problem is first-order definable and hence decidable in logarithmic space. For the result on gap and logspace $O(1)$-membership comparability we adapt the proof techniques used by Agrawal, Arvind, Beigel, Kummer, Ogihara, and Stephan [2,10,38] to logarithmic space. For the results on the determinant function we use the characterization of det in terms of GapL, see [16,47,49,53], and the recent algorithms for transforming a binary number representation into a Chinese remainder representation and vice versa [15].

1.5. Organization of the paper

This paper is organized as follows. In Section 2 we give definitions of such basic notions as selectivity, membership enumerability, and reductions. In Section 3 we introduce a framework for the study of reduction closures of certain "structured" sets with low information content. In Section 4 we build on this framework to analyze whether different natural languages inside NP can be reduced to sets with low information content.
2. Preliminaries

2.1. Basic notations

For a language $A \subseteq \Sigma^*$ we define $\chi_A$ to be the characteristic function of $A$ extended to tuples. It takes tuples of words as input and returns their characteristic string as output. The coding and pairing function $\langle \cdot, \ldots, \cdot \rangle$ takes arbitrary tuples of words or finite objects and encodes them into one word, such that coding and decoding can be done easily. The join $A \oplus B$ of two languages is defined by $A \oplus B := \{0x \mid x \in A\} \cup \{1x \mid x \in B\}$. For a graph $G$ and its vertices $s$ and $t$, let $\#\mathrm{paths}_G(s,t)$ denote the number of paths from $s$ to $t$ in $G$.

2.2. Language and function classes

We will refer to three formal language classes: CFL, DCFL, and LOGCFL. The class CFL contains the context-free languages, DCFL contains the deterministic context-free languages. The class LOGCFL is the closure of CFL under logspace many-to-one reductions. We point out that LOGCFL is known to equal the circuit complexity class $\mathrm{SAC}^1$ of all languages decided by logspace-uniform logarithmically depth-bounded polynomially sized circuits that have unbounded-fan-in or-gates and bounded-fan-in and-gates.

We next summarize the complexity classes that will be used in this paper. The class P (respectively, NP) consists of the languages accepted by deterministic (respectively, nondeterministic) polynomial-time Turing machines. The randomized complexity class RP contains
the languages accepted by randomized polynomial-time Turing machines that have a bounded probability of false rejections and a zero probability of false acceptance. The function class FP contains all functions computable by polynomial-time Turing machines. See for example [39] for detailed definitions of these classes.

The class L (respectively, NL) consists of the languages accepted by deterministic (respectively, non-deterministic) logspace Turing machines. The class SL of symmetric logarithmic space was introduced in [31]. It can be defined as the logspace many-to-one reduction closure of the undirected graph accessibility problem ugap. The class UL of unambiguous logarithmic space was introduced by Àlvarez and Jenner [3]. It contains all languages accepted by non-deterministic logspace machines that have at most one accepting path for every input. The classes FewL and $\mathrm{Mod}_k\mathrm{L}$, $k \ge 2$, are due to Buntrock et al. [13]. The first generalizes UL by allowing a polynomial number of accepting paths instead of just one. For each $k \ge 2$, and for each language $L$ in $\mathrm{Mod}_k\mathrm{L}$, there exists a non-deterministic logspace machine $M$ such that, for all strings $w$, $w \in L$ if and only if the number of accepting computation paths of $M$ on input $w$ is not divisible by $k$.

The function class FL consists of all functions computable by logspace Turing machines. The function class #L contains all functions $f\colon \Sigma^* \to \mathbb{N}$ for which there exists a non-deterministic logspace Turing machine $M$ such that $f(x)$ is exactly the number of accepting paths of $M$ on input $x$.

The class ALL consists of all languages, SPARSE denotes the class of all sparse languages, and BRANCH denotes the class of all branches. Recall that a branch is the set of all finite prefixes of an infinite symbol string.

We next give formal definitions of the notions of P-selectivity, which is due to Selman [43], and membership comparability, which is due to Ogihara [38]. Both notions can readily be generalized to logarithmic space.
Definition 2.1 (Selman [43]). A selector for a language $A$ is a binary function $g$ such that for all $x, y \in \Sigma^*$
(1) $g(x,y) \in \{x, y\}$;
(2) if $x \in A$ or $y \in A$, then $g(x,y) \in A$.
A language is in the class P-sel if it has a selector in FP, and it is in L-sel if it has a selector in FL.

Definition 2.2 (Ogihara [38]). Let $f\colon \mathbb{N} \to \mathbb{N}$. An $f$-membership comparing function for a language $A$ is a function $g$ such that for all words $x_1, \ldots, x_{f(n)}$ of length at most $n$ we have $b := g(\langle x_1, \ldots, x_{f(n)} \rangle) \in \{0,1\}^{f(n)}$ and $b \ne \chi_A(x_1, \ldots, x_{f(n)})$. A language is in the class $\text{P-mc}(f)$ if it has an $f$-membership comparing function in FP. It is in $\text{L-mc}(f)$ if it has an $f$-membership comparing function in FL.

We write P-mc(const) for $\text{P-mc}(O(1))$ and P-mc(log) for $\text{P-mc}(O(\log n))$. For constant $k$ we write $\text{P-mc}(k)$ for $\text{P-mc}(f)$ where $f(n) = k$ for all $n \in \mathbb{N}$.

As shown in [35,38], $\text{P-sel} \subsetneq \text{P-mc}(2)$ and the same proof technique can be used to show $\text{L-sel} \subsetneq \text{L-mc}(2)$. The classes L-sel and $\text{L-mc}(k)$ share many properties with the polynomial-time
counterparts P-sel and $\text{P-mc}(k)$. For example, $\mathrm{P}^{\text{L-sel}} = \mathrm{P}/\mathrm{poly}$ and L-sel cannot be decided with sublinear advice for any recursive time bound [24]. Likewise the exact advice bounds for $\text{P-mc}(k)$ shown by Ronneburger [40] also hold for $\text{L-mc}(k)$.

2.3. Relativized classes

We denote the relativization of P where we have oracle access to some oracle from a class $X$ by $\mathrm{P}^X$. If the number of queries to the oracle for words of length $n$ is restricted by some number $f(n)$, we denote this by $\mathrm{P}^X_{f(n)\text{-T}}$. If the oracle queries must be made in parallel, we denote this by $\mathrm{P}^X_{\text{tt}}$ and $\mathrm{P}^X_{f(n)\text{-tt}}$. We similarly denote relativizations of the function class FP by $\mathrm{FP}^X$ and so on. See [30] for detailed definitions. For relativizations of the logspace classes L and FL we also use the same notation, as in $\mathrm{L}^X$. See [29] for detailed definitions and properties of logspace reductions.
2.4. Prover–verifier protocols, self-reducibility, promise problems, advice classes, Boolean closure

It is well-known that all sets in NP have prover–verifier protocols of the following kind:

Definition 2.3. For sets in NP, we define the following notions of provers and verifiers: A prover is a polynomially length-bounded function. A verifier is a set in L. A prover–verifier protocol for a language $A \in \mathrm{NP}$ is a pair $(f, V)$ where $f$ is a prover and $V$ is a verifier such that for every $x$ the following holds:
(1) If $x \in A$, then $\langle x, f(x) \rangle \in V$.
(2) If $x \notin A$, then for all $y$ we have $\langle x, y \rangle \notin V$.
We say that $A$ has a prover in FC, if there exists a prover–verifier protocol $(f, V)$ for $A$ such that $f$ is an element of the function class FC.

Every set in NP has a prover in $\mathrm{FP}^{\mathrm{NP}}$. In the literature it is often said that search reduces to decision for a language $A$, if $A$ has a prover in $\mathrm{FP}^A$ and if the verifier works in polynomial time. We will however always use verifiers that work in logarithmic space.

Definition 2.4 (Even et al. [17], Even and Yacobi [18]). A promise problem is a pair $(A, B)$ of languages that consists of a promise $A$ and a problem $B$. A solution of a promise problem is a set $L$ such that $L \cap A = B \cap A$.

Examples are the promise problems (1ga, ga) and (1sat, sat), where 1ga is the set of graphs having at most one non-trivial automorphism, and 1sat is the set of Boolean formulas having at most one satisfying assignment. See [28] for a discussion of the properties of these problems.

Definition 2.5 (Balcázar [7]). A language $A$ is logspace self-reducible if there exists a constant $c$ and a deterministic logspace oracle Turing machine such that on every input $x$ the machine
(1) accepts $x$ if and only if $x \in A$, when given $A$ as oracle,
(2) only asks queries that are lexicographically smaller than $x$, have the same length as $x$, and are identical to $x$ except possibly for the last $c \log |x|$ bits.

The above definition is slightly more flexible than Balcázar's original definition, where $c = 1$ is required. In Balcázar's definition the size of a self-reduction tree for an input $x$ is linear in the length of $x$, whereas in our definition it is polynomial in this length.

Definition 2.6 (Karp and Lipton [26]). Let $f\colon \mathbb{N} \to \mathbb{N}$ be an advice bound and $\mathcal{C}$ a complexity class. The advice class $\mathcal{C}/f$ is defined as follows: $A \in \mathcal{C}/f$ if there exists a set $B \in \mathcal{C}$ and an advice function $h\colon \mathbb{N} \to \{0,1\}^*$ with $|h(n)| = f(n)$ for all $n$, such that $x \in A$ if and only if $\langle x, h(|x|) \rangle \in B$.

Definition 2.7. Let $\mathcal{C}$ be a class of languages. The Boolean closure $\mathrm{BC}(\mathcal{C})$ of $\mathcal{C}$ is the smallest superset $\mathcal{D} \supseteq \mathcal{C}$ such that $A, B \in \mathcal{D}$ implies $\overline{A} \in \mathcal{D}$ and $A \cap B \in \mathcal{D}$.

Lemma 2.8. $\mathrm{BRANCH} \subseteq \mathrm{BC}(\text{L-sel})$.

Proof. Let $A$ be a branch. By definition, $A$ is the set of all finite prefixes of some infinite symbol string $s_1 s_2 s_3 \cdots$. Let $\le_{\mathrm{lex}}$ denote the lexicographical ordering of words over $\Sigma$. Consider the sets $\{c \in \Sigma^* \mid c \le_{\mathrm{lex}} s_1 \cdots s_{|c|}\}$ and $\{c \in \Sigma^* \mid c \ge_{\mathrm{lex}} s_1 \cdots s_{|c|}\}$. These sets are easily seen to be L-selective. Their intersection is exactly $A$. Thus $A$ is in the Boolean closure of L-sel. □

By the above lemma, any result of the type "$A \in \mathrm{L}^{\mathrm{BC}(\text{L-sel})}$ has the following consequences: ..." implies that $A \in \mathrm{L}^{\mathrm{BRANCH}}$ has the same consequences. We could thus focus on the consequences of membership in $\mathrm{L}^{\mathrm{BC}(\text{L-sel})}$ and get results on branches as immediate corollaries. In the following, we actually investigate the consequences of membership in an even larger class, called L-men, which is introduced next.

3. A framework for the study of reductions to sets with low information content

3.1. Membership enumerable sets

We now introduce the new notions of polynomial-time and logspace membership enumerability. Roughly speaking, a language $A$ is membership enumerable if $\chi_A$ is enumerable in the sense of Cai and Hemachandra [14].
Definition 3.1. An enumerator for a function $f\colon \Sigma^* \to \Sigma^*$ is a function $g\colon \Sigma^* \to \Sigma^*$ with the following property: $g$ maps every word $x$ to some coded word tuple $\langle x_1, \ldots, x_c \rangle$ that contains $f(x)$, i.e., $f(x) = x_i$ for some $i \in \{1, \ldots, c\}$. A membership enumerator for a set $A$ is an enumerator for $\chi_A$. A language is in the class P-men if it has a membership enumerator in FP, it is in L-men if it has a membership enumerator in FL.

Note that a function $f$ has an enumerator in FP if and only if $f \in \mathrm{FP}^{\mathrm{ALL}}_{O(\log n)\text{-T}}$ and that $f$ has an enumerator in FL if and only if $f \in \mathrm{FL}^{\mathrm{ALL}}_{O(\log n)\text{-T}}$.
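Definitions 2.1 and 3.1 can be illustrated on a standard textbook example that is not taken from this paper: the left cut of a real number $r$, i.e., the set of binary words $w$ whose value $0.w$ is at most $r$. The following Python sketch is only an illustration under that assumption; the names `value`, `selector`, `membership_enumerator`, and `chi`, as well as the choice $r = 1/3$, are hypothetical. The enumerator never looks at $r$: it only uses the fact that membership in a left cut is downward closed with respect to the value of a word, so for $m$ words there are at most $m + 1$ candidate characteristic strings.

```python
from fractions import Fraction

def value(w: str) -> Fraction:
    """Read the binary word w as the dyadic rational 0.w (empty word -> 0)."""
    return Fraction(int(w, 2), 2 ** len(w)) if w else Fraction(0)

def selector(x: str, y: str) -> str:
    """Selector for every left cut: the word with the smaller value is the
    one more likely to be in the set (conditions (1), (2) of Definition 2.1)."""
    return x if value(x) <= value(y) else y

def membership_enumerator(words):
    """Return len(words) + 1 candidates for the characteristic string.

    Sort the positions by value; for each threshold t, mark the t positions
    with the smallest values as members. The true characteristic string of
    any left cut is one of these candidates.
    """
    order = sorted(range(len(words)), key=lambda i: value(words[i]))
    candidates = []
    for t in range(len(words) + 1):
        bits = ["0"] * len(words)
        for i in order[:t]:
            bits[i] = "1"
        candidates.append("".join(bits))
    return candidates

# Hypothetical concrete left cut, used only to check the enumerator.
R = Fraction(1, 3)

def chi(words):
    """Characteristic string of the left cut of R."""
    return "".join("1" if value(w) <= R else "0" for w in words)

words = ["11", "0101", "001", "1"]
candidates = membership_enumerator(words)
```

Here `chi(words)` is one of the five strings in `candidates`, although the enumerator has no access to `R`.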
Theorem 3.2. The class P-men is closed under join and $\le^{p}_{\mathrm{tt}}$-reductions. The class L-men is closed under join and $\le^{\log}_{\mathrm{T}}$-reductions.

Proof. To show that P-men is closed under $\le^{p}_{\mathrm{tt}}$-reductions, let $A \le^{p}_{\mathrm{tt}} B$ and $B \in \text{P-men}$. We show $A \in \text{P-men}$. Let $m \ge 1$ be any integer and let $x_1, \ldots, x_m$ be any $m$ words. For each $i \in \{1, \ldots, m\}$ use the $\le^{p}_{\mathrm{tt}}$-reduction on input $x_i$ to compute the list of queries $q_i^1, \ldots, q_i^{c_i}$ to $B$. The numbers $c_i$ of queries are bounded by a fixed polynomial in the length of the input $\langle x_1, \ldots, x_m \rangle$. String all the query strings together and then use the membership enumerator for $B$ to generate a list of possible values for the characteristic string of the query strings. The total number of query strings is bounded by a fixed polynomial in the length of $\langle x_1, \ldots, x_m \rangle$, and thus the number of possible values for the characteristic string in the list is polynomially bounded. Each value in the list induces a characteristic string for the original input words. Then we only have to output the set of all of these possibilities.

To show that P-men is closed under join, let $A$ and $B$ be languages in P-men. Let $m \ge 1$ be any integer. Let $x_1, \ldots, x_m$ be any $m$ words and $b_1, \ldots, b_m$ be $m$ bits. Suppose that we need to enumerate possible values for the characteristic string $\chi_{A \oplus B}(b_1 x_1, \ldots, b_m x_m)$. Since the words can be rearranged, without loss of generality, we can assume that there exists some $c \in \{0, \ldots, m\}$ such that the first $c$ of the $b$'s are all 0 and the others are 1. Using the membership enumerator for $A$, we enumerate a set $P$ of possibilities for the string $\chi_A(x_1, \ldots, x_c)$ and using the membership enumerator for $B$, we enumerate a set $Q$ of possibilities for the string $\chi_B(x_{c+1}, \ldots, x_m)$. Then we only have to output the set $\{bc \mid b \in P, c \in Q\}$.

For logarithmic space, Ladner and Lynch [29] show that for all languages $A$ and $B$, we have $A \le^{\log}_{\mathrm{T}} B$ if and only if $A \le^{\log}_{\mathrm{tt}} B$. Keeping this in mind, we can simply repeat the above proofs, replacing polynomial time by logarithmic space everywhere.
We only have to be careful that we do not "write down" any intermediate values like the queries $q_i^j$, but instead recompute them whenever necessary. □

Theorem 3.3. $\mathrm{BC}(\text{P-men}) = \text{P-men}$ and $\mathrm{BC}(\text{L-men}) = \text{L-men}$.

Proof. Any Boolean connective of sets $A_1, \ldots, A_c$ is $\le^{\log}_{\mathrm{btt}}$-reducible to the join of the same sets. □

For the proof of our next result we need the following fact, which is often called Sauer's lemma [42], although Vapnik and Chervonenkis [52] appear to have been the first to discover it.

Fact 3.4 (Vapnik and Chervonenkis [52]). Let $Q \subseteq \{0,1\}^n$ and $k \ge 1$. Let $\{b_{i_1} \cdots b_{i_k} \mid b_1 \cdots b_n \in Q\}$ have cardinality strictly less than $2^k$ for all indices $i_1, \ldots, i_k \in \{1, \ldots, n\}$. (In other words, the set misses at least one bit string of length $k$.) Then
$$|Q| \le S(n,k) := \sum_{i=0}^{k-1} \binom{n}{i}.$$
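Fact 3.4 can be checked by brute force on a small instance. The Python sketch below computes $S(n,k)$ and tests the hypothesis of the fact for a given family $Q$; the function names `sauer_bound` and `satisfies_hypothesis` and the example family (all bit strings of length $n$ with fewer than $k$ ones) are illustrative choices that do not appear in the paper. Only increasing index tuples are tested, which is an assumption of the sketch: reordering the selected positions does not change whether a projection is complete.

```python
from itertools import combinations, product
from math import comb

def sauer_bound(n: int, k: int) -> int:
    """S(n, k) = sum of the binomial coefficients C(n, i) for i = 0, ..., k-1."""
    return sum(comb(n, i) for i in range(k))

def satisfies_hypothesis(Q, n: int, k: int) -> bool:
    """True if every projection of Q onto k positions misses some k-bit string."""
    for idx in combinations(range(n), k):
        projection = {tuple(b[i] for i in idx) for b in Q}
        if len(projection) == 2 ** k:
            return False  # this projection realizes all 2^k patterns
    return True

n, k = 6, 3

# Example family: every projection onto k positions misses the all-ones
# string, because no member contains k ones.
Q = [b for b in product((0, 1), repeat=n) if sum(b) < k]

# Adding the all-ones string destroys the hypothesis.
Q_too_big = Q + [tuple([1] * n)]
```

For this family the bound is attained: `len(Q)` equals `sauer_bound(n, k)`, so $S(n,k)$ cannot be improved in general.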
Fact 3.4 is related to membership comparability in the following way: Consider a $k$-membership comparing function $g$ and $n$ words $x_1, \ldots, x_n$. Let $Q \subseteq \{0,1\}^n$ be the set of all bit strings $b$ such that for any selection $x_{i_1}, \ldots, x_{i_k}$ of $k$ words we have $b_{i_1} \cdots b_{i_k} \ne g(x_{i_1}, \ldots, x_{i_k})$. In other words, $Q$ contains all bit strings that are consistent with the membership comparing function $g$. Then $Q$ fulfills the requirements of Fact 3.4 and thus $|Q| \le S(n,k)$, which is $\Theta(n^{k-1})$ for constant $k$. The following theorem states that we can not only bound the size of $Q$ in this way, but that we can also compute $Q$ in time polynomial in $Q$'s size, and we can do this even for non-constant $k$.

Theorem 3.5. Let $A \in \text{P-mc}(f)$ for some monotonically non-decreasing, polynomial-time computable function $f$. Then $\chi_A$ has an enumerator that can be computed in time $O(n^{O(f(n))})$.

Proof. Let $f$ be a monotonically non-decreasing, polynomial-time computable function. Let $A \in \text{P-mc}(f)$ via a membership comparing function $g$. Let $m \ge 1$ be an integer. Suppose that $\langle x_1, \ldots, x_m \rangle$ is given as the input. Let $n = |\langle x_1, \ldots, x_m \rangle|$ be the length of this input. We wish to enumerate possibilities for $\chi_A(x_1, \ldots, x_m)$.

For each $j \in \{1, \ldots, m\}$ and for each bit string $b = b_1 \cdots b_j$ of length $j$, we say that $b$ is consistent with $g$, if for every $f(n)$ indices $i_1, \ldots, i_{f(n)} \in \{1, \ldots, j\}$ it holds that $g(\langle x_{i_1}, \ldots, x_{i_{f(n)}} \rangle) \ne b_{i_1} \cdots b_{i_{f(n)}}$. Note that for all $j \in \{1, \ldots, m\}$ and all length-$j$ bit strings $b$, if $b$ is not consistent with $g$, then $b \ne \chi_A(x_1, \ldots, x_j)$.

Consider the following enumeration procedure that runs in two stages: In the first stage we run $g$ on every list of $f(n)$ words chosen from the input words. There are $m^{f(n)} \le n^{f(n)}$ such lists.
So, this stage requires $O(n^{f(n)} p(n))$ steps, where $p$ is a polynomial bounding the running time of $g$. In the second stage we inductively construct a list of all consistent bit strings of length $j$ as follows: For $j = 1$ we start with the list $L_1 = \{0, 1\}$. Then we cross out all inconsistent bit strings from $L_1$, arriving at a list $L_1'$. For $j \in \{2, \ldots, m\}$, having constructed $L_{j-1}'$, we set $L_j$ to $\{b0 \mid b \in L_{j-1}'\} \cup \{b1 \mid b \in L_{j-1}'\}$ and then construct $L_j'$ from $L_j$ by keeping only strings that are consistent with $g$.

For each $j \in \{1, \ldots, m\}$ let $c_j$ denote the cardinality of $L_j'$. The set $L_j'$ satisfies the requirements of Fact 3.4 and thus $c_j$ is bounded by $S(j, f(n)) = \sum_{i=0}^{f(n)-1} \binom{j}{i} \le j^{f(n)-1} + 1 \le n^{f(n)}$. Since $j$ ranges from 1 to $m$ and the time required for checking the consistency of a bit string is $O(n^{f(n)})$, the running time of the second stage is $O(c_m m n^{f(n)})$, and this is $O(n^{2f(n)+1})$. □

Corollary 3.6. $\text{P-mc(const)} \subseteq \text{P-men}$.

This was first obtained by Nickelsen [35].

Theorem 3.7. $\text{L-sel} \subseteq \text{L-men}$.

Proof. Let $A \in \text{L-sel}$ via a selector $g \in \mathrm{FL}$. Let $m \ge 1$ be an integer. Let $\langle x_1, \ldots, x_m \rangle$ be an input tuple. Without loss of generality, we may assume that the words $x_1, \ldots, x_m$ are pairwise distinct. Let $V$ be the set $\{x_1, \ldots, x_m\}$. A tournament (see [34]) is a directed graph in which every pair of distinct nodes is connected by exactly one arc. Without loss of generality, the selector $g$ can
be assumed to be commutative, since otherwise we can replace it with $g'(x, y) = g(\min\{x, y\}, \max\{x, y\})$. Then the selector $g$ induces a tournament whose vertex set is $V$. Let $G$ be the tournament. It is known [48] (see also [24]) that if $V \cap A \neq \emptyset$ there must exist an index $i$ such that the set of all vertices reachable from $x_i$ in $G$ is exactly $V \cap A$. Tantau [45] shows that the tournament reachability problem is first-order definable and hence, by an observation of Immerman [25], decidable in logarithmic space. This suggests the following enumeration procedure: output $0^m$ and then, for each $i \in \{1, \dots, m\}$, output an $m$-bit string $b^i$ such that for each $j \in \{1, \dots, m\}$ the $j$th bit of $b^i$ is a 1 if and only if $x_j$ is reachable from $x_i$ in the tournament $G$. Clearly, this procedure can be executed in logarithmic space and the characteristic string of $x_1, \dots, x_m$ is one of the output strings of the procedure. Thus, $A \in$ L-men. □

Corollary 3.8. $\mathrm{L}^{\mathrm{BC}(\text{L-sel})} \subseteq$ L-men.

Proof. We have $\mathrm{L}^{\mathrm{BC}(\text{L-sel})} \subseteq \mathrm{L}^{\mathrm{BC}(\text{L-men})} \subseteq \mathrm{L}^{\text{L-men}} \subseteq$ L-men. The inclusions follow, in order, from Theorems 3.7, 3.3, and 3.2. □

The following theorem shows that the class P-men is nicely snuggled between the classes $\mathrm{P}_{tt}^{\text{P-mc(const)}}$ and P-mc(log).

Theorem 3.9. $\mathrm{P}_{tt}^{\text{P-mc(const)}} \subseteq$ P-men $\subseteq$ P-mc(log).
Proof. The first inclusion follows from the fact that P-mc(const) $\subseteq$ P-men, see Corollary 3.6, and from Theorem 3.2, which states that P-men is closed under polynomial-time truth-table reductions. To prove the second inclusion, let $A \in$ P-men via a polynomial-time Turing machine $M$. Let $k \ge 1$ be an integer such that $n^k + k$ bounds the running time of $M$. Let $c$ be a constant such that for all $t \ge 1$ and for any $t$ words $x_1, \dots, x_t$ the coded tuple $\langle x_1, \dots, x_t \rangle$ has length at most $c\,t \max\{|x_1|, \dots, |x_t|\}$. For $f(n) = 3k\lceil \log n \rceil$ we will show that $A \in$ P-mc($f$) via a machine $N$. Let $n \ge 1$ be any integer and let $\langle x_1, \dots, x_{f(n)} \rangle$ be an input for the machine $N$, where each $x_i$ has length at most $n$. Then the length $\ell$ of $\langle x_1, \dots, x_{f(n)} \rangle$ is bounded by $c f(n) n$. The membership comparing machine $N$ starts a simulation of the membership enumerating machine $M$ on input $\langle x_1, \dots, x_{f(n)} \rangle$. During its computation the machine $M$ can output at most $\ell^k + k$ different possibilities for $\chi_A(x_1, \dots, x_{f(n)})$. Since $\ell^k + k \le (c f(n) n)^k + k \le (c n^2)^k + k = c^k n^{2k} + k < 2^{f(n)}$ for all but finitely many $n$, at least one bit string of length $f(n)$ is not included in the output list. Any such string can be output by $N$. So, $A \in$ P-mc($f$). □
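The counting step in the second inclusion can be made concrete. The following is a minimal sketch, not the authors' machine $N$: it assumes the enumerator's output is already available as a Python list of bit strings, and the name `comparator_from_enumerator` is hypothetical. It returns some string of the requested length that the enumerator did not list, which is a legal output of a membership comparing function whenever the list contains the true characteristic string.

```python
from itertools import product


def comparator_from_enumerator(candidates, length):
    """Return a bit string of the given length that is not among the candidates.

    If the candidate list contains the characteristic string, the returned
    string differs from it, as a membership comparing function requires.
    """
    seen = set(candidates)
    if len(seen) >= 2 ** length:
        raise ValueError("the candidate list is too long to exclude any string")
    # At most len(seen) + 1 strings are inspected before an unlisted one is found.
    for bits in product("01", repeat=length):
        word = "".join(bits)
        if word not in seen:
            return word
```

In the proof the list has fewer than $2^{f(n)}$ entries, so the error branch is never taken for sufficiently large $n$.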
3.2. Self-reduction and prover–verifier protocols

In the second part of our framework we study the relationship between logspace self-reductions and prover–verifier protocols. We then combine these results with the notion of membership
enumerability to obtain a new characterization of L in terms of logspace selectivity and self-reducibility (see Theorem 1.2).

Theorem 3.10. If $A$ is logspace self-reducible, then $A$ has a prover in $\mathrm{FL}^A$.

Proof. Let $A$ be logspace self-reducible via $M$ and a constant $c$. Consider the following prover–verifier protocol for $A$: Let $x$ be an input whose membership in $A$ needs to be proved. Let $n = 2^{c \log |x|}$ and let $q_1, \dots, q_n$ be the enumeration in lexicographically increasing order of all strings that have length $|x|$ and that differ from $x$ only in the last $c \log |x|$ bits. Then, for all $i \in \{1, \dots, n\}$ the machine $M$ on input $q_i$ queries only strings $q_j$ with $j \in \{1, \dots, i-1\}$. The prover's output is the bit string $\chi_A(q_1, \dots, q_n)$. The strings $q_1, \dots, q_n$ can be enumerated in logarithmic space and their membership in $A$ can be calculated using $A$ as the oracle. So, the prover can be calculated in $\mathrm{FL}^A$. Given a string $b = b_1 \cdots b_n$ as a proof, the verifier tests its validity as follows: For each $i \in \{1, \dots, n\}$ the verifier simulates $M$ on input $q_i$ by answering, for each $j \in \{1, \dots, i-1\}$, the query $q_j$ affirmatively if and only if $b_j = 1$. If for some $i \in \{1, \dots, n\}$ the machine $M$ on input $q_i$ accepts in the simulation and $b_i = 0$, or if it rejects and $b_i = 1$, the verifier dismisses the proof as inconsistent. Otherwise, the verifier approves the proof as consistent, but does not yet accept. The verifier accepts if and only if the bit of $b$ corresponding to the input $x$ is a 1. Since $M$ is a logspace machine, each simulation needs only logarithmic space. So, the verifier can be logarithmically space-bounded. □

Theorem 3.11. If $A$ has a prover in $\mathrm{FP}_{tt}^{\text{P-men}}$ then $A \in$ P. If $A$ has a prover in $\mathrm{FL}^{\text{L-men}}$ then $A \in$ L.

Proof. Assume that $A$ has a prover $f \in \mathrm{FP}_{tt}^{B}$ and $B \in$ P-men. On input $x$ we run the prover, who produces queries $q_1, \dots, q_\ell$ such that it can deduce its certificate $f(x)$ from $\chi_B(q_1, \dots, q_\ell)$. Since $B \in$ P-men, we can generate a list of candidates for $\chi_B(q_1, \dots, q_\ell)$ in polynomial time.
For each of these candidates we reconstruct a candidate for the prover's proof and check it with the verifier. If the verifier accepts one of the proof candidates, we accept; otherwise we reject. Once more this proof also works for logarithmic space, since $\mathrm{FL}^B = \mathrm{FL}_{tt}^B$ and since we can reproduce intermediate values as needed. □

The above results suggest that in order to prove results of the type "if $A$ is efficiently decidable with oracle access to a set with low information content, then $A$ is also efficiently decidable without access to this oracle," we should look for "low complexity provers" for $A$. The idea is that if $A$ has a prover in $\mathrm{FP}_{tt}^{B}$, then $B \in \mathrm{P}_{tt}^{\text{P-men}}$ implies that $A$ has a prover in $\mathrm{FP}_{tt}^{\text{P-men}}$ and hence $A \in$ P by the above theorem. In the next section we shall see that there are many sets in NP for which there exist prover–verifier protocols with a prover in $\mathrm{FP}_{tt}^{B}$ or $\mathrm{FL}^{B}$ for some set $B$ that is not believed to be NP-complete.

Theorem 3.12. L is the class of logspace self-reducible, logspace membership enumerable sets.

Proof. Let $A$ be a logspace self-reducible set in L-men. Then $A$ has a prover in $\mathrm{FL}^A$ by Theorem 3.10. By Theorem 3.11 we have $A \in$ L. □
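The decision procedure in the proof of Theorem 3.11 has a simple control structure, sketched below. All names (`decide`, `queries`, `enumerate_candidates`, `reconstruct`, `verifier`) are hypothetical parameters introduced for illustration, and the toy instance at the end is an assumption chosen only so that the sketch runs; it is not taken from the paper.

```python
def decide(x, queries, enumerate_candidates, reconstruct, verifier):
    """Accept x iff some enumerated candidate yields a proof the verifier accepts."""
    qs = queries(x)                            # the prover's non-adaptive queries to B
    for answers in enumerate_candidates(qs):   # candidates for chi_B(q_1, ..., q_l)
        proof = reconstruct(x, qs, answers)    # what the prover would output on these answers
        if verifier(x, proof):
            return True
    return False


# Toy instance (an assumption for illustration): A = B = the even numbers.
toy_queries = lambda x: [x]
toy_enumerator = lambda qs: ["0", "1"]         # always contains the true answer string
toy_reconstruct = lambda x, qs, answers: answers
toy_verifier = lambda x, proof: proof == "1" and x % 2 == 0


def is_even(x):
    return decide(x, toy_queries, toy_enumerator, toy_reconstruct, toy_verifier)
```

Correctness rests on two properties stated in the proof: the true answer string is among the candidates, and the verifier never accepts a proof for an input outside $A$.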
We obtain Theorem 1.2 from the introduction, which states that L is the class of logspace self-reducible sets in $\mathrm{L}^{\mathrm{BC}(\text{L-sel})}$, as a corollary since $\mathrm{L}^{\mathrm{BC}(\text{L-sel})} \subseteq$ L-men by Corollary 3.8.
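Corollary 3.8 rests on the enumeration procedure from the proof of Theorem 3.7, which can be sketched as follows. This is an illustration under assumptions, not a logspace implementation: it uses an ordinary graph search instead of first-order definable tournament reachability, and it fixes the arc orientation as "from the word not selected to the word selected", which is one reading consistent with the requirement that the vertices reachable from a member of $A$ all lie in $A$. The function name is hypothetical.

```python
def enumerate_characteristic_candidates(words, selector):
    """Output 0^m plus, for each i, the indicator string of what x_i reaches."""
    m = len(words)

    def chosen(a, b):
        # Symmetrised selector g'(x, y) = g(min{x, y}, max{x, y}).
        return selector(min(a, b), max(a, b))

    # Assumed orientation: arc i -> j when the selector picks words[j] over words[i].
    successors = {
        i: [j for j in range(m) if j != i and chosen(words[i], words[j]) == words[j]]
        for i in range(m)
    }
    outputs = ["0" * m]
    for i in range(m):
        reached, stack = {i}, [i]
        while stack:
            u = stack.pop()
            for v in successors[u]:
                if v not in reached:
                    reached.add(v)
                    stack.append(v)
        outputs.append("".join("1" if j in reached else "0" for j in range(m)))
    return outputs
```

For example, with the set $A = \{x \mid x \le 5\}$ and the selector `min`, the words 3, 7, 5, 9 have characteristic string `1010`, which appears among the $m + 1$ strings returned.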
4. Application of the general framework to individual problems and classes

We now apply the general framework established in the previous section to different natural problems and different complexity classes. The degree to which we use "out-of-the-box" results from the framework versus the degree to which we argue specifically varies from problem to problem. The following exposition of results is sorted by decreasing order of the complexity of the problems and classes under consideration. We start with results on the satisfiability problem sat and end with results on the undirected graph accessibility problem ugap.

4.1. Results on the satisfiability problem

Theorem 4.1. If (1sat, sat) has a solution in $\mathrm{P}_{tt}^{\text{P-mc(const)}}$, then (1sat, sat) has a solution in P.

Proof. Let $A \in \mathrm{P}_{tt}^{\text{P-mc(const)}}$ be a solution of (1sat, sat). By Theorem 3.9 this implies that $A \in$ P-men via some machine $M$. For an $n$-variable formula $\varphi$ let $Q_\varphi$ denote the output of $M$ on input $\langle \varphi_1, \dots, \varphi_n \rangle$, where $\varphi_i$ is $\varphi$ with the $i$th variable substituted by 1. We define a set $S$ as the set of all formulas $\varphi$ for which $Q_\varphi$ contains a bit string that is a satisfying assignment of $\varphi$. Clearly, $S \in$ P. Just as clearly, if $\varphi$ has no satisfying assignment, then $\varphi \notin S$. Now suppose $\varphi$ has exactly one satisfying assignment. Then $\varphi_i \in$ 1sat for all $i$, and $\chi_A(\varphi_1, \dots, \varphi_n)$ will be the satisfying assignment of $\varphi$, and hence $\varphi \in S$. This shows that $S \in$ P is a solution of (1sat, sat). □

Corollary 4.2. If (1sat, sat) has a solution in $\mathrm{P}_{tt}^{\text{P-mc(const)}}$, then sat $\in$ RP.

Proof. Valiant and Vazirani [50] have shown that (1sat, sat) having a solution in P implies sat $\in$ RP. By Theorem 4.1, the assumption yields such a solution. □
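The membership test for the set $S$ in the proof of Theorem 4.1 can be sketched as follows. The sketch makes several assumptions for the sake of runnability: formulas are modelled as Python predicates over tuples of booleans, and the polynomial-time enumerator $M$ is replaced by a brute-force stand-in (`brute_force_enumerator`, a hypothetical name) that merely returns a short list containing the true characteristic string.

```python
from itertools import product


def restrict(formula, i):
    """phi_i: the formula with variable i fixed to True."""
    return lambda a: formula(a[:i] + (True,) + a[i + 1:])


def brute_force_enumerator(formulas, n):
    """Toy stand-in for M: a candidate list containing the true answer string."""
    truth = "".join(
        "1" if any(f(a) for a in product((False, True), repeat=n)) else "0"
        for f in formulas
    )
    return ["0" * len(formulas), truth]


def in_S(formula, n, enumerator=brute_force_enumerator):
    """phi is in S iff some enumerated string is a satisfying assignment of phi."""
    candidates = enumerator([restrict(formula, i) for i in range(n)], n)
    return any(formula(tuple(bit == "1" for bit in c)) for c in candidates)
```

If the formula has exactly one satisfying assignment, the answer string for $\varphi_1, \dots, \varphi_n$ spells out that assignment, so the test succeeds; if it has none, no candidate can pass the final check.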
Beigel and Toda [9,48] have shown that sat $\in \mathrm{P}_{tt}^{\text{P-mc(const)}}$ implies sat $\in$ RP. Sivakumar [44] has shown that sat $\in$ P-mc(log) implies sat $\in$ RP. Sivakumar's result is "stronger" than the Beigel–Toda result insofar as $\mathrm{P}_{tt}^{\text{P-mc(const)}} \subsetneq$ P-mc(log) (see [46]). Sivakumar's result is incomparable to our result and we do not know whether the assumption that (1sat, sat) has a solution in P-mc(log) also implies sat $\in$ RP.

4.2. Results on the graph isomorphism problem

We now establish an analog of Sivakumar's result for the graph isomorphism problem gi. For once, the proof does not use the notion of membership enumerability.
Theorem 4.3. If gi $\in$ P-mc(log), then gi $\in$ RP.

Proof. Let gi $\in$ P-mc($c \log n$) for some $c$ via a membership comparing function $g$. We show gi $\in$ RP. Let $\langle G_0, G_1 \rangle$ be a pair of graphs given as input. We run Algorithm 1. We claim that the algorithm will never accept non-isomorphic graphs and will accept isomorphic graphs with probability at least $1/n^c$. First, consider the case when $\langle G_0, G_1 \rangle \notin$ gi. Then the graphs in $H_i$ will be isomorphic if and only if $r_i = 1$. Hence $\chi_{\text{gi}}(H_1, \dots, H_\ell) = r$ and $g(\langle H_1, \dots, H_\ell \rangle)$ will never be equal to $r$. Next, consider the case when $\langle G_0, G_1 \rangle \in$ gi. Then what $g$ "sees" is a list of pairs of isomorphic graphs. There is no information concerning the bit string $r$ in the graph pairs. Since $r$ is selected under uniform distribution, the number of times a graph isomorphic to $G_0$ (and thus, to $G_1$) is selected as $H_i$ is independent of $r$. Hence, the probability that $g(\langle H_1, \dots, H_\ell \rangle)$ equals the randomly chosen $r$ is at least $1/2^{c \log n} = 1/n^c$. Using the standard probability amplification method of repeating the test, we can increase the probability to a constant. □

Algorithm 1
    input $\langle G_0, G_1 \rangle$, each with $k$ vertices
    let $\ell := \lceil c \log n \rceil$ where $n$ is the length of $\langle G_0, G_1 \rangle$
    guess $r = r_1 \dots r_\ell \in \{0,1\}^\ell$
    forall $i \in \{1, \dots, \ell\}$ do
        guess permutation $\sigma_i : \{1, \dots, k\} \to \{1, \dots, k\}$
        let $H_i := (G_1, \sigma_i(G_{r_i}))$
    if $g(\langle H_1, \dots, H_\ell \rangle) = r$ then
        output "isomorphic" and accept
    else
        output "perhaps non-isomorphic" and reject

4.3. Results on the graph automorphism problem

Beigel et al. [10] and Ogihara [38] independently showed for a number of problems that are unknown to be in P, including ga and sat, that if the problem is in $\mathrm{P}_{n^{1-\epsilon}\text{-}tt}^{\text{P-sel}}$ then it is already in P. It was questioned whether the results can be improved by increasing the number of permissible queries while maintaining the collapse to P.
Although it is unknown whether such an improvement is possible for other problems, in Theorem 4.7 we show that such an improvement is indeed possible for ga. For the proof, we first show that there exists a prover–verifier protocol for ga where the prover can be computed with truth-table oracle access to any solution of the promise problem (1ga, ga). Previously, it was only known that ga has a prover in $\mathrm{FP}_{tt}^{\text{ga}}$. Our proof builds on Lemma 4.4. This is an improved version of Theorem 1.31 of [28], which states that ga $\in \mathrm{P}^A$ for every solution $A$ of (1ga, ga). The new idea in our proof is the use of the
parallel census technique [20,21] to replace the Turing reduction by a disjunctive truth-table reduction.

Lemma 4.4. Let $A$ be any solution of (1ga, ga). Then ga $\le^{p}_{dtt} A$.

Proof. In this proof we use the notation of [28]. Let $G = (V, E)$ with $|V| = n$ be an undirected graph given as input. We may assume that $G$ is connected (otherwise consider the complement graph). Using Mathon's method [33] we first construct graphs $G^{(0)}, \dots, G^{(n)}$ that form a tower of pointwise stabilizers in $G$'s automorphism group. These graphs are obtained as follows: in $G^{(i)}$ we color the first $i$ vertices with $i$ different colors, which forces any automorphism of $G^{(i)}$ to map the first $i$ vertices to themselves. Note that every automorphism of a graph $G^{(i)}$ is also an automorphism of $G$. As shown in [28], if $G$ has a non-trivial automorphism, there is an index $i_0$ such that the following conditions hold:

(1) None of $G^{(i_0)}, \dots, G^{(n)}$ has a non-trivial automorphism.
(2) The graph $G^{(i_0-1)}$ has an automorphism mapping $i_0$ to some $j_0 > i_0$.
(3) For every $j > i_0$ there is at most one automorphism in $G^{(i_0-1)}$ mapping $i_0$ to $j$.
(4) In $G^{(i_0-1)}$ there is no non-trivial automorphism mapping $i_0$ to $i_0$.

For $i, j \in \{1, \dots, n\}$ with $i < j$ consider the graphs $H^{ij} := G^{(i-1)}_{[i]} \cup G^{(i-1)}_{[j]}$, which are obtained by putting two copies of $G^{(i-1)}$ alongside each other and labeling $i$ in a special way in the first copy and $j$ in an identical way in the second copy. This forces any automorphism of $H^{ij}$ to map the vertex $i$ either to $i$ in the first copy or to $j$ in the second copy. For all $i > i_0$ the graphs $H^{ij}$ will not have any non-trivial automorphism by condition (1). For $i = i_0$ the graph $H^{i_0 j_0}$ will have exactly one non-trivial automorphism by conditions (2)–(4), namely the automorphism that interchanges $G^{(i_0-1)}_{[i_0]}$ and $G^{(i_0-1)}_{[j_0]}$ according to the unique automorphism in $G^{(i_0-1)}$ that maps $i_0$ to $j_0$. We ask the graphs $H^{ij}$ as queries to $A$ and accept if and only if at least one answer is "yes." (This is where we use the parallel census technique.) The queries will not necessarily satisfy the promise, but this does not matter: Suppose $G \notin$ ga. Then none of the graphs $H^{ij}$ has a non-trivial automorphism. Thus they all fulfill the promise and all answers are "no" and we reject. If $G \in$ ga, then at least one graph, namely $H^{i_0 j_0}$, will fulfill the promise and it is in ga. Thus the answer for this graph will be "yes" and we accept. □

Since RP is closed under dtt-reductions, we note in passing that the following corollary holds:

Corollary 4.5. If (1ga, ga) has a solution in RP, then ga $\in$ RP.

Theorem 4.6. ga has a prover in $\mathrm{FP}_{tt}^{A}$ for every solution $A$ of (1ga, ga).

Proof. Agrawal and Arvind [1, Theorem 3.1] have shown that the lexicographically first non-trivial automorphism of a graph $G$ can be computed in polynomial time with non-adaptive queries
to ga. In particular, ga has a prover–verifier protocol where the prover is in $\mathrm{FP}_{tt}^{\text{ga}}$: the prover maps a graph $G$ to a non-trivial automorphism, if one exists; the verifier checks whether its input is a non-trivial automorphism. Since ga $\in \mathrm{P}_{dtt}^{A}$ by Lemma 4.4, the prover is also in $\mathrm{FP}_{tt}^{A}$. □

Theorem 4.7. If (1ga, ga) has a solution in $\mathrm{P}_{tt}^{\text{P-mc(const)}}$, then ga $\in$ P.

Proof. Let $A \in \mathrm{P}_{tt}^{\text{P-mc(const)}} \subseteq$ P-men be a solution of (1ga, ga). By Theorem 4.6 the language ga has a prover in $\mathrm{FP}_{tt}^{A}$. By Theorem 3.11 this implies ga $\in$ P. □

Using the same argument, but allowing for superpolynomially large sets of possible certificates, it is easy to show the following more general result: if (1ga, ga) has a solution in $\mathrm{P}_{tt}^{\text{P-mc}(f)}$, then ga $\in \mathrm{DTIME}[n^{O(f(n))}]$.

4.4. Results on the graph automorphism counting function

Our framework can be applied not only to decision problems but also to function problems. Theorem 4.9 demonstrates how the framework can be used to prove a new connection between the enumerability of #ga and ga itself. The proof uses the following fact.

Fact 4.8 (Beals et al. [8]). $\chi_{\text{gi}} \le^{p}_{m}$ #ga.

Theorem 4.9. If #ga has an enumerator in FP, then ga $\in$ P.

Proof. By Fact 4.8, if #ga has an enumerator in FP, then gi $\in$ P-men. Since ga $\le^{p}_{m}$ gi, we also have ga $\in$ P-men. Since ga has a prover in $\mathrm{FP}_{tt}^{\text{ga}}$, by Theorem 3.11 we have ga $\in$ P. □

Note that gi $\in$ P-men implies gi $\in$ P-mc(log) by Theorem 3.9 and thus gi $\in$ RP by Theorem 4.3. Thus if #ga has an enumerator in FP, then gi $\in$ RP. This result was previously proved in [8].

4.5. Results on the circuit value problem

Theorem 4.10. If cvp $\in$ L-men, then cvp $\in$ L.

Proof. For this proof we apply an argument that we will reapply in similar form in several of the following proofs. As shown by Balcázar [7], the circuit value problem is logspace self-reducible for an appropriate coding (namely where the code ends with the gate number for which we would like to know the output). Since search reduces to decision for logspace self-reducible problems, cvp has a prover in $\mathrm{FL}^{\text{cvp}}$. If we have cvp $\in$ L-men, then cvp has a prover in $\mathrm{FL}^{\text{L-men}}$. By Theorem 3.11, this implies cvp $\in$ L. □
4.6. Results on formal language classes

Theorem 4.11.
(1) If DCFL $\subseteq$ L-men, then DCFL $\subseteq$ L.
(2) If CFL $\subseteq$ L-men, then CFL $\subseteq$ L.

Proof. Balcázar [7] has constructed a logspace self-reducible set $B_1$ that is $\le^{\log}_{m}$-complete for LOGDCFL, the $\le^{\log}_{m}$-reduction closure of DCFL. Assume DCFL $\subseteq$ L-men. Then we also have $\mathrm{L}^{\mathrm{DCFL}} \subseteq$ L-men since L-men is closed under logspace Turing reductions. Thus $B_1 \in$ L-men, and the logspace self-reducibility of $B_1$ together with Theorems 3.10 and 3.11 implies $B_1 \in$ L. Since $B_1$ is complete for LOGDCFL, which contains DCFL, we can conclude DCFL $\subseteq$ L. For the second claim, just note that Balcázar has also constructed a $\le^{\log}_{m}$-complete set $B_2$ for LOGCFL. □

Corollary 4.12. If $\mathrm{SAC}^1 \subseteq$ L-men, then $\mathrm{SAC}^1 =$ L.

Proof. Recall that $\mathrm{SAC}^1 =$ LOGCFL. Thus $\mathrm{SAC}^1 \subseteq$ L-men implies CFL $\subseteq$ L-men, which yields CFL $\subseteq$ L by Theorem 4.11, and CFL $\subseteq$ L implies $\mathrm{SAC}^1 \subseteq$ L. □
4.7. Results on logspace counting classes

In this section we present results on the logspace counting class #L and the classes $\mathrm{Mod}_k\mathrm{L}$. We begin with a theorem that states that variants of the reachability problem for topologically sorted directed acyclic graphs (t.-s. dags) are logspace self-reducible. Combined with Theorem 3.10 this shows that search reduces to decision for these variants.

Theorem 4.13. The following sets are logspace self-reducible for all $k$:
$R_k := \{\langle G, s, t, c \rangle \mid G \text{ is a t.-s. dag}, \#\mathrm{paths}_G(s,t) \bmod k = c\}$ and
$R := \{\langle G, 1^k, s, t, c \rangle \mid G \text{ is a t.-s. dag}, \#\mathrm{paths}_G(s,t) \bmod k = c\}$.

Proof. For the sets $R_k$, on input $\langle G, s, t, c \rangle$ with $c < k$ we ask the queries $\langle G, s', t, c' \rangle$ for all successors $s'$ of $s$ and all $c' \in \{0, \dots, k-1\}$. These queries differ from the input only on the last $2 \log n + \log k$ bits, where $n$ is the number of vertices in the graph, since we need $\log n$ bits to encode one vertex. From the answers to these queries we can easily deduce whether $\langle G, s, t, c \rangle \in R_k$. For the set $R$ we can apply the same algorithm: on input $\langle G, 1^k, s, t, c \rangle$ we ask the queries $\langle G, 1^k, s', t, c' \rangle$ for all successors $s'$ of $s$ and all $c' \in \{0, \dots, k-1\}$. Again, the queries differ from the input only on $2 \log n + \log k$ bits, and both $n$ and $k$ are bounded by the input's length. □
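How the answers determine membership in $R_k$ can be sketched as follows. The sketch assumes a dag given as a Python adjacency dictionary, counts the empty path from $t$ to itself as one path (a base case the proof does not spell out), and uses a recursive path counter as a stand-in oracle; the function names are hypothetical.

```python
def count_paths_mod(graph, s, t, k):
    """#paths_G(s, t) mod k, computed directly; used only as a toy oracle."""
    if s == t:
        return 1 % k
    return sum(count_paths_mod(graph, u, t, k) for u in graph[s]) % k


def oracle(graph, s, t, c, k):
    """Stand-in for a membership query <G, s, t, c> to R_k."""
    return count_paths_mod(graph, s, t, k) == c


def self_reduce(graph, s, t, c, k, oracle=oracle):
    """Decide <G, s, t, c> in R_k using only queries about successors of s."""
    if s == t:
        return c == 1 % k
    total = 0
    for successor in graph[s]:
        # Exactly one residue c2 is answered "yes" for each successor.
        total += next(c2 for c2 in range(k) if oracle(graph, successor, t, c2, k))
    return total % k == c
```

The deduction uses the recurrence $\#\mathrm{paths}_G(s,t) = \sum_{s'} \#\mathrm{paths}_G(s',t)$ over the successors $s'$ of $s$, taken modulo $k$.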
For each $k$, the set $R_k$ is a canonically $\le^{\log}_{m}$-complete problem for $\mathrm{Mod}_k\mathrm{L}$. This allows us to conclude the following theorem:

Theorem 4.14. For every $k$, if $\mathrm{Mod}_k\mathrm{L} \subseteq$ L-men, then $\mathrm{Mod}_k\mathrm{L} =$ L.

Proof. As in the proof of Theorem 4.10 for the circuit value problem, we can argue as follows: If $\mathrm{Mod}_k\mathrm{L} \subseteq$ L-men, then the logspace self-reducible set $R_k$ is in L-men and thus also in L. This implies $\mathrm{Mod}_k\mathrm{L} \subseteq$ L. □

Our final result of this section concerns the class $\mathrm{L}^{\#\mathrm{L}}$. Once more, we show that $\mathrm{L}^{\#\mathrm{L}} \subseteq$ L-men implies $\mathrm{L}^{\#\mathrm{L}} =$ L. The difficulty in proving this lies in showing that $R$ is complete for $\mathrm{L}^{\#\mathrm{L}}$. We do not know whether $R$ is $\le^{\log}_{m}$-complete for this class, but we can show that it is $\le^{\log}_{T}$-complete, which will be sufficient for our purposes.

Theorem 4.15. The set $R$ is $\le^{\log}_{T}$-complete for $\mathrm{L}^{\#\mathrm{L}}$.
Proof. Clearly $R$ is in $\mathrm{L}^{\#\mathrm{L}}$. It thus remains to show hardness. Recall that the function #paths maps a tuple $\langle G, s, t \rangle$ to the number of paths in $G$ from $s$ to $t$. The restriction of this function to t.-s. dags is well-known to be a canonically $\le^{\log}_{m}$-complete problem for #L. Essentially, in order to show the hardness of $R$, we must show how on input of a tuple $\langle G, s, t \rangle$ we can compute $X := \#\mathrm{paths}_G(s,t)$ using queries to $R$.

The obvious difficulty is that $R$ gives us only somewhat indirect information about $X$: it just tells us whether $X$ equals some number $c$ when taken modulo some number $k$. Furthermore, the number $k$ must be given in unary and we can thus only use small $k$. The idea is to switch to the Chinese remainder representation of $X$. The Chinese remainder representation of (any) number $X$ is given by the tuple $(X \bmod 2, X \bmod 3, X \bmod 5, \dots, X \bmod p)$, where $p$ is a prime number such that the product $P$ of all primes up to $p$ is larger than $X$. By the Chinese remainder theorem all numbers smaller than $P$ have a unique Chinese remainder representation. The work of Chiu et al. [15] shows that converting the Chinese remainder representation of a number into its binary representation can be done in logarithmic space. Thus, if we know all entries of the Chinese remainder representation of $X$ and if we choose $p$ large enough such that $P > X$, we can compute $X$ in logarithmic space.

First, let us bound $p$: The number of paths between two vertices in an $n$-vertex dag is bounded by $n^n \le 2^{n^2}$. If we pick $p$ to be the $n^2$th prime number, then $P$ will be at least $2^{n^2}$ since it is the product of $n^2$ numbers, each at least 2. Thus, picking $p$ to be the $n^2$th prime number ensures $X < P$. It is known [41] that the $k$th prime number is $O(k^2)$. Thus $p < n^4$. This means that the length of $p$ is only $O(\log n)$ and we can thus easily compute all prime numbers up to $p$.
Our queries to $R$ are $\langle G, 1^{p_i}, s, t, c \rangle$, where $p_i$ is the $i$th prime number, for $i \in \{1, \dots, n^2\}$ and $c \in \{0, \dots, p_i - 1\}$. For each $i$ there will be exactly one $c \in \{0, \dots, p_i - 1\}$ for which the query is answered with "yes." Then $X \bmod p_i = c$. Thus, from the answers to our queries we can reconstruct $(X \bmod 2, X \bmod 3, X \bmod 5, \dots, X \bmod p)$ and hence $X$. □

Corollary 4.16. If $\mathrm{L}^{\#\mathrm{L}} \subseteq$ L-men, then #L = FL.

Proof. Since $R$ is both logspace self-reducible and $\le^{\log}_{T}$-complete for $\mathrm{L}^{\#\mathrm{L}}$, as in the proof of Theorem 4.10 for the circuit value problem we conclude that $\mathrm{L}^{\#\mathrm{L}} \subseteq$ L-men implies $\mathrm{L}^{\#\mathrm{L}} \subseteq$ L. This implies #L = FL. □
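The reconstruction of $X$ in the proof of Theorem 4.15 can be illustrated as follows. This is a sketch under assumptions: the oracle for $R$ is simulated by a direct path counter, the graph is a Python adjacency dictionary, and ordinary big-integer arithmetic is used for the Chinese remainder step, so the sketch says nothing about the logarithmic space bound obtained via [15]. All function names are hypothetical.

```python
def count_paths(graph, s, t):
    """Number of paths from s to t in a dag (the empty path counts when s == t)."""
    if s == t:
        return 1
    return sum(count_paths(graph, u, t) for u in graph[s])


def toy_oracle(graph, k, s, t, c):
    """Stand-in for a query <G, 1^k, s, t, c> to the set R."""
    return count_paths(graph, s, t) % k == c


def first_primes(count):
    """The first `count` prime numbers, by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def reconstruct_count(graph, s, t, oracle=toy_oracle):
    """Recover X = #paths_G(s, t) from residue queries modulo the first n^2 primes."""
    n = len(graph)
    x, modulus = 0, 1
    for p in first_primes(n * n):
        residue = next(c for c in range(p) if oracle(graph, p, s, t, c))
        # Chinese remainder step: keep x mod modulus, force the new value mod p.
        step = ((residue - x) * pow(modulus, -1, p)) % p
        x += modulus * step
        modulus *= p
    return x
```

The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.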
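4.8. Results on the determinant function

Theorem 4.17. If det $\in \mathrm{FL}^{\text{L-men}}$, then det $\in$ FL.

Proof. Assume det $\in \mathrm{FL}^{\text{L-men}}$. Then $\mathrm{L}^{\text{det}} \subseteq \mathrm{L}^{\text{L-men}} =$ L-men. The $\le^{\log}_{m}$-reduction closure of det is known [16,47,49,53] to be exactly GapL, the class of functions that are the difference of two functions in #L. Hence $\mathrm{L}^{\text{det}} = \mathrm{L}^{\mathrm{GapL}} = \mathrm{L}^{\#\mathrm{L}}$. But then $\mathrm{L}^{\#\mathrm{L}} \subseteq$ L-men and hence $\mathrm{L}^{\#\mathrm{L}} =$ L by Corollary 4.16 above. This implies det $\in$ FL. □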
4.9. Results on the directed graph accessibility problem

Theorem 4.18. If gap $\in$ L-men, then gap $\in$ L.

Proof. Since gap is logspace self-reducible just like the circuit value problem, the proof is exactly the same as the proof of Theorem 4.10 for the analogous result for cvp. □

Since L-sel $\subseteq$ L-men, the above theorem also shows that it is "unlikely" that the graph accessibility problem is L-selective. However, the theorem makes no claims if we assume that gap is logspace $k$-membership comparable, because it is unclear whether L-mc(const) $\subseteq$ L-men holds analogously to the polynomial-time setting. To prove that gap $\in$ L-mc(const) also implies gap $\in$ L, we have to use a different proof technique. The technique we use is transferred from the one used in [2,10,38] for showing that sat $\in$ P-mc($k$) implies sat $\in$ P. The same proof technique can be used to show that $\mathrm{Mod}_k\mathrm{L} \subseteq$ L-mc(const) implies $\mathrm{Mod}_k\mathrm{L} =$ L.
Theorem 4.19. If gap $\in$ L-mc(const), then gap $\in$ L.

Proof. If gap is in L-mc($k$), then so is the reachability problem for dags with out-degrees at most 2, which is easily seen to be NL-complete. Let $\langle G, s, t \rangle$ be an input for this problem. We start a search from the source $s$, keeping track of a list of at most $2^k$ many vertices that fulfills two requirements.

(1) All vertices in the list are reachable from $s$.
(2) If $t$ is reachable from $s$, then $t$ is also reachable from at least one vertex in the list.

Initially, the list contains only the source, which clearly fulfills the requirements. As long as the list has not grown to size $2^k$, we remove the first element of the list and add its two successors to the list. This, too, does not violate either requirement. If we ever put the target into the list, we accept. If the list ever becomes empty, we reject.

If the list grows to size $2^k$, we remove one vertex from the list that has the property that it is not the only vertex from which $t$ is reachable. To obtain such an element, we use a method originally employed in [2,10,38] in a polynomial-time setting. Let $v_b$ with $b \in \{0,1\}^k$ denote the vertices in the list. For a number $i \in \{1, \dots, k\}$ let $V_i$ denote the set of vertices $v_b$ such that the $i$th bit of $b$ is a 1. The set $V_1$ contains the last "half" of the vertices, the set $V_2$ contains the second and last "quarter" of the vertices, and the set $V_k$ contains "every second" vertex. For each $i \in \{1, \dots, k\}$ we construct a triple $\langle D_i, s_i, t_i \rangle$ consisting of a dag, a source vertex, and a target vertex. As always, these dags are not written down anywhere but are dynamically recalculated.
The objective of the construction is to enforce the following property:

($\ast$) There is a path from $s_i$ to $t_i$ in $D_i$ if and only if there is a path from some $v \in V_i$ to $t$ in $G$.

The exact construction of $D_i$ is as follows: For each $v \in V_i$ the dag $D_i$ contains a copy $G_v$ of $G$. The target $t_i$ of $D_i$ is obtained by merging together all copies of the vertex $t$ across the different $G_v$ to a single vertex. For the source $s_i$ of $D_i$ we do not merge together all copies of the vertex $s$ across the different $G_v$. Rather, we take one vertex from each $G_v$, namely the vertex $v$, and merge these together to form the source $s_i$. Clearly, the construction ensures property ($\ast$).

We run the $k$-membership comparing function on $\langle \langle D_1, s_1, t_1 \rangle, \dots, \langle D_k, s_k, t_k \rangle \rangle$. Consider the bit string $b$ output by the function. We claim that it is the index of a vertex $v_b$ that is not the only vertex in the list from which the target $t$ is reachable in $G$. If this claim is true, we can remove $v_b$ from our list and properties (1) and (2) will still hold. So, for the sake of contradiction assume that $v_b$ were the only vertex in the list from which we could reach the target $t$. Then exactly those $V_i$ containing $v_b$ would contain a vertex from which $t$ is reachable. This in turn implies that exactly in the corresponding $D_i$ there would be a path from $s_i$ to $t_i$. Thus exactly those triples $\langle D_i, s_i, t_i \rangle$ are instances of gap for which $v_b \in V_i$. Since $v_b \in V_i$ if and only if the $i$th bit of $b$ is a 1,
we conclude that $b$ is exactly the characteristic string of the input to the $k$-membership comparing function, a contradiction. Since it is easily seen that we never reach the same list configuration twice, we get the claim. □

4.10. Results on unambiguous logarithmic space

In this section we study the class UL of unambiguous logarithmic space and its generalization FewL. Recall that these classes are obtained from the class NL by restricting the non-deterministic Turing machine to have only one or only polynomially many accepting paths, respectively. We start with a result on the prover complexity of sets in these classes.

Theorem 4.20. Each set in UL has a prover in $\mathrm{FL}^{\mathrm{UL}}$. Each set in FewL has a prover in $\mathrm{FL}^{\mathrm{FewL}}$.

Proof. Let $A \in$ UL via $M$. Define $B := \{\langle x, c \rangle \mid x$ is an input for $M$, $c$ is a vertex in the topologically sorted directed acyclic configuration graph of $M$, and there is a path from the initial configuration of $M$ on input $x$ to the accepting configuration through $c\}$. Then $B \in$ UL. On input $x$ the prover queries $\langle x, c \rangle$ for every configuration $c$ of $M$ on input $x$. It passes the characteristic string of these queries with respect to $B$ as its certificate to the verifier. On input $\langle x, b \rangle$ the verifier simulates $M$ on input $x$ and uses the bit string $b$ to decide non-deterministic choices. If it reaches the accepting configuration under the "guidance" of $b$, it accepts. For $A \in$ FewL we construct the same set $B$, which is now an element of FewL. Once more $A$ has a prover in $\mathrm{FL}^{B}$, only this time the verifier may find multiple legal non-deterministic choices in $b$. But then it can simply pick, say, the smallest one. □

Corollary 4.21. If UL $\subseteq$ L-men, then UL $\subseteq$ L. If FewL $\subseteq$ L-men, then FewL $\subseteq$ L.

4.11. Results on the undirected graph accessibility problem

Our first aim in this section is to show that ugap $\in$ L-men implies ugap $\in$ L. For this, we do not prove that ugap is logspace self-reducible (we do not know whether this is the case), but only that search reduces to decision for it.

Theorem 4.22.
The problem ugap has a prover in $\mathrm{FL}^{\text{ugap}}$ such that for every input all queries of this prover have the same length.

Proof. Nisan and Ta-Shma [37] have shown that SL is closed under complement. This implies that not only ugap, but also its complement $\overline{\text{ugap}}$ is $\le^{\log}_{m}$-complete for SL. Thus there exists a function $h \in$ FL that maps every tuple $\langle G, s, t \rangle$, consisting of an undirected graph $G$ and its vertices $s$ and $t$, to a tuple $\langle G', s', t' \rangle$ such that $\langle G, s, t \rangle \in$ ugap if and only if $\langle G', s', t' \rangle \in \overline{\text{ugap}}$. For the construction of the prover–verifier protocol for ugap, we begin with an input $\langle G, s, t \rangle$. The prover first maps it, using $h$, to $\langle G', s', t' \rangle$ (as always this is never written down anywhere, but each bit is calculated as needed). The problem now is to decide whether $\langle G', s', t' \rangle \in \overline{\text{ugap}}$.
The prover next maps $\langle G', s', t' \rangle$ to the set of all vertices reachable from $s'$ in $G'$. Clearly, this prover is in $\mathrm{FL}^{\text{ugap}}$ and all queries will have the same length. Our verifier gets as input a pair consisting of a problem instance $\langle G, s, t \rangle$ for ugap and an alleged proof $I$. The alleged proof is a set of vertices in $G'$. The verifier accepts if the following three conditions are met:

(1) $s' \in I$,
(2) $t' \notin I$,
(3) $I$ is closed under reachability in $G'$, i.e., the neighborhood of every vertex in $I$ is contained in $I$.

A set $I$ with these properties indeed "proves" that there is no path from $s'$ to $t'$ in $G'$ and hence $\langle G', s', t' \rangle \in \overline{\text{ugap}}$, which in turn proves $\langle G, s, t \rangle \in$ ugap. □

As a corollary we obtain the following theorem:

Theorem 4.23. If ugap $\in$ L-men, then ugap $\in$ L.

Theorem 4.24. If ugap $\in$ L/log, then ugap $\in$ L.

Proof. Assume ugap $\in$ L/log. Consider the prover in $\mathrm{FL}^{\text{ugap}}$ for ugap from Theorem 4.22. On input of a graph $G$ compute the prover's queries $q_1, \dots, q_k$, all of which have the same length $\ell$. As always, we do not actually write these graphs down anywhere, but dynamically recalculate their bits as needed. We next try to compute $\chi_{\text{ugap}}(q_1, \dots, q_k)$, using the assumption ugap $\in$ L/log. The queries $q_i$ all have the same length. Thus if we knew the correct advice for length $\ell$, we could easily compute this string. However, we do not know this advice string. Since there are only polynomially many possible advice strings, we can, however, enter a loop in which we cycle through all "advice candidates." For each advice candidate we compute a candidate for $\chi_{\text{ugap}}(q_1, \dots, q_k)$. For incorrect advice string candidates we will not compute this string correctly, but we can use the verifier to catch this: For each candidate certificate we check whether it shows $G \in$ ugap. If there exists such a certificate, we can accept, otherwise we reject. If, indeed, $G \in$ ugap, then for some advice candidate we will "hit" the right certificate and will accept.
If $G \notin$ ugap, then no certificate exists and we will not accept, independently of whether an advice candidate is correct or not. □

Balcázar has shown that gap $\in$ L/log implies gap $\in$ L and also that cvp $\in$ L/log implies cvp $\in$ L, but did not prove Theorem 4.24 since his proof relies on the self-reducibility of these problems, rather than on the existence of a prover–verifier protocol.
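The check that lets the proof of Theorem 4.24 discard certificates computed from wrong advice is the verifier from Theorem 4.22. A minimal sketch of its three conditions follows, assuming the graph $G'$ is given as a Python adjacency dictionary and the alleged proof $I$ as a collection of vertices; the function name is hypothetical and the sketch ignores the logarithmic space bound.

```python
def verifier_accepts(neighbors, s, t, certificate):
    """Check the three conditions on the alleged proof I for the graph G'."""
    inside = set(certificate)
    return (
        s in inside                                   # (1) s' is in I
        and t not in inside                           # (2) t' is not in I
        and all(v in inside for u in inside for v in neighbors[u])  # (3) I is closed
    )
```

For the graph with edges {0,1} and {2,3}, the set {0, 1} is accepted as a proof that 2 is unreachable from 0, whereas the set {0} is rejected because it is not closed under taking neighbors.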
5. Conclusion

We have shown that a wide variety of natural problems cannot be reduced to sets with low information content unless unlikely collapses of complexity classes occur. For the proofs we
developed a proof framework that is based on the new notion of membership enumerability and on prover–verifier protocols with low-complexity provers.

Several of our logspace results boil down to a new characterization of L as the class of logspace self-reducible sets in $\mathrm{L}^{\mathrm{BC}(\text{L-sel})}$, the $\le^{\log}_{T}$-reduction closure of the Boolean closure of L-sel. A more general form of this characterization was that L is the class of logspace self-reducible, logspace membership enumerable sets. A much less general form is the corollary that L is also the class of logspace self-reducible L-selective sets. This is the typical kind of result available in the polynomial-time setting. For example, P is known to be the class of self-reducible P-selective sets, but not known to be the class of self-reducible sets in BC(P-sel), let alone in $\mathrm{P}^{\mathrm{BC}(\text{P-sel})} =$ P/poly.

The presence of a Turing reduction in our characterization of L makes it somewhat "robust." For example, Balcázar has shown that LOGCFL $\subseteq$ L/log implies CFL $\subseteq$ L, but he points out that the assumption CFL $\subseteq$ L/log is insufficient to arrive at the conclusion CFL $\subseteq$ L, since L/log is not known to be closed under $\le^{\log}_{m}$-reductions. Our Theorem 1.1 does not suffer from this problem. We arrive at the conclusion CFL $\subseteq$ L already from the assumption CFL $\subseteq \mathrm{L}^{\mathrm{BC}(\text{L-sel})}$. We point out that Austinat et al. [5,6] have shown, unconditionally, that no inherently context-free language is finite automaton $O(1)$-membership comparable (see [5,6] for definitions). In particular, no inherently context-free languages are in the Boolean closure of the finite-automata selective sets.

We have not claimed that $\le^{\log}_{T}$-reducibility to L-mc($k$) sets has any dramatic consequences. The reason is that although we could show L-sel $\subseteq$ L-men, it is not known whether L-mc($k$) $\subseteq$ L-men for all $k$. In a personal communication Arfst Nickelsen reported that L-mc(2) $\subseteq$ L-men, but the general case remains open.
Another persisting open problem is the question of whether SAT being $\le^{p}_{tt}$-reducible to P-sel implies $\mathrm{SAT} \in \mathrm{P}$, and likewise for GI. Since we showed that $\mathrm{GI} \in \text{P-mc}(\log)$ implies $\mathrm{GI} \in \mathrm{RP}$, we implicitly get that GI being $\le^{p}_{tt}$-reducible to P-sel implies $\mathrm{GI} \in \mathrm{RP}$. Interestingly, we were not able to show that $\mathrm{GA} \in \text{P-mc}(\log)$ implies $\mathrm{GA} \in \mathrm{RP}$.
Acknowledgments
The authors are very grateful to Lane Hemaspaandra for useful suggestions and for pointers to literature. The authors thank an anonymous referee for helpful comments.
References
[1] M. Agrawal, V. Arvind, A note on decision versus search for graph automorphism, Inform. and Comput. 131 (2) (1996) 179–189.
[2] M. Agrawal, V. Arvind, Quasi-linear truth-table reductions to p-selective sets, Theoret. Comput. Sci. 158 (1/2) (1996) 361–370.
[3] C. Àlvarez, B. Jenner, A very hard log-space counting class, Theoret. Comput. Sci. 107 (3) (1993) 3–30.
[4] V. Arvind, Y. Han, L. Hemachandra, J. Köbler, A. Lozano, M. Mundhenk, M. Ogiwara, U. Schöning, R. Silvestri, T. Thierauf, Reductions to sets of low information content, in: K. Ambos-Spies, S. Homer, U. Schöning (Eds.), Complexity Theory, Cambridge University Press, Cambridge, 1993, pp. 1–45.
[5] H. Austinat, V. Diekert, U. Hertrampf, A structural property of regular frequency computations, Theoret. Comput. Sci. 292 (1) (2003) 33–43.
[6] H. Austinat, V. Diekert, U. Hertrampf, H. Petersen, Regular frequency computations, in: Proceedings of the RIMS Symposium on Algebraic Systems, Formal Languages and Computation, RIMS Kokyuuroku, Vol. 1166, Research Institute for Mathematical Science, Kyoto University, Japan, 2000, pp. 32–42.
[7] J. Balcázar, Logspace self-reducibility, in: Proceedings of the Third Conference on Structure in Complexity Theory, IEEE Computer Society Press, Silver Spring, MD, 1988, pp. 40–46.
[8] R. Beals, R. Chang, W. Gasarch, J. Torán, On finding the number of graph automorphisms, Chicago J. Theoret. Comput. Sci. 1999 (1) (1999) Article 1.
[9] R. Beigel, NP-hard sets are P-superterse unless $\mathrm{R} = \mathrm{NP}$, Technical Report 88-04, Dept. of Computer Science, Johns Hopkins University, 1988.
[10] R. Beigel, M. Kummer, F. Stephan, Approximable sets, Inform. and Comput. 120 (2) (1995) 304–314.
[11] A. Beygelzimer, M. Ogihara, The (non) enumerability of the determinant and the rank, Theory Comput. Syst. 36 (4) (2003) 359–374.
[12] H. Buhrman, L. Torenvliet, P-selective self-reducible sets: a new characterization of P, J. Comput. System Sci. 53 (2) (1996) 210–217.
[13] G. Buntrock, C. Damm, U. Hertrampf, C. Meinel, Structure and importance of Logspace-MOD class, Math. Systems Theory 25 (3) (1992) 223–237.
[14] J. Cai, L. Hemachandra, Enumerative counting is hard, Inform. and Comput. 82 (1) (1989) 34–44.
[15] A. Chiu, G. Davida, B. Litow, Division in logspace-uniform $\mathrm{NC}^1$, Theoret. Informatics Appl. 35 (3) (2001) 259–275.
[16] C. Damm, $\mathrm{DET} = \mathrm{L}^{\#\mathrm{L}}$, Technical Report Informatik-Preprint 8, Fachbereich Informatik der Humboldt-Universität zu Berlin, 1991.
[17] S. Even, A. Selman, Y. Yacobi, The complexity of promise problems with applications to public-key cryptography, Inform. and Control 61 (2) (1984) 159–173.
[18] S. Even, Y. Yacobi, Cryptocomplexity and NP-completeness, in: Proceedings of the International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, Vol. 85, Springer, Berlin, 1980, pp. 195–207.
[19] O. Goldreich, S. Micali, A. Wigderson, Proofs that yield nothing but their validity and a methodology of cryptographic protocol design, in: Proceedings of the 27th Symposium on Foundations of Computer Science, IEEE Computer Society Press, Silver Spring, MD, 1986, pp. 174–187.
[20] J. Hartmanis, On sparse sets in $\mathrm{NP} - \mathrm{P}$, Inform. Process. Lett. 16 (2) (1983) 55–60.
[21] J. Hartmanis, N. Immerman, V. Sewelson, Sparse sets in $\mathrm{NP} - \mathrm{P}$: EXPTIME versus NEXPTIME, Inform. and Control 65 (2/3) (1985) 158–181.
[22] L. Hemaspaandra, A. Hoene, A. Naik, M. Ogihara, A. Selman, T. Thierauf, J. Wang, Nondeterministically selective sets, Internat. J. Found. Comput. Sci. 6 (4) (1995) 403–416.
[23] L. Hemaspaandra, A. Naik, M. Ogihara, A. Selman, Computing solutions uniquely collapses the polynomial hierarchy, SIAM J. Comput. 25 (4) (1996) 697–708.
[24] L. Hemaspaandra, L. Torenvliet, Optimal advice, Theoret. Comput. Sci. 154 (2) (1996) 367–377.
[25] N. Immerman, Descriptive Complexity, Springer, Berlin, 1998.
[26] R. Karp, R. Lipton, Some connections between uniform and non-uniform complexity classes, in: Proceedings of the 12th ACM Symposium on Theory of Computing, ACM Press, New York, 1980, pp. 302–309.
[27] K. Ko, On self-reducibility and weak P-selectivity, J. Comput. System Sci. 26 (2) (1983) 209–221.
[28] J. Köbler, U. Schöning, J. Torán, The Graph Isomorphism Problem: Its Structural Complexity, Birkhäuser, Basel, 1993.
[29] R. Ladner, N. Lynch, Relativization of questions about logspace computability, Math. Systems Theory 10 (1) (1976) 19–32.
[30] R. Ladner, N. Lynch, A. Selman, A comparison of polynomial time reducibilities, Theoret. Comput. Sci. 1 (2) (1975) 103–123.
[31] H. Lewis, C. Papadimitriou, Symmetric space-bounded computation, Theoret. Comput. Sci. 19 (2) (1982) 161–187.
[32] A. Lozano, J. Torán, On the nonuniform complexity of the graph isomorphism problem, in: K. Ambos-Spies, S. Homer, U. Schöning (Eds.), Complexity Theory, Cambridge University Press, Cambridge, 1993, pp. 245–271.
[33] R. Mathon, A note on the graph isomorphism counting problem, Inform. Process. Lett. 8 (1979) 131–132.
[34] J. Moon, Topics on Tournaments, Holt, Rinehart, & Winston, New York, 1968.
[35] A. Nickelsen, Polynomial Time Partial Information Classes, Wissenschaft und Technik Verlag, Berlin, 2001. Also: Ph.D. Thesis, Technische Universität Berlin, Germany, 1999.
[36] A. Nickelsen, T. Tantau, On reachability in graphs with bounded independence number, in: O.H. Ibarra, L. Zhang (Eds.), Proceedings of the Eighth International Conference on Computing and Combinatorics, Lecture Notes in Computer Science, Vol. 2387, Springer, Berlin, 2002, pp. 554–563.
[37] N. Nisan, A. Ta-Shma, Symmetric logspace is closed under complement, Chicago J. Theoret. Comput. Sci. 1995 (1) (1995) Article 1.
[38] M. Ogihara, Polynomial-time membership comparable sets, SIAM J. Comput. 24 (5) (1995) 1068–1081.
[39] C. Papadimitriou, Computational Complexity, Addison-Wesley, Reading, MA, 1994.
[40] D. Ronneburger, Upper and lower bounds for token advice for partial information classes, Master’s Thesis, Technische Universität Berlin, Germany, 1998.
[41] J. Rosser, L. Schoenfeld, Approximate formulas for some functions of prime numbers, Illinois J. Math. 6 (1962) 64–94.
[42] N. Sauer, On the density of families of sets, J. Combin. Theory Ser. A 13 (1972) 145–147.
[43] A. Selman, P-selective sets, tally languages, and the behavior of polynomial time reducibilities on NP, Math. Systems Theory 13 (1979) 55–65.
[44] D. Sivakumar, On membership comparable sets, J. Comput. System Sci. 59 (2) (1999) 270–280.
[45] T. Tantau, A note on the complexity of the reachability problem for tournaments, Technical Report TR01-092, Electronic Colloquium on Computer Complexity, 2001, www.eccc.uni-trier.de/eccc.
[46] T. Tantau, A note on the power of extra queries to membership comparable sets, Technical Report TR02-004, Electronic Colloquium on Computer Complexity, 2002, www.eccc.uni-trier.de/eccc.
[47] S. Toda, Counting problems computationally equivalent to computing the determinant, Technical Report CSIM 91-07, Dept. of Computer Science, University of Electro-Communications, Tokyo, Japan, 1991.
[48] S. Toda, On polynomial-time truth-table reducibility of intractable sets to p-selective sets, Math. Systems Theory 24 (1991) 69–82.
[49] L. Valiant, Why is boolean complexity theory difficult?, in: M. Paterson (Ed.), Boolean Function Complexity, Lecture Note Series, Vol. 169, London Mathematics Society, Cambridge University Press, Cambridge, 1992, pp. 84–94.
[50] L. Valiant, V. Vazirani, NP is as easy as detecting unique solutions, Theoret. Comput. Sci. 47 (1) (1986) 85–93.
[51] D. van Melkebeek, Deterministic and randomized bounded truth-table reductions of P, NL, and L to sparse sets, J. Comput. System Sci. 58 (2) (1998) 213–232.
[52] V. Vapnik, A. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory Probab. Appl. 16 (2) (1971) 264–280.
[53] V. Vinay, Counting auxiliary pushdown automata and semi-unbounded arithmetic circuits, in: Proceedings of the Sixth Conference on Structure in Complexity Theory, IEEE Computer Society Press, Silver Spring, MD, 1991, pp. 270–284.