Deciding semantic finiteness of pushdown processes and first-order grammars w.r.t. bisimulation equivalence

Petr Jančar

Dept of Computer Science, Faculty of Science, Palacký Univ., Olomouc, Czechia
Article history: Received 28 November 2017; Received in revised form 20 June 2019; Accepted 29 October 2019; Available online xxxx

Keywords: Pushdown automata; First-order grammars; Bisimulation equivalence; Regularity
Abstract. The problem whether a given configuration of a pushdown automaton (PDA) is bisimilar with some (unspecified) finite-state process is shown to be decidable. The decidability is proven in the framework of first-order grammars, which are given by finite sets of labelled rules that rewrite roots of first-order terms. The framework is equivalent to PDA in which deterministic (i.e. alternative-free) epsilon-steps are also allowed, hence to the model for which Sénizergues showed an involved procedure deciding bisimilarity (1998, 2005). Such a procedure is here used as a black-box part of the algorithm. The result extends the decidability of the regularity problem for deterministic PDA that was shown by Stearns (1967), and later improved by Valiant (1975) regarding the complexity. The decidability question for nondeterministic PDA, answered positively here, had been open (as indicated, e.g., by Broadbent and Göller, 2012). © 2019 Elsevier Inc. All rights reserved.
1. Introduction

The question of deciding semantic equivalences of systems, like language equivalence, has been a frequent topic in computer science. A closely related question asks if a given system in a class C1 has an equivalent in a subclass C2. Pushdown automata (PDA) constitute a well-known example; language equivalence and regularity are undecidable for PDA. In the case of deterministic PDA (DPDA), the decidability and complexity results for regularity [1,2] preceded the famous decidability result for equivalence by Sénizergues [3].

In concurrency theory, logic, verification, and other areas, a finer equivalence, called bisimulation equivalence or bisimilarity, has emerged as another fundamental behavioural equivalence (cf., e.g., [4]); on deterministic systems it essentially coincides with language equivalence. An on-line survey of the results which study this equivalence in a specific area of process rewrite systems is maintained by Srba [5]. One of the most involved results in this area is the decidability of bisimilarity for pushdown processes generated by (nondeterministic) PDA in which ε-steps are restricted so that each ε-step has no alternative (and can be restricted to be popping); this result was shown by Sénizergues [6], who thus generalized his above-mentioned result for DPDA.

There is no known upper bound on the complexity of this decidable problem. The nonelementary lower bound established in [7] is, in fact, TOWER-hardness in the terminology of [8], and it holds even for real-time PDA, i.e. PDA with no ε-steps. For the above-mentioned PDA with restricted ε-steps the bisimilarity problem is not even primitive recursive; its Ackermann-hardness is shown in [9]. In the deterministic case, the equivalence problem is known to be PTIME-hard, and has a primitive recursive upper bound shown by Stirling [10] (where a finer analysis places the problem in TOWER [9]).
E-mail address: [email protected]. https://doi.org/10.1016/j.jcss.2019.10.002
Extrapolating the deterministic case, we might expect that for PDA the "regularity" problem w.r.t. bisimilarity (asking if a given PDA-configuration is bisimilar with a state in a finite-state system) is decidable as well, and that this problem might be easier than the equivalence problem solved in [6]; only EXPTIME-hardness is known here (see [11], and [5] for detailed references). Nevertheless, this decidability question has been open so far, as also indicated in [12] (besides [5]).

Contribution of this paper  We show that semantic finiteness of pushdown configurations w.r.t. bisimilarity is decidable. The decidability is proven in the framework of first-order grammars, i.e. of finite sets of labelled rules that rewrite roots of first-order terms. Though we do not use (explicit) ε-steps, the framework is equivalent to the model of PDA with restricted ε-steps for which Sénizergues's general decidability proof [6] applies. (A simplified proof directly in the first-order grammar framework, hence an alternative to the proof in [6], is given in [13].) The presented algorithm, answering if a given configuration, i.e. a first-order term E0 in the labelled transition system generated by a first-order grammar, has a bisimilar finite-state system, uses the result of [6] (or of [13]) as a black-box procedure. By [9] we cannot get a primitive recursive upper bound via a black-box use of the decision procedure for bisimilarity. Semidecidability of the semantic finiteness problem has been long clear, hence it is the existence of finite effectively verifiable witnesses of the negative case that is the crucial point here.

It turns out that a witness of semantic infiniteness of a term (i.e., of a configuration) E0 is a specific path E0 −u→−w→ in the respective labelled transition system where the sequence w of actions can be repeated forever. The idea how to verify if the respective infinite path E0 −u→−w→−w→−w→···, denoted E0 −u→−w^ω→, visits terms (configurations) from infinitely many equivalence classes is to consider the "limit term" Lim that is "reached" by E0 −u→−w^ω→; the term Lim is generally infinite but regular (i.e., it has only finitely many subterms). The (black-box) procedure deciding equivalence is used for computing a finite number e such that we are guaranteed that if E0 −u→−w^e→ does not reach a term equivalent to Lim then E0 −u→−w^k→ does not reach such a term for any k ≥ e. In this case the path E0 −u→−w^ω→ indeed visits terms in infinitely many equivalence classes since the visited terms approach Lim syntactically and thus also semantically (by increasing the "equivalence-level" with Lim) but never belong to the equivalence class of Lim. To show the existence of a respective witness E0 −u→−w→ for each semantically infinite E0 is not trivial but it can be done by a detailed study of the paths E0 −a1→ E1 −a2→ E2 −a3→ ··· where the Ei are from pairwise different equivalence classes; here we also use the infinite Ramsey theorem for technical convenience.

Remark on the relation to other uses of first-order grammars  In this paper the first-order grammars are used for slightly different aims than in the works on higher-order grammars (or higher-order recursion schemes) and higher-order pushdown automata, where the first order is a particular case; we can exemplify such works by [14,15], while many other references can be found, e.g., in the survey papers [16,17]. There a grammar is used to describe an infinite labelled tree (the syntax tree of an infinite applicative term produced by a unique outermost derivation from an initial nonterminal), and questions like, e.g., the decidability of monadic second-order (MSO) properties of such trees are studied. In this paper, a first-order grammar can also be seen as a tool describing an infinite tree, namely the tree-unfolding of a nondeterministic labelled transition system with an initial state. The question if this tree is regular (i.e., if it has only finitely many subtrees) would correspond to the regularity question studied in [1,2]; but here we ask a different question, namely if identifying bisimilar subtrees results in a regular tree. We can also note that the question if a given first-order grammar generates a regular tree refers to a particular formalism (namely to the respective infinite applicative term) while the regularity question studied here is more "syntax-independent". Some further remarks are given at the end of Section 2 and in Section 4.

Organization of the paper  Section 2 explains the used notions, and states the result. The proof is then given in Section 3, and Section 4 adds a few additional remarks. Finally, there is an appendix with some technical constructions that are not crucial for the proof.

Remark. A preliminary version of this paper, with a sketch of the proof ideas, appeared in Proc. MFCS'16.

2. Basic notions, and result

In this section we define the basic notions and state the result in the form of a theorem. Some standard definitions are restricted when we do not need the full generality. We finish the section by a note about a transformation of pushdown automata to first-order grammars.

By N and N+ we denote the sets of nonnegative integers and of positive integers, respectively. By [i, j], for i, j ∈ N, we denote the set {i, i+1, . . . , j}. For a set A, by A∗ we denote the set of finite sequences of elements of A, which are also called words (over A). By |w| we denote the length of w ∈ A∗, and by ε the empty sequence; hence |ε| = 0. We put A+ = A∗ ∖ {ε}, w^0 = ε, and w^{j+1} = w w^j for j ∈ N; w^ω denotes the infinite sequence www···.
Fig. 1. Example of a finite (nondeterministic) labelled transition system.
Labelled transition systems  A labelled transition system, an LTS for short, is a tuple L = (S, Σ, (−a→)a∈Σ) where S is a finite or countable set of states, Σ is a finite set of actions (or letters), and −a→ ⊆ S × S is a set of a-transitions (for each a ∈ Σ). We say that L is a deterministic LTS if for each pair s ∈ S, a ∈ Σ there is at most one s′ such that s −a→ s′ (which stands for (s, s′) ∈ −a→). By s −w→ s′, where w = a1a2 . . . an ∈ Σ∗, we denote that there is a path s = s0 −a1→ s1 −a2→ s2 ··· −an→ sn = s′; if s −w→ s′, then s′ is reachable from s. By s −w→ we denote that w is enabled in s, i.e., s −w→ s′ for some s′. If L is deterministic, then s −w→ s′ and s −w→ also denote a unique path.

Bisimilarity  Given L = (S, Σ, (−a→)a∈Σ), a set B ⊆ S × S covers (s, t) ∈ S × S if for each s −a→ s′ there is t −a→ t′ such that (s′, t′) ∈ B, and for each t −a→ t′ there is s −a→ s′ such that (s′, t′) ∈ B. For B, B′ ⊆ S × S we say that B′ covers B if B′ covers each (s, t) ∈ B. A set B ⊆ S × S is a bisimulation if B covers B. States s, t ∈ S are bisimilar, written s ∼ t, if there is a bisimulation B containing (s, t). A standard fact is that ∼ ⊆ S × S is an equivalence relation, and it is the largest bisimulation, namely the union of all bisimulations.
E.g., in the LTS (S, Σ, (−a→)a∈Σ) in Fig. 1 we have s3 ∼ s4 and s1 ≁ s2 (though s1, s2 are trace-equivalent, i.e., the sets {w ∈ Σ∗ | s1 −w→} and {w ∈ Σ∗ | s2 −w→} are the same).
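On a finite LTS the largest bisimulation (and hence the answer to questions like s3 ∼ s4 above) can be computed as a greatest fixpoint: start from S × S and repeatedly discard pairs that are not covered by the current relation. The following minimal sketch is illustrative only and not part of the paper; the dictionary encoding of transitions and the names bisimilarity and covered are assumptions of the example.

```python
def bisimilarity(states, trans):
    """Largest bisimulation of a finite LTS.
    states: iterable of states; trans: dict (state, action) -> set of successor states."""
    actions = {a for (_, a) in trans}

    def covered(s, t, rel):
        # (s, t) is covered by rel iff every a-move of s is matched by an a-move of t, and vice versa
        for a in actions:
            s_succ, t_succ = trans.get((s, a), set()), trans.get((t, a), set())
            if not all(any((s2, t2) in rel for t2 in t_succ) for s2 in s_succ):
                return False
            if not all(any((s2, t2) in rel for s2 in s_succ) for t2 in t_succ):
                return False
        return True

    rel = {(s, t) for s in states for t in states}
    changed = True
    while changed:
        changed = False
        for pair in list(rel):
            if not covered(pair[0], pair[1], rel):
                rel.discard(pair)
                changed = True
    return rel

# A toy LTS (the concrete transitions are made up, only loosely inspired by Fig. 1):
lts = {("s1", "a"): {"s3"}, ("s2", "a"): {"s4"}, ("s3", "b"): {"s3"}, ("s4", "b"): {"s4"}}
print(("s3", "s4") in bisimilarity({"s1", "s2", "s3", "s4"}, lts))   # True
```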
Semantic finiteness  Given L = (S, Σ, (−a→)a∈Σ), we say that s0 ∈ S is finite up to bisimilarity, or bisim-finite for short, if there is some state f in some finite LTS such that s0 ∼ f; otherwise s0 is infinite up to bisimilarity, or bisim-infinite. We should add that when comparing states from different LTSs, we implicitly refer to the disjoint union of these LTSs.

First-order terms, regular terms, finite graph presentations  We will consider LTSs with countable sets of states in which the states are first-order regular terms. (In Fig. 1 the states are depicted as unstructured black dots; Fig. 5 depicts three states of an LTS with terms "inside the black dots".) The terms are built from variables taken from a fixed countable set Var = {x1, x2, x3, . . . } and from function symbols, also called (ranked) nonterminals, from some specified finite set N; each A ∈ N has arity(A) ∈ N. We reserve symbols A, B, C, D to range over nonterminals, and E, F, G, H to range over terms. E.g., on the left in Fig. 2 we can see the syntactic tree of a term E1, namely of E1 = A(D(x5, C(x2, B)), x5, B), where the arities of nonterminals A, B, C, D are 3, 0, 2, 2, respectively. The numbers at the arcs just highlight the fact that the outgoing arcs of each node are ordered.

We identify terms with their syntactic trees. Thus a term over N is (viewed as) a rooted, ordered, finite or infinite tree where each node has a label from N ∪ Var; if the label of a node is xi ∈ Var, then the node has no successors, and if the label is A ∈ N, then it has m (immediate) successor-nodes where m = arity(A). A subtree of a term E is also called a subterm of E. We make no distinction between isomorphic (sub)trees, and thus a subterm can have several (maybe infinitely many) occurrences in E. Each subterm-occurrence has its (nesting) depth in E, which is its (naturally defined) distance from the root of E. E.g., C(x2, B) is a subterm of the term E1 in Fig. 2, with one depth-2 occurrence. The term B has two occurrences in E1, one in depth 1 and another in depth 3. We also use the standard notation for terms: we write E = x or E = A(G1, . . . , Gm) with the obvious meaning; in the former case we have x ∈ Var = {x1, x2, . . . } and root(E) = x, in the latter case we have A ∈ N, root(E) = A, m = arity(A), and G1, . . . , Gm are the ordered depth-1 occurrences of subterms of E, which are also called the root-successors in E.

A term is finite if the respective tree is finite. A (possibly infinite) term is regular if it has only finitely many subterms (though the subterms may be infinite and can have infinitely many occurrences). We note that any regular term has at least one graph presentation, i.e., a finite directed graph with a designated root where each node has a label from N ∪ Var; if the label of a node is x ∈ Var, then the node has no outgoing arcs, and if the label is A ∈ N, then it has m ordered outgoing arcs where m = arity(A). (A graph presentation of an infinite regular term E3 is on the right in Fig. 2.) The standard tree-unfolding of a graph presentation is the respective term, which is infinite if there are cycles in the graph. There is a bijection between the nodes in the least graph presentation of E and (the roots of) the subterms of E. (To get the least graph presentation of E3 in Fig. 2, we should unify the roots of the same subterms, here the nodes labelled with B and the nodes labelled with x5.)
Fig. 2. Finite terms E 1 , E 2 , and a graph presenting a regular infinite term E 3 .
Convention  In what follows, by a "term" we mean a "regular term" unless the context makes clear that the term is finite. (We do not consider non-regular terms.) By TermsN we denote the set of all (regular) terms over a set N of (ranked) nonterminals (and over the set Var of variables). As already said, we reserve symbols A, B, C, D to range over nonterminals, and E, F, G, H to range over (regular) terms.

Substitutions, associative composition, "iterated" substitutions  A substitution σ is a mapping σ : Var → TermsN whose support supp(σ) = {x ∈ Var | σ(x) ≠ x} is finite; we reserve the symbol σ for substitutions. By [x_{i1}/E1, x_{i2}/E2, . . . , x_{ik}/Ek], where i_j ≠ i_{j′} when j ≠ j′, we denote the substitution σ such that σ(x_{ij}) = Ej for all j ∈ [1, k] and σ(x) = x for all x ∈ Var ∖ {x_{i1}, x_{i2}, . . . , x_{ik}}. By applying a substitution σ to a term E we get the term Eσ that arises from E by replacing each occurrence of x ∈ Var with σ(x). (Given graph presentations, in the graph of E we just redirect each arc leading to x towards the root of σ(x), which also includes the special "root-designating arc" when E = x.) Hence E = x implies Eσ = xσ = σ(x). (E.g., for the terms in Fig. 2 we have E2 = E1[x2/E1].) The natural composition of substitutions, where σ = σ1σ2 is defined by xσ = (xσ1)σ2, can be easily verified to be associative. We thus write simply Eσ1σ2 when meaning (Eσ1)σ2 or E(σ1σ2). We let σ^0 be the empty-support substitution, and we put σ^{i+1} = σσ^i for i ∈ N. We will also use the limit substitution

σ^ω = σσσ···

when this is well-defined, i.e., when there is no "unguarded cycle" x_{i1}σ = x_{i2}, x_{i2}σ = x_{i3}, . . . , x_{ik−1}σ = x_{ik}, x_{ik}σ = x_{i1} where x_{i1} ≠ x_{i2}. In this case we can formally define σ^ω as the unique substitution satisfying the following conditions for each x ∈ Var: if xσ^k ∈ Var for all k ∈ N, then xσ^ω = x′ for the x′ ∈ Var where xσ^k = x′ and x′σ = x′ for some k (such unique x′ must exist since supp(σ) is finite and there is no "unguarded cycle"); if there is the least k ∈ N such that xσ^k = E ∉ Var (hence E has a nonterminal root), then xσ^ω = Eσ^ω. (E.g., in Fig. 2 we have E3 = E1σ^ω for σ = [x2/E1].) In fact, we will use σ^ω only for special "colour-idempotent" substitutions σ defined later.

First-order grammars  The set TermsN (of regular terms over a finite set N of nonterminals) will serve us as the set of states of an LTS. The transitions will be determined by a finite set of (schematic) root-rewriting rules, illustrated in Fig. 3. This is now defined formally. A first-order grammar, or just a grammar for short, is a tuple G = (N, Σ, R) where N is a finite set of ranked nonterminals (viewed as function symbols with arities), Σ is a finite set of actions (or letters), and R is a finite set of rules of the form

A(x1, x2, . . . , xm) −a→ E   (1)

where A ∈ N, arity(A) = m, a ∈ Σ, and E is a finite term over N in which each occurring variable is from the set {x1, x2, . . . , xm}.

Example. Fig. 3 shows a rule A(x1, x2, x3) −b→ C(A(x2, x1, B), x2), and a rule A(x1, x2, x3) −b→ x3. The depiction stresses that the variables x1, x2, x3 serve as the "place-holders" for the root-successors (rs), i.e. the depth-1 occurrences of subterms of a term with the root A; the (root of the) term might be rewritten by performing action b (as defined below).
Fig. 3. Depiction of rules A(x1, x2, x3) −b→ C(A(x2, x1, B), x2) and A(x1, x2, x3) −b→ x3.
Fig. 4. Another presentation of the rule r1: A(x1, x2, x3) −b→ C(A(x2, x1, B), x2).

LTSs generated by grammars  Given G = (N, Σ, R), by L^r_G we denote the (rule-based) LTS L^r_G = (TermsN, R, (−r→)r∈R) where each rule r of the form A(x1, x2, . . . , xm) −a→ E induces transitions A(x1, . . . , xm)σ −r→ Eσ for all substitutions σ. (In fact, only the restrictions of substitutions σ to the domain {x1, . . . , xm} matter.) The transition induced by σ with supp(σ) = ∅ is A(x1, . . . , xm) −r→ E.

Example. To continue the example from Fig. 3, in Fig. 4 we can see another depiction of the rule A(x1, x2, x3) −b→ C(A(x2, x1, B), x2), denoted r1. This makes more explicit that the application of the same substitution to both sides yields a transition; it also highlights the fact that rs3 "disappears" by applying the rule since it loses the connection with the root. Fig. 5 shows two examples of transitions in L^r_G. One is generated by the rule r1 of the form A(x1, x2, x3) −b→ C(A(x2, x1, B), x2) (depicted in Figs. 3 and 4), and the other by the rule r2 of the form A(x1, x2, x3) −b→ x3; in both cases we apply the substitution σ = [x1/D(x5, C(x2, B)), x2/x5, x3/x1] to (both sides of) the respective rule. The small symbols x1, x2, x3 in Fig. 5 are only auxiliary, highlighting the use of our rules, and they are no part of the respective terms. The middle term is here given by an (acyclic) graph presentation of its syntactic tree (and the node with label x1 is no part of it). Fig. 6 shows the transitions resulting from the applications of two rules to a graph presenting an infinite regular term. (The small symbols x1, x2, x3 are again just auxiliary.)

Remark. Since the rhs (right-hand sides) E in the rules (1) are finite, all terms reachable from a finite term are finite. It is technically convenient to have the rhs finite while including regular terms into our LTSs. We just remark that this is not crucial, since the "regular rhs" version can be easily "mimicked" by the "finite rhs" version.

By definition the LTS L^r_G is deterministic (for each F and r there is at most one H such that F −r→ H). We note that variables are dead (have no outgoing transitions). We also note that F −w→ H implies that each variable occurring in H also occurs in F (but not necessarily vice versa).
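To make the rule-based transitions concrete, here is a minimal illustrative sketch (not from the paper): finite terms are nested tuples, and a rule A(x1, . . . , xm) −r→ E is applied at the root of a term A(G1, . . . , Gm) by instantiating E with σ = [x1/G1, . . . , xm/Gm]. The tuple encoding and the names apply_subst and step are assumptions of the example.

```python
# Finite terms: a variable is a string "x1", "x2", ...; a nonterminal term is a tuple
# (nonterminal, subterm_1, ..., subterm_m).  A substitution is a dict variable -> term.

def apply_subst(term, sigma):
    """Return term*sigma (replace each variable occurrence by its sigma-image)."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0],) + tuple(apply_subst(t, sigma) for t in term[1:])

def step(term, rule):
    """One rule-based transition: rule = (A, m, rhs) rewrites the root of a term with root A;
    returns rhs*sigma where sigma maps x1..xm to the root-successors of term, or None if
    the rule is not applicable (variables are dead)."""
    A, m, rhs = rule
    if isinstance(term, str) or term[0] != A:
        return None
    sigma = {"x%d" % (i + 1): term[i + 1] for i in range(m)}
    return apply_subst(rhs, sigma)

# Rule r1: A(x1,x2,x3) -b-> C(A(x2,x1,B),x2) as in Fig. 3, applied to the source term of Fig. 5:
r1 = ("A", 3, ("C", ("A", "x2", "x1", ("B",)), "x2"))
E = ("A", ("D", "x5", ("C", "x2", ("B",))), "x5", "x1")
print(step(E, r1))   # the target term of the r1-transition from E
```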
Fig. 5. One r1 -transition and one r2 -transition in LrG (both are b-transitions in LaG ).
Fig. 6. Applying r1: A(x1, x2, x3) −b→ C(A(x2, x1, B), x2) and r3: A(x1, x2, x3) −a→ x1 to a graph of an (infinite regular) term.

The deterministic rule-based LTS L^r_G is helpful technically, but we are primarily interested in the (generally nondeterministic) action-based LTS L^a_G = (TermsN, Σ, (−a→)a∈Σ) where each rule A(x1, . . . , xm) −a→ E induces the transitions A(x1, . . . , xm)σ −a→ Eσ for all substitutions σ. (Figs. 5 and 6 also show transitions in L^a_G, when we ignore the symbols r1, r2, r3 and consider the "labels" b, a instead. Fig. 5 also exemplifies nondeterminism in L^a_G, since there are two different outgoing b-transitions from a state.)

Given a grammar G = (N, Σ, R), two terms from TermsN are bisimilar if they are bisimilar as states in the action-based LTS L^a_G. By our definitions all variables are bisimilar, since they are dead terms. The variables serve us primarily as "place-holders for subterm-occurrences" in terms (which might themselves be variable-free); such a use of variables as place-holders has been already exemplified in the rules (1).

Main result, and its relation to pushdown automata  We now state the theorem, to be proven in the next section, and we mention why the result also applies to pushdown automata (PDA) with deterministic popping ε-steps.
Theorem 1. There is an algorithm that, given a grammar G = (N, Σ, R) and (a finite graph presentation of) a term E0 ∈ Terms(N), decides if E0 is bisim-finite (i.e., if E0 ∼ f for a state f in some finite LTS).

A transformation of (nondeterministic) PDA in which deterministic popping ε-steps are allowed to first-order grammars (with no ε-steps) is recalled in the appendix. It makes clear that the semantic finiteness (w.r.t. bisimilarity) of PDA with deterministic popping ε-steps is also decidable. In fact, the problems are interreducible; the close relationship between (D)PDA and first-order schemes has been long known (see, e.g., [18]). The proof of Theorem 1 presented here uses the fact that bisimilarity of first-order grammars is decidable; this was shown for the above-mentioned PDA model by Sénizergues [6], and a direct proof in the first-order-term framework was presented in [13].

We note that for PDA where popping ε-steps can be in conflict with "visible" steps, bisimilarity is already undecidable [19]; hence the proof presented here does not yield the decidability of semantic finiteness in this more general model. The decidability status of semantic finiteness is also unclear for second-order PDA (which operate on a stack of stacks; besides the standard work on the topmost stack, they can also push a copy of the topmost stack or pop the topmost stack in one move). Bisimilarity is undecidable for second-order PDA even without any use of ε-steps [12] (some remarks are also added in [20]).

3. Proof of Theorem 1

3.1. Computability of eq-levels, and semidecidability of bisim-finiteness

We will soon note that the semidecidability of bisim-finiteness is clear, but we first recall the computability of eq-levels, which is one crucial ingredient in our proof of semidecidability of bisim-infiniteness.
Stratified equivalence, and eq-levels  Assuming an LTS L = (S, Σ, (−a→)a∈Σ), we put ∼0 = S × S, and define ∼k+1 ⊆ S × S (for k ∈ N) as the set of pairs covered by ∼k. (Hence s ∼k+1 t iff for each s −a→ s′ there is t −a→ t′ such that s′ ∼k t′ and for each t −a→ t′ there is s −a→ s′ such that s′ ∼k t′.) We easily verify that the ∼k are equivalence relations, and that ∼0 ⊇ ∼1 ⊇ ∼2 ⊇ ··· ⊇ ∼. For the (first infinite) ordinal ω we put s ∼ω t if s ∼k t for all k ∈ N; hence ∼ω = ∩_{k∈N} ∼k.

We do not need to consider ordinals bigger than ω, due to the following restriction. An LTS L = (S, Σ, (−a→)a∈Σ) is image-finite if the set {s′ ∈ S | s −a→ s′} is finite for each pair s ∈ S, a ∈ Σ. Our grammar-generated LTSs L^a_G are obviously image-finite (while the L^r_G are even deterministic). We thus further restrict ourselves to image-finite LTSs. In fact, since we consider LTSs with finite sets of actions, we are thus even restricted to finitely branching LTSs, where the set {s′ ∈ S | s −a→ s′ for some a ∈ Σ} is finite for each s ∈ S. It is a standard fact that

∼ = ∼ω = ∩_{k∈N} ∼k

in image-finite LTSs (as also mentioned, e.g., in [4]); indeed, it is straightforward to check that in such an LTS the set ∩_{k∈N} ∼k is covered by itself, hence ∼ω is a bisimulation (which entails ∼ω ⊆ ∼, and thus ∼ω = ∼).

Given a (finitely branching) LTS L = (S, Σ, (−a→)a∈Σ), to each (unordered) pair s, t of states we attach their equivalence level (eq-level):

EqLv(s, t) = max{k ∈ N ∪ {ω} | s ∼k t}.

(In Fig. 1 we have EqLv(s3, s4) = ω, EqLv(s1, s3) = EqLv(s1, s4) = 0, EqLv(s1, s2) = 1.) It is useful to observe:

Proposition 2. If s ∼ s′ then EqLv(s, t) = EqLv(s′, t), for all states s, s′, t.

Proof. Suppose s ∼ s′, i.e., s ∼k s′ for all k ∈ N. For each k ∈ N we then have that s ∼k t implies s′ ∼k t and s ≁k t implies s′ ≁k t, since the ∼k are equivalence relations. □

Eq-levels are computable for first-order grammars  We now recall a variant of the fundamental decidability theorem shown by Sénizergues in [6]; it will be used as an important ingredient in the proof of Theorem 1.

Theorem 3. [6] There is an algorithm that, given G = (N, Σ, R) and E0, F0 ∈ Terms(N), computes EqLv(E0, F0) in L^a_G (and thus also decides if E0 ∼ F0).

The crucial thing is that we can decide if E0 ∼ F0 (by the algorithm from [6], or by the alternative algorithm presented in [13] directly in the framework of first-order grammars). If E0 ≁ F0, then a straightforward brute-force algorithm finds the least k+1 ∈ N such that E0 ≁k+1 F0, thus finding that EqLv(E0, F0) = k.
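On a finite LTS the eq-level of a pair can be computed directly from the stratification ∼0 ⊇ ∼1 ⊇ ···: iterate the covering step until the pair drops out (its eq-level is the last k) or the relation stabilizes (the eq-level is ω, since on a finite LTS stabilization happens after at most |S|² refinements). The sketch below is illustrative only, with an assumed encoding of the LTS, and is unrelated to the (non-trivial) algorithm of Theorem 3 for grammars.

```python
def eq_level(states, trans, s, t):
    """EqLv(s, t) in a finite LTS: the largest k with s ~k t, or the string 'omega'.
    trans: dict (state, action) -> set of successor states."""
    actions = {a for (_, a) in trans}

    def covered(p, q, rel):
        for a in actions:
            p_succ, q_succ = trans.get((p, a), set()), trans.get((q, a), set())
            if not all(any((p2, q2) in rel for q2 in q_succ) for p2 in p_succ):
                return False
            if not all(any((p2, q2) in rel for p2 in p_succ) for q2 in q_succ):
                return False
        return True

    rel = {(p, q) for p in states for q in states}        # ~0 = S x S
    k = 0
    while True:
        new_rel = {(p, q) for (p, q) in rel if covered(p, q, rel)}   # ~(k+1)
        if (s, t) not in new_rel:
            return k                                       # s ~k t but not s ~(k+1) t
        if new_rel == rel:
            return "omega"                                 # stabilized, hence s ~ t
        rel, k = new_rel, k + 1
```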
Semidecidability of bisim-finiteness  Given G and E0, we can systematically generate all finite LTSs, presenting them by first-order grammars with nullary nonterminals (which then coincide with states); for each state f of each generated system we can check if E0 ∼ f by Theorem 3. In fact, Theorem 3 is not crucial here, since the decidability of E0 ∼ f can be shown in a much simpler way (see, e.g., [11]).

3.2. Semidecidability of bisim-infiniteness

In Section 3.2.1 we note a few simple general facts on bisim-infiniteness, and also note the obvious compositionality (congruence properties) of bisimulation equivalence in our framework of first-order terms. In Section 3.2.2 we describe some finite structures that are candidates for witnessing bisim-infiniteness of a given term E0; such a candidate is, in fact, a rule sequence uw such that the infinite (ultimately periodic) word uw^ω is performable from E0 in L^r_G. Then we show an algorithm checking if a candidate is indeed a witness, i.e., if the respective infinite path E0 −u→−w→−w→−w→··· visits terms from infinitely many equivalence classes. The crucial idea is that we can naturally define a (regular) term, called the limit Lim = Eω, that could be viewed as "reached" from E0 by performing the infinite word uw^ω. The terms Ej such that E0 −u→−w^j→ Ej will approach Eω syntactically with increasing j (Ej coincides with Eω up to depth j at least), which also entails that EqLv(Ej, Eω) will grow above any bound. If we can verify that the EqLv(Ej, Eω) are finite for infinitely many j, in particular if EqLv(Ej, Eω) never reaches ω (hence Ej ≁ Eω for all j), then uw is indeed a witness of bisim-infiniteness of E0; Theorem 3 will play an important role in such a verification. In Section 3.2.3 we show that each bisim-infinite term has a witness of the above form. Here we will also use the infinite Ramsey theorem for a technical simplification. By this a proof of Theorem 1 will be finished.

3.2.1. Some general facts on bisim-infiniteness, and compositionality of terms
Bisimilarity quotient  Given an LTS L = (S, Σ, (−a→)a∈Σ), the quotient-LTS L∼ is the tuple ({[s]∼ | s ∈ S}, Σ, (−a→)a∈Σ) where [s]∼ = {s′ | s′ ∼ s}, and [s]∼ −a→ [t]∼ if s′ −a→ t′ for some s′ ∈ [s]∼ and t′ ∈ [t]∼; in fact, [s]∼ −a→ [t]∼ implies that for each s′ ∈ [s]∼ there is t′ ∈ [t]∼ such that s′ −a→ t′. We have s ∼ [s]∼, since {(s, [s]∼) | s ∈ S} is a bisimulation (in the union of L and L∼). We refer to the states of L∼ as the bisim-classes (of L).
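For a finite LTS, the quotient L∼ is immediate to construct once ∼ is known (e.g., computed as in the sketch after Fig. 1). A small illustrative sketch follows; the frozenset representation of bisim-classes and the name quotient_lts are assumptions of the example.

```python
def quotient_lts(states, trans, bisim):
    """Build the quotient LTS of a finite LTS: states become frozensets (the bisim-classes).
    bisim: the largest bisimulation as a set of pairs; trans: dict (state, action) -> set of successors."""
    cls = {s: frozenset(t for t in states if (s, t) in bisim) for s in states}
    q_states = set(cls.values())
    q_trans = {}
    for (s, a), succs in trans.items():
        for t in succs:
            q_trans.setdefault((cls[s], a), set()).add(cls[t])
    return q_states, q_trans
```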
A sufficient condition for bisim-infiniteness  We recall that s0 ∈ S is bisim-finite if there is some state f in a finite LTS such that s0 ∼ f; otherwise s0 is bisim-infinite. We observe that s0 is bisim-infinite in L iff the reachability set of [s0]∼ in L∼, i.e. the set of states reachable from [s0]∼ in L∼, is infinite. We also recall our restriction to finitely branching LTSs ({s′ | s −a→ s′ for some a} is finite for each s), and note the following:

Proposition 4. A state s0 of a finitely branching LTS L = (S, Σ, (−a→)a∈Σ) is bisim-infinite iff there is an infinite path s0 −a1→ s1 −a2→ s2 −a3→ ··· where si ≁ sj for all i ≠ j.

Proof. The "if" direction is trivial; our goal is thus to show the "only if" direction. Let s0 be a bisim-infinite state in a finitely branching LTS L. In the quotient-LTS L∼ we consider the set P of all finite paths C0 −a1→ C1 −a2→ C2 ··· −ak→ Ck where C0 = [s0]∼ and Ci ≠ Cj for all i, j ∈ [0, k], i ≠ j. We present P as a tree: the paths in P are the nodes, the trivial path C0 being the root, and each node C0 −a1→ C1 −a2→ C2 ··· −ak+1→ Ck+1 is a child of the node C0 −a1→ C1 −a2→ C2 ··· −ak→ Ck. This tree is finitely branching (each node has a finite set of children). If all branches were finite (in which case each leaf C0 −a1→ C1 −a2→ C2 ··· −ak→ Ck would satisfy that Ck −a→ C implies C = Ci for some i ∈ [0, k]), then the tree would be finite (by König's lemma), and thus the set of states in L∼ that are reachable from [s0]∼ would also be finite; this would contradict the assumption that s0 is bisim-infinite. Hence there is an infinite path C0 −a1→ C1 −a2→ C2 −a3→ ··· in L∼ where C0 = [s0]∼ and Ci ≠ Cj for all i ≠ j. Since s0 ∼ C0 (in the union of L and L∼), there must be a path s0 −a1→ s1 −a2→ s2 −a3→ ··· in L where si ∼ Ci, i.e. [si]∼ = Ci, for all i ∈ N; for i ≠ j we thus have [si]∼ ≠ [sj]∼, i.e., si ≁ sj. □
To demonstrate that s0 is bisim-infinite, it suffices to show that its reachability set contains states with arbitrarily large finite eq-levels w.r.t. a "test state" t; we now formalize this observation.

Proposition 5. Given L = (S, Σ, (−a→)a∈Σ) and states s0, t, if for every e ∈ N there is s′ that is reachable from s0 and satisfies e < EqLv(s′, t) < ω, then s0 is bisim-infinite.

Proof. If s0, t satisfy the assumption, then there are e1 < e2 < e3 < ··· and states s1, s2, s3, . . . reachable from s0 such that EqLv(si, t) = ei for all i ∈ N+. By Proposition 2 we thus get that si ≁ sj for all i, j ∈ N+, i ≠ j. Hence from s0 we can reach states from infinitely many bisim-classes, which entails that s0 is bisim-infinite. □
Fig. 7. Bounded regions of bisimilar states yield the same eq-levels w.r.t. test states.
Fig. 8. Replacing a subterm with an equivalent term does not change the bisim-class.
Eq-levels yielded by states in a bounded region and test states  Our final general observation (tailored to a later use) is also straightforward: if two states of an LTS are bisimilar, then the states in their equally bounded reachability regions must yield the same eq-levels when compared with states from a fixed (test) set. This observation is informally depicted in Fig. 7, and formalized in what follows. (Despite the depiction in Fig. 7, the test states can also be inside the regions.)

Given L = (S, Σ, (−a→)a∈Σ), for any s ∈ S and d ∈ N (a distance, or a "radius") we put

Region(s, d) = {s′ | s −w→ s′ for some w ∈ Σ∗ where |w| ≤ d}.   (2)

For any s ∈ S, d ∈ N, and T ⊆ S (a test set), we define the following subset of N (finite TestEqLevels):

TEL(s, d, T) = {e ∈ N | e = EqLv(s′, t) for some s′ ∈ Region(s, d) and t ∈ T}.   (3)

For X ⊆ N, by the supremum sup(X) we mean −1 if X = ∅, max(X) if X is finite and nonempty, and ω if X is infinite. (The next proposition will be later applied to the LTSs L^a_G with finite test sets, hence the sets Region(s, d) and TEL(s, d, T) will be finite.)

Proposition 6. If s1 ∼ s2, then TEL(s1, d, T) = TEL(s2, d, T) for all d ∈ N and T ⊆ S, which also entails that sup(TEL(s1, d, T)) = sup(TEL(s2, d, T)).
Proof. (Recall Fig. 7.) Suppose s1 ∼ s2 and s1′ ∈ Region(s1, d); let s1 −w→ s1′ where |w| ≤ d. From the definition of bisimilarity we deduce that s2 −w→ s2′ for some s2′ such that s1′ ∼ s2′; we have s2′ ∈ Region(s2, d). Since s1′ ∼ s2′ implies EqLv(s1′, t) = EqLv(s2′, t) for every t (by Proposition 2), the claim is clear. □
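The sets Region(s, d) and TEL(s, d, T) of (2) and (3) are straightforward to compute when the LTS is finitely branching and an eq-level oracle is available. The sketch below is illustrative only; eq_level is a stand-in parameter for the black-box of Theorem 3 (any function returning EqLv, e.g. the finite-LTS one sketched earlier), and the encoding and names are assumptions of the example.

```python
def region(trans, s, d):
    """Region(s, d): states reachable from s by paths of length at most d (bounded BFS)."""
    frontier, seen = {s}, {s}
    for _ in range(d):
        frontier = {t for p in frontier
                      for (q, a), succs in trans.items() if q == p
                      for t in succs} - seen
        seen |= frontier
    return seen

def tel(trans, s, d, test_states, eq_level):
    """TEL(s, d, T): the finite eq-levels of pairs (s', t) with s' in Region(s, d) and t in T."""
    levels = {eq_level(s2, t) for s2 in region(trans, s, d) for t in test_states}
    return {e for e in levels if e != "omega"}          # keep only the finite ones

def sup(x):
    """sup(X) as defined in the paper, for the finite case: -1 for the empty set, max otherwise."""
    return max(x) if x else -1
```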
Remark. The fact that s1 ∼ s2 and s1 −a→ s1′ implies that there is s2′ such that s2 −a→ s2′ and s1′ ∼ s2′ is a crucial property of bisimilarity that we use for our decision procedure. Hence our approach does not apply to trace equivalence or simulation equivalence. (E.g., the states s1, s2 in Fig. 1 are trace equivalent but their a-successors are from pairwise different trace equivalence classes.) On the other hand, the below-mentioned compositionality of bisimilarity holds for other equivalences as well.

Compositionality of states in grammar-generated LTSs  We assume a grammar G = (N, Σ, R), generating the LTS L^a_G = (TermsN, Σ, (−a→)a∈Σ) where ∼ is bisimulation equivalence on TermsN. Regarding the congruence properties, in principle it suffices for us to observe the fact depicted in Fig. 8: if in a term E we replace a subterm F with F′ such that F ∼ F′, then the resulting term E′ satisfies E′ ∼ E. Hence we also have that A(G1, . . . , Gm) ≁ A(G′1, . . . , G′m) implies Gi ≁ G′i for some i ∈ [1, m]. Formally, we put σ ∼ σ′ if xσ ∼ xσ′ for each x ∈ Var, and we note:

Proposition 7. If σ ∼ σ′, then Eσ ∼ Eσ′. (Hence Eσ ≁ Eσ′ implies that xσ ≁ xσ′ for some x occurring in E.)

Proof. We show that the set B = ∼ ∪ {(Eσ, Eσ′) | E ∈ TermsN, σ ∼ σ′} is a bisimulation (hence B = ∼). It suffices to consider a pair (Eσ, Eσ′) where σ ∼ σ′ and show that it is covered by B. If E = x ∈ Var, then (Eσ, Eσ′) = (xσ, xσ′),
which entails Eσ ∼ Eσ′, and (Eσ, Eσ′) is thus covered by the bisimulation ∼ (included in B). If root(E) ∈ N, then each transition Eσ −a→ G can be written as Eσ −a→ E′σ where E −a→ E′; it has the matching transition Eσ′ −a→ E′σ′, and we have (E′σ, E′σ′) ∈ B. □
Conventions

• To make some later discussions easier, we further consider only the normalized grammars G = (N, Σ, R), i.e. those satisfying the following condition: for each A ∈ N and each i ∈ [1, m] where m = arity(A) there is a word w(A,i) ∈ R+ such that A(x1, . . . , xm) −w(A,i)→ xi (in L^r_G). Hence for each E ∈ TermsN it is possible to "sink" to each of its subterm-occurrences by applying a sequence of the grammar-rules. (E.g., in the middle term in Fig. 6 we can "sink" to the root-successor term with the root A by applying some rule sequence u1, then "to D" by applying some u2, then to C by some u3, then to A by some u4, ...) Such a normalization can be efficiently achieved by harmless modifications of the nonterminal arities and of the rules in R, while the bisimilarity quotient of the LTS L^a_G remains the same (up to isomorphism). Now we simply assume this; the details are given in the appendix.

• In our notation we use m as the arity of all nonterminals in the considered grammar, though m is deemed to denote the maximum arity, in fact. Formally we could replace our expressions of the form A(G1, . . . , Gm) with A(G1, . . . , G_{mA}) where mA = arity(A), and adjust the respective discussions accordingly, but it would be unnecessarily cumbersome. In fact, such uniformity of arities can even be achieved by a construction while keeping the previously discussed normalization condition, when a slight problem with arity 0 is handled. The details are also given in the appendix.

• For technical convenience we further view expressions like G −w→ H as referring to the deterministic LTS L^r_G (hence w ∈ R∗ and any expression G −w→ refers to a unique path in L^r_G), while ∼k, ∼, and the eq-levels are always considered w.r.t. the action-based LTS L^a_G.
3.2.2. Witnesses of bisim-infiniteness

Assuming a grammar G = (N, Σ, R), we now describe candidates for witnesses of bisim-infiniteness of terms. A witness of bisim-infiniteness of E0 will be a pair (u, w), u ∈ R∗ and w ∈ R+, for which there is the infinite path E0 −u→−w→−w→−w→··· and it visits infinitely many bisim-classes. We first put a technically convenient restriction on the considered "iterated" words w, and then define candidates for witnesses formally.

Stairs, stair substitutions, colour-idempotent substitutions and stairs  A nonempty sequence w = r1r2 . . . rℓ ∈ R+ of rules is a stair if we have A(x1, . . . , xm) −w→ F where A(x1, . . . , xm) is the left-hand side of the rule r1 and root(F) ∈ N (i.e., F ∉ Var). E.g., for the rules r1: A(x1, x2) −a→ C(C(x2, B(x2, x1)), x2), r2: C(x1, x2) −b→ x1, r3: C(x1, x2) −c→ x2 we have that r1, r1r2, and r1r2r3 are stairs, since A(x1, x2) −r1→ C(C(x2, B(x2, x1)), x2) −r2→ C(x2, B(x2, x1)) −r3→ B(x2, x1); r1r2r2 is not a stair, since A(x1, x2) −r1r2r2→ x2, and r1r2r1 is not a stair since A(x1, x2) −r1r2→ C(x2, B(x2, x1)) and r1 is not enabled in C(.., ..).

We put var(E) = {x ∈ Var | x occurs in E}. A substitution σ is called a stair substitution if the sets supp(σ) and var(xiσ), i ∈ [1, m], are subsets of {x1, x2, . . . , xm}. We note that each stair w ∈ R+ determines a path A(x1, . . . , xm) −w→ B(x1, . . . , xm)σ where σ is a stair substitution. (E.g., A(x1, x2) −r1r2→ C(x2, B(x2, x1)) can be written as A(x1, x2) −r1r2→ C(x1, x2)σ where σ = [x1/x2, x2/B(x2, x1)].) In fact, the above defined stair substitutions are more general; e.g., the empty-support substitution is also a stair substitution. For a stair substitution σ we define:

• Surv(σ) = {xi | xi occurs in xjσ for some j ∈ [1, m]}, and
• RStick(σ) = {xi | xi = xjσ for some j ∈ [1, m]}.

Hence RStick(σ) ⊆ Surv(σ) ⊆ {x1, . . . , xm}. We note that xi ∈ Surv(σ) iff xi "survives" applying σ to A(x1, . . . , xm), i.e., xi occurs in A(x1, . . . , xm)σ; we have xi ∈ RStick(σ) iff xi "sticks to the root", i.e., is a root-successor in A(x1, . . . , xm)σ. A stair substitution σ is colour-idempotent if for all i, j ∈ [1, m] we have:

1. xi ∈ RStick(σ) entails that xiσ = xi, and
2. xi ∈ Surv(σ) entails that xi occurs in xjσ for some xj ∈ Surv(σ).

Remark. The word "colour" anticipates a later use of Ramsey's theorem. The word "idempotent" refers to the fact (shown below) that the above conditions 1 and 2 are sufficient to guarantee that Surv(σσ) = Surv(σ) and RStick(σσ) = RStick(σ). We could explicitly build a concrete finite semigroup of colours (of stair substitutions) but this is not necessary for our proof. We only remark that for technical reasons the colour associated with σ before Proposition 14 is finer than (Surv(σ), RStick(σ)).
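Surv, RStick, and the colour-idempotency conditions are directly computable for a stair substitution given by finite terms. A minimal illustrative sketch follows (the tuple encoding and all function names are assumptions of the example; it assumes σ is a stair substitution, so that only variables among x1, . . . , xm occur).

```python
def variables(term):
    """Set of variables occurring in a finite term (strings are variables, tuples nonterminal terms)."""
    if isinstance(term, str):
        return {term}
    out = set()
    for t in term[1:]:
        out |= variables(t)
    return out

def surv(sigma, m):
    """Surv: variables occurring in some xj*sigma, j in [1, m] (assumes a stair substitution)."""
    return {x for j in range(1, m + 1)
              for x in variables(sigma.get("x%d" % j, "x%d" % j))}

def rstick(sigma, m):
    """RStick: variables that are themselves the sigma-image of some xj."""
    return {sigma.get("x%d" % j, "x%d" % j) for j in range(1, m + 1)
            if isinstance(sigma.get("x%d" % j, "x%d" % j), str)}

def colour_idempotent(sigma, m):
    """Check conditions 1 and 2 of colour-idempotency."""
    S, R = surv(sigma, m), rstick(sigma, m)
    cond1 = all(sigma.get(x, x) == x for x in R)
    cond2 = all(any(x in variables(sigma.get(y, y)) for y in S) for x in S)
    return cond1 and cond2

# A tiny hypothetical stair substitution with m = 3: x1 -> B(x2, x3), x2 and x3 unchanged.
sigma = {"x1": ("B", "x2", "x3")}
print(surv(sigma, 3), rstick(sigma, 3), colour_idempotent(sigma, 3))
```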
Fig. 9. A colour-idempotent stair substitution σ, and its limit σ^ω.
Example. Fig. 9 (top-left) depicts a stair substitution σ = [x1/F1, x3/F3, x4/F4, x6/F6, x7/F7]; we assume m = 7 but we also use nonterminals with smaller arities for simplicity. We have F1 = B(C(x4, x2), x3), F3 = C(x1, A(C(x4, x2), B(x3, x5), x3, B(x3, x5), x4)), F4 = x2, F6 = B(x3, x5), F7 = C(x4, x5). It is also depicted explicitly that x2σ = x2 and x5σ = x5; each dashed line connects a variable xi (above the bar) with the root of the term xiσ. We can easily check that Surv(σ) = {x1, x2, x3, x4, x5} and RStick(σ) = {x2, x5}, and that σ is colour-idempotent (x2σ = x2, x5σ = x5, x1 occurs in x3σ, and both x3, x4 occur, e.g., in x1σ). Fig. 9 also depicts the substitution σσ (bottom-left) and the substitution σ^ω (right). Here we use an auxiliary device, namely some "fictitious" nodes that are not labelled with nonterminals or variables. Such a node can be called a collector node: it might "collect" several incoming arcs that are in reality deemed to proceed to the target specified by the (precisely one) outgoing arc of the collector node. Fig. 9 can also serve for illustrating the next proposition.

Proposition 8. If σ is a colour-idempotent stair substitution, then the following conditions hold:

1. Surv(σ^k) = Surv(σ) and RStick(σ^k) = RStick(σ) for all k ∈ N+;
2. for each xi ∈ Surv(σ) ∖ RStick(σ), all occurrences of xi in xjσ^k, for j ∈ [1, m] and k ∈ N+, have depths at least k (if there are any such occurrences);
3. Surv(σ^ω) = RStick(σ^ω) = RStick(σ).
Fig. 10. Candidate (u, w) induces the path E0 −u→ A(x1, . . . , xm)σ0 −w→ A(x1, . . . , xm)σσ0 −w→ ···.
Proof. 1. By induction on k, the case k = 1 being trivial. We note that xi ∈ Surv(σ^{k+1}) iff xi occurs in xjσ for some xj ∈ Surv(σ^k), and Surv(σ^k) = Surv(σ) by the induction hypothesis. Since σ is colour-idempotent, the condition "xi occurs in xjσ for some xj ∈ Surv(σ)" is equivalent to "xi ∈ Surv(σ)". Hence Surv(σ^{k+1}) = Surv(σ). Similarly, xi ∈ RStick(σ^{k+1}) iff xjσ = xi for some xj ∈ RStick(σ^k) = RStick(σ). The condition "xjσ = xi for some xj ∈ RStick(σ)" is equivalent to "xi ∈ RStick(σ)" (since σ is colour-idempotent). Hence RStick(σ^{k+1}) = RStick(σ).

2. For k = 1 the claim is trivial (by the definition of RStick(σ)). The depth of any respective occurrence of xi in xjσ^{k+1} is the sum of the depth of an occurrence of xℓ in xjσ^k and the depth of xi in xℓσ (for some ℓ ∈ [1, m]). The second depth is at least 1 (since xℓσ = xi would entail xi ∈ RStick(σ)); the first depth is at least k by the induction hypothesis, since xℓ ∈ Surv(σ^k) = Surv(σ) and xℓ ∉ RStick(σ) (otherwise xℓσ = xℓ due to the colour-idempotency of σ). Hence the depth of the respective occurrence of xi in xjσ^{k+1} is at least k+1.

3. Due to the (colour-idempotency) condition "xi ∈ RStick(σ) entails xiσ = xi" (i.e., xjσ = xi entails xiσ = xi), the substitution σ^ω is clearly well-defined: if xjσ = xi, then xjσ^ω = xi. Hence each variable xi ∈ RStick(σ) "survives" the application of σ^ω to A(x1, . . . , xm), having one occurrence as the ith root-successor in A(x1, . . . , xm)σ^ω. No other variables "survive", as can be easily deduced from Points 1, 2. (Suppose xi occurs in A(x1, . . . , xm)σ^ω in depth k, and write A(x1, . . . , xm)σ^ω as A(x1, . . . , xm)σ^{k+1}σ^ω; hence xi occurs in xjσ^ω for some xj occurring in A(x1, . . . , xm)σ^{k+1} in depth at most k. Points 1, 2 imply that xj ∈ RStick(σ); hence xjσ^ω = xj, and thus xi = xj ∈ RStick(σ).) □
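Under colour-idempotency the limit σ^ω has a small finite graph presentation: take the graphs of the terms xσ with a nonterminal root and tie each variable leaf whose σ-image again has a nonterminal root back to the corresponding root, via a collector node as in Fig. 9. The following sketch is only an illustration of this idea, not the paper's construction; the encoding and the name limit_graph are assumptions.

```python
def limit_graph(sigma):
    """Return (graph, roots): graph maps node ids to (label, [successor ids]); roots[x] is the
    root node of x*sigma^omega for each variable x whose sigma-image has a nonterminal root.
    Assumes sigma is colour-idempotent (so sigma^omega is well defined); terms are nested tuples,
    variables are strings."""
    graph, counter, roots, pending = {}, [0], {}, []

    def new_node(label, succs):
        graph[counter[0]] = (label, succs)
        counter[0] += 1
        return counter[0] - 1

    def build(term):
        if isinstance(term, str):                  # a variable leaf y
            img = sigma.get(term, term)
            if isinstance(img, str):               # y*sigma is a variable, hence y*sigma^omega = img
                return new_node(img, [])
            node = new_node("collector", [])       # a collector node as in Fig. 9
            pending.append((node, term))
            return node
        return new_node(term[0], [build(t) for t in term[1:]])

    for x, img in sigma.items():
        if not isinstance(img, str):
            roots[x] = build(img)
    for node, x in pending:                        # tie the cycles: the leaf points to x*sigma^omega
        graph[node] = ("collector", [roots[x]])
    return graph, roots
```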
We say that a stair w ∈ R+ is colour-idempotent if A(x1, . . . , xm) −w→ A(x1, . . . , xm)σ for some A ∈ N and some colour-idempotent stair substitution σ. For such w we have the infinite path

A(x1, . . . , xm) −w→ A(x1, . . . , xm)σ −w→ A(x1, . . . , xm)σ^2 −w→ A(x1, . . . , xm)σ^3 −w→ ···.

The term A(x1, . . . , xm)σ^ω can be naturally seen as the respective "limit".

Candidates for witnesses of bisim-infiniteness  Given a grammar G = (N, Σ, R), by a candidate for a witness of bisim-infiniteness of a term E0, or by a candidate for E0 for short, we mean a pair (u, w) where u ∈ R∗, w ∈ R+, E0 −uw→, and w is a colour-idempotent stair.

For a candidate (u, w) for E0 we thus have E0 −u→ A(x1, . . . , xm)σ0 and A(x1, . . . , xm) −w→ A(x1, . . . , xm)σ for some nonterminal A and some substitutions σ0, σ, where σ is colour-idempotent; moreover, there is the corresponding infinite path

E0 −u→ A(x1, . . . , xm)σ0 −w→ A(x1, . . . , xm)σσ0 −w→ A(x1, . . . , xm)σ^2σ0 −w→ ···.

An example is depicted in Fig. 10. Fig. 11 then depicts A(x1, . . . , xm)σ^jσ0 for some j ≥ 3 (left) and the limit A(x1, . . . , xm)σ^ωσ0 (right); some of the auxiliary collector nodes have special labels (p_ij, q_i) that we now ignore (they serve for a later discussion).
Fig. 11. A(x1, . . . , xm)σ^jσ0 (left) and A(x1, . . . , xm)σ^ωσ0 (right).
We will now note in more detail how the terms A(x1, . . . , xm)σ^jσ0 converge (syntactically and semantically) to the term A(x1, . . . , xm)σ^ωσ0.

Tops of terms A(x1, . . . , xm)σ^kσ0 converge to A(x1, . . . , xm)σ^ωσ0  For a term H and d ∈ N, by Top_d(H) (the d-top of H) we refer to the tree corresponding to H up to depth d. Hence Top_0(H) is the tree consisting solely of the root labelled with root(H). For d > 0, we have Top_d(xi) = Top_0(xi), and Top_d(A(G1, . . . , Gm)) = A(Top_{d−1}(G1), . . . , Top_{d−1}(Gm)), which denotes the (ordered labelled) tree with the A-labelled root and with the (ordered) depth-1 subtrees Top_{d−1}(G1), . . . , Top_{d−1}(Gm). (Hence Top_d(H) is not a term in general, since it arises by "cutting off" the depth-(d+1) subterm-occurrences.) We also define Top_{−1}(H) as the "empty tree", and use the consequence that Top_{−1}(H1) = Top_{−1}(H2) for all H1, H2. The next observation is trivial, due to the root-rewriting form of the transitions in the grammar-generated labelled transition systems.

Proposition 9. For any k ∈ N, if Top_{k−1}(H1) = Top_{k−1}(H2), then H1 ∼k H2.
Proof. We proceed by induction on k; for k = 0 the claim is trivial. We assume Top_k(H1) = Top_k(H2) for k ≥ 0; hence root(H1) = root(H2) and thus the rules r enabled in H1 in the LTS L^r_G are the same as the rules enabled in H2. If a transition H1 −a→ H1′ in L^a_G arises from H1 −r→ H1′ in L^r_G, then H2 −r→ H2′ in L^r_G gives rise to H2 −a→ H2′ in L^a_G; we obviously have Top_{k−1}(H1′) = Top_{k−1}(H2′), and thus H1′ ∼k H2′ by the induction hypothesis. Hence Top_k(H1) = Top_k(H2) implies H1 ∼k+1 H2. □

We now derive an easy consequence (for which Fig. 11 can be useful).
Proposition 10. Let σ be a colour-idempotent substitution, and σ0 a substitution (with supp(σ0) ⊆ {x1, . . . , xm}). For all j ∈ [1, m] and k ∈ N+ we have xjσ^kσ0 ∼k xjσ^ωσ0. Hence for all A ∈ N and k ∈ N+ we have A(x1, . . . , xm)σ^kσ0 ∼k+1 A(x1, . . . , xm)σ^ωσ0.

Proof. Assuming σ and σ0, we fix j ∈ [1, m] and k ∈ N+, and show that Top_{k−1}(xjσ^kσ0) = Top_{k−1}(xjσ^ωσ0). We write xjσ^ωσ0 as xjσ^kσ^ωσ0, and recall that the variables xi occurring in xjσ^k are from Surv(σ^k) = Surv(σ) and all occurrences of xi ∈ Surv(σ) ∖ RStick(σ) in xjσ^k are in depth k at least (by Proposition 8). Moreover, for each xi ∈ RStick(σ) we have xiσ^ω = xi and thus xiσ0 = xiσ^ωσ0. Hence we indeed have Top_{k−1}(xjσ^kσ0) = Top_{k−1}(xjσ^ωσ0). We thus also get that Top_k(A(x1, . . . , xm)σ^kσ0) = Top_k(A(x1, . . . , xm)σ^ωσ0). The claim thus follows by Proposition 9. □
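The d-top of a finite term is easy to compute, so the hypothesis Top_{k−1}(H1) = Top_{k−1}(H2) of Proposition 9 can be checked by a plain equality test on the results. A minimal illustrative sketch (assumed tuple encoding; cut-off nodes keep only their label):

```python
def top(term, d):
    """The d-top of a finite term: keep labels up to depth d, cut off deeper subterms.
    Terms are variable strings or tuples (nonterminal, subterms...); d = -1 gives the empty tree."""
    if d < 0:
        return None
    if isinstance(term, str):
        return term
    if d == 0:
        return (term[0],)                      # only the root label survives
    return (term[0],) + tuple(top(t, d - 1) for t in term[1:])
```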
Checking if a candidate is a witness  A candidate (u, w) for E0, yielding the path E0 −u→ A(x1, . . . , xm)σ0 −w→ A(x1, . . . , xm)σσ0 −w→ A(x1, . . . , xm)σ^2σ0 −w→ ···, and the term Lim = A(x1, . . . , xm)σ^ωσ0, is a witness (of bisim-infiniteness) for E0 if A(x1, . . . , xm)σ^kσ0 ≁ Lim for infinitely many k ∈ N. Since EqLv(A(x1, . . . , xm)σ^kσ0, Lim) > k (as follows from Proposition 10), we then have that for each e ∈ N there is k ∈ N such that

e < EqLv(A(x1, . . . , xm)σ^kσ0, Lim) < ω,

and Proposition 5 thus confirms that E0 is indeed bisim-infinite if it has a witness. The existence of an algorithm checking if a candidate is a witness follows from the next lemma (which we prove by using the fundamental fact captured by Theorem 3, also using the "labelled collector nodes" in Fig. 11 for illustration).

Lemma 11. Given A ∈ N, a colour-idempotent substitution σ, and a substitution σ̄0 with supp(σ̄0) ⊆ RStick(σ), there is a computable number e ∈ N such that for the term Lim = A(x1, . . . , xm)σ^ωσ̄0 and any substitution σ0 coinciding with σ̄0 on RStick(σ) one of the following conditions holds:

1. A(x1, . . . , xm)σ^kσ0 ≁ Lim for all integers k ≥ e, or
2. A(x1, . . . , xm)σ^kσ0 ∼ Lim for all integers k ≥ e.
To verify that a candidate (u, w), where E0 −u→ A(x1, . . . , xm)σ0 −w→ A(x1, . . . , xm)σσ0, is a witness for E0, it thus suffices to compute e for A, σ, σ̄0 where σ̄0 is the restriction of σ0 to RStick(σ), and show that A(x1, . . . , xm)σ^eσ0 ≁ A(x1, . . . , xm)σ^ωσ̄0. Now we prove the lemma.

Proof. We assume A ∈ N, a colour-idempotent substitution σ, and a substitution σ0; we define σ̄0 as the restriction of σ0 to RStick(σ) and put Lim = A(x1, . . . , xm)σ^ωσ̄0 (hence Lim = A(x1, . . . , xm)σ^ωσ0 since only the xi ∈ RStick(σ) occur in A(x1, . . . , xm)σ^ω). We can surely compute a number d ∈ N such that each xi ∈ Surv(σ) belongs to both of the (reachability) regions Region(A(x1, . . . , xm)σ, d) and Region(A(x1, . . . , xm)σσ, d) (recall (2)). (In Fig. 11 it means that the terms whose roots are determined by the collector nodes labelled p11, ···, p24 are reachable within d moves from the term A(x1, . . . , xm)σ^jσ0.) We define the test set T = {xjσ^ωσ̄0 | xj ∈ Surv(σ)}, and put

MaxTEL = sup(TEL(Lim, d, T))

(recalling (3)). The set Region(Lim, d) and its subset T are finite, and easily constructible. (In Fig. 11, the roots of the terms in T are determined by the collector nodes labelled q1, ···, q4.) Hence the number MaxTEL is finite, i.e., MaxTEL ∈ {−1} ∪ N, and computable by Theorem 3. We now put

e = MaxTEL + 2

and show that this e (computed from A, σ, σ̄0) satisfies the claim.
In fact, it suffices to show that for each k ≥ e we have:

A(x1, . . . , xm)σ^kσ0 ≁ Lim iff A(x1, . . . , xm)σ^{k+1}σ0 ≁ Lim.

We start with assuming k ≥ e and A(x1, . . . , xm)σ^kσ0 ≁ Lim. Written in another form, we thus have A(x1, . . . , xm)σσ^{k−1}σ0 ≁ A(x1, . . . , xm)σσ^ωσ0. By compositionality (Proposition 7) we then have xjσ^{k−1}σ0 ≁ xjσ^ωσ0 for some xj occurring in A(x1, . . . , xm)σ, i.e., for some xj ∈ Surv(σ); in fact, xj ∈ Surv(σ) ∖ RStick(σ) (since xiσ^{k−1} = xiσ^ω = xi for xi ∈ RStick(σ)). We fix such xj and recall that it also occurs in A(x1, . . . , xm)σσ (since Surv(σσ) = Surv(σ)). Since EqLv(xjσ^{k−1}σ0, xjσ^ωσ0) ≥ k−1 (by Proposition 10) and k ≥ e, we have

MaxTEL < EqLv(xjσ^{k−1}σ0, xjσ^ωσ0) < ω;   (4)

we recall that xjσ^ωσ0 belongs to the test set T. Our choice of d guarantees that xj belongs to Region(A(x1, . . . , xm)σσ, d), and thus xjσ^{k−1}σ0 belongs to Region(A(x1, . . . , xm)σ^{k+1}σ0, d). This implies that A(x1, . . . , xm)σ^{k+1}σ0 ≁ Lim (by Proposition 6).

Similarly, if A(x1, . . . , xm)σ^{k+1}σ0 ≁ Lim (for k ≥ e), i.e., A(x1, . . . , xm)σσσ^{k−1}σ0 ≁ A(x1, . . . , xm)σσσ^ωσ0, there is xj ∈ Surv(σσ) = Surv(σ) for which (4) holds. Since xj is in Region(A(x1, . . . , xm)σ, d), the term xjσ^{k−1}σ0 belongs to Region(A(x1, . . . , xm)σ^kσ0, d), and thus A(x1, . . . , xm)σ^kσ0 ≁ Lim. □

3.2.3. Each bisim-infinite term has a witness

The last ingredient of the proof of Theorem 1 is Lemma 15, formulated and proven at the end of this section. The rough idea is that for each bisim-infinite term E0 there is an infinite path
E0 −u→ H0 −w1→ H1 −w2→ H2 −w3→ ··· where Hi ≁ Hj for all i ≠ j, and the wi are stairs (for all i ∈ N+) that might even be chosen from a finite set of "simple" stairs. The infinite Ramsey theorem will easily yield that infinitely many sequences wi wi+1 ··· wj are colour-idempotent stairs (with the same "colour"). By a more detailed analysis (and another use of Lemma 11) we will derive a contradiction when assuming that E0 has no witness.

We assume a fixed grammar G = (N, Σ, R) and first define a few notions.

Canonical (witness) sequences
• For each rule r: A(x1, . . . , xm) −a→ E in R and each subterm F of E with root(F) ∈ N (i.e., F ∉ Var) we fix a shortest sequence u(r, F) = r1r2···rk ∈ R+ such that r1 = r and A(x1, . . . , xm) −r1→ E −r2···rk→ F. Sequences u(r, F) are obviously stairs, and we call them the canonical simple stairs.
• A stair w ∈ R+ is a canonical stair if w = u1u2···uℓ where ℓ ∈ N+ and the ui, i ∈ [1, ℓ], are canonical simple stairs.
• An infinite sequence H0, w1, H1, w2, H2, w3, . . . (where Hi ∈ TermsN and wi ∈ R+) is a canonical sequence if Hi −w_{i+1}→ H_{i+1} and w_{i+1} is a canonical stair, for each i ∈ N. (In Fig. 12 we can see a depiction of such a sequence.)
• A canonical sequence H0, w1, H1, w2, H2, w3, . . . is a witness sequence if Hi ≁ Hj for all i, j ∈ N where i ≠ j.
• Given a canonical sequence Seq = H0, w1, H1, w2, H2, w3, . . ., we say that a sequence Seq′ = H′0, w′1, H′1, w′2, H′2, w′3, . . . is a subsequence of Seq if there are 0 ≤ i0 < i1 < i2 < ··· such that H′j = H_{i_j} and w′_{j+1} = w_{i_j+1} w_{i_j+2} . . . w_{i_{j+1}}, for each j ∈ N.
• A canonical sequence Seq = H0, w1, H1, w2, H2, w3, . . . is reachable from a term E0 if E0 −u→ H0 for some u ∈ R∗.

Proposition 12. Let Seq′ be a subsequence of a canonical sequence Seq. Then Seq′ is a canonical sequence; moreover, if Seq is a witness sequence, then Seq′ is a witness sequence, and if Seq is reachable from E0, then Seq′ is reachable from E0.

Proof. It is obvious, once we note that if H −w→ H′ −w′→ H′′ where w and w′ are canonical stairs, then ww′ is a canonical stair. □

Proposition 13. If a term E0 is bisim-infinite, then there is a witness sequence Seq that is reachable from E0. Moreover, there is such Seq = H0, w1, H1, w2, H2, w3, . . . where the wi, i ∈ N+, are canonical simple stairs.

Proof. We assume a bisim-infinite term E0 and fix an infinite path
Fig. 12. Depiction of H0 −w1→ H1 −w2→ H2 −w3→ ···.
E0 −r1→ E1 −r2→ E2 −r3→ ···   (5)

in the LTS L^r_G such that Ei ≁ Ej (in L^a_G) for all i ≠ j; the existence of such a path follows from Proposition 4.

We show that there is the least i0 ∈ N such that r_{i0+1} r_{i0+2} . . . r_{i0+ℓ} is a stair for each ℓ ∈ N+. If there were no such i0, we would have an infinite sequence 0 = j0 < j1 < j2 < ··· where E_{j_{k+1}} is a depth-1 subterm of E_{j_k} for each k ∈ N (since E_{j_k} −r_{j_k+1} r_{j_k+2} ··· r_{j_{k+1}}→ E_{j_{k+1}} sinks to a root-successor in E_{j_k}); since E0 is a regular term, it has only finitely many subterms, and we would thus have Ei = Ej (hence Ei ∼ Ej) for some i ≠ j. Having defined i0, i1, . . . , ij (for some j ≥ 0), we define i_{j+1} as the least number i such that ij < i and r_{i+1} r_{i+2} . . . r_{i+ℓ} is a stair for each ℓ ∈ N+. There must be such i, since otherwise E_{i_j} −r_{i_j+1} r_{i_j+2} ··· r_{i_j+ℓ}→ would sink to a root-successor in E_{i_j} for some ℓ ∈ N+, which contradicts the choice of ij.

For each j ∈ N we put Hj = E_{i_j} and w_{j+1} = r_{i_j+1} r_{i_j+2} ··· r_{i_{j+1}}; hence the (infinite) suffix of the path (5) that starts with E_{i0} can be presented as

H0 −w1→ H1 −w2→ H2 −w3→ ···.

By the choice of (5) we have Hi ≁ Hj for i ≠ j. By the definition of i0, i1, i2, . . . , all wj ∈ R+ are stairs, but they might not be canonical (simple) stairs. For each j ∈ N we can write Hj −w_{j+1}→ H_{j+1} in the form A(x1, . . . , xm)σ −w_{j+1}→ Fσ where A = root(Hj) and A(x1, . . . , xm) −w_{j+1}→ F; in more detail, we have A(x1, . . . , xm) −r_{i_j+1}→ E −r_{i_j+2} ··· r_{i_{j+1}}→ F where the rule r_{i_j+1} ∈ R is of the form A(x1, . . . , xm) −a→ E. Moreover, by the definition of ij and i_{j+1} we must have that F is a subterm of E and F ∉ Var (i.e., root(F) ∈ N). Hence we have Hj −w′_{j+1}→ H_{j+1} for the respective canonical simple stair w′_{j+1} = u(r, F) where r = r_{i_j+1}. By replacing all w_{j+1} with the respective canonical simple stairs w′_{j+1} we get the desired sequence Seq. □

(Strong) monochromatic canonical sequences  We now build on the notions RStick(σ) ⊆ Surv(σ) ⊆ {x1, . . . , xm} that we introduced for stair substitutions σ earlier. (We recall that Surv(σ) consists of the variables occurring in A(x1, . . . , xm)σ, while the xi ∈ RStick(σ) are even root-successors in A(x1, . . . , xm)σ.)
• For a stair substitution σ we define its colour as

  col(σ) = ⟨Surv(σ), (R_1, R_2, ..., R_m)⟩

  where R_i = {x_j | x_jσ = x_i}, for i ∈ [1, m]. Hence RStick(σ) = {x_i | R_i ≠ ∅}.
• For a stair w ∈ R+ where A(x_1,...,x_m) --w--> B(x_1,...,x_m)σ (with supp(σ) ⊆ {x_1, x_2, ..., x_m}) we define its colour as

  Col(w) = ⟨A, B, col(σ)⟩.

• If Seq = H_0, w_1, H_1, w_2, H_2, w_3, ... is a canonical sequence, then for i < j (i, j ∈ N) we put COL_Seq(i,j) = Col(w_{i+1} w_{i+2} ··· w_j).
• A canonical sequence Seq is a monochromatic sequence if there is a “colour” c = ⟨A, A, (S, (R_1,...,R_m))⟩ such that COL_Seq(i,j) = c for all i < j. (This could not hold for c = ⟨A, B, ...⟩ where A ≠ B.) In this case we put Surv_Seq = S and RStick_Seq = {x_i | R_i ≠ ∅}.
• If Seq is a monochromatic sequence H_0, w_1, H_1, w_2, H_2, w_3, ..., presented as A(x_1,...,x_m)σ_0, w_1, A(x_1,...,x_m)σ_1, w_2, A(x_1,...,x_m)σ_2, w_3, ..., and we have x_iσ_0 ∼ x_iσ_1 ∼ x_iσ_2 ∼ ··· for each x_i ∈ Surv_Seq, then Seq is a strong monochromatic sequence. (For each x_i ∈ Surv_Seq there is a fixed bisim-class such that the ith root-successor in H_j is from this fixed class, for each j ∈ N.)

Proposition 14. If a bisim-infinite term E_0 has no witness and Seq is a canonical sequence reachable from E_0, then Seq has a strong monochromatic subsequence.

Proof. We fix a bisim-infinite term E_0 that has no witness (assuming such an E_0 exists); we further fix a canonical sequence Seq reachable from E_0. By the infinite Ramsey theorem (there are only finitely many colours) there is a monochromatic subsequence Seq' of Seq. We will thus immediately assume that Seq = H_0, w_1, H_1, w_2, H_2, w_3, ... is monochromatic, that E_0 --u--> H_0, and that

COL_Seq(i,j) = c = ⟨A, A, (Surv_Seq, (R_1,...,R_m))⟩ for all i < j (i, j ∈ N).

Hence Col(w_j w_{j+1} ··· w_{j+ℓ}) = c for all j ∈ N+ and ℓ ∈ N. Let us write H_j = A(x_1,...,x_m)σ_j for all j ∈ N. In more detail, σ_{j+1} = σ'_{j+1}σ_j where A(x_1,...,x_m) --w_{j+1}--> A(x_1,...,x_m)σ'_{j+1}, for all j ∈ N; hence σ_j = σ'_j σ'_{j−1} ··· σ'_1 σ_0. We thus also have

col(σ'_ℓ σ'_{ℓ−1} ··· σ'_j) = ⟨Surv_Seq, (R_1,...,R_m)⟩ for all ℓ ≥ j ≥ 1,

which also entails that RStick(σ'_ℓ σ'_{ℓ−1} ··· σ'_j) = RStick_Seq = {x_i | R_i ≠ ∅}.

We now verify that σ'_1 is colour-idempotent; the definition requires two conditions:

1. (x_i ∈ RStick(σ'_1) entails that x_iσ'_1 = x_i.) Suppose x_i ∈ RStick(σ'_1) = RStick_Seq, hence x_jσ'_1 = x_i for some j ∈ [1,m], and thus x_j ∈ R_i. Since col(σ'_1) = col(σ'_2) = col(σ'_2σ'_1), we also have x_jσ'_2 = x_i and x_jσ'_2σ'_1 = x_i, which entails x_iσ'_1 = x_i.
2. (x_i ∈ Surv(σ'_1) entails that x_i occurs in x_jσ'_1 for some x_j ∈ Surv(σ'_1).) For x_i ∈ Surv(σ'_1) we also have x_i ∈ Surv(σ'_2σ'_1) (since Surv(σ'_1) = Surv(σ'_2σ'_1) = Surv_Seq), hence there is x_j ∈ Surv(σ'_2) such that x_i occurs in x_jσ'_1; since Surv(σ'_2) = Surv(σ'_1), we have that x_i occurs in x_jσ'_1 for some x_j ∈ Surv(σ'_1).

The colour-idempotency claim on σ'_1 can be obviously generalized, but it suffices for us to note that each nonempty R_i (in the colour c) contains x_i, hence x_iσ'_j = x_i for all x_i ∈ RStick_Seq and j ∈ N+.

Let w = w_1, σ = σ'_1, and j ∈ N. We have shown that w is a colour-idempotent stair, hence the pair (u w_1 w_2 ··· w_j, w) is a candidate for a witness of bisim-infiniteness of E_0. We consider the path

E_0 --u w_1 w_2 ··· w_j--> A(x_1,...,x_m)σ_j --w--> A(x_1,...,x_m)σσ_j --w--> A(x_1,...,x_m)σσσ_j --w--> ···

and the corresponding limit Lim_j = A(x_1,...,x_m)σ^ω σ_j. The set of variables occurring in the term A(x_1,...,x_m)σ^ω is RStick(σ) = RStick_Seq (recall Proposition 8). Since x_iσ'_jσ'_{j−1} ··· σ'_1 = x_i for each x_i ∈ RStick_Seq, we have

Lim_j = A(x_1,...,x_m)σ^ω σ_j = A(x_1,...,x_m)σ^ω σ'_j σ'_{j−1} ··· σ'_1 σ_0 = A(x_1,...,x_m)σ^ω σ_0,

or Lim_j = A(x_1,...,x_m)σ^ω σ'_0 where σ'_0 is the restriction of σ_0 to RStick_Seq. Hence Lim_0 = Lim_1 = Lim_2 = ···, and we thus write just Lim instead of Lim_j. Let e ∈ N be the value related to A, σ, and σ'_0 as in Lemma 11. Since we assume that E_0 has no witness, we have A(x_1,...,x_m)σ^e σ_j ∼ Lim; this holds for all j ∈ N.

Each x_i ∈ Surv_Seq occurs in A(x_1,...,x_m)σ^e (by Proposition 8, since σ is colour-idempotent), and there is thus a (“sinking”) path A(x_1,...,x_m)σ^e --v--> x_i. Hence there is a bound, independent of j, such that all terms x_iσ_j, where x_i ∈ Surv_Seq (and j ∈ N), are in a bounded (reachability) distance from A(x_1,...,x_m)σ^e σ_j. Since A(x_1,...,x_m)σ^e σ_j ∼ Lim, every path A(x_1,...,x_m)σ^e σ_j --v--> x_iσ_j must be matched by a path Lim --v'--> G where |v'| = |v| and x_iσ_j ∼ G. Hence the equivalence classes [x_iσ_j]_∼, for x_i ∈ Surv_Seq and j ∈ N, are in a bounded distance from the class [Lim]_∼ (in the quotient LTS related to L^a_G); due to finite branching, the respective set F of classes in this bounded distance from [Lim]_∼ is finite. The pigeonhole principle thus yields that Seq indeed has a strong monochromatic subsequence. □
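To make the colour notions used above concrete, the following minimal Python sketch computes Surv(σ), RStick(σ), and col(σ) for a stair substitution given as an explicit finite map; the tuple-based term representation and all concrete names are assumptions of this illustration only, not part of the paper's formalism.

```python
# A term is a variable ('var', i) or a tuple (A, t_1, ..., t_k); a substitution
# sigma is a dict {i: term} saying what x_i is mapped to (absent indices map to themselves).

def variables(term):
    """Indices i such that x_i occurs in the (finite) term."""
    if term[0] == 'var':
        return {term[1]}
    result = set()
    for child in term[1:]:
        result |= variables(child)
    return result

def apply_to_var(i, sigma):
    """The term x_i sigma."""
    return sigma.get(i, ('var', i))

def col(sigma, m):
    """col(sigma) = (Surv(sigma), (R_1, ..., R_m)), where R_i = {j | x_j sigma = x_i}
       and Surv(sigma) is the set of variables occurring in A(x_1,...,x_m)sigma."""
    surv = set()
    for j in range(1, m + 1):
        surv |= variables(apply_to_var(j, sigma))
    R = tuple(frozenset(j for j in range(1, m + 1) if apply_to_var(j, sigma) == ('var', i))
              for i in range(1, m + 1))
    return frozenset(surv), R

def rstick(sigma, m):
    """RStick(sigma) = {i | R_i is nonempty}."""
    _, R = col(sigma, m)
    return {i for i, Ri in zip(range(1, m + 1), R) if Ri}

# Example with m = 3:  x_1 -> x_2,  x_2 -> B(x_2, x_3, x_1),  x_3 -> x_2
sigma = {1: ('var', 2), 2: ('B', ('var', 2), ('var', 3), ('var', 1)), 3: ('var', 2)}
print(col(sigma, 3))      # Surv = {1, 2, 3};  R_2 = {1, 3}, the other R_i empty
print(rstick(sigma, 3))   # {2}
```

Col(w) for a stair w is then obtained by applying col to the substitution σ with A(x_1,...,x_m) --w--> B(x_1,...,x_m)σ.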
Lemma 15. For each grammar G and each bisim-infinite term E_0 there is a witness (satisfying condition 1, namely A(x_1,...,x_m)σ^e σ_0 ≁ Lim, in Lemma 11).

Proof. Given a grammar G = (N, Σ, R), let E_0 be a bisim-infinite term; for the sake of contradiction we assume that E_0 has no witness. We fix a witness sequence

Seq = H_0, w_1, H_1, w_2, H_2, w_3, ...

where H_0 is reachable from E_0; it exists by Proposition 13. We recall that H_j ≁ H_{j'} for j ≠ j' (since Seq is a witness sequence), and w_j, for each j ∈ N+, is a nonempty finite sequence of canonical simple stairs for which H_{j−1} --w_j--> H_j; this entails H_0 --w_1--> ··· --w_j--> H_j.

For each j ∈ N+ we fix a shortest canonical stair w'_j for which there is H'_j such that H_0 --w'_j--> H'_j and H_j ∼ H'_j. All w'_j (j ∈ N+) are pairwise different finite sequences of canonical simple stairs. (We have w'_j ≠ w'_{j'} for j ≠ j' since H_j ≁ H_{j'}.) By König's lemma we can thus easily derive that we can fix an infinite sequence u_1 u_2 u_3 ··· of canonical simple stairs such that each sequence u_1 u_2 ··· u_ℓ (for ℓ ∈ N) is a prefix of w'_j for infinitely many j ∈ N+. We now consider the canonical sequence H_0, u_1, G_1, u_2, G_2, ..., which is reachable from E_0 (since H_0 is reachable from E_0); we also write H_0 as A(x_1,...,x_m)σ_0. Since E_0 has no witness, by Proposition 14 this sequence has a strong monochromatic subsequence

Seq' = G_{i_0}, v_1, G_{i_1}, v_2, G_{i_2}, v_3, ...,

where v_{j+1} = u_{i_j+1} u_{i_j+2} ··· u_{i_{j+1}}, for each j ∈ N; we also put v_0 = u_1 u_2 ··· u_{i_0}. (We do not exclude that G_{i_j} ∼ G_{i_{j'}} for j ≠ j'.) Hence there are B ∈ N and stair substitutions σ'_0, σ'_1, σ'_2, ... such that A(x_1,...,x_m) --v_0--> B(x_1,...,x_m)σ'_0 and B(x_1,...,x_m) --v_{j+1}--> B(x_1,...,x_m)σ'_{j+1} (for all j ∈ N); the infinite path H_0 --v_0--> G_{i_0} --v_1--> G_{i_1} --v_2--> ··· can thus be written as

A(x_1,...,x_m)σ_0 --v_0--> B(x_1,...,x_m)σ'_0σ_0 --v_1--> B(x_1,...,x_m)σ'_1σ'_0σ_0 --v_2--> ···.    (6)

We have Surv(σ'_j) = Surv_Seq' for all j ∈ N+, and for each x ∈ Surv_Seq' the bisim-class of the term xσ'_jσ'_{j−1} ··· σ'_0σ_0 (j ∈ N) is independent of j (since Seq' is strong monochromatic).

By our choice of the sequence u_1 u_2 u_3 ..., we can fix some j such that w'_j = v_0 v_1 v_2 w for a (canonical) stair w. Hence H_0 --w'_j--> H'_j can be written as

A(x_1,...,x_m)σ_0 --v_0--> B(x_1,...,x_m)σ'_0σ_0 --v_1 v_2--> B(x_1,...,x_m)σ'_2σ'_1σ'_0σ_0 --w--> H'_j

where B(x_1,...,x_m) --w--> C(x_1,...,x_m)σ and H'_j = C(x_1,...,x_m)σσ'_2σ'_1σ'_0σ_0.

We finish the proof by showing that w'' = v_0 v_2 w (arising from w'_j by omitting v_1) contradicts the length-minimality condition put on w'_j. We obviously have H_0 --w''--> H'' where H'' = C(x_1,...,x_m)σσ'_2σ'_0σ_0, and it is thus sufficient to show that H'_j ∼ H'', i.e.,

C(x_1,...,x_m)σσ'_2σ'_1σ'_0σ_0 ∼ C(x_1,...,x_m)σσ'_2σ'_0σ_0.    (7)

Since var(C(x_1,...,x_m)σ) ⊆ {x_1,...,x_m} and Surv(σ'_2) = ⋃_{i∈[1,m]} var(x_iσ'_2) = Surv_Seq', we derive that var(C(x_1,...,x_m)σσ'_2) ⊆ Surv_Seq'. For each x ∈ var(C(x_1,...,x_m)σσ'_2) we thus have xσ'_1σ'_0σ_0 ∼ xσ'_0σ_0 (recall the reasoning around (6)); hence the claim (7) follows due to compositionality captured by Proposition 7. □

4. Additional remarks

The idea of the decision procedures for bisim-finiteness (or language regularity) in the deterministic case studied in [1,2] could be roughly explained in our framework as follows. For a deterministic grammar (with at most one rule A(x_1,...,x_m) --a--> .. for each nonterminal A and each action a), if we have

E_0 --u--> A(x_1,...,x_m)σ_0 --w--> A(x_1,...,x_m)σσ_0

where w is a colour-idempotent stair, then either A(x_1,...,x_m)σ_0 ∼ A(x_1,...,x_m)σσ_0 (in which case A(x_1,...,x_m)σσ_0 can always be safely replaced with the smaller A(x_1,...,x_m)σ_0), or A(x_1,...,x_m)σ_0 ≁ A(x_1,...,x_m)σσ_0 and E_0 is bisim-infinite. This allows us to derive a bound on the size of the potential equivalent finite system, and thus the decidability of the full equivalence (Theorem 3) is not needed here. In the case equivalent to normed pushdown processes, the regularity problem essentially coincides with the boundedness problem, and is thus much simpler. (See, e.g., [5] for a further discussion.)
Declaration of competing interest

I wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Acknowledgments

This research was mostly carried out when I was affiliated with Techn. Univ. Ostrava and supported by the Grant Agency of the Czech Rep., project GAČR:15-13784S. I also thank Stefan Göller for drawing my attention to the decidability question for regularity of pushdown processes, for discussions about some related works (like [2]), and for detailed comments on a previous version of this paper. My thanks also go to the anonymous reviewers for their helpful comments that have also prompted me to further simplifications of the proof.

Appendix A

At the ends of Sections 2 and 3.2.1 the issues of transforming pushdown automata to first-order grammars, of normalizing the grammars, and of unifying the nonterminal arities were mentioned. We now deal with these issues in more detail.

A.1. Transforming pushdown automata to first-order grammars

A pushdown automaton (PDA) is a tuple M = (Q, Σ, Γ, Δ) of finite sets where the elements of Q, Σ, Γ are called control states, actions (or terminal letters), and stack symbols, respectively; Δ contains transition rules of the form pY --a--> qα where p, q ∈ Q, Y ∈ Γ, a ∈ Σ ∪ {ε}, and α ∈ Γ*. (We assume ε ∉ Σ.) A PDA M = (Q, Σ, Γ, Δ) generates the labelled transition system

L_M = (Q × Γ*, Σ ∪ {ε}, (--a-->)_{a ∈ Σ∪{ε}})

where each rule pY --a--> qα induces transitions pYβ --a--> qαβ for all β ∈ Γ*.

Fig. 13 (left) presents a PDA-configuration (i.e. a state in L_M) pACB as a term; here we assume that Q = {q_1, q_2, q_3}. (The string pACB, depicted on the left in a convenient vertical form, is transformed into a term presented by an acyclic graph in the figure.) On the right in Fig. 13 we can see a transformation of a PDA-rule pA --a--> qCA into a grammar rule.

Formally, for a PDA M = (Q, Σ, Γ, Δ), where Q = {q_1, q_2, ..., q_m}, we can define the first-order grammar G_M = (N, Σ ∪ {ε}, R) where N = Q ∪ (Q × Γ), with arity(q) = 0 and arity((q,X)) = m; the set R is defined below. We write [q] and [qY] for the nonterminals q and (q,Y), respectively, and we map each configuration pα to the term T(pα) by structural induction: T(pε) = [p], and T(pYα) = [pY](T(q_1α), T(q_2α), ..., T(q_mα)). For a smooth transformation of rules we introduce a special “stack variable” x, and we put T(q_ix) = x_i (for all i ∈ [1, m]). A PDA-rule pY --a--> qα in Δ is transformed to the grammar rule T(pYx) --a--> T(qαx) in R. (Hence pY --a--> q_i is transformed to [pY](x_1,...,x_m) --a--> x_i, and pY --a--> qZα is transformed to [pY](x_1,...,x_m) --a--> [qZ](T(q_1αx), ..., T(q_mαx)).)

It is obvious that the LTS L_M is isomorphic with the restriction of the LTS L^a_{G_M} to the states T(pα) where pα are configurations of M; moreover, the set {T(pα) | p ∈ Q, α ∈ Γ*} is closed w.r.t. reachability in L^a_{G_M} (if T(pα) --a--> F in L^a_{G_M}, then F = T(qβ) where pα --a--> qβ in L_M).
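A minimal Python sketch of the mapping T and of the rule translation may help to see the shape of the construction; the concrete representation (terms as nested tuples, a configuration as a state index plus a stack string) is an assumption made only for this sketch, and the sharing used in the acyclic-graph presentation of Fig. 13 is not modelled (the sketch builds plain trees).

```python
# Representation assumed for this illustration only:
# the nullary nonterminal [p] is ('q', p); the nonterminal [pY] applied to its
# m root-successors is ('qY', p, Y, (t_1, ..., t_m)); a variable x_i is ('var', i).

def T(p, alpha, states):
    """The term T(p alpha) for control state index p and stack string alpha."""
    if alpha == "":
        return ('q', p)                                    # T(p eps) = [p]
    Y, rest = alpha[0], alpha[1:]
    return ('qY', p, Y, tuple(T(q, rest, states) for q in states))

def T_open(p, alpha, states):
    """T(p alpha x): as T, but the empty stack is the 'stack variable' x, so T(q_i x) = x_i."""
    if alpha == "":
        return ('var', p)
    Y, rest = alpha[0], alpha[1:]
    return ('qY', p, Y, tuple(T_open(q, rest, states) for q in states))

def translate_rule(p, Y, a, q, alpha, states):
    """PDA rule  pY --a--> q alpha  becomes the grammar rule  T(pYx) --a--> T(q alpha x)."""
    return (T_open(p, Y, states), a, T_open(q, alpha, states))

states = [1, 2, 3]                   # Q = {q_1, q_2, q_3}, as in Fig. 13
term = T(1, "ACB", states)           # the term for the configuration q_1 A C B (cf. Fig. 13)
lhs, a, rhs = translate_rule(1, 'A', 'a', 2, 'CA', states)
# lhs = [q_1 A](x_1, x_2, x_3);  rhs = [q_2 C](T(q_1 A x), T(q_2 A x), T(q_3 A x))
```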
In fact, we have not allowed ε-rules A(x_1,...,x_m) --ε--> E in our definition of first-order grammars. We would have to consider a variant of so-called weak bisimilarity in such a case, which is undecidable in general (see, e.g., [19] for a further discussion). At the end of Section 2 we mention restricted PDAs where the ε-rules pY --ε--> qα can only be popping, i.e. α = ε in such rules, and deterministic (or having no alternative), which means that if there is a rule pY --ε--> q in Δ then there is no other rule with the left-hand side pY (of the form pY --a--> q'α' where a ∈ Σ ∪ {ε}). We define the stable configurations as pε and pYα where there is no rule pY --ε--> .. in Δ; for the restricted PDAs we have that any unstable configuration pα can only perform a finite sequence of ε-transitions, which reaches a stable configuration. Hence it is natural to restrict the attention to the “visible” transitions pα --a--> qβ (a ∈ Σ) between stable configurations; such transitions might encompass sequences of ε-steps.

In defining the grammar G_M we can naturally avoid the explicit use of deterministic popping ε-transitions by “preprocessing” them: in our inductive definition of T(pα) (and T(pαx)) we add the following item: if pY is unstable, since there is a rule pY --ε--> q, then T(pYα) = T(qα). Fig. 14 (right) shows the grammar rule T(pAx) --a--> T(qCAx) (arising from the PDA-rule pA --a--> qCA), when Q = {q_1, q_2, q_3} and there is a PDA-rule q_2A --ε--> q_3, while q_1A and q_3A are stable. Such preprocessing means that the term T(pα) can have branches of varying lengths.

A.2. Normalization of grammars

We call a grammar G = (N, Σ, R) normalized if for each A ∈ N and each i ∈ [1, arity(A)] there is a (“sink”) word w(A,i) ∈ R+ such that A(x_1,...,x_{arity(A)}) --w(A,i)--> x_i.
Fig. 13. PDA configuration as a term (left), and transforming a rule (right).
Fig. 14. Deterministic popping ε-transitions are “preprocessed”.
For any grammar G = (N, Σ, R) we can find some words w(A,i), or find out their non-existence, for all A ∈ N and i ∈ [1, arity(A)], as shown below. For technical convenience we will also find some words w(E,x_i) ∈ R* satisfying E --w(E,x_i)--> x_i for the subterms E of the right-hand sides (rhs) of the rules in R. We put w(x_i,x_i) = ε (for the subterms x_i of the rhs), while all other w(E,x_i) and all w(A,i) are undefined in the beginning. Then we repeatedly define so far undefined w(A,i) or w(E,x_i) by applying the following constructions, as long as possible:

• put w(A,i) = r w(E,x_i) if there is a rule r : A(x_1,...,x_m) --a--> E and w(E,x_i) is defined;
• put w(E,x_i) = w(A,j) w(E',x_i) if root(E) = A, E' is the jth root-successor in E, and w(A,j), w(E',x_i) are defined.

The correctness is obvious (a small sketch of this saturation follows below). The process could be modified to find some shortest w(A,i) (and w(E,x_i)) that exist, but this is not important here. For any pair (A,i) for which w(A,i) has remained undefined, such a word obviously does not exist; hence the ith root-successor G_i of any term A(G_1,...,G_m) is “non-exposable” and thus plays “no role” (not affecting the bisim-class of A(G_1,...,G_m)). We will now show a safe removal of such non-exposable root-successors.

For G = (N, Σ, R) we put G' = (N', Σ, R') where the sets N' = {A' | A ∈ N} and R' = {r' | r ∈ R} are defined below. For A ∈ N, we put

Sink(A) = {i ∈ [1, arity(A)] | there is some w(A,i)},

and arity(A') = |Sink(A)|. We define the mapping Cut : Terms_N → Terms_N' by the following structural induction:

1. Cut(x_i) = x_i;
2. Cut(A(G_1, G_2, ..., G_m)) = A'(Cut(G_{i_1}), Cut(G_{i_2}), ..., Cut(G_{i_{m'}})), where 1 ≤ i_1 < i_2 < ··· < i_{m'} ≤ m and {i_1, i_2, ..., i_{m'}} = Sink(A).
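The following Python sketch of the saturation (under an assumed tuple representation of finite terms and rules, with rule names standing for the elements of R) computes the words w(A,i) as rule-name sequences and then the sets Sink(A); it is only an illustration, not an optimized procedure.

```python
# A finite term is ('var', i) or (A, t_1, ..., t_k); a rule is
# (name, A, m, action, rhs), standing for  name : A(x_1,...,x_m) --action--> rhs.

def subterms(t):
    yield t
    if t[0] != 'var':
        for child in t[1:]:
            yield from subterms(child)

def sink_words(rules):
    """Saturation computing w(A,i) and the auxiliary w(E,x_i) (as rule-name sequences)."""
    wA, wE = {}, {}
    for (_, _, _, _, rhs) in rules:            # base case: w(x_i, x_i) = empty sequence
        for E in subterms(rhs):
            if E[0] == 'var':
                wE[(E, E[1])] = ()
    changed = True
    while changed:                              # repeat the two constructions as long as possible
        changed = False
        for (name, A, m, _, rhs) in rules:
            var_idx = {t[1] for t in subterms(rhs) if t[0] == 'var'}
            for i in range(1, m + 1):           # w(A,i) = r . w(rhs, x_i)
                if (A, i) not in wA and (rhs, i) in wE:
                    wA[(A, i)] = (name,) + wE[(rhs, i)]
                    changed = True
            for E in subterms(rhs):             # w(E, x_i) = w(root(E), j) . w(E_j, x_i)
                if E[0] == 'var':
                    continue
                B, children = E[0], E[1:]
                for j, Ej in enumerate(children, start=1):
                    if (B, j) not in wA:
                        continue
                    for i in var_idx:
                        if (Ej, i) in wE and (E, i) not in wE:
                            wE[(E, i)] = wA[(B, j)] + wE[(Ej, i)]
                            changed = True
    return wA, wE

def sink_sets(rules):
    """Sink(A) for the nonterminals occurring on left-hand sides of rules."""
    wA, _ = sink_words(rules)
    arity = {A: m for (_, A, m, _, _) in rules}
    return {A: {i for i in range(1, arity[A] + 1) if (A, i) in wA} for A in arity}

# Tiny example:  r1 : A(x1,x2) --a--> x1,   r2 : B(x1) --b--> A(x1, B(x1))
rules = [('r1', 'A', 2, 'a', ('var', 1)),
         ('r2', 'B', 1, 'b', ('A', ('var', 1), ('B', ('var', 1))))]
print(sink_sets(rules))   # {'A': {1}, 'B': {1}}  (x2 is non-exposable for A)
```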
Fig. 15. Modifying (cutting) a rule r, when w(A,1), w(C,1), and w(D,2) do not exist.
Fig. 16. Padding a rule in R (left) with x4 , when m = 3.
The set of rules R' = {r' | r ∈ R} is defined as follows:

for r : A(x_1,...,x_m) --a--> E we put r' : Cut(A(x_1,...,x_m))σ --a--> Cut(E)σ where

σ = {(x_{i_1}, x_1), (x_{i_2}, x_2), ..., (x_{i_{m'}}, x_{m'})} for {i_1, i_2, ..., i_{m'}} = Sink(A).

(Fig. 15 depicts the transformation of r : A(x_1, x_2, x_3) --b--> C(A(x_2, x_1, B), D(x_2, x_3)) to r' : A'(x_2, x_3)σ --b--> C'(D'(x_2))σ where σ = {(x_2, x_1), (x_3, x_2)}; a small worked sketch of this example follows below.) We note that for each variable x_i occurring in Cut(E) we must have i ∈ Sink(A) = {i_1, i_2, ..., i_{m'}}; hence σ yields a one-to-one renaming of “place-holders” in Cut(A(x_1,...,x_m)) --a--> Cut(E).
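As a small self-contained illustration, the following sketch redoes the Fig. 15 example: it assumes Sink(A) = {2,3}, Sink(C) = {2}, Sink(D) = {1} (as implied by the caption, since w(A,1), w(C,1), w(D,2) do not exist) and computes the cut rule r'. The tuple representation and the priming of nonterminal names are assumptions of the sketch.

```python
# Terms: ('var', i) or (A, t_1, ..., t_k).  Sink sets as would be computed by the
# previous sketch; here they are simply assumed, following the Fig. 15 caption.
SINK = {'A': [2, 3], 'C': [2], 'D': [1], 'B': []}

def cut(term):
    """Cut drops the non-exposable root-successors, keeping the Sink(A)-indexed ones."""
    if term[0] == 'var':
        return term
    head, children = term[0], term[1:]
    kept = [cut(children[i - 1]) for i in sorted(SINK[head])]
    return (head + "'",) + tuple(kept)            # A is renamed to A'

def rename(term, sigma):
    """Apply the variable renaming sigma = {old_index: new_index}."""
    if term[0] == 'var':
        return ('var', sigma.get(term[1], term[1]))
    return (term[0],) + tuple(rename(t, sigma) for t in term[1:])

def cut_rule(A, m, action, rhs):
    """r : A(x_1..x_m) --action--> rhs  becomes  r' : Cut(A(x_1..x_m))sigma --action--> Cut(rhs)sigma."""
    sigma = {old: new for new, old in enumerate(sorted(SINK[A]), start=1)}
    lhs = (A,) + tuple(('var', i) for i in range(1, m + 1))
    return rename(cut(lhs), sigma), action, rename(cut(rhs), sigma)

# r : A(x1,x2,x3) --b--> C(A(x2,x1,B), D(x2,x3))
rhs = ('C', ('A', ('var', 2), ('var', 1), ('B',)), ('D', ('var', 2), ('var', 3)))
print(cut_rule('A', 3, 'b', rhs))
# -> (("A'", ('var',1), ('var',2)), 'b', ("C'", ("D'", ('var',1))))
#    i.e.  r' : A'(x1,x2) --b--> C'(D'(x1)),  matching Fig. 15
```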
It is easy to check that Cut maps Terms_N onto Terms_N', and that G --r--> H in L^r_G implies Cut(G) --r'--> Cut(H) in L^r_{G'}; moreover, if G' --r'--> H' in L^r_{G'} and G ∈ Cut^{-1}(G'), then there is H ∈ Cut^{-1}(H') such that G --r--> H in L^r_G.

Grammar G' is normalized: if Sink(A) = {i_1, i_2, ..., i_{m'}} then for each j ∈ [1, m'] we have A(x_1,...,x_m) --w(A,i_j)--> x_{i_j}, and thus Cut(A(x_1,...,x_m)) --(w(A,i_j))'--> Cut(x_{i_j}), i.e. A'(x_{i_1},...,x_{i_{m'}}) --(w(A,i_j))'--> x_{i_j}, where w' arises from w by replacing each element r with r'. We also have that the set {(F, Cut(F)) | F ∈ Terms_N} is a bisimulation in the union of L^a_G and L^a_{G'}; hence F ∼ Cut(F), and E_0 is bisim-finite in L^a_G iff Cut(E_0) is bisim-finite in L^a_{G'}. (We can also note that the quotient-LTS (L^a_G)_{≡Cut} is isomorphic with L^a_{G'}, and that the bisimilarity quotients of L^a_G and L^a_{G'} are the same, up to isomorphism.)

A.3. Unification of nonterminal arities

In the proof of Theorem 1 we used m for denoting the arity of each nonterminal in the considered normalized grammar, instead of using m_A for arity(A). This was not crucial, since it would be straightforward to modify the relevant arguments in the proof, but we also mentioned that we could “harmlessly” achieve the uniformity of nonterminal arities by a construction, while keeping the adjusted grammar normalized. We now sketch such a construction.

Suppose G = (N, Σ, R) is normalized and the arities of nonterminals are not all the same; let m be the maximum arity. If there are no nullary nonterminals, then the arities can be unified to m by a straightforward “padding with superfluous copies of root-successors”. But we will pad with a special (infinite regular) term, which handles the case of nullary nonterminals as well. (This is illustrated in Figs. 16 and 17.)

We define the grammar G' = (N', Σ', R') where N' = {A' | A ∈ N} ∪ {A_sp}, Σ' = Σ ∪ {a_sp}, and R' = {r' | r ∈ R} ∪ R_sp as defined below. Each nonterminal in N', including the special nonterminal A_sp, has arity m+1. The set R_sp (of the rules with the special added action a_sp) contains the following rules:
Fig. 17. A term F ∈ Terms_N (left) and Pad_sp(F) (right), when m = 2.
• A_sp(x_1,...,x_{m+1}) --a_sp--> x_i, for all i ∈ [1, m+1];
• A'(x_1,...,x_{m+1}) --a_sp--> x_i, for all A ∈ N and i ∈ [arity(A)+1, m+1].

The definition of R' = R_sp ∪ {r' | r ∈ R} is finished by the following point (see Fig. 16):

• for r : A(x_1,...,x_{arity(A)}) --a--> E we put r' : A'(x_1,...,x_{m+1}) --a--> Pad(E, x_{m+1}).

The expression Pad(E, x_{m+1}) is clarified by the following inductive definition of Pad(F, H) (padding F ∈ Terms_N with a certain H):

1. Pad(x_i, H) = x_i;
2. Pad(A(G_1,...,G_{m'}), H) = A'(Pad(G_1, H), ..., Pad(G_{m'}, H), H, ..., H), where m' = arity(A), and m+1−m' copies of H are used to “fill” the arity m+1 of A'.

Besides the above case Pad(E, x_{m+1}) we use the definition of Pad(F, H) also for H = E_sp, i.e., for the special (infinite regular) term

E_sp = A_sp(x_1,...,x_{m+1})σ^ω where x_iσ = A_sp(x_1,...,x_{m+1}) for all i ∈ [1, m+1].

(In Fig. 17 there are several copies of the least presentation of E_sp, when m = 2.) The behaviour of E_sp is trivial: its only outgoing transition in L^a_{G'} is the loop E_sp --a_sp--> E_sp. We define the mapping Pad_sp : Terms_N → Terms_N' by Pad_sp(F) = Pad(F, E_sp)σ_sp where x_iσ_sp = E_sp for each variable x_i. (The support of σ_sp is infinite but this causes no problem.) Hence there are no variables in Pad_sp(F). (Fig. 17 shows an example. We note that if the nullary nonterminal B happens to be dead in L^a_G, then Pad_sp(B) ∼ E_sp in L^a_{G'}; this is the reason for replacing the variables with E_sp.) The mapping Pad_sp is injective (but not onto Terms_N') and the following conditions obviously hold:
• if G --r--> H (in L^r_G) then Pad_sp(G) --r'--> Pad_sp(H) (in L^r_{G'});
• if Pad_sp(G) --r'--> H' then there is H such that Pad_sp(H) = H' and G --r--> H.

The rules in R_sp guarantee that G' is normalized (if G is normalized), and they also induce that the special action a_sp is enabled in any term Pad_sp(G) (in L^a_{G'}); moreover, Pad_sp(G) --a_sp--> H entails that H = E_sp. We now note that any set B ⊆ Terms_N × Terms_N is a bisimulation in L^a_G iff B' = {(E_sp, E_sp)} ∪ {(Pad_sp(F), Pad_sp(G)) | (F,G) ∈ B} is a bisimulation in L^a_{G'}. We deduce that E ∼ F in L^a_G iff Pad_sp(E) ∼ Pad_sp(F) in L^a_{G'}, and E_0 is bisim-finite in L^a_G iff Pad_sp(E_0) is bisim-finite in L^a_{G'}.
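To conclude, here is a minimal sketch of Pad and Pad_sp over the same assumed tuple representation; the infinite regular term E_sp is kept as a symbolic constant, which suffices for this illustration (its only behaviour being the a_sp-loop), and the arities and nonterminal names are assumptions of the sketch.

```python
# Terms: ('var', i) or (A, t_1, ..., t_arity(A)); E_sp is kept symbolic.
E_SP = ('E_sp',)                    # stands for the infinite regular term A_sp(...)sigma^omega

ARITY = {'A': 2, 'B': 0, 'C': 1}    # assumed arities of the normalized grammar G
M = max(ARITY.values())             # maximum arity m; all new nonterminals get arity m+1

def pad(term, filler):
    """Pad(F, H): pad every nonterminal up to arity m+1 with copies of H."""
    if term[0] == 'var' or term == E_SP:
        return term
    head, children = term[0], term[1:]
    padded = tuple(pad(t, filler) for t in children)
    return (head + "'",) + padded + (filler,) * (M + 1 - ARITY[head])

def subst_vars(term):
    """Replace every remaining variable by E_sp."""
    if term[0] == 'var':
        return E_SP
    return (term[0],) + tuple(subst_vars(t) for t in term[1:])

def pad_sp(term):
    """Pad_sp(F) = Pad(F, E_sp) with the variables also replaced by E_sp."""
    return subst_vars(pad(term, E_SP))

# Example with m = 2:  F = A(B, C(x_1))
F = ('A', ('B',), ('C', ('var', 1)))
print(pad(F, ('var', M + 1)))   # rule-padding with x_{m+1}: A'(B'(x3,x3,x3), C'(x1,x3,x3), x3)
print(pad_sp(F))                # A'(B'(E_sp,E_sp,E_sp), C'(E_sp,E_sp,E_sp), E_sp)
```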
References

[1] R.E. Stearns, A regularity test for pushdown machines, Inf. Control 11 (3) (1967) 323–340.
[2] L.G. Valiant, Regularity and related problems for deterministic pushdown automata, J. ACM 22 (1) (1975) 1–10.
[3] G. Sénizergues, L(A) = L(B)? Decidability results from complete formal systems, Theor. Comput. Sci. 251 (1–2) (2001) 1–166.
[4] R. Milner, Communication and Concurrency, Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1989.
[5] J. Srba, Roadmap of infinite results, in: Current Trends in Theoretical Computer Science: The Challenge of the New Century, vol. 2, World Scientific Publishing Co., 2004, pp. 337–350, updated version at http://users-cs.au.dk/srba/roadmap/.
[6] G. Sénizergues, The bisimulation problem for equational graphs of finite out-degree, SIAM J. Comput. 34 (5) (2005) 1025–1106.
[7] M. Benedikt, S. Göller, S. Kiefer, A.S. Murawski, Bisimilarity of pushdown automata is nonelementary, in: Proc. LICS 2013, IEEE Computer Society, 2013, pp. 488–498.
[8] S. Schmitz, Complexity hierarchies beyond elementary, ACM Trans. Comput. Theory 8 (1) (2016) 3.
[9] P. Jančar, Equivalences of pushdown systems are hard, in: Proc. FOSSACS 2014, in: LNCS, vol. 8412, Springer, 2014, pp. 1–28.
[10] C. Stirling, Deciding DPDA equivalence is primitive recursive, in: Proc. ICALP'02, in: LNCS, vol. 2380, Springer, 2002, pp. 821–832.
[11] A. Kučera, R. Mayr, On the complexity of checking semantic equivalences between pushdown processes and finite-state processes, Inf. Comput. 208 (7) (2010) 772–796.
[12] C.H. Broadbent, S. Göller, On bisimilarity of higher-order pushdown automata: undecidability at order two, in: FSTTCS 2012, in: LIPIcs, vol. 18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2012, pp. 160–172.
[13] P. Jančar, Bisimulation equivalence of first-order grammars, in: Proc. ICALP'14 (II), in: LNCS, vol. 8573, Springer, 2014, pp. 232–243.
[14] B. Courcelle, The monadic second-order logic of graphs IX: machines and their behaviours, Theor. Comput. Sci. 151 (1) (1995) 125–162.
[15] T. Knapik, D. Niwinski, P. Urzyczyn, Higher-order pushdown trees are easy, in: Proc. FOSSACS 2002, in: LNCS, vol. 2303, Springer, 2002, pp. 205–222.
[16] L. Ong, Higher-order model checking: an overview, in: Proc. LICS 2015, IEEE Computer Society, 2015, pp. 1–15.
[17] I. Walukiewicz, Automata theory and higher-order model-checking, ACM SIGLOG News 3 (4) (2016) 13–31.
[18] B. Courcelle, Recursive applicative program schemes, in: J. van Leeuwen (Ed.), Handbook of Theoretical Computer Science, vol. B, Elsevier, MIT Press, 1990, pp. 459–492.
[19] P. Jančar, J. Srba, Undecidability of bisimilarity by defender's forcing, J. ACM 55 (1) (2008).
[20] P. Jančar, J. Srba, Note on undecidability of bisimilarity for second-order pushdown processes, CoRR, arXiv:1303.0780.