Robustness and closure properties of recognizable languages in adhesive categories

Robustness and closure properties of recognizable languages in adhesive categories

Science of Computer Programming 104 (2015) 71–98 Contents lists available at ScienceDirect Science of Computer Programming www.elsevier.com/locate/s...

695KB Sizes 0 Downloads 67 Views

Science of Computer Programming 104 (2015) 71–98

Contents lists available at ScienceDirect

Science of Computer Programming www.elsevier.com/locate/scico

Robustness and closure properties of recognizable languages in adhesive categories ✩ H.J. Sander Bruggink ∗ , Barbara König ∗ , Sebastian Küpper ∗ Fakultät für Ingenieurwissenschaften, Abteilung für Informatik und Angewandte Kognitionswissenschaften, Universität Duisburg-Essen, Germany

a r t i c l e

i n f o

Article history: Received 8 July 2013 Received in revised form 2 April 2014 Accepted 20 August 2014 Available online 2 October 2014 Keywords: Closure properties Recognizable languages Recognizable graph languages Concatenation Adhesive categories

a b s t r a c t We consider recognizable languages of cospans in adhesive categories defined via automaton functors, of which recognizable graph languages are a special case. There are three contributions in this paper: we first show that the notion of recognizable languages is robust in the sense that also semi-functors, i.e., functors that do not necessarily preserve identities, characterize recognizable languages. This is done by converting semi-functors to functors with a procedure akin to epsilon elimination for nondeterministic finite automata. Second, relying on this result, we show that recognizable languages are closed under concatenation, i.e. under cospan composition, by providing a concrete construction that creates a concatenation automaton from two given automata. The construction is considerably more complex than the corresponding construction for finite automata. We conclude by showing negative closure properties for Kleene star and substitution. © 2014 Published by Elsevier B.V.

1. Introduction Regular languages for words and trees are an indispensable concept in many areas of computer science: they are used for instance for parsers, text editors or verification tools. Hence a natural question to ask is whether their counterpart in the world of graphs, recognizable graph languages à la Courcelle [7], can serve the same purpose. In order to use such languages in practice it is necessary to prove closure properties and to give effective procedures in order to constructively realize such closure properties. Here we focus on the closure of recognizable graph languages under concatenation. More concretely, we use an automaton model introduced by us in [4,2], which accepts exactly the recognizable graph languages of Courcelle (for an exact comparison see [4]). Such automata accept graphs equipped with a fixed inner and outer interface (categorically: cospans of graphs). Given two languages L X , K , L K ,Y of cospans, where the cospans are equipped with interfaces X , K , respectively K , Y , by concatenating every cospan of the first language with every cospan of the second language, one obtains a language L X , K ; L K ,Y of cospans equipped with interfaces X , Y . For finite automata on words concatenation can be implemented by a straightforward construction [11], however for graph languages the situation is considerably more complex, due to the fact that graphs are not freely generated, as opposed to words which are the free monoid over a given alphabet. In other words: there is no canonical way to decompose a graph



*

This work was supported by the dfg-project GaReV (KO 2185/6-2). Corresponding authors. E-mail addresses: [email protected] (H.J.S. Bruggink), [email protected] (B. König), [email protected] (S. Küpper).

http://dx.doi.org/10.1016/j.scico.2014.08.006 0167-6423/© 2014 Published by Elsevier B.V.

72

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

into atomic components (or cospans) and hence several decompositions generate the same graph. For graph automata this requires the condition that a graph is accepted independently of its specific decomposition and that two different decompositions of a cospan yield the same transitions in the automaton. We can assume that two automata A X , K , A K ,Y satisfying this requirement and accepting the languages L X , K , L K ,Y are given. Now the challenge is to construct an automaton for the concatenation language that also adheres to this constraint. Intuitively the problem is the following: when accepting a cospan of the concatenation language, one might read it in such a way that the automaton already sees parts of the second cospan before completely processing the first. This is visualized below on the left where cospans are represented by “tube-like” structures. Instead of reading first the cospan from X to K and then the cospan from K to Y , it is possible to start with a cospan from X to some new interface J that overlaps with K and already reaches into the second cospan, without fully containing the first. A concrete example is depicted on the right below. The upper cospan (from X to Y ) can be decomposed in two different ways: either into cospans from X to K and K to Y (as intended, see middle cospan) or into cospans from X to J and J to Y (see lower cospan).

This effect has to be taken into account in the construction of the new automaton. Especially we have to record states of both automata (since we can be in both automata at the same time) and we have to record the current union of the interface graphs K and J (this union will be denoted by U ) and the parts of U that are the current interfaces for the two automata. This construction will not only be given for graph automata, but more generally for languages of cospans of (monic) arrows in adhesive categories [14]. The properties of adhesive categories enable us to work in distributive subobject lattices, which are the key to the construction and the (unfortunately still quite lengthy) proof of its correctness. This greater generality allows us to prove the result for several classes of graph-like structures (directed graphs, hypergraphs, attributed graphs, etc.). The restriction to monos is required for certain crucial points in the proofs. Concatenation for recognizable graph languages was already studied by Courcelle in [7], but for a different setting where graphs have only one interface and are glued over their joint interface. Furthermore Courcelle’s results apply only to concrete graphs, whereas ours is shown in a general categorical setting. The concatenation construction was first presented in the predecessor paper at GT-VMT ’13 [5], where we stated that the concatenation automaton is always a functor. However, we only proved one of the functoriality conditions, but did not check whether identities are always preserved by the resulting functor A, which maps cospans to relations. Unfortunately this is not the case and hence the concatenation automaton, which we construct in Section 4 is in general only a semifunctor. However, the problem can be fixed quite easily by observing that for every identity arrow id X , A (id X ) is always an equivalence relation and one can factor the state space through these equivalences. Still, when solving this problem we were intrigued by the more general question: can any (automaton) semi-functor, not necessarily satisfying the property that identities are mapped to equivalences, be converted into an (automaton) functor? This question is closely related to epsilon elimination for non-deterministic finite automata. The construction we use works for all categories in which isomorphisms decompose into isomorphisms only. This construction can be broken up into three steps of deleting unnecessary states, merging equivalent states and deleting unnecessary edges in the automaton. Moreover, the property that isomorphisms decompose only into isomorphisms, holds in the class of categories of interest, i.e., for cospans of monos in adhesive categories. (However, it does not hold for cospans consisting of non-monic arrows.) Hence, before giving the concatenation construction, in Section 3 we will first describe the conversion of a semi-functor into a functor, since this result can be used in the following section describing the concatenation automaton. We conclude the paper in Section 5 by giving counterexamples for closure properties that fail to hold for graph languages, while being true for word languages: closure under Kleene star and closure under substitution. We will start by giving all necessary definitions, including particularly the definition of recognizable languages, in Section 2. We will then discuss semi-automata and show in Section 3 that under certain conditions they are equivalent to automata. This section is an original contribution not present in our previous paper [5]. In Section 4, we will discuss the construction of the concatenation automaton and show that recognizable languages are closed under concatenation. Section 5 then gives an overview over some negative results for recognizable languages and in Section 6 we will conclude by giving an outlook to possible applications of our theoretical results. The results in Sections 4 and 5 were stated before in [5]. In this paper, we will additionally prove all results.

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

73

2. Recognizable languages and automata 2.1. Category theory We presuppose basic knowledge of category theory. The identity arrow of an object X will be denoted by id X . If f and g are composable arrows, we write f ; g for the morphism f postcomposed with g, i.e. f ; g = g ◦ f . Among other categories we will work with Rel, the category of sets and relations. Here the identity arrow corresponds to the identity relation and arrow composition is simply composition of relations. Let C be a category in which all pushouts exist. A cospan c in C is a pair of arrows c L , c R  with the same codomain: cL cR c: J − −→ G ←−− K . Here J is the domain of c and K is the codomain of c. We often call J the inner interface of c and K the outer interface of c. Let c: J → F ← M and d: M → H ← K be cospans where the codomain of c equals the domain of d. Then the composition of c and d, written as c ; d: J → M  ← K is defined via the following commuting diagram, where the middle diamond is a pushout: cR

J

cL

cR

dL

dL

(PO)

G f

cL

M

H

dR

K

g

M

dR

Two cospans c: J − −→ G ←−− K and d: J −−→ H ←−− K are isomorphic, written c ∼ d, if there exists an isomorphism k from G to H , such that c L ; k = d L and c R ; k = d R . A semi-abstract cospan is a ∼-equivalence class of cospans. We will often identify a ∼-equivalence class with one of its representatives. The category Cospan(C ) is defined as the category that has the objects of C as objects and semi-abstract cospans as arrows. The identity arrows are given as the equivalence class of idG

idG

identity cospans G −−→ G ←−− G. We will now define the notion of semi-functors of a category. Definition 2.1. Let C , D be categories. A map F : C → D is called a semi-functor if for all arrows h = f ; g in C we have F (h) = F ( f ) ; F ( g ). A semi-functor that also satisfies F (id X ) = id F ( X ) is called a functor. 2.2. Adhesive categories For this paper we want to investigate a specific type of categories, so-called adhesive categories, defined as in [14]. Definition 2.2. A category C is called adhesive if

• C has pullbacks • C has pushouts along monomorphisms • pushouts along monomorphisms are Van Kampen-pushouts, i.e. whenever the lower square in the following picture is a pushout along a monomorphism and the front squares are pullbacks it holds that the top square is a pushout iff the back squares are pullbacks. C A

B D C B

A D

Definition 2.3. An example of an adhesive category which we will use in examples is the category HG raph. For a set A we will write A ∗ for the set of finite sequences of elements from A and for a function f : A → B we write f ∗ : A ∗ → B ∗ for the function that applies f to each element of a sequence. Let Λ be a fixed set of labels with an arity function ar: Λ → N0 . A hypergraph is a four-tuple G = ( V G , E G , att G , labG ) where V G is a set (the set of vertices), E G is a set (the set of edges), att G : E G → V G∗ and labG : E G → Λ are functions, where for each edge e ∈ E G it holds that ar (labG (e )) = |att G (e )|. We only consider finite graphs where the set of vertices and the set of edges are finite. A graph morphism is a pair of functions ( f V , f E ): G → H , f V : V G → V H , f E : E G → E H where lab H ◦ f E = labG and att H ◦ f E = f V∗ ◦ att G .

74

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

We say a graph D is discrete if E D = ∅, i.e. if it only contains vertices. The category HG raph has hypergraphs as objects and graph morphisms as arrows. We are particularly interested in subobject lattices in adhesive categories, defined as in [14,1]. Definition 2.4 (Subobject). Let U be an object in C . Two monomorphisms a: A  U and b: B  U are called isomorphic if there is an isomorphism ψ : A → B such that ψ ; b = a. A subobject of U is an isomorphism class of monomorphisms into U . It is denoted by [a: A  U ] or [a] where a: A  U is any representative. For a fixed object U we consider the category Sub(U ) that has subobjects of U as objects and that has an arrow between two objects [a], [b] if there exists an arrow c in C such that c ; b = a. In this case we write [a] ⊆ [b]. An important result due to [14] is that each Sub(U ) with partial order ⊆ forms a distributive lattice, the so-called subobject lattice of U . The meet of two subobjects [a], [b] is realized by taking their pullback in C (see diagram on the left below) and is denoted by [a] ∩ [b]. A join [a] ∪ [b] is obtained by taking the pushout over their meet (see diagram on the right below). A

A

a

a

A∩B

a∩b

U

A∩B

a∪b

A∪B

U

b

B

B

b

Distributivity means that ([a] ∪ [b]) ∩ [c ] = ([a] ∩ [c ]) ∪ ([b] ∩ [c ]) and ([a] ∩ [b]) ∪ [c ] = ([a] ∪ [c ]) ∩ ([b] ∪ [c ]) for subobjects

[a], [b], [c ].

In order to structure our results more clearly, we define: Definition 2.5. Let d: A → B ← C be a cospan consisting of monos and U be an object with monos a: A → U , b: B → U , c: C → U , such that a, b, c commute with the two legs of the cospan. That is, [a], [b], [c ] are subobjects of U and [a] ⊆ [b], [c ] ⊆ [b]. In the following we write [ d]U = ([a], [b], [c ]) in order to represent a cospan (over U ) as a triple of subobjects. When U is clear from the context, we omit it and just write [ d] = ([a], [b], [c ]). Union and intersection of such triples are defined pairwise. 2.3. Recognizability Courcelle [6,7] introduced the notion of recognizability of graph languages as an analogon to regularity of word languages. This notion can be extended to arbitrary categories C that are locally small, i.e. for every two objects the class of arrows between them is a set. Nondeterministic finite automata are commonly used to investigate properties of regular languages. An analogous notion for recognizable sets of arrows in a category is given by an automaton functor [4]. The intuition behind such automaton functors is the following: a transition function δ : Z × Σ → P ( Z ) (where Z is the set of states and Σ is an alphabet) for non-deterministic word automata can be extended to a transition function δˆ : Z × Σ ∗ → P ( Z ) on words. With some currying and rearranging we can view δˆ as a function that maps a word from Σ ∗ to a relation ˆ ε ) = id Z , where id Z is the identity relation on Z , and δ( ˆ w 1 w 2 ) = δ( ˆ w 1 )δ( ˆ w2) on Z and which furthermore satisfies δ( (functoriality conditions). Note that Σ ∗ can be seen as a category with one object and an arrow for every word, where ε (the empty word) is the identity. Similarly we define an automaton as a mapping from cospans of graphs to relations on states, such that the functoriality conditions above hold. Alternatively one could define automata only on so-called atomic cospans as in [2], but – different from word languages – it is necessary to require that a cospan generates the same relation, independently of its decomposition, which is already implicit in the functoriality conditions. Definition 2.6 (Automaton, semi-automaton). Let C be any category and let J and K be objects of C . A ( J , K )-semi-automaton (for C ) is a tuple A = ( A , I , F ), where A is a semi-functor A: C → Rel that maps each object X of C to a finite set, and I ⊆ A ( J ) and F ⊆ A ( K ). The semi-functor A will be called the automaton semi-functor of the semi-automaton A. Additionally, for each object X of C , A ( X ) is the state set of X , and I and F are the initial and final states, respectively.

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

75

A semi-automaton A is an automaton if its automaton semi-functor is in fact a functor. We often identify a (semi-)automaton with its automaton (semi-)functor. In particular, we will often write A( X ) instead of A ( X ). An automaton A is deterministic whenever I is a singleton and for each C -arrow f : X → Y , the relation A ( f ) is a function, i.e. for each x ∈ A ( X ) there exists exactly one y such that x A ( f ) y. The language L (A) of A, containing C -arrows from J to K , is defined as follows:

   L (A) = f : J → K  there exist q ∈ I , q ∈ F such that q A ( f ) q . A language L J , K of arrows from J to K is recognizable in C if it is the language of some ( J , K )-automaton. In the following section we will study the languages of semi-automata and show under which conditions they are recognizable. Many properties of regular languages can be shown for recognizable languages in a straightforward way; see [4] for proofs. Proposition 2.7. For every automaton, there exists an equivalent deterministic automaton. Proof (Sketch). The construction is more or less equivalent to the case of finite automata: we replace every set of states by its powerset. 2 Proposition 2.8. Suppose we have two recognizable languages of arrows, L 1J , K and L 2J , K . Then also L 1J , K ∩ L 2J , K , L 1J , K ∪ L 2J , K and

( L 1J , K )C (the complement of L 1J , K ) are recognizable. Proof (Sketch). Again the construction resembles the case of finite automata: For union and intersection, we take the cross product of two deterministic automata and define the final states accordingly. For the complement we exchange final and non-final states (i.e. states that are not in F ) of a deterministic automaton. 2 For our examples we will often use graph (semi-)automata which we will define in the following Definition 2.9. A graph (semi-)automaton is a (semi-)automaton functor over the category Cospan(HG raph). A graph language L, i.e. a set of cospans of monomorphisms in HG raph, is recognizable if there exists a graph automaton accepting L. The following example illustrates our automaton model based on a well-known problem for the case of graphs. The example is taken from [4] and we will discuss a variation in the next section. Example 2.10 (k-Colourability). We set Nk = {0, . . . , k − 1}. Let G be a graph. A k-colouring of G is a function f : V G → Nk such that for all e ∈ E G and for all v 1 , v 2 ∈ att G (e ) it holds that f ( v 1 ) = f ( v 2 ) if v 1 = v 2 . We show that the language of all k-colourable graphs is recognizable, by considering the following automaton functor A: Cospan(HG raph) → Rel:

• Every graph J is mapped to A( J ), the set of all k-colourings of J :

A( J ) = { f : V J → Nk | f is a k-colouring of J }. • For a cospan c: J → G ← K the relation A(c ) relates two colourings f J , f K , whenever there exists a k-colouring f for G such that f (c L ( v )) = f J ( v ) for every node v ∈ V J and f (c R ( v )) = f K ( v ) for every node v ∈ V K . That is, f coincides with f J , f K on the interfaces. Specifically we have that A(∅) = {∅} where ∅ is the empty colouring. Then in order to accept all k-colourable graphs with empty interfaces we take I A = F A = {∅}: a graph ∅ → G ← ∅ is accepted whenever the two empty mappings are related. 3. Languages of semi-automata are recognizable In this section we show that for each semi-automaton (which satisfies some additional conditions) there exists an automaton which accepts the same language. In other words, we show that automaton semi-functors accept recognizable languages and hence give further evidence of the robustness of the notion of recognizability. Given an automaton A = ( A , I , F ), the idea is to ensure that for each object X , A (id X ) is the identity relation, which is not guaranteed for semi-functors. Pairs of states x, y ∈ A ( X ) for which x A (id X ) y can be seen as having an “epsilon transition” (as in NFAs), i.e., a state change where no input is read. Hence the following procedure is similar, although not identical, to epsilon elimination. The relation will also be elaborated in Example 3.19. Note also that A (id X ) is always transitive since A is a semi-functor and hence A (id X ) = A (id X ) ; A (id X ).

76

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

Fig. 1. Flow-chart for constructing automaton functors from semi-functors.

The heart of the proof is formed by the so-called identity factorization (Definition 3.7): pairs of states which can be reached from each other by identity transitions are identified. If it is the case that A (id X ) is an equivalence relation for each object X , the identity factorization already yields an equivalent automaton. Otherwise, we first make sure that A (id X ) is reflexive for all objects X , thus ensuring that A (id X ) is a quasi-order. Then, either A (id X ) is an equivalence relation, in which case we apply identity factorization to obtain an automaton functor, or we use the identity factorization to make A (id X ) anti-symmetric for all X , and then transform such a semi-automaton to an automaton. The entire procedure is sketched in Fig. 1. For the transition which is marked with a ∗, we need the category under consideration to satisfy the following condition: if f ; g is an isomorphism, then f and g are also isomorphisms. At the end of the section we show that cospan categories (with cospans with monomorphism legs) of adhesive categories satisfy this condition. To demonstrate that semi-automata are a useful automaton model, we will investigate a way of modelling automata for k-colourability in a slightly different way compared to Example 2.10. Example 3.1. We use the notation of Example 2.10. We want to model a semi-automaton that accepts exactly the k-colourable graphs with empty interfaces. Our intuition is again to read the graph piece by piece, each piece, i.e. cospan, being glued together via the interfaces. However, we do not want to be forced to choose a colour for each node right away. Instead, we want to be able to decide on the colour of a node later on. We will therefore consider partial colourings of graphs. A partial k-colouring of a graph G is a partial function f : V G →  Nk such that for all e ∈ E G and for all v 1 , v 2 ∈ Att G (e ) it holds that f ( v 1 ) = f ( v 2 ) if v 1 = v 2 and f ( v 1 ), f ( v 2 ) are defined. We now define the semi-automaton functor A : Cospan(HG raph) → Rel according to:

• Every graph J is mapped to A( J ), the set of all partial k-colourings of J :

A( J ) = { f : V J →  Nk | f is a valid partial k-colouring of J }. • For a cospan c: J → G ← K the relation A(c ) relates two partial colourings f J , f K , whenever there exists a partial colouring f for G such that f (c L ( v )) = f J ( v ) for every node v ∈ V J where f J ( v ) is defined, f (c R ( v )) = f K ( v ) for every node v ∈ V K where f K ( v ) is defined and whenever there are nodes v ∈ V J , v  ∈ V K such that c L ( v ) = c R ( v  ) and f J ( v ) is defined, then f K ( v  ) is defined as well (and, naturally, f K ( v  ) = f J ( v )). Moreover we demand that for each edge e in G and all nodes v in att G (e ) such that there is a preimage v  so that c R ( v  ) = v, f K ( v  ) is defined. So whenever we read an edge, the adjacent nodes need to have a defined colour afterwards. This is necessary to check that we obtain a valid colouring in the end.

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

77

Then in order to accept all k-colourable graphs with empty interfaces we take I A = F A = {∅}. A is a semi-automaton, but it is not an automaton. Whenever we have a state s, i.e. a partial function from a set of vertices V to N, that is not defined for all vertices v ∈ V , there exists an identity-transition from s to any other state s where s (x) = s(x) whenever s is defined on x. We can transform this semi-automaton into an automaton though. Following the procedure sketched in Fig. 1, we will demonstrate how we can obtain an equivalent automaton functor. We first note that A(id X ) is reflexive. However, it is not symmetric, because if we can go from s to s and s = s via an identity arrow, we cannot go back, because we have decided on another node’s colour. So we need to check if A(id X ) is anti-symmetric. If there is an identity arrow from a state s to a state s and an identity arrow from s to s, s and s need to be the same partial colouring: Wherever s is defined, s must be defined and both must be equal and vice versa, which means that A(id X ) is anti-symmetric. We can thus obtain an equivalent automaton functor A ↔ by deleting all iso-transitions from a state s to a state s such that s is defined on more nodes than s. Intuitively, in A↔ , we need to decide on the colour of a vertex only when we read an incident edge. Throughout this section we will perform the procedure step-wise on a small running example. Example 3.2. For a running example, we investigate a rather simple and minimalistic semi-automaton that accepts just the identity for the discrete graph with one node. We call this arrow id and represent the semi-automaton graphically as seen below. id id

D

B

id

id

id

A id

id id

id

C id

The set of states (for the one-node graph) is { A , B , C , D }, where D is the only initial state and A is the only final state. Note that, in principle, we also need to have sets of states for every isomorphic copy of the one-node graph. However, these sets of states and their transitions are isomorphic and hence we omit them. It is easy to check that this is indeed a semi-automaton, because the identity arrow can only be decomposed into arrows isomorphic to the identity arrow. Hence we only have to check that whenever in the graph there is a path from one state x to another state y, there also is a transition from x to y. 3.1. Reflexivization We define the following procedure, which removes one non-reflexive state: Definition 3.3. Let A = ( A , I , F ) be a ( J , K )-semi-automaton for a category C , with its automaton semi-functor A: C → Rel. Furthermore, let X be an object of C and a ∈ A ( X ) a state such that (a, a) ∈ / A (id X ). We define the automaton semi-functor A(a) = ( A (a) , I (a) , F (a) ) as follows:

• For objects Y we have



A (a) (Y ) =

A (Y ) if X = Y A ( X ) \ {a} if X = Y

• For arrows f : Y → Z we have A (a) ( f ) = A ( f ) ∩ ( A (a) (Y ) × A (a) ( Z )). • Initial states: If X = J and a ∈ I , then I (a) = ( I \ {a}) ∪ {b | a A (id X ) b}. Otherwise I (a) = I . • Final states: If X = K and a ∈ F , then F (a) = ( F \ {a}) ∪ {b | b F (id X ) a}. Otherwise F (a) = F . Example 3.4. We perform the step described in Definition 3.3 on the automaton from Example 3.2. We have to delete the state D and make every state reachable from D initial, since D was initial. Since D was not final, we need not change the set of final states.

78

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98 id

id id

B

A id

id

id

C id

Lemma 3.5. Let A = ( A , I , F ) be a ( J , K )-semi-automaton for a category C . Furthermore, let X be an object of C and a ∈ A ( X ) a state such that (a, a) ∈ / A (id X ). Then A(a) is a semi-automaton which accepts the same language as A. Proof. Let A(a) = ( A (a) , I (a) , F (a) ). First we show that A (a) is a semi-functor, i.e. that A (a) ( f ; g ) = A (a) ( f ) ; A (a) ( g ) (for composable arrows f and g):

• “⊆”: Suppose x A (a) ( f ; g ) z. Then x A ( f ; g ) z and thus there is state y such that x A ( f ) y and y A ( g ) z. If cod( f ) = X or y = a we also have x A (a) ( f ) y and y A (a) ( g ) z. Otherwise, since g = id X ; g, there is a state y  such that a A (id X ) y  and y  A ( g ) z. Because (a, a) ∈ / A (id X ) by assumption, y  = a. Therefore, x A ( f ; id X ) y  and thus x A ( f ) y  also holds. So we have x A (a) ( f ) y  and y  A (a) ( g ) z. • “⊇”: Suppose x A (a) ( f ) y and y A (a) ( g ) z. Then it holds by definition that x A ( f ) y and y A ( g ) z. Thus, x A ( f ; g ) z, which yields x A (a) ( f ; g ) z. So A (a) is indeed an automaton semi-functor. It remains to show that L (A) = L (A(a) ). Let f : J → K be a C -arrow.

• “⊆”: Suppose f ∈ L (A). Then there exist an initial state x ∈ I and a final state y ∈ F such that x A ( f ) y. If ( J = X or x = a) and (K = X or a = y), x A (a) ( f ) y, x is still initial, and y is still final. If J = X and x = a, we have x A (id X ; f ) y (because f = id X ; f ). Then there exists a state x = x such that x A (id X ) x and x A ( f ) y, which yields x A (a) ( f ) y. Furthermore, as x A (id X ) x and x = a ∈ I , x ∈ I (a) . If J = X or x = a, we take x = x (then, also, x ∈ I (a) ). Symmetrically, we find a state y  ∈ F (a) such that x A (a) ( f ) y  . This shows that f ∈ L (A(a) ). • “⊇”: Suppose f ∈ L (A(a) ). Then there is an x ∈ I (a) and a y ∈ F (a) such that x A (a) ( f ) y and thus x A ( f ) y. We find states x ∈ I and y  ∈ F such that x A ( f ) y  as follows. If x ∈ I , then we take x = x. Otherwise, it must be the case that a ∈ I and a id X x, and we take x = a. Symmetrically, we take y  = y (if y ∈ F ) or y  = a (otherwise). This proves that f ∈ L (A). 2 Due to the local finiteness of automaton semi-functors, removing a single non-reflexive state can be generalized to infinitely many such states a, so that we obtain the following result: Proposition 3.6. Let A = ( A , I , F ) be a semi-automaton for a category C . There exists a semi-automaton A= = ( A = , I = , F = ) such that L (A= ) = L (A) and A = (id X ) is a reflexive relation for all objects X of C . Proof. Since, by definition, A ( X ) is a finite state set for all C -objects X , we can generalize the procedure of Definition 3.3 to remove all non-reflexive states. This yields an equivalent automaton semi-functor by Lemma 3.5. 2 3.2. Identity factorization The identity factorization identifies pairs of states which can be reached from each other by identity transitions. For a semi-automaton A with automaton semi-functor A such that A (id X ) is an equivalence relation for each X , this yields an automaton; on the other hand, if A (id X ) is only reflexive for each X , it yields a semi-automaton in which all relations are anti-symmetric, i.e., A (id X ) is a partial order, which can subsequently be transformed into an automaton. Definition 3.7. Let A = ( A , I , F ) be a ( J , K )-semi-automaton for a category C , such that A (id X ) is reflexive for all objects X of C . Then we define the identity factorization of A as Aid = ( A id , I id , F id ) as:

• A id ( X ) = {[a] X | a ∈ A ( X )} for each object X of C , where [a] X = {b ∈ A ( X ) | a A (id X ) b and b A (id X ) a}. • A id ( f ) = {([a] X , [b]Y ) | there are a ∈ [a] X , b ∈ [b]Y such that a A ( f ) b )} for each arrow f : X → Y of C .

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

79

• I id = {[a] J | a ∈ I }. • F id = {[a] K | a ∈ F }. It is easy to see that, because of the second functoriality property, the sets [a] X as used in the above definition form partitions of the respective state sets. Example 3.8. We continue our computation on the running Example 3.2 and perform the step described in Definition 3.3 on the automaton from Example 3.4. Since there is an identity arrow from A to C and vice-versa, we have to merge these two states, resulting in A. Since both states are initial, the resulting state is initial and since A is final, the resulting state is final. State B remains unchanged, because there is no other state to that B is related symmetrically by the identity. id

id id

A

B

Proposition 3.9. The identity factorization Aid of a semi-automaton A is a semi-automaton. Proof. Let A = ( A , I , F ) and Aid = ( A id , I id , F id ). We have to show that A id is a semi-functor, i.e. A id ( f ; g ) = A id ( f ) ; A id ( g ) for all arrows f : X → Y and g: Y → Z .

• “⊆”: Suppose that [a] X A id ( f ; g ) [b] Z . By definition, there exist a ∈ [a] X and b ∈ [b] Z such that a A ( f ; g ) b . Because A is a semi-functor, a A ( f ) ; A ( g ) b , and thus there exists a c ∈ A (Y ) such that a A ( f ) c and c A ( g ) b . By definition, [a] X A id ( f ) [c ]Y and [c ]Y A id ( g ) [b] Z , and therefore [a] X A id ( f ) ; A id ( g ) [b] Z . • “⊇”: Suppose [a] X A id ( f ) ; A id ( g ) [c ] Z . This means that there exists a [b]Y such that [a] X A id ( f ) [b]Y and [b]Y A id [c ] Z . By definition, there exist a ∈ [a] X and b ∈ [b]Y such that a A ( f ) b , and there exist b ∈ [b]Y and c  ∈ [c ] Z such that b A ( g ) c  . Since both b ∈ [b]Y and b ∈ [b]Y , it is the case that b A (idY ) b . Because f = f ; idY and A is a semi-functor, a A ( f ) b . From a A ( f ) b and b A ( g ) c  it follows that a A ( f ) ; A ( g ) c  and thus that a A ( f ; g ) c  . By definition, this means that [a] X A id ( f ; g ) [c ] Z . 2 Proposition 3.10. Let A be a ( J , K )-semi-automaton for a category C such that A (id X ) is reflexive for all objects X of C and let Aid be its identity factorization. Then L (A) = L (Aid ). Proof. Let A = ( A , I , F ) and Aid = ( A id , I id , F id ). Let f : J → K be a C -arrow.

• “⊆”: Suppose f ∈ L (A). Then there is an a ∈ I and a b ∈ F , such that a A ( f ) b. This means, by definition, that, [a] J A id ( f ) [b] K , [a] J ∈ I id , and [b] K ∈ F id . Hence, f ∈ L (Aid ). • “⊇”: Suppose f ∈ L ( A id ). Then there exists an [a] J ∈ I id and a [b] K ∈ F id such that [a] J A ( f ) [b] K . Since [a] J ∈ I id there exists an a ∈ [a] J such that a ∈ I ; since [b] K ∈ F id , there exists a b ∈ [b] K such that b ∈ F ; finally, since [a] J A ( f ) [b] K there exists a ∈ [a] J and b ∈ [b] K such that a A ( f ) b . Because a ∈ [a] J and a ∈ [a] J , it must hold that a A (id J ) a . Similarly, it must hold that b A (id K ) b . Since A is a semi-functor and f = id J ; f ; id K , it follows that a A ( f ) b , and thus f ∈ L (A). 2 Proposition 3.11. Let A = ( A , I , F ) be an automaton for a category C such that A (id X ) is reflexive for all objects X of C and let Aid = ( A id , I id , F id ) its identity-factorization. For each object X of C it holds that: (i) if A (id X ) is reflexive, then A id (id X ) is reflexive and anti-symmetric; (ii) if A (id X ) is an equivalence relation, then A id (id X ) = {(a, a) | a ∈ A id ( X )}, i.e., A id (id X ) is the identity relation. Proof. (i) Reflexivity of A id follows directly from the fact that A is reflexive. To prove anti-symmetry, suppose [a] X A id (id X ) [b] X and [b] X A id (id X ) [a] X . Then there exist a ∈ [a] X and b ∈ [b] X such that a A (id X ) b and there exist a ∈ [a] X and a b ∈ [b] X such that b A (id X ) a . Since b ∈ [b] X and b ∈ [b] X , it is the case that b A (id X ) b . So b A (id X ) b A (id X ) a A (id X ) a and thus b A (id X ) a , i.e. [a] X = [b] X , as required. (ii) The ⊇-direction follows directly from Part (i) (reflexivity). For the ⊆-direction, suppose [a] X A id (id X ) [b] X . Then there exist a ∈ [a] X and b ∈ [b] X such that a A (id X ) b , so b ∈ [a ] X = [a] X and thus [b] X = [a] X because A (id X ) is an equivalence relation. 2

80

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

Corollary 3.12. Let A = ( A , I , F ) be a semi-automaton for a category C and Aid = ( A id , I id , F id ) its identity factorization. If A (id X ) is an equivalence relation for each object X of C , then Aid is an automaton. Proof. The fact that Aid is a semi-automaton was shown in Proposition 3.9, which means that A id is a semi-functor. The fact that A id is also a functor follows from Proposition 3.11 (ii). 2 3.3. Building an automaton Until now, the construction is only guaranteed to produce semi-automata where all identity transitions are antisymmetric (although in some cases the result already is an automaton, see Corollary 3.12). In this subsection we present a construction which converts such semi-automata to automata. For this, we need the following condition on the category under consideration. Definition 3.13 (Isomorphism-atomic). A category C is isomorphism-atomic, if it is the case that, for all arrows f : X → Y and g: Y → Z , f ; g is an isomorphism only if f and g are both isomorphisms. Obviously, in an isomorphic-atomic category, it holds for composable arrows f and g that f ; g is an isomorphism if and only if f and g are isomorphisms. Now we can define the construction: Definition 3.14. Let C be an isomorphism-atomic category and A = ( A , I , F ) a semi-automaton for C such that A (id X ) is reflexive and anti-symmetric for all objects X of C . The symmetrization of A, written A↔ = ( A ↔ , I ↔ , F ↔ ), is defined as follows:

• • • •

A ↔ ( X ) = A ( X ) for all C -objects X ; A ↔ ( f ) = A ( f ) for all arrows f which are not isomorphisms and A ↔ ( f ) = A ( f ) ∩ A ( f −1 )−1 if f is an isomorphism; I ↔ = I ∪ {x | ∃ y ∈ I : y A (id J ) x}; F ↔ = F ∪ {x | ∃ y ∈ F : x A (id K ) y }.

Example 3.15. We conclude our computation on the running Example 3.2 and perform the step described in Definition 3.14 on the automaton from Example 3.8. The identity-arrow from A to B gets deleted in this step, because there is no identity arrow from B to A – which would be the inverse of the identity-arrow. B remains initial, however, it does not turn into a final state, since there were no arrows from B to a final state. State A stays unchanged. id

id

A

B

Now we show that the symmetrization of a semi-automaton which satisfies certain constraints is an automaton (Proposition 3.16) and that it accepts the same language (Proposition 3.17). Proposition 3.16. Let C be an isomorphism-atomic category and A = ( A , I , F ) a ( J , K )-semi-automaton for C such that A (id X ) is reflexive and anti-symmetric for all objects X of C . Then A↔ is a ( J , K )-automaton. Proof. The proof boils down to proving that A ↔ is a functor. First, we have to show that A ↔ (id X ) = id A ↔ ( X ) . The ⊇-direction is clear because A (id X ) is reflexive. For the ⊆-direction, we observe that a A ↔ (id X ) b implies a A (id X ) b, b A (id X ) a and hence a = b due to the fact that A (id X ) is anti-symmetric. Second, we have to show that, for arrows f : X → Y and g: Y → Z , it holds that A ↔ ( f ; g ) = A ↔ ( f ) ; A ↔ ( g ).

• “⊇”: Assume x A ↔ ( f ) y and y A ↔ ( g ) z.

If both f and g are isomorphisms, it follows from x A ↔ ( f ) y that x A ( f ) y and y A ( f −1 ) x and from y A ↔ ( g ) z that y A ( g ) z and z A ( g −1 ) y. Since A is a semi-functor, we have x A ( f ; g ) z and z A (( f ; g )−1 ) x, and thus x A ↔ ( f ; g ) z. If one of f and g is not an isomorphism, f ; g is not an isomorphism because C is isomorphism-atomic, and thus x A ↔ ( f ; g ) z follows from the fact that x A ( f ; g ) z, which in turn follows from the facts that x A ( f ) y, y A ( g ) z and A is a semi-functor. • “⊆”: Assume x A ↔ ( f ; g ) z. By definition, it must be the case that x A ( f ; g ) z. We want to show that there exists a y such that x A ↔ ( f ) y and y A ↔ ( g ) z. There are four cases:

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

81

– If neither f nor g are isomorphisms, then f ; g is not an isomorphism (because C is isomorphism-atomic). Therefore, A ↔ ( f ) = A ( f ), A ↔ ( g ) = A ( g ) and A ↔ ( f ; g ) = A ( f ; g ), and thus the required result directly follows from the semi-functoriality of A. – If both f and g (and thus f ; g) are isomorphisms, then there exists a y such that x A ( f ) y and y A ( f −1 ) x, because f ; f −1 = id X ; similarly, there is a y  with y  A ( g ) z and z A ( g −1 ) y  . We observe that f −1 ; f ; g ; g −1 = id X and we have y A ( f −1 ) x, x A ( f ; g ) z and z A ( g −1 ) y  and thus y A ( f −1 ; f ; g ; g −1 ) y  . Analogously we can show that y  A ( g ; g −1 ; f −1 ; f ) y and so we have y A (id X ) y  and y  A (id X ) y, so y = y  . Therefore, x A ↔ ( f ) y and y A ↔ ( g ) z. – If f is an isomorphism, but g is not, then we reason as follows. Because A (id X ) is reflexive and A is a semi-functor, there exists a y such that x A ( f ) y and y A ( f −1 ) x, and thus x A ↔ ( f ) y. Because A is a semi-functor and g = f −1 ; f ; g, we obtain y A ( g ) z, and thus we conclude that also y A ↔ ( g ) z. – If f is not an isomorphism, but g is, then we reason as follows. Since A (idY ) is reflexive and A is a semi-functor, there exists a y such that y A ( g ) z and z A ( g −1 ) y, and thus y A ↔ ( g ) z. Then we have x A ( f ; g ) z A ( g −1 ) y, and since f = f ; g ; g −1 this means x A ( f ) y. Because f is not an isomorphism, we conclude x A ↔ ( f ) y. 2

Proposition 3.17. Let C be an isomorphism-atomic category and A a ( J , K )-semi-automaton for C such that A (id X ) is reflexive and anti-symmetric for all objects X of C . Then L (A↔ ) = L (A).

Proof. Let A = ( A , I , F ) and A↔ = ( A ↔ , I ↔ , F ↔ ), and let f : J → K be a C -arrow.

• “⊆”: Suppose f ∈ L (A↔ ). Then there exist x ∈ I ↔ and y ∈ F ↔ such that x A ↔ ( f ) y. Then also x A ( f ) y. Now, either x ∈ I , or there exists an x ∈ I such that x A (id J ) x. Similarly, either y ∈ F or there exists a y  ∈ F such that y A (id K ) y  . Therefore, f ∈ L (A). • “⊇”: Suppose f ∈ L (A). Then there exists an x ∈ I and a y ∈ F such that x A ( f ) y. By definition, x ∈ I ↔ and y ∈ F ↔ . If f is not an isomorphism then x A ↔ ( f ) y, and so it follows directly that f ∈ L (A↔ ). If f is an isomorphism then we reason as follows. Since A (id K ) is reflexive and A is a semi-functor, there exists a z such that y A ( f −1 ) z and z A ( f ) y. From this it follows that x A ( f ) y A ( f −1 ) z and hence x A (id J ) z. Since x ∈ I it holds that z ∈ I ↔ . Then we conclude that f ∈ L (A↔ ). 2

The following theorem states the main result of this subsection.

Theorem 3.18. Let C be an isomorphism-atomic category and A = ( A , I , F ) a semi-automaton for C . Then there exists an automaton B for C such that L (A) = L (B).

Proof. By Proposition 3.6 A= is an equivalent semi-automaton such that all identity transitions are reflexive. Therefore, by Propositions 3.9, 3.10 and 3.11, (A= )id is an equivalent semi-automaton such that all identity transitions are reflexive and anti-symmetric. Finally, by Proposition 3.16 and Proposition 3.17, we can take B = ((A= )id )↔ to obtain an equivalent automaton. (It is not always necessary to perform all the conversions. See Fig. 1 for the possible shortcuts.) 2

We now demonstrate the constructions above with an example. Note that this example is not a graph automaton, but a non-deterministic automaton instead.

Example 3.19. We will now show the correspondence between our constructions and epsilon elimination for NFAs via an example. Note that we rely on the representation of a word automaton as a functor δˆ : Σ ∗ → Rel, as explained before Definition 2.6. Instead of a functor we could have a semi-functor, where ε (the empty word) is not necessarily mapped to the identity relation, yielding ε -transitions. Note however, that in NFAs it is usually assumed (different from our setting) ˆ ε ) contains the identity relation. This is not necessarily the case here. that each state always has an ε -loop, i.e., that δ( We will demonstrate the procedure for an automaton accepting the language represented by the regular expression (ab)∗ . In the diagrams we only represent transitions for single letters and ε -transitions. “Shortcuts” arising from concatenation of words, e.g. the epsilon-transition from A to D or any transition for words of length ≥ 2 are omitted, although, by definition, they must be present, differently from ε -automata. We consider the following NFA:

82

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

ε ε

A

ε

ε ε

ε

B

C a

b

ε a

b

a

a

D

b

E ε

We observe that there is a non-reflexive state, A, and apply Proposition 3.6 to remove it. The set of initial states needs to be updated, because B, C and D were reachable via an ε -transition from a removed initial state, thus becoming the new initial states. Now we observe that our semi-automaton is not anti-symmetric, so we need to apply the identity factorization (Definition 3.7) and thus have to merge C and D: Now we can obtain an automaton by applying Theorem 3.18, in which all arrows for isomorphisms that do not have inverse arrows are deleted. Since isomorphisms in our case are just the identities (i.e. epsilon), we delete a single transition and obtain the following automaton: ε

ε

B

CD a

b

b

a

E ε

3.4. Adhesive categories and isomorphism-atomicity We now show that categories of cospans of monos in adhesive categories are isomorphism-atomic. This allows us to apply the result of Theorem 3.18. We will use the following result due to [14]: Lemma 3.20. (See [14].) Pushouts along monomorphisms are also pullbacks in an adhesive category. Furthermore, we will use the following lemma: Lemma 3.21. Let i: X → Y be an isomorphism, a: X → A be an arrow and a : A → Y be a mono such that a ; a = i. Then a and a are isomorphisms. Proof. We first show the statement of the lemma for the case when i is the identity id X . We have a ; a = id X ⇒ a ; a ; a = a and thus, since a is a monomorphism, id A = a ; a. So a is the inverse of a and thus a and a are isomorphisms. Now we have that a ; a = i which implies a ; a ; i −1 = id X . Hence we know that a ; i −1 and a are isomorphisms because  a ; i −1 is a monomorphism. Then, a ; i −1 ; i = a is an isomorphism as well. 2 a

c

Lemma 3.22. In the category of cospans of monos in an adhesive category C , an arrow f : A − →B← − C is an isomorphism if and only if a and c are isomorphisms in C . Proof. The “if” direction is trivial. For the “only if” direction, we reason as follows: c

a

Since f is an isomorphism, there exists a cospan f −1 : C −→ D ←− A such that f ; f −1 = id A . The situation is depicted in the following commuting diagram: C c

A

a

(PO)

B b id A

c a

D d

A

id A

A

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

83

The arrow b is a mono because c  is a mono, and therefore it follows from Lemma 3.21 that a and b are isomorphisms. By a symmetrical argument, a and d are isomorphisms. The square above is a pushout and therefore, by Lemma 3.20, it is a pullback. A pullback of two isomorphisms always results in two isomorphisms and hence c , c  are isomorphisms. 2 Theorem 3.23. Let C be an adhesive category. The category of cospans of monos in C is isomorphism-atomic. f1

f2

g1

g2

Proof. Let f = A −−→ B ←−− C and g = C −−→ D ←−− E be two (representatives of) arrows in Cospan(C ) such that h = f ; g is an isomorphism. The situation is depicted in the following commuting diagram.

f2 f1

A

C (PO)

B p1 a

g1 g2

D

E

p2

F

e

By Lemma 3.22 the arrows a and e are isomorphisms. Since f 2 and g 1 are monos, p 1 and p 2 are monos, and therefore it follows from Lemma 3.21 that f 1 , p 1 , g 2 and p 2 are isomorphisms. Finally f 2 , g 1 are isomorphisms since they arise by pulling back isomorphisms (see the analogous argument in the proof of Lemma 3.22). 2 So now we can conclude: Corollary 3.24. For any semi-automaton for a category of cospans of monos in an adhesive category, we can always find an equivalent automaton accepting the same language. Proof. Directly by Theorems 3.18 and 3.23.

2

4. Closure under concatenation From now on we consider as the category, over which to define automaton functors, the category of cospans of monos of a given adhesive category. We want to prove that recognizable languages over such cospans are closed under concatenation. Let L X , K be a language of cospans from X to K and L K ,Y a language of cospans from K to Y . Then

L X , K ; L K ,Y = {c 1 ; c 2 | c 1 ∈ L X , K , c 2 ∈ L K ,Y }, consists of all possible compositions of cospans of the two languages. For regular (word) languages this closure property is well known and given two (non-deterministic finite) automata A1 and A2 it is easy to construct an automaton that accepts exactly L (A1 ) ; L (A2 ) by first simulating A1 for the first part of a given word, non-deterministically switching to A2 and then simulating A2 for the rest of the word. For recognizable languages on adhesive categories we cannot retain this easy construction because an automaton functor has to behave identically on different decompositions of the same cospan, as we will show in the following example. For clarity we will assume that all interfaces are discrete. 4.1. Constructing the concatenation semi-automaton We will start with a motivating example. Example 4.1. We concatenate two graph automata over the label set Λ = {a, b}. Automaton A1 accepts a cospan J → G ← K in the category HG raph if it has discrete interfaces of size one of the category HG raph and the graph G contains one vertex, an even number of a-labelled edges and no b-labelled edges. On the other hand, automaton A2 accepts a cospan J → G ← K in HG raph if it has discrete interfaces of size one and G contains one vertex, an even number of b-labelled edges and no a-labelled edges. We only specify A1 , A2 is constructed analogously. The number of a-labelled edges in a graph G will be denoted by #a (G ). The automaton A1 is a 3-tuple A1 = ( A 1 , I 1 , F 1 ), where, for each graph G,



A 1 (G ) =

{0, 1} if | V G | = 1 and | E G | = 0 ∅ otherwise,

and, for each cospan f = ( J → G ← K ) with discrete interfaces of size one,

84

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

 A1( f ) =

{(0, 0), (1, 1)} if #a (G ) ≡ 0 (mod 2), #b (G ) = 0 and | V G | = 1 {(0, 1), (1, 0)} if #a (G ) ≡ 1 (mod 2), #b (G ) = 0 and | V G | = 1 ∅ otherwise

For all other cospans f , A ( f ) = ∅. Furthermore, we take I 1 = {0} and F 1 = {0}. To show why we have to use a more complex construction than in the case of regular word languages, consider the following cospan:

c=

a

b

a

b

This cospan is in the concatenation language because it can be decomposed into a

b

;

a

b

and the left cospan is clearly accepted by A1 , while the right cospan is accepted by A2 . However, it is not possible in general to run the two automata one after the other. The same cospan c can also be decomposed in a

b

;

. a

b

Because of the functoriality condition, the concatenation automaton must also accept the cospan in this case, although none of the two constituents cospans are accepted by A1 or A2 . In general, A1 and A2 have to be run in parallel. We will give a constructive proof that recognizable languages on adhesive categories are closed under concatenation. We will start by defining a concatenation automaton. Definition 4.2 (Concatenation semi-automaton). Let C be a category of cospans of monos in an adhesive category. Let A1 = ( A 1 , I 1 , F 1 ) be an ( X , K )-automaton for C and let A2 = ( A 2 , I 2 , F 2 ) be a ( K , Y )-automaton for C . We define the concatenation automaton A = ( A , I , F ) of A1 and A2 as follows.

• A assigns to each object J a set of states A( J ), where each state consists of four monos as shown below, and two states z1 ∈ A1 (U 1 ), z2 ∈ A2 (U 2 ).1 K ϕK

U 1  z1 

s1

U

s2

U 2  z2 

ϕJ

J In addition we require that [s1 ] ∪ [s2 ] = [idU ] as well as [ϕ J ] ∪ [ϕ K ] = [idU ] and that [ϕ J ] ∩ [ϕ K ] = [s1 ] ∩ [s2 ]. We want to simulate A1 and A2 , so in such a state, U represents the union of the current interface J of A and the final interface K of A1 . Furthermore U 1 represents the current interface of A1 and U 2 represents the current interface of A2 .2 • A state z as above is initial (z ∈ I ) if z1 ∈ I 1 , z2 ∈ I 2 , s1 = ϕ J and s2 = ϕ K . • Analogously, a state z as above is final (z ∈ F ) if z1 ∈ F 1 , z2 ∈ F 2 , s1 = ϕ K and s2 = ϕ J . • On cospans the functor A is defined as follows: let c: J  J  J  be a cospan consisting of two monos. Two states (s1 , s2 , ϕ J , ϕ K , z1 , z2 ), (s1 , s2 , ϕ J , ϕ K , z1 , z2 ) are related by A(c ) whenever there are cospans d1 , d2 and d (consisting of monos) and additional monos such that the diagram below commutes

1 2

We will indicate the two states within the categorical diagrams in angular brackets. In order to obtain a finite set of states we do not take all monos into U , but only one representative of every subobject equivalence class.

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

id K :

id K

K ϕK

d1 :

U 1  z1 

d:

U1 U

u

d2 :

U 2  z2 

J

U 1  z1 

s1

U

u

u2

U2

υ2

ϕJ

c:

ϕ K υ1

u1

U s2

K

k

υ1 s1

id K

K

85

U 2  z2 

υ2

ϕ J

j

ι

J

ι

s2

J

and the following five conditions are satisfied. 1. [d1 ] ∪ [d2 ] = [d]

K ] 2. [d1 ] ∩ [d2 ] = [ c ] ∩ [id

K ] = [d] 3. [ c ] ∪ [id 4. It holds that z1 A1 (d1 ) z1 and z2 A2 (d2 ) z2 . 5. [u 1 ] ∩ [k] = [u 1 ] ∩ [k] and [u 2 ] ∩ [k] = [u 2 ] ∩ [k] where u 2 = υ2 ; u 2 and u 1 = υ1 ; u 1 The meaning of the five objects in the diagram above describing a state can be intuitively explained with the diagram depicted in the introduction (Section 1): the objects K (the interface over which cospans are concatenated) and J (the current interface) are indicated in that picture and U would be the union of K and J , describing their overlap. If we intersect U with the cospan from X to K we obtain U 1 (fully contained in the first cospan) and if we intersect U with the cospan from K to Y we obtain U 2 (fully contained in the second cospan). Hence U 1 is the current interface for the first automaton, whereas U 2 is the current interface for the second automaton. Conditions 1–3 above use the notation of Definition 2.5 and extend the conditions a state has to satisfy (see explanation above) to transitions. The fourth condition means that d1 is a cospan that allows a transition from z1 to z1 in A1 and that d2 is a cospan that allows a transition from z2 to z2 in A2 . Therefore, this condition ensures that the concatenation automaton behaves like A1 on U 1 and like A2 on U 2 . The last condition is in place to guarantee that U 1 can only grow regarding K whereas U 2 can only diminish regarding K . Intuitively, we do not want A1 to forget a part of K it has already read and we do not want A2 to read a part of K anew that it has already forgotten. Before we continue with the proof for correctness of the definition of the concatenation semi-automaton, we give an example of the construction. Example 4.3. We reuse the automata of Example 4.1. We want to construct the concatenation semi-automaton A = ( A , I , F ) of A1 = ( A 1 , I 1 , F 1 ) and A2 = ( A 2 , I 2 , F 2 ). In the following we will use the names c, d, J , U , . . . as in Definition 4.2. We start by defining the states of the automaton: K = D 1 , the discrete graph with one node, because the outer interface of a cospan accepted by A1 always has one vertex. The equality J = U = U 1 = U 2 has to hold for all relevant states, i.e., for all states on a path between initial and final states. This must be the case since we are working with discrete interfaces and neither A1 nor A2 accept a graph that has more or less than one vertex. Furthermore we set ϕ J = ϕ K = s1 = s2 = idU . For each combination of z1 ∈ {0, 1} and z2 ∈ {0, 1} this forms a state. A state is initial whenever z0 = z1 = 0 and a state is final whenever z0 = z1 = 0. So now we can define the state transitions of A. For a given cospan c we have to define z1 , z1 , z2 , z2 as well as d1 , d, d2 . The arrow id K is already uniquely defined. Let c = d and c = c 1 ; c 2 then d1 = c 1 , d2 = c 2 , z1 A 1 (c 1 ) z1 , z2 A 2 (c 2 ) z2 yields a transition in A and all transitions in A arise in this way. Looking at the states, one can see that the concatenation automaton reads a given cospan in such a way that the a-edges are counted separately in z1 and the b-edges are counted in z2 (both modulo 2). The resulting automaton accepts exactly the cospans of hypergraphs that have discrete interfaces of size one, exactly one vertex and an even number of a-edges, as well as an even number of b-edges. Our construction is not guaranteed to produce automata (instead one obtains semi-automata). An example of a situation where we obtain only a semi-automaton is the following. Example 4.4. Consider the language that contains exactly one semi-abstract cospan, swap in HG raph, depicted below:

86

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

An automaton Ai accepting with L (Ai ) = {swap} can be defined as follows: On objects, i.e. graphs, it sends the discrete graph with two vertices to the set { X i , Y i } and every other object to the empty set. On arrows, i.e. cospans of monos of the category HG raph, it sends the identity arrow of the discrete graph with two vertices to the identity on { X i , Y i } and cospan swap to the relation {( X i , Y i ), (Y i , X i )}. Here X i is the only initial state and Y i is the only final state of Ai . Analogously to an NFA we can depict this automaton as follows: id

id swap

Xi

Yi swap

We take two instances of this automaton, A1 and A2 , and concatenate these automata. In our concatenation automaton, for all states J = K = U 1 = U 2 = U = D 2 , where D 2 is the discrete graph with two vertices. A state has the form (s1 , s2 , ϕ J , ϕ K , z1 , z2 ), where z1 ∈ { X 1 , Y 1 } and z2 ∈ { X 2 , Y 2 }. There are only two possible monos to choose for s1 , s2 , ϕ J and ϕ K . The arrows s1 , s2 and ϕ K can either be the same as ϕ J , then we will denote them by 1, or different, then we will denote them by 0. For clarity, we only describe the states that are reachable from an initial state and from which a final state can be reached. Then, there are eight states of interest:

A B C D

= = = =

(1, 0, 1, 0, X 1 , X 2 ) (0, 0, 1, 0, Y 1 , X 2 ) (1, 1, 1, 0, X 1 , Y 2 ) (0, 1, 1, 0, Y 1 , Y 2 )

E F G H

= = = =

(0, 1, 1, 1, X 1 , X 2 ) (1, 1, 1, 1, Y 1 , X 2 ) (0, 0, 1, 1, X 1 , Y 2 ) (1, 0, 1, 1, Y 1 , Y 2 )

The only initial state is A, the only final state is E. Moreover, there is an identity arrow between any pair of (not necessarily different) states of the set { A , B , C , D }, an identity arrow between any pair of (not necessarily different) states of the set { E , F , G , H } and a swap-arrow between any two states where one state is from the set { A , B , C , D } and one state is from the set { E , F , G , H }. This is due to the fact that the composition of two swap-transitions yields an identity transition. Hence this is only an automaton semi-functor because there are identity transitions between different states. Now we can apply the isomorphism-factorization to this automaton, and obtain the following automaton: id

id swap

{E, F , G, H }

{ A, B , C , D } swap

As expected, this automaton accepts exactly the identity-cospan for the discrete graph with two vertices. 4.2. Concatenation semi-automata are semi-automata In this subsection we prove that the concatenation semi-automaton is indeed a semi-automaton. Then one can use the results of Section 3 to obtain an automaton. Note that in the case of the concatenation automaton every relation associated with an identity arrow is an equivalence, hence it is possible to apply isomorphism-factorization directly and immediately obtain a functor (cf. the flow-chart in Fig. 1). Showing that the concatenation automaton is a semi-functor boils down to proving that the defined map respects composition, i.e. that, for a concatenation semi-automaton A = ( A , I , F ) and composable cospans c 1 and c 2 , it holds that A (c 1 ; c 2 ) = A (c 1 ) ; A (c 2 ). This result is split into two parts: Proposition 4.6 corresponds to the “⊇” case, and Proposition 4.8 to the “⊆” case. Unfortunately the proofs are somewhat lengthy, although straightforward since we extensively use the fact that the subobjects form a distributive lattice. We tried to transform proofs into a more compact form, but this appears to be difficult. For the remainder of this section we assume that C is an adhesive category and that each cospan is a cospan consisting of monos in this category. Lemma 4.5. Let a: A → U , b: B → U , c: C → U , d: D → U and u: U 1 → U be monomorphisms in an adhesive category. Furthermore, let a = a ; u, b = b ; u, c = c  ; u and d = d ; u. The situation is depicted in the following diagram:

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

87

U

u

a

a

d

c

b

C

B

A

d c

U1

b

D

Then (i) [a] ∩ [b] = [c ] ∩ [d] iff [a ] ∩ [b ] = [c  ] ∩ [d ] (ii) [a] ∪ [b] = [c ] ∪ [d] iff [a ] ∪ [b ] = [c  ] ∪ [d ]. Proof. Let [x] be a subobject of U and y: U → U¯ . We write [x] ; y as a shorthand for [x ; y ]. We first show that [a] ∩ [b] = ([a ] ∩ [b ]) ; u. In the following, refer to this diagram for an overview of the situation. All the arrows in this diagram are monos. U u

a

b

U1 a

b

B

A a

b

( A ∩ B )1 a

k

b

h

A∩B The pullback of a and b forms the intersection. We want to show that this pullback is the same as the pullback of a and b. Let (a , b ) be the pullback of (a , b ), such that a ; a = b ; b . Furthermore, let (a, b) be the pullback of (a, b), such that a ; a = b ; b. Then

b  ; b = b  ; b  ; u = a ; a ; u = a ; a, so there is a unique arrow h: dom(a ) → dom(a) such that h ; a = a and h ; b = b . We also observe that

b ; b  ; u = b ; b = a ; a = a ; a ; u , and therefore, as u is mono, a ; a = b ; b . So there is a unique arrow k: dom(a) → dom(a ) such that k ; a = a and k ; b = b. Now we can prove h ; k ; a = a , and therefore that h ; k = iddom(a ) , since a is mono. Analogously we can compute k ; h ; a = a, and therefore that k ; h = iddom(a) , since a is mono. Thus, h and k are isomorphisms and therefore [a] ∩ [b] = ([a ] ∩ [b ]) ; u. Analogously we show that [c ] ∩ [d] = ([c  ] ∩ [d ]) ; u. So now we can prove the two parts of the lemma: (i) We have









a ∩ b  = c  ∩ d ⇒





a ∩ b

;u =





c ∩ d

; u ⇔ [a] ∩ [b] = [c ] ∩ [d].

And since [u ] is monic, we also have





a ∩ b ; u =





c ∩ d



; u ⇒ a ∩ b  = c  ∩ d .

So the first part of the lemma holds. (ii) Seeing that the union is built via the pushout of the intersection, this follows directly from part 1.

2

Proposition 4.6. Let A = ( A , I , F ) be a concatenation semi-automaton over C and c 1 and c 2 composable cospans. Then A (c 1 ) ; A (c 2 ) ⊆ A (c 1 ; c 2 ).

88

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

  Proof. Let c 1 = ( J → J 1 ← J 1 ) and c 2 = ( J 1 → J 2 ← J  ). Additionally, let c = c 1 ; c 2 , where c = ( J → J ← J  ). Assume  z A (c 1 ) ; A (c 2 ) z , for states z ∈ A ( J ) and z ∈ A ( J  ). This means that there exists a state z ∈ A ( J 1 ) such that z A (c 1 ) z and  z A (c 2 ) z . By construction, we have the following situation:

K

K

U 1  z1 

K

K





U 11  z1 

U 11

U 11  z1 

U1

U U 2  z2 

U1

U1







U

U2 

U 21  z2  J1

K U 1  z1 

U 12



U 21 J1

J

K

U 21  z2  J1



U 2  z2 

U 22 J

J2

There is an arrow from each object X in the first diagram to U 1 . By convention, the subobject of U 1 (see Definition 2.4) corresponding to this arrow will be denoted [x]1 , where x is the name of X with each letter written in lowercase. For example [ j ] is an equivalence class of arrows from J to U 1 . Similarly, subobjects of U 2 will be denoted by [x]2 . We now want to construct a transition for c 1 ; c 2 to prove our claim. This transition is of this form: id K :

K U 1  z1 

d1 :

K U 1  z1 

U1 U

d:

K

U

U U 2  z2 

d2 : J

c:

U 2  z2 

U2 J

J

The cospan c = ( J → J ← J  ) is constructed by composing c 1 and c 2 . The cospan d is constructed in a similar way: We  construct the pushout of U 1 and U 2 over U 1 and call the pushout object U . The arrows from U to U , called u, and from U  to U , called u  , again are found by arrow composition. The resulting equivalence classes of arrows going from an object A to U are called [a] in the following. The cospans d1 and d2 are constructed analogously by taking pushouts of U 11 and U 21 as well as U 12 and U 22 . We observe that U 1 → U 1 ← U 1 is a cospan that allows a transition from z1 to z1 in A1 as A1 must be an automaton functor. Accordingly, U 2 → U 2 ← U 2 must be a cospan that allows a transition from z2 to z2 in A2 . The missing arrows – e.g. the one from J to U – that need to be there for this to be a transition in A can be obtained as mediating morphisms. As shown in [10], these morphisms are in fact monos as well and they are unique up to isomorphism because the diagram commutes. Now we have constructed the diagram, we need to prove that is satisfies the five additional properties for it to witness a transition.

• Property 1: [d1 ] ∪ [d2 ] = [d]. We already know that [u 1 ] ∪ [u 2 ] = [u ] and [u 1 ] ∪ [u 2 ] = [u  ] because otherwise [u 1 ]1 ∪ [u 2 ]1 = [u ]1 resp. [u 1 ]2 ∪ [u 2 ]2 = [u  ]2 could not have been true. We compute

∪ u 12 ∪ u 22

= u 11 ∪ u 12 ∪ u 21 ∪ u 22 = u 1 ∪ u 2 = [u ].

[u 1 ] ∪ [u 2 ] =







u 11 ∪ u 21



K ] = [d]. We observe that [ j ] ∪ [k] = [u ] due to [ j ]1 ∪ [k]1 = [u ]1 , and [ j  ] ∪ [k] = [u  ] due to [ j  ]2 ∪ • Property 3: [ c ] ∪ [id  [k]2 = [u ]2 , so we can show this property by computing:









[ j ] ∪ [k] = j 1 ∪ j 2 ∪ [k] = j 1 ∪ [k] ∪ j 2 ∪ [k] = u 1 ∪ u 2 = [u ]. • Property 4: d1 is a cospan with z1 A1 (d1 )z1 and d2 is a cospan with zs A2 (d2 )z2 . See the argument regarding the states above, where we constructed the transition.

• Property 5: [u 1 ] ∩ [k] = [u 1 ] ∩ [k] and [u 2 ] ∩ [k] = [u 2 ] ∩ [k]. We have the following:

[u 1 ] ∩ [k] =







u 11 ∩ [k] ∪







u 21 ∩ [k] =







u 11 ∩ [k] ∪







u 1 ∩ [k] .

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

89

So ⊇ is clear. We show ⊆ as follows:



u 11









∩ [k] ∪ u 1 ∩ [k] ⊆ u 21 ∩ [k] ∪ u 1 ∩ [k]





= u 1 ∩ [k] ∪ u 1 ∩ [k] = u 1 ∩ [k].

Analogously we show that [u 2 ] ∩ [k] = [u 2 ] ∩ [k]. K ]. First, we show that [ j ] ∩ [k] ⊆ [u 1 ] ∩ [u 2 ]. We compute • Property 2: [d1 ] ∩ [d2 ] = [ c ] ∩ [id

[u 1 ] ∩ [u 2 ] = =















u 11 ∪ u 21 u 11 ∩ u 12

∩ ∪















u 12 ∪ u 22 u 11 ∩ u 22









u 21 ∩ u 12













u 21 ∩ u 22 .

Furthermore:







u 11 ∩ u 12











u 21 ∩ u 22



=





j 1 ∩ [k] ∪





j 2 ∩ [k] =





j1 ∪ j2

∩ [k] = [ j ] ∩ [k].

Next, we have to show that [ j ] ∩ [k] ⊇ [u 1 ] ∩ [u 2 ]. To do that, we can see from our computation above, that it is sufficient to show that ([u 11 ] ∩ [u 22 ]) ⊆ [ j ] ∩ [k] and ([u 21 ] ∩ [u 12 ]) ⊆ [ j ] ∩ [k]. – ([u 11 ] ∩ [u 22 ]) ⊆ [ j ] ∩ [k]: We recognize that [u 11 ] ⊆ [ j 1 ] ∪ [k] is true and therefore compute

































u 11 ∩ u 22 ∩ [k] = u 11 ∩ u 12 ∩ [k] ⊆ u 11 ∩ u 12 = j 1 ∩ [k] ⊆ [ j ] ∩ [k]





u 11 ∩ u 22 ∩ j 1 =











u 11 ∩ u 22 ∩ j 1 ∩ j 2



∪ u 11 ∩ u 22 ∩ j 1 ∩ [k] .

The second part of this is a subset of [ j 1 ] ∩ [k] ⊆ [ j ] ∩ [k] so using [ j 1  ] ⊆ [ j 1  ] ∪ [k] = [u 12  ] ∪ [u 11  ] for the first step, we can finally compute













u 11 ∩ u 22 ∩ j 1 ∩ j 2



= u 11 ∩ u 22 ∩ j 1  = u 11 ∩ u 22 ∩ j 1  ∩ u 12  ∪ u 11 ∩ u 22 ∩ j 1  ∩ u 11  ⊆ u 11 ∩ u 22 ∩ j 1  ∩ u 12 ∪ u 11 ∩ u 22 ∩ j 1  ∩ u 21



⊆ j 1 ∩ [k] ∪ j 2 ∩ [k] = [ j ] ∩ [k].



([u 21 ] ∩ [u 12 ])



⊆ [ j ] ∩ [k]: Analogously we first compute:

2

















u 1 ∩ u 12 = u 21 ∩ u 12 ∩ [ j ] ∪ [k] =











u 21 ∩ u 12 ∩ [ j ] ∪











u 21 ∩ u 12 ∩ [k] .

First, we have a look at the term [u 21 ] ∩ [u 12 ] ∩ [ j ]:









u 21 ∩ u 12 ∩ [ j ]



∪ u 21 ∩ u 12 ∩ j 2

= u 21 ∩ u 12 ∩ j 1 ∩ j 2 ∪ u 21 ∩ u 12 ∩ j 1 ∩ [k]  

=











u 21 ∩ u 12 ∩ j 1













⊆[ j ]∩[k]

u 21 ∩ u 12 ∩ j 2 ∩ [k]





⊆[ j ]∩[k]

For the last two terms we already have the inclusion, for the first one we compute, using [ j 1  ] ⊆ [u 11  ] ∪ [u 12  ]:













u 21 ∩ u 12 ∩ j 1 ∩ j 2



= u 21 ∩ u 12 ∩ j 1  = u 21 ∩ u 12 ∩ j 1  ∩ u 11  ∪ u 21 ∩ u 12 ∩ j 1  ∩ u 12  ⊆ u 21 ∩ u 12 ∩ j 1  ∩ u 11 ∪ u 21 ∩ u 12 ∩ j 1  ∩ u 22



⊆ j 1 ∩ [k] ∪ j 2 ∩ [k] = [ j ] ∩ [k].

90

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

To conclude, we still need to compute, using [k] ⊆ [u 11  ] ∪ [u 12  ]:









u 21 ∩ u 12 ∩ [k]



∪ u 21 ∩ u 12 ∩ [k] ∩ u 12 



⊆ u 21 ∩ u 12 ∩ [k] ∩ u 11 ∪ u 21 ∩ u 12 ∩ [k] ∩ u 22



= j 1 ∩ [k] ∪ j 2 ∩ [k] =











u 21 ∩ u 12 ∩ [k] ∩ u 11 

= [ j ] ∩ [k].



2

We proceed by proving the converse direction. For this, we need the following auxiliary lemma: Lemma 4.7. Let a: A → U , b: B → U be objects of the subobject lattice of U , such that a ⊆ b. Furthermore, let a and b be monomorphisms. Then there is a unique c: A → B such that c ; a = b. This arrow c is necessarily mono. Proof. From a ⊆ b follows that there is an arrow c such that c ; a = b. Now let d: A → B be another arrow such that d ; a = b, then d ; a = c ; a and as a is mono it follows that d = c. Finally, let two arrows e: E → A and e  : E → A be given. We compute

e ; c = e ; c ⇒ e ; c ; a = e ; c ; a ⇒ e ; b = e ; b ⇒ e = e . Because b is mono, c is mono too.

2

Proposition 4.8. Let A = ( A , I , F ) be a concatenation semi-automaton over C and c 1 and c 2 composable cospans. Then A (c 1 ; c 2 ) ⊆ A (c 1 ) ; A (c 2 ).   Proof. Let c 1 = ( J → J 1 ← J 1 ) and c 2 = ( J 1 → J 2 ← J  ). Additionally, let c = c 1 ; c 2 , where c = ( J → J ← J  ). Assume    z A (c ) z for states z ∈ A ( J ), z ∈ A ( J ). The situation is as follows:

id K : d1 :

K

K

U 1  z1 

U 1  z1 

U1 U

d:

K

U

U U 2  z2 

d2 : J

c:

U 2  z2 

U2 J

J

We want to construct transitions for c 1 and c 2 in A that can be used in a row and yield the same result as the given transition for c. As in the proof for Proposition 4.6, for each object A in the diagram above the corresponding equivalence class of arrows from A to U will be called [a]. Our goal is first to construct a transition for c 1 and c 2 in A as follows: K U 1  z1 

K 

U1

U

K

U 2  z2 

U 11  z1  U1



U1





J1







U

U2 U 21  z2 

J1

K U 1  z1 

U 12

U 21  z2 

U 21 J1

K



U 11  z1 

U 11

J

K

U 2  z2 

U 22 J2

J

For this, we will define the arrows going from each object in these diagrams to U . Via u 1 and u 2 , we can obtain the arrows in the above diagrams using Lemma 4.7. We start by defining the d-cospans of both transitions: As c 1 and c 2 can be composed to form c, we have an arrow from J i to J in this pushout called ψ iJ for i ∈ {1, 2}. The equivalence class of arrows [ψ iJ ] ; [ j ] will be called [ j i ]. Now we can form the pullback of [ j i ] and [k] and thus also [ j i ] ∪ [k] =: u i . Here and in the following, the domain of a newly defined

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

91

arrow [a] will be called A. Observe that we also get monomorphisms going from J i to U i and from K to U i this way. Here and in the following exactly these arrows are the ones we are using for the definition of the c i -transition.  Additionally, we need [u 1 ] = [u 1 ] ∩ [u 2 ]. For c 1 ; c 2 to be a valid decomposition of c, [ j ] ⊆ [ j 1 ] must hold and thus [u ] ⊆ [u 1 ], which yields a unique arrow from U to U 1 . Analogously we find an arrow from U  to U 2 Now we can continue by defining the d1 resp. d2 cospans of the transitions. Let





u 11 =







u1 ∩ j1

∪ [u 1 ] 2

2 ∪ [u 1 ] ∩ [k] u 1 = [u 1 ] ∩ j

1

u 1 = u 11 ∩ u 21



∪ u 2 1

u 2 = [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k] 1



u 2 = u 12 ∩ u 22







u 22 = [u 2 ] ∩ j 2

All missing arrows can be obtained via concatenation and mediating morphisms. It remains to show that the constructed diagrams satisfy the properties of a transition. For this, we make heavy use of Lemma 4.5 and work in the subobject lattice of U instead of the ones of U 1 and U 2 .

• Property 4: d1 is a cospan with z1 A1 (d1 ) z1 and d2 is a cospan with zs A2 (d2 ) z2 .   First, we observe that, for i ∈ {1, 2}, the di -cospans U i → U 1i ← U i1 and U i1 → U 2i ← U i can be concatenated. So we 1 2 1 2 just have to show that [u 1 ] ∪ [u 1 ] = [u 1 ] and [u 2 ] ∪ [u 2 ] = [u 2 ].







u 11 ∪ u 21



= [u 1 ] ∩ j 1 ∪ [u 1 ] ∪ [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k] = j 1 ∪ j 2 ∩ [u 1 ] ∪ [u 1 ] ∪ [u 1 ] ∩ [k] = [ j ] ∩ [u 1 ] ∪ [u 1 ] ∩ [k] ∪ [u 1 ] = [u 1 ] ∩ [ j ] ∪ [k] ∪ [u 1 ] = [ u 1 ] ∩ [ u ] ∪ [u 1 ] = [ u 1 ] ∪ [u 1 ] = [u 1 ].

Furthermore:







u 12 ∪ u 22





= [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k] ∪ [u 2 ] ∩ j 2 ∪ u 2

= j 1 ∪ j 2 ∩ [u 2 ] ∪ [u 2 ] ∩ [k] ∪ u 2

= [ j ] ∩ [u 2 ] ∪ [k] ∩ [u 2 ] ∪ u 2

= [u ] ∩ [u 2 ] ∪ u 2

= [u 2 ] ∪ u 2 = [u 2 ].

Since A1 and A2 have the functor-property, Property 4 holds. • Property 5: [u 1 ] ∩ [k] = [u 1 ] ∩ [k] and [u 2 ] ∩ [k] = [u 2 ] ∩ [k].

We show Property 5 for the c 1 -transition as follows: – As the c-transition is a legal transition in A we know that [u 2 ] ∩ [k] = [u 2 ] ∩ [k] and [u 2 ] ⊆ [u 12 ] ⊆ [u 2 ], so [u 2 ] ∩ [k] = [u 12 ] ∩ [k]. – To show [u 11 ] ∩ [k] = [u 11  ] ∩ [k] we use [u 11  ] = [u 11 ] ∩ [u 21 ] to see that it suffices to show [u 11 ] ∩ [k] = [u 11 ] ∩ [k] ∩ [u 21 ]. The “⊇” direction is obvious. The “⊆” direction holds, if [u 11 ] ∩ [k] ⊆ [u 21 ]. So we compute [u 11 ] ∩ [k] = ([u 1 ] ∩ [ j 1 ] ∩

[k]) ∪ ([u 1 ] ∩ [k]) and [u 21 ] = ([u 1 ] ∩ [ j 2 ]) ∪ ([u 1 ] ∩ [k]). Finally, we observe that [u 1 ] ∩ [ j 1 ] ∩ [k] ⊆ [u 1 ] ∩ [k] and [u 1 ] ∩ [k] ⊆ [u 1 ] ∩ [k], so the property holds for the c 1 -transition. Showing Property 5 for the c 2 -transition is analogous.

• Property 1: [d1 ] ∪ [d2 ] = [d]. We know that [u 1 ] ∪ [u 2 ] = [u ] and [u 1 ] ∪ [u 2 ] = [u  ] because that has to hold for the given transition to be a transition of A. We still have to show three equalities. The first is:







u 11 ∪ u 12



= [u 1 ] ∩ j 1 ∪ [u 1 ] ∪ [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k]

92

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98



= j 1 ∩ [u 1 ] ∪ [u 2 ] ∪ [u 2 ] ∩ j 1 ∪ [k] ∪ [u 1 ]



= j 1 ∩ [u ] ∪ j 1 ∪ [k] ∩ [u 2 ] ∪ [u 1 ]

= j 1 ∪ j 1 ∪ [k] ∩ [u 2 ] ∪ [u 1 ] =( ) j 1 ∪ [k] ∩ [u 2 ] ∪ [u 1 ] ∩ [k] ∪ [u 1 ] =(†) [u 2 ] ∪ [k] ∪ [u 1 ] ∪ [u 1 ] = j 1 ∪ [k] ∩ [u 1 ] ∪ [u 2 ] ∪ [u 1 ] = j 1 ∪ [k] ∩ [u ] ∪ [u 1 ]

= j 1 ∪ [k] ∪ [u 1 ]

= u 1 ∪ [u 1 ]

= u1 . The equation marked with ( ) follows from the fact that [ j 1 ] ∩ [u 2 ] ⊆ [ j 1 ] and [u 1 ] = [u 1 ] ∪ ([u 1 ] ∩ [k]). The equation marked with (†) follows from Property 5. The second equation can be shown as follows:





∪ u 12 

= u 21 ∩ u 11 ∪ u 12 ∩ u 22 = [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∩ [u 1 ] ∩ j 1 ∪ [u 1 ] ∪ [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k] ∩ [u 2 ] ∩ j 2 ∪ u 2 = [ u 1 ] ∩ j 1 ∩ j 2 ∪ [ u 1 ] ∩ [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∩ j 1 ∪ [u 1 ] ∩ [u 1 ] ∩ [k] ∪ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 2 ] ∩ j 1 ∩ u 2 ∪ [u 2 ] ∩ [k] ∩ j 2 ∪ [u 2 ] ∩ [k] ∩ u 2 = j1 ∩ j2 1

∪ [k] ∩ j ∩ [u 1 ] ∪ [u 1 ] ∩ [u 1 ] ∪ [u 2 ] ∩ j 2 ∪ [u 2 ] ∩ u 2 ∪ [u 1 ] ∩ [u 1 ] ∩ j 2 ∪ [u 2 ] ∩ j 1 ∩ u 2

   

u 11



⊆([ j 2 ]∩[ j 1 ])

⊆([ j 2 ]∩[ j 1 ])





=( ) j 1 ∩ j 2 ∪ [k] ∩ [u 1 ] ∩ j 1 ∪ [u 1 ] ∪ [u 2 ] ∩ j 2 ∪ u 2    

(† )

1

⊇[u 1 ]

2

⊇u 2

∪ [k] 2 1

= [k] ∪ [k] ∩ j ∪ j ∩ [k] ∪ j 1 ∩ j 2

    =

j



∩ j

⊆[k]

1





⊆[k]

2

= [k] ∪ j ∩ [k] ∪ j

= u1 ∩ u2

= u1 

Above, the equation marked with ( ) follows from the fact that [u 1 ] ∪ [u 2 ] = [u ], and the equation marked with (†) follows from the fact that [u 1 ] ∪ [u 2 ] = [u ] ⊇ [k]. That leaves one equation to be shown:







u 21 ∪ u 22





= [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∪ [u 2 ] ∩ j 2 ∪ u 2

= [u 1 ] ∪ [u 2 ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∪ u 2

= [u ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∪ u 2

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

93



= j 2 ∪ [u 1 ] ∩ [k] ∪ u 2

=( ) j 2 ∪ [k] ∩ [u 1 ] ∪ u 2 ∪ j 2

=(†) j 2 ∪ [k] ∩ u 1 ∪ u 2 ∪ j 2



= j 2 ∪ [k] ∩ j  ∪ [k] ∪ j 2

= j 2 ∪ [k] = [u 2 ]. Above, the equation marked with ( ) follows from the fact that ([u 2 ] ⊆ [u  ] = [ j  ] ∪ [k] ⊆ [ j 2 ] ∪ [k] and the equation marked with (†) follows from Property 4. K ]: • Property 2: [d1 ] ∩ [d2 ] = [ c ] ∩ [id To show that this property holds, we must show that five equations are true. As in the proof for the first property, we know that two equations, [u 1 ] ∩ [u 2 ] = [ j ] ∩ [k] and [ j  ] ∩ [k] = [u 1 ] ∩ [u 2 ], have to hold for the original transition to be an A-transition. So there are three equations still to be shown.







u 11 ∩ u 12





= [u 1 ] ∩ j 1 ∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k]



= [u 1 ] ∩ j 1 ∩ [u 2 ] ∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ [k] ∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∪ [u 1 ] ∩ [u 2 ] ∩ [k]

=( ) j 1 ∩ [k] ∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∪ [u 1 ] ∩ [u 2 ] ∩ [k]

=(†) j 1 ∩ [k]

Above, the equation marked with ( ) follows from the fact that [u 1 ] ∩ [u 2 ] = [ j ] ∩ [k], while the equation marked with (†) follows from the fact that [u 1 ] ⊆ [u 1 ] and [u 1 ] ∩ [u 2 ] = [ j 1 ] ∩ [k]. The second equation, [u 21 ] ∩ [u 22 ] = [ j 2 ] ∩ [k] is shown analogously. The last equation, [u 11  ] ∩ [u 12  ] = [ j 1  ] ∩ [k], is a bit more complicated so we break it up a bit. We first show:







u 11  ∩ u 12 





= u 21 ∩ u 11 ∩ u 22 ∩ u 12 = [u 1 ] ∩ j 1 ∪ [u 1 ] ∩ [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k] ∩ [u 2 ] ∩ j 1 ∪ [u 2 ] ∩ [k] ∩ [u 2 ] ∩ j 2 ∪ u 2

= [u 1 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∪ [u 1 ] ∩ [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [u 1 ] ∩ [k] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 2 ] ∩ j 1 ∩ u 2

∪ [u 2 ] ∩ [k] ∩ j 2 ∪ [u 2 ] ∩ u 2 ∩ [k]

=( ) [u 1 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∪ [u 1 ] ∩ j 2 ∪ [u 1 ] ∩ [k]

∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ u 2 ∩ j 1 ∪ [u 2 ] ∩ [k] ∩ j 2 ∪ u 2 ∩ [k] = [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ j 2 ∩ u 2



∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ j 1 ∩ j 2 ∩ [k]



∪ [u 1 ] ∩ j 1 ∩ [k] ∩ [u 2 ] ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2





∪ [u 1 ] ∩ j 1 ∩ [k] ∩ j 2 ∩ [u 2 ] ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2 ∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ u 2 ∩ j 1 ∩ j 2



∪ [u 1 ] ∩ j 2 ∩ [u 2 ] ∩ [k] ∪ u 1 ∩ u 2 ∩ j 2 ∩ [k]



∪ [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ j 1 ∩ [k]

∪ [u 1 ] ∩ [u 2 ] ∩ [k] ∩ j 2 ∪ [u 1 ] ∩ u 2 ∩ [k]

=(†) [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2



∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2 ∪ [u 1 ] ∩ j 2 ∩ [u 2 ] ∩ [k]

94

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98







∪ [u 1 ] ∩ u 2 ∩ j 2 ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ j 1 ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ [k]

= [u 1 ] ∩ [u 2 ] ∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2





∪ [u 1 ] ∩ j 2 ∩ [u 2 ] ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ j 2 ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ [k]

∩ j 1 ∩ j 2 ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2 = [u ] ∩ [u ]

1  2

  [u 1 ]∩[u 2 ]=[k]∩[ j ]⊆[k],[k]∩[ j ]⊇[k]∩[ j 1  ]

[ j1  ]





∪ [u 1 ] ∩ j 2 ∩ [u 2 ] ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ [k]

= [k] ∩ j 1  ∪ [u 1 ] ∩ j 1 ∩ [k] ∩ u 2



∪ [u 1 ] ∩ j 2 ∩ [u 2 ] ∩ [k] ∪ [u 1 ] ∩ u 2 ∩ [k] To show that this equals [k] ∩ [ j 1  ], we will show three inclusions: – [u 1 ] ∩ [u 2 ] ∩ [ j 1 ] ⊆ [ j 1  ]: Using Property 5 we get [u 1 ] ∪ [u 2 ] = [u 1 ] ∩ [u 2 ] and with that we compute:



[u 1 ] ∩ u 2 ∩ j 1 = u 1 ∩ u 2 ∩ j 1 ⊆ u 21 ∩ u 22 ∩ j 1 ⊆ j 2 ∩ j 1 = j 1  . – [u 1 ] ∩ [u 2 ] ∩ [ j 2 ] ⊆ [ j 1  ]: Using Property 5 again, the relation [u 1 ] ∩ [u 2 ] = [u 1 ] ∩ [u 2 ] yields





[u 1 ] ∩ [u 2 ] ∩ j 2 = [u 1 ] ∩ [u 2 ] ∩ j 2 ⊆ u 11 ∩ u 12 ∩ j 2



= j 1 ∩ j 2 ∩ [k] ⊆ j 1 ∩ j 2 = j 1  . – [u 1 ] ∩ [u 2 ] ⊆ [ j 1  ]: Using [u 1 ] ⊆ [u 1 ], [u 2 ] ⊆ [u 2 ] and [u 1 ] ∩ [u 2 ] ⊆ [k] we get





[u 1 ] ∩ u 2 = [u 1 ] ∩ u 2 ∩ [k] ⊆ [u 1 ] ∩ [u 2 ] ∩ [k]

= [u 1 ] ∩ [u 2 ] ∩ [k] = j ∩ [k] ⊆ j 1 and analogously









[u 1 ] ∩ u 2 ⊆ [u 1 ] ∩ u 2 ∩ [k] = u 1 ∩ u 2 ∩ [k] ⊆ j 2 . So the third inclusion is also true and therefore the third equation we had to show is true, too.

K ] = [d]. • Property 3: [ c ] ∪ [id As before, we know that [ j ] ∪ [k] = [u ] and [ j  ] ∪ [k ] = [u  ] have to be true from the given transition. Furthermore, [ j 1 ] ∪ [k] = [u 1 ] and [ j 2 ] ∪ [k] = [u 2 ] are defined this way, so there is nothing to prove here either. So it suffices to show that [u 1  ] = [ j 1  ] ∪ [k].



u1 





= u1 ∩ u2



= j 1 ∪ [k] ∩ j 2 ∪ [k]





= j 1 ∩ j 2 ∪ [k] ∪ j 1 ∩ [k] ∪ j 2 ∩ [k]





= j 1  ∪ [k] ∪ j 1 ∩ [k] ∪ j 2 ∩ [k]

   

= j

1



⊆[k]

⊆[k]

∪ [k].

Now we have shown all five properties and therefore the transitions we have defined indeed are A-transitions. Therefore the proposition is proven. 2 For convenience, we repeat the main result of this section: Corollary 4.9. Concatenation semi-automata are well-defined semi-automata. Proof. By Propositions 4.6 and 4.8.

2

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

95

4.3. Concatenation semi-automata accept the correct language

Next we want to show that concatenation semi-automata accept the concatenation of the languages of the constituent automata.

Proposition 4.10. Let A1 be an ( X , K )-automaton, A2 a ( K , Y )-automaton, and A be the concatenation semi-automaton of A1 and A2 . Then L (A) = L (A1 ) ; L (A2 ).

Proof. Let A1 = ( A 1 , I 1 , F 1 ), A2 = ( A 2 , I 2 , F 2 ) and A = ( A , I , F ).

• “⊇”: Suppose c ∈ L (A1 ) ; L (A2 ). Then there exist c 1 ∈ L (A1 ) and c 2 ∈ L (A2 ) such that c = c 1 ; c 2 . We will now give a transition for c 1 in A that starts in an initial state, then a composable transition for c 2 in A that ends in a final state. As we have already shown that A is a semi-automaton, this is sufficient to show that c is accepted by A. The cospan c 1 can be written as c 1 = U 1 → U ← K , hence we can construct the transition depicted in the following diagram: id K : d1 :

K U 1  z s1 

K K  ze 1 

U U

d:

c1 :

K

U K  z s2 

d2 :

K

U1

K  z s2 

K K

U

where z s1 is the initial and ze1 the final state of the c 1 -transition in A1 and z s2 is the initial and ze2 the final state of the c 2 -transition in A2 . Apart from U , all objects and corresponding arrows are already known, U and its arrows are defined by first taking the pullback of U 1 → U and K → U and then taking the pushout of the resulting arrows, so it is the join of U 1 → U and K → U . As before, for each object X the equivalence class of arrows going from X to U is denoted by [x]. We now have to show that the five properties of state transitions in A hold. 1. [u ] = [u ] ∪ [k] and [k] ∪ [k] = [k] are obvious, [u ] is defined as the join of [u 1 ] and [k], so [u ] = [u 1 ] ∪ [k] holds per definition. 2. [k] ∩ [k] = [k] ∩ [k], [u ] ∩ [k] = [u ] ∩ [k] and [k] ∩ [u 1 ] = [k] ∩ [u 1 ] clearly hold. 3. Once again, [u ] = [k] ∪ [u 1 ] holds per definition and [u ] ∪ [k] = [u ] as well as [k] ∪ [k] = [k] are obvious. 4. The cospan d1 was assumed to be a cospan with z s1 A 1 (c 1 ) ze1 . As d2 is just the identity and does not change the state, it has to be a cospan with z s1 A 2 (d2 ) z s1 . 5. [u ] ∩ [k] = [k] ∩ [k] and [k] ∩ [k] = [k] ∩ [k] hold obviously. A transition for c 2 in A can be found analogously: id K : d1 : d:

K K  ze 1  K

U

U K  z s2 

K

K K  ze 1 

K

d2 : c2 :

K

J   ze 2 

U U

J

Apart from U  , all objects and corresponding arrows are known, U  can be obtained as the join of J  and K in the same way as U was constructed for the c 1 -transition. The five properties are easily proven in the same way as for the c 1 -transition. • “⊆”: Suppose c ∈ L (A). Let an accepting c-transition in A be given as follows:

96

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

id K : d1 : d:

K J  z1  U

K K  z1 

U1

U

U K  z2 

d2 : c:

K

J

J   z2 

U2 J

J

We will show that c = d1 ; d2 . As c ∈ L (A) it must be the case that d1 ∈ L (A1 ) and d2 ∈ L (A2 ), hence this is sufficient to show that c ∈ L (A1 ) ; L (A2 ). We first observe that [ j ] = [u ], because from d2 it follows that [k] ⊆ [u 2 ] and from d1 it follows that [k] ⊆ [u 1 ], and as [u 1 ] ∩ [u 2 ] = [ j ] ∩ [k] also [k] ⊆ [ j ] holds. We will now continue to show that d1 and d2 are in fact composable over U , which means that the pushout can be obtained as the union of U 1 and U 2 . The outer interface of d1 equals the inner interface of d2 and as we have just seen, [k] ⊆ [u 1 ] ∩ [u 2 ] holds. Moreover, [u 1 ] ∩ [u 2 ] = [ j ] ∩ [k] ⊆ [k], so [u 1 ] ∩ [u 2 ] = [k] and therefore the cospans are indeed composable in this sense. To conclude, we have to show that c and d1 ; d2 are equal. From the conditions a state transition in A has to fulfill, we know that [u 1 ] ∪ [u 2 ] = [u ] and thus [u 1 ] ∪ [u 2 ] = [ j ], hence c = d1 ; d2 . 2 Using the result of Section 3 we can now state the paper’s main result: Theorem 4.11. Let C be the category of cospans consisting of monos in an adhesive category. Recognizable languages in C are closed under concatenation. Moreover, given automata A1 and A2 , an automaton accepting the L (A1 ) ; L (A2 ) can be effectively constructed. Proof. By Corollary 4.9 and Proposition 4.10 there exists an equivalent semi-automaton for the concatenation language; by Corollary 3.24 this has an equivalent automaton. 2 A well-known adhesive category is the category HG raph so we have particularly shown that the concatenation of recognizable graph languages is recognizable. A similar result for recognizable graph languages has been shown in [7], for languages of graphs with a single interface. Note that here we also generalized this result to the setting of adhesive categories. 5. Negative results for closure properties As shown above recognizable languages over adhesive categories are closed under concatenation, just like regular languages. We already know that recognizable languages are closed under complement, union and intersection. Unfortunately, there are some closure properties that regular languages enjoy but which do not hold for recognizable languages. To show two negative results, we will work in the category of cospans over HG raph. We start by showing that recognizable languages are not closed under Kleene star and we will start by defining the notion of Kleene star in our setting. Definition 5.1. Let L be a ( K , K )-graph language. Then the Kleene-star of L, written L ∗ , is the ( K , K )-graph language defined as:

L ∗ = {σ | there are σ1 , . . . , σn such that σ = σ1 ; · · · ; σn and σi ∈ L for all 1 ≤ i ≤ n}. Hence in our setting we call the language of all cospans of graphs that can be decomposed into cospans that are in a language L, the language L ∗ . We will now show that recognizable graph languages are not closed under Kleene star. Theorem 5.2. Recognizable graph languages are not closed under Kleene-star. Proof. Let L be the graph language that contains only the following cospan: a

b

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

97

Since this language consists of only one cospan it is recognizable. However, L ∗ consists of all cospans of graphs with one vertex (which is both in the inner and outer interface), any number of a-edges and the same amount of b-edges. But we cannot find a graph automaton over the set of atomic cospans that accepts L ∗ , because the set of reachable states has to be the same for any decomposition of a given cospan in L ∗ . Thus, it is possible to decompose an accepted cospan in a way that first reads all a-edges and then all b-edges. Therefore there have to be infinitely many (Myhill–Nerode) equivalence classes for an interface of one vertex. 2 We will now see that recognizable graph languages are not closed under edge substitution (i.e. simultaneous substitution of all edges with a certain label, as in hyperedge replacement [9]) either. Definition 5.3. Let L be a set of graphs over the label alphabet Λ, a ∈ Λ a label for edges of arity k and G be a fixed graph where a sequence s of k vertices are marked as merge-vertices. The set L  of all graphs that are generated by substituting all a-labelled edges by G is the set of graphs that includes all graphs that can be generated from any graph G  ∈ L by repeating the following steps until there is no more original a-edge in G  .

• Let e be an a-edge in G  . Delete e from G  and remember the sequence of vertices att G  (e ). • Insert a copy of G into G  . Identify the merge-vertices of G with att G  (e ), the vertices previously incident to the a-edge. Theorem 5.4. Recognizable graph languages are not closed under substitution. Proof. Again we will show this by providing a counter-example. First, let L be the language of all graphs that have exactly one vertex v and an arbitrary number of a-edges of arity 1. As a graph to be substituted we will use the graph G that consists of one vertex (which also is the merge-vertex), one b-edge and one c-edge, both of arity 1. This substitution rule applied to L generates the language of all graphs with one vertex, any number of b-edges and the same amount of c-edges. As before we see that this language is not recognizable, whereas L is recognizable. 2 6. Conclusion Our first main contribution was to give a procedure for converting semi-automata into automata, a construction similar to epsilon elimination, thus showing robustness of the notion of recognizability. Second, we gave an explicit construction for a semi-automaton functor which recognizes the concatenation of the languages of two given automaton functors. More precisely, given an automaton functor A X , K accepting a language L X , K of cospans with domain X and codomain K and an automaton functor A K ,Y accepting a language L K , X of cospans with domain K and codomain Y , we constructed a semi-automaton functor A which accepts the concatenation of L X , K and L K ,Y . The construction is non-trivial because we have to make sure that the constructed automaton functor is a functor. By giving this construction, together with our first main result which allows us to convert semi-automata into automata, we have shown that recognizable languages (of arrows in an adhesive category) are closed under concatenation. One application of the construction is the calculation of strongest post-conditions. Let a property be given as an automaton functor A, and let p = ( , r ) be a graph rewriting rule in the format of reactive systems [13], that is, r and are cospans with the same domain and codomain, and the reduction relation is generated by ; c ⇒ r ; c for all context cospans c. We want to construct an automaton B that accepts exactly the cospans that are successors (after the application of rule p) of cospans accepted by A. We hence construct an automaton A which accepts c if and only if A accepts ; c by changing the initial states of A to the states which are reachable by from the original initial state (see also [2]). Then we have two recognizable languages, {r } and L (A ) and we obtain a representation of the strongest post-condition by constructing the composition automaton B . More details are given in [3]. It is future research to make this idea more concrete. Note that in this application one of the recognizable languages is quite simple, consisting of only a single cospan. Another future plan is to find a more efficient construction for such simple concatenations. In the final part of the paper we showed, by giving counter-examples, that recognizable languages are not closed under (a suitable notion of) Kleene star and substitution. In particular, the second negative result is unfortunate, as a similar property for regular word languages played a key role in the match-bound technique for showing that a string rewrite system terminates [8]. This makes it hard to transfer the technique to graph transformation systems. Additional closure properties were considered in [12], which forms the basis of this paper. In particular, [12] contains additional negative results and the proof that recognizable graph languages are closed under application of inverse context-free rules. Furthermore, we here restricted ourselves to cospans of monos. We believe that the results might also hold for arbitrary cospans, however, our proofs rely heavily on monos, especially in the use of subobject lattices and due to the requirement that isos decompose into isos, needed in Section 3.4. Hence it is non-trivial to extend the results to the general case. A special case of graph languages deserves to be mentioned: if our graphs consist only of 0-ary labelled hyperedges, we can recognize multi-sets or commutative words [15]. We still have to investigate whether there exist interesting connections to the theory of commutative regular languages.

98

H.J.S. Bruggink et al. / Science of Computer Programming 104 (2015) 71–98

There exists a tool, called Raven [2], which is maintained by Christoph Blume and which can represent and manipulate graph automata. The number of states in such an automaton can be huge, as seen in some of the examples in this paper, and Raven mitigates the state explosion via a symbolic representation by binary decision diagrams, leading to acceptable runtime results. We plan to implement the constructions presented here in Raven, and for this it is necessary to break them down to atomic cospans, a task which has already been done in [12]. References [1] Paolo Baldan, Filippo Bonchi, Andrea Corradini, Tobias Heindel, Barbara König, A lattice-theoretical perspective on adhesive categories, J. Symb. Comput. 46 (2011) 222–245. [2] Christoph Blume, H.J. Sander Bruggink, Dominik Engelke, Barbara König, Efficient symbolic implementation of graph automata with applications to invariant checking, in: Proc. of ICGT ’12, International Conference on Graph Transformation, in: Lecture Notes in Computer Science, vol. 7562, Springer, 2012, pp. 264–278. [3] H.J. Sander Bruggink, Raphaël Cauderlier, Barbara König, Mathias Hülsbusch, Conditional reactive systems, in: Proc. of FSTTCS ’11, in: LIPIcs, vol. 13, Schloss Dagstuhl – Leibniz Center for Informatics, 2011. [4] H.J. Sander Bruggink, Barbara König, On the recognizability of arrow and graph languages, in: Proc. of ICGT ’08, International Conference on Graph Transformation, in: Lecture Notes in Computer Science, vol. 5214, Springer, 2008, pp. 336–350. [5] H.J. Sander Bruggink, Barbara König, Sebastian Küpper, Concatenation and other closure properties of recognizable languages in adhesive categories, in: Proc. of GT-VMT ’13, Workshop on Graph Transformation and Visual Modeling Techniques, Electron. Commun. EASST 58 (2013), http://journal.ub. tu-berlin.de/eceasst/article/view/854/848. [6] Bruno Courcelle, The monadic second-order logic of graphs I. Recognizable sets of finite graphs, Inf. Comput. 85 (1990) 12–75. [7] Bruno Courcelle, Recognizable sets of graphs: equivalent definitions and closure properties, Math. Struct. Comput. Sci. 4 (1) (1994) 1–32. [8] Alfons Geser, Dieter Hofbauer, Johannes Waldmann, Match-bounded string rewriting systems, in: Branislav Rovan, Peter Vojtas (Eds.), Mathematical Foundations of Computer Science 2003, in: Lecture Notes in Computer Science, vol. 2747, Springer, Berlin/Heidelberg, 2003, pp. 449–459. [9] Annegret Habel, Hyperedge Replacement: Grammars and Languages, Lecture Notes in Computer Science, vol. 643, 1992. [10] Tobias Heindel, A category theoretical approach to the concurrent semantics of rewriting: adhesive categories and related concepts, PhD thesis, Universität Duisburg-Essen, 2009. [11] John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation, Prentice Hall, 2006. [12] Sebastian Küpper, Abschlusseigenschaften für Graphsprachen mit Anwendungen auf Terminierungsanalyse, Master’s thesis, Universität Duisburg-Essen, August 2012. [13] James J. Leifer, Robin Milner, Deriving bisimulation congruences for reactive systems, in: Proc. of CONCUR 2000, in: Lecture Notes in Computer Science, vol. 1877, Springer, 2000, pp. 243–258. ´ [14] Stephen Lack, Paweł Sobocinski, Adhesive and quasiadhesive categories, RAIRO Theor. Inform. Appl. 39 (3) (2005) 511–545. [15] C.P. Rupert, Commutative regular languages, SIGACT News 22 (4) (1991) 48–49.