WEAKLY SEPARATED
GRAMMARS*
A. L. FUKSMAN
Rostov-on-Don
(Received 11 September 1973; revised 11 March 1976)

GRAMMARS admitting of analysis by means of an ordered magazine automaton with one state are discussed. The ordering consists of the fact that in each configuration the applicability of the rules is checked sequentially and the first suitable one is chosen. Constructive subclasses of the general class of weakly separated grammars are selected, the strictness of their imbedding is proved, they are compared with other known classes, and some algorithmic problems for them are studied.

1. In choosing an appropriate class of grammars for describing the syntax of practically applicable formal languages, for example programming languages, it is desirable to look at the possibility of obtaining directly from the grammar a simple and efficient algorithm of syntactical analysis. In particular, such a possibility exists for a certain subclass of context-free languages (CF-languages), namely for the class $L_s$ of separated languages [1], first considered in [2] under the title of simple languages and generated by the class $\Gamma_s$ of separated grammars. Subsequently it was noticed that the same property is possessed by more extensive classes of languages: the classes LF(k) of [3], the class $L_{w.s.}$ of weakly separated languages [4, 5], generated by weakly separated grammars, and also the class of modified separated languages [6]. The class $\Gamma_{w.s.}$ of weakly separated grammars is non-constructive; however, constructive subclasses $\Gamma_{kw.s.}$ of k-weakly-separated (k = 1, 2) grammars exist; they generate the classes of languages $L_{1w.s.}$ and $L_{2w.s.}$, which are more suitable for practical application. (Here a class or its definition is regarded as constructive if the algorithmic resolvability of the problem of membership in this class is known.) For the class $L_{1w.s.}$ it was established [7] that it is identical with the class LF(1) and the class $L_{m.s.}$
of modified separated languages, and also that the equivalence problem for the grammars of $\Gamma_{1w.s.}$ is solvable. Moreover, the class $\Gamma_{1w.s.}$ is identical with the class $\Gamma_{m.s.}$ of modified separated grammars. For the class $L_{2w.s.}$ it was shown [8] that it is imbedded in $L_{w.s.}$ and that the imbeddings $L_s \subset L_{1w.s.} \subset L_{2w.s.}$ are characteristic (that is, strict). In this paper a precise definition of weakly separated grammars is given which is based on a comparison of them with so-called ordered automata with a magazine memory (MM-automata), which represent a formal model of the algorithm of analysis mentioned above. Also defined are new constructive subclasses $\Gamma_{kw.s.}$ for k = 3, 4 and it is proved that the imbeddings $L_{kw.s.} \subset L_{(k+1)w.s.}$ are characteristic ($0 \le k \le 3$). [...]
the resolvability of this predicate is unknown even for $G_1, G_2 \in \Gamma_s$ (see [2]).
It is established that 1-weakly-separated grammars are characterized in the class $\Gamma_{w.s.}$ by the property of uniqueness (unambiguity), whereas the classes $\Gamma_{kw.s.}$ are non-unique for k > 1, but for them there exists an efficient analyser, the ordered MM-automaton. We also consider the class $L_{o.s.}$ of so-called ordered-separated languages, accepted by ordered MM-automata with one state, and it is established that the imbedding $L_{w.s.} \subset L_{o.s.}$, which easily follows from the definition of these classes, is likewise characteristic. It is proved that $L_{o.s.}$ forms a subclass of the class $L_D$ of all deterministic CF-languages; the classes LF(k) for k > 1 and $L_{kw.s.}$ for k > 1 are compared.

2. We give some definitions and notation used below: p(M) is the set of subsets of the set M; $M^*$ is the set of strings in the alphabet M, including the empty string ε; MN is the concatenation of the sets M and N, consisting of the strings μν with $\mu \in M$, $\nu \in N$; $M^p = MM \ldots M$ (p times); |M| is the number of elements in the set M, |θ| is the length of the string θ; $\theta^R$ is the mirror reflection of θ, that is, θ written from right to left; $\Rightarrow$ (between statements) is the symbol of implication; $\Leftrightarrow$ is the symbol of identity, that is, the relation "if and only if"; the quotient $\theta\backslash L$ is the set $\{\tau \mid \theta\tau \in L\}$.
A CF-grammar G, as in [9], is the aggregate $(V, \Sigma, P, \sigma)$, where $V = \Sigma \cup N$, $\Sigma \cap N = \emptyset$, $\sigma \in N$; here Σ is a terminal, and N a non-terminal alphabet, σ is the initial symbol, P is a set of substitutions of the form $z \to \alpha$, where $z \in N$, $\alpha \in V^*$. Substitutions from P will sometimes be represented by a formula
$$z \to \alpha_1 \mid \ldots \mid \alpha_n, \qquad (1)$$
combining all the substitutions with the same left side; we call the strings $\alpha_i$ the alternatives of the symbol z. The notation $\alpha \to \beta$ means that the string β is obtained from α by the application of some substitution from P; $\alpha \Rightarrow \beta$ means that α = β or $\alpha \to \alpha_1 \to \ldots \to \alpha_n = \beta$. Substitutions of the form $z \to \alpha$ in which $\alpha \in \Sigma V^*$, and the choice of α itself, are called significant; the choice of ε is called empty; a symbol z which has a substitution $z \to \varepsilon$ is said to be vanishing; a string of vanishing symbols is also called a vanishing string. The inference $\alpha \Rightarrow \beta$ is said to be significant if the last substitution in it is significant; the inference $\alpha \Rightarrow \alpha$ is obviously not significant. We introduce as usual the language L(G) defined by the grammar G, and the equivalence of grammars as the identity of the languages defined by them. All the grammars discussed will be regarded as reduced, as in [9]. We will also consider that each symbol of N has at least one nonempty alternative (otherwise it can be removed from all the substitutions without changing the language L(G)). We will also consider that every nonempty alternative belongs to $VN^*$, that is, a terminal symbol in it can be only the first symbol. An arbitrary grammar is easily reduced to this form without changing the language defined by it. We say that the symbol z is separated if $\alpha_i = a_i\alpha_i'$ in formula (1), where $a_i \in \Sigma$, $a_i \ne a_j$ for $i \ne j$. If this is true for i, j [...]
(pseudo-)separated. The class of (pseudo-)separated grammars is denoted by ($\Gamma_{p.s.}$) $\Gamma_s$, and the class of corresponding languages by ($L_{p.s.}$) $L_s$.
We note that in [1] pseudoseparated grammars are less appropriately called semiseparated. Everywhere below when giving examples of grammars we will usually write out only the rules P; this is sufficient if we take into account the fact that the initial non-terminal symbol is always σ, and the other non-terminal symbols stand on the left sides of the formulas.

3. We consider a magazine automaton with only one state. It has an input tape from which the symbols are read out in succession, and a magazine tape infinite in one direction. Formally we define it as the ensemble $(\Sigma, Z, z_0, \delta, \lambda)$, where Σ, Z are alphabets, $\Sigma \cap Z = \emptyset$, $z_0 \in Z$, $\delta \subseteq Z \times (\Sigma \cup \varepsilon) \times Z^*$, $\lambda \subseteq Z^*\Sigma^* \times p(\delta)$. In detail, Σ is the input alphabet, Z is the magazine alphabet, $z_0$ is a symbol defining the initial configuration $z_0\theta$, where $\theta \in \Sigma^*$, δ is the set of substitutions of the form $za \to \beta$ or $z \to \beta$, applicable to configurations $k = \gamma\theta$ from $Z^*\Sigma^*$, where γ is the content of the magazine tape (the head at the right end), θ is the unread remainder of the input tape (head at the left end), and λ is a rule for the application of substitutions, establishing a correspondence between the configurations k and some set λ(k) of substitutions applied to k. This definition of an MM-automaton differs from the usual one by the absence of a set of states (since uniqueness of the state is assumed) and by the explicit isolation of the function λ, which is important for what follows. We also determine the relation $k \to k'$ between configurations, indicating that k' is obtained from k by the application of a substitution from the set λ(k), and also the relation $k \Rightarrow k'$, denoting that k = k' or $k \to k_1 \to \ldots \to k_n = k'$. The sequence $k, k_1, \ldots, k'$ forms the path of the automaton. A configuration is regarded as successful if it consists of the empty string ε, that is, with an empty magazine and an empty remainder of the input string.
If there exists a path from the initial configuration $z_0\theta$ to a successful one, then we consider that the automaton has accepted the string θ. The set of such strings forms the language $L(\mathfrak{A})$, defined by the automaton $\mathfrak{A}$. Automata defining the same language are said to be equivalent. Substitutions δ of the automaton $\mathfrak{A}$ will sometimes be written as a formula which takes outside the brackets the common symbol z:
$$z(t_1 \to \gamma_1 \mid \ldots \mid t_n \to \gamma_n), \qquad (2)$$
where $t_i \in \Sigma \cup \varepsilon$.

4. We will consider automata with two types of functions λ: free automata, for which λ(k) contains all the substitutions applicable to k (we denote such a function by $\lambda_f$), and ordered automata, for which there occurs in λ(k) only that substitution of $\lambda_f(k)$ which in formula (2) is not preceded by any other of $\lambda_f(k)$ (we denote such a function by $\lambda_o$). An automaton is said to be deterministic if $|\lambda(k)| \le 1$ for all k. It is obvious that every ordered MM-automaton is deterministic.
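If $\mathfrak{A}$ is an ordered MM-automaton and in the substitutions of formula (2) $t_i = t_j$ for i [...]
$$t_i \ne t_j \ \text{ for } \ i \ne j; \qquad (U1)$$
$$\text{if } t_i = \varepsilon, \ \text{ then } \ i = n. \qquad (U2)$$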
Everywhere below we will consider that these conditions are included in the definition of an ordered MM-automaton. For a deterministic free MM-automaton the satisfaction of the following conditions is necessary and sufficient:
$$t_i \ne t_j \ \text{ for } \ i \ne j; \qquad (D1)$$
$$\text{if } t_i = \varepsilon, \ \text{ then } \ i = n = 1. \qquad (D2)$$
It is obvious that these conditions are more rigorous than (U1), (U2). If an MM-automaton satisfies only the conditions (U1), (U2) we say that it is pseudodeterministic. We call an ordered MM-automaton $\mathfrak{A}_o$ and a free MM-automaton $\mathfrak{A}_f$ corresponding if they differ only in the function λ (and hence, $\mathfrak{A}_f$ is a pseudodeterministic MM-automaton). It is obvious that $L(\mathfrak{A}_o) \subseteq L(\mathfrak{A}_f)$, since every path for the automaton $\mathfrak{A}_o$ is also a path for the automaton $\mathfrak{A}_f$, because $\lambda_o(k) \subseteq \lambda_f(k)$. In the particular case where $\mathfrak{A}_f$ is such that $L(\mathfrak{A}_f) = L(\mathfrak{A}_o)$, it is called weakly deterministic. This definition is not constructive, since it is not known whether the problem of the equivalence of the corresponding automata $\mathfrak{A}_f$ and $\mathfrak{A}_o$ is solvable. An MM-automaton is said to be laconic if in (2) $t_i = \varepsilon$ implies $\gamma_i = \varepsilon$, that is, irrespective of the input symbol the automaton can only erase the magazine.
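The conditions (U1), (U2), (D1), (D2) and laconicity are purely syntactic checks on formula (2), and a minimal Python sketch may make them concrete. The encoding is an assumption made here for illustration and is not taken from the paper: the substitutions are a dictionary mapping each magazine symbol z to the ordered list of pairs (t, gamma) of formula (2), with the empty string standing for ε; the function names are hypothetical.

```python
def is_pseudodeterministic(rules):
    """Check (U1) and (U2) for every magazine symbol z."""
    for subs in rules.values():
        ts = [t for t, _ in subs]
        if len(set(ts)) != len(ts):      # (U1): the t_i are pairwise distinct
            return False
        if "" in ts[:-1]:                # (U2): t_i = epsilon only for i = n
            return False
    return True


def is_deterministic_free(rules):
    """Check (D1) and (D2) for every magazine symbol z."""
    for subs in rules.values():
        ts = [t for t, _ in subs]
        if len(set(ts)) != len(ts):      # (D1): the t_i are pairwise distinct
            return False
        if "" in ts and len(ts) != 1:    # (D2): t_i = epsilon forces i = n = 1
            return False
    return True


def is_laconic(rules):
    """t_i = epsilon implies gamma_i = epsilon: without reading, only erase."""
    return all(g == "" for subs in rules.values() for t, g in subs if t == "")


# Hypothetical example: S(a -> yS | eps -> eps), y(b -> eps | eps -> eps).
example = {"S": [("a", "yS"), ("", "")], "y": [("b", ""), ("", "")]}
print(is_pseudodeterministic(example), is_deterministic_free(example), is_laconic(example))
```

In this example the automaton is pseudodeterministic and laconic but not deterministic as a free automaton, since both symbols have a reading substitution next to an erasing one.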
5. We now note that free MM-automata are closely connected with grammars. We say that the grammar $G = (V, \Sigma, P, \sigma)$ corresponds to the automaton $\mathfrak{A} = (\Sigma, Z, z_0, \delta, \lambda_f)$ if $Z = V - \Sigma$, $z_0 = \sigma$ and $zt \to \gamma \in \delta \Leftrightarrow z \to t\gamma^R \in P$. It is easy to see that $L(G) = L(\mathfrak{A})$. If $\mathfrak{A}$ is a laconic pseudodeterministic MM-automaton, then it is easy to see that the corresponding grammar G is pseudoseparated. If $\mathfrak{A}$ is a laconic deterministic free MM-automaton and for z there exists the substitution $zt \to \gamma$ with t = ε, then by (D2) there are no other substitutions for z, and by the condition of laconicity it is of the form $z \to \varepsilon$. Therefore the symbol z can be removed from all the right sides of the substitutions. But if there is no such symbol z, then the corresponding grammar G is separated.

Definition. A pseudoseparated grammar G is said to be weakly separated if the MM-automaton corresponding to it is laconic and weakly deterministic.

The concept of weakly separated grammars has some basic features. Namely, such a grammar G, on the one hand, describes a language L(G) in the ordinary terms of generation, without any conditions on the derivation of the strings of the language from σ. On the other hand, if for this
grammar we construct the corresponding free MM-automaton $\mathfrak{A}_f$ and the corresponding ordered MM-automaton $\mathfrak{A}_o$, then since $\mathfrak{A}_f$ is a weakly deterministic MM-automaton, it is equivalent to $\mathfrak{A}_o$; therefore, the language $L(G) = L(\mathfrak{A}_f)$ is accepted by the automaton $\mathfrak{A}_o$. This also indicates the possibility mentioned in section 1 of obtaining directly from the grammar G an efficient parser $\mathfrak{A}_o$. The efficiency consists of its determinacy and of the fact that its programmed realization is not more complex than for deterministic MM-automata accepting separated languages: in both cases the first suitable substitution from formula (2) corresponding to the top magazine symbol is used.

We consider in greater detail the correspondence of automata and grammars. It can be shown that to every path of an MM-automaton
$$k_0 \to k_1 \to \ldots \to k_n,$$
where $k_i \in Z^*\Sigma^*$, there corresponds a left-sided inference in the corresponding grammar G
$$\alpha_0 \to \alpha_1 \to \ldots \to \alpha_n$$
such that if $k_0 = \gamma_0\theta_0$ and $k_i = \gamma_i\theta_i''$, where $\theta_0 = \theta_i'\theta_i''$, then $\alpha_i = \theta_i'\beta_i$, where $\beta_i = \gamma_i^R$; conversely, to an inference in G there corresponds a path of the automaton $\mathfrak{A}$. In particular, if $k_0 = z_0\theta$ and $k_n = \varepsilon$, then $\alpha_0 = z_0$ and $\alpha_n = \theta$, which implies the equality $L(\mathfrak{A}) = L(G)$ mentioned above.

We define for the automata the following concepts: a vanishing symbol $x \in Z$ with the substitution $x \to \varepsilon$, and a string of such symbols; a substitution with a read-out ($za \to \alpha$) and without a read-out ($z \to \alpha$); a path with a reading, whose last applied substitution was with a reading. We say that an inference in the grammar G is ordered if the corresponding path of the MM-automaton $\mathfrak{A}$ is permissible for the corresponding ordered MM-automaton $\mathfrak{A}_o$. For example, in the grammar ($\sigma \to a\sigma y \mid \varepsilon$, $y \to b \mid \varepsilon$) the inference $\sigma \to a\sigma y \to aa\sigma yy \to aayy \to aaby \to aab$ is ordered, and the inference $\sigma \to a\sigma y \to aa\sigma yy \to aayy \to aay \to aab$ is not ordered, although it also is left-sided.
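The ordered automaton for a pseudoseparated grammar can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's program: the grammar is a dictionary from non-terminals to their alternatives in the order written, every symbol is a single character, "S" stands for σ, the empty string stands for ε, and each nonempty alternative is a terminal followed by non-terminals. The function name `ordered_inference` is hypothetical.

```python
def ordered_inference(grammar, start, text):
    """Run the ordered one-state automaton on text.

    Returns the left-sided inference it reconstructs as a list of strings,
    or None if the automaton does not accept text.
    """
    stack = [start]      # magazine; its top is the end of the list
    pos = 0              # position of the input head
    done = ""            # terminals derived so far
    steps = [start]
    while stack:
        z = stack.pop()
        for alt in grammar[z]:                    # alternatives in written order
            if alt == "":                         # z -> eps: erase z, read nothing
                break
            if pos < len(text) and text[pos] == alt[0]:
                pos += 1                          # read the terminal alt[0]
                done += alt[0]
                stack.extend(reversed(alt[1:]))   # push the rest, leftmost on top
                break
        else:
            return None                           # no applicable substitution
        steps.append(done + "".join(reversed(stack)))
    return steps if pos == len(text) else None


# The example grammar of the text: S -> aSy | eps, y -> b | eps.
grammar = {"S": ["aSy", ""], "y": ["b", ""]}
print(ordered_inference(grammar, "S", "aab"))
```

For the string aab the sketch reconstructs the first of the two inferences quoted in the example; the second one is never produced, because with b under the input head the substitution for y that reads b precedes the erasing one.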
Theorem 1
If $\alpha \Rightarrow \theta\alpha'$ and $\alpha \Rightarrow \theta\alpha''$ are ordered significant inferences in the pseudoseparated grammar G, where $\alpha', \alpha'' \in N^*$, then $\alpha'' = \alpha'$.

The proof follows from the fact that the corresponding paths $\gamma\theta \Rightarrow \gamma'$ and $\gamma\theta \Rightarrow \gamma''$, where $\gamma = \alpha^R$, $\gamma' = \alpha'^R$, $\gamma'' = \alpha''^R$, are executed by an ordered MM-automaton, and it is deterministic. This assertion is a generalization of the property of strong determinacy of an arbitrary inference for modified separated grammars proved in [6], since it can be shown that in these grammars every inference is ordered (by using the equality $\Gamma_{1w.s.} = \Gamma_{m.s.}$ and part 2 of Theorem 5). From the definition of weakly separated grammars we easily deduce Theorem 2.

Theorem 2
A pseudoseparated grammar G is weakly separated if and only if for every string θ of L(G) there is an ordered inference.
We note that this definition of a weakly separated grammar is given in [10], where an ordered inference is first defined, of course in other, purely grammatical, terms.

6. In view of the importance of weakly separated grammars mentioned above, there is particular interest in the constructiveness of the definition of them given, that is, the resolvability of the predicate $G \in \Gamma_{w.s.}$ or, what is the same thing, $G \in \Gamma_{p.s.} \Rightarrow G \in \Gamma_{w.s.}$ (since $\Gamma_{w.s.} \subset \Gamma_{p.s.}$ and the predicate $G \in \Gamma_{p.s.}$ is trivially resolvable). This question is not simple, as is shown by the following theorem.
Theorem 3
If we resolve the predicate $G \in \Gamma_{w.s.}$, then we also resolve the predicate
$$(G_1 \in \Gamma_{w.s.} \wedge G_2 \in \Gamma_{w.s.} \Rightarrow L(G_2) \subseteq L(G_1)).$$

Proof. We suppose that there exists an algorithm $A_1$ deciding for an arbitrary CF-grammar G whether it belongs to $\Gamma_{w.s.}$. It is required to construct a corresponding algorithm $A_2$ determining for arbitrary weakly separated grammars $G_1$ and $G_2$ whether the language $L_2 = L(G_2)$ is a subset of the language $L_1 = L(G_1)$. The algorithm $A_2$ amounts to the fact that corresponding to the grammars $G_1 = (\Sigma_1, V_1, \sigma_1, P_1)$ and $G_2 = (\Sigma_2, V_2, \sigma_2, P_2)$, where $N_1 \cap N_2 = \emptyset$ (any pair of grammars is easily reduced to this), there is constructed a grammar $G = (\Sigma, V, \sigma, P)$, where $\Sigma = \Sigma_1 \cup \Sigma_2 \cup \{d\}$, $N = V - \Sigma = N_1 \cup N_2 \cup \{\sigma, x, y, z\}$, $P = P_1 \cup P_2 \cup \{\sigma \to dxy,\ x \to d\sigma_1 z \mid \varepsilon,\ y \to d\sigma_2 z \mid \varepsilon,\ z \to d\}$, and $d \notin \Sigma_1 \cup \Sigma_2$; $\sigma, x, y, z \notin N_1 \cup N_2$.

We prove that $G \in \Gamma_{w.s.} \Leftrightarrow L_2 \subseteq L_1$. Then to complete the algorithm $A_2$ it is sufficient to apply the algorithm $A_1$ to the grammar G constructed.
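The construction of G from $G_1$ and $G_2$ is mechanical, as the following Python sketch suggests. It rests on assumptions made for illustration only: rules are dictionaries from single-character non-terminals to lists of alternatives, the fresh symbols σ, x, y, z are written "S", "X", "Y", "Z", the fresh terminal is "d", and the rule for y is taken in the form $y \to d\sigma_2 z \mid \varepsilon$. The name `combine` is hypothetical.

```python
def combine(p1, s1, p2, s2):
    """Build the rules P of G from the rules p1, p2 of G1, G2.

    s1 and s2 are the initial symbols of G1 and G2.
    """
    fresh = {"S", "X", "Y", "Z"}
    # N1 and N2 must be disjoint, and the fresh non-terminals unused.
    assert not set(p1) & set(p2)
    assert not fresh & (set(p1) | set(p2))
    # The terminal d must not occur in either grammar.
    assert all("d" not in alt for p in (p1, p2) for alts in p.values() for alt in alts)
    p = {**p1, **p2}
    p["S"] = ["dXY"]                # sigma -> dxy
    p["X"] = ["d" + s1 + "Z", ""]   # x -> d sigma_1 z | eps
    p["Y"] = ["d" + s2 + "Z", ""]   # y -> d sigma_2 z | eps
    p["Z"] = ["d"]                  # z -> d
    return p


# Hypothetical input grammars: G1 generates a*, G2 generates {b}.
print(combine({"A": ["aA", ""]}, "A", {"B": ["b"]}, "B"))
```

The order of the two alternatives of x matters: the reading alternative comes first, so the ordered automaton always tries to recognise a string of $L_1$ after the prefix dd, which is what ties membership of G in the class to the inclusion of the languages.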
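Let $L_2 \subseteq L_1$; we prove that in G every string is derived by an ordered inference. For this we note that any inference $\sigma_1 \Rightarrow \theta_1$, $\sigma_2 \Rightarrow \theta_2$ may be regarded as ordered, since $G_1, G_2 \in \Gamma_{w.s.}$. Hence, an unordered inference from σ can occur only on the application of the substitution $x \to \varepsilon$: $\sigma \to dxy \to dy \to dd\sigma_2 z \Rightarrow dd\theta_2 z \to dd\theta_2 d = \tau$. However, $\theta_2 \in L_2 \subseteq L_1$, so that there exists an inference $\sigma_1 \Rightarrow \theta_2$ and therefore, for the string τ there is the ordered inference
$$\sigma \to dxy \to dd\sigma_1 zy \Rightarrow dd\theta_2 zy \to dd\theta_2 dy \to dd\theta_2 d = \tau.$$
Therefore, for any string $\theta \in L(G)$ there is an ordered inference, so that $G \in \Gamma_{w.s.}$ by Theorem 2. Now let $L_2 \not\subseteq L_1$ and $\theta \in L_2 - L_1$. Then for the string $dd\theta d$ there does not exist an ordered inference, so that $G \notin \Gamma_{w.s.}$ by Theorem 2.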
The theorem is proved.

7. We consider constructive sufficient conditions for the membership of the grammar G in the class $\Gamma_{w.s.}$. For this we introduce some concepts. We define as neighbours of the layer γ occurrences of the symbols z and w of N in the string β from $V^*$ such that $\beta = \beta'z\gamma w\beta''$ and the string γ is vanishing. Neighbours for an MM-automaton are defined similarly.
We define as competitors neighbours which have at least one alternative with an identical initial symbol a from Σ, that is, $z \to a\alpha'$, $w \to a\alpha''$, where z is vanishing (for automata, the substitutions $za \to \gamma'$, $wa \to \gamma''$).

Definition. The pseudoseparated grammar G is said to be k-weakly separated (k = 1, 2, 3, 4) if for every pair of competitors z and w present in the strings γ derivable from σ, there is satisfied at least one of the conditions (C1), ..., (Ck):

there are none in γ, (C1)
$z = w$, (C2)
$w \to \varepsilon$ and if $z \to a\alpha$, $w \to a\beta$, then $\alpha = \beta$, (C3)
$w \to \varepsilon$ and if $z \to a\alpha$, $w \to a\beta$, then either $\alpha = \beta v$, where $v \to a\beta v \mid \varepsilon$, or $\beta = \alpha v$, where $v \to a\alpha v \mid \varepsilon$. (C4)
Abbreviation of the notation: $\Gamma_{kw.s.}$ for classes of grammars, $L_{kw.s.}$ for classes of languages. In the same system of notation we include the separated grammars and languages $\Gamma_{0w.s.}$, $L_{0w.s.}$. It is obvious that $\Gamma_{kw.s.} \subseteq \Gamma_{(k+1)w.s.}$ for $0 \le k \le 3$. Examples. In the first example G,= 171w.s. and in the others $G_k \in \Gamma_{kw.s.}$: [...]
L*= {ad} u { (nr) "y, The corresponding languages $L_k$ are as follows: LO= {aR, n>O}) for in>O) ; it is difficult to express $L_3$ by a formula, but its structure is fairly clear; L,,= (&~a~,where 0~ {a, zl}*, and I< 1B 1 for CE{n>*, otherwise 2 is arbitrary}.

Theorem 4
$\Gamma_{4w.s.} \subseteq \Gamma_{w.s.}$.
Proof. By Theorem 2 it is sufficient to prove that if $G \in \Gamma_{4w.s.}$ then for every string $\theta \in L(G)$ there is an ordered inference. For this we consider an arbitrary left-sided inference of θ:
$$\sigma = \theta_0\gamma_0 \to \theta_1\gamma_1 \to \ldots \to \theta_n\gamma_n = \theta, \qquad (3)$$
and show that it can be reconstructed in such a way that it becomes ordered. For this we isolate in this inference the first step which is not ordered: $\theta_i\gamma_i \to \theta_{i+1}\gamma_{i+1}$. By the definition of an ordered inference, this means that the corresponding step of the MM-automaton
$$k_i = \beta_i\tau_i \to \beta_{i+1}\tau_{i+1} = k_{i+1},$$
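where $\theta = \theta_i\tau_i$, $\beta_i = \gamma_i^R$, $\beta_{i+1} = \gamma_{i+1}^R$, is not permissible for the ordered MM-automaton $\mathfrak{A}_o$ corresponding to the grammar G. This means that to $k_i$ there is applied a substitution from $\lambda_f(k_i) - \lambda_o(k_i)$. Since $G \in \Gamma_{p.s.}$, the automaton $\mathfrak{A}_f$ is pseudodeterministic and laconic, that is, $\lambda_f(k_i)$ contains not more than the two substitutions $z \to \varepsilon$ and $za \to \alpha^R$, if $\beta_i = \beta_i'z$, $\tau_i = a\tau_i'$. Here $za \to \alpha^R$ occurs also in $\lambda_o(k_i)$, so that to $k_i$ there is applied in the step $k_i \to k_{i+1}$ the substitution $z \to \varepsilon$, that is, $k_{i+1} = \beta_i'\tau_i$. In the next part of the path from $k_{i+1}$, to the symbols of $\beta_i' = x_m \ldots x_1$ there are applied some (possibly none) of the substitutions of the form $x_j \to \varepsilon$, $1 \le j \le l$, $l \ge 0$, and then $x_{l+1}a \to \beta^R$. Accordingly, in the inference (3) we have
$$\theta_i\gamma_i = \theta_i z x_1 \ldots x_m \to \theta_i x_1 \ldots x_m \Rightarrow \theta_i x_{l+1} \ldots x_m \to \theta_i a\beta x_{l+2} \ldots x_m \Rightarrow \theta_i a\theta' x_{l+2} \ldots x_m, \qquad (4)$$
where $\beta \Rightarrow \theta'$.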
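But the presence of the substitutions $z \to \varepsilon$, $z \to a\alpha$, $x_j \to \varepsilon$ ($1 \le j \le l$), $x_{l+1} \to a\beta$ indicates that z, $x_{l+1}$ are competitors. They satisfy one of the conditions (C2), (C3) or (C4), since by hypothesis $G \in \Gamma_{4w.s.}$.

If (C2) or (C3) is satisfied, then α = β and we have the substitution $x_{l+1} \to \varepsilon$; therefore, the inference (4) can be reconstructed by applying first $z \to a\alpha$, then $\alpha = \beta \Rightarrow \theta'$, then $x_j \to \varepsilon$, $1 \le j \le l+1$:
$$\theta_i\gamma_i \to \theta_i a\alpha x_1 \ldots x_m \Rightarrow \theta_i a\theta' x_1 \ldots x_m \Rightarrow \theta_i a\theta' x_{l+2} \ldots x_m.$$
We obtain an inference of the same string which in an ordered way derives the string $\theta_i a$, longer by one symbol than in (4). Now let (C4) be satisfied for z, $x_{l+1}$. Corresponding to the two possibilities in (C4) we consider two cases.

Case 1. $\alpha = \beta v$, $v \to a\beta v \mid \varepsilon$. Since in (4) there is the fragment $x_{l+1} \to a\beta \Rightarrow a\theta'$, there is also the inference $z \to a\beta v \Rightarrow a\theta'v \to a\theta'$. If this is used in (4), after removing the inference $z \to \varepsilon$ and replacing the inference from $x_{l+1}$ by $x_{l+1} \to \varepsilon$, we obtain the inference
$$\theta_i\gamma_i \to \theta_i a\alpha x_1 \ldots x_m \Rightarrow \theta_i a\theta' x_1 \ldots x_m \Rightarrow \theta_i a\theta' x_{l+2} \ldots x_m$$
of the same string as in (4), but in it at least one more terminal symbol is derived in an ordered way.

Case 2. $\beta = \alpha v$, $v \to a\alpha v \mid \varepsilon$. If we write $L_1 = L(a\alpha)$, then $L(v) = L_1^* = \bigcup_p L_1^p$. Therefore, $\theta \in L(v) \Rightarrow \theta = \tau_1 \ldots \tau_p$, $\tau_i \in L_1$. Since in (4) there exists the fragment $x_{l+1} \to a\beta = a\alpha v \Rightarrow a\theta'$, then $a\theta' = \tau_1 \ldots \tau_{p+1}$, $p \ge 0$, $\tau_i \in L_1$. Therefore it is possible to reconstruct the inference (4) by using for z the inference $z \to a\alpha \Rightarrow \tau_1$; also $x_j \to \varepsilon$ ($1 \le j \le l$); then if p = 0 we have $x_{l+1} \to \varepsilon$, and if p > 0, then $x_{l+1} \to a\beta = a\alpha v \Rightarrow \tau_2 \ldots \tau_{p+1}$. We obtain
$$\theta_i\gamma_i \to \theta_i a\alpha x_1 \ldots x_m \Rightarrow \theta_i\tau_1 x_1 \ldots x_m \Rightarrow \theta_i\tau_1 x_{l+1} \ldots x_m \Rightarrow \theta_i\tau_1 \ldots \tau_{p+1} x_{l+2} \ldots x_m = \theta_i a\theta' x_{l+2} \ldots x_m,$$
that is, again the inference of the same string as in (4), but with one additional terminal symbol in the ordered beginning of the inference.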
The same reorganization can be continued further, until the whole string is derived in an ordered manner. The theorem is proved.

Theorem 5
1. If an inference in a pseudoseparated grammar is not ordered, then there are competitors in some of its strings.
2. In a 1-weakly-separated grammar every left-sided inference from σ is ordered.

Proof. Part 1 is obtained in the proof of Theorem 4; Part 2 follows from Part 1 and the definition of 1-weakly-separated grammars.

Theorem 6
An algorithm exists which for any pseudoseparated grammar and string $\alpha \in N^*$ defines the set S(α) of neighbours encountered in the strings β obtained from α by left-sided inference.

Proof. Let $\alpha \Rightarrow \beta = \beta'z\gamma w\beta''$, where γ is vanishing, that is, (z, w) is a pair of neighbours. Then either z and w are derived from one symbol x of the string α, that is, $(z, w) \in S(x)$, or from some symbol v of the string α there is derived z, and in this case $v \Rightarrow \beta_2 z\gamma_1$, where $\beta' = \beta_1\beta_2$, $\gamma = \gamma_1\gamma_2$. Since the inference is left-sided, in this latter case, as the inference from v is not finished, the remaining part of the string β, that is, $\gamma_2 w\beta''$, is contained in α, that is, $\alpha = \alpha'v\gamma_2 w\beta''$, $v \Rightarrow \beta_2 z\gamma_1$, and $\gamma_1\gamma_2$ is vanishing.
Therefore to find the set S(α) it is necessary, first, to construct the set of neighbours S(x) for all the symbols x, and secondly, the sets K(x) of "potential ends", that is, $K(x) = \{z \mid x \Rightarrow \beta z\gamma,\ \gamma \text{ vanishing}\}$. Then we obtain
$$S(\alpha) = S(x_1 \ldots x_n) = \bigcup_{i \le n} S(x_i) \cup \{(z, w) \mid \exists x\,(\alpha = \beta x\gamma w\delta,\ \gamma \text{ vanishing},\ z \in K(x))\}.$$
The set K(x) is constructed sequentially. First we obtain $K_0(x) = \{x\}$, and then
$$K_{n+1}(x) = K_n(x) \cup \{z \mid \exists w\,(x \to \alpha'w\alpha'',\ z \in K_n(w),\ \alpha'' \text{ vanishing})\}.$$
The process terminates when $K_{n+1}(x) = K_n(x)$ for all x. It is obvious that then $K_n(x) = K(x)$. To construct S(x) we first put
$$S_0(x) = \{(z, w) \mid \exists v\,(x \to \alpha'v\alpha''w\alpha''',\ \alpha'' \text{ vanishing},\ z \in K(v))\},$$
and then
$$S_{n+1}(x) = S_n(x) \cup \{(z, w) \mid \exists v\,(x \to \alpha'v\alpha''),\ (z, w) \in S_n(v)\},$$
until we obtain $S_{n+1}(x) = S_n(x)$ for all x, and then we put $S(x) = S_n(x)$. Since the quantifiers refer not to the inferences, but to the substitutions from P, the number of which is finite, all these constructions are constructive. The theorem is proved.
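The two fixed-point constructions of the proof can be sketched in Python as follows. The encoding is an assumption made for illustration: a grammar is a dictionary from single-character non-terminals to lists of alternatives, the empty string stands for ε, and any character that is not a key is a terminal. The sets are updated in place rather than level by level, which reaches the same limit; the function names are hypothetical.

```python
def vanishing(grammar, s):
    """A string is vanishing if every symbol in it has an empty alternative."""
    return all(c in grammar and "" in grammar[c] for c in s)


def potential_ends(grammar):
    """K(x): non-terminals z with x => beta z gamma, gamma vanishing."""
    K = {x: {x} for x in grammar}                     # K_0(x) = {x}
    changed = True
    while changed:
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                for i, w in enumerate(alt):
                    if w in grammar and vanishing(grammar, alt[i + 1:]):
                        new = K[w] - K[x]
                        if new:
                            K[x] |= new
                            changed = True
    return K


def neighbours(grammar):
    """S(x): pairs of neighbours in strings derived from the symbol x."""
    K = potential_ends(grammar)
    S = {x: set() for x in grammar}
    for x, alts in grammar.items():                   # S_0(x)
        for alt in alts:
            for i, v in enumerate(alt):
                if v not in grammar:
                    continue
                for j in range(i + 1, len(alt)):
                    w = alt[j]
                    if w in grammar and vanishing(grammar, alt[i + 1:j]):
                        S[x] |= {(z, w) for z in K[v]}
    changed = True
    while changed:                                    # closure S_{n+1}(x)
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                for v in alt:
                    if v in grammar and S[v] - S[x]:
                        S[x] |= S[v]
                        changed = True
    return S


def competitors(grammar, start):
    """Neighbours (z, w) with z vanishing and a common initial terminal."""
    def firsts(x):
        return {alt[0] for alt in grammar[x] if alt}
    return {(z, w) for z, w in neighbours(grammar)[start]
            if "" in grammar[z] and firsts(z) & firsts(w)}


grammar = {"S": ["aSy", ""], "y": ["b", ""]}   # sigma -> a sigma y | eps, y -> b | eps
print(neighbours(grammar)["S"], competitors(grammar, "S"))
```

For this small grammar the only competitors are the pair (y, y), which satisfies condition (C2), so under the assumptions above the grammar would be 2-weakly separated.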
The definitions of the classes $\Gamma_{kw.s.}$, $1 \le k \le 4$, are constructive.
Indeed, it is sufficient for the pseudoseparated grammar G to construct the set of neighbours S(σ), to isolate the competitors among them and verify for them conditions (C1)-(C4).

8. We consider in more detail the imbedding $L_{kw.s.} \subseteq L_{(k+1)w.s.}$.

The set $L_{(k+1)w.s.} - L_{kw.s.}$, $0 \le k \le 3$, is non-empty and contains the language $L_{k+1}$.
For brevity we omit the proof.

9. The conditions determining $\Gamma_{kw.s.}$ for $k \ge 2$ can be generalized. We consider this generalization by the example of $\Gamma_{2w.s.}$-grammars. Instead of the requirement x = y for the competitors x and y in the strings γ we require that L(x) = L(y), where L(z) is the language generated in the grammar G from the string z. The proof that $\Gamma_{2w.s.} \subseteq \Gamma_{w.s.}$ is then completely preserved. The drawback of the new condition is that it is not constructive. However, in certain cases the equation L(x) = L(y) can be verified by means of algorithms based on the specific features of the grammar. One such case is where the grammars $G_x = (V, \Sigma, P, x)$ and $G_y = (V, \Sigma, P, y)$ are, after reduction, 1-weakly-separated grammars; then L(x) = L(y) is verified by the algorithm described in [7]. For example, the grammar (~-+&XT] boz 1c, ~-tay 1E, y-dz, z-&z f E) is weakly separated, since Lfx) = L@). The class of grammars thus changed we denote by $\Gamma'_{2w.s.}$. We similarly determine $\Gamma'_{3w.s.}$ (instead of α = β we require L(α) = L(β)) and $\Gamma'_{4w.s.}$ (instead of $\alpha = \beta v$, $v \to a\beta v \mid \varepsilon$ we require that $L(a\alpha) = L(a\beta)^*$). However, these classes $\Gamma'_{kw.s.}$ generate all the same the languages $L_{kw.s.}$, since if, for example, L(x) = L(y), then the grammar can be equivalently transformed, replacing y by x everywhere.

10. From all the weakly separated grammars the 1-weakly-separated grammars are distinguished by the property of uniqueness.
If $G \in \Gamma_{w.s.}$, then $G \in \Gamma_{1w.s.}$ if and only if it is unique.
For brevity the proof is omitted. The classes $\Gamma_{kw.s.}$ are not unique for $k \ge 2$; this is all the more interesting since they are analyzed by means of an ordered MM-automaton. Thereby from all the possible inferences of the string θ this automaton reconstructs only one, namely the ordered one. Therefore, [...]
11. Every pseudoseparated grammar G, besides the language L(G), which it defines uniquely, also defines a language $\tilde L(G)$, the set of terminal strings derived from σ by only ordered inferences. We call such a language ordered-separated; we denote the class of such languages by $L_{o.s.}$. In other words, $L_{o.s.}$ is the class of languages defined by ordered MM-automata with one state. If G is a weakly separated grammar, then by hypothesis $L(G) = \tilde L(G)$, so that $L_{w.s.} \subseteq L_{o.s.}$.

Theorem 9
The difference $L_{o.s.} - L_{w.s.}$ is not empty.
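Proof. Let $G_5 = (\sigma \to axyz \mid bd \mid dd,\ z \to bd \mid dd,\ x \to axy \mid \varepsilon,\ y \to b \mid \varepsilon)$; then for the language $L_5 = L(G_5)$ and for the ordered-separated language $\tilde L_5$ defined by $G_5$ we have
$$L_5 = \{a^n b^{m+1} d \cup a^n b^m dd,\ n \ge m \ge 0\},$$
$$\tilde L_5 = \{a^n b^m dd \cup a^n b^{n+1} d,\ n \ge m \ge 0\}.$$
It can be proved that $\tilde L_5 \notin L_{w.s.}$. We omit the proof of this.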
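The gap between the language of all inferences and the language of ordered inferences for $G_5$ can be observed on short strings with the following Python sketch. It is an illustration under assumptions not found in the paper: "S" stands for σ, the empty string for ε, and the terminal d in non-initial positions is replaced by an auxiliary non-terminal "D" with the single rule D → d, so that every nonempty alternative is a terminal followed by non-terminals. The function `accepts` is hypothetical; with ordered=False it searches all paths of the free automaton, with ordered=True it follows only the first applicable substitution.

```python
def accepts(grammar, start, text, ordered):
    """Acceptance by the free (ordered=False) or ordered one-state automaton."""
    def run(stack, pos):
        if not stack:
            return pos == len(text)          # success: magazine and input both empty
        z, rest = stack[-1], stack[:-1]      # the top of the magazine is the last char
        for alt in grammar[z]:
            if alt == "":
                nxt = (rest, pos)            # z -> eps: erase without reading
            elif pos < len(text) and text[pos] == alt[0]:
                nxt = (rest + alt[:0:-1], pos + 1)   # read alt[0], push the rest reversed
            else:
                continue                     # substitution not applicable here
            if run(*nxt):
                return True
            if ordered:
                return False                 # only the first applicable one is allowed
        return False
    return run(start, 0)


# G5 with d in non-initial positions replaced by D -> d (an assumed normalization).
G5 = {"S": ["axyz", "bD", "dD"], "z": ["bD", "dD"],
      "x": ["axy", ""], "y": ["b", ""], "D": ["d"]}

for s in ["abd", "abbd", "add", "abdd"]:
    print(s, accepts(G5, "S", s, ordered=False), accepts(G5, "S", s, ordered=True))
```

The string abd is accepted by the free automaton only: the ordered automaton lets y read the b, after which the remaining input d cannot be completed by z.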
The properties of MM-automata are studied in detail in [11]. In particular they imply that $L_{o.s.} \subset L_D$, where $L_D - L_{o.s.}$ is not empty and contains the language $\{a^n b^n \cup a^n c^n,\ n \ge 1\}$.

12. So far we have considered weakly separated and pseudoseparated grammars corresponding to laconic automata. But if these automata are not necessarily laconic, then we call the corresponding grammars extended weakly separated and pseudoseparated grammars, abbreviated to $\Gamma_{e.w.s.}$ and $\Gamma_{e.p.s.}$.

Example. The grammar $G_6 = (\sigma \to axy \mid yx,\ y \to byx \mid c,\ x \to a \mid yy)$ belongs to $\Gamma_{e.w.s.}$. To verify this we have to consider the free and ordered MM-automata corresponding to it and show that they are equivalent. For the given case it happens that every path of the free automaton which differs from the path of the ordered automaton arrives at a dead end. This grammar is equivalent to the grammar $G_7 = (\sigma \to axy \mid byxx \mid cx,\ y \to byx \mid c,\ x \to a \mid byxy \mid cy)$, obtained by "substitution" of the rules for y in the alternative yx for σ and yy for x.

Theorem 10
There exists an algorithm A which, given the grammar $G \in \Gamma_{e.w.s.}$, constructs a grammar $A(G) \in \Gamma_{w.s.}$ equivalent to it.

This theorem is proved in [12] as a corollary of more general results. Using a modification $A_1$ of this algorithm we can give the following definition.

Definition. An extended pseudoseparated grammar G is defined as k-extended weakly separated (we denote the class of these grammars by $\Gamma_{ke.w.s.}$) if $A_1(G) \in \Gamma_{kw.s.}$.

This definition directly implies the constructiveness of the classes $\Gamma_{ke.w.s.}$ and the equality $L_{ke.w.s.} = L_{kw.s.}$ for the corresponding classes of languages.
13. We compare the classes LF(k) and $L_{kw.s.}$. As already mentioned, $LF(1) = L_{1w.s.}$. For $k \ge 2$ these classes are completely different. Thus, in [3] the language $L_{gk}$ is generated with the grammar .Gti= ((J-VZISX1e, x-&I d I bk+i), which belongs to LF(k), but not to LF(k - 1). It can be proved that $L_{g2}$ does not belong to $L_{w.s.}$ and all the more not to $L_{kw.s.}$. The proof is based on some properties of automata described in [11]. Moreover the following theorem holds.

Theorem 12
The language $L_2 = \{a^n b^m,\ n \ge m \ge 1\}$ does not belong to LF(k) for any k.

An unsolved problem. Ascertain the composition of the set $LF \cap L_{w.s.}$, where $LF = \bigcup_k LF(k)$. There is a conjecture that this is exactly $L_{1w.s.} = LF(1)$, that is, the languages of $L_{w.s.} - L_{1w.s.}$ do not belong to LF(k) for any k, and the languages of LF - LF(1) do not belong to $L_{w.s.}$.
Translated by J. Berry
REFERENCES
1. FUKSMAN, A. L., On some grammars for the description of context-free languages. Proceedings of the First All-Union Conference on Programming (Tr. I Vses. konf. po programmirovaniyu), Section A, 135-143, IK Akad. Nauk Ukr SSR, Kiev, 1968.
2. KORENJAK, A. J. and HOPCROFT, J. E., Simple deterministic languages. Techn. Rept. No. 51, Princeton Univ., 1966.
3. WOOD, D., The theory of left-factored languages. Part I. Comput. J., 12, 4, 1966; Part II, 13, 2, 1970.
4. FUKSMAN, A. L., The processing of a text on the basis of a separated syntax. In: The development of translators (Razrabotka translyatorov), 18-57, Izd-vo RGU, Rostov-on-Don, 1972.
5. ERSHOV, A. P. et al., Theoretical programming in the USSR. In: System and theoretical programming (Sistemnoe i teoretich. programmirovanie), 30-31, VTs SO Akad. Nauk SSSR, Novosibirsk.
6. KOMOR, T., A property of modified factored grammars. Zh. vychisl. Mat. mat. Fiz., 12, 6, 1612-1615, 1972.
7. FUKSMAN, A. L., The problem of equivalence for a subclass of deterministic magazine automata (Problema ekvivalentnosti dlya odnogo podklassa determinirovannykh magazinnykh avtomatov), Report VTs RGU, 1973.
8. DENISENKO, V. T. and FUKSMAN, A. L., Some classes of languages defined by their relation to left-sided analysis. Proceedings of a seminar "Automation of programming" (Tr. seminara "Avtomatizatsiya programmirovaniya"), IK Akad. Nauk Ukr SSR, Kiev, 1973.
9. GINSBURG, S., Mathematical theory of context-free languages (Matematicheskaya teoriya kontekstno-svobodnykh yazykov), "Mir", Moscow, 1970.
10. KRITSKII, S. P., Continuation of the analysis of a weakly-factored grammar after the detection of an error (Prodolzhenie analiza po slaborazdelennoi grammatike posle obnaruzheniya oshibki), Report VTs RGU, 1973.
11. FUKSMAN, A. L., Ordered magazine automata (Uporyadochennye magazinnye avtomaty), Report VTs RGU, 1972.
12. ABRAMOVICH, S. M. and FUKSMAN, A. L., Ordered magazine automata and some classes of grammars. In: Homogeneous digital computing and integrating structures (Odnorodnye tsifrovye vychisl. i integriruyushchie struktury), 142-150, TRTI, Taganrog, 1975.