COMPUTER GRAPHICS AND IMAGE PROCESSING 3 (1974), 48-62
Linguistic Methods for the Description of a Straight Line on a Grid

R. BRONS

Delft University of Technology, Delft, Netherlands

Communicated by H. Freeman

Received August 6, 1973

This paper describes the construction of strings representing straight lines in an arbitrary direction on a grid and compares some grammatical systems that generate these strings. An algorithm is given that constructs a string representing a straight line on a grid in Freeman's coding scheme. Some number-theoretical aspects of this algorithm are treated (Euclid's algorithm, Farey series, continued fractions). This algorithm is based on structural properties of the string. Strings generated in this way can also be produced with a programmed grammar; Lindenmayer grammars are also very powerful for this kind of problem, because of the simultaneous application of production rules. Another method for constructing strings representing straight lines, where the generation is essentially sequential, for instance when noise is added, is also treated. In this case Lindenmayer grammars are quite useless, but programmed grammars are still very convenient. Rule-labeled programs are less convenient for both kinds of problems.

1. INTRODUCTION

In the literature on pattern recognition, two approaches are given for describing a picture by linguistic methods. The first gives rather general definitions of primitives and their relations. For instance, we have the picture description language of Shaw [15], which is most applicable to "graph-representable" pictures. The other approach uses the elements of a grid as primitives and defines simple relations such as "at the right side of." In this article we restrict ourselves to the latter approach. The grammar for the 45° triangles of Kirsch [7] is the first known example. Dacey [2,3] tried to improve this method. The grammars used were context sensitive.
Chang [1] used a context-free grammar to generate straight horizontal and vertical lines. He could combine these lines to form more complex pictures. Some noise is allowed in his grammar, for there is a possibility that some grid elements in a straight line are not "blackened." Even more noise is allowed in the stochastic context-free programmed grammar (SCFPG) of Swain and Fu [16] and Fu [5], used to generate noisy squares. This grammar generates squares with horizontal and vertical lines and allows deviations perpendicular to the sides. In their grammar the main directions are still the directions of the grid (horizontal and vertical). The main aim of our paper is to produce an extension to arbitrary directions on a

Copyright © 1974 by Academic Press, Inc. All rights of reproduction in any form reserved.
grid. As the pictures to be generated, straight lines are chosen. We restrict ourselves to directions between 0° and 45°; all other directions can be obtained by mirroring. Another aim is to compare grammatical systems recently developed for this kind of problem. In the second section we shall introduce the tools we use for the generation, including the grammatical systems. In the third section we describe straight lines using structural properties; in the fourth section a so-called conditional approach is given.

2. TOOLS
Grammatical Systems

The grammars we need in this paper are defined in this section. For examples, see the Appendix.

(1) We define a common grammar G = [VN, VT, S, P], where VN is a finite nonempty set of nonterminals; VT is a finite nonempty set of terminals (VN ∩ VT = ∅); S is the element of VN that starts the production; P is a finite nonempty set of production or rewriting rules. Production rules have the notation α → β, with α ∈ V+ and β ∈ V*. V = VN ∪ VT; V+ is the set of words that can be constructed from the elements of V; V* = V+ ∪ {λ}, with λ the empty word. When β is derived from α after application of rule r, we write α ⇒(r) β.

(2) Rosenkrantz [12] introduced the programmed grammar PG = [VN, VT, S, J, P], where VN, VT, and S are as before; J is a finite nonempty set of labels; P is a finite nonempty set of labeled production rules. Production rules have the notation

r. α → β   S(r1, r2, ...)   F(r1', r2', ...)

where α ∈ V+; β ∈ V*; r is the label of the rule; S(...) is the success branch; F(...) is the failure branch; S(...) and F(...) may be empty. When the rule is applicable, one of the labels in the success branch indicates what rule must be applied next. When the rule is not applicable, we have to look in the failure branch.

(3) Van Leeuwen [8] defines rule-labeled programs (RLP), which have some resemblance to programmed grammars. He does not use a special set of labels but determines which nonterminal must be rewritten. After each production, there is a test whether a symbol occurs or not. The formal definition is rather complex; we restrict ourselves to explaining how it works with an example in the Appendix. We mention a few differences between PG's and RLP's:

(a) In a PG, the label sometimes can be chosen at will; then the rule to
be executed is determined. In an RLP, the label is deterministically indicated, but as to the rules it is possible to make a choice.

(b) In an RLP there is no failure branch.

(c) When a PG is correct, it produces an element of the language. A correct RLP is not always successful: after apply false, no string is produced.

Van Leeuwen extensively discusses the power of RLP's. In comparison with PG's, they have the same power in the context-free case; however, they demand a more complicated construction.

(4) The next rewriting systems differ considerably from the usual ones. They have been developed by the biologist Lindenmayer [9]. A Lindenmayer grammar LG = [V, P, σ], where V is a finite nonempty set, the vocabulary (no difference between terminals and nonterminals); P is a finite nonempty set of production rules; σ is a nonempty string of symbols in V, called the axiom. Production rules have the notation α → β, with α, β ∈ V+. The axiom is the string that starts the production; the axiom and each derived string are elements of the language. At each derivation, every symbol is rewritten by one of the rules, if possible, so the rules are applied simultaneously. Rozenberg and Doucet [13] define 0LG's, in which the left side of each production rule consists of one element of V. In a 1LG, there is at least one production rule with two symbols on the left side. Rozenberg [14] gives an interesting extension of 0L-grammars, the table 0L-grammars (TOLG). In a TOLG, the production rules are grouped in tables, and only the rules of one table are applied at a time. This idea is also of biological origin, for there exist different developments as to day or night, summer or winter, etc. Rozenberg [14] proved that every language generated by a TOLG can be generated by a context-free programmed grammar (CFPG); the converse is not true.
(5) We shall use two extensions of PG's:

(a) PG's with a tail. At the end of a string is a substring of nonterminals, which has a bookkeeping function; for a definition, see Fu [5]. When the production stops, nonterminals can still be in the string. When we do not define a semantic interpretation of these symbols, their occurrence in the final string is of no importance.

(b) Stochastic PG's (SPG) (see Swain and Fu [16]). An SPG = [VN, VT, S, J, P, D], where VN, VT, S, and J are as in a PG; D is a finite nonempty set of probabilities; P is a finite nonempty set of labeled production rules, with given probabilities of choosing labels. Notation of a rule:

r. α → β   S(r1, r2, ...) (p(r1), p(r2), ...)   F(...) (...)

in which p(r1) is the probability of choosing r1.
Freeman's Chain Coding Scheme

According to Fig. 1, Freeman [4] quantizes the direction of a curve by elements coded with the integers 0,1,2,...,7. The chain of elements is represented by a string of symbols (the numbers 0-7). Freeman states that strings representing straight lines must possess three specific properties: (1) at most two types of symbols can be present, and these can differ only by unity, modulo eight; (2) one of the two symbols always occurs singly; (3) successive occurrences of the single symbol are as uniformly spaced as possible. Properties 2 and 3 can be taken together: the occurrences of each of the symbols are as uniformly spaced as possible. Due to the restriction to directions between 0° and 45°, we only need the symbols 0 and 1, so the direction of a line can be expressed by a slope between 0 and 1. When the slope is an irreducible rational fraction, the string is periodic, and the length of a period is the denominator of the fraction. For example, one period of the string for a straight line with slope 2/5 can be expressed as 01010, 00101, 10010, 01001, or 10100. Which of these periods is chosen is not important, because the bounds of the period can be placed anywhere; the choice does not follow from the properties stated by Freeman, since it depends on information about the place of the origin of the line, which is of no importance for the form of the line. In Section 3 we give an algorithm and some grammars based on our attempt to space the symbols as uniformly as possible.
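That the five periods listed for slope 2/5 are cyclic shifts of one and the same string can be checked mechanically. A small sketch (the string-doubling rotation test is a standard trick, not taken from the paper):

```python
def is_rotation(s, t):
    # t is a cyclic shift of s exactly when t occurs inside s + s
    return len(s) == len(t) and t in s + s

periods = ["01010", "00101", "10010", "01001", "10100"]
print(all(is_rotation(periods[0], p) for p in periods))  # -> True
```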
Farey Series

Montanari [10] states that on a grid, depending upon the number of neighbor elements, certain slopes are possible. With n neighbors at the right side of the grid element, the slopes are indicated by an nth-order Farey series. For instance, with six neighbors,
F(6) = {0/1, 1/6, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 5/6, 1/1}.

Farey series are well known in number theory. Hardy and Wright [6] give a theorem which is useful for the construction of F(n + 1) from F(n). When
FIG. 1. Chain encoding scheme.
h/k, h″/k″, and h′/k′ are three successive terms of F(n + 1), then h″/k″ = (h + h′)/(k + k′). To construct F(n + 1), we look for successive pairs h/k and h′/k′ in F(n) that satisfy the condition k + k′ = n + 1. Between these pairs is a fraction
(h + h′)/(k + k′) in F(n + 1). We note that in Farey series, the interval between two successive numbers is at its largest between 0 and 1/n, and between (n − 1)/n and 1. When we apply this to angles for large n, the first interval is 1/n radians, the latter 1/(2n). So the accuracy one can reach by a Farey series is limited by 1/n for directions near the main directions of the grid (in our case near 0°).

3. A STRUCTURAL APPROACH
A Structural Algorithm

Based upon a spacing of the symbols "as uniformly as possible," we developed an algorithm for the construction of a straight line. We are only interested in one period, since the other parts of the line are just repetitions of this period. In the case of just one specimen of the least occurring symbol, there is no problem: we can place this symbol arbitrarily in the period, and for convenience's sake we always choose the end of the period. When more specimens occur, we construct subperiods, each containing one of the symbols which occur least frequently. There are two kinds of subperiods, one with say n of the most frequently occurring symbols, the other with n + 1. Next, we have to space these different subperiods as uniformly as possible. We now determine the number of the subperiods which are least frequent. If the number is one, we place it at the end of the string; otherwise we continue the process by constructing larger subperiods from the previous ones. We repeat this procedure until one subperiod appears singly. To construct the period belonging to a slope p/q, we use a recursive algorithm with the following steps (the subscripts serve to count the number of times the algorithm is applied):

(1) Read p and q under the assumption 0 < p < q; p and q are integers.
(2) Reduce p/q if possible; call the new values p_0 and q_0.
(3) Calculate r_0 = q_0 − p_0. There are r_0 symbols 0 and p_0 symbols 1 in a period.
(4) Let c_i = MAX(p_i, r_i); call symbols occurring c_i times C_i.
(5) Let t_i = MIN(p_i, r_i); call symbols occurring t_i times D_i.
(6) If t_i = 1, generate the period C_i^{c_i} D_i and stop the program.
(7) Calculate n_i = ENTIER(c_i/t_i).
(8) Calculate p_{i+1} = c_i − t_i·n_i.
(9) Calculate r_{i+1} = t_i − p_{i+1}.
(10) Generate r_{i+1} subperiods C_i^{n_i} D_i; call them B_{i+1}.
(11) Generate p_{i+1} subperiods C_i^{n_i+1} D_i; call them A_{i+1}.
(12) Go to step 4.

The condition 0 < p < q in step 1 is due to the restriction to slopes between 0 and 1. The reduction in step 2 is necessary, because otherwise t_i will become 0, and then step 7 is not allowed.

Example. In constructing the period for slope 14/39, we first see p_0 = 14 and q_0 = 39, so r_0 = 25. A zeroth approximation could be 0^25 1^14. By applying the algorithm one time, we find 11 subperiods 001 and 3 subperiods 01, so a first approximation could be (001)^11 (01)^3. Applying the algorithm for the second time yields two subperiods (001)^4 01 and one subperiod (001)^3 01. The third time, one of the subperiods occurs singly, so we finish the construction. The string is ((001)^4 01)^2 (001)^3 01, or

001001001001010010010010010100100100101.

See Fig. 2. After some reductions, we find that the equations for t_{i+1} and c_{i+1} in relation to t_i, c_i, and n_i are

t_{i+1} = MIN(c_i − t_i·n_i, t_i·(1 + n_i) − c_i),
c_{i+1} = MAX(c_i − t_i·n_i, t_i·(1 + n_i) − c_i).
We conclude t_{i+1} ≤ t_i/2, so the number of times the algorithm has to be applied is at most ENTIER(2LOG(MIN(p_0, r_0))). The case of slowest convergence is when every n_i = 1, and then

t_{i+1} = c_i − t_i,    c_{i+1} = 2t_i − c_i.
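The twelve steps above can be rendered compactly in code. The following sketch follows the paper's algorithm, with C standing for the currently more frequent subperiod and D for the less frequent one; the variable names are ours:

```python
from math import gcd

def structural_period(p, q):
    # steps 1-3: reduce p/q and count symbols (r zeros, p ones)
    g = gcd(p, q)
    p, q = p // g, q // g
    r = q - p
    C, D = ("0", "1") if r >= p else ("1", "0")   # C occurs c times, D occurs t times
    c, t = max(p, r), min(p, r)
    while t != 1:                                  # step 6: stop when D occurs singly
        n = c // t                                 # step 7
        p1 = c - t * n                             # step 8: count of A-subperiods
        r1 = t - p1                                # step 9: count of B-subperiods
        A, B = C * (n + 1) + D, C * n + D          # steps 10-11: build both subperiods
        C, D = (A, B) if p1 >= r1 else (B, A)      # steps 4-5 on the new subperiods
        c, t = max(p1, r1), min(p1, r1)
    return C * c + D

print(structural_period(14, 39))
# -> 001001001001010010010010010100100100101
```

For slope 2/5 the same function returns 00101, one of the five equivalent periods listed in Section 2.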
FIG. 2. Approximation of a straight line with slope 14/39.
When we apply the algorithm in the opposite direction and start from the pair of numbers (t_{i+1}, c_{i+1}) = (1,1), we find the following pairs: (1,1), (2,3), (5,7), (12,17), (29,41), (70,99), .... In the theory of numbers [6], this is known to be a good approximation of √2. This can easily be verified by writing one of the pairs as a continued fraction, and by writing √2 as a continued fraction (√2 = 1 + 1/(2 + 1/(2 + ...))).
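The pairs and their convergence to √2 can be checked numerically. A short sketch of the reverse recursion (t, c) → (t + c, 2t + c), read off from the forward equations with every n_i = 1:

```python
t, c = 1, 1
pairs = [(t, c)]
for _ in range(5):
    t, c = t + c, 2 * t + c      # reverse of: t' = c - t, c' = 2t - c
    pairs.append((t, c))

print(pairs)   # -> [(1, 1), (2, 3), (5, 7), (12, 17), (29, 41), (70, 99)]
print(c / t)   # 99/70 = 1.41428..., close to sqrt(2) = 1.41421...
```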
The algorithm bears some resemblance to Euclid's algorithm for finding the greatest common divisor of two numbers, well known in the theory of numbers. Euclid's algorithm can be written as d_i = n_i·d_{i+1} + d_{i+2}, where n_i = ENTIER(d_i/d_{i+1}) and d_0 and d_1 are the numbers in question. Here also the slowest convergence appears when n_i = 1 for all i. Applying this algorithm in the opposite way, starting with (1,1), we find the Fibonacci series (1, 1, 2, 3, 5, 8, 13, 21, 34, ...). Each quotient of two successive numbers is an approximation of (√5 + 1)/2, which can be verified by their continued fractions ((√5 + 1)/2 = 1 + 1/(1 + 1/(1 + ...))). With the help of continued fractions, we can give an approximation of irrationals by rationals. In constructing certain directions with some accuracy, we can use continued fractions (for the best approximation) and Farey series (for the accuracy).

Grammars for the Structural Approach
The periods constructed by the structural algorithm can be regarded as elements of a language. In this subsection we shall give grammars that generate this language. First we give a programmed grammar PG = [VN, VT, S, J, P] with

VN = {A, B, C, D, E, F, S},  VT = {0, 1},

J = {1, 2, ..., 15},
P = {
 1. S → F      S(2,3)
 2. F → C      S(12,14)
 3. F → CD     S(4,6,8,12,14)
 4. D → E      S(4)    F(5)
 5. E → CD     S(5)    F(4,6,8,12,14)
 6. C → A      S(6)    F(7)
 7. D → B      S(7)    F(10)
 8. C → B      S(8)    F(9)
 9. D → A      S(9)    F(10)
10. A → CCD    S(10)   F(11)
11. B → CD     S(11)   F(4,6,8,12,14)
12. C → 0      S(12)   F(13)
13. D → 1      S(13)
14. C → 1      S(14)   F(15)
15. D → 0      S(15)
}.
The rules can be divided into four groups: {1,2,3}: the starting rules; they are used only once. {12,13,14,15}: the end rules; when arriving at these rules, the final result is determined. {6,7,8,9,10,11}: these rules complicate the period; the number of times that these rules are applied equals the number of applications of the structural algorithm in constructing the period. {4,5}: the number of times that this group is used before the group {6,7,8,9,10,11} determines the number of the most frequently occurring symbols or subperiods.
Example.

S ⇒(1) F ⇒(3) CD ⇒(4,5) CCD ⇒(6,7) AAB ⇒(10,11) (CCD)^2 CD ⇒(4,5) (C^4 D)^2 C^3 D ⇒(6,7) (A^4 B)^2 A^3 B ⇒(10,11) ((CCD)^4 CD)^2 (CCD)^3 CD ⇒(12,13) ((001)^4 01)^2 (001)^3 01.
This example generates a period of a string representing a line with a slope of 14/39. An attempt to find a rule-labeled program for this language was not successful. Very essential in the PG is that there are different rules for rewriting, for instance, D, but there is little freedom of choice. In a rule-labeled program, a great number of tests must be added or a great number of false productions will be obtained. It will be very difficult to find a rule-labeled program for this language. Much more successful was an attempt to construct a Lindenmayer grammar. We give a TOLG = [V, P, σ], with

V = {0, 1},  σ = 0,
P = 1. {1 → 0, 0 → 1},
    2. {1 → 01, 0 → 01},
    3. {1 → 001, 0 → 01},
    4. {1 → 01, 0 → 001}.
The first group inverts the string; the second has the same function as rules {4,5} in the PG; the third and fourth group do the same as {6,7,8,9,10,11}. These last two groups can be replaced by the group {1 → 01, 0 → 0}. Now we have found an even simpler grammar with the same power. The table of production rules is now given by:

P = 1. {1 → 0, 0 → 1},
    2. {1 → 1, 0 → 01},
    3. {1 → 01, 0 → 0}.
Example.

0 ⇒ 01 ⇒ 001 ⇒ (01)^2 1 ⇒ (001)^2 01 ⇒ (0^3 1)^2 0^2 1 ⇒ (0^4 1)^2 0^3 1 ⇒ ((01)^4 1)^2 (01)^3 1 ⇒ ((001)^4 01)^2 (001)^3 01.

Again slope 14/39 is represented. This Lindenmayer grammar is simpler than the programmed grammar. This can be explained by the fact that it is essential in L-grammars that rewriting one symbol of a kind implies that all symbols of that kind are rewritten in the same way. An L-grammar is very convenient for this purpose due to the simultaneous application of the rules. In a PG this must be simulated, which costs additional rules.
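The simpler three-table grammar is easy to simulate. In the sketch below, the table sequence [2, 3, 2, 3, 3, 3, 2, 3] is our reconstruction of the derivation above; starting from the axiom 0 it reproduces the period for slope 14/39:

```python
TABLES = {
    1: {"0": "1", "1": "0"},    # invert
    2: {"0": "01", "1": "1"},
    3: {"0": "0", "1": "01"},
}

def derive(axiom, sequence):
    s = axiom
    for k in sequence:
        # simultaneous rewriting: every symbol is replaced in the same step
        s = "".join(TABLES[k][ch] for ch in s)
    return s

print(derive("0", [2, 3, 2, 3, 3, 3, 2, 3]))
# -> 001001001001010010010010010100100100101
```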
Noise and Parsing

It is of no use to add noise in the structural approach and in these structural grammars: the structure would vanish. Parsing with this kind of grammar is only possible with a noiseless period, and this period must coincide with the period generated. For practical purposes, this is a very uninteresting problem; parsing methods based on this kind of grammar can only be applied when one knows almost everything about the line! In the next section we describe a method better suited to problems with noise and to parsing.

4. A CONDITIONAL APPROACH
Introduction

In the previous section we used structural properties of the line; until the generation is almost finished, nothing of the final string exists. In this section we describe a sequential method: the symbols are generated successively. The symbol chosen next depends on a certain condition; that is why it is called a conditional approach. The idea of a period does not play any role. Morse [11] gives a straight-line generation algorithm. Each time
he chooses a new symbol, he has to test whether it satisfies a certain condition. If so, this is the right symbol; if not, he chooses another one. His method can be simplified in the following manner. The equation for a straight line is y = (p/q)x + y_b with 0 ≤ p/q < 1. Each symbol 0 or 1 adds one step in the x-direction. The best point in the y-direction at the ith step is the one where |y_i − (p/q)i − y_b| is minimal. We find this by rounding (p/q)i + y_b to the nearest integer, or y_i = ENTIER((p/q)i + y_b + 1/2). The Freeman symbol we find at the ith step is y_i − y_{i−1}. Most plot routines are based on this method.

Grammars for the Conditional Approach
We give a programmed grammar based on the method just mentioned: PG = [VN, VT, S, J, P] with

VN = {U, T, S},  VT = {0, 1},  J = {1, 2, 3, 4, 5, 6},

P = {
1. S → UT^m     S(4)
2. UT^q → 1U    S(4)     F(3)
3. U → 0U       S(4)
4. U → UT^p     S(2,5)
5. UT^q → 1              F(6)
6. U → 0
}.

p and q again determine the slope p/q; m depends on y_b: m = (y_b + 1/2)·q. This grammar is not context free. By the rules 2 and 5 it is tested whether q symbols T occur: if so, the next symbol to be generated is 1, if not 0. As an example, we take p/q = 2/5 and y_b = 3/10, so m = 4. The successive productions for the first five symbols are:

S ⇒(1) UT^4 ⇒(4) UT^6 ⇒(2) 1UT ⇒(4) 1UT^3 ⇒(2,3) 10UT^3 ⇒(4) 10UT^5 ⇒(2) 101U ⇒(4) 101UT^2 ⇒(2,3) 1010UT^2 ⇒(4) 1010UT^4 ⇒(5,6) 10100T^4.
The meaning of T is not defined, so we can neglect this symbol. The generated chain is 10100. A rule-labeled program producing the same is (see Appendix, Example 3, for the notation):

begin   UT^{m+p} : apply (UT^q); apply (U);
UT^q    1UT^p | 1 : apply (UT^q); apply (U); apply end;
U       0UT^p | 0 : apply (UT^q); apply (U); apply end;
end.
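The control flow of the six rules can be simulated by keeping the T-tail as a counter. A sketch (the mapping from rules to code lines is our reading of the grammar):

```python
def conditional_chain(p, q, m, length):
    out, t = [], m              # rule 1: S -> U T^m
    for _ in range(length):
        t += p                  # rule 4: U -> U T^p
        if t >= q:              # rules 2/5: does U T^q occur?
            out.append("1")     # emit a 1 and absorb q T's
            t -= q
        else:                   # failure branches F(3)/F(6): emit a 0
            out.append("0")
    return "".join(out)

print(conditional_chain(2, 5, 4, 5))  # -> 10100
```

With m = 0 the same counter produces 00101, another period of the slope-2/5 line.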
It is impossible to construct a Lindenmayer grammar which uses this sequential method of production, unless perhaps we construct something like a programmed Lindenmayer grammar.

Noisy Straight Lines
The straight lines generated by the grammars above have discretization noise. Inspired by the generation of noisy squares as described by Fu [5] and Swain and Fu [16], we developed a method to add another kind of noise. We use an SPG and allow the symbols 2 and 7 besides 0 and 1. Depending on the number of T's in the tail, the probability for the production of a 2, 1, 0, or 7 is determined. The grammar is SPG = [VN, VT, S, J, P, D] with

VN = {S, U, T, R},  VT = {2, 1, 0, 7},  J = {1, 2, ..., 11}.

The labeled production rules with their branches and the probabilities of choosing the labels are:

 1. S → UT^{m+p}       S(2) (1)
 2. T^{q+p} → T^{q+p}  S(5,6,7,8) (p1, α, α^2, α^3)   F(3) (1)
 3. T^q → T^q          S(5,6,7,8) (α, p2, α, α^2)     F(4) (1)
 4. T → T              S(5,6,7,8) (α^2, α, p2, α)     F(5,6,7,8) (α^3, α^2, α, p1)
 5. U → 2UR^{2q−p}     S(10) (1)
 6. U → 1UR^{q−p}      S(10) (1)
 7. U → 0UT^p          S(9) (1)
 8. U → 7UT^{q+p}      S(9) (1)
 9. TR → λ             S(9) (1)    F(2,11) (1−β, β)
10. RT → λ             S(10) (1)   F(2,11) (1−β, β)
11. U → λ

with

p1 = 1 − α − α^2 − α^3,  p2 = 1 − 2α − α^2,  0 < p1 < 1,  0 < p2 < 1.

λ is the empty word: TR → λ means that both T and R vanish. Rules 2-4 test the number of T's in the tail; rules 5-8 produce a symbol and adjust the tail, the R's cancelling against T's by rules 9 and 10. We give an example with p/q = 2/5 and y_b = 3/10, or m = 4. A period of an ideal straight line is 01010. For α = β = 1/20, a string is

101720101000110020.

See Fig. 3. With other choices of α, or by changing the probabilities in the branches in
FIG. 3. Noisy straight line.
another way, other kinds of noisy lines can be generated. To construct a rule-labeled program will not be simple, perhaps impossible. It is essential that the probability of rewriting U depends upon the number of T's. In a rule-labeled program, these probabilities cannot be obtained from this number, but must be fixed; so a rule-labeled program has no memory. Only when a memory is simulated in an RLP, if that is possible, can noisy straight lines be generated.
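The stochastic generation can likewise be sketched with the tail kept as a counter. The probability vectors follow our reading of the grammar above; the handling of the no-T case and the clamping of the counter (the grammar instead leaves R's in the tail) are our simplifications:

```python
import random

def noisy_chain(p, q, m, a=0.05, b=0.05, seed=1):
    rng = random.Random(seed)
    p1 = 1 - a - a**2 - a**3
    p2 = 1 - 2 * a - a**2
    out, t = [], m + p                 # rule 1: S -> U T^(m+p)
    while True:
        if t >= q + p:                 # rule 2: many T's, prefer a 2
            w = [p1, a, a**2, a**3]
        elif t >= q:                   # rule 3: prefer a 1
            w = [a, p2, a, a**2]
        elif t >= 1:                   # rule 4: prefer a 0
            w = [a**2, a, p2, a]
        else:                          # no T's left: prefer a 7 (assumed vector)
            w = [a**3, a**2, a, p1]
        sym = rng.choices("2107", weights=w)[0]
        out.append(sym)
        # rules 5-8 (with 9/10 cancelling T against R): adjust the count
        t += p - q * {"2": 2, "1": 1, "0": 0, "7": -1}[sym]
        t = max(t, 0)                  # simplification: drop any leftover R's
        if rng.random() < b:           # failure branches (2, 11): stop with prob. b
            return "".join(out)

print(noisy_chain(2, 5, 4))
```

For small α the output stays close to repetitions of 10100, with occasional 2's and 7's, as in the example string above.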
The Parsing of Noisy Straight Lines

To test whether a given string can be generated by a given grammar, the string is generated by this grammar and the product of the production probabilities is computed, giving the production probability of this string. If this product is higher than a given value (depending upon the length of the string), one may conclude that the string belongs to this grammar. This method makes it possible to compute the most likely direction of a string with a certain accuracy. To obtain a certain accuracy, we first determine (see Section 2) which order of Farey series is needed. Next, we construct all SPG's with the p/q's from the Farey series found. For each p/q, we need q values of m (0 ≤ m < q).
After we had performed this research, a report of Reggiori [17] appeared, containing an algorithm for the generation of an encoded straight line between two points on a grid. His algorithm can be interpreted as a combination of our structural and conditional algorithms. First, the structural algorithm is applied once; the result is two subperiods. These subperiods are then concatenated by a method based on the conditional algorithm. His approach is very useful: it is more efficient than the conditional algorithm, and it can be applied very simply, in contrast to our structural algorithm. On the other hand, the structural algorithm gives a better idea of the structure of a line; for instance, the notion of a period does not exist in the approach of Reggiori.

ACKNOWLEDGMENT
This work has been performed in the Pattern Recognition group in Delft. I am very grateful for the help of many members of this group, especially
Prof. dr. ir. C.J.D.M. Verhagen, Dr. ir. J.C. Joosten, Ir. F.C.A. Groen, J. Karman, and A.H.E. Nienaber.

APPENDIX: EXAMPLES OF GRAMMATICAL SYSTEMS
In this Appendix we give some examples of the grammatical systems described in Section 2, which generate the language a^n b^n c^n d^n.

(1) G = [VN, VT, S, P] is a grammar with

VN = {S, E, F},  VT = {a, b, c, d},

P = {
1. S → ESF
2. S → abcd
3. Ea → aE
4. Eb → abb
5. dF → Fd
6. cF → ccd
}.
The numbers in P are for reference only. Suppose after n − 1 applications of rule 1, rule 2 is applied. Then we have the string E^{n−1} abcd F^{n−1}. String and rules are symmetric: no exchange being possible between the left and the right half, we can restrict our attention to the left part and the rules 3 and 4. The nonterminal E can only move up to the right (rule 3) or vanish at the confrontation with b (rule 4), producing ab. The production stops when all E's have been eliminated. As far as known, this is the simplest context-sensitive grammar that generates the language a^n b^n c^n d^n (n ≥ 1). Usually a more complex grammar is given; for instance, Swain and Fu [16] give a grammar with 19 rules and compare this with a PG of seven rules. When the previous grammar is compared with the PG given below, the advantages of the PG are less spectacular, though still existing.

(2) PG = [VN, VT, S, J, P] is a programmed grammar with

VN = {S, E, F},  VT = {a, b, c, d},
J = {1, 2, 3, 4, 5},

P = {
1. S → EF    S(2,4)
2. E → aEb   S(3)
3. F → cFd   S(2,4)
4. E → ab    S(5)
5. F → cd
}.
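The control flow of this programmed grammar can be simulated directly; a sketch (the helper name is ours):

```python
def pg_anbncndn(n):
    s = "EF"                              # rule 1: S -> EF
    for _ in range(n - 1):
        s = s.replace("E", "aEb", 1)      # rule 2, success branch S(3)
        s = s.replace("F", "cFd", 1)      # rule 3, back to S(2,4)
    s = s.replace("E", "ab", 1)           # rule 4
    s = s.replace("F", "cd", 1)           # rule 5
    return s

print(pg_anbncndn(3))  # -> aaabbbcccddd
```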
The failure branch of this PG is empty (in Section 3 an example with a nonempty failure branch is given). Here n is determined by the number of applications of rules 2 and 3. The language generated is again a^n b^n c^n d^n (n ≥ 1).

(3) An RLP (rule-labeled program) which generates the same language is the following:
begin   EF : apply (E);
E       aEb | ab : apply (F); apply false;
F       cFd | cd : apply (E); occur (F), apply false; apply end;
false; end.

The notation is that of van Leeuwen [8]; he used the opposite direction of the arrows. Writing an RLP is very similar to writing a computer program. We will explain the "statements." First, we begin and construct a string EF. Then we have to execute the statement apply (E); the meaning of this statement is: if E occurs in the string, then rewrite E, else execute the next statement. So in our case we have to rewrite E by aEb or ab. After that, we can rewrite F by cFd or cd. Now we find apply (E) again: first, the occurrence of E is tested. When no E occurs, we find occur (F), apply false. This statement first tests whether F occurs or not. If F occurs, the generation must be stopped without producing an element of the language. When no F occurs, we find apply end, and an element of the language is produced. The apply false in the second rule is to test whether the number of F's is the same as the number of E's.

(4) LG = [V, P, σ] is a Lindenmayer grammar with
V = {a, b, c, d},  σ = abcd,  P = {ab → aabb, cd → ccdd}.

This is a very simple 1L-grammar that generates the same language. It is a good illustration of the advantages of simultaneous productions.

REFERENCES
1. S. K. Chang, Picture processing grammar and its applications, Information Science 3, 1971, 121-148.
2. M. F. Dacey, The syntax of a triangle and some other figures, Pattern Recognition 2, No. 1, 1970, 11-32.
3. M. F. Dacey, Poly: A two-dimensional language for a class of polygons, Pattern Recognition 3, No. 2, 1971, 197-208.
4. H. Freeman, Boundary encoding and processing, in Picture Processing and Psychopictorics, B. S. Lipkin and A. Rosenfeld, Eds., Academic Press, New York, 1970, pp. 241-266.
5. K. S. Fu, Stochastic automata, stochastic languages and pattern recognition, Journal of Cybernetics 1, No. 3, 1971, 31-49.
6. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 4th ed., Oxford at the Clarendon Press, London, 1960, Chaps. III and X.
7. R. A. Kirsch, Computer interpretation of English text and picture patterns, IEEE Trans. Electronic Computers 13, 1964, 363-376.
8. J. van Leeuwen, Rule-labeled programs, Doctoral Dissertation, R. University Utrecht (Holland), June 1972.
9. A. Lindenmayer, Mathematical models for cellular interaction in development I, II, J. Theoret. Biol. 18, 1968, 280-315.
10. U. G. Montanari, A method for obtaining skeletons using a quasi-euclidean distance, J. ACM 15, No. 4, 1968, 600-624.
11. S. P. Morse, Computer storage of contour-map data, Proceedings 1968 ACM Nat. Conf., 1968, 45-51.
12. D. Rosenkrantz, Programmed grammars and classes of formal languages, J. ACM 16, No. 1, 1969, 107-131.
13. G. Rozenberg and P. G. Doucet, On 0L-languages, Information and Control 19, No. 4, 1971, 302-318.
14. G. Rozenberg, T0L systems and languages, ERCU-Publication 103, Mathematical Institute, R. University Utrecht (Holland), May 1971.
15. A. C. Shaw, Parsing of graph-representable pictures, J. ACM 17, No. 3, 1970, 453-481.
16. P. H. Swain and K. S. Fu, Stochastic programmed grammars for syntactic pattern recognition, Pattern Recognition 4, No. 1, 1972, 83-101.
17. G. B. Reggiori, Digital computer transformations for irregular line drawings, Technical Report 403-22, New York University, April 1972.