ELSEVIER
Information
Processing
Letters 64 ( 1997) 53-59
Recognizable subsets of the two letter plactic monoid A. Arnolda,‘, M. Kanta b*2,D. Krob b** a La&i (CNRS), Vniversift! de Bordeaux, 351, Cours de la Liberation, 33405 Talence, France b Liafa (CNRS), Vniversite’ Paris 7, 2 Place Jussieu, 75251 Paris Cedex, France Received 25 June 1997; revised 18 September Communicated by L. Boasson
1997
Abstract The plactic monoid %(a, b) on two letters will be studied from the point of view of classical language theory. In particular, we will give the fine structure of its recognizable subsets. @ 1997 Elsevier Science B.V. Keywords: Formal languages;
Plastic monoid; Tableaux; Recognizable
languages;
1. Introduction The history of the plactic monoid can be traced back to Schensted’s paper of 1961 [8] in which an algorithm was given that allows to find the length of the largest increasing subsequence of a sequence of integers. This algorithm sets up a correspondence between words w over some totally ordered alphabet A and Young tableaux. The Young tableau Q(w) associated with a word by Schensted’s algorithm is called its insertion tableau and the length of its last row is the length of the longest increasing subsequence of the word. In 1970, Knuth proved that the insertion tableaux of two words u and v coincide if and only if these words are equivalent with respect to the congruence of A* generated by the relations
* Corresponding author. Email:
[email protected]. ’ Email:
[email protected]. ’ Email:
[email protected]. 0020-0190/97/$17.00 @ 1997 Elsevier Science B.V. All rights reserved. Pns0020-0190(97)00157-9
Automaton
aba s baa,
bba E bab
when a < b
acb E cab,
bca E bat
when a < b < c.
(1)
However, Knuth did not exploit the fact that these relations were the defining relations of a monoid, a feat which was performed by Lascoux and Schtitzenberger in [6]. They decided to call it plactic monoid and put it to good use in one of the first proofs of the so-called Richardson-Littlewood rule. The continued development of the theory of the plactic monoid turned it into a major tool in a variety of combinatorial contexts, such as the Kosta-Foulkes polynomials, charge and cocharge of Young tableaux, etc. In more recent developments it turned out that the plactic monoid also figures prominently in the theory of quantum groups mainly due to the strong relationship, discovered by Jimbo and Miwa, between the usual Robinson-Schensted correspondence and quantum U, (gl,) . Leclerc and Thibon even managed to obtain a quantum characterization of the plactic monoid in [7].
54
A. Arnold et al./Information
Processing
Thus it becomes clear that the plactic monoid is an object worthy of further attention. In this paper, we study it from the classical point of view of language theory. In particular, we give the fine structure of the recognizable subsets of the plactic monoid on two letters. Due to the fact that it has a cross section of entropy zero, the recognizable subsets are of a particularly simple form.
2. Generalities 2.1. Rationat and recognizable
subsets of a monoid
Let M be a monoid. A subset R of M will be called recognizable if there exists a finite monoid N and a monoid morphism 4 : M -+ N such that +-‘(4(R)) = R (forth is and the other definitions and results in this subsection cf. [ 1] ) . A subset R of M is rational if it can be constructed in a finite number of steps, each of which consists of choosing a one element subset {m} C M or of forming the union, product or the star of sets obtained in previous steps. The set of rational subsets of M will be denoted by Rut(M) whereas the set of recognizable subsets will be denoted by Rec( M) . In general there is no strong relationship between the two sets. In some cases of practical relevance, however, such a relationship exists. Proposition 1. Let M be a monoid. ( 1) If M is finitely generated then Rec( M) Rat(M). (2) IfM isfree then Ret(M) = Rat(M).
L
In order to determine whether or not a given subset of a monoid is recognizable, it often pays to consider the residuals of the set. Definition 2. Let M be a monoid, M. Then the set m-IL
L C M and m E
= {x E M 1mx E L}
Proposition 3. Let M be a monoid and L C M. Then L is recognizable if and only if {m-l L 1 m E M} is a jinite set. 2.2. Cross sections and entropy If A is an alphabet, R a family of pairs of elements of A* and = is the congruence defined by R on A, then a language S C A* is said to be a cross section of the quotient monoid A4 = A*/= if and only if for each w E A* there is a unique s E S such that w E s (for this and the rest of this subsection see also [ 31 and the references given there). In particular, two elements s, t E S are equivalent with respect to z iff and only if they are equal. Furthermore, a cross section S will be called rational if S is a rational subset of A*. Cross sections are important tools for the description of quotient monoids since they give a representative for each equivalence class of z and thus can be viewed as elements of the quotient monoid A*/=. Thus, it is often necessary to decide whether or not a given set constitutes a cross section of a certain monoid. This can be achieved by defining an action S x A* + S and verifying certain properties of this action. Lemma 4. If there exists an action . : S x A’ + S of A* on S such that (1) s.Ui = s.uifor ~11 (ui, Ui) E R, (2) s.1~ = sforall s E S, (3) (s.u).u=s.uuforalls~Sandu,u~A*, then S is a cross section of A*/=, where z is the congruence on A* generated by R. The idea behind this action is that for a cross section S, s E S and w E A* there is a unique element s’ E S such that s’ z SW. This element s’ is then the result of the action, that is s’ = S.W. If the action from S x A* into S is extended into a mapping from A* x A* into A* by defining U.U to be the element in S equivalent to uu then the following is true, where, for a word u E A*, not necessarily in S, and a set X C S, U-‘X = {u E S ) u.u E X}. Lemma 5. Let X G S and u,u E A*. If u E u then u--lx = u-lx.
is called a (left) residual of L. Notethat (mtm~)-lL=m~‘(m~‘L).Thefactrelevant to the determination of the recognizability set L is the following.
Letters 64 (1997) 53-59
of a
Rational cross sections of slow growth have a particular impact on the form of the recognizable sets of a monoid.
A. Arnold et al/Information
Definition 6. Let P be a subset of A*. The entropy of P is e(P)
= lim sup n-cc
loglPnA”l n
.
Zero entropy of the star of a language L has a marked effect since it prohibits the existence of free monoids on more than one generator inside L*. More specifically: Lemma 7. Let L be a non-empty subset of A*. Then the following are equivalent. (1) e(L*) =O. (2) There exists t E A* and I G N such that L* = {t’ 1i E I}. (3) There exists t E A*, n, m E N and ajnite subset I of N such that L* = (u
ti) u t”‘(t”)*.
Processing
Letters 64 (I 997) 53-59
55
2.3. The plactic monoid As mentioned in the Introduction, the plactic monoid is closely related to Young tableaux via Schensted’s algorithm. So a few explanations with regard to that subject are in order here. For more details concerning the planar constructions, see also [4]. Let (At,. . . , A,) be a partition of a positive integer n, that is a sequence of positive integers such that At < A2 < ... < A, and ci Ai = n. Then the Ferrers diagram associated with this sequence consists of left justified rows of boxes with Ai boxes in the(r-i+l)strowforl
iE/
Proof. The equivalence between (2) and (3) is the classical characterization of one letter rational languages. Part (3) implies part (I) quite obviously, whereas part (2) follows from part ( 1) by the observation that an entropy 0 subset of A* cannot contain 0 a free monoid on more than one generator. Proposition 8. If M = A*/5 has a rational cross section of entropy 0 then the recognizable subsets of M are finite unions of finite products of words and stars of words. Proof. Let X C M be recognizable and suppose 7r : A* -+ M is the natural projection. If S is a rational cross section, then X = %-(rr-‘(X)) = rr(7.r-‘(X> n S). Since inverse images of recognizable languages are recognizable, 7r-I (X) is recognizable. Furthermore, since A* is finitely generated and free, the notions of recognizability and rationality coincide, hence rTT-*(X) n S is rational in A*. Since it is contained in S, its entropy is zero and its stars can thus be decomposed according to Lemma 7. Hence, r-r (X) n S can be written as a finite union of finite products of words and stars of words. But then its image X is of this form as well since 7r(B U C) = n-(B) U r(C), 0 n-(K) = n-(B)r(C) and r(C*) = r(C)*.
The first letter al of w is put in a box. If the nth letter a,, of w is greater or equal to every letter in the last row of the insertion tableau of ala2.. . a,_1 then put it in a box and adjoin it to the right of the last row of this last tableau. Otherwise find the leftmost letter at in the last row of Q(al . . . a,_ 1) which is strictly greater than a,, put a, into al’s box and do step 2 with al starting in the row above the last row. Thus the insertion tableau of bba with a < b is constructed in the following way.
The insertion
tableau for bab is obtained as follows.
Thus, different words may lead to the same insertion tableau, suggesting the introduction of a relation on the
A. Arnold
56
et al. /Information
words of A* that takes account of this phenomenon. It is clear that the relation on A* defined by w G u if and only if Q(W) = Q(U) is an equivalence relation. Knuth proved that this equivalence relation is, in fact, the congruence generated by
Processing
Letfers
64 (1997) 53-59
monoid isomorphic to Pla( a, b). Thus Pla(a, 6) will be identified with (S, .) in the sequel. Note that
(ba) iaj bk .b” = (ba) i aj bk+n , (ba)“‘a”.(ba)‘ajbk = (ba)‘+“‘aj+“bk, and
aba E baa, bba G bab
( ba)i+.ib”-.i+k
for every pair a, b E A with a < b,
(2)
acb I cab, bca E bat
if n ~
j,
b”.(ba)‘aibk = i+n&nbk
if
j
<
fi,
tba)
for every triple a, b, c E A with a < b < c.
i
Thus, the plactic monoid Pla(A) on the alphabet A is defined to be the quotient of A’ by this congruence relation. Since the map P from A* to Young tableaux is onto, there exist right inverses of P from the Young tableaux to A* whose images will yield cross sections of the plactic monoid Ha(A). The two most obvious ones are obtained by concatenating the rows or the columns, respectively, of a Young tableau. The cross section obtained from concatenating the columns yields a rational cross section of the plactic monoid (cf. [ 31) . In the case of the plactic monoid Pla( a, b), whose quotient by the relation ab = 1 is widely used in Computer Science for describing trees and parenthesis systems, this cross section has the form S = (ba)*a*b*. The action S x A* 4 S related to this cross section by Lemma 4 is given by
(ba)‘a’bk.a =
(ba) i+laibk-’
ifk
(ba) ‘aj+’
if k = 0.
>
1,
and ( ba)‘aibk.b = (ba) iaj bk+l on the generators
of A*. It is easy to check that
(ba)‘aibk.bba=
(ba)‘ajbk.bab=
(ba)‘+*ajbk+’
and
(ba)‘ajbk.aba = (ba)‘aibk.baa (ba) =
i
i+zajbk-l
(ba)i+lai+l
ifk>
1,
if k = 0.
Note that the restriction of this action to S x S + S defines a monoid structure on S which turns S into a
3. Recognizable subsets of Pla(a, b) Before starting the search for the recognizable subsets of the plactic monoid, note the following relationship between the recognizable subsets of the free monoid A* and a cross section (S, .). Lemma 9. Let L be a subset of S C A*. Then L is a recognizable subset of (S, .) if and only if [L] = {ZJE A* 1there is a u E L such that u E u}
is recognizable in A*. Thus, if S is recognizable in A* then the recognizability of L & S in S implies the recognizability of L in A*. The converse of this statement, however, is not true. For example, in the case of the two letter plactic monoid, (ba)* is recognizable in A*, while it is not recognizable in S for [ (ba)‘] is the semi Dyck language. In order to study the recognizable subsets of the plactic monoid on two letters, it is worth noticing that every subset L C Pla( a, b) can be written in the form L = U (ba)ia’Li,j,
(3)
i,j>O
where Li,j is a subset of b*. The reason for this is that (ba)*a*b* is a cross section of Pla(a, b). This representation is unique and the residuals of a subset L of Pla( a, b) represented in this form look as follows. Lemma 10. lf L, & PIa(a, b) is represented as in Eq. (3), olte has
(1)
a-‘L = U (ba)‘ajLi,,i+l, i,j)O
A. Arnold et al./Information Processing Letters 64 (1997) 53-59
F’L = U _ - (ba)‘-‘d+‘Li,,i U U(bu)‘Li,o,
(2)
(3)
(bu)-‘L=
A=
i>O
i&O
U U
U (bu)‘ujLi+‘,j.
U
u’Li,j)
O
U
(bu)‘(
O
i,j>O
Corollary 11. (1) Let u = (ba)‘uj.
(bu)‘(
O(i
57
U
uqfr+PRLi,q+r)
(4)
O
and Then
B =
U-’ L = U (bu) ruSLi+,,j+s. r,s (2) ((bu)‘uj)-‘Lnb*= Li,j und (3) fork > 0, ( (bu)‘ujbk)-‘L n 6’ = bbkLi,j.
u (bu)“+‘+““( u CZ~+“~L~+~,,~). O(r
Lemma 14. If i > q and j (P-j) -‘Li+,f().
(5)
< p, then Li,,i =
For recognizable subsets L, the languages Li,,i turn out to be recognizable as well, which a direct consequence of part (2) of the last corollary.
Proof. Let u V -‘L, hence (bu) iuj and and Corollary
Lemma 12. If L is recognizable for all i, j > 0.
Lemma 15. For every r, s with 0 < r, s < p - 1, one has:
then so are the Li,j
Another consequence of the recognizability of L is that the Li,j are not completely independent in such a case. Lemma 13. If L is recognizable, then there exist p, q E N such that ( 1) Li,j = Li,j+r, for ~11 i 2 q and j 2 0, (2) Li,,i = Li+p,j for ~11 i > q and j 3 0. Proof. Since L is recognizable in S, there are only finitely many c4-‘L . Thus there are q, p such that for all u E A* and for all i 2 q, (a’+/‘)-‘(u-‘L)
= (a’)-‘(u-IL)
(b’+“)-‘(U-IL)
= (b’)-‘(u-‘L).
and
This implies that (uu’+Pu)-‘L = (uu’u)-‘L and (ubif”u) -‘L = (ub’u)-‘L for all u, u E A* and for all i 2 q. Since u(bu)‘u 3 ub’u’u, it follows from Lemma 5 that (u( bu) i+J’u)-‘L = (u(bu)‘u)-‘L. Corollary 11(2) then yields the result. 0 By reordering terms in Eq. (3) as suggested by the previous lemma, a recognizable subset L of Plu( a, b) can be written as L=AuB, where
L q+r,s =
= b’u’+j and u = bi+P&. Then u-l L = (U-IL) r7 b* = (u-IL) n b*. But u = u = (ba) iijbp--j. Applying Lemma 5 11 then yields the result. 0
(p-“)-‘(Qq+&?
witht=r+smodpundQq+l~{l,b,...,~J-l}. Proof. By Lemma 14, L q+r,s = (p-“)
-’ Lq+r+s,O
and, by Lemma 13, Lq+r+s,~ = Lq+,,o, with t = r + s mod p. Again by Lemma 14, L q+t,o = b-“Lq+t,o. Hence, Lq+r,~ = Qq+,bP”
with Qqft & (1, b,. . . , bp-‘}.
0
Lemma 16. Let @ and 8 represent addition and subtraction modulo p, respectively, where taking the modulus is interpreted us computing the remainder in [0, p - 1] ofdivision by p. Let ai(r) = 1 or 0 uccording to whether or not b’ E Qq+r. Then for every s in [O,p - 11, one has p- 1 L q+r,s -- c Ly;;y’ bi+P”. i=O
Proof. By the previous lemma,
58
A. Arnold et al./Information Processing Letters 64 (1997) 53-59
L y+r,s= U+‘-S)-‘(Qy+#‘N)
In order to show that, conversely, all subsets of Plu(u, b) of the form given in Proposition 17 are recognizable it is enough to show that the Li (L,, Lb) and the L, (io, jo) are recognizable.
But
Lemma 18. IfL is a recognizable subset of the pluctic monoid then so is (bu) i. L for all i E N.
c
=
Proof. It is enough to prove this result for i = 1, that follows from the fact that [ bu.L] = {ubuw ( uw E [L] } that is recognizable (in A*) when [L] is recognizable in A*. 0
a!r~S)b’+(“-(P-S))+p~ I
i=p-
s
Lemma
19. The languages
Li(L,,Lb)
= (bU)‘L,Lb
p-l
c
=
&W
1
ps+PN
are recognizable.
i=O
which implies the result.
Proof. It iS obvious that [L,Lb] apply the previous lemma. 0
0
According to the form of the Ly+r,s given in this last lemma, B is a union of
U
idl
which in turn is a union of languages KpJio,jo)
=
u (ba) O
of the form
q+(iOeS)+PNa.r+P~b(S~jO)+PN
with 0 6 ic, jo < p - 1. Similarly, by a previous remark, A is a union of languages Ki( L,, Lb) of the form K,‘(L,,Lb)
= (ba)‘L,Lb,
where L, (respectively Lb) is a recognizable subset of a* (respectively b* ) . Thus the recognizable subsets of PZa( a, b) have the following structure. Proposition 17. If L 2 Plu( a, b) is recognizable, then L is a$nite union of &,,(io,jo) =
(2)
As an example of the lemma, we show in Fig. 1 the minimal automaton recognizing the language L2,2(2, 1,l) = (bu)2u’+2Nb’+2N of Plu(u,b).
(ba)
O(r.s
(1)
= L,LI, and we can
UO
Ki( L,, Lb) = (bu)‘L,Lb, where L, (respectively Lb) is a recognizable subset of u* (respectively b*).
Lemma 20. For non-negative q, Kp,q( io, jo) is recognizable.
integers io, jo, p, and
Proof. The languages (bu)q.K~(io,jo) with
K,,,(io, jo) are of the form
KF(io,jo)
j i+jmodp
= {(bu)“ujbk
=io,
i+kmodp=ia+ja}. Since [K:‘(io,jo>l = {u ( (U(,modp =ic, (U(bmodp = io + jo i is recognizable in A*, the recognizability of Kp,q( i, j) follows from an argument similar to that in the proof of the previous lemma. q An automaton
recognizing
the language
I+(1@2S)+2~uS+*~bS+2N (ba) U O
A. Arnold et aLlInformation Processing Letters 64 (1997) 53-59
\
b
b
/I
b b a
a
a
c a
a Fig. 2. An automaton
Fig. I. An automaton
for the language
for Uo
(ba)2a’+2N b’+*‘.
Theorem 21. A subset L of the plactic monoid Pla(a, 6) is recognizable if and only if it is a finite union of sets of the form Kp,4 (io, jo ) and KL(L,, Lb), as in Proposition 17.
References ] 1 ] J. Berstel, Transductions and Context-Free Languages, Teubner, Stuttgart, 1979. [2] E. Date, M. Jimbo, T. Miwa, Representationsof Llq(gln) at q = 0 and the Robinson-Schensted correspondence, in: L. Bring, D. Fridan, A.M. Polyakov (Eds.), Physics and Mathematics of Strings, Memorial volume of V. Knizhnik, World Scientific, Singapore, 1990.
monoids, in: M. ho, [31 G. Duchamp, D. Krob, Plactic-growth-like H. Jtlrgensen (Eds.), Words, Languages and Combinatorics II, Kyoto, Japan, 1992, World Scientific, Singapore, 1994, pp. 124-142. London Mathematical Society [41 Fulton, Young tableaux, Lecture Note Series, vol. 35, Cambridge University Press, Cambridge, 1997. 151 D.E. Knuth, Permutations, matrices and generalized Young tableaux, Pacific J. Math. 34 (1970) 7099727. 161 A. Lascoux, M.P Schiitzenberger, Le. monoide plaxique, in: A. De Luca (Ed.), Noncommutative Structures in Algebra and Geometric Combinatorics, Quademi de la Ricerca Scientifica, vol. 109 (1981) 129-156. corresponL71 B. Leclerc, J.Y. Thibon, The Robinson-Schensted dence as the quantum straightening at 4 = 0, LITP Tech. Rept. 95/17, Paris, 1995. Longest increasing and decreasing subt81 C. Schensted, sequences, Canad. J. Math. 13 (1961) 179-191.