Information Processing Letters 38 (1991) 57-60 North-Holland
26 April 1991
Efficient parallel algorithms to test square-freeness and factorize strings

Maxime Crochemore *
LITP, Université Paris 7, 2 Place Jussieu, 75251 Paris Cedex 05, France

W. Rytter
Institute of Informatics, Warsaw University, PKiN 8p., 00901 Warsaw, Poland
Communicated by L. Boasson Received 28 June 1990 Revised 7 February 1991
Abstract

Crochemore, M. and W. Rytter, Efficient parallel algorithms to test square-freeness and factorize strings, Information Processing Letters 38 (1991) 57-60.

A string is square-free iff it does not contain a nonempty subword of the form $ww$. We give an algorithm testing square-freeness of strings in $\log n$ time with $n$ processors of a CRCW PRAM. The input alphabet is not bounded. The best sequential algorithm for this problem takes $O(n \log n)$ time. Hence the total number of operations in our parallel algorithm matches that of the best sequential algorithm. The algorithm relies on an efficient parallel computation of a factorization of words used in text compression.

Keywords: Parallel algorithms, combinatorial problems
Squares in strings are a subject of many papers related to combinatorics on strings, see [10]. Testing square-freeness of a string by an efficient algorithm is a difficult problem even in the sequential case, see [6] and [12] (and also [4,5,11]). The problem is especially interesting from the point of view of algorithmic techniques. Recently an $O(\log^2 n)$ time parallel algorithm on a CREW PRAM for this problem has been given in [8]. However, it seems that this algorithm cannot be implemented on a CRCW PRAM in $\log n$ time. The situation is analogous to the problem of connected components for undirected graphs: the algorithm working in $\log n$ time on a CRCW PRAM could not be derived as a version of a $\log^2 n$ time algorithm on a CREW PRAM, see [13].

The main result of the present paper is a simple $O(\log n)$ time algorithm on a CRCW PRAM. Independently, an algorithm with similar complexity bounds has recently been discovered by Apostolico [2]. That algorithm is rather involved because it essentially exploits a linear ordering on the alphabet. As far as we know, this idea has not been used before in any sequential algorithm for the problem. The advantage of the algorithm in [2] is that its extension can detect all squares (at the cost of a more complicated algorithm). Our algorithm is much simpler. It is a straightforward parallelization of a known sequential algorithm.

* Work by this author is supported by PRC "Mathématiques-Informatique" & NATO Grant CRG 900293.

0020-0190/91/$03.50 © 1991 - Elsevier Science Publishers B.V. (North-Holland)
Volume 38, Number 2    INFORMATION PROCESSING LETTERS    26 April 1991
As a by-product, the algorithm provides an efficient and very simple parallel algorithm for string factorization. This is useful in some data compression schemes related to the method of Ziv and Lempel [14] (see also [7]). This last scheme is famous because it is the basis of the UNIX command "COMPRESS", known as efficient in practice.

Our paper has also a methodological character. It shows a transformation of a particular sequential algorithm into a parallel one. At first glance the initial sequential algorithm looks inherently sequential. As it was written in [2] about parallel algorithms for basic string problems: "none of these algorithms resembles in any way its sequential counterpart". Our parallel algorithm is an exception.

The model of parallel computations is the concurrent-write concurrent-read parallel random access machine (CRCW PRAM), see [9].

The main part of the most efficient known sequential algorithms for finding a square in a word is an operation called Test (see [11,12] or [6]). This operation applies to square-free words $u$ and $v$ and returns true iff the word $uv$ contains a square (the square must begin in $u$ and end in $v$). This operation is a composition of two smaller ones, RightTest and LeftTest. The first (respectively second) operation tests whether $uv$ contains a square whose centre is in $v$ (respectively $u$).

Lemma 1. Functions RightTest($u$, $v$) and LeftTest($u$, $v$) can be computed in $O(\log |v|)$ time with $|v|$ processors in the CRCW PRAM model.

Proof. The proof is omitted because it directly follows from Theorem 2.2, Theorem 3.1 and Lemma 3.2 in [8]. □
We now define a factorization of a text $t$ that is similar to the factorization included in the data compression scheme of Ziv and Lempel [14]. The present factorization has been considered in [6]. It is a sequence of nonempty words $(v_1, v_2, \ldots, v_m)$ such that $t = v_1 v_2 \cdots v_m$, where $v_1 = t[1]$ and, for $k > 1$, $v_k$ is defined as follows: if $i = |v_1 v_2 \cdots v_{k-1}|$, then $v_k$ is the longest prefix $u$ of the text $t[i+1..n]$ which occurs at least twice in the text $t[1..i]u$. If there is no such $u$, then $v_k$ is set to $t[i+1]$. Figure 1 shows the longest prefix found to define $v_5$. One may note that the two occurrences of $v_5$ can overlap.

Fig. 1. Factorization of a text. (Figure not reproduced; it marks the position Pos($i$).)

Let us denote by Pos($v_k$) the position of the first occurrence of $v_k$ (it is the length of the shortest word $y$ such that $y v_k$ is a prefix of $t$).

Lemma 2. The factorization and the table Pos for a given text $t$ of size $n$ can be computed in $\log n$ time using $n$ processors in the CRCW PRAM model.
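The definition of the factorization and of Pos can be made concrete with a naive sequential sketch. It follows the definition literally and is in no way the parallel construction claimed by Lemma 2. The function name `factorize`, the 0-based positions and the use of Python's `str.find` to locate first occurrences are choices made for this illustration.

```python
def factorize(t):
    """Return the factorization of t as (start, length, pos) triples.

    start is the 0-based position of the factor, pos is the position of
    its first occurrence in t (pos == start for a letter not seen before).
    """
    factors, i, n = [], 0, len(t)
    while i < n:
        length = 0
        # Extend the factor while the longer prefix of t[i:] still has an
        # occurrence starting before i; the two occurrences may overlap.
        while i + length < n and t.find(t[i:i + length + 1]) < i:
            length += 1
        length = max(length, 1)        # a new letter is a factor of length 1
        factors.append((i, length, t.find(t[i:i + length])))
        i += length
    return factors
```

On the text "abaababa" this sketch yields the factors a, b, a, aba, ba with first occurrences at positions 0, 1, 0, 0, 1. On "aaaa" it yields a, aaa, where the second factor overlaps its earlier occurrence.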
Proof. (a) First construct the suffix tree $T$ for the text $t$ by the algorithm of [3]. The leaves of $T$ correspond to the suffixes of $t$. The path from the root to the leaf labelled $i$ spells the suffix occurring at position $i$ in $t$ (i.e., the suffix $t[i+1..|t|]$).

(b) Each node $u$ of $T$ is given a label that is the minimal label of the leaves belonging to the subtree rooted at $u$. The computation can be done by a tree-contraction algorithm, for example, the one from [9].

(c) Now for each leaf $i$ compute the first node $u$ on the bottom-up path from $i$ to the root with a value $j < i$. Denote by Size($i$) the length of the string spelled by the path from the root to the node $u$. The value of $j$ equals Pos($i$). If there is no such node, then Pos($i$) $= i$ and Size($i$) $= 1$. The computation of the tables Pos($i$) and Size($i$) can be done efficiently in parallel as follows. Let Up[$u$, $k$] be the node lying on the path from $u$ to the root whose distance from $u$ is $2^k$. If there is no such node, then we set Up[$u$, $k$] = root. For each node $u$ of the tree, compute the table MinUp[$u$, $k$] for $k = 1, \ldots, \log n$. The value of MinUp[$u$, $k$] is the node with the smallest label on the path from $u$ to Up[$u$, $k$]. Both tables can be easily computed in $\log n$ time with $O(n)$ processors.
Next, the values of Pos($i$) can be computed by assigning one processor to each leaf $i$. The assigned processor then performs a kind of binary search to find the first $j < i$ on the path from $i$ to the root. This again takes logarithmic time.

(d) Denote Next($i$) $= i +$ Size($i$). Compute in parallel all powers of Next. Then we get the factorization $(v_1, v_2, \ldots, v_m)$ because the position of $v_{k+1}$ is Next$^k(0)$. The table Pos is already computed in (c). This completes the proof. □

Let $(v_1, v_2, \ldots, v_m)$ be the factorization of the given text. The following combinatorial fact is shown in [6].
Lemma 3. Text $t$ contains a square iff for some $k$ at least one of the following conditions holds:
(1) Pos($v_k$) $+ |v_k| \geq |v_1 v_2 \cdots v_{k-1}|$ (self-overlapping of $v_k$),
(2) LeftTest($v_{k-1}$, $v_k$) or RightTest($v_{k-1}$, $v_k$) is true,
(3) RightTest($v_1 v_2 \cdots v_{k-2}$, $v_{k-1} v_k$) is true.

The structure of the sequential algorithm is as follows:

Algorithm Sequential-Test;
    compute the factorization $(v_1, v_2, \ldots, v_m)$ of $t$;
    for $k = 1$ to $m$ do
        check if one of the conditions (1), (2) or (3) holds.

The parallel version is as follows:

Algorithm Parallel-Test;
    compute in parallel the factorization $(v_1, v_2, \ldots, v_m)$ of $t$;
    for each $k \in [1..m]$ do in parallel
        check if one of the conditions (1), (2) or (3) holds.

Now the key point of the parallel algorithm is that the computation of the value of RightTest($v_1 v_2 \cdots v_{k-2}$, $v_{k-1} v_k$) can be done in logarithmic time using only $O(|v_{k-1} v_k|)$ processors, due to Lemma 1. All other tests can be done independently.

Theorem. The square-freeness test for a given text $t$ of size $n$ can be realized in $\log n$ time using $n$ processors in the CRCW PRAM model.

Proof. Each condition in the algorithm Parallel-Test can be checked in $O(\log n)$ time using $O(|v_{k-1} v_k|)$ processors, due to Lemma 1. Thus the total number of processors we need is at most the sum of all $|v_{k-1} v_k|$. However, this is obviously $O(n)$, since each factor is counted at most twice, which completes the proof. □
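The structure of Sequential-Test can be sketched as follows, under stated assumptions. The factorization and the LeftTest and RightTest operations are replaced by brute-force stand-ins, so the sketch shows only how the three conditions of Lemma 3 are combined and says nothing about the complexity bounds. All function names are hypothetical, positions are 0-based, the centre of a square is taken to be the index where its second half starts, and the guard `pos < start` in condition (1) is an assumption added here so that a letter seen for the first time (for which Pos($i$) $= i$) is not reported as self-overlapping.

```python
def factorize(t):
    """Naive factorization of t as (start, length, pos) triples, 0-based."""
    factors, i, n = [], 0, len(t)
    while i < n:
        length = 0
        while i + length < n and t.find(t[i:i + length + 1]) < i:
            length += 1
        length = max(length, 1)                  # new letter: factor of length 1
        factors.append((i, length, t.find(t[i:i + length])))
        i += length
    return factors


def crossing_square(u, v, centre_in_v):
    """Brute-force stand-in for RightTest (centre_in_v=True) or LeftTest (False)."""
    w, b = u + v, len(u)
    for start in range(b):                       # the square begins in u
        for half in range(1, (len(w) - start) // 2 + 1):
            centre, end = start + half, start + 2 * half
            if end > b and (centre >= b) == centre_in_v \
                    and w[start:centre] == w[centre:end]:
                return True
    return False


def has_square(t):
    """Check conditions (1), (2), (3) of Lemma 3 for every factor in turn."""
    factors = factorize(t)
    for k, (start, length, pos) in enumerate(factors):
        v_k = t[start:start + length]
        if pos < start and pos + length >= start:            # condition (1)
            return True
        if k >= 1:
            prev_start, prev_length, _ = factors[k - 1]
            v_prev = t[prev_start:prev_start + prev_length]
            if crossing_square(v_prev, v_k, False) or crossing_square(v_prev, v_k, True):
                return True                                  # condition (2)
            if crossing_square(t[:prev_start], v_prev + v_k, True):
                return True                                  # condition (3)
    return False


def has_square_naive(t):
    """Reference check that tries every start and every half-length."""
    n = len(t)
    return any(t[s:s + h] == t[s + h:s + 2 * h]
               for h in range(1, n // 2 + 1) for s in range(n - 2 * h + 1))
```

Comparing `has_square` with `has_square_naive` on all short words over a small alphabet is a convenient sanity check of the case analysis. In the parallel version, the loop over $k$ is the part that is executed for all factors simultaneously.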
Final remark
The basic data structure needed for the parallel computation of suffix trees and for the evaluation of function Test is the dictionary of basic factors (see [8] for the definition). This dictionary is computed only once and for the whole string. In the computation of the function Test for smaller parts of the text the same global dictionary (already computed) is used.
References

[1] A. Apostolico, A myriad virtues of suffix trees, in: A. Apostolico and Z. Galil, eds., Combinatorial Algorithms on Words (Springer, Berlin, 1985) 85-96.
[2] A. Apostolico, Optimal parallel detection of squares in strings, Manuscript, University of L'Aquila, 1990.
[3] A. Apostolico, C. Iliopoulos, G. Landau, B. Schieber and U. Vishkin, Parallel construction of a suffix tree with applications, Algorithmica 3 (1988) 347-365.
[4] A. Apostolico and F.P. Preparata, Optimal off-line detection of repetitions in a string, Theoret. Comput. Sci. 22 (1983) 297-315.
[5] M. Crochemore, An optimal algorithm for computing the repetitions in a word, Inform. Process. Lett. 12 (1981) 244-250.
[6] M. Crochemore, Transducers and repetitions, Theoret. Comput. Sci. 45 (1986) 63-86.
[7] M. Crochemore, Data compression with substitution, in: Gross and D. Perrin, eds., Electronic Dictionaries and Automata in Computational Linguistics, Lecture Notes in Computer Science 377 (Springer, Berlin, 1989) 1-16.
[8] M. Crochemore and W. Rytter, Parallel computations on strings and arrays, in: Choffrut and Lengauer, eds., STACS'90, Lecture Notes in Computer Science 415 (Springer, Berlin, 1990) 109-123.
[9] A. Gibbons and W. Rytter, Efficient Parallel Algorithms (Cambridge Univ. Press, Cambridge, 1988).
[10] M. Lothaire, Combinatorics on Words (Addison-Wesley, Reading, MA, 1983).
[11] M.G. Main and R.J. Lorentz, An O(n log n) algorithm for finding all repetitions in a string, J. Algorithms 5 (1984) 422-432.
[12] M.G. Main and R.J. Lorentz, Linear time recognition of square-free strings, in: A. Apostolico and Z. Galil, eds., Combinatorial Algorithms on Words (Springer, Berlin, 1985).
[13] Y. Shiloach and U. Vishkin, An O(log n) parallel connectivity algorithm, J. Algorithms 2 (1981) 57-63.
[14] J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. Inform. Theory 23 (3) (1977) 337-343.