ELSEVIER
Information Processing Letters
Information Processing Letters 52 (1994) 277-284
A parallel list update problem
F. Luccio a,*,1, A. Pedrotti b,2
a Dipartimento di Informatica, Università di Pisa, Corso Italia 40, I-56125 Pisa, Italy
b Scuola Normale Superiore, Piazza dei Cavalieri, n. 7, Pisa, Italy
Communicated by W.M. Turski; received 18 May 1994
Abstract

Given a set of $n \cdot m$ elements arranged in $n$ lists $L_1 \ldots L_n$ of $m$ elements each, and a sequence of requests $R = R_1 R_2 \ldots$, where $R_i$ is a set of $n$ elements to be retrieved in the lists, we discuss how to execute each $R_i$ in parallel. $L_1 \ldots L_n$ are stored in the shared memory of an $n$-processor EREW-PRAM. After $R_i$ has been executed, the lists are updated with a move to front (MTF) strategy; then $R_{i+1}$ can be processed. An amortized analysis shows that the proposed algorithm is $(n^2 + 1)$-competitive against a static optimal algorithm, while a lower bound of $2n$ is proved. The use of randomization reduces the competitivity factor to $\frac{9}{2}n$, versus a lower bound of $\frac{3}{2}n$.

Keywords: List searching; Competitivity factor; Amortized analysis of algorithms; Parallel algorithms
* Corresponding author. Email: [email protected].
1 This work has been supported in part by Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo of CNR, Italy.
2 Email: [email protected].
0020-0190/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved. SSDI 0020-0190(94)00146-4

1. Presentation of the problem

The (sequential) list update problem is the following:

LU: Given a list $L$ of $m$ distinct elements, and a sequence of requests $R = x_1, x_2, \ldots$, with $x_i \in L$, retrieve the $x_i$ in $L$, and update $L$ after each retrieval, in order to process $R$ efficiently.

Among others, a standard move to front (MTF) algorithm is used to solve this problem. In MTF, an element is moved to the front of the list whenever it has been accessed. After the initial analysis of [4], several studies have been devoted to LU with the MTF strategy. In particular, the amortized behaviour of MTF has been studied in [6], with crucial use of a potential function $\Phi$ that keeps track, at any time, of the differences between the list maintained by an arbitrary algorithm and that of an optimal one. In [6] it is proved that MTF is 2-competitive, that is, for any sequence of requests, the number of steps executed by MTF does not exceed twice the number of steps of an optimal algorithm.

Randomization has then been introduced as a tool to improve the amortized efficiency of MTF. In particular, a randomized algorithm BIT is given in [2], where a random bit $b(x)$ is initially associated to each element $x$ of $L$. When $x$ is accessed, $b(x)$ is complemented; $x$ is then moved to the front if $b(x) = 0$, and is left in place if $b(x) = 1$. BIT has been proved to be $\frac{7}{4}$-competitive against an adversary that does not know the initial random choice of the bits.

In this paper we consider LU in parallel computation, using the EREW-PRAM model. The new formulation is as follows:

PLU: Given a set $S$ of $n \cdot m$ elements, arranged in $n$ lists $L_1 \ldots L_n$ of $m$ elements each, with $m > n$, and a sequence of requests $R = R_1 R_2 \ldots$, where each request $R_i \subseteq S$ is a set of $n$ elements, retrieve the elements of each $R_i$ in parallel, and then update the lists in order to process $R$ efficiently.

To solve PLU we use $n$ processors, one for each list, and adopt the MTF strategy. Elements may be moved from one list to another. If two or more requested elements are found in the same list $L_i$, only the first one is moved to the front of that list, while each of the others is brought to the front of a list $L_j$ where no element has yet been found.

We propose an algorithm for PLU, and a randomized version of it. Using a proper potential function, we find an asymptotic competitivity factor of $n^2 + 1$ for the basic algorithm, and of $\frac{9}{2}n$ for the randomized version. The corresponding lower bounds are $2n$ and $\frac{3}{2}n$, respectively. As a side result of our analysis, we obtain the same competitivity factors of [6] and [2] for the sequential case ($n = 1$).

In our analysis we adopt the so-called standard cost model, where the cost of an access to an element in the $k$th position of some list is $k$ [2].
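As an illustration of the sequential setting recalled in this section, the following is a minimal Python sketch of MTF and BIT under the standard cost model (an access to the $k$th position costs $k$). The function names `mtf_cost` and `bit_cost` and the dictionary representation of the bits are assumptions made for this example, not part of the original algorithms' presentation.

```python
def mtf_cost(lst, requests):
    """Serve the requests on a copy of lst with move to front; return the total cost."""
    lst = list(lst)
    total = 0
    for x in requests:
        pos = lst.index(x)            # 0-based position of x
        total += pos + 1              # standard cost model: k-th position costs k
        lst.insert(0, lst.pop(pos))   # move the accessed element to the front
    return total


def bit_cost(lst, requests, bits):
    """Serve the requests with BIT; bits maps each element to its initial bit."""
    lst = list(lst)
    bits = dict(bits)                 # hypothetical representation of b(x)
    total = 0
    for x in requests:
        pos = lst.index(x)
        total += pos + 1
        bits[x] ^= 1                  # b(x) is complemented on every access
        if bits[x] == 0:              # x is moved to the front only if b(x) = 0
            lst.insert(0, lst.pop(pos))
    return total
```

For instance, `mtf_cost(['a', 'b', 'c'], ['c', 'c', 'a'])` charges 3 for the first access to `c`, 1 for the second, and 2 for `a`. In BIT, the initial bits decide whether the first or the second access to an element triggers the move.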
2. Analysis of the parallel algorithm

Consider the EREW-PRAM model of computation with $n$ processors $P_1 \ldots P_n$. The lists $L_1 \ldots L_n$ are stored in the shared memory, and their addresses are known to all processors. The requests may also be stored, or may be received on line (and then stored). Request $R_{i+1}$ is not initiated until $R_i$ has been completely processed. The following algorithm PMTF is applied to process each request $R_i$.

The elements of $R_i$ are sorted, and the sorted version, called $R$ for simplicity, is replicated in $n$ copies $R^1 \ldots R^n$, one for each processor. The EREW paradigm suffices, due to the replication of $R$. Let $R = \{x_1, \ldots, x_n\}$. Two auxiliary binary vectors $F$, $V$ of length $n$ are used, where $F(i) = 1$ denotes that an element has been moved to the front of $L_i$, and $V(k) = 1$ denotes that $x_k$ has been found in some list. The search proceeds in steps. At each step $t$, the $t$th element of $L_i$, denoted by $L_i(t)$, is examined by $P_i$.

Algorithm PMTF
for step $t = 1, 2, \ldots$, execute the following three phases:
1. each processor $P_i$ performs a binary search for $L_i(t)$ in $R^i = \{x_1, \ldots, x_n\}$; if $L_i(t) = x_k$ for some $k$ then:
   a. $P_i$ sets $V(k) := 1$;
   b. if $F(i) = 0$, $P_i$ moves $L_i(t)$ to the front of $L_i$, and sets $F(i) := 1$;
2. let $\mathcal{P}$ be the subset of processors such that each $P_i \in \mathcal{P}$ finds that $L_i(t) \in R^i$ in phase 1, but cannot move such an element to the front of $L_i$, because this front has been previously occupied ($F(i) = 1$ in step 1.b);
   a. a parallel ranking is performed to assign a numbering to the elements of $\mathcal{P}$. Call them $P_{i_1}, \ldots, P_{i_r}$, $r = |\mathcal{P}|$, and let $L_{i_1} \ldots L_{i_r}$ be the corresponding lists. A parallel ranking is also performed in $F$, to assign a numbering to the lists $L_i$ with free front ($F(i) = 0$). Let $\mathcal{L}$ be the set of such lists, and rename them $L_{j_1} \ldots L_{j_s}$, $s = |\mathcal{L}| \ge r$. Note that $L_{i_1} \ldots L_{i_r}$ are not in $\mathcal{L}$;
   b. each processor $P_{i_h} \in \mathcal{P}$ allocates the element $L_{i_h}(t)$ found in phase 1 to the front of $L_{j_h}$, and imports $L_{j_h}(t)$ in the $t$th position of $L_{i_h}$ (i.e., $L_{j_h}(t)$ replaces $L_{i_h}(t)$ in $L_{i_h}$).
3. The AND of all $V(k)$ is computed, to decide if all the elements of $R$ have been found.

The time needed to sort and replicate $R$ in the preprocessing phase is $O(n)$ (the time for replication is dominant). An upper bound to the time required by algorithm PMTF is easily derived as $O(m \log n)$. In fact, phases 1, 2 and 3 can be performed in time $\log n$ (see [3]), and these phases are repeated $m$ times in the worst case. Recalling that $m > n$, the total time is $O(m \log n)$. As customary in list update problems, however,
the interest of the method lies in the amortized analysis of PMTF performed on a sequence of requests $R = R_1 R_2 \ldots$, and in the comparison of PMTF with an optimal static algorithm OPT for $R$. That is, OPT arranges the list elements in a proper order once, and never moves them later, so that the total number of steps for $R$ is minimized. A clever modification of PMTF saves some operations in phases 2 and 3 of each step (see [5]). Still, the two algorithms require the same number of steps, and the same time complexity of $O(\log n)$ for each step, due to the search in phase 1. Note that each step of OPT also requires time $O(\log n)$, for the same reason. We therefore refer to PMTF.

At any given step, let $x^{PMTF}$ and $x^{OPT}$ denote the positions occupied by element $x$ in the list where it resides, in algorithms PMTF and OPT, respectively. We pose:

Definition 1. An inversion is an unordered pair $(x, y)$ such that $x^{PMTF} < y^{PMTF}$ and $x^{OPT} > y^{OPT}$.

Definition 2. The potential $\Phi$ of the list arrangement, at any step of PMTF, is given by the number of inversions.

Note that $\Phi$ is an extension of the potential function given in [6] for the sequential case ($n = 1$).

Let the current request be $R_h = \{x_1, \ldots, x_n\}$. For ease of computation, we assume

$$x_i^{PMTF} \ge x_j^{PMTF} \quad \text{for } i < j, \qquad (1)$$

where $x^{PMTF}$ is the position of $x$ in its list, before $R_h$ is processed. By the use of a potential function, the amortized cost of algorithm PMTF on $R$ can be computed from the cost of the algorithm on each $R_h$ plus the corresponding variation of the potential (see [6]). For this purpose we first analyze the variation of the potential $\Phi$ for $R_h$. Note that PMTF changes the position of an element in steps 1.b ($L_i(t)$ is moved to the front of $L_i$) and 2.b ($L_{i_h}(t)$ is moved to the front of $L_{j_h}$, and $L_{j_h}(t)$ is moved to $L_{i_h}$, in the same relative position $t$). Only the two move to front operations affect the value of $\Phi$. After $R_h$ is processed, all $x_i$ have been brought to the front of a list. Without loss of generality, assume that each $x_i$ is brought to the front of $L_i$. We denote with $u$ the variation in the number of inversions due to pairs $(x_i, x_j)$; with $u_i$ the variation due to pairs $(x_i, z)$, $z \notin R_h$; with $w_{ij}$ the variation due to pairs $(y, z)$, $y, z \notin R_h$, $y \in L_i$, $z \in L_j$, $i < j$. The variation of $\Phi$ for $R_h$ is then:

$$\Delta\Phi = u + \sum_{1 \le i \le n} u_i + \sum_{1 \le i < j \le n} w_{ij}. \qquad (2)$$
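The three phases of PMTF and the potential of Definition 2 can be sketched as a sequential simulation. This is only an illustrative sketch: the processors are simulated by a loop, the binary search and the parallel rankings are replaced by set membership and list comprehensions, and the names `pmtf_request`, `positions` and `potential` are hypothetical. Marking the destination list's front as occupied after step 2.b is an assumption of this sketch, made so that each requested element ends at the front of a distinct list.

```python
from itertools import combinations


def positions(lists):
    """Map each element to its 1-based position inside the list where it resides."""
    return {x: t + 1 for lst in lists for t, x in enumerate(lst)}


def potential(pmtf_lists, opt_lists):
    """Number of inversions (Definition 1) between the two arrangements."""
    p, o = positions(pmtf_lists), positions(opt_lists)
    return sum(1 for x, y in combinations(p, 2)
               if (p[x] - p[y]) * (o[x] - o[y]) < 0)


def pmtf_request(lists, request):
    """Process one request in place; return the number of steps executed."""
    n, m = len(lists), len(lists[0])
    request = set(request)
    front_used = [False] * n          # vector F of the algorithm
    found = set()                     # plays the role of vector V
    for t in range(m):
        blocked = []                  # processors that cannot use their own front
        for i in range(n):            # phase 1 (one iteration per processor)
            x = lists[i][t]
            if x in request:
                found.add(x)
                if not front_used[i]:
                    lists[i].insert(0, lists[i].pop(t))
                    front_used[i] = True
                else:
                    blocked.append(i)
        free = [j for j in range(n) if not front_used[j]]
        for i, j in zip(blocked, free):   # phase 2: pair blocked and free lists
            x, y = lists[i][t], lists[j][t]
            lists[i][t] = y               # L_j(t) replaces L_i(t)
            lists[j].pop(t)
            lists[j].insert(0, x)         # x goes to the front of L_j
            front_used[j] = True          # assumption: front of L_j is now taken
        if found == request:              # phase 3
            return t + 1
    raise ValueError("request contains elements not stored in the lists")
```

For example, with `lists = [[1, 2, 3], [4, 5, 6]]` and request `{2, 3}`, element 2 moves to the front of its own list at step 2, and element 3 is exported to the front of the second list at step 3, while 6 takes its place.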
Note that all the elements of $R_h$ are brought to position 1. Therefore, processing $R_h$ does not form inversions between such elements, by Definition 1. We have

$$u \le 0. \qquad (3)$$
To evaluate the contribution of $u_i$ in relation (2), let $k_i = x_i^{PMTF}$, $h_i = x_i^{OPT}$. For $z \in S \setminus R_h$, consider the sets

$$E_i = \{z \mid z^{PMTF} < k_i\}, \quad \bar{E}_i = \{z \mid z^{PMTF} \le k_i\},$$
$$\tilde{F}_i = \{z \mid z^{OPT} < h_i\}, \quad \bar{F}_i = \{z \mid z^{OPT} \le h_i\}, \quad F_i = \{z \mid z^{OPT} > h_i\},$$

and denote $q_i = |E_i \cap F_i|$. Note that, if $z \in \bar{E}_i \cap \tilde{F}_i$, a new inversion $(x_i, z)$ is created. If $z \in E_i \cap F_i$, the inversion $(z, x_i)$ already exists and is destroyed. There are no other contributions to $u_i$. We thus obtain

$$u_i \le |\bar{E}_i \cap \tilde{F}_i| - |E_i \cap F_i|.$$

Since $E_i \cap F_i$ and $\bar{E}_i \cap \tilde{F}_i$ are disjoint subsets of $\bar{E}_i$, we may write

$$u_i \le |\bar{E}_i \cap \tilde{F}_i| + |E_i \cap F_i| - 2|E_i \cap F_i| \le |\bar{E}_i| - 2|E_i \cap F_i|.$$

It is easy to check that $\bar{E}_i$ has at most $n k_i + i - n - 1$ elements, because at least $n + 1 - i$ elements of $R_h$ are in positions $\le k_i$. We thus have

$$u_i \le n k_i + i - n - 1 - 2 q_i. \qquad (4)$$
We now consider the contribution of $w_{ij}$ in relation (2). Since $i < j$, we have $k_i \ge k_j$. If $k_i = k_j$, we see that $w_{ij} = 0$. For $k_i > k_j$, let $p = k_i - k_j - 1$, and let $a_1, \ldots, a_p \in L_i$, $b_1, \ldots, b_p \in L_j$ be the elements of $L_i$ and $L_j$ whose positions are intermediate between $k_j$ and $k_i$. The pairs $(a_k, b_k)$ are not inversions before $R_h$ is processed. However, since the elements $a_k$ shift towards the rear of their list, while the $b_k$ are not moved, up to $p$ inversions may arise. We conclude that $w_{ij} \le \max(0, k_i - k_j - 1) \le k_i - k_j$, hence

$$\sum_{1 \le i < j \le n} w_{ij} \le \sum_{1 \le i \le n} (n + 1 - 2i) k_i. \qquad (5)$$
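The step from the pairwise bound $w_{ij} \le k_i - k_j$ to relation (5) rests on the identity $\sum_{i<j} (k_i - k_j) = \sum_i (n + 1 - 2i) k_i$: each $k_i$ appears with a plus sign $n - i$ times and with a minus sign $i - 1$ times. A small sketch that checks this identity numerically follows; the function names are hypothetical.

```python
def pairwise_sum(k):
    """Sum of k_i - k_j over all pairs i < j (k is given as a 0-based list)."""
    n = len(k)
    return sum(k[i] - k[j] for i in range(n) for j in range(i + 1, n))


def weighted_sum(k):
    """Sum of (n + 1 - 2i) * k_i for i = 1..n, the right-hand side of relation (5)."""
    n = len(k)
    return sum((n + 1 - 2 * i) * k[i - 1] for i in range(1, n + 1))
```

With `k = [5, 3, 3, 1]` both functions return 12.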
Recalling that $k_1$ is maximal among all $k_i$ by relation (1), applying the potential function as done in [6], and taking into account relations (2) to (5), we find that the amortized cost $C_{PMTF}$ of algorithm PMTF is given by:

$$C_{PMTF} = k_1 + \Delta\Phi \le 2n k_1 - 2 q_1 + \Big( \sum_{2 \le i \le n} \big( (2n + 1 - 2i) k_i - 2 q_i \big) \Big) - \frac{n(n + 1)}{2}. \qquad (6)$$
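Since $|E_i| - |E_i \cap F_i| = |E_i| - q_i \le |\bar{F}_i|$ (where $\bar{F}_i$ collects the elements $z \in S \setminus R_h$ with $z^{OPT} \le h_i$), noting that $|\bar{F}_i| \le n h_i - 1$ and $|E_i| \ge n(k_i - 1) - (n - i)$ (in fact, no more than $n - i$ elements of $R_h$ are in positions $< k_i$), we have $n k_i - q_i \le n h_i + (2n - i - 1)$, which may be rewritten as

$$\alpha_i k_i - \beta_i q_i \le \alpha_i \Big( h_i + \frac{2n - i - 1}{n} \Big) \qquad (7)$$

for arbitrary constants $\alpha_i$ and $\beta_i$ such that $\alpha_i \le n \beta_i$. Letting $\alpha_1 = 2n$, $\alpha_i = 2n + 1 - 2i$ for $i = 2, \ldots, n$, $\beta_i = 2$ for $i = 1, \ldots, n$, relation (6) can be rewritten as

$$C_{PMTF} \le \gamma + \sum_{1 \le i \le n} (\alpha_i k_i - \beta_i q_i), \qquad (8)$$

where $\gamma = -\frac{n(n + 1)}{2}$. Combining relations (7) and (8) we have

$$C_{PMTF} \le \gamma + \sum_{1 \le i \le n} \alpha_i \Big( h_i + \frac{2n - i - 1}{n} \Big). \qquad (9)$$

Recalling that the cost of OPT is given by $C_{OPT} = \max_i (h_i)$, and that $\sum_{1 \le i \le n} \alpha_i = n^2 + 1$, we have

$$C_{PMTF} \le (n^2 + 1) C_{OPT} + \frac{7n^3 - 12n^2 + 11n - 12}{6n} < (n^2 + 1) C_{OPT} + \frac{7}{6} n^2. \qquad (10)$$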
For $n = 1$ we have, from the first inequality in (10): $C_{PMTF} \le 2C_{OPT} - 1$. This is the same result stated in [6] for the sequential case, and is consistent with the fact that the two potential functions coincide for $n = 1$ (see Definition 2). We formulate the result of (10) as a theorem:

Theorem 3. The parallel list update problem can be solved with a deterministic algorithm with a competitivity factor of $n^2 + 1$ over the static optimal algorithm.
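The constants in relation (10) can be checked with exact rational arithmetic. The sketch below takes $\alpha_1 = 2n$, $\alpha_i = 2n + 1 - 2i$ and assumes $\gamma = -n(n+1)/2$, the value obtained by summing the constant terms of relations (4) and (5); it then verifies that the coefficients sum to $n^2 + 1$ and that the additive term of (9) equals $(7n^3 - 12n^2 + 11n - 12)/(6n)$. The function names are hypothetical.

```python
from fractions import Fraction


def alphas(n):
    """Coefficients alpha_1..alpha_n used in relations (7)-(9)."""
    return [2 * n] + [2 * n + 1 - 2 * i for i in range(2, n + 1)]


def additive_constant(n):
    """gamma plus the sum of alpha_i * (2n - i - 1) / n, as in relation (9)."""
    gamma = Fraction(-n * (n + 1), 2)   # assumed value of gamma
    return gamma + sum(Fraction(a * (2 * n - i - 1), n)
                       for i, a in enumerate(alphas(n), start=1))
```

For $n = 1$ the factor is 2 and the additive constant is $-1$, which matches the sequential bound $C_{PMTF} \le 2C_{OPT} - 1$.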
A lower bound to the competitivity factor of our algorithm PMTF can be determined as follows. Take two disjoint subsets $Y$, $X$ of $S$, $Y = \{y_1, \ldots, y_{n-1}\}$, $X = \{x_1, \ldots, x_m\}$, and $m \equiv 1 \bmod n$ (the same argument holds for any value of $m$). Consider a sequence of requests $R = \mathcal{E}\mathcal{E} \ldots \mathcal{E}$, where $\mathcal{E}$ is a subsequence repeated $k > 1$ times, with $\mathcal{E} = E_1 \ldots E_m$, $E_i = \{y_1, \ldots, y_{n-1}, x_i\}$ for $1 \le i \le m$. It can be easily seen that the lists are arranged by OPT in such a way that $y_1^{OPT} = \ldots = y_{n-1}^{OPT} = 1$, and all the elements of $X$ are placed ahead of all the other elements (i.e., $x_i^{OPT} \le z^{OPT}$ for $1 \le i \le m$ and $z \in S \setminus (Y \cup X)$). By easy computations we find:

$$C_{OPT}(\mathcal{E}) \le \frac{m^2}{2n} \Big( 1 + \frac{n}{m} \Big).$$

To compute $C_{PMTF}(R)$, suppose that all elements of $X$ are initially in the same list, say $L_n$, in the order $x_1, \ldots, x_m$. Algorithm PMTF brings $y_1, \ldots, y_{n-1}$ to the head of $L_1, \ldots, L_{n-1}$ when processing the first request, then leaves these elements in place. It also maintains the elements of $X$ in $L_n$ during the whole processing of $R$. The first execution of $\mathcal{E}$ reverts the order of these elements to $x_m, \ldots, x_1$, with a cost of $\frac{m(m + 1)}{2}$. In each of the following repetitions of $\mathcal{E}$, each request $E_i$ brings the last element of $L_n$ to the front of such a list, with a cost of $m^2$ for $\mathcal{E}$. We have

$$C_{PMTF}(R) = \frac{m(m + 1)}{2} + (k - 1) m^2,$$

hence

$$\frac{C_{PMTF}(R)}{C_{OPT}(R)} \ge \frac{2n}{1 + n/m} \quad \text{for } k \to \infty. \qquad (11)$$
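The cost of PMTF on this adversarial sequence can be reproduced with a small simulation. The sketch below tracks only the list $L_n$ that holds $x_1, \ldots, x_m$, under the simplifying assumption that $y_1, \ldots, y_{n-1}$ already sit at the front of the other lists, so that the cost of request $E_i$ is the position of $x_i$ in $L_n$. The function name is hypothetical.

```python
def pmtf_adversary_cost(m, k):
    """Cost of k repetitions of E_1 ... E_m, tracking only the list that holds X."""
    lst = list(range(1, m + 1))       # x_1, ..., x_m in their initial order
    total = 0
    for _ in range(k):
        for i in range(1, m + 1):
            pos = lst.index(i)
            total += pos + 1          # steps needed to reach x_i
            lst.insert(0, lst.pop(pos))
    return total
```

The first repetition costs $m(m+1)/2$ and leaves the list reversed; every later repetition finds each requested element in the last position and costs $m^2$.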
In conclusion, for $n/m \to 0$ we have an upper bound of $n^2 + 1$ and a lower bound of $2n$. It is worth noting that, with the proposed potential function, the asymptotic upper bound $n^2 + 1$ cannot be improved. For example, assume that the initial configuration of $L_1 \ldots L_n$ is the same for OPT and PMTF, and that the first request $R_1 = \{x_1, \ldots, x_n\}$ is such that all elements to be retrieved are in position $m$. Each $x_i$ is brought to the front of its list, giving rise to an inversion with each $z \in S \setminus R_1$. The total number of inversions created is then $(m - 1)n^2 = \Delta\Phi$, hence $C_{PMTF}/C_{OPT} = (m + \Delta\Phi)/m = (n^2 + 1) - n^2/m$, yielding a competitivity factor of $n^2 + 1$ for $n^2/m \to 0$.
3. The randomized algorithm

As already mentioned in Section 1, the use of randomization makes it possible to speed up the solution of LU. Randomization is now introduced in our parallel framework, using a strategy similar to the one of [2]. In the new algorithm PBIT, one random bit $b(x)$ is initially assigned to each element $x \in S$, and is used to decide whether or not $x$ is to be moved to the front. An access to $x$ causes the complementation of $b(x)$; then $x$ is moved iff $b(x) = 0$. Algorithm PBIT is an immediate extension of PMTF, and is not explicitly reported here.

For the analysis of PBIT we introduce a new potential function. An inversion is now defined as an ordered pair $[x, y]$ such that $x^{PBIT} < y^{PBIT}$, but $x^{OPT} > y^{OPT}$ (square brackets denote ordered pairs). If $b(y) = 1$, we say that an inversion $[x, y]$ is of type 1. Otherwise, $[x, y]$ is of type 2. Let $I_1$ and $I_2$ be the total number of inversions of type 1 and 2, respectively. The new potential function $\Phi_R$ is an extension of that given in [2], namely $\Phi_R = \lambda I_R$, where $I_R = I_1 + 2 I_2$ is a weighted number of inversions, and $\lambda$ is a positive constant to be determined later. A variation of $\Phi_R$ may arise when an inversion is created or destroyed, and when its type changes.
The positions of the elements in the lists of PBIT are random variables on the probability space $\{0, 1\}^{nm}$ of the initial (random) bit choices. The variables $u$, $u_i$, $w_{ij}$, $k_i$, $h_i$, $q_i$ of the previous section maintain their meaning; however, they are now random variables, denoted by $U$, $U_i$, $W_{ij}$, $K_i$, $H_i$, $Q_i$. The definitions of the sets $E_i$, $\bar{E}_i$, $\tilde{F}_i$, $\bar{F}_i$, $F_i$ are also maintained, and expressed in terms of $K_i$ and $H_i$. We have:

$$\Delta I_R = U + \sum_{1 \le i \le n} U_i + \sum_{1 \le i < j \le n} W_{ij}. \qquad (12)$$
Let us focus on $U$. If $x_i, x_j \in R_h$, there are two cases:
Case 1: Both, or neither, elements are moved. No inversion is created, hence the contribution to $U$ is $\le 0$.
Case 2: Only one element, say $x_i$, is moved. An inversion $[x_i, x_j]$ may arise. Since $x_j$ is not moved, we have $b(x_j) = 0$ before processing $R_h$. Hence, $b(x_j)$ is set to 1 and the new inversion, if any, is of type 1.
Since cases 1 and 2 occur with the same probability, and there are $n(n - 1)/2$ pairs, we have

$$E[U] \le \frac{n(n - 1)}{4}. \qquad (13)$$

Let us now bound the values of $U_i$.
(1) If $z \in E_i \cap F_i$, the pair $[z, x_i]$ is an inversion. If $x_i$ is moved, the inversion is destroyed; if $x_i$ is not moved, the inversion changes its type. In both cases the potential is decreased by 1.
(2) If $z \in \bar{E}_i \cap \tilde{F}_i$, a new inversion $[x_i, z]$ is created if and only if $x_i$ is moved, which occurs with probability $\frac{1}{2}$. If $x_i$ is moved, the potential is increased by 1 or 2 according to the value of $b(z)$. The expected potential increase is then $\frac{3}{4}$.
There are no other contributions to $U_i$. With easy calculations we have:

$$E[U_i] \le \frac{3}{4} \big( n E[K_i] + i - n - 1 \big) - \frac{7}{4} E[Q_i]. \qquad (14)$$
We finally bound the variables $W_{ij}$. According to the values of $b(x_i)$ and $b(x_j)$, four cases may occur with the same probability, namely:
Case 1: $x_i$ and $x_j$ are moved. Up to $\max(0, K_i - K_j - 1) \le K_i - K_j$ inversions $[z, y]$ may be created, with equal probability of being of type 1 or 2 according to the value of $b(y)$. The expected increase in potential is then $\le \frac{3}{2} E[K_i - K_j]$.
Case 2: $x_i$ and $x_j$ are not moved, hence $W_{ij} = 0$.
Case 3: Only $x_i$ is moved. The expected increase in potential is $\le \frac{3}{2} E[K_i]$.
Case 4: Only $x_j$ is moved. The expected increase in potential is $\le \frac{3}{2} E[K_j]$.
Combining cases 1-4, we get $E[W_{ij}] \le \frac{3}{4} E[K_i]$, hence:

$$E\Big[ \sum_{1 \le i < j \le n} W_{ij} \Big] \le \frac{3}{4} \sum_{1 \le i \le n} (n - i) E[K_i]. \qquad (15)$$

The cost of algorithm PBIT is computed as done for $C_{PMTF}$ (relations (6)-(8)). We summarize the main points.

$$C_{PBIT} \le \gamma + \sum_{1 \le i \le n} E[\alpha_i K_i - \beta_i Q_i], \qquad (16)$$
where $\alpha_1 = 1 + \frac{3\lambda}{4}(2n - 1)$, $\alpha_i = \frac{3\lambda}{4}(2n - i)$ for $i = 2, \ldots, n$, $\beta_i = \frac{7}{4}\lambda$ for $i = 1, \ldots, n$, $\gamma = -\frac{\lambda}{8}(n^2 + 5n)$. Assuming $\alpha_i \le n\beta_i$, we have $1 + \lambda(\frac{3}{2}n - \frac{3}{4}) \le \frac{7}{4}\lambda n$, from which we choose the (smallest possible) value $\lambda = 4/(n + 3)$. The condition $\alpha_i \le n\beta_i$ also implies $\alpha_i K_i - \beta_i Q_i \le \alpha_i (H_i + (2n - i - 1)/n)$, and from relation (16) we have:

$$C_{PBIT} \le \frac{9n^2 - n + 6}{2(n + 3)} C_{OPT} + \frac{13n^3 - 19n^2 + 12n - 12}{2n(n + 3)}, \qquad (17)$$
where algorithm OPT is the same used for the non-randomized case, run by an oblivious adversary (i.e., one who has no knowledge of the initial random bits). In particular, letting $n = 1$ in (17), we obtain the same sequential result of [2], that is $C_{BIT} \le \frac{7}{4} C_{OPT} - \frac{3}{4}$. We formulate the result of (17) as a theorem:

Theorem 4. The parallel list update problem can be solved with a randomized algorithm with a competitivity factor of $\frac{9}{2}n$ against an oblivious adversary.
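The constants of the randomized analysis can be checked in the same exact-arithmetic style used for the deterministic case. The sketch below assumes $\lambda = 4/(n+3)$, $\alpha_1 = 1 + \frac{3\lambda}{4}(2n-1)$, $\alpha_i = \frac{3\lambda}{4}(2n-i)$, $\beta_i = \frac{7}{4}\lambda$ and $\gamma = -\frac{\lambda}{8}(n^2+5n)$, and recomputes the multiplicative factor and the additive term of relation (17). The function name and the returned tuple layout are hypothetical.

```python
from fractions import Fraction


def pbit_constants(n):
    """Return (lambda, alphas, beta, factor, additive) for the PBIT bound."""
    lam = Fraction(4, n + 3)
    w = Fraction(3, 4) * lam
    alpha = [1 + w * (2 * n - 1)] + [w * (2 * n - i) for i in range(2, n + 1)]
    beta = Fraction(7, 4) * lam
    gamma = -lam * Fraction(n * n + 5 * n, 8)
    factor = sum(alpha)                       # multiplies C_OPT in (17)
    additive = gamma + sum(a * Fraction(2 * n - i - 1, n)
                           for i, a in enumerate(alpha, start=1))
    return lam, alpha, beta, factor, additive
```

For $n = 1$ this gives $\lambda = 1$, a factor of $\frac{7}{4}$ and an additive term of $-\frac{3}{4}$, in agreement with the sequential bound for BIT. For growing $n$ the factor behaves like $\frac{9}{2}n$.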
To establish a lower bound to the competitivity factor of our algorithm PBIT, we use the same sequence of requests $R = \mathcal{E}\mathcal{E} \ldots \mathcal{E}$ ($k$ times) introduced in Section 2, with the only difference that we now take $\mathcal{E}$ to be $\mathcal{E} = E_1 E_1 E_2 E_2 \ldots E_m E_m$. We have

$$C_{OPT}(\mathcal{E}) \le \frac{m^2}{n} \Big( 1 + \frac{n}{m} \Big)$$

(see Section 2). Since each $E_i$ is called twice, element $x_i$ ends up at the front of $L_n$ independently of the initial value of $b(x_i)$. Hence, after one execution of $\mathcal{E}$, the elements $x_i$ are in reverse order $x_m, \ldots, x_1$. Moreover, since each $E_i$ is repeated twice, the second request may cost as much as the first, or 1, according to whether $x_i$ is moved to the front after the first or after the second request. It is easy to verify that

$$C_{PBIT}(R) = (k - 1) \frac{m(3m + 1)}{2} + \frac{3m^2 + 5m}{4},$$

hence

$$\frac{C_{PBIT}(R)}{C_{OPT}(R)} \ge \frac{3n}{2(1 + n/m)} \quad \text{for } k \to \infty. \qquad (18)$$
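The expected cost of PBIT on the doubled sequence can be computed exactly for small $m$ by enumerating all initial bit assignments. As in the deterministic case, the sketch below tracks only the list holding $x_1, \ldots, x_m$ and assumes that the cost of each request is the position of the requested $x_i$; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def pbit_adversary_expected_cost(m, k):
    """Exact expected cost of k repetitions of E_1 E_1 ... E_m E_m over all bit choices."""
    total = 0
    for initial_bits in product((0, 1), repeat=m):
        bits = list(initial_bits)
        lst = list(range(m))              # x_1, ..., x_m as indices 0..m-1
        for _rep in range(k):
            for i in range(m):
                for _twice in range(2):   # each E_i is requested twice
                    pos = lst.index(i)
                    total += pos + 1
                    bits[i] ^= 1          # complement b(x_i)
                    if bits[i] == 0:      # move to front iff b(x_i) = 0
                        lst.insert(0, lst.pop(pos))
    return Fraction(total, 2 ** m)        # average over equally likely bit vectors
```

Since each bit is complemented twice per pair of requests, every element is moved exactly once per pair, and the bits return to their initial values after each repetition. The enumeration is exponential in $m$ and is meant only as a check on small instances.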
Therefore, we obtain from (17) and (18) an upper bound of $\frac{9}{2}n$ and a lower bound of $\frac{3}{2}n$ to the competitivity factor. The huge gap found for the non-randomized algorithm is now drastically reduced.
4. Concluding remarks

We have introduced the list update problem in a parallel setup, namely searching in parallel $n$ elements in $n$ lists. For this problem, we have proposed and analyzed an algorithm based on the move to front (MTF) strategy. The analysis is based on a proper potential function, and shows that the algorithm is $(n^2 + 1)$-competitive against an optimal static algorithm. The use of randomization reduces the competitivity factor to $\frac{9}{2}n$. The corresponding lower bounds are $2n$ and $\frac{3}{2}n$, respectively. As a side result, the same competitivity factors of [6] and [2] are derived for the sequential case ($n = 1$).

The extension of the list update problem to the retrieval of sets of $n$ elements was originally introduced in [1], where the search for each set was performed sequentially on a single list. The upper and lower bounds derived in [1] for the competitivity factor are $n + 1$ and 2, respectively. We use $n$ EREW-PRAM processors to search in $n$ lists. Among the possible extensions, we mention as relevant and non-trivial the use of $n'$ processors, where this number is chosen independently of the number $n$ of lists. The adoption of a parallel strategy different from MTF should also be considered.
References

[1] F. D'Amore and V. Liberatore, The list update problem and the retrieval of sets, in: Proc. SWAT'92, Helsinki, 1992.
[2] S. Irani, N. Reingold, J. Westbrook and D.D. Sleator, Randomized competitive algorithms for the list update problem, in: Proc. 2nd ACM-SIAM Ann. Symp. on Discrete Algorithms (1991) 251-260.
[3] R. Karp and V. Ramachandran, Parallel algorithms for shared memory machines, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science (North-Holland, Amsterdam, 1990) 869-941.
[4] D.E. Knuth, The Art of Computer Programming, Vol. 3 (Addison-Wesley, Reading, MA, 1973).
[5] F. Luccio and A. Pedrotti, A parallel list update problem, Tech. Rept. TR 28/93, Dipartimento di Informatica, Università di Pisa, 1993.
[6] D.D. Sleator and R.E. Tarjan, Amortized efficiency of list update and paging rules, Comm. ACM 28 (2) (1985) 202-208.