Optimal strategies for the list update problem under the MRM alternative cost model

Information Processing Letters 112 (2012) 218–222 Contents lists available at SciVerse ScienceDirect Information Processing Letters www.elsevier.com...



Alexander Golynski (a), Alejandro López-Ortiz (b,*)

(a) Google Inc., New York City, United States
(b) University of Waterloo, Waterloo, Ontario, Canada

Article info

Article history:
Received 8 July 2010
Received in revised form 16 November 2011
Accepted 2 December 2011
Available online 6 December 2011
Communicated by B. Doerr

Keywords: Online algorithms; List update; Dynamic programming

Abstract

We give an explicit representation for the offline optimum strategy for list update under the MRM model of Martínez and Roura [C. Martínez, S. Roura, On the competitiveness of the move-to-front rule, Theoret. Comput. Sci. 242 (1–2) (2000) 313–325] and Munro [J.I. Munro, On the competitiveness of linear search, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: Lecture Notes in Comput. Sci., vol. 1879, 2000, pp. 338–345] and give an $O(n^3)$ algorithm to compute it. This is in contrast to the standard model of Sleator and Tarjan [D.D. Sleator, R.E. Tarjan, Amortized efficiency of list update and paging rules, Commun. ACM 28 (2) (1985) 202–208], under which computing the offline optimum was shown to be NP-hard [C. Ambühl, Offline list update is NP-hard, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: Lecture Notes in Comput. Sci., vol. 1879, 2000, pp. 42–51]. This algorithm follows from a new characterization theorem for realizable visiting sequences in the MRM model. © 2011 Elsevier B.V. All rights reserved.

* Corresponding author: A. López-Ortiz. doi:10.1016/j.ipl.2011.12.001

1. Introduction

List update, also known as list access, is a fundamental problem in the context of online computation [1,6,8,10]. Consider an unsorted list $L$ of items. An online list update algorithm $A$ is a strategy for reordering the elements of $L$ after each access. The input to the algorithm is an access sequence $X = x_1, x_2, \ldots, x_m$ that must be served in an online manner. To serve a request to an item $x_j$, algorithm $A$ linearly searches the list until it finds $x_j$. If $x_j$ is the $i$th item in the list, $A$ incurs a cost of $i$ to access $x_j$. In the standard cost model the algorithm may traverse the list past this position for reorganization purposes (see e.g. [10,1]). The cost of a reorganization is as follows: immediately after the access to $x_j$, $A$ can move this item to any position closer to the front of the list at no extra cost. This is called a free exchange. $A$ can also exchange any two consecutive items accessed in this step at a cost of 1. These are called paid exchanges. An efficient algorithm can thus use free and paid exchanges to minimize the overall cost of serving a sequence.

While list update algorithms have been widely studied under this framework, the validity of the cost model has been debated. More precisely, Martínez and Roura [7] and Munro [9] independently addressed this issue. Let $(a_1, a_2, \ldots, a_l)$ be the list currently maintained by an algorithm $A$. They argued that in a realistic setting a complete rearrangement of all items in the list which precede item $a_i$ would require time proportional to $i$, while this has cost proportional to $i^2$ in the standard cost model. Munro provided the example of accessing the last item of the list and then reversing the entire list. The real cost of this operation in an array or a linear linked list should be $O(l)$, while it costs about $l^2/2$ in the standard cost model. As a consequence, their main objection to the standard model is that it prevents online algorithms from using their true power. They instead proposed a new model in which the cost of accessing the $i$th item of the list plus the cost of reorganizing the first $i$ items is linear in $i$. We will refer to this model as the MRM cost model (see [4,5] for further study of this model).

Surprisingly, it turns out that the offline optimum benefits substantially more from this realistic adjustment than the online algorithms do. Indeed, under MRM, every online algorithm has amortized cost of $\Theta(l)$ per access for some arbitrarily long sequences, while an optimal offline algorithm incurs a cost of $\Theta(\log l)$ per access on every sequence, and hence all online list update algorithms have a competitive ratio of $\Omega(l / \log l)$. One may be tempted to argue that this is proof that the new model makes the offline optimum too powerful and hence that this power should be removed. However, this is not correct, as in real life online algorithms can rearrange items at the cost indicated.

In this note we give an $O(n^3)$ algorithm for computing the offline optimum under the MRM model. This is in contrast to the classical list update model, where computing the offline optimum was shown to be NP-hard by Ambühl [3]. The algorithm is based on the concept of a visiting sequence, which records the number of elements visited by a list update algorithm in each access. More formally, the visiting cost $w_i$ of an element $x_i$ under an algorithm $A$ is the number of elements accessed in the list during the access to $x_i$. Observe that if the element is in position $j$ in the list, it follows naturally that $w_i \ge j$, since we need to traverse the list from left to right to at least position $j$. Note that the algorithm might choose to go past this position in order to rearrange the list, hence $w_i$ may be strictly greater than $j$.

Definition 1. Given an access sequence $X = x_1, x_2, \ldots, x_n$ and a list update algorithm $A$, $w_i^A$ denotes the number of elements visited during the linear traversal of the list while servicing the request $x_i$ using algorithm $A$. The visiting sequence of $A$ for $X$ is the sequence of visiting costs $W^A = \langle w_1^A, \ldots, w_n^A \rangle$.
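As a concrete instance of Definition 1, the sketch below computes the visiting sequence of the move-to-front rule (MTF), which never scans past the requested item, so each visiting cost is simply the item's current position. The function name `mtf_visiting_sequence` and the use of single-character item labels are illustrative assumptions, not part of the paper.

```python
from typing import List, Sequence


def mtf_visiting_sequence(initial: Sequence[str], requests: Sequence[str]) -> List[int]:
    """Visiting costs w_1..w_n of move-to-front (hypothetical helper)."""
    lst = list(initial)
    visits = []
    for item in requests:
        pos = lst.index(item) + 1          # 1-based position = elements scanned
        visits.append(pos)
        lst.insert(0, lst.pop(pos - 1))    # move the accessed item to the front
    return visits


if __name__ == "__main__":
    print(mtf_visiting_sequence("abcde", "bcaaaa"))
```

An algorithm that scans past the requested item in order to rearrange a longer prefix would report a larger value than `pos` at that step.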

In what follows, we omit the superscript $A$ if the choice of algorithm is clear from the context. Observe that for the move-to-front rule (MTF) the cost of accessing the $i$th element in the sequence is the same as the visiting cost, since no elements are visited past the currently accessed element. The next example illustrates the definition of visiting sequence for an algorithm that visits elements past the currently accessed element. Recall that the classical timestamp algorithm [2] moves the last accessed item $x$ forward until it precedes the first item that was requested at most once since the next-to-last request to $x$. Consider now the following active variant of the timestamp algorithm.

Example. The active variant of timestamp maintains the timestamps of the last two accesses to each item, and after each access the entire list is insertion-sorted by relative age. Informally, we say that $x$ is older than $y$ if the last access to $x$ is prior to the last access to $y$ and enough time has gone by to assume that $y$ is likelier to reoccur before $x$ does, judging from the last two accesses. Formally, let $T_{\mathrm{last}}(x)$ and $T_{\mathrm{next}}(x)$ denote the times of the last and next-to-last accesses to item $x$; then we say that $\mathrm{age}(x) > \mathrm{age}(y)$ if

$$T_{\mathrm{last}}(x) < T_{\mathrm{next}}(y)$$

or

$$T_{\mathrm{last}}(x) < T_{\mathrm{last}}(y) \quad \text{and} \quad T_{\mathrm{now}} - T_{\mathrm{last}}(x) > T_{\mathrm{last}}(y) - T_{\mathrm{next}}(y).$$

Let $X = b, c, a, a, a, a$ be a request sequence starting from the configuration $(a, b, c, d, e)$. Timestamps are initialized as if the previous ten accesses had been $e, d, c, b, a, e, d, c, b, a$. The sequence of resulting configurations is shown in Table 1.

Table 1. Configuration after each requested item. Note that in the last row the list has been reordered past the currently accessed item.

| Item | Configuration | Access cost | Visiting cost |
|------|---------------|-------------|---------------|
| b | a, b, c, d, e | 2 | 2 |
| c | a, b, c, d, e | 3 | 3 |
| a | a, b, c, d, e | 1 | 1 |
| a | a, b, c, d, e | 1 | 1 |
| a | a, b, c, d, e | 1 | 1 |
| a | a, c, b, d, e | 1 | 3 |

Definition 2. Given an initial list configuration $L$, a request sequence $X$ and a visiting sequence $W = \langle w_i \rangle$, we say that the triplet $(L, X, W)$ is realizable if there exists a (possibly offline) rearrangement strategy of the list such that for any step $i$ we have:

• the requested element $x_i$ is found at a position $j \le w_i$ in the list, and

• no element past position $w_i$ is relocated in this step.

As one would expect, not all triplets $(L, X, W)$ are realizable. We illustrate this fact with two sample visiting sequences $W$ and $W'$, both starting from the same initial configuration $L$ and request sequence $X$. One visiting sequence leads to a realizable triplet; the other does not.

Example. Let the initial list configuration $L$ be $(f, a, e, c, b, d)$ and consider the request sequence $X = e, c, e, e, f$ and visiting sequences $W = 4, 1, 2, 1, 3$ and $W' = 3, 3, 2, 2, 2$. The triplet $(L, X, W)$ is realized, for example, by an algorithm with the following intermediate list configurations: $(c, e, f, a, b, d)$, $(c, e, f, a, b, d)$, $(e, c, f, a, b, d)$, and lastly $(e, c, f, a, b, d)$. On the other hand, the triplet $(L, X, W')$ cannot be realized by any list update algorithm, as can be shown by inspection: the first access may only rearrange the first three positions, so $c$ is still at position 4 when it is requested with $w'_2 = 3$.
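Definition 2 can be explored on small instances by simulating one concrete strategy. The sketch below is a hypothetical helper, not the paper's algorithm verbatim: at step $i$ it checks that $x_i$ lies within the first $w_i$ positions and then sorts exactly those positions by time of next request, leaving everything past $w_i$ untouched. A successful run exhibits a witness strategy, so the triplet is realizable. A `None` result only shows that this particular strategy gets stuck, although Section 2 argues that a strategy of this kind suffices.

```python
from typing import List, Optional, Sequence


def simulate_prefix_sorting(
    initial: Sequence[str], requests: Sequence[str], visits: Sequence[int]
) -> Optional[List[List[str]]]:
    """Configurations after each request, or None if a request is out of reach."""
    lst = list(initial)
    n = len(requests)
    configs = []
    for i in range(n):
        w = visits[i]
        if requests[i] not in lst[:w]:
            return None  # x_i lies past position w_i

        def next_use(item: str) -> int:
            # Index of the next request to `item` after step i; n if none.
            for t in range(i + 1, n):
                if requests[t] == item:
                    return t
            return n

        # sorted() is stable, so items never requested again keep their order.
        lst[:w] = sorted(lst[:w], key=next_use)
        configs.append(list(lst))
    return configs


if __name__ == "__main__":
    print(simulate_prefix_sorting("faecbd", "eceef", [4, 1, 2, 1, 3]))
    print(simulate_prefix_sorting("faecbd", "eceef", [3, 3, 2, 2, 2]))
```

On the example above, the first call reproduces the intermediate configurations listed for $W$, while the second call stops at the request for $c$.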

2. A characterization theorem

As before, let $X = x_1, x_2, \ldots, x_n$ and $W = \langle w_1, \ldots, w_n \rangle$ denote an access sequence and a visiting sequence over a universe $U$. Let $X_{i:j} = x_i, x_{i+1}, \ldots, x_j$ be the subsequence of $X$ between requests $i$ and $j$, inclusive. Let $d_{i:j}$ denote the number of distinct elements requested in the subsequence $X_{i:j}$. In this section we characterize all realizable triplets $(L, X, W)$. That is, we determine if there exists an algorithm $A$ that realizes the given triplet. A full characterization theorem is of independent interest.

Consider first triplets in which the initial configuration happens to have the items listed by order of first request. The notion of a list "sorted by order of first request" is rather simple and as such is best understood by means of an example. To this end consider the request sequence $X = c, a, c, e, c, a, f, d$; then the list sorted by order of first request is $(c, a, e, f, d)$, as $c$ is the first element ever requested, $a$ is the second element ever requested, $e$ is the third (distinct) element ever requested, $f$ is the fourth element ever requested and $d$ is the last distinct element requested.

We also introduce a slightly weaker version of this condition. We say that a list is weakly sorted if, after having visited up to position $w_i$ in the list, all elements between positions 1 and $w_i$ appear in increasing order by next request in the sequence $X_{i+1:n}$.

Example. Consider the list configuration $(f, d, c, e, b, a)$ immediately after the request for $x_6 = c$ with visiting cost 4 in the request sequence $X = c, a, b, e, c, c, f, d, d, f, c, f, e$. The remaining request sequence is $X_{7:n} = f, d, d, f, c, f, e$. The list configuration is weakly sorted, as the elements in the sublist $C_{1:4} = (f, d, c, e)$ appear in increasing order by next request in $X_{7:n}$.

We first characterize realizable triplets for algorithms on lists with initial configuration sorted by order of first request. Crucial in the proof of this theorem is the fact that after examining $w_i$ elements in the list, only the elements appearing in positions $1, \ldots, w_i$ may possibly have moved forward in the list configuration. This property holds both for the MRM and the standard cost model.

Theorem 1. Let $L$ be an initial list configuration sorted by order of first request, $X = x_1, \ldots, x_n$ be a given access sequence and let $W = \langle w_i \rangle$ be a visiting sequence. The triplet $(L, X, W)$ is realizable if and only if the furthest position reached while servicing a subsequence $X_{i:j}$ is at least as large as the number of distinct elements in it. That is,

$$\max_{i \le k \le j} \{ w_k \} \ge d_{i:j} \quad \text{for all } 1 \le i < j \le n.$$
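The condition of Theorem 1 can be tested directly by enumerating all windows $X_{i:j}$. The sketch below is one straightforward $O(n^2)$ way to do so, assuming the initial list is sorted by order of first request as the theorem requires. The name `satisfies_theorem1` is a hypothetical label.

```python
from typing import Sequence


def satisfies_theorem1(requests: Sequence[str], visits: Sequence[int]) -> bool:
    """Check max_{i<=k<=j} w_k >= d_{i:j} for every window of the request sequence."""
    n = len(requests)
    for i in range(n):
        seen = set()      # distinct items in X_{i:j}
        furthest = 0      # max of w_i..w_j
        for j in range(i, n):
            seen.add(requests[j])
            furthest = max(furthest, visits[j])
            if furthest < len(seen):
                return False
    return True


if __name__ == "__main__":
    # X = c,a,c,e,c,a,f,d served on the static list (c,a,e,f,d).
    print(satisfies_theorem1("cacecafd", [1, 2, 1, 3, 1, 2, 4, 5]))
    print(satisfies_theorem1("cacecafd", [1, 1, 1, 1, 1, 1, 1, 1]))
```

The first visiting sequence is the one produced by never reordering the list $(c, a, e, f, d)$, so it must pass. The second never looks past the first position and already fails on the window $c, a$.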

Proof. (⇒) This follows directly: if on a given subsequence the algorithm visits fewer than $d_{i:j}$ elements, then it cannot possibly reach all $d_{i:j}$ distinct elements requested in the list, and the algorithm is not correct.

(⇐) We show that a strategy which keeps the list "weakly sorted" by order of next request satisfies the conditions of the theorem. To maintain weak sortedness, the algorithm moves the most recently requested element $x_i$ to its proper position according to rank by its next request in the sublist from position 1 to $w_i$. In the process the strategy also rearranges any elements appearing before position $w_i$ which are not already weakly sorted. Since the list is initially sorted by order of next request, clearly we can serve the first element, as $w_i \ge 1$ for all $1 \le i \le n$. Now consider the $i$th request in the sequence. Observe that since $\max_{1 \le j \le i} \{ w_j \} \ge d_{1:i}$ and the list was initially sorted by order of next request, $x_i$ must necessarily have been visited by the algorithm at some point in the past up to and including this last step. If $x_i$ is visited in the last step, then the triplet is realizable through the $i$th request and there is nothing to show. Otherwise, let $k$ be the last visit to $x_i$, which took place at some point in the past. At that point, by the weak sortedness property, $x_i$ was moved forward to a position no further than $d_{k+1:i}$. From the hypotheses we know that $\max_{k+1 \le j \le i} \{ w_j \} \ge d_{k+1:i}$, so at some point between time $k+1$ and $i$ the element $x_i$ is visited again. However, since $k$ was the last visit prior to request $i$, the value $w_j$ that is at least $d_{k+1:i}$ can only be $w_i$, and hence the strategy can serve the present request $i$. This shows that an arbitrary request $i$ can be serviced, and the theorem follows by induction. □

The theorem above only characterizes realizable triplets whose initial configuration is sorted by order of first request. The next theorem characterizes all realizable triplets without restriction on the initial configuration. In the proof we use the previous theorem to characterize this set.

Theorem 2. A triplet $(L, X, W)$ is realizable if and only if the following two conditions hold:

• $\max_{i \le k \le j} \{ w_k \} \ge d_{i:j}$ for all $1 \le i < j \le n$, and

• for all $\ell$, $\max_{1 \le j \le \ell} \{ w_j \} \ge p_\ell$, where $\ell$ is the index of a request to an item that has not been requested before, $p_\ell$ denotes the position of that item in the initial list configuration $L$, and $W = \langle w_i \rangle$.

Proof. (⇒) As before, every correct algorithm has to visit at least $d_{i:j}$ elements between requests $i$ and $j$, as well as $p_\ell$ elements from the initial configuration, in order to reach each requested element in the list.

(⇐) Conversely, we must show that if the visiting sequence $\langle w_i \rangle$ satisfies the conditions, then it is possible to serve the request sequence $X$. We will show that the algorithm from the proof of the theorem above, starting now from the given arbitrary configuration, achieves this. Recall that the algorithm maintains a weakly sorted list. Every initial request is met since either (a) the requested element has already been visited and has been moved forward to its position by order of first request, and thus satisfies the conditions of Theorem 1, or (b) it has not yet been visited and it remains in its initial position $p_\ell$. The latter means that $w_j < p_\ell$ for $1 \le j \le \ell - 1$, and hence the current $w_\ell$ must be large enough to reach $p_\ell$, since the hypothesis states that $\max_{1 \le j \le \ell} \{ w_j \} \ge p_\ell$. Thus the element can be reached and hence served in the present request. For non-initial requests, the list is weakly sorted up to element $x_\ell$ and the sequence is realizable by Theorem 1. □

3. Offline optimum

In this section we show that in the MRM model the offline optimum can be computed in polynomial time. This is in contrast to the standard model, for which computing the offline optimum is known to be NP-hard [3]. The algorithm we describe in this section works both for the case where an initial configuration is given and for the case where the initial configuration is to be chosen by the offline algorithm.

In general, given an arbitrary request sequence $Y$, we denote by $\mathrm{opt}(Y)$ the lowest possible cost to serve $Y$ starting from a list arranged in some ideal order. In what follows, given a fixed request sequence $X$, for convenience of notation we will shorten $\mathrm{opt}(X_{i:j})$ to $\mathrm{opt}(i, j)$. We compute the offline optimum using a dynamic programming algorithm by first computing the minimum cost over the sequence $X$, i.e. $\mathrm{opt}(X)$, and then, as usual, reconstructing the sequence of reorderings for $X$ from the dynamic programming table.

Let $R$ denote the sequence of reorderings after each move by the offline optimum on $X$. It follows then that $|R| = \mathrm{opt}(X)$, and in general the decomposition of the global offline optimum on the input sequence $X$ is given by

$$\mathrm{opt}(1, n) = |R_{1:i}| + |R_{i+1:n}|,$$

where $|R_{i:j}|$ denotes the number of operations performed by the global offline optimum $R$ during the requests corresponding to the subsequence $X_{i:j}$. It now follows from the definition of $\mathrm{opt}(i, j)$ that

$$\mathrm{opt}(1, n) \ge \mathrm{opt}(1, i) + \mathrm{opt}(i+1, n).$$

This expression does not (yet) yield an algorithm to compute the optimal sequence, since the starting configuration for the rightmost term, namely $\mathrm{opt}(i+1, n)$, does not necessarily match the final configuration of the leftmost term. To this end, observe that we can rearrange the final configuration of $\mathrm{opt}(1, i-1)$ into the initial configuration of $\mathrm{opt}(i+1, n)$ at a cost of at most $d_{1:n}$ steps, which can be charged to the cost of servicing the $i$th request. This implies that, for all $i$,

$$\mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n) \ge \mathrm{opt}(1, n).$$

Similarly, if we consider the number of elements $w_i^{\mathrm{opt}}$ visited by $\mathrm{opt}(1, n)$ during the $i$th access, we obtain

$$\mathrm{opt}(1, n) = |R_{1:i-1}| + w_i^{\mathrm{opt}} + |R_{i+1:n}| \ge \mathrm{opt}(1, i-1) + w_i^{\mathrm{opt}} + \mathrm{opt}(i+1, n).$$

Now from the characterization theorem above we know that there exists $i$ such that $w_i^{\mathrm{opt}} \ge d_{1:n}$, and hence there exists $i$ such that

$$\mathrm{opt}(1, n) = |R_{1:i-1}| + w_i^{\mathrm{opt}} + |R_{i+1:n}| \ge |R_{1:i-1}| + d_{1:n} + |R_{i+1:n}| \ge \mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n).$$

The last two inequalities together give the desired dynamic programming recurrence:

$$\mathrm{opt}(1, n) = \min_{1 \le i \le n} \{ \mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n) \}$$

and in general

$$\mathrm{opt}(i, j) = \min_{i \le k \le j} \{ \mathrm{opt}(i, k-1) + d_{i:j} + \mathrm{opt}(k+1, j) \},$$

or equivalently,

$$\mathrm{opt}(i, j) = d_{i:j} + \min_{i \le k \le j} \{ \mathrm{opt}(i, k-1) + \mathrm{opt}(k+1, j) \}.$$

From this recurrence it follows that the offline optimum can be computed in $O(n^3)$ time. In fact this can be slightly improved. The first step consists in identifying all values of $i \le k \le j$ for which $d_{i:k} = d_{i:j}$. It is not hard to see that the recurrence does not need to be evaluated for those values of $k$. The values $d_{i:j}$ can be computed in $O(n^2)$ time, since $d_{i:j}$ is a stepwise monotonic function. However, in the worst case the recurrence is still evaluated for $\Theta(n)$ values of $k$. These pairs are stored as marked entries in an upper triangular $n \times n$ matrix $M$, with each entry pointing to its nearest marked predecessor in the same row. The algorithm then partitions the request sequence using the marked entry pointed to by $M[1, n]$, say entry $M[i, n]$, and proceeds recursively to the entries pointed to by $M[1, i-1]$ and $M[i+1, n]$, and so on. Lastly, the reordering sequence is recovered by following the sequence of partition values and weakly sorting the $w_k^A$ items visited during the current step. This gives the following theorem.

Theorem 3. The offline optimum for the MRM alternative list update model can be computed in $O(n^3)$ time.

Example. Consider the request sequence $X = e, b, e, b, c, d, d, e, c, a$. Table 2 gives the sequence of offline configurations to serve $X$ optimally. The total cost over the entire sequence is twenty, which can be shown to be optimal using the algorithm above.

Table 2. Offline optimum configuration after each requested item. The initial configuration is e, b, c, d, a.

| Item | Configuration | Access cost | Visiting cost |
|------|---------------|-------------|---------------|
| e | e, b, c, d, a | 1 | 1 |
| b | e, b, c, d, a | 2 | 2 |
| e | e, b, c, d, a | 1 | 1 |
| b | e, b, c, d, a | 2 | 2 |
| c | d, e, c, b, a | 3 | 4 |
| d | d, e, c, b, a | 1 | 1 |
| d | d, e, c, b, a | 1 | 1 |
| e | d, e, c, b, a | 2 | 2 |
| c | a, d, e, c, b | 3 | 5 |
| a | a, d, e, c, b | 1 | 1 |

4. Conclusions

We gave a characterization theorem and, from it, a polynomial time algorithm that computes the offline optimum for the MRM alternative model of list update in $O(n^3)$ time.

Acknowledgements

We thank Ian Munro for early discussions on this problem.
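As an illustration of the recurrence from Section 3, the following sketch evaluates $\mathrm{opt}(i, j) = d_{i:j} + \min_{i \le k \le j} \{ \mathrm{opt}(i, k-1) + \mathrm{opt}(k+1, j) \}$ bottom-up. It makes two assumptions that the text leaves implicit: an empty subsequence costs 0, and the initial list order is chosen freely by the offline algorithm. It is the plain cubic evaluation, without the marked-entry refinement, and the name `mrm_offline_optimum` is hypothetical.

```python
from typing import Sequence


def mrm_offline_optimum(requests: Sequence[str]) -> int:
    """Cost of the offline optimum under the recurrence of Section 3 (sketch)."""
    n = len(requests)
    if n == 0:
        return 0

    # distinct[i][j] = d_{i:j} for 0-based, inclusive indices i <= j.
    distinct = [[0] * n for _ in range(n)]
    for i in range(n):
        seen = set()
        for j in range(i, n):
            seen.add(requests[j])
            distinct[i][j] = len(seen)

    # opt[i][j] uses 1-based indices; entries with i > j stay 0 (assumed base case).
    opt = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            best = min(opt[i][k - 1] + opt[k + 1][j] for k in range(i, j + 1))
            opt[i][j] = distinct[i - 1][j - 1] + best
    return opt[1][n]


if __name__ == "__main__":
    print(mrm_offline_optimum("ebebcddeca"))
```

On the request sequence of Table 2 this evaluates to 20, matching the total visiting cost reported there.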


References

[1] S. Albers, J. Westbrook, Self-organizing data structures, in: Online Algorithms: The State of the Art, in: LNCS, vol. 1442, 1998, pp. 13–51.
[2] S. Albers, Improved randomized on-line algorithms for the list update problem, in: Proc. 6th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA 1995), 1995, pp. 412–419.
[3] C. Ambühl, Offline list update is NP-hard, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: LNCS, vol. 1879, 2000, pp. 42–51.
[4] S. Angelopoulos, R. Dorrigiv, A. López-Ortiz, List update with locality of reference, in: Proc. 8th Latin American Symposium on Theoretical Informatics (LATIN 2008), in: LNCS, vol. 4957, Springer, 2008, pp. 399–410.
[5] R. Dorrigiv, M. Ehmsen, A. López-Ortiz, Parameterized analysis of paging and list update algorithms, in: Proc. 7th International Workshop on Approximation and Online Algorithms (WAOA 2009), in: LNCS, vol. 5893, Springer, 2009, pp. 104–115.
[6] G.H. Gonnet, J.I. Munro, H. Suwanda, Towards self-organizing linear search, in: Proceedings of the 20th Annual IEEE Symposium on Foundations of Computer Science (FOCS 1979), IEEE, 1979, pp. 169–174.
[7] C. Martínez, S. Roura, On the competitiveness of the move-to-front rule, Theoretical Computer Science 242 (1–2) (July 2000) 313–325.
[8] J. McCabe, On serial files with relocatable records, Operations Research 12 (1965) 609–618.
[9] J.I. Munro, On the competitiveness of linear search, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: LNCS, vol. 1879, 2000, pp. 338–345.
[10] D.D. Sleator, R.E. Tarjan, Amortized efficiency of list update and paging rules, Communications of the ACM 28 (2) (February 1985) 202–208.
