Optimal strategies for the list update problem under the MRM alternative cost model


Information Processing Letters 112 (2012) 218–222


Information Processing Letters www.elsevier.com/locate/ipl

Alexander Golynski a, Alejandro López-Ortiz b,∗

a Google Inc., New York City, United States
b University of Waterloo, Waterloo, Ontario, Canada

Article info

Article history: Received 8 July 2010. Received in revised form 16 November 2011. Accepted 2 December 2011. Available online 6 December 2011. Communicated by B. Doerr.

Keywords: Online algorithms, List update, Dynamic programming

Abstract

We give an explicit representation for the offline optimum strategy for list update under the MRM model of Martínez and Roura [C. Martínez, S. Roura, On the competitiveness of the move-to-front rule, Theoret. Comput. Sci. 242 (1–2) (2000) 313–325] and Munro [J.I. Munro, On the competitiveness of linear search, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: Lecture Notes in Comput. Sci., vol. 1879, 2000, pp. 338–345] and give an $O(n^3)$ algorithm to compute it. This is in contrast to the standard model of Sleator and Tarjan [D.D. Sleator, R.E. Tarjan, Amortized efficiency of list update and paging rules, Commun. ACM 28 (2) (1985) 202–208], under which computing the offline optimum was shown to be NP-hard [C. Ambühl, Offline list update is NP-hard, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: Lecture Notes in Comput. Sci., vol. 1879, 2000, pp. 42–51]. The algorithm follows from a new characterization theorem for realizable visiting sequences in the MRM model. © 2011 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail address: [email protected] (A. López-Ortiz). doi:10.1016/j.ipl.2011.12.001

1. Introduction

List update, also known as list access, is a fundamental problem in the context of online computation [1,6,8,10]. Consider an unsorted list $L$ of $l$ items. An online list update algorithm $A$ is a strategy for reordering the elements of $L$ after each access. The input to the algorithm is an access sequence $X = \langle x_1, x_2, \ldots, x_m \rangle$ that must be served in an online manner. To serve a request to an item $x_j$, algorithm $A$ linearly searches the list until it finds $x_j$. If $x_j$ is the $i$th item in the list, $A$ incurs a cost $i$ to access $x_j$. In the standard cost model the algorithm may traverse the list past this position for reorganization purposes (see e.g. [10,1]). The cost of a reorganization is as follows: immediately after the access to $x_j$, $A$ can move this item to any position closer to the front of the list at no extra cost. This is called a free exchange. $A$ can also exchange any other two consecutive items accessed in this step at a cost of 1. These are called paid exchanges. An efficient algorithm can thus use free and paid exchanges to minimize the overall cost of serving a sequence.

While list update algorithms have been widely studied under this framework, the validity of the cost model has been debated. More precisely, Martínez and Roura [7] and Munro [9] independently addressed this issue. Let $(a_1, a_2, \ldots, a_l)$ be the list currently maintained by an algorithm $A$. They argued that in a realistic setting a complete rearrangement of all items in the list which precede item $a_i$ would require time proportional to $i$, while this has cost proportional to $i^2$ in the standard cost model. Munro provided the example of accessing the last item of the list and then reversing the entire list. The real cost of this operation in an array or a linked list should be $O(l)$, while it costs about $l^2/2$ in the standard cost model. As a consequence, their main objection to the standard model is that it prevents online algorithms from using their true power. They instead proposed a new model in which the cost of accessing the $i$th item of the list plus the cost of reorganizing the first $i$ items is linear in $i$. We will refer to this model as the MRM cost model (see [4,5] for further
study of this model). Surprisingly, it turns out that the offline optimum benefits substantially more from this realistic adjustment than the online algorithms do. Indeed, under MRM, every online algorithm has amortized cost of $\Theta(l)$ per access for some arbitrarily long sequences, while an optimal offline algorithm incurs a cost of $\Theta(\log l)$ on every sequence, and hence all online list update algorithms have a competitive ratio of $\Omega(l/\log l)$. One may be tempted to argue that this proves that the new model makes the offline optimum too powerful and hence that this power should be removed. However, this is not correct, since in real life online algorithms can rearrange items at the cost indicated.

In this note we give an $O(n^3)$ algorithm for computing the offline optimum under the MRM model. This is in contrast to the classical list update model, where computing the offline optimum was shown to be NP-hard by Ambühl [3]. The algorithm is based on the concept of a visiting sequence, which denotes the number of elements visited by a list update algorithm in each access. More formally, the visiting cost $w_i$ of an element $x_i$ under an algorithm $A$ is the number of elements accessed in the list during the access to $x_i$. Observe that if the element is in position $j$ in the list, it follows that $w_i \ge j$, since we need to traverse the list from left to right to at least position $j$. Note that the algorithm might choose to go past this position in order to rearrange the list, hence $w_i$ may be strictly greater than $j$.

Definition 1. Given an access sequence $X = \langle x_1, x_2, \ldots, x_n \rangle$ and a list update algorithm $A$, $w_i^A$ denotes the number of elements visited during the linear traversal of the list while servicing the request $x_i$ using algorithm $A$. The visiting sequence of $A$ for $X$ is the sequence of visiting costs $W^A = \langle w_1^A, \ldots, w_n^A \rangle$.
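As an illustration of Definition 1, the following minimal Python sketch computes the visiting sequence of move-to-front (MTF), which never looks past the requested item, so each visiting cost equals the position of the item at the time of access. The function name `mtf_visiting_sequence` and the use of single-character strings as items are assumptions made for this illustration only.

```python
from typing import Hashable, List, Sequence


def mtf_visiting_sequence(initial: Sequence[Hashable],
                          requests: Sequence[Hashable]) -> List[int]:
    """Return the visiting costs of move-to-front on `requests`.

    MTF stops at the requested item, so the visiting cost w_i is the
    1-based position of that item when it is accessed.
    """
    lst = list(initial)
    costs = []
    for x in requests:
        pos = lst.index(x) + 1           # 1-based position of the request
        costs.append(pos)                # visiting cost equals access cost
        lst.insert(0, lst.pop(pos - 1))  # move the accessed item to the front
    return costs


print(mtf_visiting_sequence("abcde", "bcaaaa"))  # [2, 3, 3, 1, 1, 1]
```

An algorithm that reorders items past the accessed one would report a visiting cost larger than the access cost in the corresponding step.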

In what follows, we omit the superscript $A$ if the choice of algorithm is clear from the context. Observe that for MTF the cost of accessing the $i$th element in the sequence is the same as the visiting cost, since no elements are visited past the currently accessed element. The next example illustrates the definition of visiting sequence for an algorithm that visits elements past the currently accessed element. Recall that the classical timestamp algorithm [2] moves the last accessed item $x$ forward until it precedes the first item that was requested at most once since the next to last request to $x$. Consider now the following active variant of the timestamp algorithm.

Example. The active variant of timestamp maintains the timestamp of the last two accesses to each item, and after each access the entire list is insertion-sorted by relative age. Informally, we say that $x$ is older than $y$ if the last access time of $x$ is prior to the last access of $y$ and enough time has gone by to assume that $y$ is likelier to reoccur before $x$ does, judging from the last two accesses. Formally, let $T_{\mathrm{last}}(x)$ and $T_{\mathrm{next}}(x)$ denote the last and next to last accesses to item $x$. Then we say that $\mathrm{age}(x) > \mathrm{age}(y)$ if

$$T_{\mathrm{last}}(x) < T_{\mathrm{next}}(y)$$

or

$$T_{\mathrm{last}}(x) < T_{\mathrm{last}}(y) \quad\text{and}\quad T_{\mathrm{now}} - T_{\mathrm{last}}(x) > T_{\mathrm{last}}(y) - T_{\mathrm{next}}(y).$$

Let $X = \langle b, c, a, a, a, a \rangle$ be a request sequence starting from the configuration $(a, b, c, d, e)$. Timestamps are initialized as if the previous ten accesses had been $e, d, c, b, a, e, d, c, b, a$. The sequence of resulting configurations is shown in Table 1.

Table 1. Configuration after each requested item. Note that in the last row the list has been reordered past the currently accessed item.

Item   Configuration    Access cost   Visiting cost
b      a, b, c, d, e    2             2
c      a, b, c, d, e    3             3
a      a, b, c, d, e    1             1
a      a, b, c, d, e    1             1
a      a, b, c, d, e    1             1
a      a, c, b, d, e    1             3

Definition 2. Given an initial list configuration $L$, a request sequence $X$ and a visiting sequence $W = \langle w_i \rangle$, we say that the triplet $(L, X, W)$ is realizable if there exists a (possibly offline) rearrangement strategy of the list such that for any step $i$ we have:

• the requested element $x_i$ is found at a position $j \le w_i$ in the list, and

• no element past position $w_i$ is relocated in this step.

As one would expect, not all triplets $(L, X, W)$ are realizable. We illustrate this fact with two sample visiting sequences $W$ and $W'$, both starting from the same initial configuration $L$ and request sequence $X$. One visiting sequence leads to a realizable triplet, the other one does not.

Example. Let the initial list configuration $L$ be $(f, a, e, c, b, d)$ and consider the request sequence $X = \langle e, c, e, e, f \rangle$ and the visiting sequences $W = \langle 4, 1, 2, 1, 3 \rangle$ and $W' = \langle 3, 3, 2, 2, 2 \rangle$. The triplet $(L, X, W)$ is realized, for example, by an algorithm with the following intermediate list configurations: $(c, e, f, a, b, d)$, $(c, e, f, a, b, d)$, $(e, c, f, a, b, d)$, and lastly $(e, c, f, a, b, d)$. On the other hand, the triplet $(L, X, W')$ cannot be realized by any list update algorithm, as can be shown by inspection.
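The two conditions of Definition 2 can be checked mechanically for a given candidate strategy. The sketch below is only an illustration: the function name `realizes` and the representation of a strategy as the list configuration chosen after each request are assumptions, not notation from the paper. It verifies one proposed sequence of configurations and says nothing about whether some other strategy might realize the triplet.

```python
from typing import Sequence


def realizes(initial: Sequence[str], requests: Sequence[str],
             visiting: Sequence[int],
             configs: Sequence[Sequence[str]]) -> bool:
    """Check the conditions of Definition 2 for one candidate strategy.

    configs[i] is the list configuration chosen after serving request i.
    """
    current = list(initial)
    for x, w, nxt in zip(requests, visiting, configs):
        nxt = list(nxt)
        # The requested item must lie within the first w positions.
        if x not in current[:w]:
            return False
        # Items past position w must not move, so the first w items
        # may only be permuted among themselves.
        if nxt[w:] != current[w:] or sorted(nxt[:w]) != sorted(current[:w]):
            return False
        current = nxt
    return True


L = "faecbd"
X = "eceef"
# Configurations from the example for W = <4, 1, 2, 1, 3>; the list is
# assumed to stay unchanged after the final request.
steps = ["cefabd", "cefabd", "ecfabd", "ecfabd", "ecfabd"]
print(realizes(L, X, [4, 1, 2, 1, 3], steps))  # True
```

The same configurations fail the check for $W' = \langle 3, 3, 2, 2, 2 \rangle$, since the first step would relocate the item in position 4 while visiting only three items.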

2. A characterization theorem

As before, let $X = \langle x_1, x_2, \ldots, x_n \rangle$ and $W = \langle w_1, \ldots, w_n \rangle$ denote an access sequence and a visiting sequence over a universe $U$. Let $X_{i:j} = \langle x_i, x_{i+1}, \ldots, x_j \rangle$ be the subsequence of $X$ between requests $i$ and $j$, inclusive. Let $d_{i:j}$ denote the number of distinct elements requested in the subsequence $X_{i:j}$. In this section we characterize all realizable triplets $(L, X, W)$. That is, we determine if there exists an algorithm $A$ that realizes the given triplet. A full characterization theorem is of independent interest. Consider first


triplets in which the initial configuration happens to have the items listed by order of first request. The notion of a list “sorted by order of first request” is best understood by means of an example. Consider the request sequence $X = \langle c, a, c, e, c, a, f, d \rangle$. The list sorted by order of first request is $(c, a, e, f, d)$, as $c$ is the first element ever requested, $a$ is the second, $e$ is the third (distinct) element, $f$ is the fourth and $d$ is the last distinct element requested.

We also introduce a slightly weaker version of this condition. We say that a list is weakly sorted if, after having visited up to position $w_i$ in the list, all elements between positions 1 and $w_i$ appear in the list configuration in increasing order by next request in the sequence $X_{i+1:n}$.

Example. Consider the list configuration $(f, d, c, e, b, a)$ immediately after the request for $x_6 = c$ with visiting cost 4 in the request sequence $X = \langle c, a, b, e, c, c, f, d, d, f, c, f, e \rangle$. The remaining request sequence is $X_{7:n} = \langle f, d, d, f, c, f, e \rangle$. The list configuration is weakly sorted, as the elements in the sublist $C_{1:4} = (f, d, c, e)$ appear in increasing order by next request in $X_{7:n}$.

We first characterize realizable triplets for algorithms on lists with initial configuration sorted by order of first request. Crucial in the proof of this theorem is the fact that after examining $w_i$ elements in the list, only the elements appearing in positions $1, \ldots, w_i$ may possibly have moved forward in the list configuration. This property holds both for the MRM and the standard cost model.

Theorem 1. Let $L$ be an initial list configuration sorted by order of first request, let $X = \langle x_1, \ldots, x_n \rangle$ be a given access sequence and let $W = \langle w_i \rangle$ be a visiting sequence. The triplet $(L, X, W)$ is realizable if and only if the furthest position touched while servicing a subsequence $X_{i:j}$ is at least the number of distinct elements in it. That is,

$$\max_{i \le k \le j} \{ w_k \} \ge d_{i:j}$$

for all $1 \le i < j \le n$.
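The notions used in the statement of Theorem 1 can be sketched in a few lines of Python. The helper names below are hypothetical, and treating an item that is never requested again as having its next request at infinity is an assumption, since the text does not discuss that case. The window check also tests windows with $i = j$, which only adds the requirement $w_k \ge 1$.

```python
from typing import List, Sequence


def order_by_first_request(requests: Sequence[str]) -> List[str]:
    """List the distinct items of `requests` in order of first request."""
    return list(dict.fromkeys(requests))


def is_weakly_sorted(config: Sequence[str], w: int,
                     remaining: Sequence[str]) -> bool:
    """Check that the first w items appear by increasing next request."""
    def next_request(item: str) -> float:
        # Assumption: items never requested again sort last.
        return remaining.index(item) if item in remaining else float("inf")

    keys = [next_request(item) for item in config[:w]]
    return keys == sorted(keys)


def satisfies_window_condition(requests: Sequence[str],
                               visiting: Sequence[int]) -> bool:
    """Check max_{i<=k<=j} w_k >= d_{i:j} for every window i <= j."""
    n = len(requests)
    for i in range(n):
        seen = set()
        furthest = 0
        for j in range(i, n):
            seen.add(requests[j])                  # d_{i:j} == len(seen)
            furthest = max(furthest, visiting[j])  # max of w_i, ..., w_j
            if furthest < len(seen):
                return False
    return True


print(order_by_first_request("cacecafd"))        # ['c', 'a', 'e', 'f', 'd']
print(is_weakly_sorted("fdceba", 4, "fddfcfe"))  # True
print(satisfies_window_condition("aba", [1, 2, 1]))  # True
```

The window check runs in $O(n^2)$ time because the number of distinct items and the running maximum are both updated incrementally as the right end of the window advances.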

Proof. ($\Rightarrow$) This follows directly: if on a given subsequence the algorithm visits fewer than $d_{i:j}$ elements, then it cannot possibly reach all $d_{i:j}$ distinct elements requested in the list, and the algorithm is not correct.

($\Leftarrow$) We show that a strategy which keeps the list “weakly sorted” by order of next request satisfies the conditions of the theorem. To maintain weak sortedness, the algorithm moves the most recently requested element $x_i$ to its proper position according to rank by its next request in the sublist from position 1 to $w_i$. In the process the strategy also rearranges any elements appearing before position $w_i$ which are not already weakly sorted. Since the list is initially sorted by order of next request, clearly we can serve the first element, as $w_i \ge 1$ for all $1 \le i \le n$. Now consider the $i$th request in the sequence. Observe that since $\max_{1 \le j \le i} \{ w_j \} \ge d_{1:i}$ and the list was initially sorted by order of next request, $x_i$ must necessarily have been visited by the algorithm at some point in the past, up to and including this last step. If $x_i$ is visited in the last step, then the triplet is realizable through the $i$th request and there is nothing to show. Otherwise let $k$ be the last visit to $x_i$, which took place at some point in the past. At that point, by the weak sortedness property, $x_i$ was moved forward to a position no further than $d_{k+1:i}$. Observe that from the hypotheses we know that $\max_{k+1 \le j \le i} \{ w_j \} \ge d_{k+1:i}$, so at some point between time $k+1$ and $i$ the element $x_i$ is visited again. However, since $k$ was the last visit prior to request $i$, it follows that the value $w_j$ which reaches $d_{k+1:i}$ can only be $w_i$, and hence the strategy can serve the present request $i$. This shows that an arbitrary request $i$ can be serviced, and the theorem follows by induction. □

The theorem above only characterizes realizable triplets whose initial configuration is sorted by order of first request. The next theorem characterizes all realizable triplets without restriction on the initial configuration. In the proof we use the previous theorem to characterize this set.

Theorem 2. A triplet $(L, X, W)$ is realizable if and only if the following two conditions hold:

• $\max_{i \le k \le j} \{ w_k \} \ge d_{i:j}$ for all $1 \le i < j \le n$, and

• for all $\ell$, $\max_{1 \le j \le \ell} \{ w_j \} \ge p_\ell$, where $\ell$ is the index of a request to an item that has not been requested before, $p_\ell$ denotes the position of that item in the initial list configuration $L$, and $W = \langle w_i \rangle$.

Proof. ($\Rightarrow$) As before, every correct algorithm has to visit at least $d_{i:j}$ elements between requests $i$ and $j$, as well as $p_\ell$ elements from the initial configuration, in order to reach each requested element in the list.

($\Leftarrow$) Conversely, we must show that if the visiting sequence $\langle w_i \rangle$ satisfies the conditions, then it is possible to serve the request sequence $X$. We will show that the algorithm from the proof of the theorem above, starting now from the given arbitrary configuration, achieves this. Recall that the algorithm maintains a weakly sorted list. Every initial request is met since either (a) the requested element has already been visited and has been moved forward to its position by order of first request, and thus satisfies the conditions of Theorem 1, or (b) it has not yet been visited and it remains in its initial position $p_\ell$. The latter means that $w_j < p_\ell$ for $1 \le j \le \ell - 1$, and hence the current $w_\ell$ must be large enough to reach $p_\ell$, since the hypothesis states that $\max_{1 \le j \le \ell} \{ w_j \} \ge p_\ell$. Thus the element can be reached and hence served in the present request. For non-initial requests, the list is weakly sorted up to element $x_\ell$ and the sequence is realizable by Theorem 1. □

3. Offline optimum

In this section we show that in the MRM model the offline optimum can be computed in polynomial time. This is in contrast to the standard model, for which computing the offline optimum is known to be NP-hard [3]. The algorithm we describe in this section works both for the case where an initial configuration is given and for the case where the initial configuration is to be chosen by the offline algorithm.

In general, given an arbitrary request sequence $Y$, we denote by $\mathrm{opt}(Y)$ the lowest possible cost to serve $Y$ starting from a list arranged in some ideal order. In what follows, given a fixed request sequence $X$, for convenience of notation we shorten $\mathrm{opt}(X_{i:j})$ to $\mathrm{opt}(i, j)$. We compute the offline optimum using a dynamic programming algorithm by first computing the minimum cost over the sequence $X$, i.e. $\mathrm{opt}(X)$, and then, as usual, reconstructing the sequence of reorderings for $X$ from the dynamic programming table. Let $R$ denote the sequence of reorderings after each move by the offline optimum on $X$. It follows then that $|R| = \mathrm{opt}(X)$, and in general the decomposition of the global offline optimum on the input sequence $X$ is given by

$$\mathrm{opt}(1, n) = |R_{1:i}| + |R_{i+1:n}|$$

where $|R_{i:j}|$ denotes the number of operations performed by the global offline optimum $R$ during the requests corresponding to the subsequence $X_{i:j}$. It now follows from the definition of $\mathrm{opt}(i, j)$ that

$$\mathrm{opt}(1, n) \ge \mathrm{opt}(1, i) + \mathrm{opt}(i+1, n).$$

This expression does not (yet) yield an algorithm to compute the optimal sequence, since the starting configuration for the rightmost term, namely $\mathrm{opt}(i+1, n)$, does not necessarily match the final configuration of $\mathrm{opt}(1, i)$. To this end, observe that we can rearrange the final configuration of $\mathrm{opt}(1, i)$ into the initial configuration of $\mathrm{opt}(i+1, n)$ at a cost of at most $d_{1:n}$ steps, which can be added to the cost of servicing the $i$th request. This implies that, for all $i$,

$$\mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n) \ge \mathrm{opt}(1, n).$$

Similarly, if we consider the number of elements $w_i^{\mathrm{opt}}$ visited by $\mathrm{opt}(1, n)$ during the $i$th access, we obtain

$$\mathrm{opt}(1, n) = |R_{1:i-1}| + w_i^{\mathrm{opt}} + |R_{i+1:n}| \ge \mathrm{opt}(1, i-1) + w_i^{\mathrm{opt}} + \mathrm{opt}(i+1, n).$$

Now from the characterization theorem we know that there exists $i$ such that $w_i^{\mathrm{opt}} \ge d_{1:n}$, and hence there exists $i$ such that

$$\mathrm{opt}(1, n) = |R_{1:i-1}| + w_i^{\mathrm{opt}} + |R_{i+1:n}| \ge |R_{1:i-1}| + d_{1:n} + |R_{i+1:n}| \ge \mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n).$$

The last two inequalities together give the desired dynamic programming recurrence:

$$\mathrm{opt}(1, n) = \min_{1 \le i \le n} \left\{ \mathrm{opt}(1, i-1) + d_{1:n} + \mathrm{opt}(i+1, n) \right\}$$

and in general

$$\mathrm{opt}(i, j) = \min_{i \le k \le j} \left\{ \mathrm{opt}(i, k-1) + d_{i:j} + \mathrm{opt}(k+1, j) \right\},$$

or equivalently,

$$\mathrm{opt}(i, j) = d_{i:j} + \min_{i \le k \le j} \left\{ \mathrm{opt}(i, k-1) + \mathrm{opt}(k+1, j) \right\}.$$

From this recurrence it follows that the offline optimum can be computed in $O(n^3)$ time. In fact this can be slightly improved. The first step consists in identifying all values of $i \le k \le j$ for which $d_{i:k} = d_{i:j}$. It is not hard to see that the recurrence does not need to be evaluated for those values of $k$. The values $d_{i:j}$ can be computed in $O(n^2)$ time since $d_{i:j}$ is a stepwise monotonic function. However, in the worst case the recurrence is still evaluated for $\Theta(n)$ values of $k$. These pairs are stored as marked entries in an upper triangular $n \times n$ matrix $M$, with each entry pointing to its nearest marked predecessor in the same row. The algorithm then partitions the request sequence using the marked entry pointed to by $M[1, n]$, say entry $M[i, n]$, and proceeds recursively to the entries pointed to by $M[1, i-1]$ and $M[i+1, n]$, and so on. Lastly, the reordering sequence is recovered by following the sequence of partition values and weakly sorting the $w_k^A$ items visited during the current step. This gives the following theorem.

Theorem 3. The offline optimum for the MRM alternative list update model can be computed in $O(n^3)$ time.

Example. Consider the request sequence $X = \langle e, b, e, b, c, d, d, e, c, a \rangle$. Table 2 gives the sequence of offline configurations to serve $X$ optimally. The total cost over the entire sequence is twenty, which can be shown to be optimal using the algorithm above.

Table 2. Offline optimum configuration after each requested item.

Item        Configuration    Access cost   Visiting cost
(initial)   e, b, c, d, a
e           e, b, c, d, a    1             1
b           e, b, c, d, a    2             2
e           e, b, c, d, a    1             1
b           e, b, c, d, a    2             2
c           d, e, c, b, a    3             4
d           d, e, c, b, a    1             1
d           d, e, c, b, a    1             1
e           d, e, c, b, a    2             2
c           a, d, e, c, b    3             5
a           a, d, e, c, b    1             1

4. Conclusions

We gave a characterization theorem and, from that, an algorithm that computes the offline optimum in $O(n^3)$ time for the MRM alternative model of list update.

Acknowledgements

We thank Ian Munro for early discussions on this problem.
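As a supplementary illustration, the recurrence of Section 3, $\mathrm{opt}(i, j) = d_{i:j} + \min_{i \le k \le j} \{ \mathrm{opt}(i, k-1) + \mathrm{opt}(k+1, j) \}$ with $\mathrm{opt}(i, j) = 0$ for empty ranges, can be evaluated directly. The following is a minimal memoized sketch, not the authors' implementation: the function name `offline_opt` and the 0-based indexing are assumptions, it does not include the marked-matrix refinement, and it only returns the optimal cost rather than the sequence of reorderings.

```python
from functools import lru_cache
from typing import Sequence


def offline_opt(requests: Sequence[str]) -> int:
    """Evaluate opt(1, n) for the MRM model via the recurrence
    opt(i, j) = d_{i:j} + min_{i<=k<=j} (opt(i, k-1) + opt(k+1, j)).
    """
    n = len(requests)
    # distinct[i][j] = d_{i:j}; each row is filled incrementally,
    # so the whole table takes O(n^2) time.
    distinct = [[0] * n for _ in range(n)]
    for i in range(n):
        seen = set()
        for j in range(i, n):
            seen.add(requests[j])
            distinct[i][j] = len(seen)

    @lru_cache(maxsize=None)
    def opt(i: int, j: int) -> int:
        if i > j:  # empty range of requests
            return 0
        best = min(opt(i, k - 1) + opt(k + 1, j) for k in range(i, j + 1))
        return distinct[i][j] + best

    return opt(0, n - 1)


print(offline_opt("ebebcddeca"))  # 20, the total cost reported for Table 2
```

There are $O(n^2)$ subproblems and each takes a minimum over at most $n$ split points, which gives the $O(n^3)$ bound. For long request sequences an iterative table filled by increasing range length would avoid deep recursion.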

References

[1] S. Albers, J. Westbrook, Self-organizing data structures, in: Online Algorithms: The State of the Art, in: LNCS, vol. 1442, 1998, pp. 13–51.
[2] S. Albers, Improved randomized on-line algorithms for the list update problem, in: Proc. 6th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA 1995), 1995, pp. 412–419.
[3] C. Ambühl, Offline list update is NP-hard, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: LNCS, vol. 1879, 2000, pp. 42–51.
[4] S. Angelopoulos, R. Dorrigiv, A. López-Ortiz, List update with locality of reference, in: Proc. 8th Latin American Symposium on Theoretical Informatics (LATIN 2008), in: LNCS, vol. 4957, Springer, 2008, pp. 399–410.
[5] R. Dorrigiv, M. Ehmsen, A. López-Ortiz, Parameterized analysis of paging and list update algorithms, in: Proc. 7th International Workshop on Approximation and Online Algorithms (WAOA 2009), in: LNCS, vol. 5893, Springer, 2009, pp. 104–115.
[6] G.H. Gonnet, J.I. Munro, H. Suwanda, Towards self-organizing linear search, in: Proceedings of the 20th Annual IEEE Symposium on Foundations of Computer Science (FOCS 1979), IEEE, 1979, pp. 169–174.
[7] C. Martínez, S. Roura, On the competitiveness of the move-to-front rule, Theoretical Computer Science 242 (1–2) (July 2000) 313–325.
[8] J. McCabe, On serial files with relocatable records, Operations Research 12 (1965) 609–618.
[9] J.I. Munro, On the competitiveness of linear search, in: Proc. 8th Annual European Symposium on Algorithms (ESA 2000), in: LNCS, vol. 1879, 2000, pp. 338–345.
[10] D.D. Sleator, R.E. Tarjan, Amortized efficiency of list update and paging rules, Communications of the ACM 28 (2) (February 1985) 202–208.