Bounds for the expected duration of the monopolist game


Information Processing Letters 101 (2007) 86–92 www.elsevier.com/locate/ipl

Eric Bach
Computer Sciences Department, University of Wisconsin, Madison, WI 53706, USA

Received 6 January 2003; received in revised form 22 July 2004; accepted 10 May 2006
Available online 23 October 2006
Communicated by P.M.B. Vitányi

Abstract

The monopolist game is a multi-player ruin process that arose in the study of certain learning processes. We prove that when the players begin with equal stakes, the expected duration of the game is, up to constant factors, the square of their collective initial wealth. This proves a conjecture of Amano, Tromp, Vitányi, and Watanabe. More generally, we find that the expected duration is similarly related to a quadratic function that reflects the uniformity of the initial stakes, and calculate the expected duration exactly for three players.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Analysis of algorithms; Martingales; Ruin problems

ISSN 0020-0190; doi:10.1016/j.ipl.2006.05.016

1. Introduction

Amano, Tromp, Vitányi, and Watanabe [1] have recently studied ruin problems of the following type. Each of $k$ players starts with some initial stake. At each round, all sufficiently solvent players contribute equal shares to a pot of unit value, which is then awarded to a randomly chosen winner. Eventually, all players are bankrupt save one (the “monopolist”), and the game stops.

When the initial stakes are equal, say to $I$, and $k = 2$, this game is equivalent to a symmetric random walk with absorbing barriers. It has been known for many years that the expected duration of such a walk is $\Theta(I^2)$ [2]. The authors of [1] conjectured that the expected duration of the equal-stake $k$-player game, that is, the expected time at which the monopolist emerges, is $\Theta(k^2 I^2)$. This was suggested by experiments they did.

They were also able to prove analytically that their conjecture held to within logarithmically growing factors.

The main object of this note is to prove their conjecture. We will also extend it to games with unequal stakes, and give an exact formula for the expected duration of the 3-player game. Our precise results are stated in Theorems 1–6 below. Our principal tool is the optional stopping theorem of martingale theory.

The monopolist game was inspired by (but is not identical to) von der Malsburg’s neuronal model for self-organization. One can also think of it as extracting the essence of more complex “winner take all” games such as the World Series of Poker. For more on its applications, see the references in [1].

2. Formalization of the game

The game starts with $k$ players, each of whom may be active or inactive. Initially all players are active, and if there is only one active player, he is declared the winner and play stops. A round of the game consists of the following: each of the $m$ active players contributes $1/m$ to a pot. This pot (which totals 1) is then awarded to a randomly chosen active player.

Let $X_i(t)$ denote the assets of the $i$th player after $t$ rounds of play. To monitor the progress of the game at times $t = 0, 1, 2, \ldots$, we use the vector

$$X(t) = \bigl(X_1(t), \ldots, X_k(t)\bigr).$$

Sums and differences of such vectors are to be interpreted componentwise. We will assume $X_i(0) \ge 0$, and write $S$ for $\sum_i X_i(0)$. We also let $\|X\|^2$ denote $\sum_i X_i^2$, and let $I_j$ denote the $j$th player’s initial stake, so that

$$X(0) = (I_1, \ldots, I_k).$$

For the moment we will make very few assumptions about the mechanism by which a player becomes inactive. We require that one can determine by playing $t - 1$ rounds which players will be active at the $t$th. Once a player becomes inactive, his assets are frozen and he is never active again. We also require there to be an $L$ such that

$$\inf X_i(t) \ge L > -\infty.$$

This is allowed to depend on the number of players and the initial stakes; it functions as a sort of universal debt limit below which no player’s account can go. For example, we might stipulate that any player that cannot pay the ante becomes inactive, and take $L = 0$. In Section 4 below we will elaborate on this and similar rules.

3. Martingales

We will adapt some standard definitions to our needs. For further background on martingales, see [4] and [5].

Let $X(0), X(1), X(2), \ldots$ be a stochastic process. Let $Z(0), Z(1), \ldots$ be a sequence of real-valued random variables with finite expectations, and assume that $Z(t)$ is measurable with respect to $X(0), X(1), \ldots, X(t)$. We will abbreviate $X(0), \ldots, X(t)$ by $\mathcal{F}(t)$. We call $Z$ a submartingale if for all $t \ge 1$,

$$E\bigl[Z(t) \mid \mathcal{F}(t-1)\bigr] \ge Z(t-1),$$

and a martingale if this holds with equality.

By a stopping time, we mean a random variable $T$ (taking values $0, 1, \ldots, \infty$) for which the event that $T = t$ is $\mathcal{F}(t)$-measurable. Intuitively, this means that we can determine if $T = t$ by making observations up to and including time $t$. The optional stopping theorem states that if $Z(0), Z(1), \ldots$ is a submartingale with uniformly bounded absolute differences, and $T$ is a stopping time with finite first moment, then

$$E\bigl[Z(T)\bigr] \ge E\bigl[Z(0)\bigr].$$

This becomes an equality when $Z$ is a martingale.

Our next goal is to associate some martingales with the game. To this end, let the random variable $T$ denote the number of rounds played. Also, let $m(t)$ denote the number of players that are active in the $t$th round. Then we have $m(1) = k$, and $m(t) = 1$ after the last round. We also define for $t \ge 1$

$$D(t) = X(t) - X(t-1).$$

We will write $D_i(t)$ for the $i$th component of this difference.

Lemma 1. If the $i$th player is active at time $t$, then for $\nu \ge 1$, we have

$$E\bigl[D_i(t)^\nu \mid \mathcal{F}(t-1)\bigr] = \frac{m(t)-1}{m(t)^{\nu+1}} \Bigl[(m(t)-1)^{\nu-1} + (-1)^\nu\Bigr].$$

Otherwise $D_i(t) = 0$.

Proof. Observe that $D_i(t)$ is $1 - 1/m(t)$ with probability $1/m(t)$, and $-1/m(t)$ with probability $1 - 1/m(t)$. □

Since our result does not depend on the particular active player chosen, we will write $M_\nu(t)$ for the right hand side of the above equation. For future reference, we note the following.

$$M_1(t) = 0; \tag{1}$$

$$M_2(t) = \frac{m(t)-1}{m(t)^2}; \tag{2}$$

$$M_3(t) = \frac{(m(t)-1)(m(t)-2)}{m(t)^3}. \tag{3}$$
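As a sanity check on Lemma 1 (a minimal sketch, not part of the original note), the moments can be computed directly from the two-point distribution used in the proof and compared with the closed form in exact rational arithmetic. The function names `moment_direct` and `moment_formula` are illustrative choices.

```python
from fractions import Fraction

def moment_direct(m, nu):
    """E[D_i(t)^nu] for an active player when m players are active,
    computed from the two-point distribution in the proof of Lemma 1."""
    win = 1 - Fraction(1, m)     # net gain of the round's winner
    lose = -Fraction(1, m)       # net loss of every other active player
    return Fraction(1, m) * win**nu + (1 - Fraction(1, m)) * lose**nu

def moment_formula(m, nu):
    """Closed form M_nu(t) stated in Lemma 1, with m = m(t)."""
    return Fraction(m - 1, m**(nu + 1)) * ((m - 1)**(nu - 1) + (-1)**nu)

for m in range(2, 9):
    for nu in range(1, 7):
        assert moment_direct(m, nu) == moment_formula(m, nu)
    # Special cases (1)-(3).
    assert moment_formula(m, 1) == 0
    assert moment_formula(m, 2) == Fraction(m - 1, m**2)
    assert moment_formula(m, 3) == Fraction((m - 1) * (m - 2), m**3)
```

Exact fractions are used so that the comparison is an identity check rather than a floating-point approximation.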

We also let $A(t)$ denote the total capital of the active players immediately before the $t$th round of play.

Lemma 2. The following processes are all martingales:

$$X_i(t), \qquad i = 1, \ldots, k;$$

$$Z(t) = \sum_i X_i(t)^2 - \sum_{1 \le u \le t} \Bigl(1 - \frac{1}{m(u)}\Bigr);$$

$$W(t) = \sum_i X_i(t)^3 - \sum_{1 \le u \le t} \frac{m(u)-1}{m(u)^2} \bigl(3A(u) + m(u) - 2\bigr).$$

Furthermore, $|X_i(t) - X_i(t-1)| = O(1)$, with similar bounds for the differences of $Z$ and $W$.
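The martingale property of $Z$ and $W$ is a statement about a single round: the expected change of $\sum_i X_i^2$ (respectively $\sum_i X_i^3$) must equal the term subtracted for that round. The following sketch checks this exactly for a given list of active players' assets; inactive players are omitted because their assets are frozen. The helper names are hypothetical and the code is only an illustration of the claim, not the author's argument.

```python
from fractions import Fraction

def one_round_drift(active, power):
    """Exact expected change of sum_i X_i^power over one round,
    given the current assets of the active players."""
    m = len(active)
    ante = Fraction(1, m)
    before = sum(Fraction(x)**power for x in active)
    total = Fraction(0)
    for winner in range(m):                  # each active player wins w.p. 1/m
        after = [Fraction(x) - ante + (1 if i == winner else 0)
                 for i, x in enumerate(active)]
        total += sum(y**power for y in after) - before
    return total / m

def z_increment(active):
    """Term subtracted in Z for a round with m = len(active) players."""
    return 1 - Fraction(1, len(active))

def w_increment(active):
    """Term subtracted in W; A is the total capital of the active players."""
    m = len(active)
    a = sum(Fraction(x) for x in active)
    return Fraction(m - 1, m * m) * (3 * a + m - 2)

state = [Fraction(5, 3), Fraction(2, 3), Fraction(2, 3)]
assert one_round_drift(state, 2) == z_increment(state)
assert one_round_drift(state, 3) == w_increment(state)
```

Because the drift matches the subtracted term for every state, the compensated sums have zero conditional drift, which is what the lemma asserts.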

Proof. The measurability condition is obviously satisfied. We also have $|X_i(t)| \le S + k|L| = O(1)$, so $E[|X_i(t)|]$, $E[|Z(t)|]$, and $E[|W(t)|]$ are all $O(t)$. We will verify the remaining properties only for $W$, the others being similar.

Observe first that when $\nu \ge 1$ and $\mu \ge 0$,

$$E\Bigl[\sum_i D_i(t)^\nu X_i(t-1)^\mu \,\Big|\, \mathcal{F}(t-1)\Bigr] = \sum_i X_i(t-1)^\mu \, E\bigl[D_i(t)^\nu \mid \mathcal{F}(t-1)\bigr] = M_\nu(t) \sum_{i\ \mathrm{active}} X_i(t-1)^\mu. \tag{4}$$

This holds because $X_i(t-1)$ is measurable with respect to $\mathcal{F}(t-1)$, and because the quantity $E[D_i(t)^\nu \mid \mathcal{F}(t-1)]$ equals $M_\nu(t)$ for active players $i$, and vanishes otherwise.

Now we prove the main martingale property. Start with

$$X_i(t)^3 = X_i(t-1)^3 + 3D_i(t)X_i(t-1)^2 + 3D_i(t)^2 X_i(t-1) + D_i(t)^3,$$

sum on $i$, and take expectations. Using (4) and then (1)–(3), we get

$$E\Bigl[\sum_i X_i(t)^3 \,\Big|\, \mathcal{F}(t-1)\Bigr] = \sum_i X_i(t-1)^3 + 3A(t)\,\frac{m(t)-1}{m(t)^2} + \frac{(m(t)-1)(m(t)-2)}{m(t)^2}.$$

This implies that

$$E\bigl[W(t) \mid \mathcal{F}(t-1)\bigr] = \sum_i X_i(t-1)^3 - \sum_{1 \le u \le t-1} \frac{m(u)-1}{m(u)^2}\bigl(3A(u) + m(u) - 2\bigr) = W(t-1),$$

as needed. Finally, for the differences we have

$$\bigl|W(t) - W(t-1)\bigr| \le 2k\bigl(S + k|L|\bigr)^3 + 3S + 1 = O(1).$$

□

Lemma 3. Let $T$ be the first time at which there is a winner (that is, only one player is active). Then $T$ is a stopping time, and $E[T] < \infty$.

Proof. The number of active players, and hence the event $T = t$, are measurable with respect to $\mathcal{F}(t)$, so $T$ is a stopping time. We must now show that $E[T]$ is finite. To prove this, choose a positive integer $\ell$, sufficient to bankrupt any player who loses $\ell$ times in a row. (Any $\ell \ge k(S + k|L|)$ will do.) The probability that any particular active player is eliminated in $\ell$ or fewer rounds is at least $\varepsilon = (1/k)^\ell > 0$. Therefore, the probability that there exist two players still active after $\nu\ell$ rounds is at most

$$\binom{k}{2}(1 - \varepsilon)^\nu \to 0.$$

(The exponent is $\nu$ because the probability that $i$ and $j$ survive is upper bounded by the probability that $i$ survives.) Taking $c = (1 - \varepsilon)^{1/\ell}$, we see that $\Pr[T > t] = O(c^t)$, and since $c < 1$,

$$E[T] = \sum_{t \ge 0} \Pr[T > t] = O\Bigl(\sum_{t \ge 0} c^t\Bigr) < \infty.$$

□

4. Games with integral stakes

We have now gone as far as we can go without considering particular bankruptcy rules. The simplest imaginable one is

(B1) Let there be $m$ active players at the $t$th round. Any player $i$ with $X_i < 1/m$ is declared inactive at time $t + 1$.

This rule has the nice feature that no special rule is needed to handle bankruptcies, if the initial stakes are integral.

Theorem 1. Assume that $I_i$ is a non-negative integer, for $i = 1, \ldots, k$. Under rule (B1), each player’s capital is always $\ge 0$, and he becomes inactive only if it equals 0.

Proof. First, observe that at all times, the $X_i$ are congruent to each other mod 1. Now, consider play up to and including the first bankruptcy. Since each player holds an integral multiple of $1/k$, the first bankruptcy occurs when some player’s capital vanishes. By the observation above, the capital of any other player must also be integral (possibly 0). If the game continues, we repeat the same argument using a smaller $k$. □

Theorem 2. Let $I_i$ be a non-negative integer, for $i = 1, \ldots, k$. Then under rule (B1), we have

$$\frac{4}{3} \cdot \frac{\bigl(\sum_i I_i\bigr)^3 - \sum_i I_i^3}{\sum_i I_i} \le E[T] \le 2\Bigl[\Bigl(\sum_i I_i\Bigr)^2 - \sum_i I_i^2\Bigr].$$

In particular, when each of $k$ players starts with $I$ units, we have

$$\frac{4}{3}\bigl(k^2 - 1\bigr) I^2 \le E[T] \le 2k(k-1)I^2.$$

Proof. We prove the upper bound first. By optional stopping, we have

$$E\bigl[Z(T)\bigr] = E\bigl[Z(0)\bigr] = \sum_i I_i^2.$$

Substituting the definition of $Z$ and rearranging gives us

$$E\Bigl[\sum_{1 \le t \le T} \bigl(1 - 1/m(t)\bigr)\Bigr] = E\Bigl[\sum_i X_i(T)^2\Bigr] - \sum_i X_i(0)^2 = \Bigl(\sum_i I_i\Bigr)^2 - \sum_i I_i^2.$$

On the other hand, $1 - 1/m \ge 1/2$, so

$$E[T]/2 \le E\Bigl[\sum_{1 \le t \le T} \bigl(1 - 1/m(t)\bigr)\Bigr].$$

Combining these gives the upper bound for $E[T]$.

The lower bound is proved similarly using $W$. First, by Theorem 1 we can replace $A$ by $S$ in the definition of $W$, since $A(t)$ is constant. Second, we can assume that the initial stakes are positive, since the sums in the bound are not affected by the inclusion of zero terms. We now use the following estimate: for integers $m$ in the range $2 \le m \le k$, we have

$$\frac{m-1}{m^2}\Bigl(1 + \frac{m-2}{3S}\Bigr) \le \frac{1}{4}.$$

(This can be proved as follows. When $m = 2$ it becomes an equality. Otherwise, observe that $S \ge m$, and treat $m = 3$ and $m \ge 4$ separately.) This implies

$$E\Bigl[\sum_{1 \le t \le T} \frac{m(t)-1}{m(t)^2}\bigl(3S + m(t) - 2\bigr)\Bigr] \le \frac{3S}{4}\,E[T].$$

However, by optional stopping,

$$E\Bigl[\sum_{1 \le t \le T} \frac{m(t)-1}{m(t)^2}\bigl(3S + m(t) - 2\bigr)\Bigr] = E\Bigl[\sum_i \bigl(X_i(T)^3 - X_i(0)^3\bigr)\Bigr] = \Bigl(\sum_i I_i\Bigr)^3 - \sum_i I_i^3.$$

Combining these gives the desired lower bound. □

Some remarks are appropriate here. First, for $k = 2$, the bounds match, giving $E[T] = 4I_1I_2$. This is equivalent to a well known result about the symmetric random walk.

Second, by clearing denominators we can see that

$$\frac{\bigl(\sum_i I_i\bigr)^3 - \sum_i I_i^3}{\sum_i I_i} \ge \Bigl(\sum_i I_i\Bigr)^2 - \sum_i I_i^2,$$

whenever $I_i \ge 0$. Therefore, $E[T] = \Theta(Q)$, where $Q = (\sum_i I_i)^2 - \sum_i I_i^2$ measures the equity in the players’ initial stakes. This estimate suggests that, among all ways to initially partition a given amount of capital, the even distribution maximizes the expected time of the game. This is well known to be true for two players, and in the sequel we will show it holds for three players as well.

Third, we can improve Theorem 2 in certain cases, as follows. The convex combination $\alpha Z + (1 - \alpha)W$ is also a martingale, and for given $\alpha$, we can find an $m$ that minimizes (or maximizes) the coefficient of $E[T]$ in our proof. Using a computer, one can easily look for values of $\alpha$ that give better estimates, and this is successful in certain cases. For example, if 4 players each start with a stake of 2 units, Theorem 2 gives $80 \le E[T] \le 96$, which is not as sharp as the estimate $82.6 \le E[T] \le 85.2$, obtained by using $\alpha = 0.7$ for the lower bound and $\alpha = 0.85$ for the upper bound. This technique, however, seems to improve Theorem 2 only for limited ranges of the parameters.

For two players, it is well known that each player’s probability of winning is proportional to his share of the total wealth. The following result generalizes this.

Theorem 3. Let $I_i$ be a non-negative integer, for $i = 1, \ldots, k$, with $S = \sum_i I_i$. The probability that player $i$ wins the game is $I_i/S$.

Proof. By optional stopping,

$$I_i = E\bigl[X_i(0)\bigr] = E\bigl[X_i(T)\bigr] = \Bigl(\sum_j I_j\Bigr) \Pr[i \text{ wins}].$$

Solving for the probability gives the result. □

5. Exact analysis for three players

Further analysis of the game is complicated by the feature that its character changes at random times. For three players, however, we have enough information to compute the expected duration exactly, if the initial stakes are integral.
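For small integral stakes, exact values of $E[T]$ can also be obtained by brute force, which gives an independent check on closed-form results for three players and on the bounds of Theorem 2. The sketch below treats the game under rule (B1) as a finite absorbing Markov chain and solves the linear equations $h(s) = 1 + \frac{1}{m}\sum_{s'} h(s')$ with exact fractions. It assumes integral initial stakes, so that (by Theorem 1) a player is active exactly when his capital is positive; the function name `expected_duration` is an illustrative choice, and this is not the method used in the paper.

```python
from fractions import Fraction

def expected_duration(stakes):
    """Exact E[T] under rule (B1) for non-negative integer initial stakes.

    Assumption: stakes are integral, so a player is active iff his
    capital is positive (Theorem 1)."""
    start = tuple(Fraction(s) for s in stakes)

    def successors(state):
        active = [i for i, x in enumerate(state) if x > 0]
        m = len(active)
        if m <= 1:
            return []                        # a monopolist has emerged
        result = []
        for winner in active:
            nxt = list(state)
            for i in active:
                nxt[i] -= Fraction(1, m)     # every active player antes 1/m
            nxt[winner] += 1                 # the winner takes the unit pot
            result.append(tuple(nxt))
        return result

    # Enumerate reachable states (the list grows while it is scanned).
    index, states = {start: 0}, [start]
    for s in states:
        for n in successors(s):
            if n not in index:
                index[n] = len(states)
                states.append(n)

    # Augmented matrix for h(s) - (1/m) sum h(s') = 1 (transient states)
    # and h(s) = 0 (absorbing states).
    size = len(states)
    rows = []
    for s in states:
        row = [Fraction(0)] * (size + 1)
        row[index[s]] = Fraction(1)
        succ = successors(s)
        if succ:
            for n in succ:
                row[index[n]] -= Fraction(1, len(succ))
            row[size] = Fraction(1)
        rows.append(row)

    # Gauss-Jordan elimination over the rationals.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        for r in range(size):
            factor = rows[r][col]
            if r != col and factor != 0:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return rows[0][size]                     # value at the starting state

print(expected_duration((1, 1, 1)))          # 75/7
print(expected_duration((3, 2)))             # 24, i.e. 4 * I_1 * I_2
```

The state space grows quickly with the total wealth, so this approach is only practical for small examples; its value is as a cross-check, not as an algorithm.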

Theorem 4. Let $k = 3$, and assume that $I_1, I_2, I_3$ are positive integers. Then

$$E[T] = \frac{12 \sum_{i \ne j} I_i^2 I_j - 8 \sum_{i < j} I_i I_j + 27 I_1 I_2 I_3}{3S - 2}.$$

Proof. Let $T_3$ and $T_2$ denote the number of rounds played with three and with two active players, respectively. Applying optional stopping to $Z$ and $W$, as in the proof of Theorem 2, gives

$$\frac{2}{3} E[T_3] + \frac{1}{2} E[T_2] = S^2 - \sum_i I_i^2,$$

$$\frac{6S + 2}{9} E[T_3] + \frac{3S}{4} E[T_2] = S^3 - \sum_i I_i^3.$$

Solving these gives

$$E[T_3] = \frac{27 I_1 I_2 I_3}{3S - 2}$$

and

$$E[T_2] = \frac{12 \sum_{i \ne j} I_i^2 I_j - 8 \sum_{i < j} I_i I_j}{3S - 2}.$$

Since $T = T_3 + T_2$, the result follows. □

Theorem 4 is actually correct for degenerate cases where some player’s initial stake is zero. This can be seen by substitution. When each player starts with the same stake $I$, Theorem 4 becomes

$$E[T] = \frac{99I^3 - 24I^2}{9I - 2}.$$

One might ask if the expected duration of the game is maximized when the initial stakes are equal, as compared to other integral allocations with the same sum. The result below tells us that the answer is yes.

Theorem 5. Let $S \ge 3$. On the simplex $x_1 + x_2 + x_3 = S$, $x_i \ge 0$, the symmetric function

$$f(x_1, x_2, x_3) = \frac{12 \sum_{i \ne j} x_i^2 x_j - 8 \sum_{i < j} x_i x_j + 27 x_1 x_2 x_3}{3S - 2}$$

is maximized when $x_1 = x_2 = x_3$.

Proof. Using $x_1 + x_2 + x_3 = S$, one can check that

$$(3S - 2)\, f(x_1, x_2, x_3) = 3 x_1 x_2 x_3 + 4S^3 - 4 g(x_1, x_2, x_3),$$

where $g$ is the function displayed next. The first term is well known to be maximized when $x_1 = x_2 = x_3$, so it suffices to show that this also minimizes

$$g(x_1, x_2, x_3) = x_1^3 + x_2^3 + x_3^3 + 2(x_1 x_2 + x_2 x_3 + x_1 x_3).$$

Recall that a differentiable function on a compact set has a minimum value [3, p. 311], attained either on the boundary or at a critical point (an interior point at which all partial derivatives vanish) [3, p. 60]. Since the simplex is compact, the minimum of $g$ must be attained somewhere. There are finitely many choices for its location: the center $x_1 = x_2 = x_3$ (the only critical point), the corners, and critical points of the restricted function obtained by setting one variable to zero. Evaluating $g$ at the center gives the smallest value. □

6. Games with arbitrary stakes

In this section we will consider what happens when the initial stakes can be arbitrary positive numbers. Relaxing the condition of integrality introduces the possibility of more complicated bankruptcy procedures. In their paper, Amano et al. considered the following rule.

(B2) Let there be $m$ active players at the $t$th round. Any player $i$ with $X_i < 1/m$ is declared inactive at time $t + 1$, and his assets are divided equally among the remaining active players. Should $X_i < 1/m$ for more than one $i$, this is done sequentially, handling the lowest numbered player first, then the second lowest, and so on.

Unfortunately this rule destroys the martingale properties we have relied upon so far. To restore them, we will consider another game that is equivalent as to duration. At each time there will be a threshold $Y$, initially set to 0. Bankruptcies are handled by the following rule.

(B2′) Let there be $m$ active players at the $t$th round. Any player $i$ with $X_i - 1/m < Y$ is declared inactive at time $t + 1$, and the threshold is lowered by $(X_i - Y)/(m - 1)$. Should this hold for more than one $i$, bankrupt players are handled sequentially as in rule (B2).

With this modification, $T$ is identical to the stopping time under rule (B2). Also, Lemma 3 remains true. Furthermore, the threshold will never be reduced by an amount greater than

$$\frac{1}{k(k-1)} + \frac{1}{(k-1)(k-2)} + \cdots + \frac{1}{2 \cdot 1} = 1 - \frac{1}{k}.$$

We therefore can take $L = -1$.
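The bound on the total threshold reduction is a telescoping sum: a bankruptcy among $m$ active players lowers the threshold by $(X_i - Y)/(m-1)$, which is less than $1/(m(m-1))$ because $X_i - Y < 1/m$, and $m$ runs from $k$ down to 2. A short sketch (the function name is illustrative) confirms the closed form exactly for small $k$.

```python
from fractions import Fraction

def max_threshold_drop(k):
    """Sum of 1/(m(m-1)) for m = 2, ..., k, the bound on how far the
    threshold Y can fall over the whole game with k players."""
    return sum(Fraction(1, m * (m - 1)) for m in range(2, k + 1))

for k in range(2, 11):
    # Telescoping: 1/(m(m-1)) = 1/(m-1) - 1/m.
    assert max_threshold_drop(k) == 1 - Fraction(1, k)
```

Since the sum stays below 1 for every $k$, a debt limit of $-1$ is never reached.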

Theorem 6. Let $I_i \ge 0$, for $i = 1, \ldots, k$, and assume that $S = \sum_i I_i \ge 1$. Let $Q = (\sum_i I_i)^2 - \sum_i I_i^2$. Then under rule (B2), the expected duration of the game satisfies

$$Q - 2S \log k \le E[T] \le 2Q.$$

Proof. For the upper bound, we observe that redistribution after bankruptcy increases $\|X\|^2$. Indeed, choose indices so that the asset vector before redistribution is

$$X = (X_1, \ldots, X_m, 0, \ldots, 0),$$

with $X_m < 1/m$, and afterward is

$$X' = \Bigl(X_1 + \frac{X_m}{m-1}, \ldots, X_{m-1} + \frac{X_m}{m-1}, 0, 0, \ldots, 0\Bigr).$$

Expanding out the squares and using the estimate $0 \le X_m < 1/m$, we find that

$$\|X'\|^2 - \|X\|^2 = \frac{X_m(2S - mX_m)}{m-1} \ge \frac{1}{m-1} X_m (2S - 1),$$

which is nonnegative since $S \ge 1$. This shows that $Z$ is a submartingale (increases on average), so by optional stopping,

$$E\bigl[Z(T)\bigr] \ge E\bigl[Z(0)\bigr].$$

From this we find

$$\Bigl(\sum_i I_i\Bigr)^2 - \sum_i I_i^2 \ge E\Bigl[\sum_{1 \le t \le T} \bigl(1 - 1/m(t)\bigr)\Bigr] \ge E[T]/2,$$

and the upper bound follows.

For the lower bound, we first replace (B2) by (B2′). By optional stopping we get

$$E[T] \ge E\Bigl[\sum_{1 \le t \le T} \bigl(1 - 1/m(t)\bigr)\Bigr] = E\Bigl[\sum_i \bigl(X_i(T)^2 - I_i^2\bigr)\Bigr].$$

We can choose the indices so that player 1 is the winner, and players $2, 3, \ldots, k$ are bankrupt in that order. Since $Y \le 0$ at all times (this can be proved by induction), we have

$$X_2(T) \le \frac{1}{k}, \quad X_3(T) \le \frac{1}{k-1}, \quad \ldots, \quad X_k(T) \le \frac{1}{2}.$$

From this we find (writing $H_k$ for $1 + 1/2 + \cdots + 1/k$)

$$\sum_i X_i(T)^2 = \Bigl(S - \sum_{i \ge 2} X_i(T)\Bigr)^2 + \sum_{i \ge 2} X_i(T)^2 \ge S^2 - 2S \sum_{i \ge 2} X_i(T) \ge S^2 - 2S(H_k - 1) \ge S^2 - 2S \log k.$$

Combining these results gives the desired lower bound. □

The natural parameter space for a game with real-valued stakes is the positive orthant of $\mathbb{R}^k$, defined by $I_i \ge 0$. Unfortunately, it is no longer true that $E[T] = \Omega(Q)$ on the orthant. Indeed, suppose that $\sum_i I_i < 1$ with at least two $I_i > 0$. Then some player is already bankrupt. Redistributing his assets under rule (B2) will not change the sum, so bankruptcies will occur until one player has everything. In such cases, then, $T = 0$ but $Q > 0$.

We can, however, prove such a lower bound if the initial stakes are suitably restricted. One interesting restriction is to make the initial stakes equal, as in [1]. Then $Q = k(k-1)I^2$, and the proof of Theorem 6 implies that

$$\frac{E[T]}{Q} \ge 1 - \frac{2(H_k - 1)}{(k-1)I} \ge \frac{1}{6},$$

provided $k \ge 3$ and $I \ge 1$.

Another approach is to show that $E[T] = \Omega(Q)$ on a “large” subset of the orthant. To state this formally, we let $\mu$ be $(k-1)$-dimensional volume, normalized so that the simplex $\sum_i I_i = S$ has volume 1.

Theorem 7. Let $\varepsilon > 0$. If $S \ge (4 \log k)/\varepsilon$, we have $E[T] \ge (1 - \varepsilon)Q$, except for a subset $B$ with $\mu(B) \le k\,2^{-(k-1)}$.

Proof. Let $B$ be the set of initial stakes for which $\sum_i I_i^2 \ge S^2/2$. On the complement of $B$ we have $2Q \ge S^2$, so

$$E[T] \ge Q\Bigl(1 - \frac{2S \log k}{Q}\Bigr) \ge (1 - \varepsilon)Q.$$

We now estimate the volume of $B$. This set is the union of $k$ congruent “corners”, each of which is a simplex with one curved face. Since length squared is convex, if we replace each curved face by a flat one with the same vertices, we get a set that contains $B$. To compute its volume, observe that the vertices of the $i$th corner are $Se_i$, and $(S/2)(e_i + e_j)$, $j \ne i$. (Here $e_i$ has 1 in the $i$th coordinate and 0 elsewhere.) Using the determinant formula [3, p. 221], we find the normalized volume of

one corner to be $2^{-(k-1)}$. The volume of $B$ is at most $k$ times this. □

The volume of the exceptional set $B$ decays rapidly as $k$ grows; at $k = 10$, for example, it comprises less than 2 percent of the simplex. Finally, we note that any game with $k = 2$ has the same duration as a game with integral stakes, so an exact formula can be given in this case.

7. Games with other entrance fees

So far, we have assumed that the players in the monopolist game compete for a pot of constant value, which makes the entrance fee inversely proportional to the number of active players. Other conventions are possible, and in this section we briefly explore two of them.

One possibility is to make the entrance fee one unit, no matter how many players are active. This variant of the game was considered already by Amano et al. [1], at the end of their paper. This case is more difficult to analyze by our techniques, because the conditional expectation of $\|D(t)\|^2$ varies drastically with the number of players. We can prove, however, that $T$, the number of rounds of this game, satisfies

$$E[T] \le Q/2,$$

where as before $Q = (\sum_i I_i)^2 - \sum_i I_i^2$. This is proved similarly to the upper bound of Theorem 2. There is a corresponding lower bound, but it is much less informative. In particular, it does not come near the experimental findings of [1], which suggest that when all $I_i = I$, $E[T]$ is about $k^2 I^2/4$. For $k = 3$, one can use the ideas of Section 5 to find that $E[T] = (7I^3 - 6I^2)/(3I - 2) \sim 0.259259\,k^2 I^2$.

A second possibility is to vary the entrance fee to make the analysis easier. If we can tolerate entrance fees that are irrational (in the mathematical sense), then the duration of the game can be ascertained fairly precisely. Indeed, suppose each player is charged an entrance fee $\alpha$. Then by Lemma 1, we have

$$E\bigl[\|D(t)\|^2 \mid \mathcal{F}(t-1)\bigr] = \alpha^2 m(m-1),$$

and this becomes 1 if we take

$$\alpha = \frac{1}{\sqrt{m(m-1)}}.$$

With this choice of entrance fee, the process

$$Z(t) = \|X(t)\|^2 - t$$

is a martingale, so by the arguments we have used, $E[T] \sim k(k-1)I^2$ assuming that each player starts with an initial stake $I$.

We note that for large $k$, the irrational entrance fee is very close to that of the original game (to wit, $1/k$). However, the original game has a greater mean duration (by Theorem 2). Thus one might surmise that the expected time taken by the final rounds of the game becomes relatively larger, as the number of players grows.

Acknowledgements

I would like to thank Osamu Watanabe for telling me about the monopolist game. Comments by the anonymous referees improved the presentation considerably. The support of the National Science Foundation, via grant CCR-9988202, is also gratefully acknowledged.

References

[1] K. Amano, J. Tromp, P. Vitányi, O. Watanabe, in: M.X. Goemans, et al. (Eds.), Approximation, Randomization, and Combinatorial Optimization, 4th Int’l Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2001, and 5th Int’l Workshop on Randomized and Approximation Techniques in Computer Science, RANDOM 2001, in: Lecture Notes in Comput. Sci., vol. 2129, Springer-Verlag, Berlin, 2001, pp. 181–191.
[2] A. de Moivre, The Doctrine of Chances, or, a Method of Calculating the Probabilities of Events in Play, third ed., A. Millar, London, 1756. Reprinted by Chelsea, New York, 1967.
[3] W. Fleming, Functions of Several Variables, Addison–Wesley, Reading, MA, 1965.
[4] G. Grimmett, D. Stirzaker, Probability and Random Processes, second ed., Oxford Univ. Press, Oxford, UK, 1992.
[5] D. Williams, Probability with Martingales, Cambridge Univ. Press, Cambridge, UK, 1991.