Strategies generalization and payoff fluctuation optimization in the iterated ultimatum game


Physica A xx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Physica A journal homepage: www.elsevier.com/locate/physa


Enock Almeida a, Roberto da Silva b,∗, Alexandre Souto Martinez a

a Departamento de Física e Matemática, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Av. Bandeirantes, 3900 - CEP 14040-901, Ribeirão Preto, São Paulo, Brazil

b Instituto de Física, Universidade Federal do Rio Grande do Sul, Av. Bento Gonçalves, 9500 - CEP 91501-970, Porto Alegre, Rio Grande do Sul, Brazil

highlights

• We propose a generalization of strategies in the iterated ultimatum game.
• Our analysis is separated into two cases: no-memory players and one-step-memory players.
• The acceptance of proposals follows two prescriptions: (a) fixed probabilities and (b) probabilities that depend on the proposal.
• A detailed analysis of the optimization of payoff fluctuations was performed.
• Monte Carlo simulations corroborate the analytical results.

article info

Article history: Received 19 June 2014. Available online xxxx.



abstract

An iterated version of the ultimatum game is presented, based on generalized probabilistic strategies that are mathematically modeled by proposal-accepting functions. These strategies account for the behavior of the players by mixing levels of altruism and greed. We obtain analytically the moments of the players' payoff under such a generalization. Our analysis is divided into two cases: (i) no-memory players, who do not remember previous decisions, and (ii) one-step-memory players, for whom the offers depend on the players' last decision. We start by considering the former case. We show that when the combination of the proposer's altruism and the responder's greed levels balances the proposer's greed and the responder's altruism levels, the average and the variance of the payoff are the same for both players. Our analysis is carried out considering that the acceptance of an offer depends on (a) a fixed probability p or (b) the value offered. The combination of cases (i) and (a) shows that there exists a p value that maximizes the cumulative gain after n iterations. Moreover, we show n × p diagrams with "iso-average" and "iso-variance" regions of the cumulative payoff. Our analytical results are validated by Monte Carlo simulations. For the latter case (b), we show that when players have no memory (i), there are cutoff values at which the variance of the proposer's cumulative payoff presents local maximum and minimum values, while for the responder the same quantity presents a global maximum. For case (b) combined with one-step-memory players (ii), we verified via MC simulations that, for the same number of iterations, the responder obtains different cumulative payoffs by setting different cutoff values. This result composes an interesting pattern of stripes in the cutoff versus n diagrams. Simultaneously, looking at the variance of this quantity for the responder in a similar diagram, we observe regions of iso-variance in nontrivial patterns that depend on the initial value of the proposal. Our contributions, detailed by analytical results and MC simulations, are useful for designing new experiments on the ultimatum game in stochastic scenarios. © 2014 Published by Elsevier B.V.

∗ Corresponding author. Tel.: +55 51 8196 0903. E-mail addresses: [email protected] (E. Almeida), [email protected], [email protected] (R. da Silva), [email protected] (A.S. Martinez). http://dx.doi.org/10.1016/j.physa.2014.06.032 0378-4371/© 2014 Published by Elsevier B.V.

1. Introduction

Game theory plays an important role in explaining the interactions between living creatures in the biological sciences and the social features of stock markets, among other examples. In these systems, one considers individuals composing homogeneous or heterogeneous populations, with or without spatial structure, who negotiate, combat or collaborate via some protocol of the game-theoretical framework. The full comprehension of cooperation between individuals as an emergent collective behavior remains a challenge [1–3]. Cooperation emerges as a stable strategy in the spatial prisoner's dilemma [4–8].

The ultimatum game plays an important role in mimicking bargaining aspects that emerge in real situations. In this game, first proposed by Güth et al. [9], two players must divide an amount (a sum of money). One of the players (the proposer) proposes a division, and the second player (the responder) can either accept or reject it. If the responder accepts it, the values are distributed according to the division established by the proposer. Otherwise, neither player earns anything.

Even in a single turn, the ultimatum game is interesting. Although it is better for the responder to accept any offer, offers below one third are often rejected [10]. In the iterated game, the responder punishes the proposer until proposal and acceptance reach a balance. In general, values around half of the total amount are accepted [10,11]. If played iteratively, for example in n turns, the iterated ultimatum game is suitable for explaining the emergence of cooperation between players [10]. On the one hand, the authors of Refs. [12,13] have shown that, in a linear lattice with periodic boundary conditions, players who offer and accept the smallest values can spread their strategies throughout their neighbors. On the other hand, Szolnoki et al. [10] noticed that, in a square lattice, fair players get larger payoffs. This altruistic behavior has also been observed in humans, and there is evidence that unequal offers activate brain regions related to pain and affliction [11]. Uncomfortable feelings lead the responder to sacrifice his own gain to punish the proposer.

The authors of Refs. [14,15] have calculated the players' payoff statistics and have shown the necessary conditions for a strategy to dominate the others. Their results have been corroborated by Monte Carlo numerical simulations. The statistical fluctuations of the payoff in this game have been addressed in different versions of the model: (i) the spatial ultimatum game (see, for example, Refs. [16,17]) and (ii) populations of players in matching graphs and in complete graphs (mean-field regimes) [14,15]. Nevertheless, the generalization and optimization of ultimatum game strategies remain poorly explored, even for the iterated version without topology effects.

In this paper, we call attention to the iterated ultimatum game, disregarding the effects of spatial structure. We focus on the payoff statistical moments of generalized strategies, searching for optimum parameter values. We generalize the strategies of the proposer and the responder in the iterated ultimatum game. This generalization allows us to go beyond basic strategies, in which the only effect is to lead to the fifty-fifty proposal/acceptance through the known punishment mechanism. We also address optimal strategies related to the maximization/minimization of the payoff and its variance when the players face stochastic scenarios. First, we consider players without memory of previous decisions. Next, we consider players who recall the values and decisions of the preceding turn. For memoryless players, we have obtained analytical probability distributions for the proposal and for the response, with a parameter tuning the players' decisions from altruism to greed. For players with one-step memory, the proposer adjusts the values depending on the rejection/acceptance of the adversary in the preceding turn. We have analytically calculated the first and the second statistical moments of the payoff.

The paper is organized as follows. In Section 2, we develop a generalized approach to calculate the statistical moments of the payoff considering static strategies, for memoryless players. General results, ranging from altruistic to greedy behaviors, are obtained for the gain fluctuations in the iterated version of this game. In Section 3, we address the evolutionary strategies, with players having one-step memory, based on the idea of offer increments/decrements that depend on the previous result. In Section 4, we present our main results for the static and evolutionary versions of the iterated game. The optimization of strategies is studied in detail. Finally, in Section 5, we present a summary and some conclusions.

2. No memory players (static strategies) with generalized strategies

In the iterated ultimatum game, consider the probabilities (i) $p_p(y)$ that the proposer offers $y \in [0, x]$ to her/his opponent, keeping $(x - y)$ as her/his own part, and (ii) $p_r(y)$ that the responder accepts the deal. The $k$th statistical moments of the gain (per match) for the two players are, respectively:

$$
\langle g^k \rangle_p^{(n)} = \frac{1}{n}\sum_{j=1}^{n} (x - y_j)^k \, p_p^{(j)}(y_j)\, p_r^{(j)}(y_j), \qquad
\langle g^k \rangle_r^{(n)} = \frac{1}{n}\sum_{j=1}^{n} y_j^k \, p_p^{(j)}(y_j)\, p_r^{(j)}(y_j), \tag{1}
$$

where $n$ is the number of rounds (iterations). In each round, the proposer offers and the responder accepts the amounts with different probability distributions, indexed by $(j)$ in $p_p^{(j)}(y_j)$ and $p_r^{(j)}(y_j)$. Notice that the term relative to zero gain, which occurs with probability $1 - p_r^{(j)}$, has not been written. For numerous rounds ($n \gg 1$), one can make use of the continuous approximation. Considering that the offers and acceptances are identically distributed, these statistical moments become:

$$
\langle g^k \rangle_p = \int_0^1 (1 - y)^k f_p(y) f_r(y)\, dy \tag{2}
$$

$$
\langle g^k \rangle_r = \int_0^1 y^k f_p(y) f_r(y)\, dy, \tag{3}
$$

where, without loss of generality, we consider $x = 1$. In the continuous limit, the discrete probability $p_p(y)$ becomes the probability density function (pdf) $f_p(y)$. The probability $p_r(y)$, however, does not lead directly to a pdf. Technically, it only needs to be a number in the interval $[0, 1]$ for each given $y \in [0, x]$, since the responder accepts with probability $p_r(y)$ and rejects with probability $1 - p_r(y)$. To cover a multitude of different situations in the iterated ultimatum game, let us propose the use of the generalized multi-parametric functions $f_p(y)$ and $f_r(y)$:

$$
f_p(y) = \frac{\Gamma(q_1 + q_2 + 2)}{\Gamma(q_1 + 1)\Gamma(q_2 + 1)}\, y^{q_1} (1 - y)^{q_2}, \tag{4}
$$

for $0 \le y \le 1$, so that $\int_0^1 dy\, f_p(y) = 1$. The function relative to $p_r(y)$ is:

$$
f_r(y) =
\begin{cases}
\dfrac{c_{q_3,q_4}}{q_5\, q_6^{q_4}}\, g(y) & \text{for } 0 \le y \le q_6 \\[2mm]
1 & \text{for } q_6 < y \le 1,
\end{cases} \tag{5}
$$

with

$$
c_{q_3,q_4} = \frac{(q_3 + q_4)^{q_3 + q_4}}{q_3^{q_3}\, q_4^{q_4}}, \tag{6}
$$

$$
g(y) = (1 - y)^{q_3}\, y^{q_4}, \tag{7}
$$

so that $0 \le f_r(y) \le 1$. The parameters $q_1, \ldots, q_6$ control the behavior of the players. On the one hand, the proposer can offer greater amounts to the responder with greater probabilities, following a distribution based on "complete altruism" ($q_2 = 0$, $q_1 > 0$). On the other hand, the probabilities can follow a distribution based on "complete greed" ($q_2 > 1$, $q_1 = 0$), where the proposer offers smaller amounts to the responder with greater probabilities. The intermediate cases ($q_1 > 0$, $q_2 > 0$) mimic players with mixed strategies, which interpolate between the extreme situations. The case $q_1 = q_2 = 0$ represents a very special situation: the proposer offers amounts carelessly, i.e., all offers are equally probable.

The accepting process is significantly different from the proposing one. Mathematically, $f_r(y)$ is not a pdf and depends on four parameters: $q_3$ and $q_4$ shape the dependence on the offer through $g(y)$, $q_5 \ge 1$ only controls the overall accepting magnitude, and $q_6 \le 1$ is a cutoff to be used in some specific strategies. For example, first consider $q_3 = q_4 = 0$. For $q_5 = q_6 = 1$, the responder accepts any amount, independently of the offer. The responder accepts the offers according to a non-biased coin for $q_5 = 2$, or according to a biased coin for $q_5 > 2$.

Now, let us justify the term $c_{q_3,q_4}$. Consider Eq. (7). For $q_3 = q_4 = 0$, $g(y) = 1$ throughout the domain. For $q_3 > 0$ and $q_4 = 0$, $g$ is a monotonically decreasing function with $g(0) = 1$. For $q_3 = 0$ and $q_4 > 0$, $g$ is a monotonically increasing function with $g(1) = 1$. If $q_3, q_4 > 0$, then $g(0) = g(1) = 0$ and, since $g(y) \ge 0$, $g$ has a maximum for some $y \in\, ]0, 1[$. Since $g'(y) = [q_4/y - q_3/(1 - y)]\, g(y)$, one has $g'(y) = 0$ for $0 < y < 1$ only if $y^* = q_4/(q_3 + q_4)$, so that $g(y^*) = 1/c_{q_3,q_4}$, which leads to the normalization of Eq. (6) used in Eq. (5). Some examples are depicted in Fig. 1. For the sake of simplicity, let us start by considering $q_6 = 1$. The cases with cutoff will be explored in Section 4. Substituting Eqs. (4) and (5) in Eqs. (2) and (3), we obtain:

$$
\langle g^k \rangle_p = \frac{c_{q_3,q_4}}{q_5}\, \frac{\Gamma(q_1 + q_2 + 2)}{\Gamma(q_1 + 1)\Gamma(q_2 + 1)} \cdot \frac{\Gamma(q_1 + q_4 + 1)\,\Gamma(q_2 + q_3 + k + 1)}{\Gamma(q_1 + q_2 + q_3 + q_4 + k + 2)} \tag{8}
$$

and

$$
\langle g^k \rangle_r = \frac{c_{q_3,q_4}}{q_5}\, \frac{\Gamma(q_1 + q_2 + 2)}{\Gamma(q_1 + 1)\Gamma(q_2 + 1)} \cdot \frac{\Gamma(q_1 + q_4 + k + 1)\,\Gamma(q_2 + q_3 + 1)}{\Gamma(q_1 + q_2 + q_3 + q_4 + k + 2)} \tag{9}
$$

which leads to the general ratio between the average values:

$$
\frac{\langle g^k \rangle_p}{\langle g^k \rangle_r} = \frac{\Gamma(q_1 + q_4 + 1)\,\Gamma(q_2 + q_3 + k + 1)}{\Gamma(q_1 + q_4 + k + 1)\,\Gamma(q_2 + q_3 + 1)}. \tag{10}
$$

A simple expression is obtained for $k = 1$ and $q_1$, $q_2$, $q_3$ and $q_4$ integers, since $\Gamma(z + 1) = z\,\Gamma(z)$:

$$
\frac{\langle g \rangle_p}{\langle g \rangle_r} = \frac{\Gamma(q_1 + q_4 + 1)\,\Gamma(q_2 + q_3 + 2)}{\Gamma(q_1 + q_4 + 2)\,\Gamma(q_2 + q_3 + 1)} = \frac{q_2 + q_3 + 1}{q_1 + q_4 + 1}. \tag{11}
$$

Fig. 1. Examples of different parameter values in the accepting function (Eq. (5)) without cutoff effects (i.e., $q_6 = 1$). (a) Cases with $q_3 = q_4 = 0$, in which an offer is accepted with probability $1/q_5$, where $q_5 \ge 1$: $q_5 = 1$ (all offers are accepted), $q_5 = 2$ (50% of the offers are accepted, independently of their value) and $q_5 = 3$. The case $q_5 \to \infty$ corresponds to a player who simply rejects all offers. In plots (b), (c), (d), (e), and (f) we set $q_5 = 1$, which guarantees that $\max[f_r(y)] = 1$. Plots (b) and (c) explore, respectively, $q_3$ ($q_4$) fixed to 0.1 with $q_4$ ($q_3$) assuming the values 0, 0.1, 0.5, 1, and 4. Plots (d) and (e) correspond to the same study but with $q_3$ ($q_4$) fixed to 1. Finally, plot (f) corresponds to the case $q_3 = q_4 = q$, with $q$ assuming the values 0.1, 0.3, 1, 2, and 10. These plots show that the parameters cover many intermediate behaviors between altruism and greed.


If $q_2 + q_3 = q_1 + q_4$, then $\langle g \rangle_p = \langle g \rangle_r$. This means that the proposer and the responder have the same average gain when the altruism level of the proposer combined with the greed level of the responder balances the altruism level of the responder combined with the greed level of the proposer. The variances of the proposer and the responder are $\mathrm{Var}(g)_p = \langle g^2 \rangle_p - \langle g \rangle_p^2$ and $\mathrm{Var}(g)_r = \langle g^2 \rangle_r - \langle g \rangle_r^2$, respectively. Writing $\Phi \equiv \Phi(q_1, q_2, q_3, q_4, q_5)$, Eqs. (8) and (9) give:

$$
\mathrm{Var}(g)_p = \left[ \frac{(q_2 + q_3 + 2)(q_2 + q_3 + 1)}{q_1 + q_2 + q_3 + q_4 + 3} - (q_2 + q_3 + 1)^2\, \Phi \right] \Phi \tag{12}
$$

and

$$
\mathrm{Var}(g)_r = \left[ \frac{(q_1 + q_4 + 2)(q_1 + q_4 + 1)}{q_1 + q_2 + q_3 + q_4 + 3} - (q_1 + q_4 + 1)^2\, \Phi \right] \Phi \tag{13}
$$

where

$$
\Phi(q_1, q_2, q_3, q_4, q_5) = \frac{c_{q_3,q_4}}{q_5}\, \frac{\Gamma(q_1 + q_2 + 2)}{\Gamma(q_1 + 1)\Gamma(q_2 + 1)} \cdot \frac{\Gamma(q_1 + q_4 + 1)\,\Gamma(q_2 + q_3 + 1)}{\Gamma(q_1 + q_4 + q_2 + q_3 + 3)}. \tag{14}
$$

Thus,

$$
\frac{\mathrm{Var}(g)_p}{\mathrm{Var}(g)_r} = \frac{(q_2 + q_3 + 1)\left[ (q_2 + q_3 + 2) - (q_1 + q_2 + q_3 + q_4 + 3)(q_2 + q_3 + 1)\, \Phi \right]}{(q_1 + q_4 + 1)\left[ (q_1 + q_4 + 2) - (q_1 + q_2 + q_3 + q_4 + 3)(q_1 + q_4 + 1)\, \Phi \right]}. \tag{15}
$$

Independently of the value of $q_5$, if $q_2 + q_3 = q_1 + q_4$, then $\mathrm{Var}(g)_p = \mathrm{Var}(g)_r$, as occurs for the average values. In Section 4, we present some results for no-memory players in two specific situations with static strategies. In each round, the proposer offers an amount $y$ uniformly distributed in $[0, 1]$ and the responder accepts it with (i) the same probability $p$, independently of the offer, or (ii) a probability that depends on the offer. In the latter case, an analysis of the cutoff parameter is performed.
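As a numerical cross-check of the closed forms of this section, the following minimal Python sketch evaluates the moments of Eqs. (8) and (9) with gamma functions and compares them with a direct midpoint-rule integration of Eqs. (2) and (3), assuming no cutoff ($q_6 = 1$). The function names (`c_norm`, `moment_closed_form`, `moment_numeric`, `variance`) and the number of integration steps are illustrative choices made here, not part of the original work.

```python
import math


def c_norm(q3, q4):
    """Normalization c_{q3,q4} of Eq. (6); Python evaluates 0**0 as 1."""
    return (q3 + q4) ** (q3 + q4) / (q3 ** q3 * q4 ** q4)


def moment_closed_form(k, q1, q2, q3, q4, q5, player):
    """k-th payoff moment from Eq. (8) (proposer) or Eq. (9) (responder), q6 = 1."""
    a, b = q1 + q4, q2 + q3              # total exponents of y and (1 - y)
    if player == "proposer":
        b += k                           # extra factor (1 - y)^k
    else:
        a += k                           # extra factor y^k
    norm = math.gamma(q1 + q2 + 2) / (math.gamma(q1 + 1) * math.gamma(q2 + 1))
    beta = math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)
    return c_norm(q3, q4) / q5 * norm * beta


def moment_numeric(k, q1, q2, q3, q4, q5, player, steps=20000):
    """Same moment by midpoint-rule integration of Eqs. (2)/(3) with q6 = 1."""
    norm = math.gamma(q1 + q2 + 2) / (math.gamma(q1 + 1) * math.gamma(q2 + 1))
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) / steps
        f_p = norm * y ** q1 * (1 - y) ** q2                    # Eq. (4)
        f_r = c_norm(q3, q4) / q5 * (1 - y) ** q3 * y ** q4     # Eq. (5), q6 = 1
        gain = (1 - y) if player == "proposer" else y
        total += gain ** k * f_p * f_r
    return total / steps


def variance(q1, q2, q3, q4, q5, player):
    """Payoff variance per round, <g^2> - <g>^2, from the closed forms."""
    m1 = moment_closed_form(1, q1, q2, q3, q4, q5, player)
    m2 = moment_closed_form(2, q1, q2, q3, q4, q5, player)
    return m2 - m1 * m1


if __name__ == "__main__":
    # Balanced case q2 + q3 = q1 + q4: both players share mean and variance.
    for who in ("proposer", "responder"):
        print(who, moment_closed_form(1, 1, 2, 1, 2, 1, who), variance(1, 2, 1, 2, 1, who))
```

Under these assumptions, uniform offers accepted with probability 1/2 ($q_1 = q_2 = q_3 = q_4 = 0$, $q_5 = 2$) give an average of 1/4 and a variance of 5/48 for both players.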


3. Players with one step memory


Now, let us consider the case where the proposer's offer is incremented or decremented depending only on the responder's immediately preceding decision. In our model, the proposer offers $y_0$ in the first round. If the responder accepts it, the proposer decreases the offer by $\Delta y$ in the following round. However, if the responder declines it, the proposer increases the offer by $\Delta y$ in the following round. Consider the case where the responder always accepts the offer with a fixed probability $f_r(y) = p \in [0, 1]$, so that the offer is rejected with probability $1 - p$. This assumption allows us to obtain analytical results for the one-step-memory iterated game. Given $\Delta y$ and $p$, the average offer in the $i$th round is:

$$
\langle y_i \rangle = y_0 + i\, \Delta y\, (1 - 2p), \tag{16}
$$

where $i = 0, 1, \ldots, n$, since in each round the average offer is modified by $\langle (\Delta y)_i \rangle = (1 - p)\Delta y - p\,\Delta y = \Delta y\,(1 - 2p)$. In the $i$th round, the responder's average payoff is $\langle g_i \rangle = p \langle y_i \rangle = p y_0 + i\, p\, \Delta y\, (1 - 2p)$. Thus, after $n$ iterations (rounds $i = 0, \ldots, n - 1$), the average of the cumulative payoff is

$$
G_r = \sum_{i=0}^{n-1} \langle g_i \rangle = n p y_0 + \frac{n(n - 1)}{2}\, p (1 - 2p)\, \Delta y. \tag{17}
$$

Similarly, the proposer has an average cumulative payoff:

$$
G_p = n p (1 - y_0) - \frac{n(n - 1)}{2}\, p (1 - 2p)\, \Delta y. \tag{18}
$$

The probability $p$ that, for a given $n$, maximizes the cumulative responder gain $G_r$ is:

$$
p^* = \frac{1}{4} \left[ \frac{2 y_0}{(n - 1)\Delta y} + 1 \right]. \tag{19}
$$
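The average cumulative payoffs and the optimal acceptance probability can be evaluated directly. The short Python sketch below is only an illustration: the function names `cumulative_payoffs` and `optimal_p` are hypothetical, and the parameter values in the usage lines ($n = 1000$, $y_0 = 0.5$, $\Delta y = 6 \cdot 10^{-4}$) are example inputs chosen here.

```python
def cumulative_payoffs(n, p, y0, dy):
    """Average cumulative payoffs (G_r, G_p) after n rounds, Eqs. (17) and (18)."""
    drift = 0.5 * n * (n - 1) * p * (1 - 2 * p) * dy
    g_r = n * p * y0 + drift
    g_p = n * p * (1 - y0) - drift
    return g_r, g_p


def optimal_p(n, y0, dy):
    """Acceptance probability p* of Eq. (19), which maximizes G_r for n > 1."""
    return 0.25 * (2 * y0 / ((n - 1) * dy) + 1)


if __name__ == "__main__":
    n, y0, dy = 1000, 0.5, 6e-4
    p_star = optimal_p(n, y0, dy)
    print("p* =", p_star)
    print("G_r, G_p at p* =", cumulative_payoffs(n, p_star, y0, dy))
```

Since $G_r$ is a concave quadratic in $p$, the stationary point of Eq. (19) is a maximum. The sketch returns the raw value of $p^*$ and does not check that it lies in $[0, 1]$, which has to be verified separately for a given choice of $n$, $y_0$ and $\Delta y$.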

After a certain number of rounds, the value offered by the proposer can become negative or greater than 1. This critical number can be calculated by setting $\langle y_i \rangle = 0$ or $1$ in Eq. (16), respectively, yielding two possibilities:

$$
n_c =
\begin{cases}
\left\lceil \dfrac{n_0}{2p - 1} \right\rceil & \text{for } p > 1/2 \\[3mm]
\left\lceil \dfrac{1 - y_0}{y_0}\, \dfrac{n_0}{1 - 2p} \right\rceil & \text{for } p < 1/2
\end{cases} \tag{20}
$$

Fig. 2. Existence of a value of $p$ that maximizes the cumulative payoff (Eq. (17)) as a function of $p$, for different values of $y_0$. For these experiments, $n = 10^3$ and $\Delta y = 6 \cdot 10^{-4}$ have been used.

where $n_0 = y_0/\Delta y$ and $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. Naturally, $y_0$ and $1 - y_0$ are non-negative values; therefore, if $p > 1/2$, $n_c = \lceil y_0/[\Delta y (2p - 1)] \rceil$, while for $p < 1/2$, $n_c = \lceil (1 - y_0)/[\Delta y (1 - 2p)] \rceil$. As $p \to 1/2$, $n_c \to \infty$. Eq. (17) holds only for $n < n_c$. For example, for $n - 1 = n_0$, $p^* = 3/4$, showing that there is a nontrivial optimal value. There is a probability $p$ that maximizes the responder's cumulative gain, given $y_0$, $n$ and $\Delta y$. This maximization is depicted in Fig. 2. After $n$ rounds, the variance of the responder's cumulative gain is:

$$
\begin{aligned}
\mathrm{Var}_r(G) ={}& n p (1 - p)\, y_0^2 + 4 n (n - 1)\, p (p - 1) \left( p - \frac{1}{4} \right) y_0\, \Delta y \\
&+ \left[ 2 n (n - 1)(2n - 3)\, p^3 (1 - p) - 2 n (n - 1)(n - 2)\, p^2 (1 - p) + \frac{n (n - 1)(2n - 1)\, p (1 - p)}{6} \right] \Delta y^2
\end{aligned} \tag{21}
$$

and similarly, for the proposer:

$$
\begin{aligned}
\mathrm{Var}_p(G) ={}& n p (1 - p)(1 - y_0)^2 - 4 n (n - 1)\, p (p - 1) \left( p - \frac{1}{4} \right) (1 - y_0)\, \Delta y \\
&+ \left[ 2 n (n - 1)(2n - 3)\, p^3 (1 - p) - 2 n (n - 1)(n - 2)\, p^2 (1 - p) + \frac{n (n - 1)(2n - 1)\, p (1 - p)}{6} \right] \Delta y^2.
\end{aligned} \tag{22}
$$

So, in this section, we showed that accepting offers with a fixed probability $p$ leads, after $n$ iterations, to a cumulative gain that has a maximum value at $p^*$ given by Eq. (19). In the next section (Main results), we will show $n \times p$ color maps that illustrate some patterns of iso-payoff and iso-variance, since the cumulative payoff and its variance can assume equal values for different values of $n$ under different $p$ values. We will also show results for the case in which we have no analytical formulas, only MC simulations: acceptance that depends on the offer. In this case, we find that an interesting pattern of stripes of iso-payoff appears in the $n \times q_6$ diagrams, with a crossover at values of $q_6$ that depend on $y_0$. In other similar diagrams, we find that the regions of maximal variance are larger for small values of $y_0$ and concentrated at $q_6 > 0.5$. However, the magnitude of the maximal values of the variance is larger for greater values of $y_0$.
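For a small number of rounds, the mean and variance of the responder's cumulative gain can be checked without sampling, by enumerating all $2^n$ accept/reject sequences of the one-step-memory rule with fixed acceptance probability $p$. The Python sketch below is an illustration under that assumption; the function names are hypothetical, and the enumeration is an independent check written here, not the procedure used by the authors. It implements the mean $n p y_0 + \frac{n(n-1)}{2} p (1 - 2p) \Delta y$ and the variance of Eq. (21).

```python
from itertools import product


def exact_responder_stats(n, p, y0, dy):
    """Exact (mean, variance) of the responder's cumulative gain over all 2**n histories."""
    mean = second = 0.0
    for history in product((0, 1), repeat=n):
        prob, offer, gain = 1.0, y0, 0.0
        for accepted in history:
            prob *= p if accepted else 1 - p
            if accepted:
                gain += offer        # responder earns the current offer
                offer -= dy          # proposer lowers the next offer
            else:
                offer += dy          # proposer raises the next offer
        mean += prob * gain
        second += prob * gain * gain
    return mean, second - mean * mean


def mean_responder_eq17(n, p, y0, dy):
    """Average cumulative responder payoff, Eq. (17)."""
    return n * p * y0 + 0.5 * n * (n - 1) * p * (1 - 2 * p) * dy


def var_responder_eq21(n, p, y0, dy):
    """Variance of the cumulative responder payoff, Eq. (21)."""
    cross = 4 * n * (n - 1) * p * (p - 1) * (p - 0.25) * y0 * dy
    quad = (2 * n * (n - 1) * (2 * n - 3) * p ** 3 * (1 - p)
            - 2 * n * (n - 1) * (n - 2) * p ** 2 * (1 - p)
            + n * (n - 1) * (2 * n - 1) * p * (1 - p) / 6)
    return n * p * (1 - p) * y0 ** 2 + cross + quad * dy ** 2


if __name__ == "__main__":
    n, p, y0, dy = 6, 0.45, 0.5, 0.01
    print(exact_responder_stats(n, p, y0, dy))
    print(mean_responder_eq17(n, p, y0, dy), var_responder_eq21(n, p, y0, dy))
```

The enumeration grows as $2^n$, so it is only practical for small $n$; for larger $n$ a Monte Carlo estimate is the natural replacement.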


4. Main results


Here, we present results for some specific situations. For instance, in each round, the proposer offers an amount $y$ uniformly distributed in $[0, 1]$. The responder accepts this amount with (i) the same probability $p$, independently of the offer value; (ii) probability $y$, which increases with the offer (greedy player); or (iii) a probability that increases with the offer up to $y = q_6$ and remains unitary for greater values (controlled greedy player). Our main results concern fixed strategies and evolutionary strategies (increment/decrement of the offers). In the first part, we study some cases by setting specific parameters to explore the analytical formulas obtained previously. In the second part, we compare the analytical results with MC simulations.


4.1. Static probabilistic strategies on the offers (no memory game)


In the case of static probabilistic strategies on the offers, it is important to calculate the average gain per round, since the trace of the game is not available: the gain in a given iteration does not depend on the previous decisions. Let us choose an important particular case and consider that the offers are distributed uniformly in the interval $[0, 1]$. In this case, let us consider the accepting functions that enter Eqs. (8) and (9). The emphasis of our analysis is on cutoff values ($q_6 \neq 1$), which have been disregarded so far.


Fig. 3. Average and variance of the payoff of the responder (surface) and the proposer (patch) as a function of the pair $(q_4, q_6)$.

First, consider a fixed probability for the responder to accept an offer, so that $q_1 = q_2 = q_3 = q_4 = 0$, $q_6 = 1$ and $p = 1/q_5$ in Eqs. (8) and (9), leading to $\langle g \rangle_p = p/2$, $\langle g \rangle_r = p/2$ and $\mathrm{Var}_r(g) = \mathrm{Var}_p(g) = -p^2/4 + p/3$. In particular, for $p = 1/2$, $\mathrm{Var}_r = \mathrm{Var}_p = 5/48$.

Still considering the case of no cutoff ($q_6 = 1$), let us address the case where the responder accepts the offer with a probability that grows with the offer value (greedy player). In this case, we choose $q_1 = q_2 = q_3 = 0$, $q_4 > 0$ and $q_5 = q_6 = 1$, so that $\langle g \rangle_p = \Gamma(q_4 + 1)/\Gamma(q_4 + 3)$ and $\langle g \rangle_r = \Gamma(q_4 + 2)/\Gamma(q_4 + 3)$. For an arbitrary $q_4$ value, one has $\langle g \rangle_r / \langle g \rangle_p = \Gamma(q_4 + 2)/\Gamma(q_4 + 1) = q_4 + 1$. For example, for the linear case ($q_4 = 1$), i.e., $f_r(y) = y$, we have $\langle g \rangle_p = 1/6 = \langle g \rangle_r/2$, which shows that the responder's average gain is double that of the proposer. Before analyzing the variance, let us generalize the study of the fluctuations to any pair $(q_4, q_6)$ and then take $q_6 \to 1$.

Now, let us consider the case with cutoff ($q_6 \le 1$) to explore the effect of the cutoff value when the acceptance depends on the offer, for one particularly interesting case. Offers are uniformly distributed in the interval $[0, 1]$ ($q_1 = q_2 = q_3 = 0$) and $q_5 = 1$, so that:

$$
f_r(y) =
\begin{cases}
\left( \dfrac{y}{q_6} \right)^{q_4} & \text{if } 0 < y \le q_6 \\[2mm]
1 & \text{if } q_6 < y \le 1.
\end{cases} \tag{23}
$$

Here it is important to mention that our analysis concerns $0 \le q_4 \le 1$, where the maximum response is the linear one. In this case, Eqs. (8) and (9) are not valid, since $q_6 < 1$. The average values are given by:

$$
\langle g \rangle_p = \frac{1}{q_6^{q_4}} \int_0^{q_6} (1 - y)\, y^{q_4}\, dy + \int_{q_6}^{1} (1 - y)\, dy = \frac{q_4\, q_6^2}{2 (2 + q_4)} - \frac{q_4\, q_6}{1 + q_4} + \frac{1}{2} \tag{24}
$$

and

$$
\langle g \rangle_r = \frac{1}{q_6^{q_4}} \int_0^{q_6} y^{q_4 + 1}\, dy + \int_{q_6}^{1} y\, dy = \frac{1}{2} - \frac{q_4\, q_6^2}{2 (q_4 + 2)}. \tag{25}
$$

As expected, in the limit $q_6 \to 1$ we can verify that $\langle g \rangle_r / \langle g \rangle_p = \Gamma(q_4 + 2)/\Gamma(q_4 + 1)$, in agreement with the previous results. An interesting case is the linear response ($q_4 = 1$) up to $q_6$, with the responder accepting all offers with greater values, i.e., $f_r(y) = y/q_6$ for $0 < y < q_6$ and $f_r(y) = 1$ for $q_6 \le y \le 1$. In this case, $\langle g \rangle_p = q_6^2/6 - q_6/2 + 1/2$ and $\langle g \rangle_r = -q_6^2/6 + 1/2$. Here $\langle g \rangle_p$ decreases monotonically for $0 \le q_6 \le 1$ and reaches its minimum at $q_6 = 1$, i.e., the proposer has a minimum gain when the responder does not accept all offers with unitary probability. There are no candidates for local extrema, since $\partial \langle g \rangle_r / \partial q_6 = 0$ leads to $q_6^* \equiv 0$ and $\partial \langle g \rangle_r / \partial q_4 = 0$ results in $q_6^* \equiv 0$. On the other hand, $\partial \langle g \rangle_p / \partial q_6 = 0$ results in $q_6^* = (2 + q_4)/(1 + q_4)$, but the restriction $0 \le q_6^* \le 1$ requires $q_4^* \in\, ]-\infty, -2]$. Since our analysis is restricted to $0 \le q_4 \le 1$, there are no candidates for extremal values. Similarly, $\partial \langle g \rangle_p / \partial q_4 = 0$ leads to $q_6^* = [(q_4 + 2)/(q_4 + 1)]^2$ and, under the same restriction, to $q_4^* \in\, ]-\infty, -2]$, so that again there are no candidates for extremal values. In Fig. 3 (first plot), we observe the monotonic behavior of the average gains of the responder (surface) and the proposer (patch). The responder's average gain outperforms the proposer's for any set of parameters $(q_6, q_4)$. The second moments are:

$$
\langle g^2 \rangle_p = \frac{1}{q_6^{q_4}} \int_0^{q_6} (1 - y)^2\, y^{q_4}\, dy + \int_{q_6}^{1} (1 - y)^2\, dy = \frac{q_6}{q_4 + 1} - \frac{2 q_6^2}{q_4 + 2} + \frac{q_6^3}{q_4 + 3} - \frac{(q_6 - 1)^3}{3} \tag{26}
$$

Fig. 4. Variance of the responder (a) and the proposer (b) for different values of q4 as a function of q6 .

Fig. 5. Variance of the receptor as a function of q6 for specific case q4 = 0.47.

and

$$
\langle g^2 \rangle_r = \frac{1}{q_6^{q_4}} \int_0^{q_6} y^{q_4 + 2}\, dy + \int_{q_6}^{1} y^2\, dy = \frac{q_6^3}{q_4 + 3} + \frac{1 - q_6^3}{3}. \tag{27}
$$

The variances are $\mathrm{Var}(g)_r = \langle g^2 \rangle_r - \langle g \rangle_r^2$ and $\mathrm{Var}(g)_p = \langle g^2 \rangle_p - \langle g \rangle_p^2$, according to Eqs. (24)–(27). In Fig. 3 (second plot), we show the behaviors of $\mathrm{Var}(g)_r$ (surface) and $\mathrm{Var}(g)_p$ (patch), which motivate some questions to be explored. First of all, we analyze the candidates for critical points. Setting $\partial \mathrm{Var}(g)_r / \partial q_6 = 0$ results in two nonzero possibilities:

$$
q_6^* = \frac{-(2 + q_4)^2 \pm \sqrt{\Delta}}{2 q_4 (3 + q_4)} \tag{28}
$$

Fig. 6. Validation of our analytical results with Monte Carlo numerical simulations. The color diagrams show the average values of the cumulative payoff of the responder as a function of n (iterations) and p (accepting probability). The left side presents the results obtained by Monte Carlo simulations, while the right side presents the analytical results (Eq. (17)). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

where $\Delta = 16 + 104 q_4 + 108 q_4^2 + 40 q_4^3 + 5 q_4^4$. On the other hand, $\partial \mathrm{Var}(g)_r / \partial q_4 = 0$ also leads to two nonzero candidates:

$$
q_6^* = \frac{-(2 + q_4)^3 \pm \sqrt{\Delta_2}}{2 q_4 (3 + q_4)^2} \tag{29}
$$

where $\Delta_2 = 64 + 840 q_4 + 1428 q_4^2 + 1024 q_4^3 + 372 q_4^4 + 68 q_4^5 + 5 q_4^6$. The only simultaneous solutions of Eqs. (28) and (29) correspond to $q_4^* = -2$, which does not belong to $[0, 1]$. So there are no extremal candidates for $\mathrm{Var}(g)_r$ and, similarly, we can conclude that there are none for $\mathrm{Var}(g)_p$ in $0 < q_6 < 1$. However, it is important to analyze the following marginal optimization question: for a fixed $q_4$ value, what is the value of $q_6$ that locally maximizes/minimizes $\mathrm{Var}(g)$?

Let us start with $\mathrm{Var}(g)_r$. In this case, we must analyze $\mathrm{Var}(g)_r$ using Eq. (28), where the acceptable alternative ($q_6 \ge 0$) is the one with $+\sqrt{\Delta}$. Substituting it in the formula of $\mathrm{Var}(g)_r$, we obtain a function that depends only on $q_4$. Numerically, we can observe that $0 < q_6^* < 1$ when $\sqrt{3} - 1 = 0.73205\ldots < q_4 < 1$, so extrema of $\mathrm{Var}(g)_r$ conditioned on the value of $q_4$ must be found only in this interval of $q_4$. The candidates for extrema of $\mathrm{Var}(g)_r$ are concentrated around $q_6^* > 0.94\ldots$, as can be numerically verified. In Fig. 4(a), the plots show the maximum of $\mathrm{Var}(g)_r$, for different values of $q_4$, as a function of $q_6$. We can also verify that maximum points appear only in the interval $q_4 \in [\sqrt{3} - 1, 1]$.

Now, let us analyze $\mathrm{Var}(g)_p$. The condition $\partial \mathrm{Var}(g)_p / \partial q_6 = 0$ results in:

$$
q_6^* = \frac{q_4^3 + 5 q_4^2 + 5 q_4 - 2 \pm \sqrt{\Delta_3}}{q_4 (1 + q_4)(3 + q_4)} \tag{30}
$$

where $\Delta_3 = 4 - q_4^4 - 6 q_4^3 - 10 q_4^2 - 2 q_4$. On the one hand, if one considers $+\sqrt{\Delta_3}$, there is a branch of acceptable solutions in the interval $0.682\ldots < q_6^* < 1$, which corresponds to $0.4142\ldots < q_4 < 0.481\ldots$. On the other hand, if one considers $-\sqrt{\Delta_3}$, there is a branch of acceptable solutions in the complementary interval $0 < q_6^* < 0.682\ldots$, which corresponds to the same interval $0.4142\ldots < q_4 < 0.481\ldots$. Outside $[0.4142\ldots, 0.481\ldots]$, there are no extremal values. In Fig. 4(b), the plots validate the behavior of $\mathrm{Var}(g)_p$ as a function of $q_6$, for different values of $q_4$. We can observe that, for values of $q_4$ belonging to this interval, extrema exist at the $q_6$ values given by the first and the second branch of solutions, respectively. In Fig. 5 we choose the specific value $q_4 = 0.47$ to show that a maximum corresponds to the first branch and a minimum to the second one. We have presented a detailed study of the fluctuations of the players' gain, considering acceptances that depend on the offer, with and without cutoff effects.

Fig. 7. Validation of our analytical results with Monte Carlo numerical simulations. The color diagrams show the variance of the cumulative payoff of the responder as a function of n (iterations) and p (accepting probability). The left side presents the results obtained by Monte Carlo simulations, while the right side presents the analytical results (Eq. (21)). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 8. Monte Carlo results for the accepting function with cutoff. The color diagrams show the average (left side) and variance (right side) values of the cumulative payoff of the responder as a function of n (iterations) and $q_6$ (cutoff) via MC simulations. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
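The marginal optimization of the static game with cutoff can be explored numerically from Eqs. (24)–(27), (28) and (30). The following Python sketch is a minimal illustration for uniform offers ($q_1 = q_2 = q_3 = 0$, $q_5 = 1$): the function names are hypothetical, and the finite-difference helper `slope` is an assumption added here only to confirm that the computed $q_6^*$ values are stationary points of the variances.

```python
import math


def mean_proposer(q4, q6):
    """Average proposer gain per round, Eq. (24)."""
    return q4 * q6 ** 2 / (2 * (2 + q4)) - q4 * q6 / (1 + q4) + 0.5


def mean_responder(q4, q6):
    """Average responder gain per round, Eq. (25)."""
    return 0.5 - q4 * q6 ** 2 / (2 * (q4 + 2))


def var_proposer(q4, q6):
    """Proposer variance from Eqs. (24) and (26)."""
    second = (q6 / (q4 + 1) - 2 * q6 ** 2 / (q4 + 2)
              + q6 ** 3 / (q4 + 3) + (1 - q6) ** 3 / 3)
    return second - mean_proposer(q4, q6) ** 2


def var_responder(q4, q6):
    """Responder variance from Eqs. (25) and (27)."""
    second = q6 ** 3 / (q4 + 3) + (1 - q6 ** 3) / 3
    return second - mean_responder(q4, q6) ** 2


def critical_q6_responder(q4):
    """Non-negative root of Eq. (28), i.e. the branch with +sqrt(Delta)."""
    delta = 16 + 104 * q4 + 108 * q4 ** 2 + 40 * q4 ** 3 + 5 * q4 ** 4
    return (-(2 + q4) ** 2 + math.sqrt(delta)) / (2 * q4 * (3 + q4))


def critical_q6_proposer(q4):
    """Both roots of Eq. (30) as (minus branch, plus branch); needs Delta_3 >= 0."""
    delta3 = 4 - q4 ** 4 - 6 * q4 ** 3 - 10 * q4 ** 2 - 2 * q4
    base = q4 ** 3 + 5 * q4 ** 2 + 5 * q4 - 2
    denom = q4 * (1 + q4) * (3 + q4)
    root = math.sqrt(delta3)
    return (base - root) / denom, (base + root) / denom


def slope(func, q4, q6, h=1e-5):
    """Central finite-difference estimate of d func / d q6."""
    return (func(q4, q6 + h) - func(q4, q6 - h)) / (2 * h)


if __name__ == "__main__":
    print("responder, q4 = 1:", critical_q6_responder(1.0))
    low, high = critical_q6_proposer(0.47)
    print("proposer, q4 = 0.47:", low, high)
    print("slopes:", slope(var_proposer, 0.47, low), slope(var_proposer, 0.47, high))
```

For $q_4 = 0.47$, the two proposer roots fall on either side of $q_6 \approx 0.68$, with the larger root giving the larger variance, in line with the maximum/minimum branches discussed for Fig. 5.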


4.2. Evolutionary strategies on the offers (game with unitary memory)

Now, let us study the case where the offers are incremented/decremented, i.e., the proposer increases/decreases the offer according to the responder's previous decision. Our results are presented in two steps. First, we calculate the responder's average gain, as well as its variance, when the acceptance is randomly chosen with a probability $p$ ($q_3 = q_4 = 0$, $q_6 = 1$ and $p = 1/q_5$). In Figs. 6 and 7, the $p \times n$ diagrams show, respectively, the average and the variance of the responder's cumulative gain for different values of the initial offer. The results from Monte Carlo simulations are presented on the left-hand side of both figures, and those of Eqs. (17) and (21) on the right-hand side. Our Monte Carlo simulations simply mimic the sequence of results between two players over a certain number $n_{run}$ of repetitions, in order to obtain the average and variance of the cumulative gain. Our analytical results are corroborated by the Monte Carlo simulations. The gain optimization is not a trivial issue. We can observe that the cumulative payoff (Fig. 6) and its variance (Fig. 7) can have equal values for different values of $n$ under different $p$ values. In such $n \times p$ color maps, we can observe patterns of iso-payoff and iso-variance.

Next, consider that the accepting decision depends on the offer value. This situation is studied only through Monte Carlo simulations. Let us consider a linear accepting function ($q_4 = 1$),

$$
f_r(y) =
\begin{cases}
\dfrac{y}{q_6} & \text{if } 0 < y \le q_6 \\[2mm]
1 & \text{if } q_6 < y \le 1
\end{cases} \tag{31}
$$

to study the effects of the cutoff, with $0 \le q_6 \le 1$. In this case, an interesting pattern of stripes of iso-payoff appears in the $n \times q_6$ diagrams (Fig. 8, left side), with a crossover at values of $q_6$ that depend on $y_0$. For the variance (Fig. 8, right side), we can observe that the regions of maximal dispersion are larger for small values of $y_0$ and concentrated at $q_6 > 0.5$. However, the magnitude of the maximal values of the variance is larger for greater values of $y_0$.
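The Monte Carlo procedure described in this subsection can be sketched as follows. This is a minimal Python illustration written for this text, not the authors' code: the names `simulate_responder`, `fixed_probability` and `linear_cutoff`, the seed, and the parameter values are assumptions. In particular, the sketch leaves the offer unconstrained (it is not clipped to $[0, 1]$) and treats non-positive offers as always rejected, choices the text does not specify.

```python
import random


def simulate_responder(n, y0, dy, accept_prob, n_run, seed=0):
    """Monte Carlo (mean, variance) of the responder's cumulative gain after n rounds.

    accept_prob maps the current offer y to an acceptance probability in [0, 1].
    """
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n_run):
        offer, gain = y0, 0.0
        for _ in range(n):
            if rng.random() < accept_prob(offer):
                gain += offer        # accepted: responder earns the offer
                offer -= dy          # and the next offer is decreased
            else:
                offer += dy          # rejected: the next offer is increased
        total += gain
        total_sq += gain * gain
    mean = total / n_run
    return mean, total_sq / n_run - mean * mean


def fixed_probability(p):
    """Acceptance with the same probability p for any offer."""
    return lambda y: p


def linear_cutoff(q6):
    """Linear accepting function with cutoff q6 > 0, as in Eq. (31)."""
    return lambda y: min(max(y, 0.0) / q6, 1.0)


if __name__ == "__main__":
    n, y0, dy = 50, 0.5, 6e-4
    print("fixed p = 0.6:", simulate_responder(n, y0, dy, fixed_probability(0.6), 5000))
    for q6 in (0.2, 0.6, 1.0):
        print("cutoff", q6, simulate_responder(n, y0, dy, linear_cutoff(q6), 5000))
```

With a fixed acceptance probability, the estimates can be compared with Eqs. (17) and (21). Scanning `n` and `q6` with `linear_cutoff` yields the kind of data needed for $n \times q_6$ diagrams like those of Fig. 8, under the assumptions stated above.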


5. Conclusions


Our paper presents results for the iterated and probabilistic ultimatum game based on generalized offering/accepting strategies. The proposer can offer random independent values (memoryless), or these values can depend on previous results (one-step memory). Also, the responder can accept the offer randomly, or the acceptance can depend on the value offered. For memoryless players, we calculate the average payoff and the average cumulative payoff, as well as their variances, using very general choices of offer/accepting functions. Our analytical calculations have been corroborated by numerical validations. The possible situations where the average and variance of the payoff of the proposer and the responder are optimized have been found. For one-step memory, the system evolution is described by analytical results in some cases, and Monte Carlo simulations have been performed for all of them. Interesting patterns of iso-variance and iso-payoff were observed. Finally, our study of the generalized ultimatum game under different proposal and accepting functions is a basis for future studies of the emergent behavior of many players, with or without topological structure, since such systems may still keep many of the optimization possibilities found in the iterated two-player game.

Acknowledgment


The authors thank CNPq (The Brazilian National Research Council) (305738/2010-0, 476722/2010-1) for its financial support.


References

[1] J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ, 1944.
[2] J.M. Smith, J. Theoret. Biol. 47 (1974) 209–221.
[3] G. Szabó, G. Fáth, Phys. Rep. 446 (2007) 97–216.
[4] M.A. Nowak, R.M. May, Evolutionary games and spatial chaos, Nature 359 (1992) 826–829.
[5] R.O.S. Soares, A.S. Martinez, The geometrical patterns of cooperation evolution in the spatial prisoner's dilemma: an intra-group model, Physica A 369 (2006) 823–829.
[6] M.A. Pereira, A.S. Martinez, A.L. Espíndola, Exhaustive exploration of prisoner's dilemma parameter space in one-dimensional cellular automata, Braz. J. Phys. 38 (2008) 65–69.
[7] M.A. Pereira, A.S. Martinez, A.L. Espíndola, Prisoner's dilemma in one-dimensional cellular automata: visualization of evolutionary patterns, Internat. J. Modern Phys. C 19 (2008) 187–201.
[8] M.A. Pereira, A.S. Martinez, A.L. Espíndola, Pavlovian prisoner's dilemma - analytical results, the quasi-regular phase and spatio-temporal patterns, J. Theoret. Biol. 265 (2010) 346–358.
[9] W. Güth, R. Schmittberger, B. Schwarze, J. Econ. Behav. Organ. 24 (1982) 153.
[10] A. Szolnoki, M. Perc, G. Szabó, Phys. Rev. Lett. 109 (2012) 078701-1.
[11] A.G. Sanfey, J.K. Rilling, J.A. Aronson, L.E. Nystrom, J.D. Cohen, Science 300 (2003) 1755.
[12] M.N. Kuperman, S. Risau-Gusman, Eur. Phys. J. B 62 (2008) 233.
[13] M.A. Nowak, K.M. Page, K. Sigmund, Science 289 (2000) 1773.
[14] R. da Silva, G.A. Kellermann, L.C. Lamb, J. Theoret. Biol. 258 (2009) 208–218.
[15] R. da Silva, G.A. Kellerman, Braz. J. Phys. 37 (2007) 1206–1211.
[16] K.M. Page, M.A. Nowak, K. Sigmund, Proc. R. Soc. B: Biol. Sci. 267 (2000) 2177–2182.
[17] J. Iranzo, J. Román, A. Sánchez, J. Theoret. Biol. 278 (2011) 1–10.