Cooperation in spatial evolutionary games with historical payoffs


Physics Letters A 380 (2016) 2819–2822


Xu-Wen Wang a, Sen Nie a, Luo-Luo Jiang b,*, Bing-Hong Wang b,c,d, Shi-Ming Chen a

a School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang, Jiangxi 330013, PR China
b College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou, Zhejiang 325035, PR China
c Department of Modern Physics, University of Science and Technology of China, Hefei, Anhui 230026, PR China
d School of Science, Southwest University of Science and Technology, Mianyang, Sichuan 621010, PR China

Article info

Article history: Received 14 March 2016; received in revised form 16 June 2016; accepted 17 June 2016; available online 21 June 2016. Communicated by C.R. Doering.

Keywords: Cooperation; Spatial prisoner's dilemma game; Memory length

Abstract

The most common rule of strategy adoption in evolutionary games relies on players' payoffs of the last round. A rational player, however, usually fixes its coming strategy by comprehensively considering a certain amount of payoff information within its memory length. Here, we explore several measures of historical payoffs for computing a weighted average payoff, with which each player sets its strategy by comparing its own weighted average payoff with that of a neighbour. We show that, when strategy and measure coevolve, cooperators can resist invasion by referring to the most payoff information, whereas the strategy adoption of defectors relies only on the nearest single round. In particular, our results suggest that excessive attention to past payoffs is not favourable for the spread of cooperative behaviour.

1. Introduction

Cooperative behaviour, which aims at maximizing collective fitness, is almost universal in natural and social systems [1–3]. It appears to contradict individual self-interest, because defection yields higher individual fitness. Game theory, which provides an efficient tool for understanding cooperative behaviour, has been extensively explored. Typical models, such as the prisoner's dilemma game (PDG) [4–9] and the snowdrift game (SG) [10–13], have been introduced to mimic the strategy selection of players in dilemma situations. When games among many individuals are considered, well-mixed populations hinder the spread of cooperation [14]. Over the past decades, spatial cooperation has therefore attracted considerable attention [15–21]. In spatial games, each player is located on a network and earns a total payoff from the interactions with its neighbours [22,23]; the strategy adoption process is then realized by comparing one's own payoff with the neighbours'. With the development of complex networks, earlier findings also show that scale-free networks can greatly improve the cooperation level [24–26]. Features such as social diversity [27–31], reputation [32–34], and reward and punishment [35–39] have also been incorporated into cooperation on spatial networks to mimic realistic game behaviour. In particular, recent works suggest that zero-determinant strategies, in which players cooperate with probabilities based on the payoffs of the last round, can dominate any opponent [40–42].

* Corresponding author. E-mail address: [email protected] (L.-L. Jiang).

http://dx.doi.org/10.1016/j.physleta.2016.06.026 0375-9601/© 2016 Elsevier B.V. All rights reserved.

Most spatial game models have focused on the memory-one process, in which a player updates its strategy according to the payoff of the last round only. A rational player, however, usually fixes its actions for the next rounds by comprehensively considering the payoff information within a period [43–45]. Evidently, the amount of historical payoff information used in adjusting strategies, and the way it is weighted, differ across players and circumstances. For instance, in stock trading one may regard the most recent data as more important for selecting the coming strategy, because the trading period is short. In other cases, e.g. when an enterprise plans its production for the coming years, a longer series of profits from different years would be regarded as equally important. Based on these considerations, we explore cooperation in spatial prisoner's dilemma games with different measures of historical payoffs. That is, a player assigns specific weights to the historical payoffs of the nearest M_i rounds to measure their importance, and then adopts its strategy by comparing its own weighted average payoff with that of a neighbour. The simulation results suggest that the coevolution of measure and strategy leads defectors to adopt strategies relying only on the nearest single round, while cooperators always use the maximal amount of payoff information to survive. We show that treating every historical payoff as equally important in strategy selection promotes cooperation most strongly, and that the system can exhibit an abrupt transition from the complete-defection phase to the mixed phase. In particular, above a critical memory length, the cooperation level increases monotonically with the memory length. Our results indicate that excessive attention to past payoffs is not favourable for the spread of cooperation. This paper is organized as follows: Section 2 presents the model, Section 3 demonstrates the numerical results, and the discussion is presented in Section 4.


2. Prisoner's dilemma game with weighted average payoff

For the spatial PDG, each player occupies a site of a square lattice with periodic boundary conditions. Initially, each player is designated either as a cooperator (s_x = C) or a defector (s_x = D) with equal probability. The payoff of each player in a round is the total payoff acquired from games with its four nearest neighbours. Following common practice, the reward for mutual cooperation is R = 1, both the punishment for mutual defection P and the sucker's payoff S are set to P = S = 0, and the temptation to defect T = b is the only free payoff parameter of the PDG. We introduce the memory length M to characterize the maximal amount of payoff information, of its neighbours and itself, that a player can remember. Player x then adjusts its next strategy by referring to the payoff information of the most recent M_x rounds; initially, M_x is randomly selected from the interval [1, M]. Unlike the memory-one model, player x updates its strategy by comparing the weighted average payoff P_w(s_y) of a randomly selected neighbour y with its own weighted average payoff P_w(s_x), which is defined as

P_w(s_x) = \frac{\sum_{k=1}^{M_x} k^{w} P(s_x, k)}{\sum_{k=1}^{M_x} k^{w}} ,    (1)

where P(s_x, k) is the payoff of player x in the k-th most recent round (0 < k ≤ M_x). The memory length M_x quantifies the amount of payoff information that a player uses to determine its strategy for the next round, and the weight exponent w regulates the importance of the historical payoff P(s_x, k): w = −1, 0, 1 means that an older payoff (larger k) contributes a lesser, equal or greater share of the weighted average payoff P_w(s_x), respectively. For example, consider a pair of individuals i and j whose payoff sequences are P_1, P_2 (with P_1 < P_2) for i and P_2, P_1 for j. Then w = −1, 0, 1 renders the weighted average payoff P_w(i) larger than, equal to and smaller than P_w(j), respectively. The case w = 1 is similar to the memory model of Ref. [45].
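As a worked instance of this example (ours; the paper states only the qualitative ordering), read each sequence in chronological order so that the last entry is the most recent round k = 1, and take P_1 = 1, P_2 = 3 with M_i = M_j = 2:

w = −1:  P_w(i) = (1^{-1}·3 + 2^{-1}·1)/(1^{-1} + 2^{-1}) = 3.5/1.5 ≈ 2.33  >  P_w(j) = (1^{-1}·1 + 2^{-1}·3)/(1^{-1} + 2^{-1}) = 2.5/1.5 ≈ 1.67,
w = 0:   P_w(i) = P_w(j) = (3 + 1)/2 = 2,
w = 1:   P_w(i) = (1·3 + 2·1)/(1 + 2) = 5/3 ≈ 1.67  <  P_w(j) = (1·1 + 2·3)/(1 + 2) = 7/3 ≈ 2.33,

in agreement with the ordering stated above.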

Player x adopts the strategy s_y of the randomly selected neighbour y with the probability [46]

W(s_x ← s_y) = \frac{1}{1 + \exp[(P_w(s_x) − P_w(s_y))/\kappa]} ,    (2)

where κ quantifies the uncertainty of strategy adoption. To mimic reality, the memory length M_y of the neighbour y is adopted by player x simultaneously whenever the strategy s_y is adopted. We set the network size N = 10000 and κ = 0.1 in all simulations, and all players update their strategies synchronously.
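To make the update rule concrete, the following minimal NumPy sketch implements one possible reading of Eqs. (1) and (2) on a 100 × 100 lattice (N = 10000). It is ours rather than the authors' code; all names (round_payoffs, weighted_average, mem_len, ...) are illustrative, and details the paper leaves open, such as how the first rounds are treated while fewer than M_x payoffs have accumulated, are handled here by simple truncation.

import numpy as np

# Sketch (ours) of the model of Eqs. (1)-(2): spatial PDG with periodic
# boundaries, k^w-weighted average payoffs over each player's memory length
# M_x, and synchronous Fermi updating in which the neighbour's memory length
# M_y is copied together with its strategy.
L_SIZE, M_MAX, b, w, kappa = 100, 8, 1.15, -1.0, 0.1
rng = np.random.default_rng(0)

strategy = rng.integers(0, 2, size=(L_SIZE, L_SIZE))         # 1 = C, 0 = D
mem_len = rng.integers(1, M_MAX + 1, size=(L_SIZE, L_SIZE))  # M_x drawn from [1, M]
history = []                                                 # per-round payoff fields, most recent first
SHIFTS = [(1, 0), (-1, 0), (0, 1), (0, -1)]                  # four nearest neighbours

def round_payoffs(s):
    """Total payoff of every site against its four neighbours (R = 1, S = P = 0, T = b)."""
    pay = np.zeros(s.shape)
    for sh in SHIFTS:
        nbr = np.roll(s, sh, axis=(0, 1))
        pay += np.where((s == 1) & (nbr == 1), 1.0, 0.0)      # reward R
        pay += np.where((s == 0) & (nbr == 1), b, 0.0)        # temptation T = b
    return pay

def weighted_average(hist, m):
    """Eq. (1): k^w-weighted average over the m most recent rounds (hist[0] is k = 1).
    While fewer than m rounds exist, only the available ones are used (our convention)."""
    num = np.zeros(hist[0].shape)
    den = np.zeros(hist[0].shape)
    for k, pk in enumerate(hist, start=1):
        use = (k <= m)                                        # player-wise memory cut-off
        num += use * (k ** w) * pk
        den += use * (k ** w)
    return num / den

for step in range(200):                                       # Monte Carlo steps
    history.insert(0, round_payoffs(strategy))
    history = history[:M_MAX]
    P_w = weighted_average(history, mem_len)                  # every player's weighted payoff

    # each player imitates one randomly chosen neighbour with the Fermi probability of Eq. (2)
    pick = rng.integers(0, 4, size=(L_SIZE, L_SIZE))
    nbr_strat = np.zeros_like(strategy)
    nbr_mem = np.zeros_like(mem_len)
    nbr_P = np.zeros_like(P_w)
    for idx, sh in enumerate(SHIFTS):
        sel = (pick == idx)
        nbr_strat[sel] = np.roll(strategy, sh, axis=(0, 1))[sel]
        nbr_mem[sel] = np.roll(mem_len, sh, axis=(0, 1))[sel]
        nbr_P[sel] = np.roll(P_w, sh, axis=(0, 1))[sel]
    adopt = rng.random((L_SIZE, L_SIZE)) < 1.0 / (1.0 + np.exp((P_w - nbr_P) / kappa))
    strategy = np.where(adopt, nbr_strat, strategy)           # copy the neighbour's strategy ...
    mem_len = np.where(adopt, nbr_mem, mem_len)               # ... and its memory length M_y

print("cooperator fraction f_C =", strategy.mean())

In this sketch the neighbour's weighted payoff is simply the same per-site field shifted on the lattice, so each player is compared against a neighbour whose average is computed with that neighbour's own memory length M_y.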

Fig. 1. (Color online.) Evolution of the density of players for different weight values at b = 1.15 and M = 8. (a) Density of cooperators and defectors with w = −1. (b) Density of cooperators and defectors with w = 1. (c) Density of Mi (1 ≤ Mi ≤ 8) with w = −1. (d) Density of Mi (1 ≤ Mi ≤ 8) with w = 1. Each data point is an average of 100 independent realizations.

3. Numerical results

Coevolution of strategy and measure reveals competition among individuals carrying different M_i and different strategies. We therefore follow the time evolution of the fractions of cooperators, defectors and of each M_i for different weight values. Fig. 1(a) shows that cooperators and defectors can coexist on the lattice for the intermediate memory length M = 8 and w = −1. In contrast, Fig. 1(b) demonstrates that cooperators cannot resist the invasion of defectors for w = 1. The time evolution of cooperation before MCS = 100 for w = 1 is quite similar to that for w = −1; yet the payoff differences between defectors and cooperators decline with time, because an increasing share of older payoffs is frozen into the weighted average. Thus, assigning large weights to older payoffs works against the maintenance of cooperation. We then examine the competition of players with different M_i (1 ≤ M_i ≤ 8) for w = −1 and w = 1, as shown in Fig. 1(c) and (d), respectively. The densities of players with M_i = 8 and M_i = 1 are exactly equal to the fractions of cooperators f_C and defectors f_D in panels (a) and (b), respectively. In this sense, defectors with M_i = 1 and cooperators with M_i = 8 are the most competitive players in the system; a similar result is observed in Refs. [5,47]. Referring to more payoff information in strategy selection reduces the difference between a player's weighted average payoff and that of its neighbour, which makes cooperators using the most payoff information best able to resist the invasion of defectors.

The parameters w and M adjust the role of historical payoffs in the weighted average payoff P_w of the current round and thereby regulate cooperative behaviour. We therefore explore the cooperation level f_C as a function of the memory length M for different temptations to defect b and weight factors w, as shown in Fig. 2. From Fig. 2(a), for w = −1, we find that the cooperation level with historical payoffs is enhanced compared with the original PDG. For the original PDG, which corresponds to M = 1, cooperators become extinct for b > 1.02, whereas f_C undergoes a transition as M increases even for notable b. Importantly, the critical memory length M_c increases monotonically with the temptation to defect b and the weight factor w; that is, maintaining cooperative behaviour at large b requires much more payoff information. Fig. 2(b) shows f_C as a function of the memory length for w = 0. Compared with w = −1, the system reaches the pure cooperation phase C for large enough M. The cooperation levels for the three values of b in Fig. 2(c) are always lower than those in Fig. 2(a) and (b), demonstrating that over-emphasis on past payoffs prevents the spread of cooperation. Overall, Fig. 2 indicates that assigning identical weights to historical payoffs promotes cooperation most strongly.

Fig. 3 presents the underlying cause of the improvement of cooperation by examining the average payoffs of cooperators and of randomly selected defective neighbours. The difference between the average payoffs of cooperators and defectors shows a transition with memory length M: cooperators gain higher payoffs than their defective neighbours only when M exceeds the critical length M_c. The average payoff for b = 1.25 in Fig. 3(b) varies non-monotonically, which causes the decline of the cooperation level in Fig. 2(a) for 4 < M < 8.


Fig. 2. (Color online.) Cooperation level f C as a function of temptation to defect b with different memory length M. (a) w = −1. (b) w = 0 and (c) w = 1. Each data point is an average of 100 independent realizations.

Fig. 3. (Color online.) Average payoff of cooperators and randomly selected defective neighbours as a function of memory length M with w = −1. (a) b = 1.05. (b) b = 1.25. Each data point is an average of 100 independent realizations.

In addition, the critical memory lengths M_c in Fig. 3 for b = 1.05 and b = 1.25 are exactly the same as those in Fig. 2(a).

As is well known, cooperators soon become extinct with increasing temptation to defect b in the original PDG on the lattice. To view the effect of memory-based weighted average payoffs on cooperative behaviour more intuitively, we explore the cooperation level f_C as a function of b for different memory lengths M, as shown in Fig. 4. We find that incorporating memory-based historical payoffs spreads cooperation, and cooperators can survive even for large b (e.g. b = 1.4). For all three memory lengths M, the cooperation level f_C for larger M is always higher than that for smaller M. Meanwhile, we find that the case w = −1 has the largest critical value b_c, irrespective of M.

This suggests that cooperators survive more easily if players adopt strategies mainly according to the most recent payoff information. In agreement with the results in Fig. 2, w = 0 and w = 1 produce the highest and lowest cooperation levels, respectively.

Fig. 5 presents the density of cooperators in the b–M phase space. In all three cases there exists a pure cooperation phase C at small b, and the boundary between the mixed phase C + D and the pure defection phase D depends strongly on the memory length M. The memory length M determines the period of payoff information that players use when adopting the strategy of the next round. Increasing M renders the gap between the payoff P_w(s_x) of player x and the payoff P_w(s_y) of its randomly selected neighbour small, which is equivalent to a decline of the temptation to defect b. In this sense, cooperators survive more easily in the system as the memory length M increases.
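A quick numerical check of this argument (ours, using the reconstructed Eq. (2) with the simulation value κ = 0.1): as the gap between the two weighted average payoffs shrinks, the adoption probability tends to 1/2, i.e. imitation of the more successful defector is no longer strongly favoured, which acts like a reduced temptation b.

import math

kappa = 0.1
for gap in (2.0, 1.0, 0.5, 0.1, 0.0):            # gap = P_w(s_x) - P_w(s_y)
    W = 1.0 / (1.0 + math.exp(gap / kappa))      # Fermi probability of Eq. (2)
    print(f"gap = {gap:3.1f}  ->  W(s_x <- s_y) = {W:.3f}")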

Fig. 4. (Color online.) Cooperation level f C as a function of memory length M with different temptation to defect b. (a) w = −1. (b) w = 0 and (c) w = 1. Each data point is an average of 100 independent realizations.

Fig. 5. (Color online.) Cooperation level f C as functions of memory length M and temptation to defect b. (a) w = −1. (b) w = 0 and (c) w = 1. Each data point is an average of 100 independent realizations.


4. Discussion

Spatial prisoner's dilemma games have been widely explored in past decades. The most commonly used mechanism for strategy adoption is the Fermi rule [46]: a player compares its payoff of the last round with that of a randomly selected neighbour, which is the typical memory-one process. In general, however, a rational player may determine its strategy for the next round by comprehensively considering the payoffs earned over a period, and the importance of a historical payoff far from the current round can differ dramatically between two players. Hence, in this paper we introduce the memory-based weighted average payoff into the spatial PDG and investigate how it affects spatial cooperative behaviour. We assign three different weight values to the historical payoffs to characterize their importance. We find that the introduction of the memory-based weighted average payoff can greatly promote cooperation: cooperators can survive on the lattice as the memory length M increases, even for considerable temptation to defect. By comparing the cooperation levels of the three weight values, we find that assigning equal weight to the payoffs of different rounds within the memory length induces the highest cooperation levels. In all of the above simulations, the strategy and the memory length M_x are adopted by a player simultaneously. If a player does not adopt the strategy and M_x simultaneously, but instead adopts each independently with the same probability, we find that assigning equal weight to each historical payoff is still the most favourable for spreading cooperation, whereas assigning larger weights to older payoffs yields the lowest cooperation levels.

Acknowledgements

This work is funded by the National Natural Science Foundation of China (Grant Nos. 61203145, 11275186, 91024026) and FOM2014OF001.

References

[1] J.M. Smith, Evolution and the Theory of Games, Cambridge University Press, 1982.
[2] A.M. Colman, Game Theory and Its Applications: In the Social and Biological Sciences, Psychology Press, 2013.
[3] J. Hofbauer, K. Sigmund, Evolutionary Games and Population Dynamics, Cambridge University Press, 1998.
[4] G. Szabó, G. Fath, Evolutionary games on graphs, Phys. Rep. 446 (4) (2007) 97–216.
[5] A. Szolnoki, J. Vukov, G. Szabó, Selection of noise level in strategy adoption for spatial social dilemmas, Phys. Rev. E 80 (5) (2009) 056112.
[6] M. Perc, Z. Wang, Heterogeneous aspirations promote cooperation in the prisoner's dilemma game, PLoS ONE 5 (12) (2010) e15117.
[7] F. Fu, M.A. Nowak, C. Hauert, Invasion and expansion of cooperators in lattice populations: prisoner's dilemma vs. snowdrift games, J. Theor. Biol. 266 (3) (2010) 358–366.
[8] K. Shigaki, J. Tanimoto, Z. Wang, S. Kokubo, A. Hagishima, N. Ikegaya, Referring to the social performance promotes cooperation in spatial prisoner's dilemma games, Phys. Rev. E 86 (3) (2012) 031141.
[9] H.-X. Yang, Z. Rong, W.-X. Wang, Cooperation percolation in spatial prisoner's dilemma game, New J. Phys. 16 (1) (2014) 013010.
[10] C. Hauert, M. Doebeli, Spatial structure often inhibits the evolution of cooperation in the snowdrift game, Nature 428 (6983) (2004) 643–646.
[11] W.-X. Wang, J. Ren, G. Chen, B.-H. Wang, Memory-based snowdrift game on networks, Phys. Rev. E 74 (5) (2006) 056113.
[12] W.-B. Du, X.-B. Cao, M.-B. Hu, W.-X. Wang, Asymmetric cost in snowdrift game on scale-free networks, Europhys. Lett. 87 (6) (2009) 60004.
[13] Z. Wang, S. Kokubo, M. Jusup, J. Tanimoto, Universal scaling for the dilemma strength in evolutionary games, Phys. Life Rev. 14 (2015) 1–30.
[14] F.C. Santos, J.M. Pacheco, T. Lenaerts, Evolutionary dynamics of social dilemmas in structured heterogeneous populations, Proc. Natl. Acad. Sci. USA 103 (9) (2006) 3490–3494.
[15] M. Perc, J. Gómez-Gardeñes, A. Szolnoki, L.M. Floría, Y. Moreno, Evolutionary dynamics of group interactions on structured populations: a review, J. R. Soc. Interface 10 (80) (2013) 20120997.
[16] J. Peña, H. Volken, E. Pestelacci, M. Tomassini, Conformity hinders the evolution of cooperation on scale-free networks, Phys. Rev. E 80 (1) (2009) 016110.
[17] A. Szolnoki, M. Perc, Group-size effects on the evolution of cooperation in the spatial public goods game, Phys. Rev. E 84 (4) (2011) 047102.

[18] L.-L. Jiang, M. Perc, W.-X. Wang, Y.-C. Lai, B.-H. Wang, Impact of link deletions on public cooperation in scale-free networks, Europhys. Lett. 93 (4) (2011) 40001.
[19] Z. Wang, A. Szolnoki, M. Perc, Interdependent network reciprocity in evolutionary games, Sci. Rep. 3 (2013) 1183.
[20] X. Chen, L. Wang, Promotion of cooperation induced by appropriate payoff aspirations in a small-world networked game, Phys. Rev. E 77 (1) (2008) 017103.
[21] Z. Yang, Z. Li, T. Wu, L. Wang, Effects of partner choice and role assignation in the spatial ultimatum game, Europhys. Lett. 109 (4) (2015) 40013.
[22] Z. Wang, L. Wang, A. Szolnoki, M. Perc, Evolutionary games on multilayer networks: a colloquium, Eur. Phys. J. B 88 (5) (2015) 1–15.
[23] C. Xia, Q. Miao, J. Wang, S. Ding, Evolution of cooperation in the traveler's dilemma game on two coupled lattices, Appl. Math. Comput. 246 (2014) 389–398.
[24] F.C. Santos, J.M. Pacheco, Scale-free networks provide a unifying framework for the emergence of cooperation, Phys. Rev. Lett. 95 (9) (2005) 098104.
[25] M. Perc, Evolution of cooperation on scale-free networks subject to error and attack, New J. Phys. 11 (3) (2009) 033027.
[26] Z. Wang, M.A. Andrews, Z.-X. Wu, L. Wang, C.T. Bauch, Coupled disease–behavior dynamics on complex networks: a review, Phys. Life Rev. 15 (2015) 1–29.
[27] M. Perc, A. Szolnoki, Social diversity and promotion of cooperation in the spatial prisoner's dilemma game, Phys. Rev. E 77 (1) (2008) 011904.
[28] A. Szolnoki, M. Perc, G. Szabó, Diversity of reproduction rate supports cooperation in the prisoner's dilemma game on complex networks, Eur. Phys. J. B 61 (4) (2008) 505–509.
[29] Z.-X. Wu, Z. Rong, P. Holme, Diversity of reproduction time scale promotes cooperation in spatial prisoner's dilemma games, Phys. Rev. E 80 (3) (2009) 036106.
[30] H.-X. Yang, W.-X. Wang, Z.-X. Wu, Y.-C. Lai, B.-H. Wang, Diversity-optimized cooperation on complex networks, Phys. Rev. E 79 (5) (2009) 056107.
[31] C.-Y. Xia, S. Meloni, M. Perc, Y. Moreno, Dynamic instability of cooperation due to diverse activity patterns in evolutionary social dilemmas, Europhys. Lett. 109 (5) (2015) 58002.
[32] F. Fu, C. Hauert, M.A. Nowak, L. Wang, Reputation-based partner choice promotes cooperation in social networks, Phys. Rev. E 78 (2) (2008) 026117.
[33] X. Chen, A. Schick, M. Doebeli, A. Blachford, L. Wang, Reputation-based conditional interaction supports cooperation in well-mixed prisoner's dilemmas, PLoS ONE 7 (5) (2012) e36260.
[34] Z. Wang, L. Wang, Z.-Y. Yin, C.-Y. Xia, Inferring reputation promotes the evolution of cooperation in spatial social dilemma games, PLoS ONE 7 (7) (2012) e40218.
[35] A. Szolnoki, M. Perc, Reward and cooperation in the spatial public goods game, Europhys. Lett. 92 (3) (2010) 38003.
[36] Z. Wang, A. Szolnoki, M. Perc, Rewarding evolutionary fitness with links between populations promotes cooperation, J. Theor. Biol. 349 (2014) 50–56.
[37] H. Brandt, C. Hauert, K. Sigmund, Punishment and reputation in spatial public goods games, Proc. R. Soc. B 270 (1519) (2003) 1099–1104.
[38] M. Perc, A. Szolnoki, Self-organization of punishment in structured populations, New J. Phys. 14 (4) (2012) 043013.
[39] Z. Wang, C.-Y. Xia, S. Meloni, C.-S. Zhou, Y. Moreno, Impact of social punishment on cooperative behavior in complex networks, Sci. Rep. 3 (2013) 3055.
[40] W.H. Press, F.J. Dyson, Iterated prisoner's dilemma contains strategies that dominate any evolutionary opponent, Proc. Natl. Acad. Sci. USA 109 (26) (2012) 10409–10413.
[41] A. Szolnoki, M. Perc, Evolution of extortion in structured populations, Phys. Rev. E 89 (2) (2014) 022804.
[42] A. Szolnoki, M. Perc, Defection and extortion as unexpected catalysts of unconditional cooperation in structured populations, Sci. Rep. 4 (2014) 5496.
[43] R. Alonso-Sanz, Memory boosts cooperation in the structurally dynamic prisoner's dilemma, Int. J. Bifurc. Chaos 19 (09) (2009) 2899–2926.
[44] Y. Liu, Z. Li, X. Chen, L. Wang, Memory-based prisoner's dilemma on square lattices, Physica A 389 (12) (2010) 2390–2396.
[45] S.-M. Qin, Y. Chen, X.-Y. Zhao, J. Shi, Effect of memory on the prisoner's dilemma game in a square lattice, Phys. Rev. E 78 (4) (2008) 041129.
[46] G. Szabó, C. Tőke, Evolutionary prisoner's dilemma game on a square lattice, Phys. Rev. E 58 (1) (1998) 69.
[47] A. Szolnoki, M. Perc, G. Szabó, Accuracy in strategy imitations promotes the evolution of fairness in the spatial ultimatum game, Europhys. Lett. 100 (2) (2012) 28005.