A synergy of punishment and extortion in cooperation dilemmas driven by the leader


Chaos, Solitons and Fractals 119 (2019) 263–268


JunFang Wang a,b, JinLi Guo a,∗

a Business School, University of Shanghai Science & Technology, Shanghai 200093, China
b School of Mathematics & Statistics, North China University of Water Resources & Electric Power, Zhengzhou 450046, China

Article info

Article history: Received 1 August 2018 Revised 31 October 2018

Keywords: Punishment; Extortion; Large degree node; Cooperation; Combined strategy

Abstract

Punishment and extortion have been acknowledged to play key roles in sustaining and catalysing cooperation, respectively, yet each of them used alone leaves a rather gloomy evolutionary outlook when the temptation to betray is high. This paper proposes a new strategy, used by one leader, that combines punishment and extortion. The results show that a node with a large degree is more capable of influencing others. Using the combined strategy, a single large-degree node can drive the whole population to mutual cooperation with a probability close to or equal to 1. Moreover, he/she obtains the highest score. We also demonstrate that the combined strategy is superior to some classic winning strategies such as WSLS. The findings show that the synergy of punishment and extortion is effective in promoting cooperation. An immediate implication is that the combined strategy unites the merits of both components, allowing the leader to choose the right one at the right time to fight defectors. The results are robust to the betrayal temptation and to competing strategies. In addition, the strategy offers strong flexibility for its user. © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Cooperative behaviour is crucial and ubiquitous, not only in human society but also in nature. Yet Nash pointed out that individuals' greedy behaviour results in a Nash equilibrium, which occurs when individual rationality conflicts with group rationality in a one-shot game [1,2], and this cannot explain such cooperative phenomena. To understand how cooperation and competition might arise among agents with selfish objectives, Smith and Price proposed evolutionary game theory [3]. Researchers have concentrated on studying evolutionarily stable strategies and the dynamics of evolutionary equilibria [4,5]. To promote the level of cooperation in population games, scholars have, on the one hand, focused on the relationship between the level of cooperation and the structure of the network, finding that network heterogeneity [6–11] and network alteration [12] promote the emergence of cooperation. On the other hand, some classic winning strategies have also been put forward, including WSLS (win-stay, lose-shift), TFT (tit-for-tat), GTFT (generous tit-for-tat), P (punishment), and the ZD (zero-determinant) strategies. Most studies have focused on these strategies' stability and on promoting cooperative behaviour in iterated games [4,13–20].



Corresponding author. E-mail address: [email protected] (J. Guo).

https://doi.org/10.1016/j.chaos.2019.01.004 0960-0779/© 2019 Elsevier Ltd. All rights reserved.

Among such strategies, it is worth noting that Ex (the extortionate strategy), a sub-class of ZD strategies, has been verified to act as a catalyst for cooperation [21,22], but it is not the stable outcome of natural selection in large populations [23]. Besides ZD strategies, punishment is also effective in sustaining cooperation. Nevertheless, using punishment to explain cooperation only leads to further questions. For instance, why spend precious resources to punish free-riders, especially if others can avoid this investment and cheaters can penalise you back? Some social dilemma experiments even suggest that their punishment mechanism is ineffective in promoting cooperation [24]. Various punishment mechanisms have therefore been proposed to address this problem, including a player's option to change opponents [25], the synergy of punishment and commitment [26], heterogeneous power of players [27], the possibility of adding voluntary cooperators [28], and punishers' synergistic effects [29]. More on punishment mechanisms and human cooperation can be found in [30]. Many researchers have demonstrated that punishment can sustain cooperation. However, when facing a defector, a punisher's power is inferior to an extortioner's. At the same time, the extortion strategy is only a catalyst for cooperation and cannot sustain it. Considering the above, we introduce a new strategy that combines punishment and extortion. Moreover, in a scale-free network there are always some nodes whose behaviour strongly affects others. In this paper, we first briefly introduce the game model and the strategies. Next, we compare some important indicators to


identify leader nodes based on the influence of their punishment. Then we discuss the leader's influence on the population when he/she uses the combined strategy. Finally, we demonstrate the robustness of the results.


2. The model and strategies

The donation game is an extensively studied form of the prisoner's dilemma. A cooperator pays a cost c and receives a reward b or 0 depending on whether his/her partner cooperates, while a betraying opponent obtains the benefit b without paying anything. If both defect, both get the meagre score 0. As a cooperator and supervisor, a punisher additionally pays a cost α to make a betraying opponent lose β, with β > α. For the Ex strategy in the donation game, suppose that player X cooperates with probability p = (p1, p2, p3, p4) conditional on the previous outcome (CC, CD, DC, DD). His/her opponent Y's strategy is, analogously, q = (q1, q2, q3, q4). Their payoff vectors are SX and SY. The Markov transition matrix is as follows:



$$Q = \begin{bmatrix} p_1 q_1 & p_1(1-q_1) & (1-p_1)q_1 & (1-p_1)(1-q_1) \\ p_2 q_3 & p_2(1-q_3) & (1-p_2)q_3 & (1-p_2)(1-q_3) \\ p_3 q_2 & p_3(1-q_2) & (1-p_3)q_2 & (1-p_3)(1-q_2) \\ p_4 q_4 & p_4(1-q_4) & (1-p_4)q_4 & (1-p_4)(1-q_4) \end{bmatrix}. \quad (1)$$

The stationary vector $v$ satisfies

$$v^{T} Q = v^{T}. \quad (2)$$

The matrix Q has a unit eigenvalue, so the matrix M = Q − I is singular and its adjugate matrix $M^{*}$ satisfies

$$M^{*} M = 0. \quad (3)$$

Eq. (3) implies that every row of $M^{*}$ is proportional to $v^{T}$. Then X's score is

$$s_X = v \cdot S_X = \frac{D(p, q, S_X)}{D(p, q, \mathbf{1})}, \quad (4)$$

where

$$D(p, q, f) = \begin{vmatrix} p_1 q_1 - 1 & p_1 - 1 & q_1 - 1 & f_1 \\ p_2 q_3 & p_2 - 1 & q_3 & f_2 \\ p_3 q_2 & p_3 & q_2 - 1 & f_3 \\ p_4 q_4 & p_4 & q_4 & f_4 \end{vmatrix}. \quad (5)$$

A linear combination of the two players' scores then satisfies

$$s_X - \chi s_Y = \frac{D(p, q, S_X - \chi S_Y)}{D(p, q, \mathbf{1})}. \quad (6)$$

The second column of the numerator determinant in Eq. (6) depends only on the strategy of X. Thus, if X chooses a strategy satisfying $\tilde{p} = (p_1 - 1, p_2 - 1, p_3, p_4) = \phi(S_X - \chi S_Y)$, then X's score is χ (χ > 1) times the opponent's score regardless of the opponent's strategy; in what follows we denote this extortion factor by x. More detailed results can be found in [21]. In particular, if an extortioner X plays against a cooperative punisher Y, X is punished after betraying. Their payoff vectors are then $S_X = (b-c, -c, b-\beta, 0)$ and $S_Y = (b-c, b, -c-\alpha, 0)$, and the parameter φ must satisfy $0 < \phi \leq (c + xb)^{-1}$. Since both scores are independent of φ, we let $\phi = (c + xb)^{-1}$. Then

$$p_1 = \frac{b + xc}{c + bx}, \quad p_2 = 0, \quad p_3 = \frac{b - \beta + x(c + \alpha)}{c + bx}, \quad p_4 = 0.$$

So the payoff of an extortioner against a punisher can be calculated through Eq. (6); it is $\frac{(b-c)\,x\,(c+b+\alpha-\beta)}{(\alpha+b)x - (\beta-c)}$.
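To make the determinant formulation concrete, the following minimal sketch (in Python, our choice) evaluates Eqs. (4) and (5) for an extortioner facing a cooperative punisher with the parameter values used later in the paper (b = 2, c = 1, α = 0.5, β = 2, x = 3); the helper name press_dyson_score is ours.

```python
import numpy as np

def press_dyson_score(p, q, f):
    """Long-run score via the Press-Dyson determinant formula, Eqs. (4)-(5)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    D = np.array([[p1*q1 - 1.0, p1 - 1.0, q1 - 1.0, f[0]],
                  [p2*q3,       p2 - 1.0, q3,       f[1]],
                  [p3*q2,       p3,       q2 - 1.0, f[2]],
                  [p4*q4,       p4,       q4,       f[3]]])
    D1 = D.copy()
    D1[:, 3] = 1.0                       # same determinant with f replaced by 1
    return np.linalg.det(D) / np.linalg.det(D1)

b, c, alpha, beta, x = 2.0, 1.0, 0.5, 2.0, 3.0

# extortion strategy against a punisher, with phi = 1/(c + x*b)
p_ex = ((b + x*c)/(c + b*x), 0.0, (b - beta + x*(c + alpha))/(c + b*x), 0.0)
q_pun = (1.0, 1.0, 1.0, 1.0)             # a punisher always cooperates
S_X = (b - c, -c, b - beta, 0.0)         # extortioner's stage payoffs (CC, CD, DC, DD)
S_Y = (b - c, b, -c - alpha, 0.0)        # punisher's stage payoffs

s_X = press_dyson_score(p_ex, q_pun, S_X)
s_Y = press_dyson_score(p_ex, q_pun, S_Y)
closed_form = (b - c)*x*(c + b + alpha - beta) / ((alpha + b)*x - (beta - c))
print(s_X, s_Y, s_X / s_Y, closed_form)   # ~0.692, ~0.231, ratio x = 3, ~0.692
```

The ratio s_X/s_Y returned by the sketch equals the extortion factor x, as guaranteed by Eq. (6), and s_X matches the closed form above.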

The payoff of a player using strategy i against a player using strategy j is given by the (i, j)th element of the following matrix (see Table 1). Note that Ex always defects against All D and against itself.

Table 1. The payoffs of four strategies in the donation game; the entry in row i, column j is the payoff of a player using strategy i against a player using strategy j.

Strategy | vs All C | vs Ex | vs All D | vs P
All C | b − c | (b² − c²)/(bx + c) | −c | b − c
Ex | (b² − c²)x/(bx + c) | 0 | 0 | (b − c)x(c + b + α − β)/((α + b)x − (β − c))
All D | b | 0 | 0 | b − β
P | b − c | (b − c)(c + b + α − β)/((α + b)x − (β − c)) | −c − α | b − c
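As a cross-check of Table 1, one can build the transition matrix of Eq. (1) for All C against Ex, obtain the stationary vector of Eq. (2), and recover the entries (b² − c²)/(bx + c) and (b² − c²)x/(bx + c). A minimal sketch follows; the extortionate probabilities used for Ex against a plain cooperator (again with φ = (c + xb)⁻¹) are our reconstruction, not stated explicitly in the text.

```python
import numpy as np

def markov_matrix(p, q):
    """Transition matrix of Eq. (1); states ordered (CC, CD, DC, DD) from X's view."""
    pairs = [(p[0], q[0]), (p[1], q[2]), (p[2], q[1]), (p[3], q[3])]
    return np.array([[pi*qi, pi*(1 - qi), (1 - pi)*qi, (1 - pi)*(1 - qi)]
                     for pi, qi in pairs])

def stationary(Q):
    """Left eigenvector of Q for the unit eigenvalue (Eq. (2)), normalised to sum 1."""
    w, V = np.linalg.eig(Q.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

b, c, x = 2.0, 1.0, 3.0
p_allc = (1.0, 1.0, 1.0, 1.0)                                   # All C
q_ex = ((b + x*c)/(c + b*x), 0.0, (b + x*c)/(c + b*x), 0.0)     # assumed Ex vs a plain cooperator
v = stationary(markov_matrix(p_allc, q_ex))

S_allc = np.array([b - c, -c, b, 0.0])     # All C stage payoffs (CC, CD, DC, DD)
S_ex   = np.array([b - c, b, -c, 0.0])     # Ex stage payoffs over the same states
print(v @ S_allc, (b**2 - c**2)/(b*x + c))       # ~0.4286 in both cases
print(v @ S_ex, (b**2 - c**2)*x/(b*x + c))       # ~1.2857 in both cases
```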

Agents' relations in real life can be described by a complex network [31]: a heterogeneous population structure with hubs, in which players interact with their neighbours. Such structures are known to promote cooperation through cooperation clusters when accumulated payoffs are used [32], while the effect of heterogeneity essentially vanishes with average payoffs [33]. Once some extortioners take part in the games, however, cooperation is promoted whether payoffs are accumulated, via a bottom-up mechanism [34], or averaged, via a cooperation-extortion alliance [35]. Here, we adopt a weighted utility function defined as follows:

$$U_i(\tau) = \gamma P_i + (1 - \gamma) P_i / k_i, \quad (7)$$

where $k_i$ and $P_i$ denote node i's degree and its accumulated payoff from all of its neighbours at time τ, and the parameter γ weighs the accumulated payoff against the degree-averaged payoff. It is well known that heterogeneity in aspirations can promote cooperation [36]. To avoid interference from this factor, we suppose that the population is homogeneous in aspiration. Each agent is boundedly rational; he/she chooses a player j uniformly at random from his/her neighbours and then changes his/her strategy $s_i$ to $s_j$ at time τ with the probability

$$p(s_i \leftarrow s_j) = \frac{1}{1 + e^{(U_i(\tau) - U_j(\tau))/a}}, \quad (8)$$

where a embodies the degree of rationality, which plays an important role in the evolution of cooperation with extortion [37]. Since the scale-free network is highly heterogeneous, we want to know whether there are nodes whose strategies are crucial to the evolution; if so, we call them leaders. In the following discussion, we always assume b = 2 and c = 1 (the cooperation level is low under this condition, and it is compatible with the assumption b − c = 1 in previous research [34,35]) and a modest extortion factor x = 3. We also set α = 0.5, β = 2 and γ = 0.5 unless otherwise stated.

3. Identifying leader nodes in a scale-free network

It is well known that the degree, the H-index and the coreness play important roles in a scale-free network. A large-degree node can change more nodes' strategies because of its numerous neighbours. If we sort the degrees of all neighbours of vertex i so that $k_{y_1} \geq k_{y_2} \geq \cdots$, then another indicator, the H-index [38], is defined as follows:

$$H(i) = \max\{m : k_{y_m} \geq m\}. \quad (9)$$

Thus the H-index reflects both a node's own degree and the degrees of its neighbours. The coreness of a node is measured by k-core decomposition, and a larger coreness implies that the node is located more centrally in the network. These indicators have shown their importance in the spread of viruses, opinions or rumours and in evolutionary games [38–40]. Since the evolution of cooperative behaviour is related to the degree-mixing pattern both in scale-free networks [41,42] and in multilayer networks [43], to determine which of these indices identifies leaders more effectively, we construct scale-free networks with assortative (Fig. 1(a)) and disassortative (Fig. 1(b)) degree correlations and an ordinary scale-free network (Fig. 1(c)), following the construction by Rong et al. [42].
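For reference, all three indicators are straightforward to compute. The sketch below uses a Barabasi-Albert graph as a stand-in for the ordinary scale-free network (N = 5000, average degree close to 4), since the exact construction of [42] is not reproduced here; the helper name h_index is ours.

```python
import networkx as nx

def h_index(G, i):
    """H-index of node i (Eq. (9)): the largest m such that i has at least
    m neighbours whose degree is at least m."""
    degs = sorted((G.degree(j) for j in G.neighbors(i)), reverse=True)
    h = 0
    for m, k in enumerate(degs, start=1):
        if k >= m:
            h = m
        else:
            break
    return h

# stand-in network: Barabasi-Albert graph with N = 5000 and <k> close to 4
G = nx.barabasi_albert_graph(5000, 2, seed=1)

degree   = dict(G.degree())
coreness = nx.core_number(G)                  # coreness via k-core decomposition
hindex   = {i: h_index(G, i) for i in G.nodes}

# e.g. the 4% highest-degree nodes that would be seeded with P in the 'test' condition
top_degree = sorted(G.nodes, key=degree.get, reverse=True)[:int(0.04 * G.number_of_nodes())]
```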


Fig. 1. (a) Cooperation frequency (including P and All C) when the P strategy is imposed on 2–16% of nodes selected by largest degree (black), H-index (red), coreness (blue), or at random (olive), in a scale-free network with assortativity coefficient rk = 0.18. (b) The same results in a disassortative network with assortativity coefficient rk = −0.16. (c) The same results in an ordinary scale-free network. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

First, we generate an ordinary scale-free network of 5000 nodes with average degree ⟨k⟩ = 4. To obtain an assortative network, two different edges with four distinct end nodes are chosen at random in each step, and the two edges are rewired so that the nodes with larger degrees are linked to each other and the nodes with smaller degrees are linked to each other; repeating this step makes the network assortative without changing the degree distribution of the original network. A disassortative network can be generated by the opposite operation. The degree correlation is measured by Newman's assortativity coefficient $r_k = (\langle ij \rangle - \langle i \rangle\langle j \rangle)/(\langle i^2 \rangle - \langle i \rangle^2)$, where i, j are the degrees at the two ends of an edge and ⟨·⟩ denotes the average over all edges. In the 'test' condition, the punishment strategy is imposed on some of the largest-degree nodes, and the other nodes adopt the other three strategies at random with equal probability in the initial round; all nodes later update their strategies according to Eq. (8). We perform the same simulations for the other indices in each network. In the 'control' condition, the same proportion of nodes is restricted to strategy P, but these nodes are chosen uniformly at random. In the outcomes we pay more attention to the cooperation level than to the equilibrium density of each strategy, so we no longer distinguish P from All C, since a punisher is also a cooperator. Comparing the cooperation levels of the simulations, the results are as follows. First, there is no significant difference among the three indices in an assortative network; however, degree and H-index significantly outperform coreness in the other networks, and coreness is even worse than the control condition in the disassortative network. Second, the degree indicator is superior to the H-index in the disassortative network. Finally, as the initial density of punishers increases, the probability of cooperation grows, and cooperation prevails once the fraction of punishers exceeds 4% in an ordinary scale-free network. These results suggest that coreness should not be used as the only indicator unless the network is assortative. Degree achieves remarkable success, although degree and H-index are neck-and-neck in the disassortative and ordinary scale-free networks, so we take the degree indicator to identify leaders.

4. The influence of the leader's combined strategy

Punishment has been proven to play an important role in evolution, and a majority of large-degree nodes adopting the initial punishment strategy can sustain a high level of cooperation. However, we still face a gloomy evolutionary outlook if only a few leaders are willing to punish betrayers; after all, punishment is costly. Can a single large-degree node rescue the outlook by means of a good strategy? Let us explore this numerically.

Table 2. The probabilities for a variety of outcomes.

Treatment | Equilibrium strategies | (1) | (2) | (3) | (4)
(a) | All C + P | 100% | 0% | 0% | 0%
(a) | All D + Ex | 0% | 100% | 100% | 100%
(b) | All C + P | 100% | 98% | 0% | 0%
(b) | All D + Ex | 0% | 2% | 60% | 78%
(b) | Ex + All C + P | 0% | 0% | 0% | 4%
(b) | Ex + All D + All C | 0% | 0% | 40% (All C 6.39%) | 18%

Here, we propose a new strategy, according to which a player takes the following action at time t:

$$S(t) = \begin{cases} P, & t = 1, \\ Ex, & n_{t-1}/k > \eta, \\ P, & \text{otherwise}, \end{cases} \quad (10)$$

where k is the user's degree, $n_{t-1}$ is the number of defectors among his/her neighbours at time t − 1, and we set η = 1/3. In other words, the player adopts the punishment strategy in the first round and then observes the behaviour of his/her neighbours. Once the proportion of betrayers exceeds his/her tolerance, he/she extorts the neighbours until the betrayers are few enough, and then returns to punishment. In effect, the player implements a combined strategy of P and Ex. To evaluate its effect, we apply the following four treatments to the leader: (1) he/she insists on the combined strategy; (2) he/she insists on the P strategy; (3) he/she insists on the Ex strategy; (4) the leader has no strategy preference and behaves like any other node. We also apply two treatments to the other nodes in the first round: (a) All C or All D with equal probability; (b) All C, All D, P or Ex with equal probability; they then update their strategies according to the utility function. We repeat 200 simulations under each treatment on an ordinary scale-free network with the same parameters as before. All equilibrium results and their probabilities are listed in Table 2. Note that the statistics do not include the leader node, because its strategy is prescribed in conditions (1)–(3). We find sharp differences among the equilibrium results. First, there are only two outcomes, all cooperators or all defectors, except in treatments b3 and b4. Second, comparing conditions (a) and (b), the participation of punishment and extortion significantly improves the level of cooperation. Reading from right to left, the cooperation level is higher if the leader insists on either component, in particular the punishment strategy (98%). However, the leader's punishment is no longer effective for cooperation once he/she loses the extortion partners (0%, see treatment a2).
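For concreteness, a minimal sketch of the leader's switching rule (Eq. (10)) and of one asynchronous update of an ordinary node (Eqs. (7) and (8)) is given below; the helper names and the value of the rationality parameter a are assumptions made only for illustration.

```python
import math
import random

ETA, GAMMA = 1/3, 0.5   # eta from Eq. (10), gamma from Eq. (7)
A = 0.1                 # rationality parameter 'a' of Eq. (8); value assumed here

def leader_action(t, n_defect_prev, k, eta=ETA):
    """Combined strategy of Eq. (10): punish in round 1, extort while the fraction
    of defecting neighbours exceeds eta, otherwise punish."""
    if t == 1:
        return "P"
    return "Ex" if n_defect_prev / k > eta else "P"

def utility(payoff, k, gamma=GAMMA):
    """Weighted utility of Eq. (7): accumulated payoff mixed with its degree average."""
    return gamma * payoff + (1 - gamma) * payoff / k

def fermi(u_i, u_j, a=A):
    """Imitation probability of Eq. (8), clamped for numerical safety."""
    z = min(max((u_i - u_j) / a, -500.0), 500.0)
    return 1.0 / (1.0 + math.exp(z))

def update(strategy, payoff, degree, neighbours, i, rng=random):
    """One asynchronous step for an ordinary node i: pick a random neighbour j and
    copy its strategy with the Fermi probability."""
    j = rng.choice(neighbours[i])
    if rng.random() < fermi(utility(payoff[i], degree[i]),
                            utility(payoff[j], degree[j])):
        strategy[i] = strategy[j]
```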


Fig. 2. (a) The average scores of the leader in treatments b1, b2 and b3. (b) The average scores of the group in treatments b1, b2 and b3. (c) The evolution of four strategies if the leader utilizes the combination of the P and Ex .

Would it be better to combine them? As expected, when the new strategy is used, the leader will definitely guide all players to cooperation, regardless of whether there are other punishers and extortioners (see treatment 1). Next, we test whether there is a significant difference between treatments a1 and b2 with a two-sample binomial test, letting p1 and p2 denote the probabilities of full cooperation in treatments a1 and b2, respectively.

$$H_0: p_1 = p_2 \quad \text{versus} \quad H_1: p_1 \neq p_2.$$

The test statistic is

$$U = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{m_1} + \frac{1}{m_2}\right)}}, \quad (11)$$

where $\hat{p}_i = x_i/m_i$, $\hat{p} = (x_1 + x_2)/(m_1 + m_2)$, and $x_i$, $m_i$ are the numbers of full-cooperation outcomes and of simulations in the two treatments. Under the null hypothesis, U ∼ N(0, 1) for sufficiently many samples. We obtain |U| = 2.01 > 1.96 = z0.025, which shows a significant difference: insisting on punishment alone is inferior to the combination of P and Ex, which ensures full cooperation with probability 100%. This is because the combined strategy balances the catalysing and sustaining functions of cooperation. As for the scores in treatment (b), Fig. 2(a) shows that the leader's payoff under the new strategy exceeds the other cases, in particular the extortion strategy, even though an extortioner always outscores his opponent. The leader earns about 4 times the score of the extortioner, and his/her group also obtains the highest income (see Fig. 2(b)). That is, the strategy benefits not only the leader but also the others, so the leader has a strong incentive to enforce the rule. Let us now analyse its working mechanism. In the early stage, cooperators find it difficult to survive. As defectors increase, the leader switches from the initial punishment strategy to extortion, which leads more and more neighbours to adopt extortion and thus eliminates defectors. Once defectors are a minority among his/her neighbours (a ratio below 1/3), the leader immediately switches back to punishment at this crucial moment, and several of his/her many neighbours follow in the next round. The punishment strategy then gradually replaces Ex and All D entirely (see Fig. 2(c)).
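For completeness, the reported test statistic is easy to reproduce. A small sketch follows, assuming the counts implied by 200 runs per treatment and the Table 2 rates of 100% (a1) and 98% (b2).

```python
import math

x1, m1 = 200, 200   # treatment a1: full cooperation in all runs (assumed counts)
x2, m2 = 196, 200   # treatment b2: full cooperation in 98% of runs

p1_hat, p2_hat = x1 / m1, x2 / m2
p_hat = (x1 + x2) / (m1 + m2)
U = (p1_hat - p2_hat) / math.sqrt(p_hat * (1 - p_hat) * (1 / m1 + 1 / m2))
print(abs(U))        # ~2.01 > 1.96, so H0 is rejected at the 5% level
```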

5. The robustness of the result

We have concluded that the combined strategy of punishment and Ex is superior to the other four strategies. What happens with more strategies and different parameters? We extend the competition to six strategies, adding TFT and WSLS to the previous four, and set b to three levels representing low, medium and high betrayal temptation to examine the robustness of the results. WSLS has been proven very effective in promoting cooperation in many cases due to its resilience against errors, so we compare our strategy with it. We implement a set of treatments: in the 'test' condition the leader still takes the new strategy; by contrast, the leader insists on WSLS or adopts a random initial strategy. Each treatment is repeated 100 times. First, the cooperation level is promoted significantly by the leader's WSLS, since more players adopt WSLS instead of TFT than in the 'random' condition, especially under high betrayal temptation; yet some defectors remain even when the leader takes WSLS. In the test group, however, the evolutionary outlook is much brighter: there are no betrayers in any simulation no matter how b changes (see Fig. 3(c)). Moreover, the combined strategy still gives the leader the highest payoff under both low and high betrayal temptation (see Fig. 3(d)); he even earns more than with WSLS, although WSLS has been shown to be a winner among the classic strategies. Hence, by changing the parameter b and adding competing strategies, the combined strategy proves robust in promoting cooperation and earning the highest score.

6. Discussion

Throughout this manuscript, we have focused on the synergy of punishment and extortion used by a leader. We first discussed who is likely to be a leader. As a leader, he/she should be able to change the fate of the group, and we identified leaders through the influence of their punishment strategy. The results show that individuals with more interaction partners are more likely to be leaders, since they have more opportunities to change others' behaviour; in a scale-free network, a large-degree node is the best candidate. Table 1 shows that Ex is more effective against All D than the punisher, because the punisher's payoff against a betrayer is negative. However, Ex is not evolutionarily stable [44] and therefore cannot sustain cooperation, a deficiency that punishment makes up for. We therefore constructed a combined strategy whose principle is as follows: the user adopts punishment in the early stage; once defectors become too numerous and their proportion among his neighbours exceeds η, he switches to Ex to eliminate some of them, and then wipes out the remaining few betrayers and sustains mutual cooperation with punishment. Its merits are not limited to those listed above. In fact, we also studied its robustness with respect to the parameter η: changing η from 0.1 to 1/3 hardly changes the result, which indicates that the user has a flexible execution condition. However, the strategy also has a shortcoming: there is no significant effect if the user plays with only a few opponents.


Fig. 3. With b − c = 1 and the other parameters the same as in the simulations above, the leader's influence under three different strategies is compared through simulations repeated 100 times. (a) For b = 2 (low betrayal temptation), the leader is assigned an initial strategy at random and updates it in each round (left), insists on WSLS (middle), or insists on the combined strategy (right). (b) b = 3.5 (medium betrayal temptation). (c) b = 5 (high betrayal temptation). (d) The leader's average income in the three treatments under the three conditions.

7. Conclusion

In donation games, individuals are not likely to cooperate. However, initial punishment by some large-degree nodes pushes the population toward mutual cooperation with a large probability; we call such nodes leaders. Considering that this solution requires enough leaders to punish, we proposed a new combined strategy of punishment and extortion. With only one leader using it, he/she can not only achieve high returns but also push almost all individuals toward mutual cooperation. Moreover, the result is independent of the competing strategies, the betrayal temptation and the strategy's parameters.

Acknowledgements

The authors thank the National Natural Science Foundation of China (Grant Nos. 71571119 and 71801139) and the Humanity and Social Science Foundation of Henan Educational Committee (2019-ZDJH-116).

References

[1] Nash JF. Equilibrium points in n-person games. Proc Natl Acad Sci USA 1950;36:48–9.
[2] Nash J. Non-cooperative games. Ann Math 1951;54:286–95.
[3] Smith JM, Price GR. The logic of animal conflict. Nature 1973;246:15–18.
[4] Nowak MA, Sigmund K. Tit for tat in heterogenous populations. Nature 1992;355:250–3.
[5] Rodríguez IN, Neves AGM. Evolution of cooperation in a particular case of the infinitely repeated prisoner's dilemma with three strategies. J Math Biol 2016;73:1665–90.

[6] Szabó G, Fáth G. Evolutionary games on graphs. Phys Rep 2007;446:97–216.
[7] Wu YH, Li X, Zhang ZZ, Rong ZH. The different cooperative behaviors on a kind of scale-free networks with identical degree sequence. Chaos Solitons Fractals 2013;56:91–5.
[8] Xiang HT, Liang SD. Evolutionary gambling dynamics for two growing complex networks. Acta Phys Sin 2015;62:18902.
[9] Xu B, Li M, Deng RP. The evolution of cooperation in spatial prisoner's dilemma games with heterogeneous relationships. Phys A 2015;424:168–75.
[10] Xu B, Lan YN. The distribution of wealth and the effect of extortion in structured populations. Chaos Solitons Fractals 2016;87:276–80.
[11] Zhang JJ, Ning HY, Yin ZY, Sun SW, Wang L, Sun JQ, et al. A novel snowdrift game model with edge weighting mechanism on the square lattice. Front Phys 2012;7:366–72.
[12] Pichler E, Shapiro AM. Public goods games on adaptive coevolutionary networks. Chaos 2017;27:47–97.
[13] Imhof LA, Fudenberg D, Nowak MA. Tit-for-tat or win-stay, lose-shift? J Theor Biol 2007;247:574–80.
[14] Lorberbaum J. No strategy is evolutionarily stable in the repeated prisoner's dilemma. J Theor Biol 1994;168:117–30.
[15] Nowak M. Stochastic strategies in the Prisoner's Dilemma. Theor Popul Biol 1990;38:93–112.
[16] Su DY, Baek SK, Choi JK. Combination with anti-tit-for-tat remedies problems of tit-for-tat. J Theor Biol 2017;412:1–7.
[17] Adami C, Hintze A. Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nat Commun 2013;4:2193.
[18] Chen J, Zinger A. The robustness of zero-determinant strategies in iterated prisoner's dilemma games. J Theor Biol 2014;357:46–54.
[19] Hao D, Rong ZH, Zhou T. Zero-determinant strategy: an underway revolution in game theory. Chin Phys B 2014;23:78905.
[20] Hilbe C, Nowak MA, Sigmund K. Evolution of extortion in iterated prisoner's dilemma games. Proc Natl Acad Sci USA 2013;110:6913.
[21] Press WH, Dyson FJ. Iterated prisoner's dilemma contains strategies that dominate any evolutionary opponent. Proc Natl Acad Sci USA 2012;109:10409–13.
[22] Wang JF, Guo JL, Liu H, Shen AZ. Evolution of zero-determinant strategy in iterated snowdrift game. Acta Phys Sin 2017;66:180203.


[23] Hilbe C, Nowak M A, Sigmund K. Evolution of extortion in iterated prisoner’s dilemma games. Proc Natl Acad Sci USA 2013;110:6913–18. [24] Li X, Jusup M, Wang Z, Li HJ, Shi L, Podobnik B, et al. Punishment diminishes the benefits of network reciprocity in social dilemma experiments. Proc Natl Acad Sci USA 2018;115:30–5. [25] Barclay P, Raihani N. Partner choice versus punishment in human prisoner’s dilemmas. Evol Hum Behav 2016;37:263–71. [26] Han TA, Lenaerts T. A synergy of costly punishment and commitment in cooperation dilemmas. Adapt Behav 2016;24:237–48. [27] Bone JE, Wallace B, Bshary R, Raihani NJ. Power asymmetries and punishment in a prisoner’s dilemma with variable cooperative investment. PLOS One 2016;11:1–16. [28] Hauert C, Traulsen A, Brandt H, Nowak MA, Sigmund K. Via freedom to coercion: the emergence of costly punishment. Science 2007;316:1905–7. [29] Liu J, Meng H, Wang W, Li T, Yu Y. Synergy punishment promotes cooperation in spatial public good game. Chaos Solitons Fractals 2018;109:214–18. [30] Perc M, Jordan JJ, Rand DG, Wang Z, Boccaletti S, Szolnoki A. Statistical physics of human cooperation. Phys Rep 2017;687:1–51. [31] Santos FC, Pacheco JM. Scale-free networks provide a unifying framework for the emergence of cooperation. Phys Rev Lett 2005;95:1–4. [32] Gómez-Gardeñes J, Campillo M, Floría LM, Moreno Y. Dynamical organization of cooperation in complex topologies. Phys Rev Lett 2007;98:108103. [33] Szolnoki A, Perc M, Danku Z. Towards effective payoffs in the prisoner’s dilemma game on scale-free networks. Phys A 2008;387:2075–82.

[34] Xu X, Rong ZH, Wu ZX, Zhou T, Tse CK. Extortion provides alternative routes to the evolution of cooperation in structured populations. Phys Rev E 2017;95:052302.
[35] Mao YJ, Xu XR, Rong ZH, Wu ZX. The emergence of cooperation-extortion alliance on scale-free networks with normalized payoff. EPL 2018;122:50005.
[36] Perc M, Wang Z. Heterogeneous aspirations promote cooperation in the prisoner's dilemma game. PLOS One 2010;5:e15117.
[37] Xu XR, Rong ZH, Tse CK. Bounded rationality optimizes the performance of networked systems in prisoner's dilemma game. In: Proceedings of the IEEE International Symposium on Circuits and Systems; 2018.
[38] Korn A, Schubert A, Telcs A. Lobby index in networks. Phys A 2009;388:2221–2226.
[39] Santos FC, Pacheco JM, Lenaerts T. Evolutionary dynamics of social dilemmas in structured heterogeneous populations. Proc Natl Acad Sci USA 2006;103:3490–4.
[40] Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, et al. Identification of influential spreaders in complex networks. Nat Phys 2010;6:888–93.
[41] Rong ZH, Wu ZX. Effect of the degree correlation in public goods game on scale-free networks. EPL 2009;87:30001.
[42] Rong ZH, Li X, Wang XF. Roles of mixing patterns in cooperation on a scale-free networked game. Phys Rev E 2007;76:027101.
[43] Wang Z, Wang L, Perc M. Degree mixing in multilayer networks impedes the evolution of cooperation. Phys Rev E 2014;89:052813.
[44] Adami C, Hintze A. Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nat Commun 2013;4:1–7.