Memory boosts turn taking in evolutionary dilemma games


BioSystems 131 (2015) 30–39


Tao Wang a,*, Zhigang Chen b, Lei Yang a, You Zou c, Juan Luo a

a College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
b School of Software, Central South University, Changsha 410083, China
c High Performance Computing Center, Central South University, Changsha 410083, China

Article history: Received 20 September 2014; received in revised form 30 March 2015; accepted 30 March 2015; available online 1 April 2015.

Abstract

The phenomenon of spontaneous turn taking can be observed in many self-organized systems, yet its mechanism remains unclear. This paper models it with evolutionary dilemma games equipped with a memory mechanism. The Prisoner's dilemma, Snowdrift (including Leader and Hero) and Stag hunt games are unified on an extended S–T plane. Agents play the game with all the others and make decisions based on the most recent game histories. The experiments show that when agents remember the last 2-step histories or more, a kind of cooperative turn taking (CAD) bursts out in the Snowdrift region satisfying S + T > 2R and S ≠ T, while the consistent strategy (DorC) gathers on the line S + T > 2R, S = T. We also find that the system's fitness ratio greatly improves with 2-step memory.

© 2015 Elsevier Ireland Ltd. All rights reserved.

Keywords: Cooperation; R reciprocity; ST reciprocity; P reciprocity; TFT; WSLS

1. Introduction

Spontaneous turn taking can be seen as an action of rotation or oscillation in populations (Jiang et al., 2006). This phenomenon has been found in many different systems, such as the density oscillator (Steinbock et al., 1998), the ticking hour glass (Wu et al., 1993), RNA polymerase traffic on DNA (Sneppen et al., 2005) and the traffic of ants (Dussutour et al., 2005). Turn taking can also solve the problem of pedestrians passing a bottleneck (Burstedde et al., 2001; Helbing and Molnár, 1995; Helbing et al., 2000, 2005). In this paper we introduce evolutionary game theory to model turn taking in self-organized populations (Gintis, 2000; Hofbauer and Sigmund, 1998; Maynard Smith, 1982; Nowak, 2006a); the model may also help illustrate the systems mentioned above.

Evolutionary game theory (EGT) is the application of game theory to evolving populations of life forms in biology. EGT is useful in this context because it defines a framework of contests, strategies, and analytics in which Darwinian competition can be modeled (Maynard-Smith and Price, 1973). The payoff matrix of a 2-person, 2-strategy game is shown in Table 1. There are three kinds of relationships between the players: both players cooperate and each gets payoff R (R reciprocity); both players defect and each gets payoff P (P reciprocity); a cooperator meets a defector, getting S while the opponent gets T (ST reciprocity).

* Corresponding author. E-mail address: [email protected] (T. Wang). http://dx.doi.org/10.1016/j.biosystems.2015.03.006

There are social metaphors for dilemma situations (Rand and Nowak, 2013), such as the Prisoner's dilemma game (PD, T > R, P > S), the Snowdrift game (SD, T > R, S > P) and the Stag hunt game (SH, R > T, P > S). In the Hawk–dove game (also named the Snowdrift game, T > R, S > P) (Maynard-Smith and Price, 1973), when facing a new territory, if one player chooses cooperation (dove strategy: avoid combat and possibly lose the territory), the best reply for the other is defection (hawk strategy: fight in any case and possibly acquire the territory), and vice versa. However, the cooperator usually gets a lower payoff than the defector, which damps the cooperator's enthusiasm if the game is played repeatedly. Is there any way to help the two players reach a state of "turn taking" in ST reciprocity, where the one who cooperates defects next time, the one who defects cooperates next time, and so on?

Some studies have revealed the relationship between turn taking and evolutionary games. Several dilemma games are related to turn taking, including the Leader game, the Hero game and the common-pool resources game (Browning and Colman, 2004; Lau and Mui, 2012; Rapoport, 1967). These games all belong to the subtype of the Snowdrift game whose matrix meets the condition T + S > 2R. Crowley et al. (1998) found the turn taking phenomenon in evolutionary games. They studied an iterated complementarity dilemma game and observed two special cooperation mechanisms: cooperation alternating with defection (CAD), in which both players take alternating strategies: (cooperate, defect), (defect, cooperate), ...; and defection or cooperation (DorC),

Table 1: Payoff matrix for a 2 × 2 game.

                   Opponent
Ego                Cooperation (C)    Defection (D)
Cooperation (C)    R                  S
Defection (D)      T                  P
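Read as code, Table 1 is a simple payoff lookup. Below is a minimal sketch in Python (our illustration language throughout; the paper publishes no code), with example Snowdrift values as an assumption:

```python
# Payoff lookup for Table 1, for the row player ("Ego"); 1 = cooperate, 0 = defect.
R, S, T, P = 1.0, 0.5, 1.5, 0.0  # assumed example Snowdrift values (T > R, S > P)

def payoff(my_move: int, opp_move: int) -> float:
    """Return Ego's payoff for one round against the opponent's move."""
    if my_move == 1:
        return R if opp_move == 1 else S   # R reciprocity / the S side of ST
    return T if opp_move == 1 else P       # the T side of ST / P reciprocity

assert payoff(1, 1) == R and payoff(1, 0) == S
assert payoff(0, 1) == T and payoff(0, 0) == P
```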

in which both players keep a single strategy unchanged: (cooperate, defect), (cooperate, defect), .... These two mechanisms both belong to ST reciprocity, and CAD is a kind of turn taking. We call CAD cooperative turn taking because it can help agents overcome the dilemma of games like Snowdrift. Browning and Colman (Browning and Colman, 2004; Colman and Browning, 2009b) studied the iterated Prisoner's dilemma, Snowdrift, Hero and Leader games. They made agents remember the last 3-step histories, adopted a genetic algorithm as the strategy evolution rule, and also observed turn taking strategies like CAD. Tanimoto and Sagara (2007) defined a special 2-dimensional Dg–Dr game plane, which includes the Prisoner's dilemma, Snowdrift, Stag hunt, Leader and Hero games (Fig. 1(b)). Wakiyama and Tanimoto (2011) studied the system in which agents played and updated strategies in different groups on the Dg–Dr plane; agents had one-step memory, and the CAD strategy was found in the area of the Harmony game.


An important application of EGT is the research on cooperation mechanisms (Axelrod and Hamilton, 1981; Nowak, 2006b). Cooperation is the process of groups of organisms working or acting together for their common benefit, as opposed to working in competition for selfish benefit (Kohn, 1992). Since Nowak found in 1992 that cooperation can persist in the evolutionary Prisoner's dilemma on a lattice (Nowak and May, 1992), many scientists have been interested in the question of how cooperation can exist among rational individuals (Hauert and Szabo, 2005; Nowak et al., 2010; Perc and Szolnoki, 2010; Roca et al., 2009; Szabó and Fáth, 2007).

In the study of turn taking in evolutionary games, memory is an important mechanism. Its study dates back to Axelrod's tournaments in the 1980s (Axelrod, 1984, 1987; Axelrod and Hamilton, 1981). He conducted computer tournaments to find the strategy that performs best in a population playing the PD game; TFT (tit-for-tat) won both tournaments. However, Nowak and Sigmund (1993) later found that WSLS (win-stay, lose-shift) outperforms TFT. Both TFT and WSLS are memory-based, and there are many studies of memory mechanisms and strategies (Alonso-Sanz, 2009; Alonso-Sanz and Martin, 2006; Deng et al., 2010; Imhof et al., 2007; Liu et al., 2010; Posch, 1999; Qin et al., 2008; Tanimoto and Sagara, 2007; Wang et al., 2006).

ST reciprocity is a subtype of cooperation. People usually think that bilateral cooperation is the key to overcoming the dilemma of 2-person games. However, in the studies of Chen et al. (2013) and Wang et al. (2014), the system's cooperation ratio is composed of R reciprocity and ST reciprocity.

Fig. 1. Extended S–T plane. (a) General S–T plane; (b) Dg–Dr plane defined by Tanimoto; (c) unified extended S–T plane. PD is the Prisoner's dilemma game, SD the Snowdrift game, SH the Stag hunt game and HG the Harmony game (no dilemma). In (b) and (c), the PD game includes the area S + T > 2R, which is not the common definition; we call it the extended PD game.


Table 2: Examples of the coding of strategies when m = 1.

Game history of the two players    Payoff of the first player    Next move: ALLC    ALLD    TFT    WSLS
00 (defect, defect)                P                             1                  0       0      1
01 (defect, cooperate)             T                             1                  0       1      0
10 (cooperate, defect)             S                             1                  0       0      0
11 (cooperate, cooperate)          R                             1                  0       1      1

Wang et al. (2014) showed that memory promotes ST reciprocity while demoting R reciprocity in the Snowdrift game, which causes a decrease of cooperation and an increase of the system's fitness.

The research above shows that turn taking can evolve in evolutionary games. However, challenges remain. First, memory's effect on turn taking in evolutionary games is not clear: some studies fix the memory length at three (Browning and Colman, 2004; Colman and Browning, 2009a) and others fix it at one (Tanimoto and Sagara, 2007), so the effect of different memory lengths on turn taking is unknown. Second, the distribution of turn taking over general game parameters is not clear: some papers study a fixed payoff matrix and others a self-defined payoff plane, so how turn taking distributes over the general S–T plane is unknown. This paper studies the evolutionary system on an extended S–T plane with different memory lengths. The contents include the distribution of CAD and DorC, the system's cooperation and fitness, and the distribution of the R, ST and P reciprocities.

2. The model

2.1. Game matrix

In order to reduce the number of parameters, the game matrix

( R  S )
( T  P )

is usually reduced to

( 1  S )
( T  0 ).

For the Prisoner's dilemma (PD), T > 1 and S < 0; for the Snowdrift (SD) game, T > 1 and S > 0; for the Stag hunt (SH) game, T < 1 and S < 0; for the Harmony game (HG, no dilemma), T < 1 and S > 0 (Fig. 1(a)). In order to quantify the dilemma, Tanimoto and Sagara (2007) defined two indicators:

Dg = T − R    (1)

Dr = P − S    (2)

If at least one of the indicators is greater than zero, the game is a dilemma game. By this definition, PD (Dg > 0, Dr > 0), SD (Dg > 0, including Leader and Hero) and SH (Dr > 0) are all dilemma games. Letting R = 1 and P = 0, Tanimoto obtained the payoff matrix

( 1       −Dr )
( 1+Dg     0  ).

Based on this matrix, he defined a 2-dimensional plane. Since Dg = T − 1 and Dr = −S, the S–T plane and the Dg–Dr plane are actually the same.
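As a worked example of Eqs. (1)–(2), here is a short sketch that classifies a sample point of the reduced matrix; the function name and the handling of boundary points are our assumptions:

```python
# Classify a point of the reduced game matrix ((1, S), (T, 0)) on the S-T plane,
# using the dilemma indicators Dg = T - R and Dr = P - S (Eqs. (1)-(2)).
def classify(S: float, T: float, R: float = 1.0, P: float = 0.0) -> str:
    Dg, Dr = T - R, P - S
    if Dg > 0 and Dr > 0:
        return "PD"   # Prisoner's dilemma
    if Dg > 0:
        return "SD"   # Snowdrift (includes Leader and Hero when S + T > 2R)
    if Dr > 0:
        return "SH"   # Stag hunt
    return "HG"       # Harmony game, no dilemma

print(classify(S=0.5, T=1.5))    # SD
print(classify(S=-0.5, T=1.5))   # PD
```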

This paper unifies the general S–T plane and the Dg–Dr plane into the extended S–T plane (Fig. 1). As shown in Fig. 1(c), the dilemma games, including the Prisoner's dilemma, Snowdrift (including Leader and Hero) and Stag hunt, all lie on the plane.

2.2. Memory mechanism

Following previous work (Chen et al., 2013; Wang et al., 2014), we design a memory mechanism to code the game history and the strategies. One-step memory means that agents remember the history of the last game. Cooperation is coded as "1" and defection as "0". Thus there are four possible histories for one-step memory (Table 2): 00 (the focal player defects and the opponent defects), 01 (the focal player defects and the opponent cooperates), 10, and 11. Depending on the history, agents choose to cooperate or defect, so a strategy can be coded as ****, where each * is 0 or 1 and its position indicates a particular history. This 4-bit encoding represents 16 different strategies; for example, 1001 is the WSLS strategy and 1010 is TFT (high bit on the left). Table 2 shows the codes for always cooperate (ALLC), always defect (ALLD), TFT and WSLS. When agents remember more than one step, the relationship between memory length and strategies is shown in Table 3 (a look-up sketch follows the table).

2.3. Strategy update rule

There are many strategy update rules, such as the replicator rule, the unconditional imitation rule, etc. (Perc and Szolnoki, 2010). In studies of repeated dilemma games with memory, the genetic algorithm is a common update rule (Axelrod, 1987; Browning and Colman, 2004; Colman and Browning, 2009b; Crowley, 2001; Tanimoto and Sagara, 2007). In this paper we adopt the genetic algorithm as the update rule. The algorithm has three steps: selection, crossover and mutation.

The first step is selection. Every agent randomly selects a neighbor. If the neighbor's payoff is higher than the focal agent's, the following steps are executed; otherwise the focal agent keeps its strategy unchanged and goes directly to the third step.

The second step is crossover. The strategies of the focal agent and the neighbor are coded as binary strings, and the crossover point is at the middle of the strategy. One of the two child strings is randomly selected as the strategy of the next generation. When the memory length is 0, the strategy is only one bit, so the focal agent changes to the neighbor's strategy with probability 0.5.

The third step is mutation. Every bit of the new strategy mutates with rate 0.01. A sketch map of crossover and mutation is shown in Fig. 2 (a one-update code sketch follows the figure caption). We discuss the algorithm further in Section 4.

Table 3: The relationship between memory length and the coding of strategies (bits).

Memory length    Coding length of game history    Strategy length    Number of strategies
0                0                                1                  2^1
1                2                                4                  2^4
2                4                                16                 2^16
3                6                                64                 2^64
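Tables 2 and 3 pin down the encoding. A minimal look-up sketch follows; `next_move` and the bit conventions are our reading of the paper, not published code:

```python
def next_move(strategy: str, history_bits: str) -> int:
    """Look up the coded response to a history; e.g. TFT (m = 1) is '1010'."""
    n = len(strategy)                    # 2**(2m) entries, one per history
    index = int(history_bits, 2)         # the history read as a binary number
    return int(strategy[n - 1 - index])  # 'high bit on the left' convention

TFT, WSLS = "1010", "1001"
# Reproduce the TFT and WSLS columns of Table 2 (history: focal bit, opponent bit).
for h in ("00", "01", "10", "11"):
    print(h, next_move(TFT, h), next_move(WSLS, h))
# TFT repeats the opponent's last move; WSLS stays after R or T, shifts after S or P.
```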


Fig. 2. Sketch map of strategy crossover and mutation.
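The following is a sketch of one update under the genetic algorithm of Section 2.3 and Fig. 2, assuming single-point crossover exactly at the middle and a per-bit mutation rate of 0.01 as stated; the function and variable names are ours:

```python
import random

MUTATION_RATE = 0.01  # per-bit mutation rate stated in Section 2.3

def update_strategy(own: str, own_payoff: float,
                    neigh: str, neigh_payoff: float) -> str:
    """One genetic-algorithm update, as we read Section 2.3 and Fig. 2."""
    child = own
    if neigh_payoff > own_payoff:        # selection: learn only from a richer neighbor
        cut = len(own) // 2              # crossover point at the middle of the string
        children = (own[:cut] + neigh[cut:], neigh[:cut] + own[cut:])
        child = random.choice(children)  # keep one of the two children at random
    # mutation: flip every bit independently with rate 0.01
    return "".join(b if random.random() >= MUTATION_RATE else str(1 - int(b))
                   for b in child)

# For a 1-bit strategy (memory length 0), the middle cut makes the focal agent
# adopt the richer neighbor's bit with probability 0.5, as the paper states.
print(update_strategy("1010", 0.8, "1001", 1.2))
```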

The process of the experiment is as follows (a condensed skeleton follows the list):

(1) For each memory length from 0 to 3, steps (2)–(4) are performed.
(2) We sample the extended S–T plane every 0.5 unit, giving 15 × 15 = 225 sample points. Each sample point corresponds to a specific payoff matrix, and at each sample point steps (3)–(4) are performed.
(3) N = 2000 agents are well mixed (to exclude network-structure effects), and each focal agent picks its opponents randomly. All agents play for Lr = 1000 generations of step (4).
(4) Each agent randomly selects 100 opponents and plays 100 rounds with them. Then the genetic algorithm is executed to generate the next generation.
(5) All the following statistics are based on 100 samples of steps (1)–(4).
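The protocol can be condensed into the skeleton below. This is a sketch under stated assumptions (random strategy initialization, an all-defect initial history for each pairing, our own function and parameter names), not the authors' implementation:

```python
import random

def run_sample(S, T, R=1.0, P=0.0, m=1,
               n_agents=2000, generations=1000, opponents=100, rounds=100):
    """One S-T sample point of steps (3)-(4): well-mixed agents, m-step memory."""
    L = 2 ** (2 * m)                           # strategy length (Table 3)
    pop = ["".join(random.choice("01") for _ in range(L))  # assumed random init
           for _ in range(n_agents)]
    payoffs = [0.0] * n_agents
    for _ in range(generations):
        payoffs = [0.0] * n_agents
        for i in range(n_agents):
            others = [x for x in range(n_agents) if x != i]
            for j in random.sample(others, opponents):
                hi = hj = "0" * (2 * m)        # assumption: all-defect start history
                for _ in range(rounds):
                    a = int(pop[i][L - 1 - int(hi, 2)])   # high bit on the left
                    b = int(pop[j][L - 1 - int(hj, 2)])
                    payoffs[i] += (R if b else S) if a else (T if b else P)
                    hi = (hi + f"{a}{b}")[-2 * m:]        # focal bit first (Table 2)
                    hj = (hj + f"{b}{a}")[-2 * m:]
        # here each agent's strategy would be updated by the genetic algorithm above
    return [p / (opponents * rounds) for p in payoffs]    # p_i = Pf_i/(k*t), Eq. (6)

# Tiny demo run; the paper's full scale (2000 agents, 1000 generations,
# 225 sample points, 100 repetitions) would need an optimized implementation.
print(run_sample(0.5, 1.5, n_agents=10, generations=2, opponents=3, rounds=5)[:3])
```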

3. Experiments and analysis

3.1. Turn taking strategy (CAD) and consistent strategy (DorC)

Turn taking is the focus of this paper. Previous research concentrated on 1-step and 3-step memory, so the relationship between memory length and spontaneous turn taking is not clear. We studied the evolutionary system with memory lengths from 0 to 3. When the system reached a stable state (1000 generations), we analyzed the last 6-step history.

Table 4: List of 6-step repeated histories.

Game history of the two players    Binary coding of the 6-step history    Strategy
DDDDDDDDDDDD                       000000000000                           ALLD
DCDCDCDCDCDC                       010101010101                           DorC
CDCDCDCDCDCD                       101010101010                           DorC
DCCDDCCDDCCD                       011001100110                           CAD
CDDCCDDCCDDC                       100110011001                           CAD
CCCCCCCCCCCC                       111111111111                           ALLC

If the history looks like 011001100110 or 100110011001, it is CAD (Table 4); if it looks like 010101010101 or 101010101010, it is DorC. It is not accurate to study CAD and DorC by strategies alone, so we classify the realized histories. The ratios of CAD and DorC are defined as follows:

r_CAD = n_CAD / n_total    (3)

r_DorC = n_DorC / n_total    (4)

where r_CAD and r_DorC are the ratios of CAD and DorC, n_CAD and n_DorC are the numbers of histories matching CAD and DorC, and n_total is the total number of histories between all pairs of agents.
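A minimal classifier for the two 6-step patterns of Table 4 and the ratios of Eqs. (3)–(4); the toy history list is invented for illustration:

```python
# Classify a pair's last 6 rounds (12 bits: focal move then opponent move per
# round, C = 1, D = 0) as CAD, DorC or other, per Table 4.
CAD_PATTERNS  = {"011001100110", "100110011001"}
DORC_PATTERNS = {"010101010101", "101010101010"}

def classify_history(h: str) -> str:
    if h in CAD_PATTERNS:
        return "CAD"    # cooperation alternating with defection (turn taking)
    if h in DORC_PATTERNS:
        return "DorC"   # one side always cooperates, the other always defects
    return "other"

histories = ["011001100110", "101010101010", "111111111111"]  # toy data
n_total = len(histories)
r_cad  = sum(classify_history(h) == "CAD"  for h in histories) / n_total  # Eq. (3)
r_dorc = sum(classify_history(h) == "DorC" for h in histories) / n_total  # Eq. (4)
print(r_cad, r_dorc)  # 0.333..., 0.333...
```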

We show the distribution of turn taking (CAD) in Fig. 3. We found that when agents remember 2-step histories, spontaneous turn taking springs up on the S–T plane. When the memory length m = 0, CAD does not exist; when m = 1, CAD exists in a small area of the SD game. When m = 2, the distribution changes drastically: CAD expands rapidly to the area S + T > 2R. When m = 3, the result is similar to m = 2.

We also show the distribution of the consistent strategy (DorC) in Fig. 4. When memory increases from 0 to 1, DorC increases in the area of the SD game, while for m = 2 and 3, DorC gathers on the line S = T. Some results need further analysis. In Fig. 4(c) and (d), DorC appears on discrete points near the axis. In the experiment the sampling density on the S–T plane is one point every 0.5 unit, so these discrete points in Fig. 4 are the sampling points, and DorC actually distributes strictly on the line S = T in the SD game area.

This experiment shows that when agents have 2-step memory, turn taking (CAD) emerges on the S–T plane under the condition S + T > 2R, S ≠ T. This area includes most of the SD game, all of the Leader and Hero games, and even part of the PD and HG.

3.2. Cooperation ratio and system's fitness

In the study of dilemma games, the system's cooperation ratio and average fitness are important factors, and cooperative turn taking is one component of the system's cooperation.


Fig. 3. Distribution of spontaneous turn-taking (CAD) on the extended S–T plane with memory length m = 0, 1, 2, 3.

Fig. 4. Distribution of consistent strategy (DorC) on the extended S–T plane with memory length m = 1, 2, 3.


Fig. 5. Distribution of the system's cooperation ratio and fitness with different memory lengths on the extended S–T plane. The first column is the cooperation ratio and the second column is the fitness. Rows 1–4 correspond to memory lengths 0 to 3, respectively.

The system's cooperation ratio r_c and average fitness p_ave are defined as follows:

r_c = n_c / (n_c + n_d)    (5)

p_i = Pf_i / (k · t)    (6)

p_ave = (1/N) Σ_{i=1}^{N} p_i    (7)

where n_c (n_d) is the number of cooperation (defection) moves adopted by the agents in a game round, p_i is the fitness of agent i, Pf_i is the total payoff agent i gets in a round, k is agent i's degree, t is the number of repeated games in a round, p_ave is the system's fitness, and N is the total number of agents.

We studied the system with memory lengths from 0 to 3 and obtained Fig. 5. As memory increases step by step, the cooperation ratio increases rapidly in the area between PD and SH (area B), while between SD and HG (area A) cooperation decreases. This means memory does not necessarily promote cooperation in dilemma games (Wang et al., 2014). We can also see that increasing memory promotes the system's fitness.

We studied further how much memory affects the system's fitness. First we compute the theoretical maximum fitness F_max by the definition:

F_max = { R,           if S + T ≤ 2R
        { (S + T)/2,   if S + T > 2R    (8)

Fig. 6. Theoretical maximum fitness F_max on the extended S–T plane, computed by Formula (8).

Then we divide the system's fitness by F_max to obtain the fitness ratio (Fig. 7). When agents remember no less than 2-step histories, the system achieves a high fitness ratio over the whole game area. Interestingly, the fitness ratio adjacent to the line S + T = 2R is lower than elsewhere: near this line, agents are caught between mutual cooperation and turn taking. We can also see that the cooperation ratio and the fitness ratio do not increase synchronously as memory increases.
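Eq. (8) and the fitness ratio of Fig. 7 are straightforward to compute; a small sketch (function names ours):

```python
# Theoretical maximum average fitness on the extended S-T plane (Eq. (8)):
# mutual cooperation pays R per round; when S + T > 2R, perfect turn taking
# pays (S + T) / 2 per player per round, which is then the higher ceiling.
def f_max(S: float, T: float, R: float = 1.0) -> float:
    return (S + T) / 2 if S + T > 2 * R else R

def fitness_ratio(p_ave: float, S: float, T: float, R: float = 1.0) -> float:
    """System fitness divided by the ceiling, as plotted in Fig. 7."""
    return p_ave / f_max(S, T, R)

print(f_max(1.5, 1.5))               # 1.5: Leader/Hero region, turn taking optimal
print(fitness_ratio(0.9, 0.5, 1.2))  # 0.9: below-ceiling example
```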

3.3. Three kinds of reciprocity: R, ST and P

As introduced above, turn taking is a subtype of ST reciprocity, so studying the distribution of the R, ST and P ratios lays a foundation for the further study of turn taking. It can also help us understand the composition of cooperation.

Fig. 7. Fitness ratio on the extended S–T plane with different memory lengths, obtained by dividing the fitness in Fig. 5 by the F_max of Fig. 6.


In a 2-person, 2-strategy game, players choose to cooperate (C) or to defect (D), so there are three kinds of reciprocity between two players: R reciprocity (both choose C and get payoff R), P reciprocity (both choose D and get payoff P) and ST reciprocity (one chooses C and the other D; they get S and T). The cooperation ratio therefore consists of two parts: all of the R reciprocity and half of the ST reciprocity. These reciprocities have been studied before (Browning and Colman, 2004; Crowley, 2001; Tanimoto and Sagara, 2007). The relationship between the cooperation ratio and the ratios of R, ST and P can be described as follows:

r_c = r_R + r_ST / 2    (9)

r_d = r_P + r_ST / 2    (10)

r_ST = r_S + r_T = 2 r_S = 2 r_T    (11)

where r_c, r_d, r_R, r_S, r_T, r_P and r_ST are the ratios of cooperation, defection, and R, S, T, P and ST reciprocity, respectively. The results are shown in Fig. 8. When memory increases, ST reciprocity increases dramatically; when the memory length m ≥ 2, ST reciprocity mainly distributes in the area S + T > 2R. Memory's effect on R reciprocity is complex: in the area between SD and HG, R reciprocity decreases as memory increases, while in the area between PD and SH it increases.
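The decomposition in Eqs. (9)–(11) can be checked numerically from round outcomes; a toy sketch (the data are invented for illustration):

```python
from collections import Counter

# Decompose round outcomes into R, ST and P reciprocity and check Eq. (9):
# the cooperation ratio equals r_R plus half of r_ST.
rounds = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D"), ("C", "C")]  # toy data

counts = Counter("R" if a == b == "C" else "P" if a == b == "D" else "ST"
                 for a, b in rounds)
n = len(rounds)
r_R, r_ST, r_P = counts["R"] / n, counts["ST"] / n, counts["P"] / n

moves = [m for pair in rounds for m in pair]
r_c = moves.count("C") / len(moves)          # cooperation ratio over all moves
assert abs(r_c - (r_R + r_ST / 2)) < 1e-12   # Eq. (9) holds by construction
print(r_R, r_ST, r_P, r_c)                   # 0.4 0.4 0.2 0.6
```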

Fig. 8. Distribution of the R, ST and P ratios with different memory lengths on the extended S–T plane. Columns 1–3 show the distributions of R, ST and P, respectively; rows 1–4 correspond to memory lengths 0 to 3, respectively.


P reciprocity decreases monotonically as memory increases. When there is no memory, we found that even adding the ratios of CAD and DorC together, the sum is less than the ratio of ST reciprocity. This means that other, erratic ST-reciprocity histories (for example, 3-step histories such as 010110, 011010 or 101001) exist when agents have no memory.

4. Conclusions and discussion

We studied evolutionary dilemma games on the extended S–T plane with a memory mechanism. There are three main findings in this work:

1. When agents can remember 2-step histories or more, turn taking

(CAD) occupies the region S + T > 2R, S ≠ T on the extended S–T plane (which even includes part of the extended Prisoner's dilemma game and the Harmony game).
2. When agents can remember 2-step histories or more, the consistent strategy (DorC) occupies the line S + T > 2R, S = T on the extended S–T plane.
3. When agents can remember 2-step histories or more, the system achieves a high fitness ratio (>0.6) over all the dilemma games. This result shows that cooperation ratio and fitness are two different measures of an evolutionary game system.

The number of strategies is large compared with the number of agents N = 2000; for instance, with 2-step memory the number of strategies is 2^16 = 65536 (Table 3). This means many good strategies may never join the system's evolution. However, two factors help the system test more strategies than the number of agents: the high mutation rate (0.01 per bit of the strategy) when agents update their strategies, and the fact that all results are averaged over 100 samples.

We do not select agents proportionally to their fitness as the common genetic algorithm does; instead the neighbor is selected randomly, because in a well-mixed population some good strategies die out early if selection is proportional to fitness. We also ran experiments with the common genetic algorithm, and the results were not clear enough to support the conclusions.

It seems that 2-step memory can overcome the dilemma of the Snowdrift game and help agents obtain high payoffs. This result may help illustrate the turn taking phenomenon in biological systems; it may also give hints for man-made self-organized systems, such as self-organized networks (Liu et al., 2013, 2014), multi-robot systems, etc. As the Snowdrift game models the dilemma of resource competition and conflict, the study may also shed light on regional conflicts and wars around the world today.

Acknowledgements

We thank the two anonymous reviewers for their advice and the editor for kind help in polishing the paper. This work is supported by the Fundamental Research Funds for the Central Universities of China, and partly by the National Natural Science Foundation of China under Grants 61379110, 61379058, 61272062 and 61202289, and the Priority Development Areas Project 20120162130008 of the Chinese Ministry of Education.

References

Alonso-Sanz, R., 2009. Memory boosts cooperation in the structurally dynamic prisoner's dilemma. Int. J. Bifurcat. Chaos 19, 2899–2926.
Alonso-Sanz, R., Martin, M., 2006. Memory boosts cooperation. Int. J. Mod. Phys. C 17, 841–852.
Axelrod, R., 1984. The Evolution of Cooperation. Basic Books, New York.
Axelrod, R., 1987. The evolution of strategies in the iterated prisoner's dilemma. In: Davis, L. (Ed.), Genetic Algorithms and Simulated Annealing. Pitman, London.
Axelrod, R., Hamilton, W.D., 1981. The evolution of cooperation. Science 211, 1390–1396.
Browning, L., Colman, A.M., 2004. Evolution of coordinated alternating reciprocity in repeated dyadic games. J. Theor. Biol. 229, 549–557.
Burstedde, C., Klauck, K., Schadschneider, A., Zittartz, J., 2001. Simulation of pedestrian dynamics using a two-dimensional cellular automaton. Physica A 295, 507–525.
Chen, Z.-G., Wang, T., Xiao, D.-G., Xu, Y., 2013. Can remembering history from predecessor promote cooperation in the next generation? Chaos Soliton. Fract. 56, 59–68.
Colman, A.M., Browning, L., 2009a. Evolution of cooperative turn-taking. Evol. Ecol. Res. 11, 949–963.
Colman, A.M., Browning, L., 2009b. Evolution of cooperative turn-taking. Evol. Ecol. Res. 11, 949–963.
Crowley, P.H., 2001. Dangerous games and the emergence of social structure: evolving memory-based strategies for the generalized hawk–dove game. Behav. Ecol. 12, 753–760.
Crowley, P.H., Cottrell, T., Garcia, T., Hatch, M., Sargent, R.C., Stokes, B.J., White, J.M., 1998. Solving the complementarity dilemma: evolving strategies for simultaneous hermaphroditism. J. Theor. Biol. 195, 13–26.
Deng, X., Liu, Y., Chen, Z., 2010. Memory-based evolutionary game on small-world network with tunable heterogeneity. Physica A 389, 5173–5181.
Dussutour, A., Deneubourg, J.L., Fourcassie, V., 2005. Temporal organization of bidirectional traffic in the ant Lasius niger (L.). J. Exp. Biol. 208, 2903–2912.
Gintis, H., 2000. Game Theory Evolving. Princeton University Press, Princeton.
Hauert, C., Szabo, G., 2005. Game theory and physics. Am. J. Phys. 73, 405–414.
Helbing, D., Molnár, P., 1995. Social force model for pedestrian dynamics. Phys. Rev. E 51, 4282–4286.
Helbing, D., Farkas, I., Vicsek, T., 2000. Simulating dynamical features of escape panic. Nature 407, 487–490.
Helbing, D., Schonhof, M., Stark, H.U., Holyst, J.A., 2005. How individuals learn to take turns: emergence of alternating cooperation in a congestion game and the prisoner's dilemma. Adv. Complex Syst. 8, 87–116.
Hofbauer, J., Sigmund, K., 1998. Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge.
Imhof, L.A., Fudenberg, D., Nowak, M.A., 2007. Tit-for-tat or win-stay, lose-shift? J. Theor. Biol. 247, 574–580.
Jiang, R., Helbing, D., Shukla, P.K., Wu, Q.S., 2006. Inefficient emergent oscillations in intersecting driven many-particle flows. Physica A 368, 567–574.
Kohn, A., 1992. No Contest: The Case Against Competition. Houghton Mifflin Harcourt.
Lau, S.-H.P., Mui, V.-L., 2012. Using turn taking to achieve intertemporal cooperation and symmetry in infinitely repeated 2 × 2 games. Theor. Decis. 72, 167–188.
Liu, A., Jin, X., Cui, G., Chen, Z., 2013. Deployment guidelines for achieving maximum lifetime and avoiding energy holes in sensor network. Inf. Sci. 230, 197–226.
Liu, A., Zhang, D., Zhang, P., Cui, G., Chen, Z., 2014. On mitigating hotspots to maximize network lifetime in multi-hop wireless sensor network with guaranteed transport delay and reliability. Peer-to-Peer Netw. Appl. 7, 255–273.
Liu, Y., Li, Z., Chen, X., Wang, L., 2010. Memory-based prisoner's dilemma on square lattices. Physica A 389, 2390–2396.
Maynard Smith, J., 1982. Evolution and the Theory of Games. Cambridge University Press, Cambridge.
Maynard-Smith, J., Price, G.R., 1973. The logic of animal conflict. Nature 246, 15–18.
Nowak, M.A., 2006a. Evolutionary Dynamics. Harvard University Press, Cambridge, MA.
Nowak, M.A., 2006b. Five rules for the evolution of cooperation. Science 314, 1560–1563.
Nowak, M.A., May, R.M., 1992. Evolutionary games and spatial chaos. Nature 359, 826–829.
Nowak, M., Sigmund, K., 1993. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's dilemma game. Nature 364, 56–58.
Nowak, M., Tarnita, C., Antal, T., 2010. Evolutionary dynamics in structured populations. Phil. Trans. R. Soc. B 365, 19–30.
Perc, M., Szolnoki, A., 2010. Coevolutionary games – a mini review. BioSystems 99, 109–125.
Posch, M., 1999. Win-stay, lose-shift strategies for repeated games – memory length, aspiration levels and noise. J. Theor. Biol. 198, 183–195.
Qin, S.-M., Chen, Y., Zhao, X.-Y., Shi, J., 2008. Effect of memory on the prisoner's dilemma game in a square lattice. Phys. Rev. E 78.
Rand, D., Nowak, M., 2013. Human cooperation. Trends Cogn. Sci. 17, 413–425.
Rapoport, A., 1967. Exploiter, leader, hero, and martyr: the four archetypes of the 2 × 2 game. Behav. Sci. 12, 81–84.
Roca, C., Cuesta, J., Sanchez, A., 2009. Evolutionary game theory: temporal and spatial effects beyond replicator dynamics. Phys. Life Rev. 6, 208–249.
Sneppen, K., Dodd, I.B., Shearwin, K.E., Palmer, A.C., Schubert, R.A., Callen, B.P., Egan, J.B., 2005. A mathematical model for transcriptional interference by RNA polymerase traffic in Escherichia coli. J. Mol. Biol. 346, 399–409.
Steinbock, O., Lange, A., Rehberg, I., 1998. Density oscillator: analysis of flow dynamics and stability. Phys. Rev. Lett. 81, 798–801.
Szabó, G., Fáth, G., 2007. Evolutionary games on graphs. Phys. Rep. 446, 97–216.
Tanimoto, J., Sagara, H., 2007. A study on emergence of alternating reciprocity in a 2 × 2 game with 2-length memory strategy. BioSystems 90, 728–737.
Wakiyama, M., Tanimoto, J., 2011. Reciprocity phase in various 2 × 2 games by agents equipped with two-memory length strategy encouraged by grouping for interaction and adaptation. BioSystems 103, 93–104.
Wang, T., Chen, Z., Li, K., Deng, X., Li, D., 2014. Memory does not necessarily promote cooperation in dilemma games. Physica A 395, 218–227.
Wang, W., Ren, J., Chen, G., Wang, B., 2006. Memory-based snowdrift game on networks. Phys. Rev. E 74.
Wu, X.-l., Måløy, K.J., Hansen, A., Ammi, M., Bideau, D., 1993. Why hour glasses tick. Phys. Rev. Lett. 71, 1363–1366.