Evolutionary dynamics in division of labor games on cycle networks


Recommended by Prof. T Parisini

Journal Pre-proof

Chunyan Zhang, Qiaoyu Li, Yuying Zhu, Jianda Han, Jianlei Zhang

PII: S0947-3580(18)30267-X
DOI: https://doi.org/10.1016/j.ejcon.2019.11.002
Reference: EJCON 392

To appear in: European Journal of Control

Received date: 20 June 2018
Revised date: 1 November 2019
Accepted date: 9 November 2019

Please cite this article as: Chunyan Zhang, Qiaoyu Li, Yuying Zhu, Jianda Han, Jianlei Zhang, Evolutionary dynamics in division of labor games on cycle networks, European Journal of Control (2019), doi: https://doi.org/10.1016/j.ejcon.2019.11.002

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 European Control Association. Published by Elsevier Ltd. All rights reserved.

Evolutionary dynamics in division of labor games on cycle networks

Chunyan Zhang, Qiaoyu Li, Yuying Zhu, Jianda Han and Jianlei Zhang∗

College of Artificial Intelligence, Nankai University, Tianjin, China.

Abstract

This paper theoretically investigates the evolutionary dynamics of division of labor games played on cycle networks. Several typical updating processes based on the Moran process are considered: the Birth-Death (BD) updating process, the Death-Birth (DB) updating process and the mixed DB-BD updating process. An exponential payoff-to-fitness mapping is adopted, and analytical results for the fixation time and the fixation probability under strong selection are provided. In addition, numerical values of several crucial quantities are compared across the game scenarios determined by the payoff matrix of the division of labor (DOL) game. The results show that, even under different kinds of DOL games, the fixation times are closely associated with the specific update mechanism. This work provides a deeper look into the strategic interactions and network dynamics of DOL games based on self-organized task allocation.

Keywords: evolutionary stability; iterated games; network dynamics

Preprint submitted to Elsevier, November 22, 2019

1. Introduction

The mechanisms by which altruistic behavior among multi-agent systems has evolved and can be sustained have long been of interest to researchers across several disciplines. A traditional approach to analyzing the decision-making process of participants in a series of game-like interactions is evolutionary game theory (1–3). It provides a pertinent mathematical framework for describing the interactions between selfish individuals and the conflicts of interest involved (4–6). The definition of the Nash equilibrium for non-cooperative games is one of the most important landmarks; it points out that one or more equilibria may exist among selfish individuals during the interaction process (7; 8). In a Nash equilibrium, no individual can increase its earnings by unilaterally altering its strategy. Evolutionary game theory helps us to better understand the population dynamics driven by self-optimization and the consequent state updating of intelligent agents (9–11). In a large system with infinitely many individuals, the replicator equation is an often-used technique to characterize the evolutionary dynamics of the strategies involved (12–14). In a finite population, by contrast, the evolutionary game dynamics can be described by stochastic processes (15; 16), such as the pairwise comparison process (17), the Moran process (18), the Wright-Fisher process (19), etc. Here, one important determinant of the finite population dynamics is the fixation probability, i.e., the probability that one strategy eventually takes over the whole population. Another quantity of main concern is the fixation time: the time for the system to reach a certain absorbing state. According to the specific fixation condition, this time can be refined into the mean unconditional fixation time and the mean conditional fixation time (20; 21).

However, individuals are not always in such indistinguishable positions (well-mixed connections). In networked systems, the evolutionary dynamics is influenced by the underlying updating process (22–25), as well as by the connection pattern between individuals. Two updating processes often used in previous studies are the Birth-Death (BD) updating process (26; 27) and the Death-Birth (DB) updating process (28; 29). Under the BD process, an individual i is first chosen to reproduce with probability proportional to its fitness, and the offspring of i then replaces one of i's neighbors. Under the DB process, an individual i is randomly chosen to die, and i's neighbors then compete to take over the vacant position. Notably, the mixed BD-DB updating process is also employed in this study to further compare the effects of the two update rules (30). Many theoretical models, characterized by a payoff matrix, are employed to describe the conflicts of interest between individuals and groups, such as the prisoner's dilemma game (31), the public goods game (32; 33), etc.
Division of labor (DOL), where a crowd of individuals performs tasks collaboratively, is a common phenomenon in nature (34–36). In human society in particular, the division of labor, which is usually deemed a key factor in business success, is an important symbol of human civilization. Even in insect societies, ants and other insects show a steady and efficient division of labor through the self-organization of their members, in which an unequal distribution of effort among particular tasks may exist and conflicts of interest are created (37; 38). From the perspective of evolutionary game theory, DOL can be seen as a special form of cooperative dilemma among agents who pursue the maximization of personal benefits. Realizing a highly effective division of labor, on the premise that the tasks undertaken by the whole group succeed, has wide applications and is of great significance for studying the development of human society. In this paper, we model the individuals' decision-making process with the division of labor game, in which a synergistic benefit is considered. From the perspective of game theory, an efficient social division of labor can promote the common payoff of the whole group, even if some agents' interests are not optimal at the individual level. Our work can help to build effective mechanisms for individual decision making that lead to a high-quality task allocation in the group.

Different from the traditional approach employing threshold-based models (39), here we treat task allocation as a cooperative dilemma and study the evolutionary dynamics of the strategies adopted by agents situated on cycle networks. A cycle here refers to a regular graph in which each individual has exactly two neighbours (k = 2) (40; 41). In the division of labor game, each player can freely choose one of two tasks to perform, denoted by task A and task B. Individuals who complete a task receive the corresponding benefit, while the effort required to complete the task is counted as a cost. In addition, we need to consider the updating rule, which has a crucial impact on the population dynamics. Discussions of three scenarios on cycle networks (the Birth-Death (BD), the Death-Birth (DB) and the mixed BD-DB updating processes) indicate that fixation of individuals' decisions is much quicker in the BD process than in the DB process. Moreover, the type of division of labor game (dominance game or coexistence game) has little impact on the fixation times. Our findings are expected to provide a theoretical basis for exploring the influence of different updating processes on the transition probabilities and the mean fixation times in individuals' decision-making processes.

This paper is organized as follows. In Section 2, the model of the division of labor (DOL) game is provided and the transition probabilities under different updating rules are derived. Section 3 summarizes the main results: the fixation probability and the fixation times under neutral selection and strong selection. In the final part, the main conclusions of our work are provided, together with relevant explanations.

2. Material and methods

2.1. Preliminaries

This section includes the basic knowledge needed to study the evolutionary dynamics of multi-agent systems in the context of the DOL game, in which players have two strategy choices.
To achieve this goal, we calculate the probability that a single mutant individual takes over the whole system (the fixation probability). In addition, the mean unconditional fixation time and the mean conditional fixation time are introduced, which help to show the efficiency of the DOL game in reducing the fixation time of a beneficial strategy. Consider an evolutionary process with two available strategies in the context of the DOL game: A (performing task A) and B (performing task B). At each time, the players choose one of the tasks as their current strategy based on the payoffs. Once the task has been finished (the game is played with the chosen strategy), the individuals collect the benefits and bear the corresponding costs of the chosen tasks. For simplicity and without loss of generality, we set the benefit to $b$ and the costs to $c_A$ and $c_B$. Parameter $b$ represents the benefit shared by all players if the task is finished. When both tasks are performed in the system, a synergistic benefit β (β > 0), which reflects a reasonable task assignment, comes into being. These settings can be described by the following payoff matrix, in which the row gives the focal player's strategy and the column the co-player's strategy:

$$
\begin{array}{c|cc}
 & A & B \\ \hline
A & b - c_A & b - c_A + \beta \\
B & b - c_B + \beta & b - c_B
\end{array}
\tag{1}
$$

Setting $b - c_A = \alpha$ and $b - c_B = 0$, the simplified payoff matrix is

$$
\begin{array}{c|cc}
 & A & B \\ \hline
A & \alpha & \alpha + \beta \\
B & \beta & 0
\end{array}
\tag{2}
$$

Note that $\alpha = c_B - c_A$, i.e., parameter α denotes the cost gap between task B and task A. A collective dilemma appears in the division of labor game if α + β > 0. Considering the interaction structure of the players provides a better description of games in real social systems. In this sense, much research has been devoted to evolutionary game dynamics in spatial environments (42–44). In a structured population, the focal agent collects payoffs after playing games with all its neighbors. The neighbors are thus involved in the game process and affect the transition probabilities. Hence, the strategy distribution of the neighbors has to be determined at every time step, which makes it harder to study strategy dynamics on networks. The updating rules considered for describing the evolutionary dynamics are: (1) the BD updating rule, where an individual i is chosen, with probability proportional to i's fitness relative to the whole population, to generate an offspring with an identical strategy; the offspring then replaces a randomly chosen neighbor of i. (2) The DB updating rule, where a randomly chosen individual dies and the vacant spot is filled by the offspring of one of its neighbors, selected with probability proportional to fitness. First, we consider the most fundamental case: in a multi-agent system, the nodes are randomly connected, and each one has the same number of neighbors, denoted by k. We then employ the Moran process to model the evolution of the division of labor game in finite populations, and calculate the transition probabilities under the different update rules. Note that we study discrete-time dynamics: during each time step, there is one birth and one death in a population of fixed size N, consisting of i As and N − i Bs. We assume that the strategy evolution starts with a single mutant A in the system.
The degree of each individual is k. On average, each A-player has $\frac{i-1}{N-1}k$ A-neighbors and $\frac{N-i}{N-1}k$ B-neighbors, while each B-player has $\frac{i}{N-1}k$ A-neighbors and $\frac{N-i-1}{N-1}k$ B-neighbors. The expected payoff of an A-player is $\pi_A = \alpha\frac{i-1}{N-1}k + (\alpha+\beta)\frac{N-i}{N-1}k$, and that of a B-player is $\pi_B = \beta\frac{i}{N-1}k$. Notably, an exponential payoff-to-fitness mapping is adopted here, in the form $f_A = e^{\omega\pi_A}$ and $f_B = e^{\omega\pi_B}$, where ω, named the intensity of selection, accounts for the contribution of the game's payoff to the fitness of players (45). In particular, at zero selection intensity, the game payoff has no effect on one's fitness, while if ω is extremely large (ω → ∞), the payoff greatly affects individuals' fitness. Denote the probability that the number of As increases from i to i + 1 by $T^{i+}$, and the probability that it decreases from i to i − 1 by $T^{i-}$. For the BD updating process, the transition probabilities are

$$
T^{i+}_{BD} = \frac{i f_A}{i f_A + (N-i) f_B}\,\frac{N-i}{N-1},
\qquad
T^{i-}_{BD} = \frac{(N-i) f_B}{i f_A + (N-i) f_B}\,\frac{i}{N-1}.
\tag{3}
$$

For the DB updating process, the transition probabilities are given by

$$
T^{i+}_{DB} = \frac{N-i}{N}\,\frac{i f_A}{i f_A + (N-i-1) f_B},
\qquad
T^{i-}_{DB} = \frac{i}{N}\,\frac{(N-i) f_B}{(i-1) f_A + (N-i) f_B}.
\tag{4}
$$
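As a rough numerical illustration of Eqs. (3)-(4), the following Python sketch evaluates the mean-field transition probabilities for a given number i of A-players. The function name `mean_field_transitions` and its argument order are illustrative choices, not part of the original model description.

```python
import math

def mean_field_transitions(i, N, k, alpha, beta, omega):
    """Return (T_BD+, T_BD-, T_DB+, T_DB-) for a state with 0 < i < N A-players.

    Sketch of Eqs. (3)-(4): average payoffs on a random regular graph of
    degree k, mapped to fitness by f = exp(omega * payoff).
    """
    pi_A = (alpha * (i - 1) + (alpha + beta) * (N - i)) * k / (N - 1)
    pi_B = beta * i * k / (N - 1)
    f_A = math.exp(omega * pi_A)
    f_B = math.exp(omega * pi_B)

    total = i * f_A + (N - i) * f_B
    # BD: fitness-proportional birth, then a random neighbor is replaced.
    bd_plus = i * f_A / total * (N - i) / (N - 1)
    bd_minus = (N - i) * f_B / total * i / (N - 1)
    # DB: random death, then the neighbors compete proportionally to fitness.
    db_plus = (N - i) / N * i * f_A / (i * f_A + (N - i - 1) * f_B)
    db_minus = i / N * (N - i) * f_B / ((i - 1) * f_A + (N - i) * f_B)
    return bd_plus, bd_minus, db_plus, db_minus

print(mean_field_transitions(3, 10, 2, 1.0, 2.0, 0.0))
```

At ω = 0 both fitness values equal 1, so all four probabilities reduce to i(N − i)/(N(N − 1)).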

Given all above transition probabilities, the fixation probability $\phi^i$ will be

$$
\phi^i = \frac{1 + \sum_{k=1}^{i-1} \prod_{l=1}^{k} \frac{T^{l-}}{T^{l+}}}{1 + \sum_{k=1}^{N-1} \prod_{l=1}^{k} \frac{T^{l-}}{T^{l+}}},
\tag{5}
$$

which refers to the probability that i initial A-players take over the population. To characterize the expected time, we consider two important quantities. The first is the mean unconditional fixation time, which describes the expected time for the population to reach one of the two absorbing states, all A or all B. The second is the mean conditional fixation time, which describes the expected time to reach the absorbing state all A, given that this state is reached. Both expected times can be expressed through sojourn times (40): $t^{i,j}$ is the expected time that the process spends in the state with j A-players when starting from i A-players. Starting from a single mutant A, the sojourn time is

$$
t^{1,j} = \frac{\phi^{1,j}}{T^{j+}\left(1-\phi^{j+1,j}\right) + T^{j-}\left(1-\phi^{j-1,j}\right)}.
\tag{6}
$$

Here, $\phi^{i,j}$ denotes the probability that the process, started in state i, ever visits state j. The mean unconditional fixation time is given by the sum over all sojourn times (46),

$$
t^{1} = \sum_{j=1}^{N-1} \frac{\phi^{1,j}}{T^{j+}\left(1-\phi^{j+1,j}\right) + T^{j-}\left(1-\phi^{j-1,j}\right)}.
\tag{7}
$$

The mean conditional fixation time is

$$
t^{1\to N} = \sum_{j=1}^{N-1} \frac{\phi^{j}}{\phi^{1}}\,\frac{\phi^{1,j}}{T^{j+}\left(1-\phi^{j+1,j}\right) + T^{j-}\left(1-\phi^{j-1,j}\right)},
\tag{8}
$$

where

$$
\phi^{j+1,j} = \frac{\sum_{k=j+1}^{N-1} \prod_{m=j+1}^{k} \frac{T^{m-}}{T^{m+}}}{\sum_{k=j}^{N-1} \prod_{m=j+1}^{k} \frac{T^{m-}}{T^{m+}}}
\qquad\text{and}\qquad
\phi^{j-1,j} = \frac{\sum_{k=0}^{j-2} \prod_{m=1}^{k} \frac{T^{m-}}{T^{m+}}}{\sum_{k=0}^{j-1} \prod_{m=1}^{k} \frac{T^{m-}}{T^{m+}}}.
$$

Please refer to (47) for the calculation details.

Next, we study the evolutionary dynamics of the division of labor game on cycle networks. Here, the interaction among agents under a given network structure is a key point and should not be overlooked. Studies based on complex networks provide an appropriate framework for describing individual interactions in multi-agent systems. However, mathematical modeling and theoretical analysis of evolutionary dynamics in the framework of division of labor on complex networks remain complicated. We therefore begin with the simplest case: regular networks, which have relatively simple structures and among which the cycle network is a representative one (29; 40). The analysis of the stochastic evolutionary game dynamics is carried out on the cycle graph, where the degree of each node is 2. In addition to the BD and DB updating rules, the effect of the mixed BD-DB update rule on the evolutionary dynamics is also investigated. To this end, a parameter δ ∈ [0, 1] is introduced to interpolate between the updating rules: in each step, the DB updating rule is employed with probability δ and the BD updating rule with probability 1 − δ. Obviously, δ = 1 recovers the DB updating rule, whereas δ = 0 leads to the BD updating rule.

2.2. Birth-Death (BD) updating process on a cycle

On the cycle, all individuals play games against their two neighbors and are updated according to the BD rule. Under the BD updating process, the number of As increases by one when an A-player reproduces and its offspring replaces a B. Correspondingly, the number of As decreases by one if a B-player generates an offspring that replaces an A. During the evolution process, the number of As (or Bs) can change only at the boundary between the two strategies, so that the strategies (As and Bs) grow as clusters on the cycle.
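Following the common way of modeling strategy dynamics of this form, the transition probabilities are given by

$$
T^{i+}_{BD} =
\begin{cases}
\dfrac{e^{2\omega(\alpha+\beta)}}{e^{2\omega(\alpha+\beta)} + 2e^{\omega\beta} + (N-3)} & i = 1 \\[2ex]
\dfrac{e^{\omega(2\alpha+\beta)}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)} & 1 < i < N-1 \\[2ex]
\dfrac{e^{\omega(2\alpha+\beta)}}{2e^{\omega(2\alpha+\beta)} + (N-3)e^{2\omega\alpha} + e^{2\omega\beta}} & i = N-1 \\[2ex]
0 & i = 0, N
\end{cases}
\tag{9}
$$

and

$$
T^{i-}_{BD} =
\begin{cases}
\dfrac{e^{\omega\beta}}{e^{2\omega(\alpha+\beta)} + 2e^{\omega\beta} + (N-3)} & i = 1 \\[2ex]
\dfrac{e^{\omega\beta}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)} & 1 < i < N-1 \\[2ex]
\dfrac{e^{2\omega\beta}}{2e^{\omega(2\alpha+\beta)} + (N-3)e^{2\omega\alpha} + e^{2\omega\beta}} & i = N-1 \\[2ex]
0 & i = 0, N.
\end{cases}
\tag{10}
$$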

Given the above, the transition probabilities have to be discussed in four different cases according to the range of i. In the BD updating process, the transition probabilities are relatively complicated and are expressed through a set of exponential functions as a result of the exponential payoff-to-fitness mapping.

2.3. Death-Birth (DB) updating process on a cycle

For the DB updating process, an individual is first randomly selected to be removed, and then its neighbors compete for the empty spot. The number of As changes only when the random death occurs at the boundary between the A and B clusters. Population states with merely one A (or one B) are special: when the single A (or B) dies, both of its neighbors hold the opposite strategy, so there is no strategy competition and the corresponding transition probability is 1/N. In the remaining cases, one of the two boundary individuals is chosen for random death with probability 2/N. The two neighbors enclosing the dying node then compete, based on their fitness, to occupy the empty spot. The transition probabilities take the following form:

$$
T^{i+}_{DB} =
\begin{cases}
\dfrac{2}{N}\,\dfrac{1}{1+e^{-2\omega(\alpha+\beta)}} & i = 1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{-\omega(2\alpha+\beta)}} & 1 < i < N-2 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{-2\omega\alpha}} & i = N-2 \\[2ex]
\dfrac{1}{N} & i = N-1 \\[2ex]
0 & i = 0, N
\end{cases}
\tag{11}
$$

and

$$
T^{i-}_{DB} =
\begin{cases}
\dfrac{1}{N} & i = 1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{2\omega\alpha}} & i = 2 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{\omega(2\alpha-\beta)}} & 2 < i < N-1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{2\omega(\alpha-\beta)}} & i = N-1 \\[2ex]
0 & i = 0, N.
\end{cases}
\tag{12}
$$
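The case distinctions in Eqs. (9)-(12) can be sketched in code as follows. This is a minimal illustration, assuming a single contiguous cluster of i A-players and N ≥ 5 so that the special states i = 2 and i = N − 2 are distinct; the function name `cycle_transitions` and the helper lambdas are hypothetical.

```python
import math

def cycle_transitions(i, N, alpha, beta, omega):
    """Return (T_BD+, T_BD-, T_DB+, T_DB-) on a cycle, following Eqs. (9)-(12).

    Assumes one contiguous A cluster of size i and N >= 5 (illustrative sketch).
    """
    if i == 0 or i == N:
        return 0.0, 0.0, 0.0, 0.0
    e = lambda x: math.exp(omega * x)                     # fitness for payoff x
    win = lambda x: 1.0 / (1.0 + math.exp(-omega * x))    # winner has payoff lead x

    if i == 1:
        total = e(2 * (alpha + beta)) + 2 * e(beta) + (N - 3)
        bd_plus, bd_minus = e(2 * (alpha + beta)) / total, e(beta) / total
        db_plus, db_minus = 2 / N * win(2 * (alpha + beta)), 1 / N
    elif i == N - 1:
        total = 2 * e(2 * alpha + beta) + (N - 3) * e(2 * alpha) + e(2 * beta)
        bd_plus, bd_minus = e(2 * alpha + beta) / total, e(2 * beta) / total
        db_plus, db_minus = 1 / N, 2 / N * win(-2 * (alpha - beta))
    else:
        total = (2 * e(2 * alpha + beta) + (i - 2) * e(2 * alpha)
                 + 2 * e(beta) + (N - i - 2))
        bd_plus, bd_minus = e(2 * alpha + beta) / total, e(beta) / total
        db_plus = 2 / N * win(2 * alpha if i == N - 2 else 2 * alpha + beta)
        db_minus = 2 / N * win(-2 * alpha if i == 2 else -(2 * alpha - beta))
    return bd_plus, bd_minus, db_plus, db_minus

print(cycle_transitions(4, 10, 6.0, 2.0, 10.0))
```

For very large ω the exponentials can overflow a floating-point number, so a production version would work with payoff differences or logarithms instead.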

2.4. Mixed DB-BD updating process on a cycle

The mixed DB-BD updating process refers to the case where, at each step, the DB rule is used with probability δ and the BD rule with probability 1 − δ. The transition probabilities $T^{i+}_{Mix}$ and $T^{i-}_{Mix}$ for the number of As to increase or decrease by one are

$$
T^{i+}_{Mix} = T^{i+}_{DB}\,\delta + T^{i+}_{BD}\,(1-\delta)
\tag{13}
$$

and

$$
T^{i-}_{Mix} = T^{i-}_{DB}\,\delta + T^{i-}_{BD}\,(1-\delta).
\tag{14}
$$

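On the basis of the transition probabilities of the BD and DB update processes, the probabilities $T^{i+}_{Mix}$ and $T^{i-}_{Mix}$ of the mixed BD-DB updating process are obtained as

$$
T^{i+}_{Mix} =
\begin{cases}
\dfrac{2}{N}\,\dfrac{1}{1+e^{-2\omega(\alpha+\beta)}}\,\delta + \dfrac{e^{2\omega(\alpha+\beta)}}{e^{2\omega(\alpha+\beta)} + 2e^{\omega\beta} + (N-3)}\,(1-\delta) & i = 1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{-\omega(2\alpha+\beta)}}\,\delta + \dfrac{e^{\omega(2\alpha+\beta)}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)}\,(1-\delta) & 1 < i < N-2 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{-2\omega\alpha}}\,\delta + \dfrac{e^{\omega(2\alpha+\beta)}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)}\,(1-\delta) & i = N-2 \\[2ex]
\dfrac{1}{N}\,\delta + \dfrac{e^{\omega(2\alpha+\beta)}}{2e^{\omega(2\alpha+\beta)} + (N-3)e^{2\omega\alpha} + e^{2\omega\beta}}\,(1-\delta) & i = N-1 \\[2ex]
0 & i = 0, N
\end{cases}
\tag{15}
$$

$$
T^{i-}_{Mix} =
\begin{cases}
\dfrac{1}{N}\,\delta + \dfrac{e^{\omega\beta}}{e^{2\omega(\alpha+\beta)} + 2e^{\omega\beta} + (N-3)}\,(1-\delta) & i = 1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{2\omega\alpha}}\,\delta + \dfrac{e^{\omega\beta}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)}\,(1-\delta) & i = 2 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{\omega(2\alpha-\beta)}}\,\delta + \dfrac{e^{\omega\beta}}{2e^{\omega(2\alpha+\beta)} + (i-2)e^{2\omega\alpha} + 2e^{\omega\beta} + (N-i-2)}\,(1-\delta) & 2 < i < N-1 \\[2ex]
\dfrac{2}{N}\,\dfrac{1}{1+e^{2\omega(\alpha-\beta)}}\,\delta + \dfrac{e^{2\omega\beta}}{2e^{\omega(2\alpha+\beta)} + (N-3)e^{2\omega\alpha} + e^{2\omega\beta}}\,(1-\delta) & i = N-1 \\[2ex]
0 & i = 0, N
\end{cases}
\tag{16}
$$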
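As a quick consistency check (added here for illustration, not part of the original derivation), setting ω = 0 in Eq. (15) removes all dependence on δ. For the cases i = 1 and 1 < i < N − 2:

```latex
\begin{aligned}
T^{1+}_{Mix}\Big|_{\omega=0}
  &= \frac{2}{N}\cdot\frac{1}{2}\,\delta + \frac{1}{1 + 2 + (N-3)}\,(1-\delta)
   = \frac{\delta}{N} + \frac{1-\delta}{N} = \frac{1}{N}, \\
T^{i+}_{Mix}\Big|_{\omega=0}
  &= \frac{2}{N}\cdot\frac{1}{2}\,\delta + \frac{1}{2 + (i-2) + 2 + (N-i-2)}\,(1-\delta)
   = \frac{\delta}{N} + \frac{1-\delta}{N} = \frac{1}{N}.
\end{aligned}
```

The remaining cases of Eqs. (15)-(16) reduce to 1/N in the same way, which is the neutral-selection value used in Section 3.1.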

3. Results

3.1. Neutral selection

As the simplest case, we consider neutral selection, ω = 0, under which the transition probabilities are independent of both the number of As and the specific updating process. In this case, $T^{i+} = T^{i-} = 1/N$ on the cycle network. Under any of the above three updating rules, we get

$$
\phi^{i}_{BD} = \phi^{i}_{DB} = \phi^{i}_{Mix} = \frac{i}{N}.
\tag{17}
$$

Clearly, the probability of fixation in the all-A state depends on the initial number of As. The mean unconditional fixation time starting from one A in the system is

$$
t^{1}_{BD} = t^{1}_{DB} = t^{1}_{Mix} = \frac{1}{2}(N-1)N.
\tag{18}
$$

Similarly, the mean conditional fixation time for neutral selection can be expressed as

$$
t^{1\to N}_{BD} = t^{1\to N}_{DB} = t^{1\to N}_{Mix} = \frac{N(N-1)(N+1)}{6}.
\tag{19}
$$

Under neutral selection, the mean unconditional fixation time and the mean conditional fixation time scale as $N^2$ and $N^3$, respectively. To explore the difference between neutral selection and strong selection, we compare the corresponding fixation times in the following sections. It is worth noting the parameter settings for the payoff matrix, β > 0 and α + β > 0, which are vital in the subsequent analysis.
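The neutral results in Eqs. (17)-(19) can be reproduced numerically from the general expressions in Eqs. (5)-(8). The sketch below is one possible implementation with exact rational arithmetic; the function name `fixation_quantities` and the list layout (index 0 unused) are assumptions made for this example. It requires all $T^{j+}$ and $T^{j-}$ to be strictly positive, so it does not apply directly to strong-selection limits in which some transition probabilities vanish.

```python
from fractions import Fraction

def fixation_quantities(t_plus, t_minus):
    """Return (phi^1, t^1, t^{1->N}) from Eqs. (5)-(8).

    t_plus[j], t_minus[j] hold T^{j+}, T^{j-} for j = 1..N-1 (index 0 unused).
    All entries for j >= 1 must be strictly positive.
    """
    N = len(t_plus)
    # P[k] = prod_{m=1}^{k} T^{m-}/T^{m+}, with the empty product P[0] = 1.
    P = [Fraction(1)]
    for m in range(1, N):
        P.append(P[-1] * t_minus[m] / t_plus[m])
    # S[j] = sum_{k=0}^{j-1} P[k], so that phi^j = S[j] / S[N] (Eq. (5)).
    S = [Fraction(0)]
    for k in range(N):
        S.append(S[-1] + P[k])
    phi = [S[j] / S[N] for j in range(N + 1)]

    t_uncond = Fraction(0)
    t_cond = Fraction(0)
    for j in range(1, N):
        phi_1j = 1 / S[j]                       # from 1, visit j before extinction
        phi_up = 1 - P[j] / (S[N] - S[j])       # phi^{j+1,j}
        phi_down = S[j - 1] / S[j]              # phi^{j-1,j}
        sojourn = phi_1j / (t_plus[j] * (1 - phi_up) + t_minus[j] * (1 - phi_down))
        t_uncond += sojourn                     # Eq. (7)
        t_cond += phi[j] / phi[1] * sojourn     # Eq. (8)
    return phi[1], t_uncond, t_cond

N = 10
neutral = [None] + [Fraction(1, N)] * (N - 1)   # T^{i+} = T^{i-} = 1/N
print(fixation_quantities(neutral, neutral))
```

For N = 10 this yields 1/10, 45 and 165, in agreement with i/N, N(N − 1)/2 and N(N − 1)(N + 1)/6.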

3.2. Birth-Death (BD) updating process under strong selection

Strong selection means that the payoff from the game makes a large contribution to fitness. By reducing the randomness, results under strong selection help us to study how the type of division of labor game and the spatial structure influence the fixation times. The intuition is that, in most cases, an advantageous strategy takes over quickly, while fixation in a game with coexisting strategies takes long. To examine this, we carry out the following analysis. Transition probabilities and fixation times are provided under the different updating rules on cycle networks. We first focus on the BD updating process, whose transition probabilities for any selection intensity are given in Eqs. (9)-(10). Analytical solutions under strong selection are derived in the following. In the limiting case ω → ∞, the transition probabilities are

$$
\begin{aligned}
T^{1+}_{BD} &\to
\begin{cases}
1 & 2\alpha+\beta > 0 \\
0 & 2\alpha+\beta < 0
\end{cases} \\
T^{i+}_{BD} &\to
\begin{cases}
\frac{1}{2} & \alpha > 0,\ 2\alpha+\beta > 0 \\
0 & \alpha < 0 \text{ or } 2\alpha+\beta < 0
\end{cases} \\
T^{(N-1)+}_{BD} &\to
\begin{cases}
\frac{1}{2} & 2\alpha-\beta > 0 \\
0 & 2\alpha-\beta < 0
\end{cases}
\end{aligned}
\tag{20}
$$

and

$$
\begin{aligned}
T^{1-}_{BD} &\to
\begin{cases}
\frac{1}{2} & 2\alpha+\beta < 0 \\
0 & 2\alpha+\beta > 0
\end{cases} \\
T^{i-}_{BD} &\to
\begin{cases}
\frac{1}{2} & 2\alpha-\beta < 0,\ \alpha < 0 \\
0 & 2\alpha-\beta > 0 \text{ or } \alpha > 0
\end{cases} \\
T^{(N-1)-}_{BD} &\to
\begin{cases}
1 & \alpha < \beta,\ 2\alpha-\beta < 0 \\
0 & \alpha > \beta \text{ or } 2\alpha-\beta > 0.
\end{cases}
\end{aligned}
\tag{21}
$$

Given the above transition probabilities, the last transition before the all-A or all-B state is completely determined under certain conditions. Since the birth step comes first and selects the fittest individual in the population, and the death step then selects one of its two neighbors, the transition probabilities are independent of the population size. It is important to examine the transition probabilities under different division of labor games, which can be divided into two cases according to the parameter settings: games dominated by strategy A and coexistence games. The detailed analysis is as follows.

Scenario 1: The parameter settings β > 0 and α + β > 0 are used for a reasonable description of the cooperative dilemma. If α > β, the payoff matrix of the DOL game describes strategic dominance: no matter what the other player does, task A is the best choice. The fixation of As with non-zero probability in the BD updating process requires α > β in this case. Together with β > 0, the condition α > β gives α > 0, 2α + β > 0 and 2α − β > 0. As a result, we get the following transition probabilities and sojourn times:

$$
\begin{array}{c|ccccccc}
i & 1 & 2 & \cdots & i & \cdots & N-2 & N-1 \\ \hline
T^{i+}_{BD} & 1 & \frac{1}{2} & \cdots & \frac{1}{2} & \cdots & \frac{1}{2} & \frac{1}{2} \\
T^{i-}_{BD} & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 \\
t^{1,i}_{BD} & 1 & 2 & \cdots & 2 & \cdots & 2 & 2
\end{array}
\tag{22}
$$

Here, all of the fixation probabilities equal 1. The fixation times can be calculated from Eqs. (7)-(8), which leads to

$$
t^{1}_{BD} \to 2N-3
\tag{23}
$$

and

$$
t^{1\to N}_{BD} \to 2N-3,
\tag{24}
$$

so the mean unconditional fixation time and the mean conditional fixation time are equal in this case.

Scenario 2: In coexistence games, it is better for an individual to interact with an opponent using a different strategy, so A is the best reply to B and B is the best reply to A. Fixation of As with non-zero probability in the BD updating process requires 0 < α < β and 2α − β > 0, which together lead to the same transition probabilities, sojourn times, fixation probabilities and fixation times as derived for the strategic dominance games (Eqs. (22)-(24)). Compared with the games with a dominant strategy, the fixation times in the coexistence games thus show the same scaling of order N. Under the BD updating rule, in both the dominance game and the coexistence game, the fixation probability can reach 1, which favors the players' choice of task A. Under the BD updating process, the fixation times under strong selection scale with a lower order of N than in the neutral case.

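3.3. Death-Birth (DB) updating process under strong selection

Under the DB updating process, we evaluate the transition probabilities on a cycle network in the limit ω → ∞. With the aid of Eqs. (11)-(12), the transition probabilities to increase the number of As are

$$
\begin{aligned}
T^{1+}_{DB} &\to \frac{2}{N} \\
T^{i+}_{DB} &\to
\begin{cases}
\frac{2}{N} & 2\alpha+\beta > 0 \\
0 & 2\alpha+\beta < 0
\end{cases} \\
T^{(N-2)+}_{DB} &\to
\begin{cases}
\frac{2}{N} & \alpha > 0 \\
0 & \alpha < 0
\end{cases} \\
T^{(N-1)+}_{DB} &\to \frac{1}{N}.
\end{aligned}
\tag{25}
$$

When α + β > 0, $T^{1+}_{DB}$ converges to 2/N, and no other constraint is required in this case. The transition probabilities to decrease the number of As are

$$
\begin{aligned}
T^{1-}_{DB} &\to \frac{1}{N} \\
T^{2-}_{DB} &\to
\begin{cases}
\frac{2}{N} & \alpha < 0 \\
0 & \alpha > 0
\end{cases} \\
T^{i-}_{DB} &\to
\begin{cases}
\frac{2}{N} & 2\alpha-\beta < 0 \\
0 & 2\alpha-\beta > 0
\end{cases} \\
T^{(N-1)-}_{DB} &\to
\begin{cases}
\frac{2}{N} & \alpha < \beta \\
0 & \alpha > \beta.
\end{cases}
\end{aligned}
\tag{26}
$$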

Apparently, the transition probabilities in the DB updating process depend on the population size, since death happens ahead of birth.

Scenario 1: When α > β, fixation of As occurs with non-zero probability in the division of labor game with A dominance. In this case, we obtain the transition probabilities and sojourn times

$$
\begin{array}{c|ccccccc}
i & 1 & 2 & \cdots & i & \cdots & N-2 & N-1 \\ \hline
T^{i+}_{DB} & \frac{2}{N} & \frac{2}{N} & \cdots & \frac{2}{N} & \cdots & \frac{2}{N} & \frac{1}{N} \\
T^{i-}_{DB} & \frac{1}{N} & 0 & \cdots & 0 & \cdots & 0 & 0 \\
t^{1,i}_{DB} & \frac{N}{3} & \frac{N}{3} & \cdots & \frac{N}{3} & \cdots & \frac{N}{3} & \frac{2N}{3}
\end{array}
\tag{27}
$$

The fixation probability for a single A is 2/3, and it equals 1 whenever the number of As is larger than 1. According to Eqs. (7)-(8), we get

$$
t^{1}_{DB} \to \frac{N^2}{3}
\tag{28}
$$

and

$$
t^{1\to N}_{DB} \to \frac{N^2}{2} - \frac{N}{6}.
\tag{29}
$$

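In both cases, the leading-order term is quadratic ($\sim N^2$).

Scenario 2: When α < β, the payoff matrix (2) indicates that the division of labor game is classified as a coexistence game. Fixation of strategy A with non-zero probability requires both 0 < α < β and 2α − β > 0. The fixation probabilities remain the same as in the previous scenario.

$$
\begin{array}{c|ccccccc}
i & 1 & 2 & \cdots & i & \cdots & N-2 & N-1 \\ \hline
T^{i+}_{DB} & \frac{2}{N} & \frac{2}{N} & \cdots & \frac{2}{N} & \cdots & \frac{2}{N} & \frac{1}{N} \\
T^{i-}_{DB} & \frac{1}{N} & 0 & \cdots & 0 & \cdots & 0 & \frac{2}{N} \\
t^{1,i}_{DB} & \frac{N}{3} & \frac{N}{3} & \cdots & \frac{N}{3} & \cdots & N & \frac{2N}{3}
\end{array}
\tag{30}
$$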

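The fixation times become

$$
t^{1}_{DB} \to \frac{N(N+2)}{3}
\tag{31}
$$

and

$$
t^{1\to N}_{DB} \to \frac{N(3N+5)}{6}.
\tag{32}
$$

In particular, 0 < α < β together with 2α − β < 0 leads to another form of coexistence game, for which we get

$$
\begin{array}{c|ccccccc}
i & 1 & 2 & \cdots & i & \cdots & N-2 & N-1 \\ \hline
T^{i+}_{DB} & \frac{2}{N} & \frac{2}{N} & \cdots & \frac{2}{N} & \cdots & \frac{2}{N} & \frac{1}{N} \\
T^{i-}_{DB} & \frac{1}{N} & 0 & \cdots & \frac{2}{N} & \cdots & \frac{2}{N} & \frac{2}{N} \\
t^{1,i}_{DB} & \frac{N}{3} & \frac{N(N-1)}{3} & \cdots & \frac{N(N-i+1)}{3} & \cdots & N & \frac{2N}{3}
\end{array}
\tag{33}
$$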

In this case, the mean unconditional fixation time is obtained by summing up all the sojourn times,

$$
t^{1}_{DB} \to \frac{N^2(N-1)}{6}.
\tag{34}
$$

The mean conditional fixation time converges to

$$
t^{1\to N}_{DB} \to \frac{3N^2(N-1) - 2N}{12}.
\tag{35}
$$

Here, the leading order of the fixation time is $N^2$ in the first form of the coexistence game, while it is $N^3$ in the second. The difference is clearly visible in Eq. (30) and Eq. (33): in the second form, the number of As can decrease from an intermediate state i, which leads to a longer fixation time. In general, under strong selection, fixation in the DB updating process occurs more slowly than in the BD updating process, as can be seen from the order of magnitude of the fixation times.
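The strong-selection limits in tables (22), (27) and (33) can be cross-checked numerically. The sketch below does not use the sojourn-time formulas of Eqs. (6)-(8); instead, as a substitute technique, it solves the first-step equations of the absorbing birth-death chain as a tridiagonal linear system, which also works when some $T^{i-}$ vanish. The names `solve_chain` and `fixation_summary` and the list layout (index 0 unused) are assumptions of this example.

```python
from fractions import Fraction as F

def solve_chain(up, down, rhs, x_last):
    """Solve (up_i + down_i) x_i - up_i x_{i+1} - down_i x_{i-1} = rhs_i
    for i = 1..N-1 with x_0 = 0 and x_N = x_last (index 0 of inputs unused).
    Requires up_i > 0 for all transient states."""
    n = len(up) - 1                          # number of transient states, N - 1
    diag = [None] * (n + 1)
    r = [None] * (n + 1)
    for i in range(1, n + 1):                # forward elimination
        diag[i] = up[i] + down[i]
        r[i] = rhs[i]
        if i > 1:
            w = down[i] / diag[i - 1]
            diag[i] -= w * up[i - 1]
            r[i] += w * r[i - 1]
    x = [F(0)] * (n + 2)
    x[n + 1] = x_last
    for i in range(n, 0, -1):                # back substitution
        x[i] = (r[i] + up[i] * x[i + 1]) / diag[i]
    return x

def fixation_summary(up, down):
    """Return (phi^1, t^1, t^{1->N}) for a chain started with a single A."""
    n = len(up) - 1
    phi = solve_chain(up, down, [F(0)] * (n + 1), F(1))   # fixation probabilities
    tau = solve_chain(up, down, [F(1)] * (n + 1), F(0))   # unconditional times
    theta = solve_chain(up, down, phi, F(0))              # theta_i = phi^i * conditional time
    return phi[1], tau[1], theta[1] / phi[1]

N = 10
# DB updating, dominance game (alpha > beta): limits listed in table (27)
db_up = [None] + [F(2, N)] * (N - 2) + [F(1, N)]
db_down = [None, F(1, N)] + [F(0)] * (N - 2)
# BD updating, dominance game: limits listed in table (22)
bd_up = [None, F(1)] + [F(1, 2)] * (N - 2)
bd_down = [None] + [F(0)] * (N - 1)

print(fixation_summary(db_up, db_down))
print(fixation_summary(bd_up, bd_down))
```

For N = 10 the DB case gives 2/3, 100/3 and 145/3, matching $N^2/3$ and $N^2/2 - N/6$, while the BD case gives 1, 17 and 17, matching 2N − 3.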

To better understand the theoretical outcomes, two sets of simulations on a cycle with 10 nodes are performed, and the results are depicted in Figs. 1 and 2. In Fig. 1, the parameter settings are α = 6 and β = 2. For the DB and BD updating processes, we obtain the corresponding distributions of the conditional fixation time. Even when the selection intensity is strong, the randomness of the DB process means that the conditional fixation time cannot be short. In Fig. 2, the parameters of the payoff matrix are α = 3 and β = 4 (the coexistence game). The simulation results in Fig. 2 clearly show that, under the DB process, the conditional fixation times are widely distributed along the horizontal axis. By contrast, the results for the BD updating process indicate that mutant individuals take over the whole system more quickly than under DB. In this respect, the trend of the fixation times in the theoretical results generally agrees with that in the simulations.

[Figure 1 plot: distributions of the conditional fixation time for the BD and DB processes at ω = 10; horizontal axis: conditional fixation time; vertical axis: probability.]

Figure 1: Conditional fixation times in the strategic dominance game with α > β under the BD and DB processes. The network size is N = 10, and each node on the cycle plays the division of labor game with payoff parameters α = 6, β = 2. The vertical axis represents the proportion of runs, among 5000 independent simulations, in which A players are successfully fixed on the whole cycle at the corresponding fixation time (horizontal axis). The simulated mean conditional fixation time is 32.2 for the DB process and 19.6 for the BD process. The corresponding theoretical values are $t^{1\to N}_{DB} \to \frac{N^2}{2} - \frac{N}{6} = 48.3$ in the DB case and $t^{1\to N}_{BD} \to 2N-3 = 17$ in the BD case.
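A simulation of this kind could be sketched as follows. This is not the authors' simulation code; it is a minimal illustration that assumes one update event per time step, fitness computed from the payoffs of payoff matrix (2) before each update, and hypothetical names such as `simulate` and `rule`.

```python
import math
import random

def simulate(N, alpha, beta, omega, rule, rng):
    """Run one trajectory from a single A on a cycle.

    Returns (a_fixed, steps), where a_fixed tells whether task A took over
    and steps counts update events. rule is "BD" or "DB".
    """
    s = [1] + [0] * (N - 1)                  # 1 = task A, 0 = task B
    payoff = {(1, 1): alpha, (1, 0): alpha + beta, (0, 1): beta, (0, 0): 0.0}
    steps = 0
    while 0 < sum(s) < N:
        steps += 1
        # Each node plays with both neighbors; fitness = exp(omega * payoff).
        fit = [math.exp(omega * (payoff[s[j], s[j - 1]] + payoff[s[j], s[(j + 1) % N]]))
               for j in range(N)]
        if rule == "BD":
            # Fitness-proportional birth, then a random neighbor is replaced.
            parent = rng.choices(range(N), weights=fit)[0]
            child = (parent + rng.choice((-1, 1))) % N
        else:
            # Random death, then the two neighbors compete by fitness.
            child = rng.randrange(N)
            nbrs = [(child - 1) % N, (child + 1) % N]
            parent = rng.choices(nbrs, weights=[fit[n] for n in nbrs])[0]
        s[child] = s[parent]
    return s[0] == 1, steps

rng = random.Random(2019)
runs = [simulate(10, 6.0, 2.0, 10.0, "BD", rng) for _ in range(200)]
fixed_times = [t for fixed, t in runs if fixed]
print(len(fixed_times), sum(fixed_times) / len(fixed_times))
```

Averaging `steps` over the runs in which A fixes gives an estimate of the mean conditional fixation time; individual estimates fluctuate from seed to seed.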

3.4. Mixed DB-BD updating process under strong selection Through simplifying the complicated constraints in this case, we obtain the limiting relations of the transition probabilities. Under the mixed DB-BD updating process, limiting case of ω → ∞ leads to following transition probabilities on a cycle network.

13

0.1 ω=10

Probability

0.08

0.06 BD 0.04

0.02 DB 0

0

100 150 Conditional fixation time

50

200

250

Figure 2: Simulation results for the conditional fixation time distributions when individuals are involved in the game with coexisting strategies A, B. To constitute such a game scenario, parameters in the payoff matrix are set to be α = 3, β = 4. Similarly, totally N = 10 agents are arranged on the cycle. And 5000 independent simulations are performed under strong selection (ω = 10). The conditional fixation time based on the DB process is 57.6, and it is N (3N +5) = 58.3 in the DB case, while 28.9 in the DB process. Theoretically, we get t1→N DB → 6 1→N tBD → 2N − 3 = 17 in the BD case.

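$$
\begin{aligned}
T_{Mix}^{1+} &\to \begin{cases} \frac{2}{N}\delta + (1-\delta), & 2\alpha+\beta > 0 \\ \frac{2}{N}\delta, & 2\alpha+\beta < 0 \end{cases} \\
T_{Mix}^{i+} &\to \begin{cases} \frac{2}{N}\delta + \frac{1}{2}(1-\delta), & 2\alpha+\beta > 0,\ \alpha > 0 \\ \frac{2}{N}\delta, & 2\alpha+\beta > 0,\ \alpha < 0 \\ 0, & 2\alpha+\beta < 0 \end{cases} \\
T_{Mix}^{(N-2)+} &\to \begin{cases} \frac{2}{N}\delta + \frac{1}{2}(1-\delta), & \alpha > 0,\ 2\alpha+\beta > 0 \\ 0, & \alpha < 0 \end{cases} \\
T_{Mix}^{(N-1)+} &\to \begin{cases} \frac{1}{N}\delta + \frac{1}{2}(1-\delta), & 2\alpha-\beta > 0 \\ \frac{1}{N}\delta, & 2\alpha-\beta < 0 \end{cases}
\end{aligned}
\tag{36}
$$

and

$$
\begin{aligned}
T_{Mix}^{1-} &\to \begin{cases} \frac{1}{N}\delta + \frac{1}{2}(1-\delta), & 2\alpha+\beta < 0 \\ \frac{1}{N}\delta, & 2\alpha+\beta > 0 \end{cases} \\
T_{Mix}^{2-} &\to \begin{cases} \frac{2}{N}\delta + \frac{1}{2}(1-\delta), & \alpha < 0,\ 2\alpha-\beta < 0 \\ 0, & \alpha > 0 \end{cases} \\
T_{Mix}^{i-} &\to \begin{cases} \frac{2}{N}\delta + \frac{1}{2}(1-\delta), & \alpha < 0,\ 2\alpha-\beta < 0 \\ \frac{2}{N}\delta, & \alpha > 0,\ 2\alpha-\beta < 0 \\ 0, & 2\alpha-\beta > 0 \end{cases} \\
T_{Mix}^{(N-1)-} &\to \begin{cases} \frac{2}{N}\delta + (1-\delta), & \alpha < \beta,\ 2\alpha-\beta < 0 \\ \frac{2}{N}\delta, & \alpha < \beta,\ 2\alpha-\beta > 0 \\ 0, & \alpha > \beta \end{cases}
\end{aligned}
\tag{37}
$$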
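As a sanity check on Eqs. (36)-(37), the piecewise limits can be encoded directly. The following Python sketch is illustrative only. The function name `mixed_limits`, the use of exact fractions, the restriction to β > 0 and N ≥ 5, and the choice to apply the generic i rows to states 2 ≤ i ≤ N−3 (forward) and 3 ≤ i ≤ N−2 (backward) are assumptions made here and are not stated in the text.

```python
from fractions import Fraction


def mixed_limits(alpha, beta, delta, n):
    """Limiting (omega -> infinity) transition probabilities of Eqs. (36)-(37).

    Hypothetical helper: assumes beta > 0, n >= 5, and that none of the
    thresholds (alpha, alpha - beta, 2*alpha - beta, 2*alpha + beta) is zero.
    Returns two dicts keyed by the number of A-players, 1..n-1.
    """
    d = Fraction(delta)
    db1, db2 = d / n, 2 * d / n              # DB parts: delta/N and 2*delta/N
    bd_half, bd_full = (1 - d) / 2, 1 - d    # BD parts: (1-delta)/2 and (1-delta)
    zero = Fraction(0)
    t_plus, t_minus = {}, {}
    for i in range(1, n):
        # Forward step i -> i+1, Eq. (36)
        if i == 1:
            t_plus[i] = db2 + bd_full if 2 * alpha + beta > 0 else db2
        elif i == n - 1:
            t_plus[i] = db1 + bd_half if 2 * alpha - beta > 0 else db1
        elif i == n - 2:
            t_plus[i] = db2 + bd_half if alpha > 0 else zero
        elif 2 * alpha + beta < 0:
            t_plus[i] = zero
        else:
            t_plus[i] = db2 + bd_half if alpha > 0 else db2
        # Backward step i -> i-1, Eq. (37)
        if i == 1:
            t_minus[i] = db1 + bd_half if 2 * alpha + beta < 0 else db1
        elif i == n - 1:
            if alpha > beta:
                t_minus[i] = zero
            else:
                t_minus[i] = db2 + bd_full if 2 * alpha - beta < 0 else db2
        elif i == 2:
            t_minus[i] = db2 + bd_half if alpha < 0 else zero
        elif 2 * alpha - beta > 0:
            t_minus[i] = zero
        else:
            t_minus[i] = db2 if alpha > 0 else db2 + bd_half
    return t_plus, t_minus


# Example: the dominance game of Figure 1 (alpha = 6, beta = 2), delta = 1/2, N = 10.
tp, tm = mixed_limits(6, 2, Fraction(1, 2), 10)
print(tp[1], tp[5], tp[9])   # 3/5 7/20 3/10
print(tm[1], tm[5], tm[9])   # 1/20 0 0
```

With α > β, the only non-zero backward probability is the one out of state 1, which is why a single A-player can be lost but two or more cannot.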

Although introducing δ increases the complexity of the subsequent calculations, it provides a better way to investigate how the transition probabilities, fixation probabilities and fixation times change between the DB and BD updating rules. According to the payoffs in the division of labor game (Eq. (2)), we study these three quantities in the following two game scenarios.

Scenario 1: The parameters satisfy α > β, which leads to a game with a dominating strategy. The fixation probability for a single A-player is $\frac{2\delta+N(1-\delta)}{3\delta+N(1-\delta)}$, which is non-zero. The following table summarizes the transition probabilities and sojourn times,

$$
\begin{array}{c|ccc}
i & T_{Mix}^{i+} & T_{Mix}^{i-} & t_{Mix}^{1i} \\ \hline
1 & \frac{2}{N}\delta + (1-\delta) & \frac{1}{N}\delta & \frac{N}{3\delta+N(1-\delta)} \\
2 & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[2\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
\vdots & \vdots & \vdots & \vdots \\
i & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[2\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
\vdots & \vdots & \vdots & \vdots \\
N-2 & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[2\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
N-1 & \frac{1}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N}{3\delta+N(1-\delta)}
\end{array}
\tag{38}
$$

The fixation times can be obtained with the aid of Eqs. (7)-(8),

$$
t_{Mix}^{1} \to \frac{4N^2\delta + (2N^3 - 3N^2)(1-\delta)}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]}
\tag{39}
$$

and

$$
t_{Mix}^{1\to N} \to \frac{N}{3\delta+N(1-\delta)} + \frac{2N(N-3)}{4\delta+N(1-\delta)} + \frac{2N}{2\delta+N(1-\delta)}.
\tag{40}
$$
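The closed forms in Eqs. (39)-(40) can be cross-checked against the transition probabilities in table (38). The Python sketch below is one way to do this and is not the authors' procedure. It assumes the standard fixation probability and fixation time sums for a birth-death chain with absorbing states 0 and N (presumably what Eqs. (7)-(8) provide, although that is an assumption here), and the helper names `chain_times`, `scenario1` and `closed_forms` are hypothetical.

```python
from fractions import Fraction


def chain_times(t_plus, t_minus):
    """Return (phi_1, t_1, t_1^N) for a birth-death chain on states 0..N.

    t_plus[k-1] and t_minus[k-1] are the transition probabilities of state k.
    Assumption: the standard double-sum formulas for such chains are used.
    """
    m = len(t_plus)                                  # transient states, N - 1
    gamma = [tm / tp for tp, tm in zip(t_plus, t_minus)]
    partial = [Fraction(1)]                          # partial[k] = gamma_1 ... gamma_k
    for g in gamma:
        partial.append(partial[-1] * g)
    total = sum(partial)
    phi = [sum(partial[:l]) / total for l in range(m + 2)]   # phi[0..N]
    uncond = cond = Fraction(0)
    for k in range(1, m + 1):
        prod = Fraction(1)                           # prod_{j=l+1}^{k} gamma_j
        for l in range(k, 0, -1):
            uncond += prod / t_plus[l - 1]
            cond += phi[l] * prod / t_plus[l - 1]
            prod *= gamma[l - 1]                     # extend the product down to gamma_l
    return phi[1], phi[1] * uncond, cond


def scenario1(delta, n):
    """Transition probabilities of table (38), alpha > beta."""
    d = Fraction(delta)
    t_plus = [2 * d / n + (1 - d)] + [2 * d / n + (1 - d) / 2] * (n - 3) \
        + [d / n + (1 - d) / 2]
    t_minus = [d / n] + [Fraction(0)] * (n - 2)
    return t_plus, t_minus


def closed_forms(delta, n):
    """Fixation probability and Eqs. (39)-(40)."""
    d = Fraction(delta)
    a, b, c = 3 * d + n * (1 - d), 4 * d + n * (1 - d), 2 * d + n * (1 - d)
    t1 = (4 * n**2 * d + (2 * n**3 - 3 * n**2) * (1 - d)) / (a * b)
    t1n = Fraction(n) / a + Fraction(2 * n * (n - 3)) / b + Fraction(2 * n) / c
    return c / a, t1, t1n


for delta in (Fraction(0), Fraction(1, 2), Fraction(1)):
    assert chain_times(*scenario1(delta, 10)) == closed_forms(delta, 10)
print(closed_forms(0, 10)[2], closed_forms(1, 10)[2])  # 17 145/3
```

The two printed values are the δ = 0 and δ = 1 endpoints of Eq. (40) for N = 10. They agree with the BD value 2N − 3 = 17 and the DB value N²/2 − N/6 ≈ 48.3 quoted for Figure 1.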

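When δ → 0, the fixation time scales as $N$. When δ → 1, the leading order of the fixation time is $N^2$.

Scenario 2: On the premise of α < β (α + β > 0, β > 0), we further consider the sign of 2α − β. If 0 < α < β and 2α − β > 0, the transition probabilities and sojourn times are

$$
\begin{array}{c|ccc}
i & T_{Mix}^{i+} & T_{Mix}^{i-} & t_{Mix}^{1i} \\ \hline
1 & \frac{2}{N}\delta + (1-\delta) & \frac{1}{N}\delta & \frac{N}{3\delta+N(1-\delta)} \\
2 & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[2\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
\vdots & \vdots & \vdots & \vdots \\
i & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[2\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
\vdots & \vdots & \vdots & \vdots \\
N-2 & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & 0 & \frac{2N[6\delta+N(1-\delta)]}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]} \\
N-1 & \frac{1}{N}\delta + \frac{1}{2}(1-\delta) & \frac{2}{N}\delta & \frac{2N}{3\delta+N(1-\delta)}
\end{array}
\tag{41}
$$

The mean unconditional fixation time amounts to

$$
t_{Mix}^{1} \to \frac{(4N^2 + 8N)\delta + (2N^3 - 3N^2)(1-\delta)}{[3\delta+N(1-\delta)][4\delta+N(1-\delta)]}.
\tag{42}
$$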
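Since the unconditional fixation time is the sum of the sojourn times, Eq. (42) can be checked by adding up the last column of table (41). A short Python sketch of that bookkeeping follows. The function names are hypothetical, and exact fractions are used only so that the comparison is free of rounding error.

```python
from fractions import Fraction


def sojourn_times_case1(delta, n):
    """Sojourn times t^{1i} of table (41), listed for i = 1..n-1 (assumes n >= 4)."""
    d = Fraction(delta)
    a = 3 * d + n * (1 - d)
    b = 4 * d + n * (1 - d)
    c = 2 * d + n * (1 - d)
    first = n / a                                           # state 1
    middle = 2 * n * c / (a * b)                            # states 2..N-3
    second_last = 2 * n * (6 * d + n * (1 - d)) / (a * b)   # state N-2
    last = 2 * n / a                                        # state N-1
    return [first] + [middle] * (n - 4) + [second_last, last]


def t1_closed(delta, n):
    """Closed form of Eq. (42)."""
    d = Fraction(delta)
    a = 3 * d + n * (1 - d)
    b = 4 * d + n * (1 - d)
    return ((4 * n**2 + 8 * n) * d + (2 * n**3 - 3 * n**2) * (1 - d)) / (a * b)


for delta in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    # The column sum and the closed form agree exactly.
    assert sum(sojourn_times_case1(delta, 10)) == t1_closed(delta, 10)
print(float(t1_closed(Fraction(1, 2), 10)))
```

The only row that differs from Scenario 1 is state N−2, whose extra weight comes from the non-zero backward step out of state N−1.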

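The mean conditional fixation time amounts to

$$
t_{Mix}^{1\to N} \to \frac{N}{3\delta+N(1-\delta)} + \frac{2N(N-4)}{4\delta+N(1-\delta)} + \frac{4N[5\delta+N(1-\delta)]}{[2\delta+N(1-\delta)][4\delta+N(1-\delta)]}.
\tag{43}
$$

If 0 < α < β and 2α − β < 0, the transition probabilities and sojourn times are summarized in table (44), which differs slightly from Eq. (41):

$$
\begin{array}{c|ccc}
i & T_{Mix}^{i-} & T_{Mix}^{i+} & t_{Mix}^{1i} \\ \hline
1 & \frac{1}{N}\delta & \frac{2}{N}\delta + (1-\delta) & \frac{N}{3\delta+N(1-\delta)} \\
2 & 0 & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & \frac{2\delta+N(1-\delta)}{3\delta+N(1-\delta)} \cdot \frac{2N}{[4\delta+N(1-\delta)]\,(1-\phi_{Mix}^{i+1,i})} \\
\vdots & \vdots & \vdots & \vdots \\
i & \frac{2}{N}\delta & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & \frac{2\delta+N(1-\delta)}{3\delta+N(1-\delta)} \cdot \frac{2N}{[4\delta+N(1-\delta)]\,(1-\phi_{Mix}^{i+1,i})} \\
\vdots & \vdots & \vdots & \vdots \\
N-2 & \frac{2}{N}\delta & \frac{2}{N}\delta + \frac{1}{2}(1-\delta) & \frac{2\delta+N(1-\delta)}{3\delta+N(1-\delta)} \cdot \frac{2N}{[4\delta+N(1-\delta)]\,(1-\phi_{Mix}^{i+1,i})} \\
N-1 & \frac{2}{N}\delta + (1-\delta) & \frac{1}{N}\delta & \frac{N[2\delta+N(1-\delta)]}{\delta[3\delta+N(1-\delta)]}
\end{array}
\tag{44}
$$

where

$$
\phi_{Mix}^{i+1,i} = \frac{4\delta^2[4\delta+N(1-\delta)]^{N-2-i} - \delta(4\delta)^{N-1-i} + [2\delta+N(1-\delta)](4\delta)^{N-2-i}N(1-\delta)}{N(1-\delta)\delta[4\delta+N(1-\delta)]^{N-2-i} + 4\delta^2[4\delta+N(1-\delta)]^{N-2-i} - \delta(4\delta)^{N-1-i} + [2\delta+N(1-\delta)](4\delta)^{N-2-i}N(1-\delta)}.
$$

Thus

$$
t_{Mix}^{1} \to \frac{N}{3\delta+N(1-\delta)} + \sum_{i=2}^{N-2} t_{Mix}^{1i} + \frac{N[2\delta+N(1-\delta)]}{\delta[3\delta+N(1-\delta)]}
\tag{45}
$$

and

$$
t_{Mix}^{1\to N} \to \frac{N}{3\delta+N(1-\delta)} + \frac{3\delta+N(1-\delta)}{2\delta+N(1-\delta)} \sum_{i=2}^{N-2} t_{Mix}^{1i} + \frac{N}{\delta},
\tag{46}
$$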
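The return probability $\phi_{Mix}^{i+1,i}$ in table (44) is the least transparent ingredient of Eqs. (45)-(46). The Python sketch below compares its closed form with a direct computation, and then evaluates Eqs. (45)-(46) for a few values of δ. It is a hedged illustration. The direct computation assumes the usual hitting probability formula for a birth-death chain restricted to states i..N, and all function names are hypothetical.

```python
from fractions import Fraction


def phi_closed(i, delta, n):
    """Closed form of phi^{i+1,i} given with table (44); needs 0 < delta < 1."""
    d = Fraction(delta)
    b, c = 4 * d + n * (1 - d), 2 * d + n * (1 - d)
    k = n - 2 - i
    num = 4 * d**2 * b**k - d * (4 * d)**(k + 1) + c * (4 * d)**k * n * (1 - d)
    return num / (n * (1 - d) * d * b**k + num)


def phi_direct(i, delta, n):
    """Probability of reaching state i before state N when starting from i+1.

    Assumption: gamma_j = T^{j-}/T^{j+} is read off table (44), i.e.
    4*delta/[4*delta + N(1-delta)] for j <= N-2 and
    [2*delta + N(1-delta)]/delta for j = N-1.
    """
    d = Fraction(delta)
    b, c = 4 * d + n * (1 - d), 2 * d + n * (1 - d)
    gammas = [4 * d / b] * (n - 2 - i) + [c / d]   # states i+1 .. N-1
    s, prod = Fraction(0), Fraction(1)
    for g in gammas:
        prod *= g
        s += prod
    return s / (1 + s)


def fixation_times_case2(delta, n):
    """Evaluate Eqs. (45) and (46) for 0 < delta < 1."""
    d = Fraction(delta)
    a, b, c = 3 * d + n * (1 - d), 4 * d + n * (1 - d), 2 * d + n * (1 - d)
    middle = sum((c / a) * 2 * n / (b * (1 - phi_closed(i, d, n)))
                 for i in range(2, n - 1))          # sum over i = 2..N-2
    t1 = n / a + middle + n * c / (d * a)
    t1n = n / a + (a / c) * middle + n / d
    return t1, t1n


N = 10
for delta in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    assert all(phi_closed(i, delta, N) == phi_direct(i, delta, N)
               for i in range(2, N - 1))
    t1, t1n = fixation_times_case2(delta, N)
    print(delta, float(t1), float(t1n))
```

Both functions divide by δ, and the closed form becomes 0/0 at δ = 1, which mirrors the exclusion of δ = 0 and δ = 1 for these expressions.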

Here $t_{Mix}^{1i}$ denotes the expected time spent in each intermediate state when starting from a single A-player; see table (44) for details. Note that these fixation time expressions do not apply to the cases δ = 0 and δ = 1. If δ = 0, $T_{Mix}^{(N-1)+} = 0$ and the fixation time tends to infinity. Besides, the geometric series summation used to simplify Eqs. (45), (46) requires that the common ratio differ from 1. Since δ = 1 makes the ratio equal to 1, the case δ = 1 must be excluded.

Under the mixed DB-BD rule, a single A-player becomes fixed on the cycle network with probability $\frac{2\delta+N(1-\delta)}{3\delta+N(1-\delta)}$ when the selection intensity is infinite (ω → ∞). Once the A strategy has spread to two or more agents, it can take over the whole population. The parameter δ allows a smooth transition between the BD and DB updating rules, and thus affects the order of magnitude of the fixation time. In Scenario 1 (α > β) and the first case of Scenario 2 (0 < α < β and 2α − β > 0), the fixation times increase as δ gets larger. When α > β, the fixation time under strong selection is less than that under neutral selection for any δ if N > 3. Since this threshold for N is very small, strong selection is effectively always faster than neutral selection in this case. In the first case of Scenario 2 (0 < α < β and 2α − β > 0), if N > 7, the mean unconditional fixation time under strong selection is less than that under neutral selection for any δ. Likewise, when N > 4, the mean conditional fixation time under strong selection is less than that under neutral selection for any δ. However, the second case of Scenario 2 (0 < α < β and 2α − β < 0) is more complicated, and the impact of δ on the fixation time is too obscure to analyze. Since both δ → 0 and δ → 1 result in a relatively large fixation time, there is no critical N beyond which strong selection is faster than neutral selection for every δ.

4. Conclusions

This work studies how to realize high-efficiency task allocation at the group level. Throughout the process, there is no external or central control and no global information exchange. All individuals in the population play the division of labor games and keep the advantageous strategies, so that optimization is self-organized. From an individual's perspective, the problem of task allocation can be considered a type of collective dilemma in game theory. We study the group dynamics in the context of division of labor games by theoretically calculating the relevant fixation probabilities and fixation times. In addition, simulations are performed to illustrate the conclusions. In structured populations, selection is influenced by the updating process and by the mapping between payoff and fitness. Here we base the study on the cycle network and an exponential payoff-to-fitness mapping.
To provide a more complete view of the division of labor among agents, we study three different update rules: the birth-death (BD), death-birth (DB) and mixed DB-BD updating processes. We derive transition probabilities that are valid for any selection intensity. The study covers two types of game scenarios, classified according to the payoffs in the division of labor games: strategic dominance games and coexistence games. Moreover, the following results are obtained under strong selection, where the payoff from the game makes a large contribution to fitness.

Under the BD updating rule, the transition probability has no close relation with the population size. In games with a dominating strategy, which ensures the dominance of strategy A, the leading order of the fixation time is N. Stochastic evolutionary dynamics in games with coexisting strategies on the cycle networks show results consistent with the dominance games. Under the DB updating rule, the transition probability is proportional to 1/N. Compared with the BD situation, the fixation of strategy A under the DB updating rule always takes longer, both in dominance games and in coexistence games. When the strategy updating process proceeds under the mixed DB-BD rule, the parameter δ allows a smooth transition between the BD and DB updating rules. In Scenario 1 (α > β) and the first case of Scenario 2 (0 < α < β and 2α − β > 0), increasing δ means moving toward the DB updating process, which leads to increasing fixation times.

Our results under the different updating processes are expected to provide a well-rounded view of the self-organization mechanisms in the context of the division of labor games. In addition, results on the fixation probability and fixation time are summarized in the framework of the cycle network. Further explorations on more complex networks can be expected in future studies.

Conflict of Interest

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

5. Acknowledgements

The authors are supported by the National Natural Science Foundation of China (Grant Nos. 61603199, 61603201 and 91848203), and the Tianjin Natural Science Foundation of China (Grant No. 18JCYBJC18600).