DECADE: Distributed Emergent Cooperation through ADaptive Evolution in mobile ad hoc networks




Ad Hoc Networks 10 (2012) 1379–1398

Contents lists available at SciVerse ScienceDirect

Ad Hoc Networks journal homepage: www.elsevier.com/locate/adhoc

Marcela Mejia a,b,*, Néstor Peña b, Jose L. Muñoz c, Oscar Esparza c, Marco Alzate d

a Universidad Militar Nueva Granada, Bogotá, Colombia
b Universidad de los Andes, Bogotá, Colombia
c Universitat Politècnica de Catalunya, Barcelona, Spain
d Universidad Distrital, Bogotá, Colombia

Article info

Article history: Received 13 March 2011; Received in revised form 27 September 2011; Accepted 20 March 2012; Available online 18 April 2012

Keywords: MANET, DSR, Trust models, Game theory, Emergent behavior, Adaptation

Abstract

The scarce resources of a mobile ad hoc network (MANET) should not be wasted attending to selfish nodes (nodes that use resources from other nodes to send their own packets, without offering their own resources to forward other nodes' packets). Thus, rational nodes (nodes willing to cooperate if deemed worthy) must detect and isolate selfish nodes in order to cooperate only among themselves. To achieve this purpose, in this paper we present a new game-theoretic trust model called DECADE (Distributed Emergent Cooperation through ADaptive Evolution). The design of DECADE is presented by first analyzing a simple case of packet forwarding between two nodes, and then extending the results to larger networks. In DECADE, each node individually seeks to maximize its chance of successfully delivering its own packets, so that cooperation among rational nodes and isolation of selfish nodes appear as an emergent collective behavior. This behavior emerges as long as there is a highly dynamic interaction among nodes. So, for those cases where mobility alone does not suffice to provide this interaction, DECADE includes a sociability parameter that encourages nodes to interact with each other for faster learning and adaptability. Additionally, DECADE introduces very low computational and communication overhead, achieving close-to-optimal cooperation levels among rational nodes and almost complete isolation of selfish nodes. © 2012 Elsevier B.V. All rights reserved.

* Corresponding author at: Universidad Militar Nueva Granada, Carrera 11 No. 101-80, Bogotá, Colombia. Tel.: +57 3002083542. E-mail address: [email protected] (M. Mejia).
1570-8705/$ - see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.adhoc.2012.03.017

1. Introduction

Mobile Ad Hoc NETworks (MANETs) are formed by wireless mobile devices with limited battery power, computation capacity, and memory space. These devices communicate among themselves through multi-hop routes, without relying on any communication infrastructure [1]. For a MANET to function appropriately, each node must be willing to contribute its own resources and, in exchange, should be able to use other nodes' resources. However, in this environment there could be selfish nodes. Selfish

nodes use resources from other nodes of the network to send their packets without forwarding packets of these other nodes [2]. To avoid this unfair situation, MANETs have to use an enforcement mechanism. The contributions of this paper are the following. First, we propose a cooperation enforcement mechanism called DECADE (Distributed Emergent Cooperation through ADaptive Evolution) for MANETs based on a source routing protocol like DSR (Dynamic Source Routing) [3]. DECADE is based on trust and game theory. DECADE differs from similar models in its completely distributed nature, in which cooperation becomes an emergent property that evolves from the individual learning and adaptive behavior of each node. We present an extensive set of simulations that show how DECADE achieves almost optimal cooperation


values among rational nodes and almost total isolation of selfish nodes, with high efficiency and fast convergence. Second, we conduct an analytic study of our model on a simplified two-node scenario, proving analytically the optimal results of our previous work presented in [4]. Both the analysis method based on dynamical systems theory and the analysis results stating the weakness of the prisoner's dilemma for trust modeling in MANETs are significant contributions of this paper. Third, we make a more extensive study of the model in [4], noticing how, under low mobility conditions, rational nodes need to explore their environment more aggressively. Finally, we propose the "sociability factor" as one possible way to achieve this exploratory behavior. Sociability is necessary in low/medium mobility scenarios, in which it is not easy to obtain timely knowledge about other nodes' behavior. We show that our sociability parameter encourages more interactions among nodes and effectively improves environment perception.

There are two kinds of selfish misbehaviors at the network layer of an ad hoc network [5]: selfish routing misbehavior, when a "silent node" does not participate in route discovery, and selfish forwarding misbehavior, when a node fully participates in the route discovery phase but refuses to forward packets for others. DECADE, like most cooperation enforcement systems in the literature, is designed to cope with selfish forwarding nodes. The silent nodes, however, can be encouraged by DECADE to participate, since they need to gain reputation in order to transmit their own packets.

The rest of the paper is organized as follows: Section 2 summarizes the main cooperation enforcement models in the literature for MANETs, emphasizing those that are based on game theory and providing a comparison with DECADE. Section 3 analyzes the simplest case of two nodes taking turns to send packets using the other one as a relay.
We modify the payoff table of the prisoner's dilemma in order to make it more robust to the uncertain environment of a MANET, leading to the Forwarding Dilemma model. In Section 4, this forwarding dilemma is extended to the environment of a bigger network that has both rational and selfish nodes. After describing the game model, we briefly review the distributed cellular/bacterial evolution algorithm (which was extensively presented in [4]). In Section 5, we analyze the effects of low/medium mobility and introduce the sociability parameter as a way of encouraging interactions among nodes for better environment estimation. Finally, Section 6 concludes the paper.

2. State of the art

Models for cooperation enforcement in MANETs can be broadly divided into two categories according to the techniques they use to enforce cooperation [6]: credit-based models and trust models. The main drawback of credit-based models is that they require the existence of either tamper-resistant hardware or a virtual bank, heavily restricting their usability for MANETs. On the other hand, models in which trust is the base for cooperation are envisioned as the most promising solutions for MANETs

because these models do not have the restrictions of credit-based models. Trust models can frustrate the intentions of selfish nodes by coping with observable misbehaviors. If a node does not behave cooperatively, the affected nodes, reciprocally, may deny cooperation.

Generally speaking, in a trust model, an entity called the Subject S entrusts the execution of an action a to another entity called the Agent A, in which case we say that T{S:A, a} is the trust level that S has in A with respect to the execution of action a [7]. This trust level varies as the entities interact with each other; i.e., if the Agent A responds satisfactorily to the Subject S, S can increase the trust level T{S:A, a}. On the other hand, if the Subject S is disappointed by the Agent A, the corresponding trust level could be decreased by some amount. In this sense, a trust model helps the subject of a distributed system to select the most reliable agent among several agents offering a service [8]. To make this selection, the trust model should provide the mechanisms needed for each entity to measure, assign, update and use trust values.

Several trust models have been proposed in the literature for improving the performance of MANETs. In [9], there is a classification of trust-based systems according to the theoretical mechanisms used for trust scoring and trust ranking. Following this classification, we can further divide the trust-based proposals into approaches based on social networks, information theory, graph theory and game theory. In proposals based on social networks, like CONFIDANT [10], improved CONFIDANT [11], CORE [12] or SORI [13], the nodes build their view of trust or reputation taking into account not only their own observations but also the recommendations from others. In addition to the previous ones, there are some trust models based on social networks that also use cluster-heads [14–17].
The cluster-head is a node that is elected to play a special role in the management of recommendations. An example can be found in [14]. The problem with the previous proposals is that calculating and measuring trust with recommendations in unsupervised ad hoc networks involves the complex issue of rating the honesty of recommenders. Although there are efforts like [15–17] that try to alleviate this problem, it remains a hard problem for systems that use recommendations.

Regarding the proposals based on information theory, one of these is [7], in which the authors propose a trust model to obtain a quantitative measurement of trust and its propagation through the MANET. However, the proposal is theoretical and does not include an implementation specification. [18] describes a trust inference algorithm in terms of a directed and weighted trust graph, T, whose vertices correspond to the users in the system and in which an edge from vertex i to vertex j represents the trust that node i has in node j. However, covering the whole graph is a highly complex computational problem.

Finally, there are several proposals that use game theory. These proposals can be further divided into cooperative and non-cooperative games [19]. In cooperative games, users form coalitions, so a group of players can adopt a certain strategy to obtain a higher gain than the one that may be obtained by making decisions individually. In


cooperative games, the nodes need to communicate with each other and discuss the strategies before they play the game. [20,21] are representative proposals of cooperative games for MANETs. Nevertheless, in MANETs this type of game has the disadvantage of generating network overhead due to the complex processes involved in coalition management. On the other hand, non-cooperative games are especially suitable for scenarios in which players might have conflicting interests. In these scenarios each user wants to maximize her own profit by taking individual decisions [22]. Non-cooperative games capture very well each node's dilemma about deciding whether to cooperate and obtain the trust of peer nodes, or not to cooperate and save scarce energy resources. In addition, non-cooperative games rely only on private histories, and thus the costly coalition overhead and possible conflicting interests can be avoided. For this reason, our proposal, like others in the literature [23–27], is based on a variation of the classical non-cooperative iterated prisoner's dilemma game, in which nodes can use different strategies.

In DECADE we use a genetic algorithm to evolve the strategies, as in [23,24]. This type of evolution algorithm is interesting not only because it presents a way of dynamically adapting the collaboration strategy to the network conditions, but also because it achieves promising results regarding cooperation and energy saving. However, Seredynski et al. [23,24] run their evolution algorithm in a centralized entity. Their algorithm is a conventional genetic algorithm that needs a large number of interactions between nodes to converge. In contrast, DECADE exploits the distributed nature of a MANET by using local genetic information exchange and a decentralized genetic algorithm.
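As a rough sketch of what such a decentralized, neighbor-local strategy exchange can look like, the snippet below lets a node occasionally copy a contiguous segment of a better-paying neighbor's strategy bit string. The function name, the payoff comparison and the segment-copy rule are illustrative assumptions for exposition, not DECADE's actual operator (which is specified in [4]).

```python
import random

def local_strategy_exchange(my_strategy, my_payoff, neighbors, p_migrate=0.5):
    """Copy a random segment of the best-paying neighbor's strategy.

    `my_strategy` is a bit string ('0' = discard, '1' = forward, indexed
    by trust level); `neighbors` is a list of (strategy, payoff) pairs
    observed locally. All names and the segment-copy rule are
    illustrative, not DECADE's exact operator.
    """
    if not neighbors or random.random() > p_migrate:
        return my_strategy
    best_strategy, best_payoff = max(neighbors, key=lambda sp: sp[1])
    if best_payoff <= my_payoff:
        return my_strategy  # keep own strategy if it pays best locally
    # Copy a contiguous segment [i, j) of the better strategy.
    i = random.randrange(len(my_strategy))
    j = random.randrange(i, len(my_strategy)) + 1
    return my_strategy[:i] + best_strategy[i:j] + my_strategy[j:]
```

Because only a segment migrates, a node blends a neighbor's strategy with its own rather than copying it wholesale, which is the intuition behind the plasmid-migration analogy.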
DECADE's algorithm works much like plasmid migration in bacterial colonies [28,29] combined with a parallel cellular genetic algorithm [30–33]. Through this procedure, individual nodes select the strategies that locally maximize their payoff in terms of both packet delivery and resource saving.

The work presented in [25,26] also uses non-cooperative game theory, with a distributed evolution based on BNS (Best Neighbor Strategy). In BNS, a node decides the strategy to follow by changing to another player's strategy if deemed worthy. The BNS proposal distributes the evolution process among the nodes, but the payoff structure strictly encourages cooperating strategies without taking into account the energy savings of discarding strategies. DECADE also uses a distributed (though different) GA, but differs from [26] in that our payoffs also reward the energy savings of discarding strategies, and in that our decisions do not depend only on the outcomes of the random pairing games but on the observed behavior history of the source and the general perceived environment of the whole network.

Finally, another closely related work can be found in [27]. In this work the authors present a system called Darwin, which is also based on a non-cooperative game. The strategy proposed in Darwin considers a retaliation situation after a node has been falsely perceived as selfish. The system assumes that nodes share their perceived dropping probability with each other. This assumption is made in

order to facilitate the theoretical analysis by isolating a pair of nodes, but in a real implementation a mechanism is required to guarantee that, even if a node lies, the reputation scheme still works. The authors state that the study of this problem in the context of wireless networks is not solved, and that they just propose a simple mechanism that relies on other cooperative nodes to tell the actual perceived dropping probability of a node, in order to minimize the impact of liars. Although their results are not directly comparable with ours, it is interesting to note that, unlike in our system, even when the non-Darwin nodes discard with probability one (which is the definition of selfish nodes in our work), they still receive considerable cooperation from the network (around 70%). As we will see in the following sections, the highly distributed strategy evolution of DECADE, along with its capabilities to enhance sociability, allows DECADE to achieve optimal cooperation with highly efficient energy usage, effectively isolating nodes that currently behave selfishly.

3. Analysis of a forwarding game between two nodes

We start the analysis of DECADE's game by first considering a simpler game between two nodes that take turns to send their packets, in such a way that each node requires the retransmission services of the other, as shown in Fig. 1. Node A needs to use node B as a relay to deliver its packets to the destination node RxA. Similarly, node B needs to use node A as a relay to deliver its packets to node RxB. Although this two-node scenario is an oversimplified model, we use it to build an analytically tractable non-zero-sum, non-cooperative game with complete information, the Forwarding Dilemma (FD). The analysis method we devise will allow us to show its superiority over the classical Prisoner's Dilemma (PD) for trust modeling in MANETs, due to its robustness against errors.
These theoretical results led us to use the FD as the basis for DECADE, as discussed in Sections 4 and 5.

3.1. A prisoner's dilemma game model

In the case depicted in Fig. 1, the most rewarding event for any source node is getting its packets to their destination, which increases its happiness by a certain quantity x. However, transmitting packets is an expensive action in terms of energy, so each transmission reduces the happiness of the transmitting node by a quantity y. Consequently, if a packet from source node A successfully arrives at the destination node RxA, the happiness of node A is incremented by x − y, while the happiness of the intermediate

Fig. 1. A two-node game.


node B is decremented by y. So then, in a game in which nodes A and B each transmit a packet, there are three possible cases:

- Both nodes transmit their packets successfully, so both are rewarded with an increment in happiness pc = x − 2y, where c stands for cooperation.
- A node is tempted not to cooperate in order to obtain a greater reward pt = x − y, where t stands for temptation. In this case, the other node only receives ps = −2y, where s stands for sucker.
- If both nodes are tempted not to cooperate, they will both receive pp = −y, the punishment for their selfishness.

If x and y are chosen in such a way that x > 2y > 0, as expected, we get the relations pt > pc > pp > ps, which corresponds exactly to the famous prisoner's dilemma game, as described in [34]. Furthermore, the relation x > 2y ensures that, in an iterated game under reciprocating strategies, it would be better for both nodes to forward the other node's packets than to discard them, since 2pc > pt + pp: the temptation does not overcome the fear of punishment. For instance, we could choose x = 5 and y = 1 to capture these relations. As shown in Table 1, the payoff structure is a prisoner's dilemma, in which ptemptation = 4 > pcooperation = 3 > ppunishment = −1 > psucker = −2.

For a game with a single iteration, the rational choice is the strategy profile in which both nodes discard the packets in transit since, for any action of a given node, the best response of the other node is to discard. Of course, both nodes would obtain better results if they both cooperated, but neither of them can tell whether the other will be tempted to discard, and neither wants to be the sucker. This strategy profile forms the only Nash equilibrium of this game, because neither node can increase its payoff by unilaterally changing its strategy [35]. Under a finite and known number of iterations, mutual discarding is still the rational strategy profile, no matter how large the number of iterations is.
Indeed, in the last iteration each node is tempted to discard, so both will discard to avoid becoming the sucker; but the same rationale holds for the previous-to-last iteration, and so on, up to the first move. However, if the number of iterations is infinite (i.e., unknown), for a specific strategy of one of the nodes there could be a particular best response of the other node, one that possibly leads to a different Nash equilibrium in which mutual forwarding is the way to go. For example, a well-known strategy for this game is tit-for-tat: begin cooperating and, from then on, do what the other node did in the previous move [35]. If a node follows this strategy, the best response of the other node is to obey the same strategy. In this case, nobody will be tempted

Table 1. Payoff table for the two-player game of Fig. 1. Each entry is (payoff of A, payoff of B).

             | B forwards | B discards
A forwards   | (3, 3)     | (−2, 4)
A discards   | (4, −2)    | (−1, −1)

to be the first to discard, since the punishment will be highly expensive in the long run. The disadvantage of this strategy is that any erroneous perception of a cooperative behavior will make the network go down forever and, unfortunately, in a MANET environment these perception errors could be quite frequent. Because of this situation, it is important to consider the use of strategies based on trust relationships, under which cooperation can emerge even in the presence of perception errors. Indeed, if each node has built a good reputation with its neighbor, both could expect a reciprocal but tolerant behavior. Section 3.1.1 considers strategies based on reputation principles.

3.1.1. Building trust relationships

The general principles that have brought good results in iterated prisoner's dilemma games are those of reciprocation, forgiveness, and non-enviousness: do not be the first to discard; if the opponent starts to cooperate after discarding a packet, begin to cooperate with it again; and do not worry about being better paid than your opponent as long as you are well paid [35]. In order to introduce these heuristics into our forwarding game, each node will build its reputation through the history of cooperation and discarding decisions it has taken previously. If the decision to forward or discard a packet depends on the trust level that the intermediate node has in the source node, in the long run it becomes more valuable for intermediate nodes to forward other nodes' packets (and increase their trust level) than to discard them to save energy. Of course, if the other node never cooperates or always cooperates, it would not be worthwhile to gain its trust. According to the previous discussion, the trust level node A has in node B after the nth iteration, Tn{A:B}, must be a function of the past behavior of node B.
In particular, node A will assign node B a trust value equal to the number of forwarding decisions of node B during the last m iterations, and node B will assign node A a trust value equal to the number of forwarding decisions of node A during the last m iterations. The parameter m, the memory of the players, determines the number of trust levels (m + 1 levels, ranging from 0 to m), as shown in Eq. (1),

$$0 \le T_n\{A:B\} = \sum_{t=0}^{m-1} d_B(n-t) \le m, \qquad 0 \le T_n\{B:A\} = \sum_{t=0}^{m-1} d_A(n-t) \le m \qquad (1)$$

where dA(i) and dB(i) are the discarding (0) or forwarding (1) decisions taken by nodes A and B, respectively, at iteration i. The memory of the nodes with respect to the observed past, m, is limited for adaptability purposes. Indeed, since strategies are continuously modified, it is important that observed decisions obey current strategies. A long memory would allow a cooperating node to become selfish and go unpunished by relying on its good reputation. Similarly, a long memory would not allow a selfish node


to begin cooperating and vindicate itself without carrying its bad reputation forever. Now that each node has the possibility to build a reputation, we can positively reward each decision: it is good to discard a packet because it saves energy, and it is good to forward a packet because it increases the trust of the other node. To keep the preferences of Table 1, we will pay each node 3 units just for participating in a single game, regardless of the outcome of the game or the decision of the node. This way we avoid a zero or negative payment, since there will always be some energy saving or some trust gain to reward. The corresponding payoff table, shown in Table 2, can be interpreted according to the role of each player as in Table 3. A source node (the one that originates a packet) receives 5 or 0 units, depending on whether its packet reaches its destination or not. An intermediate node (the one that receives a packet in transit and decides whether to forward or discard it) receives 2 units for discarding, which corresponds to the reward for saving energy, and 1 unit for forwarding, which corresponds to the reward for increasing the trust of the other node.

In the next section, we will introduce a method for the analysis of the previous PD payoff structure. This method is based on a discrete-time dynamical system with a discrete state space, which represents the trust state between the nodes and the transitions among those states.

3.1.2. Prisoner's dilemma analysis

With the shortest memory, m = 1, the decision of a node depends exclusively on the behavior of the other node in the immediately previous iteration. So, there are only two trust levels, 0 and 1, assigned anew at each iteration, according to the behavior of each node in the last iteration.
In this section we introduce an analysis method based on a discrete dynamical system, which will show that such a short memory makes the game so sensitive to errors that cooperation cannot emerge, as already mentioned. Although the result is well known, the method will be useful for analyzing some variations of the game model in subsequent sections.

Since the strategy of a node depends exclusively on its current trust level in the source node, it can be represented as a string of (m + 1) bits, where each bit indicates the decision to take according to the trust level: 0 for discarding and 1 for forwarding. For example, tit-for-tat corresponds to m = 1, an initial trust level of 1, and the strategy "01", which means "discard if the trust level in the source node is zero (i.e., if it discarded the last packet), and cooperate if the trust level is one (i.e., if it forwarded the last packet)". We will see that, if both nodes adhere to this strategy, they will cooperate forever. Indeed, we will see that nobody could change unilaterally for a better payoff, so tit-for-tat is a Nash equilibrium for this game.

Table 3. PD payoff structure according to the node's role.

Role               | Outcome/Decision | Payoff
Source node        | Success          | 5
Source node        | Failure          | 0
Intermediate node  | Discards         | 2
Intermediate node  | Forwards         | 1

But we will also see that the strategy profile in which both nodes always discard ("00") is another Nash equilibrium, a dangerous one for the operation of the network. This result can be observed if we consider the iterated game as a dynamic system with the state transition diagram of Fig. 2. The state of the system corresponds to the pair of trust levels (T{A:B}, T{B:A}). The transitions among states obey the decisions taken by the nodes and are labeled with the corresponding pair of payoffs (pA, pB). For instance, if we play a single game from the state (1, 1), in which the nodes trust each other, and node A collaborates but node B discards, the next state will be (0, 1), which means that node A does not trust node B, but node B still trusts node A. In this case, node A wins a payoff of 1 unit for preserving the trust of node B. On its side, node B wins 7 units: 5 because its packet reached the destination node RxB, and 2 because it saved energy, but at the cost of losing the trust of node A.

A given initial state, is = (T0{A:B}, T0{B:A}), and a fixed strategy profile, sp = (sA, sB), determine a state trajectory of the game. Such a trajectory includes, in general, an initial sequence of non-recurrent states, which will be called a transient trajectory, C(is, sp). The transient trajectory leads to a sequence of recurrent states that repeats periodically, which will be called a limit cycle, L(is, sp). Although the transient trajectory can be empty and the limit cycle can be unitary (in which case it is just a stable state), for generality we will still call them transient trajectory and limit cycle. Fig. 3 shows the corresponding trajectories for each strategy profile under the initial state (1, 1), where the two bits in the first row of each column are the strategy of node B, and the two bits in the first column of each row

Table 2. A PD payoff table that positively rewards energy saving and trust building. Each entry is (payoff of A, payoff of B).

             | B forwards | B discards
A forwards   | (6, 6)     | (1, 7)
A discards   | (7, 1)     | (2, 2)
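As a quick consistency check (a sketch we add for illustration, with the tables transcribed by hand), Table 2 can be reproduced from the per-role rewards described in the text (source: 5 on success, 0 on failure; intermediate: 2 for discarding, 1 for forwarding), and it equals Table 1 shifted by the 3-unit participation payment:

```python
# Payoffs indexed by (A's decision, B's decision); 1 = forward, 0 = discard.
table1 = {(1, 1): (3, 3), (1, 0): (-2, 4), (0, 1): (4, -2), (0, 0): (-1, -1)}

def role_payoffs(my_decision, other_decision):
    # As source: 5 if my packet is forwarded by the other node, else 0.
    source = 5 if other_decision == 1 else 0
    # As intermediate: 2 for discarding (energy saved), 1 for forwarding (trust gained).
    intermediate = 1 if my_decision == 1 else 2
    return source + intermediate

table2 = {(a, b): (role_payoffs(a, b), role_payoffs(b, a)) for (a, b) in table1}

# Table 2 is exactly Table 1 plus the 3-unit participation payment per node.
for (a, b), (pa, pb) in table1.items():
    assert table2[(a, b)] == (pa + 3, pb + 3)

print(table2[(1, 1)])  # (6, 6)
```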

Fig. 2. State transition diagram for the iterated game of Table 2 and memory m = 1.
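To make the transition dynamics of Fig. 2 concrete, here is a minimal simulation of the m = 1 iterated game under the payoffs of Table 2; the function and variable names are ours, and the state update simply sets each trust level to the other node's last observed decision:

```python
def play(strategy_a, strategy_b, initial_state=(1, 1), rounds=20):
    """Iterate the m = 1 forwarding game with the payoffs of Table 2.

    A strategy is a 2-bit string: strategy[t] is the decision ('0' discard,
    '1' forward) when the trust level in the other node is t. The state is
    (T{A:B}, T{B:A}); with m = 1, each trust level is just the other
    node's previous decision.
    """
    state, trajectory, payoffs = initial_state, [initial_state], []
    for _ in range(rounds):
        t_ab, t_ba = state
        d_a = int(strategy_a[t_ab])  # A's decision on B's packet
        d_b = int(strategy_b[t_ba])  # B's decision on A's packet
        # Payoff = source reward (5 if own packet forwarded, else 0)
        #        + intermediate reward (1 for forwarding, 2 for discarding).
        p_a = (5 if d_b else 0) + (1 if d_a else 2)
        p_b = (5 if d_a else 0) + (1 if d_b else 2)
        payoffs.append((p_a, p_b))
        state = (d_b, d_a)  # trust = last observed decision of the other node
        trajectory.append(state)
    return trajectory, payoffs
```

From the initial state (1, 1), tit-for-tat against itself, play("01", "01"), stays in (1, 1) earning (6, 6) every round, while the "weird" profile play("10", "10") cycles between (1, 1) and (0, 0) with payoffs alternating (2, 2) and (6, 6), averaging 4 per node, as in Fig. 3.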


Fig. 3. State trajectory and long run payoff per iteration of the game in Fig. 2, beginning in the state (1, 1).

correspond to the strategy of node A. For each case, Fig. 3 also shows the average payoff per iteration during the limit cycle, pL(is, sp) = (pAL, pBL), defined in Eq. (2),

$$p_{AL}(is, sp) = \frac{1}{|L(is, sp)|} \sum_{i=1}^{|L(is, sp)|} p^{L}_{A,i}, \qquad p_{BL}(is, sp) = \frac{1}{|L(is, sp)|} \sum_{i=1}^{|L(is, sp)|} p^{L}_{B,i} \qquad (2)$$

where |L(is, sp)| is the period length of the limit cycle and p^L_{X,i} is the payoff received by node X during the ith iteration of the limit cycle, X ∈ {A, B}. We can compare different strategy profiles according to this long-run average payoff per iteration since, as the number of iterations increases, the average payoff contributed by the transient trajectory vanishes to zero.

For example, consider the strategy profile (10, 10), in which each node forwards a packet if the trust in the other node is 0, and discards it if the trust is 1 (a weird strategy, indeed). Starting in the state (1, 1), both nodes will discard the received packet, earning a payoff of two units and moving to state (0, 0). At that state both nodes will forward the received packet, earning six units, and the cycle repeats. In this case, the transient trajectory is empty, the limit cycle has two states, and the average payoff per iteration in the limit cycle is (2 + 6)/2 = 4 for each node, as shown in Fig. 3.

Comparing the average payoffs in Fig. 3, we can see that the infinitely iterated forwarding game beginning in (1, 1) has two Nash equilibrium strategy profiles: always discard, (00, 00), and tit-for-tat, (01, 01). In the first case, the average payoff per iteration is (2, 2) and, in the second case, it is (6, 6). In both cases, no node can change its strategy unilaterally to receive a better average payoff per

iteration. However, if for any reason there is a misunderstanding and the nodes begin in a different initial state, the only long-run equilibrium is the strategy profile in which both nodes always discard, with no hope for them to begin cooperating at any time, due to the lack of incentives. This situation can easily be seen by constructing trajectory diagrams, similar to that of Fig. 3, for different initial states. Of course, since a single perception error can change the trust state to any one different from (1, 1), leading to mutual discarding from then on, we can conclude that the prisoner's dilemma with m = 1 is not an appropriate model for the emergence of cooperation in a MANET.

Using the same analysis method with a bigger memory parameter m, we can compute the average payoff per iteration for each of the (m + 1)^2 initial states and each of the 2^{2(m+1)} strategy profiles. For example, with m = 3, we obtain a 16-state transition diagram. A given initial state and a given strategy profile determine a state trajectory of the game where, in the long run, the steady limit cycle gives both the throughput of the network and the average payoff per iteration received by each node. With this information it is easy to find the Nash equilibria for each initial state. For example, the profile (0000, 0000) is a Nash equilibrium for any initial state. And the strategy profiles (0001, 0001), (0011, 0011), and (0111, 0111) are also Nash equilibria for initial states (3, 3), (2, 2) and (1, 1), respectively. Since the last one is a highly forgiving strategy profile and the first one is a highly punishing one, the last strategy profile is less sensitive to erroneous perception of cooperation decisions, as shown in Fig. 4, where the cooperation ratio (calculated as the fraction of forwarding decisions) is plotted against the probability of erroneously perceiving a cooperation decision. However, it is also noticeable that even the most robust profile is highly


Fig. 4. Average cooperation under three different tit-for-tat strategy profiles ("0001", "0011" and "0111"), plotted against the fraction of erroneous perceptions.

sensitive to perception errors exceeding 4%. This is why, in the next section, we propose a change in the payoff structure to encourage a better policy.

3.2. From Prisoner Dilemma (PD) to Forwarding Dilemma (FD)

As a conclusion of the analysis above, the prisoner dilemma model could be too risky for the emergence of cooperation in a MANET, since even the most robust versions of tit-for-tat are still too sensitive to perception errors. However, we are going to show that, if we modify the payoff structure, we can generate more Nash equilibrium strategy profiles capable of leading to sustained cooperation in the uncertain MANET environment, without discouraging energy-saving behavior. We will keep the prisoner dilemma game while the trust level between the nodes is low. However, once the nodes have reached a high mutual trust level, we will modify the payoff structure in order to make it more rewarding for the nodes to keep each other's high trust than to save energy by discarding the other's packets. This makes sense since, after all, the nodes want to use their energy to send their own packets, and this will be possible only by cooperating with each other. Therefore, cooperating with a good partner should be as rewarding as saving energy by discarding the packets of a bad partner, but saving energy by discarding the packets of a good partner should never be a temptation. In the rest of the section, we build the new payoff table and analyze the resulting Forwarding Dilemma (FD) game model using the previously introduced analysis method.

3.2.1. Payoff table modifications

Considering the same memory parameter m = 3, we have to find a payoff table that exhibits the temptation to discard packets from nodes with trust level 0 or 1, but not from nodes with trust level 2 or 3. Let D(T) be the payoff for discarding a packet from a node with trust level T, and C(T) the corresponding payoff for cooperation, with T ∈ {0, 1, 2, 3}. We need D(T) to be a decreasing function of T and C(T) to be an increasing function of T; we also need D(0) > C(0) and D(1) > C(1) to produce the temptation of discarding packets from low-trust nodes, and we need D(2) < C(2) and D(3) < C(3) to encourage the willingness to build a good reputation among the most trusted nodes. Additionally, it is necessary that D(0) < x and C(3) < x, where x is the payoff for the effective delivery of a node's own packets at the destination, since nothing should be more rewarding than that. Finally, we want D(3) > 0 and C(0) > 0, since any effort towards saving energy or building reputation deserves some reward. The simplest functions that meet these requirements are C(T) = max(T, 1/2) and D(T) = max(3 − T, 1/2). Thus, the new payment structure is the one depicted in Table 4, where the payoff earned by an intermediate node depends both on its decision and on the trust level it has on the source node.

3.2.2. FD game analysis with memory m = 3

Using the previous payoff tables, we have a new game for packet forwarding. The new FD game evolves according to the state transition diagram shown in Fig. 5. In this figure, the state indicates the trust levels (T{A:B}, T{B:A}), and the transition labels indicate the payoff earned by each node in each movement, which corresponds to one of the four possible decisions (Discard, Discard), (Discard, Forward), (Forward, Discard) and (Forward, Forward). Compared with the transition diagram for the PD with memory m = 3, we notice that, even in a single iteration, the Nash equilibrium now leads to mutual cooperation if the game begins in a high trust state, although mutual discarding is still the Nash equilibrium under low trust initial states.
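The payoff functions above and their design constraints can be checked directly. The following Python sketch (our own illustration, not the authors' code) encodes C(T) = max(T, 1/2) and D(T) = max(3 − T, 1/2), taking x = 5, the successful-delivery payoff of Table 4:

```python
# Sketch of the FD payoff functions of Section 3.2.1:
# C(T) = max(T, 1/2) for cooperating, D(T) = max(3 - T, 1/2) for discarding,
# with T the trust level the intermediate node has on the source node.

def C(T):
    """Payoff for forwarding a packet from a source with trust level T."""
    return max(T, 0.5)

def D(T):
    """Payoff for discarding a packet from a source with trust level T."""
    return max(3 - T, 0.5)

X = 5  # payoff for successful delivery of a node's own packet (Table 4)

# Design constraints stated in the text:
assert D(0) > C(0) and D(1) > C(1)                    # temptation at low trust
assert D(2) < C(2) and D(3) < C(3)                    # cooperation pays at high trust
assert all(C(T) < X and D(T) < X for T in range(4))   # delivery is most rewarding
assert D(3) > 0 and C(0) > 0                          # every action earns something
```

Running the block confirms that the simple max-based functions satisfy all five requirements simultaneously.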
Considering a finite horizon game of n iterations, there will be 4^(n+2) possible n-step paths, where each path is determined by an n-step sequence of actions and an initial state. We can compute the average payoff per iteration for each of these paths, so we can find the Nash equilibria of any finite horizon game beginning in any initial state. It turns out that we only need a few iterations to see what happens: the Nash equilibrium strategy profiles for a low initial trust level, T0{A:B} + T0{B:A} < 3, lead to mutual discarding; if the initial trust level is high, T0{A:B} + T0{B:A} > 3, the Nash equilibrium strategy profiles lead to mutual cooperation; and in an intermediate initial state, with T0{A:B} + T0{B:A} = 3, the Nash equilibrium strategy profiles lead to a flipping behavior between mutual discarding and mutual forwarding, as shown in Fig. 6. This is an interesting result: while the finite iterated prisoner dilemma game always leads to mutual discarding, in the finite iterated forwarding game the long-run outcome depends on the initial trust level, which can be chosen rather arbitrarily. So, we must encourage the nodes to develop a high trust level as early as possible. These changes must also be observable in an infinitely iterated forwarding game, where we expect cooperation to emerge from many different initial states, leading to sustained cooperation in such a way that, if some node accidentally discards a packet, or if a forwarding decision


Table 4
Another payment structure for trust building in the forwarding game of Fig. 1.

Source node payoffs:
  Transmission status:  Successful = 5, Failed = 0

Intermediate node payoffs (by trust level T of the source node):
  Decision     T = 3   T = 2   T = 1   T = 0
  Cooperate    3       2       1       0.5
  Discard      0.5     1       2       3

Fig. 5. State transition diagram for the iterated game of Table 4, with memory m = 3.
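One round of the state dynamics of Fig. 5 can be sketched in a few lines of Python (our own illustration, under the paper's definitions): each node remembers its partner's last m = 3 decisions, trust is the count of forwards in that window, and payoffs follow C(T) = max(T, 1/2) and D(T) = max(3 − T, 1/2):

```python
from collections import deque

# Minimal sketch of one round of the iterated FD game of Fig. 5 with m = 3.
M = 3
def C(T): return max(T, 0.5)      # payoff for forwarding
def D(T): return max(3 - T, 0.5)  # payoff for discarding

def play_round(mem_a, mem_b, a_forwards, b_forwards):
    """mem_a: A's memory of B's decisions; mem_b: B's memory of A's decisions.
    Returns (payoff_a, payoff_b) and updates both memories in place."""
    t_ab = sum(mem_a)             # trust A has on B
    t_ba = sum(mem_b)             # trust B has on A
    pay_a = C(t_ab) if a_forwards else D(t_ab)
    pay_b = C(t_ba) if b_forwards else D(t_ba)
    mem_a.append(1 if b_forwards else 0)   # A observes B's decision
    mem_b.append(1 if a_forwards else 0)   # B observes A's decision
    return pay_a, pay_b

# From the lowest-trust state (0, 0), three rounds of mutual forwarding drive
# the state along (0,0) -> (1,1) -> (2,2) -> (3,3):
mem_a = deque([0] * M, maxlen=M)
mem_b = deque([0] * M, maxlen=M)
for _ in range(3):
    play_round(mem_a, mem_b, True, True)
assert (sum(mem_a), sum(mem_b)) == (3, 3)
```

From state (3, 3), mutual forwarding then yields C(3) = 3 per node per round, matching the (3.00, 3.00)-style limit-cycle payoffs discussed below.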

is not perfectly perceived, the network will recover easily. To verify this behavior in the iterated FD game, we repeat the previous analysis. In order to find the long-run Nash equilibria, we use the average payoff per iteration during the limit cycle of each combination of initial state (is) and strategy profile (sp), pL(is, sp) = [pAL(is, sp), pBL(is, sp)]. Fig. 7 shows the corresponding table for the initial state (0, 0), which is the worst case. The four bits in the first row of each column are the strategy of node B, and the four bits in the first column of each row correspond to the strategy of node A. Each element of the array is the average payoff per iteration, (pA, pB), in the limit cycle. Bold entries are easily verified to be Nash equilibria of this game. While the prisoner dilemma game does not offer any alternative to the single equilibrium decision profile (discard, discard), now we have 25 additional Nash equilibria, all of them leading to sustained cooperation. Consider, for example, the equilibrium strategy profile (1011, 0011), where B plays a kind of tit-for-tat game. This

is a Nash equilibrium in our forwarding game, but it is not in the prisoner dilemma game. Indeed, in the prisoner dilemma the best response of B to the strategy 1011 of A is just 0000, and hence node A would be the sucker. On the other hand, the best response of A to the strategy 0011 of B is just 0011, which leads to cooperation only if the initial trust state is (2, 2) or higher, which is not our case. However, with our payment structure, node A will take the initiative to forward the first two packets from B, even when B discards the first two packets from A. Eventually, node B will begin to cooperate and the nice behavior of A will be highly rewarded. Even more important, as shown in Fig. 8, this strategy profile is highly robust to perception errors, as compared to Fig. 4. The perception error we have used to test the robustness of the two game models above tries to capture the effects of the many uncertainties that a MANET exhibits. Indeed, each node must monitor its neighbors in order to learn about their behavior and, thus, learn or refine their


Fig. 6. After a few iterations, the Nash equilibrium strategy profiles in a finite iteration forwarding game lead to mutual discarding, mutual cooperation or a flipping behavior, according to the initial state.

Fig. 7. Long run average payoff per iteration under different strategy profiles for the initial state (0, 0). [The figure is a 16 × 16 table: rows are the strategies of node A and columns the strategies of node B, each labeled 0000 through 1111; each entry is the limit-cycle average payoff pair (pA, pB), with the Nash equilibria marked in bold.]



Fig. 8. Average cooperation under the (1011, 0011) strategy profile, as a function of the fraction of erroneous perceptions.

trust levels. However, the observer does not know the trust level that the observed node has on the source node, so it cannot qualify the observed decision. Under these circumstances, the prisoner dilemma model would not achieve the cooperation necessary for the network to be useful. Because of this, the next section applies the second model to a network composed of many nodes; that is, the model in which it is more rewarding to keep the trust of a reliable node than to save energy by discarding its packets, and also more rewarding to save energy by discarding the packets of unreliable nodes than to keep their trust. The general idea is that, even when the network environment could discourage cooperation, those rational nodes willing to cooperate will be able to identify other rational nodes and, finally, there will be a good cooperation ratio within groups of rational nodes.

4. Distributed trust model and evolution of cooperation

There are many additional issues to consider in a bigger MANET than those captured by the simple models of Section 3. To begin with, the game is not played between a pair of nodes that take turns playing as source; instead, it is more like a collective game played among all the nodes in the network. The game is played in such a way that a given node can be requested to forward several packets before it sends its first own packet, perhaps through a set of intermediate nodes it has not previously interacted with. This is so because, unlike the previous simple two-node scenario, in a general setting different nodes have different packet generation rates. Furthermore, for a packet to reach its intended destination, all the nodes in the path must decide simultaneously to cooperate. Clearly, if each node waits to play an iterated game with each of its network mates, the required information will take too long to be collected. Thus, nodes must monitor their neighbors in order to learn about their behavior and obtain or refine the corresponding trust value. However, the observer node does not know the trust value the observed node has on the source node, so it cannot qualify the observed decision. Under these circumstances, a simple PD game model will not achieve the cooperation level required for the network to offer a high throughput to its nodes, as we have verified in the simplified two-node case. Thus, we will use the second game model, FD, where keeping the trust of highly reliable nodes can be more rewarding than saving energy by discarding their packets. Next we describe the game model, the trust evaluation mechanism based on direct observations, and the strategy coding, which is based on the recent behavior of both the source node and the network as a whole. Later in this section we describe the strategy evolution process, which is based on a cellular genetic algorithm enhanced with some bacterial plasmid migration heuristics. The result of all these elements is the DECADE trust model.

4.1. The basic game model in DECADE

Extending the results of Section 3, each intermediate node uses a strategy to decide whether to forward or discard a packet in transit. This strategy depends on (1) the trust level that the intermediate node has on the source node, and (2) the recent behavior of the network as a whole when the deciding intermediate node has acted as a source. Indeed, deciding to forward a packet when it is almost certain that some node ahead in the path will drop it is good neither for the intermediate node nor for the network, regardless of the trust the intermediate node has on the source node. However, as we already noticed, it can be a good idea to begin with an initial period of indiscriminate cooperation in order to establish good initial trust relationships, and to force an initial state that leads to sustained cooperation. Thus, the proposed game model works as follows. Since each node observes the decisions taken by its neighbors and by all nodes preceding it in a path, the trust on each observed node can be computed as the number of forwarding decisions among the last m observations. More precisely, if an observer node A has made n observations of an observed node B, where the decision taken by node B in the ith observation is dA:B(i), the trust level that node A has on node B is the one shown in Eq. (3):

0 ≤ Tn{A:B} = Σ_{t=0}^{m−1} dA:B(n − t) ≤ m    (3)
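Eq. (3) is a sliding-window count, which a fixed-length queue implements naturally. A minimal Python sketch (our illustration; names are hypothetical):

```python
from collections import deque

# Sketch of the trust computation of Eq. (3): the trust node A has on node B
# is the number of forwarding decisions among B's last m observed decisions.
M = 3

class TrustRecord:
    def __init__(self, m=M):
        self.window = deque(maxlen=m)   # last m observed decisions (1 = forward)

    def observe(self, forwarded: bool):
        self.window.append(1 if forwarded else 0)

    def trust(self) -> int:
        return sum(self.window)         # 0 <= T <= m by construction

t = TrustRecord()
for d in (True, False, True, True):     # the oldest observation falls out of the window
    t.observe(d)
assert t.trust() == 2                   # window now holds (F, T, T)
```

The bounded deque automatically discards observations older than m, which is exactly the limited-memory behavior motivated in the next paragraph.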

We use a limited memory, m, because the nodes continuously change their strategies in order to adapt their behavior to environmental changes, so it would be unfair to keep a long record of observed decisions if they were taken under a previous strategy, different from the current one. The value of the memory length, m, obeys a trade-off: we would like m to be large enough to obtain a fair evaluation of the forwarding rate, but we would also like m to be small enough to ensure that the forwarding rate actually corresponds to the current strategies. We would also like m to be small in order to keep a small decision table, since the strategy search space grows exponentially with m (as 2^(m+1)). In [4], we showed by simulating different scenarios


that m = 3 provides an appropriate trade-off between adaptability and optimality of cooperation strategies. This allows us to use the payoff table of Section 3.2.2. On the other hand, because of the nature of the problem, and despite the cooperation-encouraging payoff structure, there are still some dilemmas for the nodes to face. A node must be very careful about discarding a packet because many nodes are watching, so it may lose their trust; but it cannot cooperate indiscriminately because it could become a sucker. So, recognizing that it is highly convenient to discard packets from selfish sources, rational nodes must develop some tolerance, accepting some fraction of discarding decisions from their network mates. This tolerance must depend on the general environment of the network: if there are no selfish nodes, discarding should be unacceptable; with a few selfish nodes, rational nodes must be willing to accept a small discarding rate from their neighbors; and as the number of selfish nodes increases, the tolerance should also increase, up to the point where the whole purpose of a nice cooperating behavior is lost. Correspondingly, the strategy encoding must consider different possible behaviors, allowing the possibility to discard or forward according to the changing conditions of the network. In particular, each node considers how useful the network has been for the effective transmission of its own last k packets. The decision will be based on what the network did with each of these transmitted packets. We consider k = 2 in order to keep the model simple, while still obtaining a good estimate of the general behavior of the network. The strategy that a node follows when acting as an intermediate node is encoded by a string of bits that represent the decision of discarding (D = 0) or cooperating (C = 1).
The strategy depends on the trust level that the node has in the source node, which belongs to the set {0, 1, 2, 3}, and on the transmission status of the two previous packets that the node generated as source, each of which can be either a success (S = 1) or a failure (F = 0). The resulting strategy has 16 bits, as shown in the example strategy of Table 5. To initially encourage the building of good trust relations, when a rational node has transmitted fewer than two packets, the decision made as intermediate node is always to forward, regardless of the trust level it has on the current source. Finally, and again according to the discussion above, when the source node is unknown, it is treated as a node with high trust. Indeed, in order not to let the first encounter be too definitive, each node assumes that an unknown node has an observed history of one discarding decision followed by two cooperating decisions, so the trust on it is 2 and the first encounter can only increase it to 3 or decrease it to 1. Each node participates in repeated games, where the decision to cooperate or discard obeys its current strategy. A game consists of the successful or failed transmission

of a packet. Whenever a node is ready to send a packet, it chooses the most trusted path to the destination, i.e., the one that maximizes the probability that the packet reaches its intended destination. Then, the source node sends the packet on this path and each intermediate node decides whether to forward it or to discard it, according to its own strategy. The game ends either when the packet is delivered to its destination, or when an intermediate node decides to discard the packet. Once the game has finished, each participant receives a payoff, according to Table 4. The trust of a path is obtained as the product of the normalized trust the source node has on each of the intermediate nodes. More specifically, the normalized trust that a source node Ns has in an intermediate node Ni, Tn{Ns:Ni}/m, is interpreted as the probability that the intermediate node forwards the requested packet. Assuming independence among nodes, the probability that a path delivers the packet to its destination is the product of the individual probabilities of the intermediate nodes constituting the path. The path with the highest successful transmission probability is the most trusted one, which can be found through a shortest-path routing algorithm under the distance metric Dn(Nj:Ni) = −log(Tn{Ns:Ni}/m), where Dn(j:i) is the distance from node j to its adjacent node i at time n, for every node j. Given this game theoretic network trust model, we need to find an optimal strategy for each node, so that the network as a whole maximizes both the cooperation among rational nodes and the isolation of selfish nodes. This will be done through a genetic algorithm that evolves constantly to keep pace with the dynamic changes within the network.
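The −log transformation turns the product of normalized trusts into a sum, so any shortest-path algorithm finds the most trusted path. A Python sketch using Dijkstra's algorithm (our illustration; the graph and trust values are hypothetical):

```python
import heapq
import math

# Most-trusted-path selection: with edge weight D = -log(T/m), a shortest-path
# search maximizes the product of normalized trusts (the delivery probability).

def most_trusted_path(graph, trust, src, dst, m=3):
    """graph: node -> iterable of neighbors; trust[n]: source's trust on node n."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float('inf')):
            continue
        for v in graph[u]:
            # cost of traversing intermediate node v (the destination costs nothing)
            p = 1.0 if v == dst else trust[v] / m
            if p <= 0:
                continue                      # never route through zero-trust nodes
            nd = d - math.log(p)
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return path[::-1]

graph = {'S': ['A', 'B'], 'A': ['D'], 'B': ['D'], 'D': []}
trust = {'A': 3, 'B': 1}                      # S trusts A more than B
assert most_trusted_path(graph, trust, 'S', 'D') == ['S', 'A', 'D']
```

Since −log(T/m) ≥ 0 for T ≤ m, all edge weights are non-negative, which is exactly the condition Dijkstra's algorithm requires.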
The algorithm proposed uses a distributed genetic algorithm of the cellular type with plasmid migration heuristics, which not only gives good results in term of the optimality of the converged solutions, but also exhibits good adaptability to changing conditions. In our case, each node tries to maximize its fitness from local measurements and individual decisions, interacting only with their adjacent neighbors and with the nodes that constitute the paths it belongs to. When a node had played a hundred games as the source of packets, we say that a Plasmid Migration Period (PMP) has been completed, so the node exchanges genetic information with its neighbors in order to evolve its strategy. As in a classical cellular genetic algorithm, each node receives the genetic information from all its one-hop neighbors, selects randomly two of them with a probability of being selected proportional

Table 5
Strategy coding, example strategy 0001 0011 0101 0111.

  Source trust level       0 0 0 0   1 1 1 1   2 2 2 2   3 3 3 3
  Transmission status −2   F F S S   F F S S   F F S S   F F S S
  Transmission status −1   F S F S   F S F S   F S F S   F S F S
  Current decision         D D D C   D D C C   D C D C   D C C C
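The 16-bit strategy of Table 5 is simply a lookup table indexed by the trust level and the last two transmission statuses. A Python sketch of the decoding (our illustration; the function name is hypothetical):

```python
# Sketch of the 16-bit strategy lookup of Table 5: the decision is indexed by
# the trust level on the source (0..3) and the status (S = success, F = failure)
# of the node's own last two transmitted packets.

def decide(strategy: str, trust: int, status_2: str, status_1: str) -> str:
    """strategy: 16 chars of '0' (discard) / '1' (forward), ordered as in Table 5."""
    bits = strategy.replace(' ', '')
    idx = trust * 4 + (status_2 == 'S') * 2 + (status_1 == 'S') * 1
    return 'C' if bits[idx] == '1' else 'D'

example = '0001 0011 0101 0111'             # the example strategy of Table 5
assert decide(example, 0, 'S', 'S') == 'C'  # low trust: forward only after two successes
assert decide(example, 3, 'F', 'F') == 'D'  # even high trust: discard after two failures
assert decide(example, 2, 'F', 'S') == 'C'
```

Each group of four bits covers one trust level, with the two status bits selecting the position within the group, matching the column order of Table 5.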

1390

M. Mejia et al. / Ad Hoc Networks 10 (2012) 1379–1398

to their fitness and, through the classical one-point crossover and mutation processes, combines them to construct a new strategy. This classical cellular mechanism is enhanced with a bacterial plasmid migration concept, where two heuristics are added. First, each node can accept or reject the new strategy depending on whether the reported fitness is greater or smaller than its own fitness. Second, each node can keep a copy of its best previous strategy so that, if during the current plasmid migration period the new strategy did not increase the fitness, the old strategy can be restored. An important heuristic in this evolution process is that, since each node keeps a record of its best strategy so far (plasmid genes instead of chromosomal genes), a node can replace the current strategy with the stored one just before any strategy exchange among neighbors takes place. This heuristic enhances the exploratory capacity of the evolution process. Further details of the algorithm and its pseudo-code can be found in [4]. As a final remark, notice that DECADE relies on feedback about which packets were successfully delivered or dropped by an intermediate node. This feedback can be received either through an extra end-to-end network layer message, or by exploiting properties of transport layers, such as TCP with SACK [36]; this feedback approach is somewhat similar to that used in IPv6 for Neighbor Unreachability Detection [37]. End-to-end feedback is also used in other ad hoc routing protocols like ARIADNE [38,39], which is a secured version of the standard DSR (Dynamic Source Routing) [3], or RDSR-V [40], which is a reputation routing protocol, also based on DSR, for video streaming. Of course, such notifications should be sent along the reversed path.
This back propagation of discarding and cooperation information, along with the undiscriminating monitoring of neighbor nodes and the limited memory selected for adaptability purposes, makes the differences in packet generation rates play no role in the performance of DECADE, unless the network has a really low usage: as long as the network is used, all nodes monitor their neighbors and will have enough history to use and evolve their strategies.
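One plasmid-migration step, as described above, can be sketched as follows. This is our own rough illustration of the mechanism (fitness-proportional selection among one-hop neighbors, one-point crossover, bit-flip mutation, and the accept/reject heuristic); the function names and parameter values are hypothetical, and further heuristics from [4] are omitted:

```python
import random

# Sketch of one plasmid-migration step of the distributed cellular GA.
STRATEGY_LEN = 16

def evolve_step(own, neighbors, mutation_rate=0.01):
    """own: (strategy, fitness); neighbors: list of (strategy, fitness) pairs."""
    # fitness-proportional selection of two parents among one-hop neighbors
    parents = random.choices([s for s, _ in neighbors],
                             weights=[f + 1e-9 for _, f in neighbors], k=2)
    # classical one-point crossover
    point = random.randrange(1, STRATEGY_LEN)
    child = parents[0][:point] + parents[1][point:]
    # bit-flip mutation
    child = ''.join(b if random.random() > mutation_rate else str(1 - int(b))
                    for b in child)
    # plasmid heuristic: adopt the child only if some neighbor reported a
    # higher fitness than our own; otherwise keep the current strategy
    best_neighbor_fitness = max(f for _, f in neighbors)
    return child if best_neighbor_fitness > own[1] else own[0]

random.seed(1)
new = evolve_step(('0000000000000000', 2.0),
                  [('1111111111111111', 5.0), ('0001001101010111', 4.0)])
assert len(new) == STRATEGY_LEN
```

In the full algorithm each node would additionally keep its best strategy so far and restore it if the new one does not improve fitness over the next PMP.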

5. Enhancing DECADE for adaptability to mobility conditions

In the following subsections we examine the performance of DECADE under different mobility conditions. We study the system under high mobility, no mobility, and a realistic mobility model. We will observe that the lack of mobility affects the adaptability of the strategies to environmental changes so, to enhance this aspect, we will add a sociability parameter that improves performance. Before discussing the results, we describe the assumptions made to evaluate DECADE's performance. In the first place, we consider a network composed of N = 100 mobile nodes, NR of which are rational nodes and NS = N − NR of which are selfish nodes. Each packet is sent through a selected path. The possible paths depend on the mobility model. If the packet arrives at its destination, this

is considered a successful game; otherwise it is a failed game. In any case, the participating nodes receive their corresponding payoff, according to Table 4. For the simulation, the nodes take turns to transmit a packet to a random destination and, after each node has originated a hundred packets, we say that a Plasmid Migration Period (PMP) has been completed. At each PMP, every node exchanges genetic information with its one-hop neighbors, in order to run a single iteration of the distributed cellular/bacterial genetic algorithm, as described in Section 4.2. At this point, we need to define how we will measure the cooperation within the network. In Section 3 we used the fraction of packets that arrived at their destinations as a measure of cooperation, because this quantity was exactly the fraction of cooperation decisions taken by the nodes. However, with arbitrarily longer paths, the fraction of cooperation decisions is not as easy to compute since, in a failed transmission, many nodes could have cooperated but one of them did not. Nevertheless, the fraction of delivered packets still seems to be a good indirect measure of the global cooperation behavior within the network. Of course, cooperation is an individual behavior and, as such, should be measured individually. However, the global cooperative behavior of the nodes must impact the general performance of the network, so there can be a global measure of the effectiveness of individual cooperation decisions. In particular, since the lack of cooperation can be directly observed as a general reduction of the global network throughput, we measure cooperation as the fraction of packets that arrived at their destinations among those packet transmission attempts that required at least one cooperation decision.
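The cooperation measure just defined (formalized below as c = Pr[A|B ∩ C]) can be estimated from simulation logs as a conditional frequency. A Python sketch (our illustration; the record format is hypothetical):

```python
# Sketch of the cooperation measure: among transmission attempts where source
# and destination are NOT adjacent and at least one multi-hop path exists,
# report the fraction of packets that were actually delivered.

def cooperation(attempts):
    """attempts: iterable of dicts with boolean keys
    'delivered', 'path_exists', 'adjacent'."""
    eligible = [a for a in attempts
                if a['path_exists'] and not a['adjacent']]
    if not eligible:
        return float('nan')
    return sum(a['delivered'] for a in eligible) / len(eligible)

attempts = [
    {'delivered': True,  'path_exists': True,  'adjacent': False},
    {'delivered': False, 'path_exists': True,  'adjacent': False},
    {'delivered': True,  'path_exists': True,  'adjacent': True},   # excluded: no forwarding needed
    {'delivered': False, 'path_exists': False, 'adjacent': False},  # excluded: network partition
]
assert cooperation(attempts) == 0.5
```

The two excluded records illustrate the conditioning: adjacent deliveries would inflate the figure without any forwarding decision, and partition losses would deflate it without any discarding decision.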
Notice that our measure of cooperation does not take into account packet transmission attempts between adjacent nodes, because this would increase the cooperation figure without any real forwarding decisions. In addition, we do not consider packet transmission attempts that fail to arrive at the destination because the route is broken due to a temporary partition of the network, since this would decrease the cooperation figure in a situation in which there are no discarding decisions. More formally, defining the events A = "a packet arrives at its destination", B = "there exists at least one multi-hop path between source and destination nodes" and C = "source and destination nodes are not adjacent", our measure of cooperation is c = Pr[A|B ∩ C]. Notice that this is not exactly a measure of normalized throughput, which could be defined as Pr[A], although there are scenarios in which the events B or C are sure events, as in the theoretically analyzed two-node game of Section 3.

5.1. High mobility scenario

The high mobility conditions were already studied in [4], so here we only review them briefly. In this case the mobility model considers that, when a node wants to transmit a packet, it gets a random number of possible paths, each composed of a random number of randomly selected nodes. The probability distributions of these variables can be consulted in [4]. Clearly, this model provides extremely high mobility. Then, the source node selects the most


trusted path according to the normalized trust on each intermediate node, as described in Eq. (4):

Path = argmax_{i=1,…,r} ∏_{j=1}^{h_i} (Tn(s:Ni,j)/m)    (4)

where r is the number of paths, hi is the number of intermediate nodes in the ith path, s is the source node, Ni,j is the jth intermediate node of the ith path, and Tn(s:Ni,j)/m is the normalized trust that node s has on node Ni,j at the current time n. According to this setup, Fig. 9a shows the cooperation among rational nodes and Fig. 9b shows the cooperation with selfish nodes, as a function of time. Four graphs are plotted for four different percentages of selfish nodes: 0%, 20%, 50%, and 60%. The horizontal axis represents the time in PMPs. The vertical axis represents the fraction of packets that arrived at their destinations. In this scenario, since all packets always find some multi-hop paths, the cooperation is exactly the normalized throughput of the network. It can be observed that the evolved strategies achieve a relatively high cooperation among rational nodes (given the fraction of selfish nodes), and also effectively isolate the selfish nodes by reducing their delivered packet ratio to a negligible value. Additionally, we can notice the fast convergence of the system, as the network takes between 5 and 20 PMPs to reach the convergent cooperation values, depending on the environment. Furthermore, we can compare the cooperation value achieved among rational nodes after convergence with the theoretical maximum achievable cooperation value, as shown in Fig. 10. The horizontal axis of this figure represents the fraction of selfish nodes (the environment of the network), and the vertical axis represents the maximum achievable throughput, both theoretical (solid line) and after convergence (dashed line). Clearly, the proposed game model and evolution algorithm achieve optimality in this scenario, i.e., the optimum throughput is obtained without wasting energy in forwarding packets of

0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0

0.4

0.6

0.8

selfish nodes. The theoretical optimum achievable cooperation of Fig. 10 is obtained by assuming that each node knows the rational or selfish condition of each other node, so that rational nodes always cooperate among them, and only among them. Finally, under this scenario the basic model has a remarkable adaptability, as shown in Fig. 11, where steep environmental changes occur every 200 PMPs, changing the fraction of rational nodes from 30% to 90% and back. With each change, rational nodes adapt their strategies correspondingly, in less than 5 PMPs, achieving the optimal cooperation values obtained under a fixed fraction of rational nodes. The encouraging results of Fig. 11 (and some more that can be found in [4]) were obtained under the highly dynamic neighbor and path selection algorithms. This implies a high number of interactions among nodes, i.e., each node has a good chance to know the forwarding behavior of each other node. However, in practice, there could be other mobility patters, where the chance to learn might be heavily reduced. Next, we examine these situations.

(b) Evolution of cooperation with selfish nodes 0.1

0% selfish nodes 20% selfish nodes 50% selfish nodes 60% selfish nodes

0.8

20% selfish nodes 50% selfish nodes 60% selfish nodes

0.09 0.08 0.07

Cooperation

0.7 0.6 0.5 0.4

0.06 0.05 0.04

0.3

0.03

0.2

0.02

0.1

0.01 0

10

20

30

40

50

Plasmid migration period

60

1

Fig. 10. Maximum achievable cooperation.

1 0.9

Cooperation

0.2

Fraction of selfish nodes

(a) Evolution of cooperation with rational nodes

0

Optimum Basic model

0.9

Cooperation



Path ¼ argmaxi¼1...r

1

0

0

10

20

30

40

50

Plasmid migration period

Fig. 9. Evolution of Cooperation under four different environments.

60
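The path-selection rule of Eq. (4) can be sketched in a few lines. This is an illustrative reimplementation under assumed data structures (a path as a list of intermediate node identifiers, a dictionary of trust values), not the paper's code:

```python
def most_trusted_path(paths, trust, m):
    """Route selection of Eq. (4): among the r candidate paths, pick
    the one maximizing the product of normalized trust values
    T_n(s, N_ij)/m over its intermediate nodes.

    `paths`: list of paths, each a list of intermediate node ids.
    `trust`: maps a node id to the trust T_n the source holds on it.
    `m`: trust normalization constant.
    Illustrative sketch only; names are ours, not the paper's.
    """
    def score(path):
        prod = 1.0
        for node in path:
            prod *= trust[node] / m  # normalized trust per hop
        return prod
    return max(paths, key=score)
```

A longer path can win if its per-hop trust is high enough, since the score is the product of normalized trusts rather than the hop count.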



Fig. 11. Adaptation to environmental changes: fraction of rational nodes, cooperation among rational nodes, and cooperation with selfish nodes, vs. plasmid migration period.

5.2. No mobility scenario

Now, we consider a scenario with no mobility. One hundred nodes occupy fixed positions in a hexagonal pattern, so that each node can hear/talk to six adjacent neighbors. The unit of distance is the radio transmission range, identical for each node and equal to the distance between every pair of adjacent nodes. We numbered the nodes from 0 to 99, with a random assignment, and show how the basic system behaves when the first N_R nodes are rational nodes and the last N_S = N − N_R are selfish nodes, where N = 100 is the total number of nodes. Fig. 12 shows the evolution of cooperation for rational and selfish nodes under this no mobility scenario. Although the cooperation values are not as good as those depicted in Fig. 9 under the high mobility scenario, this basic model evolves quickly to good cooperation values among rational nodes, achieving almost complete isolation of selfish nodes. The lack of mobility does not significantly compromise the optimality of the convergence cooperation values, as shown in Fig. 13. In Fig. 13, the optimal cooperation was obtained by simulating DECADE in an environment in which all nodes are perfectly identified as rational or selfish, in such a way that rational nodes cooperate only among themselves. The cooperation is not monotonically decreasing, as might be expected, because of the fixed topology: when a rational node becomes selfish, the connectivity among the remaining rational nodes can change drastically, with a huge reduction in cooperation, or can remain the same, with a small increase in cooperation. The negative impact of the lack of mobility resides in the adaptability to a changing environment. Fig. 14 shows that, as the fraction of selfish nodes changes, the strategies cannot keep the best cooperation for each environment, unlike the previous high mobility scenario (see Fig. 11). For example, this lack of adaptability can be seen in Fig. 14, in which the maximum cooperation value achieved in this changing environment is a modest 0.75, quite low compared to the value of 0.95 that is achieved in a fixed environment with 90% rational nodes. Concluding, the lack of mobility reduces the possibility of building and updating precise reputation information within the network, and delays the propagation of good strategies. Next we will test DECADE under a realistic mobility model, in order to determine the effect of mobility on the interactivity among the nodes, which became an important requirement for DECADE.

5.3. Realistic mobility scenario

Having determined the importance of mobility for the performance of a trust model in a MANET, we tested different standard mobility models such as random waypoint [41,42], Gauss-Markov [43,42], Manhattan grid [43,42], and AVW [44]. For each of them we could draw similar conclusions, so here we only show the AVW mobility model, which is useful for pedestrian nodes in an office building, a shopping mall, an amusement park, etc. The model consists of a U-shaped office building with 47 offices in a 60 × 48 m² area, as shown in Fig. 15. There are 100 nodes moving

Fig. 12. Evolution of cooperation under four different environments for the fixed topology scenario: (a) evolution of cooperation with rational nodes; (b) evolution of cooperation with selfish nodes, vs. plasmid migration period.



Fig. 13. Performance of the basic model under the fixed positions scenario: cooperation vs. fraction of selfish nodes (optimum and basic model).

Fig. 15. Network topology when each node is at some office.

Fig. 14. Adaptation to changing environment in the fixed topology scenario: fraction of rational nodes, cooperation among rational nodes, and cooperation with selfish nodes, vs. plasmid migration period.

independently from one office to another along the shortest path on the hallways, according to the following procedure:

- Each node randomly selects an initial office and starts there at the beginning of the simulation.
- The node stays visiting the current office for T_V seconds, where T_V is a random variable uniformly distributed in the range [10, 40] seconds.
- The node then chooses a destination office uniformly among the other 46 offices and moves to it using the shortest path along the hallways.
- Nodes move with a constant velocity v = 1 m/s, so the traveling time T_M depends only on the origin and destination offices.
- The whole procedure is repeated from the second step.

At any instant, a bidirectional link exists between two nodes n1 and n2 if they are closer than 10 m, so the topology changes dynamically over time. After a warm-up period of 1000 s, during which the nodes move along the building according to the previous rules, the network begins its operation. This is an interesting mobility model because of several features likely to be found in real environments. For example, some regions are less populated than others. Some hallways are highly used by nodes in transit, while others are seldom used. The movements have drastic turns, with well-defined minimum and maximum times between turns given by the geometry of the mobility plane. The U-shape introduces a natural obstacle that must be surmounted by multi-hop routes. Finally, although we obtained a fairly connected network, there are also several periods of network fragmentation that challenge the routing algorithms. Running the trust model and its evolution algorithm with the AVW mobility model, we obtained again almost optimal cooperation values among rational nodes and a relatively good isolation of selfish nodes when the environment (number of selfish nodes) is kept constant, as shown in Fig. 16. Fig. 16a shows the cooperation for rational nodes and Fig. 16b shows the cooperation for selfish nodes, as a function of time, where time is measured in plasmid migration periods. There are four plots in each graph, corresponding to different percentages of selfish nodes, as reported in the legend. The vertical axis represents the fraction of packets that arrived at their destinations among those that required and found a multi-hop path. Because of the network partitioning events, convergence took longer under this scenario than under the previous two scenarios. Despite this, it took less than 70 PMPs for the cooperation to converge to stable values, which are close to the optimal values, as shown in Fig. 17 (to be compared with Figs. 10 and 13). Although the basic system can achieve good cooperation values, it is again adaptability that causes trouble for the basic system, as shown in Fig. 18.
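The office-visit mobility loop described at the beginning of this subsection can be sketched as an event generator. This is our own simplified sketch: offices are abstract identifiers, and hallway shortest-path lengths come from a user-supplied `path_length` function (a toy default is used here), neither of which is specified by the paper:

```python
import random

class OfficeMobilityNode:
    """Sketch of the AVW-style office-visit mobility loop."""

    def __init__(self, num_offices=47, velocity=1.0, path_length=None):
        self.num_offices = num_offices
        self.velocity = velocity  # m/s, constant for every node
        # Hypothetical helper: hallway shortest-path length in meters.
        self.path_length = path_length or (lambda a, b: 5.0 * abs(a - b))
        # Each node selects a random initial office.
        self.office = random.randrange(num_offices)

    def events(self):
        """Yield (kind, office, duration) triples: a visit, then a
        move to a new office, repeated forever."""
        while True:
            # Visit time T_V ~ Uniform[10, 40] seconds.
            yield ("visit", self.office, random.uniform(10, 40))
            # Destination chosen uniformly among the other offices.
            dest = random.choice(
                [o for o in range(self.num_offices) if o != self.office])
            # Constant speed, so T_M depends only on origin/destination.
            yield ("move", dest,
                   self.path_length(self.office, dest) / self.velocity)
            self.office = dest
```

A simulator would advance each node's generator, update positions during "move" events, and recompute the 10 m adjacency after every step.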
The optimal cooperation values in environments with 90 and 30 rational nodes are 0.98 and 0.36, respectively, and the achievable cooperation after evolution in a fixed environment is 0.91 and 0.28 in each case, as observed in Fig. 17. However, from Fig. 18, after the environment changes from 30 to 90 rational nodes,



Fig. 16. Evolution of cooperation under four different environments for the AVW mobility scenario: (left) cooperation among rational nodes; (right) cooperation with selfish nodes, vs. plasmid migration period.

Fig. 17. Performance of the basic model under the AVW mobility scenario: cooperation vs. fraction of selfish nodes (basic model rational, basic model selfish, optimum rational).

and from 90 to 30 again, the network only achieves a fraction of 0.64 and 0.12 delivered packets, respectively, which is far below what it can achieve. The problem can be attributed to the low interactivity among nodes. Fig. 19 shows the average number of intermediate nodes with which an average source node has interacted, as a function of time, where time is measured by the number of generated packets. In the high mobility scenario of Section 4, a source node has already interacted with 96 intermediate nodes during a single plasmid migration period. This high interactivity not only implies an accurate gathering of reputation information, but also a fast propagation of good game strategies, explaining the outstanding performance shown above. In the scenario with no mobility of Section 5.2, the lack of interaction among nodes due to their fixed positions reduces the possibility of building accurate reputation

Fig. 18. Adaptation to changing environment in different scenarios: fraction of rational nodes, cooperation among rational nodes, and cooperation with selfish nodes, vs. plasmid migration period.

Fig. 19. Interactivity between a source node and intermediate nodes in the AVW mobility scenario: average number of intermediate nodes a source node has interacted with, vs. number of transmitted packets (high mobility, AVW mobility, no mobility).


information within the network and slows down the propagation of good strategies. Indeed, as shown in Fig. 19, there is an upper bound of 72 intermediate nodes with which an average source node can interact. In the AVW mobility model, the interactivity is not as low as in the fixed topology scenario, but it is still far lower than in the high mobility model we used initially. Why is it the case that the low mobility models achieve an almost optimum cooperation level under a fixed environment, but not under a changing number of selfish nodes? Certainly, in a fixed environment every game, every interaction and every information exchange reinforces the learning of that fixed environment. Instead, in a dynamic environment, source nodes have already settled on some preferred nodes to interact with by the time the environment changes. In the extremely high mobility scenario of Section 4, the high interactivity among nodes was an important part of the outstanding performance of DECADE. It worked in two ways: first, every node could have direct knowledge of the forwarding behavior of every other node, and this knowledge could be updated very often; second, at each plasmid migration period, each node had the opportunity to share its genetic information with new neighbors, accelerating the spread of good strategies. In the low mobility scenarios, the nodes monitor the behavior of fewer neighbors, and the genetic information is spread mostly through neighbor superposition. So, once the nodes have learned some good strategy for a given environment, a change in this environment will not be completely detected, and the variety of genetic information available to reformulate the strategy will be too poor. Clearly, interactivity is an important requirement for a network that implements an evolutionary trust model like DECADE.
In low mobility scenarios, there is a loss in the learning capacity of the nodes, since changes in the environment are not accompanied by further exploration of new nodes or new strategies. For this reason, in the next section we extend the basic DECADE model with a simple heuristic designed to keep the nodes willing to learn.

5.4. Sociability parameter

From the previous discussion, it can be deduced that DECADE still lacks a heuristic mechanism to encourage nodes to explore the network in low mobility scenarios. Indeed, the routing algorithm, which is the client of our trust model, reduces the freedom of the nodes to "make new friends". This compromises the adaptability of DECADE to changing environments. To alleviate this situation, we introduce a sociability parameter within the routing metric, as described next. As previously mentioned, for a given source-destination pair (s, d), there exists a set of possible paths, P(s, d). Let f_p(i; s) be the forwarding rate that the source node s has seen from the ith intermediate node in the pth path of P(s, d). So far, we have chosen the path that maximizes the delivery probability of a packet, p*, as in Eq. (5):

p* = argmax_{p ∈ P(s,d)} ∏_{i=1}^{n_p} f_p(i; s) = argmax_{p ∈ P(s,d)} log( ∏_{i=1}^{n_p} f_p(i; s) ) = argmax_{p ∈ P(s,d)} Σ_{i=1}^{n_p} log(f_p(i; s)) = argmin_{p ∈ P(s,d)} Σ_{i=1}^{n_p} −log(f_p(i; s))    (5)

where n_p is the number of intermediate nodes in the pth path. That is, looking for the shortest path under the metric D(j, i) = −log(f(i; s)) for all j adjacent to i, we find the path that maximizes the probability of packet delivery. Our objective is to introduce a heuristic modification to the routing algorithm that gives each source node s an incentive to make new friends spontaneously. For this purpose, it will keep a counter R that registers the number of interactions with each intermediate node, i.e., R(i; s) is the number of times node s has asked node i to forward a packet on its behalf. In Eq. (6), we define a normalized interaction factor of source node s with the intermediate node i, w(i; s):

0 ≤ w(i; s) = R(i; s) / max_{j=0..N} R(j; s) ≤ 1    (6)

By modifying the shortest-path metric into the expression of Eq. (7), we can control the degree of exploration that a node exerts when looking for a path, where S is a sociability parameter:

D(i; s) = −log[ f(i; s) · (1 − S · w(i; s)) ]    (7)

Notice that when S = 0 we have the situation in which the most trusted path is selected, that is, the path with the best forwarding rates. On the other hand, when S = 1 the source node simply tends to avoid the nodes with which it has previously interacted. This way, it is possible to give source nodes an incentive to socialize, increasing the opportunities to interact with new nodes. Clearly, this sociability parameter obeys the exploitation/exploration trade-off [30]. It is in the interest of a node to use its most trusted peers to route its packets, in order to maximize what it can obtain from its current knowledge about the network. However, it is also in its interest to get a better knowledge of the network in order to improve future utilities by meeting better partners, previously unknown. The parameter S becomes a knob to tune this trade-off. This parameter has an important impact on the performance of the general trust model for the whole range of scenarios we tested. For lack of space, we show here its effect on adaptability only under the AVW mobility scenario. Fig. 20 shows the adaptation to a changing environment under three different sociability parameters: S = 0, 0.2, 0.8. The case S = 0 was already plotted in Fig. 18, but we include it here again for comparison purposes. The cases S = 0.2 and S = 0.8 show that, after having learned the appropriate strategy for a given environment, the nodes can do a more aggressive exploration of the solution space, leading to cooperation values closer to the maximum achievable ones. This is clear for S = 0.8 when the number of rational nodes increases from 30 to 90, where the achieved cooperation is exactly the maximum achievable cooperation, and the convergence speed has not been affected noticeably.
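The path selection of Eq. (5) and its sociability-weighted variant of Eqs. (6) and (7) can be sketched together. This is an illustrative reimplementation under assumed data structures (a path as a list of intermediate nodes, dictionaries for forwarding rates and interaction counters), not the paper's code:

```python
import math

def path_cost(path, f, R, S):
    """Additive cost of a path: sum of per-hop metrics
    D(i; s) = -log[f(i; s) * (1 - S * w(i; s))] from Eq. (7).
    With S = 0 this reduces to Eq. (5): the minimum-cost path is the
    one maximizing the product of forwarding rates f(i; s)."""
    r_max = max(R.values()) if R and max(R.values()) > 0 else 1
    cost = 0.0
    for i in path:
        w = R.get(i, 0) / r_max            # interaction factor, Eq. (6)
        eff = f[i] * (1.0 - S * w)         # sociability-discounted rate
        if eff <= 0:
            return math.inf                # node fully avoided
        cost += -math.log(eff)
    return cost

def select_path(paths, f, R, S=0.0):
    """Shortest path under the sociability-weighted metric."""
    return min(paths, key=lambda p: path_cost(p, f, R, S))
```

With S = 0 the most trusted path wins; raising S steers the source toward paths containing nodes it has rarely used, implementing the exploitation/exploration knob described above.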



Fig. 20. Adaptation to changing environment in the AVW mobility scenario, with different sociability parameters (S = 0.0, 0.2 and 0.8).

Finally, as a concluding experiment, we wanted to test the overall effect of DECADE with respect to the efficient use of throughput and energy resources. For this purpose, we varied the fraction of selfish nodes from 0% to 35% in the AVW scenario. We ran several simulations with and without DECADE, and measured the cooperation among rational nodes, the cooperation with selfish nodes and the number of overhead packets per delivered packet. The results are plotted in Fig. 21. Fig. 21a and b show how the trust mechanism keeps a better throughput for the rational nodes and almost completely isolates the selfish nodes. If the trust mechanism is omitted, a huge effort is wasted attending selfish nodes, with a significant reduction in the throughput of rational nodes. Of course, with no selfish nodes, DECADE introduces additional overhead for no benefit, as shown in Fig. 21c. However, as the number of selfish nodes increases, the benefits of DECADE become more and more evident.

Fig. 21. Effects of the trust mechanisms in the performance of the network: (a) cooperation among rational nodes; (b) cooperation with selfish nodes; (c) overhead packets per delivered packet, vs. percentage of selfish nodes, with and without the trust mechanism.

6. Conclusions

In this article we propose DECADE, a game theoretic trust model whose aim is to encourage cooperation in MANETs based on source routing. DECADE uses a non-cooperative game that considers trust to achieve both cooperation among rational nodes and isolation of selfish nodes. Furthermore, due to the distributed nature of the evolution algorithm and the trust evaluation mechanisms, cooperation emerges with a low overhead on computational and communication resources. Under high mobility, DECADE exhibits excellent performance in terms of optimality, speed of convergence and adaptability to changing environments. However, under less dynamic mobility models the exploration capacity of the system is reduced. This is because, with low mobility, the scope of the information exchanged is reduced to small groups of nodes and, thus, it is not possible for a node to obtain global knowledge. The higher the mobility, the more naturally a node gets acquainted with its global environment. To alleviate this problem we introduced the sociability factor, whose main aim is to encourage the nodes to explore the network and obtain wider-scope information. This sociability parameter relaxes the requirement of choosing strictly the best path, in favor of using paths that include unknown nodes. It allowed the system to achieve, again, its best possible performance after an environmental change. By simulation, we have also demonstrated that DECADE is worthwhile when the fraction of selfish nodes is above a small threshold, saving valuable energy resources and keeping a high throughput for the rational nodes. As a concluding remark, we would like to mention that DECADE is based on complex systems engineering: the whole system develops cooperation as an emergent phenomenon, which appears as a consequence of individual decisions based on local observations. Each node wants to save its scarce energy by using it rationally, seeking the cooperation of intermediate nodes to deliver their

own packets. As a consequence of the local interactions among nodes, the global cooperative behavior arises, with the reported performance benefits. In particular, this complex systems engineering method is responsible for the adaptability properties of our model, through its fast convergence, one of the most challenging issues in related proposals in the literature.

Acknowledgements

This work was partly supported by the Colombian Institute for Science and Technology Development, Colciencias, Universidad Militar Nueva Granada and Universidad de los Andes, in Colombia, as well as the Spanish Government through projects CONSOLIDER INGENIO 2010 CSD2007-00004 "ARES" and TEC2011-26452 "SERVET", and by the Government of Catalonia under Grant 2009 SGR 1362 to consolidated research groups.

References

[1] C.E. Perkins, Ad Hoc Networking, vol. 1, Addison-Wesley Professional, 2001.
[2] K. Wrona, P. Mähönen, Analytical model of cooperation in ad hoc networks, Telecommunication Systems 27 (2004) 347–369.
[3] D.B. Johnson, D.A. Maltz, Y.C. Hu, The Dynamic Source Routing Protocol for Mobile Ad Hoc Networks (DSR), IETF MANET Working Group, Tech. Rep., February 2007.
[4] M. Mejia, N. Peña, J.L. Muñoz, O. Esparza, M. Alzate, A game theoretic trust model for on-line distributed evolution of cooperation in MANETs, Journal of Network and Computer Applications 34 (2011) 39–51.
[5] A. Boukerche, Algorithms and Protocols for Wireless, Mobile Ad Hoc Networks, Wiley-IEEE Press, 2008.
[6] G.F. Marias, P. Georgiadis, D. Flitzanis, K. Mandalas, Cooperation enforcement schemes for MANETs: a survey, Wireless Communications and Mobile Computing 6 (3) (2006) 319–332.
[7] Y.L. Sun, W. Yu, Z. Han, K. Liu, Information theoretic framework of trust modeling and evaluation for ad hoc networks, IEEE Journal on Selected Areas in Communications 24 (2) (2006) 305–317.
[8] S. Marti, H. Garcia-Molina, Taxonomy of trust: categorizing P2P reputation systems, Computer Networks 50 (4) (2006) 472–484.
[9] M. Mejia, N. Peña, J.L. Muñoz, O. Esparza, A review of trust modeling in ad hoc networks, Internet Research 19 (1) (2009) 88–104.
[10] S. Buchegger, J.-Y. Le Boudec, Performance analysis of the CONFIDANT protocol, in: MobiHoc '02: Proceedings of the 3rd ACM International Symposium on Mobile Ad Hoc Networking & Computing, ACM Press, New York, NY, USA, 2002, pp. 226–236.
[11] S. Buchegger, J.-Y. Le Boudec, A robust reputation system for P2P and mobile ad-hoc networks, in: Proceedings of the Second Workshop on the Economics of Peer-to-Peer Systems, 2004.
[12] P. Michiardi, R. Molva, CORE: a collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks, in: Proceedings of the IFIP TC6/TC11 Sixth Joint Working Conference on Communications and Multimedia Security, Kluwer, Deventer, The Netherlands, 2002, pp. 107–121.
[13] S. Bansal, M. Baker, Observation-based cooperation enforcement in ad hoc networks, Tech. Rep., 2003.
[14] H. Safa, H. Artail, D. Tabet, A cluster-based trust-aware routing protocol for mobile ad hoc networks, Wireless Networking 16 (4) (2010) 969–984.
[15] J. Luo, X. Liu, M. Fan, A trust model based on fuzzy recommendation for mobile ad-hoc networks, Computer Networks 53 (14) (2009) 2396–2407.
[16] F. Li, J. Wu, Uncertainty modeling and reduction in MANETs, IEEE Transactions on Mobile Computing 9 (7) (2010) 1035–1048.
[17] C. Zouridaki, B.L. Mark, M. Hejmo, R.K. Thomas, E-Hermes: a robust cooperative trust establishment scheme for mobile ad hoc networks, Ad Hoc Networks 7 (6) (2009) 1156–1168.
[18] R. Sherwood, S. Lee, B. Bhattacharjee, Cooperative peer groups in NICE, Computer Networks 50 (4) (2006) 523–544.
[19] M. Osborne, An Introduction to Game Theory, Oxford University Press, 2004.
[20] J. Baras, T. Jiang, Cooperation, Trust and Games in Wireless Networks, Springer, 2005, pp. 183–202 (Chapter 4).
[21] W. Saad, Z. Han, M. Debbah, A. Hjørungnes, A distributed coalition formation framework for fair user cooperation in wireless networks, IEEE Transactions on Wireless Communications 8 (9) (2009) 4580–4593.
[22] M. Felegyhazi, J.-P. Hubaux, L. Buttyan, Nash equilibria of packet forwarding strategies in wireless ad hoc networks, IEEE Transactions on Mobile Computing 5 (5) (2006) 463–476.
[23] M. Seredynski, P. Bouvry, Evolutionary game theoretical analysis of reputation-based packet forwarding in civilian mobile ad hoc networks, in: International Parallel and Distributed Processing Symposium, 2009, pp. 1–8.
[24] M. Seredynski, P. Bouvry, M. Klopotek, Modelling the evolution of cooperative behavior in ad hoc networks using a game based model, in: IEEE Symposium on Computational Intelligence and Games (CIG), April 2007, pp. 96–103.
[25] K. Komathy, P. Narayanasamy, Best neighbor strategy to enforce cooperation among selfish nodes in wireless ad hoc network, Computer Communications 30 (18) (2007) 3721–3735.
[26] K. Komathy, P. Narayanasamy, Trust-based evolutionary game model assisting AODV routing against selfishness, Journal of Network and Computer Applications 31 (4) (2008) 446–471.
[27] J.J. Jaramillo, R. Srikant, A game theory based reputation mechanism to incentivize cooperation in wireless ad hoc networks, Ad Hoc Networks 8 (4) (2010) 416–429.
[28] J. Dale, S. Park, Molecular Genetics of Bacteria, fourth ed., vol. 3, Wiley, 2004.
[29] I.W. Marshall, C. Roadknight, Adaptive management of an active service network, BT Technology Journal 18 (4) (2000) 78–84.
[30] E. Alba, B. Dorronsoro, Cellular Genetic Algorithms (Operations Research/Computer Science Interfaces Series), first ed., vol. 6, Springer, 2008.
[31] E. Alba, J.M. Troya, A survey of parallel distributed genetic algorithms, Complexity 4 (4) (1999) 31–52.
[32] E. Cantú-Paz, A survey of parallel genetic algorithms, Calculateurs Parallèles, Réseaux et Systèmes Répartis 10 (2) (1998) 141–171.
[33] M. Nowostawski, R. Poli, Parallel genetic algorithm taxonomy, in: Third International Conference on Knowledge-Based Intelligent Information Engineering Systems, 1999, pp. 88–92.
[34] R. Axelrod, D. Dion, The further evolution of cooperation, Science 242 (4884) (1988) 1385–1390.
[35] R.M. Axelrod, The Evolution of Cooperation, Basic Books, New York, 1984.
[36] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, TCP Selective Acknowledgment Options, RFC 2018 (Proposed Standard), Internet Engineering Task Force, October 1996.
[37] T. Narten, E. Nordmark, W. Simpson, Neighbor Discovery for IP Version 6 (IPv6), United States, 1998.
[38] Y.-C. Hu, A. Perrig, D.B. Johnson, Ariadne: a secure on-demand routing protocol for ad hoc networks, Wireless Networks 11 (2005) 21–38.
[39] Y.-C. Hu, A. Perrig, D.B. Johnson, Ariadne: a secure on-demand routing protocol for ad hoc networks, in: MOBICOM, 2002, pp. 12–23.
[40] J.L. Muñoz, O. Esparza, M. Aguilar, V. Carrascal, J. Forné, RDSR-V: reliable dynamic source routing for video-streaming over mobile ad hoc networks, Computer Networks 54 (2010) 79–96.
[41] D.B. Johnson, D.A. Maltz, Dynamic source routing in ad hoc wireless networks, in: Mobile Computing, Kluwer Academic Publishers, 1996, pp. 153–181.
[42] S. Buruhanudeen, M. Othman, M. Othman, B.M. Ali, Mobility models, broadcasting methods and factors contributing towards the efficiency of the MANET routing protocols: overview, in: Proceedings of the 2007 IEEE International Conference on Telecommunications and Malaysia International Conference on Communications, 2007, pp. 226–230.
[43] V. Tolety, Load reduction in ad hoc networks using mobile servers, Master's thesis, Colorado School of Mines, 1999.
[44] M. Alzate, J. Baras, Dynamic routing in mobile wireless ad hoc networks using link life estimates, in: 38th Conference on Information Sciences and Systems (CISS 2004), Princeton University, Princeton, NJ, 2004, pp. 363–367.


Marcela Mejía is an electronic engineer from Universidad Santo Tomás (Bogotá, Colombia). She received her M.Sc. in Telematics from Universidad Distrital (Bogotá, Colombia), her Ph.D. degree in Electrical Engineering from Universidad de los Andes (Bogotá) and her Ph.D. degree in Telematics from Universidad Politécnica de Cataluña (Barcelona, Spain). She has occupied different academic and industry positions in research and technology management. Currently, Dr. Mejía is assistant professor at the Engineering School of Universidad Militar, in Bogotá.

Oscar Esparza was born in Viladecans (Spain) in 1975. He received his M.S. degree in Telecommunication Engineering from the Technical University of Catalonia (UPC) in 1999, and his Ph.D. degree in 2004. In 2001, he joined the Information Security Group at the Department of Telematics Engineering. He currently works as an Associate Professor at UPC.

Néstor M. Peña received his B.Sc. degrees in Electrical Engineering and Mathematics from Universidad de los Andes, Bogotá, Colombia, his M.Sc. degree in Electrical Engineering from the same university, and his DEA and Ph.D. degrees in Signal Processing and Telecommunications from Université de Rennes I, Rennes, France. He is an Associate Professor at the Electrical and Electronics Engineering Department of Universidad de los Andes.

Jose L. Muñoz was born in Terrassa (Spain) in 1975. In 1999, he received the M.S. degree in Telecommunication Engineering from the Technical University of Catalonia (UPC). In the same year he joined the AUNA Switching Engineering Department. Since 2000 he has worked in the Department of Telematics Engineering of the UPC, where he received the Ph.D. degree in Network Security in 2003. He currently works as an Associate Professor at UPC.

Marco A. Alzate is an electronic engineer from Universidad Distrital, Bogotá, Colombia. He received his M.Sc. and Ph.D. degrees in Electrical Engineering from Universidad de los Andes, also in Bogotá, in collaboration with the University of South Florida and the University of Maryland. He has been Researcher/Lecturer at the Research Division of Instituto Tecnológico de Electrónica y Telecomunicaciones (Telecom), Research Assistant at the Maryland Hybrid Networks Center (University of Maryland) and Research Scientist at the National Institute for Applied Computational Intelligence (University of South Florida). Currently, Dr. Alzate is full professor at the Engineering School of Universidad Distrital in Bogotá.