J. Parallel Distrib. Comput. 72 (2012) 809–818
The importance of considering unauthentic transactions in trust management systems Pilar Manzanares-Lopez ∗ , Josemaria Malgosa-Sanahuja, Juan Pedro Muñoz-Gea Department of Information Technologies and Communications, Technical University of Cartagena, Antiguo Cuartel de Antigones Campus Muralla del Mar s/n, E-30202 Cartagena, Spain
Article history: Received 24 May 2011 Received in revised form 5 March 2012 Accepted 13 March 2012 Available online 23 March 2012 Keywords: Trust management P2P systems File sharing
abstract In peer-to-peer (P2P) networks, trust management is a key tool to minimize the impact of malicious nodes. EigenTrust is claimed to be one of the most powerful distributed reputation management systems focused on P2P file-sharing applications. It is the theoretical base of other systems, and it has also been directly modified in an attempt to improve its performance. However, none of them give appropriate importance to all the information about transactions. This paper proposes an enhancement of EigenTrust, which considers unsatisfactory transactions in greater depth. Pos&Neg EigenTrust is able to obtain a blacklist of the identities of the malicious nodes. Therefore, it is able to significantly reduce the number of unsatisfactory transactions in the network. © 2012 Elsevier Inc. All rights reserved.
1. Introduction Two types of adversary can be identified in peer-to-peer (P2P) networks: selfish peers and malicious peers. Whereas the former (known as free riders in file-sharing networks [10]) use system services contributing minimal resources or nothing by themselves, the latter cause harm to either specific targeted members of the network or the system as a whole (for example, distributing corrupted or unauthentic files in file-sharing networks). In today’s P2P networks, incentive schemes [5,8,15] can be used to encourage cooperation of selfish peers, but they can be ineffective against malicious peers. Trust management systems are a key tool to minimize the impact of malicious node actions [13]. Here, trust is understood as the confidence that one peer has about another peer’s reputation. The interactions among peers give the opportunity to acquire the necessary experience (measured in terms of good and bad transactions) to build up a reputation management algorithm. Basically, these algorithms are able to calculate and share among the network community a global trust value for each peer. Several representative P2P reputation systems have been proposed in recent years. Some of them are focused on trust management in P2P e-commerce applications, while others are focused on generic P2P applications such as P2P file sharing. Belonging to the latter, EigenTrust [9] is claimed to be one of
∗ Corresponding author: P. Manzanares-Lopez. 0743-7315/$ – see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.jpdc.2012.03.006
the most powerful reputation management algorithms. It is, to a greater or lesser extent, the theoretical base of many other P2P reputation models. EigenTrust is a distributed protocol that computes the global trust value of a peer by applying each peer’s local trust values transitively. To guarantee the convergence of the global trust values, EigenTrust relies on linear algebraic theory. The system aims to minimize the number of unauthentic downloads during transactions. To this end, EigenTrust focuses on identifying the peers that offer the most authentic files. Its balance between simplicity and efficiency has made EigenTrust the most widely used reputation management algorithm up to now. In fact, its wide acceptance in the research community is evident from the set of proposals that try to improve its performance. Although identifying peers with good behavior is a key task in reducing the number of unauthentic downloads, in a real scenario in which other peers can pose different threats, it can be equally or even more useful to identify the malicious peers as well. They introduce unauthentic files into the system and also report wrong information about their transactions with others, adding noise to the system. To our knowledge, none of the EigenTrust-based systems pursue this aim. All of them inherit a limitation of the definition of local trust values in the original EigenTrust: it does not take into account the cases where the difference between satisfactory and unsatisfactory transactions is negative. These negative values can arise in two ways: peers can upload unauthentic files (whether consciously or not), and peers can lie about the evaluation of the transactions. If the negative
values were considered, one of the most important properties of the algebraic model would not be met. Therefore, it would be impossible to guarantee the convergence of the global trust values. In other words, EigenTrust-based systems are able to measure the goodness of the peers but they cannot measure their wickedness. This behavior can be exploited by the malicious peers to improve their efficiency in shutting down the system. Our work proposes an enhancement of EigenTrust called Pos&Neg EigenTrust. This system considers all the information about transactions in more depth. However, it is defined so that all the algebraic properties of the original are still valid. After some transactions, Pos&Neg EigenTrust can obtain two ordered lists. The first list classifies the peers on the basis of their goodness and the other list classifies them on the basis of their wickedness. Using these rankings adequately, it is possible to build a blacklist containing the identities of the malicious nodes. This is a significant improvement over the original EigenTrust, which is focused only on the identification of good peers. This paper evaluates our proposal by simulation of two types of threat model. The first one (a simple threat model called individual malicious peers) consists of completely isolated nodes which always provide unauthentic files. The second one (a more complex model called malicious peers in a collective with camouflage) is more realistic. The malicious peers work collectively and sometimes (maybe often, maybe rarely) provide unauthentic files (that is, they are camouflaged). When evaluating that last type of malicious node, two attacks are considered. In the first case, malicious peers always lie about the local trust values associated with the rest of the peers. In the second one, a more sophisticated attack, the malicious peers only lie with a certain probability. 
Simulation results show that, thanks to the identification of good nodes and also the identification of malicious ones, Pos&Neg EigenTrust is able to significantly reduce the number of unsatisfactory transactions in the network with any type of malicious nodes. The rest of the paper is organized as follows. Section 2 describes previous works on P2P reputation management systems, besides identifying the contribution of this paper. Section 3 summarizes the original EigenTrust system and describes its main shortcomings. Section 4 presents our proposal. In Section 5, both the EigenTrust and the Pos&Neg EigenTrust systems are evaluated and compared by simulation. Section 6 concludes the paper. 2. Related works in P2P reputation systems Trust and reputation models have been proposed as a solution for guaranteeing a minimum level of security between entities belonging to a distributed system (such as P2P networks, ad hoc networks, wireless sensor networks, or multi-agent systems) that want to have a transaction or interact [6]. In the case of P2P networks, in which users do not know each other at all or, at least, do not know everyone, the application of trust mechanisms can help peers to find out which is the most trustworthy peer to interact with, thereby preventing or minimizing the selection of a malicious one. Most of the current P2P reputation systems follow the same general steps. Peers obtain information about others in the community from their own experience and/or asking other users their opinions about the others. Then, the obtained information is aggregated properly and somehow a score for every peer in the network is computed. Thus, the most trustworthy or reputable entity in the system providing the required service is selected to have an interaction with it, evaluating a posteriori the satisfaction of the received service. 
Finally, according to the satisfaction obtained, the trust or the reputation deposited in the selected service peer is consequently adjusted. How these general steps are
implemented and how the trust parameters are defined differ from one reputation system to another. As mentioned before, the classification of P2P reputation system depends on the applications they are focused on. Next, this section summarizes some of the most representative reputation models for file-sharing P2P systems. EigenTrust [9], described in detail in the next section, is one of the most representative global reputation P2P systems. It is one of the most known and cited reputation models besides being, to a greater or lesser extent, the theoretical base of other models. It relies on linear algebraic theory to guarantee obtaining global trust values. Moreover, it is very simple and does not depend on complex parameters or assumptions. For these reasons, it is one of the most widely studied trust management systems. EigenTrust assumes that all the peers perform transactions with others evenly. However, in practice, there are some closed community groups in which the global trust values used by EigenTrust do not adequately reflect the trustworthiness of peers [11]. In these scenarios, the EigenTrust reputation model will not be able to identify the malicious peers and, consequently, will not be able to isolate them from the network (for an interesting survey of security threats scenarios in P2P reputation models, see [7]). It is widely accepted that, in order to solve this problem, it is necessary to calculate the global trust values taking into account the fact that peers can be classified into groups or profiles (at least two profiles: good and malicious). For example, [12,17] propose calculating the global trust values weighting the local trust values by a parameter called distance (Euclidean and Hamming distance, respectively), which in turn, is a measure of the similarity among different profiles. 
On the other hand, [2,1], and, more specifically, [18] propose the use of more complex local aggregation functions which consider the existence of different groups. [18] demonstrates that, in a general P2P reputation system, the number of feedbacks follows a power-law distribution (only a handful of peers have the most feedback) [4]. This fact enables the definition of a complex local aggregation algorithm with an overhead which grows only linearly. In addition, this handful of peers is dynamically selected to carry out special tasks to reinforce the reputation network. However, the above problem can also be faced from a different point of view. Neither EigenTrust nor any of its improvements consider the limitation that, in our opinion, has the local trust value definition. As analyzed in detail in Section 3.4, EigenTrust, and also [12,17,2], make limited use of the information about unsatisfactory transactions. They only consider the cases where the difference between satisfactory and unsatisfactory transactions is positive. Other works such as [18] do not impose a specific algorithm to calculate local trust values, enabling each peer to have its own criteria to generate feedback scores. Other cases, such as [1], do not even consider unauthentic transactions to calculate local trust values. Our proposal improves the original EigenTrust system by considering all the transaction information in more detail. Pos&Neg EigenTrust uses both positive and negative values of the difference between satisfactory and unsatisfactory transactions when obtaining the local trust values. Thus, in addition to the global trust vector, another useful vector called the negative trust vector or bad reputation vector is obtained. Using both adequately, it is possible to obtain a blacklist containing the identities of the malicious nodes, whatever their behavior is. To our knowledge, this approach has only been taken into account in [3]. 
There, the authors define new metrics that aggregate the negative opinions expressed by peers: badness, positive dishonesty, and negative dishonesty. They define two methods to integrate these metrics into EigenTrust: the integration
with badness and clustering, and integration with the exact mean global dishonesty value. The key of the first method is the clustering tool, but it is not detailed in the paper. Both methods depend on a set of parameters and thresholds that must be adequately set to limit the number of false positives (good peers that are considered malicious) and false negatives (malicious peers that are not identified). A more detailed description of this work is included in Section 5. In our proposal, a simpler and more efficient method is used to integrate the negative opinion of the peers into the trust management system. In addition, the proposed system is evaluated in a broad scenario, considering the threat models described above.
Table 1 Basic EigenTrust algorithm. Source: From [9].
$\vec{t}^{(0)} = \vec{p}$;
repeat
    $\vec{t}^{(k+1)} = C^T \vec{t}^{(k)}$;
    $\vec{t}^{(k+1)} = (1 - a)\,\vec{t}^{(k+1)} + a\,\vec{p}$;
    $\delta = \|\vec{t}^{(k+1)} - \vec{t}^{(k)}\|$;
until $\delta < \epsilon$

Table 2 Distributed EigenTrust algorithm. Source: From [9].
Definitions:
    $A_i$: set of peers which have downloaded files from peer $i$
    $B_i$: set of peers from which peer $i$ has downloaded files
Algorithm:
Each peer $i$ do {
3. EigenTrust EigenTrust is a distributed algorithm to compute the global trust values of peers in P2P file-sharing networks. To do this, each peer first calculates the local trust values of others. Then, EigenTrust collects the local trust values of all peers to calculate the global trust value of a given peer. The global reputation of each peer i is given by the local trust value assigned to peer i by other peers, weighted by the global reputations of the assigning peers. To introduce EigenTrust, the basic (and centralized) version of the system is summarized. Next, a distributed version of EigenTrust is described. 3.1. Non-distributed EigenTrust A common method to evaluate the reputation in trust models is to consider the number of satisfactory and unsatisfactory transactions that results after the peers’ interaction. If peer i has had sat(i, j) satisfactory transactions and unsat(i, j) unsatisfactory transactions with peer j, a possible evaluation of peer j by peer i is sij = sat(i, j) − unsat(i, j). These values, which are local to peer i, allow it to identify who is the best and the worst peer among all the peers it has interacted with, considering only its own experience. In addition, peer i will be able to obtain a ranking just by ordering the sij values. However, the main aim of EigenTrust is to obtain a global and unique trust value, and not a local and particular trust value. To do that, EigenTrust will obtain a value for a peer i from the local opinion that the rest of the peers have about that peer. To aggregate local trust values to obtain the global trust value, it is necessary to normalize sij in some manner. EigenTrust defines a normalized local trust value, cij , as follows:
\[
c_{ij} =
\begin{cases}
\dfrac{\max(s_{ij}, 0)}{\sum_j \max(s_{ij}, 0)} & \text{if } \sum_j \max(s_{ij}, 0) \neq 0,\\[2ex]
p_j & \text{otherwise.}
\end{cases}
\tag{1}
\]
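The normalization in Eq. (1) can be illustrated with a minimal Python sketch. The function name, the list-of-rows layout of the scores, and the sample values are assumptions made only for this illustration; they do not come from [9].

```python
def normalize_local_trust(s, p):
    """Return the matrix [c_ij] of Eq. (1) as a list of rows.

    s[i][j] holds s_ij = sat(i, j) - unsat(i, j); p is the pre-trust
    vector of Eq. (2). Both layouts are assumptions of this sketch.
    """
    c = []
    for row in s:
        clipped = [max(v, 0) for v in row]  # negative scores are discarded
        total = sum(clipped)
        if total != 0:
            c.append([v / total for v in clipped])
        else:
            # No positive experience at all: fall back to the pre-trust vector.
            c.append(list(p))
    return c


# Hypothetical scores for three peers.
s = [[0, 4, -2],
     [1, 0, 3],
     [-5, -1, 0]]
p = [1 / 3, 1 / 3, 1 / 3]
c = normalize_local_trust(s, p)
```

In this example the score of −2 that peer 0 holds about peer 2 is clipped to 0, so it becomes indistinguishable from having had no interaction at all, and peer 2, whose scores are all non-positive, simply inherits the pre-trust vector.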
EigenTrust considers some practical issues. The most important one is the notion of pre-trusted nodes, which are some peers in the network that are known to be trustworthy. To introduce the notion of pre-trusted nodes in the algorithm, pi is defined as
\[
p_i =
\begin{cases}
1/P & \text{if peer } i \in \text{pre-trusted peers},\\
0 & \text{otherwise},
\end{cases}
\tag{2}
\]
where P is the number of the pre-trusted peers. If the notion of pre-trusted nodes is not considered, pi = 1/n, where n is the total number of peers in the network. Once the normalized local trust value has been defined, a peer i will be able to evaluate a peer k, with which it has not interacted, by asking its known peers their opinions about peer k. Thus, the aggregated opinion of peer i about peer k is obtained as
\[
t_{ik} = \sum_j c_{ij}\, c_{jk}.
\tag{3}
\]
Table 2 (continued).
    Query all peers $j \in A_i$ for $t_j^{(0)} = p_j$;
    repeat
        Compute $t_i^{(k+1)} = (1 - a)(c_{1i} t_1^{(k)} + c_{2i} t_2^{(k)} + \cdots + c_{ni} t_n^{(k)}) + a p_i$;
        Send $c_{ij} t_i^{(k+1)}$ to all peers $j \in B_i$;
        Compute $\delta = |t_i^{(k+1)} - t_i^{(k)}|$;
        Wait for all peers $j \in A_i$ to return $c_{ji} t_j^{(k+1)}$;
    until $\delta < \epsilon$;
}
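The iteration used by the basic and distributed EigenTrust algorithms can be sketched centrally in Python as follows. This is only an illustration under stated assumptions: the damping value a = 0.15, the tolerance, the use of the L1 norm for δ, and the sample matrix are hypothetical choices, not values taken from the paper.

```python
def eigentrust(c, p, a=0.15, eps=1e-9, max_iter=1000):
    """Iterate t <- (1 - a) * C^T t + a * p until the change is below eps."""
    n = len(c)
    t = list(p)  # t^(0) = p
    for _ in range(max_iter):
        # (C^T t)_i = sum_j c_ji * t_j: opinions about i, weighted by t_j.
        ct = [sum(c[j][i] * t[j] for j in range(n)) for i in range(n)]
        t_next = [(1 - a) * ct[i] + a * p[i] for i in range(n)]
        delta = sum(abs(x - y) for x, y in zip(t_next, t))  # L1 norm (assumed)
        t = t_next
        if delta < eps:
            break
    return t


# Hypothetical normalized local trust matrix (each row sums to 1).
c = [[0.0, 1.0, 0.0],
     [0.25, 0.0, 0.75],
     [1 / 3, 1 / 3, 1 / 3]]
p = [1 / 3, 1 / 3, 1 / 3]
t = eigentrust(c, p)
```

Because every row of C sums to 1 and p sums to 1, the entries of t keep summing to 1 at every step, so t can be read as a distribution of trust over the peers.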
In matrix notation, if $C$ is defined as the matrix $[c_{ij}]$ and $\vec{t}_i$ is the vector containing the values $t_{ik}$, then $\vec{t}_i = C^T \vec{c}_i$. To get a wider view, peer $i$ may wish to ask its acquaintances’ acquaintances ($\vec{t}_i = (C^T)^2 \vec{c}_i$). If the system continues in this manner ($\vec{t}_i = (C^T)^n \vec{c}_i$), peer $i$ will have a complete view of the network after some iterations. Fortunately, if $n$ is large and $C$ is aperiodic and irreducible, the trust vector $\vec{t}_i$ will converge to the same vector for every peer $i$. That vector $\vec{t}$ is the global trust vector of the system. Therefore, the existence of vector $\vec{t}$ is tied to the definition of matrix $C$. The sufficient condition for convergence is $\|C\| < 1$, and $\|C\| = \max |c_{ij}|$, so $\max |c_{ij}| < 1$. The normalized local trust values defined by EigenTrust fulfill this condition. However, EigenTrust also presents some drawbacks. The first one is pointed out by the authors of EigenTrust. If $c_{ij} = c_{ik}$, peer $j$ has the same reputation as peer $k$ in the eyes of peer $i$, but it is not known whether both of them are very reputable or mediocre. Another drawback is the use of $\max(s_{ij}, 0)$ in the $c_{ij}$ definition. In all the cases in which the number of unsatisfactory transactions is higher than that of satisfactory transactions, $c_{ij}$ is directly set to 0. This drawback is the seed that originated this work, and it will be explained in detail in Sections 4 and 5. Assuming that some central server knows all the $c_{ij}$ values, the EigenTrust algorithm would proceed as described in Table 1.

3.2. Distributed EigenTrust

EigenTrust presents an algorithm in which all peers in the network cooperate to compute and store the global trust values in a distributed environment. Each peer stores its normalized local trust vector $\vec{c}_i$ and also its own global trust value $t_i$, which is computed as follows:
\[
t_i^{(k+1)} = c_{1i}\, t_1^{(k)} + \cdots + c_{ni}\, t_n^{(k)}.
\tag{4}
\]
In other words, the global trust value of peer i is calculated as the aggregation of the opinions that all the nodes have about that peer, weighted by the reputation that EigenTrust assigns to each peer. This leads to the simple distributed algorithm shown in Table 2. In this algorithm, each peer i computes and reports its own trust value ti . Therefore, malicious peers can easily report false
trust values, subverting the system. To counter this, a Secure EigenTrust algorithm is also defined. There, the trust value of peer i is not computed by, and does not reside at, the peer itself. Instead, a peer called the score manager computes it. However, this issue has no effect on the work developed in this paper and therefore, from now on, the distributed EigenTrust algorithm will be assumed.

3.3. Source selection algorithm

In a P2P file-sharing network using EigenTrust, peers issue queries for files and receive responses, possibly from more than one node owning the requested file. When a querying node receives the responses, it must select the download source. The EigenTrust authors define two selection algorithms: a deterministic algorithm and a probabilistic algorithm. From the experiments in [9], it can be concluded that the probabilistic algorithm is the best option. The probabilistic algorithm is defined as follows. If t0 , t1 , . . . , tR−1 are the trust values of the peers responding to a query, the probability that peer i is selected is defined as follows:
\[
\mathrm{prob\_select\_peer\_i} =
\begin{cases}
90\% \cdot \dfrac{t_i}{\sum_{j=0}^{R-1} t_j} & \text{if } t_i > 0,\\[2ex]
10\% \cdot \dfrac{1}{t_j^0} & \text{if } t_i = 0,
\end{cases}
\tag{5}
\]
where $t_j^0$ is the number of peers with $t_j = 0$. If the download returns an unauthentic file, the peer is deleted from the list of responding peers, and the algorithm is repeated.

Fig. 1. Node ranking using EigenTrust. The network is composed of 60 good peers, 3 pre-trusted peers, and 18 malicious peers. Position 0 is the best-ranking place and position 80 is the worst-ranking place. Malicious peers upload 30% of authentic files.

3.4. EigenTrust shortcoming

According to [9], EigenTrust is used to decrease the number of downloads of unauthentic files in a P2P file-sharing network. To do that, it makes use of a unique global trust value that the system assigns to each peer, based on the peer’s history of uploads. Thus, according to that work, EigenTrust is able to identify malicious peers and to isolate them from the network. Nevertheless, due to the cij definition, peers make limited use of the information about transactions. As a result, EigenTrust really focuses on the identification of the best peers. This approach makes it possible to identify and isolate the malicious peers only when they have a simple behavior (for example, they always send unauthentic files). However, if they have a more complex behavior (they are camouflaged by sending a certain number of authentic files), EigenTrust will not be able to identify them and, consequently, will not be able to isolate them from the network. Although the experimental evaluation will be presented and analyzed in detail later (Section 5), some results that validate the aforementioned assertions are presented here. Figs. 1 and 2 show the results corresponding to a network composed of 60 good nodes and 3 pre-trusted nodes (both good and pre-trusted nodes uploaded authentic files, except in 5% of the cases, in which they returned unauthentic files, and assigned correct cij values) and 18 camouflaged malicious nodes (which uploaded authentic files in 30% and 50% of the cases, respectively, and always assigned incorrect cij values). The malicious peers connect to the most highly connected peers and they respond to the top 35% of queries received.
The good peers are identified as nodes 0–59; the pre-trusted peers are nodes 60, 61, and 62; and the malicious peers are identified as nodes 63–80. Both figures show the peer ranking that the EigenTrust algorithm obtains using the global trust values. Position 0 is the best-ranking place and position 80 is the worst-ranking place. As
Fig. 2. Node ranking using EigenTrust. The network is composed of 60 good peers, 3 pre-trusted peers, and 18 malicious peers. Position 0 is the best-ranking place and position 80 is the worst-ranking place. Malicious peers upload 50% of authentic files.
can be observed, the pre-trusted peers are always ranked in the first positions. However, the malicious peers are not located in the last 18 positions of the ranking. When the malicious peers upload 30% of authentic files (Fig. 1), they are mostly ranked in the last positions, as are the other good peers. When the malicious peers upload 50% of authentic files (Fig. 2), they are ranked ahead of the good peers. Because the malicious peers offer authentic files, they can be positively valued by the others, and because they lie about the behavior of the other nodes (assigning wrong local trust values), the malicious peers succeed in tricking the system into ranking them out of the last positions. To solve this shortcoming, we propose an enhancement of EigenTrust, called Pos&Neg EigenTrust, which assigns a higher weight and significance to the information that peers obtain from unauthentic file downloads. The proposed system sets a double goal. On the one hand, as does EigenTrust, Pos&Neg EigenTrust seeks to reduce the number of unauthentic file downloads on the P2P file-sharing network with respect to a non-reputation system. On the other hand, Pos&Neg EigenTrust seeks to identify unequivocally the malicious peers in the system, under any malicious behavior. This last objective is the most important contribution and is the novelty our proposed system obtains with respect to EigenTrust and other similar reputation systems. 4. Pos&Neg EigenTrust 4.1. Pos&Neg EigenTrust specification As described earlier, the EigenTrust solution calculates cij values using only sij values that are higher than 0. This definition makes peers lose useful information about the transactions. Our proposal lies in using all the available information, that is, sij > 0 values and also sij < 0 values. This allows us to obtain two rankings of the peers: the first ranking classifies the nodes on the
Table 3 Distributed Pos&Neg EigenTrust algorithm.
4.2. Obtaining and use of the list of malicious peers The main enhancement of our proposal is the ability to identify the malicious peers, whatever their behavior. This enables us to redefine the probabilistic algorithm used to select a download source. This improvement is possible thanks to the use of the double reputation ranking: the positive ranking associated to the positive trust values, and the negative ranking associated to the negative global trust values. In an ideal case, in which all the nodes assign correct local trust values to the peers they interact with (that is, sij is increased by one unit each time peer i downloads an authentic file from peer j and sij is decreased by one unit each time peer i downloads an unauthentic file from peer j), the following situations could occur.
Definitions:
    $A_i$: set of peers which have downloaded files from peer $i$
    $B_i$: set of peers from which peer $i$ has downloaded files
Algorithm:
Each peer $i$ do {
    Query all peers $j \in A_i$ for $t_j^{+(0)} = p_j$;
    repeat
        Compute $t_i^{+(k+1)} = (1 - a)(c_{1i}^{+} t_1^{+(k)} + c_{2i}^{+} t_2^{+(k)} + \cdots + c_{ni}^{+} t_n^{+(k)}) + a p_i$;
        Send $c_{ij}^{+} t_i^{+(k+1)}$ to all peers $j \in B_i$;
        Compute $\delta = |t_i^{+(k+1)} - t_i^{+(k)}|$;
        Wait for all peers $j \in A_i$ to return $c_{ji}^{+} t_j^{+(k+1)}$;
    until $\delta < \epsilon$;
    Send $c_{ij}^{-} t_i^{+}$ to all peers $j \in B_i$;
    Wait for all peers $j \in A_i$ to return $c_{ji}^{-} t_j^{+}$;
    Compute $t_i^{-} = c_{1i}^{-} t_1^{+} + c_{2i}^{-} t_2^{+} + \cdots + c_{ni}^{-} t_n^{+} - \sum_{p \in P} c_{pi}^{-} t_p^{+}$;
}
• The network is composed of good nodes, pre-trusted nodes,
basis of their goodness, and the second one classifies them on the basis of their wickedness. As will be explained later, both rankings are used to identify the malicious peers, whatever their pattern of behavior is: mainly they will upload unauthentic files (in a certain percentage), but they can also lie about their interactions with others. The second ranking focuses on the identification of peers which offer more unauthentic files than authentic files. Usually these peers are the malicious peers. Therefore, Pos&Neg EigenTrust defines two normalized local trust values, $c_{ij}^{+}$ (the positive normalized local trust values) and $c_{ij}^{-}$ (the negative normalized local trust values), as follows:
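\[
c_{ij}^{+} =
\begin{cases}
\dfrac{\max(s_{ij}, 0)}{\sum_j \max(s_{ij}, 0)} & \text{if } \sum_j \max(s_{ij}, 0) \neq 0,\\[2ex]
p_j & \text{otherwise,}
\end{cases}
\tag{6}
\]
\[
c_{ij}^{-} =
\begin{cases}
\dfrac{\min(s_{ij}, 0)}{\sum_j \min(s_{ij}, 0)} & \text{if } \sum_j \min(s_{ij}, 0) \neq 0,\\[2ex]
0 & \text{otherwise.}
\end{cases}
\tag{7}
\]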
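A minimal Python sketch of Eqs. (6) and (7) is given below. The function name, the data layout, and the sample scores are hypothetical and serve only to show how the positive and negative parts of the same scores are normalized separately.

```python
def pos_neg_local_trust(s, p):
    """Return (c_pos, c_neg), the matrices of Eqs. (6) and (7), as row lists.

    s[i][j] holds s_ij = sat(i, j) - unsat(i, j); p is the pre-trust vector.
    """
    n = len(s)
    c_pos, c_neg = [], []
    for row in s:
        pos = [max(v, 0) for v in row]
        neg = [min(v, 0) for v in row]
        pos_total, neg_total = sum(pos), sum(neg)
        if pos_total != 0:
            c_pos.append([v / pos_total for v in pos])
        else:
            c_pos.append(list(p))  # Eq. (6), "otherwise" branch
        if neg_total != 0:
            # Negative over negative: the normalized value is non-negative.
            c_neg.append([abs(v) / abs(neg_total) for v in neg])
        else:
            c_neg.append([0.0] * n)  # Eq. (7), "otherwise" branch
    return c_pos, c_neg


# Hypothetical scores for three peers.
s = [[0, 4, -2],
     [1, 0, 3],
     [-5, -1, 0]]
p = [1 / 3, 1 / 3, 1 / 3]
c_pos, c_neg = pos_neg_local_trust(s, p)
```

With these sample scores, the −2 that peer 0 holds about peer 2 no longer vanishes: it becomes the whole of peer 0's negative opinion, while peer 1, which has no negative scores, contributes a zero row to the negative matrix.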
Similarly to EigenTrust, the $c_{ij}^{+}$ values are used to compute the positive global trust values (also called ‘‘good’’ reputation):
\[
t_i^{+(k+1)} = c_{1i}^{+}\, t_1^{+(k)} + \cdots + c_{ni}^{+}\, t_n^{+(k)}.
\tag{8}
\]
The $c_{ij}^{-}$ values are used to compute the negative global trust values (also called ‘‘bad’’ reputation) as follows:
\[
t_i^{-} = c_{1i}^{-}\, t_1^{+} + c_{2i}^{-}\, t_2^{+} + \cdots + c_{ni}^{-}\, t_n^{+} - \sum_{p \in P} c_{pi}^{-}\, t_p^{+}.
\tag{9}
\]
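Equation (9) amounts to summing the negative opinions about peer i over all peers except the pre-trusted ones, each weighted by the positive global trust of the peer giving the opinion. The following Python sketch illustrates this; the function name and all sample values are hypothetical, and the positive trust vector is taken as given rather than computed.

```python
def negative_global_trust(c_neg, t_pos, pretrusted):
    """Compute t_i^- of Eq. (9) for every peer i.

    c_neg[j][i] is peer j's negative local opinion about peer i,
    t_pos[j] is the converged positive global trust of peer j, and
    pretrusted is the set P of pre-trusted peer indices.
    """
    n = len(c_neg)
    excluded = set(pretrusted)
    # Summing over all j and subtracting the pre-trusted terms is the same
    # as summing only over peers that are not pre-trusted.
    return [sum(c_neg[j][i] * t_pos[j] for j in range(n) if j not in excluded)
            for i in range(n)]


# Hypothetical inputs: peer 0 is pre-trusted.
c_neg = [[0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0],
         [5 / 6, 1 / 6, 0.0]]
t_pos = [0.5, 0.3, 0.2]
t_neg = negative_global_trust(c_neg, t_pos, pretrusted={0})
```

In this example the only negative opinion about peer 2 comes from the pre-trusted peer 0, so it is left out and the negative trust of peer 2 is 0; with an empty pre-trusted set it would be 0.5.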
As can be observed, the negative global trust values ti− are obtained by weighting the negative local opinion that each peer has about peer i by the good reputation (or positive global trust value) that the system assigns to each peer, once t⃗+ converges to the principal eigenvector of C + . Moreover, the negative local opinion that the pre-trusted nodes have about peer i is not considered. That is because this information is weighted by the positive global trust values of pre-trusted nodes, which are overestimated. Table 3 describes the distributed Pos&Neg EigenTrust algorithm. The t⃗− definition does not affect the algebraic properties of t⃗+ , which is obtained as in the original EigenTrust. In addition, this vector will enable the identification of the malicious nodes and, consequently, the creation of a blacklist or a list of the malicious nodes. The blacklist creation mechanism and the source selection algorithm are described in the next subsection.
and individual malicious peers or malicious peers in a collective with a low level of camouflage (f < 50%). Good and pre-trusted nodes have $t_G^{+} - t_G^{-} \approx t_G^{+} > 0$, and malicious nodes have $t_M^{+} - t_M^{-} \approx -t_M^{-} < 0$.
• The network is composed of good nodes, pre-trusted nodes, and malicious peers in a collective with a high level of camouflage (f > 50%). Good and pre-trusted nodes have $t_G^{+} - t_G^{-} \approx t_G^{+} > 0$, and malicious nodes have $t_M^{+} - t_M^{-} \approx t_M^{+} < t_G^{+}$ due to the fact that malicious nodes, even though they upload a fraction of authentic files, also upload unauthentic files. Therefore, according to the positive global values, they will be worse nodes than good nodes.
In both situations, the malicious nodes can be identified using the values of $t_i^{+} - t_i^{-}$. In a more realistic case, in which the malicious nodes assign wrong local trust values to the peers they interact with (that is, $s_{ij}$ is decreased by one unit each time malicious peer i downloads an authentic file from peer j and $s_{ij}$ is increased by one unit each time malicious peer i downloads an unauthentic file from peer j), the identification of the malicious peers is also possible using the values of $t_i^{+} - t_i^{-}$.
• If the network is composed of good nodes, pre-trusted nodes, and individual malicious peers or malicious peers in a collective with a low level of camouflage, the positive local trust values assigned to the malicious peers ($c_{iM}^{+}$) tend to be low, and the negative local trust values assigned to the malicious peers ($c_{iM}^{-}$) tend to be high. Thus, the positive global trust values of the malicious nodes ($t_M^{+}$) will be worse than the positive global values of the good nodes ($t_G^{+}$). That is, the trust value of the good nodes will dominate that of the malicious nodes. In consequence, the damage that the malicious nodes cause by lying about local trust has limited repercussions. Thus, both $t_G^{+} - t_G^{-} > 0$ and $t_M^{+} - t_M^{-} < 0$ are also fulfilled.
• If the network is composed of good nodes, pre-trusted nodes, and malicious peers in a collective with a high level of camouflage, then, as f increases, $c_{iM}^{+}$ increases too and, consequently, the positive global values of malicious peers resemble those of good peers. In addition, as the malicious peers assign wrong local trust values, they evaluate good peers as bad peers, and therefore $t_G^{+}$ decreases and $t_M^{+}$ increases. Now, $t_G^{+} - t_G^{-} < 0$ and $t_M^{+} - t_M^{-} > 0$.
Based on the aforementioned conclusions, to be able to identify the malicious peers from the sign of $t_i^{+} - t_i^{-}$, it is necessary to know the value of f . However, this parameter is unknown to the good peers. After performing many experiments in several network scenarios, we were able to conclude experimentally that, whatever the sign of $t_i^{+} - t_i^{-}$ is, the absolute value is always higher for the malicious nodes. The resulting mechanism to identify the malicious nodes is described in Table 4.
Table 4
Basic mechanism to identify the malicious nodes.

Once the steady state is reached, obtain $\vec{t}^{+}$ and $\vec{t}^{-}$
Calculate $\vec{t}^{+} - \vec{t}^{-}$. Then, $\forall i$:
If $|\max(t_i^+ - t_i^-)| > |\min(t_i^+ - t_i^-)|$
    malicious peers have $t_i^+ - t_i^- > 0$
Otherwise
    malicious peers have $t_i^+ - t_i^- < 0$

Table 5
Mechanism to identify the malicious nodes.

Once the steady state is reached, obtain $\vec{t}^{+}$ and $\vec{t}^{-}$
Calculate $\vec{t}^{+} - \vec{t}^{-}$
Calculate $M^+ = E(t_i^+ - t_i^-)$, $\forall (t_i^+ - t_i^-) > 0$
Calculate $M^- = E(t_i^+ - t_i^-)$, $\forall (t_i^+ - t_i^-) < 0$
If $M^+ > M^-$
    malicious peers have $t_i^+ - t_i^- > 0$
Otherwise
    malicious peers have $t_i^+ - t_i^- < 0$
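The decision rules of Tables 4 and 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names and the list-based representation of the global trust vectors are hypothetical, and the comparison between the two means in Table 5 is interpreted here in magnitude, an assumption based on the observation that the absolute value of the difference is higher for malicious nodes.

```python
from statistics import mean


def _peers_with_sign(diff, malicious_positive):
    """Return the indices whose difference has the sign attributed to malicious peers."""
    if malicious_positive:
        return [i for i, d in enumerate(diff) if d > 0]
    return [i for i, d in enumerate(diff) if d < 0]


def blacklist_basic(t_pos, t_neg):
    """Table 4 rule: compare the extreme values of t+ - t-."""
    diff = [p - n for p, n in zip(t_pos, t_neg)]
    return _peers_with_sign(diff, abs(max(diff)) > abs(min(diff)))


def blacklist_mean(t_pos, t_neg):
    """Table 5 rule: compare the means of the positive and negative differences.

    Assumption: the two means are compared in absolute value.
    """
    diff = [p - n for p, n in zip(t_pos, t_neg)]
    positives = [d for d in diff if d > 0]
    negatives = [d for d in diff if d < 0]
    m_pos = mean(positives) if positives else 0.0
    m_neg = mean(negatives) if negatives else 0.0
    return _peers_with_sign(diff, abs(m_pos) > abs(m_neg))


# Made-up values: peer 0 is a good peer with an anomalously large difference.
t_pos = [0.50, 0.01, 0.01, 0.00, 0.00, 0.00]
t_neg = [0.00, 0.00, 0.00, 0.20, 0.20, 0.20]
print(blacklist_basic(t_pos, t_neg))  # single outlier flips the decision
print(blacklist_mean(t_pos, t_neg))   # averaging absorbs the outlier
```

With these made-up values, the rule based on extreme values is misled by the single anomalous peer, whereas the rule based on mean values still selects peers 3 to 5.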
However, due to the content distribution model (in our experiments, a Zipf distribution is considered), if the $f$ value is small, a small number of peers could show anomalous results for $t_i^+ - t_i^-$, which could distort the previous mechanism for identifying the malicious peers. To eliminate these anomalous values, it is sufficient to apply a low-pass filter. The filter is applied separately to the positive and the negative $t_i^+ - t_i^-$ values. The filter order is the number of samples, and consequently its use is equivalent to obtaining the mean value. Thus, the previously described algorithm can be used, but using only these two mean values. The definitive mechanism to identify the malicious nodes is described in Table 5.
Using the blacklist, the Pos&Neg EigenTrust system is able to redefine the probabilistic algorithm defined in EigenTrust to select a download source among the set of peers responding to a query. The proposed advanced probabilistic algorithm works as follows. During the initial transient states, the original probabilistic algorithm is used. Once the steady state is reached ($\vec{t}^{+} \approx$ constant), the list of the malicious peers (or the blacklist) is obtained using the positive and the negative global trust values. How this list is obtained is described in Table 5. Then, the selection algorithm works as follows. If $t_0^+, t_1^+, \ldots, t_{R-1}^+$ are the trust values of peers responding to a query, choose peer $i$ with probability $t_i^+ / \sum_{j=0}^{R-1} t_j^+$. With probability of 10%, select a peer $j$ that has a trust value $t_j^+ = 0$. In any case, before using the selected peer as the download source, check if the node is located in the list of malicious peers. If so, the peer is not used, and a new node is selected.

4.3. Computational cost considerations

Pos&Neg EigenTrust enhances the original EigenTrust algorithm. Nevertheless, as examined below, the computational cost of both is similar. Assessing the performance of both algorithms, in terms of CPU and/or memory usage in peers, would require one to implement and deploy them in a real P2P network. However, by analyzing the pseudo-codes which describe the algorithms, some conclusions can be drawn. In both algorithms, the main contribution to the computational cost in terms of CPU consumption comes from the execution of the instructions in the Repeat loop. At this point, Pos&Neg EigenTrust only adds an additional execution of the instructions, which does not increase the order of the computational cost. Another difference is that, in Pos&Neg EigenTrust, peers must also
calculate the negative local trust values. The cost of calculating these values ($\vec{c}^{-}$) is the same as calculating the positive local trust values ($\vec{c}^{+}$). Due to the additive nature of both costs, this issue does not increase the order of the computational cost either. Finally, Pos&Neg EigenTrust defines a mechanism to obtain the list of malicious nodes (or the blacklist). This procedure does not include any loop, and the instructions consist of simple arithmetic and logical operations. Therefore, the computational cost added by this procedure is not significant. In addition, in the same way that EigenTrust defines a secure version in which a set of special peers, called Score Managers, obtains the global trust values, the computation of the blacklist could be located in these nodes.
With regard to memory usage, Pos&Neg EigenTrust defines new vectors: $\vec{t}^{-}$ and $\vec{c}^{-}$. The $\vec{t}^{-}$ vector is distributed among all the peers in the network, and each one of them must store its corresponding $t_i^-$ value, which in terms of memory consumption is negligible. On the other hand, each peer must maintain its own $\vec{c}^{-}$ vector, with the negative normalized local trust values. This vector requires a quantity of memory similar to the $\vec{c}^{+}$ vector (on the order of Kbytes). Therefore, the memory increase due to $\vec{c}^{-}$ is not a critical factor, even more so when taking into account the memory available in current devices.

5. Experimental evaluation

5.1. Simulation description

To evaluate our proposal, we used the Query Cycle Simulator [16] developed by the Stanford P2P Sociology Project. The software tool simulates a P2P file-sharing environment under the following considerations. Peers are interconnected by a power-law network, a type of network prevalent in real-world P2P networks [14]. The network consists of good nodes (some of them considered as pre-trusted nodes) and a certain type of malicious nodes depending on the scenario.
Files shared in a P2P network are often clustered by content categories. For that, the Query Cycle Simulator considers content categories and also assumes that within one content category there are files with different popularities. The category and the popularity are based on Zipf distributions. The files are assigned randomly to peers at initialization based on the file popularity and the content categories the peer is interested in. A query will correspond to the category and the rank (or popularity) of a requested file. Similar to the results in [9], all the experiments were performed in a very pessimistic scenario. The malicious peers connect to the most highly connected peers when joining the network, and they are considered especially prepared nodes that have all possible files. To limit their load, they respond to the top 35% of queries received. The simulation settings that were used in the experiments are summarized in Table 6. These settings were selected to replicate the experiments evaluated in the original EigenTrust work [9]. Each simulation is subdivided into 500 query cycles. In each query cycle, a peer i may be actively issuing a query, inactive, or even down, and not responding to queries passing by. Each query is propagated by broadcast with a hop-count limit (TTL). Peers that receive the query forward it and check if they are able to respond. When the querying node receives a response, probably from more than one node, it must select the download source and start downloading the file. EigenTrust uses the original probabilistic algorithm to select a download source. Pos&Neg EigenTrust uses the proposed advanced probabilistic algorithm.
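The advanced probabilistic selection used by Pos&Neg EigenTrust once the steady state is reached can be sketched as follows. This is a minimal sketch under stated assumptions rather than the simulator's code: the function names, the dictionary holding the positive global trust values, and the injectable random generator are all hypothetical.

```python
import random


def pick_probabilistic(candidates, t_pos, rng, explore_prob=0.10):
    """Original probabilistic rule: trust-weighted choice, with a 10% chance
    of picking a peer whose positive global trust value is zero."""
    zero = [p for p in candidates if t_pos.get(p, 0.0) == 0.0]
    trusted = [p for p in candidates if t_pos.get(p, 0.0) > 0.0]
    if zero and (not trusted or rng.random() < explore_prob):
        return rng.choice(zero)
    weights = [t_pos[p] for p in trusted]
    return rng.choices(trusted, weights=weights, k=1)[0]


def select_source(responders, t_pos, blacklist, rng=None, explore_prob=0.10):
    """Advanced rule: discard blacklisted picks and select again.

    Returns None when every responder is blacklisted (an assumption; the
    text does not say what happens in that case).
    """
    if rng is None:
        rng = random.Random()
    remaining = list(responders)
    while remaining:
        peer = pick_probabilistic(remaining, t_pos, rng, explore_prob)
        if peer not in blacklist:
            return peer
        remaining.remove(peer)  # blacklisted: not used, a new node is selected
    return None


# Hypothetical example: peer 3 is blacklisted and is never returned.
trust = {1: 0.5, 2: 0.0, 3: 0.5}
print(select_source([1, 2, 3], trust, blacklist={3}, rng=random.Random(1)))
```

During the initial transient states, the same sketch reduces to the original algorithm by passing an empty blacklist.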
Table 6
Simulation settings.

# of good peers                                                         60
# of pre-trusted peers                                                  3
# of malicious peers                                                    It varies
# of neighbors of good peers                                            3
# of neighbors of pre-trusted peers                                     10
# of neighbors of malicious peers                                       10
TTL for query messages                                                  7
Set of content categories supported by good peer i                      Zipf distribution over 20 content categories
# of distinct files at good peer i in category j                        Uniform random distribution over peer i's total number of distinct files
% of download requests in which good peer i returns unauthentic file    5%
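For readers who wish to reproduce the setup, the settings of Table 6 can be captured in a small configuration object. The sketch below is only a convenience under stated assumptions: the class and field names are hypothetical and do not correspond to the Query Cycle Simulator's own configuration format, and the number of malicious peers is left as a per-scenario argument because Table 6 lists it as variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationSettings:
    """Settings from Table 6 (field names are illustrative, not the simulator's)."""
    malicious_peers: int                 # "It varies" in Table 6: set per scenario
    good_peers: int = 60
    pre_trusted_peers: int = 3
    neighbors_good: int = 3
    neighbors_pre_trusted: int = 10
    neighbors_malicious: int = 10
    query_ttl: int = 7
    content_categories: int = 20         # Zipf distribution over 20 categories
    good_unauthentic_rate: float = 0.05  # good peers return unauthentic files 5% of the time
    query_cycles: int = 500              # each simulation has 500 query cycles


def individual_malicious_counts(good_peers=60):
    """Malicious peer counts for a sweep from 10% to 70% of the good peers, in steps of 10%."""
    return [good_peers * pct // 100 for pct in range(10, 80, 10)]


settings = SimulationSettings(malicious_peers=18)
print(settings)
print(individual_malicious_counts())
```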
To evaluate our proposal, we considered two parameters. The first one is the fraction of unauthentic downloads in the network versus the total number of files downloaded in the same period of time. The second one measures the success of the system in identifying the malicious nodes.

5.2. Individual malicious peers

The behavior of individual malicious peers is simple. When they are selected as a download source, they always provide an unauthentic file. Besides, the malicious peers try to damage the system operation by setting their local trust values to be $s_{ij} = unsat(j) - sat(j)$. That is, they evaluate unauthentic downloads as good and vice versa. According to this behavior, the malicious peers act as “very bad peers”. In this section, we consider a network consisting of 60 good nodes, 3 pre-trusted nodes, and a number of individual malicious peers that is a proportion (from 10% to 70%, in steps of 10%) of the total number of good nodes. With that simple malicious behavior, both the EigenTrust and Pos&Neg EigenTrust systems are able to identify all the malicious peers, using only the (positive, in EigenTrust) global trust values. Malicious peers will be those nodes that have a global trust value of 0. Fig. 3 shows the fraction of unauthentic downloads in the network versus the total number of files downloaded in the same period of time. Each experiment is run five times, and the results are averaged over the last 100 query cycles in each experiment. Confidence intervals are not represented due to their low value. As can be seen, Pos&Neg EigenTrust succeeds in reducing the fraction of unauthentic downloads in the network. Because EigenTrust does not define the concept of a blacklist, a malicious peer can be selected as a download source even when the steady state is reached. It can be observed that the fraction of unauthentic downloads is around 10–15%.
Some of them are due to the fact that the good nodes upload unauthentic files once in a while (the simulation settings fix this probability at 5%), and the rest are due to the downloads from the malicious peers. On the other hand, Pos&Neg EigenTrust manages to reduce the fraction of unauthentic downloads in the network to a value lower than 5%, thanks to the use of the blacklist.

5.3. Malicious peers in a collective with camouflage

In this section, a more complex behavior of the malicious peers is analyzed. Here, the malicious peers try to get some trust value from the good and pre-trusted peers by providing authentic files in some cases when they are selected as download source. That is, the malicious peers are camouflaged among the good and pre-trusted peers. In addition, they form a collective; that is, they know each other and act accordingly. When a malicious peer receives a query from another malicious peer, the query is ignored. Otherwise, the malicious peer responds to it. If it is selected as a download source, the malicious peer sends an authentic file in f %
Fig. 3. Fraction of unauthentic downloads versus the total number of files downloaded in the same period of time. The network is composed of 60 good nodes, 3 pre-trusted nodes, and a number of individual malicious nodes that are a proportion (from 10% to 70%) of the number of good nodes.
of the cases. Finally, they always lie about the local trust values computation. That is, each time a malicious peer i downloads an authentic file from another peer j, it decreases sij ; each time a malicious peer i downloads an unauthentic file from another peer j, it increases sij . We evaluate a network consisting of 60 good nodes (0–59), 3 pre-trusted nodes, and 18 malicious peers (63–80) in a collective with camouflage. We assess the system applying different settings of parameter f , such that the probability that a malicious peer returns an authentic file varies from 10% to 90%, in steps of 10%. Whereas EigenTrust is not able to identify all the malicious peers in this scenario, Pos&Neg EigenTrust is able to obtain the list of malicious peers or blacklist using the mechanism described in Table 5. Fig. 4 shows the ti+ − ti− , M + , and M − values obtained. Furthermore, it can be observed that our proposed mechanism is able to identify the malicious nodes clearly for all f values. Fig. 5 shows the fraction of unauthentic downloads in the network versus the total number of files downloaded in the same period of time. Each experiment is run five times, and the results are averaged over the last 100 query cycles in each experiment. Confidence intervals are not represented due to their low value. As can be seen, Pos&Neg EigenTrust succeeds in reducing the fraction of unauthentic downloads in the network. The figure shows that the malicious peer impact on the EigenTrust network depends on the camouflage level. For low f values, the behavior of the malicious nodes tends to be similar to that of simple malicious nodes. Therefore, the fraction of unauthentic files in the network can be explained by the download source selection algorithm. For high f values, the malicious nodes obtain higher global trust values. The number of authentic files is high compared to the number of unauthentic files. 
In other words, the malicious nodes tend to resemble “good nodes” when uploading files. Therefore, they will be chosen as the download source more frequently. From the Pos&Neg EigenTrust results, the percentage of unauthentic files compared to the total downloaded files remains close to 5%, as the blacklist is used to never select a malicious node as a download source. Therefore, only good peers can introduce unauthentic files in the network (with the 5% probability fixed by the simulation settings).
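The behavior of the camouflaged collective described in this section can be summarized in a short sketch. It is an illustration under stated assumptions, not the simulator's code: the function names are hypothetical, $f$ is expressed as a probability in [0, 1], and honest peers are assumed to update $s_{ij}$ by plus or minus one unit per download, as in the original EigenTrust.

```python
import random


def answers_query(responder_is_malicious, requester_is_malicious):
    """A malicious peer ignores queries coming from another member of the collective."""
    return not (responder_is_malicious and requester_is_malicious)


def uploads_authentic(is_malicious, f, rng, good_unauthentic_rate=0.05):
    """Malicious peers send an authentic file with probability f;
    good peers send an unauthentic file 5% of the time (Table 6)."""
    if is_malicious:
        return rng.random() < f
    return rng.random() >= good_unauthentic_rate


def update_local_trust(s_ij, authentic, rater_is_malicious):
    """Honest raters add one unit for an authentic download and subtract one
    for an unauthentic download; malicious raters lie by inverting the sign."""
    honest_delta = 1 if authentic else -1
    return s_ij - honest_delta if rater_is_malicious else s_ij + honest_delta


rng = random.Random(0)
print(uploads_authentic(True, 0.9, rng))       # camouflaged upload, f = 90%
print(update_local_trust(0, True, True))       # malicious rater penalizes an authentic file
```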
Fig. 4. ti+ − ti− , M + , and M − values obtained in a P2P file-sharing network composed of 60 good nodes (identified from 0 to 59), 3 pre-trusted nodes, and a collective of 18 malicious nodes (identified from 63 to 80) with different levels of camouflage (f ).
Fig. 5. Fraction of unauthentic downloads versus the total number of files downloaded in the same period of time. The network is composed of 60 good nodes, 3 pre-trusted nodes, and 18 malicious peers in a collective with camouflage. f % is the probability that a malicious peer returns an authentic file. It determines the camouflage level.
5.4. Good and malicious peer identification accuracy In this section we are going to evaluate the accuracy of our proposal to identify the good and the malicious nodes. To do that, we analyze, in addition to the percentage of the unauthentic downloads, two new factors, called true positive and false positive. True positive is defined as the percentage of malicious peers that are identified as malicious peers. False positive is defined as the percentage of good peers that are identified as malicious peers. To analyze more sophisticated attacks of the malicious nodes, we introduce a new threat feature called camouflaging behind positive judgments. This feature was introduced in [3]. As well as camouflaging by uploading a certain percentage of authentic files when they are requested as source nodes, the malicious peers also camouflage by valuing a good peer that sends a good file positively. The malicious peers do this in β % of cases. Tables 7 and 8 show the evaluated metrics in two different networks. In the first case, the P2P file-sharing network is
composed of 60 good nodes, 3 pre-trusted nodes, and a collective of 18 malicious nodes. In the second case, the P2P file-sharing network is composed of 60 good nodes, 3 pre-trusted nodes, and a collective of 36 malicious nodes. The system is evaluated applying different settings for parameter f (that is, the probability that a malicious peer returns an authentic file) and applying two values of β (0.66 and 1.0). The proposal described in [3], the most interesting proposal to be compared with Pos&Neg EigenTrust, is evaluated in similar network scenarios where β is set to these values. As can be seen and was concluded in Section 5.3, the Pos&Neg EigenTrust solution succeeds in reducing the number of unauthentic downloads versus the total number of files downloaded in the same period of time. It can be observed that this percentage remains close to 4%. With respect to true positive and false positive, it is interesting to make some remarks before concluding if the system works well or not. Ideally, the percentage of malicious peers that are identified should be 100% and the percentage of good peers that are identified as malicious peers should be 0%. Observing the results, the values obtained are very close to those values. In fact, the slight discrepancies in some scenarios must not be interpreted as system failures. Although the nodes are classified as good nodes or malicious ones, they are not ‘‘pure’’ good peers or ‘‘pure’’ malicious peers. The good peers offer 5% of unauthentic files when they are selected as sources. Therefore, although they are called ‘‘good’’ peers, they are acting in some sense as bad peers. On the other hand, depending on the f and β values, the malicious peers act in some sense as good peers. Thus, if a ‘‘malicious’’ peer acts closer to a good peer than a bad peer, the system performs well when the malicious peer is not included in the blacklist. 
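The two accuracy metrics defined in this section follow directly from the blacklist and the ground-truth roles of the peers. A minimal sketch is given below; the function name is hypothetical, and whether pre-trusted peers are counted among the good peers is left to the caller, since the text does not specify it.

```python
def identification_accuracy(blacklist, malicious_ids, good_ids):
    """Return (true positive %, false positive %) as defined in Section 5.4.

    true positive:  percentage of malicious peers identified as malicious
    false positive: percentage of good peers identified as malicious
    """
    listed = set(blacklist)
    true_positive = 100.0 * len(listed & set(malicious_ids)) / len(malicious_ids)
    false_positive = 100.0 * len(listed & set(good_ids)) / len(good_ids)
    return true_positive, false_positive


# Hypothetical blacklist: all 18 malicious peers (63-80) plus 3 of the 60 good peers.
good = range(0, 60)
malicious = range(63, 81)
blacklist = list(malicious) + [4, 17, 42]
print(identification_accuracy(blacklist, malicious, good))  # (100.0, 5.0)
```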
If these results are compared with the results published in [3], it can be concluded that the Pos&Neg EigenTrust proposal offers results very close to or even better than those. However, the proposal in [3] is substantially more complex to implement than
Table 7
Percentage of unauthentic downloads, true positive, and false positive. The network is composed of 60 good peers, 3 pre-trusted peers, and 18 malicious peers in a collective with camouflage. f % is the camouflage level by uploading authentic files. β % is the camouflage level by valuing a good peer positively.

β      Metric                   f = 0 (%)   f = 10 (%)   f = 20 (%)   f = 30 (%)   f = 40 (%)   f = 50 (%)
0.66   Unauthentic downloads    4.39        3.90         4.08         3.99         3.89         4.10
       True positive            100         100          100          100          100          99.21
       False positive           5.14        0.52         2.20         3.28         8.64         12.24
1.0    Unauthentic downloads    4.18        3.98         4.16         4.02         4.10         4.08
       True positive            100         100          100          100          100          99.99
       False positive           2.35        1.11         2.19         1.73         9.34         15.11
Table 8
Percentage of unauthentic downloads, true positive, and false positive. The network is composed of 60 good peers, 3 pre-trusted peers, and 36 malicious peers in a collective with camouflage. f % is the camouflage level by uploading authentic files. β % is the camouflage level by valuing a good peer positively.

β      Metric                   f = 0 (%)   f = 10 (%)   f = 20 (%)   f = 30 (%)   f = 40 (%)   f = 50 (%)
0.66   Unauthentic downloads    4.05        3.95         4.20         4.04         4.16         3.73
       True positive            100         100          100          98.10        89.58        78.19
       False positive           4.54        0.43         3.28         9.00         13.53        23.61
1.0    Unauthentic downloads    3.80        4.13         4.15         3.97         3.93         4.09
       True positive            100         100          100          99.83        93.94        80.11
       False positive           1.24        0.59         1.6          7.43         15.45        25.99
our solution. It requires the definition of three different parameters (badness, dishonesty, and clustering), whereas, in our case, the definition of badness already covers these aspects. In addition, the selection algorithm in [3] needs the global trust values ($t^+$) to be computed by integrating the three previous parameters at the same time, whereas our proposal works with $t^+$ and $t^-$ independently. To implement the algorithm proposed in [3], it is necessary to define 15 different parameters, among which only three can be calculated by applying a specific formula. The rest of them are tunable values which are quite difficult to set up properly, as its authors acknowledge in the paper.

6. Conclusions

EigenTrust is one of the most powerful reputation management algorithms for P2P file-sharing networks. The wide acceptance of this system in the research community is evident in the set of proposals trying to improve its performance. However, neither the original algorithm nor the others developed in recent years have taken into account cases in which the difference between satisfactory and unsatisfactory transactions is negative. This feature can be exploited by malicious peers to improve their efficiency in shutting down the system. In this paper, an enhancement of the EigenTrust system called Pos&Neg EigenTrust is proposed. This system considers the information about transactions in more depth. However, it is defined so that all the algebraic properties of the original EigenTrust are still valid. Using Pos&Neg, it is possible to obtain a blacklist containing the identities of the malicious nodes. The proposed algorithm has been tested by simulation in a wide range of scenarios. The simulation results show the improvements reached by this proposal even when advanced malicious node behaviors are considered. Therefore, Pos&Neg EigenTrust is able to identify the good and the malicious peers, and consequently to significantly reduce the number of unauthentic downloads in the network.
This work is a starting point for future work. Once the validity of our proposal has been demonstrated, advanced versions of Pos&Neg EigenTrust may be defined. In further work, we want to analyze the interest in and the convenience (or not) of including the modifications proposed by the EigenTrust-based systems described above. Furthermore, it will be interesting to do research into the use of additional parameters to calculate (or just think about) the local and global trust values. It could be interesting to consider, among other facts, the penalty for misbehavior over positive behavior (or vice versa), to consider if the node is novel or not, and to consider the characteristics of the file being uploaded.

Acknowledgments

This research has been supported by project grant TEC201021405-C02-02/TCM (CALM). It is also developed in the framework of “Programa de Ayudas a Grupos de Excelencia de la Region de Murcia, de la Fundacion Seneca, Agencia de Ciencia y Tecnologia de la RM (Plan Regional de Ciencia y Tecnologia 2007/2010)”.

References

[1] R. Aringhieri, E. Damiani, S. De Capitani di Vimercati, P. Samarati, Assessing efficiency of trust management in peer-to-peer systems, in: Proceedings of the 1st International Workshop on Collaborative Peer-to-Peer Information Systems, COPS05, 2005, pp. 368–373.
[2] D. Choi, S. Jin, Y. Lee, Y. Park, Personalized eigentrust with the beta distribution, ETRI Journal 32 (2) (2010) 348–350.
[3] D. Donato, S. Leonardi, M. Paniccia, Combining transitive trust and negative opinions for better reputation management in social networks, in: Proceedings of SNAKDD 2008: KDD Workshop on Social Network Mining and Analysis, in Conjunction with the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008.
[4] M. Faloutsos, P. Faloutsos, C. Faloutsos, On power-law relationships of the Internet topology, in: Proceedings of the ACM SIGCOMM’99, 1999, pp. 251–262.
[5] M. Feldman, J. Chuang, Overcoming free-riding behavior in peer-to-peer systems, ACM SIGecom Exchanges 5 (4) (2005) 41–50.
[6] F. Gomez Marmol, G. Martinez Perez, Security threats scenarios in trust and reputation models for distributed systems, Computers & Security 28 (2009) 545–556.
[7] F. Gomez Marmol, G. Martinez Perez, Towards pre-standardization of trust and reputation models for distributed and heterogeneous systems, Computer Standards & Interfaces 32 (2010) 185–196.
[8] S.D. Kamvar, M.T. Schlosser, H. Garcia-Molina, Incentives for combatting freeriding on P2P networks, Technical Report, Stanford University, 2003.
[9] S.D. Kamvar, M.T. Schlosser, H. Garcia-Molina, The eigentrust algorithm for reputation management in P2P networks, in: Proceedings of the 12th International Conference on World Wide Web, WWW’03, 2003, pp. 640–651.
[10] M. Karakaya, I. Korpeoglu, O. Ulusoy, Free riding in peer-to-peer networks, IEEE Internet Computing 13 (2) (2009) 92–98.
[11] Q. Lian, et al., Robust incentives via multilevel tit-for-tat, Concurrency and Computation: Practice & Experience 20 (2) (2008) 167–178.
[12] X. Li, J. Wang, A global trust model of P2P network based on distance-weighted recommendation, in: Proceedings of the IEEE International Conference on Networking, Architecture, and Storage, 2009, pp. 281–284.
[13] S. Marti, H. Garcia-Molina, Taxonomy of trust: categorizing P2P reputation systems, Computer Networks 50 (2006) 472–484.
[14] M. Ripeanu, I. Foster, Mapping the Gnutella network-macroscopic properties of large-scale P2P networks and implications for system design, IEEE Internet Computing Journal 6 (1) (2002).
[15] A. Samreen, S. Hussain, Trust management and incentive mechanism for P2P networks: survey to cope challenges, in: Proceedings of the IEEE International Multitopic Conference, INMIC 2008, 2008, pp. 301–306.
[16] Query cycle simulator. Available online at: http://P2P.standford.edu/www/download.htm.
[17] W. Wang, X. Wang, S. Pan, P. Liang, A new global trust model based on recommendation for peer-to-peer network, in: Proceedings of the International Conference on New Trends in Information and Service Science, 2009, pp. 325–328.
[18] R. Zhou, K. Hwang, Powertrust: a robust and scalable reputation system for trusted peer-to-peer computing, IEEE Transactions on Parallel and Distributed Systems 18 (4) (2007) 460–473.
Pilar Manzanares-Lopez received her Engineering degree in Telecommunications in 2001 from the Technical University of Valencia (UPV), Spain. In April 2006, she received her Ph.D. degree in Telecommunications from the Polytechnic University of Cartagena (UPCT), Spain. She has been an Assistant Professor in the Department of Information Technologies and Communications (Polytechnic University of Cartagena) since 2001. She has been involved in several national research projects related to multicast technology and multimedia facilities. She is the co-author of several papers in the fields of transport protocols, P2P networks, and distributed systems.
Josemaria Malgosa-Sanahuja received his Telecommunications Engineering degree in 1994 from the Polytechnic University of Catalonia (UPC), Spain. In November 2000, he received his Ph.D. degree in Telecommunications from the University of Zaragoza (UZ), Spain. He has been an assistant professor at the Department of Electronic and Communications Engineering (University of Zaragoza) since 1995. In September 1999, he joined the Polytechnic University of Cartagena (UPCT), Spain, as an Associate Professor. He has been involved in several national and international research projects related to multicast technologies (switching and protocols), multimedia value-added service design, and in-home/building network design. He is the author of several papers in the fields of switching, multicast technologies, and distributed systems. He has been regional correspondent of the Global Communications included in the IEEE Communications Magazine since 2002. His research group has been awarded Excellent Research Group honorable mention in the Spanish Murcia Region.

Juan Pedro Muñoz-Gea received his Telematics Technical Engineering degree (cum laude) and his Telecommunications Engineering degree (cum laude) in 2003 and 2005, respectively, both from the Polytechnic University of Cartagena (UPCT), Spain. In 2006, he started working as a research assistant at the UPCT. Since 2008, he has been an assistant professor with the Department of Information Technologies and Communications, UPCT, where he is pursuing a Ph.D. degree. His research interests are focused on overlay networks and P2P systems.