Detecting Sybil attacks in Wireless Sensor Networks using neighboring information


Computer Networks 53 (2009) 3042–3056 Contents lists available at ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet ...



Detecting Sybil attacks in Wireless Sensor Networks using neighboring information Kuo-Feng Ssu *, Wei-Tong Wang, Wen-Chung Chang Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, ROC

Article info

Article history: Received 3 November 2008; received in revised form 22 April 2009; accepted 28 July 2009; available online 4 August 2009.
Responsible Editor: J. Misic

Keywords: Wireless Sensor Networks; Security; Sybil attacks

Abstract

As the prevalence of Wireless Sensor Networks (WSNs) grows in the military and civil domains, the need for network security has become a critical concern. In a Sybil attack, the WSN is subverted by a malicious node which forges a large number of fake identities in order to disrupt the network’s protocols. In attempting to protect WSNs against such an attack, this paper develops a scheme in which the node identities are verified simply by analyzing the neighboring node information of each node. The analytical results confirm the efficacy of the approach given a sufficient node density within the network. The simulation results demonstrate that for a network in which each node has an average of 9 neighbors, the scheme detects 99% of the Sybil nodes with no more than a 4% false detection rate. The experimental result shows that the Sybil nodes can still be identified when the links are not symmetric. © 2009 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +886 6 2374532; fax: +886 6 2345482. E-mail address: [email protected] (K.-F. Ssu). 1389-1286/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2009.07.013

1. Introduction

Advances in the wireless communications field and the continuing trend toward device miniaturization have led to an increasing interest in Wireless Sensor Networks (WSNs) in recent years [1]. Such networks provide an ideal solution for a variety of monitoring and surveillance applications, including traffic control, health care, environmental monitoring, battlefield surveillance, disaster relief coordination, and so on. Typical WSNs comprise hundreds or even thousands of sensor nodes deployed over a wide and frequently unattended area. The sensor nodes are resource constrained both in terms of their memory capacity and their computational abilities. Each node is equipped with a receiver and a transmitter and communicates with its neighbors over a wireless channel. However, this leaves the network open to attack by malicious eavesdroppers. This is clearly a major concern, particularly when the data sensed by the network is of a critical nature. Thus, developing effective mechanisms to protect WSNs from malevolent attackers has emerged as a key requirement in the wireless communications and sensing field.

Sybil attacks pose a particularly serious threat to the integrity of WSNs. In such an attack, a single malicious node forges multiple entities within a network in order to mislead the genuine nodes into believing that they have many neighbors [2,3]. Sybil attacks are easily perpetrated in conventional WSNs because the sensor nodes are invariably deployed in a distributed, unstructured environment and communicate with one another solely via radio transmissions. Compared to other forms of network attack, Sybil attacks require very little in the way of specialized hardware and/or cooperation with other nodes in the network, yet have the ability to wreak havoc with many network operations, including voting [19–21], data aggregation, reputation evaluation, and so on [4]. Protection mechanisms based on voting or reputation approaches do not work efficiently because some nodes are actually impostors and cannot be relied upon to provide reliable information. Authentication methods [4–7] typically require significant memory space to store the necessary authentication information (e.g. shared encryption keys, identity certificates, and so forth) and involve the processing of


complex verification algorithms [8]. Furthermore, in the event that a malicious node successfully penetrates the authentication mechanism and gains access to the network, the overall integrity of the network protection scheme may be lost. Besides the authentication-based approaches, RSSI-based and TDOA-based schemes have also been proposed recently [8,9]. Since radio signals are prone to interference from the environment, the detection accuracy of the RSSI-based and TDOA-based schemes is affected as well. Other detection approaches are thus needed to improve security in WSNs.

In this paper, the fact that each malicious node creates many fake identities is exploited to develop a mechanism to protect a WSN from Sybil attacks. Unlike previous schemes, the mechanism does not utilize authentication-based methods, location information, or specialized hardware. It does not matter whether the attackers use a more powerful device than that of a legitimate node; for example, they may use a device that is capable of transmitting on more than one radio channel simultaneously. The protection scheme is based on the fundamental assumption that the probability of two nodes having exactly the same set of neighbors is extremely low provided that the network has a high node density. In a Sybil attack, the forged nodes typically have the same set of neighbors because they are all associated with the same physical device, i.e. the malicious node. As a result, the presence of a malicious node can be detected by checking the neighboring nodes of the suspected victim of the Sybil attack in order to determine whether or not some of these nodes share the same neighbors and are therefore Sybil nodes. The feasibility of the approach is evaluated both mathematically and numerically. An experiment was also conducted to validate the proposed scheme. The simulation results reveal that the protection scheme successfully identifies more than 99% of the Sybil nodes within a network with a very low false detection rate. The experimental result shows that the proposed scheme still works well when the links are asymmetric.

2. Related work The threat of Sybil attacks was first identified by Douceur in the context of peer-to-peer systems [2]. Douceur claimed that within such distributed computing environments, a single device could easily project multiple identities due to the lack of trusted, central authorities within the network. Karlof and Wagner argued that sensor networks were equally vulnerable to a Sybil attack since they too are deployed in a distributed, unstructured environment [10]. They discussed the relative merits of asymmetric and symmetric encryption techniques in protecting WSNs against Sybil attacks. When using an asymmetric encryption approach, the digital signature is achieved by pre-distributing public and private keys to all of the nodes in the network. By contrast, in a symmetric encryption mechanism, each node holds a unique key with the sink node. Whenever two nodes have information to communicate to one another, they first contact the sink to signal their desire to communicate. The sink verifies the identifications of the two nodes via their symmetric keys and then


distributes a shared key to each of them such that they can communicate directly with one another. Newsome et al. demonstrated that Sybil attacks have the potential to disrupt many of the protocols within WSNs, including data aggregation, voting, fair resource allocation, and misbehavior detection [4]. They proposed several schemes to protect the WSNs against Sybil attack, including a radio resource testing (RRT) mechanism and a random key pre-distribution (RKP) mechanism. The RRT mechanism is based on the assumption that each node in a general network is incapable of transmitting on more than one radio channel simultaneously. Whenever a node wishes to verify whether it is a victim of a Sybil attack, it assigns each of its neighbors a unique channel and requests them to broadcast an acknowledgement (ACK) message on their allocated channel at a specified time. The node then randomly tunes its receiver to a particular channel and waits to receive an ACK message. If no ACK message is received, the node infers that the node assigned this particular channel is a Sybil node since the malicious node is unable to broadcast the ACK message from all of its false identities on multiple channels simultaneously. By iteratively repeating this procedure, all of the fake nodes within the network can be established with a high degree of probability. In RKP, each node randomly picks k keys from a large pool of m keys. The number, m, is chosen such that two nodes will share at least one key with some probability after they pick their keys. The identity of the node is then combined with the particular set of keys which it chooses. In this way, any node can be authenticated by verifying some or all of the keys which it claims to possess. A key feature of Merkle hash trees is that each leaf value can be verified provided that its root value is known in advance [5]. Zhang et al. 
exploited this feature to develop a method to enable each node in a network to authenticate the identities of all the other nodes [6]. This approach provides a highly effective protection capability in small-scale WSNs. A fingerprint-based mechanism was also proposed [11]. Each sensor node has a social fingerprint which is computed from the neighborhood information through a superimposed s-disjunct code. A cloned sensor cannot have a valid social fingerprint for other neighborhoods (or communities) to which the original node does not belong. If the clones pretend to be members of other communities and transmit messages to the other members, a fingerprint inconsistency will occur. The legitimate nodes will then raise an alarm to the base station for further judgement.

Various non-authentication type protection methods have also been proposed for thwarting Sybil attacks. For example, Demirbas and Song proposed the use of a Received Signal Strength Indicator (RSSI) to detect Sybil attacks [8]. Whenever it receives a message from a new sender, the node computes the RSSI of the message. The node binds the RSSI with the sender’s ID and stores it in a lookup table. If at any time in the future the node receives another message with the same RSSI but a different sender ID, it immediately perceives itself to be a victim of a Sybil attack. For each RSSI occurrence, four different simultaneous indicators were required to accommodate the variable and unreliable nature of the received signal strength. Wang et al. proposed a similar mechanism [12] for cluster-based WSNs, using the Jakes channel model, in which path loss and fading are taken into account. The TDOA-based mechanism associates the TDOA ratio with the sender’s ID [9]. Once two different identities are observed with the same TDOA ratio, a Sybil attack is detected. A hybrid intelligence scheme to predict Sybil attacks was proposed by Muraleedharan et al. [13]. They used a swarm intelligence algorithm to collect the information of each route. A malicious node can be detected by its energy variation. The information collected by the swarm agents is used as training data for a Bayesian network to adjust the threshold parameter. Huang et al. proposed a clock skew based scheme to verify the sensor nodes [14]. From their experiments, they found that each sensor node had its own clock skew. Since the Sybil nodes were forged by the same physical node, they had the same clock skew. Therefore, the Sybil nodes can be identified.

3. Assumptions and attack model

3.1. System assumptions

The WSN comprises $n$ sensor nodes randomly distributed in an $m \times m$ square area. The nodes are all stationary and are unaware of their locations. It is assumed that the nodes communicate with one another via a wireless radio channel and broadcast in an omni-directional mode. When a node transmits a message, the message is received (i.e. “heard”) only by those nodes within the sender’s communication range (designated hereafter as “neighboring nodes” or simply “neighbors”). During the detection of Sybil nodes, each message is sent several times to make sure that it is received by all of the sender’s one-hop neighbors.

3.2. Attack model

According to the categories defined by Newsome et al. [4], the direct, fabricated and simultaneous Sybil attack is considered here. It is assumed that the network is insecure and that the nodes can be compromised with a certain probability. A compromised node is referred to as a malicious node, while the remaining nodes within the network are referred to as normal nodes. Each malicious node cheats its neighbors by creating multiple identities, referred to henceforth as Sybil nodes. When a node sends a message to any one of the Sybil nodes, the message is received and answered (if needed) by the malicious node. The attack model assumes that the malicious node forges new identities for the multiple entities which it creates. The main mission of the malicious node is to trick the normal nodes in the network into believing that they have many neighbors. Since these nodes do not in fact exist, many of the network protocols are seriously disturbed or rendered inoperable.

Newsome et al. claimed that a Sybil attack can seriously degrade WSN protocols such as data aggregation and voting methods even with a small number of malicious nodes, provided that the number of Sybil nodes is large enough [4]. For the performance of these protocols to be disrupted in this way, the number of fraudulent node identities must necessarily be sufficiently high. Hence, it is also assumed here that the number of Sybil nodes in the neighboring set is greater than the number of normal nodes.

4. Detection of Sybil nodes

When a malicious node discerns itself to be within the communication range of a normal node, it immediately forges multiple Sybil nodes. Since these Sybil nodes are all associated with the same physical node (i.e. the malicious node), they share the same set of neighbors. This section demonstrates how this characteristic of Sybil nodes can be exploited to detect a Sybil attack by collecting neighboring information and then analyzing the results.

4.1. Detection mechanism

As described earlier, all of the Sybil nodes have a similar set of neighbors because they are all associated with the same physical node, and the number of Sybil nodes is large. Consequently, there should be a set of nodes that appear more often than the others in the collected neighbor sets. The objective of the detection scheme is to find such a set, called the critical set $C$, and to determine the Sybil nodes with its help. Before introducing the proposed method, it is pertinent to define the corresponding notation:

• $SN$: the set of all of the sensor nodes, including Sybil nodes which are forged by the malicious nodes.
• $M$: the malicious node responsible for the Sybil attack.
• $NB_i$: the set of $i$’s neighbors, $i \in SN$.
• $CNB_{i,j}$: the set of common neighbors of both $i$ and $j$; $i, j \in SN$, $i \neq j$. Therefore, $CNB_{i,j} = NB_i \cap NB_j$.
• $NB^n_i$: the set of $i$’s normal neighbors, $i \in SN$. Since $NB_i$ contains all of $i$’s neighbors, including Sybil nodes and normal nodes, $NB^n_i \subseteq NB_i$.
• $CNB^n_{i,j}$: the set of common normal neighbors of both $i$ and $j$. Also, $CNB^n_{i,j} \subseteq CNB_{i,j}$.

The protection scheme is executed by a normal node which suspects itself to be the victim of a Sybil attack; such a node is called $V$ in the following. In order to find $C$, $V$ first establishes $CNB_{i,V}$, $\forall i \in NB_V$. The procedure employed to determine each $CNB_{i,V}$ can be summarized as follows:

I. Node $V$ sends a request message to one of its neighbors, e.g. node $i$.
II. When $i$ receives this message, it broadcasts a message over its maximum transmission range.
III. Any node hearing this message (e.g. node $j$) replies using a one-hop broadcast directly to $V$.
IV. Node $V$ records the IDs of the nodes which send a reply and combines these IDs to form the set $CNB_{i,V}$.


V. The above steps are repeated until all of the $CNB_{i,V}$ have been collected.

Fig. 1 illustrates the procedure described above. Importantly, the Sybil node detection method is based upon a simple information collection and analysis approach rather than a direct interrogation of each node. This strategy avoids the risk of the malicious node deliberately responding with false information in order to confuse $V$. In the second step, when $i$ broadcasts its message, it is heard not only by all its neighboring nodes, but also by $V$ itself. If $V$ does not observe the broadcast from $i$, $i$ is treated as a Sybil/malicious node. This prevents the malicious node from deliberately hiding its normal neighbors and also improves the accuracy of compiling $CNB_{M,V}$.

Having collected all of the information from its neighboring nodes, $V$ then counts the number of times each node ID appears in the sets $CNB_{i,V}$ ($\forall i \in NB_V$) that it has compiled. This number is denoted $AP(i)$, i.e. the total number of appearances of node $i$ in $CNB_{j,V}$ ($\forall j \in NB_V$). Since the Sybil nodes have the same set of neighbors and the number of Sybil nodes is much larger than the number of normal nodes, there will be some nodes whose $AP(i)$ is much higher than that of the others; these nodes are the neighbors of the Sybil nodes. Therefore, if the number of appearances of a node exceeds a properly chosen threshold value $h$, the node is designated a critical member and assigned to $C$. Hence, $C = \{\, i \mid AP(i) > h,\ i \in NB_V \,\}$. Node $i$ is considered to be a Sybil node if $CNB_{i,V} \supseteq C$.

Fig. 2 presents a schematic illustration of the proposed detection approach. In this figure, nodes 1–6 are normal nodes, while nodes 7–14 are Sybil nodes created by the malicious node $M$. The sets $CNB_{i,V}$, $i = 1, 2, 3, \ldots, 14, M$, are listed on the right-hand side of the figure. As shown, each set contains not only the normal nodes amongst its neighbors, but also some Sybil nodes. For example, $CNB_{1,V}$ contains normal nodes 2 and 4, and Sybil nodes 7, 9, 13, and 14. The total number of appearances of each node ID in all of the sets $CNB_{i,V}$, $i = 1, 2, 3, \ldots, 14, M$, is shown in the table on the left-hand side of the figure. Assume that the threshold is set to 10.5 (how to determine the value of the threshold is discussed in Section 5); nodes 4–6 are then designated critical members. As a result, nodes 7–14 and node $M$ are all identified as Sybil nodes since their common-neighbor sets contain nodes 4–6.

Some normal nodes are erroneously identified as Sybil nodes if their neighbors include the complete set $C$. For example, in Fig. 2, node 2 is also identified as a Sybil node since $CNB_{2,V}$ includes $C$. In other words, node 2 represents a false detection result. Fig. 3 illustrates some typical false detection scenarios. In Fig. 3a, both $i$ and $j$ are considered by $V$ to be Sybil nodes because their neighbors include the complete set $C$. However, in Fig. 3b, only $j$ is considered to be a Sybil node since the neighbors of $i$ include only a subset of $C$. Similarly, in Fig. 3c, only $i$ is considered to be a Sybil node. Finally, in Fig. 3d, neither $i$ nor $j$ is considered to be a Sybil node.

4.2. Enhancement mechanism
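As a recap of the detection rule of Section 4.1 (appearance counts $AP(i)$, critical set $C$, superset test), the following Python sketch may help. It is not the authors' implementation: the function name `detect_sybil`, the dictionary layout, the guard for an empty critical set, and the toy neighborhood (normal nodes 1 to 4, Sybil identities `s1` to `s5`) are all assumptions made for illustration.

```python
from collections import Counter

def detect_sybil(cnb, h):
    """Apply the critical-set rule of Section 4.1.

    cnb: dict mapping each neighbor i of V to the set CNB_{i,V} collected by V.
    h:   appearance threshold.
    Returns (critical_set, flagged_nodes).
    """
    # AP(i): number of collected sets CNB_{j,V} in which node i appears.
    ap = Counter(node for members in cnb.values() for node in members)
    # C = {i | AP(i) > h, i in NB_V}
    critical = {i for i in cnb if ap[i] > h}
    # Node i is flagged when CNB_{i,V} is a superset of C. Flagging nobody
    # when C is empty is a choice of this sketch, not stated in the paper.
    flagged = {i for i, members in cnb.items() if critical and critical <= members}
    return critical, flagged

# Hypothetical neighborhood of V: the malicious device is heard by nodes 2 and 3.
sybils = {"s1", "s2", "s3", "s4", "s5"}
cnb = {1: {2}, 2: {1, 3} | sybils, 3: {2, 4} | sybils, 4: {3}}
for s in sybils:
    cnb[s] = {2, 3} | (sybils - {s})

critical, flagged = detect_sybil(cnb, h=0.7 * len(cnb))
print("critical set:", critical)
print("flagged as Sybil:", flagged)
```

In this toy case the threshold $0.7|N|$ (the value used later in the simulations) separates nodes 2 and 3, which appear in seven sets, from the Sybil identities, which appear in six, so only the Sybil identities are flagged.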

Fig. 1. An example of $V$ collecting $CNB_{i,V}$.

Fig. 2. Illustration of detection method.

Fig. 3. Typical false detection scenarios.

Fig. 4. Reducing communication range to enhance detection performance.

When the malicious node is far from $V$, there are fewer common normal neighbors of $V$ and the malicious node, which means that the critical set is smaller. The smaller the critical set, the higher the probability that a normal node has the whole critical set in its neighbor set. Therefore, there will be more false positives. If the malicious node can be excluded, it will no longer have an effect and the remaining neighbors of $V$ will be normal nodes. If the normal nodes that were erroneously identified as Sybil nodes are still neighbors of $V$, they will be corrected to normal nodes and the number of false positives will be reduced. As shown in Fig. 4, $i$ is falsely identified as a Sybil node when the original communication range is used. Since adjusting the communication range of sensor nodes has been widely used [15,16], the mechanism is also adopted here. The communication range of $V$ is reduced until the malicious node falls outside of its range such that the malicious node no longer has an effect. As a result, $i$ can be identified as a normal node since $i$ is still $V$’s neighbor, and the number of false detection events is reduced.

The enhancement procedure can be summarized as follows. Upon completion of the detection procedure, $V$ possesses a set $S$ whose members represent the Sybil nodes determined by $V$. $V$ reduces its communication range gradually and, after each reduction, sends a message to determine which nodes are still in $S$. Once the number of members in $S$ falls below half of its previous value, it can be inferred that $M$ is now out of $V$’s current communication range, and the enhancement procedure can be stopped. Since $S$ includes all of the Sybil nodes and some normal nodes, the inference that halving the number of nodes in $S$ indicates that the malicious node now lies outside of $V$’s communication range stems from the original assumption that the number of Sybil nodes far exceeds the number of normal nodes. Those nodes remaining in the set are then judged to be normal nodes. In the worst-case scenario, if all of the nodes (i.e. both the normal nodes and the Sybil nodes) are classified as Sybil nodes, the appropriate setting of $V$’s communication range to exclude the malicious node can still be determined by the enhancement.

5. Analysis

A series of analyses are performed to investigate several fundamental issues relating to the proposed detection scheme.

Fig. 5. Analysis model.

5.1. Preliminaries

The communication range of each node is denoted by $r$. Node $i$ is one of $V$’s neighbors. The distance between $i$ and $V$ is denoted by $x$, and the gray area representing the overlap of the communication ranges of $V$ and $i$ is denoted by $A(x)$ (see Fig. 5). The CDF (Cumulative Distribution Function) of $x$ is given by

$$F(x) = \begin{cases} 0, & x < 0, \\ x^2/r^2, & 0 \le x \le r, \\ 1, & x > r. \end{cases} \tag{1}$$

The PDF (Probability Density Function) is the derivative of the CDF, and has the form

$$f(x) = \begin{cases} 2x/r^2, & 0 \le x \le r, \\ 0, & \text{otherwise}. \end{cases} \tag{2}$$

The area of $A(x)$ is computed as (the details are shown in Appendix A)

$$A(x) = 2r^2 \arccos\!\left(\frac{x}{2r}\right) - x\sqrt{r^2 - \frac{x^2}{4}}. \tag{3}$$

Thus, the expected value of $A(x)$ is shown to be (the details are shown in Appendix B)

$$E[A(x)] = \int_0^r \left[ 2r^2 \arccos\!\left(\frac{x}{2r}\right) - x\sqrt{r^2 - \frac{x^2}{4}} \right] \frac{2x}{r^2}\, dx = \left( \pi - \frac{3\sqrt{3}}{4} \right) r^2. \tag{4}$$

The notation $A_E$ is substituted for $E[A(x)]$ in the following.

5.2. Node density

Assume that the WSN comprises $n$ physical nodes deployed within an $m \times m$ square area. The node density, $d$, of the environment is therefore given by $n/(m \times m)$. Let $|CNB^n_{i,V}| = A_E \cdot d$ be the number of nodes in $CNB^n_{i,V}$, $\forall i \in NB^n_V$, and let $|NB^n_V| = \pi r^2 d$ be the number of nodes in $NB^n_V$. There are $\binom{|NB^n_V|}{\lfloor |CNB^n_{i,V}| \rfloor}$ possibilities for each $CNB^n_{i,V}$, given that the overlap area of the communication ranges of two nodes is $A_E$. The probability that $CNB^n_{i,V}$ equals $CNB^n_{j,V}$, where $i, j \in NB^n_V$ and $i \neq j$, can be derived from

$$\frac{1}{\binom{|NB^n_V|}{\lfloor |CNB^n_{i,V}| \rfloor}} = \frac{1 \cdot 2 \cdot 3 \cdots \lfloor |CNB^n_{i,V}| \rfloor}{\pi r^2 d\,(\pi r^2 d - 1)(\pi r^2 d - 2) \cdots \left(\pi r^2 d - \lfloor |CNB^n_{i,V}| \rfloor + 1\right)}, \tag{5}$$

where $d = n/m^2$ and $|CNB^n_{i,V}| = \left(\pi - \frac{3\sqrt{3}}{4}\right) r^2 \, \frac{n}{m^2}$.

Fig. 6 illustrates the probability in the case of a WSN with $m = 100$ and nodes with a communication range of $r = 10$. The probability reduces to a value close to zero when $n = 300$. The results confirm the fundamental assumption made in this paper: the probability that the two common neighbor sets formed by a normal node with two of its different neighbors are exactly the same is extremely low when the node density is sufficiently high.

Fig. 6. Probability of two nodes having exactly the same neighbors ($m = 100$, $r = 10$; horizontal axis: $n$ from 0 to 500; vertical axis: probability from 0 to 1).

5.3. Threshold value $h$

The threshold parameter $h$ determines the size of set $C$, which has a direct impact on the performance of the detection scheme. As the size of set $C$ reduces, the number of nodes considered to be Sybil nodes increases. Conversely, if the number of nodes in set $C$ increases to the extent that $C \supset CNB^n_{M,V}$, none of the Sybil nodes will be detected. The best detection result is obtained when $C = CNB^n_{M,V}$. Since each of the Sybil nodes, like the malicious node itself, has $CNB^n_{M,V}$ as a subset of its neighbor set, having $CNB^n_{M,V}$ as the critical set $C$ is enough to determine all of the Sybil nodes. As described in Section 5.2,

$$|NB^n_V| = \pi r^2 d \;\Rightarrow\; d = \frac{|NB^n_V|}{\pi r^2}, \tag{6}$$

$$|CNB^n_{i,V}| = A_E \cdot d = \frac{A_E}{\pi r^2}\, |NB^n_V|. \tag{7}$$

Let $T$ denote the number of appearances of a critical member in the set $C$ and let the number of Sybil nodes be denoted as $|S|$. Since the best result comes from $C = CNB_{M,V}$, the number of appearances of a critical member is the sum of the number of its physical neighbors and the number of Sybil nodes, which can be expressed as

$$T = |CNB^n_{i,V}| + |S| = \frac{A_E}{\pi r^2}\, |NB^n_V| + |S|. \tag{8}$$

Let $V$ have a total number of $|N|$ neighbors (including both physical and Sybil nodes), i.e.

$$|N| = |NB^n_V| + |S| \;\Rightarrow\; |S| = |N| - |NB^n_V|. \tag{9}$$

And

$$T = \frac{A_E}{\pi r^2}\, |NB^n_V| + |S| = \frac{A_E}{\pi r^2}\, |NB^n_V| + |N| - |NB^n_V| = \left( \frac{A_E}{\pi r^2} - 1 \right) |NB^n_V| + |N|. \tag{10}$$

As stated earlier, an assumption is made that the number of Sybil nodes is much higher than the number of normal nodes, i.e.

$$|S| > |NB^n_V|, \tag{11}$$

$$|N| = |NB^n_V| + |S| > |NB^n_V| + |NB^n_V| = 2\,|NB^n_V| \;\Rightarrow\; |NB^n_V| < \frac{|N|}{2}, \tag{12}$$

and in (10),

$$\frac{A_E}{\pi r^2} = \frac{\left(\pi - \frac{3\sqrt{3}}{4}\right) r^2}{\pi r^2} = \frac{\pi - \frac{3\sqrt{3}}{4}}{\pi} < 1 \;\Rightarrow\; \frac{A_E}{\pi r^2} - 1 < 0. \tag{13}$$

Hence, (10) can be rewritten as

$$\left( \frac{A_E}{\pi r^2} - 1 \right) |NB^n_V| + |N| > \left( \frac{A_E}{\pi r^2} - 1 \right) \frac{|N|}{2} + |N| = \left[ \frac{1}{2} \left( \frac{A_E}{\pi r^2} - 1 \right) + 1 \right] |N| = \left[ \frac{1}{2} \left( \frac{\pi - \frac{3\sqrt{3}}{4}}{\pi} - 1 \right) + 1 \right] |N| \approx 0.79\,|N|. \tag{14}$$

Based on the above calculation, the expected value of $h$ is around $0.79|N|$.
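The closed forms in Eqs. (3)–(5) and the coefficient in Eq. (14) can be checked numerically. The sketch below is illustrative only: the function names are made up here, the integral of Eq. (4) is approximated with a simple midpoint rule, and $|NB^n_V|$ is floored to an integer so that a standard binomial coefficient can be used (the paper floors only the lower index, so this is an assumption).

```python
import math

def overlap_area(x, r):
    """Eq. (3): overlap area of two radius-r discs whose centers are x apart."""
    return 2 * r**2 * math.acos(x / (2 * r)) - x * math.sqrt(r**2 - x**2 / 4)

def expected_overlap(r, steps=20_000):
    """Eq. (4): midpoint-rule integral of A(x) * f(x) with f(x) = 2x / r^2."""
    dx = r / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += overlap_area(x, r) * (2 * x / r**2) * dx
    return total

def same_set_probability(n, m, r):
    """Eq. (5): 1 / C(|NB|, floor(|CNB|)); |NB| is floored too (assumption)."""
    d = n / m**2                                    # node density
    a_e = (math.pi - 3 * math.sqrt(3) / 4) * r**2   # closed form of Eq. (4)
    nb = math.floor(math.pi * r**2 * d)
    cnb = math.floor(a_e * d)
    return 1 / math.comb(nb, cnb)

def threshold_coefficient():
    """Bracketed factor of Eq. (14), i.e. the multiplier of |N|."""
    ratio = (math.pi - 3 * math.sqrt(3) / 4) / math.pi
    return 0.5 * (ratio - 1) + 1

print(expected_overlap(10.0), (math.pi - 3 * math.sqrt(3) / 4) * 100)
print(same_set_probability(300, 100, 10))
print(round(threshold_coefficient(), 4))
```

With $m = 100$, $r = 10$ and $n = 300$ the floored quantities are 9 neighbors and 5 common neighbors, so the probability of Eq. (5) is $1/126$ under these rounding assumptions.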


5.4. Influence of malicious node and neighboring node positions on detection accuracy

As shown in Fig. 7a, let the distance between the malicious node and $V$ be $x$. Let the intersection area of the communication ranges of $V$ and the malicious node be denoted as $A(x)$. The number of nodes within $A(x)$ is denoted as $|CNB^n_{M,V}|$ and is given by $|CNB^n_{M,V}| = A(x) \cdot d$. Furthermore, the number of nodes within $V$’s communication range is $|NB^n_V| = \pi r^2 d$. For any node $i$ lying within $V$’s communication range, all the possible combinations of $CNB^n_{i,V}$ can be computed as (unlike (5), $A_E$ is not considered here)

$$\binom{|NB^n_V|}{0} + \binom{|NB^n_V|}{1} + \binom{|NB^n_V|}{2} + \cdots + \binom{|NB^n_V|}{\lfloor |NB^n_V| \rfloor} = \sum_{i=0}^{\lfloor |NB^n_V| \rfloor} \binom{|NB^n_V|}{i}. \tag{15}$$

In the ideal case, a node is considered to be a Sybil node if $CNB_{i,V} \supseteq CNB_{M,V}$. Therefore, the total number of combinations of $CNB_{i,V}$ for which $i$ is considered to be a Sybil node is given by

$$\sum_{i=|CNB^n_{M,V}|}^{\lfloor |NB^n_V| \rfloor} \binom{|NB^n_V| - |CNB^n_{M,V}|}{i - \lfloor |CNB^n_{M,V}| \rfloor}. \tag{16}$$

The probability of a node being considered as a Sybil node by $V$ can be computed as

$$\frac{\displaystyle\sum_{i=|CNB^n_{M,V}|}^{\lfloor |NB^n_V| \rfloor} \binom{|NB^n_V| - |CNB^n_{M,V}|}{i - \lfloor |CNB^n_{M,V}| \rfloor}}{\displaystyle\sum_{i=0}^{\lfloor |NB^n_V| \rfloor} \binom{|NB^n_V|}{i}} = \frac{\displaystyle\sum_{i=|CNB^n_{M,V}|}^{\lfloor |NB^n_V| \rfloor} \binom{|NB^n_V| - |CNB^n_{M,V}|}{i - \lfloor |CNB^n_{M,V}| \rfloor}}{2^{\lfloor |NB^n_V| \rfloor}}. \tag{17}$$

Fig. 7. (a) Analysis model for malicious node only; (b) analysis model for malicious node and i.

Fig. 8 illustrates the probability of a node being considered as a Sybil node as a function of the distance between the malicious node and V. It is observed that the probability increases exponentially as the distance increases.

Fig. 8. False detection rate as function of distance between malicious node and V.
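The trend summarized in Fig. 8 can be reproduced from Eqs. (15)–(17) with a few lines of Python. This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, and both $|NB^n_V|$ and $|CNB^n_{M,V}|$ are floored to integers so that `math.comb` applies. The example density $d = 0.03$ corresponds to $n = 300$ nodes in a $100 \times 100$ area.

```python
import math

def overlap_area(x, r):
    """Eq. (3): overlap area of two radius-r discs whose centers are x apart."""
    return 2 * r**2 * math.acos(x / (2 * r)) - x * math.sqrt(r**2 - x**2 / 4)

def false_detection_probability(x, r, d):
    """Eq. (17) for a malicious node at distance x from V, node density d."""
    nb = math.floor(math.pi * r**2 * d)          # |NB^n_V|, floored (assumption)
    cnb_m = math.floor(overlap_area(x, r) * d)   # |CNB^n_{M,V}|, floored (assumption)
    # Eq. (16): neighbor sets of size i that contain all of CNB^n_{M,V}.
    favourable = sum(math.comb(nb - cnb_m, i - cnb_m) for i in range(cnb_m, nb + 1))
    # Eq. (15): all possible neighbor sets; this sum equals 2**nb.
    total = sum(math.comb(nb, i) for i in range(nb + 1))
    return favourable / total

for x in (0.0, 2.5, 5.0, 7.5, 10.0):
    print(x, false_detection_probability(x, r=10.0, d=0.03))
```

With integer set sizes the numerator sums to $2^{|NB^n_V| - |CNB^n_{M,V}|}$, so the ratio reduces to $2^{-|CNB^n_{M,V}|}$, which makes the growth with distance explicit: the farther the malicious node, the fewer common normal neighbors it shares with $V$.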


Fig. 9. Probability of false detection as function of distances $x$ and $y$ (axes: distance $x$ between malicious node and $V$; distance $y$ between $i$ and $V$).
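The surface summarized in Fig. 9 comes from Eq. (18) of this section, the ratio of two binomial coefficients. A minimal Python sketch of that ratio follows. The function names are hypothetical, each set size is floored separately (the paper floors the difference of the two sizes, so this is an assumption), and a probability of zero is returned when node $i$ has fewer common neighbors than the malicious node, since it then cannot contain the whole critical set.

```python
import math

def overlap_area(x, r):
    """Eq. (3): overlap area of two radius-r discs whose centers are x apart."""
    return 2 * r**2 * math.acos(x / (2 * r)) - x * math.sqrt(r**2 - x**2 / 4)

def false_detection_probability_xy(x, y, r, d):
    """Eq. (18): x = distance M to V, y = distance i to V, d = node density."""
    nb = math.floor(math.pi * r**2 * d)          # |NB^n_V|
    cnb_m = math.floor(overlap_area(x, r) * d)   # |CNB^n_{M,V}|
    cnb_i = math.floor(overlap_area(y, r) * d)   # |CNB^n_{i,V}|
    if cnb_i < cnb_m:
        return 0.0  # i cannot contain all of the critical set
    return math.comb(nb - cnb_m, cnb_i - cnb_m) / math.comb(nb, cnb_i)

for y in (0.0, 5.0, 10.0):
    print(y, false_detection_probability_xy(10.0, y, r=10.0, d=0.03))
```

For a malicious node at the edge of $V$’s range ($x = r$), the printed values decrease as $y$ grows, in line with the observation that false detection is more likely for nodes close to $V$.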

Fig. 7b shows the analysis model for the case where both the malicious node and $i$ are considered. The distance between the malicious node and $V$ is denoted as $x$, while that between $i$ and $V$ is denoted as $y$. The intersection areas between the communication range of $V$ and those of the malicious node and $i$ are denoted as $A(x)$ and $A(y)$, respectively. As discussed previously, the total number of possible results of $CNB^n_{i,V}$ is given by $\binom{|NB^n_V|}{\lfloor |CNB^n_{i,V}| \rfloor}$, while the total number of combinations of $CNB^n_{i,V}$ for which $i$ is considered to be a Sybil node is $\binom{|NB^n_V| - |CNB^n_{M,V}|}{\lfloor |CNB^n_{i,V}| - |CNB^n_{M,V}| \rfloor}$. Thus, the probability of $i$ being falsely identified as a Sybil node is given by

$$\frac{\binom{|NB^n_V| - |CNB^n_{M,V}|}{\lfloor |CNB^n_{i,V}| - |CNB^n_{M,V}| \rfloor}}{\binom{|NB^n_V|}{\lfloor |CNB^n_{i,V}| \rfloor}}. \tag{18}$$

Fig. 9 shows the probability of $i$ being erroneously considered as a Sybil node as a function of distances $x$ and $y$. It can be seen that the probability increases as the value of $x$ increases or the value of $y$ decreases. The results confirm the suitability of the enhancement mechanism as a means of reducing the false detection rate.

5.5. Memory cost

Each node has to store the common neighbors that it shares with each of its neighbors. Assuming each node has $|NB_{AVG}|$ neighbors, the number of common neighbors shared with each neighbor is $O(|NB_{AVG}|)$. The additional memory space needed to store all of the common neighbors is therefore $O(|NB_{AVG}|^2)$. This additional memory space can be released after the Sybil node detection.

5.6. Energy consumption

The energy consumption of the proposed mechanism is mainly spent on the procedure that constructs the CNB list of each node, so the energy consumption is quantified by the number of additional control messages. The computing power needed to determine the Sybil nodes is omitted since it is much smaller than the transmission energy. There are three steps for a node to construct the CNB list of one of its neighbors. In the first step, $V$ sends a request message to $i$. Node $i$ then broadcasts a message to all of its neighbors to ask them to reply to $V$. Finally, each of these neighbors replies to $V$. Therefore, the number of transmitted messages is $1 + 1 + |NB_{AVG}|$ for a node to construct the CNB list of one of its neighbors. The total number of transmitted messages is $O(|SN| \cdot |NB_{AVG}| \cdot |NB_{AVG}|)$.

5.7. Time cost of the enhancement mechanism

In each calibration, the enhancement mechanism reduces the current communication range and determines the remaining neighbor set. Assume that the enhancement mechanism includes $a$ calibrations and that the time to complete one calibration is $b$. The total execution time of the enhancement mechanism is then $a \cdot b$.
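The calibration loop whose cost is counted in Section 5.7 can be sketched as follows, based on the enhancement procedure of Section 4.2. Everything named here is hypothetical: `enhance`, the `probe` callback (standing in for "send a message at the reduced range and see which suspects still answer"), the fixed `step` by which the range shrinks, and the reading of the stopping rule as "fewer than half of the previous suspects remain". The toy distances place five Sybil identities at the malicious device, 8 units from $V$, and two falsely flagged normal nodes closer in.

```python
def enhance(suspects, probe, r, step):
    """Shrink V's range until the suspect set drops below half its previous size.

    suspects: IDs flagged as Sybil by the detection phase (the set S).
    probe:    callable(range) -> set of IDs that still answer V at that range.
    Returns (nodes judged normal, number of calibrations performed).
    """
    remaining = set(suspects)
    current_range = r
    calibrations = 0
    while remaining and current_range - step > 0:
        current_range -= step
        calibrations += 1
        still_in_range = remaining & probe(current_range)
        if len(still_in_range) < len(remaining) / 2:
            # The malicious device is inferred to be out of range;
            # the suspects that still answer are judged to be normal nodes.
            return still_in_range, calibrations
        remaining = still_in_range
    return set(), calibrations  # the suspect set never halved: nothing is cleared

# Hypothetical distances from V to each suspect.
distance = {"a": 3.0, "b": 6.0, **{f"s{k}": 8.0 for k in range(1, 6)}}

def probe(rng):
    return {node for node, dist in distance.items() if dist <= rng}

cleared, a = enhance(distance.keys(), probe, r=10.0, step=1.0)
b = 0.5  # assumed time per calibration, arbitrary units
print(cleared, a, "total time:", a * b)
```

Here the suspect set collapses from seven to two members at the third calibration, so $a = 3$ and the total time is $3b$.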


6. Numerical evaluation

The performance of the detection scheme was evaluated by performing a series of numerical simulations. Two performance metrics were considered, namely the detection rate, i.e. the percentage of Sybil nodes identified, and the false detection rate, i.e. the percentage of normal nodes erroneously classified as Sybil nodes.

6.1. Simulation model

The simulation environment was implemented using Dev-C++ [17]. In executing the simulations, it was assumed that the network comprised $n$ nodes randomly distributed within a 100 m × 100 m area. The sensing field contained 20 malicious nodes, each of which forged $|S|$ Sybil nodes. Every node (both normal and malicious) had a maximum communication range of 10 m, but could reduce this communication range by regulating its transmission power. In the simulations, all of the nodes were considered to be static and had sufficient power to operate normally at all times. To ensure the validity of the results, each simulation was repeated 10 times and the results were then averaged to obtain a final value.

6.2. Threshold parameter

Fig. 10 shows the variation of the detection rate and the false detection rate as the threshold value is increased from $0.6|N|$ to $0.85|N|$, where $|N|$ is the total number of $V$’s neighbors, for a constant $n = 300$ and $|S| = 20$. Both performance measures increase as the threshold value increases. In other words, in specifying a suitable value of $h$, a tradeoff is required between the percentage of Sybil nodes detected and the number of normal nodes erroneously classified as Sybil nodes. In practice, the value of $h$ should therefore be specified in accordance with the degree of network security demanded by the particular sensing application. It is also shown in Fig. 10 that the detection rate is more than 99% and the error rate is less than 5% when the threshold value is set to $0.7|N|$, which is close to the expected value ($0.79|N|$) derived in Section 5. The increase

35

100 99

30 25

97

Error Rate (%)

Detecting Rate (%)

98

96 95 94 93

20 15 10

92

5 91 90 0.55

0.6

0.65

0.7

0.75

0.8

0.85

0

0.9

0.55

0.6

0.65

Threshold

0.7

0.75

0.8

0.85

0.9

Threshold

Fig. 10. Effect of threshold value on detection performance for n ¼ 300 and jSj ¼ 20.

12 Mathematical analysis results Simulation results

10

Error Rate (%)

Detecting Rate (%)

8

6

4

2

150

200

250

300

350

400

450

500

0 150

200

250

300

Fig. 11. Effect of network density on detection performance h ¼ 0:7jNj and jSj ¼ 20.

350

400

450

500

3052

K.-F. Ssu et al. / Computer Networks 53 (2009) 3042–3056

in detecting rate cannot compensate the increase in error rate as the threshold value is set more than j0:7jN. Therefore, h is set 0:7jNj in the following evaluation.

and h ¼ 0:7jNj. It is observed that the detection rate increases and the false detection rate decreases as the number of Sybil nodes increases. The effectiveness of the detection scheme improves as the number of identities forged by the malicious node increases.

6.3. Node density The simulations commenced by considering the effect of the node density on the detection performance of the scheme. Fig. 11 shows the results obtained for the detection rate and the false detection rate for the case of a threshold value of h ¼ 0:7jNj, and jSj ¼ 20. The results indicate that the detection rate exceeds 99% irrespective of the number of nodes in the network. It can also be seen that the false detection rate reduces from around 11% to 2% as the number of nodes is increased from 150 to 500. The accuracy of the detection method improves significantly as the node density increases.

6.5. Enhancement mechanism The final simulation considered the effectiveness of the enhancement mechanism in improving the performance of the detection system. Based on the analysis, reducing V’s communication range can improve false detection. Fig. 13 demonstrates that for a network with n ¼ 300; h ¼ 0:7jNj and jSj ¼ 20, the false detection rate decreases initially as the communication range is reduced, but then saturates at a constant value. It matches the analysis that most instances of false detection occur when the malicious node is located far from V. When the communication is reduced to certain level in the simulation, some normal nodes are excluded before the malicious node. Therefore, the false detection rate stop decreasing.

6.4. Number of Sybil nodes Fig. 12 illustrates the effect of the number of Sybil nodes on the detection performance of the approach for n ¼ 300
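The simulation setup of Section 6.1 and the two metrics defined at the start of Section 6 can be illustrated with a short sketch. It is not the authors' Dev-C++ simulator: the function names are hypothetical, the neighbor estimate assumes uniformly random placement and ignores border effects (so it slightly overestimates the true average), and the metric helpers simply restate the definitions as percentages over sets of node identifiers.

```python
import math


def expected_neighbors(n, comm_range=10.0, side=100.0):
    """Approximate mean neighbor count for n nodes placed uniformly
    at random in a side x side square (border effects ignored)."""
    coverage = math.pi * comm_range ** 2 / side ** 2
    return (n - 1) * coverage


def detection_rate(flagged, sybil_ids):
    """Percentage of Sybil identities that were flagged."""
    sybil_ids = set(sybil_ids)
    return 100.0 * len(set(flagged) & sybil_ids) / len(sybil_ids)


def false_detection_rate(flagged, normal_ids):
    """Percentage of normal nodes wrongly flagged as Sybil nodes."""
    normal_ids = set(normal_ids)
    return 100.0 * len(set(flagged) & normal_ids) / len(normal_ids)


if __name__ == "__main__":
    # Node counts spanning the range used in the density experiment.
    for n in (150, 300, 500):
        print(n, round(expected_neighbors(n), 2))
    # Toy example: two Sybil identities, four normal nodes.
    flagged = {"s1", "s2", "a"}
    print(detection_rate(flagged, {"s1", "s2"}))
    print(false_detection_rate(flagged, {"a", "b", "c", "d"}))
```

With the default range and field size, the estimate for $n = 300$ is roughly nine to ten neighbors per node, which gives a feel for the densities covered when the number of nodes is varied from 150 to 500.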

Fig. 12. Effect of number of Sybil nodes on detection performance for $n = 300$ and $\theta = 0.7|N|$. (Two panels: detecting rate (%) and error rate (%) versus number of Sybil nodes.)

Fig. 13. False detection rate of enhancement mechanism for $n = 300$, $\theta = 0.7|N|$ and $|S| = 20$.

Fig. 14. The topology in the experiment.

7. Experiment result

An experiment was also conducted to validate the detection mechanism. The network topology, illustrated in Fig. 14, was a cluster of 8 Tmote Sky sensor nodes [18]. The power level of each sensor node was set to 2, at which the stable communication range was 30 cm. Each cluster member was deployed at a random location less than 30 cm from the cluster head. The critical set contains nodes 1, 2, 5, and 6 since the threshold here is 10.5. As a consequence, nodes 7, 8, 9, ..., 14, and m are identified as Sybil nodes. Because signal propagation is irregular, some of the legitimate neighbor pairs are not symmetric, such as the pairs (1, 2) and (2, 7). The detection mechanism still functioned well in this environment.

8. Conclusion

A single malicious node can launch a Sybil attack by forging multiple identities (Sybil nodes) within a network with the aim of disrupting the network's key protocols, e.g., voting, data aggregation, and reputation verification. In a stationary network, it is generally reasonable to assume that each node has a different set of neighboring nodes, provided that the node density in the network is sufficiently high. In a Sybil attack, however, all of the Sybil nodes have the same set of neighbors since they are created by the same malicious node. This paper has shown that this characteristic of Sybil nodes can be exploited to detect the occurrence of a Sybil attack and to identify the Sybil nodes simply by comparing the neighboring information collected from the neighbors of the victimized node. In contrast to existing protection schemes, the approach requires no shared keys, secret information, or special hardware support. The feasibility of the approach has been proved analytically. Furthermore, the simulation results have shown that for the case where $n = 300$, $\theta = 0.7|N|$ and $|S| = 20$, 99.8% of the Sybil nodes can be correctly identified, with a false detection rate of 4%. The performance evaluation also indicates that the approach achieves better detection rates when more Sybil nodes are forged in an attack. Finally, the detection mechanism was able to detect the Sybil nodes in an experiment conducted using real wireless sensor nodes.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable suggestions, which improved this paper. This research was supported in part by Delta Electronics, Inc. and in part by the Taiwan National Science Council (NSC) under Contracts NSC 97-2918-I-006-009, 97-2628-E-006-093-MY3, 97-2221-E-006-176-MY3, and 97-2221-E-006-146-MY3.

Appendix A. Area of $A(x)$

As illustrated in Fig. A.1, $A(x)$ can be separated into four gray regions. Therefore, the area of $A(x)$ is given by

$$4\int_{x/2}^{r}\sqrt{r^2-x^2}\,dx.$$

Fig. A.1. Illustration of $A(x)$.

Then, let

$$x = r\cos\theta \;\Rightarrow\; dx = -r\sin\theta\,d\theta.$$

The area of $A(x)$ can thus be derived as

$$\begin{aligned}
4\int_{x/2}^{r}\sqrt{r^2-x^2}\,dx
&= 4\int_{\arccos\frac{x}{2r}}^{0}\sqrt{r^2-r^2\cos^2\theta}\,(-r\sin\theta)\,d\theta \\
&= -4\int_{\arccos\frac{x}{2r}}^{0} r^2\sin^2\theta\,d\theta \\
&= -4r^2\int_{\arccos\frac{x}{2r}}^{0}\frac{1-\cos 2\theta}{2}\,d\theta \\
&= -2r^2\left[\theta-\frac{1}{2}\sin 2\theta\right]_{\arccos\frac{x}{2r}}^{0} \\
&= -2r^2\left(\theta-\sin\theta\cos\theta\right)\Big|_{\arccos\frac{x}{2r}}^{0} \\
&= -2r^2\left\{0-\left[\arccos\frac{x}{2r}-\sin\left(\arccos\frac{x}{2r}\right)\cos\left(\arccos\frac{x}{2r}\right)\right]\right\} \\
&= 2r^2\left[\arccos\frac{x}{2r}-\sin\left(\arcsin\frac{\sqrt{4r^2-x^2}}{2r}\right)\cos\left(\arccos\frac{x}{2r}\right)\right]
\qquad \left(\because\ \arccos\frac{x}{2r}=\arcsin\frac{\sqrt{4r^2-x^2}}{2r}\right) \\
&= 2r^2\left[\arccos\frac{x}{2r}-\frac{\sqrt{4r^2-x^2}}{2r}\cdot\frac{x}{2r}\right] \\
&= 2r^2\arccos\frac{x}{2r}-\frac{x}{2}\sqrt{4r^2-x^2} \\
&= 2r^2\arccos\frac{x}{2r}-x\sqrt{r^2-\frac{x^2}{4}}. \qquad \text{(A.1)}
\end{aligned}$$

Appendix B. Expected value of $A(x)$

The expected value of $A(x)$ has the form:

$$\begin{aligned}
E[A(x)] &= \int_{0}^{r}\left[2r^2\arccos\frac{x}{2r}-x\sqrt{r^2-\frac{x^2}{4}}\right]\frac{2x}{r^2}\,dx \\
&= \int_{0}^{r}\left[4x\arccos\frac{x}{2r}-\frac{2x^2}{r^2}\sqrt{r^2-\frac{x^2}{4}}\right]dx \\
&= 4\int_{0}^{r}x\arccos\frac{x}{2r}\,dx-\frac{1}{r^2}\int_{0}^{r}x^2\sqrt{4r^2-x^2}\,dx. \qquad \text{(B.1)}
\end{aligned}$$

(B.1) can be separated into two parts. The first part is derived as follows:

$$4\int x\arccos\frac{x}{2r}\,dx
= 4\left[\left(\frac{x^2}{2}-\frac{4r^2}{4}\right)\arccos\frac{x}{2r}-\frac{x}{4}\sqrt{4r^2-x^2}\right]+C. \qquad \text{(B.2)}$$

The second part is given by

$$\frac{1}{r^2}\int x^2\sqrt{4r^2-x^2}\,dx.$$

Let

$$x = 2r\sin\theta \;\Rightarrow\; dx = 2r\cos\theta\,d\theta.$$

As a result, it can be shown that

$$\begin{aligned}
\frac{1}{r^2}\int x^2\sqrt{4r^2-x^2}\,dx
&= \frac{1}{r^2}\int 4r^2\sin^2\theta\,\sqrt{4r^2-4r^2\sin^2\theta}\cdot 2r\cos\theta\,d\theta \\
&= \frac{1}{r^2}\int 16r^4\sin^2\theta\cos^2\theta\,d\theta \\
&= 16r^2\int(\sin\theta\cos\theta)^2\,d\theta \\
&= 16r^2\int\left(\frac{1}{2}\sin 2\theta\right)^2 d\theta \\
&= 16r^2\int\frac{1}{4}\sin^2 2\theta\,d\theta \\
&= 4r^2\int\frac{1-\cos 4\theta}{2}\,d\theta \\
&= 2r^2\int(1-\cos 4\theta)\,d\theta \\
&= 2r^2\left(\theta-\frac{1}{4}\sin 4\theta\right) \\
&= 2r^2\theta-\frac{1}{2}r^2\sin 4\theta. \qquad \text{(B.3)}
\end{aligned}$$

Since

$$\sin 4\theta = 2\sin 2\theta\cos 2\theta = 2[2\sin\theta\cos\theta][1-2\sin^2\theta] = 4\sin\theta\cos\theta-8\sin^3\theta\cos\theta,$$

and

$$x = 2r\sin\theta \;\Rightarrow\; \sin\theta=\frac{x}{2r} \;\Rightarrow\; \theta=\arcsin\frac{x}{2r} \;\Rightarrow\; \cos\theta=\frac{\sqrt{4r^2-x^2}}{2r},$$

(B.3) can be expressed as

$$\begin{aligned}
\frac{1}{r^2}\int x^2\sqrt{4r^2-x^2}\,dx
&= 2r^2\theta-\frac{1}{2}r^2\sin 4\theta \\
&= r^2\left(2\theta-2\sin\theta\cos\theta+4\sin^3\theta\cos\theta\right) \\
&= r^2\left(2\arcsin\frac{x}{2r}-2\cdot\frac{x}{2r}\cdot\frac{\sqrt{4r^2-x^2}}{2r}+4\cdot\frac{x^3}{8r^3}\cdot\frac{\sqrt{4r^2-x^2}}{2r}\right) \\
&= 2r^2\arcsin\frac{x}{2r}-\frac{x\sqrt{4r^2-x^2}}{2}+\frac{x^3\sqrt{4r^2-x^2}}{4r^2}+C. \qquad \text{(B.4)}
\end{aligned}$$

From (B.2) and (B.4), the expected value of $A(x)$ given in (B.1) can be rewritten as

$$\begin{aligned}
E[A(x)] &= 4\int_{0}^{r}x\arccos\frac{x}{2r}\,dx-\frac{1}{r^2}\int_{0}^{r}x^2\sqrt{4r^2-x^2}\,dx \\
&= \left\{4\left[\left(\frac{x^2}{2}-\frac{4r^2}{4}\right)\arccos\frac{x}{2r}-\frac{x}{4}\sqrt{4r^2-x^2}\right]
-\left[2r^2\arcsin\frac{x}{2r}-\frac{x\sqrt{4r^2-x^2}}{2}+\frac{x^3\sqrt{4r^2-x^2}}{4r^2}\right]\right\}_{0}^{r} \\
&= \left[(2x^2-4r^2)\arccos\frac{x}{2r}-x\sqrt{4r^2-x^2}-2r^2\arcsin\frac{x}{2r}
+\frac{x\sqrt{4r^2-x^2}}{2}-\frac{x^3\sqrt{4r^2-x^2}}{4r^2}\right]_{0}^{r} \\
&= \left[(2r^2-4r^2)\arccos\frac{1}{2}-r\sqrt{4r^2-r^2}-2r^2\arcsin\frac{1}{2}
+\frac{r\sqrt{4r^2-r^2}}{2}-\frac{r^3\sqrt{4r^2-r^2}}{4r^2}\right]-(-4r^2)\cdot\frac{\pi}{2} \\
&= \left(\pi-\frac{3\sqrt{3}}{4}\right)r^2.
\end{aligned}$$
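The closed-form result of Appendix B can be checked numerically. The sketch below is an illustration rather than part of the original analysis: it takes the area formula (A.1) and the weight $2x/r^2$ on $[0, r]$ as read off from (B.1), integrates their product with a plain midpoint rule, and compares the result with $(\pi - 3\sqrt{3}/4)\,r^2$. The function names and the step count are arbitrary choices.

```python
import math


def lens_area(x, r):
    """Overlap area A(x) of two radius-r disks whose centers are a
    distance x apart, as in (A.1)."""
    return 2 * r ** 2 * math.acos(x / (2 * r)) - x * math.sqrt(r ** 2 - x ** 2 / 4)


def expected_lens_area(r, steps=100_000):
    """Midpoint-rule estimate of the integral in (B.1):
    integral over [0, r] of A(x) * 2x / r^2 dx."""
    h = r / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += lens_area(x, r) * (2 * x / r ** 2) * h
    return total


def closed_form(r):
    """Closed-form value derived at the end of Appendix B."""
    return (math.pi - 3 * math.sqrt(3) / 4) * r ** 2


if __name__ == "__main__":
    r = 10.0  # communication range used in the simulations, in meters
    print(expected_lens_area(r), closed_form(r))
```

The two printed values should agree to many decimal places, and the closed form works out to roughly $1.84\,r^2$, a little under 59% of the full disk area $\pi r^2$.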

References

[1] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, A survey on sensor networks, IEEE Communications Magazine 40 (8) (2002) 102–114.
[2] J.R. Douceur, The Sybil attack, in: Proceedings of the International Workshop on Peer-to-Peer Systems, March 2002, pp. 251–260.
[3] Z. Su, C. Lin, F. Ren, X. Zhan, Security mechanisms analysis of wireless sensor networks specific routing attacks, in: Proceedings of the International Symposium on Pervasive Computing and Applications, August 2006, pp. 579–584.
[4] J. Newsome, E. Shi, D. Song, A. Perrig, The Sybil attack in sensor networks: analysis and defenses, in: Proceedings of the International Symposium on Information Processing in Sensor Networks, April 2004, pp. 259–268.
[5] Q. Zhang, P. Wang, D.S. Reeves, P. Ning, Defending against Sybil attacks in sensor networks, in: Proceedings of IEEE International Conference on Distributed Computing Systems Workshops, June 2005, pp. 185–191.
[6] Y. Zhang, W. Liu, W. Lou, Y. Fang, Location-based compromise-tolerant security mechanisms for wireless sensor networks, IEEE Journal on Selected Areas in Communications 24 (2) (2006) 247–260.
[7] D. Liu, P. Ning, Establishing pairwise keys in distributed sensor networks, in: Proceedings of the ACM Conference on Computer and Communications Security, October 2003, pp. 52–61.
[8] M. Demirbas, Y. Song, An RSSI-based scheme for Sybil attack detection in wireless sensor networks, in: Proceedings of International Symposium on a World of Wireless, Mobile and Multimedia Networks, June 2006, pp. 564–570.
[9] M. Wen, H. Li, Y. Fei Zheng, K. Fei Chen, TDOA-based Sybil attack detection scheme for wireless sensor networks, Journal of Shanghai University 12 (1) (2008) 66–70.
[10] C. Karlof, D. Wagner, Secure routing in wireless sensor networks: attacks and countermeasures, in: Proceedings of the IEEE International Workshop on Sensor Network Protocols and Applications, May 2003, pp. 113–127.
[11] K. Xing, F. Liu, X. Cheng, D.H. Du, Real-time detection of clone attacks in wireless sensor networks, in: Proceedings of the International Conference on Distributed Computing Systems, June 2008, pp. 3–10.
[12] J. Wang, G. Yang, Y. Sun, S. Chen, Sybil attack detection based on RSSI for wireless sensor network, in: Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing, September 2007, pp. 2684–2687.
[13] R. Muraleedharan, X. Ye, L.A. Osadciw, Prediction of Sybil attack on WSN using Bayesian network and Swarm intelligence, in: Proceedings of Wireless Sensing and Processing, March 2008.
[14] D.-J. Huang, W.-C. Teng, C.-Y. Wang, H.-Y. Huang, J.M. Hellerstein, Clock skew based node identification in wireless sensor networks, in: Proceedings of the IEEE Global Telecommunications Conference, November 2008, pp. 1–5.
[15] S. Lin, J. Zhang, G. Zhou, L. Gu, T. He, J.A. Stankovic, ATPC: adaptive transmission power control for wireless sensor networks, in: Proceedings of the International Conference on Embedded Networked Sensor Systems, November 2006, pp. 223–236.
[16] C. Song, M. Liu, J. Cao, Y. Zheng, H. Gong, G. Chen, Maximizing network lifetime based on transmission range adjustment in wireless sensor networks, Special Issue of Computer Communications on Heterogeneous Networking for Quality, Reliability, Security, and Robustness 32 (11) (2009) 1316–1325.
[17] Bloodshed Dev-C++, 2008.
[18] Tmote Sky, 2008.
[19] S.K. Huang, K.F. Ssu, T.T. Wu, A fault-tolerant multipath routing protocol in wireless sensor networks, in: Proceedings of the International Computer Symposium, Dec. 2004, pp. 966–971.
[20] K.F. Ssu, C.H. Chou, H.C. Jiau, W.T. Hu, Detection and diagnosis for data inconsistency failures in wireless sensor networks, Computer Networks 50 (9) (2006) 1247–1260.
[21] K.F. Ssu, C.H. Chou, L.W. Cheng, Using overhearing technique to detect malicious packet-modifying attacks in wireless sensor networks, Computer Communications 30 (11-12) (2007) 2342–2352.

Kuo-Feng Ssu received the B.S. degree in Computer Science and Information Engineering from National Chiao Tung University and the Ph.D. degree in Computer Science from the University of Illinois, Urbana-Champaign. He is an associate professor in the Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan. He was a visiting associate professor in the School of Electrical and Computer Engineering, Cornell University, Ithaca, New York. His research interests include mobile computing, dependable systems, and distributed systems. Research awards include the Ta-You Wu Memorial Award, the K.T. Li Research Award, and the Lam Research Thesis Award. He is a member of the IEEE, the ACM, and the Phi Tau Phi Honor Scholastic Society.

Wei-Tong Wang received the B.S. degree in Computer Science and Information Engineering from National Chung Cheng University. He was a visiting scholar with the School of Electrical and Computer Engineering, Cornell University, Ithaca, New York. He is also a Ph.D. Candidate in the Department of Electrical Engineering, National Cheng Kung University. His research interests include mobile computing, and wireless sensor networks.

Wen-Chung Chang received the B.S. degree in Computer Science and Information Engineering from Tamkang University and the M.S. degree in Institute of Computer and Communication Engineering from National Cheng Kung University. His research interests include ad hoc and sensor networks.