A detection mechanism on malicious nodes in IoT




Computer Communications 151 (2020) 51–59

Contents lists available at ScienceDirect

Computer Communications journal homepage: www.elsevier.com/locate/comcom

A detection mechanism on malicious nodes in IoT

Bohan Li a,b,c, Renjun Ye a, Gao Gu a, Ruochen Liang a, Wei Liu d, Ken Cai e,∗

a College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
b Key Laboratory of Safety-Critical Software, Ministry of Industry and Information Technology, Nanjing 211106, China
c Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, China
d Nanjing Institute of technicians, Nanjing 210046, China
e College of Automation, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China

ARTICLE INFO

Keywords: Malicious nodes; Network topology; Online learning algorithm; IoT

ABSTRACT

The increasing scale of the Internet of Things (IoT) makes systems vulnerable to serious security threats, especially when malicious nodes launch attacks within these networks. Different malicious nodes launch different attacks, but most of them are based on tampering with, retransmitting, or discarding packets. An effective way to detect such attacks is to examine the messages received and sent by each node. However, gathering the messages of every node in the network is time-consuming and would consume the limited resources of the IoT. In this paper, we propose a novel method to detect malicious nodes based on an online learning algorithm. We first calculate the credibility of each path in the network from the collected packets, then model the resulting path reputations with the online learning algorithm, and finally calculate the trust of each node in the IoT environment and detect the malicious nodes with a clustering algorithm. To make the model perform well when the network scale is small, we additionally process the network topology on top of the general online learning detection algorithm and obtain an enhanced online learning detection algorithm. The experimental results show that the proposed methods detect malicious nodes with high accuracy and good stability.

1. Introduction

IoT networks usually consist of many portable devices (such as sensors and actuators) and powerful devices (such as control units); these devices are connected and exchange information everywhere. Information exchanged between distant, power-limited IoT devices generally passes through multiple nodes, forming a multi-hop mesh network. Multi-hop networks are vulnerable to multiple security threats [1], such as packet discard attacks, packet tamper attacks and packet replay attacks. Most of these attacks are controlled by attackers and executed by malicious nodes. Thus, it is necessary to detect malicious nodes in the IoT. Most nodes appear on multiple routing paths. Once a message passes through a malicious node during transmission, the malicious node may launch an attack. To avoid being detected by general detection algorithms, malicious nodes attack with uncertain probability. Some researchers [2] believe that whether a node is malicious can be determined from the trust of that node, and that the trust values of nodes in the IoT should be assessed from their communication, which requires collecting the messages received and sent by each node in the network. Although it is possible

to identify malicious nodes in this way, the node-based trust model needs to collect the messages passed by each node and then calculate the reputation value of each node. Collecting the data and calculating the results consume the limited resources of the IoT, so the model is not suitable for large-scale networks. To solve this problem, Xin Liu [3] proposed the Hard Detection (HD) algorithm, which uses the reputation of each path to calculate the trust of all nodes. This method simplifies message collection, processing and analysis for some nodes. However, it assumes that all nodes on any path have the same probability of attack, which results in low accuracy, a high false positive rate and a large amount of calculation at the sink. Smart malicious nodes in particular adjust their attacks dynamically and attack the network with uncertain probability. A static detection model cannot perform well when the attacker is dynamic, and once the network topology changes, the original model cannot be applied to the new network. The main requirements for a malicious node detection algorithm are efficiency, accuracy and real-time operation. Traditional detection algorithms cannot meet these needs, because they usually require the algorithm to be adjusted manually for each network environment to establish a trust evaluation model suitable

∗ Corresponding author. E-mail address: [email protected] (K. Cai).

https://doi.org/10.1016/j.comcom.2019.12.037 Received 8 September 2019; Received in revised form 11 December 2019; Accepted 20 December 2019 Available online 30 December 2019 0140-3664/© 2019 Elsevier B.V. All rights reserved.


for the current situation, or design a trust-based routing protocol to detect malicious nodes. Our goal is to design a countermeasure that adapts to dynamic environments, detects malicious nodes and minimizes false positives. Thus, we propose Truncated Gradient detection (TGD) and Online Gradient Descent detection (OGD). The main steps of these methods are as follows. First, we inject probe packets that are known to the sink, so that the packet corruption on each path can be calculated from the packets received at the sink, which yields the reputation of each path in the topology graph. Then, the online machine learning method is used to predict the trust of each node. Finally, the malicious nodes are identified by the 𝐾-means clustering algorithm. Compared with previous detection algorithms, the online detection algorithm converges better and can identify nodes faster when the data scale is massive.

In our work, we consider that an attacker can launch tamper, discard and replay attacks at the same time, or intentionally choose some of them, and that the probability of each attack is uncertain. Such a mixed attack can cause more serious damage to the network and is difficult to detect. To detect the malicious nodes that carry out these attacks, we inject probe packets into the IoT and collect the probe packets transmitted in the network. Since we design the probe packets ourselves, we can easily analyze the damage done to them. Based on this information, we use TGD and OGD to calculate the trust of all nodes and cluster the nodes into three groups: a benign group (𝐺𝑏), an unknown group (𝐺𝑢) and a malicious group (𝐺𝑚). Then we change the routes along which packets are transmitted and re-inject packets into the network to collect information about the influence of the nodes in 𝐺𝑢 on the network. We feed this information into TGD and OGD again to enhance their learning, obtain the final trust of the nodes, and cluster all nodes into two groups: the last benign group (𝐺𝑙𝑏) and the last malicious group (𝐺𝑙𝑚). 𝐺𝑙𝑚 and 𝐺𝑙𝑏 are our detection results.

The remainder of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, we discuss the preliminaries, including hard detection and some widely used online learning algorithms. In Section 4, we present the network model and the attack model. In Section 5, we present the detection methods based on online learning. In Section 6, we compare the performance of several online learning algorithms through simulation experiments. In Section 7, we give the conclusion.

2. Related work

Due to the complexity of the IoT environment and the diversity of attack behavior, our work focuses on the tamper, drop and replay attacks of malicious nodes because these attacks are typical and harmful. Machine learning algorithms can handle large-scale data, so online machine learning algorithms are commonly applied to cyber-security problems. For malicious node detection in the IoT, most solutions establish an appropriate trust evaluation model or design trust-based routing protocols. Josang A. [4] proposed a model for quantifying and reasoning about trust in IT equipment. This trust model is based on subjective logic and describes the posterior probability of binomial events by a beta distribution function. Based on this reasoning model, Gu Xiang and Qiu Jianlin [5] provided a trust model of sensor nodes. The model integrates direct and indirect trust values to identify bad nodes and malicious nodes. Because the reputation value calculated in this way is subjective and uncertain, the algorithm is exposed to malicious evaluation. Ssu K. [6] presented a model that uses reputation to minimize the influence of malfunctioning or malicious nodes. Reputation between nodes is based on the level of trust the nodes place in each other; a high confidence level means high trust. Data flowed between nodes that had a high confidence interim; as the level of confidence started to fall, the system directed the data flow to other nodes that had a high confidence interim, which helped the whole network optimize its transmission paths. In this network, every node collected direct and indirect experiences to build trust values. A node's experience record includes sensing data and forwarding or receiving data; nodes evaluated their neighbors by monitoring the communications they performed and recorded these communications as experiences. The model deals with the credibility of nodes, not the credibility of the data flowing through them, and it does not take the dissemination of trust values into account. Perrig A. [2] presented a hierarchical reputation model for Wireless Sensor Networks. The model first clusters all nodes and selects some nodes as cluster heads with the LEACH algorithm [7], so that the sensor network is divided into a base station, a cluster head layer and a common node layer, with the common nodes managed by the cluster heads. On this basis, node behavior attributes and network attacks are modeled to find malicious nodes. This model places high demands on the bandwidth and performance of the entire network topology. It consumes a lot of resources in the node election phase and is not practical.

Some research has focused on detecting malicious nodes with low energy consumption. Chuang Wang [8] proposed a simple yet effective scheme in which a sink counts the percentage of modified packets along each path and uses relative position information to identify malicious nodes. This path-based detection method is more efficient than node-based methods, yet the packets must be fully encrypted. Xin Liu [3] assumed that all nodes on any path have the same probability of attacking; once the reputation of each path is calculated at the sink, HD derives the trust of each node. This path-based algorithm simplifies the detection process, but in practice the probability of attack differs from node to node, so the detection rate of an algorithm built on the equal-probability assumption is low.

In most cases, malicious nodes need to be detected with incremental information [9]. The advantage of incremental information is that, although malicious nodes cannot be identified accurately at first, detection accuracy increases as more useful information arrives. This idea is similar to machine learning. Machine learning has been used to detect malicious attacks [10,11]. For common network attacks such as black hole attacks and selective forwarding attacks, these methods are superior to previous algorithms based on probabilistic models, but they do not identify the malicious nodes efficiently. S. Kaplantzis [10] used a support vector machine (SVM) to detect attacks but could not identify the malicious nodes that launched them. Nahiyan et al. [12] used K-means clustering to classify statistical data tuples as benign or malicious. The above work shows that machine learning can detect malicious attacks in networks and that 𝐾-means clustering is a reliable approach to classifying nodes. For different problems, however, the authors of these articles propose different machine learning algorithms. Machine learning can be divided into online learning and offline learning [13]. For offline learning, the whole training data set must be available at training time, and the model can be used for prediction only after training is completed. In contrast, online learning does not need the entire data set; it only needs to acquire data in real time to continuously update the detection model. Alfuqaha A. I. [13] prioritized data and minimized its latency by using online learning algorithms in vehicular ad-hoc networks. Ma J. and Saul L. K. [14] explored online learning approaches for detecting malicious Web sites (those involved in criminal scams) using lexical and host-based features of the associated URLs, achieving classification accuracy of up to 99% over a balanced data set. Therefore, we take a similar approach and apply online learning algorithms to malicious node detection in the IoT. Xiao, B [15] used a sliding window to count the weighted occurrence number of the rotation invariant uniform LBP pattern pairs to obtain the spatial contextual information


for further detection. Ref. [16] proposed a pixel-wise convolutional neural network that can recognize the focused and defocused information from the neighborhood for multi-focus fusion. The models of these offline learning algorithms are not updated continuously. In contrast, online learning algorithms process data sequentially. They produce a model and put it into operation without having the complete training data set available at the beginning, and the model is continuously updated during operation as more training data arrives. Because the detection algorithm is deployed on the sink, it continuously receives data, and an online learning algorithm can make better use of such streaming data. Therefore, we put forward new approaches to detect malicious nodes, namely TG detection (TGD) and OGD detection (OGD). Each can be realized in two ways: original online learning detection and enhanced online learning detection. We thus obtain a series of detection algorithms: original TG detection (OTGD), enhanced TG detection (ETGD), original OGD detection (OOGD) and enhanced OGD detection (EOGD).

Fig. 1. The topology of the IoT.

must be solved during training. Ref. [17] proposed using 𝐿1-regularization to achieve sparsification; this method simplifies the model while preserving the most important information in the data set. Truncated Gradient (TG) [18] uses a similar approach to obtain a sparse solution: it sets a truncation value to control the size of the coefficients. Online learning algorithms usually represent each feature weight as a floating-point number, and a gradient update has the form 𝑥 + 𝑦, where 𝑥 and 𝑦 are two floats, so a plain gradient update is not expected to produce sparsity. By setting a threshold and comparing each coefficient with it, TG drives small coefficients to zero and thereby yields a sparse solution.
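The truncation idea can be illustrated with a small sketch. The operator below follows the usual form of the truncated-gradient update (a coefficient whose magnitude is at most θ is shrunk toward zero by a fixed amount and clipped at zero); the function names, the squared loss and the parameter values are illustrative assumptions, not settings taken from this paper.

```python
def truncate(v, alpha, theta):
    """Shrink coefficient v toward zero by alpha if |v| <= theta, else keep it."""
    if 0.0 <= v <= theta:
        return max(0.0, v - alpha)
    if -theta <= v < 0.0:
        return min(0.0, v + alpha)
    return v


def tg_step(w, x, y, eta=0.1, g=0.05, theta=1.0):
    """One truncated-gradient step for a linear model with squared loss.

    eta is the learning rate, g the (assumed) gravity parameter and theta
    the truncation threshold; truncation is applied after every update.
    """
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y  # derivative of 0.5 * (pred - y)**2 with respect to pred
    stepped = [wi - eta * err * xi for wi, xi in zip(w, x)]
    return [truncate(wi, eta * g, theta) for wi in stepped]


# A weight that receives no gradient stays at exactly zero (sparse).
w = tg_step([0.0, 0.0], [1.0, 0.0], 1.0)
```

Coefficients larger than θ in magnitude are left untouched, so only weak features are pruned; choosing θ and g is the tuning burden that the text attributes to TG.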

3. Preliminaries

In this section, we introduce the principle and deficiency of the HD algorithm, as well as two practical online learning algorithms, Truncated Gradient (TG) and Online Gradient Descent (OGD).

3.1. Hard detection

Calculating the trust value of each node at the sink would consume the limited resources of the IoT. HD instead derives the trust of each node from the reputation of the paths it lies on, which simplifies the calculation of node trust values. The algorithm assumes that each node contributes equally to the reputation of a path; therefore, according to the probability distribution principle, the contribution of each node on a path to the path reputation can be quantified. The main steps are as follows:

3.3. Online gradient descent

Selecting the threshold in the Truncated Gradient algorithm mentioned above requires a certain amount of experience and skill. Online Gradient Descent (OGD) [19] is an algorithm that does not need any value to be set in advance to obtain a sparse solution. Because online learning algorithms are real-time and stream-based, it is impossible to use all the samples in each training step. Online learning can be divided into batch mode and delta mode: batch mode uses all of the data to update the model, whereas delta mode uses part of the data, and stochastic gradient descent is a kind of delta mode. We therefore use stochastic gradient descent to update the model with only one observed sample at a time. Logistic regression trained in this way has been widely used in industry because of its simplicity and natural parallelism, and it can achieve excellent results when combined with complex feature engineering.
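As a minimal sketch of the one-sample-at-a-time update described above, the following code applies stochastic gradient descent to logistic regression. The learning rate, the feature vectors and the toy data stream are assumptions chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ogd_update(w, x, y, eta=0.5):
    """One online gradient step for logistic regression; y is 0 or 1."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    # Gradient of the log loss with respect to w is (p - y) * x.
    return [wi - eta * (p - y) * xi for wi, xi in zip(w, x)]


# Hypothetical stream: each sample is seen once, in arrival order.
stream = [([1.0, 2.0], 1), ([1.0, -2.0], 0)] * 50
w = [0.0, 0.0]
for x, y in stream:
    w = ogd_update(w, x, y)
```

The model is usable after every single update, which is what allows it to run on the sink while packets are still arriving.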

1. Calculate the reputation of each path at the sink by analyzing the loss of probe messages;
2. Calculate the node trust according to the node distribution of each path and the reputation value of this path;
3. Cluster malicious nodes using the K-means algorithm based on node trust.

Take Fig. 1 as an example. There are multiple paths in the diagram, where 𝑝𝑎𝑡ℎ1 = ⟨𝑁1, 𝑁4, 𝑁7, 𝑁6, 𝑠𝑖𝑛𝑘⟩, 𝑝𝑎𝑡ℎ2 = ⟨𝑁1, 𝑁4, 𝑁6, 𝑠𝑖𝑛𝑘⟩ and 𝑝𝑎𝑡ℎ3 = ⟨𝑁2, 𝑁4, 𝑁7, 𝑁6, 𝑠𝑖𝑛𝑘⟩. Assume the reputation of 𝑝𝑎𝑡ℎ2 is 0.73 and the reputation of 𝑝𝑎𝑡ℎ3 is 0.41 (the reputation of the jth path is calculated by $N_{sj}/R_{sj}$ [1]). We use 𝑇𝑖𝑗 to denote the ith node's trust in the jth path. Hence, the trust of each node in 𝑝𝑎𝑡ℎ2 is

$$T_{1,4,6} = \sqrt[3]{\frac{N_{s2}}{R_{s2}}} = 0.9,$$

and the trust of each node in 𝑝𝑎𝑡ℎ3 is

$$T_{1,4,6,7} = \sqrt[4]{\frac{N_{s3}}{R_{s3}}} = 0.8.$$

Then we take the average over the correlated paths as the final trust of 𝑁7, which is

$$T_7 = \left(\sqrt[3]{\frac{N_{s2}}{R_{s2}}} + \sqrt[4]{\frac{N_{s3}}{R_{s3}}}\right)/2 = (0.9 + 0.8)/2 = 0.85.$$

In the same way, we can calculate the trust of each node in the network. The hard detection algorithm summarizes and improves on several earlier malicious node detection algorithms and ensures a good detection rate without causing high energy consumption.

4. Network model and attack model

In this section, we first describe the topology of the entire network and then model a variety of network attack behaviors. We have given a comprehensive review of distributed attacks in the IoT, including blockchain technologies, in [20]. We consider the packet discard attack, the packet tamper attack and the packet replay attack on a node. Finally, we formalize these three attack behaviors and derive a common attack model.
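The hard detection calculation of Section 3.1 can be sketched in a few lines. This is an illustration under the equal-contribution assumption stated there, not the reference implementation of HD; the function name and the node labels are hypothetical, and the sink is left out of the node lists because the worked example takes the third and fourth roots.

```python
from collections import defaultdict


def hard_detection(paths, reputations):
    """Average, over all paths through a node, the equal share rep ** (1 / len)."""
    per_node = defaultdict(list)
    for nodes, rep in zip(paths, reputations):
        share = rep ** (1.0 / len(nodes))  # every node contributes equally
        for n in nodes:
            per_node[n].append(share)
    return {n: sum(v) / len(v) for n, v in per_node.items()}


# path2 and path3 of the example, with reputations 0.73 and 0.41.
paths = [["N1", "N4", "N6"], ["N2", "N4", "N7", "N6"]]
trust = hard_detection(paths, [0.73, 0.41])
```

With these two paths, the nodes that lie on both of them (N4 and N6) receive roughly (0.9 + 0.8)/2 = 0.85, which matches the averaging step of the worked example.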

4.1. Network model

The IoT consists of a set 𝑉 = {𝑉1, 𝑉2, 𝑉3 … 𝑉𝑛} of n ordinary nodes and a single sink node, and each sensor node is randomly deployed. As a gateway, the sink is mainly responsible for receiving the messages transmitted in the network and analyzing the data. The remaining nodes either collect message data or transport message data. Some sensor nodes can be captured by attackers to launch attacks; we call these malicious nodes. A malicious sensor node may tamper with or drop the packets it receives from other sensor nodes. Paths must be built between nodes to pass data out; therefore, during the route

3.2. Truncated gradient

The hard detection algorithm mentioned above simplifies the calculation process, but the model cannot adapt to a dynamic environment, so we need a dynamic model that can be iterated continuously. An online machine learning algorithm meets this requirement because it continually adjusts itself based on the received data. However, the problem of overfitting


establishment phase, the common border nodes periodically send path detection messages to the sink along each path; this type of broadcast message is typically implemented with common IoT protocols. The sink checks the legality of these messages with a hash function, then calculates the reputation of each path and the contribution of each node.

4.2. Attack model

We define 𝛼𝑖 as a binary flag that indicates whether 𝑁𝑖 is malicious. In our attack model, 𝛼𝑖 = 1 means that 𝑁𝑖 discards or tampers with received packets with a fixed probability 𝑝𝑖, that is,

$$\alpha_i = \begin{cases} 1, & \text{if node } N_i \text{ is malicious},\ 0 < p_i < 1 \\ 0, & \text{if node } N_i \text{ is benign},\ p_i = 0 \end{cases}$$

The probability with which each node launches an attack is fixed. We denote the probability that a malicious node initiates a tamper message attack as 𝑃𝑇𝐴, a discard message attack as 𝑃𝐷𝐴, and a replay message attack as 𝑃𝑅𝐴. For a benign node, 𝑃𝑇𝐴, 𝑃𝐷𝐴 and 𝑃𝑅𝐴 are all zero; for a malicious node they are not all zero. A malicious node may launch multiple attacks, and these attacks are independent of each other, so 𝑃𝑇𝐴, 𝑃𝐷𝐴 and 𝑃𝑅𝐴 do not interact. Here, a node is

$$N_i = \begin{cases} P_{TA} = 0 \ \&\ P_{DA} = 0 \ \&\ P_{RA} = 0, & \alpha_i = 0 \\ P_{TA} \neq 0 \mid P_{DA} \neq 0 \mid P_{RA} \neq 0, & \alpha_i = 1 \end{cases}$$

Tamper, discard and replay attacks are all launched by malicious nodes operating on transmitted messages. If we can acquire the messages before they pass through a node and compare them with the messages the node forwards, we can determine the type of attack from the changes to the data packets. As shown in Fig. 2, every node in the network receives and sends messages, and the number of processed messages depends on the location of the node. The Send-set is the collection of all packets sent by a node, 𝑆𝑆 = {𝑝𝑎𝑐𝑘1, 𝑝𝑎𝑐𝑘2, 𝑝𝑎𝑐𝑘3 … 𝑝𝑎𝑐𝑘𝑛}. The Receive-set is the collection of all packets received by a node, 𝑅𝑆 = {𝑝𝑎𝑐𝑘1, 𝑝𝑎𝑐𝑘2, 𝑝𝑎𝑐𝑘3 … 𝑝𝑎𝑐𝑘𝑛}. During message transmission, the probe messages sent by the boundary node are known, so once we know the nodes on a message's transmission path, we can obtain the reputation value of that path. Because the attack probabilities of the nodes do not affect each other, the reputation of a path is the product of the reputation values of all nodes on the path.

According to our topology, as defined above, 𝑝𝑎𝑡ℎ1 = ⟨𝑁1, 𝑁4, 𝑁7, 𝑁6, 𝑠𝑖𝑛𝑘⟩. As shown in Fig. 3, if a packet A arrives at the sink through 𝑝𝑎𝑡ℎ1, it has been delivered through ⟨𝑁1, 𝑁4, 𝑁7, 𝑁6, 𝑠𝑖𝑛𝑘⟩. Each node on this path may tamper with, discard or replay messages, or even launch several of these attacks. We use 𝑁.𝑃𝑇𝐴, 𝑁.𝑃𝐷𝐴 and 𝑁.𝑃𝑅𝐴 to denote the probabilities that the current node tampers with, discards and replays messages, respectively:

$$pack_i \xrightarrow{N} \begin{cases} \{pack'_i\}\ (pack_i \neq pack'_i) & \text{with probability } N.P_{TA} \\ \{\dots\} & \text{with probability } N.P_{DA} \\ \{pack_i, pack_{icp1}, pack_{icp2}\} & \text{with probability } N.P_{RA} \\ \{pack_i\} & \text{with probability } 1 - N.P_{TA} - N.P_{DA} - N.P_{RA} \end{cases}$$

Here {𝑝𝑎𝑐𝑘𝑖} is the packet received by a malicious node N, 𝑝𝑎𝑐𝑘𝑖 denotes a packet in the Receive-set and 𝑝𝑎𝑐𝑘′𝑖 denotes a packet in the Send-set. For example, if 𝑅𝑆 = {𝑝𝑎𝑐𝑘1, 𝑝𝑎𝑐𝑘2, 𝑝𝑎𝑐𝑘3, 𝑝𝑎𝑐𝑘4, 𝑝𝑎𝑐𝑘5}, then after passing through the malicious node N, 𝑆𝑆 = {𝑝𝑎𝑐𝑘′1, 𝑝𝑎𝑐𝑘′2, 𝑝𝑎𝑐𝑘2𝑐𝑝, 𝑝𝑎𝑐𝑘′4}:

$$R_S\{pack_1, pack_2, \dots, pack_n\} \xrightarrow{N} S_S\{pack'_1, pack'_2, \dots, pack'_4\}$$

We also define the set of tampered packets as 𝑇′𝑆:

$$T'_S = \{pack'_i, pack_j \mid pack'_i \text{ is tampered},\ pack_i, pack_j \in R_S,\ pack'_i \in S_S\}$$

In the same way, we define the set of replayed packets as 𝑅′𝑆 and the set of discarded packets as 𝐷′𝑆:

$$R'_S = \{pack_{icp}, pack_j \mid pack_{icp} \text{ is replayed},\ pack_i, pack_j \in R_S,\ pack_{icp} \in S_S\}$$

$$D'_S = \{pack_i \mid pack_i \text{ is discarded},\ pack_i \in R_S \text{ but } pack_i \notin S_S\}$$

We can also use 𝑇′𝑆 and 𝑅′𝑆 to compute 𝐷′𝑆, that is, 𝐷′𝑆 = 𝑅𝑆 − 𝑆𝑆 − (𝑅′𝑆 ∩ 𝑇′𝑆). The set of packets that are not attacked by a malicious node is 𝑁𝑆 = 𝑅𝑆 ∩ 𝑆𝑆, and 𝑆𝑆 = 𝑇′𝑆 + 𝑅′𝑆 + 𝑁𝑆. For a single node N,

$$1 - N.P_{TA} - N.P_{RA} - N.P_{DA} = \frac{N_S}{R_S}$$

According to our network model, for a multi-hop path 𝑃𝑎𝑡ℎ1 = ⟨𝑁1, 𝑁4, 𝑁5, 𝑁7, 𝑁6, 𝑠𝑖𝑛𝑘⟩, the 𝑅𝑆 of 𝑁1 and the 𝑆𝑆 of 𝑁𝑛 are known, so

$$\prod_{i=1}^{n} \left(1 - N_i.P_{TA} - N_i.P_{RA} - N_i.P_{DA}\right) = (1 - N_1.P_{TA} - N_1.P_{RA} - N_1.P_{DA}) * \cdots * (1 - N_n.P_{TA} - N_n.P_{RA} - N_n.P_{DA}) = \frac{N_S}{R_S}$$

The value 𝑁𝑆/𝑅𝑆 represents the path reputation. We only need to know the distribution of nodes on this path, build a suitable algorithm model on top of the network model and attack model, and then design a node-trust model to describe the trust of each node; with these we can detect and identify malicious nodes. The above models consider the attacks in the network individually, but in fact the attacks may happen at the same time with different probabilities. When the attacks occur alone, we can analyze them through probabilities with our model. However, when a mixture of attacks occurs, the interaction between different malicious nodes means that we have to consider the impact the attack behaviors have on each other.

5. Detection based on online learning

The TGD and OGD methods can be applied to our proposed network model, yet how best to adapt them to the network model and obtain good results remains a problem. Since our goal is to use the online learning algorithm to obtain the trust of each node, we can perform secondary calculations on uncertain nodes. We formalize the online learning method by constructing a detection equation set. Then, we train the model and use the K-means method to cluster the nodes into the malicious node group (𝐺𝑚) and the benign node group (𝐺𝑏).

5.1. Formalization of online learning methods

We define the trust of a node N as N.T and the trust of a multi-hop path 𝑃1 as 𝑃1.T, where

$$N.T = 1 - N.P_{TA} - N.P_{DA} - N.P_{RA}$$

$$P_1.T = \prod_{i=1}^{n} N_i.T = N_1.T * N_2.T * N_3.T * N_4.T * \cdots * N_n.T = \frac{|N_S|}{|R_S|}$$

Here 𝑁𝑆 denotes the packets that are not attacked by a malicious node and 𝑅𝑆 denotes the packets received at the sink. According to these basic equations, we can construct a detection equation for this path. This process is repeated at the sink until the detection equations for all paths are obtained. In theory, if the detection equations are exact, the trust of each node can be calculated accurately, but we cannot guarantee that the detection equations are completely correct.
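The relation between node trust and path trust can be checked numerically. The sketch below simulates probe packets crossing a path whose nodes attack independently with probabilities P_TA, P_DA and P_RA, and compares the fraction of unattacked packets with the product of the node trusts. The dictionary layout, the attack probabilities and the packet count are assumptions made for illustration, and the three attack types are treated as mutually exclusive outcomes, as in the piecewise model of Section 4.2.

```python
import random


def node_trust(node):
    """N.T = 1 - N.P_TA - N.P_DA - N.P_RA."""
    return 1.0 - node["P_TA"] - node["P_DA"] - node["P_RA"]


def path_trust(path):
    """Product of the trusts of all nodes on the path."""
    t = 1.0
    for node in path:
        t *= node_trust(node)
    return t


def passes_intact(node, rng):
    """True if the node forwards the packet without tampering, dropping or replaying."""
    return rng.random() >= node["P_TA"] + node["P_DA"] + node["P_RA"]


def simulate(path, n_packets, seed=0):
    """Empirical |N_S| / |R_S| for n_packets probe packets sent along path."""
    rng = random.Random(seed)
    intact = sum(all(passes_intact(n, rng) for n in path) for _ in range(n_packets))
    return intact / n_packets


benign = {"P_TA": 0.0, "P_DA": 0.0, "P_RA": 0.0}
malicious = {"P_TA": 0.1, "P_DA": 0.1, "P_RA": 0.1}  # hypothetical attacker
path = [benign, malicious, benign, malicious]
expected = path_trust(path)
observed = simulate(path, 20000)
```

Because the sink only observes the ratio for whole paths, it cannot read off which node lowered it; that is why equations from several overlapping paths are combined in what follows.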


Fig. 2. The Send-set model and Receive-set model.

Fig. 3. Injected packets transport in IoT.

Table 1
Simulation environment.

Item          Description
CPU           Intel Core i5-4200U 2.4 GHz, 4Core (8 Threads)
Memory        Kingston DDR3L 4GB*2
OS            Windows 10 Professional 1709
Scikit-learn  0.20.3

Therefore, we can only improve the accuracy of the results through the online learning model. The IoT network includes a sink (D) and other nodes denoted 𝑁𝑖 (i = 0∼n, where n is the total count of nodes). Suppose we inject packets at 𝑁𝑖 (assuming 𝑁𝑖 and D are benign) and there exist m paths from 𝑁𝑖 to D:

𝑃𝑎𝑡ℎ1 = ⟨𝑁𝑖1, 𝑁𝑖2, 𝑁𝑖3 … 𝐷⟩
𝑃𝑎𝑡ℎ2 = ⟨𝑁𝑖1, 𝑁𝑖2, 𝑁𝑖3 … 𝐷⟩
𝑃𝑎𝑡ℎ𝑚 = ⟨𝑁𝑖1, 𝑁𝑖2, 𝑁𝑖3 … 𝐷⟩

The detection equation of 𝑃1 is known, so the detection equation of 𝑝𝑎𝑡ℎ𝑚 is

$$path_m.T = \frac{|N_S|}{|R_S|} = \prod_{i=1}^{c} N_{mi}.T = N_i.T * N_{m1}.T * N_{m2}.T * \cdots * N_{mn}.T * D.T$$

Assuming the sink and the node that injects the packets are benign, the equation simplifies to 𝑝𝑎𝑡ℎ𝑚.𝑇 = 𝑁𝑖.𝑇 ∗ 𝑁𝑚1.𝑇 ∗ 𝑁𝑚2.𝑇 ∗ ⋯ ∗ 𝑁𝑚𝑛.𝑇. For the convenience of calculation, we take logarithms, which turns the product into a sum:

$$\ln(path_m.T) = \ln\left(\frac{|N_S|}{|R_S|}\right) = \sum_{i=1}^{n} \ln(N_{mi}.T) = \ln(N_{m1}.T) + \ln(N_{m2}.T) + \ln(N_{m3}.T) + \cdots + \ln(N_{mn}.T)$$

Finally, we collect the equations of all paths into the detection equation set, from which we obtain the PTM.

5.2. Training model and cluster nodes

After constructing the detection equation set, we use the collected data to train the online learning model. First, we construct the matrix NTM (Node Trust Matrix):

$$NTM = (\ln(N_1.T), \ln(N_2.T), \ln(N_3.T), \dots, \ln(N_n.T))$$

Then, we construct the matrix EM (Exist Matrix):

$$EM = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1m} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2m} \\ \dots & \dots & \dots & \dots & \dots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nm} \end{bmatrix}$$

This matrix describes the association between nodes and paths in the network. The row index represents the serial number of the node, and the column index represents the sequence number of the path. If 𝑎24 is zero, the fourth path does not pass through the second node. Finally, we can construct an equation based on PTM, EM and NTM:

$$NTM * EM = PTM$$

EM and PTM are known, so we aim to ensure the accuracy of NTM. Since the input data cannot be guaranteed to be completely linearly separable, logistic regression is used to process the data. The online learning models TGD and OGD that we propose do well in these respects: as data is fed in, the accuracy of NTM gradually improves. Once the NTM is obtained, the next task is to identify malicious nodes according to their trust. Common methods include threshold comparison and horizontal comparison. Threshold comparison compares the trust of a node with a threshold set in advance: a node is considered benign if its trust is higher than the threshold and malicious if its trust is lower. Horizontal comparison uses the node reputation value as the main feature and judges whether the node is benign with the assistance of other features. Because collecting other features would consume limited resources in the IoT environment, we use the threshold comparison method.

5.3. General online learning detection

From the above analysis, we can summarize the steps of the online learning detection method:

1. Inject packets at a node (which we assume to be benign) in the IoT network;
2. Construct the detection equation set based on the result at the sink;
3. Train the online learning models TG and OGD with the detection equation set;
4. Collect the parameters of the TG and OGD models;
5. Calculate the trust of all nodes;
6. Use K-means clustering to cluster all nodes into two groups: the benign group (𝐺𝑏) and the malicious group (𝐺𝑚).
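Steps 2 to 5 of the general online learning detection procedure can be sketched as follows. Taking logarithms makes each path equation linear in ln(N_i.T), so the node trusts can be estimated by online gradient descent on a squared loss. This is a simplified illustration, not the TG/OGD configuration used in the experiments: the function name, the squared loss, the learning rate, the clipping of ln(trust) at zero and the toy topology are all assumptions.

```python
import math


def train_node_trust(paths, path_reputations, n_nodes, eta=0.2, epochs=200):
    """Estimate node trust from path reputations by online gradient descent.

    paths[j] lists the node indices on path j (the non-zero entries of one
    column of EM); path_reputations[j] is |N_S| / |R_S| for that path and
    must be > 0 so that its logarithm exists.
    """
    w = [0.0] * n_nodes  # w[i] approximates ln(N_i.T); 0 means full trust
    for _ in range(epochs):
        for nodes, rep in zip(paths, path_reputations):
            err = sum(w[i] for i in nodes) - math.log(rep)
            for i in nodes:
                # Gradient step on 0.5 * err**2, then keep ln(trust) <= 0.
                w[i] = min(0.0, w[i] - eta * err)
    return [math.exp(wi) for wi in w]


# Hypothetical 4-node network in which every pair of nodes shares one path.
true_trust = [1.0, 0.7, 1.0, 0.9]
paths = [[0, 1], [1, 2], [0, 2], [2, 3], [0, 3], [1, 3]]
reps = [true_trust[a] * true_trust[b] for a, b in paths]
estimated = train_node_trust(paths, reps, n_nodes=4)
```

In this toy case the paths overlap enough for the trusts to be identifiable; with fewer overlapping paths several trust assignments fit the same equations, which is one reason additional routes are probed in the enhanced variant.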

Our detection is based on the trust of nodes, so it is important that the calculated trust is highly accurate. The general online learning detection algorithm is a practical strategy, but malicious nodes always affect the other nodes on the same path, and benign nodes raise the trust of malicious nodes on the same path.

5.4. Enhanced online learning detection

Although we cannot directly change the accuracy of the proposed model, we can improve the detection rate in other ways. The method above divides the nodes directly into benign and malicious nodes according to their trust; nodes whose trust lies between the malicious and benign ranges are easily misclassified. We can use three clusters instead of two, that is, divide all nodes into the malicious node group (𝐺𝑚), the unknown node group (𝐺𝑢) and the benign node group (𝐺𝑏). Then we change the routes along which packets are transmitted, inject more packets and collect information about the influence of the nodes in 𝐺𝑢 on the network. The results are used to retrain the model, and finally all nodes are clustered into two groups: the last benign group (𝐺𝑙𝑏) and the last malicious group (𝐺𝑙𝑚). The steps of enhanced online learning detection are therefore as follows:

6. Experiment

In our experiment, we combine the two online learning algorithms, TG and OGD, with the two detection methods (general and enhanced) to obtain four detection algorithms: general TG detection, enhanced TG detection, general OGD detection, and enhanced OGD detection. To show the advantages of online learning algorithms, we also include a traditional detection method in the simulation. Hard Detection (HD) is a typical such algorithm, so the simulation experiments are carried out with HD and the four proposed algorithms. To analyze the performance of the algorithms thoroughly, we vary several key factors, namely the diversity of the network, the proportion of malicious nodes, the number of nodes, and the number of packets received at the sink, and we identify the best detection algorithm by comparing the simulation results. To evaluate these methods, we consider not only the accuracy (number of correctly identified malicious nodes / number of all malicious nodes) but also the false alarm rate ((number of benign nodes falsely identified as malicious + number of malicious nodes not identified) / number of all nodes).
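The two evaluation metrics defined above can be computed as in the following sketch. The function name `detection_metrics` and its arguments are hypothetical; the formulas follow the definitions given in the text, including the fact that the false alarm rate counts both false positives and missed malicious nodes.

```python
def detection_metrics(true_malicious, detected_malicious, n_nodes):
    """Return (accuracy, false_alarm_rate) as defined in the text."""
    true_malicious = set(true_malicious)
    detected_malicious = set(detected_malicious)
    hits = len(true_malicious & detected_malicious)              # malicious and detected
    false_positives = len(detected_malicious - true_malicious)   # benign flagged as malicious
    misses = len(true_malicious - detected_malicious)            # malicious not detected
    accuracy = hits / len(true_malicious) if true_malicious else 0.0
    false_alarm_rate = (false_positives + misses) / n_nodes
    return accuracy, false_alarm_rate

# Toy example with 15 nodes: one malicious node is missed, one benign node is flagged.
accuracy, false_alarm_rate = detection_metrics({1, 4, 7, 9}, {1, 4, 7, 12}, n_nodes=15)
print(accuracy, false_alarm_rate)
```

In this toy case three of the four malicious nodes are found, and two of the fifteen nodes are classified incorrectly.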

1. Inject packets into a node of the IoT network (we assume this node is benign);
2. Construct the detection equation set based on the results at the sink;
3. Train the online learning models TG and OGD with the detection equation set;
4. Collect the parameters of the TG and OGD models;
5. Calculate the trust of all nodes;
6. Use K-means clustering to divide all nodes into three groups: the benign group (𝐺𝑏), the unknown group (𝐺𝑢), and the malicious group (𝐺𝑚);
7. Adjust the routes of the nodes in 𝐺𝑢, then inject new packets that pass through these nodes;
8. Analyze the results at the sink again and construct a new detection equation set (Enhanced Detection Equation Set, EDES);
9. Use the EDES to train the model incrementally;
10. Collect the parameters of the model and calculate the trust of all nodes;
11. Use K-means clustering to divide all nodes into two groups: the last benign group (𝐺𝑙𝑏) and the last malicious group (𝐺𝑙𝑚).

𝐺𝑙𝑏 and 𝐺𝑙𝑚 are our detection results.
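Steps 6 and 7 of the list above can be sketched with scikit-learn's `KMeans`. This is an illustration under stated assumptions rather than the implementation used in the experiments: the helper names `three_way_groups` and `update_path_set` are hypothetical, clusters are ranked by their centers (lowest trust treated as malicious, highest as benign), and rerouting is reduced to keeping paths that avoid every node in 𝐺𝑚 while passing through at least one node in 𝐺𝑢.

```python
import numpy as np
from sklearn.cluster import KMeans

def three_way_groups(trust, seed=0):
    """Split node indices into (benign, unknown, malicious) by k-means on trust."""
    values = np.asarray(trust, dtype=float).reshape(-1, 1)   # KMeans expects 2-D input
    km = KMeans(n_clusters=3, n_init=10, random_state=seed).fit(values)
    # Rank clusters by center: lowest trust first, highest trust last.
    order = np.argsort(km.cluster_centers_.ravel())
    malicious, unknown, benign = (
        {int(i) for i in np.flatnonzero(km.labels_ == c)} for c in order
    )
    return benign, unknown, malicious

def update_path_set(paths, malicious, unknown):
    """Keep paths with no malicious node that probe at least one unknown node."""
    return [p for p in paths
            if not (set(p) & malicious) and (set(p) & unknown)]

# Toy example: six nodes, two of them with ambiguous trust.
example_trust = [0.95, 0.90, 0.52, 0.15, 0.48, 0.10]
benign, unknown, malicious = three_way_groups(example_trust)
original_paths = [(0, 2, 1), (0, 3, 1), (1, 4), (0, 1)]
probe_paths = update_path_set(original_paths, malicious, unknown)
print(benign, unknown, malicious, probe_paths)
```

Packets injected along the retained paths would then yield the new detection equations used for incremental retraining in steps 8 and 9.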

One of the most important steps above is adjusting the routes of the whole network; we summarize it in Algorithm 3 (Path Set Update). We record the original path set 𝐿0 and the original node set 𝐺0. The reason for updating the paths is to eliminate the impact of malicious nodes on other nodes, so we need to exclude every path that contains malicious nodes from the path set 𝐿0.

6.1. Settings

Our experiment is based on a simulation of a multi-hop IoT network and runs on a regular computer. The simulation and all algorithms are implemented in Python. We use scikit-learn, a machine learning library, to cluster nodes by their trust with the 𝑘-means method. The simulated area is a 100 × 100 m² rectangle, and the communication radius of every node is 10 m. All sensor nodes are randomly distributed in the simulated network, so we can adjust parameters such as the number of nodes and the complexity of the paths according to the experimental needs. The experimental environment is listed in Table 1.

6.2. Effect of the diversity of network

To examine the impact of network diversity on general TG detection, enhanced TG detection, general OGD detection, enhanced OGD detection, and hard detection, we adjust the utilization of the paths to change the diversity of the network in the simulation. The other factors that may affect the detection rate, such as the number of nodes and the proportion of malicious nodes, are kept at typical values: the number of nodes is 15, the proportion of malicious nodes is 0.3, and the number of packets received at the sink is 10000. The results are shown in Figs. 4 and 5. According to the results, when the diversity (the use rate of paths) is low, the accuracy of all five methods is low; as the diversity of the network increases, the accuracy increases gradually. Because these methods detect malicious nodes based on the trust of paths, the more paths there are, the more effective the detection equation set we can construct. These results are consistent with the reasoning behind our detection method. In general, the online learning algorithms perform better than hard detection.

Fig. 4. Accuracy for diversity.
Fig. 5. False alarm rate for diversity.
Fig. 6. Accuracy for proportion.
Fig. 7. False alarm rate for proportion.

6.3. Effect of the proportion of malicious nodes

To examine the impact of the proportion of malicious nodes on general TG detection, enhanced TG detection, general OGD detection, enhanced OGD detection, and hard detection, we adjust the proportion of malicious nodes in the network. Because real networks do not contain many malicious nodes, we set the proportion to 0.1, 0.15, 0.2, 0.25, 0.3, and 0.35. In this experiment, the diversity of the network is 0.5, the number of nodes is 15, and the number of packets received at the sink is 10000. The results are shown in Figs. 6 and 7. According to the results, enhanced TG detection and enhanced OGD detection work well for every proportion of malicious nodes, whereas the remaining three methods do not work well when the proportion is low. When the percentage of malicious nodes is very small, the number of paths that contain malicious nodes is correspondingly small, so the number of valid equations in the detection equation set is also small. Because an online learning model is updated iteratively as new data arrive, the accuracy of the online learning methods (TG and OGD) is higher than that of hard detection.

6.4. Effect of the count of nodes

To examine the impact of the number of nodes on general TG detection, enhanced TG detection, general OGD detection, enhanced OGD detection, and hard detection, we set the number of nodes to 5, 10, 15, 20, 25, and 30, which ensures that normal network routing can be established. In this experiment, the diversity of the network is 0.5, the proportion of malicious nodes is 0.5, and the number of packets received at the sink is 10000. The results are shown in Figs. 8 and 9. According to the results, when the scale of the IoT network is small, the accuracy of hard detection is higher than that of the other methods. An online learning model needs enough data to build a reasonable model; when the number of nodes is 5, the topology is simple and the number of paths is small, so the online learning model cannot produce accurate results. As the network expands, the topology becomes more complicated, and it becomes more difficult to identify all malicious nodes and reduce the error rate. In this regime the accuracy of the online learning detection algorithms increases, whereas the accuracy of HD decreases gradually.

Fig. 8. Accuracy of counting.
Fig. 9. False alarm rate of counting.
Fig. 10. Accuracy of received packets.
Fig. 11. False alarm rate of received packets.

6.5. Impact of the count of received packets

To examine the impact of the number of packets on general TG detection, enhanced TG detection, general OGD detection, enhanced OGD detection, and hard detection, we set the number of packets received at the sink to 100, 200, 500, 1000, 1500, and 2000. In this experiment, the diversity of the network is 0.5, the proportion of malicious nodes is 0.5, and the number of nodes is 15. The results are shown in Figs. 10 and 11. According to the results, when too few packets are received, all detection methods have low accuracy. As the number of injected packets increases, the behavior of nodes can be identified, so enhanced TG detection and enhanced OGD detection perform better than HD. The number of packets received at the sink grows with the detection time, so the relationship between the number of received packets and the detection rate reflects the relationship between the running time of the detection algorithm and the detection rate. When the running time of the online detection algorithm is extremely short (100 received packets), hard detection performs the same as the online learning algorithms. Over time, the detection rate of the online learning algorithms gradually increases, whereas the detection rate of hard detection does not change. The proposed online learning algorithms are therefore better suited to real-time detection than hard detection.

7. Conclusion

In this paper, we address the problem of detecting malicious nodes in sensor networks and propose an effective strategy for calculating the trust of nodes based on online learning algorithms. To verify the effectiveness of this strategy, we propose four detection algorithms, namely original TG detection (OTGD), enhanced TG detection (ETGD), original OGD detection (OOGD), and enhanced OGD detection (EOGD). The simulation results show that our detection algorithms generally perform better than HD in terms of accuracy and false alarm rate. Moreover, numerical studies show that, compared with the original detection method, our detection algorithm is updated iteratively as data are acquired, so it has robust real-time performance and improves the detection rate by 10% to 20%.


Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.comcom.2019.12.037.

CRediT authorship contribution statement

Bohan Li: Writing - original draft, Methodology, Supervision. Renjun Ye: Data curation, Conceptualization. Gao Gu: Writing - review & editing. Ruochen Liang: Writing - review & editing. Wei Liu: Writing - review & editing. Ken Cai: Supervision, Investigation.

Acknowledgments

This work is supported in part by National Natural Science Foundation of China (61728204), Innovation Funding, China (NJ20160028, NT2018027, NT2018028, NS2018057), Aeronautical Science Foundation of China (2016551500), State Key Laboratory for smart grid protection and operation control Foundation, Association of Chinese Graduate Education (ACGE).