Decentralized detection of hybrid faults in mobile sensor nodes


Accepted Manuscript

Decentralized Detection of Hybrid Faults in Mobile Sensor Nodes

Hamid Nourizadeh Azar, Hadi Tabatabaee Malazi

PII: S1569-190X(18)30093-5
DOI: 10.1016/j.simpat.2018.07.001
Reference: SIMPAT 1826

To appear in: Simulation Modelling Practice and Theory

Received date: 1 January 2018
Revised date: 10 May 2018
Accepted date: 6 July 2018

Please cite this article as: Hamid Nourizadeh Azar, Hadi Tabatabaee Malazi, Decentralized Detection of Hybrid Faults in Mobile Sensor Nodes, Simulation Modelling Practice and Theory (2018), doi: 10.1016/j.simpat.2018.07.001

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights

• Tarantula, a software testing method, can be used to detect sensor faults.
• It detects intermittent and permanent faults in mobile sensor networks.
• DBSCAN, a density-based clustering method, is useful in investigating the actual sensing value.
• The devised method is more resilient against fault percentages above 30%.
• The method performs well in random, stuck-at, and gain-offset faults, respectively.

Decentralized Detection of Hybrid Faults in Mobile Sensor Nodes

Hamid Nourizadeh Azar, Hadi Tabatabaee Malazi

Faculty of Computer Science and Engineering, Shahid Beheshti University. GC, Tehran, Iran

Abstract

The widespread use of sensor nodes operating under the Internet of Things paradigm motivates researchers to build reliable systems capable of detecting their faulty nodes. These nodes decrease the accuracy and functionality of the networks, which ultimately degrades the quality of network services. From the temporal point of view, faults can be either permanent or intermittent. The detection of the latter is more challenging since the nodes show contradictory behaviors at different times. From the topological point of view, the mobility of the sensor nodes is an intrinsic characteristic of many IoT-based applications, where numerous mobile nodes are managed by static overlay nodes. The dynamics of these networks introduce the second challenge in identifying faulty nodes. Several works have addressed the problem, but there is a research gap in identifying hybrid soft sensor faults in the aforementioned networks. The focus of this paper is the detection of soft faults in the sensing unit of the nodes. We devised a new method, called Hybrid Fault Detection in Mobile Sensors, to detect nodes with mixed permanent and intermittent faults. A software debugging approach inspired the main idea. We also applied data mining techniques, namely DBSCAN and K-means, to validate sensed data and to differentiate the classes of faults, respectively. We evaluated the devised method using the NS2 simulator in various situations. One of the outcomes is that the mobility of the nodes does not reduce the accuracy of the method, in contrast to most of the traditional methods. Moreover, the evaluation demonstrates promising results for networks with more than 50% faulty nodes. The results also show perfect performance in detecting permanent and intermittent faults in networks with various percentages of faulty nodes.

Keywords: Internet of Things, Mobile Ad Hoc Networks, Fault Diagnosis, Hybrid Faults.

Email addresses: [email protected] (Hamid Nourizadeh Azar), [email protected] (Hadi Tabatabaee Malazi)

Preprint submitted to Simulation Modelling Practice and Theory

July 6, 2018


1. Introduction

Smart city applications [1] are among the important categories of the Internet of Things (IoT). Smart environments (homes [2, 3], offices [4, 5], hospitals [6], museums [7], and hotels [8]), urban road traffic [9], and healthcare [10] are examples of such applications, in which many sensors are used to monitor these areas. Malfunctioning sensor nodes reduce the accuracy of the system by generating invalid or inaccurate sensed data. For instance, a heat sensor in a hotel room may permanently measure the temperature with an offset of -5 degrees, or a humidity sensor in a factory may intermittently transmit a random wrong number every 5 minutes. In another system, which monitors the health and medical status of patients, small sensors are attached to different places or worn by the patients [11]. These micro-electromechanical sensors are prone to failure, which may feed the system with incorrect outputs and result in misinterpretations. The severity of the impact depends on the fault type of the node. The faults may have different origins, including physical failures, reduction of the energy level, and hardware/software failures. There are different perspectives from which to classify the faults of the sensor module. The major sources of failures are sensing units, processors [12], communication networks (link failures), the base station node, and the application. According to the level of severity, the sensors may have soft or hard faults. In the former case, the node can continue some of its duties, but in the latter one, the node is completely out of use. From the temporal (timespan) viewpoint, three fault categories are defined: transient, intermittent, and permanent faults [13, 14]. Finally, based on the amount of error in the measured value, faults can be grouped into various classes such as stuck-at, gain, offset, and random noise faults [15, 16].
The problem that we address in this paper is the detection of soft faults in the sensing unit of the nodes. Each node may be either fault-free or have intermittent or permanent faults. Moreover, the nodes can be mobile or stationary. Since wireless communication is used to connect the nodes, the intrinsically unreliable packet exchange is also considered, and there is the possibility of packet loss. The target network of this research is a mobile sensor network built on top of a static overlay network. This hierarchical structure is one of the suitable topologies for large-scale sensor networks that can be used in smart cities. The primary goal is to find the faulty sensor nodes in the network and then classify them into permanent or intermittent faults. It is worth pointing out that we do not consider transceiver, processing, and memory faults of the nodes in this paper.

The problem involves several significant challenges. The first one is that the detection of a faulty sensing unit is more complicated than the detection of a faulty transceiver. In the latter case, a packet is either delivered or not. Therefore, we are facing a binary result. But in the former one, two seemingly intact sensor nodes may report different values due to imperfect measurement accuracies. For instance, two nearby temperature sensors may report 25.1 and 24.9 degrees. Although the reported values are not the same, both sensors are reporting almost the same temperature. The challenging point is how to differentiate these values and assess whether they are correct or faulty. The second challenge is that the nodes with intermittent faults sometimes send correct values. This uncertainty makes it difficult to label them as fault-free or faulty. The mobility of the nodes introduces the third challenge, since the network topology changes dynamically. Smart city applications intensively use both mobile and static nodes, which results in a time-variant dynamic network topology. The fourth challenge is the unreliable wireless means of communication. Hence, the possible methods have to be resistant against this unreliability. They have to consider that either a test request or the reply may not always reach the destination. Finally, scalability in the number of sensor nodes introduces challenges for centralized approaches.

Several methods have been introduced in the literature [13, 16, 17, 14, 12] to address the distributed detection of faulty sensor nodes. They use a variety of techniques such as neighborhood-based [18, 19], statistics-based [20], probability-based [21], soft computing-based [22], and self-detection-based [23] techniques for their diagnosis module. Some of the methods do not support node mobility; others are only capable of detecting a single type of failure (i.e., permanent faults). In this paper, we devise a new decentralized method called Hybrid Fault Detection in Mobile Sensors (HFDMS). The method is designed to detect hybrid faults in mobile sensor nodes that are deployed upon a static overlay network.
One of the contributions of this work is a new evaluation method inspired by a software debugging technique. This method iteratively collects test data and produces the regional analysis. The second contribution is the use of DBSCAN, a density-based data clustering method, to validate the sensed values and thereby tackle the accuracy challenge of the sensors. The last contribution of HFDMS is to differentiate the classes of intermittent and permanent faults according to our modified version of K-means. In addition, HFDMS analyzes the data in two-stage hierarchical tables in a decentralized manner. Therefore, the mobility of the nodes helps to update the sensors' status information and boosts the performance. Moreover, the method is tolerant of unreliable communication and packet loss. To evaluate the accuracy of the proposed method, we simulated different scenarios using NS2 for the communication modules and Java for the processing modules. The results indicate significant accuracy and a low error rate in comparison with the methods introduced in [18, 24].

The organization of the paper is as follows. In the next section, we review some of the similar outstanding works. Section 3 presents the system models, including the fault, network, and mobility models. The details of the devised method are introduced in Section 4. The performance analysis is discussed in Section 5. Finally, Section 6 concludes the paper.

2. Related work

Researchers have introduced novel methods for detecting faulty nodes in mobile ad hoc networks (MANET) [25] and wireless sensor networks (WSN) in recent years. In [16], Muhammed et al. performed a comprehensive survey and quantitatively analyzed the extant work in this area. The remainder of this section fills in the details of some of the distinguished studies.

Panda et al. in [20] introduced a distributed algorithm for failure detection in WSNs, which uses a neighboring coordination approach. It finds the permanently soft faulty nodes by using the z-test. In other words, each node detects faulty nodes by comparing its sensed values with the data collected from its neighbors. The method is suitable for dynamic networks since, in each time interval, the node has to detect its neighbors and evaluate whether it is fault-free or not.

The energy efficiency of fault detection is the topic of attention in [26]. An increase in the number of nodes causes more transferred messages, so the energy consumed for fault detection increases with the growth in the number of nodes, but it remains fixed as the probability of malfunctioning nodes grows. The main idea is to synchronize the nodes and use a broadcast protocol for the status exchange packets to reduce the number of messages, which leads to less energy consumption. The authors introduced a distributed 4-phase method (initialization, self-detection, status exchange, and decision), which uses the neighboring coordination technique. The method is designed for dynamic networks and identifies soft permanent faults. They also exploit a confidence level for each node. The nodes are responsible for calculating their confidence level by comparing the number of matching sensed values that are received from the neighbors. According to the simulation results, the total amount of transferred data in the network varies due to packet loss.
One of the deficiencies of the method is that in highly populated networks, the accuracy decreases and the error rate increases.

The researchers in [23] devised a distributed fault detection method, which is based on a modified version of the three sigma edit test [27]. It is capable of detecting data faults including offset, gain, and stuck-at faults as well as hardware faults. The method has two phases: initialization and self-diagnosis. In the first phase, the neighboring nodes exchange their sensed values within a specific period. Then, each node creates its list of neighbors with their associated sensed values. In the second phase, each node analyzes the reported values. If the node does not receive any value from its neighbors, it detects a hard fault. Otherwise, the node measures the discrepancy between the sensed values of the neighboring nodes and its own sensed value. The result indicates whether the node is faulty or fault-free. Energy efficiency is one of the main advantages of the method. Besides, the method is applicable in small networks. On the other hand, the method does not support intermittent faults.

Mahapatro et al. in [18] examined the effect of node mobility on detecting sensor faults. They introduced a comparison-based distributed algorithm for mobile ad hoc networks that applies a test pattern to each sensor and recognizes the faulty ones according to the difference between the results of the neighboring nodes. The method guarantees the detection of the soft and hard faults in the network. The main drawback of the method is that it cannot detect offset faults.

Sahoo et al. in [24] introduced a self-adaptive distributed fault detection protocol for mobile ad hoc networks, called flexible Distributed Self Diagnosis Protocol (DSDP). It is capable of detecting hard faults as well as permanent and intermittent soft ones. The algorithm is based on a test task that is executed by each node. Then, the nodes compare their outputs with those of the neighbors. Based on the matching of the outputs, the nodes decide whether the situation is faulty or fault-free. One of the shortcomings of the method is the possibility of false positive errors. That is, the algorithm recognizes some of the fault-free or permanently faulty nodes as intermittently faulty. Moreover, the authors stated that the communication protocol between the nodes is a reliable broadcast protocol. Therefore, the algorithm will not perform well in networks with possible packet loss.

Titouna et al. in [28] introduced a hybrid hierarchical outlier detection method. The devised fusion-based method has two levels. In the first level, the detection process is performed inside each node via a Naïve Bayes classifier.
Assuming that the correct sensed values are in the range of a to b, the range is split into intervals that represent system classes. The method applies the maximum posterior to infer the class. In the second level, the node compares the class of its sensed value with its associated class. If it belongs to the same class, the node sets report = 0; otherwise, it sets report = 1. The decision is sent to the cluster head for higher-level assessment. Each cluster head uses Normal and Anomaly tables to keep track of the reported values, by comparing the sensed data of a node in the Anomaly table with the nodes of the Normal table. According to the degree of similarity, the cluster head detects various faults such as gain, out of bound, and stuck-at faults. One of the main advantages of their method is that it considers spatio-temporal values, which helps the system to cope with network dynamics. On the other hand, the deficiency of the method is the low rate of detected faults in the beginning, although the rate improves over time.

Chanak et al. in [29] devised a two-stage method. It detects the faults in the first stage and classifies them in the second one. The method uses a fuzzy rule-based approach to tackle the uncertainty challenges and identifies various types of faults, including sensor circuit, receiver circuit, transmitter, and battery faults. In the first stage, the faults are detected. To detect the transmitter fault, the sensor nodes send a heartbeat packet periodically at predefined intervals, and the sink node replies via an OK message. The sink node detects the transmitter faults of each node by analyzing the number of received heartbeat packets within a specific period. Likewise, the efficiency of the receiver is calculated by the sensor node according to the number of OK messages received from the sink. Then, the node maps the efficiency to a fuzzy variable. The sensor nodes identify the sensor circuit faults by exchanging their sensed data and comparing the differences. A value below a threshold indicates the fault-free status. The authors also introduced the average sensing information, which shows the differences between the sensing values of the neighbors and is represented via fuzzy logic variables. Similarly, the remaining battery is also defined by fuzzy variables. In the second stage, the fuzzy rules are utilized to classify different types of faults. The main advantage of the devised method is its handling of the uncertainty of the sensor network. Besides, it provides a flexible fuzzy-based system that can be configured for a variety of applications. However, it cannot adapt itself to network dynamics.

The study of the aforementioned works shows that most of them are not suitable for mobile sensor nodes; in other words, the mobility of the nodes reduces the performance of these methods. From the fault type viewpoint, some of them support permanent faults only. Moreover, the detection of intermittent faults comes along with false positives, either in identifying the faulty nodes or in differentiating permanent faults from intermittent ones. The unreliability of the wireless transmission medium leads to further complication.

3. System model

Before we embark on an in-depth explanation of our method, it is necessary to define the system models including fault, network, and mobility to provide a better view of the scope of the problem. It is important to declare that the main focus of our work is to devise a method that can analyze the test data in a clustered mobile sensor network [30]. The cluster formation and data streaming challenges are not addressed in this work, and the interested readers may refer to [30, 31].

3.1. Fault model

Generally, fault-free nodes in the same location and time do not send exactly the same values, since the accuracy of the sensing units is not perfect. The maximum difference between the sensed values of two fault-free nodes is denoted by $\delta$. In other words, if two sensor nodes are fault-free, the difference of their sensed values is less than $\delta$. Equation 1 demonstrates this fact, where $d_i^t$ and $d_j^t$ are the sensor-generated data at time $t$ for nodes $i$ and $j$, respectively. It is important to point out that the converse does not hold, as it is possible that two faulty sensors produce values close to each other.

$$|d_i^t - d_j^t| < \delta \tag{1}$$

From the behavioral viewpoint, there are two categories: soft and hard faults. In the hard fault case, the node cannot operate and does not react to anything. In the soft one, the sensor can function, but the transmitted data is incorrect [32]. This paper concentrates on the issue of soft faults.

From the timespan standpoint, there are three classes: transient, intermittent, and permanent faults [13]. A transient fault depends on external factors and does not necessarily indicate a malfunctioning system. Since these faults are produced accidentally and disappear shortly, diagnosing and tracking them is relatively difficult. An intermittent fault depends on the internal components and has either software or hardware origins. In this case, the faulty sensor repeatedly sends incorrect values, then alternates and sends correct ones. The last class is the permanent fault, in which a sensor always sends incorrect data. We do not consider transient faults in this paper.

Finally, from the data-centric standpoint, a faulty sensor node may be affected by an offset fault, gain fault, stuck-at fault, out of bounds fault, or random noise fault. In the following, we formally introduce these faults according to the definition provided in [16]. Let $d(n, t, f(t))$ represent sensed values that are modeled as time series, where $n$ denotes the node ID, $t$ denotes the sensing time, and $f(t)$ is the sensed value. The $f(t)$ can be further modeled as $\alpha + \beta x + \eta$, where $\alpha$ is a constant additive term that represents the offset, $\beta$ is a constant coefficient called gain, and $\eta$ is the external noise in the value.

• Stuck-at fault: The sensor node sends a fixed constant value. To model this type of fault, $\beta$ and $\eta$ are omitted from the model ($x' = \alpha$, $x' \in f(t)$).
• Offset fault: The sensed data always differs from the original value by a fixed amount. It can be modeled as $x' = \alpha + x + \eta$, where $x' \in f(t)$, and $\alpha$ is the constant value that is added to the original value.
• Gain fault: The transmitted sensed data is always a fixed multiple of the original value. It is modeled as $x' = \beta x + \eta$, where $x' \in f(t)$, and $\beta$ is the coefficient of the fault.
• Random noise fault: The sensor node sends data that is affected by noise with an average of zero and a high variance. The fault is modeled as $x' = \eta$, where $x' \in f(t)$ and $\eta$ is the external noise.
• Out of bounds fault: This type of fault occurs when the sensed value does not lie within the predefined acceptable range. If $x' > \theta_1$ or $x' < \theta_2$, then the sensed value is outside the predefined bounds $\theta_1$ and $\theta_2$ ($\theta_1 > \theta_2$).
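The fault definitions above can be illustrated with a minimal Python sketch. It is not taken from the paper's simulator: the function names `apply_fault` and `is_out_of_bounds`, the string labels for the fault kinds, and the choice of Gaussian noise for $\eta$ are assumptions made only for this illustration.

```python
import random

def apply_fault(x, kind, alpha=0.0, beta=1.0, noise_std=0.0, rng=None):
    """Return the faulty reading x' for a true value x.

    alpha, beta and eta follow the model f(t) = alpha + beta*x + eta.
    Assumption: eta is drawn from a zero-mean Gaussian with std noise_std.
    """
    rng = rng or random.Random()
    eta = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    if kind == "stuck_at":       # x' = alpha
        return alpha
    if kind == "offset":         # x' = alpha + x + eta
        return alpha + x + eta
    if kind == "gain":           # x' = beta * x + eta
        return beta * x + eta
    if kind == "random_noise":   # x' = eta
        return eta
    raise ValueError(f"unknown fault kind: {kind}")

def is_out_of_bounds(x_prime, theta1, theta2):
    """Out of bounds test with theta1 as upper and theta2 as lower bound."""
    return x_prime > theta1 or x_prime < theta2
```

For example, under these assumptions `apply_fault(25.0, "offset", alpha=-5.0)` models the hotel-room heat sensor from the introduction that permanently reads 5 degrees too low.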

3.2. Network model

Our devised method can be applied to clustered mobile sensor networks. The work presented in [30] shows a suitable environment, where a static overlay covers a network comprised of mobile nodes. We used the following case study of a smart campus to assess the performance of our method. We consider a campus comprised of $n$ blocks and a set of sensors ($s_i$) with $n$ static and $m$ mobile nodes (Equation 2). Let $S$, $F$, and $M$ denote the sets of sensors, static nodes, and mobile nodes, respectively.

$$S = F \cup M, \qquad F = \{f_1, \dots, f_n\}, \qquad M = \{m_1, \dots, m_m\} \tag{2}$$

Each node has a unique ID and is aware of its current location (block ID). Each block has a static sensor node. The static nodes form a connected graph and can communicate with each other. The rest of the nodes are distributed randomly all over the campus to monitor the environmental factors. The mobile nodes can relocate between the blocks independently and randomly. Equation 3 gives the position of node $i$ at time $t$. If the node is not inside any block at time $t$, its location is zero. The minimum communication range of the nodes is adequate for communicating with all the nodes inside the block, including the static node.

$$p_i^t = \begin{cases} x & \text{if the node is in block } x \\ 0 & \text{if the node is not inside any block} \end{cases} \tag{3}$$

The nodes organize an undirected dynamic graph $C(S, L_t)$, where $S$ is the set of all the nodes (static and mobile) and $L_t$ represents the set of the edges (such as $l_t$) at time $t$ (Equation 4). The associated fixed node of a mobile node $i$ at time $t$ is $f_j$, provided that they are both in the same position and the mobile node is inside a block (Equation 5). Finally, $N_i^t$ denotes the list of the neighbors of the fixed node $i$ at time $t$ (Equation 6).

$$l_t(m_i, m_j) \in L_t : (p_i^t = p_j^t) \text{ and } (p_i^t, p_j^t \neq 0) \tag{4}$$

$$l_t(m_i, f_j) \in L_t : (p_i^t = p_j) \text{ and } (p_i^t \neq 0) \tag{5}$$

$$N_i^t = \{m_x : l_t(m_x, f_i) \in L_t\} \tag{6}$$
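As a rough illustration of Equations 4 to 6, the following Python sketch derives the edges and the neighbor list from a snapshot of node positions. The dictionary layout and the function names `mobile_edges` and `neighbors_of_static` are hypothetical; position 0 means that the node is outside every block, as in Equation 3.

```python
from itertools import combinations

def mobile_edges(mobile_positions):
    """Equation 4: pairs of mobile nodes that share the same non-zero block."""
    return {
        (a, b)
        for a, b in combinations(sorted(mobile_positions), 2)
        if mobile_positions[a] == mobile_positions[b] != 0
    }

def neighbors_of_static(static_block, mobile_positions):
    """Equations 5 and 6: mobile nodes currently inside the static node's block."""
    return sorted(
        m for m, p in mobile_positions.items()
        if p != 0 and p == static_block
    )

# Snapshot at some time t (illustrative values only).
positions = {"m1": 2, "m2": 0, "m3": 2, "m4": 5}
```

With this snapshot, the static node of block 2 has the neighbors m1 and m3, while m2 is between blocks and has no edges at all.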

3.3. Mobility model

In the mobility model used in this paper, the next block to move to is chosen according to a uniform probability function. Besides, the movement path is restricted by pavements and streets. For this reason, we used the Manhattan mobility pattern [33]. The movement speed is assumed identical for all mobile nodes. They stay in a block for a randomly chosen period ($t_{delay}$) and then select the next block to move to.

4. The devised method

Our proposed method is called Hybrid Fault Detection in Mobile Sensors (HFDMS). It is a decentralized method that detects the faulty sensors in two phases. In the first phase, the regional nodes (the static nodes in each block) analyze the sensed values of the sensors. Then, they send the aggregated information to the base station. In the second phase, the base station performs the final analysis and disseminates the health status of each node.

4.1. Detection phase

The observations of the sensor nodes reveal spatial and temporal correlations. Two adjacent nodes at a particular time in a block may record similar values. We use this fact and design the detection phase with four steps. The devised method repeats the detection phase, with all of its steps, for $r$ rounds.

4.1.1. Test data exchange

In the first step, the static node of block $i$ broadcasts a request message $req_i^t$. The message includes the sender ID, request ID, block ID, and the energy level of the sender. The mobile nodes that receive the message compare their locations with the block ID field. If a mobile node $j$ is in the same block, it prepares the reply message $rep_j^t$ with the following fields: mobile node ID, request ID, sensed value, and static node ID. The reply messages are stored in a round list of the fixed node. It is worth pointing out that a pairwise exchange of test data is not efficient, since it increases the number of exchanged messages considerably, which results in energy depletion.
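The message fields listed in this step can be sketched as two small Python records. This is only an illustration under stated assumptions: the class names `Request` and `Reply`, the field names, and the helper `make_reply` are hypothetical and do not come from the paper's NS2 or Java code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Request:
    sender_id: str       # ID of the static node that broadcasts the request
    request_id: int
    block_id: int
    energy_level: float  # energy level of the sender

@dataclass(frozen=True)
class Reply:
    mobile_id: str
    request_id: int
    sensed_value: float
    static_id: str       # static node the reply is addressed to

def make_reply(req: Request, mobile_id: str, current_block: int,
               sensed_value: float) -> Optional[Reply]:
    """A mobile node answers only if it is located in the requested block."""
    if current_block != req.block_id:
        return None
    return Reply(mobile_id, req.request_id, sensed_value, req.sender_id)
```

In this sketch the static node would append every returned `Reply` to its round list; a lost request or reply simply means that the node is missing from the list for that round.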

4.1.2. Regional analysis

The primary challenge of this step is to identify the real value among the spectrum of the sensed data. To address this challenge, once the reply messages are stored in the round list of the static node, the fixed node starts to analyze the sensed values based on the time and location correlations. Theoretically, nodes at the same time and in the same location may sense similar, but not necessarily identical, values. On the other hand, faulty nodes may sense scattered ones. Hence, the fault-free readings form a cluster. The main goal of this step is to find this cluster. The use of the average or the median in this step is not suitable, since the false values may affect them, especially in cases where the percentage of faulty nodes is high. To find the cluster, a density-based clustering algorithm called DBSCAN [34] is used. The algorithm takes a set of points and defines the points closely placed together as a cluster. The points that lie alone in low-density areas are marked as outliers. Besides the set of points, DBSCAN needs two more input parameters. The first one is µ (minpoint), which defines the minimum number of neighbors for a point to be considered as a center of a cluster. The second one is ε, which defines the maximum distance between two neighboring points in the cluster. The main steps of the algorithm are [35]:

1. Find the neighbors of each point within ε distance, and identify the core points with more than µ neighbors.
2. Find the connected components of the core points and ignore the non-core ones.
3. Assign each non-core point to a nearby cluster if the cluster is an ε neighbor. Otherwise, consider it noise.
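The three steps above can be sketched for one-dimensional sensed values as follows. This is a minimal illustration, not the implementation used in the paper. It assumes that a point is a core point when at least `min_pts` other points lie within `eps` of it, which is one possible reading of the µ threshold; the function name `dbscan_1d` is hypothetical.

```python
def dbscan_1d(values, eps, min_pts):
    """Return one label per value: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    # Step 1: eps-neighbors of every point (the point itself is excluded).
    neigh = [
        [j for j in range(n) if j != i and abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    core = [len(neigh[i]) >= min_pts for i in range(n)]

    # Step 2: connected components of the core points.
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neigh[p]:
                if core[q] and labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1

    # Step 3: attach each non-core point to the cluster of a core neighbor,
    # if it has one; otherwise it stays labeled as noise (-1).
    for i in range(n):
        if labels[i] == -1:
            for q in neigh[i]:
                if core[q]:
                    labels[i] = labels[q]
                    break
    return labels
```

The quadratic neighbor search is acceptable here because a round list only holds the nodes of a single block.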

To utilize DBSCAN, our devised method passes the round list, which includes the list of nodes (in a single region) with their associated sensed values, to the clustering algorithm. To pass the ε parameter, there is an analogy between ε and the accuracy (δ) of fault-free nodes. In our devised method, we set this value to one. In the best case, the outcome of this step is one cluster and a few noise data. Consequently, the cluster represents the fault-free nodes ($FF_i$ list). In the worst scenario, the algorithm produces several clusters. The cluster with the largest population has the highest probability of containing the correct values. The next task is to label the fault-free nodes. There is also a chance that a single cluster includes a wide range of values. To solve this issue, the median of the cluster is taken into account: the nodes reporting values within ±δ of the median are labeled as fault-free, and the rest of the nodes are labeled as faulty ($F_i$ list).
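The labeling rule described above (median of the largest cluster, then a ±δ window) can be sketched as below. The function name `label_fault_free` and the dictionary-based round list are assumptions for illustration; the clustering itself is taken as given, so `largest_cluster` is simply the set of node IDs in the most populated cluster.

```python
from statistics import median

def label_fault_free(readings, largest_cluster, delta):
    """Split one round's responders into fault-free (FF) and faulty (F) sets.

    readings: dict mapping node ID to sensed value for every responder.
    largest_cluster: node IDs in the most populated cluster (may be empty).
    """
    if not largest_cluster:
        # No cluster was found, so no node can be confirmed as fault-free.
        return set(), set(readings)
    mid = median(readings[n] for n in largest_cluster)
    fault_free = {n for n in largest_cluster if abs(readings[n] - mid) <= delta}
    faulty = set(readings) - fault_free
    return fault_free, faulty
```

The median is applied only inside the selected cluster, so scattered faulty readings outside the cluster cannot shift it.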

4.1.3. Update of the regional table

The static node of each block aggregates the analysis results of the previous step in its regional table. The aggregation method is inspired by a software debugging technique called Tarantula [36]. It is an automatic debugging method for detecting faulty lines of a software program by checking the units of the program using a set of test cases. The results of the tests (success, failure) and the lines of the program that participate in each test indicate the faulty lines. The main idea is that the lines that are engaged in failed tests more often than others have a higher probability of being faulty. The reason for choosing this method is the analogy between finding a faulty line of a program and finding a faulty node in a network. The elements in both cases cooperate: in the former case on a test case, and in the latter on the test task of sensing a common phenomenon. Moreover, only a subset of elements participates in each test in both cases. In the former one, according to the test case path, a subset of the lines of the program is involved, and in the latter one, the adjacent nodes take part in the test.

Each static node has two data structures. The first one is the regional table, which assigns two parameters to each sensor: $Passed_i$, which indicates the number of times sensor $i$ was in the fault-free cluster, and $Failed_i$, which shows the number of times the node was detected as faulty. The initial value of these parameters is zero, and the values are updated in each round. Figure 1 demonstrates the data structure of a regional table. The second data structure is a vector that records the number of TotalPassed and TotalFailed cases. The TotalPassed parameter counts the number of rounds in which a cluster is detected. Additionally, TotalFailed indicates the number of rounds in which at least one faulty sensor exists.

Figure 1: The structure of a regional table.

For example, five sensor nodes ($s_1$, $s_2$, $s_3$, $s_4$, $s_5$) in a block perform the sensing task. Let δ be 0.5 and the outcomes of the sensors be 25.2, 25.3, 29.7, 25.1, and 21.2, respectively. According to the previous step, $s_1$, $s_2$, and $s_4$ form a cluster of fault-free sensors. Therefore, their Passed values are incremented by one. The other two sensors are labeled as faulty, and their Failed parameters are increased by one. The TotalPassed and TotalFailed parameters are also incremented, since at least one cluster is detected and at least one faulty node is identified.
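The update in this example can be traced with a small Python sketch of the two data structures. The class name `RegionalTable` and its attribute names are hypothetical; the sketch only mirrors the counters described in the text and takes the fault-free and faulty sets of one round as input.

```python
from collections import defaultdict

class RegionalTable:
    """Per-sensor Passed/Failed counters plus the TotalPassed/TotalFailed vector."""

    def __init__(self):
        self.passed = defaultdict(int)   # sensor ID -> rounds in the fault-free cluster
        self.failed = defaultdict(int)   # sensor ID -> rounds labeled as faulty
        self.total_passed = 0            # rounds in which a cluster was detected
        self.total_failed = 0            # rounds with at least one faulty sensor

    def update(self, fault_free, faulty):
        """Record the outcome of one diagnostic round."""
        for node in fault_free:
            self.passed[node] += 1
        for node in faulty:
            self.failed[node] += 1
        if fault_free:
            self.total_passed += 1
        if faulty:
            self.total_failed += 1

# One round with the values from the example (delta = 0.5).
table = RegionalTable()
table.update(fault_free={"s1", "s2", "s4"}, faulty={"s3", "s5"})
```

After this single round, s1, s2, and s4 each have one pass, s3 and s5 each have one failure, and both totals equal one.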


4.1.4. Sending the regional information

Once the static nodes have completed the tests of the r-th round, they send their regional tables and vectors to the base station for the final assessment. Algorithm 1 depicts the pseudocode of the first phase. A static node (f_i) executes the algorithm in each diagnostic round.


4.2. Aggregation phase

Upon receiving the regional information, the base station aggregates it and identifies the fault-free nodes as well as the type of each fault (intermittent or permanent). This phase includes three steps: suspiciousness assessment, identification of the fault status of the nodes, and dissemination of the results to the network.

Algorithm 1: Detection Phase
input : C(S, L_t), the communication graph of the WSN
output: The local diagnostic view of node f_i
begin
    FF_fi = ∅;    /* The set of fault-free nodes */
    F_fi = ∅;     /* The set of faulty nodes */
    req_i^t = (f_i, Req_ID, i);
    /* Sending a request message from f_i to its neighbors N_i^t */
    Send(req_i^t, N_i^t);
    Add f_i and its sensedValue to round_list;
    /* Receiving reply messages from m_j, where m_j ∈ N_i^t */
    foreach (received rep_j^t) do
        rep = (m_j, Req_ID, sensedValue);
        Append(round_list, rep);
    /* The maximum acceptable difference of two sensed values */
    ε = δ;    /* DBSCAN parameter */
    µ = 1;    /* DBSCAN parameter */
    Cluster_List = DBSCAN(round_list, µ, ε);
    if (∃ any cluster returned by DBSCAN) then
        C = the largest cluster;
        foreach (m_j ∈ C) do
            if (|sensedValue_j − mid(C)| ≤ δ) then
                FF_fi = FF_fi ∪ m_j;
    foreach (m_j ∈ N_i^t AND m_j ∉ FF_fi) do
        F_fi = F_fi ∪ m_j;
    if (FF_fi ≠ ∅) then
        foreach (m_j ∈ FF_fi) do
            Passed_j = Passed_j + 1;
        TotalPassed = TotalPassed + 1;
    if (F_fi ≠ ∅) then
        foreach (m_j ∈ F_fi) do
            Failed_j = Failed_j + 1;
        TotalFailed = TotalFailed + 1;
    Update Regional Table;
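The clustering step of the detection phase can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the readings are one-dimensional, mid(C) is taken to be the median of the cluster, and the function names `dbscan_1d` and `detection_round` are hypothetical. With µ = 1 every reading is a core point, so for scalar readings the DBSCAN clusters coincide with runs of sorted values whose consecutive gaps do not exceed ε; the sketch uses that shortcut instead of a general DBSCAN.

```python
from statistics import median


def dbscan_1d(values, eps):
    """Group 1-D readings into density-connected clusters.

    With min_pts = 1 every reading is a core point, so the clusters are the
    runs of sorted readings whose consecutive gaps are <= eps.
    Returns a list of clusters, each a list of (node_id, value) pairs.
    """
    ordered = sorted(values.items(), key=lambda item: item[1])
    clusters = []
    for node, value in ordered:
        if clusters and value - clusters[-1][-1][1] <= eps:
            clusters[-1].append((node, value))
        else:
            clusters.append([(node, value)])
    return clusters


def detection_round(readings, delta):
    """Split the nodes of one block into fault-free and faulty sets."""
    clusters = dbscan_1d(readings, eps=delta)
    if not clusters:
        return set(), set()
    largest = max(clusters, key=len)
    mid = median(value for _, value in largest)  # assumed meaning of mid(C)
    fault_free = {node for node, value in largest if abs(value - mid) <= delta}
    faulty = set(readings) - fault_free
    return fault_free, faulty


# Readings from the example in Section 4.1.3, with delta = 0.5.
readings = {"s1": 25.2, "s2": 25.3, "s3": 29.7, "s4": 25.1, "s5": 21.2}
fault_free, faulty = detection_round(readings, delta=0.5)
```

For these readings the sketch places s1, s2, and s4 in the fault-free set and s3 and s5 in the faulty set, which matches the worked example.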

4.2.1. Suspiciousness assessment

The input of this step includes both the regional tables and the vectors of the static nodes. The regional tables are aggregated in the global table, which is comprised of the node ID, the sum of the passed cases, the sum of the failed cases, the suspiciousness of being faulty, and the fault status. Note that the last two columns are computed in this step. The vectors of the static nodes are also aggregated by summing them into TotalPassed and TotalFailed. The primary task is to calculate the global suspiciousness of each node. Figure 2 demonstrates the structure of the global table.

Figure 2: The structure of the global table.

To calculate the fault probability, we use the suspiciousness factor introduced in Tarantula [36]. Equation 7 gives the suspiciousness of a given node j. The maximum suspiciousness is one, which indicates a high likelihood of being faulty, and the minimum is zero, which indicates the lowest probability of being faulty.

\[
\mathit{Suspiciousness}(j) = \frac{\dfrac{\mathit{Failed}(j)}{\mathit{TotalFailed}}}{\dfrac{\mathit{Passed}(j)}{\mathit{TotalPassed}} + \dfrac{\mathit{Failed}(j)}{\mathit{TotalFailed}}} \tag{7}
\]
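As an illustration, Equation 7 can be evaluated with a few lines of Python. The function name and the handling of zero denominators are assumptions made for this sketch; the paper does not state how a node that never took part in a test, or a run with no passed or failed cases, is scored.

```python
def suspiciousness(passed, failed, total_passed, total_failed):
    """Tarantula-style suspiciousness of one node, in the range [0, 1]."""
    fail_ratio = failed / total_failed if total_failed else 0.0
    pass_ratio = passed / total_passed if total_passed else 0.0
    if fail_ratio + pass_ratio == 0.0:
        return 0.0  # assumed convention: a node with no history scores zero
    return fail_ratio / (pass_ratio + fail_ratio)


# Hypothetical counters after 20 rounds in which clusters and faults both occur.
always_clean = suspiciousness(passed=20, failed=0, total_passed=20, total_failed=20)
always_faulty = suspiciousness(passed=0, failed=20, total_passed=20, total_failed=20)
sometimes_faulty = suspiciousness(passed=15, failed=5, total_passed=20, total_failed=20)
```

With these hypothetical counters, a node that always passes scores 0, a node that always fails scores 1, and a node that fails in a quarter of the rounds scores 0.25, which lies between the two levels.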


4.2.2. Identifying the fault status

To determine whether a node is fault-free and, if not, its fault type (intermittent or permanent), the global table is sorted according to the suspiciousness field. A sample result is shown in Figure 3, where the horizontal axis represents the nodes and the vertical axis shows the suspiciousness (fault probability). The suspiciousness degrees of the fault-free nodes are close to zero, and those of the nodes with permanent faults are close to one. The values between the two levels indicate intermittent faults. To discriminate the three groups, we apply the K-means clustering algorithm. It partitions the data into K clusters by iteratively assigning each data point to the cluster that has the nearest mean. In the initial step, the algorithm randomly selects or generates K centroids. Then, in each iteration, the following steps are executed:


1. Data assignment: Each data point is assigned to its nearest centroid.
2. Centroid update: The centroid is recomputed by calculating the new mean value of the cluster.

The problem in applying K-means for detecting faulty nodes is that the zero values of the fault-free nodes pull the mean value of their cluster close to zero. Therefore, the other fault-free nodes, which have non-zero suspiciousness values, are identified as intermittently faulty. To solve this issue, our modified version of K-means does not consider the zero values in the clustering process. As a result, the mean value of the cluster increases, and the fault-free nodes with non-zero suspiciousness are included in the fault-free group. The output of the clustering method fills the last column of the global table.

Figure 3: A sample suspiciousness assessment of the network nodes.
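The modified clustering can be sketched as a one-dimensional K-means with K = 3 that sets the zero scores aside before clustering. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the initial centroids are spread evenly between the smallest and largest non-zero score so that the result is repeatable, whereas the text describes a random initialization. Degenerate inputs (for example, all non-zero scores being equal) are not treated specially.

```python
K = 3
STATUS = ["fault-free", "intermittent", "permanent"]  # ordered by centroid


def classify_by_suspiciousness(scores, iterations=100):
    """Label every node in scores (node id -> suspiciousness in [0, 1])."""
    # Zero scores are labeled fault-free and excluded from the clustering.
    labels = {node: "fault-free" for node, s in scores.items() if s == 0.0}
    points = {node: s for node, s in scores.items() if s != 0.0}
    if not points:
        return labels

    # Deterministic start (an assumption made for this sketch).
    lo, hi = min(points.values()), max(points.values())
    centroids = [lo + (hi - lo) * i / (K - 1) for i in range(K)]

    for _ in range(iterations):
        # Data assignment: each point goes to its nearest centroid.
        groups = [[] for _ in range(K)]
        for s in points.values():
            nearest = min(range(K), key=lambda c: abs(s - centroids[c]))
            groups[nearest].append(s)
        # Centroid update: mean of each group (empty groups keep their centroid).
        updated = [sum(g) / len(g) if g else centroids[c]
                   for c, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated

    for node, s in points.items():
        nearest = min(range(K), key=lambda c: abs(s - centroids[c]))
        labels[node] = STATUS[nearest]
    return labels


# Hypothetical suspiciousness values for seven nodes.
scores = {"a": 0.0, "b": 0.05, "c": 0.1, "d": 0.45, "e": 0.5, "f": 0.95, "g": 1.0}
labels = classify_by_suspiciousness(scores)
```

Because the centroids start in ascending order and one-dimensional assignments preserve that order, the lowest centroid can be read as the fault-free group, the middle one as intermittent, and the highest as permanent.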


4.2.3. Status dissemination of faulty nodes

The last step is to inform the static nodes about the identified faulty nodes. The base station disseminates the list of node IDs and their associated fault statuses. This helps the static nodes avoid collecting information from the faulty ones. Algorithm 2 depicts the pseudocode of the second phase.

5. Performance evaluation


In this section, we analyze the performance of our method from various perspectives. Before we embark on an in-depth discussion on the results, the configuration of the simulation, evaluation metrics, and the compared methods are briefly introduced.


5.1. Simulation configuration

We use NS2 to simulate the communication environment and apply different numbers of mobile nodes, ranging from 75 to 500, in various simulation scenarios. The number of static nodes ranges from 25 to 100. Besides, a wide range of fault percentages, from 2% to 60%, is used to investigate different conditions.

Algorithm 2: Aggregation Phase
input : List of Regional Tables, List of Regional Vectors
output: The local diagnostic view of node s_i
/* The sink node runs the algorithm */
begin
    foreach (received Regional information) do
        Msg = Regional information;
        foreach (node s_j in Msg) do
            Passed_j = Passed_j + Msg.Passed_j;
            Failed_j = Failed_j + Msg.Failed_j;
        TotalPassed = TotalPassed + Msg.TotalPassed;
        TotalFailed = TotalFailed + Msg.TotalFailed;
    /* Applying Tarantula */
    foreach (node s_j) do
        Suspiciousness(s_j) = (Failed(s_j) / TotalFailed) / (Passed(s_j) / TotalPassed + Failed(s_j) / TotalFailed);
    foreach (s_j) do
        globalTable.Add(s_j, Suspiciousness(s_j));
    /* Using the k-means algorithm for database D and 3 clusters */
    k_mean(D, 3);
    Update Fault Status column of the Global Table;
    Broadcast the list of Faulty nodes to the Network;

The nodes are chosen to be faulty using a uniform probability distribution and according to the fault percentage of the network. The fault percentage is considered the same for both the static and the mobile nodes. Finally, two timespan fault types (permanent and intermittent) are used to study the behavior of the methods. To simulate the mobility of the nodes, a mobile node stays in a block for a random period and then moves to another randomly chosen block. We ran each simulation five times and averaged the results. The rest of the configuration parameters are listed in Table 1. To evaluate the proposed method, we use two metrics. The first one is the false detection rate, which is the ratio of the number of nodes detected as faulty that are not actually faulty to the total number of fault-free nodes. The second one is the detection accuracy, which is the ratio of the number of detected faulty nodes to the total number of actual faulty nodes.
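The two metrics can be written down as a short Python sketch. The function names are hypothetical, and two points are assumptions of this sketch rather than statements of the paper: the detection accuracy counts only the correctly detected faulty nodes in its numerator, and an empty denominator yields 0 for the false detection rate and 1 for the detection accuracy.

```python
def false_detection_rate(detected_faulty, actual_faulty, all_nodes):
    """Share of fault-free nodes that were wrongly reported as faulty."""
    fault_free = set(all_nodes) - set(actual_faulty)
    wrongly_flagged = set(detected_faulty) & fault_free
    return len(wrongly_flagged) / len(fault_free) if fault_free else 0.0


def detection_accuracy(detected_faulty, actual_faulty):
    """Share of actually faulty nodes that were reported as faulty."""
    actual = set(actual_faulty)
    correctly_flagged = set(detected_faulty) & actual
    return len(correctly_flagged) / len(actual) if actual else 1.0


# Hypothetical outcome for a ten-node network with four faulty nodes.
all_nodes = [f"n{i}" for i in range(10)]
actual = {"n0", "n1", "n2", "n3"}
detected = {"n0", "n1", "n2", "n9"}
fdr = false_detection_rate(detected, actual, all_nodes)
acc = detection_accuracy(detected, actual)
```

In this hypothetical case one of the six fault-free nodes is wrongly flagged and three of the four faulty nodes are found, so the false detection rate is 1/6 and the detection accuracy is 0.75.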


Parameter                 Values
Number of mobile nodes    [75-500]
Number of static nodes    [25-100]
Fault probability         [2%-60%]
Fault types               Intermittent, Permanent
Transmission range        30 m
Antenna                   Omni-directional
MAC layer protocol        802.11a
Propagation model         TwoRayGround

Table 1: The parameter configuration of the simulator


We compared our method with two other methods that support node mobility. The first one is the method presented in [18], which detects only permanent faulty nodes in mobile sensor networks. The second one is the method presented in [24], which detects both permanent and intermittent faults. Note that the method presented in [24] is designed for MANETs, so we changed the content of the test message to report the sensed values of the sensors. The rest of the method remains unchanged.


5.2. Sensitivity analysis of the number of rounds

The number of rounds in the devised method plays an important role in terms of accuracy and message complexity. We investigate the convergence of the accuracy as well as the false detection rate over various numbers of rounds. To gain an in-depth insight, we used an extreme case, where 50% of the nodes are randomly chosen to be faulty in 25 blocks that contain nine nodes per block on average. Figure 4a shows the false detection percentage (Y-axis) of our method for different numbers of rounds (X-axis). It shows that the false detection rate converges to zero in the early rounds. This implies that the devised method does not require an excessive number of message exchanges to detect faulty nodes, regardless of the fault types. Figure 4b shows the corresponding results for accuracy, where the horizontal axis shows the number of rounds and the vertical axis represents the detection accuracy for permanent and intermittent faulty nodes. The detection accuracy for the permanent faulty nodes converges quickly, within the first ten rounds. The convergence of the detection accuracy for intermittent faulty nodes requires more rounds, and it reaches an acceptable level within twenty rounds. For the rest of the experiments, we run the methods for twenty rounds. Taken together, Figures 4a and 4b imply that the nodes detected in the early rounds are actually faulty (low rate of mis-detection). Then, the method gradually detects the rest of the faulty nodes and improves the accuracy.

Figure 4: The effect of number of rounds. (a) False detection. (b) Detection accuracy.


5.3. The effect of node density

The behavior of various network management methods is highly dependent on the density of the nodes. In this scenario, we analyze the performance of the devised method in networks with various node densities. In the configuration of the simulations, the number of blocks is set to 25, and 30% of the nodes are randomly chosen to have permanent faults. We study the behavior of our method for node densities ranging from 3 to 14. Figure 5a shows the results, where the horizontal axis shows the node density and the vertical axis shows the percentage of false detection. The lines represent the performance of the devised method (HFDMS), Sahoo [24], and Mahapatro [18]. The figure shows that Sahoo [24] does not perform well for densities below seven, but as the density of the nodes increases, the false detection rate starts to improve. The main reason behind the deficiency is that the method tries to find similar nodes. If it does not find any similar sensor reading, it marks a node as faulty. This leads to a high false detection rate in sparse networks. The figure also shows that Mahapatro [18] performs better at lower densities than the previous method. Its false detection rate is 13.45% for a density of three, and the rate converges to 4.5% for a density of 14. According to the method, the majority of the nodes need to agree on the sensed value. Low-density networks prevent the nodes from finding an adequate number of neighbors with similar sensed values, which leads to performance degradation. The best performance is achieved by HFDMS. It is not affected by the density of the nodes and has zero false detections. Figure 5b shows the detection accuracy of the three methods. Mahapatro [18] has the lowest performance in sparse networks, but the accuracy of the method improves gradually as the node density increases. Sahoo [24] has better accuracy in detecting faulty nodes than the previous method. For a density of 14, the method achieves 97% detection accuracy. By comparing the results, it can be inferred that Sahoo [24] reports more faulty nodes than actually exist, which increases the false detection rate but helps to improve the detection accuracy. The devised HFDMS shows 100% accuracy regardless of the node density.

Figure 5: The effect of node density. (a) False detection (%) versus density. (b) Detection accuracy (%) versus density. Legend: HFDMS, Sahoo [24], Mahapatro [18].


5.4. The performance of the HFDMS in detecting permanent faults

In this scenario, we investigate the behavior of the methods in networks with different percentages of permanent faulty nodes. The density of the nodes is set to nine, and they are distributed randomly in 25 blocks. The fault percentage ranges from 2% to 50%. Figure 6a presents the performance of the methods in terms of false detection, where the horizontal axis shows the percentage of faulty nodes. According to the figure, when the number of faulty nodes is low, all the methods have an acceptable false detection rate. As the percentage of faulty nodes grows, the performance of Mahapatro [18] and Sahoo [24] starts to degrade. The main reason is the lack of similar sensed values when the ratio of faulty nodes increases. Our devised method performs promisingly, since it is not affected by the result of a single test message. In other words, although the outcome of a test may be incorrect when most of the faulty nodes land in the same block, the mobility of the nodes and their participation in r different tests reduce the effect of an incorrect test result. Figure 6b shows the simulation results from the detection accuracy viewpoint. Both the Mahapatro [18] and Sahoo [24] methods perform well when the percentage of faulty nodes is low, but as the percentage of faulty nodes increases, the performance of Mahapatro [18] degrades significantly. It can be inferred that the performance (false detection and detection accuracy) of Mahapatro [18] is more sensitive to fault percentages above 30% than that of the other two methods. Our devised method has perfect performance. This is due to the use of the DBSCAN clustering method in discovering the real sensed value. Moreover, the effect of the test messages in different rounds is aggregated in the base station for a thorough analysis.

Figure 6: The performance in detecting permanent faults. (a) False detection (%) versus percentage of faulty nodes (%). (b) Detection accuracy (%) versus percentage of faulty nodes (%). Legend: HFDMS, Sahoo [24], Mahapatro [18].


5.5. The performance of the HFDMS in detecting hybrid faults

In another simulation scenario, we investigate the performance of the methods in detecting hybrid faults. The density of the sensors and the number of blocks are set to 9 and 25, respectively. A node can be fault-free, permanently faulty, or intermittently faulty. The probability of the two fault types is considered the same and ranges from 2% to 60%. Figures 7a and 7b show the achieved detection accuracy for intermittent and permanent faulty nodes separately, where the horizontal axis shows the total percentage of both types of faulty nodes. According to Figure 7a, our devised method does not show impressive accuracy in detecting intermittent faulty nodes when the percentage of intermittent faulty nodes is low, but as the number of faulty nodes increases, the performance starts to improve. When the number of intermittent faulty nodes is low, the Tarantula equation generates a large value as the fault probability of these nodes. This results in wrongly detecting intermittent faulty nodes as permanent faulty nodes. Figure 7b shows the superior accuracy of our devised method compared to Sahoo [24] in detecting permanent faulty nodes. As the percentage of faulty nodes increases, the performance of Sahoo [24] in detecting permanent faults weakens. The two figures indicate that Sahoo's method mis-detects the permanent faulty nodes and considers them intermittent ones when the percentage of faulty nodes increases. Figure 8 shows the false detection rate of the methods. For networks with fewer than 30% faulty nodes, both methods achieve acceptable results. In Sahoo's work, the performance starts to degrade as the number of faulty nodes increases, whereas the false detection of HFDMS is not affected by the percentage of faulty nodes. The main reason is the exploitation of K-means to differentiate the health statuses of the nodes (fault-free, permanent fault, or intermittent fault).

Figure 7: The detection accuracy in hybrid faults. (a) Intermittent faults: detection accuracy of intermittent faulty nodes (%) versus percentage of faulty nodes (%). (b) Permanent faults: detection accuracy of permanent faulty nodes (%) versus percentage of faulty nodes (%). Legend: HFDMS, Sahoo [24].


Figure 8: The false detection rate for hybrid faults. False detection (%) versus percentage of faulty nodes (%). Legend: HFDMS, Sahoo [24].

5.6. The performance of the HFDMS in detecting various data-centric faults

From the data-centric point of view, there are various types of faults. In this section, we study the behavior of our devised method with respect to these faults. We set up several simulation scenarios, in each of which the nodes are affected by only a single type of fault, namely stuck-at faults, random faults, or gain-offset faults. It is worth pointing out that our method does not classify faults by their data-centric type; it is only capable of distinguishing permanent and intermittent faults. Figure 9 shows the detection accuracy of the devised method for the three data-centric fault types: stuck-at, random, and gain-offset. Figure 9a shows the detection accuracy for the permanent faults. It reveals that the devised method has the best performance for random faults. The main reason is that the faulty sensed values are scattered in a way that the clustering algorithm considers them outliers. Therefore, the random faulty values can be easily detected. For gain-offset faults, the method achieves acceptable performance when less than 35% of the nodes are faulty, but as the percentage of faulty nodes increases, the performance starts to degrade. The key issue is that the faulty sensed values are close to each other. In other words, they are linear transformations of the actual values with a similar offset and coefficient. Consequently, as the population of faulty values increases, the density-based clustering method cannot distinguish the cluster of correct values. In the stuck-at fault case, each node reports the same value for every sensing task. That is, although the faulty sensed values of two distinct nodes may differ, each of them always reports the same sensed value. This type of fault is more likely to be detected than gain-offset faults, since the faulty sensed values do not form a dense group of close values. Consequently, the density-based clustering method is able to consider them outliers. Figure 9b shows the results of a similar experiment for the intermittent faults. When the percentage of faulty nodes is below 10%, the detection accuracy is poor for all types of data-centric faults. The detection accuracy decreases for gain-offset faults when the percentage of faulty nodes exceeds 40%. The reason is similar to the case of the permanent faulty nodes. The comparison of Figures 9a and 9b also shows that the degradation of the detection accuracy for gain-offset faults is more severe for intermittent faults, since detecting intermittent faults and detecting gain-offset faults each impose additional detection error. Figure 10 shows the false detection percentage of the devised method for the different types of data-centric faults. The lowest performance belongs to the gain-offset fault when the percentage of faulty nodes is above 40%. In these cases, several faulty sensors unanimously report the same wrong reading, and it is difficult to distinguish the correct reading via a clustering method.

Figure 9: Detection accuracy in various data-centric faults. (a) Permanent faulty nodes. (b) Intermittent faulty nodes.

Figure 10: The false detection rate for various data-centric faults.
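The three data-centric fault types discussed in this section can be sketched as simple reading generators. This is only an illustration of the fault models as described in the text: the function names, the numeric parameters, and the use of a uniform distribution for random faults are all assumptions of this sketch, since the simulation parameters of the fault injectors are not given.

```python
import random


def stuck_at(true_value, stuck_value):
    """Stuck-at fault: the node always reports the same value."""
    return stuck_value


def gain_offset(true_value, gain, offset):
    """Gain-offset fault: a linear transformation of the true value."""
    return gain * true_value + offset


def random_fault(true_value, low, high, rng):
    """Random fault: a value drawn uniformly from [low, high] (assumed range)."""
    return rng.uniform(low, high)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
true_value = 25.0        # hypothetical actual reading
stuck_readings = [stuck_at(true_value, stuck_value=31.0) for _ in range(3)]
shifted_readings = [gain_offset(true_value, gain=1.1, offset=2.0) for _ in range(3)]
random_readings = [random_fault(true_value, 0.0, 50.0, rng) for _ in range(3)]
```

Nodes that share a similar gain and offset produce nearly identical wrong readings, which is why a density-based clustering step can mistake them for a legitimate cluster once they become numerous, whereas random readings are scattered across the assumed range.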


5.7. The effect of packet loss

The last experiment investigates the effect of packet loss on the detection accuracy of permanent and intermittent faults. Figure 11 shows the results, where the horizontal axis shows the packet loss rate and the vertical axis represents the detection accuracy. Figure 11a shows that the method is resistant to up to 20% packet loss for both types of faults, and that the total detection accuracy is mostly determined by the detection of intermittent faults.

Figure 11: The effect of packet loss on detection accuracy. (a) 20 rounds. (b) 100 rounds.

In Figure 11b, we repeated the experiment with one hundred test rounds to analyze the effect of packet loss. The results show almost perfect detection for packet loss rates below 0.6. They suggest that, for networks with significant packet loss, more test rounds should be used to overcome the problem.


5.8. Complexity assessment

The last part of the performance evaluation section is dedicated to the complexity assessment of the devised method from the computational, memory, and message exchange points of view. Then, we compare the proposed method with the methods presented in [18] and [24].


5.8.1. Computational complexity

Let the average number of nodes in each block be d (d = m/n). In the first phase, the computation is performed by the static node of each block. The computational complexity of each round is equal to the complexity of the DBSCAN algorithm, which is O(d log d). If the first phase iterates for r rounds, then the complexity in each static node is O(r × d × log d). The computational complexity of the second phase, in which the base station performs the computation, is comprised of three elements. The first one is the process of calculating the number of successes and failures of each node, which is m × n. The second one is the complexity of Tarantula, and the third one is the complexity of the K-means algorithm. Equation 8 shows the computational complexity of the second phase.

\[
\begin{aligned}
O(\mathit{HFDMS}) &= O(n \cdot m) + O(\mathit{Tarantula}(m + n)) + O\!\left(n \cdot \mathit{Kmeans}\!\left(\tfrac{m}{n}, 3\right)\right) \\
&= O(n \cdot m) + O(m + n) + O\!\left(n \times 3 \cdot \tfrac{m}{n}\right)
\end{aligned}
\tag{8}
\]
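As a reading aid, the three terms of Equation 8 can be collapsed into a single bound. The step below is our own simplification rather than a result stated by the authors, and it assumes only that m ≥ 1 and n ≥ 1, so that m + n ≤ 2nm and 3m ≤ 3nm.

```latex
\[
O(n \cdot m) + O(m + n) + O\!\left(n \times 3 \cdot \tfrac{m}{n}\right)
  = O(n \cdot m) + O(m + n) + O(3m)
  = O(n \cdot m)
\]
```

Under this reading, the aggregation of the per-node success and failure counts is the dominant term of the second phase.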


5.8.2. Memory consumption

In the network model, three types of nodes are defined according to their responsibilities. The first type is the mobile nodes, which do not store any information. The second type is the static nodes of the blocks. Each of them stores a regional table that consists of m + 1 rows and three columns. They also store a vector of size three that keeps the aggregated results of the rounds in the first phase. Finally, the last type is the base station, which aggregates the regional information and stores it in a global table that keeps m + n records of five elements each. The global vector, which is the aggregation of the regional vectors, is composed of two elements as well.


5.8.3. Message exchange complexity

The message complexity reflects the energy efficiency of our method, because the transceiver is the main energy-consuming unit in sensor nodes. The message complexity of the first phase for each node is r, since each node sends its sensed value once in each round. In the second phase, the static nodes transfer the information to the base station only once. Therefore, the message complexity for the static nodes is r + 1.
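The per-node counts above translate directly into a small Python sketch. The function names are hypothetical, and the network-wide total is our own extrapolation obtained by summing the per-node counts; it ignores the final broadcast of the base station. The node counts in the usage lines are hypothetical values, with twenty rounds as used in the experiments.

```python
def messages_per_node(rounds, is_static):
    """Messages sent by one node: one per round, plus one final report
    to the base station if the node is a static node."""
    return rounds + 1 if is_static else rounds


def total_messages(mobile_nodes, static_nodes, rounds):
    """Network-wide total, summing the per-node counts (an extrapolation)."""
    return (mobile_nodes * messages_per_node(rounds, is_static=False)
            + static_nodes * messages_per_node(rounds, is_static=True))


# Hypothetical network: 225 mobile nodes, 25 static nodes, 20 rounds.
per_mobile = messages_per_node(rounds=20, is_static=False)
per_static = messages_per_node(rounds=20, is_static=True)
network_total = total_messages(mobile_nodes=225, static_nodes=25, rounds=20)
```

The sketch makes explicit that the per-node cost grows linearly with the number of rounds and does not depend on the network size.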


5.8.4. The comparison of message complexity

The message complexity is the most influential factor in most wireless sensor network applications due to energy consumption considerations. In the following, the message complexities of the methods presented in [18] and [24] are briefly reviewed and compared against our devised method. The mobility-aware diagnosis method presented in [18] consists of detection and dissemination phases. Before starting the first phase, a one-hop clustering method has to be performed (Msg_clustering). In the detection phase, each node broadcasts a hello message that encompasses diagnosis and other required information. By the end of the first phase, each node sends another message that reports its fault state and possibly its hard-fault neighbors. In the dissemination phase, a spanning tree of cluster heads is established (Msg_spt). Then, each cluster head collects the local fault views of its children and sends the aggregated report to its cluster head at the upper level (Msg_disseminate). Finally, the sink node broadcasts the global view. Equation 9 gives an approximate message complexity of the method per mobile node, given m mobile nodes.

\[
\mathit{Msg}_{complexity} = \frac{\mathit{Msg}_{clustering}}{m} + 1 + 1 + \frac{\mathit{Msg}_{spt}}{m} + \mathit{Msg}_{disseminate} \tag{9}
\]

The fault diagnosis method introduced in [24] has four phases. The message complexity of the first phase, called maintenance, is comprised of the communication for establishing a spanning tree (Msg_spt), reconnecting a mobile node (Msg_recon) with probability P_1, and parent refining by a node (Msg_ref). In the second phase, named comparison, each node periodically sends r messages to its parent. The third phase is repairing, in which the first and second phases have to be redone if the parent of a node is detected as faulty. The objective of this phase is to prevent any faulty node from becoming a parent. Let P_2 be the fault probability of a parent node; then the message complexity of this phase is P_2·(Msg_recon + Msg_ref). In the final phase (dissemination), the leaf nodes report their diagnoses to their parents. Hence, each node sends a single message. Equation 10 presents the average message complexity per mobile node, given m mobile nodes.

\[
\mathit{Msg}_{complexity} = \frac{\mathit{Msg}_{spt}}{m} + P_1 \cdot \mathit{Msg}_{recon} + \mathit{Msg}_{ref} + r + P_2 \cdot (\mathit{Msg}_{recon} + \mathit{Msg}_{ref}) + 1 \tag{10}
\]

Putting aside the messaging required to build a spanning tree and clusters, the comparison of the message complexities of the methods shows that the diagnosis task has to be performed either in multiple rounds (r) or in two broadcasts. The former case empowers the methods to detect intermittent and permanent faults, while the latter one is used only to diagnose permanent faults.


6. Conclusion


In the implementation of IoT-based applications, identifying the faulty nodes is essential to preserve the integrity and accuracy of the information. This paper addressed the research gap of detecting faulty nodes in mobile wireless sensor networks. Unlike the detection of communication failures (to transmit or not), the sensing units of the nodes may report unequal values regardless of their health status. To solve this issue, we benefit from a data mining technique that clusters the sensed values. It helps to recognize the real value among the reported sensed values. We also introduce a probabilistic method, inspired by software testing, that calculates the fault probability of each node. Furthermore, the method analyzes the probability by utilizing K-means to differentiate the timespan fault types. The devised method is decentralized and scalable; therefore, it meets the requirements of medium-sized networks. The simulation results demonstrate its low false detection rates in various network conditions. Besides, its detection accuracy for permanent faulty nodes outperforms that of similar methods, and it is not vulnerable to the node density or the percentage of faulty nodes. Almost similar results are achieved for the intermittent faults. The simulation results also reveal that the best performance is achieved in detecting random noise faults. On the other hand, the devised method has some limitations. For instance, the detection accuracy of intermittent faults is not high for networks with few malfunctioning nodes. Moreover, the method does not perform strongly in detecting intermittent gain-offset faults in networks with either a high or a low percentage of faults.

7. Acknowledgment


The authors express their appreciation to Dr. Mojtaba Vahidi-Asl for his useful comments on software testing techniques, which elevated the quality of this paper.

References


[1] F. Cicirelli, A. Guerrieri, G. Spezzano, A. Vinci, An edge-based platform for dynamic smart city applications, Future Generation Computer Systems 76 (Supplement C) (2017) 106 – 118. [2] R. S. Ransing, M. Rajput, Smart home for elderly care, based on wireless sensor network, in: 2015 International Conference on Nascent Technologies in the Engineering Field (ICNTE), 2015, pp. 1–5.

M

[3] H. Tabatabaee Malazi, M. Davari, Combining emerging patterns with random forest for complex activity recognition in smart homes, Applied Intelligence 48 (2) (2018) 315–330.

ED

[4] J. Nelis, H. Vandaele, M. Strobbe, A. Koning, F. D. Turck, C. Develder, Supporting development and management of smart office applications: A dyamand case study, in: 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), 2015, pp. 1053–1058.

PT

[5] A. Capozzoli, F. Lauro, I. Khan, Fault detection analysis using data mining techniques for a cluster of smart office buildings, Expert Systems with Applications 42 (9) (2015) 4324 – 4338.

CE

[6] A. Holzinger, C. R¨ocker, M. Ziefle, From Smart Health to Smart Hospitals, Springer International Publishing, Cham, 2015, pp. 1–20.

AC

[7] F. Colace, M. D. Santo, L. Greco, S. Lemma, M. Lombardi, V. Moscato, A. Picariello, A context-aware framework for cultural heritage applications, in: 2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems, 2014, pp. 469–476.

27

ACCEPTED MANUSCRIPT

[8] X. W. Peng, X. P. Guo, Q. B. Geng, Design of smart hotel lighting control system based on arm, in: Mechanical and Electrical Technology V, Vol. 392 of Applied Mechanics and Materials, Trans Tech Publications, 2013, pp. 347–350.

CR IP T

[9] J. Yang, Y. Han, Y. Wang, B. Jiang, Z. Lv, H. Song, Optimization of real-time traffic network assignment based on iot data using dbn and clustering model in smart city, Future Generation Computer Systems (available online) 2017. [10] B. Farahani, F. Firouzi, V. Chang, M. Badaroglu, N. Constant, K. Mankodiya, Towards fog-driven iot ehealth: Promises and challenges of iot in medicine and healthcare, Future Generation Computer Systems 78 (Part 2) (2018) 659 – 676.

AN US

[11] D.-J. Kim, B. Prabhakaran, Motion fault detection and isolation in body sensor networks, Pervasive and Mobile Computing 7 (6) (2011) 727 – 745, the Ninth Annual {IEEE} International Conference on Pervasive Computing and Communications (PerCom 2011). [12] C.-F. Cheng, K.-T. Tsai, Eventual strong consensus with fault detection in the presence of dual failure mode on processors under dynamic networks, Journal of Network and Computer Applications 35 (4) (2012) 1260 – 1276, intelligent Algorithms for Data-Centric Sensor Networks.

[13] A. Mahapatro, P. M. Khilar, Fault diagnosis in wireless sensor networks: A survey, IEEE Communications Surveys & Tutorials 15 (4) (2013) 2000–2026.

[14] D. Raposo, A. Rodrigues, J. S. Silva, F. Boavida, A taxonomy of faults for wireless sensor networks, Journal of Network and Systems Management (2017) 1–21.

[15] A. Siddiqua, S. Swaroop, P. Krishan, S. Mandal, Distance based fault detection in wireless sensor network, International Journal on Computer Science and Engineering 5 (5) (2013) 368.

[16] T. Muhammed, R. A. Shaikh, An analysis of fault detection strategies in wireless sensor networks, Journal of Network and Computer Applications 78 (2017) 267–287.

[17] S. Chouikhi, I. E. Korbi, Y. Ghamri-Doudane, L. A. Saidane, A survey on fault tolerance in small and large scale wireless sensor networks, Computer Communications 69 (2015) 22–37.

[18] A. Mahapatro, P. M. Khilar, Mobility aware distributed diagnosis of mobile ad hoc sensor networks, Networking Science 2 (1) (2013) 52–65.

[19] A. Mahapatro, A. K. Panda, Choice of detection parameters on fault detection in wireless sensor networks: A multiobjective optimization approach, Wireless Personal Communications 78 (1) (2014) 649–669.

[20] M. Panda, P. M. Khilar, Distributed soft fault detection algorithm in wireless sensor networks using statistical test, in: Parallel Distributed and Grid Computing (PDGC), 2012 2nd IEEE International Conference on, 2012, pp. 195–198.

[21] H. Yuan, X. Zhao, L. Yu, A distributed bayesian algorithm for data fault detection in wireless sensor networks, in: 2015 International Conference on Information Networking (ICOIN), 2015, pp. 63–68.

[22] P. Chanak, I. Banerjee, Fuzzy rule-based faulty node classification and management scheme for large scale wireless sensor networks, Expert Systems with Applications 45 (2016) 307–321.

[23] M. Panda, P. Khilar, Distributed self fault diagnosis algorithm for large scale wireless sensor networks using modified three sigma edit test, Ad Hoc Networks 25, Part A (2015) 170–184.

[24] M. N. Sahoo, P. M. Khilar, Intermittent fault diagnosis in dynamic topology manets, International Journal of Signal and Imaging Systems Engineering 8 (6) (2015) 345–355.

[25] H. Jarrah, N. I. Sarkar, J. Gutierrez, Comparison-based system-level fault diagnosis protocols for mobile ad-hoc networks: A survey, Journal of Network and Computer Applications 60 (2016) 68–81.

[26] M. Panda, P. M. Khilar, Energy efficient soft fault detection algorithm in wireless sensor networks, in: Parallel Distributed and Grid Computing (PDGC), 2012 2nd IEEE International Conference on, 2012, pp. 801–805.

[27] R. Maronna, R. D. Martin, V. Yohai, Robust statistics, John Wiley & Sons, Chichester, 2006.

[28] C. Titouna, M. Aliouat, M. Gueroui, Outlier detection approach using bayes classifiers in wireless sensor networks, Wireless Personal Communications 85 (3) (2015) 1009–1023.

[29] P. Chanak, I. Banerjee, Fuzzy rule-based faulty node classification and management scheme for large scale wireless sensor networks, Expert Systems with Applications 45 (2016) 307–321.

[30] A. Pruteanu, S. Dulman, K. Langendoen, Ash: Tackling node mobility in large-scale networks (2010) 144–153.

[31] D. T. Wagner, A. Rice, A. R. Beresford, Device analyzer: Large-scale mobile data collection, SIGMETRICS Perform. Eval. Rev. 41 (4) (2014) 53–56.

[32] I. Banerjee, P. Chanak, H. Rahaman, T. Samanta, Effective fault detection and routing scheme for wireless sensor networks, Computers & Electrical Engineering 40 (2) (2014) 291–306.

[33] F. Bai, N. Sadagopan, A. Helmy, Important: a framework to systematically analyze the impact of mobility on performance of routing protocols for adhoc networks, in: INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies, Vol. 2, 2003, pp. 825–835.

[34] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, et al., A density-based algorithm for discovering clusters in large spatial databases with noise, in: KDD, Vol. 96, 1996, pp. 226–231.

[35] E. Schubert, J. Sander, M. Ester, H. P. Kriegel, X. Xu, Dbscan revisited, revisited: Why and how you should (still) use dbscan, ACM Trans. Database Syst. 42 (3) (2017) 19:1–19:21.

[36] J. A. Jones, M. J. Harrold, Empirical evaluation of the tarantula automatic fault-localization technique, in: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, ASE '05, ACM, New York, NY, USA, 2005, pp. 273–282.
