Energy conservation in WSN through multilevel data reduction scheme




Accepted Manuscript

Energy Conservation in WSN through Multilevel Data Reduction Scheme

Muruganantham Arunraja, Veluchamy Malathi, Erulappan Sakthivel

PII: S0141-9331(15)00079-4
DOI: http://dx.doi.org/10.1016/j.micpro.2015.05.019
Reference: MICPRO 2234

To appear in: Microprocessors and Microsystems

Please cite this article as: M. Arunraja, V. Malathi, E. Sakthivel, Energy Conservation in WSN through Multilevel Data Reduction Scheme, Microprocessors and Microsystems (2015), doi: http://dx.doi.org/10.1016/j.micpro.2015.05.019

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Muruganantham ARUNRAJA1, Veluchamy MALATHI2, Erulappan SAKTHIVEL3
1 Research Scholar, Department of Electrical and Electronics Engineering, Anna University, Regional Centre, Madurai.
2 Professor, Department of Electrical and Electronics Engineering, Anna University, Regional Centre, Madurai.
3 Research Scholar, Department of Electrical and Electronics Engineering, Anna University, Regional Centre, Madurai.
Email: [email protected]; [email protected]; [email protected]

ENERGY CONSERVATION IN WSN THROUGH MULTILEVEL DATA REDUCTION SCHEME

Abstract: Lifetime is one of the major Quality of Service factors for Wireless Sensor Networks (WSNs). As sensor nodes are generally battery-powered devices, the network lifetime can be extended over a reasonable time span by lessening the energy consumption of the nodes. Reducing the amount of data transmission can effectively minimize the energy consumption, the bandwidth requirement and network congestion. In a WSN, denser deployment of nodes results in a high spatial correlation between data generated by neighboring nodes, and the slowly varying nature of many physical phenomena results in similar sensor observations over time. In the proposed work, a two-level data reduction technique is employed. A Data and Energy Aware Passive (DEAP) clustering approach is introduced to divide the sensor network into data-similar clusters, and Dual Prediction (DP) based reporting is deployed between cluster members and their Cluster Head (CH). This first level of data reduction exploits the temporal correlation of the data over time. At the CHs, the data from multiple data-similar nodes are aggregated to reduce the spatial data redundancy. The proposed method, DEAP-DP, is verified with real-world data and achieves up to 68% data reduction at 0.5°C error tolerance.

Keywords—Wireless Sensor Network, Data Reduction, Data-aware clustering, Dual Prediction, Data Aggregation.

1. Introduction

Wireless sensor nodes now cover significant portions of the earth's surface. Typically, a WSN is a collection of spatially distributed autonomous sensor nodes deployed for monitoring physical parameters like temperature, pressure, humidity, sound, vibration, seismic events, etc. [1]. The monitored parameters are reported to the Base Station (BS) through the networked architecture. A sensor node is a miniature device with some essential components: a) a sensing unit, for acquiring data from the surrounding environment; b) a processor, for data processing and transitory storage; and c) an RF transceiver, for transmitting the processed data [2]. Among these, the RF transceiver is the major consumer of nodal energy. WSNs are preferred for a wide variety of applications due to their small form factor and mobility-compatible nature. Thus, it is impractical to equip them with a large power source. The power source often consists of a battery with a limited energy budget, which results in a finite lifetime of nodes. In addition, it may be impossible or inconvenient to recharge the battery, because nodes may be deployed in a hostile or impractical environment. In the era of IoT [3], sensor nodes feed the data-hungry internet servers, so a long life span and optimal quality of collected data are both essential. On one hand, the sensor network should

have a lifetime long enough to fulfill the application requirements. On the other hand, data accuracy and latency are major issues in WSNs. Thus the data gathering process should conserve the residual energy of the nodes while maintaining optimal data accuracy. Hence, by exploiting the inherent properties of WSN data, energy can be conserved with minimal impact on the accuracy of the collected data. Tobler's first law of geography [4] states that "Everything is related to everything else, but near things are more related than distant things". This statistical observation implies that data correlation increases with decreasing spatial separation of sensor nodes [5]. In a WSN, the nodes are densely placed and transmit data at high sampling rates, which results in a high level of spatial and temporal correlation; hence a vast amount of the transmitted data is redundant. By reducing this redundant or similar data, a significant amount of nodal energy can be conserved, along with bandwidth savings. In a WSN, both the network structure and the data collection methodology decide the energy expenditure. Hierarchical structures are preferred over flat networks due to their improved reliability and energy conservation, and clustering is the prominent hierarchical architecture. By creating data-similar clusters, spatially redundant data can be excluded. Higher sampling rates can achieve accurate and microscopic data collection, but not all the sampled data are worth transmitting. By avoiding unnecessary

data transmission, huge energy savings can be realized. The major contributions of the proposed work are the definition of an energy-efficient intra-cluster data communication scheme through adaptive nLMS-based dual prediction, and the construction of data-similar clusters with minimal communication overhead. The proposed work constructs data-similar clusters using a passive clustering approach. Here the high-energy nodes elect themselves as CH after an energy-driven proclamation delay. The proclamation includes the features of the node's recent data history. Based on the proclaimed information, the neighboring nodes estimate their data similarity with the proclaiming node and associate with the data-similar CH. Using an adaptive nLMS-based filter, the cluster members communicate their data to the CH. At the CH, spatially redundant data are aggregated and transmitted to the BS. The proposed work is intended to conserve the energy of continuous data collection systems without significant loss in data accuracy. As most such systems are densely deployed and the phenomena under measurement are slowly varying, the proposed work can achieve a significant drop in nodal energy expenditure. The energy saving increases with increased temporal error tolerance. The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 explains the proposed methodology. The experimental setup is explained in Section 4. Simulation results are discussed in Section 5. Finally, Section 6 concludes our work.

2. Related works
The problem of WSN lifetime maximization, in general, has been addressed in several studies. Anastasi et al. [2] listed multiple approaches for saving energy. Since radio communication is the dominant power-consuming activity in a WSN, minimizing the amount of data transmitted can achieve a noteworthy level of energy conservation. We divide the previous literature into two main categories. In the first, we discuss various energy-aware clustering methods and their issues; in the second, we discuss different energy-efficient data collection methods and highlight the importance of dual prediction.

Clustering is the most significant and widely used class of hierarchical routing algorithms. Based on the parameters used for CH election and the cluster formation criteria, clustering algorithms can be classified as probabilistic and non-probabilistic. In probabilistic clustering algorithms, each sensor node is provided with a node ID or some probability value to decide the initial CH. Better network lifetime and energy consumption can be achieved by determining the appropriate probability value for clustering the sensor nodes. LEACH [6] is a prominent probabilistic clustering algorithm, where the CH is elected based on a random probability and the neighboring nodes join the nearest CH. HEED [7] is also a probabilistic clustering algorithm that uses a distributed scheme for cluster formation. Here CHs are selected based on a node's residual energy, node degree or proximity to its neighbors. EECS [8] is a distributed clustering method similar to LEACH, with certain improvements in CH selection and cluster formation. Unlike HEED, in EECS a single-hop communication pattern is retained between the CHs and the BS. In non-probabilistic clustering algorithms, the criteria for CH election and cluster formation are based on a node's nearness parameters like degree, distance, connectivity, etc. Here the cluster formation requires extensive communication between the nodes and their neighbors. Highest Connectivity Clustering [9] is a non-probabilistic method that proceeds in two phases, namely tree discovery and cluster formation. Each sensor node broadcasts its number of neighbor nodes to indicate its connectivity. The sensor with the highest connectivity, i.e. the maximum number of one-hop neighbors, is elected as CH, and the clustering process can be started by any node in the WSN. The Biologically Inspired Clustering Algorithm [10] uses swarm agents in a stochastic algorithm to obtain near-optimal solutions to complex, nonlinear optimization problems.
Separation and alignment swarms are used to achieve a stable and near-uniform distribution of the CH nodes. The Weight-Based Clustering Algorithm [11] is a non-probabilistic clustering algorithm that elects the CH non-periodically and utilizes a distributed scheme for cluster formation. In order to save power, a new CH is elected when a sensor loses the connection with its CH. The parameters used in the election procedure are the ideal node degree, transmission power, mobility and the remaining energy of the nodes. The probabilistic clustering methods have the advantages of low clustering overhead and fast convergence, but suffer from inefficient cluster formation and reduced cluster reliability. The non-probabilistic clustering methods achieve highly

reliable and well-balanced clusters, but at the cost of high control-message overhead and long convergence time. The proposed approach is a non-probabilistic clustering method with minimal control overhead, as it uses a passive clustering technique. The convergence time is also reduced with the help of an energy-aware back-off delay. Hence the proposed method is reliable and highly energy-balanced. There is a new genre of clustering algorithms that considers the data similarity between neighboring nodes. CAG [5] utilizes the correlation to form clusters in a tree-structured sensor network. Since the clusters are data-similar, the CH's data alone represent the entire cluster. Since adjacent nodes are not aware of each other's status, there is a high probability of redundant cluster formation. In CAG, during the cluster adjustment phase, existing clusters can only be bifurcated and cannot be merged. This leads to the formation of clusters with one member, so after a while all sensor nodes become CHs. To cope with this, DACA [12] proposes a data- and energy-aware algorithm where clusters can be merged. Still, in both methods, the error tolerance between the CH and its members cannot be guaranteed. DACA includes the node degree factor in CH election and hence increases the control overhead. The proposed DEAP uses passive clustering, through which the clustering overheads are reduced significantly. Being a deterministic approach, the method distributes the clusters more efficiently. The data-aware nature of the proposed clustering helps in achieving highly efficient data aggregation at the CH. Among the three major energy conservation techniques, both the mobility and duty-cycling methods are unaware of the inherent data redundancy; thus data-driven methods are preferable for data-correlated networks. Based on the problems and applications addressed, data-driven methods [13] are classified into data compression, in-network processing and data prediction.
The data-driven approaches aim at reducing the energy spent by the communication subsystem and the amount of data to be delivered to the sink node. Since the data generated by a WSN have a high level of spatial and temporal similarity, methods that exploit these similarities can serve better towards effective energy conservation without affecting the accuracy of the collected data. Xiong et al. [14] use distributed source coding to encode the correlated data independently at the source nodes and decode them jointly at the destination, but this requires prior knowledge of the precise correlation of the data. Compressed sensing [15,16] methods can

compress the data using a small number of non-adaptive, randomized linear projection samples and recover them without any loss, provided the data have sufficient sparsity. The assumption of constant sparsity in the sensed data in conventional CS is unrealistic for most natural signals, which may cause reconstruction quality degradation. To reduce data while traversing the network towards the sink, in-network processing [17] performs data aggregation between the sources and the sink. In data prediction methods [18], instead of transmitting the whole data, a model describing the data progression is built. Within certain error limits, the model can predict the sensed value, and it resides both at the source nodes and at the sink. In case of inaccurate predictions, the deviated data are transmitted from the sensor node to the sink and the model is updated. There are numerous prediction-based reporting schemes reported in earlier works. Deshpande et al. [19] utilize the combination of a statistical model and live data acquisition. BBQ uses a multivariate Gaussian joint distribution to capture the correlation between sensor readings; this correlation is used to estimate the readings of non-sampled nodes from sensed data. But such methods suffer from an expensive, long training phase, the need for historical data over a long time span, and continuous model updating to ensure correctness of the model. Jain et al. [20] used a Dual Kalman Filter (DKF) for prediction, which needs a prior model of the system, and the optimality of the Kalman filter fully relies on the perfection of that model; system modeling is again an extensive overhead. In [21], ARIMA-based methods are used to predict the sensor data from previous values. ARIMA models are effective in predicting correlated time series; however, they need a large amount of historical data, are thus computationally expensive, and are poor at predicting series with turning points. Santini et al. [22] proposed a dual-prediction-based data reduction.
This approach exploits the Least Mean Squares (LMS) adaptive algorithm. Nicholas et al. [23] proposed an LMS filter implemented on an FPGA. Both approaches use static filter parameters for data prediction, but efficient filter performance necessitates an adaptive step size and an adaptive filter length. Ernst et al. [24] proposed a fast-lane approach for identifying optimal values of the step size and filter length, where multiple lanes simultaneously predict data with different step sizes and filter lengths. This system involves high computational overhead and may not be suitable for simultaneous predictions over multiple

data sets. Low-end wireless sensor nodes cannot cope with such computational requirements. Stojkoska et al. [25] proposed a variable step size LMS (VSS-LMS) algorithm, where the step size is changed only during mode changes of the prediction filter. Since the step size does not adapt to the state of convergence, it results in deficient data reduction; the filter length is also kept constant irrespective of changes in the data dynamics. In this work we use an nLMS filter that adapts its step size based on the prediction error, which results in faster convergence and slower deviation. The normalization also helps in minimizing the RMSE of the reported data. Being lightweight, the proposed algorithm also has a much lower computational cost.

3. Methodology
In the proposed work, an energy-aware, non-probabilistic passive clustering method is used for establishing the cluster-based aggregation framework. The process involves the selection of high-energy nodes as CHs. Around the CHs, clusters are formed by associating data-similar neighbors. The cluster members communicate with the CH through the Dual Prediction Framework (DPF), which is based on an adaptive nLMS filter. In our DEAP-DP approach, the temporally correlated data are filtered out through dual prediction, thus reducing the intra-cluster communication. The spatially correlated data are filtered out at the CH through data aggregation, thus reducing the inter-cluster communication.

3.1 Network model
The network consists of battery-powered, stationary, homogeneous sensor nodes and a BS. Sensor nodes are randomly distributed over the region of interest. The network continuously collects data from the environment and communicates them to the BS. The clusters are formed in such a way that all the cluster members can reach the CH in one hop, and the CHs can reach the BS, located at the center of the region, either in one hop or in multiple hops through a network backbone that consists only of CHs. The sensor nodes transmit their sensed data to their CHs, where the data are spatially aggregated and transmitted to the BS.

3.2 CH election
In this work, the energy-aware DEAP-DP clustering protocol is introduced. DEAP-DP initiates CH selection with an energy-aware selection method. Serving as a CH is an energy-expensive role, so in order to achieve energy balancing, higher-energy nodes are assigned the CH role. To select the highest-energy node of a region, the residual energy of every node in that region would have to be exchanged, so that a node could compare and identify the highest-energy node of its neighborhood. In an energy-scarce network, exchanging this residual-energy information is itself an expensive task. Thus DEAP-DP follows a passive clustering mechanism: each node can announce itself as a CH after a certain delay, and the CH announcement delay of a node is made inversely proportional to its residual energy. Hence the node with the highest energy among its neighbors announces first and is selected as CH.
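As a rough illustration, the energy-driven back-off rule can be sketched as follows; the constants T_MAX and E_MAX and the linear mapping are assumptions made for this sketch, not values taken from the paper.

```python
T_MAX = 1.0   # maximum back-off window in seconds (assumed)
E_MAX = 100.0 # nominal full battery capacity in joules (assumed)

def proclamation_delay(residual_energy: float) -> float:
    """Delay inversely related to residual energy: the highest-energy
    node in a neighborhood fires first and becomes CH; the others hear
    its announcement and withdraw from the contention."""
    residual_energy = max(min(residual_energy, E_MAX), 0.0)  # clamp to [0, E_MAX]
    return T_MAX * (1.0 - residual_energy / E_MAX)
```

Under these assumptions a node with 90 J waits 0.1 s while one with 40 J waits 0.6 s, so the higher-energy node proclaims first without any explicit energy exchange.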

3.3 Cluster Formation
The clustering is effective only if the formed cluster is an iso-cluster, in which all the cluster members generate similar data. In order to create an iso-cluster, the proclaiming node advertises the mean and standard deviation of its recent data history, which describe the distribution of its data. The other nodes that receive the proclamation measure the distances between their own mean, lower spread and upper spread and those of the proclaimed node. These distances give a clear picture of the correlation between the proclaimed node's data series and the sensor node's data series. If the distances are less than the user-defined threshold, the sensor node accepts the proclaimed node as its CH and refrains from the contention. Let x be the data of the proclaimed node and y the data of the receiving node. The mean distance is computed using the formula given below:

Mean distance = |x̄ − ȳ| = |(1/n) Σ x_i − (1/n) Σ y_i| ………(1)

where n is the number of elements in the recent data history. The standard deviation of the past data series is computed using the following formula:

σ_x = sqrt( (1/(n−1)) Σ (x_i − x̄)² ) ………(2)

where x̄ is the mean of data series x.

The upper bound of the data distribution is given as

U_x = x̄ + σ_x ………(3)

The lower bound of the data distribution is given as

L_x = x̄ − σ_x ………(4)

The two sensor nodes x and y are data-similar only if

abs(x̄ − ȳ) < δ_th and abs(U_x − U_y) < δ_th and abs(L_x − L_y) < δ_th ………(5)

where δ_th is the data difference threshold of the DEAP clustering method. Thus the comparatively higher-energy nodes are selected as CHs, and the nodes associate with the data-similar CH with minimal control overhead. The method also avoids two CHs ending up in close proximity, as neighbors refrain from becoming CH after hearing the announcement of a higher-energy neighbor. The CH nodes, instead of communicating directly, use a multi-hop path to reach the BS; this multi-hop path consists only of CH nodes. Thus the DEAP-DP method achieves energy-balanced iso-clusters at minimal control overhead.

3.4 Dual prediction
As indicated in [26] and observed in real-world sensor data, most physical phenomena under measurement show a good level of temporal correlation. This temporal correlation adds a significant amount of redundant data over the time span. By developing an efficient correlation model, the data from a node can be predicted beforehand. The DEAP-DP work implements a dual-prediction-based reporting scheme between cluster members and the CH. The cluster member and CH, after a few rounds of data transmission, start to predict the next data from previous data. In this work, the temporal correlation of the sensor data is used to construct a linear prediction engine that runs in both the source node and the CH synchronously. If the prediction is in line with the desired value, the prediction is said to be accurate. Only if the data deviates from the actual data does the cluster member transmit the data to the CH. Based on the accuracy of the prediction method, a considerable amount of data transmission between the cluster members and the CH can be avoided; the un-transmitted data are replaced by the predicted data.
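The similarity test of Eqs. (1)–(5) can be sketched in a few lines of Python; the window contents and threshold used in the usage note are illustrative.

```python
from statistics import mean, stdev

def is_data_similar(x, y, delta_th):
    """Two nodes are data-similar (Eq. 5) when the distances between
    their means, upper spreads and lower spreads all fall below delta_th."""
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)   # sample std dev, n-1 denominator as in Eq. (2)
    ux, uy = mx + sx, my + sy     # upper bounds, Eq. (3)
    lx, ly = mx - sx, my - sy     # lower bounds, Eq. (4)
    return (abs(mx - my) < delta_th and
            abs(ux - uy) < delta_th and
            abs(lx - ly) < delta_th)
```

For example, two nodes reporting [20, 21, 22] and [20.5, 21.5, 22.5] pass the test at δ_th = 1.0, since their means and spreads differ by only 0.5.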

The dual prediction method involves the cluster member and the CH maintaining the same length of data history and a prediction engine to forecast future data. As shown in Fig. 1, the prediction process takes place at both the cluster member and the CH simultaneously, and both predictor outputs are identical. At every data collection round, the cluster member compares the predicted output with the actual data. If the predicted and actual data are nearly equal, then no data is transmitted from the cluster member to the CH; the CH adds its predicted data as the sensor data of that instant. If the predicted data and the actual data differ by more than a threshold value, then the data is communicated to the CH. In this DPF, the CH node, instead of relying on direct communication, employs a time series model to predict the local readings of the cluster members with certain accuracy. Hence the amount of communication between cluster members and the CH is reduced and energy-expensive periodic radio transmission is avoided. The main objective of this approach is to transmit only a subset of all samples. In this model, each sensor node has a prediction model that is trained from the history of past sensor measurements. Prediction involves multiple rounds of data processing, and it is not a static process, as the data pattern is not static. With their ability to adapt to data pattern changes and their flexible implementation on a processor, adaptive filters are used for the prediction. The prediction engine quickly adapts itself to changes in the data pattern over time. Whenever the data deviates beyond the threshold value for a certain number of consecutive rounds, the prediction engine adjusts its parameters to cope with the change in the data pattern. The filter parameters' adaptation is in line with the magnitude of the deviation between actual and predicted data. In DPF, the prediction processes take place in both the source and the sink node simultaneously, using identical filters.
The source node transmits data only if the prediction deviates from the desired data by more than the threshold value e_th. Hence the amount of energy-expensive periodic radio transmission is reduced. The structure of the DPF is shown in Fig. 1. In DPF, there are three distinct modes of operation: initialization, normal and standalone. A node goes through the initialization mode only at the beginning and then switches between the

normal and standalone modes. During the initialization mode, at every sampling instant t, the source node transmits the observed data to the sink. In parallel, the source node constructs a temporal prediction model using an adaptive filter. The energy consumption of this mode is given as

E_init = E_tx + E_rx + E_pred ………(6)

where E_tx and E_rx are the transmission and reception energies of the source and sink respectively, and E_pred is the prediction energy incurred by the source during model construction. The prediction is said to be converging if the deviation is less than the error threshold e_th for M consecutive predictions. The source then switches to standalone mode by communicating the prediction model to the sink. During the standalone mode, energy is saved as the source does not report its readings to the sink. In standalone mode, at each sampling instant t, the source and sink predict the data using the prediction model based on the data history. Along with prediction, the source node still collects data and compares the actual sensed value with the value predicted by the filter. If the deviation is less than the threshold, the filter model is assumed to be accurate for that time instant. Hence the filter is fed with the prediction y[t] instead of the sensed value x[t]. Similarly, the sink predicts a value based on the model and uses it as an approximation of the actual observation for that time instant. The energy consumption of the dual prediction is given as

E_sa = 2 E_pred ………(7)

If the error exceeds e_th, the node switches to normal mode. During normal mode, the data are transmitted to the sink. The prediction engine at the source adjusts the weight values towards the convergence of the prediction with the desired value. Once the prediction has converged, the node switches again to standalone mode. The energy consumption during normal mode is given as

E_norm = E_tx + E_rx + E_pred ………(8)
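The three-mode behavior of the source node can be sketched as a small state machine; the filter is abstracted behind `predict`/`update` callbacks, and the threshold E_TH and convergence count M are illustrative values, not the paper's parameters.

```python
E_TH = 0.5   # error threshold e_th (assumed, e.g. 0.5 degC)
M = 4        # consecutive accurate predictions required to leave init mode

def run_source(samples, predict, update):
    """Return how many samples the source actually transmits."""
    mode, streak, transmitted = "init", 0, 0
    for x in samples:
        err = abs(x - predict())
        if mode == "init":
            transmitted += 1               # always report while training
            update(x)
            streak = streak + 1 if err < E_TH else 0
            if streak >= M:
                mode = "standalone"
        elif mode == "standalone":
            if err < E_TH:
                update(predict())          # feed prediction back, radio stays off
            else:
                transmitted += 1           # deviation: report and retrain
                update(x)
                mode = "normal"
        else:                              # normal: transmit until reconverged
            transmitted += 1
            update(x)
            if err < E_TH:
                mode = "standalone"
    return transmitted
```

With a trivial last-value predictor and a constant stream, only the first few training rounds are transmitted; a later jump in the data triggers a brief return to normal mode before the node falls silent again.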

3.5 nLMS filter
While effectively reducing power consumption, most adaptive filters proposed so far need to rely on a priori knowledge to correctly model the expected values. The proposed approach instead employs an algorithm that requires no prior modeling, allowing nodes to work independently and without using global model parameters. In this paper, a data-reduction strategy that exploits the LMS algorithm is presented. Basically, the LMS algorithm is a stochastic gradient algorithm, which means that

the gradient of the error performance surface with respect to the free parameter vector changes randomly from one iteration to the next. This stochastic property, combined with the presence of nonlinear feedback, makes a detailed convergence analysis of the LMS algorithm a difficult mathematical task. The LMS is an adaptive algorithm with very low computational overhead and memory footprint that, despite its simplicity, provides excellent performance. Hence, this scheme can be applied to a variety of real-world phenomena without restrictions. Moreover, the LMS algorithm does not require nodes to be assisted by a central entity for performing prediction, since no global model parameters need to be defined. Due to these characteristics, the proposed approach can be easily integrated with a variety of existing data collection approaches, including schemes that support in-network data aggregation. An LMS-based DPF is discussed in detail in [22]. In this work, a variant of the LMS algorithm, the normalized LMS (nLMS), is used for data prediction. The nLMS-based prediction filter samples the data stream x over a length N at instant k, denoted x[k]. The prediction filter delays the current input value x[k] by one time instance and uses it as the reference signal d[k]. The filter calculates the prediction as a linear combination of the previous N samples of the data stream, weighted by the corresponding weight vector w[k]. The block diagram of the nLMS prediction filter is shown in Fig. 2. The algorithm predicts the future value of the parameter by comprehending the data pattern over the time span. By following the historical data values, the algorithm assigns a weight to each sample, and based on these weights the future value is predicted.

y[k] = w^T[k] x[k] ………(9)

where y[k] is the predicted output at instant k. The output is a linear combination of the last N samples of the input signal x.
Each input sample is weighted by the respective filter coefficient w.

e[k] = d[k] − y[k] ………(10)

where e[k] is the error between the predicted value and the desired value d[k] at instant k.

w[k+1] = w[k] + µ x[k] e[k] ………(11)
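A minimal one-step-ahead predictor following Eqs. (9)–(11), with the step size normalized by the input power as in the nLMS variant, can be sketched as below; the tap count and step size are illustrative defaults, not the paper's tuned values.

```python
class NLMSPredictor:
    def __init__(self, n_taps=4, mu=0.5, eps=1e-8):
        self.w = [0.0] * n_taps   # weight vector w[k]
        self.x = [0.0] * n_taps   # last N samples of the input stream
        self.mu = mu
        self.eps = eps            # guards against division by zero

    def predict(self):
        # y[k] = w^T[k] x[k], Eq. (9)
        return sum(wi * xi for wi, xi in zip(self.w, self.x))

    def update(self, d):
        y = self.predict()
        e = d - y                                         # Eq. (10)
        power = sum(xi * xi for xi in self.x) + self.eps  # input power, normalization
        step = self.mu / power
        self.w = [wi + step * xi * e                      # normalized Eq. (11)
                  for wi, xi in zip(self.w, self.x)]
        self.x = [d] + self.x[:-1]                        # shift in the new sample
        return e
```

Feeding the predictor a slowly varying stream drives the prediction error down within a few dozen updates, after which the dual-prediction scheme can stop transmitting.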

The new weight vector is derived from the previous one in steps of size µ. Larger values of µ lead to faster convergence but unstable filter prediction, whereas smaller values of µ lead to stable filter prediction but slower convergence.

3.6 Adaptive nLMS filter
In the proposed method, the step size is calculated on every new prediction from the input values. During normal mode, the maximum value of the step size is selected, chosen such that there is the least instability in prediction. During standalone mode, in order to avoid fluctuations, the step size is set at the minimum possible value.

0 ≤ µ ≤ 1/E_x ………(12)

E_x = (1/N) Σ x²[i] ………(13)

Here the step size µ can be set between 0 and 1/E_x, where E_x indicates the mean input power. The value of µ decides the speed of convergence and the deviation of the prediction engine. The convergence of the nLMS filter largely depends on the step size and the filter length. In this paper, an algorithm that dynamically computes the step size based on the input values is proposed. The step size adapts itself to the changes in the input values. This method is nLMS with step size adaptation; we denote it adaptive nLMS. The step size computation varies between the different modes of the adaptive filter, which leads to faster convergence in normal mode and less deviation in standalone mode. The adaptive filters update the weights of the data elements in step sizes µ based on the error value. Though nLMS normalizes the step size at every iteration, the speed of convergence is constant across the different states of the DPF. The proposed method improves the efficiency of the nLMS filter by applying an adaptive step size that controls the convergence based on the error magnitude. The value of the step size decides the weight change, and hence the speed of convergence: the higher the step size, the faster the convergence, and vice versa. A larger step size may also lead to fluctuations around the convergence point. Here we control the value of the step size µ by a variable D. From equations (11), (12) and (13),

µ = (1/Ex)/D ………………………… (14)

D = C/e during initialization and normal modes, where C is a constant.

Here D is a non-zero integer. During normal mode, the value of D is inversely proportional to the error magnitude. The higher the error, the smaller D becomes, which results in a larger step size and enables faster convergence. The smaller the error, the larger D becomes, which results in a smaller step size and reduces fluctuations around the convergence point. During standalone mode, the filter does not update the weight values, due to the absence of the desired value: the filter predicts new data based on previously predicted data, and the observed data are not updated. Hence, due to the absence of d[k] in standalone mode, the error tends to zero. From eq. (11), w[k+1] = w[k] + µx[k]e[k] ………. (15), so w[k+1] = w[k] and µ has no significance. Step size computation adds some overhead, but this is compensated by the faster convergence.
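The mode-dependent step size control described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name and the constant C = 1 are our assumptions, D is clipped to the [2, 20] range the paper reports for normal mode, and for numerical stability the update normalizes by the instantaneous input energy rather than the mean power of eq. (13).

```python
import numpy as np

class AdaptiveNLMS:
    """Sketch of the adaptive step size nLMS predictor (normal mode).

    C, d_min and d_max are illustrative assumptions; the paper only
    reports that D varies between 2 and 20 during normal mode.
    """

    def __init__(self, length=4, C=1.0, d_min=2, d_max=20):
        self.w = np.zeros(length)          # filter weights w[k]
        self.C, self.d_min, self.d_max = C, d_min, d_max

    def predict(self, x):
        # Prediction y[k] from the last `length` observations.
        return float(np.dot(self.w, x))

    def update(self, x, d):
        """One normal-mode round: predict y[k], then adapt the weights."""
        y = self.predict(x)
        e = d - y                          # prediction error e[k]
        # D = C/e kept as a non-zero integer (eq. 14): a large error gives
        # a small D, hence a large step and fast convergence; a small error
        # gives a large D, hence a small step and few fluctuations.
        D = int(np.clip(self.C / max(abs(e), 1e-9), self.d_min, self.d_max))
        # Normalized weight update, scaled down by 1/D.
        self.w += (e / (np.dot(x, x) + 1e-9)) * x / D
        return y, e
```

In standalone mode, the node would call `predict` on previously predicted samples without calling `update`, matching eq. (15), where the weights stay frozen.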

4. Experimental Setup

Here PIC microcontroller [27] based wireless nodes are used for experimental verification. The processor is interfaced with XBee radio modules operating in the 2.4 GHz frequency range. Instead of measuring local temperature values, the nodes are programmed with the Intel lab data [28]; the data are stored in the processor's data EEPROM during IC programming itself. The nodes periodically output the sensor data along with predictions. The architecture operates at 5 V, which helps in reducing the computation cost of the node. It is an 8-bit Harvard architecture that supports numerous arithmetic operations in a single cycle, and the processor operates at up to 20 MHz. The processor has 8 analog channels with 10-bit resolution, and the unit also incorporates a USART that supports high-speed serial communication in both synchronous and asynchronous modes. The whole unit is powered by a 12 V, 1.2 Ah battery. At the receiving end, a PC with LabVIEW software is used for plotting data and energy expense.

5. Result and Discussion

The performance of the proposed DEAP-DP work is evaluated in two stages. In the first stage, we evaluate the adaptive nLMS based dual prediction framework (DPF) using the benchmark data sets available in [28]. The performance of the proposed DPF is evaluated on three aspects: the percentage reduction in data transmission, the mean deviation between the predicted and the observed data, and the number of model reconstructions. We initially evaluate the performance of the proposed system on different temperature data sets with varying correlations. The proposed work is also compared with other data prediction approaches in terms of data reduction and data accuracy. In the second stage, the DPF is applied to our proposed Data and Energy Aware Passive (DEAP) clustering scheme, and the performance of both the proposed clustering scheme and the DPF algorithm is evaluated together by comparison with equivalent clustering approaches on the MATLAB platform. The performance is evaluated in terms of node lifetime, the average energy of the network and the amount of data transmitted. To investigate the performance of DEAP-DP on large-scale networks, we generate large traces of a spatially correlated data set based on a mathematical model proposed in [29], whose model parameters are extracted from small-scale real data sets [28]. Here we applied the DPF to a real-world temperature data set with 1000 data elements and compared it with periodic data transmission and dual prediction enabled data transmission. With the introduction of adaptive nLMS, the total transmission is reduced from 1000 rounds to 97 rounds. Fig. 3 shows the instances of data transmission from the source node to the sink node. The data are reconstructed with an error boundary of 0.25°C, as shown in Fig. 4. In adaptive nLMS, the value of D varies between 2 and 20 during normal mode. Since the step size is adjusted adaptively, the overshoot near the convergence point is reduced. The energy consumption of adaptive nLMS is only 10% of that of conventional data gathering.
The percentage data reduction of adaptive nLMS is compared with the LMS [22], VSS-LMS [25] and ARIMA [21] schemes, as shown in Fig. 5. The transmission percentage is almost one fifth of that of [22] and [25], and half of that of [21]. Normalization and adaptation of the step size are the major contributors to this substantial data reduction. The data reduction percentages for different node data with various e_th values are shown in Fig. 6. The data reduction is a function of the error threshold and the dynamicity of the data. The data reduction decreases in a dynamic environment to preserve the desired accuracy level of

the predicted signal. With an increased error threshold, the data reduction also increases. The prediction accuracy of a filter is evaluated by measuring the Root Mean Square Error (RMSE) of prediction. The root-mean-square error (RMSE), also called the root-mean-square deviation (RMSD), is a frequently used measure of the differences between the values predicted by a model or an estimator and the values actually observed. Basically, the RMSD represents the sample standard deviation of the differences between predicted and observed values. These individual differences are called residuals when the calculations are performed over the data sample that was used for estimation, and prediction errors when computed out-of-sample. The RMSD serves to aggregate the magnitudes of the errors in predictions for various times into a single measure of predictive power. RMSD is a good measure of accuracy, but only for comparing the forecasting errors of different models for a particular variable, not between variables, as it is scale-dependent.

RMSE = √((1/n) Σᵢ₌₁ⁿ (ŷᵢ − yᵢ)²) … … … … … … … … …(18)

where ŷᵢ is the predicted value and yᵢ the observed value.
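As a concrete reading of eq. (18), the RMSE of a prediction series can be computed as follows. This is a generic sketch; the function and argument names are ours.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error of a prediction series, per eq. (18)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
```

Applied in-sample the summed differences are residuals; applied to fresh observations they are the prediction errors evaluated in this section.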

The adaptive nLMS outputs a prediction close to the original data, with minimal deviations. The RMSE of adaptive nLMS is compared with [21], [22] and [25] for different node data, as shown in Fig. 7. In VSS-LMS, the RMSE is inversely proportional to the data reduction with respect to LMS, as shown in Fig. 5. However, alongside its massive reduction in data transmission, adaptive nLMS shows only a slight rise in RMSE over LMS and VSS-LMS. Fig. 8 shows the RMSE of 5 different node data sets with the error threshold varying from 0.25°C to 1.25°C. The RMSE is only half of the error threshold, and increases with increased e_th. Fig. 9 shows the frequency of model reconstruction with respect to increased e_th. A low e_th value imposes tight constraints on prediction, and hence necessitates frequent model updates. With an increased e_th, the prediction is allowed to deviate more, so the frequency of model reconstruction is reduced. The trend is steeper for lower e_th values and smoother for higher e_th values. When the data is linear, the model can guarantee the prediction for the long term, so the reconstruction frequency is low. In a non-linear data environment, model reconstruction happens more frequently. The average convergence time is nearly uniform throughout the range of e_th, at about 3 rounds. With low

e_th the convergence needs more rounds, whereas with large e_th convergence happens within a minimum of rounds. The energy model of TELOSB [30] is used for the energy estimation of our work. In adaptive nLMS, (5N+5) cycles are required for each prediction; for a filter length of 4, the energy cost of prediction per round is 30 nJ. Each round, the energy cost of transmitting and receiving a 16-bit datum is 11.52 µJ and 12.59 µJ respectively. From eq. (6-8), the energy cost during initialization and normal modes is 24.14 µJ per round, and during standalone mode it is 60 nJ. The total cost of conventional data gathering for 1000 rounds is given as

1000 (Etx + Erx) = 24110 µJ ……………… (19)

The total cost of adaptive nLMS based data gathering for 1000 rounds is given as

10 (Einit) + 97 (Enorm) + 900 (Esa) = 2514 µJ …….. (20)

This impact of data reduction on energy conservation is shown in Fig. 10.
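A back-of-the-envelope check of these totals, using only the per-round costs quoted above (variable names are ours; the recomputed sums can differ slightly from the rounded figures in eqs. (19) and (20), presumably due to rounding of the per-mode costs):

```python
# All values in microjoules (1 nJ = 1e-3 uJ), as quoted in the text.
E_TX, E_RX = 11.52, 12.59       # transmit / receive cost of a 16-bit datum
E_PRED = 30e-3                  # adaptive nLMS prediction, (5N+5) cycles, N = 4
E_SA = 60e-3                    # standalone-mode cost per round

conventional = 1000 * (E_TX + E_RX)        # every round is transmitted
per_round_normal = E_TX + E_RX + E_PRED    # predict, then still transmit
adaptive = (10 + 97) * per_round_normal + 900 * E_SA
print(conventional, per_round_normal, adaptive)
```

The adaptive scheme lands around a tenth of the conventional cost, matching the roughly 90% energy saving reported above.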

In a clustered aggregation scheme, the dual prediction is used to reduce intra-cluster communication. Here the DPF is executed between the CH and the cluster members. Each member constructs a temporal correlation model based on its own observations and communicates it to the CH. In the CH, multiple prediction models are executed in each round, and the predicted data are aggregated and transmitted to the sink. Trend changes are eventually updated by the cluster members. Here we employ the first-order energy model used in [6] to evaluate the energy expense of the proposed clustering work in an environment of 100 randomly deployed nodes. The major purpose of the work is to reduce the energy expense of the network, through which the network lifespan is prolonged. The work is evaluated with varying temporal error thresholds. With an increased error threshold, the amount of intra-cluster data communication is greatly reduced, and hence nodal energy is conserved. As shown in Fig. 11, the lifetime of the network with an error threshold of 1°C is nearly three times longer than that of the network with a 0.25°C error threshold. This indicates the impact of error tolerance on network lifetime. Average energy consumption, the major indicator of the energy efficiency of the system, is the total energy of alive nodes divided by the number of alive nodes. Here we use the typical values EELEC = 50 nJ/bit, εfs = 10 pJ/bit/m² and εmp = 0.0013 pJ/bit/m⁴. As noted previously, the CHs are responsible for aggregating their cluster members' data. The energy for data aggregation and for data prediction is set as EDA = EDP = 5 nJ/bit/signal. As the data cost has been reduced sharply in our proposed work, the average energy consumption also falls. Here we set the system error threshold at 0.25°C, with 2.5 J of initial energy for each node. LEACH has consumed 90% of its energy and DACA 20% by round 5000, but at the same instant our proposed work has consumed less than 10% of total energy, which is less than half of DACA's. Fig. 12 shows the average energy expense for the different data gathering schemes. Fig. 13 shows the number of data packets transmitted by the different clustering approaches. LEACH tops the list with the highest amount of data transmission over the network. Both DACA and DEAP-DP have reduced the transmission considerably by employing correlation aware data reduction strategies. Since DACA employs only spatial correlation, DEAP-DP, with its spatio-temporal correlation aware strategies, significantly outperforms it. With a mere 0.25°C tolerance on the collected data, DEAP-DP has achieved a 90% reduction in data with a guaranteed error bound. The main purpose of clustering is to reduce the volume of inter-node communication by localizing data transmission within the formed clusters. Clustering also extends the nodes' sleep time by allowing the CH to coordinate network activities. But the above advantages can be efficiently realized only if the formed clusters are data-similar clusters. Here we justify the formation of data-similar clusters and dual prediction based reporting in terms of energy savings. Being a group of data-similar nodes, the proposed cluster can reduce a large amount of spatially similar data.
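The first-order radio model of [6] behind these comparisons can be sketched with the constants quoted above. The function names and the free-space/multipath crossover formulation are the standard form of this model, stated here as an assumption rather than the authors' exact code.

```python
import math

# First-order radio model constants, as quoted in the text (SI units).
E_ELEC = 50e-9        # J/bit, transceiver electronics
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier
EPS_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier
E_DA = 5e-9           # J/bit/signal, aggregation (and prediction, E_DP)
D0 = math.sqrt(EPS_FS / EPS_MP)   # crossover distance between the two regimes

def e_tx(bits, d):
    """Energy to transmit `bits` over distance d metres."""
    amp = EPS_FS * d**2 if d < D0 else EPS_MP * d**4
    return E_ELEC * bits + amp * bits

def e_rx(bits):
    """Energy to receive `bits`."""
    return E_ELEC * bits
```

A CH additionally spends E_DA per bit per incoming signal to aggregate (and predict) its members' data before forwarding to the sink.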
The formation of data-similar clusters and the inclusion of the DPF result in enormous energy savings in the proposed work, as shown in Fig. 14. Duty cycling is one of the prominent methods of energy conservation; however, it is insensitive to data dynamics. In a denser network, the data similarity between neighboring nodes and the correlation between consecutive data generated by a sensor node are not exploited by duty cycling methods. In data reduction approaches, the nodes, though active, do not consume much energy, as minimal communication is incurred. Here we estimate the energy expenditure based on the TELOSB

model [30]. Fig. 15 shows the energy expense of the network with 10% to 50% duty cycling and the energy expense of the network with 0.1°C to 0.5°C error tolerance on data prediction. With a minute error tolerance of 0.1°C, the proposed approach achieves energy conservation equivalent to 20% duty cycling.

6. Conclusion

The paper proposed a two-level data reduction strategy for energy conservation in wireless sensor networks. The nodes are clustered based on their data similarity. The DPF is used to acquire data from multiple cluster members at the CH. An adaptive nLMS filter is used in the DPF to exploit the temporal correlation of sensor data. The acquired data are then aggregated at the CH and sent to the sink node. This method exploits both spatial and temporal correlations between data and works better than the existing data reduction methods, in which either spatial or temporal correlation alone is employed. The work not only exploits the temporal and spatial correlations but also refines the data prediction methods in terms of faster convergence and reduced data error. In this work, temporal correlation based data reduction achieves up to 96% intra-cluster data reduction, and spatial aggregation results in 60% inter-cluster data reduction. Together, a total of 62% data reduction has been achieved. Future work involves the inclusion of data compression techniques at the CH towards further reduction in inter-cluster data transmission.

REFERENCES

1. ON World Inc. Wireless Sensor Networks – Growing Markets, Accelerating Demands. July 2005. Available from: http://www.onworld.com/html/wirelesssensorsrprt2.htm.
2. Giuseppe Anastasi, Marco Conti, Mario Di Francesco, Andrea Passarella. Energy Conservation in Wireless Sensor Networks: A Survey. Ad Hoc Networks 2009; 7: 537-568.
3. Li, Shancang, Li Da Xu, and Xinheng Wang. "Compressed sensing signal and data acquisition in wireless sensor networks and internet of things." Industrial Informatics, IEEE Transactions on 9.4 (2013): 2177-2186.
4. Tobler, Waldo. "On the first law of geography: A reply." Annals of the Association of American Geographers 94.2 (2004): 304-310.
5. Sunhee Yoon, Cyrus Shahabi. The Clustered Aggregation (CAG) Technique Leveraging Spatial and Temporal Correlations in Wireless Sensor Networks. ACM Transactions on Sensor Networks 2007; 3.
6. W. R. Heinzelman, A. Chandrakasan, H. Balakrishnan. Energy-Efficient Communication Protocol for Wireless Microsensor Networks. In Proceedings of the Hawaii International Conference on System Science. Maui, Hawaii: 2000.
7. Ossama Younis and Sonia Fahmy. "HEED: A Hybrid, Energy-Efficient, Distributed Clustering Approach for Ad-hoc Sensor Networks."

8. M. Ye, C. Li, G. Chen and J. Wu, "EECS: An Energy Efficient Clustering Scheme in Wireless Sensor Networks", National Laboratory of Novel Software Technology, Nanjing University, China.
9. M. Gerla and J. T. C. Tsai, "Multicluster, mobile, multimedia radio network", Wireless Networks, Vol. 1, Issue 3, 1995, pp. 255-265.
10. Selvakennedy, S., Sukunesan Sinnappan, and Yi Shang. "A biologically-inspired clustering protocol for wireless sensor networks." Computer Communications 30.14 (2007): 2786-2801.
11. M. Chatterjee, S. K. Das and D. Turgut, "WCA: A Weighted Clustering Algorithm for Mobile Ad Hoc Networks," Cluster Computing, 2002, vol. 5, pp. 193-204.
12. Bahrami, Somaieh, Hamed Yousefi, and Ali Movaghar. "DACA: data-aware clustering and aggregation in query-driven wireless sensor networks." Computer Communications and Networks (ICCCN), 2012 21st International Conference on. IEEE, 2012.
13. G. Werner-Allen, K. Lorincz, M. Ruiz, O. Marcillo, J. Johnson, J. Lees, M. Welsh, "Deploying a Wireless Sensor Network on an Active Volcano", IEEE Internet Computing, Special Issue on Data-Driven Applications in Sensor Networks, March/April 2006.
14. Z. Xiong, A.D. Liveris, and S. Cheng, "Distributed source coding for sensor networks," IEEE Signal Processing Magazine, vol. 21, no. 5, pp. 80-94, 2004.
15. J. Haupt, W.U. Bajwa, M. Rabbat, and R. Nowak, "Compressed sensing for networked data," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 92-101, 2008.
16. Luo, Jun, Liu Xiang, and Catherine Rosenberg. "Does compressed sensing improve the throughput of wireless sensor networks?" Communications (ICC), 2010 IEEE International Conference on. IEEE, 2010.
17. E. Fasolo, M. Rossi, J. Widmer, M. Zorzi, "In-network aggregation techniques for wireless sensor networks: a survey", IEEE Wireless Communications, Volume 14, Issue 2, pp. 70-87, April 2007.
18. Q. Han, S. Mehrotra, N. Venkatasubramanian, "Energy Efficient Data Collection in Distributed Sensor Environments", Proc. of the 24th IEEE International Conference on Distributed Computing Systems (ICDCS'04), pp. 590-597, March 2004.
19. A. Deshpande, C. Guestrin, S.R. Madden, J.M. Hellerstein, W. Hong. Model-Driven Data Acquisition in Sensor Networks. In Proceedings of the 30th VLDB Conference (VLDB '04). Toronto, Canada: 2004.
20. A. Jain, E.Y. Chang, Y.-F. Wang. Adaptive stream resource management using Kalman filters. In Proceedings of the ACM SIGMOD/PODS Conference (SIGMOD '04). Paris, France: 2004.
21. Li G and Wang Y, "Automatic ARIMA modeling-based data aggregation scheme in wireless sensor networks," EURASIP J Wirel Commun Netw, 2013: 85, Mar. 2013.
22. Silvia Santini, Kay Römer. An Adaptive Strategy for Quality-Based Data Reduction in Wireless Sensor Networks. Proceedings of the 3rd International Conference on Networked Sensing Systems (INSS'06).
23. Nicholas Paul Borg, Carl James Debono. An FPGA Implementation of an Adaptive Data Reduction Technique for Wireless Sensor Networks. Workshop in Information and Communication Technology.

24. Floris Ernst, Alexander Schlaefer, Sonja Dieterich, Achim Schweikard. A Fast Lane Approach to LMS prediction of respiratory motion signals. Biomedical Signal Processing and Control 2008.
25. Solev D, Stojkoska B and Danco Davcev, "Data prediction in WSN using variable step size LMS algorithm," in Proceedings of the Fifth International Conference on Sensor Technologies and Applications (SENSORCOMM 2011), vol. 3, no. 4, pp. 191-196, Aug. 2011.
26. Pham, Ngoc Duy, Trong Duc Le, and Hyunseung Choo. "Enhance exploring temporal correlation for data collection in WSNs." Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on. IEEE, 2008.
27. [Online]. Available: www.microchip.com/pic16F877A/
28. [Online]. Available: http://www.intelresearch.net/berkeley/index.asp
29. A. Jindal and K. Psounis, "Modeling Spatially-Correlated Sensor Network Data," ACM Transactions on Sensor Networks, vol. 2, no. 4, pp. 466-499, 2006.
30. Meulenaer DG, Gosset F, Standaert F and Pereira O, "On the energy cost of communication and cryptography in wireless sensor networks," in Proceedings of the 4th IEEE International Conference on Wireless and Mobile Computing, Networking and Communication (WiMob '08), pp. 580-585, Avignon, France, October 2008.

FIGURES

Fig. 1. Dual Prediction Framework

[Block diagram: the input x[k] is delayed to give x[k-1], which feeds the adaptive filter producing the prediction y[k]; the prediction error e[k] drives the adaptive weight control mechanism.]

Fig. 2. nLMS based prediction filter

Fig.3. Adaptive nLMS transmission Instances

Fig.4. Comparison of original data and Adaptive nLMS Predicted data

Fig. 5. Data reduction comparison (e_th = 0.25°C)

Fig. 6. Percentage transmission with varying e_th

Fig. 7. RMSE comparison (e_th = 0.25°C)

Fig. 8. RMSE with varying e_th

Fig. 9. No. of model reconstructions with average convergence time for varying e_th

Fig. 10. Energy comparison

Fig. 11. Node lifetime

Fig. 12. Network energy consumption

Fig. 13. No. of transmitted packets

Fig. 14. Plain network vs. cluster network

Fig. 15. Duty vs. Data reduction

Muruganantham ARUNRAJA completed his Bachelor's degree at Shanmugha College of Engineering, Thanjavur, and his Master's degree in Embedded System and Technologies at Anna University, Tirunelveli. He has eight years of industrial experience in embedded system domains. He is pursuing his Ph.D. at Anna University Regional Centre, Madurai. His areas of interest are embedded systems, wireless networks and instrumentation.

Veluchamy MALATHI is working as a professor in the Department of Electrical and Electronics Engineering at Anna University Regional Centre, Madurai. She completed her Bachelor's degree at College of Engineering, Guindy, and her Master's at Thiyagaraja College of Engineering, Madurai. She completed her Ph.D. at Anna University, Chennai. Her areas of interest are intelligent techniques and their applications, smart grid, FPGA based power systems and automation.

Erulappan SAKTHIVEL completed his Bachelor's degree at Madurai Kamaraj University, Madurai, and his Master's degree in Embedded System and Technologies at Anna University, Tirunelveli. He has eight years of industrial experience in VLSI domains. Currently he is pursuing his Ph.D. in the VLSI domain at Anna University Regional Centre, Madurai, as a full-time research scholar.