MSDG: A novel green data gathering scheme for wireless sensor networks


Accepted Manuscript

MSDG: a Novel Green Data Gathering Scheme for Wireless Sensor Networks

Zhetao Li, YuXin Liu, Ming Ma, Anfeng Liu, Xiaozhi Zhang, Gungming Luo

PII: S1389-1286(18)30373-6
DOI: 10.1016/j.comnet.2018.06.012
Reference: COMPNW 6522

To appear in: Computer Networks

Received date: 14 August 2017
Revised date: 16 May 2018
Accepted date: 12 June 2018

Please cite this article as: Zhetao Li , YuXin Liu , Ming Ma , Anfeng Liu , Xiaozhi Zhang , Gungming Luo , MSDG: a Novel Green Data Gathering Scheme for Wireless Sensor Networks , Computer Networks (2018), doi: 10.1016/j.comnet.2018.06.012

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Zhetao Li a, YuXin Liu b,*, Ming Ma c, Anfeng Liu b, Xiaozhi Zhang b, Gungming Luo a

a College of Information Engineering, Xiangtan University, Xiangtan 411105, China
b School of Information Science and Engineering, Central South University, ChangSha 410083, China
c Department of Computer Science, Stony Brook University, NY 11794, United States

ABSTRACT

Green data gathering is urgently required to resolve the conflict between the various functionalities demanded by applications and the limited energy supply of sensor nodes. This paper proposes a Multi-Strip Data Gathering (MSDG) scheme that can effectively reduce the amount of data transmitted. In the MSDG scheme, the network is divided into multiple strips, in each of which the nodes around the strip transmit data to the strip center using the data fusion method. The number of data fusion packets is m when the data arrive at the strip center. The nodes at the strip center collect data using the compressed sensing (CS) method, and all the data in the strip, after routing along a circle, are collected on a node called the start node, which maintains a packet length of m. Finally, the data on the start node of each strip are fused using the CS method, starting from the outermost start node and proceeding to the sink, to achieve data collection for the entire network. Simulation and theoretical analyses demonstrate that the MSDG strategy, compared with the existing hybrid-CS scheme, can increase the network lifetime by nearly 5 times. Additionally, the energy utilization is increased from 21% to 61%, which significantly enhances the performance of data collection in wireless sensor networks.

Keywords: Wireless sensor networks; Multi-strip data gathering; Compressive sensing; Energy efficiency; Lifetime maximization

1. Introduction


∗ Corresponding author. Tel.: +86 731 8879628. E-mail address: [email protected] (Z. T. Li); [email protected] (Y. X. Liu); [email protected] (M. Ma); [email protected] (A. F. Liu); [email protected] (X. Zhang); [email protected] (G. Luo).

Many potential applications of wireless sensor networks (WSNs) have been exploited in the fields of environmental engineering, healthcare [1], industry [2], military applications [3, 4], smart homes [5], green buildings, etc. [6]. With the rise of edge computing, the sensor-node-based Internet of Things (IoT) has developed rapidly [7, 8]. The biggest stumbling block to the pervasive deployment of WSNs is the contradiction between the various functionalities demanded by applications and the limited energy supply of sensor nodes [9-12]. The situation is getting worse as the network scale increases [13, 14]. Therefore, numerous researchers have proposed energy-saving methods to prolong the network lifetime as much as possible [1, 3, 4, 9, 10, 12-14]. Because the communication overhead of transmitting and receiving data accounts for the largest share of energy consumption in WSNs, reducing the amount of data transmitted in the network becomes the key issue.

There are mainly two categories of methods to reduce the amount of data. One is data fusion [14-17]. This method relies on the redundancy that exists between the data sensed by the sensor nodes. When data packets meet at an intermediate node on the route to the sink, they are fused into a single packet smaller than the sum of the original packets. The advantage of this method is its ease of use, whereas the disadvantage is that its effectiveness depends on the application. If the correlation between the data packets is large, the method is very effective. For example, if the average humidity, the maximum, and the minimum value of the monitoring region need to be calculated, all the packets can be integrated into one data packet, which greatly reduces the amount of data to be transmitted. However, if the correlation between the data packets is small, the fused data packet becomes larger and larger as more data packets are integrated on the way to the sink, which results in large energy consumption and shortens the network lifetime.

Another effective method of compressing packets is compressive sensing (CS) [18]. In this method, the number of packets that a node needs to transmit is m, where m is much smaller than the number of nodes n in the network. Thus, data can be greatly compressed. However, the compressive sensing method has a limitation: every node in the network needs to transmit m data packets. In the data fusion method, at the early stage, a node transmits fewer than m data packets. In

the process of data transmission to the sink, the data packet becomes larger and larger due to continuous data fusion. Eventually, nodes within a certain distance of the sink undertake more than m data packets. To save energy, a hybrid compressed sensing (hybrid-CS) method [19] was proposed, which combines the compressed sensing method and the data fusion method. That is, the data fusion method is used in the far-sink region, so that the number of packets undertaken by a node is less than m. The compressive sensing method is employed in the near-sink region, where a node would otherwise need to transmit more than m data packets, so that the amount of data undertaken by the node is m. In such a scheme, the amount of data carried by a far-sink node is less than m, while the number of packets undertaken by a near-sink node is at most m [20].

The main innovations of the Multi-Strip Data Gathering (MSDG) scheme are as follows:

(1) A novel data fusion routing scheme is proposed so that the maximum amount of data that a node undertakes in the network is reduced to about $m/k$, where $k$ is the number of nodes within one hop of the sink, greatly improving the data fusion efficiency and network lifetime. The MSDG scheme integrates data in the network using data fusion and transmits the data along only one routing path to the sink. In a cycle of data collection, only one node within one-hop range of the sink is required to transmit m data packets, while each of the other nodes only needs to send its own packet. If the number of nodes within one hop of the sink is $k$, each of these nodes needs to undertake $m + k - 1$ data packets after $k$ cycles of data collection. Thus, a node transmits $(m + k - 1)/k < (m/k) + 1$ packets, on average, in each cycle, which effectively reduces the amount of data that the node undertakes and improves the network lifetime.

(2) The MSDG scheme, in contrast to previous schemes, can effectively reduce the energy consumption of the nodes and make full use of the energy, so that the energy consumption in the network is effectively balanced and the network performance is improved. In the MSDG scheme, the network is divided into multiple strips, in each of which the nodes around the strip transmit data to the strip center using the data fusion method. The length of a data fusion packet is m when the data arrive at the strip center, where the compressed sensing method is then used. In each strip, only a small number of nodes at the strip center undertake at most m packets, while most of the other nodes need to transmit fewer than m packets. In other words, most nodes in the network undertake a relatively small amount of data, so their energy consumption is low. Additionally, the strips in the network change in turn, so that the amount of data undertaken by the nodes is relatively balanced. Therefore, the MSDG scheme outperforms previous schemes in terms of energy consumption.

(3) Through extensive theoretical analysis and simulation, we demonstrate that the proposed MSDG scheme can reduce the amount of data that a node undertakes. Compared with the previous best scheme, the amount of data packets that a node undertakes in the MSDG scheme is only 21% of that in the previous best scheme. Additionally, the network lifetime is improved by nearly 5 times, and the energy efficiency is enhanced by 5-7 times, demonstrating the better performance of our proposed scheme.

The remainder of this paper is organized as follows: In Section 2, the related work is reviewed. The system model is described in Section 3. In Section 4, the details of the Multi-Strip Data Gathering (MSDG) scheme are presented. Performance analysis is provided in Section 5. The analysis and comparison of experimental results are given in Section 6, and Section 7 presents the conclusion.

2. Related Work

Several methods have been proposed to reduce the amount of data that a node needs to transmit, so that network performance metrics such as latency, congestion, and network lifetime can be effectively improved. One traditional method is the data fusion method [14-17], which fuses two correlated data packets into a smaller data packet without losing information. Another method is the compressive sensing method [18]. Research related to these two methods is discussed in this section.

2.1 Data fusion method

Although the data fusion method fuses multiple data packets into a smaller one so that energy consumption is reduced [14-17], designing an effective data fusion method is challenging. In this paper, data fusion methods are classified into the following categories:
(1) The routing-driven data fusion method [4, 14-17, 21]. Since the premise of data fusion is that data packets meet during transmission, the main feature of this method is to make data packets meet as often as possible.
(2) The coding-driven data fusion method [22-24]. Here, data aggregation mainly reduces the amount of data that a node needs to transmit via network coding. Minimal attention is paid to the route; the key is to design efficient coding that reduces the amount of data as much as possible.
(3) The fusion-driven method, whose main goal is to aggregate correlated data [15] by arranging for more relevant data to meet, to obtain a higher data aggregation effect. This method is closely related to the routing scheme and less related to network coding (data compression) [22-24].

The three data fusion methods are described below.

(1) Routing-driven data fusion method [15]. This method is mainly used to design an effective data fusion routing scheme so that as many data packets as possible are fused together [4, 14-17, 21]. Data fusion is most effective for set functions such as max, min, and sum. In this case, only one packet is output after $n$ input packets are merged. A sink-rooted tree structure is formed, in which each node starts to forward only after it has received all the data of its children nodes and obtained one fused data packet. Thus, this method has a high data fusion efficiency [15].

However, in most applications, data fusion does not produce 1 output from $n$ inputs; it usually produces a fraction of $n$ outputs from $n$ inputs, where the fraction is a decimal number associated with the fusion coefficient. Even if a route beneficial for data fusion is not designed beforehand, the data packets in the network still have a certain probability of meeting and being fused. This effect is called random data fusion routing [15, 17]. This method places no requirement on routing and is the most widely used in practice. However, the proportion of data packets that meet at random is not high, so the data fusion efficiency is not high either. Villas et al. [21] proposed a routing method that allows data packets to meet with a higher probability on the route to the sink, thus improving the data fusion efficiency. In this type of routing, the nodes on the first formed route to the sink set their distance to the sink to 0 hops. After the hop count spreads, the nodes near the formed route are attracted to send their data to the sink via the first formed routing path, thus increasing the probability of data fusion and improving the data fusion efficiency. Cluster-based WSNs can also be seen as a data fusion method, in which the data of each cluster node are routed through the cluster head to the sink [25]. Liang et al. [4] proposed the RBCDR scheme, in which data are not sent directly to the sink but are first routed to a ring in non-hotspot areas. After completing a circle along the ring, the data are routed to the sink along the shortest path. In the RBCDR scheme, all the data packets meet, so the data fusion rate is extremely high.

(2) Coding-driven scheme [15]. These algorithms [22-24] focus on compressing data at the source node using coding techniques so that the amount of data to be transmitted is small. These schemes can be roughly divided into two data aggregation models [6]: the distributed source coding scheme (e.g., the Slepian-Wolf coding model) [22, 23] and explicit edge information aggregation [24].

(3) Fusion-driven data fusion scheme [15]. Unlike the routing-driven scheme, the fusion-driven approach does not assume that all data packets are correlated, so data fusion can only be done between packets with correlation. Thus, the focus of this scheme is on how to bring relevant data packets together so as to benefit from the reduction in data load due to data aggregation [4]. Liu et al. [26] proposed a method based on representative nodes to reduce the amount of data that needs to be sent to the sink. The idea is that the data values sensed by multiple nodes in the network often differ very little. For example, sensed values can be regarded as identical if they differ by less than a threshold, so the value of one node can represent the values of all nodes whose sensed values are within that threshold. Thus, the values of a small number of nodes can represent the observed values of the entire network, which greatly reduces the amount of data to be transmitted. Such a method using representative nodes can also be considered a fusion-driven method.

2.2 Compressive sensing method

Compressive sensing (CS) is a method to effectively reduce the amount of data. Compressive sensing theory is not constrained by the strict sampling-frequency requirements of the classic Nyquist sampling theorem, and compression can be achieved while the sampling is conducted [18]. According to the theory of CS, assume the signal $x = [x_1, x_2, \ldots, x_n]^T$, $x \in \mathbb{R}^n$. A signal of length $n$ can be represented by a linear combination of the columns of a standard orthogonal matrix $\Psi = [\psi_1, \psi_2, \ldots, \psi_n] \in \mathbb{R}^{n \times n}$ as follows:

$$x = \Psi\alpha = \sum_{i=1}^{n} \psi_i \alpha_i \quad (1)$$

where $\psi_i$ is a column vector, and $\alpha$ is a $K$-sparse column vector representation of $x$. There are $K$ non-zero values in $\alpha$; the other $n - K$ values are either zero or close to zero, and $K \ll n$. The theory of CS suggests that, under certain conditions, instead of sending the original signal vector of size $n$, we may send a sample measurement of size $m$ as follows:

$$Z = \Phi x \quad (2)$$

where $Z \in \mathbb{R}^{m}$ is a column vector of sample measurements of size $m \times 1$, and $\Phi \in \mathbb{R}^{m \times n}$ is a random sample basis matrix of size $m \times n$ ($m \ll n$). According to compressive sensing theory, to guarantee the perfect recovery of data at the destination [19], the measurement matrix should satisfy the restricted isometry property (RIP), which requires the number of sample measurements to satisfy

$$m \ge c \cdot K \log(n/K) \quad (3)$$

In Equation 3, $c$ is a small positive constant, $K$ denotes the sparsity, and $n$ is the measured signal length. It is recommended that the value of $m$ be set to $3K \le m \le 4K$ [18].

The compressive sensing method has attracted extensive attention since it was first proposed by Candès et al. in 2006 [18]. Luo et al. [19] used the compressive sensing method in the data collection process of large-scale WSNs. It has been shown that the compressive sensing method together with a good routing method can optimize the utilization of network energy. The basic process of data collection using the CS method in a wireless sensor network is as follows: assume the number of nodes deployed in the network is $n$, and the nodes are denoted as $S = [s_1, s_2, \ldots, s_n]$. The data vector collected by the nodes is $x = [x_1, x_2, \ldots, x_n]^T$. The collected data are $K$-sparse in a certain domain, and $K \ll n$. Then, data can be collected using compressive sensing.

If the non-compressed data collection method is used, the data of each node are routed to the sink in the form of the original data packet via multi-hop routing. For example, in Fig. 1, $s_1$ sends its own packet $x_1$ to $s_2$; $s_2$ then needs to send the data $x_1$ and $x_2$ to $s_3$, and $s_3$ continues to transmit data to the next node. This process continues until $s_n$ collects all the data and transmits them to the sink. The final number of packets transmitted to the sink is $n$ [20].

Fig. 1 Data collection without data compression. (Chain S1, S2, S3, ..., SN, Sink; S1 forwards x1, S2 forwards x1 and x2, and SN forwards x1, ..., xn.)
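As a contrast to the raw forwarding in Fig. 1, the measurement in Eq. (2) can be accumulated hop by hop, so that every link carries exactly m values. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the measurement matrix is drawn from a Gaussian distribution, and the readings are taken to be sparse in the canonical basis. It illustrates only the measurement step, not the recovery at the sink.

```python
import random

def chain_cs_gather(readings, phi):
    """Hop-by-hop CS gathering along a chain s_1 -> ... -> s_n -> sink.

    Each node adds its weighted reading phi[i][j] * x_j to the m running
    sums it received, so every hop carries exactly m values.
    """
    m = len(phi)
    partial = [0.0] * m  # running sums before s_1 contributes
    for j, x_j in enumerate(readings):
        partial = [partial[i] + phi[i][j] * x_j for i in range(m)]
    return partial  # the measurement vector Z received by the sink

def direct_measurement(readings, phi):
    """Reference computation of Z = Phi x in a single place."""
    return [sum(row[j] * x for j, x in enumerate(readings)) for row in phi]

n, K = 20, 2
m = 4 * K  # upper end of the recommended range 3K <= m <= 4K
rng = random.Random(0)
# Assumption: Gaussian random measurement matrix of size m x n.
phi = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
x = [0.0] * n
x[3], x[11] = 5.0, -2.0  # K-sparse readings (assumed sparse in the canonical basis)

z_chain = chain_cs_gather(x, phi)
z_direct = direct_measurement(x, phi)
```

The two results agree up to floating-point rounding, which shows that the in-network accumulation delivers the same Z as Eq. (2) while no node ever transmits more than m values.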

Fig. 2 Data collection based on CS.

In compressive sensing-based data collection, only a small number of measured values, that is, weighted sums of the data, are transmitted to the sink. As shown in Fig. 2, $s_1$ first multiplies its own data $x_1$ by the random coefficients $\Phi_{i1}$ ($i = 1, \ldots, m$) and sends $\Phi_{i1}x_1$ to $s_2$. When $s_2$ receives the data transmitted by $s_1$, it transmits $\Phi_{i1}x_1 + \Phi_{i2}x_2$ to $s_3$. $s_3$ sends the data to $s_4$ in the same way, and so on. Each node forwards the result of its own data multiplied by the random coefficients, added to the received data, to the next-hop node. The sink eventually receives the observed values $\sum_{j=1}^{n}\Phi_{ij}x_j$ of the data from the entire network, with a total of $m$ observations. If the data collection in the entire network uses the compressive sensing method, then all nodes undertake $m$ data packets. The data collection process can be formulated as follows:

$$\begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_m \end{bmatrix} = \begin{bmatrix} \Phi_{11} & \Phi_{12} & \cdots & \Phi_{1n} \\ \Phi_{21} & \Phi_{22} & \cdots & \Phi_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \Phi_{m1} & \Phi_{m2} & \cdots & \Phi_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad (4)$$

That is:

$$Z = \Phi x$$

Compressive sensing can break through the traditional sampling theory so that the maximum number of packets that must be transmitted in the network is reduced from $n$ to $m$, and the network lifetime can be effectively prolonged. However, with the compressive sensing-based data collection method, all nodes in the network undertake $m$ data packets (see Fig. 3(a)). If the network only uses the data fusion method, the number of packets undertaken by the nodes in the far-sink region, i.e., the amount of data that a node needs to transmit, is less than $m$ at the beginning of data fusion. With the fusion of more data packets, the amount of data that a node needs to transmit in the near-sink region becomes greater than $m$. Therefore, if the data fusion scheme is used in the far-sink region, the amount of data that a node needs to transmit there is less than $m$. If the compressive sensing method is employed in the near-sink region, where a node would otherwise need to transmit more than $m$ packets, the amount of data that a node needs to transmit is capped at $m$. In this way, the energy consumption of the network can be reduced. Based on this idea, Luo et al. [19] proposed a hybrid compressed sensing (hybrid-CS) method, as shown in Fig. 3(b). In Fig. 3, a bold black node means that the CS data collection method is used, and the amount of data that the node undertakes is $m$. A non-bold black node denotes that the general data fusion algorithm is applied, and the amount of data that the node undertakes increases with the number of data packets [20].

Fig. 3 Two different data collection methods: (a) CS scheme; (b) hybrid-CS scheme.

From the above studies, the main goal of data collection methods in WSNs is to reduce the amount of data that a node undertakes, save energy, and improve the network lifetime without losing the information collected by the sink. The first type of method, data fusion, involves reducing the restrictions on the correlation between the data and the impact of coding. Since these studies have been developed to a very high degree, it is difficult to further reduce the amount of data after data fusion [27]. The CS method requires each node to undertake $m$ data packets. Although it effectively reduces the amount of data that a node is responsible for, that amount is still large. Thus, finding a more efficient data collection scheme is still a challenging issue.

Although the energy consumption of nodes can be effectively reduced by decreasing the amount of data transmitted in the network, other factors also affect it. One important approach is to use an effective MAC protocol to reduce the energy consumption of the nodes [28, 29, 30]. Idle listening is one of the major sources of energy consumption. In some cases, idle listening and data reception have almost the same energy consumption [28, 30, 31, 32]. Thus, one effective way to decrease the energy consumption is to reduce idle listening. A commonly used method in sensor networks is the duty cycle method [28, 30, 33, 34], which reduces the idle listening of nodes by allowing them to stay in a sleeping state when no data operation is needed. Since the energy consumption of a node in the sleeping state is 1/100 of that in the working state, keeping the node asleep whenever possible saves energy. This technique, however, has the disadvantage that the delay of data transmission may increase [28, 30], because a node has to wait for the sleeping forwarding nodes to wake up before it can start the data transmission. In Refs. [33, 34], Tong et al. proposed a novel and effective method that combines the duty-cycling technique with the pipelined-forwarding technique to balance the energy consumption and packet delivery latency. Therefore, effectively reducing the energy consumption of nodes requires comprehensive optimization.
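The difference between raw forwarding, pure CS, and hybrid-CS collection discussed in this section can be made concrete by counting the packets each node sends on a single chain toward the sink. This is a simplified sketch, not the evaluation setup of the paper: the function name is hypothetical, and the far-sink part of hybrid-CS is assumed to forward uncorrelated packets, so that fusion degenerates to concatenation.

```python
def per_node_load(n, m, scheme):
    """Packets sent by node s_i (i = 1..n) on a chain toward the sink."""
    if scheme == "raw":      # plain forwarding: s_i relays i packets
        return [i for i in range(1, n + 1)]
    if scheme == "cs":       # pure CS: every node sends m measurements
        return [m] * n
    if scheme == "hybrid":   # hybrid-CS: raw until the load reaches m, then CS
        return [min(i, m) for i in range(1, n + 1)]
    raise ValueError(f"unknown scheme: {scheme}")

n, m = 10, 4
loads = {s: per_node_load(n, m, s) for s in ("raw", "cs", "hybrid")}
totals = {s: sum(v) for s, v in loads.items()}
peaks = {s: max(v) for s, v in loads.items()}
```

With n = 10 and m = 4, the peak load drops from 10 packets (raw) to 4 packets (CS and hybrid-CS), while hybrid-CS additionally lowers the total traffic because the far-sink nodes send fewer than m packets.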

3. System Model

3.1. System Model

The network model adopted in this paper is similar to that in Refs. [4, 35]. In a circular region with radius $R$, $n$ homogeneous wireless sensor nodes are randomly distributed. The nodes are $S = [s_1, s_2, \ldots, s_n]$. The node distribution follows a Poisson distribution with density $\rho$. The network performs periodic data collection; for example, the temperature and humidity in a monitoring area such as a farm are sensed. In each sensing cycle, every node generates data and transmits it to the sink. The data vector collected by the nodes in one cycle is $x = [x_1, x_2, \ldots, x_n]^T$. The network lifetime is defined as the number of data collection cycles that the node survives. The sink is located in the center of the network.

3.2. Data collection model

In this paper, two types of data collection methods are used. One is the data fusion method, and the other is the compressive sensing method. These two methods are introduced as follows.

1) Data fusion method. For the data fusion model, we use the lossless step-by-step multi-hop aggregation model [4, 15]. In this model, the data of multiple input nodes are sequentially fused with the data of the source node $s_i$. Only when all the children nodes and the source node have completed the data aggregation does the source node transmit the aggregated data packet to the next node. $d_i$ denotes the original data packet of node $s_i$, and $\mathfrak{S}(d_i, d_j)$ represents the intermediate result of the data fusion between the data of node $s_i$ and the data of node $s_j$. For brevity, $\mathfrak{S}_i$ represents the intermediate result of the current data fusion at node $s_i$, and $\mathfrak{T}_i$ is the final result of the data fusion between node $s_i$ and the data of all its children nodes. When node $s_i$ receives the packet sent by node $s_j$, it aggregates the received data packet, i.e., the packet $\mathfrak{T}_j$ sent by node $s_j$, with the current data packet of node $s_i$ (which may be the original packet $d_i$ or the intermediate result $\mathfrak{S}_i$). If the data packet of the current node is $d_i$, and the received data packet from node $s_j$ satisfies $\mathfrak{T}_j = d_j$, that is, both data packets participating in the data fusion are original packets, then the data fusion formula is as follows:

$$\mathfrak{T}(d_i, d_j) = \max(d_i, d_j) + (1 - c_{ij})\min(d_i, d_j) \quad (5)$$

where $c_{ij}$ is the correlation coefficient between nodes $s_i$ and $s_j$ [4, 15]. The larger the value, the higher the data correlation between the nodes, and the smaller the length of the data packet after data fusion. When node $s_i$ receives the data $\mathfrak{T}_j$ of node $s_j$, if any data packet in the data fusion is not an original data packet, the data fusion formula is expressed as follows:

$$\mathfrak{S}(\mathfrak{S}_i, \mathfrak{T}_j) = \max(\mathfrak{S}_i, \mathfrak{T}_j) + (1 - \omega c_{ij})\min(\mathfrak{S}_i, \mathfrak{T}_j) \quad (6)$$

where $\omega$ is the forgetting factor, which is a decimal number less than 1 [15]. $\mathfrak{S}_i$ and $\mathfrak{T}_j$ represent the intermediate result of data fusion and the final result sent by the children nodes, respectively, and at least one of $\mathfrak{S}_i$ and $\mathfrak{T}_j$ is not an original data packet.

2) Compressive sensing method. The compressive sensing method used in this paper is the same as that in [18, 20], and the related work has been discussed in Section 2. Since the focus of this paper is to use the compressive sensing method for data collection rather than to improve it, the key property required here is that, in a network of $n$ nodes, the amount of data that a node must transmit when the compressive sensing method is adopted is the measurement vector $Z$ of $m$ data packets, where $m$ is given by Equation 3.

3.3. Energy consumption model

The classic energy consumption model [4, 11, 14, 36] is used in this paper. Since the energy consumed in receiving and transmitting data is dominant in a sensor network, this paper considers only the energy consumption of nodes for data transmission and reception, as in previous studies [4, 11, 14, 36]. The energy consumption of sending $l$ bits over a distance $d$ is given in Eq. (7), and the energy consumption of receiving $l$ bits in Eq. (8).

$$E_t(l, d) = \begin{cases} l E_{elec} + l \varepsilon_{fs} d^2, & d < d_0 \\ l E_{elec} + l \varepsilon_{amp} d^4, & d \ge d_0 \end{cases} \quad (7)$$

$$E_r(l) = l E_{elec} \quad (8)$$

where $E_{elec}$ represents the transmitting circuit loss. Both the free-space ($d^2$ power loss) and the multi-path fading ($d^4$ power loss) channel models are used. If the transmission distance is less than the threshold $d_0$, the power amplifier loss is based on the free-space model; if the transmission distance is larger than or equal to the threshold $d_0$, the multi-path attenuation model is used. $\varepsilon_{fs}$ and $\varepsilon_{amp}$ are the energy required by power amplification in the two models. The above parameters are taken from references [4, 11, 36] and are shown in Table 1.

Table 1: Network Parameters

Symbol | Description | Value
$d_0$ | Threshold distance (m) | 87
$r_s$ | Sensing range (m) | 15
$E_{elec}$ | Transmitting circuit loss (nJ/bit) | 50
$\varepsilon_{fs}$ | Power amplification for the free space (pJ/bit/m^2) | 10
$\varepsilon_{amp}$ | Power amplification for the multipath fading (pJ/bit/m^4) | 0.0013
$E_{init}$ | Initial energy (J) | 0.5
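To make the models of this section concrete, the sketch below combines the fused packet length from Section 3.2, the radio energy model of Section 3.3 with the values in Table 1, and the resulting number of collection cycles a node can survive. It is an illustration under stated assumptions rather than the authors' simulator: all function and constant names are hypothetical, and the fusion rule is assumed to have the form max + (1 - ω·c)·min, with ω = 1 when both inputs are original packets.

```python
D0 = 87.0              # threshold distance (m)
E_ELEC = 50e-9         # transmitting circuit loss (J/bit)
EPS_FS = 10e-12        # free-space amplification (J/bit/m^2)
EPS_AMP = 0.0013e-12   # multipath amplification (J/bit/m^4)
E_INIT = 0.5           # initial energy (J)

def fuse(a, b, c, omega=1.0):
    """Length of the fused packet for inputs of length a and b.

    Assumed form: max + (1 - omega * c) * min, where c is the correlation
    coefficient and omega < 1 applies once a packet is already fused.
    """
    return max(a, b) + (1.0 - omega * c) * min(a, b)

def tx_energy(bits, d):
    """Energy (J) to transmit `bits` over distance d (m)."""
    if d < D0:
        return bits * E_ELEC + bits * EPS_FS * d ** 2
    return bits * E_ELEC + bits * EPS_AMP * d ** 4

def rx_energy(bits):
    """Energy (J) to receive `bits`."""
    return bits * E_ELEC

def cycles_until_depleted(rx_bits, tx_bits, d):
    """Whole collection cycles a node survives with a fixed per-cycle load."""
    per_cycle = rx_energy(rx_bits) + tx_energy(tx_bits, d)
    return int(E_INIT // per_cycle)

# A node receives one 1000-bit packet, fuses it with its own 1000-bit
# packet (correlation 0.5), and sends the result 15 m to its next hop.
fused_bits = fuse(1000, 1000, c=0.5)
send_cost = tx_energy(fused_bits, 15.0)
lifetime_cycles = cycles_until_depleted(1000, fused_bits, 15.0)
```

Because the per-cycle cost grows with the number of bits sent and received, the node with the largest data load is the one that limits the lifetime, which is why reducing the peak load is the central design goal.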

3.4. Problem statement The main problem studied in this paper is to design a highly efficient data collection method and optimize two performances: (1) Network lifetime maximization; (2) Effective utilization of energy minimization. (1) Network lifetime maximized As in previous studies [4], the network lifetime refers to the elapsed time when the first node dies in the network. The death of the first node can affect the performance of network. The lifetime of the node depends on its energy consumption, and the energy consumption is mainly

ACCEPTED MANUSCRIPT Z. T. Li et al./Computer Networks

(

proportional to the amount of data sent and received. Assume the average amount of data undertaken by the node in a cycle of data collection is . According to the energy consumption formulas (7) and (8), the energy consumption is . Let max( ) * 𝑛+, and let be the initial energy of the nodes in the network. Therefore, the network lifetime can be expressed as formula (9). According to formula (9), the essence of improving the network life is to reduce the amount of data undertaken by the node with the largest amount of data to be transmitted. max( ) (9)

4. Designing of MSDG scheme

M

4.1 Overview of the proposed scheme

CE

PT

ED

The overall network structure of the MSDG scheme is shown in Fig. 4. In the MSDG scheme, the nodes within the one-hop range of the sink transmit the data directly to the sink, and the other areas of the network are divided into strips (see Fig. 4). In each strip, the data of the other nodes are routed to the strip center and the data fusion method is used in the routing process, as shown in Fig. 5. The width of the strip is determined by the characteristics of data fusion: when the data packet is routed to the strip center, data will arrive at the strip center if data packets are fused. As shown in Fig. 5, the width of the strip is . Delimited by the strip center, the strip is divided into 2 parts with opposite routing directions. (a) The nodes on the near-sink side route data to the strip center in the direction away from the sink, and the width is . This region is called the strip inner zone (see Fig. 5). (b). The far-sink region is called the strip outer zone, and the width is ( ). The nodes in this region route data to the strip center in the direction opposite to the routing direction of nodes in the strip inner zone. Since the length of the data packet in the strip center is , the node in the strip center uses the CS methods for data collection. These nodes are called the CS nodes, while other nodes using data fusion for data aggregation are called aggregation nodes (see Fig. 5).

AC

Rendezvous routing

Strip 1 Strip 2 CS routing

Start node

Candidate start node

AN US

(2) Maximize the effective utilization of energy Efficient utilization of network energy refers to the ratio of utilized energy and initial energy in the network when the network dies. The maximum effective utilization of the network energy can be expressed as the following formula: ⁄∑ max( ) min(∑ ) Thus, in summary, the optimization objective is the following equation: max( ) { (10) max( ) min(∑𝑛 ⁄∑𝑛 𝑛 )

Fig. 4 The overall structure of the MSDG scheme. (Legend: sink, CS node, non-CS node, start node, candidate start node; Strip 1, Strip 2; CS routing, rendezvous routing.)

In the MSDG scheme, the data collection has the following characteristics: (1) The network is divided into multiple strips. Nodes in the near-sink strip send their data directly to the sink; this strip can also be considered a special strip. (2) In each strip, the data of the nodes are routed to the strip center, and the data fusion method is used to aggregate the data along the route. The amount of data packets that a node undertakes is less than or equal to . In special cases, when the amount of data exceeds packets and the data fail to reach the strip center, the compressive sensing method is used for data collection. However, the number of such nodes is very small. (3) The CS nodes in the strip center use the compressive sensing method for data collection. The maximum amount of data that a node undertakes is . (4) One of the CS nodes in the strip center is set as the start node, and it routes the data along the strip for one cycle back to itself using the compressive sensing method. This one-cycle routing along the strip is called the CS routing. All the data of the nodes in this strip are fused into the start node, and the amount of data packets is still . The start node selection is relatively simple: all the candidate start nodes on the shortest routing path from the outer region of the network to the sink are selected beforehand, and a candidate start node becomes a start node only when it is also a strip CS node. The red nodes in the network shown in Fig. 4 are candidate start nodes, and those that are both CS nodes and candidate start nodes are the start nodes. (5) The shortest route to the sink launched by the outermost start node is called rendezvous routing (see Fig. 4), and the CS method is used to collect the data of each start node it passes. In this way, the data of the entire network are collected at the sink.
(6) Because the CS nodes bear the heaviest load in each strip, the strips are periodically re-divided so that the role of CS node is taken by all nodes in turn, which balances the load. In addition, start nodes are randomly reselected and take turns so that the network energy consumption is balanced.

The MSDG scheme proposed in this paper has the following advantages compared with the previous schemes:

(1) The MSDG scheme greatly reduces the average amount of data undertaken by the node, decreases the energy consumption, and effectively improves the network lifetime. The MSDG scheme reduces the average amount of data undertaken by the node through two mechanisms: (i) the strip and (ii) the data collection routing. According to the discussion in Section 2, the best scheme currently is that each node undertakes data packets in each cycle of data collection. In the CS scheme, each node in the network undertakes data packets. In the hybrid-CS method, some nodes undertake less than data packets, while nodes in the near-sink region undertake data packets. In the MSDG scheme of this paper, the network is divided into multiple strips, and most of the nodes in each strip undertake less than data packets. Only the CS nodes in the strip center undertake data packets. Thus, from the perspective of the whole network, the vast majority of nodes in the network undertake less than data packets. In fact, the hybrid-CS method is equivalent to having only one strip in the network. In this strip, a large number of the nodes undertake data packets. In the MSDG scheme, the network is divided into multiple strips, and in each strip, a small number of nodes undertake data packets. Therefore, the number of CS nodes undertaking data packets is greatly reduced. However, if the MSDG scheme had only the above strategy, it could reduce the energy consumption of the network, but the maximum amount of data that a node undertakes would still be , so it could not improve the network lifetime effectively.

Therefore, the MSDG scheme uses another mechanism to improve the network lifetime by exploiting the CS data collection method. The goal of this mechanism is to create an aggregated route as follows. In the CS scheme, a node computes on the received data and then forwards the computed data. Although the number of forwarded data packets is , the forwarded data packets contain all the information. With this feature, if all the data packets can be aggregated before they are transmitted to the sink, all the information of the entire network is contained in one large data packet with a length equal to the sum of data packets. Thus, the sink only needs to get one such large data packet to obtain the information of the entire network. Therefore, the key to compressing all the data packets in the network into one large data packet with the length of data packets in the MSDG scheme is to create a rendezvous routing that brings together all the data held by the start nodes of the strips. All the data packets are thus aggregated before arriving at the sink, and then they are transmitted to the sink. In this case, only a small number of nodes bear data packets in a cycle of data collection. These nodes are the CS nodes in the strip center and the nodes on the rendezvous route (see Fig. 4 and Fig. 5). Other nodes undertake less than data packets. It is especially important that only one node within a one-hop range of the sink undertakes data packets, while the other nodes only need to send one data packet. In the MSDG scheme, the strips and the rendezvous routing take turns. After a number of turns, the amount of data undertaken by the network nodes is relatively balanced from the statistical point of view (balance here means that the amount of data undertaken by the nodes with the same distance from the sink is balanced). Thus, the amount of data that a node bears can be effectively reduced, and the network lifetime can be improved by approximately times.

(2) The MSDG scheme has high energy efficiency. There are two main aspects of energy efficiency: first, the average energy consumption per round of data collection should be low; second, the energy consumption should be balanced. Compared with the previous schemes, the MSDG scheme proposed in this paper performs better in both aspects. First, the average energy consumption per round has been discussed above, and the MSDG scheme is superior to the previous schemes. Second, the MSDG scheme is better than previous schemes in energy consumption equilibrium for the following reasons. In the hybrid-CS method, the energy consumption in the far-sink region is lower than that in the near-sink region. The node with the farthest distance from the sink undertakes 1 data packet, while the nodes in the near-sink region undertake data packets. Thus, the ratio of the maximum to the minimum is . However, in the MSDG scheme, the CS nodes and strips take turns performing tasks, so energy consumption equilibrium can be achieved. By contrast, each node in the CS scheme undertakes data packets. Although the energy consumption in the CS scheme is balanced, the energy consumption of each node is very high.

Fig. 5 The routing in the MSDG scheme. (Labels: strip inner zone of width Wi, strip outer zone of width Wo, strip width w; CS node, start node, aggregation node, candidate start node from outer strip; CS routing, rendezvous routing to sink.)
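Routing every node's data to the strip center requires each aggregation node to know a parent toward the nearest CS node. The scheme builds this with a hop-count broadcast started by the CS nodes (step (3) of Section 4.2). A minimal sketch of that idea follows, written as a multi-source breadth-first search. It assumes the topology is available as an adjacency dictionary, which is a simplification of the distributed broadcast; the function name `hops_to_strip` and the toy topology are hypothetical.

```python
from collections import deque


def hops_to_strip(neighbors, cs_nodes):
    """Return (hops, parent) toward the nearest CS node for every node.

    neighbors: dict mapping each node to the list of nodes within radio range.
    cs_nodes: iterable of nodes on the strip center (hop count 0).
    """
    hops = {node: float("inf") for node in neighbors}  # initial hop count
    parent = {node: None for node in neighbors}
    queue = deque()
    for node in cs_nodes:
        hops[node] = 0
        queue.append(node)  # CS nodes broadcast first
    while queue:
        sender = queue.popleft()
        for receiver in neighbors[sender]:
            # Update only if the broadcast hop count plus one is shorter.
            if hops[sender] + 1 < hops[receiver]:
                hops[receiver] = hops[sender] + 1
                parent[receiver] = sender
                queue.append(receiver)  # re-broadcast the updated hop count
    return hops, parent


# Hypothetical line topology a - b - c - d - e with c as the only CS node.
topology = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
hops, parent = hops_to_strip(topology, ["c"])
print(hops)    # {'a': 2, 'b': 1, 'c': 0, 'd': 1, 'e': 2}
print(parent)  # {'a': 'b', 'b': 'c', 'c': None, 'd': 'c', 'e': 'd'}
```

The `parent` map is what the data fusion routing would follow: each node forwards its fused data to its parent until a CS node is reached.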

4.2 Design of the MSDG scheme

This section discusses the data collection method in the MSDG scheme in detail. The data collection in the MSDG scheme has the following main steps:

(1) Strip division. The MSDG scheme is mainly composed of multiple strips. Therefore, the first step of the MSDG scheme is to divide the network into strips. In a given network, the width of the strip can be determined by the property of the data fusion. As long as the distance between the outermost strip center and the network edge is determined, the location of every other strip center can be determined, since every two adjacent strip centers are one strip width apart. Thus, changing the distance of the first strip center from the edge of the network the next time moves all the strips in the network to new positions. Therefore, the strip division in the MSDG scheme becomes simple. First, the sink randomly selects a node on the edge of the network and gives the distance from the first strip center to the edge of the network, as well as the strip width. Then, the node initiates the shortest route to the sink, and all the nodes on this route are the candidate start nodes. If a candidate start node is closer to a strip center than the other candidate start nodes, then this candidate start node becomes the start node of that strip. In this way, all the start nodes of the strips are found, and each start node represents the initial node of its strip center. Therefore, the division of all strips is completed.

(2) Initiate the CS routing, by which all the CS nodes are formed. Each start node initiates a same-hop routing, that is, the start node selects the farthest node on its right-hand side with the same hop count to the sink as the next-hop node. The process continues until the route returns to the start node. The nodes on this route are all CS nodes. In this way, the CS nodes in each strip center are formed.

(3) Form a route from the aggregation nodes to the strip center. At the beginning, the hop count of all the aggregation nodes to the strip is ∞. First, the CS nodes set their hop count to the strip to 0, and then broadcast their own hop counts. A node receiving a broadcast hop count compares it with its own hop count to the strip. If the hop count in the broadcast plus one is less than the hop count saved on itself, the node updates its hop count to the strip to the broadcast hop count plus one. After this process, each node in the network has its hop count to the nearest strip and the parent node for data forwarding.

(4) The data fusion routing from the aggregation nodes to the strip center. According to the route formed in step (3), each node can find its parent node toward the CS nodes in the strip center. Thus, each node initiates a route to its parent node. Each node fuses the data of all its children nodes and then forwards the data to its parent node until the data arrive at the CS nodes. After this process, the data of all the nodes in the network have been transmitted to the CS nodes (note that the nodes within one-hop range of the sink send data directly to the sink).

(5) Each strip forms one aggregated data packet. The start node initiates a route along a circle similar to that of step (2). Unlike step (2), in which no data packet is transmitted and the routing load is very light, the routing in this step uses the CS method to collect the data packets. Therefore, the data in each strip are all collected at the start node.

(6) Aggregate the data of the start node in each strip and route it to the sink. After step (5), the data of each strip have been aggregated at the start node. Thus, the start node farthest from the sink initiates the rendezvous routing to the sink. This routing is similar to the routing in step (1). However, in the route of step (1), the nodes do not transmit data. In the rendezvous routing, each start node receives the packet, carries out compressive sensing with its own data packet, and then forwards the data. The other candidate start nodes simply forward the data packet. Thus, the data collected in the entire network are transmitted to the sink.

(7) After the network has run in this structure for a period of time, repeat steps (1)-(6) with different strip center locations and different rendezvous routings, so that the energy consumption of the nodes of the entire network is balanced.

The pseudocode of the MSDG scheme is given in Algorithm 1.

Algorithm 1: The MSDG scheme
INPUT: the distance of the outermost strip from the network edge in the current data collection, and the width of the strip.
OUTPUT: the result of data collection of the entire network by the sink.
Stage 1: Strip division
1: Run the routing diffusion protocol as in [21] so that each

node obtains its hop count to the sink and its parent node in the routing;
2: Randomly select a node that is on the network edge;
3: The selected node initiates the shortest path routing to the sink;
4: Set all the nodes on this routing path as the candidate start nodes;
5: Determine the position of each strip center from the distance of the outermost strip to the network edge and the strip width;
6: For each strip center, select the candidate start node that is closer to it than the other candidate start nodes;
7: Set all the selected nodes as the start nodes;
Stage 2: CS nodes routing
8: For each start node Do
9:   While the route has not returned to the start node Do
10:    The current node is labeled as the CS node;
11:    Move to the farthest right-hand node with the same hop count to the sink;
12:  End while
13: End for
Stage 3: Formation of hop count to strip center
14: CS nodes label their hop count to the strip as 0, while other nodes label their hop count to the strip as ∞;
15: Each CS node broadcasts that its hop count to the strip is 0;
16: The node that has received the broadcast will update its hop count to the strip and update its parent node to the broadcasting node if it finds that a shorter routing path to the strip exists;
17: Each node that updates its hop count to the strip will re-broadcast its hop count to the strip;
18: Repeat step 16 and step 17 until the hop count is not updated anymore;
Stage 4: Aggregation data
19: For each aggregation node, when it has no children nodes or has received the data of all its children nodes, Do
20:   Aggregate its own data with all the received data;
21:   Transmit the fused data to the parent node;
22: End For
Stage 5: CS data aggregation
23: For each start node Do
24:   While the route has not returned to the start node Do
25:     The current node sends its data to the next-hop CS node;
26:     The next-hop node aggregates the data using the CS approach;
27:     Move to the next-hop node;
28:   End while
29: End for
Stage 6: Rendezvous routing
30: Let the current node be the start node with the farthest distance from the sink;
31: While the current node is not the sink Do
32:   The current node sends its data to the next-hop node;
33:   If the next-hop node belongs to the start nodes
34:     The next-hop node aggregates the data using the CS approach;
35:   End if
36:   Move to the next-hop node;
37: End while
Stage 7: Repeat the above process after the strip locations are updated
38: Update the distance of the outermost strip from the network edge;
39: Repeat Stage 1 through Stage 6;

5. Performance Analysis in Theory

In this section, the performance of the MSDG scheme is derived mainly by theoretical analysis. First, the maximum data reception and transmission of nodes under the data fusion, CS [18], hybrid-CS [19] and MSDG schemes are analyzed and compared theoretically. The theoretical results show that the MSDG scheme proposed in this paper greatly improves the performance. In the following, we first give an overall comparison of the amount of data borne by a node in the MSDG scheme with the amount of data borne by a node in the other schemes. Then, we give the amount of data borne by a node in the MSDG scheme in detail and compare it with the other schemes.

Theorem 1: In the MSDG scheme, the maximum amount of data borne by the node in one round of data collection is (( )⁄ ) on average, where is the number of nodes within a one-hop range of the sink. By comparison, the maximum amount of data borne by the node in the CS scheme is . The ratio of the maximum amount of data borne by the node in the MSDG scheme to that of the CS scheme is ( − ) .

Proof: According to the previous analysis, the nodes within a one-hop range of the sink bear the largest amount of data. In the MSDG scheme, one node among the nodes within a one-hop range of the sink bears data packets, whereas the other nodes only need to bear 1 data packet, as they only need to send their own packets to the sink. After rounds of data collection, the number of data packets that each node bears is , and the average amount of data in each round that a node bears is (( )⁄ ). By contrast, the amount of data borne by each node is in the CS scheme. Therefore, the ratio of the maximum amount of data borne by the node in the MSDG scheme to that of the CS scheme is ( − ) . As . /<. /, the MSDG scheme can greatly enhance the network lifetime.

5.1 Analysis of data reception and transmission under different schemes

According to formula (9), the network lifetime is determined by the node with the largest energy consumption in the network. Therefore, when comparing the network lifetime of the data fusion, CS, hybrid-CS and MSDG methods, the analysis and comparison are carried out on the maximum energy consumption of a node in a certain period of running time. To make the discussion clearer and more convenient, Table 2 summarizes the meanings and descriptions of the relevant symbols in this section.

Table 2: Symbols and descriptions.

Symbol    Description
𝑅         Network radius
𝑛         Total number of nodes in the entire region
𝑤         Width of ring region
𝑟         Transmission radius of node
ρ         Distribution density of nodes
σ         Correlation coefficient between 𝑛 and 𝜎𝑛; σ = 0.05 by default
𝔖         Fusion function
𝑗         Hop count of node to sink
𝜃         Central angle of ring region (see Fig. 6)
𝐵         Sector region with a distance x from the sink and central angle 𝜃
𝑛         The number of nodes in region 𝐵
          The amount of data received by the node in 𝐵
𝜏         The size of the data packet generated by the node
𝜑         The amount of data undertaken by the node
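The rotation argument in the proof of Theorem 1 can be checked with exact arithmetic. The sketch below assumes what the proof text states: in every round exactly one node within one hop of the sink bears the heavy load (the same number of packets as a node in the CS scheme) and every other such node bears a single packet, with the heavy role rotating over all of them. The parameter names `heavy_load` and `group_size` and the sample values are hypothetical stand-ins for the symbols that are not legible in the theorem.

```python
from fractions import Fraction


def average_load_with_rotation(heavy_load, group_size):
    """Average packets per round for one node after a full rotation.

    heavy_load: packets borne by the single busy node in a round.
    group_size: number of nodes within one hop of the sink sharing the role.
    """
    # Over group_size rounds a node is heavy once and sends 1 packet otherwise.
    total_per_node = heavy_load + (group_size - 1)
    return Fraction(total_per_node, group_size)


def ratio_to_cs(heavy_load, group_size):
    """Ratio of the rotated average load to the constant CS load."""
    return average_load_with_rotation(heavy_load, group_size) / heavy_load


print(average_load_with_rotation(50, 10))  # 59/10
print(float(ratio_to_cs(50, 10)))          # 0.118
```

With a group of size one there is no rotation and the ratio is 1; the ratio shrinks as more nodes share the heavy role.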


dertake at the 𝜑 −(

with a distance of x+jr from the sink

(𝑍 − )

) position is as follows: ( − ) 𝔖 2 ( ) 𝜑 −( − ) 𝜏3 −

(15)

The amount of data that the node must undertake at the (𝑍 ) position is as follows: ( − ) 𝜑 −( − ) 𝔖 2 ( ) 𝜑 −( − ) 𝜏3 (16)

5.1.1 Data load under the data fusion scheme

Fig. 6 Routing process of non-CS scheme (labels: sector regions Bx,k, Bx+r,k, Bx+2r,k; transmission radius r; network radius R)

5.1.2 Data load under the CS scheme

The CS scheme uses compression at every node. According to the principle of compressive sensing, both the number of data packets received and the number of data packets transmitted are 𝜎𝑛 for each node. When the total number 𝑛 of nodes in the region is determined, this number is also determined, and 𝜎 is usually between 0.05 and 0.25 [18]. Therefore, the maximum amount of data that a node needs to undertake in the CS scheme is 𝜎n𝜏.
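As a quick numerical illustration of the expression 𝜎n𝜏, the sketch below evaluates it for a few network sizes. The function name `cs_max_load` is hypothetical, the packet size is normalized to 1 as an assumption, and σ = 0.05 is the default value listed in Table 2.

```python
def cs_max_load(total_nodes, packet_size=1.0, sigma=0.05):
    """Maximum per-node load sigma * n * tau under the CS scheme."""
    if not 0 < sigma <= 1:
        raise ValueError("sigma must lie in (0, 1]")
    return sigma * total_nodes * packet_size


# Network sizes in the range used later in the comparison (800 to 2000 nodes).
for n in (800, 1200, 1600, 2000):
    print(n, round(cs_max_load(n), 6))
```

The load grows linearly with the number of nodes, which is why the per-node cost of pure CS becomes high in large networks.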

The amount of data undertaken by a node under the data fusion scheme is calculated as follows. As shown in Fig. 6, in the sector region 𝐵 with distance from sink and central angle 𝜃 , the nodes in 𝐵 must receive the data sent by the nodes in the region 𝐵 since the transmission radius of the node is . The nodes in 𝐵 receive and transmit the data sent by the nodes in the region 𝐵 , and so on. Suppose denotes a very small radian in 𝐵 , and represents the width of small region in 𝐵 . Then, the area of the sector region in 𝐵 can be calculated. Theorem 2: Suppose within the range of one transmission radius from the sink, the node has a distance from the sink, then the amount of data transmitted by the node after one round of data transmission is completed in the entire network is calculated as follows: ( ) 𝜑 𝔖2 𝜑 ( ) 𝜏 3 𝑍 *0 ⌊ ⁄ ⌋ { 𝜑 −( − ) 𝜏𝑍 ⌊ ⁄ ⌋ (11) Proof: In a non-CS scheme, the node that has the maximum load of data is located closest to the sink, that is, the ring area with the layer marked as 0. The nodes located at x
The nodes in the outermost layer do not receive data, and the amount of data it transmits is its own packet size, as follows: 𝜑 −( − ) 𝜏 (17) █

𝜑 𝔖*( )𝜏+ 𝔖 2 𝜑 𝜏3 (13) It can be calculated that the node at the position must undertake the following amount of data: 𝜑 𝔖2 𝜑 𝜏3 (14) Similarly, the amount of data that the node needs to un-

5.1.3 Data load under the hybrid-CS scheme

The amount of data undertaken by a node in the hybrid-CS scheme in the peripheral region (where the number of accumulated data packets is less than 𝜎𝑛) is similar to that in the data fusion scheme. In the near-sink region, once the number of accumulated data packets exceeds 𝜎𝑛, the node starts to compress the data. Therefore, the data collection of the node with the maximum amount of data undertaken in the hybrid-CS scheme is similar to that of the CS scheme, that is, the node receives 𝜎𝑛 data packets and transmits 𝜎𝑛 data packets. Consequently, when 𝑛 is large enough, the maximum amount of data that a node undertakes in the hybrid-CS scheme is 0.05𝑛τ (with the default σ = 0.05). However, when 𝑛 is not sufficiently large, the hybrid-CS scheme does not switch to compression, and the maximum amount of data that a node undertakes stays below this value, which is similar to the data fusion scheme. In this paper, the network we consider is a large-scale network. Therefore, the case where 𝑛 is not large is ignored.

5.1.4 Data load under the MSDG scheme

The meanings of the primary symbols in this section are shown in Table 3.

be obtained. Symbol

Description Remainder of the distance of the node from the strip center divided by The amount of data undertaken by the node located at the inside of 𝑗-th strip with distance d from the CS-node The amount of data undertaken by the node located at the outside of 𝑗-th strip with distance d from the CS-node The amount of data received by the node located at the inside of 𝑗-th strip with distance d from the CSnode The amount of data received by the node located at the outside of 𝑗-th strip with distance d from the CS-node The distance of the 𝑗-th CS strip from the sink

𝛹 𝛹 𝛩 𝛩





⌋ from the strip center is as follows:

𝛹

⌊ ⌋ from the strip center is as follows: −(

𝔖2 𝛹

) ) )

*0 ⌊ ⁄ ⌋

𝑦 {

(

−(

𝛹

(

)

𝛹

(, / -− )

-

𝜏𝑦

𝜏3 (18)

⌊ ⁄ ⌋

Proof: Suppose the distance of 𝑗 -th strip from the sink is 𝔏 𝑗 (0 ), and the transmission radius of the node is . The maximum hop count of nodes located at the inside of the 𝑗 -th strip to reach the CS-nodes is ⌊ / ⌋. For the node with 𝜅 | 𝜅 *0 ⌊ ⌋ ⌊ ⌋ + from the strip center, if 𝜅 ⌊ ⌋, the node does not forward data, and the amount of data undertaken by the node is τ. According to Theorem 2, when transmitting data to the CS ring, the far-away node needs to find the relay node for data forwarding. The relay node for the node lo(𝜅 ) is the node at (𝜅 ) . The cated at (𝜅 ) is as amount of data received by the node at follows: 𝛩

𝔖2

−(

(, / -− ) )

−(

(, / -− ) )

𝜏3

(19)

After 𝛩 data packets are aggregated with τ data collected by the node itself, the aggregated data are routed to the CS ring and simultaneously carry out data fusion. This process is similar to that in Theorem 2. Hence, Corollary 1 can

( (

) ) )

*0 ⌊(

𝑦 {

(

𝔖2 𝛹

(,(

𝛹

(

)⁄ ⌋ 𝜏𝑦

− )/ -− )

𝜏3

)

(20)

-

⌊(

)⁄ ⌋

Proof: The maximum hop count of nodes located at the − outside of the 𝑗 -th strip to reach the strip center is ⌊ ⌋. −

The node with distance ⌊ ⌋ from the strip center does not forward data, and the amount of data undertaken by the node is τ. Other nodes need multi-hop routing to transmit data to CS nodes. The node with distance

𝑗𝑤 The MSDG scheme creates multiple strips in the network. In each strip, the data fusion method between the node and the CS-node is similar to the data fusion method in Section 5.1.1. That is, the nodes in the strip use singlehop or multi-hop routing to CS-nodes. In the MSDG scheme, assume that the strip width is , and the strip ID 𝑗 =0, 1, 2, ..., ⌊ / ⌋-1. Data of the node: The amount of data that a non-CS node undertakes is calculated as follows: Corollary 1: Suppose the distance of the 𝑗 -th strip from the sink is 𝔏 𝑗 (0 ). The amount of data undertaken by the node located at the inside of 𝑗 -th strip with distance 𝜅 | 𝜅 20 ⌊ ⌋3

█ Corollary 2: Suppose the distance of the 𝑗 -th strip from the sink is 𝔏 𝑗 (0 ). The amount of data undertaken by the node located at the outside of the 𝑗 -th − strip with distance 𝜅 |𝜅 20 ⌊ ⌋3

Table 3: Symbols

.⌊





/

from the strip center needs the relay −

node with distance .⌊ ⌋ / from the strip center. According to Theorem 2, the amount of data received .⌊

by the node with distance strip center is as follows: 𝛩

𝔖{

(

(0

1− ) )

(

(0

1− ) )

𝜏}





/

from (21)

After 𝛩 data packets are aggregated with τ data collected by the node itself, the aggregated data are routed to the strip center and simultaneously carry out data fusion. Therefore, Corollary 2 can be obtained. █ Compared with previous schemes, construction of the MSDG scheme requires the node to bear a certain message forwarding, thus consuming a certain energy. The following gives the analysis of the number of messages borne by the node in the MSDG scheme. Since the amount of data borne by the node is different from the previous analysis, which refers to the number of data packets, the number of messages here is the number of messages transmitted between nodes during the MSDG construction. Obviously, the length of the data bits in a message is far less than that of a data packet. The construction of the MSDG scheme mainly needs the following three operations: (1) Strip division. The shortest route is formed from the outermost layer to the sink in this step; (2) Initiate the CS routing. Each strip center initiates a routing around the sink in this step. (3) Form a route from the aggregation node to the strip center. In this step, each node broadcasts once. Thus, the number of messages borne by the node in the MSDG scheme is given in the following Theorem 3. Theorem 3: Let the radius of the circular area be and the strip width 𝒰 . In the MSDG scheme, the average amount of messages borne by the node in the construction of the MSDG scheme is expressed in formula 22.

{



(⌈ ⁄ ⌉

⌉−

⌈(𝑅− )⁄

)))⁄(𝜋

( 𝜋( ⌉−

)

( 𝜋(

)))⁄(𝜋

)

under different schemes The meaning of the primary symbols in this section is 𝕜 shown in Table 4. Table 4: Symbols Symbol Description 𝛺 𝛺 𝛶̅ 𝛶̅ 𝜔

The minimum value of the total amount of data undertaken in the 𝑗-th strip The maximum value of the total amount of data undertaken in the 𝑗-th strip The minimum value of the average amount of data undertaken in the 𝑗-th strip The maximum value of the average amount of data undertaken in the 𝑗-th strip The total amount of data undertaken in the 𝑗-th strip during a round of data transmission.

The MSDG scheme balances network energy consumption via the rotation of the strip and CS nodes. The average energy consumption after a period of running time can reflect the true performance of the data collection scheme. For the MSDG scheme, after several rounds of data collection, the cumulative amount of data for each node is similar. It can be considered that the data of the whole network is distributed approximately evenly on all nodes. ∑ ∑𝑞 (23) 𝑞 /𝑛 𝑞 represents the amount of data undertaken by the 𝑞th node in the 𝑗-th strip during the -th round of data transmission. c denotes the number of rounds, and 𝑛 is the total number of all nodes in the 𝑗-th strip. The total amount of data undertaken by the network during a round of data transmission is denoted by 𝜔, and the total amount of data undertaken by the 𝑗-th strip during a round of data transmission is denoted by 𝜔 . ∑ ∑𝑞 𝜔 /𝑛 𝜔 /𝑛 (24) 𝑞 /𝑛 Therefore, we calculate the average amount of data that nodes undertake during one round of data transmission and use it to replace the average amount of data undertaken by a node in different rounds of data transmission. Due to the rotation of the nodes in the MSDG scheme, 𝜔/𝑛 can represents the amount of data undertaken in one round of data transmission in the MSDG scheme. Theorem 4: Let the radius of the circular area be and the strip width 𝒰 . In the MSDG scheme, the minimum and maximum amount of data undertaken by the node in the 𝑗-th (𝑗=0,1,2,...⌊ / ⌋ )layer during one round of data transmission are expressed in formula 25 and formula 26, respectively: 𝒰 𝜏 𝜋 𝒰 𝜏 (( 𝑗) 𝜋 𝜋𝑗𝒰)𝜗𝜏 𝛶̅ (25) 2



)⁄

) (22) Proof: Three operations are needed in the construction of the MSDG scheme. The number of messages needed to be sent by the node in each of the operations is calculated as follows. (1) Strip division. This stage only creates a route from the outermost edge of the network to the sink. Since the radius of the network is , and the sending radius of the node is , the number of messages that need to be sent for ⌈ ⁄ ⌉, and the number of mesbuilding this route is ℕ ⌈ ⁄ ⌉ sages received is ℕ . (2) Initiate the CS routing. During this stage, a circular routing is carried out in the center region of each strip. Since the strip width is , the number of strips is ⌈( )⁄ ⌉ . For the strip with a distance 𝒴 from the sink, the number of messages transmitted for a circular routing is: ℕ y ⌈ 𝜋𝒴 ⁄ ⌉=⌈ 𝜋𝒴⌉. The number of messages received is the same as the number of messages transmitted: ℕ y ⌈ 𝜋𝒴⌉. When the strip is located at the outermost location of the network, its circumference is the largest and thus its message load is the largest. At this time, the message transmitted in this stage is ℕ ∑⌈(𝑅− )⁄ ⌉− ( 𝜋( )) , and the number of message received is ℕ =ℕ . (3). Form a route from the aggregation node to the strip center. In this stage, each node sends a broadcast packet. Since the average degree of nodes is 𝕜, the number of messages received by each node is 𝕜. In summary, the average number of messages transmitted ℕ and the number received ℕ in the process of constructing the MSDG at a time are given as follows: )+1 ℕ (ℕ ℕ )⁄(𝜋 ⌈(𝑅− )⁄ ⌉− )))⁄(𝜋 )+1 =(⌈ ⁄ ⌉ ∑ ( 𝜋( )+𝕜= ℕ (ℕ ℕ )⁄(𝜋 ∑⌈(𝑅− )⁄ ⌉− ( 𝜋( )))⁄(𝜋 )+ 𝕜 (⌈ ⁄ ⌉ █ According to Theorem 3, the load borne by a node is very small when constructing the MSDG scheme. The reasons are as follows: (1) A message mainly includes the ID of the node, whose length is often shorter than the packet length by an order of magnitude; (2) After the completion of construction of the MSDG, the network runs stably for a period of time. 
The re-construction of the MSDG is only needed after many rounds of data collection. By contrast, the data collection is carried out regularly. Therefore, the message load for constructing the MSDG is smaller than the load for data collection. (3) As shown in Theorem 3, the message load is mainly caused by broadcasting. When sending a message, k messages will be received. However, the value of ⌈(𝑅− )⁄ ⌉− )))⁄(𝜋 ) is much (⌈ ⁄ ⌉ ∑ ( 𝜋( less than 1. Therefore, the energy consumption of the MSDG scheme itself is very small. ℕ
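The message counts used in the proof of Theorem 3 can be illustrated numerically. The sketch below assumes that each message covers one transmission radius, so that the route from the network edge to the sink needs about ⌈R/r⌉ messages and a circular route at distance y from the sink needs about ⌈2πy/r⌉ messages. This is a reading of the partially legible expressions in the proof, not a restatement of formula (22). The function names are hypothetical, and the values 300 and 15 are the radius and transmission distance quoted in Section 5.2.1.

```python
import math


def strip_division_messages(network_radius, tx_radius):
    """Messages sent along the shortest route from the network edge to the sink."""
    return math.ceil(network_radius / tx_radius)


def cs_routing_messages(strip_distance, tx_radius):
    """Messages sent for one circular CS route at the given distance from the sink."""
    circumference = 2 * math.pi * strip_distance
    return math.ceil(circumference / tx_radius)


print(strip_division_messages(300, 15))  # 20
print(cs_routing_messages(150, 15))      # 63
```

Both counts are small compared with the number of nodes in the network, which is consistent with the claim that the construction overhead per node is low.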

(⌈ ⁄ ⌉

∑⌈(𝑅−

(

5.2 Comparison of data reception and transmission

𝛶̅

𝒰 𝜏 ( 𝜋

)𝒰

( 𝜏 (( (

)𝜌𝜋 )𝜌𝜋

)𝜌𝜋

2

2 −( 𝜋

)𝒰)𝜗𝜏

(26)

Proof: Strip moves in the network. If the strip center is furthest from the sink, that is, the 𝑗-th strip center is located at (𝑗 ) . Since the transmission radius of a node is , the number of CS nodes in the strip center is ( 𝜋(𝑗 ) )⁄ = 𝜋(𝑗 )𝒰. When the 𝑗-th strip center is located at 𝑗 , the number of CS nodes in the strip center is 𝜋𝑗𝒰.

𝒰 𝜏

𝜋𝑗𝒰

(( 𝑗

𝜏

) 𝜋

𝜋𝑗𝒰 )𝜗𝜏

(27

(

)𝜌𝜋

𝜑

MSDG

𝜑 𝜏

(

𝜏 )𝜏

𝛶̅

5.2.1 Comparison of data load in data fusion, CS, and hybrid-CS schemes In this section, we compare the amount of data under-

Fig. 7 Data load in data fusion scheme (x-axis: the distance from sink (m); y-axis: the data load of nodes)

Table 5: Summary of amount of data undertaken maximum amount maximum average of data undertaken amount of data underin one round of data taken in the lifetime transmission

Data fusion

)𝜌𝜋

█ Table 5 shows the maximum amount of data undertaken under different schemes, and the average amount of data undertaken in a single round of data transmission by the node that depletes its energy first. Since the data fusion, CS, and hybrid-CS schemes do not rotate the routing centers, the location of the node with the maximum amount of data undertaken does not change, and the amount of data undertaken in one round of data transmission is exactly the average amount of data undertaken over the entire lifetime. In the MSDG scheme, however, the location of the node with the maximum amount of data undertaken rotates in each round of data transmission. Therefore, the average amount of data undertaken is lower than the maximum value in one round.

CS and hybridCS

20000

5000

2). When the 𝑗-th strip center is located at (𝑗 ) , the amount of data undertaken in the circular strip region is as follows: )𝒰 𝜏 (( 𝑗 ) 𝜋 𝛺 𝒰 𝜏 ( 𝜋𝑗 ( 𝜋𝑗 )𝒰)𝜗𝜏 (28) Therefore, the average amount of data undertaken by a node in one round of data collection is as follows: )𝒰 𝜏 (( )𝜌𝜋 2 −( 𝜋 )𝒰)𝜗𝜏 𝒰 𝜏 ( 𝜋 𝛶̅ 2 (

25000

AN US

Thus, in the MSDG scheme, the average amount of data undertaken by a node during each round of data collection is as follows: 𝒰 𝜏 𝜋 𝒰 𝜏 (( 𝑗 ) 𝜋 𝜋𝑗𝒰)𝜗𝜏 𝛶̅ 2

35000

CR IP T

𝛺

taken in different data collection schemes. The network parameters are set as follows: the total number of nodes is 800 ~ 2000, the radius of distribution area is = 300, and the maximum transmission distance of node =15. In the data fusion scheme, the node with the maximum amount of data undertaken is within the range of a transmission radius from the sink. Therefore, we mainly analyze within the range of a transmission radius from the sink

The data load of nodes

1). When the 𝑗 -th strip center is located at 𝑗 , the amount of data undertaken in the ring region consists of the following parts: one is the data packets of the outer layer forwarded in the rendezvous routing. In each layer, 𝒰 candidate start nodes forward the data, and each data packet is . Therefore, the total amount of data undertaken is 𝒰 𝜏. The second part is the amount of data undertaken by the nodes in the CS routing. The number of CS nodes in the routing is 𝜋𝑗𝒰, and each node forwards 1 data packet with length , and the total amount of data undertaken is 𝜋𝑗𝒰 𝜏. The third part is the amount of data in data fusion in the strip. The number of nodes in the strip is ) ) - 𝜋(𝑗 ) = ( 𝑗 ) 𝜋 𝜋((𝑗 minus the ( ) 𝜋 number of CS nodes, which becomes 𝑗 𝜋𝑗𝒰. Let the average amount of data undertaken by each node during data fusion be 𝜗𝜏, 𝜗 ( ). Then, we have:

60

40

800

1000

1200

1400

1600

1800

2000

the number of nodes

Fig. 8 Data load in the CS scheme Figures 7 and 8 show the data load in the data fusion scheme and the CS scheme, respectively. It can be seen from Fig. 7 that the data load in the data fusion scheme exceeds that of the CS scheme (see Fig. 8) within the range of one hop from the sink. Hence, the CS scheme improves the data load of the nodes in the network, and the CS scheme is better than the data fusion scheme. As shown in Fig. 8, the maximum data load of the node in the CS scheme increases with the increase of the total number of nodes. According to the principle of compressive sensing, the maximum data load in the hybrid-CS scheme is the same as the CS scheme. However, when the data load is less than , the data load of node in the hybrid-CS scheme is less than , but the data load of all nodes in the CS scheme are . In other words, the maximum data load of a single node in the hybrid-CS scheme is the same as that of the CS scheme. However, for the entire network, the overall data

ACCEPTED MANUSCRIPT 14

Z. T. Li et al./Computer Networks

(

load in the hybrid-CS scheme is less than that in the CS scheme, which means that the hybrid-CS scheme is better than the CS scheme. Therefore, the MSDG scheme is mainly compared with the hybrid-CS scheme.
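The comparison between the CS and hybrid-CS schemes above can be illustrated with a minimal sketch. It assumes a simple chain of nodes toward the sink in which the i-th node has i readings in its subtree. The function names `cs_loads` and `hybrid_cs_loads`, the CS packet length `m`, and all numeric values are hypothetical and chosen only for illustration; the per-node rule min(subtree size, m) is the usual hybrid-CS convention, not a formula taken from this paper.

```python
def cs_loads(subtree_sizes, m):
    """Plain CS: every node forwards a CS packet of length m."""
    return [m for _ in subtree_sizes]


def hybrid_cs_loads(subtree_sizes, m):
    """Hybrid-CS: forward raw readings while there are fewer than m of them,
    and switch to a CS packet of length m otherwise."""
    return [min(size, m) for size in subtree_sizes]


# Hypothetical chain of 10 nodes: the node closest to the sink
# has all 10 readings in its subtree.
subtree_sizes = list(range(1, 11))
m = 4  # assumed CS packet length

cs = cs_loads(subtree_sizes, m)
hybrid = hybrid_cs_loads(subtree_sizes, m)

print(max(cs), max(hybrid))  # 4 4
print(sum(cs), sum(hybrid))  # 40 34
```

In this toy chain the maximum per-node load is identical in both schemes, while the network-wide total is lower for hybrid-CS, which matches the qualitative statement above.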

5.2.2 Comparison of data load in the hybrid-CS scheme and in the MSDG scheme

Figures 9 and 11 show the average data load of the nodes in the MSDG scheme for the strip-center positions of cases 1) and 2), respectively. Figures 10 and 12 provide the ratio between the maximum data load of the node in the hybrid-CS scheme and the maximum data load in the MSDG scheme for the abovementioned two cases, respectively. As shown in Fig. 10, when 𝑛 = 800, the maximum data load of the node in the hybrid-CS scheme is more than 10 times that of the MSDG scheme. Since the network lifetime is determined by the node with the highest energy consumption, and the energy consumption of a node is linearly proportional to its data load [4], the MSDG scheme can improve the network lifetime by at least 4 times in theory. Figs. 10 and 12 also show that the larger 𝑛 is, the greater the reduction in the data reception and transmission load. That is, the larger the network, the more effectively the MSDG scheme reduces energy consumption compared with the other schemes.

Fig. 9 Average data load of the strip in the MSDG scheme (strip-center position of case 1)

Fig. 10 Ratio between the maximum data load of the node in the hybrid-CS scheme and the maximum data load in the MSDG scheme (nodes=800)

Fig. 11 Average data load of the strip in the MSDG scheme (strip-center position of case 2)

Fig. 12 Ratio between the maximum data load of the node in the hybrid-CS scheme and the maximum data load in the MSDG scheme (nodes=2000)

5.3 Energy efficiency

Table 6: Symbols
Symbol | Description
[…] | Initial energy of node
[…] | Number of layers in the divided region of distributed nodes
𝑗 | The serial number of the strip, labeled 𝑗 = 0, 1, 2, ..., ⌊[…]/[…]⌋ − 1 from inside out
[…] | The label of the layer that first depletes energy
𝜗 | Percentage of remaining energy of nodes in the 𝑗-th layer
[…] | Remaining energy of the 𝑗-th layer
[…] | Remaining energy of entire network
𝛹 | Energy efficiency of the network

Energy efficiency refers to the proportion of energy effectively used in the network; that is, the lower the residual energy rate, the higher the effective use of energy. The

energy efficiency of the MSDG scheme is analyzed as follows:

Theorem 5: In the MSDG strategy, the lowest efficient utilization rate is as follows:

[Equation (29)]

Proof: The nodes within the range of one hop from the sink transmit the data directly to the sink. In each round, the amount of data that a node transmits is 𝜏. In the near-sink region, one node relays the data packet of size 𝜏 from the outside strip. The average amount of data that one node transmits during one round is […]. For the nodes in other circular regions, based on formulas (4-5), the minimum and maximum average amounts of data transmission for a single node in each round are given in formula 25 and formula 26, respectively. Assume that the energy of the nodes in the […]-th layer is depleted first. The efficient utilization rate of energy is determined by the difference between the layers. Suppose the strip that first depletes its energy consumes the initial energy at its own average rate 𝛶̅ in each round of data transmission, while the other layers consume the initial energy at their respective rates 𝛶̅, where 𝑗 […]. That is, some layers still have remaining energy, and the percentage of residual energy of the node is:

[Equation (30)]

Then, the remaining energy of the 𝑗-th layer is as follows:

[Equation (31)]

The remaining energy of the entire network is as follows:

[Equation (32)]

The initial energy of the network is […], and the final efficient utilization rate of energy is as follows:

[Equation (33)]

Thus, it is proven after simplification of the above formula. █

Fig. 13 Energy utilization under different numbers of nodes in the MSDG scheme

Figure 13 shows the energy utilization rate of nodes in the MSDG scheme as the number of nodes in the network increases. It can be seen from Fig. 13 that the energy utilization rate in the MSDG scheme can reach approximately 74%. Compared with the 10% energy utilization rate in the general schemes proposed in [25], the MSDG scheme achieves a significant improvement.

6. Experimental results and Analyses

OMNeT++ was used to conduct the experiments [37]. The experiments mainly focus on the following aspects of performance: (1) the energy consumption of each node in the network; (2) the amount of data that each node undertakes; (3) the lifetime of the entire network; (4) the remaining energy of the network, or the energy utilization rate.

6.1 The data load of the node

Fig. 15 The data load of the node at different locations in the network

Fig. 16 Data load of nodes after a period of time

In the MSDG scheme, the amount of data that a node undertakes at different locations in the network is shown in Fig. 15. As seen from Fig. 15, the data load of the CS-nodes in the strip is larger than that of non-ring nodes. Since the nodes on the circular route undertake a large amount of data, the MSDG scheme needs to change the position of the ring route after a certain period of time, so that a node that undertook a larger amount of data in the last round of data collection has a smaller data load in the next round. Additionally, each node in the strip takes turns acting as the CS-node to achieve balanced energy consumption. Fig. 16 shows the MSDG scheme after 10 rounds of data collection, with the ring of each strip taking turns from the inside out in each round. A node that has acted as a ring node has larger energy consumption than a node that has not. In each round, the rotation of the CS-node can basically balance the energy consumption, and the nodes that have not yet acted as CS-nodes consume less energy. Compared with the hybrid-CS scheme, our scheme dynamically establishes the data collection center and keeps changing it in turn, so the average amount of data undertaken decreases and becomes more balanced after several rounds, which means that the network lifetime is improved.

6.2 Energy consumption of network

Fig. 17 Energy consumption at different locations in the network

Fig. 18 Comparison of energy consumption at different locations after a period of time

Figure 17 gives the energy consumption of nodes at different locations, corresponding to Fig. 15. Since the relationship between the data load of a node and its energy consumption is linear, the trend in Fig. 17 is approximately the same as that in Fig. 15. Figure 18 compares the energy consumption in the MSDG scheme and the hybrid-CS scheme after a period of time. According to the comparison, the near-sink nodes in the hybrid-CS scheme have larger data loads than other nodes, and the energy consumption is still not well balanced. In the MSDG scheme, by contrast, the energy consumption is more balanced than in the hybrid-CS scheme.

6.3 Energy consumption and efficiency

(a) Maximum amount of data undertaken by the node

(b) Ratio of maximum data load

Fig. 19 Maximum amount of data undertaken by the node and the ratio between two different schemes

Figure 19 shows the maximum cumulative data load in the MSDG scheme and the hybrid-CS scheme, and the ratio between them. As seen from Fig. 19 (a), in the first round of data collection, the maximum amount of data undertaken by the node in the MSDG scheme is basically the same as that in the hybrid-CS scheme. Starting from the second round, the maximum amount of data undertaken by the node in the hybrid-CS scheme rises very fast, whereas that in the MSDG scheme increases relatively slowly. This is because the CS-node rings change their position in each round, and the nodes that acted as CS-nodes in the last round undertake small amounts of data in the next round of data collection. This dynamic method can distribute the network load to each strip. As seen from the experimental result, when the number of rounds of data collection reaches 6, the energy consumption in the MSDG scheme is only 1/5 of that in the hybrid-CS scheme. That is, the network lifetime is improved by a factor of five, which shows the effectiveness of the MSDG scheme.
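The effect of rotating the CS-node ring on the maximum cumulative load, discussed above, can be sketched with a toy model. It assumes, purely for illustration, that in each round one position carries a heavy load and all other positions carry a light load, that all positions start with equal energy, and that energy consumption is proportional to data load. The names `cumulative_loads` and `utilization` and all numeric values are hypothetical; they are not the parameters or the simulator used in the experiments.

```python
def cumulative_loads(positions, rounds, heavy, light, rotate):
    """Return the accumulated load of each position after `rounds` rounds."""
    loads = [0] * positions
    for t in range(rounds):
        # Without rotation the same position (index 0) is always the heavy one.
        center = t % positions if rotate else 0
        for p in range(positions):
            loads[p] += heavy if p == center else light
    return loads


def utilization(loads):
    """Fraction of the total initial energy that has been used when the
    busiest position runs out (equal initial energy, energy ~ load)."""
    return sum(loads) / (len(loads) * max(loads))


fixed = cumulative_loads(positions=5, rounds=5, heavy=10, light=1, rotate=False)
rotating = cumulative_loads(positions=5, rounds=5, heavy=10, light=1, rotate=True)

# Lifetime is limited by the busiest position, so the gain is the ratio of maxima.
lifetime_gain = max(fixed) / max(rotating)

print(fixed, rotating)
print(round(lifetime_gain, 2), utilization(fixed), utilization(rotating))
```

In this toy setting the fixed scheme exhausts one position while the others have used only a tenth of that energy, whereas rotation equalizes the accumulated load, so both the lifetime and the energy utilization improve.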

6.4 Energy utilization rate of network

(a) Efficiency of energy utilization in the hybrid-CS scheme

(b) Efficiency of energy utilization in the MSDG scheme

Fig. 20 Efficiency of energy utilization under different node densities

The experimental results of energy efficiency are shown in Fig. 20. Compared with the hybrid-CS scheme, which achieves a 21% energy utilization rate, the MSDG scheme obtains a 61% energy utilization rate, and the remaining energy rate is reduced. The energy utilization rate of the MSDG scheme is improved by 3 times, demonstrating the effectiveness of the MSDG scheme proposed in this paper.

7. Conclusion

In this paper, the MSDG routing scheme combines compressive sensing with routing, which changes the previous sink-centered routing scheme. First, the network is divided into multiple strips. In each strip, the aggregation nodes use the data fusion method to collect and transmit data to the strip center. The CS-nodes in the strip center employ the CS scheme for data collection. The MSDG scheme makes full use of the characteristic of the CS method that multiple data packets can still be fused into one data packet, so the data in each strip can be aggregated into a single fixed-length data packet using the CS routing method. Then, rendezvous routing is used to transmit all the data in the strip to the sink in the form of a data packet of that length. Additionally, the strips and CS-nodes are rotated to achieve a balanced network load. The experimental results and theoretical analysis demonstrate that the MSDG scheme can effectively improve the network lifetime.

Although the MSDG routing scheme can greatly reduce the data load of the nodes and improve the network lifetime, the construction of the MSDG scheme is complicated, and it adopts a centralized control method. Thus, the MSDG scheme may require complicated controls in practical applications. In the future, we will further study adaptive and distributed methods and reduce the complexity of the protocol.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (61772554, 61572528, 61572526) and the National Basic Research Program of China (973 Program) (2014CB046305).

References

[1] M. Huang, A. Liu, N. Xiong, T. Wang, A. V. Vasilakos. A Low-latency Communication Scheme for Mobile Wireless Sensor Control Systems. IEEE Transactions on Systems Man Cybernetics-Systems. (2018) DOI: 10.1109/TSMC.2018.2833204.
[2] F. Li, J. Luo, G. Shi, Y. He. ART: Adaptive fRequency-Temporal co-existing of ZigBee and WiFi. IEEE Transactions on Mobile Computing. 16(3) (2017) 662-674.
[3] S. He, X. Gong, J. Zhang, J. Chen. Curve-based deployment for barrier coverage in wireless sensor networks. IEEE Transactions on Wireless Communications. 13(2) (2014) 724-735.
[4] L. Jiang, A. Liu, Y. Hu, Z. Chen. Lifetime Maximization through Dynamic Ring-Based Routing Scheme for Correlated Data Collecting in WSNs. Computers & Electrical Engineering. 41 (2015) 191-215.
[5] Z. Li, F. Xiao, S. Wang, T. Pei, J. Li. Achievable Rate Maximization for Cognitive Hybrid Satellite-Terrestrial Networks with AF-Relays. IEEE Journal on Selected Areas in Communications Special issue on Advances in Satellite Communications. 26(2) (2018) 304-313.
[6] X. Liu, M. Dong, Y. Liu, A. Liu, N. Xiong. Construction Low Complexity and Low Delay CDS for Big Data Codes Dissemination. Complexity. 2018 (2018) 5429546, DOI: 10.1155/2018/5429546.
[7] M. Huang, Y. Liu, N. Zhang, N. Xiong, A. Liu, Z. Zeng, H. Song. A Services Routing based Caching Scheme for Cloud Assisted CRNs. IEEE Access. (2018) DOI: 10.1109/ACCESS.2018.2815039.
[8] H. Li, D. Liu, Y. Dai, T. Luan. Engineering Searchable Encryption of Mobile Cloud Networks: When QoE Meets QoP. IEEE Wireless Communications. 22(4) (2015) 74-80.
[9] S. He, J. Chen, X. Li, X. Shen, Y. Sun. Mobility and Intruder Prior Information Improving the Barrier Coverage of Sparse Sensor Networks. IEEE Transactions on Mobile Computing. 13(6) (2014) 1268-1282.
[10] X. Xu, M. Yuan, X. Liu, A. Liu, N. Xiong, Z. Cai, T. Wang. Cross-layer Optimized Opportunistic Routing Scheme for Loss-and-delay Sensitive WSNs. Sensors. 18(5) (2018) 1422, DOI: 10.3390/s18051422.
[11] S. He, D. Shin, J. Zhang, J. Chen, Y. Sun. Full-view area coverage in camera sensor networks: dimension reduction and near-optimal solutions. IEEE Transactions on Vehicular Technology. 65 (2016) 7448-7461.
[12] M. Wu, Y. Wu, C. Liu, Z. Cai, N. Xiong, A. Liu, M. Ma. An Effective Delay Reduction Approach through Portion of Nodes with Larger Duty Cycle for Industrial WSNs. Sensors. 18(5) (2018) 1535, DOI: 10.3390/s18051535.
[13] W. Jiang, G. Wang, MZA. Bhuiyan, J. Wu. Understanding graph-based trust evaluation in online social networks: Methodologies and challenges. ACM Computing Surveys (CSUR). 49(1) (2016) 10.
[14] Y. Liu, A. Liu, S. Guo, Z. Li, Y.J. Choi, H.S. Context-aware collect data with energy efficient in Cyber-physical cloud systems. Future Generation Computer Systems. (2017) DOI: 10.1016/j.future.2017.05.029.
[15] H. Luo, Y. Liu, S. K. Das. Routing correlated data in wireless sensor networks: A survey. IEEE Network. 21(6) (2007) 40-47.
[16] S. Pattem, B. Krishnamachari, R. Govindan. The impact of spatial correlation on routing with compression in wireless sensor networks. ACM Transactions on Sensor Networks (TOSN). 4(4) (2008) 24.
[17] C. Intanagonwiwat, R. Govindan, D. Estrin, et al. Directed diffusion for wireless sensor networking. IEEE/ACM Transactions on Networking. 11(1) (2003) 2-16.
[18] EJ. Candès, J. Romberg, T. Tao. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory. 52(2) (2006) 489-509.
[19] C. Luo, J. Sun, F. Wu, C.W. Chen. Compressive data gathering for large-scale wireless sensor networks, in: Proc. ACM Mobicom'09, 2009, 145-156.
[20] D. Ebrahimi, C. Assi. Compressive data gathering using random projection for energy efficient wireless sensor networks. Ad Hoc Networks. 16 (2014) 105-119.
[21] L. A. Villas, A. Boukerche, H. S. Ramos. DRINA: A Lightweight and Reliable Routing Approach for In-Network Aggregation in Wireless Sensor Networks. IEEE Transactions on Computers. 62(4) (2013) 676-689.
[22] R. Cristescu, B. Beferull-Lozano, M. Vetterli. Networked Slepian-Wolf: theory, algorithms, and scaling laws. IEEE Transactions on Information Theory. 51(12) (2005) 4057-4073.
[23] T. Wang, A. Vosoughi, W. Heinzelman, et al. Maximizing gathered samples in wireless sensor networks with Slepian-Wolf coding. IEEE Transactions on Wireless Communications. 11(2) (2012) 751-761.
[24] R. Cristescu, B. Beferull-Lozano, M. Vetterli, et al. Network correlated data gathering with explicit communication: NP-completeness and algorithms. IEEE/ACM Transactions on Networking. 14(1) (2006) 41-54.
[25] A. Liu, P. Zhang, Z. Chen. Theoretical analysis of the lifetime and energy hole in cluster based wireless sensor networks. Journal of Parallel and Distributed Computing. 71(10) (2011) 1327-1355.
[26] A. Liu, X. Liu, T. Wei, L.T. Yang, S.C. Rho, A. Paul. Distributed Multi-representative Re-Fusion approach for Heterogeneous Sensing Data Collection. ACM Transactions on Embedded Computing Systems. 16(3) (2017) 73, DOI: http://dx.doi.org/10.1145/2974021.
[27] X. Li, A. Liu, M. Xie, N. Xiong, Z. Zeng, Z. Cai. Adaptive Aggregation Routing to Reduce Delay for Multi-Layer Wireless Sensor Networks. Sensors. 18(4) (2018) 1216, DOI: 10.3390/s18041216.
[28] M. Huang, A. Liu, M. Zhao, T. Wang. Multi Working Sets Alternate Covering Scheme for Continuous Partial Coverage in WSNs. Peer-to-Peer Networking and Applications. (2018) DOI: 10.1007/s12083-018-0647-z.
[29] Z. Li, B. Chang, S. Wang, A. Liu, F. Zeng, G. Luo. Dynamic Compressive Wide-band Spectrum Sensing Based on Channel Energy Reconstruction in Cognitive Internet of Things. IEEE Transactions on Industrial Informatics. (2018) DOI: 10.1109/TII.2018.2797096.
[30] Y. Wu, X.Y. Li, Y.H. Liu, et al. Energy-Efficient Wake-Up Scheduling for Data Collection and Aggregation. IEEE Transactions on Parallel and Distributed Systems. 21(2) (2010) 275-287.
[31] MZA. Bhuiyan, J. Wu, G. Wang, T. Wang, et al. e-Sampling: Event-Sensitive Autonomous Adaptive Sensing and Low-Cost Monitoring in Networked Sensing Systems. ACM Transactions on Autonomous and Adaptive Systems (TAAS). 12(1) (2017) 1.
[32] MZA. Bhuiyan, G. Wang, J. Wu, J. Cao, et al. Dependable structural health monitoring using wireless sensor networks. IEEE Transactions on Dependable and Secure Computing. 14(4) (2017) 363-376.
[33] F. Tong, L. Zheng, M. Ahmadi, M. Ni, J. Pan. Modeling and analyzing duty-cycling pipelined-scheduling MAC for linear sensor networks. IEEE Transactions on Vehicular Technology. 65(4) (2016) 2608-2620.
[34] F. Tong, R. Zhang, J. Pan. One handshake can achieve more: an energy-efficient, practical pipelined data collection for duty-cycled sensor networks. IEEE Sensors Journal. 16(9) (2016) 3308-3322.
[35] S. Yu, X. Liu, A. Liu, N. Xiong, Z. Cai, T. Wang. Adaption Broadcast Radius based Code Dissemination Scheme for Low Energy Wireless Sensor Networks. Sensors. 18(5) (2018) 1509, DOI: 10.3390/s18051509.
[36] X. Liu, S. Zhao, A. Liu, N. Xiong, A. V. Vasilakos. Knowledge-aware Proactive Nodes Selection Approach for Energy management in Internet of Things. Future Generation Computer Systems. (2017) DOI: 10.1016/j.future.2017.07.022.
[37] OMNet++ Network Simulation Framework, http://www.omnetpp.org/.

Zhetao Li received a B.Eng. degree from Xiangtan University in 2002, an M.Eng. degree from Beihang University in 2005, and a Ph.D. degree in Computer Application Technology from Hunan University in 2010. He is currently a professor of the College of Information Engineering, Xiangtan University. He was a visiting researcher at Ajou University from May to Aug 2012 and from Feb 2013 to Dec 2013. His research interests include (1) wireless networks and the Internet of Things (IoT) and (2) compressed sensing and big data.

Yuxin Liu is currently a student in the School of Information Science and Engineering of Central South University, China. Her research interests are services based networks, crowd sensing networks, and wireless sensor networks.

Ming Ma is currently a Ph.D. student at the Department of Computer Science, Stony Brook University, United States. His research interests include geometric modeling, digital geometric processing, and wireless sensor networks.

Anfeng Liu is a Professor of the School of Information Science and Engineering of Central South University, China. He is also a Member (E200012141M) of China Computer Federation (CCF). He received M.Sc. and Ph.D. degrees from Central South University, China, in 2002 and 2005, respectively, both in computer science. His major research interest is cyber-physical systems.

Xiaozhi Zhang received a B.Sc. in 2014. Currently, she is a master's student in the School of Information Science and Engineering of Central South University, China. Her research interest is wireless sensor networks.

Guangmin Luo received an M.Eng. degree from Chongqing University in 1990 and a Ph.D. degree from Beihang University in 2009. He is currently an associate professor of the College of Information Engineering, Xiangtan University. His research interests include wireless networks and the Internet of Things (IoT).