An efficient adjustable grid-based data replication scheme for wireless sensor networks



ARTICLE IN PRESS

JID: ADHOC

[m3Gdc;July 23, 2015;13:46]

Ad Hoc Networks 000 (2015) 1–11

Contents lists available at ScienceDirect

Ad Hoc Networks journal homepage: www.elsevier.com/locate/adhoc

An efficient adjustable grid-based data replication scheme for wireless sensor networks

Tzung-Shi Chen a, Neng-Chung Wang b,∗, Jia-Shiun Wu c

a Department of Computer Science and Information Engineering, National University of Tainan, Tainan 700, Taiwan, ROC
b Department of Computer Science and Information Engineering, National United University, Miao-Li 360, Taiwan, ROC
c Department of Electrical Engineering, National University of Tainan, Tainan 700, Taiwan, ROC

Article info

Article history: Received 24 April 2014; Revised 17 April 2015; Accepted 6 July 2015; Available online xxx

Keywords: Query; Data replication; Load balancing; Virtual grid; Wireless sensor networks

Abstract

Wireless Sensor Networks (WSNs) consist of hundreds or thousands of sensor nodes connected to each other through short-distance wireless links. Because sensor nodes are energy-constrained, efficient energy use is a critical issue for WSNs. In WSNs, the energy of popular nodes may be quickly depleted when a large number of sensor nodes are interested in the same popular events, and they are frequently queried. This paper proposes a data replication scheme called adjustable data replication (ADR), which is based on a virtual grid in order to improve the lifetime of data nodes. In each grid, a head node is selected as a manager, and is responsible for receiving or transmitting packets from or to other nodes in the same virtual grid. The head nodes around data nodes or replica nodes will compute the number of requested packets in order to determine how best to build a replica node in an appropriate location. ADR repeatedly builds data replica nodes close to query nodes in order to balance the overhead and energy consumption of sensor nodes. Simulation results show that the proposed ADR scheme outperforms existing approaches.

© 2015 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +886 37 381899. E-mail addresses: [email protected] (T.-S. Chen), ncwang@nuu.edu.tw (N.-C. Wang), [email protected] (J.-S. Wu).

1. Introduction

Wireless Sensor Networks (WSNs) consist of hundreds or thousands of sensor nodes connected to each other through short-distance wireless links [1]. In a WSN, energy conservation is a primary concern, since sensor nodes operate on small batteries with limited energy, which limits their lifespan. A greater number of sensor nodes allows for sensing over larger geographical regions with greater accuracy. WSNs have been used extensively in many applications, such as environmental monitoring, vehicular ad-hoc networks and battlefield surveillance [2–5]. Sensor nodes measure ambient conditions in their surrounding environments and then transform these measurements into signals that can be processed to reveal characteristics of phenomena located in the area around the sensor nodes. Since sensor nodes have significant power constraints, energy-efficient methods must be employed to prolong network lifetimes [6]. Controlling sensor energy effectively and efficiently is thus an important and challenging task.

A WSN may consist of a large number of sensor nodes with data sensing, data processing and communication components. Minimizing communication during query execution and data transmission can save a significant amount of energy, and help to prolong the lifetime of a network. In WSNs, the in-network storage schemes proposed to date have been Data-Centric Storage (DCS) schemes [7–11]. DCS schemes store sensor data readings so that queries can be processed efficiently. DCS schemes may suffer from the same energy-depletion problem when query requests or storage requests [12] continuously access the same area or node. The data may be stored or accessed through various routing protocols [13,14].

http://dx.doi.org/10.1016/j.adhoc.2015.07.002 1570-8705/© 2015 Elsevier B.V. All rights reserved.

Please cite this article as: T.-S. Chen et al., An efficient adjustable grid-based data replication scheme for wireless sensor networks, Ad Hoc Networks (2015), http://dx.doi.org/10.1016/j.adhoc.2015.07.002


In real-world applications, data in WSNs are not uniformly distributed. This study considers sensor nodes deployed in a disaster environment. Sensor nodes used for data storage are called data nodes, and every sensor node in a WSN can act as a query node. When a data node receives query packets from query nodes, it sends data to those nodes. By using the DCS mechanism, nodes can determine where the required data are stored. However, when the query frequency is larger than the storage frequency, the data node must expend more energy on replying to queries. This increased query frequency may rapidly deplete the data node's power. This phenomenon is called the query hotspot problem.

This paper proposes a data replication scheme called Adjustable Data Replication (ADR). The scheme uses a virtual grid to mitigate traffic overloading in hotspots. When the data traffic rate of a data node increases, the node marks itself as a hotspot and collects statistics on the query frequency from neighboring nodes. The data node then sends a replication packet to neighboring nodes; the replicated data are the same as the original data. The neighboring nodes that receive the data are called replica nodes. When replica nodes begin to receive fewer queries, the scheme dynamically removes them. ADR is thus able to use a few replica nodes to maintain a routing tree, and conserve node energy in a WSN. The contributions of the proposed scheme are given below:

1. The ADR scheme uses a few replica nodes to maintain a routing tree, reducing the energy consumption of data nodes.
2. The ADR scheme uses a virtual grid to mitigate traffic overloading in hotspots, balancing the load and energy consumption of sensor nodes.
3. The ADR scheme performs well in terms of the number of replica nodes, the number of packets, the energy consumption, and the query latency.

The remainder of this paper is organized as follows. Section 2 reviews related work. Assumptions on the environment and the network model are introduced in Section 3. Section 4 describes the proposed data replication scheme for WSNs. Simulation results are described in Section 5. In Section 6, conclusions are drawn.

2. Related work

In the DCS routing protocol, nodes requesting data may encounter an over-emphasis on data storage and data replication [8,11]. Sensor nodes positioned close to the sink or to a heavily queried target may have to forward larger amounts of data to the next hop. When sensor nodes constantly forward data to a static sink using their nominal communication range, the energy sink-hole problem occurs at the sensor nodes near the sink [15,16]. A number of studies to date have proposed various solutions to this problem.

Cheng proposed a Grid-based Data Replication algorithm (GDR) [17] to mitigate the query overload of a target area. GDR is a strategy that reduces query hotspots based on a virtual grid structure. In GDR, each grid has a head node that is responsible for receiving or transmitting packets from or to other nodes in the same virtual grid. The head nodes around data nodes or replica nodes compute the number of requested packets in order to determine data replication. When the average query frequency is greater than the overload threshold, the data node produces a replica and places it in a head node.

Ishihara and Suda proposed a replica arrangement scheme called Dynamic Replica Arrangement on Concentric Circular Arcs (DRACA) [18]. DRACA is based on concentric circles, and builds new replica nodes in fixed target positions. In DRACA, the replica nodes of a data item are placed on nodes closer to the locations from which queries are frequently sent. A replica node causes the nodes on the same circular arc, centered on the hashed location of the key, to hold pointers that indicate the position of the replica node. On the way to the hashed location, a query message may arrive at a node with a pointer to a replica node. That node then forwards the query message to the location of the replica node. When the replica node receives the query message, it sends a reply message to the query node. If there are replica nodes and pointers near a query node, the communication cost of sending the query and reply messages is small. In DRACA, replica nodes are arranged according to the geographical distribution of the query frequency. New replica nodes and pointers are arranged based on whether and where the query frequency increases. The basic concept of DRACA is to adaptively arrange replica nodes at positions close to query nodes that frequently send queries. The basic idea of GDR is that the first replica node can take over almost half of the query frequency.

However, DRACA and GDR have a major drawback, in that the replica nodes cannot be dynamically removed. DRACA must first remove the outermost nodes until the target node is at the outermost position, and only then can the target node be removed. GDR is also very restrictive in this sense, because each later replica node is built on the boundary of the previous replication. When removing replica nodes, GDR must first remove the tail-end replica node in order to maintain the entire system. However, tail-end replica nodes often contribute most to dispersing traffic and reducing the distance between replica and query nodes.

3. Assumptions and network model

In this section, some basic assumptions are first introduced, and the proposed network model is then presented.

3.1. Assumptions

The proposed method rests on the following basic assumptions. The monitoring field consists of a large number of homogeneous sensor nodes which communicate with each other through short-range radio signals. Each sensor node is aware of its own location, for example, through Global Positioning System (GPS) signals [19], or through techniques such as [20,21]. In the proposed scheme, each sensor node stores data, and the query methods are based on the Geographic Hash Table (GHT) for data-centric storage [11]. All packets forwarded by sensor nodes use Greedy Perimeter Stateless Routing for wireless networks (GPSR) [22]. Mobile issuer nodes can request information of interest from anywhere. The proposed system is a DCS system. As in other DCS systems, it assumes that a WSN is divided into grids, where each pair of nodes in neighboring grids can communicate directly with each other. The grid is the minimum unit for detecting events and for storing sensor data. It is important that the sensor node deployment does not have overlay problems.

Fig. 1. The relationship between d and R (d: the side length of grids; R: communication range).

Fig. 2. Virtual grids partitioning a physical area (grids of side length d, indexed along the X and Y axes as (1, 1), (2, 1), (1, 2), ...).

3.2. Network model

Sensor nodes must be aware of their physical locations. This information may be provided by the Global Positioning System (GPS) [19] or other methods [20,21]. In GPS, receivers are used to estimate the positions of the nodes in mobile ad-hoc networks. GPS uses atomic clocks for time synchronization. Each GPS satellite transmits signals to sensor nodes on the ground, providing its location and the current time. A node estimates the distance to each GPS satellite from the time the signal takes to reach the node. Once the distances from four GPS satellites have been estimated, a sensor node can calculate its position in three dimensions [23]. It can thus be assumed that all sensor nodes know their locations and are time-synchronized.

First, the side length of the grids is chosen to ensure that the head nodes in each pair of neighboring grids can communicate directly with each other [24]. In Fig. 1, node A communicates with node B, which lies at the farthest diagonal point of the neighboring grid. Let R be the transmission distance of a radio signal, and d be the side length of the grids. The relationship between d and R must be predefined because a head node must be able to communicate with any node in its neighboring grids. Here the value of d is selected as $d \le R/\sqrt{5}$.
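The bound on d can be checked numerically. The sketch below assumes that "neighboring grids" means two grids sharing a side, so that the farthest pair of nodes (A and B in Fig. 1) is separated by the diagonal of a d × 2d rectangle, i.e. √5·d. The function names are illustrative and do not come from the paper; the value R = 112 m is the transmission range reported in the simulation settings.

```python
import math


def max_grid_side(comm_range: float) -> float:
    """Largest grid side d that satisfies d <= R / sqrt(5)."""
    if comm_range <= 0:
        raise ValueError("communication range must be positive")
    return comm_range / math.sqrt(5)


def worst_case_distance(side: float) -> float:
    """Farthest separation of two nodes in side-adjacent grids
    (assumption: the diagonal of a d x 2d rectangle)."""
    return math.hypot(side, 2 * side)


R = 112.0  # metres, transmission range from the simulation settings
d = max_grid_side(R)
print(round(d, 2))  # 50.09

# With d at its upper bound, the worst-case pair is still within range R.
assert worst_case_distance(d) <= R + 1e-9
```

Under this assumption, a transmission range of 112 m permits grids of up to about 50 m per side.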

The sensing range has a significant impact on the performance of a sensing application, and is usually determined empirically in order to satisfy the Signal-to-Noise Ratio (SNR) [25] required by the application. In practice, both communication and sensing ranges are highly dependent on the system platform, the application and the environment. The communication range of a wireless network interface depends on the properties of the radio (e.g., transmission power, base band/wide-band, and antenna) and the environment (e.g., indoor or outdoor). Since network connectivity is necessary for any routing algorithm to find a routing path, it is reasonable to assume that the communication range is around double the sensing range [26,27]. This paper assumes that the ratio of sensing range to communication range is 1/2. With this value, the sensing range can guarantee full coverage of the grid.

The monitored area (sensor field) is divided into virtual grids. Each node associates itself with a virtual grid, depending on its physical location, as shown in Fig. 2. Every sensor node has a grid ID, denoted Grid(X, Y), where X is the x-coordinate of the grid and Y is the y-coordinate of the grid. When the sensor nodes have been deployed, each grid must select a head node in order to sense, receive and relay data. Head nodes also maintain and manage energy consumption within their grids. First, the sensor nodes send hello packets to collect information about neighboring nodes. Using this information, the sensor nodes in a grid select the node nearest the central position whose energy is greater than the threshold as the head node of their grid. Nodes not selected as head nodes remain in sleep mode. If a head node's energy falls below the threshold, it sends a hello packet to select a new head node in its own grid. Head nodes are also responsible for collecting information such as query packets from nodes in their virtual grid, and forwarding that information to data nodes. Whenever a head node receives query packets from other nodes within the same grid, that head node becomes a query issuer.

When many query issuers send requests to a data node, query packets arrive from all directions, and the data node becomes overloaded. In order to determine the query frequency of a data node, the query frequency of the virtual grids around the data node must be computed. Query packets and data packets are forwarded through the head nodes of the virtual grids. The head nodes of the virtual grids surrounding a data node are called boundary nodes, and the virtual grids containing boundary nodes are called boundary grids. The overload threshold is therefore defined in terms of the query frequency of one or more boundary nodes, as shown in Fig. 3. When n9 is a data node, many query packets are sent to it from other nodes. Because these query packets reach the data node through its neighboring nodes, the number of packets handled by the neighboring nodes is used to determine data replication.

Fig. 3. Replication threshold definition (nodes n1–n9; legend: data node, head node, query direction).

The proposed scheme has data nodes produce replicas and place them in the directions from which many queries arrive. The replicas hold the same data as the data node, and the nodes holding them are called replica nodes. The boundary nodes of each replica node change the query packet routing, and send packets to replica nodes. Replica nodes receive query packets and return data to query nodes, so that the scheme reduces the number of queries sent to the data nodes. Because energy is consumed in building and updating replica nodes, the scheme does not produce a large number of replica nodes. It must find a balance between reducing the query packets transmitted to the data node and limiting the number of replica nodes. The proposed scheme therefore places replica nodes in the directions with large numbers of queries. This relieves heavily queried nodes and reduces the time needed to return data to the query nodes. A detailed description is given in the next section.

In the proposed scheme, when a service is stored in a data node, the data node broadcasts a Data_Packet to all nodes, and each node constructs the data structure of the GPSR route, called the Routing_Packet. When a node wants to request data, it sends a Query_Packet to all nodes. When the query frequency at a boundary node of a data node exceeds the overload threshold, the data node starts the ADR scheme to reduce the packet flow: it sends a Replica_packet. After a new replica node is produced, it sends a Boundary_Packet to build its boundary nodes. The data node then updates all replica nodes via a Tree_Packet.

4. Adjustable data replication

This section presents the proposed adjustable data replication (ADR) scheme.
The proposed scheme establishes replica nodes as close as possible to regions that generate many query packets.

4.1. Determining the surrounding boundary nodes

In the proposed scheme, in order to calculate the query frequency of a target data node or replica node, the positions of the virtual grids of its boundary nodes must be determined. When query issuers send query packets to data nodes or replica nodes, the packets are forwarded through boundary nodes. The boundary nodes surround the data nodes and replica nodes, and they can also redirect query packets to new replica nodes. Boundary nodes therefore know the query frequency to data nodes or replica nodes.

The boundary size can be set as k1 × k2 (k1, k2 ≥ 3 and k1, k2 ∈ N) according to the WSN environment. For example, the size of the boundary is 3 × 3 in Fig. 4(a), 3 × 5 in Fig. 4(b) and 6 × 4 in Fig. 4(c). A boundary of size k1 × k2 contains (k1 × k2) − (k1 − 2) × (k2 − 2) = 2 × (k1 + k2) − 4 boundary grids; a 3 × 5 boundary, for instance, contains 2 × (3 + 5) − 4 = 12 boundary grids.

Fig. 4. The size of boundary nodes (legend: data node; grid of boundary node).

The data nodes and replica nodes know their own positions, and can therefore compute the virtual grid positions of their boundary nodes on their own. With the position of the virtual grid, the range of the boundary nodes for Grid(X, Y) can be computed. The maximum and minimum values of the x-coordinate are X + (k1 − 2) and X − 1; likewise, the maximum and minimum values of the y-coordinate are Y + (k2 − 2) and Y − 1. From these values, data nodes or replica nodes can compute the query frequency of their boundary nodes.

After the boundary nodes have been built, they are divided into eight groups, and new replica nodes are placed according to the query frequency of those eight groups. The eight groups consist of four single corner grids: Grid(X − 1, Y − 1), Grid(X − 1, Y + (k2 − 2)), Grid(X + (k1 − 2), Y − 1) and Grid(X + (k1 − 2), Y + (k2 − 2)); and four side groups: the union of Grid(X, Y − 1) to Grid(X + (k1 − 3), Y − 1), the union of Grid(X, Y + (k2 − 2)) to Grid(X + (k1 − 3), Y + (k2 − 2)), the union of Grid(X − 1, Y) to Grid(X − 1, Y + (k2 − 3)), and the union of Grid(X + (k1 − 2), Y) to Grid(X + (k1 − 2), Y + (k2 − 3)). The eight groups of boundary nodes correspond to the eight directions in which replica nodes can be built, as shown in Fig. 5.

Fig. 5. Eight groups of boundary nodes.

After the boundary nodes have been built, the next step is to select eight grids in order to compute the query frequency. For any boundary size, each of the four corner groups consists of a single grid, which serves as the main corner grid. The four side groups, however, differ with the boundary size. For each side group, the data node or replica node selects one main grid, as shown in Fig. 6. The x-coordinate or y-coordinate of each of the four main side grids is the same as that of the data node or replica node.

Fig. 6. Four main grids of each side grid.

As shown in Fig. 7, changing the size of the boundary affects three things. The first is the speed at which replicas move from the data node toward the query nodes as the query frequency increases. The second is the total number of replica nodes. The third is the average query hop count once ADR has been built, which differs for each size. The flow diagram for determining the boundary nodes of a data node and replica node with boundary size k × k is described in Fig. 8.

Fig. 7. Differences of different sizes of the boundary nodes (panel labels: "Hop count is 1 or 0"; "Hop count is 0 to 4").

Fig. 8. The flow diagram of determining the boundary nodes of data node and replica node with size of k × k. Notation: Grid(X, Y) denotes the grid of the sensor node with grid location (X, Y); nij denotes the sensor nodes, where i, j > 0; rs denotes the replication of sensor nodes, where s > 0; d denotes the original data node or old replication of sensor nodes.

For convenience, a boundary size of 3 × 3 is used to describe the proposed approach. The total number of boundary grids is 2 × (3 + 3) − 4 = 8; these are the eight grids adjacent to the target data node or target replica node. In Fig. 9, Grid(2, 2) is the virtual grid containing the original data node. With the position of the virtual grid, the range of the boundary nodes for Grid(X, Y) can be computed. The maximum and minimum values of the x-coordinate are X + 1 and X − 1; likewise, the maximum and minimum values of the y-coordinate are Y + 1 and Y − 1. When a new replica node is built by a data node or a replica node, the new replica node sends a having_replica packet to its father node, which is the data node or the replica node on the former level. In Fig. 10, Grid(2, 2) is the virtual grid of the data node, and Grid(5, 2) is the virtual grid of the replica node.

Fig. 9. Boundary nodes of data nodes (legend: grid of data node; grid of boundary node).

Fig. 10. Replication of boundary nodes (legend: grid of data node; boundary node of data node; grid of replica node; boundary node of replica node).

4.2. Data replication of corner boundary nodes

In the proposed scheme, eight groups of boundary nodes are used to build replica nodes in eight directions. However, grids are not round, and each of the eight groups corresponds to a different area. Different settings must therefore be given to each group. The groups fall into two cases: corner and side. The side case covers the grids along one side, excluding the corner grids. Here, this study focuses on the corners. When a data node or a replica node builds a new replica node in the direction of a corner, the new replica node may be useless. For example, as shown in Fig. 11, when many queries come from above the corner, the new replica node is unable to share the load.

Fig. 11. Misjudged situation of replica node location.

In order to avoid this problem, the following mechanism is used. When a replica node is generated, the query frequencies of the three grids outside the corner node are compared, as shown in Fig. 12(a). If the highest query frequency comes from the outer corner of the corner node, the replica node is built in the direction of the corner, as shown in Fig. 12(b). If the highest query frequency comes from one of the two lateral grids of the corner node, the replica node is built in the direction of the corresponding side, as shown in Fig. 12(c). The flow diagram of data replication of corner boundary nodes is described in Fig. 13.

4.3. Removal of replica nodes

It is assumed that the query frequency of the environment cannot be predicted. At one moment the query frequency may be high, and at the next it may be zero. This happens if some of the original query nodes no longer send query packets. When the query frequency of a replica node is lower than the threshold, the replica node should be removed in order to reduce the cost of updating. A data node and its replica nodes are related as father nodes and child nodes; this relationship is called a replication tree. Once a node has been removed, the relationship between data nodes and replica nodes changes. Nodes are removed according to two cases.

In the first case, shown in Fig. 14, the target replica node is at the end of the replication tree. Such a replica node has no child node, so it only notifies its father node. The replica node is then removed from the father's table, and update information is no longer sent to it.

In the second case, shown in Fig. 15, the replica node is in the middle of the replication tree, so the target replica node has both a father node and child nodes. In this case, the node to be removed informs its father node and its child nodes. The replica node is removed from the table, and update information is no longer sent to it. In addition, the node must instruct its father node and child nodes to establish a new relationship with each other. The flow diagram of the data replication removal is described in Fig. 16.
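The two removal cases can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class `ReplicaNode` and the function `remove_replica` are hypothetical names. It also assumes that, when a middle node is removed, its child nodes are re-attached directly to its father node, which is one way to "establish a new relationship" between them.

```python
class ReplicaNode:
    """A node of the replication tree (the data node is the root)."""

    def __init__(self, name):
        self.name = name
        self.father = None
        self.children = []

    def add_child(self, child):
        child.father = self
        self.children.append(child)
        return child


def remove_replica(node):
    """Remove a replica node from the replication tree.

    Case 1 (leaf): the father simply drops the node from its table.
    Case 2 (middle): the father also adopts the node's children
    (assumption: children are re-attached to the removed node's father).
    """
    father = node.father
    if father is None:
        raise ValueError("the original data node (root) cannot be removed")
    father.children.remove(node)      # father stops sending updates to node
    for child in node.children:       # empty for a leaf, so case 1 ends here
        child.father = father
        father.children.append(child)
    node.children = []
    node.father = None


# Example tree: d -> r1 -> {r2, r3}
d = ReplicaNode("d")
r1 = d.add_child(ReplicaNode("r1"))
r2 = r1.add_child(ReplicaNode("r2"))
r3 = r1.add_child(ReplicaNode("r3"))

remove_replica(r1)                     # middle of the tree
print([c.name for c in d.children])    # ['r2', 'r3']
remove_replica(r3)                     # end of the tree
print([c.name for c in d.children])    # ['r2']
```

In a deployed network the trigger would be the condition described above (the query frequency of a replica node dropping below the threshold), and the table updates would be carried by packets rather than by direct object references.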

Yes

the direction is a corner boundary node

qh denotes the query issuer of sensor node, where h > 0

4.4. Replication tree

In ADR, data nodes produce replica nodes and place them in assigned positions. A data node is the father node of every replica node it produces, and the data held by a replica node comes from that data node. A replica node that later produces new replica nodes plays the same role for them as a data node does. Therefore, this study proposes a data replication tree in which the original data node is the root and the replica nodes are its child nodes. To begin, a data node makes a new replica node and becomes its father node. After each new replica node is built, the data node or the replica node on the level above updates its registration information. Once a replication tree is built, every replica node in the tree must be kept up to date. Data nodes send update information to their own child nodes. When the child nodes of a data node receive update information, they update their data and forward the update information to their own child nodes. In this way, the replica nodes are updated layer by layer down to the last replica nodes.

Fig. 12. Three additional grids for corner nodes. [Panels (a), (b) and (c): grids with X and Y axes from 0 to 6, showing replica node r1. Legend: data node; replica node; boundary node of data node; boundary node of replica node; boundary node of grid with the greatest number of packets; query direction.]

Fig. 13. The flow diagram of the corner boundary nodes. [Recoverable steps: determine the transmission direction according to the highest query frequency of the outside nodes of the corner nodes; d sends a replica packet to the replica node rs in the direction of the corner or the corresponding side; End. Notation: T denotes the overload threshold.]

Fig. 14. Removing replica nodes from the end of a replication tree. [Nodes r1 to r4. Legend: data node; replica node; grid of boundary node.]

Fig. 15. Removing replica nodes from the middle of a replication tree. [Nodes r1 to r4. Legend: data node; replica node; grid of boundary node.]

[Flow diagram of replica removal. Start; if rs does not receive query packets for a time greater than t, rs sends a replica_remove_packet to the father node of rs; if d receives the replica_remove_packet and d is the father node of rs, d compares the replication table of Grid(X, Y) and updates the replication tree; End. Notation: Grid(X, Y) denotes the grid of the sensor node with grid location (X, Y); rs denotes a replica node, where s > 0; d denotes the original data node or an older replica node; t denotes a unit of time in which no query packets are received.]
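The replication tree described in Section 4.4 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `Node`, the functions `propagate_update`, `is_idle` and `remove_replica`, and the integer stand-in for sensor data are all hypothetical names chosen for illustration. In particular, re-attaching the children of a removed middle replica to its father is an assumption; the paper only states that the father compares the replication table and updates the replication tree.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by (recursive) field values
class Node:
    name: str
    data: int = 0                        # stand-in for the stored sensor data
    father: Optional["Node"] = None      # None for the original data node (root)
    children: List["Node"] = field(default_factory=list)

    def add_replica(self, name: str) -> "Node":
        """Create a replica with a copy of this node's data and register it here."""
        child = Node(name, data=self.data, father=self)
        self.children.append(child)      # the father's registration information
        return child


def propagate_update(root: Node, data: int) -> List[str]:
    """Push new data from the root down the tree, one layer after another."""
    visited = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        node.data = data
        visited.append(node.name)
        queue.extend(node.children)      # each node forwards to its own children
    return visited


def is_idle(last_query_time: float, now: float, t: float) -> bool:
    """A replica is a removal candidate if it saw no query for longer than t."""
    return now - last_query_time > t


def remove_replica(node: Node) -> None:
    """Detach a replica; ASSUMPTION: its children are re-attached to its father."""
    father = node.father
    if father is None:
        raise ValueError("the original data node is the root and is not removed")
    father.children.remove(node)
    for child in node.children:          # empty when removing from the end
        child.father = father
        father.children.append(child)
    node.children = []
    node.father = None


# Build d -> r1 -> r2 -> {r3, r4}, update every replica, then remove two replicas.
d = Node("d", data=1)
r1 = d.add_replica("r1")
r2 = r1.add_replica("r2")
r3 = r2.add_replica("r3")
r4 = r2.add_replica("r4")
update_order = propagate_update(d, 2)
remove_replica(r4)                       # removal from the end of the tree
remove_replica(r2)                       # removal from the middle of the tree
```

The breadth-first queue mirrors the layer-by-layer update, and the two removal calls correspond to the end-of-tree and middle-of-tree cases of Figs. 14 and 15.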

Fig. 16. The flow diagram of the data replication removal.

5. Simulation results

In this section, we describe simulations conducted to compare the performance of GPSR [22], GDR [17], DRACA [18] and the proposed ADR. The simulation tool NS-2 [28], version 2.31, was used. The WSN consists of 200 sensor nodes deployed randomly in a 500 m × 500 m area. The transmission range of each node is set to 112 m, and the radius of the sensing range is set to 50 m. Each grid has at least one sensor node that is responsible for monitoring and data transmission. The initial battery energy is set to 10 J. In the simulations, many query nodes query the data of one data node. A structure is constructed for querying the data node. For each metric, 40 rounds are conducted, and each reported value is the average over all rounds.

5.1. Impact of threshold

In Fig. 17, different thresholds are set for ADR, ranging from 1 to 20 packets/s. Initially, the data node becomes overloaded and starts the ADR algorithm. When the threshold is 1 or 2 packets/s, too many replica nodes are made; although the packet flow is reduced and distributed evenly among the replica nodes, the cost of updating the data is too great. When the threshold is 16 to 20 packets/s, many packets still query the data node, and the data node sends many packets to query issuers. Here, thresholds of 7 to 12 packets/s give better results.

Fig. 17. Number of replica nodes with different threshold values for ADR. [Plot: replica nodes (number), 0 to 16, versus threshold (packets/sec), 1 to 20; series: ADR.]

These findings show that ADR does not have a single best threshold value: an optimal setting can be obtained only when all the conditions are given. For example, if the average query frequency becomes much higher than before, changing the threshold alone is not enough. It is also necessary to adjust the boundary size and to build more or fewer replica nodes.
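To make the role of the threshold concrete, the following Python sketch shows one way an overload check could trigger replication. It is an illustration under stated assumptions, not the ADR algorithm itself: the function name `choose_replica_direction`, the four direction labels, and the rule that the total query rate over a time window is compared with the threshold T are all hypothetical. The choice of the direction with the highest query frequency follows the step shown in the flow diagram of Fig. 13.

```python
from typing import Dict, Optional


def choose_replica_direction(
    query_counts: Dict[str, int], window_s: float, threshold: float
) -> Optional[str]:
    """Return the direction for a new replica, or None if the node is not overloaded.

    query_counts maps a direction label to the number of query packets received
    from that direction during the last window_s seconds (hypothetical bookkeeping).
    threshold is the overload threshold T in packets/s.
    """
    total_rate = sum(query_counts.values()) / window_s
    if total_rate <= threshold:
        return None                       # not overloaded: build no replica
    # Overloaded: place the replica toward the highest query frequency.
    return max(query_counts, key=query_counts.get)


# The same traffic evaluated against a low, a medium and a high threshold.
counts = {"north": 3, "east": 12, "south": 1, "west": 2}
low = choose_replica_direction(counts, 1.0, 2.0)
mid = choose_replica_direction(counts, 1.0, 10.0)
high = choose_replica_direction(counts, 1.0, 20.0)
```

With a very low threshold almost any traffic triggers replication, which matches the observation that thresholds of 1 or 2 packets/s produce too many replica nodes, while a very high threshold leaves the data node to answer most queries itself.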

5.2. Impact of replication

Two types of query node distributions were simulated. One is the flat type, shown in Fig. 18(a), in which all nodes send queries at the same frequency. The other is the spot type, shown in Fig. 18(b), in which 80–100% of query packets come from nodes in sixteen corner grids on the same side. The following simulations are discussed in terms of these two distributions.

In Fig. 19, the numbers of replica nodes in a flat distribution of query nodes are compared. Because GPSR does not use replication, it has no replica nodes. GDR, DRACA and ADR place the replica nodes in the direction of high query frequency. However, in GDR, replica nodes cannot be removed even when they no longer serve queries. In fact, most of the query packets come from outside the boundary node. The replica nodes at the fringe and the replica nodes close to the center mitigate different query packets. Because the second layer and the third layer of the DRACA scheme are so close, replica nodes in the second layer do not mitigate query packets as the other replica nodes do. In a flat distribution of query nodes, ADR produces about 70% fewer replica nodes than GDR and about 60% fewer than DRACA.

In Fig. 20, the numbers of replica nodes in a spot distribution of query nodes are compared. Because ADR places replica nodes near query issuers and removes useless replica nodes, it produces about 65% fewer replica nodes than GDR and about 15% fewer than DRACA.

Fig. 18. Two types of query node distributions. [Panels: (a) Flat; (b) Spot.]

Fig. 20. Number of replica nodes in a spot distribution of query nodes for GPSR, GDR, DRACA and ADR. [Plot: replica nodes (number) versus time (sec); series: GPSR, GDR, DRACA, ADR.]
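The flat and spot query distributions of Section 5.2 can be sketched as a small workload generator. This is a hedged illustration only: the function `pick_query_grid`, the 8 × 8 grid size, the 4 × 4 corner block used as the sixteen spot grids, and the 0.9 spot share (one value inside the stated 80–100% range) are assumptions, not parameters taken from the paper.

```python
import random
from typing import List, Tuple

Grid = Tuple[int, int]


def pick_query_grid(
    kind: str, all_grids: List[Grid], spot_grids: List[Grid],
    spot_share: float, rng: random.Random
) -> Grid:
    """Return the grid of the next query issuer for a 'flat' or 'spot' workload."""
    if kind == "flat":
        return rng.choice(all_grids)          # every grid is equally likely
    if kind == "spot":
        if rng.random() < spot_share:
            return rng.choice(spot_grids)     # most queries come from the spot
        others = [g for g in all_grids if g not in spot_grids]
        return rng.choice(others)
    raise ValueError("kind must be 'flat' or 'spot'")


# Hypothetical layout: 8 x 8 virtual grids, spot = 4 x 4 block in one corner.
GRID_SIZE = 8
ALL_GRIDS = [(x, y) for x in range(GRID_SIZE) for y in range(GRID_SIZE)]
SPOT_GRIDS = [(x, y) for x in range(4) for y in range(4)]

rng = random.Random(1)
spot_sample = [pick_query_grid("spot", ALL_GRIDS, SPOT_GRIDS, 0.9, rng)
               for _ in range(10000)]
flat_sample = [pick_query_grid("flat", ALL_GRIDS, SPOT_GRIDS, 0.9, rng)
               for _ in range(10000)]
spot_fraction = sum(g in SPOT_GRIDS for g in spot_sample) / len(spot_sample)
flat_fraction = sum(g in SPOT_GRIDS for g in flat_sample) / len(flat_sample)
```

In the spot workload about nine queries in ten originate in the corner block, whereas in the flat workload the block receives only its proportional share of the queries (16 of 64 grids under these assumptions).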

Fig. 19. Number of replica nodes in a flat distribution of query nodes for GPSR, GDR, DRACA and ADR. [Plot: replica nodes (number) versus time (sec); series: GPSR, GDR, DRACA, ADR.]

Fig. 21. Packet flow in a flat distribution of query nodes for GPSR, GDR, DRACA and ADR. [Plot: packet flow (%), 0 to 100, for each scheme.]

Fig. 22. Energy consumption in a flat distribution of query nodes for GPSR, GDR, DRACA and ADR. [Plot: energy consumption (%), 0 to 100, for each scheme.]

5.3. Impact of packet flow

Next, the packet flow in a flat distribution of query nodes is compared for GPSR, GDR, DRACA and ADR. All nodes in the network randomly generate 0–4 query packets per second for the data node. Because of this query frequency, the probability of collisions in the network is higher than in the other simulation scenarios. As shown in Fig. 21, GDR, DRACA and ADR are able to mitigate the packet flow to the data node. In ADR, the final packet flow is much lower than in the other schemes. This is because GDR has far more replica nodes than ADR, and its forwarding paths between boundary nodes and replica nodes are long. The packet flows with DRACA and ADR are initially similar, but in the final stage ADR has the largest number of query nodes that send and receive packets over only one hop.

5.4. Impact of energy consumption

The energy consumption in a flat distribution of query nodes is compared in Fig. 22. The total energy includes update packets. A data packet consumes 0.01 J and any other packet consumes 0.001 J for each hop. Compared with the other schemes, ADR reduces the total energy consumption by up to 33%.
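The per-hop energy costs stated above lend themselves to a short accounting sketch in Python. The costs of 0.01 J for a data packet and 0.001 J for any other packet come from the text; the function names `total_energy` and `reduction_percent`, the packet kind labels, the reading that both costs apply per hop, and the hop counts in the example are assumptions made for illustration and are not simulation results.

```python
from typing import Iterable, Tuple

DATA_PACKET_J = 0.01     # energy per hop for a data packet (from the text)
OTHER_PACKET_J = 0.001   # energy per hop for any other packet (from the text)


def total_energy(transmissions: Iterable[Tuple[str, int]]) -> float:
    """Sum the energy of (kind, hops) transmissions; kind 'data' uses the data cost."""
    total = 0.0
    for kind, hops in transmissions:
        per_hop = DATA_PACKET_J if kind == "data" else OTHER_PACKET_J
        total += per_hop * hops
    return total


def reduction_percent(baseline: float, value: float) -> float:
    """Relative saving of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


# Hypothetical example: one query answered by a data node 4 hops away,
# versus the same query answered by a replica node 1 hop away.
without_replica = total_energy([("query", 4), ("data", 4)])
with_replica = total_energy([("query", 1), ("data", 1)])
saving = reduction_percent(without_replica, with_replica)
```

A complete accounting would also add the update packets sent down the replication tree, which is why building too many replica nodes can cancel out the savings on query traffic.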


In Fig. 23, the energy consumption in a spot distribution of query nodes is also compared. ADR shows a 55% reduction in total energy consumption.

Fig. 23. Energy consumption in a spot distribution of query nodes for GPSR, GDR, DRACA and ADR. [Plot: energy consumption (%), 0 to 100, for each scheme.]

5.5. Impact of query latency

Finally, the average query latency times of query nodes in GPSR, GDR, DRACA and ADR are compared. The average query latency time includes the time taken for query nodes to send query packets to the data node and the time taken for query nodes to receive the data. In Fig. 24, the original data is in the center of the WSN. Because the packet flow in ADR is lower than in the other schemes, and most query nodes send and receive packets over only one hop, ADR performs better than the other schemes.

Fig. 24. Average query latency time of query nodes for GPSR, GDR, DRACA and ADR. [Plot: latency time (sec), 0 to 12, versus time (sec), 10 to 50; series: GPSR, GDR, DRACA, ADR.]

6. Conclusion

In this paper, an adjustable data replication (ADR) scheme was proposed in order to reduce query packets when a large number of queries occur in a WSN. In the proposed ADR, replica nodes are dynamically determined in a grid-based WSN and can also be dynamically removed. The boundary size is adjusted as well, so that more or fewer replica nodes are built. Simulation results showed that the proposed ADR scheme outperformed existing approaches in terms of the number of replica nodes, the number of packets, the energy consumption, and the query latency. In particular, the energy consumption of ADR was reduced by up to 33% for the flat distribution and by up to 55% for the spot distribution.

Acknowledgments

This work was supported by the Ministry of Science and Technology of Republic of China under Grants MOST-103-2221-E-024-006 and MOST-103-2221-E-239-024.

References

[1] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, A survey on sensor networks, IEEE Commun. Mag. 40 (8) (2002) 102–116.
[2] L. Selavo, A. Wood, Q. Cao, T. Sookoor, H. Liu, A. Srinivasan, Y. Wu, W. Kang, J. Stankovic, J.P.D. Young, LUSTER: wireless sensor network for environmental research, in: Proceedings of the 5th ACM Conference on Embedded Networked Sensor Systems, Sydney, Australia, 2007, pp. 103–116.
[3] R. Mittal, M.P.S. Bhatia, Wireless sensor networks for monitoring the environmental activities, in: Proceedings of the IEEE International Conference on Computational Intelligence and Computing Research, Tamilnadu, India, 2010, pp. 1–5.
[4] A.K. Jain, A. Khare, K.K. Pandey, Developing an efficient framework for real time monitoring of forest fire using wireless sensor network, in: Proceedings of the IEEE International Conference on Parallel Distributed and Grid Computing, Himachal Pradesh, India, 2012, pp. 811–815.
[5] O. Evangelatos, K. Samarasinghe, J. Rolim, Syndesi: a framework for creating personalized smart environments using wireless sensor networks, in: Proceedings of the IEEE International Conference on Distributed Computing in Sensor Systems, Cambridge, MA, USA, 2013, pp. 325–330.
[6] S. Lin, J. Zhang, G. Zhou, L. Gu, T. He, J.A. Stankovic, ATPC: adaptive transmission power control for wireless sensor networks, in: Proceedings of the 4th ACM Conference on Embedded Networked Sensor Systems, Boulder, Colorado, USA, 2006, pp. 223–236.
[7] M. Aly, K. Pruhs, P.K. Chrysanthis, KDDCS: a load-balanced in-network data-centric storage scheme for sensor networks, in: Proceedings of the ACM International Conference on Information and Knowledge Management, Arlington, Virginia, USA, 2006, pp. 317–326.
[8] C. Intanagonwiwat, R. Govindan, D. Estrin, Directed diffusion: a scalable and robust communication paradigm for sensor networks, in: Proceedings of the ACM Annual International Conference on Mobile Computing and Networking, Chicago, Illinois, 2000, pp. 56–67.
[9] X. Li, Y.J. Kim, R. Govindan, W. Hong, Multi-dimensional range queries in sensor networks, in: Proceedings of the International Conference on Embedded Networked Sensor Systems, Los Angeles, California, USA, 2003, pp. 63–75.
[10] J. Newsome, D. Song, GEM: graph embedding for routing and data-centric storage in sensor networks without geographic information, in: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, Los Angeles, California, USA, 2003, pp. 76–88.
[11] S. Ratnasamy, B. Karp, S. Shenker, D. Estrin, R. Govindan, L. Yin, F. Yu, Data-centric storage in sensornets with GHT, a geographic hash table, Mob. Netw. Appl. 8 (4) (2003) 427–442.
[12] T.N. Le, D. Xuan, W. Yu, An adaptive zone-based storage architecture for wireless sensor networks, in: Proceedings of the IEEE Global Telecommunications Conference, St. Louis, MO, 2005, pp. 2782–2786.
[13] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, K. Pister, System architecture directions for networked sensors, in: Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, USA, 2000, pp. 93–104.
[14] P. Bonnet, J. Gehrke, P. Seshadri, Towards sensor database systems, in: Proceedings of the Second International Conference on Mobile Data Management, Hong Kong, China, 2001, pp. 3–14.
[15] C. Efthymiou, S. Nikoletseas, J. Rolim, Energy balanced data propagation in wireless sensor networks, J. Wirel. Netw. 12 (2006) 691–707.
[16] J. Li, P. Mohapatra, Analytical modeling and mitigation techniques for the energy hole problem in sensor networks, Pervasive Mob. Comput. 3 (2007) 233–254.
[17] Y.-T. Cheng, Mitigating query hotspots by data replication in wireless sensor networks, Master Thesis, Department of Information and Learning Technology, National University of Tainan, Tainan, Taiwan, 2009.
[18] S. Ishihara, T. Suda, Replica arrangement scheme for location dependent information on sensor networks with unpredictable query frequency, in: Proceedings of the IEEE International Conference on Communications, Dresden, Germany, 2009, pp. 1–6.
[19] B. Hofmann-Wellenhof, H. Lichtenegger, J. Collins, Global Positioning System: Theory and Practice, fourth ed., Springer-Verlag, Vienna, New York, 1997.


[20] A. Nasipuri, K. Li, A directionality based location discovery scheme for wireless sensor networks, in: Proceedings of the ACM Workshop on Wireless Sensor Networks and Applications, New York, USA, 2002, pp. 105–111.
[21] P. Mazurkiewicz, A. Gkelias, K.K. Leung, Linear antenna array, ranging and accelerometer for 3D GPS-less localization of wireless sensors, in: Proceedings of the International Conference on Indoor Positioning and Indoor Navigation, ETH Zurich, Switzerland, 2010, pp. 1–5.
[22] B. Karp, H.T. Kung, GPSR: greedy perimeter stateless routing for wireless networks, in: Proceedings of the ACM International Conference on Mobile Computing and Networking, Chicago, Illinois, 2000, pp. 243–254.
[23] H. Sabbineni, K. Chakrabarty, Location-aided flooding: an energy-efficient data dissemination protocol for wireless sensor networks, IEEE Trans. Comput. 54 (1) (2005) 36–46.
[24] Y. Xu, J. Heidemann, D. Estrin, Geography-informed energy conservation for ad hoc routing, in: Proceedings of the ACM Annual International Conference on Mobile Computing and Networking, Rome, Italy, 2001, pp. 70–84.
[25] M. Duarte, Y.-H. Hu, Distance based decision fusion in a distributed wireless sensor network, in: Proceedings of the 2nd International Workshop on Information Processing in Sensor Networks, Palo Alto, CA, 2003, pp. 392–404.
[26] X. Wang, G. Xing, Y. Zhang, C. Lu, R. Pless, C. Gill, Integrated coverage and connectivity configuration in wireless sensor networks, in: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, Los Angeles, California, USA, 2003, pp. 28–39.
[27] G. Xing, C. Lu, R. Pless, Q. Huang, Impact of sensing coverage on greedy geographic routing algorithms, IEEE Trans. Parallel Distributed Syst. 17 (4) (2006).
[28] VINT Project, Network Simulator version 2 (NS-2), Technical report, http://www.isi.edu/nsnam/ns, 2001.


Neng-Chung Wang received the B.S. degree in Information and Computer Engineering from Chung Yuan Christian University, Taiwan, in June 1990, and the M.S. and Ph.D. degrees in Computer Science and Information Engineering from National Cheng Kung University, Taiwan, in June 1998 and June 2002, respectively. He joined the faculty of the Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taiwan, as an Assistant Professor in August 2002. From August 2006 to July 2007, he was an Assistant Professor at the Department of Computer Science and Information Engineering, National United University, Taiwan. He was an Associate Professor at the same department from August 2007 to July 2011, and has been a Full Professor there since August 2011. His research interests include wireless and mobile networks, wireless communications, mobile computing, and cloud computing. He is a member of the IEEE Communication Society and IEEE Communications Society.

Jia-Shiun Wu received the Master's degree from the Department of Electrical Engineering, National University of Tainan, Tainan, Taiwan, in 2010. His current research interests include wireless communications and the Internet of Things. He is now an engineer at Advanced Semiconductor Engineering, Inc., Taiwan.

Tzung-Shi Chen received the Ph.D. degree in Computer Science and Information Engineering from National Central University, Taiwan, in June 1994. He joined the faculty of Chung Jung Christian University, Taiwan, as an Associate Professor in 1996. He was a Visiting Scholar at the Department of Computer Science, University of Illinois at Urbana-Champaign, USA, from June to September 2001. Since February 2008, he has been a Professor at the Department of Computer Science and Information Engineering, National University of Tainan, Taiwan. He was Director of the Library at National University of Tainan from August 2010 to July 2013 and Director of the Computer Center there from August 2013 to July 2015. He has served as a program committee member for many international conferences and as an editorial board member and guest editor for many international journals. From 2015 to 2017, he serves as Chairman of the Taiwan ACM SIGMOBILE Chapter. His research interests include wireless networks, mobile computing, and cloud computing. He is a member of the IEEE Communications Society and ACM.
