A novel prediction-based strategy for object tracking in sensor networks by mining seamless temporal movement patterns


Expert Systems with Applications 37 (2010) 2799–2807

Contents lists available at ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Kawuu W. Lin a, Ming-Hua Hsieh b, Vincent S. Tseng b,*

a Department of Computer Science and Information Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, ROC
b Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan, Taiwan, ROC

Article info

Keywords: Location prediction; Seamless temporal movement patterns; Object tracking; Sensor networks; Data mining

Abstract

Energy saving in sensor networks has received a great deal of research attention in recent years due to its wide applications. One important research issue is energy-efficient object tracking in sensor networks (OTSNs). Past studies on energy saving in OTSNs can be divided into two main directions: (1) improvement in hardware design; and (2) improvement in software approaches. Many studies address energy saving through hardware design, but few discuss software approaches. The intuitive way to conserve the energy of sensor nodes is to reduce the operation time of high-powered components. Utilizing the movement patterns of objects to save energy is one software approach. However, previous work of this kind did not take temporal information into consideration, nor did it define a suitable segmenting time unit for the time interval in advance. Because the time interval between movements is a real number, an improper segmenting time unit may fail to reveal useful patterns, which directly results in inefficient object tracking. In this paper, we propose a seamless data mining algorithm named STMP-Mine to efficiently discover the temporal movement patterns of objects in sensor networks without predefining the segmenting time unit. Moreover, we propose novel location prediction strategies that employ the discovered temporal movement patterns to reduce prediction errors and thereby save energy. With empirical evaluation on simulated data, STMP-Mine and the proposed prediction strategies are shown to deliver excellent performance in terms of scalability and energy efficiency.

© 2009 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +886 6 2757575x62536; fax: +886 6 2747076. E-mail address: [email protected] (V.S. Tseng).
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2009.09.011

1. Introduction

Saving energy in object tracking sensor networks (OTSNs) has attracted extensive attention. With the emergence of embedded technologies, sensor networks are applied in many areas, such as military applications and object tracking. Location prediction that uses the behavior patterns of objects in the sensor network can benefit energy conservation. However, intrinsic limitations such as power constraints, synchronization, deployment and data routing bring numerous research challenges (Ye, Heidemann, & Estrin, 2002). In this paper, we focus on the problem of energy saving in OTSNs.

In an OTSN, each sensor node consists of sensing, data processing, and communication components (Raghunathan, Schurgers, Park, & Srivastava, 2002). The intuitive way to conserve the energy of a sensor node is to reduce the operation time of high-powered components. For example, using a velocity-based strategy to track moving objects requires a velocity-sensing component, which is an energy-expensive device and is not necessary equipment for all sensor nodes. The velocity-sensing process needs not only the object detection components of sensor nodes, but also several message transmissions among the nodes. Reducing the operation time of this kind of component, or removing it altogether, is the primary motivation of this research.

Tseng and Lin (2007) first proposed a mining algorithm named TMP-Mine, together with strategies that utilize the discovered temporal movement patterns to track objects in OTSNs without a velocity-sensing component, and the approach was demonstrated to be energy efficient. The algorithm, however, has a critical drawback: the user must specify a value to discretize the time in the movement log, which is real-valued, so that the mining algorithm can proceed. Determining this value is difficult, and an improper value may fail to reveal useful hidden patterns. For instance, if the segmenting time is set very small, the algorithm behaves as if no segmenting time unit were defined, similar to the web access behaviors described in Pei, Han, Mortazavi-Asl, and Zhu (2000). Conversely, if the segmenting time is set very large, the temporal trait is eliminated, and we can only discover frequent large itemsets (Agrawal, Imielinski, & Swami, 1993). Therefore, the second motivation of this research is to propose a mining algorithm that discovers temporal movement patterns without defining the value of time segmentation in advance, together with a series of accompanying prediction strategies to track objects energy efficiently.

In this paper, we propose a seamless data mining algorithm named STMP-Mine that efficiently discovers the seamless temporal movement patterns (STMPs) of objects in sensor networks without defining the value of time segmentation. Moreover, we propose a seamless pattern-based prediction strategy named PSTMP, which utilizes STMPs, and a hybrid strategy named PES + PSTMP, which integrates the PES method with PSTMP. The first strategy makes predictions by employing STMPs with no need to detect the object velocity. Hence, it can be applied to sensor networks with low-end sensor nodes. The second strategy is a hybrid approach that integrates the PSTMP method with a popular velocity-based strategy named PES (Xu, Winter, & Lee, 2004). This integrated strategy can further enhance energy efficiency if the sensor nodes have velocity detection capability. Through empirical evaluation under various simulation conditions, STMP-Mine and the proposed prediction strategies deliver excellent performance in terms of energy efficiency.

We briefly review related work in Section 2. In Section 3, we formulate the problem. We describe the overall system architecture and workflow in Section 4. The data mining algorithm, STMP-Mine, is presented in Section 5. Section 6 gives a detailed description of the prediction strategies. The empirical performance evaluation is presented in Section 7. Conclusions are given in Section 8.

2. Related work

A number of past studies addressed energy-saving policies in sensor networks from the aspect of hardware design. For instance, the problem of optimizing communication cost by inactivating the RF radios of idle sensor nodes was widely discussed (Goel & Imielinski, 2001). Many research efforts focused on energy-efficient media access control (MAC) (Ye et al., 2002). Several researchers tried to save energy with a software design approach. Lin, Peng, and Tseng (2006) developed tree structures for efficient object tracking by considering the physical network structure. Xu et al. (2004) proposed a network model in which a sensor node is activated only if some objects exist in its coverage region.

A number of past studies have been done on location tracking (Han & Cho, 2004; Lin & Juang, 2005; Ma, Fang, & Lin, 2007; Pathirana, Savkin, & Jha, 2004; Tsai, Chu, & Chen, 2007; Yavas, Katsaros, Ulusoy, & Manolopoulos, 2005). For ad hoc wireless mobile network systems, mobility estimation was proposed by Pathirana et al. (2004). The authors used a Robust Extended Kalman Filter to derive an estimate of the mobile users' next mobile base station by considering the users' location, heading and altitude for a variant of the GSM network. In addition, Han and Cho (2004) proposed a distributed group location tracking scheme for transportation systems (TSs), based on two new concepts: a virtual visitor location register (VVLR) and a representative identity.

Numerous studies have been done on mining users' behavior patterns, such as association rules or sequential patterns, in the WWW (Pei et al., 2000; Srikant & Agrawal, 1996; Su, Yang, Lu, & Zhang, 2000) and in transactional databases (Agrawal & Srikant, 1995). In Pei et al. (2000), the authors proposed a method named WAP-Mine for fast discovery of web access patterns from web logs, using a tree-based data structure without candidate generation. In the category of mobility mining, most existing research focuses only on the analysis of user movement behavior (Yavas et al., 2005). The problem of mining location-associated service patterns was first studied by Tseng and Tsui (2004). A novel method for discovering users' sequential movement patterns associated with requested services in mobile web systems was also proposed by Tseng and Lin (2006). In the area of mining object movement patterns, a variation of Markov models was proposed (Peng, Ko, & Lee, 2006) to predict object movements in sensor networks. In addition, a prediction method based on temporal movement patterns was proposed (Tseng & Lin, 2005, 2007) to accurately predict object movements.

3. Problem statement

In this section, we first state the problem. Then, we describe the network environment and the behavior of moving objects. The performance metrics are described at the end of this section.

We adopt the network model for OTSNs proposed in Xu et al. (2004), in which a sensor node is activated only if there is an object in its coverage/sensing region. Moreover, we assume that the movement log of objects is collectable (Tseng & Lin, 2005) and that the trajectory of each object is represented in the form $S = \langle (l_1, t_1)(l_2, t_2) \ldots (l_n, t_n) \rangle$, where $l_i$ represents the sensor node location at time $t_i$. The log is considered a valuable resource because it contains the habitual patterns of objects. Because the time interval value is continuous in the real world, we propose a novel temporal movement pattern, known as the seamless temporal movement pattern (STMP). The target problem is twofold: (1) efficient discovery of STMPs for objects, and (2) location prediction that utilizes STMPs for energy efficiency.

To solve the problem described above, we shall find STMPs of the form $P = \langle (l_i, i_k, l_j) \rangle$, where $i_k$ denotes the time interval range between two traversed locations. A time interval range between $l_i$ and $l_j$ can be obtained from a time interval aggregation table of $L_{ij}$ such as Table 1. Moreover, we shall generate seamless temporal movement rules (STMRs) of the form

$R_t = \langle (l_i, i_k) \rangle \rightarrow \langle (l_j) \rangle$

for incorporation into the location prediction mechanisms, to achieve low energy consumption and a low missing rate in the OTSNs. Note that we assume that the behavior of moving objects is often based on certain underlying events rather than being random (Lin et al., 2006; Tseng & Lin, 2005, 2007; Tseng et al. 2008). An event is a stream of locations with time intervals; that is, the characteristics of events in OTSNs include not only locations but also time intervals. The network model for the movement behavior of objects is given in detail in Section 7.1.

To solve the targeted problem, some important performance metrics should be considered. In this work, we adopt two popular metrics: Total Energy Consumed (TEC) (Xu et al., 2004) and missing rate (Xu et al., 2004). TEC indicates the total energy consumed by sensor nodes in the OTSN during the data mining phase and the object tracking phase. Missing rate denotes the ratio of the number of erroneous predictions in a specified time period to the total number of object movements. A low TEC and a low missing rate extend the lifetime of the whole network, and this is the goal of this research.

Table 1. Time interval aggregation table of $L_{ij}$.

Time interval range
0.2–0.5
10.2–13.4
24.4–28.6
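To make the role of the aggregation table concrete, the following minimal Python sketch looks up which time interval range of Table 1 an observed travel time falls into. The names `TABLE_1` and `find_interval_range` are hypothetical and only for illustration; the paper does not prescribe a data structure for the table, and the sketch assumes the ranges are sorted and do not overlap.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

IntervalRange = Tuple[float, float]

# Ranges copied from Table 1 (assumed sorted and non-overlapping).
TABLE_1: List[IntervalRange] = [(0.2, 0.5), (10.2, 13.4), (24.4, 28.6)]


def find_interval_range(ranges: List[IntervalRange], dt: float) -> Optional[IntervalRange]:
    """Return the range [low, high] that contains dt, or None if no range does."""
    lows = [low for low, _ in ranges]
    pos = bisect_right(lows, dt) - 1  # last range whose lower bound is <= dt
    if pos >= 0 and ranges[pos][0] <= dt <= ranges[pos][1]:
        return ranges[pos]
    return None


# An object that needed 12.0 time units from l_i to l_j matches the second range,
# so a rule of the form <(l_i, 10.2-13.4)> -> <(l_j)> would apply to it.
matched = find_interval_range(TABLE_1, 12.0)
```

A travel time that falls between ranges (for example 5.0) matches nothing, in which case no STMR for this node pair would be applicable.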

4. System architecture

Fig. 1 shows the proposed system architecture. In this scenario, we assume that the movement of objects in sensor networks is recorded in system logs (Mani, 2003; Tseng & Lin, 2005). In our proposed network model, each object is able to record the sensor nodes it visited together with the arrival time at each node. To facilitate collecting movement logs, several powerful sensor nodes equipped with storage devices are deployed over the outer network to retrieve the log of each object exiting the network. The system workflow consists of two main phases: (1) the data mining phase, and (2) the object tracking phase. First, the sensor network collects and integrates the movement logs of moving objects. Then, the integrated movement log is used as input for the data mining method STMP-Mine to discover STMPs. The STMRs are then generated for use by location prediction to track objects in an energy-efficient manner. The two phases of the system are described as follows:

1. Data mining mechanism
The data mining mechanism consists of three components: the data integrator, STMP-Mine, and the STMR generator. The data integrator integrates and preprocesses the logs collected from the sensors in OTSNs. Table 2 shows an example of an integrated log of objects' temporal moving sequences. The third tuple, ⟨(a, 1.24)(e, 2.02)(c, 5.16)(b, 10.78)⟩, for example, means that the object arrived in the regions of sensor nodes a, e, c and b at time points 1.24, 2.02, 5.16 and 10.78, respectively. Once the log is prepared, the STMP-Mine algorithm is applied to discover the STMPs from the integrated log. Then, the rule generator produces STMRs from the discovered STMPs, and the STMRs are utilized by the location prediction strategies to save energy.

Table 2. An integrated log of temporal moving sequences.

Object ID    Temporal moving sequence
1            ⟨(a, 1.2)(e, 3.6)(c, 5.7)(b, 10.15)⟩
2            ⟨(a, 3.15)(b, 5.45)(c, 7.16)(d, 12.18)⟩
3            ⟨(a, 1.24)(e, 2.02)(c, 5.16)(b, 10.78)⟩
4            ⟨(f, 0.15)(e, 5.27)(b, 13.75)⟩
5            ⟨(a, 4.17)(b, 6.58)(c, 7.96)(d, 12.32)⟩
6            ⟨(f, 0.29)(a, 4.47)(c, 6.16)(d, 10.228)⟩

2. Object tracking mechanism
The purpose of the proposed tracking mechanism is to predict the next location of each object by utilizing STMRs. Before activating the object tracking mechanism, we dispatch the STMRs to appropriate sensor nodes. Because the storage of each sensor node is limited, we dispatch the STMRs to the sensor nodes according to a location-based criterion. That is, only the STMRs that are location-related to a sensor node are loaded into that node. Take the STMR (l1, 0.2–0.5) → (l2), for example: we would deploy this STMR at only the sensor on l2 rather than at all sensors in the network. The tracking mechanism is composed of the location prediction strategy and the object recovering method. For location prediction, we propose two approaches: PSTMP and PES + PSTMP. PSTMP performs location prediction by employing STMRs with no need to detect the object velocity, while PES + PSTMP is a hybrid method that integrates PSTMP with a popular velocity-based strategy named PES (Xu et al., 2004). Recall that, according to the scheduling policy, a sensor node is periodically activated when an object is in its coverage region. In such an environment, the prediction mechanism is triggered whenever a sensor node loses an object. If the prediction mechanism fails to recover the object within a specific deadline, the flooding strategy (Cerpa et al., 2001) is used to recover the missing object.

5. Proposed data mining methodology: STMP-Mine

To discover seamless temporal movement patterns efficiently, we propose STMP-Mine, which is based on TMP-Mine (Tseng & Lin, 2005). The main advantages of STMP-Mine are that: (1) it uses a clustering method to handle the continuous value, i.e., the time interval, without the need to define the segmenting time unit in advance, and (2) it is able to find abundant movement patterns without omitting short time intervals. To mine temporal movement patterns seamlessly, we first process the temporal information of the movement logs to obtain clusters of time intervals. We design a one-way clustering technique named DIR (Discovering Interval Range) to obtain the clusters.

5.1. Formulation of data mining problem

Let $S = \langle (l_1, rt_1)(l_2, rt_2) \ldots (l_m, rt_m) \rangle$ be a temporal movement sequence of an object with length $m$, where $l_i$ represents the object's location at time $rt_i$ and $rt_i < rt_{i+1}\ \forall\, i < m$. The elements of a sequence are kept in ascending order, using time as the key.

Definition 1. A movement transaction is derived from a temporal movement sequence. Given a temporal movement sequence $S = \langle (l_1, rt_1)(l_2, rt_2) \ldots (l_m, rt_m) \rangle$, the movement transaction is defined as $MT = \langle l_1, l_2, \ldots, l_m \rangle$ and the temporal movement transaction is defined as $TMT = \langle l_1, ti_1, l_2, ti_2, \ldots, ti_{m-1}, l_m \rangle$ with $ti_i = rt_{i+1} - rt_i$, where $l_i$ represents the object's location and $ti_i$ represents the time interval between the visits to $l_i$ and $l_{i+1}$.
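As a small illustration of Definition 1, the Python sketch below derives the movement transaction (MT) and the temporal movement transaction (TMT) from one temporal movement sequence. The function name `to_transactions` and the flat list layout of the TMT are assumptions made for this example; rounding to six decimals is added only to suppress floating-point noise.

```python
from typing import List, Tuple, Union

MovementSequence = List[Tuple[str, float]]  # [(location, arrival time), ...]


def to_transactions(sequence: MovementSequence) -> Tuple[List[str], List[Union[str, float]]]:
    """Return (MT, TMT) for a time-ordered sequence, following Definition 1."""
    locations = [loc for loc, _ in sequence]
    # ti_i = rt_{i+1} - rt_i for each pair of consecutive visits
    intervals = [round(t_next - t, 6)
                 for (_, t), (_, t_next) in zip(sequence, sequence[1:])]
    tmt: List[Union[str, float]] = locations[:1]
    for interval, loc in zip(intervals, locations[1:]):
        tmt.extend([interval, loc])  # interleave interval and next location
    return locations, tmt


# Third tuple of Table 2.
OBJECT_3 = [("a", 1.24), ("e", 2.02), ("c", 5.16), ("b", 10.78)]
mt, tmt = to_transactions(OBJECT_3)
# mt lists the visited nodes a, e, c, b; tmt interleaves them with the
# intervals 0.78, 3.14 and 5.62.
```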

Definition 2. A node to node movement is represented as $(n_a, n_b)$, which means that $n_b$ accompanies $n_a$ after a time interval.

Definition 3. A temporal node to node movement is the basic element of object mobility and is represented as $(n_a, \Delta, n_b)$, abbreviated as TNNM, where $n_a$ denotes the start node, $n_b$ denotes the end node and $\Delta$ denotes the interval between the visits to $n_a$ and $n_b$.

Fig. 1. System architecture. (Components shown: Movement Log, STMP-Mine Method, Seamless Temporal Moving Patterns, Location Prediction, Object Recovering Method.)

Definition 4. Given a movement transaction database $D_{MT} = \{MT_1, MT_2, \ldots, MT_N\}$ that contains $N$ node to node movement transactions and a frequency threshold $\varepsilon$, a frequent node to node movement $(n_a, n_b)$ is defined as:

$\frac{|\{MT_i \mid (n_a, n_b) \subseteq MT_i \text{ and } 1 \le i \le N\}|}{N} \ge \varepsilon.$

Definition 5. Given a temporal movement transaction database $D_{TMT} = \{TMT_1, TMT_2, \ldots, TMT_N\}$ that contains $N$ temporal node to node movement transactions, the time interval set of a frequent node to node movement $(n_a, n_b)$, defined as

$K(n_a, n_b) = \{\Delta \mid (n_a, \Delta, n_b) \subseteq TMT_i \text{ and } 1 \le i \le N\},$

is a frequent node to node interval set, abbreviated as FNNISs.

Definition 6. A time interval set of node to node movement $(n_a, n_b)$, abbreviated as $TIS(n_a, n_b)$, under a given frequency threshold $\varepsilon$ is critical if all of the following properties hold simultaneously. A critical interval range is defined as $CIR(n_a, n_b) = [i_p, i_q]$, $i_p \le i_q$, where $i_p$ is the lower bound of the range and $i_q$ is the upper bound.

Critical Property (1): $TIS(n_a, n_b) \subseteq K(n_a, n_b)$.
Critical Property (2): $TIS(n_a, n_b) = \{i \mid i_p \le i \le i_q,\ i \in K(n_a, n_b)\}$.
Critical Property (3): $|TIS(n_a, n_b)| \ge N \cdot \varepsilon$, where $N$ is the transaction count of the TMT database.
Critical Property (4): $\max\left(|AVG(TIS(n_a, n_b)) - (i_p + i_q)/2|,\ i_q - i_p\right) < \sqrt{(i_p + i_q)/2}$, where $AVG(\cdot)$ is the function that returns the average value of $TIS(n_a, n_b)$.

Definition 7. $\beta = (n_a, I, n_b)$ is a seamless temporal movement pattern, abbreviated as STMP, where $I$ denotes a critical interval range of $K(n_a, n_b)$.

Given a database $D$ containing the temporal movement transactions of objects and a specified support threshold $\delta$, the problem is to discover all STMPs existing in this database.

5.2. STMP-Mine

STMP-Mine discovers STMPs from temporal node to node movements and outputs various frequent node to node time interval clusters, known as FNNT-Clusters. The algorithm is given in Fig. 2. STMP-Mine first initializes the NN-Table, which stores the count of each node to node movement and the time intervals belonging to that movement. Because the table size grows with the number of sensor nodes and the number of distinct time intervals, the table must be stored outside main memory when it becomes too large to fit. The algorithm counts every combination for each sequence in the database, and finally returns the NN-Table. The NNIRSs can then be determined by scanning the NN-Table for elements whose support is no less than the specified support threshold. After the NNIRSs that contain frequent node to node interval sets are obtained, we can determine the CIR from the NNIRSs.

Fig. 3 shows the DIR algorithm. Initially, the algorithm creates a set, FNNIR, to store the discovered interval ranges. Because FNNISs is a one-dimensional data set, it is sorted to facilitate clustering. After the sorting procedure, the time intervals are in ascending order.
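The NN-Table step described above can be sketched in Python as follows. This is an illustrative reading, not the algorithm of Fig. 2: it assumes that "every combination" means every time-ordered pair of nodes within a sequence, that the interval of a pair is the difference of the two arrival times, and that support counts each pair at most once per sequence. The names `build_nn_table` and `frequent_interval_sets` are hypothetical, and the table is kept in memory for simplicity.

```python
from collections import defaultdict
from itertools import combinations

# Integrated log from Table 2: one time-ordered sequence per object.
TABLE_2 = [
    [("a", 1.2), ("e", 3.6), ("c", 5.7), ("b", 10.15)],
    [("a", 3.15), ("b", 5.45), ("c", 7.16), ("d", 12.18)],
    [("a", 1.24), ("e", 2.02), ("c", 5.16), ("b", 10.78)],
    [("f", 0.15), ("e", 5.27), ("b", 13.75)],
    [("a", 4.17), ("b", 6.58), ("c", 7.96), ("d", 12.32)],
    [("f", 0.29), ("a", 4.47), ("c", 6.16), ("d", 10.228)],
]


def build_nn_table(sequences):
    """Map (n_a, n_b) -> {"support": sequences containing it, "intervals": [...]}."""
    table = defaultdict(lambda: {"support": 0, "intervals": []})
    for seq in sequences:
        seen = set()
        # combinations() preserves input order, so n_a is always visited before n_b
        for (na, ta), (nb, tb) in combinations(seq, 2):
            entry = table[(na, nb)]
            entry["intervals"].append(round(tb - ta, 6))
            if (na, nb) not in seen:  # count each pair once per sequence
                seen.add((na, nb))
                entry["support"] += 1
    return table


def frequent_interval_sets(table, n_transactions, epsilon):
    """Keep the sorted interval sets of movements whose relative support >= epsilon."""
    return {pair: sorted(entry["intervals"])
            for pair, entry in table.items()
            if entry["support"] / n_transactions >= epsilon}


nn_table = build_nn_table(TABLE_2)
frequent = frequent_interval_sets(nn_table, len(TABLE_2), epsilon=0.5)
```

Under these assumptions and a threshold of 0.5, the frequent movements in Table 2 are (a, c), (a, b), (a, d), (c, d) and (e, b); their sorted interval sets are the input to the clustering step.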
The procedure therefore starts its scan at the smallest time interval. It then scans the next time interval in order, and calculates both the average of the time intervals and the center of the time intervals scanned thus far. Note that at line 7, it checks Critical Property (4). If Critical Property (4) is not violated, it continues scanning the next time interval. Otherwise, it checks whether Critical Property (3) holds in line 8. If Critical Property (3) holds, a critical interval range is added into FNNIR and a new scan starts from the current time interval. If Critical Property (3) does not hold, the algorithm recalculates the average of the time intervals from the middle position of the current scan. In our algorithm, each interval range has its own length, because a fixed interval length may not be suitable for varying time intervals. For example, given the frequent node to node interval set (0.12, 0.14, 0.20, 0.25, 0.4, 2.6, 2.8, 3.1, 10.6, 11.5, 12.7, 30.1, 33.2, 36.3, 37.9), if the range length is too large, DIR will find an overly large interval range. In our algorithm, we set the range length based on the square root of the range center. Take the interval range [0.15, 1.25] as an example: its range center is 0.7, whose square root is about 0.84. The interval range is not critical because its length, 1.1, exceeds this bound and thus violates Critical Property (4). In this way, DIR ensures that a time interval cluster will not diverge.

Fig. 2. STMP-Mine algorithm.

5.3. Seamless temporal movement rules

After obtaining the time interval clusters, we run TMP-Mine (Tseng & Lin, 2005), with each time interval replaced by a representative time interval. Aggregating intervals in this way reduces the memory needed by the TMP-Tree (Tseng & Lin, 2005). We can derive seamless temporal movement rules (STMRs) from the various STMPs. STMRs are defined as follows: for a discovered STMP, $P_t = \langle (l_i, i_k, l_j) \rangle$, the form of the corresponding STMR $R_t$ and the definition of the confidence $conf(P_t)$ are given as:

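$R_t = \langle (l_i, i_k) \rangle \rightarrow \langle (l_j) \rangle \qquad (1)$

$conf(P_t) = \frac{sup(\langle (l_i, i_k, l_j) \rangle)}{sup(\langle (l_i, i_k) \rangle)} \times 100\% \qquad (2)$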


Fig. 3. DIR algorithm.
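Since the pseudocode of Fig. 3 is not reproduced here, the following Python sketch gives one possible reading of the DIR scan described in Section 5.2. It is a simplified illustration under stated assumptions, not the authors' implementation: `min_count` stands for the bound $N \cdot \varepsilon$ of Critical Property (3), the "middle position" restart is interpreted as moving the scan start to the middle of the current window, and the function names are hypothetical.

```python
import math


def holds_property_4(window):
    """Critical Property (4) for a sorted, non-empty list of intervals."""
    low, high = window[0], window[-1]
    center = (low + high) / 2
    average = sum(window) / len(window)
    return max(abs(average - center), high - low) < math.sqrt(center)


def dir_ranges(intervals, min_count):
    """Cluster intervals into critical interval ranges [i_p, i_q]."""
    values = sorted(intervals)
    ranges = []
    start, end = 0, 1  # current window is values[start:end]
    while start < len(values):
        # Extend the window while the next interval keeps Property (4) satisfied.
        if end < len(values) and holds_property_4(values[start:end + 1]):
            end += 1
            continue
        if end - start >= min_count:  # Critical Property (3)
            ranges.append((values[start], values[end - 1]))
            start = end  # new scan starts at the current interval
        else:
            # Assumed reading: restart from the middle of the current scan.
            start += max(1, (end - start) // 2)
        end = start + 1
    return ranges


# Interval set used as the example in Section 5.2, with min_count = 3.
EXAMPLE = [0.12, 0.14, 0.20, 0.25, 0.4, 2.6, 2.8, 3.1,
           10.6, 11.5, 12.7, 30.1, 33.2, 36.3, 37.9]
clusters = dir_ranges(EXAMPLE, min_count=3)
```

With these assumptions the example set yields four ranges of different lengths, [0.12, 0.4], [2.6, 3.1], [10.6, 12.7] and [33.2, 37.9], while the value 30.1 is left unclustered because no window containing it reaches the required count.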

We term the last location of the antecedent, namely $l_{m-1}$, LLocation. To reveal the strength of each rule, each rule is ranked by the following formula, which considers both support and confidence:

$strength(R_t) = sup(P_t) \times conf(P_t) \qquad (3)$
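Eq. (3) can be illustrated with a short Python sketch that ranks rules by strength and, as the text goes on to specify, breaks ties in favor of the rule with the larger confidence. The `Rule` class, its field names and the numeric values are hypothetical and chosen only to show the effect of the ranking; support and confidence are expressed as fractions rather than percentages.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    antecedent: Tuple  # e.g. ("l1", (0.2, 0.5)): location and interval range
    consequent: str    # predicted next location
    support: float     # sup(P_t) as a fraction of all transactions
    confidence: float  # conf(P_t) as a fraction

    @property
    def strength(self) -> float:
        """Eq. (3): strength = support x confidence."""
        return self.support * self.confidence


def rank_by_strength(rules: List[Rule]) -> List[Rule]:
    """Sort by strength (descending); equal strength is resolved by confidence."""
    return sorted(rules, key=lambda r: (r.strength, r.confidence), reverse=True)


# Made-up rules sharing one antecedent.
RULES = [
    Rule(("l1", (0.2, 0.5)), "l2", support=0.40, confidence=0.5),
    Rule(("l1", (0.2, 0.5)), "l3", support=0.05, confidence=0.9),
    Rule(("l1", (0.2, 0.5)), "l4", support=0.25, confidence=0.8),
]
ranked = [rule.consequent for rule in rank_by_strength(RULES)]
```

In this made-up example the rule for l3 has the highest confidence but very low support, so strength ranking places it last, whereas a pure confidence ranking would try it first.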

In Section 7, we will show through experimental results that ranking rules by strength instead of support or confidence saves more energy. Moreover, if two or more rules have the same strength value, the rule with the larger confidence is given higher priority.

6. Proposed prediction strategies for object tracking

As described in Section 5, the generated STMRs are deployed over the sensor network by loading the location-related (Tseng & Lin, 2007) STMRs into the corresponding nodes. We propose two extended prediction strategies based on PTMP and PES + PTMP (Tseng & Lin, 2007), namely PSTMP and PES + PSTMP, to perform the prediction. A location prediction requires two message transmissions to determine whether the missing object has been recovered in an OTSN. In each message transmission, the transmitting and receiving operations of the radio components in the two sensor nodes, together with the activation power of the two nodes, introduce additional communication costs. Therefore, the number of predictions must be limited in real-time, practical object tracking applications. Our pattern-based prediction strategies use the ranked STMRs one at a time to predict the location. For the sake of simplicity, the real-time constraint for prediction is represented by the number of predictions, or TOP-N predictions.

PSTMP is a non-velocity-based prediction strategy that exploits STMRs to predict the location of the missing object, and it has two variants: standard N-gram and N+-gram. PES + PSTMP is a hybrid strategy that combines the well-known velocity-based strategy PES with PSTMP, and uses both the detected velocity and STMRs.

Fig. 4a gives the PSTMP-N-gram prediction algorithm. The input parameters for this algorithm are the historical movement behavior of the object, the N-gram value and the ranking method for STMRs. First, we extract the last n TNNMs from the movement behavior to form the antecedent for prediction (line 1). Using the antecedent, we obtain a consequent set from the STMRs, known as the prediction set, in which the predicted locations are ranked by the specified rule ranking method, such as support, confidence or strength (line 2). After the prediction set is obtained, the original node that lost the object activates the corresponding sensor nodes one by one to recover the object (line 4). Finally, the algorithm returns whether the object was found by PSTMP-N-gram.

Fig. 4b gives the PSTMP-N+-gram prediction algorithm. The rationale for this algorithm is that a longer antecedent often yields higher precision than a shorter one. However, applicability decreases as the antecedent length increases. Therefore, the algorithm starts with a high N-gram value and decreases the N-gram value after each round of the PSTMP-N-gram method (line 2). The activated node must report back to the original node whether the missing object was found (line 3). The algorithm terminates only if the object is found or the number of predictions exceeds the specified value (line 4).

Fig. 5 shows the hybrid prediction algorithm, PES + PSTMP. The input parameters are the same as those of PSTMP. It first uses the latest detected velocity of the object to predict its current location (line 1). If the object cannot be recovered by the velocity-based prediction, the algorithm invokes PSTMP to recover the missing object. Here, the TOP-N value is reduced by 1 because of the erroneous prediction made by PES (line 5).

Fig. 4. (a) PSTMP-N-gram algorithm. (b) PSTMP-N+-gram algorithm.

Fig. 5. PES + PSTMP algorithm.

7. Experimental results

In this section, we evaluate the proposed prediction strategies by measuring the TEC and missing rate under various time constraints. Variations of PSTMP are also evaluated. In the object tracking experiments, 80% of the simulated data are used for training to obtain STMRs, and the remaining 20% form the testing set for object tracking. All of the experiments were conducted on a P4-2.4 GHz machine with 1 GB of main memory. The algorithms and the sensor network simulator were implemented in Java.

7.1. Experimental setup

To evaluate the performance of the proposed methods, we implemented a simulator that generates the workload data of an OTSN.
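A simulator of this kind needs a concrete prediction routine. As a rough illustration of the PSTMP-N+-gram back-off described in Section 6, the Python sketch below tries the longest antecedent first and falls back to shorter ones until the object is found or the TOP-N budget is used up. It is written in Python for brevity (the authors' simulator was implemented in Java) and is not the algorithm of Fig. 4b: the rule table layout, the `probe` callback and the choice to skip locations that were already probed are all assumptions of this sketch.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical rule table: antecedent (last k history items) -> candidate
# next locations, already ranked (for example by strength).
RuleTable = Dict[Tuple[str, ...], List[str]]


def predict_n_plus_gram(history: Sequence[str], rules: RuleTable, n: int,
                        top_n: int, probe: Callable[[str], bool]
                        ) -> Tuple[Optional[str], List[str]]:
    """Return (found location or None, list of probed locations)."""
    tried: List[str] = []
    for k in range(min(n, len(history)), 0, -1):  # N, N-1, ..., 1
        antecedent = tuple(history[-k:])
        for location in rules.get(antecedent, []):
            if location in tried:      # assumption: never probe a node twice
                continue
            if len(tried) >= top_n:    # TOP-N budget exhausted
                return None, tried
            tried.append(location)
            if probe(location):        # activated node reports back
                return location, tried
    return None, tried                 # caller would fall back to flooding


# Made-up rules and a missing object that is actually at node "f".
RULES: RuleTable = {
    ("a", "e", "c"): ["b"],
    ("e", "c"): ["b", "d"],
    ("c",): ["d", "b", "f"],
}
found, probed = predict_n_plus_gram(["a", "e", "c"], RULES, n=3, top_n=3,
                                    probe=lambda node: node == "f")
```

With a budget of three predictions the object is recovered on the third probe, after the 3-gram and 2-gram antecedents have failed; with a budget of two, the routine gives up and the caller would resort to flooding.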

7.1.1. Simulation model Table 3 summarizes the primary parameters used in the simulation model with the default setting. In the base experimental model, the network is modelled as a mesh network as |W| = 20 * 20, and with N (defaulted as 10,000) objects. Initially, each object arrives at the network on an arbitrary outer sensor node deploying outside of the sensor network at some time. We assume that the behavior of the moving objects in the OTSNs is event-driven instead of random. Hence, we use two parameters le and Pe to model the average length and the event probability, respectively. The length of each event is modelled by Poisson distribution with mean le defaulted as 4. The event probability indicates the probability for an object to adhere to a certain event, and is modelled by Normal distribution with mean Pe (defaulted as 0.6). The events of a node are structured in a tree, in which the fanout of each node is modelled by Normal distribution with mean F (defaulted as 2). Each object in the network may move by adhering to a certain event or it may move at random. When an object is in random movement, it will move back with probability Pb

2805

K.W. Lin et al. / Expert Systems with Applications 37 (2010) 2799–2807 Table 3 Primary parameters for the simulation model. Parameter

Description

Default value

|W| N Pe le F Pb Pn T I V Nr

W*W nodes of Network Number of objects in the OTSN Average event probability on each node Average event length Average event fan-out Probability of backward movement Probability of next-node movement Tracking time for each object (s) Average stay time on each node (s) Average object velocity (m/s) Neighboring radius (nodes)

20 10,000 0.6 4 2 0.1 0.18 120 4 15 2

(defaulted as 0.1) or randomly move to other nodes in the hexagon network structure with probability Pn = (1  Pb)/(6 – 1). The node staying time is modelled by Exponential distribution with mean I (defaulted as 4). The tracking time for each object is set as 120 s. We assume that the sensing coverage range is 15 m and the average object velocity is set at 15 m/s. For communications between the sensor nodes and the base stations, we utilize a well-known routing algorithm, known as the shortest path multi-hop, used in Xu et al. (2004). We adopted the Rockwell’s WINS node (WINS) as our basis for simulating energy consumption. A more detailed power analysis of WINS nodes can be found in Raghunathan et al. (2002), Tseng and Lin, (2007), and WINS. The default value settings for the parameters reflect a reasonable and compact environment for OTSN and mobile systems, as in related studies (Eagle & Pentland, 2006; Lin et al., 2006; Xu et al., 2004). 7.2. Performance of prediction strategies In the following series of experiments, we take two metrics, the TEC and missing rate. TEC represents the total energy consumption of tracking all objects in an OTSN. Missing rate is the ratio of the number of erroneous predictions to the total number of movement of objects within a specified time period. The goal of prediction strategies is to track the moving objects with low TEC and low Missing Rate. 7.2.1. Selection of ranking method Fig. 6 shows the relationship of the missing rate and the TOP-N under various ranking criteria: strength, support and confidence, with the training data occupying 80% of the dataset. It is clear that the strength-ranking approach outperforms other ranking criteria

from the aspect of the missing rate. The confidence-ranking method has the worst performance in missing rate, because this kind of ranking might recommend a rule with high confidence but very low support. The strength-ranking method considers both the support and confidence of a rule. This makes sense because it takes both rule confidence and the proportion to the dataset into consideration. Therefore, we adopt the strength-ranking method in the subsequent experiments. 7.2.2. Performance of variations of PSTMP Fig. 7 shows the performance of variations of PSTMP in terms of TEC with TOP-N varying from 1 to 8. As shown in Fig. 7, an increase of the TOP-N accompanies the great decrease of the TEC of PSTMP with 1-gram (denoted as PSTMP-1-gram) and PSTMP-3+-gram. On the other hand, the TEC for PSTMP-2-gram and PSTMP-3gram decreases slowly. This phenomenon can be explained by probing into the number of generated STMRs. In our experiments, the average number of STMRs stored in each sensor node has a length greater than or equal to 2, and is about 4.17, on average, which is much less than that with a length equal to 1 (about 9.56). Therefore, the PSTMP-2-gram and PSTMP-3-gram will often invoke flooding recovery for the missing objects, because it has fewer STMRs. The TEC of PSTMP-1-gram decreases greatly with the increase of TOP-N value because more STMRs are used for prediction. Note that the number of sensor nodes activated by the flooding method is (6  1 + 6  2++6  m) = 6  (m + m2)/2, where the value 6 is the number of neighboring sensors in a hexagon network structure and m is the distance (in number of sensors) between the missing object and the original sensor node. Fig. 8 shows the performance of variations of PSTMP in terms of Missing rate with TOP-N varying from 1 to 8. Note that the missing rate of PSTMP-2-gram and PSTMP-3-gram stay stable when TOP-N is greater than 3. 
This is because each sensor node stores on average only about 4.17 STMRs of length greater than or equal to 2, so only a limited portion of the movement behaviors can be predicted by these STMRs. Hence, their missing rate remains high, which in turn leads to high TEC. PSTMP-3+-gram has the lowest TEC and the lowest missing rate among the four methods. It takes advantage of the PSTMP-N-gram property that a high N value yields high precision, whereas a low N value yields high applicability.

7.2.3. Comparisons of various prediction methods

In this experiment, we examine the performance of five prediction methods, namely Continuous Monitoring (CM) (Xu et al.,

Fig. 6. Missing Rate for using strength, support, and confidence to rank the STMRs. (Plot: Missing Rate versus TOP-N; series: Confidence, Support, Strength.)

Fig. 7. TEC for PSTMP-N-gram and PSTMP-N+-gram with TOP-N value varied. (Plot: Total Energy Consumption (J) versus TOP-N; series: PSTMP-1-gram, PSTMP-2-gram, PSTMP-3-gram, PSTMP-3+-gram.)
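
The flooding-recovery cost quoted in Section 7.2.2 can be checked numerically. The following is a minimal sketch, not the authors' simulator: the function names are hypothetical, and it only assumes, as the text states, that the k-th ring around a node in the hexagon structure contributes 6 × k sensor nodes.

```python
def flooding_activated_nodes(m: int) -> int:
    """Count nodes activated by flooding out to distance m, ring by ring.

    Assumption (from the text): ring k around the original sensor node
    holds 6 * k sensor nodes in a hexagon network structure.
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    return sum(6 * k for k in range(1, m + 1))


def flooding_activated_nodes_closed_form(m: int) -> int:
    """Closed form of the same sum: 6 * (m + m^2) / 2."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return 6 * (m + m * m) // 2


if __name__ == "__main__":
    # The ring-by-ring sum and the closed form agree for small distances.
    for m in range(0, 10):
        assert flooding_activated_nodes(m) == flooding_activated_nodes_closed_form(m)
    print([flooding_activated_nodes(m) for m in range(1, 5)])  # [6, 18, 36, 60]
```

The quadratic growth in m is why a missed prediction for an object that has travelled far from the original node is so costly in terms of TEC.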

Fig. 10. The TEC with TOP-N varied for CM, PES, PTMP + PES, PES + PSTMP, and PSTMP. (Plot: Total Energy Consumption (J) versus TOP-N; series: CM, PES (X=0.5), PES (X=0.5)+PTMP, PES (X=0.5)+PSTMP, PSTMP.)

Fig. 8. Missing Rate for PSTMP-N-gram and PSTMP-N+-gram with TOP-N value varied. (Plot: Missing Rate versus TOP-N; series: PSTMP-1-gram, PSTMP-2-gram, PSTMP-3-gram, PSTMP-3+-gram.)
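
One way to read the trade-off between PSTMP-N-gram and PSTMP-N+-gram is as a back-off lookup: try the longest matching movement history first (high precision) and fall back to shorter histories when no rule matches (high applicability). The sketch below illustrates that reading only. The precise definition of PSTMP-N+-gram is given earlier in the paper and is not reproduced here, so the back-off order, the rule-table layout, and all names are assumptions, and the time-interval part of an STMR is omitted for brevity.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical rule table: a history of visited nodes maps to next-node
# candidates that are already ranked (e.g. by strength), best first.
RuleTable = Dict[Tuple[str, ...], List[str]]


def predict_n_gram(rules: RuleTable, history: Sequence[str], n: int, top_n: int) -> List[str]:
    """Predict using exactly the last n visited nodes as the lookup key."""
    if n <= 0 or len(history) < n:
        return []
    key = tuple(history[-n:])
    return rules.get(key, [])[:top_n]


def predict_n_plus_gram(rules: RuleTable, history: Sequence[str], max_n: int, top_n: int) -> List[str]:
    """Try the longest history first, then back off to shorter histories."""
    for n in range(min(max_n, len(history)), 0, -1):
        candidates = predict_n_gram(rules, history, n, top_n)
        if candidates:
            return candidates
    # No rule matched; a caller would fall back to flooding recovery.
    return []


if __name__ == "__main__":
    rules: RuleTable = {
        ("A", "B", "C"): ["D"],
        ("B", "C"): ["F"],
        ("C",): ["E", "D", "F"],
    }
    print(predict_n_plus_gram(rules, ["A", "B", "C"], max_n=3, top_n=2))  # ['D']
    print(predict_n_gram(rules, ["X", "C"], n=2, top_n=2))                # []
    print(predict_n_plus_gram(rules, ["X", "C"], max_n=3, top_n=2))       # ['E', 'D']
```

In the last two calls, the fixed 2-gram lookup finds no rule for the history (X, C), whereas the back-off variant still produces candidates from the 1-gram rule, which mirrors why a fixed high N can leave many movements unpredicted.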

7.2.4. Effects of varying the object velocity

In this experiment, we observe the effects of varying the object velocity on TEC and missing rate. Fig. 11 demonstrates that the hybrid strategy PES + PSTMP saves more energy than either the pure PES strategy or the pure PSTMP strategy. Fig. 12 shows that PES + PSTMP has a lower missing rate than PES. As velocity increases, the TEC and missing rate of all methods increase. This is because more energy is required when a sensor node misses an object with a higher velocity. Also, the number of nodes activated by the flooding method may be higher because the object

Fig. 9. The TEC with support threshold varied for CM, PES, PTMP + PES, PES + PSTMP, and PSTMP. (Plot: Total Energy Consumption (J) versus Support Threshold (%).)

2004), PES (Xu et al., 2004), PTMP + PES (Tseng & Lin, 2007), PSTMP and PES + PSTMP, in terms of TEC, i.e., the energy saving efficiency. Here, the hybrid method PES + PSTMP is obtained by integrating the PES(Destination, Instant) method with PSTMP. As shown in Fig. 9, in our experiments we found PES(Destination, Instant) to be the most energy-efficient of the methods proposed in Xu et al. (2004). This is because we use the parameters Pb and Pn in our model to simulate the activities of objects, rather than following the assumption that an object intends to move straight forward (Xu et al., 2004), which favors the other PES variations. We observed that the accuracy of the other variations is not consistently higher than that of PES(Destination, Instant), while their energy penalty is much higher, because more than one node is activated to search for a missing object. Because STMPs are more abundant than TMPs, PES + PSTMP outperforms PTMP + PES in energy consumption; a time interval range has a higher hit rate for prediction than a single time interval. Fig. 10 shows the performance of the prediction methods in terms of TEC, with TOP-N varying from 1 to 8. In this and the subsequent experiment, we set the support threshold to 0.3. As shown in Fig. 10, PSTMP outperforms PTMP + PES when TOP-N is greater than 2. This is because the number of STMRs is greater than the number of TMRs, so that as TOP-N increases, PSTMP has more rules available for prediction than PTMP + PES. Although PSTMP has abundant STMRs, PES + PSTMP still outperforms PSTMP. This is because the hybrid strategy exploits the advantages of both the velocity-based method and the event-based method for prediction.

Fig. 11. TEC with average object velocity varied for PES, PSTMP and PES + PSTMP. (Plot: Total Energy Consumption (J) versus Velocity (m/s); series: PSTMP, PES (X=0.1), PES (X=0.5), PES (X=0.1)+PSTMP, PES (X=0.5)+PSTMP.)

Fig. 12. Missing Rate with average object velocity varied for PES, PSTMP and PES + PSTMP. (Plot: Missing Rate versus Velocity (m/s); series: PSTMP, PES (X=0.1), PES (X=0.5), PES (X=0.1)+PSTMP, PES (X=0.5)+PSTMP.)
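
The missing rate plotted in these figures is defined in Section 7.2 as the ratio of erroneous predictions to the total number of object movements. A minimal sketch of that computation follows. The function name and data layout are hypothetical, and it assumes that a prediction counts as erroneous when the node actually visited next is not among the TOP-N candidates.

```python
from typing import Sequence


def missing_rate(predictions: Sequence[Sequence[str]], actual: Sequence[str], top_n: int) -> float:
    """Fraction of movements whose true next node is not in the TOP-N candidates.

    predictions[i] is the ranked candidate list produced before movement i,
    and actual[i] is the node the object really moved to.
    """
    if len(predictions) != len(actual):
        raise ValueError("predictions and actual must have the same length")
    if not actual:
        return 0.0
    misses = sum(
        1 for candidates, node in zip(predictions, actual)
        if node not in candidates[:top_n]
    )
    return misses / len(actual)


if __name__ == "__main__":
    predictions = [["B", "C"], ["D"], ["A", "E", "F"], []]
    actual = ["C", "D", "F", "G"]
    print(missing_rate(predictions, actual, top_n=1))  # 0.75
    print(missing_rate(predictions, actual, top_n=3))  # 0.25
```

Raising TOP-N can only keep or lower this ratio for a fixed rule set, but each extra candidate also means one more sensor node to activate, which is the trade-off the TEC plots capture.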


is now far from the original node. Another reason is that the number of STMRs decreases as velocity increases. We explain this phenomenon as follows: (1) at low velocity, more visits occur in the outer region of the network than in the inner circle; (2) at high average velocity, however, the visits are dispersed. This dispersion reduces the number of STMRs whose support is greater than the specified threshold. Consequently, fewer STMRs result in a higher missing rate, which in turn causes more flooding recoveries and a higher TEC.

8. Conclusion

In this paper, we propose a seamless data mining algorithm, STMP-Mine, which efficiently discovers the seamless temporal movement patterns (STMPs) of objects in sensor networks without defining a segmenting time unit. By not defining a segmenting time unit, we obtain more abundant temporal movement patterns and rules than with TMP-Mine. We also propose a seamless pattern-based prediction strategy, PSTMP, which utilizes STMPs, and a hybrid strategy, PES + PSTMP, which integrates the PES method with PSTMP. The first strategy makes predictions by employing STMPs without detecting the object velocity. Hence, it can be applied to sensor networks with low-end sensor nodes. The second strategy is a hybrid approach that integrates the PSTMP method with a popular velocity-based strategy, PES (Xu et al., 2004). This integrated strategy can further enhance energy efficiency if the sensor nodes have velocity detection capability. Through our experiments, we show that ranking rules by the strength criterion delivers better results in terms of TEC and missing rate than ranking by confidence or support. We also observe that STMPs are better suited to prediction strategies in OTSNs than TMPs, because they provide more abundant temporal movement patterns and rules.
Acknowledgement

This research was supported by National Science Council, Taiwan, R.O.C. under grant no. NSC 96-2221-E-006-143-MY3, NSC 98-2631-H-006-001, and NSC 97-2218-E-151-003-MY2.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the ACM SIGMOD conference on management of data (pp. 207–216), Washington, D.C., May.
Agrawal, R., & Srikant, R. (1995). Mining sequential patterns. In Proceedings of the 11th international conference on data engineering (pp. 3–14).
Cerpa, A., Elson, J., Estrin, D., Girod, L., Hamilton, M., & Zhao, J. (2001). Habitat monitoring: Application driver for wireless communications technology. In Proceedings of the 1st ACM SIGCOMM workshop on data communications in Latin America and the Caribbean.
Eagle, N., & Pentland, A. (2006). Reality mining: Sensing complex social systems. Personal and Ubiquitous Computing, 10(4).
Goel, S., & Imielinski, T. (2001). Prediction-based monitoring in sensor networks: Taking lessons from MPEG. ACM Computer Communication Review, 31(5).
Han, Il, & Cho, Dong-Ho (2004). Group location management for mobile subscribers on transportation systems in mobile communication networks. IEEE Transactions on Vehicular Technology, 53(1).
Lin, C. Y., Peng, W. C., & Tseng, Y. C. (2006). Efficient in-network moving object tracking in wireless sensor networks. IEEE Transactions on Mobile Computing, 5(8).
Lin, D.-B., & Juang, R.-T. (2005). Mobile location estimation based on differences of signal attenuations for GSM systems. IEEE Transactions on Vehicular Technology, 54(4).
Ma, W. C., Fang, Y. G., & Lin, P. (2007). Mobility management strategy based on user mobility patterns in wireless networks. IEEE Transactions on Vehicular Technology, 56(1).
Mani, M. (2003). Understanding the semantics of sensor data. ACM SIGMOD Record, 32(4).
Pathirana, P. N., Savkin, A. V., & Jha, S. (2004). Location estimation and trajectory prediction for cellular networks with mobile base stations. IEEE Transactions on Vehicular Technology, 53(6).
Pei, J., Han, J., Mortazavi-Asl, B., & Zhu, H. (2000). Mining access patterns efficiently from web logs. In Proceedings of the 4th Pacific Asia conference on knowledge discovery and data mining (pp. 396–407).
Peng, W. C., Ko, Y. Z., & Lee, W. C. (2006). On mining moving patterns for object tracking sensor networks. In Proceedings of the 7th IEEE international conference on mobile data management (MDM’06).
Raghunathan, V., Schurgers, C., Park, S., & Srivastava, M. B. (2002). Energy aware wireless microsensor networks. IEEE Signal Processing Magazine, 19(2), 40–50.
Srikant, R., & Agrawal, R. (1996). Mining sequential patterns: Generalizations and performance improvements. In Proceedings of the 5th international conference on extending database technology (EDBT’96).
Su, Z., Yang, Q., Lu, Y., & Zhang, H. (2000). What next: A prediction system for web requests using n-gram sequence models. In Proceedings of the 1st international conference on web information systems and engineering (WISE’00) (pp. 200–207).
Tsai, H. W., Chu, C. P., & Chen, T. S. (2007). Mobile object tracking in wireless sensor networks. Computer Communications, 30(8), 1811–1825.
Tseng, V. S., & Lin, K. W. (2005). Mining temporal moving patterns in object tracking sensor networks. In Proceedings of the international workshop on ubiquitous data management (held with ICDE’05) (pp. 105–112).
Tseng, V. S., & Lin, K. W. (2006). Efficient mining and prediction of user behavior patterns in mobile web systems. Information and Software Technology, 48(6), 357–369.
Tseng, V. S., & Lin, K. W. (2007). Energy efficient strategies for object tracking in sensor networks: A data mining approach. Journal of Systems and Software, 80(10), 1678–1698.
Tseng, V. S., & Tsui, C. F. (2004). Mining multi-level and location-aware associated service patterns in mobile environments. IEEE Transactions on Systems, Man and Cybernetics: Part B, 34(6).
Tseng, V. S., Lin, K. W., & Hsieh, M.-H. (2008). Energy efficient object tracking in sensor networks by mining temporal moving patterns. In Proceedings of the 2008 IEEE international conference on sensor networks, ubiquitous and trustworthy computing (SUTC’08) (pp. 170–176).
Xu, Y., Winter, J., & Lee, W. C. (2004). Prediction-based strategies for energy saving in object tracking sensor networks. In Proceedings of the 5th IEEE international conference on mobile data management (MDM’04) (pp. 346–357).
Yavas, G., Katsaros, D., Ulusoy, Ö., & Manolopoulos, Y. (2005). A data mining approach for location prediction in mobile environments. Data and Knowledge Engineering, 54(2).
Ye, W., Heidemann, J., & Estrin, D. (2002). An energy-efficient MAC protocol for wireless sensor networks. In Proceedings of the 21st international annual joint conference of the IEEE computer and communications societies (INFOCOM 2002) (pp. 1567–1576), New York, NY, USA, June.
WINS project, Rockwell Science Center.