Dynamic radius and confidence prediction in grid-based location prediction algorithms


Pervasive and Mobile Computing 42 (2017) 265–284

Contents lists available at ScienceDirect

Pervasive and Mobile Computing journal homepage: www.elsevier.com/locate/pmc

Itay Hazan, Asaf Shabtai *

Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel

Article info

Article history: Received 18 February 2016; received in revised form 10 August 2017; accepted 10 October 2017; available online 28 October 2017.

Keywords: Location prediction, Mobile

Abstract

Grid-based location prediction algorithms for predicting future locations of mobile device users have been proposed and evaluated. Most of the studies, however, ignore the fact that prediction accuracy is highly dependent on the user’s current behavior (i.e., whether current user behavior complies with previous behavior patterns), the type of predicted location (e.g., home, shopping center, work), the confidence in the prediction, and the error of the location sensor. In this paper we propose four methods for providing a dynamic and effective confidence area for each predicted location provided by any grid-based location prediction algorithm. We define confidence area as the extended area, defined by a radius around the predicted location (cell), that the user might be in, and confidence as the level of assurance that the user will be in the predicted area. We applied the proposed radius prediction methods to the output of three representative location prediction algorithms (frequent cells, Markov chain model and matrix factorization) using three different datasets, and compared the methods with the previously proposed fixed radius approach. Our results demonstrate the ability to dynamically determine a confidence radius that increases prediction accuracy while maintaining a small average radius.

© 2017 Elsevier B.V. All rights reserved.

* Corresponding author. E-mail addresses: [email protected] (I. Hazan), [email protected] (A. Shabtai).

https://doi.org/10.1016/j.pmcj.2017.10.007
1574-1192/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

In recent years we have witnessed a shift towards the personalization of services in multiple fields. This is particularly true for services provided on mobile devices, where a plethora of context-based applications (for example: Yelp, Uber, Google Maps, and Google Now) are used daily by millions of users. A key component of many of these services is the ability to predict the future location of users based on location sensors embedded in the devices. Such knowledge would enable service providers to present relevant and timely offers to their users or to better manage traffic congestion, thus increasing customer satisfaction and engagement.

Existing methods for location prediction take into account both the previous movements of the user and contextual information gathered through the mobile device (for example, day, hour [1,2], or recent phone calls [3]). There are two main types of location prediction approaches: grid-based [3–6] and point of interest (POI)-based [1,7,8]. In grid-based prediction the map of a certain geographical area is split into cells according to a predefined grid, and the prediction for the location the user is expected to visit (the next cell on the map) is made using a probability function that, given the user’s current location and context, assigns a probability to each cell of the grid. In the POI approach significant locations are either indicated (i.e., specific points of interest such as a shopping center) or automatically deduced, and the algorithm attempts to predict the next POI the user is expected to visit. POIs can automatically be deduced through clustering algorithms (e.g., the user’s location cluster at night is marked as ‘home’) [9] or by using predefined rules and constraints such as the distance from the closest antenna in GSM networks [10,11]. Inferring the type of location using the POI approach is a challenging task [7] which usually requires human assistance to continuously label POIs.

In both approaches the accuracy of prediction is highly dependent on the following factors:

1. Recent behavior of the user: the movement patterns of the user prior to the source (current) location can indicate whether the current user behavior is predictable (i.e., behaving according to previously observed movement patterns) or unpredictable (i.e., behaving unexpectedly and moving in previously unobserved patterns, possibly due to an unexpected event). The predictability level of the user can be inferred by evaluating the accuracy of the location prediction algorithm in recent prediction attempts.
2. The type of the predicted location: the exact position of the user in the predicted location is closely tied to the location’s size and type. For example, in grid-based prediction a shopping mall usually encompasses a larger number of cells than home or workplace locations, making it harder to pinpoint the user’s exact position.
3. Confidence in the prediction: confidence in the prediction depends on the number of previous observations (i.e., transitions from the current location to the predicted location, possibly within a specific context). For example, a prediction that is based on one previous transition is much less reliable than a prediction based on 1000 transitions.
4. The sampling error: the inherent errors of different location sensors (location sampling accuracy), including GPS, cellular network, and Wi-Fi, are a problem that has been addressed only in a limited way in previous studies [12]. Learning from inaccurate location samples may result in inaccurate predictions.
While multiple studies have attempted to predict the user’s next location, to the best of our knowledge only a few of them have dealt with the accuracy of predictions by providing a predefined fixed-sized confidence area [4,6], and none of them have tried to leverage the above factors in order to dynamically set a confidence area for the prediction.

In this paper we focus on the grid-based approach and propose four methods for providing a dynamic and effective confidence area for each predicted location. We define confidence area as the extended area around the predicted location that the user might be in, and confidence score as the level of assurance that the user will be in the predicted cell/area. The proposed methods can be applied to the output of any grid-based location prediction algorithm and do not depend on the configuration of the grid or the maximization functions that determine which cell to predict.

Specifically, we propose two greedy algorithms that set the radius of the confidence area based on the probabilities of cells around the predicted location (cell) and two methods that calculate the confidence area based on the behavior of the user or the performance of the algorithm around the source and destination:

• The accuracy of recent predictions method considers the user’s recent behavior or the algorithm’s recent performance (see the first factor above, recent behavior of the user) by analyzing the accuracy of recent predictions in order to infer how predictable the user is.
• Prediction based on previously observed errors at the predicted location (accuracy at the predicted cell) addresses the second and third factors above by using the errors (in distance) of previous predictions in the vicinity of the predicted location.
• The probability aggregation and the recursive search methods consider the second factor above and determine a cost-effective area surrounding the predicted cell (i.e., the cell with the highest probability) based on the probabilities of cells in the vicinity.

We evaluated the proposed methods using two well-known datasets in the field of location prediction: the Geolife dataset [13] collected in 2008–2009 in Beijing, China and the LDCC dataset collected in 2010–2011 in Lausanne, Switzerland [14]. We also used a third dataset that we collected ourselves in an experiment involving 20 users in Beer-Sheva, Israel for a period of 20 weeks in 2015.

Our results demonstrate the effectiveness of our proposed methods on three well-known location prediction algorithms: frequent cells, Markov chain model and matrix factorization. Using our methods we were able to determine, on average, smaller confidence areas and, at the same time, achieve a higher level of accuracy (by 5%–15%) compared to the fixed radius approach used in previous studies. The proposed methods also reduce the incidence of mistakes resulting from the natural error of mobile device location sensors [12,15] by inspecting the surrounding cells and taking previous errors into account.

The remainder of this paper is organized as follows: Section 2 provides a motivation for the proposed approach. Section 3 presents related work. Section 4 provides a detailed description of the location prediction algorithms evaluated in this paper. Our methods are thoroughly explained in Section 5. Section 6 presents the evaluation results, and Section 7 suggests future work.

2. Motivation

Predicting location is useful in many domains including coupon recommendation [16], predicting traffic congestion [17], and improving resource allocation (e.g., dispatching Uber drivers in advance or assigning police officers to locations where
they are most needed). Because of the challenging nature of location prediction (i.e., the accuracy of prediction depends on many factors) and the wide variety of settings in which it is employed, we maintain that a prediction should consist of the following two elements: (i) a predicted location; (ii) a set of confidence areas in the form of radii around the predicted location, each with a score indicating the system’s confidence in the prediction.

This representation has several advantages. First, as different types of tasks require different levels of granularity, providing the radius can enable organizations to make better resource allocation decisions. For example, an Uber driver may be indifferent to being two kilometers from a customer, but for a coupon recommendation service this may be an untenable distance. By being able to dynamically adapt the granularity of our model, we create a generic system for various types of tasks. Second, by providing a confidence score we enable organizations to optimize their performance through the creation of probabilistic models that weigh the utility of various courses of action. Finally, by taking into account the dynamic radius around a location we are able to provide useful predictions when the available data is limited or sparse, and these predictions can be refined as the amount of available data for each location increases.

To better illustrate the possible contribution of our approach (namely, the inclusion of a dynamic confidence area in the prediction of possible locations), we propose a method for integrating it into the frequently used k-arm Bandits model. This model, also known as contextual bandits [18], is frequently used in fields such as recommendation systems, particularly content and advertisement recommendations. We now demonstrate how our location prediction approach could be effectively integrated into such a solution. The general k-arm Bandits model is implemented as follows:

1. We define a large set of “arms” A = {a_1, a_2, ..., a_n}. Each arm is a unique combination of a recommended item (e.g., a coupon for a restaurant or other services) and context. The context includes, but is not limited to, the traits of the user u to whom the recommendation is made, the traits of the business offering the discount, and other types of general information (day of the week, time of day, temperature, etc.). The size of the discount may also vary, thus leading to the creation of multiple “arms”.
2. The approach is carried out in a set of discrete and sequential trials t = 1, 2, 3, ..., m.
3. For each trial t we do the following:
   a. Analyze the traits (i.e., “context”) of the user as well as the context of the surrounding world (i.e., current state), and identify all relevant arms;
   b. Select an arm based on the payoff p_i observed in the previous trial iterations for each arm a_i ∈ A (e.g., information on which coupons were actually presented and consumed). This arm represents the coupon that will be presented to the user;
   c. Finally, based on the feedback of user u (purchase or no purchase), update the estimate of the expected payoff for arm a_i.

We can calculate the payoff p_i using the following formula:

$$p_i = \frac{T_p(i)}{T(i)} R_i + \sqrt{\frac{2 \ln t_i}{T(i)}}$$

where T(i) is the overall number of times that arm i has been “pulled” (i.e., this coupon has been proposed to a user) and T_p(i) is the number of successes (i.e., the coupon has been used by the user). R_i represents the reward, monetary or otherwise, that is to be obtained if the user consumes the offered coupon, and t_i is the index of the current iteration.

This method for calculating the payoff balances two opposing needs: exploitation and exploration. While the former attempts to maximize the payoff using existing knowledge, the latter’s aim is to explore additional options and improve our assessment of the various arms’ probabilities of success. The formula presented above is a common approach to balancing these needs. Our location prediction approach could be used to improve the k-arm Bandits model in two important ways:

1. Filtering. By using the confidence score/area as a filtering criterion, we can reduce the number of evaluated arms for each user and improve the chances of gathering useful information in every iteration. This is achieved by removing all arms associated with locations that did not receive a minimal probability of the user visiting them (or for which the confidence area is too large) within a given time window. The smaller number of evaluated arms will also reduce computational overhead.
2. Scoring. The exploitation component of the payoff calculation process presented above takes into account the expected reward and perceived probability of success. We argue that the formula could be made more accurate by also including the probability of the user visiting the physical location associated with the arm. The augmented formula is:

$$p_i = \frac{conf_g(i)}{radius_g(i)} \cdot \frac{T_p(i)}{T(i)} R_i + \sqrt{\frac{2 \ln t_i}{T(i)}}$$

where conf_g(i) and radius_g(i) are, respectively, the confidence score and the confidence radius of the user visiting the location associated with arm i.
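The two payoff formulas can be compared side by side in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, unexplored arms are given an infinite payoff so that they are tried first (a common convention, assumed here), and a zero radius is rejected because the source does not say how the ratio conf_g(i)/radius_g(i) should be handled in that case.

```python
import math


def ucb_payoff(successes, pulls, reward, t):
    """Base payoff: (T_p(i) / T(i)) * R_i + sqrt(2 ln t / T(i))."""
    if pulls == 0:
        return math.inf  # assumption: an arm never pulled is explored first
    exploit = (successes / pulls) * reward
    explore = math.sqrt(2.0 * math.log(t) / pulls)
    return exploit + explore


def location_aware_payoff(successes, pulls, reward, t, conf, radius):
    """Augmented payoff: exploitation term scaled by conf_g(i) / radius_g(i)."""
    if pulls == 0:
        return math.inf
    if radius <= 0:
        # assumption: the source does not define the ratio for a zero radius
        raise ValueError("radius must be positive in this sketch")
    exploit = (conf / radius) * (successes / pulls) * reward
    explore = math.sqrt(2.0 * math.log(t) / pulls)
    return exploit + explore


# Hypothetical arm: coupon offered 10 times, used 5 times, reward 2.0, trial 100.
base = ucb_payoff(5, 10, 2.0, 100)
scaled = location_aware_payoff(5, 10, 2.0, 100, conf=0.8, radius=2.0)
print(base, scaled)
```

Only the exploitation term is scaled, so an arm tied to an unlikely or poorly localized location still keeps its exploration bonus.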


3. Related work

3.1. Location prediction

Predicting people’s whereabouts at a specific time in the future is a challenging and extensively researched problem. Location prediction algorithms can be grouped based on (1) the spatial representation of the location: grid-based vs. point of interest (POI)-based; (2) the algorithm used to model previous transitions; and (3) whether the prediction is based on the user’s personal data or on cumulative data gathered from multiple users.

3.1.1. Point-of-Interest (POI) vs. grid-based representation

There are two main types of spatial representation of a geographical area in location prediction algorithms. The first type defines places according to their type, such as “home”, “work”, or “shopping center”, and its prediction algorithm attempts to predict the user’s next location [1,7,8]. These places are referred to as points of interest (POI) and can be inferred by semantic tagging by the user (e.g., using “check-in” applications such as Facebook) or an expert [9], or through the use of learning techniques [19].

In the second approach a specific geographical area is represented as an N × N grid of square cells, and the location prediction algorithm attempts to predict the location (cell) of a user at a specific time in the future using the user’s previously observed transitions between cells [3–6]. Previous research introduced variations of the grid-based approach, all aimed at improving the accuracy of the prediction by proposing different model representations [5], considering contextual information such as the applications used, call logs, and Bluetooth scans [3,20], modeling different lengths of trajectories [9], and combining a personal model (based solely on the user’s own data) with social data (other users’ data) [3].
In our research we opt to use the grid-based approach because: (1) grid-based models are usually simpler and easier to implement; and (2) they do not require inference of location semantics (i.e., POI). This is, however, also the drawback of the grid-based approach: dividing the map into cells without understanding the location semantics may result in POIs that are split across multiple cells or multiple POIs contained within a single cell, a situation which may reduce prediction accuracy.

3.1.2. Modeling algorithms

Prior studies suggested various classes of algorithms for modeling the previous transitions of the user in order to predict the user’s future location.

The first class of algorithms is based on different variations of neural networks. Saikath et al. [21] compared the performance of back propagation and radial basis function networks in predicting the next POI. Liu et al. [22] proposed a recurrent neural network model that predicts the next POI using POIs extracted from active user check-ins.

Another prominent class of algorithms is based on the Markov chain model. Chon et al. [23] evaluated different configurations of Markov chain models using POIs extracted from the user’s habits in the locations. In their experiments the Markov chain model presented the best results among the algorithms evaluated. Gambs et al. [9] developed a novel model called the n-MMC, based on the mobility Markov chain, that takes into account the n last POIs the user has visited. Do et al. [3] used the Markov chain model for predicting the next cell in the grid while embedding contextual features such as day and hour.

A third class of algorithms is based on machine learning classifiers. Minh Tri Do et al. [1] used a set of classifiers combined in a linear regression model in order to predict the label of the next location (e.g., home, work, restaurant, transport station, shopping, and entertainment).
The features they used include current location, time, application usage, Bluetooth proximity, and communication logs.

Another state-of-the-art class of algorithms is matrix factorization [24], which was adapted from the recommender systems domain. Gao et al. [25] presented a contextual matrix factorization model, which uses the time and the pattern of cells visited by the user for predicting the next cell in the grid. Lian et al. [26] used matrix factorization for predicting the next location (POI) while using collaborative data from other users.

All the presented categories of algorithms, when applied to a grid-based representation, can be used with our proposed confidence radius prediction algorithms. In this study we opt to evaluate the confidence radius prediction algorithms using three different representative grid-based location algorithms: frequent cells, Markov model, and matrix factorization.

3.1.3. Personal vs. collaborative data

Previous studies can also be grouped according to whether the prediction is based on the user’s own data (personal) or on cumulative data gathered from multiple users (collaborative). Gao et al. [25] created location-based social networks by analyzing user co-occurrences in different geographical locations. Through the use of language modeling [27], a technique primarily used in the field of information retrieval, they modeled and smoothed the probabilities of a user being at different locations. Zheng et al. [13] also modeled user co-occurrences as part of their proposed framework. They created three graphs (user–user, location–location, and user–location) and combined all three in a probabilistic model. Asahara et al. [28] combined a user-model and a social-model using a mixed Markov chain model [29]. Ying et al. [30] took advantage of data gathered from multiple users, as well as past trajectories and geographic features of the road, in order to predict users’ future locations.
Krumm and Horvitz [4] presented Predestination, a system capable of inferring a user’s trajectory based on previous trips made by the user and typical driving patterns for the general driving population. Yong et al. [31] presented several methods from the collaborative filtering domain for location prediction based on personal and collaborative data. In this study we focus on personal models which rely only on the user’s personal information.


Fig. 1. An example of a grid-based model with nine non-zero probability cells clustered into three probability clusters.

3.2. Prediction accuracy

A problem that has been addressed only in a limited way in previous studies relates to the factors that affect the prediction accuracy (e.g., the location sensors’ natural error) and how to determine the level of confidence in the prediction. The Android developer guide [15], for instance, defines acceptable error as the radius r that provides 68% assurance; i.e., the probability that the user will be in the area that is defined as a circle with the predicted location as the center and radius r is 0.68.

Similarly, previous grid-based studies addressed this issue by defining a fixed size radius around the predicted cell in order to account for the fact that the user may actually be in one of the neighboring cells. For example, Minh Tri Do et al. [3] and Zheng et al. [32] first attempted to identify the significant locations of users (locations in which the user spent a predefined amount of time). Then, in order to increase accuracy, these locations were defined as fixed size areas with relatively dense instances of user location samples. Nizetic et al. [6] compared the accuracy when predicting the area that is defined by the cell with the highest probability score (fixed radius equal to 0) and the cells surrounding it (fixed radius equal to 1). Krumm and Horvitz [4] showed that fixed radii of different sizes produce different prediction accuracy, and therefore the size of the radius needs to be determined based on the accuracy required.

We hypothesize that the fixed radius approach is inefficient and that most of the time it sets a larger confidence area than required; therefore, in this study, we propose four methods that dynamically set the size of the area around the predicted location (cell), a capability that eventually results in higher prediction accuracy while necessitating a smaller confidence area compared to the fixed radius approach.

4. Location prediction algorithms evaluated

A grid-based model divides the geographical map into cells, and then, within a given context (e.g., current location, day, hour, previous locations), it assigns a probability to each cell which indicates the likelihood of the user being in the cell at a future time. An example of such a probability distribution is presented in Fig. 1. The example in Fig. 1 presents a grid-based map containing nine non-zero probability cells that are clustered into three different areas. We refer to each cluster of cells as a probability cluster that may indicate a specific type of location (point of interest) such as an office, home, shopping center, etc.

We evaluated the proposed methods using three well-known grid-based location prediction algorithms with several configurations. The first is a predictor, referred to as frequent cells, presented by Song et al. [33]; the second is a Markov chain based predictor widely used in location prediction research [6]; and the third is based on matrix factorization [34]. For the three location predictors we used the day and hour of the day (part of day) as contextual features. This was done by creating different prediction models of the same type (i.e., algorithm) for each combination of day and part of day (POD). Smaller PODs result in a larger number of models and take longer to converge than larger PODs; however, they are expected to result in more accurate models.

4.1. Frequent cells model

The frequent cells model simply predicts the most frequent cells for each combination of day and POD. This simple model was used as a baseline in previous studies to evaluate the performance of proposed location prediction algorithms [1,33]. This model can also provide an indication of users’ predictability.
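As a rough illustration of the frequent cells baseline described above, the following Python sketch keeps one visit counter per (day, POD) context and predicts the most visited cell. The class name, the representation of a cell as a (row, column) tuple, and the normalization of counts into probabilities are assumptions made for this example and are not taken from [33].

```python
from collections import Counter, defaultdict


class FrequentCells:
    """Per-context visit counts; one counter for each (day, part-of-day)."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def observe(self, day, pod, cell):
        """Record one visit to `cell` in the given context."""
        self._counts[(day, pod)][cell] += 1

    def predictions(self, day, pod):
        """Return (cell, probability) pairs, most frequent cell first."""
        counts = self._counts.get((day, pod))
        if not counts:
            return []  # no data for this context yet
        total = sum(counts.values())
        return [(cell, n / total) for cell, n in counts.most_common()]

    def predict(self, day, pod):
        """Return the most frequent cell for the context, or None."""
        ranked = self.predictions(day, pod)
        return ranked[0][0] if ranked else None


# Hypothetical usage: three visits to one cell and one to its neighbour.
model = FrequentCells()
for cell in [(3, 4), (3, 4), (3, 5), (3, 4)]:
    model.observe("Mon", "morning", cell)
print(model.predict("Mon", "morning"), model.predictions("Mon", "morning"))
```

The list returned by `predictions` has the ⟨cell, probability⟩ shape that the radius prediction methods of Section 5 take as input.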


4.2. Markov chain model

This class of predictors represents the mobility of an individual using a Markov chain model, and predicts the next location based on the previously visited location(s) that are part of a trajectory that ends with the current user location [23]. This model maintains a transition matrix in which entry [i, j] indicates the probability of the user moving from cell i to cell j within a predetermined time period (e.g., 30 min); this is based on the number of times that the user previously moved from cell i to cell j. According to the Markov chain model, raising the matrix to the N-th power provides the aggregate probabilities of all routes from the current location i to all other cells after N steps [17]. For example, a transition matrix that models the transitions of the user every 30 min, multiplied by itself once (i.e., squared), provides the probabilities of the user’s location one hour ahead.

We opt to use a Markov chain model because this approach is commonly used in location prediction research. Similarly to the frequent cells model, we implemented the predictors based on the Markov chain model by aggregating the data into day and POD time slots, and derived a cell transition matrix for each time slot. The predicted cell returned is the cell with the highest probability score.

4.3. Matrix factorization model

Matrix factorization (MF) is a commonly used technique in recommender systems. MF attempts to predict the rating user u would assign to item i based on similar users and/or items within different contexts. One of the best-known implementations of MF was presented by Koren et al. [24]. The prediction of a rating r_{u,i} for a user u ∈ U and item i ∈ I is produced using the equation

$$\hat{r}_{u,i} = \mu + b_i + b_u + q_i^{T} p_u$$

where µ is the global average rating, b_i is the item bias, b_u is the user bias, and p_u and q_i are vectors representing latent relations.
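The prediction rule above, together with one stochastic gradient step of the kind commonly paired with it, can be sketched in plain Python as follows. The function names, the learning rate `lr` and the regularization weight `reg` are assumptions chosen for illustration; the source does not give the hyperparameters or the exact update used in the experiments.

```python
def predict_rating(mu, b_i, b_u, q_i, p_u):
    """r_hat = mu + b_i + b_u + q_i . p_u (latent vectors as lists)."""
    return mu + b_i + b_u + sum(q * p for q, p in zip(q_i, p_u))


def sgd_step(r, mu, b_i, b_u, q_i, p_u, lr=0.01, reg=0.02):
    """One regularized gradient step driven by the error r - r_hat.

    The regularized form and the values of lr and reg are assumptions;
    all updates are computed from the old parameter values.
    """
    err = r - predict_rating(mu, b_i, b_u, q_i, p_u)
    new_b_i = b_i + lr * (err - reg * b_i)
    new_b_u = b_u + lr * (err - reg * b_u)
    new_q_i = [q + lr * (err * p - reg * q) for q, p in zip(q_i, p_u)]
    new_p_u = [p + lr * (err * q - reg * p) for q, p in zip(q_i, p_u)]
    return new_b_i, new_b_u, new_q_i, new_p_u


# Hypothetical parameters with two latent dimensions and target value 0.5.
mu, b_i, b_u = 0.1, 0.05, -0.02
q_i, p_u = [1.0, 2.0], [0.5, 0.25]
before = 0.5 - predict_rating(mu, b_i, b_u, q_i, p_u)
b_i, b_u, q_i, p_u = sgd_step(0.5, mu, b_i, b_u, q_i, p_u)
after = 0.5 - predict_rating(mu, b_i, b_u, q_i, p_u)
print(before, after)
```

In the location setting described in this section, the "rating" being approximated is a transition probability between two cells rather than a user's score for an item.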
Koren’s approach uses stochastic gradient descent to update the biases and the latent vectors, where the difference between r_{u,i} and \hat{r}_{u,i} is used to update each of these parameters. Lately there has been a shift towards using collaborative filtering methods for location prediction problems, in which items are replaced by locations and ratings by probabilities. Matrix factorization is frequently used for this task [26,35,36], but other methods such as weighted cosine similarity [25] have also been explored. Some works attempt to predict the next location in a user-based manner (user–location) [26,35], while others consider item-based (location–location) approaches [25]; all of them take the context of the user at the moment of the prediction (e.g., day, hour) into account.

As we are focusing on personalized models, we use the location–location prediction, where we try to predict the next location according to the current location and the contextual features: day of week and part of day. This approach is mainly used in POI recommendation [31,35], but lately it has also been used in the grid-based modeling [36] utilized in this work. As suggested in [31], in order to avoid high dimensionality of factorization, we only focused on activity areas, which are the areas the user has previously visited in a given context. At the time of the prediction we generate a sub-matrix R′ that contains all of the user’s activity areas (R is the unfiltered matrix that contains all cells in the grid). Assuming M activity areas for a given user in a given context, we start the gradient descent phase for learning the parameters iteratively on all of the entries in the R_{M,M} matrix. The error of the learning phase is calculated by e_{c,a} = r_{c,a} − \hat{r}_{c,a}, where r_{c,a} is the probability of moving from location c to location a in the R matrix, and \hat{r}_{c,a} is the probability approximated by the matrix factorization model.

5. Proposed methods for radius prediction

In this section we present the four proposed methods for determining the radius around the predicted cell in the grid. The methods are designed such that they can be applied to any grid-based location predictor that outputs the subset of predicted cells and associated probabilities. We compare the proposed methods with the fixed radius method which was used in previous studies.

Previous research showed that there are few relevant points of interest for each user within a given context [37]. We empirically validated this assumption with the datasets used in our experiments and have several noteworthy observations. The first is that, on average, for each location prediction the frequent cells model had to choose from 8.1 different probability clusters, a much smaller number than the average of 32.5 non-zero probability cells. The Markov model had to choose from 5.7 different probability clusters compared to an average of 20.1 non-zero probability cells. The matrix factorization model had to choose from 3.3 different probability clusters compared to an average of 9.4 non-zero probability cells. Note that these observations depend on the grid resolution that represents the geographic space. The lower the resolution (i.e., larger cells), the smaller the ratio between the non-zero probability cells and probability clusters. In our datasets the resolution was set to small cells of 100 × 100 m.

Another observation from our dataset relates to the cell with the highest probability and the sum of probabilities for each probability cluster. We noticed that in 93% of the cases involving the frequent cells model and 97% of the cases for the Markov-based model, the cell with the highest probability is located in the probability cluster with the highest sum of probabilities.
Furthermore, for both algorithms, in almost 100% of the cases the cell with the highest probability was located in the cluster with the highest or second highest sum of probabilities. This observation justifies the greedy approach of starting the search for the best probability cluster from the highest probability cell. We utilize these two observations in our proposed methods for radius prediction by analyzing probability clusters as will be described in the following sub-sections.


5.1. Notations

Assume that at time t we attempt to predict the location of the user at time t + 1.

location_{t+1}: The user’s actual location (i.e., cell) at time t + 1.
predictions_t: Pairs of cells and probabilities ⟨cell_i, prob_i⟩ provided by the location prediction algorithm when predicting the location of the user at time t + 1 (the current time being t).
predicted_t: The predicted location (i.e., cell) for time t + 1; that is, the cell with the highest probability in predictions_t.
dist(cell_i, cell_j): The distance between the centers of cell_i and cell_j.
error_t = dist(predicted_t, location_{t+1}): The distance between the center of the user’s actual cell at time t + 1 and the center of the predicted cell (predicted at time t).
w_t: The time window of most recent predictions before time t.
errLimit: A configurable threshold used to filter predictions with large errors; the proposed methods use this threshold to remove completely wrong predictions.
maxRadius: A predefined (configurable) maximum radius that can be returned (predicted) by a radius prediction algorithm.
confidence: The assurance level in the prediction, calculated as the sum of probabilities of all cells in the predicted radius (0 ≤ confidence ≤ 1). In our case, it includes the probabilities of all cells that surround the predicted cell c up to radius size i. For example, radius 0 (zero) includes the exact cell c, radius 100 includes cell c and the eight cells surrounding c, and so on (as seen in Fig. 2).

Next, we present the proposed methods for dynamically determining the confidence area.

5.2. Error of recent predictions method

This method attempts to understand whether the recent behavior of the user is predictable, i.e., whether the user is moving in a pattern similar to those previously observed within the current context. The assumption is that if the most recent predictions of the model resulted in a high and/or unstable level of errors, the user is currently behaving in an unpredictable way, which should result in a larger confidence radius. Therefore, according to this method, when predicting the location at time t + 1, the confidence radius is defined as the average of the most recent errors (the most recent errors are those that fall within the time window w_t). This process is presented in Algorithm 1. First, the predicted cell, which is the cell in predictions_t with the highest probability, is selected (line 2). Then, we compute the radius of the predicted cell as the minimum of maxRadius and the average of previous prediction errors in time window w_t (lines 3–4). The computed radius r defines the confidence area. Finally, the confidence score is computed as the sum of probabilities of all cells within the confidence area (lines 5–6).
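The steps described for the error of recent predictions method can be sketched in a few lines of Python. This is only an illustration, not the authors' Algorithm 1: the function names, the dictionary representation of predictions_t, the use of square rings (as suggested by Fig. 2), rounding the average error up to a whole ring, and falling back to radius 0 when no usable errors exist are all assumptions made here.

```python
import math
from statistics import mean


def rings_between(cell_a, cell_b):
    """Number of square grid rings separating two (x, y) cells."""
    return max(abs(cell_a[0] - cell_b[0]), abs(cell_a[1] - cell_b[1]))


def confidence_in_radius(predictions, center, radius_m, cell_size_m=100):
    """Sum the probabilities of all cells within radius_m of center."""
    rings = radius_m // cell_size_m
    return sum(p for cell, p in predictions.items()
               if rings_between(cell, center) <= rings)


def recent_error_radius(predictions, recent_errors_m, max_radius_m,
                        err_limit_m, cell_size_m=100):
    """Radius = min(maxRadius, average of recent errors inside the window)."""
    # Predicted cell: the cell with the highest probability.
    predicted = max(predictions, key=predictions.get)
    # Drop completely wrong predictions (errors above errLimit).
    usable = [e for e in recent_errors_m if e <= err_limit_m]
    avg_error = mean(usable) if usable else 0  # fallback is an assumption
    # Assumption: round the average up to a whole number of rings.
    radius = math.ceil(avg_error / cell_size_m) * cell_size_m
    radius = min(max_radius_m, radius)
    confidence = confidence_in_radius(predictions, predicted, radius,
                                      cell_size_m)
    return predicted, radius, confidence


# Hypothetical input: three candidate cells and three recent errors (meters).
preds = {(5, 5): 0.5, (5, 6): 0.3, (8, 8): 0.2}
cell, radius, conf = recent_error_radius(preds, [100, 200, 900],
                                         max_radius_m=700, err_limit_m=500)
```

In this hypothetical input the 900 m error is filtered out by errLimit, so the radius is driven by the two remaining errors only.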

5.3. Error at predicted cell method

This method determines the confidence radius based on the previous errors related to the predicted cell, predicted_t. The confidence radius is calculated as the most frequent error previously seen for the predicted cell, thus eliminating the effect of extreme errors and anomalies. This process is presented in Algorithm 2. First, the predicted cell, which is the cell in predictions_t with the highest probability, is selected (line 2). Then, we compute the radius of the predicted cell as the minimum of maxRadius and the most frequent value in the set of previous errors made when predicting cell predicted_t (lines 3–4). Finally, the confidence score is computed as the sum of probabilities of all cells within the confidence area (lines 5–6).
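A minimal sketch of this idea follows. It is not the authors' Algorithm 2; it assumes that past errors are stored per predicted cell as multiples of the cell size (in meters), that ties between equally frequent errors are broken toward the smaller radius, and that a cell with no error history gets radius 0. All names are hypothetical.

```python
from collections import Counter


def cell_error_radius(predictions, past_errors_by_cell, max_radius_m,
                      err_limit_m, cell_size_m=100):
    """Radius = min(maxRadius, most frequent past error at the predicted cell)."""
    predicted = max(predictions, key=predictions.get)
    # Keep only past errors at this cell that are not completely wrong.
    errors = [e for e in past_errors_by_cell.get(predicted, [])
              if e <= err_limit_m]
    if errors:
        counts = Counter(errors)
        top = max(counts.values())
        # Assumption: on ties, prefer the smaller radius.
        mode = min(e for e, c in counts.items() if c == top)
    else:
        mode = 0  # assumption: no history means a zero radius
    radius = min(max_radius_m, mode)
    rings = radius // cell_size_m
    # Confidence: sum of probabilities inside the square confidence area.
    confidence = sum(
        p for (x, y), p in predictions.items()
        if max(abs(x - predicted[0]), abs(y - predicted[1])) <= rings)
    return predicted, radius, confidence


# Hypothetical history: errors (meters) observed when cell (2, 2) was predicted.
history = {(2, 2): [0, 100, 100, 300, 100, 2000]}
preds = {(2, 2): 0.40, (2, 3): 0.35, (6, 6): 0.25}
cell, radius, conf = cell_error_radius(preds, history,
                                       max_radius_m=700, err_limit_m=700)
```

Because the mode is used instead of the mean, the single 300 m error and the 2000 m outlier do not influence the resulting radius.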


Fig. 2. Radius sizes surrounding the center location.

Fig. 3. Different types of predicted locations – home, work, and mall/shopping center.

5.4. Probability aggregation method

This method is based on the assumption that every predicted location can be of a different type/nature, and therefore the movement patterns around the predicted location may be different. In grid-based location prediction methods this assumption is manifested by different distributions of probabilities of cells around the predicted location, i.e., probability clusters. An example of three different types of predicted locations (home, work, and mall/shopping center) is presented in Fig. 3. These three examples were observed in the users of the BGU dataset (see Section 6.2). In each of the three types of locations in the example the predicted cell is the cell with the highest probability. The home location is characterized by almost a single cell; the work location is characterized by a few cells around the predicted cell, possibly indicating that the user sometimes moves to different buildings/offices around their main working location; and in the mall/shopping center the probabilities are distributed over a larger number of cells around the predicted cell.

In each of the three cases the confidence radius is determined by the distribution pattern of probabilities. For the home location, returning a zero radius may be sufficient, since the predicted cell provides a 0.95 probability of being correct. For the work location, increasing the confidence radius to r = 100 increases the probability of being correct from 0.2 to 0.96. Similarly, for the shopping location, increasing the confidence radius to r = 100 increases the probability of being correct from 0.14 to 0.63, and increasing it further to r = 200 increases the probability of being correct from 0.63 to 1, according to previous trajectories.

The probability aggregation method is presented in Algorithm 3. It attempts to find the best confidence radius according to the type of predicted location. Since there is a tradeoff between the accuracy and the size of the confidence area (the larger the confidence area, the higher the chances of being correct in the prediction), the confidence radius should be optimal (i.e., the minimum required for a correct prediction) and also acceptable for the specific use-case. The inputs to the algorithm are ordered pairs of cells and their corresponding probabilities. The output of the algorithm consists of the predicted cell (the cell with the highest probability), the radius, and the resulting confidence level. The confidence of the predicted area is determined by the sum of the probabilities of the cells within the predicted radius.


The algorithm iteratively increments the radius around the predicted cell until the increase in the confidence of the new radius is smaller than a (configurable) α parameter (compared to the confidence of the previous radius), or until a maximum radius bound is reached.
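The stopping rule can be illustrated with the following sketch. It is an interpretation rather than the authors' Algorithm 3: the gain in confidence is assumed here to be the absolute difference between consecutive radii, rings are assumed to be square, and the example probabilities are made up to resemble the home and work cases discussed above.

```python
def probability_aggregation(predictions, alpha, max_radius_m, cell_size_m=100):
    """Grow the radius ring by ring while each ring adds at least alpha."""
    predicted = max(predictions, key=predictions.get)

    def confidence_at(rings):
        # Sum of probabilities of all cells within `rings` rings of the center.
        return sum(
            p for (x, y), p in predictions.items()
            if max(abs(x - predicted[0]), abs(y - predicted[1])) <= rings)

    rings = 0
    confidence = confidence_at(0)
    while (rings + 1) * cell_size_m <= max_radius_m:
        next_confidence = confidence_at(rings + 1)
        # Assumption: "increase" means the absolute gain in confidence.
        if next_confidence - confidence < alpha:
            break
        rings, confidence = rings + 1, next_confidence
    return predicted, rings * cell_size_m, confidence


# Hypothetical "home"-like distribution: nearly all mass in one cell.
home = {(0, 0): 0.95, (0, 1): 0.05}

# Hypothetical "work"-like distribution: 0.2 in the center, 0.76 in ring 1.
work = {(10, 10): 0.2, (13, 13): 0.04}
for dx in (-1, 0, 1):
    for dy in (-1, 0, 1):
        if (dx, dy) != (0, 0):
            work[(10 + dx, 10 + dy)] = 0.095

home_result = probability_aggregation(home, alpha=0.3, max_radius_m=700)
work_result = probability_aggregation(work, alpha=0.3, max_radius_m=700)
```

With α = 0.3 the home-like case stops at radius 0, while the work-like case expands once to 100 m and then stops because the next ring adds nothing.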

5.5. Recursive search method

The recursive search method also aggregates the probabilities around the predicted cell but uses a different approach than the previous method. It starts from the cell with the highest probability, but instead of extending the radius around that cell, it recursively grows the probability cluster by adding specific adjacent cells as long as there is a significant increase in the sum of probabilities. Once the probability cluster has been defined, the algorithm finds the optimal center location (cell) and radius that contains all cells in the probability cluster.

The recursive search method is presented in Algorithm 4 and Algorithm 5. The algorithm initializes the probability cluster with the cell with the highest probability (line 2 in Algorithm 4). Then, using the getCellsInCluster method (presented in Algorithm 5), it recursively adds new cells to the probability cluster. This is done by examining all of the adjacent cells of the cells already included in the probability cluster and estimating their contribution to the probability cluster. If the probability of an examined cell exceeds a certain threshold determined by the growing parameter α, it is added to the probability cluster (lines 2–4 in Algorithm 5). Once this process terminates and the probability cluster is determined, the determineOptimalArea method finds the optimal center and radius of the probability cluster (lines 4–6 in Algorithm 4). Finally, the confidence score is computed as the sum of probabilities of all cells within the confidence area (lines 7–8 in Algorithm 4).
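The following sketch shows one way the cluster growth and the optimal-area step could look. It is not a transcription of Algorithms 4 and 5: the threshold (α times the probability of the seed cell) is an assumption, since the text only says that the threshold is determined by α; the growth is written as an iterative flood fill, which visits the same cells as a recursive formulation; and the optimal area is taken to be the smallest square ring area covering the cluster.

```python
def get_cells_in_cluster(predictions, seed, alpha):
    """Flood-fill from the seed, adding adjacent cells above the threshold."""
    threshold = alpha * predictions[seed]  # assumed form of the threshold
    cluster = {seed}
    frontier = [seed]
    while frontier:
        x, y = frontier.pop()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbor = (x + dx, y + dy)
                if neighbor in cluster:
                    continue
                if predictions.get(neighbor, 0.0) > threshold:
                    cluster.add(neighbor)
                    frontier.append(neighbor)
    return cluster


def determine_optimal_area(cluster):
    """Center cell and smallest ring count whose square covers the cluster."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    cx = (min(xs) + max(xs)) // 2
    cy = (min(ys) + max(ys)) // 2
    rings = max(max(xs) - cx, cx - min(xs), max(ys) - cy, cy - min(ys))
    return (cx, cy), rings


def recursive_search(predictions, alpha, max_radius_m, cell_size_m=100):
    seed = max(predictions, key=predictions.get)
    cluster = get_cells_in_cluster(predictions, seed, alpha)
    center, rings = determine_optimal_area(cluster)
    radius = min(max_radius_m, rings * cell_size_m)
    used_rings = radius // cell_size_m
    confidence = sum(
        p for (x, y), p in predictions.items()
        if max(abs(x - center[0]), abs(y - center[1])) <= used_rings)
    return center, radius, confidence


# Hypothetical cluster stretching to one side of the highest-probability cell.
preds = {(5, 5): 0.4, (6, 5): 0.3, (7, 5): 0.2, (0, 0): 0.1}
center, radius, conf = recursive_search(preds, alpha=0.1, max_radius_m=700)
```

In this made-up example the cluster extends in one direction only, so the chosen center shifts from the highest-probability cell (5, 5) to (6, 5) and a 100 m radius covers the whole cluster, whereas a radius centered on (5, 5) would need 200 m to cover the same cells.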

Fig. 4 illustrates the confidence area determined by the probability aggregation method (black line) and the recursive search method (yellow line). It can be seen that the two methods perform differently in different cases. In the case on the left, the recursive search method is more efficient and covers the probability cluster using a small number of cells. In the case on the right the probability aggregation method covers a larger area containing multiple probability clusters as it starts from the highest probability cell and expands evenly towards all sides, while the recursive search method grows in the direction of the cluster.


Fig. 4. Two cases demonstrating the resulting confidence area according to the probability aggregation method (in black) and the recursive search method (in yellow). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

6. Evaluation

6.1. Configuration

In order to understand the effectiveness of the proposed methods we implemented the three location prediction algorithms presented in Section 4 and the four confidence-based methods presented in Section 5. We compared our four proposed methods to the previously used fixed radius approach. The fixed radius method returns a predefined confidence radius for all predictions, regardless of the user behavior or current context.

Do et al. [5] discussed the difficulty of predicting user location for a specific time period in the future, i.e., predicting not only where a user will be next, but also exactly when the user will be there. In our evaluation we attempt to predict the location of a user one hour in advance. Based on previous work we set the trajectory size of the N-Markov model to 2 (N = 2). In addition, in order to predict one hour in advance we considered user transitions every 30 min. By multiplying the Markov transition matrix, we were able to obtain one-hour trip predictions, as discussed in [5].

All methods were tested with 0 ≤ maxRadius ≤ 700 (i.e., 0, 100, 200, ..., 700 m). The probability aggregation and the recursive search methods were tested with α ∈ {0.01, 0.1, 0.2, 0.3, 0.4, 0.5}. The w_t parameter of the error of recent predictions method was set to four hours; this was determined based on our preliminary evaluation.

6.2. Datasets

We conducted the evaluation and tested the proposed methods using three different datasets described below (different attributes of the datasets are presented in Table 1).

Geolife 1.3: The Geolife GPS trajectory dataset was collected by Microsoft Research Asia from 182 users over a period of more than three years starting in 2007 [13,37,38]. A GPS trajectory in this dataset is represented by a sequence of timestamped positions, each of which contains the latitude, longitude, and altitude. In this dataset we focused on the area of Beijing and created a grid of 200 by 200 cells, each measuring 100 × 100 m. We selected the data of 15 users who spent more than 50% of their time in the chosen area and had more than 30 distinct days of transmission.

LLDC: This dataset was collected by Nokia Research using the Nokia N95 phones of nearly 170 participants from Lausanne, Switzerland during 2010–2011. The data collection software runs in the background of the phones in a non-intrusive manner, yielding data on modalities such as social interaction and spatial behavior [14,39]. We created a grid of 120 by 190 cells, each measuring 100 × 100 m. We selected 31 users who spent more than 50% of their time within the chosen area and had the greatest number of continuous transmission hours of data.

BGU: We conducted an experiment with 25 Android mobile device owners (students at Ben-Gurion University of the Negev) over a period of five months in 2015 in the city of Beer-Sheva, Israel. The size of the city is approximately 8 × 8.5 km, and we divided the area into 80 by 85 cells, each measuring 100 × 100 m. During the experiment we asked the participants to turn on their GPS, and using a dedicated application we tracked their device every five minutes. Because high prediction accuracy was required, we removed location samples with an error of more than 100 m. 76% of the samples had high accuracy.

As can be seen in Table 1, in the BGU dataset the average number of hours per day with transmitted data (standard deviation in parentheses) is 22, compared to ∼4–5 h per day in the two other datasets. For all of the datasets each day was divided into eight time slots of three hours beginning at 7am. Location prediction models were created for each time slot.
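The two preprocessing rules mentioned above (discarding samples whose reported sensor error exceeds 100 m, and assigning each sample to one of eight three-hour slots starting at 7am) are simple enough to sketch. The sample layout (a dictionary with `timestamp` and `accuracy_m` keys) and the slot numbering are assumptions made for illustration only.

```python
from datetime import datetime

MAX_SENSOR_ERROR_M = 100   # samples with a larger reported error are dropped
SLOT_HOURS = 3             # eight slots of three hours
FIRST_SLOT_START_HOUR = 7  # slot 0 covers 07:00-09:59


def time_slot(timestamp):
    """Index (0-7) of the three-hour slot that contains the timestamp."""
    return ((timestamp.hour - FIRST_SLOT_START_HOUR) % 24) // SLOT_HOURS


def keep_accurate(samples):
    """Keep only samples whose reported error is at most 100 m."""
    return [s for s in samples if s["accuracy_m"] <= MAX_SENSOR_ERROR_M]


# Hypothetical samples; field names are illustrative.
samples = [
    {"timestamp": datetime(2015, 3, 1, 7, 5), "accuracy_m": 12.0},
    {"timestamp": datetime(2015, 3, 1, 10, 0), "accuracy_m": 250.0},
    {"timestamp": datetime(2015, 3, 1, 23, 30), "accuracy_m": 100.0},
]
kept = keep_accurate(samples)
slots = [time_slot(s["timestamp"]) for s in kept]
```

Under this numbering the slot containing midnight (22:00 to 00:59) has index 5, and the last slot (04:00 to 06:59) has index 7.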


Table 1 Description of datasets.

| Dataset | City | Users | Distinct days | Transmission h/day, avg (stdev) | Number of cells | Years | Coordinates |
|---|---|---|---|---|---|---|---|
| BGU | Beer-Sheva | 20 | 190.35 | 22.07 (4.72) | 80 × 85 | 2015 | MinLatitude: 31.212072, MaxLatitude: 31.288824, MinLongitude: 34.751552, MaxLongitude: 34.835936 |
| LLDC | Lausanne | 31 | 228.13 | 4.55 (3.18) | 190 × 120 | 2010–2011 | MinLatitude: 46.494078, MaxLatitude: 46.601959, MinLongitude: 6.463294, MaxLongitude: 6.711030 |
| Geolife | Beijing | 15 | 141.13 | 5.14 (3.34) | 200 × 200 | 2008–2009 | MinLatitude: 39.824, MaxLatitude: 40.004, MinLongitude: 116.263, MaxLongitude: 116.497 |
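To make the grid construction concrete, the sketch below maps a GPS coordinate to a cell index using the BGU bounding box from Table 1. It assumes that the bounding box is divided evenly in degrees, that the 80 columns run along longitude and the 85 rows along latitude (the latitude span is the longer side of the roughly 8 × 8.5 km area), and that points outside the box are discarded. The function name and tuple layout are hypothetical.

```python
# BGU bounding box from Table 1: (min_lat, max_lat, min_lon, max_lon).
BGU_BBOX = (31.212072, 31.288824, 34.751552, 34.835936)
BGU_COLS, BGU_ROWS = 80, 85  # assumed: columns along longitude, rows along latitude


def to_cell(lat, lon, bbox, n_cols, n_rows):
    """Return the (col, row) cell containing the point, or None if outside."""
    min_lat, max_lat, min_lon, max_lon = bbox
    if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
        return None
    # Fraction of the way across the box, scaled to the number of cells.
    col = int((lon - min_lon) / (max_lon - min_lon) * n_cols)
    row = int((lat - min_lat) / (max_lat - min_lat) * n_rows)
    # Points exactly on the upper edge belong to the last cell.
    return min(col, n_cols - 1), min(row, n_rows - 1)


south_west = to_cell(31.212072, 34.751552, BGU_BBOX, BGU_COLS, BGU_ROWS)
north_east = to_cell(31.288824, 34.835936, BGU_BBOX, BGU_COLS, BGU_ROWS)
outside = to_cell(32.0, 34.8, BGU_BBOX, BGU_COLS, BGU_ROWS)
```

An even split in degrees yields cells that are only approximately 100 × 100 m; a projection to a metric coordinate system would be needed for exact cell sizes.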

Fig. 5. A Stay case is indicated by the first ring around the black cell (red arrows), while the Move cases are indicated by the rest of the cells (green arrows). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

6.3. Evaluation measures

Our evaluation is based on the following definitions:

Successful prediction: a prediction is defined as successful if the actual location falls within the predicted area. The predicted area is defined by the central cell and the radius returned by the selected method for radius prediction.
Move: we define move as the case in which the location of the user after one hour (the prediction time frame) is beyond the current cell’s first ring, as indicated by the green arrows in Fig. 5.
Stay: we define stay as the case in which the actual cell of the user after one hour (the prediction time frame) is in the first ring of the current cell, as indicated by the red arrows in Fig. 5.
Accuracy: we define accuracy as the percentage of successful predictions out of the total predictions made for the move cases.
Average radius: the average radius size calculated for a set of location predictions made during the evaluation.

We intentionally separated Move from Stay cases because users tend to spend most of their time in the same locations (e.g., home, work, class); without this separation the overall accuracy rate can approach 80%, according to Monreale et al. [40]. Therefore, our evaluation focuses on improving the Move cases, which are more difficult to predict. We assume that when predicting the Stay cases the radius is fixed (and equal to one) around the current cell of the user, and therefore the accuracy rate is not affected by the radius prediction methods.

6.4. Results

Tables 2–4 compare the proposed methods with the fixed radius method for each of the three datasets: LLDC, Geolife, and BGU, respectively. Each table presents the improvement in accuracy (column H) over the fixed radius method achieved by each of the proposed methods (column D) for each location prediction algorithm: frequent cells, Markov chain and matrix factorization (column A).
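The measures defined in Section 6.3 can be illustrated with a short sketch. The record layout, the use of square rings to measure distance, the treatment of the current cell itself as a Stay case, and computing the average radius over Move cases only are assumptions made here; they are not taken from the authors' implementation.

```python
def rings_between(cell_a, cell_b):
    """Number of square grid rings separating two (x, y) cells."""
    return max(abs(cell_a[0] - cell_b[0]), abs(cell_a[1] - cell_b[1]))


def is_move(current_cell, actual_cell):
    """Move: the actual cell lies beyond the first ring of the current cell."""
    return rings_between(current_cell, actual_cell) > 1


def evaluate(records, cell_size_m=100):
    """Accuracy and average radius over the Move cases only."""
    moves = [r for r in records if is_move(r["current"], r["actual"])]
    if not moves:
        return None, None
    # A prediction succeeds if the actual cell falls inside the predicted area.
    hits = sum(
        rings_between(r["center"], r["actual"]) * cell_size_m <= r["radius_m"]
        for r in moves)
    accuracy = hits / len(moves)
    average_radius = sum(r["radius_m"] for r in moves) / len(moves)
    return accuracy, average_radius


# Hypothetical evaluation records (cells are (x, y) grid indices).
records = [
    {"current": (0, 0), "actual": (0, 1), "center": (0, 0), "radius_m": 100},
    {"current": (0, 0), "actual": (5, 5), "center": (5, 4), "radius_m": 100},
    {"current": (0, 0), "actual": (3, 3), "center": (6, 6), "radius_m": 200},
]
accuracy, average_radius = evaluate(records)
```

The first record is a Stay case and is excluded; of the two Move cases, one falls inside its predicted area, giving an accuracy of 0.5 at an average radius of 150 m.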
Table 2 Accuracy comparison of the proposed methods with the fixed radius approach (LLDC dataset).

| # | Location prediction algorithm (A) | Weeks (B) | # of predictions (C) | Method (D) | Configuration (E) | Avg. predicted radius, m (F) | Fixed radius, m (G) | Accuracy improvement (H) | Correlation (I) | df (J) | Significance, 2-tailed (K) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Frequent Cells | First 10 | 82 651 | CellError | MaxRadius: 5 | 93 | 100 | 7.10% | 0.811 | 309 | 0.000 |
| 2 | Frequent Cells | First 10 | 82 651 | CellError | MaxRadius: 9 | 175 | 200 | 6.70% | 0.845 | 309 | 0.000 |
| 3 | Frequent Cells | 10–20 | 30 244 | CellError | MaxRadius: 5 | 94 | 100 | 10.80% | 0.915 | 104 | 0.000 |
| 4 | Frequent Cells | 10–20 | 30 244 | CellError | MaxRadius: 9 | 206 | 200 | 8.90% | 0.923 | 104 | 0.000 |
| 5 | Markov | First 10 | 18 851 | CellError | MaxRadius: 5 | 103 | 100 | 10.20% | 0.880 | 196 | 0.000 |
| 6 | Markov | First 10 | 18 851 | CellError | MaxRadius: 9 | 172 | 200 | 6.40% | 0.880 | 196 | 0.000 |
| 7 | Markov | 10–20 | 11 888 | CellError | MaxRadius: 7 | 97 | 100 | 12.30% | 0.940 | 83 | 0.000 |
| 8 | Markov | 10–20 | 11 888 | CellError | MaxRadius: 9 | 135 | 200 | 3.50% | 0.955 | 83 | 0.134 |
| 9 | MatrixFact | First 10 | 27 265 | CellError | MaxRadius: 5 | 101 | 100 | 8.4% | 0.881 | 229 | 0.000 |
| 10 | MatrixFact | First 10 | 27 265 | CellError | MaxRadius: 9 | 192 | 200 | 7.1% | 0.906 | 229 | 0.000 |
| 11 | MatrixFact | 10–20 | 31 381 | CellError | MaxRadius: 5 | 94 | 100 | 11.4% | 0.945 | 133 | 0.000 |
| 12 | MatrixFact | 10–20 | 31 381 | CellError | MaxRadius: 9 | 172 | 200 | 5.5% | 0.975 | 133 | 0.000 |

Table 3 Accuracy comparison of the proposed methods with the fixed radius approach (Geolife dataset).

| # | Location prediction algorithm (A) | Weeks (B) | # of predictions (C) | Method (D) | Configuration (E) | Avg. predicted radius, m (F) | Fixed radius, m (G) | Accuracy improvement (H) | Correlation (I) | df (J) | Significance, 2-tailed (K) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Frequent Cells | First 10 | 139 192 | CellError | MaxRadius: 4 | 102 | 100 | 8.80% | 0.764 | 64 | 0.000 |
| 2 | Frequent Cells | First 10 | 139 192 | CellError | MaxRadius: 8 | 201 | 200 | 11.70% | 0.717 | 64 | 0.000 |
| 3 | Frequent Cells | 10–20 | 127 494 | CellError | MaxRadius: 7 | 106 | 100 | 3.90% | 0.793 | 35 | 0.063 |
| 4 | Frequent Cells | 10–20 | 127 494 | CellError | MaxRadius: 9 | 154 | 200 | 3.60% | 0.816 | 35 | 0.101 |
| 5 | Markov | First 10 | 31 425 | CellError | MaxRadius: 2 | 86 | 100 | 9.60% | 0.912 | 47 | 0.000 |
| 6 | Markov | First 10 | 31 425 | CellError | MaxRadius: 6 | 203 | 200 | 14.20% | 0.869 | 47 | 0.001 |
| 7 | Markov | 10–20 | 20 327 | CellError | MaxRadius: 4 | 98 | 100 | 10.60% | 0.800 | 25 | 0.001 |
| 8 | Markov | 10–20 | 20 327 | CellError | MaxRadius: 9 | 181 | 200 | 12.60% | 0.788 | 25 | 0.002 |
| 9 | MatrixFact | First 10 | 86 234 | CellError | MaxRadius: 4 | 102 | 100 | 9.6% | 0.866 | 70 | 0.000 |
| 10 | MatrixFact | First 10 | 86 234 | CellError | MaxRadius: 9 | 183 | 200 | 7.6% | 0.838 | 70 | 0.009 |
| 11 | MatrixFact | 10–20 | 65 278 | CellError | MaxRadius: 4 | 94 | 100 | 5.7% | 0.792 | 35 | 0.001 |
| 12 | MatrixFact | 10–20 | 65 278 | CellError | MaxRadius: 9 | 179 | 200 | 7.9% | 0.801 | 35 | 0.002 |

Table 4 Accuracy comparison of the proposed methods with the fixed radius approach (BGU dataset).

| # | Location prediction algorithm (A) | Weeks (B) | # of predictions (C) | Method (D) | Configuration (E) | Avg. predicted radius, m (F) | Fixed radius, m (G) | Accuracy improvement (H) | Correlation (I) | df (J) | Significance, 2-tailed (K) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Frequent Cells | First 10 | 24 039 | CellError | MaxRadius: 9 | 93 | 100 | 6.00% | 0.901 | 157 | 0.000 |
| 2 | Frequent Cells | First 10 | 24 039 | ProbAgg | Alpha: 0.6 | 190 | 200 | 4.30% | 0.945 | 157 | 0.000 |
| 3 | Frequent Cells | 10–20 | 29 840 | Recursive | Alpha: 0.01 | 118 | 100 | 10.10% | 0.903 | 290 | 0.000 |
| 4 | Frequent Cells | 10–20 | 29 840 | Recursive | Alpha: 0.0 | 183 | 200 | 10.60% | 0.894 | 290 | 0.000 |
| 5 | Frequent Cells | 10–20 | 29 840 | ProbAgg | Alpha: 0.9 | 202 | 200 | 5.90% | 0.915 | 290 | 0.000 |
| 6 | Markov | First 10 | 14 469 | CellError | MaxRadius: 2 | 81 | 100 | 9.20% | 0.832 | 150 | 0.000 |
| 7 | Markov | First 10 | 14 469 | ProbAgg | Alpha: 0.9 | 98 | 100 | 9.60% | 0.884 | 150 | 0.000 |
| 8 | Markov | First 10 | 14 469 | CellError | MaxRadius: 4 | 208 | 200 | 12.70% | 0.806 | 150 | 0.000 |
| 9 | Markov | First 10 | 14 469 | ProbAgg | Alpha: 0.4 | 190 | 200 | 9.80% | 0.838 | 150 | 0.000 |
| 10 | Markov | 10–20 | 21 358 | Recursive | Alpha: 0.01 | 91 | 100 | 11.10% | 0.762 | 281 | 0.000 |
| 11 | Markov | 10–20 | 21 358 | CellError | MaxRadius: 2 | 93 | 100 | 7.10% | 0.770 | 281 | 0.000 |
| 12 | Markov | 10–20 | 21 358 | ProbAgg | Alpha: 0.9 | 106 | 100 | 12.40% | 0.827 | 281 | 0.000 |
| 13 | Markov | 10–20 | 21 358 | Recursive | Alpha: 0.0 | 132 | 200 | 7.00% | 0.770 | 281 | 0.000 |
| 14 | Markov | 10–20 | 21 358 | CellError | MaxRadius: 3 | 183 | 200 | 9.50% | 0.804 | 281 | 0.000 |
| 15 | Markov | 10–20 | 21 358 | ProbAgg | Alpha: 0.5 | 195 | 200 | 11.30% | 0.789 | 281 | 0.000 |
| 16 | MatrixFact | First 10 | 15 518 | TimeError | MaxRadius: 4 | 102 | 100 | 7.4% | 0.928 | 85 | 0.000 |
| 17 | MatrixFact | First 10 | 15 518 | CellError | MaxRadius: 4 | 194 | 200 | 10.6% | 0.860 | 85 | 0.000 |
| 18 | MatrixFact | First 10 | 15 518 | ProbAgg | Alpha: 0.6 | 195 | 200 | 8.9% | 0.906 | 85 | 0.000 |
| 19 | MatrixFact | 10–20 | 20 629 | Recursive | Alpha: 0.0 | 94 | 100 | 9.5% | 0.891 | 172 | 0.000 |
| 20 | MatrixFact | 10–20 | 20 629 | CellError | MaxRadius: 4 | 209 | 200 | 12.3% | 0.866 | 172 | 0.000 |

In order to compare the dynamic radius prediction methods with the fixed radius method, for each method we present the configuration (column E) that achieved the average predicted radius (column F) closest to the fixed radius setting of 100 or 200 m (column G). For each dataset we used 20 consecutive weeks containing sufficient data and applied a chronological evaluation such that for each user the first n weeks were used for training, and week n + 1 was used for evaluation. The total number of predictions over all datasets exceeded 600,000. In order to understand the effectiveness of the proposed methods over time (as more data became available for training), we compared the performance of the proposed methods for two time periods: the first 10 weeks (weeks 0–10) and the following 10 weeks (weeks 10–20), as indicated in column B. In some (rare) cases in which the number of location samples was relatively low we used more than 20 weeks.

Finally, in order to statistically compare the proposed methods with the fixed radius approach, we conducted a t-test on the accuracy of each method. In this test we compared the methods for each user and week while selecting the configurations which provided the same (or similar) average radius. For each pair, the table presents the Pearson correlation between the compared methods (column I) and the t-test results, including the t-statistic, the degrees of freedom (column J), and the significance of a 2-tailed t-test (column K). Note that in Tables 2–4 we only present the methods that outperformed the fixed radius method: for the LLDC and Geolife datasets, the error at predicted cell method (denoted by CellError), and for the BGU dataset, the probability aggregation (denoted by ProbAgg) and recursive search (denoted by Recursive) methods.

We draw the following conclusions from the results presented in Tables 2–4.

The proposed methods outperform the fixed radius method.
In almost all of the above comparisons we can see a significant improvement (p-value < 0.01) in the number of correct predictions per week and user. This improvement is observed for all three location prediction algorithms, a finding which demonstrates that the proposed methods are generic and can be applied to any location prediction algorithm.

Sampling rate and accuracy matter. The best improvement in average accuracy for the Geolife and LLDC datasets (8.19% and 8.82%, respectively) was achieved by the error at predicted cell method. The probability aggregation and recursive search methods also significantly outperformed the fixed radius method for the BGU dataset (improvement in average accuracy of 8.89% and 9.28%, respectively). The main reason is that, as mentioned before, the BGU dataset is less sparse than the Geolife and LLDC datasets and contains location samples that are collected at a high rate and with high accuracy. Such data are therefore better exploited by the probability aggregation and recursive search methods, while the error at predicted cell method has no advantage in this case.

Proposed methods are robust to cold start. Even though location prediction algorithms suffer from the cold start problem, the methods we present still perform well. It can be observed that the improvement in accuracy does not depend on the time period in which the predictions were made (the first 10 weeks or weeks 10–20).

Fig. 6 presents the tradeoff between the (average) size of the confidence radius (the x-axis) and the location prediction accuracy (the y-axis) for each of the datasets (LLDC, Geolife, and BGU) and location prediction algorithm (frequent cells, Markov chain and matrix factorization). The presented results are averaged over all users and predictions, and each marker represents a different configuration of a specific method (defined by the growing coefficient and the maximum radius).
The black line represents the results of the fixed radius method, and the optimum scatter represents the most accurate possible results considering the maximum predicted radius. The results presented in Fig. 6 show that for the Geolife and LLDC datasets the error at predicted cell method demonstrated improved average accuracy for the same average radius. For the BGU dataset the best performance was obtained by the recursive search and probability aggregation methods. This is due to the nature of the BGU dataset, which contains more accurate and continuous data than the Geolife and LLDC datasets, with location samples collected at a high rate and with high accuracy. These observations are consistent for all three location prediction algorithms.

We also wanted to understand whether, in general, the proposed methods provide (predict) the optimal radius for each location prediction (i.e., the radius that is the closest to the distance between the predicted location and the actual location of the user), and in particular, which method provides better predictions. Therefore, given a set of radius predictions {r_i}, we define the OptimalHit rate as follows:

\[
\mathrm{OptimalHit} = \frac{\left|\{\, r_i \mid r_i - 100 < \mathrm{errorDistance}_i \le r_i \,\}\right|}{\left|\{\, r_i \,\}\right|}
\]

where |{r_i}| is the total number of radius predictions made (r_i ∈ {0, 100, 200, 300, ...}), and |{r_i | r_i − 100 < errorDistance_i ≤ r_i}| is the number of radius predictions for which the error distance (i.e., the distance between the predicted location and the actual location) is between r_i − 100 and r_i. This set contains the predictions with an optimal radius, meaning that if the predicted radius were smaller, the actual location would not be contained in the prediction, and a larger radius prediction would be inefficient.

We evaluate the Optimal Hit rate for r, r + 100 and r − 100, as presented in Figs. 7–9 for the three datasets: LLDC, Geolife, and BGU, respectively. While the Optimal Hit rate for r indicates the percentage of optimal predictions, the Optimal Hit rate for r + 100 indicates the percentage of correct radius predictions we would have gained if the radius had been 100 m larger (adding one more ring), and the Optimal Hit rate for r − 100 indicates the percentage of radius predictions we would have lost if the radius had been 100 m smaller.
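A direct computation of the OptimalHit rate is sketched below. The `shift_m` parameter is an assumed way of evaluating the rate "for r + 100" and "for r − 100", namely by substituting the shifted radius into the same condition; the text does not spell out this detail, and the input values are invented.

```python
def optimal_hit(radii_m, error_distances_m, shift_m=0, step_m=100):
    """Share of predictions whose error lies in (r - step, r] for r = r_i + shift."""
    hits = 0
    for radius, error in zip(radii_m, error_distances_m):
        shifted = radius + shift_m
        if shifted - step_m < error <= shifted:
            hits += 1
    return hits / len(radii_m)


# Hypothetical predicted radii and observed error distances, in meters.
radii = [100, 200, 0, 300]
errors = [50, 200, 0, 150]

rate_r = optimal_hit(radii, errors)                    # optimal radius
rate_plus = optimal_hit(radii, errors, shift_m=100)    # one ring larger
rate_minus = optimal_hit(radii, errors, shift_m=-100)  # one ring smaller
```

In this made-up input three of the four radii are optimal; only the last prediction (radius 300 m, error 150 m) is larger than necessary, which is why it counts as a hit for r − 100 instead.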


Fig. 6. The tradeoff between the (average) size of the confidence radius (the x-axis) and the location prediction accuracy (the y-axis) for each of the datasets (LLDC, Geolife, and BGU) and the three location prediction algorithms.

From the three graphs it can be observed that the fixed radius method provides a similar hit rate for r, r − 100, and r + 100. This supports the assumption that setting a fixed radius results in an arbitrary prediction that is not sensitive to the context at the time of prediction. On the other hand, the error at predicted cell method showed the best predictions, that is: (1) most of the predictions fall within the predicted radius (r); (2) increasing the radius by one additional ring (r + 100) has no significant contribution to the predictions; and (3) reducing the radius by one ring (r − 100) significantly reduces the accuracy of predictions. It can also be observed that the probability aggregation and recursive search methods provide good predictions for the BGU dataset, a finding which supports our previous conclusion that these two probabilistic methods perform better for more consistent (i.e., more frequently sampled) and accurate datasets.

Finally, in Fig. 10 we present the average ratio of accurate predictions per week (i.e., for a given week, the number of predictions that were within the predicted radius divided by the total number of predictions) for the error at predicted cell, probability aggregation, and recursive search methods. The ratio is averaged over all users, weeks, location prediction methods, and configurations, and calculated only for the BGU dataset, because the performance of the methods was relatively similar, as can be seen in Table 4.

Fig. 7. Average Optimal Hit value for the LLDC dataset.

Fig. 8. Average Optimal Hit value for the Geolife dataset.

Fig. 9. Average Optimal Hit value for the BGU dataset.

Fig. 10. Accuracy ratio for the BGU dataset.

6.5. Discussion

The proposed methods are designed to dynamically set the confidence radius by considering four different factors: the recent behavior of the user, the type of the predicted location, the confidence in the prediction, and the sampling error. The Error of Recent Predictions method models the recent behavior of the user, while the Error at Predicted Cell, Probability Aggregation, and Recursive Search methods implicitly learn the type of the predicted location. The sampling error and confidence in the predictions are handled by all methods; i.e., all four methods can help in providing better and more meaningful predictions in different cases of sampling errors and confidence in the predictions.

From the evaluation results we could see that the Error at Predicted Cell model achieved the best results in predicting the confidence radius for sparse and inconsistent datasets such as LLDC and Geolife. For these datasets the users’ location was not measured periodically and consistently (location sampling occurs, on average, in 4–5 h per day, and in these periods the sampling rate was every few seconds). For that reason, in these datasets we can find many location samples around the same cells, which allows the Error at Predicted Cell model to provide a better estimation of the confidence area. The location sampling in the BGU dataset is more consistent: the average number of measured hours per day for a user is around 22, with a sampling interval of five minutes. Although this is not frequent enough to assess small movements, it still covers longer periods, which allows the user’s whereabouts to be profiled in a way that is both continuous and periodic.
For this dataset the Recursive Search model performed best when predicting small confidence areas, and the Probability Aggregation model performed best when predicting larger confidence areas. Due to the frequent and consistent sampling in such a dataset, probabilistic approaches get a better view of the visited clusters of locations and thus can provide a better prediction of the confidence radius (area).

The Error of Recent Predictions method provided the lowest performance compared to the other methods, although it is still better than the fixed radius approach. The reason for this is that this method focuses on the highest probability cell and sets the radius around this cell based on the average of recent prediction errors without considering the type of predicted location (i.e., type of POI), which makes this method less efficient than the other models in terms of the tradeoff between accuracy and average predicted radius.

These observations provide guidelines on which method to apply for a given dataset (or data stream), as well as the basis for an intelligent ensemble of the four methods. The probabilistic methods (probability aggregation and recursive search) are more efficient and better suited for datasets with accurate and frequent location samples. The error at predicted cell method is more appropriate for sparser datasets.

The effectiveness of the proposed approach can also be seen by comparing it with a random baseline. For the BGU dataset, setting a fixed confidence radius of 200 m (i.e., 25 cells) around the predicted cell resulted in 44% accuracy, while a random selection of a predicted location results in 0.36% accuracy. For the Geolife and LLDC datasets the location prediction algorithms with the same fixed radius of 200 m resulted in up to 36% and 27% accuracy, respectively, while a random selection of a predicted location results in 0.0625% and 0.13% accuracy (respectively).
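The random-selection figures can be approximated with a simple calculation: a 200 m radius covers a 5 × 5 block of 100 m cells, and a uniformly random center hits the user's cell with probability equal to that block size divided by the total number of grid cells. The sketch below assumes uniform selection over the whole grid and ignores edge effects; it is shown for the BGU and Geolife grid sizes from Table 1, and the function names are hypothetical.

```python
def cells_in_radius(radius_m, cell_size_m=100):
    """Number of cells in the square area covered by the given radius."""
    rings = radius_m // cell_size_m
    return (2 * rings + 1) ** 2


def random_baseline(radius_m, n_cols, n_rows, cell_size_m=100):
    """Chance that a uniformly random area of this size contains the user."""
    return cells_in_radius(radius_m, cell_size_m) / (n_cols * n_rows)


bgu = random_baseline(200, 80, 85)        # 25 / 6800, about 0.37%
geolife = random_baseline(200, 200, 200)  # 25 / 40000 = 0.0625%
```

Under these assumptions the BGU value comes out at roughly 0.37%, close to the 0.36% quoted in the text, and the Geolife value is exactly 0.0625%.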
Thus, the dynamic radius prediction models can further profile the confidence radius for different states/contexts and improve the predictions of state-of-the-art location prediction methods, making them up to 500 times better than a random prediction.

7. Conclusions and future work

In this paper we propose post-processing methods for grid-based location prediction algorithms that dynamically determine a confidence radius for each predicted location. We applied the proposed radius prediction methods on the output of three location prediction algorithms (frequent cells, Markov chain and matrix factorization models) using three datasets and then compared the methods with the fixed radius approach. The results demonstrated the ability of our methods to dynamically determine an efficient confidence radius that increases prediction accuracy while maintaining, on average, a small radius. We believe that this is an essential component for grid-based location prediction algorithms, as it can take into account different factors that may affect the accuracy of each prediction and set the confidence area accordingly, thus resulting in a more informative and usable prediction.

The confidence area, when applied to grid-based location prediction algorithms, helps overcome the shortcomings of grid-based algorithms relative to POI-based location prediction algorithms. As mentioned, POIs provide semantics and additional information that can be utilized for predicting locations. For example, different POIs (e.g., gas station, supermarket or home) indicate how long the user is going to spend at that POI. Applying the confidence area to grid-based location prediction algorithms can implicitly reveal the POI within the relevant cells, thus improving the accuracy of the location prediction algorithm.

In the future we intend to exploit the advantages of each of the four proposed radius prediction methods by applying an ensemble of the methods to determine the best combination of the presented models. This ensemble can be based on a predefined rule-based model (e.g., if the location prediction algorithm is based on a Markov chain and there is only a small amount of data to learn from, use the error at predicted cell method) or on a machine learning process that can be adapted over time (e.g., a regression model which assigns a weight to each model based on the model’s previous performance).
The ensemble approach should be capable of optimally switching between the four described methods to perform the task of dynamic radius prediction. In addition, we plan to use measures of user speed and direction to enhance confidence area prediction (e.g., if the user is currently driving, we might want to assign a larger confidence radius at the predicted location).

Finally, in future work the proposed method for calculating the confidence radius of a predicted location can be extended to other domains as well. Examples of these domains include estimating the location of textual documents [41], or the location of multimedia resources (e.g., photos [42], videos [43], and music files [44]) based on their metadata and contents. As in the location prediction case, the estimation of location in these cases can be influenced by different types of errors, which can be analyzed (e.g., the topic of the text or the quality of the media file) in order to provide an optimal confidence radius for the actual location of the object.

Acknowledgment

This research was supported by Deutsche Telekom AG. In this paper we used the MDC Database made available by Idiap Research Institute, Switzerland and owned by Nokia.

References

[1] Trinh Minh Tri Do, Daniel Gatica-Perez, Where and what: Using smartphones to predict next locations and applications in daily life, Pervasive Mobile Comput. 12 (2014) 79–91.
[2] Manlio De Domenico, Antonio Lima, Mirco Musolesi, Interdependence and predictability of human mobility and social interactions, Pervasive Mobile Comput. 9 (6) (2013) 798–807.
[3] Trinh Minh Tri Do, Daniel Gatica-Perez, Contextual conditional models for smartphone-based human mobility prediction, in: Proceedings of the ACM Conference on Ubiquitous Computing, ACM, 2012.
[4] John Krumm, Eric Horvitz, Predestination: Inferring destinations from partial trajectories, in: Proceedings of the International Conference on Ubiquitous Computing, Springer, Berlin Heidelberg, 2006.
[5] Trinh Minh Tri Do, Olivier Dousse, Markus Miettinen, Daniel Gatica-Perez, A probabilistic kernel method for human mobility prediction with smartphones, Pervasive Mobile Comput. 20 (2015) 13–28.
[6] Ivana Nizetic, Kresimir Fertalj, Damir Kalpic, A prototype for the short-term prediction of moving object’s movement using Markov chains, in: Proceedings of the 31st International Conference on Information Technology Interfaces, ITI, IEEE, 2009.
[7] Abson Sae-Tang, Michele Catasta, Lucas Kelsey McDowell, Karl Aberer, Semantic place prediction using mobile data, in: Proceedings of the Mobile Data Challenge Workshop, MDC, no. EPFL-CONF-182133, 2012.
[8] Rein Ahas, Siiri Silm, Olle Järv, Erki Saluveer, Margus Tiru, Using mobile positioning data to model locations meaningful to users of mobile phones, J. Urban Technol. 17 (1) (2010) 3–27.
[9] Sébastien Gambs, Marc-Olivier Killijian, Miguel Núñez del Prado Cortez, Next place prediction using mobility Markov chains, in: Proceedings of the 1st Workshop on Measurement, Privacy, and Mobility, ACM, 2012.
[10] Yu-Liang Tang, Der-Jiunn Deng, Yannan Yuan, Chun-Cheng Lin, Yueh-Min Huang, Dividing sensitive ranges based mobility prediction algorithm in wireless networks, in: Proceedings of the 6th International Conference on Wireless Communications and Mobile Computing, ACM, 2010, pp. 1223–1227.
[11] Theodoros Anagnostopoulos, Christos Anagnostopoulos, Stathes Hadjiefthymiades, Mobility prediction based on machine learning, in: Proceedings of the 12th International Conference on Mobile Data Management (ICDM), IEEE, 2011.
[12] http://www.andygup.net/how-accurate-is-android-gps-part-1-understanding-location-data/.
[13] Yu Zheng, Xing Xie, Wei-Ying Ma, GeoLife: A collaborative social networking service among user, location and trajectory, IEEE Data Eng. Bull. 33 (2) (2010) 32–39.
[14] Niko Kiukkonen, Jan Blom, Olivier Dousse, Daniel Gatica-Perez, Juha Laurila, Towards rich mobile phone datasets: Lausanne data collection campaign, in: Proceedings of the International Conference on ICPS, Berlin, 2010.
[15] http://developer.android.com/reference/android/location/Location.html.
[16] Shanshan Feng, Xutao Li, Yifeng Zeng, Gao Cong, Yeow Meng Chee, Quan Yuan, Personalized ranking metric embedding for next new POI recommendation, in: Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI, 2015.
[17] Xin Lu, Erik Wetter, Nita Bharti, Andrew J. Tatem, Linus Bengtsson, Approaching the limit of predictability in human mobility, Sci. Rep. 3 (2013).


[18] L. Li, W. Chu, J. Langford, R.E. Schapire, A contextual-bandit approach to personalized news article recommendation, in: Proceedings of the 19th International Conference on World Wide Web, ACM, 2010, pp. 661–670.
[19] Daniel Ashbrook, Thad Starner, Learning significant locations and predicting user movement with GPS, in: Proceedings of the 6th International Symposium on Wearable Computers, ISWC, IEEE, 2002, pp. 101–108.
[20] Bill N. Schilit, et al., Challenge: Ubiquitous location-aware computing and the place lab initiative, in: Proceedings of the 1st International Workshop on Wireless Mobile Applications and Services on WLAN Hotspots, ACM, 2003, pp. 29–35.
[21] Saikath Bhattacharya, Sudhansu Sekhar Singh, Location prediction using efficient radial basis neural network, in: Proceedings of the International Conference on Information and Network Technology, IPCSIT, Vol. 4, 2011.
[22] Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan, Predicting the next location: A recurrent model with spatial and temporal contexts, in: Proceedings of the 30th Conference on Artificial Intelligence, AAAI, 2016.
[23] Yohan Chon, Hyojeong Shin, Elmurod Talipov, Hojung Cha, Evaluating mobility models for temporal prediction with high-granularity mobility data, in: Proceedings of the International Conference on Pervasive Computing and Communications, PerCom, IEEE, 2012, pp. 206–212.
[24] Yehuda Koren, Robert Bell, Chris Volinsky, Matrix factorization techniques for recommender systems, Computer 42 (8) (2009) 30–37.
[25] Huiji Gao, Jiliang Tang, Huan Liu, Exploring social-historical ties on location-based social networks, in: Proceedings of the 6th International Conference on Web and Social Media, ICWSM, 2012.
[26] Defu Lian, Cong Zhao, Xing Xie, Guangzhong Sun, Enhong Chen, Yong Rui, GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation, in: Proceedings of the 20th International Conference on Knowledge Discovery and Data Mining, SIGKDD, ACM, 2014, pp. 831–840.
[27] John Lafferty, Chengxiang Zhai, Document language models, query models, and risk minimization for information retrieval, in: Proceedings of the 24th International Conference on Research and Development in Information Retrieval (SIGIR), ACM, 2001.
[28] Akinori Asahara, Kishiko Maruyama, Akiko Sato, Kouichi Seto, Pedestrian-movement prediction based on mixed Markov-chain model, in: Proceedings of the 19th International Conference on Advances in Geographic Information Systems (SIGSPATIAL), ACM, 2011, pp. 25–33.
[29] A. Fridman, Mixed Markov models, Proc. Natl. Acad. Sci. 100 (14) (2003) 8092–8096.
[30] Josh Jia-Ching Ying, Wang-Chien Lee, Tz-Chiao Weng, Vincent S. Tseng, Semantic trajectory mining for location prediction, in: Proceedings of the 19th International Conference on Advances in Geographic Information Systems (SIGSPATIAL), ACM, 2011, pp. 34–43.
[31] Yong Liu, Wei Wei, Aixin Sun, Chunyan Miao, Exploiting geographical neighborhood characteristics for location recommendation, in: Proceedings of the 23rd International Conference on Conference on Information and Knowledge Management, ACM, 2014, pp. 739–748.
[32] Vincent W. Zheng, Yu Zheng, Xing Xie, Qiang Yang, Collaborative location and activity recommendations with GPS history data, in: Proceedings of the 19th International Conference on World Wide Web, ACM, 2010, pp. 1029–1038.
[33] Chaoming Song, Zehui Qu, Nicholas Blumm, Albert-László Barabási, Limits of predictability in human mobility, Science 327 (5968) (2010) 1018–1021.
[34] Long Vu, Quang Do, Klara Nahrstedt, Jyotish: A novel framework for constructing predictive model of people movement from joint wifi/bluetooth trace, in: Proceedings of the International Conference on Pervasive Computing and Communications, PerCom, IEEE, 2011.
[35] Chen Cheng, Haiqin Yang, Irwin King, Michael R. Lyu, Fused matrix factorization with geographical and social influence in location-based social networks, in: Proceedings of the International Conference on Artificial Intelligence (AAAI), Vol. 12, 2012, pp. 17–23.
[36] Badrul Sarwar, George Karypis, Joseph Konstan, John Riedl, Item-based collaborative filtering recommendation algorithms, in: Proceedings of the 10th International Conference on World Wide Web, ACM, 2001, pp. 285–295.
[37] Elisabeth Lex, Oliver Pimas, Jörg Simon, Viktoria Pammer-Schindler, Where am I? Using mobile sensor data to predict a user’s semantic place with a random forest algorithm, in: Proceedings of the International Conference on Mobile and Ubiquitous Systems: Computing, Networking, and Services, Springer, Berlin Heidelberg, 2012, pp. 64–75.
[38] Yu Zheng, Lizhu Zhang, Xing Xie, Wei-Ying Ma, Mining interesting locations and travel sequences from GPS trajectories, in: Proceedings of the 18th International Conference on World Wide Web, ACM, 2009, pp. 791–800.
[39] J.K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T. Do, O. Dousse, J. Eberle, M. Miettinen, The Mobile Data Challenge: Big data for mobile computing research, in: Proc. Mobile Data Challenge Workshop (MDC) in conjunction with Int. Conf. on Pervasive Computing, Newcastle, 2012.
[40] A. Monreale, F. Pinelli, R. Trasarti, F. Giannotti, WhereNext: a location predictor on trajectory pattern mining, in: Proceedings of the 15th International Conference on Knowledge Discovery and Data Mining (SIGKDD), ACM, pp. 637–646.
[41] Fernando Melo, Bruno Martins, Automated geocoding of textual documents: A survey of current approaches, Trans. GIS (2016).
[42] Pavel Serdyukov, Vanessa Murdock, Roelof Van Zwol, Placing Flickr photos on a map, in: Proceedings of the 32nd International Conference on Research and Development in Information Retrieval (SIGIR), ACM, 2009.
[43] Jaeyoung Choi, Claudia Hauff, Olivier Van Laere, Bart Thomee, The placing task at MediaEval 2015, in: Proceedings of the Ceur Workshop, CEUR, 2015.
[44] Zhou Fang, Q. Claire, Ross D. King, Predicting the geographical origin of music, in: Proceedings of the International Conference on Data Mining, ICDM, IEEE, 2014.