Physica A 535 (2019) 122255
The implicit network inferred from users’ residences and workplaces enhancing collaborative recommendation on smartphones ∗
Yubo Jiang a, Yunfang Zhu b, Xin Du a,∗, Tao Jin a
a College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
b College of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
Highlights
• A novel hierarchical method is proposed to build the semantic location-based network.
• Users with neighboring semantic locations tend to have similar tastes and interests.
• Popular but unused items have high confidence being unattractive to target users.
• Both the network and the hybrid recommender are built in large-scale settings.
Article info
Article history: Received 15 January 2019; received in revised form 14 June 2019; available online 8 August 2019.
Keywords: Interpersonal similarity; Individual preference; Semantic locations; Hierarchical neighbor discovery; Implicit feedback; Matrix factorization
Abstract
Personalized recommendation based on side information extracted from social networks has achieved promising performance in numerous applications. However, such side information is generally derived from users’ explicit interactions, such as Twitter connections or trust lists, which are often unavailable in practice. By contrast, side information from users’ implicit networks is easy to obtain. Thus, in this paper, we consider a similarity network built on users’ semantic locations, assuming that users with neighboring semantic locations have similar consumption habits. To demonstrate this, we evaluate our study in a practical scenario: smartphone recommendation based on operator records. A novel recommendation paradigm is designed, which includes three key steps: discovery of semantic locations, hierarchical construction of the neighbor network, and item popularity-based recommendation that incorporates interpersonal similarities. The empirical results illustrate that our method outperforms the state-of-the-art methods (a 13% coverage improvement over models without networks and about 5% over the model with a call-log based network). © 2019 Published by Elsevier B.V.
∗ Corresponding author: X. Du. https://doi.org/10.1016/j.physa.2019.122255. 0378-4371/© 2019 Published by Elsevier B.V.
1. Introduction
Today, personalized recommendation is prevalent in real-world business scenarios, as it helps to substantially increase both customer satisfaction and business profits by capturing personalized preferences from historical feedback [1,2]. Among the various implementation methods, matrix factorization (MF) is the most popular and effective technique; it characterizes both users and items by decomposing the user–item interaction matrix into the product of two lower-dimensional rectangular matrices. The MF method and its variants leverage users’ feedback either explicitly (e.g., ratings [3] or favors [4]) or implicitly (e.g., purchases [5] or message forwarding [6]). In general, they formulate the recommendation task as a preference prediction problem by learning user–item latent feature vectors from observed entries. Recently, He et al. proposed a well-performing MF variant that considers item popularity [2]. However, in most applications, effective recommendation does not only depend on user–item interactions but also involves users’ interpersonal influence, which often precludes the use and the potential advantage of sophisticated MF methods [7].
Fig. 1. The schematic of our semantic location based recommender.
Social psychology has shown that people with similar tastes and interests tend to influence each other [8,9]. Many studies on recommendation explore the interpersonal side information extracted from customers’ explicit social networks to infer the similarities of the customers’ preferences. However, in most real-world scenarios, explicit user relationships are far from readily available, especially in large-scale settings [10]. In contrast to employing explicit networks, Ma acquired an implicit network based on users’ ratings to indicate similarities of their preferences [10]. This success suggests that, when no explicit networks are available, implicit networks of interpersonal similarities are a good choice.
Currently, the boom of mobile technology provides a brand-new source of social relational statistics. Locations such as users’ residences and workplaces have been comprehensively analyzed in many user-profiling applications [11,12]. Studies have implied that most people spend the largest amount of their time in a small number of meaningful locations (e.g., residences, workplaces, schools) [11,13]. In this paper, we call them semantic locations, and assume that people with adjacent semantic locations are likely to have similar shopping tastes and interests.
As a result, we learn interpersonal similarities from an implicit network (i.e., the semantic location-based network).
In this study, we reframe the popularity-based matrix factorization (PopMF) method [2] by including the interpersonal similarities from users’ semantic locations. Fig. 1 shows all the factors in our model. In daily life, people prefer items that they are interested in; this factor is termed individual preference. Similar to MF-based methods, we first learn latent features for both users and items from user–item interactions. In the case of implicit feedback, we adopt the item popularity-based weighting [2] as confidence coefficients for entries missed by uninterested users. Consequently, we can learn the individual preference using these factors. People with similar interests have similar behaviors. In this paper, we learn these interpersonal similarities based on semantic locations. We first collect people’s semantic locations, i.e., the user–POI (Point Of Interest) relationships, from their records. Then, based on POI–POI distances, we propose the Hierarchical Neighbor Discovery (HND) method to build an implicit network (i.e., the semantic location-based network) for user–user similarity. Under the assumption that users’ similarities are consistent, these similarities are utilized to regularize users’ latent features.
To sum up, we study users’ choices of smartphones considering two factors: individual preference and interpersonal similarity. The main contributions are summarized as follows.
1. Our HND algorithm adopts a hierarchical paradigm to effectively build the implicit network. It has linear complexity in the number of users, which is particularly useful in large-scale settings.
2. We propose the PopMF-based Hybrid Recommender System (PHRS), which weights missing feedback based on item popularity and adds location-based similarities into the objective. As far as we know, it is the first method that adopts semantic locations to enhance recommendation.
3. We carefully consider the privacy issues and hide the personal location information of the relevant users (e.g., where they live or work) through a coding method based on an isometric transformation.
Comprehensive experiments are conducted on an important task: smartphone recommendation through usage records from operators. The empirical results illustrate that our PHRS significantly outperforms the state-of-the-art methods in large-scale settings (i.e., records of over 370,000 anonymous users).
The rest of this paper is organized as follows. First, we outline related studies in Section 2. Then, we describe our dataset and its preprocessing in Section 3. Related notations are listed in Section 4. Next, we show how to acquire users’ semantic locations and introduce the HND method in Section 5. Section 6 describes the proposed hybrid recommender. Finally, we show the results and analysis in Section 7, and provide a conclusion in Section 8.
2. Related work
In this section, we review related approaches, which include (1) MF methods for both explicit and implicit feedback, (2) recommenders with interpersonal side information, and (3) studies on finding users’ semantic locations.
2.1. Matrix factorization methods
Early MF-based methods focus on explicit feedback from users, which directly reflects users’ preference on items. For example, Zhou et al. formulated a recommender based on users’ scoring of movies [3], and Jawaheer employed users’ preference assessments of songs (like or dislike) for recommendation [4]. However, acquiring such explicit feedback in real-world applications is sometimes difficult. Instead, implicit feedback (e.g., purchases [5], message forwarding [6]) is easy to collect. Compared to explicit feedback, implicit feedback is more challenging to utilize because of the natural scarcity of negative information. Modeling only observed positive feedback results in a biased representation of the user profile [14]. A simple solution is to treat all the missing data as negative feedback. However, this regards all negative feedback as equal and ignores missing but attractive items. Rendle assumed that customers prefer used items over unused ones and proposed a pair-wise ranking model (i.e., Bayesian Personalized Ranking, BPR) [15]. Unfortunately, this method degrades learning efficiency due to the full consideration of every observed and missing interaction [2]. Alternatively, He et al. employed a popularity-aware weighting strategy for these missing but potential items [2]. Inspired by He’s success, in this paper, we extend their popularity-based matrix factorization (PopMF) method with other side information.
2.2. Recommenders with users’ social networks
As the saying goes, ‘‘birds of a feather flock together’’: people in the same social relationships have similar tastes and interests [8,9]. Based on this idea, researchers have proposed to improve user profiling using interpersonal side information. Ma first assumed that different levels of trust exist among friends and proposed a trust-aware model for recommendation [16]. Since then, many recommenders have been constructed with users’ social networks. Chen et al. integrated side information from users’ trust-based relations and made recommendations by network diffusion [17]. Jiang constrained users’ latent factors using their similarities on scoring among friends [18], whereas Qian introduced the role of interpersonal influence among friends [9]. In addition, Tang deemed that such interpersonal influence varies for different categories of items and proposed a domain-based recommender [19]. With comprehensive consideration, Zhao combined individuals’ preferences, interpersonal similarities of interests, and interpersonal similarities to make personalized recommendations [20]. Later, they extended their model through geographical side information with the assumption that friends living in different locations have different influence for different items [21]. However, these methods generally need people’s explicit networks (e.g., users’ Twitter connections and trust lists), which are sometimes hard to obtain, especially when privacy protection is considered. Thus, to enhance recommendation, we learn interpersonal similarities from users’ implicit networks.
2.3. Semantic location discovery
The advancement of location-acquisition technologies such as GPS and Wi-Fi has enabled people to conveniently record their location histories [22]. This location-based information provides a novel source of massive data about where and when users behave [12]. Studies have shown that people tend to spend most of their time in a small number of locations [11,13]. It is meaningful to discover such locations, not only to enhance customer profiling but also to promote the discovery of urban functional areas [12,23].
Unfortunately, these locations cannot be directly obtained due to the lack of personal information, especially in large-scale and anonymous settings. To this end, we collect them from customers’ historical records. First, we need to collect POIs as candidates for users’ semantic locations. A common solution is to perform density-based clustering according to the occurrence frequencies of locations and set the cluster centers as the POIs [24,25]. After obtaining the POIs, the locations’ semantics are determined based on temporal characteristics, such as the common registration time [25] or the statistical value of residence time [26]. For example, Dash discovered users’ POIs by inactive time and learned their semantics from registration periods [11]. In addition, Yu acquired POIs through a two-dimensional kernel density estimation based on frequencies and learned semantics according to registration periods [12].
Among these locations, residences and workplaces are the most remarkable [11]. For example, the workplace usually implies the occupation of each user, and the place of residence is commonly related to users’ income level. These two locations are important for customer profiling. Therefore, we choose these two semantics for subsequent studies and will introduce others in future studies.
Table 1 A sample of our anonymized dataset.
Name        Sample
Timestamp   1486287225
UserID      U2FsdGVkX19wbs...qfDwaAV22ffp0=
TAC         86153303
Location    (31.22, 121.56)
3. Data description
3.1. Overview
Previous location-based studies rely only on users’ call detail records (CDRs). However, cellular data usage (i.e., mobile Internet) has become the main form of people’s mobile life and offers fine-grained insights into behavior analysis (e.g., ratings on business [14], application usage [27]). In this paper, the experimental data includes anonymous records of both call details and cellular usage from operators. When users request data access on the mobile network, the records passively capture the anonymized identification (ID) of each mobile user, the location of the base station sector, the timestamp of the data connection, and the Type Allocation Code (TAC) of the device. Specifically, our dataset contains records of more than 370,000 anonymous users from October 24, 2016, to February 15, 2017.
3.2. Measures for privacy
It is worth noting that privacy issues have been carefully considered and measures have been taken to protect the privacy of the involved users.
1. Records are collected via a collaboration with one mobile operator, excluding any personally identifiable information.
2. The ‘‘user ID’’ field has been anonymized via a two-step non-reversible AES encryption and hash process, which makes it impossible to trace back to the original ID.
3. The initial location (i.e., the latitude and longitude) is at the mobile cell tower level, covering a range of 500 m. To further protect privacy, locations are anonymized through an arbitrary isometric transformation. It hides the actual locations of cell towers but preserves their relative distances.
In addition, all researchers are subject to strict confidentiality protocols and the dataset is stored on secure off-line servers. There is no personal information in our dataset. Table 1 shows a sample of this dataset.
3.3. Cleaning process
Data cleaning is an essential step for modeling. At first, we filter out invalid records that cannot be matched to locations or smartphones, and obtain data containing 266,424 users, 5353 smartphones, 79 recording days and 92,663 distinct cell towers. In order to provide customers with sufficient services, operators always set up cell towers covering overlapping areas, especially in densely populated areas. Consequently, the number of cell towers is large, although they all come from one Chinese city.
Preliminary analyses are conducted to see how smartphone usage is distributed. Fig. 2a shows the distribution of the number of different smartphones every user has used, whereas Fig. 2b shows the user coverage as the number of smartphones increases.
As Fig. 2a shows, the number of different smartphones users have used follows a power-law distribution. This indicates that most users have seldom changed smartphones in this period (i.e., 73 days). As the report from Counterpoint Research stated in 2017, the global average replacement time for smartphones is 21 months.¹ It is abnormal for users to change smartphones too frequently in this period. We therefore discard users according to the number of smartphones they have used. However, too much discarding would impact the training. For simplicity, users who have used more than 4 smartphones (about 5% of total users) are excluded.
From Fig. 2b, we can see that most customers only use a small number of smartphones. Therefore, most records can be covered by considering a small number of smartphones. For clarity, we select the 872 most popular smartphones, covering more than 95% of users. Table 2 shows the final dataset.
¹ https://www.counterpointresearch.com/zh-hans/smartphone-users-replace-their-device-every-twenty-one-months.
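The two cleaning rules above (dropping users with more than 4 smartphones, then keeping the most popular smartphones until a target share of users is covered) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `(user_id, phone_id)` record schema, the function name `clean` and the greedy coverage loop are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def clean(records, max_phones=4, coverage=0.95):
    """Filter users and smartphones from (user_id, phone_id) pairs (hypothetical schema)."""
    phones_by_user = defaultdict(set)
    for user, phone in records:
        phones_by_user[user].add(phone)

    # Rule 1: drop users who used more than `max_phones` distinct smartphones.
    kept = {u: p for u, p in phones_by_user.items() if len(p) <= max_phones}

    # Popularity of a smartphone = number of distinct kept users who used it.
    popularity = Counter(phone for phones in kept.values() for phone in phones)

    # Rule 2: add smartphones in decreasing popularity until enough users are covered.
    selected, covered = set(), set()
    target = coverage * len(kept)
    for phone, _ in popularity.most_common():
        if len(covered) >= target:
            break
        selected.add(phone)
        covered.update(u for u, p in kept.items() if phone in p)

    # Keep only users who still own at least one selected smartphone.
    users = {u: p & selected for u, p in kept.items() if p & selected}
    return users, selected
```

The inner `covered.update` scan is linear in the number of users per smartphone, which is fine for a sketch; an inverted index from smartphone to users would be the natural choice at the scale reported in the paper.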
Fig. 2. The quantity distribution of smartphones users used (left panel), and the coverage of users as the number of smartphones increases (right panel).
Table 2 The overview of our experimental dataset.
Notations     Number
Users         246,193
Smartphones   872
Days          79
Locations     92,571
Table 3 Notations and their description.
Notation        Description                                  Notation                      Description
$N$             Number of users                              $u$                           User
$M$             Number of items                              $i$                           Item
$L$             Involved locations                           $S$                           Relevant semantics
$perd_s$        Period for semantic $s$                      $L_u$                         Semantic locations of user $u$
$l_u^s$         Location of user $u$ on semantic $s$         $\phi_u^s, \lambda_u^s$       Longitude and latitude of location
$k_{\max}$      Maximum number of neighbors                  $d_{\max}$                    Maximum distance between neighbors
$G$             Semantic Location-Based Network              $E_u$                         Semantic neighbors of $u$
$x_{ui}$        Observed interaction between $u$ and $i$     $y_{ui}$                      Predicted probability of $u$ interacting with $i$
$r_{ui}$        Predicted preference of $u$ on $i$           $w_{ui}$                      Popularity-based weight of predicted probability
$\Gamma_i$      Set of users that interact with item $i$     $f_i$                         Popularity of item $i$
$w_0, \tau$     Parameters for popularity-based weights      $s_{uv}$                      Similarity between user $u$ and $v$
$p_u$           Latent factors of user $u$                   $q_i$                         Latent factors of item $i$
$d$             Dimension of the latent space                $\alpha, \beta, \gamma$       Trade-off in objectives
4. Problem formulation
The notations utilized in our model are given in Table 3. In this paper, we aim to improve personalized recommendation of smartphones using side information from customers’ historical locations. Suppose we have $N$ users and $M$ items (i.e., smartphones) in our dataset. Given a user $u$ and an item $i$, their interaction is denoted as $x_{ui}$, as Eq. (1) shows.
$$ x_{ui} = \begin{cases} 1, & u \text{ used } i \\ 0, & \text{otherwise} \end{cases} \tag{1} $$
Then, the task becomes recommending unused items to users based on their observed interactions and other factors. As Fig. 1 shows, our model considers two aspects: individual preference and interpersonal similarity. For individual preference, we employ the weighted MF method. Given a user $u$ and an item $i$, let $p_u \in \mathbb{R}^d$ and $q_i \in \mathbb{R}^d$ be their latent feature vectors, respectively. The predicted preference is denoted as $r_{ui}$ and the popularity-aware confidence as $w_{ui}$. For interpersonal similarity, we learn it based on users’ semantic locations. Given a user $u$, let $L_u$ be the user’s semantic locations. The semantic location-based network is $G$ and $u$’s semantic neighbors are denoted as $E_u$. For every neighbor $v \in E_u$, the interpersonal similarity $s_{uv}$ is defined by semantic distances and is then used to regularize users’ latent vectors.
With the help of these factors, we propose the hybrid recommender described below. For clarity, the detailed process is introduced in the following sections.
5. Semantic locations and the location-based network
In this section, we build user profiles from geographical side information. Roughly speaking, we first acquire users’ semantic locations from their historical movements. Then, a new neighborhood discovery method, namely the HND method, is proposed to build a semantic location-based network. Finally, observations on users’ smartphone selection are presented.
5.1. Semantic locations acquisition
At first, we define the semantic locations as follows.
Definition 1 (Users’ Semantic Locations). Given a user $u$, its semantic locations $L_u$ satisfy
1. $L_u = \{ l_u^s : s \in S,\ l_u^s \in L \}$.
2. $|L_u| \le |S|$, $l_u^s = (\phi_u^s, \lambda_u^s)$.
where $\phi_u^s, \lambda_u^s$ denote the longitude and latitude of location $l_u^s$.
Semantic locations can be obtained from people’s historical records held by operators. In this paper, we employ the two most remarkable semantics (i.e., residences and workplaces). Studies have concluded that locations with different semantics can be well distinguished by stay-in periods [11,23]. For example, people are thought to be at home from 9pm to 7am, and in the workplace from 9am to 5pm. Thus, for a given semantic $s$, candidate locations are collected in the following steps: (1) set a rational period $perd_s$, (2) filter the records according to this period, (3) extract the remaining locations as candidates for each user. Thereafter, density-based clustering methods are employed to collect semantic locations (i.e., the residences and the workplaces) [12]. Without loss of generality, we employ a MeanShift method with a two-dimensional Gaussian kernel when clustering. For each user, the complexity of finding semantic locations depends only on the number of that user’s candidates. Therefore, this method has linear complexity with respect to the number of users, namely, $\Omega(N)$.
5.2. Hierarchical neighbor discovery method
Tobler’s First Law of Geography states ‘‘Everything is related to everything else, but near things are more related than distant things’’. Similar conclusions can be drawn for users’ behaviors. For example, customers working nearby are supposed to have close salaries and lifestyles, whereas people living nearby tend to have similar consuming habits. These similar factors lead to common habits and tastes of users. It is therefore reasonable to learn interpersonal similarities from these locations (i.e., the semantic locations). At first, we define the distance between users’ semantic locations as follows. For users $u$ and $v$, the semantic distance is defined by Eq. (2).
$$ D(u, v) = \sqrt{\frac{1}{|S|} \sum_{s \in S} d^2\left(l_u^s, l_v^s\right)} \tag{2} $$
where the geographical distance $d(x, y)$ is calculated by the Haversine formula [28]. Given thresholds (i.e., the maximum distance $d_{\max}$ and the maximum size of neighborhood $k_{\max}$), we obtain the following network.
Definition 2 (Semantic Location-based Network). Let $G(V, E)$ denote the network, which satisfies
1. $V \subseteq [N]$, $E \subseteq [N] \times [N]$.
2. $E = \{(u, v) : D(u, v) \le d_{\max},\ \mathrm{rank}_u(v) \le k_{\max}\}$.
where $[N]$ denotes the set of users, and $\mathrm{rank}_u(v) = |\{ v' : D(v', u) < D(v, u),\ \forall v' \in [N] \}|$ returns the number of other users closer to $u$ than $v$.
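To make Eq. (2) and Definition 2 concrete, the sketch below computes the semantic distance with the Haversine formula and builds the network by exhaustive comparison. Several points are assumptions for illustration only: coordinates are taken as (latitude, longitude) in degrees, the Earth radius constant is a standard mean value, every user is assumed to have a location for every semantic, and the rank condition is read as "keep the `k_max` nearest other users".

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def semantic_distance(loc_u, loc_v):
    """Eq. (2): root mean square of the per-semantic geographical distances."""
    squared = sum(haversine(loc_u[s], loc_v[s]) ** 2 for s in loc_u)
    return math.sqrt(squared / len(loc_u))


def brute_force_network(locations, d_max, k_max):
    """Definition 2 by pairwise comparison; quadratic in the number of users."""
    edges = {}
    for u, loc_u in locations.items():
        ranked = sorted(
            (semantic_distance(loc_u, loc_v), v)
            for v, loc_v in locations.items() if v != u
        )
        # Keep the k_max nearest other users that also lie within d_max.
        edges[u] = [v for dist, v in ranked[:k_max] if dist <= d_max]
    return edges
```

This exhaustive version is only practical for small user sets, but it is a convenient reference against which a faster construction can be checked.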
The k-Nearest Neighbor (KNN) method is a common solution to obtain this network. However, it needs to compare distances between every two users, resulting in $\Omega(N^2)$ complexity, which is impractical in large-scale settings. To overcome this, a novel method, the HND method, is proposed. It includes two steps: (1) divide users into smaller groups under the maximum distance $d_{\max}$, and (2) apply a group-wise KNN method to obtain neighbors for every user.
Fig. 3. The algorithm diagram about the HND method.
The grouping process includes two steps: (1) divide users into blocks based on their corresponding locations, and (2) expand each block with users of adjacent blocks. Given a user $u$ and a semantic $s$, let $l_u^s \in L$ be the semantic location. As Table 1 shows, locations are recorded by longitudes and latitudes, that is, $l_u^s = (\phi_u^s, \lambda_u^s)$. We implement the dividing process as Eq. (3) shows.
$$ B(x, y \mid d_{\max}) = \left\{ u : u \in [N],\ \left\lfloor \frac{\phi_u^s}{d_{\max}} \right\rfloor \equiv x,\ \left\lfloor \frac{\lambda_u^s}{d_{\max}} \right\rfloor \equiv y \right\} \tag{3} $$
where $(x, y)$ is the divided block ID, and $\lfloor x \rfloor$ denotes the largest integer not exceeding $x$. Thereafter, the adjacent neighbors $\tilde{B}(x, y \mid d_{\max})$ are defined as Eq. (4) shows.
$$ \tilde{B}(x, y \mid d_{\max}) = \bigcup_{\Delta x, \Delta y = \pm 1} B(x + \Delta x, y + \Delta y \mid d_{\max}) \tag{4} $$
The grouping result is then obtained in the following form.
$$ \hat{B}(x, y \mid d_{\max}) = B(x, y \mid d_{\max}) \cup \tilde{B}(x, y \mid d_{\max}) \tag{5} $$
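The block construction in Eqs. (3) to (5) amounts to a grid hash followed by a neighborhood lookup. The sketch below illustrates it for one semantic. It assumes that coordinates have already been mapped to planar values in the same unit as `d_max` (the paper divides longitude and latitude by `d_max` directly), and it reads the adjacent blocks as the eight blocks surrounding `(x, y)`; both points are assumptions of this example.

```python
from collections import defaultdict
from math import floor


def build_blocks(points, d_max):
    """Eq. (3): assign each user to the block (floor(x / d_max), floor(y / d_max))."""
    blocks = defaultdict(set)
    for user, (x, y) in points.items():
        blocks[(floor(x / d_max), floor(y / d_max))].add(user)
    return blocks


def augmented_block(blocks, block_id):
    """Eq. (5): users of the block itself plus users of the surrounding blocks."""
    bx, by = block_id
    users = set()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty entries in the defaultdict.
            users |= blocks.get((bx + dx, by + dy), set())
    return users
```

Because each user is hashed once and each lookup touches a fixed number of blocks, the grouping cost grows linearly with the number of users.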
Apparently, for any user $u \in B$, this augmented set includes all neighbor candidates whose distance is less than $d_{\max}$.
Fig. 3 shows this method. Specifically, users are first grouped according to their residence to collect residence-based neighbor candidates; then, for the users grouped in every residence-based set, we employ a similar process based on their workplace; after hierarchical grouping, the network is obtained through a group-wise KNN method.
Unfortunately, this method would be impractical when the number of semantics increases. Accordingly, we convert the previous hierarchical method into a parallel one. Let $(x_u^s, y_u^s)$ denote the block ID for user $u$ under semantic $s$. Then, $u$’s neighbor candidates $\hat{B}_u$ are defined as Eq. (6) shows.
$$ \hat{B}_u = \left\{ v : (\phi_v^s, \lambda_v^s) \in \hat{B}(x_u^s, y_u^s),\ \forall s \in S \right\} \tag{6} $$
Obviously, it can be equivalently written in an intersection form, $\hat{B}_u = \bigcap_{s \in S} \hat{B}^s(x_u^s, y_u^s)$.
In other words, we group users under each semantic, and obtain the final neighbor candidates by intersection. Detailed steps for the HND method, which considers only two significant semantics (i.e., residence and workplace), are listed in Algorithm 1. This improved method is now applicable to scenarios with many semantics.
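The parallel form of HND described above (group per semantic, intersect the candidate sets, then rank within the group) can be sketched as a single function. This is an illustrative reading rather than the authors' code: it assumes planar coordinates in the same unit as `d_max`, replaces the Haversine distance with a Euclidean one, and treats the rank condition as "keep the `k_max` nearest candidates". The function name `hnd` and the input layout are hypothetical.

```python
from collections import defaultdict
from math import floor, hypot, sqrt


def hnd(locations, d_max, k_max):
    """Neighbor lists from {user: {semantic: (x, y)}} with planar coordinates (assumed)."""
    semantics = sorted(next(iter(locations.values())))

    # Assign every user to one block per semantic, as in Eq. (3).
    blocks = {s: defaultdict(set) for s in semantics}
    block_id = {}
    for u, loc in locations.items():
        for s in semantics:
            bid = (floor(loc[s][0] / d_max), floor(loc[s][1] / d_max))
            block_id[u, s] = bid
            blocks[s][bid].add(u)

    def augmented(s, bid):
        # Eq. (5): the block itself plus its surrounding blocks.
        found = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                found |= blocks[s].get((bid[0] + dx, bid[1] + dy), set())
        return found

    def distance(u, v):
        # Eq. (2), with Euclidean distance standing in for Haversine.
        total = sum(
            hypot(locations[u][s][0] - locations[v][s][0],
                  locations[u][s][1] - locations[v][s][1]) ** 2
            for s in semantics
        )
        return sqrt(total / len(semantics))

    network = {}
    for u in locations:
        # Intersection form of Eq. (6): candidates must be close under every semantic.
        candidates = set.intersection(
            *(augmented(s, block_id[u, s]) for s in semantics)
        )
        candidates.discard(u)
        # Group-wise KNN: rank only the candidates, not all users.
        ranked = sorted((distance(u, v), v) for v in candidates)
        network[u] = [v for dist, v in ranked[:k_max] if dist <= d_max]
    return network
```

Note that the grid keeps only candidates that are within one block under every semantic, so a pair that is far apart in one semantic is filtered out before Eq. (2) is evaluated.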
5.3. Complexity analysis of HND
The goal of this section is to give the complexity analysis of the method shown in Algorithm 1. The first part contains steps 2 to 7. It is equivalent to first obtaining both block IDs of each user based on his/her residence and workplace, and then grouping users based on these IDs. These steps have linear complexity with respect to the number of users, namely, $\Omega(N)$ complexity.
The second part contains steps 8 to 17. Step 9 has already been shown to have $\Omega(N)$ complexity in the previous part. For steps 10 and 11, we obtain all candidates for every user by Eq. (5), with no more than $\Omega(N)$ complexity. For step 12, simple set operations are performed for every user, again with $\Omega(N)$ complexity. Similar conclusions can be drawn for steps 14 and 15 as well.
The complexity of step 13 depends on the expected number of candidates, namely, $E\left[|\hat{B}_u|\right]$ for every user. Our purpose is to make reasonable assumptions and give an upper bound. Given a user $u$, let $\hat{B}_u$ be the candidates. Thus, that complexity is determined by the probability that a user $v$ is grouped into this block, as Eq. (7) shows.
$$ P\left(v \in \hat{B}_u\right) = P\left(v \in \bigcap_{s \in \{R, W\}} \hat{B}^s\right) \tag{7} $$
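Algorithm 1 The steps for Hierarchical Neighbor Discovery
Input: settings for the network $d_{\max}$ and $k_{\max}$
Output: the network $G(V, E)$
1: Initialize the network $G(V, E)$
2: for $u \leftarrow 1, 2, ..., N$ do
3:   Collect the semantic locations $L_u = \{(\phi_u^R, \lambda_u^R), (\phi_u^W, \lambda_u^W)\}$
4:   $x_u^R \leftarrow \lfloor \phi_u^R / d_{\max} \rfloor$, $y_u^R \leftarrow \lfloor \lambda_u^R / d_{\max} \rfloor$, $x_u^W \leftarrow \lfloor \phi_u^W / d_{\max} \rfloor$, $y_u^W \leftarrow \lfloor \lambda_u^W / d_{\max} \rfloor$
5:   $ID_u \leftarrow (x_u^R, y_u^R, x_u^W, y_u^W)$
6: end for
7: Obtain blocks $B^R$, $B^W$ based on the IDs as in Eq. (3)
8: for $u \leftarrow 1, 2, ..., N$ do
9:   $x_u^R \leftarrow \lfloor \phi_u^R / d_{\max} \rfloor$, $y_u^R \leftarrow \lfloor \lambda_u^R / d_{\max} \rfloor$, $x_u^W \leftarrow \lfloor \phi_u^W / d_{\max} \rfloor$, $y_u^W \leftarrow \lfloor \lambda_u^W / d_{\max} \rfloor$
10:   Obtain $\hat{B}^R(x_u^R, y_u^R)$ by Eq. (5)
11:   Obtain $\hat{B}^W(x_u^W, y_u^W)$ by Eq. (5)
12:   $\hat{B}_u \leftarrow \hat{B}^R(x_u^R, y_u^R) \cap \hat{B}^W(x_u^W, y_u^W)$
13:   $E_u = \{ v : D(u, v) \le d_{\max},\ \mathrm{rank}_u(v) \le k_{\max},\ v \in \hat{B}_u \setminus \{u\} \}$
14:   $V \leftarrow V \cup \{u\} \cup E_u$
15:   $E \leftarrow E \cup \{(u, v), v \in E_u\}$
16: end for
17: Return the semantic location-based network $G(V, E)$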
In Eq. (7), the notations $R$ and $W$ denote the involved semantics (i.e., residence and workplace), and $\hat{B}^s = \hat{B}^s(x_u^s, y_u^s)$ denotes $u$’s neighbor candidates under semantic $s$. When people choose semantic locations, various factors have to be considered. For example, factors such as house prices, transportation and infrastructure are considered when choosing a residence. For simplicity, we make the following independence assumption.
Assumption 1. For $u$’s neighbor candidate $v$, semantic locations are distributed independently,
$$ P\left(v \in \hat{B}_u\right) = P\left(v \in \bigcap_{s \in \{R, W\}} \hat{B}^s\right) = \prod_{s \in \{R, W\}} P\left(v \in \hat{B}^s\right) \tag{8} $$
As a result, we can get this probability by setting a proper distribution for users’ semantic locations. It is commonly believed that both residences and workplaces tend to be located around functional centers. Thus, we employ a two-dimensional Gaussian mixture distribution for the locations of each semantic. Without loss of generality, the longitude and latitude of each location are assumed to be independently and identically distributed, as follows.
Assumption 2. Given a semantic $s$, the distribution of semantic locations satisfies the following equation.
$$ p\left(\phi^s, \lambda^s\right) = \sum_{k=1}^{C_s} \eta_k^s \exp\left[ -\frac{\left(\phi^s - \hat{\phi}_k^s\right)^2 + \left(\lambda^s - \hat{\lambda}_k^s\right)^2}{2\sigma_s^2} \right] \tag{9} $$
where $C_s$ denotes the number of functional centers, and $(\hat{\phi}_k^s, \hat{\lambda}_k^s)$ is the $k$th functional center.
Then we derive the following proposition.
Proposition 1. Given a user $u$, its group $\hat{B}_u$ under semantics $\{R, W\}$ satisfies
$$ P\left(v \in \hat{B}_u\right) \le C \prod_{s \in \{R, W\}} \left[ 1 - \exp\left( -\frac{d_{\max}^2}{2\sigma_s^2} \right) \right] \tag{10} $$
where $C$ is a constant unrelated to the number of users.
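As a hedged aside, the following short derivation sketches why a bound of the form in Eq. (10) keeps the expected candidate set small. It uses only the elementary inequality $1 - e^{-x} \le x$ for $x \ge 0$, and it reads the threshold suggested in the text as $d_{\max} = \min_{s} \sigma_s / \sqrt[4]{N}$, which is an interpretation of the source rather than a statement taken from it.

```latex
\begin{align*}
E\bigl[|\hat{B}_u|\bigr]
  &= \sum_{v \neq u} P\bigl(v \in \hat{B}_u\bigr)
   \le N\,C \prod_{s \in \{R, W\}} \Bigl[ 1 - \exp\Bigl( -\frac{d_{\max}^2}{2\sigma_s^2} \Bigr) \Bigr] \\
  &\le N\,C \prod_{s \in \{R, W\}} \frac{d_{\max}^2}{2\sigma_s^2}
   = \frac{N\,C\,d_{\max}^4}{4\,\sigma_R^2\,\sigma_W^2}. \\
\intertext{Setting $d_{\max} = \min_{s} \sigma_s / \sqrt[4]{N}$ gives
$d_{\max}^4 = \min_{s} \sigma_s^4 / N \le \sigma_R^2 \sigma_W^2 / N$, hence}
E\bigl[|\hat{B}_u|\bigr] &\le \frac{C}{4}.
\end{align*}
```

Under this reading, the expected number of candidates per user is bounded by a constant, so the group-wise ranking costs constant time per user and the whole construction stays linear in $N$.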
Fig. 4. The averaged percentage of users having at least 1, 3, 6 and 10 neighbors within a distance threshold from 100 to 2000 m.
Proof. Given the maximum distance $d_{\max}$, for a target user $u$, its group ID is $(x_u^R, y_u^R, x_u^W, y_u^W)$. Then, users in this group $\hat{B}_u$ are denoted as Eq. (11) shows.
$$ A_u = \bigcap_{s \in \{R, W\}} \left\{ v : \frac{\phi_v^s}{d_{\max}} - x_u^s,\ \frac{\lambda_v^s}{d_{\max}} - y_u^s \in [-1, 1] \right\} \tag{11} $$
Let $K = P\left(v \in \hat{B}_u\right)$ be the target probability. Based on Assumption 2, it is the same as the probability that a point falls in the corresponding region.
$$ K = \prod_{s \in \{R, W\}} \sum_{i \in [C_s]} \eta_i^s \iint_{A_u^s} \exp\left[ -\frac{\left(\phi^s - \hat{\phi}_i^s\right)^2 + \left(\lambda^s - \hat{\lambda}_i^s\right)^2}{2\sigma_s^2} \right] d\phi^s\, d\lambda^s \tag{12} $$
where the region $A_u^s$ is denoted as Eq. (13) shows.
$$ A_u^s = \left\{ v : \frac{\phi_v^s}{d_{\max}} - x_u^s \in [-1, 1],\ \frac{\lambda_v^s}{d_{\max}} - y_u^s \in [-1, 1] \right\} \tag{13} $$
After a series of calculations, we obtain the upper bound presented in Proposition 1. We conclude that the proposed method has linear complexity with respect to the number of users when a proper $d_{\max}$ is set (e.g., $\min_{s \in \{R, W\}} \sigma_s / \sqrt[4]{N}$).
5.4. Observations
In this section, we present two important observations from users’ smartphone selection in their semantic location-based network.
Observation 1. Most people have neighbors within a short semantic distance. More than 45.6% of people have at least one neighbor within 100 m, and the percentage rises to 86.9% within 1000 m (i.e., a 12-min walk).
According to Eq. (2), we obtain the semantic distance between any two users. Fig. 4 presents the percentage of users having at least 1, 3, 6, and 10 neighbors, respectively, within a distance threshold ranging from 100 to 2000 m. It shows that most users are not isolated geographically from others. More than 45.6% of users have one neighbor within 100 m. The percentage rises to 86.9% if the threshold is set to 1000 m (i.e., a 12-min walk). Within this walking distance, about 77.1% of users have at least 3 neighbors and 62.8% of users have at least 10 neighbors.
Observation 2. A user’s smartphone selection is similar to the choices of his/her semantic neighbors. The closer their semantic locations are, the more similar their smartphone choices are.
People with proximal semantic locations are assumed to have similar interests and tastes. To compare the similarities between users and their semantic neighbors in smartphone selection, we define the following network-based similarity. Given a user $u$, let $\Gamma_u = \{i : x_{ui} = 1\}$ be its interacted items (i.e., smartphones). Then, whether users $u$ and $v$ have used the same smartphones is denoted as $I(u, v)$, as Eq. (14) shows.
$$ I(u, v) = \begin{cases} 1, & \Gamma_u \cap \Gamma_v \neq \emptyset \\ 0, & \text{otherwise} \end{cases} \tag{14} $$
Fig. 5. The average similarity (AS) of users and their neighbors in smartphone selection within a distance threshold from 100 to 2000 m.
Given a network $G$, let $E_u$ be $u$'s semantic neighbors. As Fig. 4 shows, some users have no neighbors within the distance threshold; we exclude them when calculating the similarity. Let $V = \{u : |E_u| > 0\}$ be the users with at least one neighbor. The network-based similarity (AS) is defined in Eq. (15).
$$AS(G) = \frac{1}{|V|} \sum_{u \in V} \frac{1}{|E_u|} \sum_{u' \in E_u} I(u, u') \tag{15}$$
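As an illustration of Eqs. (14) and (15), the following minimal Python sketch computes the network-based similarity for a toy network. The function names and the dictionary-based data layout are assumptions made for this example; they are not taken from the original implementation.

```python
from typing import Dict, List, Set


def shares_item(items_u: Set[str], items_v: Set[str]) -> int:
    """Indicator I(u, v) of Eq. (14): 1 if the two users share any item, else 0."""
    return 1 if items_u & items_v else 0


def network_similarity(
    used_items: Dict[str, Set[str]],
    neighbors: Dict[str, List[str]],
) -> float:
    """Network-based similarity AS(G) of Eq. (15).

    Users without neighbors are skipped, which mirrors the set V in the text.
    The data layout (dicts keyed by user id) is a hypothetical choice.
    """
    per_user = []
    for u, nbrs in neighbors.items():
        if not nbrs:
            continue  # u has no neighbors, so u is not in V
        hits = sum(shares_item(used_items[u], used_items.get(v, set())) for v in nbrs)
        per_user.append(hits / len(nbrs))
    return sum(per_user) / len(per_user) if per_user else 0.0
```

The inner average is taken per user before the outer average over $V$, so users with many neighbors do not dominate the score.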
Fig. 5 presents the similarity between each user and at most 1, 3, 6, and 10 of his/her nearest neighbors, under distance thresholds from 100 to 2000 m. As a reference, for each user participating in the computation, we randomly sample a user from the dataset (regardless of distance) and compute their similarity in smartphone selection, labeled "Random". Fig. 5 shows that users' smartphone selection is similar to that of their neighbors. (1) The similarity increases with the threshold over small distances (e.g., from 100 to 800 m), mainly because a small threshold yields so few neighbors that random factors dominate. (2) For larger thresholds, it decreases slowly and stabilizes at 1500 m or larger, indicating that users with closer locations tend to have stronger similarities. (3) The smaller the maximum number of neighbors, the higher the average similarity. Given the distance threshold, the nearest users are assigned as neighbors, so a smaller neighbor set means closer neighbors. This confirms that users with nearby locations make similar smartphone selections. (4) The "Random" reference achieves a similarity score below 0.032, which is clearly lower than the scores based on semantic neighbors. With these observations, we confirm that users with nearby semantic locations tend to make similar smartphone choices, and the closer their locations are, the more similar their choices are. Consequently, our proposed method improves the recommender with side information from this implicit network.

6. Hybrid personalized recommender

In this section, we introduce our PHRS method in a practical scenario (i.e., the smartphone recommendation task based on usage records from operators).

6.1. Methodology

As Table 3 shows, our dataset contains $N$ users and $M$ items, and $u$ and $i$ denote a user and an item, respectively. Their interaction state is defined in Eq. (1), where $x_{ui} = 1$ denotes that $u$ has used $i$. Let $r_{ui}$ be the preference of user $u$ for item $i$. The probability that $u$ will use $i$ is denoted as $y_{ui} = 1/[1 + \exp(-r_{ui})]$. Consequently, the task is to make the predicted probabilities consistent with the observed interactions. Inspired by binary classification studies, we adopt the Logarithmic Loss [29] to measure this consistency, as Eq. (16) shows.
$$\sum_{u,i} \left[ x_{ui} \log y_{ui} + (1 - x_{ui}) \log (1 - y_{ui}) \right] \tag{16}$$
Missing entries in implicit feedback are a mixture of negative and unknown items. As mentioned in Section 1, we adopt the weighted consistency [2], as Eq. (17) shows.
$$\sum_{u=1}^{N} \sum_{i=1}^{M} w_{ui} \left[ x_{ui} \log y_{ui} + (1 - x_{ui}) \log (1 - y_{ui}) \right] \tag{17}$$
where the popularity-based weighting is defined in Eq. (18).
$$w_{ui} = x_{ui} + (1 - x_{ui})\, w_0 \frac{f_i^{\tau}}{\sum_{j \in [M]} f_j^{\tau}} \tag{18}$$
Here, $f_i = |\Gamma_i| / N$ denotes $i$'s popularity, $\Gamma_i = \{u : x_{ui} = 1\}$ is the set of users who have used $i$, and $w_0$, $\tau$ are weighting parameters. Our goal is to obtain a predictor that maximizes the consistency in Eq. (17). Similar to other effective recommenders, we adopt an MF-based method. We first map both users and items into a latent low-dimensional space of dimension $d \ll \min\{N, M\}$. Let $p_u, q_i \in \mathbb{R}^d$ be the latent feature vectors of user $u$ and item $i$, respectively. The preference is defined as $r_{ui} = p_u^T q_i$, and the probability is predicted as $y_{ui} = 1/[1 + \exp(-p_u^T q_i)]$. As described in Section 5.4, users with adjacent semantic locations are observed to have similar tastes and interests. Consequently, in this paper, we learn interpersonal similarities from an implicit network (i.e., the semantic location-based network). Given a semantic location-based network $G$, the neighbors of a target user $u$ are denoted as $E_u$. For a neighbor $v \in E_u$, let $s_{uv}$ be their location-based similarity, as Eq. (19) shows.
$$s_{uv} = \frac{\exp(-D(u, v))}{\sum_{v' \in E_u} \exp(-D(u, v'))} \tag{19}$$
where $D(u, v)$ denotes their semantic distance. With the hypothesis that similarities in the location-based space are consistent with those in the latent space, the users' latent factors can be regularized by $\sum_{u=1}^{N} \sum_{v \in E_u} (s_{uv} - p_u^T p_v)^2$. Finally, the recommendation problem reduces to learning the unknown factors $p_u, q_i$ for every user $u$ and item $i$ so as to minimize the following regularized error.
$$J = -\sum_{u=1}^{N} \sum_{i=1}^{M} w_{ui} \left[ x_{ui} \log y_{ui} + (1 - x_{ui}) \log (1 - y_{ui}) \right] + \frac{\alpha}{2} \sum_{u=1}^{N} \sum_{v \in E_u} \left( s_{uv} - p_u^T p_v \right)^2 + \frac{\beta}{2} \sum_{u=1}^{N} p_u^T p_u + \frac{\gamma}{2} \sum_{i=1}^{M} q_i^T q_i \tag{20}$$
where $\alpha, \beta, \gamma$ are parameters for regularization.

6.2. Convergence analysis and training process

Considering the large number of users involved in our experiments, we seek the optimal vectors by an Alternating Least Squares (ALS) method [1]. That is, starting from random initial latent factors, we optimize each of them alternately with the others fixed and repeat this stepwise process until convergence. The algorithm is guaranteed to converge for the following reasons.
1. The objective is continuous with respect to both latent feature factors $p_u$ and $q_i$.
2. The objective has a lower bound of 0.
3. The alternating gradient search reduces the objective monotonically.
Gradients of this objective are defined as follows.
$$\frac{\partial J}{\partial p_u} = \sum_{i=1}^{M} w_{ui} (y_{ui} - x_{ui})\, q_i + \alpha \sum_{v \in E_u} \left( p_u^T p_v - s_{uv} \right) p_v + \beta p_u, \qquad \frac{\partial J}{\partial q_i} = \sum_{u=1}^{N} w_{ui} (y_{ui} - x_{ui})\, p_u + \gamma q_i \tag{21}$$
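Before the optimization can run, the weights of Eq. (18) and the similarities of Eq. (19) have to be computed from the data. The NumPy sketch below shows one possible way to do so. The function names, the dense matrix layout, and the nested-dictionary format for distances are assumptions of this example; the defaults $w_0 = 1$ and $\tau = 0.5$ follow the settings reported in Section 7.1. The unit in which $D(u, v)$ enters Eq. (19) is not restated here, so the distances are treated as already scaled.

```python
import numpy as np


def popularity_weights(X, w0=1.0, tau=0.5):
    """Weights w_ui of Eq. (18) for a binary interaction matrix X of shape (N, M).

    Assumes at least one observed interaction, so the normalizer is non-zero.
    """
    X = np.asarray(X, dtype=float)
    f = X.sum(axis=0) / X.shape[0]        # popularity f_i = |Gamma_i| / N
    c = w0 * f**tau / np.sum(f**tau)      # weight given to a missing entry of item i
    return X + (1.0 - X) * c[np.newaxis, :]


def location_similarities(nbr_dist):
    """Similarities s_uv of Eq. (19).

    nbr_dist maps each user u to {v: D(u, v)} over its neighbors E_u
    (a hypothetical input format). Returns {(u, v): s_uv}.
    """
    S = {}
    for u, dists in nbr_dist.items():
        if not dists:
            continue  # users without neighbors contribute no similarity terms
        d_min = min(dists.values())
        # Subtracting d_min leaves the ratio unchanged and avoids underflow.
        expd = {v: np.exp(-(d - d_min)) for v, d in dists.items()}
        total = sum(expd.values())
        for v, e in expd.items():
            S[(u, v)] = float(e / total)
    return S
```

Observed entries always receive weight 1, while a missing entry of a popular item receives a larger weight than a missing entry of an unpopular one. The similarities of each user sum to 1 over its neighbors.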
Following the training process in Algorithm 2, the objective decreases along the steepest-descent direction and finally converges to the desired minimum. After learning the optimized latent factors for both users and items, we can make personalized recommendations for each user in the following steps. Given a user $u$, (1) predict the preference score of every item $i$ by $r_{ui} = p_u^T q_i$, (2) sort the items by their predicted preference scores, and (3) recommend the top unused items to $u$.
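The three recommendation steps can be sketched as follows. This is a minimal illustration that assumes the learned item factors are stored as a dense matrix `Q` of shape (M, d) and that used items are given as a set of column indices; both are choices made for this example.

```python
import numpy as np


def recommend_top_k(p_u, Q, used, k=10):
    """Rank items by r_ui = p_u^T q_i and return the indices of the top-k unused items."""
    scores = Q @ p_u                             # step (1): preference score of every item
    order = np.argsort(-scores, kind="stable")   # step (2): sort by descending score
    # step (3): drop items the user has already used, then keep the first k
    return [int(i) for i in order if int(i) not in used][:k]
```

Filtering after sorting keeps the ranking of the remaining items intact, and a user with fewer than `k` unused items simply receives a shorter list.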
Algorithm 2 The steps for the hybrid model combining popularity and semantic locations
Input: parameters d, α, λ, µ, learning rate η, and convergence threshold ϵ
Output: optimized factors {p_u}, {q_i}
Initialize factors p_u^0, q_i^0 for every user and item
Set iteration step t = 0
Determine the initial error J^0 by Eq. (20)
repeat
    t ← t + 1
    for u ← 1, 2, ..., N do
        Compute the gradient ∂J/∂p_u based on Eq. (21)
        p_u^t ← p_u^{t−1} − η ∂J/∂p_u
    end for
    for i ← 1, 2, ..., M do
        Compute the gradient ∂J/∂q_i based on Eq. (21)
        q_i^t ← q_i^{t−1} − η ∂J/∂q_i
    end for
    Determine J^t by Eq. (20)
until |J^t − J^{t−1}| ≤ ϵ
Return the optimized vectors
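The training loop of Algorithm 2 can be sketched in NumPy as below. This is an illustrative sketch under several assumptions, not the authors' implementation: the data are held in dense matrices, `nbrs` and `S` are hypothetical dictionaries holding $E_u$ and $s_{uv}$, all user vectors are updated in one batch from the previous iterate, the step is taken against the gradient because $J$ is minimized, and the default hyperparameter values are chosen only for the toy scale of the example. The regularization weights are named after Eq. (20).

```python
import numpy as np


def sigmoid(r):
    return 1.0 / (1.0 + np.exp(-r))


def objective(P, Q, X, W, nbrs, S, alpha, beta, gamma):
    """Regularized error J of Eq. (20)."""
    Y = np.clip(sigmoid(P @ Q.T), 1e-12, 1 - 1e-12)   # clip to keep the logs finite
    loss = -np.sum(W * (X * np.log(Y) + (1 - X) * np.log(1 - Y)))
    net = sum((S[u, v] - P[u] @ P[v]) ** 2 for u in nbrs for v in nbrs[u])
    return (loss + 0.5 * alpha * net
            + 0.5 * beta * np.sum(P * P) + 0.5 * gamma * np.sum(Q * Q))


def train(X, W, nbrs, S, d=2, alpha=0.1, beta=0.01, gamma=0.01,
          lr=0.1, eps=1e-6, max_iter=500, seed=0):
    """Alternating gradient steps on user and item factors, following Eq. (21)."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    P = 0.1 * rng.standard_normal((N, d))
    Q = 0.1 * rng.standard_normal((M, d))
    history = [objective(P, Q, X, W, nbrs, S, alpha, beta, gamma)]
    for _ in range(max_iter):                  # max_iter is a safety cap (assumption)
        # User step: dJ/dp_u for all users, with item factors fixed.
        E = W * (sigmoid(P @ Q.T) - X)         # w_ui * (y_ui - x_ui), shape (N, M)
        grad_P = E @ Q + beta * P
        for u in nbrs:
            for v in nbrs[u]:
                grad_P[u] += alpha * (P[u] @ P[v] - S[u, v]) * P[v]
        P = P - lr * grad_P
        # Item step: dJ/dq_i for all items, with the updated user factors fixed.
        E = W * (sigmoid(P @ Q.T) - X)
        grad_Q = E.T @ P + gamma * Q
        Q = Q - lr * grad_Q
        history.append(objective(P, Q, X, W, nbrs, S, alpha, beta, gamma))
        if abs(history[-1] - history[-2]) <= eps:
            break
    return P, Q, history
```

The returned `history` holds $J^0, J^1, \ldots$ and can be inspected to check that the objective decreases for the chosen learning rate.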
Table 4
The distributions of train and test dataset.

Dataset         Train set    Test set
Users           243,418      12,322
Smartphones     872          828
Interactions    295,476      14,066
7. Experiments

In this section, we conduct a comprehensive evaluation in a real-world scenario and compare our PHRS method with state-of-the-art baselines.

7.1. Experimental settings

Dataset. As shown in Section 3, we evaluate performance on smartphone usage records. We first divide the dataset into a training set and a testing set. Since our purpose is to recommend unused smartphones to customers based on historical records, we split the data by recording time. Similar to [2], we sort the used smartphones of every user in chronological order. Interactions in the first 64 days are used for training, and those in the last 15 days are used for testing. Only users who obtained new smartphones in the last 15 days are included in the test set. Table 4 shows the resulting distribution.

Metrics. We adopt two popular rank-based metrics, namely Recall and Mean Reciprocal Rank (MRR). A larger Recall or MRR means better prediction accuracy. Let $T$ denote the set of testing users. Given a recommendation size $k$, for a user $u \in T$, let $P_u^k$ be the recommended list and $T_u$ be the smartphones actually used. The metrics are defined as follows.
$$\text{Recall@}k = \frac{1}{|T|} \sum_{u \in T} \frac{|I_u^k|}{|T_u|} \tag{22}$$
$$\text{MRR@}k = \frac{1}{|T|} \sum_{u \in T} \frac{1}{|T_u|} \sum_{i \in I_u^k} \frac{1}{\text{rank}_u(i)} \tag{23}$$
where $I_u^k = P_u^k \cap T_u$ denotes the matched recommended smartphones and $\text{rank}_u(i)$ denotes $i$'s rank in the recommended list. For users who have changed smartphones multiple times, the metrics are evaluated with rank-based weighting for each smartphone.

Baselines. We first compare the PopMF method with the following well-known methods.
1. ItemKNN: It computes the similarities between items based on their common users, then recommends items similar to those the user has already used.
Table 5
The properties of semantic location-based network.

Properties    Vertices    Edges        Intensity
The number    220,735     1,811,508    8.20
Table 6
The settings of regularization parameters.

Parameter    α            λ            µ
Selected     $10^{-3}$    $10^{-3}$    $10^{-2}$
Fig. 6. The performance (log-loss) of models for different ranks from 5 to 30.
2. BasicMF: It directly regards the observed usage as positive feedback and applies the MF method to positive feedback only.
3. SampledMF: It randomly samples missing items for users as negative feedback and applies the MF method to obtain a recommender.
4. FreqMF: It uses the frequency of users' interactions with items as explicit feedback and applies the MF method to it.
5. BPR: It assumes that users prefer their used items to unused ones, then applies a pair-wise MF method to this ranking information.
6. PopMF: It assumes that popular items left unused are more likely to be true negatives, and weights the missing feedback by item popularity.

Parameter Setting. Locations are recorded by cell towers and each cell tower covers an area with a radius of 500 m, as described in Section 3. Thus, we empirically set the maximum distance $d_{\max}$ to 500 m. We set the maximum number of neighbors to 10 by default and study its impact in the following section. Table 5 summarizes the implicit network under these settings. Without loss of generality, the popularity-based parameters are set to $w_0 = 1$, $\tau = 0.5$ [2]. Parameters for regularization are set based on physical meanings (as Table 6 shows). Lastly, we consider the dimensionality $d$, which has been shown to be important for recommendation. We perform a 10-fold cross-validation on the training set. Fig. 6 shows the average objective for different $d$. As a trade-off between performance and complexity, we set the dimensionality to $d = 26$.

7.2. Experimental results

Table 7 shows the above metrics (i.e., Recall and MRR) in a top-10 recommendation setting for each model. PopMF shows the best performance among these models in terms of both Recall and MRR. BPR performs similarly to PopMF, consistent with the findings of [2]. SampledMF performs worse than these two models but better than the others. Given the sparsity of our dataset, FreqMF improves on BasicMF but performs worse than the above models. BasicMF performs worse than the improved MF variants, but better than ItemKNN, which performs the worst on our dataset. Considering the trade-off between complexity and performance, our hybrid recommender extends the PopMF method with side information from the semantic location-based network. To further illustrate the role of this implicit network, we build an explicit network based on customers' call logs as their social network. In this network, both callers and listeners correspond to vertices and their relations are employed as edges (as Table 8 shows).
Table 7
The performance (MRR and Recall) for models recommending top-10 smartphones.

Model        Recall@10    MRR@10
ItemKNN      0.166215     0.063321
BasicMF      0.206615     0.090975
SampledMF    0.313417     0.130230
FreqMF       0.233928     0.100315
BPR          0.334303     0.133474
PopMF        0.334877     0.135553
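For reference, Recall@k and MRR@k as defined in Eqs. (22) and (23) can be computed as in the following sketch. The function name and the dictionary-based inputs are assumptions of this example, and the additional rank-based weighting for users with several new smartphones is not modeled.

```python
from typing import Dict, List, Set, Tuple


def recall_mrr_at_k(
    recommended: Dict[str, List[str]],
    truth: Dict[str, Set[str]],
    k: int = 10,
) -> Tuple[float, float]:
    """Recall@k (Eq. 22) and MRR@k (Eq. 23), averaged over users with test items."""
    recall_sum, mrr_sum, n_users = 0.0, 0.0, 0
    for u, true_items in truth.items():
        if not true_items:
            continue  # only users with at least one test item belong to T
        top_k = recommended.get(u, [])[:k]
        # rank is the 1-based position of a matched item in the recommended list
        hit_ranks = [rank for rank, i in enumerate(top_k, start=1) if i in true_items]
        recall_sum += len(hit_ranks) / len(true_items)
        mrr_sum += sum(1.0 / rank for rank in hit_ranks) / len(true_items)
        n_users += 1
    if n_users == 0:
        return 0.0, 0.0
    return recall_sum / n_users, mrr_sum / n_users
```

Both metrics are normalized by $|T_u|$ per user before averaging, so a user with a single new smartphone counts as much as a user with several.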
Table 8
The properties of the call-log based network.

Properties    Vertices    Edges      Intensity
The number    249,120     318,222    1.27
Fig. 7. The performance of hybrid models when recommending from 1 to 10 smartphones, Recall (left) and MRR (right).
Accordingly, we compare the performance of the following four models on personalized smartphone recommendation: (1) BPR without networks, (2) PopMF without networks, (3) PopMF with a call-log based network, and (4) our proposed method (i.e., PopMF with a semantic location-based network). Figs. 7a and 7b plot the performance of these recommenders for recommendation sizes from 1 to 10. They show that introducing this implicit network enhances recommendation. Our proposed model (PHRS) outperforms all the other models in terms of both Recall and MRR. It consistently achieves more than 13% improvement over the models without networks, and about 5% over the model with an approximate explicit network (i.e., the call-log based network). The hybrid model with the call-log based network performs worse than our model, but better than the models without networks. At the same time, the two methods that do not introduce networks perform similarly [2], and both perform worse than the models that use side information from networks. These observations indicate that (1) recommendation performance can be improved by introducing side information extracted from networks, (2) performance is degraded when the network is sparse, and (3) the semantic location-based network is important for improving recommendation performance, especially in real-life scenarios without any explicit networks. Last but not least, we conclude that people with nearby semantic locations tend to have similar tastes and interests, and that personalized recommendation can be improved by considering side information from the semantic location-based network.

7.3. Impact of neighborhood size

The distance threshold is determined by the coverage radius of cell towers. Therefore, users' semantic neighbors are determined only by the maximum number of neighbors, namely $k_{\max}$. In this set of experiments, we evaluate its impact on recommendation performance. Figs. 8a and 8b present Recall@10 and MRR@10 of the proposed method, respectively, with the maximum number of neighbors varying from 1 to 15. From these figures, we can see that when the distance threshold is set to 500 m, using the top-10 nearest neighbors achieves the best performance. Therefore, we set $k_{\max} = 10$ in our model.
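The interplay of the two neighborhood parameters can be illustrated with a brute-force selection for a single user. This sketch is only meant to show the roles of $d_{\max}$ and $k_{\max}$; it is not the hierarchical HND method used to build the network at scale, and the function name, the input format, and the inclusive threshold are assumptions.

```python
from typing import Dict, List


def select_neighbors(
    distances: Dict[str, float],
    d_max: float = 500.0,
    k_max: int = 10,
) -> List[str]:
    """Keep at most k_max nearest users whose semantic distance (in metres) is within d_max."""
    within = [(d, v) for v, d in distances.items() if d <= d_max]
    within.sort()                       # nearest first; ties broken by user id
    return [v for _, v in within[:k_max]]
```

With $d_{\max}$ fixed by the cell tower radius, only `k_max` remains to be tuned, which is the parameter varied in Fig. 8.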
Fig. 8. The performance on users’ smartphone recommendation with neighborhood size, Recall@10 (left) and MRR@10 (right).
8. Conclusion and future work

Friends tend to have similar tastes and interests. Based on this sociological finding, researchers have focused on enhancing recommendation using side information from users' social networks. However, these explicit networks are often difficult to acquire in real-life scenarios. To address this, implicit networks based on users' behaviors are employed as alternatives. In this paper, we propose the PopMF-based Hybrid Recommender System (PHRS), which combines a natural implicit network (i.e., the semantic location-based network) with item popularity. In detail, we first determine users' semantic locations from historical records. Then, a Hierarchy Neighbors Discovery (HND) method is proposed to build the semantic location-based network in a large-scale setting. Finally, we reframe PopMF with side information extracted from users' implicit network and evaluate performance in a real-world task (i.e., smartphone recommendation based on operator records). In comprehensive experiments, our model with the semantic location-based network consistently outperforms the state-of-the-art methods (13% coverage improvement over network-independent models and about 5% higher than the model with a call-log based network). The results also show that introducing an implicit network based on users' semantic locations can improve personalized recommendation. Our future work is to extend our algorithm in three aspects: (1) apply our semantic location-based network to other recommendation systems; (2) consider other efficient solutions for generating users' interpersonal networks; and (3) extend our model to solve the cold-start problem.

Acknowledgment

This work was supported by the welfare technology research project of Zhejiang Province (No. LGG18F010003).

References

[1] R. Bell, Y. Koren, C. Volinsky, Matrix factorization techniques for recommender systems, Computer 42 (2009) 30–37.
[2] X. He, H. Zhang, M.-Y. Kan, T.-S. Chua, Fast matrix factorization for online recommendation with implicit feedback, in: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '16, ACM, New York, NY, USA, 2016, pp. 549–558.
[3] Y. Zhou, D. Wilkinson, R. Schreiber, R. Pan, Large-scale parallel collaborative filtering for the netflix prize, in: R. Fleischer, J. Xu (Eds.), Algorithmic Aspects in Information and Management, Springer Berlin Heidelberg, Berlin, Heidelberg, 2008, pp. 337–348.
[4] G. Jawaheer, M. Szomszor, P. Kostkova, Comparison of implicit and explicit feedback from an online music recommendation service, in: Proceedings of the 1st International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec '10, ACM, New York, NY, USA, 2010, pp. 47–51.
[5] K. Choi, D. Yoo, G. Kim, Y. Suh, A hybrid online-product recommendation system: Combining implicit rating-based collaborative filtering and sequential pattern analysis, Electron. Commer. Res. Appl. 11 (4) (2012) 309–317.
[6] P. Cui, F. Wang, S. Yang, L. Sun, Item-level social influence prediction with probabilistic hybrid factor matrix factorization, AAAI Conference on Artificial Intelligence, 2011.
[7] X. Yang, Y. Guo, Y. Liu, H. Steck, A survey of collaborative filtering based social recommender systems, Comput. Commun. 41 (2014) 1–10.
[8] M. Jiang, P. Cui, F. Wang, W. Zhu, S. Yang, Scalable recommendation with social contextual information, IEEE Trans. Knowl. Data Eng. 26 (11) (2014) 2789–2802.
[9] X. Qian, H. Feng, G. Zhao, T. Mei, Personalized recommendation combining user interest and social circle, IEEE Trans. Knowl. Data Eng. 26 (7) (2014) 1763–1777.
[10] H. Ma, An experimental study on implicit social recommendation, in: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '13, ACM, New York, NY, USA, 2013, pp. 73–82.
[11] M. Dash, H.L. Nguyen, C. Hong, G.E. Yap, M.N. Nguyen, X. Li, S.P. Krishnaswamy, J. Decraene, S. Antonatos, Y. Wang, et al., Home and work place prediction for urban planning using mobile network data, in: 2014 IEEE 15th International Conference on Mobile Data Management, Vol. 2, IEEE, 2014, pp. 37–42.
[12] H. Chen, Y. Yu, B. Ma, B. Yen, Utilizing geospatial information in cellular data usage for key location prediction, in: The 51st Hawaii International Conference on System Sciences, HICSS'2018, 2018.
[13] S. Isaacman, R. Becker, R. Cáceres, S. Kobourov, M. Martonosi, J. Rowland, A. Varshavsky, Identifying important places in people's lives from cellular network data, in: K. Lyons, J. Hightower, E.M. Huang (Eds.), Pervasive Computing, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 133–151.
[14] L. Hu, A. Sun, Y. Liu, Your neighbors affect your ratings: On geographical neighborhood influence to rating prediction, in: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '14, ACM, New York, NY, USA, 2014, pp. 345–354.
[15] S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme, Bpr: Bayesian personalized ranking from implicit feedback, in: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2009, pp. 452–461.
[16] H. Ma, H. Yang, M.R. Lyu, I. King, Sorec: Social recommendation using probabilistic matrix factorization, in: Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM '08, ACM, New York, NY, USA, 2008, pp. 931–940.
[17] L.-J. Chen, J. Gao, A trust-based recommendation method using network diffusion processes, Physica A 506 (2018) 679–691.
[18] M. Jiang, P. Cui, R. Liu, Q. Yang, F. Wang, W. Zhu, S. Yang, Social contextual recommendation, in: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, ACM, New York, NY, USA, 2012, pp. 45–54.
[19] J. Tang, S. Wang, X. Hu, D. Yin, Y. Bi, Y. Chang, H. Liu, Recommendation with social dimensions, in: 30th AAAI Conference on Artificial Intelligence, AAAI 2016, AAAI press, 2016, pp. 251–257.
[20] G. Zhao, X. Qian, X. Xie, User-service rating prediction by exploring social users' rating behaviors, IEEE Trans. Multimed. 18 (3) (2016) 496–506.
[21] G. Zhao, X. Qian, C. Kang, Service rating prediction by exploring social mobile users' geographical locations, IEEE Trans. Big Data 3 (1) (2016) 67–78.
[22] J.J. Levandoski, M. Sarwat, A. Eldawy, M.F. Mokbel, Lars: A location-aware recommender system, in: 2012 IEEE 28th International Conference on Data Engineering, 2012, pp. 450–461.
[23] D. Yu, Y. Li, F. Xu, P. Zhang, V. Kostakos, Smartphone app usage prediction using points of interest, Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1 (4) (2018) 174:1–174:21.
[24] D. Falcone, C. Mascolo, C. Comito, D. Talia, J. Crowcroft, What is this place? inferring place categories through user patterns identification in geo-tagged tweets, in: 6th International Conference on Mobile Computing, Applications and Services, 2014, pp. 10–19.
[25] R. Montoliu, A. Martínez-Uso, J. Martínez-Sotoca, J. McInerney, Semantic place prediction by combining smart binary classifiers, in: Nokia Mobile Data Challenge 2012 Workshop. p. Dedicated Task, Vol. 1, 2012.
[26] R. Montoliu, J. Blom, D. Gatica-Perez, Discovering places of interest in everyday life from smartphone data, Multimedia Tools Appl. 62 (1) (2013) 179–207.
[27] Y. Jiang, X. Du, T. Jin, Using combined network information to predict mobile application usage, Physica A 515 (2019) 430–439.
[28] R.W. Sinnott, Virtues of the haversine, Sky Telesc. 68 (1984) 159.
[29] P.-T. De Boer, D.P. Kroese, S. Mannor, R.Y. Rubinstein, A tutorial on the cross-entropy method, Ann. Oper. Res. 134 (1) (2005) 19–67.