Will this session end with a purchase? Inferring current purchase intent of anonymous visitors


PII: S1567-4223(19)30013-4
DOI: https://doi.org/10.1016/j.elerap.2019.100836
Article number: 100836

To appear in: Electronic Commerce Research and Applications
Received 4 October 2018; revised 3 February 2019; accepted 9 February 2019


Osnat Mokryn (a,b), Veronika Bogina (c), Tsvi Kuflik (c)

(a) Department of Information and Knowledge Management, University of Haifa, Israel
(b) Corresponding author. Email: [email protected]
(c) Department of Information Systems, University of Haifa, Israel

Abstract

Understanding the online behavior and intent of online visitors is the subject of a long line of research. Compared to profiled returning customers, whose history is known, anonymous visitors garner less attention. The lack of a known shopping history or known interests makes it hard to learn from their behavior or to infer their shopping intent. Here, we suggest using products' popularity trends and the visit's temporal information to infer the purchase intent of anonymous visitors. We model these dynamics and use the model to infer the purchase intent of visitors to two large, real e-commerce retail sites. Our model identifies online signals of purchase intent that can be used for online purchase prediction.

Keywords: Purchase intent; Anonymous visitors; Session dynamics; Product trendiness; Temporal session information.


1. Introduction

E-commerce has become a prevalent method of shopping, with over 79% of Americans visiting e-commerce sites, as reported by Pew Research Center (2016). Yet only a tiny fraction of these visits, in the range of 2 to 5 percent, end with a purchase (Pew Research Center, 2016; Center for Retail Research, 2017; McDowell et al., 2016). This fraction is called the site's purchase conversion rate. Considering that current online retail is estimated at more than 460 billion dollars, according to Forrester Research (2017), even a small improvement in a site's purchase conversion rate will yield a significant increase in revenue (Sismeiro and Bucklin, 2004).

Understanding online behavior is the subject of a long line of research in marketing, aimed at gaining insight into shoppers' decision processes and applying this understanding to marketing, improving the visitors' shopping experience, and increasing sales. Most previous research explored the online behavior of returning shoppers. The online behavior and purchase intent of anonymous visitors has been researched to a lesser extent, as such visitors cannot be profiled (Suh et al., 2004; Schäfer and Kummer, 2013; Baumann et al., 2018). Gaining insight into anonymous visitors and their navigation and decision processes is crucial for applying marketing strategies, recommender systems, and personal automated shoppers (Kim and Yum, 2011). To that end, understanding the current purchase intent of shoppers is fundamental. Moreover, being able to identify whether a session will end without a purchase can help sellers take actions that better address shoppers' needs and preferences and increase the conversion rate (Kim et al., 2003; McDowell et al., 2016).

Lately, as a way to enrich the limited information about online shoppers, Pinterest Research published a large-scale study investigating how the cross-platform activity of users may build up over time into a purchase intent (Lo et al., 2016). Still, learning from previous visits can only help predict the purchase intent of returning customers who are identified. It cannot help in the case of anonymous returning shoppers, first-time visitors, or occasional anonymous shoppers. Occasional online shopping is known to be prevalent and accounts for almost half of online purchases (Pew Research Center, 2016).

Understanding and correctly classifying the current purchase intent of online visitors who have no prior history is essential for a site's ability to target these visitors with accurate online help. Anonymous visitors, even if they are returning, might be turned into shoppers with the right recommendations or aid. Classifying the purchase intent of anonymous visitors might also help in understanding the factors that lead to an impulse purchase by visitors who have no history of acquisitions on the site (Cobb and Hoyer, 1986; Chan et al., 2017). We refer to an unidentified visitor as anonymous, as there is no known shopping history for this visitor on the site.

Inferring the purchase intent of visitors with no historical information is challenging, as there is no data about the visitor from which to build a profile. When the purchase intent of anonymous visitors has been considered, studies have hitherto assumed a fixed baseline value (Sismeiro and Bucklin, 2004; Park and Park, 2016; Bhatnagar et al., 2016) or a per-domain baseline value of purchase intent (Suh et al., 2004; Moe and Fader, 2004; Ding et al., 2015; Panagiotelis et al., 2014).

We investigate here how to determine the purchase intent of anonymous visitors from their current session information, and how to use it to predict whether the session will end with a purchase. We suggest considering the clickstream of visitors and checking the recent popularity trend of the products they explore (i.e., click on) as a means of understanding their purchase intent. This trend of product popularity can be thought of as the site's local view of the overall perceived product popularity and success. Product popularity is known to affect product success (Hanson and Putler, 1996; Salganik et al., 2006; Cai et al., 2009; Tucker and Zhang, 2011), but has been found to trend with time (Tsymbal, 2004), even on a scale of days (Srinivasan and Mekala, 2014). To conform to this locality in time, we check only the recent trajectory (i.e., trend) of the popularity of products ("product trendiness"). We are also interested in the temporal aspects of the visit, i.e., on which day of the week and in which period of the year the Web site is visited. Lo et al. (2016), using a Pinterest dataset, found that these are predictive features for personal purchase intent. We extend this aspect to see whether temporal aspects have a general effect on the shopping intent of anonymous visitors. We also take into account the length of the anonymous visitor's session (in number of clicks) as a feature and discuss how to use it for prediction.

We build an elaborate purchase intent modeling algorithm for anonymous visitors, utilizing the session's temporal characteristics, the number of clicks, and product trendiness. We present an analysis of two large datasets from real and active e-commerce sites, each containing tens of millions of sessions of visiting users collected during the period 2012-2014. We evaluate the performance of our inference algorithm using an ensemble of classifiers (Ricci et al., 2015) over each of the datasets. Given that the sessions' temporal information differs between the sites, the inference algorithm is adapted to account for this difference. We further show that our results are competitive against a state-of-the-art deep learning technique that mines recurrent patterns utilizing neural networks. A general framework that does not take site-specific temporal details into account is presented and evaluated as well.

Our good inference results for the purchase intent of anonymous visitors over both datasets support our hypotheses. In addition to purchase intent inference, the method is applicable to cold-start situations. Employing our method would enable websites to replace the initial fixed baseline value of purchase intent for anonymous visitors with a personalized purchase intent value. Our approach is also applicable for bootstrapping new e-commerce sites after a few days, where no prior information is available on any of the customers or products, giving a dynamically calculated intent baseline during the session.


2. Related Work

A low conversion rate in online shopping is a widely recognized problem for e-commerce sites (Moe and Fader, 2004; Venkatesh and Agarwal, 2006; Panagiotelis et al., 2014). The online environment introduces inherent barriers to purchase, such as the shoppers' inclination to accept the involved technology, their perception of the site, and their ability to trust it (Deng and Poole, 2010; Gefen et al., 2003; Pavlou and Fygenson, 2006; Zhou et al., 2007; Amaro and Duarte, 2015). On the other hand, it offers a plethora of information, and thus shoppers may browse many sites before making a decision (Pavlou and Fygenson, 2006). Senecal et al. (2005) show that shoppers tend to search more online than they do offline and engage in rather complex browsing behavior. Further, browsing is done not only for goal-oriented reasons such as purchasing, but also for experiential reasons, e.g., the shopping experience itself (Wolfinbarger and Gilly, 2001; Scarpi et al., 2014). Assessing visitors' purchase intentions correctly can help sites convert visitors into shoppers by showing them recommendations or suggesting assistance (Lu et al., 2015; Chen et al., 2017).

2.1. Predicting purchase intent utilizing clickstream data

Tracking and modeling behaviors over recurrent visits have been the focus of much research. Shoppers, in search of information, browse and compare prices across different sites. Their final choice of the site to purchase from depends, among other things, on the stickiness of the site (Venkatesh and Agarwal, 2006; Wolfinbarger and Gilly, 2001; Panagiotelis et al., 2014). Popular (sticky) sites can then log their visitors' online browsing sequences (clickstream data) over time, as well as additional information, to gain knowledge of their online behavior and predict their purchase intent. Clickstream data gives sites a competitive edge and enables researchers to understand and model online actions and to use them for prediction (Bucklin et al., 2002; Bucklin and Sismeiro, 2009; Olbrich and Holsing, 2011; Lo et al., 2016).

Moe (2003) identifies different behavior types of online shoppers from their clickstream data, creating a typology of shopping strategies. Intent is built over time when shoppers browse or search for information while planning a future purchase. This type of search is either directed or exploratory. Immediate purchases occur when a shopper visits a site with the intent to make a purchase. Hedonic browsing, derived from the shopping experience itself, may also result in an immediate purchase. Moe and Fader (2004) model individual conversion dynamics for visitors based on the above typology. They assume a baseline intent that is gamma distributed and model the personal development of the intent over time to predict a purchase. Montgomery et al. (2004) model navigation patterns of recurring visits as a Markov chain and find that six prior visits are enough to learn and predict the personal intent of recurrent shoppers. Other models include modeling of completed tasks (Sismeiro and Bucklin, 2004; Su and Chen, 2015) and graph mining over navigation between tasks (Kalczynski et al., 2006).


Recent models consider dynamic behavioral patterns that we also take into account. Park and Park (2016) model the user's visit dynamics over time as clusters, where close-by visits are clustered and between clusters are periods of no visits. They find that conversion rates are higher at later visits within a cluster than at earlier ones. Modeling these temporal dynamics for visitors, they predict purchase intent using a Bayesian learning model. Baumann et al. (2018) follow visitors' navigation paths, modeling each session as a navigation graph that is updated with every click, and use graph metrics to determine purchase intentions. Bhatnagar et al. (2016) find that the length of the first visit predicts the possibility of subsequent visits, and model online behavior utilizing visit dynamics and navigation information. Like us, they also consider temporal information, such as the day of the week and the time of day. They determine a time window in which the visitor is more likely to purchase; however, their model depends on personalizing the visitors, and therefore cannot be applied to the case of anonymous shoppers.

More similar to our approach is the use of machine learning on clickstream data to predict individual intent. Common in this case is the use of the dynamics of visits, e.g., the frequency of visits, the time since the last visit, and in-session dynamics (such as dwell time, the time spent viewing a page). Findings show that the rate of visits and the dwell time increase when the visitor is close to the purchase. Each visitor is then characterized by these dynamics, as well as by their current session's dynamics and any additional available information (such as demographics, detailed clickstream and product information, purchase history, social interactions and influence, and more). Conversion prediction is then made for either the current or the next visit (Van den Poel and Buckinx, 2005; Bucklin and Sismeiro, 2009; Lukose et al., 2008; Su and Chen, 2015; Lo et al., 2016; Kooti et al., 2016; Raphaeli et al., 2017). We consider the case of anonymous visitors who have no prior history on the site and whose social network we do not know.

When no information on the visitor exists, as is the case with first-time or anonymous visitors, understanding their current, real-time intent is challenging. Polites et al. (2018) find a misalignment between online shoppers' initial intention and the outcome of their online visit: some of those stating they are in the browsing phase and do not intend to buy end the visit with a purchase, while others, starting with the intention to buy, do not. Purchasing without prior intent, or impulse purchasing, accounts for many online transactions (Chan et al., 2017). While some visitors have a predisposition to purchase, others might be inclined or driven to impulse purchases. Research on website stimuli that may trigger an impulse purchase considers the site's visibility and the cues embedded in it, such as promotions, persuasive aids, etc. (Jeffrey and Hodge, 2007; Wells et al., 2011; Chan et al., 2017). In our work we do not consider the initial intent of the visitor, nor can we assume a predisposition to purchase or impulsive shopping.


2.2. Products' temporal dynamics and trends

Product popularity information is known to affect a product's success, and people tend to consume more products perceived as popular (Hanson and Putler, 1996; Salganik et al., 2006; Cai et al., 2009; Tucker and Zhang, 2011). The temporal nature of popularity, though, is unstable. In many scenarios in our lives (TV program consumption, product purchases, tweet topics, and so on) our interests change with time, in a process known as concept drift (Widmer and Kubat, 1996; Tsymbal, 2004; Krawczyk et al., 2017). Concept drift can be sudden, or gradual, changing slowly with time. Systems for handling concept drift differ according to the type of change they handle (Tsymbal, 2004): (1) instance selection, where the goal is to select instances that are relevant to the current time window; (2) instance weighting, where instances are weighted based on their estimated relevance; and (3) ensemble learning, handling a family of predictors that are weighted according to their relevance to the present time. Among the three, the first is the most relevant to our research.

Concept drift is naturally linked to temporal trends. Temporal trends have been shown to govern general interests (Mokryn et al., 2016), and are studied not only in recommender systems (Choi and Varian, 2012; Dias and Fonseca, 2013; Koren, 2010; Lathia et al., 2010; Srinivasan and Mekala, 2014) but also in general Web search. Google Trends[1] uses the time series index of the volume of submitted queries. For example, the volume of queries on a particular brand of watch during the second week of May might be helpful in predicting June sales for that brand. Choi and Varian (2012) use Google Trends to demonstrate that Google queries help predict economic activity. Indeed, recent work shows that product popularity is trending in nature and changes with time: Srinivasan and Mekala (2014) find that the trendiness of products changes with time, week by week or even day by day, due to user interest change, demand shifts ignited by external events, or simply a product being out of inventory. Item popularity and user ratings over time are studied by Koren (2010) in a different context, showing that temporal dynamics can affect user preferences. We further examine whether these temporal dynamics, as well as the trendiness of the viewed products, are predictors of the user's current purchase intent.

2.3. Session-based recommenders

The study of session-based recommenders is a growing trend, especially in the music domain, as described next. Park et al. (2011) coined the term Session-based Collaborative Filtering (SSCF) and present a modified user-based CF that relies on session information capturing sequence patterns and repetitiveness in the users' listening process. Their goal is to predict which song will be played next given past sessions. When a song is played, an event describing the user and the song (item), with the corresponding time stamp, is created.

[1] https://trends.google.com/trends/


A session is defined as a sequence of per-user events within a specific continuous time frame. Session similarity is then calculated using the cosine distance between each pair of sessions. The items' log data is used as implicit feedback. In their experimental results, using log data from Bugs Music (one of the biggest music services in Korea), they show that SSCF outperforms basic CF.

Zheleva et al. (2010) develop a session-based hierarchical graphical model using Latent Dirichlet Allocation and show that their model can facilitate playlist completion based on previous listening sessions or songs that the user has just listened to. Using the Zune[2] Social music community as a test bed, they model the song listening process with two graphical models with latent variables. The first, the taste model, is characterized by a set of tastes or media preferences of a specific community. In the second, the session model, each song the user has listened to is defined as a finite combination of listening moods. They show that, from the perplexity perspective, there is a clear advantage to using a session-based model for characterizing user preferences in social media content.

Dias and Fonseca (2013) improve music recommendations by adding temporal context and session diversity factors to the analysis of music sessions. Their purpose is to recommend the next song for the user to listen to. They represent each session using five features: time of day (users tend to listen to different songs at different periods of the day), weekday (users' song preferences differ between weekdays and weekends), day of month (users tend to listen to happier music at the beginning of the month and sadder music towards the end), month (users' preferences are not the same in different seasons), and song diversity. They show that including temporal information, either explicitly or implicitly, increases the accuracy of the recommendations significantly compared with traditional session-based CF. A fundamental difference from our work is that playlists are longer sequences than shopping sessions. Additionally, they did not use dwell time and did not consider song popularity.

Jannach et al. (2017, 2015) incorporate long-term preferences into next-music-track generation. They distinguish two types of preferences: short-term history (the current session) and long-term history (previous sessions), considering repeated tracks, co-occurrences of tracks, favorite singers, and social friends' track preferences. They combine all of these into a multi-faceted scoring scheme to provide the best recommendation for the next track in the playlist.

Classification: In recommender systems, one common practice is to use ensembles of classifiers (Ricci et al., 2015). Any hybrid technique that combines the results of several classifiers can be seen as an ensemble method. The Netflix Prize winners (Bell et al., 2007) used a combination of many different methods in their study. The two most common ensemble techniques are bagging and boosting (Ricci et al., 2015). Bagging (Bootstrap Aggregation), initially proposed by Breiman (1996), combines outputs from several machine learning techniques to improve the performance and stability of prediction or classification.

[2] https://en.wikipedia.org/wiki/Zune


This technique is a special form of the averaging model (Hoeting et al., 1999). In our experiments, we use three classifiers: Bagging, NBTree, and Logistic Regression. WEKA (Hall et al., 2009) enables the use of different base-learner classifiers for Bagging.

3. Datasets Description

The data in this study comprises two datasets of clickstream events from two e-commerce websites, each representing a different domain of goods, as detailed below. Both datasets are anonymized and contain clickstream log information for extended periods. To predict the real-time intent of anonymous visitors, we treat each session as a separate visit of an anonymous visitor and do not consider user information. Yet our datasets possibly contain repeated visits by the same users. We explain here why treating each session as an anonymous session strengthens our results. Previous works found that when shoppers are searching for information, their behavior is well captured by a repeated process in which the time between visits decreases (Moe and Fader, 2004; Kalczynski et al., 2006; Van den Poel and Buckinx, 2005; Bucklin and Sismeiro, 2009; Lukose et al., 2008; Su and Chen, 2015; Lo et al., 2016; Kooti et al., 2016; Raphaeli et al., 2017). However, recent work showed that shoppers might change their minds during their online visit (Polites et al., 2018). By treating each session as an anonymous, independent session, we make no prior assumptions about the purchase intent of the user, do not consider previous behavior and patterns, and learn solely from the session dynamics whether it will end with a purchase. The datasets contain real visits of users, some anonymous, some repeat customers. We handle each session separately and do not consider personal information. This enables us to model every session as belonging to an unknown visitor, and hence we do not collect user information or previous visit dynamics.

3.1. YooChoose RecSys dataset

Our primary dataset is the YooChoose RecSys challenge dataset, representing six months of user activity on a large European e-commerce business that sells a variety of consumer goods, including garden tools, toys, clothes, electronics, and more (Ben-Shimon et al., 2015). The YooChoose dataset contains two log files: a click events log and a purchase events log. The click events log consists of a list of click events on items. Each such event is associated with a session id, a timestamp (the time the click occurred), an item id, and the category of the item. The purchase events log consists of purchase events from sessions that appear in the click events log and end with a purchase. Each entry contains a session id, a timestamp (the time the purchase occurred), and details of the purchased item: the item id, the price, and the quantity. The sessions vary in length, in the number of clicks, and in the number of items clicked on. Sessions last from a few minutes to a few hours, and the number of clicks varies from one to a few hundred per session, depending on the user's activity.
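For concreteness, a minimal Python sketch of loading these two logs might look as follows. The file and column names follow the public RecSys 2015 challenge release described above; they are illustrative and not part of the original paper.

```python
import pandas as pd

# Load the two YooChoose log files described above. Column order follows
# the RecSys 2015 challenge release; adjust the paths as needed.
clicks = pd.read_csv(
    "yoochoose-clicks.dat",
    names=["session_id", "timestamp", "item_id", "category"],
    parse_dates=["timestamp"],
)
buys = pd.read_csv(
    "yoochoose-buys.dat",
    names=["session_id", "timestamp", "item_id", "price", "quantity"],
    parse_dates=["timestamp"],
)

# Label each click by whether its session ends with a purchase.
buying_sessions = set(buys["session_id"])
clicks["buy"] = clicks["session_id"].isin(buying_sessions)
```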

Name        Clicks       Buying sessions   Non-buying sessions   Items
YooChoose   33,003,944   509,696           8,740,032             52,739

Table 1: YooChoose dataset general data statistics

Table 1 presents the main characteristics of the YooChoose clickstream events: the number of sessions that end with a purchase (buying sessions), the number of sessions that do not end with a purchase (non-buying sessions), the overall number of clicks, and the overall number of items. Only 5.5% of the sessions end with a purchase. The average number of clicks per session is roughly 2.8, with the majority of sessions ending within less than three clicks. The distribution is right-skewed, however, with some sessions lasting more than 40 clicks.
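A short sketch of how these summary statistics can be reproduced, continuing from the `clicks` frame loaded above (names are illustrative):

```python
# Per-session click counts and the purchase conversion rate.
g = clicks.groupby("session_id")
session_len = g.size()
session_buy = g["buy"].any()

conversion_rate = session_buy.mean()   # ~5.5% for YooChoose
mean_clicks = session_len.mean()       # ~2.8 clicks per session

# Session-length distribution split by outcome (cf. Figure 1).
length_dist = (
    session_len.groupby(session_buy)
    .value_counts(normalize=True)
    .rename("fraction")
)
print(conversion_rate, mean_clicks)
print(length_dist.head())
```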

Figure 1: Distribution of the number of clicks per session, YooChoose dataset

Figure 1 depicts the distribution of session lengths, measured by the number of clicks, for sessions that ended with a purchase (termed buying sessions) and for those that did not (termed non-buying sessions). It is easy to see that non-buying sessions are much shorter than buying sessions. About 80% of the non-buying sessions contain between 1 and 4 clicks, and 40% of them are 2-click sessions. This is quite understandable, as users who are not happy with what they see abandon the search. Buying sessions, on the other hand, have a much longer tail. Unlike the non-buying sessions, the percentage of 1-click buying sessions is small (4%, compared with 14% for non-buying sessions), while 2-click sessions form the largest portion (about 22% of the buying sessions). The percentage then decreases gradually, but at a much more moderate rate than for the non-buying sessions. This behavior seems quite understandable, as users tend to examine more closely items they are about to purchase: they may want to learn more about them and possibly compare several options. In both cases most sessions are 2 clicks long: they account for almost 40% of non-buying sessions and 20% of buying sessions. Next come 3-click sessions, which account for 17.6% and 14.5%, respectively. Third place diverges between non-buying sessions (1-click sessions, 14.2%) and buying sessions (4-click sessions, 11.3%).

3.2. Zalando dataset

The second dataset we use is an anonymized click log from Zalando[3], a large European online fashion retailer, used previously for session-based recommendations (Tavakol and Brefeld, 2014). Every click is associated with a timestamp, the attributes of the viewed item, a user ID, and the clicked items. This dataset is richer in detail, with more attributes associated with each product than in the YooChoose dataset. However, to validate our results across domains, we limit ourselves to the features used in the YooChoose dataset. Table 2 describes the total number of clicks, items, and sessions in the dataset.

[3] www.zalando.com

Name      Clicks        Buying sessions   Non-buying sessions   Items
Zalando   224,175,394   1,647,738         25,976,224            350

Table 2: Zalando dataset general data statistics

Here we see longer sessions, 8.11 clicks per session on average, and a larger percentage of sessions that end with a purchase (5.9%).

4. Modeling Dynamics in E-Commerce Sessions

Modeling the purchase intent of an anonymous visitor can be thought of as modeling the purchase intent during their session, i.e., during an anonymous session. To understand the characteristics of anonymous sessions that end with a purchase, we quantify the dynamics of e-commerce sessions. Each session is considered a distinguishable visit of an anonymous visitor.


We characterize the dynamics of each session by the trendiness of the viewed products, the clickstream, and the session's temporal characteristics, as detailed below. We define the recent trendiness of each product and consider a session to be as trendy as the trendiest product in it.

4.1. Modeling products' recent trendiness in purchasing sessions

We consider here a local view of products' popularity, in terms of both space and time. We use the site's local information about the products and refer to this view as the 'local popularity' of products. Similarly, we are interested in the recent trend of the products' popularity, rather than in their historical popularity. Hence, the modeled trend of popularity is the recent local trajectory of the popularity of each product within the site, for purchasing customers. A product is considered trendy if, in the last several days before the analyzed session, it was viewed in a non-decreasing number of sessions that ended with a purchase.

To model the trendiness of products, we define the following process. Both the YooChoose and Zalando datasets are split into two random subsets: a learning subset and an experimental one. 80% of the sessions are used as the learning subset for the learning phase (Temporal Model Building Data), and the remaining 20% are used for experimentation (Experimental Data). The proportion of buying to non-buying sessions (5% vs. 95%) is similar in these two subsets. We split the sessions in the learning subset, i.e., in the Temporal Model Building Data, into two corresponding lookup tables. The first, the PS table, is for sessions that ended with a purchase; the NPS table is for sessions that did not end with a purchase. Each entry in the PS table (or NPS table) corresponds to a tuple <day, product_i>, depicting the number of sessions in which the product was viewed on that day that ended with a purchase (or did not end with a purchase, respectively).

Product    Day 91   Day 92   Day 93   Trend
Product1   6        8        9        Increasing
Product2   5        5        5        Non-decreasing
Product3   7        5        4        Decreasing

Table 3: PS table example from the YooChoose dataset

Table 3 shows an example of the PS table for three consecutive days. On these days, Product1 was viewed in 6, 8, and 9 sessions that ended with a purchase; Product2 was viewed on each of these days in 5 sessions that ended with a purchase; and Product3 in 7, 5, and 4 such sessions. The data was extracted from the YooChoose buying table. The days are counted by their numerical offset from the date on which the dataset begins. Let us assume we model product trendiness over a period of three days.[4]

[4] The window size for determining the trendiness of products over sessions that end with a purchase is a parameter. Section 5.1 evaluates the results obtained for different time windows.


If we define trendy products as products participating in a non-decreasing number of sessions that ended with a purchase within a predefined time window, then in this example Product1 is a trendy product with an increasing trend, Product2 is trendy with a stable, non-decreasing trend, and Product3 has a decreasing trend and is, therefore, not a trendy product. We further determine the average trendiness of products, and the corresponding trendiness of a session, as explained below.

4.2. Modeling the trendiness of products and sessions

We define a session to be as trendy as the trendiest product in it. To that end, we determine the average trendiness of each product at time t, denoted in days. For a chosen time window of n days, the average recent trendiness of a product i is the number of purchase-ending sessions it was viewed in, divided by the overall number of sessions it was viewed in. Let product i's overall performance in a time window n, denoted P_i^n(t), be the overall number of sessions it was viewed in during the preceding n days:

    P_i^n(t) = \sum_{j=t-n-1}^{t-1} \big( PS(j,i) + NPS(j,i) \big)    (1)

calculated as the sum of the number of buying sessions in which i was clicked on and the number of non-buying sessions in which it was clicked on during these days. The average trendiness TD of a product i at day t is then defined as:

    TD_i^n(t) = \frac{\sum_{j=t-n-1}^{t-1} PS(j,i)}{P_i^n(t)} = \frac{\#\,\text{buying sessions product } i \text{ was viewed in during the preceding } n \text{ days}}{\#\,\text{overall sessions product } i \text{ was viewed in during the preceding } n \text{ days}}    (2)

We can now model the trendiness of current sessions (in our example, sessions occurring at day t) using the trendiness of the products viewed in them. If we define a session of length k, denoted S_k, as a sequence of k product views on a site (with possible repetitions), then a session S_k at day t is as trendy as the trendiest product viewed in it:

    TD_{S_k}(t) = \max_{i \in S_k} TD_i^n(t)    (3)
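The following Python sketch illustrates Equations (1)-(3), assuming the PS and NPS lookup tables of Section 4.1 are kept as nested day-to-product counters; all names are illustrative and not from the original paper.

```python
from collections import defaultdict

# PS[d][i] / NPS[d][i]: number of sessions in which product i was viewed on
# day d that did / did not end with a purchase (the lookup tables of Sec. 4.1).
PS = defaultdict(lambda: defaultdict(int))
NPS = defaultdict(lambda: defaultdict(int))

def record_session(day, items, bought):
    """Update the lookup tables with one learning-subset session."""
    table = PS if bought else NPS
    for i in set(items):  # count each product once per session
        table[day][i] += 1

def product_trendiness(i, t, n):
    """TD_i^n(t), Eq. (2): share of purchase-ending sessions among all
    sessions in which product i was viewed during the n days preceding t."""
    days = range(t - n - 1, t)                 # j = t-n-1, ..., t-1, as in Eq. (1)
    ps = sum(PS[j][i] for j in days)
    total = ps + sum(NPS[j][i] for j in days)  # P_i^n(t), Eq. (1)
    return ps / total if total else 0.0

def session_trendiness(items, t, n):
    """TD_Sk(t), Eq. (3): a session is as trendy as its trendiest product."""
    return max((product_trendiness(i, t, n) for i in items), default=0.0)
```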

4.3. Modeling temporal and clickstream characteristics of a session

Additional temporal characteristics were used to model a session in both datasets. However, the available temporal characteristics differ between the YooChoose and Zalando datasets, denoted by the superscripts Y and Z respectively:

Month^Y: Some months are more prone to purchases than others. For example, the YooChoose dataset spans the months April through September; during this time, August was the month with the highest purchase conversion rate.


Day of the week^Y: People behave differently on different days of the week. For example, in the YooChoose dataset we found that people tend to purchase more on Sundays and Mondays than on other days.

Dwell time^Y: Dwell time, the time a customer spends viewing a particular page or product, has recently been linked to the customer's interest in the product (Yi et al., 2014; Bogina and Kuflik, 2017). Here, we use the session's latency.

Day number from the beginning of the dataset^Z: As the Zalando dataset does not include dates, we use the number of days offset from the beginning of the dataset.

Additionally, we use the number of clicks in a session. This feature has previously been found important in several studies, as described in Section 2.1.

Number of clicks in a session^{Y,Z}: The length of a session in number of clicks. Clearly, as the same product can be clicked on several times within a session, this value does not necessarily correlate with the number of viewed products. It is used as an additional feature for both datasets.

5. Evaluating Purchase Intention in an Anonymized Session

Our goal is to determine the purchase intent of anonymous visitors from their session's dynamic characteristics, as modeled above. To that end, we consider each session as a distinguishable visit of an anonymous visitor. We train an ensemble of classifiers, as well as the XGBoost classifier, and further examine the effect of the different dynamics modeled. We compare our results to a deep learning technique that mines recurrent patterns and utilizes neural networks (RNN). For the classification task, each session is modeled with the following set of features: the max product trendiness (TD_{S_k}(t), calculated over different time windows at time t); the number of clicks; and the temporal parameters of the session. YooChoose and Zalando differ in the available temporal parameters, as described in Section 4.3. The temporal parameters used for modeling the YooChoose sessions are therefore the day of the week, the month the session took place in, and the session's dwell time. The temporal parameter used for the Zalando sessions is the day number from the beginning of the dataset (a sketch of assembling these features appears below).

Figure 2 depicts our design flow for trendiness modeling. We learn the global trendiness information over 80% of the data. We then take the remaining 20% of the data, termed the test set; in the figure, this set appears at the Session Generation step. We then perform SMOTE over the test set, divide the test set into ten folds, learn from 90% of it, and evaluate our results over each of the remaining 10%. Recall that each of the two datasets, YooChoose and Zalando, has been split into two subsets. The first, consisting of 80% of the data, is used for the modeling, and the second, consisting of 20% of the sessions, is used for experimentation (the test part).
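A minimal sketch of assembling the per-session feature row described in Section 4.3 (the YooChoose variant), continuing the trendiness sketch above; field names are illustrative, and dwell time is interpreted here as the session's total duration (its latency):

```python
def session_features(session, t, n):
    """Feature row for one session; `session` is a DataFrame of the session's
    clicks sorted by timestamp, with `timestamp` and `item_id` columns."""
    start = session["timestamp"].iloc[0]
    end = session["timestamp"].iloc[-1]
    return {
        "max_trendiness": session_trendiness(session["item_id"], t, n),
        "n_clicks": len(session),
        "day_of_week": start.day_name(),               # nominal feature
        "month": start.month_name(),                   # nominal feature
        "dwell_time": (end - start).total_seconds(),   # session latency
    }
```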

Figure 2: Evaluation Flow

Each of these sets keeps the original characteristics of the imbalanced data, as less than 6% of the sessions end with a purchase. Classifiers trained on imbalanced data classify items of the majority class well, but perform poorly on the minority class (Chawla et al., 2002). To overcome this imbalance, we use SMOTE (Chawla et al., 2002), a combination of over-sampling and under-sampling techniques: SMOTE combines informed over-sampling of the minority class with random under-sampling of the majority class. We conduct 10-fold cross-validation experiments on the test part of the dataset, which was not used for modeling.

We train an ensemble of classifiers (Ricci et al., 2015), namely Bagging, NBTree, and Logistic Regression (Hall et al., 2009), and a state-of-the-art boosting machine learning method, XGBoost (Chen and Guestrin, 2016). For Bagging we use the Reduced Error Pruning Tree ("REPTree") (Srinivasan and Mekala, 2014), a quick decision tree learner built upon information gain. NBTree (Kohavi, 1996) is a hybrid of decision trees with Naive Bayes classifiers that learns from instances that reach the decision trees' leaves. Logistic Regression (Hosmer Jr et al., 2013) applies where the binary dependent variable is categorical, as is the case with our prediction of purchase. XGBoost[5] is used with a max depth of seven and grid search.

[5] https://github.com/dmlc/xgboost

Our experimental results show that, using the temporal and dynamic characteristics of the products and the sessions, we are able to achieve a good classification of whether a session ends with a purchase. Preliminary results calculated with ensemble methods over the YooChoose dataset, with a fixed time window of three days (Bogina et al., 2016), gave very good classification measures, with the Bagging classifier yielding a precision of 0.937 and AUC = 0.939. Here, we deepen the understanding of the effect of the different session dynamics, as well as of the time window used for learning, and compare our results over both the YooChoose and Zalando datasets, enabling a cross-domain and cross-site evaluation of our model.
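The paper runs these classifiers in WEKA; the sketch below is a rough scikit-learn / imbalanced-learn analogue of the evaluation pipeline, not the original setup. NBTree has no direct scikit-learn equivalent, a pruned decision tree stands in for REPTree, and SMOTE is applied here inside each training fold. The synthetic X and y are placeholders for the encoded session features and buy labels.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Placeholder data: ~5.5% positive (buying) sessions, as in YooChoose.
rng = np.random.default_rng(0)
X = rng.random((1000, 5))                  # stand-in for encoded session features
y = (rng.random(1000) < 0.055).astype(int)

classifiers = {
    "bagging": BaggingClassifier(DecisionTreeClassifier(max_depth=10)),
    "logistic": LogisticRegression(max_iter=1000),
    "xgboost": XGBClassifier(max_depth=7),
}

for name, clf in classifiers.items():
    # SMOTE inside the pipeline, so oversampling is refit on each training fold.
    pipe = Pipeline([("smote", SMOTE()), ("clf", clf)])
    scores = cross_val_score(pipe, X, y, cv=10, scoring="f1")
    print(name, scores.mean())
```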


5.1. Product trendiness over time

To calculate product trendiness, we first plot the number of sessions in which trending products were viewed during different time windows in the YooChoose dataset. Figure 3 depicts the total number of sessions on consecutive days in which products were trending (showing a non-decreasing trend of clicks), for time windows spanning two to six days. We chose this range because the total number of sessions containing the same product over seven consecutive days is negligible: only 175 such sessions with trending products end with a purchase. The calculation is applied to the entire dataset. As expected, there are more such sessions over shorter time windows than over longer ones, and the same proportion of these sessions end with a purchase as in the global dataset (around 5%).

Figure 3: YooChoose: Aggregated number of sessions with trending products over different time windows, compared to the number of sessions with non-trending products

5.2. The effect of session temporal features and product trendiness over different window sizes

First, to deepen our understanding of the effect of the trendiness of the viewed products on the estimated purchase intent, we evaluate the quality of the estimation with and without trendiness, and over several time windows. We compare the prediction of each of the four classifiers to the other three. Table 4 depicts the purchase intent classification performance for the YooChoose dataset over the different time windows, showing the F1-measures of all classifiers' predictions. For each time window, we compare the performance of the model over the different classifiers with and without taking product trendiness into account.


Days     Features             Logistic   Bagging   NBTree   XGBoost
2 days   With Trendiness      0.739      0.904     0.916    0.7853
         Without Trendiness   0.733      0.886     0.888    0.765
3 days   With Trendiness      0.72       0.886     0.899    0.798
         Without Trendiness   0.71       0.854     0.855    0.76
4 days   With Trendiness      0.7        0.882     0.883    0.8236
         Without Trendiness   0.686      0.816     0.817    0.758
5 days   With Trendiness      0.68       0.889     0.867    0.8559
         Without Trendiness   0.664      0.786     0.786    0.75
6 days   With Trendiness      0.677      0.899     0.873    0.9
         Without Trendiness   0.65       0.796     0.795    0.7788

Table 4: YooChoose: Quality of prediction (F1) of purchase intent over different time windows

Product trendiness significantly improves the purchase intent estimation over all examined window sizes, with p-value < 0.03 in T-tests performed for all time windows (a sketch of such a test appears below). Interestingly, when the temporal features are taken over a larger time window, the ensemble classifiers' prediction quality decreases, in which case taking product trendiness into account improves the prediction. However, when only the very recent history is considered, the temporal features alone give a prediction almost as good as the one that includes recent product trendiness. Recency thus strengthens the effect of the temporal features on the quality of purchase intent prediction when using ensemble methods. To further understand which features contribute most to the prediction, we applied WEKA's automatic feature selection algorithm to the YooChoose dataset for all features (after SMOTE). The algorithm selected trendiness and the number of clicks. Hence, the very recent click behavior on the site is indeed a good predictor.

The use of product trendiness improves the prediction accuracy over all selected time windows, for all four classifiers. The effect of trendiness for the longer time windows is surprising, considering that the number of sessions with trending products over five or six days is rather small, and may indicate a higher predictive power for products trending over these longer periods. The effect of trendiness over longer time windows improves XGBoost's prediction quality, with the highest value (an F1-measure of 0.9) obtained over a time window of six days.

For Zalando, the available temporal information is the session's offset in days from the beginning of the dataset. We use this temporal information along with the number of clicks and product trendiness, which were found important for the YooChoose dataset. Table 5 presents the quality of purchase intent prediction over different time windows, showing the F1-measures of all classifiers' predictions. At each time window, we compare the results with and without trendiness. The trendiness of products improves the prediction over the number of clicks and the temporal information when recent history is considered, and was found significant using a T-test only for a time window of four days (p-value = 0.05).
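The significance tests above can be illustrated with a paired t-test; the exact pairing of scores used in the paper is not specified, so the following sketch pairs the with/without-trendiness F1 values from the Bagging and NBTree columns of Table 4 purely to show the mechanics:

```python
from scipy import stats

# F1 scores from Table 4 (Bagging then NBTree, windows of 2-6 days).
with_trend    = [0.904, 0.886, 0.882, 0.889, 0.899,
                 0.916, 0.899, 0.883, 0.867, 0.873]
without_trend = [0.886, 0.854, 0.816, 0.786, 0.796,
                 0.888, 0.855, 0.817, 0.786, 0.795]

t_stat, p_value = stats.ttest_rel(with_trend, without_trend)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```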


Days     Features             Logistic   Bagging   NBTree   XGBoost
2 days   With Trendiness      0.702      0.859     0.905    0.7554
         Without Trendiness   0.701      0.745     0.731    0.7281
3 days   With Trendiness      0.702      0.892     0.892    0.7543
         Without Trendiness   0.701      0.761     0.735    0.7222
4 days   With Trendiness      0.705      0.859     0.806    0.7861
         Without Trendiness   0.7        0.807     0.762    0.7246
5 days   With Trendiness      0.725      0.876     0.848    0.8681
         Without Trendiness   0.718      0.87      0.825    0.7409
6 days   With Trendiness      0.761      0.893     0.883    0.9438
         Without Trendiness   0.755      0.896     0.875    0.786

Table 5: Zalando: Quality of prediction (F1) of purchase intent over different time windows

These results differ somewhat from those of YooChoose, where trendiness has a larger positive effect on prediction over longer time windows. This, however, can be attributed to the difference in the available temporal history between the two datasets. For XGBoost, as with the YooChoose dataset, the effect of trendiness over longer time windows improves its prediction quality, with the highest value (an F1-measure of 0.94) over a time window of six days.

Prediction with recurrent neural networks. We further compare our results with a deep learning technique that mines recurrent patterns utilizing neural networks (RNN). RNNs are considered the state of the art for within-session next-click prediction in e-commerce recommendation (Hidasi et al., 2015, 2016; Quadrana et al., 2017), though, as far as we know, they have not previously been used for in-session intent prediction. Here, we represent each session as a sequence of clicked items. Due to the imbalanced nature of our datasets, we downsample the data such that the number of sessions that end with a purchase equals the number of sessions that do not. The classification task is then to predict whether a session ends with a purchase. The RNN achieves an F1-measure of 0.84 for the YooChoose dataset and an F1-measure of 0.80 for Zalando.

5.3. General modeling of purchase intention

In order to build a general model of purchase intent for anonymous users over the e-commerce domain, we build a representative feature set that performs well on both datasets. Since the two datasets are in the same e-commerce domain, we use the same features, the number of clicks and trendiness, and do not consider any temporal information. To create a valid comparison, we run SMOTE only on these two features for both datasets, assuring the same baseline for comparing the results. The results are given in Table 6. Since temporal features are taken into account only implicitly in this part of the experiment, through the trendiness calculation, the differences between the different time windows are negligible. Interestingly, the quality of predicting purchase intention in both datasets is similar over different time spans (especially when using an ensemble technique, i.e., Bagging).


Time window   Classifier   YooChoose   Zalando
2 days        Logistic     0.644       0.686
              Bagging      0.784       0.76
              NBTree       0.717       0.741
              XGBoost      0.6486      0.7335
3 days        Logistic     0.629       0.686
              Bagging      0.786       0.777
              NBTree       0.717       0.763
              XGBoost      0.65127     0.72201
4 days        Logistic     0.616       0.686
              Bagging      0.77        0.766
              NBTree       0.7         0.714
              XGBoost      0.65309     0.7187
5 days        Logistic     0.613       0.684
              Bagging      0.735       0.747
              NBTree       0.675       0.711
              XGBoost      0.6512      0.7214
6 days        Logistic     0.604       0.684
              Bagging      0.729       0.747
              NBTree       0.674       0.711
              XGBoost      0.6501      0.729

Table 6: Prediction results (F1) of the general comparative model

The general model does not take temporal parameters into account, only product trendiness and the number of clicks. Tables 4 and 5 detail the classification results with all the model parameters, that is, product trendiness, the number of clicks, and the temporal features, for the YooChoose and Zalando datasets, respectively. Comparing those results to the results of the general model in Table 6, it is clear that the best results are achieved for each dataset when all the model parameters, including the available temporal information, are taken into account. When temporal information is not considered, as in the general model, the prediction quality decreases.

6. Discussion

This study explores factors that can reveal the purchase intent of anonymous visitors to sites. Understanding the purchase intent of anonymous visitors also applies to occasional shoppers (who may also be unidentified returning shoppers), who are responsible for almost half of all online purchases.

Previous works that consider only anonymous visitors mine the session's clickstream data for rules, looking for known patterns typical of recurrent visitors with known purchase intent. Our work is the first to try to quantify unique factors that help explain the shopping intent of anonymous and unknown occasional visitors.

We show that site visitors who view currently trending products are more likely to purchase at the end of their visit. In our case, we define this trendiness as not losing popularity on that site recently, which can be viewed as locality in time (recency) and space (site-centricity). Interest in products is known to shift in a process described as concept drift, explained earlier in Section 2.2. To capture this temporal change of interest, we implement an instance time window and detect the local trend of product popularity on the site. While others have also explored the strolling habits of online shoppers, the recent local popularity of products had not been considered before. Our results demonstrate a connection between viewing trendy products and purchase intent.

Temporal aspects of the session itself, namely the day of the week and the time of year the session took place, were previously shown to be indicative of the purchase intent of returning visitors. We show here that they are also indicative of anonymous visitors' shopping intent.

Building on the above, we introduce a novel classification method that uses the temporal dynamics of products (which we term trendiness), together with the length and temporal features of a session, to classify whether the session ends with a purchase. Our datasets come from the product and retail e-commerce industries. The temporal features denoting the time of the visit differ between the two datasets. The use of the sessions' temporal features improves the classification performance, and, like trendiness, their removal is associated with a negative impact on the results. The temporal information available for the YooChoose dataset is rich. The Zalando dataset does not contain explicit dates; we only had the offset in days from the starting date of the dataset. Nevertheless, adding this temporal information was sufficient to improve the inference over all the time windows we experimented with. When temporal information is not used, as in our general model (described in Section 5.3), the inference quality decreases for both datasets.

The datasets consist of real visits by registered or returning customers as well as by anonymous visitors. Anonymous visitors might be first-time shoppers, or returning visitors and occasional shoppers who do not wish to register at the site. We treat each session as made by an anonymous visitor and model the session's dynamics, as discussed above. While we think this approach is more challenging for our model, it is also a limitation of this study. Another limitation is that we do not use the extensive product information available at the sites. Previous works have found that shopping intention and acceptance are influenced by product characteristics (Pavlou and Fygenson, 2006).

Our study takes a machine learning approach to classifying anonymous visitors' purchase intention. Both our datasets were imbalanced, with 5-5.9% of the sessions ending with a purchase.

There are several known techniques for handling imbalanced datasets. We observed that using SMOTE on our datasets provides better results than undersampling techniques. We further found that applying SMOTE to all features is more successful than applying it only to selected ones. In this study, we define the temporal features as nominal, rather than using the numeric values of our preliminary study (Bogina et al., 2016). Both of these changes improve the results compared to the initial ones presented in the preliminary paper. For the classification task we employ an ensemble of classifiers (Ricci et al., 2015) and XGBoost, a novel boosting method (Chen and Guestrin, 2016). We achieve classification F1-measures of 0.9 and 0.94 for the two datasets. These results outperform the random baseline used in recent empirical works (Lo et al., 2016; Kooti et al., 2016). We further achieve better quality than the RNN, a within-session deep learning method, which classified with F1-measures of 0.8 and 0.84, respectively. Generally, Bagging gives the best results for both datasets regardless of the time window used. XGBoost seems to have a different tendency than the ensemble methods, producing better results over the longer time windows. This might be attributed to the smaller amount of data available for longer time windows, as demonstrated in Figure 3. Similar to our findings, it has been reported that while XGBoost is a common choice in Kaggle challenges and KDD Cup competitions, ensemble methods may, depending on the dataset, give better results (Bekkerman, 2015).

We successfully classified sessions by whether they ended with a purchase. Our model identifies online signals of purchase intent that can be used for the online purchase prediction of anonymous visitors while their session is ongoing. Almost half the sessions are long, involving three or more clicks. This gives rise to a predictive paradigm in which our model is used to predict the purchase intent of an unknown visitor after three or more clicks. Predicting early that a visitor does not intend to purchase improves the site's ability to introduce recommenders and personal aids to the visitor during their visit.

Our findings have interesting managerial implications. We have considered the session's temporal parameters and the products' recent trendiness. Temporal information was previously considered only for returning customers. Our results indicate that sites can use sessions' temporal information for intent prediction not only for recurrent visitors, but also for first-time and anonymous visitors. The temporal information we have for the YooChoose and Zalando datasets differs, yet in both cases it improves the prediction quality. Sites should therefore consider using the full temporal information that exists per session. Our findings on products' trendiness contribute to understanding the intent of these visitors. We show that removing the product trendiness feature negatively affects the classification accuracy for both datasets, and more so for long time windows. Building on our preliminary results (Bogina et al., 2016), the use of products' recent in-site popularity was applied in an e-commerce recommender system for purchase intent prediction (Zhang et al., 2016). The findings from our learning process of product trendiness over windows of consecutive days indicate a high recency in the products' attention span.
Products are often viewed by visitors on up to three consecutive days, but less so over longer windows of time. Interestingly, the number of sessions in which a product is viewed on consecutive days, e.g., in a window of two to six consecutive days, decreases with each day. When we consider a time window of seven days in which we require a product to be viewed (at least once) on each consecutive day, we are left with a negligible number of sessions, indicating a clear within-site concept drift for the vast majority of products. Sites can thus track this temporal trending interest in products and identify per-product trends and concept drift; suggest products accordingly; look for patterns; and try to identify products with correlated or complementary trends.

7. Conclusions and Future Work

We present a method for determining the shopping intent of anonymous visitors to a site. Our method uses only the visitor's session information, namely the session's temporal information, the session length, and the recent trendiness of the products clicked on in that session. Trendiness offers a local temporal view of the products' recent popularity. To detect a recent trend, we draw on machine learning techniques for identifying concept drift in popularity, and find strong locality in time, on a scale of days. We show over two separate datasets from the retail industry that our method achieves good classification of anonymous and occasional visitors' purchase intent. The best intent inference is achieved when using the temporal aspects together with the session's trendiness and the number of clicks.

The results of this work can be used to create novel real-time recommender systems that integrate trendiness and session temporal information into the reasoning process of an online purchase intent classification mechanism. Such a mechanism may guide an online recommender in improving the shopping experience, to the benefit of both the buyer and the seller. Our work is a first step towards predicting the shopping intent of anonymous site visitors. Here, we determine purchase intent only at the end of the session. An interesting future direction is to find how early in the session the classifier can make a good enough recommendation.

8. Acknowledgements

We would like to thank the Zalando team for providing their data for our research.

References

Amaro, S. and Duarte, P. (2015). An integrative model of consumers' intentions to purchase travel online. Tourism Management, 46:64-79.


Baumann, A., Haupt, J., Gebert, F., and Lessmann, S. (2018). Changing perspectives: Using graph metrics to predict purchase probabilities. Expert Systems with Applications, 94:137–148.
Bekkerman, R. (2015). The present and the future of the KDD Cup competition: An outsider's perspective. https://www.linkedin.com/pulse/present-future-kdd-cup-competition-outsiders-ron-bekkerman/.
Bell, R., Koren, Y., and Volinsky, C. (2007). Modeling relationships at multiple scales to improve accuracy of large recommender systems. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 95–104. ACM.
Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., and Hoerle, J. (2015). RecSys Challenge 2015 and the YooChoose dataset. In Proceedings of the 9th ACM Conference on Recommender Systems, pages 357–358. ACM.
Bhatnagar, A., Sen, A., and Sinha, A. P. (2016). Providing a window of opportunity for converting estore visitors. Information Systems Research, 28(1):22–32.
Bogina, V. and Kuflik, T. (2017). Incorporating dwell time in session-based recommendations with recurrent neural networks. In First Workshop on Temporal Reasoning in Recommender Systems, Como, Italy.
Bogina, V., Kuflik, T., and Mokryn, O. (2016). Learning item temporal dynamics for predicting buying sessions. In Proceedings of the 21st International Conference on Intelligent User Interfaces, pages 251–255. ACM.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2):123–140.
Bucklin, R. E., Lattin, J. M., Ansari, A., Gupta, S., Bell, D., Coupey, E., Little, J. D., Mela, C., Montgomery, A., and Steckel, J. (2002). Choice and the internet: From clickstream to research stream. Marketing Letters, 13(3):245–258.
Bucklin, R. E. and Sismeiro, C. (2009). Click here for internet insight: Advances in clickstream data analysis in marketing. Journal of Interactive Marketing, 23(1):35–48.
Cai, H., Chen, Y., and Fang, H. (2009). Observational learning: Evidence from a randomized natural field experiment. The American Economic Review, 99(3):864–882.
Center for Retail Research (2017). Online retailing: Britain, Europe, US and Canada 2017. http://www.retailresearch.org/onlineretailing.php.
Chan, T. K., Cheung, C. M., and Lee, Z. W. (2017). The state of online impulse-buying research: A literature analysis. Information & Management, 54(2):204–217.
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357.
Chen, C., Hou, C., Xiao, J., Wen, Y., and Yuan, X. (2017). Enhancing purchase behavior prediction with temporally popular items. IEICE Transactions on Information and Systems, 100(9):2237–2240.
Chen, T. and Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794. ACM.


Choi, H. and Varian, H. (2012). Predicting the present with Google Trends. Economic Record, 88(s1):2–9.
Cobb, C. J. and Hoyer, W. D. (1986). Planned versus impulse purchase behavior. Journal of Retailing.
Deng, L. and Poole, M. S. (2010). Affect in web interfaces: A study of the impacts of web page visual complexity and order. MIS Quarterly, pages 711–730.
Dias, R. and Fonseca, M. J. (2013). Improving music recommendation in session-based collaborative filtering by using temporal context. In Proceedings of the 2013 IEEE 25th International Conference on Tools with Artificial Intelligence (ICTAI), pages 783–788. IEEE.
Ding, A. W., Li, S., and Chatterjee, P. (2015). Learning user real-time intent for optimal dynamic web page transformation. Information Systems Research, 26(2):339–359.
Forrester Research (2017). Forrester data: Online retail forecast, 2017 to 2022. https://www.forrester.com/report/Forrester+Data+Online+Retail+Forecast+2017+To+2022+US//E-RES139271.
Gefen, D., Karahanna, E., and Straub, D. W. (2003). Trust and TAM in online shopping: An integrated model. MIS Quarterly, 27(1):51–90.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1):10–18.
Hanson, W. A. and Putler, D. S. (1996). Hits and misses: Herd behavior and online product popularity. Marketing Letters, 7(4):297–305.
Hidasi, B., Karatzoglou, A., Baltrunas, L., and Tikk, D. (2015). Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
Hidasi, B., Quadrana, M., Karatzoglou, A., and Tikk, D. (2016). Parallel recurrent neural network architectures for feature-rich session-based recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, pages 241–248. ACM.
Hoeting, J. A., Madigan, D., Raftery, A. E., and Volinsky, C. T. (1999). Bayesian model averaging: A tutorial. Statistical Science, pages 382–401.
Hosmer Jr, D. W., Lemeshow, S., and Sturdivant, R. X. (2013). Applied Logistic Regression, volume 398. John Wiley & Sons.
Jannach, D., Kamehkhosh, I., and Lerche, L. (2017). Leveraging multi-dimensional user models for personalized next-track music recommendation. In Proceedings of the Symposium on Applied Computing, pages 1635–1642. ACM.
Jannach, D., Lerche, L., and Kamehkhosh, I. (2015). Beyond hitting the hits: Generating coherent music playlist continuations with the right tracks. In Proceedings of the 9th ACM Conference on Recommender Systems, pages 187–194. ACM.
Jeffrey, S. A. and Hodge, R. (2007). Factors influencing impulse buying during an online purchase. Electronic Commerce Research, 7(3):367–379.
Kalczynski, P. J., Senecal, S., and Nantel, J. (2006). Predicting on-line task completion with clickstream complexity measures: A graph-based approach. International Journal of Electronic Commerce, 10(3):121–141.


Kim, E., Kim, W., and Lee, Y. (2003). Combination of multiple classifiers for the customer's purchase behavior prediction. Decision Support Systems, 34(2):167–175.
Kim, Y. S. and Yum, B.-J. (2011). Recommender system based on click stream data using association rule mining. Expert Systems with Applications, 38(10):13320–13327.
Kohavi, R. (1996). Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In KDD, volume 96, pages 202–207.
Kooti, F., Lerman, K., Aiello, L. M., Grbovic, M., Djuric, N., and Radosavljevic, V. (2016). Portrait of an online shopper: Understanding and predicting consumer behavior. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pages 205–214. ACM.
Koren, Y. (2010). Collaborative filtering with temporal dynamics. Communications of the ACM, 53(4):89–97.
Krawczyk, B., Minku, L. L., Gama, J., Stefanowski, J., and Woźniak, M. (2017). Ensemble learning for data stream analysis: A survey. Information Fusion, 37:132–156.
Lathia, N., Hailes, S., Capra, L., and Amatriain, X. (2010). Temporal diversity in recommender systems. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 210–217. ACM.
Lo, C., Frankowski, D., and Leskovec, J. (2016). Understanding behaviors that lead to purchasing: A case study of Pinterest. In KDD, pages 531–540.
Lu, J., Wu, D., Mao, M., Wang, W., and Zhang, G. (2015). Recommender system application developments: A survey. Decision Support Systems, 74:12–32.
Lukose, R., Li, J., Zhou, J., and Penmetsa, S. R. (2008). Learning user purchase intent from user-centric data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 673–680. Springer.
McDowell, W. C., Wilson, R. C., and Kile Jr, C. O. (2016). An examination of retail website design and conversion rate. Journal of Business Research, 69(11):4837–4842.
Moe, W. W. (2003). Buying, searching, or browsing: Differentiating between online shoppers using in-store navigational clickstream. Journal of Consumer Psychology, 13(1-2):29–39.
Moe, W. W. and Fader, P. S. (2004). Dynamic conversion behavior at e-commerce sites. Management Science, 50(3):326–335.
Mokryn, O., Wagner, A., Blattner, M., Ruppin, E., and Shavitt, Y. (2016). The role of temporal trends in growing networks. PLoS ONE, 11(8):e0156505.
Montgomery, A. L., Li, S., Srinivasan, K., and Liechty, J. C. (2004). Modeling online browsing and path analysis using clickstream data. Marketing Science, 23(4):579–595.
Olbrich, R. and Holsing, C. (2011). Modeling consumer purchasing behavior in social shopping communities with clickstream data. International Journal of Electronic Commerce, 16(2):15–40.


Panagiotelis, A., Smith, M. S., and Danaher, P. J. (2014). From Amazon to Apple: Modeling online retail sales, purchase incidence, and visit behavior. Journal of Business & Economic Statistics, 32(1):14–29.
Park, C. H. and Park, Y.-H. (2016). Investigating purchase conversion by uncovering online visit patterns. Marketing Science, 35(6):894–914.
Park, S. E., Lee, S., and Lee, S.-g. (2011). Session-based collaborative filtering for predicting the next song. In Proceedings of the 2011 First ACIS/JNU International Conference on Computers, Networks, Systems and Industrial Engineering (CNSI), pages 353–358. IEEE.
Pavlou, P. A. and Fygenson, M. (2006). Understanding and predicting electronic commerce adoption: An extension of the theory of planned behavior. MIS Quarterly, pages 115–143.
Pew Research Center (2016). Online shopping and e-commerce. http://www.pewinternet.org/2016/12/19/online-shopping-and-e-commerce/.
Polites, G. L., Karahanna, E., and Seligman, L. (2018). Intention–behaviour misalignment at B2C websites: When the horse brings itself to water, will it drink? European Journal of Information Systems, 27(1):22–45.
Quadrana, M., Karatzoglou, A., Hidasi, B., and Cremonesi, P. (2017). Personalizing session-based recommendations with hierarchical recurrent neural networks. In Proceedings of the Eleventh ACM Conference on Recommender Systems, pages 130–137. ACM.
Raphaeli, O., Goldstein, A., and Fink, L. (2017). Analyzing online consumer behavior in mobile and PC devices: A novel web usage mining approach. Electronic Commerce Research and Applications, 26:1–12.
Ricci, F., Rokach, L., Shapira, B., and Kantor, P. B. (2015). Recommender Systems Handbook. Springer.
Salganik, M. J., Dodds, P. S., and Watts, D. J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311(5762):854–856.
Scarpi, D., Pizzi, G., and Visentin, M. (2014). Shopping for fun or shopping to buy: Is it different online and offline? Journal of Retailing and Consumer Services, 21(3):258–267.
Schäfer, K. and Kummer, T.-F. (2013). Determining the performance of website-based relationship marketing. Expert Systems with Applications, 40(18):7571–7578.
Senecal, S., Kalczynski, P. J., and Nantel, J. (2005). Consumers' decision-making process and their online shopping behavior: A clickstream analysis. Journal of Business Research, 58(11):1599–1608.
Sismeiro, C. and Bucklin, R. E. (2004). Modeling purchase behavior at an e-commerce web site: A task-completion approach. Journal of Marketing Research, 41(3):306–323.
Srinivasan, D. B. and Mekala, P. (2014). Mining social networking data for classification using REPTree. International Journal of Advance Research in Computer Science and Management Studies, 2(10).
Su, Q. and Chen, L. (2015). A method for discovering clusters of e-commerce interest patterns using click-stream data. Electronic Commerce Research and Applications, 14(1):1–13.


Suh, E., Lim, S., Hwang, H., and Kim, S. (2004). A prediction model for the purchase probability of anonymous customers to support real time web marketing: A case study. Expert Systems with Applications, 27(2):245–255.
Tavakol, M. and Brefeld, U. (2014). Factored MDPs for detecting topics of user sessions. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 33–40. ACM.
Tsymbal, A. (2004). The problem of concept drift: Definitions and related work. Computer Science Department, Trinity College Dublin, 106(2).
Tucker, C. and Zhang, J. (2011). How does popularity information affect choices? A field experiment. Management Science, 57(5):828–842.
Van den Poel, D. and Buckinx, W. (2005). Predicting online-purchasing behaviour. European Journal of Operational Research, 166(2):557–575.
Venkatesh, V. and Agarwal, R. (2006). Turning visitors into customers: A usability-centric perspective on purchase behavior in electronic channels. Management Science, 52(3):367–382.
Wells, J. D., Parboteeah, V., and Valacich, J. S. (2011). Online impulse buying: Understanding the interplay between consumer impulsiveness and website quality. Journal of the Association for Information Systems, 12(1):32.
Widmer, G. and Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23(1):69–101.
Wolfinbarger, M. and Gilly, M. C. (2001). Shopping online for freedom, control, and fun. California Management Review, 43(2):34–55.
Yi, X., Hong, L., Zhong, E., Liu, N. N., and Rajan, S. (2014). Beyond clicks: Dwell time for personalization. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 113–120. ACM.
Zhang, H., Ni, W., Li, X., and Yang, Y. (2016). Modeling the heterogeneous duration of user interest in time-dependent recommendation: A hidden semi-Markov approach. IEEE Transactions on Systems, Man, and Cybernetics: Systems.
Zheleva, E., Guiver, J., Mendes Rodrigues, E., and Milić-Frayling, N. (2010). Statistical models of music-listening sessions in social media. In Proceedings of the 19th International Conference on World Wide Web, pages 1019–1028. ACM.
Zhou, L., Dai, L., and Zhang, D. (2007). Online shopping acceptance model – a critical survey of consumer factors in online shopping. Journal of Electronic Commerce Research, 8(1):41.
