Explicit versus Implicit information for job recommendation: A case study with the Flemish public employment services

Michael Reusens, Wilfried Lemahieu, Bart Baesens, Luc Sels

PII: S0167-9236(17)30061-1
DOI: 10.1016/j.dss.2017.04.002
Reference: DECSUP 12826

To appear in: Decision Support Systems

Received date: 18 July 2016
Revised date: 4 April 2017
Accepted date: 4 April 2017

Please cite this article as: Michael Reusens, Wilfried Lemahieu, Bart Baesens, Luc Sels, Explicit versus Implicit information for job recommendation: A case study with the Flemish public employment services, Decision Support Systems (2017), doi: 10.1016/j.dss.2017.04.002

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Explicit versus Implicit information for job recommendation: A case study with the Flemish public employment services

Michael Reusens a, Wilfried Lemahieu a, Bart Baesens a,b, Luc Sels a

a Faculty of Economics and Business, KU Leuven, Naamsestraat 69, 3000 Leuven, Belgium, Email: {michael.reusens, wilfried.lemahieu, bart.baesens, luc.sels}@kuleuven.be
b Southampton Business School, University of Southampton, United Kingdom

Abstract

Recommender systems have proven to be a valuable tool in many online applications. However, the multitude of user-related data types and recommender system algorithms makes it difficult for decision makers to choose the best combination for their specific business goals. Through a case study on job recommender systems in collaboration with the Flemish public employment services (VDAB), we evaluate what data types are most indicative of job seekers' vacancy interests, and how this impacts the appropriateness of the different types of recommender systems for job recommendation. We show that implicit feedback data covers a broader spectrum of job seekers' job interests than explicitly stated interests. Based on this insight we present a user-user collaborative filtering system solely based on this implicit feedback data. Our experiments show that this system outperforms the extensive knowledge-based recommender system currently employed by VDAB in both offline and expert evaluation. Furthermore, this study contributes to the existing recommender system literature by showing that, even in high risk recommendation contexts such as job recommendation, organizations should not hold on exclusively to explicit feedback recommender systems but should also embrace the value and abundance of available implicit feedback data.

Keywords: Recommender System, Job Recommender System, User Behavior

1. Introduction

Looking for a job can be a daunting task for a job seeker. When evaluating an open vacancy (s)he needs to take into account a multitude of factors, such as the type of job presented, location, required experience and certifications, etc. The job seeker needs to make sure that these features match his/her properties and interests. It is unproductive and discouraging for a job seeker to spend a lot of time looking at jobs (s)he is not interested in, or unqualified for. In Flanders (the northern region of Belgium) alone, over 1,000,000 new vacancies were published between November 2014 and October 2015 [1]. Requiring a job seeker to filter through all vacancies is highly impractical, since (s)he will only be interested to pursue a small subset of these vacancies. A more general


version of this problem, in which there are too many available items so that manually finding those


that are interesting becomes infeasible, is called the information overflow problem[2]. Recommender systems were first proposed as a solution for information overflow problems. They


are the set of software tools that, given a user and a set of items, suggest those items that are probably relevant to the user [3]. Since their introduction, research on recommender systems has


continued to rise, and they have been successfully applied in various contexts [4, 5, 6, 7, 8]. Research on recommender systems has also been conducted in the context of job search. Section 2


provides an overview of the most prominent job recommendation studies. However, job recommendation can be considered a high risk recommendation context [7] compared


to more established domains such as movie-, or book recommendation. A high risk recommenda-


tion context is one in which “wrong” recommendations have a higher cost related to them than in low risk recommendation domains. The risk related to recommending poor vacancies as public


employment services (PES) is a loss of image, a higher number of dissatisfied job seekers, and most importantly potentially longer unemployment durations. Because of this risk PESs traditionally approach job recommendation from a conservative perspective: only recommend jobs to people based on what they explicitly tell us they are interested in. This is called recommendation based on explicit feedback[9].

As an alternative to explicit feedback, many prominent recommender algorithms are purely based on implicit feedback [10]. The term implicit feedback encompasses all signals that users give through their natural behavior, from which their interests can be deduced. Some common examples of implicit feedback used in recommender systems are pages people visited, how long they stayed on a specific page and saving an item for revisiting at a later time. In previous job recommender system studies (see Section 2) we see that there are two strands of job recommender studies: researchers that claim that given the high risk nature of job recommendation only explicit feedback should be taken into account, and researchers that propose pure implicit feedback based job recommendations, disregarding the high-risk nature. A gap in these studies is a comparison between which interest indicators best capture job seekers' interest, and which interest indicators lead to better job recommendations.

This study compares explicit interest data with implicit interest data to answer two questions:

1. How does user interest expressed by explicit data compare to user interest captured by implicit data sources, and
2. which data source leads to better performing vacancy recommender systems?


back data out of fear for inferior job recommendations. A result in favor of implicit feedback data could lead to adoption in this high risk domain, resulting in better job recommendations and poten-


tially shorter unemployment durations. These research questions could also lead to more insights in other high risk recommendation contexts such as tourism, real-estate and financial services [7].


The rest of this paper is structured as follows. In Section 2, the literature that forms the foun-


dations for this study is presented, with a focus on interest data, recommender system algorithms


and earlier job recommender systems. Section 3 further clarifies the exact research questions that we try to answer in this paper, together with the corresponding methodology. This section also focuses on what performance metrics are most suitable for the recommender system comparison and presents the data and recommender system algorithms used in this research. The results of our experiments are presented in Section 4. To conclude, section 5 provides a discussion on the experimental results, analyzes the limitations of this study, and suggests future research directions.

2. Related Work

This section discusses the main works this study builds upon. The related research is split into the following domains: interest data for recommender systems, general recommender algorithms, and job recommender systems.

2.1. Interest data for recommender systems

At the core of every recommender system lies the concept of user interest. In this study, we distinguish between two types of user interest data: data from interests that are consciously entered


in the system (explicit feedback data), and interest inferred during the period in which the user interacts with the system (implicit feedback data). Examples of explicit feedback data used in this study are desired job type, desired location and experience required. As implicit feedback data we


use the total read-time for each vacancy per user, the amount of clicks on each vacancy per user,


which vacancies are saved by the user in his/her profile and which vacancies the user applied for. User provided data is generally seen as more robust information about a user’s preferences [9]. It


is intuitive to see that a user explicitly stating in his/her profile to be interested in job type A is a stronger signal about his/her interests than a user that merely clicks on a vacancy of a specific job


type. User provided profiles also make it easier to explain recommendations based on them, since the system can show which features do, or do not match [7]. For these reasons, recommender risk


averse organizations such as PESs tend to stick with explicit feedback data. A downside of this type of information is that it imposes a large data-input threshold for potential


users of the recommender system. Providing the system with your complete information regarding


personal details, education, experience, qualifications, desired job types, ... requires high user effort and increases the probability of users entering incomplete or incorrect information [11, 9].


Numerous studies have shown that implicit feedback features correlate well with user interest [12, 13]. Implicit feedback as a substitute for explicit feedback in recommender systems was first researched by [10], and later applied in other recommender system studies in the area of e-commerce [14], book recommendation [15], television show recommendation [16], job recommendation [17] and other recommendation domains.

The downside of implicit feedback is the assumption that a specific behavior towards an item implies interest in that item. This is not always the case (e.g. accidentally clicking on a vacancy), making implicit feedback data more prone to noise [9].

2.2. Recommender Systems

In our experiments, we compare two different types of user-user collaborative filtering recommenders [18] with a hybrid content- and knowledge-based recommender [19, 20]. Collaborative filtering is selected as representative for implicit feedback recommender systems, as it is the most popular recommender algorithm, successfully deployed in many recommendation contexts [7] and commonly used in recommendation benchmarks [21]. The choice for user-user (vs. item-item)


collaborative filtering is related to performance, as there is a much greater number of vacancies (800,000) compared to users (190,000). This way fewer similarities need to be computed. As we will further elaborate on in Section 5.2, we could have expanded the benchmark with more algorithms


(e.g. item-item collaborative filtering [22] or matrix factorization algorithms [16, 23]). However the


aim of this study is not to find the best-in-class job recommender but to analyze the differences between implicit and explicit feedback data and their impact on recommendation quality. Given


our results, including a potentially better implicit feedback based recommender system would not


change our conclusions.

2.3. Job recommender systems


One of the first job recommender systems is presented by Rafter & Smyth [17]. This system builds user profiles by analyzing the behavior of users on an e-recruitment platform. Based on these profiles, a collaborative filtering recommender system is applied. This system only uses implicit


feedback data such as read-time and amount of visits to generate recommendations. This approach


is interesting since other researchers claim that job-recommendation is an intricate task that re-


quires taking into account the matching of several key properties such as the location of job seeker and job, education level, working experience,... [24, 25]. These studies take a reciprocal approach to job-recommendation, meaning that the interest of the job seeker in a specific vacancy is not sufficient, and that it should be combined with the interest the employer has in the job seeker. A prominent reciprocal job recommender is presented by Malinowski, Keim, Wendt, & Weitzel [26]. In existing research, job recommender systems use implicit feedback, often extracted from web-logs, explicit user- and vacancy descriptions, or a combination of both. These data, or combinations of data, are then applied using various recommendation algorithms, such as collaborative filtering [17], case-based reasoning [27], cluster-based techniques [28], etc. For a full survey on job recommender systems, we refer to Al-Otaibi & Ykhlef [29] and Siting, Wenxing, Ning, & Fan [30].

The research gap this paper aims to fill is an analysis of what data capture which parts of user interest, and how this impacts the quality of the recommended items. This question is particularly interesting for both researchers working on job recommendation and other similar high risk recommendation tasks, and organizations that are reluctant to move away from explicit feedback data


because of the assumption that other types of interest data will lead to inferior recommendations.


3. Methodology

To recap, this study focuses on the following two questions: 1) How does user interest expressed


by explicit data compare to user interest captured by implicit data sources, and 2) which data source leads to better performing vacancy recommender systems?


Question 1 looks at the difference in what job types a job seeker states to be interested in versus what vacancies (s)he is actually looking at. In order to answer this question, the HTTP-logs of


the website of the public employment service are compared with the explicit user profiles of the same job seekers. We will look at what proportion of looked at vacancies have a job type that


corresponds with a job type the user has explicitly stated to be interested in his/her profile. This comparison provides a basis to understand the answers to question 2. Knowing what part of user


interest is captured in what data is essential to understand systems that build upon this data.


Question 2 addresses the problem of selecting the appropriate data type in a vacancy recommender system. We have implemented two user-user collaborative filtering recommender systems based on


implicit feedback data and compare these systems with a knowledge-based recommender system based on explicit data. These three systems are representative for their respective recommender system types. In order to properly compare them, we first address the issue of evaluating the performance of a recommender system.

3.1. Evaluating vacancy recommender systems

Getting a good understanding of how well a recommender system performs in a given setting requires analyzing the following two aspects: a) how well do the recommendations align with the business goals of the party providing the recommendations (in our case the Flemish PES), and b) how useful are the recommendations for users of the system (in our case these are job seekers)? The goal of a vacancy recommender system for both the PES and the job seeker is helping people to find more vacancies that are of interest to them. In this case aspects a) and b) align, and can be evaluated in the same way. If other business requirements were added (e.g. adding a sponsorship model in which paying companies would get preferential treatment for their vacancies), the evaluation would be different for aspects a) and b). The case in which a) and b) do not align is not

discussed in this paper, but it is always key to consider why you as an organization provide users with recommendations (e.g. increasing profit vs. having people's best interest at heart) and tune


the recommender system to achieve these goals.


The most popular evaluation metrics capture how accurate a recommender system is in predicting how interesting an item will be for a user. Recommender accuracy is usually expressed using error


metrics (mean absolute error, mean squared error, root mean squared error, ...) that indicate how far off the predictions of the recommender system are from the known user-item interests[31]. For


example, a recommender system can have a mean absolute error of 0.5 stars in a 5-star rating system, meaning that the system’s predictions about the interest a user has in an item will be on


average 0.5 stars off from the actual value. The smaller this error, the better the recommender system predicts user-item interests.


As will be further discussed later on, the three recommender systems compared in our study each


use different types of user interest, so using evaluation metrics that are based on the deviation from the actual value are not good candidates. Instead of measuring how good the systems are at pre-


dicting the exact user-item interests, we will employ metrics that indicate how relevant the actual recommendations are for the user: recall@N and mean-reciprocal rank (MRR). These metrics are widely used in recommender system research [31, 7]. recall@N is the percentage of items, recommended by a top-N recommender system, that are part of the test set in a leave-X-out experiment. In such experiments, the available interest data is split in a disjoint training and test set. The test set consists of X data points and the training set of all the other data available in the original data set. The training set is used by the recommender system to generate recommendations for a user. These recommendations are compared to the items in the test set. If a high percentage of items in the top-N recommendations are present in the test set, which contains items we know the user has shown an interest for, the system can be considered good at predicting what items a user will like. Although often used together with recall@N, precision@N (which is equal to the proportion of recommended vacancies that are also in the test set) is not used in our evaluation. This because in our experiments the test set size is equal to N (both are 5), so precision@N and recall@N will result in the same value. MRR is a similar metric as recall@N, in the sense that it also indicates how well a system rec7

ommends items that we know a user likes. The MRR of a recommender system is the average position on which the first left-out item appears in the list of recommendations. An advantage over recall@N is that when no left-out items appear in the top-N recommendation, but at least one still


appears further down the list of recommendations, we can still differentiate between the quality of


two sets of recommendations. For example, imagine two sets of recommendations A and B, generated during a leave-1-out experiment. The left out item appears on place 99 in set A and place


15 in set B. If we would compare these two recommendations based on recall@5 (0 for both) they will both be considered equal, but MRR (99 and 15 respectively) allows us to differentiate between


both systems. However, MRR has as disadvantage that only the first occurrence of left-out items is accounted for, regardless of how far the second (and consequent) left-out items are positioned in


the list of recommendations. In our experiments, we use MRR for optimizing the system parame-


ters (since one of the systems has a constant recall@5 of 0) and recall@5 for inter-system comparison.
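To make these offline metrics concrete, the following minimal Python sketch computes recall@N and the rank of the first left-out item (the quantity averaged into the rank-based MRR variant used above) for a single leave-X-out user. The function names and example items are illustrative and are not taken from the original study.

    def recall_at_n(recommended, held_out, n=5):
        # Fraction of left-out (test set) items that appear in the top-n recommendations.
        return len(set(recommended[:n]) & set(held_out)) / len(held_out)

    def first_hit_rank(recommended, held_out):
        # 1-based position of the first left-out item in the full recommendation list,
        # or None if no left-out item is recommended at all.
        held = set(held_out)
        for rank, item in enumerate(recommended, start=1):
            if item in held:
                return rank
        return None

    # One leave-5-out user; recommendations are ordered by predicted interest.
    recommended = ["v12", "v7", "v3", "v99", "v15", "v42", "v8"]
    held_out = ["v42", "v3", "v77", "v5", "v61"]
    print(recall_at_n(recommended, held_out))      # 0.2 (only v3 is in the top 5)
    print(first_hit_rank(recommended, held_out))   # 3  (v3 is the first left-out item found)

Averaging first_hit_rank over all evaluated users yields the rank-based MRR used for parameter tuning, while averaging recall_at_n over users yields recall@5.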


When recommending vacancies one also needs to pay attention to several aspects that get more important in this specific domain, on top of metrics such as recall@5 and MRR.


Because this research works on automatically recommending vacancies by public employment services, recommendation risk is important. Recommendation risk is defined as the user tolerance for poor recommendations. In the case of employment services, job seekers that are unhappy with the recommendations are quick to complain, and discard the system as being untrustworthy. Risk can be mitigated by being able to explain to the user why (s)he received a specific set of recommendations. The choice of recommender systems in this context should therefore focus on algorithms that do not behave like a black box, unlike, for example, matrix factorization techniques. Both the user-user collaborative filtering and the knowledge-based recommender systems used in this research are good choices when explaining recommendations is important [32, 7]. Recommendation diversity should also be taken into account [33, 34]. Recommendation diversity refers to how dissimilar the recommended items are from each other. One purpose of vacancy recommendation can be to give job seekers an idea of what types of jobs they could apply for. Recommending a diverse set of vacancies can be an inspiration for job seekers. Diversity is measured in the experiments by counting the number of different job types the top-5 recommendations consist of. As observed in several studies [35, 33], offline evaluation metrics do not necessarily positively corre-

late with user appreciation of recommender systems, and should be interpreted with care. Because of this reason, an expert evaluation is set up in order to support the findings of the offline evaluation. Recommendations made by each of the three recommender systems used in this research will


be evaluated by experts from the employment services. Each recommended vacancy will be labeled


as either relevant or irrelevant for the user for whom recommendations are generated.
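As a small illustration of the diversity measure described above, the Python sketch below counts the distinct job types in a top-5 recommendation list; the vacancy identifiers and job types are hypothetical.

    def job_type_diversity(recommended, job_type_of, n=5):
        # Number of distinct job types among the top-n recommended vacancies.
        return len({job_type_of[v] for v in recommended[:n]})

    job_type_of = {"v1": "nurse", "v2": "nurse", "v3": "driver", "v4": "teacher", "v5": "driver"}
    print(job_type_diversity(["v1", "v2", "v3", "v4", "v5"], job_type_of))  # 3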


3.2. Recommender system set-up

This section first presents the detailed implementation of the two user-user collaborative fil-


tering recommender systems based on implicit feedback. These systems are similar to the system presented by Rafter & Smyth [17], and will be used in a comparison with a knowledge-based rec-


ommender system currently in use at the Flemish employment services. Next, the knowledge-based recommender system is described. The results of the comparison can be found in Section 4.


3.2.1. Collaborative filtering systems based on implicit feedback


The two versions of collaborative filtering applied in this research are standard user-user collaborative filtering (CF) [18] and one-class collaborative filtering (OCCF). OCCF is a special case


of CF in which the interest is either 1 (positive) or 0 (unknown). In standard CF interests can be either unknown (expressed as 0), or expressed as any number, allowing for different levels of interest. The choice of including OCCF in the comparison stems from the question whether it is possible to consistently capture different levels of positive or negative feedback from the implicit feedback data. It is possible that only the most simple representation of user interest (some positive feedback vs. unknown; 1 vs. 0) will result in a good recommender system, given the less robust nature of the implicit feedback data we are working with. The collaborative filtering framework used in the experiments is based on the framework proposed by Herlocker [18]. It consists of three steps:

• Step 1: calculate user interest for each vacancy
• Step 2: find similar neighbors
• Step 3: calculate predicted interest for unseen vacancies

The set-up and parameters of both systems are discussed below.
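These three steps can be read as a small pipeline. The following Python skeleton is a minimal sketch of how such a pipeline could be wired together, assuming the Step 1 interest scores are precomputed; the function names and data layout are illustrative and do not reflect the actual VDAB implementation.

    def recommend_top_n(target_user, scores, similarity_fn, predict_fn, n=5):
        # scores: dict user -> {vacancy: interest score}, the output of Step 1.
        # Step 2: compute a similarity to every other user (None plays the role of 'NA').
        neighbours = {}
        for user in scores:
            if user == target_user:
                continue
            sim = similarity_fn(scores[target_user], scores[user])
            if sim is not None:
                neighbours[user] = sim
        # Step 3: predict interest only for vacancies the target user has not seen yet.
        seen = set(scores[target_user])
        candidates = {v for user in neighbours for v in scores[user]} - seen
        predicted = {v: predict_fn(v, neighbours, scores) for v in candidates}
        return sorted(predicted, key=predicted.get, reverse=True)[:n]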


Step 1: Calculate user interest for each vacancy. The implicit feedback data used in the collaborative filtering systems come from web-logs of the employment services' website. The features related to a user u's interest in a vacancy v are listed below. These are based on interest indicators used


in existing recommender systems [16, 15, 17].


• TOTAL_READ_TIME: The total time a user looked at a specific vacancy. Read-time is


calculated as the difference in seconds between the HTTP-request to view the vacancy’s web page and the subsequent HTTP-request. If the amount of time between two subsequent clicks


is larger than 30 minutes we consider the two clicks to come from different user sessions and set this read-time to 'unknown' (a sketch of this computation is given after this feature list). This is a common cut-off for sessionization used in web


analytics [36]. Using HTTP-logs is a rudimentary approach to estimating total read-time but is the best available given the available data.


• NB_VISITS: The number of times a user visited a vacancy. This can occur multiple times


in the same session.


• SAVED: Whether the user saved the vacancy in his/her profile or not.
• APPLIED: Whether the user applied for this specific vacancy or not. This data is dependent on the user logging his/her job applications in the system.
• UNLIKED: Whether the user clicked a button indicating the specific vacancy is not to his/her liking.
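As referred to in the TOTAL_READ_TIME item above, the sketch below shows one way the read-time and visit-count features could be derived from timestamped click logs using the 30-minute session cut-off. The log layout and names are assumptions made for illustration only, not the actual VDAB log format.

    from collections import defaultdict

    SESSION_GAP = 30 * 60  # 30-minute sessionization cut-off, in seconds

    def extract_read_time_and_visits(clicks):
        # clicks: dict user -> list of (timestamp_in_seconds, vacancy_id), sorted by time.
        read_time = defaultdict(float)
        nb_visits = defaultdict(int)
        for user, events in clicks.items():
            for (t, vacancy), (t_next, _) in zip(events, events[1:]):
                nb_visits[(user, vacancy)] += 1
                gap = t_next - t
                if gap <= SESSION_GAP:          # otherwise the read-time is 'unknown'
                    read_time[(user, vacancy)] += gap
            if events:                          # the last click has no follow-up request
                nb_visits[(user, events[-1][1])] += 1
        return read_time, nb_visits

    # Hypothetical log for one user: v1, v2, v1, and v1 again after a long break.
    clicks = {"u1": [(0, "v1"), (120, "v2"), (300, "v1"), (7200, "v1")]}
    read_time, nb_visits = extract_read_time_and_visits(clicks)
    print(dict(read_time))   # {('u1', 'v1'): 120.0, ('u1', 'v2'): 180.0}
    print(dict(nb_visits))   # {('u1', 'v1'): 3, ('u1', 'v2'): 1}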

The implicit feedback factors above are transformed into one INTEREST_SCORE(u_i, v_j) that should express the interest of a user u_i in a specific vacancy v_j. This transformation is done using a linear weighting, as shown in Equation 1.

\[
INTEREST\_SCORE(u_i, v_j) =
\begin{cases}
\alpha \cdot TOTAL\_READ\_TIME(u_i, v_j) + \beta \cdot NB\_VISITS(u_i, v_j) + \gamma \cdot SAVED(u_i, v_j) + \delta \cdot APPLIED(u_i, v_j), & \text{if } UNLIKED(u_i, v_j) = 0 \\
\varepsilon, & \text{if } UNLIKED(u_i, v_j) = 1
\end{cases}
\tag{1}
\]


For the CF system, the variables α, β, γ, δ and ε are optimized using a grid search for the optimal MRR using leave-5-out cross-validation. Possible values for the variables were {-1000, -100, -5, -1, 0, 0.001, 0.01, 0.1, 1, 5, 10, 33, 50, 100, 500, 1000} as they cover different orders of magnitude and


the optimal parameters never lay on the outer boundaries (-1000 or 1000). Note that we implicitly


employ some feature selection by allowing a variable to have a 0 weight causing the rest of the


algorithm to ignore it. The resulting interest formula for the CF system can be seen in Equation 2.

\[
INTEREST\_SCORE_{CF}(u_i, v_j) =
\begin{cases}
0.01 \cdot TOTAL\_READ\_TIME(u_i, v_j) + 33 \cdot NB\_VISITS(u_i, v_j) + 100 \cdot SAVED(u_i, v_j) + 100 \cdot APPLIED(u_i, v_j), & \text{if } UNLIKED(u_i, v_j) = 0 \\
-100, & \text{if } UNLIKED(u_i, v_j) = 1
\end{cases}
\tag{2}
\]

Equation 3 shows the interest calculation of the OCCF recommender. In this system, interest is 1 when a user has a positive value for TOTAL_READ_TIME, NB_VISITS, SAVED or APPLIED, and 0 otherwise. INTEREST_SCORE_OCCF can be thought of as a simplified version of INTEREST_SCORE_CF that only looks at whether there was any indication of interest,

or not. Note that this system does not make use of the UNLIKED feature.

\[
INTEREST\_SCORE_{OCCF}(u_i, v_j) =
\begin{cases}
1, & \text{if } TOTAL\_READ\_TIME(u_i, v_j) > 0 \text{ or } NB\_VISITS(u_i, v_j) > 0 \text{ or } SAVED(u_i, v_j) > 0 \text{ or } APPLIED(u_i, v_j) > 0 \\
0, & \text{otherwise}
\end{cases}
\tag{3}
\]
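A minimal Python restatement of Equations 2 and 3, using the fitted weights reported above, could look as follows; the function names are illustrative and the feature values would come from data such as the log-derived features sketched earlier.

    def interest_score_cf(read_time, nb_visits, saved, applied, unliked):
        # Eq. (2): linear CF interest score with the weights found by the grid search.
        if unliked:
            return -100.0
        return 0.01 * read_time + 33 * nb_visits + 100 * saved + 100 * applied

    def interest_score_occf(read_time, nb_visits, saved, applied):
        # Eq. (3): one-class score, 1 as soon as any positive signal was observed.
        return 1 if (read_time > 0 or nb_visits > 0 or saved > 0 or applied > 0) else 0

    print(interest_score_cf(read_time=120.0, nb_visits=3, saved=1, applied=0, unliked=0))  # 200.2
    print(interest_score_occf(read_time=120.0, nb_visits=3, saved=1, applied=0))           # 1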

Step 2: Find similar neighbors. In the CF recommender system, similarity simCF (ui , uj ) between two users ui (the user the system is generating recommendations for) and uj (another user known


to the system) is calculated as the significance-weighted Pearson correlation coefficient of the


INTEREST_SCORE_CF on vacancies both users have a known interest for. The Pearson correlation coefficient is chosen because it is the most prominent similarity metric employed in user-user


collaborative filtering recommender systems. Significance weighting is incorporated both by a minimum overlap size ω, and a shrinking factor β [37] on the number of vacancies the two users both have a known INTEREST_SCORE_CF for. The calculation can be seen in Equation 4. The set of vacancies that have received implicit feedback from a user u_x is noted as V_{u_x} and the set of vacancies that have received implicit feedback from both users u_x and u_y is noted as V_{u_x,u_y} (= V_{u_x} ∩ V_{u_y}). r(V_{u_i,u_j}) denotes the Pearson correlation coefficient between INTEREST_SCORE_CF(u_i, V_{u_i,u_j}) and INTEREST_SCORE_CF(u_j, V_{u_i,u_j}).

\[
sim_{CF}(u_i, u_j) =
\begin{cases}
NA, & \text{if } |V_{u_i,u_j}| < \omega \\
r(V_{u_i,u_j}) \cdot \dfrac{|V_{u_i,u_j}|}{|V_{u_i,u_j}| + \beta}, & \text{otherwise}
\end{cases}
\tag{4}
\]

The parameters ω and β used in the experiments are optimized using a grid search for the optimal mean reciprocal rank using leave-5-out cross-validation. Possible values for ω were 1, 3, 10, 50 and 0, 1, 10, 50, 100 for β. The OCCF recommender system employs a weighted overlap similarity metric, which can be seen in Equation 5.

\[
sim_{OCCF}(u_i, u_j) =
\begin{cases}
NA, & \text{if } |V_{u_i,u_j}| < \omega \\
\dfrac{|V_{u_i,u_j}|}{|V_{u_i}|} \cdot \dfrac{|V_{u_i,u_j}|}{|V_{u_i,u_j}| + \beta}, & \text{otherwise}
\end{cases}
\tag{5}
\]


simOCCF is equal to the amount of vacancies ui and uj both have a known (positive) opinion on, divided by the amount of vacancies ui has given a known opinion on. Note that this is not a


symmetric metric: simOCCF (ui , uj ) is not always equal to simOCCF (uj , ui ). E.g. consider a set of vacancies Vu1 ,u2 two users u1 and u2 have both given an opinion on. If u1 has a smaller number


of other vacancies (s)he gave an opinion on (V_{u_1} − V_{u_1,u_2}), sim_OCCF(u_1, u_2) will be higher than sim_OCCF(u_2, u_1). In this case, we believe u_1 to be more informative for u_2 than vice versa. Similar


to Equation 4, Equation 5 uses a minimum overlap size ω and shrinking factor β, which are both optimized for mean reciprocal rank using leave-5-out cross-validation, using the same possible values


as for simCF .
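The two similarity metrics of Equations 4 and 5 can be sketched in Python as follows; the dictionary/set representation of user interests and the default values for ω and β (picked from the reported grids) are illustrative assumptions, not the tuned values.

    from math import sqrt

    def pearson(xs, ys):
        # Plain Pearson correlation coefficient; returns 0.0 for constant inputs.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sqrt(sum((x - mx) ** 2 for x in xs))
        sy = sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy) if sx and sy else 0.0

    def sim_cf(scores_i, scores_j, omega=3, beta=10):
        # Eq. (4): significance-weighted Pearson correlation on co-rated vacancies.
        # scores_i / scores_j: dict vacancy -> INTEREST_SCORE_CF of users u_i and u_j.
        overlap = sorted(set(scores_i) & set(scores_j))
        if len(overlap) < omega:
            return None  # the 'NA' case
        r = pearson([scores_i[v] for v in overlap], [scores_j[v] for v in overlap])
        return r * len(overlap) / (len(overlap) + beta)

    def sim_occf(seen_i, seen_j, omega=3, beta=10):
        # Eq. (5): asymmetric weighted overlap on sets of positively scored vacancies.
        overlap = seen_i & seen_j
        if len(overlap) < omega:
            return None
        return (len(overlap) / len(seen_i)) * (len(overlap) / (len(overlap) + beta))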


Step 3: calculate predicted interest for unseen vacancies. Using the user-user similarities, the systems can assign predicted interests to those vacancies that are seen by similar users, and unseen by


the user in question. This way users will not get recommendations for vacancies they already know about. Because of the different nature of similarity metrics used in the CF (correlation) and OCCF (overlap) system, predicted interests are calculated differently. Since the recommender systems will be evaluated using recall@5 and diversity as offline metrics, it suffices to look at what vacancies actually end up in the set of recommended vacancies, instead of looking at how well the systems predict the actual interests defined in Equations 2 and 3. It is thus not a problem that the two systems calculate this interest differently. Equation 6 shows the calculation of interest prediction for the CF system and Equation 7 shows how interest is calculated for the OCCF recommender.

\[
PREDICTED\_INTEREST_{CF}(u_i, v_j) = \sum_{u_x : sim_{CF}(u_i, u_x) \neq NA} sim_{CF}(u_i, u_x) \cdot \big( INTEREST\_SCORE_{CF}(u_x, v_j) - \mathrm{mean}(INTEREST\_SCORE_{CF}(u_x, V_{u_x})) \big)
\tag{6}
\]

The CF recommender calculates the predicted interest for a vacancy v_j as the sum of the deviations of the INTEREST_SCORE_CF for v_j from the mean INTEREST_SCORE_CF of each similar


user, multiplied by simCF (ui , ux ). The result is positive if the prediction is that ui will have a higher


than average interest in v_j, and negative if the prediction is that the interest will be lower than average. The vacancy v_j with the highest predicted interest PREDICTED_INTEREST_CF(u_i, v_j)


will be the first recommended vacancy.

\[
PREDICTED\_INTEREST_{OCCF}(u_i, v_j) = \sum_{u_x : sim_{OCCF}(u_i, u_x) \neq NA} sim_{OCCF}(u_i, u_x) \cdot INTEREST\_SCORE_{OCCF}(u_x, v_j)
\tag{7}
\]


The interest prediction for the OCCF recommender is the sum of similarities of similar users that


have a known opinion on v_j. Note that INTEREST_SCORE_OCCF(u_x, v_j) is 1 when u_x has a known opinion for v_j and 0 if not. Similar to the CF system, the vacancy v_j with the highest


predicted interest PREDICTED_INTEREST_OCCF(u_i, v_j) will be the first recommended vacancy.
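The prediction step of Equations 6 and 7 could be sketched in Python as follows; the neighbour and score containers mirror the illustrative structures used in the earlier sketches and are not the actual implementation.

    def predicted_interest_cf(vacancy, neighbours, scores):
        # Eq. (6): similarity-weighted sum of deviations from each neighbour's mean score.
        # neighbours: dict user -> sim_CF(u_i, user); scores: dict user -> {vacancy: score}.
        total = 0.0
        for user, sim in neighbours.items():
            if vacancy in scores[user]:
                mean_user = sum(scores[user].values()) / len(scores[user])
                total += sim * (scores[user][vacancy] - mean_user)
        return total

    def predicted_interest_occf(vacancy, neighbours, seen):
        # Eq. (7): sum of the similarities of neighbours with a known opinion on the vacancy.
        # seen: dict user -> set of vacancies with a positive signal.
        return sum(sim for user, sim in neighbours.items() if vacancy in seen[user])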

3.2.2. Knowledge-based system

This section describes the working of the knowledge-based (KB) recommender system used by the Flemish employment services. KB systems make use of expert knowledge, mostly on top of explicit user and item profiles. Both the features used and the interest calculation of this system are discussed. Note that not all the features nor the detailed working of the interest calculation are discussed, since these are subject to change on a weekly basis. The goal of this section is to give an overview of the general working of this system. Table 1 shows the features used in the KB system. Each feature is available in the user profiles and the vacancy profiles.

The predicted interest of a user u_i for a vacancy v_j is calculated as the weighted sum of the similarities of each of the user features with their vacancy counterpart. The calculation results in a positive number between 0 (lowest possible interest) and 1 (highest possible interest). The vacancies with the highest PREDICTED_INTEREST_KB(u_i, v_j) are recommended to the user.

Table 1: Features used by the KB recommender system

Feature          Job Seeker                 Vacancy
Location         Home address               Location of the job
Experience       Working experience         Required work experience
Competences      Mastered skill-set         Required skill-set
Certifications   Obtained certifications    Required certifications
Job type         Desired job type           Type of the job
...              ...                        ...

The concept behind the full calculation can be seen in Equation 8.

\[
PREDICTED\_INTEREST_{KB}(u_i, v_j) = \frac{\alpha \cdot sim_{LOCATION}(u_i, v_j) + \beta \cdot sim_{EXPERIENCE}(u_i, v_j) + \ldots}{MAX\_SIMILARITY}
\tag{8}
\]

The parameters α, β, · · · as well as the similarity metrics for each feature are defined using expert-


knowledge and are manually tuned. On top of the expert-based definition of similarity weights, several expert rules are applied. These rules define threshold values that PREDICTED_INTEREST_KB(u_i, v_j) and sim_FEATURE(u_i, v_j) must exceed in order for a vacancy v_j to be considered recommendable to a user u_i. Examples of such rules are that PREDICTED_INTEREST_KB(u_i, v_j) must exceed 0.6, or that the location of the job must be less than 60 km away from the home address of the job seeker. Weekly expert tests on this system are performed in order to detect undesired behavior.
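A minimal sketch of this knowledge-based scoring, assuming each per-feature similarity returns a value in [0, 1] and that MAX_SIMILARITY equals the sum of the weights, is given below; apart from the 0.6 score threshold and the 60 km rule mentioned above, all names, weights and thresholds are illustrative.

    def predicted_interest_kb(user, vacancy, feature_sims, weights):
        # Eq. (8): weighted sum of per-feature similarities, normalised to [0, 1].
        # feature_sims: dict feature -> function(user, vacancy) returning a similarity in [0, 1].
        max_similarity = sum(weights.values())
        score = sum(weights[f] * feature_sims[f](user, vacancy) for f in weights)
        return score / max_similarity

    def recommendable(user, vacancy, feature_sims, weights, distance_km,
                      min_score=0.6, max_distance_km=60):
        # Expert rules of the kind described above: a hard distance threshold plus a
        # minimum overall predicted interest.
        if distance_km > max_distance_km:
            return False
        return predicted_interest_kb(user, vacancy, feature_sims, weights) >= min_score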

4. Results

This section presents the results of two analyses: 1) the comparison of the search behavior of job seekers with their explicit profiles and 2) a benchmarking of the three recommender systems described in Section 3 by looking at diversity, recall@5 and an expert evaluation. A discussion on the results presented here can be found in Section 5.1. The data used in these experiments are from all users who have logged in to the employment services' website between January 1st 2014 and March 1st 2015. This results in a dataset with 205,638 job seekers and 863,038 vacancies.

Figure 1: Distribution of proportion of looked at vacancies with job type not covered by user profile. [Bar chart; y-axis: average number of vacancies looked at (0-30); bars: explicitly stated vs. not explicitly stated job type.]

This study is limited to logged-in users since this is, given the data, the only way to link the


browse-behavior to user profiles.


4.1. User interest: explicit profiles vs. implicit feedback

This research considers two types of data to capture user interest: user profiles, filled in by


job seekers, and implicit feedback based on (click-)behavior on the employment services’ website. The first analysis is a comparison between the interests explicitly specified in user profiles with the interest induced from a user’s search behavior. To this end, desired job types stated in the explicit profiles are compared to the job types corresponding to the vacancies a user looked at. On average, a job seeker has 2.8 job types indicated as interesting in his/her profile, and looked at vacancies from 8.6 different job types. Figure 1 shows the average number of vacancies a person looked at that have a job type that (s)he indicated to be an interesting job type in his/her profile compared to the number of vacancies looked at with a job type not explicitly expressed to be interesting. On average a job seeker in our dataset looks at 29.1 vacancies of which 6.5 (22.3%) have a job type that (s)he explicitly states to be interesting for him/her. This leaves on average 22.6 (77.7%) vacancies looked at by the job seeker not explained by the explicit interest indicators using his/her profile. Figure 2 shows the amount of job seekers that have a specific percentage of non-coverage by their explicit profiles. Notice the big spike for users for whom 100% of the vacancies they looked at had a job type which they have not explicitly expressed interest for.
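The comparison above amounts to computing, per job seeker, the share of viewed vacancies whose job type also occurs in the explicit profile. A minimal Python sketch of that computation, with hypothetical identifiers, is:

    def profile_coverage(viewed_vacancies, desired_job_types, job_type_of):
        # Share of viewed vacancies whose job type appears in the explicit profile.
        covered = sum(1 for v in viewed_vacancies if job_type_of[v] in desired_job_types)
        return covered / len(viewed_vacancies)

    job_type_of = {"v1": "nurse", "v2": "driver", "v3": "teacher", "v4": "nurse"}
    print(profile_coverage(["v1", "v2", "v3", "v4"], {"nurse", "clerk"}, job_type_of))  # 0.5

Averaged over all job seekers in the dataset, this coverage corresponds to the 22.3% reported above; its complement is the non-coverage shown in Figure 2.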

Figure 2: Proportion of looked at vacancies covered by explicitly defined interest vs. not covered by explicit interest. [Histogram; x-axis: percentage of seen vacancies (0.0-1.0); y-axis: number of job seekers.]

4.2. Comparison of recommender systems

This section describes the results of the evaluation of the three recommender systems described earlier: CF, OCCF and KB. First, the offline evaluation is discussed, followed by the results of the expert testing.

4.2.1. Offline evaluation

During the offline evaluation, 100 random job seekers that had looked at a minimum of 50 vacancies were selected and leave-5-out evaluation was performed for the top-5 recommendation for each of the three systems. The systems are evaluated based on recall@5 and diversity in job types of the recommended vacancies. Figure 3 shows the recall@5 of CF, OCCF and the KB systems. There is a clear distinction in terms of recall@5, with OCCF being the better system having a recall@5 of 9.0%, CF having a recall@5 of 0.0% (in all 100 top-5 recommendations not a single left-out vacancy is recommended) and KB having a recall@5 equal to 2.0%. As mentioned in Section 3.1, diversity can be especially important when recommending vacancies. Figure 4 shows the average number of different job types belonging to the vacancies in the top-5 recommendations. The collaborative-filtering based systems (CF and OCCF) clearly provide more diverse recommendations than the KB system. This falls within what could be expected, since both CF and OCCF base themselves on the search behavior, which is wider than the explicit profiles, as shown in Section 4.1. The diversity of CF and OCCF is similar, with the diversity of CF being slightly higher.

Figure 3: Recall@5 of KB, CF and OCCF. [Bar chart; y-axis: recall@5 (0.00-0.10); x-axis: recommender system (KB, CF, OCCF).]

Figure 4: Average number of different job types in top-5 recommendations. [Bar chart; y-axis: average number of different job types (0-5); x-axis: recommender system (KB, CF, OCCF).]

Figure 5: Expert appreciation of top-5 recommendation. [Bar chart; y-axis: average expert appreciation (%); x-axis: recommender system (KB, CF, OCCF).]

4.2.2. Expert testing

We perform expert testing of the three recommender systems to supplement the offline metrics.


Over the course of two afternoon sessions, nine domain experts from the Flemish PES were presented


with the 3 different top-5 recommendations (one for each system discussed in this study). During each session, they evaluated each recommended vacancy for 30 random users to be either suitable


or not suitable given the job seeker’s search behavior. The experts were not informed which top-5 set was generated by which algorithm in order to remove bias towards the KB system currently in place. The experts all had extensive (multiple years) experience in helping job seekers find a suitable job.

The result of this evaluation can be seen in Figure 5. This figure shows the average appreciation of the experts for the recommended vacancies. Appreciation is the percentage of recommended vacancies that were deemed appropriate by the domain experts, given the job seekers’ previous search behavior. The appreciation of OCCF is the highest (68%), followed by KB (49%) and CF (16%). These numbers support our findings from the offline evaluation: OCCF outperforms KB, and CF offers the worst vacancy recommendations, in terms of recall@5 and expert appreciation. Having nine experts from a PES confirm our offline results is extremely valuable qualitative support for our conclusions, given that offline results alone are not always confirmed by user testing[38]. Our main results are summarized in Table 2.

Table 2: Summary of key results

        recall@5   diversity   expert appreciation
CF      0%         2.6         16%
OCCF    9%         2.5         68%
KB      2%         1.3         49%


5. Discussion and future work


5.1. Discussion

The two main research questions in this study were: (1) are job seekers actively looking for the types of jobs they state to be interested in based on their explicit user profiles and (2) what is the difference in vacancy recommendation performance between systems using explicit data versus systems using implicit feedback data?

We answered question (1) by comparing job seekers' explicitly defined desired job types with the


job types of vacancies actually looked at. The results of this comparison clearly indicate that there


is a big discrepancy between both data types. This comparison only considers the job type feature, but clearly shows that the vacancies a job seeker claims to be interested in are often different from


the vacancies he/she spends time looking at. Using only a job seeker’s explicitly defined interests will cause loss of a great part of the total interest. This has negative implications for KB systems that are exclusively based on this type of data. Vacancies with a job type not explicitly stated as interesting will be penalized by the knowledge-based system and will not be included in the recommendations.

In order to answer question (2), we performed a benchmarking study that compared the performance of two user-user collaborative filtering systems based on implicit feedback with a knowledge-based system based on explicit user profiles. Predictive performance was measured using recall@5, diversity of the recommendations was measured using the average number of different job types in top-5 recommendations, and relevance of the recommended vacancies was captured using expert testing. Our results do not allow to clearly prioritize one type of system over the other. Of the two representatives using implicit feedback one performs better and one performs worse than the algorithm using explicit profiles, both in terms of recall@5 and expert appreciation. The poor recall@5 for the KB recommender can be explained by the results related to question (1). This system is exclusively based on explicit user profiles and we have shown that user profiles miss a big 20

portion of a job seeker's interest. The extremely poor performance of the CF system (recall@5 = 0) is however a surprise. We expect that the mean-centered nature of CF, in which a prediction is made of whether interest will be above or below the mean interest, is the cause of this bad performance.


Inducing different gradations of user interest from the available data apparently degrades perfor-


mance tremendously, and a simpler interest calculation, such as in the OCCF system, performs better for job recommendation. However, looking at other interest calculation techniques (both for


CF and OCCF), such as also incorporating the UNLIKED feature in the OCCF algorithm, is an interesting future research track that could improve recommendation performance.


In terms of diversity of top-5 recommendations, we see that both collaborative filtering systems outperform the KB system. However, it is important to note that high diversity in recommenda-


tions alone is no guarantee for the overall quality of a recommender system. E.g. even though the CF system has the highest recommendation diversity, the predictive quality and recommendation


relevance are not sufficient for the system to be considered useful.


To summarize, the contributions of this research are the following: • We provide new insights in the trade-off between using explicit or implicit feedback data in


job recommender systems by comparing user-made profiles with implicit interest extracted from web-logs on the employment services’ vacancy search tool. • As part of our evaluation, we develop two new collaborative filtering vacancy recommender systems for the Flemish Public employment service (VDAB), using HTTP-logs (collected during the period January 2014-March 2015) covering clicks of over 190,000 unique job seekers on over 800,000 unique vacancies. The collaborative recommender algorithms we employ are standard user-user collaborative filtering and one-class user-user collaborative filtering. These approaches differ in how user interest is induced from implicit data, and we show that there is a big performance difference in vacancy recommendation between these two methods. • We benchmark these systems with a knowledge-based recommender system built by the Flemish employment services via both offline and expert testing. Our results indicate that the relatively simple OCCF recommender system relying on implicit feedback data outperforms extensive knowledge-based recommender systems that exclusively use


explicit profiles. We believe that our conclusions not only have an impact on job recommender systems but that a similar study is warranted in all high risk recommendation domains where the recommendation provider sticks to explicit profiles out of fear of presenting poor recommendations.


Prominent examples of this type of domains are date-, house- and financial recommendation.


5.2. Limitations and future work


The first limitation of our study is related to data availability. We only use the behavioral data from logged-in job seekers. In the current study, it was only possible to link the behavioral data


with user profiles for HTTP-requests that occurred when the job seeker was logged in. This has as drawback that we do not take into account the data from periods during which someone was


browsing the website while not being logged in. This could cause us to only capture parts of a job seeker’s vacancy interests through his/her click behavior. However, we do not see any reason why vacancies looked at while not being logged in would consistently be of different job types than


those looked at when being logged in. The impact of this bias on our findings will be limited. A


way to bypass this potential bias could be to implement a cookie-based system that allows clicks


to be linked to specific users even when they are not logged in. Another bias is potentially introduced by our evaluation strategy. The offline results are supplemented with expert testing, rather than end user testing. The validity of this strategy is based on the assumption that the experts are a good judge of what recommendations are (not) valid for the end user. This assumption is based on the fact that the experts in our evaluation panel have years of experience in guiding job seekers to relevant work, giving them a strong basis in judging vacancy relevance. However, it is impossible to claim that this assumption will be valid for each expert-end user evaluation pair, which could bias our results. An example bias could be that the experts might prefer vacancies the job seeker has a decent chance of successfully applying for over highly ambitious jobs for which the job seeker is unlikely to successfully apply. Next, we see two general recommender system challenges which are not explicitly tackled in our study: timeliness of ratings and cold-start. Timeliness of the ratings refers to the observation that interest shown a long time ago could be of less importance than interest shown more recently. As a solution for this, a time decay factor in the interest score calculation could be added. Solutions for the cold-start problem can lie in offering a search engine to let users actively look for vacancies


themselves. Once enough implicit vacancy interest is gathered, collaborative recommender systems can start recommending vacancies. In any case, further improving on the implicit feedback-based recommender systems (CF and OCCF) would only strengthen our conclusions about the merit of


implicit feedback compared to explicit feedback.


Since the goal of this research is to compare implicit feedback-based systems with knowledge-based systems no extensive benchmarking with other techniques is performed. An avenue for future re-


search is to enrich the benchmarking by using other implicit feedback-based recommender systems, such as those presented in [39][16]. Preliminary experiments with matrix factorization techniques


show that we can further improve on the recall@5 of the basic OCCF recommender system presented in this paper. However, this technique has as drawback that its recommendations are harder


to explain. In any case, it is clear that job recommender systems can be designed that outperform the systems compared in our study.


The recommendations from our study were formulated from a job seeker’s perspective. However, a


big part of the job search process is being considered an interesting candidate by a potential employer. Analyzing interest from the employer’s side to see how job seeker recommendation would


differ from job recommendation would be an interesting follow-up study, and could prove to be valuable for organizations that are recruiting new employees. Building further upon this idea, investigating how to combine a vacancy recommender system with a job seeker recommender system into a reciprocal recommender system could provide improvements to the recommender system quality.

Vacancy recommendation differs from recommendation in e-commerce (e.g. book recommendation, movie recommendation, ...) in the sense that (usually) a person can only fill one vacancy, and a vacancy can only be filled by one person. Several aspects, such as making sure as many vacancies as possible can find candidates, and as many job seekers as possible can find vacancies they are interested in, are interesting research tracks. Inspiration for such research could be found in the online dating context in which similar challenges appear when developing date recommender systems [40]. As a last piece of future work we look towards hybrid recommender systems. We have shown that implicit feedback data is extremely valuable for job recommendation. Systems based exclusively on this data can outperform a system based exclusively on explicit user profiles. It would be very interesting to see how hybrid approaches, combining both datasets (e.g. based on proven approaches in date recommendation [41]), could be used to further improve job recommendations.


Funding: This research was funded by the VDAB research chair on career management analytics.


References

SC

[1] Arvastat, vacatures, basisstatistieken, https://arvastat.vdab.be/arvastat basisstatistieken vacatures.html,

NU

[Online; accessed: 2015-12-26].

[2] D. H. Lee, P. Brusilovsky, Fighting information overflow with personalized comprehensive

MA

information access: A proactive job recommender, in: Autonomic and Autonomous Systems, 2007. ICAS07. Third International Conference on, IEEE, 2007, pp. 21–21. [3] P. Resnick, H. R. Varian, Recommender systems, Communications of the ACM 40 (3) (1997)

TE

D

56–58.

[4] Y. Li, C. Wu, C. Lai, A social recommender mechanism for e-commerce: Combining similarity,

AC CE P

trust, and relationship, Decision Support Systems 55 (3) (2013) 740–752. doi:10.1016/j. dss.2013.02.009.

URL http://dx.doi.org/10.1016/j.dss.2013.02.009 [5] R. S. Garfinkel, R. D. Gopal, A. K. Tripathi, F. Yin, Design of a shopbot and recommender system for bundle purchases, Decision Support Systems 42 (3) (2006) 1974–1986. doi:10. 1016/j.dss.2006.05.005.

URL http://dx.doi.org/10.1016/j.dss.2006.05.005 [6] B. Ma, Q. Wei, Measuring the coverage and redundancy of information search services on ecommerce platforms, Electronic Commerce Research and Applications 11 (6) (2012) 560–569. doi:10.1016/j.elerap.2012.09.001. URL http://dx.doi.org/10.1016/j.elerap.2012.09.001 [7] F. Ricci, L. Rokach, B. Shapira (Eds.), Recommender Systems Handbook, Springer, 2015. doi:10.1007/978-1-4899-7637-6. URL http://dx.doi.org/10.1007/978-1-4899-7637-6 24

ACCEPTED MANUSCRIPT [8] K. Zhang, A. M. Ouksel, S. Fan, H. Liu, Scalable audience targeted models for brand advertising on social networks, in: Eighth ACM Conference on Recommender Systems, Rec-

doi:10.1145/2645710.2645763.

RI

URL http://doi.acm.org/10.1145/2645710.2645763

PT

Sys ’14, Foster City, Silicon Valley, CA, USA - October 06 - 10, 2014, 2014, pp. 341–344.

SC

[9] G. Jawaheer, M. Szomszor, P. Kostkova, Comparison of implicit and explicit feedback from an online music recommendation service, in: proceedings of the 1st international workshop on

NU

information heterogeneity and fusion in recommender systems, ACM, 2010, pp. 47–51. [10] D. W. Oard, J. Kim, et al., Implicit feedback for recommender systems, in: Proceedings of the

MA

AAAI workshop on recommender systems, 1998, pp. 81–83. [11] M. Galesic, M. Bosnjak, Effects of questionnaire length on participation and indicators of

D

response quality in a web survey, Public Opinion Quarterly 73 (2) (2009) 349–360.

TE

[12] H. Wang, Q. Wei, G. Chen, From clicking to consideration: A business intelligence approach

AC CE P

to estimating consumers’ consideration probabilities, Decision Support Systems 56 (2013) 397– 405. doi:10.1016/j.dss.2012.10.052. URL http://dx.doi.org/10.1016/j.dss.2012.10.052 [13] D. Kelly, J. Teevan, Implicit feedback for inferring user preference: a bibliography, Vol. 37, 2003, pp. 18–28. doi:10.1145/959258.959260. URL http://doi.acm.org/10.1145/959258.959260 [14] J. Bauer, A. Nanopoulos, Recommender systems based on quantitative implicit customer feedback, Decision Support Systems 68 (2014) 77–88. doi:10.1016/j.dss.2014.09.005. URL http://dx.doi.org/10.1016/j.dss.2014.09.005 [15] E. R. N´ un ˜ez-Vald´ez, J. M. C. Lovelle, O. S. Mart´ınez, V. Garc´ıa-D´ıaz, P. O. de Pablos, C. E. M. Mar´ın, Implicit feedback techniques on recommender systems applied to electronic books, Computers in Human Behavior 28 (4) (2012) 1186–1193. doi:10.1016/j.chb.2012.02.001. URL http://dx.doi.org/10.1016/j.chb.2012.02.001

25

[16] Y. Hu, Y. Koren, C. Volinsky, Collaborative filtering for implicit feedback datasets, in: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15-19, 2008, Pisa, Italy, 2008, pp. 263–272. doi:10.1109/ICDM.2008.22. URL http://dx.doi.org/10.1109/ICDM.2008.22

[17] R. Rafter, B. Smyth, Passive profiling from server logs in an online recruitment environment, in: Workshop on Intelligent Techniques for Web Personalization at the 17th International Joint Conference on Artificial Intelligence, Seattle, Washington, USA, August, 2001, 2001.

[18] J. L. Herlocker, J. A. Konstan, A. Borchers, J. Riedl, An algorithmic framework for performing collaborative filtering, in: SIGIR ’99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 15-19, 1999, Berkeley, CA, USA, 1999, pp. 230–237. doi:10.1145/312624.312682. URL http://doi.acm.org/10.1145/312624.312682

[19] P. Lops, M. de Gemmis, G. Semeraro, Content-based recommender systems: State of the art and trends, in: Recommender Systems Handbook, 2011, pp. 73–105.

[20] R. Burke, Integrating knowledge-based and collaborative-filtering recommender systems, in: Proceedings of the Workshop on AI and Electronic Commerce, 1999, pp. 69–72.

[21] D. Kluver, J. A. Konstan, Evaluating recommender behavior for new users, in: Eighth ACM Conference on Recommender Systems, RecSys ’14, Foster City, Silicon Valley, CA, USA - October 06 - 10, 2014, 2014, pp. 121–128. doi:10.1145/2645710.2645742. URL http://doi.acm.org/10.1145/2645710.2645742

[22] B. M. Sarwar, G. Karypis, J. A. Konstan, J. Riedl, Item-based collaborative filtering recommendation algorithms, in: Proceedings of the Tenth International World Wide Web Conference, WWW 10, Hong Kong, China, May 1-5, 2001, 2001, pp. 285–295. doi:10.1145/371920.372071. URL http://doi.acm.org/10.1145/371920.372071

[23] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, A. Hanjalic, CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering, in: Sixth ACM Conference on Recommender Systems, RecSys ’12, Dublin, Ireland, September 9-13, 2012, 2012, pp. 139–146. doi:10.1145/2365952.2365981. URL http://doi.acm.org/10.1145/2365952.2365981

[24] T. Keim, Extending the applicability of recommender systems: A multilayer framework for matching human resources, in: 40th Hawaii International Conference on System Sciences (HICSS-40 2007), CD-ROM / Abstracts Proceedings, 3-6 January 2007, Waikoloa, Big Island, HI, USA, 2007, p. 169. doi:10.1109/HICSS.2007.223. URL http://dx.doi.org/10.1109/HICSS.2007.223

[25] J. Malinowski, T. Weitzel, T. Keim, Decision support for team staffing: An automated relational recommendation approach, Decision Support Systems 45 (3) (2008) 429–447. doi:10.1016/j.dss.2007.05.005. URL http://dx.doi.org/10.1016/j.dss.2007.05.005

[26] J. Malinowski, T. Keim, O. Wendt, T. Weitzel, Matching people and jobs: A bilateral recommendation approach, in: 39th Hawaii International Conference on System Sciences (HICSS-39 2006), CD-ROM / Abstracts Proceedings, 4-7 January 2006, Kauai, HI, USA, 2006. doi:10.1109/HICSS.2006.266. URL http://dx.doi.org/10.1109/HICSS.2006.266

[27] K. Bradley, R. Rafter, B. Smyth, Case-based user profiling for content personalisation, in: Adaptive Hypermedia and Adaptive Web-Based Systems, International Conference, AH 2000, Trento, Italy, August 28-30, 2000, Proceedings, 2000, pp. 62–72. doi:10.1007/3-540-44595-1_7. URL http://dx.doi.org/10.1007/3-540-44595-1_7

[28] W. Hong, S. Zheng, H. Wang, J. Shi, A job recommender system based on user clustering, JCP 8 (8) (2013) 1960–1967. doi:10.4304/jcp.8.8.1960-1967. URL http://dx.doi.org/10.4304/jcp.8.8.1960-1967

[29] S. T. Al-Otaibi, M. Ykhlef, A survey of job recommender systems, International Journal of the Physical Sciences 7 (29) (2012) 5127–5142.

[30] Z. Siting, H. Wenxing, Z. Ning, Y. Fan, Job recommender systems: a survey, in: Computer Science & Education (ICCSE), 2012 7th International Conference on, IEEE, 2012, pp. 920–924.

[31] J. L. Herlocker, J. A. Konstan, L. G. Terveen, J. Riedl, Evaluating collaborative filtering recommender systems, ACM Trans. Inf. Syst. 22 (1) (2004) 5–53. doi:10.1145/963770.963772. URL http://doi.acm.org/10.1145/963770.963772

[32] J. L. Herlocker, J. A. Konstan, J. Riedl, Explaining collaborative filtering recommendations, in: CSCW 2000, Proceeding on the ACM 2000 Conference on Computer Supported Cooperative Work, Philadelphia, PA, USA, December 2-6, 2000, 2000, pp. 241–250. doi:10.1145/358916.358995. URL http://doi.acm.org/10.1145/358916.358995

[33] S. M. McNee, J. Riedl, J. A. Konstan, Being accurate is not enough: how accuracy metrics have hurt recommender systems, in: Extended Abstracts Proceedings of the 2006 Conference on Human Factors in Computing Systems, CHI 2006, Montréal, Québec, Canada, April 22-27, 2006, 2006, pp. 1097–1101. doi:10.1145/1125451.1125659. URL http://doi.acm.org/10.1145/1125451.1125659

[34] C. Ziegler, S. M. McNee, J. A. Konstan, G. Lausen, Improving recommendation lists through topic diversification, in: Proceedings of the 14th international conference on World Wide Web, WWW 2005, Chiba, Japan, May 10-14, 2005, 2005, pp. 22–32. doi:10.1145/1060745.1060754. URL http://doi.acm.org/10.1145/1060745.1060754

[35] F. Garcin, B. Faltings, O. Donatsch, A. Alazzawi, C. Bruttin, A. Huber, Offline and online evaluation of news recommender systems at swissinfo.ch, in: Eighth ACM Conference on Recommender Systems, RecSys ’14, Foster City, Silicon Valley, CA, USA - October 06 - 10, 2014, 2014, pp. 169–176. doi:10.1145/2645710.2645745. URL http://doi.acm.org/10.1145/2645710.2645745

[36] P. Patel, M. Parmar, Improve heuristics for user session identification through web server log in web usage mining, International Journal of Computer Science and Information Technologies 5 (3) (2014) 3562–3565.

[37] R. M. Bell, Y. Koren, C. Volinsky, Modeling relationships at multiple scales to improve accuracy of large recommender systems, in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007, 2007, pp. 95–104. doi:10.1145/1281192.1281206. URL http://doi.acm.org/10.1145/1281192.1281206

[38] M. Rossetti, F. Stella, M. Zanker, Contrasting offline and online results when evaluating recommendation algorithms, in: Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, September 15-19, 2016, 2016, pp. 31–34. doi:10.1145/2959100.2959176. URL http://doi.acm.org/10.1145/2959100.2959176

[39] P. Gopalan, J. M. Hofman, D. M. Blei, Scalable recommendation with hierarchical Poisson factorization, in: Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, UAI 2015, July 12-16, 2015, Amsterdam, The Netherlands, 2015, pp. 326–335.

[40] L. Li, T. Li, MEET: a generalized framework for reciprocal recommender systems, in: 21st ACM International Conference on Information and Knowledge Management, CIKM’12, Maui, HI, USA, October 29 - November 02, 2012, 2012, pp. 35–44. doi:10.1145/2396761.2396770. URL http://doi.acm.org/10.1145/2396761.2396770

[41] Y. Park, An adaptive match-making system reflecting the explicit and implicit preferences of users, Expert Syst. Appl. 40 (4) (2013) 1196–1204. doi:10.1016/j.eswa.2012.08.019. URL http://dx.doi.org/10.1016/j.eswa.2012.08.019

Michael Reusens holds a Bachelor in Informatics and a Master in Computer Science Engineering from KU Leuven (Belgium). In September 2014, he started as a PhD researcher with Prof. Lemahieu and Prof. Baesens at the Department of Decision Sciences and Information Management. His main research topics include Recommender Systems and User Modelling, with a focus on job search.

Wilfried Lemahieu holds a PhD in applied economic sciences from KU Leuven. At present, he is an Associate Professor in the Department of Decision Sciences and Information Management of the same university. He conducts research in software engineering, business process management, database management and web-based systems. He is also a co-founder and board member of the BPM-Forum Belgium.

Bart Baesens is an Associate Professor at KU Leuven (Belgium) and a lecturer at the University of Southampton (United Kingdom). He has done extensive research on predictive analytics, data mining, customer relationship management, fraud detection, and credit risk management. His findings have been published in well-known international journals and presented at leading international conferences. He is also co-author of the book Credit Risk Management: Basic Concepts, published in 2008.

Luc Sels is a full professor and dean of the Faculty of Economics and Business at KU Leuven (Belgium), honorary professor at Cardiff Business School, and part-time professor at the Simon Business School.

Highlights

• Users’ interests can be captured using both explicit and implicit data.
• Our experiments on job search data show that implicit data covers a broader spectrum of user interests.
• Job recommenders using implicit feedback data provide more predictive and diverse recommendations.
