Mining personality traits from social messages for game recommender systems



Knowledge-Based Systems xxx (xxxx) xxx

Contents lists available at ScienceDirect

Knowledge-Based Systems journal homepage: www.elsevier.com/locate/knosys

Mining personality traits from social messages for game recommender systems ∗

Hsin-Chang Yang, Zi-Rui Huang
National University of Kaohsiung, Kaohsiung, Taiwan

Article info

Article history:
Received 20 April 2018
Received in revised form 7 July 2018
Accepted 18 November 2018
Available online xxxx

Keywords: Personality trait; Recommender system; Game recommendation; Text mining; Five Factor Model

Abstract

Recommender systems for various types of resources have recently received much attention, driven by the need to find interesting resources in huge collections such as the World Wide Web or social network services. An emerging branch of recommender systems recommends resources to users according to their personality traits and has shown promising results. In this work, we propose an approach for recommending computer games to players according to their identified personality traits. We first applied text mining to textual content related to the players to identify their personality traits using the Five Factor Model. The same personality recognition process was also applied to content related to games. Games whose personality traits were similar to a player's were then recommended to that player. We performed experiments on 63 players and 2050 games with data collected from Steam and obtained satisfactory results. © 2018 Elsevier B.V. All rights reserved.

1. Introduction

Nowadays, social network services have become active and primary channels for social activities. People rely on social services to communicate and interact with others in order to share information and opinions. More and more people obtain information about things they are interested in from social services. For example, many people consult user reviews on TripAdvisor1 when arranging their itineraries. Social services, in turn, recommend profitable resources to people according to their needs in order to increase user satisfaction and the chance of purchase. However, such recommendations may harm the user experience if the recommended resources are not interesting to the users. Research on recommender systems has thus been prevalent in the last decades.

People tend to visit social services related to their interests for information gathering and opinion sharing. Social services commonly provide bulletins or forums that allow users to post comments or reviews on specific resources in addition to giving ratings. For example, TripAdvisor allows users to leave comments on hotels they have visited. Another example is the video sharing site YouTube, which also allows viewers to add comments on videos. These social messages, such as comments or reviews, provide insightful information about the resources. A lot of semantically relevant knowledge could be extracted from such messages if proper procedures were applied.

Texts related to a resource can be divided into two types according to their sources, namely resource-related texts and user-related texts. Resource-related texts, e.g. product information or reviews, are provided by the producers and generally carry a great deal of semantic information about the resources. Recommender systems could benefit from discovering and applying such information, and many content-based recommender systems have adopted it to find relevant resources [1–5]. User-related texts, e.g. comments and reviews, are generated by users to express their opinions on specific resources. This user-generated content provides semantic information about not only the resources but also the users. A major difference between the two types is that user-related texts are often mingled with emotions, which may reflect the personalities of the users [6,7]. Analyzing personality traits from user-generated texts has thus drawn attention in the past decade [8–10].

Personality affects many aspects of life, such as people's behavior and interests. There is a high potential that incorporating users' characteristics into recommender systems could enhance recommendation quality and user experience [11,12]. Personality-based recommender systems have been developed by several researchers in the last decade. The idea behind such systems is simple: people with similar personality traits should have similar interests in various aspects, including hobbies, favorite genres of music and movies, and so on. Therefore, recommendations can be made by identifying the similarity between users' personality traits. An example is the TWIN system [13], which recommended hotels to people with similar personalities using data collected from TripAdvisor.

∗ Corresponding author. E-mail addresses: [email protected] (H.-C. Yang), [email protected] (Z.-R. Huang).
1 http://www.tripadvisor.com.

https://doi.org/10.1016/j.knosys.2018.11.025 0950-7051/© 2018 Elsevier B.V. All rights reserved.

Please cite this article as: H.-C. Yang and Z.-R. Huang, Mining personality traits from social messages for game recommender systems, Knowledge-Based Systems (2018), https://doi.org/10.1016/j.knosys.2018.11.025.


There are many ways to describe the personality of human beings. Trait-based approaches use a set of traits (features) to predict a person's behavior. Many types of traits have been proposed, such as the Five Factor Model, Eysenck's traits, Cattell's traits, and Cloninger's temperament and character traits [14]. Among these, the Five Factor Model (FFM), or Big Five model, has been widely accepted since it is assumed to represent the basic structure behind all personality traits [15]. The FFM identifies five traits, namely openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism, to describe the personality of a person. These five traits are often represented by the acronyms OCEAN or CANOE. To measure the OCEAN traits of a person, instruments such as the NEO PI-R [16] are generally adopted. A more recent approach relies on text analysis techniques to identify Big Five traits automatically: the linguistic terms used by a person are categorized and analyzed to reveal his or her Big Five traits. Evidence shows that such an approach may provide convincing results in personality trait recognition [8,9,17–21].

Nowadays, users have thousands of computer games to choose from, which makes it difficult for them to find games that fit their interests. For example, there were 14,373 action games available on the well-known gaming platform Steam.2 It is rather difficult to recommend games to users from such a huge number of games in the same genre. Game recommender systems were thus developed to fulfill the needs of players. However, most of these schemes resemble traditional recommender systems in methodology and characteristics. In addition to traditional recommendation schemes, personality-based recommender systems could be feasible for games, since evidence has shown that relationships exist between personality and games [22–24]. In this work, we describe a game recommendation scheme based on player personality identified through text mining.
First, we applied a text mining process to social messages posted by players to identify their Big Five personality traits. The personality traits of each game were also identified through the same process. We then calculated the similarity between player and game according to their personality traits. Games with personality traits similar to the player's were then recommended to the player. The major contributions of this work are two-fold. First, we proposed a new text-based personality trait identification scheme. Second, we devised a game recommendation scheme using personality similarity between games and players. We believe the proposed personality-based recommendation scheme can also be applied to other kinds of resources, especially those having affective content such as music and movies.

This article is divided into the following sections. Section 2 briefly summarizes related work. The proposed scheme is described in Section 3. Section 4 presents the experimental results and their evaluation. Finally, we give some conclusions and discussion in the last section.

2 https://steamdb.info/genres/ Data retrieved on Jan. 18, 2018.

2. Related work

2.1. Personality and its relationships to games

Research on personality traits has been conducted in psychology for over a century. Trait theory is concerned with the measurement of traits, which can be defined as habitual patterns of behavior, thought, and emotion [25]. Traits are relatively stable characteristics of a person, although they differ significantly across people. Personality can be regarded as composed of a series of traits, which are persistent characteristics of human behavior [26]. Although there could be thousands of traits [27], five high-order traits are generally recognized to describe human personality, namely the Five Factor Model or Big Five personality traits [28]. Each trait in the FFM can be described by a set of lexical terms [29,30]. Evidence has shown that personality traits affect many aspects of behavior, social interaction, and mental and physical status. For example, one study showed that the job performance of a work team is related to the personality traits of its members [31]. Another interesting effect of personality traits lies in satisfaction in romantic relationships [32]. Hence, predicting human behaviors such as game preferences using personality traits seems plausible.

The associations between computer game players and their game preferences have been investigated in previous works. Peever et al. [23] suggested that people with certain personality types exhibit preferences for particular game genres. They found that extraverted players are likely to favor game types such as party, music, and casual games. Conscientious players, on the other hand, like to play sport, racing, flight simulation, simulation, and fighting games. Additionally, people high in openness to experience tend to play action adventure and platform games. Zammitto [22] investigated the relationships between 9 game genres and 5 personality traits. She discovered that the extraversion and neuroticism traits are positively related to shooting action, non-shooting action, fighting action, and sports games. The openness trait, on the other hand, is positively related to artificial intelligence simulation, adventure, and puzzle games. Both studies demonstrated a possible relatedness between personality traits and game genres. Park et al. [33] studied the correlations among several personality factors, such as social anxiety, self-esteem, and impulsivity, and game genre in patients with problematic online game playing. They found that the mean social anxiety score was highest in the massively multiplayer online role-playing game group and lowest in the first-person shooter group. The mean self-esteem score was highest in the real-time strategy group.
de Vette et al. [34] reported that four out of five personality traits correlate weakly with their corresponding game preference domains for participants younger than 60. Chory and Goodboy [35] found that players with low agreeableness scores are more likely to play violent video games. They also found that openness is positively related to violent video game playing. van Lankveld et al. [36] analyzed the game behavior and personality traits of a set of players in role-playing games. They concluded that personality effects on game behavior exist for all five traits of the FFM. This evidence demonstrates that certain personality traits do affect players' selection of games. Thus, recommending games to players according to their personality traits, which is the main theme of this work, is plausible.

2.2. Personality trait recognition from texts

Many schemes for identifying personality traits automatically have been devised recently [37]. One approach is to use cues from lexical terms. Pennebaker and King [38] developed the Linguistic Inquiry and Word Count (LIWC) system and showed that some LIWC categories correspond with Big Five personality traits [8]. Yarkoni [20] analyzed the relationships between words and personality using a framework similar to that of Pennebaker and King [38] and revealed robust correlations between the Big Five traits and the frequency with which bloggers used different word categories. Argamon et al. [17] also studied the relations between personality and texts. They used four different sets of lexical features for this task, namely a standard function word list, conjunctive phrases, modality indicators, and appraisal adjectives and modifiers. A Support Vector Machine (SVM) was used to learn linear separators for the high and low classes. The study showed that appraisal use is the best predictor of neuroticism, and that function words work best for extraversion.
Oberlander and Nowson [18] used Naïve Bayes and SVM on different sets of n-gram features to classify the personality of blog authors. They found that n-gram features exhibit an empirical relationship with personality traits. Their experiments also showed that Naïve Bayes outperformed SVM in their classification tasks. Mairesse et al. [19] provided a long list of linguistic features that correlate with the personality traits of the Big Five model. Celli [39] selected 12 features from this list and developed a personality recognition system. He found that different writing styles and personality models are associated with different communities using Twitter. Golbeck et al. [40] adopted the same approach and proposed a method to predict a user's personality from the publicly available information on his or her Twitter profile. Poria et al. [9] combined common sense knowledge with ordinary psycho-linguistic and frequency-based features for personality classification. They found that common sense knowledge with affective and sentiment information could enhance the accuracy of frameworks that used only psycho-linguistic features and frequency-based analysis at the lexical level. Schwartz et al. [21] proposed an open-vocabulary approach, which differs from traditional a priori lexicon approaches such as LIWC, for personality prediction. They conducted experiments on a large-scale dataset composed of 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers. Their results suggested that the relative improvement of open-vocabulary approaches over LIWC was largest for personality. An interesting work by Zhu and Fang [41] proposed a revised lexical scheme for recognizing game personality and user experience by analyzing online game reviews. They used factor analysis to discover 9 and 6 factors to describe the game personality and the user experience in game play, respectively.


Fig. 1. The system architecture of the proposed method.

2.3. Personality-based recommender systems

Various recommendation techniques based on personality traits have been developed over recent decades [11,12]. Roshchina et al. [13] adopted the tool developed by Mairesse et al. [19] to construct the personality profile of a user from his or her written texts. However, they did not provide details on how accuracy was measured. Besides, their recommendation scheme is based on clustering results and is difficult to compare with gold standards. Ferwerda and Schedl [42] proposed an idea for enhancing the accuracy of music recommendations according to personality and emotional states. Although the framework seems plausible, no implementation results were evaluated. Guntuku et al. [43], Karumur et al. [44], and Nguyen et al. [45] studied the relationships between personality and performance factors of recommender systems. Their studies showed that personality is positively related to user satisfaction, behavior, and preference in recommender systems. Although personality traits have been applied to recommender systems, to the best of our knowledge no such application to games has been reported.

3. The proposed scheme

3.1. Methodology overview

In this work, we use personality traits for game recommendation. Fig. 1 depicts the architecture of the proposed method. The details of each step are addressed in the following subsections.

3.2. Data acquisition

We first fetched game-related data from Steam. Steam is a major gaming platform which provides access to various types of games. According to Steam's report, there were over eighteen million concurrent users and over twenty thousand games on Jan. 21, 2018 [46]. Each game is associated with a set of attributes as well as user reviews. These reviews can be multilingual since players are located all over the world. However, we used only English reviews in this work. The game-related data were retrieved using the Steam Web API [47] and the import.io Web scraper service.3 The markup of the retrieved pages was removed by the jsoup Java HTML Parser. The resulting 'clean' pages without markup were stored for further processing.

3 https://www.import.io/.

The number of reviews varies widely across games. To obtain a consistent data volume for every game, we retrieved exactly 250 reviews for each game. Games with fewer than 250 reviews were discarded. For games with more than 250 reviews, we selected the top 250 reviews on the list sorted by the 'MOST HELPFUL (ALL TIME)' option, as shown in Fig. 2. These reviews received high user approval and should therefore be of better quality.

Fig. 2. An example of the user reviews on the list sorted by the 'MOST HELPFUL' option. Most (93%) of the rating users approved this review, as shown in the figure.

We also collected a set of user-related contents submitted by users. Each participant was asked to submit at least 10 qualified contents longer than 200 English words. We examined the submitted contents to ensure that they were appropriate for the experiments.

3.3. Data preprocessing

The reviews of games are textual data that need to be processed and transformed into a suitable representation for later processing. We applied several text preprocessing steps to these reviews prior to personality recognition. A review was first segmented into a set of words using common word segmentation tools. We removed all numbers and dates, which should be irrelevant to the recognition of personality traits. Punctuation marks were also removed, except periods, commas, exclamation marks, and question marks. The remaining keywords were used to represent the reviews. The same process was also applied to the user-related contents to transform them into sets of keywords.

3.4. Personality recognition

In this work, we obtained the personality traits of games by two methods. The first method used the Personality Recognizer tool developed by Mairesse et al. [19], which computed the Pearson correlation coefficients between the OCEAN personality traits and LIWC [48]. The Personality Recognizer is a Java command-line application that reads a set of text files and computes estimates of personality scores along the OCEAN traits.4 It provides four prediction models, namely linear regression, M5' model tree, M5' regression tree, and support vector machine with linear kernel (SMOreg). In this work, we adopted the M5' regression tree [49], which produced better results in Roshchina's report [50]. The Personality Recognizer computes a score for each trait on a scale from 1 to 7, where, for example, 7 means strongly extravert. A running example is depicted in Fig. 3.

Fig. 3. An example of the result of Personality Recognizer.

The personality traits of a game were determined using two approaches according to the personality traits of texts. The first approach used the game's reviews for its personality recognition. The second approach, on the other hand, relied on the personality of players to decide a game's personality. Note that both approaches computed the personality traits from texts, i.e. reviews and user contents. We call the first approach the game-centric approach and the second approach the user-centric approach in the remaining text. For the game-centric approach, the personality traits of a game $G_i$ were calculated by the following equation:

\[
P_{G_i} = \frac{\sum_{r \in R_{G_i}} s_r}{|R_{G_i}|}, \tag{1}
\]

where $r$ and $R_{G_i}$ denote a review and the set of reviews associated with $G_i$, respectively. $s_r = (s_{r,t} \mid t \in V_P)$ denotes the OCEAN personality trait scores for review $r$, where $V_P$ = {Extraversion, Emotional stability, Agreeableness, Conscientiousness, Openness to experience} denotes the set of personality traits and $t$ denotes a trait. The values of $s_{r,t}$ were obtained using Personality Recognizer. For example, the review 1.txt in Fig. 3 has $s_{1.txt}$ = (4.858445, 3.982087, 4.594376, 5.048187, 4.712929). Note that $|R_{G_i}| = 250$ in our experiments.

4 http://farm2.user.srcf.net/research/personality/recognizer.

The personality of users can be recognized through user-related contents. A direct source of user-related contents is their reviews of games. However, some of the participants in our experiments may not have written enough quality reviews on Steam. Therefore, we decided to ask the users to submit a set of their own contents such as emails, posts, and social messages. As mentioned before, each content should be over 200 words long. The personality traits of a user $U_j$ were determined in the same manner as those of the games, as follows:



\[
P_{U_j} = \frac{\sum_{m \in M_{U_j}} s_m}{|M_{U_j}|}, \tag{2}
\]

where $m$ and $M_{U_j}$ denote a user content and the set of contents submitted by $U_j$, respectively. $s_m$ is defined like $s_r$ in Eq. (1), with game reviews replaced by user contents. In the user-centric approach, the personality traits of a game are determined by the personality of its players. Eq. (1) was then modified as follows:

\[
P_{G_i} = \frac{\sum_{U_j \in O_{G_i}} P_{U_j}}{|O_{G_i}|}, \tag{3}
\]

where $O_{G_i}$ is the set of users who played $G_i$. The personality traits of users were calculated in the same way as in the game-centric approach, based on the personality of texts. They could also be obtained using psychometric tests such as the NEO-PI-R. However, collecting test results is generally difficult for a large set of users.

Besides using Personality Recognizer, we also developed another scheme to identify personality traits from texts. We first calculated the OCEAN scores for each keyword using the myPersonality dataset5 [51]. The myPersonality dataset contains 9917 status updates of 250 Facebook users whose OCEAN traits were also identified. Both OCEAN scores and labels are included in the dataset. An excerpt of the dataset is shown in Fig. 4. We used the status updates together with their authors' personality traits to calculate the OCEAN scores for each keyword. However, the status updates contain function words, redundant words, and punctuation symbols which are not helpful. We also found some minor errors in the data and corrected them manually. For ease of processing and better results, we performed several preprocessing steps on the myPersonality dataset, which are addressed in Section 4. The OCEAN scores of a keyword $v$ were calculated by:

\[
\mathbf{v} = (s_t \mid t \in V_P), \tag{4}
\]

\[
s_t = \frac{O \cdot S_t}{\|O\|_1}, \tag{5}
\]

where $O$ is the occurrence vector whose $i$th element indicates the number of times $v$ occurs in the $i$th status update. For example, $O = (1, 0, 0, 1, 0, 0, 1, 0, 0, 1, \ldots)$ for $v$ = 'like' according to Fig. 4. $S_t$ denotes the vector constructed from the scores of personality trait $t$ over the dataset. For example, $S_t = (2.65, 2.65, 2.65, 2.65, 2.65, 2.65, 2.65, 2.65, 2.65, \ldots)$ for $t$ = 'sEXT' (the score of Extraversion) in Fig. 4. In this example, the elements have the same value since these updates were posted by the same author. In addition to scores, the myPersonality dataset also provides trait classes for each status update. In this case, the $i$th element of $S_t$ has value 1 if the $i$th status update belongs to class $t$. For example, $S_t = (0, 0, 0, 0, 0, 0, 0, 0, 0, \ldots)$ for $t$ = 'cEXT' (the class of Extraversion) in Fig. 4. Naturally, the elements all have the value 0 for the same author. $\|O\|_1$ is the 1-norm of $O$, which for these non-negative counts is the sum of all its elements.

Fig. 4. A part of the myPersonality dataset. We excluded social network attributes which were not used in this work. Each record contains the author's ID (#AUTHID), the status update (STATUS), the OCEAN scores (sEXT, sNEU, sAGR, sCON, sOPN), and the OCEAN classes (cEXT, cNEU, cAGR, cCON, cOPN).

5 Data downloadable at http://mypersonality.org/wiki/doku.php?id=wcpr13.

Table 1 shows the OCEAN scores of example keywords. Fig. 5 depicts the score distributions for the keywords in Table 1 using the trait score and trait class schemes. We can observe that the trait score scheme seems to be more consistent in representing keywords' traits, in that words with similar affective polarity (love and bless, hurt and hate) tended to have similar traits. However, further research is necessary to reach firm conclusions. In this work, we adopted only the OCEAN scores in Eq. (4).

Table 1
The OCEAN scores of keywords. # denotes the number of occurrences of keyword v.

                 Trait score scheme                 Trait class scheme
v       #     sEXT  sNEU  sAGR  sCON  sOPN    cEXT  cNEU  cAGR  cCON  cOPN
Love    461   3.48  2.62  3.73  3.49  4.18    0.47  0.36  0.60  0.47  0.76
Bless    21   3.47  2.48  3.58  3.51  4.19    0.52  0.33  0.52  0.43  0.81
Friend  185   3.39  2.66  3.72  3.39  4.12    0.41  0.36  0.62  0.36  0.78
Hate    109   3.31  2.87  3.51  3.31  4.11    0.38  0.52  0.43  0.41  0.65
Hurt     34   3.33  2.76  3.74  3.22  4.16    0.35  0.47  0.56  0.41  0.74

The personality traits of a game were then obtained by aggregating the OCEAN scores of its constituent keywords. We obtain the personality traits of game $G_i$ by

\[
P'_{G_i} = \frac{\sum_{v \in V_{G_i}} w_v \mathbf{v}}{|V_{G_i}|}, \tag{6}
\]

where $V_{G_i}$ denotes the set of unique keywords appearing in $R_{G_i}$ and $\mathbf{v}$ is the OCEAN score vector defined in Eq. (4). $w_v$ denotes the weight of keyword $v$ and was defined according to the tf-idf weighting scheme in information retrieval [52] as follows:

\[
w_v = \frac{tf(v, i)}{\max_{v' \in V_{G_i}} tf(v', i)} \log \frac{N}{n_v}, \tag{7}
\]

where $tf(v, i)$ denotes the number of occurrences of $v$ in $R_{G_i}$. $N$ and $n_v$ denote the number of games and the number of games whose reviews contain $v$, respectively. Likewise, the OCEAN scores of a user $U_j$ can be obtained by

\[
P'_{U_j} = \frac{\sum_{v \in V_{U_j}} w_v \mathbf{v}}{|V_{U_j}|}, \tag{8}
\]

where $V_{U_j}$ denotes the set of unique keywords appearing in $M_{U_j}$. For the user-centric approach, Eq. (6) was modified as follows:

\[
P'_{G_i} = \frac{\sum_{U_j \in O_{G_i}} P'_{U_j}}{|O_{G_i}|}. \tag{9}
\]

We shall call this scheme the myPersonality recognizer hereinafter.

3.5. Similarity computation and recommendation

Three recommendation schemes were devised in this work. The first scheme, namely the user-based recommendation scheme, recommends games to a user solely according to the relatedness between their personality traits. That is, we recommend a game to a user if the two have similar personality traits. The similarity between game $G_i$ and user $U_j$ using the Personality Recognizer approach is defined as follows:

\[
S_{user}(G_i, U_j) = \frac{P_{G_i} \cdot P_{U_j}}{\|P_{G_i}\| \, \|P_{U_j}\|}. \tag{10}
\]

Eq. (10) is the classical cosine similarity measure. Likewise, Eq. (11) defines the similarity using the myPersonality recognizer approach:

\[
S_{user}(G_i, U_j) = \frac{P'_{G_i} \cdot P'_{U_j}}{\|P'_{G_i}\| \, \|P'_{U_j}\|}. \tag{11}
\]
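As a minimal sketch of how Eqs. (1), (2), and (10) fit together, the following Python fragment averages per-text trait vectors into a profile and ranks games by cosine similarity to a user profile. The function names, the `top_n` parameter, and the dictionary layout are illustrative assumptions and are not taken from the original implementation; the per-text trait vectors are assumed to have been produced beforehand by a recognizer.

```python
import math


def mean_profile(score_vectors):
    """Element-wise mean of per-text trait vectors, as in Eqs. (1) and (2)."""
    n = len(score_vectors)
    dim = len(score_vectors[0])
    return tuple(sum(v[k] for v in score_vectors) / n for k in range(dim))


def cosine(p, q):
    """Cosine similarity between two trait vectors, as in Eq. (10)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend_user_based(user_profile, game_profiles, top_n=10):
    """Rank games (hypothetical dict: game id -> profile) by similarity to the user."""
    ranked = sorted(
        game_profiles,
        key=lambda g: cosine(game_profiles[g], user_profile),
        reverse=True,
    )
    return ranked[:top_n]
```

The same functions apply unchanged to the primed profiles of Eq. (11), since only the way the profiles are computed differs.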

The second scheme recommends games according to the games under consideration. We call this scheme the game-based recommendation scheme. For example, if a user is browsing a game $g$, we recommend to the user those games whose personality traits are similar to those of $g$. Another scenario of game-based recommendation arises when the favorite games of a user are known a priori. In this scenario, we can recommend games that are similar to these favorite games. We calculate the similarity between users and games for the Personality Recognizer and myPersonality recognizer approaches, respectively, by

\[
S_{game}(G_i, U_j) = \frac{1}{|C_{U_j}|} \sum_{g \in C_{U_j}} \frac{P_{G_i} \cdot P_g}{\|P_{G_i}\| \, \|P_g\|}, \tag{12}
\]

and

\[
S_{game}(G_i, U_j) = \frac{1}{|C_{U_j}|} \sum_{g \in C_{U_j}} \frac{P'_{G_i} \cdot P'_g}{\|P'_{G_i}\| \, \|P'_g\|}, \tag{13}
\]

where $C_{U_j}$ is the set of games regarding $U_j$, which can be determined implicitly or explicitly. Implicit construction of $C$ could use knowledge such as browsing histories and user profiles, while explicit construction relies on externally provided information such as the user's favorite game list.
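Eq. (12) can be read as the mean cosine similarity between a candidate game and every game in $C_{U_j}$. The sketch below illustrates that reading; the function names are hypothetical, and returning 0.0 for an empty set is an assumption, since the text does not say how that case is handled.

```python
import math


def cosine(p, q):
    """Cosine similarity between two trait vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def game_based_score(candidate_profile, considered_profiles):
    """Mean cosine similarity between a candidate game and the games in C_Uj (Eq. (12)).

    considered_profiles holds the profiles of the user's considered games;
    an empty list yields 0.0, which is an assumption of this sketch.
    """
    if not considered_profiles:
        return 0.0
    total = sum(cosine(candidate_profile, g) for g in considered_profiles)
    return total / len(considered_profiles)
```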


Fig. 5. The distributions of keyword trait scores using different schemes.

The last scheme, namely the hybrid scheme, suggests games according to the relatedness between the game and the user as well as between the game and the games under consideration, e.g. the user's favorite games. Eq. (14) defines the similarity between a game $G_i$ and a user $U_j$:

\[
S_{hybrid}(G_i, U_j) = w_u S_{user}(G_i, U_j) + w_g S_{game}(G_i, U_j). \tag{14}
\]
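Because the weights in this work sum to one, Eq. (14) reduces to a convex combination of the two similarities. A minimal sketch follows; the default weight of 0.5 is purely an assumption made for illustration, as no weight values are given at this point in the text.

```python
def hybrid_score(s_user, s_game, w_user=0.5):
    """Combine user-based and game-based similarities as in Eq. (14).

    w_user plays the role of w_u, and w_g is taken as 1 - w_user.
    The default of 0.5 is an illustrative assumption.
    """
    if not 0.0 <= w_user <= 1.0:
        raise ValueError("w_user must lie in [0, 1]")
    return w_user * s_user + (1.0 - w_user) * s_game
```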

The weights $w_u$ and $w_g$ differentiate the contributions of the user and game similarities. In this work, we let $w_u + w_g = 1$. When a service provider intends to recommend games to a specific player, proactively or reactively, the user-based recommendation scheme computes the similarity between the user and each game using Eq. (10) or (11). The games are then ranked according to their similarity to the user, and usually the top-ranked games are recommended to the user. When user information is unavailable or unnecessary, game-based recommendation can be applied. The hybrid scheme can be used when both kinds of information are available.

4. Experimental result

We collected the list of games from Steam using the Selenium Java API.6 The AppIDs (application IDs) of 25715 products were collected in June, 2016. These products included non-game applications, which were removed first. The number of remaining games was about 8000. We then removed games with fewer than 250 reviews, which reduced the number of games to 2050. The jsoup Java HTML Parser API was used to retrieve the reviews associated with the games. All reviews were stored in JSON format for later processing. Fig. 6 shows examples of the reviews. Table 2 shows some statistics of the retrieved data.

6 https://seleniumhq.github.io/selenium/docs/api/java/.
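The review filtering described in Section 3.2 and applied here (discard games with fewer than 250 reviews, keep the 250 most helpful reviews otherwise) could be sketched as follows. The `helpful` field and the dictionary layout are hypothetical stand-ins for whatever the stored JSON actually contains, and sorting by a numeric helpfulness value only approximates Steam's 'MOST HELPFUL (ALL TIME)' ordering.

```python
def select_reviews(reviews_by_game, per_game=250):
    """Keep games with at least per_game reviews and their top per_game reviews.

    reviews_by_game maps a game id to a list of review dicts; each dict is
    assumed (hypothetically) to carry a numeric 'helpful' value.
    """
    selected = {}
    for game_id, reviews in reviews_by_game.items():
        if len(reviews) < per_game:
            continue  # too few reviews: the game is discarded
        ranked = sorted(reviews, key=lambda r: r["helpful"], reverse=True)
        selected[game_id] = ranked[:per_game]
    return selected
```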

Table 2
Statistics of retrieved data.

Number of retrieved items                   25 715
Number of games used in experiments          2 050
Number of downloaded reviews             3 048 375
Number of review authors                   328 972

The user-related contents were submitted by reviewers who participated in the experiments. In total, 298 contents were submitted by 29 reviewers. Each content was required to have at least 200 words. The reviews and contents were segmented into keywords using Apache Lucene's StandardAnalyzer.7 Fig. 7 shows the segmentation result of example reviews. The personality traits of games and users were obtained by the approaches described in Section 3.4.

We performed some preprocessing steps on the myPersonality dataset. First, we removed all punctuation marks in all status updates. We then removed all 439 stop words compiled by the University of Tennessee. The remaining words were stemmed by Porter's stemming algorithm [53]. Table 3 lists statistics of the myPersonality dataset. The dataset contains many meaningless words (e.g. typos), net-centric abbreviations (e.g. 'OMG'), elongated words (e.g. 'ugggggghhhhh'), etc. that could be filtered out. After inspection, we observed that these unimportant words can easily be removed according to their number of occurrences. For example, if we set the minimum number of occurrences to 10, the number of remaining keywords is 1223, i.e. less than 10% of the original number, 12 827. These frequent words are mostly meaningful and popular words, with a negligible number of the above-mentioned unimportant words. However, we kept all words in our experiments to retain as much information as possible.

7 https://lucene.apache.org/.
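Once the status updates have been reduced to keywords, the keyword scoring of Eqs. (4) and (5) and the weighted aggregation of Eqs. (6) and (7) could be sketched as below. All names are illustrative. The sketch assumes a natural logarithm, skips tokens that have no keyword score, and takes the maximum term frequency over scored keywords only; none of these choices is specified in the text.

```python
import math
from collections import Counter


def keyword_scores(updates):
    """Per-keyword trait vectors following Eqs. (4)-(5).

    updates is a list of (tokens, trait_vector) pairs, one per status update.
    Each score is the occurrence-weighted mean O . S_t / ||O||_1.
    """
    sums = {}
    counts = Counter()
    for tokens, traits in updates:
        for word, n in Counter(tokens).items():
            acc = sums.setdefault(word, [0.0] * len(traits))
            for k, s in enumerate(traits):
                acc[k] += n * s
            counts[word] += n
    return {w: tuple(x / counts[w] for x in acc) for w, acc in sums.items()}


def weighted_profile(doc_tokens, scores, n_docs, doc_freq):
    """Profile of one game or user following Eqs. (6)-(7).

    doc_freq maps each keyword to n_v and must cover every scored keyword in
    doc_tokens; tokens without a score are ignored (an assumption).
    """
    tf = Counter(t for t in doc_tokens if t in scores)
    if not tf:
        return None
    max_tf = max(tf.values())
    dim = len(next(iter(scores.values())))
    total = [0.0] * dim
    for word, freq in tf.items():
        weight = (freq / max_tf) * math.log(n_docs / doc_freq[word])
        for k in range(dim):
            total[k] += weight * scores[word][k]
    return tuple(x / len(tf) for x in total)
```

Replacing the trait score vectors with 0/1 class indicators in `updates` would give the trait class scheme instead of the trait score scheme.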

Please cite this article as: H.-C. Yang and Z.-R. Huang, Mining personality traits from social messages for game recommender systems, Knowledge-Based Systems (2018), https://doi.org/10.1016/j.knosys.2018.11.025.

H.-C. Yang and Z.-R. Huang / Knowledge-Based Systems xxx (xxxx) xxx


Fig. 6. Examples of downloaded reviews in JSON format.

Fig. 7. Segmentation result of example reviews.

Fig. 8. The game submission interface in our experiments. The interface was implemented in Chinese. We translated the messages into English to allow easier comprehension.


Table 3
Statistics of the myPersonality dataset. All keywords are shown in their stemmed form. The keyword ‘propnam’ is the stemmed form of ‘*PROPNAME*’, which stands for replaced proper names.

Number of records                              9917
Number of authors                              250
Maximum number of status updates per author    223
Minimum number of status updates per author    1
Number of unprocessed words                    148,481
Number of keywords                             12 827
Number of multiple occurrence keywords         5165
Top 20 most occurred keywords                  propnam thi dai ar wa im ha love want todai think need hi night feel dont happi tomorrow peopl home

We recruited 63 reviewers, 47 males and 16 females aged from 20 to 41, to evaluate the performance of the proposed approaches. Most of the reviewers (43 out of 63) had posted reviews on Steam. The reviewers were recruited from local Facebook Steam clubs and university clubs. They were asked to submit at most 10 favorite games through a Web interface, as shown in Fig. 8. The interface allowed reviewers to select games in three ways, namely searching by keywords, browsing all games, and browsing by tags. The average number of submitted games was 9.73. We also asked the reviewers to submit their written contents and to complete the short form of the IPIP-NEO test.8 However, only 29 reviewers completed both tasks, submitting 298 contents in total.

All three recommendation schemes described in Section 3.5 were conducted in our experiments. For reviewers who did not submit their contents, only game-based recommendation was conducted. Therefore, the user-based and hybrid recommendation schemes were applied to the 29 reviewers who submitted their contents, whereas game-based recommendation was applied to all 63 reviewers who submitted their favorite games. For game-based recommendation, we recommend games in response to each of the reviewer’s favorite games, so the system generates several recommendation results. For the user-based and hybrid recommendation schemes, we did not refer to the favorite games, so only one result is generated.

We evaluated 12 configurations of recommendation schemes, composed of two types of game personality recognition, namely the game-centric (GC) and user-centric (UC) schemes; two types of text-based personality recognition schemes, namely the Personality Recognizer (PR) and the myPersonality (MP) recognizer; and three types of recommendation schemes, namely the user-based (UB), game-based (GB), and hybrid (Hy) schemes. Table 4 lists the characteristics of these configurations.
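The 12 configurations are simply the Cartesian product of the three design choices (2 × 2 × 3). A short sketch that enumerates them, using the abbreviations of Table 4 and hypothetical variable names, is:

```python
from itertools import product

GAME_PERSONALITY = ["GC", "UC"]      # game-centric, user-centric
RECOGNIZER = ["PR", "MP"]            # Personality Recognizer, myPersonality recognizer
RECOMMENDATION = ["GB", "UB", "Hy"]  # game-based, user-based, hybrid

# The recommendation scheme is iterated slowest so the order follows Table 4.
configurations = [
    f"{game}-{recognizer}-{scheme}"
    for scheme, recognizer, game in product(RECOMMENDATION, RECOGNIZER, GAME_PERSONALITY)
]
print(len(configurations))
print(configurations)
```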
We set both weights $w_u$ and $w_g$ in Eq. (14) to 0.5 in our experiment. The reviewers were asked to rate each recommended game on a 5-level Likert scale. Examples of the recommendation result and the rating system are shown in Figs. 9 and 10. The results of the 12 recommendation settings were presented to the reviewers, as shown in Fig. 9, and the reviewers could rate each of the recommended games, as shown in Fig. 10. Table 5 lists the average rating scores of the 12 configurations. For ease of comparison, we grouped these scores according to different aspects, namely recommendation scheme, game personality scheme, and recognizer, as shown in Fig. 11. The average scores for the different aspects are summarized in Table 6. We also calculated the mean and standard deviation of the rating scores given by each reviewer. The average mean and average standard deviation over all reviewers were 3.30 and 0.51, respectively. To confirm the

8 http://www.personal.psu.edu/~j5j/IPIP/ipipneo120.htm.

Fig. 9. Recommendation result.

Fig. 10. Rating given by reviewers.


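Table 4
The configurations used in the experiments.

Configuration  Game personality  Recognizer                Recommendation
GC-PR-GB       Game-centric      Personality Recognizer    Game-based
UC-PR-GB       User-centric      Personality Recognizer    Game-based
GC-MP-GB       Game-centric      myPersonality recognizer  Game-based
UC-MP-GB       User-centric      myPersonality recognizer  Game-based
GC-PR-UB       Game-centric      Personality Recognizer    User-based
UC-PR-UB       User-centric      Personality Recognizer    User-based
GC-MP-UB       Game-centric      myPersonality recognizer  User-based
UC-MP-UB       User-centric      myPersonality recognizer  User-based
GC-PR-Hy       Game-centric      Personality Recognizer    Hybrid
UC-PR-Hy       User-centric      Personality Recognizer    Hybrid
GC-MP-Hy       Game-centric      myPersonality recognizer  Hybrid
UC-MP-Hy       User-centric      myPersonality recognizer  Hybrid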



Table 5
The rating scores given by the reviewers on all configurations. Note that the number of reviewers for the game-based scheme (GB) is 63, while it is 29 for both the user-based (UB) and hybrid (Hy) schemes.

Configuration  Mean rating score  Standard deviation  Ratio of top rating
GC-PR-GB       3.64               0.44                23/63 (36.5%)
UC-PR-GB       3.49               0.43                19/63 (30.2%)
GC-MP-GB       3.29               0.37                14/63 (22.2%)
UC-MP-GB       3.25               0.53                15/63 (23.8%)
GC-PR-UB       3.03               0.36                6/29 (20.7%)
UC-PR-UB       3.24               0.37                7/29 (24.1%)
GC-MP-UB       3.29               0.74                8/29 (27.6%)
UC-MP-UB       3.14               0.57                7/29 (24.1%)
GC-PR-Hy       3.20               0.69                7/29 (24.1%)
UC-PR-Hy       3.35               0.61                9/29 (31.0%)
GC-MP-Hy       3.34               0.52                7/29 (24.1%)
UC-MP-Hy       3.30               0.52                7/29 (24.1%)
Average        3.30               0.51                26.0%

Table 6
The average rating scores given by the reviewers on all aspects.

Aspect                  Scheme                    Average rating score
Recommendation scheme   Game-based                3.42
                        User-based                3.18
                        Hybrid                    3.30
Game personality        Game-centric              3.30
                        User-centric              3.30
Personality recognizer  Personality Recognizer    3.32
                        myPersonality recognizer  3.27
Average                                           3.30
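The per-aspect averages in Table 6 can be reproduced from the mean rating scores in Table 5. The sketch below assumes that each aspect average is the unweighted mean over the configurations sharing that choice; the source does not state this explicitly, but the resulting values agree with Table 6 to within about 0.01. The function and constant names are hypothetical.

```python
from statistics import mean

# Mean rating scores copied from Table 5.
SCORES = {
    "GC-PR-GB": 3.64, "UC-PR-GB": 3.49, "GC-MP-GB": 3.29, "UC-MP-GB": 3.25,
    "GC-PR-UB": 3.03, "UC-PR-UB": 3.24, "GC-MP-UB": 3.29, "UC-MP-UB": 3.14,
    "GC-PR-Hy": 3.20, "UC-PR-Hy": 3.35, "GC-MP-Hy": 3.34, "UC-MP-Hy": 3.30,
}

# Position of each aspect inside a configuration name such as "GC-PR-GB".
ASPECTS = ("game personality", "recognizer", "recommendation scheme")


def average_by_aspect(scores):
    """Average the configuration scores for every value of every aspect."""
    groups = {}
    for name, score in scores.items():
        for aspect, value in zip(ASPECTS, name.split("-")):
            groups.setdefault((aspect, value), []).append(score)
    return {key: mean(values) for key, values in groups.items()}


for (aspect, value), average in average_by_aspect(SCORES).items():
    print(f"{aspect:22s} {value:3s} {average:.3f}")
```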

consistency of these reviewers, we calculated the Fleiss’ kappa value [54] of their judgements and obtained a value of 0.69, which can be interpreted as “substantial agreement” according to Landis and Koch’s scale [55].

In Table 5, the twelve configurations exhibited similar performance in user satisfaction. However, there are still some differences among these configurations. We believe that the best configuration should have the largest mean score while keeping the deviation small. Therefore, configuration GC-PR-GB seems to be the best among all. To further support this conjecture, we counted the number of reviewers who gave the highest score, i.e. 5, to each configuration, as shown in Table 5. The same configuration received the largest ratio of highest scores. As a result, we were convinced that configuration GC-PR-GB is the best. We believe that such satisfying and consistent rating scores demonstrate the feasibility of the proposed approaches.

In Table 6, we can observe that the game-based recommendation scheme seems to outperform the other two recommendation schemes. The reason behind such superiority may be that the game-based scheme recommended games according to the

Table 7
The statistics and result of mutual similarity computation. We selected 30 reviews for each favorite game so that the number of reviews is comparable to the number of user-generated contents. The similarity between each pair of reviews, as well as each pair of contents, was calculated and then averaged.

                            Game reviews  User-generated contents
Number of users             1             29
Number of games             10            NA
Number of reviews/contents  300           298
Number of keywords          6127          6634
Average mutual similarity   0.0277        0.0214
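The average mutual similarity reported in Table 7 is the mean cosine similarity over all pairs of texts. A minimal sketch follows, assuming raw term-frequency vectors and whitespace tokenization; the term weighting actually used is not specified in this section, and the helper names and sample texts are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Bag-of-words vector with raw term counts (an assumption, not the paper's weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_mutual_similarity(texts):
    """Mean cosine similarity over every unordered pair of texts."""
    if len(texts) < 2:
        raise ValueError("at least two texts are required")
    vectors = [term_vector(t) for t in texts]
    pairs = list(combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


reviews = ["great shooter game", "great puzzle game", "boring story"]
print(average_mutual_similarity(reviews))
```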

similarity between games. Such similarity may be greater than that between games and users, since similar games tend to receive similar reviews, i.e. reviews using similar words. To support this conjecture, we computed the mutual similarity between each pair of game reviews as well as between each pair of user-generated contents. For the former, we randomly selected a reviewer and retrieved 30 reviews for each of his favorite games. The latter was computed using the user-generated contents from all reviewers. The mutual similarities were computed by the cosine similarity measure in the vector space model [52]. The dataset statistics and result are shown in Table 7.

As shown in Table 6, the game-centric and user-centric game personality computation schemes perform similarly. This is an interesting result since the two schemes rely on subjects of different nature, i.e. people and games. A possible cause of such consistent satisfaction is that the two schemes generated similar personality traits for games. To reveal such similarity, we calculated the mean difference between the personality traits obtained by the two schemes, i.e. $\frac{1}{N}\sum_{i} \lVert P^{g}_{G_i} - P^{u}_{G_i} \rVert$, where $P^{g}_{G_i}$ and $P^{u}_{G_i}$ denote the personality traits of game $G_i$ using the game-centric and user-centric schemes, respectively. We obtained a mean difference of 1.17, which is considerably small compared to its theoretical maximum, i.e. $6\sqrt{5} \approx 13.42$. Therefore, these two schemes may generate similar recommendations.

Finally, using the Personality Recognizer seems to bring limited improvement over using the myPersonality recognizer. This result is not surprising, since the myPersonality recognizer was modeled using a rather small dataset. We believe that the myPersonality recognizer will improve if a significantly larger dataset is used to generate the model. In fact, the myPersonality project constructed a large dataset consisting of millions of records. We are planning to incorporate such a dataset to see how the recognizer can be improved.

We also compared the recognized personality of the reviewers to their ‘true’ personality measured by the IPIP-NEO test. All personality trait scores were first normalized into the range [0, 1]. We then calculated the differences between the recognized personality traits and


Fig. 11. The rating scores grouped by different aspects.

Table 8
The comparison of different personality measuring schemes.

                                Mean difference to true OCEAN scores
Personality recognition scheme  EXT   NEU   AGR   CON   OPN
Personality Recognizer          0.21  0.17  0.24  0.13  0.16
myPersonality recognizer        0.19  0.18  0.22  0.15  0.18

the true personality traits for both recognition schemes mentioned in Section 3.4. The result is shown in Table 8. The two automatic schemes exhibited similar performance in terms of their differences to the true personality.

It is difficult to find work comparable to ours since we adopted self-collected data from Steam. However, the Steam platform provides a recommendation facility called “LOOKING FOR SIMILAR ITEMS”. An example is shown in Fig. 12. Twelve games are recommended to the player according to tags, i.e. the tags that players have most frequently applied to the game under consideration have also been applied to the recommended games. We asked the reviewers to rate Steam’s recommendations for comparison. Some reviewers could not participate in this evaluation, so the number of reviewers was reduced to 56, which is still sufficient for the comparison. The reviewers were asked to rate Steam’s recommendation lists for their favorite games. The average rating score is 3.26 with a standard deviation of 0.54. According to Table 5, most of our configurations (7 out of 12) outperform Steam’s recommendations in average rating score.

Fig. 12. An example of Steam “LOOKING FOR SIMILAR ITEMS” recommendation.

5. Conclusions and discussions

In this work, we proposed a scheme for game recommendation based on personality traits. The personality traits of games and players were identified from their related textual data. The games with personality traits similar to a player’s are recommended to that player. We suggested two schemes to determine the personality of games, together with two text-based personality recognition methods. We also proposed several recommendation schemes based on the identified personality traits. The evaluation results showed that the reviewers were satisfied with the recommended games, with an average rating score of 3.30 over all configurations. However, it is not easy to compare this result to other related work since there is no gold standard for the similarity between games. A possible gold standard is Steam’s ‘similar item’ list of a game.9 These similar games are suggested by their common user tags, i.e. the tags that customers have most frequently applied to the game have also been applied to these similar games. However, it is difficult to decide whether such ‘similar’ games reflect true similarity. We will conduct experiments using such similar items to explore the plausibility of this gold standard approach.

The experiments demonstrated that the proposed myPersonality recognizer had comparable performance to the Personality Recognizer. It would be interesting to adopt other personality datasets. The myPersonality dataset contains 148,481 words in 9917 records, i.e. about 15 words per record. The maximum and minimum numbers of status updates per author are 223 and 1, respectively, which means that some authors (12, to be exact) posted only one update in the dataset. This should not affect the recognition result, since every author’s word use reflects his or her personality. However, we believe that the result may be improved by imposing a minimum number of words per author. Other datasets could also be used instead of myPersonality. However, it is not easy to associate a text with the true personality of its author. Constructing a large-scale dataset comprising user texts together with user personality would benefit studies on text-based personality recognition. We believe that the proposed recognition scheme (as applied to the myPersonality dataset) could perform better if such large-scale datasets were available.

Zhu and Fang [41] suggested a set of novel traits and factors to describe games and discussed their relations to game play experience. Nine game traits, namely Playability, Usability, Innovative design, Strategy, Type/Nature of Action, Characteristics of Game Characters, Theme of Narrative, Violence, and Realism/Audio/Video Effects, were discovered. It would be interesting to associate such game traits with personality traits.

9 Visit http://store.steampowered.com/recommended/morelike/app/578080/ for a sample.

Table 9
The performance of different weight settings in the hybrid recommendation scheme. A false configuration indicates that the hybrid recommendation scheme outperformed the game-based recommendation scheme in this configuration. The mean difference is the average difference between the scores of the game-based and hybrid recommendation schemes. A positive mean difference indicates that the game-based scheme outperformed the hybrid scheme, and vice versa.

$w_g$  $w_u$  # of false configurations  Mean difference
0.0    1.0    0                          0.43
0.1    0.9    1                          0.37
0.2    0.8    0                          0.45
0.3    0.7    1                          0.26
0.4    0.6    1                          0.20
0.5    0.5    2                          0.12
0.6    0.4    0                          0.42
0.7    0.3    2                          0.16
0.8    0.2    1                          0.14
0.9    0.1    0                          0.08
1.0    0.0    0                          0
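The weight sweep behind Table 9 can be illustrated with a small sketch. It assumes that the hybrid score is a linear combination of a user-based and a game-based similarity with $w_u + w_g = 1$; Eq. (14) itself is not reproduced in this section, so the exact form is an assumption, and the similarity values below are hypothetical. Under this assumption the hybrid score reduces to the game-based score at $w_g = 1$, which is consistent with the zero mean difference in the last row of Table 9.

```python
def hybrid_score(user_sim, game_sim, w_g):
    """Weighted combination of the two similarities, with w_u = 1 - w_g (assumed form)."""
    w_u = 1.0 - w_g
    return w_u * user_sim + w_g * game_sim


# Hypothetical similarity values for one candidate game (not taken from the paper).
user_sim, game_sim = 0.40, 0.70

# Sweep w_g from 0.0 to 1.0 in steps of 0.1, as in Table 9.
for step in range(11):
    w_g = step / 10
    score = hybrid_score(user_sim, game_sim, w_g)
    print(f"w_g={w_g:.1f}  w_u={1 - w_g:.1f}  hybrid={score:.3f}")
```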

The weights in the hybrid recommendation scheme differentiate the contributions from game-based and user-based recommendation. Although different weight settings produce different results, we did not pursue the issue of optimal weight setting since the game-based scheme outperformed the user-based scheme in every configuration. The hybrid scheme simply produced intermediate results of various degrees in our experiments. Therefore, we set both weights to 0.5 in our experiments. Note that the hybrid scheme did outperform the other two schemes in some configurations under some weight settings. For example, the hybrid scheme performed better in GC-MP-Hy and UC-MP-Hy in Fig. 11(a), where both weights were 0.5. However, we tried several weight settings and found that such false configurations did not affect the superiority of the game-based scheme, as shown in Table 9. We were therefore convinced that weight settings play an insignificant role in the recommendation performance of our approaches.

In this work, we did not include a time factor in our model, which assumes that players have consistent game preferences over time. However, people’s interests in games do change from time to time. It would be worthwhile to incorporate a time factor into the proposed model. A simple approach is to use data collected within a short period of time, during which game preferences should be consistent. However, the data collected in such a short period may not be sufficient to reveal the true personalities of the players.

Acknowledgments

This work is supported by funding from the Ministry of Science and Technology under grant MOST 103-2410-H-390-017-MY2.

References

[1] C.C. Aggarwal, Content-based recommender systems, in: Recommender Systems, Springer, 2016, pp. 139–166.
[2] F. Ricci, L. Rokach, B. Shapira, Recommender systems: introduction and challenges, in: Recommender Systems Handbook, Springer, 2015, pp. 1–34.
[3] J. Bobadilla, F. Ortega, A. Hernando, A. Gutiérrez, Recommender systems survey, Knowl.-Based Syst. 46 (2013) 109–132.
[4] L. Chen, G. Chen, F. Wang, Recommender systems based on user reviews: the state of the art, User Model. User-Adapt. Interact. 25 (2) (2015) 99–154.
[5] J. Lu, D. Wu, M. Mao, W. Wang, G. Zhang, Recommender system application developments: a survey, Decis. Support Syst. 74 (2015) 12–32.
[6] R.A. Sherman, J.F. Rauthmann, N.A. Brown, D.G. Serfass, A.B. Jones, The independent effects of personality and situations on real-time expressions of behavior and emotion, J. Pers. Soc. Psychol. 109 (5) (2015) 872.
[7] E. Komulainen, K. Meskanen, J. Lipsanen, J.M. Lahti, P. Jylhä, T. Melartin, M. Wichers, E. Isometsä, J. Ekelund, The effect of personality on daily life emotional processes, PLoS One 9 (10) (2014) e110907.


[8] Y.R. Tausczik, J.W. Pennebaker, The psychological meaning of words: LIWC and computerized text analysis methods, J. Lang. Soc. Psychol. 29 (1) (2010) 24–54.
[9] S. Poria, A. Gelbukh, B. Agarwal, E. Cambria, N. Howard, Common sense knowledge based personality recognition from text, in: Mexican International Conference on Artificial Intelligence, Springer, 2013, pp. 484–496.
[10] N. Majumder, S. Poria, A. Gelbukh, E. Cambria, Deep learning-based document modeling for personality detection from text, IEEE Intell. Syst. 32 (2) (2017) 74–79.
[11] M.A.S. Nunes, R. Hu, Personality-based recommender systems: an overview, in: Proceedings of the Sixth ACM Conference on Recommender Systems, ACM, 2012, pp. 5–6.
[12] M. Tkalčič, L. Chen, Personality and recommender systems, in: Recommender Systems Handbook, Springer, 2015, pp. 715–739.
[13] A. Roshchina, J. Cardiff, P. Rosso, TWIN: personality-based intelligent recommender system, J. Intell. Fuzzy Syst. 28 (5) (2015) 2059–2071.
[14] P.J. Corr, G. Matthews, The Cambridge Handbook of Personality Psychology, Cambridge University Press, New York, 2009.
[15] B.P. O’Connor, A quantitative review of the comprehensiveness of the five-factor model in relation to popular personality inventories, Assessment 9 (2) (2002) 188–203.
[16] P.T. Costa, R.R. McCrae, Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI): Professional Manual, Psychological Assessment Resources, Incorporated, 1992.
[17] S. Argamon, S. Dhawle, M. Koppel, J.W. Pennebaker, Lexical predictors of personality type, in: Proceedings of the Joint Annual Meeting of the Interface and the Classification Society of North America, 2005.
[18] J. Oberlander, S. Nowson, Whose thumb is it anyway?: classifying author personality from weblog text, in: Proceedings of the COLING/ACL on Main Conference Poster Sessions, Association for Computational Linguistics, 2006, pp. 627–634.
[19] F. Mairesse, M.A. Walker, M.R. Mehl, R.K. Moore, Using linguistic cues for the automatic recognition of personality in conversation and text, J. Artificial Intelligence Res. 30 (2007) 457–500.
[20] T. Yarkoni, Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers, J. Res. Personal. 44 (3) (2010) 363–373.
[21] H.A. Schwartz, J.C. Eichstaedt, M.L. Kern, L. Dziurzynski, S.M. Ramones, M. Agrawal, A. Shah, M. Kosinski, D. Stillwell, M.E. Seligman, et al., Personality, gender, and age in the language of social media: The open-vocabulary approach, PLoS One 8 (9) (2013) e73791.
[22] V.L. Zammitto, Gamers’ Personality and Their Gaming Preferences (Ph.D. thesis), Communication, Art & Technology: School of Interactive Arts and Technology, 2010.
[23] N. Peever, D. Johnson, J. Gardner, Personality & video game genre preferences, in: Proceedings of the 8th Australasian Conference on Interactive Entertainment: Playing the System, ACM, 2012, p. 20.
[24] B. Braun, J.M. Stopfer, K.W. Müller, M.E. Beutel, B. Egloff, Personality and video gaming: Comparing regular gamers, non-gamers, and gaming addicts and differentiating between game genres, Comput. Hum. Behav. 55 (2016) 406–412.
[25] S. Kassin, Psychology, fourth ed., Prentice Hall, Upper Saddle River, NJ, 2004.
[26] D. Cervone, L.A. Pervin, Personality: Theory and Research, thirteenth ed., John Wiley & Sons, Inc., NJ, 2015.
[27] G.W. Allport, H.S. Odbert, Trait-Names: A Psycho-Lexical Study, in: Psychological monographs, Psychological Review Co, 1936.
[28] G. Matthews, I.J. Deary, M.C. Whiteman, Personality Traits, Cambridge University Press, Cambridge, UK, 2009.
[29] R.R. McCrae, P.T. Costa, Validation of a five-factor model of personality across instruments and observers, J. Pers. Soc. Psychol. 52 (1987) 81–90.
[30] G. Saucier, Mini-markers: A brief version of Goldberg’s unipolar big-five markers, J. Pers. Assess. 63 (3) (1994) 506–516.
[31] G.A. Neuman, S.H. Wagner, N.D. Christiansen, The relationship between work-team personality composition and the job performance of teams, Group Organ. Manag. 24 (1) (1999) 28–45.
[32] A.S. Holland, G.I. Roisman, Big Five personality traits and relationship quality: Self-reported, observational, and physiological evidence, J. Soc. Pers. Relat. 25 (5) (2008) 811–829.
[33] J.H. Park, D.H. Han, B.N. Kim, J.H. Cheong, Y.S. Lee, Correlations among social anxiety, self-esteem, impulsivity, and game genre in patients with problematic online game playing, Psychiatry Invest. 13 (3) (2016) 297–304.
[34] A. de Vette, M. Tabak, M. Dekker-van Weering, M.M. Vollenbroek-Hutten, Exploring personality and game preferences in the younger and older population: A pilot study, in: ICT4AgeingWell, 2016, pp. 99–106.
[35] R.M. Chory, A.K. Goodboy, Is basic personality related to violent and nonviolent video game play and preferences?, Cyberpsychol. Behav. Soc. Netw. 14 (4) (2011) 191–198.
[36] G. van Lankveld, P. Spronck, J. Van den Herik, A. Arntz, Games as personality profiling tools, in: Computational Intelligence and Games (CIG), 2011 IEEE Conference on, IEEE, 2011, pp. 197–202.
[37] A. Vinciarelli, G. Mohammadi, A survey of personality computing, IEEE Trans. Affect. Comput. 5 (3) (2014) 273–291.
[38] J.W. Pennebaker, L.A. King, Linguistic styles: language use as an individual difference, J. Pers. Soc. Psychol. 77 (6) (1999) 1296.
[39] F. Celli, Mining User Personality in Twitter, Language, Interaction and Computation Laboratory (CLIC), University of Trento, Italy, 2011.
[40] J. Golbeck, C. Robles, M. Edmondson, K. Turner, Predicting Personality from Twitter, in: Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on, IEEE, 2011, pp. 149–156.
[41] M. Zhu, X. Fang, A lexical approach to study computer games and game play experience via online reviews, Int. J. Hum.-Comput. Interact. 31 (6) (2015) 413–426.
[42] B. Ferwerda, M. Schedl, Enhancing Music Recommender Systems with Personality Information and Emotional States: A Proposal, in: M. Tkalcic, B.D. Carolis, M. de Gemmis, A. Odic, A. Košir (Eds.), Proceedings of EMPIRE 2014 - 2nd Workshop on Emotions and Personality in Personalized Services, CEUR, 2014, pp. 36–44.
[43] S.C. Guntuku, S. Roy, L. Weisi, Personality modeling based image recommendation, in: International Conference on Multimedia Modeling, Springer, 2015, pp. 171–182.
[44] R.P. Karumur, T.T. Nguyen, J.A. Konstan, Personality, user preferences and behavior in recommender systems, Inf. Syst. Front. (2017) 1–25.
[45] T.T. Nguyen, F.M. Harper, L. Terveen, J.A. Konstan, User personality and user satisfaction with recommender systems, Inf. Syst. Front. (2017) 1–17.
[46] Steam, Steam & Game Stats, http://store.steampowered.com/stats/ (Retrieved 21.01.18).
[47] Steam, Steam Web API Documentation, https://steamcommunity.com/dev (Retrieved 21.01.18).
[48] J.W. Pennebaker, R.L. Boyd, K. Jordan, K. Blackburn, The Development and Psychometric Properties of LIWC2015, Tech. rep., University of Texas at Austin, Austin, TX, 2015.
[49] Y. Wang, I.H. Witten, Inducing model trees for continuous classes, in: Proceedings of the 9th European Conference on Machine Learning Poster Papers, 1997, pp. 128–137.
[50] A. Roshchina, TWIN: Personality-Based Recommender System (Ph.D. thesis), Dublin, Ireland, 2012.
[51] F. Celli, F. Pianesi, D. Stillwell, M. Kosinski, Workshop on computational personality recognition: shared task, in: Proceedings of the Workshop on Computational Personality Recognition, 2013.
[52] G. Salton, M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, New York, 1983.
[53] M.F. Porter, An algorithm for suffix stripping, Program 14 (3) (1980) 130–137.
[54] J.L. Fleiss, Measuring nominal scale agreement among many raters, Psychol. Bull. 76 (5) (1971) 378–382.
[55] J.R. Landis, G.G. Koch, The measurement of observer agreement for categorical data, Biometrics (1977) 159–174.
