Using big data to advance personality theory

Wiebke Bleidorn1, Christopher J Hopwood1 and Aidan GC Wright2

Big data has led to remarkable advances in society. One of the most exciting applications in psychological science has been the development of computer-based tools to assess human behavior and personality traits. Thus far, machine learning approaches to personality assessment have focused on maximizing predictive validity but have been underused to advance our understanding of personality. In this paper, we review recent machine learning studies of personality and discuss recommendations for how big data and machine learning research can be used to advance personality theory.
Addresses
1 University of California, Davis, United States
2 University of Pittsburgh, United States

Corresponding author: Bleidorn, Wiebke ([email protected])
Current Opinion in Behavioral Sciences 2017, 18:79–82

This review comes from a themed issue on Big data in the behavioural sciences
Edited by Michal Kosinski and Tara Behrend
http://dx.doi.org/10.1016/j.cobeha.2017.08.004
2352-1546/© 2017 Published by Elsevier Ltd.
In the digital age, people generate behavioral footprints nearly constantly. These footprints agglomerate into 'big data' that offer psychological researchers unprecedented opportunities for tracking, analyzing, and predicting human behavior. A guiding assumption of this kind of research is that psychological characteristics (e.g., traits) influence the particular ways in which individuals use digital services and act in online environments. Consequently, data about how individuals use digital services and act in online environments should in turn be predictive of users' psychological characteristics [1,2]. To test this hypothesis, researchers have begun to use machine learning approaches to predict users' personality characteristics from their digital footprints, such as Facebook likes [3] or Twitter profiles [4]. Most of this work has focused on developing reliable estimates of the 'Big Five' personality traits: neuroticism, extraversion, openness to
experience, agreeableness, and conscientiousness [5]. Identifying markers of these traits in big data has significant potential for furthering research on the structure and development of personality across languages and cultures. In this paper, we outline how this potential can be more fully achieved by situating machine learning research within a construct validation framework of test and theory development, with a particular focus on the content validity of computer-based assessments [6–9].
Construct validation and big data

Construct validation emphasizes the bidirectional relationship between test development and theory development. Any scientific theory must be operationalized so that its variables can be measured and used in experiments. Any particular measure that operationalizes the variables in a theory will be imperfect. Thus, evidence that a measure is not performing as expected could mean that there is something inaccurate about the theory or that there is something inadequate about the measure (cf. [10,11]). From this perspective, establishing content validity amounts to connecting a test with the theoretical variable that the test is meant to measure.

Figure 1 depicts the bidirectional relationship between a latent or theoretical concept and a manifest or measured variable. The solid arrow from theory to test development represents the half of the theory-test relationship focused on the degree to which tests adequately operationalize theories. This arrow is labeled 'prediction' because tests that adequately operationalize theories should be more effective for predicting behavior. For instance, an adequate personality measure should be able to predict individual differences in personality traits in a manner that corresponds to previous research on how individuals differ as assessed by other instruments. The dashed arrow in Figure 1 from test to theory development represents the other half of the relationship. This arrow is labeled 'understanding' because it focuses on how the development and refinement of a personality measure's content can provide new insights about personality theory.

So far, machine learning research has predominantly focused on the 'prediction' of personality differences [12]. These studies generally maximize the convergence between computer-based and other (typically questionnaire) measures of personality traits.
This process assumes the content validity of the other measure in the absence of any theoretical concerns. This approach stands in contrast to deductive/iterative approaches to establishing content validity in construct validation. Normally, researchers first define a universe of indicators based on their understanding of the latent constructs they are trying to measure, sample systematically within this universe to develop the test, and then refine the content of the measure based on additional validity data [13].

In summary, previous machine learning studies have predominantly focused on the 'prediction' of personality differences. Machine learning algorithms have rarely been used to further our understanding of personality structure, processes, and development, as indicated by the dashed 'understanding' arrow in Figure 1. As such, machine learning research on personality has focused on only one half of the construct validation process. We see this as an appreciable gap in this line of work, and believe that there is significant potential, because of some of the unique features of big data, to use machine learning to develop new insights about personality through an enhanced focus on content validity.

Figure 1. Construct validation. (Diagram: a solid 'prediction' arrow runs from theory to test, and a dashed 'understanding' arrow runs from test to theory, linked through content validity.)

What can big data tell us about personality?

We focus on three broad areas in which big data could inform personality theory, all of which would require a more thorough investigation of the content validity of computer-based personality measures: (1) the delimitation of trait content, (2) the articulation of how developmental processes impact personality measures, and (3) the identification of culture-specific personality markers. Fundamental to each of these domains is the distinction between manifest indicators and latent variables. In Figure 1, the test represents the manifest indicators of a latent variable or set of variables as indicated by the theory. As described above, from a construct validation perspective, establishing content validity essentially amounts to closing the gap between manifest (measurement) and latent (theoretical) variables.

Trait content

Thus far, little research has been done to establish or evaluate content validity in big data personality measures. For example, Kosinski, Stillwell, and Graepel (2013) found that Facebook users' personality traits can be predicted with a high degree of accuracy from their likes. Facebook likes allow users to connect with objects that have an online presence (e.g., products, movies, etc.) and are shared with the public or among Facebook friends to express support or indicate individual preferences [14]. Some of the most highly predictive likes seemed face valid and tied in with previous research, as in the case of 'Cheerleading' and high extraversion. Yet many other highly predictive likes were rather elusive and raise questions concerning the measure's content validity, as in the case of 'Getting Money' and low neuroticism. The relatively unexplored content validity of computer-based personality measures complicates the interpretation of findings that are based solely on these scales and constrains the degree to which these measures can be used to advance personality theory. As we outline below, considerations of content validity can be particularly fruitful for theory precisely in those cases where the content bears limited apparent relation to the trait [8].
The content of the Big Five was generated lexically [15]. Early personality researchers assumed that most of the important trait-relevant information would be contained within language, because an important function of language is to communicate what people are like and how they differ from each other. This information was combined empirically, primarily through factor analyses of trait descriptors derived from the lexicon. Decades of research led to the general consensus that the Big Five represent broad traits that capture how more specific behaviors, thoughts, and feelings tend to covary in the population (e.g., [16]). Yet the emerging structure and content of traits inevitably depend on the universe of items that were considered.

Big data and machine learning approaches have the potential to broaden and refine our understanding of the structure and content of the Big Five. A unique feature of big data is that they are wide-ranging and inductive. The relatively unconstrained access to digital traces of personality allows researchers to detect and include personality indicators that might not have been conceived of by lexical or deductive approaches. In other words, big data offer researchers access to a new lexicon that contains a wealth of data that are often personal and otherwise difficult to assess [2,17]. To the degree that these data contain new and hitherto unexplored trait-relevant content, they could greatly advance our understanding of the specific ways in which personality traits manifest in online environments and beyond.
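The factor-analytic step at the heart of the lexical approach can be illustrated with a minimal sketch on entirely synthetic data. The two 'traits', the loading pattern, and all numerical values below are illustrative assumptions, not results from any study reviewed here.

```python
# Minimal sketch (synthetic data): ratings on six trait descriptors are
# generated by two hypothetical latent traits, and factor analysis is
# used to recover the broad-trait structure, mirroring how lexical Big
# Five research combined trait descriptors empirically.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_people = 500

# Two hypothetical latent traits (e.g., "extraversion", "conscientiousness")
latent = rng.normal(size=(n_people, 2))

# Descriptors 0-2 load on the first trait, descriptors 3-5 on the second
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.85, 0.0],
                     [0.0, 0.9], [0.1, 0.8], [0.0, 0.85]])
ratings = latent @ loadings.T + 0.3 * rng.normal(size=(n_people, 6))

# Varimax rotation yields a simple structure comparable to the one above
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(ratings)
print(np.round(fa.components_, 2))  # recovered descriptor-by-factor loadings
```

In real lexical research, the descriptor pool spans thousands of adjectives rather than six variables; the point of the sketch is only that the recovered factors depend entirely on the universe of descriptors fed into the analysis.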
However, the availability of innumerable digital records also creates strong pressure to reduce and cross-validate the information culled from big data, because many variables will turn out to be unreliable or irrelevant indicators. A significant emphasis in machine learning research is on trimming data and cross-validating algorithms, but these procedures are exclusively inductive [18]. Rarely are issues of content validity discussed.

A related issue concerns the degree to which digital indicators of personality traits are valid over time. Preferences as expressed by Facebook likes may go in and out of favor over time. For example, Kosinski et al. (2013) found that liking 'Mitt Romney' was predictive of agreeableness, emotional stability, and conscientiousness among Facebook users sampled between 2007 and 2012 [14]. It remains open whether this and other indicators that were closely tied to particular trends or historic events would replicate in more recent data.
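The reduce-and-cross-validate workflow described above can be sketched as follows. This is a hypothetical pipeline on synthetic data: the user-by-like matrix, the 'extraversion' criterion, and the choice of truncated SVD followed by ridge regression are illustrative assumptions rather than the exact procedure of any study cited here.

```python
# Hypothetical sketch (synthetic data): a sparse user-by-like matrix is
# reduced via truncated SVD, and a linear model predicting questionnaire
# trait scores is evaluated with cross-validation so that unreliable or
# irrelevant indicators do not inflate apparent accuracy.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n_users, n_likes = 400, 2000

# Binary-ish like matrix; most entries are zero, as with real likes
likes = sparse_random(n_users, n_likes, density=0.02, format="csr",
                      random_state=1, data_rvs=lambda n: np.ones(n))

# Synthetic "extraversion" score driven by the first 50 likes plus noise
extraversion = (2.0 * np.asarray(likes[:, :50].sum(axis=1)).ravel()
                + rng.normal(size=n_users))

model = make_pipeline(TruncatedSVD(n_components=50, random_state=1),
                      Ridge(alpha=1.0))
# Out-of-sample R^2 against the questionnaire criterion, fold by fold
scores = cross_val_score(model, likes, extraversion, cv=5, scoring="r2")
print(np.round(scores, 2))
```

Note that this entire workflow is inductive in exactly the sense criticized above: nothing in it asks whether the retained dimensions are theoretically meaningful indicators of the trait.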
Personality development

Because development can influence the way a trait is expressed, identifying trait indicators that are similarly effective across the lifespan has posed a significant challenge for personality researchers [19]. Given that personality traits reflect underlying consistencies that can nevertheless be expressed differently at different ages, some valid indicators of personality will tend to be age-general whereas others will tend to be age-specific. Machine learning research has the potential both to improve developmentally sensitive personality assessment, permitting enhanced prediction of behavior at different life stages, and to contribute to a better theoretical understanding of personality development by identifying and distinguishing age-general and age-specific indicators.

For example, in a sample of 80-year-olds, a preference for 'Taylor Swift' as expressed by Facebook likes may exemplify exceptional curiosity and indicate high openness. In a sample of 20-year-olds, however, liking Taylor Swift may be normative and unrelated to openness [20]. In this scenario, liking Taylor Swift would be an age-specific indicator of openness, and including this indicator in an openness test would improve behavioral predictions among 80-year-olds but not 20-year-olds. To the degree that digital footprints can be used to predict personality differences across age groups, machine learning research can help develop measures that are more effective across the lifespan. Likewise, the detection of indicators that are highly predictive in certain age groups (but not in others) could inform the development of age-specific personality tests.

Big data can also be used to inform our understanding of how personality changes across the lifespan. For example, Kern et al. (2013) found age-graded decreases in the use of negative emotion words (e.g., 'hate' and 'bored') in a sample of 70,000 Facebook users [21]. This age trend in
social media language use may reflect normative decreases in trait neuroticism across adulthood [22,23]. We note that machine learning studies have thus far been restricted to cross-sectional comparisons of age groups. It is impossible to distinguish cohort, time, and developmental effects using cross-sectional data. To test whether latent personality traits actually change over time, future machine learning research will need to examine age-related trends longitudinally (e.g., using time-stamped digital records).
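One way to separate age-general from age-specific indicators, sketched here on synthetic data, is to estimate an indicator's association with a trait separately within each age group. The age groups, effect sizes, and the 'like' itself are purely illustrative assumptions.

```python
# Illustrative sketch (synthetic data): a binary "like" indicator is
# related to openness among older users but not younger users, so its
# trait association differs sharply between age groups.
import numpy as np

rng = np.random.default_rng(2)

def simulate_group(n, slope):
    """Simulate openness scores and a like whose probability depends on
    openness with the given slope (slope=0 means no trait signal)."""
    openness = rng.normal(size=n)
    p_like = 1.0 / (1.0 + np.exp(-(slope * openness - 1.0)))
    like = (rng.random(n) < p_like).astype(float)
    return openness, like

open_old, like_old = simulate_group(2000, slope=1.5)      # age-specific signal
open_young, like_young = simulate_group(2000, slope=0.0)  # normative, no signal

r_old = np.corrcoef(like_old, open_old)[0, 1]
r_young = np.corrcoef(like_young, open_young)[0, 1]
print(round(r_old, 2), round(r_young, 2))  # association only in the older group
```

An indicator whose association replicates across groups would be a candidate age-general item; one that holds only within a group would be a candidate for an age-specific test.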
Personality across cultures

Culture also influences the ways in which personality differences manifest [24,25]. Whereas being late is a reliable indicator of (low) conscientiousness in more punctual cultures, it may be the norm in more relaxed groups and therefore not discriminate between people who are high versus low in conscientiousness [26]. A potential mismatch between manifest and latent variables poses a significant challenge and source of debate in cross-cultural psychology. When different indicators are needed across two cultures, does this mean that the same latent traits are expressed differently, or that personality fundamentally differs across cultures?

Several methodological issues complicate the interpretation of cross-cultural comparisons using self- or peer-report personality measures. For example, it is often not clear whether items are interpreted in the same way in different cultures or whether individuals in different cultures compare themselves to different standards when rating themselves or others [27,28]. Behavior-based machine learning measures are less prone to such methodological issues [29] and could greatly advance research on cross-cultural personality differences. To the degree that there are digital records that predict personality differences across cultures, machine learning research can help develop culture-free personality measures. Likewise, the detection of indicators that are highly predictive in certain cultures (but not in others) could inform the development of culture-specific tests.
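A simple way to probe whether indicators are culture-general or culture-specific, again using synthetic data with purely illustrative 'cultures' and effect sizes, is to train a model in one culture and evaluate it in another: a sharp drop in out-of-culture accuracy points to culture-specific indicators.

```python
# Hypothetical sketch (synthetic data): one indicator predicts the trait
# in both cultures, two predict it only in culture A. A model trained in
# culture A then transfers poorly to culture B, flagging the
# culture-specific indicators.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(3)

def simulate_culture(n, weights):
    """Simulate three behavioral indicators and a trait score that is a
    weighted sum of the indicators plus noise."""
    indicators = rng.normal(size=(n, 3))
    trait = indicators @ weights + rng.normal(scale=0.5, size=n)
    return indicators, trait

# Indicator 0 is culture-general; indicators 1-2 matter only in culture A
X_a, y_a = simulate_culture(1000, np.array([1.0, 1.0, 1.0]))
X_b, y_b = simulate_culture(1000, np.array([1.0, 0.0, 0.0]))

model = LinearRegression().fit(X_a, y_a)
print("within-culture R^2:", round(r2_score(y_a, model.predict(X_a)), 2))
print("cross-culture R^2:", round(r2_score(y_b, model.predict(X_b)), 2))
```

Comparable accuracy across cultures would instead support culture-free measurement with the same indicator set.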
Moving forward

In summary, whereas machine learning research has thus far focused almost exclusively on predicting human behavior using an established model of personality, we emphasize the potential for machine learning assessments of personality traits derived from big data to contribute to a deeper understanding of personality. We encourage machine learning researchers to examine more carefully the content of their algorithms, not only in terms of the ability to cross-validate, but also in terms of the theoretical importance of indicators vis-à-vis the traits they represent [30]. Are these indicators intuitive, from the perspective of existing personality measures, in the sense that they could be translated into items similar to those that already exist on a trait questionnaire? Or are they surprising, and potential indicators of behavior that has
not been regarded as falling within the content domain of personality traits? Do the same indicators provide reliable estimates of traits across developmental stages and cultures? Or do they differ, suggesting the opportunity to develop new insights into how traits are expressed differently across groups? More careful consideration of the content of machine learning algorithms can do more than give the field powerful new tools for prediction; it can help answer some of the most puzzling questions in contemporary personality theory, involving the boundaries of personality traits, personality development, and cultural influences on personality.
Conflict of interest statement
Nothing declared.
References

1. Back MD, Stopfer JM, Vazire S, Gaddis S, Schmukle SC, Egloff B, Gosling SD: Facebook profiles reflect actual personality, not self-idealization. Psychol Sci 2010, 21:372-374.

2. Kosinski M, Matz SC, Gosling SD, Popov V, Stillwell D: Facebook as a research tool for the social sciences: opportunities, challenges, ethical considerations, and practical guidelines. Am Psychol 2015, 70:543-556.
This paper offers an overview of opportunities and challenges associated with using Facebook for research, provides practical guidelines on how to implement studies on Facebook, and discusses ethical considerations.

3. Youyou W, Kosinski M, Stillwell D: Computer-based personality judgments are more accurate than those made by humans. Proc Natl Acad Sci U S A 2015, 112:1036-1040.
This study compares the accuracy of human and machine learning personality assessments based on Facebook likes.

4. Quercia D, Kosinski M, Stillwell D, Crowcroft J: Our Twitter profiles, our selves: predicting personality with Twitter. 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE; 2011:180-185.

5. John OP, Naumann L, Soto CJ: Paradigm shift to the integrative big five trait taxonomy. Handb Pers: Theory Res 2008, 3:114-158.

6. Cronbach LJ, Meehl PE: Construct validity in psychological tests. Psychol Bull 1955, 52:281-302.

7. Jackson DN: The dynamics of structured personality tests: 1971. Psychol Rev 1971, 78:229-248.

8. Loevinger J: Objective tests as instruments of psychological theory. Psychol Rep 1957, 3:635-694.

9. Tellegen A, Waller NG: Exploring personality through test construction: development of the Multidimensional Personality Questionnaire. In Handbook of Personality Theory and Testing: Vol. II. Personality Measurement and Assessment. Edited by Boyle GJ, Matthews G, Saklofske DH. Thousand Oaks, CA: Sage; 2008:261-292.

10. Wright AGC, Hopwood CJ: Advancing the assessment of dynamic psychological processes. Assessment 2016, 23:399-403.

11. Sellbom M, Hopwood CJ: Evidence-based assessment in the 21st century: comments on the special series papers. Clin Psychol: Sci Pract 2017, 23:403-409.

12. Wright AGC: Current directions in personality science and the potential for advances through computing. IEEE Trans Affect Comput 2014, 5:292-296.
This paper highlights current research topics in personality science and offers ideas for a greater integration of personality research and computer science.

13. Ackerman RA, Hands AJ, Donnellan MB, Hopwood CJ, Witt EA: Experts' views regarding the conceptualization of narcissism. J Pers Disord 2017, 31:346-361.

14. Kosinski M, Stillwell D, Graepel T: Private traits and attributes are predictable from digital records of human behavior. Proc Natl Acad Sci U S A 2013, 110:5802-5805.

15. Goldberg LR: An alternative 'description of personality': the big-five factor structure. J Pers Soc Psychol 1990, 59:1216-1229.

16. Mõttus R, Kandler C, Bleidorn W, Riemann R, McCrae RR: Personality traits below facets: the consensual validity, longitudinal stability, heritability, and utility of personality nuances. J Pers Soc Psychol 2017, 112:474-490.
This study is a recent example of contemporary research that applied principles of construct validation to examine the structure and content of Big Five personality traits.

17. Schwartz HA, Eichstaedt JC, Kern ML, Dziurzynski L, Ramones SM, Agrawal M, Ungar LH: Personality, gender, and age in the language of social media: the open-vocabulary approach. PLoS ONE 2013, 8:e73791. http://dx.doi.org/10.1371/journal.pone.0073791.

18. Kosinski M, Wang Y, Lakkaraju H, Leskovec J: Mining big data to extract patterns and predict real-life outcomes. Psychol Methods 2016, 21:493-506.
This paper discusses essential tools that can be used to mine large data sets and build predictive models using these data.

19. Caspi A, Roberts BW, Shiner RL: Personality development: stability and change. Annu Rev Psychol 2005, 56:453-484.

20. Schwaba T, Luhmann M, Denissen JJA, Chung JM, Bleidorn W: Openness to experience and culture-openness transactions across the lifespan. J Pers Soc Psychol 2017. http://dx.doi.org/10.1037/pspp0000150. (in press).

21. Kern ML, Eichstaedt JC, Schwartz HA, Park G, Ungar LH, Stillwell DJ, Kosinski M, Dziurzynski L, Seligman ME: From 'Sooo excited!!!' to 'So proud': using language to study development. Dev Psychol 2014, 50:178-188.

22. Roberts BW, Walton KE, Viechtbauer W: Patterns of mean-level change in personality traits across the life course: a meta-analysis of longitudinal studies. Psychol Bull 2006, 132:1-25.

23. Bleidorn W, Hopwood CJ: Stability and change in personality traits over the lifespan. In Handbook of Personality Development. Edited by McAdams D, Shiner R, Tackett J. 2017.

24. Church AT: Personality traits across cultures. Curr Opin Psychol 2016, 8:22-30.

25. Bleidorn W, Klimstra TA, Denissen JJA, Rentfrow PJ, Potter J, Gosling SD: Personality maturation around the world: a cross-cultural examination of Social Investment Theory. Psychol Sci 2013, 24:2530-2540.

26. Heine SJ, Buchtel EE, Norenzayan A: What do cross-national comparisons of personality traits tell us? The case of conscientiousness. Psychol Sci 2008, 19:309-313.

27. Bleidorn W, Arslan RC, Denissen JJA, Rentfrow PJ, Gebauer JE, Potter J, Gosling SD: Age and gender differences in self-esteem: a cross-cultural window. J Pers Soc Psychol 2016, 111:396-410.

28. Heine SJ, Buchtel EE: Personality: the universal and the culturally specific. Annu Rev Psychol 2009, 60:369-394.

29. Youyou W, Stillwell D, Schwartz HA, Kosinski M: Birds of a feather do flock together: behavior-based personality-assessment method reveals personality similarity among couples and friends. Psychol Sci 2017, 28:276-284.
This study is an example of a recent attempt to use machine learning personality assessment to test theoretical hypotheses concerning personality similarity among romantic partners and friends.

30. Lazer D, Kennedy R, King G, Vespignani A: The parable of Google Flu: traps in big data analysis. Science 2014, 343:1203-1205.