Rose is a rose is a rose: Content and construct validity


Available online at www.sciencedirect.com

Personality and Individual Differences 45 (2008) 110–112 www.elsevier.com/locate/paid

Short Communication

Rose is a rose is a rose: Content and construct validity

Marvin Zuckerman *

Department of Psychology, University of Delaware, Newark, DE 19716-2577, USA

Received 4 December 2007; received in revised form 4 February 2008; accepted 18 February 2008
Available online 1 April 2008

Abstract

This is a response to Roth, Hammelstein, and Brähler’s (2007) article, “Beyond a youthful behavior style—Age and sex differences in sensation seeking based on need theory”. The authors developed a new scale for sensation seeking because they claim that the behavioral items in Zuckerman’s (1979, 1994, 2007) Sensation Seeking Scale (SSS) do not reflect a need for varied, novel, intense, and complex sensations. This criticism of the SSS concerns content validity rather than construct validity. Content validity cannot substitute for construct validity, which is measured by the relationships between a scale and external criteria. “Need” was part of an older definition of sensation seeking but was dropped because the construct of need as a secondary drive is not relevant to the trait concept of sensation seeking. Tolerance for risk is part of the definition of sensation seeking, but sensation seeking covers a broader range of phenomena, including non-risky preferences. Regardless of how specific their item content is, the different sensation seeking scales all show age declines. A “youthful behavior style”, as reflected in the content of the items, does not produce the age changes in the trait itself.

© 2008 Published by Elsevier Ltd.

Keywords: Validity; Personality assessment; Sensation seeking

* Present address: 1500 Locust Street, Apt. 4013, Philadelphia, PA, United States. Tel.: +1 215 732 2408. E-mail address: [email protected]
0191-8869/$ - see front matter © 2008 Published by Elsevier Ltd. doi:10.1016/j.paid.2008.02.014

This article responds to Roth, Hammelstein, and Brähler’s (2007) article on age and sex differences in sensation seeking. However, it first addresses a larger issue in personality assessment before turning to the specifics of the Roth et al. study.

Cronbach and Meehl (1955) distinguished four types of validity, including content and construct validity. Content validity is whether or not the test samples the meanings implicit in the construct the test constructor is trying to assess. Construct validity is the extent to which the test is related to other presumed measures of the construct or predicts behavior deduced from the theory underlying the test construct. Construct validity requires a study of the test in many kinds of prediction of phenomena external to the test itself. If the test fails to predict some phenomenon, then either the test or the theory of what is being measured by the test must be reevaluated. A new test is like a hypothesis based on content alone until its construct validity is defined by many studies. But as a preliminary step it should be related to more established tests to see if it is redundant as a measure of the construct. If it claims to be a better measure, then it must show how it surpasses the older measure in prediction based on the theory of the construct.

Content validity alone is not sufficient as a definition of the construct being measured. If content alone were the criterion, the kind of ad hoc “test yourself” scale that appears in magazines would be valid. In fact, test items can be developed by empirical correlation with external criteria, as was done for the MMPI, without regard to the meaning of the items selected. Whether one chooses to focus on obvious content like behavior, preferences, or attitudes theoretically related to the construct being assessed is not crucial. The most direct content is a simple self-rating, assuming the respondent shares your meaning of the trait. For instance, if you want to measure impulsivity you might use a simple item: “I am an impulsive person”. The problem with such items as the sole content of scales is that there is no specification of the situation or the expression of the trait. Questionnaire trait measures usually include some behavioral or attitudinal expression of the trait such as “I often act on


an impulse without thinking ahead or planning”. Introversion-extraversion items may ask about behavior at parties or preferences for company or solitude. There should be some adequate content sampling of behavior, preference, and attitude in any test that is based on a construct.

Many new tests use fewer items and achieve variation by using Likert-type item response ratings instead of “true-false” options. Such scales can achieve good internal reliability, but often at the expense of content variation. The more alike the items are, the higher the item intercorrelations and the reliability, but the less likely it is that the overall scale will truly represent the broader content of the construct. What may result is a highly reliable definition of a narrower trait than was intended by the test constructor. Of course there is no necessary connection between the Likert type of item response and the breadth of the content in the scale. However, many test constructors use the Likert-type scale to obtain an extended range of scores from a small number of items, and thereby limit the breadth of the scale.

The personality literature is full of studies using newly developed ad hoc scales to assess some construct and then using them to predict some natural behavior or experimental outcome. Failures of prediction are often dumped into the file drawer, never to see the light of publication. Successes, no matter how weak as long as they are statistically significant, are proudly displayed as proof of the validity of the measure and the theory predicting the outcome. Sometimes the test is given a new label, even if many of the items are taken from older measures and the test correlates as highly with these more established measures as their reliabilities allow. Block (1995) called this the “jingle-jangle” phenomenon.
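The earlier point about item similarity and reliability can be made concrete with the standardized form of Cronbach's alpha, which depends only on the number of items and their mean intercorrelation. The sketch below is illustrative only: the function name and all input values are hypothetical and are not taken from the SSS, the NISS, or any other real scale.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized Cronbach's alpha from item count and mean inter-item correlation."""
    if n_items < 2:
        raise ValueError("alpha is defined for two or more items")
    return n_items * mean_r / (1.0 + (n_items - 1) * mean_r)


# Hypothetical scales; none of these numbers come from a real instrument.
narrow_short = standardized_alpha(n_items=10, mean_r=0.50)  # very similar items
broad_short = standardized_alpha(n_items=10, mean_r=0.15)   # varied content
broad_long = standardized_alpha(n_items=40, mean_r=0.15)    # varied content, more items

for label, alpha in [
    ("narrow content, 10 items", narrow_short),
    ("broad content, 10 items", broad_short),
    ("broad content, 40 items", broad_long),
]:
    print(f"{label}: alpha = {alpha:.2f}")
```

With these made-up inputs, ten highly similar items reach an alpha of about .91, ten heterogeneous items reach only about .64, and it takes forty heterogeneous items to reach about .88. A short scale therefore gains reliability most easily by narrowing its content.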
New scales closely resembling sensation seeking have been variously labeled “novelty seeking”, “arousal seeking”, “thrill seeking”, “experience seeking”, “excitement seeking”, “venturesomeness” and “fun seeking”. To quote Gertrude Stein (1913): “Rose is a rose is a rose is a rose” (even if you call it “a tulip”).

The original Sensation Seeking Scale (SSS) was based on the idea of individual differences in “optimal” levels of stimulation and arousal. This was translated into items describing a desire or intention to engage in activities that provided unusual or novel sensations. In the Thrill and Adventure Seeking (TAS) subscale the items did not ask about actual experience in such activities, only the desire to engage in them. Most people never engage in any of them, except perhaps downhill skiing. But with age the actual desire to engage in them diminishes (I can personally testify to this; I now regard my younger interest in sky-diving as insane). Other types of sensation seeking, described within the Experience Seeking (ES) and Disinhibition (Dis) subscales, are well within common experience. Those in the Dis scale may be socially or legally risky, but many of those in ES are not risky at all. They merely describe preferred types of entertainment (Zuckerman, 2006). The definition of sensation seeking as the seeking of varied, novel, and intense stimuli and the willingness to take risks “for the sake of


such experience” (Zuckerman, 1994) does not make risk an integral part of the definition but stresses the positive attraction to exciting stimulation as the main factor. Sensation seekers do engage in many kinds of sensation seeking that are risky (Zuckerman, 2007), but most of their day-to-day choices and behaviors are varied but not risky.

Roth, Hammelstein, and Brähler (2007) developed a new scale to measure sensation seeking as a “need” rather than a behavioral trait. The idea of “needs” was first developed by Henry Murray (1938) as a directional and energenic motivation (or drive) defined by its goals rather than its physiological origins. He provided a large catalogue of traits, all called “needs”. One of his needs, “sentience”, was directed toward the enjoyment of sensations. Oddly enough this one, along with sex, was regarded as “viscerogenic”, a primary need based on biological tensions. However, most of the needs in his list were described as “psychogenic”, that is, secondary needs that develop indirectly from an association with primary needs.

My earlier definition of sensation seeking described it as “...a need for varied, novel, and complex sensations and experiences...” (Zuckerman, 1979). Later, two changes were made to the definition: the substitution of “seeking” for “need” and the inclusion of “intensity” as an additional quality of stimulation influencing sensation seeking (Zuckerman, 1994). The latter change was a consequence of evidence from 40 years of behavioral and physiological studies of the trait. The change from “need” to a more behavioral concept (“seeking”) was simply due to the conclusion that the construct of “need” adds little or nothing to a behavioral trait definition. In nearly all needs, as in the need for Achievement (nAch; McClelland, 1985), one can remove the n with no damage at all to the underlying Ach construct. In the Costa and McCrae (1992) Big-Five, for instance, “achievement striving” is a facet of the broader trait of conscientiousness.
Need implies a recurring state of internal arousal related to the goal, followed by a decrease in arousal when the goal is reached. Whereas this may apply to sex, I do not believe it characterizes sensation seeking or nearly any other trait. Some traits may be regarded as motives, but all are best described as “correlated habits of reaction” that are relatively consistent over time and manifested in particular classes of situations (Zuckerman, 1991). It is not parsimonious to include an additional theoretical assumption unless it is required by your theory.

Roth et al. developed a short 17-item scale containing two minimally related factors: need for stimulation (NS) and avoidance of rest (AR). This scale, the Need Inventory of Sensation Seeking (NISS), was only “submitted for publication” at the time their article was published, so all the available information on it comes from that article. Apparently NS is moderately correlated with the SSS Total score (r = 0.54), whereas AR is not correlated at all with the SSS. The magnitude of the NS-SSS correlation does not indicate identity (correcting for reliability would raise it to 0.64), but this might be because of the limited range of content in the NISS.
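The step from r = 0.54 to 0.64 can be reconstructed with the classical Spearman correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below assumes that this is the correction that was applied; the text does not report the reliabilities themselves, so the code only back-calculates the product those two figures imply. The function names are hypothetical, and the equal split of reliability between the two scales is purely illustrative.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman correction: observed r divided by sqrt of the reliability product."""
    return r_xy / math.sqrt(rel_x * rel_y)


def implied_reliability_product(r_observed: float, r_corrected: float) -> float:
    """Reliability product that would turn r_observed into r_corrected."""
    return (r_observed / r_corrected) ** 2


# Figures quoted in the text for the NS-SSS correlation.
implied = implied_reliability_product(0.54, 0.64)

# Assumption for illustration only: both scales are equally reliable.
equal_rel = math.sqrt(implied)

print(f"implied reliability product: {implied:.3f}")
print(f"equal-reliability case: {equal_rel:.3f} per scale")
print(f"round trip: {disattenuate(0.54, equal_rel, equal_rel):.2f}")
```

Under that assumption the two reported values imply a reliability product of roughly 0.71. The square root of the same product, about 0.84, is the ceiling on the correlation that could be observed between the two scales even if they measured an identical construct.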



The NISS was developed as a better measure of need for sensation than the SSS and Arnett’s (1994) Inventory of Sensation Seeking (AISS) because the latter two contain items characteristic of a “youthful behavioral style”. On the basis of this supposed content deficiency they argue that “we cannot be sure the SSS-V actually measures sensation seeking or merely an age-related activity status” (p. 1842). Their assumption seems to be that any scale with content correlating with age is not measuring the construct it purports to measure. This is not a coherent conclusion. “Youthful behavioral style” is not a personality trait or even a type, but a relationship of certain traits to age in some populations. In every age group there is variation in traits like sensation seeking and conscientiousness, but the mean levels tend to change with age even if individuals’ relative standing on the trait remains consistent. The claim that the SSS is measuring a disposition expressed in the seeking of novel and intense sensations and experiences is supported by over four decades of research (Zuckerman, 1979, 1994, 2007).

Actually, the items of a new version of the SSS, Impulsive Sensation Seeking (ImpSS; Zuckerman, 2002, in press), do not contain specific activities. In a randomly selected sample from the mid-Atlantic region of the USA, ImpSS showed strong age declines for both men and women between the ages of 18 and 65+ (McDaniel & Zuckerman, 2003). Aluja et al. (2006) developed a short form of the ZKPQ. The 10-item ImpSS scale was significantly and negatively related to age in Germany, Spain, and Switzerland. There were also significant gender and country effects. Men score higher than women on the long as well as the short form of the ImpSS. Similar differences have been found on the Total SSS in America, Sweden, Spain and Japan (Zuckerman, 1994). SSS Total declines with age in England and Australia.
In sum, whether one uses scales with specific behavioral activities in their items, like the SSS, or scales with only general references to “excitement” or “variety”, like the NISS and ImpSS, the age and gender effects are the same. This suggests that these differences represent more than changes in a “youthful behavioral style”. Perhaps the age and gender differences have something to do with the biological correlates of sensation seeking, like testosterone and the enzyme monoamine oxidase, that also change with age and are quantitatively different in men and women (Zuckerman, 1994). The outcome of the Roth et al. study using their own NISS scale, containing no “youthful behavior” items,

shows the same age decline and gender differences as shown in the SSS and AISS. Presumably like the SSS and the AISS it will predict the same forms of risky and non-risky types of behavior and preferences. The problem with “jingle-jangle” is that it wastes research energies that could better be spent investigating new areas with already construct validated techniques of assessment.

References

Aluja, A., Rossier, J., Garcia, L. F., Angleitner, A., Kuhlman, M., & Zuckerman, M. (2006). A cross-cultural shortened form of the ZKPQ (ZKPQ-50-CC) adapted to English, French, German, and Spanish languages. Personality and Individual Differences, 41, 619–628.
Arnett, J. (1994). Sensation seeking: A new conceptualization and a new scale. Personality and Individual Differences, 16, 289–296.
Block, J. (1995). A contrarian view of the five factor approach to personality description. Psychological Bulletin, 117, 187–215.
Costa, P. T., Jr., & McCrae, R. R. (1992). NEO-PI-R: Revised NEO Personality Inventory. Odessa, FL: Psychological Assessment Resources.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
McClelland, D. C. (1985). Human motivation. Glenview, IL: Scott Foresman.
McDaniel, S. R., & Zuckerman, M. (2003). The relationships of impulsive sensation seeking and gender to interest and participation in gambling activities. Personality and Individual Differences, 35, 1385–1400.
Murray, H. A. (1938). Explorations in personality. New York: Oxford University Press.
Roth, M., Hammelstein, P., & Brähler, E. (2007). Beyond a youthful behavior style—Age and sex differences in sensation seeking based on need theory. Personality and Individual Differences, 43, 1839–1850.
Stein, G. (1913). Sacred Emily.
Zuckerman, M. (1979). Sensation seeking: Beyond the optimal level of arousal. Hillsdale, NJ: Erlbaum.
Zuckerman, M. (1991). Psychobiology of personality (1st ed.). Cambridge, UK: Cambridge University Press.
Zuckerman, M. (1994). Behavioral expressions and biosocial bases of sensation seeking. New York: Cambridge University Press.
Zuckerman, M. (2002). Zuckerman–Kuhlman Personality Questionnaire (ZKPQ): An alternative five factorial model. In B. DeRaad, & M. Perugini (Eds.) Big five assessment (pp. 377–396).
Zuckerman, M. (2006). Sensation seeking in entertainment. Mahwah, NJ: Erlbaum.
Zuckerman, M. (2007). Sensation seeking and risky behavior. Washington, DC: American Psychological Association.
Zuckerman, M. (in press). Zuckerman–Kuhlman Personality Questionnaire (ZKPQ): An operational definition of the alternative five factorial model of personality. In G. J. Boyle, G. Matthews, & D. H. Saklofske (Eds.) Handbook of personality theory and assessment (Vol. 2, pp. 211–230). Los Angeles, CA: Sage.