Joint factors in self-reports and ratings: Neuroticism, extraversion and openness to experience


Pergamon Press Ltd

ROBERT R. MCCRAE* and PAUL T. COSTA JR

Section on Stress and Coping, National Institute on Aging, Gerontology Research Center, National Institutes of Health, Baltimore City Hospitals, Baltimore, MD 21224, U.S.A.

(Received 21 October 1982)

Summary: Although most dimensional theories of personality assume that the same traits can be assessed in either ratings or self-reports, joint factor analyses of data from these two methods have seldom provided clear evidence in support of this position. Previous analysts have found a preponderance of within-method factors, or have had to transform variables or use unorthodox rotational procedures in order to control the effects of method variance. The present study argues that recent conceptual and technical advances should now make it possible to show joint factors at the second-order level using standard factor techniques. The NEO Inventory and NEO Rating Form, which measure 18 traits in the domains of Neuroticism, Extraversion and Openness to Experience, were administered to a sample of 281 men and women. Varimax rotation of three principal components clearly showed the hypothesized structure within and across self-reports and spouse ratings. Convergent and discriminant validity of the joint factors with the EPI scales was also shown. The results suggest that the effects of method variance can be minimized if well-qualified raters use psychometrically adequate instruments to provide ratings of clearly conceptualized traits. In addition, they provide strong evidence for the validity of the proposed three-domain model of personality.

INTRODUCTION

If there are consistent individual differences in personality traits, and if these traits can be measured in a number of different ways, and if factor analysis can reveal the structure of personality, then a factor analysis of a broad range of traits assessed by different methods should show the basic dimensions of personality (Cattell and Saunders, 1950). Empirical attempts to confirm this conclusion in the past three decades have had limited success, leading some writers to question the premises on which it is based. An alternative interpretation of the literature would call attention to the limitations of methodology in previous studies, and the present article provides another empirical test, using new methods and measures which may allow a clearer test of the deduction.

When a set of personality traits is assessed in a single sample using two (or more) methods of measurement, a joint factor analysis can be performed. Several meaningful patterns of results may be found. Method factors might emerge, defined entirely by variables assessed by a single method. Parallel, but separate, factors might be found in each method; or entirely different personality structures might be seen in the two media. Alternatively, factors might be defined by variables from both methods, in which case we can say that the factors ‘cross’ the methods. Results of the joint analysis would provide strong support for the basic trait model only if all the factors were cross-media, and the defining variables of each factor were corresponding forms of the same traits.

The correlations to be factored in such an analysis form a multitrait-multimethod matrix, and joint factor analysis has the same goal of establishing convergent and discriminant validity as the analytic procedures discussed by Campbell and Fiske (1959). Indeed, factor analysis has been recommended as a formal method of evaluating multitrait-multimethod matrices (Jackson, 1969; Kenny, 1976). Historically, however, the two approaches seem to have had different applications. Whereas multimethod analyses are used for the validation of specific measures (e.g. Price and Eliot, 1975; Winne, Marx and Taylor, 1977), factor analyses have been tied to the identification of the major substantive dimensions of personality.
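
To make the structure of a multitrait-multimethod matrix concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the article: the data are simulated, and the sample size, the weights and the helper name `observe` are all illustrative assumptions. It builds the joint correlation matrix for three traits measured by two methods and reads off the same-trait, different-method correlations (convergent validities) and the different-trait, different-method correlations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; these are NOT the study's data.
n_subjects, n_traits = 300, 3

# Hypothetical latent trait scores, independent by construction.
traits = rng.standard_normal((n_subjects, n_traits))


def observe(trait_scores, method_weight, error_weight, rng):
    """Simulate one method: every trait plus a shared method factor plus error."""
    method = rng.standard_normal((trait_scores.shape[0], 1))  # shared within method
    error = rng.standard_normal(trait_scores.shape)
    return trait_scores + method_weight * method + error_weight * error


self_reports = observe(traits, 0.5, 0.7, rng)
ratings = observe(traits, 0.5, 0.7, rng)

# Joint matrix: the self-report variables first, then the rating variables.
mtmm = np.corrcoef(np.hstack([self_reports, ratings]), rowvar=False)

# Heteromethod block: rows are self-reported traits, columns are rated traits.
hetero = mtmm[:n_traits, n_traits:]
convergent = np.diag(hetero)                           # same trait, other method
discriminant = hetero[~np.eye(n_traits, dtype=bool)]   # other trait, other method

print("convergent validities:", np.round(convergent, 2))
print("mean heterotrait-heteromethod r:", round(float(discriminant.mean()), 2))
```

In this simulated setup the convergent values are expected to exceed the heterotrait-heteromethod values, which is the pattern a joint factor analysis needs in order to yield factors that cross methods.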

*To whom all reprint requests should be addressed.

Factor analyses of different response formats (Kenny, 1976), of different personality instruments (e.g. Golding and Knudson, 1975) and of personality questionnaires with measures of other psychological domains such as vocational interests (Seiss and Jackson, 1970) have frequently been reported. But all of these studies rely on the reports of the individual being assessed, whose concept of him- or herself may be systematically distorted, regardless of the instrument or response format used in assessing it. Far more informative are studies which seek factors common to two different observers. Concurrence between independent raters might be useful, although raters may share common stereotypes which inflate agreement without increasing accuracy (Bourne, 1977). A better test of the ‘principle of indifference of indicator’ (Cattell and Birkett, 1980) would be provided by a joint analysis of self-reports and observer ratings. Both of these traditional methods of personality assessment are liable to distortion, but, as has been discussed elsewhere (McCrae, 1982), the artifacts that may influence self-reports (acquiescence, extreme responding, defensiveness) are independent of the artifacts which may contaminate ratings (halo effects, stereotypes). Agreement between the two constitutes powerful evidence of consensual validation.

Surprisingly few such analyses have been published, and the success of the reported attempts has been disputed. In the 1950s and 1960s it was assumed that irrelevant but powerful effects of method distorted the results; later, the basic premise that personality traits were veridical was called into question (Becker, 1960; D’Andrade, 1965; Shweder, 1975). The failure to show cross-media factors seemed to imply that conceptions of individuals’ personality held by raters and by the individuals themselves were merely elaborate fictions, with no necessary relation to objective reality or to each other.

Recently a third possibility has been recognized that suggests that complete correspondence between factors in self-reports and ratings should not be expected. According to this view (Golding, 1977), what is normally dismissed as ‘method variance’ may in fact represent the real expression of traits as seen from a particular point of view. Personality as presented to others may differ systematically from personality as experienced. The notion of different traits on different levels of personality (interpersonal, intrapsychic, unconscious) is an explicit part of Leary’s (1957) theory of interpersonal behavior. In a recent test of that theory (Truckenmiller and Schaie, 1979) parallel factor analyses of ratings, self-reports and TAT productions were reported, but the researchers felt that a joint factor analysis was theoretically inappropriate.

Joint analyses of self-reports and ratings can thus provide support for any of three positions: they may show the same factors across both methods, the position most in keeping with classical trait models; they may show no joint factors, a possibility suggested by the view that traits are linguistic fictions; or they may show mixed results, perhaps suggesting that it is necessary to conceptualize personality on at least two levels, as observed and as experienced.

Cattell and his colleagues have been among those most committed to the discovery of cross-media factors. As part of the research program leading to the development of the 16PF, Cattell and Saunders (1950) offered one of the first joint factor analyses, combining self-reports, ratings and objective test data. Cattell, Pierson and Finkbeiner (1976) attempted to replicate the self-report/rating factors, and a recent paper by Cattell and Birkett (1980) discusses the alignment of objective test factors with second-order questionnaire factors. However, the conclusion that Cattell and his colleagues had succeeded in matching factors was disputed by Becker (1960). In his detailed reexamination of the early Cattell studies, Becker found little evidence for the convergence of self-reports and ratings at the primary trait level, although he did find some agreement at the second-order level. In his own self-report/rating study (Becker, Peterson, Hellmer, Shoemaker and Quay, 1959), in which Fels Parent Behavior Rating Scales were factored with Guilford Scales, correspondence was found for activity and sociability, but not for adjustment. Anticipating the criticisms of D’Andrade and Shweder, Becker speculated that “the common language structure of a group of self-raters and judges” could account for the regular appearance of factor structure within self-reports or ratings, but noted that “this meaning system bias would not necessarily lead to a matching of factor scores” (p. 209) and thus to jointly defined factors. He concluded that factors derived from personality questionnaires should be regarded as dimensions of the self-concept rather than dimensions of observable behavior.


Howarth (1976) suggested a different reason for the failure to match factors across media. He reviewed an extensive literature on a five-factor model of personality (Tupes and Christal, 1961; Norman, 1963) and argued that only six or seven factors should have been extracted from Cattell’s rating data, instead of the dozen that were retained. The initial misidentification of factors within media might account for the inability to find clear matches across media. In effect, this possibility had already been tested, when Norman (1969) published a joint analysis of four of these five dimensions of personality. Using pooled peer ratings, self-ratings and scales from a personality questionnaire on a sample of 169 fraternity men, Norman was able to match two factors across the three methods of measurement when Varimax rotation was used. He was unable to match all four factors except through the use of a Procrustes rotation, a procedure looked on with suspicion by many factor analysts (Gorsuch, 1974) when used for hypothesis testing.

Noting that most attempts at joint analyses resulted in the identification of method factors (that is, factors defined by variables from only one method of measurement), Jackson (1969) proposed a special form of analysis which he called “multimethod factor analysis”. He reasoned that the problem was not that correlations between the same traits across methods of measurement were too low, but that correlations between different traits within the same method were inflated by shared method variance. He suggested that method factors could be eliminated by the expedient of substituting zeros for the monomethod, heterotrait correlations. Factors in this procedure were based solely on the correlations between traits assessed in two different methods. Using this procedure, Jackson was able to show factors defined by self- and other ratings in a reanalysis of data from Kelly and Fiske (1951) as well as in the Personality Research Form.

Golding and Seidman (1974), however, took exception to this proposal, pointing out that the matrices Jackson proposed were artificial, had undesirable mathematical properties and inappropriately removed the shared substantive variance between traits as well as the shared method variance. They proposed a two-step procedure in which variables are first factored within media, and scores representing the resulting common factors are then intercorrelated across media and factored. Empirically, the strategy of seeking convergence at the second-order level seems to have been more successful. Using this procedure, Golding and Knudson (1975) were able to show meaningful factors across a battery of self-report measures of interpersonal behavior; these factors, in turn, showed some correspondence with self-ratings and peer ratings. Curiously, they did not attempt a joint analysis of the peer ratings and self-reports. Mungas, Trontel and Winegardner (1981) employed the same strategy in factoring two personality questionnaires and an observer rating of effectiveness in a social interaction under experimental conditions. The rating variable showed a large loading (0.68) on a factor contrasting social ascendency with interpersonal situational anxiety in one study; however, rated effectiveness loaded on a different factor in a replication. These data provide at best mixed evidence for factor matching.
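
The matrix modification attributed to Jackson (1969) above, substituting zeros for the monomethod, heterotrait correlations, can be illustrated with a short sketch. This is only a reading of the verbal description given here, not Jackson's published algorithm; the function name `zero_monomethod_heterotrait`, the assumed variable ordering and the small example matrix are all hypothetical.

```python
import numpy as np


def zero_monomethod_heterotrait(corr, n_traits, n_methods):
    """Return a copy of corr with monomethod, heterotrait correlations set to zero.

    Assumption: variables are ordered method by method, with the traits in the
    same order inside each method block.
    """
    corr = np.asarray(corr, dtype=float)
    size = n_traits * n_methods
    if corr.shape != (size, size):
        raise ValueError("corr must be square with n_traits * n_methods rows")
    method = np.repeat(np.arange(n_methods), n_traits)  # method index per variable
    same_method = method[:, None] == method[None, :]
    off_diagonal = ~np.eye(size, dtype=bool)
    modified = corr.copy()
    modified[same_method & off_diagonal] = 0.0          # keep the unit diagonal
    return modified


# Made-up correlations for two traits measured by two methods.
example = np.array([
    [1.0, 0.5, 0.6, 0.2],
    [0.5, 1.0, 0.2, 0.6],
    [0.6, 0.2, 1.0, 0.5],
    [0.2, 0.6, 0.5, 1.0],
])
modified = zero_monomethod_heterotrait(example, n_traits=2, n_methods=2)
print(modified)
# Eigenvalues can be inspected to check how well behaved the altered matrix is.
print(np.linalg.eigvalsh(modified))
```

Only the cross-method correlations survive in the altered matrix, which is why any shared substantive variance between different traits within a method is discarded together with the method variance.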

Conceptual and technical advances

The claim that there is a single structure of personality invariant across methods rests on this rather small set of studies. None of them is fully convincing, in part because of the necessity to transform the data or employ unusual methods of rotation to obtain the desired match. There is a much stronger body of data demonstrating consensual validation of discrete traits comparing self-reports with ratings (McCrae, 1982) and with nominations (Eysenck, 1969), which gives some basis for the hypothesis that, if personality variables are more clearly conceptualized and more adequately measured, it should be possible to demonstrate in straightforward fashion factors defined by the same traits in self-reports and ratings.

Conceptual advances. Until one knows what personality factors to expect, it is impossible to construct adequate measures of them within any method of measurement, or to recognize a match across methods even if one has found it. In Cattell’s early work the tasks of identifying and matching factors were undertaken almost simultaneously, so complete success could hardly be expected. When Norman used scales whose factor structure was well known both in ratings and in self-reports, his results were correspondingly clearer. Since that time a number of conceptual advances have been made in personality that aid attempts to match across media.


The NEO (Neuroticism-Extraversion-Openness to Experience) model used in the present study is a conceptual classification of traits which begins with the recognition that personality is most reliably measured at the second-order level (Costa and McCrae, 1980). Three major domains of personality traits, Neuroticism (N), Extraversion (E) and Openness to Experience (O), are postulated as a basic, if not exhaustive, set of second-order dimensions of personality. The first two are familiar in the work of Eysenck (Eysenck and Eysenck, 1968), Cattell (Cattell, Eber and Tatsuoka, 1970) and many others. The existence of a third broad domain of openness has been demonstrated by Coan (1972), Tellegen and Atkinson (1974) and our own work (Costa and McCrae, 1980), and is also seen in individual scales such as Dogmatism (Rokeach, 1960) and the Experience Seeking facet of Sensation Seeking (Zuckerman, 1979). These three domains clearly do not encompass the entire range of individual differences in human personality, but accurate measurement of these three would appear to be a necessary first step in personality assessment.

Characterization of an individual in terms of these domains would give a reasonable, global portrait. However, for many purposes finer distinctions are useful, and many specific traits have been extensively researched. In developing the NEO model and its operationalizations, we attempted to incorporate the most important traits identified in the literature relevant to the three-domain model. Scales measuring Anxiety, Hostility, Depression, Self-consciousness, Impulsiveness and Vulnerability to Stress are included in the NEO Inventory as facets of N. Similarly, six well-established traits (Warmth, Gregariousness, Assertiveness, Activity, Excitement Seeking and Positive Emotions or Cheerfulness) are measured in the E domain. O, which has been conceptualized and measured less often, is sampled in the areas of Fantasy, Aesthetics, Feelings, Actions, Ideas and Values. This list encompasses many of the specific traits of interest to contemporary personologists.

Technical advances. Methodologists concerned with correspondence across media have chosen to concentrate on techniques that reduce the influence of method variance, either in the construction of tests (Jackson, 1967) or in the statistical procedures used in joint analysis (Golding and Seidman, 1974). Test construction strategies for the NEO Inventory (described in the Appendix) were designed to maximize discriminant validity within each domain, and items were balanced to control for acquiescence. By decreasing the correlations within each method, the likelihood of finding factors which span two or more methods is increased.

But a second and equally important approach to the problem seeks to increase correlations across methods directly. The more self-reports and ratings agree on descriptions of personality, the less relative weight specific method variance will have in determining factor structure. Agreement on a trait is most likely to occur if precisely the same trait is assessed by both methods, and this is facilitated by using the same operational definitions. A global peer rating of anxiety may differ from self-reports on an anxiety questionnaire not only in the source of information, but also in the specific content. Elements enumerated in the questionnaire may not occur to peer raters as relevant aspects of anxiety. As a first step, then, correspondence between the two sources can be increased by employing the same instrument for both. Norman (1969), for example, used self-ratings (as well as self-report questionnaires) in his joint analysis with peer ratings. However, the use of multiple items to define scales in questionnaires generally gives them higher reliability than simple ratings have. An optimal strategy, it would seem, would be to have both the target individual and the raters complete parallel questionnaires.

The choice of raters is also important. It is probably unreasonable to expect that a school acquaintance will have the depth or breadth of knowledge needed to give an accurate description of the ratee’s personality. Most researchers using peer ratings have addressed this problem by pooling the opinions of a number of raters, none of whom may know the individual well, but who collectively may give a reasonably accurate assessment. Norman and Goldberg (1966) have shown that the internal agreement between raters does increase with their numbers. But the same article also points to another and perhaps better solution: raters who have had longer acquaintances with subjects give consistently better ratings, judged against the external criterion of self-reports. A single rater might suffice to give excellent descriptions of personality if he or she were someone with long and intimate knowledge of the individual, such as a husband or wife. Research using spouse ratings tends to confirm this hypothesis (Edwards and Klockars, 1981; Plomin, 1974).


The present article analyses self-reports on the NEO Inventory jointly with spouse ratings on a parallel NEO Rating Form. Since previous research has indicated that agreement is most likely to be found on the second-order level, the specific hypothesis to be tested is that the three domains of N, E and O will be found in factors which cross the two methods.

METHOD

Subjects

Participants in the study were members of the Augmented Baltimore Longitudinal Study of Aging (ABLSA). The Baltimore Longitudinal Study of Aging (BLSA) sample itself is composed of a community-dwelling, generally healthy group of volunteers who have agreed to return for medical and psychological testing at regular intervals. The sample has been recruited continuously since 1958, with most new Ss referred by friends or relatives already in the study. Many are married couples, with both spouses participating in the BLSA. Among the men, 93% are high-school graduates and 71% are college graduates; nearly one-fourth have doctorate level degrees. The ABLSA sample consists of 423 men and 129 women who are participants in the BLSA, together with 183 wives and 16 husbands who are not themselves BLSA participants, but who have agreed to complete questionnaires at home. For approximately half of these Ss, both husband and wife were in the sample and they were asked to join the spouse-rating study. Complete data were available for 139 men and 142 women, ranging in age from 21 to 89.

Item selection and retest reliability information was based in part on responses from a second group of Ss participating in an unrelated study on stress and psoriasis. Psoriatics, their spouses, their siblings and a non-psoriatic control group from a dermatological clinic were Ss. Age ranged from 17 to 78 for the 28 men and 36 women in this sample.

Measures

Primary data for this study came from the NEO Inventory and NEO Rating Form. In both instruments, 8-item scales are used to measure each of six facets within the domains of Neuroticism (N), Extraversion (E), and Openness to Experience (O). The NEO Rating Form was constructed from the NEO Inventory by transforming items from the first person (“I have a very vivid imagination”; “I’m known as a warm and friendly person”) to the third person (“He has a very vivid imagination”; “She is a warm and friendly person”). Both instruments use a 5-point Likert format, with response options from ‘strongly agree’ to ‘strongly disagree’.

Form A of the Eysenck Personality Inventory (EPI; Eysenck and Eysenck, 1968) was also administered to the ABLSA sample. It provides global measures of E and N, using two 24-item scales with a true-false format. A Lie (L) scale is also included.

Procedure

Subjects completed all questionnaires at home. The EPI was given to all ABLSA Ss as part of a larger mailing, and was followed after 4 months by the NEO Inventory. The NEO Rating Form was sent to spouses after a second interval of 6 months. In completing the NEO Rating Form, Ss were specifically instructed not to discuss their ratings with their spouses, in order to provide independent assessments: “On this questionnaire we are asking you to help us understand your wife’s [husband’s] personality. Since we are interested in your opinions, please fill out this form yourself, and don’t discuss items with your wife [husband] until you have completed the form.” As a check on this instruction, an item was added to the inventory stating “I have discussed some of these items with my wife [husband]”. Five Ss who answered ‘agree’ or ‘strongly agree’ to this item were excluded from all analyses.

Subjects in the psoriasis study sample completed the NEO Inventory at home and returned it by mail. Six months later psoriatics and their spouses (but not the other control Ss) were readministered the NEO Inventory. Retest reliability of the self-report scales is based on the responses of 31 of these Ss.


The major analyses to be reported concern the factor structure of self-reports and ratings separately and jointly. In all these analyses, principal-component factor analyses followed by Varimax rotation are used. These procedures require a minimum of assumptions and are widely understood; they provide a simple, straightforward and objective description of the relations between variables. Other methods, including Promax rotation of factors with iterated communalities, gave essentially similar results and are not reported.

Preliminary analyses showed that the criterion of retaining factors with eigenvalues greater than 1.0 would have resulted in five factors in each of the monomethod analyses, and ten factors in the joint analysis. However, because a three-domain model was hypothesized in this study, only three factors are retained. If, as Golding (1977) suggests, method variance is likely to define the first and largest factors, then the results of this underextraction will be method factors. On the other hand, if substantive dimensions of personality contribute more to the variance, then they should be seen in the first factors. In the monomethod analyses, three factors account for 50-53% of the variance; in the joint analysis three factors account for 41%.

Factor analyses are performed separately for men and women in the ABLSA sample. Because of the smaller number of cases, separate analyses by gender are not reported for spouse ratings. Data from 303 husbands and wives are used in the analysis of the ratings. Complete data for the joint factor analyses were available for 139 men and 142 women. Additional analyses retaining five factors are also discussed briefly.

RESULTS

Basic psychometric data on the facets and domain scores are presented in Table 1. The first two columns give coefficient alpha for men and women separately on the sample on which item selection was based. Internal consistency for these self-report scales ranges from 0.60 to 0.82 for the 8-item facets, and from 0.85 to 0.93 for the 48-item domain scores. The third column in Table 1 gives 6-month retest reliability on a sample of 31 men and women from the psoriasis study. Coefficients range from 0.66 to 0.92 for facets, and from 0.86 to 0.91 for domains. The last column in Table 1 shows coefficient alphas for spouse ratings provided by 264 men and women. These values are comparable to those in self-reports, with Excitement Seeking (0.66) and Openness to Actions (0.64) lowest and Anxiety (0.83) and Openness to Aesthetics (0.84) among the highest. It is notable that, although this is a cross-validation sample, the coefficients are as high or higher in the rating data than in the original self-report data.

Table 2 presents the results of factor analyses for self-reports of men and women separately. The two analyses are in close agreement with each other, and with the hypothesized structure. All facets have loadings above 0.40 on the intended factor; only one variable has this high a loading on an unintended factor (Openness to Actions among women). There is only a scattering of secondary loadings above 0.30, although some of these do seem to be consistent for men and women. Thus, it appears that Openness to Fantasy and lower Assertiveness have a small contribution from N; Openness to Feelings has an element of E. In general, however, the data in Table 2 give strong support for the hypothesized model of personality as measured by self-report questionnaires.

Table 3 gives factor loadings for scales based on spouse ratings of personality. Again, the three-domain structure is well supported, with all variables showing their chief loading on the hypothesized factor. Secondary loadings also generally follow the same pattern seen in self-reports, with the addition of contributions of Warmth and especially Positive Emotions to the O factor. It appears that friendly and cheerful people, who may be interpersonally open, are perceived by their spouses as being experientially open as well.

Table 4 gives results of the joint factor analysis of self-reports and ratings. Each of the 36 variables has its primary loading on the hypothesized factor; personality construct, rather than method of assessment, determines the nature of the factors. The domains of the NEO model as operationalized by the NEO Inventory and Rating Form clearly show factorial validity.

Evidence of construct validity would be strengthened by data from more than a single instrument. Do the factors in Table 4 actually measure N, E and O? Table 5 gives the correlations between factor scores for three joint factors and scores on the EPI for the 261 Ss who had complete data on both. These correlations show convergent and discriminant validity for the joint factors against the EPI scales. High correlations are seen between EPI N and the N factor and between EPI E and the E factor; but neither is correlated with the theoretically distinct O factor. Correlations with L are uniformly low.
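
The general procedure used in these analyses, principal-components extraction from the correlation matrix followed by Varimax rotation, can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: the data are simulated, the weights (0.8 trait, 0.3 method, 0.5 error) are arbitrary choices, the function names are hypothetical, and the rotation shown is raw Varimax without Kaiser row normalization, which may differ from what the original analyses used.

```python
import numpy as np


def principal_component_loadings(corr, n_factors):
    """Loadings of the n_factors largest principal components of corr."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)       # ascending order
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw Varimax rotation (no Kaiser normalization) via the usual SVD iteration."""
    loadings = np.asarray(loadings, dtype=float)
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation


# Simulated example: 3 traits x 3 facets, each measured by 2 methods (18 variables).
rng = np.random.default_rng(1)
n_subjects, n_traits, n_facets, n_methods = 500, 3, 3, 2
trait_of_variable = np.tile(np.repeat(np.arange(n_traits), n_facets), n_methods)
method_of_variable = np.repeat(np.arange(n_methods), n_traits * n_facets)

traits = rng.standard_normal((n_subjects, n_traits))
methods = rng.standard_normal((n_subjects, n_methods))
noise = rng.standard_normal((n_subjects, trait_of_variable.size))
data = (0.8 * traits[:, trait_of_variable]
        + 0.3 * methods[:, method_of_variable]
        + 0.5 * noise)

corr = np.corrcoef(data, rowvar=False)
rotated = varimax(principal_component_loadings(corr, n_factors=3))

# Factor on which each variable has its largest absolute loading,
# shown as one row per method.
primary_factor = np.abs(rotated).argmax(axis=1)
print(primary_factor.reshape(n_methods, -1))
```

When trait variance outweighs method variance, as it does by construction in this simulation, variables measuring the same trait share a primary factor regardless of method; with a heavier method weight the first factors would instead tend to be defined by method.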

Additional analyses

Although three factors were hypothesized and found, it may be of some interest to consider the results of a different approach. In both self-report and rating data, five factors had eigenvalues greater than 1.0. Varimax rotation of five factors for the two methods separately showed almost identical results: the N and O factors were unchanged, and the E factor was split into three new factors. The first of these, marked by Assertiveness and Activity (with a secondary loading from Hostility), might be considered a ‘Social Dominance’ factor. The second, with loadings on Warmth, Gregariousness and Positive Emotions, and with secondary loadings on Openness to Feelings and low Hostility, could be interpreted as ‘Affiliation’, the usual accompaniment to dominance in dimensional models of social interaction (Leary, 1957). Finally, Excitement Seeking and Gregariousness, with small contributions from Impulsiveness and Openness to Actions, form the third factor. Eysenck’s ‘Impulsivity’ component of E would seem to be of similar composition.

However, problems arise in the joint analysis. Ten, rather than five, factors have eigenvalues over 1.0. When five factors are rotated, matching across media becomes more difficult. N and O factors are discernible, as is one of the E factors, Affiliation. But the Dominance and Impulsivity factors are merged, and the fifth factor is defined solely by variables (Warmth, Positive Emotions and Openness to Aesthetics and Feelings) in the rating data. None of the corresponding traits in the self-report medium have loadings on this factor.

It would appear from these analyses that E is a more complex dimension of personality than either N or O. However, the unity of the factor on the highest level is clear (cf. Eysenck and Eysenck, 1969), and only at this level can matching across factors be demonstrated in the present data.

[Table 2. Varimax-rotated factors in self-reports. Men (N = 363); Women (N = 269). Decimal points omitted; loadings less than 0.30 not shown.]

[Table 3. Varimax-rotated factors in spouse’s ratings (N = 303). Decimal points omitted; loadings less than 0.30 not shown.]
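
The eigenvalue criterion referred to in these additional analyses, retaining factors whose eigenvalues exceed 1.0, amounts to a simple count over the eigenvalues of the correlation matrix. The sketch below is illustrative only: the function name `n_factors_by_eigenvalue` is hypothetical and the example matrix is constructed for the demonstration, not taken from the study.

```python
import numpy as np


def n_factors_by_eigenvalue(corr, threshold=1.0):
    """Count eigenvalues of a correlation matrix that exceed the threshold."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    return int((eigenvalues > threshold).sum())


# Constructed example: two independent clusters of three variables,
# with correlations of 0.5 inside each cluster.
cluster = 0.5 * np.ones((3, 3)) + 0.5 * np.eye(3)
example = np.kron(np.eye(2), cluster)

print(n_factors_by_eigenvalue(example))  # 2 for this constructed matrix
```

Because the count depends on how many variables enter the matrix, a joint analysis of 36 variables can yield more eigenvalues above 1.0 than either 18-variable analysis, which is consistent with the ten versus five factors noted above.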

DISCUSSION

Previous research (Costa and McCrae, 1980) has shown that the three-domain model of personality examined in this article can be found in factor analyses of a number of different instruments in men, and that the factor structure is invariant across age groups in adulthood. Table 2 shows that the instrument designed to operationalize the model, the NEO Inventory, shows the hypothesized structure in a different sample of men and women ranging in age across the entire adult lifespan; Table 3 reports the first confirmation of the NEO model as seen in personality ratings. There can thus be little doubt that these three dimensions are central to the conceptions of personality that adults have about themselves and others. The present study, however, was addressed to the more fundamental question of whether these conceptions are semantically-structured fictions, or more-or-less accurate depictions of personality. The results provide some of the clearest evidence to date on the existence of major dimensions of personality that span self-reports and observer ratings. The use of principal-components analysis and Varimax rotation of unaltered correlations among rated and self-reported variables gives a solution which precisely fits the hypothesized model.

There is more than the elegance of simplicity in this result. In essence, these findings show that the dimensions of N, E and O, as measured by the NEO Inventory and Rating Form, determine more of the variance in individual scores than does method variance, whether conceived as a statistical nuisance or a meaningful difference of perspective (Golding, 1977). Approaches that replace the original variables by orthogonal components (Jackson, 1969; Golding and Seidman, 1974) beg this question of the relative importance of method vs trait variance. Although appropriate and necessary in some applications, they are less than ideal for the analysis of multimethod data. They may show convergent and discriminant validity of hypothetical constructs, represented by the components they factor, but they do not demonstrate the same validities for the raw scales. Yet it is these scales that researchers and clinicians employ, and if they measure predominantly method variance, and only secondarily the construct of interest, the user should be aware of this fact. In the NEO measures, trait variance determines the three largest factors; both the self-report and rating scales can thus be assumed to measure primarily the construct they were intended to measure, and only secondarily method variance.

It is possible to show clear factors without special analytic procedures to reduce the influence of method variance primarily because the agreement between ratings and self-reports is quite high (cf. McCrae, 1982). This agreement, in turn, is the result of clear conceptualization, psychometrically adequate instruments and the choice of raters well-qualified to give accurate assessments of personality. In part, these results are a reflection of the progress made in the field of personality over the past thirty years on a number of fronts.

Substantively, the results confirm earlier conclusions on the importance of E and N factors in both questionnaire and rating data. In the E domain, Becker et al. (1959) found agreement on Activity and Sociability, and Norman (1969) on E. In the N domain, Norman found joint factors for Agreeableness and Emotional Stability. These factors emerge clearly in the present analyses, with each of the hypothesized facets contributing to the definition of the second-order dimensions. In addition, this study provides the first evidence for a broad and consensually-validated dimension of O. Imaginative daydreaming, artistic sensitivity, awareness and appreciation of emotional responses, willingness to try new activities, intellectual curiosity, and a flexible and broad-minded approach to moral and social values have been shown to have a common core of openness.

It is of some interest to speculate why O has so rarely been recognized as a major domain of personality. In part the reason may be that the facets of this domain are more loosely related than those of E and N, as its emergence as the third factor attests. But probably more important is the frequent failure to include scales from this domain in personality inventories. The influence of clinical and social psychologists has directed attention to the development of scales measuring psychopathology and styles of social interaction, and these scales have formed the bases of many factor analyses. Once explicitly included, however, the O scales show their coherence as an independent dimension in self-reports and ratings.
Finally, it may be useful to compare O with the third dimension in Eysenck and Eysenck’s (1975) model: Psychoticism (P). Individuals high in P are emotionally isolated or hostile, like odd and unusual things, and are toughminded in attitudes. Criminals and psychotics are among the groups that score high on this dimension. The toughmindedness and emotional isolation are reminiscent of the authoritarian attitudes and lack of empathy found in closed individuals, suggesting a negative correlation between O and P. On the other hand, the liking for odd and unusual things and the unconventionality would suggest a positive correlation with O. However, all these resemblances are superficial. Fundamentally, P seems to reflect the strength of the bond between the individual and society, other people, and other living things. Unconventionality, suspicion, and cruelty may all result from an underlying alienation or pathology. By contrast, O has to do with the person’s preferred mode of dealing with novel experience, a very different content domain. It is unlikely that there is much correlation between the two constructs, although of course an empirical test of the relationship is needed to resolve the question.

REFERENCES

Becker W. C. (1960) The matching of behavior rating and questionnaire personality factors. Psychol. Bull. 57, 201-212.
Becker W. C., Peterson D. R., Hellmer L. A., Shoemaker D. J. and Quay H. C. (1959) Factors in parental behavior and personality as related to problem behavior in children. J. consult. Psychol. 23, 107-118.
Bourne E. (1977) Can we describe an individual’s personality? Agreement on stereotype versus individual attributes. J. Person. soc. Psychol. 35, 863-872.
Campbell D. T. and Fiske D. W. (1959) Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol. Bull. 56, 81-105.
Cattell R. B. and Birkett H. (1980) The known personality structures found aligned between first order T-data and second order Q-data factors, with new evidence on the inhibitory control, independence and regression traits. Person. individ. Diff. 1, 229-238.
Cattell R. B. and Saunders D. R. (1950) Inter-relation and matching of personality factors from behavior rating, questionnaire, and objective test data. J. soc. Psychol. 31, 243-260.
Cattell R. B., Eber H. W. and Tatsuoka M. M. (1970) The Handbook for the Sixteen Personality Factor Questionnaire. Institute for Personality & Ability Testing, Champaign, Illinois.
Cattell R. B., Pierson G. and Finkbeiner C. (1976) Alignment of personality source trait factors from questionnaires and observer ratings: the theory of instrument-free patterns. Multivar. exp. clin. Res. 2, 63-88; Psychol. Abstr. 51, No. 8274.
Coan R. W. (1972) Measurable components of openness to experience. J. consult. clin. Psychol. 39, 346.
Costa P. T. Jr and McCrae R. R. (1978) Objective personality assessment. In The Clinical Psychology of Aging (Edited by Storandt M., Siegler I. C. and Elias M. F.). Plenum Press, New York.
Costa P. T. Jr and McCrae R. R. (1980) Still stable after all these years: personality as a key to some issues in adulthood and old age. In Life-span Development and Behavior, Vol. III (Edited by Baltes P. B. and Brim O. G. Jr). Academic Press, New York.
D’Andrade R. G. (1965) Trait psychology and componential analysis. Am. Anthrop. 67, 215-228.
Edwards A. L. and Klockars A. J. (1981) Significant others and self-evaluation: relationships between perceived and actual evaluations. Person. soc. Psychol. Bull. 7, 244-251.
Eysenck H. J. (1969) The validity of the M.P.I. - Positive validity. In Personality Structure and Measurement (Edited by Eysenck H. J. and Eysenck S. B. G.). Routledge & Kegan Paul, London.
Eysenck H. J. and Eysenck S. B. G. (1968) The Manual of the Eysenck Personality Inventory. Educational & Industrial Testing Service, San Diego, California.
Golding S. L. and Knudson R. M. (1975) Multivariable-multimethod convergence in the domain of interpersonal behavior. Multivar. behav. Res. 10, 425-448.
Golding S. L. and Seidman E. (1974) Analysis of multitrait-multimethod matrices: a two step principal components procedure. Multivar. behav. Res. 9, 479-496.
Gorsuch R. L. (1974) Factor Analysis. Saunders, Philadelphia, Pennsylvania.
Howarth E. (1976) Were Cattell’s “personality sphere” factors correctly identified in the first instance? Br. J. Psychol. 67, 213-230.
Kenny D. A. (1976) An empirical application of confirmatory factor analysis to the multitrait-multimethod matrix. J. exp. soc. Psychol. 12, 247-252.
Plomin R. S. (1973) A temperament theory of personality development: parent-child interactions. Doctoral Dissertation, Univ. of Texas.


Siess T. F. and Jackson D. N. (1970) Vocational interests and personality: an empirical integration. J. counsel. Psychol. 17, 27-35.
Shweder R. A. (1975) How relevant is an individual difference theory of personality? J. Person. 43, 455-484.
Tellegen A. and Atkinson G. (1974) Openness to absorbing and self-altering experiences (“absorption”), a trait related to hypnotic susceptibility. J. abnorm. Psychol. 83, 268-277.
Truckenmiller J. L. and Schaie K. W. (1979) Multilevel structural validation of Leary’s interpersonal diagnosis system. J. consult. clin. Psychol. 47, 1030-1045.
Tupes E. C. and Christal R. E. (1961) Recurrent personality factors based on trait ratings. USAF ASD Technical Report, No. 61-97.
Winne P. H., Marx R. W. and Taylor T. P. (1977) A multitrait-multimethod study of three self-concept inventories. Child Dev. 48, 893-901.
Zuckerman M. (1979) Sensation Seeking: Beyond the Optimal Level of Arousal. Erlbaum, Hillsdale, New Jersey.

APPENDIX

Item selection for the NEO Inventory was based on the combined responses of the 64 Ss in the psoriasis study and Ss in the ABLSA sample who had returned their questionnaires by a cut-off date (N = 586). Seven positive and seven negative items had been written to measure each of the 18 facets hypothesized by the NEO model. Thirteen additional items from an earlier form of the O scales (the Experience Inventory; Costa and McCrae, 1978) were also included. Using factor analyses, items were selected to best fit the conceptualized model of personality. Because factors on both the first and second order were hypothesized, the analyses were conducted in two stages.

In order to ensure maximal discrimination between the three major domains, the first step in item selection was Varimax rotation of three principal components from the intercorrelation of 250 items (15 items had been dropped on a preliminary examination of responses). N, E and O factors could be clearly identified. All items which failed to load on the intended factor, or which showed higher loadings on one of the other factors, were eliminated. Thus, all items in each of the three major domains loaded on the appropriate second-order factor. Seventy-six items in the N domain remained, with 65 each in the E and O domains.

In the second stage, item analyses were conducted within each domain separately. The goal in this step was the selection of items which best represented the six hypothesized facets. The items written for each facet constituted preliminary rational scales, and an internal consistency criterion might have been used on these scales. However, since the six facets in each set were known to form part of the same domain, discrimination between scales was as important as internal consistency, and a factor analytic approach was again employed.
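The first-stage screening described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' program: the function names, the 0.30 cut-off for "failing to load", the rule used to match rotated factors to domains, and the twelve-item toy loading matrix are all hypothetical, and the rotation shown is raw Varimax without Kaiser normalization.

```python
import numpy as np

def principal_component_loadings(corr, n_components):
    """Loadings of the largest principal components of a correlation matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw Varimax rotation (SVD-based iteration); returns (rotated, rotation)."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_items)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation, rotation

def match_factors_to_domains(rotated, written_for, n_domains):
    """Assumed rule: a domain's factor has the largest mean |loading| on its items."""
    abs_loadings = np.abs(rotated)
    return np.array([abs_loadings[written_for == d].mean(axis=0).argmax()
                     for d in range(n_domains)])

def screen_items(rotated, written_for, factor_of_domain, min_loading=0.30):
    """Keep items that load on the intended factor and nowhere more highly."""
    abs_loadings = np.abs(rotated)
    intended = factor_of_domain[written_for]
    on_intended = abs_loadings[np.arange(len(written_for)), intended]
    return (on_intended >= min_loading) & (on_intended >= abs_loadings.max(axis=1))

# Hypothetical toy data: items 0-3 written for N, 4-7 for E, 8-11 for O.
# Item 3 is a deliberate misfit: written for N but loading on the E factor.
true_loadings = np.zeros((12, 3))
true_loadings[0:3, 0] = 0.7
true_loadings[3:8, 1] = 0.7
true_loadings[8:12, 2] = 0.7
true_loadings[1, 1] = true_loadings[5, 2] = true_loadings[9, 0] = 0.15
written_for = np.repeat([0, 1, 2], 4)

corr = true_loadings @ true_loadings.T
np.fill_diagonal(corr, 1.0)  # unit diagonal: common plus unique variance

rotated, rotation = varimax(principal_component_loadings(corr, 3))
factor_of_domain = match_factors_to_domains(rotated, written_for, 3)
keep = screen_items(rotated, written_for, factor_of_domain)
print(np.flatnonzero(~keep))  # indices of eliminated items
```

In this toy case only the misfit item is eliminated. In the study the factors were identified by inspection; the automatic matching rule here is merely a convenience for the sketch.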
Because Varimax rotation of six factors would not necessarily have produced factors corresponding to the rational scales, an orthogonal Procrustes rotation (Schönemann, 1966) was used instead. Item factors were rotated to a maximum fit with a target matrix defined by rational item assignment. The eight items from each factor with the highest loading on the intended factor were selected. (Subsequent Varimax rotation of factors from the final item set showed good agreement with the Procrustes solutions for E and O domains, and fair agreement for the N domain.) Items thus selected for the NEO Inventory were used in the third-person as the NEO Rating Form.

Estimates of internal consistency or of factor loadings based on the ABLSA sample may be somewhat inflated because this sample overlaps substantially with that on which item selection was based. However, it is essential to note that data from the retest administration of the NEO Inventory and from spouse ratings were collected after item selection was complete and in no way influenced the selection of items for either questionnaire. Thus, psychometric assessment of the NEO Rating scales must be construed as the most demanding form of cross-validation, in which not only a new sample, but an entirely different source of data is used. Retest correlations and, most importantly, self-report/rating correlations, are also unaffected by these item-selection procedures.
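The second-stage rotation can be sketched as follows. The example assumes the standard least-squares solution to the orthogonal Procrustes problem (the rotation is the product of the singular-vector matrices of the loadings-by-target cross-product). The function names, the two-facet toy loadings, and the choice of keeping two items per facet are hypothetical; the study kept eight items for each of six facets.

```python
import numpy as np

def orthogonal_procrustes_rotation(loadings, target):
    """Orthogonal T minimising the Frobenius norm of (loadings @ T - target)."""
    u, _, vt = np.linalg.svd(loadings.T @ target)
    return u @ vt

def select_top_items(rotated, assigned_facet, n_keep):
    """For each facet, keep the n_keep items loading highest on the intended factor.

    Assumes facets are labelled 0..k-1 and that label j is column j of `rotated`.
    """
    selected = {}
    for facet in np.unique(assigned_facet):
        items = np.flatnonzero(assigned_facet == facet)
        ranked = items[np.argsort(rotated[items, facet])[::-1]]
        selected[int(facet)] = sorted(ranked[:n_keep].tolist())
    return selected

# Hypothetical toy data: six items, two facets, simple structure.
simple = np.array([[0.8, 0.0], [0.6, 0.0], [0.3, 0.0],
                   [0.0, 0.7], [0.0, 0.2], [0.0, 0.5]])
assigned_facet = np.array([0, 0, 0, 1, 1, 1])

# Disguise the structure with an arbitrary 40 degree rotation of the axes.
angle = np.deg2rad(40.0)
spin = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = simple @ spin

# Target matrix defined by rational item assignment: 1 on the intended facet.
target = np.zeros_like(simple)
target[np.arange(len(assigned_facet)), assigned_facet] = 1.0

rotation = orthogonal_procrustes_rotation(unrotated, target)
rotated = unrotated @ rotation
selected = select_top_items(rotated, assigned_facet, n_keep=2)
print(selected)
```

Because the target only records which facet each item was written for, the rotation is steered toward the rational scales rather than toward whatever structure Varimax would find, which is the rationale given in the text for preferring it at this stage.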