Use of Psychometric Techniques in the Analysis of Epidemiologic Data


SAMUEL F. POSNER, PHD, LEAVONNE PULLEY, PHD, LYNN ARTZ, MD, MPH, AND MAURIZIO MACALUSO, MD, DRPH

PURPOSE: This article demonstrates techniques for developing reliable multi-item scales for the analysis of complex public health data.

METHODS: Information from a questionnaire designed to evaluate the acceptability and efficacy of the female condom as a method for STD/HIV prevention was summarized using psychometric analysis. A total of 1,159 high-risk women attending STD clinics participated in this study. Questionnaire items were designed to measure nine domains of predictors of condom use.

RESULTS: Principal components analysis was employed to reduce the number of potential predictors. Reliability of the multiple-item scales was assessed using Cronbach’s α. Pearson’s correlation coefficients were calculated to evaluate collinearity among multi-item scales. Approximately half (51%) of the questionnaire items that were analyzed were retained in the final scales. Data reduction procedures identified several multi-item scales with acceptable reliability (Cronbach’s α ≥ 0.70). The correlation coefficients between scales never exceeded 0.5, suggesting that there was little collinearity among the scales.

CONCLUSIONS: When focused on multiple partially interdependent determinants of an outcome, data reduction decreases the number of independent variables to be evaluated, ensures they have adequate reliability, maximizes the strength of their association with outcomes, and reduces collinearity among predictors.

Ann Epidemiol 2003;13:344–350. © 2003 Elsevier Inc. All rights reserved.

KEY WORDS: Measurement, Psychometric Analysis, Condom Use.

INTRODUCTION

Identifying the risk factors for a specific outcome is frequently an inefficient process. Increased access to software capable of fitting multiple regression models makes it easier for researchers to test large sets of variables as independent predictors of outcome. The independence assumption is less likely to be met when multiple questionnaire items are designed to gather data on related aspects of the same exposure, trait, or behavior. While collinearity is a recognized threat to model fit in the epidemiologic methods literature, the methods for dealing with multiple correlated variables and for using such variables to improve measurement are not well discussed. It is typical of behavioral science research to rely on multiple interdependent items as measures of latent variables that are hard to define.

From the Division of Reproductive Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA (S.F.P.); Department of Health Behavior, School of Public Health, University of Alabama at Birmingham, Birmingham, AL (L.P.); Department of Epidemiology, School of Public Health, University of Alabama at Birmingham, Birmingham, AL (L.A., M.M.). Address correspondence to: Dr. Samuel F. Posner, Ph.D., National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, 4770 Buford Highway, MS K-34, Atlanta, GA 30341-3724, USA. Tel: (770) 488-6398; Fax: (770) 488-6391. E-mail: [email protected] Received November 6, 2000; revised July 22, 2002; accepted July 29, 2002. © 2003 Elsevier Inc. All rights reserved. 360 Park Avenue South, New York, NY 10010

Data reduction techniques are used to specify summary scales from the initial set of individual items. Including fewer correlates in a regression analysis increases the power to test the significance of the parameter estimates. In addition, the increase in the reliability of the measurement (reduction of measurement error) will potentially increase the precision of the measure of association between the outcome and the independent variable (1). Use of multi-item scales is becoming more common in public health research as the field sharpens its focus on the interplay of biological and behavioral factors. In addition, data reduction methods may be appropriate in other situations when numerous and potentially redundant determinants are evaluated. In both situations, it is important to develop appropriate measurement scales and examine their properties prior to carrying out the final analysis.

The development of questionnaire items to measure complex constructs should be theoretically based. Often researchers design an a priori conceptual framework to guide the development of individual items and multi-item scales (2). Careful construction and selection of summary measures will result in more precise and efficient models (1, 3–6). Because measurement quality is critical for valid data analysis, it is important to assess the measurement properties of multi-item scales before including such scales in predictive models (1, 3–5). Furthermore, a recent report by the American Psychological Association underscored the need for reporting on the psychometric properties of measures to fully evaluate the model and interpret the findings (7).

Subjective evaluation of the face validity (content similarity) of questionnaire items is insufficient to determine whether they measure the same construct and whether they can be combined to form a reliable summary measure. Factor analysis is commonly used to identify related items, and can assist with constructing reliable scales (8, 9). Internal consistency of the resulting scales can be evaluated using measures of intra-class correlation such as Cronbach’s α (1). This report describes the development and evaluation of multi-item scales from a larger set of questionnaire items designed to assess potential determinants of condom use in a prospective study of women attending two public sexually transmitted disease (STD) clinics in Alabama (10, 11).

1047-2797/03/$–see front matter PII S1047-2797(02)00436-2

AEP Vol. 13, No. 5 May 2003: 344–350

Selected Abbreviations and Acronyms
PAFA = Principal Axis Factor Analysis
STDs = sexually transmitted diseases

Posner et al. PSYCHOMETRICS AND EPIDEMIOLOGY

METHODS

Study Design and Subjects

The objectives of the prospective study were: 1) to evaluate the efficacy of the female condom in preventing STDs and to compare the efficacy of the female and male condom; and 2) to identify socio-demographic, psychosocial, and behavioral determinants of using the female condom. This report focuses on the psychometric analysis techniques used to construct scales measuring potential determinants of condom use, which were employed in a variety of subsequent analyses of the study results.

Women were eligible to participate in the study if they were 18 to 35 years old, had not had a hysterectomy, were not pregnant and did not wish to become pregnant in the next 6 months, and were not on long-term antibiotic therapy. Out of the 3,531 potentially eligible women who attended one of two STD clinics, 2,702 agreed to participate in the study. Approximately 43 percent (1,159) of the women who were eligible and agreed to participate in the study attended the initial visit, which took place 10 days after the recruitment visit. The women completed the initial study visit between July 1995 and September 1997.

Participants attended an initial study visit and up to six monthly follow-up visits. During the initial visit, they provided informed consent, completed a baseline interview, participated in a behavioral intervention, received a pelvic examination, completed a brief end-of-visit interview, and learned how to keep a sexual diary. The women received $25 for each study visit to help defray transportation and childcare expenses.

Differences between participants and non-participants, that is, those women who participated (n = 1,159) and those who agreed and did not participate (n = 1,543), have been reported elsewhere and suggest that women at high risk for STDs and unintended pregnancy participated in the study (10, 11). Women who were younger, less educated, or African American, who had income from welfare programs or a higher lifetime number of partners, who were not married, or who reported a history of STD were more likely to participate. The characteristics of women who agreed to participate but did not attend the initial visit were virtually identical to those of women who participated (10). The selection process therefore yielded a study group that did not differ from the women who agreed to participate but did not attend the initial study visit.

Analytic Methods

A conceptual framework based on the theory of reasoned action, social cognitive theory, and the transtheoretical model (stages of change) was used to group the items from the baseline (initial visit) questionnaire into nine domains (12–16). These domains were: 1) condom use self-efficacy, 2) attitudes, and 3) beliefs about male condoms, 4) attitudes about female-controlled barrier contraceptives, 5) attitudes about the female condom, 6) relationship characteristics, 7) power and control in the relationship, 8) satisfaction with communication with sexual partners, and 9) perceived HIV/STD risk. All items within a given domain had the same number and format for the response options.

Principal Axis Factor Analysis/Common FA (PAFA) using the varimax method of rotation was used to identify items that could be combined into scales in each of the domains. Briefly, given an initial set of k items, this method identifies up to k factors that consist of linear combinations of the initial variables. When varimax rotation is used, as in this paper, an orthogonal solution is identified where the expectation is that the factors are independent. In practice, however, a small correlation between factors is often observed.
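As a rough illustration of the extraction and rotation steps just described, the following Python sketch runs an iterated principal axis factoring followed by a varimax rotation on a small correlation matrix. It is not the software used in the study. The function names, the six hypothetical items, their population loadings, the factor correlation of 0.2, and the choice of two factors are all assumptions made for the example.

```python
import numpy as np

def principal_axis_loadings(R, n_factors, n_iter=200):
    """Iterated principal axis factoring on a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Start from squared multiple correlations as communality estimates.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # unit diagonal -> communalities
        vals, vecs = np.linalg.eigh(reduced)     # eigenvalues in ascending order
        top = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
        h2 = (loadings ** 2).sum(axis=1)         # updated communalities
    return loadings

def varimax(loadings, n_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    T = np.eye(k)
    crit = 0.0
    for _ in range(n_iter):
        L = loadings @ T
        grad = loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                               # nearest orthogonal rotation
        if s.sum() < crit * (1.0 + tol):         # criterion stopped improving
            break
        crit = s.sum()
    return loadings @ T

# Hypothetical population: six items, two weakly correlated factors.
true_loadings = np.array([[0.80, 0.0], [0.75, 0.0], [0.70, 0.0],
                          [0.0, 0.85], [0.0, 0.80], [0.0, 0.70]])
phi = np.array([[1.0, 0.2], [0.2, 1.0]])         # assumed factor correlation
R = true_loadings @ phi @ true_loadings.T
np.fill_diagonal(R, 1.0)

rotated = varimax(principal_axis_loadings(R, n_factors=2))
print(np.round(rotated, 2))
```

With these made-up inputs, each item should load mainly on one rotated factor, with a small loading on the other factor that reflects the correlation built into the hypothetical factors. Column order and signs of the rotated loadings are arbitrary.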
Usually, fewer than k factors explain a large part of the variability of individual subjects with respect to the initial set of items (8, 9). In classical test theory, each item is assumed to have two components: one measuring the true score (i.e., the construct of interest) and the other unique to the item, consisting of an item-specific combination of measurement error and measurement of factors other than the construct of interest. The factor analytic method was selected because it treats the items as having these two components. The item-specific component is removed prior to rotation so that only the common or true score component is rotated to identify the factor solution.

To be chosen for scale construction, a questionnaire item was required to have a high “signal-to-noise ratio,” i.e., a high loading coefficient (threshold r = .60) on only one of the factors identified through PAFA and low loading coefficients (threshold r = .30) on all other factors. This inclusion criterion is necessary but not sufficient to ensure scale reliability. Factors in this analysis were retained if their eigenvalues exceeded 1.0, a criterion commonly used to select factors that explain a sufficiently large proportion of the total variance.

To reduce Type I error in identifying multi-item scales, items were analyzed separately within each of the nine domains described above. Analyzing all items at once is problematic for at least three reasons. First, in this analysis, a priori hypotheses were made about which items measured specific constructs. Including all items in a single analysis would devolve into an atheoretical solution. Second, because of the dimensions of the covariance matrix, it can be difficult to identify a comprehensible solution. Thus, a solution identified in such an analysis may capitalize on chance relationships in the data set and not be replicable. Finally, because the factor analysis procedure selected employs listwise deletion, the effective sample size for a single analysis of all items would have been insufficient to support the analysis.

Next, the scales were tested for internal consistency using Cronbach’s α (1). In test theory, a summary score is composed of two components, true score and error (5, 6). Cronbach’s α assesses the ratio of the true score variance to the total variance. This also can be interpreted as an intraclass correlation coefficient for the set of items comprising the scale. This measure of internal consistency reflects the degree to which a given respondent provides correlated responses to the component items. If the reliability is low, a large proportion of the summary score can be attributed to measurement error (or measurement of extraneous factors), suggesting that the association of the scale with the outcome will be weakened. In all analyses, reliability was maximized by dropping items that lowered the estimate of Cronbach’s α. This resulted in cases where one or more items were dropped from the initial factor solution to maximize reliability.
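The reliability step can be sketched in a few lines of Python. The example below computes Cronbach’s α from an item-response matrix and then recomputes it with each item left out, which is one common way to spot items that lower reliability. The function names and the simulated responses (four items sharing a true score plus one unrelated item) are assumptions for illustration only, not the study data or the authors’ code.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; rows are respondents, columns are scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the sum score
    return k / (k - 1) * (1.0 - sum_item_var / total_var)

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item removed in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

# Hypothetical responses: four items share a true score, the fifth is noise.
rng = np.random.default_rng(0)
true_score = rng.normal(size=500)
related = true_score[:, None] + rng.normal(scale=0.8, size=(500, 4))
unrelated = rng.normal(size=(500, 1))
responses = np.hstack([related, unrelated])

alpha_all = cronbach_alpha(responses)
alpha_drop = alpha_if_item_deleted(responses)
print(round(alpha_all, 2), [round(a, 2) for a in alpha_drop])
```

In this simulated case, dropping the unrelated fifth item gives the largest α, which mirrors the rule of removing items that reduce the reliability estimate.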
The range, shape and variance of the distribution of each scale were examined to identify any problem with estimation that may arise from extreme patterns in the data. For this analysis, a reliability coefficient was considered acceptable if it was equal to or greater than 0.70 (2). This criterion was selected because of the previously reported use of some items included in the analysis and the a priori specified conceptual model. In cases where the analysis is conducted on newly developed items, a lower criterion might be employed for the first evaluation. Subsequent evaluation of the items should be conducted using the 0.70 criterion.

Correlation among scales was evaluated systematically both within and between domains to identify potential for redundancy. This analysis identifies highly related factors that could be combined into a single construct. While this does not test a higher order factor model, it is an important step in measure analysis. Hypotheses regarding the exact magnitude of the correlation coefficients were not specified; however, we expected that for the majority of the possible pairs of scales, correlation coefficients would be low (less than 0.30). An inter-scale correlation coefficient greater than 0.50 is a conservative threshold for redundancy, and indicates a need to combine the factors.
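The redundancy check amounts to scanning the inter-scale correlation matrix for pairs above the 0.50 threshold. A minimal Python sketch follows. The function name, the three scale names, and the simulated scores are hypothetical; two of the simulated scales are deliberately built to overlap.

```python
import numpy as np

def redundant_scale_pairs(scores, names, threshold=0.50):
    """Return pairs of scales whose absolute Pearson r exceeds the threshold."""
    r = np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                flagged.append((names[i], names[j]))
    return flagged

# Hypothetical scale scores for 400 respondents.
rng = np.random.default_rng(42)
n = 400
scale_a = rng.normal(size=n)
scale_b = scale_a + rng.normal(scale=0.5, size=n)   # built to overlap with scale_a
scale_c = rng.normal(size=n)                        # unrelated to the other two
scores = np.column_stack([scale_a, scale_b, scale_c])

flagged = redundant_scale_pairs(scores, ["scale_a", "scale_b", "scale_c"])
print(flagged)
```

Only the pair built to overlap should be flagged here; in practice a flagged pair would be a candidate for combination into a single construct.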

RESULTS

Participants were generally young (median age 23 years), were usually single (13 percent were married or living with their partner) and predominantly African American (84 percent). Almost half had completed high school (median number of years in school: 12) and were of low socioeconomic status. The median per capita household income was only $200 per month, and 36 percent of participants received food stamps. Seventy-three percent of the women reported a previous pregnancy and 61 percent had children. At study entry, 29% of participants were using hormonal methods of birth control, usually oral contraceptives; 20% had tubal ligation; 17% reported no current method of birth control; and only 4% were using intravaginal barrier methods, such as a spermicide, the diaphragm, the sponge, or a female condom. Overall, about 40% of the group used male latex condoms (percent figures may add to more than 100, as women may have reported using multiple methods).

All subjects were at high risk for acquiring an STD. The median lifetime number of sex partners at recruitment was 7. Most of the women (65%) had an STD previously. The median age at first intercourse was 16. Only 28 percent of the women were using male condoms consistently, 38 percent were using them inconsistently, and 34 percent were not using them at all.

Psychometric Analysis

The factor loadings from the analysis of items selected for scale development are shown in Table 1. Summary statistics for the scales retained in this analysis are presented in Table 2. A two-factor solution was identified from the seven proposed items that measured self-efficacy for condom use. The two factors delineated self-efficacy for condom use with the regular partner and with other partners. The first scale (for the regular partner) was hypothesized to include five items, but one of the five did not discriminate between the factors and was dropped from the analysis. The second scale (for self-efficacy with other partners) was hypothesized to include two items, but only one item discriminated between the measures of main and other partner self-efficacy. Reliability was adequate for the first scale (Cronbach’s α = 0.79). Reliability for the second scale could not be computed because only one item was retained in the analysis.

Two domains measuring 1) attitudes and 2) beliefs about male condoms were analyzed. Five scales were identified in these two domains, but only four had adequate reliability.


TABLE 1. Factor loadings from rotated factor pattern matrix for items tested in constructing scales, by domain (a)

                                                                            Factor loadings
Domain / scale / item                                                       First factor   Second factor

Self efficacy
  Condom use self-efficacy (main partner)
    Sure can convince regular partner to use condoms                        0.64*          0.01
    Sure can use condoms with regular partner after using drugs             0.69*          0.05
    Sure can use condoms when turned on with regular partner                0.73*          0.05
    Sure can use condoms when regular partner is angry                      0.71*          0.11
  Condom use self-efficacy (other partner)
    Sure can refuse sex with new partner if no condom used                  0.08           0.89*

Attitudes and beliefs about condoms (1)
  Perceived convenience of male condom
    Condoms are inconvenient                                                0.64*          0.09
    Condoms interrupt love-making                                           0.74*          0.16
    Condoms make sex feel unnatural                                         0.63*          0.21
  Perceived need for male condom in trusting relationship
    Condoms not necessary if you trust your partner                         0.09           0.85*
    Condoms not necessary if you are in a long-term relationship            0.18           0.62*

Attitudes and beliefs about condoms (2)
  Perceived need for male condom in faithful relationships (main partner)
    Partner will think you have been unfaithful if you request a condom     0.30           0.79*
    Partner will think you don’t trust him if you request a condom          0.21           0.97*
  Perceived need for male condom in faithful relationships (other partner)
    Other partner will think you play around                                0.78*          0.23
    Other partner will think you have an STD                                0.88*          0.22
    Other partner will think that you believe he has an STD                 0.76*          0.31

Female controlled barrier contraceptive methods
  Pros of female controlled contraception methods
    Man can not tell that you are using it                                  0.01           0.63*
    Can put it in ahead of time                                             0.07           0.91*
    Something that the woman uses                                           0.07           0.66*
  Cons of female controlled contraception methods
    Female condom goes inside you                                           0.54*          0.10
    You have to touch yourself to put it in                                 0.88*          0.01
    You have to use your finger to put the inner ring in                    0.83*          0.01

Satisfaction with communication
  Satisfaction with communication with main partner
    About birth control methods with main partner                           0.63*          0.07
    About using condoms with main partner                                   0.79*          0.05
    About your feeling regarding condoms with main partner                  0.84*          0.05
    About what you like sexually with main partner                          0.54*          0.03
  Satisfaction with communication with friends
    With friends about birth control methods                                0.11           0.66*
    With friends about using condoms for STD prevention                     0.08           0.73*
    With friends about your feelings regarding condoms                      0.09           0.78*

(a) Factor loadings for the target factor are marked with an asterisk.
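The item inclusion rule stated in the Analytic Methods (a high loading on exactly one factor, low loadings on all others) can be expressed as a small filter over a loading matrix. The Python sketch below is only an illustration: the function name, the item names, and the loadings are hypothetical, and treating .60 and .30 as inclusive bounds is an assumption, since the exact comparison operators are not recoverable from the text.

```python
import numpy as np

def select_items(loadings, item_names, primary_min=0.60, cross_max=0.30):
    """Map each retained item to the index of its target factor."""
    abs_loadings = np.abs(np.asarray(loadings, dtype=float))
    retained = {}
    for name, row in zip(item_names, abs_loadings):
        target = int(row.argmax())          # factor with the largest loading
        others = np.delete(row, target)     # loadings on all remaining factors
        # Assumption: both thresholds are treated as inclusive bounds.
        if row[target] >= primary_min and np.all(others <= cross_max):
            retained[name] = target
    return retained

# Hypothetical rotated loadings for five items on two factors.
names = ["item_a", "item_b", "item_c", "item_d", "item_e"]
loadings = [[0.72, 0.05],
            [0.66, 0.12],
            [0.45, 0.41],   # no sufficiently high loading on either factor
            [0.08, 0.81],
            [0.64, 0.35]]   # cross-loading above the low-loading threshold

retained = select_items(loadings, names)
print(retained)  # {'item_a': 0, 'item_b': 0, 'item_d': 1}
```

Items that pass the filter are only candidates for a scale; as the text notes, the criterion is necessary but not sufficient, so reliability still has to be checked on the resulting item set.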

The first factor described the convenience of using male condoms and consisted of three items (α = 0.74). The second scale measured attitudes towards condom use in long-term or trusting relationships and consisted of two items (α = 0.73). The domain pertaining to beliefs about condoms consisted of five items, which resulted in a two-factor solution. These two factors consisted of similar items about the partner’s lack of trust or suspicion when requesting a condom. The “regular partner” factor consisted of two items and the “other partner” factor consisted of three. Both scales had adequate reliability (α = 0.84 and 0.82, respectively).

We specified one domain measuring attitudes toward female-controlled methods of barrier contraception in general and one specifically focused on the female condom. Factor analysis of the first domain identified two factors (pros and cons of female-controlled methods) with three items each. Reliability coefficients were adequate for both scales (α = 0.87 and 0.84, respectively). Although two factors were identified in the PAFA analysis of the domain measuring attitudes specific to the female condom, neither scale had acceptable reliability (α < 0.70). Thus, these factors were not considered further.


TABLE 2. Summary statistics for proposed scales

Scale                                                                      Reliability  Mean  SD    N items  Minimum  Maximum
Condom use self-efficacy with main partner                                 0.79         1.56  0.62  4        1.0      4.0
Perceived convenience of male condom                                       0.74         2.75  0.58  3        1.0      4.0
Perceived need for male condom in trusting relationship                    0.73         3.03  0.64  2        1.0      4.0
Perceived need for male condom in faithful relationships (main partner)    0.84         2.54  0.77  2        1.0      4.0
Perceived need for male condom in faithful relationships (other partner)   0.82         3.13  0.59  2        1.0      4.0
Pros of female controlled contraception methods                            0.87         1.58  0.68  3        1.0      3.0
Cons of female controlled contraception methods                            0.84         2.60  0.56  3        1.0      3.0
Satisfaction with communication with friends                               0.73         1.28  0.42  3        1.0      3.0
Satisfaction with communication with main partner                          0.79         1.38  0.50  3        1.0      3.0
Perceived risk for STD/HIV from main partner                               0.88         1.81  0.76  2        1.0      3.0
Perceived risk for STD/HIV from other partner                              0.91         1.30  0.56  2        1.0      3.0

The next two domains measured aspects of the respondent’s relationship characteristics. The first consisted of five items describing possible outcomes of the respondent’s primary relationship ending. The reliability of this scale was substandard (α = 0.63) and could not be improved by reducing the number of items (data not shown). The second relationship domain included six items measuring power and control. The items in this analysis were originally proposed by Harrison and colleagues (2). No factors were identified in this analysis because the factor loadings failed to meet the specified criteria.

The final domain measured the degree of satisfaction women had in conversations with their partners (regular or other) and friends or family regarding four sensitive topics. Factor analysis was conducted for items measuring satisfaction with communication with their main partner and friends. Items pertaining to communication with other partners were not included because few women in this study reported that they had secondary partners. Reliability for the two scales was adequate (for regular partner, α = 0.79; for friends, α = 0.73).

Factor analysis was not appropriate for the domain that measured perceived HIV/STD risks from regular and other partners as well as condom use norms, because only two items were written to measure each of these factors. Two partner type-specific scales were obtained by taking the average value of the responses to the individual items, and their reliability was high (α = 0.88 and 0.89, respectively). The two items measuring subjective norms, that is, the perception of friends’ use of condoms with regular and other partners, could not be combined into a single scale because of low reliability (α = 0.63).

Discriminant Validity

Correlation analysis of the scales developed in this study indicates little overlap between them, as the correlation coefficients (r) were all less than 0.30. Satisfaction with ability to talk to a regular partner had a borderline correlation with self-efficacy of condom use with a regular partner (r = 0.29).

Impact of Data Reduction on Outcome Modeling

We specified three regression models to evaluate the association of male condom use self-efficacy (with the regular partner) with consistency of male condom use during the 30-day period preceding entry into the study (consistency was measured as the proportion of acts of vaginal intercourse in which a male condom was used) (Table 3). The first model included the five initial questionnaire items as independent predictors. All effect estimates were relatively small (standardized regression coefficients ranging from 0.18 to 0.02), and only two of the five were statistically significant at the conventional significance level. The second model included only the summary score developed through data reduction procedures as an independent predictor of male condom use. The magnitude of the regression coefficient for the summary score was nearly twice as large as the largest in model 1. The third model included both the summary score and four of the five residual scores (i.e., for each item, the residual of the regression of the item on the summary score). Only the summary score was associated with the outcome. The regression coefficients of the residual scores were small, demonstrating that the summary score captured most of the information about the latent self-efficacy variable.
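The three-model comparison can be mimicked on simulated data. The Python sketch below is an illustration under stated assumptions, not a reanalysis: the latent trait, the five items, the outcome, and the effect size of 0.3 are simulated, the summary score is taken to be the simple item mean, and the helper name `standardized_betas` is hypothetical. When the summary score is the item mean, the five item residuals sum to zero for every respondent, so at most four can enter a model alongside the summary score.

```python
import numpy as np

def standardized_betas(X, y):
    """OLS slopes after z-scoring every predictor and the outcome."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    coef, *_ = np.linalg.lstsq(Xz, yz, rcond=None)   # no intercept needed
    return coef

# Simulated data: five noisy items measuring one latent trait.
rng = np.random.default_rng(7)
n = 5000
latent = rng.normal(size=n)
items = latent[:, None] + rng.normal(size=(n, 5))
outcome = 0.3 * latent + rng.normal(size=n)

# Summary score (assumed here to be the item mean).
summary = items.mean(axis=1)

# Residual of each item after regressing it on the summary score.
centered = items - items.mean(axis=0)
s_c = summary - summary.mean()
slopes = s_c @ centered / (s_c @ s_c)
residuals = centered - np.outer(s_c, slopes)

betas_model1 = standardized_betas(items, outcome)                # five single items
betas_model2 = standardized_betas(summary[:, None], outcome)     # summary score only
betas_model3 = standardized_betas(
    np.column_stack([summary, residuals[:, :4]]), outcome)       # summary + 4 residuals

print(np.round(betas_model1, 2))
print(np.round(betas_model2, 2))
print(np.round(betas_model3, 2))
```

In this simulation the summary score carries a larger standardized coefficient than any single item, and the residual scores add little once the summary score is in the model, which is the pattern the three models are meant to reveal.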

DISCUSSION

In this report we demonstrate an application of three data reduction techniques commonly employed in psychometric analysis: 1) factor analysis, 2) estimation of reliability (internal consistency) of multi-item scales, and 3) correlation analysis to evaluate scale redundancy. This procedure improves the efficiency of analyses of complex data and enhances the power to detect important associations.


TABLE 3. Alternative models predicting male condom use using self-efficacy items

Predictor                                                                                                    Beta    p value

Model 1: five single items
  How sure are you that you can convince your regular partner to use condoms                                 0.11    0.01
  How sure are you that you can refuse sex with your regular partner if he objects to using a condom         0.02    0.69
  How sure are you that you can always use condoms with your regular partner when you are using drugs        0.05    0.29
  How sure are you that you can always use condoms with your regular partner when you are really turned on   0.18    0.0001
  How sure are you that you can always use condoms with your regular partner when you are angry              0.05    0.25

Model 2: summary score
  Self efficacy                                                                                              0.27    0.0001

Model 3: summary score and item residuals
  Self efficacy                                                                                              0.28    0.0001
  How sure are you that you can convince your regular partner to use condoms                                 0.05    0.28
  How sure are you that you can refuse sex with your regular partner if he objects to using a condom         0.05    0.31
  How sure are you that you can always use condoms with your regular partner when you are using drugs        0.008   0.85
  How sure are you that you can always use condoms with your regular partner when you are really turned on   0.07    0.10

Introducing apparent redundancy in data collection by designing multiple correlated items allows the investigator to obtain measures of hard-to-define variables whose reliability may be assessed with quantitative methods, rather than judged at face value. Summary measures obtained from multiple questionnaire items have several advantages in addition to their greater reliability. Their use can improve the likelihood of identifying true associations. Selecting among single partially correlated items, rather than among summary scales, can be difficult, and choosing the wrong ones can be disastrous. Selecting any one item according to a data-based criterion (typically size of the effect or statistical significance) is hard to replicate across studies. In a review of self-efficacy measures, Forsyth and Carey report that estimates of the association of single-item independent variables with an outcome are more influenced by random error than are estimates based on reliable multi-item scales, which minimize the influence of random error from any single item (17).

To study potential predictors of consistency of condom use, we wrote five questionnaire items to measure self-efficacy of male condom use with the regular partner. When the items were included as independent predictors in a multiple regression model specifying consistent condom use as the dependent variable, the estimates of the regression coefficients were imprecise. Constructing a composite score had several important effects: 1) it increased the strength of the association with condom use, suggesting that the reliability of the self-efficacy measurement was improved; 2) it reduced the imprecision associated with the inevitable collinearity among the individual items, which were designed as multiple measures of the same latent variable; 3) it decreased the number of parameters needed on the right side of the regression equation, correspondingly increasing the degrees of freedom for the error term. The larger the number of potentially correlated items, the more critical it is to apply data reduction techniques to summarize the information with fewer summary measures.

The methods described in this paper enable the researcher to decide whether a scale and its component items are a consistent measure of the underlying construct (1, 3–5). The large imprecision associated with unreliable scales typically distorts associations toward the null. In this situation, reliability assessment provides the researcher with the appropriate criteria for deciding that neither the single items nor the scale is appropriate for inclusion in analysis. These methods assist the researcher at the model-building stage and help prevent model redundancy.

In our analysis, only 27 (51%) of the 53 items in the initial analysis were retained in the scales deemed acceptable at the end of the process. Numerous items within a domain, as well as items used to measure an entire domain, were not written adequately to survive psychometric analysis, even though these items appeared to have face validity. For example, in the relationship domain, the questionnaire items that we developed could not be combined to measure the impact of a relationship ending. We note that the items used to construct our final scales included measures from all domains outlined in the conceptual framework except the relationship domain. Thus, our data reduction exercise refined the domains outlined in the conceptual framework to produce specific scales that simplify the analysis and strengthen the interpretation of the findings. The exclusion of a relatively large number of variables from the final analysis acknowledges the incomplete success of item construction and the data gathering procedures that measure hypothesized constructs. If data reduction and reliability evaluation are not included in the analytic plan of a study, the researcher is forced to include a large number of correlated items in the analysis, in the hope that they will explain a significant proportion of variability in the outcome.
With a smaller number of reliable scales that are independent by design, the researcher is more likely to detect important associations, and to interpret them properly.

In conclusion, in the present study we have detailed a method for constructing and evaluating scales that is logically appealing and improves both the precision and the validity of an analytic plan. While this methodology may be new to epidemiologists, attention to this aspect of data analysis is relatively recent even in behavioral science research, for which the methods were developed. For example, it has been noted that limited or no attention has been paid to the reliability of self-efficacy scales used in regression analysis of behavioral outcomes (17, 18). Even though application of these methods may appear to be time consuming, it actually facilitates the analytic work downstream and makes interpretation of the results considerably more coherent. Furthermore, the inclusion of this type of analysis facilitates the reader’s ability to accurately interpret the results. As epidemiologists tackle more complex causal pathways leading to biological outcomes, and explore models that predict behavior, we hope that they will add data reduction and reliability assessment techniques to their toolbox.

This project was carried out in part under contract with the National Institute of Child Health and Human Development (Contract N01-HD-13135) and in part under a cooperative agreement with the Centers for Disease Control and Prevention (U48/CCU409679-02, SIP 10). The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.

REFERENCES

1. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.
2. Harrison JS, Kay KL, Dixon D, Peters M, Moore J. Defining and measuring power: Researcher and study participant perspectives: Conference on Psychosocial and Behavioral Factors in Women’s Health. Washington, D.C.; 1996.
3. Nunnally JC, Bernstein IH. Psychometric Theory, 3rd ed. New York: McGraw Hill; 1994.
4. Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles. New York: John Wiley & Sons; 1972.
5. Crocker L, Algina J. Introduction to Classical and Modern Test Theory. New York: Holt, Rinehart & Winston; 1986.
6. Lord FM, Novick MR. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley; 1968.
7. Wilkinson L. Statistical methods in psychology journals: Guidelines and explanations. Task Force on Statistical Inference, APA Board of Scientific Affairs. Am Psychol. 1999;54:594–604.
8. Gorsuch RL. Factor Analysis. Philadelphia, PA: Saunders Books in Psychology; 1974.
9. Harman HH. Modern Factor Analysis. Chicago, IL: University of Chicago Press; 1976.
10. Macaluso M, Carew B, Artz L, Fleenor M, Robey L, Austin H, Hook E, Kelaghan J. A study of the prophylactic efficacy of the female condom among high risk women (Oral presentation). Society for Epidemiologic Research, June 12–15, 1996, Boston, MA. Am J Epidemiol. 1996;143:S78.
11. Artz L, Macaluso M, Brill I, Kelaghan J, Austin J, Fleenor M, et al. Effectiveness of an intervention promoting the female condom to sexually transmitted disease clinic patients. Am J Public Health. 2000;90(2): in press.
12. Ajzen I. The theory of planned behavior. Organ Behav Hum Decis Process. 1991;50:179–211.
13. Mischel W. Toward a cognitive social learning reconceptualization of personality. Psychol Rev. 1973;80:252–283.
14. Bandura A. Social Learning Theory. Englewood Cliffs, NJ: Prentice Hall; 1977.
15. Prochaska JO, DiClemente CC. Stages and processes of self-change of smoking: Toward an integrative model of change. J Consult Clin Psychol. 1983;51:390–395.
16. Prochaska JO, Redding CA, Harlow LL, Rossi JS, Velicer WF. The transtheoretical model and Human Immunodeficiency Virus prevention: A review. Health Educ Quart. 1994;4:471–486.
17. Forsyth AD, Carey MP. Measuring self-efficacy in the context of HIV risk reduction: Research challenges and recommendations. Health Psychol. 1998;17:559–568.
18. Baker SA, Morrison DM, Carter WB, Verdon MS. Using the theory of reasoned action (TRA) to understand the decision to use condoms in an STD clinic population. Health Educ Quart. 1996;23:528–542.