Journal of Substance Abuse Treatment, Vol. 14, No. 5, pp. 439-443, 1997 Published by Elsevier Science Inc. Printed in the USA. All rights reserved 0740-5472/97 $17.00 + .00
PII S0740-5472(97)00123-2
ARTICLE
The Use of Case Vignettes for Addiction Severity Index Training
JOHN S. CACCIOLA, PhD, ARTHUR I. ALTERMAN, PhD, IAN FUREMAN, MA, GARGI A. PARIKH, BS, AND MEGAN J. RUTHERFORD, PhD
Center for Studies of Addiction, Departments of Psychiatry, University of Pennsylvania and Philadelphia Veterans Affairs Medical Center
Abstract-In an attempt to enhance the reliability of Addiction Severity Index (ASI) interviewer severity ratings (ISRs), we developed a set of eight ASI vignettes, fictionalized narrative case summaries that reproduced the quantitative ASI information in case report format. Each vignette has an ISR answer key, a consensus ISR of two expert ASI trainers for each problem area. Additionally, for four of the vignettes the rationale for the correct ISRs was operationalized. This report is a description of the ASI vignette packet, its use as a supplement to standard ASI training, and the results of a pilot study to gauge its effectiveness in improving criterion validity of ISRs. Five ASI videotapes were also developed for the purposes of this investigation. There was limited evidence in this preliminary investigation that the addition of the ASI vignette packet to standard ASI training increased agreement with expert consensus ISRs. The ASI vignettes, relative to videotaped or live observed interviews, do however provide a brief means of assessing the adequacy of ASI interviewer skills with regard to ISRs. Published by Elsevier Science Inc.
Keywords-ASI; Addiction Severity Index; reliability; validity; assessment.
Requests for reprints should be addressed to John S. Cacciola, Ph.D., University of Pennsylvania TRC, 3900 Chestnut St., Philadelphia, PA 19104.
Received October 17, 1996; Accepted April 22, 1997.
INTRODUCTION
The Addiction Severity Index (ASI) is a semi-structured clinical or research interview (McLellan et al., 1980; 1985; 1992). It was developed more than 15 years ago to fill the need for a standardized, reliable and valid instrument with which to evaluate substance-abusing patients. More specifically, it was created to enable clinical researchers to evaluate the treatment outcomes of drug and alcohol patients. Since that time, it has been widely used and has become a standard. The ASI is used internationally and has been translated into numerous languages. Nationally, a number of states, counties, and cities, in programs that they fund, have mandated the use of the ASI for clinical and program evaluation purposes. Finally, the ASI has become a mainstay in substance abuse research.
The ASI assesses patient status in seven areas and obtains demographic information as well. The following seven potential problem areas are evaluated within the ASI: medical, employment, drug use, alcohol use, legal, family/social, and psychological. Questions in each area address lifetime and current (i.e., past 30 days) functioning. Each problem area has several different types of items. The large majority are considered objective items that detail the type, number, and duration of problems and, to a lesser extent, assets. Two more subjective items are included in each problem area: a patient rating of recent problem severity and a patient rating of current need for treatment. The ASI has two summary measures for each problem area. First, interviewer severity ratings (ISRs) are 0-9 point estimates of problem severity, defined as need for treatment. ISRs are a subjective synthesis of all the information in a specific problem area. A two-step derivation procedure for ISRs is outlined in the ASI Manual (Fureman, Parikh, Bragg, & McLellan, 1990). Composite scores are a second type of summary measure and are calculated from a subset of items that reflect current status in a given problem area. The items are standardized and summed to produce a mathematically derived composite score, which ranges from 0.00 to 1.00, for each ASI problem area. Baseline composite scores and ISRs are generally found to be highly correlated (Alterman, Brown, Zaballero, & McKay, 1994; McLellan et al., 1985).
In the initial paper introducing the ASI, McLellan and his colleagues (1980) reported interrater reliabilities of all ISRs exceeding 0.85 (Spearman-Brown) based on research technicians viewing videotaped interviews. A second, more extensive paper on the reliability and validity of the ASI (McLellan et al., 1985) reported comparable interrater reliabilities of the ISRs (range 0.84-0.93) using the same methodology. Other studies have, however, yielded lower reliabilities. Hodgins and El-Guebaly (1992) reported ISR interrater reliabilities ranging from 0.30 to 0.96 (intraclass correlations, ICCs) with raters observing ASI interviews through a one-way mirror. In this study, the authors reported reliabilities of less than 0.60 in three areas: employment (0.30), family/social (0.42), and psychiatric (0.57). Alterman et al. (1994) assessed interrater reliability by having a research technician conduct an ASI interview in the presence of other technicians. The average ICC across the seven problem areas was 0.57. The values for the individual problem areas were: 0.63 medical, 0.53 employment, 0.78 alcohol, 0.31 drug, 0.48 legal, 0.66 family/social, and 0.61 psychiatric.
Since ISRs have been used in research reports to provide patient profiles for treatment matching and in clinical settings to inform treatment planning, interviewer consistency is clearly important.
In this regard, the considerably lower interrater reliabilities reported by Hodgins and El-Guebaly (1992) and Alterman et al. (1994) than were found in the earlier ASI articles are a reason for concern. In an attempt to address this concern, we developed a set of eight ASI vignettes, fictionalized narrative case summaries that reproduced the quantitative ASI information in case report format. Each vignette had an ISR answer key (i.e., a consensus ISR of two expert ASI trainers for each problem area). Additionally, for four of the vignettes the rationale for the expert consensus ISR within a problem area was operationalized. (ASI vignette packets are available from the first author.) This report is a description of the use of this vignette packet to supplement standard ASI training and the results of a pilot study to gauge its effectiveness in enhancing criterion validity of ISRs.
ASI videotapes were also developed for the purposes of this investigation. Three expert ASI trainers conducted a total of five ASI interviews with Center staff members role-playing drug and/or alcohol dependent patients entering treatment. ISRs for these videotapes were determined through a consensus rating of two expert trainers.
METHODS
The ASI Manual (Fureman et al., 1990) provides a derivation procedure for ISRs, but neither the manual nor any other ASI training materials supply behavioral or situational anchor points to assist interviewers in making valid and reliable ISRs. The derivation procedures for ISRs as outlined in the ASI Manual involve two steps: (a) derive a range of scores (2 or 3 points), which best describe the patient's need for treatment at the present time based on the objective data alone; (b) select a point within the range using the subjective data (i.e., patient ratings). Severity is defined as need for treatment according to the following 0-9 scale:
0-1  No real problem, treatment not indicated
2-3  Slight problem, treatment probably not necessary
4-5  Moderate problem, some treatment indicated
6-7  Considerable problem, treatment necessary
8-9  Extreme problem, treatment absolutely necessary
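The scale and the two-step procedure described above can be illustrated with a short Python sketch. It is not material from the ASI Manual: the names `SEVERITY_BANDS`, `severity_label`, and `is_valid_derivation` are hypothetical, and the check encodes only what the text states (step (a) yields a range of 2 or 3 points; step (b) selects a rating inside that range). How an interviewer chooses the range and the point is a clinical judgment that the code does not model.

```python
from typing import Tuple

# Severity bands as listed in the text: (lowest score, highest score, label).
SEVERITY_BANDS = [
    (0, 1, "No real problem, treatment not indicated"),
    (2, 3, "Slight problem, treatment probably not necessary"),
    (4, 5, "Moderate problem, some treatment indicated"),
    (6, 7, "Considerable problem, treatment necessary"),
    (8, 9, "Extreme problem, treatment absolutely necessary"),
]


def severity_label(isr: int) -> str:
    """Return the verbal anchor for an interviewer severity rating (0-9)."""
    if not 0 <= isr <= 9:
        raise ValueError("ISR must be between 0 and 9")
    for low, high, label in SEVERITY_BANDS:
        if low <= isr <= high:
            return label
    raise AssertionError("unreachable: bands cover 0-9")


def is_valid_derivation(score_range: Tuple[int, int], isr: int) -> bool:
    """Check a rating against the two-step procedure (hypothetical helper).

    Step (a): the range from objective data spans 2 or 3 points on the scale.
    Step (b): the final rating lies inside that range.
    """
    low, high = score_range
    width = high - low + 1  # number of scale points in the range
    return 0 <= low <= high <= 9 and width in (2, 3) and low <= isr <= high
```

For example, `severity_label(5)` returns the label of the 4-5 band, and `is_valid_derivation((4, 6), 5)` is true because the range covers three points and contains the rating.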
The vignettes were designed to provide sample cases for individuals being trained in the use of the ASI to practice and calibrate ISRs. The University of Pennsylvania Center for the Studies of Addiction conducts monthly ASI training workshops for Center staff and opens the workshop as space permits to others interested in learning the ASI. These workshops are essentially similar to those conducted nationally and internationally. Participants are forwarded the ASI Manual, score form, selected articles and supplementary materials to review prior to formal training. The 1 1/2-day ASI workshop consists of an overview of the development of the ASI, a review of the intent of ASI questions and interviewing strategies, role playing, and explanations of the ISR derivation procedures, the follow-up ASI and composite scores.
Trainees who were not affiliated with the Center were asked to take part in a research project to test the helpfulness of additional ASI training materials. Informed consent was obtained and participants were paid. Participants were assigned to three different conditions. Assignment was random with the exception that if more than one trainee from a facility agreed to take part in the study, all were assigned to the same condition.
• Condition 1: training alone; participants agreed to observe and rate the five ASI videotapes 3-4 months following their ASI training.
• Condition 2: training plus vignettes; participants agreed to rate the eight ASI vignettes within 1 month following the ASI training and to observe and rate the five ASI videotapes 2-3 months later (i.e., 3-4 months after training). They did not receive an ISR answer key for any of the vignettes.
• Condition 3: training plus vignettes plus annotated answers; this was the same as Condition 2 except that ISR keys with the supporting rationale for the answers accompanied the first four vignettes.
Two studies were conducted. Study 1 addressed whether Condition 3 participants would make more accurate ISRs than Condition 2 participants on the four vignettes for which neither group received the answers (i.e., test vignettes). Our hypothesis was that Condition 3 participants would be more accurate since they had practice and correct feedback for four vignettes (i.e., training vignettes) which would help them on the subsequent four test vignettes, whereas Condition 2 participants had practice but no feedback. Study 2 addressed whether training alone (Condition 1), training with vignette practice (Condition 2) or training with practice and feedback (Condition 3) would yield different levels of ISR accuracy 3-4 months post-initial training. Our hypothesis was that accuracy would be best for Condition 3, worst for Condition 1, and Condition 2 would be intermediate.
Data Analysis
For Study 1, percent exact agreement between raters and the expert ISRs for the four test vignettes was calculated for Conditions 2 and 3 for each problem area and overall. The chi-square statistic was used to determine whether there were significant differences between the two conditions. These analyses were repeated using accuracy within one point in either direction of the expert ISR as an indication of agreement. Also, intraclass correlations (ICCs) with the standard (i.e., expert consensus ISRs) were calculated and compared using t-tests.
Study 2 used similar data analytic procedures. Percent exact agreement between raters and the expert ISRs for the five videotapes was calculated for Conditions 1, 2 and 3 for each problem area and overall. Chi-squares were used to determine whether there were significant differences among the three conditions. These analyses were repeated using accuracy within one point in either direction of the expert ISR as an indication of agreement. Also, ICCs with the standard were calculated and compared using one-way Analysis of Variance (ANOVA).
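The two agreement measures described above (exact agreement, and agreement within one point in either direction of the expert ISR) can be sketched as follows. This is a minimal Python illustration rather than the analysis code used in the study; the function name `percent_agreement` and the example ratings are hypothetical.

```python
from typing import Sequence


def percent_agreement(ratings: Sequence[int],
                      expert: Sequence[int],
                      tolerance: int = 0) -> float:
    """Percent of ratings within `tolerance` points of the expert ISR.

    tolerance=0 gives percent exact agreement; tolerance=1 gives
    agreement within one point in either direction.
    """
    if len(ratings) != len(expert) or len(ratings) == 0:
        raise ValueError("ratings and expert must be non-empty and equal length")
    hits = sum(abs(r - e) <= tolerance for r, e in zip(ratings, expert))
    return 100.0 * hits / len(ratings)


# Hypothetical ratings by one trainee and the expert consensus ISRs
# for the same four cases in a single problem area.
trainee = [5, 3, 7, 2]
consensus = [5, 4, 5, 2]

exact = percent_agreement(trainee, consensus)                  # 2 of 4 match exactly
in_range = percent_agreement(trainee, consensus, tolerance=1)  # 3 of 4 within one point
```

Pooling the hit counts across participants, cases, and problem areas before dividing gives the "overall" figures of the kind reported in the tables.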
Subjects
There were 24 participants in Study 1; 11 were assigned to Condition 2 and 13 to Condition 3. The majority of the participants were women (54.2%), White (66.7% vs. 29.2% Black and 4.2% Hispanic), and clinicians (68.2% vs. 18.2% administrators and 13.6% researchers). On average they were 38.8 ± 9.5 years old with 16.1 ± 2.8 years of education and had worked 5.8 ± 5.9 years in the chemical dependency/mental health field. There were no significant differences (p < .10) among participants in Conditions 2 and 3.
There were 17 participants in Study 2: 6 in Condition 1, 6 (of the original 11) in Condition 2, and 5 (of the original 13) in Condition 3. Thus, not all participants in Study 1 continued into Study 2. The participants were again predominantly women (64.7%), White (58.8% vs. 35.3% Black and 5.9% Hispanic), and clinicians (50.0% vs. 31.3% researchers and 18.8% administrators). On average they were 37.1 ± 8.0 years old with 16.3 ± 2.7 years of education and had worked 5.6 ± 5.7 years in the chemical dependency/mental health field. Participants in Condition 3 were the oldest (42.6 ± 4.4 years) and those in Condition 2 were the youngest (32.1 ± 7.3), with participants in Condition 1 being intermediate (37.3 ± 8.9) (F = 2.82, df = 2,14; p < .10). There were no other significant differences (p < .10) among participants in Conditions 1, 2 and 3.
TABLE 1
Study 1: Comparison of ISR Levels of Agreement for Conditions 2 and 3
                  % Exact Agreement         % Agreement in Range      ICC (M ± SD)
ASI Problem Area  Condition 2  Condition 3  Condition 2  Condition 3  Condition 2   Condition 3
                  (n = 44a)    (n = 52b)    (n = 44a)    (n = 52b)    (n = 44a)     (n = 52b)
Medical           38.6         32.7         77.3         80.8         0.80 ± 0.19   0.85 ± 0.13
Employment        20.5         28.8         54.5         67.3         0.63 ± 0.29   0.68 ± 0.39
Alcohol           20.5         30.8         61.4         80.8*        0.46 ± 0.29   0.49 ± 0.28
Drug              56.8         63.5         75.0         86.5         0.94 ± 0.08   0.88 ± 0.25
Legal             47.7         44.2         77.3         75.0         0.82 ± 0.13   0.82 ± 0.19
Family/social     20.5         25.0         63.6         73.1         0.08 ± 0.38   0.18 ± 0.43
Psychiatric       25.0         25.0         72.7         67.3         0.79 ± 0.24   0.73 ± 0.24
Overall c         30.5         33.0         68.8         75.8*        0.65 ± 0.15   0.66 ± 0.19
a 11 participants × 4 vignettes for each participant.
b 13 participants × 4 vignettes for each participant.
c Condition 2, n = 308 (i.e., 11 participants × 4 vignettes × 7 problem areas); Condition 3, n = 364 (i.e., 13 participants × 4 vignettes × 7 problem areas).
*p < .05.
RESULTS
Study 1
Study 1 results are detailed in Table 1. There were no significant differences in percent exact agreement with the expert consensus ISRs between Condition 2 and Condition 3 in any problem area or overall. When a range of scores (1 point in either direction) was used to assess agreement, Condition 3 participants were significantly more accurate than Condition 2 participants in the alcohol section (80.8% vs. 61.4%; chi-square = 4.44, df = 1, p < .05) and overall (75.8% vs. 68.8%; chi-square = 4.10, df = 1, p < .05). The ICCs for both conditions were generally good and exceeded 0.60 for all problem areas with the exceptions of the alcohol and family/social sections. With regard to the ICCs, there were no significant differences between conditions.
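The two significant chi-square values reported above can be checked from the percentages and sample sizes given in Table 1. The Python sketch below is an illustration under stated assumptions, not the authors' analysis code: the function name `chi_square_2x2` is hypothetical, the agreement counts are reconstructed by rounding each reported percentage times its reported n (e.g., 80.8% of 52 is taken as 42), and the statistic is assumed to be the Pearson chi-square for a 2 × 2 table without continuity correction, a form that reproduces the reported values.

```python
def chi_square_2x2(hits_a: int, n_a: int, hits_b: int, n_b: int) -> float:
    """Pearson chi-square (df = 1, no continuity correction) for two groups.

    Each group contributes a row of the 2 x 2 table: agreements and
    non-agreements with the expert ISR.
    """
    a, b = hits_a, n_a - hits_a   # group A: agree, disagree
    c, d = hits_b, n_b - hits_b   # group B: agree, disagree
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Counts reconstructed from reported percentages (an assumption).
alcohol_c3 = round(0.808 * 52)    # Condition 3, alcohol, within-range agreements
alcohol_c2 = round(0.614 * 44)    # Condition 2, alcohol
overall_c3 = round(0.758 * 364)   # Condition 3, all problem areas
overall_c2 = round(0.688 * 308)   # Condition 2, all problem areas

chi_alcohol = chi_square_2x2(alcohol_c3, 52, alcohol_c2, 44)
chi_overall = chi_square_2x2(overall_c3, 364, overall_c2, 308)
print(round(chi_alcohol, 2), round(chi_overall, 2))
```

Both values exceed 3.84, the critical value of chi-square with 1 degree of freedom at the .05 level, which is consistent with the reported p < .05.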
Study 2
Study 2 results are specified in Table 2. Although percent exact agreement did not differ significantly among the three conditions for any individual ASI problem area, percent exact agreement overall was significantly higher for Condition 3 than Condition 2 (26.3% vs. 16.2%; chi-square = 5.91, df = 1, p < .05). There were no significant differences among the three conditions when comparing agreement within the defined range. Also, with regard to the ICCs, there were no significant differences among conditions.
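As a worked check of the overall exact-agreement comparison between Conditions 3 and 2, the Pearson chi-square for a 2 × 2 table can be computed by hand. The counts used here are assumptions reconstructed from the reported percentages and sample sizes (26.3% of 175 is taken as 46 exact agreements in Condition 3; 16.2% of 210 is taken as 34 in Condition 2), and the uncorrected form of the statistic is assumed because it matches the reported value. With a and b the agreement and non-agreement counts for Condition 3, c and d those for Condition 2, and N = 385:

```latex
\[
\chi^2 = \frac{N\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}
       = \frac{385\,(46 \cdot 176 - 129 \cdot 34)^2}{175 \cdot 210 \cdot 80 \cdot 305}
       = \frac{385 \cdot 3710^2}{896{,}700{,}000}
       \approx 5.91
\]
```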
DISCUSSION
There was limited evidence that the addition of the ASI vignettes and feedback to standard ASI training increased agreement with expert consensus ISRs. The few significant differences that were found indicated superiority of Condition 3. Additionally, in both studies, overall accuracy (i.e., taking into account all the ISRs regardless of problem area) was consistently highest for Condition 3, although usually not significantly so. Using either the four test vignettes or the five videotapes, the ICCs across conditions were roughly comparable to the interrater reliabilities reported by others using observed live interviews (Alterman et al., 1994; Hodgins and El-Guebaly, 1992). The reliabilities were, however, not as high as those reported by McLellan et al. (1980, 1985). Further, the low levels of exact agreement caution against using ISRs as sole indicators of patient status or treatment needs in a particular problem area.
There is a need to improve ISR reliability and validity and/or to develop alternate methods of summarizing problem severity. Composite scores are one accepted alternate method, but have the drawback of being tied to current status items only and thus neglect potentially important historical variables in determining problem severity. McDermott et al. (1996) have developed refined ASI subscales with demonstrated reliability and validity that include current and lifetime information. Another approach is computerized algorithms that have been generated to derive more reliable and valid ISRs in a less subjective manner (Petro, Zanis, Fureman, & McLellan, 1996).
Clearly a larger study needs to be done to assess the value of the training vignettes (vignettes with the annotated expert ISRs) to enhance the criterion validity of the ISRs. This present preliminary investigation provides minimal support for their effectiveness in this regard, but is limited by very small sample sizes. The test vignettes and the ASI videotapes do, however, provide a means of assessing adequacy of ASI interviewer skills with regard to ISRs.
The test vignettes can, in fact, provide a very brief assessment (approximately 1 hour to score four vignettes) of ISR accuracy as opposed to the videotapes, which take about an hour each.
TABLE 2
Study 2: Comparison of ISR Levels of Agreement for Conditions 1, 2 and 3
                  % Exact Agreement                % Agreement in Range             ICC (M ± SD)
ASI Problem Area  Cond. 1    Cond. 2    Cond. 3    Cond. 1    Cond. 2    Cond. 3    Cond. 1       Cond. 2       Cond. 3
                  (n = 30a)  (n = 30a)  (n = 25b)  (n = 30a)  (n = 30a)  (n = 25b)  (n = 30a)     (n = 30a)     (n = 25b)
Medical           13.3       16.7       16.0       53.3       76.7       60.0       0.59 ± 0.23   0.74 ± 0.11   0.71 ± 0.04
Employment        13.3       6.7        24.0       40.0       53.3       44.0       0.45 ± 0.20   0.60 ± 0.22   0.50 ± 0.19
Alcohol           13.3       23.3       4.0        36.7       66.7       48.0       0.42 ± 0.26   0.60 ± 0.32   0.51 ± 0.19
Drug              53.3       30.0       44.0       80.0       66.7       92.0       0.91 ± 0.12   0.87 ± 0.11   0.96 ± 0.03
Legal             23.3       16.7       32.0       60.0       63.3       64.0       0.63 ± 0.32   0.76 ± 0.09   0.53 ± 0.28
Family/social     26.7       6.7        32.0       53.3       46.7       60.0       0.35 ± 0.31   0.19 ± 0.16   0.47 ± 0.21
Psychiatric       23.3       13.3       32.0       53.3       53.3       72.0       0.57 ± 0.24   0.49 ± 0.25   0.77 ± 0.12
Overall c         23.8       16.2       26.3*      53.8       61.0       62.9       0.56 ± 0.17   0.61 ± 0.11   0.64 ± 0.09
a 6 participants × 5 videotapes for each participant.
b 5 participants × 5 videotapes for each participant.
c Conditions 1 and 2, n = 210 (i.e., 6 participants × 5 videotapes × 7 problem areas). Condition 3, n = 175 (i.e., 5 participants × 5 videotapes × 7 problem areas).
*3 > 2, p < .05.
REFERENCES
Alterman, A.I., Brown, L.S., Zaballero, A., & McKay, J.R. (1994). Interviewer severity ratings and the composite scores of the ASI: A further look. Drug and Alcohol Dependence, 34, 201-209.
Fureman, B., Parikh, G., Bragg, A., & McLellan, A.T. (1990). Addiction Severity Index (5th ed.): A guide to training and supervising ASI interviews. University of PA/Philadelphia VAMC, Center for Studies of Addiction.
Hodgins, D.C., & El-Guebaly, N. (1992). More data on the Addiction Severity Index: Reliability and validity with the mentally ill substance abusers. Journal of Nervous and Mental Disease, 180, 197-201.
McDermott, P.A., Alterman, A.I., Brown, L., Zaballero, A., Snider, E., & McKay, J.R. (1996). Construct refinement and confirmation for the Addiction Severity Index. Psychological Assessment, 8, 182-189.
McLellan, A.T., Kushner, H., Metzger, D., Peters, R., Smith, I., Grissom, G., Pettinati, H., & Argeriou, M. (1992). The fifth edition of the Addiction Severity Index. Journal of Substance Abuse Treatment, 9, 199-213.
McLellan, A.T., Luborsky, L., Cacciola, J., Griffith, J., Evans, F., Barr, H.L., & O'Brien, C.P. (1985). New data from the Addiction Severity Index: Reliability and validity in three centers. Journal of Nervous and Mental Disease, 173, 412-423.
McLellan, A.T., Luborsky, L., Woody, G.E., & O'Brien, C.P. (1980). An improved diagnostic evaluation instrument for substance abuse patients. Journal of Nervous and Mental Disease, 168, 26-33.
Petro, P., Zanis, D.A., Fureman, I., & McLellan, A.T. (1996). An examination of the ASI interviewer severity rating system. College on Problems of Drug Dependence 58th Annual Scientific Meeting, San Juan, Puerto Rico, June 1996.