Addictive Behaviors, Vol. 11, pp. 149-161, 1986
Printed in the USA. All rights reserved.
0306-4603/86 $3.00 + .00
Copyright © 1986 Pergamon Journals Ltd

THE RELIABILITY OF A TIMELINE METHOD FOR ASSESSING NORMAL DRINKER COLLEGE STUDENTS’ RECENT DRINKING HISTORY: UTILITY FOR ALCOHOL RESEARCH

MARK B. SOBELL, LINDA C. SOBELL, and FELIX KLAJNER
Addiction Research Foundation and University of Toronto, Ontario, Canada

DANIEL PAVAN
University of Toronto
ELLEN BASIAN
New York University

Abstract-The test-retest reliability of male (n = 40) and female (n = 40) college students’ reports of recent drinking behavior was evaluated using a timeline (TL) procedure. The students also completed a quantity-frequency (QF) questionnaire (Cahalan, Cisin, & Crossley, 1969) often used to categorize subjects’ drinking histories in alcohol research studies. The TL-derived data were found to have generally high reliability (usually r’s ≥ .87) for both males and females, with males having slightly higher reliabilities overall. Subjects were classified into drinker categories based on the QF questionnaire answers, and the resulting groups were compared using their TL-derived data on quantity, frequency, and quantity × frequency (mean number of drinks per drinking day) measures of drinking. The drinking behavior of subjects (as assessed by the TL) had great variability within the QF categories, and there was extensive overlap between subjects classified by the QF method as heavy, moderate and light drinkers. Thus, QF categorization provides a relatively insensitive measure of individual differences in drinking behavior as compared to TL-derived data. The TL method also can be used to generate a variety of potentially useful dependent variables, whereas the QF method generates a single variable.
Acquired tolerance to alcohol, the decrease in the effect of a given dose of alcohol on an individual as a function of drinking experience, is an important determinant of how an individual is affected by alcohol (Cappell & LeBlanc, 1981). Thus, in experimental alcohol research, acquired tolerance is likely to be a mediator of experimental effects. Although it often is not possible to obtain an independent assessment of tolerance levels (e.g., due to ethical considerations, equipment limitations), subjects’ reports of their recent drinking behavior may provide a useful, if rough, estimate of their levels of tolerance. However, the literature on experimental alcohol studies with humans includes relatively few studies where recent drinking history has been used as a variable (see Adesso, 1980, and Marlatt & Rohsenow, 1980, for partial reviews of this literature). When such studies have been conducted, recent drinking history has most often been used as a between-subjects variable where subjects are assigned to groups (e.g., light drinkers, heavy drinkers) based on their drinking practices (e.g., Wilson, Abrams, & Lipscomb, 1980; Steele, Southwick, & Critchlow, 1981), although occasionally the effects of recent drinking history have been investigated using correlational or regression techniques (e.g., Brown & Cutter, 1977).

The authors gratefully acknowledge the assistance of Reinhard Schuller in the computer analysis of data. The views expressed in this publication are those of the authors and do not necessarily reflect those of the Addiction Research Foundation.
Requests for reprints should be sent to Mark B. Sobell, Clinical Institute, Addiction Research Foundation, 33 Russell St., Toronto, Ontario, Canada, M5S 2S1.
One likely reason why individual differences in drinking behavior have not been more intensively investigated is that there is a relative lack of adequate (e.g., sensitive) techniques for assessing drinking behavior. The most common methods of assessing drinking behavior have involved estimation formulae, such as in the Quantity-Frequency (QF) method (e.g., Cahalan, Cisin, & Crossley, 1969; Polich, Armor, & Braiker, 1981). Unfortunately, such methods pose several problems for experimental research, the most notable of which is their relative lack of sensitivity to individual differences. For example, most QF methods require subjects to characterize their drinking in terms of average, or typical, patterns of intake by beverage type, and ignore instances of combined beverage use (e.g., consumption of beer and wine in the same day). These average patterns are then usually aggregated into a small set of categories, each of which contains subjects who may actually have widely varying consumption patterns. Another problem with QF methods is that young adult males, who often comprise the subjects in such studies, have been found to exhibit much less patterned drinking than older drinkers (Cahalan et al., 1969). Thus, for these individuals, their “average” patterns of consumption may yield data not reflective of their actual consumption levels. Further, in some instances atypical but clinically important levels of consumption (i.e., sporadic days of very heavy drinking) have been found to be obscured by a QF procedure (Sobell, Cellucci, Nirenberg, & Sobell, 1982). Future research would benefit, therefore, from the availability of more sensitive measures of recent drinking behavior. The time-line (TL) method, an alternative to QF procedures, provides more sensitive measures of recent drinking behavior.
Several studies have now established the reliability and validity of the TL method for assessing the drinking behavior of both alcoholics and problem drinkers (Sobell, Maisto, Sobell, & Cooper, 1979; Maisto, Sobell, Cooper, & Sobell, 1979; Cooper, Sobell, Maisto, & Sobell, 1980; Sobell, Maisto, Sobell, Cooper, Cooper, & Saunders, 1980; O’Farrell, Cutter, Bayog, Dentch, & Fortgang, 1984; Saunders, Haines, Portmann, Wodak, Powell-Jackson, Davis, & Williams, 1982). Using this procedure, subjects’ estimates of their drinking over a specified time period are partitioned into a set of mutually exclusive categories, such as days abstinent, days 1-3 drinks consumed, etc. Such data can be used to generate several variables describing recent drinking history (e.g., total amount of ethanol consumed, number of drinking days, number of drinking days at specified consumption levels, maximum amount of ethanol consumed on a single day, patterns of drinking). Moreover, since many of the variables derived from TL data are continuous rather than categorical, such data can be used in parametric statistical analyses (e.g., as a covariate in analyses of covariance).

Although the reliability of the TL method has been established with alcohol abusers, at present its generalizability to other populations has not been demonstrated. To date, only one study, as yet unpublished, has investigated the reliability of the TL method of gathering drinking behavior data from nonalcoholic subjects (Maisto, Connors, & Watson, 1984). Although the purpose of that study was to examine differences in drinking behavior as a function of sex and race, test-retest data of self-reported drinking were gathered from one-half of the 96 subjects for a 120-day pre-study interval using a randomized factorial design varying subjects’ and interviewers’ race and sex (n = 12/group, N = 48). Days were coded as abstinent, moderate consumption (1-6 standard drinks) and heavy consumption (over 6 standard drinks).
Of the limited results reported thus far, it was found that the test-retest correlations were .97, .96 and .82 for number of days abstinent, moderate drinking and heavy drinking, respectively. No significant differences were found as a function of subject and interviewer race or sex.
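The kinds of variables that can be derived from a timeline calendar can be sketched in a few lines of code. The following is a minimal illustration, not the authors' analysis program: the function name, the dictionary keys, and the 14-day example calendar are hypothetical, and the cut-points (1-3, 4-6, and more than 6 drinks) are the day categories this paper adopts in its Results.

```python
from itertools import groupby


def timeline_summary(daily_drinks):
    """Summarise one subject's calendar (one Standard Drink count per day)."""

    def longest_run(flags):
        # Length of the longest unbroken run of True values.
        return max(
            (sum(1 for _ in run) for is_true, run in groupby(flags) if is_true),
            default=0,
        )

    drinking = [d > 0 for d in daily_drinks]
    abstinent = [d == 0 for d in daily_drinks]
    n_drinking = sum(drinking)
    total = sum(daily_drinks)
    return {
        "days_abstinent": sum(abstinent),
        "days_1_3": sum(1 for d in daily_drinks if 0 < d <= 3),
        "days_4_6": sum(1 for d in daily_drinks if 3 < d <= 6),
        "days_over_6": sum(1 for d in daily_drinks if d > 6),
        "total_drinks": total,
        "greatest_single_day": max(daily_drinks),
        "longest_abstinent_run": longest_run(abstinent),
        "longest_drinking_run": longest_run(drinking),
        # Undefined for a subject with no drinking days, so report None.
        "drinks_per_drinking_day": total / n_drinking if n_drinking else None,
    }


# Hypothetical 14-day calendar, invented purely for illustration.
example = [0, 0, 2, 0, 0, 5, 8, 0, 1, 1, 0, 0, 0, 3]
summary = timeline_summary(example)
print(summary)
```

Because the four day categories are mutually exclusive and exhaustive, their counts always sum to the length of the calendar, which is a convenient consistency check on coded data.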
The present study had two objectives. The first was to examine the test-retest reliability of the TL method for assessing the recent drinking behavior of normal drinker college students. The second was to compare the TL data with drinking data gathered from the same subjects using a common QF method of scaling (Cahalan et al., 1969).

METHOD
Subjects
The 80 subjects, 40 males and 40 females, were students at the University of Toronto, solicited by posters placed throughout the campus. The posters indicated that people would be interviewed on two occasions about their drinking behavior, and would receive $5.00 for each interview. To be eligible for the study, subjects had to (a) sign an informed consent, (b) present proof that they were at least 20 years of age, (c) score less than three on the Short Michigan Alcoholism Screening Test (SMAST; Selzer, Vinokur, & van Rooijen, 1975), (d) have a negative breath test for blood alcohol at the start of each session (Mobat SM-9, Luckey Laboratories, Inc., San Bernardino, CA), and (e) report having had at least one alcoholic drink in the 90 days prior to the first session. Of those potential subjects who appeared for their first session, one (male) was ineligible based on his SMAST score. Of those subjects who took part in the first session, two (1 male, 1 female) did not appear for their second session. Demographic and general drinking pattern data for the 80 subjects in the test-retest sample were gathered at the first session and are presented in Table 1. Generally, the male and female subjects were of comparable backgrounds.

Procedures
Subjects participated in two sessions scheduled three to four weeks apart. The mean test-retest interval was 22.96 days (range: 21-34 days; see Table 1). Sessions were conducted by two trained research assistants, one male and one female. The design was fully counterbalanced for interviewer sex and subject sex across the two sessions, and the research assistant who conducted a subject’s second session was blind to the subject’s data from the first session.

Session 1.
When potential subjects appeared for their first session, they read and signed an informed consent statement indicating that (a) the purpose of the study was to evaluate methods of gathering information about drinking behavior, and (b) that they would be interviewed again in about three to four weeks. At this time, subjects were not forewarned that in the second session they would be asked to report their drinking behavior for the same time period as in the first session. Subjects were then screened using the eligibility criteria described earlier.

After subjects had completed the SMAST and a brief questionnaire concerning their background and general drinking history, they were asked to recall as accurately as possible their daily drinking behavior over the 90-day period prior to the date of their most recent drink of alcohol (use of such an interval has been suggested by Polich, Armor, & Braiker, 1981). The dates of the 90-day target interval for each subject were marked by the research assistant on a calendar form, similar to that used in a previous study (see Sobell et al., 1980). Subjects were asked to report their alcohol consumption in terms of Standard Drinks, with one Standard Drink defined as containing 13.6 g of absolute ethanol (1 1/2 oz of 80-proof spirits, 12 oz of 5% beer, 5 oz of 12% table wine, 3 oz of 20% fortified wine). The Standard Drink definition appeared on the bottom of the calendar.

Prior to completing the calendar, the subjects were apprised by the research assistant of several techniques which have been found useful in aiding recall of the target interval, such as the use of anchoring (major) events, and identifying extended periods of nondrinking (see Sobell et al., 1980). Subjects were instructed that although the specific calendar days when various levels of drinking had occurred might be recalled in many cases, the most important part of the task involved accurately accounting for all days of the 90-day interval. Thus, if a subject could recall having consumed eight drinks on one day during a particular week, but was not certain of the exact date, he/she was told to estimate as well as possible when the event had occurred.

After completing the calendar, subjects were asked whether the 90-day target interval was representative of their drinking behavior over the past year. Subjects who reported that the interval was not typical of their drinking were asked to complete a second calendar reflecting what they considered to be a 90-day period representative of their drinking in the past year. After completing the calendar(s), subjects were scheduled for their second session, paid $5.00 and dismissed.

Table 1. Background Variables and General Drinking Pattern Description for Normal Drinker College Students

Variable                              All Subjects    Male Subjects   Female Subjects   Male vs. Female
                                      (N = 80)        (n = 40)        (n = 40)          pᵃ
                                      M ± SD          M ± SD          M ± SD
Age                                   24.14 ± 3.58    24.13 ± 3.36    24.15 ± 3.83      n.s. (t = .03)
SMAST Score                            0.45 ± 0.71     0.35 ± 0.66     0.55 ± 0.75      n.s. (t = 1.25)
Yrs. Regular Drinker                   6.74 ± 3.11     6.54 ± 3.00     6.95 ± 3.25      n.s. (t = .58)
Avg. Days/Week Drinking                                                                 Beverage, p < .01ᵇ
  Beer                                 1.21 ± 1.43     1.58 ± 1.64     0.85 ± 1.09      Beverage × Sex, p < .05ᶜ
  Wine                                 1.00 ± 1.00     0.86 ± 0.90     1.13 ± 1.12
  Spirits                              0.63 ± 0.71     0.69 ± 0.75     0.56 ± 0.67
Days Between Tests                    22.96 ± 2.84    22.63 ± 2.79    23.30 ± 2.88      n.s. (t = 1.04)

                                      %               %               %                 p (χ²)
Ethnicity: White                      86.25           85.00           87.50             n.s. (χ²₁ = .11)
Marital Status                                                                          .029ᵈ
  Single                              90.00           97.50           82.50
  Married                              8.75            2.50           15.00
  Separated                            1.25            0.00            2.50
Primary Beverage                                                                        p < .01 (χ²₂ = 10.04)
  Beer                                42.50           60.00           25.00
  Wine                                42.50           30.00           55.00
  Spirits                             15.00           10.00           20.00
Reported Any Combined Beverage Use    50.00           52.50           47.50             n.s. (χ²₁ = .20)

ᵃ Two-tailed t-tests were used to compare means except for the analysis relating to days/week when beer, wine and spirits were consumed, for which a 3 (beverage type) × 2 (sex) repeated measures analysis of variance was used. Differences in percentages were analyzed using Chi-squared tests.
ᵇ F(2,156) = 6.24; simple effects analyses indicated that subjects drank spirits less often than either beer or wine (p < .02).
ᶜ F(2,156) = 4.43; simple effects analyses indicated that the interaction derived from males reporting a higher frequency of drinking beer and females reporting a higher frequency of drinking wine (p < .001).
ᵈ Single vs. pooled Married + Separated; since some cell frequencies were less than 5, a Fisher’s exact test was performed on these data.

Session 2. Subjects were initially asked to sign a second informed consent statement explaining that in the second session they would be asked to provide much of the same data as in the first session, and also to answer a few additional questions. The consent form included the statement: “Repeating a questionnaire on the same person is a standard part of assessing its usefulness.” All subjects who appeared for the second session completed that session. The procedures were identical to those of the first session, including the same 90-day target interval, with three exceptions: (a) the background and screening questionnaires were not readministered; (b) after completing the calendar(s), subjects were administered a quantity-frequency questionnaire (Cahalan et al., 1969) often used in experimental alcohol research studies; and (c) subjects were asked to provide confidence ratings on a scale from 0% to 100% indicating their confidence (a) in the accuracy of the data from their first session, (b) in the accuracy of the data from their second session, and (c) that their data from the two sessions were in agreement. Finally, subjects were paid $5.00 and dismissed.
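The Standard Drink equivalences given to subjects in Session 1 can be checked with simple arithmetic. The sketch below rests on conversion factors that are assumptions and are not stated in the paper: imperial fluid ounces (28.413 mL), an ethanol density of 0.789 g/mL, and 80-proof taken as 40% alcohol by volume. Under those assumptions each listed serving contains the same volume of ethanol and comes out close to the stated 13.6 g.

```python
# Assumed conversion factors (not given in the paper).
ML_PER_IMPERIAL_FL_OZ = 28.413
ETHANOL_DENSITY_G_PER_ML = 0.789

# Stated value from the Standard Drink definition.
STANDARD_DRINK_G = 13.6

# Servings listed in the definition: (volume in oz, alcohol by volume).
SERVINGS = {
    "80-proof spirits": (1.5, 0.40),  # 80-proof assumed to mean 40% ABV
    "5% beer": (12.0, 0.05),
    "12% table wine": (5.0, 0.12),
    "20% fortified wine": (3.0, 0.20),
}


def ethanol_grams(volume_oz, abv):
    """Grams of absolute ethanol in a serving, under the assumed factors."""
    return volume_oz * abv * ML_PER_IMPERIAL_FL_OZ * ETHANOL_DENSITY_G_PER_ML


grams = {name: ethanol_grams(vol, abv) for name, (vol, abv) in SERVINGS.items()}
for name, g in grams.items():
    print(f"{name}: {g:.2f} g ethanol")

# Gram equivalents of 3 and 6 Standard Drinks, rounded to whole grams.
three_drinks_g = round(3 * STANDARD_DRINK_G)
six_drinks_g = round(6 * STANDARD_DRINK_G)
print(three_drinks_g, six_drinks_g)
```

With US fluid ounces (about 29.57 mL) the same servings would work out to roughly 14.0 g, so the choice of ounce matters slightly. The rounded values for 3 and 6 Standard Drinks correspond to the 41 g and 82 g cut-points used for the day categories in the Results.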
RESULTS
Reliability of timeline drinking data
Table 2 presents descriptive statistics and product-moment test-retest reliability coefficients for several drinking behavior variables derived from the TL data. Results are presented for the total sample and for males and females separately. Also, data are summarized for the entire 90-day reporting period, as well as for the three 30-day blocks comprising that interval. The initial test-retest reliability analyses were performed using all data from all subjects. For some variables, however, some subjects had scores of zero in both sessions (i.e., no days in that category). Since it is possible that a substantial number of pairs of zero data points could artificially inflate the correlations (i.e., unreliability among subjects who had non-zero-pairs might not be apparent from the overall group reliability computations), adjusted correlation coefficients were also calculated by excluding all zero-pairs from the data sets. Both the unadjusted (full sample) and adjusted reliabilities appear in Table 2.

For purposes of analysis, drinking days were categorized as abstinent days, light drinking days [1-3 Standard Drinks (SDs); ≤ 41 g of absolute ethanol], moderate drinking days [4-6 SDs; > 41 g but ≤ 82 g of absolute ethanol] and heavy drinking days [> 6 SDs; > 82 g of absolute ethanol]. These categories were selected because they have previously been reported to be associated with various types of alcohol-related health risks (Turner, Mezey, & Kimball, 1977a,b; Lieber, 1984).

As shown in Table 2, with minor exceptions test-retest reliabilities were generally high, regardless of whether zero-pairs were excluded from the analyses. Except for reports of heavy drinking by female subjects during the middle interval (days 31-60; r = .42), unadjusted reliabilities for the variables abstinent days and days of light, moderate and heavy drinking ranged from .76 to .97.
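The effect of excluding zero-pairs can be illustrated with the one case for which the paper lists the raw data: females' reports of heavy drinking days for days 31-60, whose five non-zero Session 1-Session 2 pairs are given in a footnote to Table 2. The sketch below is an illustration rather than the authors' computation; the function names are hypothetical, and the remaining 35 of the 40 pairs are taken to be (0, 0), which follows from the definition of a zero-pair.

```python
from math import sqrt


def pearson_r(pairs):
    """Product-moment correlation for a list of (session1, session2) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


def drop_zero_pairs(pairs):
    """Remove pairs in which both sessions' scores are zero."""
    return [(x, y) for x, y in pairs if not (x == 0 and y == 0)]


# Non-zero pairs as listed in the Table 2 footnote (females, days 31-60,
# days > 6 drinks), plus 35 zero-pairs to make up the 40 female subjects.
nonzero_pairs = [(0, 1), (1, 1), (0, 1), (2, 0), (1, 2)]
all_pairs = nonzero_pairs + [(0, 0)] * 35

unadjusted_r = pearson_r(all_pairs)
adjusted_r = pearson_r(drop_zero_pairs(all_pairs))
print(f"unadjusted r = {unadjusted_r:.2f}, n = {len(all_pairs)}")
print(f"adjusted r = {adjusted_r:.2f}, n = {len(drop_zero_pairs(all_pairs))}")
```

In this case the 35 agreeing zero-pairs move the coefficient from about −.42 to about .42, which shows how strongly a handful of non-zero cases can drive either figure when a behavior is rare.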
Reliability coefficients for male subjects were generally slightly higher than corresponding coefficients for female subjects. As indicated in Table 2, the few cases of ostensibly unacceptable reliabilities among female subjects are largely an artifact of the statistical procedures (e.g., for females’ reports of heavy drinking for days 31-60, their reports were identical in 36 of the 40 cases, and differed by only one day in the other 4 cases). For the other drinking behavior variables in Table 2, unadjusted test-retest correlations were also high, ranging from .77 to .98 with 16 of the 21 (76%) correlations equal to or exceeding .91.

Temporal stability of reliabilities of reported drinking behavior
As indicated in Table 2, no systematic variations were evident in reliability coefficients across the three 30-day intervals. The number of days drinking at various consumption levels by female and male subjects was quite stable over time, although female subjects reported slightly more days of light drinking for the least recent 30-day block (days 61-90). Also, males reported a slightly greater number of both light and heavy drinking days for the most recent 30-day block and slightly more moderate drinking days for the least recent 30-day block. However, in absolute terms, these differences were relatively small for all subjects (i.e., mean differences were typically less than 1 day).

Table 2. Test-retest Drinking Behavior Data (Means ± Standard Deviations) and Correlation Coefficients for Females (n = 40), Males (n = 40) and All Subjects Combined (N = 80) for three Consecutive 30-day Intervals and for the Combined (90-day) Interval

Variable                              Group     Session 1        Session 2        rᵃ     Adjusted r(n)ᵇ
                                                M ± SD           M ± SD

Entire 90-Day Interval
Longest No. Consec. Abstinent Days    All       14.34 ± 15.46    13.15 ± 13.06    .77    .77(79)
                                      Females   12.63 ±  7.07    12.95 ±  9.84    .83    c
                                      Males     16.05 ± 20.68    13.35 ± 15.77    .79    .79(39)
Longest No. Consec. Drinking Days     All        5.53 ± 10.20     5.94 ± 10.43    .98    .98(79)
                                      Females    3.73 ±  2.65     4.05 ±  2.83    .79    c
                                      Males      7.33 ± 14.04     7.83 ± 14.32    .98    .98(39)
Greatest No. Drinks Any Single Day    All        5.80 ±  3.56     5.24 ±  3.22    .95    .94(79)
                                      Females    5.50 ±  3.52     5.15 ±  3.29    .97    c
                                      Males      6.10 ±  3.62     5.33 ±  3.19    .93    .92(39)
Total No. Drinks                      All       69.36 ± 61.56    73.50 ± 69.34    .98    .98(79)
                                      Females   56.05 ± 39.12    57.70 ± 38.05    .95    c
                                      Males     82.68 ± 75.68    89.30 ± 88.21    .98    .98(39)
Days Abstinent                        All       62.78 ± 18.29    60.39 ± 19.12    .96    .96(79)
                                      Females   66.05 ± 14.17    63.78 ± 15.79    .96    c
                                      Males     59.50 ± 21.43    57.00 ± 21.63    .97    .96(39)
Days 1-3 Drinks                       All       21.28 ± 14.13    24.11 ± 15.06    .92    .92(79)
                                      Females   19.68 ± 12.16    23.13 ± 14.49    .94    c
                                      Males     22.88 ± 15.85    25.10 ± 15.73    .92    .91(39)
Days 4-6 Drinks                       All        4.83 ±  6.34     4.15 ±  6.83    .94    .93(60)
                                      Females    3.83 ±  4.66     2.53 ±  3.45    .87    .82(28)
                                      Males      5.83 ±  7.59     5.78 ±  8.78    .97    .97(32)
Days > 6 Drinks                       All        1.13 ±  3.49     1.35 ±  3.87    .95    .93(27)
                                      Females    0.45 ±  0.90     0.58 ±  1.17    .84    .61(12)
                                      Males      1.80 ±  4.79     2.13 ±  5.27    .95    .93(15)

Days 1-30 (Most Recent Interval)
Total No. Drinks                      All       25.05 ± 22.85    25.04 ± 24.24    .93    .93(79)
                                      Females   19.13 ± 14.90    18.38 ± 13.85    .87    c
                                      Males     30.98 ± 27.63    31.70 ± 30.13    .94    .94(39)
Days Abstinent                        All       20.53 ±  6.39    19.94 ±  6.83    .92    .91(79)
                                      Females   21.98 ±  5.26    21.33 ±  5.86    .92    c
                                      Males     19.08 ±  7.13    18.55 ±  7.49    .91    .89(39)
Days 1-3 Drinks                       All        7.26 ±  5.18     8.10 ±  5.31    .87    .87(79)
                                      Females    6.48 ±  4.68     7.65 ±  5.28    .92    c
                                      Males      8.05 ±  5.58     8.55 ±  5.37    .84    .83(39)
Days 4-6 Drinks                       All        1.65 ±  2.27     1.44 ±  2.33    .85    .80(49)
                                      Females    1.35 ±  1.93     0.85 ±  1.41    .76    .65(23)
                                      Males      1.95 ±  2.56     2.03 ±  2.88    .91    .87(26)
Days > 6 Drinks                       All        0.56 ±  1.77     0.53 ±  1.65    .96    .94(20)
                                      Females    0.20 ±  0.46     0.18 ±  0.45    .82    .00(7)ᵈ
                                      Males      0.93 ±  2.42     0.88 ±  2.24    .96    .94(13)

Days 31-60 (Middle Interval)
Total No. Drinks                      All       19.35 ± 20.11    21.88 ± 22.40    .97    .96(77)
                                      Females   15.13 ± 12.33    16.38 ± 11.56    .91    .91(39)
                                      Males     23.58 ± 25.11    27.38 ± 28.65    .98    .98(38)
Days Abstinent                        All       21.90 ±  6.45    20.95 ±  6.30    .93    .92(79)
                                      Females   23.08 ±  4.69    22.28 ±  4.91    .90    c
                                      Males     20.73 ±  7.72    19.63 ±  7.26    .95    .94(39)
Days 1-3 Drinks                       All        6.53 ±  5.03     7.48 ±  5.06    .87    .86(77)
                                      Females    5.78 ±  4.01     6.95 ±  4.68    .86    .85(39)
                                      Males      7.28 ±  5.83     8.00 ±  5.44    .88    .87(38)
Days 4-6 Drinks                       All        1.32 ±  2.14     1.25 ±  2.50    .88    .84(39)
                                      Females    1.05 ±  1.65     0.65 ±  1.29    .80    .71(19)
                                      Males      1.58 ±  2.53     1.85 ±  3.21    .93    .89(20)
Days > 6 Drinks                       All        0.26 ±  1.18     0.33 ±  1.17    .91    .88(14)
                                      Females    0.10 ±  0.38     0.13 ±  0.40    .42    −.42(5)ᵉ
                                      Males      0.43 ±  1.62     0.53 ±  1.58    .93    .92(9)

Days 61-90 (Most Distant Interval)
Total No. Drinks                      All       24.96 ± 22.83    26.14 ± 25.48    .96    .96(79)
                                      Females   21.80 ± 16.53    22.05 ± 16.97    .96    c
                                      Males     28.13 ± 27.60    30.23 ± 31.51    .96    .96(39)
Days Abstinent                        All       20.35 ±  6.59    19.50 ±  6.75    .93    .92(79)
                                      Females   21.00 ±  5.55    20.18 ±  5.88    .91    c
                                      Males     19.70 ±  7.51    18.83 ±  7.53    .94    .93(39)
Days 1-3 Drinks                       All        7.49 ±  5.14     8.56 ±  5.58    .87    .86(79)
                                      Females    7.43 ±  4.85     8.58 ±  5.30    .84    c
                                      Males      7.55 ±  5.48     8.55 ±  5.91    .89    .88(39)
Days 4-6 Drinks                       All        1.86 ±  2.70     1.46 ±  2.55    .92    .90(49)
                                      Females    1.43 ±  1.89     1.03 ±  1.51    .81    .72(24)
                                      Males      2.30 ±  3.28     1.90 ±  3.23    .95    .94(25)
Days > 6 Drinks                       All        0.30 ±  1.04     0.48 ±  1.42    .89    .79(13)
                                      Females    0.15 ±  0.48     0.23 ±  0.70    .96    .87(5)
                                      Males      0.45 ±  1.38     0.73 ±  1.87    .88    .77(8)

ᵃ Correlations based on N = 80 for all subjects, and n’s = 40 for female and male subjects.
ᵇ Correlations based on all non-zero-pairs (i.e., zero-pairs deleted); number of cases on which the correlation is based is indicated in parentheses.
ᶜ No zero-pairs in the data set.
ᵈ Session 1-Session 2 data pairs used in calculating the adjusted correlation (r = .00) were as follows: 1-0, 1-1, 1-1, 1-1, 1-1, 1-2, 2-1.
ᵉ Session 1-Session 2 data pairs used in calculating the adjusted correlation (r = −.42) were as follows: 0-1, 1-1, 0-1, 2-0, 1-2.

Comparison of target period data and typical timeline calendar data
Twenty-two subjects (9 females, 13 males) completed the “typical” 90-day drinking period calendar at the first session, 17 subjects (7 females, 10 males) did so at the second session, and 14 of the above subjects (6 females, 8 males) completed the second calendar at both sessions. Two analyses were conducted using data from this “typical” TL calendar. First, the test-retest reliability was evaluated for the 14 subjects who completed the second calendar at both sessions. The test-retest reliability of this calendar, although somewhat less than that for the target period calendar, was still relatively high, especially considering the small sample size. Unadjusted reliability coefficients were .83 for abstinent days, .77 for light drinking days, .70 for moderate drinking days, .96 for heavy drinking days, and .87 for total number of Standard Drinks consumed.

Since the reliability analysis and inspection of the scatterplots indicated good agreement between the two “typical” calendars, data for the 22 subjects who completed the alternative timeline during the first session were compared with data from those same subjects’ first session target interval calendar. This comparison appears in Table 3 and indicates that the differences are quite minimal. Differences between the target and typical interval data were analyzed in two ways. First, t-tests for correlated data were performed to examine differences in the mean total number of drinks consumed over the two 90-day intervals (Target Interval M = 62.59; Alternative Interval M = 64.45), and the mean number of drinks consumed per drinking day over the two intervals (Target Interval M = 2.37; Alternative Interval M = 2.05). In both cases the differences between intervals were not significant (p > .05). The second analysis was a two-way repeated measures analysis of variance with two within-subject factors, Interval (target, alternative) and Alcohol Consumption Category (days abstinent, days drank 1-3 drinks, days drank 4-6 drinks). In order to meet test assumptions, the fourth consumption category (days drank > 6 drinks) was not included in this analysis. The only significant finding was a main effect of category [F(1,21) = 126.73, p < .001], simply reflecting that the distribution of proportions of days across drinking categories is uneven.
Although no reliable differences were found between data obtained using the target interval or an alternative “typical” interval (see also Table 3), it should be recalled that all subjects completed the target calendar before completing the “typical” calendar. For this reason, the present results cannot be interpreted as suggesting that use of a specified target interval is not necessary with the timeline procedure.

Table 3. Target Interval (90 days prior to last drink) and Alternative Interval (90 day “typical” drinking) Data (Means ± Standard Deviations) for the 22 Subjects who Reported at the First Session that the Target Interval did not Reflect their Typical Drinking Behavior.

Variable                Target Interval    Alternative Interval
                        M ± SD             M ± SD
Days Abstinent          65.36 ± 13.81      62.55 ± 18.26
Days 1-3 Drinks         19.45 ± 11.28      22.55 ± 14.40
Days 4-6 Drinks          4.41 ±  4.81       3.95 ±  6.38
Days > 6 Drinks          0.77 ±  0.92       0.95 ±  3.63
Total No. Drinks        62.59 ± 40.10      64.45 ± 60.36
Drinks/Drinking Day      2.37 ±  0.88       2.05 ±  0.93

Comparison of timeline and quantity-frequency data
Based on subjects’ responses to the QF questionnaire, 33.8% (27) of the subjects were classified as “heavy” drinkers, 25% (20) as “moderate” drinkers, 38.5% (31) as “light” drinkers, 2.5% (2) as “infrequent” drinkers, and 1.25% (1) as an “abstainer”. Since subjects in this study had to meet various eligibility criteria, this categorical distribution cannot readily be compared with those found in undergraduate student surveys. The QF data were gathered from college students in order to evaluate the relative utility of the QF and TL methods for experimental alcohol research. In this regard, Figure 1 presents frequency distributions for three timeline variables (using Session 1 data) for subjects classified by the QF questionnaire as heavy, moderate and light drinkers (the 2 subjects classified as infrequent drinkers and the 1 subject classified as an abstainer are not included). The first TL variable, total number of drinking days, is a measure of frequency of drinking. The second variable, total number of drinks consumed, is a measure of total quantity of drinking. The third variable, mean number of drinks per drinking day, combines quantity and frequency variables and thus bears the closest relationship to the QF questionnaire categories.

Figure 1 illustrates the relative failure of the QF categories to differentiate among individual subjects and, to a large extent, among groups of subjects, at least with respect to the above parameters. The overlap among QF categories, especially the light and moderate drinker categories, is substantial. Further, the range of cases within a given category, in terms of timeline reports of drinking, is often extensive. For example, for the QF category of “heavy” drinkers the total number of drinking days on the TL ranged from 11 to 90, the total number of drinks consumed ranged from 45 to 315, and the mean number of drinks per drinking day ranged from 1.45 to 6.74. As also apparent in Figure 1, one of the advantages of using the TL method is that the resulting data can be used to generate several continuous variables describing different aspects of drinking behavior.

Finally, using the Session 1 TL data, subjects were classified according to the Cahalan et al. (1969) QF categories in order to examine how such a classification procedure would relate to the same classifications but based on the subjects’ actual QF questionnaire responses. Fifty of the 80 subjects (62.5%) were classified identically by the two methods. The majority of the 30 discrepant cases were of two types: (a) 9 subjects (30%) were classified as heavy drinkers by the QF questionnaire but as moderate drinkers based on their TL data, and (b) 12 subjects (40%) were classified as light drinkers by the QF questionnaire but as moderate drinkers based on their TL data. The remaining nine discrepant cases were scattered among the drinking categories, with only one or two cases per category. The major difference between the two methods was that many more subjects (47.5%) were classified as moderate drinkers using their timeline data than when the classifications were based on their QF questionnaire responses (20%). Correspondingly, fewer subjects were classified as either heavy (26.25%) or light (23.75%) drinkers based on their timeline data.

Confidence ratings
Not surprisingly, subjects’ confidence ratings in the accuracy of their reports diminished from the first session (M = 70.83) to the second session (M = 54.25). Their mean confidence rating that their data from the two sessions were in agreement was 58.14. In all three cases, differences between confidence ratings by males and females were negligible; the differences in mean ratings between these groups ranged from 0.60 (first session data) to 4.70 (second session data).

DISCUSSION
The results of this study demonstrate that reports of recent drinking by normal drinker college students obtained using the TL method have high test-retest reliability.
[Figure 1: three histograms, one for each timeline variable (number of drinking days reported by timeline; total number of standard drinks reported by timeline; mean number of standard drinks per drinking day by timeline), with separate bars for heavy, moderate and light drinkers.]

Fig. 1. Histograms showing the relationship between subjects’ drinking classifications (heavy, moderate, or light) based on their answers to the Cahalan et al. (1969) quantity-frequency questionnaire, and three drinking behavior variables derived from Session 1 timeline data: number of drinking days, total number of standard drinks consumed, and mean number of standard drinks consumed per drinking day during the 90-day target interval. Interval scaling is 0.00-4.99, 5.00-9.99, etc. for drinking days, 0.00-9.99, 10.00-19.99, etc. for total drinks consumed, and 0.00-0.19, 0.20-0.39, etc. for mean number of drinks per drinking day.
These findings are consistent with those of Maisto et al. (1984). Moreover, TL data can also be used to generate a variety of continuous as well as nominal variables of potential relevance in alcohol research. For example, using data from the present study and defining weekend days as Friday through Sunday, a two-way repeated measures analysis of variance (Subject Sex; Drinking Period: weekend, weekday) found a main effect for drinking period [F(1,78) = 113.03, p < .001], indicating that subjects drank on a significantly greater proportion of weekend days (M = 0.43) than weekdays (M = 0.21). When the dependent variable in a similar analysis was the number of drinks consumed per drinking day, a significant main effect for subject sex was found [F(1,78) = 3.95, p < .05], reflecting that when subjects drank at all, males consumed a greater mean number of drinks (M = 0.98) than did females (M = 0.67). A significant main effect for drinking period was also found [F(1,78) = 71.63, p < .001], indicating that when drinking occurred, the mean number of drinks consumed on weekend days (M = 1.20) was significantly greater than the mean number of drinks consumed on weekdays (M = 0.45). The Subject Sex × Drinking Period interaction was not significant. Thus, these illustrative analyses suggest that for college students, weekend drinking is more frequent than weekday drinking, and it also involves consumption of greater amounts of alcohol.

In contrast with the TL, the QF method, which has been the most frequently used measure of recent drinking behavior, produces a very limited set of data since it generates only one data point per subject. Because the differentiation between QF categories has been shown to be quite poor in terms of several continuous variables (e.g., number of drinking days), and because the range of drinking behaviors included within each category is large, the QF method developed by Cahalan et al. (1969) has very limited research utility.
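The weekend versus weekday comparison described above depends on splitting each subject's calendar by day of the week. A minimal sketch of that step follows, using the paper's definition of weekend days as Friday through Sunday. The function name, the start date, and the two-week calendar are hypothetical and chosen only for illustration; the resulting per-subject proportions would then be the input to an analysis of variance such as the one reported.

```python
from datetime import date, timedelta


def drinking_day_proportions(start, daily_drinks):
    """Return (weekend, weekday) proportions of days on which drinking occurred.

    start is the calendar date of the first entry in daily_drinks.
    Weekend days are Friday through Sunday, as defined in the paper.
    """
    counts = {"weekend": [0, 0], "weekday": [0, 0]}  # [days, drinking days]
    for offset, drinks in enumerate(daily_drinks):
        day = start + timedelta(days=offset)
        # date.weekday(): Monday is 0, so Friday, Saturday, Sunday are 4, 5, 6.
        period = "weekend" if day.weekday() >= 4 else "weekday"
        counts[period][0] += 1
        if drinks > 0:
            counts[period][1] += 1

    def proportion(days, drinking_days):
        return drinking_days / days if days else None

    return proportion(*counts["weekend"]), proportion(*counts["weekday"])


# Hypothetical two-week calendar starting on a Monday (1 January 2024).
calendar = [0, 0, 1, 0, 3, 5, 0,
            0, 0, 0, 0, 2, 4, 1]
weekend_prop, weekday_prop = drinking_day_proportions(date(2024, 1, 1), calendar)
print(weekend_prop, weekday_prop)
```

Anchoring the calendar to real dates, rather than to day numbers, is what makes day-of-week variables of this kind recoverable from timeline data.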
Since the TL data were also found to have good stability over time, the use of a shorter target interval (e.g., 30 days prior to the last drinking day) should be adequate for most alcohol research studies using college students as subjects. Although the amount of time that it took subjects to complete the 90-day timeline was not specifically measured in the study, most subjects completed the calendar in less than 10 minutes. Thus, if a 30-day interval is used, the amount of time necessary to complete the calendar should be a negligible factor in most studies.
At present, the validity of the TL method, as well as that of any other method for assessing the self-reports of normal drinkers, remains to be established. Although a good deal of evidence, cited earlier, suggests that for clinical populations the TL method has good validity if the individual is not intoxicated when providing the report, it will take considerable ingenuity to validate the self-reports of normal drinker college students. The typical ways of validating the reports of clinical subjects have involved verifying the occurrence of discrete events (e.g., arrests, hospitalizations), using biochemical indices (e.g., Pomerleau & Adkins, 1980), and gathering reports from collateral informants (reviewed in Sobell & Sobell, 1980). Unfortunately, verifiable alcohol-related events are unlikely to occur for normal drinker college students, and given the levels of consumption that characterize most students, biochemical tests (e.g., gamma-glutamyl transpeptidase) are likely to be within the normal range. Breath alcohol tests may be useful, but they can only reflect very recent drinking (i.e., when alcohol is still in the blood). Finally, knowledgeable collateral informants (often a spouse in clinical studies) may not be as readily available for college students.
A possible alternative method for assessing recent drinking behavior is self-monitoring, where subjects are instructed to record their drinking on a daily basis for a
specified time period (Poikolainen & Kärkkäinen, 1983). However, the self-monitoring technique has liabilities for some types of alcohol research. Primary among these is that only some individuals asked to record their drinking will do so, thus yielding a self-selected rather than representative sample of the subject population. Also, self-monitoring requires subjects to record data for some period of time prior to their participation in actual experimental sessions, which may make its use unfeasible in some studies. Finally, in some instances self-monitoring of drinking may be reactive with drinking itself (Sobell & Sobell, 1973), thus yielding data of questionable generalizability.
In summary, the TL method is currently the most empirically established technique for assessing recent drinking behavior with alcohol abusers. This study extended the use of the TL method to normal drinker college students and found that TL data had generally high test-retest reliability. The TL method can be of value in alcohol research studies where measurement of individual differences in drinking behavior is important.
REFERENCES
Adesso, V.J. (1980). Experimental studies of human drinking behavior. In H. Rigter & J. Crabbe, Jr. (Eds.), Alcohol Tolerance and Dependence (pp. 123-154). New York: North-Holland Biomedical Press.
Brown, R.A., & Cutter, H.S.G. (1977). Alcohol, customary drinking behavior, and pain. Journal of Abnormal Psychology, 86, 179-188.
Cahalan, D., Cisin, I.H., & Crossley, H.M. (1969). American Drinking Practices: A national survey of behavior and attitudes. Monograph No. 6. New Brunswick, NJ: Rutgers Center for Alcohol Studies.
Cappell, H., & LeBlanc, A.E. (1981). Tolerance and physical dependence: Do they play a role in alcohol and drug self-administration? In Y. Israel, F.B. Glaser, H. Kalant, R.E. Popham, & W. Schmidt (Eds.), Research advances in alcohol and drug problems, Vol. 6 (pp. 159-196). New York: Plenum Press.
Cooper, A.M., Sobell, M.B., Maisto, S.A., & Sobell, L.C. (1980). Criterion intervals for pretreatment drinking measures in treatment evaluation. Journal of Studies on Alcohol, 41, 1186-1195.
Lieber, C.S. (1984). To drink (moderately) or not to drink? New England Journal of Medicine, 13, 846-848.
Maisto, S.A., Connors, G.J., & Watson, D.W. (1984, March). Test-retest reliability of nonalcoholic young adults’ self-reports of drinking behavior. Poster session presented at the meeting of the Southeastern Psychological Association, New Orleans, LA.
Maisto, S.A., Sobell, M.B., Cooper, A.M., & Sobell, L.C. (1979). Test-retest reliability of retrospective self-reports in three populations of alcohol abusers. Journal of Behavioral Assessment, 1, 315-326.
Marlatt, G.A., & Rohsenow, D.J. (1980). Cognitive processes in alcohol use: Expectancy and the balanced placebo design. In N.K. Mello (Ed.), Advances in substance abuse: Behavioral and biological research (pp. 159-199). Greenwich, CN: JAI Press.
O’Farrell, T.J., Cutter, H.S.G., Bayog, R.D., Dentch, G., & Fortgang, J. (1984). Correspondence between one-year retrospective reports of pretreatment drinking by alcoholics and their wives. Behavioral Assessment, 6, 263-274.
Poikolainen, K., & Kärkkäinen, P. (1983). Diary gives more accurate information about alcohol consumption than questionnaire. Drug and Alcohol Dependence, 11, 209-216.
Polich, J.M., Armor, D.J., & Braiker, H.B. (1981). The course of alcoholism: Four years after treatment. New York: John Wiley.
Pomerleau, O., & Adkins, D. (1980). Evaluating behavioral and traditional treatment for problem drinkers. In L.C. Sobell, M.B. Sobell, & E. Ward (Eds.), Evaluating alcohol and drug abuse treatment effectiveness: Recent advances (pp. 93-108). New York: Pergamon Press.
Saunders, J.B., Haines, A., Portmann, B., Wodak, A.D., Powell-Jackson, P.R., Davis, M., & Williams, R. (1982). Accelerated development of alcoholic cirrhosis in patients with HLA-B8. Lancet, 8286, 1381-1384.
Selzer, M.L., Vinokur, A., & van Rooijen, L. (1975). A self-administered short Michigan Alcoholism Screening Test (SMAST). Journal of Studies on Alcohol, 36, 117-126.
Sobell, L.C., Cellucci, T., Nirenberg, T.D., & Sobell, M.B. (1982). Do quantity-frequency data underestimate drinking-related health risks? American Journal of Public Health, 72, 823-828.
Sobell, L.C., Maisto, S.A., Sobell, M.B., & Cooper, A.M. (1979). Reliability of abusers’ self-reports of drinking behaviour. Behaviour Research & Therapy, 17, 157-160.
Sobell, L.C., & Sobell, M.B. (1973). A self-feedback technique to monitor drinking behavior in alcoholics. Behaviour Research & Therapy, 11, 237-238.
Sobell, L.C., & Sobell, M.B. (1980). Convergent validity: An approach to increasing confidence in treatment outcome conclusions with alcohol and drug abusers. In L.C. Sobell, M.B. Sobell, & E. Ward (Eds.), Evaluating alcohol and drug abuse treatment effectiveness: Recent advances (pp. 177-183). New York: Pergamon Press.
Sobell, M.B., Maisto, S.A., Sobell, L.C., Cooper, A.M., Cooper, T.C., & Saunders, B. (1980). Developing a prototype for evaluating alcohol treatment effectiveness. In L.C. Sobell, M.B. Sobell, & E. Ward (Eds.), Evaluating alcohol and drug abuse treatment effectiveness: Recent advances (pp. 129-150). New York: Pergamon Press.
Steele, C.M., Southwick, L.L., & Critchlow, B. (1981). Dissonance and alcohol: Drinking your troubles away. Journal of Personality and Social Psychology, 41, 831-846.
Turner, T.B., Mezey, E., & Kimball, A.W. (1977a). Measurement of alcohol-related effects in man: Chronic effects in relation to levels of alcohol consumption, Part A. Johns Hopkins Medical Journal, 141, 235-248.
Turner, T.B., Mezey, E., & Kimball, A.W. (1977b). Measurement of alcohol-related effects in man: Chronic effects in relation to levels of alcohol consumption, Part B. Johns Hopkins Medical Journal, 141, 273-286.
Wilson, G.T., Abrams, D.B., & Lipscomb, T.R. (1980). Effects of intoxication levels and drinking pattern on social anxiety in men. Journal of Studies on Alcohol, 41, 250-264.