Addictive Behaviors 31 (2006) 1035 – 1049
Application of item response theory to quantify substance use disorder severity
Levent Kirisci a,*, Ralph E. Tarter a, Michael Vanyukov a, Chris Martin b, Ada Mezzich a, Stacy Brown a
a Center for Education and Drug Abuse Research, Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh, 711 Salk Hall, Pittsburgh, PA 15261, USA
b University of Pittsburgh Medical Center, Department of Psychiatry, 3501 Forbes Avenue, Pittsburgh, PA 15213, USA
Abstract
Objective: The present investigation had two main goals: (1) determine whether binary substance use disorder (SUD) diagnoses are indicators of a unidimensional trait indexing severity of disorder; and (2) demonstrate the predictive, concurrent and construct validity of the SUD severity scale.
Methods: Boys and their biological parents were administered structured diagnostic interviews to diagnose SUD. Item response theory (IRT) was applied to determine whether the diagnoses are indicators of a unidimensional trait. The score on this scale was correlated with substance use behavior, violence, treatment history, risky sex, and social adjustment.
Results: SUD diagnoses are indicators of a unidimensional latent trait. Maternal and paternal SUD severity predicted son’s SUD severity at age 19. The score on the SUD severity scale correlated with drug use frequency, number of different drugs used in lifetime, treatment seeking, illegal behavior, social maladjustment, and risky sex.
Conclusion: SUD can be quantified on an interval scale indexing severity of disorder. The advantages of measuring SUD severity as a continuous trait are discussed.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Substance use disorder; Item response theory; Risky sex; Violence
* Corresponding author. Tel.: +1 412 624 1070; fax: +1 412 383 9960. E-mail address: [email protected] (L. Kirisci).
0306-4603/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.addbeh.2006.03.033
Two perspectives guide the designation of a problem concomitant to consumption of abusable drugs. One approach emphasizes behavior topology; that is, it defines the presence of a problem according to quantity or frequency of substance use in the context of contemporary societal norms. Dipsomania and narcomania, albeit outdated terms, denote the presence of a problem according to excessive behavior. The second approach aligns with the clinical perspective as, for example, the DSM taxonomic system wherein criteria for determination of a problem are adverse consequences of substance consumption. The use of each approach for determination of a problem has important advantages and disadvantages. For example, measurement of consumption behavior enables quantification of severity on an interval or ratio scale. A disadvantage is that social norms differ widely between social and ethnic groups and change over time. Hence, designation of a problem based on consumption behavior is not appropriate for a clinical diagnosis. On the other hand, whereas the DSM taxonomy has established, although arbitrary, criteria defining the diagnostic threshold, it does not enable determination of severity of disorder. The DSM system denotes only presence or absence of disorder.
Substance use disorder (SUD) is one of the most important, if not the most important, variables in drug abuse research. Accordingly, it is important to accurately measure this clinical phenotype. Considered solely from the perspective of measurement theory, there are many advantages to quantifying this phenotype on an interval or ratio scale beyond the current dichotomous description. Moreover, because co-occurring SUD diagnoses are commonly present in the same individual, it is highly desirable to characterize the individual in a fashion that accurately accommodates the impact of each specific disorder on the overall clinical presentation.
In addition, it is noteworthy that both the genetic underpinnings and biobehavioral indicators of liability for SUD are largely but not entirely common across abusable drugs (Tsuang et al., 1998; Vanyukov, Kirisci et al., 2003; Vanyukov, Tarter et al., 2003). In a genetically informative adoption study, Yates, Cadoret, Troughton, and Stewart (1996) did not observe a relation between the type of substances abused by the biological parents and offspring. While some family studies (e.g. Merikangas et al., 1998) noted specificity of substance use in parents and offspring, the weight of the evidence indicates that most of the variance underlying liability is accounted for by common factors (Kendler, Jacobson, Prescott, & Neale, 2003; Tsuang, Bar, Harley, & Lyons, 2001). Regardless of the extent to which common and specific factors influence SUD liability, the fact that a significant portion of liability variance is due to common factors illustrates that it is not appropriate to merely sum these correlated disorders to quantify severity of disorder. Lastly, it should be noted that the probability of qualifying for SUD diagnosis is not the same for each drug category due to differences in both opportunities to access particular compounds and severity of presaging liability. Thus, while the importance of quantification of SUD is recognized, research has not yet been conducted to demonstrate that this clinical phenotype is amenable to measurement beyond only designation of its presence or absence. Prior research using item response theory (IRT) methods has revealed that the symptoms used for diagnosing SUD in the DSM-IV are indicators of a unidimensional latent trait (Kirisci, Vanyukov, Dunn, & Tarter, 2002). In effect, severity of clinical disturbance can be quantified on a scale that captures the full spectrum of severity spanning both the subthreshold and suprathreshold segments of the diagnostic distribution.
Within the suprathreshold segment of the distribution, defined by the presence of one or more SUDs, it is not known whether the clinical disorder can be similarly quantified beyond mere denotation of a diagnosis. One common method of quantifying SUD severity consists of tabulating the number of co-occurring diagnoses. This procedure is unsatisfactory because, as noted above, the various SUD categories are not independent. It is also noteworthy that in the absence of an accurate method of
measuring SUD severity, the total diagnostic configuration is often not considered. It is not uncommon, for example, for investigators to focus on one SUD without accounting for, or even acknowledging, SUD comorbidity. For instance, research on SUD consequent to cocaine use may simply ignore SUDs consequent to alcohol, tobacco or other drug use. This decision to exclude important information about diagnostic status is due largely to the lack of an acceptable method to depict the presence of a particular phenotype (e.g. SUD consequent to cocaine use) in context of all co-occurring SUDs reflecting overall severity of disorder. The present investigation had two major aims. The first goal was to demonstrate that binary SUD diagnoses are indicators of a unidimensional SUD trait. It was hypothesized that a two-parameter logistic item response theory (IRT) model provides an accurate fit of the data. The second aim was directed at demonstrating that the derived SUD severity scale has predictive, concurrent and construct validity. It was hypothesized that (a) SUD severity score in parents is a significant predictor of SUD severity in their children; (b) SUD severity is associated with treatment seeking, risky sexual behavior, psychosocial maladjustment, and history of illegal behavior; and, (c) SUD severity covaries with current drug use involvement. Demonstrating that the SUD clinical phenotype can be accurately quantified on a continuous scale has potentially important practical ramifications for designing interventions tailored to severity of disorder.
1. Method
1.1. Participants
Families were recruited consisting of a proband father, current or past spouse, and the oldest biological child in the 10–12-year age range. The fathers either qualified for a DSM-III-R diagnosis of SUD consequent to consumption of an illegal compound or consumption of a legal compound without medical prescription (SUD+; N = 248) or had no adult psychiatric disorder (SUD−; N = 248). To qualify for inclusion of the family, the men were required to have an IQ of 80 or higher, good health, and absence of lifetime psychosis. Co-occurring diagnoses of alcohol, tobacco or caffeine abuse/dependence did not preclude participation in this study. Because of low prevalence and inaccessibility of SUD+ men satisfying the above ascertainment criteria, it was not feasible to employ only a random sampling strategy. Thus, in addition to recruitment conducted using random digit telephone calls, men were also accrued using newspaper and radio advertisements and public service announcements. Approximately 25% of the sample of SUD+ men were identified after they were discharged from substance abuse treatment. A previous analysis of the characteristics of the sample has revealed that it is similar in socioeconomic status, SUD severity, and pattern of comorbid psychiatric disorder to age-equivalent men in the Epidemiological Catchment Area study (Tarter & Vanyukov, 2001). The mean age of the probands was 40.4 years. European–Americans comprised 76.7% of the sample. Mean family socioeconomic status according to Hollingshead (1975) criteria was 40.3, indicating that this sample was primarily middle class.
The biological mothers of the children were also recruited. With the exception of psychosis, there were no psychiatric exclusionary criteria. To qualify for study, the women were required to have an IQ of at least 80 and good health. Their mean age was 37.9 years.
DSM-III-R criteria were employed because this research was initiated prior to the publication of the DSM-IV manual.
A baseline evaluation was conducted on the biological male children when they were 10–12 years old. They were re-evaluated at ages 12–14, 16 and 19. Females were not included in this study because an insufficient number were available to conduct longitudinal analyses due to commencement of their recruitment several years after the boys. Because the DSM-IV manual was published prior to the time the boys reached age 19, the DSM-III-R criteria were used to diagnose SUD. Table 1 summarizes the lifetime SUD diagnoses in the fathers and mothers at the time the boys were enrolled in the project. The lifetime SUD diagnoses include abuse and dependence. SUDs consequent to alcohol, cannabis and cocaine consumption were the most frequent disorders. In addition, the distribution of SUD diagnoses in the boys (n = 288) at age 19 is presented. The most frequent SUDs pertained to cannabis (19.8%) and alcohol (15.8%) abuse.
1.2. Instrumentation
1.2.1. Structured Clinical Interview for DSM-III-R
An expanded version of the Structured Clinical Interview for DSM-III-R (SCID) (Spitzer, Williams, Gibbons, & First, 1990) was administered to characterize lifetime and current psychiatric and substance use disorders in the parents and their sons. Diagnoses were formulated during a clinical conference chaired by a psychiatrist certified in addiction psychiatry and attended by another psychiatrist or a psychologist along with the clinical associates who conducted the interviews. The best estimate procedure was used to formulate the diagnoses (Leckman, Sholomskas, & Thompson, 1982). In this procedure, the results of the diagnostic interview along with medical, legal, and social history information obtained from other facets of the research protocol and official records were considered in aggregate when formulating the diagnoses. In this study, lifetime substance use disorder diagnoses were used for parents and their sons.
1.2.2. Substance use
Past month frequency of use of 20 compounds was documented using Section 1A of the revised Drug Use Screening Inventory (DUSI-R) (Tarter, 1990). Total level of past month drug involvement was computed by summing the frequency of drug use across all compounds when the boys were 19 years old. In addition, total
Table 1
Lifetime SUD diagnosis in fathers, mothers and sons

Diagnosis        Fathers (n = 496)   Mothers (n = 496)   Sons (age 19) (n = 288)
                 %                   %                   %
Any SUD          52.0                25.6                24.8
Alcohol          42.6                18.5                15.8
Amphetamines     8.5                 4.8                 2.2
Cannabis         34.3                10.2                19.8
Hallucinogens    4.0                 1.0                 1.8
Cocaine          24.3                6.0                 .7
Inhalants        .6                  .4                  0
Opiates          11.6                4.8                 1.8
PCP              2.7                 .4                  .7
Sedatives        4.6                 2.7                 .7
number of drugs tried in lifetime from approximately 40 compounds was recorded using the Drug Use Chart (unpublished).
1.2.3. Psychosocial maladjustment
Five DUSI-R (Tarter, 1990) scales were used as indicators of a latent trait to quantify severity of social maladjustment when the boys were 19 years of age. The scales used to create the social maladjustment construct were Work Adjustment (e.g. “Did you have trouble getting along with bosses?”), Peer Relationships (e.g. “Compared to most people, do you have few friends?”), Leisure and Recreation (e.g. “Did you do most of your recreation or leisure activities alone?”), School Adjustment (e.g. “Did you feel welcome in school clubs or extracurricular activities?”), and Social Competence (e.g. “Was it difficult to make friends in a new group?”). Confirmatory factor analysis revealed that these scales were indicators of a unidimensional latent trait (chi-square = .720, df = 4, p = .95, root mean square error of approximation < .001).
1.2.4. Psychiatric treatment
A 16-item self-report Psychiatric Treatment History Questionnaire (unpublished) was administered to the boys at age 19 to assess lifetime history of psychiatric treatments for emotional problems, drinking problems and drug problems in outpatient and inpatient treatment facilities. The psychiatric treatment variable is a summary score of total number of treatments received by the boys at age 19.
1.2.5. Illegal behavior
The 85-item self-report Severity and History of Offenses Scale (Andrew, 1974) was administered to the boys at age 19 to document lifetime illegal behavior. Cronbach’s alpha coefficient was .93.
1.2.6. Risky sex
The CEDAR Follow-up Protocol (unpublished) was administered to the boys at age 19 to assess outcomes such as physical well-being, criminal activities, risky sex, school activities, etc. in the prior three years.
The questions that were used to assess risky sex (yes/no) were “In the past 3 years, have you given sex to get drugs?”, “In the past 3 years, have you received money for sex?”, and “In the past 3 years, have you given someone drugs to get sex?”.
1.3. Procedure
Written informed consent was obtained from the parents prior to administering the research protocols. The boys provided written informed consent at age 19. Prior to age 19, they provided written assent. All of the study participants were additionally informed that the findings from this research were protected by a Certificate of Confidentiality issued to the Center for Education and Drug Abuse Research (CEDAR) by the National Institute on Drug Abuse.
1.4. Statistical analysis
Confirmatory factor analysis (CFA) was conducted to document unidimensionality of the SUD severity scale. MPLUS (Muthen & Muthen, 2001) was used to conduct confirmatory factor analysis.
Item response theory (IRT) was employed to document the psychometric properties of SUD diagnoses and to derive the latent SUD severity trait. IRT relates performance of the subject on a test item to a latent trait. In this study, IRT was used to assess the psychometric characteristics of SUD diagnoses and to derive the SUD severity scale. The advantages of IRT, compared to traditional psychometric approaches, are well documented (Embretson & Reise, 2000; Hambleton, Swaminathan, & Rogers, 1991; van der Linden & Hambleton, 1997). Classical psychometric theory, although widely used in constructing and evaluating scales for measuring alcohol and drug use, is limited in that subject characteristics and scale characteristics are interrelated and cannot be separated. Item threshold and item discrimination parameters as well as reliability and validity of a scale must also be interpreted in the context of a particular sample. Hence, scale and item parameters vary across samples. This limitation of classical measurement theory is referred to as group dependency. In addition, reliability is quantified in classical measurement theory as the correlation between test scores on parallel forms. In practice, it is impossible to create strictly parallel scale forms. Therefore, the classical reliability coefficient reflects either a lower bound estimate or an estimate having unknown biases. Furthermore, classical measurement theory is test-oriented rather than item-oriented. Accordingly, it cannot inform about the probabilistic relationship between the diagnoses and the overall scale. Also, in classical measurement theory the standard error of measurement is constant among scores in the population. In contrast, IRT provides error estimates that are specific to the trait level. Importantly, the latent trait level estimates are not scale-dependent and item characteristics are not group-dependent. 
Hence, as in this study, IRT methods demonstrate whether a subject’s score at a particular latent trait level indicates that the probability of qualifying for a diagnosis is the same regardless of recruitment source. In addition to the above advantages, IRT informs about the relationship between responses (in this study, presence or absence of diagnosis) to sets of “items” (categories of SUD diagnoses) and the subject’s latent trait (SUD severity trait). Provided that the model fits the data, the information obtained from IRT analyses thus enables documenting SUD severity across the gradient of latent trait scores taking into account differences among items in discriminating between trait levels. A two-parameter IRT logistic model was used in this study. It is described as
$$P_j(\theta_i) = \frac{1}{1 + \exp\left(-D a_j (\theta_i - b_j)\right)}, \quad i = 1, \ldots, n,$$
where $\theta$ is a continuous variable (latent SUD severity trait), $P_j(\theta_i)$ is the probability of subject $i$ endorsing an item $j$ (qualifying for SUD diagnosis), $a_j$ is the item discrimination parameter, $b_j$ is the item threshold (or location) parameter, $n$ is the number of subjects, and $D$ is the scaling constant used to approximate the logistic model to the normal ogive model. The threshold and discrimination parameters were estimated using PARSCALE (Muraki & Bock, 1997). This procedure utilizes the marginal maximum likelihood method to calibrate items and the Bayesian expected a posteriori method to estimate latent trait scores. The probability of endorsing an item (qualifying for SUD diagnosis) is related to the SUD severity scale as a monotonically increasing S-shaped item response function (IRF). The trait value at which 50% of the sample qualifies for a diagnosis is referred to as the item threshold parameter.
The item discrimination (a) parameter is proportional to the slope of the item response function at this trait value (for the logistic model above, the slope there equals Da/4); that is, it governs the rate at which the probability of endorsement of a diagnosis changes as the latent trait score increases at the point of inflection. Higher item discrimination values are associated with steeper IRFs. In other words, higher discrimination parameters indicate a stronger relationship between SUD severity and observed diagnosis. A diagnosis with a low value of a would result in an IRF that increases gradually as a function of SUD
severity. The item threshold parameter determines the position of the curve along the latent trait. The position of the curve on the SUD severity scale corresponds to the severity of the diagnosis. A higher threshold parameter indicates that fewer subjects endorse a particular diagnosis. In other words, a higher trait score (higher score on the continuum of the SUD severity scale) is required for the person to endorse the particular diagnosis; hence, individuals whose scores fall on the right of the scale are more severe cases compared to individuals whose scores are on the left of the scale.
1.4.1. Item response model assumptions
Two testable assumptions need to be satisfied when applying IRT models. The unidimensionality assumption implies that only one latent trait is measured by the items used in developing the scale, i.e., the probability of endorsing a diagnosis is a function of only one latent trait. In addition, if the residual covariances among items are small (rather than zero) at a given trait level, it is possible that there exists a dominant factor to be measured by a set of items, even though other factors may be present (Stout et al., 1996). The second assumption is that no relationship is present between the subject’s responses to different items (or diagnoses) after taking into account the subject’s latent trait level. This is referred to as the local independence assumption (Lord, 1980). Unidimensionality is a sufficient condition for satisfying the local independence assumption.
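The two-parameter logistic model and the local independence assumption described above can be sketched in a few lines of Python. This is a minimal illustration, not the PARSCALE implementation: the value D = 1.7 is the conventional normal-ogive scaling constant and is an assumption here (the text does not state the value used), and the item parameters are hypothetical.

```python
import math

D = 1.7  # assumed scaling constant; the text does not report the value used


def irf(theta, a, b, d=D):
    """Two-parameter logistic item response function P_j(theta)."""
    return 1.0 / (1.0 + math.exp(-d * a * (theta - b)))


def pattern_likelihood(theta, responses, items, d=D):
    """Probability of a 0/1 diagnosis pattern at a fixed trait level.

    Under local independence the joint probability is the product of the
    per-item probabilities once theta is held fixed.
    """
    prob = 1.0
    for u, (a, b) in zip(responses, items):
        p = irf(theta, a, b, d)
        prob *= p if u == 1 else 1.0 - p
    return prob


# Hypothetical (a, b) pairs, chosen only for illustration.
items = [(2.0, 0.5), (1.0, 1.5)]

print(irf(0.5, 2.0, 0.5))                      # at theta = b the probability is 0.5
print(pattern_likelihood(1.0, (1, 0), items))  # first diagnosis present, second absent
```

At the threshold the curve passes through .5 regardless of a, and a numerical derivative taken at that point approaches D·a/4, which is why larger discrimination values give steeper curves.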
2. Results
2.1. Confirmation of the SUD severity trait
2.1.1. Confirmatory factor analysis (CFA)
Unidimensionality of the SUD severity trait in parents and sons was established using confirmatory factor analysis (CFA) allowing for correlated error terms between diagnoses. A tetrachoric correlation matrix, in conjunction with the weighted least squares method, was used to test unidimensionality of the factor structure. The results of the CFA revealed an adequate data-model fit for fathers (χ² = 21.06, df = 17, p = .22, root mean square error of approximation (RMSEA) = .022), mothers (χ² = 22.56, df = 27, p = .75, RMSEA < .001), and sons (χ² = 11.68, df = 20, p = .93, RMSEA < .001).
2.1.2. One-parameter IRT model vs. two-parameter IRT model
A two-parameter IRT model was compared to a one-parameter Rasch model. The one-parameter model estimates latent scores allowing individual items to have different difficulty but with the same discrimination. The two-parameter model fit the data better than the one-parameter model for fathers (change in χ² = 81.70, change in df = 8, p < .001) and sons (change in χ² = 15.01, change in df = 7, p = .04). Borderline significance was observed in the mothers (change in χ² = 13.19, change in df = 7, p = .06). Based on these results, the two-parameter model was used to estimate item parameters and latent trait scores.
2.1.3. Computing SUD severity trait scores
SUD severity scores were computed using the two-parameter IRT model for nine SUD diagnoses consequent to consumption of alcohol, amphetamines, cannabis, cocaine, hallucinogens, inhalants, opioids, PCP, and sedatives. These SUD diagnoses were all used in the IRT analysis of fathers. Because the number of mothers who qualified for SUD diagnosis consequent to inhalant and PCP use was small, which
produced large standard errors of estimate (i.e. indicated inaccurate estimates), these SUDs were excluded from analysis. In the boys, low endorsement of diagnoses other than alcohol, cannabis, and amphetamine yielded large standard errors (i.e. indicated inaccurate estimates). Thus, only alcohol, cannabis, and amphetamine use disorders were included in the SUD severity scale for boys. Item difficulty (threshold) and discrimination parameters are presented in Table 2. As can be seen, the item discrimination parameters for parents and their sons are in the moderate to high range. Alcohol use disorder has the highest item discrimination value in fathers (a = 6.81) and sons (a = 2.34) whereas hallucinogen use disorder (a = 2.43) has the highest item discrimination value for mothers. These results indicate that these diagnoses discriminated best between individuals having high and low SUD severity; that is, they had the strongest associations between SUD severity and observed diagnoses. Opioid use disorder had the lowest item discrimination parameter for fathers (a = .40) and cocaine use disorder for mothers (a = .97) whereas amphetamine use disorder (a = 1.51) had the lowest discrimination parameter in sons. The item threshold parameters of the SUD diagnoses in the fathers indicate that inhalant use disorder (b = 6.05) is located to the far right on the severity trait; that is, it is endorsed by those who have the most severe SUD. Among mothers, hallucinogen use disorder diagnosis (b = 2.41) has the highest threshold. Amphetamine use disorder (b = 2.48) has the highest threshold value in sons. Alcohol use disorder diagnosis has the lowest item threshold parameter in fathers (b = .46) and mothers (b = 1.04); that is, it is relatively easier to qualify for this disorder than for other SUD diagnoses. Cannabis use disorder (b = .96) has the lowest item threshold parameter in the sons.
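The model comparison reported in Section 2.1.2 rests on chi-square difference tests. As a hedged sketch of that arithmetic, the function below computes the upper-tail probability of a chi-square statistic for integer degrees of freedom using the standard closed-form series, so that only the Python standard library is needed. The function name `chi2_sf` is hypothetical, and this is not the software the authors used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r - 1/2) / (1*3*...*(2r - 1))
    root = math.sqrt(x)
    phi = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal density
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * phi * total


# Change in chi-square and change in df as reported in Section 2.1.2.
for label, delta_chi2, delta_df in (("fathers", 81.70, 8), ("sons", 15.01, 7), ("mothers", 13.19, 7)):
    print(label, chi2_sf(delta_chi2, delta_df))
```

Run on the reported statistics, the sketch gives a p-value far below .001 for fathers, a value between .03 and .04 for sons, and a value between .06 and .07 for mothers, in the neighborhood of the reported p = .04 and p = .06.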
An illustration of the item response functions (IRF) of alcohol use disorder and cannabis use disorder in the parents and sons is presented in Fig. 1 using the estimated item discrimination and item threshold parameters in Table 2. As can be seen, mother’s and son’s IRFs for alcohol use disorder and son’s IRF for cannabis use disorder overlapped. Father’s IRF for alcohol use disorder diagnosis has the steepest slope whereas mother’s IRF for cannabis use disorder has the highest threshold value. Because IRT transforms dichotomous items (diagnosis) into an interval scale (SUD severity), and the severity (threshold) of each diagnosis and the latent trait score of SUD severity can be placed on the same metric (disorders and subjects), the SUD severity scores for fathers, mothers and sons, and the individual SUD diagnoses of parents and sons were thus comparable on the same metric even though the scores were derived using different arrays of SUD diagnoses (Hambleton & Swaminathan, 1985, pp. 53–73; Embretson, 2005), as depicted in Fig. 1.

Table 2
Item threshold and discrimination parameters for each SUD

                 Threshold (b) parameter      Discrimination (a) parameter
Diagnosis        Father   Mother   Son        Father   Mother   Son
Alcohol          .46      1.04     1.09       6.81     1.96     2.34
Amphetamines     3.10     2.18     2.48       .56      1.23     1.51
Cannabis         .85      1.62     .96        .78      1.32     1.95
Cocaine          1.41     2.28     n/a        .72      .97      n/a
Hallucinogens    1.59     2.41     n/a        2.82     2.43     n/a
Inhalants        6.05     n/a      n/a        .97      n/a      n/a
Opioids          3.19     2.14     n/a        .40      1.29     n/a
PCP              1.09     n/a      n/a        2.66     n/a      n/a
Sedatives        3.30     2.41     n/a        .70      1.36     n/a
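To illustrate how item parameters such as those in Table 2 translate into a severity score, the sketch below computes a Bayesian expected a posteriori (EAP) estimate for each possible diagnosis pattern in the sons. It is an approximation under stated assumptions rather than a reproduction of PARSCALE: D = 1.7, a standard normal prior, and the evenly spaced grid from −4 to 4 are all choices made here for illustration.

```python
import math
from itertools import product

D = 1.7  # assumed scaling constant (not stated in the text)

# (discrimination a, threshold b) for the sons, transcribed from Table 2
SON_ITEMS = {
    "alcohol": (2.34, 1.09),
    "amphetamines": (1.51, 2.48),
    "cannabis": (1.95, 0.96),
}


def irf(theta, a, b):
    """Two-parameter logistic probability of qualifying for a diagnosis."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def eap_score(responses, items, lo=-4.0, hi=4.0, n_points=161):
    """Posterior mean of theta given a 0/1 pattern, N(0, 1) prior, grid quadrature."""
    step = (hi - lo) / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        theta = lo + k * step
        weight = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for u, (a, b) in zip(responses, items):
            p = irf(theta, a, b)
            weight *= p if u else 1.0 - p
        num += theta * weight
        den += weight
    return num / den


items = list(SON_ITEMS.values())
for pattern in product((0, 1), repeat=len(items)):
    print(dict(zip(SON_ITEMS, pattern)), round(eap_score(pattern, items), 2))
```

Under the two-parameter model the score depends on which diagnoses are present, not only on how many: each diagnosis contributes in proportion to its discrimination parameter, so a pattern containing only alcohol use disorder yields a higher estimate than one containing only amphetamine use disorder.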
Fig. 1. Item response function of alcohol and cannabis use disorder diagnoses for parents and sons. [Figure: probability of endorsement (0 to 1) plotted against the SUD severity index (−3.9 to 3.9), with separate curves for alcohol and cannabis use disorder in fathers, mothers, and sons.]
The overall statistical information functions of parents and sons are presented for the SUD severity trait in Fig. 2. The overall information function, which is inversely related to the standard error of estimate of the latent trait score of SUD severity ($SE(\hat{\theta}) = 1/\sqrt{I(\hat{\theta})}$), is plotted for each level of the SUD severity trait for parents and sons. It is derived by simply summing the information functions of diagnoses at each level of SUD severity ($\theta$). The value of the standard error of estimate of SUD severity ($SE(\hat{\theta})$) varies with the level of SUD severity. A smaller standard error (or higher information) of SUD severity can be obtained by (a) increasing the number of diagnoses in the scale, (b) having highly discriminating diagnoses, and (c) matching the subject’s SUD latent trait with the threshold (or severity). As can be seen, the overall information curve is characterized by a single marked peak in the fathers, mothers and sons. The information function in fathers has the highest peak compared to mothers and sons, because the fathers’ SUD severity scale had a higher number of diagnoses and had diagnoses with, on average, a higher discrimination value. The information functions also illustrate large measurement error in estimating SUD severity of subjects who are in the lower range on the SUD severity trait. In general, subjects in the middle to high range of the trait have more accurate estimates of severity. The most accurate estimates can be obtained when SUD severity ($\theta$) is approximately +3 for mothers, +1 for sons and +.5 for fathers. In addition, the Gaussian fit of SUD severity scores of parents and sons generated by IRT was plotted. These distributions, as shown in Fig. 3, were rescaled so that each had a normal distribution with a mean of 0 and a variance of 1. As can be seen, the mother’s and son’s IRT-based SUD severity score distributions were almost identical.
The father’s SUD severity distribution had a higher peak and covered more area under the curve than the mother’s or son’s.
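The relationship between item parameters, information, and standard error described above can be sketched as follows. For the two-parameter logistic model the information contributed by a diagnosis at trait level θ is D²a²P(θ)(1 − P(θ)), the scale information is the sum over diagnoses, and the standard error is the reciprocal of its square root. The parameters are transcribed from Table 2; D = 1.7 and the search grid are assumptions, and the helper names are hypothetical.

```python
import math

D = 1.7  # assumed scaling constant; the text does not report the value used

# (discrimination a, threshold b) pairs transcribed from Table 2
FATHER_ITEMS = [
    (6.81, 0.46), (0.56, 3.10), (0.78, 0.85), (0.72, 1.41), (2.82, 1.59),
    (0.97, 6.05), (0.40, 3.19), (2.66, 1.09), (0.70, 3.30),
]
SON_ITEMS = [(2.34, 1.09), (1.51, 2.48), (1.95, 0.96)]


def item_information(theta, a, b):
    """Information of one diagnosis at theta under the 2PL model."""
    p = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * p * (1.0 - p)


def scale_information(theta, items):
    """Overall information: the sum of the item information functions."""
    return sum(item_information(theta, a, b) for a, b in items)


def standard_error(theta, items):
    """SE(theta) = 1 / sqrt(I(theta))."""
    return 1.0 / math.sqrt(scale_information(theta, items))


def peak_location(items, lo=-4.0, hi=4.0, step=0.01):
    """Trait level on a grid where the scale information is largest."""
    n = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n + 1)]
    return max(grid, key=lambda t: scale_information(t, items))


for label, items in (("fathers", FATHER_ITEMS), ("sons", SON_ITEMS)):
    theta = peak_location(items)
    print(label, round(theta, 2), round(scale_information(theta, items), 1),
          round(standard_error(theta, items), 2))
```

Under these assumptions the fathers' information peaks close to the alcohol threshold (near +.5) and the sons' near +1, which agrees with the locations reported in the text; the fathers' peak is dominated by the highly discriminating alcohol diagnosis.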
Fig. 2. Statistical information function of SUD severity index for parents and sons. [Figure: information (0 to 40) plotted against the SUD severity index (−3.9 to 3.9), with separate curves for fathers, mothers, and sons.]
2.2. Predictive validity of SUD severity trait
2.2.1. Parental SUD severity predicts SUD severity in their sons
Multiple linear regression was performed to predict son’s SUD severity score at age 19 using the SUD severity scores of the parents when sons were 10–12 years of age. The results were significant: F(2,275) = 10.507, p < .001, adjusted. The SUD severity score of the mother predicted their son’s SUD severity score (B = .235, t(275) = 3.438, p = .001). However, the father’s SUD severity score did not predict son’s SUD severity score (B = .086, t(275) = 1.458, p = .146). Because the data obtained from
Fig. 3. Gaussian fit to IRT-based SUD severity scores of parents and sons. [Figure: frequency (0 to 250) plotted against the SUD severity index (−3 to 3), with separate curves for fathers, mothers, and sons.]
the proband men and their spouses are correlated (r = .36, p < .001), due to assortment or contagion, it is plausible that the influence of the men on their sons is mediated by the mother. A post hoc test of this hypothesis reveals that paternal SUD severity alone predicts son’s SUD severity (B = .16, t(276) = 2.975, p = .003; F(1,276) = 8.849, p = .003, adjusted R² = .03). Adding maternal SUD severity into the model significantly increased the R² by .04 (change in F(1,275) = 11.818, p = .001); however, adding maternal SUD severity into the model reduced the significant path between paternal SUD severity and son’s SUD severity to nonsignificance. Thus, according to Baron and Kenny (1986), it can be concluded that mother’s SUD severity score mediates the association between paternal SUD severity score and son’s SUD severity score. Moreover, multiple linear regression analysis was conducted to predict son’s total number of different drugs used in lifetime using the SUD severity scores of the parents. The SUD severity score of the mother (B = 1.403, t(266) = 3.547, p < .001) but not the father (B = .057, t(266) = .177, p = .860) significantly predicted son’s total number of different drugs used at age 19. A post hoc analysis reveals that the father’s SUD severity score alone predicts the son’s total number of different drugs used (B = .827, t(294) = 2.309, p = .022). Including maternal SUD severity score in the model increased the R² by .025 (change in F(1,283) = 9.032, p = .003); however, including maternal SUD severity removed the significance of the path between paternal SUD severity and son’s total number of drugs used. Multiple logistic regression analysis was also performed to determine whether SUD severity scores of the parents predicted dichotomous SUD diagnosis in their sons. The results were significant: χ²(df = 2, N = 278) = 20.530, p < .001.
The odds of a son being diagnosed with SUD more than doubled for each unit increase in the mother’s SUD severity score (B = .830, χ²(1) = 12.073, p = .001, OR = 2.293). In contrast, the father’s SUD severity score did not significantly predict SUD outcome in the son (B = .294, χ²(1) = 1.803, p = .179, OR = 1.342). Moreover, a post hoc analysis revealed that the father’s SUD severity score alone (B = .594, χ²(1) = 9.341, p = .002, OR = 1.811) predicted the son’s SUD diagnosis. Adding the mother’s SUD severity score to the model increased R² by .041 (χ²(1) = 12.066, p < .001) and rendered the path between paternal SUD severity and son’s SUD diagnosis nonsignificant.

2.2.2. Prediction of severity of drug use in 2-year follow-up study

Linear regression analyses indicated that the fathers’ SUD severity predicted their own level of drug use (frequency of drug use) 2 years later (B = 1.994, F(1,319) = 47.328, p < .001, adjusted). Similarly, the mothers’ SUD severity score predicted their own level of drug use 2 years later (B = 1.312, F(1,323) = 26.563, p < .001, adjusted R² = .073).

2.3. Son’s age of substance use onset predicts son’s SUD severity

Simple linear regression revealed that son’s SUD severity score at age 19 was associated with son’s age of onset of substance use. The association was negative (B = −.179, F(1,105) = 51.493, p < .001, adjusted).

2.4. Concurrent validity

The son’s current drug use (frequency of drug use) at age 19 was significantly associated with the son’s SUD severity score (B = 4.050, t(266) = 12.091, p < .001, F(3,266) = 67.457, p < .001, adjusted).
Furthermore, the number of drugs tried in lifetime was significantly associated with son’s SUD severity score (B = 3.764, t(287) = 11.47, p < .001; F(1,286) = 131.49, p < .001, adjusted R² = .313).

2.5. Construct validity

SUD severity score in the boys at age 19 was positively associated with severity of social maladjustment (r = .596, p < .001), treatment seeking (r_pb, p = .026), illegal behavior (B = .040, F(1,282) = 132.038, p < .001, adjusted R² = .316), and risky sex (B = 1.012, χ²(1) = 6.518, p = .011, OR = 2.750). Notably, the odds of risky sex increased 2.75-fold for each unit increase in SUD severity score.

2.6. Replication of construct and concurrent validity analyses using the son’s observed score (sum of SUD diagnoses)

Construct and concurrent validity analyses were repeated using the son’s observed score (i.e., sum of SUD diagnoses). The observed score was significantly associated with age of onset of substance use (B = .177, F(1,105) = 34.77, p < .001, adjusted R² = .236). Son’s current drug use (frequency of drug use) at age 19 covaried with the observed score (B = 2.73, t(266) = 10.51, p < .001, adjusted R² = .363). The number of drugs tried in lifetime was also significantly predicted by son’s observed score (B = 3.918, t(287) = 16.381, p < .001; F(1,286) = 110.394, p < .001, adjusted R² = .276). Furthermore, son’s observed score was associated with social maladjustment (r = .575, p < .001), treatment seeking (r_pb = .121, p = .052), illegal behavior (B = .043, F(1,282) = 108.213, p < .001, adjusted R² = .275), and risky sex (B = .662, χ²(1) = 3.839, p = .050, OR = 1.939). The odds of risky sex increased 1.9-fold for each unit increase in the sum of SUD diagnoses.

In summary, the associations between the SUD severity trait score and the outcome variables were, on average, stronger than the associations between the observed score and the outcome variables.
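The odds ratios reported with the logistic regressions above are the exponentiated coefficients, OR = exp(B). As a quick consistency check, the short sketch below recomputes each OR from the B value given in the text; the dictionary labels and the function name `odds_ratio` are illustrative names, not part of the original analysis.

```python
import math

def odds_ratio(b, units=1.0):
    """Multiplicative change in the odds for a `units` increase in the predictor."""
    return math.exp(b * units)

# (B, reported OR) pairs copied from the logistic regression results in the text.
reported = {
    "mother severity -> son SUD diagnosis": (0.830, 2.293),
    "father severity -> son SUD diagnosis (with mother)": (0.294, 1.342),
    "father severity -> son SUD diagnosis (alone)": (0.594, 1.811),
    "son severity -> risky sex": (1.012, 2.750),
    "son observed score -> risky sex": (0.662, 1.939),
}

for label, (b, or_reported) in reported.items():
    print(f"{label}: exp({b:.3f}) = {odds_ratio(b):.3f} (reported OR = {or_reported:.3f})")
```

Note that an odds ratio multiplies the odds, not the probability, of the outcome; the two are close only when the outcome is rare.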
3. Discussion

The present investigation demonstrated that the various SUDs are indicators of a unidimensional trait. The scores on this trait, reflecting variation in SUD severity, complement previous research demonstrating quantification of the common liability underlying the various SUD categories (Vanyukov, Kirisci et al., 2003; Vanyukov, Tarter et al., 2003). In addition, the results of this study extend prior research indicating that SUD symptoms are indicators of a unidimensional latent trait (Kirisci et al., 2002).

Considered from the conceptual perspective, the findings point to the need to advance taxonomic accuracy by enhanced specification of the SUD clinical phenotype. At the descriptive level, a continuous scale provides more information than the mere presence or absence of a diagnosis. Moreover, the score on the SUD scale accounts for all co-occurring SUDs. Furthermore, using IRT, the salience of each SUD diagnosis is accounted for in the overall severity score. Accordingly, an IRT-derived SUD severity scale provides a more accurate and complete clinical picture of SUD severity than is possible by tabulating the number of disorders.

An observed score (e.g., the unweighted sum of diagnoses) has considerable shortcomings compared to the IRT-derived SUD severity index. For instance, an observed score assumes that each item (diagnosis) represents an equal
level of severity of SUD. Some diagnoses tend to be endorsed by subjects with greater SUD severity, whereas other diagnoses (e.g., alcohol) are endorsed by individuals with both low and high levels of SUD severity. An observed score also assumes that each diagnosis is equally related to overall severity of SUD. The comparison of observed scores between different populations further assumes that the individual diagnoses have identical item properties in these populations. This assumption may be invalid; for example, an alcohol diagnosis may discriminate between normal subjects with lower and higher severity of substance use involvement but may not discriminate among subjects who qualify for SUD. Finally, the IRT-derived index takes into account item properties, namely varying item discrimination and item threshold parameters. Merely summing the number of SUD diagnoses to document severity could thus readily lead to biased results because the different categories have unequal weight as indicators of overall severity of SUD. For instance, alcohol disorder and cannabis disorder have much less serious implications as indicators of the severity of SUD than a crack/cocaine diagnosis. Using IRT to estimate the latent trait of SUD severity resolves the issue of differential item weight.

Furthermore, observed scores reflect an ordinal scale that is not normally distributed, whereas the IRT-based severity of SUD has a normal distribution and is measured on an interval scale. These properties satisfy two of the most important assumptions of parametric statistical procedures (Harwell & Gatti, 2001).

Accordingly, an important heuristic value of the results reported herein pertains to standardizing the SUD phenotype on a continuous scale. This would be readily feasible using currently available epidemiological data sets. Indexing SUD severity affords investigators the opportunity to compare (or match) their samples on a common metric and to document treatment impacts precisely.
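The contrast between an unweighted sum and an IRT-derived score can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) model with a standard normal prior and expected a posteriori (EAP) scoring; the three items and their discrimination and threshold values are hypothetical and are not the parameters estimated in this study.

```python
import numpy as np

def p_endorse(theta, a, b):
    """2PL item response function: P(diagnosis present | theta) for each item."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def eap_score(pattern, a, b, grid=np.linspace(-4.0, 4.0, 161)):
    """Expected a posteriori severity estimate under a standard normal prior."""
    p = p_endorse(grid, a, b)                                  # shape: (grid points, items)
    likelihood = np.prod(np.where(pattern == 1, p, 1.0 - p), axis=1)
    posterior = likelihood * np.exp(-0.5 * grid**2)            # unnormalized posterior
    return float(np.sum(grid * posterior) / np.sum(posterior))

# Hypothetical item parameters (assumption): alcohol, cannabis, cocaine diagnoses.
a = np.array([1.0, 1.5, 2.0])     # discrimination
b = np.array([-0.5, 0.5, 1.5])    # threshold (severity at which P = .5)

alcohol_only = np.array([1, 0, 0])
cocaine_only = np.array([0, 0, 1])

for name, pattern in [("alcohol only", alcohol_only), ("cocaine only", cocaine_only)]:
    print(f"{name}: sum score = {pattern.sum()}, EAP severity = {eap_score(pattern, a, b):.2f}")
```

Both response patterns have the same observed score of 1, yet under these assumed parameters the pattern containing the more discriminating item receives the higher severity estimate. This is the sense in which a model-based score weights diagnoses unequally where a simple count cannot.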
In the current era of accountability, quantifying SUD severity will also enable clinicians to ascertain more accurately the status of the individual client during and after treatment.

Accommodating the complete diagnostic configuration of SUDs is also heuristic for addressing problems impacted by SUD. For example, it potentially affords the opportunity to derive severity thresholds that accurately inform about the likelihood of subsequently developing health, social, legal, or vocational problems. Notably, this investigation demonstrated that each unit increase in SUD severity amplified the odds of risky sex by a factor of 2.75. Numerous other practical applications of an SUD scale can be readily identified, such as determining optimum intensity and modality of treatment in relation to SUD severity, prioritizing prevention resources for children who are at highest risk for severe SUD, formulating child custody decisions in cases of divorce or maltreatment in consideration of parental SUD severity, and predicting criminal recidivism and treatment relapse.

The advantages of quantifying SUD severity notwithstanding, several limitations of the present investigation preclude definitive conclusions. Because a random sampling strategy was not employed, the possibility of sampling bias cannot be ruled out. In addition, all of the probands were men. Hence, it is conceivable that the pattern and rate of SUD observed in the women are, in part, due to the influence of mating assortment (Vanyukov, Neale, Moss, & Tarter, 1996). Moreover, the sample size was modest. Lastly, nicotine dependence was not considered among the SUD categories. Analyses replicating and extending the results of this investigation thus need to be conducted using epidemiological samples.

While the results generally conformed to expectation, there was one interesting exception.
Specifically, maternal but not paternal SUD severity predicted offspring’s SUD severity. Because the data obtained from the proband men and their spouses are correlated (r = .36, p < .001), due to phenotypic assortment or contagion, a post hoc mediation test was conducted. It revealed that mother’s SUD severity score mediates the association between paternal SUD severity score and son’s SUD severity score. These findings suggest
that the liability conferred to the son by the father’s SUD is effected, in part, by the mother’s SUD severity. Thus, whereas paternal SUD amplifies the risk of SUD in male offspring 4- to 7-fold (Vanyukov & Tarter, 2000), the variation in overall severity of the offspring’s disorder may be due to additional factors, particularly maternal SUD severity. As the primary caretaker, the mother may, therefore, be the child’s conduit to SUD outcome.

In summary, this investigation demonstrates that the various specific SUD diagnoses are indicators of a unidimensional latent trait. The spectrum of scores (clinical phenotypes) potentially captures the full range of SUD severity variation. As shown herein, SUD severity scale scores covary with social maladjustment, risky sex, treatment seeking, and propensity for illegal behavior. Moreover, severity of parental SUD predicts severity of offspring’s SUD. These findings illustrate the utility of measuring SUD severity in research directed at elucidating etiology as well as in research aimed at quantifying treatment efficacy.
Acknowledgement

This work was supported by NIDA grants P50 DA005605, K02 DA017822, K02 DA018701, and R01 DA019157.
References

Andrew, J. (1974). Violent crime indices among community retained delinquents. Criminal Justice and Behavior, 1, 123–130.
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Embretson, S. E. (2005). The continued search for nonarbitrary metrics in psychology. American Psychologist, 61, 50–55.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff Publishing.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park: Sage.
Harwell, M. R., & Gatti, G. G. (2001). Rescaling ordinal data to interval data in educational research. Review of Educational Research, 71, 105–131.
Hollingshead, A. (1975). Four-index of social status. Hillsdale, NJ: Department of Sociology, Yale University.
Kendler, K. S., Jacobson, K. C., Prescott, C. A., & Neale, M. C. (2003). Specificity of genetic and environmental risk factors for use and abuse/dependence of cannabis, cocaine, hallucinogens, sedatives, stimulants, and opiates in male twins. American Journal of Psychiatry, 160, 687–695.
Kirisci, L., Vanyukov, M., Dunn, M., & Tarter, R. (2002). Item response theory modeling of substance use: An index based on 10 drug categories. Psychology of Addictive Behaviors, 16, 290–298.
Leckman, J., Sholomaskas, D., & Thompson, W. (1982). Best estimate of lifetime psychiatric diagnosis: A methodological study. Archives of General Psychiatry, 39, 879–883.
Lord, F. M. (1980). Application of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.
Merikangas, K., Stolar, M., Stevens, D., Goulet, J., Preisig, M., Fenton, B., et al. (1998). Familial transmission of substance use disorders. Archives of General Psychiatry, 55, 973–979.
Muraki, E., & Bock, R. D. (1997). PARSCALE: IRT item response analysis and test scoring for rating-scale data. Chicago: Scientific Software International.
Muthen, B. O., & Muthen, L. K. (2001). Mplus user’s guide. Los Angeles, CA: Muthen & Muthen.
Spitzer, R., Williams, B., Gibbons, M., & First, M. (1990). User’s guide for structured clinical interview for DSM-III-R. New York, NY: New York State Psychiatric Institute.
Stout, W., Habing, B., Douglas, J., Kim, H. R., Roussos, L., & Zhang, J. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331–354.
Tarter, R. (1990). Evaluation and treatment of adolescent substance abuse: A decision tree method. American Journal of Drug and Alcohol Abuse, 16, 1–46.
Tarter, R., & Vanyukov, M. (2001). Introduction: Theoretical and operational framework for research into the etiology of substance use disorders. Journal of Child and Adolescent Substance Abuse, 10, 1–12.
Tsuang, M., Bar, J., Harley, R., & Lyons, M. (2001). The Harvard twin study of substance abuse: What have we learned. Harvard Review of Psychiatry, 9, 267–279.
Tsuang, M. T., Lyons, M. J., Meyer, J. M., Doyle, T., Eisen, S. A., Goldberg, J., et al. (1998). Co-occurrence of abuse of different drugs in men: The role of drug-specific and shared vulnerabilities. Archives of General Psychiatry, 55, 967–972.
van der Linden, W. J., & Hambleton, R. K. (1997). Handbook of modern item response theory. New York: Springer.
Vanyukov, M. M., Kirisci, L., Tarter, R. E., Simkevitz, H. F., Kirillova, G. P., & Maher, B. S. (2003). Liability to substance use disorders: 2. A measurement approach. Neuroscience and Biobehavioral Reviews, 27, 507–515.
Vanyukov, M. M., Neale, M. C., Moss, H. B., & Tarter, R. E. (1996). Mating assortment and the liability to substance abuse. Drug and Alcohol Dependence, 42, 1–10.
Vanyukov, M., & Tarter, R. (2000). Genetic studies of substance abuse. Drug and Alcohol Dependence, 59, 101–123.
Vanyukov, M. M., Tarter, R. E., Kirisci, L., Kirillova, G. P., Maher, B. S., & Clark, D. B. (2003). Liability to substance use disorders: 1. Common mechanisms and manifestations. Neuroscience and Biobehavioral Reviews, 27, 517–526.
Yates, R., Cadoret, R., Troughton, E., & Stewart, M. (1996). An adoption study of DSM-III-R alcohol and drug dependence severity. Drug and Alcohol Dependence, 41, 9–15.