Psychometric Properties of the Strengths and Difficulties Questionnaire


ROBERT GOODMAN, PH.D.

ABSTRACT Objective: To describe the psychometric properties of the Strengths and Difficulties Questionnaire (SDQ), a brief measure of the prosocial behavior and psychopathology of 3–16-year-olds that can be completed by parents, teachers, or youths. Method: A nationwide epidemiological sample of 10,438 British 5–15-year-olds obtained SDQs from 96% of parents, 70% of teachers, and 91% of 11–15-year-olds. Blind to the SDQ findings, all subjects were also assigned DSM-IV diagnoses based on a clinical review of detailed interview measures. Results: The predicted five-factor structure (emotional, conduct, hyperactivity-inattention, peer, prosocial) was confirmed. Internalizing and externalizing scales were relatively “uncontaminated” by one another. Reliability was generally satisfactory, whether judged by internal consistency (mean Cronbach α: .73), cross-informant correlation (mean: 0.34), or retest stability after 4 to 6 months (mean: 0.62). SDQ scores above the 90th percentile predicted a substantially raised probability of independently diagnosed psychiatric disorders (mean odds ratio: 15.7 for parent scales, 15.2 for teacher scales, 6.2 for youth scales). Conclusion: The reliability and validity of the SDQ make it a useful brief measure of the adjustment and psychopathology of children and adolescents. J. Am. Acad. Child Adolesc. Psychiatry, 2001, 40(11):1337–1345. Key Words: questionnaire, psychopathology, reliability, validity, factor structure.

Accepted June 5, 2001. From the Department of Child and Adolescent Psychiatry, Institute of Psychiatry, King’s College London. This study was funded by the U.K. Department of Health. Correspondence to Dr. Goodman, Department of Child and Adolescent Psychiatry, Institute of Psychiatry, De Crespigny Park, London SE5 8AF, England; e-mail: [email protected]. 0890-8567/01/4011-1337 © 2001 by the American Academy of Child and Adolescent Psychiatry.

The Strengths and Difficulties Questionnaire (SDQ) is a one-page questionnaire for assessing the psychological adjustment of children and youths (www.sdqinfo.com). Identical or nearly identical versions can be completed by the parents or teachers of 3–16-year-olds and by 11–16-year-olds themselves (Goodman, 1997; Goodman et al., 1998). The SDQ can be used for screening, as part of a clinical assessment, as a treatment-outcome measure, and as a research tool (Garralda et al., 2000; Goodman et al., 2000b,c). Copies of the SDQ in more than 40 languages can be downloaded from the Internet and copied without charge for noncommercial purposes.

The SDQ asks about 25 attributes, some positive and others negative; respondents use a 3-point Likert scale to indicate how far each attribute applies to the target child. The 25 items are divided between five scales of five items each, generating scores for emotional symptoms, conduct problems, hyperactivity-inattention, peer problems, and prosocial behavior; all but the last are summed to generate a total difficulties score. The selection of SDQ items and their grouping into scales were based on current nosological concepts as well as on previous factor analyses. For example, the five items comprising the SDQ’s Hyperactivity-Inattention scale were deliberately selected to tap inattention (two items), hyperactivity (two items), and impulsiveness (one item) because these are the three key symptom domains for a DSM-IV diagnosis of attention-deficit/hyperactivity disorder (ADHD) (American Psychiatric Association, 1994).

Extended versions of the SDQ include the 25 core items plus a brief impact supplement that asks whether the respondent thinks that the child or youth has a problem, and if so, inquires further about overall distress, social impairment, burden, and chronicity. For clinicians and researchers with an interest in psychiatric caseness and the determinants of service use, the impact supplement provides useful additional information without taking up much more of respondents’ time (Goodman, 1999).

For many purposes, the SDQ functions at least as well as the longer-established Achenbach (1991a–c) and Rutter questionnaires (Elander and Rutter, 1995), correlating


highly with them (Goodman and Scott, 1999; Klasen et al., 2000; Koskelainen et al., 2000). Cross-informant correlations are higher than those usually reported for questionnaires (Goodman et al., 1998), perhaps reflecting the identity or near identity of the questionnaires administered to different informants. One study showed that correlations between internalizing and externalizing scales were lower for the SDQ than for the Child Behavior Checklist (CBCL) (Achenbach, 1991a); this finding raises the possibility that the SDQ scales were “purer” and less “contaminated” by one another (Goodman and Scott, 1999). The same study showed that as judged against an interview with parents, the SDQ was significantly better than the CBCL at detecting inattention and hyperactivity and at least as good at detecting internalizing and externalizing problems (Goodman and Scott, 1999). Another study showed that despite its brevity, the SDQ was significantly better than the CBCL at predicting a clinical diagnosis of a hyperactivity disorder (Klasen et al., 2000).

Although these existing studies suggest that the SDQ has satisfactory psychometric properties, all of these studies are limited by potentially unrepresentative samples, relatively small sample sizes, or the absence of independent psychiatric diagnoses as validating criteria. To circumvent these limitations, the current study used results from a nationwide British study of child and adolescent mental health to examine the psychometric properties of the SDQ in a large and representative community sample of children and youths, all of whom were independently assigned psychiatric diagnoses. The analyses examined the postulated five-factor structure, cross-scale correlations, cross-informant correlations, internal reliability, 6-month stability, and validity as judged against independent psychiatric diagnoses.
METHOD

Cross-Sectional Sample

In 1999 the Office for National Statistics carried out a survey of the mental health of British 5–15-year-olds. The total sample of 10,438 children was recruited through child benefit records; child benefits are available without means testing and are claimed on behalf of approximately 98% of British children. Details of ascertainment and representativeness have been presented elsewhere (Meltzer et al., 2000). Parents provided questionnaire and interview information on 99% of the sample (with the remaining 1% largely being composed of parents who could not speak English well). Ninety-seven percent of families gave permission to send teachers a postal questionnaire. A partly or fully completed questionnaire was returned by 80% of teachers; nonresponse was more likely when schools were in economically deprived neighborhoods. Questionnaires and interviews were completed by 96% of the eligible 11–15-year-olds.

For the purpose of the analyses in this report, SDQs were included only if no items were missing. Although it is possible to prorate SDQ scores when a small number of items have been omitted, these incomplete SDQs could not be used for the factor analyses. After incomplete SDQs were eliminated, valid SDQs had been completed by 9,998 parents (96%), by 7,313 teachers (70%), and by 3,983 11–15-year-olds (91%).

Follow-up Sample

Follow-up questionnaires were sent to all parents and 11–15-year-olds who had participated in the first wave of data collection (January 4 to February 14, 1999) and who had agreed to be contacted again. Follow-up was restricted to the first wave because of budgetary constraints and because this permitted a 4- to 6-month follow-up before the end of the school year in July 1999. The latter consideration allowed a random sample of teachers to be contacted again for a follow-up SDQ on a current pupil. Of the 2,618 follow-up questionnaires sent to parents who had completed valid SDQs initially, 2,091 were returned (80%). The corresponding response rates were 91% (796/876) for teachers and 77% (781/1014) for 11–15-year-olds.

Questionnaires

Extended versions of the SDQ were administered to parents, teachers, and youths and subsequently were scored in the standard manner (Goodman, 1997, 1999; Goodman et al., 1998). Further information on the SDQ and sample questionnaires in more than 40 languages are available from the Internet (www.sdqinfo.com). This study used questionnaires in British English, but slightly modified National Institute of Mental Health versions are also available in American English.
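The scoring scheme described in the introduction (five scales of five items, with the four problem scales summed into a total difficulties score) can be sketched in code. This is a minimal illustration and not the official scoring algorithm: the item labels are the short names used in Table 1, while the 0/1/2 response coding and the reverse-keying of the five positively worded problem items are assumptions made for the sketch. In line with the complete-case rule used for these analyses, a questionnaire with any missing item is left unscored rather than prorated.

```python
from typing import Dict, Optional

# Item labels follow the short names in Table 1. The 0/1/2 coding and the
# REVERSED set are assumptions for illustration, not taken from the paper.
SCALES = {
    "emotional": ["Fears", "Worries", "Clingy", "Unhappy", "Somatic"],
    "conduct": ["Lies", "Fights", "Tempers", "Steals", "Obedient"],
    "hyperactivity": ["Distractible", "Persistent", "Restless", "Fidgety", "Reflective"],
    "peer": ["Good friend", "Popular", "Best with adults", "Solitary", "Bullied"],
    "prosocial": ["Helps out", "Caring", "Considerate", "Kind to kids", "Shares"],
}
REVERSED = {"Obedient", "Persistent", "Reflective", "Good friend", "Popular"}
DIFFICULTY_SCALES = ("emotional", "conduct", "hyperactivity", "peer")


def score_sdq(responses: Dict[str, Optional[int]]) -> Optional[Dict[str, int]]:
    """Return the five scale scores plus total difficulties, or None if incomplete."""
    items = [item for names in SCALES.values() for item in names]
    # Complete-case rule: any missing item means the questionnaire is not scored.
    if any(responses.get(item) is None for item in items):
        return None
    if any(responses[item] not in (0, 1, 2) for item in items):
        raise ValueError("each response must be 0, 1, or 2")
    scores: Dict[str, int] = {}
    for scale, names in SCALES.items():
        scores[scale] = sum(
            2 - responses[item] if item in REVERSED else responses[item]
            for item in names
        )
    # Prosocial behavior is deliberately excluded from the total.
    scores["total_difficulties"] = sum(scores[s] for s in DIFFICULTY_SCALES)
    return scores


# Hypothetical respondent who answers 0 to every item.
example = {item: 0 for names in SCALES.values() for item in names}
print(score_sdq(example)["total_difficulties"])  # 10 (five reversed items score 2 each)
```

Under these assumptions each scale ranges from 0 to 10 and the total difficulties score from 0 to 40.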
Interview Measures and Psychiatric Diagnosis

All subjects in the initial cross-sectional survey were assigned DSM-IV psychiatric diagnoses on the basis of the Development and Well-Being Assessment (DAWBA) (Goodman et al., 2000a), an integrated package of questionnaires, interviews, and rating techniques designed to generate psychiatric diagnoses on 5–16-year-olds. Nonclinical interviewers administered a structured interview to parents and older children, supplementing the structured questions with open-ended questions to get respondents to describe the problems in their own words. Experienced clinical raters assigned DSM-IV diagnoses (American Psychiatric Association, 1994) after reviewing the interview records and teacher questionnaires. DAWBA diagnoses were generated blind to the SDQ scores. In the validation study of the DAWBA (Goodman et al., 2000a), there was excellent discrimination between community and clinic samples in rates of diagnosed disorder. Within the community sample, subjects with and without diagnosed disorders differed markedly in external characteristics and prognosis. In the clinic sample, there was substantial agreement between DAWBA and case note diagnoses. In the present study, 500 randomly chosen subjects were independently assigned DAWBA diagnoses by two clinical raters, generating κ coefficients for interrater reliability of 0.86 for any DSM-IV diagnosis, 0.98 for all oppositional or conduct disorders combined, 0.86 for ADHD, and 0.57 for all emotional disorders combined.

Statistical Analyses

All analyses were performed with SPSS version 10.0 (SPSS Inc., 1999). Although the complex nature of the original sampling procedure made it essential to use appropriate weighting when calculating prevalence rates, there were only minimal design effects for other sorts of analyses (Meltzer et al., 2000). Consequently, as in previous analyses of this data set (Fombonne et al., 2001), statistical analyses were conducted on the unweighted data. Factor analyses were carried out to examine the factor structure for each class of informant separately. Reliability was judged from internal consistency, interrater agreement, and temporal stability. Validity was assessed from the degree of association between high questionnaire scores and independently diagnosed DSM-IV disorders.
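Internal consistency is reported in this study as Cronbach α. As a reminder of what that coefficient measures, the sketch below applies the usual textbook formula, α = k/(k − 1) × (1 − Σ item variances / variance of the summed score), to a small made-up response matrix. The function name and the data are illustrative assumptions; the study itself used SPSS, and nothing here reproduces its computations.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of complete responses."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from four respondents to a two-item scale.
toy = [[0, 1], [1, 0], [2, 2], [1, 1]]
print(round(cronbach_alpha(toy), 3))  # 0.667
```

With only five items per SDQ scale, α is constrained by scale length as well as by how strongly the items correlate, which is worth bearing in mind when comparing the five-item scales with the 20-item total difficulties score.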

RESULTS

Factor Analyses

Table 1 shows the rotated five-factor solutions for parent, teacher, and self-report SDQs. A five-factor solution was chosen in each case because this was the predicted number of factors on theoretical grounds. As it happens, use of the “eigenvalue greater than 1.00” rule would have generated similar solutions: there were only five factors with eigenvalues of greater than 1.00 for the teacher and self-report SDQs; and although there were six such factors for the parent SDQ, the sixth factor had an eigenvalue of only 1.02. For all three categories of rater (parent, teacher, and youth), all 25 items loaded on the predicted factors, with a few items also loading on additional factors. The loadings on the predicted factors were higher than the loadings on the additional factors for all 25 parent items and for 24 of the 25 teacher and self-report items. For both teacher and self-report SDQs, the “prosocial” factor could potentially have been labeled as a “positive” factor: the highest loadings (0.55–0.80) were for the five prosocial items, but there were also substantial loadings (0.34–0.52) for four other positively worded items covering reflectiveness, persistence, popularity, and obedience. The same occurred to a lesser extent for the parent SDQ, with obedience having a subsidiary loading (0.30) on the same factor as the five prosocial items.

Cross-Scale Correlations

Table 2 presents the correlations between the Emotional, Conduct, and Hyperactivity-Inattention scales for each type of informant separately. Averaging across all three types of informants, the correlations were 0.28 for Emotional-Conduct, 0.27 for Emotional-Hyperactivity, and 0.55 for Conduct-Hyperactivity, i.e., the internalizing-externalizing correlations were half the magnitude of the externalizing-externalizing correlations.

Interrater Correlations

The correlations between parent, teacher, and youth SDQ scores are shown in Table 3. To provide a benchmark for evaluating these correlations, Table 3 also presents the mean cross-informant correlations for other measures, based on the meta-analysis conducted by Achenbach et al. (1987). This meta-analysis calculated the mean interrater correlation for 41 samples that reported parent–teacher correlations, 14 samples that reported parent–self correlations, and 21 samples that reported teacher–self correlations. Because the meta-analytic means were based on Pearson product moment correlations, the SDQ correlations were also calculated as Pearson product moment correlations (with Spearman nonparametric correlation coefficients shown in parentheses). Nearly all of the SDQ correlations were above the meta-analytic mean, often substantially so.

Internal Consistency

Table 4 presents the Cronbach α coefficients for the different SDQ scores and informants. These were generally satisfactory (mean 0.73), particularly for the total difficulties and total impact scores (all 0.80 or greater). The internal consistency of the self-report peer problems score was notably low (0.41).

Retest Stability

The SDQ was readministered to some parents, teachers, and youths after an interval of 4 to 6 months. This cannot be thought of as a measure of test-retest reliability because the interval is too great, such that changes in the scores with time may reflect genuine changes in the children’s psychological state as well as test-retest unreliability. Nevertheless, the mean retest stability of 0.62 after 4 to 6 months does provide a lower bound for test-retest reliability: the true test-retest reliability is not plausibly lower than this, and it is likely to be substantially higher. Teacher ratings were the most stable (mean correlation = 0.73) and youth ratings the least stable (mean correlation = 0.51) (Table 5). Stability was greatest for the total difficulties and hyperactivity-inattention scores. The stability of self-rated impact was particularly low (0.21).

Agreement With Independent Psychiatric Diagnosis

Table 6 presents data on the prevalence of relevant DSM-IV diagnoses after the sample was split into low-risk and high-risk subjects according to each of the SDQ scores. For each score, the extreme 10% of the population were compared with the remaining 90%, in line with the previous designation of the most extreme 10% as an “abnormal” band (Goodman, 1997; Goodman et al., 1998). Because the scores are discrete, it was not possible to


TABLE 1
SDQ Factor Analyses

(a) Parent SDQ: Community Sample of 9,998 Children, Aged 5–15

Factor (Total Variance Explained)   Item               Loading on This Factor
Factor 1 “Hyper” (11.0%)            Distractible        0.77
                                    Persistent         –0.72
                                    Restless            0.66
                                    Fidgety             0.65
                                    Reflective         –0.64
Factor 2 “Emotion” (9.4%)           Fears               0.71
                                    Worries             0.69
                                    Clingy              0.66
                                    Unhappy             0.60
                                    Somatic             0.47
Factor 3 “Prosocial” (9.4%)         Helps out           0.68
                                    Caring              0.67
                                    Considerate         0.58
                                    Kind to kids        0.56
                                    Shares              0.53
Factor 4 “Conduct” (8.6%)           Lies                0.64
                                    Fights              0.61
                                    Tempers             0.54
                                    Steals              0.52
                                    Obedient           –0.43
Factor 5 “Peer” (7.5%)              Good friend        –0.64
                                    Popular            –0.61
                                    Best with adults    0.57
                                    Solitary            0.56
                                    Bullied             0.47

Secondary loadings on other factors (item and factor not identifiable in this copy): –0.30, –0.30, 0.38, 0.32

(b) Teacher SDQ: Community Sample of 7,313 Children, Aged 5–15

Factor (Total Variance Explained)   Item               Loading on This Factor
Factor 1 “Prosocial” (14.7%)        Caring              0.80
                                    Helps out           0.76
                                    Kind to kids        0.75
                                    Considerate         0.68
                                    Shares              0.63
Factor 2 “Hyper” (13.9%)            Distractible        0.81
                                    Fidgety             0.80
                                    Restless            0.79
                                    Persistent         –0.72
                                    Reflective         –0.61
Factor 3 “Emotion” (11.2%)          Fears               0.80
                                    Worries             0.77
                                    Clingy              0.75
                                    Unhappy             0.64
                                    Somatic             0.53
Factor 4 “Conduct” (10.2%)          Lies                0.71
                                    Fights              0.68
                                    Tempers             0.63
                                    Steals              0.60
                                    Obedient           –0.37
Factor 5 “Peer” (8.2%)              Best with adults    0.74
                                    Solitary            0.69
                                    Good friend        –0.65
                                    Popular            –0.48
                                    Bullied             0.45

Secondary loadings on other factors (item and factor not identifiable in this copy): 0.36, 0.45, –0.30, –0.31, 0.31, 0.40, –0.41, 0.47, 0.33

(c) Self-Report SDQ: Community Sample of 3,983 Children, Aged 11–15 Years

Factor (Total Variance Explained)   Item               Loading on This Factor
Factor 1 “Prosocial” (10.6%)        Caring              0.66
                                    Helps out           0.65
                                    Kind to kids        0.60
                                    Shares              0.60
                                    Considerate         0.55
Factor 2 “Emotion” (9.2%)           Worries             0.74
                                    Fears               0.72
                                    Clingy              0.64
                                    Unhappy             0.51
                                    Somatic             0.41
Factor 3 “Conduct” (8.5%)           Lies                0.63
                                    Fights              0.58
                                    Steals              0.57
                                    Tempers             0.52
                                    Obedient           –0.36
Factor 4 “Hyper” (8.3%)             Restless            0.77
                                    Fidgety             0.73
                                    Distractible        0.52
                                    Persistent         –0.45
                                    Reflective         –0.42
Factor 5 “Peer” (5.9%)              Best with adults    0.62
                                    Solitary            0.54
                                    Bullied             0.48
                                    Good friend        –0.46
                                    Popular            –0.33

Secondary loadings on other factors (item and factor not identifiable in this copy): –0.33, 0.31, 0.31, 0.35, 0.39, 0.40, 0.34, 0.32, 0.52

Note: Rotated (varimax) five-factor solution; loadings between ±0.3 omitted, loadings greater than ±0.4 in boldface type. SDQ = Strengths and Difficulties Questionnaire.
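The “eigenvalue greater than 1.00” rule mentioned in the factor analysis results can be stated in a few lines. The sketch below is illustrative only: the list of eigenvalues is hypothetical apart from the value 1.02, which the text reports for the sixth parent-SDQ factor, and the conversion of an eigenvalue to a percentage of variance assumes an unrotated solution extracted from the correlation matrix of the 25 items, which the source does not specify.

```python
from typing import List, Sequence


def retained_by_eigenvalue_rule(eigenvalues: Sequence[float], threshold: float = 1.0) -> List[float]:
    """Return the eigenvalues strictly above the threshold, largest first."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev > threshold]


def percent_variance(eigenvalue: float, n_items: int) -> float:
    """Share of total variance for one unrotated factor of a correlation matrix."""
    return 100.0 * eigenvalue / n_items


# Hypothetical leading eigenvalues; only 1.02 comes from the text.
example_eigenvalues = [4.1, 2.6, 2.0, 1.6, 1.3, 1.02, 0.95, 0.90]
kept = retained_by_eigenvalue_rule(example_eigenvalues)
print(len(kept))                             # 6
print(round(percent_variance(1.02, 25), 2))  # 4.08
```

A factor that clears the threshold by as little as 1.02 accounts for barely more variance than a single standardized item, which is consistent with the decision to keep the theoretically predicted five-factor solution for the parent SDQ.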

divide the sample into exactly 10% and 90%, e.g., sometimes it was 8% and 92%, or 11% and 89%. For nearly all scores, the high-risk group comprised the 10% (approximately) of subjects with the highest scores. The only exception was the Prosocial scale, where the high-risk group comprised the 10% (approximately) of subjects with the lowest scores. Tables of the frequency distribution of SDQ scores are not presented here but are available for inspection at www.sdqinfo.com.

The DSM-IV diagnoses that contributed to the prevalence estimates shown in Table 6 varied according to the SDQ score being considered. Inasmuch as total symptom and impact scores cover a wide gamut of psychopathology, it was relevant to consider the presence or absence of any DSM-IV diagnosis when comparing the low-risk and high-risk groups. Because peer problems and lack of prosocial behavior may accompany many different disorders, all diagnoses were also considered relevant when the peer and prosocial scores were assessed. For the emotional symptom score, by contrast, the relevant diagnoses were any depressive, phobic, or anxiety diagnosis, including obsessive-compulsive disorder (ANX or DEP in Table 6). For the conduct problems score, the relevant diagnoses were oppositional defiant disorder, conduct disorder, or other disruptive behavioral disorders (ODD or CD in Table 6). For the hyperactivity-inattention score, the relevant diagnosis was ADHD. Comparing SDQ scores with conceptually similar diagnoses (e.g., the SDQ emotional symptoms score with the presence or absence of emotional disorders) consistently resulted in higher odds ratios than comparing SDQ scores with conceptually different disorders (e.g., the SDQ emotional symptoms score with the presence or absence of oppositional-conduct disorders).

Several generalizations can be drawn from the pattern of results in Table 6. First, all scales are associated with the relevant DSM-IV diagnoses, in the sense that there were significant differences in prevalence between low-risk and high-risk groups (p < .002 in all comparisons). Second, specificity and negative predictive value were high, while sensitivity and positive predictive value were lower. Third, the association with psychiatric disorder was weakest for the Prosocial scale, supporting the original decision not to include this scale in the Total Difficulties scale (Goodman, 1997). Fourth, youth SDQ scores were generally less strongly associated with disorder than were the corresponding parent or teacher scores. Averaging across all scales, the odds ratio for having a psychiatric disorder in high- rather than low-risk groups was 15.7 for parent SDQ scales, 15.2 for teacher SDQ scales, and 6.2 for youth SDQ scales. The pattern was different, however, for the Emotional scale: the odds ratios were similar for youth and parent report, and lower for teacher report. Finally, when we examined whether the odds ratios for parents and teachers overlapped, we found that similarities were more striking than differences. The SDQ Emotional and Impact scales were slightly more strongly associated with disorder when parent rather than teacher reports were used, while the reverse was true for the SDQ Conduct and Prosocial scales.

TABLE 2
Cross-Scale Correlations for SDQ Scores in a Community Sample of 5–15-Year-Olds

Pearson Cross-Scale Correlations
Informant    N        Emotion-Conduct    Emotion-Hyper/Inatt    Conduct-Hyper/Inatt
Parent       9,998    0.30               0.26                   0.50
Teacher      7,313    0.21               0.24                   0.61
Youth        3,983    0.33               0.31                   0.53

Note: All correlations significant at p < .001. SDQ = Strengths and Difficulties Questionnaire.
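The screening statistics in Table 6 all derive from a 2 × 2 table formed by crossing risk group (low or high SDQ score) with the presence or absence of the relevant diagnosis. The following sketch shows one way to obtain them from the counts printed in the table. It is an illustration under stated assumptions: the function name is hypothetical, and the confidence interval uses the standard log-odds (Woolf) approximation, which the source does not say was the method used.

```python
import math
from typing import Dict


def screening_stats(low_cases: int, low_n: int, high_cases: int, high_n: int) -> Dict[str, float]:
    """Summarise a 2x2 screening table built from low-risk and high-risk groups."""
    tp = high_cases             # high score, diagnosis present
    fp = high_n - high_cases    # high score, diagnosis absent
    fn = low_cases              # low score, diagnosis present
    tn = low_n - low_cases      # low score, diagnosis absent
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive for this sketch")
    odds_ratio = (tp * tn) / (fp * fn)
    # Assumed method: normal approximation on the log odds ratio.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return {
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
        "ci_high": math.exp(math.log(odds_ratio) + 1.96 * se_log_or),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Parent Emotional scale row of Table 6: 192/8,882 low-risk, 229/1,116 high-risk.
stats = screening_stats(low_cases=192, low_n=8882, high_cases=229, high_n=1116)
print(round(stats["odds_ratio"], 1))  # 11.7
```

For this row the sketch gives an interval of about 9.5 to 14.3 and, after rounding, specificity 91%, sensitivity 54%, negative predictive value 98%, and positive predictive value 21%, matching the printed entries.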

TABLE 3
Interrater Correlations for SDQ Scores in a Community Sample of 5–15-Year-Olds

Pearson (Spearman) Interrater Correlations
SDQ Scale                     Parent × Teacher    Parent × Youth    Teacher × Youth
                              (N = 7,313)         (N = 3,983)       (N = 2,767)
Total Difficulties            0.46 (.42)          0.48 (.46)        0.33 (.29)
Emotional Symptoms            0.27 (.24)          0.37 (.35)        0.21 (.19)
Conduct Problems              0.37 (.30)          0.44 (.41)        0.30 (.25)
Hyperactivity-Inattention     0.48 (.46)          0.41 (.40)        0.32 (.30)
Peer Problems                 0.37 (.28)          0.40 (.32)        0.29 (.21)
Prosocial Behavior            0.25 (.24)          0.30 (.30)        0.23 (.22)
Impact                        0.37 (.32)          0.30 (.25)        0.23 (.24)
Pearson meta-analytic mean
for other measures            0.27                0.25              0.20

Note: All SDQ correlations significant at p < .001. Pearson correlations in boldface type are higher than the meta-analytic mean reported by Achenbach et al. (1987). SDQ = Strengths and Difficulties Questionnaire.

TABLE 4
Reliability Coefficients for SDQ Scores in a Community Sample of 5–15-Year-Olds

Reliability Correlations (α)
SDQ Scale                     Parent           Teacher          Youth
                              (N = 9,998)      (N = 7,313)      (N = 3,983)
Total Difficulties            0.82             0.87             0.80
Emotional Symptoms            0.67             0.78             0.66
Conduct Problems              0.63             0.74             0.60
Hyperactivity-Inattention     0.77             0.88             0.67
Peer Problems                 0.57             0.70             0.41
Prosocial Behavior            0.65             0.84             0.66
Impact                        0.85             0.85             0.81

Note: SDQ = Strengths and Difficulties Questionnaire.

TABLE 5
Stability Over 4 to 6 Months of SDQ Scores in a Community Sample of 5–15-Year-Olds

Time 1 × Time 2 Correlations
SDQ Scale                     Parent           Teacher          Youth
                              (N = 2,091)      (N = 796)        (N = 781)
Total Difficulties            0.72             0.80             0.62
Emotional Symptoms            0.57             0.65             0.57
Conduct Problems              0.64             0.69             0.51
Hyperactivity-Inattention     0.72             0.82             0.60
Peer Problems                 0.61             0.72             0.54
Prosocial Behavior            0.61             0.74             0.51
Impact                        0.57             0.68             0.21

Note: All correlations significant at p < .001. SDQ = Strengths and Difficulties Questionnaire.

DISCUSSION

The psychometric properties of the SDQ were assessed in a representative sample of ten thousand 5–15-year-olds, all of whom had psychiatric assessments. The findings confirmed and extended previous reports of satisfactory reliability and validity based on studies of smaller community and clinic samples from around the world (García et al., 2000; Goodman, 1997, 1999; Goodman et al., 1998, 2000c; Goodman and Scott, 1999; Klasen et al., 2000; Koskelainen et al., 2000; Smedje et al., 1999).

Factor analyses showed that nearly all items loaded primarily, and usually exclusively, on the predicted five factors, covering emotional symptoms, conduct problems, hyperactivity-inattention, peer problems, and prosocial behavior. The predicted five-factor structure fitted the findings particularly well for the parent SDQ. A factor analysis of the German version of the parent SDQ similarly confirmed the predicted five-factor structure (Woerner et al., 2000). For teacher and self-report SDQs, the principal divergence from the predicted five-factor structure was the tendency for any positively worded item to load on the prosocial factor. As previously reported (Goodman, 1994), raters seem to vary in their readiness to attribute positive qualities, with the result that the prosocial factor also functions, though to a lesser extent, as a “positive construal” factor.

There was very little overlap between the items loading on the internalizing scale (Emotional Symptoms) and the two externalizing scales (Conduct Problems and Hyperactivity-Inattention). This confirms a previous suggestion that the internalizing and externalizing scales of the SDQ are relatively “uncontaminated” by one another (Goodman and Scott, 1999). The correlation between the internalizing and externalizing scales (approximately 0.3) is half that of the correlation between the two externalizing scales. By contrast, studies of the CBCL (Achenbach, 1991a) typically report much higher correlations between internalizing and externalizing scales, perhaps overestimating true comorbidity as a result of “contaminated” scales.

Reliability is generally judged in three ways: from internal consistency, from interrater agreement, and from test-retest stability. The internal consistencies of the SDQ were generally satisfactory. Interrater agreement for the SDQ was, for the most part, substantially better than the average level of agreement reported for other measures (Achenbach et al., 1987). Indeed, the interrater correlations for the total difficulties score were almost twice the meta-analytic means, reflecting almost four times the shared variance. Whereas test-retest reliability is usually measured by repeating the assessment after a brief interval of approximately 1 to 4 weeks, the retest in the current study was not carried out until 4 to 6 months later. Consequently, changes in the scores over this long period will have resulted from true alterations in the children’s psychological adjustment as well as from measurement unreliability.
Nevertheless, the reported test-retest stabilities do set a lower bound for true test-retest reliability, suggesting that the latter are generally satisfactory. The validity of the SDQ was gauged by how strongly the various scales were associated with the presence or absence of psychiatric disorders. High SDQ scores (in the extreme 10% of the population) were associated with a substantial increase in psychiatric risk, with odds ratios of approximately 15 for parent and teacher SDQ scales and approximately 6 for self-report SDQ scales. When applied to a community sample, the proportion of true negatives is high (specificities and negative predictive values around 95%) but the proportion of true positives is substantially lower (sensitivities and positive predictive values around 35%). This sort of overinclusiveness is


TABLE 6
Prevalence of DSM-IV Diagnoses According to Whether the Child’s SDQ Score Puts Him or Her at Low Risk (90% of Population) or High Risk (Extreme 10% of Population)

SDQ Scale (Cutoff)            Which DSM-IV    Prevalence of Diagnosis    Prevalence of Diagnosis    Odds Ratio           Spec    Sens    NPV     PPV
                              Diagnosis?      in Low-Risk Group          in High-Risk Group         (95% CI)

Parent SDQ
Total Difficulties (16/17)    Any             5.4% (488/9,057)           46.3% (436/941)            15.2 (13.0–17.7)     94%     47%     96%     46%
Emotional (4/5)               ANX or DEP      2.2% (192/8,882)           20.5% (229/1,116)          11.7 (9.5–14.3)      91%     54%     98%     21%
Conduct (3/4)                 ODD or CD       1.7% (148/8,762)           25.7% (318/1,236)          20.2 (16.4–24.8)     91%     68%     98%     26%
Hyperactivity (7/8)           ADHD            0.7% (59/9,054)            17.5% (165/944)            32.3 (23.8–43.9)     92%     74%     99%     17%
Peer Problems (3/4)           Any             6.4% (571/8,870)           31.3% (353/1,128)          6.6 (5.7–7.7)        91%     38%     94%     31%
Prosocial (6/7)               Any             7.7% (694/8,965)           22.3% (230/1,033)          3.4 (2.9–4.0)        91%     25%     92%     22%
Total Impact (1/2)            Any             5.2% (473/9,143)           52.7% (451/855)            20.5 (17.4–24.1)     96%     49%     95%     53%

Teacher SDQ
Total Difficulties (15/16)    Any             5.5% (370/6,689)           44.1% (275/624)            13.5 (11.1–16.3)     95%     43%     94%     44%
Emotional (4/5)               ANX or DEP      3.0% (203/6,723)           14.1% (83/590)             5.3 (4.0–6.9)        93%     29%     97%     14%
Conduct (3/4)                 ODD or CD       1.8% (121/6,783)           37.7% (200/530)            33.4 (26.0–42.9)     95%     62%     98%     38%
Hyperactivity (7/8)           ADHD            0.8% (54/6,706)            19.1% (116/607)            29.1 (20.8–40.7)     93%     68%     99%     19%
Peer Problems (4/5)           Any             7.0% (481/6,830)           34.0% (164/483)            6.8 (5.5–8.4)        95%     25%     93%     34%
Prosocial (5/6)               Any             6.5% (417/6,465)           26.6% (228/857)            5.2 (4.4–6.3)        91%     35%     94%     27%
Total Impact (1/2)            Any             5.2% (343/6,597)           42.2% (302/716)            13.3 (11.1–16.0)     94%     47%     95%     42%

Youth SDQ
Total Difficulties (17/18)    Any             7.6% (274/3,625)           35.2% (126/358)            6.6 (5.2–8.5)        94%     23%     92%     35%
Emotional (6/7)               ANX or DEP      4.0% (150/3,773)           28.6% (60/210)             9.7 (6.9–13.6)       96%     29%     96%     29%
Conduct (4/5)                 ODD or CD       3.3% (118/3,572)           19.5% (80/411)             7.1 (5.2–9.6)        96%     29%     97%     19%
Hyperactivity (6/7)           ADHD            1.5% (52/3,534)            8.9% (31/449)              5.0 (3.1–7.8)        91%     40%     98%     9%
Peer Problems (3/4)           Any             8.5% (310/3,631)           25.6% (90/352)             3.7 (2.8–4.8)        89%     37%     91%     26%
Prosocial (5/6)               Any             9.6% (349/3,645)           15.1% (51/338)             1.7 (1.2–2.3)        92%     13%     90%     15%
Total Impact (1/2)            Any             8.0% (299/3,759)           45.1% (101/224)            9.5 (7.1–12.7)       97%     25%     92%     45%

Note: SDQ = Strengths and Difficulties Questionnaire; CI = confidence interval; Spec = specificity; Sens = sensitivity; NPV = negative predictive value; PPV = positive predictive value; ANX = anxiety; DEP = depression; ODD = oppositional defiant disorder; ADHD = attention-deficit/hyperactivity disorder.


often acceptable in screening tests, in which the first priority is generally to reduce the rate of false negatives even if this is at the cost of increasing the rate of false positives. However, it is worth noting that the screening properties reported in this article are all based on individual scores derived from single informants; using a computer algorithm to combine SDQ symptom and impact scores from multiple informants substantially improves screening properties (Goodman et al., 2000b).

Limitations

The follow-up of part of the original sample at 4 to 6 months set a lower bound on likely test-retest reliability, but a more comprehensive follow-up after a shorter period could have provided a more accurate estimate of test-retest reliability. While participation rates were excellent for parents and youths, the 20% nonparticipation rate for teachers (particularly those from schools in deprived neighborhoods) may have made the sample of teacher SDQs unrepresentatively “supernormal.” There is no reason, though, to suppose that this will have distorted this report’s findings on factor structure, reliability, or validity.

Clinical and Research Implications

Overall, the SDQ functions remarkably well for a brief measure that fits on one piece of paper. It is potentially useful for screening (Goodman et al., 2000b), as part of a clinical assessment (Goodman et al., 2000c), and as a measure of treatment outcome (Garralda et al., 2000). The fact that it can be used without charge for noncommercial purposes is an additional advantage for researchers and clinicians working in the public sector.

REFERENCES

Achenbach TM (1991a), Manual for the Child Behavior Checklist/4–18 and 1991 Profile. Burlington: University of Vermont Department of Psychiatry
Achenbach TM (1991b), Manual for the Teacher’s Report Form and 1991 Profile. Burlington: University of Vermont Department of Psychiatry
Achenbach TM (1991c), Manual for the Youth Self-Report. Burlington: University of Vermont Department of Psychiatry
Achenbach TM, McConaughy SH, Howell CT (1987), Child/adolescent behavioral and emotional problems: implications of cross-informant correlations for situational specificity. Psychol Bull 101:213–232
American Psychiatric Association (1994), Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV). Washington, DC: American Psychiatric Association
Elander J, Rutter M (1995), Use and development of the Rutter Parents’ and Teachers’ Scales. Int J Methods Psychiatr Res 5:1–16
Fombonne E, Simmons H, Ford T, Meltzer H, Goodman R (2001), Prevalence of pervasive developmental disorders in the British nationwide survey of child mental health. J Am Acad Child Adolesc Psychiatry 40:820–827
García P, Mazaira JA, Goodman R (2000), The initial validation study of the Gallego version of the Strengths and Difficulties Questionnaire (SDQ). Rev Psiquiatría Infanto-Juvenil 2:95–100
Garralda ME, Yates P, Higginson I (2000), Child and adolescent mental health service use: HoNOSCA as an outcome measure. Br J Psychiatry 177:52–58
Goodman R (1994), A modified version of the Rutter parent questionnaire including items on children’s strengths: a research note. J Child Psychol Psychiatry 35:1483–1494
Goodman R (1997), The Strengths and Difficulties Questionnaire: a research note. J Child Psychol Psychiatry 38:581–586
Goodman R (1999), The extended version of the Strengths and Difficulties Questionnaire as a guide to child psychiatric caseness and consequent burden. J Child Psychol Psychiatry 40:791–801
Goodman R, Ford T, Richards H, Gatward R, Meltzer H (2000a), The Development and Well-Being Assessment: description and initial validation of an integrated assessment of child and adolescent psychopathology. J Child Psychol Psychiatry 41:645–655
Goodman R, Ford T, Simmons H, Gatward R, Meltzer H (2000b), Using the Strengths and Difficulties Questionnaire (SDQ) to screen for child psychiatric disorders in a community sample. Br J Psychiatry 177:534–539
Goodman R, Meltzer H, Bailey V (1998), The Strengths and Difficulties Questionnaire: a pilot study on the validity of the self-report version. Eur Child Adolesc Psychiatry 7:125–130
Goodman R, Renfrew D, Mullick M (2000c), Predicting type of psychiatric disorder from Strengths and Difficulties Questionnaire (SDQ) scores in child mental health clinics in London and Dhaka. Eur Child Adolesc Psychiatry 9:129–134
Goodman R, Scott S (1999), Comparing the Strengths and Difficulties Questionnaire and the Child Behavior Checklist: is small beautiful? J Abnorm Child Psychol 27:17–24
Klasen H, Woerner W, Wolke D et al. (2000), Comparing the German versions of the Strengths and Difficulties Questionnaire (SDQ-Deu) and the Child Behavior Checklist. Eur Child Adolesc Psychiatry 9:271–276
Koskelainen M, Sourander A, Kaljonen A (2000), The Strengths and Difficulties Questionnaire among Finnish school-aged children and adolescents. Eur Child Adolesc Psychiatry 9:277–284
Meltzer H, Gatward R, Goodman R, Ford T (2000), Mental Health of Children and Adolescents in Great Britain. London: Stationery Office
Smedje H, Broman J-E, Hetta J, von Knorring A-L (1999), Psychometric properties of a Swedish version of the “Strengths and Difficulties Questionnaire.” Eur Child Adolesc Psychiatry 8:63–70
SPSS Inc. (1999), SPSS. Chicago: SPSS
Woerner W, Friedrich C, Becker A, Goodman R, Rothenberger A (2000), Normierung und Evaluation des Strengths and Difficulties Questionnaire (SDQ): Erste Ergebnisse aus einer deutschen Feldstichprobe zur Elternversion. Abstract from 26th Kongress der Deutschen Gesellschaft für Kinder- und Jugendpsychiatrie und Psychotherapie, Jena, Germany, April
