Child Psychopathology Rating Scales and Interrater Agreement: II. Child and Family Characteristics
PETER S. JENSEN, M.D., MAJ., STEPHEN N. XENAKIS, M.D., COL., HARRY DAVIS, M.S., AND JAMES DEGROOT, PH.D.
Abstract. One hundred families with a child from ages 6 to 11 from a nonclinical population were surveyed to determine the effects of child and family psychosocial and demographic characteristics on interrater agreement about children's symptoms and behavior problems. Results indicated that a significant proportion of the variance in difference scores among parent, child, and teacher reports about the child is a function of the family status (blended/previous divorce versus intact family), sex of parent and child, life stressors, the child's tendency to respond in a socially desirable fashion, sibling position and family size, and familiarity of the child to the rater. Development of scales less sensitive to the effects of external, environmental, and child-related variables is needed to improve their reliability and validity and their usefulness as screening instruments in nonclinical populations. J. Am. Acad. Child Adolesc. Psychiatry, 1988, 27, 4:451-461. Key Words: rating scales, interrater agreement, family status, stress, sibling position.
In a companion report (Jensen et al., 1988), the authors described the significant differences in mothers' and fathers' ratings of their sons' behavioral problems when using the Child Behavior Checklist (CBCL) (Achenbach and Edelbrock, 1983). Also reported was the finding that each parent's own psychiatric symptoms account for systematic differences between his or her ratings of the child and the other parent's rating, the teacher's rating, and the child's rating. This report examines additional factors that affect reliabilities between the various raters of children's psychiatric symptoms and behavior problems. Kazdin and Petti (1982) have noted the need to study child and parent symptom report patterns by the sex and age of the child, but it appears that only one study in the literature has followed through on their recommendations (Kazdin et al., 1983a). Other factors may affect interrater reliabilities, including the familiarity of the target child to the rater, the type of symptom being reported, gender-based differences in patterns of child psychiatric symptoms, social desirability response sets, child's intelligence, socioeconomic status, and family factors. The available research in each of these areas is reviewed here in turn.

Effects of child's and parent's sex. A common problem in studying interrater reliabilities with child psychiatric scales involves the frequent lumping of raters' reports of boys' and girls' symptoms (e.g., Kashani et al., 1985; Kazdin et al., 1983a), despite clear evidence that boys' and girls' symptom patterns differ (Achenbach and Edelbrock, 1983; Jacobsen et al., 1983). In most cases, this practice should be discontinued, given the growing evidence that mothers and fathers differ in their actual threshold of reporting about child psychiatric problems (Jensen et al., 1988; Reynolds et al., 1985) as well as in their response patterns to their children's behavior (Rothbart and Maccoby, 1966). Although the pioneering work of Achenbach and Edelbrock (1983) in the development of the CBCL demonstrated differences in symptom patterns between boys and girls, little work has been done to document systematic gender differences with other scales. In one of the few reports available, Finch et al. (1985) reported significant gender effects in children's responses to the Children's Depression Inventory (CDI). In a study of psychiatric inpatient children, Kazdin et al. (1983a) found that boys show less agreement with their parents about their reports of psychiatric symptoms and tend to be more independent observers of themselves than girls.

Effects of child's age. Several researchers have documented the effects of a child's age on symptom pattern reporting. In the development of the CBCL, different factors were described for children of different ages (Achenbach and Edelbrock, 1983). Finch et al. (1985) found a pattern of increasing CDI scores for girls from grades 2 to 8 and slightly declining CDI scores for boys from grades 2 to 8. In contrast, Saylor et al. (1984), in a study of children (not separated by sex), found lower CDI scores with increasing age. In terms of reliabilities between child and parent ratings, Reich et al. (1982) found a decreasing discrepancy between mother-child reports as a function of increasing child age. Touliatos and Lindholm (1975) also found clear child age effects on teachers' reports on the Behavioral Problem Checklist. It is possible that symptom reporting patterns may vary as a function of sex, age, and the type of symptom; thus, while depression scores may gradually decrease for boys as a function of age, behavior problem scores for the same children may rise (Touliatos and Lindholm, 1975; Werry and Quay, 1971).
Accepted April 5, 1988. From the Department of Psychiatry and Neurology, Eisenhower Army Medical Center, Fort Gordon, Georgia. Reprints may be requested from Maj. Peter S. Jensen, M.D., P.O. Box 342, Eisenhower Army Medical Center, Fort Gordon, GA 30905-5650. The opinions and assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the Department of the Army or the Department of Defense. The authors wish to thank Rosina Martinez and Sandra Ferrell for their careful editing and preparation of the manuscript. 0890-8567/88/2704-0451 $02.00/0 © 1988 by the American Academy of Child and Adolescent Psychiatry.
JENSEN ET AL.
Familiarity of the rated child. Burrows and Kelley (1983) reported better interparent reliability when parents rated their own child than when they rated an unfamiliar child. Interrater reliability may increase as a function of increasing similarity of the daily situations and time spent with that child. Although time sampling methods with independent observers may seem more objective, these methods may miss rare but very important behaviors. Time sampling methods may actually lower interparent reliabilities (Burrows and Kelley, 1983). A parent's in-depth knowledge of a child may make him or her sensitive to nuances of the child's behavior or internal feeling states that will not be apparent to an objective observer. Time sampling methods may suffer from the artificiality of the procedure itself, and the effects of time sampling methods (and the presence of an observer) on the parent-child interactions should not be underestimated. Much more research will need to be done to tease out the effects of the rater's relationship to and familiarity with the rated child.

Effects of symptom type. The type of symptom reported likely affects the reliabilities between raters. A companion report in this issue describes higher correlations among mothers', fathers', and teachers' reports of children's externalizing behaviors than among their reports of internalizing symptoms (Jensen et al., 1988a). Similar findings have also been reported in other studies using clinical and nonclinical samples and alternative research methods (Mash and Johnston, 1983; Reich et al., 1982). An obvious problem that needs further exploration concerns the potential impact of the different origin and evolution of child- and parent-completed scales.
Most scales for completion by children were devised with children in mind, based on the child's ability to understand and respond to the questions; in contrast, most scales completed by parents about their children are based on factor analytic approaches, using descriptors of the child's symptoms from an adult perspective. To begin to deal with this built-in lack of concordance between parent and child reports, some authors have developed parent versions of child report instruments; use of such methods may result in higher correlations between parent and child reports (e.g., Garber, 1986). Although this makes inherent sense, difficulties arise in that symptoms observed by the parent may not be reported by the child; instead, the child may report other internal states or behaviors. For example, Kazdin et al. (1983b) reported higher correlations between the parents' report on the CBCL somatization subscale and the child's own report of depression than between parent and child reports of the child's depression. Similarly, Kashani et al. (1985) reported that while some parents reported the presence of attention deficit disorder symptoms in their children, their children reported depressive symptoms. Conceivably, both the parent and child may be reporting relevant aspects of a psychopathologic entity; conversely, the possibility exists that either or both parties may be subject to distortions in their reports. Although most discussions in the literature about the reliabilities of various parent versus child reports focus around the area of underreporting versus overreporting (e.g., Kashani et al., 1985; Kazdin et al., 1983c; Orvaschel et al., 1981), the weight of evidence suggests that the best correlations between
parents' and children's reports may be between the parents' reports of the child's external, troubling behaviors (obviously noticeable and worrisome to the parent) and the child's reports of internal, depressed, or anxious feelings (painful to the child but not always obvious to the parent). This hypothesis awaits investigation.

Effects of social desirability response sets. Social desirability response sets in one or both raters may affect parent-child or other interrater reliabilities on child symptom and behavior problem scales. Ledingham et al. (1982) noted that peer and teacher assessments of a given child within the classroom showed better agreement than did teacher-target child or peer-target child ratings. They concluded that the target children may be likely to respond to the rating scales based on the desirability of the trait being evaluated. Their conclusions parallel those of previous authors (Pekarik et al., 1976; Semler, 1960), but much more work needs to be done with current scales.

Effects of intelligence and socioeconomic status. The discrepancy between parent and child reports may vary as a function of higher child IQ, suggesting that the child's capacity to dissemble and maintain an independent opinion may contribute to the discrepancy between parent and child reports (Kazdin et al., 1983a). These authors also found that ethnic, socioeconomic, and racial variables were associated with the discrepancy between parent and child reports: white parents tended to overestimate their child's depression, while black parents tended to underestimate their child's depression. Furthermore, biologic mothers tended to rate their child as more depressed than did other maternal figures (e.g., stepmothers). Possibly, this phenomenon may be a function of the objectivity of the stepparent rater; conversely, the greater familiarity of the child to a natural parent may enhance the parent's sensitivity to that child's symptoms.
Various researchers (McDermott et al., 1965; Touliatos and Lindholm, 1975) have reported systematic differences among teachers in their reports of children's behavioral problems as a function of the child's and teacher's "middle-class bias." Similarly, Touliatos and Lindholm (1981) found higher correlations between parents' and teachers' reports of children's behavior as a function of increasing child age, child gender (parents and teachers agreed more about ratings of boys than girls), and higher socioeconomic status. Because of the different demands on the child in the school setting, teachers' reports of child symptoms may be influenced by the child's cognitive abilities and academic performance rather than simply by behavior problems or psychiatric symptoms alone (Lessing et al., 1974).

Effects of family stressors. An additional factor often observed by clinicians (but scarcely mentioned in the research literature) is the presence of family and marital stresses affecting parents' reports of child symptoms. In practice, clinicians commonly observe that the child may actually be relatively well adjusted or only slightly symptomatic, while the majority of the parental complaints about the child's behavior seem related to marital turmoil or family stressors. Poznanski et al. (1984) and Reynolds et al. (1985) noted the possibility that family factors and parental stigmatization of a child contribute to parents' or the child's overreporting the child's
INTERRATER AGREEMENT: CHILD AND FAMILY FACTORS
symptoms. Forehand et al. (1986) found that mothers' reports of marital dissatisfaction were better predictors of maternal perceptions of child behavior problems than were independent observers' ratings of child problems in the home.

Purpose. Because of the multiple potential factors affecting mothers', fathers', teachers', and children's reports of child behavior problems, and the dearth of systematic research in this area, this study attempted to examine these factors systematically in a community sample of boys and girls across age ranges of 6 to 12, using well-standardized instruments based on parent, teacher, and child reports.

Hypotheses

Based on a review of the literature and clinical experience, a set of hypotheses was developed to test for sources of systematic variance between parents', teachers', and children's reports of children's symptoms and behavioral problems:

1. Discrepancy of symptom scores in parent-child and teacher-child dyads (difference score) will increase as a function of child's age (hypothesized to occur on the basis of the older child's increasing sensitivity to "private" information and his or her growing capacity to take an independent position from that of the parent or teacher).

2. Increasing time spent with and exposure to the rated child should decrease the discrepancy between parent-child and mother-father dyad reports. (These criteria were operationalized by (a) the extent to which the mother worked outside the home, (b) the extent of the father's absence on trips during the past year, and (c) a comparison between the difference scores of natural parent dyads' ratings of their child versus natural parent/stepparent dyads' ratings of their child.)

3. The discrepancy between parent-child dyad reports and between mother-father dyad reports should decrease with increasing socioeconomic status (presumably a function of increased parental education, parental awareness of psychologic issues, understanding the meanings of children's behaviors, etc.).

4. The discrepancy in parent-child dyad scores will increase as a function of greater family size and the rated child's more distal birth order. (In larger families, parents may be less aware of their child's feelings; also, there may be less time and attention available for the parent to respond to the individual child's cues.)

5. Parent-child dyad discrepancy in symptom reports will increase as a function of the child's tendency to respond in a socially desirable fashion (operationalized by the child's completing a self-report social desirability scale [Lie score]).

6. The type of symptom construct being compared between parent-child or teacher-child dyads will result in systematic variations in the discrepancies between dyads' scores. Specifically, it was hypothesized that the discrepancies in internalizing symptom constructs rated by both parent and child will show systematic covariation with extraneous factors (e.g., social desirability, sibling position, family size, age of child, socioeconomic status, familiarity of the target child) and result in significant correlations with those extraneous factors; in contrast, it was hypothesized that relatively fewer significant correlations will emerge between the extraneous factors and the difference scores between parents' externalizing symptom reports and the child's internalizing reports. Thus, the agreement between parent and child when the parent describes externalizing symptoms and when the child describes internalizing symptoms should be more stable than when parent and child are both describing internalizing symptoms and behaviors in the child.

Three additional questions were raised with respect to the data, but a priori hypotheses concerning the direction of effects were not constructed.

7. Does the level of family stressors during the past year affect parent-child and teacher-child dyad discrepancies? (The level of family stressors was measured by using the mother-completed Life Events Record, which documented stressful events occurring to the child and family over the previous year.)

8. Do the child's characteristic social patterns affect parent-child and teacher-child discrepancy scores? (Social patterns were operationalized as the number of friends the child has and how frequently he or she plays with friends each week.)

9. Does the sex of child affect the patterns of intercorrelations of the various difference scores with other sociodemographic variables, and if so, how?

Method

Subjects

One hundred and twenty-four intact families with children attending the same elementary school were invited by letter and follow-up phone call (by one of the authors) to participate in a study of the effects of life stresses on children. Families with one or more children from ages 6 to 11.9 were randomly selected from military housing lists; of this group, eight families were ineligible for participation because of temporary father absence, while six families refused to participate. Two families that agreed to participate could not be reached at home despite several attempts. Of the participating 108 families, 100 mothers, 95 fathers, 90 children, and 94 teachers completed questionnaires. All fathers were officers or senior enlisted personnel on active duty with the U.S.
Army, thus ensuring moderately uniform socioeconomic status. The sample of children on which at least partial data were available was composed of 59 boys and 49 girls, ages 6 to 11.9. Mean age of boys and girls was 8.9 and 8.7 years, respectively.
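The recruitment figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the completion rates it derives are not figures the authors report and are shown only as an illustration.

```python
# Recruitment accounting, using only the counts given in the Subjects paragraph.
invited = 124
ineligible = 8       # temporary father absence
refused = 6
unreachable = 2      # agreed to participate but could not be reached at home

participating = invited - ineligible - refused - unreachable

# Completed questionnaires by respondent type, as reported in the text.
completed = {"mothers": 100, "fathers": 95, "children": 90, "teachers": 94}

# Completion rate among participating families (derived here, not reported).
completion_rate = {who: n / participating for who, n in completed.items()}

# Children with at least partial data, by sex.
boys, girls = 59, 49
```

The subtraction reproduces the 108 participating families, and the boys and girls with at least partial data sum to the same number.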
Instruments

The CBCL was used to document the parents' reports of their children's depressive symptoms and behavioral problems (Achenbach and Edelbrock, 1978). The CBCL was selected because it allows compilation of a total behavior problem score, two "broad band" scores (internalizing and externalizing symptoms), and 12 subscale scores ("narrow band"), e.g., depression, anxiety, aggression, and hyperactivity. Because separate standardized norms and scoring procedures have been devised for boys and girls and for different age groups, boys' and girls' scores were tabulated separately using the standardized forms for ages 6 to 11. The two CBCL broad band subscales were used to assess parents' ratings of their children's behavior problems. These scores were selected for analysis from the total number of CBCL scales and subscales,
because much of the previous research with these scales has used these broad band subscale scores. Narrow band scales are not diagnosis-specific, and the increased number of scales would have excessively complicated the data analysis. The Teacher Report Form (TRF) of the CBCL (Achenbach and Edelbrock, 1983) was used to measure children's behavioral problems in the school setting, according to each child's main teacher. As with the parent-report version of the CBCL, the broad band subscale scores were selected for analysis. To assess symptoms in the children from their perspective, each child completed the CDI and the Revised Children's Manifest Anxiety Scale (CMAS). The CDI is a 27-item multiple choice instrument that documents the child's symptoms of depression from his or her perspective (Kovacs and Beck, 1977). The CMAS is a 37-item true/false questionnaire that is designed to measure various types of anxiety in children. This scale has three anxiety subscales and a Lie subscale (Reynolds and Richmond, 1978). In addition, the Life Events Record (Coddington, 1972) was completed by the mother to document the level of stresses on the child and family during the past 12 months. Also used was the Hopkins Symptom Checklist (HSCL) to measure parents' psychologic status, after it was modified by deleting the paranoia and psychotic symptom subscales to better accommodate a nonclinical community sample (Derogatis et al., 1974) (see companion report for a description of the HSCL findings). Gathering these reports of psychiatric symptoms from mother, father, teacher, and child provided a means of cross-checking the reliabilities of ratings between various combinations of the family members and the child's main teacher.
A demographic questionnaire obtained further information from the mother: the mother responded to questions about the child's age, birth order, total number of siblings, the father's military rank (a measure of family socioeconomic status), the number of close friends the child had, and how frequently the child played with his or her friends per week. Also, the extent of father's absence from the home was operationalized on a 7-point Likert scale (0 = no absences greater than 2 weeks during the past year, 6 = father away from home 6 months or more during the past year). The extent of mother's working was operationalized on a 3-point scale to document the amount of time that she was away from home (0 = mother not working outside of the home, 1 = working fewer than 20 hours/week, 2 = working 20 or more hours/week). Although these socio-demographic variables (e.g., father absence, socioeconomic status, child personality characteristics, etc.) might have been measured more precisely with instruments with established reliability and validity, the amount of work for the parents and children in completing lengthier questionnaires might have compromised participation in the study.
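The 3-point coding of mother's working can be written out as a small function. This is only a sketch: the function name and the use of weekly hours as the input are assumptions made for illustration, since the questionnaire item itself is not reproduced in the text. The father-absence scale is not sketched because only its two endpoints (0 and 6) are defined.

```python
def code_mother_working(hours_per_week: float) -> int:
    """Map weekly hours worked outside the home to the 3-point code in the text.

    Hypothetical helper: the study recorded the code from the mother's
    questionnaire response, not necessarily from an hours figure.
    """
    if hours_per_week < 0:
        raise ValueError("hours_per_week must be non-negative")
    if hours_per_week == 0:
        return 0  # mother not working outside of the home
    if hours_per_week < 20:
        return 1  # working fewer than 20 hours/week
    return 2      # working 20 or more hours/week


# Illustrative inputs only; these are not study data.
example_codes = [code_mother_working(h) for h in (0, 12, 20, 40)]
```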
Procedures

A research worker personally delivered the parents' questionnaires to each home and gathered the basic demographic information about the child and family from the mother during the visit. Parents were asked to complete the scales and questionnaires at home independently of each other (although there was no means of ensuring that they did not collaborate) and then return these materials by mail in a stamped, addressed envelope provided with the questionnaires. The children completed the CDI and CMAS at school in a testing room in the presence of one of the authors or a trained research assistant.

Data analysis. Because no instruments have been well standardized for use by both parents and children, it was not possible to administer the same measures to both adult and child reporters. For this reason, it was necessary to convert child and adult ratings of child symptoms to z scores to allow for comparison of the differences between adult and child ratings of child behavior and symptoms. Z score conversions were performed on parents' and teachers' internalizing and externalizing CBCL subscale scores; similarly, z score conversions of child-completed measures (CDI and CMAS Total Anxiety scores) were also performed. Difference scores were computed by subtracting the z score of one rater's report from the z score of the other rater, for each parent-child, mother-father, teacher-child, and parent-teacher dyad. Difference scores were chosen for use in the analysis because of the evidence cited above (e.g., Achenbach and Edelbrock, 1978; Jensen et al., 1988a; Rothbart and Maccoby, 1966) that children's behaviors and adults' perceptions of child behavior are "dyad- and situation-specific." In other words, ratings of children's behavior by adults do not show strong correlations between different raters, and the rating of a given child may be specific to the relationship between the child and the rater or it may be the unique response of the child in that environmental setting.
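The z score conversion and difference score computation described above might be sketched as follows. Several details are assumptions rather than the authors' stated procedure: standardization is done within the sample using the sample standard deviation (n - 1), the direction of subtraction is adult minus child, and the raw scores are invented for illustration.

```python
from statistics import mean, stdev


def z_scores(raw):
    """Standardize raw scale scores to mean 0 and S.D. 1 within the sample.

    Uses the sample S.D. (n - 1 denominator); the text does not say which
    denominator or reference group was used, so this is an assumption.
    """
    m, s = mean(raw), stdev(raw)
    return [(x - m) / s for x in raw]


def difference_scores(adult_raw, child_raw):
    """Adult z score minus child z score for each adult-child pair (assumed direction)."""
    adult_z, child_z = z_scores(adult_raw), z_scores(child_raw)
    return [a - c for a, c in zip(adult_z, child_z)]


# Hypothetical raw scores for five children (illustrative only, not study data).
mother_cbcl_int = [4, 9, 2, 12, 7]
child_cdi = [3, 11, 6, 8, 5]

diffs = difference_scores(mother_cbcl_int, child_cdi)
abs_diffs = [abs(d) for d in diffs]  # absolute difference scores
```

Because both scales are put on the same standardized metric, a difference score of 1.0 means that the two raters placed the child one standard deviation apart relative to the rest of the sample, even though they completed different instruments.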
Pearson correlations were then computed between each dyad difference score and the child's age, number of friends, frequency of playing with friends per week, parents' socioeconomic status (military rank), the child's social desirability response set (Lie subscale score on the Children's Manifest Anxiety Scale), familiarity of the child to the adult rater (the extent to which mother works during the day and the extent of father's absence out of the home in the past 12 months), family size, and the child's sibling position in the family. Also, families with both natural parents were coded "0" while those with a step- or adoptive father were coded "1." (There were no step- or adoptive mothers in the sample.) This "dummy" coding of families by natural versus adoptive/stepparent status allowed computation of point biserial correlations of family type with dyad difference scores. All correlations were examined separately for boys and girls because of the evidence of different response sets and symptom patterns reported by boys versus girls and the difference between mothers and fathers in their reports of boys' and girls' symptom and behavior problems (Jensen et al., 1988a).

Results

Difference scores were computed between the child's responses (CMAS and CDI scores), the teacher's responses (TRF internalizing and externalizing scores), the mother's responses (CBCL internalizing and externalizing scores), and the father's responses (CBCL internalizing and externalizing scores). Table 1 shows the absolute difference scores between the child's CMAS scores and the parents' and teachers' CBCL internalizing and externalizing scores. Table 2 presents the absolute difference scores between the child's CDI responses and parents' and teachers' CBCL internalizing and externalizing scores. The findings in these tables indicate that the average difference between various raters is usually in the range of 1 S.D. There do not appear to be any differences between boys and girls in the average differences between them and the various adult raters.

TABLE 1. Absolute Difference Scores between CMAS (Total Anxiety) and Parent/Teacher CBCL (Internalizing/Externalizing) Scales

                            Boys                      Girls
Dyad                        Mean    S.D.    N         Mean    S.D.    N
Mothers CBCL Int.-CMAS      0.951   0.668   48        0.998   0.774   39
Fathers CBCL Int.-CMAS      0.913   0.809   46        1.135   0.839   38
Teachers TRF Int.-CMAS      1.058   0.784   47        1.188   0.885   32
Mothers CBCL Ext.-CMAS      1.020   0.665   48        1.005   0.841   39
Fathers CBCL Ext.-CMAS      1.025   0.701   46        1.344   0.916   38
Teachers TRF Ext.-CMAS      1.071   0.956   47        1.003   0.745   32

TABLE 2. Absolute Difference Scores between Children's Depression Inventory and Parent/Teacher CBCL (Internalizing/Externalizing) Scales

                            Sons                      Daughters
Dyad                        Mean    S.D.    N         Mean    S.D.    N
Mothers CBCL Int.-CDI       0.888   0.656   46        0.989   0.800   39
Fathers CBCL Int.-CDI       1.042   0.868   44        0.962   0.728   38
Teachers TRF Int.-CDI       1.308   0.800   45        0.987   0.975   32
Mothers CBCL Ext.-CDI       0.815   0.703   46        0.918   0.675   39
Fathers CBCL Ext.-CDI       0.943   0.818   44        1.052   0.911   38
Teachers TRF Ext.-CDI       1.102   0.958   45        0.784   0.798   32

Pearson correlations of these difference scores with the child's age, birth order, family status (stepfather versus both natural parents), family socioeconomic status (father's rank), extent of mother working, father absence, number of friends and frequency of play with friends, and the child's social desirability scores are shown in Tables 3 (boys) and 4 (girls) for CMAS scores. In both Tables 3 and 4, significant correlations were found between fathers' stepparent status and increasing discrepancy in father-child difference scores, indicating greater differences in symptom ratings between stepfathers and their stepchildren than between biologic fathers and their children. Also, increased number of friends (boys, Table 3) and frequency of play with friends (girls, Table 4) were associated with smaller discrepancies between adults' and children's ratings. Increased number of siblings and more distal birth order were associated with smaller mother-daughter dyad discrepancies.

Tables 5 (boys) and 6 (girls) show the Pearson correlations of these same background and socioeconomic demographic variables with parent-child difference scores, using children's CDI responses and the parents' and teachers' CBCL internalizing and externalizing scores. Greater number of siblings was generally associated with parents' higher ratings of boys' symptoms, relative to the boys' reports (Table 5). Also, as was seen in Table 3, Table 5 demonstrates that the number of boys' friends was generally associated with decreasing discrepancies in parent-child ratings. As found in Tables 3 and 4, stepfather status was associated with greater discrepancies in father-child difference scores than were found between biologic fathers and their children. Higher family stress levels showed a slight association with increasing discrepancies between adults and boys but showed fairly robust associations with decreasing adult-child discrepancies for girls. Furthermore, the CMAS Lie score showed consistent associations with adult-girl discrepancies but not with adult-boy discrepancies. Like the findings for girls in Table 4, earlier birth order was associated with decreased mother-daughter discrepancies in Table 6. Surprisingly, increased father absence was associated with decreased teacher-girl discrepancies (Table 6) but showed no association with adult-boy discrepancies.

Table 7 shows the mean difference scores between mother-father, mother-teacher, and father-teacher dyads on the internalizing and externalizing scales. These findings indicate that the differences between parents are about 0.7 S.D., while parent-teacher discrepancies are usually greater than 1 S.D. Pearson correlations of the demographic and socioeconomic variables with the mother-father, mother-teacher, and father-teacher dyads' difference scores are shown in Tables 8 (boys) and 9 (girls). These findings are generally consistent with those of the previous tables: stepfather status is associated with increased discrepancies between mother-father and father-teacher ratings (the direction of the association is reversed in the correlation between mother-father discrepancy scores and stepfather status, since the father's score was subtracted from the mother's score, in contrast to the other difference scores involving fathers). Also, as was found in Tables 3 and 5, Table 8 indicates that an increased number of friends was associated with decreased discrepancies between raters for
TABLE 3. Pearson Correlations of Difference Scores^a and Socio-demographic Variables for Boys^b

                                         Mothers    Fathers    Teachers   Mothers    Fathers    Teachers
                                         CBCL       CBCL       CBCL       CBCL       CBCL       CBCL
Variable                                 Int.-CMAS  Int.-CMAS  Int.-CMAS  Ext.-CMAS  Ext.-CMAS  Ext.-CMAS
Age                                       0.159      0.092     -0.036     -0.028     -0.093     -0.115
Birth order                              -0.072     -0.003      0.061     -0.135      0.018     -0.091
No. of siblings                           0.049      0.147      0.047      0.018      0.104      0.031
Family status (biologic vs. stepfather)   0.138      0.388**   -0.180     -0.074      0.203     -0.178
SES (rank)                                0.007      0.106      0.194      0.127      0.171      0.093
Mother working                           -0.217     -0.038      0.202     -0.113     -0.059      0.082
Father absence                           -0.246     -0.135     -0.090     -0.289*    -0.200     -0.157
Family stressors                          0.193     -0.056     -0.115     -0.125     -0.023     -0.166
No. of friends                           -0.292*    -0.217     -0.274*    -0.194     -0.186      0.146
Frequency of play/week                   -0.067     -0.062      0.120      0.010      0.079      0.087
CMAS Lie score                           -0.074      0.094      0.140     -0.036      0.096      0.055

a Between boys' CMAS and adults' CBCL scores. b N = 49. * p < 0.05; ** p < 0.01.
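The family status rows in these tables are point biserial correlations, which are ordinary Pearson correlations in which one variable is the 0/1 dummy code described under Data analysis. The sketch below illustrates that equivalence. The function names and the six data points are hypothetical, chosen only to make the arithmetic easy to follow; they are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(group, y):
    """Point biserial r from group means: (M1 - M0) / s_n * sqrt(p * q)."""
    n = len(y)
    y1 = [v for g, v in zip(group, y) if g == 1]
    y0 = [v for g, v in zip(group, y) if g == 0]
    m1, m0 = sum(y1) / len(y1), sum(y0) / len(y0)
    my = sum(y) / n
    s_n = sqrt(sum((v - my) ** 2 for v in y) / n)  # population S.D. of y
    p, q = len(y1) / n, len(y0) / n
    return (m1 - m0) / s_n * sqrt(p * q)


# Hypothetical data: 0 = both natural parents, 1 = step- or adoptive father.
family_status = [0, 0, 0, 0, 1, 1]
father_child_diff = [0.1, -0.3, 0.2, 0.0, 0.9, 1.1]

r_dummy = pearson_r(family_status, father_child_diff)
r_pb = point_biserial(family_status, father_child_diff)
```

Both routes give the same coefficient, so a positive value simply means that the group coded 1 has the higher mean difference score.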
TABLE 4. Pearson Correlations of Difference Scores^a and Socio-demographic Variables for Girls^b

                                         Mothers    Fathers    Teachers   Mothers    Fathers    Teachers
                                         CBCL       CBCL       CBCL       CBCL       CBCL       CBCL
Variable                                 Int.-CMAS  Int.-CMAS  Int.-CMAS  Ext.-CMAS  Ext.-CMAS  Ext.-CMAS
Age                                       0.011      0.191      0.034     -0.043      0.087      0.056
Birth order                              -0.412**   -0.142     -0.080     -0.221     -0.134     -0.157
No. of siblings                          -0.327*    -0.183     -0.099     -0.300*    -0.176     -0.126
Family status (biologic vs. stepfather)   0.194      0.424**    0.033      0.233      0.361*     0.119
SES (rank)                               -0.170      0.079      0.137     -0.032      0.045      0.096
Mother working                           -0.210     -0.290*    -0.049     -0.142     -0.189     -0.131
Father absence                            0.139     -0.029     -0.163     -0.033     -0.134     -0.218
Family stressors                         -0.007     -0.118     -0.180     -0.155     -0.196     -0.120
No. of friends                           -0.161     -0.190     -0.195     -0.107     -0.154     -0.151
Frequency of play/week                   -0.315*    -0.318*    -0.229     -0.375**   -0.285*    -0.360*
CMAS Lie score                            0.154     -0.019      0.220      0.234      0.080      0.216

a Between girls' CMAS and adults' CBCL scores. b N = 39. * p < 0.05; ** p < 0.01.
TABLE 5. Pearson Correlations of Difference Scores^a and Socio-demographic Variables for Boys^b

| Variable | Mothers CBCL Int.-CDI | Fathers CBCL Int.-CDI | Teachers CBCL Int.-CDI | Mothers CBCL Ext.-CDI | Fathers CBCL Ext.-CDI | Teachers CBCL Ext.-CDI |
| --- | --- | --- | --- | --- | --- | --- |
| Age | 0.188 | 0.137 | 0.026 | 0.005 | -0.012 | -0.064 |
| Birth order | 0.029 | 0.128 | 0.136 | -0.023 | 0.150 | -0.011 |
| No. of siblings | 0.311* | 0.393** | 0.219 | 0.325* | 0.355** | 0.200 |
| Family status (biologic vs. stepfather) | 0.237 | 0.422** | -0.067 | 0.028 | 0.292* | -0.096 |
| SES (rank) | -0.184 | -0.025 | 0.039 | -0.066 | 0.004 | -0.036 |
| Mother working | -0.138 | 0.047 | 0.182 | -0.072 | -0.009 | 0.074 |
| Father absence | -0.124 | -0.065 | -0.091 | -0.277 | -0.143 | -0.130 |
| Family stressors | 0.354* | 0.011 | -0.114 | 0.211 | 0.064 | 0.203 |
| No. of friends | -0.460*** | -0.290* | -0.090 | -0.356** | -0.307* | 0.017 |
| Frequency of play/week | -0.226 | -0.166 | -0.111 | -0.122 | -0.050 | -0.090 |
| CMAS Lie score | 0.106 | 0.220 | 0.193 | 0.136 | 0.229 | 0.121 |

^a Between boys' CDI and adults' CBCL scores. ^b N = 49. * p < 0.05; ** p < 0.01; *** p < 0.001.
TABLE 6. Pearson Correlations of Difference Scores^a and Socio-demographic Variables for Girls^b

| Variable | Mothers CBCL Int.-CDI | Fathers CBCL Int.-CDI | Teachers CBCL Int.-CDI | Mothers CBCL Ext.-CDI | Fathers CBCL Ext.-CDI | Teachers CBCL Ext.-CDI |
| --- | --- | --- | --- | --- | --- | --- |
| Age | -0.097 | 0.099 | -0.051 | -0.175 | -0.015 | -0.055 |
| Birth order | -0.304* | -0.056 | -0.015 | -0.140 | -0.049 | -0.075 |
| No. of siblings | -0.166 | -0.044 | 0.064 | -0.150 | -0.035 | 0.081 |
| Family status (biologic vs. stepfather) | 0.098 | 0.388** | 0.007 | 0.155 | 0.320* | 0.088 |
| SES (rank) | -0.322* | -0.075 | -0.043 | -0.225 | -0.110 | -0.143 |
| Mother working | -0.059 | -0.167 | -0.048 | 0.011 | -0.057 | -0.007 |
| Father absence | -0.001 | -0.172 | -0.316* | -0.200 | -0.291 | -0.427** |
| Family stressors | -0.088 | -0.242 | -0.329* | -0.311* | -0.347* | -0.322* |
| No. of friends | -0.185 | -0.257 | -0.259 | -0.156 | -0.260 | -0.260 |
| Frequency of play/week | -0.119 | -0.176 | -0.053 | -0.237 | -0.167 | -0.135 |
| CMAS Lie score | 0.260 | 0.095 | 0.320* | 0.394** | 0.204 | 0.355* |

^a Between girls' CDI and adults' CBCL scores. ^b N = 39. * p < 0.05; ** p < 0.01.

TABLE 7. Mean Difference Scores (Absolute Values) between Mothers', Fathers', and Teachers' CBCL/TRF Internalizing/Externalizing Scales

| | Internalizing: Mother-Father | Internalizing: Mother-Teacher | Internalizing: Father-Teacher | Externalizing: Mother-Father | Externalizing: Mother-Teacher | Externalizing: Father-Teacher |
| --- | --- | --- | --- | --- | --- | --- |
| Boys: Mean | 0.850 | 1.200 | 1.077 | 0.686 | 1.096 | 0.990 |
| Boys: S.D. | 0.658 | 0.876 | 0.965 | 0.636 | 0.720 | 0.840 |
| Boys: N | 51 | 45 | 43 | 51 | 45 | 43 |
| Girls: Mean | 0.726 | 1.007 | 0.944 | 0.573 | 0.745 | 0.857 |
| Girls: S.D. | 0.623 | 0.816 | 0.773 | 0.652 | 0.603 | 0.785 |
| Girls: N | 44 | 37 | 35 | 44 | 37 | 35 |
TABLE 8. Mother-Father, Mother-Teacher, and Father-Teacher Difference CBCL Score Correlations with Socio-demographic Variables for Boys^a

| Variable | Internalizing: Mothers-Fathers | Internalizing: Mothers-Teachers | Internalizing: Fathers-Teachers | Externalizing: Mothers-Fathers | Externalizing: Mothers-Teachers | Externalizing: Fathers-Teachers |
| --- | --- | --- | --- | --- | --- | --- |
| Age | 0.087 | 0.237 | 0.098 | 0.128 | 0.189 | 0.088 |
| Birth order | -0.127 | -0.082 | -0.102 | -0.175 | -0.050 | 0.034 |
| No. of siblings | -0.132 | -0.018 | 0.035 | -0.124 | -0.107 | -0.033 |
| Family status (biologic vs. stepfather) | -0.260 | 0.262 | 0.504*** | -0.341** | 0.139 | 0.416** |
| SES (rank) | -0.082 | -0.085 | -0.055 | 0.012 | 0.050 | 0.089 |
| Mother working | -0.142 | -0.313* | -0.221 | -0.077 | -0.253 | -0.225 |
| Father absence | 0.074 | -0.022 | 0.011 | 0.092 | 0.074 | 0.077 |
| Family stressors | 0.224 | 0.196 | 0.123 | 0.127 | -0.003 | -0.057 |
| No. of friends | -0.073 | -0.299* | -0.219 | 0.009 | -0.282* | -0.275 |
| Frequency of play/week | -0.029 | -0.113 | -0.121 | -0.118 | -0.094 | -0.050 |
| CMAS Lie score | -0.222 | -0.220 | -0.054 | -0.184 | -0.089 | 0.018 |

^a N = 43. * p < 0.05; ** p < 0.01; *** p < 0.001.
boys. Table 9 also demonstrates an earlier finding: earlier birth order in girls is associated with smaller discrepancy scores involving mothers.

Discussion

Before discussion of these findings, a number of cautions seem warranted. First, although it was felt important to use a nonclinical sample in this study, these findings may be limited to that population and may not apply to clinical samples. It is possible that the problems observed in normal children may not be severe enough for significant differences to occur, in terms of the originally stated hypotheses. Because of the large number of correlations, caution is necessary in the interpretation of these findings; however, of
TABLE 9. Mother-Father, Mother-Teacher, and Father-Teacher Difference CBCL Score Correlations with Socio-demographic Variables for Girls^a

| Variable | Internalizing: Mothers-Fathers | Internalizing: Mothers-Teachers | Internalizing: Fathers-Teachers | Externalizing: Mothers-Fathers | Externalizing: Mothers-Teachers | Externalizing: Fathers-Teachers |
| --- | --- | --- | --- | --- | --- | --- |
| Age | -0.258 | -0.099 | 0.131 | -0.244 | -0.254 | 0.027 |
| Birth order | -0.284* | -0.334* | -0.140 | -0.081 | -0.242 | -0.150 |
| No. of siblings | -0.165 | -0.219 | -0.114 | -0.210 | -0.230 | -0.057 |
| Family status (biologic vs. stepfather) | -0.311* | 0.180 | 0.458** | -0.300* | 0.201 | 0.345* |
| SES (rank) | -0.267 | -0.169 | 0.121 | -0.057 | -0.118 | 0.083 |
| Mother working | 0.114 | 0.006 | -0.167 | 0.194 | 0.139 | -0.074 |
| Father absence | 0.186 | 0.245 | 0.074 | 0.138 | 0.076 | -0.058 |
| Family stressors | 0.135 | 0.172 | 0.064 | 0.124 | -0.162 | -0.211 |
| No. of friends | 0.080 | 0.094 | 0.101 | 0.011 | 0.272 | 0.213 |
| Frequency of play/week | 0.094 | -0.040 | -0.120 | -0.013 | -0.078 | -0.087 |
| CMAS Lie score | 0.214 | 0.001 | -0.180 | 0.232 | 0.165 | -0.008 |

^a N = 37. * p < 0.05; ** p < 0.01.
the 396 correlations presented in Tables 3 to 6, 8, and 9, chance alone would have resulted in 20 significant correlations, whereas 50 significant correlations resulted (χ² = 13.1, df = 1, p < 0.001). Findings are most robust when they are consistent with original hypotheses. Statistical corrections for the number of correlations are indicated where there are no a priori hypotheses, but hypotheses were constructed for the majority of the data analyses. Of the original questions, only questions 7 to 9 were exploratory, and no direction of effect was postulated. Readers are advised to use special caution in interpreting the findings for these exploratory questions. Results will be discussed in the order of the originally stated hypotheses.

Effects of Child's Age
Results (Tables 3 to 6) do not support the hypotheses of Kazdin et al. (1983a) that boys or older children are more able to take an independent position: no significant correlations were found between the child's age and the child-adult difference scores. Although it seems possible that these differences could be demonstrated in a psychiatric clinical sample (e.g., Kazdin et al., 1983a) for boys and older children, such findings may actually be a function of the child's psychopathology (i.e., an older boy with a conduct disorder may deny the presence of symptoms obvious to other reporters). This question must await further research with both clinical and nonclinical samples.

Familiarity of the Rated Child
The hypotheses that the frequency and the extent of parental contact with the child would be significantly related to parent-child difference scores received mixed support: except for the mother-teacher CBCL internalizing difference score (Table 8), the extent of mother working was generally not significantly associated with the mother-child difference scores. However, most of these correlations (11 of 15) were in the expected direction. Interestingly, the extent of mother working was associated with father-child difference scores for girls. The extent of father absence showed similar, modest
associations with father-child difference scores: 13 of 16 correlations were in the expected direction. Surprisingly, father absence also accounted for significant associations between mother-child and teacher-child difference scores for both boys and girls. These findings suggest that although father absence may exert effects on the child (resulting in an increase in the child's CDI and CMAS scores), the parent (in this case, the mother) may relatively underestimate these effects or be unaware of them. This could be because the child tends to keep a "stiff upper lip," or because the mother is preoccupied with her own feelings about the father's absence. Conceivably, the child may try to hide these feelings from the mother in order not to burden the family further. Obviously, additional research will be needed to test these post hoc hypotheses.

The finding that stepfather status was associated with increased discrepancies between fathers' ratings and those of all other raters (mothers, teachers, and the children), in contrast to decreased discrepancies for biologic fathers' ratings, is both robust and intriguing. All correlations were in the expected direction, and 15 of 16 correlations were significant. This finding is suggestive of family interaction patterns where the mother and stepfather may be divided in their perceptions of and interactions with the rated child. Whether the child is more accurately described by the stepfather's more objective view or the mother's longitudinal, in-depth view is unclear. However, the consistency of these findings across all raters and settings suggests that stepfathers' perceptions may be skewed. Alternatively, the child may resent the stepfather and emit more disturbing behaviors in his presence.
If this is true, the higher symptom ratings by the stepfathers would reflect the interaction between the stepfather and the child rather than either a stable psychopathologic entity in the child or "projection" by the stepfather. Further understanding of these findings must await future research.

Socioeconomic Status
Hypotheses concerning socioeconomic status (SES) explaining mother-father and parent-child difference scores were not supported by the findings, except for one significant correlation between mothers' CBCL internalizing/girls' CDI difference scores and SES (Table 6). Although this correlation was in the expected direction (as was the correlation of SES with the mother-father difference scores on the CBCL internalizing scale; Table 9), these results are not strong enough to support any conclusions. It is possible that potential SES effects were restricted by the nature of this sample: all families had fathers with a stable income, no-cost medical care, and other supports built into the military system. In another sample with greater variability in SES, effects might have been demonstrated. This may explain the differences between these findings and those of Kazdin et al. (1983a), who found that mothers in families on welfare tended to underestimate their children's reports, while mothers in families not on welfare tended to relatively overestimate their children's scores.
Birth Order and Family Size

The hypotheses that later birth order and increased family size account for increased discrepancies in parent-child difference scores received modest support for mother-daughter difference scores; this relationship held true for difference scores computed between either the CMAS or the CDI and the mother-completed CBCL internalizing and externalizing scores. These relationships did not hold for difference scores computed between boys and their respective adult raters; in fact, significant relationships were found in the reverse direction, an unexpected finding. These significant associations generally held true for difference scores computed for mother-son, father-son, and, to some extent, teacher-boy dyads, and applied to the adult-completed internalizing and externalizing scores and the boy-completed CDI scores. These findings may suggest that with increased family size, boys may be more prone to behavioral problems, but at the same time, they may be less psychologically minded and less able (or less apt) to describe the feelings associated with their behavioral problems. Girls, in contrast, under conditions of larger family size or later birth order may experience increased internalizing symptoms, but adult raters may be unaware of girls' inner feelings. These possible explanations will need further exploration.
Social Desirability Response Sets

Partial support was found for the hypotheses that an increased tendency to respond in a socially desirable fashion on the part of the child would account for increasing parent-child and teacher-child difference scores. Modest correlations were found for girls but none for boys; nonetheless, 21 of 24 correlations were in the hypothesized direction.
Type of Symptom Construct

No support was found for the notion that extraneous factors would more frequently be significantly associated with adult-child internalizing-internalizing difference scores than with externalizing-internalizing difference scores: approximately equal numbers of significant associations with extraneous, environmental factors were found for the two kinds of adult-child discrepancy scores (see Tables 3 to 6). In contrast, environmental factors tended to be more commonly associated with adult-adult internalizing discrepancy scores than with adult-adult externalizing discrepancy scores (Tables 8 and 9). The hypothesis that internalizing scores are less stable and more prone to environmental and other extraneous factor influences may be worth further research, but the lack of any child self-report externalizing scale in our study limited the ability to construct and test internalizing-externalizing parent-child difference scores. It would be of interest in further research to determine whether parent-child and teacher-child dyad discrepancies vary as a function of the nature of the instruments used. (For example, do parent-child dyad discrepancies differ when using one particular child report instrument versus another?) It is possible that two different instruments, both ostensibly measuring the same construct, may differ in what has been termed "environmental stability," i.e., the extent to which scores by raters vary in response to environmental factors. Conceivably, those instruments that show less variation in this regard may be measuring a more stable, less evanescent construct. This question will need further research.
Family Stress Levels

Modest support was found for the hypothesis that family stress levels affect the interrater reliabilities of parent, teacher, and child reports. The finding that stress levels are positively associated with greater mother-son internalizing symptom discrepancy scores suggests either a tendency for the mother to project (a function of the stress effects on her directly) or the son to deny the stressor's impact. In contrast, inverse correlations were found between stress levels and adult-daughter difference scores. This suggests that adults may underestimate the effects of stressors on daughters or simply that under conditions of stress, daughters and parent raters tend to agree about the daughters' symptom and behavior problems.
Child Personality and Interaction Patterns

General support was found for the notion that changes from the child's regular play patterns account for significant associations in parent-child difference scores; thus, children with more friends and/or who play more frequently with friends tend to show smaller discrepancy scores. This finding suggests that it is easier for raters to agree on symptoms and behavioral problems in more active, outgoing children than in children with quieter behavioral styles (55 of 60 correlations were in the expected direction).
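Several results in this Discussion are stated as tallies of correlations falling in the expected direction (11 of 15, 13 of 16, 21 of 24, 55 of 60). The report does not attach a test to these tallies. As an illustration only, a one-sided sign test gives the probability of a tally at least that lopsided if each direction were equally likely. The function name is hypothetical, and the test assumes independent correlations, which does not hold here because the same children and raters contribute to many coefficients; the resulting probabilities therefore overstate the evidence.

```python
from math import comb

def sign_test_p(k, n):
    """One-sided probability of k or more results in the expected direction
    out of n, when each direction is equally likely (p = 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Tallies quoted in the Discussion: (number in expected direction, total).
tallies = {
    "mother working": (11, 15),
    "father absence": (13, 16),
    "social desirability": (21, 24),
    "friends and play": (55, 60),
}

for label, (k, n) in tallies.items():
    print(f"{label}: {k} of {n}, one-sided p = {sign_test_p(k, n):.2g}")
```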
Child's Sex

A priori hypotheses were not constructed about the nature or direction of effects of the child's gender on interrater reliabilities. Rather, differences based on the child's gender were assumed likely. Indeed, the present evidence suggests that while boys and girls may not differ in the total symptoms reported (for example, on the CDI), patterns in their symptom reports differ. In a number of the findings mentioned above, the direction of association was opposite for boys' versus girls' discrepancy scores (e.g., the number of siblings, birth order, and stress levels). Symptom reports tended to vary as a function of sex in their correlation with other raters' reports, possibly suggesting differential meanings of high scores in symptom inventories in boys versus girls. These findings parallel those of Jacobsen et al. (1983), who found differences in the intercorrelations among several depression measures for boys compared with girls.

Although the present results may be criticized for the generally small correlations given the hypotheses tested, the authors wish to point out that such modest correlations are the rule for most studies of interrater reliabilities (except interparent reliabilities, which tend to be moderate [Achenbach and Edelbrock, 1978]). In this context, the current findings that psychosocial and environmental variables show associations that account for systematic differences between raters are very important. Results indicate that in order for further progress to be made in the use of parent- and child-completed symptom checklists, more research to find environmentally stable response items and symptom scales must be conducted.

In summary, the data suggest that familiarity of the target child, stepparent status, family size, stress levels, the child's characteristic play patterns, and social desirability response sets may affect the reliability of reports about boys' and girls' symptoms and behavioral problems. The child's age and family SES do not appear to exert major effects on interrater reliability in the sample studied here. Because this work is the first systematic study of these variables that the authors have been able to find in the literature, the findings presented here should be viewed with caution. Replication and validation of the findings by others is encouraged.
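For readers who wish to replicate the chance-expectation check reported in the cautions that open the Discussion (50 significant correlations observed among 396, against 20 expected by chance, χ² = 13.1), the sketch below shows one calculation that comes close to the reported value. The report does not state how the statistic was computed. The 2 x 2 layout (observed versus chance-expected counts) and the use of Yates' continuity correction are assumptions, and the function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2 x 2 table
    [[a, b], [c, d]] (df = 1)."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

n_tests = 396              # correlations in Tables 3 to 6, 8, and 9
alpha = 0.05
expected = n_tests * alpha  # about 19.8, reported in the text as 20
observed = 50               # significant correlations reported in the text

# Assumed layout: row 1 = observed counts, row 2 = chance-expected counts;
# columns = significant, not significant.
chi2 = yates_chi_square(observed, n_tests - observed, 20, n_tests - 20)
print(f"expected by chance: {expected:.1f}")
print(f"chi-square (Yates, df = 1): {chi2:.2f}")
```

Under these assumptions the statistic is close to, though not identical with, the published 13.1, so the layout above should be read as a plausible reconstruction rather than as the authors' stated procedure.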
Since parents are influenced in their reports not only by their own psychiatric symptoms (Jensen et al., 1988) but also by the contextual factors outlined above, only with careful attention to these variables (too frequently assumed to be "measurement error" in previous research) will we be able to improve the reliability of parent-, child-, and teacher-report scales. In the meantime, the evidence at hand demonstrates the importance of gathering information from multiple sources to obtain as complete and accurate a picture of the child as possible (Orvaschel et al., 1981). These data also underscore the dangers of using self- and parent-report behavior rating scales to diagnose child psychopathology (Shekim et al., 1986).

References

Achenbach, T. M. & Edelbrock, C. S. (1978), The classification of child psychopathology: review and analysis of empirical efforts. Psychol. Bull., 85:1275-1301.
- - - - (1983), Manual for the Child Behavior Checklist and Revised Child Behavior Profile. Burlington, VT.: University Associates in Psychiatry.
Burrows, K. R. & Kelley, C. K. (1983), Parental interrater reliability as a function of situation specificity and familiarity of target child. J. Abnorm. Child Psychol., 11:41-48.
Coddington, R. D. (1972), The significance of life events as etiologic factors in the diseases of children: a study of a normal population. J. Psychosom. Res., 16:205-213.
Derogatis, L. R., Lipman, R. S., Rickels, K., Uhlenhuth, E. H. & Covi, L. (1974), The Hopkins Symptom Checklist (HSCL): a self-report symptom inventory. Behav. Sci., 19:1-15.
Finch, A. J., Jr., Saylor, C. F. & Edwards, G. L. (1985), Children's Depression Inventory: sex and grade norms for normal children. J. Consult. Clin. Psychol., 53:424-425.
Forehand, R., Brody, G. & Smith, K. (1986), Contributions of child behavior and marital dissatisfaction to maternal perceptions of child maladjustment. Behav. Res. Ther., 24:43-48.
Garber, J., Kriss, M. R. & Koch, M. (1986, Oct.), Children of mothers with depression: psychiatric status and general adjustment. Paper presented at the Annual Meeting of the American Academy of Child and Adolescent Psychiatry, Los Angeles, Cal.
Jacobsen, R. H., Lahey, B. J. & Strauss, C. C. (1983), Correlates of depressed mood in normal children. J. Abnorm. Child Psychol., 11:29-40.
Jensen, P. S., Traylor, J., Xenakis, S. N. & Davis, H. (1988), Child psychopathology rating scales and interrater agreement: I. parents' gender and psychiatric symptoms. J. Am. Acad. Child Adolesc. Psychiatry, 27:442-450.
Kashani, J. H., Orvaschel, H., Burk, J. P. & Reid, J. C. (1985), Informant variance: the issue of parent-child disagreement. J. Am. Acad. Child Adolesc. Psychiatry, 24:437-441.
Kazdin, A. E. & Petti, T. A. (1982), Self-report and interview measures of childhood and adolescent depression. J. Child Psychol. Psychiatry, 23:437-457.
- - French, N. H. & Unis, A. S. (1983a), Child, mother, and father evaluation of depression in psychiatric inpatient children. J. Abnorm. Child Psychol., 11:167-180.
- - - - - - Dawson, K. (1983b), Assessment of childhood depression: correspondence of child and parent ratings. J. Am. Acad. Child Psychiatry, 22:157-164.
- - - - - - Rancurello, M. D. (1983c), Child and parent evaluations of depression and aggression in psychiatric inpatient children. J. Abnorm. Child Psychol., 11:401-413.
Kovacs, M. & Beck, A. T. (1977), An empirical-clinical approach toward a definition of childhood depression. In: Depression in Childhood, ed. J. G. Schulterbrandt & A. Raskin. New York: Raven, pp. 1-25.
Ledingham, J. E., Younger, A., Schwartzman, A. & Bergeron, G. (1982), Agreement among teacher, peer, and self-ratings of children's aggression, withdrawal, and likability. J. Abnorm. Child Psychol., 10:363-372.
Lessing, E., Oberlander, M. & Barbera, L. (1974), Convergent validity of the IPAT Children's Personality Questionnaire and teachers' ratings of the adjustment of elementary school children. Social Behavior and Personality, 2:222-229.
Mash, E. J. & Johnston, C. (1983), Parental perceptions of child behavior problems, parenting self-esteem, and mothers' reported stress in younger and older hyperactive and normal children. J. Consult. Clin. Psychol., 54:86-89.
McDermott, J. F., Harrison, S. I., Schroger, J. & Wilson, P. (1965), Social class and mental illness in children: observations of blue-collar families. Am. J. Orthopsychiatry, 35:500-508.
Orvaschel, H., Weissman, M. M., Padian, N. & Lowe, T. L. (1981), Assessing psychopathology in children of psychiatrically disturbed parents. J. Am. Acad. Child Psychiatry, 20:112-122.
Pekarik, E. G., Prinz, R. J., Liebert, D. E., Weintraub, S. & Neale, J. M. (1976), The pupil evaluation inventory: a sociometric technique for assessing children's social behavior. J. Abnorm. Child Psychol., 80:225-243.
Poznanski, E. O., Grossman, J. A., Buchsbaum, Y., Banegas, M., Freeman, L. & Gibbons, R. (1984), Preliminary studies of the reliability and validity of the Children's Depression Rating Scale. J. Am. Acad. Child Psychiatry, 23:191-197.
Reich, W., Herjanic, B., Weiner, Z. & Ghandy, P. R. (1982), Development of a structured psychiatric interview for children: agreement on diagnosis comparing child and parent interviews. J. Abnorm. Child Psychol., 10:325-336.
Reynolds, C. R. & Richmond, B. O. (1978), What I Think and Feel: a revised measure of children's manifest anxiety. J. Abnorm. Child Psychol., 6:271-280.
Reynolds, W. M., Anderson, G. & Bartell, N. (1985), Measuring depression in children: a multimethod assessment investigation. J. Abnorm. Child Psychol., 13:513-526.
Rothbart, M. K. & Maccoby, E. E. (1966), Parents' differential reactions to sons and daughters. J. Pers. Soc. Psychol., 4:237-243.
Saylor, C. F., Finch, A. J., Baskin, C. H., Saylor, C. B., Darnell, G. & Furey, W. (1984), Children's Depression Inventory: investigation of procedures and correlates. J. Am. Acad. Child Psychiatry, 23:626-628.
Semler, I. J. (1960), Relationships among several measures of pupil adjustment. Journal of Educational Psychology, 51:62-64.
Shekim, W. O., Cantwell, D. P., Kashani, J., Beck, N., Martin, J. & Rosenberg, J. (1986), Dimensional and categorical approaches to the diagnosis of attention deficit disorder in children. J. Am. Acad. Child Psychiatry, 25:653-658.
Touliatos, J. & Lindholm, B. W. (1975), Relationships of children's grade in school, sex, and social class to teachers' ratings on the Behavior Problem Checklist. J. Abnorm. Child Psychol., 3:115-126.
- - - - (1981), Congruence of parents' and teachers' ratings of children's behavior problems. J. Abnorm. Child Psychol., 9:347-354.
Werry, J. S. & Quay, H. C. (1971), The prevalence of behavior symptoms in younger elementary school children. Am. J. Orthopsychiatry, 41:136-143.