Archives of Clinical Neuropsychology, Vol. 13, No. 7, pp. 611–616, 1998 Copyright 1998 National Academy of Neuropsychology Printed in the USA. All rights reserved 0887-6177/98 $19.00 + .00
PII S0887-6177(97)00077-2
Validity of the Wonderlic Personnel Test as a Brief Measure of Intelligence in Individuals Referred for Evaluation of Head Injury
Jennifer Saltzman, Esther Strauss, Michael Hunter, and Frank Spellacy
University of Victoria
Some have argued that the Wonderlic Personnel Test (WPT) may represent a brief and efficient measure of intellectual functioning (e.g., Dodrill, 1980). The present study investigated the validity of the WPT as such a measure in individuals with head injury. The findings suggested that, although the WPT showed relatively high agreement with the Wechsler Adult Intelligence Scale-Revised (WAIS-R) in the whole group, it did not have good agreement with WAIS-R scores on an individual case basis. Since clinical practice typically seeks to evaluate individual performance, it is suggested that the WPT is not a suitable tool for psychological assessment of individuals with known or suspected head injury.
1998 National Academy of Neuropsychology. Published by Elsevier Science Ltd
INTRODUCTION

Brief measures of intellectual functioning may represent a shorter and more cost-efficient alternative to standardized intelligence testing (e.g., the Wechsler tests). To this end, some researchers have suggested that the Wonderlic Personnel Test (WPT; Wonderlic, 1992) may be a promising tool in psychological assessment (e.g., Dodrill, 1980). The WPT is a 50-item pencil and paper test based on the original Otis Test of Mental Ability that requires only 12 minutes administration time. Studies attempting to validate the WPT as a measure of intellectual ability in normal, psychiatric, and neurological populations have yielded mixed results (see Table 1 for a summary of the studies).

Comparisons of the WPT to the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1955) revealed correlations ranging from .85 to .93 and statistically nonsignificant differences between mean WPT and mean WAIS scores. The crucial issue in a clinical setting, however, is not how correlated the scores are or even how close the means are; the degree of concordance is critical. The idea of a "hit rate" score (the percentage of WPT scores falling within 10 points of the Wechsler Adult Intelligence Scale Full Scale Intelligence Quotient [WAIS FSIQ]) was introduced by Dodrill (1981) as a means of evaluating the accuracy of the WPT on an individual case basis. A WPT score that was more than 10 points different from the WAIS criterion was "considered to give information that was erroneous and likely to be misleading in clinical and research situations" (p. 670). Based on this criterion, several
Address correspondence to: Jennifer Saltzman, Department of Psychology, University of Victoria, P.O. Box 3050, Victoria, B.C., Canada V8W 3P5. E-mail:
[email protected].
TABLE 1
Studies Comparing Wechsler Scale Performance to Wonderlic Performance

| Study | Wechsler Scale | Population (N) | r | t | Percent Within 10 Points |
|---|---|---|---|---|---|
| Dodrill (1980) | WAIS | 150 patients with seizure disorder | 0.88 | — | 90 |
| Dodrill (1981) | WAIS | 120 healthy controls | 0.93 | ns | 90 |
| Edinger, Shipley, Watkins, & Hammett (1985) | WAIS-R | 100 male psychiatric inpatients | 0.75 | p < .001 | 71 |
| Dodrill & Warner (1988) | WAIS | 34 psychiatric inpatients | 0.85 | ns | 94 |
| | | 43 nonpsychiatric inpatients | 0.87 | ns | 84 |
| | | 34 psychiatric + epilepsy | 0.88 | p < .05 | 81 |
| | | 60 healthy controls | 0.91 | ns | 88 |
| Frisch & Jessop (1989) | WAIS-R (compared to Dodrill's WAIS score minus 8 pts) | 34 psychiatric inpatients | 0.89 | p < .05 | — |
| Hawkins, Faraone, Pepple, Seidman, & Tsuang (1990) | WAIS-R | 18 psychiatric inpatients | 0.92 | p < .05 | 88 |
| Carswell & Snow (1996) | WAIS-R (compared to Dodrill's WAIS score minus 7 pts) | 80 head-injured participants | 0.76 | — | 78 |

Note. WAIS = Wechsler Adult Intelligence Scale; WAIS-R = Wechsler Adult Intelligence Scale-Revised; ns = nonsignificant.
studies found the clinical utility of the WPT to be good, with 81 to 94% of WPT scores falling within 10 points of the gold standard WAIS FSIQ. The availability of published Wonderlic-to-WAIS IQ conversion tables (Dodrill, 1980, 1981) made the WPT even simpler to apply.

The introduction of the Wechsler Adult Intelligence Scale-Revised (WAIS-R; Wechsler, 1981) caused researchers to reexamine the predictive power of the WPT. Understanding that the WAIS-R tends to produce full scale scores that are approximately 8 points lower than the WAIS, several studies continued to utilize Dodrill's conversion tables, subtracting 7 or 8 points to obtain a more accurate WAIS-R estimate (Carswell & Snow, 1996; Frisch & Jessop, 1989; Hawkins, Faraone, Pepple, Seidman, & Tsuang, 1990). Others compared WAIS-R scores to Wonderlic raw scores that had been age-corrected and standardized to a mean of 100 and a standard deviation of 15 (Edinger, Shipley, Watkins, & Hammett, 1985). In all cases it was found that mean WPT and WAIS-R scores were statistically different. The correlations between WPT and WAIS-R scores varied from moderate to high; however, only one study predicted at least 80% of individual WPT scores within 10 points of their WAIS-R counterparts (Hawkins et al., 1990). The reason for the variability is not certain but may relate, at least in part, to the method used to convert Wonderlic to IQ scores and/or the nature of the population studied.

The intent of the present study was to examine the validity of the WPT in a head-injured population, using the WAIS-R as the measurement standard. Predictive validity was assessed on the basis of three measurements: (a) the correlation between WPT and WAIS-R scores; (b) the difference between the mean WPT IQ level and the mean WAIS-R FSIQ; and (c) the percentage of WPT scores that fall within 10 points of their corresponding WAIS-R
scores. Because the method for converting Wonderlic scores to IQ scores has varied across studies, a number of different procedures were used, including Dodrill's (1980, 1981) conversion tables, as well as standardizing Wonderlic raw scores to have a mean of 100 and standard deviation of 15, corresponding to the distribution of Wechsler full scale scores.

METHOD

Subjects

Participants in this study consisted of 129 individuals (73 male, 56 female) referred to the private practice of one of the authors (FS) for neuropsychological assessment arising from known or suspected head injury. The participants ranged in age from 16 to 69 years (M = 33.84 years, SD = 13.01) and in education from 3 to 19 years (M = 11.8 years, SD = 2.43).

Procedure

Each participant was administered the WAIS-R and the WPT (Form A) as part of a larger test battery and assessment for sustained head injury. Wonderlic raw scores were converted to IQ scores according to three separate methods and subsequently each was compared to the obtained WAIS-R FSIQs. In the first method, WPT raw scores were age-adjusted and converted to WAIS FSIQs according to Dodrill's (1981) conversion table based on healthy control subjects. The second analysis was carried out with WPT raw scores that had been age-corrected and converted to WAIS FSIQs according to Dodrill's (1980) conversion table based on patients with seizure disorder. Dodrill (1980) suggested that this table may be more suitable for use with neurologic populations. Finally, WPT raw scores were age-corrected according to the WPT manual (1992) and standardized to a mean of 100 and standard deviation of 15. In each set of analyses WPT IQ scores were compared to WAIS-R FSIQs as a whole and then broken down by age (16–25, 26–39, 40–69 years), education (6–11, 12, 13–18 years), gender, and WAIS-R FSIQ level (64–89, 90–110, 111–145).
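The third conversion method places scores on the same metric as Wechsler full scale scores. A minimal sketch of that kind of rescaling follows, assuming a simple linear (z-score) transformation. The function name `standardize_to_iq` and the reference values `ref_mean` and `ref_sd` are hypothetical: the source does not report which normative mean and standard deviation were used, and the age correction from the WPT manual is not reproduced here.

```python
def standardize_to_iq(raw_scores, ref_mean, ref_sd,
                      target_mean=100.0, target_sd=15.0):
    """Rescale raw scores to an IQ-style metric (mean 100, SD 15).

    ref_mean and ref_sd describe the reference distribution of raw
    scores; they are placeholders supplied by the caller, not values
    taken from the study or the WPT manual.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    return [target_mean + target_sd * (x - ref_mean) / ref_sd
            for x in raw_scores]


# Illustrative numbers only: a raw score one reference SD above the
# reference mean maps to 115, one SD below maps to 85.
print(standardize_to_iq([20, 27, 13], ref_mean=20, ref_sd=7))
# → [100.0, 115.0, 85.0]
```

Because the reference distribution is passed in explicitly, the rescaled scores of a given sample need not have a mean of exactly 100 or a standard deviation of exactly 15.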
In addition to the Wonderlic, a second estimate of FSIQ (Block Design and Vocabulary subtests of the WAIS-R; Sattler, 1988) was calculated for each participant, in order to determine how the WPT fared in its estimation, compared to other brief measures.

RESULTS

The overall mean, standard deviation, and range for each of the three conversions, the short-form, and the WAIS-R FSIQ are presented in Table 2. Regardless of which conversion method was used, WPT mean FSIQs were higher than the WAIS-R FSIQ mean by approximately 7 to 9 IQ points. Because these conversion tables were designed to provide estimates of WAIS FSIQ, whereas this study used WAIS-R FSIQ, WPT IQ scores were further corrected by subtracting 7 or 8 points from each WPT IQ score. A significant difference was also noted between the WAIS-R FSIQ and the short-form estimate. This was adjusted by subtracting 5 points from the short-form scores. In each case, the decision on how many points to subtract was based on the mean difference between the unadjusted scores and the WAIS-R. Table 3 lists the new corrected means, t-scores and Pearson (r) correlations for each Wonderlic conversion method, and the Block Design/Vocabulary estimation, compared with WAIS-R FSIQ, as well as the percentage of scores that fell within 10 points and within 15 points of the WAIS-R FSIQ. The WAIS-R two subtest short-form yielded the best 10-point concordance rate.
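The three agreement measures and the constant correction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: all function names are hypothetical, the t statistic is assumed to be a paired-samples t (the source reports only "t"), "within 10 points" is assumed to include a difference of exactly 10, and the scores are made up. Rounding the mean difference is only one way to arrive at a whole number of points; the source says only that the choice was based on the mean difference.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(estimates, criterion):
    """t statistic for the mean paired difference (assumed test type)."""
    d = [e - c for e, c in zip(estimates, criterion)]
    return mean(d) / (stdev(d) / sqrt(len(d)))


def percent_within(estimates, criterion, tolerance=10):
    """Percentage of estimates within `tolerance` points of the criterion."""
    hits = sum(abs(e - c) <= tolerance for e, c in zip(estimates, criterion))
    return 100.0 * hits / len(estimates)


def mean_difference(estimates, criterion):
    """Mean of the estimates minus mean of the criterion scores."""
    return mean(estimates) - mean(criterion)


def subtract_points(estimates, points):
    """Apply a constant correction to every estimated score."""
    return [e - points for e in estimates]


# Made-up scores for illustration only; not data from the study.
wpt_iq = [105, 118, 92, 130, 99, 110]
wais_r = [98, 104, 90, 112, 101, 95]

points = round(mean_difference(wpt_iq, wais_r))  # analyst's choice
corrected = subtract_points(wpt_iq, points)

print(points)                                       # → 9
print(percent_within(wpt_iq, wais_r))               # → 50.0
print(round(percent_within(corrected, wais_r), 1))  # → 83.3
print(round(pearson_r(wpt_iq, wais_r), 3))
```

Subtracting a constant leaves the correlation unchanged and moves the mean difference toward zero, so it can remove a significant t while leaving individual discrepancies large. That is why the percentage within 10 points is reported separately from r and t.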
TABLE 2
Summary Scores for WAIS-R FSIQ and Wonderlic Predicted FSIQs

| Score | M | SD | Minimum | Maximum |
|---|---|---|---|---|
| WAIS-R | 98.45 | 12.9 | 69.0 | 143.0 |
| Block Design/Vocabulary short form | 103.38 | 13.8 | 68.0 | 141.0 |
| Dodrill (1980) WAIS conversion | 106.76 | 12.4 | 71.0 | 141.0 |
| Dodrill (1981) WAIS conversion | 105.49 | 13.4 | 71.0 | 138.0 |
| Standardized raw WPT scores | 108.14 | 20.5 | 58.0 | 160.0 |

Note. WAIS-R FSIQ = Wechsler Adult Intelligence Scale-Revised Full Scale Intelligence Quotient; M = mean; SD = standard deviation; WPT = Wonderlic Personnel Test.
TABLE 3
Further Corrections to WPT Predicted FSIQ Scores, and Block Design/Vocabulary Short Form, Compared to WAIS-R FSIQ

| Score | M (SD) | r | t | Percent Within 10 Points | Percent Within 15 Points |
|---|---|---|---|---|---|
| WAIS-R | 98.45 (12.9) | — | — | — | — |
| Block Design/Vocabulary short form minus 5 points | 98.38 (13.8) | 0.887 | ns | 88 | 99 |
| Dodrill (1980) WAIS conversion minus 7 points | 99.76 (12.4) | 0.727 | ns | 77 | 91 |
| Dodrill (1981) WAIS conversion minus 7 points | 98.49 (13.4) | 0.727 | ns | 75 | 92 |
| Standardized raw WPT scores minus 8 points | 100.14 (20.5) | 0.731 | ns | 54 | 74 |

Note. WPT = Wonderlic Personnel Test; FSIQ = Full Scale Intelligence Quotient; WAIS-R FSIQ = Wechsler Adult Intelligence Scale-Revised Full Scale Intelligence Quotient; M = mean; SD = standard deviation.

DISCUSSION

Consistent with previous findings, the Wonderlic Personnel Test correlated relatively highly with the WAIS-R, although the current findings were slightly lower than some others have reported. Minor adjustments to the scores produced by the original Wonderlic-to-WAIS conversion tables resulted in mean WPT scores that were comparable to mean WAIS-R scores when considering the group as a whole. Such correction also eliminated significant differences between the WAIS-R and the short-form. However, the WPT did not agree well with WAIS-R scores on an individual case basis, as evidenced by the finding that less than 80% of WPT scores fell within 10 points of their "true" WAIS-R FSIQ. It is worth noting, however, that 92% of WPT scores fell within 15 points of the true score when Dodrill's (1981) conversion table was used. Given that accurate estimates of IQ are requisite on a case-by-case basis for clinical practice, it is of some concern whether the WPT should be considered an effective replacement for traditional standardized tests, at least with a head-injured population. Other brief measures of intelligence, such as the two subtest short-form or the Kaufman
Brief Intelligence Test (K-BIT), may be more effective in estimating FSIQ in individuals with known or suspected head injury. For example, in our study, the Block Design/Vocabulary short-form provided reasonable estimates (83% within 10 points; 94% within 15 points) and takes about 15 minutes to administer. While a certain portion of the correlation between the WAIS-R and Block Design/Vocabulary scales was due to the fact that the predictor is actually part of the criterion, it is worth noting that this comparison is in keeping with previous work (e.g., Eisenstein & Engelhart, 1997; Sattler, 1988). More importantly, it is necessary to compare this short-form to the full WAIS-R since this is the gold standard for comparison. To do otherwise (i.e., removing these subtests from the IQ computation) would result in a different criterion for comparing the ability of the Wonderlic and the WAIS-R short-form.

Similar findings emerge with other measures. In their study of a heterogeneous group of patients referred for neuropsychological assessment, Naugle, Chelune, and Tucker (1993) found that performance on the K-BIT was also highly correlated with WAIS-R FSIQ, and that 95% of the differences between the K-BIT Composite scores and the FSIQ scores were 15 points or less. Eisenstein and Engelhart (1997) recently reported that in a sample of clinic referrals, Wechsler four-subtest short forms did better than the K-BIT in predicting WAIS-R FSIQ. Difference scores between the K-BIT and FSIQ were 16 points or less for 95% of the cases. Difference scores for the four-subtest short forms were 12 points or less for about 95% of the cases. Of course, one drawback to these other measures (K-BIT and WAIS-R) is that they require a professional for administration, while the WPT does not.

The power of the WPT to predict in individual cases remains uncertain, with various studies reporting agreement ranging from 71 to 88% of WPT scores falling within 10 points of WAIS-R scores.
In explaining this discrepancy some consideration should be given to the type of participants featured in each study. In the present study, as well as the recent study by Carswell and Snow (1996), it appears as though the WPT may be particularly unsuitable for use with a head-injured population. This may relate to the timed nature of the WPT. Investigations of people with head injuries have suggested that both reaction time and processing speed are often compromised (Lezak, 1995). Indeed, in our population, there was a modest correlation between WPT scores and performance on measures of speeded processing (Symbol Digit, r = .34; PASAT, r = .26). Accordingly, an untimed version of the WPT may provide a more promising tool.
CONCLUSION

Overall, the results of the present study suggest that the Wonderlic Personnel Test is limited in its ability to accurately estimate FSIQs. This does not preclude previous findings that have shown the WPT to be an effective measure in psychiatric conditions, but rather suggests that there may be some factor(s), such as timed administration, which may impact the performance of head-injured individuals on this test. Although the WPT shows good agreement with the WAIS-R for the entire group, its modest agreement with WAIS-R FSIQs on a case-by-case basis suggests that clinicians should carefully consider the degree of accuracy with which they hope to estimate FSIQ when selecting a brief measure of intelligence. In its ability to predict within 15 points of the true score, the WPT performs as well as other brief measures such as the K-BIT and Block Design/Vocabulary short form; however, if a more stringent estimation is desired, it appears as though the Block Design/Vocabulary short form may be a more suitable measure for individuals with possible head injuries. In light of its predictive power in large groups, the WPT may also prove to be an adequate screening measure for research purposes.
REFERENCES

Carswell, L., & Snow, G. (1996). An examination of potential alternative measures of WAIS-R FSIQ. Poster presented at the Convention of the American Psychological Association, Toronto, Canada, August, 1996.
Dodrill, C. B. (1980). Rapid evaluation of intelligence in adults with epilepsy. Epilepsia, 21, 359–367.
Dodrill, C. B. (1981). An economical method for the evaluation of general intelligence in adults. Journal of Consulting and Clinical Psychology, 49, 668–673.
Dodrill, C. B., & Warner, M. H. (1988). Further studies of the Wonderlic Personnel Test as a brief measure of intelligence. Journal of Consulting and Clinical Psychology, 56, 145–147.
Edinger, J. D., Shipley, R. H., Watkins, C. E., & Hammett, E. B. (1985). Validity of the Wonderlic Personnel Test as a brief IQ measure in psychiatric patients. Journal of Consulting and Clinical Psychology, 53, 937–939.
Eisenstein, N., & Engelhart, C. I. (1997). Comparison of the K-BIT with short-forms of the WAIS-R in a neuropsychological population. Psychological Assessment, 9, 57–62.
Frisch, M. B., & Jessop, N. S. (1989). Improving WAIS-R estimates with the Shipley-Hartford and Wonderlic Personnel Tests: Need to control for reading ability. Psychological Reports, 65, 923–928.
Hawkins, K. A., Faraone, S. V., Pepple, J. R., Seidman, L. J., & Tsuang, M. T. (1990). WAIS-R validation of the Wonderlic Personnel Test as a brief intelligence measure in a psychiatric sample. Psychological Assessment, 2, 198–201.
Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Naugle, R. I., Chelune, G. J., & Tucker, G. D. (1993). Validity of the Kaufman Brief Intelligence Test. Psychological Assessment, 5, 182–188.
Sattler, J. (1988). Assessment of children (3rd ed.). San Diego: Sattler.
Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. New York: Psychological Corporation.
Wechsler, D. (1981). Manual for the Wechsler Adult Intelligence Scale-Revised. New York: Psychological Corporation.
Wonderlic, E. F. (1992). Manual of the Wonderlic Personnel Test & Scholastic Level Exam II: Wonderlic Personnel Test, Inc.