The Utility of the Warrington Recognition Memory Test for Temporal Lobe Epilepsy: Pre- and Postoperative Results
1-4Bruce P. Hermann, 5Bryan Connell, 6William B. Barr, and 7Allen R. Wyler
We wished to examine the ability of the Warrington Recognition Memory Test (RMT) to distinguish epilepsy of left versus right temporal lobe origin during preoperative testing and to assess pre- to postoperative changes in memory function after anterior temporal lobectomy (ATL). Seventy-seven patients were assessed preoperatively and 6-8 months postoperatively. Patients' performance for verbal (Words) and nonverbal (Faces) material was examined with raw, scaled, and discrepancy scores, as well as with more formal diagnostic efficiency statistics and receiver operating characteristic (ROC) curves. Preoperatively, no aspect of the RMT could reliably distinguish left from right temporal lobe epilepsy groups. Examination of pre- to postoperative memory change showed declines in Word recognition memory after left ATL and less consistent declines in Face recognition memory after right ATL. Diagnostic efficiency statistics demonstrated poor classification ability preoperatively and improved classification ability postoperatively. We conclude that the RMT is not insensitive to lateralized temporal lobe lesions (as demonstrated by the pre- to postoperative performance changes), but it is of extremely limited clinical utility in identifying laterality of temporal lobe seizure onset preoperatively. The reasons for this interesting pattern of results are discussed.
Key Words: Warrington Recognition Memory Test; Neuropsychology; Memory; Anterior temporal lobectomy; Epilepsy; Temporal lobe epilepsy.
Received January 30, 1995; accepted February 7, 1995.
From the 1Epi-Care Center, Baptist Memorial Hospital, 2Departments of Psychiatry and 3Neurosurgery, University of Tennessee, Memphis, 4Semmes-Murphey Clinic, Memphis, TN, 5Carolinas Epilepsy Center, Charlotte Mecklenberg Hospital, Charlotte, NC, 6Departments of Neurology & Psychiatry, Long Island Jewish Medical Center, The Long Island Campus of the Albert Einstein College of Medicine, Long Island, NY, and 7Epilepsy Center, Swedish Medical Center, Seattle, WA, U.S.A.
Address correspondence and reprint requests to Dr Bruce Hermann at Epi-Care Center, 930 Madison Avenue, Memphis, TN 38103, U.S.A.
J. Epilepsy 1995;8:139-145
© 1995 by Elsevier Science Inc., 655 Avenue of the Americas, New York, NY 10010
0896-6974/95/$9.50 SSDI 0896-6974(95)00022-6
One purpose of the preoperative neuropsychological evaluation for epilepsy surgery candidates is to provide some information regarding the lateralization and localization of the underlying epileptogenic region (1). Because the most common surgical procedure is anterior temporal lobectomy (ATL), tests that are sensitive to lateralized focal temporal lobe dysfunction are particularly important. The Warrington Recognition Memory Test (RMT) is a test of recognition memory for verbal (visually presented words) and nonverbal (unfamiliar faces) material that in a large validation study showed a double dissociation for patients with lateralized temporal lobe lesions (primarily neoplasms and infarctions). Patients with left temporal lobe lesions exhibited significantly poorer performance for words than faces, whereas patients with right temporal lobe lesions showed significantly poorer performance for faces than words (2). Despite this impressive performance, the lateralizing ability of the RMT needs to be investigated in a sample of patients with epilepsy. Delaney (3) has pointed out that tests validated on populations of patients with gross cerebral lesions are often less sensitive to the neuropathology associated with focal epilepsy.
The purpose of this investigation was to examine the ability of the RMT to discriminate patients with complex partial seizures (CPS) of left or right temporal lobe origin before surgery and to examine changes in material-specific memory performance after ATL. This was done with raw, scaled, and discrepancy scores obtained from the RMT using conventional analyses of group data, followed by more critical examination of the ability of the RMT to classify individuals with lateralized temporal lobe epilepsy (TLE) accurately, using formal diagnostic efficiency statistics and receiver operating characteristic (ROC) curves.
Method
Subjects
In all, 77 consecutive patients aged ≥16 years with intractable epilepsy of unilateral temporal lobe origin were the study subjects. Patients were referred for surgical evaluation because of medication-resistant epilepsy; they therefore underwent extensive evaluation. The preoperative evaluation included continuous (24-h) closed-circuit TV/EEG monitoring with scalp electrodes to record several of the patients' typical spontaneous seizures for classification of seizure type and preliminary localization of seizure onset. All patients also underwent invasive EEG monitoring with subdural strip electrodes to provide precise localization of the epileptogenic lesion (4). All determinations as to the localization and lateralization of the ictal onset were made independently by the electroencephalographer, blinded to the results of the neuropsychological testing. Surgical decisions were made on the basis of ictal (primarily invasive) EEG findings.
Patients underwent intracarotid amobarbital testing to determine cerebral dominance for speech (5). Individuals with anomalous organization of language function (e.g., bilateral speech, right hemisphere speech) were excluded from the subject pool. Patients underwent magnetic resonance imaging (MRI), and individuals with underlying structural lesions (e.g., tumor, infarction, encephalomalacia, arteriovenous malformation) were excluded from consideration. Mesial temporal sclerosis (MTS) was not considered a structural lesion. An additional inclusion criterion was a WAIS-R Full Scale I.Q. ≥70. In summary, the final sample consisted of 77 consecutive nonretarded, left hemisphere dominant patients with nonlesional, intractable epilepsy of unilateral temporal lobe onset. Invasive EEG monitoring showed that 48 patients had ictal onset from the left (dominant) temporal lobe and 29 patients had ictal onset from the right (nondominant) temporal lobe. Table 1 shows demographic and seizure-related characteristics of the sample. There were no significant differences between the left and right temporal lobe groups in the variables examined.
Procedure and Data Analyses
All patients were administered the complete RMT in the standardized fashion (2) as part of their comprehensive neuropsychological evaluation. Patients were assessed when they were undergoing scalp electrode monitoring while treated with reduced amounts of anticonvulsant medication. The RMT consists of the presentation of 50 printed words at the rate of one word every 3 s, and for each word the subject is required to judge the presented stimulus as "pleasant" or "unpleasant" to help ensure that they are attending to the stimulus items. The patient is then presented with a series of word pairs, and the task is to identify which of the two words came from the target list.
Table 1. Patient characteristics

Parameter                  Left temporal (n = 48)   Right temporal (n = 29)
Age (yrs)                  30.6 (8.5)               34.2 (9.1)
Gender                     23 F, 25 M               16 F, 13 M
Age at onset (yrs)         11.1 (11.3)              12.5 (11.4)
WAIS-R Verbal I.Q.         86.4 (10.7)              90.6 (12.4)
WAIS-R Performance I.Q.    88.1 (12.6)              87.9 (13.3)
WAIS-R Full Scale I.Q.     86.0 (10.9)              88.8 (11.9)

Presented values are mean (SD).
A series of 50 faces is then presented at the same rate, and the patient is asked to provide the same pleasant versus unpleasant judgments; the patient is then presented with a series of 50 pairs of faces, and the task is again to identify which of the two faces came from the target list. For the Words and Faces subtests, the raw scores can range from 0 to 50, and raw scores are converted to age-corrected scaled scores (mean = 10, SD = 3). In addition, if the raw score for Words is lower than the raw score for Faces, a "Word discrepancy" is said to exist. Conversely, when the raw score for Faces is below that for Words, a "Face discrepancy" exists. A table provided in the manual specifies when the magnitude of the Face or Word discrepancy scores exceeds various probability levels, including p < 0.05, which was the magnitude of difference examined here. Finally, we also examined the simple difference between the raw score for Faces minus the raw score for Words as another possible lateralizing index. Negative scores represented relatively poorer Face performance; positive scores represented relatively poorer Word performance.
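The sign convention of the simple difference score can be illustrated with a short sketch. The function names are hypothetical and introduced only for illustration; the probability-level table from the RMT manual is not reproduced in this report, so the sketch returns only the direction of a discrepancy and does not attempt to judge its significance.

```python
def face_minus_word(words_raw: int, faces_raw: int) -> int:
    """Simple difference score: raw Faces minus raw Words (hypothetical helper)."""
    for score in (words_raw, faces_raw):
        if not 0 <= score <= 50:
            raise ValueError("RMT raw scores range from 0 to 50")
    return faces_raw - words_raw


def discrepancy_direction(difference: int) -> str:
    """Name the direction of the discrepancy implied by the difference score."""
    if difference < 0:
        return "Face discrepancy"   # Faces below Words: relatively poorer Face performance
    if difference > 0:
        return "Word discrepancy"   # Words below Faces: relatively poorer Word performance
    return "no discrepancy"


# Illustrative scores only, not patient data.
diff = face_minus_word(words_raw=41, faces_raw=36)
print(diff, discrepancy_direction(diff))
```

Whether a given discrepancy reaches the p < 0.05 level would still have to be read from the manual's table.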
Results
Analyses of Group Data
Table 2 shows the mean pre- and postoperative raw scores, age-corrected scaled scores, and mean raw discrepancy scores (raw Face minus raw Word score). Each dependent measure was analyzed by a 2 x 2 (groups by time of assessment) mixed analysis of variance (ANOVA). Significant interaction effects were obtained for raw score Words (F = 3.8, p = 0.05), raw score Faces (F = 4.9, p = 0.03), scaled scores for Faces (F = 6.9, p = 0.011), and the raw Face minus Word discrepancy score (F = 11.7, p = 0.001), but not scaled scores for Words (F = 1.8, p = 0.18). For all significant interaction effects, there were no preoperative differences between the left and right TLE groups on either Words or Faces (all p-values >0.10). The significant interaction effects were driven by pre- to postoperative changes in performance as a function of laterality of resection and type of memory material. After left ATL, there was a significant decline in Words raw (p = 0.001) and a significant change in the raw Face minus Word discrepancy score, with a greater discrepancy favoring Faces (p < 0.001). After right ATL, there was a significant decline on the Face scaled score (p = 0.006) but not on Face raw (p = 0.06) or the Face/Word discrepancy score (p > 0.10).
Although there were no preoperative differences between the left and right TLE groups on any memory index, there were significant postoperative differences between the left and right ATL groups on all indices [Words raw (p = 0.004), Faces raw (p = 0.01), Faces scaled (p = 0.003), and Face/Word discrepancy (p < 0.001)]. The left ATL group performed worse on Words, and the right ATL group performed worse on Faces. Figures 1 and 2 show a representative pattern of findings using the raw scores for Words and Faces. The lack of preoperative group differences on the memory indices, the pre- to postoperative changes as a function of laterality of resection and type of memory, and the resultant postoperative differences between the left and right ATL groups are apparent.
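For a design with two groups and two assessment times, the group-by-time interaction F from a mixed ANOVA is equal to the squared t statistic from a pooled-variance two-sample t test on the change scores (postoperative minus preoperative). The sketch below illustrates that equivalence only. The function name and the scores are hypothetical; they are not the study data, and nothing here is meant to represent the software actually used for the reported analyses.

```python
from statistics import mean, variance


def interaction_f(pre_a, post_a, pre_b, post_b):
    """Group x time interaction F for a 2 x 2 mixed design.

    Computed as t**2 from a pooled-variance t test on change scores,
    which is equivalent to the interaction F with df = (1, n_a + n_b - 2).
    """
    change_a = [post - pre for pre, post in zip(pre_a, post_a)]
    change_b = [post - pre for pre, post in zip(pre_b, post_b)]
    n_a, n_b = len(change_a), len(change_b)
    df_error = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(change_a)
                  + (n_b - 1) * variance(change_b)) / df_error
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t = (mean(change_a) - mean(change_b)) / standard_error
    return t * t, (1, df_error)


# Hypothetical raw Words scores for two small groups (not patient data).
left_pre, left_post = [40, 42, 38, 41], [36, 40, 32, 37]
right_pre, right_post = [41, 39, 43, 40], [41, 41, 41, 40]

f_value, df = interaction_f(left_pre, left_post, right_pre, right_post)
print(f"F{df} = {f_value:.2f}")
```

A significant interaction in this sense says that the two groups changed by different amounts from pre- to postoperative testing, which is the pattern the group analyses describe.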
Table 2. Mean pre- and postoperative RMT performance

                     Left temporal                      Right temporal
Measure              Preoperatively   Postoperatively   Preoperatively   Postoperatively
Words (raw)          40.5 (5.5)       36.7 (6.2)        41.4 (7.0)       41.1 (6.7)
Faces (raw)          38.9 (5.3)       39.6 (5.8)        37.9 (6.8)       36.1 (5.5)
Words (scaled)a      6.7 (3.2)        5.2 (2.4)         7.5 (3.9)        7.2 (3.9)
Faces (scaled)a      7.4 (3.1)        7.9 (3.8)         7.2 (3.1)        5.6 (2.7)
Discrepancyb         -1.6 (6.2)       2.9 (7.4)         -3.5 (5.9)       -5.0 (8.3)

RMT, Warrington Recognition Memory Test. Presented values are mean (SD).
aScaled scores are corrected for chronologic age (2).
bDiscrepancy score: Raw Face score minus raw Word score.

Figure 1. Mean pre- and postoperative raw scores for RMT Words. Lightly shaded box, left-ATL; darkly shaded box, right-ATL. (x-axis: Time of Assessment.)

Figure 2. Mean pre- and postoperative raw scores for RMT Faces. Lightly shaded box, left-ATL; darkly shaded box, right-ATL. (x-axis: Time of Assessment.)

Diagnostic Efficiency Statistics
The analyses just described relied on group data, but the clinical diagnostic efficiency of the RMT is also of interest. The test may be used in the preoperative assessment of epilepsy surgery candidates for lateralization and localization purposes, and the information regarding the ability of the RMT to lateralize patients correctly is pertinent. Word discrepancy or Face discrepancy scores that reached the p < 0.05 level are shown in Table 3, which provides a gross overview of the general yield of significant discrepancy indices and their ability to identify the laterality of lesion. The yield of significant discrepancy scores was quite modest at the time of preoperative assessment, and the ability of the RMT to lateralize correctly the temporal lobe from which the seizures emanated was poor. Of note, the yield of significant discrepancy scores doubled postoperatively and there was a marked improvement in the classification rate.

Table 3. Yield and accuracy of RMT Word and Face discrepancy scores significant at the p < 0.05 level

                                               Time of assessment
Parameter                                      Preoperative (%)   Postoperative (%)
Significant Word or Face discrepancy score?    22                 44
Correct lateralization                         59                 85
Incorrect lateralization                       41                 15

A more appropriate examination of the diagnostic utility of the RMT is provided by formal diagnostic efficiency statistics (6) (Table 4). The following trends were evident. First, at preoperative assessment, the RMT showed poor ability to identify laterality of temporal lobe seizure onset. Overall correct classification rates did not exceed 0.60, sensitivity was very low (0.15 and 0.10), and false-negative rates were very high (0.85 and 0.90). For Face discrepancy scores, the false-positive rate matched the sensitivity rate. The positive predictive power for the Word discrepancy index was high. Second, at the time of postoperative assessment, there was a considerable improvement in most of the diagnostic efficiency statistics. The sensitivity rates were essentially tripled, false-positive rates remained essentially unchanged but were now considerably below the sensitivity rates, positive predictive power was considerably improved for both indices and, although the false-negative rates remained high, they were considerably lower than the preoperative figures. The pattern of postoperative findings suggests that the RMT is not insensitive to lateralized temporal lobe lesions, but preoperatively the RMT appears to be relatively insensitive to the neuropathology underlying the lateralized TLE.

Table 4. Diagnostic efficiency statistics for Word Discrepancy Index identifying left TLE and Face Discrepancy Index identifying right TLE

                                  Preoperative        Postoperative
Parameter                         LTLE     RTLE       LTLE     RTLE
Overall correct classification    0.44     0.60       0.61     0.70
Sensitivity                       0.15     0.10       0.42     0.31
Specificity                       0.93     0.90       0.93     0.94
Positive predictive power         0.78     0.38       0.91     0.75
Negative predictive power         0.40     0.62       0.49     0.69
False-positive rate               0.07     0.10       0.07     0.06
False-negative rate               0.85     0.90       0.58     0.69

LTLE and RTLE, left and right temporal lobe epilepsy.
Note: Definitions below from Kessel and Zimmerman (6): Sensitivity: the true positive rate, proportion of patients with the target laterality correctly identified; specificity: the true negative rate, proportion of patients not having the target laterality correctly identified; positive predictive value: probability that the patient has the target laterality given that the test identifies him/her as a case; negative predictive value: probability that the patient does not have the target laterality given that the test does not identify him/her as a case.
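The diagnostic efficiency statistics defined above all follow from a 2 x 2 classification table. The sketch below shows the arithmetic. The function name is hypothetical, and the cell counts are an assumption: the report gives rates only, so the counts used here were back-calculated as one set that is consistent with the group sizes (48 left, 29 right) and with the rounded preoperative Word discrepancy values for left TLE. They should not be read as the authors' actual cell counts.

```python
def diagnostic_efficiency(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic efficiency statistics from a 2 x 2 classification table.

    tp: target laterality, test positive    fn: target laterality, test negative
    fp: other laterality, test positive     tn: other laterality, test negative
    """
    total = tp + fp + fn + tn
    return {
        "overall_correct": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# ASSUMED counts, back-calculated from the reported rates (not given in the paper):
# 7 of 48 left TLE patients and 2 of 29 right TLE patients with a significant
# Word discrepancy at preoperative testing.
stats = diagnostic_efficiency(tp=7, fp=2, fn=41, tn=27)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The example makes the dependence on base rates visible: with so few positive test results, positive predictive power can be high even though sensitivity is very low.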
ROC Curves
The analyses described were based on the cutoff points suggested in the RMT manual. An additional series of analyses was performed with data obtained from ROC curves. Like diagnostic efficiency statistics, ROC curves provide a means to evaluate the accuracy of diagnostic tests (7). The ROC analyses were conducted in a manner similar to those described by Monsch et al. (8). Every empirically obtained score from the Face and Word tests was treated as a separate cutoff score. Similar procedures were conducted for a raw Faces minus raw Words discrepancy score. The cumulative number of patients from the left and right TLE groups who obtained scores at or below these cutoffs was determined by examination of frequency distributions. Measures of sensitivity and specificity for identifying lateralized patients were calculated from these values. Sensitivity and specificity values ranging from 0 to 1 were then plotted graphically to obtain an ROC curve for each test. The area under the curve represents the degree of classification accuracy, with greater area representing greater accuracy. Examples of ROC curves for the preoperative and postoperative difference scores are shown in Fig. 3. Again the RMT scores performed very poorly preoperatively and were improved postoperatively, as represented in the larger area under the postoperative curve.

Figure 3. Receiver operating characteristic (ROC) curves for preoperative (solid line) and postoperative (broken line) RMT difference score (raw Faces minus raw Words). (Axes: Sensitivity versus Specificity.)

One of the major benefits of the use of ROC curves is to obtain optimal cutting scores for various tests. Whereas the diagnostic efficiency analyses were performed on cutoff values provided by the RMT manual, ROC enables one to compute optimal cutting scores based on the data from the current sample. When determining an optimal cutoff score, the investigator is faced with an arbitrary judgment regarding whether to emphasize the sensitivity or specificity of the measure or whether to treat these two characteristics equally. For this study, we decided a priori that cutting scores characterized by a combination of high specificity and maximal sensitivity would be most appropriate for discriminating between patients with left versus right TLE. To achieve this goal, we used a criterion of >80% specificity combined with maximal sensitivity to determine the optimal cutting scores.
Preoperative and postoperative cutoff scores derived from the use of ROC curves are shown in Table 5. Preoperatively, the largest sensitivity value (0.31) was obtained with a raw Faces score ≤34, with little change evident postoperatively. Overall, Words scores were relatively insensitive to the side of TLE onset. The optimal preoperative difference score (-9) yielded a high specificity (0.88) but a relatively low sensitivity (0.10). The sensitivity of this measure improved substantially with the use of postoperative scores, with a difference score lower than -5 yielding a sensitivity >50%.
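The ROC procedure described in this section can be sketched as follows: every observed score serves as a cutoff, a score at or below the cutoff counts as a positive test, and the preferred cutoff is the one with maximal sensitivity among those with specificity above 0.80. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, the scores are invented rather than patient data, and the area is computed with a simple trapezoidal rule.

```python
def roc_points(target_scores, other_scores):
    """Sensitivity and specificity with every observed score used as a cutoff.

    A score at or below the cutoff is treated as a positive test result.
    """
    points = []
    for cutoff in sorted(set(target_scores) | set(other_scores)):
        sensitivity = sum(s <= cutoff for s in target_scores) / len(target_scores)
        specificity = sum(s > cutoff for s in other_scores) / len(other_scores)
        points.append((cutoff, sensitivity, specificity))
    return points


def area_under_curve(points):
    """Trapezoidal area under sensitivity plotted against 1 - specificity."""
    xy = sorted([(0.0, 0.0), (1.0, 1.0)]
                + [(1 - spec, sens) for _, sens, spec in points])
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(xy, xy[1:]))


def optimal_cutoff(points, min_specificity=0.80):
    """Cutoff with maximal sensitivity among those exceeding min_specificity."""
    eligible = [p for p in points if p[2] > min_specificity]
    return max(eligible, key=lambda p: p[1]) if eligible else None


# Hypothetical raw Faces scores (not patient data): the target group is the
# one expected to score lower on the measure.
right_tle = [30, 32, 34, 36, 38, 40]
left_tle = [35, 37, 39, 41, 43, 45]

points = roc_points(right_tle, left_tle)
cutoff, sensitivity, specificity = optimal_cutoff(points)
print(f"AUC = {area_under_curve(points):.2f}")
print(f"cutoff <= {cutoff}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```

Because the cutoff is chosen on the same sample that is then classified, values obtained this way are likely to be optimistic relative to a new sample.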
Table 5. ROC statistics for the RMT

Index                 Raw score cutoff   Sensitivity   Specificity
Preoperative data
  Faces (raw)         34                 0.31          0.83
  Words (raw)         34                 0.14          0.85
  Discrepancya        -9                 0.10          0.88
Postoperative data
  Faces (raw)         33                 0.31          0.81
  Words (raw)         31                 0.07          0.83
  Discrepancya        -5                 0.52          0.81

ROC, receiver operating characteristic; RMT, Warrington Recognition Memory Test.
aDiscrepancy score: Raw Face score minus raw Word score.

Discussion
Every analysis showed that preoperative RMT performance was a poor discriminator of the laterality of temporal lobe EEG onset in this sample of surgical candidates. Analyses of mean raw, scaled, and discrepancy scores of various types failed to show significant differences between the left and right temporal lobe groups on the verbal (Words) or visual (Faces) material. Examination of the diagnostic efficiency indices was consistent with this trend, and use of ROC curves did not appreciably improve the diagnostic yield. Considerable caution therefore appears warranted in the use of this test when one attempts to identify the laterality of temporal lobe seizure onset in populations comparable to that which we studied.
The overall pattern of preoperative results suggests that the RMT is insensitive to lateralized temporal lobe dysfunction. However, the pre- to postoperative findings suggest that this is not the case. Postoperatively, the analyses of group data indicated a decline in Word recognition memory after left ATL on raw scores and raw Face minus Word discrepancy scores, with a more modest decline in Face recognition memory after right ATL on scaled scores only. The net effect of these changes is that a significant postoperative difference becomes apparent between the left and right ATL groups on the memory indices. The left ATL group performed worse on the raw Word and Face minus Word discrepancy scores, and the right ATL group performed worse on the Face indices. Examination of the diagnostic efficiency statistics also showed improved overall classification, sensitivity, lower false-positive rates, and higher positive predictive power following surgery. Pre- to postoperative ROC discrepancy score findings were also consistent with this pattern of findings.
It is of interest that the RMT was validated on a population of patients presenting primarily with structural lesions, largely neoplasms and infarctions (2). Our postoperative data are generally consistent with the results of the validation study of Warrington (2) in that after ATL the RMT appeared sensitive to lateralized temporal lobe lesions. In contrast, the preoperative data indicated that the RMT was insensitive to the laterality of focal epileptogenic tissue, given the absence of a gross space-occupying or destructive structural lesion. Delaney (3) previously pointed out that tests validated on lesional populations will generally be considerably less sensitive to focal epileptogenic lesions, and his point was certainly demonstrated in the present study.
Naugle et al. (9) conducted the only other investigation that examined the RMT among epilepsy surgery patients. Their study was different from ours in several ways (e.g., ATL patients had to be completely seizure-free for inclusion in the study, a nonsurgical epilepsy control group was included, different analytic procedures were used), yet the two studies have several similarities. The most important similarity may be that Naugle et al. (9) found the RMT to be of limited clinical utility preoperatively. The classification of patients as having left or right TLE did not differ from chance preoperatively and was considerably improved after ATL, although sensitivity remained modest [Tables 4 and 5 of Naugle et al. (9)].
References
1. Jones-Gotman M, Smith M, Zatorre RJ. Neuropsychological testing for localizing and lateralizing the epileptogenic region. In: Engel J, ed. Surgical treatment of the epilepsies, 2nd ed. New York: Raven Press, 1993:245-61.
2. Warrington EK. Recognition memory test. Berkshire: NFER-Nelson, 1984.
3. Delaney RC. Screening for organicity: the problem of subtle neuropsychological deficit and diagnosis. J Clin Psychol 1982;38:843-6.
4. Wyler AR, Ojemann G, Lettich E, Ward AA Jr. Subdural strip electrodes for localizing epileptogenic foci. J Neurosurg 1984;60:1195-200.
5. Blume WT, Grabow JD, Darley FL, et al. Intracarotid amobarbital test of language and memory before temporal lobectomy for seizure control. Neurology 1973;23:812-9.
6. Kessel JB, Zimmerman M. Reporting errors in studies of the diagnostic performance of self-administered questionnaires: extent of the problem, recommendations for standardized presentation of results, and implications for the peer review process. Psychol Assess 1993;5:395-9.
7. Swets JA. Measuring the accuracy of diagnostic systems. Science 1988;240:1285-93.
8. Monsch AU, Bondi MW, Butters N, Salmon DP, Katzman R, Thal LJ. Comparisons of verbal fluency tasks in the detection of dementia of the Alzheimer's type. Arch Neurol 1992;49:1253-8.
9. Naugle RI, Chelune GJ, Schuster J, Lüders H, Comair Y. Recognition memory for words and faces before and after temporal lobectomy. Assessment 1994;1:373-81.