Proverbs and the Modern Mental Status Exam
James H. Reich
Proverb interpretation is an interesting and often enjoyable process that has found a place both in standardized tests and in the mental status exam. One area of standardized testing where proverbs are firmly established, with adequate reliability and validity, is intelligence testing, as in the tests developed by Wechsler.¹ Proverbs have also been used in standardized tests to measure psychological variables. Gorham was at the forefront of developing proverb interpretation into a standardized psychological test. In schizophrenics he reported test-retest reliabilities ranging from 0.67 to 0.96,² and in normals he reported correlations between different forms of his proverbs test ranging from 0.71 to 0.81.² Gorham found the test highly successful in differentiating schizophrenics from normals.³ He also reviewed the large body of work on standardized proverb interpretation in psychological testing.⁴ Gorham achieved both reliability and validity with his particular testing population and stimulated interest in the field.

In mental status exams, proverbs are seldom given in a standardized fashion and often without the use of scoring criteria. This raises the question of how useful proverbs are in such a situation. Andreason addressed this very point.⁵ She examined schizophrenics, depressives, and manics, using five scales to score proverbs and multiple raters. In her table of interrater reliability for 24 raters in proverb assessment, no correlation was above 0.55 and most were significantly lower. Her kappa values for reliability of proverb measurement never reached 0.6 in any category. However, when the mean values of all raters were used, depressives were significantly distinguished from both schizophrenics and manics in all five categories of measurement, and manics and schizophrenics were significantly differentiated in three of the five categories.
She stated in her discussion,⁵ “Thus at best, proverb interpretation may have relatively good validity but poor reliability . . . at worst, therefore, the validity of using proverbs in a clinical situation is somewhat questionable” (p. 471). She concluded that the widespread use of proverbs in mental status exams should be discontinued.

These findings of validity in standardized situations and possible validity in clinical situations, with no reliability in clinical situations, seem irreconcilable with a statement of Spitzer and Fleiss:⁶ “There is no guarantee that a reliable system is valid, but assuredly an unreliable system must be invalid” (p. 341). Either validity is truly not present in clinical situations or reliability can be achieved. The goals of this article are to: (1) examine whether proverb interpretation can achieve sufficient reliability to be of use in clinical practice; (2) examine the validity of proverbs in clinical practice; and (3) if reliability can be achieved, describe the steps necessary to achieve it.

From the University of California, Davis, Calif. James H. Reich, M.D.: Resident in Psychiatry, University of California, Davis, Calif. Address reprint requests to James H. Reich, M.D., 2380 Sierra Blvd. #80, Sacramento, Calif. 95825. © 1981 by Grune & Stratton, Inc. 0010-440X/81/2205-0011$01.00/0
Comprehensive Psychiatry, Vol. 22, No. 5 (September/October), 1981

MATERIALS AND METHODS
Subjects were 21 schizophrenic, 14 manic-depressive, and 22 control patients. The schizophrenic and manic patients were admitted to the University of California at Davis inpatient psychiatric ward during an acute psychotic episode. Six-month histories were available from outpatient psychiatry on all patients. Average length of hospital stay was four to five days. Diagnosis was by Diagnostic and Statistical Manual of Mental Disorders (DSM) III criteria. The schizophrenic group consisted of two patients with schizophrenic disorder, disorganized type (295.10), two with schizophrenic disorder, catatonic type (295.0), and 17 with schizophrenic disorder, paranoid type (295.30). For statistical purposes all schizophrenic diagnoses were combined into one group. All those in the manic group met the criteria for bipolar disorder, manic (296.44). All patients were started on major antipsychotic drugs immediately on admission, and those with a well-documented affective disorder were often started on lithium on admission as well. The control group was drawn from hospitalized patients on medical and surgical units whom the nursing staff felt had no major emotional problems.

Mean ages in years for the three groups were: schizophrenic, 31.0 (SD = 6.8); manic, 35.4 (SD = 12.5); and control, 32.5 (SD = 10.9). Schizophrenics were 61% male and 29% black, manics were 43% male and 14% black, and controls were 47% male and 19% black. The mean educational levels were: schizophrenic, 11.8 (SD = 2.4); manic, 13.4 (SD = 2.0); and control, 12.2 (SD = 2.4). Analysis of variance showed no significant difference among groups in age, sex, race, or educational level. Four Benjamin proverbs were given as instructed by Gorham² for his standardized test.
The proverbs given were: (1) “Don’t cross your bridges till you come to them”; (2) “Strike while the iron is hot”; (3) “New brooms sweep clean”; and (4) “Shallow brooks are noisy.” These particular proverbs were chosen to be culturally appropriate to our patient populations. The proverbs were always given in the same order to all subjects. Psychiatric patients were given the test as soon as they were judged able to cooperate. Scorers were two psychiatry residents who spent time learning the specific scoring criteria. One of the scorers was blind to diagnosis.

Gorham’s test manual² gives two categories for scoring the proverbs: abstraction, a scale from 0 to 2 with 2 being most abstract, and idiosyncrasy, a scale from 0 to 2 with 2 being most idiosyncratic. In Gorham’s examples of scoring these proverbs, only six combinations of these two scores appear: abstract = 2, idiosyncratic = 0; abstract = 1, idiosyncratic = 0; abstract = 1, idiosyncratic = 1; abstract = 0, idiosyncratic = 0; abstract = 0, idiosyncratic = 1; and abstract = 0, idiosyncratic = 2. For ease of scoring, since only six combinations occurred, each combination was given a numerical score as follows: 5 points = abstract 2, idiosyncratic 0; 4 points = abstract 1, idiosyncratic 0; 3 points = abstract 1, idiosyncratic 1; 2 points = abstract 0, idiosyncratic 0; 1 point = abstract 0, idiosyncratic 1; 0 points = abstract 0, idiosyncratic 2. The two scales were combined to increase the power of the numerical rating to differentiate groups (Gorham has occasionally combined his scales in a similar fashion).⁴ This gives a possible range of scores of 0 to 5 for each proverb and 0 to 20 for all four proverbs.
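The combined scoring rule described above can be written out as a small lookup table. The following Python sketch is illustrative only: the names `COMPOSITE_POINTS`, `composite_score`, and `total_score` are hypothetical, and the example ratings are invented for demonstration rather than taken from the study.

```python
# Points assigned to each (abstraction, idiosyncrasy) pair, as listed in the text.
# Only these six combinations appear in Gorham's scoring examples.
COMPOSITE_POINTS = {
    (2, 0): 5,
    (1, 0): 4,
    (1, 1): 3,
    (0, 0): 2,
    (0, 1): 1,
    (0, 2): 0,
}


def composite_score(abstraction, idiosyncrasy):
    """Return the 0-5 composite score for one proverb response."""
    key = (abstraction, idiosyncrasy)
    if key not in COMPOSITE_POINTS:
        raise ValueError(f"combination {key} is not one of the six scored pairs")
    return COMPOSITE_POINTS[key]


def total_score(ratings):
    """Sum the composite scores for the four proverbs (range 0-20)."""
    if len(ratings) != 4:
        raise ValueError("expected ratings for exactly four proverbs")
    return sum(composite_score(a, i) for a, i in ratings)


# Hypothetical ratings for one subject: (abstraction, idiosyncrasy) per proverb.
example_ratings = [(2, 0), (1, 0), (0, 1), (0, 0)]
print(total_score(example_ratings))  # 5 + 4 + 1 + 2 = 12
```

A dictionary keyed on the pair makes explicit that combinations outside the six listed ones are treated as unscorable here; how such responses would be handled in practice is not stated in the text.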
RESULTS
To test reliability, the proverb test scores of the two raters were correlated for each proverb. None of the Pearson correlation coefficients was above 0.40. However, when the scores from the four proverbs were summed and the sums then correlated for the two raters, the Pearson correlation coefficient equalled 0.82, a finding significant at the 0.001 level. To determine kappa values, the proverb scores were divided into three groups: low scores (0-5), middle scores (6-10), and high scores (11-20). These groups were then used to calculate kappa values by the method of Spitzer and Fleiss.⁶ Kappa is a statistic for measuring agreement on categories, such as diagnostic categories, which incorporates a correction for chance. The results are shown in Table 1. As can be seen, all three groups have kappa values above 0.6 (the minimum needed for a reliable finding).

Table 1. Kappa Values for Proverb Scores

                            Low     Middle   High
Overall Rater Agreement     87%     78%      87%
Expected by Chance          54%     40%      68%
Kappa                       0.72    0.63     0.65

The mean scores on the proverb test were: schizophrenic, 5.9 (SD = 3.2); manic, 6.9 (SD = 2.3); and control, 12.5 (SD = 2.9). An analysis of variance disclosed significant group differences (F = 46.6, p < .001). Duncan’s test indicated that the tendency for manics to score higher on proverbs than schizophrenics was not significant, while both groups scored significantly lower than controls.

DISCUSSION
Although this study, like Andreason’s, had poor success in correlating individual proverbs between raters, the correlation coefficient of 0.82 for the four proverbs summed represents a high degree of reliability. In addition, when the scores were grouped into low, middle, and high ranges, all three had kappa values indicative of good reliability. Since four proverbs is not an unreasonable number to give during a mental status exam, it appears that high reliability for proverbs can be achieved. In this study proverbs were easily able to distinguish between hospitalized psychiatry patients (manics and schizophrenics) and hospitalized nonpsychiatry patients, but not between manics and schizophrenics. Andreason’s procedure using the mean scores of all raters could distinguish manics and schizophrenics from depressives. It also separated schizophrenics from manics in three out of five categories of proverb interpretation. The present study does note a nonsignificant trend separating the two groups (schizophrenics 5.9, manics 6.9), and it did not use the category that most strongly differentiated the two groups in Andreason’s study (personalization). It appears that proverbs have good validity in differentiating manics and schizophrenics from depressives or hospitalized controls.
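The relation between the agreement percentages and the kappa values in Table 1 can be illustrated with the usual chance-corrected form, kappa = (observed - expected) / (1 - expected). The sketch below assumes that this simple form is what underlies the table; the function names `score_band` and `kappa` are hypothetical, and the exact per-category procedure of Spitzer and Fleiss may differ in detail.

```python
def score_band(total):
    """Assign a summed proverb score (0-20) to the low, middle, or high group."""
    if not 0 <= total <= 20:
        raise ValueError("summed score must lie between 0 and 20")
    if total <= 5:
        return "low"
    if total <= 10:
        return "middle"
    return "high"


def kappa(observed, expected):
    """Chance-corrected agreement: (observed - expected) / (1 - expected)."""
    if expected >= 1.0:
        raise ValueError("expected agreement must be below 1")
    return (observed - expected) / (1.0 - expected)


# (observed agreement, agreement expected by chance) as printed in Table 1.
TABLE_1 = {
    "low": (0.87, 0.54),
    "middle": (0.78, 0.40),
    "high": (0.87, 0.68),
}

for band, (observed, expected) in TABLE_1.items():
    print(f"{band}: kappa = {kappa(observed, expected):.2f}")
```

With the percentages as printed, the low and middle groups reproduce the tabled values of 0.72 and 0.63. The high group comes out near 0.59 rather than the tabled 0.65, which may reflect rounding of the printed percentages or a difference in the exact method, so the sketch should be read as illustrative rather than as a reconstruction of the original calculation.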
Although there are some interesting trends in the data, the ability of proverbs to separate manics from schizophrenics during the routine mental status exam (one examiner, four or fewer proverbs) seems questionable at best. It may, however, be a fruitful area for future research.

A major question is what procedures this study used to achieve reliability that Andreason’s did not. There appear to be several differences: (1) the raters took pains to learn the scoring methods used; (2) the raters gave the proverbs in the standardized fashion prescribed by the manual; and (3) the scores of the four proverbs used were summed rather than examined individually.

In conclusion, it appears that if intelligence and cultural variables are controlled for, at least four proverbs are given, and scoring is done by standardized methods, proverbs can achieve both high reliability and validity in the mental status exam. There are even some indications that the use of proverbs to distinguish manics from schizophrenics could be a fruitful area of research.

ACKNOWLEDGMENT
The assistance of Michel Wahba, Ph.D., Patric Donlon, M.D., T. Morrison, Ph.D., and Richard Strassman, M.D., is gratefully acknowledged.
REFERENCES
1. Wechsler D: Measurement and Appraisal of Adult Intelligence (ed 3). New York, Brunner/Mazel, 1972
2. Gorham DR: Clinical Manual for the Proverbs Test. Missoula, Montana, Psychological Test Specialists, 1956
3. Gorham DR: Use of the proverbs test for differentiating schizophrenics from normals. J Consult Psychol 20:435-440, 1956
4. Gorham DR: Verbal abstractions in psychiatric illness: Assay in impairment utilizing proverbs. J Ment Sci 107:52-59, 1961
5. Andreason N: Reliability of proverbs: Interpretation to assess mental status. Compr Psychiatry 18:465-473, 1977
6. Spitzer RL, Fleiss JL: A re-analysis of the reliability of psychiatric diagnosis. Br J Psychiatry 125:341-347, 1974