DSM-III: A Step Forward or Back in Terms of the Classification of Child Psychiatric Disorders?
Michael Rutter, M.D., F.R.C.P., F.R.C.Psych. and David Shaffer, M.B., B.S., M.R.C.P., M.R.C.Psych.
Abstract. This paper appraises DSM-III as a classification of child psychiatric disorders. Its
successes include the use of a phenomenological approach, its recognition that disorders rather than individuals should be classified, the introduction of a multiaxial framework, the provision of a more comprehensive listing of child psychiatric disorders, improved diagnostic criteria, the addition of codings for psychosocial stressors, and the recognition that disorders may persist into adult life. Criticisms of DSM-III as a source of discriminatory labeling are rejected. However, criticisms are made of the structure of the multiaxial system, the decision not to put mental retardation on a separate axis, the principles employed on the psychosocial axes, the proliferation of unvalidated diagnostic categories, and the extension of research diagnostic criteria to categories which lack the empirical findings which might justify them. Nevertheless, DSM-III constitutes a marked improvement over DSM-II and represents a landmark in the development of psychiatric classification systems. Journal of the American Academy of Child Psychiatry, 19:371-394, 1980.
Dr. Rutter is Professor, Department of Child and Adolescent Psychiatry, Institute of Psychiatry, London. Dr. Shaffer is Clinical Professor of Psychiatry and Pediatrics, Columbia University, New York. Reprints may be requested from Dr. Michael Rutter, Department of Child and Adolescent Psychiatry, Institute of Psychiatry, De Crespigny Park, Denmark Hill, London, SE5 8AF, England. This paper was prepared while Dr. Rutter was at the Center for Advanced Study in the Behavioral Sciences. He is grateful for financial support provided by the Grant Foundation, the Foundation for Child Development, the Spencer Foundation, and the National Science Foundation (BNS 7824671). David Shaffer is supported in part by NIMH-Psychiatry Education Branch Grant MHO 7715-17.
0002-7138/80/1903-0371 $01.93 © 1980 American Academy of Child Psychiatry.
The emergence of DSM-III¹ constitutes something of a landmark in the history of systems of psychiatric classification.
¹ The third edition of the Diagnostic and Statistical Manual of Mental Disorders of the American Psychiatric Association.
In the first place, its aims have been high. The goals, as set out in the introduction to the manual, include both a commitment to reliability and validity as determined from research findings and a concern that the scheme should be useful and acceptable to clinicians, suitable
for research, and well designed for educational purposes. These high aims are most welcome, and in this paper we will discuss how far they are met. In the second place, it was subjected to a series of field trials in order to identify problem areas and to try out solutions to these problems. Thus, it is important to consider the scientific value of these trials, their appropriateness with respect to the issues on which there is the greatest nosological uncertainty or disagreement, and the use made of the research findings in the final form of the classification. Third, the scheme differs from its most immediate American predecessor, DSM-II, both in its attempt to utilize a multiaxial framework and in its incorporation of codings for psychosocial stressors and for level of adaptive functioning. We need to discuss the purposes of this change and the extent to which it has been successful in what it set out to do. Fourth, perhaps the most novel and ambitious feature of all has been the attempt to use precise operational criteria for all diagnostic categories. It is in this respect that it differs most markedly from the World Health Organization equivalent, ICD-9.² Accordingly, it is necessary not only to assess the success of the attempt to use operational criteria but also to consider the merits and demerits of DSM-III in comparison with the ICD-9. Finally, even before its publication, DSM-III has been subjected to a variety of critiques in terms of whether it will be good for patients (Salzinger, 1977), for science (Zubin, 1977), for psychologists (Schacht and Nathan, 1977), and for children (Garmezy, 1978). Hence, we need to ask how far these criticisms have been justified and how far misplaced.
THE SUCCESSES OF DSM-III
Perhaps the most fundamental success of DSM-III is the acceptance of the fact that classification as a means of ordering information and of grouping phenomena is not only basic to all forms of scientific enquiry but also is essential as a code for communication between clinicians. It provides a kind of language by which people can describe the disorders they investigate and treat, and for this purpose there has to be uniformity in the usage of terms. Several consequences flow from this position if an adequate classification scheme is to be achieved (Rutter, 1965, 1977a; Rutter et al., 1975).
² The ninth edition of the International Classification of Diseases.
Thus, it has been necessary to take a descriptive or phenomenological approach which is not tied to any particular theory (an obvious essential when clinicians are hopelessly divided on which theory to follow). Moreover, a great deal of work has gone into the development of operational criteria and of a firmly structured set of rules for diagnosis. Clearly, the aim has been to ensure as far as possible that everyone will use the scheme in the same way to mean the same things. While we have criticisms of some of the details of these criteria and rules, the principle is surely right and Spitzer and his colleagues deserve congratulation for their very considerable accomplishments in this area. Second, it is good to see an explicit recognition that it is disorders and not individuals that are being classified (Rutter 1965, 1977a). This is important as an indication that individuals have many different characteristics and that it is both inaccurate and demeaning to assume that children are no more than vehicles for psychopathology. The classification of disorders rather than people is also crucial in terms of its reflection of the fact that children change and develop. This step does much to diminish the concerns over the possible misuse of classification as a means of labeling people. Third, the introduction of a multiaxial framework is a step forward in terms of its recognition that clinical diagnosis necessarily involves several different elements which do not constitute alternatives to one another. Thus, it is appropriate to expect the clinician to choose between the diagnosis of manic-depressive psychosis and infantile autism, whereas there is no sense in asking for a choice between mental retardation and cerebral palsy. 
The point is that the former distinction represents a choice between two mutually exclusive ways of categorizing a single behavioral syndrome,³ whereas the latter concerns a choice between two entirely different frames of reference (one describes level of intellectual functioning, and the other refers to a neurological disorder involving motor abnormalities). The solution provided by a multiaxial framework is to put these different areas of discourse onto separate and independent axes, each of which must be coded in all cases.
³ Excluding for the moment the possibility that the individual happens to have two separate and independent psychiatric conditions simultaneously.
The advantages of a multiaxial system include not only the elimination of
these artificial and misleading false choices but also the possibility of much more systematic and unambiguous information. Thus, in the usual multicategory system (e.g., DSM-II), if a clinician recorded just mental retardation, but not cerebral palsy, one would have no means of knowing whether the child did or did not have cerebral palsy. The diagnosis might have been omitted because the child did not have cerebral palsy, because he did have cerebral palsy but the clinician did not think it important in relation to the clinical problem, or because cerebral palsy was of no interest to the psychiatrist. In contrast, in a multiaxial system, the physical disorder axis cannot be left blank. Either a definite diagnosis must be made, or there must be a coding to indicate that the child is free of all physical disorders. The issues involved in multiaxial systems are more fully discussed by Tarjan et al. (1972) and Rutter et al. (1975). It is an approach which is by no means free of difficulties, and it has problems of its own; but there can be little doubt that its introduction constitutes an advance. Fourth, DSM-III provides a much more comprehensive listing of child psychiatric disorders than that available in DSM-II. We have specific criticisms concerning both the number of categories included and the choice of particular syndromes; but it is certainly good to have made a systematic attempt to provide an exhaustive classification of psychiatric disorders as they occur in children, rather than to have to rely on a few categories added to a scheme basically designed for use with adults. Fifth, there can be no doubt that the diagnostic criteria provided for the better researched syndromes (such as schizophrenia or infantile autism) are a marked improvement on what was available hitherto. 
They are a credit to the thorough and scholarly approach taken by Spitzer and his colleagues, and unquestionably they constitute an important aid to improved communication and to greater comparability with respect to these conditions. Sixth, we welcome the attempt to provide a means of coding psychosocial stressors, although the particular system chosen seems unfortunate. There is growing evidence of the importance of psychosocial factors in the etiology, precipitation, or prolongation of many psychiatric disorders, and it is high time that serious efforts were made to provide for their accurate classification. Seventh, we are glad to see that DSM-III recognizes that disorders typical of childhood may persist into adult life and that provision is needed for diagnosis of these residual states. The
persistence of infantile autism into adulthood constitutes a good example of this circumstance; and it is right both that there be a coding for these adult disorders, which represent the continuation of childhood conditions, and also that that coding be linked with the diagnosis as it is made in infancy. The DSM-III criteria for diagnosis of these syndromes are rather vague and imprecise, but this may reflect our present state of knowledge. The important thing is that a start has been made. So much for the successes of DSM-III. The listing by no means exhausts the many real accomplishments present in this new classification scheme, but it may serve to provide pointers to some of the more important areas of progress. We now need to turn to the points of criticism. In this connection, it should be made clear at the outset that the criticisms are being made in relation to the criteria which must be met for any really adequate system of classification (Rutter, 1977a) and also in relation to the goals specifically set by Spitzer et al. for DSM-III. It is not suggested that any other scheme meets all the criteria, for it is only too obvious that none does. However, as the final sentence of the introduction to the DSM-III manual states: "DSM-III is only one still frame in the ongoing process of attempting to better understand mental disorders." Our comments are made in that spirit, in the hope of drawing attention to issues and areas requiring attention if future classifications are to build on what has been achieved so far.
MENTAL DISORDERS, MEDICAL CONTROL, AND LABELING
Before outlining our own criticism of DSM-III, it is appropriate to pay brief attention to the points made in previous reviews, because many of these dealt with general issues regarding classification. Previous critics have taken grave exception to DSM-III's assumption that syndromes included in the classification are "mental disorders" (Garmezy, 1978) and to the further assumption that these constitute "medical conditions" (Schacht and Nathan, 1977; Zubin, 1977). It is not really clear what the scientific issues at stake here are, in that none of the terms has any precise meaning. To us, the term "mental disorder" seems rather neutral. The categories all refer to one or another aspect of behavior or cognition, and it would seem reasonable to suppose that these involve the mind in some way and hence that the adjective "mental" be applied. The term "medical condition" is somewhat more problematic, but it
could be assumed that it means no more than that the whole process of psychiatric classification started from the need to find a means of dealing with hospital and clinic statistics and that because psychiatrists are medical men the disorders referred to them could be termed "medical" in some sense. Of course, it is perfectly true that many of these same disorders are rightly dealt with by psychologists, social workers, and other nonmedical professionals. It would be most unfortunate, to put it mildly, if the terms used implied that these professionals should not have responsibility in this area. However, most psychiatrists, psychologists, and social workers work in interdisciplinary settings, and it would be equally disastrous if there had to be separate and incompatible classification schemes to be used by each of the different professions. So we need to ask why psychologists have got so hot under the collar about the terminology and what are the real areas of dispute in this connection?
Mental Disorder
Schacht and Nathan (1977) base their objections on Spitzer and Endicott's (1978) definition of mental and medical disorder in terms of "inferred or identified organismic dysfunction." As they point out, there are considerable conceptual difficulties in this whole notion. Moreover, as a criterion it is seriously inadequate in view of the need with many DSM-III categories to invoke highly speculative inferences about "organismic dysfunction." We share their concerns, but DSM-III utilizes a rather different concept of "mental disorder." This includes just three key criteria: distress, impairment, and "behavioral, psychological or biological dysfunction." The last is rather vague, but in itself it seems unobjectionable, and it is a good deal wider than "organismic dysfunction." More particularly, it certainly does not seem to imply an organic disease state, and it includes no requirement of a "medical model" (whatever that may mean). There seems to be little to be gained by continuing such semantic squabbles. There must be other issues which have caused DSM-III to be greeted with such dismay by many distinguished psychologists. Three rather separate points seem to be involved. First, Garmezy (1978) scoffs at the inclusion of such poorly validated entities as oppositional disorder, identity disorder, and avoidant disorder of childhood on the grounds that they can scarcely be regarded as mental disorders. We do not share his faith that "mental
disorder" is susceptible to precise definition, but certainly we very much agree with his concern about the inclusion of diagnostic entities which lack validating criteria (see below).
Medical Control
The second point present in all the critiques refers to the view that many conditions are included, not because they are scientifically justified, but rather because they are needed to justify third-party payment and to ensure that medical men remain in control. We are not in any position to judge how far these considerations played a part, but we would be adamant that it would have been totally improper for them to have played any role at all. If the use of the term "medical disorder" is meant to imply that medical men should have prime responsibility for the treatment of such conditions, then the implication is not only unjustified but also seriously mischievous in its likely effect on services. The caution that the use of the manual for nonclinical purposes "must be critically examined in each instance within the appropriate institutional framework" is appropriate but it does not go far enough. As one of us has argued (Rutter, 1977a), "it is a mistake to equate classification with administrative action. It is never justifiable to assume that mentally retarded children need to be in an institution for the retarded ... or that children with psychiatric disorder require psychiatric treatment.... Services [should be] tailored to individual needs rather than slotting individuals into pigeon-holes which offer a diagnosis cum treatment package" (p. 377).
Labeling
The third objection raised by Garmezy (1978) is in terms of the labeling effect of having such a wide definition of "mental disorder." He asks, "Are the many learning disabled children of the nation to be similarly tagged and numbered?" And quoting the example of the dropping of a Vice-Presidential nominee because he had had an affective disorder, he asks whether nominations would now be similarly denied to persons who had had a reading deficit in childhood on the grounds that it was a mental disorder. Surely this argument is unacceptable on the multiple grounds that it suggests that the classification labels people, whereas in fact it describes disorders (see above); that it implies that there is something shameful about having a mental disorder and hence that we should
be very conservative about the use of the term; and that the logic of the plea requires the premise that some diagnoses are in themselves grounds for discriminatory or political action. We reject all three implicit assumptions. Of course, Garmezy (1978) does not in fact approve of any such discriminatory behavior (as earlier portions of his paper make clear). Moreover, he is undoubtedly correct that classifications not only can be misused but are in fact sometimes employed to justify reprehensible actions which demean and derogate both individuals and whole groups of people. We share his unease about the possible misuses of classification (DSM-III or any other) and we profoundly deplore the suggestion that DSM-III categories should be used to justify any administrative, political, or discriminatory actions whatsoever. But does this mean that we reject all classifications on the grounds that they can be abused? Of course not, and Garmezy (1978) himself regards psychiatric classifications as worthwhile and necessary. It is also significant that he is forthright in asserting that affective conditions are mental disorders, in spite of the fact that on occasion that label has been used as the grounds for discrimination. The only logical position possible on the basis of his argument is that if prejudicial discrimination has to occur, it is better that it be based on scientific grounds. This seems highly dubious, to say the least. Would anti-Semitism be any better if it were confined to a pure and narrow designation of who is Jewish, and would McCarthyism have been any more acceptable if only it had confined its target to "real" left-wingers? Surely not! Indeed, discrimination which is apparently supported by a narrowly defined, soundly based classificatory system of proven validity may actually be more damaging than discrimination carried out on an overinclusive, quasi-random basis (in which all could see the ridiculousness of it).
No, we do not accept that discrimination is more acceptable if done on a scientific basis, but we do join Garmezy in attacking any misuse of classification. The dangers are real, and we should be both alert and active in dealing with these abuses when they occur, and alive to the need for actions to prevent them. But we part company with these other critics in terms of the suggestion that this problem is one that is specific to DSM-III. Rather, DSM-III needs to be justified or condemned in terms of its value or lack of value as a scientific classification devised for clinical purposes (i.e., for
use with the disorders shown by individuals seeking help from professionals working in clinic or hospital settings). How does it measure up against that criterion? This question is best answered by considering a number of different aspects of DSM-III in turn.
DSM-III AND ICD-9
The first question with respect to DSM-III as a classification system is why it was necessary for the American Psychiatric Association to have its own private system of classification instead of using the scheme which will be used throughout most of the rest of the world, and which was developed by the World Health Organization on the basis of a 10-year program of research and through working groups which always included representatives from the United States. It seems curious for DSM-III to start its introduction with an explicit statement about the need for "a common language" in psychiatry, and then to go on to argue that the United States must speak a different language from the rest of the world! Three reasons are given for this decision: (a) the concern that ICD-9 "would not be suitable for use in the United States" (no indications are given why it should be suitable elsewhere but not in the United States); (b) many areas "did not seem sufficiently detailed" (the pros and cons on the proliferation of categories are described below); and (c) the glossary of ICD-9 failed to make use of specified diagnostic criteria and of multiaxial approaches (both points are considered below). While we will argue about the validity of these arguments, it must be said straight away that DSM-III has been designed to make it as comparable with ICD-9 as possible. In practice, with a few exceptions, it should not be too difficult to equate the two systems so that direct comparisons can be made. We should add that, while some of our criticisms which follow apply specifically to DSM-III, many deal with problems which are similarly present in ICD-9.
MULTIAXIAL SYSTEMS OF CLASSIFICATION
DSM-III provides a very poor account of both the rationale and the implementation of a multiaxial system. In particular, no explanation is given of the difference in the principles which apply to a multiaxial system compared with a multicategory scheme, and there is neither an emphasis on the need to make a coding on all
axes in all cases nor an account of why this should be required. It is true that there is a rather obscure provision by means of the V71.07 and V71.08 codings for noting that there is no diagnosable condition on axes I and II, but the layout makes it rather unlikely that these will be used in the ways required for a multiaxial system to work properly. Moreover, it seems curious to have to search for codings on an entirely different axis in order to make a "no disorder" coding. However, we also need to consider the choice of axes, and here the difficulties multiply immediately.
Mental Retardation
The most obvious feature with respect to child psychiatric disorders is the decision to include mental retardation on the first axis. The decision seems curious in that the studies undertaken in preparation for ICD-9 indicated that the greatest need for a multiaxial system was precisely in those cases in which a disorder of emotions or behavior coexisted with intellectual retardation (Rutter et al., 1969, 1975). Moreover, logically the decision on a child's intellectual level is of an entirely different kind from those involved in assessing whether he is suffering from a depressive condition or drug dependency. It seems inevitable that the statistics on mental retardation as it occurs in children with other psychiatric disorders will remain difficult, if not impossible, to interpret. An important opportunity greatly to improve psychiatric statistics has been lost, without there having been apparent advantages in doing things the way they have been done. The explanation put forward in the report of the child psychiatry field studies (Russell et al., 1979) is that the diagnosis of mental retardation involves an assessment of adaptive functioning as well as of intellectual level, and hence that it should be included with other mental disorders. The argument is false on several grounds. First, as it is, DSM-III already includes two axes for mental disorders, so that the observation that mental retardation is a mental disorder is irrelevant; it could have constituted a third axis. Second, the claim that the involvement of adaptive functioning requires that mental retardation should be on axis I is negated by the fact that personality disorders are placed on axis II (the whole concept of personality disorder involves the presence of lifelong abnormalities in adaptive functioning). Third, it is not true that mental illness and mental retardation concern similar disorders. While it may be that psychiatrists are mainly interested in the
behavioral abnormalities associated with mental retardation, nevertheless the essential distinction between illness and retardation is that the former is primarily defined in terms of abnormal type of mental functioning, whereas the latter is primarily defined in terms of abnormal level of mental functioning. Fourth, and most important of all, there is the empirical observation from both the ICD and the DSM-III studies that "if multiple diagnoses are required within a single axis, they will probably be recorded with less reliability" (Russell et al., 1979, p. 1224). The DSM-III field study findings on three cases in which a mental disorder and mental retardation coexisted showed that there was an agreed diagnosis of mental retardation in an average of 85% of instances. This result was used to argue that the problem associated with multiple diagnoses on one axis may have been exaggerated. However, in this connection attention was paid to the wrong statistic. The central problem with two diagnoses on the same axis is that one tends to be chosen at the expense of the other. This was evident in the DSM-III trials which showed only 38% agreement on both diagnoses in these same three cases (Cantwell et al., 1979a). We are forced to conclude that the inclusion of mental retardation on axis I was both based on false logic and also likely to be damaging in practice.
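The contrast between the two statistics can be illustrated with a minimal sketch. It assumes, purely for illustration, that agreement is counted as the share of raters who record the expected diagnoses; how the field trials actually computed agreement is not stated here. The ten codings are hypothetical and are not data from the DSM-III field trials, and the function name `share_recording` is introduced only for this example.

```python
from typing import FrozenSet, List


def share_recording(ratings: List[FrozenSet[str]], required: FrozenSet[str]) -> float:
    """Fraction of raters whose recorded diagnoses include every code in `required`."""
    if not ratings:
        raise ValueError("at least one rating is needed")
    hits = sum(1 for recorded in ratings if required <= recorded)
    return hits / len(ratings)


# Hypothetical codings by ten raters of one case (NOT field-trial data).
MR = "mental retardation"
CD = "conduct disorder"
ratings = (
    [frozenset({MR, CD})] * 4   # raters who recorded both diagnoses
    + [frozenset({MR})] * 4     # raters who recorded retardation only
    + [frozenset({CD})] * 2     # raters who recorded the behavioral disorder only
)

single = share_recording(ratings, frozenset({MR}))      # retardation recorded at all
joint = share_recording(ratings, frozenset({MR, CD}))   # both diagnoses recorded
print(f"retardation recorded: {single:.0%}; both recorded: {joint:.0%}")
```

With these invented codings, 80% of raters record mental retardation but only 40% record both conditions, so a high figure for one diagnosis does not show that the pair was reliably coded.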
Developmental Disorder Axis
Although strongly criticized by Garmezy (1978), the second axis of "specific developmental disorders" seems generally appropriate in overall design, although it remains obscure to us what is meant by the claim that they constitute "mental disorders," or indeed why the claim has been made at all. If the claim has any scientific meaning, which seems extremely doubtful, then certainly Garmezy's objections apply. However, it seems to be asking too much of the imagination even to speculate on how there could be a severe disorder of, say, reading or arithmetic which did not fulfill the criteria of distress, impairment, and psychological dysfunction (we presume that "psychological" must include cognition in that it is held to cover intellectual retardation) which are said to define "mental disorder." However, the diagnostic criteria appear very vague in comparison with those of most other conditions, and some of the descriptive statements are seriously misleading. For example, it is said that "in most cases the disturbance is stable throughout childhood." What is meant by that statement? If it is meant to imply that it is rare for improvement to occur, it is just
wrong: most language disorders improve greatly as the children grow older. If it is intended to suggest that the form of the disorder does not alter, then that too is wrong; for example, many children with a developmental language disorder gain a normal level of language competence but go on to exhibit reading difficulties (Rutter, 1977b).
Psychosocial Axes
The fourth and fifth axes of "severity of psychosocial stress" and of "highest level of adaptive functioning during past year" provide greater difficulties. It appears unfortunate, to say the least, that "severity" was selected as the one aspect of psychosocial stress to code. In the first place, it seems to assume that all stresses act through the same mechanism, whereas research findings clearly indicate that this is most unlikely (Rutter, 1972, 1979a). In the second place, the grounds for assessing severity seem highly questionable. For example, chronic parental fighting is regarded as a much less severe stressor than death of a sibling in spite of the fact that it is well established that family discord is strongly associated with disorder, whereas there is little indication that death of a sib strongly predisposes to disorder. Or, again, a vacation with the family is treated as equivalent to a minor violation of the law! Moreover, there is a most regrettable confusion between stresses which arise independently of the individual himself (e.g., bereavement) and of those which are a consequence of his own behavior (e.g., arrest). The methodological advances over the last decade in the measurement of stressors seem to have been completely ignored (e.g., Brown and Harris, 1978). The fifth axis of an individual's highest level of adaptive functioning during the past year is included on the grounds of its prognostic significance: "an individual returns to his or her previous level of adaptive functioning after an episode of illness." Indeed, this is often the case, but the adjective "previous" needs to refer to the premorbid level and not to some arbitrarily restricted period which must be within the past year. The whole meaning of this coding will be entirely different for individuals whose disorders have lasted more than 12 months. 
The classification rule that the coding must apply to the past year seems both to lack known validity and also to provide a spurious impression of objectivity.
VALIDITY OF DIAGNOSTIC DISTINCTIONS
One of the striking features of the DSM-III has been the proliferation of diagnostic categories. For example, a child presenting with
socially disruptive behavior might be included under one of the three attention deficit disorder categories, under one of the five varieties of conduct disorder, oppositional disorder, identity disorder, personality disorder, adjustment disorder, or under the V code for childhood antisocial behavior. Similarly, an anxious or fearful child might be classified in a dozen or so different places. It is necessary to ask whether all these codes are necessary or helpful, and in particular to examine the extent to which there is evidence validating these multiple syndromes. Before we consider the categories in greater detail, it is perhaps useful to review the general criteria for the separate inclusion of any category in a medical classification. The criteria are five: (1) that the syndrome is identifiable; (2) that the diagnosis can be made reliably with reasonable agreement between different clinicians; (3) that it constitutes a handicapping condition warranting clinical attention; (4) that the syndrome has validity in the sense that it has been shown to differ from other syndromes in terms of etiology, course, response to treatment, or some alternative clinical feature other than the symptoms which define it (Rutter, 1978); and (5) that it either occurs sufficiently often that its presence warrants separate coding, or that its public health importance is so great that its presence must always be noted, however rarely it occurs (smallpox or cholera constitute good examples of medical conditions included in all medical classifications on the basis of the last clause). How do the very large numbers of categories in DSM-III measure up to these basic (and generally noncontroversial) criteria? Presumably, at least some clinicians consider the syndromes to be recognizable and meaningful, so let us accept that the categories meet the first criterion. However, it appears extremely doubtful whether most of them would meet the second criterion of interrater reliability. 
The more limited (but still extensive) range of categories in ICD-9 was recently put to the test in a WHO study with more than 50 British child psychiatrists (Sturge et al., 1977). Each clinician rated 28 case histories under conditions which effectively ensured that the ratings were independent. The findings showed a high level of agreement on broad diagnostic categories such as disturbance of conduct, emotional disorder, or depressive condition, but generally rather low reliability for the finer subdivisions within the general groupings. Of course, these findings refer to the categories as used in ICD-9 (but these overlap to a considerable extent with those in DSM-III) and to the WHO glossary, which does not use either operational criteria or rigid sets of rules. However, it is very striking that the DSM-III field studies (Mattison et al., 1979) showed exactly the same thing. In particular,
it was found that there was low interrater reliability for the various subvarieties of depressive disorder, anxiety disorder, conduct disorder, and adjustment disorder, which between them probably constitute the majority of conditions seen at child psychiatry clinics. The third criterion, of a handicapping condition, is obviously met by the traditional psychiatric categories, but previous critics, such as Garmezy (1978), have queried whether it is met by some of the newly introduced categories such as "avoidant disorder" or "schizoid disorder" or "oppositional disorder." His questioning of these categories appears fully justified. For example, "oppositional disorder" is to be diagnosed if at least two of the following five behaviors have lasted at least six months: violations of minor rules, temper tantrums, argumentativeness, provocative behavior, or stubbornness. That description (and only two are required for the diagnosis) sounds like the behavior of a lot of children one meets socially and not at all like psychiatric disorder. Of course, these behaviors are often involved in psychiatric conditions, but on their own they do not sound sufficient for a psychiatric diagnosis. At the very least, one must ask for validating data. That brings us to the fourth criterion. A careful review of the evidence indicates that rather few of the diagnostic categories have been validated satisfactorily (Rutter, 1978). Of course, autism is a well-established condition which differs from other psychiatric disorders in numerous respects; and the differentiation between emotional disturbance and disorders of conduct is also well validated. There are useful pointers to possible meaningful subdivisions within these broad groups, but none is yet adequately tested. In particular, although recent research findings suggest that there may prove to be valid syndromes of depression and of the hyperkinetic syndromes, doubts and uncertainties still remain.
If this is so, as it is, even with some of the more traditional and well-accepted diagnostic categories, there is even less evidence for the nosological validity of the new syndromes. The recent WHO study (Shaffer et al., 1979) also examined the differentiation between clinical categories. The findings confirmed the utility of the broad diagnostic groupings, but once again, little evidence could be found to justify the fine subdivisions. It seemed that ICD-9 had introduced more diagnostic differentiations than could be adequately justified on the basis of the published research findings. DSM-III has provided even finer subdivisions.
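The contrast drawn above between good agreement on broad groupings and poor agreement on finer subdivisions can be illustrated with a chance-corrected agreement statistic. The following is a minimal sketch only: it uses Cohen's kappa as one common choice of statistic (the studies cited are not stated here to have used it), and the category labels, the `BROAD` mapping, and the eight pairs of ratings are all invented for illustration rather than taken from any of the field studies.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical mapping from fine subdivisions to broad groupings.
BROAD = {
    "conduct: socialized": "conduct",
    "conduct: undersocialized": "conduct",
    "emotional: anxiety": "emotional",
    "emotional: depressive": "emotional",
}


def to_broad(ratings):
    """Collapse fine subdivision codes into their broad grouping."""
    return [BROAD[r] for r in ratings]


# Invented codings of eight case histories by two clinicians.
rater_1 = [
    "conduct: socialized", "conduct: undersocialized",
    "conduct: socialized", "conduct: undersocialized",
    "emotional: anxiety", "emotional: depressive",
    "emotional: anxiety", "emotional: depressive",
]
rater_2 = [
    "conduct: undersocialized", "conduct: undersocialized",
    "conduct: socialized", "conduct: socialized",
    "emotional: depressive", "emotional: depressive",
    "emotional: anxiety", "emotional: anxiety",
]

fine_kappa = cohen_kappa(rater_1, rater_2)
broad_kappa = cohen_kappa(to_broad(rater_1), to_broad(rater_2))
print(round(fine_kappa, 2), round(broad_kappa, 2))
```

In this invented example the two clinicians never disagree about whether a case is a conduct or an emotional disorder, yet they agree on the subdivision in only half the cases, so the statistic is much lower for the fine codes than for the broad ones. Real data would of course be less tidy.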
Of course, we recognize that there is a dilemma here in that it is not sensible or practicable to demand that all categories be proved to the hilt before inclusion in a psychiatric classification. It may well be justifiable to include diagnoses when extensive clinical experience suggests their utility and when the research findings at least do not contradict the clinical observations and perhaps provide some pointers to the possible validity of the syndromes. However, does this apply to "identity disorder"? There is now a substantial body of research of many different kinds which runs counter to the views of adolescence which seem to underlie this concept (Rutter, 1979b). Also, what is the evidence on the validity of an "oppositional disorder" which is not part of a conduct disorder (an explicit exclusion clause in DSM-III)? Similar queries may be raised in connection with many of the new categories introduced into DSM-III, and we conclude that the proliferation of diagnostic categories so far outruns the empirical validating evidence that their inclusion is unjustified. What about the fifth criterion? Certainly, there is precious little indication that most of the new categories have such a public health importance that they have to have a separate coding in the classification. In this connection it is important to appreciate that the issue is not whether there should be a place in the classification to note the diagnosis. To the contrary, as Stengel (1959) cogently argued, any adequate classification should utilize categories which are "mutually exclusive and jointly exhaustive." In other words, there must be a place for all psychiatric conditions in a psychiatric classification, however rare or exotic the diagnosis. But that does not mean that there must be a separate coding for all of these infrequent conditions of limited importance. The usual solution is to include them under some broader heading. 
We suggest that this would have been a better mode of dealing with some of the new diagnoses introduced into DSM-III.
FIELD STUDIES
In the introduction to DSM-III, a special point is made of the utilization of field trials "to identify problem areas in the classification and to try out solutions to these problems." Indeed, the claim is made that "In the past, new classifications of mental disorders have not been extensively subjected to clinical trials before official
adoption." Not only is that untrue; the authors must have been aware that the previous development of ICD-9 was preceded by nearly a decade of international collaborative research which gave rise to a series of monographs and papers published before ICD-9 was introduced (indeed this is appropriately acknowledged by Cantwell and his colleagues in their reports of the child psychiatry field trials). Of course, it is true that many of the key issues in ICD-9 were not researched, and it is certainly the case that many codings were retained or added in the absence of adequate empirical justification. Nevertheless, it is wrong to assert that DSM-III has blazed the research trail when it has clearly followed in the wake of the WHO and has adopted some of the research strategies in the WHO program. However, precedent apart, what did the field trials achieve? No adequate assessment of them is yet possible in that many of the findings remain unpublished. However, there are two published reports (Spitzer and Forman, 1979; Spitzer et al., 1979) on the testing of some of the categories for adults. Undoubtedly these studies were useful and it is good that they were undertaken, but as pieces of research they leave much to be desired. Both reports concern the reliability study which involved clinicians "from Maine to Hawaii." Unfortunately this impression of spread is largely spurious in that the reliability concerned agreements only between close colleagues (each clinician chose his own partner in the study). Moreover, the reliability findings were derived from an unknown data base which differed from pair to pair. That is, there was no uniformity in the information provided for the reliability study. Finally, there was no control over adherence to the rules and no means of preventing consultation between the two clinicians making supposedly independent diagnostic codings. 
There must be very considerable reservations about the meaning of reliability figures obtained under such inadequate conditions. Of course, we are acutely aware of the difficulties involved in such field studies and it may well be that this was the best that could be done within the time and resources available. However, the findings do little to provide a scientific basis for DSM-III. Fortunately, the studies which were undertaken to test the classification of child psychiatric disorders were both more rigorous and more extensive (Cantwell et al., 1979a, 1979b; Mattison et al., 1979; Russell et al., 1979). Twenty-four case histories prepared according to a standard format were used to assess interrater reliability with eight child psychiatry faculty and twelve fellows at UCLA. The results were analyzed in detail according to the various types of psychiatric disorder and according to the use of all four axes. The findings are most informative in the light they throw on several crucial classification issues. Both the successes and the difficulties found were strikingly similar to those experienced with ICD-9 (Rutter et al., 1975). With both systems of classification there was general satisfaction with the better coverage of disorders of childhood (compared with either DSM-II or ICD-8) and with the multiaxial approach. There also was good interrater reliability for the broad groupings of syndromes. However, there was poor agreement on the finer subdivisions and on the classification of complex disorders; consequently the field studies do not provide any justification for the great proliferation of new and unvalidated categories. Moreover, as found with the ICD-9 glossary, participants did not always follow the operational criteria.
RESEARCH DIAGNOSTIC CRITERIA
That brings us to what in many ways is the most novel feature of DSM-III, namely, the use of precise operational diagnostic criteria. The style of approach is closely modeled on the research diagnostic criteria pioneered by the Washington University group (Feighner et al., 1972). The argument which underlies their use has two main points: (1) that in research it is often preferable to utilize rather narrow diagnostic criteria in order to be sure that the categories studied are as "pure" as it is reasonably possible to obtain; and (2) that only by providing fully explicit, unambiguous operational criteria with clear rules on their application can it be ensured that cases will be diagnosed in the same way by different clinicians. There is no doubt that this approach has paid off richly and has justified itself fully in clinical research (Spitzer et al., 1978). However, it is important to recognize that the criteria have been employed with a limited number of fairly well-established syndromes and that the price to be paid has always included a rather high proportion of cases which meet none of the sets of diagnostic criteria. Moreover, different sets of research diagnostic criteria may disagree sharply (Overall and Hollister, 1979), so that the particular criteria used are crucial. What has happened, then, in the extension of this approach to the whole of DSM-III? Several points stand out. First, operational
criteria have had to be devised for conditions regarding which there is a total lack of evidence on which criteria to use. It is entirely reasonable to propose criteria for schizophrenia or autism when there is a vast research literature on which to draw, but how does one decide on the criteria for, say, oppositional disorder? The result, as one might expect, is an absurd specificity of rules when there are no empirical grounds for rules of any kind. Even with some of the better established disorders there are major problems. For example, the criteria for the attentional deficit disorder specify that there must be at least two out of a list of five possible overactive behaviors, but the rules make no mention of whether the behavior is pervasive or situation-specific. Yet, recent research findings indicate that this may constitute the most important criterion of all (Schachar et al., 1980). In Schachar's study, situational hyperactivity appeared of no particular diagnostic importance; whereas pervasive hyperactivity (i.e., that present both at home and at school) had validity in terms of both its association with cognitive impairment and an increased likelihood that the disorder would persist over the next four years. Of course, it could reasonably be argued that much of this evidence was not available when the criteria were devised and that in any case the issue is far from closed. That is so, but the point is that if criteria are pulled out of the air for conditions not yet adequately validated, it is almost inevitable that some of the rules will prove to be inappropriate even before the manual has been printed. Second, not all the diagnostic categories have precise operational criteria. Consider, for example, the criteria for "developmental reading disorder," a much-studied condition with an enormous research literature devoted to it (Benton and Pearl, 1978).
These specify: "Performance on standardized, individually administered tests of reading skill is significantly below the expected level, given the individual's schooling, chronological age and mental age (as determined by an individually administered IQ test)." Obviously, one of the crucial elements in this set of criteria is the assessment of "significant" impairment. Reference to the general description of the disorder provides guidance: "one-to-two-year discrepancy in reading skill for ages 8 to 13 is significant, but below that age, it is difficult to specify how great a discrepancy is significant." Oh, dear, what is the conscientious clinician to make of that? Should he take a one-year or a two-year discrepancy? He does not need to be a statistician to realize that the prevalence of the disorder will be
hugely different according to the choice made. And what does he do with a 7-year-old? Even worse, what does he do with a 14-year-old? At least the criteria admit to having to give up in the case of a 7-year-old, whereas the possibility of a developmental reading disorder in a 14-year-old is not even mentioned in passing. What sort of diagnostic criteria are these if the aim is to produce a precise and unambiguous set of rules? Third, some of the criteria are patently unworkable. For example, the criteria for "reactive attachment disorder" specify that the age of onset must be "before 8 months." The initial description asserts boldly that attachments are formed by 8 months if there has been adequate caretaking, which runs counter to the research findings which show individual variation extending above that age (Rutter, 1980), and, even more surprisingly, "The diagnosis can be made as early as in the first month of life." How is a disorder of attachment to be diagnosed in the neonatal period when selective attachments are not normally evident until some months later? The claim that the condition can be recognized by 4 weeks of age not only lacks empirical support but seems to require a considerable degree of prescience. Fourth, the criteria occasionally slip into making unwarranted etiological assumptions. For example, the ICD-9 category of "disintegrative psychosis" has been abolished in DSM-III, with the instruction that such cases should be classified under the coding for "dementia." There are two problems with this decision. First, the clinical features of disintegrative psychosis as it occurs in young children are very different from those of dementia in adults. Second, and more crucial, the coding of dementia demands the presence of organic brain damage;⁴ and there may be no evidence of this in cases of disintegrative psychosis.
⁴ DSM-III states that this may be "presumed" if there has been widespread cognitive impairment, but classifications should not be based on such presumptions which lack empirical support.
For example, Evans-Jones and Rosenbloom (1978) described 10 such children, at least 5 of whom showed no abnormalities on either clinical examination or specialized neurological investigations. Of course, the negative findings may well reflect the inadequacy of our present measures of brain functioning, and it remains a very reasonable hypothesis that ultimately an organic basis will be found. We are inclined to favor such a hypothesis, but the point is that it is a hypothesis and not a fact. It is no advance to base classification on neuromythology rather than psychomythology! We accept that this is not at all a general feature of DSM-III, but it demands attention just because it represents an unnecessary departure from ICD-9 on the basis of an inappropriate principle. Fifth, DSM-III lacks adequate provision for dealing with disorders that do not fit any of the specified criteria. As Spitzer et al. (1978) have cogently argued, a necessary feature of research diagnostic criteria is that some (often many) patients have syndromes outside those listed. For research purposes it may well be acceptable to lump all these together under an unspecified category of "other psychiatric disorders," in that the greatest need is to have "pure" groups to study which have a minimum proportion of false positives. However, this involves an unacceptable loss of information when the classification is to be used for ordinary clinical purposes (as it is intended that DSM-III should be). Accordingly, in many instances DSM-III very usefully includes a variety of "other" categories which involve some specification of a kind likely to have both reliability and validity. Thus, there is a coding for "atypical childhood onset pervasive developmental disorder" in recognition of the fact that there are many children (especially those with mental retardation) who exhibit severe disorders of an autistic type but which do not fulfill the specified criteria for infantile autism. Similar provision is made in other parts of DSM-III. However, there are a few notable failures to do this with some of the most common types of child psychiatric disorder. For example, there is no "other" provision for the anxiety disorders of childhood and adolescence. As a result, it is not clear how many cases of school refusal should be coded. As the manual correctly points out, this syndrome is not necessarily due to separation anxiety.
Moreover, it often does not represent a simple phobia of school either (Hersov, 1977). How should such cases be dealt with? Or again, it is clear that many emotional disorders of childhood do not fall into any of the discrete syndromes listed (Rutter et al., 1970). Rather they have a varied admixture of symptomatology which is obviously "emotional" in type. How should these be dealt with? DSM-III provides no clear guidance. Another very common clinical picture in children and adolescents is an admixture of emotional disturbance and aggressive or antisocial behavior (Rutter et al., 1970, 1975). However, apart from those which may be regarded as adjustment disorders (and most cannot), DSM-III makes no provision for these
cases. Both the field trials for ICD-9 (Rutter et al., 1975) and for DSM-III (Cantwell et al., 1979a) demonstrated that these are common situations in clinical practice. It would be convenient if all disorders filled textbook descriptions, but unfortunately they do not. As Kanner (1969) so eloquently put it: "Many patients do not oblige by fitting themselves into any set of criteria; they haven't read those books or articles." If research diagnostic criteria are to be employed, there must be adequate provision for these mixed and atypical pictures. Unfortunately, DSM-III falls short of what is needed in this connection. Sixth, the whole point of research diagnostic criteria is that they provide a clear-cut set of rules and regulations which have to be followed, but DSM-III seems to want to have it both ways. On the one hand, frequent reference is made to "necessary" criteria, and there is frequent specification that there must be three out of five (or some such number) of a list of defined symptoms. On the other hand, the manual also repeatedly describes the criteria as "guides," explaining that they should be interpreted in the light of clinical experience and judgment and should not be followed slavishly. This appears contradictory and undoubtedly opens the way for major variations in usage. Of course, it could be argued that this kind of inconsistency is inevitable in the present state of knowledge and in the varying conditions of clinical practice. Moreover, undoubtedly it would be claimed that, even though the rules cannot be made absolute, nevertheless it is still preferable to be definite rather than vague; and also that the presence of specific criteria is likely to encourage greater diagnostic uniformity than has existed in the past. We accept these arguments in the case of criteria which are well based, but we question the arguments on educational grounds when they are not (as is often the case).
EDUCATIONAL VALUE
One of the explicit goals of the DSM-III Task Force was to produce a manual with "usefulness for educating health professionals." How useful is it? It certainly contains a wealth of clinically relevant information; and with some diagnostic categories, it clearly provides most valuable guidelines for diagnosis. We welcome both and we recognize with great appreciation the immense amount of scholarly work which has gone into its production.
There is much of considerable educational value in the manual. In spite of these very positive qualities, however, we regard the overall effect as educationally unsound. It is partially a question of the dogmatic style which sometimes seems to leave little room for doubt. Examples of this have already been given, but many others could have been added. More especially it is a function of the failure to make any adequate differentiation between those statements which represent the summary of decades of research and those which are no more than spitting in the wind. The introduction to the manual does make it clear that much of the description relies solely on clinical judgment. That is fair enough, but surely it is not asking too much to provide some indication of the degree of empirical support for the statements given. Without that indication, it provides a rather unsatisfactory educational tool which is likely to mislead as often as it enlightens. All too often it provides a vivid illustration of the old saying, "It ain't ignorance that does the harm, it's knowing so many things that ain't so!"
CONCLUSIONS
The criticisms we have made of DSM-III may seem harsh; they are. We regret that in so many instances it does seem to represent opportunities missed rather than taken. Nevertheless, it is important to end by putting the perspective right. As evident in our introductory remarks, we regard DSM-III as a marked improvement over DSM-II in most respects, and we certainly consider it a landmark in the development of psychiatric classification systems. We would urge child psychiatrists in the United States to use it and to evaluate it. If systematically employed and if subjected to critical study, much should be learned. As the DSM-III Task Force explicitly recognized, this version constitutes but a stepping stone. We hope that the experience gained with DSM-III will lead to a sharpening of concepts which can constitute the basis for the much-improved DSM-IV version to which we now look forward.
REFERENCES
BENTON, A. L. & PEARL, D., eds. (1978), Dyslexia. New York: Oxford University Press.
BROWN, G. W. & HARRIS, T. (1978), Social Origins of Depression. London: Tavistock.
CANTWELL, D. P., RUSSELL, A. T., MATTISON, R., & WILL, L. (1979a), A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders: I. Agreement with expected diagnosis. Arch. Gen. Psychiat., 36:1208-1213.
- - - - - - - - (1979b), A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders: IV. Difficulties in use, global comparison and conclusions. Arch. Gen. Psychiat., 36:1227-1228.
EVANS-JONES, L. G. & ROSENBLOOM, L. (1978), Disintegrative psychosis in childhood. Develpm. Med. Child Neurol., 20:462-470.
FEIGHNER, J. P., ROBINS, E., GUZE, S. B., WOODRUFF, R. A., WINOKUR, G., & MUNOZ, R. (1972), Diagnostic criteria for use in psychiatric research. Arch. Gen. Psychiat., 26:57-63.
GARMEZY, N. (1978), DSM-III. Never mind the psychologists: is it good for the children? Clin. Psychologist, 31, No. 3.
HERSOV, L. (1977), School refusal. In: Child Psychiatry, ed. M. Rutter & L. Hersov. Oxford: Blackwell Scientific Publications, pp. 455-486.
KANNER, L. (1969), The children haven't read those books. Acta Paedopsychiat., 36:2-11.
MATTISON, R., CANTWELL, D. P., RUSSELL, A. T., & WILL, L. (1979), A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders: II. Interrater agreement. Arch. Gen. Psychiat., 36:1217-1222.
OVERALL, J. E. & HOLLISTER, L. E. (1979), Comparative evaluation of research diagnostic criteria for schizophrenia. Arch. Gen. Psychiat., 36:1198-1205.
RUSSELL, A. T., CANTWELL, D. P., MATTISON, R., & WILL, L. (1979), A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders: III. Multiaxial features. Arch. Gen. Psychiat., 36:1223-1226.
RUTTER, M. (1965), Classification and categorization in child psychiatry. J. Child Psychol. Psychiat., 6:71-83.
- - - (1972), Maternal Deprivation Reassessed. Harmondsworth, Middlesex: Penguin.
- - - (1977a), Classification. In: Child Psychiatry, ed. M. Rutter & L. Hersov. Oxford: Blackwell Scientific Publications, pp. 359-384.
- - - (1977b), Speech delay. In: Child Psychiatry, ed. M. Rutter & L. Hersov. Oxford: Blackwell Scientific Publications, pp. 688-716.
- - - (1978), Diagnostic validity in child psychiatry. Adv. Biol. Psychiat., 2:2-22.
- - - (1979a), Maternal deprivation 1972-1978. Child Develpm., 50:283-305.
- - - (1979b), Changing Youth in a Changing Society. London: Nuffield Provincial Hospitals Trust; Cambridge, Mass.: Harvard University Press, 1980.
- - - (1980), Attachment and the development of social relationships. In: Scientific Foundations of Developmental Psychiatry, ed. M. Rutter. London: Heinemann Medical (in press).
- - - LEBOVICI, S., EISENBERG, L., SNEZNEVSKIJ, A. B., SADOUN, R., BROOKE, E., & LIN, T-Y. (1969), A tri-axial classification of mental disorders in childhood. J. Child Psychol. Psychiat., 10:41-61.
- - - SHAFFER, D., & SHEPHERD, M. (1975), A Multi-axial Classification of Child Psychiatric Disorders. Geneva: World Health Organization.
- - - TIZARD, J., & WHITMORE, K., eds. (1970), Education, Health and Behaviour. London: Longmans.
SALZINGER, K. (1977), But is it good for the patient? Read at the 85th annual convention of the American Psychological Association, San Francisco, California.
SCHACHAR, R., RUTTER, M., & SMITH, A. (1980), The characteristics of situationally and pervasively hyperactive children. J. Child Psychol. Psychiat. (in press).
SCHACHT, T. & NATHAN, P. E. (1977), But is it good for the psychologists? Amer. Psychologist, 32:1017-1025.
SHAFFER, D., RUTTER, M., STURGE, C., & NICHOLS, P. G. (1979), An examination of categories relating to child mental health in the International Classification of Diseases. Read at the American Academy of Child Psychiatry, Atlanta, Georgia.
SPITZER, R. L. & ENDICOTT, J. E. (1978), Medical and mental disorder. In: Critical Issues in Psychiatric Diagnosis, ed. R. L. Spitzer & D. Klein. New York: Raven Press, pp. 15-39.
- - - - - - & ROBINS, E. (1978), Research diagnostic criteria. Arch. Gen. Psychiat., 35:773-782.
- - - & FORMAN, J. B. W. (1979), DSM-III field trials: II. Amer. J. Psychiat., 136:818-820.
- - - - - - & NEE, J. (1979), DSM-III field trials: I. Amer. J. Psychiat., 136:815-817.
STENGEL, E. (1959), Classification of mental disorders. Bull. World Hlth Organiz., 21:601-663.
STURGE, C., SHAFFER, D., & RUTTER, M. (1977), The reliability of diagnostic categories for child psychiatric disorders in ICD-9. Read at the Royal College of Psychiatrists Section of Child Psychiatry meeting, Sterling, Scotland.
TARJAN, G., TIZARD, J., RUTTER, M., BEGAB, M., BROOKE, E. M., CRUZ, F., LIN, T-Y., MONTENEGRO, H., STROTZKA, H., SARTORIUS, N. (1972), Classification and mental retardation. Amer. J. Psychiat., 128(11, suppl.):34-45.
ZUBIN, J. (1977), But is it good for science? Clin. Psychologist, 31(2):1, 5-7.