Research at the Interface of Psychiatry and Medicine The science base for all of medicine has been expanding exponentially; in particular, advances in basic and clinical sciences are revolutionizing our understanding of the interface between psychiatry and the rest of medicine. The intent of this section is to present and synthesize recent research relevant to a clinical audience involved in consultation liaison psychiatry and general hospital psychiatry. This section, edited by Harold Pincus, M.D., will include consideration of new scientific approaches-from biological psychiatry and the neurosciences to immunology, pharmacology, physiology of body systems other than the central nervous system, to the range of behavioral sciences, epidemiology, and health-services research.
Principles of Screening Applied to Psychiatric Disorders
Daniel E. Ford, M.D., M.P.H.
Abstract: The morbidity and mortality caused by psychiatric illness are a significant public health problem. The use of a psychiatric screening questionnaire has been one strategy to improve the recognition and treatment of psychiatric disorders in general medical settings. This paper discusses how well psychiatric screening procedures fulfill criteria outlined by the World Health Organization for evaluating the utility of general medical screening efforts. Current research on the utility of screening for psychiatric disorders is reviewed, and the lack of data on the treatment of psychiatric disorders identified in the general medical sector is emphasized. The randomized clinical trial is offered as the best method to test the efficacy of screening by eliminating various biases outlined in the paper. Quantitative concepts such as positive predictive value and receiver operator characteristic analysis are discussed. The need for more research on the efficacy of early treatment of psychiatric disorders is emphasized.
From the Primary Care Research Program, National Institute of Mental Health, Rockville, MD. Address reprint requests to: Daniel E. Ford, M.D., M.P.H., Primary Care Research Program, National Institute of Mental Health, Room 18C-14, 5600 Fishers Lane, Rockville, MD 20857. General Hospital Psychiatry 10, 177-188, 1988. ISSN 0163-8343/88/$3.50. This paper is U.S. Government work, cannot be copyrighted, and lies in the public domain.
As efforts are made to increase the provision of preventive medical care, screening for disease has become a topic of increasing importance. However, scientific evidence to support the use of many screening modalities is not available. There appears to be a rush to include screening practices in standard medical care before a systematic evaluation has been completed. For example, despite the widespread use of occult blood testing of stool to detect early colon cancer, the results of two large randomized clinical trials to assess the effectiveness of the practice are not yet known [1].
The objective of this article is to examine the rationale for and role of screening for psychiatric disorders in primary care settings. A large proportion of individuals with psychiatric disorders receive their care in nonpsychiatric, general medical settings [2,3]. In addition, most studies have found that primary care physicians typically underdiagnose mental illness in their patients [4]. The advantages of earlier and more accurate psychiatric diagnoses in general medical patients are thought to include better care to reduce psychiatric morbidity, prevention of unnecessary medical tests and invasive procedures, and decreased utilization of general medical services [5,6]. In response to this problem, numerous studies have tried to improve the recognition and treatment of psychiatric disorders in general medical settings by using short screening questionnaires about psychiatric symptoms to inform the primary care physician of the patient’s current
mental state. These studies provide the database to help answer the following questions: 1) What are the goals of screening for psychiatric disorders? 2) Do the criteria for screening used in nonpsychiatric disorders apply to psychiatric disorders? 3) What evidence is available to help determine the role of screening for psychiatric disorders in primary care? These questions will be addressed using the framework of the World Health Organization (WHO) criteria for screening. These criteria were chosen because of their relevance to both clinical practice and public health. By examining the use of screening techniques in medical conditions, it is hoped that the use of screens for psychiatric conditions can be improved.
Initially, some would suggest differentiating case finding from screening. Sackett and colleagues [7] define screening as inviting the general public or requiring specific groups (immigrants or transportation personnel) to undergo tests to determine if they are at higher risk of disease. Case finding, on the other hand, is defined as the detection of early disease in patients who seek help for other, unrelated problems. It is unclear if there is any relevant distinction between the 50-year-old who is offered occult blood testing of his stool at a local health fair and the same individual being offered occult blood testing by his physician when he goes in for a blood pressure check. The only real advantage to case finding is that the individual being screened is already part of a health system, so that follow-up of the results of the screening test can be complete and efficient. In both situations, the implied understanding of the individual being screened is that the procedure has the ability to either extend life or reduce morbidity. As Cochrane and Holland [8] point out, there is an ethical difference between everyday medical practice and screening.
In everyday medical practice, the physician uses the best available knowledge to manage symptoms even if complete proof of the efficacy of the treatment is not available, whereas screening an asymptomatic individual implies guaranteed benefit to society, if not to that particular individual. The guidelines by which screening techniques can be evaluated have been outlined by the WHO [9]:
1. The condition must have a significant effect on the quality or quantity of life.
2. The incidence of the condition must be sufficient to justify the cost of screening.
3. The condition must have an asymptomatic period during which detection and treatment significantly reduce morbidity or mortality.
4. Treatment in the asymptomatic phase must yield a therapeutic result superior to that obtained by delayed treatment until symptoms appear.
5. Acceptable methods of treatment must be available.
6. Tests that are acceptable to patients must be available at reasonable cost to detect the condition in the asymptomatic period.
There are at least two situations in which screening could have a role in psychiatric care. The first would be to screen for nonpsychiatric physical disorders in patients with psychiatric disease. The literature in this area is limited, and there is no recent information on how detection of “occult” nonpsychiatric disorders affects the long-term morbidity or mortality of the patient. This may be because there are multiple possible relationships between psychiatric and nonpsychiatric disease. First, the psychiatric illness and nonpsychiatric disease may coexist and be unrelated to one another; for example, a patient has both pneumonia and depression, and even when the pneumonia resolves, the depression remains. Second, an organic or nonpsychiatric disease may be the cause of the depression, as with hypothyroidism mimicking depression, or a patient becoming secondarily depressed because of a chronic medical condition, with the depression improving when the medical condition improves. Third, the patient may have psychiatric disease causing the nonpsychiatric symptoms, as in the patient with weight loss who regains weight after the depression is treated. Using one screening procedure to gain new information about this complex relationship is difficult and would be much more effective if the explicit purpose of the screening test were well characterized early. 
Evaluation of these screening techniques is complicated, as research in this area has always been dependent on a judgment of the importance of the physical disorders uncovered, with no documentation of the ultimate change in outcome for the patient. The second situation in which screening instruments have been used more widely with psychiatric disorders is in the use of questionnaires to detect psychiatric disease, either in the community or in general medical settings. There has been more research in this setting and this will be the basis for most of the discussion of general screening issues that follows.
Public Health Importance of Psychiatric Disorders
The first criterion outlined by WHO, that the condition must have a significant effect on the quality or quantity of life, appears to be applicable to psychiatric disorders in general medical settings and the community. The most reliable estimates of the prevalence of DSM-III psychiatric disorders in general medical settings are between 17% and 26.7% [10-12], which is slightly higher than community estimates. The effect of psychiatric disorders on society has been well outlined by Kamerow and associates [13], who reviewed the costs of these disorders including health costs, productivity loss, suicide, increased mortality, and calculation of potentially productive years of life lost. Accidents are the leading cause of potentially productive years lost, and suicide is the sixth. Both are significantly related to psychiatric disorders if substance abuse is included. Psychiatric disorders have a higher incidence than do breast and colon cancer, both of which have a recommended screening procedure. However, incidence of disorders is only one criterion by which to judge the importance of screening, as evidenced by newborn screening for phenylketonuria. In this situation, the incidence of the disease is very low, but the consequences are severe and effective treatment is available.
The prevalence of the disorder also has important implications for the performance of any screening test. Sensitivity (percentage of individuals with disease who have a positive test) and specificity (percentage of individuals who have no disease and have a negative test) are the most quoted characteristics of diagnostic tests, but the positive predictive value of the test result is equally important. As shown in Table 1, the predictive value of a positive result on a screening test, that is, the number of true positives divided by the total number of positive tests, is extremely dependent on the prevalence of disease in the population being screened. If the sensitivity and specificity remain constant, the positive predictive value of a result increases from 30% to 73% when the prevalence of the disorder increases from 5% to 25%. The direct relationship between the prevalence of the disorder and positive predictive value has led to investigation of methods to select populations at greater risk. This has the effect of increasing the prevalence of the disorder in the population being screened and thus increasing the positive predictive value of the test [14].

Table 1. Relationship of prevalence of a disorder and positive predictive value (sensitivity = 0.80, specificity = 0.90)

Prevalence = 0.05
                      Disease +   Disease -   Total
Screening test +          40          95        135
Screening test -          10         855        865
Total                     50         950       1000
Positive predictive value = 40/135 = 30%

Prevalence = 0.25
                      Disease +   Disease -   Total
Screening test +         200          75        275
Screening test -          50         675        725
Total                    250         750       1000
Positive predictive value = 200/275 = 73%

Screening only higher risk groups can be used to reduce total costs, particularly for the more expensive screening tests, and to increase the acceptance of the screening test. The factors used to define increased risk could include demographic characteristics, family history, known medical disorders, and even current symptoms. Recommendations for screening total populations often appear to be made without regard to the utilization of services that would occur if all participated. For example, Frank [1] estimated that yearly occult-blood screening tests of the 50 million North Americans more than 45 years of age, using standard diagnostic models, would create a demand for one million colonoscopies a year and would cost $0.5-$1 billion a year. These estimated costs would quickly overburden the system. If every woman followed the guidelines for breast cancer screening,
a similarly huge increase in health expenditures would occur.
If this strategy of identifying high-risk patients is applied to screening for psychiatric disease in general medical settings, primarily anxiety and affective disorders, what groups appear to be at high risk? The limited data suggest that women, the poor, those less than 65 years of age, the more severely nonpsychiatrically ill, the nonmarried, and those with a past psychiatric disorder are most likely to have a psychiatric disorder [15-17]. No data could be located concerning the effect of family history of psychiatric illness on the performance of psychiatric screening tests. More relevant than the overall prevalence of disease in various subgroups of the population are data indicating which groups have the largest amount of undetected psychiatric disease and would therefore benefit most when it is detected. Although data from the National Institute of Mental Health Epidemiological Catchment Area (ECA) study [18] suggest that the elderly have lower rates of psychiatric disorders, a psychiatric screening procedure might have a large impact on this group. German and coworkers [19] report that the elderly had lower rates of detection and management of psychiatric disorders than other age groups in the primary care setting they studied. Feedback of the results from a psychiatric screening instrument to the patients’ physicians significantly increased detection of elderly patients’ psychiatric disorders, whereas feedback produced no increase in detection in the younger age groups. Both disorder prevalence and detection of the disorder in current usual practice must be considered in any evaluation of a screening effort, as subgroups with a low prevalence of disease may still have a high rate of undetected disease.
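The arithmetic behind Table 1 can be sketched in a few lines of Python. This is only an illustration under the table's stated assumptions (sensitivity 0.80, specificity 0.90, a screened population of 1000); the function name `screening_table` and its dictionary keys are hypothetical and do not come from the paper.

```python
def screening_table(sensitivity, specificity, prevalence, n=1000):
    """Expected 2x2 screening counts and positive predictive value (PPV).

    Hypothetical helper for illustration; n is the number of people screened.
    """
    diseased = prevalence * n
    healthy = n - diseased
    true_pos = sensitivity * diseased    # diseased, test positive
    false_neg = diseased - true_pos      # diseased, test negative
    true_neg = specificity * healthy     # healthy, test negative
    false_pos = healthy - true_neg       # healthy, test positive
    ppv = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "ppv": ppv,
    }


if __name__ == "__main__":
    # The two prevalences used in Table 1.
    for prevalence in (0.05, 0.25):
        table = screening_table(0.80, 0.90, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={table['ppv']:.0%}")
```

With the same test characteristics, raising prevalence from 5% to 25% is what moves the positive predictive value from about 30% to about 73%; nothing about the test itself changes.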
Is Early Detection of Psychiatric Disorders Useful?
The next criterion states that a condition must have an asymptomatic period during which detection and treatment significantly reduce morbidity or mortality. Almost all diseases have an asymptomatic period; however, in psychiatric diseases, it is difficult to know what asymptomatic means, as there are currently no reliable laboratory or pathologic tests to confirm diagnoses. The presence of symptoms is the only way to make a diagnosis.
Nonetheless, it is useful to conceptualize psychiatric disease, like other diseases, as having four stages. The first is the onset of the disease process, which may only be the onset of the risk factor associated with the disease. This stage probably cannot be measured exactly. The second stage is one in which early detection of the disease is possible, and this stage will vary depending upon the technology available. The third stage is when the usual clinical diagnosis is made, and this will also vary with the type of health care available and the skill of the health care provider. Finally, there is disease outcome, which includes the full manifestations of the disease, specific symptoms, or death [2]. For screening to be effective, there must be a critical point in the natural history of a disease “before which therapy is either more effective or easier to apply than afterward” [20]. This model emphasizes that an evaluation of screening requires knowledge of the natural history of the disease and the efficacy of various treatment procedures. The efficacy of treatment may even define the point where abnormal begins and normal ends.
Do psychiatric disorders fit with this model? The fact that depression and anxiety appear to be on a continuum in primary care does not invalidate the model. For example, blood pressure falls on a continuum, and when clinical trials demonstrated that treatment to reduce blood pressure in individuals who have diastolic blood pressures between 90 mm Hg and 105 mm Hg reduced cardiovascular morbidity, the definition of diastolic hypertension fell from 105 mm Hg to 90 mm Hg [21]. Treatment trials for depression in primary care should define the level of depression that requires treatment in a similar manner. Psychiatric screening questionnaires give no information beyond that capable of being ascertained from a complete clinical interview. 
Therefore, the utility of a psychiatric questionnaire will be very dependent on the state of usual or routine care. This is different from screening for occult blood in stool to detect colon cancer, where the test for occult blood gives information that cannot be obtained in any other way.
The efficacy of treatment depends upon results from randomized clinical trials and not comparative studies of screened and unscreened patients, because various biases might confound such results. There is a selection bias in who agrees to undergo screening. Studies have demonstrated that healthier populations tend to volunteer for more screening tests [22], and in psychiatric screening it is possible that those patients who deny or fear the label of psychiatric disease would be less likely to enter into screening. This is particularly true for those with alcohol or substance abuse disorders. The psychiatric cases discovered by screening might then appear to have better treatment outcomes, but this would be so because the more difficult cases refused to participate in screening.
If length of psychiatric symptoms were the standard by which psychiatric treatment efficacy was judged, cases detected by screening might appear to have a worse outcome. This is an example of a variant of lead-time bias [22] described by Shapiro. To illustrate, suppose two women with major affective disorder were destined to recover 2 years after their mother’s death. The first individual is diagnosed by screening 3 months after her mother’s death when she visits her primary care physician for her annual physical, and the second individual is diagnosed 6 months after her mother’s death when she consults her physician for depressive symptoms. Although treatment had no effect one way or another, the case detected by screening had a duration of 21 months and the case detected by ordinary clinical practice had a duration of 18 months. If duration of the episode were the criterion for therapeutic efficacy, screening would actually appear to be harming the patient.
One other potential bias comes from the field of oncology but could apply to psychiatric disease as well. Periodic screening tends to pick up slow-growing as compared to fast-growing tumors [23]. Suppose psychiatric screening detected more chronic disorders. Then, the duration of symptoms for cases detected by screening would appear longer if compared to more acute and briefer disorders occurring between screening visits. All the above biases make evaluation of the efficacy of screening for psychiatric disorders as dependent on randomized clinical trials as in other fields of medicine.
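The lead-time example above reduces to a subtraction, which the following Python sketch makes explicit. The function name `observed_duration` is hypothetical, and the months are those of the illustration in the text, not data from any study.

```python
def observed_duration(recovery_month, diagnosis_month):
    """Months of illness counted from diagnosis to recovery.

    Hypothetical helper: the clock starts at diagnosis, so an earlier
    diagnosis lengthens the observed episode even if recovery is unchanged.
    """
    return recovery_month - diagnosis_month


RECOVERY_MONTH = 24  # both women recover 2 years after the precipitating event

screened = observed_duration(RECOVERY_MONTH, diagnosis_month=3)    # found by screening
usual_care = observed_duration(RECOVERY_MONTH, diagnosis_month=6)  # found by usual practice

if __name__ == "__main__":
    print(f"screened: {screened} months, usual care: {usual_care} months")
    print(f"apparent extra duration from screening: {screened - usual_care} months")
```

The 3-month difference is produced entirely by the earlier diagnosis; the true course of illness is identical in both cases, which is why duration measured from diagnosis is a misleading outcome in an unrandomized comparison.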
It is also possible that screening procedures may actually be harmful to an individual. Mammography is the most widely investigated screening procedure, and much of this research was done because of the widespread knowledge that radiation had the potential to be harmful. How could psychiatric screening be harmful? An individual could be diagnosed as having a psychiatric disorder without getting adequate treatment. The patient might not get adequate treatment because the provider is not knowledgeable about treatment options, financial or insurance constraints are present, or the patient finds the treatment unacceptable and
is noncompliant. The stigma of the psychiatric label has consequences for the patient in terms of income [24], friendships, and social interaction [25]. Of respondents in a recent community survey, 24% indicated that a family member would get upset if he or she sought help for an emotional problem [26]. Beyond the social aspects of a psychiatric label, the patient who is told he or she has a screening score indicative of a psychiatric disease may begin to adopt the role of the psychiatric patient, leading to a self-fulfilling prophecy. Symptoms that otherwise might resolve could develop into a complete psychiatric disorder, particularly if the primary care physician is not comfortable with psychiatric treatment. Referral to mental health specialists is an option, but patient acceptance of these referrals is often poor [27]. None of the screening instruments are diagnostic, and a major source of harm to the patient could occur if the scores on the screen were given too much weight by the physician in the formulation of the patient’s problems and treatment. A false positive could be misinterpreted by the provider and primary nonpsychiatric disease could remain undetected. A false negative could lead to inappropriate neglect of psychosocial issues by the physician in the patient interview and over-utilization of tests to search for organic causes. None of the trials to date have collected enough data on how physicians use the data from psychiatric screening instruments to address this issue. There is no way to determine the risk-benefit ratio of screening for psychiatric disorders without randomized clinical trials that follow patients 1 to 2 years beyond the usual course of the disease. One way that screening might be helpful would be in the detection of individuals who are having psychiatric symptoms but have not reached the full case threshold. It is possible that early detection of this group could prevent the progression to a full psychiatric disorder. 
In the “Secondary Prevention Study” [12], conducted in a hospital internal medicine clinic, all patients were administered the GHQ twice, 6 months apart. With this version of the GHQ, a score above 4 is considered a possible case. Of those scoring 0 on the first GHQ, 4% scored above threshold when the GHQ was readministered 6 months later. This contrasted with 23% above threshold on the second GHQ if they scored 1-4 on the initial GHQ. This suggests that those in the subthreshold area are more likely to progress to a full psychiatric disorder. Kessler and colleagues [28] reported on 166 selected patients in a primary
care clinic. The 14 new cases by RDC classification had a mean GHQ score of 3.9 six months earlier, as compared to 1.55 for those who had no psychiatric disorder at either time. Therefore it appears that future cases are associated with higher GHQ scores. Nevertheless, what is the positive predictive value of a subthreshold GHQ score? The data from the “Secondary Prevention Study” allow calculation of the positive predictive value of a subthreshold GHQ score (1-4), and only 14% of these patients go on to develop a score above the threshold on the GHQ. Unfortunately, there is little information on the efficacy of any intervention for patients determined to have subthreshold disorders. However, the preliminary results of a clinical trial directed by Munoz to evaluate whether depression can be prevented in a group of high-risk patients from two primary care clinics have been favorable [29].
Another use of psychiatric screening instruments in clinical practice is to increase the recognition of psychiatric disorders already present in patients in general medical settings. It is difficult to demonstrate the efficacy of this effort, because the practitioners who are the most psychiatrically oriented and the best interviewers will have the least amount of undetected psychiatric disease in their patients. They may already be detecting the majority of medical patients who have typical psychiatric disorders or who can answer psychiatrically oriented questions effectively. A trial of feedback of a self-rating depression scale to family practice residents reported that the group with the least improvement in chart notation of depressed patients was the third-year residents, who already had the highest chart notation of depression as compared to first-year residents [30]. 
The practitioners who never ask psychiatric screening questions in their routine patient interviews and detect little psychiatric pathology may also be least likely to alter their behavior after seeing the results of the screen. This is particularly the case if management, and not simply detection, of psychiatric disorders is the goal of the screening process.
Is Treatment of Psychiatric Disorders Acceptable to Patients?
The fifth criterion for an effective screening modality is that there be an acceptable treatment currently available. The three major classes of disorders seen in primary care are affective disorders, anxiety disorders, and substance abuse disorders. In brief, there is good evidence of the efficacy of both pharmacologic and psychotherapeutic treatment modalities for treating depression [31]. There is less evidence that treatment of anxiety disorders is effective in nonpsychiatric settings [32], but treatment is clearly effective for specific anxiety disorders such as panic syndrome [33,34] and phobias [35]. There have been very few randomized studies of treatment of alcohol or substance abuse, and there is certainly no compelling evidence that early treatment is more effective [36,37]. Moreover, it is important to remember that these studies were almost exclusively done on patients who had primary psychiatric disorders and no secondary medical problems. A report in the literature [38] has documented that at least 32% of medically hospitalized patients with major affective disorder had to have their tricyclic antidepressants discontinued because of unacceptable side effects, and only 40% of the patients showed a positive response to tricyclics. Other studies with small samples have shown no benefits from treating depressed medically ill patients with tricyclics [39]. None of these studies has documented that early treatment is more effective or less costly than later treatment. Many patients in medical settings resist psychiatric referrals [40]. Whether patients with early psychiatric disease detected by screening would find psychiatric care given by either the primary care physician or a mental health specialist acceptable would have to be studied before any large-scale screening effort was initiated.
Are the Screening Instruments Adequate?
Finally, the performance of the screening instruments needs to be considered. Several of the questionnaires used most frequently are directed primarily at depression; these include the Beck Depression Inventory (BDI) [41], the Center for Epidemiological Studies Depression Scale (CES-D) [42], and the Zung Self-Rating Depression Scale (SDS) [43]. Those scales directed at general psychopathology are the General Health Questionnaire (GHQ) [44] and the Hopkins Symptom Checklist (HSCL) [45]. All these screens are usually self-administered, and they have proven to be quite acceptable to most patients in primary care settings. The screening instruments generally take less than 15 minutes to complete, and more than 90% of the patients have completed the questionnaire in most studies. This may not be the case
with hospitalized medical patients, in whom poor cooperation and organic brain syndromes are more common.
There are several factors to consider in the selection of one scale over another. First, one needs to decide whether the detection of depression or all psychopathology is the goal. However, even the more general questionnaires such as the GHQ and HSCL do have more specific subscales that include depression. Second, whether or not the detection of longstanding chronic disorders is a goal needs to be considered. The GHQ was not designed to detect chronic disorders, although this may be partially compensated for with an alternate scoring method [46]. The sensitivity of the instrument to chronicity may also affect how well the instrument does in measuring change with multiple administrations. Finally, the dependency of the scale on somatic symptoms is important, as general medical patients may have either somatic symptoms easily explained by known nonpsychiatric diseases or unexplained somatic symptoms that may be explained by a psychiatric disorder [47].
Two reviews [48,49] and one study [50] have concluded that there are minimal differences between the instruments. In primary care settings, these screening instruments have sensitivities that cluster around 70% and specificities around 80%. In general, they perform best for affective disorders and worst for substance abuse, with anxiety disorders somewhere in between. Little attention has been given to why there are discrepancies between the results of a screening questionnaire and a full clinical exam. Medical patients may have misleadingly high scores on screening questionnaires because of high numbers of somatic symptoms [47] or overreporting of symptoms consistent with a “demonstrative” style [51]. Underreporting of symptoms on the screening questionnaires may be due to denial, paranoia, or fearfulness. 
A major issue in the use of these screens is determining the appropriate cutoff point that divides respondents into cases and noncases. The optimal cutoff point has differed in specialty mental health settings as compared with primary care settings [52]. The use of one particular cutoff point to evaluate tests does not make full use of the distribution of scores along the entire range of possible values. For example, in the 20-item GHQ, a score of 0-3 usually indicates a noncase, and a score of 4-20 indicates a case. However, the probability that a person scoring 4 is indeed a case is less than that
for a person scoring 20. A method for displaying data over the entire range of values of the screening instrument is the Receiver Operator Characteristic or ROC analysis. This technique was developed by Swets [53] and originated in radar and signal detection theory. ROC analysis allows one to display the true-positive rate (sensitivity) and false-positive rate (1 - specificity) at every possible cutoff. The true-positive rate and the false-positive rate will vary together in the same direction. The values for each cutoff point are then connected to form a line. If the test has no ability to discriminate a case from a noncase, the line will be a 45° diagonal. The best tests have values that are closest to the upper left-hand corner. Although the true-positive rate and false-positive rate will both increase as the cutoff point gets lower, a better discriminating test will have a large difference between the true-positive rate and false-positive rate at each cutoff point. The optimal cutoff point for a test from a statistical perspective is that point closest to the upper left-hand corner. This statistically optimal cutoff point is not necessarily the optimal point clinically, because false-positive results may not be equal to false-negative results in terms of the consequences of a misclassification error. The area under the curve (AUC) of an ROC analysis is representative of the discriminating ability of the test across all possible values of the test and is equal to the probability that a random pair of normal and abnormal respondents would be appropriately categorized by the test. Figure 1 illustrates how ROC analysis could be used to compare two hypothetical screening tests. This example illustrates that screening instrument A differentiates cases from noncases more poorly at low values and better at high values as compared to screening instrument B. Note that each instrument could have a different number of questions and the analysis would still be valid. 
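The ROC construction described above can be sketched as follows. This is a minimal illustration, not the nonparametric procedure of any cited reference: the questionnaire scores are invented, a score at or above the cutoff is assumed to count as a positive screen, and the function names (`roc_points`, `auc_trapezoid`, `auc_pairwise`, `closest_to_corner`) are hypothetical.

```python
def roc_points(case_scores, noncase_scores):
    """(cutoff, false-positive rate, true-positive rate) at every observed cutoff.

    A respondent screens positive when score >= cutoff (an assumption).
    """
    cutoffs = sorted(set(case_scores) | set(noncase_scores), reverse=True)
    rows = []
    for cutoff in cutoffs:
        tpr = sum(s >= cutoff for s in case_scores) / len(case_scores)
        fpr = sum(s >= cutoff for s in noncase_scores) / len(noncase_scores)
        rows.append((cutoff, fpr, tpr))
    return rows


def auc_trapezoid(rows):
    """Area under the ROC line, joining the points with straight segments."""
    curve = [(0.0, 0.0)] + [(fpr, tpr) for _, fpr, tpr in rows]
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(curve, curve[1:]))


def auc_pairwise(case_scores, noncase_scores):
    """Probability that a random case outscores a random noncase (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


def closest_to_corner(rows):
    """Cutoff whose ROC point lies nearest the upper left-hand corner (0, 1)."""
    return min(rows, key=lambda r: r[1] ** 2 + (1 - r[2]) ** 2)[0]


# Invented scores for illustration only.
CASES = [2, 5, 7, 9, 12]
NONCASES = [0, 1, 1, 2, 3, 5]

if __name__ == "__main__":
    rows = roc_points(CASES, NONCASES)
    for cutoff, fpr, tpr in rows:
        print(f"cutoff >= {cutoff:2d}: FPR={fpr:.2f}  TPR={tpr:.2f}")
    print("AUC (trapezoid):", round(auc_trapezoid(rows), 3))
    print("AUC (pairwise): ", round(auc_pairwise(CASES, NONCASES), 3))
    print("statistically optimal cutoff:", closest_to_corner(rows))
```

The two AUC functions agree, which is the sense in which the area equals the probability of correctly ordering a random case and noncase. The cutoff returned by `closest_to_corner` is only the statistical optimum; as the text notes, the clinical optimum depends on the relative costs of false positives and false negatives.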
Other uses of ROC analysis in psychiatry have included evaluating the ability of the GHQ to detect cases in respondents with low or high levels of education [54] and the performance of the GHQ in detecting psychiatric cases in individuals who have used no health services, only general medical services, or specialty mental health services [article submitted for publication]. The AUC and its standard error can be easily calculated with a personal computer using a nonparametric technique [55,56].

Figure 1. Example of ROC analysis. Horizontal axis: False Positive Rate (1-Specificity).

At this point, the studies that have involved screening of adult patients will be described briefly to see how they have dealt with the screening issues outlined above. Only studies that include randomized feedback of the results from the psychiatric screening instruments to the clinician will be considered, because this is the method required to truly test the effects of the screening procedure. However, even this design is not pure if the control group completes the screening questionnaire and the results are not given to the clinician, because the patient will have been sensitized to psychiatric issues by completing the screening questionnaire. This might lead to more patient-initiated discussion of mental health issues than is usual, but this seems unavoidable if the control patients are to have any psychiatric evaluation.

The first study, by Johnstone and Goldberg [57], was based on 1093 consecutive attenders to Dr. Johnstone’s practice. Three groups were formed for comparison: 1) patients with high GHQ scores already known to the practitioner, 2) patients with high GHQ scores not known to the practitioner, whose results were fed back to the practitioner, and 3) patients with high GHQ scores not known to the practitioner, whose results were not fed back. Follow-up was conducted with a clinical interview and readministration of the GHQ in the patient’s home 1 year after the initial GHQ was completed. The study does reveal the value of an adequate time for follow-up. At 3 months, groups 1 and 2 had a greater percentage “feeling well” than the undetected patients in group 3. However,
the groups were equal at 1 year. These data may be susceptible to a type of recall bias, as the time course of symptoms is based on patients’ recall when interviewed 1 year later. The screening process sensitizes the patient to psychiatric issues and could bias reports of the duration and acknowledgement of psychiatric symptoms, especially in the short term. The authors also reported that feedback improved the outcome for those with severe psychiatric morbidity and had little effect for those patients with mild psychiatric morbidity. The terms mild and severe were not clearly defined. Although the trial was based on one practitioner, the assessments were not blind, and there was little description of the disorders present, the study did make the important distinction between patients previously identified and patients with undetected psychiatric disorders. A psychiatric screening instrument has to concentrate on this group of previously undetected patients.

Four studies have investigated how feedback of psychiatric screening scores affected the documentation of depression on the medical chart. Moore and colleagues [30] reported the results of providing feedback of the SDS to family practice residents. The study included 212 patients. For one half of the patients who scored above 50 on the SDS, a note was placed on the chart indicating either major or minor depression. The control patients had a note on their chart indicating only that they had been screened. Feedback of the screening results increased the percentage of depressed patients with depression noted on the chart from 22% to 56%. Linn [58] completed a similar study with 150 patients in a university hospital general medical clinic. The notation of depression in the screened patients with feedback increased from 8% to 25%.
Physicians who received the feedback before the patient visit and those who received it afterward had similar documentation, which may indicate that they were inappropriately using the SDS as a diagnostic rather than a screening instrument. The Moore study did not mention the effect on treatment, and the Linn study found only a slight increase in documented treatment for the patients who had feedback given to their physicians. Neither study could provide information on the types of patients uncovered only with the feedback of screening results; the links to treatment and the efficacy of that treatment were not studied. Linn did discuss the difficulties of studying one visit in an entire episode of depression. Recognition and treatment often do not occur on the first visit, and this can only be captured if the episode
of illness, not a single visit, is the unit of study. This is particularly true if the goal of a psychiatric screening effort is treatment early in the episode.

Later studies on how feedback of screening scores affects documentation of psychiatric disorders have included the Marshfield study [11] and one by Rucker and associates [59]. The Marshfield study included 1452 patients of 16 practicing primary care providers at a rural medical clinic. Feedback of GHQ scores was provided in a randomized manner. The recording of a psychiatric disorder on a separate study record was not altered for the entire group (16.8% vs. 16%) or for the individuals with high GHQ scores (29% vs. 30%) when feedback was provided. It is unclear whether this lack of effect was due to the physicians’ attitudes or psychiatric knowledge base. The group of physicians did have previous experience with psychiatric screening scores, and baseline documentation of psychiatric disease in the clinic had already increased from 3% to 15%. Moreover, if physicians incorporate most of the questions from the GHQ into their usual interview, feedback of scores will add little to identification.

In Rucker’s study, 375 patients completed the Beck Depression Inventory (BDI) before seeing their resident physician in a general internal medicine clinic. After the visit, the 36 residents were asked to rate the patients for depression on a special form, and then the results of the BDI were provided. The residents could then reinterview the patient if desired before indicating how useful they thought the BDI was in that case. The physicians indicated that 17% of the patients were more depressed than they expected and 4% were less depressed than they expected. The physicians stated that the results altered their plan 21% of the time and the BDI was “useful in any way” 58% of the time.
Although this study neither used rigorous definitions of psychiatric disorder nor assessed the efficacy of treatment, it did concentrate on what the clinician feels about the information from the BDI on a specific case-by-case basis. This is unlike global ratings of the utility of psychiatric screening information, which may be more affected by preconceived prejudices. Of the residents, 51.5% indicated that the BDI facilitated frank discussions with their patients. It would have been interesting to see how this affected the physician’s and patient’s ratings of overall satisfaction with the visit as well. Practical considerations of how a psychiatric screen affects the length of visit, percent of patients who return for follow-up, and ease of referral will all significantly influence how many primary care providers would use these screening questionnaires. No information on these issues was provided.

Zung and coworkers [60] took these studies a step further by studying how recognition affected treatment and outcome. The SDS was administered to 1086 patients who had not had depression diagnosed in a family practice center. Of these, 143 scored above 55 and were symptomatically depressed. These patients were then randomized, using a 2:1 ratio, into “identified” and “nonidentified” patients, with the identified patients’ screening scores placed on the chart before the visit. Only 12% of the patients scoring high on the SDS had a chief complaint of a psychiatric nature. Assessment of 63% of these patients was completed 4 weeks later with a repeat SDS. Improvement was defined as a decrease of 12 points on the SDS. Of the identified patients, 50% were treated with a tricyclic antidepressant, usually maprotiline. At 4 weeks, 18% of the control patients, 28% of the untreated identified patients, and 64% of the identified treated patients had improved. The follow-up did not extend beyond 4 weeks. The numbers of patients were relatively small and this is only the experience of one clinic, but this study did evaluate the most important variable, patient outcome, and found a benefit from screening at 4 weeks.

Only one other published study looking at how feedback of psychiatric screening scores affects detection and management of psychiatric disorders in general medical patients was located. The Secondary Prevention Study [12] administered the GHQ to 1242 consecutive attenders to a university hospital internal medicine clinic. Clinicians (physician assistants, residents, and faculty) were asked to fill out a form on diagnosis and management of psychiatric disorders after each visit. By random assignment in half the visits, the clinicians received the results of the GHQ before the visit.
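The randomized feedback designs described above (a 2:1 allocation in the Zung study, feedback in half the visits in the Secondary Prevention Study) and the improvement criterion used by Zung can be sketched in a few lines of Python. This is an illustration under stated assumptions only: neither report describes its allocation procedure, so the shuffled-block scheme, the function names, and the reading of the criterion as "a drop of at least 12 points" are all hypothetical.

```python
import random
from typing import Dict, List


def allocate(patient_ids: List[int], n_feedback: int, n_control: int,
             seed: int = 0) -> Dict[int, str]:
    """Assign each patient to 'feedback' or 'control' in a fixed ratio.

    Patients are taken in blocks of n_feedback + n_control, and the arm
    labels within each block are shuffled. This blocking scheme is an
    assumption, not the procedure reported by the studies.
    """
    rng = random.Random(seed)
    block_size = n_feedback + n_control
    assignment: Dict[int, str] = {}
    for start in range(0, len(patient_ids), block_size):
        block = ["feedback"] * n_feedback + ["control"] * n_control
        rng.shuffle(block)
        # zip stops early for a final, incomplete block
        for pid, arm in zip(patient_ids[start:start + block_size], block):
            assignment[pid] = arm
    return assignment


def improved(baseline_sds: int, followup_sds: int, min_drop: int = 12) -> bool:
    """Improvement read as a decrease of at least min_drop SDS points."""
    return baseline_sds - followup_sds >= min_drop


# 143 high scorers allocated 2:1; 1242 visits allocated 1:1.
zung_arms = allocate(list(range(143)), n_feedback=2, n_control=1)
visit_arms = allocate(list(range(1242)), n_feedback=1, n_control=1)
```

Blocking keeps the arms close to the intended ratio throughout enrollment; simple coin-flip allocation would serve equally well for illustration but lets the ratio drift in small samples.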
The patients were followed for 6 months with a repeat GHQ administered at the end of the study period. The feedback of results for GHQ-positive respondents overall increased detection from 53% to 60% (nonsignificant), but did significantly improve detection in those groups with lower detection rates (males, nonwhites, and the elderly). Formal management activities (physician counseling, psychotropic drug prescriptions, and referral) were not significantly affected by feedback. GHQ scores and patient self-assessment of emotional health at 6 months were not significantly better for the feedback group either. This experiment was not accompanied by
any educational effort to inform the physicians of how to use the results of the GHQ. Regardless of the lack of documented effects, 75% of the 32 clinicians felt the information was mildly to very useful and 48% wished to have the information routinely available. It is possible that the study did not capture the important ways in which the screening scores were helpful to the clinicians. The clinicians had a relatively high rate of detection (73%) of individuals with anxiety and affective disorders as determined by the Diagnostic Interview Schedule, so that there was not as much possible gain from screening feedback as there might have been in other studies. However, this does not mean that the clinician identified the “correct” psychiatric disorder or formulated a reasonable treatment plan. The Secondary Prevention Study did not include an educational component for the clinicians, which may be the reason there was minimal effect from the feedback of the GHQ.

Recommendations for treatment of psychiatric disorders in primary care are based more on clinical impressions than on empirical trials. There are few guidelines the primary care physician can use to direct treatment of psychiatric disorders. When there is little consensus on appropriate care, the variation in practice will be large. Before psychiatric screening practices are recommended as routine medical practice, there needs to be more research and formulation of practical guidelines so that primary care clinicians are confident they know how to use the data from the psychiatric screening instruments. Although expansion of the data base is essential, Schulberg [61] has recently described a multifaceted conceptual model of how to educate primary care physicians about depression, including components other than psychiatric knowledge.
Summary

In 1968, when reviewing the case for psychiatric screening, Wilson and Jungner [9] felt they could not recommend screening for anxiety or affective disorders until there were clinical trials to determine which patients, if any, would benefit from early treatment. In a similar manner almost 20 years later, Frame [62] stated, “Screening for depression is not indicated because there is no evidence that early diagnosis of unrecognized symptoms results in net benefit to the patient” (p. 33).

In comparison with screening practices in general medicine, there has been a significant amount of research on the utility of screening for psychiatric disorders in primary care. Proper recognition and management of psychiatric disorders in primary care remains an important challenge. The guidelines for the evaluation of screening of nonpsychiatric disease as outlined by the WHO can serve as a useful guide to direct future research in screening for psychiatric disorders as well. Management of psychiatric disorders and the relative efficacy of detection of early, as opposed to later, psychiatric disorders appear to be the areas where new information would have the greatest utility in determining the proper role for screening to detect psychiatric disorders in primary care.

I would like to thank Douglas Kamerow, M.D. and Barbara Burns, Ph.D. for review and improvement of the manuscript.

References

1. Frank JW: Occult-blood screening for colorectal carcinoma: The yield and costs. Am J Prev Med 1:18-24, 1985
2. Regier D, Goldberg I, Taube C: The de facto US mental health services system: A public health perspective. Arch Gen Psychiatry 35:685-693, 1978
3. Schurman R, Kramer P, Mitchell J: The hidden mental health network: Treatment of mental illness by nonpsychiatric physicians. Arch Gen Psychiatry 42:89-94, 1985
4. Shepherd M, Cooper B, Brown AC, Kalton G: Psychiatric Illness in General Practice. London, Oxford University Press, 1981
5. Wise T: Commentary: Psychiatry and primary care. Gen Hosp Psychiatry 7:202-204, 1985
6. Mumford E, Schlesinger H, Glass G, Patrick C, Cuerdon T: A new look at evidence about reduced cost of medical utilization following mental health treatment. Am J Psychiatry 10:1145-1158, 1984
7. Sackett DL, Haynes BH, Tugwell P: Clinical Epidemiology: A Basic Science for Clinical Medicine. Boston, Little, Brown, 1985
8. Cochrane AL, Holland WW: Validation of screening procedures. Br Med Bull 27:3-8, 1971
9. Wilson JMG, Jungner G: Principles and Practice of Screening for Disease. Geneva, World Health Organization, 1968
10. Kessler LG, Burns BJ, Shapiro S, et al: Psychiatric diagnoses of medical service users: Evidence from the Epidemiological catchment area program. Am J Public Health 77:18-24, 1987
11. Hoeper EW, Nycz GR, Cleary PD, et al: Estimated prevalence of RDC mental disorder in primary medical care. Int J Ment Health 8:6-15, 1979
12. Shapiro S, German PS, Skinner EA, et al: An experiment to change detection and management of mental morbidity in primary care. Med Care 25:327-339, 1987
13. Kamerow DB, Pincus HA, MacDonald DI: Alcohol abuse, other drug abuse, and mental disorders in medical practice: Prevalence, costs, recognition, and treatment. JAMA 255:2054-2057, 1986
14. Schecter MT, Miller AB, Baines CJ, et al: Selection of women at high risk of breast cancer for initial screening. J Chron Dis 39:253-260, 1986
15. Marks JN, Goldberg DP, Hillier VF: Determinants of the ability of general practitioners to detect psychiatric illness. Psychol Med 9:337-353, 1979
16. Hoeper EW, Nycz GR, Kessler LG, et al: The usefulness of screening for mental illness. Lancet 2:33-35, 1984
17. Robins LN, Helzer JE, Weissman MM, et al: Lifetime prevalence of specific psychiatric disorders in three sites. Arch Gen Psychiatry 41:949-958, 1984
18. Kramer M, German PS, Anthony JC, Von Korff M, Skinner EA: Patterns of mental disorders among the elderly residents of eastern Baltimore. J Am Geriatr Soc 33:236-245, 1985
19. German PS, Shapiro S, Skinner EA, et al: Detection and management of mental health problems of older patients by primary care providers. JAMA 257:489-493, 1987
20. Hutchinson GB: Evaluation of preventive services. J Chron Dis 11:497-503, 1960
21. Veteran’s Administration Cooperative Study on Antihypertensive Agents: Effects of treatment on morbidity in hypertension. JAMA 213:1143-1152, 1970
22. Shapiro S: Evidence of screening for breast cancer from a randomized trial. Cancer (suppl) 39:2772-2782, 1977
23. Feinleib M, Zelen M: Some pitfalls in the evaluation of screening programs. Arch Environ Health 19:412-418, 1969
24. Link B: Mental patient status, work, and income: An examination of the effects of a psychiatric label. Am Sociol Rev 47:202-215, 1982
25. Phillips D: Public identification and acceptance of the mentally ill. Am J Pub Health 56:755-763, 1966
26. Leaf PJ, Livingston MM, Tischler GL, Weissman MM, Holzer CE, Myers JK: Contact with health professionals for the treatment of psychiatric and emotional problems. Med Care 23:1322-1337, 1985
27. Bursztajn H, Barsky AJ: Facilitating patient acceptance of a psychiatric referral. Arch Intern Med 145:73-75, 1985
28. Kessler LG, Cleary PD, Burke JD: Psychiatric disorders in primary care: Results of a follow-up study. Arch Gen Psychiatry 42:583-587, 1985
29. Sargent M: NIMH Report: Prevention studies focus on infants, schoolchildren, adults. Hosp Community Psychiatry 38:455-456, 1987
30. Moore JT, Silimperi DR, Bobula JA: Recognition of depression by family medicine residents: The impact of screening. J Fam Prac 7:509-513, 1978
31. Klerman GL: Practical issues in the treatment of depression and mania. In Paykel ES (ed), Handbook of Affective Disorders. New York, Guilford, 1982
32. Barlow DH, Beck JG: The psychosocial treatment of anxiety disorders: Current status, future directions. In Williams JBW, Spitzer RL (eds), Psychotherapy Research: Where Are We and Where Should We Go? New York, Guilford Press, 1984
33. Liebowitz MR: Imipramine in the treatment of panic disorder and its complications. Psychiatr Clin N 8:37-47, 1985
34. Chouinard G, Annable L, Fontaine R, et al: Alprazolam in the treatment of generalized anxiety and panic disorders. A double-blind placebo-controlled study. Psychopharmacology 77:229-233, 1982
35. Marks IM: Controlled trial of psychiatric nurse therapists in primary care. Br Med J 290:1181-1184, 1985
36. McCrady BS, Sher KJ: Alcoholism treatment approaches: Patient variables, treatment variables. In Tabakoff B, Sutker PB, Randall CL (eds), Medical and Social Aspects of Alcohol Abuse. New York, Plenum, 1983, pp 309-374
37. Blume SB: Is alcoholism treatment worthwhile? Bull NY Acad Med 59:171-180, 1983
38. Popkin MK, Callies AL, MacKenzie TB: The outcome of antidepressant use in the medically ill. Arch Gen Psychiatry 42:1160-1163, 1985
39. Light RW, Merrill EJ, Despars J, et al: Doxepin treatment of depressed patients with chronic obstructive pulmonary disease. Arch Intern Med 146:1377-1380, 1986
40. Bursztajn H, Barsky AJ: Facilitating patient acceptance of a psychiatric referral. Arch Intern Med 145:73-75, 1985
41. Beck AT, Beck RW: Screening depressed patients in family practice. Postgrad Med J 52:81-85, 1972
42. Radloff LS: The CESD scale: A self-report depression scale for research in the general population. Appl Psychol Measurement 1:385-401, 1977
43. Zung WWK: A self-rating depression scale. Arch Gen Psychiatry 12:63-70, 1965
44. Goldberg DP: The Detection of Psychiatric Illness by Questionnaire: A Technique for the Identification and Assessment of Nonpsychotic Psychiatric Illness. London, Oxford University Press, 1972
45. Parloff MB, Kelman HC, Frank JD: Comfort, effectiveness, and self-awareness as criteria of improvement in psychotherapy. Am J Psychiatry 111:343-351, 1954
46. Goodchild ME, Duncan-Jones P: Chronicity and the general health questionnaire. Br J Psychiatry 146:55-61, 1985
47. Cavanaugh S, Clark DC, Gibbons RD: Diagnosing depression in hospitalized medically ill. Psychosomatics 24:809-815, 1983
48. Wells KB: Depression as a tracer condition for the National Study of Medical Care Outcomes: Background Review. Santa Monica, Rand Corporation, 1985
49. Murphy JJ: Psychiatric Instrument Development for Primary Care Research: Patient Self-Report Questionnaire, Final Report for Contract No. 80M014280101D, National Institute of Mental Health, Rockville, MD, 1980
50. Hough RL, Landsverk JA, Stone JD, et al: Comparison of Psychiatric Screening Questionnaires for Primary Care Patients. Final Report for Contract No. 278-81-0036, National Institute of Mental Health, Rockville, MD, 1982
51. Kass F, Charles E, Klein DF, Cohen P: Discordance between the SCL-90 and therapists’ psychopathology ratings. Arch Gen Psychiatry 40:389-393, 1983
52. Schulberg HC, Saul M, McClelland M, et al: Assessing depression in primary medical and psychiatric practices. Arch Gen Psychiatry 42:1164-1170, 1985
53. Swets JA, Pickett RM: Evaluation of diagnostic systems: Methods from signal detection theory. New York, Academic, 1982
54. Mari JJ, Williams P: Misclassification by psychiatric screening questionnaires. J Chron Dis 39:371-378, 1986
55. Hanley JA, McNeil BJ: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29-36, 1982
56. Centor RM: A VisiCalc program for estimating the area under a receiver operating characteristic (ROC) curve. Med Decis Making 5:149-156, 1985
57. Johnstone A, Goldberg D: Psychiatric screening in general practice. Lancet 1:605-608, 1976
58. Linn LS, Yager J: The effect of screening, sensitization, and feedback on notation of depression. J Med Educ 55:942-949, 1980
59. Rucker L, Frye EB, Cygan RW: Feasibility and usefulness of depression screening in medical outpatients. Arch Intern Med 146:729-731, 1986
60. Zung WWK, Magill M, Moore JT, George DT: Recognition and treatment of depression in a family medicine practice. J Clin Psychiatry 44:3-6, 1983
61. Schulberg HC, McClelland M: A conceptual model for educating primary care providers in the diagnosis and treatment of depression. Gen Hosp Psychiatry 9:1-10, 1987
62. Frame PS: A critical review of adult health maintenance: Part 4. Prevention of metabolic, behavioral, and miscellaneous conditions. J Fam Prac 23:29-39, 1986