Quality of Life and Diagnostic Imaging Outcomes
David Seidenwurm, MDa, Robert Rosenberg, MDb
The US Preventive Services Task Force recently promulgated revised guidelines for screening mammography. Criticisms were related to the undervaluation of future lives saved and the overvaluation of negative impacts of mammography. Radiologists downplayed quality-of-life factors, potentially understating the value of all imaging procedures. The task force’s recommendations for core needle biopsy, based on similar conceptual frameworks, were not met with equivalent responses. Full appreciation of the costs and benefits of screening provides the basis for making the best decisions for individuals and populations. This is undermined by the mixed messages that patients and physicians receive during clinical encounters and through other means. Quantitative approaches to medical care are valid on their own terms and when evaluated in the individual context. Insights from behavioral economics and political science inform discussion of population-based medical interventions. Preventing harm from medical interventions satisfies both the “primum non nocere” dictum and the loss aversion heuristic concordantly. The most effective medical care is provided when benefits are maximized and complications are minimized, especially when the harms occur immediately and the benefits are delayed. The importance of both quality of life and longevity in health care decision making requires minimizing negative impacts of mammography when screening low-risk populations. Current practice differs significantly from the successful randomized trials, front-loading the costs of false-positive examinations and overtreatment. By decreasing false-positive mammographic results through adherence to ACR BI-RADS® recommendations, radiologists can answer critics of early and frequent screening while still reducing cancer deaths.
Key Words: Practice guidelines, performance measures, US Preventive Services Task Force, breast cancer screening, randomized trials, risk benefit, adverse effects, adverse events
J Am Coll Radiol 2010;7:265-268. Copyright © 2010 American College of Radiology
The recent controversy concerning new guidelines governing the timing of mammographic screening for breast cancer has revealed a fundamental contradiction in our thinking about the utility of diagnostic imaging. Although much of the discussion has centered on the effects of annual screening beginning at age 40 on all-cause mortality in women of average risk, another central strain in the discussion has involved the relative importance, or unimportance, of the anxiety caused by false-positive findings and the psychological, as well as the physical, consequences of overdiagnosis and overtreatment. The anxiety argument against screening has been dismissed as demeaning to women, and radiologists have emphasized the benefits of annual screening between the ages of 40 and 50. Because the mortality effects of screening low-risk populations are small, the consideration of quality-of-life factors is reasonable if significant disutility can be
ascribed to them. Screening takes time, causes anxiety, and results in discomfort among women, so disutility is present, and therefore it is reasonable to consider the psychological as well as the physical consequences of screening [1-20]. Radiologists argue at their peril that anxiety and the other adverse impacts on quality of life are insignificant in determining the comparative effectiveness or relative value of diagnostic imaging. The ramifications of the argument that anxiety is unimportant in defining the efficacy of imaging are complex and require further exploration before this point of view can be accepted by our specialty. This is because a principal justification for imaging not obviously related to improved health outcome by standard metrics is based in substantial part on the argument that negative results in a patient with nonspecific symptoms relieve anxiety and improve the health of the patient for precisely that reason [2,4,8,9,11,12,16,17].
a Radiological Associates of Sacramento, Sacramento, California.
b University of New Mexico, Albuquerque, New Mexico.
Corresponding author and reprints: David Seidenwurm, MD, Radiological Associates of Sacramento, 1500 Expo Parkway, Sacramento, CA 95815.
Journal of the American College of Radiology/ Vol. 7 No. 4 April 2010
© 2010 American College of Radiology 0091-2182/10/$36.00 ● DOI 10.1016/j.jacr.2010.01.001
QUALITY-OF-LIFE FACTORS IN IMAGING
MRI for head, back, or joint pain; CT for abdominal or chest pain; and ultrasound for pelvic pain without “red
flags” suggesting rapidly progressive, specifically treatable disorders will yield pathologic findings at rates that do not differ substantially from those observed in unselected subjects. Similarly, in the setting of trauma, the rate of surgically treatable pathology is vanishingly low when patients do not exhibit certain well-defined clinical features. Yet these patients are frequently imaged with the justification that negative results are reassuring to patients and their physicians. This apparent inconsistency seems self-serving and lacking in logical foundation, and careful analysis of the problem does not provide satisfying explanations for current practices.
Consider the trade-offs involved in screening women annually between 40 and 50 years of age for breast cancer. Screening about 1800 women for 10 years will result in one death postponed by about 20 years, with that benefit beginning, perhaps, 10 years after the first screening round. In this cohort, about half the women will have false-positive results, several hundred will have biopsies of various sorts, and about 10 will be treated for breast cancer with various combinations and sequences of surgery, radiation, and chemotherapy. Anxiety during the days or weeks it takes to sort out the hundreds of false-positive findings is surely relevant when the induced uncertainty is existential. Does this balance out the reduced anxiety associated with a negative result? What about the additional physical and psychological trauma that results from inevitable diagnoses of ductal carcinoma in situ that would not progress? Also important, but difficult to model, is the cost to society of the induced cancer consciousness for women. All of this ignores the costs of lost work, child care, transportation, and so on, not to mention the actual costs of the procedures themselves [3,13]. These costs and complications of mammography as currently practiced are front-loaded, whereas the benefits of mammography occur far in the future.
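The round figures quoted above can be turned into a rough back-of-envelope calculation. The sketch below is only an illustration, not a validated model: the cohort numbers are the approximate ones given in the text, while the variable names, the helper function, and the 3% annual discount rate are assumptions introduced here to show how a delayed benefit might be weighted against costs that arrive immediately.

```python
# Back-of-envelope arithmetic for the screening cohort described in the text.
# Values marked "from the text" are the article's round figures; the names,
# the helper function, and the 3% rate are hypothetical and for illustration.

WOMEN_SCREENED = 1800          # from the text: women screened per death postponed
YEARS_OF_SCREENING = 10        # from the text: annual screening from age 40 to 50
LIFE_YEARS_GAINED = 20         # from the text: one death postponed by about 20 years
FALSE_POSITIVE_FRACTION = 0.5  # from the text: "about half" have false-positive results
BENEFIT_DELAY_YEARS = 10       # from the text: benefit begins perhaps 10 years out
ILLUSTRATIVE_RATE = 0.03       # assumption: NOT a figure from the article


def discount_factor(rate: float, years: float) -> float:
    """Present-value weight of an outcome that occurs `years` from now."""
    return 1.0 / (1.0 + rate) ** years


# Total annual examinations performed to postpone one death.
screening_exams = WOMEN_SCREENED * YEARS_OF_SCREENING

# Women who experience at least one false-positive result.
women_with_false_positive = FALSE_POSITIVE_FRACTION * WOMEN_SCREENED

# Average gain in life expectancy spread over everyone screened, in days.
days_gained_per_woman = LIFE_YEARS_GAINED * 365.25 / WOMEN_SCREENED

# Weight applied to a benefit that starts after the assumed delay.
delayed_benefit_weight = discount_factor(ILLUSTRATIVE_RATE, BENEFIT_DELAY_YEARS)

print(f"Screening exams per death postponed: {screening_exams}")
print(f"Women with a false-positive result:  {women_with_false_positive:.0f}")
print(f"Average days gained per woman:       {days_gained_per_woman:.1f}")
print(f"Weight on a benefit {BENEFIT_DELAY_YEARS} years away: {delayed_benefit_weight:.2f}")
```

Under these assumptions the average gain per woman screened is on the order of a few days, which is why even brief episodes of anxiety spread across hundreds of women can matter in the balance. A zero discount rate leaves the delayed benefit at full weight; any positive rate shrinks it relative to the immediate harms.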
Thus, any disutility attributed to false-positive results, even if transient, is significant at the population level. If any nonzero discount rate is applied to the calculation, the impacts are greater still because many years elapse between the time the costs are incurred and the time the benefits are accrued. Because almost all of us prefer to postpone harm and secure immediate benefits, it is reasonable to consider discounting in the calculation.
Consider, by contrast, the case of a headache patient. Even if the rate of discovery of treatable brain tumor, hydrocephalus, arteriovenous malformation, aneurysm, or other pathology is low, the benefit of reduced anxiety is achieved immediately, and the benefit is experienced by a large proportion of patients examined for this indication. Thus, the impacts of discounting are negligible, and the number needed to “treat” is small, presumably because patients are relieved of anxiety when they are informed of negative results. The anxiety relief provided by the demonstration of a tumor-free brain is gained immediately, as is the opportunity to treat a symptomatic lesion when one is discovered.
Another important consideration is that a patient with a headache is already sick, and therefore already a patient. In the case of mammographic screening, on the other hand, the subject is not sick until we make her so. Because, by definition, the utility of health states in a sick patient is lower than that in a healthy subject, there is more to lose in the case of the healthy subject, and the dictum of “primum non nocere” has more force in screening. Thus, the classical concept of the physician-patient relationship and the more recent quantitative method are, as usual, concordant.
EVIDENCE AND QUANTITATIVE REASONING IN DIAGNOSTIC IMAGING
Another possible distinction between the anxiety caused by false-positive mammographic results and that relieved by true-negative results on abdominal CT searching for pancreatic cancer in the setting of pain is that the utility of screening in mammography has been demonstrated in randomized trials, whereas the utility of what amounts to screening for pancreatic cancer in low-yield populations has not. This is a persuasive argument that would be even more persuasive if mammography were practiced in a fashion closely resembling or superior to that which prevailed in the clinical trials; that is, if the rates of false-positive results in clinical practice were similar to those that prevailed in the randomized trials. The argument that false-positive findings were a neutral component of mammographic screening, rather than a complication, would be more convincing if rates of false-positive results in each trial correlated well with the observed effects of mammography on cancer mortality among those trials.
However, false-positive rates in current practice in the United States are perhaps triple those in the randomized trials, and the trial with the highest recall rates showed the smallest impacts. Rigorous reliance on randomized trial data cannot, therefore, explain the differences in our approach to patient anxiety either [7,13,14,18].
What we are left with, evidently, is either a difference of opinion about, or a lack of understanding of, quantitative reasoning in medicine. This is not a new problem, nor is it one that is easily resolved. Medical reasoning has been based on an analysis of individual patients and the experiences of physicians. It has only been in the past two centuries that quantitative data related to populations of patients have been added to the mix. The first clinical study using multiple patients and a systematic design analyzed the early and late application of venous bleeding in the setting of pneumonia, erysipelas, and tonsillar angina and showed no difference between the two treatment groups in mortality and the duration of symptoms. Because the timing of bleeding, then considered a critical variable in the efficacy of the procedure, made no difference, the study called the entire practice into question. Responses to the report of the study were predictable [6]. The author opened himself to criticism on multiple fronts not only by studying the fraction of patients who demonstrated a specific characteristic or outcome but also by using averages. These parametric measures of population attributes shifted the focus of medicine from individual patients to the group of which those patients were members. This seemed at the time to be a radical departure from the focus on the individual.
In the case of mammography, the situation is somewhat different because the harms are to patients we know and can identify precisely, whereas the benefit is granted to patients we cannot identify directly. This is because we cannot tell which of the patients who receive cancer diagnoses have truly been helped relative to what would have happened had their tumors become evident clinically. We do, however, know each of the patients whose false-positive or true-positive mammographic results are confirmed at diagnostic evaluation and, in addition, each of those whose biopsies result in cancer diagnoses and, of course, each of those whose further imaging workup or subsequent follow-up studies reveal metastases. The salience of patient anxiety is decreased for us in this manner because we receive misleading positive reinforcement every step of the way, just as the salience of anxiety is misleadingly increased for us by dramatic presentations of pain, even when the likelihood of tumor or infection is low [10,19,20].
INDIVIDUAL AND GROUP UTILITY
Not only in medicine does propinquity lead to a counterintuitive result as to the relative importance of particular impacts.
In public choice theory, the focused interests of the few are shown to overcome the diffuse interests of the many, despite aggregate disutility for society as a whole. If we are the recipients of a subsidy, for example as producers of mohair, a substance whose strategic use has been obsolete for half a century, we are able to exert more pressure on the legislature to preserve our sinecures than the rest of the citizenry are able to exert to eliminate them. Although the mohair producers are few in number, each derives substantial benefit. The remaining population each pays only negligible amounts to support the subsidy, so sufficient objection to overcome the strong voices of the few is not justified for any one individual. A closely related phenomenon in behavioral science is the availability heuristic, in which we mistake the significance of observations on the basis of our individual experience.
SOLVING THE PROBLEM
The solution to the problem at hand, the proper relative valuation of anxiety and other quality-of-life factors in the determination of the risks and benefits of any medical procedure, is to recognize that a period of time worried that one will die prematurely of a treatable cancer is a negative health state that is to be avoided, if at all possible. The adverse consequences of cancer anxiety are equally important, whether they are caused by spontaneously occurring pain or by a false-positive result of a screening procedure. Our efforts to alleviate or prevent anxiety should be equally intensive in each case, and the most effective and efficient methods should be used regardless of the cause.
In the case of cancer anxiety caused by pain, the most efficient validated method for alleviating anxiety may be reassurance by a trained physician who is able to take the time to explain to a patient the most likely causes of the patient's problem, propose evidence-based action to remediate the symptoms, and stress the low likelihood that an underlying malignancy or other rapidly progressive disorder is present. For the anxiety induced by false-positive results on mammography, the most efficient evidence-based approach is to reduce the number of false-positive results while maintaining a sufficient cancer detection rate to derive the benefits of screening average-risk women in a relatively low-risk decade of their lives.
Two approaches achieve this result, alone or in combination. We can reduce the frequency of screening, though this seems unlikely to be feasible at present. Alternatively, we can directly reduce the frequency of false-positive findings by practicing mammography in a manner similar to that practiced in the randomized trials that proved its benefits. Thus, a consistent approach to imaging and patients mandates that recall rates be minimized.
Targeting the levels specified in the BI-RADS® desirable goals, which are double or triple those in the successful trials, and eliminating short-interval follow-up studies from screening as specified in the BI-RADS manual would reduce the burden of false-positive findings substantially, because half of mammographers recall more than 10%. If we cite particular studies when justifying a medical procedure, we ought to be willing to adhere to the standards set by the practitioners in those trials. Of course, this should be particularly true with respect to screening in low-risk patient subgroups [3,15].
REFERENCES
1. American College of Radiology. Breast Imaging Reporting and Data System® (BI-RADS®). 4th ed. Reston, Va: American College of Radiology; 2003.
2. American College of Radiology, Society of Breast Imaging. USPSTF mammography recommendations will result in unnecessary breast cancer deaths each year. Reston, Va: American College of Radiology; 2009.
3. Brewer N, Salz T, Lille S. Systematic review: the long-term effects of false-positive mammograms. Ann Intern Med 2007;146:502-10.
4. Cochrane AL, Holland WW. Validation of screening procedures. Br Med Bull 1971;27:3-8.
5. Davis PC, Wippold FJ II, Brunberg JA. ACR Appropriateness Criteria on low back pain. J Am Coll Radiol 2009;6:401-7.
6. Feinstein A. Two centuries of conflict-collaboration between medicine and mathematics. J Clin Epidemiol 1996;49:1139-43.
7. Humphrey L, Helfand M, Chan B, Woolf S. Breast cancer screening: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med 2002;137:347-60.
8. Jordan JE, Ramirez GF, Bradley WG, et al. Economic and outcomes assessment of magnetic resonance imaging in the evaluation of headache. J Natl Med Assoc 2000;92:573-8.
9. Jordan JE; Expert Panel on Neurologic Imaging. Headache. AJNR Am J Neuroradiol 2007;28:1824-6.
10. Jorgensen K, Gotzsche P. Overdiagnosis in publicly organised mammography screening programmes: systematic review of incidence trends. BMJ 2009;339:b2587.
11. Kirkley A, Birmingham TB, Litchfield RB. A randomized trial of arthroscopic surgery for osteoarthritis of the knee. N Engl J Med 2008;359:1097-107.
12. Kopans D. The recent US Preventive Services Task Force guidelines are not supported by the scientific evidence and should be rescinded. J Am Coll Radiol 2010;7:260-64.
13. Mandelblatt J, Cronin K, Bailey S, Berry D. Effects of mammography screening under different screening schedules: model estimates of potential benefits and harms. Ann Intern Med 2009;151:738-47.
14. Nelson H, Tyne K, Naik A, Bougatsos C, Chan B, Humphrey L. Screening for breast cancer: an update for the U.S. Preventive Services Task Force. Ann Intern Med 2009;151:727-37.
15. Rosenberg R, Yankaskas BC, Abraham LA, et al. Performance benchmarks for screening mammography. Radiology 2006;241:55-66.
16. Society of Breast Imaging. Society of Breast Image talking points: re: USPSTF screening guidelines. Available at: http://www.sbi-online.org/associations/8199/files/SOCIETY%20OF%20BREAST%20IMAGING%20TALKING%20POINTS.pdf. Accessed February 6, 2010.
17. Stiell IG, Clement CM, Rowe BH. Comparison of the Canadian CT head rule and the New Orleans criteria in patients with minor head injury. JAMA 2005;294:1511-8.
18. US Preventive Services Task Force. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med 2009;151:716-26.
19. Welch HG, Black WC. Using autopsy series to estimate the disease “reservoir” for ductal carcinoma in situ of the breast: how much more breast cancer can we find? Ann Intern Med 1997;127:1023-8.
20. Zahl P, Moehlen J, Welch G. The natural history of invasive breast cancers detected by screening mammography. Arch Intern Med 2008;168:2311-6.