Results of Expert Meetings
Evaluation of diagnostic imaging technologies and therapeutics devices: Better information for better decisions: Proceedings of a multidisciplinary workshop Robert M. Califf, MD, for the Workshop Participants Durham, NC
We are entering an era in which the success of biomedical science and the increasing understanding of the value of evidence for practice are in a state of tension. This tension is especially notable in the device arena, in which the short life cycles and iterative nature of development are at odds with current design constructs of the types of clinical trials that provide evidence for medical decision making. The financial pressure arising from strained budgets and expanding costs from the aging of the population and the continued development of new technology heightens the need for a focus on new approaches. Given this background, a group of experts representing constituencies with different perspectives was convened for a day and a half to discuss key issues and their potential solutions. Because of the complex and heterogeneous nature of the environments in which devices are used, the meeting focused on 3 broad, general uses of devices: imaging, risk stratification, and therapeutics. The goal of the meeting was to develop a preliminary list of ideas that could be framed as researchable questions or constructs for consideration by policy makers that ultimately might lead to improvements in the current system. Across diagnostic imaging, risk stratification devices, and therapeutic devices, several crosscutting issues can be identified: we need better methods of collaborative funding and priority setting, improved and more flexible evaluation methods, and new approaches to the integration of federal agencies in overseeing the system. (Am Heart J 2006;152:50-8.)
Biomedical devices have been a major source of improvement in the quality of life of Americans. We are entering a period of explosive growth in the development and use of devices in medical care. Although the benefits of biomedical technology are abundant, the potential for negative consequences also exists. In addition, the impending financial pressure exerted by the aging of the population and the expanded capability of new technology (with its associated cost) calls for a continual effort to optimize policies that could enhance the development and use of devices that improve longevity and quality of life while diminishing the use of devices that do not add value. Decisions about when to use a device involve several different components. Device manufacturers must fund the experimental and engineering work needed to develop a device for marketing. Regulators operating
From the Duke Clinical Research Institute, Durham, NC. Funded by a contract (03R000225) from the Agency for Healthcare Research and Quality, Rockville, MD. Submitted October 3, 2005; accepted October 3, 2005. Reprint requests: Robert M. Califf, MD, Duke Clinical Research Institute, PO Box 17969, Durham, NC 27715. E-mail:
[email protected] 0002-8703/$ - see front matter © 2006, Published by Mosby, Inc. doi:10.1016/j.ahj.2005.10.001
under a set of laws, guidances, and precedents must then judge whether the proposed device should be allowed on the market and, if so, approve the labeling of that device for use. Administrators in insurance companies, health systems, and government organizations, most notably the Centers for Medicare and Medicaid Services (CMS), must decide whether the device will be paid for and, if so, what the level of reimbursement will be for performing a procedure in which the device is used. If all of the above decision makers allow the device to progress to clinical availability, the physician then decides in each individual case whether to use the device. To make the best decisions, information is needed at the policy/regulatory level, at the health system level, and at the individual patient/physician interface level. Ideally, this information would enable decision makers to choose devices that optimize health outcomes in the system and would include insight into how the use of a device compares with available alternatives, both device and nondevice. Given unlimited time and resources, each device and its substantial iterations would have an ideal data set that would document its impact on longevity, morbidity, quality of life, and the balance of risks and benefits of using the device in particular clinical situations. However, time and resources are major constraints on the device industry, and concerns about “raising the bar too
American Heart Journal Volume 152, Number 1
high” and thereby dissuading investment in device development have fueled policies that provide the “least burdensome” approach to meeting the needs of society to understand the balance of benefit and risk when using a device. This tension between the desire to optimize available knowledge and the need to avoid stifling innovation requires that policy discussions involve diverse groups with an interest in the use of biomedical devices. Given this background, a group of experts representing constituencies with different perspectives was convened for a day and a half to discuss key issues and their potential solutions. This workshop, held September 14 to 16, 2003, in Rockville, MD, was entitled Evaluation of Diagnostic Imaging Technologies and Therapeutics Devices: Better Information for Better Decisions. Because of the complex and heterogeneous nature of the environments in which devices are used, the meeting focused on 3 broad, general uses of devices: imaging, risk stratification, and therapeutics. The goal of the meeting was to develop a preliminary list of ideas that could be framed as researchable questions or constructs for consideration by policy makers that ultimately might lead to improvements in the current system.
Perspectives
The development and use of technologies can be viewed from a variety of perspectives. All members of the clinical and research community share an overarching goal of optimizing the use of technology to improve the health of individuals and populations to the extent possible. Representatives from CMS, medical providers, health services researchers who evaluate technology, the device industry, and regulators provided an overview of perspectives that would need to be considered in a research agenda on the assessment of biomedical devices.
Centers for Medicare and Medicaid Services
As the US population ages (the population >65 years is expected to increase from 34 million to >70 million between now and 2030),1 CMS has an increasing influence on decisions about which technology will be used because of the issue of reimbursement. Data accruing in national databases make several trends evident: the use of technologies is dramatically heterogeneous in the United States,2 and the United States spends more on healthcare than do other societies yet has relatively poor aggregate health statistics. One approach to improving the value of the CMS investment is to identify effective interventions and not only to reimburse their use but also to encourage their use through quality measures and differential reimbursement. Initial efforts are underway in several
well-studied areas of medicine (acute myocardial infarction, heart failure, and community-acquired pneumonia) where standards are proven and performance measures can be used with confidence. However, most areas of medicine do not have enough evidence to allow clear performance measures to be constructed based on outcomes. The Centers for Medicare and Medicaid Services has the statutory authority to cover services that are considered reasonable and necessary for intervention in illness and injury. Interestingly, despite multiple commissions on the topic, a clear definition of “reasonable and necessary” has never been published. However, CMS has taken the view that the desirable situation is to reimburse for devices that are considered safe and effective by the Food and Drug Administration (FDA) as an initial requirement; beyond that standard, it is also considered desirable to reimburse for technologies that have a demonstrated benefit in health outcomes that the patient experiences (mortality, morbidity, quality of life, or function). Furthermore, it is preferred for the evidence to be generalizable to the Medicare population and supportive of benefit in addition to or compared with other reimbursable alternatives. Although this evidence-based approach is desirable, it is often impossible because of gaps in the evidence, in many cases because conducting research is considered either impossible or financially disadvantageous in the commercial environment. The controversial decisions by CMS about coverage are focused on devices for which definitive outcome data are not available. When definitive trials have been done, as with coronary stents and cardiac defibrillators,3 coverage can follow the results of the trials. The Centers for Medicare and Medicaid Services would prefer to see more solid evidence when coverage decisions are required.
Practitioner’s view
Clinicians may be viewed as having several key objectives in the use of devices: to get the right treatment to the right patient at the right time and to be good stewards of the patient’s healthcare funds. To meet these goals, the clinician needs timely, high-quality information that is relevant to clinical practice and therefore informs the choices that must be made so that technology can be seamlessly integrated into practice. The “iron triangle” for the clinician in the use of technology is made up of the patient, whose values and welfare are primary; the colleagues upon whom the clinician depends for validation of practices; and the insurers/payers who decide what is reimbursed. A particular problem for the clinician is that technologies, especially diagnostic technologies used for screening, often lack a negative feedback system for their use. A negative test is reassuring, and a positive test leads to
gratitude that something was done to prevent an outcome that is thought to be worse.4 In the case of cancer screening, individuals who had a false-positive screening test tended to be positive about having undergone the testing, apparently because of the relief that the false-positive test was incorrect. Unfortunately, the benefits of screening tests are frequently undocumented in objective, outcome-based studies, and numerous examples exist of technology being adopted only to later prove to be either not beneficial or detrimental. One approach of some value to the clinician is the use of reimbursement to stimulate randomized assessment. In the case of lung volume reduction surgery (LVRS), CMS reimbursed the operation only if the patient was randomized into the clinical trial evaluating the approach, which was organized by the National Heart, Lung, and Blood Institute of the National Institutes of Health (NIH).5 Although considerable concern was initially expressed about this approach, it resulted in the identification of groups of patients who benefited or did not benefit from the procedure, thus clarifying future reimbursement and clinical practice decisions.
Health services researchers
The technology assessment researcher, a scarce societal resource, is in a special position to provide insight into the system. However, the researcher is constrained by several limitations in the research and evaluation system. Devices, more so than drugs, have a limited life cycle with an iterative process of development that makes assessment a moving target. Different models and iterations may provide very different results, and the public/provider community is relatively intolerant of “freezing” new technologies to allow evaluation. Particularly for imaging or risk assessment technology, the gold standard is circular in that an assessment that is assured to be more accurate cannot be done. For example, new brain imaging methods can only be compared with older brain imaging methods, because access to the actual brain of living humans is understandably limited. Furthermore, clinical sites that are experts at assessing technology are limited, and the use of only expert sites creates a bias when the results are used to justify proliferation of the technology to nonexpert sites. Clinical studies often show a significantly different value, depending on the expertise of the user. This is a major contemporary issue in medical imaging for cancer detection, affecting almost all submissions to the Center for Devices and Radiological Health in that area.6 Device assessment studies do not typically have the same level of funding as drug evaluation trials, and both are hampered by bias against promotion of faculty who do this type of research.7
American Heart Journal July 2006
Device manufacturers
The device manufacturing community is a heterogeneous blend of companies and people. There are thousands of device companies, and 65% of them have <50 employees.8 An individual device company may manufacture only 1 device or thousands. In recent years, the industry has consolidated for broad device sales, but is still largely characterized by entrepreneurial start-ups. The iterative nature of device development and short life cycles also drive the device industry to resist unnecessary studies and prefer to function in a close relationship with device users to gain feedback on enhancements of existing devices. Much of this is driven by the financial model of the industry, which generally does not believe it is sustainable if premarketing development times are too long and cumbersome or if postmarketing use of devices is unduly restricted. These concerns have led to a different set of rules for the device industry compared with the drug industry, with a much less rigorous approach to documenting clinical benefit and a more flexible approach to incorporating alternative analytical methods.
Regulators
The US FDA regulates the vast array of medical devices available on the US market, ranging from latex gloves to cardiac defibrillators and from blood glucose test strips to magnetic resonance imaging systems. The regulations specify 3 pathways for devices: class I, low-risk devices (for example, tongue blades) have a general exemption from human testing, using general controls; class II devices require premarket notification with limited human testing; class III devices, considered highest-risk, require premarket approval by the FDA and generally require significant evaluation in humans. A class II device must be “substantially equivalent” to a marketed “predicate” device. A class III device must be deemed safe and effective for its intended use.
A device takes a lengthy tour through the system during its lifecycle; major gaps are created because different questions are asked about the device at different stages, and synthesizing the information across time is difficult.9 The pulmonary artery catheter, for example, was “grandfathered” to FDA approval in 1976 and has remained on the market without serious prospective assessment of its value despite a long series of observational studies raising concern about excess mortality.10 Within 2 years of its approval, transmyocardial revascularization for coronary disease was used off label in most cases, with evidence raising concerns about its application in patients with a relative contraindication.11 Controversy concerning the use of the prostate-specific antigen test continues to swirl despite its availability for many years.12 On the other hand, many devices, including coronary stents and many
surgical instruments, have been used predominately off label with well-documented societal benefits. The FDA is often limited in its ability as a single institution to push for outcome data because of regulations that allow 510(k) submissions, which, by definition, do not require clinical data. Furthermore, the FDA acknowledges an appropriate restraint from regulating the practice of medicine. The NIH has had very limited interest in funding pragmatic device questions, although many exceptions can be cited.13 Thus, the FDA is frequently aware of gaps in the data needed to improve clinical practice, but it cannot directly force studies to be done outside the regulatory pathway, and it has no mechanism for directly translating this knowledge to funding sources likely to pay for studies to fill these gaps. Overall, all segments desire better information about the value of devices, but multiple impediments prevent optimization of the system. Considering specific types of devices gives insight into potential approaches to improving the system.
Diagnostic and therapeutic technologies
Predicting the risk of sudden death
Coronary heart disease is the leading cause of death in the United States by a large margin. Close to half of these deaths are “sudden,” in the sense that they occur without adequate warning to seek medical care. The advent of the implantable cardioverter defibrillator (ICD) provides an effective treatment for patients at high risk, as demonstrated in a series of clinical trials.14 These trials have shown a reduction in all-cause mortality in a broad array of patients,15 and although the number needed to treat and cost-effectiveness look attractive compared with many other commonly used therapies, the total societal cost of implanting ICDs in all patients meeting the criteria would be enormous, as several million people are known to have an ejection fraction <35%, the key criterion for the recent series of primary prevention ICD trials. An additional future concern is that although the population with impaired left ventricular (LV) function is at high risk for sudden death, most sudden deaths in the United States occur in people with no previous known symptom of damage to the myocardium. Thus, a broad screening test to identify a population at risk might have major implications for reducing premature death. The desirable operating characteristics of a screening test in the 2 situations are quite different. Patients with known impairment of ventricular function are currently considered to need a defibrillator,16,17 so in this population the major goal of a test would be to identify people who would not benefit from ICD implantation.
Accordingly, a screening test with a very high negative predictive accuracy would be needed, particularly given that the consequence of a misclassification would be sudden death rather than a repairable symptomatic outcome. In contrast, the more general screening test for people without known LV dysfunction would be used to identify people who would benefit from having a defibrillator implanted. This test would need to be highly sensitive because the cost of implanting defibrillators in the general population would be intolerable if there were no high risk of sudden death. To some extent, the growing popularity of coronary calcium scoring18 and measurement of carotid intimal thickness with ultrasound stems from a desire to detect increased risk before a catastrophic event occurs. Yet, broad payment for these tests would cost tens of billions of dollars with uncertain value without a clinical trial to determine whether implementation of the technology (ie, ICDs) based on these measures would lead to an improvement in patient outcomes.19,20 At least 1 model based on the limited available data predicted that coronary calcium scoring would not be within the range of cost-effective interventions.19 In the future, testing for risk of sudden cardiac death is likely to include combinations of genes and proteins that provide not only the probability of sudden death but also the likelihood of other cardiac events. Already in acute and chronic coronary disease, panels of lipids, creatinine values, and protein markers of inflammation (C-reactive protein) and response to hemodynamic changes (brain natriuretic peptides) are used to predict outcome and alter therapy. In the future, proteins will likely be rapidly added and subtracted as new data accumulate about both the ability to predict outcome and the ability of different patterns to determine which therapies are most likely to be successful.
The issues that must be resolved for diagnostic testing for such a broad population are complex. If simply demonstrating a difference in the predictive ability of a test is enough to merit payment, the potential impact on the healthcare budget could be unbearable. On the other hand, requiring proof of diagnostic efficacy could retard development of new diagnostic tests for a problem of vital public health importance. The absence of an efficient evaluation system is evident, and the differing missions of the FDA (viz, to identify devices that meet the claims of those who wish to sell them) and CMS (to pay for devices that help treatment) come into conflict when the costs are enormous or the actual value for health outcomes is unclear.
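The contrast between the 2 screening situations described in this section can be made concrete with a short calculation. The sketch below is illustrative only: it applies Bayes' theorem to a hypothetical test, and the sensitivity, specificity, and prevalence values are assumptions chosen for illustration, not figures from the workshop or from any trial.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(event | test positive)
    npv = true_neg / (true_neg + false_neg)  # P(no event | test negative)
    return ppv, npv


# Hypothetical test characteristics (assumed, not taken from the source).
SENS, SPEC = 0.90, 0.90

# Hypothetical event rates: a high-risk group with impaired LV function
# versus a general population with no known LV dysfunction.
high_risk = predictive_values(SENS, SPEC, prevalence=0.20)
general = predictive_values(SENS, SPEC, prevalence=0.002)

if __name__ == "__main__":
    print("high-risk group   PPV=%.3f NPV=%.3f" % high_risk)
    print("general population PPV=%.3f NPV=%.3f" % general)
```

Under these assumed inputs, the same test has a far lower positive predictive value in the low-prevalence general population than in the high-risk group, whereas its negative predictive value in the high-risk group still leaves some patients with a negative result exposed to an event. Whether either figure is acceptable depends on the clinical and budgetary consequences discussed above.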
Treating benign prostatic hypertrophy: devices to treat a medical problem
With the aging of the population, benign prostatic hypertrophy (BPH) has become a major cause of
impaired quality of life in the United States. Unfortunately, the impact of BPH on symptoms is highly variable and does not correlate well with available physiological measurements, making it difficult to define a population in whom “reasonable and necessary” care should be reimbursed. In addition, a wide variety of approaches to medical care are practiced, including watchful waiting, medications, minimally invasive treatments, and surgery (including open prostatectomy, transurethral prostatectomy, and newer surgical technologies). Only a few randomized trials have been done, although simultaneously, the technology has evolved rapidly. Currently, >10 devices are used to treat BPH, including multiple approaches to produce high temperatures to reduce prostate size through necrosis. Transurethral microwave thermotherapy of the prostate (TUMT), transurethral needle ablation (TUNA), and water-induced thermotherapy (WIT) all have been used. In addition, multiple devices to cool the nonprostate tissue are also used and reimbursed by CMS. A technology evaluation revealed a total of 10 randomized clinical trials,21 obviously providing a base of data inadequate to assess the multiple permutations of these devices. All the devices were said to “work” in the clinical trials, but a sham device also had efficacy. In the postmarketing period, 16 serious injuries due to burning were reported to the FDA’s spontaneous adverse event reporting system. Given the aging of the population, the expanding number of people in whom an ever-increasing number of devices could be used poses a threat to the healthcare budget. The current postmarketing surveillance system does not provide adequate information to assess the balance of risks and benefits for these types of products,22 leaving decision makers at both the individual patient level and the health system level with many judgment calls in the absence of evidence.
This technology demonstrates the difficulties in technology assessment in multiple areas of medicine. Multiple different technologies and combinations of technologies can be applied to the same medical problem. Many medical and surgical specialties do not have a tradition of focusing on medical evidence but have evolved through a more experiential paradigm. This leads to a situation in which polar opposite approaches are paid for, experts truly believe that one approach or another is superior, and no definitive evidence favoring either is available. One approach to this conundrum would be to define the characteristics of successful technologies in treating BPH (eg, heat, cold, ultrasound, microwave, or radiofrequency) that would be independent of the particular device. This approach would require pooling of data across devices, an approach that has been used to create a model for coronary stent registries23 and for assessment of mammography imaging systems.24 However, little funding exists to promote intellectual work on
innovative methods to deal with such problems. As with diagnostic testing, the absence of an efficient system for clinical testing and the dissociation of regulation and payment are also issues that need to be addressed.
Alzheimer disease diagnosis: molecular imaging
The Alzheimer disease epidemic is a product of our success in prolonging life. It affects 3% of people aged 60 to 74 years and 40% of people >85 years, with an estimated total cost of >$140 billion. Although intensive therapy for comorbid disease may have much to offer, current therapies for Alzheimer disease provide modest benefit and are not believed to modify disease progression. The current test for Alzheimer disease is a clinician evaluation in the absence of a positive diagnosis of a different disease. There is reason to believe that molecular medicine is defining targets for both the diagnosis and treatment of Alzheimer disease.25 As this knowledge evolves, an effective diagnostic test could exclude people without Alzheimer disease, identify people with the disease, and provide evidence for differential therapy based on the underlying mechanism and stage of disease. Essentially, by defining the relevant pathophysiological mechanisms with imaging, the clinician could possibly target therapies at the relevant pathway. Current FDA regulations regarding imaging for Alzheimer disease would require longitudinal studies in a mixed patient population with autopsy confirmation as the gold standard. This approach would take 5 to 7 years, during which the technology would continue to evolve and during which new therapies for the disease are likely to be introduced. An alternative approach would be to combine the imaging validation trial with therapeutic trials of therapies directed by the imaging, using clinical outcome and clinician assessment as the gold standard. The combination of diagnostic efficacy evaluation with therapeutic evaluation raises multiple questions about methodology.
The questions of multiplicity, adaptable designs in which the protocol changes as new information becomes available, and combining information from multiple arms within a study require simulation and experimental efforts that have not been worked out yet. Furthermore, conducting the studies requires access to cutting-edge imaging, patient populations, and sites with expertise in clinical research. In essence, the pace of technology development is outstripping our ability to assess it.
Cross-cutting issues
Funding, partnerships, research infrastructure, and priority setting
Despite the vast successes of medical devices, most workshop participants felt that a better system could be
Table I. Key issues in device administration
Funding and partnerships
  Clarifying coordination of funding of federal studies of device value
  Developing public-private partnership rules that enhance joint funding without compromising system integrity
  Developing better operational standards for HIPAA that encourage appropriate data sharing
  Improving clinical research infrastructure
  Synchronizing device approval (FDA) and payment systems (CMS)
Methodology
  Employing Bayesian approaches to iterative device development
  Modeling of life cycle of devices
  Understanding proper extrapolation from efficacy to effectiveness
  Understanding when observational studies can substitute for randomized clinical trials
  Developing designs for linking diagnostic/imaging technologies with therapies
  Devising methods for adjusting for multiplicity when diagnostic tests stratify populations into multiple subpopulations
HIPAA, Health Insurance Portability and Accountability Act.
constructed to generate the needed information to guide decision makers in the use of devices. Such a system would need to recognize the major differences between devices and drugs while also dealing with the increasing tendency to develop drug delivery devices and to couple diagnostic devices with drug therapy. The funding issues in device evaluation are substantial (Table I). The device industry, especially the small, entrepreneurial companies, is not structured to produce the large clinical trials that would be ideal to determine the value of particular devices in different populations. The Clinical Research Roundtable has identified this issue as a general problem in clinical research,7 but it seems particularly serious in the device arena. The Centers for Medicare and Medicaid Services is legally limited in its ability to fund research, the Agency for Healthcare Research and Quality has only a small amount of funding,26 and the NIH does not see itself as having a mandate to answer pragmatic healthcare questions that could be assessed by the industry. Other sectors of the healthcare industry (health systems, insurers, and payers) have not felt that their role was to provide major funding for research. The result is a broadly accepted need without a mechanism for funding support. Inevitably, progress in this arena will require partnerships across the spectrum of interested parties. In particular, the prospect of developing networks of investigative sites and investigators for the purpose of device evaluation was seen as an important topic to explore.
An important area for broad partnership is the ongoing effort to regulate and protect the privacy of patients in the US healthcare systems. Many experts have identified specific ways in which the current Health Insurance Portability and Accountability Act legislation has impeded the ability to gather helpful data for healthcare decisions while providing little real protection of privacy.27,28 A concerted effort is needed to develop operational approaches that would maintain the needed privacy of patients without creating insurmountable barriers for developing the evidence needed to inform patients and their healthcare providers.
One approach that deserves serious consideration is the development of a more sustained clinical research infrastructure. The currently dominant approach is driven by immediate need: manufacturers or government agencies with a need to initiate a study generally identify investigators as they are needed for particular research protocols. If they are no longer needed for an additional study, there is no support for the infrastructure at the research site. This highly inefficient approach leads to huge costs of study initiation due to the need for site identification and qualification, followed by an intensive training period. The NIH, through its Roadmap program (
[email protected]),29 has recently initiated an effort to develop interoperable networks to improve the clinical research infrastructure in the United States. Simultaneously, the FDA has funded the MedSun network (http://www.fda.gov/cdrh/postsurv/medsun.html) of hospitals to provide a stable group of centers that can provide more complete information on device adverse events. However, this network is not capable of performing clinical trials. Decisions about which research protocols to fund are generally made by manufacturers in conjunction with their clinical consultants and approved by the FDA or other regulatory bodies in international studies. Many who have studied this issue have advocated a more coordinated effort to identify the key research priorities by convening the industry, government agencies, payers/insurers, clinicians, and affected patients and their families.30 In this manner, studies of critical public health priority might achieve joint funding from the government and the device industry to enhance the likelihood of getting the needed answers.
Methodology improvement
There is broad discontent with the fit between current methods of clinical evaluation and the pace and configuration of device development (Table I). As discussed above, devices are developed in an iterative process that evolves over a relatively short period compared with drugs. Given the constant nature of efforts to improve mechanical inventions, the life cycle of a particular device may be measured in months. Yet, the impact of the device on the patient may last for years or may lead to changes in the management of disease with broad implications. The current construct of the randomized clinical trial is felt by many to be too constraining and inflexible for the recognized dynamic needs of device technology
evaluation. When a device has a short life cycle, it is simply not feasible to organize the sites, obtain institutional review board approval, negotiate contracts, and get the study done in a time frame that is reasonable from the investment perspective. Methodology is needed that can deal with iterative changes in device design and function, account for the operator/device interface, and begin to incorporate simulation methods into the evaluation. Ample literature documents that device outcomes can be different, depending on the skills and experience of the operator.31 During device development, the answers to 2 different types of questions are often in conflict: what is the value of a device if used in an optimal fashion, and what is the value of a device if used in the hands of the spectrum of operators and conditions in practice? These 2 questions are essentially the age-old “efficacy” and “effectiveness” issues. One approach to both the iteration issue and the problem of long-term consequences of devices is to use modeling to attempt to estimate the impact of the particular device either over a period not measured in a trial or in patient populations not evaluated in the trial.32 However, standards are not currently adequate to provide comfort that a particular model will be accurate, reproducible, and generalizable.
Controversy continues about the circumstances in which an observational study can substitute for a randomized clinical trial in giving decision makers the needed information.33 On one hand, a growing number of cases can be cited in which drugs (hormone replacement therapy), devices (atherectomy), or procedures (knee surgery) have been accepted based on observational studies, only to have randomized clinical trials discover that the impact on patient outcome was either detrimental or neutral.34 This issue is especially important in the assessment of devices because a continuous registry provides a pleasing approach to iterative changes in a device or the technique of using the device. What designs could be used to link diagnostic and imaging technologies to therapies to provide the guidance needed for clinical decision making? In the current environment, if a test were to stratify patients into those who did or did not benefit from a treatment, the subgroup findings might need to be replicated in an independent sample. In the example of imaging for Alzheimer disease, such a study would require broad collaboration across federal agencies as well as agreements among disparate companies with different goals.
The continuum of development and interagency issues
The device industry in the United States is driven by a tremendous entrepreneurial spirit that is regulated by the FDA but heavily influenced by reimbursement decisions made by CMS. The discontinuity between phases of the device life cycle creates lost opportunities in the early phases for the development of needed information for the later phases of reimbursement decisions.35 Furthermore, the “old” system was predicated on the concept that one set of studies would be conducted to achieve approval for marketing by the FDA, whereas a second set of studies would be required to optimize the funding from payors. Given the dynamics of device use and the funding of device companies, there is increasing belief that there is little leeway for lack of CMS funding after FDA approval. This belief is largely based on the view that without payment from CMS, many devices would simply not accrue enough revenue to justify the cost of development. One approach that has been suggested is a more formal synchronization of standards between the FDA and CMS.36 Under this strategy, depending upon the particular device and public health need, communication among the agencies could lead to joint decisions on approval and funding or sharing of data from one side to the other to optimize decision making.
Summary
We are entering an era in which the success of biomedical science and the increasing understanding of the value of evidence for practice are in a state of tension. This tension is especially notable in the device arena, in which the short life cycles and iterative nature of development are at odds with current design constructs of the types of clinical trials that provide evidence for medical decision making. The financial pressure arising from strained budgets and expanding costs from the aging of the population and the continued development of new technology heightens the need for a focus on new approaches. Across diagnostic imaging, risk stratification devices, and therapeutic devices, the cross-cutting issues can be identified: we need better methods of collaborative funding and priority setting, improved and more flexible methods, and new approaches to the integration of federal agencies in overseeing the system.
References
1. US Census Bureau. National population projections. Available at: http://www.census.gov/population/www/projections/natproj.html [Accessed 30 September 2004].
2. Baicker K, Chandra A. Medicare spending, the physician workforce, and beneficiaries’ quality of care. Health Aff 2004;W4:184-97.
3. Tunis SR, Stryer DB, Clancy CM. Practical clinical trials: increasing the value of clinical research for decision making in clinical and health policy. JAMA 2003;290:1624-32.
4. Leiner S, Chatterton HT, Punglia RS, et al. Verification bias in screening for prostate cancer. N Engl J Med 2003;349:1672-3.
5. National Emphysema Treatment Trial Research Group. A randomized trial comparing lung-volume-reduction surgery with medical therapy for severe emphysema. N Engl J Med 2003;348:2059-73.
6. Wagner RF, Beiden SV, Campbell G, et al. Assessment of medical imaging and computer-assist systems: lessons from recent experience. Acad Radiol 2002;9:1264-77.
7. Sung NS, Crowley Jr WF, Genel M, et al. Central challenges facing the national clinical research enterprise. JAMA 2003;289:1278-87.
8. Kimbell JJ. A wake-up call to device entrepreneurs. Med Device Diagn Ind 1996;24-40.
9. O’Shea JC, Kramer JM, Califf RM, et al. Sharing a commitment to improve cardiovascular devices–part 1: identifying holes in the safety net. Am Heart J 2004;147:977-84.
10. Shah MR, O’Connor CM, Sopko G, et al. Evaluation Study of Congestive heart failure And Pulmonary artery catheterization Effectiveness (ESCAPE): design and rationale. Am Heart J 2001;141:528-35.
11. Peterson ED, Kaul P, Kaczmarek RG, et al, for the Society of Thoracic Surgeons. From controlled trials to clinical practice: monitoring transmyocardial revascularization use and outcomes. J Am Coll Cardiol 2003;42:1611-6.
12. Hakama M, Stenman UH, Aromaa A, et al. Validity of the prostate specific antigen test for prostate cancer screening: followup study with a bank of 21,000 sera in Finland. J Urol 2001;166:2189-92.
13. Crowley Jr WF, Sherwood L, Salber P, et al. Clinical research in the United States at a crossroads: proposal for a novel public-private partnership to establish a national clinical research enterprise. JAMA 2004;291:1120-6.
14. Kupersmith J. The past, present, and future of the implantable cardioverter defibrillator. Am J Med 2002;113:82-4.
15. Lee KL, Hafley G, Fisher JD, et al, for the Multicenter Unsustained Tachycardia Trial Investigators. Effect of implantable defibrillators on arrhythmic events and mortality in the multicenter unsustained tachycardia trial. Circulation 2002;106:233-8.
16. Centers for Medicare and Medicaid Services. Available at: http://www.cms.hhs.gov/manuals/06_cim/R173FlowChart.pdf [Accessed 30 September 2004].
17. Al Khatib S, Sanders GD, Mark DB, et al. Implantable cardioverter defibrillators and cardiac resynchronization therapy in patients with left ventricular dysfunction: randomized trial evidence through 2004. Am Heart J 2005;149:1020-34.
18. Lamont DH, Budoff MJ, Shavelle DM, et al. Coronary calcium scanning adds incremental value to patients with positive stress tests. Am Heart J 2002;143:861-7.
19. O’Malley PG, Greenberg BA, Taylor AJ. Cost-effectiveness of using electron beam computed tomography to identify patients at risk for clinical coronary artery disease. Am Heart J 2004;148:106-13.
20. US Preventive Services Task Force. Screening for coronary heart disease: recommendation statement. Ann Intern Med 2004;140:569-72.
21. British Columbia Office of Health Technology Assessment (BCOHTA). Incorporating clinical effectiveness debates into hospital technology assessment: the case of laser treatment of benign prostatic hyperplasia. Int J Technol Assess Health Care 1997;13:937-9.
22. Gross R, Strom BL. Toward improved adverse event/suspected adverse drug reaction reporting. Pharmacoepidemiol Drug Saf 2003;12:89-91.
23. Cutlip DE, Leon MB, Ho KK, et al. Acute and nine-month clinical outcomes after “suboptimal” coronary stenting: results from the STent Anti-thrombotic Regimen Study (STARS) registry. J Am Coll Cardiol 1999;34:698-706.
24. Kerlikowske K, Grady D, Rubin SM, et al. Efficacy of screening mammography: a meta-analysis. JAMA 1995;273:149-54.
25. Minoshima S, Frey KA, Cross DJ, et al. Neurochemical imaging of dementias. Semin Nuclear Med 2004;34:70-82.
26. Clancy CM. Carolyn M. Clancy, MD, Director, Agency for Healthcare Research and Quality. J Investig Med 2004;52:77-80.
27. Kulynych J, Korn D. The new HIPAA (Health Insurance Portability and Accountability Act of 1996) Medical Privacy Rule: help or hindrance for clinical research? Circulation 2003;108:912-4.
28. Califf RM, Muhlbaier LH. Health Insurance Portability and Accountability Act (HIPAA): must there be a trade-off between privacy and quality of health care, or can we advance both? Circulation 2003;108:915-8.
29. Zerhouni E. The NIH Roadmap. Science 2003;302:63-72.
30. Tilson H, Helms D, Dowdy D. Improving the US health care system: action plan to enhance efficiency, reduce errors, and improve quality. J Investig Med 2003;51:72-8.
31. Peterson ED, Coombs LP, DeLong ER, et al. Procedural volume as a marker of quality for CABG surgery. JAMA 2004;291:195-201.
32. O’Malley AJ, Normand SL, Kuntz RE. Application of models for multivariate mixed outcomes to medical device trials: coronary artery stenting. Stat Med 2003;22:313-36.
33. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med 2000;342:1887-92.
34. Yusuf S, Wittes J, Friedman L. Overview of results of randomized clinical trials in heart disease. I. Treatments following myocardial infarction. JAMA 1988;260:2088-93.
35. Califf RM. Defining the balance of risk and benefit in the era of genomics and proteomics. Health Aff 2004;23:77-87.
36. Wang SS, Mendelson DN, Schulman KA, et al. Exploring options for improving healthcare. Am Heart J 2004;147:23-30.
Appendix A. Workshop participants
Jeroan Allison, MD, University of Alabama at Birmingham, Birmingham, AL
Susan Alpert, MD, PhD, Medtronic, Inc, Minneapolis, MN
Elise Berliner, PhD, Agency for Healthcare Research and Quality, Rockville, MD
Ashley Boam, US Food and Drug Administration, Rockville, MD
Lynn Bosco, MD, MPH, Agency for Healthcare Research and Quality, Rockville, MD
Robert Califf, MD, CERTs Coordinating Center, Durham, NC
Charles Carignan, MD, Boston Scientific Corporation, Natick, MA
William Clarke, MD, MSc, Amersham Health, Amersham, United Kingdom
Judi Consalvo, Agency for Healthcare Research and Quality, Rockville, MD
Patrice Drew, American Association of Orthopaedic Surgeons, Washington, DC
Martin Erlichman, MS, Agency for Healthcare Research and Quality, Rockville, MD
David Feigal Jr, MD, US Food and Drug Administration, Rockville, MD
Mark Fendrick, MD, University of Michigan, Ann Arbor, MI
Richard Frank, MD, PhD, GE Healthcare, Princeton, NJ
Steve Goodman, MD, MHS, PhD, Johns Hopkins School of Medicine, Baltimore, MD
Thomas Gross, MD, MPH, US Food and Drug Administration, Rockville, MD
Steven Gutman, MD, US Food and Drug Administration, Rockville, MD
Thomas Holohan, MD, US Veterans Health Administration, Washington, DC
Elizabeth Jacobson, PhD, Advanced Medical Technology Association, Washington, DC
Larry Kessler, ScD, US Food and Drug Administration, Rockville, MD
Allan Korn, MD, Blue Cross Blue Shield Association, Chicago, IL
Judith Kramer, MD, MS, Duke Clinical Research Institute, Durham, NC
Jeffrey Lerner, PhD, ECRI, Plymouth Meeting, PA
Paul Marshall, Cordis Corporation, Warren, NJ
David Matchar, MD, Duke University Medical Center, Durham, NC
Barbara McNeil, MD, PhD, Harvard Medical School, Boston, MA
Richard Platt, MD, MSc, Harvard Pilgrim Health Care, Boston, MA
Scott Ramsey, MD, PhD, Fred Hutchinson Cancer Research Center, Seattle, WA
Richard Rettig, PhD, RAND Science and Technology, Arlington, VA
Donald Rucker, MD, Siemens Medical Solutions, Inc, Malvern, PA
Marcel Salive, MD, MPH, Centers for Medicaid and Medicare Services, Baltimore, MD
Peter Savage, MD, National Institutes of Health, Bethesda, MD
Alan Schechter, MD, National Institutes of Health, Bethesda, MD
Daniel Schultz, MD, US Food and Drug Administration, Rockville, MD
J Sanford Schwartz, MD, University of Pennsylvania School of Medicine, Philadelphia, PA
Joanna Siegel, ScD, Agency for Healthcare Research and Quality, Rockville, MD
Jean Slutsky, PA, MSPH, Agency for Healthcare Research and Quality, Rockville, MD
Alan Stiles, MD, North Carolina Children’s Hospital, Chapel Hill, NC
Hugh Tilson, MD, DrPH, University of North Carolina, Chapel Hill, Chapel Hill, NC
Sean Tunis, MD, MSc, Centers for Medicaid and Medicare Services, Baltimore, MD
Stanley Wang, MD, JD, MPH, Duke University Medical Center, Cary, NC
Deborah Zarin, MD, Agency for Healthcare Research and Quality, Rockville, MD
Staff:
Leanne Madre, JD, MHA, CERTs Coordinating Center, Durham, NC
Donna McMullen, Duke Clinical Research Institute, Durham, NC
Terease Oliver, Duke Clinical Research Institute, Durham, NC