Journal of Clinical Epidemiology 68 (2015) 1247–1250
EDITORIAL
Is the ‘Evidence-Pyramid’ now dead?
Has the ‘Evidence-Pyramid’, with RCTs at the apex and observational studies near the bottom (outranking only eminence-based clinical texts), outlived its usefulness? Although it provides an easy-to-understand aide-memoire, from the beginning it upset traditional ‘big E’ epidemiologists (a term coined by David Sackett for traditional/classic/population epidemiology, in contrast to ‘little e’ clinical epidemiology, which focuses on improving the care of individuals), including pharmaco-epidemiologists such as Vandenbroucke [1], who felt it too simplistic and argued that all clinicians need a much broader set of study designs and should be exposed to the argumentation process needed to tailor the research methods to the research question. One of us has supported this view, with the following argument for ‘fit-for-purpose’ study methods to assess the success of the recommendations of the WHO Commission on the Social Determinants of Health (Tugwell [2]): “Taking an evidence based approach does not mean relying on or privileging only one kind of method, such as the randomised controlled trial. It does not mean that there is only one hierarchy of evidence, and it does not mean an epistemological rejection of subjective positions or methods.” In this issue, Walach et al. take further their ‘Circle of Methods’, first proposed by them in 2006 [3] (Fig. 1).
Fig. 1. Circle of methods. Experimental methods that test specifically for efficacy (upper half of the circle) have to be complemented by observational, non-experimental methods (lower half of the circle) that are more descriptive in nature and describe real-life effects and applicability. Shading indicates the complementarity of experimental and quasi-experimental methods, of internal and external validity [3].
[Figure: matrix analysis of the evidence. Types of study (basic research, human studies, natural cohorts, cross-sectional studies, randomized studies, single cases, autopsy studies) are cross-tabulated against whether they show a positive association, a negative association, or are undecided.]
They move away from a hierarchy to a matrix analysis to show the mutual dependence of methods on knowledge from other types of studies, and to demonstrate in an easily understandable fashion what types of evidence are still needed, thus re-creating the argumentation and conceptual basis for such research called for by Vandenbroucke [1]. The authors make a case for taking this matrix approach and linking it to Bayesian reasoning: mechanistic knowledge from animal and in vitro models is used to estimate the prior probability that an intervention is effective, and they show how to estimate the number of studies needed to shift that probability toward virtual certainty.
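As a rough, generic illustration of this kind of calculation (a minimal sketch only, not the authors' derivation, and assuming each concordant study contributes the same hypothetical Bayes factor), the number of studies needed to move a mechanistically derived prior to a chosen level of certainty can be read off the updated odds:

```python
# Minimal sketch (not the authors' method): Bayesian updating of the probability
# that an intervention is effective, assuming each new concordant study contributes
# the same Bayes factor (likelihood ratio) in favour of effectiveness.
import math

def studies_needed(prior_prob, target_prob, bayes_factor_per_study):
    """Smallest number of concordant studies needed to move the prior
    probability of effectiveness up to the target probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    target_odds = target_prob / (1 - target_prob)
    # posterior_odds = prior_odds * bayes_factor_per_study ** n  =>  solve for n
    return math.ceil(math.log(target_odds / prior_odds) / math.log(bayes_factor_per_study))

# Example (hypothetical numbers): a mechanistically plausible intervention
# (prior 0.3) studied with moderately informative trials (Bayes factor 5 each)
# needs 4 concordant studies to exceed a probability of 0.99.
print(studies_needed(prior_prob=0.3, target_prob=0.99, bayes_factor_per_study=5))
```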
The ‘5 A's’ of Evidence Based Practice (ask, acquire, appraise, apply, and audit [4]) are now well known and form the basis for both clinical practice and teaching, so it is good to see them translated into a practical self-assessment questionnaire of the needed skills and competencies. Kaper et al. derived candidate items from existing EBP scales, psychology, and behavioral economics. In an online Delphi study, over 500 international expert clinicians, researchers, teachers, and policymakers identified items with sufficient face and content validity; the questionnaire was then piloted and validated among over a hundred clinicians from various specialties and career stages. The one-page, 26-item questionnaire covers five domains: decision making, subjective norm, attitude, perceived behavioral control, and intention and behavior. We look forward to assessments of its usefulness and utility in education and quality assurance.

Two papers address issues around core outcome sets and selective outcome reporting bias; the latter has been underappreciated as one of the most important forms of bias, but it has been the subject of recent publications from the Cochrane, COMET (Smith [5]), and OMERACT [6] organisations. As part of the process of establishing a parsimonious core set of critical, patient-important outcomes for Cochrane systematic reviews, Lenza et al. surveyed the domains and instruments used in over 170 published trials of physical therapy interventions for shoulder pain (rotator cuff disease, adhesive capsulitis, or nonspecific shoulder pain). Page et al. use a real-life example to demonstrate that trials reporting data at multiple time points further increase the risk of selective outcome reporting, and they argue for a replicable decision algorithm. They report that half of a sample of 210 RCTs in arthritis and depression, drawn from 27 systematic reviews listed in MEDLINE in 2010–2012, had more than one outcome instrument and/or time point to choose from.

Data mining of ‘big data’ is coming to clinical epidemiology! The mathematics of information is suddenly everywhere, in the press [7] and in best sellers [8], and computing power has progressed to the point that text mining of computerised medical records, genetic registries [9,10], and clinical research and administrative databases is likely to replace much of the human effort now required across the whole range of the 5 E's of clinical epidemiology (etiology, efficacy, effectiveness, economic evaluation, and evidence synthesis [11]) through to implementation via evidence-informed practice and policy. Austin et al. showed how such methods improved risk stratification across different types of heart failure [12]; in this issue, Sheng-Feng Sung et al. show how the addition of data-mining software improves case-mix adjustment and thus raises the accuracy of classification up to the levels achieved by prospective data collection in trials.

The nosology/classification of the domain of primary care research for chronic conditions was of great interest to a previous JCE Editor, Alvan Feinstein (Knottnerus [13]), so he would have been pleased to see the paper by Kendall et al., which uses the example of the continuing care of people with HIV in Ontario, Canada, of whom, interestingly, more than half are cared for primarily or entirely by their primary care physician. The challenge here is how to define best practices for this example of a major chronic condition; this study carried out the first step of using routinely collected administrative data to construct a theoretically defined typology of how care is shared between family physicians and specialists, which can be used at a population level to determine how different models of care affect quality.

The GRADE approach to guidelines continues to build momentum as the go-to evidence-based approach. AHRQ has been prominently present at GRADE meetings from the start, although its approach has differed in a number of ways. The article by Berkman et al. updates the way the AHRQ Evidence-based Practice Centers (EPCs) grade the Strength Of Evidence (SOE) of a body of evidence when assessing health care interventions. The EPCs now consider the same five components as GRADE when deciding whether to rate down the SOE for a specific outcome (study limitations, precision, consistency, directness, and reporting bias), matching the five components of the GRADE quality assessment, ‘ROBPI3’ (Risk Of Bias, Publication bias, Imprecision, Inconsistency, Indirectness); notably, this version of the EPC strength of evidence has re-incorporated directness, which the EPC schema previously rated separately. This is heartening, but it remains somewhat confusing to stakeholders that the two groups do not use exactly the same terms for several apparently identical attributes
[e.g., Strength of (a body of) Evidence vs. quality, certainty, or confidence; Study Limitations vs. Risk of Bias]. We would encourage a further attempt to reach consensus on a glossary, covering not only these two groups but also the other major evidence categorisation systems used by clinical epidemiology, guideline, and health technology assessment groups [14–18], perhaps brokered by the new organisation proposed at the Global Evaluation Synthesis Summit held in Oslo in June 2015 alongside the 2015 HTAi meeting, where 37 evidence synthesis groups indicated their interest in developing better cooperation between synthesis organisations (Jeremy Grimshaw, personal communication, June 14, 2015).

Iain Chalmers [19] has championed the feedback loop between systematic reviews (whether of aggregate data [AD] or individual patient data [IPD]) and primary research: reviews define the research gaps that justify any new primary study, and the results of that study are then incorporated into the existing systematic review to influence future research priorities. In this issue, Tierney et al. provide a view of how individual participant data meta-analyses have influenced trial design, conduct, and analysis, focusing on lessons learned from a case series of 21 of 41 IPD meta-analyses thought to have had a direct impact on the design or conduct of trials. They collate an impressive list of influences, including the selection of comparators and participants, sample size calculations, the analysis and interpretation of subsequent trials, and the conduct and analysis of ongoing trials. The proposal that this feedback must be based on individual patient data is, however, very time-intensive compared with relying on aggregate data meta-analyses. We look forward to seeing a more extensive body of data on the relative value of the contributions of IPD vs. AD meta-analyses in informing subsequent clinical trials.

This issue has four papers showing exemplar applications of statistical concepts. Two papers, one on cystic fibrosis and one on multiple sclerosis, provide examples of the development and validation of clinical prediction models that make critical adjustments to avoid the potential confounding effects of short-term disease fluctuations (i.e., FEV1 fluctuations in cystic fibrosis and clinical relapses in multiple sclerosis) on an individual's longer-term background disability. Nordt et al. provide a good example, from an opioid substitution program in Zurich, of how measures of participation in chronic disease can handle important variations in time spent in and out of the intervention program.

Stratified/precision medicine aims to make optimal treatment decisions for individual patients by predicting their response to treatment (treatment benefit) from baseline information [20]. Randomized clinical trials provide strong evidence of the benefits and harms of treatments. The estimated overall treatment effect is an important summary result of a clinical trial but is insufficient to
decide which treatment is best suited for an individual patient without modeling the interactions between baseline risk and effect size that are influenced by prognostic factors. van Klaveren et al. show how modeling treatment interactions with prognostic factors, rather than assuming a constant relative treatment effect, caused a major shift in the predicted most favorable treatment among patients in the Synergy between Percutaneous Coronary Intervention with Taxus and Cardiac Surgery (SYNTAX) trial.
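To make the contrast concrete, the following is a generic sketch (simulated data and made-up variable names, not van Klaveren et al.'s model or dataset) of fitting a treatment-by-prognostic-factor interaction and deriving an individualized predicted benefit, rather than applying one constant relative effect to every patient:

```python
# Generic sketch with hypothetical, simulated data (not the authors' model):
# allow the treatment effect to vary with a baseline prognostic factor and
# derive an individualized predicted benefit for each patient.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),      # 1 = new treatment, 0 = comparator
    "risk_score": rng.normal(0, 1, n),     # baseline prognostic factor
})
# Simulate outcomes in which the treatment helps high-risk patients more
lin_pred = -1.0 + 1.2 * df.risk_score + df.treated * (-0.2 - 0.8 * df.risk_score)
df["event"] = rng.binomial(1, 1 / (1 + np.exp(-lin_pred)))

# Logistic model with a treatment-by-prognostic-factor interaction
fit = smf.logit("event ~ treated * risk_score", data=df).fit(disp=0)

# Individualized predicted benefit = risk if untreated minus risk if treated
df["predicted_benefit"] = fit.predict(df.assign(treated=0)) - fit.predict(df.assign(treated=1))
print(df["predicted_benefit"].describe())
```

With the interaction term, the predicted benefit varies with the prognostic factor, so the predicted best treatment can differ across patients even though the trial reports a single average effect.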
A fifth statistics article, by Viechtbauer, provides a simple formula for calculating the sample size of pilot studies. One of the goals of a pilot study is to identify unforeseen problems, such as ambiguous inclusion or exclusion criteria or misinterpretation of questionnaire items. Although sample size calculation methods for pilot studies have been proposed, none of them is directed at the goal of problem detection, so these authors present a simple formula to calculate the sample size needed to identify, with a chosen level of confidence, problems that may arise with a given probability, thereby informing decisions about sample sizes in pilot studies.
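The underlying logic can be sketched as follows (our own illustration of the general idea; the article should be consulted for the authors' exact formula and its refinements): if a given problem affects each participant independently with probability p, the chance of seeing it at least once among n participants is 1 - (1 - p)^n, which can be inverted to give the required n.

```python
# Sketch of the basic calculation (the authors' article gives the exact formula
# and extensions): sample size needed so that a problem occurring with
# probability p per participant is observed at least once with confidence c.
import math

def pilot_n(p, c):
    """Smallest n such that 1 - (1 - p)**n >= c."""
    return math.ceil(math.log(1 - c) / math.log(1 - p))

# e.g., to detect, with 95% confidence, a problem that affects 5% of participants:
print(pilot_n(p=0.05, c=0.95))  # 59
```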
Finally, a cautionary lesson is reported showing that although monetary incentives generally increase participation rates in surveys, too high a payment may be counterproductive. Koetsenruijter et al., who randomised patients to four levels of remuneration for completing a 15-minute questionnaire plus a 15-minute interview, found that participation rates actually dropped at the higher amounts; they discuss possible reasons for this, for example that a high incentive may be perceived as signalling a high burden for responders.

Peter Tugwell
J. André Knottnerus
Editors
E-mail address: [email protected] (J.A. Knottnerus)

References

[1] Vandenbroucke JP. Observational research and evidence-based medicine: what should we teach young physicians? J Clin Epidemiol 1998;51:467–72.
[2] Tugwell P, Petticrew M, Kristjansson E, et al. Assessing equity in systematic reviews: realising the recommendations of the Commission on Social Determinants of Health. BMJ 2010;341:c4739.
[3] Walach H, Falkenberg T, Fonnebo V, Lewith G, Jonas W. Circular instead of hierarchical: methodological principles for the evaluation of complex interventions. BMC Med Res Methodol 2006;6:29.
[4] Rosenberg W, Donald A. Evidence based medicine: an approach to clinical problem-solving. BMJ 1995;310:1122–6.
[5] Smith V, Clarke M, Williamson P, Gargon E. Survey of new 2007 and 2011 Cochrane reviews found 37% of prespecified outcomes not reported. J Clin Epidemiol 2015;68:237–45.
[6] Boers M, Kirwan JR, Wells G, Beaton D, Gossec L, d'Agostino MA, et al. Developing core outcome measurement sets for clinical trials: OMERACT filter 2.0. J Clin Epidemiol 2014;67:745–53.
[7] Goldacre B. The NHS plan to share our medical data can save lives. The Guardian. Available at: www.theguardian.com. Accessed February 21, 2014.
[8] Ellenberg J. How not to be wrong: the power of mathematical thinking. New York: Penguin Books; 2014.
[9] Wallace BC, Small K, Brodley CE, Lau J, Schmid CH, Bertram L, et al. Toward modernizing the systematic review pipeline in genetics: efficient updating via data mining. Genet Med 2012;14:663–9.
[10] Ioannidis JP. Why most published research findings are false. Available at: http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124. Accessed October 9, 2015.
[11] Elliott JH, Turner T, Clavisi O, Thomas J, Higgins JP, Mavergames C, et al. Living systematic reviews: an emerging opportunity to narrow the evidence-practice gap. PLoS Med 2014;11:e1001603.
[12] Austin PC, Tu JV, Ho JE, Levy D, Lee DS. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. J Clin Epidemiol 2013;66:398–407.
[13] Knottnerus JA. Between iatrotropic stimulus and interiatric referral: the domain of primary care research. J Clin Epidemiol 2002;55:1201–6.
[14] Oxford Centre for Evidence-Based Medicine (OCEBM). Available at: http://www.cebm.net/?o=1025. Accessed October 9, 2015.
[15] Australian National Health and Medical Research Council (NHMRC). Available at: http://www.nhmrc.gov.au/_files_nhmrc/file/guidelines/developers/nhmrc_levels_grades_evidence_120423.pdf. Accessed October 9, 2015.
[16] Scottish Intercollegiate Guidelines Network (SIGN). Available at: http://www.sign.ac.uk/methodology/index.html. Accessed October 9, 2015.
[17] Tregear SJ, Reston JT, Turkelson CM, et al. 2006. Available at: http://www.biomedcentral.com/1471-2288/6/52.
[18] Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med 2001;20:21–35.
[19] Chalmers I, Altman DG. How can medical journals help prevent poor medical research? Some opportunities presented by electronic publishing. Lancet 1999;353:490–3.
[20] Rothwell PM, Mehta Z, Howard SC, Gutnikov SA, Warlow CP. Treating individuals 3: from subgroups to individuals: general principles and the example of carotid endarterectomy. Lancet 2005;365:256–65.