SERIES ON EPIDEMIOLOGY

Evidence-Based Reviews and Meta-Analysis

JENNIFER EVANS
Although it is generally accepted that randomized controlled trials are the best source of evidence on treatment efficacy, the importance of taking into account the findings of previous research when designing and reporting trials is less well appreciated. Ignoring the results of other trials can lead to harm for patients and can waste resources. For example, 33 trials of intravenous streptokinase as thrombolytic therapy for acute myocardial infarction were conducted between 1959 and 1988. A cumulative metaanalysis by Lau and associates showed that a consistent, statistically significant beneficial effect on mortality would have been evident if a systematic review and metaanalysis had been carried out after the eighth trial was completed.1 Unfortunately, the review was not performed at that stage, and a further 25 trials, involving more than 34 000 patients, were undertaken subsequently. Systematic reviews also may highlight gaps in the evidence, for example, where published trials do not provide the evidence needed to guide clinical practice.2

The analysis by Lau and associates was published 15 years ago, but still very few published reports of trials refer to systematic reviews, either to justify conducting a trial or when discussing the results.3 In 2005, Chalmers pointed out that “academia as a whole has still not grasped that it is unscientific and unethical to embark on new research without first analysing systematically what can be learned from existing research.”4 Indeed, only relatively recently have some biomedical journals required that reports of clinical trials provide a summary of previous research findings and an explanation of how the reported trial affects this summary, “using direct reference to existing systematic reviews and meta-analysis.”5 Regularly updated systematic reviews are increasingly available.
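The cumulative approach used by Lau and associates can be sketched in a few lines: trials are pooled in chronological order, and the pooled estimate is re-examined after each new trial is added. The sketch below uses invented event counts for illustration only, not the streptokinase data:

```python
import math

# Hypothetical (deaths_treated, n_treated, deaths_control, n_control) for a
# series of small trials, in chronological order. Invented numbers only.
trials = [
    (12, 100, 20, 100),
    (8, 80, 15, 80),
    (30, 250, 45, 250),
    (10, 120, 18, 120),
    (25, 300, 40, 300),
]

cum_w = cum_wy = 0.0
for i, (a, n1, c, n2) in enumerate(trials, start=1):
    b, d = n1 - a, n2 - c
    log_or = math.log((a * d) / (b * c))   # log odds ratio for this trial
    var = 1/a + 1/b + 1/c + 1/d            # its approximate variance
    cum_w += 1 / var                       # inverse-variance weight
    cum_wy += log_or / var
    pooled = cum_wy / cum_w                # cumulative pooled log odds ratio
    se = math.sqrt(1 / cum_w)
    lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
    # A benefit is "significant" once the whole 95% CI lies below OR = 1
    sig = "significant" if hi < 0 else "not significant"
    print(f"after trial {i}: OR={math.exp(pooled):.2f} "
          f"(95% CI {math.exp(lo):.2f}-{math.exp(hi):.2f}) -> {sig}")
```

Run on real trial data, this kind of analysis shows the point at which further trials stop adding information, which is exactly what the Lau analysis demonstrated for streptokinase.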
For example, there are 3826 reviews on The Cochrane Library (http://www.cochrane.org/reviews/clibintro.htm; accessed June 2, 2009), including 73 reviews on interventions for eye diseases (www.cochraneeyes.org).

Accepted for publication Sep 4, 2009. From the International Centre for Eye Health, London School of Hygiene and Tropical Medicine, London, United Kingdom. Inquiries to Jennifer Evans, International Centre for Eye Health, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom; e-mail: jennifer.evans@lshtm.ac.uk. doi:10.1016/j.ajo.2009.09.001. © 2009 by Elsevier Inc. All rights reserved.

The strength of randomized trials is that random allocation ensures that the treatment groups are comparable.
Good-quality trials will also mask (or blind) participants and clinicians, thus ensuring that recruitment into the trial, and assessment of outcome, are unbiased. However, the results of any one trial may be wrong. It may not be large enough to detect the true effect, or there may be no true effect and a statistically significant result has arisen by chance alone.

Systematic reviews aim to summarize findings from individual studies, using systematic methods to appraise critically the quality of the studies and to select those appropriate for the summary, in order to avoid bias and to increase the likelihood that the summarized evidence is close to the truth.6 The term metaanalysis refers to the combined analysis of the results of the studies included in the review to produce a pooled estimate of treatment effect. The key point is that the original randomization must be preserved when pooling the data; it is not correct simply to treat all the different trials as one large study. Instead, a weighted average of the results of the individual trials is calculated, with larger trials being given more weight.

The result of the metaanalysis depends on the methods used to perform the systematic review. A metaanalysis of a biased sample of poor-quality trials, for example, may be precise but will be wrong. Combining the results of multiple studies increases the power of the analysis and gives a more precise measure of effect. However, if there is considerable variation in the results of individual studies (heterogeneity), it may not be appropriate to report a pooled estimate of treatment effect. Heterogeneity arises because of differences in patient populations, interventions, or study design, and it should be investigated.

In essence, a systematic review is an observational study of the trials addressing a particular clinical or scientific question. As for primary studies, it is important to set out the methods to be used in advance.
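The weighted average described above is typically an inverse-variance average, and Cochran's Q (with the derived I^2 statistic) is one standard way to quantify heterogeneity. A minimal fixed-effect sketch, with invented log risk ratios and standard errors rather than data from any real review:

```python
import math

# Hypothetical per-trial effect estimates: (log risk ratio, standard error).
# Illustrative values only -- not taken from any published review.
studies = [(-0.35, 0.18), (-0.20, 0.12), (-0.45, 0.25), (-0.28, 0.15)]

# Fixed-effect inverse-variance pooling: each trial is weighted by 1/SE^2,
# so larger (more precise) trials contribute more to the weighted average.
weights = [1 / se**2 for _, se in studies]
pooled = sum(w * y for (y, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

# Cochran's Q and I^2 quantify heterogeneity: I^2 estimates the percentage
# of variation across studies due to real differences rather than chance.
q = sum(w * (y - pooled) ** 2 for (y, _), w in zip(studies, weights))
df = len(studies) - 1
i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

print(f"pooled RR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(pooled - 1.96 * pooled_se):.2f}"
      f"-{math.exp(pooled + 1.96 * pooled_se):.2f}), I^2 = {i2:.0f}%")
```

With substantial heterogeneity (high I^2), a random-effects model, or no pooling at all, may be more appropriate than this fixed-effect average.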
A detailed protocol should give information on the procedures for identifying relevant trials, inclusion and exclusion criteria, and the methods for assessing the quality of included trials. The protocol should set out a priori which outcomes are to be analyzed and how they are to be analyzed. The choice of outcome measures, and the length of follow-up, should be based on relevance to patients, rather than simply summarizing all the outcomes reported in the individual trials. Consumer or patient groups can be involved in the choice of outcomes at the protocol stage.7
TABLE. GRADE System

GRADE quality of evidence (with meaning and type of studies):
- High (++++): further research is very unlikely to change the confidence in the estimate of effect. Type of studies: randomized controlled trials.
- Moderate (+++): further research is likely to have an important impact on the confidence in the estimate of effect and may change the estimate.
- Low (++): further research is very likely to have an important impact on the confidence in the estimate of effect and is likely to change the estimate. Type of studies: observational studies.
- Very low (+): any estimate of effect is very uncertain. Type of studies: any other evidence.

Qualifications. Decrease grade if:
- serious (-1) or very serious (-2) limitation to study quality
- important inconsistency (-1)
- some (-1) or major (-2) uncertainty about directness
- imprecise or sparse data (-1)
- high probability of reporting bias (-1)

Increase grade if:
- strong evidence of association: significant relative risk of >2 (<0.5) based on consistent evidence from 2 or more observational studies, with no plausible confounders (+1)
- very strong evidence of association: significant relative risk of >5 (<0.2) based on direct evidence with no major threats to validity (+2)
- evidence of a dose-response gradient (+1)
- all plausible confounders would have reduced the effect (+1)
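The downgrade and upgrade rules in the Table amount to a simple score: start from the study design, subtract the listed downgrades, add the listed upgrades, and clamp to the four levels. The sketch below is one illustrative encoding of that logic; the function name and signature are mine, not part of any official GRADE software:

```python
# Illustrative scoring sketch of the GRADE logic in the Table.
LEVELS = ["very low", "low", "moderate", "high"]

def grade_quality(randomized: bool, downgrades: int, upgrades: int) -> str:
    """downgrades/upgrades are totals from the Table's qualifications,
    e.g. 1 for important inconsistency, or 1 for a dose-response gradient."""
    start = 3 if randomized else 1        # RCTs start high, observational low
    score = start - downgrades + upgrades
    return LEVELS[max(0, min(3, score))]  # clamp to the four GRADE levels

# An RCT body of evidence with serious study limitations (-1):
print(grade_quality(True, downgrades=1, upgrades=0))   # moderate
# Observational studies with a strong association (RR > 2, +1):
print(grade_quality(False, downgrades=0, upgrades=1))  # moderate
```

The point of the exercise is transparency: each step away from "high" or "low" is tied to a named, auditable reason rather than a reviewer's overall impression.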
The implicit purpose behind having a detailed protocol is that the conduct of the review should be as objective as possible, especially because it is likely that the reviewers will be aware of the results of some of the published trials.

There are several well-recognized potential biases that may affect systematic reviews. Trials that have statistically significant results are more likely to be published, and within published trials, statistically significant outcomes are more likely to be reported.8 If studies showing no effect for a particular outcome are selectively excluded from the metaanalysis, the overall effect of treatment will be exaggerated. Currently these publication and outcome reporting biases are difficult to avoid because it is not always possible to identify and obtain access to unpublished data. Current efforts to ensure that all trials are registered at inception9 and better access to trial protocols10 will improve future reviews. In the meantime, reviewers should try to identify unpublished studies and to contact investigators directly for data on outcomes that are not reported adequately.

The results of systematic reviews depend on the quality of the trials included in the review. There have been many attempts to grade study quality. In general, domain-based grading scales are clearer and more transparent than summary scores.11 The Cochrane Collaboration's tool for assessing the risk of bias considers separately allocation of treatment, masking, incomplete outcome data, and selective outcome reporting.11 Structured assessment of individual study quality can feed into an overall assessment of the quality of the evidence for each outcome. Again, there are many published schemes for grading quality of evidence, but one particularly useful approach is that offered by GRADE12 (The Grading of Recommendations
Assessment, Development and Evaluation; http://www.gradeworkinggroup.org/; accessed September 3, 2009; Table). The scheme takes into account study quality and publication bias as well as the consistency, precision, and directness of the evidence. This provides a transparent and systematic approach to assessing the overall quality of the evidence, and thus facilitates evidence-based conclusions regarding the effects of treatment.

Systematic reviews usually are based on published aggregate data, but they also can use individual participant data. This can have considerable benefits, particularly because it involves more collaboration between investigators of the individual studies, provides more flexibility in the analysis, and enables better quality control of the data. It is considered the gold-standard approach, but it does require considerable extra time and effort. As for published data, the original randomization must be preserved when pooling the data, and heterogeneity between studies should be assessed.

Reviews of observational studies, for example, studies that investigate the causes of, or risk factors for, disease, also need to be conducted with attention to the potential biases involved in the review process. Publication bias and selective outcome reporting are likely to be even more of a problem with observational data. Assessing the quality of such studies also is more challenging, particularly assessing the effects of uncontrolled confounding (which is not an issue in trials because of the random allocation of treatment). The interpretation of reviews of observational studies differs from that of reviews of randomized controlled trials (Table). Observational studies can provide high-quality evidence if the effect is reasonably strong (relative risk of more than 2) and is observed consistently across different studies with different designs.

It is likely that in the future, there will be increasing debate about the interpretation of individual studies.13,14 Evidence-based reviews, if they are properly conducted and reported,15 provide the link between the results of individual studies and the choices facing patients and clinicians.

AMERICAN JOURNAL OF OPHTHALMOLOGY, VOL. 149, NO. 5, MAY 2010
The author is an editor of the Cochrane Eyes and Vision Group (CEVG), London, United Kingdom, and is paid by CEVG for work on systematic reviews. CEVG is funded by UK National Health Service Research and Development. The author was involved in design and conduct of the study; collection and management of the data; analysis and interpretation of the data; and preparation of the manuscript. The author thanks Richard Wormald, Gianni Virgili, and Anupa Shah for comments on drafts of this editorial.
REFERENCES

1. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med 1992;327:248–254.
2. Rossetti L, Marchetti I, Orzalesi N, Scorpiglione N, Torri V, Liberati A. Randomized clinical trials on medical treatment of glaucoma: are they appropriate to guide clinical practice? Arch Ophthalmol 1993;111:96–103.
3. Clarke M, Hopewell S, Chalmers I. Reports of clinical trials should begin and end with up-to-date systematic reviews of other relevant evidence: a status report. J R Soc Med 2007;100:187–190.
4. Chalmers I. Academia's failure to support systematic reviews. Lancet 2005;365:469.
5. Young C, Horton R. Putting clinical trials into context. Lancet 2005;366:107–108.
6. Egger M, Davey Smith G, Altman DG. Systematic Reviews in Health Care: Meta-Analysis in Context. London: BMJ Publishing Group; 2001.
7. Evans JR, Virgili G, Gordon I, et al. Interventions for neovascular age-related macular degeneration. Cochrane Database of Systematic Reviews—Protocols 2009, Issue 1. New York: John Wiley & Sons, Ltd; 2009.
8. Dwan K, Altman DG, Arnaiz JA, et al. Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE 2008;3:e3081.
9. Laine C, Horton R, DeAngelis CD, et al. Clinical trial registration: looking back and moving ahead. JAMA 2007;298:93–94.
10. Chan AW. Bias, spin, and misreporting: time for full access to trial protocols and results. PLoS Med 2008;5:e230.
11. Higgins JPT, Altman DG. Assessing the risk of bias in included studies. In: Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. The Cochrane Collaboration; 2008. Available from www.cochrane-handbook.org.
12. Atkins D, Best D, Briss PA, et al, GRADE Working Group. Grading quality of evidence and strength of recommendations. BMJ 2004;328:1490–1494.
13. Ioannidis JPA. Why most published research findings are false. PLoS Med 2005;2:e124.
14. Borm GF, Lemmers O, Fransen J, Donders R. The evidence provided by a single trial is less reliable than its statistical analysis suggests. J Clin Epidemiol 2009;62:711–715.
15. Moher D, Liberati A, Tetzlaff J, Altman DG, for the PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 2009;339:b2535.