Journal of Clinical Epidemiology 53 (2000) 1130–1136
Weighting bias in meta-analysis of binary outcomes

Jin-Ling Tang*

Department of Community and Family Medicine, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, The People’s Republic of China

Received 5 January 1999; received in revised form 29 February 2000; accepted 4 March 2000
Abstract

This article demonstrates that the weighting-according-to-the-variance method may introduce bias in meta-analyses of binary outcomes. The weighting favors studies that have certain frequencies of outcome events, and the weights given to studies of the same size may differ by a factor of tens to thousands merely because of variations in that frequency. It also applies different standards to different measures of effect. Thus, the weighting may distort the combined result or even lead to contradictory conclusions when different measures of effect are used. Generally, the bias is more likely to arise when the effect is heterogeneous across the combined trials, the trials are conducted in populations of highly varied risks, the relative risk is used as the effect measure, the effect to be combined is small, any of the trials falls outside the risk range of 20% to 80%, and/or the number of trials is small. Suggestions for detection and control of the bias are also given. © 2000 Elsevier Science Inc. All rights reserved.

Keywords: Meta-analysis; Bias/weighting; Randomized controlled trial; Statistical method; Systematic reviews
1. Introduction

Meta-analysis is increasingly used in medical research [1–5]. The overall result in a meta-analysis is a weighted average of study-specific results, each study normally being weighted by the inverse of its variance [6–8]. The weighting method is used to combine, across studies, the effect of a treatment or of a risk factor on disease. For binary outcomes, the effect is usually expressed as a risk difference, a relative risk, or an odds ratio [9,10]. The standard errors of these effect measures are determined by two factors: the number of study subjects and the frequency of the outcome events in the compared groups. These factors in turn determine the weight given to a study in a meta-analysis that uses the weighting-according-to-the-variance method [6–8].

This article demonstrates that the association of the weight with the frequency of the outcome events gives rise to a weighting bias in meta-analysis of binary outcomes. The weighting bias may distort the combined effect or even lead to contradictory results when different effect measures, such as the risk difference and the relative risk, are used [11]. It implies that many of the observed asymmetrical funnel plots [12,13] and discrepancies between large trials and corresponding meta-analyses [14,15] may be a result of true heterogeneity rather than of biases, as they are usually interpreted [12,14]. Dissection of the weighting bias may also help to better understand the impact of outlier trials and to improve the validity of future meta-analyses.

* Tel. 852-2692-8779; fax: 852-2606-3500. E-mail address: [email protected] (J.-L. Tang)
0895-4356/00/$ – see front matter © 2000 Elsevier Science Inc. All rights reserved. PII: S0895-4356(00)00237-7

2. Materials and methods

Hypothetical and published meta-analyses of randomized controlled trials were used to illustrate the problem. The risk difference and relative risk were used to quantify the treatment effect. The effect measures and their variances were computed using standard formulas [8,16,17]. The weight in the weighting-according-to-the-variance method is defined as the inverse of the variance.

Let $p_1$ and $p_0$ be the risks (i.e., the proportions of outcome events), $q_1 = 1 - p_1$ and $q_0 = 1 - p_0$ the complements of the risks, and $n_1$ and $n_0$ the numbers of subjects in the treated and the control groups, respectively. The risk difference and relative risk are then $p_1 - p_0$ and $p_1/p_0$, with variances $p_1 q_1/n_1 + p_0 q_0/n_0$ and $q_1/(p_1 n_1) + q_0/(p_0 n_0)$, respectively, the latter being the variance of the relative risk on the natural logarithm scale.

To aid illustration of the bias, a special weight, called the risk-attributable weight, is defined as the component of the actual weight assigned to a trial that is completely determined by the risks in the two compared groups of the trial and is independent of the trial size. It is computed by dividing the actual weight assigned to a study by the number of study subjects, who, to simplify matters, are assumed to be equally divided between the two compared groups. Using the notation defined above and assuming $n_1 = n_0 = n$, the risk-attributable weight is $1/[2(p_1 q_1 + p_0 q_0)]$ for the risk difference, $p_1 p_0/[2(q_0 p_1 + q_1 p_0)]$ for the relative risk, $p_1 q_1 p_0 q_0/[2(p_1 q_1 + p_0 q_0)]$ for the odds ratio, and $n(p_1 + p_0)(q_1 + q_0)/[16(n - 0.5)]$ for Peto’s odds ratio. The last was further approximated by $(p_1 + p_0)(q_1 + q_0)/16$, assuming that $n$ is relatively large.

The relations of the risk-attributable weight with the risk are examined for the different effect measures and at various magnitudes of effect to demonstrate how and when the bias may occur. Published meta-analyses chosen from the Cochrane Database of Systematic Reviews [18] were used to corroborate the findings about the bias. All the meta-analyses in this article used the fixed-effect model [6]. All analyses were conducted in SAS 6.07.
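As an illustration of the risk-attributable weight defined above, the following Python sketch evaluates it for the four effect measures. The function name `risk_attributable_weights`, its dictionary keys, and the two example risk pairs are assumptions made for illustration and do not come from the article; the Peto weight uses the large-n approximation unless `n` is supplied.

```python
def risk_attributable_weights(p1, p0, n=None):
    """Risk-attributable weight (weight per subject) for four effect measures.

    p1, p0: risks in the treated and control groups, strictly between 0 and 1.
    n: subjects per group; it only affects Peto's odds ratio. If omitted,
       the large-n approximation (p1 + p0)(q1 + q0)/16 is used.
    """
    q1, q0 = 1.0 - p1, 1.0 - p0
    peto = (p1 + p0) * (q1 + q0) / 16.0
    if n is not None:
        peto *= n / (n - 0.5)  # exact form: n(p1 + p0)(q1 + q0) / [16(n - 0.5)]
    return {
        "risk_difference": 1.0 / (2.0 * (p1 * q1 + p0 * q0)),
        "relative_risk": p1 * p0 / (2.0 * (q0 * p1 + q1 * p0)),
        "odds_ratio": p1 * q1 * p0 * q0 / (2.0 * (p1 * q1 + p0 * q0)),
        "peto_odds_ratio": peto,
    }


# Two hypothetical trials with the same relative risk (0.5) but different risk levels.
mid = risk_attributable_weights(0.25, 0.50)
low = risk_attributable_weights(0.01, 0.02)

for measure in mid:
    print(f"{measure:16s} mid={mid[measure]:.4f}  low={low[measure]:.4f}  "
          f"ratio low/mid={low[measure] / mid[measure]:.3f}")
```

Multiplying a risk-attributable weight by the total number of subjects (2n) recovers the usual inverse-variance weight. Under these assumed risks, the low-risk trial receives roughly 15 times the weight of the mid-risk trial when the risk difference is combined, but only about 1/37 of it when the relative risk is combined.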
3. Results

3.1. A hypothetical example of the weighting bias

The example in Table 1 is constructed to demonstrate the features of the bias. Similar or larger variations in the effect and the risk that cause the bias (demonstrated below) may often exist in real meta-analyses. Evidently, the overall risk difference of the 10 trials is best represented by that in the nine majority trials, which have an identical result. The pooled risk difference of the nine trials is −2% (95% CI: −3.9%, −0.1%), suggesting that the treatment is effective. However, when the tenth trial is included, the overall risk difference becomes 0.16% (95% CI: −1.1%, 1.4%), a result that defies common sense and differs in both magnitude and direction from that in the majority of the trials. One trial with 20 outcome events, as compared with nine trials with 290 events in each, changes the conclusion about the treatment from “very likely to be beneficial” to “no effect or possibly harmful.” Inclusion of the tenth trial also reduces the standard error of the combined effect from 0.96% to 0.65%.

In contrast, the relative risk is little affected by the tenth trial. The combined relative risks with and without the tenth trial are very similar and represent the nine identical trials well. It is also noted that the combined relative risk of the 10 trials differs in direction from the combined risk difference.

3.2. How does the weighting bias occur?

How have the above results occurred? In a meta-analysis, the weight given to a trial is determined by the risk and the number of subjects in the compared groups (for details see Materials and methods). Thus, the weighting may give some small trials disproportionately large weights merely because of the magnitude of the risk. Bias will result if these disproportionately over-weighted studies also differ considerably in effect from the majority of studies. In Table 1, the tenth trial has a risk much lower than that of the nine other trials and is given 54.0% of the total weight in combining the risk difference. The trial also has an effect that differs in direction from that in the nine trials. Consequently, the combined risk difference is predominantly determined by, and biased toward, the effect in the tenth trial.

Furthermore, the standard errors of the risk difference and the relative risk are different functions of the risks in the compared groups (for details see Materials and methods). Thus, the weighting may weight the same studies differently and lead to discrepant or contradictory results when different effect measures are used. Again in Table 1, the tenth trial is given only 0.4% of the total weight in combining
Table 1. A hypothetical example of the weighting bias in meta-analysis of 10 randomized controlled trials

| Trial ID | Treated group: total number ($n_1$) | Treated group: % of events ($p_1$) | Control group: total number ($n_0$) | Control group: % of events ($p_0$) | Risk difference (%) | Risk difference: absolute weight | Risk difference: weight %^a | Relative risk on ln scale | Relative risk: absolute weight | Relative risk: weight % |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 2 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 3 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 4 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 5 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 6 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 7 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 8 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 9 | 500 | 28 | 500 | 30 | −2 | 1214.8 | 5.1 | −0.069 | 101.94 | 11.1 |
| 10 | 500 | 3 | 500 | 1 | 2 | 12820.5 | 54.0 | 1.099 | 3.81 | 0.4 |

Meta-analyses of the risk difference:
Trials 1–9: overall risk difference (SE): −2.00% (0.96%)
All 10 trials: overall risk difference (SE): 0.16% (0.65%)
Heterogeneity test: χ² = 9.44, df = 9, P = 0.3976

Meta-analyses of the relative risk:
Trials 1–9: overall relative risk on ln scale (SE): −0.069 (0.033)
All 10 trials: overall relative risk on ln scale (SE): −0.064 (0.033)
Heterogeneity test: χ² = 5.17, df = 9, P = 0.8192

^a Weight % may not add up to 100% due to approximation.
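The pooled results in Table 1 can be checked with a short Python sketch of inverse-variance fixed-effect pooling. The function and variable names are illustrative assumptions, and the article's own analyses were run in SAS, so this is a re-computation from the tabulated risks and group sizes rather than the author's code.

```python
import math


def rd_and_variance(p1, n1, p0, n0):
    """Risk difference and its variance, p1*q1/n1 + p0*q0/n0."""
    return p1 - p0, p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0


def ln_rr_and_variance(p1, n1, p0, n0):
    """ln relative risk and its variance, q1/(p1*n1) + q0/(p0*n0)."""
    return math.log(p1 / p0), (1 - p1) / (p1 * n1) + (1 - p0) / (p0 * n0)


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (effect, variance) pairs."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    pooled = sum(w * eff for w, (eff, _) in zip(weights, estimates)) / total
    # Heterogeneity chi-square: weighted squared deviations from the pooled effect.
    q = sum(w * (eff - pooled) ** 2 for w, (eff, _) in zip(weights, estimates))
    return {
        "pooled": pooled,
        "se": math.sqrt(1.0 / total),
        "q": q,
        "shares": [w / total for w in weights],
    }


# Table 1: nine trials with risks 28% vs 30%, one trial with 3% vs 1%; 500 per group.
trials = [(0.28, 500, 0.30, 500)] * 9 + [(0.03, 500, 0.01, 500)]

rd_nine = pool_fixed_effect([rd_and_variance(*t) for t in trials[:9]])
rd_all = pool_fixed_effect([rd_and_variance(*t) for t in trials])
rr_nine = pool_fixed_effect([ln_rr_and_variance(*t) for t in trials[:9]])
rr_all = pool_fixed_effect([ln_rr_and_variance(*t) for t in trials])

for label, res in [("RD, trials 1-9", rd_nine), ("RD, all 10", rd_all),
                   ("lnRR, trials 1-9", rr_nine), ("lnRR, all 10", rr_all)]:
    print(f"{label:17s} pooled={res['pooled']:.4f} SE={res['se']:.4f} "
          f"chi2={res['q']:.2f} share of last trial={res['shares'][-1]:.3f}")
```

Run on the Table 1 inputs, the sketch gives the tenth trial about 54% of the weight for the risk difference and about 0.4% for the relative risk, in line with the tabulated values.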
the relative risk as compared to 54.0% in combining the risk difference. Consequently, the combined relative risk and risk difference differ in direction, as the former is hardly affected by the tenth trial and thus represents the nine majority trials well.

3.3. What trials will be disproportionately weighted?

What trials will be disproportionately weighted in a meta-analysis and cause bias? This can be determined by examining how the risk-attributable weight varies as the risk varies. The curve for the risk difference is shown in Fig. 1. It shows that, in combining the risk difference, trials with small risks (or, in some situations, large risks) are given large weights, while trials with medium-sized risks between 20% and 80% are given small weights. Thus, if one or a few trials have small risks and the majority of trials have medium-sized risks, the trials with small risks will be over-represented and bias may result. The combined effect will be biased toward that in the over-weighted trial(s). If these trials have smaller effects than the majority trials, the combined effect will be under-estimated (the meta-analysis in Table 1 is a case in point); otherwise, it will be over-estimated. When the effect is similar in the outlier and the rest of the trials, the combined effect will not be biased but its standard error may be affected.

Fig. 1. The relation of the risk-attributable weight with the risk in the treated group of randomized controlled trials for various magnitudes of effect expressed in the relative risk, assuming that the smallest risk-attributable weight is unity: risk difference.

Conversely, if a few large trials have medium-sized risks and the rest have very small risks, bias may also occur because the large trials are under-weighted. The combined effect will be biased toward that in the small trials. Both over- and under-weighted trials will be referred to below as outlier trials, and both may co-exist in a single meta-analysis. Furthermore, a single combined effect will surely be inappropriate when the effect is associated with the risk and hence directly associated with the weight. In this case, no outlier trials may be evidently identifiable. Further discussion will focus on situations in which outlier trials are of major concern.

Outlier trials in meta-analysis of the relative risk, odds ratio, and Peto’s odds ratio may be similarly determined by using their respective curves in Figs. 2, 3, and 4. Generally speaking, in combining the relative measures of effect, the same trials will be weighted similarly by the weighting scheme, except when the risk is close to 100% and/or the effect is close to no effect. Thus an outlier trial, if present, may affect these relative measures similarly. However, the curves for the relative effect measures have the inverted shape of that for the risk difference. Outlier trials that are given very large weights and are over-weighted in pooling the risk difference will thus be given very small weights and be under-weighted in pooling the relative risk, and vice versa. The same outlier trials may thus affect the two combined measures of effect differently. As shown in Table 1, this may even lead to combined effects that contradict each other when different effect measures are used.

Fig. 2. The relation of the risk-attributable weight with the risk in the treated group of randomized controlled trials for various magnitudes of effect expressed in the relative risk, assuming that the smallest risk-attributable weight is unity: relative risk.

Fig. 3. The relation of the risk-attributable weight with the risk in the treated group of randomized controlled trials for various magnitudes of effect expressed in the relative risk, assuming that the smallest risk-attributable weight is unity: odds ratio.

Fig. 4. The relation of the risk-attributable weight with the risk in the treated group of randomized controlled trials for various magnitudes of effect expressed in the relative risk, assuming that the smallest risk-attributable weight is unity: Peto’s odds ratio.

3.4. When is the bias more likely to occur?

The four figures also show that the variation of the risk-attributable weight with the risk is much greater (1) for the relative risk than for the other three measures of effect; (2) as the risk gets close to 0% or 100%; and/or (3) as the effect gets close to the null. This implies that in these situations outlier trials are more likely to arise and thus cause the bias. Evidently, the weighting bias is also more likely to occur in meta-analyses with a small number of trials, because the outlier(s) will be relatively more predominant.

3.5. Some real examples of the weighting bias

Six meta-analyses were selected from the Cochrane Database of Systematic Reviews to present various situations in which the weighting bias may occur (Table 2) [19–24]. Two outlier trials are identified in each meta-analysis, one for the risk difference and one for the relative risk. Over- or under-representation of these trials is shown by comparing the weight and the number of subjects of the trial relative to their totals in the meta-analysis. The pooled risk difference and relative risk with and without the outlier trial, obtained using the weighting-according-to-the-variance method, are provided in the table together with their standard errors and the corresponding chi-square values for heterogeneity. The combined effect weighted by the number of subjects is also given as a reference estimate that is not affected by the weighting bias.

Meta-analysis 1 [19] provides the simplest form of the weighting bias, in which the combined effect is affected by a single small trial. The combined relative risk is largely determined by a small trial that is given 74% of the total weight and is thus substantially under-estimated, whereas the combined risk difference is not affected in this way and seems to provide an unbiased overall estimate. In such a situation, either using the unbiased effect measure or excluding the outlier trial from the analysis will avoid the bias.

Often, the outlier trial may not be small and both effect measures may be biased. In meta-analysis 2 [20], a large trial is over-represented in combining both the risk difference and the relative risk and thus biases both. In meta-analysis 3 [21], the same large trial is over-represented in combining
the risk difference but under-represented in combining the relative risk, leading to a biased overall estimate of both effect measures. More often, the outlier trials that bias the two effect measures may not be the same. In meta-analysis 4 [22], the combined risk difference is largely determined by one trial but the combined relative risk by another; both effect measures are biased. This meta-analysis also provides a real example of the situation in the hypothetical meta-analysis in Table 1, in which the combined risk difference differs in direction from the combined relative risk. In these meta-analyses, the bias cannot be avoided by choosing an unbiased effect measure (as none exists) or by excluding the outlier trial. Sources of heterogeneity need to be evaluated.

In meta-analysis 5 [23], the relative risk is associated with the risk in the control group. The combined effect thus favors trials that have large risks, and a single combined effect does not best represent the individual trials. Description of the risk–effect association may be more appropriate. Alternatively, the risk difference may be used to combine the trials in this example, as it is not evidently associated with the risk. However, the relative risk in meta-analysis 6 [24] is associated with the risk in the treated group. Again, the combined effect is inappropriate. As the risk in the treated group is not a cause but a consequence of the heterogeneity in the effect, other sources of heterogeneity need to be explored. (In fact, meta-analyses 2 and 3 are also complicated by the relation of the effect with the risk in the compared groups.)

Table 2. Weighting bias in published meta-analyses of randomized controlled trials

| Meta-analysis | First author | Comparison (outcome) | Number of trials | Measure of effect^a | Outlier trial^b: % of subjects (n) | Outlier trial^b: % of weight | With outlier trial^b: effect (SE) | With outlier trial^b: χ² | Without outlier trial^b: effect (SE) | Without outlier trial^b: χ² | Pooled effect using trial size as weight |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | King [19] | Any antibiotics vs no antibiotics (maternal infection) | 10 | Risk diff. | 15% | 21% | −4.5% (1.8%) | 2.2 | −4.6% (2.0%) | 2.2 | −4.8% |
| | | | | LnRR | 6% | 74% | −0.23 (0.17) | 5.3 | −0.80 (0.34) | 1.5 | −0.83 |
| 2 | Martin-Hirsch [20] | Spatula + cytobrush vs spatula + swab (presence of endocervical cells) | 6 | Risk diff. | 43% | 71% | 9.6% (0.4%) | 179.7 | 17.3% (0.8%) | 30.1 | 13.0% |
| | | | | LnRR | 43% | 78% | 0.10 (0.005) | 203.2 | 0.21 (0.01) | 41.7 | 0.16 |
| 3 | Soll [21] | Prophylactic surfactant vs treatment with surfactant (pulmonary interstitial emphysema) | 6 | Risk diff. | 61% | 96% | −0.2% (0.4%) | 9.7 | −6.1% (2.0%) | 0.6 | −2.6% |
| | | | | LnRR | 61% | 7% | −0.60 (0.21) | 1.3 | −0.64 (0.22) | 0.7 | −0.29 |
| 4 | Hodnett [22] | Support from caregivers during childbirth (oxytocin augmentation - unaccompanied) | 5 | Risk diff. | 25% | 48% | −6.5% (1.5%) | 17.4 | −2.7% (2.1%) | 10.8 | −4.0% |
| | | | | LnRR | 30% | 89% | 0.02 (0.05) | 22.1 | −0.45 (0.15) | 10.5 | −0.61 |
| 5 | Cullum [23] | Compression vs no compression (complete healing in trial period: varying lengths) | 10 | Risk diff. | 26% | 23% | 34.6% (5.4%) | 8.7 | 34.3% (6.2%) | 8.6 | 32.7% |
| | | | | LnRR | 16% | 40% | 0.47 (0.11) | 9.8 | 0.71 (0.14) | 3.2 | 0.65 |
| 6 | Handoll [24] | Any heparin vs control/placebo (DVT-any: U Heparin) | 6 | Risk diff. | 18% | 54% | −12.6% (2.4%) | 22.7 | −20.4% (3.5%) | 13.2 | −16.9% |
| | | | | LnRR | 6% | 33% | −0.40 (0.09) | 20.0 | −0.42 (0.11) | 19.9 | −0.80 |

^a LnRR, the natural logarithm of the relative risk.
^b An outlier trial here is the trial that is most disproportionately over-weighted and often given the largest weight, except that in combining the relative risk in meta-analysis 3 it is the largest trial that is most disproportionately under-weighted. Evidently, the outlier trial also differs in effect from the overall effect in the remaining trials.

4. Summary and suggestions

In summary, the weighting bias is a distortion in the combined effect of a meta-analysis that is caused by some of the combined trials being disproportionately over- and/or under-represented due to the outlying risks in their compared groups. The bias may under- or over-estimate the combined effect and lead to discrepant or even contradictory results when different effect measures are used. It also implies that many of the observed asymmetrical funnel plots [12,13] and discrepancies between large trials and corresponding meta-analyses [14,15] may be a result of true heterogeneity rather than of biases, as they are usually interpreted [12,14].

Variation in both the effect and the risk in the compared groups is a prerequisite for the occurrence of the weighting bias. Normally the risk will vary as the effect varies. As the effect often varies across combined trials [11], the bias is likely to be common in practice. It is noted that the bias may still result when the effect variation is not statistically significant (see meta-analysis 1 in Table 2). The presence, direction, and magnitude of the bias will further depend on the relative number of outlier trials, the choice of effect measure, and the magnitude of the effect to be pooled. Generally speaking, the bias is more likely to arise when the effect is heterogeneous across the combined trials, the trials are conducted in populations of highly varied risks, the relative risk is used as the effect measure, the effect to be combined is small, any of the trials falls outside the risk range of 20% to 80%, and/or the number of trials is small.

To detect the presence of the bias, one may compare the combined effect weighted according to the variance with the combined effect of the same effect measure weighted according to the trial size. The percentage difference between the two combined effects can be quantified as a measure of the bias, and the sign of the difference will indicate the direction of the bias. In many cases, however, the outlier trial(s) may not be obvious and the bias may not be readily apparent. When the bias is considered large enough to be of practical importance, further comparison of the relative weight and the relative number of subjects of individual trials will help to identify the outlier trial(s). Special attention should be paid to over-weighted small trials and under-weighted large trials. The association of the effect with the risk in the control group or the treated group should also be examined, in particular when no outlier trials are evidently identifiable.

Where the weighting bias is largely caused by a single small trial that is over-weighted, excluding the trial from the meta-analysis may provide an unbiased estimate. However, the disproportionately weighted trials often may not be small either in size or in number, so that excluding them may be inappropriate, or the outlier trials may not be evidently identifiable. In these cases, it should first be determined whether or not the risk in the control group is a determinant of the heterogeneity [25,26]. If not, further efforts should be made to evaluate other sources of the heterogeneity [27–29]. Wherever appropriate, the effect should be presented according to the factor(s) that cause the heterogeneity. Such information would be of great value for future research and clinical decision making.
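The detection steps suggested above can be sketched in Python as follows, using the hypothetical trials of Table 1 as input. The function name `weighting_bias_check`, the choice of the size-weighted estimate as the denominator of the percentage difference, and the weight-share to size-share ratio used to flag trials are assumptions made for illustration; the article does not prescribe exact formulas for these quantities.

```python
def weighting_bias_check(effects, variances, sizes):
    """Compare variance-weighted and size-weighted pooled effects.

    effects:   study-specific effects (e.g., risk differences)
    variances: their variances
    sizes:     total number of subjects per trial
    """
    inv_var = [1.0 / v for v in variances]
    total_w, total_n = sum(inv_var), sum(sizes)
    variance_weighted = sum(w * e for w, e in zip(inv_var, effects)) / total_w
    size_weighted = sum(n * e for n, e in zip(sizes, effects)) / total_n
    # Assumed definition: difference relative to the size-weighted estimate
    # (undefined when that estimate is exactly zero).
    pct_difference = 100.0 * (variance_weighted - size_weighted) / abs(size_weighted)
    # Ratio > 1: trial is over-weighted relative to its size; < 1: under-weighted.
    share_ratio = [(w / total_w) / (n / total_n) for w, n in zip(inv_var, sizes)]
    return variance_weighted, size_weighted, pct_difference, share_ratio


# Table 1 inputs: (p1, n1, p0, n0) for nine identical trials and one outlier.
trials = [(0.28, 500, 0.30, 500)] * 9 + [(0.03, 500, 0.01, 500)]
effects = [p1 - p0 for p1, _, p0, _ in trials]
variances = [p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0 for p1, n1, p0, n0 in trials]
sizes = [n1 + n0 for _, n1, _, n0 in trials]

var_w, size_w, pct, ratios = weighting_bias_check(effects, variances, sizes)
print(f"variance-weighted RD = {var_w:.4f}, size-weighted RD = {size_w:.4f}")
print(f"percentage difference = {pct:.1f}%")
print("weight share / size share per trial:", [round(r, 2) for r in ratios])
```

In this hypothetical case the size-weighted risk difference stays negative while the variance-weighted one turns positive, and the tenth trial carries more than five times the weight that its share of subjects would suggest, which is the pattern the text recommends looking for.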
Acknowledgments

I thank Joseph Liu for comments on earlier drafts and Aprille Sham for help with graphs.

References

[1] Fleiss JL, Gross AJ. Meta-analysis in epidemiology, with special reference to studies of the association between exposure to environmental tobacco smoke and lung cancer. J Clin Epidemiol 1991;44:127–39.
[2] Chalmers TC, Lau J. Meta-analytical stimulus for changes in clinical trials. Statistical Methods in Medical Research 1993;2:161–72.
[3] Lau J, Schmid CH, Chalmers TC. Cumulative meta-analysis of clinical trials builds evidence for exemplary medical care. J Clin Epidemiol 1995;48:45–57.
[4] Irwig L, MacCaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. J Clin Epidemiol 1995;48:119–30.
[5] Collins R, Peto R, Gray R, Parish S. Large-scale randomized evidence: trials and overviews. In: Weatherall DJ, Ledingham JGG, Warrel DA, editors. Oxford Textbook of Medicine, 3rd edition. Oxford: Oxford University Press, 1996. pp. 21–32.
[6] DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–88.
[7] Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiol Rev 1988;9:1–30.
[8] Fleiss JL. The statistical basis of meta-analysis. Statistical Methods in Medical Research 1993;2:121–45.
[9] Fletcher RH, Fletcher SW, Wagner EH. Clinical Epidemiology: The Essentials. 3rd edition. Baltimore, MD: Williams & Wilkins, 1996. pp. 149–50.
[10] Greenland S, Rothman KJ. Measures of effect and measures of association. In: Rothman KJ, Greenland S, editors. Modern Epidemiology. 2nd edition. Philadelphia, PA: Lippincott-Raven, 1998. pp. 47–64.
[11] Berlin JA, Laird NM, Sacks HS, Chalmers TC. A comparison of statistical methods for combining events rates from clinical trials. Stat Med 1989;8:141–51.
[12] Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphic test. BMJ 1997;315:629–34.
[13] Tang JL, Liu LYJ. Misleading funnel plot for detection of bias in meta-analysis. J Clin Epidemiol 2000;53:477–84.
[14] Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomised controlled trials. Lancet 1995;345:772–6.
[15] Cappelleri JC, Ioannidis JPA, Schmid CH, de Ferranti SD, Aubert M, Chalmers TC, Lau J. Large trials vs meta-analysis of smaller trials: how do their results compare? JAMA 1996;276:1332–8.
[16] Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall, 1991. p. 233.
[17] Yusuf S, Peto R, Lewis J, Parish S. Beta blockade during and after myocardial infarction. An overview of randomized trials. Prog Cardiovasc Dis 1985;27:335–71.
[18] Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[19] King J, Flenady V. Antibiotics in preterm labour with intact membranes. (Comparison: any antibiotics vs no antibiotics; outcome: maternal infection). Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[20] Martin-Hirsch P, Jarvis G, Kitchener H, Lilford R. Collection devices for obtaining cervical cytology samples. (Comparison: spatula + cytobrush vs spatula + swab; outcome: presence of endocervical cells). Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[21] Soll RF, Morley CJ. Prophylactic versus selective use of surfactant for preventing morbidity and mortality in preterm infants. (Comparison: prophylactic surfactant vs treatment with surfactant; outcome: pulmonary interstitial emphysema). Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[22] Hodnett ED. Caregiver support for women during childbirth. (Comparison: support from caregivers during childbirth; outcome: oxytocin augmentation - unaccompanied). Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[23] Cullum N, Nelson EA, Fletcher AW, Sheldon TA. Compression bandages and stockings in the treatment of venous leg ulcers. [Comparison: compression vs no compression; outcome: complete healing in trial period (varying lengths)]. Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[24] Handoll HHG, Farrar MJ, McBirnie J, Tytherleigh-Strong G, Awal KA, Milne AA, Gillespie WJ. Heparin, low molecular weight heparin and physical methods for preventing deep vein thrombosis and pulmonary embolism following surgery for hip fractures. (Comparison: any heparin vs control/placebo; outcome: DVT-any: U Heparin). Cochrane Library: Cochrane Database of Systematic Reviews. Oxford: Update Software, 1999.
[25] Sharp SJ, Thompson SG, Altman G. The relation between treatment benefit and underlying risk in meta-analysis. BMJ 1996;313:735–8.
[26] Schmid CH, Lau J, McIntosh M, Cappelleri JC. An empirical study of the effect of the control rate as a predictor of treatment efficacy in meta-analysis of clinical trials. Stat Med 1998;17:1923–42.
[27] Thompson SG. Why sources of heterogeneity in meta-analysis should be investigated? BMJ 1994;309:1251–5.
[28] Smith GD, Egger M. Who benefits from medical interventions? BMJ 1994;308:72–4.
[29] Berlin JA. Invited commentary: benefits of heterogeneity in meta-analysis of data from epidemiologic studies. Am J Epidemiol 1995;142:383–7.