Public Health (2001) 115, 165-172
© R.I.P.H.H. 2001
www.nature.com/ph

Leading Article

Breast and cervical cancer survival: making sense of 'league tables'

JL Botha1,2,3*, PB Silcocks1,4, N Bright1 and P Redgrave5

1Trent Cancer Registry, Weston Park Hospital, Whitham Road, Sheffield S10 2SJ, UK; 2Trent Institute for Health Services Research, 22-28 Princess Road West, Leicester LE1 6TP, UK; 3Department of Oncology, University of Leicester, Osborne Building, Leicester Royal Infirmary, Leicester LE1 5WW, UK; 4Trent Institute for Health Services Research, Room B39, Medical School, Queen's Medical Centre, Nottingham NG7 2UH, UK; and 5Rotherham Health Authority, Bevan House, Oakwood Hall Drive, Rotherham S60 3AQ, UK

During 1998, the Department of Health proposed to use survival rates of cervical and breast cancer in the 1989/90 incidence cohort as indicators of care. Valid interpretation was of concern within Trent and the Trent Cancer Registry responded by performing additional analyses. Trent Cancer Registry registrations for 1989/90 were re-analysed and the stability of districts' ranks for that cohort was investigated using random simulation techniques. Stability of ranks across more recent cohorts was investigated and attempts made to use all available information. The Department of Health's analyses were confirmed by our re-analysis of the 1989/90 cohort: Rotherham residents appeared to have the 'worst' survival for cervical cancer, and Sheffield residents for breast cancer, although not statistically significantly so. Random simulations indicated that ranks based on a single cohort are not stable: for example Sheffield (ranked tenth for 1-y breast cancer survival) was ranked third or better in 6% of randomisations. Ranks were also unstable across cohorts: for example Rotherham 1-y cervical cancer survival was ranked tenth for 1989/90, fifth for 1991/92 and tenth for 1993/94. Analysis of 3-y running averages provided better information than the league table approach. Most districts improved over time, to different degrees, and similar sized gaps remained between the 'best' and the 'worst' districts.

This analysis illustrates the need to be circumspect when interpreting 'league tables' based on a single year or cohort analysis. League tables are based on ranks: clearly a large difference in rank may reflect only trivial (ie medically unimportant) differences in actual outcome. Lack of a statistically significant difference in survival between two districts does not mean their survival is equivalent. Even for a common cancer, like breast cancer, rankings were unstable from cohort to cohort. At the Registry we propose to perform these trend analyses routinely in future, adjusting, when possible, for the effects of deprivation and stage at diagnosis.

Public Health (2001) 115, 165-172.

Keywords: cancer survival; league tables
Introduction

During the 1980s and 1990s more and more indicators to assess health care performance were developed and displayed as league tables. This happened despite repeated questions about their usefulness,[1-6] validity[7-11] and impact.[12-14] In a recent example from 1998, the Department of Health (DH) circulated a paper proposing to use 1-y and 5-y survival statistics as indicators of care for cervical and breast cancer. In the paper, different regions, and districts within regions, were compared. In Trent it appeared, for example, as if residents of Rotherham had the worst survival for cervical cancer. These indicators were to be put in the public domain, and the NHS Executive in Trent asked Rotherham Health Authority and service providers for explanations of their 'poor performance'. Similar concerns were raised regarding breast cancer survival for people resident in Sheffield Health Authority.

At that time, Trent Cancer Registry was asked whether it was feasible to perform more up-to-date analyses, given that the analyses published were for cases diagnosed in 1989 or 1990. That was the most recent cohort that could be analysed nationally. At the Registry we were also interested in exploring ways to interpret 'league tables' sensibly, particularly as related survival estimates were also published as High Level Performance Indicators by the DH.[15]
*Correspondence: JL Botha, Trent Cancer Registry, Weston Park Hospital, Whitham Road, Sheffield S10 2SJ, UK.
Accepted 29 January 2001

Methods

Several analyses were done:

1. We repeated the analyses reported in the original DH paper, ie for the 1989/90 incidence cohorts.

2. For every district we assumed relative survival to have a normal distribution,[11] with mean and variance given by the observed estimate and its variance for 1989/90. For each health district, the observed estimate is effectively a single sample from all possible relative survival values consistent with the data for that district. We then drew 1000 other samples (randomly) from each district's distribution, and ranked the districts each time. We did this to investigate the stability of the observed ranks for the 1989/90 cohort.

3. We analysed more recent cohorts for 1-y (1991/92, 1993/94) and 5-y relative survival (1991/92),[16] similar to the analysis reported in the DH paper. A potential problem was incompleteness of our mortality data, because the Office for National Statistics (ONS) at Southport had fallen behind in notifying cancer registries about non-cancer deaths. However, we had no reason to think that any one district would be systematically affected by this problem, as it was likely to be a random effect across all districts.

4. We used additional information available in the Registry, particularly trends over time. This enabled us to smooth some of the random variation due to small numbers by using 3-y rolling average estimates of survival, rather than successive 2-y cohorts. We were able to do these analyses in cases diagnosed between 1985 and 1993, for both 1-y and 5-y survival.
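
The randomisation in step 2 can be sketched in a few lines of Python. This is a minimal illustration and not the Registry's own code: the function name `simulate_ranks`, the district labels, and the survival estimates and standard errors below are all hypothetical placeholders chosen only to show the mechanics.

```python
import random
from collections import Counter


def simulate_ranks(estimates, std_errors, n_draws=1000, seed=0):
    """Return {district: Counter(rank -> number of draws with that rank)}.

    Each draw samples one survival value per district from
    Normal(estimate, std_error) and ranks the districts from
    highest (rank 1) to lowest sampled survival.
    """
    rng = random.Random(seed)
    districts = list(estimates)
    rank_counts = {d: Counter() for d in districts}
    for _ in range(n_draws):
        draw = {d: rng.gauss(estimates[d], std_errors[d]) for d in districts}
        ordered = sorted(districts, key=lambda d: draw[d], reverse=True)
        for rank, d in enumerate(ordered, start=1):
            rank_counts[d][rank] += 1
    return rank_counts


# Hypothetical 1-y relative survival (%) and standard errors, for illustration only.
estimates = {"District A": 88.0, "District B": 85.0, "District C": 83.0, "District D": 71.0}
std_errors = {"District A": 3.0, "District B": 4.0, "District C": 3.5, "District D": 6.0}

counts = simulate_ranks(estimates, std_errors)
for district, c in counts.items():
    shares = {rank: c[rank] / 1000 for rank in sorted(c)}
    print(district, shares)
```

The share of draws in which a district holds each rank is the kind of distribution summarised in the Results; wide standard errors (small numbers of cases) spread a district's rank over many positions.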
Results

Table 1 shows the incidence of breast and cervical cancer in Trent districts in recent years. These are the numbers used for the analyses reported here.

Table 1 The incidence of breast and cervical cancer in Trent districts (females, all ages), by year

Breast (invasive)
District            1985  1986  1987  1988  1989  1990  1991  1992  1993  1994  1995  1996
Barnsley             112   113   101   106    93   110   135   137   156   147   125   137
Doncaster            134   136   151   136   145   162   195   168   156   160   147   179
Leicestershire       417   437   418   450   472   479   476   528   527   481   421   434
Lincolnshire         256   273   324   287   338   374   409   445   347   391   391   376
N Derbyshire         209   184   175   168   189   214   258   255   216   252   229   222
N Nottinghamshire    178   197   184   187   222   218   247   251   263   207   216   207
Nottingham           305   281   318   342   307   343   334   374   354   326   320   324
Rotherham            126   111   112   117   128   146   151   164   147   141   101   109
Sheffield            247   305   265   278   259   281   323   327   312   271   258   249
S Derbyshire         252   252   264   233   271   319   320   289   334   311   299   322

Cervical (invasive)
District            1985  1986  1987  1988  1989  1990  1991  1992  1993  1994  1995  1996
Barnsley              21    24    33    28    27    27    19    16    23    25    27    19
Doncaster             29    45    38    48    37    42    35    19    25    19    20    15
Leicestershire        68    79    61    56    65    58    44    51    37    46    23    38
Lincolnshire          49    43    40    42    46    37    32    38    36    44    31    33
N Derbyshire          32    21    35    32    28    35    21    20    20    17    20    17
N Nottinghamshire     26    26    35    34    37    32    24    24    22    22    27    24
Nottingham            60    64    64    59    54    61    48    44    40    43    28    32
Rotherham             30    23    18    24    26    23    23    18    18    24    13     8
Sheffield             40    39    47    34    45    44    29    23    37    24    36    19
S Derbyshire          43    44    46    52    20    40    22    34    25    25    24    34

1989/90 cohort

Figures 1 and 2 show 1-y and 5-y relative survival, with 95% confidence intervals, for cervical cancer and breast cancer, respectively. Districts are shown ranked from highest to lowest observed relative survival.

For cervical cancer, average 1-y survival for Trent was 83%, with districts varying around that (Figure 1a). For this cohort (used in the DH paper) Rotherham appears to be the district in Trent with the worst 1-y relative survival (71%, ranked tenth out of 10 districts), although this was not statistically significant. Average 5-y relative survival in Trent was 65%, with districts varying around that (Figure 1b). Variation was less than for 1-y survival, partly reflecting the larger number of deaths occurring. Rotherham did not have the worst 5-y relative survival in this cohort (62%, ranked sixth out of 10 districts).

Figure 1 Invasive carcinoma of the cervix: relative survival in the 1989/90 cohort.

For breast cancer, average 1-y survival for Trent was 88%, with districts varying around that (Figure 2a). For this cohort (used in the DH paper) Sheffield appeared to be the district in Trent with the worst 1-y relative survival (82%, ranked tenth out of 10 districts), although this was not statistically significant. Average 5-y relative survival in Trent was 68%, with districts varying around that (Figure 2b). Sheffield also appeared to be the district in Trent with the worst 5-y relative survival (63%, again ranked tenth out of 10 districts), although this too was not statistically significant. For both 1-y and 5-y relative survival, between-district variation was less, and 95% CIs narrower, than for cervical cancer, partly reflecting the fact that breast cancer occurs more commonly.

Figure 2 Breast cancer: relative survival in the 1989/90 cohort.
Stability of the observed ranks for the 1989/90 cohort

Selected distributions of ranks obtained from the randomisation procedure are shown in Figure 3 (1-y relative survival for cervical cancer and 5-y relative survival for breast cancer). In each case the distribution of ranks is shown for the index district (Rotherham for cervical cancer and Sheffield for breast cancer), as well as one other district with a contrasting observed rank.

From Figure 3a it can be seen that for 1-y relative survival in cervical cancer patients, Rotherham was ranked tenth (observed rank) in 71% of randomisations, yet it was ranked seventh or better in 7.1% of randomisations. Sheffield had an observed rank of fifth and the sampling distribution of ranks was flat, with the most common rank (fifth) occurring in 15% of randomisations. For 5-y survival (not shown in the figure) the distributions were less spread out, reflecting the larger numbers of deaths. Rotherham had an observed rank of sixth, and the most common randomised rank was seventh (23.4%). Doncaster had an observed rank of first, that being the rank occurring in 82% of randomisations, although it was ranked fourth or fifth in 2.3% of randomisations.

From Figure 3b it can be seen that for 5-y survival in breast cancer patients Sheffield had an observed rank of tenth, which was also the most common randomised rank (47.3%), although it was ranked sixth or better in 5% of randomisations. Rotherham had an observed rank of fifth; the rank occurring most commonly was sixth, in 16% of randomisations, although it was ranked first or second in 10% of randomisations, and tenth in 5%. For 1-y relative survival in breast cancer patients (not shown in the figure), Sheffield was again ranked tenth (observed rank) in 36% of randomisations, yet it was ranked third or better in 6% of randomisations. Leicestershire (a large district) had an observed rank of fifth and the sampling distribution of ranks was wide and flat, with the most common rank (fifth) occurring in 16% of randomisations.

Figure 3 Selected distributions of ranks resulting from 1000 randomisations (1989/90 cohort).

More recent cohorts

For each cohort, districts were ranked according to the observed relative survival. The ranks are shown in Table 2.

Table 2 Variation in ranking of Trent districts when analysing relative survival in different incidence cohorts

                              1-year                  5-year
District            1989-90  1991-92  1993-94    1989-90  1991-92

Invasive cervical cancer
Barnsley                  9        6        8          2        6
Doncaster                 2        7        6          1        2
Leicestershire            1        3        8          4        3
Lincolnshire              6        9        2          6        7
N Derbyshire              8       10        4          9       10
N Nottinghamshire         7        1        1          9        4
Nottingham                4        1        7          5        1
Rotherham                10        5       10          6        5
S Derbyshire              3        3        4          8        9
Sheffield                 4        8        2          3        8

Breast cancer
Barnsley                  3        1        6          3        1
Doncaster                 8        5        8          4        9
Leicestershire            5        2        1          1        3
Lincolnshire              1        9        6          7        7
N Derbyshire              7        8       10          2        5
N Nottinghamshire         4        6        1          9       10
Nottingham                2        4        3          6        6
Rotherham                 9        3        3          5        2
S Derbyshire              6        7        3          8        4
Sheffield                10       10        9         10        8

From Table 2 it is evident that there is considerable cohort-to-cohort variation in survival from cervical cancer for each district. Whereas Rotherham was ranked tenth out of 10 districts for 1-y survival in the 1989/90 cohort, it was ranked fifth for the 1991/92 cohort and tenth again for the 1993/94 cohort. Such variation is also evident for larger districts, for example Leicestershire (first, third, eighth) and Nottingham (fourth, first, seventh), and for 5-y survival, for example Sheffield (third, eighth).

There is also cohort-to-cohort variation in survival from breast cancer for each district, if somewhat smaller. Sheffield was ranked tenth out of 10 districts for 1-y survival in the 1989/90 cohort, tenth again for the 1991/92 cohort and ninth for the 1993/94 cohort. Bigger variation occurred in larger districts, for example Leicestershire (fifth, second, first) and Lincolnshire (first, ninth, sixth), and for 5-y survival, for example Sheffield (tenth, eighth).
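
The cohort-to-cohort instability can be made concrete by computing, for each district, the gap between its best and worst rank. The sketch below does this for the 1-y cervical cancer ranks transcribed from Table 2; the names `CERVICAL_1Y_RANKS` and `rank_spread` are introduced here for illustration only and are not part of the original analysis.

```python
# 1-y relative survival ranks for invasive cervical cancer, transcribed from Table 2.
COHORTS = ("1989-90", "1991-92", "1993-94")
CERVICAL_1Y_RANKS = {
    "Barnsley": (9, 6, 8),
    "Doncaster": (2, 7, 6),
    "Leicestershire": (1, 3, 8),
    "Lincolnshire": (6, 9, 2),
    "N Derbyshire": (8, 10, 4),
    "N Nottinghamshire": (7, 1, 1),
    "Nottingham": (4, 1, 7),
    "Rotherham": (10, 5, 10),
    "S Derbyshire": (3, 3, 4),
    "Sheffield": (4, 8, 2),
}


def rank_spread(ranks_by_district):
    """Return {district: worst rank minus best rank across the cohorts}."""
    return {d: max(r) - min(r) for d, r in ranks_by_district.items()}


spread = rank_spread(CERVICAL_1Y_RANKS)
# List districts from most to least variable rank.
for district, s in sorted(spread.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{district:18s} ranks {CERVICAL_1Y_RANKS[district]}  spread {s}")
```

A spread of several places in a ten-district table, over only three cohorts, is what makes a single-cohort ranking hard to interpret.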
Other information

Selected survival estimates for districts are plotted over time in Figure 4 (1-y relative survival for cervical cancer and 5-y for breast cancer), enabling us to examine time trends. In each graph three lines are emphasised: the observed and fitted trends for the Trent population, the Rotherham trend for cervical cancer, and the Sheffield trend for breast cancer. To show the trends more clearly, the vertical axes are not the same and they are truncated at both ends.

For cervical cancer the average trend with time shows slight improvement in both 1-y (Figure 4a) and 5-y survival (not shown in the figure). However, marked between-district variation can clearly be seen, and some districts show improvement with time, although others do not. It is also evident that between-district variations remain of similar magnitude over time, although the rank order of districts varies with time. These variations were reflected in the earlier analyses of ranks. Comparison with Figure 1a illustrates the danger in choosing one cohort only to use as an indicator: because 1989/90 was chosen, Rotherham appeared worst.

For breast cancer the average trend with time shows marked improvement in both 1-y (not shown in the figure) and 5-y survival (Figure 4b). Between-district variation is visible, but it is smaller than for cervical cancer. In addition, virtually all districts show this trend. It is also evident that between-district variations remain of similar magnitude over time, although the rank order of districts varies with time. Again these variations were reflected in the earlier analyses of ranks. For both 1-y and 5-y survival Sheffield shows improvement with time, although improvements in other districts appear to be greater, with Sheffield survival lying at the lower end of the distribution in recent years.
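
The smoothing behind these trend plots can be illustrated with a centred 3-y rolling average. The sketch below weights each year's survival estimate by its number of cases. That weighting, the function name `rolling_average` and all the numbers are assumptions made for illustration; the text does not specify exactly how the rolling averages were calculated.

```python
def rolling_average(years, survival, cases, window=3):
    """Centred rolling average of survival, weighted by yearly case counts.

    `window` is assumed to be odd. Returns a list of
    (centre_year, smoothed_survival) pairs, one per full window.
    """
    if not (len(years) == len(survival) == len(cases)):
        raise ValueError("years, survival and cases must have equal length")
    half = window // 2
    smoothed = []
    for i in range(half, len(years) - half):
        idx = range(i - half, i + half + 1)
        total_cases = sum(cases[j] for j in idx)
        weighted = sum(survival[j] * cases[j] for j in idx) / total_cases
        smoothed.append((years[i], weighted))
    return smoothed


# Hypothetical yearly 1-y survival (%) and case counts for one small district.
years = [1985, 1986, 1987, 1988, 1989, 1990]
survival = [80.0, 70.0, 90.0, 75.0, 85.0, 72.0]
cases = [30, 20, 20, 25, 25, 25]

for year, value in rolling_average(years, survival, cases):
    print(year, round(value, 1))
```

Each smoothed point pools three years of cases, so year-to-year swings caused by small numbers are damped while a sustained trend remains visible.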
Figure 4 Time trends in relative survival: 3-y rolling averages by district of residence (note differing and truncated vertical axes).

Discussion

This is a very good illustration of the need to be circumspect when interpreting 'league tables' based on a single year or cohort analysis and when the numbers of cases are small. It echoes the questions asked by others[14] and again raises questions about the usefulness of these estimates as indicators of performance, over and above the problem that these comparisons have not been adjusted for case-mix or stage at presentation.

By taking the observed information from the 1989/90 cohort and drawing 1000 random samples based on it, we showed that even for a given cohort there is uncertainty about the observed ranking of districts. By analysing more recent cohorts, we showed that there is also wide variation in rank for a given district between cohorts, even for large districts. The converse should also be remembered: where ranks appear to be stable for a given district across cohorts (for example Leicestershire (first, third) and Lincolnshire (seventh, seventh) for 5-y breast cancer survival), this may give false reassurance. Finally we showed that, despite variation in ranks from cohort to cohort, the overall trend in survival is one of improvement.

On the other hand, the persistent variation between districts, the magnitude of which seems not to change much, needs further investigation. Some of this variation is random, being due to the small numbers of deaths occurring, particularly within the first year, for these cancers of good prognosis. Some of it could, however, be due to differences between districts in known confounders, eg age distributions, stage at presentation[17,18] and/or deprivation.[19,20] The relative survival estimates, both in the DH publication and here, do not take account of these.

Another problem of interpretation is that it is tempting to say that because there is no statistically significant difference in survival between Districts A and B, their survival is for practical purposes the same. Strictly this would mean that the survival was equivalent; however, lack of statistical significance may reflect a small sample size rather than a truly small difference in survival. This is illustrated in Figure 5. In addition, since differences in survival for equivalent treatments should be negligible by definition, studies to establish equivalence need either very much larger sample sizes or a much lower threshold of suspicion (for example P-values of 0.1 rather than 0.05) than is typical in epidemiological studies; moreover, the definition of a 'negligible' difference needs to be specified in advance and explicitly justified.

To illustrate, suppose survival in District A is 35%. To establish whether survival in District B is equivalent ('equivalent' survival being any value with 95% confidence limits lying in the range 30% to 40%), about 1125 patients would be needed from each group (with two-tailed P = 0.05 and power 80%). Conversely, suppose we just want to establish that survival is definitely worse in District B. Then (if it actually is worse) the confidence interval must lie entirely outside the zone of equivalence (to the right in Figure 5), and only about 500 patients would be needed from each district (with one-tailed P = 0.05 and power 80%) to detect a difference of 7.4 percentage points. The comforting assumption that survival is equivalent should not be made without taking these factors into account.

We therefore suggest that analyses of trends over time should be the way to present these data to commissioners and providers. All the effects visible in earlier analyses are visible in the graphs of time trends. They also show that differences of one rank (for example between first and second, or ninth and tenth) do not necessarily mean the same difference in survival. League tables are based on ranks: clearly a large difference in rank may reflect only trivial (ie medically unimportant) differences in actual outcome. In addition, we suggest that commissioners and providers will benefit if performance is assessed in terms of two indices: consistency and mean outcome. If there is acceptably low variability (ie high consistency) and mean outcome is good, then regional performance is satisfactory, whatever any district's ranking is for a particular year. Also, estimates of equivalence are urgently needed.

Figure 5 Possible outcomes when measuring difference in survival between Districts A and B.

References

1 Shaw CD. Health-care league tables in the United Kingdom. J Qual Clin Pract 1997; 17: 215-219.
2 Anon. Too much attention should not be paid to ranking in league tables. Br Med J 1998 Jun 6; 316: B.
3 Parry GJ, Gould CR, McCabe CJ, Tarnow-Mordi WO. Annual league tables of mortality in neonatal intensive care units: longitudinal study. International Neonatal Network and the Scottish Neonatal Consultants and Nurses Collaborative Study Group. Br Med J 1998; 316: 1931-1935.
4 Anon. League tables are inaccurate in ranking hospital mortality outcomes. Br Med J 1998 Jun 27; 316: A.
5 Davies HT, Lampel J. Trust in performance indicators? Qual Health Care 1998; 7: 159-162.
6 Winston R. League tables of in vitro fertilisation clinics misinform patients. Br Med J 1998; 317: 1593-1594.
7 Hayes C, Murray GD. Case mix adjustment in comparative audit. J Eval Clin Pract 1995 Nov; 1: 105-111.
8 Bagust A. League tables. Br J Hosp Med 1996; 55: 369-370.
9 Leyland AH, Boddy FA. League tables and acute myocardial infarction. Lancet 1998; 351: 555-558.
10 Poloniecki J, Valencia O, Littlejohns P. Cumulative risk adjusted mortality chart for detecting changes in death rate: observational study of heart surgery. Br Med J 1998; 316: 1697-1700 (Published erratum appears in Br Med J 1998; 316: 1947).
11 Marshall EC, Spiegelhalter DJ. Reliability of league tables of in vitro fertilisation clinics: retrospective analysis of live birth rates. Br Med J 1998; 316: 1701-1704; discussion 1705.
12 Fletcher J. Excavating health care policy: are league tables likely to dig out the truth about hospital performance or dig it in? J Nurs Manag 1995; 3: 255-262.
13 Nutley S, Smith PC. League tables for performance improvement in health care. J Health Serv Res Policy 1998; 3: 50-57.
14 Cribb A. League tables, institutional success and professional ethics. J Med Ethics 1999; 25: 413-417.
15 Finance and Performance Assessment. Quality and performance in the NHS: high level performance indicators. NHS Executive: Leeds, June 1999.
16 Esteve J, Benhamou E, Raymond L. Statistical methods in cancer research, Vol IV. Descriptive Epidemiology. IARC Scientific Publications No 128. International Agency for Research on Cancer: Lyon, 1994.
17 Quinn MJ, Martinez-Garcia C, Berrino F. Variations in survival from breast cancer in Europe by age and country, 1978-1989. Eur J Cancer 1998; 34: 2204-2211.
18 Gatta G et al. Understanding variations in survival for colorectal cancer in Europe: a EUROCARE high resolution study. Gut 2000; 47: 533-538.
19 Schrijvers CT, Mackenbach JP, Lutz JM, Quinn MJ, Coleman MP. Deprivation and survival from breast cancer. Br J Cancer 1995; 72: 738-743.
20 Coleman MP et al. Cancer survival trends in England and Wales, 1971-1995: deprivation and NHS Region (CDROM). Office for National Statistics: London, 1999.
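
Note on the sample-size illustration in the Discussion. The figures quoted there (about 1125 and about 500 patients per district) can be approximated with the standard normal-approximation formula for comparing two proportions, n per group = (z_alpha + z_beta)^2 [p1(1 - p1) + p2(1 - p2)] / delta^2. The sketch below is a reconstruction under assumptions, not the authors' stated method: the article does not say which formula was used, the function name `n_per_group` is introduced here, and a one-sided z value for the 5% level is assumed in both calculations because that choice gives results close to the quoted figures.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, delta, alpha_one_sided=0.05, power=0.80):
    """Approximate number of patients per district for a margin `delta`.

    p1, p2: assumed true survival proportions in the two districts.
    delta:  equivalence margin, or the difference to be detected.
    Uses the normal approximation to the difference of two proportions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / delta ** 2


# Equivalence: both districts truly at 35%, margin of 5 percentage points (30% to 40%).
n_equivalence = n_per_group(0.35, 0.35, delta=0.05)

# Difference: detect a gap of 7.4 percentage points; as a simplification,
# the 35% variance is used for both districts.
n_difference = n_per_group(0.35, 0.35, delta=0.074)

print(math.ceil(n_equivalence), math.ceil(n_difference))
```

Under these assumptions the equivalence calculation needs roughly 1125 patients per district and the difference calculation a little over 500, which illustrates why demonstrating equivalence is far more demanding than detecting a moderately large difference.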