Computer Methods and Programs in Biomedicine 96 (2009) 63–71
journal homepage: www.intl.elsevierhealth.com/journals/cmpb
SURVSOFT—Software for nonparametric survival analysis

Karla Geiss a,∗, Martin Meyer a, Martin Radespiel-Tröger a,b, Olaf Gefeller b

a Population Based Cancer Registry Bavaria, Registry Office, Östliche Stadtmauerstr. 30, 91054 Erlangen, Germany
b Department of Medical Informatics, Biometry and Epidemiology, University of Erlangen-Nuremberg, Waldstr. 6, 91504 Erlangen, Germany
Article info

Article history: Received 27 August 2008; received in revised form 2 April 2009; accepted 2 April 2009

Keywords: Survival analysis; Relative survival; Period analysis; Software; Cancer registries

Abstract

Long-term observed and relative survival are important outcome measures of cancer patient care reported routinely by many cancer registries, but no commercial statistical software exists for estimating relative survival or performing period survival analysis. The programs publicly available focus only on certain methods, require specific input data formats and often are macros or functions which require underlying software packages. Here we introduce SURVSOFT, a comprehensive, user-friendly Windows program with graphical user interface. It can handle different input data formats and incorporates a variety of nonparametric statistical methods for survival data analysis. SURVSOFT produces high-resolution graphs, which can be printed, saved or exported to be used with standard graphics editors. The use of SURVSOFT is illustrated by the analysis of survival data from the Bavarian Cancer Registry.
1. Introduction
Long-term survival is an important outcome in clinical epidemiology to measure the efficacy of therapeutic interventions for various forms of cancer. It reflects the combination of the natural course of disease and the beneficial impact of patient care. Routinely, observed (absolute) and relative survival rates, usually for 5 or 10 years, are reported by many cancer registries worldwide [1]. By monitoring such survival rates, population based cancer registries can assess changes over time or regional differences that may point to temporal or regional imbalances in the diagnosis and treatment of cancer patients, thus contributing to the evaluation of the effectiveness of cancer care. The repertoire of statistical methods dealing with survival data in general and with specific register-based survival data in particular has been expanded during the last decade. Nowadays, a wide range of methods for estimating survival exists
[2–4]. While methods for analyzing absolute survival have been implemented in all major statistical software packages, no commercial statistical software is available for computing relative survival. To close this gap, cancer registries have developed their own programs, often integrated in their proprietary software system or database application. Some software is publicly available via internet, like SURV3 [5], SAS macros [6,7] and Stata commands [7]. Unfortunately, these programs focus only on certain methods and need specific input data formats, which make them cumbersome to use. Furthermore, using specific macros requires the availability and knowledge of the underlying software packages. The most comprehensive coverage of statistical methods for analyzing cancer registry data is provided by SEER*Stat [8]. Although this powerful software package includes a wide range of tools for analyzing population based cancer registry data, it has some restrictions regarding the general applicability of the methods for survival analysis because it is specifically tailored to the requirements
of the SEER program [9]. For example, in case of period analysis, patients diagnosed in the most recent year for which data are available are not included in the survival estimates and the period of interest cannot include the last year of available data. Analyzing one's own cancer data with SEER*Stat requires the input data to be converted to a complex specific file format, which complicates the practical use of the software.

This manuscript presents SURVSOFT, a user-friendly software application with a graphical user interface running under Microsoft Windows, which can handle different formats for input data and can perform a variety of nonparametric methods for survival data analysis. SURVSOFT provides an interactive environment for importing data, performing analyses and editing, saving and exporting the results in the form of tables and charts. Section 2 briefly describes the statistical methods for survival analysis implemented in SURVSOFT. For a more thorough discussion, the interested reader is referred to the cited publications. Section 3 gives an overview of the functionality and the practical usage of the program. An illustrative example for the application of SURVSOFT to empirical data from the Bavarian Cancer Registry is provided in Section 4.

∗ Corresponding author at: Bevölkerungsbezogenes Krebsregister Bayern, Östliche Stadtmauerstr. 30, 91054 Erlangen, Germany. Tel.: +49 9131 8536063; fax: +49 9131 8536040. E-mail address: [email protected] (K. Geiss).
0169-2607/$ – see front matter © 2009 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.cmpb.2009.04.002
2. Statistical methods
Survival time is defined as the length of the time interval from some starting point of observation, such as the onset or diagnosis of a disease under study, until the occurrence of a certain event of interest, such as death, recurrence of a disease, etc. Let T denote this survival time. The distribution of T, a positive random variable, can be characterized by the survival function S(t), also known as the cumulative survival rate. If the final event is death, S(t) represents the probability of surviving longer than time t: S(t) = P(T > t) = 1 − F(t), where F(t) denotes the cumulative distribution function of T. Typically, T cannot be observed exactly for some part of the population under study as the observation time can be shorter than the survival time. In a study setting only the minimum of the observation time and the survival time can be observed, leading to data comprising a mixture of true (uncensored) survival times and minimum (censored) survival times. Statistical methods to analyze survival data have to cope with these censored data appropriately.
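The relation between the true survival time and what is actually observed can be illustrated with a minimal Python sketch. The function name `observe` and the example values are illustrative assumptions and are not part of SURVSOFT.

```python
def observe(survival_time, observation_time):
    """Return (observed_time, event) for one patient.

    event is True when the event of interest was seen (uncensored) and
    False when follow-up ended first (censored).
    """
    event = survival_time <= observation_time
    return min(survival_time, observation_time), event


# Three hypothetical patients: (true survival time, available follow-up), in years.
patients = [(2.5, 6.0), (8.0, 5.0), (4.0, 4.0)]
observed = [observe(t, c) for t, c in patients]
```

Only the pairs in `observed` are available to the analyst; the second patient contributes a censored time of 5 years although the true survival time is longer.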
2.1. Estimation of observed (absolute) survival
The observed (absolute) survival rate is the basic measure of the survival experience of a group of patients from the date of diagnosis to a certain time, for example 5 or 10 years. SURVSOFT provides several methods for estimating observed survival. The life table (actuarial) method [2,3] is widely used for analysis of grouped survival data and is at present the method of choice for routinely published survival statistics from many cancer registries. It requires a rather large number of observations in order to group individual survival times into fixed-length time intervals $[t_{i-1}, t_i)$. Usually, annual intervals are used, although other intervals are possible. The life table method provides the cumulative proportion of patients surviving until the end of the interval $[t_{i-1}, t_i)$, an estimate of the survival function at time $t_i$:

$$\hat{S}(t_i) = \prod_{j=1}^{i} \hat{p}_j = \prod_{j=1}^{i} \left(1 - \frac{d_j}{n_j}\right),$$

where $\hat{p}_j$ denotes the conditional probability of surviving the $j$th interval, $d_j$ the number of deaths and $n_j$ the size of the population at risk during the $j$th interval. It is assumed that censoring occurs uniformly within the interval. Thus, $n_j$ can be calculated as $l_j - c_j/2$, where $l_j$ denotes the number of people entering the $j$th interval and $c_j$ the number censored during this interval.

With the Kaplan–Meier (product-limit) method [2,3], the proportion of patients still surviving can be calculated at intervals as short as the time unit the survival times are measured in, e.g. day, month, year, etc. In practice, Kaplan–Meier estimates are calculated at all uncensored survival times $t_i$, assuming that the probability of surviving is constant between two successive uncensored survival times. Assuming ordered observation times $t_1 \le t_2 \le \ldots \le t_n$, the Kaplan–Meier estimator for the survival function is given by

$$\hat{S}(t) = \prod_{i:\, t_i \le t} \hat{p}_{t_i} = \prod_{i:\, t_i \le t} \left(1 - \frac{d_{t_i}}{n_{t_i}}\right),$$
where the product runs over the uncensored times $t_i \le t$; $d_{t_i}$ denotes the number of deaths and $n_{t_i}$ the size of the population at risk at time $t_i$.

To provide more up-to-date long-term survival estimates, period analysis has been proposed by Brenner and Gefeller [4,10] and has been intensively applied to cancer registry data in recent years [11–14]. With this method, the latest follow-up information is incorporated into the analysis by considering only the survival experience in some recent time period. This is done by left truncating the observations at the beginning of the period of interest in addition to the right censoring at the end of the period. Period analysis can be performed using either life table or Kaplan–Meier techniques. For the period analysis approach, these standard methods are modified by including only follow-up experience during the period of interest in the analysis. This means that in the formulas for calculating survival estimates given above only patients at risk and events during this period are taken into account. SURVSOFT implements both techniques (life table and Kaplan–Meier) for performing period analysis. In addition, the program offers the possibility of describing trends in n-year survival rates by iteratively performing period analysis. To this end, the period of interest is shifted by one time interval in each iteration step until the entire observation period has been covered.
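The estimators of this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not SURVSOFT's implementation (the program itself is written in Visual C++): the function names `life_table`, `product_limit` and `period_window` are hypothetical, survival times are assumed to be strictly positive, and calendar dates are represented as decimal years. Period analysis is shown only for the product-limit technique, with left truncation handled through the risk set.

```python
def life_table(times, events, width=1.0, n_intervals=5):
    """Actuarial estimate of S(t_i) at the end of each interval."""
    surv, curve = 1.0, []
    for j in range(n_intervals):
        lo, hi = j * width, (j + 1) * width
        in_interval = [e for t, e in zip(times, events) if lo <= t < hi]
        entering = sum(1 for t in times if t >= lo)        # l_j
        deaths = sum(1 for e in in_interval if e)          # d_j
        censored = len(in_interval) - deaths               # c_j
        at_risk = entering - censored / 2.0                # n_j = l_j - c_j/2
        if at_risk > 0:
            surv *= 1.0 - deaths / at_risk
        curve.append(surv)
    return curve


def product_limit(entries, exits, events):
    """Product-limit estimate, optionally with left truncation.

    A patient is at risk at time t when entry < t <= exit. With all
    entries equal to 0 this is the ordinary Kaplan-Meier estimator.
    Returns a list of (t, S(t)) at the uncensored times.
    """
    death_times = sorted({x for x, e in zip(exits, events) if e})
    surv, curve = 1.0, []
    for t in death_times:
        at_risk = sum(1 for a, x in zip(entries, exits) if a < t <= x)
        deaths = sum(1 for a, x, e in zip(entries, exits, events)
                     if e and x == t and a < t)
        if at_risk > 0:
            surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def period_window(diagnosis, exit_date, died, start, end):
    """Restrict one patient's follow-up to the calendar period [start, end].

    Returns (entry, exit, event) on the time-since-diagnosis scale, or
    None if no follow-up time falls inside the period.
    """
    entry = max(0.0, start - diagnosis)         # left truncation
    stop = min(exit_date, end) - diagnosis      # right censoring at period end
    if stop <= entry:
        return None
    return entry, stop, died and exit_date <= end


# Hypothetical patients: (diagnosis, end of follow-up, died), decimal years.
patients = [(1998.0, 2003.0, True), (1999.5, 2004.5, False), (2001.0, 2002.0, True)]
windows = [w for w in (period_window(*p, 2000.0, 2004.0) for p in patients) if w]
period_curve = product_limit(*zip(*windows))
```

Treating standard Kaplan–Meier estimation as the special case with all entry times equal to zero keeps one code path for both the conventional and the period estimate. A trend analysis as described above would call `period_window` repeatedly with a shifted `start` and `end`.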
2.2. Estimation of relative survival
When comparing the survival experience for a specific disease between heterogeneous patient groups, observed survival rates are not an adequate measure since they account for all deaths, regardless of cause. Some patient subgroups may have higher competing risks of death than others, which renders a
simple comparison of absolute survival between such groups meaningless. For instance, older patients are more likely to die of causes unrelated to the disease under study than younger patients. Therefore, the interest lies usually in describing mortality attributable to the disease under study. One method of estimating net survival, where the disease of interest is assumed to be the only possible cause of death, is relative survival [15,16]. It is generally the preferred measure for survival analysis based on data from population based cancer registries, and has the major advantage of not requiring information on cause of death, which is often inaccurate or not available. The cumulative relative survival rate $r(t)$ is defined as the ratio of the observed survival rate in a patient group, $\hat{S}(t)$, and the expected survival rate $S^*(t)$ of a comparable group from the general population matched with respect to the main factors affecting patient survival:

$$r(t) := \frac{\hat{S}(t)}{S^*(t)}.$$
Expected survival is usually estimated from general population life tables stratified by age, sex, calendar time and sometimes by race. There are several methods of estimating expected survival for a patient group in order to estimate relative survival. The two most widely used methods in practice are known as Ederer II [17] and Hakulinen [18], which are both implemented in SURVSOFT. For relative period analysis, the two methods were adapted by left truncation of the observations at the beginning of the period of interest. Expected survival can be estimated from general population life tables for annual intervals only. To be used in combination with the Kaplan–Meier method or with the life table method and time intervals different than one year, expected survival can be approximated by interpolation of the annual values.
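How expected survival is read from a population life table, interpolated and combined with observed survival can be sketched as follows. This is a simplified per-patient illustration and implements neither the Ederer II nor the Hakulinen method. The dictionary layout of `LIFE_TABLE`, its values and all function names are hypothetical. The fallback rules for ages and calendar years missing from the table follow the description in Section 3.4, and linear interpolation is used for non-annual time points.

```python
# Hypothetical life table: annual probability of death keyed by
# (sex, calendar year, age). The values are made up for illustration.
LIFE_TABLE = {
    ("m", 2000, 70): 0.030, ("m", 2000, 71): 0.033,
    ("m", 2001, 70): 0.029, ("m", 2001, 71): 0.032,
}


def annual_death_prob(table, sex, year, age):
    """Look up q(sex, year, age) with the fallback rules of Section 3.4."""
    years = sorted({y for (s, y, a) in table if s == sex})
    # Nearest available calendar year (ties resolve to the earlier year).
    nearest = min(years, key=lambda y: abs(y - year))
    # Ages beyond the last table entry use the value for the maximum age.
    max_age = max(a for (s, y, a) in table if s == sex and y == nearest)
    return table[(sex, nearest, min(age, max_age))]


def expected_survival(table, sex, age_at_dx, year_at_dx, n_years):
    """Cumulative expected survival at the end of each annual interval."""
    surv, curve = 1.0, []
    for k in range(n_years):
        surv *= 1.0 - annual_death_prob(table, sex, year_at_dx + k, age_at_dx + k)
        curve.append(surv)
    return curve


def interpolate(curve, t):
    """Linear interpolation of annual values; the value at t = 0 is 1."""
    points = [1.0] + list(curve)
    k = int(t)
    if k >= len(points) - 1:
        return points[-1]
    return points[k] + (t - k) * (points[k + 1] - points[k])


def relative_survival(observed, expected):
    """r(t) = observed survival divided by expected survival."""
    return observed / expected


expected = expected_survival(LIFE_TABLE, "m", 70, 2000, 3)
half_year = interpolate(expected, 0.5)
```

For a patient group, the individual expected values would still have to be combined according to the chosen method, which this sketch deliberately leaves out.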
2.3. Standard error and confidence intervals for estimated survival rates

Along with the cumulative (observed or relative) survival rates, SURVSOFT also provides estimates of the standard errors and pointwise confidence intervals. Greenwood’s method [19] is used to estimate the standard error $\mathrm{SE}[\hat{S}(t)]$ of observed survival rates obtained with any of the methods described in Section 2.1. The variance of the expected survival rate is very small compared to the variance of the observed survival rate. Under the simplifying assumption that $S^*(t)$ can be treated as a known constant, the variance of the relative survival rate can be approximated by $\mathrm{Var}[r(t)] = \mathrm{Var}[\hat{S}(t)/S^*(t)] \cong \mathrm{Var}[\hat{S}(t)]/[S^*(t)]^2$. The approximate standard error of the relative survival rate is then given by $\mathrm{SE}[r(t)] \cong \mathrm{SE}[\hat{S}(t)]/S^*(t)$. By assuming normality for the survival rate estimates, pointwise 95% confidence intervals for the survival rates can be obtained as $[u(t) - 1.96 \cdot \mathrm{SE}[u(t)],\; u(t) + 1.96 \cdot \mathrm{SE}[u(t)]]$, where $u(t)$ denotes the observed or relative survival rate at time $t$.
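The calculations of this section can be sketched in Python as follows. Greenwood's formula is not written out in the text; the sketch assumes its standard form, $\mathrm{Var}[\hat{S}(t)] = \hat{S}(t)^2 \sum_j d_j / (n_j (n_j - d_j))$. The function names and the input values are illustrative assumptions.

```python
import math


def greenwood(deaths, at_risk):
    """Return [(S, SE)] after each step, given d_j and n_j per step."""
    surv, acc, out = 1.0, 0.0, []
    for d, n in zip(deaths, at_risk):
        surv *= 1.0 - d / n
        if n > d:  # skip the undefined term when everyone at risk dies
            acc += d / (n * (n - d))
        out.append((surv, surv * math.sqrt(acc)))
    return out


def relative_with_ci(s_obs, se_obs, s_exp, z=1.96):
    """Relative survival, its approximate SE and a pointwise interval.

    s_exp is treated as a known constant, as assumed in the text.
    Pass z=2.576 for a 99% interval.
    """
    r = s_obs / s_exp
    se = se_obs / s_exp
    return r, se, (r - z * se, r + z * se)


# Hypothetical numbers of deaths and patients at risk in three intervals.
steps = greenwood(deaths=[10, 8, 5], at_risk=[100, 85, 70])
s_obs, se_obs = steps[-1]
r, se_r, ci = relative_with_ci(s_obs, se_obs, s_exp=0.90)
```

Because the interval relies on a normal approximation, its limits can fall outside [0, 1] when the estimate is close to 0 or 1.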
3. The computer program
SURVSOFT is a stand-alone graphical software application running under Microsoft Windows operating systems (XP, Vista). It was implemented as an MDI (Multiple Document Interface) application using Visual C++ 6.0 and MFC (Microsoft Foundation Class Library). 512 MB RAM are recommended. The program requires 2 MB hard-disc space and an 800 × 600 or higher-resolution display with at least 256 colors. The block diagram in Fig. 1 shows the schematic overview of a sample program run illustrating the user options. After importing the survival data from an external data file, the user can choose between several methods for survival analysis: life table, Kaplan–Meier, period analysis or trend in n-year survival. Depending on the selected method, different parameters have to be specified in order to define the survival analysis method and to control the computation (e.g., time interval, confidence interval, period of interest). In case of relative survival, the method for estimating expected survival has to be selected, as well as the general population life table which should be used for the computation. The results of the performed survival analysis are displayed in the form of a table and a chart. SURVSOFT allows the user to edit the chart and save or export the results. Several data sets can be imported and multiple analyses can be performed on one data set. The following subsections demonstrate, on the basis of some examples, the convenient use of SURVSOFT to perform survival analysis.
3.1. Import of survival data
The program imports survival data from external data files, for example, exported from a database system or a statistical software package. Several file types can be imported: text files (fixed or free column format), Excel sheets and database files (dBase, MS Access). At least three variables are required to perform survival analysis: start of observation, end of observation and life status at the end of follow-up. An additional grouping variable is optional. Relative survival analysis additionally requires information on sex and date of birth. The import of survival data into SURVSOFT is flexible, i.e., the data are not required to be organized in a specific format regarding position of data columns, date format, coding of life status or sex, etc. Users can import their own data files by selecting certain parameters depending on the file type. For example, in case of a database file, the user simply selects the columns in the database that correspond to the aforementioned list of variables (Fig. 2). The actual column position within the database file is not relevant because the columns are identified by name.
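The idea of a format-independent import, in which the user maps file columns to the required variables and states the date format and the status coding, can be sketched in Python. The function `load_records`, the column names and the sample data are hypothetical and only illustrate the principle; they do not reproduce SURVSOFT's import dialog.

```python
import csv
import io
from datetime import datetime


def load_records(handle, columns, date_format, event_code):
    """Read survival records from a delimited text file.

    columns maps the required variables ("start", "end", "status") to
    the column names used in the file; column order does not matter.
    """
    records = []
    for row in csv.DictReader(handle):
        start = datetime.strptime(row[columns["start"]], date_format).date()
        end = datetime.strptime(row[columns["end"]], date_format).date()
        records.append({
            "days": (end - start).days,
            "event": row[columns["status"]] == event_code,
        })
    return records


# Made-up sample file with arbitrary column order and German date format.
sample = io.StringIO(
    "vital,last_seen,diagnosed\n"
    "dead,15.03.2003,01.02.2000\n"
    "alive,31.12.2004,10.06.2001\n"
)
records = load_records(
    sample,
    columns={"start": "diagnosed", "end": "last_seen", "status": "vital"},
    date_format="%d.%m.%Y",
    event_code="dead",
)
```

Identifying columns by name rather than by position means the same call works for files whose columns are arranged differently.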
3.2. Performing survival analysis
Once the survival data has been imported, the user can select (by using the appropriate menu command) one of the survival analysis methods implemented in SURVSOFT: life-table method, Kaplan–Meier method, period analysis (based on both
life-table and Kaplan–Meier method) or trend in n-year survival (Fig. 3). After selecting a survival analysis method, a corresponding configuration dialog will appear to set the parameters needed to perform the analysis. As an example, Fig. 4 shows the dialog window for period analysis.

Fig. 1 – Schematic overview of a sample program run consisting of data import, method selection and configuration (with or without relative survival), and output of results.
Fig. 2 – Selection of data columns from the file to be imported.

In the upper part of the dialog window, a plot of the observation time of a sample of the imported cases is displayed. The user can adjust the period of interest by using the corresponding spin buttons. Simultaneously, the currently selected period of interest is highlighted in the observation time plot. In addition, the length of the time intervals used for the estimation of survival rates can be set, if the life table technique is used. The user can choose between year, month, week and day. The list box ‘Coding of vital status’ contains all codes for vital status found in the data records. The code of the event of interest has to be specified by selection of the appropriate list box entry. Regarding the confidence interval, the user can choose between the 95% and 99% levels. If the option ‘Stratified analysis’ is checked, an individual survival curve is created for each group of patients. If this option is unchecked, only one survival curve will be plotted containing the information of all cases (i.e., the grouping variable will be ignored). With SURVSOFT, period analysis can be performed both on the basis of life table and product-limit techniques by simply selecting the appropriate entry in the ‘Technique’ list box. Whenever the estimation of relative survival is the aim of the analysis, some further configuration is needed. Similar to the vital status coding, the appropriate code for the sex has to be selected from a list box. In addition to this, the general population life table and the method for the estimation of expected survival have to be selected (Ederer II or Hakulinen). The handling of general population life tables by SURVSOFT is described in Section 3.4. After accepting the parameter settings, the program displays the results in table and chart form.
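The parameters collected by the period analysis dialog can be summarized as a small configuration record. The following Python sketch is a hypothetical representation for illustration only: the class name, field names and default values are assumptions, and the z-values are the usual two-sided normal quantiles for the two confidence levels offered.

```python
from dataclasses import dataclass

INTERVAL_UNITS = ("year", "month", "week", "day")
TECHNIQUES = ("life table", "Kaplan-Meier")
Z_VALUES = {0.95: 1.96, 0.99: 2.576}  # two-sided normal quantiles


@dataclass
class PeriodAnalysisConfig:
    """Hypothetical container for the settings described above."""
    period_start: int
    period_end: int
    technique: str = "life table"
    interval_unit: str = "year"      # only relevant for the life table technique
    event_code: str = "dead"         # code of the event of interest
    confidence_level: float = 0.95
    stratified: bool = False         # one curve per group if True

    def __post_init__(self):
        if self.period_end < self.period_start:
            raise ValueError("period_end must not precede period_start")
        if self.technique not in TECHNIQUES:
            raise ValueError(f"unknown technique: {self.technique}")
        if self.interval_unit not in INTERVAL_UNITS:
            raise ValueError(f"unknown interval unit: {self.interval_unit}")
        if self.confidence_level not in Z_VALUES:
            raise ValueError("confidence level must be 0.95 or 0.99")

    @property
    def z(self):
        """Normal quantile matching the chosen confidence level."""
        return Z_VALUES[self.confidence_level]


config = PeriodAnalysisConfig(period_start=2000, period_end=2004, stratified=True)
```

Validating the settings when the record is created mirrors the dialog, which only offers valid choices through list boxes and spin buttons.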
3.3. Output of results
The results of the performed survival analysis are displayed in two windows (Fig. 5). A text output window lists the survival results in tabular form. The contents can be printed or saved like any standard Windows text file. In the chart output window, survival curves or plots of n-year survival rates over time are displayed, depending on the selected analysis method. Special editing functions are available for modifying the chart design: all text elements, fonts, line styles and marker styles can be edited. Charts can be printed, saved in a SURVSOFT-specific chart format and exported as Windows Metafile (WMF), which facilitates the import into standard office applications and common drawing programs.
3.4. Population life tables and expected survival
As described in Section 2.2, in order to estimate the expected survival for a group of patients, general population mortality data are required as published in nationwide population life tables. In SURVSOFT, general population life tables are stored in an external database, which can contain different life tables (for example for different regions or countries). Each life table contains annual probabilities of death in the general population for single-year age groups stratified by calendar year and sex. Fig. 6 shows an excerpt from the general population life table for Germany, where the general population mortality rates are published annually as averages over 3-year calendar periods. The general population life tables do not have to be imported manually: the program automatically reads the names of the population life tables stored in the database and the user only needs to select the appropriate table from the
list box within the parameter selection dialog for the survival analysis (Fig. 4). The estimation of the expected survival of a group of patients involves, for each individual patient, the calculation of the expected survival probability for a person in the general population similar to the patient with respect to age, sex and calendar year. If the age of the patient exceeds the maximum age for which an entry is found in the general population life table, the probability of death for this maximum age is used (for example 100 years in the life table in Fig. 6). If a certain calendar year cannot be found in the general population life table, the table for the nearest available year is used. General population life tables usually contain only annual probabilities of death. Correspondingly, expected survival of a patient group can be estimated from the population life tables only for annual intervals. If values for smaller intervals are needed, they are determined by linear interpolation of the annual values.

Fig. 3 – Selection of the method for survival analysis.

Fig. 4 – Dialog window for parameter selection in case of period analysis.

Fig. 5 – Output windows containing the results of the analysis.

Fig. 6 – General population life table for Germany.

Table 1 – Age distribution and age-specific 5-year period relative survival rates in Bavaria for colorectal cancer (2000–2004 period).

Age group    Age distribution (%)    Age-specific 5-year relative survival (%)
<45           3.2                    64.7
45–54         8.9                    65.0
55–64        24.2                    62.9
65–74        32.0                    60.8
75+          31.8                    56.5

Table 2 – Stage distribution of cases with known stage and stage-specific 5-year period relative survival rates in Bavaria for colorectal cancer (2000–2004 period).

UICC stage   Stage distribution (%)  Stage-specific 5-year relative survival (%)
I            20.2                    93.3
II           27.0                    82.2
III          27.1                    67.1
IV           25.8                    13.9

4. Example
As an illustrative example for the use of SURVSOFT, this section presents the results of the application of the program to empirical survival data from the Bavarian Cancer Registry for colon and rectum cancer (ICD-10 C18-C21), one of the most common forms of cancer in males and females. Patients diagnosed from 1998 to 2004 and aged 15 or older were included in this analysis, unless they were notified by death certificate only (dataset as of April 2008). Relative survival estimates were calculated using period analysis for the 2000–2004 period. The
expected survival estimates were obtained according to the Hakulinen method using calendar-year and sex-specific life tables for Germany published by the Federal Statistical Office [20]. The data set included 29,661 colorectal cancer patients, of whom 16,760 (56.5%) were male. The overall 5-year relative survival rate was 60.4% (60.0% for males and 60.9% for females). Table 1 shows the age distribution of the patients and the age-specific 5-year relative survival estimates. The distribution by UICC stage and the stage-specific 5-year relative survival estimates are presented in Table 2 for the patients with known staging. Figs. 7 and 8 show age-specific and stage-specific survival curves created with SURVSOFT.

Fig. 7 – Age-specific relative survival of Bavarian patients with colorectal cancer.

Fig. 8 – Stage-specific relative survival of Bavarian patients with colorectal cancer.
5. Discussion
The lack of appropriate software has been identified as a major obstacle for a wider utilization of modern statistical methods in practical applications of various disciplines [21]. This general observation applies to survival analysis of register-based cancer data as well. Survival analysis software implementing parametric techniques has recently been published in this journal [22,23]. Here, we introduced SURVSOFT, a comprehensive, user-friendly Windows program, which incorporates flexible data import, a variety of nonparametric methods for survival analysis and convenient graphical features for the output of results, in order to facilitate the widespread application of these methods in practice. Some of the implemented methods of survival analysis require more complex algorithms and are not included in standard software packages: relative survival based on the Kaplan–Meier method as well as observed and relative period analysis using product-limit techniques. Another convenient feature in SURVSOFT is the option to choose between different interval lengths (year, month, week, day) for the calculation of survival estimates when using life table methods. Other programs provide survival estimates for yearly intervals only.
For the survival analysis methods for which it was possible, SURVSOFT was successfully validated against existing software. Additionally, the program was tested in a comparative study conducted by members of GEKID, the Association of Population Based Cancer Registries in Germany. Each of the eight participating cancer registries calculated, using several survival analysis methods, 3-, 5- and 7-year survival rates for the same dataset with the (proprietary or publicly available) software used in their routine statistical evaluations. The results obtained by the Bavarian Cancer Registry using SURVSOFT were in good concordance with the results of the other participants. In particular, the results obtained with SURVSOFT agreed to five decimal places with the ones computed with SURV3 for all methods offered by both programs. In this manuscript, the use of SURVSOFT was illustrated by analyzing survival for population based cancer data. However, it can be used to analyze the survival experience of any group of patients since the implemented methods apply equally to hospital data [1]. SURVSOFT will be available as freeware on the website of the Bavarian Cancer Registry (http://www.krebsregister-bayern.de/software e.html). The development of SURVSOFT is an ongoing process which is only partially completed with the current version. Observed and relative survival rates typically vary by age for many forms of cancer. Since the age distribution often differs between populations or within one population over time, age-adjustment should be performed when comparing cancer patient survival of different populations or in analyses of time trends. Future work will include the integration of such methods of age-adjustment of survival rates [24,25] as well as statistical tests for comparison of survival rates [2,26,27]. Techniques for nonparametric estimation of hazard functions will also be incorporated in a future version to supplement the repertoire of survival statistics.
Conflict of interest statement

None declared.
References

[1] D.M. Parkin, T. Hakulinen, Analysis of survival, in: O.M. Jensen, D.M. Parkin, R. Maclennan, C.S. Muir, R.G. Skeet (Eds.), Cancer Registration: Principle and Methods, International Agency for Research on Cancer, Lyon, 1991, pp. 159–176 (IARC Scientific Publications No. 95).
[2] E.T. Lee, J.W. Wang, Statistical Methods for Survival Data Analysis, 3rd ed., Wiley-Interscience, Hoboken, NJ, 2003.
[3] J.P. Klein, M.L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data, Springer, New York, 1997.
[4] H. Brenner, O. Gefeller, An alternative approach to monitoring cancer patient survival, Cancer 78 (1996) 2004–2010.
[5] SURV3: Windows Software for Relative Survival Analysis, Finnish Cancer Registry, Helsinki, Finland, available at: http://www.cancerregistry.fi/surv3.
[6] H. Brenner, O. Gefeller, SAS period and periodh: Period Analysis of Survival Data, available at: http://www.imbe.med.uni-erlangen.de/issan/SAS/period/period.htm.
[7] P. Dickman, Estimating and modelling relative survival in SAS and Stata, available at: http://www.pauldickman.com/rsmodel.
[8] SEER*Stat statistical software, Surveillance Research Program, NCI, Bethesda, MD, USA, available at: http://seer.cancer.gov/seerstat.
[9] K. Cronin, A. Mariotto, S. Scoppa, D. Green, L. Clegg, Differences between Brenner et al. and NCI Methods for Calculating Period Survival, Statistical Research and Applications Branch, NCI, Technical Report #2003-02-A.
[10] H. Brenner, O. Gefeller, Deriving more up-to-date estimates of long-term patient survival, Journal of Clinical Epidemiology 50 (1997) 211–216.
[11] A. Gondos, V. Arndt, B. Holleczek, C. Stegmaier, H. Ziegler, H. Brenner, Cancer survival in Germany and the United States at the beginning of the 21st century: An up-to-date comparison by period analysis, International Journal of Cancer 121 (2007) 395–400.
[12] S. Houterman, M.L.G. Janssen-Heijnen, L.V. van de Poll-Franse, H. Brenner, J.W.W. Coebergh, Higher long-term cancer survival rates in southeastern Netherlands using up-to-date period analysis, Annals of Oncology 17 (2006) 709–712.
[13] A. Verdecchia, S. Francisci, H. Brenner, G. Gatta, A. Micheli, L. Mangone, I. Kunkler, The EUROCARE-4 Working Group, Recent cancer survival in Europe: a 2000–02 period analysis of EUROCARE-4 data, The Lancet Oncology 8 (2007) 784–796.
[14] E. Steliarova-Foucher, V. Arndt, D.M. Parkin, F. Berrino, H. Brenner, Timely disclosure of progress in childhood cancer survival by ‘period’ analysis in the Automated Childhood Cancer Information System, Annals of Oncology 18 (2007) 1554–1560.
[15] F. Ederer, L.M. Axtell, S.J. Cutler, The relative survival rate: a statistical methodology, National Cancer Institute Monographs 6 (1961) 101–121.
[16] D.E. Henson, L.A. Ries, The relative survival rate, Cancer 76 (1995) 1687–1688.
[17] F. Ederer, H. Heise, The effect of eliminating deaths from cancer on general population survival rates, in: Methodological Note 11, End Results Evaluation Section, National Cancer Institute, 1959.
[18] T. Hakulinen, Cancer survival corrected for heterogeneity in patient withdrawal, Biometrics 38 (1982) 933–942.
[19] M. Greenwood, The errors of sampling of the survivorship table, Reports on Public Health and Medical Subjects, vol. 33, Her Majesty’s Stationery Office, London, 1926.
[20] Federal Statistical Office of Germany, http://www.destatis.de.
[21] X. Fan, Using commonly available software for bootstrapping in both substantive and measurement analyses, Educational and Psychological Measurement 63 (2003) 24–50.
[22] B. Yu, R.C. Tiwari, K.A. Cronin, C. McDonald, E.J. Feuer, CANSURV: a Windows program for population-based cancer survival analysis, Computer Methods and Programs in Biomedicine 80 (2005) 195–203.
[23] M. Pohar, J. Stare, Relative survival analysis in R, Computer Methods and Programs in Biomedicine 81 (2006) 272–278.
[24] A. Verdecchia, R. Capocaccia, M. Santaquilani, T. Hakulinen, Methods of survival data analysis and presentation issues, in: F. Berrino, R. Capocaccia, J. Estève, G. Gatta, T. Hakulinen, A. Micheli, M. Sant, A. Verdecchia (Eds.), Survival of Cancer Patients in Europe: the EUROCARE-2 Study, International Agency for Research on Cancer, Lyon, 1999, pp. 41–45 (IARC Scientific Publications No. 151).
[25] I. Corazziari, M. Quinn, R. Capocaccia, Standard cancer patient population for age standardising survival ratios, European Journal of Cancer 40 (2004) 2307–2316.
[26] R. Peto, M.C. Pike, P. Armitage, N.E. Breslow, D.R. Cox, S.V. Howard, N. Mantel, K. McPherson, J. Peto, P.G. Smith, Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples, British Journal of Cancer 35 (1977) 1–39.
[27] T. Hakulinen, L. Tenkanen, K. Abeywickrama, L. Päivärinta, Testing equality of relative survival patterns based on aggregated data, Biometrics 43 (1987) 313–325.