Changing Definitions of Biochemical Failure After External Beam Radiotherapy for Localized Prostate Cancer: Effect on Outcome Analyses

S198 I. J. Radiation Oncology ● Biology ● Physics Volume 63, Number 2, Supplement, 2005

analyses, no factors were selected by recursive partitioning analysis, and the only factor selected by the Cox model was biopsy result histology 6–9 vs. others. The results using survival measured from end of RT were the same as for survival measured from biopsy. Conclusions: Our analysis suggested that 2-year post-RT prostate biopsy was useful in forecasting eventual CN+2 biochemical failure, but not ASTRO failures. Biopsy appears to be predictive of overall survival. These results may be useful in identifying patients for early salvage clinical trials.

1089

How Reliable Are Electronic Biochemical Failure Algorithms? An Examination of the Variability of Results From Four Centers

S. Williams,1 L. Kestin,4 L. Potters,3 P. Fearn,3 T. Pickles2

1Peter MacCallum Cancer Centre, Melbourne, VIC, Australia, 2British Columbia Cancer Agency, Vancouver, BC, Canada, 3New York Prostate Institute, New York, NY, 4William Beaumont Hospital, Detroit, MI

Purpose/Objective: To evaluate the results of four biochemical failure (bF) algorithms using a standard dataset following application of three bF definitions. Materials/Methods: A collaborative dataset of 1200 men was developed using data from prostate cancer patients of three institutions treated with varying combinations of external beam radiation, brachytherapy and androgen deprivation therapy from 1994–1998. Pretreatment prognostic variables, post-therapy PSA results and hormone therapy details were collated. The data were then analysed for bF using the electronic bF calculators of four institutions. Three bF definitions were tested: the ASTRO consensus definition (ACD), current nadir + 2 ng/mL (N+2), and a threshold of 3 ng/mL (T3). Results were graded according to the concordance of bF identification across centres, the variation in the derived time of bF, and actuarial results. Results: Unanimous agreement regarding failure status using the ACD, N+2 and T3 definitions occurred in 87.3, 96.4 and 92.7% of cases respectively. Using the ACD, 63% of the variation in bF identification was due to a single institution allowing failure status to be reversed if a PSA fall was seen after bF (a PSA bounce). Excluding that institution improved agreement to 95.3%. Of those with ACD bF identified, the interobserver variation in calculated time to failure for 250 (53.5%) was greater than 2 months (median: 11; range: 2.1–105 months), with 195 of these being due to variations in the nadir date used in the failure calculation. Censoring times varied by more than 2 months in 3.2%. Actuarial curves showed a 5-year freedom from bF (FFbF) range of 49.8–60.9%, associated with considerable variation in the curve shapes. The N+2 definition had a 20.5% rate of calculated failure times which varied by more than 2 months between the four algorithms (median: 6.4; range: 2.1–75.6 months). 75% of these were due to one centre requiring a minimum duration of follow-up of 12 months, while 96% of the 306 variations in the timing of censoring were due to one centre using the last non-rising PSA date rather than the last test date. Disagreement about the failure status was typically due to either differing handling of fluctuating PSA levels, or the way in which data were handled inside 12 months. The 5-year FFbF ranged from 55.9–61.0%. The T3 definition was interpreted as requiring a rising PSA at or above 3 ng/mL in two definitions, as opposed to a nadir or a rising PSA at or above 3 ng/mL in the other two, creating failure time differences in 29 cases. Allowing bF to reverse after a PSA bounce created 34 timing and 22 failure variations. One definition using any PSA at or above 3 ng/mL beyond 12 months had 78 failure or timing variations. Differing implementation of the hormone therapy date created 39 major errors. There was a 2.0% range in the 5-year FFbF across the 4 calculators. Conclusions: Considerable interobserver differences in bF outcomes were seen when four electronic bF algorithms were compared using a standard dataset. The ACD showed poor agreement for both the identification of failure and its timing, resulting in an 11% range of FFbF rate at 5 years. The interpretation of the nadir date and the way in which bounces were handled were the most influential variables. The N+2 and T3 definitions showed less interobserver variability, and major timing variations had negligible influence on the FFbF. Consideration should be given to developing a validation system to assure the quality of electronic bF calculator outcomes.
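The sensitivity to implementation details described above can be illustrated with a minimal Python sketch. It is not any centre's calculator: the function names, the PSA series, and the exact reading of each rule are assumptions made for illustration. `nadir_plus_2` flags the first PSA at or above the running nadir plus 2 ng/mL, `threshold_rising` requires a rising PSA at or above 3 ng/mL, and `threshold_any` accepts any PSA at or above 3 ng/mL beyond 12 months.

```python
from typing import List, Optional, Tuple

# One PSA reading: (months since treatment, PSA in ng/mL), sorted by time.
Reading = Tuple[float, float]


def nadir_plus_2(readings: List[Reading]) -> Optional[float]:
    """Time of the first PSA >= (lowest PSA seen so far + 2 ng/mL), else None."""
    nadir = float("inf")
    for t, psa in readings:
        if psa >= nadir + 2.0:
            return t
        nadir = min(nadir, psa)
    return None


def threshold_rising(readings: List[Reading], threshold: float = 3.0) -> Optional[float]:
    """Assumed reading of T3: first PSA >= threshold that is higher than the previous PSA."""
    for (_, prev), (t, psa) in zip(readings, readings[1:]):
        if psa >= threshold and psa > prev:
            return t
    return None


def threshold_any(readings: List[Reading], threshold: float = 3.0,
                  min_months: float = 12.0) -> Optional[float]:
    """Assumed alternative reading of T3: any PSA >= threshold after min_months."""
    for t, psa in readings:
        if t > min_months and psa >= threshold:
            return t
    return None


# Hypothetical patient whose PSA is still slowly declining at 13 months.
slow_decline = [(3, 6.0), (6, 4.5), (13, 3.4), (18, 2.0), (24, 1.0)]
print(nadir_plus_2(slow_decline))      # None
print(threshold_rising(slow_decline))  # None
print(threshold_any(slow_decline))     # 13
```

For this hypothetical patient the "any PSA" reading records a failure at 13 months while the other two rules record none, which is the kind of disagreement in failure status the abstract attributes to differing interpretations of a single definition.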

1090

Changing Definitions of Biochemical Failure After External Beam Radiotherapy for Localized Prostate Cancer: Effect on Outcome Analyses

C.A. Reddy,1 A.M. Reuther,1 A. Mahadevan,1 P.A. Kupelian2

1Radiation Oncology, Cleveland Clinic Foundation, Cleveland, OH, 2Radiation Oncology, MD Anderson Cancer Center Orlando, Orlando, FL

Proceedings of the 47th Annual ASTRO Meeting

Purpose/Objective: To determine if the prognostic factors for biochemical failure (bF) differ between two biochemical failure definitions. Materials/Methods: The study sample consisted of 1134 patients with clinical stage T1 or T2 node-negative adenocarcinoma of the prostate treated with external beam radiotherapy (RT) between 1987 and 2001. All patients had a pretreatment PSA value (iPSA); biopsy Gleason score (bGS); at least 24 months follow-up; at least 3 follow-up PSA levels; a radiation dose ≥68 Gy; and no androgen deprivation (AD) use for >6 months. The median RT dose was 78 Gy, and the median follow-up was 69 months (range: 24–208 months). Two different definitions of bF were used: 1) the ASTRO consensus definition (DefA) and 2) the nadir+2 ng/mL definition (DefN). Survival analysis was performed using the Kaplan-Meier method, and the log rank test was used to determine significance. Cox proportional hazards regression was used for the multivariate analyses with the following parameters: age, race (white vs black), iPSA, bGS, clinical T-stage, radiation dose (<72 Gy vs ≥72 to <78 Gy vs ≥78 Gy), radiation technique (conventional vs conformal/IMRT), and use of AD (yes vs no). Results: For all patients, the biochemical relapse-free survival (bRFS) rate at 5 years was 73% (95% C.I. 70–76) for DefA, and 82% (95% C.I. 79–85) for DefN (Fig. 1). The bRFS rate at 10 years was 68% (95% C.I. 65–72) for DefA, and 60% (95% C.I. 54–66) for DefN (p<0.001). The multivariate analysis using DefA revealed iPSA, bGS, T stage, RT technique, and radiation dose to be significant predictors of bF. Repeating the multivariate analysis using DefN revealed iPSA, bGS, T stage, and radiation dose to still be independent predictors of bF. However, RT technique was no longer significant. With either definition, on both univariate and multivariate analysis, radiation dose remained an independent predictor of biochemical failure. Conclusions: With changing definitions of bF, the results of failure analyses will change. In this large study sample with long median follow-up, the ASTRO definition resulted in worse bRFS rates earlier in the follow-up period (within the first 6–7 years), while the nadir+2 definition resulted in worse outcomes later in the follow-up period (beyond 7–8 years). Discrepancies in the results of the multivariate analyses were seen, such as radiation technique being significant in predicting bF with one definition (ASTRO), but not the other. Radiation dose, iPSA, bGS, and T stage remained important predictors of bF with either definition. On the basis of this analysis, no significant changes in the use of prognostic factors or treatment factors would be recommended.
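The bRFS rates quoted above are Kaplan-Meier estimates. As a rough illustration of how such a curve is computed, the following Python sketch implements the product-limit estimator. The function names and the five-patient dataset are hypothetical, and confidence intervals and the log rank test are omitted.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate: [(failure_time, survival_probability), ...].

    times  -- follow-up in months (time of failure or of last contact)
    events -- True if biochemical failure was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all patients sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                failures += 1
            removed += 1
            i += 1
        if failures:
            surv *= 1.0 - failures / at_risk
            curve.append((t, surv))
        # Failed and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the step function at time t (1.0 before the first failure)."""
    s = 1.0
    for failure_time, value in curve:
        if failure_time > t:
            break
        s = value
    return s


# Hypothetical follow-up data for five patients.
curve = kaplan_meier([12, 24, 24, 36, 60], [True, False, True, True, False])
print(curve)
print(survival_at(curve, 30))
```

Because each failure definition assigns different failure and censoring times to the same patients, running the same estimator on DefA and DefN inputs yields different curves, which is how the two definitions can cross over during follow-up.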

Fig. 1

1091

First Year PSA Kinetics and Nadirs After Prostate Cancer Radiotherapy Are Predictive of Overall Survival

R. Cheung, S. Tucker, D. Kuban

UT MD Anderson Cancer Center, Houston, TX

Purpose/Objective: We analyzed whether first year PSA kinetics and nadirs are useful as early response indicators after prostate cancer radiotherapy (RT) in predicting overall survival (OS). Materials/Methods: The dataset included 1174 patients treated with external beam RT from 1987 to 2001. There were at least two PSA measurements during the first year after RT. The median patient age was 69 (range 37 to 84). The median 1992 AJCC stage was T2a (range T1b to T4). The median Gleason score was 6 (range 2 to 10). The median pretreatment PSA was 7.8 (range 0.3 to 150). 298 patients had low-risk disease (PSA ≤ 10 and Gleason score ≤ 6 and AJCC stage T2a or lower disease), 472 had intermediate-risk disease, and 404 patients had high-risk disease (PSA > 20 or Gleason score ≥ 8 or AJCC stage T3 or higher). Patients received a median dose of 70 Gy (range 60 to 78.2 Gy). Since we analyzed overall survival in this elderly patient population, we also included patient age in the analyses. For each patient, the relative rate of change (λ) in post-treatment PSA values during the first year (13.5 months) after RT was computed using regression analysis of ln(PSA) versus time. We also computed the PSA nadir (mPSA) reached during the first 13.5 months after RT. Univariate recursive partitioning analysis (RPA) was used to identify one or more relevant cutpoints for each of the continuous factors being investigated for its association with survival: age, pretreatment PSA, radiation dose, relative rate of change in PSA post-RT, and 1-year nadir. For each of the other factors (stage, Gleason score, and risk group), all possible cutpoints were considered in the multivariate analyses (e.g. stage < T2a vs ≥ T2a, stage < T2b vs ≥ T2b, etc.). Cox proportional hazards analysis was used to identify independent predictors for overall survival. These predictors were then used to construct predictive models for OS. Results: The median value of λ was −1.0 yr⁻¹ (range −11.0 to 5.1 yr⁻¹). The 1-year nadir had a median of 0.8 ng/ml (range 0.01 to 30.9 ng/ml). The 5-year post-RT OS was 92% (95% CI 91% to 94%) and the 10-year OS was 69% (95% CI 66% to 73%). Univariate RPA identified the cutpoints for each of the continuous factors listed in Table 1. Cox proportional hazards analyses using both forward and backward stepwise selection of factors identified the same list of 8 factors adversely related to survival, with hazard ratios (95% C.I.) and P-values: 1. Age ≥ 71; 1.61 (1.24, 2.09); <0.001. 2. Age ≥ 75; 1.71 (1.26, 2.34); 0.001. 3. Gleason score ≥ 6; 1.93 (1.48, 2.51); <0.001. 4. Gleason score ≥ 8; 1.40 (1.02, 1.91); 0.036. 5. Stage ≥ T2b; 1.29 (1.03, 1.61); 0.024. 6. Dose < 66 Gy; 1.37 (1.07, 1.77); 0.013. 7. λ > 0 yr⁻¹; 1.68 (1.22, 2.32); 0.002. 8. mPSA ≥ 4 ng/ml; 2.50 (1.70, 3.67); <0.001. The number of adverse factors was predictive of overall survival. Patients with λ > 0 and/or mPSA ≥ 4 ng/ml during the first year post-RT had significantly worse survival for each of the groupings with ≤ 2, 3, or ≥ 4 adverse factors vs. otherwise. Conclusions: First year post-RT PSA kinetics and nadirs, but not the pretreatment PSA, are predictive of overall survival for prostate cancer patients treated with RT. 1-year PSA kinetics and nadir may provide early RT response indicators to further segregate patients with other poor pretreatment prognostic factors into groups with better vs. worse overall survival. This information may be useful in selecting patients for adjuvant therapy after RT.
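The relative rate of change λ is described as the slope from a regression of ln(PSA) on time. The Python sketch below shows one way this could be computed, assuming an ordinary least-squares fit (the abstract does not name the regression method). The function names and the PSA series are hypothetical.

```python
import math
from typing import List, Tuple


def psa_rate(times_years: List[float], psa_values: List[float]) -> float:
    """Least-squares slope of ln(PSA) versus time, in units of 1/year."""
    n = len(times_years)
    if n < 2:
        raise ValueError("at least two PSA measurements are required")
    logs = [math.log(p) for p in psa_values]
    mean_t = sum(times_years) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("measurements must be taken at different times")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_years, logs))
    return sxy / sxx


def first_year_summary(readings: List[Tuple[float, float]],
                       window_months: float = 13.5) -> Tuple[float, float]:
    """Return (lambda per year, nadir in ng/mL) from readings within the window.

    readings -- (months after RT, PSA in ng/mL) pairs
    """
    in_window = [(m, p) for m, p in readings if m <= window_months]
    lam = psa_rate([m / 12.0 for m, _ in in_window], [p for _, p in in_window])
    nadir = min(p for _, p in in_window)
    return lam, nadir


# Hypothetical patient: PSA falls from 5.0 to 1.2 ng/mL over the first year;
# the 18-month value lies outside the 13.5-month window and is ignored.
lam, nadir = first_year_summary([(3, 5.0), (6, 2.9), (12, 1.2), (18, 1.6)])
print(lam, nadir)
```

A negative λ corresponds to a falling PSA. Under the cutpoints reported in the abstract, a patient would count as adverse on these two factors when λ > 0 yr⁻¹ or the nadir is at least 4 ng/ml.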
