Guest Editorial
When Is “Final”?
Many clinical investigators continue to present results in a way that inadequately addresses the crucial issue of variable follow-up time. For example, it is well known that glaucoma surgeries tend to fail as postoperative time increases. The 5-Fluorouracil filtering study found that the percentage of patients who successfully avoided an IOP > 21 mmHg or further glaucoma operations was 80% at 1 year, but only 48% at 5 years.1 Yet some investigators still report postoperative results in terms of “final IOP,” without identifying when “final” is. Data from the final visits of all patients are often averaged and presented as a single result based on “mean follow-up,” thus obscuring the effect of time on the results. An even more misleading error that has occurred frequently in the glaucoma literature is to present the “final IOP” data in a “life-table” format, which suggests that time has been appropriately accounted for when it has not.
The most widely accepted statistical method of taking time into account in analyses of medical procedures is life-table analysis (the Kaplan-Meier product-limit method).2,3 Life-table statistics are able to use information on all patients, including those with unequal follow-up times and those lost to follow-up, the considerations that create the incomplete data that clinicians are forced to analyze. This method requires the clinician to define arbitrary criteria that will be considered a failure or “death” of the procedure being evaluated. The arbitrary criteria should be carefully considered so that “resurrections” are rare. In macular degeneration, these might include the loss of 3 lines of vision. For a glaucoma procedure, these might be the occurrence of further glaucoma surgery or an IOP ≥ 21 (or > 15) mmHg. In some instances, such as in evaluating corneal wound healing, the outcome may be a success rather than a failure, such as the re-establishment of an intact epithelium.
None of these failure criteria is perfect; for example, further glaucoma surgery may depend on the surgeon’s preference. Data for each patient are examined for the occurrence of failure, at which time the patient is considered a failure and is removed from subsequent analysis (each patient can fail only once).
An illustration of the life-table method can be made using typical postoperative follow-up data from a glaucoma surgery (Table 1). A single failure criterion is used: IOP > 21 mmHg. The life-table analysis calculates the proportion of patients who are still successful (have not failed) at each follow-up interval. At 6 and 12 months, both patients are successful, while at 18 months, the one remaining patient has an IOP > 21 mmHg and has failed according to the criterion. This method yields success rates of 100% at 6 and 12 months and 0% at 18, 24, and 36 months. The life-table provides a reasonable picture of the effect of time on success as a series of measures. In contrast, if only the last IOPs are analyzed (“final IOP-averaged”) as a single-number summary, the result is 50% success at an average follow-up of 24 months, which conveys the misleading message that some of the patients were still successful at 24 months when in fact none were (Table 1). Similarly, if a “life-table” is incorrectly developed from “final” IOP data only, the success rates of 100% at 6, 12, and 24 months and 0% at 36 months are misleading for the same reason.
Table 1. Illustration Using Two Sample Patients
Procedure A (IOP, mmHg)
Follow-up (mos)    Patient X    Patient Y
 3                 12            8
 6                 14           10
 9                 15           14
12                 18           19
18                              23
24                              27
30                              24
36                              26
Procedure A Success, IOP < 21 mmHg (%)
                     Follow-up (mos)
Method                 3     6     9    12    18    24    30    36
Life table           100   100   100   100     0     0     0     0
Final-averaged                                      50
Final life-table     100   100   100   100   100   100   100     0
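The three summaries of the two sample patients can be reproduced with a short sketch. The visit data follow the table above; the function and variable names are illustrative only, and the assignment of the longer follow-up to Patient Y is an assumption that does not affect the results. The only difference between the correct life table and the misleading "final" life table is how each patient's outcome is coded before the same product-limit calculation is applied.

```python
# Illustrative sketch; names are hypothetical, data are the two sample patients.
FAIL_IOP = 21  # failure criterion from the text: IOP > 21 mmHg
MONTHS = [3, 6, 9, 12, 18, 24, 30, 36]

# (month, IOP in mmHg) per visit. Which patient has the longer follow-up is
# assumed here; the summaries do not depend on that choice.
visits = {
    "X": [(3, 12), (6, 14), (9, 15), (12, 18)],
    "Y": [(3, 8), (6, 10), (9, 14), (12, 19),
          (18, 23), (24, 27), (30, 24), (36, 26)],
}


def first_failure(patient_visits):
    """Correct coding: fail at the first IOP > 21, else censor at last visit."""
    for month, iop in patient_visits:
        if iop > FAIL_IOP:
            return month, True
    return patient_visits[-1][0], False


def final_only(patient_visits):
    """Flawed coding: only the last visit is inspected."""
    month, iop = patient_visits[-1]
    return month, iop > FAIL_IOP


def kaplan_meier(events, months):
    """Product-limit success rate (%) at each requested month."""
    rates = {}
    for t in months:
        surviving = 1.0
        fail_times = sorted({m for m, failed in events if failed and m <= t})
        for ft in fail_times:
            at_risk = sum(1 for m, _ in events if m >= ft)
            failed_now = sum(1 for m, failed in events if failed and m == ft)
            surviving *= 1 - failed_now / at_risk
        rates[t] = round(100 * surviving)
    return rates


life_table = kaplan_meier([first_failure(v) for v in visits.values()], MONTHS)
final_life_table = kaplan_meier([final_only(v) for v in visits.values()], MONTHS)

finals = [final_only(v) for v in visits.values()]
final_success = 100 * sum(not failed for _, failed in finals) / len(finals)
mean_followup = sum(month for month, _ in finals) / len(finals)

print("Life table:      ", life_table)
print("Final life-table:", final_life_table)
print(f"Final-averaged: {final_success:.0f}% at mean follow-up "
      f"of {mean_followup:.0f} months")
```

With the correct coding, the patient who fails at 18 months drives the success rate to 0% from that point on; with the final-visit coding, the same patient appears to succeed until 36 months.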
Ophthalmology Volume 105, Number 3, March 1998
Table 2. Molteno Implant with Mitomycin C
Day of     Patients    Failed*    Interval Survival    Cumulative Survival    95% Confidence
Failure    Left (N)    (N)        Rate (%)             Rate (%)               Interval
  0        21          0          100                  100
 92        21          1           95                   95                    87-100
148        19          1           95                   90                    78-100
190        17          1           94                   85                    71-100
228        16          1           94                   80                    64-100
237        15          1           93                   74                    57-97
273        11          1           91                   68                    49-93
* Failure criteria included an IOP < 6 or > 21 mmHg on two consecutive observations 1 month apart, or the addition of glaucoma medication after 90 days follow-up; additional glaucoma surgery; or devastating complications.
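The interval and cumulative survival rates in the Molteno implant table above can be checked with a few lines of arithmetic. This is a minimal sketch: the day, patients-left, and failed counts are taken from the table, while the function name and data layout are illustrative. The confidence intervals are not reproduced, because the editorial does not state how they were calculated.

```python
# Illustrative sketch of the product-limit arithmetic for the Molteno data.
# Each row: (day of failure, patients left just before that day, number failed).
rows = [
    (92, 21, 1),
    (148, 19, 1),
    (190, 17, 1),
    (228, 16, 1),
    (237, 15, 1),
    (273, 11, 1),
]


def product_limit(rows):
    """Return (day, interval survival %, cumulative survival %) per failure."""
    cumulative = 1.0
    table = []
    for day, at_risk, failed in rows:
        interval = (at_risk - failed) / at_risk  # share surviving this interval
        cumulative *= interval                   # share surviving all intervals
        table.append((day, round(100 * interval), round(100 * cumulative)))
    return table


survival = product_limit(rows)
for day, interval_pct, cumulative_pct in survival:
    print(f"day {day:3d}: interval {interval_pct}%, cumulative {cumulative_pct}%")
```

Censored patients never appear as failures; they simply reduce the number of patients left before the next failure, which is why that count falls from 20 to 19 between day 92 and day 148.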
Use of the Kaplan-Meier method will also be illustrated with data from a study of Molteno implants (Table 2).4 The study started with 21 patients. The first patient failed at day 92 postoperatively, leaving a success rate of 95% (20 of 21). Following this failure, one patient was censored for incomplete follow-up (in this case, moved out of state), which left only 19 patients still successful. The second patient failed at 148 days. Therefore, 95% (18 of 19) of patients survived this last interval, and 95% of 95% of patients survived both intervals, to yield a success rate of 90% at 148 days. This process was continued through the last failure. The success rate at 1 year (365 days) could therefore be determined to be 68%. The 95% confidence interval was calculated to describe the certainty with which the results described the actual behavior of the patients.
Life-table analyses have limitations, however. They assume that patients lost to follow-up are no more or less likely to fail than patients remaining. This assumption is commonly violated and must be addressed, such as when patients are preferentially lost to follow-up when they become blind. “Interval censoring” may be introduced when some patients are seen more often than others; prevention of such bias requires that patients in studies be observed at regular intervals in follow-up.
Clinical investigators need to familiarize themselves with life-table methods and with biostatisticians who can help answer clinical questions. Readers should assess the validity of life-tables in manuscripts by asking at least the following questions: Do criteria for success or failure make clinical sense? Were patients analyzed at regular intervals rather than just at the end of follow-up? Were differences in pretreatment characteristics accounted for appropriately? Were all patients who began the study accounted for at each time point provided (patients left, failed, censored), and were reasons for each patient lost to follow-up given?
Considering the deleterious effect of longer follow-up time on many procedures,1 it is essential that new procedures be evaluated with methods that appropriately account for that time in follow-up. Otherwise, does anyone really know when “final” is?
References
1. The Fluorouracil Filtering Study Group. Five-year follow-up of the Fluorouracil Filtering Surgery Study. Am J Ophthalmol 1996;121:349-85.
2. Fisher LD, Van Belle G. Biostatistics: A Methodology for the Health Sciences. New York: John Wiley and Sons, 1993:801-22.
3. Dawson-Saunders B, Trapp RG. Basic and Clinical Biostatistics. Norwalk, CT: Appleton and Lange, 1990:136-206.
4. Perkins TW, Cardakli UF, Eisele JR, et al. Adjunctive mitomycin-C in Molteno implant surgery. Ophthalmology 1995;102:91-7.
TODD W. PERKINS, MD
MARIAN FISHER, PHD
Madison, Wisconsin