Editorials
The dilemma of digital imaging in retinopathy of prematurity
Graham E. Quinn, MD, MSCE
Two articles, one in the August issue of the Journal of AAPOS by Wallace et al1 and one in this issue of the Journal of AAPOS by Gelman et al,2 highlight the most vexing dilemma facing the ophthalmologist caring for prematurely born babies who develop “serious” retinopathy of prematurity (ROP): Are the posterior pole vessels of this eye sufficiently abnormal to qualify as having plus disease?3,4 The answer to this question largely determines whether or not an eye undergoes peripheral retinal ablation, based on the clinical algorithm developed by investigators of the Early Treatment for ROP Trial that established new treatment indications for serious ROP in late 2003.5 Answering this critical question is not straightforward, relying on a clinical judgment in which the examined baby’s fundus appearance is compared with a 20-year-old fundus photograph selected by ROP experts as demonstrating the “minimum” dilation and tortuosity sufficient to qualify as having plus disease.6,7 Therein lies the dilemma: there is no quantitative standard for the diagnosis of plus disease, and yet the term has been clinically useful in describing an important ocular finding that indicates potentially sight-threatening retinopathy.
The two groups of investigators have taken different and creative approaches to addressing the problem of plus disease diagnosis in ROP. Wallace and coworkers1 have developed “ROPtool,” a semi-automated program that takes high-quality digital images and develops measures of vascular tortuosity and dilation by comparing scores with those of vessels in the reference image of plus disease used in clinical trials.6,7 First, each quadrant of the 20 posterior pole images was judged by the two clinician authors to have vascular abnormalities consistent with plus disease, preplus disease (a term that indicates the presence of vascular abnormalities of the posterior pole insufficient to qualify as plus disease),3 or no abnormal dilation and tortuosity (normal).
Then images were analyzed independently by the same authors using ROPtool to determine a tortuosity index and dilation index for vessels they identified in each quadrant. Though certainly open to a criticism of recall bias, the results showed very good eye-level interobserver agreement on the presence of plus disease clinically (90%) and using ROPtool (95%). With ROPtool in its current iteration, eye-level vascular tortuosity scores appear to be more useful indicators of the presence of plus disease than measures of vessel dilation: ROC curves showed high sensitivity and specificity for tortuosity and performance barely above chance for dilation.

Author affiliations: Division of Pediatric Ophthalmology, The Children’s Hospital of Philadelphia, University of Pennsylvania, Philadelphia, Pennsylvania
Submitted September 27, 2007. Accepted September 28, 2007.
Reprint requests: Graham E. Quinn, MD, MSCE, Pediatric Ophthalmology, 1st Floor, Wood Building, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104 (email: [email protected]).
J AAPOS 2007;11:529-530.
Copyright © 2007 by the American Association for Pediatric Ophthalmology and Strabismus.
1091-8531/2007/$35.00 + 0
doi:10.1016/j.jaapos.2007.09.014
Journal of AAPOS

The second group of investigators, led by Chiang2 at Columbia University, have chosen to examine the accuracy of ROP experts in judging the presence or absence of plus disease, based on the appearance of digital images that showed only the posterior pole. Based on a comparison to a consensus diagnosis developed for each of the 34 images (ie, what most of the 22 experts judged the eye to demonstrate), they found that the “accuracy of ROP experts for plus disease diagnosis is imperfect.” This is not a novel observation. Clinicians disagree on treatment indications not infrequently, as shown by the 12% disagreement between experienced observers on whether or not an eye fulfilled the criteria for treatment in the cryotherapy for ROP trial.8 In the Gelman et al2 study reported here, sensitivity and specificity for the presence of plus disease varied widely across the experts, with only nine (40.9%) having both sensitivity and specificity of more than 80%. As in the Wallace article, the images then underwent computer-assisted analysis to determine estimates of curvature, diameter, and tortuosity for all vessels in the image. These parameters and various combinations of these parameters were then examined to determine sensitivity and specificity for detection of plus disease. The computer-assisted analyses generally did well, and some combinations predicted better than the vast majority of the clinicians evaluating the posterior pole vessels. These efforts to use computerized systems to evaluate eyes for plus disease represent a start toward standardizing approaches to plus disease.
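For readers less familiar with how accuracy is scored against a consensus diagnosis, the comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the method or code used by Gelman et al: the function names, the toy ratings, and the rule that a tie counts as "not plus" are all assumptions made for the example.

```python
from typing import List, Tuple


def consensus_diagnosis(ratings: List[List[bool]]) -> List[bool]:
    """Return the majority call for each image.

    ratings[e][i] is True when expert e judged image i to show plus disease.
    Assumption: a strict majority is required, so a tie counts as "not plus".
    """
    n_experts = len(ratings)
    n_images = len(ratings[0])
    return [2 * sum(expert[i] for expert in ratings) > n_experts
            for i in range(n_images)]


def sensitivity_specificity(calls: List[bool],
                            reference: List[bool]) -> Tuple[float, float]:
    """Score one rater's calls against a reference diagnosis."""
    tp = sum(c and r for c, r in zip(calls, reference))
    fn = sum((not c) and r for c, r in zip(calls, reference))
    tn = sum((not c) and (not r) for c, r in zip(calls, reference))
    fp = sum(c and (not r) for c, r in zip(calls, reference))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both plus and non-plus images")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 3 experts rating 4 images (not data from either study).
ratings = [
    [True, True, False, False],   # expert A
    [True, False, False, False],  # expert B
    [True, True, True, False],    # expert C
]
reference = consensus_diagnosis(ratings)

for name, calls in zip("ABC", ratings):
    sens, spec = sensitivity_specificity(calls, reference)
    print(f"expert {name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example expert B misses one consensus-positive image (sensitivity 0.5, specificity 1.0), while expert C overcalls one consensus-negative image (sensitivity 1.0, specificity 0.5). Each expert's own vote contributes to the consensus, so the reference is not independent of the rater being scored.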
However, there are years of work ahead as we determine the validity of imaging in ROP clinical diagnosis. At present, we should view the quantification of ROP abnormalities using digital images as important investigational exercises. Since a number of studies have shown disagreements in the diagnosis of plus disease on clinical examination,9,10 using the clinical diagnosis of plus disease as the gold standard is not sufficient for validating this new tool. The luxury of examining, measuring, and comparing findings in detail, rather than simply discussing a cartoon/diagram of the ocular findings in a tiny baby who is none too happy to have us there examining eyes, provides an opportunity to develop quantitative scores of the various vascular abnormalities that
characterize both the normal and abnormal retinal vascular development in the prematurely born baby. Initially these scores need to be based on what clinicians are currently calling normal, preplus disease, and plus disease, and such an exercise may well help parse out the amorphous line between “near” plus and “clear” plus disease. Eye scores could then be used to develop a portfolio of images that provide examples of vascular abnormalities that the clinician could use during indirect ophthalmoscopy in the NICU to distinguish worrisome vascular abnormalities from nonthreatening ones. Ultimately, though, the utility of using vessel scores needs to be proved or disproved in clinical trials that examine the outcome of treatment decisions made on some combination of stage, location, and vascular score. Such studies will require participating institutions to have ready access to imaging and image evaluation, as well as a group of investigators who can maintain equipoise in a trial that may go against their clinical impression.
Before we get to clinical trials, however, we need to determine the validity of using digital images to substitute for the eyes of the clinician performing a diagnostic examination. There are a variety of settings in which digital imaging and image analysis of eyes of babies at risk may prove very useful.
Digital imaging has great potential as a screening tool when used to identify “referral-warranted ROP” (RW-ROP) as defined by Ells et al.11 By stepping back from developing an eye score that indicates that an eye needs treatment to using a score to indicate that the eye should be evaluated by an ophthalmologist familiar with ROP, digital imaging becomes clinically useful in areas of the United States that are underserved by experienced ophthalmologists and, importantly, useful in areas of developing countries, where ophthalmic expertise in ROP is sparse and where the “third epidemic” of blindness due to ROP is occurring.12,13 Such a use of digital retinal imaging is certainly appealing, but the cost in terms of cases missed and eyes blinded (ie, negative predictive value) must be carefully examined in several contexts. In the United States, where ophthalmic expertise is generally available and there are large expenditures on preserving lives of premature babies, the number of cases missed must necessarily be very low, while in areas of developing countries where no or minimal screening is currently available and resources are scarce, different values of sensitivity and specificity for detecting serious ROP may well be acceptable. Research in this important area must concentrate not only on the validity of digital imaging, but also on the societal impact of its use. There are barriers to the wide use of such image analysis including availability of computer-assisted evaluation of ROP in the NICU and the cost of digital imaging. The difficulty of introducing computerized ROP evaluation systems into the NICU was demonstrated in the ETROP study5 when investigators chose not to recommend new
treatment levels based on the risk determined by the computer-based RM-ROPII program14 on which the study was designed. Treatment recommendations were, instead, based on the clinical appearance of the eye; the patient demographic and pace of disease information that are integral to the RM-ROP program were essentially disregarded by assigning them to “clinical judgment.” This decision allowed the clinician to make the treatment decision at the bedside without the need for consulting a computerized program. Nonetheless, as technology advances, barriers to computerized methodologies for ROP evaluation will fall as the cost of imaging systems decreases and less imperfect systems for determining whether an eye needs treatment are developed. Using recent and future breakthroughs in digital retinal imaging technology, we in ophthalmology will hopefully embrace the opportunity to provide better care to our smallest patients.

Volume 11 Number 6 / December 2007
References
1. Wallace DK, Zhao Z, Freedman SF. A pilot study using “ROPtool” to quantify plus disease in retinopathy of prematurity. J AAPOS 2007;11:381-7.
2. Gelman R, Jiang L, Du YE, Martinez-Perez ME, Flynn JT, Chiang MF. Plus disease in retinopathy of prematurity: Pilot study of computer-based and expert diagnosis. J AAPOS 2007;11:532-40.
3. The International Classification of Retinopathy of Prematurity revisited. Arch Ophthalmol 2005;123:991-9.
4. An international classification of retinopathy of prematurity. The Committee for the Classification of Retinopathy of Prematurity. Arch Ophthalmol 1984;102:1130-4.
5. Early Treatment for Retinopathy of Prematurity Cooperative Group. Revised indications for the treatment of retinopathy of prematurity: Results of the early treatment for retinopathy of prematurity randomized trial. Arch Ophthalmol 2003;121:1684-94.
6. Cryotherapy for Retinopathy of Prematurity Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: Preliminary results. Pediatrics 1988;81:697-706.
7. Good WV, Hardy RJ, Dobson V, et al. The incidence and course of retinopathy of prematurity: Findings from the early treatment for retinopathy of prematurity study. Pediatrics 2005;116:15-23.
8. Reynolds JD, Dobson V, Quinn GE, et al. Evidence-based screening criteria for retinopathy of prematurity: Natural history data from the CRYO-ROP and LIGHT-ROP studies. Arch Ophthalmol 2002;120:1470-6.
9. Freedman SF, Klystra J, Capowski J, Realini T, Rich C, Hunt D. Observer sensitivity to retinal vessel diameter and tortuosity in retinopathy of prematurity: A model system. J AAPOS 1996;33:248-54.
10. Chiang M, Keenan J, Starren J, et al. Accuracy and reliability of remote retinopathy of prematurity diagnosis. Arch Ophthalmol 2006;124:322-7.
11. Ells A, Holmes JM, Astle W, et al. Telemedicine approach to screening for severe retinopathy of prematurity: A pilot study. Ophthalmology 2003;110:2113-17.
12. Gilbert C, Fielder A, Gordillo L, et al. Characteristics of infants with severe retinopathy of prematurity in countries with low, moderate, and high levels of development: Implications for screening programs. Pediatrics 2005;115:e518-25.
13. Gilbert C, Rahi J, Eckstein M, O’Sullivan J, Foster A. Retinopathy of prematurity in middle-income countries. Lancet 1997;350(9070):12-24.
14. Hardy RJ, Palmer EA, Dobson V, et al. Risk analysis of prethreshold retinopathy of prematurity. Arch Ophthalmol 2003;121:1697-701.