ORIGINAL
ARTICLES
Accuracy of teledermatology for pigmented neoplasms

Erin M. Warshaw, MD, MS,a,b Frank A. Lederle, MD,a Joseph P. Grill, MS,a Amy A. Gravely, MA,a Ann K. Bangerter, BS,a Lawrence A. Fortier, MA,a Kimberly A. Bohjanen, MD,b Karen Chen, MD,b Peter K. Lee, MD, PhD,b Harold S. Rabinovitz, MD,c Robert H. Johr, MD,c,d Valda N. Kaye, MD,b Sacharitha Bowers, MD,e Rachel Wenner, MD,b Sharone K. Askari, MD,f Deborah A. Kedrowski, RN,a and David B. Nelson, PhDa
Minneapolis, Minnesota; Miami, Florida; Springfield, Illinois; and St Louis, Missouri

Background: Accurate diagnosis and management of pigmented lesions is critical because of the morbidity and mortality associated with melanoma.

Objective: We sought to compare accuracy of store-and-forward teledermatology for pigmented neoplasms with standard, in-person clinic dermatology.

Methods: We conducted a repeated measures equivalence trial involving veterans with pigmented skin neoplasms. Each lesion was evaluated by a clinic dermatologist and a teledermatologist; both generated a primary diagnosis, up to two differential diagnoses, and a management plan. The primary outcome was aggregated diagnostic accuracy (match of any chosen diagnosis with histopathology). We also compared the severity of inappropriately managed lesions and, for teledermatology, evaluated the incremental change in accuracy when polarized light dermatoscopy or contact immersion dermatoscopy images were viewed.

Results: We enrolled 542 patients with pigmented lesions; most were male (96%) and Caucasian (97%). The aggregated diagnostic accuracy rates for teledermatology (macro images, polarized light dermatoscopy, and contact immersion dermatoscopy) were not equivalent (95% confidence interval for difference within ±10%) and were inferior (95% confidence interval lower bound <10%) to clinic dermatology. In general, the addition of dermatoscopic images did not significantly change teledermatology diagnostic accuracy rates.
In contrast to diagnostic accuracy, rates of appropriate management plans for teledermatology were superior and/or equivalent to clinic dermatology (all image types: all lesions, and benign lesions). However, for the subgroup of malignant lesions (n = 124), the rate of appropriate management was significantly worse for teledermatology than for clinic dermatology (all image types). Up to 7 of 36 index melanomas would have been mismanaged via teledermatology. Limitations: Nondiverse study population and relatively small number of melanomas were limitations. Conclusions: In general, the diagnostic accuracy of teledermatology was inferior whereas management was equivalent to clinic dermatology. However, for the important subgroup of malignant pigmented lesions, both diagnostic and management accuracy of teledermatology was generally inferior to clinic dermatology and up to 7 of 36 index melanomas would have been mismanaged via teledermatology. Teledermatology and teledermatoscopy should be used with caution for patients with suspected malignant pigmented lesions. ( J Am Acad Dermatol 2009;61:753-65.) Key words: accuracy; melanoma; pigmented lesion; teledermatology; telemedicine. From the Minneapolis Veterans Affairs Medical Center, Center for Chronic Disease Outcomes Researcha; Department of Dermatology, University of Minnesota School of Medicineb; Departments of Dermatologyc and Pediatrics,d University of Miami School of Medicine; Department of Dermatology, Southern Illinois Universitye; and Department of Dermatology, St Louis University.f Supported by the Department of Veterans Affairs Health Services Research and Development Service (Grant IIR 01-072-2). During this study, Dr Warshaw was supported by a Veterans Affairs Cooperative Studies Program Clinical Research Career Development Award. Conflicts of interest: None declared. 
Presented in part at the Plenary Session 68th Annual Meeting of the Society of Investigative Dermatology in Los Angeles, CA, on May 10-12, 2007.
The findings and conclusions presented in this report are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs or Health Services Research and Development. Accepted for publication April 17, 2009. Reprints not available from the authors. Correspondence to: Erin M. Warshaw, MD, MS, Department 111 K VAMC, 1 Veterans Drive, Minneapolis, MN 55417. E-mail: erin.
[email protected]. Available online August 13, 2009. 0190-9622/$36.00 © 2009 by the American Academy of Dermatology, Inc. doi:10.1016/j.jaad.2009.04.032
J AM ACAD DERMATOL NOVEMBER 2009
Teledermatology holds the promise of providing cost-effective care to underserved areas that lack access to dermatology services.1,2 Before widespread implementation, however, rigorous evaluation of this relatively new technology is essential, especially for conditions with significant mortality and morbidity such as melanoma. Although preliminary store-and-forward teledermatology studies3-13 have shown that simple diagnostic agreement of teledermatologists and clinic dermatologists is comparable for most dermatologic conditions (primarily rashes), accuracy (using the reference standard of histopathology) is the most appropriate outcome for skin neoplasms.11-13 Within the category of skin neoplasms, pigmented lesions present specific challenges because of significant mortality and morbidity associated with malignant melanoma, especially in non-Hispanic men older than 65 years.14 There are only a few studies15 evaluating standard macro images for the diagnosis of pigmented lesions using histopathology as a reference standard.

Dermatoscopy (also called dermoscopy or epiluminescence microscopy) reduces light reflection from the skin surface by eliminating surface light reflection via contact immersion dermatoscopy (CID) or polarized light dermatoscopy (PLD).16 The added value of using PLD or CID is well established for clinical evaluation of pigmented lesions when performed by a dermatologist who is trained and experienced in dermatoscopy.17 Therefore, it is logical that teledermatoscopy would yield similar benefits over teledermatology with standard macro images alone. It is even possible that a highly experienced teledermatoscopist could potentially be more accurate than an in-person clinical dermatologist who is not as skilled in dermatoscopy. Several studies18-23 have compared clinicians and teledermatoscopists evaluating pigmented lesions using histopathology as a reference standard. Sample sizes for these studies were small (66-157 lesions) and relatively few melanomas (1-32) were studied. Although the data from these pilot studies, yielding diagnostic accuracy rates of 75% to 95% for teledermatologists and 64% to 92% for clinicians, suggest possible equivalency, to date, adequately powered studies have been lacking.

The purpose of this study was to compare conventional, in-person clinical dermatology with store-and-forward teledermatology for pigmented skin neoplasms, using the outcomes of diagnostic accuracy and appropriateness of management. Using equivalence tests, we assessed whether the diagnostic accuracy of store-and-forward teledermatology differs by no more than 10% from the accuracy of standard, clinical evaluations. We also examined the effect of including PLD or CID images on teledermatology accuracy rates and assessed outcomes in the subgroups of malignant and benign lesions.

CAPSULE SUMMARY
• The diagnostic accuracy of clinic dermatologists evaluating 542 veterans with pigmented skin lesions was superior to teledermatologists and the addition of dermatoscopic images did not significantly increase the diagnostic accuracy of teledermatologists.
• Despite the superiority of clinic dermatology for diagnostic accuracy, the two methods of care had overall equivalent rates of appropriate management; however, 7 index melanomas (19%) would have been mismanaged via teledermatology.
• Clinicians should be aware of the limitations of teledermatology and teledermatoscopy for diagnosis and management of melanoma.

METHODS
This study was a cross-sectional, repeated measures study that compared store-and-forward teledermatology with traditional, in-person clinical encounters for pigmented lesions. The methods for this study of pigmented lesions were identical to a study of nonpigmented lesions24 with one exception. In addition to standard macro images (distance and close-up) (digital Nikon Coolpix 4500 with a Nikon SL-1 ring flash, Nikon, Melville, NY) and PLD images (digital Nikon Coolpix 4500 with a 3Gen Dermlite lens attachment, 3Gen, San Juan Capistrano, CA), a standard CID image (35-mm Minolta X 370 with a Heine dermphot lens attachment, Heine, Dover, NH) was also obtained for each pigmented lesion. The resulting 35-mm kodachrome was scanned (Nikon Cool Scan LS-4000ED, Konica Minolta, Tokyo, Japan) to create a digital image.

Outcome measures
The outcome measures for this study were identical to those used for the study of nonpigmented lesions.24 Briefly, the primary outcome measure was aggregated diagnostic accuracy, defined as agreement of the primary diagnosis or any of the differential diagnoses with the histopathology results. Diagnostic accuracy of the primary diagnosis was a secondary outcome. Management plan appropriateness was judged in reference to standards
VOLUME 61, NUMBER 5
determined by a dermatology expert panel based on histopathologic diagnoses.

Protocol
The inclusion and exclusion criteria were the same for this study as in the study of nonpigmented lesions.24 Briefly, eligible participants included those referred to dermatology by nondermatology health care providers for evaluation of a pigmented skin neoplasm subsequently biopsied or those patients already enrolled in the Minneapolis Department of Veteran Affairs dermatology clinic who were undergoing removal/biopsy of a pigmented neoplasm either because of patient request or physician recommendation. After informed consent, one of 11 staff clinic dermatologists completed a clinical assessment that included: (1) a choice of 17 common diagnoses for one primary and up to two differential diagnoses; (2) a choice of 5 basic management plans (remove/biopsy/destroy, observe/reassure, antifungal treatment, antibiotic treatment, antiinflammatory treatment); (3) pigmentation status; and (4) level of confidence in chosen diagnosis and management. Clinicians were also allowed to choose "other" for diagnoses and management plan and to handwrite descriptive answers. The clinic dermatologist obtained history in the usual manner of a clinic encounter (verbal queries, review of the medical record) and the clinical examination could include all options normally available in the clinical setting (eg, palpation, diascopy, dermatoscopy). For the purposes of this study, management plan was based on medical opinion, not participant preference.
Teledermatology encounter
For each participant, one of 3 board-certified dermatologists (none of whom served as clinic dermatologists) with clinical expertise in dermatoscopy (defined as >5 years experience and recognized pigmented lesion expert in the community)25 was randomly assigned (using a computer-generated randomization program) to review the electronically transmitted clinical digital photographs and the standardized patient and lesion history collected by the research assistants. Using the same diagnostic and management categories as used by clinic dermatologists, the teledermatologist recorded one primary diagnosis, up to two differential diagnoses, and a management plan for each lesion. In addition to confidence level, teledermatologists also rated image quality. To minimize recall bias, the teledermatologist received a sequence of 3 image packages, separated by 3 weeks, comprising either macro images only (distance and close-up, with or without an angle image), macro plus PLD image, or macro plus CID image. The order of the image packages was determined by the randomization schedule. Images of all lesions for a single participant visit were evaluated by the teledermatologist in a single session.

Expert panel
A panel of 3 board-certified dermatologists, none of whom served as clinic dermatologists or teledermatologists and who were unaware of the study design and purpose, convened for two expert panel meetings with the aims of: (1) grouping "other" diagnoses into diagnostic categories for meaningful analysis; (2) defining gold standard reference management plans for each lesion category; and (3) qualitatively rating the severity for differences in chosen management plans from the reference plans as "minor" (non-life threatening, no delay in appropriate therapy), "moderate" (non-life threatening, possible delay in appropriate therapy), or "major" (potentially life threatening).

Statistical analysis
The primary statistical analyses used two-sided equivalence tests to examine the equivalence of the aggregated diagnostic accuracy of teledermatology and clinic-based dermatology. The analysis tested the null hypothesis that the absolute difference in accuracy rates is at least 10% against the alternative hypothesis that the difference is less than 10%.26 This test, conducted using a significance level of .025, corresponds to assessing whether the 95% confidence interval for the difference in accuracy is entirely within ±10%. To account for potential correlation among paired diagnoses of the same lesions, we used methodology outlined by Agresti27 in 1990 to construct these 95% confidence intervals. Similar methods assessed the equivalence of the accuracy of the primary diagnoses and the appropriateness of the chosen management plans. In addition to assessing equivalence, we tested whether accuracy rates and appropriate management rates for clinic-based dermatology and teledermatology were significantly different.
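The equivalence criterion described above can be illustrated with a short sketch. It assumes the standard Wald interval for a difference of paired proportions, which may not be the exact variant of the Agresti methodology the authors used. The function names and the cell counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def paired_difference_ci(a, b, c, d, z=1.96):
    """Wald CI for (teledermatology rate - clinic rate) on paired lesions.

    a: both correct, b: only teledermatology correct,
    c: only clinic correct, d: both incorrect.
    Assumption: normal-approximation interval for paired proportions.
    """
    n = a + b + c + d
    diff = (b - c) / n
    se = sqrt((b + c) - (b - c) ** 2 / n) / n
    return diff, (diff - z * se, diff + z * se)


def equivalence_verdict(ci, margin=0.10):
    """Classify a CI against a +/-10% equivalence margin."""
    lo, hi = ci
    if -margin < lo and hi < margin:
        return "equivalent"        # CI wholly within the margin
    if hi < -margin or lo > margin:
        return "not equivalent"    # CI wholly outside the margin
    return "indeterminate"         # CI straddles a margin boundary


# Hypothetical counts for illustration only, NOT study data.
diff, ci = paired_difference_ci(a=300, b=30, c=110, d=60)
print(round(diff, 3), equivalence_verdict(ci))
```

The three verdicts mirror the categories used in the table footnotes: an interval wholly within the margin, wholly outside it, or neither.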
Furthermore, we assessed the added value of PLD and CID images by comparing teledermatology accuracy using just macro images with that using both macro and PLD images and that using both macro and CID images. Finally, we also compared teledermatology and clinic-based dermatology with respect to severity of management plan inappropriateness. All of this was repeated for the subcategories of malignant and benign lesions. McNemar’s test for paired observations27 was used to examine these issues. If a participant presented with more than one lesion at a single visit, only one lesion was chosen
Fig 1. Flow chart of study participants.
as the index lesion for the primary analysis. The protocol for identifying an index lesion was based on pigmentation status and completeness of image packages. There were 542 index pigmented lesions (one lesion/participant) included in the main analysis. Analysis of all pigmented lesions (N = 651, index and nonindex lesions) included mixed effects logistics models that yielded results similar to models ignoring the correlation and to the analysis of just the index lesions. Given these results, for simplicity of exposition, we report the results for just index lesions. None of the patients with index pigmented lesions in this study were included as index patients for the separate study of nonpigmented lesions.24 A sample of 520 biopsied pigmented lesions provided approximately 80% or greater power for declaring equivalence (two-sided) between teledermatology and clinic-based evaluation across a wide range of diagnosis accuracy and appropriate management rates.
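As a rough illustration of McNemar's test for paired observations, the sketch below computes an exact two-sided P value from the two discordant counts. Whether the authors used the exact or the chi-square form is not stated, so the exact form is an assumption, and the function name is hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value.

    b: lesions where only teledermatology was correct,
    c: lesions where only clinic dermatology was correct.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Only the discordant pairs enter the calculation; lesions on which both evaluators were right, or both wrong, carry no information about which method is more accurate.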
RESULTS Characteristics of study participants and index lesions A total of 542 patients met criteria for this analysis (Fig 1). The majority of participants were male, elderly, and Caucasian (Table I). About one quarter of participants had a history of nonmelanoma skin cancer and 6% had a personal history of melanoma. The most common lesion locations were head/neck and upper extremity. Table II summarizes histopathology categories. Approximately one fifth were benign keratoses (23%) and dysplastic nevi (21%), followed by benign nevi (15%) and pigmented basal cell carcinomas (12%). There were 36 index melanomas. Overall comparisons of teledermatology and clinic-based dermatology For the primary outcome, aggregated diagnostic accuracy, teledermatology and clinic dermatology were not equivalent and clinic dermatology was superior for all image types (macro only, macro plus
PLD, or macro plus CID) (Table III). For the secondary outcome of primary diagnostic accuracy, teledermatology and clinic dermatology were equivalent when macro plus CID images were viewed by teledermatologists but clinic dermatology was significantly more accurate for macro images only and macro plus PLD images. In contrast to the results for diagnostic accuracy, the rates of appropriate management were equivalent for macro images only and for macro plus PLD images, and teledermatology was significantly better than clinic dermatology for all 3 image types (Table III).

Table I. Characteristics of study participants (n = 542) and index lesions (n = 542)

| Characteristic | No. (%) |
|---|---|
| Male | 519 (95.8) |
| Mean age, y [range] | 66 [23-94] |
| Fitzpatrick skin type | |
| I | 37 (6.8) |
| II | 142 (26.2) |
| III | 301 (55.5) |
| IV-VI | 61 (11.3) |
| Ethnicity | |
| Caucasian | 526 (97.1) |
| African American | 7 (1.3) |
| Other | 8 (1.5) |
| Personal history | |
| Nonmelanoma skin cancer | 147 (27.1) |
| Melanoma | 34 (6.3) |
| Other skin condition | 64 (11.8) |
| No skin conditions | 319 (58.9) |
| Lesion location | |
| Head/neck | 207 (38.2) |
| Hand/arm | 169 (31.2) |
| Trunk | 131 (24.2) |
| Leg/foot | 31 (5.7) |
| Buttock/groin | 4 (0.7) |
| Lesion duration | |
| <3 mo | 19 (3.5) |
| 3-12 mo | 77 (14.2) |
| 1-2 y | 46 (8.5) |
| 2-5 y | 77 (14.2) |
| 5-20 y | 60 (11.1) |
| Other* | 235 (43.4) |
| Lesion symptoms† | |
| No symptoms | 324 (59.8) |
| Size change | 105 (19.4) |
| Itching | 90 (16.6) |
| Bleeding | 52 (9.6) |
| Tenderness | 41 (7.6) |
| Burning | 10 (1.8) |
| Other | 35 (6.5) |

*Includes unknown. †Denominator is number of participants; because >1 symptom could be reported, totals exceed number of participants.

Table II. Histopathologic diagnoses of biopsied pigmented index lesions (n = 542)

| Histopathologic diagnosis | Index lesions, No. (%) |
|---|---|
| Benign keratosis | 125 (23.1) |
| Dysplastic nevus | 115 (21.2) |
| Benign nevus | 82 (15.1) |
| Basal cell carcinoma | 66 (12.2) |
| Melanoma | 36 (6.6) |
| Lentigo | 24 (4.4) |
| Squamous cell carcinoma | 18 (3.3) |
| Benign angioma | 17 (3.1) |
| Actinic keratosis | 12 (2.2) |
| Dermatofibroma | 12 (2.2) |
| Other* | 35 (6.5) |

*All benign lesions except for 4 indeterminate melanocytic lesions.
Comparisons assessing the added value of dermatoscopic images for teledermatology There were no significant differences in accuracy rates between macro images alone and macro images plus PLD images (n = 541, aggregated diagnostic: 64.14% vs 64.88%, P = .7807; primary diagnostic: 50.46% vs 51.57%, P = .6636; management plan: 70.43% vs 70.06%, P = .9132). The addition of CID images, however, significantly enhanced primary diagnostic accuracy (n = 494, 50.61% vs 56.88%, P = .0049), but not aggregated diagnostic accuracy (64.37% vs 67.00%, P = .2276) or management plan appropriateness (71.05% vs 73.89%, P = .1797), as compared with macro images alone. There was a borderline significant difference in primary diagnostic accuracy rates between macro plus PLD images and macro plus CID images (n = 493, primary diagnostic: 51.93% vs 57.00%, P = .0261), but not aggregated diagnostic rates (65.31% vs 67.14%, P = .3966) or management plan accuracy rates (70.59% vs 73.83%, P = .1011). Malignant and benign categories In general, the results for the benign subcategory of lesions (Table IV) followed the same pattern as that described above with the exception of equivalency of appropriate management in which the benign category did not show equivalency for macro images only and macro plus PLD images. Results for malignant lesions, however, differed. Although the equivalence test results were indeterminate for both aggregated and primary diagnostic accuracy, clinicbased aggregated diagnostic accuracy was superior to that for teledermatology. Further, in contrast to the results for the overall group and the benign subgroup, clinic dermatology was superior in management appropriateness (all image types) for malignant lesions.
Table III. Accuracy rates for index lesions by image package type

| Image package | Outcome | Teledermatology accuracy rate | Clinic accuracy rate | Difference (SE) | 95% CI |
|---|---|---|---|---|---|
| Macro images only, n = 542 | Aggregated diagnostic | 64.02% | 80.26% | −16.24 (2.10)†§ | −20.36, −12.12 |
| | Primary diagnostic | 50.37% | 58.67% | −8.30 (2.36)§ | −12.92, −3.68 |
| | Management plan | 70.48% | 65.68% | 4.80 (2.09)*‡ | 0.69, 8.90 |
| Macro + PLD images, n = 541 | Aggregated diagnostic | 64.88% | 80.41% | −15.53 (2.02)†§ | −19.48, −11.58 |
| | Primary diagnostic | 51.57% | 58.78% | −7.21 (2.45)§ | −12.02, −2.4 |
| | Management plan | 70.06% | 65.62% | 4.44 (1.98)*‡ | 0.55, 8.32 |
| Macro + CID images, n = 494 | Aggregated diagnostic | 67.00% | 80.77% | −13.77 (2.01)†§ | −17.71, −9.82 |
| | Primary diagnostic | 56.88% | 59.31% | −2.42 (2.25)* | −6.8, 2.00 |
| | Management plan | 73.89% | 65.99% | 7.89 (2.12)‡ | 3.74, 12.05 |

CI, Confidence interval; CID, contact immersion dermatoscopy; PLD, polarized light dermatoscopy. *Equivalent, CI wholly within ±10%. †Definitely not equivalent, CI wholly outside ±10%. ‡Teledermatology statistically significantly better than clinical dermatology (if CI ≥ 0). §Teledermatology statistically significantly worse than clinical dermatology (if CI < 0).
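As a worked check on Table III, the reported confidence intervals can be approximately reproduced from the reported differences and standard errors. The sketch assumes a normal-approximation interval (difference ± 1.96 × SE); the paper does not state the formula, so this is an assumption, and small discrepancies are expected because the published inputs are rounded.

```python
Z = 1.96  # assumed normal quantile for a 95% interval


def wald_ci(diff, se, z=Z):
    """Return (lower, upper) for diff +/- z * se, in percentage points."""
    return (diff - z * se, diff + z * se)


# Macro-images-only rows of Table III: (difference, SE, reported CI).
macro_rows = {
    "aggregated diagnostic": (-16.24, 2.10, (-20.36, -12.12)),
    "primary diagnostic": (-8.30, 2.36, (-12.92, -3.68)),
    "management plan": (4.80, 2.09, (0.69, 8.90)),
}

for outcome, (diff, se, reported) in macro_rows.items():
    lo, hi = wald_ci(diff, se)
    print(f"{outcome}: computed ({lo:.2f}, {hi:.2f}), reported {reported}")
```

For the aggregated diagnostic row the whole interval lies below −10, which is why that comparison is flagged as definitely not equivalent and as significantly worse for teledermatology.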
Among teledermatologists, the addition of PLD images increased the diagnostic accuracy for both benign and malignant lesions but these differences were not statistically significant (P values ≥ .6208). The management appropriateness rates also increased for malignant lesions (3.9%), but not for benign lesions (−1.7%), although neither change was significant (P values ≥ .3323). As compared with macro images alone, there were no significant changes in the diagnostic accuracy of teledermatologists with the addition of CID images for malignant lesions (6.6%-7.6%, P values ≥ .2632) although there was a significant increase in primary diagnostic accuracy for benign lesions (6.3%, P = .0134; aggregated diagnostic accuracy change 2.0%, P = .4944). There were no significant changes in management plan accuracy rates for either malignant (1.6%, P = .8063) or benign (4.1%, P = .2127) lesions. Comparing teledermatologists viewing PLD images and CID images, diagnostic accuracy rates were better, but not significantly so, with CID images by 3.6% to 7.1% (P = .4240, P = .1671) for malignant lesions and 1.8% to 5.0% (P = .6445, P = .0854) for benign lesions. Although the rate of management appropriateness was slightly worse for malignant lesions (−2.4%, P = .5488), it was borderline significantly better for benign lesions (5.8%, P = .0344) for CID images as compared with PLD images.

Clinical significance of inappropriate management
The expert panel rated the clinical significance of inappropriate management in which either the clinician's or teledermatologist's management did not match the gold standard management. Table V summarizes these findings. Grouping severities as either "none/mild" versus "moderate/major" or "none" versus "mild/moderate/major," teledermatology resulted in a significantly higher severity rate of management inappropriateness as compared with clinic dermatology. This difference was significant, regardless of image type (Table VI).
Details of mismanaged melanomas Table VII provides details of mismanaged index melanomas. For clinic dermatology, there was one lentigo maligna that was mismanaged. For teledermatology, there were 7 melanomas mismanaged with macro images alone, 3 with macro plus PLD images, and 5 with macro plus CID images (Figs 2 to 4). Of the 5 nonindex melanomas, for clinic dermatology, one was mismanaged whereas for teledermatology, two were mismanaged with macro images alone, one with macro plus PLD images, and 3 with macro plus CID images (data not shown).
Table IV. Accuracy rates for malignant and benign pigmented index lesions

| Image package | Lesion group | Outcome | Teledermatology accuracy rate (%) | Clinic accuracy rate (%) | Difference (SE) | 95% CI |
|---|---|---|---|---|---|---|
| Macro images only | Malignant, n = 124 | Aggregated diagnostic | 67.74 | 81.45 | −13.71 (4.01)§ | −21.56, −5.86 |
| | | Primary diagnostic | 54.84 | 62.90 | −8.06 (4.36) | −16.61, 0.48 |
| | | Management plan | 84.68 | 96.77 | −12.10 (3.14)§ | −18.26, −5.94 |
| | Benign, n = 418 | Aggregated diagnostic | 62.92 | 79.90 | −16.99 (2.45)†§ | −21.79, −12.18 |
| | | Primary diagnostic | 49.04 | 57.42 | −8.37 (2.77)§ | −13.80, −2.94 |
| | | Management plan | 66.27 | 56.46 | 9.81 (2.50)‡ | 4.91, 14.70 |
| Macro + PLD images | Malignant, n = 123 | Aggregated diagnostic | 70.73 | 82.11 | −11.38 (3.49)§ | −18.22, −4.55 |
| | | Primary diagnostic | 55.28 | 63.41 | −8.13 (4.54) | −17.03, 0.77 |
| | | Management plan | 88.62 | 96.75 | −8.13 (2.72)§ | −13.46, −2.80 |
| | Benign, n = 418 | Aggregated diagnostic | 63.16 | 79.90 | −16.75 (2.40)†§ | −21.44, −12.05 |
| | | Primary diagnostic | 50.48 | 57.42 | −6.94 (2.88)§ | −12.58, −1.29 |
| | | Management plan | 64.59 | 56.46 | 8.13 (2.41)‡ | 3.42, 12.85 |
| Macro + CID images | Malignant, n = 109 | Aggregated diagnostic | 74.31 | 82.57 | −8.26 (2.94)§ | −14.02, −2.50 |
| | | Primary diagnostic | 62.39 | 66.06 | −3.67 (3.88) | −11.27, 3.93 |
| | | Management plan | 86.24 | 98.17 | −11.93 (3.10)§ | −18.01, −5.84 |
| | Benign, n = 385 | Aggregated diagnostic | 64.94 | 80.26 | −15.33 (2.44)†§ | −20.10, −10.55 |
| | | Primary diagnostic | 55.32 | 57.40 | −2.08 (2.67)* | −7.32, 3.16 |
| | | Management plan | 70.39 | 56.88 | 13.51 (2.50)‡ | 8.60, 18.42 |

CI, Confidence interval; CID, contact immersion dermatoscopy; PLD, polarized light dermatoscopy. *Equivalent, CI wholly within ±10%. †Definitely not equivalent, CI wholly outside ±10%. ‡Teledermatology statistically significantly better than clinical dermatology (if CI ≥ 0). §Teledermatology statistically significantly worse than clinical dermatology (if CI < 0).
Table V. Clinical significance of inappropriate management plans for index lesions

| Clinical significance of disagreement* | Macro images only, n = 542: Teledermatology (No.) | Macro images only, n = 542: Clinic (No.) | Macro + PLD images, n = 541: Teledermatology (No.) | Macro + PLD images, n = 541: Clinic (No.) | Macro + CID images, n = 494: Teledermatology (No.) | Macro + CID images, n = 494: Clinic (No.) |
|---|---|---|---|---|---|---|
| None (agreement) | 382 | 356 | 379 | 355 | 365 | 326 |
| Minor (non-life threatening, no delay in treatment) | 137 | 177 | 142 | 177 | 108 | 161 |
| Moderate (non-life threatening, possible delay in treatment) | 11 | 7 | 13 | 7 | 11 | 5 |
| Major (potentially life threatening) | 12 (7 MM, 5 SCC) | 2 (1 MM, 1 SCC) | 7 (3 MM, 4 SCC) | 2 (1 MM, 1 SCC) | 10 (6 MM, 4 SCC) | 2 (1 MM, 1 SCC) |

CID, Contact immersion dermatoscopy; MM, misdiagnosed histopathologic malignant melanoma; PLD, polarized light dermatoscopy; SCC, misdiagnosed histopathologic squamous cell carcinoma. *Expert panel rating.
Table VI. Comparison of management disagreement severity rates of teledermatologists and clinic dermatologists

| Image package | Moderate/major: Teledermatology | Moderate/major: Clinic | Moderate/major: Comparison* | Major: Teledermatology | Major: Clinic | Major: Comparison* |
|---|---|---|---|---|---|---|
| Macro images only, n = 542 | 4.24% | 1.66% | P = .001; 95% CI 1.06, 4.10 | 2.21% | 0.37% | P = .0016; 95% CI 0.71, 2.98 |
| Macro + PLD images, n = 541 | 3.70% | 1.66% | P = .0076; 95% CI 0.55, 3.52 | 1.29% | 0.37% | P = .0253; 95% CI 0.11, 1.73 |
| Macro + CID images, n = 494 | 4.24% | 1.42% | P = .0005; 95% CI 1.27, 4.40 | 2.02% | 0.40% | P = .0047; 95% CI 0.51, 2.73 |

CI, Confidence interval; CID, contact immersion dermatoscopy; PLD, polarized light dermatoscopy. *McNemar test.
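The severity rates in Table VI appear to follow directly from the counts in Table V. The sketch below recomputes them for the macro and macro plus PLD packages. The counts are one reading of the flattened Table V (for example, 12 major teledermatology disagreements with macro images, made up of 7 melanomas and 5 squamous cell carcinomas), so they should be treated as an assumption. The function name is hypothetical.

```python
def severity_rates(moderate, major, n):
    """Return (moderate/major %, major %) rounded to 2 decimals."""
    return (round(100 * (moderate + major) / n, 2),
            round(100 * major / n, 2))


# (moderate count, major count) as read from Table V; an assumed reading.
table_v = {
    "macro": {"n": 542, "teledermatology": (11, 12), "clinic": (7, 2)},
    "macro + PLD": {"n": 541, "teledermatology": (13, 7), "clinic": (7, 2)},
}

for package, data in table_v.items():
    for arm in ("teledermatology", "clinic"):
        moderate, major = data[arm]
        print(package, arm, severity_rates(moderate, major, data["n"]))
```

With macro images, 23 of 542 teledermatology plans (4.24%) fall in the moderate/major group against 9 of 542 clinic plans (1.66%), matching the first row of Table VI.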
DISCUSSION Main study findings There are several important findings of this study. First, for the primary study outcome, equivalence of aggregated diagnostic accuracy, teledermatology was not equivalent to clinic dermatology. Regardless of image type (macro alone, macro plus PLD, or macro plus CID images), aggregated diagnostic accuracy of clinic dermatology for pigmented lesions was also superior to that of teledermatology. Second, despite the superiority of clinic dermatology for aggregated diagnostic accuracy, overall, clinic dermatology was actually inferior (and equivalent for macro and macro plus PLD images) to teledermatology for management appropriateness. Third, despite the overall equivalency and/or superiority of teledermatology for appropriate management, the severity of inappropriate management was significantly worse using teledermatology than clinic dermatology (all image types). Of particular concern is that up to one fifth of index melanomas (7/36, 19.4%) would have been mismanaged via teledermatology as compared with only 2.8% (1/36) in clinic. Fourth, although teledermatology diagnostic accuracy and management plan accuracy rates improved with the
addition of dermatoscopic images (PLD or CID), only primary diagnostic accuracy (macro plus CID vs macro alone) was significantly better. Finally, although results for benign lesions were similar to the overall results, the results for malignant lesions were different, although some of this is likely a result of small sample size for malignant lesions (n = 124). Dermatoscopic images In all cases (overall group as well as benign and malignant subgroups) the addition of PLD or CID images improved the aggregated and primary diagnostic accuracy rates for teledermatology. However, the only statistically significant improvements were for primary diagnostic accuracy (macro alone vs macro plus CID, or macro plus PLD vs macro plus CID) for all pigmented lesions and for benign, but not malignant lesions. The rate of appropriate management for teledermatology did not significantly change with the addition of PLD or CID images as compared with macro images alone. We were surprised by this finding as we expected that the addition of dermatoscopic images would enhance diagnostic accuracy, as this is the case in the clinical setting.17
Table VII. Details of mismanaged index melanomas

| Image package | Item | Teledermatology | Clinic dermatology |
|---|---|---|---|
| Macro images only | Mismanaged melanomas (No.) | 7 | 1 |
| | Chosen diagnoses | Hyperpigmentation, lentigo, benign keratosis, benign nevus, melanoma* | Hyperpigmentation, lentigo, benign nevus |
| | Confidence level | High 3, Moderate 3, Low 1 | High 0, Moderate 0, Low 1 |
| | Photograph quality | High 2, Moderate 4, Low 1 | N/A |
| Macro + PLD images | Mismanaged melanomas (No.) | 3 | N/A |
| | Chosen diagnoses | Lentigo, benign nevus, benign keratosis | |
| | Confidence level | High 0, Moderate 3, Low 0 | |
| | Photograph quality | High 0, Moderate 3, Low 0 | |
| Macro + CID images | Mismanaged melanomas (No.) | 6 | N/A |
| | Chosen diagnoses | Benign keratoses, lentigo, actinic keratosis, benign nevus, dysplastic nevus, | |
| | Confidence level | High 2, Moderate 4, Low 0 | |
| | Photograph quality | High 2, Moderate 4, Low 0 | |

CID, Contact immersion dermatoscopy; N/A, not applicable; PLD, polarized light dermatoscopy. *Melanoma listed in differential diagnosis but observe/reassure was chosen as management plan.
Mismanaged melanomas
The majority of mismanaged melanomas in each image category were performed by the same teledermatologist. This is likely a result of chance, as this dermatologist performed the majority of assessments on melanomas (51.2% vs 26.8%, 22.0%) and did not differ in rates of management plan appropriateness for either pigmented and nonpigmented lesions (index and nonindex lesions, n > 1300 for each teledermatologist) or just pigmented lesions (n > 174 for each teledermatologist, data not shown). Most of the mismanaged melanomas were rated with moderate confidence and moderate photograph quality by the teledermatologists (Table VII). This finding is similar to that of Piccolo et al,20 who also found that image quality did not correlate with diagnostic accuracy for images of pigmented lesions. Further subgroup analysis of lesions rated as either high quality (macro n = 67; PLD n = 137; CID n = 165) or low quality (macro n = 111; PLD n = 69; CID n = 45) for all 3 image types revealed the same results as the overall group: aggregated diagnostic accuracy was significantly better for clinic dermatology as compared with teledermatology (data not shown).

Comparison of current study with other studies
Comparison of our results with other studies is limited to studies that evaluated teledermatology accuracy for pigmented skin neoplasms using a gold standard such as histopathology. Piccolo et al19 compared clinic dermatologists with teledermatologists viewing digitized dermatoscopic images (from a stereomicroscope, viewed at ×16 magnification) of 66 pigmented lesions. Diagnostic accuracy of clinic dermatologists was 92% as compared with 86% for teledermatologists and neither group misdiagnosed the one melanoma in this series. In a separate study, Piccolo et al18 provided digital clinical and CID images of 43 pigmented lesions (including 11 melanomas) to several dermatologists with varying expertise in dermatoscopy. Of the 4 teledermatologists
Fig 2. Superficial spreading melanoma 0.35 mm. Clinic: misdiagnosed but not mismanaged. A and B, Macro: misdiagnosed and mismanaged; 1°: dysplastic nevus, differential diagnosis benign nevus; high confidence, high quality. C, Polarized light dermatoscopy: misdiagnosed but not mismanaged; 1°: dysplastic nevus, differential diagnosis benign nevus; high confidence, moderate quality. D, Contact immersion dermatoscopy: misdiagnosed and mismanaged; 1°: dysplastic nevus, differential diagnosis benign nevus; moderate confidence, high quality.
Fig 3. Lentigo maligna melanoma 0.23 mm. Clinic: misdiagnosed and mismanaged; 1°: postinflammatory hyperpigmentation, differential diagnosis benign keratosis, lentigo; low confidence. A and B, Macro: misdiagnosed and mismanaged; 1°: benign keratosis, differential diagnosis lentigo; high confidence, moderate photograph quality. C, Polarized light dermatoscopy: misdiagnosed and mismanaged; 1°: dysplastic nevus, differential diagnosis benign keratosis; moderate confidence, high photograph quality. D, Contact immersion dermatoscopy: misdiagnosed and mismanaged; 1°: benign keratosis, differential diagnosis benign nevus; high confidence, moderate photograph quality.
with the highest level of expertise, diagnostic accuracy was 81%, 88%, 91%, and 95% as compared with the clinic dermatologist’s diagnostic accuracy of 91%. The clinic dermatologist misdiagnosed 3 melanomas whereas the experienced teledermatologists each misdiagnosed 3 to 5 melanomas. Blum et al21 studied dermatoscopic images of 157 pigmented lesions, including 32 melanomas. Three teledermatologists yielded diagnostic accuracy rates of 90%, 89%, and 87%. Diagnostic accuracy for the subgroup of 32 melanomas was 84%, 88%, and 84%, respectively. Braun et al22 evaluated 55 pigmented lesions with dermatoscopic images and found a clinical diagnostic accuracy rate of 64% compared with 75% for teledermatoscopy. For the 9 melanomas, diagnostic accuracy was 78% and 100%, respectively. In another study by Piccolo et al,20 77 melanocytic acral lesions (including 6 melanomas) were evaluated with dermatoscopic images by 11 different dermatologists. Sensitivity ranged from 0.83 to 1.00 and specificity ranged from 0.92 to 1.00 for the 4 melanomas.
Previous studies have not evaluated appropriateness of management plans for pigmented lesions, although Piccolo et al20 commented that none of the 6 melanomas in their study of 77 melanocytic acral lesions were mismanaged.
Using macro images alone, our primary diagnostic accuracy rates of 59% (clinic dermatology) and 50% to 57% (teledermatology) were higher than those of Jolliffe et al15: 43% and 47%, respectively, for 144 pigmented lesions (including 4 melanomas). For teledermatoscopy, the diagnostic accuracy rates of 50% to 67% from our study were lower than those from other studies: 75%,22 86%,19 and 81% to 95%.18 It is unlikely that these differences are a result of lesion difficulty or dermatoscopy expertise, as all of our study teledermatologists had clinical expertise in dermatoscopy, but it is possible that these differences could be a result of study design.
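The accuracy, sensitivity, and specificity figures compared above are simple ratios of counts. The following Python sketch shows how such rates are derived, under a simplifying assumption that is not part of the study design: each lesion is reduced to a binary melanoma versus non-melanoma call, whereas the studies cited here matched a chosen diagnosis against histopathology. The `LesionCounts` class and the example counts are hypothetical and are not taken from any cited study.

```python
from dataclasses import dataclass


@dataclass
class LesionCounts:
    """Hypothetical 2x2 counts against histopathology (illustrative only)."""
    true_pos: int   # melanomas called melanoma
    false_neg: int  # melanomas called benign
    true_neg: int   # benign lesions called benign
    false_pos: int  # benign lesions called melanoma


def sensitivity(c: LesionCounts) -> float:
    # Fraction of histologically confirmed melanomas that were detected.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: LesionCounts) -> float:
    # Fraction of histologically benign lesions that were called benign.
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: LesionCounts) -> float:
    # Fraction of all lesions whose call matched histopathology.
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Assumed example: 10 melanomas and 90 benign lesions (not study data).
example = LesionCounts(true_pos=8, false_neg=2, true_neg=80, false_pos=10)
print(sensitivity(example), specificity(example), accuracy(example))
```

With few melanomas in a series, a single missed lesion moves sensitivity by many percentage points, which is one reason the rates reported across small studies vary so widely.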
Our study attempted to assess effectiveness (accuracy in a real-world setting) by including consecutive lesions that met inclusion criteria and were photographed by a research assistant, rather than efficacy (accuracy in a controlled setting, eg, select photographs taken with special equipment by dermatologists). It is unclear from some reported studies whether only the best cases or best photographs were chosen for inclusion. Piccolo et al18 excluded images rated as poor; we
Fig 4. Lentigo maligna melanoma 0.2 mm. Clinic: not misdiagnosed or mismanaged. A and B, Macro: misdiagnosed and mismanaged; 1°: benign keratosis, differential diagnosis lentigo; moderate confidence, moderate photograph quality. C, Polarized light dermatoscopy: misdiagnosed and mismanaged; 1°: benign keratosis, differential diagnosis lentigo; moderate confidence, high photograph quality. D, Contact immersion dermatoscopy: misdiagnosed and mismanaged; 1°: benign keratosis, differential diagnosis dysplastic nevus; moderate confidence, moderate photograph quality.
included all images that were in focus, resulting in 45 to 111 lesions rated as low quality. Subgroup analysis of the low-quality images in our study, however, yielded similar results to the overall group (data not shown). The study by Braun et al22 most closely simulated our study conditions; in that study, 6 dermatologists in private practice photographed lesions scheduled for excision with a commercially available digital camera. The majority of the photographs in that study were rated as good (++, 65%) on a scale of + to +++. Several studies included special equipment such as a stereomicroscope,19 computer systems with video camera and digital dermatoscopic microscope system,21 MoleMax II system,20 single-chip video camera,15 or a high-quality teledermatology workstation built by the author and not commercially available.18 Our study used a standard digital camera with a relatively simple polarized light attachment and scanned 35-mm dermatoscopic images.
Limitations
There are several limitations to this study. First, the study population was primarily elderly, male, and Caucasian. Although this population lacks diversity, it does represent the population with the greatest increased incidence and mortality caused by melanoma.14 Second, although the teledermatologists were blinded to the exact purpose of the study, they were aware that it was a study and therefore may not have been as careful in their diagnostic choices or as conservative in their management as they would have been had they known it would directly affect patient outcomes. Third, this study was limited to skin neoplasms, the management of which is predominantly either observe/reassure or remove/biopsy/destroy. The equivalency of management plans was likely enhanced by relatively few and broad management plan choices. On the other hand, this is the most important and relevant management distinction for skin neoplasms.
Fourth, as this study did not evaluate a functioning site-to-site teledermatology program, it does not provide information regarding the important outcomes of patient satisfaction, provider satisfaction, and cost-effectiveness of teledermatology. More studies in these areas are needed to address these health services outcomes. Finally, because our comparison group was standard, in-person clinic dermatology, we cannot comment on how the accuracy of teledermatology compares with in-person care by primary care providers; this comparison would be relevant in settings with extremely limited or no access to dermatology (eg, cruise ships, battle situations, remote areas [eg, Alaska]).
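The equivalence criterion referred to above (95% confidence interval for the difference in rates lying within ±10%) can be illustrated with a short Python sketch. It uses a standard Wald interval for the difference of paired proportions, which is an assumption made for illustration; the interval method actually used in the analysis is not described in this section. The function names and the discordant-pair counts are hypothetical.

```python
import math


def paired_diff_ci(n, clinic_only, tele_only, z=1.96):
    """Wald CI for (teledermatology rate - clinic rate) on paired lesions.

    n           : number of lesions rated by both modalities
    clinic_only : lesions correct in clinic but not via teledermatology
    tele_only   : lesions correct via teledermatology but not in clinic
    """
    diff = (tele_only - clinic_only) / n
    discordant = tele_only + clinic_only
    variance = (discordant - (tele_only - clinic_only) ** 2 / n) / n ** 2
    half_width = z * math.sqrt(variance)
    return diff - half_width, diff + half_width


def is_equivalent(lower, upper, margin=0.10):
    # Equivalent only if the whole interval sits inside (-margin, +margin).
    return -margin < lower and upper < margin


# Assumed counts for illustration only (not study data).
lower, upper = paired_diff_ci(n=500, clinic_only=60, tele_only=40)
print(lower, upper, is_equivalent(lower, upper))
```

Because equivalence requires the entire interval to fall inside the margin, a small subgroup with a wide interval can fail the criterion even when the observed rates are close.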
CONCLUSIONS
In general, standard, in-person clinic dermatology was superior to teledermatology for diagnostic accuracy. The addition of dermatoscopic images did not significantly improve teledermatology diagnostic accuracy rates. Although the rates of appropriate management plans were generally equivalent, further subgroup analysis showed that teledermatology was superior to clinic dermatology for benign lesions but inferior for malignant lesions. A particularly concerning finding was that up to one fifth of melanomas would have been mismanaged via teledermatology. More studies with larger numbers of melanomas are needed to further assess outcomes in this important subgroup, and practicing teledermatologists should be aware of this potentially serious limitation of teledermatology.
The authors would like to thank the dermatology nurses, residents, and study participants at the Minneapolis Veterans Affairs Medical Center. Herbert D. Stockley, MSW, is acknowledged for his contributions in project management and web site design and Kaye B. Williams for her help with project management.
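The "up to one fifth" figure in the Conclusions corresponds to the 7 of 36 index melanomas reported in the abstract. As a rough illustration of how much uncertainty a sample of 36 carries, the Python sketch below computes that proportion together with a Wilson score interval. The choice of the Wilson interval is an assumption made here for illustration and is not a method reported by the authors.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (approx. 95% for z=1.96)."""
    p = events / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


mismanaged, melanomas = 7, 36  # counts reported in the abstract
rate = mismanaged / melanomas
lower, upper = wilson_interval(mismanaged, melanomas)
print(f"rate={rate:.3f}, interval=({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 10% to 35%, which is consistent with the call for studies that include larger numbers of melanomas.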
REFERENCES
1. Hassol A, Gaumer G, Irvine C, Grigsby J, Mintzer C, Puskin D. Rural telemedicine data/image transfer methods and purposes of interactive sessions. J Am Med Inform Assoc 1997;4:36-7.
2. US Department of Commerce. Telemedicine report to the Congress, January 31, 1997.
3. Loane MA, Corbett R, Bloomer SE, Eedy DJ, Gore HE, Mathews C, et al. Diagnostic accuracy and clinical management by realtime teledermatology: results from the Northern Ireland arms of the UK multicenter teledermatology trial. J Telemed Telecare 1998;4:95-100.
4. Oakley AM, Astwood DR, Loane M, Duffil MB, Rademaker M, Wootton R. Diagnostic accuracy of teledermatology: results of a preliminary study in New Zealand. N Z Med J 1997;110:51-3.
5. Phillips CM, Burke WA, Shecter A, Stone D, Balch D, Gustke S. Reliability of dermatology teleconsultations with the use of teleconferencing technology. J Am Acad Dermatol 1997;37:398-402.
6. Lesher JL, Davis LS, Gourdin FW, English D, Thompson WO. Telemedicine evaluation of cutaneous diseases: a blinded comparative study. J Am Acad Dermatol 1998;38:27-31.
7. Lowitt MH, Kessler II, Kauffman CL, Hooper FJ, Siegel E, Burnett JW. Teledermatology and in-person examinations: a comparison of patient and physician perceptions and diagnostic agreement. Arch Dermatol 1998;134:471-6.
8. High WA, Houston MS, Calobrisi SD, Drage LA, McEvoy MT. Assessment of the accuracy of low-cost store-and-forward teledermatology consultation. J Am Acad Dermatol 2000;42:776-83.
9. Krupinski EA, LeSueur B, Ellsworth L, Levine N, Hansen R, Silvis N, et al. Diagnostic accuracy and image quality using a digital camera for teledermatology. Telemed J 1999;5:257-63.
10. Zelickson BD, Homan L. Teledermatology in the nursing home. Arch Dermatol 1997;133:171-4.
11. Whited JD, Mills BJ, Hall RP, Drugge RJ, Grichnik JM, Simel DL. A pilot trial of digital imaging in skin cancer. J Telemed Telecare 1998;4:108-12.
12. Whited JD, Hall RP, Simel DL, Foy ME, Stechuchak KM, Drugge RJ, et al. Reliability and accuracy of dermatologists’ clinic-based and digital image consultations. J Am Acad Dermatol 1999;41:693-702.
13. Barnard CM, Goldyne ME. Evaluation of an asynchronous teleconsultation system for diagnosis of skin cancer and other skin diseases. Telemed J E Health 2000;6:379-84.
14. Linos E, Swetter SM, Cockburn MG, Colditz GA, Clarke CA. Increasing burden of melanoma in the United States. J Invest Dermatol 2009;129:1604-6.
15. Jolliffe VM, Harris DW, Whittaker SJ. Can we safely diagnose pigmented lesions from stored video images? A diagnostic comparison between clinical examination and stored video images of pigmented lesions removed for histology. Clin Exp Dermatol 2001;26:84-7.
16. Anderson RR. Polarized light examination and photography of the skin. Arch Dermatol 1991;127:1000-5.
17. Kittler H, Pehamberger H, Wolff K, Binder M. Diagnostic accuracy of dermoscopy. Lancet Oncol 2002;3:159-63.
18. Piccolo D, Smolle J, Argenziano G, Wolf IH, Braun R, Cerroni L, et al. Teledermoscopy--results of a multicenter study on 43 pigmented skin lesions. J Telemed Telecare 2000;6:132-7.
19. Piccolo D, Smolle J, Wolf IH, Peris K, Hofmann-Wellenhof R, Dell’Eva G, et al. Face-to-face diagnosis vs telediagnosis of pigmented skin tumors: a teledermoscopic study. Arch Dermatol 1999;135:1467-71.
20. Piccolo D, Soyer HP, Chimenti S, Argenziano G, Bartenjev I, Hofmann-Wellenhof R, et al. Diagnosis and categorization of acral melanocytic lesions using teledermoscopy. J Telemed Telecare 2004;10:346-50.
21. Blum A, Hofmann-Wellenhof R, Luedtke H, Ellwanger U, Steins A, Roehm S, et al. Value of the clinical history for different users of dermoscopy compared with results of digital image analysis. J Eur Acad Dermatol Venereol 2004;18:665-9.
22. Braun RP, Meier M, Pelloni F, Ramelet AA, Schilling M, Tapernoux B, et al. Teledermatoscopy in Switzerland: a preliminary evaluation. J Am Acad Dermatol 2000;42:770-5.
23. Browns IR, Collins K, Walters SJ, McDonagh AJG. Telemedicine in dermatology: a randomized controlled trial. Health Technol Assess 2006;10:1-58.
24. Warshaw EM, Lederle FA, Grill JP, Gravely AA, Bangerter AK, Fortier LA, et al. Accuracy of teledermatology for non-pigmented neoplasms. J Am Acad Dermatol 2009;60:579-80.
25. Piccolo D, Ferrari A, Peris K, Diadone R, Ruggeri B, Chimenti S. Dermoscopic diagnosis by a trained clinician vs a clinician with minimal dermoscopy training vs computer-aided diagnosis of 341 pigmented skin lesions: a comparative study. Br J Dermatol 2002;147:481-6.
26. Food and Drug Administration, Division of Anti-infective Drug Products. Points to consider, 1992. CPMP EWP. Evaluation of new anti-bacterial medicinal products. Final, April 1997.
27. Agresti A. Categorical data analysis. New York: Wiley; 1990.