Reliability of store and forward teledermatology for skin neoplasms


Erin M. Warshaw, MD, MS (a,b); Amy A. Gravely, MA (a); and David B. Nelson, PhD (a,c)
Minneapolis, Minnesota

Background: Teledermatology may be less optimal for skin neoplasms than for rashes.

Objectives: We sought to determine agreement for skin neoplasms.

Methods: This was a repeated measures study. Each lesion was examined by a clinic dermatologist and a teledermatologist; both generated a primary diagnosis, up to 2 differential diagnoses, and management. Macro images and polarized light dermoscopy images were obtained; for pigmented lesions only, a contact immersion dermoscopy image was also obtained.

Results: There were 3021 lesions in 2152 patients. Of 1685 biopsied lesions, there were 410 basal cell carcinomas (24%), 240 squamous cell carcinomas (14%), and 41 melanomas (2.4%). Agreement was fair to substantial for primary diagnosis (45.7%-80.1%; kappa 0.32-0.62), substantial to almost perfect for aggregated diagnoses (primary plus differential; 78.6%-93.9%; kappa 0.77-0.90), and fair for management (66.7%-86.1%; kappa 0.28-0.41). Diagnostic agreement rates were higher for pigmented lesions (52.8%-93.9%; kappa 0.44-0.90) than nonpigmented lesions (47.7%-87.3%; kappa 0.32-0.86), whereas the reverse was found for management agreement (pigmented: 66.7%-79.8%, kappa 0.19-0.35 vs nonpigmented: 72.0%-86.1%, kappa 0.38-0.41). Agreement rates using macro images were similar to polarized light dermoscopy; contact immersion dermoscopy, however, significantly improved rates for pigmented lesions.

Limitations: We studied a homogeneous population.

Conclusions: Diagnostic agreement was moderate to almost perfect whereas management agreement was fair. Polarized light dermoscopy increased rates modestly whereas contact immersion dermoscopy significantly increased rates for pigmented lesions. (J Am Acad Dermatol 2015;72:426-35.)

Key words: dermoscopy; diagnosis; management; reliability; skin cancer; teledermatology.

Although several small studies have reported reliability (agreement) of teledermatology with in-person examinations, few have focused specifically on skin neoplasms. A recent systematic review on teledermatology [1,2] found that primary diagnostic agreement has only been assessed in 5 lesion studies [3-7] and 1 dermoscopy pigmented lesion study [8]; the weighted average for these 6 studies (n = 708 lesions) was 62.3% [1]. The weighted average for aggregated diagnostic agreement for 4 studies [4-7] (n = 358 lesions) was similar, 64.4% [1]. Because almost half (47%) of all veteran dermatology visits [9] are related to skin neoplasms, evaluation of this subset of dermatologic conditions is critical in this patient population.

Previous work by our group reported accuracy (histopathology as gold standard) of store and forward teledermatology as compared with conventional in-person dermatologic examinations for skin neoplasms and found that the diagnostic accuracy of teledermatology was inferior to standard, in-person examinations whereas management accuracy varied by lesion type [10-12]. Interobserver accuracy was also reported [13,14].

The purpose of this analysis was to compare conventional, in-person dermatology with store and forward teledermatology for skin neoplasms, using the outcomes of agreement for primary diagnosis, aggregated diagnoses, and management.

CAPSULE SUMMARY
- Store-forward teledermatology is being implemented but studies focusing on skin neoplasms are lacking.
- In our series, diagnostic agreement was moderate to almost perfect whereas management agreement was fair.
- Store-forward teledermatology should be used cautiously in evaluating skin neoplasms; contact immersion teledermoscopy should be used whenever possible for pigmented lesions.

From the Minneapolis Veterans Affairs Medical Center, Center for Chronic Disease Outcomes Research (a); and Departments of Dermatology (b) and Medicine (c), University of Minnesota School of Medicine.
This research was supported by the Department of Veterans Affairs (VA), Veterans Health Administration, Health Services Research and Development Service IIR 01-072-2. During this study, Dr Warshaw was supported by a VA Cooperative Studies Clinical Research Career Development Award.
Conflicts of interest: None declared.
The findings and conclusions presented in this report are those of the authors and do not necessarily represent the views of the Department of Veterans Affairs or Health Services Research and Development Service.
Accepted for publication November 3, 2014.
Reprints not available from the authors.
Correspondence to: Erin M. Warshaw, MD, MS, Minneapolis Veterans Affairs Medical Center, Department 111 K, 1 Veterans Dr, Minneapolis, MN 55417. E-mail: erin.warshaw@va.gov.
Published online January 16, 2015.
0190-9622
Published by Elsevier on behalf of the American Academy of Dermatology, Inc.
http://dx.doi.org/10.1016/j.jaad.2014.11.001

J AM ACAD DERMATOL, VOLUME 72, NUMBER 3, MARCH 2015

METHODS
The Minneapolis Veterans Affairs Medical Center Human Studies Subcommittee in Minnesota provided institutional review board approval for this study. The design of the cross-sectional, repeated measures equivalence study has been reported previously [10-12].

Participants and inclusion criteria
Neoplasms were defined as circumscribed lesions. Patients referred to the Minneapolis Veterans Affairs Medical Center Dermatology Clinic for evaluation of a skin neoplasm or patients already enrolled in the dermatology clinic who were undergoing a biopsy of a skin neoplasm were eligible for inclusion. After informed consent was obtained, digital photographs of the lesion(s) and a standardized history were obtained by research staff. A staff dermatologist then completed a clinical assessment that consisted of: (1) a choice of 17 common diagnoses for 1 primary diagnosis and up to 2 differential diagnoses; (2) a choice of 4 basic management plans; (3) pigmentation (yes or no); and (4) level of diagnostic confidence (low, moderate, or high). Clinicians were also allowed to choose “other” for diagnoses and management and hand-write choices. Additional history could be obtained by the clinical dermatologist in the usual manner of a clinic encounter and the clinical examination could include all options normally available (eg, palpation, diascopy, dermoscopy).

Fig 1. Flow diagram of study participants.

Photographs
Research assistants obtained images with up to 3 different cameras. All patients had at least 2 macro images (distance and close-up; digital Nikon Coolpix 4500 with a Nikon SL-1 ring flash [Nikon, Melville, NY]) and 1 polarized light image (polarized light dermoscopy [PLD]) (digital Nikon Coolpix 4500 with a 3Gen Dermlite lens attachment [3Gen, San Juan Capistrano, CA]). For lesions greater than 2 mm in height, an additional macro angle shot was obtained. All of the images were obtained in a standardized fashion (distance 640 × 470 pixels; all others at the highest resolution 1600 × 1200 pixels). For pigmented lesions only, a contact immersion dermoscopy (CID) image (35-mm Minolta X 370 [Minolta, Tokyo, Japan] with a Heine dermphot lens attachment [Heine, Herrsching, Germany]) was also obtained. Previous publications have focused on index lesions [10,11]. The study reported herein includes both index and secondary lesions.

Image packages
For each patient, a sequence of 2 or 3 images, or “package,” was sent to a teledermatologist according to a computer-generated randomization schedule separated by at least 3 weeks, to avoid recall bias. For patients with only nonpigmented lesions, the teledermatologist received a macro package (distance, close-up, and/or angle [if height >2 mm] image of all lesions) and PLD package (macro and PLD images of all lesions). For patients with pigmented lesions, the teledermatologist received a macro package, a PLD package, and a CID package (macro images of all lesions plus CID images of pigmented lesions).

Teledermatology encounter
For a given patient, 1 of 3 board-certified dermatologists (none of whom served as general clinic dermatologists but all of whom had clinical expertise in dermoscopy and pigmented lesions) [14] was randomly selected to review the electronically transmitted packages of clinical digital photographs and standardized history. The order in which the different packages were sent to the selected dermatologist was randomly selected. Using the same diagnostic and management categories as the clinic dermatologist, the teledermatologist recorded 1 primary diagnosis, up to 2 differential diagnoses, and management. Confidence and image quality were also rated.

Panel
A separate panel of 3 board-certified dermatologists, none of whom served as a clinic or teledermatologist in the study and who were unaware of the study design and purpose, grouped write-in diagnoses into categories for the purpose of data analysis.

Outcome measures
“Primary diagnostic agreement” was defined as exact matching of primary diagnoses only. “Aggregated diagnostic agreement” was defined as matching of any of the clinic dermatology diagnoses (primary or differential) with any of the teledermatology diagnoses (primary or differential).
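The two diagnostic agreement definitions just given can be sketched in code. This is only an illustration: the `Assessment` structure, the function names, and the example diagnoses are hypothetical and are not taken from the study's data forms or analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """One reader's assessment of a lesion (hypothetical structure)."""
    primary: str
    differentials: tuple = ()  # up to 2 differential diagnoses

    def all_diagnoses(self):
        # Primary plus differential diagnoses as a set
        return {self.primary, *self.differentials}


def primary_agreement(clinic, tele):
    # Exact match of the two primary diagnoses only
    return clinic.primary == tele.primary


def aggregated_agreement(clinic, tele):
    # Any clinic diagnosis (primary or differential) matches
    # any teledermatology diagnosis (primary or differential)
    return bool(clinic.all_diagnoses() & tele.all_diagnoses())


# Illustrative lesion: the readers disagree on the primary diagnosis
# but share basal cell carcinoma and squamous cell carcinoma overall.
clinic = Assessment("basal cell carcinoma", ("squamous cell carcinoma",))
tele = Assessment("squamous cell carcinoma",
                  ("actinic keratosis", "basal cell carcinoma"))

primary_match = primary_agreement(clinic, tele)        # False
aggregated_match = aggregated_agreement(clinic, tele)  # True
```

Because aggregated agreement only requires one shared diagnosis across up to 3 per reader, it can never be lower than primary agreement for the same lesion.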

“Management agreement” was defined as matching of the teledermatology and clinic dermatology management.

Statistical analysis
We used standard large sample methods to estimate rates of agreement and kappa statistics (with corresponding 95% confidence intervals) for lesions in the 4 combinations of pigmentation and biopsy for each of the 3 photographic modes considered. Note that we calculated agreement for all lesions for primary, aggregated, and management agreement, and by confidence level and photograph quality. For simplicity and brevity we present these latter results for pigmented and nonpigmented lesions given the generally similar patterns in differences in rates across categories of confidence and image quality for biopsied and nonbiopsied lesions. Comparisons of agreement rates between the categories of pigmentation and biopsy status were made using large sample, normal approximation-based tests (z-tests) ignoring any positive correlation between observations made by the same clinic dermatologist, teledermatologist, or


for the same patient. For comparisons of agreement rates for different image types, we used large sample, normal approximation-based tests that incorporated a correlation between observations on the same lesion.

Data analysis
The study targeted at least 520 individuals with lesions in each category (biopsied pigmented, biopsied nonpigmented, nonbiopsied pigmented, and nonbiopsied nonpigmented). This sample size allowed estimation of agreement rates with SE less than 2.5% and 90% power for comparing rates, at a .05 significance level and an underlying 10% difference. Data analysis was performed on a statistical software package (SAS for Windows, SAS Institute Inc, Cary, NC).
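The sample size statement can be checked with a short calculation. The sketch below assumes a binomial standard error at the worst case (agreement rate of 50%) and a two-sided normal-approximation test of two independent proportions centred on 50% (45% vs 55%); these inputs are assumptions made for illustration, since the text does not state the exact formula or rates the authors used.

```python
import math
from statistics import NormalDist


def worst_case_se(n):
    # Binomial SE of a proportion is largest at p = 0.5
    return math.sqrt(0.25 / n)


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    # Normal-approximation power for a two-sided test of two
    # independent proportions with equal group sizes
    se = math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p2) / se - z_alpha)


n = 520
se = worst_case_se(n)                         # about 0.022, below 0.025
power = power_two_proportions(0.45, 0.55, n)  # about 0.90
```

Under these assumptions, 520 lesions per category gives a standard error of about 2.2% and power close to 90% for a 10-point difference, which is consistent with the figures quoted above.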

RESULTS

Table I. Demographic characteristics of enrolled patients

| Characteristic | Unique enrolled patients, n = 2152 |
|---|---|
| Male gender, n (%) | 2082 (96.8) |
| Mean age, y (range) | 68 (19-94) |
| Fitzpatrick skin type, n (%)* | |
| I | 209 (9.7) |
| II | 609 (28.3) |
| III | 1025 (47.6) |
| IV-VI | 308 (14.3) |
| Ethnicity, n (%)† | |
| Caucasian | 2098 (97.5) |
| African American | 32 (1.5) |
| American Indian | 8 (0.4) |
| Hispanic/Latino | 8 (0.4) |
| Asian | 2 (0.1) |
| Other | 2 (0.1) |
| Personal history, n (%)‡‖ | |
| Nonmelanoma skin cancer | 567 (24.9) |
| Melanoma | 77 (3.4) |
| Eczema | 91 (4.0) |
| Psoriasis | 75 (3.3) |
| Other skin problem | 150 (6.6) |
| None | 1321 (57.9) |
| Family history, n (%)§‖ | |
| Nonmelanoma skin cancer | 295 (13.2) |
| Melanoma | 119 (5.3) |
| Eczema | 81 (3.6) |
| Psoriasis | 70 (3.1) |
| Other skin problem | 39 (1.7) |
| None | 1637 (73.1) |

Demographics
Participant characteristics. Participant characteristics have been previously reported [10,11]. Briefly, of 2905 volunteers who were invited to participate, 2152 (74%) were eligible and agreed to participate (Fig 1). The majority of participants were male and Caucasian; most reported no personal (58%) or family (73%) history of skin problems (Table I). Of the 259 patients who were approached more than once because they had more than 1 clinic visit and/or consult during the study period, 129 were enrolled twice, 6 were enrolled 3 times, and 1 patient was enrolled 4 times. Of all visits, the majority (1735, 81%) involved only 1 lesion, although 423 (18%) included 2 lesions and 138 (6%) involved 3 or more lesions.

Lesion characteristics. Characteristics of lesions are listed in Table II. Overall, almost half (45%) were located on the face or ears. About a quarter of all lesions had been present for 3 to 12 months (24%). Approximately two fifths (43%) of lesions were asymptomatic whereas one quarter had changed in size (25%) or were pruritic (24%). Histopathologic diagnostic categories for biopsied lesions are summarized in Table III. Approximately one fourth of biopsied lesions were basal cell carcinomas (24%), followed by squamous cell carcinomas (14%) and benign keratoses (13%). There were 41 melanomas.

Primary diagnostic agreement
For standard macro images, diagnostic agreement for primary diagnosis between teledermatologists and clinic dermatologists ranged across the lesion categories from 45.7% to 75.7% (Table IV). Agreement rates were significantly higher for nonbiopsied pigmented lesions than for the other 3 categories (P < .0001) and significantly lower for biopsied nonpigmented lesions (P < .025). Rates of agreement with the addition of the PLD images were comparable with those with just the macro images, except for biopsied nonpigmented lesions where the rate was significantly higher (P = .0046). The addition of CID images significantly increased the rates of agreement for pigmented lesions (all P values < .022). Kappa statistics were fair to moderate.
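The between-category comparison of primary diagnostic agreement can be illustrated with a two-proportion z-test. The sketch below uses the macro-image rates and lesion counts from Table IV for nonbiopsied pigmented (75.7%, N = 753) and biopsied pigmented (52.8%, N = 651) lesions. The pooled-variance form is an assumption; the Methods describe only "large sample, normal approximation-based tests (z-tests)", so the authors' exact computation may differ.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(p1, n1, p2, n2):
    # Pooled two-sample z-test for independent proportions (two-sided)
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Macro images, primary diagnosis (Table IV):
# nonbiopsied pigmented vs biopsied pigmented lesions
z, p_value = two_proportion_z_test(0.757, 753, 0.528, 651)
```

The resulting z statistic is close to 9, so the P value falls far below .0001, in line with the reported significance. Like the authors' test, this treats lesions as independent and ignores any correlation between lesions from the same patient or reader.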

Aggregated diagnostic agreement
For the standard macro images, aggregated diagnostic agreement ranged between 78.6% and 91.0% with significantly lower rates (P values < .001) of agreement for nonbiopsied nonpigmented lesions and significantly higher rates (P values < .001) of

*Not specified in 1 patient.
†Not specified in 2 patients.
‡Not specified in 4 patients.
§Not specified in 2 patients.
‖n = Number of patients who answered yes to each question; therefore the totals in this section exceed the number of patients.


Table II. Lesion characteristics
Values are n (% of answers in column). Biopsied, n = 1685; nonbiopsied, n = 1336.

| Characteristic | Biopsied pigmented, n = 651 | Biopsied nonpigmented, n = 1034 | Nonbiopsied pigmented, n = 753 | Nonbiopsied nonpigmented, n = 583 | Total, n = 3021 |
|---|---|---|---|---|---|
| Location | | | | | |
| Face/ears | 164 (25.2) | 600 (58.0) | 269 (35.7) | 315 (54.0) | 1348 (44.6) |
| Back | 210 (32.3) | 85 (8.2) | 184 (24.4) | 25 (4.3) | 504 (16.7) |
| Hand/arm | 66 (10.1) | 141 (13.6) | 76 (10.1) | 105 (18.0) | 388 (12.8) |
| Chest | 66 (10.1) | 85 (8.2) | 64 (8.5) | 17 (3.0) | 203 (6.7) |
| Scalp | 38 (5.8) | 57 (5.5) | 47 (6.2) | 60 (10.3) | 202 (6.7) |
| Neck | 37 (5.7) | 50 (4.8) | 28 (3.7) | 16 (2.7) | 131 (4.3) |
| Leg/foot | 42 (6.5) | 32 (3.1) | 46 (6.1) | 38 (6.5) | 158 (5.2) |
| Abdomen | 21 (3.2) | 5 (1.0) | 30 (4.0) | 1 (0) | 57 (1.9) |
| Buttocks/groin | 7 (1.1) | 8 (1.0) | 9 (1.2) | 0 (1.0) | 30 (1.0) |
| Lesion duration | | | | | |
| <3 mo | 20 (3.1) | 112 (10.8) | 20 (2.7) | 43 (7.4) | 195 (16.5) |
| 3-12 mo | 89 (13.7) | 327 (31.6) | 113 (15.0) | 197 (33.8) | 726 (24.0) |
| 1-2 y | 50 (7.7) | 209 (20.2) | 113 (15.0) | 112 (19.2) | 484 (16.0) |
| 2-5 y | 86 (13.2) | 166 (16.1) | 148 (19.7) | 107 (18.4) | 507 (11.8) |
| 5-10 y | 29 (4.5) | 63 (6.1) | 64 (8.5) | 38 (6.5) | 194 (6.4) |
| 10-20 y | 47 (7.2) | 50 (4.8) | 89 (11.8) | 28 (4.8) | 214 (7.1) |
| Since birth | 31 (4.8) | 7 (1.0) | 30 (4.0) | 7 (1.2) | 75 (2.5) |
| Other | 299 (45.9) | 100 (9.7) | 176 (23.4) | 51 (8.7) | 626 (20.7) |
| Symptoms* | | | | | |
| None | 409 (62.8) | 307 (29.7) | 388 (51.5) | 196 (33.6) | 1300 (43.0) |
| Size change | 113 (17.4) | 320 (31.0) | 192 (25.5) | 142 (24.4) | 767 (25.4) |
| Itching | 101 (15.5) | 258 (25.0) | 176 (23.4) | 179 (30.7) | 714 (23.6) |
| Bleeding | 57 (8.8) | 273 (26.4) | 43 (5.7) | 104 (17.8) | 477 (15.8) |
| Tenderness | 48 (7.4) | 282 (27.3) | 50 (6.6) | 151 (25.9) | 531 (17.6) |
| Other | 40 (6.1) | 52 (5.0) | 48 (6.4) | 41 (7.0) | 181 (6.0) |
| Burning | 13 (2.0) | 49 (4.7) | 2 (0) | 19 (3.3) | 83 (2.8) |

*n = Multiple symptoms possible; therefore percentages exceed 100.

Table III. Histopathologic diagnoses for biopsied lesions

| Histopathologic diagnosis | No. (%), N = 1685 |
|---|---|
| Basal cell carcinoma | 410 (24.3) |
| Squamous cell carcinoma | 240 (14.2) |
| Benign keratoses | 223 (13.2) |
| Dysplastic nevus | 154 (9.1) |
| Premalignant/nonmelanocytic lesion (actinic keratoses) | 145 (8.6) |
| Benign nevus | 138 (8.2) |
| Cyst | 73 (4.3) |
| Melanoma | 41 (2.4) |
| Benign appendageal tumor | 35 (2.1) |
| Lentigo | 29 (1.7) |
| Benign vascular or lymphatic tumor | 26 (1.5) |
| Benign fibrohistiocytic lesion | 24 (1.4) |
| Neurofibroma | 18 (1.1) |
| Chondrodermatitis nodularis helicis | 10 (0.6) |
| Wart | 9 (0.5) |
| Other diagnoses | 110 (6.5) |

agreement among nonbiopsied pigmented lesions. The addition of PLD images yielded similar agreement rates to those from macro images alone; the only significant difference was for biopsied pigmented lesions (P = .0152). Further, the addition of CID images significantly increased the agreement for pigmented lesions (both biopsied and nonbiopsied; P values < .034) compared with macro images alone. Kappa statistics were substantial to almost perfect.

Management agreement
For standard macro images, overall agreement between teledermatologists and clinic dermatologists in specified management ranged from 66.7% to 85.3%, with the agreement for biopsied nonpigmented lesions substantially higher (P values < .0001) than the rates of agreement in the other lesion categories. Rates of agreement with the addition of PLD images were comparable with those with just the macro


Table IV. Diagnostic and management plan agreement by lesion category
Image type cells give the number of lesions (N), then percent agreement (95% CI) followed by kappa statistic (95% CI). P values are pairwise comparisons between image types (Student t test).

| Lesion category and outcome | Macro | Macro + PLD | Macro + CID | P, macro vs macro + PLD | P, macro vs macro + CID | P, PLD vs macro + CID |
|---|---|---|---|---|---|---|
| Nonbiopsied pigmented lesions | N = 753 | N = 752 | N = 684 | | | |
| Primary diagnosis | 75.7 (72.6-78.8); kappa 0.56 (0.51-0.61) | 75.3 (72.2-78.4); kappa 0.56 (0.51-0.61) | 80.1 (77.1-83.1); kappa 0.62 (0.57-0.67) | .7886 | .0218* | .0041* |
| Aggregated diagnoses | 91.0 (88.9-93.0); kappa 0.86 (0.83-0.89) | 91.6 (89.6-93.6); kappa 0.87 (0.85-0.90) | 93.9 (86.9-91.9); kappa 0.90 (0.87-0.93) | .5588 | .0221* | .0268* |
| Management plan | 71.1 (67.8-74.3); kappa 0.21 (0.15-0.28) | 71.0 (67.8-74.3); kappa 0.19 (0.12-0.25) | 79.8 (76.8-82.8); kappa 0.26 (0.18-0.35) | 1.0000 | <.0001* | <.0001* |
| Biopsied pigmented lesions | N = 651 | N = 652 | N = 595 | | | |
| Primary diagnosis | 52.8 (49.0-56.7); kappa 0.44 (0.40-0.48) | 53.4 (49.6-57.2); kappa 0.45 (0.40-0.49) | 60.0 (56.1-63.9); kappa 0.52 (0.47-0.57) | .8745 | .0048* | .0045* |
| Aggregated diagnoses | 85.1 (82.4-87.8); kappa 0.84 (0.81-0.87) | 88.8 (86.4-91.2); kappa 0.88 (0.85-0.90) | 89.4 (86.9-91.9); kappa 0.88 (0.86-0.91) | .0152* | .0337* | .8991 |
| Management plan | 66.7 (63.1-70.3); kappa 0.28 (0.21-0.35) | 69.6 (66.1-73.2); kappa 0.33 (0.26-0.40) | 69.4 (65.7-73.1); kappa 0.35 (0.29-0.42) | .1271 | .1246 | .6650 |
| Nonbiopsied nonpigmented lesions | N = 583 | N = 579 | n/a | | | |
| Primary diagnosis | 51.5 (47.4-55.5); kappa 0.38 (0.34-0.43) | 50.3 (46.2-54.3); kappa 0.38 (0.33-0.42) | n/a | .5473 | n/a | n/a |
| Aggregated diagnoses | 78.6 (75.2-81.9); kappa 0.77 (0.73-0.80) | 79.1 (75.8-82.4); kappa 0.77 (0.74-0.80) | n/a | .8417 | n/a | n/a |
| Management plan | 72.0 (68.4-75.7); kappa 0.39 (0.32-0.46) | 72.0 (68.4-75.7); kappa 0.38 (0.31-0.45) | n/a | .9029 | n/a | n/a |
| Biopsied nonpigmented lesions | N = 1034 | N = 1020 | n/a | | | |
| Primary diagnosis | 45.7 (42.6-48.7); kappa 0.32 (0.29-0.36) | 50.1 (47.0-53.2); kappa 0.37 (0.33-0.40) | n/a | .0046* | n/a | n/a |
| Aggregated diagnoses | 85.4 (83.2-87.5); kappa 0.84 (0.82-0.86) | 87.3 (85.2-89.3); kappa 0.86 (0.84-0.88) | n/a | .1144 | n/a | n/a |
| Management plan | 85.3 (83.1-87.5); kappa 0.38 (0.30-0.46) | 86.1 (84.0-88.2); kappa 0.41 (0.33-0.49) | n/a | .6258 | n/a | n/a |

CI, Confidence interval; CID, contact immersion dermoscopy; n/a, not applicable; PLD, polarized light dermoscopy.
*Statistically significant.
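As a reading aid for the 95% confidence intervals in Table IV, the sketch below recomputes one of them with a Wald (normal approximation) interval. That the authors used this particular interval is an assumption based on their description of "standard large sample methods"; the function name is illustrative.

```python
import math


def wald_ci(p, n, z=1.96):
    # Normal-approximation (Wald) confidence interval for a proportion
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Nonbiopsied pigmented lesions, macro images, primary diagnosis:
# 75.7% agreement in N = 753 lesions (Table IV)
low, high = wald_ci(0.757, 753)
low_pct, high_pct = round(low * 100, 1), round(high * 100, 1)
```

For this row the interval rounds to 72.6-78.8, matching the table. Other rows may differ in the last digit because the published rates are themselves rounded to one decimal place.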

images. The addition of CID images increased the rates of agreement for nonbiopsied pigmented lesions (P values < .0001). Kappa statistics were fair.

Agreement rate by teledermatologist’s confidence
For every lesion type and image package, agreement rates increased when confidence increased from low to moderate and from moderate to high (Table V). For primary diagnostic agreement (all image types), agreement rates were almost double or greater for cases where teledermatologists indicated high confidence in their diagnosis compared with those rated as low confidence. The aggregated diagnostic and management agreement rates for cases where teledermatologists indicated high confidence were at least 14 percentage points greater than those rated as low confidence; these differences were higher for nonpigmented lesions than pigmented lesions. Kappa statistic differences were similar.
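The "almost double" and "at least 14 percentage points" summaries can be traced back to Table V with a few lines of arithmetic. The values below are the low-confidence and high-confidence percent agreement figures transcribed from Table V; the dictionary layout and variable names are only illustrative.

```python
# (low confidence %, high confidence %) from Table V
primary = {
    ("macro", "pigmented"): (41.7, 77.6),
    ("macro", "nonpigmented"): (29.1, 58.2),
    ("macro + PLD", "pigmented"): (33.7, 78.1),
    ("macro + PLD", "nonpigmented"): (27.5, 61.9),
    ("macro + CID", "pigmented"): (30.7, 80.7),
}
aggregated_and_management = {
    ("macro", "aggregated", "pigmented"): (78.0, 92.3),
    ("macro", "aggregated", "nonpigmented"): (69.8, 87.9),
    ("macro", "management", "pigmented"): (56.1, 76.3),
    ("macro", "management", "nonpigmented"): (60.5, 85.7),
    ("macro + PLD", "aggregated", "pigmented"): (74.4, 94.4),
    ("macro + PLD", "aggregated", "nonpigmented"): (64.9, 90.6),
    ("macro + PLD", "management", "pigmented"): (57.0, 75.9),
    ("macro + PLD", "management", "nonpigmented"): (62.0, 87.7),
    ("macro + CID", "aggregated", "pigmented"): (72.0, 94.5),
    ("macro + CID", "management", "pigmented"): (58.7, 79.2),
}

# Ratio of high- to low-confidence primary agreement
ratios = {key: high / low for key, (low, high) in primary.items()}
# Percentage-point gain from low to high confidence
gaps = {key: high - low for key, (low, high) in aggregated_and_management.items()}

smallest_ratio = min(ratios.values())
smallest_gap = min(gaps.values())
```

The smallest ratio is about 1.9 (macro images, pigmented lesions) and the smallest gap is just over 14 points (macro images, aggregated diagnoses, pigmented lesions), which is where the two summary statements come from.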

Agreement rate by teledermatologist-rated image quality
For every lesion type and image package, agreement rates increased when image quality increased from low to moderate and from moderate to high


Table V. Agreement rate by teledermatologist-rated confidence
Cells give percent agreement (95% CI) followed by kappa statistic (95% CI); N is the number of lesions at each confidence level.

| Image type, outcome, and lesion category | Low confidence | Moderate confidence | High confidence |
|---|---|---|---|
| Macro images | N = 304 | N = 1430 | N = 1287 |
| Primary diagnostic agreement, pigmented | 41.7 (33.3-50.1); kappa 0.31 (0.21-0.41) | 57.8 (54.0-61.6); kappa 0.47 (0.43-0.52) | 77.6 (74.3-80.9); kappa 0.64 (0.60-0.69) |
| Primary diagnostic agreement, nonpigmented | 29.1 (22.3-35.9); kappa 0.20 (0.13-0.28) | 43.0 (39.6-46.5); kappa 0.31 (0.27-0.35) | 58.2 (54.4-61.9); kappa 0.49 (0.44-0.53) |
| Aggregated diagnostic agreement, pigmented | 78.0 (71.0-85.1); kappa 0.77 (0.69-0.84) | 86.4 (83.8-89.0); kappa 0.85 (0.82-0.88) | 92.3 (90.2-94.4); kappa 0.89 (0.86-0.92) |
| Aggregated diagnostic agreement, nonpigmented | 69.8 (62.9-76.6); kappa 0.68 (0.61-0.75) | 81.6 (78.9-84.3); kappa 0.80 (0.77-0.83) | 87.9 (85.4-90.4); kappa 0.87 (0.84-0.89) |
| Management plan agreement, pigmented | 56.1 (47.6-64.5); kappa 0.12 (0.02-0.27) | 64.6 (60.9-68.3); kappa 0.30 (0.23-0.37) | 76.3 (73.0-79.7); kappa 0.47 (0.39-0.54) |
| Management plan agreement, nonpigmented | 60.5 (53.2-67.8); kappa 0.19 (0.06-0.31) | 80.6 (77.8-83.4); kappa 0.36 (0.28-0.44) | 85.7 (83.0-88.3); kappa 0.52 (0.44-0.60) |
| Macro + PLD | N = 257 | N = 1254 | N = 1343 |
| Primary diagnostic agreement, pigmented | 33.7 (23.7-43.7); kappa 0.17 (0.06-0.28) | 54.0 (50.0-58.1); kappa 0.43 (0.39-0.48) | 78.1 (75.0-81.2); kappa 0.68 (0.64-0.72) |
| Primary diagnostic agreement, nonpigmented | 27.5 (20.8-34.2); kappa 0.19 (0.12-0.26) | 44.0 (40.3-47.7); kappa 0.33 (0.29-0.37) | 61.9 (58.2-65.6); kappa 0.53 (0.46-0.57) |
| Aggregated diagnostic agreement, pigmented | 74.4 (65.2-83.6); kappa 0.74 (0.65-0.83) | 87.7 (85.0-90.4); kappa 0.86 (0.83-0.89) | 94.4 (92.6-96.1); kappa 0.92 (0.90-0.94) |
| Aggregated diagnostic agreement, nonpigmented | 64.9 (57.8-72.1); kappa 0.64 (0.57-0.71) | 82.8 (80.0-85.6); kappa 0.81 (0.78-0.84) | 90.6 (88.3-92.8); kappa 0.90 (0.87-0.92) |
| Management plan agreement, pigmented | 57.0 (46.5-67.4); kappa 0.16 (0.02-0.35) | 64.3 (60.3-68.2); kappa 0.31 (0.23-0.38) | 75.9 (72.7-79.1); kappa 0.50 (0.43-0.56) |
| Management plan agreement, nonpigmented | 62.0 (54.7-69.3); kappa 0.19 (0.07-0.31) | 78.9 (75.8-81.9); kappa 0.36 (0.29-0.44) | 87.7 (85.2-90.2); kappa 0.54 (0.46-0.63) |
| Macro + CID* | N = 75 | N = 456 | N = 678 |
| Primary diagnostic agreement | 30.7 (20.2-41.1); kappa 0.23 (0.11-0.35) | 61.0 (56.5-65.4); kappa 0.51 (0.46-0.56) | 80.7 (77.7-83.7); kappa 0.72 (0.68-0.76) |
| Aggregated diagnostic agreement | 72.0 (61.8-82.2); kappa 0.70 (0.59-0.80) | 90.4 (87.6-93.1); kappa 0.88 (0.86-0.92) | 94.5 (92.8-96.3); kappa 0.93 (0.91-0.95) |
| Management plan agreement | 58.7 (47.5-69.8); kappa 0.13 (0.07-0.34) | 70.2 (66.0-74.4); kappa 0.42 (0.35-0.50) | 79.2 (76.1-82.3); kappa 0.54 (0.48-0.61) |

CI, Confidence interval; CID, contact immersion dermoscopy; PLD, polarized light dermoscopy.
*Pigmented lesions only.

(Table VI). For cases where the teledermatologist rated the image quality as high, the primary diagnostic agreement rates (all image types) were at least 24 percentage points higher than for cases rated as low image quality. The agreement rates for aggregated diagnoses and management for cases where the teledermatologist rated the image quality as high were at least 10 points greater than the corresponding rates among cases where the teledermatologist rated the image quality as low. Image quality seemed to have less effect on aggregated diagnostic agreement rates for pigmented lesions than nonpigmented lesions, whereas the opposite was true for management agreement rates. Kappa statistic differences were similar.

Association between teledermatologist-rated image quality and confidence
There was a statistically significant association between teledermatologist-rated image quality and confidence level (P < .0001).


Table VI. Agreement by teledermatologist-rated image quality
Cells give percent agreement (95% CI) followed by kappa statistic (95% CI); N is the number of lesions at each image quality level.

| Image type, outcome, and lesion category | Low quality | Moderate quality | High quality |
|---|---|---|---|
| Standard macro images | N = 556 | N = 1836 | N = 629 |
| Primary diagnostic agreement, pigmented | 52.3 (46.0-58.6); kappa 0.42 (0.34-0.49) | 62.9 (59.7-66.1); kappa 0.53 (0.49-0.57) | 82.2 (77.8-86.6); kappa 0.57 (0.48-0.66) |
| Primary diagnostic agreement, nonpigmented | 38.1 (32.7-43.5); kappa 0.27 (0.22-0.33) | 45.7 (42.6-48.8); kappa 0.35 (0.31-0.38) | 62.6 (57.4-67.8); kappa 0.54 (0.48-0.60) |
| Aggregated diagnostic agreement, pigmented | 83.0 (78.2-87.7); kappa 0.81 (0.76-0.86) | 87.9 (85.8-90.1); kappa 0.86 (0.84-0.89) | 93.5 (90.7-96.3); kappa 0.87 (0.82-0.92) |
| Aggregated diagnostic agreement, nonpigmented | 75.6 (70.8-80.3); kappa 0.74 (0.69-0.79) | 82.1 (79.7-84.5); kappa 0.81 (0.78-0.83) | 92.3 (89.4-95.1); kappa 0.92 (0.88-0.95) |
| Management plan agreement, pigmented | 58.5 (52.3-64.7); kappa 0.18 (0.06-0.30) | 68.8 (65.7-71.8); kappa 0.38 (0.32-0.44) | 78.4 (73.7-83.1); kappa 0.39 (0.27-0.51) |
| Management plan agreement, nonpigmented | 71.4 (66.4-76.4); kappa 0.29 (0.18-0.39) | 81.9 (79.4-84.3); kappa 0.41 (0.34-0.48) | 85.2 (81.4-89.0); kappa 0.50 (0.38-0.61) |
| Macro + PLD | N = 422 | N = 1646 | N = 786 |
| Primary diagnostic agreement, pigmented | 43.8 (35.8-51.9); kappa 0.28 (0.19-0.38) | 61.2 (57.7-64.6); kappa 0.51 (0.47-0.55) | 79.5 (75.6-83.4); kappa 0.67 (0.62-0.73) |
| Primary diagnostic agreement, nonpigmented | 37.7 (32.0-43.4); kappa 0.28 (0.22-0.35) | 48.5 (45.2-51.8); kappa 0.38 (0.34-0.41) | 62.6 (57.7-67.6); kappa 0.54 (0.49-0.60) |
| Aggregated diagnostic agreement, pigmented | 80.1 (73.7-86.6); kappa 0.78 (0.71-0.85) | 89.9 (87.7-92.0); kappa 0.89 (0.86-0.91) | 94.4 (92.2-96.7); kappa 0.92 (0.89-0.95) |
| Aggregated diagnostic agreement, nonpigmented | 68.5 (63.0-74.0); kappa 0.67 (0.62-0.72) | 86.3 (84.0-88.6); kappa 0.85 (0.83-0.87) | 90.9 (87.9-93.8); kappa 0.91 (0.88-0.94) |
| Management plan agreement, pigmented | 56.8 (48.8-64.9); kappa 0.16 (0.02-0.31) | 66.5 (63.2-69.8); kappa 0.34 (0.28-0.41) | 80.2 (76.4-84.0); kappa 0.57 (0.49-0.65) |
| Management plan agreement, nonpigmented | 68.1 (62.6-73.6); kappa 0.27 (0.17-0.38) | 83.4 (81.0-85.9); kappa 0.42 (0.35-0.49) | 84.1 (80.4-87.9); kappa 0.50 (0.39-0.60) |
| Macro + CID* | N = 95 | N = 656 | N = 458 |
| Primary diagnostic agreement | 44.2 (34.2-54.2); kappa 0.33 (0.22-0.44) | 67.8 (64.3-71.4); kappa 0.59 (0.54-0.63) | 78.8 (75.1-82.6); kappa 0.69 (0.64-0.74) |
| Aggregated diagnostic agreement | 78.9 (70.7-87.1); kappa 0.76 (0.68-0.85) | 90.9 (88.6-93.1); kappa 0.89 (0.86-0.92) | 95.2 (93.2-97.2); kappa 0.94 (0.91-0.96) |
| Management plan agreement | 63.2 (53.4-72.9); kappa 0.30 (0.13-0.48) | 71.3 (67.9-74.8); kappa 0.43 (0.36-0.49) | 81.4 (77.9-85.0); kappa 0.60 (0.53-0.68) |

CI, Confidence interval; CID, contact immersion dermoscopy; PLD, polarized light dermoscopy.
*Pigmented lesions only.

DISCUSSION
Main findings
There are several important findings of this study. First, agreement was fair to substantial for primary diagnosis (45.7%-80.1%; kappa 0.32-0.62), substantial to almost perfect for aggregated diagnoses (78.6%-93.9%; kappa 0.77-0.90), and fair for management (66.7%-86.1%; kappa 0.28-0.41). Second, diagnostic agreement (primary and aggregated) was highest for nonbiopsied pigmented lesions whereas management agreement was highest for biopsied nonpigmented lesions. Finally, although generally only modest improvements in agreement rates were observed with the addition of PLD images, CID significantly improved rates for pigmented lesions.

Comparison of our results to other studies
A recent systematic review on teledermatology [1,2] found a weighted average for primary diagnostic


agreement of 62.3% from 5 lesion studies [3-7] and 1 dermoscopy pigmented lesion study [8] (n = 708 lesions). The overall primary diagnostic agreement rate for our study involving 3021 lesions was lower, 55.5%. For aggregated diagnostic agreement, a weighted average from 4 lesion studies [4-7] was 64.4% [1] (n = 358 lesions); the overall rate for our study involving 3021 lesions was higher, 85.5%.

For management agreement, 2 store and forward teledermatology studies evaluated agreement of the triage management decision of “refer or not refer” for pigmented lesions [15] or skin lesions [3]; percent concordance was 80% [15] and 61% [3] for these 2 studies, yielding a weighted average of 75.3% (809/1075) [1]. Two lesion studies evaluated agreement for the diagnostic procedure decision “biopsy versus no biopsy” and found concordance rates of 100% [16] and 95% [7] (weighted average of 98.5%, 68/69) [1]. One pigmented lesion study found an agreement rate of 96% for 3 different management options [17]. Our study had 5 possible management choices and a write-in option that resulted in a total of 11 management categories. This larger number of management choices (11 vs 2-3) may explain our lower rate of agreement (75.2%) than previous studies. Kappa statistics control for agreement caused by chance alone. Because of the smaller number of management options, chance agreement was likely higher; this may explain lower kappa statistic values [18].

Dermoscopy
Although PLD increased all 3 agreement outcomes over macro images, this increase was modest (≤ 4.4 percentage points) and only significantly better in 2 categories (aggregated diagnostic agreement for biopsied pigmented lesions and primary diagnostic agreement for biopsied nonpigmented lesions). The improvement in agreement outcomes was higher for CID (≤ 8.7 percentage points) as compared with macro images alone (all statistically significant except for management agreement for biopsied pigmented lesions).
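The remark above that kappa controls for chance agreement can be made concrete with Cohen's kappa, computed as (observed agreement - expected agreement) / (1 - expected agreement). The two 2 x 2 tables in this sketch are hypothetical counts chosen for illustration and are not data from this study; rows are one reader's choices and columns the other's.

```python
def cohen_kappa(table):
    # table[i][j] = lesions rated category i by reader A and j by reader B
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_share = [sum(row) / n for row in table]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginal frequencies
    expected = sum(r * c for r, c in zip(row_share, col_share))
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Hypothetical example 1: both options used equally often
balanced = [[40, 10],
            [10, 40]]
# Hypothetical example 2: one option chosen for most lesions
skewed = [[75, 10],
          [10, 5]]

obs_b, exp_b, kappa_b = cohen_kappa(balanced)  # 0.80 observed, kappa 0.60
obs_s, exp_s, kappa_s = cohen_kappa(skewed)    # 0.80 observed, kappa about 0.22
```

Both tables show 80% raw agreement, but when one option dominates, chance agreement rises from 0.50 to about 0.75 and kappa drops from 0.60 to roughly 0.22. A pattern of this kind is one way that management agreement rates of 67% to 86% can sit alongside kappa values that are only fair.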
Importantly, all of the teledermatologists in our study were experts in dermoscopy [14]. This study confirms that adding dermoscopic images increases agreement, as it does in the clinical setting [19]. This underscores the need for proper equipment and dermoscopy-trained teledermatologists in the clinical setting of teledermatology involving pigmented lesions.

Confidence and image quality
It is not surprising that high confidence of teledermatologists was associated with better agreement


rates. This likely reflects the difference between “easy” and “hard” lesions. It is also not surprising that high-quality images would result in higher diagnostic agreement rates. Importantly, our previous analysis of biopsied lesions in this cohort found that the accuracy (using gold standard of histopathology) was significantly lower for teledermatologists as compared with clinic dermatologists and several melanomas were mismanaged, even with high levels of confidence and high-quality photographs [11].

Limitations
There are several important limitations to this study. First, the study population was primarily male and Caucasian. Second, although teledermatologists were blinded to the study purpose, they were aware that it was a study and therefore may not have been as careful or conservative in generating diagnoses and management plans as in a real-world setting. Third, pigmented biopsied lesions were oversampled to meet recruitment goals; other categories were approached in a convenience fashion. Fourth, although interrater reliability was evaluated [13], intrarater reliability of teledermatologists was not. Finally, this study was not designed to address other important outcomes such as cost, satisfaction, or clinical outcomes.

Conclusions
In summary, this study of store and forward teledermatology involving over 3000 skin neoplasms found moderate to almost perfect diagnostic agreement whereas management agreement was fair. CID significantly improved rates for pigmented lesions. Further studies are needed to assess clinical outcomes, patient and provider satisfaction, and cost-effectiveness in managing not only skin neoplasms but other categories of dermatologic conditions.

REFERENCES
1. Warshaw EM, Hillman YJ, Greer NL, et al. Teledermatology for diagnosis and management of skin conditions: a systematic review. J Am Acad Dermatol. 2011;64:759-772.
2. Warshaw E, Greer N, Hillman Y, et al. Teledermatology for diagnosis and management of skin conditions: a systematic review of the evidence. VA-ESP Project #09-009; 2009. Available from: URL: http://www.hsrd.research.va.gov/publications/esp/telederm.cfm. Accessed November 20, 2014.
3. Bowns IR, Collins K, Walters SJ, McDonagh AJ. Telemedicine in dermatology: a randomized controlled trial. Health Technol Assess. 2006;10:27-33.
4. Oakley AM, Reeves F, Bennett J, Holmes SH, Wickham H. Diagnostic value of written referral and/or images for skin lesions. J Telemed Telecare. 2006;12:151-158.
5. Mahendran R, Goodfield MJ, Sheehan-Dare RA. An evaluation of the role of a store-and-forward teledermatology system in skin cancer diagnosis and management. Clin Exp Dermatol. 2005;30:209-214.
6. Barnard CM, Goldyne ME. Evaluation of an asynchronous teleconsultation system for diagnosis of skin cancer and other skin diseases. Telemed J E Health. 2000;6:379-384.
7. Whited JD, Mills BJ, Hall RP, Drugge RJ, Grichnik JM, Simel DL. A pilot trial of digital imaging in skin cancer. J Telemed Telecare. 1998;4:108-112.
8. Piccolo D, Smolle J, Wolf IH, et al. Face-to-face diagnosis vs telediagnosis of pigmented skin tumors: a teledermoscopic study. Arch Dermatol. 1999;135:1467-1471.
9. Austin Data Base, Fiscal Year 2000. [VA-specific database.]
10. Warshaw EM, Lederle FA, Grill JP, et al. Accuracy of teledermatology for nonpigmented neoplasms. J Am Acad Dermatol. 2009;60:579-588.
11. Warshaw EM, Lederle FA, Grill JP, et al. Accuracy of teledermatology for pigmented neoplasms. J Am Acad Dermatol. 2009;61:753-765.
12. Warshaw EM, Gravely AA, Nelson DB. Accuracy of teledermatology/teledermatoscopy and clinical-based dermatology for specific categories of skin neoplasms. J Am Acad Dermatol. 2010;63:348-352.
13. Warshaw EM, Gravely AA, Bohjanen KA, et al. Interobserver accuracy of store and forward teledermatology for neoplasms. J Am Acad Dermatol. 2010;62:513-516.
14. Warshaw EM. Reply. J Am Acad Dermatol. 2009;61:903-904.
15. Jolliffe VM, Harris DW, Morris R, Wallace P, Whittaker SJ. Can we use video images to triage pigmented lesions? Br J Dermatol. 2001;145:904-910.
16. Shapiro M, James WD, Kessler R, et al. Comparison of skin biopsy triage decisions in 49 patients with pigmented lesions and skin neoplasms: store-and-forward teledermatology vs face-to-face dermatology. Arch Dermatol. 2004;140:525-528.
17. Di Stefani A, Zalaudek I, Argenziano G, Chimenti S, Soyer HP. Feasibility of a two-step teledermatologic approach for the management of patients with multiple pigmented skin lesions. Dermatol Surg. 2007;33:686-692.
18. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med. 2012;22:276-282.
19. Kittler H, Pehamberger H, Wolff K, Binder M. Diagnostic accuracy of dermoscopy. Lancet Oncol. 2002;3:159-163.