Review
Dermoscopy
Diagnostic accuracy of dermoscopy
H Kittler, H Pehamberger, K Wolff, and M Binder
The accuracy of the clinical diagnosis of cutaneous melanoma with the unaided eye is only about 60%. Dermoscopy, a non-invasive, in vivo technique for the microscopic examination of pigmented skin lesions, has the potential to improve the diagnostic accuracy. Our objectives were to review previous publications, to compare the accuracy of melanoma diagnosis with and without dermoscopy, and to assess the influence of study characteristics on the diagnostic accuracy. We searched for publications between 1987 and 2000 and identified 27 studies eligible for meta-analysis. The diagnostic accuracy for melanoma was significantly higher with dermoscopy than without this technique (log odds ratio 4.0 [95% CI 3.0 to 5.1] versus 2.7 [1.9 to 3.4]; an improvement of 49%, p = 0.001). The diagnostic accuracy of dermoscopy significantly depended on the degree of experience of the examiners. Dermoscopy by untrained or less experienced examiners was no better than clinical inspection without dermoscopy. The diagnostic performance of dermoscopy improved when the diagnosis was made by a group of examiners in consensus and diminished as the prevalence of melanoma increased. A comparison of various diagnostic algorithms for dermoscopy showed no significant differences in their diagnostic performance. A thorough appraisal of the study characteristics showed that most of the studies were potentially influenced by verification bias. In conclusion, dermoscopy improves the diagnostic accuracy for melanoma in comparison with inspection by the unaided eye, but only for experienced examiners.
Figure 1. Superficial spreading melanoma viewed with dermoscopy (large panel) and with the unaided eye (inset panel). Compared with the unaided eye, dermoscopy reveals several additional structural features, which are typical of melanoma, including irregular dots and irregular extensions (pseudopods) in the periphery and a blue-whitish veil.
Early diagnosis is thought to be very important for improving the prognosis of patients with cutaneous melanoma, but even in specialised centres the accuracy of the clinical diagnosis for melanoma achieved with the unaided eye is only slightly better than 60%.1 Dermoscopy (epiluminescence microscopy, dermatoscopy, skin-surface microscopy, incident light microscopy) is a non-invasive, in vivo examination with a microscope that uses incident light and oil immersion to make subsurface structures of the skin accessible to visual examination (Figure 1). Dermoscopy allows the observer to look not only onto but also into the superficial skin layers, and thus permits a more detailed inspection of pigmented skin lesions.2 The results of several studies have suggested that dermoscopy improves the rate of detection of melanoma compared with inspection by the unaided eye.3 However, the reported sensitivity and specificity vary significantly between studies, partly because the diagnostic accuracy of dermoscopy depends on the amount of training of the dermatologist, the diagnostic difficulty of the lesions, and the type of algorithm used for assessment,4–6 but also as a result of differences in the explicit or implicit threshold used to differentiate between melanoma and non-melanoma.

We have used the meta-analytic method for diagnostic tests, which combines data from many studies,7,8 takes into account differences in the test threshold, and provides a way to examine the association between test accuracy and study characteristics. Our aims were to compare the diagnostic accuracy for melanoma with and without dermoscopy, to assess the influence of study characteristics on the diagnostic accuracy of dermoscopy, and to report summary estimates of the diagnostic accuracy by combining data from many reports.

Lancet Oncol 2002; 3: 159–65

All the authors are at the Department of Dermatology, Division of General Dermatology, University of Vienna Medical School, Vienna, Austria. HK is a Research Assistant, HP is a Professor, KW is Professor and Chairman, and MB is an Associate Professor.
Correspondence: Dr Harald Kittler, Department of Dermatology, University of Vienna Medical School, Waehringerguertel 18–20, A-1090 Vienna, Austria. Tel: +43 1 40400 7701. Fax: +43 1 4081928. E-mail: [email protected]

THE LANCET Oncology Vol 3 March 2002
http://oncology.thelancet.com
For personal use. Only reproduce with permission from The Lancet Publishing Group.

Methods
Eligible studies (see Search strategy and selection criteria) were classified, with no masking, by two readers in consensus on prospectively defined characteristics important for assessment of diagnostic tests. The following information was extracted from each report: authors’ names; year of publication; description of pigmented skin lesions (melanoma prevalence, melanoma invasion thickness, frequency of non-melanocytic lesions); experience of examiners; independence of clinical and histological assessment; type of diagnostic algorithm; mode
Table 1. Main characteristics of eligible studies

| First author | Ref | Number of lesions (% melanomas) | NML included | Dermoscopic experience of examiners | Dermoscopic algorithm | Mode of presentation | Assessment independent | Mode of diagnosis |
|---|---|---|---|---|---|---|---|---|
| Argenziano | 12 | 342 (34%) | No | Experts and non-experts | Scoring system, ABCD rule*, pattern analysis | Images | Yes | Consensus |
| Bauer | 19 | 279 (15%) | No | Experts | Pattern analysis | Patients | Yes | Consensus |
| Benelli | 17 | 401 (15%) | Yes | Experts | Scoring system | Patients | Yes | Not recorded |
| Binder | 20 | 100 (40%) | No | Experts | Pattern analysis | Images | Yes | Consensus |
| Binder | 6 | 240 (24%) | Yes | Experts and non-experts | Pattern analysis | Images | Yes | Individual |
| Binder | 5 | 100 (37%) | Yes | Non-experts before and after training | Pattern analysis | Images | Yes | Individual |
| Binder | 4 | 250 (16%) | No | Experts and non-experts | ABCD rule*, pattern analysis | Images | Yes | Individual |
| Carli | 21 | 15 (27%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Cristofolini | 22 | 220 (15%) | Yes | Experts | Pattern analysis | Patients | Yes | Consensus |
| Dal Pozzo | 18 | 713 (22%) | Yes | Experts | Scoring system | Images | Yes | Consensus |
| Dummer | 23 | 824 (3%) | Yes | Experts | Pattern analysis | Patients | Yes | Not recorded |
| Feldmann | 24 | 500 (6%) | No | Experts | ABCD rule* | Patients | Yes | Individual |
| Kittler | 25 | 50 (46%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Kittler | 26 | 356 (21%) | Yes | Experts | ABCD rule* | Images | Yes | Individual |
| Krähn | 27 | 80 (49%) | No | Experts | Not recorded | Patients | Yes | Not recorded |
| Lorentzen | 28 | 232 (21%) | Yes | Experts and non-experts | Pattern analysis | Images | Not recorded | Individual |
| Lorentzen | 29 | 258 (25%) | Yes | Experts and non-experts | Scoring system, ABCD rule | Images | Not recorded | Individual |
| Menzies | 13 | 385 (28%) | Yes | Experts | Scoring system | Images | Yes | Not recorded |
| Nachbar | 11 | 172 (40%) | No | Experts | ABCD rule* | Patients | Yes | Not recorded |
| Nilles | 30 | 209 (20%) | No | Experts | Scoring system | Not recorded | Not recorded | Not recorded |
| Seidenari | 31 | 90 (34%) | No | Experts and non-experts | Pattern analysis | Images | Not recorded | Individual |
| Soyer | 32 | 159 (41%) | Yes | Experts | Pattern analysis | Patients | Yes | Individual |
| Stanganelli | 33 | 20 (50%) | No | Experts | Pattern analysis | Images | Yes | Individual |
| Stanganelli | 34 | 3329 (2%) | No | Experts | Pattern analysis | Patients | Yes | Individual |
| Steiner | 35 | 318 (23%) | Yes | Experts | Pattern analysis | Patients | Yes | Consensus |
| Stolz | 10 | 79 (61%) | No | Experts | ABCD rule* | Images | Not recorded | Consensus |
| Westerhoff | 36 | 100 (50%) | Yes | Non-experts before and after training | Scoring system | Images | Yes | Individual |

NML, non-melanocytic skin lesions. *The ABCD rule aids clinical diagnosis of melanoma on the basis of observable morphological features – asymmetry, border irregularity, colour variegation, and dermoscopic structure.
of diagnosis, mode of presentation; and results (sensitivity and specificity). The independence of clinical and histological assessment was defined according to whether the clinical diagnosis was made without knowledge of histology. The diagnostic algorithm refers to the type of analysis that was used for the dermoscopic assessment of pigmented lesions. We differentiated between pattern analysis as described by Pehamberger and colleagues,9 the ABCD rule for dermoscopy reported by Stolz and coworkers,10,11 and algorithms that used a modified form of pattern analysis in conjunction with a scoring system. The latter group included the 7-point checklist of Argenziano and colleagues,12 Menzies and co-workers’ scoring system,13,14 risk stratification as described by Kenet and Fitzpatrick,15 and the seven features of melanoma as described by Benelli and others.16–18
For mode of diagnosis, we noted whether or not the diagnosis was established in consensus by a group of examiners, and for mode of presentation, we differentiated between studies that used presentation of colour prints, photographs, slides, or digital images and studies that investigated the accuracy of face-to-face diagnosis. Studies were further examined according to whether their results were potentially influenced by verification bias. Verification bias is likely when the decision to proceed with the reference test (histopathology) partly depends on the results of the clinical diagnosis. The influence of verification bias on the diagnostic accuracy was not analysed statistically because only one study looked at the outcome of benign lesions that were not selected for excision.
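The direction of verification bias can be illustrated with a small worked example. The sketch below uses an entirely hypothetical cohort (the counts and verification probabilities are assumptions chosen for illustration, not data from any of the reviewed studies) and compares the accuracy computed from all lesions with the accuracy computed only from lesions that reach histopathology.

```python
def accuracy(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for one 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def verified_subset(tp, fn, fp, tn, p_verify_positive, p_verify_negative):
    """Expected counts among lesions that actually reach histopathology.

    Lesions judged suspicious (tp, fp) are verified with probability
    p_verify_positive; lesions judged benign (fn, tn) with p_verify_negative.
    """
    return (tp * p_verify_positive, fn * p_verify_negative,
            fp * p_verify_positive, tn * p_verify_negative)


# Hypothetical cohort (assumed numbers, not data from the review):
# 1000 lesions, 100 melanomas, true sensitivity 0.70, true specificity 0.90.
full_table = (70, 30, 90, 810)  # tp, fn, fp, tn

true_sens, true_spec = accuracy(*full_table)

# Assumption: every suspicious lesion is excised, but only 10% of the
# lesions judged benign are excised and examined histologically.
observed_table = verified_subset(*full_table,
                                 p_verify_positive=1.0,
                                 p_verify_negative=0.1)
obs_sens, obs_spec = accuracy(*observed_table)

print(f"true:     sensitivity={true_sens:.2f}, specificity={true_spec:.2f}")
print(f"observed: sensitivity={obs_sens:.2f}, specificity={obs_spec:.2f}")
```

Under these assumed numbers the apparent sensitivity rises from 0.70 to about 0.96 and the apparent specificity falls from 0.90 to about 0.47, which is the pattern expected when the clinical impression drives the decision to excise.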
Statistical analysis
Sensitivity and specificity were calculated according to standard formulae. When individual assessments from several observers were given in a study, the median values of the sensitivity and specificity were used in our analysis. Least-squares linear regression was used to estimate parameters for summary receiver-operating-characteristic (SROC) models. Estimates of sensitivity and specificity were obtained from each study and used to calculate their log odds ratio (logit), which measures how well the test discriminates between melanoma and non-melanoma. The SROC model was obtained by regression of the difference, D, of the logits, logit (sensitivity) minus logit (1 minus specificity), on the sum, S, of the logits, logit (sensitivity) plus logit (1 minus specificity), to test whether the log odds ratio is associated with the test threshold.7,8 An inverse transformation was then used to transform the data back to the ROC space and to express sensitivity as a function of 1 minus specificity. SROC curves were constructed for each diagnostic method, and differences between them were compared by use of linear regression analysis. To adjust for covariates we used multiple linear regression analysis.

For the comparison of more than two groups, the log odds ratios were compared by ANOVA, and adjustment for covariates was done by ANCOVA. The Scheffé test was used to account for multiple comparisons. For paired observations, the log odds ratios were compared by use of the paired t test. If studies that were included in the paired analysis reported the results for experts and non-experts, only the experts’ readings were included in the model. The mean difference between the log odds ratios observed in the paired analysis was used to calculate the relative improvement achieved with dermoscopy.

Univariate and multivariate regression analyses were done to assess the variation in diagnostic accuracy due to study characteristics. The regression coefficients give a measure of the difference in diagnostic performance, with positive coefficients indicating better discriminatory power and negative coefficients corresponding to lower discriminatory ability. For multivariate analysis we used a forward stepwise linear regression analysis. Variables were entered in the stepwise model if the probability obtained from the F test was below 0.05 and removed if p was greater than 0.1. Statistical analyses used SPSS (version 10.0). All p values are two-tailed.
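The SROC construction described above can be sketched in a few lines. This is a minimal illustration, assuming an unweighted least-squares fit of D on S and using the rounded dermoscopy sensitivities and specificities printed in Table 2; the function names are hypothetical, and the one study with a printed specificity of 1.00 is left out because its logit is undefined after rounding. It is not expected to reproduce the published curves, which were fitted in SPSS on the unrounded data.

```python
import math


def logit(p):
    """Log odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def fit_sroc(pairs):
    """Least-squares fit of D = a + b*S over (sensitivity, specificity) pairs.

    D = logit(sensitivity) - logit(1 - specificity)  (the log odds ratio)
    S = logit(sensitivity) + logit(1 - specificity)  (a proxy for threshold)
    """
    d_vals, s_vals = [], []
    for sens, spec in pairs:
        tpr_logit = logit(sens)
        fpr_logit = logit(1.0 - spec)
        d_vals.append(tpr_logit - fpr_logit)
        s_vals.append(tpr_logit + fpr_logit)
    n = len(pairs)
    s_mean = sum(s_vals) / n
    d_mean = sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(fpr, a, b):
    """Inverse transformation: sensitivity as a function of 1 - specificity."""
    z = (a + (1.0 + b) * logit(fpr)) / (1.0 - b)
    return 1.0 / (1.0 + math.exp(-z))


# Rounded (sensitivity, specificity) for dermoscopy as printed in Table 2,
# omitting the study whose printed specificity is 1.00.
dermoscopy = [(0.80, 0.89), (0.68, 0.91), (0.73, 0.78), (0.75, 0.89),
              (0.88, 0.79), (0.96, 0.98), (0.90, 0.93), (0.82, 0.94),
              (0.93, 0.91), (0.94, 0.82), (0.73, 0.73), (0.76, 0.58)]

a, b = fit_sroc(dermoscopy)
for fpr in (0.05, 0.10, 0.20):
    print(f"1 - specificity = {fpr:.2f} -> sensitivity = {sroc_sensitivity(fpr, a, b):.2f}")
```

A slope b close to zero means the log odds ratio does not depend on the test threshold, in which case the intercept a alone summarises the discriminatory power.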
Results
Study characteristics
The main characteristics of each of the 27 eligible studies4–6,10–13,17–36 are presented in Table 1. The pooled sample was 9821 pigmented skin lesions (median per study 232). The prevalence of melanoma ranged from 1.6% to 60.8% (mean 28.3%). The mean or median Breslow thickness was reported in 15 studies and ranged from 0.40 mm to 1.11 mm (median 0.70 mm). In most of the available studies, all lesions were selected for disease verification. Only one study looked at the outcome of benign lesions that were not selected for excision.34 Several studies compared different diagnostic methods for the diagnosis of melanoma. In fourteen studies (52%), the diagnostic accuracy for melanoma with and without dermoscopy was directly compared and in three (11%) two or more diagnostic algorithms for dermoscopy were compared. Pattern analysis was used in 16 studies (59%), the ABCD rule in seven (26%), and modified pattern analysis in conjunction with a scoring system in seven (26%). Five studies (19%) compared the performance of experts and non-experts, and two (7%) assessed the influence of training on the performance of non-experts. All but one study investigated dermatologists; Westerhoff and colleagues studied the effect of dermoscopy on the diagnostic performance of primary-care physicians.36 The first model was a paired analysis and included only those studies that directly compared the diagnostic accuracy for melanoma with and without dermoscopy (Table 2). One of these 14 studies presented the results in such a way that the sensitivity and specificity of dermoscopy could not be calculated, and it was therefore excluded from the paired analysis. The mean log odds ratio achieved with
Table 2. Main results of studies that directly compared the diagnostic accuracy for melanoma with and without dermoscopy

| First author | Ref | Sample size | Sensitivity, unaided eye | Sensitivity, dermoscopy | Specificity, unaided eye | Specificity, dermoscopy | Log odds ratio, unaided eye | Log odds ratio, dermoscopy |
|---|---|---|---|---|---|---|---|---|
| Benelli | 17 | 401 | 0.67 | 0.80 | 0.79 | 0.89 | 2.04 | 3.49 |
| Binder | 6 | 240 | 0.58 | 0.68 | 0.91 | 0.91 | 2.64 | 3.07 |
| Binder | 5 | 100 | 0.73 | 0.73 | 0.70 | 0.78 | 1.84 | 2.26 |
| Carli | 21 | 15 | 0.42 | 0.75 | 0.78 | 0.89 | 0.93 | 3.17 |
| Cristofolini | 22 | 220 | 0.85 | 0.88 | 0.75 | 0.79 | 2.83 | 3.32 |
| Dummer | 23 | 824 | 0.65 | 0.96 | 0.93 | 0.98 | 3.21 | 7.07 |
| Krähn | 27 | 80 | 0.79 | 0.90 | 0.78 | 0.93 | 2.59 | 4.78 |
| Lorentzen | 28 | 232 | 0.77 | 0.82 | 0.89 | 0.94 | 3.30 | 4.27 |
| Nachbar | 11 | 172 | 0.84 | 0.93 | 0.84 | 0.91 | 3.29 | 4.89 |
| Soyer | 32 | 159 | 0.94 | 0.94 | 0.82 | 0.82 | 4.27 | 4.27 |
| Stanganelli | 33 | 20 | 0.55 | 0.73 | 0.79 | 0.73 | 1.52 | 1.94 |
| Stanganelli | 34 | 3329 | 0.67 | 0.93 | 0.99 | 1.00 | 5.82 | 8.25 |
| Westerhoff | 36 | 100 | 0.63 | 0.76 | 0.54 | 0.58 | 0.66 | 1.46 |
dermoscopy was significantly higher than that achieved without dermoscopy (4.0 [95% CI 3.0 to 5.1] versus 2.7 [1.9 to 3.4]), resulting in a mean difference of 1.3 (0.7 to 2.0), or an improvement of 49% (p = 0.001).

The second model included the results of all 27 eligible studies and yielded similar results. The mean log odds ratio achieved with dermoscopy was again significantly higher than that achieved without dermoscopy (3.4 [2.9 to 3.9] versus 2.5 [1.9 to 3.1], p = 0.03). Inclusion of information on the experience of the examiners showed that the diagnostic performance of dermoscopy was significantly better for experts than for non-experts (mean log odds ratio 3.8 [3.3 to 4.3] versus 2.0 [1.4 to 2.6]; mean difference 1.8 [0.8 to 2.7], p = 0.001). To account for this finding, we generated a model that compared the performance of the clinical diagnosis without dermoscopy, dermoscopy by non-experts, and dermoscopy by experts. For each of the methods, SROC curves were constructed (Figure 2). The clinical diagnosis without dermoscopy showed similar diagnostic accuracy to dermoscopy by non-experts (mean log odds ratio 2.5 versus 2.0; mean difference 0.5 [95% CI for difference -0.4 to 1.4], p = 0.65). For both approaches the diagnostic accuracy was significantly lower than that achieved with dermoscopy by experts (mean log odds ratio 3.8, p = 0.003 and p = 0.001).

The influence of study characteristics on the diagnostic performance of dermoscopy was investigated by univariate and multivariate regression analysis including the results of all eligible studies. As in the analysis above, the diagnostic performance of dermoscopy increased for experts (regression coefficient 1.8 [95% CI 0.8 to 2.8], p < 0.001). The diagnostic performance also increased when the diagnosis was made by a group of two or more examiners in consensus (regression coefficient 1.1 [0.2 to 2.1], p = 0.02). Although consensus increased the discriminatory power of dermoscopy, the procedure performed by experts achieved higher accuracy than inspection with the unaided eye whether or not the dermoscopic diagnosis was made in consensus. The accuracy of the (clinically more relevant) non-consensus diagnosis achieved with dermoscopy was significantly higher than that achieved without dermoscopy (mean log odds ratio 3.7 versus 2.5; mean difference 1.2 [95% CI for difference 0.3 to 2.2], p = 0.01). The diagnostic ability of dermoscopy was inversely correlated with the prevalence of melanoma in the sample (regression coefficient -0.04 [95% CI -0.06 to -0.01], p = 0.006) and lower for experimental studies that used presentation of slides, colour prints, or digital images than for clinical studies in which the diagnosis was made face to face (regression coefficient -1.3 [-2.1 to -0.5], p = 0.001). Other study characteristics did not significantly influence the diagnostic performance of dermoscopy.

For multivariate analysis we used a forward stepwise regression analysis. The final model included three variables: the experience of examiners (regression coefficient 1.2 [0.3 to 2.1], p = 0.01), the prevalence of melanoma (regression coefficient -0.04 [-0.06 to -0.01], p = 0.01), and whether the diagnosis was made in consensus (regression coefficient 1.0 [0.04 to 1.9], p = 0.04). Other variables were not independently associated with the diagnostic accuracy of dermoscopy. Since the dermatologists’ experience was the strongest predictive variable for the diagnostic performance of dermoscopy, we built a SROC model for the pooled diagnostic performance of dermoscopy adjusted for three settings with different degrees of experience (Figure 3).

Univariate analysis of the individual results of all eligible studies showed that the diagnostic accuracy of dermoscopy was similar for the different diagnostic algorithms. The log odds ratios achieved with pattern analysis (3.6 [95% CI 2.8 to 4.4]), the ABCD rule (3.2 [2.4 to 3.9]), and scoring systems (3.1 [2.1 to 4.0]) did not differ significantly (p = 0.64). We analysed the influence of the experience of the examiners on the performance of the diagnostic algorithms. The degree of experience had a significant effect on the diagnostic accuracy of pattern analysis (regression coefficient 2.0 [95% CI 0.4 to 3.6], p = 0.02) and scoring systems (regression coefficient 2.3 [0.5 to 4.1], p = 0.02). By contrast, the degree of experience had no significant effect on the diagnostic accuracy achieved with the ABCD rule (regression coefficient 0.8 [-1.1 to 2.7], p = 0.35).
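The paired comparison reported in the Results can be re-derived, approximately, from the log odds ratios printed in Table 2. The sketch below is an illustration only: it works from the rounded table values, the function name `paired_summary` is hypothetical, and the critical value 2.179 is the tabulated two-sided 5% point of the t distribution with 12 degrees of freedom, hard-coded because the Python standard library has no t distribution.

```python
import math
import statistics

# (unaided eye, dermoscopy) log odds ratios, one pair per study in Table 2.
TABLE2_LOG_ODDS = [
    (2.04, 3.49), (2.64, 3.07), (1.84, 2.26), (0.93, 3.17), (2.83, 3.32),
    (3.21, 7.07), (2.59, 4.78), (3.30, 4.27), (3.29, 4.89), (4.27, 4.27),
    (1.52, 1.94), (5.82, 8.25), (0.66, 1.46),
]

T_CRIT_DF12 = 2.179  # tabulated t(0.975) for 12 degrees of freedom


def paired_summary(pairs, t_crit):
    """Summarise a paired comparison of log odds ratios."""
    unaided = [u for u, _ in pairs]
    dermoscopy = [d for _, d in pairs]
    diffs = [d - u for u, d in pairs]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean paired difference.
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return {
        "mean_unaided": statistics.mean(unaided),
        "mean_dermoscopy": statistics.mean(dermoscopy),
        "mean_diff": mean_diff,
        "ci_low": mean_diff - t_crit * se,
        "ci_high": mean_diff + t_crit * se,
        "t_statistic": mean_diff / se,
        # Mean difference relative to the mean without dermoscopy.
        "relative_improvement": mean_diff / statistics.mean(unaided),
    }


summary = paired_summary(TABLE2_LOG_ODDS, T_CRIT_DF12)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With the rounded inputs, the means, the mean difference, and its confidence interval agree with the figures quoted in the text to one decimal place, and the relative improvement comes out at just under 50%.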
Figure 2. SROC curves (sensitivity plotted against 1 − specificity) for the performance of the clinical diagnosis without dermoscopy (red line), dermoscopy by experts (black line), and dermoscopy by non-experts (blue line).

Discussion
This meta-analysis of 27 studies provides evidence that dermoscopy gives better diagnostic accuracy for melanoma than clinical inspection without dermoscopy (ie, with the unaided eye). This conclusion accords with that of a previous review, which included six studies,37 and another meta-analysis, which included eight studies.38 The review did not provide a quantitative analysis and the other meta-analysis was restricted to studies that directly compared the diagnostic performance with and without dermoscopy. Neither study addressed the influence of study characteristics on the diagnostic performance of dermoscopy. According to our analysis, the diagnostic accuracy of dermoscopy significantly depends on the experience of the examiners. Moreover, the diagnostic accuracy achieved is no better with dermoscopy applied by non-experts than with the unaided eye. This finding underlines the importance of training for the application of dermoscopy.5,6

Figure 3. SROC curves (sensitivity plotted against 1 − specificity) for the pooled diagnostic performance of dermoscopy. The base case (black line) is adjusted to a setting at which half of the examiners are experienced in dermoscopy (experts). The best case (red line) is adjusted to a setting at which all examiners are experts in dermoscopy. The worst case (blue line) is adjusted to a setting at which all examiners are untrained or less experienced (non-experts).

The study by Westerhoff and colleagues, investigating the value of dermoscopy on the diagnostic performance of primary-care physicians, deserves further attention.36 It was the only study of non-dermatologists. Primary-care physicians were trained to use a simplified diagnostic scoring system for dermoscopy. Their diagnostic performance before training was only slightly better than chance. After training, there was a significant improvement in the diagnosis of melanoma by dermoscopy versus inspection with the unaided eye. However, the reported diagnostic accuracy after training was much lower than in comparable studies involving dermatologists.

We also found that the diagnostic performance of dermoscopy was improved when the diagnosis was made by a group of examiners in consensus. A consensus diagnosis might not be practicable in most clinical settings, but it may be important for telemedical applications. By electronic transmission of digital dermoscopic images, teledermoscopy potentially involves two or more experts at
geographically distant facilities. However, how a consensus can be reached for a group of examiners working at geographically distant facilities is unclear. Two studies that compared face-to-face diagnosis with remote diagnosis found no differences in the diagnostic performances, indicating that electronically transmitted dermoscopic images convey the information necessary for differentiation between melanoma and non-melanoma.39,40 Future work should assess the value of a consensus diagnosis for electronically transmitted dermoscopic images. The prevalence of melanoma was inversely correlated with the diagnostic accuracy of dermoscopy. A possible interpretation of this finding is that if more melanomas are included in a sample, the overall diagnostic difficulty of the sample is increased. Another explanation could be differences in the criteria applied to select the lesions between the studies. Since the original reports by Pehamberger, Steiner, and colleagues,3,9,35 describing the use of pattern analysis for the dermoscopic assessment of pigmented skin lesions, several diagnostic algorithms have been developed. Pattern analysis relies on the description of several dermoscopic features, which can be difficult for non-experts to recognise. Scoring systems are simplified versions of pattern analysis with a limited number of dermoscopic features. The ABCD rule is somewhat different from the other algorithms because the exact description of the dermoscopic features is not so important. Pattern analysis requires a sufficient amount of training,6 whereas the other, simpler, diagnostic algorithms might be more suitable for less experienced examiners.4,12 In our analysis, all algorithms did equally well. Pattern analysis showed slightly better diagnostic accuracy than the other algorithms but the differences were not statistically significant. As expected, the diagnostic performance of pattern analysis was strongly influenced by the experience of the examiners. 
Surprisingly, this was also true for scoring systems. One explanation might be that, as for pattern analysis, the recognition of dermoscopic features is crucial for the diagnostic procedure. Compared with pattern analysis and scoring systems, the degree of experience had less influence on the diagnostic ability of the ABCD rule for dermoscopy, which suggests that this algorithm is especially suitable for beginners in dermoscopy. As shown by the SROC curves in Figure 3, the diagnostic accuracy of dermoscopy does not reach 100% even under the assumption of optimum conditions, indicating that dermoscopy cannot replace histopathology. However, dermoscopy may provide useful additional information for the histopathologist in difficult cases. Soyer and colleagues showed that clinicopathological correlation of pigmented skin lesions by dermoscopy is useful for dermatopathologists when reporting on melanocytic skin lesions.41 Dermoscopy and histopathology should be regarded as concurrent examinations of a joint diagnostic procedure with additive information. The summary estimates of the diagnostic accuracy of dermoscopy provided by this meta-analysis have to be interpreted with caution because the results of most studies were potentially influenced by verification bias, which is likely to occur when the decision to proceed with the
reference test (histopathology) partly depends on the results of the clinical diagnosis. Suspect clinical findings are more likely to be investigated by histopathology, so the chance of detecting a true positive is higher than that for a false negative and the chance for detecting a false positive is higher than that for a true negative. In this case, sensitivity seems to be falsely increased and specificity falsely decreased. Since most of the studies included in our meta-analysis were potentially influenced by verification bias, in general the sensitivity is probably overestimated and the specificity underestimated.

Another important issue that may have influenced our results is publication bias. This bias refers to the systematic error induced in a statistical analysis by the requirement for studies to be published. The influence of publication bias is difficult to assess. The most important question is whether our results can be explained solely by its presence. We think that this is unlikely, because generally publication bias arises because studies with statistically significant results are more likely to be published than those with non-significant results, but only a few studies included in our analysis provided a direct statistical comparison of the diagnostic accuracy with and without dermoscopy. However, publication bias cannot be ruled out completely and may have influenced the results of our analysis towards an overoptimistic estimate of the diagnostic accuracy of dermoscopy.

Conclusion
Dermoscopy improves the diagnostic accuracy for melanoma in comparison with inspection by the unaided eye. However, dermoscopy requires sufficient training and cannot be recommended for untrained users. A consensus diagnosis involving two or more experts is recommended to yield the highest possible diagnostic accuracy.

Search strategy and selection criteria
Relevant studies were identified and retrieved by a search of MEDLINE for the period January 1987 to December 2000, by manual searches of the reference lists of retrieved articles, and by direct communication with experts on this topic. The terms “epiluminescence”, “dermoscopy”, “dermatoscopy”, and “incident light microscopy” were linked with a Boolean OR operator and the search yielded 157 articles. 116 articles were excluded at this stage: those that were not relevant to the topic, did not address the diagnostic accuracy for melanoma, or were published in languages other than English or German, review articles, letters, and reports without original data. Additional articles were identified by manual searches of the reference lists of retrieved articles and by direct communication with experts. Articles that did not include original data on the diagnostic accuracy for melanoma and those that did not report sufficient data for the sensitivity and specificity to be estimated were excluded. Estimates of the diagnostic accuracy for melanoma involving computerised image analysis were also excluded from further analysis. The final sample included 27 studies, of which 20 were identified by the MEDLINE search, three by manual searches of the reference lists of retrieved articles, and four by communication with experts.

References
1 Grin CM, Kopf AW, Welkovich B, Bart RS, Levenstein MJ. Accuracy in the clinical diagnosis of malignant melanoma. Arch Dermatol 1990; 126: 763–66.
2 Argenziano G, Soyer HP. Dermoscopy of pigmented skin lesions: a valuable tool for early diagnosis of melanoma. Lancet Oncol 2001; 2: 443–49.
3 Pehamberger H, Binder M, Steiner A, Wolff K. In vivo epiluminescence microscopy: improvement of early diagnosis of melanoma. J Invest Dermatol 1993; 100: 356S–62S.
4 Binder M, Kittler H, Steiner A, et al. Reevaluation of the ABCD rule for epiluminescence microscopy. J Am Acad Dermatol 1999; 40: 171–76.
5 Binder M, Puespoeck-Schwarz M, Steiner A, et al. Epiluminescence microscopy of small pigmented skin lesions: short-term formal training improves the diagnostic performance of dermatologists. J Am Acad Dermatol 1997; 36: 197–202.
6 Binder M, Schwarz M, Winkler A, et al. Epiluminescence microscopy: a useful tool for the diagnosis of pigmented skin lesions for formally trained dermatologists. Arch Dermatol 1995; 131: 286–91.
7 Littenberg B, Moses LE. Estimating diagnostic accuracy from multiple conflicting reports: a new meta-analytic method. Med Decis Making 1993; 13: 313–21.
8 Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat Med 1993; 12: 1293–316.
9 Pehamberger H, Steiner A, Wolff K. In vivo epiluminescence microscopy of pigmented skin lesions: I, pattern analysis of pigmented skin lesions. J Am Acad Dermatol 1987; 17: 571–83.
10 Stolz W, Riemann A, Armand B, et al. ABCD rule of dermatoscopy: a new practical method for early recognition of melanoma. Eur J Dermatol 1994; 4: 521–27.
11 Nachbar F, Stolz W, Merkle T, et al. The ABCD rule of dermatoscopy: high prospective value in the diagnosis of doubtful melanocytic skin lesions. J Am Acad Dermatol 1994; 30: 551–59.
12 Argenziano G, Fabbrocini G, Carli P, et al. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions: comparison of the ABCD rule of dermatoscopy and a new 7-point checklist based on pattern analysis. Arch Dermatol 1998; 134: 1563–70.
13 Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and morphologic characteristics of invasive melanomas lacking specific surface microscopic features. Arch Dermatol 1996; 132: 1178–82.
14 Menzies SW, Crotty KA, McCarthy WH. The morphologic criteria of the pseudopod in surface microscopy. Arch Dermatol 1995; 131: 436–40.
15 Kenet RO, Fitzpatrick TB. Reducing mortality and morbidity of cutaneous melanoma: a six year plan. B) Identifying high and low risk pigmented lesions using epiluminescence microscopy. J Dermatol 1994; 21: 881–84.
16 Benelli C, Roscetti E, Dal PV. Reproducibility of a dermoscopic method (7FFM) for the diagnosis of malignant melanoma. Eur J Dermatol 2000; 10: 110–14.
17 Benelli C, Roscetti E, Pozzo VD, et al. The dermoscopic versus the clinical diagnosis of melanoma. Eur J Dermatol 1999; 9: 470–76.
18 Dal Pozzo V, Benelli C, Roscetti E. The seven features for melanoma: a new dermoscopic algorithm for the diagnosis of malignant melanoma. Eur J Dermatol 1999; 9: 303–08.
19 Bauer P, Cristofolini P, Boi S, et al. Digital epiluminescence microscopy: usefulness in the differential diagnosis of cutaneous pigmentary lesions: a statistical comparison between visual and computer inspection. Melanoma Res 2000; 10: 345–49.
20 Binder M, Steiner A, Schwarz M, et al. Application of an artificial neural network in epiluminescence microscopy pattern analysis of pigmented skin lesions: a pilot study. Br J Dermatol 1994; 130: 460–65.
21 Carli P, De Giorgi V, Naldi L, Dosi G. Reliability and interobserver agreement of dermoscopic diagnosis of melanoma and melanocytic naevi. Eur J Cancer Prev 1998; 7: 397–402.
THE LANCET Oncology Vol 3 March 2002
http://oncology.thelancet.com
For personal use. Only reproduce with permission from The Lancet Publishing Group.
22 Cristofolini M, Zumiani G, Bauer P, et al. Dermatoscopy: usefulness in the differential diagnosis of cutaneous pigmentary lesions. Melanoma Res 1994; 4: 391–94.
23 Dummer W, Doehnel KA, Remy W. Videomicroscopy in differential diagnosis of skin tumors and secondary prevention of malignant melanoma. Hautarzt 1993; 44: 772–76.
24 Feldmann R, Fellenz C, Gschnait F. The ABCD rule in dermatoscopy: analysis of 500 melanocytic lesions. Hautarzt 1998; 49: 473–76.
25 Kittler H, Seltenheim M, Pehamberger H, et al. Diagnostic informativeness of compressed digital epiluminescence microscopy images of pigmented skin lesions compared with photographs. Melanoma Res 1998; 8: 255–60.
26 Kittler H, Seltenheim M, Dawid M, et al. Morphologic changes of pigmented skin lesions: a useful extension of the ABCD rule for dermatoscopy. J Am Acad Dermatol 1999; 40: 558–62.
27 Krahn G, Gottlober P, Sander C, Peter RU. Dermatoscopy and high frequency sonography: two useful non-invasive methods to increase preoperative diagnostic accuracy in pigmented skin lesions. Pigment Cell Res 1998; 11: 151–54.
28 Lorentzen H, Weismann K, Petersen CS, et al. Clinical and dermatoscopic diagnosis of malignant melanoma assessed by expert and non-expert groups. Acta Dermatol Venereol 1999; 79: 301–04.
29 Lorentzen H, Weismann K, Kenet RO, et al. Comparison of dermatoscopic ABCD rule and risk stratification in the diagnosis of malignant melanoma. Acta Dermatol Venereol 2000; 80: 122–26.
30 Nilles M, Boedeker RH, Schill WB. Surface microscopy of naevi and melanomas: clues to melanoma. Br J Dermatol 1994; 130: 349–55.
31 Seidenari S, Pellacani G, Pepe P. Digital videomicroscopy improves diagnostic accuracy for melanoma. J Am Acad Dermatol 1998; 39: 175–81.
32 Soyer HP, Smolle J, Leitinger G, et al. Diagnostic reliability of dermoscopic criteria for detecting malignant melanoma. Dermatology 1995; 190: 25–30.
33 Stanganelli I, Serafini M, Cainelli T, et al. Accuracy of epiluminescence microscopy among practical dermatologists: a study from the Emilia-Romagna region of Italy. Tumori 1998; 84: 701–05.
34 Stanganelli I, Serafini M, Bucchi L. A cancer-registry-assisted evaluation of the accuracy of digital epiluminescence microscopy associated with clinical examination of pigmented skin lesions. Dermatology 2000; 200: 11–16.
35 Steiner A, Pehamberger H, Wolff K. In vivo epiluminescence microscopy of pigmented skin lesions: II, diagnosis of small pigmented skin lesions and early detection of malignant melanoma. J Am Acad Dermatol 1987; 17: 584–91.
36 Westerhoff K, McCarthy WH, Menzies SW. Increase in the sensitivity for melanoma diagnosis by primary care physicians using skin surface microscopy. Br J Dermatol 2000; 143: 1016–20.
37 Mayer J. Systematic review of the diagnostic accuracy of dermatoscopy in detecting malignant melanoma. Med J Aust 1997; 167: 206–10.
38 Bafounta ML, Beauchet A, Aegerter P, Saiag P. Is dermoscopy (epiluminescence microscopy) useful for the diagnosis of melanoma? Results of a meta-analysis using techniques adapted to the evaluation of diagnostic tests. Arch Dermatol 2001; 137: 1343–50.
39 Piccolo D, Smolle J, Argenziano G, et al. Teledermoscopy: results of a multicentre study on 43 pigmented skin lesions. J Telemed Telecare 2000; 6: 132–37.
40 Braun RP, Meier M, Pelloni F, et al. Teledermatoscopy in Switzerland: a preliminary evaluation. J Am Acad Dermatol 2000; 42: 770–75.
41 Soyer HP, Kenet RO, Wolf IH, et al. Clinicopathological correlation of pigmented skin lesions using dermoscopy. Eur J Dermatol 2000; 10: 22–28.