CAD in full-field digital mammography—influence of reader experience and application of CAD on interpretation of time


Clinical Imaging 34 (2010) 418 – 424

Christian Sohns a, Besim Cetin Angic b, Samuel Sossalla a, Frank Konietschke c, Silvia Obenauer b,⁎

a Department of Cardiology and Pneumology/Heart Center, Georg-August-University, Göttingen, Germany
b Department of Radiology, Georg-August-University Göttingen, Germany
c Department of Medical Statistics, Georg-August-Universität Göttingen, Germany

Received 28 August 2009; accepted 8 October 2009

Abstract

Aim: To assess how a computer-assisted detection (CAD) system affects the time needed to interpret early research, benign, and malignant mammograms, and how this effect depends on the readers' experience.

Materials and Methods: CAD (Image Checker V2.3; R2 Technology, Los Altos, CA, USA) was prospectively applied to digital mammograms of 303 patients [early research (n=103), benign (n=102), and malignant group (n=98)]. Three readers with varying experience in evaluating mammograms (medical student, resident, and attending) analyzed the mammograms according to the BI-RADS classification. Reading time was measured and recorded. All images were presented in random order, with and without CAD marks, across the different patient groups. To evaluate statistical significance, P values for reading time were calculated for each patient group with respect to the application of CAD, the readers' experience, and the interaction of reader and CAD application.

Results: Regardless of CAD application, the attending needed the least time, followed by the resident and the student. In all three patient groups, CAD prolonged the reading time of the student and the resident, whereas the attending's median reading time was the same with and without CAD. In the early research group, CAD application made no significant difference (P=.1343), readers' experience made a clearly significant difference (P<.0001), and the interaction of CAD application and readers' experience was not significant. In contrast, the malignant and benign groups showed significant effects of readers' experience and CAD application as well as a significant interaction between them.

Conclusion: The future role of CAD depends on whether its sensitivity can be increased and the time expenditure caused by false-positive marks can be decreased. In the future, a CAD system could substitute for second reading if the reader has wide professional experience.

© 2010 Elsevier Inc. All rights reserved.

Keywords: CAD; Digital mammography; Time expenditure

⁎ Corresponding author. Department of Radiology, UMG, Robert-Koch-Str. 40, 37099 Göttingen, Germany. Tel.: +49 0551 398965. E-mail address: [email protected] (S. Obenauer).
0899-7071/$ – see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.clinimag.2009.10.039

1. Introduction

The benefit and cost of computer-assisted detection (CAD) in mammography screening remain a topic of great interest in breast imaging. There is increasing evidence that screening mammography has a significant impact on reducing mortality by early detection of malignant lesions in women older than 40 years [1,2]. The efficiency of screening mammography in reducing mortality has thus far been limited by its sensitivity, which a number of studies have shown to range from 80% to 90% [3–5]. The development and implementation of CAD hold the potential to improve screening mammography by marking suspicious findings otherwise missed by radiologists. This increases detection rates and improves sensitivity.

Although there is a preponderance of evidence related to the usefulness of CAD as measured retrospectively in a laboratory setting [5–7], there are relatively few studies that prospectively examine the use of CAD in a working clinical environment [8–10]. Furthermore, there are no studies that provide "true" sensitivity and specificity of the use of CAD in the interpretation of screening mammography in a working clinical setting. Measuring the effect of CAD prospectively is time-consuming and expensive, and the low incidence of breast cancer in a screening population largely limits the statistical significance of findings.

In the last few years, CAD systems have become available for use with film-screen mammography. These systems require a digitizer that converts the analog image to a digital image before the CAD algorithms are applied [11–14]. For the past few years, CAD systems have also been available that process primary digital images and can be used directly on digital mammography images [15–18]. Recently, the Digital Mammographic Imaging Screening Trial found that the accuracy of full-field digital mammography is significantly higher than that of screen-film mammography in women younger than 50 years, in women with heterogeneously dense or extremely dense breasts at mammography, and in premenopausal or perimenopausal women [19,20].

An evident problem of CAD in clinical use remains the time consumed by the large number of false-positive marks. These marks may reduce the usefulness of CAD by distracting the interpreting reader [14–22]. Commercial systems therefore limit the maximum number of marks per patient.

The purpose of this study was to prospectively assess how a CAD system affects the time needed to interpret early research, benign, and malignant mammograms, depending on the readers' experience.

Table 1
Descriptive variables for reading time in seconds for each reader with and without CAD application (reader 1=resident, 2=student, 3=attending)

Group                  CAD      Reader  n    Median (s)  Lower limit (s)  Upper limit (s)  Variance
Benign group           Without  1       102  26.53       24.72            28.34            84.65
Benign group           Without  2       102  35.19       33.89            36.48            43.66
Benign group           Without  3       102  11.12       9.909            12.33            37.85
Benign group           With     1       102  28.24       26.76            29.71            56.06
Benign group           With     2       102  39.71       38.20            41.21            58.47
Benign group           With     3       102  10.93       9.735            12.13            37.07
Early research group   Without  1       103  22.60       20.87            24.33            78.52
Early research group   Without  2       103  30.10       28.77            31.42            45.97
Early research group   Without  3       103  7.466       6.528            8.404            23.02
Early research group   With     1       103  23.56       22.00            25.12            63.86
Early research group   With     2       103  32.12       30.72            33.52            50.74
Early research group   With     3       103  6.796       6.112            7.480            12.24
Malignant group        Without  1       98   27.54       25.85            29.23            72.94
Malignant group        Without  2       98   37.18       35.70            38.66            55.99
Malignant group        Without  3       98   11.37       9.624            13.12            77.39
Malignant group        With     1       98   30.43       28.60            32.26            84.77
Malignant group        With     2       98   43.00       41.59            44.41            50.00
Malignant group        With     3       98   11.32       9.773            12.87            60.77

2. Materials and methods

The CAD system (Image Checker V2.3; R2 Technology, Los Altos, CA, USA) was applied to digital mammograms of 303 patients, who were divided into three groups: early research (n=103), benign (n=102), and malignant (n=98). All 303 examinations were available in mediolateral oblique (MLO) and craniocaudal (CC) views. The mammograms were analyzed by three readers with varying experience in evaluating mammograms: a medical student, a resident, and an attending physician. Reading time was measured and recorded. All images were presented in random order, with and without CAD marks, across the different patient groups. Breast tissue was classified into four classes according to the ACR types, and the mammograms were categorized according to the BI-RADS classification. To evaluate statistical significance, P values for reading time were calculated for each patient group with respect to the application of CAD, the readers' experience, and the interaction of reader and method.

3. Results

Table 1 shows how reading time (median, lower and upper limit, variance) depends on the readers' experience in evaluating mammograms.

Table 2
Early research group (significance level P≤.05)

Effect                P value  Interpretation
Method (±CAD)         .1334    Not significant
Readers' experience   <.0001   Significant
Reader * method       .1001    Not significant

Table 3
Benign group (significance level P≤.05)

Effect                P value  Interpretation
Method (±CAD)         .0003    Significant
Readers' experience   <.0001   Significant
Reader * method       .0023    Significant


Table 4
Malignant group (significance level P≤.05)

Effect                P value  Interpretation
Method (±CAD)         <.001    Significant
Readers' experience   <.001    Significant
Reader * method       .0015    Significant

Tables 1–4 and Fig. 1 show that CAD significantly prolonged reading time for the resident and the student in all patient groups. For the attending, CAD did not significantly extend reading time. With and without CAD, the readers needed different amounts of time for the different patient groups. All readers needed the least time to interpret the early research group. The resident and the student needed the longest time for the mammograms of the malignant group, followed by the benign group. The attending needs less reading time for interpretation of the malignant group than for the benign group.

Fig. 1 presents the reading times graphically, categorized by reader and method. It also displays the median, the 25–75% quantile range, and the non-outlier range. The figure indicates that reading time depends on the readers' experience and on the patient group, and that it differs with and without the application of CAD.
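To make the CAD-related change in reading time concrete, the medians reported in Table 1 can be compared per reader and patient group. The following is a minimal Python sketch, not part of the study: the dictionary layout and the function name `cad_time_change` are illustrative assumptions, and only the median values are taken from Table 1.

```python
# Median reading times in seconds from Table 1, keyed by (group, reader).
# Each value is (median without CAD, median with CAD).
# The data layout and the function name are illustrative assumptions.
MEDIANS = {
    ("benign", "resident"): (26.53, 28.24),
    ("benign", "student"): (35.19, 39.71),
    ("benign", "attending"): (11.12, 10.93),
    ("early research", "resident"): (22.60, 23.56),
    ("early research", "student"): (30.10, 32.12),
    ("early research", "attending"): (7.466, 6.796),
    ("malignant", "resident"): (27.54, 30.43),
    ("malignant", "student"): (37.18, 43.00),
    ("malignant", "attending"): (11.37, 11.32),
}


def cad_time_change(group, reader):
    """Return (absolute change in s, relative change in %) of the median."""
    without_cad, with_cad = MEDIANS[(group, reader)]
    delta = with_cad - without_cad
    return round(delta, 2), round(100.0 * delta / without_cad, 1)


if __name__ == "__main__":
    for group, reader in MEDIANS:
        delta, percent = cad_time_change(group, reader)
        print(f"{group:15s} {reader:10s} {delta:+6.2f} s ({percent:+5.1f}%)")
```

With these medians, the resident and the student are slower with CAD in every group, while the attending's median changes by less than one second. The sketch compares medians only and says nothing about statistical significance, which is addressed by the P values in Tables 2–4.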

The following analyses give the P values, separately for the three patient groups, for the readers' experience, the reading method (with and without CAD), and the interaction of readers' experience and CAD use (Tables 2–4). In the early research group, reading time did not differ significantly with and without CAD (Table 2; P=.1343). Readers' experience made a significant difference (P<.0001). The interaction of CAD application and readers' experience was not significant. In contrast, the P values for the benign group (Table 3) and the malignant group (Table 4) show significant effects of readers' experience and CAD application, as well as a significant interaction between them.
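The interpretation column of Tables 2–4 follows from comparing each P value with the stated significance level of .05. A minimal Python sketch of that decision rule is given below. The names `P_VALUES`, `is_significant`, and `interpret` are illustrative assumptions, and values reported only as upper bounds (for example P<.0001) are stored as the bound itself, which is sufficient for a threshold comparison.

```python
ALPHA = 0.05  # significance level stated in Tables 2-4 (P <= .05)

# P values from Tables 2-4. Entries reported as "<x" are stored as the
# bound x; this is an assumption that does not affect the comparison.
P_VALUES = {
    "early research": {"method": 0.1334, "reader": 0.0001, "reader*method": 0.1001},
    "benign": {"method": 0.0003, "reader": 0.0001, "reader*method": 0.0023},
    "malignant": {"method": 0.001, "reader": 0.001, "reader*method": 0.0015},
}


def is_significant(p_value, alpha=ALPHA):
    """Apply the decision rule used in the tables: significant if P <= alpha."""
    return p_value <= alpha


def interpret(group):
    """Map each effect of a patient group to its table interpretation."""
    return {
        effect: "Significant" if is_significant(p) else "Not significant"
        for effect, p in P_VALUES[group].items()
    }


if __name__ == "__main__":
    for group in P_VALUES:
        print(group, interpret(group))
```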

Fig. 1. Distribution of reading time depending on patient group, reader, and CAD application.

Fig. 2. A 66-year-old woman from the benign group. (A) MLO view of the right breast with three false-positive marks for microcalcification. (B) Photographic magnification with false-positive marked vascular microcalcification.

Fig. 3. A 51-year-old woman from the early research group. (A) MLO view of the right breast with one false-positive marker for microcalcification. (B) Photographic magnification.

Fig. 4. A 65-year-old woman from the early research group. (A) MLO view of the right breast without marker and (B) MLO view of the left breast with one false-positive marker for mass. (C) Corresponding left MLO view without CAD marker. CC view without marker of the right breast (D) and of the left breast (E). (F) Corresponding CC view of the left side without marks.

4. Discussion

Mammography is fundamental to the diagnostic imaging of breast cancer and significantly reduces mortality [22,23]. In clinical routine, mammography is used as the only effective method for early detection of breast cancer, which improves the chance of recovery. Multiple studies have shown that applying a CAD system improves breast cancer detection. Sensitivity could be raised by CAD by about 12–19.5% [8,24–26]. Today, it is debated whether CAD systems could take on the role of a second reader in clinical routine [21,27,28]. A main disadvantage of CAD remains the high rate of false-positive marks [18,21,28–30] (Figs. 2–4). These marks considerably prolong mammogram reading time because numerous CAD marks can distract the reader.

Our study is a prospective analysis of patient groups with significant elongation of reading time by application of CAD (Tables 1–4; Fig. 1). The attending, who has long experience in reading mammograms, needed the shortest time with and without CAD, as expected. Notably, the experienced reader required nearly the same median reading time regardless of CAD application, and the statistical spread of time was also nearly equal. It seems that the attending is the reader least affected by using the CAD system. The resident required clearly more time for reading, and the range between minimal and maximal reading time was clearly wider than for the attending. The student needed the longest reading time, and CAD prolonged it further; median reading time was extended by CAD as well.

As mentioned before, the main factor prolonging reading time with CAD is the high rate of false-positive marks (Figs. 2–4). This has to be weighed against the positive effect of increased sensitivity with CAD (Fig. 5). The extent of the expected CAD effect depends on the sensitivity level and the rate of false-positive results. The sensitivity of screening mammography is about 85% [3,31]; consequently, there is a relatively large range for CAD-associated improvements. Given the high sensitivity of readers with long experience, only a small CAD-associated improvement can be expected for them. It follows that the maximal CAD effect is greater the lower the readers' experience. Accordingly, the reader with less experience should benefit the most from CAD, and it is useful to apply CAD as a learning aid in readers' training. In the study of Quek et al., the radiologist in training identified 257 of 294 tumors (87.4%) independently; in 21 cases (7.2%) CAD assisted; in 13 cases (4.4%) CAD pointed to a suspicious finding that the reader had not recognized; and in 3 instances (1%) a false-positive CAD marker caused confusion. In total, the sensitivity of the inexperienced radiologist could be significantly increased by CAD from 74.4% to 87.2% [24] (Fig. 5).

A further criterion for reading time is the patient group. In the early research group, there were no significant differences in reading time with and without CAD (P=.1343). In terms of readers' experience, there was a clearly significant difference in time expenditure (P<.0001). The combined influence of readers' experience and CAD application on time was not significant (Table 2). In comparison, the P values for the benign group (Table 3) and the malignant group (Table 4) were consistently significant for the use of CAD, the readers' experience, and the combination of experience and CAD application.

In the future, the adoption of CAD in clinical routine will depend on improved sensitivity for malignant lesions and on fewer false-positive marks through enhancements of CAD systems and their software. New CAD systems based on neural networks can additionally acquire data automatically and continuously, pooling information against which new mammograms can be compared. These systems are currently under scientific evaluation [32]. Finally, we conclude that, in spite of the reservations named above, CAD systems can already be an essential aid in the interpretation of mammograms. In the future, CAD might substitute for the human second reader. However, this depends essentially on the first reader's experience.
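The case counts quoted in the Discussion from Quek et al. [24] can be turned into proportions with a few lines of Python. This is only an arithmetic sketch: the category labels and the helper `share` are illustrative names, and the counts are those cited in the text.

```python
# Case counts quoted in the Discussion from Quek et al. [24].
# The category labels are illustrative paraphrases, not terms of that study.
COUNTS = {
    "identified independently": 257,
    "identified with CAD assistance": 21,
    "pointed out by CAD only": 13,
    "confusion from false-positive mark": 3,
}
TOTAL = 294  # total number of tumors quoted in the text


def share(count, total=TOTAL):
    """Percentage of all tumors, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    # The four categories should account for every tumor.
    assert sum(COUNTS.values()) == TOTAL
    for label, count in COUNTS.items():
        print(f"{label:36s} {count:3d} of {TOTAL} = {share(count)}%")
```

The four categories add up to the 294 tumors. Recomputed shares may differ from the quoted percentages in the last digit (21 of 294, for example, is about 7.1%).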


Fig. 5. A 55-year-old woman with histologically proven invasive ductal carcinoma of the right breast. (A) True-positive marker of the CAD system in the MLO view. (B) Photographic magnification. (C) True-positive marker for microcalcification in the CC view, also with photographic magnification (D). One additional false-positive marker for mass in the CC view.

References

[1] Fletcher SW, Elmore JG. Clinical practice: mammographic screening for breast cancer. N Engl J Med 2003;348:1672–80.
[2] United States Preventive Services Task Force. Screening for breast cancer: recommendations and rationale. Ann Intern Med 2002;137:344–6.
[3] Bird RE, Wallace TW, Yankaskas BC. Analysis of cancers missed at screening mammography. Radiology 1992;184:613–7.
[4] Goergen SK, Evans J, Cohen GP, MacMillan JH. Characteristics of breast carcinomas missed by screening radiologists. Radiology 1997;204:131–5.
[5] Warren Burhenne LJ, Wood SA, D'Orsi CJ, et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000;215:554–62.
[6] Birdwell RL, Ikeda DM, O'Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed carcinomas later detected with screening mammography and the potential utility of computer-aided detection. Radiology 2000;219:192–202.
[7] Brem RF, Baum J, Lechner M, et al. Improvement in sensitivity of screening mammography with computer-aided detection: a multiinstitutional trial. AJR Am J Roentgenol 2003;181:687–93.
[8] Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology 2001;220:781–6.
[9] Gur D, Sumkin JH, Rockette HE, et al. Changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004;96:185–90.
[10] Birdwell RL, Bandodkar P, Ikeda DM. Computer-aided detection with screening mammography in a university hospital setting. Radiology 2005;236:451–7.
[11] Astley SM. Evaluation of computer-aided detection (CAD) prompting techniques for mammography. Br J Radiol 2005;78:20–5.
[12] Vyborny CJ. Can computers help radiologists read mammograms? Radiology 1994;191:315–7.
[13] Doi K, MacMahon H, Katsuragawa S, Nishikawa RM, Jiang Y. Computer-aided diagnosis in radiology: potential and pitfalls. Eur Radiol 1999;31:97–109.
[14] Funovics M, Schamp S, Helbich TH, Lackner B, Wunderbaldinger P, Fuchsjäger M, Lechner G, Wolf G. Evaluation of a computer-assisted diagnosis system in breast carcinoma. Röfo 2001;173:218–23.
[15] Muller S. Full field digital mammography designed as a complete system. Eur Radiol 1999;31:25–34.
[16] Bick U. Full-field digital mammography. Röfo 2000;173:957–64.
[17] Obenauer S, Luftner-Nagel S, von Heyden D, Munzel U, Baum F, Grabbe E. Screen-film vs full-field digital mammography: image quality, detectability and characterization of lesions. Eur Radiol 2002;12:1697–702.


[18] Baum F, Fischer U, Obenauer S, Grabbe E. Computer-aided detection in direct digital full-field mammography: initial results. Eur Radiol 2002;12:3015–7.
[19] Pisano ED, Gatsonis C, Yaffe MJ, et al. American College of Radiology Imaging Network digital mammographic imaging screening trial: objectives and methodology. Radiology 2005;236:404–12.
[20] Pisano ED, Gatsonis C, Hendrick E, et al. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med 2005;353:1773–83.
[21] Obenauer S, Sohns C, Werner C, Grabbe E. Retrospektive Analyse eines computerassistierten Detektions-Systems (CAD) in der digitalen Vollfeldmammographie in Abhängigkeit von der Histologie. Rofo 2005;177:1103–9.
[22] Shapiro S, Strax P, Venet L. Periodic breast cancer screening in reducing mortality from breast cancer. JAMA 1971;215:1777–85.
[23] Tabar L, Fagerberg CJ, Gad A, Baldetorp I, Holmberg LH, Grontoft O, Ljungquist U, Lundstrom B, Manson JC, Eklund G. Reduction in mortality from breast cancer after mass screening with mammography. Lancet 1985;2:829–32.
[24] Quek ST, Thng CH, Khoo JB, Koh WL. Radiologists' detection of mammographic abnormalities with and without a computer-aided detection system. Australas Radiol 2003;47:257–60.
[25] Cupples TE. Impact of computer-aided detection (CAD) in a regional screening mammography program. Radiology 2001;221:221–520.
[26] Bandodkar P, Birdwell RL, Ikeda DM. Computer aided detection (CAD) with screening mammography in an academic institution: preliminary findings. Radiology 2002;202:225–458.
[27] Thurfjell EL, Lernevall KA, Taube AA. Benefit of independent double reading in a population-based mammography screening program. Radiology 1994;191:241–4.
[28] Obenauer S, Sohns C, Werner C, Grabbe E. Computer-aided detection in full-field digital mammography: detection in dependence of the BI-RADS categories. Breast J 2006;12:16–9.
[29] Castellino RA, Roehring JR. The promise of computer aided detection in digital mammography. Eur Radiol 1999;31:35–9.
[30] Malich A, Sauner D, Marx C, Facius M, Boehm T, Pfleiderer SO, Fleck M, Kaiser WA. Influence of breast lesion size and histologic findings on tumor detection rate of a computer-aided detection system. Radiology 2003;228:851–6.
[31] Baines CJ, Miller AB, Wall C, McFarlande DV, Simor IS, Jong R, Shapiro BJ, Audet L, Petitclerc M, Ouimet-Oliva D. Sensitivity and specificity of first screen mammography in the Canadian national breast screen study: a preliminary report from five centers. Radiology 1986;160:295–8.
[32] Elter M, Schulz-Wendtland R, Wittenberg T. The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Med Phys 2007;23:4164–72.