CAD-associated Reader Error in CT Colonography

Vadim S. Koshkin, BA, J. Louis Hinshaw, MD, Kristen Wroblewski, MS, Abraham H. Dachman, MD

Rationale and Objectives: Computed tomographic colonographic interpretation with computer-aided detection (CAD) may be superior to unaided viewing, although polyp characteristics may influence accuracy. Reader error due to polyp characteristics was evaluated in a multiple-case, multiple-reader trial of computed tomographic colonography with CAD.

Materials and Methods: Two experts retrospectively reviewed 52 positive cases (74 polyps) and categorized them as hard, moderate, or easy to detect. Each case was evaluated without and with CAD. Features that may influence a reader’s ability to detect a polyp or to accept or reject a CAD mark were tabulated. The association between polyp characteristics and detection rates in the trial was assessed. The difference in detection rates (CAD vs unassisted) was calculated, and regression analysis was performed.

Results: Of 64 polyps found by CAD, experts categorized 20 as hard, 28 as moderate, and 16 as easy to detect. Reader characterization errors predominated (47.3%) over other errors. Factors associated with lower detection rates included small size, flat morphology, and resemblance to a thickened fold. CAD was superior for polyps resembling lipomas compared to those that did not resemble lipomas (average increase in detection rate with CAD, 12.8% vs 5.5%; P < .05).

Conclusions: Polyp characteristics may impair computed tomographic colonographic interpretation augmented by CAD. Readers can avoid errors of measurement by evaluating diminutive polyp candidates with sample measurements. Caution should be taken when evaluating focally thick folds and when using visual impression to dismiss a polyp candidate as a lipoma when it is submerged in densely tagged fluid.

Key Words: CAD; colon polyps; colorectal cancer screening; computer-aided detection; CT colonography; virtual colonoscopy.

©AUR, 2012
Computed tomographic (CT) colonography (or “virtual colonoscopy”) is an examination of the colon to detect polyps and masses that is endorsed by the American Cancer Society for colorectal cancer screening. Substantial training is required to learn how to interpret CT colonographic studies (1,2), and despite the reported high sensitivity for polyps ≥6 mm in size in some large screening trials (3–7), sensitivity has been lower in other trials (8,9). This variability has been attributed to many factors, but the largest factor seems to be observer errors due to missing polyps that could potentially have been recognized (10). However, certain errors, such as those due to immobile stool that is untagged by oral contrast, are unavoidable. Other sources of error in reading CT colonographic studies involve technique-related or patient-related factors (eg, collapsed bowel, polyps obscured by untagged residual fluid, poor scanning technique, respiratory motion). Small lesions and plaquelike flat lesions are also potential sources of error (11).

Acad Radiol 2012; 19:801–810

From the Department of Radiology (V.S.K., A.H.D.), MC2026, and the Department of Health Studies (K.W.), MC2007, The University of Chicago Medical Center, 5841 S Maryland Avenue, Chicago, IL 60637; and the Department of Radiology, University of Wisconsin Hospital and Clinics, 600 Highland Ave, Madison, Wisconsin (J.L.H.). Received December 22, 2011; accepted February 9, 2012. Address correspondence to: A.H.D. e-mail: [email protected]

©AUR, 2012 doi:10.1016/j.acra.2012.03.008

Computer-aided detection (CAD) has been developed and tested as a means to improve reader sensitivity. CAD systems detect raised areas suspicious for polyps and then use a set of classifiers to reduce false-positives (eg, analyzing the texture of a polyp candidate to determine if it is stool) (12). CAD has high sensitivity in stand-alone trials (ie, comparing optical colonoscopic truth to CAD software output without a human reader) (11,13–16). When CAD is used by a human reader, the reader may accept or reject a CAD hit, so the ultimate benefit of CAD depends on whether it helps or hinders the radiologist interpreting an exam (17–21). For example, polyps with irregular surface shapes have been shown to be more likely to be dismissed as not polyps even when detected by CAD (17). Consequently, CAD observer trials are important in proving the benefit of CAD-assisted reading (17–21).

In a recent multiple-reader, multiple-case (MRMC) trial of CT colonography, 19 readers interpreted 100 cases in two ways: once without CAD and once using CAD to modify the final interpretation after first reading the case without CAD (22). The readers’ average sensitivity for detecting patients with polyps ≥6 mm in size improved by 0.055 (11.8%) with CAD, and that for detecting patients with polyps of 6 to 9 mm in size improved by 0.066 (19.9%) with CAD but was accompanied by a 0.025 (2.7%) decrease in specificity. The purpose of the present study was to evaluate reader errors due to the imaging characteristics of polyps and CAD prompts on the basis of the MRMC trial. The readers’
KOSHKIN ET AL
Academic Radiology, Vol 19, No 7, July 2012
TABLE 1. Characteristics of Polyps Recorded in the MRMC Trial

                          Unassisted                CAD-assisted
                          Detection                 Detection
Polyp Characteristic      Rate (%)*       P         Rate (%)*       P
CAD hit                                   <.01                      <.001
  No (n = 10)             6.8 ± 9.9                 7.4 ± 7.5
  Yes (n = 64)            47.7 ± 35.6               54.0 ± 33.5
Morphology                                <.001                     <.001
  Sessile (n = 47)        44.5 ± 32.8               52.0 ± 31.5
  Pedunculated (n = 12)   75.9 ± 34.0               78.1 ± 30.3
  Flat (n = 5)            10.5 ± 12.3               15.8 ± 12.9
Adenoma                                   .016                      .020
  No (n = 23)             38.2 ± 32.3               42.6 ± 33.6
  Yes (n = 41)            53.0 ± 36.6               60.5 ± 32.1
Segment                                   .073                      .037
  Rectum (n = 10)         38.9 ± 40.3               41.6 ± 35.6
  Sigmoid (n = 12)        49.1 ± 33.4               57.0 ± 29.3
  Descending (n = 10)     64.2 ± 30.2               72.6 ± 29.5
  Transverse (n = 11)     44.0 ± 35.9               51.7 ± 27.7
  Ascending (n = 15)      52.6 ± 36.5               61.1 ± 33.5
  Cecum (n = 6)           26.3 ± 35.7               24.6 ± 38.8
Size (mm)                                 <.001                     <.001
  6–9 (n = 44)            35.9 ± 30.8               43.8 ± 29.3
  ≥10 (n = 20)            73.7 ± 31.9               76.6 ± 31.7

CAD, computer-aided detection; MRMC, multiple-reader, multiple-case. Data are expressed as mean ± standard deviation. The mean is the average detection rate among all polyps with that characteristic. The P values reported are from the comparison of mean detection rates across categories of each polyp characteristic using a generalized estimating equation model.
*Percentage of the 19 readers detecting the polyp.

false-negative interpretations (ie, detection rates) were correlated with polyp characteristics and difficulty ratings to explain why the polyps were missed or why, when CAD found polyps, the CAD “hits” were ignored by the readers.

MATERIALS AND METHODS

The contributing institutions to the MRMC trial performed the exams under approval of their local institutional review boards and contributed anonymized cases in compliance with Health Insurance Portability and Accountability Act guidelines. This study was performed under an institutional review board waiver. One investigator (blinded) was a paid consultant to the MRMC trial sponsor (iCAD, Inc, Nashua, NH) and was one of several radiologists who defined the reference standard. The MRMC study sponsor did not fund or participate directly in the study reported herein.

All 100 CT colonographic studies (52 positive) as well as data regarding polyp characteristics were obtained. Patients’ average age was 57.9 years (range, 46–74 years); 56 patients were men, 39 were women, and five did not have age or sex available in the contributing site records. All patients underwent CT colonography with saline cathartic bowel preparation, oral contrast for fluid and stool tagging (combined hypertonic water soluble contrast plus barium), rectal insufflation with room air, and no spasmolytic, and all exams were of diagnostic quality. For the 52 patients
with positive results, the average age was 57.5 years (range, 50–70 years); 29 patients were men, 21 were women, and age and gender were unknown in two.

The original MRMC trial interpretations were performed by 19 board-certified radiologists (interpreting an average of 92 CT colonographic cases per year with an average of 5 years of experience) selected from both academic and community environments. All readers interpreted each case in both an unassisted read session and a session in which they were first able to consider the case without CAD, immediately followed by a read with CAD turned on (a study design to minimize reader bias). The latter session, in which each reader had the opportunity to either accept or disregard CAD polyp marks, was termed the “CAD-assisted read.” The order of the sessions was randomized, and the sessions were separated by a minimum of 27 days (range, 27–58 days). The readings were performed on a Viatronix V3D-Colon System (Viatronix, Inc, Stony Brook, NY) with CAD software (VeraLook 1.0; iCAD, Inc). Readers interpreted cases using the method that they typically used in clinical practice, either a primary two-dimensional (2D) read with three-dimensional (3D) problem solving or vice versa. They had access to standard workstation functionality but not to electronic cleansing. The readers rated their confidence that a finding was an actionable polyp or mass using a 100-point scale, with a rating of ≥51 regarded as positive, and were instructed to report only lesions measuring
Figure 1. Male patient, aged 61 years. Error of detection. Computer-aided detection (CAD) miss of 8-mm flat polyp in the cecum. Consensus expert opinion established difficulty rating as hard. Detected by none of 19 readers during unassisted read and none of 19 readers during CAD-assisted read. Highly magnified axial view from a prone scan showing white, densely tagged residual fluid outlining a small, sessile, lobular mucosal, soft tissue density irregularity consistent with a polyp (arrows). This polyp was not visible on a standard 120° endoluminal fly-through and was categorized in retrospective review as difficult to detect.
≥6 mm. The polyps detected by readers were matched, using the spatial coordinates of the polyps’ locations, with the polyps determined to be the reference standard, as detailed in that report (22).

Study Design
The original MRMC study reports and CT colonographic workstations with specialized CT colonographic interpretation software were used. Two workstations were necessary, one to reread or view the cases and one to view the reference standard used for CAD. Using the “reader computer” for directed reads (with full functionality as available to original readers, including display of the CAD output, which could be used to view and read the case as would be done in normal clinical practice), and the “truthing computer” for the reference standard (which showed the pixels constituting the proven polyp in color as defined and proven by the original trial), an unblinded retrospective review of all 52 positive cases (74 polyps) was performed. Two expert readers (>1000 cases of experience each) independently performed the review. Neither of these radiologists participated as readers in the initial MRMC trial. All divergent opinions were resolved by consensus. Each of the 74 polyps was classified according to the percentage of readers (of 19) who correctly detected the polyp during the unassisted read and separately during the CAD-assisted read. A number of polyp features (as determined by the MRMC reference standard) were available and considered in the analysis, including polyp morphology, colonic segment, histology, and size (single longest dimension; Table 1). In addition, information on whether each polyp was
SOURCES OF CTC CAD-ASSISTED READER ERRORS
Figure 2. Male patient, aged 58 years. Error of characterization. Computer-aided detection (CAD) hit of a 10-mm sessile polyp in the descending colon. Consensus expert opinion established difficulty rating as moderate. Detected by 15 of 19 readers on unassisted read and 18 of 19 readers on CAD-assisted read. Highly magnified axial prone view shows dense residual fluid partially coating a soft tissue lesion consistent with a polyp (arrows). This lesion was categorized as having surface tagging in retrospective review, ordinarily a helpful sign in detecting polyps. Note that because of a partial volume effect, some of the high density appears to be within the polyp, rather than just on the surface. This can mislead the reader to judge the polyp candidate as stool.
detected by CAD in the supine and prone views and the number of CAD false-positives for the case were available. Each polyp was evaluated by the experts for characteristics that could potentially influence the ability of readers to detect the polyp or affect the likelihood of the readers deciding to accept or reject a CAD mark (CAD hit). Using directed reads and image sets on the reader and truthing computers, the two experts evaluated all 74 polyps. The relevant features sought were (1) whether the polyp was submerged in fluid, reported separately for supine and prone views (not submerged, partially submerged, completely submerged); (2) tagging suggesting stool: the extent and location of hyperdense tagging on the lesion in question suggestive of stool (yes or no); (3) position change: whether colonic movement produced a change in a polyp’s perceived location (yes or no); (4) subjective fatlike density: whether the polyp resembled a lipoma (eg, similar to omental, mesenteric, or subcutaneous fat when viewed on CT colonography or soft tissue window and level settings) (yes or no); (5) resemblance to a thickened fold: whether the polyp might be mistaken for a normal but thickened fold when viewed in standard fly-through or 2D read (yes or no); (6) the position of the polyp relative to fold or ileocecal valve (on a fold, between folds, on the ileocecal valve, or not applicable if not close to any folds); (7) visibility on a 3D fly-through, only for polyps classified as being on a fold or ileocecal valve (not visible or visible in at least one direction); and (8) the difficulty of polyp detection
Figure 3. (a) Axial image and (b) 3D endoluminal view. Female patient, aged 55 years. Error of measurement. Computer-aided detection (CAD) hit of a 7-mm sessile polyp in the sigmoid colon. Consensus expert opinion established difficulty rating as moderate. Detected by 14 of 19 readers on unassisted read and 13 of 19 readers on CAD-assisted read. The measurement of lesions near the 6-mm threshold is more subjective and subject to error. Choosing the axial image to create a measurement results in a long axis of 4.1 mm, yet an optimized three-dimensional view measurement results in a 6.1-mm long-axis measurement.
as rated by experts on the basis of overall interpretation of features such as characteristics and conspicuity of the polyp and the quality of the exam, including the presence of false-positive CAD hits, the quality of the preparation for the given case, and appearance on different views (overall three-point scale: easy, moderate, or hard).

Evaluation of Potential Sources of Error
On the basis of the characteristics of polyps, the sources of error that might contribute to a reader missing a given polyp on either unassisted read or CAD-assisted read were analyzed. Specifically, the experts were asked what about a given polyp would cause it to be missed by any reader (regardless of its actual detection rate in the MRMC trial). Potential errors were divided into four categories (Figs 1–3): (1) error of detection, if the polyp was likely not seen by the reader or skipped altogether; (2) error of characterization, if it was presumed that the lesion in question was likely found by the reader as a potential polyp but dismissed as something else or resulting from the inability to match up the polyp on both supine and prone views; (3) error of measurement, if the experts suspected that the lesion was dismissed when the readers erroneously measured the polyp as <6 mm (whereas the radiologists determining the reference standard measured it as ≥6 mm) and followed guidelines in not considering it a polyp; and (4) easily conspicuous polyps with no evident reason for the error, categorized as “unknown” errors. A polyp could fall into more than one of these categories, and during the retrospective review, the experts were required to select the single error most likely to contribute to lower detection rates in addition to assessment of the main polyp characteristics listed above and in Table 2.

Statistical Analyses
A detection rate was calculated for each polyp for both the unassisted and CAD-assisted reads as a measure of the likelihood that a given polyp would be missed by a reader (false-negative). The detection rate was defined as the percentage of readers who found the polyp. With the exception of the analysis comparing polyps that were CAD hits (detected by CAD on at least one view) to those that were CAD misses, only the 64 polyps that were CAD hits were included in the analysis. A secondary analysis was performed that included all 74 polyps (Fig 4). The primary analyses performed were of two types. First, the association between polyp characteristics and detection rates was assessed for each study session (unassisted and CAD-assisted reads) separately. Then, the difference in detection rates (CAD-assisted − unassisted) was used as the dependent variable to examine whether the effect of CAD varied depending on the characteristics of the polyp. Generalized estimating equation regression (23) with robust standard errors was performed, treating detection rates or change in detection rates as the dependent variable and accounting for the clustering in the data (ie, cases could have multiple polyps). An exchangeable working correlation structure was assumed. Because the detection rates did not necessarily follow a normal distribution, the arcsin transformation (sin⁻¹[sqrt(detection rate)]) was also considered. Only the results from analyses using the untransformed data are presented, because use of the arcsin transformation resulted in similar conclusions. Secondary analyses examined associations between error type and several polyp characteristics using generalized estimating equation regression. All statistical analyses were performed using Stata version 10 (StataCorp LP, College Station, TX). P values < .05 were considered statistically significant. No adjustment for multiple comparisons was made.
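As a minimal sketch of the quantities defined above, the following Python computes a per-polyp detection rate as the share of the 19 readers who found the polyp, the CAD-assisted minus unassisted difference, and the arcsine square-root transformation. The function names are hypothetical and chosen only for illustration; the example counts are those reported for the polyp in Figure 2 (15 and 18 of 19 readers). The generalized estimating equation regression itself, which was run in Stata, is not reproduced here.

```python
import math

N_READERS = 19  # number of readers in the MRMC trial, per the text


def detection_rate(n_detected, n_readers=N_READERS):
    """Fraction of readers who found the polyp (0.0 to 1.0)."""
    if not 0 <= n_detected <= n_readers:
        raise ValueError("n_detected must lie between 0 and n_readers")
    return n_detected / n_readers


def arcsine_sqrt(rate):
    """Arcsine square-root transform asin(sqrt(rate)), in radians."""
    return math.asin(math.sqrt(rate))


def cad_effect(n_unassisted, n_assisted, n_readers=N_READERS):
    """Change in detection rate: CAD-assisted minus unassisted."""
    return (detection_rate(n_assisted, n_readers)
            - detection_rate(n_unassisted, n_readers))


# Polyp from Figure 2: found by 15 of 19 readers unassisted, 18 of 19 with CAD.
unassisted = detection_rate(15)
assisted = detection_rate(18)
print(f"unassisted:   {100 * unassisted:.1f}%")           # 78.9%
print(f"CAD-assisted: {100 * assisted:.1f}%")             # 94.7%
print(f"difference:   {100 * cad_effect(15, 18):+.1f}%")  # +15.8%
print(f"transformed unassisted rate: {arcsine_sqrt(unassisted):.3f} rad")
```

The transform maps rates of 0 and 1 to 0 and π/2 radians, which stretches the ends of the scale where many of the per-polyp rates in this data set lie.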
RESULTS

Associations between Polyp Characteristics and Detection Rates
Polyp features that could influence readers’ ability to detect polyps are summarized in Tables 1 and 2. Note that other miscellaneous features were rarely found and thus are not included here (eg, stool-like texture, extreme mobility of a pedunculated polyp, polyp location on a fold and proximity to a synchronous polyp). Polyps not detected by CAD on either view (CAD misses; Table 3) were significantly more likely, compared to CAD hits, to be missed by readers in both the unassisted and CAD-assisted reads. Polyps most likely
TABLE 2. Characteristics of Polyps on the Basis of the Retrospective Analysis

                                  Unassisted                CAD-assisted
                                  Detection                 Detection
Polyp Characteristic*             Rate (%)†       P         Rate (%)†       P
Position relative to fold                         .738                      .598
  Between folds (n = 21)          49.1 ± 36.1               59.9 ± 31.3
  On a fold (n = 33)              48.0 ± 33.3               52.3 ± 33.5
  Far from fold (n = 10)          43.7 ± 44.6               47.4 ± 39.2
Resembles thick fold                              .002                      <.001
  No (n = 56)                     51.2 ± 35.7               58.8 ± 32.2
  In at least one view (n = 8)    23.0 ± 24.2               20.4 ± 21.3
Polyp changed position                            .872                      .976
  No (n = 41)                     50.7 ± 35.0               56.2 ± 34.8
  Yes (n = 20)                    48.2 ± 35.9               55.5 ± 30.0
Difficulty                                        <.001
  Easy (n = 16)                   86.2 ± 14.5               89.5 ± 12.0
  Moderate (n = 28)               50.0 ± 29.8               56.4 ± 27.2
  Hard (n = 20)                   13.7 ± 17.3               22.4 ± 20.8
Lipoma like                                       .257                      .143
  No (n = 57)                     48.4 ± 35.8               53.9 ± 33.6
  Yes (n = 7)                     42.1 ± 36.2               54.9 ± 35.7
Tagging                                           .020                      .011
  No (n = 25)                     36.2 ± 35.3               44.0 ± 31.1
  Yes (n = 39)                    55.1 ± 34.2               60.5 ± 33.8
Visible on supine FOV                             .336                      .721
  No (n = 10)                     41.1 ± 37.3               51.6 ± 34.5
  In at least one view (n = 54)   48.9 ± 35.5               54.5 ± 33.6
Visible on prone FOV                              .598                      .312
  No (n = 12)                     34.2 ± 40.2               39.5 ± 39.8
  In at least one view (n = 52)   50.8 ± 34.1               57.4 ± 31.4
Submerged supine                                  .391                      .611
  No (n = 49)                     49.4 ± 35.2               55.4 ± 33.1
  Partially (n = 7)               34.6 ± 38.1               41.4 ± 36.5
  Completely (n = 7)              54.1 ± 37.6               61.7 ± 35.1
Submerged prone                                   .007                      .125
  No (n = 47)                     48.5 ± 35.6               55.0 ± 32.6
  Partially (n = 7)               75.9 ± 16.9               79.7 ± 21.4
  Completely (n = 8)              30.3 ± 31.5               36.8 ± 34.5
Main error                                        <.001                     <.001
  Characterization (n = 32)       62.3 ± 32.9               67.8 ± 32.0
  Detection (n = 20)              21.8 ± 25.1               30.0 ± 24.2
  Measurement (n = 9)             36.3 ± 25.4               43.3 ± 18.6

CAD, computer-aided detection; FOV, field of view. Data are expressed as mean ± standard deviation. The mean is the average detection rate among all polyps with that characteristic. The P values reported are from the comparison of mean detection rates across categories of each polyp characteristic using a generalized estimating equation model.
*Numbers may not sum to 64, because of missing or not applicable data.
†Percentage of the 19 readers detecting the polyp.

to be missed by readers (1) were flat as opposed to sessile or pedunculated, (2) were small (6–9 mm) as opposed to large (≥10 mm), (3) were nonadenomatous, (4) resembled a thick fold in at least one view (Fig 5), (5) lacked dense surface tagging, as opposed to tagged polyps (Fig 2), (6) were completely submerged on prone view, (7) were classified as hard as opposed to moderate or easy, and (8) were determined to have a main error of detection or measurement as opposed to characterization. Generally, differences in detection rates on the basis of these polyp features were statistically significant in both the unassisted and CAD-assisted reads. An important exception, however, was polyps submerged on prone view, for which CAD-assisted read results only trended toward significance. Similarly, differences in detection rates according to colonic segment were statistically significant only in the CAD-assisted read. Results of the secondary analysis including all 74 polyps were similar in most cases (data not shown). There were two exceptions. First, the difference in detection rates on the basis of histology, although similar in magnitude, was no longer statistically significant in either the unassisted or CAD-assisted read (eg, a mean of 53.6% for adenomas vs
37.4% for nonadenomas in the CAD-assisted read, P = .080). Second, the difference in detection rates on the basis of resemblance to a thickened fold was no longer significant in the unassisted read (a mean of 23.0% for those that did vs 44.5% for those that did not, P = .145).

Effect of CAD on Detection Rates

The magnitude of improvement in detection rates with the CAD-assisted read compared to the unassisted read varied significantly on the basis of several polyp characteristics (Tables 1 and 2). With CAD on, there was greater improvement for polyps correctly labeled by CAD (CAD hits) compared to that for polyps missed by CAD (CAD-assisted − unassisted detection rates, +6.3% and +0.5%, respectively; P < .05). Similarly, there was greater improvement with CAD for polyps resembling lipomas (Fig 6) compared to polyps that did not (+12.8% vs +5.5%, respectively, P < .05) and for polyps that did not resemble thick folds compared to polyps that did (+7.6% vs −2.6%, respectively, P < .05).

Main Sources of Error Evaluated in the MRMC Trial

The two experts independently agreed on the main source of error in about 92% of cases and easily achieved consensus in the remaining 8% of cases. All but one of the polyps missed by CAD were classified as hard, whereas all polyps that were classified as easy were detected by CAD in at least one view (Fig 4). The predominant source of error found was characterization (35 of 74 [47.3%], Table 4). Three polyps did not have identifiable sources of error and were categorized as unknown. Error type showed a significant association with polyp size (P = .004), as small polyps were most frequently errors of detection and large polyps were most commonly errors of characterization. There was also a significant relationship between difficulty level and error type (P = .010), with polyps deemed to be easy or moderate most frequently considered errors of characterization, while approximately 65% of polyps deemed hard were errors of detection.

Figure 4. Flowchart illustrating the number of patients and polyps from the original multiple-reader, multiple-case trial and the final retrospective experts’ consensus categorization of the overall degree of difficulty for detecting the polyps. CAD, computer-aided detection.

DISCUSSION

In this study, we evaluated the characteristics of polyps and their appearance on 2D and 3D images that might explain why some of the 19 readers in an MRMC trial of CT colonography might have missed a polyp or dismissed it after viewing the CAD hit list. The consensus opinion of two experts categorizing the difficulty of polyp detection as hard, moderate, or easy (in which there was high independent agreement of the two experts even prior to consensus review) showed that in this cohort, the ratio of hard to easy or moderate cases was unusually high. In total, 29 polyps were considered hard, 29 moderate, and 16 easy among all 74 polyps ≥6 mm in size. Of the 64 found by CAD, 20 were hard, 28 moderate, and 16 easy. The consensus opinion of two experts had a strong correlation with reader performance for both the unassisted and CAD-assisted reads. The improvement on CAD-assisted reads was greatest for lesions categorized as hard, with less pronounced improvement for the detection of moderate and easy polyps. Although this finding did not reach statistical significance, it was in agreement with a prior report by Taylor et al (17). It may also be a contributing factor to a previously made observation that CAD reads lead to greater improvements in sensitivity for less experienced readers (24). The predominant source of error found was characterization (35 of 74 [47.3%]) rather than detection (27 of 74 [36.5%]), and this finding differs from that reported by prior investigators, who found errors of detection predominating (10,25). Of the 35 errors of characterization established by experts, 12 were described as easy to detect and 28 as easy or moderate, with only seven lesions characterized as hard. Among errors of detection, on the other hand, 19 of 27 polyps were characterized as hard, with the remaining eight as moderate and none as easy.
Not surprisingly, despite the predominance of errors of characterization among polyps in this trial, the polyps most likely to be missed by readers in both unassisted and CAD-assisted reads were determined to have main errors of detection (Table 2). Detection rates increased with the addition of CAD for all main error types. The CAD system itself missed only 10 of 74 polyps. As described in prior literature, CAD systems in stand-alone reads are more likely to detect polyps that are more visually conspicuous, with polyp height the leading determinant of polyp conspicuity (26). The results of our
TABLE 3. Details on 10 CAD Misses

Case  Polyp  Segment     Seen on           Size (mm)  Morphology    CRADS  Adenoma  CAD-assisted Detection Rate* (%)  Unassisted Detection Rate* (%)
4     1      Ascending   Supine and prone  8          Sessile       C3     Yes      10.5                              5.3
4     2      Ascending   Supine and prone  6          Sessile       C3     No       21.1                              31.6
21    1      Descending  Supine and prone  6          Pedunculated  C2     Yes      5.3                               5.3
34    1      Rectum      Supine and prone  9          Sessile       C2     Yes      15.8                              15.8
48    1      Cecum       Supine and prone  8          Flat          C2     Yes      0                                 0
49    1      Sigmoid     Prone             6          Sessile       C2     Yes      0                                 0
76    2      Sigmoid     Supine and prone  7          Flat          C2     No       10.5                              5.3
76    1      Transverse  Supine            6          Flat          C2     No       0                                 0
139   3      Ascending   Supine and prone  7          Sessile       C3     Yes      10.5                              5.3
144   1      Sigmoid     Supine and prone  6          Sessile       C2     No       0                                 0

CAD, computer-aided detection; CRADS, computed tomographic colonography reporting and data systems. Polyp characteristics were as determined by the multiple-reader, multiple-case reference standard.
*Percentage of the 19 readers detecting the polyp.

Figure 5. Male patient, aged 64 years. Error of characterization. Computer-aided detection (CAD) hit of 6-mm flat polyp. Consensus expert opinion established difficulty rating as hard. Detected by one of 19 readers on unassisted read and one of 19 readers on CAD-assisted read. Missed polyp can be mistaken for a thick fold when evaluated on a 120° endoluminal fly-through. (a) Supine axial view shows a bulbous fold (arrow). (b) Supine endoluminal view as seen on the automated fly-through along the centerline (arrows). Similar pitfall (arrows) is seen on (c) prone axial and (d) prone endoluminal views. (e) Corresponding fly-through movie from the supine view, without CAD on and (f) with CAD on to identify the polyp candidate in the cecum (near the center of the viewing field during fly-through and on the final frame in the seven o’clock position), shown along the automated centerline at a 120° viewing angle. The polyp is easily confused with a normal slightly thick fold. Parts (e) and (f) are available online at www.academicradiology.org.
retrospective analysis were in agreement with this. The polyps that were CAD misses (Table 3) appeared to be less conspicuous, with nine of 10 such polyps being flat or sessile and nine of 10 classified as hard. The reader detection rates of these polyps were low for both unassisted and CAD-assisted reads. As the MRMC trial showed, the improvement with CAD was greater for polyps 6 to 9 mm in size than for those ≥10 mm in size, a finding consistent with a number of prior studies (17,24). Polyps covered with a layer of hyperdense tagging agent were more likely to be detected by readers compared to those without tagging in both unassisted and
CAD-assisted reads, a finding at variance with a previous finding that polyps coated with tagged fluid were more likely to be missed by unassisted readers (17). Untagged or poorly tagged stool has in general been considered a potential source of CAD false-positives, as discussed by Nappi and Nagata (27). However, it was also noted that true lesions may likewise be coated by a layer of tagging and that CAD markings on tagged surfaces should be investigated carefully and not easily dismissed for reasons of tagging alone. As suggested by our results, the presence of surface tagging on a polyp may in fact draw extra reader attention and lead to greater scrutiny
Figure 6. Male patient, aged 68 years. Lipoma-like polyp. Error of characterization. Computer-aided detection (CAD) hit of 11-mm pedunculated polyp in the ascending colon. Consensus expert opinion established difficulty rating as easy. Detected by 16 of 19 readers on unassisted read and 17 of 19 readers on CAD-assisted read. Prone axial view shown in (a) standard computed tomographic colonographic view (window, 1500 Hounsfield units; level, 0 Hounsfield units) and (b) wide soft tissue (window, 1300 Hounsfield units; level, 400 Hounsfield units), and (c) with a region of interest showing the polyp density of 28 Hounsfield units. The polyp (arrow) is submerged in well-tagged residual fluid. Note that the polyp more closely resembles the density of subcutaneous fat (F) than soft tissue density of the abdominal wall muscle (M).
TABLE 4. Main Source of Error Stratified by Difficulty, Size, Histology, and CAD Detection

| Variable | Characterization (n = 35) | Detection (n = 27) | Measurement (n = 9) | Unknown (n = 3) | P* |
|---|---|---|---|---|---|
| Difficulty | | | | | .010 |
| Easy | 12 (75%) | 0 | 1 (6%) | 3 (19%) | |
| Moderate | 16 (55%) | 8 (28%) | 5 (17%) | 0 | |
| Hard | 7 (24%) | 19 (66%) | 3 (10%) | 0 | |
| Size (mm) | | | | | .004 |
| 6–9 | 18 (33%) | 26 (48%) | 9 (17%) | 1 (2%) | |
| ≥10 | 17 (85%) | 1 (5%) | 0 | 2 (10%) | |
| Histology | | | | | .567 |
| Adenoma | 21 (45%) | 16 (34%) | 7 (15%) | 3 (6%) | |
| Not adenoma | 14 (52%) | 11 (41%) | 2 (7%) | 0 | |
| CAD detection | | | | | .315 |
| Hit | 32 (50%) | 20 (31%) | 9 (14%) | 3 (5%) | |
| Miss | 3 (30%) | 7 (70%) | 0 | 0 | |
CAD, computer-aided detection. *From generalized estimating equation regression model of the likelihood of a characterization error, excluding the three cases for which no error could be determined (ie, everyone should have detected it).
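The percentages in Table 4 are row percentages: each count is divided by the number of polyps in its stratum, not by the column total. The following sketch recomputes them from the published counts as a consistency check. The dictionary layout and function names are illustrative assumptions; only the counts come from the table.

```python
# Counts from Table 4, ordered as
# (characterization, detection, measurement, unknown).
TABLE4 = {
    ("Difficulty", "Easy"): (12, 0, 1, 3),
    ("Difficulty", "Moderate"): (16, 8, 5, 0),
    ("Difficulty", "Hard"): (7, 19, 3, 0),
    ("Size (mm)", "6-9"): (18, 26, 9, 1),
    ("Size (mm)", ">=10"): (17, 1, 0, 2),
    ("Histology", "Adenoma"): (21, 16, 7, 3),
    ("Histology", "Not adenoma"): (14, 11, 2, 0),
    ("CAD detection", "Hit"): (32, 20, 9, 3),
    ("CAD detection", "Miss"): (3, 7, 0, 0),
}


def row_percentages(counts):
    """Percentage of each error type within one stratum, rounded to integers."""
    total = sum(counts)
    return tuple(round(100 * count / total) for count in counts)


def polyps_per_variable(table):
    """Total number of polyps accounted for under each stratifying variable."""
    totals = {}
    for (variable, _level), counts in table.items():
        totals[variable] = totals.get(variable, 0) + sum(counts)
    return totals


for (variable, level), counts in TABLE4.items():
    print(variable, level, sum(counts), row_percentages(counts))

print(polyps_per_variable(TABLE4))
```

Under every stratifying variable the counts sum to the 74 polyps in the study, and the CAD hit row sums to the 64 polyps found by CAD.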
of the potential lesion and thus to increased detection of such polyps. This may especially be the case for polyps with slight surface tagging not highly suggestive of stool. If a polyp was categorized as resembling a thick fold in at least one view (2D or 3D), the readers were much more likely both to miss that polyp on the unassisted read and to ignore the CAD hit on the CAD-assisted read. Additionally, the improvement in detection with CAD was significantly greater for polyps not resembling thick colonic folds than for polyps that resembled thick folds in at least one view. Poor detection of polyps resembling thick folds may be attributed to the fact that CAD false-positives are also known to occur because of thick or blunted folds (24,27). An irregularly shaped ileocecal valve is another
potential source of false-positives that can sway readers to dismiss even potential true CAD markings on the ileocecal valve or in close proximity to it (27). However, the number of such polyps in this trial was not sufficient to provide adequate evidence for or against this claim. Polyps that were completely submerged in fluid on the prone view, and therefore not visible on standard 3D fly-through, were more likely to be missed by readers than polyps that were not submerged. This did not apply to polyps submerged on the supine view, which were detected just as frequently as polyps that were not submerged. Prior studies have revealed that CAD markings of the same polyp on both supine and prone views, as opposed to only one view, were associated with increased detection rates (28); polyps visible to the reader in
both views were more frequently detected than polyps visible in one view (29); and when a polyp was detected by CAD on only one view, this was usually on the prone scan, which allows for optimal distension of the rectal and sigmoid segments (30). This result of our analysis thus supports the importance of obtaining both supine and prone scans in patients whenever possible and especially highlights the importance of the prone scan, which is frequently omitted in frail patients. Finally, improvement with CAD was noted to be significantly greater for polyps that resembled lipomas on 2D sections compared to those that showed no resemblance. Lipoma as an incidental finding on a CT colonographic read can be distinguished as a fat attenuation lesion using soft tissue window and level settings and usually requires no further workup (31). Dismissing polyps that resemble lipomas on 2D views represents a potential pitfall and highlights an important potential utility of CAD systems for readers who may otherwise disregard lipoma-like polyps. When prompted by a CAD mark, such readers are able to investigate further and correct their mistaken initial impression.

Study Limitations
The limitations of this study involved the subjective nature of the retrospective analysis. It was not practical to survey each reader in the original trial after each interpretation to know for certain what his or her thought process was. For this reason, we used a consensus opinion of two experts in CT colonography. The experts rarely disagreed, and disagreement usually related to choosing the single main source of error. The small number of polyps in certain groups is also a limitation, because it allowed us to detect only relatively large differences and to perform only univariate tests. Our results are specific to this version of the CAD system, and ongoing improvement in polyp classifiers is expected. Other limitations relate to those of the MRMC study per se, such as the number and skill of the readers and biases introduced by the cohort selection process (which was random but limited to two source institutions).

CONCLUSIONS

On the basis of these results, CT colonography readers may encounter errors related to polyp characteristics. Overall sensitivity can be improved by carefully assessing focally thick folds. Diminutive polyp candidates should have sample measurements from an optimized viewing perspective on the endoluminal 3D view. Using visual impression to dismiss a polyp candidate as a lipoma may cause errors when it is submerged in densely tagged fluid. In this situation, even a region of interest or grid of Hounsfield unit CT density measurements may be helpful. The specific sources of error in the interpretation of CT colonographic studies found in this study are avoidable, and avoiding them may lead to more effective use of the CAD system when interpreting CT colonographic studies. Future considerations to improve CAD software
could include labeling CAD hits in certain locations, such as on thick folds or in proximity to the ileocecal junction, to alert the reader that they should be more carefully scrutinized before they are dismissed.

REFERENCES

1. Dachman AH, Kelly KB, Zintsmaster MP, et al. Formative evaluation of standardized training for CT colonographic image interpretation by novice readers. Radiology 2008; 249:167–177.
2. Fletcher JG, Chen MH, Herman BA, et al. Can radiologist training and testing ensure high performance in CT colonography? Lessons from the National CT Colonography Trial. AJR Am J Roentgenol 2010; 195:117–125.
3. Pickhardt PJ, Choi JR, Hwang I, et al. Computed tomographic virtual colonoscopy to screen for colorectal neoplasia in asymptomatic adults. N Engl J Med 2003; 349:2191–2200.
4. Johnson CD, Harmsen WS, Wilson LA, et al. Prospective blinded evaluation of computed tomographic colonography for screen detection of colorectal polyps. Gastroenterology 2003; 125:311–319.
5. Johnson CD, Chen MH, Toledano AY, et al. Accuracy of CT colonography for detection of large adenomas and cancer. N Engl J Med 2008; 359:1207–1217.
6. Regge D, Laudi C, Galatola G, et al. Diagnostic accuracy of computed tomographic colonography for the detection of advanced neoplasia in individuals at increased risk of colorectal cancer. JAMA 2009; 301:2453–2461.
7. Knudsen AB, Lansdorp-Vogelaar I, Rutter CM, et al. Cost-effectiveness of computed tomographic colonography screening for colorectal cancer in the Medicare population. J Natl Cancer Inst 2010; 102:1238–1252.
8. Cotton PB, Durkalski VL, Benoit PC, et al. Computed tomographic colonography (virtual colonoscopy): a multicenter comparison with standard colonoscopy for detection of colorectal neoplasia. JAMA 2004; 291:1713–1719.
9. Rockey DC, Paulson E, Davis W, et al. Analysis of air contrast barium enema, computed tomographic colonography, and colonoscopy: prospective comparison. Lancet 2005; 365:305–311.
10. Doshi T, Rusinak D, Halvorsen RA, et al. CT colonography: false-negative interpretations. Radiology 2007; 244:165–173.
11. Park SH, Kim SY, Lee SS, et al. Sensitivity of CT colonography for nonpolypoid colorectal lesions interpreted by human readers and with computer-aided detection. AJR Am J Roentgenol 2009; 193:1–9.
12. Krupinski EA. What can the radiologist teach CAD: lessons from CT colonoscopy. Acad Radiol 2009; 16:1–3.
13. Bogoni L, Cathier P, Dundar M, et al. Computer-aided detection (CAD) for CT colonography: a tool to address a growing need. Br J Radiol 2005; 78:S57–S62.
14. Summers RM, Yao J, Pickhardt PJ, et al. Computed tomographic virtual colonoscopy computer-aided polyp detection in a screening population. Gastroenterology 2005; 129:1832–1844.
15. Yoshida H, Nappi J, MacEneaney P, et al. Computer-aided diagnosis scheme for detection of polyps at CT colonography. Radiographics 2002; 22:963–979.
16. Lawrence EM, Pickhardt PJ, Kim DH, et al. Computer-aided detection (CAD) of colorectal polyps: standalone performance in a large asymptomatic screening population. Radiology 2010; 256:791–798.
17. Taylor SA, Robinson C, Boone D, et al. Polyp characteristics correctly annotated by computer-aided detection software but ignored by reporting radiologists during CT colonography. Radiology 2009; 243:715–723.
18. Taylor SA, Greenhalgh R, Ilangovan R, et al. CT colonography and computer-aided detection: effect of false-positive results on reader specificity and reading efficiency in a low-prevalence screening population. Radiology 2008; 247:133–140.
19. Taylor SA, Charman SC, Lefere P, et al. CT colonography: investigation of the optimum reader paradigm by using computer-aided detection software. Radiology 2008; 246:462–471.
20. Taylor SA, Brittenden J, Lenton J. Influence of computer-aided detection false-positives on reader performance and diagnostic confidence for CT colonography. AJR Am J Roentgenol 2009; 192:1682–1689.
21. Petrick N, Haider M, Summers RM, et al. CT colonography with computer-aided detection as a second reader: observer performance study. Radiology 2007; 246:148–156.
22. Dachman AH, Obuchowski NA, Hoffmeister JW, et al. Impact of computer aided detection for CT colonography in a multireader, multicase trial. Radiology 2010; 256:827–835.
23. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986; 42:121–130.
24. Zhang H, Guo W, Liu G, et al. Colonic polyps: application value of computer-aided detection in computed tomographic colonography. Chin Med J 2011; 124:380–384.
25. Fidler JL, Fletcher JG, Johnson CD. Understanding interpretive errors in radiologists learning computed tomography colonography. Acad Radiol 2004; 11:750–756.
26. Summers RM, Frentz SM, Liu J, et al. Conspicuity of colorectal polyps at CT colonography. Acad Radiol 2009; 16:4–14.
27. Nappi JJ, Nagata K. Sources of false positives in computer-assisted CT colonography. Abdom Imaging 2011; 36:153–164.
28. Summers RM, Liu J, Rehani B, et al. CT colonography computer-aided polyp detection: effect on radiologist observers of polyp identification by CAD on both the supine and prone scans. Acad Radiol 2010; 17:948–959.
29. Gluecker TM, Fletcher JG, Welch TJ, et al. Assessment of vasculature using combined MRI and MR angiography. AJR Am J Roentgenol 2004; 182:881–889.
30. Robinson C, Halligan S, Iinuma G, et al. CT colonography: computer-assisted detection of colorectal cancer. Br J Radiol 2011; 84:435–440.
31. Pickhardt PJ, Kim DH, Taylor AJ, et al. Extracolonic tumors of the gastrointestinal tract detected incidentally at screening CT colonography. Dis Colon Rectum 2006; 50:56–63.