Assessing Diagnostic Confidence


Assessing Diagnostic Confidence: A Comparative Review of Analytical Methods

Chaan S. Ng, MBBS, MRCP, FRCR, Christopher R. Palmer, MA, PhD

Rationale and Objectives. The ability of a test to influence diagnostic confidence is used as a measure of its efficacy. Our aim was to compare analytic methods that evaluate changes in confidence.

Materials and Methods. The approaches compared were “basic,” “retained diagnoses,” “Omary,” “Tsushima,” and “score-based” methods. For illustration, data from a clinical study assessing changes in diagnostic confidence (0%–100%) before and after abdominopelvic computed tomography (CT) in patients with acute abdominal pain were used.

Results. The basic, retained diagnoses, and Omary methods all ignore whether the test yields a correct diagnosis (confident, but incorrect, diagnoses are regarded positively). Although the Tsushima method takes some account of diagnostic accuracy, all misdiagnoses are considered equal. The score-based method addresses some of the fundamental limitations in the other analytical methods, such as diagnostic accuracy and the varying nature of different misdiagnoses. In the case study, mean (SD) diagnostic confidence for the cohort as a whole (n = 62) increased following CT: 50.7% (20.8%) to 73.2% (20.9%). Pretest diagnoses were changed following CT in 44% (27 of 62) of patients. Pretest diagnoses proved to be incorrect in 52% (32 of 62), and post-test diagnoses incorrect in as many as 19% (12 of 62) of patients. All five analytic methods indicated a positive contribution for CT (all P ≤ .003).

Conclusion. Although our illustrative case study revealed no consequential differences across the five methods, there remain substantial differences in the fundamental principles underlying them that should affect choice of analytic method when assessing diagnostic confidence.

Key Words. Diagnostic confidence; diagnostic accuracy; CT; acute abdominal pain.

© AUR, 2008

The degree of confidence in a diagnosis is an important factor in patient management. If confidence in a particular diagnosis is high, management usually proceeds on that basis; if there is diagnostic uncertainty, other diagnostic possibilities are generally pursued.

Acad Radiol 2008; 15:584–592

From the Department of Radiology, The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030-4009 (C.S.N.); and Centre for Applied Medical Statistics, Institute of Public Health, University of Cambridge, Department of Public Health and Primary Care, Cambridge CB2 2SR, UK (C.R.P.). Received October 6, 2007; accepted December 10, 2007. Address correspondence to: C.S.N. e-mail: [email protected]

© AUR, 2008 doi:10.1016/j.acra.2007.12.004


The effect that a diagnostic test has on diagnostic confidence is used as one assessment of its efficacy (1,2). A number of analytic methods have been used to assess how changes between pretest and post-test confidence can be used to evaluate such diagnostic tests. A basic approach considers simple changes in confidence before and after the test and applies this to all the cases undergoing the test. This has been utilized in a variety of studies (3–9). Some developments of this simple approach attempt to take into account changes in diagnoses that might arise as a result of the test. These include applying the analysis only to the “retained” (or unchanged) diagnoses (3,10) and simple attempts to adjust for changes in (post-test) diagnoses, as expounded and utilized by Omary et al. and others (11–14).

Academic Radiology, Vol 15, No 5, May 2008

These methods take no account of whether the post-test diagnoses might be correct or not, which would seem of fundamental importance; the potentially adverse and severe impact of high confidence in incorrect diagnoses is overlooked in these approaches. Two analytic methods have been reported that attempt to take into account the potential influences of changes in diagnoses and of the accuracy of post-test diagnoses (15,16). Our purpose was to assess the principles underlying the different analytic methods for assessing diagnostic confidence and to compare the methods using data from an illustrative case study.

MATERIALS AND METHODS

Basic Analytic Method

The basic method for assessing diagnostic confidence takes the difference in pre- and post-test confidences and applies it to all cases (Fig. 1).

Analyses Incorporating Changes in Diagnoses

Two methods have been used to modify the above basic approach, which take into account whether the test changes the initial pretest diagnosis.

Retained diagnoses. The simplest modification removes from consideration cases in which the post-test diagnosis differs from the pretest diagnosis and only uses subjects with unchanged (or “retained”) diagnoses (Fig. 2a).

“Omary” correction. In contrast, this approach uses all cases by applying an adjustment to the post-test confidence if the pre- and post-test diagnoses differ (11). In the latter situation, it is argued that if the pretest confidence in the leading diagnosis was rated above 50% on a 0%–100% scale (e.g., 70% in appendicitis), confidence in any other diagnosis to which the test might have changed, the “inferred” confidence (e.g., in renal colic) must necessarily have been less than the difference between 100% and the pretest confidence (less than 100% − 70%) (i.e., 30%). However, if the pretest confidence was rated below 50%, say 30%, then confidence in any other competing diagnosis, the inferred confidence, must also have been below 30% (for otherwise that diagnosis would have been the leading diagnosis). In effect, the Omary correction modifies the basic method in the subset of cases in which the test changes the diagnosis and the pretest confidence exceeds 50% (Fig. 2b).
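The three confidence-only methods described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, confidences are taken as percentages on the 0–100 scale, and the Omary "inferred" pretest confidence is assumed to equal its upper bound, min(C0, 100 − C0), which agrees with the derivations shown in Table 2 for cases where the diagnosis changes.

```python
from typing import Optional


def basic_change(c0: float, c1: float) -> float:
    """Basic method: post-test minus pretest confidence (percentage points)."""
    return c1 - c0


def retained_change(c0: float, c1: float, pre_dx: str, post_dx: str) -> Optional[float]:
    """Retained-diagnoses method: cases whose diagnosis changed are excluded.

    Returns None for an excluded case (an illustrative convention).
    """
    if pre_dx != post_dx:
        return None
    return c1 - c0


def omary_change(c0: float, c1: float, pre_dx: str, post_dx: str) -> float:
    """Omary correction, following the description in the text.

    If the diagnosis is unchanged, this equals the basic method. If it changed,
    the pretest confidence in the *new* diagnosis is inferred. Assumption: the
    inferred value is taken at its upper bound, min(c0, 100 - c0).
    """
    if pre_dx == post_dx:
        return c1 - c0
    inferred_c0 = min(c0, 100 - c0)
    return c1 - inferred_c0
```

For a pretest diagnosis of appendicitis at 70% that changes to renal colic at 90%, `basic_change` gives 20, `retained_change` gives `None` (case excluded), and `omary_change` gives 60, since the inferred pretest confidence in renal colic is 100 − 70 = 30.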

ANALYSES OF DIAGNOSTIC CONFIDENCE

Figure 1. Basic method. C0 and C1 denote pre- and post-test confidences, respectively, on a 0%–100% scale.

Analyses Incorporating Diagnostic Changes and Diagnostic Accuracy

The previously mentioned methods do not consider the possibility of errors in diagnoses. Two methods have been proposed that incorporate changes in diagnoses and ultimate diagnostic accuracy.

“Tsushima” method. Changes in confidences are assessed according to one of six possible combinations of the correctness of pre- and post-test diagnoses (15). In this method, the final correct diagnoses are known. Confidence values associated with incorrect diagnoses are given a negative value; incorrect diagnoses are thereby regarded as “harmful,” which essentially penalizes such diagnoses (Fig. 3a).

“Score-based” method. In this patient-centered approach, quantitative scores are used to take into account changes in diagnoses and diagnostic confidences according to overall benefit or detriment to patients. The method recognizes that 1) misdiagnoses or changes in diagnoses are not all equal; 2) “overcalls” and “undercalls” do not necessarily have the same consequences for patients; and 3) management decisions are essentially based on whether levels of confidence in diagnoses are “high” or “low.” In this scoring system, a set of nine potential patient pathways has to be assessed in some detail, including consideration of plausible extreme outcomes, with scores for three further sets of pathways derived by applying some simple rules (16) (Fig. 3b).
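The sign convention of the Tsushima method can be sketched as follows. This is an illustrative reading of the description above, not code from reference 15: the function names are hypothetical, and the six case types of Figure 3a are not labeled here because they are not enumerated in the text.

```python
def signed_confidence(confidence: float, diagnosis: str, actual: str) -> float:
    """Confidence in a correct diagnosis counts positively; in an incorrect one, negatively."""
    return confidence if diagnosis == actual else -confidence


def tsushima_change(c0: float, c1: float, pre_dx: str, post_dx: str, actual_dx: str) -> float:
    """Change in signed confidence: signed post-test value minus signed pretest value."""
    post = signed_confidence(c1, post_dx, actual_dx)
    pre = signed_confidence(c0, pre_dx, actual_dx)
    return post - pre
```

With a correct pretest diagnosis of appendicitis at 70% overturned to an incorrect renal colic at 90%, the change is (−90) − (70) = −160, which is the derivation shown for scenario 1 in Table 2. A retained but incorrect diagnosis whose confidence falls from 70% to 50% gives (−50) − (−70) = +20.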

Illustrative Case Study

The five analytic methods were compared for illustrative purposes using data from a case study set in a major U.K. teaching hospital (17). The study enrolled patients with acute abdominal pain of uncertain etiology who were admitted to the surgical service but were deemed on admission not to require abdominopelvic computed


NG AND PALMER

Figure 2. (a) Retained diagnosis method and (b) Omary method.

tomography (CT) on clinical grounds. The subset of patients who underwent CT within 24 hours of admission (n = 62) provided the basis of the data for this case study. The admitting surgeons documented their diagnoses and graded their diagnostic confidence at the time of admission and again after the computed tomographic examination. Grading of diagnostic confidence by the surgeons was on a 5-point scale: 1, 10% confidence (i.e., very unsure); 2, 30%; 3, 50%; 4, 70%; and 5, 90% confidence (i.e., highly confident). The correct (“gold standard”) diagnosis was defined as that made at surgery or at 6-month follow-up, whichever was sooner.
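For readers who want to reproduce the analyses, the 5-point grading scale above maps onto percentages as in the following small sketch. The dictionary and function names are hypothetical; only the grade-to-percentage pairs come from the text.

```python
# Grade-to-confidence mapping as stated in the case study description.
GRADE_TO_CONFIDENCE = {1: 10, 2: 30, 3: 50, 4: 70, 5: 90}


def grade_to_confidence(grade: int) -> int:
    """Convert a surgeon's 1-5 grade into a percentage confidence."""
    if grade not in GRADE_TO_CONFIDENCE:
        raise ValueError(f"grade must be 1-5, got {grade!r}")
    return GRADE_TO_CONFIDENCE[grade]
```

The mapping is linear (percentage = 20 × grade − 10), so all confidence changes in the case study are multiples of 20 percentage points.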


The case study’s diagnostic and confidence data were subjected in turn to each of the previously described analytic methods.

Statistical Analyses

All five methods were subjected to a one-sample t-test to assess the hypothesis that undergoing a computed tomographic scan makes no difference to the level of diagnostic confidence. For all methods, distributions of changes in confidence scores were checked and found to be sufficiently close to the normal distribution. Data summaries are presented as means and standard deviations (SDs). The 95% confidence intervals (CIs) were computed, and P-values were reported.


Figure 3. (a) Tsushima method: illustration of the six types of confidence score change (15). (b) Score-based method: illustration of the nine possible pathways of patients’ diagnostic histories pre- and post-test. According to this method, one first assumes “High” diagnostic confidence at both time points when determining scores; then rules apply to derive variations in scores for other situations involving “Low” diagnostic confidence at one or both timepoints (16).


Table 1. Summary of Evaluation of Changes in Diagnostic Confidence, by Analytic Method, in the Illustrative Case Study

| Analytic Method | No. of Patients Contributing to Analysis | Mean Change in Confidence (SD) | 95% CI of Estimated Difference | t-Test Value | P-Value |
| Basic method | 62 | 22.6% (24.7%) | (16.3–28.9) | 7.20 | <.0001 |
| Retained diagnosis | 35 | 17.7% (23.1%) | (9.8–25.7) | 4.53 | <.0001 |
| Omary method | 62 | 26.5% (23.4%) | (20.5–32.4) | 8.90 | <.0001 |
| Tsushima method | 62 | 37.7% (69.6%) | (20.1–55.4) | 4.27 | <.0001 |
| Score-based method | 62 | 0.68 (1.73) | (0.24–1.12) | 3.10 | .003 |

All analyses were carried out using the software SPSS for Windows (version 12.0; SPSS, Chicago, IL).
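As a cross-check on Table 1, the one-sample t statistic can be recomputed from the reported summary statistics alone. The sketch below is not the SPSS procedure used in the study; the function name is hypothetical, and it tests the mean change against zero using t = mean / (SD / √n).

```python
import math


def one_sample_t(mean_change: float, sd: float, n: int) -> float:
    """t statistic for H0: mean change = 0, from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return mean_change / standard_error


# Summary values taken from Table 1 (basic method and retained diagnosis).
t_basic = one_sample_t(22.6, 24.7, 62)
t_retained = one_sample_t(17.7, 23.1, 35)
```

Both values agree with the tabulated t-test values (7.20 and 4.53) to within rounding of the published means and SDs.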

RESULTS

Patients and Diagnoses in the Case Study

The illustrative case study consisted of 62 patients (median age [range] of 62 years [18–92]; 33 [53%] men). Diagnostic confidence for the group as a whole increased following CT: mean (SD) 50.7% (20.8%) to 73.2% (20.9%). Pretest diagnoses were changed following CT in 44% (27 of 62) of patients with 95% CI (31%–56%) (i.e., retained diagnoses amounted to 56% [95% CI, 44%–69%]). Of the 62 patients in the illustrative study, 12 (19.4%) had surgery. Fifty had 6-month follow-up as their “gold standard” final diagnosis. Pretest diagnoses proved to be incorrect in 52% (32 of 62) of patients with 95% CI (39%–64%), and post-test diagnoses were incorrect in 19% (12 of 62) with 95% CI (10%–29%) (i.e., the accuracy was 81% [95% CI, 71%–90%]). The sensitivity and specificity of CT in identifying the correct acute abdominal disorder were 98% (39 of 40) with 95% CI (93%–100%) and 50% (11 of 22) with 95% CI (29%–71%), respectively.
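The paper does not state how the confidence intervals for these proportions were obtained. As an assumption, the following sketch uses the normal-approximation (Wald) interval, p ± 1.96·√(p(1 − p)/n), truncated to the range 0 to 1; with that choice the reported intervals are reproduced after rounding. The function names are hypothetical.

```python
import math
from typing import Tuple


def wald_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation 95% CI for a proportion, clipped to [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def as_percent(ci: Tuple[float, float]) -> Tuple[int, int]:
    """Round both interval limits to whole percentages."""
    low, high = ci
    return round(100 * low), round(100 * high)
```

For example, `as_percent(wald_ci(27, 62))` gives `(31, 56)` for the proportion of changed diagnoses. An exact or Wilson interval would give somewhat different limits, particularly for the sensitivity of 39 of 40, where the Wald upper limit has to be truncated at 100%.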

Numerical Comparison of Analytic Methods in the Case Study

The results for the five analytic methods are presented in Table 1. All methods indicated a significant increase in confidence following the computed tomographic examination. The degree of significance for the score-based method was P = .003; that for each of the other four methods was P < .0001. Differences in confidence values between pairs of analytic methods are presented, in the form of scatterplots, for basic versus Omary (Fig. 4), Omary versus Tsushima (Fig. 5), and Tsushima versus score-based methods (Fig. 6). Data points in the upper right and lower left quadrants represent broad agreement between the two analytic methods. Data points in the other two quadrants indicate discordances between the two methods in question as to whether the test is considered beneficial or detrimental. Data points lying on the axes represent situations in which there are intermediate levels of disagreement between the two analytic methods (one method considers the test to be neutral [i.e., neither beneficial nor detrimental], whereas the other method considers the test to be either beneficial or detrimental).

Figure 4. Scatterplot of differences in pre- and post-test confidences: Omary correction versus basic method. Each symbol may represent more than one data point. Solid lines centered on the origin separate positive and negative changes in confidence. Points in upper right and lower left quadrants are in broad agreement, whereas those in the other two quadrants indicate cases where the two methods are in disagreement.

DISCUSSION

We have considered five analytic methods that have been used to assess changes in diagnostic confidence that might result from a test. These methods have been used to evaluate the contribution of a variety of diagnostic tests and have evolved in attempts to address limitations that have been recognized. We have used a case study primarily as a vehicle to illustrate the methods and to assist in exploring their differences.

Figure 5. Scatterplot of differences in pre- and post-test confidences: Tsushima versus Omary correction. Note that due to superimposition of cases, there are in fact nine individual cases in the lower right quadrant and two on the vertical zero axis above the origin.

Figure 6. Scatterplot of differences in pre- and post-test confidences: score-based method versus Tsushima method. Note that due to superimposition of cases, there are in fact six individual cases on the horizontal zero axis to the right of the origin.
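The quadrant reading applied to the scatterplots (Figs. 4–6) can be expressed as a small classification rule. The sketch below is illustrative only: the function name and category labels are hypothetical, and treating a point at the origin (both methods neutral) as agreement is an assumption, since the text does not address that case.

```python
def classify_pair(change_x: float, change_y: float) -> str:
    """Classify one case by the signs of its confidence change under two methods."""
    if change_x == 0 and change_y == 0:
        return "agreement"      # assumption: both methods neutral
    if change_x == 0 or change_y == 0:
        return "intermediate"   # on an axis: one method neutral, the other not
    if (change_x > 0) == (change_y > 0):
        return "agreement"      # upper right or lower left quadrant
    return "discordant"         # upper left or lower right quadrant
```

A case scored +60 by one method and −160 by another is classed as discordant, whereas one scored 0 and +20 is classed as intermediate.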

Analytic Methods That Do Not Take Diagnostic Accuracy Into Consideration

The basic method treats all cases in the same light and takes no account of whether the test changes diagnoses or whether the post-test (or pretest) diagnoses are correct or not (Fig. 1). Modifications to the basic method acknowledge the possibility that the test may change (pretest) diagnoses and attempt to take these into consideration.

The retained diagnosis method has the limitation that it discards all cases in which diagnoses might have changed (in this illustrative case study, as many as 44% [27 of 62] of cases); it is these very cases that arguably contain some of the most valuable information about the test, and hence should not be ignored.

The “Omary” correction does include all available cases. Its overall treatment of cases is the same as for the basic method, except for those cases in which diagnoses might have changed and in which the pretest confidence was greater than 50%; the Omary correction effectively gives some added advantage to these cases, as illustrated in Figure 2b.

None of these three methods considers, in its assessment of diagnostic confidence, whether the post-test diagnosis is correct or not. Clearly, this is problematic because a test that leads one to a higher post-test confidence in an incorrect diagnosis should not be considered beneficial; on the contrary, it should be considered detrimental. These methodologies would be adequate if the post-test diagnoses were reliably correct, but this is an unsound assumption. In practice, the need to overturn post-test diagnoses is not unusual. For example, in our case study, as many as 12 of 62 (19%) post-test diagnoses proved to be incorrect. Comparable rates of incorrect post-test diagnoses have been reported in other similar studies of CT for acute abdominal pain, namely, 7.2% and 16% (18,19).


Analytic Methods That Take Diagnostic Accuracy Into Consideration

The concept of including diagnostic accuracy in the assessment of diagnostic confidence is fundamental. Analytic methods that do not take into consideration the accuracy of the test run the risk of giving positive scores to tests that simply increase diagnostic confidence, regardless of whether the (post-test) diagnoses are correct or not. It is difficult to imagine how being confidently wrong can be in a patient’s best interests.

The limitations of the Omary correction have been previously presented (15). Essentially, Tsushima “type (e) and (f)” cases (as in Fig. 3a) are deemed detrimental by the Tsushima method but may attract positive confidence differences in the Omary method (depending on the relative post-test and Omary-corrected pretest confidences). Conversely, Tsushima “type (c)” cases are given a positive confidence difference by their method but may be given a negative confidence difference by the Omary method. Tsushima “type (b)” cases attract higher positive confidence differences than the Omary method (Fig. 3a).

In our case study, there was broad agreement in the beneficial or detrimental nature of confidence values between the Omary and Tsushima methods for the majority of cases. However, there were nine of 62 cases (15%) that were considered beneficial (positive) by the Omary method but detrimental (negative) by the Tsushima method, and vice versa in one case (cases in the lower right and upper left quadrants of the scatterplot, respectively; Fig. 5). There were an additional two cases that were considered neutral by the Omary method but considered beneficial by the Tsushima method. These cases serve to highlight differences between the methods.
Fundamental Differences Between Tsushima and Score-Based Methods

The Tsushima method considers all types of incorrect diagnoses to be essentially the same (e.g., a misdiagnosis of appendicitis is considered the same as a misdiagnosis of ureteric calculus). It also does not differentiate between potential differences in incorrect pre- and post-test diagnoses. In comparison, the score-based method introduces the concept that there are different types (or levels) of wrong diagnoses: missing the presence of appendicitis (and reporting the scan as normal) is not quite the same as, for example, missing ureteric calculus; both of these diagnoses might be wrong (“false negative” types of error), but the errors are likely to have different consequences and cannot fairly be considered of the same nature. Similarly, these types of misdiagnoses are different from, say, reporting appendicitis when the appendix is normal (“false positive” type of error). This introduces the notion of some relative ranking (of “severities”) between diagnoses. The relative severities between pairs of diagnoses clearly carry a degree of subjectivity, not least because there is a spectrum of clinical severities within individual diagnoses. The score-based method tackles this subjectivity by including a potential range of scores (“pessimistic” and “optimistic”) and weighting them accordingly (16).

The score-based method dichotomizes the diagnostic confidences into “high” and “low,” which may be perceived as a strength or a weakness of the approach. On the one hand, patient management decisions tend only to be taken when confidence is sufficiently high, arguing in favor of dichotomizing. On the other hand, unlike all the other methods, differences within confidence bands are not always distinguished (e.g., 70% and 90% or 10% and 30%).

A specific area of discordance between the Tsushima and score-based methods is in the view of Tsushima “type (c)” cases: the former method considers these to be beneficial (i.e., positive score), whereas the latter notes that the post-test diagnosis is still incorrect and therefore regards such cases as detrimental, not beneficial, for the patient, on the grounds that “two wrongs do not make a right.”

In our case study, there was broad agreement in the beneficial or detrimental nature of confidence values between the Tsushima and score-based methods for the majority of cases. However, there was one case that the Tsushima method considered beneficial but that the score-based method considered detrimental (lower right quadrant of the scatterplot, Fig. 6); there were six cases that the Tsushima method considered beneficial, but the score-based method considered neutral.
There were an additional three cases where one method considered diagnostic confidence unchanged, but the other method considered the test positive or negative. This totals 10 of 62 discrepant cases (16% with 95% CI [7%–25%]). Even among the remainder of cases in broad agreement (i.e., upper right and lower left quadrants of Fig. 6), there are differences in extent of agreement in benefit or harm.

Similarities and Differences Across All Methods

Four of the five methods are in exact quantitative agreement when the diagnostic test increases, or decreases, the diagnostic confidence in a correct diagnosis that remains unchanged after the test, that is, a correct, retained diagnosis, Omary type (a), and a Tsushima type (a) or (b) (Figs. 2a, 2b, and 3a). In these same circumstances, when the test changes diagnostic confidence from low to high, or vice versa, there is also qualitative agreement between these four methods and the score-based method, that is, all five approaches return positive, or negative, results. It is worthy of note that, based on our case study, this circumstance is not infrequent (48%). In all remaining circumstances, the differences in the principles underlying each of the analytic methods may result in completely discordant points of view. Such potential divergences are highlighted in the following hypothetical scenarios (Table 2).

Table 2. Scenarios Illustrating the Changes in Confidence as Assessed by the Five Different Analytic Methods

| Scenario | Diagnosis (Pretest / Post-test / Actual) | Diagnostic Confidence (Pretest / Post-test) | Analytic Method | Derivation of Confidence Change | Estimated Confidence Change | Direction of Confidence Change |
| 1 | Appendicitis / Renal colic / Appendicitis | 70% / 90% | Basic method | 90% − 70% | +20% | + |
| 1 | | | Retained | NA | NA | NA |
| 1 | | | Omary | 90% − (100% − 70%) | +60% | + |
| 1 | | | Tsushima | Type (e) case*: (−90%) − (70%) | −160% | − |
| 1 | | | Score-based | CT confidently overturns initial correct severe diagnosis | −3.83† | − |
| 2 | Renal colic / Renal colic / Appendicitis | 70% / 50% | Basic method | 50% − 70% | −20% | − |
| 2 | | | Retained | 50% − 70% | −20% | − |
| 2 | | | Omary | 50% − (100% − 70%) | +20% | + |
| 2 | | | Tsushima | Type (c) case*: (−50%) − (−70%) | +20% | + |
| 2 | | | Score-based | CT wrongly confirms undercalling a severe diagnosis | −1.17† | − |
| 3 | Appendicitis / Appendicitis / Renal colic | 70% / 50% | Basic method | 50% − 70% | −20% | − |
| 3 | | | Retained | 50% − 70% | −20% | − |
| 3 | | | Omary | 50% − (100% − 70%) | +20% | + |
| 3 | | | Tsushima | Type (c) case*: (−50%) − (−70%) | +20% | + |
| 3 | | | Score-based | CT wrongly confirms overcalling a less severe diagnosis | −0.33† | − |
| 4 | Renal colic / Ovarian cyst / Appendicitis | 70% / 50% | Basic method | 50% − 70% | −20% | − |
| 4 | | | Retained | NA | NA | NA |
| 4 | | | Omary | 50% − (100% − 70%) | +20% | + |
| 4 | | | Tsushima | Type (c) case*: (−50%) − (−70%) | +20% | + |
| 4 | | | Score-based | CT changes diagnosis but wrongly undercalls a severe diagnosis | −2.83† | − |

NA, not applicable; +, the test is considered to be positive or beneficial (increases confidence); −, the test is considered to be negative, harmful, or detrimental (decreases confidence).
*Tsushima case type, as illustrated in Figure 3a.
†Scores in score-based method obtained from Ng and Palmer, 2007 (16).

In scenario 1, the test not only reports an incorrect diagnosis, but it overturns a correct pretest diagnosis, and it does that with increased confidence. The basic and Omary methods assess the test positively; in contrast, the Tsushima and score-based methods assess the test negatively. The evaluation provided by the latter two methods would seem more appropriate in this circumstance.

In scenarios 2 and 3, the test does not change the pretest diagnoses, which prove to be incorrect. For illustrative purposes the pretest confidences are the same in both scenarios, as are the post-test confidences. The basic and retained diagnosis methods assess the test negatively in both situations, whereas the Omary and Tsushima methods assess the test positively. The score-based method considers this negative because the post-test diagnoses were incorrect in both scenarios. Furthermore, the other four methods return the same confidence scores (−20% or +20%) for both the scenarios, although it would seem that the diagnostic errors have quite different potential outcomes: missing appendicitis does not, in general, have the same consequences as missing a renal calculus.

In scenario 4, the test changes the initial diagnosis (unlike scenarios 2 and 3), but the pre- and post-test confidences are the same as in scenarios 2 and 3. The basic, retained diagnosis, Omary, and Tsushima methods treat this scenario in essentially the same way as for scenarios 2 and 3, returning similar conclusions. In comparison, the score-based method views scenarios 2, 3, and 4 as different, with varying degrees of negativity.

Although our illustrative case study revealed no consequential differences across the five methods, there remain substantial differences in the fundamental principles underlying them that should affect one’s choice of analytic method when assessing diagnostic confidence. The basic, retained diagnoses, and Omary methods all ignore whether the test in question yields a correct diagnosis or not (confident, but incorrect, diagnoses are regarded positively). Although the Tsushima method takes some account of diagnostic accuracy, it places all misdiagnoses in the same category; for example, undercalling and overcalling are considered equally detrimental. In contrast, the need to distinguish these misdiagnoses is a fundamental characteristic of the score-based method.

Assessing confidence of a diagnosis pretest and post-test is an important component of evaluating the value of the diagnostic test in question. Further research is needed to determine which of several analytic methods is best in any given situation. We suggest, however, that the most reliable conclusions are likely to be obtained from the method(s) with the most sound underlying principles.
REFERENCES

1. Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making 1991; 11:88–94.
2. Mackenzie R, Dixon AK. Measuring the effects of imaging: An evaluative framework. Clin Radiol 1995; 50:513–518.
3. Anzilotti K Jr, Schweitzer ME, Hecht P, Wapner K, Kahn M, Ross M. Effect of foot and ankle MR imaging on clinical decision making. Radiology 1996; 201:515–517.
4. Blanchard TK, Bearcroft PW, Constant CR, Griffin DR, Dixon AK. Diagnostic and therapeutic impact of MRI and arthrography in the investigation of full-thickness rotator cuff tears. Eur Radiol 1999; 9:638–642.
5. Dhillon S, Halligan S, Goh V, Matravers P, Chambers A, Remedios D. The therapeutic impact of abdominal ultrasound in patients with acute abdominal symptoms. Clin Radiol 2002; 57:268–271.
6. Stacul F, Cova M, Pravato M, Floriani I. Comparison between the efficacy of dimeric and monomeric non-ionic contrast media (iodixanol vs iopromide) in urography in patients with macroscopic haematuria. Eur Radiol 2003; 13:810–814.
7. Chambers A, Halligan S, Goh V, Dhillon S, Hassan A. Therapeutic impact of abdominopelvic computed tomography in patients with acute abdominal symptoms. Acta Radiol 2004; 45:248–253.
8. Esses D, Birnbaum A, Bijur P, Shah S, Gleyzer A, Gallagher EJ. Ability of CT to alter decision making in elderly patients with acute abdominal pain. Am J Emerg Med 2004; 22:270–272.
9. Bearcroft PW, Guy S, Bradley M, Robinson F. MRI of the ankle: Effect on diagnostic confidence and patient management. AJR Am J Roentgenol 2006; 187:1327–1331.
10. Dixon AK, Southern JP, Teale A, Freer CE, Hall LD, Williams A, Sims C. Magnetic resonance imaging of the head and spine: Effective for the clinician or the patient? BMJ 1991; 302:79–82.
11. Omary RA, Kaplan PA, Dussault RG, Hornsby PP, Carter CT, Kahler DM, Hillman BJ. The impact of ankle radiographs on the diagnosis and management of acute ankle injuries. Acad Radiol 1996; 3:758–765.
12. Neish AS, Taylor GA, Lund DP, Atkinson CC. Effect of CT information on the diagnosis and management of acute abdominal injury in children. Radiology 1998; 206:327–331.
13. Carrico CW, Fenton LZ, Taylor GA, DiFiore JW, Soprano JV. Impact of sonography on the diagnosis and treatment of acute lower abdominal pain in children and young adults. AJR Am J Roentgenol 1999; 172:513–516.
14. Abramson S, Walders N, Applegate KE, Gilkeson RC, Robbin MR. Impact in the emergency department of unenhanced CT on diagnostic confidence and therapeutic efficacy in patients with suspected renal colic: A prospective survey. 2000 ARRS President’s Award. American Roentgen Ray Society. AJR Am J Roentgenol 2000; 175:1689–1695.
15. Tsushima Y, Aoki J, Endo K. Contribution of the diagnostic test to the physician’s diagnostic thinking: New method to evaluate the effect. Acad Radiol 2003; 10:751–755.
16. Ng CS, Palmer CR. Analysis of diagnostic confidence and diagnostic accuracy: A unified framework. Br J Radiol 2007; 80:152–160.
17. Ng CS, Watson CJ, Palmer CR, See TC, Beharry NA, Housden BA, Bradley JA, Dixon AK. Evaluation of early abdominopelvic computed tomography in patients with acute abdominal pain of unknown cause: Prospective randomised study. BMJ 2002; 325:1387.
18. Tsushima Y, Yamada S, Aoki J, Motojima T, Endo K. Effect of contrast-enhanced computed tomography on diagnosis and management of acute abdomen in adults. Clin Radiol 2002; 57:507–513.
19. Sala E, Watson CJ, Beadsmoore C, Groot-Wassink T, Fanshawe TR, Smith JC, Bradley A, Palmer CR, Shaw A, Dixon AK. A randomized, controlled trial of routine early abdominal computed tomography in patients presenting with non-specific acute abdominal pain. Clin Radiol 2007; 62:961–969.