PII S1095-0397(01)00060-7
Molecular Imaging and Biology Vol. 4, No. 2, 157–160. 2002 Copyright © 2002 Elsevier Science Inc. Printed in the USA. All rights reserved. 1095-0397/02 $–see front matter
ORIGINAL ARTICLE
The Receiver Operating Characteristic Curve for the Standard Uptake Value in a Group of Patients with Bone Marrow Metastasis

Paulo S. Duarte, MD1, Hongming Zhuang, MD, PhD2, Paolo Castellucci, MD2, Abass Alavi, MD2

1 Section of Nuclear Medicine, Fleury Laboratory, São Paulo, SP, Brazil; 2 Division of Nuclear Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA, USA

Purpose: The aim of this work was to determine the standard uptake value (SUV) threshold for differentiating malignant from benign bone lesions.

Material and methods: Ninety-nine bone sites in 33 patients who had undergone a 2-deoxy-2-[18F]fluoro-D-glucose positron emission tomography (FDG-PET) study for cancer evaluation were studied. In addition to FDG-PET, each patient had a bone scan and at least two of the following studies: magnetic resonance imaging (MRI), computed tomography (CT), and x-ray. The bone lesions were considered positive for malignancy if confirmed by clinical follow-up or if there was a high degree of suspicion based on positive results in at least three (which must include the bone scan) of the four other imaging modalities. By these criteria, 39 lesions were considered positive and 60 were considered negative. The SUVs were classified as positive or negative using 61 different threshold values (range from 1.0 to 7.0). These results were compared with the positive criteria above and reclassified as true positive, true negative, false positive, and false negative. The true-positive fraction and false-positive fraction were calculated for each threshold value. The receiver operating characteristic (ROC) curve was drawn and the best value was determined by visual analysis.

Results: The optimal SUV threshold was judged to be 2.5. Twenty-nine of the 39 bone lesions classified as positive showed an SUV ≥ 2.5. Of the 10 false-negative lesions, seven showed an SUV between 1.1 and 2.0, and three were not detected. Fifty-six of the 60 lesions classified as negative showed an SUV < 2.5. Four lesions were false positive: one was a rib fracture and three were severe degenerative changes in the lumbar spine. Using an SUV threshold of 2.5, the sensitivity was 74.3% and the specificity was 93.3%.
Conclusion: In our patient population, the optimal SUV to classify a bone lesion as malignant or benign is 2.5. (Mol Imag Biol 2002;4:157–160) © 2002 Elsevier Science, Inc. All rights reserved. Key Words: ROC; FDG; PET; SUV; Bone Metastasis.
Address correspondence to: Paulo S. Duarte, MD, Section of Nuclear Medicine, Fleury Laboratory, Cincinato Braga 282, Paraiso, São Paulo, SP, Brazil, 01333-910. E-mail: [email protected]

Introduction

The use of 2-deoxy-2-[18F]fluoro-D-glucose positron emission tomography (FDG-PET) in clinical practice has increased in recent years, and most PET centers are now clinically oriented, in contrast to the research orientation of a few years ago. Because many neoplasms have high glucose utilization and lack glucose-6-phosphatase, FDG and PET have demonstrated clinical value in the evaluation of neoplastic processes based on net FDG uptake.1–3 Therefore, FDG-PET is useful for the evaluation of primary tumors of different organs as well as metastatic disease.4–6

Although FDG uptake reflects the increased metabolism of malignant cells compared to normal cells, some other pathologic processes, such as inflammation, can also increase glucose metabolism.7 Differentiating a malignant process from an inflammatory one is possible but not always easy, and a process like sarcoidosis can easily be confused with a tumor. The analysis of bone involvement by malignancy is particularly difficult because various nonmalignant processes can affect the skeleton and occur in the same age range as malignancy.8–10

The nature of PET determinations permits quantification of the uptake in lesions. This quality is useful in characterizing lesions if used carefully and might help to differentiate malignant from benign bone processes. One of the quantification tools is the standardized uptake value (SUV). This value is defined as the tissue concentration of the tracer, as measured by the PET scanner, divided by the injected activity per unit of body weight.11 The advantage of the SUV is that it offers a semi-quantitative analysis without invasive procedures. This quantitative characteristic permits standardization of the positivity criteria and transfer of knowledge from groups with extensive FDG-PET case experience to groups that are starting in this field.

Receiver operating characteristic (ROC) analysis is a statistical tool used to measure the performance of an observer or a test. It is also used to determine the operating point that has the best combination of sensitivity and specificity. This is important because full optimization of a diagnostic strategy involves choosing not only the best sequence of tests, but also the best threshold for each test.12 The ROC curve is a plot of the true-positive rate (TPR; the probability that a malignant lesion will be classified as positive by the test) versus the false-positive rate (FPR; the probability that a benign lesion will be classified as positive by the test).12–14 The area under this curve is used to measure the performance of the test. The best operating point depends on the decision-consequence costs and on disease prevalence, but the best trade-off between sensitivity and specificity is achieved at the point of the curve farthest from the chance line.
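The quantities defined above can be written compactly as follows. The symbols (C_tissue, A_inj, W, d) are notation introduced here for illustration and are not taken from the original text. The last expression is the ordinary perpendicular distance from a point on the ROC curve to the diagonal chance line, given as one way to make "farthest from the chance line" precise.

```latex
% SUV: tissue tracer concentration divided by injected activity per unit body weight
\mathrm{SUV} = \frac{C_{\mathrm{tissue}}}{A_{\mathrm{inj}} / W}

% ROC coordinates at a given threshold
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}

% Perpendicular distance from the point (FPR, TPR) to the chance line TPR = FPR
d = \frac{\mathrm{TPR} - \mathrm{FPR}}{\sqrt{2}}
```

Because the factor 1/√2 is constant, maximizing d is the same as maximizing TPR − FPR, that is, sensitivity + specificity − 1.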
Objective

The purpose of this work is to determine the best SUV threshold for differentiating malignant from benign bone lesions.
Methods

Patients

Ninety-nine bone sites in 33 patients with cancer (16 breast cancer, 12 lung cancer, one prostate cancer, one endometrial cancer, one bladder cancer, and two non-Hodgkin's lymphoma) were included in our study. All patients had undergone an FDG-PET study, a bone scan, and at least two of the following studies: magnetic resonance imaging (MRI), computed tomography (CT), and x-ray. Final results were considered positive for malignancy if confirmed by clinical follow-up or if there was a high degree of suspicion based on the evaluation of the other imaging studies. By these criteria, 39 lesions were considered positive and 60 were considered negative.
FDG-PET Imaging

Whole-body FDG-PET imaging was performed using a C-PET camera (UGM Medical Systems, Philadelphia, PA). All patients fasted for at least four hours prior to injection of 2.516 MBq/kg (0.068 mCi/kg) FDG. Image acquisition began 50–60 minutes post injection. Singles transmission images were acquired over the entire scanned region with a caesium-137 point source for attenuation correction.15 The images were reconstructed using the ordered subsets-expectation maximization (OSEM) method.16

The SUVs were measured in 80 of the 99 sites (19 sites were not visually detected on the FDG-PET scan, and their SUVs were considered to be zero). The results were classified as positive or negative using 61 different threshold values (range from 1.0 to 7.0, in steps of 0.1). These results were compared with the positive criteria above and reclassified as true positive (TP), true negative (TN), false positive (FP), and false negative (FN). The TPR and the FPR were calculated for each threshold value. The ROC curve was drawn (Figure 1) and the best operating value was determined by visual analysis.
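The threshold sweep described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the SUVs and labels below are hypothetical, treating an SUV at or above the threshold as positive is an assumption, and the operating point is chosen numerically (largest distance from the chance line), whereas the study selected it by visual analysis.

```python
import math


def roc_points(suvs, is_malignant, thresholds):
    """Return a (threshold, TPR, FPR) tuple for each threshold.

    Assumption: a lesion is called positive when its SUV >= threshold.
    """
    n_pos = sum(is_malignant)
    n_neg = len(is_malignant) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, m in zip(suvs, is_malignant) if m and s >= t)
        fp = sum(1 for s, m in zip(suvs, is_malignant) if not m and s >= t)
        points.append((t, tp / n_pos, fp / n_neg))
    return points


def farthest_from_chance(points):
    """Pick the point with the largest perpendicular distance to TPR = FPR."""
    return max(points, key=lambda p: (p[1] - p[2]) / math.sqrt(2))


# 61 thresholds from 1.0 to 7.0 in steps of 0.1, as in the text.
thresholds = [round(1.0 + 0.1 * i, 1) for i in range(61)]

# Hypothetical lesions (NOT the study data). An SUV of 0.0 stands for a
# lesion that was not detected visually, following the paper's convention.
suvs = [0.0, 1.2, 1.5, 1.8, 2.9, 0.0, 1.9, 3.1, 3.6, 4.4, 5.0, 6.2]
is_malignant = [False] * 5 + [True] * 7

points = roc_points(suvs, is_malignant, thresholds)
best = farthest_from_chance(points)
print(f"best threshold = {best[0]}, TPR = {best[1]:.3f}, FPR = {best[2]:.3f}")
```

When several thresholds tie, `max` returns the first one it encounters, so the lowest tied threshold is reported.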
Results

The analysis indicated that the optimal SUV threshold for distinguishing malignant from benign skeletal lesions was 2.5. This value offers good sensitivity without loss of specificity and was the point farthest from the chance line. Twenty-nine of the 39 malignant bone lesions had an SUV ≥ 2.5 (3.6 ± 1.3). Of the 10 false-negative lesions (five patients), seven showed an SUV between 1.1 and 2.0, and three were not detected. All three were previously treated bone metastases with low bone turnover. Fifty-six of the 60 lesions classified as negative showed an SUV < 2.5 (1.6 ± 0.6). The four false-positive lesions (two patients) showed an SUV between 2.9 and 3.8. One false-positive result was caused by a rib fracture, and the three others were due to severe degenerative changes in the lumbar spine. Using an SUV threshold of 2.5, the sensitivity and specificity for differentiating malignant from benign bone lesions were 74.3% and 93.3%, respectively.
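As a quick check, the reported sensitivity and specificity follow directly from the lesion counts given above (29 true positives, 10 false negatives, 56 true negatives, 4 false positives). The function name is illustrative only. Note that 29/39 is about 0.7436, so the reported 74.3% corresponds to truncation; rounding to one decimal gives 74.4%.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Lesion counts at the 2.5 threshold, taken from the Results text.
sens, spec = sensitivity_specificity(tp=29, fn=10, tn=56, fp=4)
print(f"sensitivity = {sens:.4f}, specificity = {spec:.4f}")
```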
Discussion

The correct classification of bone lesions as malignant or benign will certainly affect the treatment of patients with cancer. Failure to diagnose early metastatic bone lesions could preclude curative surgery and result in widespread disease. On the other hand, a benign lesion classified as malignant could lead to unnecessary local and systemic therapies with some level of morbidity and sometimes mortality. Therefore, the correct staging of a tumor, as well as the correct characterization of suspicious lesions, is a cornerstone in the management of oncology patients.

Figure 1. The ROC curve showing the TPR and FPR values for a range of SUVs. The optimal SUV to differentiate malignant from benign bone lesions is 2.5. This is the point of the curve farthest away from the chance line.

In the past few years, FDG-PET has emerged as a powerful tool in characterizing malignancy.4,6,17,18 As with any diagnostic method, the accuracy of FDG-PET depends on the expertise of the physician who reads the studies. Establishing cutoff values in quantitative or semi-quantitative analysis will be useful in correctly interpreting FDG-PET results. The advantage of using quantitative or semi-quantitative analyses to classify a lesion is that they reduce the operator dependency of the results and establish semi-quantitative criteria for making correct diagnoses. However, the ideal methodology for semi-quantification of metastatic bone lesions is not yet established. One of the most common methods used to semi-quantify FDG uptake is the SUV. This value is defined as the tissue concentration of the tracer, as measured by the PET scanner, divided by the injected activity per unit of body weight.11

To define the best cutoff or threshold value for classifying a lesion, it is important to know the consequences of false results and the prevalence of the disease. For a rare disease, high specificity is important in order to avoid generating a large number of false positives. For a more common disease, high specificity is relatively less important. On the other hand, if the consequences of not detecting a particular disease are severe, a comparatively lax criterion should be adopted to increase the test sensitivity. Without taking into consideration the prevalence of the disease, the best mathematical threshold can be established from a ROC curve by using the point farthest from the chance line. This value offers the best trade-off between sensitivity and specificity. Away from it, a gain in one of the two parameters (sensitivity or specificity) is accompanied by a disproportionate loss in the other. In this investigation, we drew a ROC curve for a set of SUVs and established the best operating point.
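The effect of prevalence discussed above can be illustrated by computing predictive values with Bayes' rule. This is a sketch under stated assumptions: the sensitivity and specificity are the lesion-level values from this study (29/39 and 56/60), the prevalence 39/99 is the fraction of malignant sites in this series, and the prevalences 0.05 and 0.20 are hypothetical values chosen only to show the trend.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


SENS, SPEC = 29 / 39, 56 / 60  # values at the 2.5 threshold in this study

# 0.05 and 0.20 are hypothetical prevalences; 39/99 is this series.
results = {}
for prevalence in (0.05, 0.20, 39 / 99):
    results[prevalence] = predictive_values(SENS, SPEC, prevalence)
    ppv, npv = results[prevalence]
    print(f"prevalence = {prevalence:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value falls as prevalence falls, which is why specificity matters more for a rare disease.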
Without considering the consequences of the false results, 2.5 was the best SUV threshold. An interesting observation is that the slope of the ROC curve for this study changes abruptly at the 2.5 value. This means that a small increase in one of the two parameters is accompanied by a large decrease in the other. Therefore, using an SUV threshold other than 2.5 would be justified only if the consequences of one type of false result are disproportionately worse than those of the other. One important point to clarify is why we assigned an SUV of zero to lesions not detected visually. This is because FDG-PET may be the first imaging method performed in patients with cancer. In this situation, lesions not detected by visual analysis will never be semi-quantified.
Conclusion

For this group of patients, the optimal SUV to differentiate malignant from benign lesions is 2.5. Scans of patients who were previously treated or who have a history of trauma or severe osteodegenerative disease should be read with caution.
References

1. Gallagher, B.M.; Fowler, J.S.; Gutterson, N.I.; MacGregor, R.R.; Wan, C.N.; Wolf, A.P. Metabolic trapping as a principle of radiopharmaceutical design: some factors responsible for the biodistribution of 2-deoxy-2-[18F]fluoro-D-glucose. J. Nucl. Med. 19:1154–1161; 1978.
2. Knox, W.E.; Jamdar, S.C.; Davis, P.A. Hexokinase, differentiation and growth rates of transplanted rat tumors. Cancer Res. 30:2240–2244; 1970.
3. Som, P.; Atkins, H.L.; Bandoypadhyay, D.; et al. A fluorinated glucose analog, 2-fluoro-2-deoxy-D-glucose (F-18): nontoxic tracer for rapid tumor detection. J. Nucl. Med. 21:670–675; 1980.
4. Delbeke, D. Oncological applications of FDG PET imaging. J. Nucl. Med. 40:1706–1715; 1999.
5. Delbeke, D. Oncological applications of FDG PET imaging: brain tumors, colorectal cancer, lymphoma and melanoma. J. Nucl. Med. 40:591–603; 1999.
6. Wahl, R.L.; Cody, R.; Hutchins, G.; Mudgett, E. Positron-emission tomographic scanning of primary and metastatic breast carcinoma with the radiolabeled glucose analogue 2-deoxy-2-[18F]fluoro-D-glucose. N. Engl. J. Med. 324:200; 1991.
7. Shreve, P.D.; Anzai, Y.; Wahl, R.L. Pitfalls in oncologic diagnosis with FDG PET imaging: physiologic and benign variants. Radiographics 19:61–77; 1999.
8. Kobayashi, A.; Shinozaki, T.; Shinjyo, Y.; et al. FDG PET in the clinical evaluation of sarcoidosis with bone lesions. Ann. Nucl. Med. 14:311–313; 2000.
9. Zhuang, H.; Duarte, P.S.; Pourdehand, M.; Shnier, D.; Alavi, A. Exclusion of chronic osteomyelitis with F-18 fluorodeoxyglucose positron emission tomographic imaging. Clin. Nucl. Med. 25:281–284; 2000.
10. Schulte, M.; Brecht-Krauss, D.; Heymer, B.; et al. Grading of tumors and tumorlike lesions of bone: evaluation by FDG PET. J. Nucl. Med. 41:1695–1701; 2000.
11. Sadato, N.; Tsuchida, T.; Nakaumra, S.; et al. Non-invasive estimation of the net influx constant using the standardized uptake value for quantification of FDG uptake of tumours. Eur. J. Nucl. Med. 25:559–564; 1998.
12. Metz, C.E. Basic principles of ROC analysis. Semin. Nucl. Med. 8:283–298; 1978.
13. Knapp, R.G.; Miller, M.C. Describing the performance of a diagnostic test. In: Knapp, R.G., Miller, M.C., eds. Clinical Epidemiology and Biostatistics. Baltimore: Williams & Wilkins; 1992:31–52.
14. Knapp, R.G.; Miller, M.C. Defining normality using the predictive value method. In: Knapp, R.G., Miller, M.C., eds. Clinical Epidemiology and Biostatistics. Baltimore: Williams & Wilkins; 1992:53–60.
15. Karp, J.S.; Muehllehner, G.; Qu, H.; Yan, X.H. Singles transmission in volume-imaging PET with a 137Cs source. Phys. Med. Biol. 40:929–944; 1995.
16. Bedigian, M.P.; Benard, F.; Smith, R.J.; Karp, J.S.; Alavi, A. Whole-body positron emission tomography for oncology imaging using singles transmission scanning with segmentation and ordered subsets-expectation maximization (OSEM) reconstruction. Eur. J. Nucl. Med. 25:659–661; 1998.
17. Conti, P.S.; Lilien, D.L.; Hawley, K.; Keppler, J.; Grafton, S.T.; Bading, J.R. PET and [18F]-FDG in oncology: a clinical update. Nucl. Med. Biol. 23:717–735; 1996.
18. Hoh, C.K.; Schiepers, C.; Seltzer, M.A.; et al. PET in oncology: will it replace the other modalities? Semin. Nucl. Med. 27:94–106; 1997.