Acceptable compression ratio of full-field digital mammography using JPEG 2000




Clinical Radiology 66 (2011) 609–613

Contents lists available at ScienceDirect

Clinical Radiology journal homepage: www.elsevierhealth.com/journals/crad

Original Paper

Acceptable compression ratio of full-field digital mammography using JPEG 2000

B.J. Kang, H.S. Kim, C.S. Park, J.J. Choi, J.H. Lee, B.G. Choi*

Department of Radiology, The Catholic University of Korea, Seoul, Republic of Korea

Article information

Article history: Received 10 October 2010. Received in revised form 27 January 2011. Accepted 1 February 2011.

AIM: To estimate the acceptable compression ratio of full-field digital mammography (FFDM) using the Joint Photographic Experts Group (JPEG) 2000 compression algorithm.

MATERIALS AND METHODS: Eighty cases that included images of 40 masses (20 benign, 20 malignant) and 40 microcalcifications (20 benign, 20 malignant) were collected. The images were compressed to five different lossy ratios: 20:1, 40:1, 60:1, 80:1, and 100:1, and four radiologists independently determined whether the compressed group was distinguishable from the control group. The ratio of the compressed group that was rated indistinguishable from the control group was compared for each reviewer, and the results were analysed for agreements of three or more reviewers.

RESULTS: The ability to distinguish the compressed image from the control group is given as a range across the four reviewers: 0–1.3% (0/80 to 1/80) of the 20:1, 0–2.5% (0/80 to 2/80) of the 40:1, 5–7.5% (4/80 to 6/80) of the 60:1, 10–37.5% (8/80 to 30/80) of the 80:1, and 30–87.5% (24/80 to 70/80) of the 100:1. For three compression groups (20:1, 40:1, and 60:1), three or more reviewers agreed that there was a distinguishable difference for 0/80, 0/80, and 3/80 images, respectively. Thus, the compressed images do not differ significantly from the control group (p > 0.05). However, the 80:1 and 100:1 compressed images were different for 9/80 and 29/80 images, respectively, which is significantly different from the control group (p < 0.05).

CONCLUSION: The lossy 60:1 compression ratio for FFDM is visually identical to the control image and, therefore, potentially acceptable for primary interpretation.

© 2011 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.

* Guarantor and correspondent: B.G. Choi, Department of Radiology, Seoul St. Mary’s Hospital, The Catholic University of Korea, 505 Banpo-Dong, Seocho-Ku, Seoul 137-040, Republic of Korea. Tel.: +82 2 2258 1485; fax: +82 2 599 6771. E-mail address: [email protected] (B.G. Choi).

0009-9260/$ – see front matter © 2011 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.crad.2011.02.004

Introduction

Picture archiving and communication systems (PACS) are widely used, and it is critically important to compress and store efficiently the large amount of image data they handle, particularly from full-field digital mammography (FFDM). In the medical image field, various methods of data compression and storage have been evaluated.1,2 The Joint Photographic Experts Group (JPEG) 2000 compression algorithm is regarded as the most sensible choice for use in the event that compression is adopted in a modern PACS.3 JPEG 2000 is designed to define compression ratios as required, in contrast to other existing compression algorithms.4

Image compression algorithms are mainly subdivided into lossless (reversible) and lossy (irreversible). Images that are compressed with a lossless algorithm can be perfectly restored to the original images, but they require more storage space due to the low compression ratio.5 Lossy algorithms provide a high compression ratio and require less storage space, but the images may lose some information when restored. An acceptable compression ratio is dependent on a variety of parameters: image contents, compression algorithm, and specific reading tasks; however, the most appropriate compression ratio that will minimize the loss of necessary diagnostic information is not known.3,6,7 For computed tomography (CT) data, even though lossy compression is known to be effective, different visually lossless thresholds and diagnostically lossless thresholds continue to be studied.3,8–11

Mammography, which is done every 1–3 years depending on country, is a useful tool for screening and early detection of breast cancer. Because it is important to compare current mammograms with previous examinations, mammography data must be stored with a compression algorithm that maintains the diagnostic quality. The effect of image compression on detection of suspicious lesions with mammography has been studied previously.12,13 It was reported that digital mammograms compressed with a 40:1 ratio could substitute for the original images in breast cancer diagnosis,12 but the study only evaluated digitized mammography and the effect of compression on accuracy of mass diagnosis. Koff et al.14 reported that the compression level or type had no effect on the diagnostic accuracy of mammography images and that compression had no effect on readers’ subjective assessment of mammography images. They broadly recommended a compression ratio of 15–25 for mammography including CR/DR images.14 However, this study did not evaluate FFDM specifically.

The purpose of the present study was to estimate the acceptable compression ratio for FFDM using the JPEG 2000 compression algorithm.

Materials and methods

Case selection

The institutional review board approved this retrospective study and waived informed patient consent. FFDMs, including bilateral craniocaudal (CC) and mediolateral oblique (MLO) views, were obtained from 80 patients (age range 25–76 years). From digital mammograms acquired with an FFDM unit (Selenia, Hologic; Bedford, MA, USA) in 2009, 80 cases, including 40 masses (20 benign, 20 malignant) and 40 microcalcifications (20 benign, 20 malignant), were collected. In total, 80 consecutive cases that had breast masses or microcalcifications, histologically proven by 16-gauge core needle biopsy or definitive surgery, were included in this study. All mammograms were clinically indicated and requested by clinicians as part of the patient’s diagnostic work-up. Lesion characterization and parenchymal density of selected cases are listed in Table 1.

Table 1. Lesion characterization and parenchymal density of selected cases.

Group              Category          Mass         Microcalcifications   Subtotal   Total
Lesion             Benign            20           20                    40         80
                   Malignant         20           20                    40
Parenchymal type   Fatty             12           14                    26         80
                   Dense             28           26                    54
Lesion size        Mean ± SD (cm)    2.1 ± 0.98   2.76 ± 2.31                      2.43 ± 1.79
                   Range (cm)        0.4–5        0.3–8                            0.3–8

Image compression

Each of the 80 cases (320 images, including bilateral CC and MLO views per case) was compressed to five different lossy ratios: 20:1, 40:1, 60:1, 80:1, and 100:1. These lossy compressed image groups were compared with the original raw image group (control group). The compression algorithm was a two-dimensional JPEG 2000 compression algorithm by Leadtools, version 14.5 (Lead Technologies, Charlotte, NC, USA).

Visual analysis

Four board-certified radiologists participated in the analysis. Reviewers 1–4 had 2, 7, 8, and 15 years of experience in interpreting breast images, respectively. Each reviewer was informed of the purpose of the evaluation, a description of the study protocol, and the structure of datasets to be analysed.

Each of the 400 lossy compressed image sets (80 cases at five compression ratios) was paired with the corresponding control set for visual comparison. Each pair included eight images (bilateral CC and MLO views for both compressed and control). The 400 pairs of control and compressed image groups were randomly assigned to five reading sessions (80 pairs per session), and the order of reading sessions was changed among the reviewers. Reading sessions were separated by a minimum of 2 weeks.

Reviewers used a two-monitor system. The control image, which was identified as such, was always displayed first, and the lossy compressed image was displayed later. Control images were displayed on the left-side monitor, and the compressed images were displayed on the right-side monitor. The CC views were displayed first, followed by the MLO views. In other words, on the left-side monitor, control images were arranged with right CC, left CC, right MLO, and left MLO views, and compressed images were displayed on the right-side monitor in the same order. The reviewer could return to the first image as desired. Each reviewer independently determined whether the compressed image was identical to the control image or distinguishable from it with respect to any detectable mass or microcalcifications (binary response). When making comparisons, the reviewers were asked to pay attention to margin and internal density as well as the texture of uniform attenuation areas.

The digital mammograms were displayed in a one-by-one format on a flat-panel 5-megapixel monochrome screen (PN21IQS, Wide; Yongin, Korea; 21.3 inch, 2048 × 2560 resolution, 0.165 mm pixel pitch, 10 bit grey gradation) using the PACS workstation (m-view version 5.4; Infinitt, Seoul, Korea). All annotations and labels suggesting the compression ratio were hidden. Images were initially presented with specific settings (window level: 2047, window width: 4096), but the reviewers were encouraged to adjust the window width and level settings. Reading time was not limited. The use of functions (e.g., magnification) was not regulated. The ambient room light was subdued. Reading was conducted at the reviewers’ convenience.
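The assignment of the 400 control/compressed pairs to reading sessions can be illustrated with a minimal sketch. The paper states only that pairs were randomly assigned to five sessions of 80 and that session order differed among reviewers; the seed, the function names, and the simple rotation used to vary the order are assumptions made here for illustration, not the authors' procedure.

```python
import random

RATIOS = ["20:1", "40:1", "60:1", "80:1", "100:1"]
N_CASES = 80
N_SESSIONS = 5
N_REVIEWERS = 4


def build_sessions(seed=0):
    """Shuffle all (case, ratio) pairs and cut them into equal-sized sessions."""
    rng = random.Random(seed)  # fixed seed is an assumption, used for repeatability
    pairs = [(case, ratio) for case in range(1, N_CASES + 1) for ratio in RATIOS]
    rng.shuffle(pairs)
    size = len(pairs) // N_SESSIONS
    return [pairs[i * size:(i + 1) * size] for i in range(N_SESSIONS)]


def session_order(reviewer, n_sessions=N_SESSIONS):
    """Hypothetical rotation: reviewer k starts at session k and wraps around."""
    return [(reviewer + k) % n_sessions for k in range(n_sessions)]


sessions = build_sessions(seed=2011)
for reviewer in range(N_REVIEWERS):
    order = [index + 1 for index in session_order(reviewer)]
    print(f"Reviewer {reviewer + 1} reads sessions in order {order}")
```

Plain shuffling does not prevent the same case from appearing at several compression ratios within one session; whether the original randomisation constrained this is not stated.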

Statistical analysis

SPSS (SPSS 17.0 for Windows; SPSS, Chicago, IL, USA) was used for analysing inter-observer agreement. A McNemar test based on the cumulative binomial distribution (MedCalc, version 10.4.8; MedCalc Software, Mariakerke, Belgium) was used for comparison between each pair. p < 0.05 was considered to indicate a statistically significant difference. In addition, the analysis was done independently for each reviewer and on the agreement of three or more reviewers. A positive response was counted if the compressed and control images were distinguishable.
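As a rough illustration of a McNemar test based on the cumulative binomial distribution, the sketch below computes a two-sided exact p-value from the two discordant counts. It is not the MedCalc implementation. It assumes that the control set is never rated "distinguishable", so that every distinguishable compressed set is a discordant pair and the opposite discordant count is zero; the function name `mcnemar_exact` is hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant counts b and c.

    Under the null hypothesis each discordant pair falls on either side
    with probability 0.5, so the smaller count follows Binomial(b + c, 0.5).
    Returns None when there are no discordant pairs (no p-value available).
    """
    n = b + c
    if n == 0:
        return None
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed layout: b = images rated distinguishable, c = 0 for the control.
for distinguishable in (0, 3, 9, 29):
    print(distinguishable, mcnemar_exact(distinguishable, 0))
```

Under that assumption, 3 and 9 discordant images give 0.25 and about 0.004, and zero discordant images give no p-value.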

Results

The intraclass correlation coefficient (ICC) for inter-reviewer agreement was 0.73 (p < 0.05). The ability to distinguish the compressed image from the control group is given as a range across the four reviewers: 0–1.3% (0/80 to 1/80) of the 20:1, 0–2.5% (0/80 to 2/80) of the 40:1, 5–7.5% (4/80 to 6/80) of the 60:1, 10–37.5% (8/80 to 30/80) of the 80:1, and 30–87.5% (24/80 to 70/80) of the 100:1 (Table 2). Regarding the images from the 20:1 group, two reviewers reported that all 80 were identical, but the other two reviewers reported that only 79 out of 80 were identical. Regarding the images from the 40:1 group, two reviewers reported that all 80 were identical, but the other two reviewers reported that only 78 out of 80 were identical.

Table 3 lists the statistics when three or more reviewers agreed that images were distinguishable. All 80 cases in both the 20:1 and 40:1 groups were judged identical to the control group, so a p-value could not be obtained for these two groups. Three images from the 60:1 group were reported to be distinguishable, which was not a significant difference from the control group (p = 0.25). However, nine and 29 cases were reported to be distinguishable from the 80:1 group (p = 0.004) and the 100:1 group (p < 0.0001), respectively, which is a significant difference from the control group.
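The reviewer ranges quoted above follow directly from the per-reviewer counts reported in Table 2. The following sketch recomputes them; the counts are transcribed from Table 2, while the function names and the use of half-up rounding (chosen so that 1/80 prints as 1.3%, as in the paper) are assumptions of this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

N_IMAGES = 80

# Distinguishable counts for reviewers 1 to 4, transcribed from Table 2.
COUNTS = {
    "20:1": [0, 1, 0, 1],
    "40:1": [2, 0, 0, 1],
    "60:1": [6, 4, 6, 4],
    "80:1": [15, 12, 30, 8],
    "100:1": [38, 32, 70, 24],
}


def percent(count, total=N_IMAGES):
    """Percentage to one decimal place, rounding halves up."""
    value = Decimal(count * 100) / Decimal(total)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def reviewer_range(counts):
    """Lowest and highest count across reviewers, with their percentages."""
    low, high = min(counts), max(counts)
    return low, high, percent(low), percent(high)


for ratio, counts in COUNTS.items():
    low, high, p_low, p_high = reviewer_range(counts)
    print(f"{ratio}: {low}/{N_IMAGES} to {high}/{N_IMAGES} ({p_low}% to {p_high}%)")
```

Decimal arithmetic is used because Python's built-in `round` rounds exact halves to the nearest even digit, which would print 1.25 as 1.2 rather than 1.3.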

Table 2. Results of reviewers’ visual analysis per compression ratio. Values are n (%), where n is the number of comparatively distinguishable images.

Reviewer   20:1      40:1      60:1      80:1        100:1
1          0 (0.0)   2 (2.5)   6 (7.5)   15 (18.8)   38 (47.5)
2          1 (1.3)   0 (0.0)   4 (5.0)   12 (15.0)   32 (40.0)
3          0 (0.0)   0 (0.0)   6 (7.5)   30 (37.5)   70 (87.5)
4          1 (1.3)   1 (1.3)   4 (5.0)   8 (10.0)    24 (30.0)

Discussion

Many researchers regard JPEG 2000 as the most appropriate choice if image compression is adopted in a modern PACS.3 Although there are many compression algorithms to study, the JPEG 2000 algorithm was chosen because it could be widely applied. In the results of the present study, the responses of all four reviewers were similar and suggested that 20:1 and 40:1 compressed images were completely indistinguishable from the control images, 60:1 compressed images were statistically identical to the control images, but images compressed to a level of 80:1 or greater were distinguishable from the control images. In the malignancy and mass groups, 60:1 JPEG 2000 compression is acceptable for the primary interpretation of mammograms (Table 3). From these results, the acceptable visually lossless threshold is estimated to be somewhere between 60:1 and 80:1 for digital mammograms compressed using the JPEG 2000 algorithm.

The effect of image compression on mammograms has been studied less frequently than the effect of compression on plain radiographs and CT. CT images generally exhibit lower tolerance to compression than plain radiographs.6,15–17 The range of acceptable compression levels was reported to be 8:1 to 20:1. These reports typically evaluated the diagnostic performance (the diagnostically lossless threshold for the compression ratio) with a receiver operating characteristic study. Although the visually lossless criterion would likely allow a relatively lower compression ratio, this more conservative criterion should be more readily acceptable, even by sceptical radiologists.1 This concept was introduced to chest radiography by Slone et al.,1,7 who concluded that JPEG compression up to 10:1 is visually lossless.

The design of the present study was intended to be as conservative as possible in any estimate of the visually lossless threshold. The alternating presentation of registered images on the same monitor was used because the human visual system is naturally drawn to changes in structure or brightness.1 In addition, viewing distance was unconstrained, and images were magnified by displaying them in a one-by-one format.
This study design, together with the adoption of a visually lossless threshold, should result in a very conservative and widely accepted threshold for evaluating the lossy compression ratio.

The effect of image compression on the detection of suspicious lesions by mammography has been previously studied.12,13 For instance, Liang et al.12 in 2008 reported that digitized mammograms compressed at 40:1 could be used to substitute control images in the diagnosis of breast cancer; however, it is uncertain whether this compression ratio is acceptable for the characterization of the nodules and for the detection of a potential auxiliary or coincidental finding that might be clinically important in the same mammography dataset. Moreover, the study was performed with digitized mammography, and FFDM may be different.

The purpose of the present study was to estimate the visually lossless threshold for JPEG 2000 compression of digital mammograms. The visually lossless threshold was chosen rather than the diagnostically lossless threshold because the former would be more objective and must precede the analysis for diagnostic performance. The results of the present study suggest that 60:1 JPEG 2000 compression is visually lossless statistically for most reviewers and is, therefore, potentially acceptable for the primary interpretation of mammograms without compromising image quality, which eliminates the need to maintain the control images as the diagnostic standard.

Table 3. Results of visual analysis of compression ratios by disease type. Values are n (p-value), where n is the number of distinguishable images agreed on by three or more reviewers. NA, not available.

                                               Identical                           Distinguishable
Type               Group                 n     20:1     40:1     60:1        80:1        100:1
Malignancy         Benign                40    0 (NA)   0 (NA)   0 (NA)      3 (0.25)    9 (0.004)
                   Malignant             40    0 (NA)   0 (NA)   3 (0.250)   6 (0.031)   20 (0.000)
Lesion             Mass                  40    0 (NA)   0 (NA)   3 (0.250)   8 (0.008)   17 (0.000)
                   Microcalcifications   40    0 (NA)   0 (NA)   0 (NA)      1 (1.000)   12 (0.000)
Parenchymal type   Fatty                 26    0 (NA)   0 (NA)   2 (0.500)   5 (0.063)   17 (0.000)
                   Dense                 54    0 (NA)   0 (NA)   1 (1.000)   4 (0.125)   12 (0.000)
Total                                    80    0 (NA)   0 (NA)   3 (0.250)   9 (0.004)   29 (0.000)

This reduction in data would directly affect operational costs in transmission and storage. The mean file size of each control image was 17,043,094 ± 21 bytes. The mean file sizes of each 20:1, 40:1, 60:1, 80:1, and 100:1 image were 642,380 ± 124; 322,880 ± 214; 216,391 ± 137; 163,195 ± 117; and 131,247 ± 100 bytes, respectively. The file size of each FFDM image compressed to 60:1 was only 1.3% (216,391/17,043,094 bytes) of that of the control FFDM image. When one set of FFDM images (bilateral CC and MLO views) was compressed to 60:1, the file size would be decreased from 68,172,376 bytes to 865,564 bytes. Therefore, compression of FFDM offers space saving without compromising interpretation and image quality.

The first limitation of the present study is that the interpretations of the individual reviewers may be different because four mammography films (CC and MLO views of bilateral breasts) were displayed. To minimize this limitation, the images were arranged so that a mammogram with a clearly visible lesion was compared with the same lesion type. However, subtle lesions on FFDM, particularly microcalcifications, stromal distortion, and asymmetry, may indicate early malignancy, and the detection of subtle and small malignancy is very important in screening FFDM. The vast majority of screening mammograms are normal or have benign features. In the present study, microcalcifications ranging from 0.3–8 cm were included (Table 1) and microcalcifications with an extent less than 1 cm were evident at the 60:1 compression ratio (Table 3); however, subtle and faint microcalcifications, stromal distortion, and asymmetry were not included in the control images in the present study. If normal mammograms, subtle microcalcifications, stromal distortion, or asymmetry were included, some readers might not detect the lesion and the interpretations of the individual reviewers might be different.
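The storage figures reported in this Discussion can be re-derived from the mean file sizes with a few lines of arithmetic. The sketch below uses only the byte counts given in the text; the constant and function names are hypothetical, and the standard deviations are ignored.

```python
# Mean file sizes in bytes, as reported in the Discussion.
CONTROL_BYTES = 17_043_094
COMPRESSED_BYTES = {
    "20:1": 642_380,
    "40:1": 322_880,
    "60:1": 216_391,
    "80:1": 163_195,
    "100:1": 131_247,
}
VIEWS_PER_SET = 4  # bilateral CC and MLO views


def fraction_of_control(ratio):
    """Mean compressed file size as a fraction of the mean control file size."""
    return COMPRESSED_BYTES[ratio] / CONTROL_BYTES


def set_bytes(bytes_per_image, views=VIEWS_PER_SET):
    """Total size of one four-view examination at the given per-image size."""
    return bytes_per_image * views


for ratio, size in COMPRESSED_BYTES.items():
    share = fraction_of_control(ratio) * 100
    print(f"{ratio}: {share:.1f}% of control, {set_bytes(size):,} bytes per set")
print(f"control: {set_bytes(CONTROL_BYTES):,} bytes per set")
```

For the 60:1 group this reproduces the quoted 1.3% share and the reduction of a four-view set from 68,172,376 to 865,564 bytes.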
In the future, further research on large numbers of FFDM, including normal mammograms and subtle lesions, particularly microcalcifications, stromal distortion, and asymmetries, will be needed. In addition, cases were selected evenly to reduce deviation in lesion characterization; the same number of cases of malignant, benign, mass, and microcalcifications were presented. In spite of these efforts, deviation may have occurred if the reviewers did not identify the lesion due to observer variability. Even if the lesions were identified, the reviewers may have been biased by either the CC or MLO view. The lesion may cause reviewers to disagree if there are different views on margin, internal density, and texture of uniform attenuation areas. To offset this variability, several reviewers were asked to evaluate a large number of cases.

Second, instead of evaluating the reviewers’ responses based on the diagnosis as benign and malignant, the image quality was compared based on the reviewers’ ability to differentiate compressed images from the controls. Lazarus et al.18 in 2006 reported that only fair agreement was obtained for the final assessment category of the Breast Imaging-Reporting and Data System (BI-RADS), which is usually used to distinguish benign from malignant lesions in mammography. Furthermore, most women in Asia have smaller and denser breasts, which may cause false-negative results and make it difficult to distinguish between benign and malignant lesions.19 The density of the breast parenchyma was evaluated according to the gradation of the American College of Radiology BI-RADS protocol on a scale of 1–4.20 Patients who participated in this study were distributed as follows: three patients in scale 1; 23 in scale 2; 33 in scale 3; and 21 in scale 4, which shows that there were relatively more dense breasts than fatty breasts in the cases studied (Table 1). In this study, there was no difference in acceptable compression ratios between fatty and dense parenchymal types (Table 3). In the present study, therefore, the analysis of image quality is more objective and must precede the analysis of diagnostic performance. Until now, acceptable image compression ratios for FFDM have not been studied, which makes the present study significant. In the future, further research to establish diagnostically acceptable compression ratios by assessing diagnostic performance outcomes will be needed.

In conclusion, digital mammographic images irreversibly compressed at a level of 40:1 using the JPEG 2000 algorithm are completely indistinguishable from control images, and 60:1 images are statistically visually lossless. Therefore, the lossy 60:1 compression ratio is potentially acceptable for primary interpretation without compromising image quality.

References

1. Slone RM, Foos DH, Whiting BR, et al. Assessment of visually lossless irreversible image compression: comparison of three methods by using an image-comparison workstation. Radiology 2000;215:543–53.
2. Agarwal A, Rowberg AH, Kim Y. Fast JPEG 2000 decoder and its use in medical imaging. IEEE Trans Inform Technol Biomed 2003;7:184–90.
3. Lee KH, Kim YH, Kim BH, et al. Irreversible JPEG 2000 compression of abdominal CT for primary interpretation: assessment of visually lossless threshold. Eur Radiol 2007;17:1529–34.
4. Ringl H, Schernthaner RE, Bankier AA, et al. JPEG 2000 compression of thin-section CT images of the lung: effect of compression ratio on image quality. Radiology 2006;240:869–77.
5. Shiao YH, Chen TJ, Chuang KS, et al. Quality of compressed medical images. J Digit Imaging 2007;20:149–59.
6. Erickson BJ, Manduca A, Palisson P, et al. Wavelet compression of medical images. Radiology 1998;206:599–607.
7. Slone RM, Muka E, Pilgram TK. Irreversible JPEG compression of digital chest radiographs for primary interpretation: assessment of visually lossless threshold. Radiology 2003;228:425–9.
8. Tamm EP, Thompson S, Venable SL, et al. Impact of multislice CT on PACS resources. J Digit Imaging 2002;15(Suppl. 1):96–101.
9. Rubin GD. 3-D imaging with MDCT. Eur J Radiol 2003;45(Suppl. 1):S37–41.
10. Rubin GD. Data explosion: the challenge of multidetector-row CT. Eur J Radiol 2000;36:74–80.
11. Lee KH, Lee HJ, Kim JH, et al. Managing the CT data explosion: initial experiences of archiving volumetric datasets in a mini-PACS. J Digit Imaging 2005;18:188–95.
12. Liang Z, Du X, Liu J, et al. Effects of different compression techniques on diagnostic accuracies of breast masses on digitized mammograms. Acta Radiol 2008;49:747–51.
13. Kocsis O, Costaridou L, Varaki L, et al. Visually lossless threshold determination for microcalcification detection in wavelet compressed mammograms. Eur Radiol 2003;13:2390–6.
14. Koff D, Bak P, Brownrigg P, et al. Pan-Canadian evaluation of irreversible compression ratios. J Digit Imaging 2009;22:569–78.
15. Zheng B, Sumkin JH, Good WF, et al. Applying computer-assisted detection schemes to digitized mammograms after JPEG data compression: an assessment. Acad Radiol 2000;7:595–602.
16. Goldberg MA, Gazelle GS, Boland GW, et al. Focal hepatic lesions: effect of three-dimensional wavelet compression on detection at CT. Radiology 1997;202:159–65.
17. Cosman PC, Davidson HC, Bergin CJ, et al. Thoracic CT images: effect of lossy image compression on diagnostic accuracy. Radiology 1994;190:517–24.
18. Lazarus E, Mainiero MB, Schepps B, et al. BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value. Radiology 2006;239:385–91.
19. Crystal P, Strano SD, Shcharynski S, et al. Using sonography to screen women with mammographically dense breasts. AJR Am J Roentgenol 2003;181:177–82.
20. American College of Radiology. Breast imaging reporting and data system, breast imaging atlas. 4th ed. Reston, VA: American College of Radiology; 2003.