Computer Methods and Programs in Biomedicine 71 (2003) 105-115 www.elsevier.com/locate/cmpb
Compression assessment based on medical image quality concepts using computer-generated test images
O. Kocsis a, L. Costaridou a, G. Mandellos b, D. Lymberopoulos b, G. Panayiotakis a,*
a Department of Medical Physics, School of Medicine, University of Patras, GR 26500 Patras, Greece
b Department of Electrical and Computer Engineering, Wire Communications Laboratory, University of Patras, Patras 26500, Greece
Received 20 February 2002; received in revised form 23 May 2002; accepted 23 May 2002
Abstract
Compression algorithms are widely used in medical imaging systems for efficient image storage, transmission, and display. Important factors in the acceptance of lossy compression algorithms in the clinical environment are the assessment of ‘visually lossless’ compression thresholds and the development of assessment methods requiring less data and time than observer performance-based studies. In this study a set of quantitative measurements related to medical image quality parameters is proposed for compression assessment. Measurements were carried out using region of interest (ROI) operations on computer-generated test images with characteristics similar to those of radiographic images. As a paradigm, the assessment of the lossy Joint Photographic Experts Group (JPEG) algorithm, available in a telematics application for healthcare, is presented. A compression ratio of 15 was found to be the visually lossless threshold for the lossy JPEG algorithm, in agreement with previous observer performance studies. Up to this ratio low contrast discrimination is not affected, image noise level is decreased, high contrast line-pair amplitude is decreased by less than 3%, and input/output gray level differences are minor (less than 1%). This type of assessment provides information regarding the type of loss and offers cost and time benefits, together with the advantage that test images can be adapted to the requirements of a certain imaging modality and clinical study. © 2002 Elsevier Science Ireland Ltd. All rights reserved.
Keywords: Medical image compression; JPEG; Computer-generated test image; Assessment; Teleradiology
* Corresponding author. Tel.: +30-610-996-113; fax: +30-610-996-113. E-mail address: [email protected] (G. Panayiotakis).
0169-2607/02/$ - see front matter © 2002 Elsevier Science Ireland Ltd. All rights reserved. PII: S0169-2607(02)00090-1
1. Introduction
The large volume of information that is generated and handled in medical imaging systems (Picture Archiving and Communication Systems, PACS; telemedicine applications) is affecting storage, transmission, real-time display and processor speed requirements. As a solution, different types of image compression algorithms are used, the most common being the Joint Photographic Experts Group (JPEG) algorithm, while wavelet and fractal compression algorithms are emerging. Since in medical imaging applications any loss of information may be critical,
lossless methods are recommended, especially for diagnostic purposes [1]. The compression ratios offered by these methods [2,3] are much lower than those offered by lossy methods; it is therefore important to define acceptable compression ratios for visually lossless compression, that is, lossy compression that nevertheless preserves displayed medical image quality [4-6].
One method to assess compression algorithms is based on objective measures, such as normalized mean-square error (NMSE), sum of absolute differences, peak signal-to-noise ratio (PSNR), and histogram of the difference image; it is mainly used in the development and implementation steps of a compression algorithm [7-9].
A clinically accepted subjective method for the assessment of compressed image quality is receiver operating characteristic (ROC) analysis, based on observer rating of disease in compressed images compared to gold standard classification of original images. ROC considerations include number of images and number of observers, normal to abnormal image ratio, threshold for classification, subtlety of disease, appearance of disease type, and controlled display conditions [10,11]. Although data set considerations are addressed by reference image data sets [12], carrying out a ROC compression assessment study is still not trivial: the number of images is multiplied by the number of compression ratios considered for the study, which limits how finely consecutive compression ratios can be spaced.
Another approach for compressed image quality assessment could be based on medical image quality parameters, such as those defined in radiological practice and used for performance evaluation of the different medical imaging system components [13-15]. Assessment of these parameters correlates with the visual appearance of image quality degradation.
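For reference, the global objective measures named above (NMSE and PSNR) can be sketched in a few lines. This is a minimal illustration and not part of the study: the function names, the tiny pixel lists and the choice to normalize the MSE by the mean energy of the original image are assumptions, and other normalization conventions for NMSE exist in the literature.

```python
import math


def mse(original, compressed):
    """Mean-square error between two equally sized gray-level sequences."""
    if len(original) != len(compressed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)


def nmse(original, compressed):
    """MSE normalized by the mean energy of the original (assumed convention)."""
    energy = sum(o ** 2 for o in original) / len(original)
    if energy == 0:
        raise ValueError("original image has zero energy")
    return mse(original, compressed) / energy


def psnr(original, compressed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, compressed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Flattened pixel values of a hypothetical 2 x 2 image pair (illustration only)
original = [100, 100, 200, 200]
compressed = [101, 99, 198, 202]
print(mse(original, compressed), nmse(original, compressed), psnr(original, compressed))
```

Each function returns a single number for the whole image, which is why such measures cannot indicate where in the image, or at which spatial frequency, the loss occurred.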
In this study, a set of procedures and measurements for objective assessment of compression algorithms in terms of input/output gray level response, high contrast response, low contrast discrimination, and noise variation as functions of compression ratio is presented. The compression algorithm is applied to digital test images, designed taking into account radiographic image characteristics with respect to size (matrix size and pixels per inch), contrast, spatial resolution and noise content. To validate this approach, the lossy JPEG compression algorithm, available in a telematics application for healthcare, implemented by the Hellenic Telecommunication Organization (OTE), is used as a paradigm.
2. Materials and methods
2.1. Image quality assessment
A medical imaging system involves many components such as acquisition, compression, transmission, image processing, storage, printing and display, and each component can affect the final digital image quality. Medical image quality is primarily dictated in the acquisition step, and is determined by imaging equipment characteristics and examination conditions. Image quality is expressed in terms of image sharpness (spatial resolution and contrast resolution) and image noise level [13]. Image transfer characteristics of a component are used to verify the appropriateness of its performance with respect to image quality.
To measure the degree of image degradation, different test objects (physical phantoms or computer-generated test images) or clinical images can be used in subjective or objective studies, depending on the goal of the assessment and on which component is being tested. In the acquisition step, image degradation assessment is performed using different physical phantoms. For components that take a digital image as input, digital test images (computer-generated ones or digital images of physical phantoms), whose characteristics are well known, can be used for image quality assessment. In the assessment of compression algorithms, subjective observer studies, including ROC, have used either (a) clinical images [5,10,11,16,17] or (b) digital test images [16,18-20], while objective measurements have been performed using both types of images [7,9,20-22].
Fig. 1. Part of computer-generated test image including different types of patterns indicated with arrows.
2.2. Assessment parameters and test images
The test images used in this study for image degradation assessment (due to compression) are computer-generated [15]. A test image includes different types of test patterns used for the assessment of parameters related to certain image quality characteristics. Step patterns are associated with the assessment of input/output gray level response, and image noise variation, as functions of gray level. High contrast resolution patterns are associated with the assessment of high contrast response. Low contrast discrimination patterns are associated with the assessment of smallest discriminated size for varying low contrast values. Fig. 1 presents a representative part of a test image, where different types of patterns are included.
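As a rough illustration of how a step pattern of this kind could be synthesized, the sketch below builds a strip of 16 equally spaced gray levels from 0 to 255 and adds zero-mean Gaussian noise with a standard deviation of about 3 gray levels, the values used for the test images in this study. It is not the design tool of [15]: the function names, the 32-pixel band size, the clipping to the 8-bit range and the fixed random seed are all assumptions made for the example.

```python
import random
import statistics


def step_levels(n_steps=16, low=0, high=255):
    """Equally spaced gray levels from low to high (inclusive)."""
    return [round(low + i * (high - low) / (n_steps - 1)) for i in range(n_steps)]


def step_pattern(levels, step_width=32, height=32):
    """2-D list of rows; each level fills a vertical band step_width pixels wide."""
    row = [level for level in levels for _ in range(step_width)]
    return [list(row) for _ in range(height)]


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clip the result to the 8-bit range."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


levels = step_levels()
clean = step_pattern(levels)
noisy = add_gaussian_noise(clean, sigma=3.0)

# Check the noise level on a mid-gray step, away from the clipping limits
band = 8  # index of the step with gray level 136
roi = [p for row in noisy for p in row[band * 32:(band + 1) * 32]]
print(levels)
print(round(statistics.pstdev(roi), 2))
```

Measuring the standard deviation on a mid-gray band avoids the bias that clipping introduces at the darkest and brightest steps.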
These patterns can have different dimensions, orientation, positioning, and resolution, while providing a varying, user-defined range of the parameters being tested. For the purpose of this study three test images have been created. The characteristics of the first test image (TI1) are: resolution 500 ppi, pixel depth 8 bits, and size 3.937 MB (as a result of a 2K × 2K matrix size). TI1 includes five step patterns (gray level from 0 to 255, with 16 steps), ten spatial resolution patterns (five vertical and five horizontal, each including groups of 9.85, 4.92, 3.22, 2.46, 1.97 and 1.64 lp/mm) and five low contrast discrimination patterns with constant background and variable contrast (object size from 0.1 to 1.0 mm in 0.1 mm steps, and contrast from 1 to 9% in 1% steps). The patterns are distributed at the center and four corners of the entire test image area, to test compression algorithm performance with respect to spatial location. The second test image (TI2) and the third one (TI3) use the same characteristics and patterns, with Gaussian noise added at a standard deviation of approximately 3 gray levels and approximately 5 gray levels, respectively, in an effort to mimic radiographic noise. For all test images the value ranges for each pattern were selected taking into account the characteristics of medical images. Specifically, the spatial resolutions for the high contrast resolution patterns, and the object size and contrast for the low contrast discrimination patterns, were selected to match radiography requirements.
2.3. Teleradiology application
The lossy JPEG compression algorithm used in a telematics application for healthcare, implemented by the Hellenic Telecommunication Organization, is assessed as part of an overall assessment of the application [23]. This application provides a stack of functions, protocols and interfaces suitable for co-ordination and management of high-level consult, report and review activities [23-25]. The users collaborate through the establishment of sessions between two or more parties. The events of any session are performed in three discrete phases (convene, execution and reporting). The convene phase is employed for the asynchronous (off-line) transmission of the bulk of the application data among the collaborating users. During the execution phase, the users handle the application data through either an off-line or an on-line session. The latter is a synchronous session that allows the participating users to collaborate in real time, in order to reach a common diagnosis. A schematic representation of the application is provided in Fig. 2.
Fig. 2. Schematic representation of the telemedicine application of the Hellenic Telecommunication Organization.
For efficient image storage and transmission, lossless and lossy JPEG compression algorithms are available. For the lossy JPEG algorithm tested, a quality index (QI) from 1 to 99% is available, representing the scale factor for each entry in the quantization table. A QI of 99% corresponds to the finest quantization and a QI of 1% to the coarsest, the latter leading to higher compression ratios. A QI of 100% corresponds to lossless JPEG compression. The resulting compression depends both on the selected QI and on the characteristics of the image itself. Because the assessment parameters are evaluated with respect to compression ratio, the correspondence between QI and compression ratio was determined (Fig. 3). The compression ratio is defined as the ratio of the size of the original image to that of the compressed image.
Fig. 3. JPEG quality index conversion to compression ratio for the three test images, used in this study.
2.4. Image display and measurements
For display and measurements of original and reconstructed images after compression at different selected QIs, the ROI operations of a visualization tool, developed in our department, were used [26,27]. To test input/output gray level response and image noise variation with respect to gray level, the mean and standard deviation (S.D.) of gray level values of square ROIs (20 × 20 pixels) inside each step of the step patterns were used. The input/output gray level difference is given by the absolute difference of mean gray level values of corresponding steps between the original and compressed images. To assess high contrast response, for each high contrast resolution pattern, a set of profile lines was used to obtain an ‘average’ line profile. Using this line profile the amplitude of each high contrast line-pair group is obtained. To assess low contrast discrimination, the size of the smallest discriminated objects is determined by inspection, using an amplitude criterion [28] on profile lines corresponding to columns of low contrast objects. In this study, an object in a compressed image is considered discriminated if its amplitude is at least one third of the original object amplitude.
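The step-pattern measurements described above can be sketched as follows. This is only an illustration under stated assumptions, not the visualization tool of [26,27]: the function names and the uniform test arrays are hypothetical, images are represented as plain lists of rows, and the population standard deviation is used because the text does not say whether a population or sample estimate was taken.

```python
import statistics


def roi_values(image, top, left, size=20):
    """Pixel values of a size x size square ROI taken from a 2-D list of rows."""
    return [p for row in image[top:top + size] for p in row[left:left + size]]


def roi_mean_sd(image, top, left, size=20):
    """Mean and population standard deviation of the gray levels in the ROI."""
    values = roi_values(image, top, left, size)
    return statistics.fmean(values), statistics.pstdev(values)


def gray_level_difference(original, compressed, top, left, size=20):
    """Absolute difference of ROI mean gray levels between the two images."""
    mean_o, _ = roi_mean_sd(original, top, left, size)
    mean_c, _ = roi_mean_sd(compressed, top, left, size)
    return abs(mean_o - mean_c)


# Hypothetical 40 x 40 uniform step at gray level 136 and a copy shifted by one level
original = [[136] * 40 for _ in range(40)]
compressed = [[137] * 40 for _ in range(40)]
mean, sd = roi_mean_sd(original, top=10, left=10)
diff = gray_level_difference(original, compressed, top=10, left=10)
print(mean, sd, diff)
```

Placing the ROI well inside a step, as in the example, keeps block-boundary effects at the step edges out of the measurement.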
3. Results
Figs. 4-7 show representative results based on measurements performed using the three test images, TI1, TI2 and TI3, on original and compressed images. Fig. 4a presents the input/output gray level response as a function of the compression ratio for the step patterns of the noise-free test image, TI1. An increased variation of the input/output gray level difference (in absolute values) is observed for compression ratios higher than 15. Input/output gray level responses were also studied in the presence of two noise levels, using the corresponding step patterns of TI2 and TI3. Fig. 4b presents the input/output gray level response for the step patterns of TI2. As seen from this figure, only minor differences are noticed up to a compression ratio of 15. Similar results were obtained for TI3.
Fig. 4. Input/output gray level response derived using step patterns of (a) noise free test image, TI1, and (b) noisy test image, TI2.
Assessment of noise variation as a function of gray level, expressed as S.D., in the case of TI1 (noise-free test image) demonstrates that no noise is introduced by the compression algorithm in uniform areas corresponding to 16 gray level steps ranging from 0 to 255 (S.D. = 0 for each step of the step patterns and for all compression ratios). In the presence of noise (TI2 and TI3), S.D. initially decreases for all gray levels, up to a compression ratio of 20, and then increases. The increase varies randomly with gray level. Fig. 5a and b show S.D. as a function of compression ratio for TI2 and TI3, respectively.
Fig. 5. Image noise variation derived using step patterns of (a) noisy test image of S.D. approximately 3, TI2, and (b) noisy test image of S.D. approximately 5, TI3.
Fig. 6a presents line-pair amplitude as a function of compression ratio for different spatial frequencies, derived using the high contrast resolution patterns of TI1. The degradation of line-pair amplitude with increasing compression ratio becomes more pronounced as spatial frequency increases. Similar results are obtained in the presence of noise (TI2 and TI3). Fig. 6b shows line-pair amplitude as a function of compression ratio for TI2.
Fig. 6. High contrast response, expressed as line-pair amplitude, derived using high contrast resolution patterns of (a) noise free test image, TI1, and (b) noisy test image, TI2.
Fig. 7a shows the effect of compression ratio on discriminated object size at low contrast conditions for TI1 (displayed results correspond to 1, 3, 5 and 9% contrast values). Results were similar across the full range of contrast values (from 1 to 9%). The smallest size discriminated is 0.1 mm for compression ratios up to 15 for all tested contrast values. Compression ratios higher than 15 introduce an increase in discriminated object size. This increase is more significant for low contrasts (1 and 3%). For compression ratios higher than 30 the 1% contrast objects are no longer discriminated.
For the low contrast patterns of TI2 (Fig. 7b), due to the presence of noise, the smallest detectable sizes for contrasts of 1 and 2% are 0.7 and 0.6 mm, respectively. The 1% contrast objects are discriminated for compression ratios up to 20, while the 2% contrast objects are discriminated for compression ratios up to 25. For the 2% contrast objects the discriminated object size is decreased for compression ratios up to 15, and this is likely to be related to the denoising effect of the compression algorithm observed in Fig. 5a. Fig. 7c presents the low contrast discrimination in the case of a higher noise level (TI3). In this case the 1 and 2% objects are not discriminated. However, for the rest of the contrast values the objects are discriminated for compression ratios up to 15.
Fig. 7. Detected objects’ diameter for objects of various contrast values, derived using low contrast discrimination patterns of (a) noise free test image, TI1, (b) noisy test image of S.D. approximately 3, TI2, and (c) noisy test image of S.D. approximately 5, TI3.
The above measurements were repeated for all test patterns located at various positions (the four corners and center), and no dependence of the performance of the JPEG compression algorithm on location was found, as expected from the nature of the algorithm.
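The one-third amplitude criterion behind the low contrast results can be sketched as below. The study determined the smallest discriminated size by inspection of profile lines, so this automated version is only an approximation of that procedure: the function names, the peak-to-peak definition of amplitude and the example profiles are assumptions made for illustration.

```python
def average_profile(profiles):
    """Point-wise average of several equally long profile lines."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def amplitude(profile):
    """Peak-to-peak amplitude of a profile (an assumed definition)."""
    return max(profile) - min(profile)


def is_discriminated(original_profile, compressed_profile, fraction=1 / 3):
    """True if the compressed amplitude is at least `fraction` of the original."""
    return amplitude(compressed_profile) >= fraction * amplitude(original_profile)


def smallest_discriminated(sizes_mm, original_profiles, compressed_profiles):
    """Smallest object size (mm) that still passes the criterion, or None."""
    passed = [size
              for size, orig, comp in zip(sizes_mm, original_profiles, compressed_profiles)
              if is_discriminated(orig, comp)]
    return min(passed) if passed else None


# Hypothetical profiles across an object 5 gray levels above a background of 100
reference = [100, 100, 105, 105, 100, 100]
preserved = [100, 101, 104, 104, 101, 100]   # amplitude 4, passes
flattened = [101, 101, 102, 102, 101, 101]   # amplitude 1, fails

result = smallest_discriminated([0.1, 0.2],
                                [reference, reference],
                                [flattened, preserved])
print(is_discriminated(reference, preserved), is_discriminated(reference, flattened), result)
```

The same `average_profile` and `amplitude` helpers could be applied to the high contrast line-pair groups, where the quantity of interest is the amplitude itself rather than a pass or fail decision.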
4. Discussion and conclusion
In the development and implementation steps of compression algorithms the objective measures used for assessment are NMSE and PSNR. These measures have a ‘global’ character, not providing information regarding the type of loss (i.e. spatial frequency and location) in the compressed image, and do not correlate well with the visual appearance and impression of the images when presented to human observers [29]. This study presents a set of quantitative measurements for compression assessment, based on parameters adopted from medical image quality concepts. This type of assessment is easy to perform and repeat as many times as necessary, and no expert observers are required. The types of patterns included in the test images are directly related to medical image quality parameters and their characteristics can be adapted to the requirements of a certain imaging modality and clinical study (i.e. resolution, type/size of finding, noise level). This assessment is envisaged as a preliminary step for compression assessment during development of a compression algorithm and, afterwards, in the efficient planning of a ROC study with respect to range of interest of compression ratios. This can considerably reduce the total number of images required for an observer performance study.
The results obtained from the application paradigm indicate a compression ratio of 15 as the visually lossless threshold for the lossy JPEG algorithm, in agreement with previous observer performance studies [10,11,16,17]. Up to this ratio low contrast discrimination is not affected, image noise level is decreased (denoising effect of the compression algorithm), high contrast line-pair amplitude is decreased by less than 3% (the highest frequency being the most affected), and input/output gray level differences are minor (less than 1%). Currently, a wide variety of lossy compression algorithms are proposed for clinical use, with the wavelet-based ones performing better than standard lossy JPEG [5,9,16,21,30]. Objective performance assessment methods, such as the one presented in this study, reduce the time and resource costs of subsequent observer performance studies, thus potentially leading to faster integration of compression algorithms in medical applications. In conclusion, the results indicate that the proposed approach to compression assessment, based on medical image quality parameters, provides information regarding the type of loss and offers cost and time benefits, with the additional advantage that test images can be adapted to the requirements of a certain imaging modality and clinical study.
Acknowledgements
Otilia Kocsis was supported by a grant from the State Scholarship Foundation (SSF), Republic of Greece.
References
[1] American College of Radiology (ACR), Standard for Teleradiology, www.acr.org, 1998.
[2] J. Kivijarvi, T. Ojala, T. Kaukoranta, A. Kuba, L. Nyul, O. Nevalainen, A comparison of lossless compression methods for medical images, Comp. Med. Imag. Graph. 22 (1998) 323-339.
[3] D. Okkalides, S. Efremides, Quality assessment of DSA, ultrasound and CT digital images compressed with the JPEG protocol, Phys. Med. Biol. 39 (1994) 1407-1421.
[4] M.G. Strintzis, A review of compression methods for medical images in PACS, Int. J. Med. Inf. 52 (1998) 159-165.
[5] R.M. Slone, D.H. Foos, B.R. Whiting, E. Muka, D.A. Rubin, T.K. Pilgram, K.S. Kohm, S.S. Young, P. Ho, D.D. Hendrickson, Assessment of visually lossless irreversible image compression: comparison of three methods by using an image-comparison workstation, Radiology 215 (2000) 543-553.
[6] H.K. Huang, Teleradiology technologies and some service models, Comp. Med. Imag. Graph. 20 (1996) 59-68.
[7] A. Bruckmann, A. Uhl, Selective medical image compression techniques for telemedical and archiving applications, Comp. Biol. Med. 30 (2000) 153-169.
[8] S.K. Thompson, J.D. Hazle, D.F. Schomer, A.A. Elekes, D.A. Johnston, J. Huffman, C.K. Chui, Performance analysis of a new semiorthogonal spline wavelet compression algorithm for tonal medical images, Med. Phys. 27 (2000) 276-288.
[9] T.A. Iyriboz, M.J. Zukoski, K.D. Hopper, P.L. Stagg, A comparison of wavelet and Joint Photographic Experts Group lossy compression methods applied to medical images, J. Digital Imaging 12 (1999) 14-17.
[10] D.P. Beall, P.D. Shelton, T.V. Kinsey, M.C. Horton, B.J. Fortman, S. Achenbach, V. Smirnoff, D.L. Courneya, B. Carpenter, J.T. Gironda, Image compression and chest radiograph interpretation: image perception comparison between uncompressed chest radiographs and chest radiographs stored using 10:1 JPEG compression, J. Digital Imaging 13 (2000) 33-38.
[11] H. MacMahon, K. Doi, S. Sanada, S.M. Montner, M.L. Giger, C.E. Metz, N. Nakomori, F.F. Yin, X.W. Xu, H. Yonekawa, H. Takeuchi, Data compression: effect on diagnostic accuracy in digital chest radiography, Radiology 178 (1991) 175-179.
[12] Internet URL: http://marathon.csee.usf.edu/Mammography/Database.html
[13] The European Commission, European guidelines on quality criteria for diagnostic radiographic images, EUR16260 EN. Luxembourg: CEC, 1996.
[14] E.P. Efstathopoulos, L. Costaridou, O. Kocsis, G. Panayiotakis, A protocol-based evaluation of medical image digitizers, Br. J. Radiol. 74 (2001) 841-846.
[15] O. Kocsis, L. Costaridou, E.P. Efstathopoulos, D. Lymberopoulos, G. Panayiotakis, A tool for designing digital test objects for module performance evaluation in medical digital imaging, Med. Inf. 24 (1999) 291-308.
[16] A. Kalyanpur, V.P. Neklesa, C.R. Taylor, A.R. Daftary, J.A. Brink, Evaluation of JPEG and wavelet compression of body CT images for direct digital teleradiologic transmission, Radiology 217 (2000) 772-779.
[17] W.F. Good, G.S. Maitz, D. Gur, Joint Photographic Experts Group (JPEG) compatible data compression of mammograms, J. Digital Imaging 7 (1994) 123-132.
[18] G. Anastassopoulos, E. Karavatselou, D. Lymberopoulos, G. Panayiotakis, Comparative evaluation of lossless image compression techniques on mammograms, Proc. Int. Conf. Med. Inf. (1996) 734-738.
[19] L.T. Cook, M.F. Insana, M.A. McFadden, T.J. Hall, G.G. Cox, Contrast-detail analysis of image degradation due to lossy compression, Med. Phys. 22 (1995) 715-721.
[20] D. Okkalides, Assessment of commercial compression algorithms, of the lossy DCT and lossless types, applied to diagnostic image files, Comp. Med. Imag. Graph. 22 (1998) 25-30.
[21] J. Ricke, P. Mass, E.L. Hanninen, T. Liebig, H. Amthauer, C. Stroszczynski, W. Schauer, T. Boskamp, M. Wolf, Wavelet versus JPEG (Joint Photographic Expert Group) and fractal compression, Invest. Radiol. 33 (1998) 456-463.
[22] G. Anastassopoulos, G. Panayiotakis, D. Lymberopoulos, A. Bezerianos, Performance evaluation of breast image compression techniques, Proc. Int. Conf. Med. Phys. Biomed. Eng. 1 (1994) 193-197.
[23] G. Mandelos, G. Economou, K. Mammas, D. Lymberopoulos, An ISDN-based tele-medicine service, Proc. Int. Conf. Adv. Commun. (2001) 161-166.
[24] E. Karavatselou, G.P. Economou, C. Chassomeris, V. Danelli, D. Lymberopoulos, OTE-TS: a new value-added telematics service for telemedicine applications, IEEE Trans. Inform. Tech. Biomed. 5 (2001) 1-16.
[25] D. Lymberopoulos, K. Spiropoulos, G. Anastassopoulos, S. Kotsopoulos, K. Solomou, ELPIDA: a general architecture for medical imaging systems supporting telemedicine applications, J. Electron. Imaging 4 (1) (1995) 84-97.
[26] P. Sakellaropoulos, L. Costaridou, G. Panayiotakis, An image visualization tool in mammography, Med. Inf. 24 (1999) 53-73.
[27] P. Sakellaropoulos, L. Costaridou, G. Panayiotakis, Using component technologies for web based wavelet enhanced mammographic image visualization, Med. Inf. 25 (2000) 171-181.
[28] E.J. Halpern, A test pattern for quality control of scanner and charged-coupled device film digitizers, J. Digital Imaging 8 (1995) 3-9.
[29] H.K. Huang, PACS: Basic Principles and Applications, Wiley-Liss, New York, 1999.
[30] C. Christopoulos, A. Skodras, T. Ebrahimi, The JPEG2000 still image coding system: an overview, IEEE Trans. Consumer Electron. 46 (2000) 1103-1127.