Transform image coding with global thresholding: Application to baseline JPEG




Pattern Recognition Letters 24 (2003) 959–964 www.elsevier.com/locate/patrec

Azza Ouled Zaid *, Christian Olivier, Olivier Alata, François Marmoiton
IRCOM-SIC, UMR-CNRS 6615, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France

Abstract

In this paper, we present an image-adaptive Joint Photographic Experts Group (JPEG) encoding algorithm that incorporates global thresholding based on statistical information about the discrete cosine transform (DCT) coefficient distributions. These statistics are also used to construct rate/distortion-specific quantization tables with nearly optimal tradeoffs. Results on natural and SPOT images demonstrate that the proposed thresholding technique improves the reconstructed image quality compared with that provided by other DCT image coding schemes without thresholding. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: DCT image coding; Global thresholding; Statistical modeling; Information criteria

1. Introduction

The discrete cosine transform (DCT) is used in many popular image compression schemes, in particular the baseline Joint Photographic Experts Group (JPEG) coding algorithm, which achieves good coding performance with low computational complexity (see Pennebaker and Mitchell, 1993). Improving its encoder performance is therefore particularly important. JPEG quantizer step sizes largely determine the rate/distortion tradeoff in the compressed image, and many approaches optimize this tradeoff. Some of them include psycho-visual model-based quantization (Watson, 1993) and stochastic optimization techniques (Monro and

* Corresponding author. E-mail address: [email protected] (A.O. Zaid).

Sherlock, 1993). The best results (with PSNR as the quality metric) achieved by such approaches were reported by Wu and Gersho (1993) and Crouse and Ramchandran (1997), using a search strategy equivalent to moving in the direction that minimizes the Lagrangian cost J = R + λD. Ratnakar (1997) proposed the RD-OPT algorithm, which measures DCT coefficient statistics for the given image data to construct quantization tables in a rate/distortion-specific way, without going through the entire compression–decompression cycle. However, even with adaptive quantizer selection, JPEG must apply the same quantizer step sizes to all blocks of the image; JPEG quantization thus lacks local adaptation. To overcome this limitation, adaptive thresholding of the coefficients can be incorporated into the DCT coding chain. Thresholding refers to adaptively dropping some coefficients to zero. This could

0167-8655/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0167-8655(02)00219-2



refine the quantizer scales for the retained coefficients without adding any complexity to the decoder. A technique for making the zeroing decisions, local thresholding, was proposed by Crouse and Ramchandran (1997); making the zeroing decisions on a per-coefficient basis is computationally expensive for large images. Ratnakar and Livny (2000) suggested another thresholding technique, global thresholding, extending the RD-OPT algorithm to construct a threshold table that decides the zeroing cutoff level for each DCT coefficient. The drawback of RD-OPT is its computational complexity, since the threshold search is carried out along with the quantizer scales in an exhaustive search process. To overcome the shortcomings of these approaches, we formulate a new image-adaptive coding chain (see Fig. 1) by extending existing work on JPEG coding. In our algorithm, we consider two fast, adaptive global thresholding methods, both based on DCT coefficient distribution modeling. The first uses information criteria (IC) to optimally determine the number and sizes of the bins of the AC coefficient histograms. The second, parametric global thresholding, determines the threshold table components as a function of the DCT coefficient probability density. To select a near-optimal quantization table over a wide range of compression rates, we adopt the RD-OPT algorithm. In the entropy encoder, we customize the Huffman tables according to the statistics of the quantized image. For simplicity, only grayscale images are considered, but the same method can also be applied to color images. The remainder of this paper is organized as follows: Section 2 recapitulates the definitions

of local and global thresholding and how these can be applied to DCT image coding. Section 3 describes the modeling of the AC DCT coefficient distributions. In Section 4 we give the main notions required to understand the origin and interest of the IC, and we explain how these criteria are used to select the bin boundaries of the AC histograms. Section 5 presents our parametric global thresholding technique based on coefficient distribution modeling. Finally, Section 6 is devoted to experimental results.

2. Thresholding scheme applied to discrete cosine transform coefficients

2.1. Brief review of the Joint Photographic Experts Group standard

The notation for the JPEG encoding process can be introduced as follows: we express an H × W image as a set I of (H × W)/64 blocks, each with 64 pixels. Applying an 8 × 8 DCT to each block I^b, where b is the block index, yields the block DCT, denoted c^b:

c^b = DCT_{8×8}(I^b),   b = 1, 2, ..., (H × W)/64.   (1)

A quantization table Q is typically specified by 64 step sizes that quantize the DCT coefficients at spatial indexes (i, j), i, j = 0, ..., 7, in each 8 × 8 transformed block c^b. To simplify the notation, we refer to individual DCT coefficients in a block by numbering them in raster order: c^b_{i,j} is referred to as c^b_n, where n = 8i + j. Denoting the quantized block DCT by ĉ^b, for a given quantization table Q the quantized coefficient ĉ^b_n is calculated as:

Fig. 1. Block diagram for the proposed coding chain.


ĉ^b_n = Round(c^b_n / Q_n),   (2)
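As a concrete illustration of Eqs. (1) and (2), the following sketch (in Python with NumPy; the function names, the toy image, and the flat quantization table are ours, not the paper's) computes the 8 × 8 block DCT of a grayscale image and quantizes each block with a step-size table Q:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix D, so that D @ B @ D.T is the 2-D DCT of block B."""
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            alpha = np.sqrt(1.0 / n) if i == 0 else np.sqrt(2.0 / n)
            D[i, j] = alpha * np.cos((2 * j + 1) * i * np.pi / (2 * n))
    return D

def block_dct(image):
    """Eq. (1): split an HxW image into 8x8 blocks and DCT-transform each one."""
    H, W = image.shape
    D = dct_matrix(8)
    # Reorder pixels into a (num_blocks, 8, 8) stack, raster order of blocks
    blocks = (image.reshape(H // 8, 8, W // 8, 8)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, 8, 8))
    # Apply D B D^T to every block at once
    return np.einsum('ij,bjk,lk->bil', D, blocks.astype(float), D)

def quantize(coeffs, Q):
    """Eq. (2): c_hat = Round(c / Q), one step size per spatial index."""
    return np.round(coeffs / Q).astype(int)

# Toy usage with a flat quantization table of step size 16
img = np.arange(16 * 16, dtype=float).reshape(16, 16)
Q = np.full((8, 8), 16.0)
c_hat = quantize(block_dct(img), Q)
print(c_hat.shape)  # (4, 8, 8): four 8x8 blocks
```

A production encoder would use a fast DCT rather than this direct matrix product, but the matrix form keeps the correspondence with Eq. (1) explicit.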

with Round(x) the closest integer to x.

The quantization strategy in JPEG can be enhanced considerably by incorporating a thresholding step in the coding chain. In (Ramchandran and Vetterli, 1994), it is shown that thresholding approximates the performance of finding the best quantization level for each DCT coefficient.

2.2. Thresholding process

We distinguish two thresholding types: local and global thresholding.

• In a local thresholding process, quantized coefficients that are not worth transmitting in a rate/distortion sense may be dropped in order to save bits. Crouse and Ramchandran (1997) described this process mathematically by a set T = {T^b_n}, 0 ≤ n ≤ 63, 1 ≤ b ≤ (H × W)/64, of binary thresholding parameters that decide whether to threshold the components of c^b, i.e.,

T^b_n = 1 ⇒ ĉ^b_n = Round(c^b_n / Q_n),
T^b_n = 0 ⇒ ĉ^b_n = 0.   (3)

• Global thresholding is specified by a table of 64 non-negative real components T[n], 0 ≤ n ≤ 63, called the threshold table. We define global thresholding as follows: if, for any DCT image block c^b, c^b_n < T[n], then c^b_n is set to zero in the quantization step. The combined result of quantization and thresholding is:

ĉ^b_n = Round(c^b_n / Q_n) if c^b_n ≥ T[n],
ĉ^b_n = 0 otherwise.   (4)
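A minimal sketch of Eq. (4) in Python/NumPy (the function name and the toy tables are ours; note that the code compares the coefficient magnitude |c^b_n| against T[n], which matches the dead-zone interpretation, whereas the printed inequality uses the signed value):

```python
import numpy as np

def threshold_and_quantize(coeffs, Q, T):
    """Eq. (4): zero every coefficient whose magnitude falls below its
    threshold T[n], then quantize the survivors with step sizes Q[n].
    coeffs: (num_blocks, n, n) DCT coefficients; Q, T: n x n tables, T[0, 0] = 0."""
    c_hat = np.round(coeffs / Q).astype(int)
    c_hat[np.abs(coeffs) < T] = 0  # same cutoff in every block: "global" thresholding
    return c_hat

# Toy usage on one 2x2 "block" (illustrative sizes, not the 8x8 JPEG case)
coeffs = np.array([[[-40.0, 3.0], [5.0, 90.0]]])
Q = np.full((2, 2), 10.0)
T = np.array([[0.0, 8.0], [8.0, 8.0]])
print(threshold_and_quantize(coeffs, Q, T))  # the 3 and 5 fall below T and are zeroed
```

Because T is the same for every block, the decoder needs no side information, exactly as the text observes.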

In order to preserve the quality of the DC coefficients, we set T[0] to zero; the threshold table design thus operates only on the AC components. The table T does not need to be included with the compressed image, since the decoder does not need to know the threshold values.

Preliminary results have confirmed that local thresholding applied to JPEG is unprofitable in a complexity/performance tradeoff sense, since the zeroing decision is made over all the transformed image coefficients. For this reason, we have adopted global thresholding in our work.

3. Modeling of the discrete cosine transform coefficient distributions

For DCT image coding schemes, different assumptions have been made concerning the distribution of the DCT coefficients. Reininger and Gibson (1983) used Kolmogorov–Smirnov goodness-of-fit tests to conclude that the AC coefficient distributions are closely approximated by Laplacian densities. Other approaches model the coefficient distributions with either the Cauchy distribution (e.g., Baskurt, 1989) or a mixture of Gaussian distributions (e.g., Eude et al., 1994); under these assumptions, Eude concluded that a mixture of 1, 2 or 3 Gaussian distributions is the best model of the AC coefficient distributions. Nevertheless, we have retained the Laplacian model because of its reliability and simplicity. In fact, the distribution model can be readily matched by measuring its shape parameter a as a function of the standard deviation σ. The corresponding density is:

l(x) = (a/2) e^{−a|x|},   where a = √2 / σ.   (5)

Next, we exploit the Laplacian models of the AC coefficient distributions to solve the thresholding problem.

4. Global thresholding using information criteria

4.1. Information criteria definition

IC are tools for parametric model estimation as well as model selection. Initially proposed by Akaike (1973) for selecting the order of an AR model, their usual form is:

IC(k) = −2 Σ_{i=1}^{N} log f(X_i | θ̂_k) + C(N) · k,   (6)



where X^N = {X_1, ..., X_N} is a sample of observations assumed to be independent, and:

(1) θ̂_k is the maximum likelihood estimator of the empirical model θ_k with k free parameters;
(2) −2 Σ_i log f(X_i | θ̂_k) is the likelihood term;
(3) C(N) is the penalty term, which depends on the number of observations N and differs according to the chosen criterion.

We regard the model with the minimum IC value as the better one, that is, k̂ = arg min_k IC(k). In our applications, we have retained the φ_β criterion developed by El Matouat and Hallin (1996).

4.2. IC for optimal histogram construction

The use of IC was extended by Olivier et al. (1994) to approximate an empirical distribution by a histogram with an optimal number of bins. We applied this principle to the AC coefficient distributions; our task reduces to determining the threshold value, which corresponds to the positive boundary of the dead-zone. Note that each distribution is obtained from N = (H × W)/64 observations. The search for the optimal subpartition C = ∪_{r=1}^{c} B_r starts from an initial histogram of m equal-sized bins; we then obtain the c minimizing IC(c) by pooling adjacent bins B_r and B_{r+1}. For φ_β applied to histograms, we proposed the following expression:

φ_β(k) = k (1 + N^β log(log N)) − 2N Σ_{r=1}^{k} ĥ_k(B_r) log( ĥ_k(B_r) / l(B_r) ),   (7)

where ĥ_k is the maximum likelihood estimator of the theoretical distribution h_k approximated by the histogram with k bins {B_r}, r = 1, ..., k, and l(B_r) is the width of bin B_r. In the following section, we describe the pooling of histogram bins, in particular for the AC coefficient histograms.

4.2.1. Pooling of AC discrete cosine transform histogram classes

For each AC coefficient distribution, starting with m bins, a first histogram is built, giving an initial partition M. The merging of bins is carried

out according to an iterative process driven by the variation of the IC. At a given iteration, suppose we have a histogram with k bins B_r, r = 1, ..., j, ..., k. If two adjacent bins B_j and B_{j+1} merge, so that B_r = B_j ∪ B_{j+1}, we obtain a new histogram with (k − 1) bins. Among all possible mergers into (k − 1) classes, we keep the one that minimizes φ_β(k − 1); this yields the histogram used, if necessary, for further pooling. Only the difference D(k) = φ_β(k) − φ_β(k − 1) is computed. If D(k) is positive, the pooling continues from the histogram with (k − 1) classes; otherwise the pooling stops and the optimal histogram is the one with k bins. Given the optimal AC DCT histogram obtained by this pooling process, the desired threshold value equals the positive boundary of the dead-zone.
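The pooling loop can be sketched as follows (Python/NumPy; the criterion follows one plausible reading of Eq. (7), with β = 0.5 as an illustrative penalty exponent, since its value is not fixed in this section — function names and the toy Laplacian data are ours):

```python
import numpy as np

def phi_beta(counts, widths, N, beta=0.5):
    """Eq. (7)-style criterion for a histogram with k bins:
    penalty k(1 + N^beta log log N) minus twice the histogram log-likelihood."""
    k = len(counts)
    p = counts / N                 # ML estimate of each bin's probability
    nz = p > 0                     # skip empty bins (0 log 0 := 0)
    loglik = 2 * N * np.sum(p[nz] * np.log(p[nz] / widths[nz]))
    return k * (1 + N**beta * np.log(np.log(N))) - loglik

def pool_bins(counts, edges, beta=0.5):
    """Greedy pooling: merge the adjacent pair that most decreases the
    criterion; stop when no merge improves it (D(k) <= 0)."""
    counts, edges = counts.astype(float), edges.copy()
    N = counts.sum()
    while len(counts) > 1:
        cur = phi_beta(counts, np.diff(edges), N, beta)
        best = None
        for j in range(len(counts) - 1):   # try every adjacent merge
            c2 = np.concatenate([counts[:j], [counts[j] + counts[j + 1]], counts[j + 2:]])
            e2 = np.concatenate([edges[:j + 1], edges[j + 2:]])
            val = phi_beta(c2, np.diff(e2), N, beta)
            if best is None or val < best[0]:
                best = (val, c2, e2)
        if best[0] >= cur:                 # no merge lowers the criterion: stop
            break
        _, counts, edges = best
    return counts, edges

# Toy usage on synthetic Laplacian "AC coefficients"
rng = np.random.default_rng(0)
data = rng.laplace(0.0, 4.0, 4096)
counts0, edges0 = np.histogram(data, bins=64)
counts, edges = pool_bins(counts0, edges0)
zero_bin = np.searchsorted(edges, 0.0) - 1   # bin containing zero = dead-zone
print(len(counts), edges[zero_bin + 1])      # threshold = its positive boundary
```

This greedy search tries all adjacent pairs at each step, matching the "among all possible mergers" wording, at O(k^2) cost per AC coefficient.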

5. Parametric global thresholding

Following the work of Eude et al. (1994) on statistical modeling applied to baseline JPEG, we propose a second technique for designing the global threshold table. In this context, we take into account the probability density function of each AC coefficient over the DCT blocks of the image. Since the amount of information differs from one coefficient to another, thresholding the high-frequency components degrades the image less than thresholding the low-frequency ones. Thus, given the statistics of the transform coefficients, we can define the limits beyond which they have a low probability of appearing; such coefficients should be retained in order to avoid a considerable quality loss in the reconstructed image. This idea can be formulated as follows: an AC coefficient c_n has a low probability if its value exceeds a limit S_n defined by:

∫_{−∞}^{S_n} l_n(c_n) dc_n = 0.95,   (8)

where l_n is the probability density of the Laplacian that best fits the real distribution of coefficient n. Thus, using the 63 Laplacian models, we determine the threshold table components T[n], n ∈ {0, ..., 63}, as:

T[n] = F_e / S_n if n ≠ 0,   T[n] = 0 if n = 0,   (9)

with F_e a scaling factor fixed a priori from experimental results on different test images. After computing the appropriate S_n values for different test images, we observe that S_n has low magnitude at high frequencies, yielding high threshold values, so most high-frequency coefficients are dropped; at low frequencies the S_n magnitudes are high and the corresponding thresholds are therefore small, so most coefficients are retained.
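Under the Laplacian model of Section 3, Eq. (8) has a closed-form solution, since the Laplacian CDF is 1 − (1/2)e^{−ax} for x ≥ 0. A short sketch (Python/NumPy; the σ values and F_e below are illustrative choices of ours, not taken from the paper):

```python
import numpy as np

def laplacian_limit(sigma, p=0.95):
    """Eq. (8): S_n with P(c_n <= S_n) = p for a zero-mean Laplacian with
    shape a = sqrt(2)/sigma. Closed-form quantile: S_n = -log(2(1-p))/a."""
    a = np.sqrt(2.0) / np.asarray(sigma, dtype=float)
    return -np.log(2.0 * (1.0 - p)) / a

def threshold_table(sigmas, Fe=1.0):
    """Eq. (9): T[n] = Fe / S_n for the 63 AC coefficients, T[0] = 0.
    sigmas: 64 per-coefficient standard deviations (sigmas[0] is DC, ignored)."""
    T = np.zeros(64)
    S = laplacian_limit(sigmas[1:])
    T[1:] = Fe / S
    return T

# Toy usage: standard deviation decaying with frequency index
sigmas = np.concatenate([[50.0], 40.0 / np.arange(1, 64)])
T = threshold_table(sigmas, Fe=4.0)
print(T[0], T[1] < T[63])  # → 0.0 True
```

As the text describes, small S_n at high frequencies gives large thresholds there (T[63] > T[1] above), so weak high-frequency coefficients are the first to be zeroed.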

6. Experimental results

In this section, we present test results for our proposed global thresholding techniques, each coupled with the RD-OPT quantization algorithm. The performance of the global thresholding techniques applied to baseline JPEG is discussed using compression results for two grayscale images: the Lena natural image and the Toulouse SPOT image. In Fig. 2(a) and (b) the peak signal-to-noise ratio (PSNR) is plotted against the rate for these images compressed using our thresholding strategies coupled with RD-OPT quantization, together with the PSNR–rate curves for JPEG with the default quantization tables and for RD-OPT without thresholding. For each image, parametric thresholding coupled with the RD-OPT quantizer selection algorithm yields PSNR gains of up to 2 dB over the RD-OPT algorithm without thresholding, and up to 4 dB over JPEG with the default quantization table. The rates shown in these plots are the actual rates resulting from JPEG compression with Huffman coding, not entropy estimates. With IC thresholding coupled with the RD-OPT quantizer selection algorithm, our algorithm achieves nearly the same PSNR as the unthresholded RD-OPT coding scheme at low bit rates. However, for all the test images there is a point beyond which performance starts to degrade; intuitively, this thresholding design is not suitable for encoding at higher rates. This is mainly due to the bin pooling process, which finely merges the high-probability bins that should instead be coarsely dropped to zero, while the low-probability bins are coarsely merged and are therefore encoded less accurately. Nevertheless, the IC thresholding process does not require any a priori assumption.

Fig. 2. (a) PSNR versus bitrate for different coding techniques for the Lena image; (b) PSNR versus bitrate for different coding techniques for the Toulouse image.



7. Conclusion

We have proposed an image coding technique based on statistical information about the AC coefficient distributions. Our algorithm uses a global thresholding pass to approximate the zero-bin sizes and then quantizes the remaining coefficients in a second pass. Two thresholding strategies have been tested. The first applies IC to optimally determine the DCT coefficient histogram bins, in order to select the boundary of the low-magnitude coefficient bin, which corresponds to the threshold value. The second generates the threshold table from the probability density functions of the AC coefficient distributions. We note that, in contrast to previous global thresholding approaches, our thresholding step is independent of the bit allocation process. Our experiments show that incorporating parametric global thresholding in the coding chain improves the performance of DCT-based image compression; the IC application, however, did not improve the coding performance.

References

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, Budapest, pp. 267–281.

Baskurt, A., 1989. Compression d'images numériques par la transformation cosinus discrète. Ph.D. thesis, University of Lyon.

Crouse, M., Ramchandran, K., 1997. Joint thresholding and quantizer selection for transform image coding: Entropy-constrained analysis and applications to baseline JPEG. IEEE Trans. Image Process. 6 (2), 285–297.

Eude, T., Grisel, R., Cherifi, H., Debrie, R., 1994. On the distribution of the DCT coefficients. In: ICASSP'94, Vol. 5.

El Matouat, A., Hallin, M., 1996. Order selection, stochastic complexity and Kullback–Leibler information. In: Robinson, P.M., Rosenblatt, M. (Eds.), Time Series Analysis, Vol. 2 (in memory of E.J. Hannan). Springer-Verlag, New York, pp. 291–299.

Monro, D.M., Sherlock, B.G., 1993. Optimum DCT quantization. In: Proc. IEEE Data Compression Conf.

Olivier, C., Courtellemont, P., Colot, O., de Brucq, D., El Matouat, A., 1994. Comparison of histograms: a tool for detection. Eur. J. Diagnosis Safety Automation 4 (3), 335–355.

Pennebaker, W.B., Mitchell, J.L., 1993. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York.

Ramchandran, K., Vetterli, M., 1994. Syntax-constrained encoder optimisation using adaptive quantization thresholding for JPEG/MPEG encoders. In: Proc. IEEE Data Compression Conf.

Ratnakar, V., 1997. Quality-controlled lossy image compression. Ph.D. thesis, University of Wisconsin-Madison.

Ratnakar, V., Livny, M., 2000. An efficient algorithm for optimizing DCT quantization. IEEE Trans. Image Process. 9 (2), 835–839.

Reininger, R.C., Gibson, J.D., 1983. Distributions of the two-dimensional DCT coefficients for images. IEEE Trans. Commun. 31 (6), 160–175.

Watson, A.B., 1993. DCT quantization matrices visually optimized for individual images. In: Rogowitz, B.E. (Ed.), Human Vision, Visual Processing, and Digital Display IV, pp. 202–216.

Wu, S., Gersho, A., 1993. Rate-constrained picture-adaptive quantization for JPEG baseline coders. In: ICASSP'93, Vol. 5.