Transform image coding with global thresholding: Application to baseline JPEG




Pattern Recognition Letters 24 (2003) 959–964 www.elsevier.com/locate/patrec

Azza Ouled Zaid *, Christian Olivier, Olivier Alata, François Marmoiton
IRCOM-SIC, UMR-CNRS 6615, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France

Abstract

In this paper, we present an image-adaptive Joint Photographic Experts Group (JPEG) encoding algorithm that incorporates global thresholding based on statistical information about the discrete cosine transform (DCT) coefficient distributions. These statistics are also used to construct rate/distortion-specific quantization tables with nearly optimal tradeoffs. Results on natural and SPOT images demonstrate that the proposed thresholding technique improves the reconstructed image quality compared with that provided by other DCT image coding schemes without thresholding. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: DCT image coding; Global thresholding; Statistical modeling; Information criteria

1. Introduction

The discrete cosine transform (DCT) is used in many popular image compression schemes, in particular the baseline Joint Photographic Experts Group (JPEG) coding algorithm, which achieves good coding performance with low computational complexity (see Pennebaker and Mitchell, 1993). Improving its encoder performance is therefore particularly important. JPEG quantizer step sizes largely determine the rate/distortion tradeoff in the compressed image, and many approaches optimize this tradeoff. Some of them include psycho-visual model-based quantization (Watson, 1993) and stochastic optimization techniques (Monro and

* Corresponding author. E-mail address: [email protected] (A.O. Zaid).

Sherlock, 1993). The best results (with PSNR as the quality metric) achieved by such approaches were reported by Wu and Gersho (1993) and Crouse and Ramchandran (1997), using a search strategy equivalent to moving in the direction that minimizes the Lagrangian cost J = R + λD. Ratnakar (1997) proposed the RD-OPT algorithm, which measures DCT coefficient statistics for the given image data to construct quantization tables in a rate/distortion-specific way, without going through the entire compression–decompression cycle. However, even with adaptive quantizer selection, JPEG must apply the same quantizer step sizes to all blocks of the image; JPEG quantization thus lacks local adaptation. To overcome this limitation, adaptive thresholding of the coefficients can be incorporated into the DCT coding chain. Thresholding refers to adaptively dropping some coefficients to zero. This could

0167-8655/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0167-8655(02)00219-2



refine the quantizer scales for the retained coefficients without adding any complexity to the decoder. A technique for making the zeroing decisions, local thresholding, was proposed by Crouse and Ramchandran (1997); making the zeroing decisions on a per-coefficient basis is computationally expensive for large images. Ratnakar and Livny (2000) suggested another thresholding technique, global thresholding, extending the RD-OPT algorithm to construct a threshold table that decides the zeroing cutoff level for each DCT coefficient. The drawback of RD-OPT is its computational complexity, since the threshold search is carried out along with the quantizer scales in an exhaustive search process. To overcome the shortcomings of these approaches, we formulate a new image-adaptive coding chain (see Fig. 1) by extending existing work on JPEG coding. In our algorithm, we consider two fast, adaptive global thresholding methods, both based on DCT coefficient distribution modeling. The first uses information criteria (IC) to optimally determine the number and sizes of the bins of the AC coefficient histograms. The second, parametric global thresholding, determines the threshold table components as a function of the DCT coefficient probability density. To select a near-optimal quantization table over a wide range of compression rates, we adopt the RD-OPT algorithm. In the entropy encoder, we customize the Huffman tables according to the statistics of the quantized image. For simplicity, only grayscale images are considered, but the same method can also be applied to color images. The remainder of this paper is organized as follows: Section 2 recapitulates the definitions

of local and global thresholding and how these can be applied to DCT image coding. Section 3 describes the modeling of the AC DCT coefficient distributions. In Section 4 we give the main notions required to understand the origin and interest of the IC, and we explain how these criteria are used to select the bin boundaries of the AC histograms. Section 5 presents our parametric global thresholding technique based on coefficient distribution modeling. Finally, Section 6 is devoted to experimental results.

2. Thresholding scheme applied to discrete cosine transform coefficients

2.1. Brief review of the Joint Photographic Experts Group standard

The notation for the JPEG encoding process can be introduced as follows: we express an H × W image as a set I of (H × W)/64 blocks, each with 64 pixels. Applying an 8 × 8 DCT to each block I^b, where b is the block index, yields the block DCT, denoted c^b:

c^b = DCT_{8×8}(I^b),   b = 1, 2, ..., (H × W)/64.   (1)

A quantization table Q is typically specified by 64 step sizes that quantize the DCT coefficients at spatial indexes (i, j), i, j = 0, ..., 7, in each 8 × 8 transformed block c^b. To simplify the notation, we refer to individual DCT coefficients in a block by numbering them in raster order: c^b_{i,j} is referred to as c^b_n, where n = 8i + j. Denoting the quantized block DCT by ĉ^b, for a given quantization table Q the quantized coefficient ĉ^b_n is calculated as:

Fig. 1. Block diagram for the proposed coding chain.


ĉ^b_n = Round(c^b_n / Q_n),   (2)
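As a concrete illustration of Eqs. (1) and (2), the following sketch (in Python with NumPy; the function names, the toy image, and the flat quantization table are ours, not the paper's) computes the 8 × 8 block DCT of a grayscale image and quantizes each block with a step-size table Q:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix D, so that D @ B @ D.T is the 2-D DCT of block B."""
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            alpha = np.sqrt(1.0 / n) if i == 0 else np.sqrt(2.0 / n)
            D[i, j] = alpha * np.cos((2 * j + 1) * i * np.pi / (2 * n))
    return D

def block_dct(image):
    """Eq. (1): split an HxW image into 8x8 blocks and DCT-transform each one."""
    H, W = image.shape
    D = dct_matrix(8)
    # Reorder pixels into a (num_blocks, 8, 8) stack, raster order of blocks
    blocks = (image.reshape(H // 8, 8, W // 8, 8)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, 8, 8))
    # Apply D B D^T to every block at once
    return np.einsum('ij,bjk,lk->bil', D, blocks.astype(float), D)

def quantize(coeffs, Q):
    """Eq. (2): c_hat = Round(c / Q), one step size per spatial index."""
    return np.round(coeffs / Q).astype(int)

# Toy usage with a flat quantization table of step size 16
img = np.arange(16 * 16, dtype=float).reshape(16, 16)
Q = np.full((8, 8), 16.0)
c_hat = quantize(block_dct(img), Q)
print(c_hat.shape)  # (4, 8, 8): four 8x8 blocks
```

A production encoder would use a fast DCT rather than this direct matrix product, but the matrix form keeps the correspondence with Eq. (1) explicit.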

with Round(x) the closest integer to x.

The quantization strategy in JPEG can be enhanced considerably by incorporating a thresholding step in the coding chain. In (Ramchandran and Vetterli, 1994), it is shown that thresholding approximates the performance of finding the best quantization level for each DCT coefficient.

2.2. Thresholding process

We distinguish two thresholding types: local and global thresholding.

• In a local thresholding process, quantized coefficients that are not worth transmitting in a rate/distortion sense may be dropped in order to save bits. Crouse and Ramchandran (1997) described this process mathematically by a set T = {T^b_n}, 0 ≤ n ≤ 63, 1 ≤ b ≤ (H × W)/64, of binary thresholding parameters that decide whether to threshold the components of c^b, i.e.,

T^b_n = 1 ⇒ ĉ^b_n = Round(c^b_n / Q_n),
T^b_n = 0 ⇒ ĉ^b_n = 0.   (3)

• Global thresholding is specified by a table of 64 non-negative real components T[n], 0 ≤ n ≤ 63, called the threshold table. We define global thresholding as follows: if, for any DCT image block c^b, c^b_n < T[n], then c^b_n is set to zero in the quantization step. The combined result of quantization and thresholding is:

ĉ^b_n = Round(c^b_n / Q_n) if c^b_n ≥ T[n],
ĉ^b_n = 0 otherwise.   (4)
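A minimal sketch of Eq. (4) in Python/NumPy (the function name and the toy tables are ours; note that the code compares the coefficient magnitude |c^b_n| against T[n], which matches the dead-zone interpretation, whereas the printed inequality uses the signed value):

```python
import numpy as np

def threshold_and_quantize(coeffs, Q, T):
    """Eq. (4): zero every coefficient whose magnitude falls below its
    threshold T[n], then quantize the survivors with step sizes Q[n].
    coeffs: (num_blocks, n, n) DCT coefficients; Q, T: n x n tables, T[0, 0] = 0."""
    c_hat = np.round(coeffs / Q).astype(int)
    c_hat[np.abs(coeffs) < T] = 0  # same cutoff in every block: "global" thresholding
    return c_hat

# Toy usage on one 2x2 "block" (illustrative sizes, not the 8x8 JPEG case)
coeffs = np.array([[[-40.0, 3.0], [5.0, 90.0]]])
Q = np.full((2, 2), 10.0)
T = np.array([[0.0, 8.0], [8.0, 8.0]])
print(threshold_and_quantize(coeffs, Q, T))  # the 3 and 5 fall below T and are zeroed
```

Because T is the same for every block, the decoder needs no side information, exactly as the text observes.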

In order to preserve the quality of the DC coefficients, we set T[0] to zero; the threshold table design thus operates only on the AC components. The table T does not need to be included with the compressed image, since the decoder does not need to know the threshold values.

Preliminary results have confirmed that local thresholding applied to JPEG is unprofitable in a complexity/performance tradeoff sense, since the zeroing decision is made over all the transformed image coefficients. For this reason, we have adopted global thresholding in our work.

3. Modeling of the discrete cosine transform coefficient distributions

For DCT image coding schemes, different assumptions have been made concerning the distribution of the DCT coefficients. Reininger and Gibson (1983) used Kolmogorov–Smirnov goodness-of-fit tests to conclude that the AC coefficient distributions are closely approximated by Laplacian densities. Other approaches model the coefficient distributions with either the Cauchy distribution (e.g., Baskurt, 1989) or a mixture of Gaussian distributions (e.g., Eude et al., 1994); under these assumptions, Eude concluded that a mixture of 1, 2 or 3 Gaussian distributions is the best model of the AC coefficient distributions. Nevertheless, we have retained the Laplacian model because of its reliability and simplicity. In fact, the distribution model can be readily matched by measuring its shape parameter a as a function of the standard deviation σ. The corresponding density is:

l(x) = (a/2) e^{−a|x|},   where a = √2 / σ.   (5)

Next, we exploit the Laplacian models of the AC coefficient distributions to solve the thresholding problem.

4. Global thresholding using information criteria

4.1. Information criteria definition

IC are tools for parametric model estimation as well as model selection. Initially proposed by Akaike (1973) for selecting the order of an AR model, their usual form is:

IC(k) = −2 Σ_{i=1}^{N} log f(X_i | θ̂_k) + C(N) · k,   (6)



where X^N = {X_1, ..., X_N} is a sample of observations assumed to be independent, and:

(1) θ̂_k is the maximum likelihood estimator of the empirical model θ_k with k free parameters;
(2) −2 Σ_i log f(X_i | θ̂_k) is the likelihood term;
(3) C(N) is the penalty term, which depends on the number of observations N and differs according to the chosen criterion.

We regard the model with the minimum IC value as the better one, that is, k̂ = arg min_k IC(k). In our applications, we have retained the φ_β criterion developed by El Matouat and Hallin (1996).

4.2. IC for optimal histogram construction

The use of IC was extended by Olivier et al. (1994) to approximate an empirical distribution by a histogram with an optimal number of bins. We applied this principle to the AC coefficient distributions; our task reduces to determining the threshold value, which corresponds to the positive boundary of the dead-zone. Note that each distribution is obtained from N = (H × W)/64 observations. The search for the optimal subpartition C = ∪_{r=1}^{c} B_r starts from an initial histogram of m equal-sized bins; we then obtain the c minimizing IC(c) by pooling adjacent bins B_r and B_{r+1}. For φ_β applied to histograms, we proposed the following expression:

φ_β(k) = k (1 + N^β log(log N)) − 2N Σ_{r=1}^{k} ĥ_k(B_r) log( ĥ_k(B_r) / l(B_r) ),   (7)

where ĥ_k is the maximum likelihood estimator of the theoretical distribution h_k approximated by the histogram with k bins {B_r}, r = 1, ..., k, and l(B_r) is the width of bin B_r. In the following section, we describe the pooling of histogram bins, in particular for the AC coefficient histograms.

4.2.1. Pooling of AC discrete cosine transform histogram classes

For each AC coefficient distribution, starting with m bins, a first histogram is built, giving an initial partition M. The merging of bins is carried

out according to an iterative process driven by the variation of the IC. At a given iteration, suppose we have a histogram with k bins B_r, r = 1, ..., j, ..., k. If two adjacent bins B_j and B_{j+1} merge, so that B_r = B_j ∪ B_{j+1}, we obtain a new histogram with (k − 1) bins. Among all possible mergers into (k − 1) classes, we keep the one that minimizes φ_β(k − 1); this yields the histogram used, if necessary, for further pooling. Only the difference D(k) = φ_β(k) − φ_β(k − 1) is computed. If D(k) is positive, the pooling continues from the histogram with (k − 1) classes; otherwise the pooling stops and the optimal histogram is the one with k bins. Given the optimal AC DCT histogram obtained by this pooling process, the desired threshold value equals the positive boundary of the dead-zone.
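The pooling loop can be sketched as follows (Python/NumPy; the criterion follows one plausible reading of Eq. (7), with β = 0.5 as an illustrative penalty exponent, since its value is not fixed in this section — function names and the toy Laplacian data are ours):

```python
import numpy as np

def phi_beta(counts, widths, N, beta=0.5):
    """Eq. (7)-style criterion for a histogram with k bins:
    penalty k(1 + N^beta log log N) minus twice the histogram log-likelihood."""
    k = len(counts)
    p = counts / N                 # ML estimate of each bin's probability
    nz = p > 0                     # skip empty bins (0 log 0 := 0)
    loglik = 2 * N * np.sum(p[nz] * np.log(p[nz] / widths[nz]))
    return k * (1 + N**beta * np.log(np.log(N))) - loglik

def pool_bins(counts, edges, beta=0.5):
    """Greedy pooling: merge the adjacent pair that most decreases the
    criterion; stop when no merge improves it (D(k) <= 0)."""
    counts, edges = counts.astype(float), edges.copy()
    N = counts.sum()
    while len(counts) > 1:
        cur = phi_beta(counts, np.diff(edges), N, beta)
        best = None
        for j in range(len(counts) - 1):   # try every adjacent merge
            c2 = np.concatenate([counts[:j], [counts[j] + counts[j + 1]], counts[j + 2:]])
            e2 = np.concatenate([edges[:j + 1], edges[j + 2:]])
            val = phi_beta(c2, np.diff(e2), N, beta)
            if best is None or val < best[0]:
                best = (val, c2, e2)
        if best[0] >= cur:                 # no merge lowers the criterion: stop
            break
        _, counts, edges = best
    return counts, edges

# Toy usage on synthetic Laplacian "AC coefficients"
rng = np.random.default_rng(0)
data = rng.laplace(0.0, 4.0, 4096)
counts0, edges0 = np.histogram(data, bins=64)
counts, edges = pool_bins(counts0, edges0)
zero_bin = np.searchsorted(edges, 0.0) - 1   # bin containing zero = dead-zone
print(len(counts), edges[zero_bin + 1])      # threshold = its positive boundary
```

This greedy search tries all adjacent pairs at each step, matching the "among all possible mergers" wording, at O(k^2) cost per AC coefficient.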

5. Parametric global thresholding

Following the work of Eude et al. (1994) on statistical modeling applied to baseline JPEG, we propose a second technique for designing the global threshold table. In this context, we take into account the probability density function of each AC coefficient over the DCT blocks of the image. Since the amount of information differs from one coefficient to another, thresholding the high-frequency components degrades the image less than thresholding the low-frequency ones. Thus, given the statistics of the transform coefficients, we can define the limits beyond which they have a low probability of appearing; such coefficients should be retained in order to avoid a considerable quality loss in the reconstructed image. This idea can be formulated as follows: an AC coefficient c_n has a low probability if its value exceeds a limit S_n defined by:

∫_{−∞}^{S_n} l_n(c_n) dc_n = 0.95,   (8)

where l_n is the probability density of the Laplacian that best fits the real distribution of coefficient n. Thus, using the 63 Laplacian models, we determine the threshold table components T[n], n ∈ {0, ..., 63}, as:

T[n] = F_e / S_n if n ≠ 0,   T[n] = 0 if n = 0,   (9)

with F_e a scaling factor fixed a priori from experimental results on different test images. After computing the appropriate S_n values for different test images, we observe that S_n has low magnitude at high frequencies, yielding high threshold values, so most high-frequency coefficients are dropped; at low frequencies the S_n magnitudes are high and the corresponding thresholds are therefore small, so most coefficients are retained.
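Under the Laplacian model of Section 3, Eq. (8) has a closed-form solution, since the Laplacian CDF is 1 − (1/2)e^{−ax} for x ≥ 0. A short sketch (Python/NumPy; the σ values and F_e below are illustrative choices of ours, not taken from the paper):

```python
import numpy as np

def laplacian_limit(sigma, p=0.95):
    """Eq. (8): S_n with P(c_n <= S_n) = p for a zero-mean Laplacian with
    shape a = sqrt(2)/sigma. Closed-form quantile: S_n = -log(2(1-p))/a."""
    a = np.sqrt(2.0) / np.asarray(sigma, dtype=float)
    return -np.log(2.0 * (1.0 - p)) / a

def threshold_table(sigmas, Fe=1.0):
    """Eq. (9): T[n] = Fe / S_n for the 63 AC coefficients, T[0] = 0.
    sigmas: 64 per-coefficient standard deviations (sigmas[0] is DC, ignored)."""
    T = np.zeros(64)
    S = laplacian_limit(sigmas[1:])
    T[1:] = Fe / S
    return T

# Toy usage: standard deviation decaying with frequency index
sigmas = np.concatenate([[50.0], 40.0 / np.arange(1, 64)])
T = threshold_table(sigmas, Fe=4.0)
print(T[0], T[1] < T[63])  # → 0.0 True
```

As the text describes, small S_n at high frequencies gives large thresholds there (T[63] > T[1] above), so weak high-frequency coefficients are the first to be zeroed.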

6. Experimental results

In this section, we present test results for our proposed global thresholding techniques, each coupled with the RD-OPT quantization algorithm. The performance of the global thresholding techniques applied to baseline JPEG is discussed using compression results for two grayscale images: the Lena natural image and the Toulouse SPOT image. In Fig. 2(a) and (b) the peak signal-to-noise ratio (PSNR) is plotted against the rate for these images compressed using our thresholding strategies coupled with RD-OPT quantization, together with the PSNR–rate curves for JPEG with the default quantization tables and for RD-OPT without thresholding. For each image, parametric thresholding coupled with the RD-OPT quantizer selection algorithm yields PSNR gains of up to 2 dB over the RD-OPT algorithm without thresholding, and up to 4 dB over JPEG with the default quantization table. The rates shown in these plots are the actual rates resulting from JPEG compression with Huffman coding, not entropy estimates. With IC thresholding coupled with the RD-OPT quantizer selection algorithm, our algorithm achieves nearly the same PSNR as the unthresholded RD-OPT coding scheme at low bit rates. However, for all the test images there is a point beyond which performance starts to degrade; intuitively, this thresholding design is not suitable for encoding at higher rates. This is mainly due to the bin pooling process, which finely merges the high-probability bins that should instead be coarsely dropped to zero, while the low-probability bins are coarsely merged and are therefore encoded less accurately. Nevertheless, the IC thresholding process does not require any a priori assumption.

Fig. 2. (a) PSNR versus bitrate for different coding techniques for the Lena image; (b) PSNR versus bitrate for different coding techniques for the Toulouse image.



7. Conclusion

We have proposed an image coding technique based on statistical information about the AC coefficient distributions. Our algorithm uses a global thresholding pass to approximate the zero-bin sizes and then quantizes the remaining coefficients in a second pass. Two thresholding strategies have been tested. The first applies IC to optimally determine the DCT coefficient histogram bins, in order to select the boundary of the low-magnitude coefficient bin, which corresponds to the threshold value. The second generates the threshold table from the probability density functions of the AC coefficient distributions. We note that, in contrast to previous global thresholding approaches, our thresholding step is independent of the bit allocation process. Our experiments show that incorporating parametric global thresholding in the coding chain improves the performance of DCT-based image compression; the IC application, however, did not improve the coding performance.

References

Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, Budapest, pp. 267–281.

Baskurt, A., 1989. Compression d'images numériques par la transformation cosinus discrète. Ph.D. thesis, University of Lyon.

Crouse, M., Ramchandran, K., 1997. Joint thresholding and quantizer selection for transform image coding: Entropy-constrained analysis and applications to baseline JPEG. IEEE Trans. Image Process. 6 (2), 285–297.

Eude, T., Grisel, R., Cherifi, H., Debrie, R., 1994. On the distribution of the DCT coefficients. In: ICASSP'94, Vol. 5.

El Matouat, A., Hallin, M., 1996. Order selection, stochastic complexity and Kullback–Leibler information. In: Robinson, P.M., Rosenblatt, M. (Eds.), Time Series Analysis, Vol. 2 (in memory of E.J. Hannan). Springer-Verlag, New York, pp. 291–299.

Monro, D.M., Sherlock, B.G., 1993. Optimum DCT quantization. In: Proc. IEEE Data Compression Conf.

Olivier, C., Courtellemont, P., Colot, O., de Brucq, D., El Matouat, A., 1994. Comparison of histograms: a tool for detection. Eur. J. Diagnosis Safety Automation 4 (3), 335–355.

Pennebaker, W.B., Mitchell, J.L., 1993. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York.

Ramchandran, K., Vetterli, M., 1994. Syntax-constrained encoder optimisation using adaptive quantization thresholding for JPEG/MPEG encoders. In: Proc. IEEE Data Compression Conf.

Ratnakar, V., 1997. Quality-controlled lossy image compression. Ph.D. thesis, University of Wisconsin-Madison.

Ratnakar, V., Livny, M., 2000. An efficient algorithm for optimizing DCT quantization. IEEE Trans. Image Process. 9 (2), 835–839.

Reininger, R.C., Gibson, J.D., 1983. Distributions of the two-dimensional DCT coefficients for images. IEEE Trans. Commun. 31 (6), 160–175.

Watson, A.B., 1993. DCT quantization matrices visually optimized for individual images. In: Rogowitz, B.E. (Ed.), Human Vision, Visual Processing, and Digital Display IV, pp. 202–216.

Wu, S., Gersho, A., 1993. Rate-constrained picture-adaptive quantization for JPEG baseline coders. In: ICASSP'93, Vol. 5.